Limiting ShortestPath in Cypher to nodes with specific properties - neo4j

I am trying to figure out how to limit a shortest path query in cypher so that it only connects "Person" nodes containing a specific property.
Here is my query:
MATCH p = shortestPath( (from:Person {id: 1})-[*]-(to:Person {id: 2})) RETURN p
I would like to limit it so that when it connects from one Person node to another Person node, the Person node has to contain a property called "job" and a value of "engineer."
Could you help me construct the query? Thanks!

Your requirements are not very clear, but if you simply want one of the people to have an id of 1 and the other person to be an engineer, you would use this:
MATCH p = shortestPath( (from:Person {id: 1})-[*]-(to:Person {job: "engineer"}))
RETURN p;
This kind query should be much faster if you also created indexes for the id and job properties of Person.

Related

How to do this in a single Cypher Query?

So this is a very basic question. I am trying to make a cypher query that creates a node and connects it to multiple nodes.
As an example, let's say I have a database with towns and cars. I want to create a query that:
creates people, and
connects them with the town they live in and any cars they may own.
So here goes:
Here's one way I tried this query (I have WHERE clauses that specify which town and which cars, but to simplify):
MATCH (t: Town)
OPTIONAL MATCH (c: Car)
MERGE a = ((c) <-[:OWNS_CAR]- (p:Person {name: "John"}) -[:LIVES_IN]-> (t))
RETURN a
But this returns multiple people named John - one for each car he owns!
In two queries:
MATCH (t:Town)
MERGE a = ((p:Person {name: "John"}) -[:LIVES_IN]-> (t))
MATCH (p:Person {name: "John"})
OPTIONAL MATCH (c:Car)
MERGE a = ((p) -[:OWNS_CAR]-> (c))
This gives me the result I want, but I was wondering if I could do this in 1 query. I don't like the idea that I have to find John again! Any suggestions?
It took me a bit to wrap my head around why MERGE sometimes creates duplicate nodes when I didn't intend that. This article helped me.
The basic insight is that it would be best to merge the Person node first before you match the towns and cars. That way you won't get a new Person node for each relationship pattern.
If Person nodes are uniquely identified by their name properties, a unique constraint would prevent you from creating duplicates even if you run a mistaken query.
If a person can have multiple cars and residences in multiple towns, you also want to avoid a cartesian product of cars and towns in your result set before you do the merge. Try using the table output in Neo4j Browser to see how many rows are getting returned before you do the MERGE to create relationships.
Here's how I would approach your query.
MERGE (p:Person {name:"John"})
WITH p
OPTIONAL MATCH (c:Car)
WHERE c.licensePlate in ["xyz123", "999aaa"]
WITH p, COLLECT(c) as cars
OPTIONAL MATCH (t:Town)
WHERE t.name in ["Lexington", "Concord"]
WITH p, cars, COLLECT(t) as towns
FOREACH(car in cars | MERGE (p)-[:OWNS]->(car))
FOREACH(town in towns | MERGE (p)-[:LIVES_IN]->(town))
RETURN p, towns, cars

Create relationship between each two nodes for a set of nodes

I have created many nodes in neo4j, the attributes of these nodes are the same, they all have user_id and item_id, the code used is as follows:
LOAD CSV WITH HEADERS FROM 'file://data.csv' AS row
CREATE (main:Main_table {USER_ID: row.user_id,
ITEM_ID: row.item_id}
)
CREATE INDEX ON :Main_table(USER_ID);
CREATE INDEX ON :Main_table(ITEM_ID);
Now I want to create relationship between the nodes with the same user_id or item_id. For example, if node A, B and C have the same USER_ID, I want to create (A)-[:EDGE]->(B), (A)-[:EDGE]->(C) and (B)-[:EDGE]->(C). In order to achieve this goal, I tried the following code:
MATCH (a:Main_table),(b:Main_table)
WHERE a.USER_ID = b.USER_ID
CREATE (a)-[:USER_EDGE]->(b);
MATCH (a:Main_table),(b:Main_table)
WHERE a.ITEM_ID = b.ITEM_ID
CREATE (a)-[:ITEM_EDGE]->(b);
But due to the large amount of data (3000000 nodes, 100000 users), this process is very slow, how can I quickly complete this process? Any help would be greatly appreciated!
Your query is causing a cartesian product, and the Cypher planner does not use indexes to optimize node lookups involving node property comparisons.
A query like this (instead of your USER_EDGE query) may be faster, as it does not cause a cartesian product:
MATCH (a:Main_table)
WITH a.USER_ID AS id, COLLECT(a) AS mains
UNWIND mains AS a
UNWIND mains AS b
WITH a, b
WHERE ID(a) < ID(b)
MERGE (a)-[:USER_EDGE]->(b)
This query uses the aggregating function COLLECT to collect the nodes that have the same USER_ID value, and uses the ID(a) < ID(b) test to ensure that a and b are not the same nodes and to also prevent duplicate relationships (in opposite directions).

Neo4j return subgraph based on certain id

My database stores family relationships.
What I am trying to do is to return the whole family tree of a person with a certain ID.
I have tried the following query:
MATCH (p:Person {id:'1887'})-[r*1..3]-(k:Person)
RETURN distinct(p), r, k
but I didn't get the result that I wanted.
To make it more clear, I would like to get all the relationships that a person with specified ID has, along with the relationships of all the other nodes that he is connected to.
To explain further, here is an example:
Boris is a parent of Mike and Anna. Apart of seeing Boris' relationship to them, I also want to see Mike and Anna's further relationships with their subgraphs.
I believe you should try returning the complete path, like this:
MATCH path = (p:Person {id:'1887'})-[r*1..3]-(k:Person)
RETURN path
This query will store each the path between a person with id = 1887 and another person with depth 1 to 3 and return it. Or if you are interested only in the nodes you can extract it with the nodes() function:
MATCH path = (p:Person {id:'1887'})-[r*1..3]-(k:Person)
RETURN nodes(path)

How to add multiple relationships to a single node in one query

Consider this example: I have one Author node, and several Book nodes. I want to create the WROTE relationship between the Author and several Book nodes in a single cypher Query statement. Also, the only way I have to look up the nodes for both the Author and all the Book nodes is by their node ID.
Here's what I tried:
MATCH (a:Author) WHERE id(a) = '31'
MATCH (b0:Book) WHERE id(b0) = '32'
MATCH (b1:Book) WHERE id(b1) = '33'
CREATE (b0)<-[:WROTE {order : '0'}]-(a)
CREATE (b1)<-[:WROTE {order : '1'}]-(a)
However, it doesn't seem to work.
Thanks.
Neo4js native ids are stored as numbers, so do not match by string. Also Cypher has IN clause that lets you match arrays, so you can simplify you query to this
MATCH (a:Author) where id(a)=31
MATCH (b:Book) where id(b) in [32,33]
CREATE (a)-[:WROTE]->(b)
Found the issue to be that I shouldn't be using the ' when matching node IDs!

Cypher - find nodes whose related nodes have properties contained in a set

I need a cypher query to search for something like this:
I have a graph (a)-[r]->(b) and (a)-[r]->(c) were a is a person and b and c are 2 different skill nodes.
Let's suppose I am looking for someone knowing both java and fortran.
Say b has property name:“java” and c has property name:“fortran”.
How do I find a person that has ALL specified skill nodes?
It'd be useful if the query was scalable, i.e. if I had 20 skill nodes, it would be also easy to execute it.
Thanks a lot in advance!
One way would be to MATCH your Person nodes to Skill nodes, filter Skill nodes for your properties and count the number of nodes per Person. If it's as large as the array of properties your filtering, the Person has all the Skills
MATCH (p:Person)-[r:HAS]->(s:Skill)
WHERE s.name IN ['java', 'fortran', 'cypher']
RETURN DISTINCT p, count(s)
I think you can combine this with a CASE statement to return the data:
MATCH (p:Person)-[r:HAS]->(s:Skill)
WHERE s.name IN ['java', 'fortran', 'cypher']
RETURN
CASE
WHEN count(s) = 3
THEN p
ELSE 0
END

Resources