So this is a very basic question. I am trying to make a cypher query that creates a node and connects it to multiple nodes.
As an example, let's say I have a database with towns and cars. I want to create a query that:
creates people, and
connects them with the town they live in and any cars they may own.
So here goes:
Here's one way I tried this query (I have WHERE clauses that specify which town and which cars, but to simplify):
MATCH (t: Town)
OPTIONAL MATCH (c: Car)
MERGE a = ((c) <-[:OWNS_CAR]- (p:Person {name: "John"}) -[:LIVES_IN]-> (t))
RETURN a
But this returns multiple people named John - one for each car he owns!
In two queries:
MATCH (t:Town)
MERGE a = ((p:Person {name: "John"}) -[:LIVES_IN]-> (t))
MATCH (p:Person {name: "John"})
OPTIONAL MATCH (c:Car)
MERGE a = ((p) -[:OWNS_CAR]-> (c))
This gives me the result I want, but I was wondering if I could do this in 1 query. I don't like the idea that I have to find John again! Any suggestions?
It took me a bit to wrap my head around why MERGE sometimes creates duplicate nodes when I didn't intend that. This article helped me.
The basic insight is that it would be best to merge the Person node first before you match the towns and cars. That way you won't get a new Person node for each relationship pattern.
If Person nodes are uniquely identified by their name properties, a unique constraint would prevent you from creating duplicates even if you run a mistaken query.
If a person can have multiple cars and residences in multiple towns, you also want to avoid a cartesian product of cars and towns in your result set before you do the merge. Try using the table output in Neo4j Browser to see how many rows are getting returned before you do the MERGE to create relationships.
Here's how I would approach your query.
MERGE (p:Person {name:"John"})
WITH p
OPTIONAL MATCH (c:Car)
WHERE c.licensePlate in ["xyz123", "999aaa"]
WITH p, COLLECT(c) as cars
OPTIONAL MATCH (t:Town)
WHERE t.name in ["Lexington", "Concord"]
WITH p, cars, COLLECT(t) as towns
FOREACH(car in cars | MERGE (p)-[:OWNS]->(car))
FOREACH(town in towns | MERGE (p)-[:LIVES_IN]->(town))
RETURN p, towns, cars
Related
I'm new to Neo4J and Cypher and decided to play around with the Movie sample data that is provided when installing Neo4J desktop.
I want to run a very simple query, namely to retrieve the titles of the movies which involved 3 people, Liv Tyler, Charlize Theron, and Bonnie Hunt. Matching up two people is not a problem (see the code below) but including a third one is difficult.
In SQL this wouldn't be a problem for me, but Cypher causes serious headaches. Here is the query so far:
MATCH (Person {name: "Liv Tyler"})-[:ACTED_IN]->(movie:Movie)<-[:DIRECTED]-(Person {name: "Bonnie Hunt"})
RETURN movie.title AS Title
I've tried to use AND statements, but nothing works.
So how to include Charlize Theron in this query?
You can use multiple patterns to match three or more connections to a single node.
You can use the variable movie which you are using in your query to refer same Movie node to include the pattern (:Person {name: "Charlize Thero"})-[:ACTED_IN]->(movie).
MATCH (:Person {name: "Liv Tyler"})-[:ACTED_IN]->(movie:Movie)<-[:DIRECTED]-(:Person {name: "Bonnie Hunt"}),
(:Person {name: "Charlize Theron"})-[:ACTED_IN]->(movie)
RETURN movie.title AS Title
You can also rewrite the above query as follows:
MATCH (:Person {name: "Liv Tyler"})-[:ACTED_IN]->(movie:Movie),
(:Person {name: "Bonnie Hunt"})-[:DIRECTED]->(movie),
(:Person {name: "Charlize Theron"})-[:ACTED_IN]->(movie)
RETURN movie.title AS Title
If you have some arbitrary number of actors (parameterized), where you can't hardcode the :Person nodes in question, you can instead match on :Person nodes with their name in the parameter list, then filter based on the count of patterns found (you want to make sure all of the persons who acted in the movie are counted).
But if we do that first for directors, then we have some movie matches already, and can apply an all() predicate on the list of actors to ensure they all acted in the movie.
Assuming two list parameters, one for actors, one for directors:
MATCH (director:Person)-[:DIRECTED]->(m:Movie)
WHERE director.name in $directors
WITH m, count(director) as directorCount
WHERE directorCount = size($directors)
AND all(actor IN $actors WHERE (:Person {name:actor})-[:ACTED_IN]->(m))
RETURN m.title as Title
I would like to use cypher to create a relationship between items in an array and another node.
The result from this query was a list of empty nodes connected to each other.
MATCH (person:person),(preference:preference)
UNWIND person.preferences AS p
WITH p
WHERE NOT (person)-[:likes]->(preference) AND
p = preference.name CREATE (person)-[r:likes]->(preference)
Where person.preferences contains an array of preference names.
Obviously I am doing something wrong. I am new to neo4j and any help with above would be much appreciated.
Properties are attributes of a nodes while relationships involve one or two nodes. As such, it's not possible to create a relationship between properties of two nodes. You'd need to split the properties into their own collection of nodes, and then create a relationship between the respective nodes.
You can do all that in one statement - like so:
create (:Person {name: "John"})-[:LIKES]->(:Preference {food: "ice cream"})
For other people, you don't want to create duplicate Preferences, so you'd look up the preference, create the :Person node, and then create the relationship, like so:
match (preference:Preference {food: "ice cream"})
create (person:Person {name: "Jane"})
create (person)-[:LIKES]->(preference)
The bottom line for your use case is you'll need to split the preference arrays into a set of nodes and then create relationships between the people nodes and your new preference nodes.
One thing....
MATCH (person:person),(preference:preference)
Creates a Cartesian product (inefficient and causes weird things)
Try this...
// Get all persons
MATCH (person:person)
// unwind preference list, (table is now person | preference0, person | preference1)
UNWIND person.preferences AS p
// For each row, Match on prefrence
MATCH (preference:preference)
// Filter on preference column
WHERE preference.name=p
// MERGE instead of CREATE to "create if doesn't exist"
MERGE (person)-[:likes]->(preference)
RETURN person,preference
If this doesn't work, could you supply your sample data and noe4j version? (As far as I can tell, your query should technically work)
I have the nodes: (a:charlie), (b:economy), and (c:bicycle) . I want to create this pattern:
create (a:charlie)-[x:wants_make]->(b:economy)->[y:by_using]->(c:bicycle)
But it gives me cartesian product. I already thought to skip the creation of the node (b) giving to relation [x:want_make]a property. But node (b) has many other relations in the same context(economic context). What I want to get the pattern above.
Any suggestion?
If your query looks like this:
MATCH (a:charlie), (b:economy), (c:bicycle)
MERGE (a)-[:wants_make]->(b)-[:by_using]->(c);
then it is saying both of these things:
Create a wants_make relationship between every charlie node and every economy node.
Create a by_using relationship between every economy node and every bicycle node.
So, if the number of charlie, economy, and bicycle nodes are C, E, and B -- this results in (C * E * B) merges, which is a Cartesian product of a Cartesian product.
Also, your data model seems to be wrong. For example, it seems much more reasonable to have a Person label instead of a charlie label.
A more reasonable query might look something like this:
MERGE (a:Person {name: 'Charlie Brown'})
MERGE (c:Bicycle {id: 123})
MERGE (a)-[:wants_make]->(b:Economy)
MERGE (b)-[:by_using]->(c);
This query avoids Cartesian products by being more specific about the first and last nodes in the path, and it also avoids creating nodes and relationships that already exist.
And, going even further, you might want to combine wants_make, Economy, and by_using into a single economizes_by_using relationship:
MERGE (a:Person {name: 'Charlie Brown'})
MERGE (c:Bicycle {id: 123})
MERGE (a)-[:economizes_by_using]->(c);
You might need to break up your query a bit:
MATCH (a:charlie), (b:economy), (c:bicycle)
MERGE (a)-[:wants_make]->(b), (b)->[:by_using]->(c)
I want to add a "created by" relationship on nodes in my database. Any node should be able of having this relationship but there can never be more than one.
Right now my query looks something like this:
MATCH (u:User {email: 'my#mail.com'})
MERGE (n:Node {name: 'Node name'})
ON CREATE SET n.name='Node name', n.attribute='value'
CREATE UNIQUE (n)-[:CREATED_BY {date: '2015-02-23'}]->(u)
RETURN n
As I have understood Cypher there is no way to achieve what I want, the current query will only make sure there are no unique relationships based on TWO nodes, not ONE. So, this will create more CREATED_BY relationships when run for another User and I want to limit the outgoing CREATED_BY relationship to just one for all nodes.
Is there a way to achieve this without running multiple queries involving program logic?
Thanks.
Update
I tried to simplyfy the query by removing implementation details, if it helps here's the updated query based on cybersams response.
MERGE (c:Company {name: 'Test Company'})
ON CREATE SET c.uuid='db764628-5695-40ee-92a7-6b750854ebfa', c.created_at='2015-02-23 23:08:15', c.updated_at='2015-02-23 23:08:15'
WITH c
OPTIONAL MATCH (c)
WHERE NOT (c)-[:CREATED_BY]-()
CREATE (c)-[:CREATED_BY {date: '2015-02-23 23:08:15'}]->(u:User {token: '32ba9d2a2367131cecc53c310cfcdd62413bf18e8048c496ea69257822c0ee53'})
RETURN c
Still not working as expected.
Update #2
I ended up splitting this into two queries.
The problem I found was that there was two possible outcomes as I noticed.
The CREATED_BY relationship was created and (n) was returned using OPTIONAL MATCH, this relationship would always be created if it didn't already exist between (n) and (u), so when changing the email attribute it would re-create the relationship.
The Node (n) was not found (because of not using OPTIONAL MATCH and the WHERE NOT (c)-[:CREATED_BY]-() clause), resulting in no relationship created (yay!) but without getting the (n) back the MERGE query looses all it's meaning I think.
My Solution was the following two queries:
MERGE (n:Node {name: 'Name'})
ON CREATE SET
SET n.attribute='value'
WITH n
OPTIONAL MATCH (n)-[r:CREATED_BY]-()
RETURN c, r
Then I had program logic check the value of r, if there was no relationship I would run the second query.
MATCH (n:Node {name: 'Name'})
MATCH (u:User {email: 'my#email.com'})
CREATE UNIQUE (n)-[:CREATED_BY {date: '2015-02-23'}]->(u)
RETURN n
Unfortunately I couldn't find any real solution to combining this in one single query with Cypher. Sam, thanks! I've selected your answer even though it didn't quite solve my problem, but it was very close.
This should work for you:
MERGE (n:Node {name: 'Node name'})
ON CREATE SET n.attribute='value'
WITH n
OPTIONAL MATCH (n)
WHERE NOT (n)-[:CREATED_BY]->()
CREATE UNIQUE (n)-[:CREATED_BY {date: '2015-02-23'}]->(:User {email: 'my#mail.com'})
RETURN n;
I've removed the starting MATCH clause (because I presume you want to create a CREATED_BY relationship even when that User does not yet exist in the DB), and simplified the ON CREATE to remove the redundant setting of the name property.
I have also added an OPTIONAL MATCH that will only match an n node that does not already have an outgoing CREATED_BY relationship, followed by a CREATE UNIQUE clause that fully specifies the User node.
I am trying to figure out how to limit a shortest path query in cypher so that it only connects "Person" nodes containing a specific property.
Here is my query:
MATCH p = shortestPath( (from:Person {id: 1})-[*]-(to:Person {id: 2})) RETURN p
I would like to limit it so that when it connects from one Person node to another Person node, the Person node has to contain a property called "job" and a value of "engineer."
Could you help me construct the query? Thanks!
Your requirements are not very clear, but if you simply want one of the people to have an id of 1 and the other person to be an engineer, you would use this:
MATCH p = shortestPath( (from:Person {id: 1})-[*]-(to:Person {job: "engineer"}))
RETURN p;
This kind query should be much faster if you also created indexes for the id and job properties of Person.