Problems with simple Cypher Query: complicated match statement - neo4j

I'm new to Neo4j and Cypher and decided to play around with the Movie sample data that ships with Neo4j Desktop.
I want to run a very simple query, namely to retrieve the titles of the movies which involved 3 people, Liv Tyler, Charlize Theron, and Bonnie Hunt. Matching up two people is not a problem (see the code below) but including a third one is difficult.
In SQL this wouldn't be a problem for me, but Cypher causes serious headaches. Here is the query so far:
MATCH (Person {name: "Liv Tyler"})-[:ACTED_IN]->(movie:Movie)<-[:DIRECTED]-(Person {name: "Bonnie Hunt"})
RETURN movie.title AS Title
I've tried to use AND statements, but nothing works.
So how to include Charlize Theron in this query?

You can use multiple patterns to match three or more connections to a single node.
You can reuse the movie variable from your query to refer to the same Movie node, and add the pattern (:Person {name: "Charlize Theron"})-[:ACTED_IN]->(movie). Note the colon before Person: without it, Person is read as a variable name rather than a label.
MATCH (:Person {name: "Liv Tyler"})-[:ACTED_IN]->(movie:Movie)<-[:DIRECTED]-(:Person {name: "Bonnie Hunt"}),
(:Person {name: "Charlize Theron"})-[:ACTED_IN]->(movie)
RETURN movie.title AS Title
You can also rewrite the above query as follows:
MATCH (:Person {name: "Liv Tyler"})-[:ACTED_IN]->(movie:Movie),
(:Person {name: "Bonnie Hunt"})-[:DIRECTED]->(movie),
(:Person {name: "Charlize Theron"})-[:ACTED_IN]->(movie)
RETURN movie.title AS Title

If the number of actors is arbitrary (passed as a parameter) and you can't hardcode the :Person nodes, you can instead match :Person nodes whose name is in the parameter list, then filter on the count of matches per movie (you want to make sure every person in the list is counted).
If we do that for directors first, we already have candidate movies, and can apply an all() predicate over the list of actors to ensure they all acted in each movie.
Assuming two list parameters, one for actors, one for directors:
MATCH (director:Person)-[:DIRECTED]->(m:Movie)
WHERE director.name in $directors
WITH m, count(director) as directorCount
WHERE directorCount = size($directors)
AND all(actor IN $actors WHERE (:Person {name:actor})-[:ACTED_IN]->(m))
RETURN m.title as Title
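The count-based filter described above can also be applied to the actors directly. This is only a sketch: it assumes $actors contains no duplicate names and that each name identifies exactly one :Person node in the data.

```cypher
// Assumption: $actors is a list of distinct names, e.g. ["Liv Tyler", "Charlize Theron"]
MATCH (actor:Person)-[:ACTED_IN]->(m:Movie)
WHERE actor.name IN $actors
// Count how many of the requested actors appear in each movie
WITH m, count(DISTINCT actor) AS actorCount
// Keep only movies in which every requested actor was found
WHERE actorCount = size($actors)
RETURN m.title AS Title
```

If two different people can share a name, the count can reach size($actors) without every requested name being present, so the all() form above is the safer choice in that case.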

Related

How to do this in a single Cypher Query?

So this is a very basic question. I am trying to make a cypher query that creates a node and connects it to multiple nodes.
As an example, let's say I have a database with towns and cars. I want to create a query that:
creates people, and
connects them with the town they live in and any cars they may own.
So here goes:
Here's one way I tried this query (I have WHERE clauses that specify which town and which cars, but to simplify):
MATCH (t: Town)
OPTIONAL MATCH (c: Car)
MERGE a = ((c) <-[:OWNS_CAR]- (p:Person {name: "John"}) -[:LIVES_IN]-> (t))
RETURN a
But this returns multiple people named John - one for each car he owns!
In two queries:
MATCH (t:Town)
MERGE a = ((p:Person {name: "John"}) -[:LIVES_IN]-> (t))
MATCH (p:Person {name: "John"})
OPTIONAL MATCH (c:Car)
MERGE a = ((p) -[:OWNS_CAR]-> (c))
This gives me the result I want, but I was wondering if I could do this in 1 query. I don't like the idea that I have to find John again! Any suggestions?
It took me a bit to wrap my head around why MERGE sometimes creates duplicate nodes when I didn't intend that. This article helped me.
The basic insight is that it would be best to merge the Person node first before you match the towns and cars. That way you won't get a new Person node for each relationship pattern.
If Person nodes are uniquely identified by their name properties, a unique constraint would prevent you from creating duplicates even if you run a mistaken query.
If a person can have multiple cars and residences in multiple towns, you also want to avoid a cartesian product of cars and towns in your result set before you do the merge. Try using the table output in Neo4j Browser to see how many rows are getting returned before you do the MERGE to create relationships.
Here's how I would approach your query.
MERGE (p:Person {name:"John"})
WITH p
OPTIONAL MATCH (c:Car)
WHERE c.licensePlate in ["xyz123", "999aaa"]
WITH p, COLLECT(c) as cars
OPTIONAL MATCH (t:Town)
WHERE t.name in ["Lexington", "Concord"]
WITH p, cars, COLLECT(t) as towns
FOREACH(car in cars | MERGE (p)-[:OWNS_CAR]->(car))
FOREACH(town in towns | MERGE (p)-[:LIVES_IN]->(town))
RETURN p, towns, cars
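For the unique constraint mentioned above, a minimal sketch might look like the following. The constraint name person_name_unique is a made-up example, and the exact syntax depends on your Neo4j version, so check it against the release you are running.

```cypher
// Neo4j 4.4 and later (the constraint name is a hypothetical choice):
CREATE CONSTRAINT person_name_unique IF NOT EXISTS
FOR (p:Person) REQUIRE p.name IS UNIQUE;

// Older releases (3.x, early 4.x) used this form instead:
// CREATE CONSTRAINT ON (p:Person) ASSERT p.name IS UNIQUE
```

With the constraint in place, a query that tries to create a second Person named "John" fails instead of silently adding a duplicate.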

Return type matched from multiple relationships in Neo4j with Cypher

I know that you can match on multiple relationships in Neo4j, like this example in the docs:
MATCH (wallstreet {title: 'Wall Street'})<-[:ACTED_IN|:DIRECTED]-(person)
RETURN person.name
which returns nodes with an ACTED_IN or DIRECTED relationship to 'Wall Street'.
However, is there a way to get the type of the relationship in this query? That is, I would like to return not only the name, but also which relationship applies to him/her, in order to see if it was the ACTED_IN, or the DIRECTED relationship that caused the result to be output.
You can do the equivalent here:
MATCH (:Person {name: 'Oliver Stone'})-[r]->(movie)
RETURN type(r)
but that's just matching on any relationship. I would like to do this, but only with the two relationships specified in the clause.
Thanks
You no longer need the additional colons between the relationship types you are querying. Otherwise, you can bind the relationship to a variable, just as you did in the untyped case:
MATCH (:Movie{title: 'The Matrix'})<-[r:ACTED_IN|DIRECTED]-(person)
RETURN type(r), person.name
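If a person can be connected to the same movie through both relationship types, it may be handier to get one row per person with a list of types. A sketch using the 'Wall Street' example from the question (the alias roles is just an illustrative name):

```cypher
MATCH (wallstreet {title: 'Wall Street'})<-[r:ACTED_IN|DIRECTED]-(person)
// Grouping by person.name gathers every matching relationship type for that person
RETURN person.name, collect(type(r)) AS roles
```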

Neo4j Cypher: interdependent relationship values in a path

I have a graph dataset loaded in Neo4j with nodes being various persons and relationships being some "real" relationships between them. What makes it complicated is that each relationship has a time period during which it was valid. For example:
(p1:PERSON {name: "Andy"})
-[r1:HAS_RELATIONSHIP {from: "20190201", to: "20190215"}]->
(p2:PERSON {name: "Betty"})
-[r2:HAS_RELATIONSHIP {from: "20190301", to: "20190331"}]->
(p3:PERSON {name: "Cecil"})
I'd like to take one concrete person P and get a list of all persons with whom P was in an indirect relationship through other persons. It must hold that the intersection of dates in any relationship chain is nonempty.
So from the previous example, if we take Andy as P, the result should be Andy, Betty, because the relationship with Cecil was valid in a completely different period of time. But in the following case:
(p1:PERSON {name: "Andy"})
-[r1:HAS_RELATIONSHIP {from: "20190201", to: "20190215"}]->
(p2:PERSON {name: "Betty"})
-[r2:HAS_RELATIONSHIP {from: "20190210", to: "20190301"}]->
(p3:PERSON {name: "Cecil"})
the result should be Andy, Betty, Cecil.
Is there a way how to specify this condition in Cypher? I'm looking for an efficient solution which prunes the already found paths.
You basically have a list of intervals from all relationships on a path. For this list of intervals you need to check if they all overlap. This can be done by checking max(from) <= min(to), in cypher:
MATCH path=(p:PERSON {name:'Andy'})-[*..10]-(other) // Doesn't matter how you get the paths
UNWIND relationships(path) as r
WITH path,max(r.from) AS maxFrom,min(r.to) AS minTo
WHERE maxFrom <= minTo
RETURN [x IN nodes(path) | x.name] AS names // list comprehension; extract() was removed in Neo4j 4.0
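To see the max(from) <= min(to) check in isolation, here is a small sketch that needs no graph data. It feeds the two intervals from the first example in the question as literal maps and relies on the dates being fixed-width "YYYYMMDD" strings, so that string comparison matches date order.

```cypher
// The Andy->Betty and Betty->Cecil intervals from the first example
WITH [{from: "20190201", to: "20190215"},
      {from: "20190301", to: "20190331"}] AS intervals
UNWIND intervals AS i
WITH max(i.from) AS maxFrom, min(i.to) AS minTo
// All intervals share a common point only if the latest start is not after the earliest end
RETURN maxFrom, minTo, maxFrom <= minTo AS overlaps
```

Here maxFrom is "20190301" and minTo is "20190215", so overlaps is false and the path through to Cecil is rejected, as the question expects. With the second example's dates the check gives "20190210" <= "20190215", which passes.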

Neo4j Passing distinct nodes through WITH in Cypher

I have the following query, where there are 3 MATCHES, connected with WITH, searching through 3 paths.
MATCH (:File {name: 'A'})-[:FILE_OF]->(:Fun {name: 'B'})-->(entry:CFGEntry)-[:Flows*]->()-->(expr:CallExpr {name: 'C'})-->()-[:IS_PARENT]->(Callee {name: 'd'})
WITH expr, entry
MATCH (expr)-->(:Arg {chNum: '1'})-->(id:Id)
WITH id, entry
MATCH (entry)-[:Flows*]->(:IdDecl)-[:Def]->(sym:Sym)
WHERE id.name = sym.name
RETURN id.name
The query returns two distinct id and one distinct entry, and 7 distinct sym.
The problem is that since I pass "WITH id, entry" after the second MATCH, and two distinct id were found, two instances of entry are passed to the third MATCH instead of one, and the run time of the third MATCH is unnecessarily at least doubled.
I am wondering if anyone knows how I should write this query so that it uses just a single instance of entry.
Your best bet will be to aggregate id, but then you'll need to adjust your logic in the third part of your query accordingly:
MATCH (:File {name: 'A'})-[:FILE_OF]->(:Fun {name: 'B'})-->(entry:CFGEntry)-[:Flows*]->()-->(expr:CallExpr {name: 'C'})-->()-[:IS_PARENT]->(Callee {name: 'd'})
WITH expr, entry
MATCH (expr)-->(:Arg {chNum: '1'})-->(id:Id)
WITH collect(id.name) as names, entry
MATCH (entry)-[:Flows*]->(:IdDecl)-[:Def]->(sym:Sym)
WHERE sym.name in names
RETURN sym.name

Creating relationships based on array values in Neo4j

I have two nodes representing two people:
(:Person {name:"John Smith"})
(:Person {name:"Jane Doe"})
Then I have a third node representing an article coauthored by these two people:
(:Article {title:"Some_article", Coauthor:["John Smith", "Jane Doe"]})
My question is: Can I create a relationship between these nodes based on matching the names? Something like this:
MATCH (n1:Person {name:"Jane Doe"})
MATCH (n2:Article{Coauthor:"Jane Doe"})
CREATE (n2)-[:AUTHORED_BY]->(n1)
Is this possible or do I need to break up the array into separate node properties e.g. Coauthor_1, Coauthor_2 etc?
Thanks
Neo4j CE 3.0.1 on Windows 10
You can UNWIND the Coauthor array and create one authorship relationship per name:
MATCH (a:Article {title:"some title"})
UNWIND a.Coauthor as author
MERGE (p:Person {name: author})
MERGE (a)-[:AUTHORED_BY]->(p)
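Note that MERGE (p:Person {name: author}) creates a Person node for any coauthor name that does not exist yet. If you would rather link only people who are already in the database, a variant along these lines should work. It is a sketch that assumes the article title and property names from the question:

```cypher
MATCH (a:Article {title: "Some_article"})
MATCH (p:Person)
// IN tests membership in the Coauthor array, so no separate Coauthor_1, Coauthor_2 properties are needed
WHERE p.name IN a.Coauthor
MERGE (a)-[:AUTHORED_BY]->(p)
```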

Resources