Neo4j Passing distinct nodes through WITH in Cypher - neo4j

I have the following query, where there are 3 MATCHES, connected with WITH, searching through 3 paths.
MATCH (:File {name: 'A'})-[:FILE_OF]->(:Fun {name: 'B'})-->(ent:CFGEntry)-[:Flows*]->()-->(expr:CallExpr {name: 'C'})-->()-[:IS_PARENT]->(Callee {name: 'd'})
WITH expr, ent
MATCH (expr)-->(:Arg {chNum: '1'})-->(id:Id)
WITH id, ent
MATCH (entry)-[:Flows*]->(:IdDecl)-[:Def]->(sym:Sym)
WHERE id.name = sym.name
RETURN id.name
The query returns two distinct id and one distinct entry, and 7 distinct sym.
The problem is that since in the second MATCH I pass "WITH id, entry", and two distinct id were found, two instances of entry is passed to the third match instead of 1, and the run time of the third match unnecessarily gets doubled at least.
I am wondering if anyone know how I should write this query to just make use of one single instance of entry.

Your best bet will be to aggregate id, but then you'll need to adjust your logic in the third part of your query accordingly:
MATCH (:File {name: 'A'})-[:FILE_OF]->(:Fun {name: 'B'})-->(ent:CFGEntry)-[:Flows*]->()-->(expr:CallExpr {name: 'C'})-->()-[:IS_PARENT]->(Callee {name: 'd'})
WITH expr, ent
MATCH (expr)-->(:Arg {chNum: '1'})-->(id:Id)
WITH collect(id.name) as names, ent
MATCH (entry)-[:Flows*]->(:IdDecl)-[:Def]->(sym:Sym)
WHERE sym.name in names
RETURN sym.name

Related

Problems with simple Cypher Query: complicated match statement

I'm new to Neo4J and Cypher and decided to play around with the Movie sample data that is provided when installing Neo4J desktop.
I want to run a very simple query, namely to retrieve the titles of the movies which involved 3 people, Liv Tyler, Charlize Theron, and Bonnie Hunt. Matching up two people is not a problem (see the code below) but including a third one is difficult.
In SQL this wouldn't be a problem for me, but Cypher causes serious headaches. Here is the query so far:
MATCH (Person {name: "Liv Tyler"})-[:ACTED_IN]->(movie:Movie)<-[:DIRECTED]-(Person {name: "Bonnie Hunt"})
RETURN movie.title AS Title
I've tried to use AND statements, but nothing works.
So how to include Charlize Theron in this query?
You can use multiple patterns to match three or more connections to a single node.
You can use the variable movie which you are using in your query to refer same Movie node to include the pattern (:Person {name: "Charlize Thero"})-[:ACTED_IN]->(movie).
MATCH (:Person {name: "Liv Tyler"})-[:ACTED_IN]->(movie:Movie)<-[:DIRECTED]-(:Person {name: "Bonnie Hunt"}),
(:Person {name: "Charlize Theron"})-[:ACTED_IN]->(movie)
RETURN movie.title AS Title
You can also rewrite the above query as follows:
MATCH (:Person {name: "Liv Tyler"})-[:ACTED_IN]->(movie:Movie),
(:Person {name: "Bonnie Hunt"})-[:DIRECTED]->(movie),
(:Person {name: "Charlize Theron"})-[:ACTED_IN]->(movie)
RETURN movie.title AS Title
If you have some arbitrary number of actors (parameterized), where you can't hardcode the :Person nodes in question, you can instead match on :Person nodes with their name in the parameter list, then filter based on the count of patterns found (you want to make sure all of the persons who acted in the movie are counted).
But if we do that first for directors, then we have some movie matches already, and can apply an all() predicate on the list of actors to ensure they all acted in the movie.
Assuming two list parameters, one for actors, one for directors:
MATCH (director:Person)-[:DIRECTED]->(m:Movie)
WHERE director.name in $directors
WITH m, count(director) as directorCount
WHERE directorCount = size($directors)
AND all(actor IN $actors WHERE (:Person {name:actor})-[:ACTED_IN]->(m))
RETURN m.title as Title

Creating Nodes and Relationships, after Matching Nodes using ID

I have 3 types of nodes - Challenge, Entry and User, and 2 relationships between these nodes: (Entry)-[POSTED_BY]->(User) and (Entry)-[PART_OF]->(Challenge).
Here is what I'm trying to accomplish:
- I have an existing Challenge, and I can identify it by its ID (internal neo4j id)
- I also have an existing User, and I can identify it by its id (not the same as neo4j internal id)
- Given the existing Challenge and User, I need to create a new Entry node, and link it with the Challenge node with a PART_OF relationship.
- In addition, I need to link the new Entry with the User node by the POSTED_BY relationship
- I want to achieve the above in a single Cypher query statement, if possible
This is what I'm trying:
MATCH (c:Challenge)
WHERE (id(c) = 240),
MATCH (u:User {id: '70cf6846-b38a-413c-bab8-7c707d4f46a8'})
CREATE (e:Entry {name: "My Entry"})-[:PART_OF]->(c), (u)<-[r:POSTED_BY]-(e)
RETURN e;
However, this is failing as it seems I cannot match two nodes with the above syntax. However, if I match the Challenge with a non-internal property, say name, it seems to work:
MATCH (c:Challenge {name: "Challenge Name"),
MATCH (u:User {id: '70cf6846-b38a-413c-bab8-7c707d4f46a8'})
CREATE (e:Entry {name: "My Entry"})-[:PART_OF]->(c), (u)<-[r:POSTED_BY]-(e)
RETURN e;
However, as I mentioned above, I want to match the neo4j internal Node ID for the Challenge, and I'm not sure if there's a way to match that other than by using the WHERE id(c) = 232 clause.
Your syntax is almost correct, but you don't need the comma between the WHERE and the next MATCH.
MATCH (c:Challenge)
WHERE id(c) = 240
MATCH (u:User {id: '70cf6846-b38a-413c-bab8-7c707d4f46a8'})
CREATE (e:Entry {name: "My Entry"})-[:PART_OF]->(c), (u)<-[r:POSTED_BY]-(e)
RETURN e;

Neo4j: Cypher query returns duplicate results

I have two graphs built like this :
CREATE (level1a:Bug {name: 'a'})
CREATE (level1b:Bug {name: 'b'})
CREATE (level2c:Bug {name: 'c'})
CREATE (level2d:Bug {name: 'd'})
CREATE (level3e:Bug {name: 'e'})
CREATE (level3f:Bug {name: 'f'})
CREATE (level3g:Bug {name: 'g'})
CREATE (level3h:Bug {name: 'h'})
CREATE (level1a)-[:LINK]->(level2c)
CREATE (level1b)-[:LINK]->(level2d)
CREATE (level2c)-[:LINK]->(level3e)
CREATE (level2c)-[:LINK]->(level3f)
CREATE (level2d)-[:LINK]->(level3g)
CREATE (level2d)-[:LINK]->(level3h)
And also available here : http://console.neo4j.org/?id=duplicate_bug2
When I execute the query :
MATCH (a:Bug {name: 'a'})-[:LINK]->()-[:LINK]->(end) return end
I get the expected two nodes (f and e). But if I do two match queries like this :
MATCH (a:Bug {name: 'a'})-[:LINK]->()-[:LINK]->(end)
MATCH (b:Bug {name: 'b'})-[:LINK]->()-[:LINK]->(end2)
return end, end2
I get duplicates nodes in end and end2. Why is this? The two graphs are not even connected!
BR,
S
Since both matches will return multiple rows and there is no correlation between the two match statements it will generate a cross product of the two result sets. In this case it is 2x2 so you get four rows of each node with each node.
I think what you are after is something like this query. It finds all of the ends from the first match, combines them in a collection and then repeats the process for the second match. Then it returns a single row in the result set with all of the ends of a and all of the ends of b regardless of how many there are at the end of each match.
MATCH (a:Bug {name: 'a'})-[:LINK]->()-[:LINK]->(end)
with collect(end) as end
MATCH (b:Bug {name: 'b'})-[:LINK]->()-[:LINK]->(end2)
return end, collect(end2) as end2

CREATE UNIQUE in neo4j produces duplicate nodes

According to the neo4j documentation:
CREATE UNIQUE is in the middle of MATCH and CREATE — it will match
what it can, and create what is missing. CREATE UNIQUE will always
make the least change possible to the graph — if it can use parts of
the existing graph, it will.
This sounds great, but CREATE UNIQUE doesn't seem to follow the 'least possible change' rule. e.g., here is some Cypher to create two people:
CREATE (n:Person {name: 'Alice'})
CREATE (n:Person {name: 'Bob'})
CREATE INDEX ON :Person(name)
and here's two CREATE UNIQUE statements, to create a relationship between those people. Since both people already exist in the graph, only the relationships should be newly created:
MATCH (a:Person {name: 'Alice'})
CREATE UNIQUE (a)-[:knows]->(b:Person {name: 'Bob'})
RETURN a
MATCH (a:Person {name: 'Alice'})
CREATE UNIQUE (a)<-[:knows]-(b:Person {name: 'Bob'})
RETURN a
After this, the graph should look like
(Alice)<---KNOWS--->(Bob).
But when you run a MATCH query:
MATCH (a:Person)
RETURN a
it seems that the graph now looks like
(Bob)
(Bob)--KNOWS-->(Alice)--KNOWS-->(Bob);
two extra Bobs have been created.
I looked a bit through the other Cypher commands, but none of them seem intended for this use case: create a link between existing node A and existing node B if B exists, and otherwise create a link between existing node A and a newly created node B. How can this problem best be solved within the Cypher framework?
This query should do what you want (if you always want to end up with a single knows relationship between the 2 nodes):
MATCH (a:Person {name: 'Alice'})
MERGE (b:Person {name: 'Bob'})
MERGE (a)-[:knows]->(b)
RETURN a;
Here is how you can do it with CREATE UNIQUE
MATCH (a:Person {name: 'Alice'}), (b:Person {name:'Bob'})
CREATE UNIQUE (a)-[:knows]->(b), (b)-[:knows]->(a)
You need 2 match clauses otherwise you are always creating the node in the CREATE UNIQUE statement, not matching existing nodes.

Matching multiple independent (or dependent) paths

I'm looking at a hierarchy of people and organizations, and trying to find where/if they meet and share management. Let's say "Bob" and "Susan" work for different branches. I want to show their two reporting relationships up through the company if/when they overlap.
This query currently works great, and returns a single path:
MATCH path=(p:Person {name: "Bob"})-[:reports_to*]->(o:Organization {code: "TopOfCompany"})
RETURN path;
This query also works great, and returns a single path:
MATCH path2=(p:Person {name: "Susan"})-[:reports_to*]->(o2:Organization {code: "TopOfCompany"})
RETURN path2;
This query (doing both of them in one operation) returns nothing at all:
MATCH path=(p:Person {name: "Bob"})-[:reports_to*]->(o:Organization {code: "TopOfCompany"}),
path2=(p:Person {name: "Susan"})-[:reports_to*]->(o2:Organization {code: "TopOfCompany"})
RETURN path,path2;
The same is true if I reuse the first o binding in the second path query.
I'm aware that I could reformulate this to find where the two people meet in the middle, like this:
MATCH path=(p1:Person {name: "Bob"})-[:reports_to*]->(o:Organization)<-[:reports_to*]-(p2:Person {name: "Susan"})
RETURN path;
And indeed that query runs fine - but if they don't meet in the middle, this query will fail since the o:Organization in the middle doesn't exist.
There are probably other equivalent ways I could reformulate to get to the right results - but the heart of my question is, is it not possible to identify two different independent paths in one query? This would be useful in the case where they don't meet, where the targets ("TopOfCompany") I'm matching to are different, or I just wanted to compare a series of paths.
Oh, and I'm on 2.2M04, using the server. The query with two paths succeeds, but the results are empty, as in the JSON version of the results is:
{"columns":["path","path2"],"data":[],"stats":{"contains_updates":false,"nodes_created":0,"nodes_deleted":0,"properties_set":0,"relationships_created":0,"relationship_deleted":0,"labels_added":0,"labels_removed":0,"indexes_added":0,"indexes_removed":0,"constraints_added":0,"constraints_removed":0}}
This query of yours is using the same variable (p) for the Bob AND Susan nodes, which probably explains why it does not work as you expected (a single node cannot have 2 different values for the same property):
MATCH path=(p:Person {name: "Bob"})-[:reports_to*]->(o:Organization {code: "TopOfCompany"}),
path2=(p:Person {name: "Susan"})-[:reports_to*]->(o2:Organization {code: "TopOfCompany"})
RETURN path,path2;
You can either use different variables, or just get rid of the node variables entirely (since you don't use them anywhere) -- like this:
MATCH path=(:Person {name: "Bob"})-[:reports_to*]->(:Organization {code: "TopOfCompany"}),
path2=(:Person {name: "Susan"})-[:reports_to*]->(:Organization {code: "TopOfCompany"})
RETURN path,path2;
Optional Match, which could be considered the Cypher equivalent of outer join in SQL, can be used when working with path matching. The following query matches the two individual paths as well as the path that matches both people:
MATCH
path1=(p1:Person {name: "Bob"})-[:reports_to*]->(o1:Organization {code: "TopOfCompany"})
OPTIONAL MATCH
path2=(p2:Person {name: "Susan"})-[:reports_to*]->(o2:Organization {code: "TopOfCompany"})
OPTIONAL MATCH
path3=(p1)-[:reports_to*]->(o:Organization {code: "TopOfCompany"})<-[:reports_to*]-(p2)
RETURN path1, path2, path3;

Resources