Cypher query doesn't return all the expected nodes - neo4j

I have this graph:
A<-B->C
B is the root of a tiny tree. There is exactly one relation between A and B, and one between B and C.
When I run the following, one node is returned. Why does this Cypher query not return the A and C nodes?
MATCH(a {name:"A"})<-[]-(rewt)-[]->(c) RETURN c
It would seem that the first half of that query would find the root, and the second half would find both child nodes.
Until a few minutes ago, I would have thought it logically identical to the following query, which works. What's the difference?
MATCH (a {name:"A"})<-[]-(rewt)
MATCH (rewt)-[]->(c)
RETURN c
EDIT for cybersam
I have abstracted my database so we could discuss my specific issue. Now, we still have a tiny tree, but there are 4 nodes that are children of the root. (Sorry this is different, but I'm developing and don't want to change my environment too much.)
This query returns all 4:
match(a)<-[]-(b:ROOT)-[]->(c) return c
One of them has a name of "dddd"...
match(a {name:"dddd"})<-[]-(b:ROOT)-[]->(c) return c
This query only returns three of them. "dddd" is not included. omg.
To answer cybersam's specific question, this query:
MATCH (a {name:"dddd"})<--(rewt:CODE_ROOT)
MATCH (rewt)-->(c)
RETURN a = c;
Returns four rows. The values are true, false, false, false

[UPDATED]
There is a difference between your 2 queries. A single MATCH clause never uses the same relationship twice in one result, so any match that would reuse a relationship is filtered out.
Therefore, your first query would filter out all matches where the left-side relationship is the same as the right-side relationship:
MATCH(a {name:"A"})<--(rewt)-->(c)
RETURN c;
Your second query would allow the 2 relationships to be the same, since the relationships are found by 2 separate MATCH clauses:
MATCH (a {name:"A"})<--(rewt)
MATCH (rewt)-->(c)
RETURN c;
If I am right, then the following query should return N rows (where N is the number of outgoing relationships from rewt) and only one value should be true:
MATCH (a {name:"A"})<--(rewt)
MATCH (rewt)-->(c)
RETURN a = c;
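The uniqueness rule described above can be modeled outside Neo4j. The following is only a sketch in plain Python, not how Neo4j evaluates queries: the relationship ids, the `RELS` list, and the function names are hypothetical, and the graph is the A<-B->C example from the question.

```python
from itertools import product

# Hypothetical toy graph from the question: A <- B -> C.
# Each relationship is (rel_id, start_node, end_node).
RELS = [(1, "B", "A"), (2, "B", "C")]


def outgoing_pairs(rels):
    """All (left, right) pairs of relationships that leave the same node."""
    return [(l, r) for l, r in product(rels, repeat=2) if l[1] == r[1]]


def single_match(rels, a_name):
    """Models MATCH (a {name: a_name})<--(rewt)-->(c).

    One pattern in one MATCH: the two relationships must be different.
    """
    return [r[2] for l, r in outgoing_pairs(rels)
            if l[2] == a_name and l[0] != r[0]]


def two_matches(rels, a_name):
    """Models MATCH (a {name: a_name})<--(rewt) MATCH (rewt)-->(c).

    Separate clauses: the same relationship may be used in both.
    """
    return [r[2] for l, r in outgoing_pairs(rels) if l[2] == a_name]


print(single_match(RELS, "A"))  # ['C']
print(two_matches(RELS, "A"))   # ['A', 'C']
```

In this model the single-pattern version drops exactly the row where both sides would use the relationship to A, which is consistent with "dddd" going missing in the 4-child tree.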

Both work just fine for me. I've tried on 2.3.0 Community.
Do you mind posting your CREATE command?

In each MATCH clause, each relationship will be matched only once. See http://neo4j.com/docs/stable/cypherdoc-uniqueness.html for reference.
See this related question as well: What does a comma in a Cypher query do?

Related

traversing neo4j graph with conditional edge types

I have a fixed database that has nodes connecting people and edges with six different types of relationships. To make it simple, in this post I call the types of relationships A, B, C, D, E and F. None of the relationships are directional. I'm new to the syntax, so thank you for the help.
I need to get sets of relationships that traverse the graph based on a conditional path: A to (B or C.D) to E to F. So this means that I first need the relationship that links two nodes, ()-[:A]-(), but then I am confused about how to express a conditional relationship. To get to the next node, I need either B, or C then D, so that it is ()-[:B]-() OR ()-[:C]-()-[:D]-(). How do I express this conditional traversal in the MATCH syntax?
Tried all of these and got syntax errors:
(node2:Node)-[rel2:B|rel3:C]-(node3:Node)
(node2:Node)-[rel2:B]OR[rel3:C]-(node3:Node)
This pure-Cypher query should return all matching paths:
MATCH p=()-[:A]-()-[r:B|C|D*1..2]-()-[:E]-()-[:F]-()
WHERE (SIZE(r) = 1 AND TYPE(r[0]) = 'B') OR
(SIZE(r) = 2 AND TYPE(r[0]) = 'C' AND TYPE(r[1]) = 'D')
RETURN p
The [r:B|C|D*1..2] pattern matches 1 or 2 relationships that have the types B, C, and/or D (which can include subpaths that you don't want); and the WHERE clause filters out subpaths that you don't want.
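To see why the WHERE clause is needed, it may help to enumerate the type sequences the variable-length pattern can admit. This is a rough sketch in plain Python that only models sequences of relationship types, not an actual graph traversal; `TYPES`, `candidate_sequences`, and `keep` are hypothetical names.

```python
from itertools import product

# Relationship types allowed by the [r:B|C|D*1..2] pattern.
TYPES = ["B", "C", "D"]


def candidate_sequences():
    """Type sequences of length 1 or 2 that the pattern can match."""
    return [seq for n in (1, 2) for seq in product(TYPES, repeat=n)]


def keep(seq):
    """Mirrors the WHERE clause: exactly B, or C followed by D."""
    return seq == ("B",) or seq == ("C", "D")


kept = [seq for seq in candidate_sequences() if keep(seq)]
print(len(candidate_sequences()))  # 12
print(kept)                        # [('B',), ('C', 'D')]
```

Of the 12 possible type sequences, only 2 survive the filter, so the pattern alone would also admit subpaths such as C alone or D followed by B.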
This isn't something that can really be expressed with Cypher, when the number of hops to traverse isn't the same.
The easiest way to do this would probably be to use apoc.cypher.run() from APOC Procedures to execute a UNION query to cover both paths, then work with the result of the call:
//assume `node2` is in scope
CALL apoc.cypher.run("MATCH (node2)-[:B]-(node3:Node) RETURN node3
UNION
MATCH (node2)-[:C]-()-[:D]-(node3:Node) RETURN node3",
{node2:node2}) YIELD value
WITH value.node3 as node3 // , <whatever else you want in scope>
...

Multiple Match queries in one query

I have the following records in my neo4j database
(:A)-[:B]->(:C)-[:D]->(:E)
(:C)-[:D]->(:E)
I want to get all the C Nodes and all the relations and related Nodes. If I do the query
Match (p:A)-[o:B]->(i:C)-[u:D]->(y:E)
Return p,o,i,u,y
I get the first one to match. If I do
Match (i:C)-[u:D]->(y:E)
Return i,u,y
I get the second to match.
But I want both of them in one query. How do I do that?
The easiest way is to UNION the queries, and pad unused variables with null (because all queries combined with UNION must have the same return columns):
Match (p:A)-[o:B]->(i:C)-[u:D]->(y:E)
Return p,o,i,u,y
UNION
Match (i:C)-[u:D]->(y:E)
Return NULL as p, NULL as o,i,u,y
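The padding idea can be illustrated with a small model. This is a sketch in plain Python under assumed data, not Neo4j code: the row values (`A1`, `C1`, and so on) and the helper names `pad` and `union` are hypothetical, and `None` stands in for null.

```python
# Column order shared by both halves of the UNION.
COLUMNS = ("p", "o", "i", "u", "y")


def pad(row):
    """Fill any column the row does not provide with None (null)."""
    return tuple(row.get(col) for col in COLUMNS)


def union(*result_sets):
    """Combine result sets and drop exact duplicate rows, like UNION."""
    seen, out = set(), []
    for rows in result_sets:
        for row in rows:
            padded = pad(row)
            if padded not in seen:
                seen.add(padded)
                out.append(padded)
    return out


# Hypothetical data: one full A-B-C-D-E chain and one standalone C-D-E chain.
first = [{"p": "A1", "o": "B1", "i": "C1", "u": "D1", "y": "E1"}]
second = [{"i": "C1", "u": "D1", "y": "E1"},
          {"i": "C2", "u": "D2", "y": "E2"}]

for row in union(first, second):
    print(row)
```

With this data the model yields three rows: the tail of the first chain shows up again as a null-padded row, because it differs from the full row and is therefore not removed as a duplicate.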
In your example though, the second match actually matches the last half of the first chain as well, so maybe you actually want something more direct like...
MATCH (c:C)
OPTIONAL MATCH (connected)
WHERE (c)-[*..20]-(connected)
RETURN c, COLLECT(connected) as connected
It looks like you're being a bit too specific in your query. If you just need, for all :C nodes, the connected nodes and relationships, then this should work:
MATCH (c:C)-[r]-(n)
RETURN c, r, n

What is the difference between multiple MATCH clauses and a comma in a Cypher query?

In a Cypher query language for Neo4j, what is the difference between one MATCH clause immediately following another like this:
MATCH (d:Document{document_ID:2})
MATCH (d)--(s:Sentence)
RETURN d,s
Versus the comma-separated patterns in the same MATCH clause? E.g.:
MATCH (d:Document{document_ID:2}),(d)--(s:Sentence)
RETURN d,s
In this simple example the result is the same. But are there any "gotchas"?
There is a difference: comma-separated matches are actually considered part of the same pattern. So, for instance, the guarantee that each relationship appears only once in the resulting path is upheld here.
Separate MATCHes are separate operations whose paths don't form a single pattern and which don't have these guarantees.
I think it's better to explain by providing an example where there is a difference.
Let's say we have the "Movie" database which is provided by the official Neo4j tutorials.
And there are 10 :WROTE relationships in total between :Person and :Movie nodes:
MATCH (:Person)-[r:WROTE]->(:Movie) RETURN count(r); // returns 10
1) Let's try the next query with two MATCH clauses:
MATCH (p:Person)-[:WROTE]->(m:Movie) MATCH (p2:Person)-[:WROTE]->(m2:Movie)
RETURN p.name, m.title, p2.name, m2.title;
Sure you will see 10*10 = 100 records in the result.
2) Let's try the query with one MATCH clause and two patterns:
MATCH (p:Person)-[:WROTE]->(m:Movie), (p2:Person)-[:WROTE]->(m2:Movie)
RETURN p.name, m.title, p2.name, m2.title;
Now you will see 90 records are returned.
That's because in this case records where p = p2 and m = m2 with the same relationship between them (:WROTE) are excluded.
For example, there IS a record in the first case (two MATCH clauses)
p.name m.title p2.name m2.title
"Aaron Sorkin" "A Few Good Men" "Aaron Sorkin" "A Few Good Men"
while there's NO such record in the second case (one MATCH, two patterns).
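The 100 versus 90 counts follow from simple arithmetic on relationship pairs. Here is a minimal sketch in plain Python, assuming the ten :WROTE relationships are identified only by hypothetical integer ids:

```python
from itertools import product

# Ten hypothetical :WROTE relationships, identified only by an id.
wrote = list(range(10))

# Two MATCH clauses: every pairing is allowed, including a relationship
# paired with itself.
two_clauses = list(product(wrote, repeat=2))

# One MATCH with two comma-separated patterns: the same relationship may
# not be used twice in one result row.
one_clause = [(r1, r2) for r1, r2 in two_clauses if r1 != r2]

print(len(two_clauses), len(one_clause))  # 100 90
```

The 10 missing rows are exactly the pairings of each relationship with itself, such as the "Aaron Sorkin" / "A Few Good Men" row.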
There are no differences between these provided that the clauses are not linked to one another.
If you did this:
MATCH (a:Thing), (b:Thing) RETURN a, b;
That's the same as:
MATCH (a:Thing) MATCH (b:Thing) RETURN a, b;
Because (and only because) a and b are independent. If a and b were linked by a relationship, then the meaning of the query could change.
In a more generic way, "The same relationship cannot be returned more than once in the same result record." [see 1.5. Cypher Result Uniqueness in the Cypher manual]
Both MATCH-after-MATCH and a single MATCH with comma-separated patterns should logically return a Cartesian product. For comma-separated patterns, however, we must exclude those records that would reuse a relationship already matched in the same record.
In Andy's answer, this is why repetitions of the same movie were excluded in the second case: the second pattern in the single MATCH was using the same :WROTE relationship as the first pattern.
If a part of a query contains multiple disconnected patterns, this will build a cartesian product between all those parts. This may produce a large amount of data and slow down query processing. While occasionally intended, it may often be possible to reformulate the query that avoids the use of this cross product, perhaps by adding a relationship between the different parts or by using OPTIONAL MATCH (identifier is: (a)) .
In short, there is no difference between these two queries, but use them very carefully.
In a more generic way, "The same relationship cannot be returned more than once in the same result record." [see 1.5. Cypher Result Uniqueness in the Cypher manual]
How about this statement?
MATCH p1=(v:player)-[e1]->(n)
MATCH p2=(n:team)<-[e2]-(m)
WHERE e1=e2
RETURN e1,e2,p1,p2

Cypher: preventing results from duplicating on WITH / sequential querying

In a query like this
MATCH (a)
WHERE id(a) = {x}
MATCH (a)-->(b:x)
WITH a, collect(DISTINCT id(b)) AS Bs
MATCH (a)-->(c:y)
RETURN collect(c) + Bs
what I'm trying to do is to gather two sets of nodes that came from different queries, but with this kind of procedure all the b rows get to be returned multiplied by the number of a rows.
How should I deal with this kind of problem that arises from sequential queries?
[Note that the reported query is only a conceptual representation of what I mean. Please don't try to solve the code (that would be trivial) but only the presented problem.]
Your query shouldn't return any cross product since you aggregate in the WITH clause, so there is only one result item/row (the disconnected path a, collect(b)) when the second match begins. It's not clear therefore what the problem is that you want solved–cross products can be solved differently in different cases.
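The effect of aggregating in WITH can be shown on row counts. The following is a sketch in plain Python with hypothetical neighbor lists `bs` and `cs`; it models rows as tuples and does not reflect how Neo4j executes the query internally.

```python
# Hypothetical neighbors of a single start node `a`.
bs = ["b1", "b2", "b3"]  # matched by (a)-->(b:x)
cs = ["c1", "c2"]        # matched by (a)-->(c:y)

# Without aggregation, the second MATCH runs once per existing row,
# which gives one row per (b, c) combination.
rows_without_with = [(b, c) for b in bs for c in cs]

# WITH a, collect(DISTINCT id(b)) AS Bs collapses the b rows into ONE row
# that carries the whole list.
rows_after_with = [("a", sorted(set(bs)))]

# The second MATCH then only multiplies that single row by the c matches.
rows_after_second_match = [(a, Bs, c) for a, Bs in rows_after_with for c in cs]

# RETURN collect(c) + Bs aggregates again into a single list.
result = [c for _, _, c in rows_after_second_match] + rows_after_with[0][1]

print(len(rows_without_with))        # 6
print(len(rows_after_second_match))  # 2
print(result)                        # ['c1', 'c2', 'b1', 'b2', 'b3']
```

In this model each b appears once in the final list, because the list travels on a single row instead of being repeated per c.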
The way your query would work, conceptually speaking, is: match anything related from a, then filter that anything on having label :x. The second leg of the query does the same but filters on label :y. You can therefore combine your queries as
MATCH (a)-->(b)
WHERE id(a) = {x} AND (b:x OR b:y)
RETURN b
Other cases of 'path explosion' can't be solved as easily (sometimes UNION is good, sometimes you can reorder your pattern, sometimes you can do some aggregate-and-reduce to make it happen), but you'll have to ask about that separately.
How about using UNION for this? See http://docs.neo4j.org/chunked/milestone/query-union.html#union-combine-two-queries-and-remove-duplicates
-brian

In neo4j is there a way to get path between more than 2 random nodes whose direction of relation is not known

I have a scenario where I have more than 2 random nodes.
I need to get all possible paths connecting all three nodes. I do not know the direction of relation and the relationship type.
Example: I have a graph database with three nodes, person->Purchase->Product.
I need to get the path connecting these three nodes. But I do not know the order in which I need to query. For example, if I give the query as person-Product-Purchase, it will return no rows, as the order is incorrect.
So in this case how should I frame the query?
In a nutshell, I need to find the path between more than two nodes where the match clause may be written in whatever order the user knows.
You could list all of the nodes in multiple bound identifiers in the start, and then your match would find the ones that match, in any order. And you could do this for N items, if needed. For example, here is a query for 3 items:
start a=node:node_auto_index('name:(person product purchase)'),
b=node:node_auto_index('name:(person product purchase)'),
c=node:node_auto_index('name:(person product purchase)')
match p=a-->b-->c
return p;
http://console.neo4j.org/r/tbwu2d
I actually just made a blog post about how start works, which might help:
http://wes.skeweredrook.com/cypher-it-all-starts-with-the-start/
Wouldn't it be acceptable to make several queries? In your case you'd automatically generate 6 queries with all the possible combinations (factorial on the number of variables).
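Generating the orderings is straightforward. Below is a minimal sketch in plain Python; the `candidate_patterns` helper and the pattern strings it builds are hypothetical placeholders, where each name stands for whatever node lookup the real query would use.

```python
from itertools import permutations

# The three node names from the question.
names = ["person", "Purchase", "Product"]


def candidate_patterns(node_names):
    """One undirected path pattern string per ordering of the nodes."""
    return ["--".join(f"({n})" for n in order)
            for order in permutations(node_names)]


patterns = candidate_patterns(names)
print(len(patterns))  # 6
print(patterns[0])    # (person)--(Purchase)--(Product)
```

Since these patterns ignore direction, an ordering and its reverse describe the same path, so it may be enough to try only the 3 choices of middle node.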
A possible solution would be to first get three sets of nodes (s,m,e). These sets may be the same as in the question (or contain partially or completely different nodes). The sets are important, because starting, middle and end node are not fixed.
Here is the code for the Matrix example with added nodes.
match (s) where s.name in ["Oracle", "Neo", "Cypher"]
match (m) where m.name in ["Oracle", "Neo", "Cypher"] and s <> m
match (e) where e.name in ["Oracle", "Neo", "Cypher"] and s <> e and m <> e
match rel=(s)-[r1*1..]-(m)-[r2*1..]-(e)
return s, r1, m, r2, e, rel;
The additional where clause makes sure the same node is not used twice in one result row.
The relations are matched with one or more edges (*1..) or hops between the nodes s and m or m and e respectively and disregarding the directions.
Note that Cypher 3 syntax is used here.

Resources