Neo4j Cypher version 1.8: Probable bug with relationship identifiers

http://console.neo4j.org/r/yx62bk
In the graph above, the query
start n=node(7,8,9)
match n-[objectScore:score]->o-[:object_of_destination]->d<-[:destination_score]-n,
o-[:instance_of]->ot, o-[:date]->oDate, d-[:date]->dDate where ot.name='HOTEL'
return n, o, objectScore, d;
returns o as null.
If I change the query to remove the relationship identifier objectScore:
start n=node(7,8,9)
match n-[:score]->o-[:object_of_destination]->d<-[:destination_score]-n,
o-[:instance_of]->ot, o-[:date]->oDate, d-[:date]->dDate where ot.name='HOTEL'
return n, o, d;
the output returns the o node correctly.
For my scenario I need both of them, and I'm not sure how to do that. Any suggestions?

Nice find. We track Cypher issues on GitHub, so I've opened an issue about it there: https://github.com/neo4j/community/issues/837
Thanks so much for reporting it!
Edit: I've found the problem. A simple workaround is to, ironically, introduce an optional relationship. The problem is located in one of the matchers Cypher can use, and by marking a piece of your pattern as optional, you force Cypher to use a different matcher. If you want to
So, change your MATCH to this:
match n-[objectScore:score]->o-[:object_of_destination]->d<-[:destination_score]-n,
o-[:instance_of]->ot,
o-[:date]->oDate,
d-[?:date]->dDate
A real fix is in the works.

Related

Neo4j Cypher find loop from specific node

I want to find all loops that originate and terminate with a specific node in a Neo4j database. I tried:
START n=node:Event(time=",timestamp,")
MATCH p=(n)-[:LINKED_TO*1..5]->(n)
WHERE NONE (n IN nodes(p) WHERE size(filter(x IN nodes(p) WHERE n = x))> 2)
RETURN p, length(p)
This is the best I could mash up from what is on the web. There are two things I don't like about it:
1. it crashes
2. the count threshold must be ">2" to allow for the start+termination node. That means that loops that visit the same intermediate node twice will be included, which I wish was not the case.
I'm not interested in the shortest path. I want to know all loops that return to my starting node.
Thank you in advance!
This query should return all loops that start and end at the specified node and have no other repeated nodes:
START n=node:Event(time=",timestamp,")
MATCH p=(n)-[:LINKED_TO*1..5]->(n)
UNWIND TAIL(NODES(p)) AS m
WITH p, COUNT(DISTINCT m) AS cm
WHERE LENGTH(p)-1 = cm
RETURN p, LENGTH(p);
Thank you, cybersam! That was helpful. As typed, it gave a few errors and warned me that "START" is deprecated. I found the following modifications worked:
MATCH (n:Event{time:1458238060505007})
MATCH p=(n)-[:LINKED_TO*1..5]->(n)
UNWIND TAIL(NODES(p)) AS m WITH p RETURN p
The only problem with this is that it appears to give all paths that go through the desired start node, n. Is that true? If so, is there a way to correct this?
This is what finally worked for me. It is very close to what cybersam suggested. Apologies for doing this "the wrong way". I'm sure cybersam will yell at me again, but code added via a comment is not very easy to read.
MATCH p=(n:Event{time:",timestamp,"})-[:LINKED_TO*1..5]->(n)
UNWIND TAIL (NODES(p)) AS m
WITH p,COUNT(DISTINCT m) AS cm
WHERE LENGTH(p) = cm
RETURN p
As I noted earlier, one sticking point was the use of "START", which is deprecated and causes errors (for example, in RNeo4j, the R package I'm using). The new way appears to be to use MATCH and specify your starting node in the path pattern. The other thing that confused me was the use of "LENGTH(p)-1" instead of "LENGTH(p)". For a loop from the start node to one other node and back, the path has a length of 2, not 3, and TAIL(NODES(p)) contains only 2 distinct nodes. For my application, "LENGTH(p)=cm" worked.
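The comparison can be checked outside the database. The following is a minimal Python sketch, not part of the original answer, in which a plain list stands in for NODES(p); the function name is_simple_loop is made up for illustration.

```python
def is_simple_loop(nodes):
    """Return True if `nodes` is a loop whose only repeated node is the start.

    `nodes` plays the role of NODES(p): a path with k relationships has k + 1 nodes.
    """
    if len(nodes) < 2 or nodes[0] != nodes[-1]:
        return False
    length = len(nodes) - 1          # LENGTH(p): number of relationships
    tail = nodes[1:]                 # TAIL(NODES(p)): drop the first node
    return length == len(set(tail))  # LENGTH(p) = COUNT(DISTINCT m)


print(is_simple_loop(["n", "a", "n"]))            # loop n -> a -> n
print(is_simple_loop(["n", "a", "b", "a", "n"]))  # loop that revisits a
```

Dropping the first node removes one of the two occurrences of the start node, so a loop without other repeats has exactly as many distinct tail nodes as it has relationships.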
Finally, if you want the nodes in the paths, do NOT try to use "WITH m,..." because this messes up the "COUNT(DISTINCT(m))" computation for some reason that I do not understand.
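A likely explanation, based on how Cypher aggregation works: every non-aggregated expression in a WITH clause becomes a grouping key, so adding m makes COUNT(DISTINCT m) run once per (p, m) pair instead of once per path. The Python sketch below imitates that grouping; the rows and the helper count_distinct are hypothetical stand-ins for the rows produced by UNWIND.

```python
from collections import defaultdict

# Rows after UNWIND: one (path, node) pair per node in the tail of each path.
rows = [("p1", "a"), ("p1", "b"), ("p1", "n")]


def count_distinct(rows, key):
    """Group rows by key(path, node) and count the distinct nodes per group."""
    groups = defaultdict(set)
    for path, node in rows:
        groups[key(path, node)].add(node)
    return {k: len(v) for k, v in groups.items()}


# WITH p, COUNT(DISTINCT m): grouped by path only, one count per path.
by_path = count_distinct(rows, lambda p, m: p)

# WITH p, m, COUNT(DISTINCT m): grouped by (path, node), so every count is 1.
by_path_and_node = count_distinct(rows, lambda p, m: (p, m))

print(by_path)
print(by_path_and_node)
```

If the nodes are needed later, they can be read from NODES(p) after the aggregation, since p is still in scope.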

Cypher query fails with variable length paths when trying to find all paths with unique node occurrences

I have a highly interconnected graph where, starting from a specific node, I want to find all nodes connected to it regardless of the relationship type, direction, or length. What I am trying to do is filter out paths that include a node more than once. But what I get is:
Neo.DatabaseError.General.UnknownError: key not found: UNNAMED27
I have managed to create a much simpler database in the Neo4j sandbox and get the same message again using the following data:
CREATE (n1:Person { pid:1, name: 'User1'}),
(n2:Person { pid:2, name: 'User2'}),
(n3:Person { pid:3, name: 'User3'}),
(n4:Person { pid:4, name: 'User4'}),
(n5:Person { pid:5, name: 'User5'})
With the following relationships:
MATCH (n1{pid:1}),(n2{pid:2}),(n3{pid:3}),(n4{pid:4}),(n5{pid:5})
CREATE (n1)-[r1:RELATION]->(n2),
(n5)-[r2:RELATION]->(n2),
(n1)-[r3:RELATION]->(n3),
(n4)-[r4:RELATION]->(n3)
The Cypher Query that causes this issue in the above model is
MATCH p= (n:Person{pid:1})-[*0..]-(m)
WHERE ALL(c IN nodes(p) WHERE 1=size(filter(d in nodes(p) where c.pid = d.pid)) )
return m
Can anybody see what is wrong with this query?
The error seems like a bug to me. There is a closed neo4j issue that seems similar, but it was supposed to be fixed in version 3.2.1. You should probably create a new issue for it, since your comments state you are using 3.2.5.
Meanwhile, this query should get the results you seem to want:
MATCH p=(:Person{pid:1})-[*0..]-(m)
WITH m, NODES(p) AS ns
UNWIND ns AS n
WITH m, ns, COUNT(DISTINCT n) AS cns
WHERE SIZE(ns) = cns
return m
You should strongly consider putting a reasonable upper bound on your variable-length path search, though. If you do not do so, then with any reasonable DB size your query is likely to take a very long time and/or run out of memory.
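To make the effect of the node-uniqueness filter and of an upper bound concrete, here is a small Python sketch over the sample data from the question. It is only an illustration of the idea, not how Neo4j evaluates the query; the function simple_paths and its max_hops parameter are assumptions introduced for this example.

```python
from collections import defaultdict

# Relationships from the sample data above, identified by pid.
edges = [(1, 2), (5, 2), (1, 3), (4, 3)]

# Direction is ignored, as in the pattern -[*0..]-.
neighbours = defaultdict(set)
for a, b in edges:
    neighbours[a].add(b)
    neighbours[b].add(a)


def simple_paths(start, max_hops):
    """Yield every path from `start` with at most `max_hops` hops and no repeated node."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        yield path
        if len(path) - 1 < max_hops:
            for nxt in sorted(neighbours[path[-1]]):
                if nxt not in path:  # same role as SIZE(ns) = COUNT(DISTINCT n)
                    stack.append(path + [nxt])


reachable = {path[-1] for path in simple_paths(1, max_hops=4)}
print(sorted(reachable))
```

With a bound of 4 hops the sketch reaches all five people in the sample data, while a bound of 1 reaches only pids 1, 2, and 3, which shows how the bound trades completeness for work done.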
When matching a pattern, Cypher will never traverse the same relationship twice in a single path, although a node can be visited more than once. So MATCH (a:Start)-[*]-(b) RETURN DISTINCT b will return all nodes connected to a. (DISTINCT removes the duplicates that appear when b is reachable along more than one path, and it can also affect query performance. Use PROFILE on your version of Neo4j to compare the plans with and without it.)
NOTE: This works starting with the Neo4j 3.2 Cypher planner. For previous versions of the Cypher planner, the only performant way to do this is with APOC, or by adding a -[:connected_to]-> relationship from the start node to all children so that the path doesn't have to be explored.

Combine nodes from more MATCHes into single variable

I'm using Neo4j community edition 2.x. In Cypher, I need to MATCH nodes in (two) different ways, then combine these (two) sets of matched nodes into a single set (one variable name). This set would then be used for further action.
Naive graph example (I can't post images)
I would like to find all knowledge of the squirrel, including the knowledge shared by the groups she is a member of. (The example is fictional.)
I imagine something like this:
MATCH (u:User{username:'squirrel'}), (:User{username:'squirrel'})<-[:MEMBER]-(g:Group)
WITH "COMBINATION OF u AND g" AS ug
MATCH (ug)-[:KNOW_HOW]->(k:Knowledge)
RETURN k.type
Outcome should be both "crack nuts" and "escape predators".
In the place of "COMBINATION OF u AND g" I tried variations on collect(u)+collect(g), EXTRACT, etc. Without success.
So far the simplest working way I found is using UNION.
MATCH (u:User{username:'squirrel'})-[:KNOW_HOW]->(k:Knowledge)
RETURN k.type
UNION
MATCH (u:User{username:'squirrel'})<-[:MEMBER]-(:Group)-[:KNOW_HOW]->(k:Knowledge)
RETURN k.type
This might solve this simple example, but it is not good for more complex queries. I am looking for a solution to the more general problem: MATCH several sets of nodes, glue them into a single set (a single variable), and continue with this new set.
Any ideas, please? Am I missing something basic? Or is this impossible? Thanks!
Something possibly similar on grokbase.
edit:
With this hacky solution to a similar question, I was able to solve the problem by extracting the internal ids from a collection of nodes:
MATCH (u:User{username:'squirrel'}), (:User{username:'squirrel'})<-[:MEMBER]-(g:Group)
WITH [x in collect(u)+collect(g)|id(x)] as collectedIds MATCH (ug) WHERE id(ug) in collectedIds
MATCH (ug)-[:KNOW_HOW]->(k:Knowledge)
RETURN k.type
Could it be done any better?
At least since Neo4j 3.0, you can use variable-length pattern matching to solve this issue. Simply set the minimum length explicitly to 0 and move the label test to a separate WHERE clause:
MATCH (:User {username:'squirrel'}) <-[:MEMBER*0..1]- (ug)
WHERE ug:User OR ug:Group
WITH ug
MATCH (ug)-[:KNOW_HOW]->(k:Knowledge)
RETURN k.type
Not sure about the general case, but for this specific case, you might try to combine the two patterns into one as follows,
MATCH (u:User{username:'squirrel'})<-[:MEMBER*0..1]-()-[:KNOW_HOW]->(k:Knowledge)
RETURN k.type
The only general solution I found so far:
1. Match the required starting points (with different names).
2. Collect the internal ids of the starting points.
3. Match the starting points against the collected ids (now the starting points share a single name).
4. Do whatever you need to do with the starting points.
Now the code itself:
MATCH (u:User{username:'squirrel'}), (:User{username:'squirrel'})<-[:MEMBER]-(g:Group)
WITH [x in collect(u)+collect(g)|id(x)] as collectedIds MATCH (ug) WHERE id(ug) in collectedIds
MATCH (ug)-[:KNOW_HOW]->(k:Knowledge)
RETURN k.type
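The same collect-then-rematch idea can be sketched in Python with plain dictionaries. The data below is a hypothetical stand-in for the fictional graph (the group name "forest animals" in particular is an assumption); only the two expected outcomes come from the question.

```python
# Hypothetical stand-in for the example graph; the group name is an assumption.
member_of = {"squirrel": ["forest animals"]}  # user -> groups
know_how = {
    "squirrel": ["crack nuts"],
    "forest animals": ["escape predators"],
}

# First and second MATCH: the user, and the groups the user belongs to.
users = ["squirrel"]
groups = [g for u in users for g in member_of.get(u, [])]

# collect(u) + collect(g): one list of starting points under a single name.
starting_points = users + groups

# MATCH (ug)-[:KNOW_HOW]->(k): continue from the combined set.
knowledge = [k for ug in starting_points for k in know_how.get(ug, [])]
print(knowledge)
```

Once the starting points share one name, every later step is written once, which is what the id-based rematch achieves in the query above.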

cypher query to get middle objects in traversal and shortest path

I can't seem to figure this out, and I'm not sure it's entirely possible. I have a graph like so:
a-[:granted]->b-[:granted]->...x-[:granted_source]->s
where b and x are of interest, while a and s, the end points, are already known and are defined in the START clause.
Note that there could be one intermediate node ( a->b->s ) or more than one ( a->b->c->x->s ), and the goal is to find the shortest path, returning only the nodes that are pointed to by a 'granted' relationship.
The closest I've got is:
start s=node(21), p=node(2)
match paths=shortestPath(p-[:granted|granted_source*]->s)
return NODES(paths)
This gives all the nodes, including the start (p) and end (s). But I can't seem to filter them out (or, better, not return them at all) so that I get only the nodes that are pointed to by a granted relationship, in order from (s) if possible. I'm on Neo4j 2.0b and I'm wondering if labels, which I have no issue using, would be the better way to go. Any help would be much appreciated.
So, you want to chop the head and tail off of a collection of nodes? (Am I understanding that right?) How about:
start s=node(21), p=node(2)
match paths=shortestPath(p-[:granted|granted_source*]->s)
return NODES(paths)[1..-1]
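The slice [1..-1] starts at index 1 and stops just before the last element. Python list slicing follows the same convention, so a short sketch can show what is kept; the node names here are hypothetical placeholders for NODES(paths).

```python
# Hypothetical node list standing in for NODES(paths): start p, middle nodes, end s.
nodes = ["p", "b", "c", "x", "s"]

# Index 1 up to, but not including, the last element.
middle = nodes[1:-1]

# Reversed copy, listing the middle nodes in order starting from the s end.
middle_from_s = middle[::-1]

print(middle)
print(middle_from_s)
```

A path with only the two end points yields an empty list under this slice.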
I think I resolved it using WITH. This probably gives the best performance, since first the p-... matches are fetched, then all the ...->s matches are fetched, and then shortestPath() is used to get the 'in between' nodes. The results appear correct.
start s=node(21), p=node(2)
match p-[:granted]-x, y-[:granted_source]->s
with x, y
match paths=shortestPath(x-[:granted*]->y)
return NODES(paths)

How to return only the end/leaf nodes in a Neo4j cypher query?

I have a structure like so:
user-[:talking]->topic-[:categorized_in]->topic-[:categorized_in]->topic... etc
Starting at a user, how would I get the furthest-away topics they're talking about? Basically, these represent the top-level categories they are talking about. This is the only way I know to go about doing this, and it returns all of the nodes along the way, not just the leaf nodes.
START user=node(1)
MATCH user-[:talking]->x<-[:categorized_in*0..]-y
RETURN distinct y.uuid
This is my latest attempt. It seems to work, though I don't know if this is the best way to go about it:
START user=node(1)
MATCH user-[:talking]->x<-[:categorized_in*0..]-y<-[?:pull]-z
WHERE z is null
RETURN distinct y.uuid
So this is how to do it for anybody interested:
START user=node(1)
MATCH user-[:talking]->x<-[:categorized_in*0..]-y<-[?:categorized_in]-z
WHERE z is null
RETURN distinct y.uuid
You can now filter against patterns in the WHERE.
So if you have a newer version of Neo4j, I think the query would look like
START user=node(1)
MATCH user-[:talking]->x<-[:categorized_in*0..]-y
WHERE NOT(y<-[:categorized_in]-())
RETURN DISTINCT y.uuid
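The pattern predicate keeps a node only when nothing points to it with categorized_in. As a rough Python sketch of that filter, assuming a tiny made-up topic graph (the topic names and the helper furthest_topics are hypothetical):

```python
from collections import defaultdict

# Hypothetical edges (source, target), meaning source -[:categorized_in]-> target.
categorized_in = [("b", "a"), ("c", "b"), ("d", "a")]
talking = ["a"]  # topics reached by user-[:talking]->x

incoming = defaultdict(list)
for source, target in categorized_in:
    incoming[target].append(source)


def furthest_topics(start_topics):
    """Follow incoming categorized_in edges and keep the nodes that have none."""
    leaves, seen, stack = set(), set(), list(start_topics)
    while stack:
        topic = stack.pop()
        if topic in seen:
            continue
        seen.add(topic)
        if incoming[topic]:
            stack.extend(incoming[topic])
        else:
            leaves.add(topic)  # WHERE NOT (y<-[:categorized_in]-())
    return leaves


print(sorted(furthest_topics(talking)))
```

Nodes along the way (a and b in this made-up graph) are visited but not returned, because each of them still has an incoming categorized_in edge.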

Resources