Checking every relationship of a path triggers way too many dbhits - neo4j

I am running a query to find the path from a user to a permission node in Neo4j, and I want to make sure it doesn't traverse certain relationships. To do that, I used the following Cypher:
PROFILE
MATCH path = (u:user { _id: 'ea6b17e0-3b9e-11ea-b206-e7610aa23593' })-[r:accessRole|isMemberOf*1..5]->(n:PermissionSet { name: 'project'})
WHERE all(x IN r WHERE NOT (:PermissionSet)-[x]->(:user))
RETURN path
I didn't expect the WHERE clause to trigger so many hits. I believe I'm not writing my test correctly.
(2000 nodes / 3500 rels => 350,000 hits for "(:PermissionSet)-[x]->(:user)")
Any advice?

Not sure this is the correct answer, but I added a WITH clause:
PROFILE
MATCH path = (u:user { _id: 'ea6b17e0-3b9e-11ea-b206-e7610aa23593' })-[r:accessRole|isMemberOf*1..5]->(n:PermissionSet { name: 'project'})
WITH path,r
WHERE all(x IN r WHERE NOT (:PermissionSet)-[x]->(:user))
RETURN path
And the dbhits for "(:PermissionSet)-[x]->(:user)" went down to 2800 hits.
I can guess why it does that, but I'd love a more expert explanation. Is there a better way to do it? (This way is fine with me performance-wise.)
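One possible alternative, offered as a sketch rather than a verified optimization: instead of re-matching the pattern (:PermissionSet)-[x]->(:user) for each relationship, the labels of each relationship's start and end node can be checked directly with startNode(), endNode() and labels(). The labels and relationship types are the ones from the question. Whether this lowers the db hits depends on the Neo4j version and planner, so it is worth comparing with PROFILE.

```cypher
PROFILE
MATCH path = (u:user { _id: 'ea6b17e0-3b9e-11ea-b206-e7610aa23593' })-[r:accessRole|isMemberOf*1..5]->(n:PermissionSet { name: 'project' })
// Reject the path if any relationship in it starts at a :PermissionSet node
// and ends at a :user node (same condition as the pattern predicate above).
WHERE none(x IN r WHERE 'PermissionSet' IN labels(startNode(x)) AND 'user' IN labels(endNode(x)))
RETURN path
```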

shortestPath relationship does not reflect the correct direction

I am new to Neo4j and am developing a small site.
I have set up the nodes and the relationships between them. For most of the paired nodes, I created a mutual link. For example:
Zeus - FATHER -> Apollo
Apollo - SON -> Zeus
I used shortestPath to find the possible shortest path between these two:
MATCH (o1 { name: 'Apollo' }),(o2 { name: 'Zeus' }), p = shortestPath((o1)-[*..6]-(o2)) RETURN nodes(p), relationships(p)
The result is that it returns "FATHER" instead of "SON".
If I change the query to [*..6]->(o2), "SON" is returned.
But I need to consider the search may make o1 a node with no outgoing relationship, in which case, the above-modified query fails.
So:
The original query can cope with nodes with no outgoing relationship, but may return wrong relationship.
The modified query can return right relationship (so far) but can't cope with "no out relation" nodes.
I can of course change every node to have at least one outgoing relation to fix Issue 2 but that will be too redundant.
Hope to get your advice.
It's bad practice to create bidirectional relationships like this, especially when the relationship is a bijection.
You are duplicating data in the database (it's obvious that if Zeus is the father of Apollo, Apollo is the son of Zeus).
This query:
MATCH
(o1 { name: 'Apollo' }),
(o2 { name: 'Zeus' }),
p = shortestPath((o1)-[*..6]-(o2))
RETURN nodes(p), relationships(p)
searches for only one shortest path. But due to the duplication, there are in fact two shortest paths. You can replace the shortestPath function with allShortestPaths to find all of them, so you will get both the SON and the FATHER result.
Or you can give the shortestPath function a list of relationship types that it is allowed to traverse, like this:
MATCH
(o1 { name: 'Apollo' }),
(o2 { name: 'Zeus' }),
p = shortestPath((o1)-[:FATHER*..6]-(o2))
RETURN nodes(p), relationships(p)
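The allShortestPaths alternative mentioned above could look like the following sketch. It reuses the node names and the 6-hop limit from the question and should return one row per shortest path, so with the duplicated FATHER/SON relationships both would be expected to appear.

```cypher
MATCH
(o1 { name: 'Apollo' }),
(o2 { name: 'Zeus' }),
// allShortestPaths returns every path of minimal length, not just one
p = allShortestPaths((o1)-[*..6]-(o2))
RETURN nodes(p), relationships(p)
```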

Cypher query fails with variable length paths when trying to find all paths with unique node occurences

I have a highly interconnected graph where, starting from a specific node, I want to find all nodes connected to it regardless of relationship type, direction or length. What I am trying to do is filter out paths that include a node more than once. But what I get is:
Neo.DatabaseError.General.UnknownError: key not found: UNNAMED27
I have managed to create a much simpler database in the Neo4j sandbox and get the same message again using the following data:
CREATE (n1:Person { pid:1, name: 'User1'}),
(n2:Person { pid:2, name: 'User2'}),
(n3:Person { pid:3, name: 'User3'}),
(n4:Person { pid:4, name: 'User4'}),
(n5:Person { pid:5, name: 'User5'})
With the following relationships:
MATCH (n1{pid:1}),(n2{pid:2}),(n3{pid:3}),(n4{pid:4}),(n5{pid:5})
CREATE (n1)-[r1:RELATION]->(n2),
(n5)-[r2:RELATION]->(n2),
(n1)-[r3:RELATION]->(n3),
(n4)-[r4:RELATION]->(n3)
The Cypher Query that causes this issue in the above model is
MATCH p= (n:Person{pid:1})-[*0..]-(m)
WHERE ALL(c IN nodes(p) WHERE 1=size(filter(d in nodes(p) where c.pid = d.pid)) )
return m
Can anybody see what is wrong with this query?
The error seems like a bug to me. There is a closed neo4j issue that seems similar, but it was supposed to be fixed in version 3.2.1. You should probably create a new issue for it, since your comments state you are using 3.2.5.
Meanwhile, this query should get the results you seem to want:
MATCH p=(:Person{pid:1})-[*0..]-(m)
WITH m, NODES(p) AS ns
UNWIND ns AS n
WITH m, ns, COUNT(DISTINCT n) AS cns
WHERE SIZE(ns) = cns
return m
You should strongly consider putting a reasonable upper bound on your variable-length path search, though. If you do not do so, then with any reasonable DB size your query is likely to take a very long time and/or run out of memory.
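As an illustration of the upper-bound advice, here is the same workaround query with a bounded expansion. The limit of 5 hops is an arbitrary assumption for this sketch, not a recommendation; the right value depends on your data.

```cypher
// Same workaround as above, but the expansion stops after 5 hops.
// The bound of 5 is an illustrative assumption; pick one that fits your graph.
MATCH p = (:Person { pid: 1 })-[*0..5]-(m)
WITH m, nodes(p) AS ns
UNWIND ns AS n
WITH m, ns, count(DISTINCT n) AS cns
// Keep only paths in which every node appears exactly once
WHERE size(ns) = cns
RETURN m
```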
When finding paths, Cypher will never traverse the same relationship twice in a single path, so a variable-length match cannot loop forever. So MATCH (a:Start)-[*]-(b) RETURN DISTINCT b will return all nodes connected to a. (DISTINCT can affect query performance. Use PROFILE on your version of Neo4j to see if it cares and which is better.)
NOTE: This works starting with the Neo4j 3.2 Cypher planner. For previous versions of the Cypher planner, the only performant way to do this is with APOC, or by adding a -[:connected_to]-> relationship from the start node to all children so that the path doesn't have to be explored.

Getting relationships from all node's in Neo4j

I am trying to write a query in Neo4j.
I would like my query to return the same information that is displayed when AUTO-COMPLETE is ON in the Neo4j browser.
For example, suppose the following query creates 3 nodes:
create (david:Person {name: 'david'}), (mike:Person {name: 'mike'}), (book:Book {title:'book'}), (david)-[:KNOWS]->(mike), (david)-[:WRITE]->(book), (mike)-[:WRITE]->(book)
[Two screenshots: the result graph with auto-complete on, and with auto-complete off]
The figures show the result of the query. I would like to obtain all relationships among the related nodes, starting from the 'book' node.
I used the query shown below.
match (book:Book)-[r]-(person) return book, r, person
Whether AUTO-COMPLETE is ON or OFF, I expect to obtain all of the nodes' relationships, including "David knows Mike", but the result does not include them.
I have studied the syntax documentation on the Neo4j website, but I still find it very difficult, so I am posting here to ask for help.
You have to return all the data that you need yourself explicitly. It would be bad for Neo4j to automatically return all the relationships for a super node with thousands of relationships for example, as it would mean lots of I/O, possibly for nothing.
MATCH (book:Book)-[r]-(person)-[r2]-()
RETURN book, r, person, collect(r2) AS r2
Thanks to InverseFalcon, this is my query that works.
MATCH p = (book:Book)-[r]-(person:Person)
UNWIND nodes(p) as allnodes WITH COLLECT(ID(allnodes)) AS ALLID
MATCH (a)-[r2]-(b)
WHERE ID(a) IN ALLID AND ID(b) IN ALLID
WITH DISTINCT r2
RETURN startNode(r2), r2, endNode(r2)

How to query for multiple OR'ed Neo4j paths?

Does anyone know of a fast way to query multiple paths in Neo4j?
Let's say I have movie nodes that can have a type that I want to match (this is pseudo-code):
MATCH
(m:Movie)<-[:TYPE]-(g:Genre { name:'action' })
OR
(m:Movie)<-[:TYPE]-(x:Genre)<-[:G_TYPE*1..3]-(g:Genre { name:'action' })
(m)-[:SUBGENRE]->(sg:SubGenre {name: 'comedy'})
OR
(m)-[:SUBGENRE]->(x)<-[:SUB_TYPE*1..3]-(sg:SubGenre {name: 'comedy'})
The problem is that the first "m:Movie" nodes to be matched must match one of the paths specified, and the second SubGenre match is dependent on the first match.
I can make a query that works using MATCH and WHERE, but it's really slow (30 seconds with a small 20MB dataset).
The problem is, I don't know how to OR match in Neo4j with other OR matches hanging off of the first results.
If I use WHERE, then I have to declare all the nodes used in any of the statements in the initial MATCH, which makes the query slow (since you cannot introduce new nodes in a WHERE).
Does anyone know an elegant way to solve this? Thanks!
You can try a variable length path with a minimal length of 0:
MATCH
(m:Movie)<-[:TYPE|:SUBGENRE*0..4]-(g)
WHERE g:Genre and g.name = 'action' OR g:SubGenre and g.name='comedy'
For the query to use an index to find your genre / subgenre I recommend a UNION query though.
MATCH
(m:Movie)<-[:TYPE*0..4]-(g:Genre { name:'action' })
RETURN distinct m
UNION
MATCH
(m:Movie)-[:SUBGENRE]->(x)<-[:SUB_TYPE*1..3]-(sg:SubGenre {name: 'comedy'})
RETURN distinct m
Perhaps the OPTIONAL MATCH clause might help here. OPTIONAL MATCH behavior is similar to the MATCH statement, except that instead of an all-or-none pattern matching approach, any elements of the pattern that do not match the pattern specified in the statement are bound to null.
For example, to match on a movie, its genre and a possible sub-genre:
MATCH (m:Movie)-[:IS_GENRE]->(g:Genre)
WHERE m.title = "The Matrix"
OPTIONAL MATCH (g)<-[:IS_SUBGENRE]-(sub:Genre)
RETURN m, g, sub
This will return the movie node, the genre node and if it exists, the sub-genre. If there is no sub-genre then it will return null for sub. You can use variable length paths as you have above as well with OPTIONAL MATCH.
[EDITED]
The following MATCH clause should be equivalent to your pseudocode. There is also a USING INDEX clause that assumes you have first created an index on :SubGenre(name), for efficiency. (You could use an index on :Genre(name) instead, if Genre nodes are more numerous than SubGenre nodes.)
MATCH
(m:Movie)<-[:TYPE*0..4]-(g:Genre { name:'action' }),
(m)-[:SUBGENRE]->()<-[:SUB_TYPE*0..3]-(sg:SubGenre { name: 'comedy' })
USING INDEX sg:SubGenre(name)
Here is a console that shows the results for some sample data.
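For reference, the index that the USING INDEX hint above relies on could be created as sketched below. This uses the older index syntax; Neo4j 4.x and later use the form shown in the comment, so check which one your version accepts.

```cypher
// Older syntax, matching the era of this answer.
// On Neo4j 4.x and later the equivalent is:
//   CREATE INDEX FOR (sg:SubGenre) ON (sg.name)
CREATE INDEX ON :SubGenre(name)
```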

Neo4j Cypher query about finding like minded people

I know this may be a simple question but I'm having a hard time finding an answer.
I want to find all "Persons" who have INTERESTED_IN the same Activities as a Person with the id of 1 that is not FRIENDS_WITH person 1
Something like
MATCH (p:Person {Id:1})--[r:INTERSTED_IN]-->(a:Activity {name:Skiing})<--(f:Person)
RETURN f.name
Might be wrong..
I think this will find everyone with the same relationship but then I want to make sure they aren't already friends.
Trying to figure out cypher and can't find any good examples of this.
Almost got it!
MATCH (p:Person { id: 1 })-[r:INTERESTED_IN]->(a:Activity { name: 'Skiing' })<-[r2:INTERESTED_IN]-(f:Person)
WHERE NOT (p)-[:FRIENDS_WITH]-(f)
RETURN f.name
Note that id here is a property, and not the internal node ID. If that's what you're looking for, you'd do this:
MATCH (p:Person)-[r:INTERESTED_IN]->(a:Activity { name: 'Skiing' })<-[r2:INTERESTED_IN]-(f:Person)
WHERE ID(p) = 1 AND NOT (p)-[:FRIENDS_WITH]-(f)
RETURN f.name
And it's "cypher." ;-)
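Since the question asks about people who share any of person 1's activities, not only skiing, a possible generalization of the first query above is sketched here. It assumes id is a node property, as in that query, and the sharedActivities alias is made up for this example.

```cypher
// Assumes `id` is a node property, as in the first query above.
MATCH (p:Person { id: 1 })-[:INTERESTED_IN]->(a:Activity)<-[:INTERESTED_IN]-(f:Person)
// Exclude anyone who is already a friend of p, in either direction
WHERE NOT (p)-[:FRIENDS_WITH]-(f)
// One row per candidate, with the activities they have in common with p
RETURN f.name AS name, collect(a.name) AS sharedActivities
```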
