Matching subtree recursively in Neo4j

I'm using Neo4j, and as quite a newbie I don't really understand how to select a subtree of my graph. I've found solutions using shortestPath and allShortestPaths, but that's not really the same thing as selecting a whole subtree: a node and all of its children.
What I want to do is something like MATCH (n {name: "Sovrum"})-[r:CHILDOF]->(child) RETURN n, child, but that only gives me the directly related nodes.
Instead I want to select the whole subtree.
Is there a good way of doing this, or am I missing some vital point about how this works?

It's quite easy with variable-length paths: you start at the root and tell Cypher to follow CHILDOF relationships all the way down until it can't go any further.
Please make sure to use labels in your query to allow Neo4j (with an index) to find your starting point quickly.
You can also assign the matched pattern to a path and return that path:
MATCH path = (n:Node {name: "Sovrum"})-[:CHILDOF*]->(child)
RETURN path
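To make the semantics of the variable-length pattern concrete, here is a minimal plain-Python sketch of what "follow CHILDOF until it can't go further" collects. It is only a model, not Neo4j code: the CHILDOF dictionary, the node names other than "Sovrum", and the subtree helper are all made up for illustration.

```python
from collections import deque

# Hypothetical CHILDOF edges, stored in the direction used by the query above:
# (n)-[:CHILDOF]->(child). Only "Sovrum" comes from the question.
CHILDOF = {
    "Sovrum": ["Garderob", "Lampa"],
    "Garderob": ["Hylla"],
    "Lampa": [],
    "Hylla": [],
    "Hall": ["Matta"],  # a separate tree that must not be returned
}


def subtree(root, edges):
    """Return every node reachable from root via one or more edges (breadth-first)."""
    found = []
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:
                seen.add(child)
                found.append(child)
                queue.append(child)
    return found


print(subtree("Sovrum", CHILDOF))
```

Like the * in the Cypher pattern, the loop keeps expanding from each newly found node, so both direct children and deeper descendants end up in the result, while nodes in unrelated trees are never visited.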

Related

Neo4j stop searching an undirected path when given node is encountered

I have the following test data in Neo4j:
merge (n1:device {name:"n1"})-[:phys {name:"phys"}]->(:interface {name:"n1a"})-[:cable {name:"cable"}]->(:interface {name:"n2a"})-[:phys {name:"phys"}]->(n2:device {name:"n2"})
merge (n1)-[:phys {name:"phys"}]->(:interface {name:"n1b"})-[:cable {name:"cable"}]->(:interface {name:"n2b"})-[:phys {name:"phys"}]->(n2)
merge (n1)-[:phys {name:"phys"}]->(:interface {name:"n1c"})-[:cable {name:"cable"}]->(:interface {name:"n2c"})-[:phys {name:"phys"}]->(n2)
merge (n1)-[:phys {name:"phys"}]->(:interface {name:"n1d"})-[:cable {name:"cable"}]->(:interface {name:"n2d"})
Giving:
While each of the paths between n1 and n2 in this example has exactly 3 relationships and 2 intermediate nodes, my real data could have longer paths, and many more of them.
This is an undirected graph, and in the real dataset the relationships along each path can point in either direction.
I know that every path starts at a :device and either just ends at a non :device or ends at a :device, and along the way there could be any number of relationships and other non :device nodes.
So I am looking to do:
match p=(:device {name:"n1"})-[*]-(:device) return (p)
and have it return the same number of records (I would be happy with double) as:
match p=(:device {name:"n1"})-[*]->(:device) return (p)
So I am looking for a way to stop matching relationships and cease following the path when the first (:device) is encountered in the path.
From my limited understanding, I could easily achieve this by making every relationship bidirectional. However I have avoided that option to date as I have read it is bad practice.
Extra for experts :-)
Additionally, I would like a way to return any full paths that don't end at a :device (eg, the bottom one)
Thanks
This use case is a little hard to handle with just Cypher, as there is no way to specify "follow a variable-length path and stop when you reach another node of this type".
We can do something like this when we use LIMIT, but that becomes too restrictive when we don't know how many results there will be, or when we need to do this for multiple starting nodes.
Because of this, there are some APOC path expander procedures with more flexible options. One of these is the labelFilter option, which lets you describe how to filter nodes with particular labels found during expansion (blacklisting, whitelisting, etc). One of these filters is the termination filter (a / symbol before the appropriate label), which means: include the path to that node as a result and stop expansion there, which is exactly what you're looking for.
After you install APOC, you can use the apoc.path.expandConfig() procedure, starting from your start node, and supply the labelFilter config parameter to get this behavior:
MATCH (start:device {name:"n1"})
CALL apoc.path.expandConfig(start, {labelFilter:'/device'}) YIELD path
RETURN path
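The idea behind the termination filter can be sketched in plain Python on the test data from the question. This is only a model of the behaviour described above, not APOC's implementation: expand_until and build_adjacency are hypothetical helpers, direction is ignored, and a node may not repeat within one path (a simplification, APOC's own uniqueness rules are configurable).

```python
from collections import defaultdict

# The test data from the question, with relationship direction dropped.
EDGES = [
    ("n1", "n1a"), ("n1a", "n2a"), ("n2a", "n2"),
    ("n1", "n1b"), ("n1b", "n2b"), ("n2b", "n2"),
    ("n1", "n1c"), ("n1c", "n2c"), ("n2c", "n2"),
    ("n1", "n1d"), ("n1d", "n2d"),
]
LABELS = {"n1": "device", "n2": "device"}  # every other node is an interface


def build_adjacency(edges):
    """Undirected adjacency lists: each edge can be walked both ways."""
    adj = defaultdict(list)
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    return adj


def expand_until(start, stop_label, adj, labels):
    """Expand from start; emit a path and stop as soon as it reaches stop_label."""
    terminated, dead_ends = [], []
    stack = [[start]]
    while stack:
        path = stack.pop()
        steps = [n for n in adj[path[-1]] if n not in path]
        if not steps and len(path) > 1:
            dead_ends.append(path)  # ran out of graph without meeting stop_label
        for nxt in steps:
            if labels.get(nxt) == stop_label:
                terminated.append(path + [nxt])  # include the path, do not expand further
            else:
                stack.append(path + [nxt])
    return terminated, dead_ends


terminated, dead_ends = expand_until("n1", "device", build_adjacency(EDGES), LABELS)
print(terminated)
print(dead_ends)
```

Because expansion never continues past n2, the sketch does not wander back towards n1 through the other interfaces, which is what makes the undirected match explode. The dead_ends list corresponds to the "extra for experts" part of the question: paths that end at a non-device node.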

Finding mutual friends in a Cypher statement starting with three or more persons

I am trying to build a Cypher statement for Neo4j where I know 2 to n starting nodes by name and need to find a node (if any) that can be reached from all of the starting nodes.
At first I thought it was similar to the "Mutual Friend" situation, which could be handled like
(start1)-[*..2]->(main)<-[*..2]-(start2)
but in my case I often have more than 2 starting points, up to around 6, that I know by name.
So basically I am puzzled by how I can include the third, fourth and further nodes in the Cypher to find a common root amongst them.
In the above example from the Neo4j website, I would need a path starting with 'Dilshad', 'Becky' and 'Cesar' to check whether they have a common friend (Anders), excluding 'Filipa' and 'Emil' as they are not friends of all three.
So far I would create a statement programmatically that looks like
MATCH (start1 {name:'Person1'}), (start2 {name:'Person2'}),
(start3 {name: 'Person3'}), (main)
WHERE (start1)-[*..2]->(main) AND
(start2)-[*..2]->(main) AND
(start3)-[*..2]->(main) RETURN distinct main
But I was wondering if there is a more elegant or efficient way in Cypher, possibly one where I could pass the list of names as a parameter.
The query shown in your question builds a cartesian product because you are matching multiple disconnected patterns.
Instead of matching all nodes separately and using WHERE to restrict the relationships between them, you can do something like:
MATCH (start1 {name:'Person1'})-[*..2]->(main),
(start2 {name:'Person2'})-[*..2]->(main),
(start3 {name: 'Person3'})-[*..2]->(main)
RETURN main
The above query will be more efficient because it matches only the required pattern. Note that when you write MATCH (start1 {name:'Person1'}), (start2 {name:'Person2'}), (start3 {name: 'Person3'}), (main), the (main) part matches every node in your graph because it has no restrictions. You can use PROFILE with your query to see this more clearly. One caveat: within a single MATCH clause, Cypher does not traverse the same relationship twice, so the three paths cannot share a relationship; if they may overlap, put each pattern in its own MATCH clause.
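The underlying logic generalises to any number of starting names: compute what each start can reach within two hops, then intersect the results. The sketch below models that in plain Python, not Cypher. The names are taken from the question, but the FRIEND edges are invented for illustration (the example graph from the Neo4j website is not reproduced here), and reachable and common_targets are hypothetical helpers.

```python
# Hypothetical directed edges; only the names come from the question.
FRIEND = {
    "Dilshad": ["Anders", "Filipa"],
    "Becky": ["Anders", "Emil"],
    "Cesar": ["Emil"],
    "Emil": ["Anders"],
}


def reachable(start, edges, max_hops=2):
    """Nodes reachable from start in 1..max_hops directed steps, like -[*..2]->."""
    frontier = {start}
    found = set()
    for _ in range(max_hops):
        frontier = {nxt for node in frontier for nxt in edges.get(node, [])}
        found |= frontier
    return found


def common_targets(names, edges, max_hops=2):
    """Nodes that every one of the given start names can reach."""
    sets = [reachable(name, edges, max_hops) for name in names]
    return set.intersection(*sets) if sets else set()


print(common_targets(["Dilshad", "Becky", "Cesar"], FRIEND))
```

Since the list of names is just an argument, the same function works for two starting points or six, which is the shape a parameterised query would need as well.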

Shortest path that has to include certain waypoints

I’m trying to find the shortest path that connects an arbitrary collection of nodes. Both start and end can be any of the nodes in the collection, as long as they are not the same.
Standard cypher functions shortestPath() or allShortestPaths() fail because they find the shortest path from start to end and do not include waypoints.
The following cypher works, but is there a faster way?
//some collection of nodeids, as waypoints the path has to include
match (n) where id(n) IN [24259,11,24647,28333,196]
with collect(n) as wps
// create possible start and end points
unwind wps as wpstart
unwind wps as wpend
with wps,wpstart,wpend where id(wpstart)<id(wpend)
// find paths that include all nodes in wps
match p=((wpstart)-[*..6]-(wpend))
where ALL(n IN wps WHERE n IN nodes(p))
// return paths, ordered by length
return id(wpstart),id(wpend),length(p) as lp,EXTRACT(n IN nodes(p) | id(n)) order by lp asc
Update 23-10-2015:
With the latest Neo4j version, 2.3.0, it is possible to combine shortestPath() with a WHERE clause that is pulled in somewhere during the evaluation process. You then get a construct like this, in which {wps} is a collection of node IDs.
// unwind the collection to create combinations of all start-end points
UNWIND {wps} AS wpstartid
UNWIND {wps} AS wpendid
WITH wpstartid,wpendid WHERE wpstartid<wpendid
// for each start-end combination, calculate shortestPath() with a WHERE clause
MATCH (wpstart) WHERE id(wpstart)=wpstartid
MATCH (wpend) WHERE id(wpend)=wpendid
MATCH p=shortestPath((wpstart)-[*..5]-(wpend))
WHERE ALL(id IN {wps} WHERE id IN EXTRACT(n IN nodes(p) | id(n)) )
//return the shortest of the shortestPath()s
WITH p, size(nodes(p)) as length order by length limit 1
RETURN EXTRACT(n IN nodes(p) | id(n))
This approach does not always work, since an internal optimization determines at which stage the WHERE clause is applied. So beware, and be prepared to fall back to the brute-force approach at the beginning of this post.
This is going to be a really unsatisfying answer, but here goes:
I strongly suspect that the question you're asking is at least as hard as the Hamiltonian path problem: when every node is a waypoint, it contains that problem as a special case. This is a classic graph problem that turns out to be NP-complete. So practically speaking, while it might be possible to implement this, the performance is likely to be horrific.
If you really must implement this, I'd probably recommend not using cypher, and instead building something with the neo4j traversal framework. You can find sample code and algorithms online that will do at least a portion of this. But more broadly, if your data is larger than trivial in size, the unsatisfying part of this answer is that I probably wouldn't do it at all.
Better options might be to decompose your graph into smaller sub-problems which you can work independently, or coming up with another heuristic method that gets you close to what you want, but not via this method.
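For small graphs, the brute-force idea from the question can be sketched outside the database as a bounded depth-first search. This is an illustration under stated assumptions, not the traversal framework: the graph and the shortest_path_through helper are hypothetical, nodes may not repeat within a path (Cypher's own rule is about relationships), and the running time grows exponentially with the hop limit.

```python
from collections import defaultdict

# Hypothetical undirected graph: a ring a-b-c-d-e-a.
EDGES = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "a")]
ADJ = defaultdict(list)
for u, v in EDGES:
    ADJ[u].append(v)
    ADJ[v].append(u)


def shortest_path_through(adj, waypoints, max_hops=6):
    """Shortest simple path containing every waypoint, or None if none is found."""
    required = set(waypoints)
    best = None

    def extend(path):
        nonlocal best
        if best is not None and len(path) >= len(best):
            return  # cannot beat the best path found so far
        if required.issubset(path):
            best = list(path)
            return
        if len(path) - 1 == max_hops:
            return  # hop limit reached, like the upper bound in -[*..6]-
        for nxt in adj[path[-1]]:
            if nxt not in path:
                path.append(nxt)
                extend(path)
                path.pop()

    # A shortest covering path can always be trimmed to start at a waypoint.
    for start in waypoints:
        extend([start])
    return best


print(shortest_path_through(ADJ, ["a", "c", "d"]))
```

The pruning on the best length found so far helps in practice, but the worst case still explores every simple path up to the hop limit, which is why this only suits small inputs.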

Neo4J find shortest path using all nodes (unordered) cypher

I'm not sure if this can be done in any efficient way, but I'm hoping it can be.
I am getting a set of data that allows me to find very specific nodes. However, this data is not ordered in any way in terms of how the nodes are connected.
What I am trying to do is find all of these nodes in Neo4j (up to 7) and then, given these 7 nodes, find the path that connects them all.
These given nodes will be the only nodes connected in the desired path.
Basically, I'm trying to take a set that looks like
1,2,3,4,5,6,7
and to be able to find
2->7->6->3<-5<-1->4
Any help or direction would be greatly appreciated.
The way I would do it is the following:
You need a starting node from which to query the next 7 nodes. To be able to find the very first 7 nodes, I would introduce a starting root node. Let's simply call it :Root.
MATCH (:Root)-[r:NEXT*1..7]->(x)<-[]-(y) RETURN x, y
or even simpler:
MATCH (:Root)-[r:NEXT*..7]->(x)<-[]-(y) RETURN x, y
:Root of course could be any other node in your set, to get the next seven nodes from there on.
Is this what you want?
Take a further look at the following neo4j cheat sheet, which has some great tips:
http://assets.neo4j.org/download/Neo4j_CheatSheet_v3.pdf
Regards
EDIT
Ok sorry, I misunderstood you.
Maybe this brings you further:
MATCH (n:Node) WHERE n.refId IN [1,2,3,4,5,6,7]
MATCH (n2:Node) WHERE n2.refId IN [1,2,3,4,5,6,7] AND n.refId < n2.refId
MATCH p=shortestPath((n)-[:NEXT*]-(n2))
RETURN collect(distinct p)
or, if those numbers are node IDs, then like this:
MATCH (n:Node) WHERE id(n) IN [1,2,3,4,5,6,7]
MATCH (n2:Node) WHERE id(n2) IN [1,2,3,4,5,6,7] AND id(n) < id(n2)
MATCH p=shortestPath((n)-[:NEXT*]-(n2))
RETURN collect(distinct p)
This returns the shortest path between each pair of the given nodes, as a collection.
So it doesn't return a single path for all those nodes.
I am not aware of a function that does that.
However, the Neo4j browser may appear to display a single path connecting all those nodes, because of its auto-complete feature. So I think you would have to build your own logic in code if you want to join those paths into a single one.
Maybe this is at least a starting point for the problem.

Using Cypher to find nodes in a subgraph that are NOT connected to a specified node

I'm learning cypher with Neo4j but I'm having some problems that show I still don't quite get it.
I'm trying to write a query that finds a subgraph and then excludes nodes from that subgraph that are connected to a specified node.
In practice, it's a recommendation problem: I find a set of recommendations, but want to exclude those things that a target user already knows about.
I thought I could do something like:
match (u:User{id:"some id"}), (:Category{title:"some category"})-[:categorizes]->(i:Item)
where not (u)-[:knows_about]-(i)
return i
but this doesn't work.
Can anyone explain what I'm doing wrong/what I should be doing?
I think you want the following:
MATCH (:Category{title:"some category"})-[:categorizes]->(i:Item)
MATCH (u:User {id:"some id"})
WHERE not (u)-[:knows_about]-(i)
RETURN i
You might want to add a direction to the relationship pattern in the WHERE clause (performance!).
