Finding cypher paths that don't visit the same node twice

I'm looking for the paths between two nodes in a graph, but my graph has a loop in it and so I'm getting paths back that are undesirable. I'm hoping that someone here can help me think out a sensible remedy.
Here's my graph:
A
|
2
|
C-3-D
|   |
|   4
5   |
|   E
|   |
|   6
|   |
F-7-G
The letters are nodes, and the numbers are edges (relationships).
CREATE (a {i: "A"})
CREATE (c {i: "C"})
CREATE (d {i: "D"})
CREATE (e {i: "E"})
CREATE (f {i: "F"})
CREATE (g {i: "G"})
CREATE a-[:r {i:2}]->c-[:r {i:3}]->d-[:r {i:4}]->e-[:r {i:6}]->g
CREATE c-[:r {i:5}]->f-[:r {i:7}]->g;
I'm looking for paths between A and C, and I'd only expect there to be one, but there are three!
neo4j-sh (?)$ MATCH p=({i: "A"})-[:r*]-({i: "C"}) return EXTRACT(n IN NODES(p) | n.i);
+-------------------------------+
| EXTRACT(n IN NODES(p) | n.i)  |
+-------------------------------+
| ["A","C"]                     |
| ["A","C","D","E","G","F","C"] |
| ["A","C","F","G","E","D","C"] |
+-------------------------------+
neo4j-sh (?)$ MATCH p=({i: "A"})-[:r*]-({i: "C"}) return EXTRACT(n IN RELATIONSHIPS(p) | n.i);
+--------------------------------------+
| EXTRACT(n IN RELATIONSHIPS(p) | n.i) |
+--------------------------------------+
| [2]                                  |
| [2,3,4,6,7,5]                        |
| [2,5,7,6,4,3]                        |
+--------------------------------------+
That makes sense from a graph perspective because the path isn't visiting the same edge twice, but from a node perspective it's a pain because C clearly is visited twice.
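To see why three rows come back, the matching rule can be modelled outside Neo4j. The following is a minimal Python sketch, not Neo4j code: the EDGES table and the trails helper are hypothetical names made up for illustration. It enumerates every walk from A to C that never reuses an edge, then keeps only the walks that never repeat a node.

```python
# Undirected edges from the graph above: edge id -> (node, node)
EDGES = {2: ("A", "C"), 3: ("C", "D"), 4: ("D", "E"),
         5: ("C", "F"), 6: ("E", "G"), 7: ("F", "G")}

def trails(start, end):
    """Yield (nodes, edge_ids) for every walk from start to end that uses no edge twice."""
    def walk(node, nodes, used):
        # Report a match whenever we stand on the end node after at least one hop,
        # then keep walking, because a longer walk may return to the end node later.
        if used and node == end:
            yield nodes, used
        for eid, (u, v) in EDGES.items():
            if eid in used or node not in (u, v):
                continue
            nxt = v if node == u else u
            yield from walk(nxt, nodes + [nxt], used + [eid])
    yield from walk(start, [start], [])

all_trails = list(trails("A", "C"))
# A path is node-unique when no node id occurs more than once in it.
simple = [(n, e) for n, e in all_trails if len(set(n)) == len(n)]

for nodes, edge_ids in all_trails:
    print(nodes, edge_ids)
print("node-unique:", simple)
```

Under edge uniqueness alone the loop C-D-E-G-F-C is legal in both directions, which accounts for the two extra rows; only the direct hop survives the node-uniqueness filter.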
One thought was to try shortest paths, using allShortestPaths, however that only appears to filter the result by returning just the shortest length paths, which isn't the same as avoiding passing through the same node twice. For example the route A->G has two paths:
"A" -> "G" : [[2, 5, 7], [2, 3, 4, 6]]
but when I use allShortestPaths I only get the three hop path [2,5,7].
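The difference between "shortest" and "node-unique" can be sketched in Python as follows. This is an illustration under assumptions, not Neo4j behaviour reproduced exactly: EDGES and simple_paths are hypothetical names, and the sketch only models the example graph.

```python
# Undirected edges from the graph above: edge id -> (node, node)
EDGES = {2: ("A", "C"), 3: ("C", "D"), 4: ("D", "E"),
         5: ("C", "F"), 6: ("E", "G"), 7: ("F", "G")}

def simple_paths(node, end, visited=()):
    """Yield edge-id lists for paths from node to end that never revisit a node."""
    visited = visited + (node,)
    for eid, (u, v) in EDGES.items():
        if node not in (u, v):
            continue
        nxt = v if node == u else u
        if nxt in visited:
            continue  # would visit a node twice
        if nxt == end:
            yield [eid]
        else:
            for rest in simple_paths(nxt, end, visited):
                yield [eid] + rest

paths = list(simple_paths("A", "G"))
shortest_len = min(len(p) for p in paths)
shortest = [p for p in paths if len(p) == shortest_len]

print(paths)     # every node-unique path
print(shortest)  # only the minimum-length ones
```

Both routes are node-unique, but filtering by minimum length drops the four-hop one, which is the behaviour described for allShortestPaths.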
Is there a sensible way of applying a restriction so that I only get paths where each node is only visited once?

I think you should use shortestPath or allShortestPaths, like this:
MATCH p=shortestPath((:Label1 {i: "A"})-[:r*]-(:Label2 {i: "C"}))
RETURN EXTRACT(n IN NODES(p) | n.i);
Make sure to create an index or constraint on :Label(i).
Alternatively, you can try something like this, which filters out all paths where a node appears twice:
MATCH p=({ i: "A" })-[:r*]-({ i: "C" })
WHERE NONE (n IN nodes(p)
            WHERE size(filter(x IN nodes(p) WHERE n = x)) > 1)
RETURN EXTRACT(n IN RELATIONSHIPS(p) | n.i);
The index hint is an optimization for a real-world dataset.

Thanks to Michael Hunger.
In Neo4j 4.0.7, filter and EXTRACT are no longer supported; you must use list comprehensions instead.
The Cypher statement looks like this (it filters out all paths where a node appears twice):
MATCH p=({ i: "A" })-[:r*]-({ i: "C" })
WHERE NONE (n IN nodes(p)
            WHERE size([x IN nodes(p) WHERE n = x]) > 1)
RETURN [n IN nodes(p) | n.i]
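The logic of the filter (count how often each node occurs in the path and reject the path if any count exceeds 1) can be checked in isolation. The Python sketch below is only an illustration of that predicate; visits_each_node_once is a hypothetical helper name, and the input lists are the node sequences from the question's output.

```python
def visits_each_node_once(node_ids):
    """True when no node id occurs more than once in the path."""
    return not any(node_ids.count(n) > 1 for n in node_ids)

paths = [
    ["A", "C"],
    ["A", "C", "D", "E", "G", "F", "C"],
    ["A", "C", "F", "G", "E", "D", "C"],
]
kept = [p for p in paths if visits_each_node_once(p)]
print(kept)
```

Note that the threshold matters: every node occurs at least once in its own path, so the comparison has to reject counts above 1.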

Related

Unexpected result. Expected to repeat path

Example graph:
Matrix-->Neo
Matrix-->Morpheus
In Neo4j version 1.* the following query returned both Neo and Morpheus:
START n=node(0)
MATCH n<--matrix-->m
RETURN m
But in 2.* it returns only Morpheus. Why?
Real example:
Graph Setup:
create
(_0:`Crew` {`name`:"Neo"}),
(_1:`Crew` {`name`:"Morpheus"}),
(_2:`World` {`name`:"Matrix"}),
_2-[:LIVE]->_1,
_2-[:LIVE]->_0
Query:
START
n=node(*)
MATCH
n<--matrix-->Neo
WHERE
n.name="Neo"
RETURN
n,
Neo
live test: http://console.neo4j.org/?id=vuo9ut
Actual result:
| n | Neo |
| (0:Crew {name:"Neo"}) | (1:Crew {name:"Morpheus"}) |
Expected result:
| n | Neo |
| (0:Crew {name:"Neo"}) | (0:Crew {name:"Neo"}) |
| (0:Crew {name:"Neo"}) | (1:Crew {name:"Morpheus"}) |

You can test the same query against different Cypher parser versions in the console you linked above. Just select "Options" in the upper right and choose "Cypher parser v1.9". Doing this for your example queries, I get the same result for both "Latest Cypher version" and "Cypher parser v1.9". If you are getting different results for the same query please submit an issue.
Note that the START clause is no longer necessary in Cypher, so (given your sample graph) your query could be rewritten like this:
MATCH (n:Crew)<-[:LIVE]-(w:World)-[:LIVE]->(neo:Crew)
WHERE n.name = "Neo"
RETURN neo, n
+------------------------------------------------+
| neo | n |
+------------------------------------------------+
| Node[1]{name:"Morpheus"} | Node[0]{name:"Neo"} |
+------------------------------------------------+
This query is essentially asking "What other crew members live in the same world as Neo?". Looking at the data in your console and at the query, the results you report are what I would expect the query to return.
The reason you are not getting your expected results is that the pattern does not exist in your graph. To get the results you are expecting, you would have to modify the query to something like this:
MATCH (n:Crew)<-[:LIVE]-(w:World)
WHERE n.name = "Neo"
MATCH (neo)<-[:LIVE]-(w)
RETURN neo, n
+------------------------------------------------+
| neo | n |
+------------------------------------------------+
| Node[1]{name:"Morpheus"} | Node[0]{name:"Neo"} |
| Node[0]{name:"Neo"} | Node[0]{name:"Neo"} |
+------------------------------------------------+
Which is essentially asking "What world does Neo live in? Who are all the crew members that live in this world, including Neo?"
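The underlying rule is that Cypher will not use the same relationship twice within the pattern of a single MATCH clause, while separate MATCH clauses are not constrained against each other. Here is a rough Python model of that rule, offered as a sketch only: RELS and co_residents are hypothetical names, and the list indices stand in for relationship ids.

```python
# One entry per LIVE relationship in the sample graph: (world, crew member)
RELS = [("Matrix", "Morpheus"), ("Matrix", "Neo")]

def co_residents(name, enforce_uniqueness):
    """Return (neo, n) rows for (n)<-[:LIVE]-(w)-[:LIVE]->(neo) where n.name = name."""
    rows = []
    for i, (w1, n) in enumerate(RELS):        # first hop: (n)<-[:LIVE]-(w)
        if n != name:
            continue
        for j, (w2, neo) in enumerate(RELS):  # second hop: (w)-[:LIVE]->(neo)
            if w2 != w1:
                continue
            if enforce_uniqueness and i == j:
                continue  # same relationship reused: rejected within one MATCH
            rows.append((neo, n))
    return rows

print(co_residents("Neo", enforce_uniqueness=True))   # single MATCH clause
print(co_residents("Neo", enforce_uniqueness=False))  # pattern split over two MATCH clauses
```

With uniqueness enforced only Morpheus is returned; without it Neo is also matched through the same LIVE relationship.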

How to query for first node based on first match within array (and found nodes)?

Well, I'm back here to repropose an old unanswered question. I'll try to explain it better.
I got the following cypher query:
neo4j-sh$ start n=node(1344) match (n)-[t:_HAS_TRANSLATION]-(p) return t,p;
+-----------------------------------------------------------------------------------+
| t | p |
+-----------------------------------------------------------------------------------+
| :_HAS_TRANSLATION[2224]{of:"value"} | Node[1349]{language:"hi-hi",text:"(>0#"} |
| :_HAS_TRANSLATION[2223]{of:"value"} | Node[1348]{language:"es-es",text:"hembra"} |
| :_HAS_TRANSLATION[2222]{of:"value"} | Node[1347]{language:"ru-ru",text:"65=A:89"} |
| :_HAS_TRANSLATION[2221]{of:"value"} | Node[1346]{language:"en-us",text:"female"} |
| :_HAS_TRANSLATION[2220]{of:"value"} | Node[1345]{language:"it-it",text:"femmina"} |
+-----------------------------------------------------------------------------------+
Then I have a dynamic array of languages (can change at any query), in any order, like
["fr-fr","jp-jp","en-us", "it-it", "de-de", "ru-ru", "hi-hi"]
I need a query that extracts only the first p according to the order of the array (in this case Node[1346]{language:"en-us",text:"female"}, because "en-us" is the first entry of the array that matches a value in the p column).
Thank you again for your patience.
Paolo

Maybe there's a smarter way to solve this, but at least this statement should solve your question:
START n=node(1344) match (n)-[t:_HAS_TRANSLATION]-(p)
WITH n, collect(p.language) as languages
WITH n, filter(x in ["fr-fr","jp-jp","en-us","it-it","de-de","ru-ru","hi-hi"] where x in languages)[0] as lang
MATCH (n)-[t:_HAS_TRANSLATION]-(p)
WHERE p.language = lang
RETURN t,p
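The idea behind the statement is to walk the preference array in order and take the first entry that is also among the collected languages. As a plain Python sketch (the variable names are illustrative; the values come from the question):

```python
# Preference order supplied with the query
preferred = ["fr-fr", "jp-jp", "en-us", "it-it", "de-de", "ru-ru", "hi-hi"]
# Languages of the translation nodes attached to node 1344
available = {"hi-hi", "es-es", "ru-ru", "en-us", "it-it"}

# First preferred language that actually has a translation, or None if there is none
lang = next((x for x in preferred if x in available), None)
print(lang)
```

Iterating over the preference list rather than over the stored translations is what preserves the caller's ordering.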

Multiple optional matches in a cypher query, one doesn't match, not sure why

I have been modeling some data related to call paths. A call enters the system, then maybe someone answers, then someone transfers, then someone hangs up, maybe some other stuff happens in between. Since they are ordered from beginning to end, I decided to model them as a linked list:
(call:Call)-[r:NextEvent]->(e:Event)-[r:NextEvent]->(e:Event)
and so on for as many events as there are. To query all the events that happen on a call I can go:
neo4j-sh (?) $ match (call:Call)-[:NextEvent*]->(lastEvent:Event) where call.callid="123"
> return lastEvent;
+------------------------------------------------------------------------------------------------------+
| lastEvent |
+------------------------------------------------------------------------------------------------------+
| Node[22]{name:"Newcall",callerid:"1231231234",calleridname:"David Foo",destination:"3213214321"} |
| Node[24]{name:"EnterQueue"} |
| Node[27]{name:"RingAttempt"} |
+------------------------------------------------------------------------------------------------------+
That's pretty much perfect. When someone enters a queue, I'd like to know what queue they are in, and when a ring attempt is made, I'd like to know the user whose phone rang, so I added some relations.
neo4j-sh (?)$ match (e:Event)-[r:Agent]->(agent:User) where e.name="RingAttempt" return e,agent;
+-----------------------------------------------------------------------------------------------------------------------------------+
| e | agent |
+-----------------------------------------------------------------------------------------------------------------------------------+
| Node[27]{name:"RingAttempt"} | Node[26]{username:"david.foo@foo.com"} |
+-----------------------------------------------------------------------------------------------------------------------------------+
neo4j-sh (?)$ match (e:Event)-[r:Queue]->(queue:Queue) where e.name="EnterQueue" return e,queue;
+-------------------------------------------------------------------+
| e | queue |
+-------------------------------------------------------------------+
| Node[24]{name:"EnterQueue"} | Node[17]{name:"Main Support Queue"} |
+-------------------------------------------------------------------+
Now I'd like to run a query that will get each event, and if the event is a ringattempt, also give me the agent it attempted to ring, and if the event is an enterqueue, give me the queue that was entered, so I tried to write this:
neo4j-sh (?)$ match p = (call:Call)-[:NextEvent*]->(lastEvent:Event) where call.callid="123"
> optional match (lastEvent)-[r:Queue]->(queue:Queue) where lastEvent.name="EnterQueue"
> optional match (lastEvent)-[r:Agent]->(agent:User) where lastEvent.name="RingAttempt"
> return lastEvent,queue,agent;
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| lastEvent | queue | agent |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| Node[22]{name:"Newcall",callerid:"1231231234",calleridname:"David Foo",destination:"3213214321"} | <null> | <null> |
| Node[24]{name:"EnterQueue"} | Node[17]{name:"Main Support Queue"} | <null> |
| Node[27]{name:"RingAttempt"} | <null> | <null> |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
But why is agent null? I know for a fact that it exists. When I swap the two optional matches in the cypher query, it causes queue to be null and agent will instead be correct. I don't understand why.
Just to be clear I'm using neo4j-community-2.0.0-RC1.

You're using the same r identifier for the two optional match relationships, so it's already bound by the time you get to the second optional match, either as null, or as a relationship to a Queue. So, it will never match the Agent. Since you don't seem to care about the r, you can leave that out of the optional match.
match p = (call:Call)-[:NextEvent*]->(lastEvent:Event) where call.callid="123"
optional match (lastEvent)-[:Queue]->(queue:Queue) where lastEvent.name="EnterQueue"
optional match (lastEvent)-[:Agent]->(agent:User) where lastEvent.name="RingAttempt"
return lastEvent,queue,agent;

cypher find relation direction

How can I find a relation direction with regards to a containing path? I need this to do a weighted graph search that takes into account relation direction (weighing "wrong" direction with a 0, see also comments).
Let's say:
START a=node({param})
MATCH a-[*]-b
WITH a, b
MATCH p = allshortestpaths(a-[*]-b)
RETURN extract(r in rels(p): flows_with_path(r)) as in_flow
where flows_with_path = 1 if sp = (a)-[*0..]-[r]->[*0..]-(b), otherwise 0
EDIT: corrected query

So, here's a way to do it with existing cypher functions. I don't promise it's super performant, but give it a shot. We're building our collection with reduce, using an accumulator tuple with a collection and the last node we looked at, so we can check that it's connected to the next node. This requires 2.0's case/when syntax--there may be a way to do it in 1.9 but it's probably even more complex.
START a=node:node_auto_index(name="Trinity")
MATCH a-[*]-b
WHERE a <> b
WITH distinct a,b
MATCH p = allshortestpaths(a-[*]-b)
RETURN extract(x in nodes(p): x.name?), // a concise representation of the path we're checking
head(
reduce(acc=[[], head(nodes(p))], x IN tail(nodes(p)): // pop the first node off, traverse the tail
CASE WHEN ALL (y IN tail(acc) WHERE y-->x) // a bit of a hack because tail(acc)-->x doesn't parse right, so I had to wrap it so I can have a bare identifier in the pattern predicate
THEN [head(acc) + 0, x] // add a 0 to our accumulator collection
ELSE [head(acc) + 1, x] // add a 1 to our accumulator collection
END )) AS in_line
http://console.neo4j.org/r/v0jx03
Output:
+---------------------------------------------------------------------------+
| extract(x in nodes(p): x.name?) | in_line |
+---------------------------------------------------------------------------+
| ["Trinity","Morpheus"] | [1] |
| ["Trinity","Morpheus","Cypher"] | [1,0] |
| ["Trinity","Morpheus","Cypher","Agent Smith"] | [1,0,0] |
| ["Trinity","Morpheus","Cypher","Agent Smith","The Architect"] | [1,0,0,0] |
| ["Trinity","Neo"] | [1] |
| ["Trinity","Neo",<null>] | [1,1] |
+---------------------------------------------------------------------------+
Note: Thanks @boggle for the brainstorming session.
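The pairwise walk that the reduce expression performs (remember the previous node, look at the next one, record a flag for the relationship between them) may be easier to follow outside Cypher. The Python sketch below is an illustration only: DIRECTED is a made-up set of relationships, not the data behind the console link, and flow_flags uses the question's convention of 1 for a relationship that points along the path and 0 for one that points against it.

```python
# Hypothetical directed relationships as (start, end) pairs
DIRECTED = {("Trinity", "Morpheus"), ("Cypher", "Morpheus"), ("Cypher", "Agent Smith")}

def flow_flags(path_nodes):
    """Return 1 for each hop whose relationship points along the path, 0 if it points against it."""
    flags = []
    for prev, nxt in zip(path_nodes, path_nodes[1:]):
        if (prev, nxt) in DIRECTED:
            flags.append(1)
        elif (nxt, prev) in DIRECTED:
            flags.append(0)
        else:
            raise ValueError(f"no relationship between {prev} and {nxt}")
    return flags

print(flow_flags(["Trinity", "Morpheus", "Cypher", "Agent Smith"]))
```

The resulting list can then be used as per-hop weights, which is what the question asks for.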

Using regular expressions beyond matching in Cypher

I make the following query
neo4j-sh (?)$ start n=node(*) where n.name =~ 'u(.*)' return n;
==> +-----------------------+
==> | n |
==> +-----------------------+
==> | Node[311]{name:"u1"} |
==> | Node[312]{name:"u2"} |
==> | Node[313]{name:"u3"} |
==> | Node[314]{name:"u4"} |
I want to add a "userId" property and set it to the number contained in the name property. That is, I want:
==> +-----------------------+
==> | n |
==> +-----------------------+
==> | Node[311]{name:"u1", userId:'1'} |
==> | Node[312]{name:"u2", userId:'2'} |
==> | Node[313]{name:"u3", userId:'3'} |
==> | Node[314]{name:"u4", userId:'4'} |
To do that, I need to extract the number from n.name.
How can I do this? How can I get the value from the (.*) in regex?

You can't do that in Cypher (as far as I know)--regex is just for matching.
If it's always just a single letter in front of it, you can take the substring:
start n=node(*)
set n.userId = substring(n.name, 1)
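For names of the form shown in the question, dropping the first character gives the same value that the capture group (.*) would. A small Python sketch of that equivalence, using the four names from the question (this runs outside Neo4j and is only meant to illustrate the reasoning):

```python
import re

names = ["u1", "u2", "u3", "u4"]

# Equivalent of substring(n.name, 1): everything after the first character
by_substring = [name[1:] for name in names]

# What the capture group in 'u(.*)' would hold when the whole string must match
by_group = [re.fullmatch(r"u(.*)", name).group(1) for name in names]

print(by_substring)
print(by_group)
```

The two agree only while the prefix is exactly one character long; a longer or variable prefix would need real group extraction.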

I had the same issue, so I developed a tiny Neo4j server plugin that lets you run regular expressions over REST API and against string node properties for matching/splitting/substituting purposes. It can return the results inside Neo4j console or save them to a property.
Have a look at it, maybe you find it useful: https://github.com/mszargar/Regx4Neo
It takes just a minute to install, but you will have to restart Neo4j for it to take effect.

You should be able to use matched groups with an APOC apoc.text.regexGroups function:
MATCH (a:AppJar)-[:CONTAINS]->(b)
WHERE b.fileName STARTS WITH "/BOOT-INF/lib/event-"
WITH a, b, last(last(apoc.text.regexGroups(b.fileName, '/BOOT-INF/lib/event-([\\d\\.]+).jar'))) as version
SET b.version = version
RETURN a.appName, b.fileName, b.version
LIMIT 100
It is a different query, but the same approach should be applicable to your data as well.
