How to perform distinct while having multiple paths using Cypher - neo4j

For a given node (sourceNode), I want to retrieve all nodes that have relationships to my sourceNode within 3 hops.
The problem starts when there are multiple paths between the source and destination nodes.
I don't care which path I get as long as I get one, and I don't want the others (it would be great to get only the shortest path).
So this is my code:
MATCH (user:C9 {userId:'70'})-[r:follow*1..3]-(f) WHERE f <> user
RETURN DISTINCT (f.userId) as userId,
reduce(s = '', rel IN r | s + rel.dist + ',') as dist,
length(r) as hop
The response contains the same userId more than once, so DISTINCT is not removing the duplicates.
I would like to avoid the duplicated rows with the same userId.
Any idea how to perform the distinct here?
Thanks,
ray.

How about something like this? Rather than looking for distinct users, just use shortestPath to get to each follower 1..3 hops out from the starting user.
MATCH p=shortestPath((user:C9 {userId:'70'})-[r:follow*1..3]-(f))
WHERE f <> user
RETURN f.userId,
reduce(s = '', rel IN r | s + rel.dist + ',') as dist,
length(p) as hop
Alternatively, if you want the shortest total distance regardless of hops, you could do something like the following. Instead of using shortestPath, sum the distances on each path's relationships, order by that sum, collect the results per user, and return the first element of each collection, which will be the shortest.
MATCH p=(user:C9 {userId:'70'})-[r:follow*1..3]-(f)
WHERE f <> user
with f.userId as user_id
, reduce(s = 0, rel IN relationships(p) | s + rel.dist) as dist
, length(p) as hops
order by dist
with user_id, collect(dist) as dists_per_follow, collect(hops) as hops_per_follow
return user_id
, dists_per_follow[0] as shortest
, dists_per_follow, hops_per_follow
order by user_id
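To see why the shortestPath version gives one row per follower, here is a minimal Python sketch of the same idea over an in-memory edge list. A breadth-first search bounded to 3 hops records each node the first time it is reached, which is its smallest hop count. The edge list and the function name `within_hops` are made up for illustration, and this is not a description of how Neo4j executes the query.

```python
from collections import deque

def within_hops(edges, source, max_hops=3):
    """Return {node: min hop count} for nodes within max_hops of source.

    Edges are treated as undirected, like -[:follow*1..3]- in the query.
    """
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in neighbours.get(node, ()):
            if nxt not in hops:  # the first visit is the shortest one
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    del hops[source]  # mirrors WHERE f <> user
    return hops

# Hypothetical follow edges: '71' is reachable from '70' by two routes.
edges = [("70", "71"), ("70", "72"), ("72", "71"),
         ("71", "73"), ("73", "74"), ("74", "75")]
result = within_hops(edges, "70")
# result == {"71": 1, "72": 1, "73": 2, "74": 3}; "75" is 4 hops away and is left out
```

Each userId appears once even though '71' can be reached both directly and through '72'.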

Related

neo4j cypher to filter multi paths based on two relationships

I have the following graph:
I need to get all the AD nodes which are related to a particular User node. If I search by a user B1, I should get all the AD nodes which are connected by HAS relation to B1 node as well as the AD nodes which are connected to its parent by HAS relation. But if any of these AD nodes are connected by an EXCLUDES relation, I should filter that one out.
For example, if I search by B1, I should get AD4,AD2
AD1 has EXCLUDES with D1 and AD3 has excludes with C1, hence filtered out.
I am using the following cypher
MATCH path=(p:AD)-[:HAS|EXCLUDES]-()<-[:CHILD_OF*]-(u:User) USING INDEX u:User(id) WHERE u.id = 'B1'
with p,
collect( filter( r in rels(path)
where type(r) = 'EXCLUDES'
)
) as test
where all( t in test where size(t) = 0 )
return p
The issue is that when I search with C1, it returns AD4, AD3, AD2. How can I eliminate AD3 from the result?
:CHILD_OF* doesn't include your starting node. To include that, set a lowerbound of 0:
[:CHILD_OF*0..]
That said, there are probably better ways to form your query. Try this, maybe:
MATCH (u:User)
WHERE u.id = 'B1'
WITH u, [(p:AD)-[:EXCLUDES]-()<-[:CHILD_OF*0..]-(u) | p] as excluded
MATCH (p:AD)-[:HAS]-()<-[:CHILD_OF*0..]-(u)
WHERE not p in excluded
RETURN p
EDIT
The pattern comprehension feature was released with Neo4j 3.1. You won't be able to use that in an older version. Try this instead:
MATCH (u:User)
WHERE u.id = 'B1'
OPTIONAL MATCH (p:AD)-[:EXCLUDES]-()<-[:CHILD_OF*0..]-(u)
WITH u, collect(p) as excluded
MATCH (p:AD)-[:HAS]-()<-[:CHILD_OF*0..]-(u)
WHERE not p in excluded
RETURN p

Cypher set where condition only one relation can show of each node in the path

I have a graph with circular relationships, so there may be two relationships in opposite directions (outgoing and incoming) between each pair of nodes. I am trying to find the path between two nodes without using shortestPath(). How do I add a condition so that only one relationship appears between each pair of nodes in the path? Here is my query:
Match p = (A)-[r*]-(b) return p
What should I write in the WHERE clause?
So what you really want is the shortest weighted path. Cypher doesn't directly support this yet, but you can query it like this:
MATCH (start:Point {id: '1'}), (end:Point {id: '2'})
MATCH p=(start)-[:GO_TO*1..25]->(end)
WITH p,reduce(s = 0, r IN rels(p) | s + r.myValueProp) AS dist
RETURN p, dist ORDER BY dist ASC LIMIT 1
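The shortest weighted path is the candidate with the smallest summed weight. A rough Python sketch of that selection over an in-memory edge list follows. The edge data and the name `cheapest_path` are hypothetical, and the sketch never revisits a node, which is a simplification compared with Cypher's variable-length matching.

```python
def cheapest_path(edges, start, end, max_hops=25):
    """Enumerate directed paths from start to end; return (cost, path) for the cheapest.

    edges: list of (from_node, to_node, weight). Returns None when no path exists.
    """
    outgoing = {}
    for a, b, w in edges:
        outgoing.setdefault(a, []).append((b, w))

    best = None
    stack = [(start, [start], 0)]
    while stack:
        node, path, cost = stack.pop()
        if node == end and len(path) > 1:
            if best is None or cost < best[0]:
                best = (cost, path)  # keep the smallest total so far
            continue
        if len(path) - 1 == max_hops:
            continue  # hop limit reached, like *1..25
        for nxt, w in outgoing.get(node, []):
            if nxt not in path:  # simplification: no node is visited twice
                stack.append((nxt, path + [nxt], cost + w))
    return best

# Hypothetical weights: the direct hop costs more than the three-hop detour.
edges = [("1", "2", 10), ("1", "3", 2), ("3", "4", 2), ("4", "2", 2)]
best = cheapest_path(edges, "1", "2")
# best == (6, ["1", "3", "4", "2"])
```

The three-hop route wins over the single hop because only the summed weight is compared, not the number of relationships.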

Match Only Full Paths in Neo4J with Cypher (not sub-paths)

If I have a graph like the following (where the nesting could go on for an arbitrary number of nodes):
(a)-[:KNOWS]->(b)-[:KNOWS]->(c)-[:KNOWS]->(d)-[:KNOWS]->(e)
               |             |
               |            (i)-[:KNOWS]->(j)
               |
              (f)-[:KNOWS]->(g)-[:KNOWS]->(h)-[:KNOWS]->(n)
                             |
                            (k)-[:KNOWS]->(l)-[:KNOWS]->(m)
How can I retrieve all of the full-length paths (in this case, from (a)-->(m), (a)-->(n), (a)-->(j) and (a)-->(e))? The query should also be able to return the nodes with no relationships of the given type.
So far I am just doing the following (I only want the id property):
MATCH path=(a)-[:KNOWS*]->(b)
RETURN collect(extract(n in nodes(path) | n.id)) as paths
I need the paths so that in the programming language (in this case clojure) I can create a nested map like this:
{"a" {"b" {"f" {"g" {"k" {"l" {"m" nil}}
                     "h" {"n" nil}}}
           "c" {"d" {"e" nil}
                "i" {"j" nil}}}}}
Is it possible to generate the map directly with the query?
I just had to do something similar. This worked on your example; it finds all paths that end at a node with no outgoing [:KNOWS] relationship:
match p=(a:Node {name:'a'})-[:KNOWS*]->(b:Node)
optional match (b)-[v:KNOWS]->()
with p,v
where v IS NULL
return collect(extract(n in nodes(p) | n.id)) as paths
Here is one query that will get you started. This query will return just the longest chain of nodes when there is a single chain without forks. It matches all of the paths like yours does but only returns the longest one by using limit to reduce the result.
MATCH p=(a:Node {name:'a'})-[:KNOWS*]->(:Node)
WITH length(p) AS size, p
ORDER BY size DESC
LIMIT 1
RETURN p AS Longest_Path
I think this gets the second part of your question where there are multiple paths. It looks for paths where the last node does not have an outbound :KNOWS relationship and where the starting node does not have an inbound :KNOWS relationship.
MATCH p=(a:Node {name:'a'})-[:KNOWS*]->(x:Node)
WHERE NOT (x)-[:KNOWS]->()
AND NOT ()-[:KNOWS]->(a)
WITH length(p) AS size, p
ORDER BY size DESC
RETURN reduce(node_ids = [], n IN nodes(p) | node_ids + [id(n)])
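On the nested map the question asks for: one option is to build it on the client from the full-length paths that the queries above return. Here is a minimal sketch in Python rather than Clojure, assuming each path arrives as a list of node ids from the root to a leaf. The function name `paths_to_tree` is made up for illustration.

```python
def paths_to_tree(paths):
    """Merge root-to-leaf paths (lists of ids) into one nested dict; leaves map to None."""
    tree = {}
    for path in paths:
        level = tree
        for i, node in enumerate(path):
            if i == len(path) - 1:
                level.setdefault(node, None)  # leaf, unless it already has children
            else:
                if level.get(node) is None:
                    level[node] = {}  # new node, or a former leaf gaining children
                level = level[node]
    return tree

# The four full-length paths from the example graph in the question.
paths = [
    ["a", "b", "c", "d", "e"],
    ["a", "b", "c", "i", "j"],
    ["a", "b", "f", "g", "h", "n"],
    ["a", "b", "f", "g", "k", "l", "m"],
]
tree = paths_to_tree(paths)
```

The result has the same shape as the Clojure map in the question, with `None` in place of `nil`.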

Find shortest path between nodes with additional filter

I'm trying to model flights between airports on certain dates. So far my test graph looks like this:
Finding the shortest path between, for example, LTN and WAW is trivial with:
MATCH (f:Airport {code: "LTN"}), (t:Airport {code: "WAW"}),
p = shortestPath((f)-[]-(t)) RETURN p
Which gives me:
But I have no idea how to get only paths with Flights that have relation FLIES_ON with given Date.
Here's what I would do with your given model. The other commenters' queries don't seem right, as they use ANY() instead of ALL(). You specifically said you only want paths where all Flight nodes on the path are attached to a given Date node with a :FLIES_ON relationship:
MATCH (LTN:Airport {code:"LTN"}),
(WAW:Airport {code:"WAW"}),
p =(LTN)-[:ROUTE*]-(WAW)
WHERE ALL(x IN FILTER(x IN NODES(p) WHERE x:Flight)
WHERE (x)<-[:FLIES_ON]-(:Date {date:"130114"}))
WITH p ORDER BY LENGTH(p) LIMIT 1
RETURN p
http://console.neo4j.org/r/xgz84y
Though this would not be my preferred structure for this kind of data, in answering your question I might go this way instead: get the paths, filter them, and take the first one ordered by length.
In the console tests it runs faster than the one suggested above, as the query plan is simpler.
Anyhoo, I hope this at least points you in a good direction :)
MATCH (f:Airport { cd: "ltn" }),(t:Airport { cd: "waw" }), p =((f)-[r*]-(t))
WHERE ANY (x IN relationships(p)
WHERE type(x)='FLIES_ON') AND ANY (x IN nodes(p)
WHERE x.cd='130114')
RETURN p
ORDER BY length(p)
LIMIT 1
The problem is that using shortestPath or allShortestPaths will never include the Date nodes.
What you need to do is filter the pattern with the date node (I don't know how you store the date, so I'll assume Ymd format):
MATCH (f:Airport {code: "LTN"}), (t:Airport {code: "WAW"})
MATCH p=(f)-[*]-(t)
WHERE ANY (r in rels(p) WHERE type(r) = 'FLIES_ON')
AND ANY (n in nodes(p) WHERE 'Date' IN labels(n) AND n.date = 20150120)
RETURN p
ORDER BY length(p)
LIMIT 1
Another, less costly solution is to include the date in your match and build the path yourself with it:
MATCH (n:Date {date:20150120})
MATCH (f:Airport {code:"LTN"}), (t:Airport {code:"WAW"})
MATCH p=(f)<-[*]-(n)-[*]->(t)
RETURN distinct(p)
ORDER BY length(p)
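The answers above differ on whether the date filter should use ANY() or ALL(). A small Python sketch over hand-written paths shows the difference. The flight codes, the intermediate airport and the function name `shortest_valid_path` are hypothetical, and each path is modelled simply as a list of (label, name) pairs.

```python
def shortest_valid_path(paths, flights_on_date):
    """Return the shortest path whose Flight nodes ALL fly on the date, or None."""
    valid = [
        p for p in paths
        if all(name in flights_on_date for label, name in p if label == "Flight")
    ]
    return min(valid, key=len) if valid else None

paths = [
    # one flight, not on the date
    [("Airport", "LTN"), ("Flight", "F1"), ("Airport", "WAW")],
    # two flights, only the second is on the date
    [("Airport", "LTN"), ("Flight", "F1"), ("Airport", "XXX"),
     ("Flight", "F3"), ("Airport", "WAW")],
    # two flights, both on the date
    [("Airport", "LTN"), ("Flight", "F2"), ("Airport", "XXX"),
     ("Flight", "F3"), ("Airport", "WAW")],
]
flights_on_date = {"F2", "F3"}

chosen = shortest_valid_path(paths, flights_on_date)  # the third path

# An ANY-style check would also let the second path through:
any_matches = [
    p for p in paths
    if any(name in flights_on_date for label, name in p if label == "Flight")
]
```

With ALL only the third path survives. With ANY the second path is accepted too, even though its first leg does not fly on that date.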

Find the distance in a path between each node and the last node of the path

I am very new to Cypher and I need help to solve a problem I am facing.
In my graph I have a path representing a data stream and I need to know, for each node in the path, the distance from the last node of the path.
For example, if I have the following path:
(a)->(b)->(c)->(d)
the distance must be 3 for a, 2 for b, 1 for c and 0 for d.
Is there an efficient way to obtain this result in Cypher?
Thanks a lot!
Mauro
If it is just hops between nodes, then I think this will fit the bill.
match p=(a:Test {name: 'A'})-[r*3]->(d:Test {name: 'D'})
with p, range(length(p),0,-1) as idx
unwind idx as elem
return (nodes(p)[elem]).name as Node
, length(p) - elem as Distance
order by Node
In this answer, I define a path to be "complete" if its start node has no incoming relationship and its end node has no outgoing relationship.
This query returns, for each "complete" path, a collection of objects containing each node's neo4j-generated ID and the number of hops to the end of that path:
MATCH p=(x)-[*]->(y)
WHERE (NOT ()-->(x)) AND (NOT (y)-->())
WITH NODES(p) AS np, LENGTH(p) AS lp
RETURN EXTRACT(i IN RANGE(0, lp, 1) | {id: ID(np[i]), hops: lp - i})
NOTE: Matching with [*] will be costly with large graphs, so you may need to limit the maximum hop value. For example, use [*..4] instead to limit the max hop value to 4.
Also, qualifying the query with appropriate node labels and relationship types may speed it up.
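Both queries rely on the same arithmetic: a node at index i in a path of length lp is lp - i hops from the end. A minimal Python sketch of that calculation follows, assuming the path is already available as an ordered list of unique node names. The function name `distances_to_end` is made up for illustration.

```python
def distances_to_end(path_nodes):
    """Map each node in a path to its hop count from the last node."""
    last = len(path_nodes) - 1  # path length in hops
    return {node: last - i for i, node in enumerate(path_nodes)}

distances = distances_to_end(["a", "b", "c", "d"])
# distances == {"a": 3, "b": 2, "c": 1, "d": 0}
```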
