cypher -- missing one possible path - neo4j

I'm new to Cypher.
I have this database:
CREATE
(a:City {name: 'A'}),
(b:City {name: 'B'}),
(c:City {name: 'C'}),
(d:City {name: 'D'}),
(e:City {name: 'E'}),
(a)-[:HAS_RAIL_ROAD_TO {distance : 5 }]->(b),
(b)-[:HAS_RAIL_ROAD_TO {distance : 4 }]->(c),
(c)-[:HAS_RAIL_ROAD_TO {distance : 8 }]->(d),
(d)-[:HAS_RAIL_ROAD_TO {distance : 8 }]->(c),
(d)-[:HAS_RAIL_ROAD_TO {distance : 6 }]->(e),
(a)-[:HAS_RAIL_ROAD_TO {distance : 5 }]->(d),
(c)-[:HAS_RAIL_ROAD_TO {distance : 2 }]->(e),
(e)-[:HAS_RAIL_ROAD_TO {distance : 3 }]->(b),
(a)-[:HAS_RAIL_ROAD_TO {distance : 7 }]->(e)
When I execute
MATCH(:City { name: 'A' })-[r:HAS_RAIL_ROAD_TO*4]->(:City { name: 'C' })
return count(r)
I was expecting 3 as result:
(A, B, C, D, C); (A, D, C, D, C); (A, D, E, B, C).
But the result given is 2.
I think that in the second case (A, D, C, D, C) the traversal is not going back to D.
What do you think is the reason for this?

This has to do with the uniqueness behavior in Cypher traversals, which ensures that a relationship can only be traversed once per path per MATCH pattern.
(A, D, C, D, C) does not match because it would have to traverse the single D -> C relationship twice. There are only two relationships between D and C (one in each direction), and the D, C, D part traverses both of them, leaving no other relationship available to go from D back to C.
This uniqueness behavior is useful for most cases, and also prevents any kind of infinite loop problems with unbounded variable-length patterns.
If you do need to consider reusing relationships in your paths, you'll need a different approach, one that lets you change the uniqueness behavior during traversal.
You can use the path expander procedures from APOC Procedures to change the uniqueness and expand out, but PLEASE make sure to set an upper bound (via the maxLevel config property); otherwise you risk an infinite loop traversal, which will likely blow the heap.
MATCH (start:City { name: 'A' }), (end:City { name: 'C' })
CALL apoc.path.expandConfig(start, {endNodes:[end], minLevel:4, maxLevel:4, relationshipFilter:'HAS_RAIL_ROAD_TO>', uniqueness:'NONE'}) YIELD path
RETURN [node in nodes(path) | node.name] as path
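To see why the count is 2 rather than 3, the matching can be modelled outside Neo4j. The following is a minimal Python sketch, not how Neo4j is implemented: it enumerates 4-hop paths over the question's graph, once with relationship uniqueness and once without. The function name `find_paths` and the `allow_reuse` flag are made up for this illustration.

```python
# Directed rail graph from the question; each tuple is one relationship.
EDGES = [
    ("A", "B"), ("B", "C"), ("C", "D"), ("D", "C"), ("D", "E"),
    ("A", "D"), ("C", "E"), ("E", "B"), ("A", "E"),
]


def find_paths(start, end, hops, allow_reuse):
    """Return node sequences of exactly `hops` relationships from start to end."""
    results = []

    def expand(node, used, nodes):
        if len(used) == hops:
            if node == end:
                results.append(tuple(nodes))
            return
        for index, (src, dst) in enumerate(EDGES):
            if src != node:
                continue
            if not allow_reuse and index in used:
                continue  # relationship uniqueness: this edge is already in the path
            expand(dst, used + [index], nodes + [dst])

    expand(start, [], [start])
    return results


# Models the plain MATCH (each relationship at most once per path).
unique_paths = find_paths("A", "C", 4, allow_reuse=False)
# Models uniqueness:'NONE' (relationships may be traversed again).
all_walks = find_paths("A", "C", 4, allow_reuse=True)

print(unique_paths)
print(all_walks)
```

The only path present in `all_walks` but missing from `unique_paths` is (A, D, C, D, C), which needs the D -> C relationship twice.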

Related

Query to write hops and return all the properties from the middle nodes or a better way to do it and skip hops?

Node A is connected to Node E through different nodes B (B can be repeating), C and D etc as given below.
(A)--(C)--(D)--(E)
(A)--(B)--(C)--(D)--(E)
(A)--(B)--(B)--(C)--(D)--(E)
(A)--(B)--(B)--(B)--(C)--(D)--(E)
There could be up to 7 B nodes between A and C or no B node at all (like the first case above).
Question: How to get all the E1, E2, E3, E4 connected to A1 with a single query and return properties from all A, B, C, D and E nodes? I could not return the properties using the hops.
MATCH (A {Id:30})-[*1..6]-(E) RETURN DISTINCT A.Name, E.Name;
But we want to return B.Name (If there are multiple B nodes in the middle their names too), C.Name and D.Name too. Happy to skip hopping completely if required. Help please? Thanks in advance.
Try this
// in case there is always A,C and E, you can look for
// paths with length 3 to 6
MATCH path=(A)-[*3..6]-(E)
// return the name of each node in the same order
RETURN [n IN nodes(path) | n.Name] AS nodeNames
Assuming A through E are node labels, this query should get all paths that match your pattern (with 0 to 7 B nodes between the A and C nodes), and return distinct lists of node Name values:
MATCH p=(:A)-[*..8]-(:C)--(:D)--(:E)
WHERE ALL(n IN NODES(p)[1..-3] WHERE 'B' IN LABELS(n))
RETURN DISTINCT [m IN NODES(p) | m.Name] AS names
In general, the query would be more efficient if you could also specify the relationship types and their directionality.
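The slice NODES(p)[1..-3] in the query above drops the first node (A) and the last three (C, D, E), so only the middle nodes are checked. Python list slicing uses the same start-inclusive, end-exclusive convention, so the check can be sketched as below. This is only an illustration: it assumes one label per node, and `only_b_between` is a made-up helper name.

```python
def only_b_between(labels):
    """labels: one label per node along a path that ends in C, D, E."""
    middle = labels[1:-3]  # same slice as NODES(p)[1..-3] in the Cypher query
    # all() over an empty list is True, which covers the "no B node" case.
    return all(label == "B" for label in middle)


print(only_b_between(["A", "C", "D", "E"]))
print(only_b_between(["A", "B", "B", "C", "D", "E"]))
print(only_b_between(["A", "B", "X", "C", "D", "E"]))
```

The first two calls pass the check (zero and two B nodes in the middle); the third fails because of the non-B node in the middle.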

How to avoid cycle in neo4j cypher queries

I have friend-friend data model which has two relationships between any two friend nodes based on how one friend defines the other friend.
For example, User "A" can define user "B" as 'FRIEND' and "B" can define "A" as 'BUDDY'.
The problem is that when I try to get the 3rd-degree relationships of user "A", the query returns user "B", whereas the actual result should be "D" only.
MATCH(a:Users {first_name : "A"}) -[:BUDDY|FRIEND*3] -> (b)
RETURN a,b
OR
MATCH (a)-[]-(b)-[]-(c)-[]-(d)
WHERE a.first_name="A"
RETURN a,d
Alternatively, you can do this:
MATCH p=((a:Users {first_name : "A"})-[:BUDDY|FRIEND*3]->(b))
WITH DISTINCT a, b, nodes(p) as nodes
UNWIND nodes AS node
WITH a, b, nodes, COLLECT(DISTINCT node) as distinct_nodes
WITH a, b, nodes, distinct_nodes WHERE SIZE(nodes)=SIZE(distinct_nodes)
RETURN a, b
or a bit easier with an APOC call:
MATCH p=((a:Users {first_name : "A"})-[:BUDDY|FRIEND*3]->(b))
WHERE SIZE(nodes(p)) = SIZE(apoc.coll.toSet(nodes(p)))
RETURN DISTINCT a, b
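The size comparison in the queries above keeps only paths in which no node appears twice. Here is a rough Python sketch of that filter. The relationships below are hypothetical, since the question gives no sample data, and `paths_of_length` is a made-up helper that mimics Cypher's default relationship uniqueness.

```python
# Hypothetical data: (source, relationship type, target).
RELS = [
    ("A", "FRIEND", "B"),
    ("A", "BUDDY", "B"),
    ("B", "BUDDY", "A"),
    ("B", "FRIEND", "C"),
    ("C", "FRIEND", "D"),
]


def paths_of_length(start, hops):
    """Enumerate directed paths, never reusing a relationship within a path."""
    found = []

    def expand(node, used, nodes):
        if len(used) == hops:
            found.append(nodes)
            return
        for index, (src, _rel_type, dst) in enumerate(RELS):
            if src == node and index not in used:
                expand(dst, used + [index], nodes + [dst])

    expand(start, [], [start])
    return found


paths = paths_of_length("A", 3)
# Without the filter, A -> B -> A -> B is a valid path, so B shows up.
end_nodes = {p[-1] for p in paths}
# Same idea as SIZE(nodes) = SIZE(distinct nodes): drop paths that revisit a node.
acyclic_end_nodes = {p[-1] for p in paths if len(p) == len(set(p))}

print(sorted(end_nodes))
print(sorted(acyclic_end_nodes))
```

With this data the unfiltered end nodes are B and D, and only D survives the filter.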
I'd suggest the APOC Path Expander procedures, which use a means of expansion that only ever considers a single path to a node, allow you to specify the min and max depth, take relationship filters, and let you set whether visiting a node more than once is permitted. Specifically, the apoc.path.expandConfig() procedure should meet your needs.
MATCH (a:Users {first_name: "A"})
CALL apoc.path.expandConfig(a, {relationshipFilter:"BUDDY|FRIEND",minLevel:3,maxLevel:3, bfs:true,uniqueness:"NODE_GLOBAL"}) YIELD path
RETURN a, path
The uniqueness:"NODE_GLOBAL" parameter makes sure no node is visited more than once.

How to find all multi-directional relations between two nodes?

I'm trying to find all relations between node a and node b, and the relations could be multi-directions. For example,
a <- c -> b or a -> d -> b where c and d are nodes.
I've tried MATCH (a:PERSON {name: 'WD'})-[r*..3]-(b:PERSON{name: 'EK'}) RETURN r, a, b, but I got two isolated nodes, because the relation between a and b is: a <- c -> b.
Any help would be appreciated.
You can return the path if you need all the relationships and nodes in between.
You can modify your query to return full paths instead of just nodes a and b, as follows:
MATCH paths=(a:PERSON {name: 'WD'})-[r*..3]-(b:PERSON{name: 'EK'})
RETURN paths
This will return paths of length up to 3; change the bound as needed.

Neo4j Cypher - Query partial fixed route and partial variable route

Let's say I have a graph network like shown here:
I can do a cypher query using something like
MATCH (a:A)-[]->(b:B)-[]->(c:C)-[]-(d1:D),
(a)-[]->(b)-[]->(c)-[]-(d2:D),
(a)-[]->(b)-[]->(c)-[]-(d3:D),
(a)-[]->(b)-[]->(c)-[]-(d4:D)
WHERE d1.val = '1' AND d2.val = '2' AND d3.val = '3' AND d4.val = '4'
RETURN a, b, c, d1, d2, d3, d4
Is there a way to simplify this query without explicitly rewriting the identical relationships over and over again? I am trying to find every chain that has all the D values I am expecting. The list of values is large, so an IN clause would probably be appropriate.
Edit:
Sample data based on answer below
create (a1:A {name: 'A1'})
create (b1:B {name: 'B1'})
create (c1:C {name: 'C1'})
create (d1:D {name: 'D1', val: 1})
create (d2:D {name: 'D2', val: 2})
create (d3:D {name: 'D3', val: 3})
create (d4:D {name: 'D4', val: 4})
create (a1)-[:NEXT]->(b1)
create (b1)-[:NEXT]->(c1)
create (c1)-[:NEXT]->(d1)
create (c1)-[:NEXT]->(d2)
create (c1)-[:NEXT]->(d3)
create (c1)-[:NEXT]->(d4)
create (a2:A {name: 'A2'})
create (b2:B {name: 'B2'})
create (c2:C {name: 'C2'})
create (a2)-[:NEXT]->(b2)
create (b2)-[:NEXT]->(c2)
create (c2)-[:NEXT]->(d1)
create (c2)-[:NEXT]->(d2)
create (a3:A {name: 'A3'})
create (b3:B {name: 'B3'})
create (c3:C {name: 'C3'})
create (a3)-[:NEXT]->(b3)
create (b3)-[:NEXT]->(c3)
create (c3)-[:NEXT]->(d1)
create (c3)-[:NEXT]->(d2)
create (c3)-[:NEXT]->(d3)
create (c3)-[:NEXT]->(d4)
return *
So the query should result in A1-->B1-->C1-->D1,D2,D3,D4 and A3-->B3-->C3-->D1,D2,D3,D4
Since A2-->B2-->C2 links with only D1,D2 and not D3,D4, it should not be in the result.
The beginning of the path is always the same, so you don't need to repeat it. Then, based on a list of values, you want to check whether you can find a D for each and every one of them: it could be a job for the all() predicate.
Mixing all that, we get:
MATCH (a:A)-->(b:B)-->(c:C)-->(d:D)
WHERE d.val IN {values}
WITH a, b, c, collect(d) AS dList
WHERE all(value IN {values} WHERE any(d IN dList WHERE d.val = value))
RETURN a, b, c, dList
However, if n is the number of values, that's an O(n^2) algorithm because of the second WHERE.
Let's collect the values of the nodes while collecting the nodes themselves, to avoid the double loop and turn it into a O(n) algorithm:
MATCH (a:A)-->(b:B)-->(c:C)-->(d:D)
WHERE d.val IN {values}
WITH a, b, c, collect(d) AS dList, collect(DISTINCT d.val) AS dValues
WHERE all(value IN {values} WHERE value IN dValues)
RETURN a, b, c, dList
Assuming the list of values passed as a parameter only contains distinct values, we can even change that into an O(1) algorithm by simply comparing the size of the input list and the distinct values found:
MATCH (a:A)-->(b:B)-->(c:C)-->(d:D)
WHERE d.val IN {values}
WITH a, b, c, collect(d) AS dList, collect(DISTINCT d.val) AS dValues
WHERE size({values}) = size(dValues)
RETURN a, b, c, dList
Because dValues ⊆ values, if the 2 sets have the same size, they're equal.
If D.val are globally unique, or at least unique for all the D nodes connected to a single C, it can be further simplified:
MATCH (a:A)-->(b:B)-->(c:C)-->(d:D)
WHERE d.val IN {values}
WITH a, b, c, collect(d) AS dList
WHERE size({values}) = size(dList)
RETURN a, b, c, dList
If the values are globally unique, the query will be faster with a uniqueness constraint, as it will also index the values:
CREATE CONSTRAINT ON (d:D) ASSERT d.val IS UNIQUE
If every D node has a unique val property (if any), this should work:
WITH [1,2,3,4] AS desired
MATCH (a:A)-->(b:B)-->(c:C)-->(d:D)
WHERE d.val IN desired
WITH a, b, c, COLLECT(DISTINCT d) AS ds
WHERE SIZE(ds) = SIZE(desired)
RETURN a, b, c, ds
The result will have a row for every matched A, B, C combination, along with the collection of D nodes.
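As a sanity check, the size comparison can be replayed in plain Python against the sample data from the question's edit. This is only a sketch of the logic, not of how Neo4j executes the query; `CHAINS` and `chains_with_all` are names made up for the example, and the desired values are assumed to be distinct.

```python
# (A, B, C) chain -> val of each connected D node, copied from the sample data.
CHAINS = {
    ("A1", "B1", "C1"): [1, 2, 3, 4],
    ("A2", "B2", "C2"): [1, 2],
    ("A3", "B3", "C3"): [1, 2, 3, 4],
}


def chains_with_all(desired):
    """Return the chains whose D nodes cover every desired value."""
    wanted = set(desired)  # assumes the desired values are distinct
    matches = []
    for chain, d_vals in CHAINS.items():
        # WHERE d.val IN desired ... COLLECT(DISTINCT d)
        found = {val for val in d_vals if val in wanted}
        # WHERE SIZE(ds) = SIZE(desired)
        if len(found) == len(wanted):
            matches.append(chain)
    return matches


print(chains_with_all([1, 2, 3, 4]))
```

For the desired values 1 to 4 this keeps the A1 and A3 chains and drops A2, which matches the result the question asks for.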
Assuming the following data set...
create (a:A {name: 'A'})
create (b:B {name: 'B'})
create (c:C {name: 'C'})
create (d1:D {name: 'D1', val: 1})
create (d2:D {name: 'D2', val: 2})
create (d3:D {name: 'D3', val: 3})
create (d4:D {name: 'D4', val: 4})
create (a)-[:NEXT]->(b)
create (b)-[:NEXT]->(c)
create (c)-[:NEXT]->(d1)
create (c)-[:NEXT]->(d2)
create (c)-[:NEXT]->(d3)
create (c)-[:NEXT]->(d4)
return *
You could execute a query something like this to match all of the specific D nodes in a particular value range.
match (a:A)-->(b:B)-->(c:C)-->(d:D)
where d.val in range(1,4)
return *
Here is an updated query based on your updated question. I collected the D values for each A,B,C chain of nodes.
match (a:A)-->(b:B)-->(c:C)-->(d:D)
where d.val in range(1,4)
with a, b, c, d
order by a.name, b.name, c.name, d.name
return a, b, c, collect(d) as d
order by a.name, b.name, c.name

Neo4J: Return only the candidates, not all the combinations

Consider a Cypher query in the following form:
MATCH (a)-->(b), (a)-->(c), (a)-->(d) WHERE [some conditions on a, b, c and d] RETURN id(a), id(b), id(c), id(d)
The query above, probably as expected, will return all the combinations of candidate nodes for a, b, c, and d. So, for instance, if there are three candidates for b and four candidates for c, the total number of rows returned by the query will be 3 x 4 = 12. How can it be adjusted so that the different matching nodes for each alias (a to d) are returned only once?
The following query is not a valid one, but should clarify what I have in mind:
MATCH (a)-->(b), (a)-->(c), (a)-->(d) WHERE [some conditions on a, b, c and d] RETURN distinct id(a), distinct id(b), distinct id(c), distinct id(d)
You can use distinct aggregation.
MATCH (a)-->(b), (a)-->(c), (a)-->(d)
WHERE [some conditions on a, b, c and d]
RETURN collect(distinct id(a)) as ids_a,collect(distinct id(b)) as ids_b,
collect(distinct id(c)) as ids_c,collect(distinct id(d)) as ids_d;
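The effect of collect(distinct ...) can be pictured with a small Python sketch. The candidate ids are invented for the example (three for b, four for c, as in the question); the point is only that the 12 combination rows collapse back to one list of distinct candidates per alias.

```python
from itertools import product

# Hypothetical candidates matched for a single node a1.
b_candidates = ["b1", "b2", "b3"]
c_candidates = ["c1", "c2", "c3", "c4"]

# What the plain RETURN produces: one row per combination.
rows = [("a1", b, c) for b, c in product(b_candidates, c_candidates)]

# What collect(distinct ...) produces: each candidate once per alias.
ids_b = sorted({row[1] for row in rows})
ids_c = sorted({row[2] for row in rows})

print(len(rows))
print(ids_b)
print(ids_c)
```

Note that the aggregated result is a single row of lists, so the pairing between a particular b and a particular c is no longer visible.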
