Cypher: getting relationships between the same nodes loses some relationships - neo4j

data set:
neo4j-sh (?)$ START n = node(*) MATCH n-[r]-m RETURN n,r,m;
==> +---------------------------------------------+
==> | n | r | m |
==> +---------------------------------------------+
==> | Node[1]{} | (2)-[1:KNOWS]->(1)| Node[2]{} |
==> | Node[1]{} | (3)-[2:KNOWS]->(1) | Node[3]{} |
==> | Node[2]{} | (2)-[1:KNOWS]->(1) | Node[1]{} |
==> | Node[2]{} | (3)-[0:KNOWS]->(2) | Node[3]{} |
==> | Node[3]{} | (3)-[0:KNOWS]->(2) | Node[2]{} |
==> | Node[3]{} | (3)-[2:KNOWS]->(1) | Node[1]{} |
==> +---------------------------------------------+
==> 6 rows
==>
==> 0 ms
cypher query:
neo4j-sh (0)$ start x=node(1,2,3),y=node(1,2,3) match x-[r]-y return id(x),id(y) order by id(x) desc;
==> +---------------+
==> | id(x) | id(y) |
==> +---------------+
==> | 1 | 2 |
==> | 1 | 3 |
==> | 2 | 1 |
==> | 3 | 1 |
==> +---------------+
==> 4 rows
In fact, nodes 2 and 3 are linked, so why is that pair not returned?
How can I get it returned?
thanks
url:http://console.neo4j.org/?id=qwdh4p
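As a sanity check on what the query should return, here is a small Python sketch (not Cypher) that enumerates every pair joined by a relationship in either direction. The three relationships are copied from the data set above; the function name is made up for illustration.

```python
# Relationships from the data set above, as (start, end) pairs:
# (2)-[1:KNOWS]->(1), (3)-[2:KNOWS]->(1), (3)-[0:KNOWS]->(2)
relationships = [(2, 1), (3, 1), (3, 2)]
nodes = [1, 2, 3]


def undirected_pairs(nodes, relationships):
    """Return every (x, y) joined by a relationship in either direction."""
    rows = []
    for x in nodes:
        for y in nodes:
            for start, end in relationships:
                if (start, end) == (x, y) or (start, end) == (y, x):
                    rows.append((x, y))
    return sorted(rows, reverse=True)


expected_rows = undirected_pairs(nodes, relationships)
print(len(expected_rows), expected_rows)
```

This yields six rows, including both (2, 3) and (3, 2), which are the two rows missing from the shell output above.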

Related

Neo4j feature or bug with search query?

I recently updated Neo4j from 2.1.7 to 2.2.5 and found that the query
Match (c:C) where id(c) = 111 with c Match (p:I{id: c.id}) return count(p)
worked fine in 2.1.7, but performs very poorly in 2.2.5 (about 100 times slower). I have all the indexes that are needed.
I modified this query to
Match (c:C) where id(c) = 111 with c.id as c_id Match (p:I{id: c_id}) return count(p)
and after this it works fine in 2.2.5
These two queries have different profiles, but I'm not very experienced with profiling.
UPDATED
One more strange thing: if I use EXPLAIN instead of PROFILE, it returns quickly.
neo4j-sh (?)$ PROFILE Match (c:C) where id(c) = 10563822 with c Match (i:I{id: c.id}) return count(i);
==> +----------+
==> | count(i) |
==> +----------+
==> | 4551 |
==> +----------+
==> 1 row
==> 18257 ms
==>
==> Compiler CYPHER 2.2
==>
==> Planner COST
==>
==> EagerAggregation
==> |
==> +Filter(0)
==> |
==> +CartesianProduct
==> |
==> +Filter(1)
==> | |
==> | +NodeByIdSeek
==> |
==> +NodeByLabelScan
==>
==> +------------------+---------------+---------+---------+-------------+-------------------------+
==> | Operator | EstimatedRows | Rows | DbHits | Identifiers | Other |
==> +------------------+---------------+---------+---------+-------------+-------------------------+
==> | EagerAggregation | 26 | 1 | 0 | count(i) | |
==> | Filter(0) | 652 | 4551 | 2522988 | c, i | i.id == c.id |
==> | CartesianProduct | 6521 | 1261494 | 0 | c, i | |
==> | Filter(1) | 0 | 1 | 1 | c | c:C |
==> | NodeByIdSeek | 1 | 1 | 1 | c | |
==> | NodeByLabelScan | 1261494 | 1261494 | 1261495 | i | :I |
==> +------------------+---------------+---------+---------+-------------+-------------------------+
==>
==> Total database accesses: 3784485
neo4j-sh (?)$ PROFILE Match (c:C) where id(c) = 10563822 with c.id as c_id Match (i:I{id: c_id}) return count(i);
==> +----------+
==> | count(i) |
==> +----------+
==> | 4551 |
==> +----------+
==> 1 row
==> 64 ms
==>
==> Compiler CYPHER 2.2
==>
==> Planner COST
==>
==> EagerAggregation
==> |
==> +Apply
==> |
==> +Projection
==> | |
==> | +Filter
==> | |
==> | +NodeByIdSeek
==> |
==> +NodeIndexSeek
==>
==> +------------------+---------------+------+--------+-------------+---------------------+
==> | Operator | EstimatedRows | Rows | DbHits | Identifiers | Other |
==> +------------------+---------------+------+--------+-------------+---------------------+
==> | EagerAggregation | 1 | 1 | 0 | count(i) | |
==> | Apply | 1 | 4551 | 0 | c, c_id, i | |
==> | Projection | 0 | 1 | 1 | c, c_id | c.id |
==> | Filter | 0 | 1 | 1 | c | c:C |
==> | NodeByIdSeek | 1 | 1 | 1 | c | |
==> | NodeIndexSeek | 1 | 4551 | 4552 | i | :I(id) |
==> +------------------+---------------+------+--------+-------------+---------------------+
==>
==> Total database accesses: 4555
I don't have enough knowledge of neo4j internals to know why your query is slower (the CartesianProduct step seems a red flag) in more recent versions, but here is a logically equivalent query that seems like it should be much faster:
START c = node(111)
MATCH (p:I { id: c.id })
RETURN count(p)
Here is the profile:
+------------------+------+--------+----------------------------------------------------------+-----------------------+
| Operator | Rows | DbHits | Identifiers | Other |
+------------------+------+--------+----------------------------------------------------------+-----------------------+
| ColumnFilter | 1 | 0 | count(p) | keep columns count(p) |
| EagerAggregation | 1 | 0 | INTERNAL_AGGREGATE51b25e53-027d-439b-9046-c1a2a6b0fe70 | |
| Filter | 0 | 0 | c, p | p.id == c.id |
| NodeById | 0 | 0 | c, p | Literal(List(111)) |
| NodeByLabel | 0 | 1 | p | :I |
+------------------+------+--------+----------------------------------------------------------+-----------------------+
NOTE: This should be considered a temporary workaround, as START has been deprecated, and I do not know how long this kind of usage will continue to be supported.
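To illustrate why the two plans in the question differ so much in database hits, here is a rough Python model (not Neo4j internals). The slow plan behaves like scanning every :I node and comparing its id (NodeByLabelScan + CartesianProduct + Filter); the fast plan behaves like a lookup in an index on :I(id) (NodeIndexSeek). The data and the hit counting are made up for illustration.

```python
from collections import defaultdict


def count_by_scan(i_nodes, c_id):
    """Model of label scan + filter: every :I node is inspected once."""
    hits = 0
    matches = 0
    for node in i_nodes:
        hits += 1  # one property read per scanned node
        if node["id"] == c_id:
            matches += 1
    return matches, hits


def build_index(i_nodes):
    """Model of an index on :I(id): property value -> list of nodes."""
    index = defaultdict(list)
    for node in i_nodes:
        index[node["id"]].append(node)
    return index


def count_by_index(index, c_id):
    """Model of an index seek: only the matching nodes are touched."""
    found = index.get(c_id, [])
    return len(found), len(found)


i_nodes = [{"id": n % 100} for n in range(10_000)]  # made-up data
index = build_index(i_nodes)

scan_matches, scan_hits = count_by_scan(i_nodes, 42)
seek_matches, seek_hits = count_by_index(index, 42)
print(scan_matches, scan_hits, seek_matches, seek_hits)
```

Both approaches return the same count, but the scan touches every node while the seek touches only the matches, which mirrors the gap between 3784485 and 4555 total database accesses in the two profiles.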

Cypher syntax clarification. Multiple MATCH clauses vs using a comma [duplicate]

Are these two Cypher statements identical?
//first
match (a)-[r]->(b),b-[r2]->c
//second
match (a)-[r]->(b)
match b-[r2]->c
The 2 Cypher statements are NOT identical. We can show this by using the PROFILE command, which shows you how the Cypher engine would perform a query.
In the following examples, the queries all end with RETURN a, c, since you cannot have a bare MATCH clause.
As you can see, the first query has a NOT(r == r2) filter that the second query does not. This is because Cypher makes sure that the result of a single MATCH clause does not contain duplicate relationships.
First query
profile match (a)-[r]->(b),b-[r2]->c return a,c;
==> +-----------------------------------------------+
==> | a | c |
==> +-----------------------------------------------+
==> | Node[1]{name:"World"} | Node[0]{name:"World"} |
==> +-----------------------------------------------+
==> 1 row
==> 2 ms
==>
==> Compiler CYPHER 2.3
==>
==> Planner COST
==>
==> Runtime INTERPRETED
==>
==> Projection
==> |
==> +Filter
==> |
==> +Expand(All)(0)
==> |
==> +Expand(All)(1)
==> |
==> +AllNodesScan
==>
==> +----------------+---------------+------+--------+----------------+----------------+
==> | Operator | EstimatedRows | Rows | DbHits | Identifiers | Other |
==> +----------------+---------------+------+--------+----------------+----------------+
==> | Projection | 1 | 1 | 0 | a, b, c, r, r2 | a; c |
==> | Filter | 1 | 1 | 0 | a, b, c, r, r2 | NOT(r == r2) |
==> | Expand(All)(0) | 1 | 2 | 4 | a, b, c, r, r2 | (b)-[r2:]->(c) |
==> | Expand(All)(1) | 2 | 2 | 8 | a, b, r | (b)<-[r:]-(a) |
==> | AllNodesScan | 6 | 6 | 7 | b | |
==> +----------------+---------------+------+--------+----------------+----------------+
==>
Second query
profile match (a)-[r]->(b) match b-[r2]->c return a,c;
==> +-----------------------------------------------+
==> | a | c |
==> +-----------------------------------------------+
==> | Node[1]{name:"World"} | Node[1]{name:"World"} |
==> | Node[1]{name:"World"} | Node[0]{name:"World"} |
==> +-----------------------------------------------+
==> 2 rows
==> 2 ms
==>
==> Compiler CYPHER 2.3
==>
==> Planner COST
==>
==> Runtime INTERPRETED
==>
==> Projection
==> |
==> +Expand(All)(0)
==> |
==> +Expand(All)(1)
==> |
==> +AllNodesScan
==>
==> +----------------+---------------+------+--------+----------------+----------------+
==> | Operator | EstimatedRows | Rows | DbHits | Identifiers | Other |
==> +----------------+---------------+------+--------+----------------+----------------+
==> | Projection | 1 | 2 | 0 | a, b, c, r, r2 | a; c |
==> | Expand(All)(0) | 1 | 2 | 4 | a, b, c, r, r2 | (b)-[r2:]->(c) |
==> | Expand(All)(1) | 2 | 2 | 8 | a, b, r | (b)<-[r:]-(a) |
==> | AllNodesScan | 6 | 6 | 7 | b | |
==> +----------------+---------------+------+--------+----------------+----------------+
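The effect of the NOT(r == r2) filter can be modelled in a few lines of Python. The graph below is an assumption: the answer does not show its data, so this one (a self-loop on node 1 plus one relationship from node 1 to node 0) was picked only because it reproduces the row counts above.

```python
# Hypothetical graph: relationship name -> (start node, end node).
rels = {"loop": (1, 1), "out": (1, 0)}


def two_hop(rels, unique_relationships):
    """Enumerate (a, c) for (a)-[r]->(b), (b)-[r2]->(c).

    With unique_relationships=True the same relationship may not be
    bound to both r and r2, as within a single MATCH clause.
    """
    rows = []
    for r, (a, b) in rels.items():
        for r2, (b2, c) in rels.items():
            if b2 != b:
                continue  # r2 must start where r ends
            if unique_relationships and r == r2:
                continue  # the NOT(r == r2) filter
            rows.append((a, c))
    return rows


single_match = two_hop(rels, unique_relationships=True)
two_matches = two_hop(rels, unique_relationships=False)
print(single_match, two_matches)
```

With the comma form only (1, 0) survives; with two MATCH clauses the self-loop can be traversed twice, so (1, 1) appears as well.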

What is the difference between these two Cypher queries?

I'm a bit stumped.
In my database, I have a relationship like this:
(u:User)-[r1:LISTENS_TO]->(a:Artist)<-[r2:LISTENS_TO]-(u2:User)
I want to perform a query where for a given user, I find the common artists between that user and every other user.
To give an idea of size of my database, I have about 600 users, 47,546 artists, and 184,211 relationships between users and artists.
The first query I was trying was the following:
START me=node(553314), other=node:userLocations("withinDistance:[38.89037,-77.03196,80.467]")
OPTIONAL MATCH
pMutualArtists=(me:User)-[ar1:LISTENS_TO]->(a:Artist)<-[ar2:LISTENS_TO]-(other:User)
WHERE
other:User
WITH other, COUNT(DISTINCT pMutualArtists) AS mutualArtists
ORDER BY mutualArtists DESC
LIMIT 10
RETURN other.username, mutualArtists
This was taking around 20 seconds to return. The profile for this query is as follows:
+----------------------+-------+--------+------------------------+------------------------------------------------------------------------------------------------+
| Operator | Rows | DbHits | Identifiers | Other |
+----------------------+-------+--------+------------------------+------------------------------------------------------------------------------------------------+
| ColumnFilter(0) | 10 | 0 | | keep columns other.username, mutualArtists |
| Extract | 10 | 20 | | other.username |
| ColumnFilter(1) | 10 | 0 | | keep columns other, mutualArtists |
| Top | 10 | 0 | | { AUTOINT0}; Cached( INTERNAL_AGGREGATEb6facb18-1c5d-45a6-83bf-a75c25ba6baf of type Integer) |
| EagerAggregation | 563 | 0 | | other |
| OptionalMatch | 52806 | 0 | | |
| Eager(0) | 563 | 0 | | |
| NodeByIndexQuery(1) | 563 | 564 | other, other | Literal(withinDistance:[38.89037,-77.03196,80.467]); userLocations |
| NodeById(1) | 1 | 1 | me, me | Literal(List(553314)) |
| Eager(1) | 82 | 0 | | |
| ExtractPath | 82 | 0 | pMutualArtists | |
| Filter(0) | 82 | 82 | | (hasLabel(a:Artist(1)) AND NOT(ar1 == ar2)) |
| SimplePatternMatcher | 82 | 82 | a, me, ar2, ar1, other | |
| Filter(1) | 1 | 3 | | ((hasLabel(me:User(3)) AND hasLabel(other:User(3))) AND hasLabel(other:User(3))) |
| NodeByIndexQuery(1) | 563 | 564 | other, other | Literal(withinDistance:[38.89037,-77.03196,80.467]); userLocations |
| NodeById(1) | 1 | 1 | me, me | Literal(List(553314)) |
+----------------------+-------+--------+------------------------+------------------------------------------------------------------------------------------------+
I was frustrated. It didn't seem like this should take 20 seconds.
I came back to the problem later on, and tried debugging it from the start.
I started to break down the query, and I noticed I was getting much faster results. Without the Neo4J Spatial query, I was getting results in about 1.5 seconds.
I finally added things back, and ended up with the following query:
START u=node(553314), u2=node:userLocations("withinDistance:[38.89037,-77.03196,80.467]")
OPTIONAL MATCH
pMutualArtists=(u:User)-[ar1:LISTENS_TO]->(a:Artist)<-[ar2:LISTENS_TO]-(u2:User)
WHERE
u2:User
WITH u2, COUNT(DISTINCT pMutualArtists) AS mutualArtists
ORDER BY mutualArtists DESC
LIMIT 10
RETURN u2.username, mutualArtists
This query returns in 4240 ms. A 5X improvement! The profile for this query is as follows:
+----------------------+-------+--------+--------------------+------------------------------------------------------------------------------------------------+
| Operator | Rows | DbHits | Identifiers | Other |
+----------------------+-------+--------+--------------------+------------------------------------------------------------------------------------------------+
| ColumnFilter(0) | 10 | 0 | | keep columns u2.username, mutualArtists |
| Extract | 10 | 20 | | u2.username |
| ColumnFilter(1) | 10 | 0 | | keep columns u2, mutualArtists |
| Top | 10 | 0 | | { AUTOINT0}; Cached( INTERNAL_AGGREGATEbdf86ac1-8677-4d45-967f-c2dd594aba49 of type Integer) |
| EagerAggregation | 563 | 0 | | u2 |
| OptionalMatch | 52806 | 0 | | |
| Eager(0) | 563 | 0 | | |
| NodeByIndexQuery(1) | 563 | 564 | u2, u2 | Literal(withinDistance:[38.89037,-77.03196,80.467]); userLocations |
| NodeById(1) | 1 | 1 | u, u | Literal(List(553314)) |
| Eager(1) | 82 | 0 | | |
| ExtractPath | 82 | 0 | pMutualArtists | |
| Filter(0) | 82 | 82 | | (hasLabel(a:Artist(1)) AND NOT(ar1 == ar2)) |
| SimplePatternMatcher | 82 | 82 | a, u2, u, ar2, ar1 | |
| Filter(1) | 1 | 3 | | ((hasLabel(u:User(3)) AND hasLabel(u2:User(3))) AND hasLabel(u2:User(3))) |
| NodeByIndexQuery(1) | 563 | 564 | u2, u2 | Literal(withinDistance:[38.89037,-77.03196,80.467]); userLocations |
| NodeById(1) | 1 | 1 | u, u | Literal(List(553314)) |
+----------------------+-------+--------+--------------------+------------------------------------------------------------------------------------------------+
And, to prove that I ran them both in a row and got very different results:
neo4j-sh (?)$ START u=node(553314), u2=node:userLocations("withinDistance:[38.89037,-77.03196,80.467]")
>
> OPTIONAL MATCH
> pMutualArtists=(u:User)-[ar1:LISTENS_TO]->(a:Artist)<-[ar2:LISTENS_TO]-(u2:User)
> WHERE
> u2:User
>
> WITH u2, COUNT(DISTINCT pMutualArtists) AS mutualArtists
> ORDER BY mutualArtists DESC
> LIMIT 10
> RETURN u2.username, mutualArtists
> ;
+------------------------------+
| u2.username | mutualArtists |
+------------------------------+
| "573904765" | 644 |
| "28600291" | 601 |
| "1092510304" | 558 |
| "1367963461" | 521 |
| "1508790199" | 455 |
| "1335360028" | 447 |
| "18200866" | 444 |
| "1229430376" | 435 |
| "748318333" | 434 |
| "5612902" | 431 |
+------------------------------+
10 rows
4240 ms
neo4j-sh (?)$ START me=node(553314), other=node:userLocations("withinDistance:[38.89037,-77.03196,80.467]")
>
> OPTIONAL MATCH
> pMutualArtists=(me:User)-[ar1:LISTENS_TO]->(a:Artist)<-[ar2:LISTENS_TO]-(other:User)
> WHERE
> other:User
>
> WITH other, COUNT(DISTINCT pMutualArtists) AS mutualArtists
> ORDER BY mutualArtists DESC
> LIMIT 10
> RETURN other.username, mutualArtists;
+--------------------------------+
| other.username | mutualArtists |
+--------------------------------+
| "573904765" | 644 |
| "28600291" | 601 |
| "1092510304" | 558 |
| "1367963461" | 521 |
| "1508790199" | 455 |
| "1335360028" | 447 |
| "18200866" | 444 |
| "1229430376" | 435 |
| "748318333" | 434 |
| "5612902" | 431 |
+--------------------------------+
10 rows
20418 ms
Unless I have gone crazy, the only difference between these two queries is the names of the nodes (I've changed "me" to "u" and "other" to "u2").
Why does that cause a 5X improvement??!?!
If anyone has any insight into this, I would be eternally grateful.
Thanks,
-Adam
EDIT 8.1.14
Based on @ulkas's suggestion, I tried simplifying the query.
The results were:
START u=node(553314), u2=node:userLocations("withinDistance:[38.89037,-77.03196,80.467]")
OPTIONAL MATCH pMutualArtists=(u:User)-[ar1:LISTENS_TO]->(a:Artist)<-[ar2:LISTENS_TO]-(u2:User)
RETURN u2.username, COUNT(DISTINCT pMutualArtists) as mutualArtists
ORDER BY mutualArtists DESC
LIMIT 10
~4 seconds
START me=node(553314), other=node:userLocations("withinDistance:[38.89037,-77.03196,80.467]")
OPTIONAL MATCH pMutualArtists=(me:User)-[ar1:LISTENS_TO]->(a:Artist)<-[ar2:LISTENS_TO]-(other:User)
RETURN other.username, COUNT(DISTINCT pMutualArtists) as mutualArtists
ORDER BY mutualArtists DESC
LIMIT 10
~20 seconds
So bizarre. It seems as though literally the named nodes of "other" and "me" cause the query time to jump tremendously. I'm very confused.
Thanks,
-Adam
That sounds like you're seeing the effect of caching. Upon the first access the cache is not populated. Subsequent queries hitting the same graph will be much faster since the nodes/relationships are already available in the cache.
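One way to check the caching explanation is to time both variants several times in alternating order and throw away the first (cold) round. A minimal Python sketch of such a harness follows; the two run functions are placeholders, and you would replace their bodies with whatever call sends each query to your server.

```python
import statistics
import time


def compare(variants, rounds=5, warmup=1):
    """Time each callable in alternating order; return median seconds per name."""
    timings = {name: [] for name in variants}
    for round_no in range(warmup + rounds):
        for name, run in variants.items():
            started = time.perf_counter()
            run()
            elapsed = time.perf_counter() - started
            if round_no >= warmup:  # discard the cold-cache rounds
                timings[name].append(elapsed)
    return {name: statistics.median(values) for name, values in timings.items()}


# Placeholders standing in for "run the query"; not a real Neo4j client call.
def run_me_other():
    sum(range(1000))


def run_u_u2():
    sum(range(1000))


medians = compare({"me/other": run_me_other, "u/u2": run_u_u2})
print(medians)
```

If the medians of the warm rounds are close, the 5X gap came from run order rather than from the identifier names.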
Using OPTIONAL MATCH together with WHERE other:User makes no sense here, since the end node other (u2) has to be matched anyway. Try running the queries without the OPTIONAL MATCH, without the WHERE, and without the final WITH, simply:
START me=node(553314), other=node:userLocations("withinDistance:[38.89037,-77.03196,80.467]")
MATCH
pMutualArtists=(me:User)-[ar1:LISTENS_TO]->(a:Artist)<-[ar2:LISTENS_TO]-(other:User)
RETURN other.username, count(DISTINCT pMutualArtists) as mutualArtists
ORDER BY mutualArtists DESC
LIMIT 10

Query unique pair of nodes when pair orders is not important in cypher

I am trying to compare users according to their common interests in this graph.
I know why the following query produces duplicate pairs but can't think of a good way in cypher to avoid it. Is there any way to do it without looping in cypher?
neo4j-sh (?)$ start n=node(*) match p=n-[:LIKES]->item<-[:LIKES]-other where n <> other return n.name,other.name,collect(item.name) as common, count(*) as freq order by freq desc;
==> +-----------------------------------------------+
==> | n.name | other.name | common | freq |
==> +-----------------------------------------------+
==> | "u1" | "u2" | ["f1","f2","f3"] | 3 |
==> | "u2" | "u1" | ["f1","f2","f3"] | 3 |
==> | "u1" | "u3" | ["f1","f2"] | 2 |
==> | "u3" | "u2" | ["f1","f2"] | 2 |
==> | "u2" | "u3" | ["f1","f2"] | 2 |
==> | "u3" | "u1" | ["f1","f2"] | 2 |
==> | "u4" | "u3" | ["f1"] | 1 |
==> | "u4" | "u2" | ["f1"] | 1 |
==> | "u4" | "u1" | ["f1"] | 1 |
==> | "u2" | "u4" | ["f1"] | 1 |
==> | "u1" | "u4" | ["f1"] | 1 |
==> | "u3" | "u4" | ["f1"] | 1 |
==> +-----------------------------------------------+
In order to avoid having duplicates in the form of a--b and b--a, you can exclude one of the combinations in your WHERE clause with
WHERE ID(a) < ID(b)
making your above query
start n=node(*) match p=n-[:LIKES]->item<-[:LIKES]-other where ID(n) < ID(other) return n.name,other.name,collect(item.name) as common, count(*) as freq order by freq desc;
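The same idea in plain Python, as a sketch: keep a pair only when the first id is smaller than the second, so each unordered pair shows up once. The likes are reconstructed from the result table in the question, and the numeric ids are assumed.

```python
# Likes reconstructed from the question's output; node ids are assumed.
likes = {
    1: ("u1", {"f1", "f2", "f3"}),
    2: ("u2", {"f1", "f2", "f3"}),
    3: ("u3", {"f1", "f2"}),
    4: ("u4", {"f1"}),
}


def common_interests(likes):
    """Return (name, other_name, common, freq) once per unordered pair."""
    rows = []
    for n_id, (n_name, n_items) in likes.items():
        for o_id, (o_name, o_items) in likes.items():
            if n_id < o_id:  # same role as ID(n) < ID(other)
                common = sorted(n_items & o_items)
                if common:
                    rows.append((n_name, o_name, common, len(common)))
    rows.sort(key=lambda row: row[3], reverse=True)  # order by freq desc
    return rows


pairs = common_interests(likes)
for row in pairs:
    print(row)
```

This gives 6 rows instead of the 12 in the question, with no mirrored pairs.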
OK, I see that you use (*) as the start point, which means looping through the whole graph and using each node as a start point. So the rows are different, not duplicates as you say.
+-----------------------------------------------+
| n.name | other.name | common | freq |
+-----------------------------------------------+
| "u2" | "u1" | ["f1","f2","f3"] | 3 |
not equal to:
+-----------------------------------------------+
| n.name | other.name | common | freq |
+-----------------------------------------------+
| "u1" | "u2" | ["f1","f2","f3"] | 3 |
So, I see that if you try using an index and set a start point, there won't be any duplicates.
start n=node:someIndex(name='C') match p=n-[:LIKES]->item<-[:LIKES]-other where n <> other return n.name,other.name,collect(item.name) as common, count(*) as freq order by freq desc;

calculating total path cost in cypher, taking relation directionality into account

Using a cypher query on neo4j, in a directed, cyclic graph I need a BFS query and a target node sorting per depth level.
For the within-depth sorting, a custom "total path cost function" should be used, calculated based on:
(1) all relation attributes r.followrank between start and end node;
(2) relation directionality (followrank if it points towards the end node, or 0 if not).
At any search depth level n, a node connected to a high ranked node at level n-m, m>0 should be ranked higher than a node connected to a low ranked node at level n-m. Reverse directionality should result in a 0 rank (which means, the node and its subtree are still part of the ranking).
I'm using neo4j community-1.9.M01. The approach I've taken so far was to extract an array of followranks for the shortest path to each end node.
I thought I had come up with a good first idea for this query, but it seems to break down at several points.
My query is:
START strt=node(7)
MATCH p=strt-[*1..]-tgt
WHERE not(tgt=strt)
RETURN ID(tgt), extract(r in rels(p): r.followrank*length(strt-[*0..]-()-[r]->() )) as rank, extract(n in nodes(p): ID(n));
which outputs
==> +-----------------------------------------------------------------+
==> | ID(tgt) | rank | extract(n in nodes(p): ID(n)) |
==> +-----------------------------------------------------------------+
==> | 14 | [1.0] | [7,14] |
==> | 15 | [1.0,1.0] | [7,14,15] |
==> | 11 | [1.0,1.0,1.0] | [7,14,15,11] |
==> | 8 | [1.0,1.0,1.0,1.0,0.0] | [7,14,15,11,7,8] |
==> | 9 | [1.0,1.0,1.0,1.0,0.0] | [7,14,15,11,7,9] |
==> | 10 | [1.0,1.0,1.0,1.0,0.0] | [7,14,15,11,7,10] |
==> | 12 | [1.0,1.0,1.0,0.0] | [7,14,15,11,12] |
==> | 8 | [0.0] | [7,8] |
==> | 9 | [0.0] | [7,9] |
==> | 10 | [0.0] | [7,10] |
==> | 11 | [1.0] | [7,11] |
==> | 15 | [1.0,1.0] | [7,11,15] |
==> | 14 | [1.0,1.0,1.0] | [7,11,15,14] |
==> | 8 | [1.0,1.0,1.0,1.0,0.0] | [7,11,15,14,7,8] |
==> | 9 | [1.0,1.0,1.0,1.0,0.0] | [7,11,15,14,7,9] |
==> | 10 | [1.0,1.0,1.0,1.0,0.0] | [7,11,15,14,7,10] |
==> | 12 | [1.0,0.0] | [7,11,12] |
==> +-----------------------------------------------------------------+
==> 17 rows
==> 38 ms
It looks similar to what I need, but the issues are:
(1) nodes 8, 9, 10, 11 have the same relation direction to 7! The inverse query result ...*length(strt-[*0..]-()<-[rr]-() )... looks even stranger - see the queries right below.
(2) I don't know how to normalize the results of the length() expression to 1.
Directionality:
START strt=node(7)
MATCH strt<-[r]-m
RETURN ID(m), r.followrank;
==> +----------------------+
==> | ID(m) | r.followrank |
==> +----------------------+
==> | 8 | 1 |
==> | 9 | 1 |
==> | 10 | 1 |
==> | 11 | 1 |
==> +----------------------+
==> 4 rows
==> 0 ms
START strt=node(7)
MATCH strt-[r]->m
RETURN ID(m), r.followrank;
==> +----------------------+
==> | ID(m) | r.followrank |
==> +----------------------+
==> | 14 | 1 |
==> +----------------------+
==> 1 row
==> 0 ms
Inverse query:
START strt=node(7)
MATCH p=strt-[*1..]-tgt
WHERE not(tgt=strt)
RETURN ID(tgt), extract(rr in rels(p): rr.followrank*length(strt-[*0..]-()<-[rr]-() )) as rank, extract(n in nodes(p): ID(n));
==> +-----------------------------------------------------------------+
==> | ID(tgt) | rank | extract(n in nodes(p): ID(n)) |
==> +-----------------------------------------------------------------+
==> | 14 | [1.0] | [7,14] |
==> | 15 | [1.0,1.0] | [7,14,15] |
==> | 11 | [1.0,1.0,1.0] | [7,14,15,11] |
==> | 8 | [1.0,1.0,1.0,1.0,3.0] | [7,14,15,11,7,8] |
==> | 9 | [1.0,1.0,1.0,1.0,3.0] | [7,14,15,11,7,9] |
==> | 10 | [1.0,1.0,1.0,1.0,3.0] | [7,14,15,11,7,10] |
==> | 12 | [1.0,1.0,1.0,2.0] | [7,14,15,11,12] |
==> | 8 | [3.0] | [7,8] |
==> | 9 | [3.0] | [7,9] |
==> | 10 | [3.0] | [7,10] |
==> | 11 | [1.0] | [7,11] |
==> | 15 | [1.0,1.0] | [7,11,15] |
==> | 14 | [1.0,1.0,1.0] | [7,11,15,14] |
==> | 8 | [1.0,1.0,1.0,1.0,3.0] | [7,11,15,14,7,8] |
==> | 9 | [1.0,1.0,1.0,1.0,3.0] | [7,11,15,14,7,9] |
==> | 10 | [1.0,1.0,1.0,1.0,3.0] | [7,11,15,14,7,10] |
==> | 12 | [1.0,2.0] | [7,11,12] |
==> +-----------------------------------------------------------------+
==> 17 rows
==> 30 ms
So my questions are:
What's going on with this query?
Is there a working approach?
As an additional detail, I know the min(length(path)) aggregator, but it doesn't work in this case, where I'm trying to extract information about the best hit: the additional information I return about the best hit will disaggregate the result again. I think that's a Cypher limitation.
Basically, you want to compute a rank that only considers relationships that go "with the path flow". Unfortunately, to test "with path flow", you need to check the path index of each relationship's start/end nodes, and that can only be done with APOC right now.
// allshortestpaths to get all non-cyclic paths
MATCH path=allshortestpaths((a{id:"1"})-[*]-(b{id:"2"}))
// Find rank worthy relationships
WITH path, filter(rl in relationships(path) WHERE apoc.coll.indexOf(path, startnode(rl))<apoc.coll.indexOf(path, endnode(rl))) as comply
// Filter results
RETURN path, REDUCE(rk = 0, rl in comply | rk+rl.followrank) as rank
ORDER BY rank DESC
(I can't test the APOC part, so you might have to pass NODES(path) instead of path to the APOC procedure)
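The "with path flow" test can also be sketched outside Cypher. The Python below is only an illustration of the idea, assuming a path that visits each node at most once: a relationship counts when its start node appears earlier in the path than its end node. The function name and the tuple layout are made up; the two example paths use the directions shown in the question (7 points to 14, and 8 points to 7).

```python
def path_rank(path_nodes, path_rels):
    """Sum followrank over relationships that point along the path.

    path_nodes: node ids in path order (no repeats assumed).
    path_rels: (start, end, followrank) tuples for the path's relationships.
    """
    position = {node: index for index, node in enumerate(path_nodes)}
    total = 0
    for start, end, followrank in path_rels:
        if position[start] < position[end]:  # points towards the end node
            total += followrank
        # otherwise the relationship goes against the flow and adds 0
    return total


rank_to_14 = path_rank([7, 14], [(7, 14, 1)])  # relationship goes 7 -> 14
rank_to_8 = path_rank([7, 8], [(8, 7, 1)])     # relationship goes 8 -> 7
print(rank_to_14, rank_to_8)
```

Node 14 gets rank 1 and node 8 gets rank 0 while still being part of the result, which is the behaviour the question asks for.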
