I have been struggling to find a solution to my issue with Cypher.
I am trying to sum a given relationship throughout a path.
For example:
1 --> 2 --> 3
2 --> 4
I want to calculate for node 1 the sum of the Amount property for nodes 1, 2, 3 and 4. (Here 3 and 4 are both targets of node 2.)
My understanding is that I need to use collect() and reduce(), but I still do not get the right answer. I have the following:
MATCH (n)-[p]->(m)
WITH m,n, collect(m) AS amounts
RETURN n.ID as Source,m.ID as Target,n.Amount,
REDUCE(total = 0, tot IN amounts | total + tot.Amount) AS totalEUR
ORDER BY total DESC
I get a syntax error, but I am pretty sure that even without the syntax error I would only be summing direct relationships.
Would you guys know if I am on the right path?
Cheers
Max
You need a variable-length relationship in the query:
MATCH p = (n)-[*]->(m)
RETURN n.ID as Source, m.ID as Target, n.Amount,
reduce(total = 0, tot IN nodes(p) | total + tot.Amount) AS totalEUR
ORDER BY totalEUR DESC
You can't order by total, which is a variable local to the reduce function.
Note that this returns one row per path rather than one total per start node. For the graph in the question the paths are 1-->2, 1-->2-->3, 1-->2-->4, 2-->3 and 2-->4, since you haven't matched on a specific n.
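To make the per-path behaviour concrete, here is a small Python sketch (not Cypher) that mimics what the query computes on the graph from the question. The Amount values are invented for illustration, and paths_from is a hypothetical helper, not part of any Neo4j API.

```python
# Toy model of the graph in the question: 1 -> 2, 2 -> 3, 2 -> 4.
# The Amount values are invented for illustration only.
edges = {1: [2], 2: [3, 4], 3: [], 4: []}
amount = {1: 10, 2: 20, 3: 30, 4: 40}

def paths_from(start):
    """Yield every directed path with at least one relationship, like (n)-[*]->(m)."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        for nxt in edges[path[-1]]:
            new_path = path + [nxt]
            yield new_path
            stack.append(new_path)

# One row per path, as in the Cypher query: each sum covers only the nodes on that path.
per_path = {tuple(p): sum(amount[n] for n in p) for p in paths_from(1)}

# Total over every distinct node reachable from node 1, including node 1 itself.
reachable = {n for p in paths_from(1) for n in p}
total_for_node_1 = sum(amount[n] for n in reachable)
```

No single path contains both node 3 and node 4, so none of the per-path rows is the total the question asks for. Getting one total per start node would mean aggregating over the distinct reachable nodes instead.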
Related
I have written the following Cypher query to get the frequency of a certain item from a set of orders.
MATCH (t:Trans)-[r:CONTAINS]->(i:Item)
WITH i,COUNT(*) AS CNT,size(collect(t)) as NumTransactions
RETURN i.ITEM_ID as item, NumTransactions, NumTransactions/CNT as support
I get a table like this as my output
Item  NumTransactions  Support
A     2                1
B     1132             1
C     2049             1
And so on. What I mean to do is divide each NumTransactions value by the total number of transactions, i.e. the sum of the entire NumTransactions column, to get the support. Instead, it divides NumTransactions by itself. Can someone point me to the correct function, if it exists, or to an approach for doing this?
This should work:
MATCH (:Trans)-[:CONTAINS]->(i:Item)
WITH i, COUNT(*) as c
WITH COLLECT({i: i, c: c}) AS data
WITH data, REDUCE(s = 0.0, n IN data | s + n.c) AS total
UNWIND data AS d
RETURN d.i.ITEM_ID as item, d.c AS NumTransactions, d.c/total as support
By the way, SIZE(COLLECT(t)) is inefficient, as it first creates a new collection of t nodes, gets its size, and then deletes the collection. COUNT(t) would have been more efficient.
Also, given your MATCH clause, as long as every t has at most a single CONTAINS relationship to a given i, COUNT(*) (which counts the number of result rows) would give you the same result as COUNT(t).
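As a rough illustration of the arithmetic behind that query, here is a Python sketch (not Cypher). The rows are hypothetical (transaction, item) pairs standing in for the CONTAINS relationships. The point is that the grand total has to be computed across all items before each per-item count is divided by it.

```python
from collections import Counter

# Hypothetical (transaction, item) rows, one per CONTAINS relationship.
rows = [("t1", "A"), ("t2", "A"), ("t1", "B"), ("t3", "B"), ("t4", "B"), ("t2", "C")]

# Per-item count, like WITH i, COUNT(*) AS c.
counts = Counter(item for _, item in rows)

# Grand total over all items, kept as a float so the division below is not integer division
# (this mirrors starting the REDUCE accumulator at 0.0).
total = float(sum(counts.values()))

# Support per item, like d.c / total.
support = {item: c / total for item, c in counts.items()}
```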
I have 100 nodes (n1, n2, n3, etc.) which are connected by three different kinds of relationships: r1, r2 and r3. Each of these relationships has a property called "weight", which is a number such as 5, 10 or 15. I need to develop a ranking based on the total number of paths per node, and another ranking based on the weight. By total paths I mean that if n1-[r1]->n2, n2-[r1]->n3 and n3-[r3]->n4, then the total number of paths for n1 would be 3. The value of the ranking by weight would be 5+5+15=25.
Ideally the query would return a list of the nodes ranked.
Is there a way to do that in Cypher?
Thanks
Something like this?
MATCH (n1:Label {id:1})-[r1]->(n2:Label {id:2})-[r2]->(n3:Label {id:3})-[r3]->()
RETURN n1,
SUM(r1.weight+r2.weight+r3.weight) as weight,
count(*) as paths
ORDER BY weight desc, paths desc
Try this, of course with some tweaks for your data model:
MATCH path=(a:Foo)-[:r1|r2|r3*]->(d:Foo)
RETURN length(path) as NumberOfStepsInPath,
reduce(sum=0, y in
extract(x in relationships(path) | x.weight)
| sum + y)
as TotalCost;
So this matches a path from a to d, on any of the relationship types you specify, r1|r2|r3. The length of the path is easy, that's just length(path). Summing the weights is a bit more involved. First, you extract the weight attribute from each relationship in the path. Then you reduce the list of weights down to a single sum.
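The extract-then-reduce step described above can be sketched in Python (not Cypher) for the single path from the question, n1-[r1]->n2-[r1]->n3-[r3]->n4. The dictionaries are a stand-in for the relationships, and the weights are the ones from the question's example.

```python
from functools import reduce

# Relationships along the example path, with the weights used in the question.
path_relationships = [
    {"type": "r1", "weight": 5},
    {"type": "r1", "weight": 5},
    {"type": "r3", "weight": 15},
]

# length(path): the number of relationships in the path.
number_of_steps = len(path_relationships)

# extract(x in relationships(path) | x.weight): pull out one weight per relationship.
weights = [rel["weight"] for rel in path_relationships]

# reduce(sum = 0, y in ... | sum + y): fold the weights into a single total.
total_cost = reduce(lambda acc, w: acc + w, weights, 0)
```

This gives 3 steps and a total cost of 25, matching the numbers in the question.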
Is it possible to extract in a single cypher query a limited set of nodes and the total number of nodes?
match (n:Molecule) with n, count(*) as nb limit 10 return {N: nb, nodes: collect(n)}
The above query properly returns the nodes, but returns 1 as number of nodes. I certainly understand why it returns 1, since there is no grouping, but can't figure out how to correct it.
The following query first counts all matching nodes (which I guess is what was needed). It then matches again and limits the result, while the original count stays available because it is carried through via the WITH clause.
MATCH
(n:Molecule)
WITH
count(*) AS cnt
MATCH
(n:Molecule)
WITH
n, cnt LIMIT 10
RETURN
{ N: cnt, nodes:collect(n) } AS molecules
Here is an alternate solution:
match (n:Molecule) return {nodes: collect(n)[0..5], n: length(collect(n))}
84 ms for 30k nodes, shorter but not as efficient as the above one proposed by wassgren.
I am using a Cypher query in Neo4j.
My requirement is:
I need to get the unique friends up to two levels deep, together with their shortest depth value.
The graph looks like this:
a-[:frnd]->b, b-[:frnd]->a
b-[:frnd]->c, c-[:frnd]->b
c-[:frnd]->d, d-[:frnd]->c
a-[:frnd]->c, c-[:frnd]->a
I tried:
START n=node(8) match p=n-[:frnd*1..2]->(x) return x.email, length(p)
My output is:
b 1 <--length(p)
a 2
c 2
c 1
d 2
a 2 and so on.
My required output:
My parent node (a) should not be listed.
I need (c) only once, with its shortest length 1.
c with length 2 should not be repeated.
Please help me solve this.
(EDITED. Finding n via START n=node(8) causes problems with other variables later on. So, below we find n in the MATCH statement.)
MATCH p = shortestPath((n {email:"a"})-[:frnd*..2]->(x))
WHERE n <> x AND length(p) > 0
RETURN x.email, length(p)
ORDER BY length(p)
LIMIT 1
If there are multiple "closest friends", this returns one of them.
Also, the shortestPath() function does not support a minimum path length, so "1..2" had to become "..2", and the WHERE clause needed to specify length(p) > 0.
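For comparison, here is a Python sketch (not Cypher) of the result the question describes: every friend within two hops listed once, with its shortest depth, and the start node left out. It uses a plain breadth-first search over the graph from the question. shortest_depths is a hypothetical helper and is not how Neo4j implements shortestPath().

```python
from collections import deque

# The frnd relationships from the question (each friendship is stored in both directions).
friends = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["b", "d", "a"],
    "d": ["c"],
}

def shortest_depths(start, max_depth):
    """Breadth-first search: the first time a node is reached is its shortest depth."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue  # do not expand beyond the depth limit
        for nxt in friends[node]:
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    del depth[start]  # the start node itself is not a friend
    return depth

result = shortest_depths("a", 2)
```

Here c appears only with depth 1, even though it is also reachable in two hops via b.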
Is there a way to limit a cypher query by the sum of a relationship property?
I'm trying to create a cypher query that returns nodes that are within a distance of 100 of the start node. All the relationships have a distance set, the sum of all the distances in a path is the total distance from the start node.
If the WHERE clause could handle aggregate functions what I'm looking for might look like this
START n=node(1)
MATCH path = n-[rel:street*]-x
WHERE SUM( rel.distance ) < 100
RETURN x
Is there a way that I can sum the distances of the relationships in the path for the where clause?
Sure, what you want to do is like a HAVING clause in a SQL query.
In Cypher you can chain query segments and use the results of previous parts in the next part by using WITH; see the manual.
For your example one would assume:
START n=node(1)
MATCH n-[rel:street*]-x
WITH SUM(rel.distance) as distance
WHERE distance < 100
RETURN x
Unfortunately, sum doesn't work with collections yet.
So I tried to do it differently (for variable-length paths):
START n=node(1)
MATCH n-[rel:street*]-x
WITH collect(rel.distance) as distances
WITH head(distances) + head(tail(distances)) + head(tail(tail(distances))) as distance
WHERE distance < 100
RETURN x
Unfortunately, head of an empty list doesn't return null (which could be coalesced to 0) but just fails. So this approach only works for fixed-length paths; I don't know if that works for you.
I've come across the same problem recently. In more recent versions of Neo4j this can be solved with the extract and reduce functions. You could write:
START n=node(1)
MATCH path = (n)-[rel:street*..100]-(x)
WITH extract(r IN rel | r.distance) AS distances, x
WITH reduce(res = 0, d IN distances | res + d) AS distance, x
WHERE distance < 100
RETURN x
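The filtering logic of that query can be sketched in Python (not Cypher). The paths and distances below are hypothetical; each entry stands for one matched path, keyed by its end node, with the distance of each relationship along it.

```python
from functools import reduce

# Hypothetical paths from the start node: end node -> distances of the relationships on the path.
paths = {
    "x1": [30, 40],      # total 70
    "x2": [60, 50],      # total 110
    "x3": [99],          # total 99
    "x4": [25, 25, 50],  # total 100, not strictly below the limit
}

def total_distance(distances):
    # Same fold as reduce(res = 0, d IN distances | res + d) in Cypher.
    return reduce(lambda res, d: res + d, distances, 0)

# Keep only end nodes whose summed path distance is below 100, like WHERE distance < 100.
within_range = sorted(
    node for node, distances in paths.items() if total_distance(distances) < 100
)
```

Note that the filter uses the sum of the distances, not the number of relationships. x4 has only three hops but is still excluded.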
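I don't know about a limitation in the WHERE clause, but you can bound the path length directly in the MATCH clause. Note that *..100 limits the number of relationships in the path, not the sum of their distance values: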
START n=node(1)
MATCH path = n-[rel:street*..100]-x
RETURN x
see http://docs.neo4j.org/chunked/milestone/query-match.html#match-variable-length-relationships