neo4j aggregation on relations for a matched graph

neo4j aggregation on relations for a matched graph - neo4j

I've this cypher case where i need to get the strength of a relation to utilize a better recommendation, my case has an A, B, C nodes with relations (A)-[:HAS {weight:n}]-(B), (A)-[RESPONSIBLE {weight:n}]-(C), what i want to get is the relation between (B)--(C) and to calculate weight of each C with A as weight.
I tried this query which is obviously wrong but that what i could do so far
MATCH (c:C {title:"some title"})
MATCH p=(c)<-[:RESPONSIBLE]-(A)-[:HAS]->(B)
RETURN DISTINCT(c.title) AS c, count(c.id) AS weight
ORDER BY weight DESC
can you guys help ?

I guess, you want to sum up the weight of all :HAS relationships?
MATCH (c:C {title:"some title"})
MATCH p=(c)<-[:RESPONSIBLE]-(A)-[r:HAS]->(B)
RETURN DISTINCT(c.title) AS c, sum(r.weight) AS weight
ORDER BY weight DESC

Related

Neo4J Get only the first relationship per node

I have this graph where the nodes are researchers, and they are related by a relationship named R1, the relationship has a "value" property. How can I get the name of the researchers that are in the relationships with the greatest value? It's like get all the relationships order by r.value DESC but getting only the first relationship per researcher, because I don't want to see on the table duplicated researcher names. By the way, is there a way to get the name of the researchers order by the mean of their relationship "values"? Sorry about the confused topic, I don't speak English very well, thank you very much.
I've been trying things like the Cypher query bellow:
MATCH p=(n)-[r:R1]->(c)
WHERE id(n) < id(c) and r.coauthors = false
return DISTINCT n.name order by n.campus, r.value DESC

Correct me if I am wrong, but you want one result per "n" with the highest value from "r"?
MATCH (n)-[r:R1]->(c)
WHERE r.coauthors = false
WITH n, r ORDER BY r.value DESC
WITH n, head(collect(r)) AS highR
RETURN n.name, highR.value ORDER BY n.campus, highR.value DESC
This will get you all the r's in order and pick the first head(collect(r)) after first doing an ORDER BY. Then you just need to return the values you want. Check out Neo4j Aggregation Functions for some documentation on how aggregation functions work. Good luck!
As an aside, if there is a label that all "n" have, you should add that in your MATCH: MATCH (n:Person) .... it will help speed up your query!

compare sum and sum of different conditions - post union processing?

Hi I have a relationship Artist - Collaborated -> Writer and would like to find who are the artists who write mainly their own songs. Thus the weighted edge between writer and artist with the same name should be bigger than the sum of all other weights.
I managed to do this:
MATCH (n:Artist)-[r:Collaborated]-(m:Writer)
WITH n, m, sum(r.weight) as wrote
WHERE n.name = toLower(m.name)
RETURN n.name as Node, wrote ORDER BY wrote descending;
but I am not sure how to incorporate the second condition. Do I have to use post union processing? Any help pls?
To join the two WHERE conditions, I tried something like this and compare the first sum to the second sum but it doesn't work:
MATCH (o:Artist)-[q:Collaborated]-(p:Writer)
WITH o, p, sum(q.weight) as wrote1
WHERE o.name <> toLower(p.name)
MATCH (n:Artist)-[r:Collaborated]-(m:Writer)
WITH n, m, sum(r.weight) as wrote2
WHERE n.name = toLower(m.name) and wrote2>wrote1
RETURN n.name as Node, wrote2;
This is an example of how my graph looks like:
I would like to know if the weight between eminem and eminem is bigger than all the other weights

Firstly, your model is a little weird, you have two nodes Eminem, one with the label Artist and an other with the label Writer.
For my POV, you should have only one node Eminem with both labels.
To respond to your question I think that this query can helps you :
MATCH (o:Artist)-[r:Collaborated]->(p:Writer)
WITH o, CASE WHEN o.name = p.name THEN r.weight ELSE -1*r.weight END AS score
RETURN o, sum(score) AS score
If the score is superior to 0, then you know that eminem and eminem is bigger than all the other weights.

Return multiple sums of relationship weights using cypher

I have a graph with one node type 'nodeName' and one relationship type 'relName'. Each node pair has 0-1 'relName' relationships with each other but each node can be connected to many nodes.
Given an initial list of nodes (I'll refer to this list as the query subset) I want to:
Find all the nodes that connect to the query subset
I'm currently doing this (which may be overly convoluted):
MATCH (a: nodeName)-[r:relName]-()
WHERE (a.name IN ['query list'])
WITH a
MATCH (b: nodeName)-[r2:relName]-()
WHERE NOT (b.name IN ['query list'])
WITH a, b
MATCH (a)--(b)
RETURN DISTINCT b
Then for each connected node (b) I want to return the SUM of the weights that connect to the query subset
For example. If node b1 has 4 edges that connect to nodes in the query subset I would like to RETURN SUM(r2.weight) AS totalWeight for b2. I actually need a list of all the b nodes ordered by totalWeight.
No. 2 is where I'm stuck. I've been reading the docs about FOREACH and reduce() but I'm not sure how to apply them here.
Speed is important as I have 30,000 nodes and 1.5M edges if you have any suggestions regarding this please throw them into the mix.
Many thanks
Matt

Why do you need so many Match statements? You can specify a nodes and b nodes in single Match statement and select only those who have a relationship between them.
After that just return b nodes and sum of the weights. b nodes will automatically be acting as a group by if it is returned along with aggregation function such as sum.
MATCH (a:nodeName)-[r:relName]-(b:nodeName)
WHERE (a.name IN ['query list']) AND NOT((b.name IN ['query list']))
RETURN b.name, sum(r.weight) as weightSum order by weightSum

I think we can simplify that query a bit.
MATCH (a: nodeName)
WHERE (a.name IN ['query list'])
WITH collect(a) as subset
UNWIND subset as a
MATCH (a)-[r:relName]-(b)
WHERE NOT b in subset
RETURN b, sum(r.weight) as totalWeight
ORDER BY totalWeight ASC
Since sum() is an aggregating function, it will make the non-aggregation variables the grouping key (in this case b), so the sum is per b node, then we order them (switch to DESC if needed).

neo4j indegree outdegree union

I want to compute Indegree and Outdegree and return a graph that has a connection between top 5 Indegree nodes and top 5 Outdegree nodes. I have written a code as
match (a:Port1)<-[r]-()
return a.id as NodeIn, count(r) as Indegree
order by Indegree DESC LIMIT 5
union
match (n:Port1)-[r]->()
return n.id as NodeOut, count(r) as Outdegree
order by Outdegree DESC LIMIT 5
union
match p=(u:Port1)-[:LinkTo*1..]->(t:Port1)
where u.id in NodeIn and t.id in NodeOut
return p
I get an error as
All sub queries in an UNION must have the same column names (line 4, column 1 (offset: 99)) "union"
What are the changes that I need to do to the code?

There's a few things we can improve.
The matches you're doing isn't the most efficient way to get incoming and outgoing degrees for relationships.
Also, UNION can only be used to combine query results with identical columns. In this case, we won't even need UNION, we can use WITH to pipe results from one part of a query to another, and COLLECT() the nodes you need in between.
Try this query:
match (a:Port1)
with a, size((a)<--()) as Indegree
order by Indegree DESC LIMIT 5
with collect(a) as NodesIn
match (a:Port1)
with NodesIn, a, size((a)-->()) as Outdegree
order by Outdegree DESC LIMIT 5
with NodesIn, collect(a) as NodesOut
unwind NodesIn as NodeIn
unwind NodesOut as NodeOut
// we now have a cartesian product between both lists
match p=(NodeIn)-[:LinkTo*1..]->(NodeOut)
return p
Be aware that this performs two NodeLabelScans of :Port1 nodes, and does a cross product of the top 5 of each, so there are 25 variable length path matches, which can be expenses, as this generates all possible paths from each NodeIn to each NodeOut.
If you only one the shortest connection between each, then you might try replacing your variable length match with a shortestPath() call, which only returns the shortest path found between each two nodes:
...
match p = shortestPath((NodeIn)-[:LinkTo*1..]->(NodeOut))
return p
Also, make sure your desired direction is correct, as you're matching nodes with the highest in degree and getting an outgoing path to nodes with the highest out degree, that seems like it might be backwards to me, but you know your requirements best.

rank nodes according to parameters of paths in Neo4J

I have 100 nodes, n1, n2, n3 etc which are connected by three different kind of relationships, r1,r2, and r3. Each of this relationships have a parameter called "weight" which is a number between lets say, 5, 10 and 15. I need to develop a ranking based on the number of total paths per node and also another ranking based on the weight. By total paths i mean that if N1-[r1]->n2 and n2-[r1]->n3 and n3-[r3]->n4 then the total number of paths for n1 would be 3. the value of the ranking by weight would be 5+5+15=25.
Ideally the query would return a list of the nodes ranked.
Is there a way to do that in cypher?
thanks

Something like this??
MATCH (n1:Label {id:1})-[r1]->(n2:Label {id:2})-[r2]->(n3:Label {id:3})-[r3]->()
RETURN n1,
SUM(r1.weight+r2.weight+r3.weight) as weight,
count(*) as paths
ORDER BY weight desc, paths desc

Try this, of course with some tweaks for your data model:
MATCH path=(a:Foo)-[:r1|r2|r3*]->(d:Foo)
RETURN length(path) as NumberOfStepsInPath,
reduce(sum=0, y in
extract(x in relationships(path) | x.weight)
| sum + y)
as TotalCost;
So this matches a path from a to d, on any of the relationship types you specify, r1|r2|r3. The length of the path is easy, that's just length(path). Summing the weights is a bit more involved. First, you extract the weight attribute from each relationship in the path. Then you reduce the list of weights down to a single sum.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart

neo4j aggregation on relations for a matched graph - neo4j

I guess, you want to sum up the weight of all :HAS relationships? MATCH (c:C {title:"some title"}) MATCH p=(c)<-[:RESPONSIBLE]-(A)-[r:HAS]->(B) RETURN DISTINCT(c.title) AS c, sum(r.weight) AS weight ORDER BY weight DESC

Related

Neo4J Get only the first relationship per node

compare sum and sum of different conditions - post union processing?

Return multiple sums of relationship weights using cypher

neo4j indegree outdegree union

rank nodes according to parameters of paths in Neo4J

Categories

Resources