Rank nodes according to parameters of paths in Neo4j

I have 100 nodes (n1, n2, n3, etc.) which are connected by three different kinds of relationships: r1, r2, and r3. Each of these relationships has a parameter called "weight", which is a number such as 5, 10, or 15. I need to develop one ranking based on the total number of paths per node and another ranking based on the weight. By total paths I mean that if n1-[r1]->n2 and n2-[r1]->n3 and n3-[r3]->n4, then the total number of paths for n1 would be 3. The value of the ranking by weight would be 5+5+15=25.
Ideally the query would return a ranked list of the nodes.
Is there a way to do that in Cypher?
Thanks

Something like this?
MATCH (n1:Label {id:1})-[r1]->(n2:Label {id:2})-[r2]->(n3:Label {id:3})-[r3]->()
RETURN n1,
SUM(r1.weight+r2.weight+r3.weight) as weight,
count(*) as paths
ORDER BY weight desc, paths desc

Try this, of course with some tweaks for your data model:
MATCH path=(a:Foo)-[:r1|r2|r3*]->(d:Foo)
RETURN length(path) as NumberOfStepsInPath,
reduce(sum=0, y in
extract(x in relationships(path) | x.weight)
| sum + y)
as TotalCost;
So this matches a path from a to d, on any of the relationship types you specify, r1|r2|r3. The length of the path is easy, that's just length(path). Summing the weights is a bit more involved. First, you extract the weight attribute from each relationship in the path. Then you reduce the list of weights down to a single sum.
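To make the two steps concrete, here is a small plain-Python model of what the query computes for a single path. It does not talk to Neo4j; the `relationships` list is a hypothetical stand-in for relationships(path), filled with the example from the question, and the variable names are made up for illustration.

```python
from functools import reduce

# Hypothetical stand-in for relationships(path): one dict per relationship,
# using the example from the question (n1-[r1]->n2-[r1]->n3-[r3]->n4).
relationships = [
    {"type": "r1", "weight": 5},
    {"type": "r1", "weight": 5},
    {"type": "r3", "weight": 15},
]

# length(path): the number of relationships in the path.
number_of_steps = len(relationships)

# extract(x in relationships(path) | x.weight): pull out each weight.
path_weights = [rel["weight"] for rel in relationships]

# reduce(sum = 0, y in ... | sum + y): fold the weights into one total.
total_cost = reduce(lambda acc, w: acc + w, path_weights, 0)

print(number_of_steps, total_cost)  # 3 25
```

The result matches the numbers in the question: 3 steps and a total weight of 25.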

Related

How can I calculate the average number of relationships each node has in neo4j?

I'm stuck finding the average number of relationships each node has. I know how to find the total number of nodes and the total number of relationships, but I can't combine them into one query. Please help.
You can use this approach. This is for getting this info for all nodes regardless of label:
MATCH (n)
WITH n, size((n)--()) as relCount
RETURN avg(relCount) as averageRelCount
If you're trying to return this information in addition to total nodes and total relationships, then you should read this knowledge base article on getting fast counts from the counts store. It can get you those totals, though it can't get you the average rel count as above.
You can combine them by using the counts store early in the query, and the average relationship part at the end.
Here's how you might use it if you use APOC Procedures to select the counts you want from the store:
CALL apoc.meta.stats() YIELD nodeCount, relCount as totalRelCount
MATCH (n)
WITH n, size((n)--()) as relCount, nodeCount, totalRelCount
RETURN avg(relCount) as averageRelCount, nodeCount, totalRelCount
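As a rough illustration of what the averaging part computes, here is a plain-Python model on a toy graph. It does not use Neo4j or APOC; the `nodes` and `rels` data are hypothetical, and the sketch assumes there are no self-loops.

```python
from statistics import mean

# Hypothetical toy graph: node names and (start, end) relationship pairs.
nodes = ["a", "b", "c", "d"]
rels = [("a", "b"), ("a", "c"), ("b", "c")]

# size((n)--()): count the relationships touching n in either direction.
rel_count = {n: sum(1 for s, e in rels if n in (s, e)) for n in nodes}

# avg(relCount): the mean over all nodes, including isolated ones such as "d".
average_rel_count = mean(list(rel_count.values()))

print(rel_count)          # {'a': 2, 'b': 2, 'c': 2, 'd': 0}
print(average_rel_count)  # 1.5
```

Note that the isolated node still contributes a zero to the average, just as MATCH (n) includes nodes without relationships.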

Cypher Query to Collect Arbitrary Depth Nodes and Edge Properties

I have a graph that looks like the image below. However, the depth and the number of rollups from the Person to the topmost Rollup is variable, depending on how the rollups have been structured by the user. The edges from the Person to the Metric (HAS_METRIC) have the score values, and the relationships from the metrics to the Rollup (HAS_PARENT) have the weighting that should be applied to the value as it is rolled up to a top score.
Ideally, I would like to have a query that produces a table with the rollup and the summed/weighted scores. Like this:
node       | value
-----------+------
Metric A   | 23
Metric B   | 55
Metric C   | 29
Metric D   | 78
Rollup A   | 45.4
Rollup B   | 58.4
Rollup Tot | 51.9
However, I do not understand how to collect the edge properties for the HAS_PARENT relationships.
MATCH (p:Person)-[score:HAS_METRIC]->(m:Metric)-[weight:HAS_PARENT]->(ru:Rollup)
-[par_rel:HAS_PARENT*..8]->(ru_par:Rollup)
WITH p, score, m, weight, par_rel, ru, ru_par
RETURN p.uid, score.score, m.uid, weight.weight, ru.uid, par_rel.weight, ru_par.uid
This query is giving me a type mismatch because it does not know what to do with the par_rel.weight. Any pointers are appreciated.
I believe what you are looking for is the relationships(path) function. It is one of the default path functions in Cypher. It returns all relationships in a given path, and you can combine it with one or more Cypher list expressions to get the values you need from the relationships.
Generally speaking, you could do something like:
MATCH p = (n)-[:HAS_PARENT*..8]->()
RETURN [x IN relationships(p) | x.weight] AS weights
You might also find the reduce function useful. E.g.:
...
RETURN reduce(s = 0, x IN relationships(p) | s + x.weight) AS sumWeight
But you need to be careful with your variable length path queries and probably constrain them in order to get only the paths you are interested in.
A good approach would probably be to identify your leaf and root nodes so that you match only paths from a leaf to the root, not intermediate ones. E.g.:
MATCH p = (n)-[:HAS_PARENT*..8]->(root)
WHERE NOT (root)-[:HAS_PARENT]->() AND NOT (n)<-[:HAS_PARENT]-()
...
And of course you can combine these Cypher fragments with others in order to return everything you need in a single query.
I hope this helps. Let us know when you succeed.

How to express constraints on consecutive relations in the Neo4j shortestPath query algorithm?

I would like to query a shortest path in Neo4j while expressing conditions between consecutive relations.
Suppose I have nodes labelled type and relations labelled rel. These relations have the attributes start_time, end_time, exec_time (for the moment they are of type string, but if you prefer you can consider them as integers). I would like to find the shortest path between two nodes b1 and b2 subject to the constraints that:
the relation outgoing from b1 should have the attribute starting_time bigger than a given value (let me call it th);
if there is more than one such relation, the starting_time of the next relation should be bigger than the ending_time of the previous one.
Between two nodes I can have multiple relations.
I started from this query limiting the relations with starting_time bigger than th.
MATCH (b1:type{id:"0247"}), (b2:type{id:"0222"}),
p=shortestPath((b1)-[t:rel*]->(b2))
WHERE ALL (r in relationships(p) WHERE r.starting_time>"14:56:00" )
RETURN p;
I was trying something like this:
MATCH (b1:type{id:"0247"}), (b2:type{id:"0222"}),
p=shortestPath((b1)-[t:rel*]->(b2))
WITH "14:56:00" as th
WHERE ALL (r in relationships(p) WHERE r.starting_time>th WITH r.end_time as th )
RETURN p;
but it does not work and I am not sure the shortestPath algorithm in Neo4j accesses the relations of the shortest path sequentially.
How can I express such a condition in Neo4j cypher query language?
If it is not possible is there a suitable way to model such a time condition in a graph database (I mean how can I change the DB?)
This query may do what you want:
MATCH p = shortestPath((b1:type{id:"0247"})-[t:rel*]->(b2:type{id:"0222"}))
WHERE REDUCE(s = {ok: true, last: "14:56:00"}, r IN RELATIONSHIPS(p) |
{ok: s.ok AND r.starting_time > s.last, last: r.end_time}
).ok
RETURN p;
This query uses REDUCE to iteratively test the relationships while updating the current state s at every step.
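The state-carrying fold may be easier to follow in a plain-Python model. The sketch below mirrors the {ok, last} map from the query using functools.reduce; it does not run against Neo4j, and the function name `path_is_valid` and the sample data are hypothetical. It assumes the times are zero-padded "HH:MM:SS" strings, which compare correctly as text.

```python
from functools import reduce

def path_is_valid(relationships, threshold):
    """Return True if each relationship starts after the previous one ended."""
    def step(state, rel):
        # Same update as the Cypher map: compare against the last end time,
        # then remember this relationship's end time for the next step.
        return {
            "ok": state["ok"] and rel["starting_time"] > state["last"],
            "last": rel["end_time"],
        }
    initial = {"ok": True, "last": threshold}
    return reduce(step, relationships, initial)["ok"]

# Hypothetical data: the second path overlaps, so it should be rejected.
good = [
    {"starting_time": "15:00:00", "end_time": "15:10:00"},
    {"starting_time": "15:12:00", "end_time": "15:30:00"},
]
bad = [
    {"starting_time": "15:00:00", "end_time": "15:10:00"},
    {"starting_time": "15:05:00", "end_time": "15:30:00"},
]

print(path_is_valid(good, "14:56:00"))  # True
print(path_is_valid(bad, "14:56:00"))   # False
```

Once "ok" becomes false it stays false for the rest of the fold, which is why a single violation anywhere in the path rejects the whole path.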

Find a set of (n) nodes where the relationship weight between each pair of nodes is greater than a value (w)

I have a database where each node is connected to all other nodes with a relationship, and each relationship has a weight. Given a weight w and a number of nodes n, I need a query that returns all sets of n nodes in which the relationship between each pair of nodes has a weight greater than w.
Any help on this would be great.
It depends on what you would like your result set to look like. Something as simple as this query would return all paths that fall under your criteria:
MATCH p=()-[r:my_rel]->() WHERE r.weight > w RETURN p;
This would return all such paths.
If you would like the two nodes only (and not the entire pattern's results), you can return only those two nodes:
MATCH (n1)-[r:my_rel]->(n2) WHERE r.weight > w RETURN n1,n2;
Do note that because of Neo4j's storage internals, a search based on the properties of a relationship tends not to perform as well as one based on the properties of a node.
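The queries above filter individual relationships. The question also asks that every pair within a group of n nodes passes the threshold. The plain-Python sketch below shows that extra pairwise check by brute force; it is not a Cypher query and not part of the answer above. The function name `groups_above_weight`, the `weight` mapping, and the toy data are all hypothetical, and relationships are treated as undirected. The number of candidate groups grows combinatorially, so this is only practical for small inputs.

```python
from itertools import combinations

def groups_above_weight(nodes, weight, n, w):
    """Yield every n-node group in which all pairs have weight > w.

    `weight` maps frozenset({a, b}) to the weight of the relationship
    between a and b (an assumed representation, treated as undirected).
    """
    for group in combinations(nodes, n):
        if all(weight[frozenset(pair)] > w for pair in combinations(group, 2)):
            yield group

# Hypothetical fully connected toy graph.
nodes = ["a", "b", "c", "d"]
weight = {
    frozenset({"a", "b"}): 9,
    frozenset({"a", "c"}): 8,
    frozenset({"b", "c"}): 7,
    frozenset({"a", "d"}): 2,
    frozenset({"b", "d"}): 9,
    frozenset({"c", "d"}): 1,
}

print(list(groups_above_weight(nodes, weight, n=3, w=5)))  # [('a', 'b', 'c')]
```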

Find the distance in a path between each node and the last node of the path

I am very new to Cypher and I need help solving a problem I am facing.
In my graph I have a path representing a data stream, and I need to know, for each node in the path, the distance from the last node of the path.
For example, if I have the following path:
(a)->(b)->(c)->(d)
the distance must be 3 for a, 2 for b, 1 for c and 0 for d.
Is there an efficient way to obtain this result in Cypher?
Thanks a lot!
Mauro
If it is just hops between nodes, then I think this will fit the bill.
match p=(a:Test {name: 'A'})-[r*3]->(d:Test {name: 'D'})
with p, range(length(p),0,-1) as idx
unwind idx as elem
return (nodes(p)[elem]).name as Node
, length(p) - elem as Distance
order by Node
In this answer, I define a path to be "complete" if its start node has no incoming relationship and its end node has no outgoing relationship.
This query returns, for each "complete" path, a collection of objects containing each node's Neo4j-generated ID and the number of hops to the end of that path:
MATCH p=(x)-[*]->(y)
WHERE (NOT ()-->(x)) AND (NOT (y)-->())
WITH NODES(p) AS np, LENGTH(p) AS lp
RETURN EXTRACT(i IN RANGE(0, lp, 1) | {id: ID(np[i]), hops: lp - i})
NOTE: Matching with [*] will be costly with large graphs, so you may need to limit the maximum hop value. For example, use [*..4] instead to limit the max hop value to 4.
Also, qualifying the query with appropriate node labels and relationship types may speed it up.
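The index arithmetic in both answers is the same: a node at position i in a path of length lp is lp - i hops from the end. Here is a plain-Python model of that calculation, independent of Neo4j; the function name `hops_to_end` is hypothetical and the input list stands in for NODES(p).

```python
def hops_to_end(path_nodes):
    """Pair each node with its number of hops to the last node of the path."""
    length = len(path_nodes) - 1  # LENGTH(p) counts relationships, not nodes
    # RANGE(0, lp, 1) is inclusive at both ends, hence length + 1 here.
    return [{"node": path_nodes[i], "hops": length - i} for i in range(length + 1)]

for entry in hops_to_end(["a", "b", "c", "d"]):
    print(entry["node"], entry["hops"])
# a 3
# b 2
# c 1
# d 0
```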

Resources