Cypher - Only show node name, not full node in path variable - neo4j

In Cypher I have the following query:
MATCH p=(n1 {name: "Node1"})-[r*..6]-(n2 {name: "Node2"})
RETURN p, reduce(cost = 0, x in r | cost + x.cost) AS cost
It is working as expected. However, it prints the full n1 node, then the full r relationship (with all its attributes), and then full n2.
What I want instead is to just show the value of the name attribute of n1, the type attribute of r and again the name attribute of n2.
How could this be possible?
Thank you.

The tricky part of your request is the type attribute of r, as r is a collection of relationships of the path, not a single relationship. We can use EXTRACT to produce a list of relationship types for all relationships in your path. See if this will work for you:
MATCH (n1 {name: "Node1"})-[r*..6]-(n2 {name: "Node2"})
RETURN n1.name, EXTRACT(rel in r | TYPE(rel)) as types, n2.name, reduce(cost = 0, x in r | cost + x.cost) AS cost
You also seem to be calculating a cost for the path. Have you looked at the shortestPath() function?

Related

Totalize and Relationships in depth

I'm using neo4j 3.5 community edition and trying totalize and get hops from a specific node. I'm running to totalize(its working):
MATCH (p:Person{id:123})-[r]->(m)
RETURN p,count(distinct(r.idSale)) as qtd,sum(r.value) as value,type(r) as type,r.seg as segment,m
How I get hops when its totalize?? I'm trying something like these (not working):
MATCH (p:Person{id:123})-[r]->(m)-[r2*2]->(m2)
RETURN p,count(distinct(r.idSale)) as qtd,sum(r.value) as value,type(r),m,count(distinct(r2.idSale)) as qtd2,sum(r2.value) as t
error : Type mismatch: expected Any, Map, Node, Relationship, Point,
Duration, Date, Time, LocalTime, LocalDateTime or DateTime but was
List (line 1, column 149 (offset: 148))
My graph is like this:
I'm trying:
Since [r2*2] specifies a variable length relationship pattern, r2 is a list of relationships.
Here is one way to get what it seems you want:
MATCH (p:Person{id:123})-[r]->(m), path=(m)-[*2]->(m2)
RETURN p,count(distinct(r.idSale)) as qtd,sum(r.value) as value,type(r),m,
SIZE(apoc.coll.toSet([r2 IN RELATIONSHIPS(path) | r2.idSale])) AS qtd2,
REDUCE(s = 0, r2 IN RELATIONSHIPS(path) | s + r2.value) AS total2,
[r2 IN RELATIONSHIPS(path) | TYPE(r2)] AS types2,
m2
This query uses the APOC function apoc.coll.toSet to get a distinct list. Also, it returns a list of relationship types in types2.

How to sum up property value of node type different from starting and ending node with Cypher in Neo4j

I have Neo4j community 3.5.5, where I built a graph data model with rail station and line sections between stations. Stations and line sections are nodes and connect is the relationship linking them.
I would like to name a starting station and an ending station and sum up length of all rail section in between. I tried the Cypher query below but Neo4j does not recognize line_section as a node type.
match (n1:Station)-[:Connect]-(n2:Station)
where n1.Name='Station1' and n2.Name='Station3'
return sum(Line_Section.length)
I know it is possible to do the summing with Neo4j Traversal API. Is it possible to do this in Cypher?
First capture the path from a start node to the end node in a variable, then reduce the length property over it.
MATCH path=(n1: Station { name: 'Station1' })-[:Connect]->(n2: Station { name: 'Station2' })
RETURN REDUCE (totalLength = 0, node in nodes(path) | totalLength + node.length) as totalDistance
Assuming the line section nodes have the label Line_Section, you can use a variable-length relationship pattern to get the entire path, list comprehension to get a list of the line section nodes, UNWIND to get the individual nodes, and then use aggregation to SUM all the line section lengths:
MATCH p = (n1:Station)-[:Connect*]-(n2:Station)
WHERE n1.Name='Station1' AND n2.Name='Station3'
UNWIND [n IN NODES(p) WHERE 'Line_Section' in LABELS(n)] AS ls
RETURN p, SUM(ls.length) AS distance
Or, you could use REDUCE in place of UNWIND and SUM:
MATCH p = (n1:Station)-[:Connect*]-(n2:Station)
WHERE n1.Name='Station1' AND n2.Name='Station3'
RETURN p, REDUCE(s = 0, ls IN [n IN NODES(p) WHERE 'Line_Section' in LABELS(n)] |
s + ls.length) AS distance
[UPDATED]
Note: a variable-length relationship with an unbounded number of hops is expensive, and can take a long time or run out of memory.
If you only want the distance for the shortest path, then this should be faster:
MATCH
(n1:Station { name: 'Station1' }),
(n2:Station {name: 'Station3' }),
p = shortestPath((n1)-[*]-(n2))
WHERE ALL(r IN RELATIONSHIPS(p) WHERE TYPE(r) = 'Connect')
RETURN p, REDUCE(s = 0, ls IN [n IN NODES(p) WHERE 'Line_Section' in LABELS(n)] |
s + ls.length) AS distance

Cypher - how to walk graph while computing

I'm just starting studying Cypher here..
How would would I specify a Cypher query to return the node connected, from 1 to 3 hops away of the initial node, which has the highest average of weights in the path?
Example
Graph is:
(I know I'm not using the Cypher's notation here..)
A-[2]-B-[4]-C
A-[3.5]-D
It would return D, because 3.5 > (2+4)/2
And with Graph:
A-[2]-B-[4]-C
A-[3.5]-D
A-[2]-B-[4]-C-[20]-E
A-[2]-B-[4]-C-[20]-E-[80]-F
It would return E, because (2+4+20)/3 > 3.5
and F is more than 3 hops away
One way to write the query, which has the benefit of being easy to read, is
MATCH p=(A {name: 'A'})-[*1..3]-(x)
UNWIND [r IN relationships(p) | r.weight] AS weight
RETURN x.name, avg(weight) AS avgWeight
ORDER BY avgWeight DESC
LIMIT 1
Here we extract the weights in the path into a list, and unwind that list. Try inserting a RETURN there to see what the results look like at that point. Because we unwind we can use the avg() aggregation function. By returning not only the avg(weight), but also the name of the last path node, the aggregation will be grouped by that node name. If you don't want to return the weight, only the node name, then change RETURN to WITH in the query, and add another return clause which only returns the node name.
You can also add something like [n IN nodes(p) | n.name] AS nodesInPath to the return statement to see what the path looks like. I created an example graph based on your question with below query with nodes named A, B, C etc.
CREATE (A {name: 'A'}),
(B {name: 'B'}),
(C {name: 'C'}),
(D {name: 'D'}),
(E {name: 'E'}),
(F {name: 'F'}),
(A)-[:R {weight: 2}]->(B),
(B)-[:R {weight: 4}]->(C),
(A)-[:R {weight: 3.5}]->(D),
(C)-[:R {weight: 20}]->(E),
(E)-[:R {weight: 80}]->(F)
1) To select the possible paths with length from one to three - use match with variable length relationships:
MATCH p = (A)-[*1..3]->(T)
2) And then use the reduce function to calculate the average weight. And then sorting and limits to get one value:
MATCH p = (A)-[*1..3]->(T)
WITH p, T,
reduce(s=0, r in rels(p) | s + r.weight)/length(p) AS weight
RETURN T ORDER BY weight DESC LIMIT 1

How to find specific subgraph in Neo4j using where clause

I have a large graph where some of the relationships have properties that I want to use to effectively prune the graph as I create a subgraph. For example, if I have a property called 'relevance score' and I want to start at one node and sprawl out, collecting all nodes and relationships but pruning wherever a relationship has the above property.
My attempt to do so netted this query:
start n=node(15) match (n)-[r*]->(x) WHERE NOT HAS(r.relevance_score) return x, r
My attempt has two issues I cannot resolve:
1) Reflecting I believe this will not result in a pruned graph but rather a collection of disjoint graphs. Additionally:
2) I am getting the following error from what looks to be a correctly formed cypher query:
Type mismatch: expected Any, Map, Node or Relationship but was Collection<Relationship> (line 1, column 52 (offset: 51))
"start n=node(15) match (n)-[r*]->(x) WHERE NOT HAS(r.relevance_score) return x, r"
You should be able to use the ALL() function on the collection of relationships to enforce that for all relationships in the path, the property in question is null.
Using Gabor's sample graph, this query should work.
MATCH p = (n {name: 'n1'})-[rs1*]->()
WHERE ALL(rel in rs1 WHERE rel.relevance_score is null)
RETURN p
One solution that I can think of is to go through all relationships (with rs*), filter the the ones without the relevance_score property and see if the rs "path" is still the same. (I quoted "path" as technically it is not a Neo4j path).
I created a small example graph:
CREATE
(n1:Node {name: 'n1'}),
(n2:Node {name: 'n2'}),
(n3:Node {name: 'n3'}),
(n4:Node {name: 'n4'}),
(n5:Node {name: 'n5'}),
(n1)-[:REL {relevance_score: 0.5}]->(n2)-[:REL]->(n3),
(n1)-[:REL]->(n4)-[:REL]->(n5)
The graph contains a single relevant edge, between nodes n1 and n2.
The query (note that I used {name: 'n1'} to get the start node, you might use START node=...):
MATCH (n {name: 'n1'})-[rs1*]->(x)
UNWIND rs1 AS r
WITH n, rs1, x, r
WHERE NOT exists(r.relevance_score)
WITH n, rs1, x, collect(r) AS rs2
WHERE rs1 = rs2
RETURN n, x
The results:
╒══════════╤══════════╕
│n │x │
╞══════════╪══════════╡
│{name: n1}│{name: n4}│
├──────────┼──────────┤
│{name: n1}│{name: n5}│
└──────────┴──────────┘
Update: see InverseFalcon's answer for a simpler solution.

neo4j collecting nodes and relations type b-->a<--c,a<--d

I am extending maxdemarzi's excellent graph visualisation example (http://maxdemarzi.com/2013/07/03/the-last-mile/) using VivaGraph backed by neo4j.
I want to display relationships of the type
a-->b<--c,b<--d
I tried the query
MATCH p = (a)--(b:X)--(c),(b:X)--(d)
RETURN EXTRACT(n in nodes(p) | {id:ID(n), name:COALESCE(n.name, n.title, ID(n)), type:LABELS(n)}) AS nodes,
EXTRACT(r in relationships(p)| {source:ID(startNode(r)) , target:ID(endNode(r))}) AS rels
It looks like the named query picks up only a-->b<--c pattern and omits the b<--d patterns.
Am i missing something... can i not add multiple patterns in a named query?
The most immediate problem is that the comma in the MATCH clause separates the first pattern from the second. The variable 'p' only stores the first pattern. This is why you aren't getting the results you desire. Independent of that, you are at risk of having a 'loose binding' by putting a label on both of your nodes named 'b' in the two patterns. The second 'b' node should not have a label.
So here is a version of your query that should work.
MATCH p1=(a)-->(b:X)<--(c), p2=(b)<--(d)
WITH nodes(p1) + d AS ns, relationships(p1) + relationships(p2) AS rs
RETURN EXTRACT(n IN ns | {id:ID(n), name:COALESCE(n.name, n.title, ID(n)), type:LABELS(n)}) AS nodes,
EXTRACT(r in rs| {source:ID(startNode(r)) , target:ID(endNode(r))}) AS rels
Capture both paths, then build collections from the nodes and relationships of both paths. The collection of nodes actually only extracts the nodes from p1 and adds the 'd' node. You could write that part as
nodes(p1) + nodes(p2) as ns
but then the 'b' node will appear in the list twice.

Resources