collect distinct nodes and relations from multiple rows - neo4j

I'm writing a query to retrieve nodes and relations from multiple paths:
MATCH path=(p:Label)-[*..100]->()
RETURN [n in nodes(path) | ID(n)] as nodeIds,
[n in nodes(path)] as nodes,
[r in relationships(path) | ID(r)] as relationshipIds,
[r in relationships(path) | type(r)] as relationshipTypes,
[r in relationships(path)] as relationships
However, I get multiple rows (one per path), possibly containing the same data.
I'd like to have a single row containing all the distinct nodeIds, relationshipIds, ...
Thank you!

When I run this query, I don't get duplicate data. But I can see why you might think there's duplicate data. First, you could try this:
MATCH path=(p:Label)-[*..100]->()
WITH DISTINCT(path) as path
RETURN [n in nodes(path) | ID(n)] as nodeIds,
[n in nodes(path)] as nodes,
[r in relationships(path) | ID(r)] as relationshipIds,
[r in relationships(path) | type(r)] as relationshipTypes,
[r in relationships(path)] as relationships
ORDER BY length(path) DESC LIMIT 1;
This would ensure that all paths are distinct, which would mean you couldn't have repeated rows, but I think that should already be the case. Ordering by path length in descending order (DESC) puts the longest paths first, and LIMIT 1 keeps only the longest path. Without DESC, ORDER BY sorts ascending and you would get the shortest path instead.
Anyway, the duplication you're probably seeing is due to paths and path fragments. Let's say I have a->b->c, and all three nodes carry the label. Your query will report three paths:
a->b->c
a->b
b->c
Note that's the correct answer. But in terms of node IDs and relationship IDs, you're going to see a lot of duplication in the result set, because every single node ID will occur at least twice in the results.
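If the goal is simply one row of distinct IDs, the per-path rows can also be merged on the client. Here is a minimal Python sketch, assuming each row has already been fetched as a dict with the nodeIds and relationshipIds keys from the query above. The merge_rows helper and the sample IDs are made up for illustration and are not part of any driver API.

```python
def merge_rows(rows, keys=("nodeIds", "relationshipIds")):
    """Merge per-path rows into one row of distinct values, keeping first-seen order."""
    merged = {key: [] for key in keys}
    seen = {key: set() for key in keys}
    for row in rows:
        for key in keys:
            for value in row[key]:
                if value not in seen[key]:
                    seen[key].add(value)
                    merged[key].append(value)
    return merged


# Hypothetical rows for a->b->c (node IDs 0, 1, 2; relationship IDs 10, 11),
# one row per reported path: a->b->c, a->b, b->c.
rows = [
    {"nodeIds": [0, 1, 2], "relationshipIds": [10, 11]},
    {"nodeIds": [0, 1], "relationshipIds": [10]},
    {"nodeIds": [1, 2], "relationshipIds": [11]},
]
single_row = merge_rows(rows)
```

Each ID ends up in single_row exactly once, however many paths it appeared in.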

Related

How can I find out the repeating node labels in a path in Neo4j?

I have a path that contains several node labels such as Shipped, Received and ReadyToShip. I want to know whether a given path has multiple occurrences of a node label. They may not be in order.
(:Shipped)-[:NEXT]->()-[:NEXT]->()-[:NEXT]->(:ReadyToShip)-[:NEXT]->()-[:NEXT]->(:ReadyToShip)-[:NEXT]->(:Received)
I have many paths, but I want to find all the paths which have 2 or more occurrences of the ReadyToShip node label, like the one above. How can I do this? I can extract all the possible paths between 2 types of nodes using this:
match path=(s:Shipped)-[:NEXT*]->(m:Received) return distinct extract(p in nodes(path) | labels(p))
But I have to extract it out and filter these myself. How can I do this in Cypher?

[UPDATED]
This query should return every path that has at least 2 ReadyToShip nodes, and the number of ReadyToShip nodes in that path:
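MATCH p=(s:Shipped)-[:NEXT*]->(:ReadyToShip)-[:NEXT*]->(:ReadyToShip)-[:NEXT*]->(m:Received)
// DISTINCT: a path with 3 or more ReadyToShip nodes matches this pattern in several ways
RETURN DISTINCT
p,
// total is the running count (named so it does not clash with the node variable s)
REDUCE(total = 0, n IN NODES(p) | CASE WHEN 'ReadyToShip' IN LABELS(n) THEN total + 1 ELSE total END) AS num;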
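For comparison, the same count can be done after the fact on the label lists that the query in the question already returns. This is a rough Python sketch, assuming each path arrives as a list of per-node label lists. The helper names and the sample paths are hypothetical, and the unlabeled nodes in the sample are shown as empty lists.

```python
def count_label(path_labels, label):
    """Count the nodes in one path whose label list contains `label`."""
    return sum(1 for labels in path_labels if label in labels)


def paths_with_repeats(paths, label, minimum=2):
    """Keep only paths with at least `minimum` nodes carrying `label`."""
    result = []
    for path_labels in paths:
        num = count_label(path_labels, label)
        if num >= minimum:
            result.append((path_labels, num))
    return result


# Hypothetical sample: the path from the question, plus one with a single ReadyToShip
long_path = [["Shipped"], [], [], ["ReadyToShip"], [], ["ReadyToShip"], ["Received"]]
short_path = [["Shipped"], ["ReadyToShip"], ["Received"]]
matches = paths_with_repeats([long_path, short_path], "ReadyToShip")
```

Only long_path survives the filter, paired with its count of 2.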

Aggregate nodes and relationships

I'd like to perform a check on a property that exists on both nodes and relationships in my DB. Is there any way to aggregate them all as entity and then perform the check on entity instead of performing three separate checks?
What I currently have is:
MATCH (n1)-[r]-(n2) WHERE (myConditions)
WITH n1, r, n2
WHERE n1.property=1 AND r.property=1 AND n2.property=1
RETURN *
What I'm looking for is something like:
MATCH (n1)-[r]-(n2) WHERE (myConditions)
WITH n1, r, n2 AS entity
WHERE entity.property=1
RETURN *
Important note: There are queries with more than 2 nodes and more than one relationship. I would like to aggregate all graph entities and then perform a single check.
By the way, if "aggregation" is not the right term for this case, please feel free to correct me.

You could use predicates like all(), any(), none() etc. on all nodes and relationships in your path. You still have to check for nodes and relationships separately.
MATCH path=(n1)-[r]-(n2)
WHERE (myConditions)
WITH path
WHERE all(n in nodes(path) WHERE n.property = 1)
AND all(r in relationships(path) WHERE r.property = 1)
RETURN path
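If the single check matters more than doing it inside Cypher, it can also be done after fetching the path. Here is a hedged Python sketch, assuming nodes and relationships arrive as plain dicts of their properties (how you get them into that shape depends on your driver). all_entities_match is a hypothetical helper, not a Neo4j function.

```python
def all_entities_match(nodes, relationships, key="property", expected=1):
    """Return True if every node and relationship has key == expected."""
    # Treat nodes and relationships uniformly as "entities"
    entities = list(nodes) + list(relationships)
    return all(entity.get(key) == expected for entity in entities)


# Hypothetical property maps for a two-node, one-relationship path
nodes = [{"property": 1}, {"property": 1}]
relationships = [{"property": 1}]
ok = all_entities_match(nodes, relationships)
```

An entity that lacks the property counts as a mismatch here, because dict.get returns None for a missing key.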

Match Only Full Paths in Neo4J with Cypher (not sub-paths)

If I have a graph like the following (where the nesting could go on for an arbitrary number of nodes):
(a)-[:KNOWS]->(b)-[:KNOWS]->(c)-[:KNOWS]->(d)-[:KNOWS]->(e)
               |             |
               |            (i)-[:KNOWS]->(j)
               |
              (f)-[:KNOWS]->(g)-[:KNOWS]->(h)-[:KNOWS]->(n)
                             |
                            (k)-[:KNOWS]->(l)-[:KNOWS]->(m)
How can I retrieve all of the full-length paths (in this case, (a)-->(m), (a)-->(n), (a)-->(j) and (a)-->(e))? The query should also be able to return the nodes with no relationships of the given type.
So far I am just doing the following (I only want the id property):
MATCH path=(a)-[:KNOWS*]->(b)
RETURN collect(extract(n in nodes(path) | n.id)) as paths
I need the paths so that in the programming language (in this case clojure) I can create a nested map like this:
{"a" {"b" {"f" {"g" {"k" {"l" {"m" nil}}
                     "h" {"n" nil}}}
           "c" {"d" {"e" nil}
                "i" {"j" nil}}}}}
Is it possible to generate the map directly with the query?

I just had to do something similar, and this worked on your example. It finds all paths that end at a node with no outgoing [:KNOWS] relationship:
match p=(a:Node {name:'a'})-[:KNOWS*]->(b:Node)
optional match (b)-[v:KNOWS]->()
with p,v
where v IS NULL
return collect(extract(n in nodes(p) | n.id)) as paths
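On the nested-map part of the question: one option is to build it from the returned paths on the client. Here is a minimal sketch in Python rather than Clojure, assuming paths has been fetched as a list of id lists that all start at the root. build_nested_map is a made-up helper name.

```python
def build_nested_map(paths):
    """Fold root-to-leaf id lists into nested dicts; leaves map to None (nil)."""
    root = {}
    for path in paths:
        level = root
        for node_id in path:
            # Reuse the child map if this prefix was already seen
            level = level.setdefault(node_id, {})
    return _mark_leaves(root)


def _mark_leaves(tree):
    # An empty child map means no further nodes, so store None instead
    return {key: (_mark_leaves(child) if child else None) for key, child in tree.items()}


# The four full paths from the example graph in the question
paths = [
    ["a", "b", "f", "g", "k", "l", "m"],
    ["a", "b", "f", "g", "h", "n"],
    ["a", "b", "c", "d", "e"],
    ["a", "b", "c", "i", "j"],
]
nested = build_nested_map(paths)
```

Shared prefixes are merged, so sub-paths that start at the same root add nothing new to the result.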

Here is one query that will get you started. This query will return just the longest chain of nodes when there is a single chain without forks. It matches all of the paths like yours does but only returns the longest one by using limit to reduce the result.
MATCH p=(a:Node {name:'a'})-[:KNOWS*]->(:Node)
WITH length(p) AS size, p
ORDER BY size DESC
LIMIT 1
RETURN p AS Longest_Path
I think this gets the second part of your question where there are multiple paths. It looks for paths where the last node does not have an outbound :KNOWS relationship and where the starting node does not have an inbound :KNOWS relationship.
MATCH p=(a:Node {name:'a'})-[:KNOWS*]->(x:Node)
WHERE NOT (x)-[:KNOWS]->()
AND NOT ()-[:KNOWS]->(a)
WITH length(p) AS size, p
ORDER BY size DESC
RETURN reduce(node_ids = [], n IN nodes(p) | node_ids + [id(n)])

get all transitive relationships from a node via cypher

Do you know how to write a Cypher query that would return all the transitive relationships related to a node?
For instance, if I have: (node1)-[rel1]->(node2)-[rel2]->(node3).
I'd like a query that, given node1, returns rel1 and rel2.
Thanks for your help!

You need to use a variable-length path match. Assuming your start node is node1, with label Label and name='node1':
MATCH path=(node1:Label {name:'node1'})-[*..100]->()
RETURN relationships(path) as rels
The relationships function returns a list holding all relationships along that path. It is a best practice to provide an upper limit for variable-depth matches; here I've set it arbitrarily to 100.
Update regarding the comment below:
To get the IDs of the relationships:
MATCH path=(node1:Label {name:'node1'})-[*..100]->()
RETURN [r in relationships(path) | ID(r)] as relIds
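To illustrate what the bounded variable-length match collects, here is a small Python model over a plain edge list. It is only a sketch of the semantics, not how Neo4j executes the query. The (rel_id, source, target) tuple layout and the transitive_relationships name are assumptions made for the example.

```python
from collections import deque


def transitive_relationships(edges, start, max_depth=100):
    """Return ids of relationships reachable from `start` within `max_depth` hops."""
    outgoing = {}
    for rel_id, source, target in edges:
        outgoing.setdefault(source, []).append((rel_id, target))

    found = []
    seen_rels = set()
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # the hop limit is reached, do not expand further
        for rel_id, target in outgoing.get(node, []):
            if rel_id not in seen_rels:
                seen_rels.add(rel_id)
                found.append(rel_id)
                queue.append((target, depth + 1))
    return found


# The example from the question: (node1)-[rel1]->(node2)-[rel2]->(node3)
edges = [("rel1", "node1", "node2"), ("rel2", "node2", "node3")]
rels = transitive_relationships(edges, "node1")
```

Each relationship is recorded once, so the loop terminates even on cyclic graphs, and lowering max_depth cuts the traversal off earlier.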

neo4j collecting nodes and relations type b-->a<--c,a<--d

I am extending maxdemarzi's excellent graph visualisation example (http://maxdemarzi.com/2013/07/03/the-last-mile/) using VivaGraph backed by neo4j.
I want to display relationships of the type
a-->b<--c,b<--d
I tried the query
MATCH p = (a)--(b:X)--(c),(b:X)--(d)
RETURN EXTRACT(n in nodes(p) | {id:ID(n), name:COALESCE(n.name, n.title, ID(n)), type:LABELS(n)}) AS nodes,
EXTRACT(r in relationships(p)| {source:ID(startNode(r)) , target:ID(endNode(r))}) AS rels
It looks like the named query picks up only the a-->b<--c pattern and omits the b<--d patterns.
Am I missing something? Can I not add multiple patterns in a named query?

The most immediate problem is that the comma in the MATCH clause separates the first pattern from the second. The variable 'p' only stores the first pattern. This is why you aren't getting the results you desire. Independent of that, you are at risk of having a 'loose binding' by putting a label on both of your nodes named 'b' in the two patterns. The second 'b' node should not have a label.
So here is a version of your query that should work.
MATCH p1=(a)-->(b:X)<--(c), p2=(b)<--(d)
WITH nodes(p1) + d AS ns, relationships(p1) + relationships(p2) AS rs
RETURN EXTRACT(n IN ns | {id:ID(n), name:COALESCE(n.name, n.title, ID(n)), type:LABELS(n)}) AS nodes,
EXTRACT(r in rs| {source:ID(startNode(r)) , target:ID(endNode(r))}) AS rels
Capture both paths, then build collections from the nodes and relationships of both paths. The collection of nodes actually only extracts the nodes from p1 and adds the 'd' node. You could write that part as
nodes(p1) + nodes(p2) as ns
but then the 'b' node will appear in the list twice.
