neo4j turn paths into subgraph

Say I have the following graph
       Z
      / \
     /   \
A -> B -> C -> D
 \            /
  -> X -> Y
I compute the paths
match p=((:A)-[*]->(:D))
return p
This will return three paths (rows): AZD, ABCD and AXYD
But I would like to return a single subgraph that contains all the paths between A and D. My understanding is that the only format for returning a subgraph is nodes and relationships, so the query would look like this:
// query logic
return nodes, relationships
What should I write in the query logic? Notes:
- This is not the entire graph. There are other subgraphs in my graph, so returning the entire graph does not work.
- A and D are just node types here. There will be many type A and type D nodes, and there will be one or multiple paths between each A and D node pair.

One way to retrieve the unique set of nodes and relationships is using APOC:
MATCH p=((:A)-[*]->(:D))
RETURN apoc.coll.toSet(
         apoc.coll.flatten(
           COLLECT(nodes(p))
         )
       ) AS nodes,
       apoc.coll.toSet(
         apoc.coll.flatten(
           COLLECT(relationships(p))
         )
       ) AS relationships
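Since the question mentions many A and D nodes with one or more paths per pair, it may help to group the result by pair rather than merging everything into one row. The following is only a sketch under that assumption: it reuses the same two APOC functions, and the variable names a, d, paths, pathNodes and pathRels are chosen here for illustration.

```cypher
// Sketch: one row per (a, d) pair, each with its own de-duplicated subgraph.
MATCH p = (a:A)-[*]->(d:D)
// Grouping by a and d keeps the paths of different pairs apart
WITH a, d, collect(p) AS paths
RETURN a, d,
       apoc.coll.toSet(
         apoc.coll.flatten([q IN paths | nodes(q)])
       ) AS pathNodes,
       apoc.coll.toSet(
         apoc.coll.flatten([q IN paths | relationships(q)])
       ) AS pathRels
```

As with the original query, an unbounded `[*]` expansion can be expensive on a large graph, so an upper bound on the path length is worth considering.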

Related

Query to write hops and return all the properties from the middle nodes or a better way to do it and skip hops?

Node A is connected to Node E through different nodes B (B can be repeating), C and D etc as given below.
(A)--(C)--(D)--(E)
(A)--(B)--(C)--(D)--(E)
(A)--(B)--(B)--(C)--(D)--(E)
(A)--(B)--(B)--(B)--(C)--(D)--(E)
There could be up to 7 B nodes between A and C or no B node at all (like the first case above).
Question: How to get all the E1, E2, E3, E4 connected to A1 with a single query and return properties from all A, B, C, D and E nodes? I could not return the properties using the hops.
MATCH (A {Id:30})-[*1..6]-(E) RETURN DISTINCT A.Name, E.Name;
But we want to return B.Name (If there are multiple B nodes in the middle their names too), C.Name and D.Name too. Happy to skip hopping completely if required. Help please? Thanks in advance.
Try this
// in case there is always A,C and E, you can look for
// paths with length 3 to 6
MATCH path=(A)-[*3..6]-(E)
// return the name of each node in the same order
RETURN [n IN nodes(path) | n.Name] AS nodeNames
Assuming A through E are node labels, this query should get all paths that match your pattern (with 0 to 7 B nodes between the A and C nodes), and return distinct lists of node Name values:
MATCH p=(:A)-[*..8]-(:C)--(:D)--(:E)
WHERE ALL(n IN NODES(p)[1..-3] WHERE 'B' IN LABELS(n))
RETURN DISTINCT [m IN NODES(p) | m.Name] AS names
In general, the query would be more efficient if you could also specify the relationship types and their directionality.
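As a rough illustration of that last point, the sketch below adds a relationship type and a direction to the same pattern. The type name NEXT and the left-to-right direction are assumptions made up for this example (the question does not state them), and the Id filter is taken from the asker's own query.

```cypher
// Sketch only: NEXT and the -> direction are hypothetical, adjust to the real model.
MATCH p = (:A {Id: 30})-[:NEXT*..8]->(:C)-[:NEXT]->(:D)-[:NEXT]->(:E)
// Everything between the A node and the trailing C, D, E must be a B node
WHERE all(n IN nodes(p)[1..-3] WHERE n:B)
RETURN DISTINCT [m IN nodes(p) | m.Name] AS names
```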

How to find all multi-directional relations between two nodes?

I'm trying to find all relations between node a and node b, and the relations could be multi-directions. For example,
a <- c -> b or a -> d -> b where c and d are nodes.
I've tried MATCH (a:PERSON {name: 'WD'})-[r*..3]-(b:PERSON{name: 'EK'}) RETURN r, a, b, but I got two isolated nodes, because the relation between a and b is: a <- c -> b.
Any help would be appreciated.
If you need all the relationships and nodes in between, you can modify your query to return full paths instead of just nodes a and b, as follows:
MATCH paths=(a:PERSON {name: 'WD'})-[r*..3]-(b:PERSON{name: 'EK'})
RETURN paths
This will return paths of length up to 3; change the upper bound as you need.
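Because the pattern above is undirected, it can also be useful to see which way each relationship actually points. A minimal sketch, assuming the intermediate nodes also carry a name property (the question only shows it on the two PERSON nodes):

```cypher
// Sketch: list each hop as [start node name, relationship type, end node name].
MATCH path = (a:PERSON {name: 'WD'})-[*..3]-(b:PERSON {name: 'EK'})
RETURN [r IN relationships(path) |
        [startNode(r).name, type(r), endNode(r).name]] AS hops
```

For a path like a <- c -> b, both hops would list c as the start node.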

cypher query to return or keep only the final sequence when variable length relationship identifiers are used

Is there a way to keep or return only the final full sequences of nodes, instead of all subpaths, when variable-length identifiers are used, so that further operations can be done on each final full-sequence path?
MATCH path = (S:Person)-[rels:NEXT*]->(E:Person)................
eg: find all sequences of nodes with their names in the given list , say ['graph','server','db'] with same 'seqid' property exists in the relationship in between.
i.e.
(graph)->(server)->(db) with same seqid :1
(graph)->(db)->(server) with same seqid :1 // there can be another matching sequence with same seqid
(graph)->(db)->(server) with same seqid :2
Is there a way to keep only the final sequence of nodes, say '(graph)->(server)->(db)', for each sequence, instead of each subpath of a larger sequence such as (graph)->(server) or (server)->(db)?
pls help me to solve this.........
(I am using neo4j 2.3.6 community edition via java api in embedded mode..)
What we could really use here is a longestSequences() function that would do exactly what you want it to do, expand the pattern such that a and b would always be matched to start and end points in the sequence such that the pattern is not a subset of any other matched pattern.
I created a feature request on neo4j for exactly this: https://github.com/neo4j/neo4j/issues/7760
And until that gets implemented, we'll have to make do with some alternate approach. I think what we'll have to do is add additional matching to restrict a and b to start and end nodes of full sequences.
Here's my proposed query:
WITH ['graph', 'server' ,'db'] as names
MATCH p=(a)-[rels:NEXT*]->(b)
WHERE ALL(n in nodes(p) WHERE n.name in names)
AND ALL( r in rels WHERE rels[0]['seqid'] = r.seqid )
WITH names, p, a, rels, b
// check if b is a subsequence node instead of an end node
OPTIONAL MATCH (b)-[rel:NEXT]->(c)
WHERE c.name in names
AND rel.seqid = rels[0]['seqid']
// remove any existing matches where b is a subsequence node
WITH names, p, a, rels, b, c
WHERE c IS NULL
WITH names, p, a, rels, b
// check if a is a subsequence node instead of a start node
OPTIONAL MATCH (d)-[rel:NEXT]->(a)
WHERE d.name in names
AND rel.seqid = rels[0]['seqid']
// remove any existing matches where a is a subsequence node
WITH p, a, b, d
WHERE d IS NULL
RETURN p, a as startNode, b as endNode
MATCH (S:Person)-[r:NEXT]->(:Person)
// Possible starting node
WHERE NOT ( (:Person)-[:NEXT {seqid: r.seqid}]->(S) )
WITH S,
// Collect all possible values of `seqid`
collect (distinct r.seqid) as seqids
UNWIND seqids as seqid
// Possible terminal node
MATCH (:Person)-[r:NEXT {seqid: seqid}]->(E:Person)
WHERE NOT ( (E)-[:NEXT {seqid: seqid}]->(:Person) )
WITH S,
seqid,
collect(distinct E) as ES
UNWIND ES as E
MATCH path = (S)-[rels:NEXT* {seqid: seqid}]->(E)
RETURN S,
seqid,
path
[EDITED]
This query might do what you want:
MATCH (p1:Person)-[rel:NEXT]->(:Person)
WHERE NOT (:Person)-[:NEXT {seqid: rel.seqid}]->(p1)
WITH DISTINCT p1, rel.seqid AS seqid
MATCH path = (p1)-[:NEXT* {seqid: seqid}]->(p2:Person)
WHERE NOT (p2)-[:NEXT {seqid: seqid}]->(:Person)
RETURN path;
It first identifies all Person nodes (p1) with at least one outgoing NEXT relationship that have no incoming NEXT relationships (with the same seqid), and their distinct outgoing seqid values. Then it finds all "complete" paths (i.e., paths whose start and end nodes have no incoming or outgoing NEXT relationships with the desired seqid, respectively) starting at each p1 node and having relationships all sharing the same seqid. Finally, it returns each complete path.
If you just want to get the name property of all the Person nodes in each path, try this query (with a different RETURN clause):
MATCH (p1:Person)-[rel:NEXT]->(:Person)
WHERE NOT (:Person)-[:NEXT {seqid: rel.seqid}]->(p1)
WITH DISTINCT p1, rel.seqid AS seqid
MATCH path = (p1)-[:NEXT* {seqid: seqid}]->(p2:Person)
WHERE NOT (p2)-[:NEXT {seqid: seqid}]->(:Person)
RETURN EXTRACT(n IN NODES(path) | n.name);
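The question targets Neo4j 2.3.6, where EXTRACT is available. On Neo4j 4.0 and later that function no longer exists, so a list comprehension would be needed instead. The variant below is a sketch of the same query with that one change; returning seqid alongside the names is an addition made here so each row can be tied back to its sequence.

```cypher
// Same logic as above, with a list comprehension in place of EXTRACT.
MATCH (p1:Person)-[rel:NEXT]->(:Person)
WHERE NOT (:Person)-[:NEXT {seqid: rel.seqid}]->(p1)
WITH DISTINCT p1, rel.seqid AS seqid
MATCH path = (p1)-[:NEXT* {seqid: seqid}]->(p2:Person)
WHERE NOT (p2)-[:NEXT {seqid: seqid}]->(:Person)
RETURN seqid, [n IN nodes(path) | n.name] AS names;
```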

finding the farthest node using Neo4j (node without any incoming relation)

I have created a graph db in Neo4j and want to use it for generalization purposes.
There are about 500,000 nodes (20 distinct labels) and 2.5 million relations (50 distinct types) between them.
In an example path : a -> b -> c-> d -> e
I want to find out the node without any incoming relations (which is 'a').
And I should do this for all the nodes (finding the nodes at the beginning of all possible paths that have no incoming relations).
I have tried several Cypher codes without any success:
match (a:type_A)-[r:is_a]->(b:type_A)
with a,count (r) as count
where count = 0
set a.isFirst = 'true'
or
match (a:type_A), (b:type_A)
where not (a)<-[:is_a*..]-(b)
set a.isFirst = 'true'
Where is the problem?!
Also, I have to create this code in neo4jClient, too.
Your first query will only match paths where there is a relationship [r:is_a], so counting r can never be 0. Your second query will match any arbitrary pair of nodes labeled :type_A that aren't transitively related by [:is_a]. What you want is to filter on a path predicate. For the general case try
MATCH (a)
WHERE NOT ()-->(a)
RETURN a
This translates roughly to "any node that does not have incoming relationships". You can specify the pattern with types, properties or labels as needed, for instance
MATCH (a:type_A)
WHERE NOT ()-[:is_a]->(a)
RETURN a
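To tie this back to the goal in the question (flagging such nodes with isFirst), the same predicate can drive a SET. This is only a sketch: storing a boolean true rather than the string 'true' is a choice made here, and the flagged alias is an illustrative name.

```cypher
// Sketch: mark every type_A node that has no incoming is_a relationship.
MATCH (a:type_A)
WHERE NOT ()-[:is_a]->(a)
SET a.isFirst = true
// Report how many nodes were marked
RETURN count(a) AS flagged
```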
If you want to find all nodes that have no incoming relationships, you can find them using OPTIONAL MATCH:
MATCH (n)
OPTIONAL MATCH (n)<-[r]-()
WITH n, r
WHERE r IS NULL
RETURN n

neo4j node aggregate filter

I've got a subgraph in which a node c has relationships to t1 and t2. Node t1 has relationships with w1 and w2. Node t2 has a relationship with w1.
What I want to query with Cypher, starting from node c, is the w nodes that have two or more related t nodes, i.e. w1 only.
Apparently you can't aggregate in the WHERE clause like
START c=node(7)
MATCH (c)-[:T_TO]-(t)-[:W_TO]-(w)
WHERE COUNT(t) >= 2
RETURN w.WName;
Maybe looking at it another way, this doesn't work either as I only want the w's that only relate to t1 and t2...?
START c=node(7), t1=node(10), t2=node(8)
MATCH (c)-[:T_TO]-(t)-[:W_TO]-(w)
WHERE t in [t1, t2]
RETURN t, w.WName;
Update
Anyone wanting something like the second one, this works:
START c=node(7), t1=node(8), t2=node(10)
MATCH (c)-[:T_TO]-(t1)-[:W_TO]-(w),(c)-[:T_TO]-(t2)-[:W_TO]-(w)
RETURN w.WName;
How about
START c=node(7)
MATCH (c)-[:T_TO]-(t)-[:W_TO]-(w)
WITH COUNT(t) as tCount,w
WHERE tCount >= 2
RETURN w.WName;
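The START clause used above belongs to older Neo4j versions. A sketch of the same WITH-then-WHERE idea using MATCH follows. It assumes the internal node id 7 from the question and uses id(), which newer Neo4j releases deprecate in favour of elementId(), so a lookup on a real property would usually be preferable. Counting DISTINCT t is an extra safeguard added here in case the same t is reached through more than one relationship.

```cypher
// Sketch: aggregate first, then filter on the aggregate in a following WHERE.
MATCH (c)-[:T_TO]-(t)-[:W_TO]-(w)
WHERE id(c) = 7
// w is the grouping key; each w gets the number of distinct t nodes leading to it
WITH w, count(DISTINCT t) AS tCount
WHERE tCount >= 2
RETURN w.WName, tCount;
```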
