I want to use the shortest paths returned between each node and the root (Main topic classification) to build my graph. I use the following query:
MATCH ()-[:SUBJECT]->(c:Category)
UNWIND NODES(c) AS nd // to get all the nodes on which I want to iterate
FOREACH(n in nd|
WITH n as start
path = allShortestPaths((start)-[:SUBCAT_OF*]-> (p1:Category {catName: "Main_topic_classifications"}))
UNWIND RELATIONSHIPS(path) AS rel
WITH STARTNODE(rel) AS s, ENDNODE(rel) AS e
MERGE (s)-[:NEW_SUBCAT]->(e)
)
For each node c, I try to compute all shortest paths to the root, and then I use the returned paths to create new relationships (NEW_SUBCAT). However, when I run the query above, I get the following error:
Neo.ClientError.Statement.SyntaxError: Invalid input '(': expected an identifier character, whitespace, NodeLabel, a property map or ')' (line 5, ...)
This simpler query may do what you want (FOREACH is totally unnecessary):
MATCH (start:Category)
WHERE ()-[:SUBJECT]->(start)
MATCH path = allShortestPaths((start)-[:SUBCAT_OF*]-> (p1:Category {catName: "Main_topic_classifications"}))
UNWIND RELATIONSHIPS(path) AS rel
WITH STARTNODE(rel) AS s, ENDNODE(rel) AS e
MERGE (s)-[:NEW_SUBCAT]->(e)
The aim of this query is to create new relationships from an existing graph. I have Category nodes connected by SUBCAT_OF relationships. I want to extract the SUBCAT_OF paths from each Category (up to length 4) and use them to create new paths consisting of NEW_SUBCAT relationships.
I am using the following query but I'm not sure that it works correctly:
MATCH (start:Category)
WHERE ()-[:SUBJECT]->(start)
MATCH path =((start)-[:SUBCAT_OF*1..4]-> (p1:Category))
UNWIND RELATIONSHIPS(path) AS rel
WITH STARTNODE(rel) AS s, ENDNODE(rel) AS e
MERGE (s)-[:NEW_SUBCAT]->(e)
Your question did not state that the starting Category must have an incoming SUBJECT relationship. But since your query does filter for that, I will assume that is a requirement.
Your query (cleaned up slightly below) does do what you want.
MATCH (start:Category)
WHERE ()-[:SUBJECT]->(start)
MATCH path = (start)-[:SUBCAT_OF*..4]->(:Category)
UNWIND RELATIONSHIPS(path) AS rel
WITH STARTNODE(rel) AS s, ENDNODE(rel) AS e
MERGE (s)-[:NEW_SUBCAT]->(e)
Note: This query ensures that only a single NEW_SUBCAT relationship will exist between each distinct pair of start and end nodes (even if your DB had multiple SUBCAT_OF relationships between that same pair).
The following alternate query may be a bit faster, as it will first filter out duplicate relationships (produced by the variable-length relationship pattern):
MATCH (start:Category)
WHERE ()-[:SUBJECT]->(start)
MATCH path = (start)-[:SUBCAT_OF*..4]->(:Category)
UNWIND RELATIONSHIPS(path) AS rel
WITH DISTINCT rel
WITH STARTNODE(rel) AS s, ENDNODE(rel) AS e
MERGE (s)-[:NEW_SUBCAT]->(e)
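To see why the DISTINCT step can help, here is a small plain-Python model of the variable-length expansion (not Cypher, and not how Neo4j actually executes the query). The toy edge list and the function name expand_paths are made up for illustration. It enumerates every path of length 1 to 4 from one start node and compares the number of rows produced by the UNWIND with the number left after deduplication.

```python
from collections import Counter

# Hypothetical toy graph: each tuple is one SUBCAT_OF relationship (child, parent).
EDGES = [("a", "b"), ("b", "c"), ("c", "d")]

def expand_paths(start, edges, max_len=4):
    """Yield every path (as a list of edges) of length 1..max_len from start."""
    stack = [(start, [])]
    while stack:
        node, path = stack.pop()
        for edge in edges:
            # Like Cypher, never reuse a relationship within one path.
            if edge[0] == node and edge not in path:
                new_path = path + [edge]
                yield new_path
                if len(new_path) < max_len:
                    stack.append((edge[1], new_path))

# UNWIND RELATIONSHIPS(path) AS rel: one row per relationship per path.
rows = [rel for path in expand_paths("a", EDGES) for rel in path]
# WITH DISTINCT rel: one row per relationship.
distinct_rows = set(rows)

print(Counter(rows))
print(len(rows), len(distinct_rows))
```

With this toy data the UNWIND produces 6 rows for only 3 relationships, so the MERGE would run 6 times without DISTINCT and 3 times with it.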
I am trying to build new relationships from the path returned by allShortestPaths.
MATCH (p1:Category {catName: "Main_topic_classifications"}),
(p2:Category {catName: "Monarchs_of_the_Bulgars"}),
path = allShortestPaths((p2)-[:SUBCAT_OF*]->(p1))
FOREACH (s IN rels(path) |
MERGE (startNode(s))-[:NEW_SUBCAT]->(ENDNODE(s)))
However, when I run this query, I get the following error:
Neo.ClientError.Statement.SyntaxError: Invalid input '(': expected an identifier character, whitespace, NodeLabel, a property map or ')' (line 5, column 24 (offset: 248))
" MERGE (:startNode(s))-[:NEW_REL]->(:ENDNODE(s)))"
^
The Cypher language does not allow a node pattern to contain a function that returns the node (even though that would be very convenient).
This query (which first creates the node variables s and e, so that they can be used in node patterns) should work for you:
MATCH
(p1:Category {catName: "Main_topic_classifications"}),
(p2:Category {catName: "Monarchs_of_the_Bulgars"}),
path = allShortestPaths((p2)-[:SUBCAT_OF*]->(p1))
UNWIND RELATIONSHIPS(path) AS rel
WITH STARTNODE(rel) AS s, ENDNODE(rel) AS e
MERGE (s)-[:NEW_SUBCAT]->(e)
I have a path and I want to get the nodes that are connected to this path by some edges. I wrote this query, but it's not working properly:
match p=(a)-[:example*]->(c) where length(p) = 5
with p
match (u)-[r:example2]-> p return u,p,r
I want to get all of those nodes 'u'.
Can you please tell me what I'm doing wrong?
Thanks.
Extract the nodes of the path using the nodes() function, UNWIND the list and then perform the match. You might want to collect the results for each node a.
match p=(a)-[:example*]->(c)
where length(p) = 5
with a, nodes(p) as pathNodes
unwind pathNodes as pathNode
match (u)-[r:example2]->(pathNode)
return a, collect([u, pathNode])
I know there are two types of nodes say (:RefNodeType1) and (:RefNodeType2). Both will have single instances in the given graph. Then I know there are another two types of nodes (:TargetNodeType1) and (:TargetNodeType2) with multiple instances of each of them.
I want to know all instances of (:TargetNodeType1) and (:TargetNodeType2) that have the same path between them as the path between (:RefNodeType1) and (:RefNodeType2). By "same path", I mean that the relationship types and node labels on both paths should be the same and should occur in the same sequence.
Pseudo cypher may look something like this
MATCH path = (:RefNodeType1)-<path-description>-(:RefNodeType2) //<path-description> can be anything, may include variable length relationship e.g. [:XYZ*1..]
WITH path
MATCH (a:TargetNodeType1)-<path>-(b:TargetNodeType2)
RETURN a,b
But I don't know how to specify <path>, that is, how to check whether the path contained in the variable path also exists between TargetNodeType1 and TargetNodeType2. I don't know whether such a query is possible or not. If it is, how can I do this? And is there a better approach?
This is possible.
Let's create an example graph:
CREATE
(rn1:RefNodeType1 {name: "rn1"}),
(rn2:RefNodeType2 {name: "rn2"}),
(tn1:TargetNodeType1 {name: "tn1"}),
(tn2:TargetNodeType2 {name: "tn2"}),
(i1:IntermediateNodeType {name: "i1"}),
(i2:IntermediateNodeType {name: "i2"}),
(rn1)-[:REL]->(i1)-[:REL]->(rn2),
(tn1)-[:REL]->(i2)-[:REL]->(tn2)
First, get the node labels along a certain path. We first use the nodes() function to extract the nodes of the path. To get the intermediate nodes, we drop the first and the last node, using the [1..size(nodes)-1] range (note that the upper limit is non-inclusive, e.g. RETURN ['a', 'b', 'c'][1..2] returns ['b']). We UNWIND the intermediate nodes so that we can invoke the labels() function on each of them, and collect the results into a list.
MATCH path = (:RefNodeType1)-[r*]->(:RefNodeType2)
WITH nodes(path) AS nodes
WITH nodes[1..size(nodes)-1] AS refIntermediateNodes
UNWIND refIntermediateNodes AS refIntermediateNode
WITH collect(labels(refIntermediateNode)) AS refIntermediateNodeLabels
RETURN *
This results in:
╒═════════════════════════╕
│refIntermediateNodeLabels│
╞═════════════════════════╡
│[[IntermediateNodeType]] │
└─────────────────────────┘
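The slice in the query behaves much like a Python slice with an exclusive upper bound. Here is a minimal sketch, using a hand-written list of (name, labels) pairs as a stand-in for nodes(path). The data mirrors the example graph above, but the representation is an assumption made purely for illustration.

```python
# Stand-in for nodes(path) on rn1 -> i1 -> rn2: (name, labels) pairs.
path_nodes = [
    ("rn1", ["RefNodeType1"]),
    ("i1", ["IntermediateNodeType"]),
    ("rn2", ["RefNodeType2"]),
]

# Equivalent of the [1..n-1] range: drop the first and the last node.
intermediate = path_nodes[1:len(path_nodes) - 1]

# Equivalent of collect(labels(node)) over the unwound intermediate nodes.
intermediate_labels = [labels for _name, labels in intermediate]

print(intermediate_labels)  # [['IntermediateNodeType']]
```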
We apply this to both the RefNodeTypes and the TargetNodeTypes.
// (1)
MATCH path = (:RefNodeType1)-[r*]->(:RefNodeType2)
WITH nodes(path) AS nodes
WITH nodes[1..size(nodes)-1] AS refIntermediateNodes
UNWIND refIntermediateNodes AS refIntermediateNode
WITH collect(labels(refIntermediateNode)) AS refIntermediateNodeLabels
// (2)
MATCH path = (t1:TargetNodeType1)-[r*]->(t2:TargetNodeType2)
WITH t1, t2, refIntermediateNodeLabels, nodes(path) AS nodes
WITH t1, t2, refIntermediateNodeLabels, nodes[1..size(nodes)-1] AS targetIntermediateNodes
UNWIND targetIntermediateNodes AS targetIntermediateNode
WITH t1, t2, refIntermediateNodeLabels, collect(labels(targetIntermediateNode)) AS targetIntermediateNodeLabels
WHERE refIntermediateNodeLabels = targetIntermediateNodeLabels
RETURN t1, t2
Note that in the second MATCH (2), we need to pass refIntermediateNodeLabels through every WITH clause, because WITH only carries forward the variables it lists.
The result is:
╒═══════════╤═══════════╕
│t1 │t2 │
╞═══════════╪═══════════╡
│{name: tn1}│{name: tn2}│
└───────────┴───────────┘
And, of course, we have to do this for relationships as well. The main differences are that 1) we do not have to drop the first and the last one, and 2) we have to use type() instead of labels().
So, to get the relationships along a path:
MATCH path = (:RefNodeType1)-[r*]->(:RefNodeType2)
WITH relationships(path) AS refRelationships
UNWIND refRelationships AS refRelationship
WITH collect(type(refRelationship)) AS refRelationshipLabels
RETURN *
This results in:
╒═════════════════════╕
│refRelationshipLabels│
╞═════════════════════╡
│[REL, REL]           │
└─────────────────────┘
So let's put this all together to get the full query:
// (1)
MATCH path = (:RefNodeType1)-[r*]->(:RefNodeType2)
WITH nodes(path) AS nodes, relationships(path) AS refRelationships
WITH nodes[1..size(nodes)-1] AS refNodes, refRelationships
UNWIND refNodes AS refNode
WITH refNodes, collect(labels(refNode)) AS refNodeLabels, refRelationships
UNWIND refRelationships AS refRelationship
WITH refNodeLabels, collect(type(refRelationship)) AS refRelationshipLabels
// (2)
MATCH path = (t1:TargetNodeType1)-[r*]->(t2:TargetNodeType2)
WITH t1, t2, refNodeLabels, refRelationshipLabels, nodes(path) AS nodes, relationships(path) AS targetIntermediateRelationships
WITH t1, t2, refNodeLabels, refRelationshipLabels, nodes[1..size(nodes)-1] AS targetIntermediateNodes, targetIntermediateRelationships
UNWIND targetIntermediateNodes AS targetIntermediateNode
WITH t1, t2, refNodeLabels, refRelationshipLabels, collect(labels(targetIntermediateNode)) AS targetIntermediateNodeLabels, targetIntermediateRelationships
UNWIND targetIntermediateRelationships AS targetIntermediateRelationship
WITH t1, t2, refNodeLabels, refRelationshipLabels, targetIntermediateNodeLabels, collect(type(targetIntermediateRelationship)) AS targetIntermediateRelationshipLabels
WHERE refNodeLabels = targetIntermediateNodeLabels
AND refRelationshipLabels = targetIntermediateRelationshipLabels
RETURN t1, t2
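The comparison this query performs can be modelled outside the database as follows. This is only a sketch in plain Python: the dictionary layout, the helper name path_signature, and the non-matching tn3/tn4 pair are invented here. Unlike the query, it sorts each node's labels so that the comparison does not depend on label order.

```python
def path_signature(path):
    """Return (intermediate node labels, relationship types) for a path.

    `path` is a hypothetical dict with "nodes" (a list of label lists)
    and "rels" (a list of relationship type names).
    """
    inner = path["nodes"][1:-1]  # drop the start and end nodes
    # Sorting labels is an addition, not something the Cypher query does.
    node_labels = [sorted(labels) for labels in inner]
    return node_labels, list(path["rels"])

ref_path = {
    "nodes": [["RefNodeType1"], ["IntermediateNodeType"], ["RefNodeType2"]],
    "rels": ["REL", "REL"],
}
target_paths = {
    ("tn1", "tn2"): {
        "nodes": [["TargetNodeType1"], ["IntermediateNodeType"], ["TargetNodeType2"]],
        "rels": ["REL", "REL"],
    },
    # Hypothetical non-matching pair: same labels, different relationship type.
    ("tn3", "tn4"): {
        "nodes": [["TargetNodeType1"], ["IntermediateNodeType"], ["TargetNodeType2"]],
        "rels": ["REL", "OTHER"],
    },
}

ref_sig = path_signature(ref_path)
matches = [pair for pair, p in target_paths.items() if path_signature(p) == ref_sig]
print(matches)  # [('tn1', 'tn2')]
```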
Limitations: I am not sure this will work for nodes with multiple labels, but it seems the labels are returned in the same order:
CREATE (:A:B), (:B:A)
MATCH (n:A)
RETURN labels(n)
╒═════════╕
│labels(n)│
╞═════════╡
│[A, B] │
├─────────┤
│[A, B] │
└─────────┘
Is there a way to keep or return only the final full sequences of nodes, instead of all subpaths, when variable-length patterns are used, so that further operations can be done on each full sequence path?
MATCH path = (S:Person)-[rels:NEXT*]->(E:Person)................
e.g. find all sequences of nodes whose names are in a given list, say ['graph','server','db'], where the relationships in between all have the same 'seqid' property.
i.e.
(graph)->(server)->(db) with same seqid :1
(graph)->(db)->(server) with same seqid :1 //there can be another matching sequence with same seqid
(graph)->(db)->(server) with same seqid :2
Is there a way to keep only the final sequence of nodes, say '(graph)->(server)->(db)', for each sequence, instead of every subpath of a larger sequence such as (graph)->(server) or (server)->(db)?
Please help me solve this.
(I am using neo4j 2.3.6 community edition via java api in embedded mode..)
What we could really use here is a longestSequences() function that would do exactly what you want: expand the pattern so that a and b are always matched to the start and end points of a sequence, and the matched pattern is never a subset of any other matched pattern.
I created a feature request on neo4j for exactly this: https://github.com/neo4j/neo4j/issues/7760
And until that gets implemented, we'll have to make do with some alternate approach. I think what we'll have to do is add additional matching to restrict a and b to start and end nodes of full sequences.
Here's my proposed query:
WITH ['graph', 'server' ,'db'] as names
MATCH p=(a)-[rels:NEXT*]->(b)
WHERE ALL(n in nodes(p) WHERE n.name in names)
AND ALL( r in rels WHERE rels[0]['seqid'] = r.seqid )
WITH names, p, a, rels, b
// check if b is a subsequence node instead of an end node
OPTIONAL MATCH (b)-[rel:NEXT]->(c)
WHERE c.name in names
AND rel.seqid = rels[0]['seqid']
// remove any existing matches where b is a subsequence node
WITH names, p, a, rels, b, c
WHERE c IS NULL
WITH names, p, a, rels, b
// check if a is a subsequence node instead of a start node
OPTIONAL MATCH (d)-[rel:NEXT]->(a)
WHERE d.name in names
AND rel.seqid = rels[0]['seqid']
// remove any existing matches where a is a subsequence node
WITH p, a, b, d
WHERE d IS NULL
RETURN p, a as startNode, b as endNode
MATCH (S:Person)-[r:NEXT]->(:Person)
// Possible starting node
WHERE NOT ( (:Person)-[:NEXT {seqid: r.seqid}]->(S) )
WITH S,
// Collect all possible values of `seqid`
collect (distinct r.seqid) as seqids
UNWIND seqids as seqid
// Possible terminal node
MATCH (:Person)-[r:NEXT {seqid: seqid}]->(E:Person)
WHERE NOT ( (E)-[:NEXT {seqid: seqid}]->(:Person) )
WITH S,
seqid,
collect(distinct E) as ES
UNWIND ES as E
MATCH path = (S)-[rels:NEXT* {seqid: seqid}]->(E)
RETURN S,
seqid,
path
[EDITED]
This query might do what you want:
MATCH (p1:Person)-[rel:NEXT]->(:Person)
WHERE NOT (:Person)-[:NEXT {seqid: rel.seqid}]->(p1)
WITH DISTINCT p1, rel.seqid AS seqid
MATCH path = (p1)-[:NEXT* {seqid: seqid}]->(p2:Person)
WHERE NOT (p2)-[:NEXT {seqid: seqid}]->(:Person)
RETURN path;
It first identifies all Person nodes (p1) with at least one outgoing NEXT relationship that have no incoming NEXT relationships (with the same seqid), and their distinct outgoing seqid values. Then it finds all "complete" paths (i.e., paths whose start and end nodes have no incoming or outgoing NEXT relationships with the desired seqid, respectively) starting at each p1 node and having relationships all sharing the same seqid. Finally, it returns each complete path.
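As a rough illustration of the "complete path" idea, here is a plain-Python sketch over a made-up list of (source, target, seqid) triples standing in for NEXT relationships. The data and the function name complete_paths are assumptions for illustration. The sketch also assumes that, for a given seqid, each node has at most one outgoing NEXT relationship and that there are no cycles, which the Cypher query itself does not require.

```python
# Hypothetical NEXT relationships as (source name, target name, seqid).
NEXT = [
    ("graph", "server", 1), ("server", "db", 1),
    ("graph", "db", 2), ("db", "server", 2),
]

def complete_paths(rels):
    """Return {seqid: [chains]} where each chain is a full node sequence.

    Assumes at most one outgoing NEXT per node per seqid, and no cycles.
    """
    result = {}
    for seqid in sorted({s for _, _, s in rels}):
        nxt = {src: dst for src, dst, s in rels if s == seqid}
        targets = set(nxt.values())
        chains = []
        # Start nodes: an outgoing NEXT but no incoming NEXT with this seqid.
        for start in (n for n in nxt if n not in targets):
            chain = [start]
            # Follow the chain until a node has no outgoing NEXT for this seqid.
            while chain[-1] in nxt:
                chain.append(nxt[chain[-1]])
            chains.append(chain)
        result[seqid] = chains
    return result

print(complete_paths(NEXT))
```

Subpaths such as graph -> server never appear in the result, because a chain is only started at a node with no matching incoming relationship and only stopped at a node with no matching outgoing one.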
If you just want to get the name property of all the Person nodes in each path, try this query (with a different RETURN clause):
MATCH (p1:Person)-[rel:NEXT]->(:Person)
WHERE NOT (:Person)-[:NEXT {seqid: rel.seqid}]->(p1)
WITH DISTINCT p1, rel.seqid AS seqid
MATCH path = (p1)-[:NEXT* {seqid: seqid}]->(p2:Person)
WHERE NOT (p2)-[:NEXT {seqid: seqid}]->(:Person)
RETURN EXTRACT(n IN NODES(path) | n.name);