How to duplicate a node tree in Neo4j?

How can I create a full copy of an existing node tree? I have one top node (A), and I need to find all the nodes and relationships connected to it, then recreate that whole tree under another node (B).
In other words: copy the entire node tree from A, with its relationships, and create the same nodes and relationships for node B.

Okay, this is a tricky one.
As others have mentioned, apoc.refactor.cloneNodesWithRelationships() can help, but it will also create relationships between the cloned nodes and the originals, not just between the clones, as this proc wasn't written with this kind of use case in mind.
So this requires us to do some Cypher acrobatics to find the relationships from the cloned nodes that don't go to other cloned nodes, so we can delete them.
At the same time, we also have to identify the clone of the old root node so we can refactor its relationships to the new root (this requires separate processing for incoming and outgoing relationships).
Here's a query that should do the trick, making assumptions about the nodes since you didn't provide any details about your graph:
MATCH (a:Root{name:'A'})
WITH a, id(a) as aId
CALL apoc.path.subgraphNodes(a, {}) YIELD node
WITH aId, collect(node) as nodes
CALL apoc.refactor.cloneNodesWithRelationships(nodes) YIELD input, output
WITH aId, collect({input:input, output:output}) as createdData, collect(output) as createdNodes
WITH createdNodes, [item in createdData WHERE item.input = aId | item.output][0] as aClone // clone of root A node
UNWIND createdNodes as created
OPTIONAL MATCH (created)-[r]-(other)
WHERE NOT other in createdNodes
DELETE r // get rid of relationships that aren't between cloned nodes
// now to refactor relationships from aClone to the new B root node
WITH DISTINCT aClone
MATCH (b:Root{name:'B'})
WITH aClone, b
MATCH (aClone)-[r]->()
CALL apoc.refactor.from(r, b) YIELD output
WITH DISTINCT b, aClone
MATCH (aClone)<-[r]-()
CALL apoc.refactor.to(r, b) YIELD output
WITH DISTINCT aClone
DETACH DELETE aClone

Related

Neo4j Cypher complex query optimization

I have a graph with millions of nodes and millions of directed relationships between them.
Suppose each node has one of two states, A or B. I want to find all state A nodes that have no state B node on any path through them, upstream or downstream.
As shown in the figure below, there are nodes A through K; three of them (E, G and J) are of type B, and the others are of type A.
picture link is https://i.stack.imgur.com/a0yOV.jpg
For node E, its upstream and downstream traversal is shown below, so nodes B, H, K do not meet the requirements.
For node G, its upstream and downstream traversal is shown below, so nodes B, D, K do not meet the requirements.
For node J, its upstream and downstream traversal is shown below, so nodes A, B, C, D, F do not meet the requirements.
So finally only node "I" is the node that meets the requirements.
picture link is https://i.stack.imgur.com/A2eqv.jpg
The example above is a DAG, but in the actual data there may be cycles in the graph, including a spin cycle (case 1), an AB cycle (case 2), large loops (case 3), and complex cycles (case 4).
picture link is https://i.stack.imgur.com/NDpED.jpg
The Cypher query I came up with is:
MATCH (n:A)
WHERE NOT exists((n)-[*]->(:B))
AND NOT exists((n)<-[*]-(:B))
RETURN n;
But this query gets stuck on a graph with millions of nodes and millions of edges, even with LIMIT 35, although in the end more than 30,000 nodes meet the requirements.
Obviously my statement uses too much memory: returning just 30+ nodes takes up almost all the available memory. How can I write a more efficient query?
Here is an example:
CREATE (a:A{id:'a'})
CREATE (b:A{id:'b'})
CREATE (c:A{id:'c'})
CREATE (d:A{id:'d'})
CREATE (e:B{id:'e'})
CREATE (f:A{id:'f'})
CREATE (g:B{id:'g'})
CREATE (h:A{id:'h'})
CREATE (i:A{id:'i'})
CREATE (j:B{id:'j'})
CREATE (k:A{id:'k'})
MERGE (a)-[:REF]->(c)
MERGE (b)-[:REF]->(c)
MERGE (b)-[:REF]->(d)
MERGE (b)-[:REF]->(e)
MERGE (c)-[:REF]->(f)
MERGE (d)-[:REF]->(g)
MERGE (e)-[:REF]->(g)
MERGE (e)-[:REF]->(h)
MERGE (f)-[:REF]->(i)
MERGE (f)-[:REF]->(j)
MERGE (f)-[:REF]->(k)
MERGE (g)-[:REF]->(k)
MERGE (g)-[:REF]->(j)
Running this query returns the node 'i':
MATCH (n:A)
WHERE NOT exists((n)-[*]->(:B))
AND NOT exists((n)<-[*]-(:B))
RETURN n;
But when there are 800,000 nodes (400,000 of type A, 400,000 of type B) and over 1.4 million edges in the graph, this query never returns a result.
Some thoughts:
I don’t think this global graph search can be solved with a single query. You will need some kind of process that optimises the exploration and reuses the results found up to a certain point in subsequent steps.
If you can assign node labels instead of properties to reflect the state of a node, you can use apoc.path.expandConfig to explore paths only until you hit a node with state B.
You don’t need to re-investigate state A nodes that you traversed before hitting a node with state B, because they will not meet the requirements.
Another approach could be the following, given that all nodes on the upstream or downstream paths from a B node will not fulfil the requirements. This still assumes that you use labels to distinguish A and B nodes.
MATCH (b:B)
// "<" follows incoming relationships, so this collects nodes upstream of b
CALL apoc.path.spanningTree(b,
  {relationshipFilter: "<",
   labelFilter: "/B"
  }
) YIELD path
UNWIND nodes(path) AS upStreamNode
WITH b, COLLECT(DISTINCT upStreamNode) AS upStreamNodes
// ">" follows outgoing relationships, so this collects nodes downstream of b
CALL apoc.path.spanningTree(b,
  {relationshipFilter: ">",
   labelFilter: "/B"}
) YIELD path
UNWIND nodes(path) AS downStreamNode
WITH b, upStreamNodes + COLLECT(DISTINCT downStreamNode) AS upAndDownStreamNodes
RETURN apoc.coll.toSet(apoc.coll.flatten(COLLECT(upAndDownStreamNodes))) AS allNodesThatDoNotFulfillRequirements
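To illustrate why exploring outward from the B nodes can scale better than checking every A node separately, here is a minimal in-memory sketch in Python. It is not Cypher and not part of the answer above. It runs one breadth-first search forward and one backward from all B nodes at once, so each node and edge is visited at most once per direction, and a visited set keeps cycles from causing trouble. The function names and the edge-list representation are assumptions made for this illustration; the data is the example graph from the question.

```python
from collections import defaultdict, deque

def reachable(sources, adjacency):
    """Return every node reachable from any source (sources included)."""
    seen = set(sources)
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:  # the visited set makes cycles harmless
                seen.add(neighbour)
                queue.append(neighbour)
    return seen

def clean_a_nodes(a_nodes, b_nodes, edges):
    """A nodes that are neither upstream nor downstream of any B node."""
    forward, backward = defaultdict(list), defaultdict(list)
    for src, dst in edges:
        forward[src].append(dst)
        backward[dst].append(src)
    tainted = reachable(b_nodes, forward) | reachable(b_nodes, backward)
    return set(a_nodes) - tainted

# Example graph from the question (hypothetical in-memory representation).
A_NODES = set("abcdfhik")
B_NODES = set("egj")
EDGES = [("a", "c"), ("b", "c"), ("b", "d"), ("b", "e"), ("c", "f"),
         ("d", "g"), ("e", "g"), ("e", "h"), ("f", "i"), ("f", "j"),
         ("f", "k"), ("g", "k"), ("g", "j")]

print(clean_a_nodes(A_NODES, B_NODES, EDGES))  # {'i'}
```

The same idea could be applied batch-wise against the database: mark everything reachable from B nodes once, then return the A nodes that were never marked.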

Neo4J: How can I find if a path traversing multiple nodes given in a list exist?

I have a graph of nodes with a relationship NEXT with 2 properties sequence (s) and position (p). For example:
N1-[NEXT{s:1, p:2}]-> N2-[NEXT{s:1, p:3}]-> N3-[NEXT{s:1, p:4}]-> N4
A node N might have multiple outgoing Next relationships with different property values.
Given a list of node names, e.g. [N2,N3,N4] representing a sequential path, I want to check if the graph contains the nodes and that the nodes are connected with relationship Next in order.
For example, if the list contains [N2,N3,N4], then check if there is a relationship Next between nodes N2,N3 and between N3,N4.
In addition, I want to make sure that the nodes are part of the same sequence, so the property s must be the same for each relationship Next. To ensure that the order is maintained, I need to verify that the property p is incremental. Meaning, if the value of p on the relationship N2 -> N3 is 3, then the value of p on N3 -> N4 must be (3+1) = 4, and so on.
I tried using APOC to retrieve the possible paths from an initial node N using python (library: neo4jrestclient) and then process the paths manually to check if a sequence exists using the following query:
q = "MATCH (n:Node) WHERE n.name = 'N' CALL apoc.path.expandConfig(n, {relationshipFilter:'NEXT>', maxLevel:4}) YIELD path RETURN path"
results = db.query(q,data_contents=True)
However, the query ran for so long that I eventually stopped it. Any ideas?
This one is a bit tough.
First, pre-match the nodes in the path. We can then use the collected nodes as a whitelist for nodes in the path.
Assuming the start node is included in the list, a query might go like:
UNWIND $names as name
MATCH (n:Node {name:name})
WITH collect(n) as nodes
WITH nodes, nodes[0] as start, tail(nodes) as tail, size(nodes)-1 as depth
CALL apoc.path.expandConfig(start, {whitelistNodes:nodes, minLevel:depth, maxLevel:depth, relationshipFilter:'NEXT>'}) YIELD path
WHERE all(index in range(0, size(nodes)-1) WHERE nodes[index] = nodes(path)[index])
// we now have only paths with the given nodes in order
WITH path, relationships(path)[0].s as sequence
WHERE all(rel in tail(relationships(path)) WHERE rel.s = sequence)
// now each path only has relationships of common sequence
WITH path, apoc.coll.pairsMin([rel in relationships(path) | rel.p]) as pairs
WHERE all(pair in pairs WHERE pair[0] + 1 = pair[1])
RETURN path
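The three filters in the query above (nodes in the given order, one common s, p increasing by one per hop) can also be sketched outside the database. The following Python snippet is only an illustration under assumed data structures: relationships are plain (start, end, s, p) tuples, and the names sequence_exists and NEXT are hypothetical. It allows several NEXT relationships between the same two nodes, as the question describes.

```python
def sequence_exists(names, relationships):
    """relationships: iterable of (start, end, s, p) tuples for NEXT edges."""
    by_pair = {}
    for start, end, s, p in relationships:
        by_pair.setdefault((start, end), set()).add((s, p))
    steps = list(zip(names, names[1:]))  # consecutive node pairs
    if not steps:
        return False  # fewer than two names: nothing to check
    # Try every (s, p) found on the first step; each later step must reuse
    # the same s and increase p by exactly one per hop.
    return any(
        all((s, p + offset) in by_pair.get(step, ())
            for offset, step in enumerate(steps))
        for s, p in by_pair.get(steps[0], ())
    )

# Sample data modelled on the question, plus one relationship of another sequence.
NEXT = [("N1", "N2", 1, 2), ("N2", "N3", 1, 3), ("N3", "N4", 1, 4),
        ("N2", "N3", 2, 7)]

print(sequence_exists(["N2", "N3", "N4"], NEXT))  # True
```

Indexing the relationships by node pair means each hop is a set lookup instead of a path expansion.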

Cypher delete a node and its child node based on condition

I have recently started using Neo4j (version 3.4.1) and am still learning the nuances. I have the following node relationships in my application.
What I am trying to achieve is the following.
I can delete nodes C1 or C2. I am able to delete their corresponding relationships as well (i.e HAS_X or HAS_Y).
However, when I delete both C1 and C2, node L1 and its other related nodes (M1, M2 and M3) become orphans. Hence, what I want is this: whenever I delete C1 or C2, if it is the only node that has a HAS_Y relationship with node L1, then node L1 and its related nodes (M1, M2 and M3) should also be deleted. If it is not the only node that has a HAS_Y relationship with L1, we just delete that specific node (i.e. C1/C2), and node L1 and the rest of the nodes are left untouched.
Nodes U1 and U2 remain unaffected in both the scenarios.
I am not sure how I can achieve this using a single cypher query.
Note : I was able to achieve my goal by running 2 separate queries (1 for deleting node C1/C2 and another one for deleting orphan node L1). However, it isn't the most performant as I have to make 2 roundtrips to db.
Is anyone able to give me some input on how I can achieve this task? I am looking for a Cypher query solution (I am avoiding APOC procedures at the moment, as I hear they require some modification to the Neo4j db config).
Regards,
V
You should be able to do this with just Cypher:
...// above is your match to 'c', the node to delete
OPTIONAL MATCH (c)-[:HAS_Y]->(l)
DETACH DELETE c
WITH DISTINCT l
WHERE size(()-[:HAS_Y]->(l)) = 0
OPTIONAL MATCH (l)-[:HAS_Z*0..1]->(toDelete)
DETACH DELETE toDelete
We first match to l, then we delete c. At this point, we only have to take action for any l nodes that no longer have any incoming :HAS_Y relationships. We filter just for these, and then use an optional match with a 0..1 variable relationship to capture both the l nodes and any children they have down :HAS_Z relationships, then delete all of those nodes (both l and all of its possible children will be addressed via toDelete).
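The decision logic described above can be sketched on a small in-memory model. This Python snippet is an illustration only, with assumed names (delete_with_orphans and the sets of edges) and with only the HAS_Y and HAS_Z relationships modelled; the node names come from the question.

```python
def delete_with_orphans(target, nodes, has_y, has_z):
    """Delete target; also delete any HAS_Y parent it leaves without
    incoming HAS_Y relationships, together with that parent's HAS_Z children.

    nodes is a set of names; has_y and has_z are sets of (source, destination).
    """
    parents = {dst for src, dst in has_y if src == target}
    doomed = {target}
    remaining_y = {(src, dst) for src, dst in has_y if src != target}
    for parent in parents:
        # Same test as: WHERE size(()-[:HAS_Y]->(l)) = 0
        if not any(dst == parent for _, dst in remaining_y):
            doomed.add(parent)
            doomed |= {dst for src, dst in has_z if src == parent}

    def keep(edges):
        # DETACH DELETE drops every relationship touching a deleted node.
        return {(s, d) for s, d in edges if s not in doomed and d not in doomed}

    return nodes - doomed, keep(has_y), keep(has_z)

NODES = {"U1", "U2", "C1", "C2", "L1", "M1", "M2", "M3"}
HAS_Y = {("C1", "L1"), ("C2", "L1")}
HAS_Z = {("L1", "M1"), ("L1", "M2"), ("L1", "M3")}

# Deleting C1 keeps L1 (C2 still points to it); deleting C2 afterwards removes L1 and M1..M3.
step1 = delete_with_orphans("C1", NODES, HAS_Y, HAS_Z)
step2 = delete_with_orphans("C2", *step1)
print(sorted(step2[0]))  # ['U1', 'U2']
```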

How to delete all child nodes and relationships using a single query in Neo4j?

I have a tree like node structure in my Neo4j DB. When I delete particular node I want to delete all child nodes and relationships related to that node.
Consider node structure generated by below query,
merge (p1:Person{nic:'22222v'})-[r1:R1]->(p2:Person{nic:'33333v'})
merge(p1)-[r2:R2]->(p3:Person{nic:'44444v'})
merge(p2)-[r3:R3]->(p3)
merge (p3)-[r4:R4]->(p4:Person{nic:'55555v'})
merge(p4)-[r5:R5]->(p5:Person{nic:'66666v'})
return r1,r2,r3,r4,r5
If I input node(nic:44444v), it should delete node(nic:44444v), node(nic:55555v), node(nic:66666v), relationship(r2), relationship(r3), relationship(r4) and relationship(r5).
You can use a variable-length relationship pattern and delete the matched nodes:
MATCH (n:Person {nic:'44444v'})-[*0..]->(x)
DETACH DELETE x
The 0.. lower bound means that n itself is included among the x nodes, which also handles the case where the person has no child nodes.
Alternative syntax for older Neo4j versions:
MATCH (n:Person {nic:'44444v'})-[*0..]->(x)
OPTIONAL MATCH (x)-[r]-()
DELETE r, x
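To make the effect of the [*0..] pattern concrete, here is a small Python sketch on the example data from the question. It is an in-memory illustration, not Neo4j code; the function name delete_subtree and the (name, source, destination) edge tuples are assumptions. The start node is included in the set to delete, which mirrors the zero lower bound.

```python
def delete_subtree(start, nodes, edges):
    """Remove start and everything reachable from it via outgoing edges.

    edges is a set of (name, source, destination) tuples.
    """
    doomed = {start}  # zero hops: the start node itself is included
    stack = [start]
    while stack:
        node = stack.pop()
        for _, src, dst in edges:
            if src == node and dst not in doomed:
                doomed.add(dst)
                stack.append(dst)
    # Like DETACH DELETE: drop every relationship touching a deleted node.
    kept_edges = {e for e in edges if e[1] not in doomed and e[2] not in doomed}
    return nodes - doomed, kept_edges

PEOPLE = {"22222v", "33333v", "44444v", "55555v", "66666v"}
RELS = {("R1", "22222v", "33333v"), ("R2", "22222v", "44444v"),
        ("R3", "33333v", "44444v"), ("R4", "44444v", "55555v"),
        ("R5", "55555v", "66666v")}

remaining_nodes, remaining_rels = delete_subtree("44444v", PEOPLE, RELS)
print(sorted(remaining_nodes), remaining_rels)
# ['22222v', '33333v'] {('R1', '22222v', '33333v')}
```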

Cypher Linked List: how to unshift and replace by index

I am trying to create a Linked List structure with Neo/Cypher as per the recommendation here: CYPHER store order of node relationships of the same label when I create
(p:Parent)-[r1:PARENTID_RELTYPE]->(c1:Child)-[r2:PARENTID_RELTYPE]->(c2:Child)-[r3:PARENTID_RELTYPE]->(c3:Child)
But I am having trouble understanding the syntax for the required sequence of events in order to unshift new nodes onto the structure or replace a particular index with a different node.
Where I am confused is the part where I need to add the new node and repair the structure so that the old node stays inside the linked list structure. There should be only one relationship of each type (PARENTID_RELTYPE) that stems from a Parent node, but there can be multiple relationships of different types from each parent. Child nodes can be featured multiple times within a LinkedList, and a child node could be featured in a LinkedList of multiple Parents or in a LinkedList of the same parent but with a different relationship type.
So one of three things could happen when I try to unshift:
There is no existing child that links to the parent by the PARENTID_RELTYPE
There already exists a child node that links to the parent by PARENTID_RELTYPE
There already exists a child node that links to the parent by PARENTID_RELTYPE and that child node is simply a duplicate of the child node I am attempting to unshift onto the linked list structure (in which case the intended result is to have the same child node in the zero and first indices of the linked list).
The answer URL mentioned above helps me greatly in understanding how to read a linked list structure in Neo/Cypher, but because of the different way in which conditionals are handled in Cypher, I am having trouble understanding how to write to the structure (and also delete from the structure).
Below is my initial attempt to do so but I am somewhat baffled by what I need the syntax to do.
MATCH (a:%s {id: {_parentnodeid}}), (b:%s {id: {_id}})
MERGE a-[relold:%s]->b
ON CREATE
SET relold.metadata = {_metaData}
ON MATCH
...
I am very grateful for any help you can provide.
[UPDATED]
In the following queries, for simplicity I pretend that:
We find the Parent node of interest by name.
The relationship type of current interest is Foo.
General Notes:
The OPTIONAL MATCH clauses find the sibling, if any, that should follow the child being inserted.
The FOREACH clauses take care of linking that sibling, if any, to the child being inserted, and then delete the obsolete relationship to that sibling.
To unshift the Child having an id of 123 right after the Parent node:
MATCH (p:Parent {name:"Fred"})
OPTIONAL MATCH (p)-[r:Foo]->(c:Child)
WITH p, r, COLLECT(c) AS cs
MERGE (cNew:Child {id:123})
CREATE (p)-[rNew:Foo]->(cNew)
FOREACH (x IN cs |
CREATE (cNew)-[:Foo]->(x)
DELETE r)
RETURN p, rNew, cNew;
To insert the Child node having an id of 123 at index 4 (i.e, make it the 5th child):
MATCH (p:Parent {name:"Fred"})
MATCH (p)-[:Foo*3]->()-[r:Foo]->(c:Child)
OPTIONAL MATCH (c)-[r1:Foo]->(c1:Child)
WITH c, r1, COLLECT(c1) AS c1s
MERGE (cNew:Child {id:123})
CREATE (c)-[rNew:Foo]->(cNew)
FOREACH (x IN c1s |
CREATE (cNew)-[:Foo]->(x)
DELETE r1)
RETURN c, rNew, cNew;
To replace the Child at index 4 (i.e, the 5th child) with the Child having an id of 123:
MATCH (p:Parent { name:"Fred" })
MATCH (p)-[:Foo*4]->(c0)-[r:Foo]->(c:Child)
OPTIONAL MATCH (c)-[r1:Foo]->(c1:Child)
WITH c0, r, c, r1, COLLECT(c1) AS c1s
MERGE (cNew:Child { id:123 })
CREATE (c0)-[rNew:Foo]->(cNew)
DELETE r, c
FOREACH (x IN c1s |
CREATE (cNew)-[:Foo]->(x)
DELETE r1)
RETURN c0, rNew, cNew;
Note: The DELETE r, c clause always deletes the node being replaced (c). That is only suitable if that is what you actually want to happen, and will only succeed if c does not have relationships other than r. To explore how to address more specific needs, please ask a new question.
If I'm following, your nodes may belong to multiple linked lists. A simple 'next' relation is insufficient because when lists cross--share a child node--the 'next' relations will drag in all the downstream nodes of both lists. So you're making the 'next' relations unique to each list by adding the id of the parent node. (Note using the metadata id may lead to issues down the road.)
So you might have a parent p1 whose id=1, and a unique relationship 'n_p1' to link its children, and a child 'c' whose id=21 you want to add.
For the parent with no child you could add your new child by:
MATCH (c {id:21}), (p {id:1}) WHERE NOT (p)-[:n_p1]->() MERGE (p)-[:n_p1]->(c)
And if the parent has one or more children, find the last one that's not the same as the one being added:
MATCH (c {id:21}), (p {id:1})-[:n_p1*1..5]->(cn) WHERE NOT (cn)-[:n_p1]->() AND NOT cn.id=c.id MERGE (cn)-[:n_p1]->(c)
Somebody else might have a better way, but you could UNION these together. Remember the parts of a UNION must return the same columns, so just return the new child c. The whole thing might look like this:
MATCH (c {id:21}), (p {id:1}) WHERE NOT (p)-[:n_p1]->() MERGE (p)-[:n_p1]->(c) RETURN c
UNION
MATCH (c {id:21}), (p {id:1})-[:n_p1*1..5]->(cn) WHERE NOT (cn)-[:n_p1]->() AND NOT cn.id=c.id MERGE (cn)-[:n_p1]->(c) RETURN c;
