I have created 10 nodes in Neo4j.
How do I quickly and easily create relationships between all of them? (from all, to all, excluding itself and without duplicated relationships?)
For example, if I were to have 3 nodes called A, B, and C:
A - B
A - C
B - C
This should work:
MATCH (n), (m)
WHERE ID(n) < ID(m)
CREATE (n)-[:FOO]->(m)
The WHERE test ensures that n and m are different, and also that the same pair is not processed a second time (in reverse order).
Related
I have developed a query which, by trial and error, appears to find all of the duplicated relationships in a Neo4j DB. I want delete all but one of these relationships but I'm concerned that I have not thought of problematic cases that could result in data deletion.
So, does this query delete all but one of a duplicated relationship?
MATCH (a)-->(b)<--(a) # identify where the duplication is present
WITH DISTINCT a, b
MATCH (a)-[r]->(b) # get all duplicated paths themselves
WITH a, b, collect(r)[1..] as rs # remove the first instance from the list
UNWIND rs as r
DELETE r
If I replace the UNWIND rs as r; DELETE r with WITH a, b, count(rs) as cnt RETURN cnt it seems to return the unnecessary relationships.
I'm still relucant to put this somewhere to be used by others, though....
Thanks
First of all, let me (strictly) define the term: "duplicate relationships". Two relationships are duplicates if they:
Connect the same pair of nodes (call them a and b)
Have the same relationship type
Have exactly the same set of properties (both names and values)
Have the same directionality between a and b (iff directionality is significant for use case)
Your query only considers #1 and #4, so it generally could delete non-duplicate relationships as well.
Here is a query that will take all of the above into consideration (assuming #4 should be included):
MATCH (a)-[r1]->(b)<-[r2]-(a)
WHERE TYPE(r1) = TYPE(r2) AND PROPERTIES(r1) = PROPERTIES(r2)
WITH a, b, apoc.coll.union(COLLECT(r1), COLLECT(r2))[1..] AS rs
UNWIND rs as r
DELETE r
Aggregating functions (like COLLECT) use non-aggregated terms as grouping keys, so there is no need for the query to perform a separate redundant DISTINCT a,b test.
The APOC function apoc.coll.union returns the distinct union of its 2 input lists.
I have been pushing several times the same relationship between 2 nodes in Neo4j.
It was a mistake as it makes the visualization less clear.
Now, I would like to replace those several relations between 2 nodes by one single relation. It would be great if we could keep the number of relations inside a property "count" on the new unique relation.
What would be an efficient way to solve this problem ?
I have about 100 000 of relations and I am a bit worried about the time it would take.
Here is a quick example to make the problem clearer :
I have :
Node A -- R1 -- Node B
Node A -- R2 -- Node B
And I would like to have
Node A -- R {count : 2} -- Node B
Thanks!
I assume these relationships don't have any properties and Direction of the relationships doesn't matter.
You can combine these relationships with Cypher Query as shown:
MATCH (p:Node)-[r]-(c:Node)
WHERE ID(p) > ID(c)
DELETE r
WITH p, c, COUNT(r) as count
CREATE (p)-[:R{count:count}]->(c)
If you want to merge relationships having the same directions only then you can use the following query:
MATCH (p:Node)-[r]->(c:Node)
DELETE r
WITH p, c, COUNT(r) as count
CREATE (p)-[newrel:R{count:count}]->(c)
If you want to merge the properties as well then you can take help of
apoc plugin's apoc.refactor.mergeRelationships method.
Having the following graphs:
node g1 with child nodes (a, b)
node g2 with child nodes (b, c)
using the query
MATCH (n)-[]-(m) WHERE ID(m) = id RETURN n
being id the id of the node g1, I get a and b, and vice-versa when using the id of g2. What I would like to understand is how can I get the intersection of those two results, in this case having the first return (a, b) and the second return (b, c) getting as final result (b).
I tried using the WITH cause but I wasn't able to achieve the desired result. Keep in mind that I'm new to Neo4j and only came here after a few failed attempts, research on Neo4j Documentation, general google search and
Stackoverflow.
Edit1 (one of my tries):
MATCH (n)-[]->(m)
WHERE ID(m) = 750
WITH n
MATCH (o)-[]->(b)
WHERE ID(b) = 684 and o = n
RETURN o
Edit2:
The node (b), that I represented as being the same on both graphs are in fact two different nodes on the db, each one relating to a different graph (g1 and g2). Representatively they are the same as they have the exactly same info (labels and attributes), but on the database thy are not. I'm sorry since it was my fault for not being more explicit on this matter :(
Edit3:
Why I don't using a single node (b) for both graphs
Using the graphs above as example, imagine that I have yet another layer so: on g1 the child node (b) as a child (e), while on g2 the child node (b) as a child (f). If I had (b) as a single node, when I create (e) and (f) I only could add it to (b) loosing the hierarchy, becoming impossible to distinguish which of them, (e) or (f), belonged to g1 ou g2.
This should work (assuming you pass id1 and id2 as parameters):
MATCH (a)--(n)--(c)
WHERE ID(a) = {id1} AND ID(c) = {id2}
RETURN n;
[UPDATED, based on new info from comments]
If you have multiple "clones" of the "same" node and you want to quickly determine which clones are related without having to perform a lot of (slow) property comparisons, you can add a relationship (say, typed ":CLONE") between clones. That way, a query like this would work:
MATCH (a)--(m)-[:CLONE]-(n)--(c)
WHERE ID(a) = {id1} AND ID(c) = {id2}
RETURN m, n;
You can find the duplicity of the node, by using this query -
[1]
Duplicity with single node -
MATCH pathx =(n)-[:Relationship]-(find) WHERE find.name = "action" RETURN pathx;
[2]
or for two nodes giving only immediate parent node
MATCH pathx =(n)-[:Relationship]-(find), pathy= (p)-[:Relationship]
-(seek) WHERE find.name = "action" AND seek.name="requestID" RETURN pathx,
pathy;
[3]
or to find the entire network i.e. all the nodes connected -
MATCH pathx =(n)--()-[:Relationship]-(find), pathy= (p)--()-[:Relationship]-
(seek) WHERE find.name = "action"
AND seek.name="requestID" RETURN pathx, pathy;
I have two separate nodes in the graph, and at one point I will get a signal that those two nodes should actually be one. So I have to merge the two including their properties (there's no overlap) AND I should maintain the relationships as well.
Example graph: (d)->(a {id:1}), (b)->(c {name:"Sam"})
Desired result: (d)->(a {id:1, name:"Sam"}), (b)->(a {id:1, name:"Sam"})
The label a doesn't have to be the case really in the result - the point is we will have only one node representing the original two.
The following merges the properties fine.
MATCH (a:Entity {id:"1"}), (c:Entity {name:"Sam"})
SET a += c
But I can't seem to find a way to move/copy the relationships.
The specs:
The two nodes won't have the same properties
node a may have incoming and outgoing relationships
node c may have incoming relationships, non outgoing
Any thoughts?
Update:
The following works, assuming the type and properties coming to node c are known in advance. Can I improve this to make it more dynamic?
MATCH (a:Entity {id1:"1"}), (c:Entity {id2:"2"})
SET a += c
WITH a, c
MATCH (c)<-[r]-(b)
WITH a, c, r, b
MERGE (b)-[:REL_NAME {prop1:r.prop1, prop2:r.prop2}]->(a)
WITH c
DETACH DELETE c
Update 2:
The above throws unable to load relationship with id error sometimes and I'm not sure why, but it seems it's related to reading/writing at the same time.
Update 3:
This is a work around for the error:
MATCH (a:Entity {id1:"1"}), (c:Entity {id2:"2"})
SET a += c
WITH a, c
MATCH (c)<-[r]-(b)
WITH a, c, r, b
MERGE (b)-[r2:REL_NAME]->(a)
SET r2 += r
WITH r, c
DELETE r, c
The problem when moving relationships between nodes is that you can't set the relationship type dynamically.
So you can't MATCH all relationships from node c and recreate them on node a. This does not work:
MATCH (a:Entity {id1:"1"}), (c:Entity {id2:"2"})
WITH a, c
// get rels from node c
MATCH (c)-[r]-(b)
WITH a, c, r, b
// create the same rel from node a
MERGE (b)-["r.rel_type" {prop1:r.prop1, prop2:r.prop2}]->(a)
With neo4j 3 there are so called procedures which are plugins for the server and can be called from Cypher queries. The apoc procedure package provides procedures that do what you need: https://neo4j-contrib.github.io/neo4j-apoc-procedures/
First install the apoc plugin on your server, then use something like:
// create relationships with dynamic types
CALL apoc.create.relationship(person1,'KNOWS',{key:value,…}, person2)
// merge nodes
CALL apoc.refactor.mergeNodes([node1,node2])
I have a graph with about 800k nodes and I want to create random relationships among them, using Cypher.
Examples like the following didn't work because the cartesian product is too big:
match (u),(p)
with u,p
create (u)-[:LINKS]->(p);
For example I want 1 relationship for each node (800k), or 10 relationships for each node (8M).
In short, I need a query Cypher in order to UNIFORMLY create relationships between nodes.
Does someone know the query to create relationships in this way?
So you want every node to have exactly x relationships? Try this in batches until no more relationships are updated:
MATCH (u),(p) WHERE size((u)-[:LINKS]->(p)) < {x}
WITH u,p LIMIT 10000 WHERE rand() < 0.2 // LIMIT to 10000 then sample
CREATE (u)-[:LINKS]->(p)
This should work (assuming your neo4j server has enough memory):
MATCH (n)
WITH COLLECT(n) AS ns, COUNT(n) AS len
FOREACH (i IN RANGE(1, {numLinks}) |
FOREACH (x IN ns |
FOREACH(y IN [ns[TOINT(RAND()*len)]] |
CREATE (x)-[:LINK]->(y) )));
This query collects all nodes, and uses nested loops to do the following {numLinks} times: create a LINK relationship between every node and a randomly chosen node.
The innermost FOREACH is used as a workaround for the current Cypher limitation that you cannot put an operation that returns a node inside a node pattern. To be specific, this is illegal: CREATE (x)-[:LINK]->(ns[TOINT(RAND()*len)]).