Create an association between nodes if one doesn't exist, using Cypher - Neo4j

Say there are 2 labels, P and M. M has nodes named M1, M2, M3 ... M10. I need to associate 50 nodes of P with each node of M. Also, no node of label P should be associated with more than one node of M.
This is the Cypher query I came up with, but it doesn't seem to work.
MATCH (u:P), (r:M{Name:'M1'}),(s:M)
where not (s)-[:OWNS]->(u)
with u limit 50
CREATE (r)-[:OWNS]->(u);
This way I would run it once for each of the 10 nodes of M. Any help in correcting the query is appreciated.

You can use the apoc.periodic.* procedures for batching. More info is in the documentation.
call apoc.periodic.commit("
MATCH (u:P), (r:M{Name:'M1'}),(s:M) where not (s)-[:OWNS]->(u)
with u,r limit {limit}
CREATE (r)-[:OWNS]->(u)
RETURN count(*)
",{limit:10000})
If there will always be just one (r)-[:OWNS]->(u) relationship, I would change my first match to also include not (r)-[:OWNS]->(u):
call apoc.periodic.commit("
MATCH (u:P), (r:M{Name:'M1'}),(s:M) where not (s)-[:OWNS]->(u) and not (r)-[:OWNS]->(u)
with u,r limit {limit}
CREATE (r)-[:OWNS]->(u)
RETURN count(*)
",{limit:10000})
That way the procedure cannot fall into a loop.
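On Neo4j 4.0 and later the {limit} placeholder is no longer accepted; parameters are written as $limit. A minimal sketch of the same call in that syntax follows. It also replaces the extra (s:M) match with an anonymous (:M) pattern, which is an assumption about the intent: skip any P node that is already owned by some M node. Note that apoc.periodic.commit reruns the statement until it returns 0, so as written this keeps assigning unowned P nodes to M1 until none are left.

```cypher
// Sketch only: $limit parameter syntax (Neo4j 4.0+), APOC assumed to be installed.
CALL apoc.periodic.commit("
MATCH (u:P), (r:M {Name: 'M1'})
WHERE NOT (:M)-[:OWNS]->(u)      // skip P nodes that any M node already owns
WITH u, r LIMIT $limit
CREATE (r)-[:OWNS]->(u)
RETURN count(*)
", {limit: 10000})
```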

This query should be fast and easy to understand. It is fast because it avoids Cartesian products:
MATCH (u:P)
WHERE not (:M)-[:OWNS]->(u)
WITH u LIMIT 50
MATCH (r:M {Name:'M1'})
CREATE (r)-[:OWNS]->(u);
It first matches 50 unowned P nodes. It then finds the M node that is supposed to be the "owner", and creates an OWNS relationship between it and each of the 50 P nodes.
To make this query even faster, you can first create an index on :M(Name) so that the owning M node can be found quickly (without scanning all M nodes):
CREATE INDEX ON :M(Name);
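The CREATE INDEX ON form above is the Neo4j 3.x syntax. On Neo4j 4.0 and later the equivalent statement is written with FOR ... ON; a minimal sketch, where the index name m_name is a hypothetical choice:

```cypher
// Neo4j 4.0+ syntax; m_name is an illustrative index name, not from the original answer.
CREATE INDEX m_name FOR (m:M) ON (m.Name);
```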

This worked for me.
MATCH (u:P), (r:M{Name:'M1'}),(s:M)
where not (s)-[:OWNS]->(u)
with u,r limit 50
CREATE (r)-[:OWNS]->(u);
Thanks to Thomas for mentioning the limit on u and r.

I think this is one way to connect all 10 :M nodes in one query:
MATCH (m:M)
WITH collect(m) as nodes
UNWIND nodes as node
MATCH (p:P) where not ()-[:OWNS]->(p)
WITH node,p limit 50
CREATE (node)-[:OWNS]->(p)
Although I am not really sure we need the collect and unwind; it could be simplified to:
MATCH (m:M)
MATCH (p:P) where not ()-[:OWNS]->(p)
WITH m,p limit 50
CREATE (m)-[:OWNS]->(p)
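One caveat with both queries: LIMIT 50 applies to all (M, P) rows together, so it caps the whole query at 50 rows rather than 50 per M node. A possible sketch that hands each M node its own slice of 50 unowned P nodes is shown below. It assumes the unowned P nodes fit in memory as a single list, and that there are enough of them (500 for 10 M nodes); M nodes at the end simply get fewer if the list runs out.

```cypher
// Sketch only: give the i-th M node the i-th block of 50 unowned P nodes.
MATCH (m:M)
WITH collect(m) AS ms
MATCH (p:P) WHERE NOT (:M)-[:OWNS]->(p)
WITH ms, collect(p) AS ps
UNWIND range(0, size(ms) - 1) AS i
WITH ms[i] AS m, ps, i * 50 AS lo
UNWIND ps[lo..lo + 50] AS p        // slices do not overlap, so no P gets two owners
CREATE (m)-[:OWNS]->(p);
```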

Related

Replacing relations from one node to another by one single relation

I have pushed the same relationship between 2 nodes into Neo4j several times.
It was a mistake, as it makes the visualization less clear.
Now, I would like to replace those several relationships between 2 nodes with one single relationship. It would be great if we could keep the number of original relationships in a property "count" on the new unique relationship.
What would be an efficient way to solve this problem?
I have about 100 000 relationships and I am a bit worried about the time it would take.
Here is a quick example to make the problem clearer:
I have:
Node A -- R1 -- Node B
Node A -- R2 -- Node B
And I would like to have
Node A -- R {count : 2} -- Node B
Thanks!
I assume these relationships don't have any properties and that the direction of the relationships doesn't matter.
You can combine these relationships with a Cypher query as shown:
MATCH (p:Node)-[r]-(c:Node)
WHERE ID(p) > ID(c)
DELETE r
WITH p, c, COUNT(r) as count
CREATE (p)-[:R{count:count}]->(c)
If you want to merge only relationships that have the same direction, you can use the following query:
MATCH (p:Node)-[r]->(c:Node)
DELETE r
WITH p, c, COUNT(r) as count
CREATE (p)-[newrel:R{count:count}]->(c)
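If referring to r after DELETE causes trouble on your Neo4j version, a variant of the directed query that counts first and deletes afterwards can be sketched like this. The label Node and type R are taken from the queries above; the variable names are illustrative.

```cypher
// Sketch only: aggregate before deleting, so the count never touches a deleted relationship.
MATCH (p:Node)-[r]->(c:Node)
WITH p, c, collect(r) AS rels, count(r) AS cnt
FOREACH (rel IN rels | DELETE rel)
CREATE (p)-[:R {count: cnt}]->(c);
```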
If you want to merge the properties as well, you can use the APOC plugin's apoc.refactor.mergeRelationships procedure.
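A minimal sketch of how that procedure might be called, assuming APOC is installed. Grouping by relationship type is an added assumption so that only relationships of the same type are merged; check the APOC documentation for the config options available in your version.

```cypher
// Sketch only: merge parallel relationships of the same type, combining their properties.
MATCH (a:Node)-[r]->(b:Node)
WITH a, b, type(r) AS relType, collect(r) AS rels
WHERE size(rels) > 1
CALL apoc.refactor.mergeRelationships(rels, {properties: 'combine'})
YIELD rel
RETURN count(rel) AS merged;
```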

Neo4j: get all relations between queried nodes

I want to write a Cypher query that does the following tasks:
there is a given start node, and I want to get all related nodes within 2 hops
sort the queried nodes by hops ascending, and limit them to a given number
and get all relationships between the result of 1.
I tried tons of queries, and I came up with the query below for steps 1 and 2:
MATCH path=((start {eid:12018})-[r:REAL_CALL*1..2]-(end))
WITH start, end, path
ORDER BY length(path) ASC
RETURN start, collect(distinct end)[..10]
But when I try to get the relationships between the queried nodes with the query below, it returns all relationships in the path:
MATCH path=((start {eid:12018})-[r:REAL_CALL*1..2]-(end))
WITH start, end, path
ORDER BY length(path) ASC
RETURN start, collect(distinct end)[..10], relationships(path)
I think I have to match again on the result of the first match instead of getting the relationships from the path directly, but all of my attempts have failed.
How can I get all relationships between the queried nodes?
Any help is appreciated, thanks a lot.
[EDITED]
Something like this may work for you:
MATCH (start {eid:12018})-[rels:REAL_CALL*..2]-(end)
RETURN start, end, COLLECT(rels) AS rels_collection
ORDER BY
REDUCE(s = 2, rs in rels_collection | CASE WHEN SIZE(rs) < s THEN SIZE(rs) ELSE s END)
LIMIT 10;
The COLLECT aggregation function will generate a collection (of relationship collections) for each distinct start/end pair. The LIMIT clause limits the returned results to the first 10 start/end pairs, based on the ORDER BY clause. The ORDER BY clause uses REDUCE to calculate the minimum size of each path to a given end node.
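If the goal is the relationships among the selected nodes themselves, rather than the paths that led to them, one possible approach is to match a second time over the collected nodes, as the question suggests. The sketch below is untested against the asker's data; the variable names other, others, nodes, a and b are illustrative.

```cypher
// Sketch only: pick the 10 nearest nodes, then fetch the REAL_CALL relationships among them.
MATCH path = (start {eid: 12018})-[:REAL_CALL*1..2]-(other)
WHERE other <> start
WITH start, other, min(length(path)) AS hops
ORDER BY hops ASC
LIMIT 10
WITH start, collect(other) AS others
WITH others + start AS nodes       // include the start node in the set
UNWIND nodes AS a
MATCH (a)-[r:REAL_CALL]->(b)
WHERE b IN nodes
RETURN a, r, b;
```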

Equivalent of row_number in neo4j

I'm trying to run a query where I assign integers based on the order they appeared in the query. I'd like it to work to the effect of:
MATCH users RETURN users ORDER BY created_at SET user.number=ROW_NUMBER()
Is there a way to do this in a single query? Thanks!
You can do it by playing a bit with a collection :
MATCH (n:User)
WITH n
ORDER BY n.created_at
WITH collect(n) as users
UNWIND range(0, size(users)-1) as pos
SET (users[pos]).number = pos
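The query above numbers rows from 0. A small variation, sketched here, binds each node to a variable before SET and starts at 1, which is closer to SQL's ROW_NUMBER(); the variable name u is illustrative.

```cypher
// Sketch only: 1-based numbering in created_at order.
MATCH (n:User)
WITH n ORDER BY n.created_at
WITH collect(n) AS users
UNWIND range(0, size(users) - 1) AS pos
WITH users[pos] AS u, pos
SET u.number = pos + 1;
```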

Create relationships in Neo4j

I have a graph with about 800k nodes and I want to create random relationships among them, using Cypher.
Examples like the following didn't work because the cartesian product is too big:
match (u),(p)
with u,p
create (u)-[:LINKS]->(p);
For example I want 1 relationship for each node (800k), or 10 relationships for each node (8M).
In short, I need a Cypher query that creates relationships UNIFORMLY between nodes.
Does anyone know a query that creates relationships in this way?
So you want every node to have exactly x relationships? Try this in batches until no more relationships are updated:
MATCH (u),(p) WHERE size((u)-[:LINKS]->(p)) < {x}
WITH u,p LIMIT 10000 WHERE rand() < 0.2 // LIMIT to 10000 then sample
CREATE (u)-[:LINKS]->(p)
This should work (assuming your neo4j server has enough memory):
MATCH (n)
WITH COLLECT(n) AS ns, COUNT(n) AS len
FOREACH (i IN RANGE(1, {numLinks}) |
FOREACH (x IN ns |
FOREACH(y IN [ns[TOINT(RAND()*len)]] |
CREATE (x)-[:LINK]->(y) )));
This query collects all nodes, and uses nested loops to do the following {numLinks} times: create a LINK relationship between every node and a randomly chosen node.
The innermost FOREACH is used as a workaround for the current Cypher limitation that you cannot put an operation that returns a node inside a node pattern. To be specific, this is illegal: CREATE (x)-[:LINK]->(ns[TOINT(RAND()*len)]).

Is there any way to index relationship existence?

Recently I ran into a problem when creating a chain of nodes by running the following query in a loop:
MATCH (p: Node) WHERE NOT (p)-[:RELATIONSHIP]->()
WITH p LIMIT 1000
MATCH (q: Node{id: p.id}) WITH p, max(id(q)) as tail
MATCH (t: Node) where id(t) = tail
WITH p, t
CREATE (p)-[:RELATIONSHIP]->(t)
The problem appears after the chain has been created for the first ~1 000 000 nodes. The query
MATCH (p: Node) WHERE NOT (p)-[:RELATIONSHIP]->()
becomes very slow because it looks through the first 1 000 000 nodes and checks whether each one lacks a relationship, but they all have one. At some number of nodes the query ends with "Unknown error". To get around this I tried the following queries.
MATCH (p: Node) with p skip 1000000
Match (p) WHERE NOT (p)-[:RELATIONSHIP]->()
or
MATCH (p: Node) with p order by id(p) desc
MATCH (p) WHERE NOT (p)-[:RELATIONSHIP]->()
But I wonder if there is a more elegant way to solve this problem, like "indexing relationship existence"?
You can index relationship properties using "legacy indexing," which isn't exactly recommended anymore, but this won't index the absence of relationships so it wouldn't do you any good. I'd probably try to find a way to mark nodes in need of relationships through either a label or an index on a property. Start your match from there, it'll be much faster.
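As a rough sketch of the label-based approach: mark the nodes that still need a relationship once, then have each batch start from that label and clear it as it goes. The label :Unlinked is a hypothetical name, and on a very large graph the one-off marking step may itself need to be run in batches.

```cypher
// One-off step: mark every node that has no outgoing RELATIONSHIP yet.
// :Unlinked is an illustrative label, not something from the original question.
MATCH (p:Node) WHERE NOT (p)-[:RELATIONSHIP]->()
SET p:Unlinked;

// Batch step, run in a loop: start from the marked nodes only and unmark them once linked.
MATCH (p:Unlinked)
WITH p LIMIT 1000
MATCH (q:Node {id: p.id})
WITH p, max(id(q)) AS tail
MATCH (t:Node) WHERE id(t) = tail
CREATE (p)-[:RELATIONSHIP]->(t)
REMOVE p:Unlinked;
```

Because the batch query matches on the label, it never rescans nodes that are already linked.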

Resources