Replacing relations from one node to another by one single relation - neo4j

I have pushed the same relationship between two nodes several times in Neo4j.
That was a mistake, as it makes the visualization less clear.
Now I would like to replace the multiple relationships between two nodes with a single relationship. It would be great to keep the number of original relationships in a "count" property on the new relationship.
What would be an efficient way to do this?
I have about 100,000 relationships and I am a bit worried about how long it would take.
Here is a quick example to make the problem clearer.
I have:
Node A -- R1 -- Node B
Node A -- R2 -- Node B
And I would like to have:
Node A -- R {count : 2} -- Node B
Thanks!

I assume these relationships don't have any properties and that their direction doesn't matter.
You can combine them with a Cypher query like this:
MATCH (p:Node)-[r]-(c:Node)
WHERE ID(p) > ID(c)
DELETE r
WITH p, c, COUNT(r) as count
CREATE (p)-[:R{count:count}]->(c)
If you want to merge only relationships that have the same direction, you can use the following query:
MATCH (p:Node)-[r]->(c:Node)
DELETE r
WITH p, c, COUNT(r) as count
CREATE (p)-[newrel:R{count:count}]->(c)
If you want to merge the properties as well, you can use the APOC plugin's apoc.refactor.mergeRelationships procedure.
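To make the grouping concrete, here is a small Python sketch of what the two queries above compute, run on an in-memory edge list instead of the database. The function name `collapse_parallel_edges` and the sample edges are made up for illustration; this only models the counting, it does not touch Neo4j.

```python
from collections import Counter

def collapse_parallel_edges(edges, directed=False):
    """Count parallel edges between each pair of nodes.

    edges: iterable of (start, end) tuples (hypothetical sample data).
    directed=False treats A->B and B->A as the same pair, like the
    undirected MATCH in the first query; directed=True keeps the two
    directions apart, like the second query.
    """
    counts = Counter()
    for start, end in edges:
        key = (start, end) if directed else tuple(sorted((start, end)))
        counts[key] += 1
    return dict(counts)

edges = [("A", "B"), ("A", "B"), ("B", "A"), ("B", "C")]
print(collapse_parallel_edges(edges))
# {('A', 'B'): 3, ('B', 'C'): 1}
print(collapse_parallel_edges(edges, directed=True))
# {('A', 'B'): 2, ('B', 'A'): 1, ('B', 'C'): 1}
```

Each key in the result corresponds to one new relationship, and its value is what would end up in the "count" property.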

Related

Cypher - given relationship, get the nodes

If I do the query
MATCH (:Label1 {prop1: "start node"}) -[relationships*1..10]-> ()
UNWIND relationships as relationship
RETURN DISTINCT relationship
How do I get the nodes for each returned relationship, so that the result has this format:
╒════════╤════════╤═══════╕
│"from" │"type" │"to" │
╞════════╪════════╪═══════╡
├────────┼────────┼───────┤
└────────┴────────┴───────┘
Is there a function such as type(r) but for getting nodes from relationship?
RomanMitasov and ray have working answers above.
I don't think they quite get at what you want to do though, because you're basically returning every relationship in the graph in a sort of inefficient way. I say that because without a start or end position, specifying a path length of 1-10 doesn't do anything.
For example:
CREATE (r1:Temp)-[:TEMP_REL]->(r2:Temp)-[:TEMP_REL]->(r3:Temp)
Now we have a graph with 3 Temp nodes with 2 relationships: from r1 to r2, from r2 to r3.
Run your query on these nodes:
MATCH (:Temp)-[rels*1..10]->(:Temp)
UNWIND rels as rel
RETURN startNode(rel), type(rel), endNode(rel)
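And you'll see you get four rows, which is not what you want because there are only two distinct relationships. The match finds three paths (r1 to r2, r2 to r3, and r1 to r3 through r2), and unwinding them yields 1 + 1 + 2 = 4 rows.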
You could modify that to return only distinct values, but you're still over-searching the graph.
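If it helps to see where the four rows come from, the following Python sketch enumerates the same paths on the three-node example. The helper `directed_paths` is hypothetical and assumes there are no parallel edges between the same two nodes; it only imitates how a variable-length pattern expands.

```python
def directed_paths(edges, max_len=10):
    """Return every directed path (a list of edges) with 1..max_len edges.

    An edge is never reused within one path, mirroring the way Cypher
    does not repeat a relationship inside a single matched path.
    Assumes no two edges share the same (start, end) pair.
    """
    paths = []

    def extend(path):
        paths.append(path)
        if len(path) == max_len:
            return
        last_end = path[-1][1]
        for edge in edges:
            if edge[0] == last_end and edge not in path:
                extend(path + [edge])

    for edge in edges:
        extend([edge])
    return paths

edges = [("r1", "r2"), ("r2", "r3")]
paths = directed_paths(edges)
rows = [edge for path in paths for edge in path]  # like UNWIND rels AS rel
print(len(paths), len(rows), len(set(rows)))
# 3 4 2
```

Three paths unwind to four rows, but only two of them are distinct relationships.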
To get an idea of what relationships are in the graph and what they connect, I use a query like:
MATCH (n)-[r]->(m)
RETURN labels(n), type(r), labels(m), count(r)
The downside of that, of course, is that it can take a while to run if you have a very large graph.
If you just want to see the structure of your graph:
CALL db.schema.visualization()
Best wishes and happy graphing! :)
Yes, such functions do exist!
startNode(r) to get the start node from relationship r
endNode(r) to get the end node
Here's the final query:
MATCH () -[relationships*1..10]-> ()
UNWIND relationships as r
RETURN startNode(r) as from, type(r) as type, endNode(r) as to

What is missing in this Cypher query?

I'm learning Cypher and I created a 'Crime investigation' project on Neo4j.
I'm trying to return the parents who have exactly two children in total, where every member of the family has committed a crime.
So, in order to get this in the graph, I executed this query:
match(:Crime)<-[:PARTY_TO]-(p:Person)-[:FAMILY_REL]->(s:Person)-[:PARTY_TO]->(:Crime)
where size((p)-[:FAMILY_REL]->())=2
return p, s
The FAMILY_REL relationship points from a Person (p) to their children, and the PARTY_TO relationship points to the Crime nodes a Person has committed.
The query above is not working as it should. It returns parents with more than two children and also children who have just one child.
What is wrong with the logic of the query?
SIZE((p)-[:FAMILY_REL]->()) counts all children of p, including ones who had committed no crimes.
This query should work better, as it only counts children who are criminals:
MATCH (:Crime)<-[:PARTY_TO]-(p:Person)-[:FAMILY_REL]->(s:Person)-[:PARTY_TO]->(:Crime)
WITH p, COLLECT(DISTINCT s) AS badKids
WHERE SIZE(badKids) = 2
RETURN p, badKids
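As a rough illustration of the counting rule in this answer, here is a Python sketch on plain dictionaries. The names and the function `parents_with_n_criminal_kids` are invented sample data, not part of the asker's graph; the sketch counts each criminal child once, however many crimes they are party to.

```python
def parents_with_n_criminal_kids(family, crimes, n=2):
    """Return parents who are party to a crime and have exactly n
    children who are also party to a crime.

    family: dict mapping parent -> list of children (hypothetical data).
    crimes: set of people who are party to at least one crime.
    """
    result = {}
    for parent, kids in family.items():
        if parent not in crimes:
            continue
        # A set counts each child once, regardless of their number of crimes.
        bad_kids = sorted({kid for kid in kids if kid in crimes})
        if len(bad_kids) == n:
            result[parent] = bad_kids
    return result

family = {
    "Ann": ["Bob", "Cid", "Dee"],  # three children, two of them criminals
    "Eve": ["Fay", "Gus"],         # two children, both criminals
    "Hal": ["Ian", "Jo"],          # Hal himself has committed no crime
}
crimes = {"Ann", "Bob", "Cid", "Eve", "Fay", "Gus", "Ian", "Jo"}
result = parents_with_n_criminal_kids(family, crimes)
print(result)
# {'Ann': ['Bob', 'Cid'], 'Eve': ['Fay', 'Gus']}
```

Note that Ann is returned even though she has three children in total, because only the criminal children are counted.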

Create relationships in Neo4j

I have a graph with about 800k nodes and I want to create random relationships among them, using Cypher.
Examples like the following didn't work because the cartesian product is too big:
match (u),(p)
with u,p
create (u)-[:LINKS]->(p);
For example I want 1 relationship for each node (800k), or 10 relationships for each node (8M).
In short, I need a Cypher query that creates relationships between nodes UNIFORMLY at random.
Does someone know the query to create relationships in this way?
So you want every node to have exactly x relationships? Try this in batches until no more relationships are updated:
MATCH (u),(p) WHERE size((u)-[:LINKS]->(p)) < {x}
WITH u,p LIMIT 10000 WHERE rand() < 0.2 // LIMIT to 10000 then sample
CREATE (u)-[:LINKS]->(p)
This should work (assuming your neo4j server has enough memory):
MATCH (n)
WITH COLLECT(n) AS ns, COUNT(n) AS len
FOREACH (i IN RANGE(1, {numLinks}) |
FOREACH (x IN ns |
FOREACH(y IN [ns[TOINT(RAND()*len)]] |
CREATE (x)-[:LINK]->(y) )));
This query collects all nodes, and uses nested loops to do the following {numLinks} times: create a LINK relationship between every node and a randomly chosen node.
The innermost FOREACH is used as a workaround for the current Cypher limitation that you cannot put an operation that returns a node inside a node pattern. To be specific, this is illegal: CREATE (x)-[:LINK]->(ns[TOINT(RAND()*len)]).
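The nested loops can be easier to follow outside Cypher. Below is a minimal Python sketch of the same idea on a plain list; `random_links` and its `seed` parameter are assumptions added for illustration and reproducibility, not part of the query.

```python
import random

def random_links(nodes, num_links, seed=None):
    """For each of num_links rounds, link every node to one node chosen
    uniformly at random from the same list.

    As with the query, a node may be linked to itself and the same
    pair may be produced more than once.
    """
    rng = random.Random(seed)
    links = []
    for _ in range(num_links):
        for x in nodes:
            y = nodes[rng.randrange(len(nodes))]
            links.append((x, y))
    return links

nodes = list(range(100))
links = random_links(nodes, 3, seed=1)
print(len(links))
# 300
```

Every node ends up with exactly `num_links` outgoing links, while the number of incoming links per node varies randomly.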

Neo4j - Get all related nodes of type and create new relationship

I have a dataset that looks like this: (Artefact)-[HAS]-(Keyword). Keywords can be shared by multiple artefacts. What I am trying to achieve is:
Returning most interconnected keyword nodes, count of artefacts related to keywords, count of the overlap between keyword nodes and the hop to another keyword (keyword)-(artefact)-(keywords), the "shared" artefact count between two keywords.
In other words a count of the artefact records within an intersect between two keyword nodes. For example given these three artefact nodes
1) spoon (keywords; metal, food)
2) sword (keywords; metal, fighting)
3) fork (keywords; metal, food)
The query would therefore return the keyword node, count of artefacts related to keyword (3, spoon, sword and fork), count of the keywords related by artefact between keyword nodes (metal has 2 indirect connections to food and 1 to fighting).
Once I've worked that out, for the sake of speed because I realise this is a big query, create a related_to relationship between keywords with the count of the number of artefacts they share in common. Only select 1 record to create this relationship, to test it works :) (hence limit 1)
MATCH (n:Keyword)-[r*2]-(x:Keyword)
WITH n, COUNT(r) AS c, x
LIMIT 1
MERGE (n)-[s:RELATED_KEY]-(x) SET s.weight = c
I'm using neo4j community edition (2.1.6),
Many thanks, Andy
This query will give you the first part of your answer:
MATCH (k:Keyword)
WITH k
LIMIT 1
MATCH (k)<-[:HAS]-(a)
WITH k, collect(a) as artefacts
WITH k, artefacts, size(artefacts) as c
UNWIND artefacts as artefact
MATCH (k)<-[:HAS]-(artefact)-[:HAS]->(k2)
RETURN c, artefacts, collect(distinct(k2.name)) as keywords, count(distinct(k2.name)) as keyWordsCount
However, you could probably create the relationships between the related nodes directly:
MATCH (k:Keyword)
WITH k
LIMIT 1
MATCH (k)<-[:HAS]-(a)-[:HAS]->(other)
MERGE (k)-[r:RELATED_TO]->(other)
ON CREATE SET r.weight = 1
ON MATCH SET r.weight = r.weight + 1
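To check the expected weights on the spoon, sword and fork example from the question, here is a small Python sketch that computes the shared-artefact count for every ordered keyword pair. The function name `shared_artefact_counts` is hypothetical; the sketch covers all keywords, which is what the MERGE query would produce without the LIMIT 1.

```python
from collections import Counter
from itertools import permutations

def shared_artefact_counts(artefact_keywords):
    """Map each ordered (keyword, other_keyword) pair to the number of
    artefacts that carry both keywords."""
    weights = Counter()
    for keywords in artefact_keywords.values():
        # Each artefact contributes 1 to every ordered pair of its keywords.
        for k, other in permutations(set(keywords), 2):
            weights[(k, other)] += 1
    return weights

artefacts = {
    "spoon": ["metal", "food"],
    "sword": ["metal", "fighting"],
    "fork": ["metal", "food"],
}
weights = shared_artefact_counts(artefacts)
print(weights[("metal", "food")], weights[("metal", "fighting")])
# 2 1
```

This matches the numbers in the question: metal shares two artefacts with food and one with fighting.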

In neo4j is there a way to get path between more than 2 random nodes whose direction of relation is not known

I have a scenario where I have more than 2 random nodes.
I need to get all possible paths connecting all three nodes. I do not know the direction of relation and the relationship type.
Example : I have in the graph database with three nodes person->Purchase->Product.
I need to get the path connecting these three nodes. But I do not know the order in which I need to query, for example if I give the query as person-Product-Purchase, it will return no rows as the order is incorrect.
So in this case how should I frame the query?
In a nutshell I need to find the path between more than two nodes where the match clause may be mentioned in what ever order the user knows.
You could list all of the nodes in multiple bound identifiers in the start, and then your match would find the ones that match, in any order. And you could do this for N items, if needed. For example, here is a query for 3 items:
start a=node:node_auto_index('name:(person product purchase)'),
b=node:node_auto_index('name:(person product purchase)'),
c=node:node_auto_index('name:(person product purchase)')
match p=a-->b-->c
return p;
http://console.neo4j.org/r/tbwu2d
I actually just made a blog post about how start works, which might help:
http://wes.skeweredrook.com/cypher-it-all-starts-with-the-start/
Wouldn't it be acceptable to run several queries? In your case you would automatically generate 6 queries covering all possible orderings (the factorial of the number of variables).
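A rough sketch of generating those queries in Python follows. It assumes the nodes can be identified by a `name` property and builds the query text by string formatting purely for illustration; the function `ordered_path_queries` is made up, and real code should pass values as query parameters instead.

```python
from itertools import permutations

def ordered_path_queries(names):
    """Build one directed path query per ordering of the given names.

    Assumes a `name` property identifies each node (an assumption,
    not something the question states).
    """
    queries = []
    for order in permutations(names):
        nodes = [f"(n{i} {{name: '{name}'}})" for i, name in enumerate(order)]
        queries.append("MATCH p = " + "-->".join(nodes) + " RETURN p")
    return queries

queries = ordered_path_queries(["person", "purchase", "product"])
print(len(queries))
# 6
print(queries[0])
# MATCH p = (n0 {name: 'person'})-->(n1 {name: 'purchase'})-->(n2 {name: 'product'}) RETURN p
```

With three names this yields 3! = 6 queries, and only the orderings that match the actual direction of the relationships return rows.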
A possible solution would be to first get three sets of nodes (s,m,e). These sets may be the same as in the question (or contain partially or completely different nodes). The sets are important, because starting, middle and end node are not fixed.
Here is the code for the Matrix example with added nodes.
match (s) where s.name in ["Oracle", "Neo", "Cypher"]
match (m) where m.name in ["Oracle", "Neo", "Cypher"] and s <> m
match (e) where e.name in ["Oracle", "Neo", "Cypher"] and s <> e and m <> e
match rel=(s)-[r1*1..]-(m)-[r2*1..]-(e)
return s, r1, m, r2, e, rel;
The additional where clause makes sure the same node is not used twice in one result row.
The relationships between s and m, and between m and e, are each matched with one or more hops (*1..), ignoring direction.
Note that cypher 3 syntax is used here.

Resources