Neo4j: How to drop a duplicated edge?

I am learning Cypher / Neo4j, using C#.
I accidentally created this edge 3 times:
client.Cypher
.Match("(user1:Person)", "(user2:Person)")
.Where((Person user1) => user1.name == "Tony")
.AndWhere((Person user2) => user2.name == "Maria Esther")
//.Create("(user1)-[:PAI]->(user2)")
.Create("(user2)-[:FILHO {DataDeNascimento: '2006'}]->(user1)")
.ExecuteWithoutResults();
How do I drop the 2 other :FILHO relationships (the duplicated edges)?

This query will delete duplicate :FILHO relationships between Person nodes:
MATCH (p1:Person)-[r:FILHO]->(p2:Person)
WITH p1, p2, COLLECT(r) as rels
FOREACH(r IN tail(rels) | DELETE r)
First, it matches all FILHO relationships between Person nodes.
Then it aggregates the relationships for each pair of Person nodes into the rels collection.
Finally, it iterates through the tail of each rels collection (all relationships except the first) and deletes them.
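To make the collect-then-tail step concrete, here is a minimal pure-Python model of the same logic. It is only a sketch: relationships are represented as hypothetical (id, start, end) tuples rather than real Neo4j objects, and the function name is made up for illustration.

```python
from collections import defaultdict

def duplicate_rel_ids(rels):
    """Return the ids to delete: every relationship except the first per (start, end) pair.

    rels is an iterable of (rel_id, start_node, end_node) tuples, a stand-in
    for the matched :FILHO relationships (not a Neo4j API).
    """
    grouped = defaultdict(list)       # mirrors WITH p1, p2, COLLECT(r) AS rels
    for rel_id, start, end in rels:
        grouped[(start, end)].append(rel_id)
    to_delete = []
    for ids in grouped.values():
        to_delete.extend(ids[1:])     # mirrors tail(rels)
    return to_delete

rels = [
    (1, "Maria Esther", "Tony"),
    (2, "Maria Esther", "Tony"),
    (3, "Maria Esther", "Tony"),
]
print(duplicate_rel_ids(rels))  # prints [2, 3]
```

Grouping is by direction, as in the query: a relationship from A to B and one from B to A land in different groups, so neither is treated as a duplicate of the other.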

It may be better to avoid creating duplicate edges in the first place. Consider using MERGE rather than CREATE.

Related

Cypher - given relationship, get the nodes

If I do the query
MATCH (:Label1 {prop1: "start node"}) -[relationships*1..10]-> ()
UNWIND relationships as relationship
RETURN DISTINCT relationship
How do I get the nodes for each returned relationship, so that the result has this format:
╒════════╤════════╤═══════╕
│"from" │"type" │"to" │
╞════════╪════════╪═══════╡
├────────┼────────┼───────┤
└────────┴────────┴───────┘
Is there a function such as type(r) but for getting nodes from relationship?
RomanMitasov and ray have working answers above.
I don't think they quite get at what you want to do though, because you're basically returning every relationship in the graph in a sort of inefficient way. I say that because without a start or end position, specifying a path length of 1-10 doesn't do anything.
For example:
CREATE (r1:Temp)-[:TEMP_REL]->(r2:Temp)-[:TEMP_REL]->(r3:Temp)
Now we have a graph with 3 Temp nodes with 2 relationships: from r1 to r2, from r2 to r3.
Run your query on these nodes:
MATCH (:Temp)-[rels*1..10]->(:Temp)
UNWIND rels as rel
RETURN startNode(rel), type(rel), endNode(rel)
You'll see that you get four rows: one for each single-hop path, plus two for the two-hop path from r1 to r3. That is not what you want, because there are only two distinct relationships.
You could modify that to return only distinct values, but you're still over-searching the graph.
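The row count can be checked with a small pure-Python model of the expansion. This is a sketch under assumptions: edges are plain (start, end) tuples that are assumed to be unique, and directed_paths is a hypothetical helper, not something Neo4j provides.

```python
def directed_paths(edges, min_len=1, max_len=10):
    """Enumerate directed paths of min_len..max_len hops as lists of edges.

    An edge is never reused within one path, which is how a
    variable-length pattern such as [rels*1..10] behaves.
    """
    paths = []

    def extend(path):
        if len(path) >= min_len:
            paths.append(list(path))
        if len(path) == max_len:
            return
        for edge in edges:
            # follow an unused edge that starts where the path currently ends
            if edge not in path and edge[0] == path[-1][1]:
                path.append(edge)
                extend(path)
                path.pop()

    for edge in edges:
        extend([edge])
    return paths

edges = [("r1", "r2"), ("r2", "r3")]
paths = directed_paths(edges)
rows = [edge for path in paths for edge in path]  # mirrors UNWIND rels AS rel
print(len(paths), len(rows), len(set(rows)))  # prints 3 4 2
```

Three paths produce four unwound rows, but only two distinct relationships, which is why DISTINCT fixes the output without reducing the amount of searching.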
To get an idea of what relationships are in the graph and what they connect, I use a query like:
MATCH (n)-[r]->(m)
RETURN labels(n), type(r), labels(m), count(r)
The downside of that, of course, is that it can take a while to run if you have a very large graph.
If you just want to see the structure of your graph:
CALL db.schema.visualization()
Best wishes and happy graphing! :)
Yes, such functions do exist!
startNode(r) to get the start node from relationship r
endNode(r) to get the end node
Here's the final query:
MATCH () -[relationships*1..10]-> ()
UNWIND relationships as r
RETURN startNode(r) as from, type(r) as type, endNode(r) as to

How to do this in a single Cypher Query?

So this is a very basic question. I am trying to make a cypher query that creates a node and connects it to multiple nodes.
As an example, let's say I have a database with towns and cars. I want to create a query that:
creates people, and
connects them with the town they live in and any cars they may own.
So here goes:
Here's one way I tried this query (I have WHERE clauses that specify which town and which cars, but to simplify):
MATCH (t: Town)
OPTIONAL MATCH (c: Car)
MERGE a = ((c) <-[:OWNS_CAR]- (p:Person {name: "John"}) -[:LIVES_IN]-> (t))
RETURN a
But this returns multiple people named John - one for each car he owns!
In two queries:
MATCH (t:Town)
MERGE a = ((p:Person {name: "John"}) -[:LIVES_IN]-> (t))
MATCH (p:Person {name: "John"})
OPTIONAL MATCH (c:Car)
MERGE a = ((p) -[:OWNS_CAR]-> (c))
This gives me the result I want, but I was wondering if I could do this in 1 query. I don't like the idea that I have to find John again! Any suggestions?
It took me a bit to wrap my head around why MERGE sometimes creates duplicate nodes when I didn't intend that. This article helped me.
The basic insight is that it would be best to merge the Person node first before you match the towns and cars. That way you won't get a new Person node for each relationship pattern.
If Person nodes are uniquely identified by their name properties, a unique constraint would prevent you from creating duplicates even if you run a mistaken query.
If a person can have multiple cars and residences in multiple towns, you also want to avoid a cartesian product of cars and towns in your result set before you do the merge. Try using the table output in Neo4j Browser to see how many rows are getting returned before you do the MERGE to create relationships.
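The cartesian product mentioned above is easy to see in a tiny Python model. The lists below are illustrative values only; the point is the row count, not the data.

```python
from itertools import product

cars = ["xyz123", "999aaa"]
towns = ["Lexington", "Concord"]

# Matching cars and towns without aggregating: one row per (car, town) combination.
rows_without_collect = list(product(cars, towns))

# Collecting each match into a list first: a single row that carries both lists.
rows_with_collect = [(cars, towns)]

print(len(rows_without_collect), len(rows_with_collect))  # prints 4 1
```

With the first shape, any MERGE that follows runs once per combination; with the second it runs once, and the lists can be iterated separately.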
Here's how I would approach your query.
MERGE (p:Person {name:"John"})
WITH p
OPTIONAL MATCH (c:Car)
WHERE c.licensePlate in ["xyz123", "999aaa"]
WITH p, COLLECT(c) as cars
OPTIONAL MATCH (t:Town)
WHERE t.name in ["Lexington", "Concord"]
WITH p, cars, COLLECT(t) as towns
FOREACH(car in cars | MERGE (p)-[:OWNS]->(car))
FOREACH(town in towns | MERGE (p)-[:LIVES_IN]->(town))
RETURN p, towns, cars

Neo4j - Filter out All Nodes in a Graph that contain a Certain Relationship

Background:
I have a graph with Company nodes joined to each other through one or more relationships.
Looks like this:
Trying to Achieve:
I want to keep all nodes where the relationship between them is "competes_on_backlinks", but if a "links_to" relationship also exists between them, that node should be filtered out.
I have Tried:
MATCH p =(c:Company {name:"example.com"})-[r]-(b:Company)
where NONE (x IN relationships(p) WHERE type(x) ="links_to")
RETURN p
The above query produces the graph in the image.
That filters out nodes where the only relationship between them is "links_to", but it does not remove the nodes where there is another relationship as well (as you can see in the image).
I have made many other attempts, all with the same result.
Any idea how to do this?
The NOT operator can be used to exclude a pattern:
MATCH p = (c:Company {name:"F"})-[:competes_on_backlinks]-(b:Company)
WHERE NOT (b)-[:links_to]-(c)
RETURN p
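As a rough illustration of what that pattern exclusion computes, here is a pure-Python model. It assumes relationships are given as (node_a, type, node_b) tuples and treated as undirected, like the pattern in the query; the function name and sample data are hypothetical.

```python
def competitors_without_links(rels, company):
    """Return neighbours that compete on backlinks with company but have no links_to with it."""

    def neighbours(rel_type):
        found = set()
        for a, t, b in rels:
            if t != rel_type:
                continue
            if a == company:
                found.add(b)
            elif b == company:
                found.add(a)
        return found

    # keep the competitors, then drop any that also share a links_to relationship
    return neighbours("competes_on_backlinks") - neighbours("links_to")

rels = [
    ("F", "competes_on_backlinks", "A"),
    ("F", "competes_on_backlinks", "B"),
    ("B", "links_to", "F"),
    ("F", "links_to", "C"),
]
print(sorted(competitors_without_links(rels, "F")))  # prints ['A']
```

B is dropped because it has both relationship types with F, and C never qualifies because it has no competes_on_backlinks relationship at all.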

Neo4j read all subgraphs without duplications Cypher query

I've been trying to get a subgraph based on a node query.
The query should ignore the relationship directions as long as all of the nodes in the subgraph are connected:
ex:
u1 -FRIEND-> u2 -FRIEND-> u3
u4 -FRIEND-> u5 -FRIEND-> u6
searching for u1 or u2 or u3, should return a set of: [u1,u2,u3]
I used the following Cypher query:
MATCH (a:User)-[:FRIEND_OF*0..]-(b)
WHERE a.userId = 'some_id'
WITH a, collect(DISTINCT b) AS sets
RETURN DISTINCT sets
The problem is that I'm getting all of the set's permutations like:
DATA: u1 -FRIEND-> u2 -FRIEND-> u3
RETURN: [u1,u2,u3],[u1,u3,u2],[u2,u1,u3]...
How can I deduplicate the sets so that only one permutation is returned?
I would also like to support the case where a user belongs to several subgraphs, so the response should contain a couple of subgraphs.
thanks
This query should work:
MATCH p=(a:User)-[:FRIEND_OF*0..]-(b)
WHERE a.userId = 'some_id'
WITH DISTINCT a, b
ORDER BY ID(b)
WITH a, COLLECT(b) AS sets
RETURN DISTINCT sets;
It gets distinct a/b pairs, orders the b nodes by native ID, puts the ordered nodes in collections, and finally returns distinct collections.
You may want to create an index for :User(userId) for better performance.
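The reason the ordering step matters is that two lists with the same members in a different order are not equal, so DISTINCT cannot collapse them. A minimal Python sketch of the same idea, using a hypothetical adjacency dictionary in place of the graph:

```python
def canonical_component(adjacency, start):
    """Return the nodes reachable from start (direction ignored) as a sorted tuple."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return tuple(sorted(seen))  # sorting gives every permutation the same form

friend_edges = [("u1", "u2"), ("u2", "u3"), ("u4", "u5"), ("u5", "u6")]
adjacency = {}
for a, b in friend_edges:
    # store each edge in both directions so direction is ignored
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)

components = {canonical_component(adjacency, user) for user in ("u1", "u2", "u3")}
print(components)  # prints {('u1', 'u2', 'u3')}
```

Whichever of u1, u2 or u3 the search starts from, the sorted result is identical, so only one set survives deduplication.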

Neo4j duplicate relationship

I have duplicate relationships between nodes e.g:
(Author)-[:CONNECTED_TO {weight: 1}]->(Coauthor)
(Author)-[:CONNECTED_TO {weight: 1}]->(Coauthor)
(Author)-[:CONNECTED_TO {weight: 1}]->(Coauthor)
and I want to merge these relations into one relation of the form: A->{weight: 3} B for my whole graph.
I tried something like the following; (I'm reading the data from a csv file)
MATCH (a:Author {authorid: csvLine.author_id}),(b:Coauthor { coauthorid: csvLine.coauthor_id})
CREATE UNIQUE (a)-[r:CONNECTED_TO]-(b)
SET r.weight = coalesce(r.weight, 0) + 1
But when I run this query, it creates duplicate coauthor nodes. The weight is updated. The result looks like this:
(Author)-[r:CONNECTED_TO]->(Coauthor)
(It creates 3 identical coauthor nodes for the author.)
If you need to fix it after the fact, you could aggregate all of the relationships and their weights between each pair of applicable nodes, update the first relationship with the aggregated total, and then delete the second through the last relationships in the collection. Perform the update only where there is more than one relationship. Something like this:
MATCH (a:Author {name: 'A'})-[r:CONNECTED_TO]->(b:Coauthor {name: 'B'})
// aggregate the relationships and limit it to those with more than 1
WITH a, b, collect(r) AS rels, sum(r.weight) AS new_weight
WHERE size(rels) > 1
// update the first relationship with the new total weight
SET (rels[0]).weight = new_weight
// bring the aggregated data forward
WITH a, b, rels, new_weight
// delete the relationships 1..n
UNWIND range(1,size(rels)-1) AS idx
DELETE rels[idx]
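The same aggregate, update and delete sequence can be modelled in plain Python to check the expected result. This is only a sketch: relationships are hypothetical dictionaries with id, start, end and weight keys, not Neo4j objects.

```python
from collections import defaultdict

def merge_duplicates(rels):
    """Collapse duplicate relationships per (start, end) pair.

    Returns (kept, deleted_ids): the first relationship of each pair with the
    summed weight, and the ids of the relationships that would be deleted.
    """
    grouped = defaultdict(list)
    for rel in rels:
        grouped[(rel["start"], rel["end"])].append(rel)

    kept, deleted = [], []
    for group in grouped.values():
        first = dict(group[0])
        if len(group) > 1:                                     # WHERE size(rels) > 1
            first["weight"] = sum(r["weight"] for r in group)  # SET (rels[0]).weight
            deleted.extend(r["id"] for r in group[1:])         # DELETE rels[1..n]
        kept.append(first)
    return kept, deleted

rels = [
    {"id": 1, "start": "A", "end": "B", "weight": 1},
    {"id": 2, "start": "A", "end": "B", "weight": 1},
    {"id": 3, "start": "A", "end": "B", "weight": 1},
]
kept, deleted = merge_duplicates(rels)
print(kept[0]["weight"], deleted)  # prints 3 [2, 3]
```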
If you are doing it for the whole graph and the graph is large, you may want to perform the update in batches using LIMIT or some other control mechanism.
MATCH (a:Author)-[r:CONNECTED_TO]->(b:Coauthor)
WITH a, b, collect(r) AS rels, sum(r.weight) AS new_weight
WHERE size(rels) > 1
// limit after filtering, so each batch only contains pairs that still have duplicates
WITH a, b, rels, new_weight
LIMIT 100
SET (rels[0]).weight = new_weight
WITH a, b, rels, new_weight
UNWIND range(1,size(rels)-1) AS idx
DELETE rels[idx]
If you want to eliminate the problem when loading...
MATCH (a:Author {authorid: csvLine.author_id}),(b:Coauthor { coauthorid: csvLine.coauthor_id})
MERGE (a)-[r:CONNECTED_TO]->(b)
ON CREATE SET r.weight = 1
ON MATCH SET r.weight = coalesce(r.weight, 0) + 1
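The effect of MERGE with ON CREATE and ON MATCH is essentially an upsert keyed on the pair of nodes. A minimal Python model, assuming the CSV rows have already been reduced to hypothetical (author_id, coauthor_id) tuples:

```python
def load_rows(rows):
    """Count how often each (author_id, coauthor_id) pair occurs, one relationship per pair."""
    weights = {}
    for author_id, coauthor_id in rows:
        key = (author_id, coauthor_id)
        if key not in weights:
            weights[key] = 1      # ON CREATE SET r.weight = 1
        else:
            weights[key] += 1     # ON MATCH SET r.weight = r.weight + 1
    return weights

rows = [("a1", "c1"), ("a1", "c1"), ("a1", "c1"), ("a1", "c2")]
print(load_rows(rows))  # prints {('a1', 'c1'): 3, ('a1', 'c2'): 1}
```

Three identical rows end up as one entry with weight 3 instead of three separate relationships.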
Side Note: not really knowing your data model, I would consider modelling CoAuthor as Author as they are likely authors in their own right. It is probably only in the context of a particular project they would be considered a coauthor.
