Cypher to find similar nodes without repeating matches - neo4j

I am new to Cypher. I want to find similar nodes without getting each matching pair twice.
Sample data
CREATE (r1:Repository {id:"repository1"})
CREATE (r2:Repository {id:"repository2"})
CREATE (r3:Repository {id:"repository3"})
CREATE (a1:Actor {id: "actor1"})
CREATE (a2:Actor {id: "actor2"})
CREATE (a3:Actor {id: "actor3"})
CREATE (o1:Organization {id:"organization1"})
CREATE (o2:Organization {id:"organization2"})
MATCH (a:Repository {id:"repository1"}) MATCH (b:Actor {id: 'actor1'})
CREATE (a)-[:IS_ACTOR]->(b)
MATCH (a:Repository {id:"repository1"}) MATCH (b:Actor {id: 'actor2'})
CREATE (a)-[:IS_ACTOR]->(b)
MATCH (a:Repository {id:"repository1"}) MATCH (b:Actor {id: 'actor3'})
CREATE (a)-[:IS_ACTOR]->(b)
MATCH (a:Repository {id:"repository1"}) MATCH (b:Organization {id: 'organization1'})
CREATE (a)-[:IN_ORGANIZATION]->(b)
MATCH (a:Repository {id:"repository2"}) MATCH (b:Actor {id: 'actor1'})
CREATE (a)-[:IS_ACTOR]->(b)
MATCH (a:Repository {id:"repository2"}) MATCH (b:Actor {id: 'actor2'})
CREATE (a)-[:IS_ACTOR]->(b)
MATCH (a:Repository {id:"repository2"}) MATCH (b:Organization {id: 'organization1'})
CREATE (a)-[:IN_ORGANIZATION]->(b)
MATCH (a:Repository {id:"repository3"}) MATCH (b:Actor {id: 'actor3'})
CREATE (a)-[:IS_ACTOR]->(b)
MATCH (a:Repository {id:"repository3"}) MATCH (b:Organization {id: 'organization2'})
CREATE (a)-[:IN_ORGANIZATION]->(b)
Cypher
MATCH (a)-[r1:IS_ACTOR|IN_ORGANIZATION]->(match)<-[r2:IS_ACTOR|IN_ORGANIZATION]-(b)
WHERE NOT a.id = b.id
WITH a, b, count(match) AS count, collect(match.id) AS connections, collect(type(r1)) AS rel1
RETURN a.id, b.id, count, connections, rel1 ORDER BY count DESC
Result
a.id         b.id         count  connections                  rel1
repository2  repository1  3      actor1,actor2,organization1  IS_ACTOR,IS_ACTOR,IN_ORGANIZATION
repository1  repository2  3      actor1,actor2,organization1  IS_ACTOR,IS_ACTOR,IN_ORGANIZATION
repository3  repository1  1      actor3                       IS_ACTOR
repository1  repository3  1      actor3                       IS_ACTOR
How can I remove row #2 & #4 from the result?
Based on a response to a similar question, I tried using filter, but I get a syntax error (Cypher below):
MATCH (a)-[r1:IS_ACTOR|IN_ORGANIZATION]->(match)<-[r2:IS_ACTOR|IN_ORGANIZATION]-(b)
with filter(x in connections where x <> b.id)
where not a.id = b.id
with a,b,count(match) as count, collect (match.id) as connections, collect (type(r1)) as rel1
return a.id,b.id,count,connections,rel1 order by count desc

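The pattern matches each pair twice, once from each side: once with a given repository bound to a and once with it bound to b. To return only one of the two rows, compare the internal node ids so that a and b always appear in a fixed order; the mirrored combination is then filtered out. The comparison also rules out a and b being the same node, so the separate a.id/b.id check is no longer needed.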
MATCH (a)-[r1:IS_ACTOR|IN_ORGANIZATION]->(match)<-[r2:IS_ACTOR|IN_ORGANIZATION]-(b)
WHERE id(a) > id(b)
WITH a, b, count(match) AS count, collect(match.id) AS connections, collect(type(r1)) AS rel1
RETURN a.id, b.id, count, connections, rel1 ORDER BY count DESC
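To see why a fixed ordering removes the mirrored rows, here is a minimal Python sketch of the same idea outside the database. It is only a model: the `links` dictionary and the `similar_pairs` function are hypothetical names, and the string comparison `a > b` stands in for `id(a) > id(b)` (an assumption; Neo4j compares internal node ids, not the `id` property).

```python
# Hypothetical in-memory copy of the sample data: repository -> connected nodes.
links = {
    "repository1": ["actor1", "actor2", "actor3", "organization1"],
    "repository2": ["actor1", "actor2", "organization1"],
    "repository3": ["actor3", "organization2"],
}


def similar_pairs(links):
    """Return (a, b, count, connections) once per unordered pair."""
    rows = []
    for a in links:
        for b in links:
            # Fixed order: keep (a, b) only when a sorts after b.
            # This drops both the mirrored row (b, a) and the pair (a, a).
            if a > b:
                shared = sorted(set(links[a]) & set(links[b]))
                if shared:
                    rows.append((a, b, len(shared), shared))
    # Most similar pairs first, like ORDER BY count DESC.
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


for row in similar_pairs(links):
    print(row)
```

With the sample data this yields one row for repository2/repository1 and one for repository3/repository1, which corresponds to rows #1 and #3 of the original result.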

Related

How to limit the number of relationships between nodes?

I have a question about how to limit the number of relationships created between nodes. I can certainly limit the number of resulting nodes when performing the MATCH, but I am more concerned with not storing data (in this case relationships) that I will never use in the future.
In my scenario, I have the following graph:
CREATE (u:User {id: 100001}), (:Artist {id: "0001"}), (:Artist {id: "0002"}), (:Artist {id: "0003"}), (:Artist {id: "0004"}), (:Artist {id: "0005"}),(:Artist {id: "0006"}),(:Artist {id: "0007"}),(:Artist {id: "0008"}),(:Artist {id: "0009"}),(:Artist {id: "0010"});
Notice that I have a User and 10 different Artists.
My requirement is to store the last 5 artists that a User has listened to via the LISTENED_TO relationship. Therefore after executing:
MATCH (u:User {id: 100001}), (a:Artist {id: "0001"})
CREATE (u)-[:LISTENED_TO]->(a);
MATCH (u:User {id: 100001}), (a:Artist {id: "0003"})
CREATE (u)-[:LISTENED_TO]->(a);
MATCH (u:User {id: 100001}), (a:Artist {id: "0005"})
CREATE (u)-[:LISTENED_TO]->(a);
MATCH (u:User {id: 100001}), (a:Artist {id: "0007"})
CREATE (u)-[:LISTENED_TO]->(a);
MATCH (u:User {id: 100001}), (a:Artist {id: "0009"})
CREATE (u)-[:LISTENED_TO]->(a);
I would have a graph like:
(u:User {id: 100001})-[l:LISTENED_TO]->(a:Artist {id: "0001"})
(u:User {id: 100001})-[l:LISTENED_TO]->(a:Artist {id: "0003"})
(u:User {id: 100001})-[l:LISTENED_TO]->(a:Artist {id: "0005"})
(u:User {id: 100001})-[l:LISTENED_TO]->(a:Artist {id: "0007"})
(u:User {id: 100001})-[l:LISTENED_TO]->(a:Artist {id: "0009"})
Now I have the information that this user has listened to 5 different artists. Suppose the User then listens to a song from Artist {id: "0010"}. I would like the first inserted relationship, (u:User {id: 100001})-[l:LISTENED_TO]->(a:Artist {id: "0001"}), to be removed (a FIFO-like mechanism), so that the new graph looks like:
(u:User {id: 100001})-[l:LISTENED_TO]->(a:Artist {id: "0003"})
(u:User {id: 100001})-[l:LISTENED_TO]->(a:Artist {id: "0005"})
(u:User {id: 100001})-[l:LISTENED_TO]->(a:Artist {id: "0007"})
(u:User {id: 100001})-[l:LISTENED_TO]->(a:Artist {id: "0009"})
(u:User {id: 100001})-[l:LISTENED_TO]->(a:Artist {id: "0010"})
Maybe I am stretching the features supported by Neo4j, but I wonder if this is possible. My objective is to avoid storing data I do not need, as I only need the 5 most recently used (in this case listened to) artists.
If a LISTENED_TO relationship contains a timestamp in a time property, then you can use this to retain just the 5 most recent relationships when adding a new one (assuming that the timestamp of the new relationship is always going to be recent enough, and that you pass userId, artistId, and time parameters):
MATCH (u:User {id: $userId})
OPTIONAL MATCH (u)-[lt:LISTENED_TO]->(:Artist)
WITH u, lt ORDER BY lt.time DESC
WITH u, COLLECT(lt) AS lts
FOREACH(x IN lts[4..] | DELETE x)
MERGE (a:Artist {id: $artistId})
CREATE (u)-[:LISTENED_TO {time: $time}]->(a)
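The retention logic of the query above can be modelled in a few lines of Python. This is only a sketch under stated assumptions: `record_listen` is a hypothetical helper, and each relationship is represented as an `(artist_id, time)` tuple rather than a real Neo4j relationship.

```python
def record_listen(listens, artist_id, time, keep=5):
    """Return the listens retained after adding a new one.

    listens: list of (artist_id, time) tuples for one user.
    """
    # ORDER BY lt.time DESC
    newest_first = sorted(listens, key=lambda lt: lt[1], reverse=True)
    # Everything from index keep - 1 onwards is deleted (lts[4..] when keep is 5).
    kept = newest_first[:keep - 1]
    # CREATE the new relationship, giving at most `keep` in total.
    kept.append((artist_id, time))
    return kept


listens = [("0001", 1), ("0003", 2), ("0005", 3), ("0007", 4), ("0009", 5)]
print(record_listen(listens, "0010", 6))
```

The slice keeps four existing relationships rather than five because the new one is added afterwards, which is why the query deletes from index 4.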
[UPDATE]
NOTE: The above query allows the same artist to have multiple relationships to the same user if that user had listened to that artist multiple times recently.
If you want an artist to have at most one relationship to a specific user, then this more complex query should work:
MATCH (u:User {id: $userId})
OPTIONAL MATCH p= (u)-[lt:LISTENED_TO]->(a:Artist)
WITH u, {lt: lt, a: a} AS data ORDER BY lt.time DESC
WITH u, REDUCE(
  s = {cnt: 0, del: []}, x IN COLLECT(data) |
    CASE WHEN x.a.id = $artistId OR s.cnt = 4
      THEN {cnt:s.cnt, del:s.del + x.lt}
      ELSE {cnt:s.cnt + 1, del:s.del} END).del AS del
FOREACH(x IN del | DELETE x)
MERGE (a:Artist {id: $artistId})
CREATE (u)-[:LISTENED_TO {time: $time}]->(a)
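The REDUCE step is the hard part to read, so here is a rough Python equivalent of what it accumulates. It is a sketch, not the query itself: `record_unique_listen` is a hypothetical name, relationships are modelled as `(artist_id, time)` tuples, and the function tracks the kept relationships where the query tracks the ones to delete.

```python
from functools import reduce


def record_unique_listen(listens, artist_id, time, keep=5):
    """Keep at most `keep` listens, with at most one per artist."""
    newest_first = sorted(listens, key=lambda lt: lt[1], reverse=True)

    def step(state, lt):
        cnt, kept = state
        # Drop older listens to the same artist, and anything beyond
        # the first keep - 1 survivors (the s.cnt = 4 branch in the query).
        if lt[0] == artist_id or cnt == keep - 1:
            return (cnt, kept)
        return (cnt + 1, kept + [lt])

    _, kept = reduce(step, newest_first, (0, []))
    return kept + [(artist_id, time)]


listens = [("0001", 1), ("0003", 2), ("0005", 3), ("0007", 4), ("0009", 5)]
print(record_unique_listen(listens, "0005", 6))
```

Listening to an artist already in the list replaces the old relationship instead of pushing out the oldest one, so in this example artist "0001" survives.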

Neo4J Cypher v2 Create Uniquely Labelled Relation With Changing Fields

I have two users:
CREATE (a:user {id: 1})
CREATE (b:user {id: 2})
I want users to be able to follow each other:
MATCH (a:user {id: 1}), (b:user {id: 2})
CREATE (a)-[r:FOLLOWS]->(b)
But I also need to keep track of when that follow happened:
MATCH (a:user {id: 1}), (b:user {id: 2})
CREATE (a)-[r:FOLLOWS {t: 32409823}]->(b)
My issue is that I need to create the :FOLLOWS relation if it does not already exist, without making one query to check and then another query to create it. Ideally CREATE UNIQUE would solve this, and it works just fine when there are no changing fields on the relation:
MATCH (a:user {id: 1}), (b:user {id: 2})
CREATE UNIQUE (a)-[r:FOLLOWS]->(b)
(THIS WORKS)
But when I include a timestamp on the relation, create unique will make a second relation because it has a different timestamp.
MATCH (a:user {id: 1}), (b:user {id: 2})
CREATE UNIQUE (a)-[r:FOLLOWS {t: 32409823}]->(b)
(THIS DOESN'T WORK)
The above creates a new relation every time because the timestamp is always changing. Is there any way I can check whether any relation of type :FOLLOWS exists, and create the relation with its fields only if it doesn't?
MERGE and its ON CREATE clause should do what you want. MERGE will match on the :FOLLOWS relationship, and if it does not exist it will create it. ON CREATE is only performed if the MERGE operation created the relationship instead of matching on an existing one.
MATCH (a:user {id: 1}), (b:user {id: 2})
MERGE (a)-[r:FOLLOWS]->(b)
ON CREATE SET r.t = timestamp()
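The match-or-create behaviour can be pictured with a plain dictionary. This is an illustrative model only: `follow` and the `follows` dictionary are hypothetical names, and the timestamp is passed in explicitly instead of coming from `timestamp()`.

```python
def follow(follows, a, b, now):
    """Model of MERGE ... ON CREATE SET: create once, never overwrite."""
    key = (a, b)
    if key not in follows:
        # ON CREATE branch: runs only when the relationship is new.
        follows[key] = {"t": now}
    # The existing (or newly created) relationship is returned either way.
    return follows[key]


follows = {}
follow(follows, 1, 2, now=100)
follow(follows, 1, 2, now=200)  # matches the existing entry, t stays 100
print(follows)
```

Because the timestamp is not part of what is matched on, a second call finds the existing entry and leaves its properties alone.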

Optional nodes in a path?

I'm trying to write a query where I get the :LIKES relationships.
(:USER)
|
[:CREATED]
|
(:POST)<-[:LIKES]-(:USER)
|
[:RESHARED]
|
(:POST)<-[:LIKES]-(:USER)
I was trying something along the lines of:
MATCH (u:USER {name: "Lamoni"})-[:CREATED]-(p:POST)
OPTIONAL MATCH p<-[:LIKES]-(u2:USER)
OPTIONAL MATCH p<-[:RESHARED]-(p2:POST)<-[:LIKES]-(u3:USER)
Any ideas on an optimal way to do this and be able to order them by a property called created_at in a descending order?
Thanks!
If the POST structure always looks like this you can try:
// match the whole user-post-post path
MATCH (u:USER {name: "Lamoni"})-[:CREATED]-(p_direct:POST)-[:RESHARED]-(p_shared:POST)
WITH u, p_direct, p_shared
OPTIONAL MATCH (p_direct)<-[:LIKES]-(u2:USER)
OPTIONAL MATCH (p_shared)<-[:LIKES]-(u3:USER)
RETURN u.name, p_direct.xyz, collect(DISTINCT u2.name), p_shared.xyz, collect(DISTINCT u3.name)
If you just want all USERs that like a POST by a given USER (regardless of whether the POST was created or reshared), you can also collect all POSTs:
MATCH (u:USER {name: "Lamoni"})-[:CREATED|RESHARED*1..2]-(p:POST)
WITH u, p
OPTIONAL MATCH (p)<-[:LIKES]-(u2:USER)
WITH u, p, u2
ORDER BY u2.created_at DESC
RETURN u.name, p, collect(u2.name)

Chaining result with "WITH" doesn't work when the subsequent query doesn't have matched result in Neo4j cypher query

For example, I created two linked nodes:
create (a:ACTOR {id: "a1", name: "bruce wellis"})
create (m:MOVIE {id: "m1", title: "die hardest"})
create (a)-[:ACTED_IN]->(m)
1. From this cypher query:
match (a:ACTOR {id: "a1"})
with a
optional match (m:MOVIE {id: "m1"})
set m += {
title: "die easier"
}
return a;
I get the result:
+-----------------------------------------+
| a |
+-----------------------------------------+
| Node[1000]{name:"bruce wellis",id:"a1"} |
+-----------------------------------------+
1 row
Properties set: 1
The query successfully returned the actor node.
2. (UPDATED) But if the MOVIE match finds nothing:
match (a:ACTOR {id: "a1"})
with a
optional match (m:MOVIE {id: "mm"})
set m += {
title: "die easier"
}
return a;
I get the error:
CypherTypeException: Expected m to be a node or a relationship, but it was :`null`.
How can I make the second query return the matched actor?
A MATCH that fails to match anything will always return no rows.
So, in #2, since the second MATCH failed, it returns no rows.
You could use OPTIONAL MATCH in place of the second MATCH, and you should see results.
[EDITED]
For the Updated question, this (somewhat ugly) workaround should work:
MATCH (a:ACTOR {id: "a1"})
WITH a
OPTIONAL MATCH (m:MOVIE {id: "mm"})
WITH a, COLLECT(m) AS cm
FOREACH(m IN cm | SET m += {title: "die easier"})
RETURN a;
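The workaround relies on two behaviours: COLLECT drops nulls, and FOREACH over an empty list does nothing. A small Python model of that combination, using hypothetical helper names (`collect`, `update_optional`) and plain dictionaries in place of nodes:

```python
def collect(values):
    """Model of COLLECT: null (None) values are dropped."""
    return [v for v in values if v is not None]


def update_optional(movie, new_props):
    """Apply new_props to movie if it exists; do nothing for None."""
    # Model of FOREACH(m IN cm | SET m += {...}): an empty list means no SET.
    for m in collect([movie]):
        m.update(new_props)
    return movie


found = {"id": "m1", "title": "die hardest"}
update_optional(found, {"title": "die easier"})
update_optional(None, {"title": "die easier"})  # no error, nothing to update
print(found)
```

The missing movie never reaches the update step, so there is no null to trip over and the actor row is still returned.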

match in clause in cypher

How can I do an IN-style match in Cypher?
E.g. I'd like to find movies with ids 1, 2, or 3.
match (m:movie {movie_id:("1","2","3")}) return m
When querying against an auto index, the syntax was:
START n=node:node_auto_index('movie_id:("123", "456", "789")')
How is this done with a MATCH clause?
The idea is that you can do:
MATCH (m:movie)
WHERE m.movie_id IN ["1", "2", "3"]
RETURN m
However, this will not use the index as of 2.0.1. This is a missing feature in the new label indexes that I hope will be resolved soon. https://github.com/neo4j/neo4j/issues/861
I've found a (somewhat ugly) temporary workaround for this.
The following query doesn't make use of an index on Person(name):
match (p:Person)... where p.name in ['JOHN', 'BOB'] return ...;
So one option is to repeat the entire query n times:
match (p:Person)... where p.name = 'JOHN' return ...
union
match (p:Person)... where p.name = 'BOB' return ...
If this is undesirable then another option is to repeat just a small query for the id n times:
match (p:Person) where p.name ='JOHN' return id(p)
union
match (p:Person) where p.name ='BOB' return id(p);
and then perform a second query using the results of the first:
match (p:Person)... where id(p) in [8,16,75,7] return ...;
Is there a way to combine these into a single query? Can a union be nested inside another query?
