cypher NOT IN query with Optional Match - neo4j

NOT RELEVANT - SKIP TO Important Edit.
I have the following query:
MATCH (n)
WHERE (n:person) AND n.id in ['af97ab48544b'] // id is our system identifier
OPTIONAL MATCH (n)-[r:friend|connected|owner]-(m)
WHERE (m:person OR m:dog OR m:cat)
RETURN n,r,m
This query returns all the persons, dogs and cats that have a relationship with a specific person. I would like to invert it, so that I receive all the nodes and relationships that are NOT included in this query's results.
If it was SQL it would be
select * from graph where id NOT IN (my_query)
I think that the OPTIONAL MATCH is the problematic part. How can I do it?
Any advice?
Thanks.
-- Important Edit --
Hey guys, sorry for changing my question, but my requirements have changed. I need to get the entire graph (all nodes and relationships, connected and disconnected) except for specific nodes identified by id. The following query works, but only for a single id; with more ids it does not work.
MATCH (n) WHERE (n:person)
OPTIONAL MATCH (n)-[r:friend|connected|owner]-(m) WHERE (m:person OR m:dog OR m:cat)
WITH n,r,m
MATCH (excludeNode) WHERE excludeNode.id IN ['af97ab48544b']
WITH n,r,m,excludeNode WHERE NOT n.id = excludeNode.id AND (NOT m.id = excludeNode.id OR m is null)
RETURN n,m,r
Alternatively I tried simpler query:
MATCH (n) WHERE (n:person) AND NOT n.id IN ['af97ab48544b'] return n
But this one does not return the relationships (remember, I also need the disconnected nodes).
How can I get the entire graph excluding specific nodes? That includes nodes and relationships, connected nodes as well as disconnected ones.

try this:
match (n) where not n.id = 'id to remove' optional match (n)-[r]-(m)
where not n.id in ['id to remove'] and not m.id in ['id to remove']
return n,r,m
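A variant of the query above that takes a list of ids and keeps the label and relationship-type filters from the question might look like the following. This is only a sketch: it assumes every node carries an `id` property, and the list literal is just the example id from the question.

```cypher
// Sketch: every person not in the exclusion list, with its relationships.
// Assumption: all nodes have an `id` property; add more ids to the list as needed.
WITH ['af97ab48544b'] AS excluded
MATCH (n:person)
WHERE NOT n.id IN excluded
// OPTIONAL MATCH keeps persons that have no matching relationship (r and m are then null)
OPTIONAL MATCH (n)-[r:friend|connected|owner]-(m)
WHERE (m:person OR m:dog OR m:cat) AND NOT m.id IN excluded
RETURN n, r, m
```

Because the exclusion test on m sits in the WHERE of the OPTIONAL MATCH, a person whose only neighbours are excluded is still returned, with null for r and m.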

You've gotta switch the 'perspective' of your query... start by looping over every node, then prune the ones that connect to your person.
MATCH (bad:person) WHERE bad.id IN ['af97ab48544b']
WITH COLLECT(bad) AS bads
MATCH path = (n:person)-[r:friend|connected|owner]->(m)
WHERE n._id = '' AND (m:person OR m:cat OR m:dog) AND NOT ANY(bad IN bads WHERE bad IN NODES(path))
RETURN path
That said, this is a problem much better suited to SQL than to a graph. Any time you have to loop over every node with a label, you're in relational territory, and the graph will be less efficient.

Related

How Many Nodes Are Involved in a Match

How can I know how many nodes and edges are involved in a MATCH? Is there another way besides Explain / Profile Match?
If you mean how many nodes are matched in a path, such as a variable-length path, then you can assign a path variable for this:
MATCH p = (k:Person {name:'Keanu Reeves'})-[*..8]-(t:Person {name:'Tom Hanks'})
WITH p LIMIT 1
RETURN p, length(p) as pathLength, length(p) + 1 as numberOfNodesInPath
You can also use nodes(p) and relationships(p) to get the collection of nodes and relationships that make up the path, and you can use size() on those collections to get their size.
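As a small sketch of that suggestion, reusing the same example path, size() can be applied directly to the two collections:

```cypher
// Count the nodes and relationships that make up a single matched path
MATCH p = (k:Person {name:'Keanu Reeves'})-[*..8]-(t:Person {name:'Tom Hanks'})
WITH p LIMIT 1
RETURN size(nodes(p)) AS nodeCount, size(relationships(p)) AS relCount
```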
Cypher also has the COUNT() function, which counts the number of elements. For example:
MATCH (n)
RETURN COUNT(n);
This query will count all nodes in your database.
You can find more information in the cypher manual, under the aggregating functions. Check it out.
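The same function can count relationships. A minimal sketch that counts every relationship in the database (the directed pattern is there so each relationship is matched only once):

```cypher
// Count all relationships; the arrow avoids matching each one in both directions
MATCH ()-[r]->()
RETURN count(r) AS relCount
```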
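The following Cypher snippet should return the number of distinct nodes and the total number of relationships found by a given MATCH clause (a relationship that appears in several paths is counted once per path). Just replace <your code here> with your MATCH pattern, assigned to the path variable p.
MATCH p = <your code here>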
WITH COLLECT(NODES(p)) AS ns, SUM(SIZE(RELATIONSHIPS(p))) AS relCount
UNWIND ns AS nodeList
UNWIND nodeList AS node
RETURN COUNT(DISTINCT node) AS nodeCount, relCount;
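The snippet above sums the relationships of each path, so a relationship shared by several paths is counted more than once. If distinct relationships are wanted as well, one possible sketch is to unwind the collected paths twice. The pattern here is only an illustrative assumption; substitute your own.

```cypher
// Sketch: distinct node and distinct relationship counts over all matched paths.
// The MATCH pattern below is a placeholder example, not part of the original answer.
MATCH p = (:Person {name:'Keanu Reeves'})-[*..2]-()
WITH collect(p) AS paths
UNWIND paths AS p1
UNWIND nodes(p1) AS node
WITH paths, count(DISTINCT node) AS nodeCount
UNWIND paths AS p2
UNWIND relationships(p2) AS rel
RETURN nodeCount, count(DISTINCT rel) AS relCount
```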

Return a graph based on the number of relationships between nodes

I am experimenting with creating multiple relationships between nodes to represent the importance between two given nodes.
For example, I want to know what 'genre' of reading material is most important to Joe.
I want a way to match the Joe node to genre nodes only if there are at least a given number of relationships between them.
So, if I want matches with 3 or more relationships, I should get a graph with Joe --> Fantasy
I know I can get this when both endpoints are defined:
MATCH (p:PERSON {name:'Joe'})-[r]->(g: GENRE {name:'Fantasy'})
RETURN count(r)
What I want is something like:
MATCH path = (p:PERSON {name:'Joe'})-[r]->()
WHERE *pair_relationship_count*(r) >= 3
RETURN path
This is my proposition:
MATCH path = (p:PERSON {name:'Joe'})-[r]->(g)
// group by the target node so the count is per pair, not per person
WITH g, collect(path) AS paths
WHERE size(paths) >= 3
UNWIND paths AS path
RETURN path
But maybe it is more efficient to have one relationship with an internal count property on one relationship for each couple of nodes.
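First, I think you can achieve your goal using a WITH clause:
MATCH (p:PERSON {name:'Joe'})-[r]->(g:GENRE {name:'Fantasy'})
// aggregate per node pair; grouping by the path itself would always give a count of 1
WITH p, g, count(r) AS relCount
WHERE relCount >= 3
RETURN p, g, relCount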
But using one relationship for each read "event" seems to be a bad approach. Maybe you should use an integer property in the relationship, then increment this property for each "read". This way you can do queries like:
MATCH path = (:PERSON {name:'Joe'})-[r]->(:GENRE {name:'Fantasy'})
WHERE r.count >= 3
RETURN path
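One way to maintain such a counter is MERGE with ON CREATE and ON MATCH. The sketch below assumes a READS relationship type and a `count` property, and that the Joe and Fantasy nodes already exist (if either is missing, the query does nothing):

```cypher
// Sketch: record one more "read" of Fantasy by Joe on a single relationship
MATCH (p:PERSON {name:'Joe'}), (g:GENRE {name:'Fantasy'})
MERGE (p)-[r:READS]->(g)
ON CREATE SET r.count = 1            // first read: create the relationship
ON MATCH SET r.count = r.count + 1   // later reads: increment the counter
RETURN r.count
```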
To get a collection of all READS paths for "Joe" that involve each genre that he has read at least 3 times:
MATCH p = (:PERSON {name:'Joe'})-[:READS]->(genre)
WITH genre, COLLECT(p) AS paths
WHERE SIZE(paths) >= 3
RETURN genre, paths;

Neo4j/Cypher dense node match result ordering

I have a (:User) dense node with following relationships:
(:User)-[:SUBSCRIBED]->(:User)
(:User)-[:CONNECTED]->(:SocialNetwork)
If I execute the query below
MATCH (u:User {UserId:id})
MATCH (u)-[:SUBSCRIBED]->(s)
RETURN s
I get the user's subscribers ordered by most recent, which is expected.
But the same query with an additional matching pattern breaks this ordering:
MATCH (u:User {UserId:id})
MATCH (u)-[:SUBSCRIBED]->(s)
OPTIONAL MATCH (s)-[:CONNECTED]->(sn)
RETURN s, COUNT(sn.FriendCount)
Could someone explain why ordering by most recent doesn't work in the second example?
There is no guarantee of order in your query because you don't have an ORDER BY clause. Run the same query 1000 times and I'm sure the order will change at some point.
You should order at the end of the query:
MATCH (u:User {UserId:id})
MATCH (u)-[:SUBSCRIBED]->(s)
OPTIONAL MATCH (s)-[:CONNECTED]->(sn)
RETURN s, COUNT(sn.FriendCount)
ORDER BY s.time // ? property representing the time

neo4j how to use count(distinct()) over the nodes of path

I am searching for the longest path in my graph, and I want to count the number of distinct nodes on this longest path.
I want to use count(distinct())
I tried two queries.
First is
match p=(primero)-[:ResponseTo*]-(segundo)
with max(length(p)) as lengthPath
match p1=(primero)-[:ResponseTo*]-(segundo)
where length(p1) = lengthPath
return nodes(p1)
The query result is a graph with the path nodes.
But if I tried the query
match p=(primero)-[:ResponseTo*]-(segundo)
with max(length(p)) as lengthPath
match p1=(primero)-[:ResponseTo*]-(segundo)
where length(p1) = lengthPath
return count(distinct(primero))
The result is
count(distinct(primero))
2
How can I use count(distinct()) over the node primero?
The node primero has a field called id.
You should bind at least one of those nodes, add a direction, and also consider a path limit; otherwise this is an extremely expensive query.
match p=(primero)-[:ResponseTo*..30]-(segundo)
with p order by length(p) desc limit 1
unwind nodes(p) as n
return distinct n;
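The query above returns the distinct nodes themselves. To get their number, as the question asks, the last line can aggregate instead. A sketch with the same 30-hop limit:

```cypher
// Count the distinct nodes on the longest ResponseTo path (up to 30 hops)
MATCH p = (primero)-[:ResponseTo*..30]-(segundo)
WITH p ORDER BY length(p) DESC LIMIT 1
UNWIND nodes(p) AS n
RETURN count(DISTINCT n) AS distinctNodes
```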

How to find out connection between a set of nodes?

I have a scenario where I know IDs of a list of nodes.
I need to get the connections (if any exist) between these nodes, given their IDs.
Is there any way to achieve this?
Update:
I am using the node's id property, not Neo4j's internal ID (e.g. match (n:Person {id:3})).
You can use the IN operator to select from a list of values:
MATCH (n)-[r*..2]-(m)
WHERE ID(n) IN [0,1,2] AND ID(m) IN [2,3,4]
RETURN r
I've limited the path length to 2 hops of indeterminate relationship type here, and arbitrarily picked some IDs.
To return the path instead:
MATCH p=(n)-[r*..2]-(m)
WHERE ID(n) IN [0,1,2] AND ID(m) IN [2,3,4]
RETURN p
// MATCH-based equivalent of the legacy START n=node(...) form
MATCH p = (n)-[r]-(m) // one hop; for multiple hops use (n)-[r*]-(m)
WHERE id(n) IN [1,2,3,4,5,6] AND id(m) IN [1,2,3,4,5,6] // your list of node IDs
RETURN p
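Since the update says the ids are stored in an `id` property rather than being Neo4j's internal IDs, a property-based variant may be closer to what is needed. This is a sketch only; the Person label, the id values, and the two-hop limit are assumptions taken from the examples above:

```cypher
// Sketch: paths of up to 2 hops between any two nodes whose id property is in the list
WITH [1, 2, 3] AS ids
MATCH p = (n:Person)-[*..2]-(m:Person)
WHERE n.id IN ids AND m.id IN ids
  AND n.id < m.id   // skip self-pairs and avoid returning each path in both directions
RETURN p
```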
