cypher multiple match is very slow - neo4j

I want to match the nodes that don't satisfy a condition, but the query is very slow. How can I optimize it? There are about 70,000 nodes in the database.
match (n:A)--(p:B)--(q:C)
with collect(n) as nc
match (m) where not m in nc
return count(m)

There is no need to first get all the nodes that match an initial pattern. It is enough to go through all the nodes and check them:
MATCH (m) WHERE NOT (m)--(:B)--(:C)
RETURN count(m)
Or if the condition on the label 'A' is important:
MATCH (m)
OPTIONAL MATCH p = (m)--(:B)--(:C)
WITH m WHERE p IS NULL OR NOT m:A
RETURN count(DISTINCT m)
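If I read the original query correctly, the label check and the path check can also be folded into a single predicate, which avoids both the collected list and the OPTIONAL MATCH. A minimal sketch, assuming the goal is to count every node except the :A nodes that sit on an A--B--C path:

```cypher
// One scan over all nodes; the path is only tested for nodes labelled :A.
MATCH (m)
WHERE NOT (m:A AND (m)--(:B)--(:C))
RETURN count(m)
```

Because the pattern is used as a predicate, each node is counted at most once no matter how many matching paths it has.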

Related

How to set a count(variable) as a property of a node

I'm currently trying to get the count of all movies that each actor has acted in (neo4j example movie database), and then set that as a num_movies_acted attribute for the person node.
So far, I'm able to get a list of all the actors and their respective movie count (including if it's 0 because of the OPTIONAL MATCH)
This is what I have:
MATCH (p:Person)
OPTIONAL MATCH (p)-[:ACTED_IN]->(m:Movie)
RETURN p.name as name, count(m) as num_movies_acted
How would I then set that into the Person Node? I know I should use something like:
SET p.num_movies_acted = count(m), but that fails to work.
Invalid use of aggregating function count(...) in this context (line 3, column 26 (offset: 84))
"SET p.num_movies_acted = count(m)"
EDIT: Would this work?
MATCH (p:Person)
OPTIONAL MATCH (p)-[:ACTED_IN]->(m:Movie)
WITH p, count(m) as num_movies_acted
SET p.num_movies_acted = num_movies_acted
RETURN p
since I am "storing" the count(m) into a variable first
MATCH (p:Person)
OPTIONAL MATCH (p)-[:ACTED_IN]->(m:Movie)
RETURN p.name as name, count(m) as num_movies_acted
This query returns a table of rows, one count per person; the aggregate count(m) cannot be used directly in SET to assign a property on an individual node.
EDIT: Would this work?
MATCH (p:Person)
OPTIONAL MATCH (p)-[:ACTED_IN]->(m:Movie)
WITH p, count(m) as num_movies_acted
SET p.num_movies_acted = num_movies_acted
RETURN p
Yes, this works fine: the WITH clause counts the Movie nodes for each Person node, and SET then stores that count as a property.
You can also try:
MATCH (p:Person)
OPTIONAL MATCH (p)-[r:ACTED_IN]->(m:Movie)
WITH p, count(r) as num_movies_acted
SET p.num_movies_acted = num_movies_acted
RETURN p
This is a note for anyone who needs the tree itself along with the aggregated property.
I needed the tree (Person-[ACTED_IN]->Movie) together with p.num_movies_acted,
so I ended up with this Cypher query:
MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
WHERE m.released > 2000
WITH p, count(r) as num_movies_acted
MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
WHERE m.released > 2000
SET p.num_movies_acted = num_movies_acted
RETURN p,r,m
I hit the same aggregation error, so I worked around it by matching the pattern a second time.
I'm not confident in this query, so please let me know if there is a more efficient one.
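One possible tightening, offered as a sketch rather than a benchmarked answer: run SET right after the aggregation, so the property is written once per person instead of once per returned row, and only then re-expand the tree. It assumes the same movie-database labels and the released > 2000 filter used above.

```cypher
// Aggregate first: one row per person.
MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
WHERE m.released > 2000
WITH p, count(r) AS num_movies_acted
// Write the property once per person.
SET p.num_movies_acted = num_movies_acted
// A WITH is required between SET and the next MATCH.
WITH p
MATCH (p)-[r:ACTED_IN]->(m:Movie)
WHERE m.released > 2000
RETURN p, r, m
```

The pattern is still matched twice, but the write no longer repeats for every movie.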

Match all nodes and return nodes + relationships

In the latest version of Cypher, I can use this query to get all nodes with relationships:
MATCH (n)-[r]-(m) RETURN n,r,m
However, I'm missing nodes without any relationships.
In trying to query the missing nodes, this attempt gives me the error: Variable 'r' not defined
MATCH (n) WHERE NOT (n)-[r]->() RETURN n
And, this attempt shows zero results:
MATCH (n)-[r]->() WHERE r is null RETURN n
I can see the stragglers with:
MATCH (n) RETURN n
But, then I'm missing the relationships.
How do I phrase my query to find all nodes and all relationships without duplicates?
You can try the OPTIONAL MATCH:
MATCH (n)
OPTIONAL MATCH (n)-[r]-(m)
RETURN n, r, m
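Note that an undirected pattern such as (n)-[r]-(m) reports each relationship twice, once from each end. Since the question asks for no duplicates, a directed variant may be closer to what is wanted. A small sketch of that, plus a separate query for just the isolated nodes:

```cypher
// Every node, with each of its outgoing relationships (r and m are null if it has none).
// Each relationship appears exactly once, on the row of its start node.
MATCH (n)
OPTIONAL MATCH (n)-[r]->(m)
RETURN n, r, m;

// Only the nodes that have no relationships at all.
MATCH (n)
WHERE NOT (n)--()
RETURN n;
```

A pattern used as a predicate in WHERE cannot introduce a new variable, which is why the attempt with -[r]-> inside NOT failed.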

How do I write this Neo4j Query?

There are three node types: A, B and C.
I need all the A's and B's and only the C's that participate in exactly one relationship.
match (n)
where n:A or n:B or (n:C)-[]-()
with count(n) as countOfRels
where countOfRels > 0
return n
Not close, I know. I'm not sure where to go from here.
It's a bit strange that A, B and C do not seem to be related, but here's how you could solve your question for C:
MATCH (n:C)
WHERE size((n)-[]-()) = 1
RETURN n
UNION
MATCH (n:A)
RETURN n
UNION
MATCH (n:B)
RETURN n;
Hope this helps.
Regards,
Tom
You can use this:
MATCH (n)
OPTIONAL MATCH (n)-[r]-()
WITH n, count(r) AS countOfRels
WHERE n:A OR n:B OR (n:C AND countOfRels = 1)
RETURN n
Hope this helps.
You can use MATCH (a)--() WHERE NOT ()--(a)--() to match nodes with exactly one relationship. After that, you can use UNION or COLLECT() + UNWIND to combine the separate queries into a single result set.
// using Union
MATCH (n:C)--()
WHERE NOT ()--(n)--()
RETURN n
UNION
MATCH (n:A)
RETURN n
UNION
MATCH (n:B)
RETURN n;
// Using collect
OPTIONAL MATCH (a:A)
OPTIONAL MATCH (b:B)
OPTIONAL MATCH (c:C)--() WHERE NOT ()--(c)--()
WITH COLLECT(a)+COLLECT(b)+COLLECT(c) as nodez
UNWIND nodez as n
RETURN DISTINCT n
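For readers on newer servers: as far as I know, size() no longer accepts a pattern from Neo4j 5 onward, and a COUNT subquery expression takes its place. A single-query sketch under that assumption, using the same labels A, B and C:

```cypher
// All A and B nodes, plus the C nodes that have exactly one relationship.
MATCH (n)
WHERE n:A OR n:B OR (n:C AND COUNT { (n)--() } = 1)
RETURN n
```

This avoids the UNION branches, at the cost of scanning all nodes rather than using one label scan per branch.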

Cypher : Return Nodes that matched along with Nodes that didn't match

I have labels A, B, and Z; A and B each have their own relationships to Z. I have this query:
MATCH (a:A)
MATCH (b:B { uuid: {id} })
MATCH (a)-[:rel1]->(z:Z)<-[:rel2]-(b)
WITH a, COLLECT(z) AS matched_z
RETURN DISTINCT a, matched_z
It returns the A nodes along with all the Z nodes that are related to both A and B.
I'm stuck on trying to ALSO return a separate array of the Z nodes that B is related to but A is not (i.e. missing_z). I am attempting to first run a query that returns all the relationships between B and Z, and then use its result in a second query:
results = MATCH (b:B { uuid: {id} })
MATCH (b)-[:rel2]->(z:Z)
RETURN DISTINCT COLLECT(z.uuid) AS z
MATCH (a:A)
MATCH (b:B { uuid: {id} })
MATCH (a)-[:rel1]->(z:Z)<-[:rel2]-(b)
WITH a, COLLECT(z) AS matched_z, z
RETURN DISTINCT a, matched_z, filter(skill IN z.array WHERE NOT z.uuid IN {results}) AS missing_z
The results seem to have nil for missing_z where one would assume it should be populated. Not sure if filter is the correct way to go with a WHERE NOT / IN scenario. Can the above 2 queries be combined into 1?
The hard part here, in my opinion, is that any failed match will drop everything you have matched so far. But your starting point seems to be "all Z related by B.uuid", so start by collecting that and filtering/copying from there.
Use WITH + aggregation functions to copy and filter columns.
Use OPTIONAL MATCH if a failure to match shouldn't drop already collected rows.
If I understand what you are trying to do well enough, this Cypher should do the job; adjust it as needed (let me know if you need help understanding or adapting any part of it):
// Match base set
MATCH (z:Z)<-[:rel2]-(b:B { uuid: {id} })
// Collect into single list
WITH COLLECT(z) as zs
// Match all A (ignore relation to Zs)
MATCH (a:A)
// For each a, return a, the sub-list of Zs related to a, and the sub-list of Zs not related to a
RETURN a as a, FILTER(n in zs WHERE (a)-[:rel1]->(n)) as matched, FILTER(n in zs WHERE NOT (a)-[:rel1]->(n)) as unmatched
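On Neo4j 4.0 and later, FILTER() and the {id} parameter syntax are no longer available. The same idea can be written with list comprehensions and $id. This is a sketch of the equivalent query, not something I have run against the asker's data:

```cypher
// Collect every Z related to the chosen B into one list.
MATCH (z:Z)<-[:rel2]-(b:B { uuid: $id })
WITH collect(z) AS zs
// For each A, split that list by whether A has a rel1 to the Z node.
MATCH (a:A)
RETURN a,
       [n IN zs WHERE (a)-[:rel1]->(n)] AS matched,
       [n IN zs WHERE NOT (a)-[:rel1]->(n)] AS unmatched
```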
This query might do what you want:
MATCH (z:Z)<-[:rel2]-(b:B { uuid: {id} })
WITH COLLECT(z) as all_zs
UNWIND all_zs AS z
MATCH (a:A)-[:rel1]->(z)
WITH all_zs, COLLECT(DISTINCT z) AS matched_zs
RETURN matched_zs, apoc.coll.subtract(all_zs, matched_zs) AS missing_zs;
It first stores in the all_zs variable all the Z nodes that have a rel2 relationship from b. This collection's contents remain unaffected even if the second MATCH clause matches a subset of those Z nodes.
It then stores in matched_zs the distinct all_zs nodes that have a rel1 relationship from any A node.
Finally, it returns:
the matched_zs collection, and
the unique nodes from all_zs that are not also in matched_zs, as missing_zs.
The query uses the convenient APOC function apoc.coll.subtract to generate the latter return value.
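If APOC is not installed, the subtraction can be approximated with a plain list comprehension. The sketch below also switches to OPTIONAL MATCH so that a Z node without any matching A is kept as a row rather than dropped; both changes are my assumptions about what is wanted, and $id is the newer parameter syntax.

```cypher
MATCH (z:Z)<-[:rel2]-(:B { uuid: $id })
WITH collect(z) AS all_zs
UNWIND all_zs AS z
// Keep z even when no A node points at it.
OPTIONAL MATCH (a:A)-[:rel1]->(z)
// CASE yields null when a is missing, and collect() skips nulls.
WITH all_zs, collect(DISTINCT CASE WHEN a IS NOT NULL THEN z END) AS matched_zs
RETURN matched_zs,
       [z IN all_zs WHERE NOT z IN matched_zs] AS missing_zs
```

Unlike apoc.coll.subtract, the comprehension does not de-duplicate, so missing_zs can repeat a node if all_zs does.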

How to enumerate nodes and relationships along path returned via Cypher

I opened this question here: How to find specific subgraph in Neo4j using where clause to find a path of a certain criteria. Yet when I try to do things like get the relationship type I cannot.
For example, I tried:
MATCH p = (n:Root)-[rs1*]->()
WHERE ALL(rel in rs1 WHERE rel.relevance is null)
RETURN nodes(p), TYPE(relationships(p))
But I get the error:
Type mismatch: expected Relationship but was Collection<Relationship>
I think I need to use a WITH clause but not sure.
Similarly I wanted the ID of a node but that also failed.
The problem is that relationships returns a collection and the type function only works on a single relationship. There are two main approaches to solve this.
Use UNWIND to get a separate row for each relationship:
MATCH p = (n:Root)-[rs1*]->()
WHERE ALL(rel in rs1 WHERE rel.relevance is null)
WITH n, relationships(p) AS rs
UNWIND rs AS r
RETURN n, type(r)
Use extract to get the results in a list (a single row per matched path):
MATCH p = (n:Root)-[rs1*]->()
WHERE ALL(rel in rs1 WHERE rel.relevance is null)
WITH n, relationships(p) AS rs
RETURN n, extract(r IN rs | type(r))
Or even shorter:
MATCH p = (n:Root)-[rs1*]->()
WHERE ALL(rel in rs1 WHERE rel.relevance is null)
RETURN n, extract(r IN relationships(p) | type(r))
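extract() was removed in Neo4j 4.0, where a list comprehension does the same job. The sketch below also covers the node IDs mentioned in the question. The column names node_ids and rel_types are just illustrative.

```cypher
MATCH p = (n:Root)-[rs1*]->()
WHERE all(rel IN rs1 WHERE rel.relevance IS NULL)
// Map each element of the path's node and relationship lists to a scalar.
RETURN [x IN nodes(p) | id(x)] AS node_ids,
       [r IN relationships(p) | type(r)] AS rel_types
```

On Neo4j 5, id() is deprecated in favour of elementId(), which can be swapped in the same position.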

Resources