Trouble using OPTIONAL MATCH with a MATCH and WHERE - neo4j

I have a Cypher query that is not behaving as expected, and I'm trying to figure out why. I suspect I don't fully understand how OPTIONAL MATCH works.
The database has one (:`Person::Current`) node and one (:`Trait::Current`) node. It does not have a (:`PersonTrait::Current`) node.
If I run this query, it correctly returns a count(t) of 1:
MATCH (n:`Person::Current` {uuid: $person_id}), (t:`Trait::Current` {uuid: $trait_id})
WHERE NOT (
(n)-[:PERSON_TRAIT]->(:`PersonTrait::Current` {has: true})-[:PERSON_TRAIT]->(t) OR
(n)-[:PERSON_TRAIT]->(:`PersonTrait::Current` {has: false})-[:PERSON_TRAIT]->(t) OR
(t)-[:GIVES_TRAIT]->(:`GivesTrait::Current`)-[:GIVES_TRAIT]->(:`Trait::Current`)<-[:PERSON_TRAIT]-(:`PersonTrait::Current` {has: false})<-[:PERSON_TRAIT]-(n)
)
RETURN count(t) as res
When a (:`PersonTrait::Current`) node is added to the database in the form
(:`Person::Current`)-[:PERSON_TRAIT]->(:`PersonTrait::Current` {has: true})-[:PERSON_TRAIT]->(:`Trait::Current`)
my query correctly returns a count(t) of 0.
However, if I try to DRY up the query by making use of OPTIONAL MATCH, like so:
MATCH (n:`Person::Current` {uuid: $person_id}), (t:`Trait::Current` {uuid: $trait_id})
OPTIONAL MATCH (pt:`PersonTrait::Current`)
WHERE NOT (
((n)-[:PERSON_TRAIT]->(pt)-[:PERSON_TRAIT]->(t) AND exists(pt.has)) OR
(t)-[:GIVES_TRAIT]->(:`GivesTrait::Current`)-[:GIVES_TRAIT]->(:`Trait::Current`)<-[:PERSON_TRAIT]-(pt {has: false})<-[:PERSON_TRAIT]-(n)
)
RETURN count(t) as res
then the query incorrectly returns a count(t) of 1 when a (:`PersonTrait::Current`) node is added to the database in the form
(:`Person::Current`)-[:PERSON_TRAIT]->(:`PersonTrait::Current` {has: true})-[:PERSON_TRAIT]->(:`Trait::Current`)
Anyone know what's going wrong? The WHERE NOT clause should be filtering out (t) nodes if a (pt) node is present with the appropriate pattern.
THANKS!!!

I think the issue is in how the WHERE clause is scoped: WHERE only applies to the immediately preceding MATCH, OPTIONAL MATCH, or WITH clause.
In this case it's paired with the OPTIONAL MATCH, so rows won't be filtered out when the WHERE predicate is false. Instead, it behaves the same as if the OPTIONAL MATCH had found nothing: the variables newly introduced in the OPTIONAL MATCH (here, pt) are set to null and the row is kept.
If you want the WHERE to filter out rows, pair it with a WITH clause instead:
MATCH (n:`Person::Current` {uuid: $person_id}), (t:`Trait::Current` {uuid: $trait_id})
OPTIONAL MATCH (pt:`PersonTrait::Current`)
WITH n, t, pt
WHERE NOT (
((n)-[:PERSON_TRAIT]->(pt)-[:PERSON_TRAIT]->(t) AND exists(pt.has)) OR
(t)-[:GIVES_TRAIT]->(:`GivesTrait::Current`)-[:GIVES_TRAIT]->(:`Trait::Current`)<-[:PERSON_TRAIT]-(pt {has: false})<-[:PERSON_TRAIT]-(n)
)
RETURN count(t) as res
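To see the difference in isolation, here is a minimal sketch on a hypothetical graph. The Person and Movie labels, the LIKES relationship, and the name, title and year properties are assumptions made up for illustration; they are not part of the question's data model.

```cypher
// WHERE attached to OPTIONAL MATCH: it only narrows what the optional pattern may bind.
// Every Person row is kept; m is null when no liked movie passes the predicate.
MATCH (n:Person)
OPTIONAL MATCH (n)-[:LIKES]->(m:Movie)
WHERE m.year > 2000
RETURN n.name, m.title;

// WHERE attached to WITH: it filters rows.
// Rows where m is null or m.year <= 2000 are dropped
// (null > 2000 evaluates to null, which WHERE treats as not true).
MATCH (n:Person)
OPTIONAL MATCH (n)-[:LIKES]->(m:Movie)
WITH n, m
WHERE m.year > 2000
RETURN n.name, m.title;
```

The only textual difference between the two statements is the WITH n, m line, which is what turns the predicate from a constraint on the optional pattern into a row filter.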

Related

Neo4j query for getting multiple connected nodes

In my graph, I want to get the first-degree, second-degree, and third-degree neighbors of a certain node. If my graph is A -> B -> C -> D -> E, then
first-degree neighbor of C is B
second-degree neighbor of C is A
third-degree neighbor of C is none
When checking neighbors, I go in the reverse direction of the edge. To get these nodes, I wrote the following query.
MATCH (changedNode: Function) WHERE changedNode.signature IN [...]
MATCH (neig1: Function)-[:CALLS]->(changedNode)
MATCH (neig2: Function)-[:CALLS]->(neig1)
MATCH (neig3: Function)-[:CALLS]->(neig2)
RETURN DISTINCT neig1.functionName, neig2.functionName, neig3.functionName
I realized that this query does not return B as the first-degree neighbor of C, since A does not have any neighbors (neig3 is empty). In other words, this query requires a node to have a third-degree neighbor. I understand the cause but could not work out how to fix it. How should I revise my query?
You can use OPTIONAL MATCH, since A may not have a neighbor. The query will then return a null value for neig3.
MATCH (changedNode: Function) WHERE changedNode.signature IN [...]
MATCH (neig1: Function)-[:CALLS]->(changedNode)
OPTIONAL MATCH (neig2: Function)-[:CALLS]->(neig1)
OPTIONAL MATCH (neig3: Function)-[:CALLS]->(neig2)
RETURN DISTINCT neig1.functionName, neig2.functionName, neig3.functionName
OPTIONAL MATCH matches patterns against your graph database, just like MATCH does. The difference is that if no matches are found, OPTIONAL MATCH uses null for the missing parts of the pattern. OPTIONAL MATCH can be considered the Cypher equivalent of an outer join in SQL.
MATCH (changedNode: Function) WHERE changedNode.signature IN [...]
MATCH (neig1: Function)-[:CALLS]->(changedNode)
OPTIONAL MATCH (neig2: Function)-[:CALLS]->(neig1)
OPTIONAL MATCH (neig3: Function)-[:CALLS]->(neig2)
RETURN DISTINCT neig1.functionName, neig2.functionName, neig3.functionName
For more explanation, visit: https://neo4j.com/developer/kb/a-note-on-optional-matches/
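As an alternative sketch (not taken from the answers above), a variable-length pattern can return the neighbors of every degree from 1 to 3 without chaining OPTIONAL MATCH clauses. The $signatures parameter is a hypothetical stand-in for the list elided in the question.

```cypher
// Walk 1 to 3 CALLS hops backwards from the changed node.
// length(path) is the number of hops, i.e. the neighbor's degree.
MATCH (changedNode:Function)
WHERE changedNode.signature IN $signatures
MATCH path = (neig:Function)-[:CALLS*1..3]->(changedNode)
RETURN DISTINCT neig.functionName AS neighbor, length(path) AS degree
ORDER BY degree;
```

This returns one row per neighbor and degree rather than one column per degree, and a function reachable over paths of different lengths shows up once for each length.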

OPTIONAL MATCH returns no path for disconnected nodes

I find it weird that, when using OPTIONAL MATCH, nodes that don't have the expected relationship are not returned as a single node in the path.
OPTIONAL MATCH p = (:Person) -[:LIKES]- (:Movie)
UNWIND nodes(p) as n
UNWIND rels(p) as e
WITH n
WHERE HEAD(LABELS(n)) = "Person"
return COUNT(DISTINCT n)
The number of people returned only includes those who liked a movie. By using OPTIONAL I would have expected all people to be returned.
Is there a workaround for this, or am I doing something wrong in the query?
A better way to go about this would be to match all :Person nodes first, then use OPTIONAL MATCH to match the movies (or, if you want a collection of the movies they liked, use a pattern comprehension).
If you do need to perform an UNWIND on an empty collection without wiping out the row, wrap it in a CASE that substitutes a single-element list for the empty list.
MATCH (n:Person) // match all persons
OPTIONAL MATCH p = (n) -[:LIKES]- (m:Movie) // p and m are the optionals
UNWIND CASE WHEN p is null THEN [null] ELSE nodes(p) END as nodes // already have n, using a different variable
UNWIND CASE WHEN p is null THEN [null] ELSE rels(p) END as e // forcing a single element list means UNWIND won't wipe out the row
WITH n
WHERE HEAD(LABELS(n)) = "Person" // not really needed at all, and bad practice, you don't know the order of the labels on a node
return COUNT(DISTINCT n) // if this is really all you need, just keep the first match and the return of the query (without distinct), don't need anything else
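The pattern comprehension mentioned above could look roughly like this. It is a sketch that assumes a Neo4j version with pattern comprehensions and hypothetical name and title properties on the nodes.

```cypher
// One row per Person; people who like no movie get an empty list
// instead of disappearing from the result.
MATCH (n:Person)
RETURN n.name AS person,
       [(n)-[:LIKES]-(m:Movie) | m.title] AS likedMovies;
```

Because the comprehension is evaluated per row and never removes rows, no UNWIND or CASE workaround is needed here.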

Neo4j Cypher query - collect elements from 2 different variables to single list

I have a following part of Cypher query:
MATCH (ch:Characteristic) WHERE id(ch) = {characteristicId} WITH ch OPTIONAL MATCH (ch)<-[:SET_ON]-(v:Value)...
First of all, I look up (ch:Characteristic) by characteristicId and then apply the required logic to this variable in the rest of my query.
My Characteristic may also have child Characteristic nodes, like:
(ch:Characteristic)-[:CONTAINS]->(childCh)
Please help me extend my query to collect ch and childCh into a single list of Characteristic nodes, so that the rest of my query can apply the required logic to every Characteristic in that list.
UPDATED - possible solution #2
This is my current working query:
MATCH (chparent:Characteristic)
WHERE id(chparent) = {characteristicId}
OPTIONAL MATCH (chparent)-[:CONTAINS*]->(chchild:Characteristic)
WITH chparent, collect(distinct(chchild)) as childs
WITH childs + chparent as nodes
UNWIND nodes as ch
OPTIONAL MATCH (ch)<-[:SET_ON]-(v:Value)-[:SET_FOR]->(Decision)
OPTIONAL MATCH (v)-[:CONTAINS]->(vE) OPTIONAL MATCH (vE)-[:CONTAINS]->(vEE)
OPTIONAL MATCH (ch)-[:CONTAINS]->(cho:CharacteristicOption)
OPTIONAL MATCH (cho)-[:CONTAINS]->(choE) OPTIONAL MATCH (ch)-[:CONTAINS]->(chE)
DETACH DELETE choE, cho, ch, vEE, vE, v, chE
This is an attempt to simplify the query above:
MATCH (ch:Characteristic)
WHERE (:Characteristic {id: {characteristicId}})-[:CONTAINS*]->(ch)
OPTIONAL MATCH (ch)<-[:SET_ON]-(v:Value)-[:SET_FOR]->(Decision)
OPTIONAL MATCH (v)-[:CONTAINS]->(vE)
OPTIONAL MATCH (vE)-[:CONTAINS]->(vEE)
OPTIONAL MATCH (ch)-[:CONTAINS]->(cho:CharacteristicOption)
OPTIONAL MATCH (cho)-[:CONTAINS]->(choE)
OPTIONAL MATCH (ch)-[:CONTAINS]->(chE)
DETACH DELETE choE, cho, ch, vEE, vE, v, chE
but this query doesn't delete the required Characteristic nodes and my tests fail. What am I doing wrong in the last query?
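You can try something like this with APOC:
MATCH (chparent:Characteristic {characteristicId: <someid>})
OPTIONAL MATCH (chparent)-[:CONTAINS]->(chchild:Characteristic)
// apoc.coll.union takes two lists, so collect the children and wrap the parent in a list
WITH chparent, collect(chchild) AS childs
WITH apoc.coll.union([chparent], childs) AS distinctList
...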
With pure cypher you can try something like this:
MATCH (chparent:Characteristic {characteristicId: <someid>})
OPTIONAL MATCH (chparent)-[:CONTAINS]->(chchild:Characteristic)
WITH chparent,collect(distinct(chchild)) as childs
WITH chparent + childs as list
....
Not really sure if you need distinct in collect, but I added just so you know you can do this to filter out duplicates.
You can actually do this easily by using a variable-length relationship match of 0..1, as it will let you match on your root :Characteristic node and any of its children.
MATCH (chparent:Characteristic)-[:CONTAINS*0..1]->(ch:Characteristic)
WHERE id(chparent) = {characteristicId}
// ch contains both the parent and children, no need for a list
...
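If the children can be nested more than one level deep, as in the asker's working query with [:CONTAINS*], the same idea can be sketched with an open-ended lower bound of zero. This is only an illustration: it uses the newer $characteristicId parameter syntax instead of {characteristicId}, and the setValues alias is made up.

```cypher
// *0.. matches the parent itself (zero hops) plus descendants at any depth.
MATCH (chparent:Characteristic)-[:CONTAINS*0..]->(ch:Characteristic)
WHERE id(chparent) = $characteristicId
// A descendant reachable over several paths would otherwise appear more than once.
WITH DISTINCT ch
OPTIONAL MATCH (ch)<-[:SET_ON]-(v:Value)
RETURN ch, collect(v) AS setValues;
```

This also suggests why the simplified query in the question misses the root node: [:CONTAINS*] without a lower bound requires at least one hop.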
A simpler query:
MATCH (c:Characteristic) WHERE (:Characteristic {id: 123})-[:CONTAINS*0..1]->(c) RETURN c;
This matches the root Characteristic node with id 123 itself (zero hops) together with every Characteristic that has an incoming CONTAINS relationship from it (one hop).
I assume all the children of a Characteristic also have the label Characteristic. I also assume that you need the characteristicId that you define yourself, not the internal id assigned by Neo4j: id(ch) fetches the internal id rather than the user-defined one. You might want to pass characteristicId as a property, as in the query below.
MATCH (chparent:Characteristic {characteristicId: <someid>})
WITH chparent
OPTIONAL MATCH (chparent)-[:CONTAINS]->(chchild:Characteristic)
WITH chchild
<your operation>

Neo4j get labels on an optional match

I have the following example cypher:
MATCH (n)
OPTIONAL MATCH (n)-[:likes]->(p)
RETURN n, p, labels(p)
This works great if the OPTIONAL MATCH returns a non-null value. However, if the OPTIONAL MATCH finds nothing, this fails. Is there a way to return labels(p) if p exists, and null otherwise?
First things first: I think you probably want to narrow down what n matches with some criteria and an index. But to answer your question, coalesce is your friend.
MATCH (n)
OPTIONAL MATCH (n)-[:likes]->(p)
RETURN n
, coalesce(p, 'nobody')
, coalesce(labels(p),'nothing')
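If you want a literal null rather than a placeholder string, as the question asks, an explicit CASE is another option. This is a sketch; the pLabels alias is made up.

```cypher
MATCH (n)
OPTIONAL MATCH (n)-[:likes]->(p)
// Only call labels() when the optional node was actually found.
RETURN n, p,
       CASE WHEN p IS NULL THEN null ELSE labels(p) END AS pLabels;
```

Unlike the coalesce version, this keeps the column type consistent: it is either a list of labels or null, never a string.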

Multiple Outer Joins / OPTIONAL MATCH without a hierarchy in neo4j

I am trying to match multiple outer joins on the same level in neo4j.
My database consists of users and counts of the up and down ratings they have in common on articles. The rating counts are stored on separate edges between the users, one for up ratings and one for down ratings.
----------                                   -----------
| User n | -[:rating_positive {weight}]->    | User n2 |
----------                                   -----------
    |                                             ^
    \--------[:rating_negative {weight}]---------/
Now I want to produce edges that sum up these ratings.
I would love to use multiple optional matches to do so, e.g.:
MATCH (n:`User`)
OPTIONAL MATCH (n:`User`)-[rating_positive:rating_positive]-(n2:`User`)
OPTIONAL MATCH (n:`User`)-[rating_negative:rating_negative]-(n2:`User`)
RETURN n.uid, n2.uid, rating_positive.weight, rating_negative.weight
But in this example I get all users without any positive ratings and those with both positive and negative ratings, but none with only negative ratings. So there seems to be a sequence in OPTIONAL MATCH.
If I swap the order of the OPTIONAL MATCH clauses, I get those with only negative ratings but not those with only positive ratings.
So is OPTIONAL MATCH somehow a sequence, where I only get something from the second clause when the first one is met, and so on?
Is there a workaround?
Neo4j Version is 2.1.3.
P.S.:
Even more confusing, matching against NULL does not seem to work. So this query:
MATCH (n:`User`)
OPTIONAL MATCH (n:`User`)-[rating_positive:rating_positive]-(n2:`User`)
OPTIONAL MATCH (n:`User`)-[rating_negative:rating_negative]-(n2:`User`)
WHERE rating_positive IS NULL AND rating_negative IS NOT NULL
RETURN n.uid, n2.uid, rating_positive.weight, rating_negative.weight
gives me lots of edges with a NULL rating_negative and a non-NULL rating_positive. I don't know what is happening with the null matching in WHERE.
Anyway I found a way to recode the nulls to 0 values using "coalesce":
MATCH (n:`User`)
OPTIONAL MATCH (n:`User`)-[rating_positive:rating_positive]-(n2:`User`)
OPTIONAL MATCH (n:`User`)-[rating_negative:rating_negative]-(n2:`User`)
WITH n, n2, coalesce(rating_positive.weight, 0) AS rating_positive, coalesce(rating_negative.weight, 0) as rating_negative
WHERE rating_positive = 0 AND rating_negative > 0
RETURN n.uid, n2.uid, rating_positive, rating_negative
With this query it works as expected.
I believe more than the sequencing of the optional match, it's the fact that you've bound n2.
So the next optional match is restricted to only match the nodes identified to be candidates for n2 in the previous match. And so it appears that the order of the optional match influences it.
If you take a look at a small sample graph I set up here http://console.neo4j.org/r/lrp55o , the following query
MATCH (n:User)
OPTIONAL MATCH (n)-[:rating_negative]->(n2)
OPTIONAL MATCH (n)-[:rating_positive]->(n2)
RETURN n,n2
returns B-[:rating_negative]->C and C-[:rating_negative]->D but it leaves out A-[:rating_positive]->B.
The first optional match for rating_negative bound C and D as nodes for "n2". The second optional match found no n which has a rating_positive to C or D and hence the results.
I'm a bit unclear about what you are trying to do with the query and null checks but a union would be one way to give you all the positive and negative relations (which you can add your filters to):
MATCH (n:User)
OPTIONAL MATCH (n)-[rating:rating_negative]->(n2)
RETURN n,n2,rating.weight
UNION ALL
MATCH (n:User)
OPTIONAL MATCH (n)-[rating:rating_positive]->(n2)
RETURN n, n2, rating.weight
If this is not what you're looking for, a small subgraph at http://console.neo4j.org?init=0 would be great to help you further.
EDIT: Since comments indicated that the sum of ratings was required between a pair of users, the following query does the job:
MATCH (u:User)-[rating:rating_positive|:rating_negative]->(u2)
RETURN u,u2,sum(rating.weight)
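If the positive and negative totals are needed as separate columns for each pair of users, a conditional sum over the relationship type is one way to sketch it. The positive and negative aliases are made up, and the query assumes every rating edge carries a weight property as in the question.

```cypher
// Match either rating type once, so neither one constrains the other,
// then split the weights by relationship type while aggregating.
MATCH (u:User)-[rating:rating_positive|rating_negative]->(u2:User)
RETURN u.uid, u2.uid,
       sum(CASE WHEN type(rating) = 'rating_positive' THEN rating.weight ELSE 0 END) AS positive,
       sum(CASE WHEN type(rating) = 'rating_negative' THEN rating.weight ELSE 0 END) AS negative;
```

Pairs that only have a negative rating appear with positive = 0, and vice versa, which avoids the order dependence of the two chained OPTIONAL MATCH clauses.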
I can't be entirely sure whether this is what is causing your problem but it appears to me that you should be omitting the labels in the OPTIONAL MATCH clauses.
Perhaps try the query below
MATCH (n:`User`)
OPTIONAL MATCH (n)-[rating_positive:rating_positive]-(n2:`User`)
OPTIONAL MATCH (n)-[rating_negative:rating_negative]-(n2)
RETURN n.uid, n2.uid, rating_positive.weight, rating_negative.weight
It may also be worth including the relationship directions.
MATCH (n:`User`)
OPTIONAL MATCH (n)-[rating_positive:rating_positive]->(n2:`User`)
OPTIONAL MATCH (n)-[rating_negative:rating_negative]->(n2)
RETURN n.uid, n2.uid, rating_positive.weight, rating_negative.weight
