Difference between a single (OPTIONAL) MATCH clause and multiple - neo4j

What is the difference between
OPTIONAL MATCH clauseA, clauseB
and
OPTIONAL MATCH clauseA
OPTIONAL MATCH clauseB
I get different behavior depending on which form I use.
For example:
START n=node(111)
OPTIONAL MATCH n<-[links_n_in]-(n_from),n-[links_n_out]->(n_to)
RETURN n,COLLECT(n_from) AS n_from,COLLECT(links_n_in) AS links_n_in,COLLECT(n_to) AS n_to,COLLECT(links_n_out) AS links_n_out
which is designed to return a node; it's incoming relationships and from nodes; it's outgoing relationship and to nodes.
I have a test graph consisting of Node 111 which has 4 outgoing relationships each of which points to the same Node (I have other test cases in which 111 points to different Nodes). Executing the query as above returns only Node 111 in column 'n'. The columns for 'n_from', 'links_n_in', 'n_to', 'links_n_out' are empty.
If I modify the query to:
START n=node(111)
OPTIONAL MATCH n<-[links_n_in]-(n_from)
OPTIONAL MATCH n-[links_n_out]->(n_to)
RETURN n,COLLECT(n_from) AS n_from,COLLECT(links_n_in) AS links_n_in,COLLECT(n_to) AS n_to,COLLECT(links_n_out) AS links_n_out
then the n_to and link_n_out columns are populated as expected.

The first form treats it as a single extended pattern that must match entirely.
The second form treats them as distinct optional patterns, and can match the two separately.
So your results make sense, when you think about what it's doing--if the whole OPTIONAL MATCH pattern isn't found, it doesn't match any of the OPTIONAL MATCH pattern.

Related

How can I define a Neo4j node property from 2 possible paths with a cypher query?

Quite new to neo4j/cypher. Im trying to return a property that is accessed by 2 different paths, depending on the Label, in this case, the Label of (n).
MATCH (k:KeyNode)<-[:BASED_ON]-(n)-[:CONTROLS|:MODIFIES]->()
WHERE id(k)=123456
//if label(n) = LabelA
OPTIONAL MATCH (n)<-[:LABEL_A_REL]-(c:Controller)-[:CONTROLS]->(r:Resource)-[:TYPE_OF]->(rt:ResourceType)
//if label(n) = NotLabelA
OPTIONAL MATCH (n)-[:LABEL_NOT_A_REL]->(r:Resource)-[:TYPE_OF]->(rt:ResourceType)
OPTIONAL MATCH (r)-[:PARENT*]->(ro:Room)
RETURN ID(r) as resourceId, ID(ro) as siteId, ID(rt) as rt:ResourceType
As is, the path defined 1st optional match and its defined nodes take precedence, leaving the 2nd opt match/path node redefinitions untouched, i assume because cypher won't redefine a variable. The goal is to get (r) and (rt) found on 2 possible paths.
I considered using CASE WHEN structure, but from the documentation I see only the option to return single properties, and not multiple (though i could be wrong)
This could be an approach:
MATCH (k:KeyNode)<-[:BASED_ON]-(n)-[:CONTROLS|:MODIFIES]->()
WHERE id(k)=123456
OPTIONAL MATCH (n)<-[:LABEL_A_REL]-(c:Controller)-[:CONTROLS]->(r1:Resource)-[:TYPE_OF]->(rt1:ResourceType)
OPTIONAL MATCH (n)-[:LABEL_NOT_A_REL]->(r2:Resource)-[:TYPE_OF]->(rt2:ResourceType)
// COALESCE to deal with precedence
WITH COALESCE(r1,r2) AS r,
COALESCE(rt1,rt2) AS rt
OPTIONAL MATCH (r)-[:PARENT*]->(ro:Room)
RETURN ID(r) as resourceId, ID(ro) as siteId, ID(rt) as rt:ResourceType

Combining collection results of multiple MATCH cypher queries

I have a graph in which, there can exist three patters of paths between (:srcType) and (:destType):
Pattern 1
(:srcType)<-[]-()<-[]-(srcParent)<-[]-(center)-[]->(destParent)-[]->()-[]->(:destType)
Notice that, here the direction of relationships reverses as path goes through (center):<-[]-(center)-[]->
Pattern 2
In this pattern (srcParent) it self is a center. Thus direction of relationships reverses across (srcParent):
(:srcType)<-[]-()<-[]-(srcParent)-[]->(destParent)-[]->()-[]->(:destType)
Pattern 3
In this pattern (destParent) it self is a center. Thus direction of relationships reverses across (destParent):
(:srcType)<-[]-()<-[]-(srcParent)<-[]-(destParent)-[]->()-[]->(:destType)
I am giving id of (:srcType) and trying to obtain all (:destType) nodes. Note that given one (:srcType) it can have one (:destType) node associated with it following first pattern, another following 2nd pattern and few more following third pattern. I am trying to retrieve single collection containing all these (:destType) nodes. So I have combined above queries as follows:
MATCH (src:srcType)<-[]-()<-[]-(srcParent)<-[]-(center)-[]->(destParent)-[]->()-[]->(dest1:destType)
WHERE id(src)=3
WITH dest1
MATCH (src:srcType)<-[]-()<-[]-(srcParent)-[]->(destParent)-[]->()-[]->(dest2:destType)
WHERE id(src)=3
WITH dest1, dest2
MATCH (src:srcType)<-[]-()<-[]-(srcParent)<-[]-(destParent)-[]->()-[]->(dest3:destType)
WHERE id(src)=3
RETURN dest1, dest2, dest3
So here I am matching each pattern one by one in MATCH clauses and feeding (:destType)s output of one MATCH to next one using WITH clause. At the end I am returning all destTypes.
Q1. But this is not executing. When I run one of the pattern (single WITH), it correctly returns whichever (:destType) that matches the path. But with above query it returns 0 rows. Why is it so?
Q2. Also instead of returning all destTypes, I want to return single collection containing elements of all of them. Knowing that collections can be merged using +, is it possible to return something like below?
RETURN destType1+destType2+destType2
Note
I will need to add different filters for each pattern afterwards. So the future query may look something like this:
MATCH (src:srcType)<-[]-()<-[]-(srcParent)<-[]-(center)-[]->(destParent)-[]->()-[]->(dest1:destType)
WHERE id(src)=3 AND srcParent.prop1='a'
WITH dest1
MATCH (src:srcType)<-[]-()<-[]-(srcParent)-[]->(destParent)-[]->()-[]->(dest2:destType)
WHERE id(src)=3 AND destParent.prop2='b'
WITH dest1, dest2
MATCH (src:srcType)<-[]-()<-[]-(srcParent)<-[]-(destParent)-[]->()-[]->(dest3:destType)
WHERE id(src)=3 AND srcParent.prop3='c'
RETURN dest1, dest2, dest3
Given that these patterns may or may not be present, and that you want a collection of all results at the end, a good approach would be to match on the src node first, then use OPTIONAL MATCHes, and collect the results along the way, adding new ones in.
If we modify your last query, it may look something like this:
MATCH (src:srcType)
WHERE id(src) = 3
OPTIONAL MATCH (src)<-[]-()<-[]-(srcParent)<-[]-(center)-[]->(destParent)-[]->()-[]->(dest1:destType)
WHERE srcParent.prop1='a'
WITH src, COLLECT(dest1) as dests
OPTIONAL MATCH (src)<-[]-()<-[]-(srcParent)-[]->(destParent)-[]->()-[]->(dest2:destType)
WHERE destParent.prop2='b'
WITH src, dests + COLLECT(dest2) as dests
OPTIONAL MATCH (src)<-[]-()<-[]-(srcParent)<-[]-(destParent)-[]->()-[]->(dest3:destType)
WHERE srcParent.prop3='c'
RETURN dests + COLLECT(dest3) as dests

Neo4j Cypher - merge columns and get distinct values from all

I've got a complex query that I'm trying to do with OPTIONAL MATCH statements. It looks like this:
MATCH (p:Person {name:'Victoria'})
OPTIONAL MATCH (p)-[:MANAGES]->(:Office)<-[MERGES_INTO*0..]-(:Office)<-[:WORKS_WITH]-(target1)
OPTIONAL MATCH (p)-[:SUPPORTS]->(:Office)<-[MERGES_INTO*0..]-(:Office)<-[:WORKS_WITH]-(target2)
OPTIONAL MATCH (p)-[:ASSISTS]->(:Person)-[*0..1]->(:Group)<--(target3)
OPTIONAL MATCH (p)-->(:Group)<--(target4)
RETURN DISTINCT target1,target2,target3,target4
What I want to do is get the results as if they were a single column called target instead of getting target1, target2, target3, and target4 back as separate columns.
Is there a way to collect/unwind the four potential target columns to return them as a single column result set?
I know I could get the desired result using a UNION of four separate queries that return a value called target, but I was wondering if there was a better way.
Thanks.
Yes, this is possible, and you're on the right track. Here's an example using collect and unwind to do this:
MATCH (p:Person {name:'Victoria'})
OPTIONAL MATCH (p)-[:MANAGES]->(:Office)<-[:MERGES_INTO*0..]-(:Office)<-[:WORKS_WITH]-(target1)
WITH p, COLLECT(target1) as target1s
OPTIONAL MATCH (p)-[:SUPPORTS]->(:Office)<-[:MERGES_INTO*0..]-(:Office)<-[:WORKS_WITH]-(target2)
WITH p, target1s, COLLECT(target2) as target2s
OPTIONAL MATCH (p)-[:ASSISTS]->(:Person)-[*0..1]->(:Group)<--(target3)
WITH p, target1s, target2s, COLLECT(target3) as target3s
OPTIONAL MATCH (p)-->(:Group)<--(target4)
WITH target1s + target2s + target3s + COLLECT(target4) AS targets
UNWIND targets AS target
RETURN DISTINCT target
Note that you should be able to collapse the :MANAGES and :SUPPORTS optional matches into a single [:MANAGES|SUPPORTS] relationship, as the rest of the path is the same.

Reusing path in multiple MATCH UNION queries cypher neo4j

I'd like to pull and combine data from several different paths that share a path at the beginning, not all of which might exist. For example, I'd like to do something like this:
MATCH (:Complex)-[:PATH]->(s:Somewhere)-[:FETCHING]->(data)
RETURN data.attribute
UNION ALL
MATCH (s)-[:OPTIONAL]->(o:OtherData)
RETURN o.attribute;
so that it doesn't retrace the path up to s. I can't actually do this, though, because UNION separates queries and the (s)-[:OPTIONAL] in the second part will match anything with an outgoing OPTIONAL relation; the s is a loose handle.
Is there a better way of doing this than repeating the path:
MATCH (:Complex)-[:PATH]->(s:Somewhere)-[:FETCHING]->(data)
RETURN data.attribute
UNION ALL
MATCH (:Complex)-[:PATH]->(s:Somewhere)-[:OPTIONAL]->(o:OtherData)
RETURN o.attribute;
I made a few attempts using WITH, but they all either caused the query to return nothing if any part failed, or I could not get them to line up into a single column and instead got rows with redundant data, or (with multiple, nested WITHs, which I'm not sure about the scoping of) just fetching everything.
Have you looked at the semantics of an optional match? So you can match to s, beyond s and your optional component. Something like:
MATCH (:Complex)-[:PATH]->(s:Somewhere)
MATCH (s)-[:FETCHING]->(data)
OPTIONAL MATCH (s)-[:OPTIONAL]->(otherData)
RETURN data.attribute, otherData.attribute
Sorry I missed the importance of a single column, is it really important?
You can gather the vaues into a single collection :
MATCH (:Complex)-[:PATH]->(s:Somewhere)
MATCH (s)-[:FETCHING]->(data)
OPTIONAL MATCH (s)-[:OPTIONAL]->(otherData)
RETURN [data.attribute] + COLLECT(otherData.attribute)
But doesn't this work for a single column:
MATCH (:Complex)-[:PATH]->(s:Somewhere)
MATCH (s)-[:FETCHING]->(data)
OPTIONAL MATCH (s)-[:OPTIONAL]->(otherData)
WITH [data.attribute] + COLLECT(otherData.attribute) as col
RETURN UNWIND col AS val

Difference between START n = node(*) and MATCH (n)

In Neo4j 2.0 this query:
MATCH (n) WHERE n.username = 'blevine'
OPTIONAL MATCH n-[:Person]->person
OPTIONAL MATCH n-[:UserLink]->role
RETURN n AS user,person,collect(role) AS roles
returns different results than this query:
START n = node(*) WHERE n.username = 'blevine'
OPTIONAL MATCH n-[:Person]->person
OPTIONAL MATCH n-[:UserLink]->role
RETURN n AS user,person,collect(role) AS roles
The first query works as expected returning a single Node for 'blevine' and the associated Nodes mentioned in the OPTIONAL MATCH clauses. The second query returns many more Nodes which do not even have a username property. I realize that start n = node(*) is not recommended and that START is not even required in 2.0. But the second form (with OPTIONAL MATCH replaced with question marks on the relationship type) worked prior to 2.0. In the second form, why is 'n' not being constrained to the single 'blevine' node by the first WHERE clause?
To run the second query as expected you would just need to add WITH n. In your query you would need to filter the result and pass it for optional match which is to be done using WITH
START n = node(*) WHERE n.username = 'blevine'
WITH n
OPTIONAL MATCH n-[:Person]->person
OPTIONAL MATCH n-[:UserLink]->role
RETURN n AS user,person,collect(role) AS roles
From the documentation
WHERE defines the MATCH patterns in more detail. The predicates are part of the
pattern description, not a filter applied after the matching is done.
This means that WHERE should always be put together with the MATCH clause it belongs to.
when you do start n=node(*) where n.name="xyz" you need to pass the result explicitly into your next optional matches. But when you do MATCH (n) WHERE n.name="xyz" this tells graph specifically what node to start looking into.
EDIT
Here is the thing. The documentation says Optional Match returns null if a pattern is not found so in your first case, it includes all those results too where n.username property is null or cases where n doesnt even have a relationship suggested in the OPTIONAL MATCH pattern. So when you do a WITH n , the graph is explicitly told to use only n.
Excerpt from the documentation (link : here)
OPTIONAL MATCH matches patterns against your graph database, just like MATCH does.
The difference is that if no matches are found, OPTIONAL MATCH will use NULLs for
missing parts of the pattern. OPTIONAL MATCH could be considered the Cypher
equivalent of the outer join in SQL.
Either the whole pattern is matched, or nothing is matched. Remember that
WHERE is part of the pattern description, and the predicates will be
considered while looking for matches, not after. This matters especially
in the case of multiple (OPTIONAL) MATCH clauses, where it is crucial to
put WHERE together with the MATCH it belongs to.
Also few more things to note about the behaviour of WHERE clause: here
Excerpts:
WHERE is not a clause in it’s own right — rather, it’s part of MATCH,
OPTIONAL MATCH, START and WITH.
In the case of WITH and START, WHERE simply filters the results.
For MATCH and OPTIONAL MATCH on the other hand, WHERE adds constraints
to the patterns described. It should not be seen as a filter after the
matching is finished.

Resources