Combining collection results of multiple MATCH cypher queries

I have a graph in which three patterns of paths can exist between (:srcType) and (:destType):
Pattern 1
(:srcType)<-[]-()<-[]-(srcParent)<-[]-(center)-[]->(destParent)-[]->()-[]->(:destType)
Notice that, here the direction of relationships reverses as path goes through (center):<-[]-(center)-[]->
Pattern 2
In this pattern (srcParent) itself is the center, so the direction of relationships reverses across (srcParent):
(:srcType)<-[]-()<-[]-(srcParent)-[]->(destParent)-[]->()-[]->(:destType)
Pattern 3
In this pattern (destParent) itself is the center, so the direction of relationships reverses across (destParent):
(:srcType)<-[]-()<-[]-(srcParent)<-[]-(destParent)-[]->()-[]->(:destType)
I am giving the id of a (:srcType) node and trying to obtain all (:destType) nodes. Note that a given (:srcType) node can have one (:destType) node associated with it via the first pattern, another via the second pattern, and a few more via the third. I am trying to retrieve a single collection containing all these (:destType) nodes, so I have combined the above patterns as follows:
MATCH (src:srcType)<-[]-()<-[]-(srcParent)<-[]-(center)-[]->(destParent)-[]->()-[]->(dest1:destType)
WHERE id(src)=3
WITH dest1
MATCH (src:srcType)<-[]-()<-[]-(srcParent)-[]->(destParent)-[]->()-[]->(dest2:destType)
WHERE id(src)=3
WITH dest1, dest2
MATCH (src:srcType)<-[]-()<-[]-(srcParent)<-[]-(destParent)-[]->()-[]->(dest3:destType)
WHERE id(src)=3
RETURN dest1, dest2, dest3
So here I am matching each pattern one by one in separate MATCH clauses and passing the (:destType) output of one MATCH to the next using a WITH clause. At the end I am returning all the destTypes.
Q1. But this does not work. When I run just one of the patterns on its own, it correctly returns whichever (:destType) matches the path. But the above query returns 0 rows. Why is that?
Q2. Also, instead of returning all destTypes separately, I want to return a single collection containing the elements of all of them. Knowing that collections can be merged using +, is it possible to return something like the following?
RETURN destType1+destType2+destType3
Note
I will need to add different filters for each pattern afterwards. So the future query may look something like this:
MATCH (src:srcType)<-[]-()<-[]-(srcParent)<-[]-(center)-[]->(destParent)-[]->()-[]->(dest1:destType)
WHERE id(src)=3 AND srcParent.prop1='a'
WITH dest1
MATCH (src:srcType)<-[]-()<-[]-(srcParent)-[]->(destParent)-[]->()-[]->(dest2:destType)
WHERE id(src)=3 AND destParent.prop2='b'
WITH dest1, dest2
MATCH (src:srcType)<-[]-()<-[]-(srcParent)<-[]-(destParent)-[]->()-[]->(dest3:destType)
WHERE id(src)=3 AND srcParent.prop3='c'
RETURN dest1, dest2, dest3

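Your combined query returns 0 rows because a plain MATCH that finds nothing discards the row it is working on: as soon as one of the three patterns has no match for the given src, the whole query yields nothing. Given that these patterns may or may not be present, and that you want a collection of all results at the end, a good approach would be to match on the src node first, then use OPTIONAL MATCHes, and collect the results along the way, adding new ones in.
If we modify your last query, it may look something like this: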
MATCH (src:srcType)
WHERE id(src) = 3
OPTIONAL MATCH (src)<-[]-()<-[]-(srcParent)<-[]-(center)-[]->(destParent)-[]->()-[]->(dest1:destType)
WHERE srcParent.prop1='a'
WITH src, COLLECT(dest1) as dests
OPTIONAL MATCH (src)<-[]-()<-[]-(srcParent)-[]->(destParent)-[]->()-[]->(dest2:destType)
WHERE destParent.prop2='b'
WITH src, dests + COLLECT(dest2) as dests
OPTIONAL MATCH (src)<-[]-()<-[]-(srcParent)<-[]-(destParent)-[]->()-[]->(dest3:destType)
WHERE srcParent.prop3='c'
RETURN dests + COLLECT(dest3) as dests
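One caveat about the query above: expressions like dests + COLLECT(dest2) mix an aggregate with a variable that is not projected as its own grouping key. As far as I know, newer Neo4j versions (4.x and later) reject this with an "implicit grouping expressions" error. A minimal sketch that avoids it, assuming the same labels and properties as in the question (d2 and d3 are just names introduced here), projects each aggregate on its own first:

```cypher
MATCH (src:srcType)
WHERE id(src) = 3
// Pattern 1: direction reverses at (center)
OPTIONAL MATCH (src)<-[]-()<-[]-(srcParent)<-[]-(center)-[]->(destParent)-[]->()-[]->(dest1:destType)
WHERE srcParent.prop1 = 'a'
WITH src, collect(DISTINCT dest1) AS dests
// Pattern 2: (srcParent) is the center
OPTIONAL MATCH (src)<-[]-()<-[]-(srcParent)-[]->(destParent)-[]->()-[]->(dest2:destType)
WHERE destParent.prop2 = 'b'
// Keep dests as an explicit grouping key and aggregate separately
WITH src, dests, collect(DISTINCT dest2) AS d2
WITH src, dests + d2 AS dests
// Pattern 3: (destParent) is the center
OPTIONAL MATCH (src)<-[]-()<-[]-(srcParent)<-[]-(destParent)-[]->()-[]->(dest3:destType)
WHERE srcParent.prop3 = 'c'
WITH dests, collect(DISTINCT dest3) AS d3
RETURN dests + d3 AS dests
```

collect() skips nulls, so a pattern with no match simply contributes an empty list. DISTINCT only removes duplicates within one pattern; a node reachable through two different patterns would still appear twice in the final list.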

Related

Why is a single Neo4j relationship shown twice in Cypher query results?

Let's consider a trivial graph with a directed relationship:
CREATE
(`0` :Car {value:"Ford"})
, (`1` :Car {value:"Subaru"})
, (`0`)-[:`DOCUMENT` {value:"DOC-1"}]->(`1`);
The following query MATCH (n1:Car)-[r:DOCUMENT]-(n2:Car) RETURN * returns:
╒══════════════════╤══════════════════╤═════════════════╕
│"n1" │"n2" │"r" │
╞══════════════════╪══════════════════╪═════════════════╡
│{"value":"Subaru"}│{"value":"Ford"} │{"value":"DOC-1"}│
├──────────────────┼──────────────────┼─────────────────┤
│{"value":"Ford"} │{"value":"Subaru"}│{"value":"DOC-1"}│
└──────────────────┴──────────────────┴─────────────────┘
The graph defines only a Ford->Subaru relationship, so why are two relationships returned?
How should I interpret the reversed one (line 1), which is not specified in the CREATE statement?
Note: This is a follow-up to Convert multiple relationships between 2 nodes to a single one with weight asked by me earlier. I solved my problem, but I'm not convinced my answer is the best solution.

Your MATCH statement here doesn't specify the direction, therefore there are two possible paths that will match the pattern (remember that the ordering of nodes in the path is important and distinguishes paths from each other), thus your two answers.
If you specify the direction of the relationship instead you'll find there is only one possible path that matches:
MATCH (n1:Car)-[r:DOCUMENT]->(n2:Car)
RETURN *
As for the question of why we get two paths back when we omit the direction, remember that paths are order-sensitive: two paths that have the same elements but with a different order of the elements are different paths.
To help put this into perspective, consider the following two queries:
# Query 1
MATCH (n1:Car)-[r:DOCUMENT]-(n2:Car)
WHERE n1.value = 'Ford'
RETURN *
╒══════════════════╤══════════════════╤═════════════════╕
│"n1" │"n2" │"r" │
╞══════════════════╪══════════════════╪═════════════════╡
│{"value":"Ford"} │{"value":"Subaru"}│{"value":"DOC-1"}│
└──────────────────┴──────────────────┴─────────────────┘
# Query 2
MATCH (n1:Car)-[r:DOCUMENT]-(n2:Car)
WHERE n1.value = 'Subaru'
RETURN *
╒══════════════════╤══════════════════╤═════════════════╕
│"n1" │"n2" │"r" │
╞══════════════════╪══════════════════╪═════════════════╡
│{"value":"Subaru"}│{"value":"Ford"} │{"value":"DOC-1"}│
└──────────────────┴──────────────────┴─────────────────┘
Conceptually (and this is also how the planner proceeds in the absence of indexes), each of the results above is obtained by starting from the full set of matches returned by your original query and then filtering down to the one row that meets the given criterion.
If the original undirected query returned only a single row instead of two, the results above would not be consistent with it.
Additional information from the OP
It will take a while to wrap my head around it, but it does work this way and here's a piece of documentation to confirm it's by design:
When your pattern contains a bound relationship, and that relationship pattern doesn’t specify direction, Cypher will try to match the relationship in both directions.
MATCH (a)-[r]-(b)
WHERE id(r)= 0
RETURN a,b
This returns the two connected nodes, once as the start node, and once as the end node.
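If the pattern has to stay undirected but only one row per relationship is wanted, one common trick is to keep a single orientation of each node pair by comparing internal ids. This is only a sketch against the Car example above, not something from the quoted documentation:

```cypher
// Undirected match, but keep only the orientation where n1 has the smaller id
MATCH (n1:Car)-[r:DOCUMENT]-(n2:Car)
WHERE id(n1) < id(n2)
RETURN n1, n2, r
```

Which of the two nodes ends up as n1 depends on the ids the database assigned, and a relationship from a node to itself would be filtered out by this predicate.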

Query intersection of Paths in Neo4j using Cypher

Having this query working in Cypher (Neo4j):
MATCH p=(g:Node)-[:FOLLOWED_BY *2..2]->(g2:Node)
WHERE g.group=10 AND g2.group=10
RETURN p
which returns all possible paths belonging to a specific group (group is just a property to classify nodes), I am struggling to get a query that returns the paths in common between both collections of paths. It would be something like this:
MATCH p=(g:Node)-[:FOLLOWED_BY *2..2]->(g2:Node)
WHERE g.group=10 AND g2.group=10
MATCH p=(g3:Node)-[:FOLLOWED_BY *2..2]->(g4:Node)
WHERE g3.group=15 AND g4.group=15
RETURN INTERSECTION(path1, path2)
Of course I made that up. The goal is to get all the paths in common between both queries.

The start/end nodes of your 2 MATCHes have different groups, so they can never find common paths.
Therefore, when you ask for "paths in common", I assume you actually want to find the shared middle nodes (between the 2 sets of 3-node paths). If so, this query should work:
MATCH p1=(g:Node)-[:FOLLOWED_BY *2]->(g2:Node)
WHERE g.group=10 AND g2.group=10
WITH COLLECT(DISTINCT NODES(p1)[1]) AS middle1
MATCH p2=(g3:Node)-[:FOLLOWED_BY *2]->(g4:Node)
WHERE g3.group=15 AND g4.group=15 AND NODES(p2)[1] IN middle1
RETURN DISTINCT NODES(p2)[1] AS common_middle_node;
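If you would rather see the intersection spelled out as a list operation, closer to the INTERSECTION() you made up, a variant under the same assumption (only the middle nodes are compared; middle2 is a name introduced here) could collect both sets and filter one by the other:

```cypher
MATCH p1 = (g:Node)-[:FOLLOWED_BY*2]->(g2:Node)
WHERE g.group = 10 AND g2.group = 10
WITH collect(DISTINCT nodes(p1)[1]) AS middle1
MATCH p2 = (g3:Node)-[:FOLLOWED_BY*2]->(g4:Node)
WHERE g3.group = 15 AND g4.group = 15
WITH middle1, collect(DISTINCT nodes(p2)[1]) AS middle2
// List comprehension acting as a set intersection
RETURN [n IN middle1 WHERE n IN middle2] AS common_middle_nodes
```

This returns one row holding a list instead of one row per node.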

Reusing path in multiple MATCH UNION queries cypher neo4j

I'd like to pull and combine data from several different paths that share a path at the beginning, not all of which might exist. For example, I'd like to do something like this:
MATCH (:Complex)-[:PATH]->(s:Somewhere)-[:FETCHING]->(data)
RETURN data.attribute
UNION ALL
MATCH (s)-[:OPTIONAL]->(o:OtherData)
RETURN o.attribute;
so that it doesn't retrace the path up to s. I can't actually do this, though, because UNION separates queries and the (s)-[:OPTIONAL] in the second part will match anything with an outgoing OPTIONAL relation; the s is a loose handle.
Is there a better way of doing this than repeating the path:
MATCH (:Complex)-[:PATH]->(s:Somewhere)-[:FETCHING]->(data)
RETURN data.attribute
UNION ALL
MATCH (:Complex)-[:PATH]->(s:Somewhere)-[:OPTIONAL]->(o:OtherData)
RETURN o.attribute;
I made a few attempts using WITH, but they all either caused the query to return nothing if any part failed, or I could not get them to line up into a single column and instead got rows with redundant data, or (with multiple, nested WITHs, which I'm not sure about the scoping of) just fetching everything.

Have you looked at the semantics of OPTIONAL MATCH? You can match up to s, then beyond s, and then add your optional component. Something like:
MATCH (:Complex)-[:PATH]->(s:Somewhere)
MATCH (s)-[:FETCHING]->(data)
OPTIONAL MATCH (s)-[:OPTIONAL]->(otherData)
RETURN data.attribute, otherData.attribute
Sorry, I missed the importance of a single column. Is it really important?
You can gather the values into a single collection:
MATCH (:Complex)-[:PATH]->(s:Somewhere)
MATCH (s)-[:FETCHING]->(data)
OPTIONAL MATCH (s)-[:OPTIONAL]->(otherData)
RETURN [data.attribute] + COLLECT(otherData.attribute)
And to get one value per row in a single column, you can unwind that collection:
MATCH (:Complex)-[:PATH]->(s:Somewhere)
MATCH (s)-[:FETCHING]->(data)
OPTIONAL MATCH (s)-[:OPTIONAL]->(otherData)
WITH data.attribute AS attr, COLLECT(otherData.attribute) AS others
UNWIND [attr] + others AS val
RETURN val
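On newer Neo4j versions there is also a way to keep the UNION shape from the question without retracing the path: a CALL subquery that imports s. As far as I know this needs Neo4j 4.1 or later, so treat it as a sketch rather than something that runs on every version (val is a column name introduced here):

```cypher
MATCH (:Complex)-[:PATH]->(s:Somewhere)
CALL {
  // Each UNION branch imports s from the outer query
  WITH s
  MATCH (s)-[:FETCHING]->(data)
  RETURN data.attribute AS val
  UNION ALL
  WITH s
  MATCH (s)-[:OPTIONAL]->(o:OtherData)
  RETURN o.attribute AS val
}
RETURN val
```

The shared prefix is matched once, and a branch that finds nothing just contributes no rows, so the other branch's results are still returned.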

Single Cypher query to choose between 2 different paths

I have the following scenario:
At some point in my path (in a node that lies a few links away from my start node),
I have the possibility of going down one path or another, for example:
If S is my startnode,
S-[]->..->(B)-[first:FIRST_WAY]->(...) ,
and
S-[]->..->(B)-[second:SECOND_WAY]->(...)
At the junction point, I will need to go down one path only (first or second)
Ideally, I would like to follow and include results from the second relationship, only if the first one is not present (regardless of what exists afterwards).
Is this possible with Cypher 1.9.7, in a single query?

One way would be to use OPTIONAL MATCH to match the two patterns separately. Example:
MATCH (n:Object)
OPTIONAL MATCH (n)-[r1:FIRST_WAY]->(:Object)-->(f1:Object)
OPTIONAL MATCH (n)-[r2:SECOND_WAY]->()-->(f2:Object)
RETURN coalesce(f1, f2)
This query matches both patterns optionally, and the coalesce function returns its first non-null argument, so f2 is used only when f1 is missing.
AFAIK, OPTIONAL MATCH was introduced in 2.0, so you can't use that clause in 1.9, but there is an alternative syntax:
CYPHER 1.9 START n=node(*) MATCH (n)-[r1?:FIRST_WAY]->()-->(f1), (n)-[r2?:SECOND_WAY]->()-->(f2) RETURN coalesce(f1, f2)
I'm sure there are other ways to do this, probably using the | (OR) operator for relationship type matching, i.e. ()-[r:FIRST_WAY|SECOND_WAY]->(), and then examining the matched patterns to discard some of the result paths based on the relationship type.
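For the "regardless of what exists afterwards" part, a possible refinement on 2.0 or later (so not for 1.9.7) is to let the second OPTIONAL MATCH succeed only when the node has no FIRST_WAY relationship at all. This is a sketch using the same made-up :Object label as above:

```cypher
MATCH (n:Object)
OPTIONAL MATCH (n)-[:FIRST_WAY]->()-->(f1:Object)
OPTIONAL MATCH (n)-[:SECOND_WAY]->()-->(f2:Object)
// The WHERE belongs to the second OPTIONAL MATCH:
// f2 stays null whenever any FIRST_WAY relationship leaves n
WHERE NOT (n)-[:FIRST_WAY]->()
RETURN n, coalesce(f1, f2) AS result
```

If a FIRST_WAY relationship exists but leads nowhere further, both f1 and f2 are null and result is null, which is the "first path wins even if it is a dead end" behaviour asked for.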

Difference between START n = node(*) and MATCH (n)

In Neo4j 2.0 this query:
MATCH (n) WHERE n.username = 'blevine'
OPTIONAL MATCH n-[:Person]->person
OPTIONAL MATCH n-[:UserLink]->role
RETURN n AS user,person,collect(role) AS roles
returns different results than this query:
START n = node(*) WHERE n.username = 'blevine'
OPTIONAL MATCH n-[:Person]->person
OPTIONAL MATCH n-[:UserLink]->role
RETURN n AS user,person,collect(role) AS roles
The first query works as expected returning a single Node for 'blevine' and the associated Nodes mentioned in the OPTIONAL MATCH clauses. The second query returns many more Nodes which do not even have a username property. I realize that start n = node(*) is not recommended and that START is not even required in 2.0. But the second form (with OPTIONAL MATCH replaced with question marks on the relationship type) worked prior to 2.0. In the second form, why is 'n' not being constrained to the single 'blevine' node by the first WHERE clause?

To make the second query behave as expected, you just need to add WITH n. The filtered result has to be passed explicitly to the OPTIONAL MATCH clauses, and that is done with WITH:
START n = node(*) WHERE n.username = 'blevine'
WITH n
OPTIONAL MATCH n-[:Person]->person
OPTIONAL MATCH n-[:UserLink]->role
RETURN n AS user,person,collect(role) AS roles
From the documentation
WHERE defines the MATCH patterns in more detail. The predicates are part of the
pattern description, not a filter applied after the matching is done.
This means that WHERE should always be put together with the MATCH clause it belongs to.
When you do START n=node(*) WHERE n.name="xyz", you need to pass the result explicitly into your next OPTIONAL MATCHes. But when you do MATCH (n) WHERE n.name="xyz", this tells the graph specifically which node to start from.
EDIT
Here is the thing. The documentation says OPTIONAL MATCH returns null if a pattern is not found, so in the START query the result also includes nodes where the n.username property is null, or where n doesn't even have a relationship mentioned in the OPTIONAL MATCH patterns. When you add WITH n, the graph is explicitly told to use only n.
Excerpt from the documentation:
OPTIONAL MATCH matches patterns against your graph database, just like MATCH does.
The difference is that if no matches are found, OPTIONAL MATCH will use NULLs for
missing parts of the pattern. OPTIONAL MATCH could be considered the Cypher
equivalent of the outer join in SQL.
Either the whole pattern is matched, or nothing is matched. Remember that
WHERE is part of the pattern description, and the predicates will be
considered while looking for matches, not after. This matters especially
in the case of multiple (OPTIONAL) MATCH clauses, where it is crucial to
put WHERE together with the MATCH it belongs to.
Also, a few more things to note about the behaviour of the WHERE clause, again excerpted from the documentation:
WHERE is not a clause in it’s own right — rather, it’s part of MATCH,
OPTIONAL MATCH, START and WITH.
In the case of WITH and START, WHERE simply filters the results.
For MATCH and OPTIONAL MATCH on the other hand, WHERE adds constraints
to the patterns described. It should not be seen as a filter after the
matching is finished.
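To make the difference described in these excerpts concrete, here is a small sketch in current parenthesised syntax. The role.name property and the 'admin' value are hypothetical, introduced only for illustration:

```cypher
// WHERE attached to OPTIONAL MATCH: it constrains the optional pattern.
// The user row is kept even if no matching role exists; role is then null.
MATCH (n) WHERE n.username = 'blevine'
OPTIONAL MATCH (n)-[:UserLink]->(role)
WHERE role.name = 'admin'
RETURN n, role;

// WHERE attached to WITH: it filters finished rows.
// Rows whose role is null (or not 'admin') are dropped entirely.
MATCH (n) WHERE n.username = 'blevine'
OPTIONAL MATCH (n)-[:UserLink]->(role)
WITH n, role
WHERE role.name = 'admin'
RETURN n, role;
```

The two statements differ only in which clause the second WHERE belongs to, and that alone decides whether a user without an admin role still shows up in the result.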
