Neo4j query for getting multiple connected nodes - neo4j

In my graph, I want to get the first-degree, second-degree, and third-degree neighbors of a certain node. If my graph is A -> B -> C -> D -> E, then
first-degree neighbor of C is B
second-degree neighbor of C is A
third-degree neighbor of C is none
When checking neighbors, I go in the reverse direction of the edge. To get these nodes, I wrote the following query.
MATCH (changedNode: Function) WHERE changedNode.signature IN [...]
MATCH (neig1: Function)-[:CALLS]->(changedNode)
MATCH (neig2: Function)-[:CALLS]->(neig1)
MATCH (neig3: Function)-[:CALLS]->(neig2)
RETURN DISTINCT neig1.functionName, neig2.functionName, neig3.functionName
I realized that this query does not return B as the first-degree neighbor of C, because A has no neighbors (neig3 is empty). In other words, the query requires a node to have a third-degree neighbor. I understand why this happens but could not fix the query. How should I revise it?

You can use OPTIONAL MATCH, since A may not have a neighbor. The query will then return a null value for neig3.
MATCH (changedNode: Function) WHERE changedNode.signature IN [...]
MATCH (neig1: Function)-[:CALLS]->(changedNode)
OPTIONAL MATCH (neig2: Function)-[:CALLS]->(neig1)
OPTIONAL MATCH (neig3: Function)-[:CALLS]->(neig2)
RETURN DISTINCT neig1.functionName, neig2.functionName, neig3.functionName

OPTIONAL MATCH matches patterns against your graph database, just like MATCH does. The difference is that if no matches are found, OPTIONAL MATCH will use a null for missing parts of the pattern. OPTIONAL MATCH could be considered the Cypher equivalent of the outer join in SQL.
MATCH (changedNode: Function) WHERE changedNode.signature IN [...]
MATCH (neig1: Function)-[:CALLS]->(changedNode)
OPTIONAL MATCH (neig2: Function)-[:CALLS]->(neig1)
OPTIONAL MATCH (neig3: Function)-[:CALLS]->(neig2)
RETURN DISTINCT neig1.functionName, neig2.functionName, neig3.functionName
For more explanation, visit: https://neo4j.com/developer/kb/a-note-on-optional-matches/
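If the goal is simply every caller within three hops, a variable-length pattern may be an alternative to chaining OPTIONAL MATCH clauses. The following is only a sketch: `$signatures` is a hypothetical parameter standing in for the list elided in the question, and `min(length(path))` is used on the assumption that a function reachable along several paths should be reported at its smallest degree.

```cypher
// Hypothetical parameter $signatures replaces the elided list from the question.
MATCH (changedNode:Function)
WHERE changedNode.signature IN $signatures
// Walk 1 to 3 CALLS edges backwards from the changed node.
MATCH path = (neig:Function)-[:CALLS*1..3]->(changedNode)
// One row per (changed node, neighbor) with the shortest distance found.
RETURN changedNode.functionName AS changed,
       neig.functionName AS neighbor,
       min(length(path)) AS degree
ORDER BY changed, degree
```

Unlike the OPTIONAL MATCH version, this returns one row per neighbor rather than one row per chain, and a changed node with no callers produces no rows at all.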

Related

Trouble using OPTIONAL MATCH with a MATCH and WHERE

I have a cypher query that is not behaving as expected and I'm trying to figure out why. I suspect I don't fully understand how OPTIONAL MATCH works.
The database has one (:`Person::Current`) node and one (:`Trait::Current`) node. It does not have a (:`PersonTrait::Current`) node.
If I run this query, it correctly returns a count(t) of 1
MATCH (n:`Person::Current` {uuid: $person_id}), (t:`Trait::Current` {uuid: $trait_id})
WHERE NOT (
(n)-[:PERSON_TRAIT]->(:`PersonTrait::Current` {has: true})-[:PERSON_TRAIT]->(t) OR
(n)-[:PERSON_TRAIT]->(:`PersonTrait::Current` {has: false})-[:PERSON_TRAIT]->(t) OR
(t)-[:GIVES_TRAIT]->(:`GivesTrait::Current`)-[:GIVES_TRAIT]->(:`Trait::Current`)<-[:PERSON_TRAIT]-(:`PersonTrait::Current` {has: false})<-[:PERSON_TRAIT]-(n)
)
RETURN count(t) as res
When a (:`PersonTrait::Current`) node is added to the database in the form
(:`Person::Current`)-[:PERSON_TRAIT]->(:`PersonTrait::Current` {has: true})-[:PERSON_TRAIT]->(:`Trait::Current`)
My query correctly returns a count(t) of 0.
However, if I try and DRY up the query by making use of OPTIONAL MATCH, like so
MATCH (n:`Person::Current` {uuid: $person_id}), (t:`Trait::Current` {uuid: $trait_id})
OPTIONAL MATCH (pt:`PersonTrait::Current`)
WHERE NOT (
((n)-[:PERSON_TRAIT]->(pt)-[:PERSON_TRAIT]->(t) AND exists(pt.has)) OR
(t)-[:GIVES_TRAIT]->(:`GivesTrait::Current`)-[:GIVES_TRAIT]->(:`Trait::Current`)<-[:PERSON_TRAIT]-(pt {has: false})<-[:PERSON_TRAIT]-(n)
)
RETURN count(t) as res
Then the query incorrectly returns a count(t) of 1 when a (:`PersonTrait::Current`) node is added to the database in the form
(:`Person::Current`)-[:PERSON_TRAIT]->(:`PersonTrait::Current` {has: true})-[:PERSON_TRAIT]->(:`Trait::Current`)
Anyone know what's going wrong? The WHERE NOT clause should be filtering out (t) nodes if a (pt) node is present with the appropriate pattern.
THANKS!!!
I think the issue is in how WHERE is scoped: a WHERE clause only applies to the immediately preceding MATCH, OPTIONAL MATCH, or WITH clause.
In this case it is paired with the OPTIONAL MATCH, so rows are not filtered out when the WHERE predicate is false. Instead, the query behaves as if the OPTIONAL MATCH had failed, and the variables newly introduced by the OPTIONAL MATCH are set to null.
If you want the WHERE to filter out rows, pair it with a WITH clause instead:
MATCH (n:`Person::Current` {uuid: $person_id}), (t:`Trait::Current` {uuid: $trait_id})
OPTIONAL MATCH (pt:`PersonTrait::Current`)
WITH n, t, pt
WHERE NOT (
((n)-[:PERSON_TRAIT]->(pt)-[:PERSON_TRAIT]->(t) AND exists(pt.has)) OR
(t)-[:GIVES_TRAIT]->(:`GivesTrait::Current`)-[:GIVES_TRAIT]->(:`Trait::Current`)<-[:PERSON_TRAIT]-(pt {has: false})<-[:PERSON_TRAIT]-(n)
)
RETURN count(t) as res
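The difference between the two placements of WHERE can be seen on a smaller pattern. This is an illustrative sketch only: the `Person` label, the `KNOWS` relationship, and the `age` property are hypothetical and not part of the question's schema. The two statements are meant to be run separately.

```cypher
// Statement 1: WHERE belongs to the OPTIONAL MATCH.
// Every Person row is kept; friend is null when no friend passes the predicate.
MATCH (p:Person)
OPTIONAL MATCH (p)-[:KNOWS]->(friend:Person)
WHERE friend.age > 30
RETURN p.name, friend.name;

// Statement 2: WHERE belongs to the WITH.
// Rows are removed unless a friend older than 30 was found
// (for a null friend the predicate evaluates to null, so the row is dropped).
MATCH (p:Person)
OPTIONAL MATCH (p)-[:KNOWS]->(friend:Person)
WITH p, friend
WHERE friend.age > 30
RETURN p.name, friend.name;
```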

cypher to combine nodes and relationships into a single column

So as a complication to this question, I basically want to do
MATCH (n:TEST) OPTIONAL MATCH (n)-[r]->() RETURN DISTINCT n, r
And I want to return n and r as one column with no repeat values. However, running
MATCH (n:TEST) OPTIONAL MATCH (n)-[r]->() UNWIND n+r AS x RETURN DISTINCT x
gives a "Type mismatch: expected List but was Relationship (line 1, column 47)" error. And this query
MATCH (n:TEST) RETURN DISTINCT n UNION MATCH ()-[n]->() RETURN DISTINCT n
Puts nodes and relationships in the same column, but the context from the first match is lost in the second half.
So how can I return all matched nodes and relationships as one minimal list?
UPDATE:
This is the final modified version of the answer query I am using
MATCH (n:TEST)
OPTIONAL MATCH (n)-[r]->()
RETURN n {.*, rels:collect(r {properties:properties(r), id:id(r), type:type(r), startNode:id(startNode(r)), endNode:id(endNode(r))})} as n
There are a couple of ways to handle this, depending on whether you want to hold these within lists or within maps, or want a map projection of a node to include its relationships.
If you're using Neo4j 3.1 or newer, then map projection is probably the easiest approach. Using this, we can output the properties of a node and include its relationships as a collected property:
MATCH (n:TEST)
OPTIONAL MATCH (n)-[r]->()
RETURN n {.*, rels:collect(r)} as n
Here's what you might do if you wanted each row to be its own pairing of a node and a single one of its relationships as a list:
...
RETURN [n, r] as pair
And as a map:
...
RETURN {node:n, rel:r} as pair
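If a single column of distinct nodes and relationships is wanted, as in the original UNWIND attempt, one possible sketch is to build the list explicitly instead of writing n+r. This assumes that a column mixing nodes and relationships is acceptable to the consumer of the result:

```cypher
MATCH (n:TEST)
OPTIONAL MATCH (n)-[r]->()
// Wrap both values in a list so UNWIND receives a list rather than n + r.
UNWIND [n, r] AS x
// r is null for nodes without outgoing relationships; drop those entries.
WITH x WHERE x IS NOT NULL
RETURN DISTINCT x
```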
EDIT
As far as returning more data from each relationship, if you check the Code results tab, you'll see that the id, relationship type, and start and end node ids are included, and accessible from your back-end code.
However, if you want to explicitly return this data, then we just need to include it in the query, using another map projection for each relationship:
MATCH (n:TEST)
OPTIONAL MATCH (n)-[r]->()
RETURN n {.*, rels:collect(r {.*, id:id(r), type:type(r), startNode:startNode(r), endNode:endNode(r)})} as n

Neo4j Optional Relationship Match

I want to write a query that returns a node (a), the nodes that are directly adjacent to it (b), and then all nodes that connect to (b) but not those nodes that have already been identified as (b).
So... If my graph was:
      d
     /
a<--b
     \
      c
I want to return { a, [b], [c, d] }.
So far, I have the following query (the 'prop' attribute distinguishes each node from each other):
MATCH (a)<-[:something]-(b)<-[:something*0..]-(c)
WHERE NOT (c.prop IN b.prop)
RETURN a.prop, collect(b.prop), collect(c.prop)
If my graph looks like:
a<--b
I expect the result to be { a, [b], [] } but instead I get nothing back, most likely due to c.prop being in b.prop. I tried using the OPTIONAL MATCH but that did not work either:
MATCH (a)<-[:something]-(b)
OPTIONAL MATCH (a)<-[:something]-(b)<-[:something*0..]-(c)
WHERE NOT (c.prop IN b.prop)
RETURN a.prop, collect(b.prop), collect(c.prop)
Any way to get the intended results?
When I run the following query:
MATCH (n:Crew)-[r:LOVES*]->m
OPTIONAL MATCH (m:Crew)-[r2:KNOWS*]->o
WHERE n.name='Neo' AND NOT (o.name IN m.name)
RETURN n,m,o
in http://console.neo4j.org/, on the sample graph, I get Neo and Trinity, even though Trinity knows nobody (o is empty). I think the OPTIONAL MATCH only needs to contain the actual optional part of your traversal, whereas your code repeats the whole pattern. The (a)<-[:something]-(b) part should not appear there, only (b)<-[:something*0..]-(c).
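Applied to the question's pattern, that advice might look like the sketch below. It goes slightly beyond the answer, and these parts are assumptions: the direct neighbors are collected first so that they can be excluded from the second list, the variable-length hop starts at 1 rather than 0 so that b does not match itself, and nodes are compared directly instead of through `prop`.

```cypher
MATCH (a)<-[:something]-(b)
// Collect the direct neighbors first so they can be excluded later.
WITH a, collect(b) AS bs
UNWIND bs AS b
// Only the optional part of the traversal goes here.
OPTIONAL MATCH (b)<-[:something*1..]-(c)
WHERE NOT (c IN bs)
// collect() skips nulls, so a graph with no c yields an empty list.
WITH a, bs, collect(DISTINCT c.prop) AS indirect
RETURN a.prop, [x IN bs | x.prop] AS direct, indirect
```

For the graph a<--b this should keep the row for a, with b in the first list and an empty second list.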

Multiple Outer Joins / OPTIONAL MATCH without a hierarchy in neo4j

I am trying to match multiple outer joins on the same level in neo4j.
My database consists of users and a count of common up- or down-ratings on articles. The rating counts are stored on separate edges for up- and down-ratings between the users.
----------                                -----------
| User n | -[:rating_positive {weight}]-> | User n2 |
----------                                -----------
    |                                          ^
    \-------[:rating_negative {weight}]--------/
Now I want to produce edges that sum up these ratings.
I would love to use multiple optional matches to do so, e.g.:
MATCH (n:`User`)
OPTIONAL MATCH (n:`User`)-[rating_positive:rating_positive]-(n2:`User`)
OPTIONAL MATCH (n:`User`)-[rating_negative:rating_negative]-(n2:`User`)
RETURN n.uid, n2.uid, rating_positive.weight, rating_negative.weight
But in this example I get all users without any positive ratings and those with both positive and negative ratings, but none with only negative ratings. So there seems to be a sequence in OPTIONAL MATCH.
If I swap the order of the OPTIONAL MATCH clauses, I get those with only negative ratings but not those with only positive ratings.
So is OPTIONAL MATCH somehow a sequence, where I only get something from the second clause when the first one is met, and so on?
Is there a workaround?
Neo4j Version is 2.1.3.
P.S.:
Even more confusing, matching against NULL does not seem to work. So this query:
MATCH (n:`User`)
OPTIONAL MATCH (n:`User`)-[rating_positive:rating_positive]-(n2:`User`)
OPTIONAL MATCH (n:`User`)-[rating_negative:rating_negative]-(n2:`User`)
WHERE rating_positive IS NULL AND rating_negative IS NOT NULL
RETURN n.uid, n2.uid, rating_positive.weight, rating_negative.weight
will give me lots of edges with NULL rating_negative and non-NULL rating_positive. I don't know what is happening with null matching in WHERE.
Anyway I found a way to recode the nulls to 0 values using "coalesce":
MATCH (n:`User`)
OPTIONAL MATCH (n:`User`)-[rating_positive:rating_positive]-(n2:`User`)
OPTIONAL MATCH (n:`User`)-[rating_negative:rating_negative]-(n2:`User`)
WITH n, n2, coalesce(rating_positive.weight, 0) AS rating_positive, coalesce(rating_negative.weight, 0) as rating_negative
WHERE rating_positive = 0 AND rating_negative > 0
RETURN n.uid, n2.uid, rating_positive, rating_negative
With this query it works as expected.
I believe more than the sequencing of the optional match, it's the fact that you've bound n2.
So the next optional match is restricted to only match the nodes identified to be candidates for n2 in the previous match. And so it appears that the order of the optional match influences it.
If you take a look at a small sample graph I set up here http://console.neo4j.org/r/lrp55o , the following query
MATCH (n:User)
OPTIONAL MATCH (n)-[:rating_negative]->(n2)
OPTIONAL MATCH (n)-[:rating_positive]->(n2)
RETURN n,n2
returns B-[:rating_negative]->C and C-[:rating_negative]->D but it leaves out A-[:rating_positive]->B.
The first optional match for rating_negative bound C and D as nodes for "n2". The second optional match found no n which has a rating_positive to C or D and hence the results.
I'm a bit unclear about what you are trying to do with the query and null checks but a union would be one way to give you all the positive and negative relations (which you can add your filters to):
MATCH (n:User)
OPTIONAL MATCH (n)-[rating:rating_negative]->(n2)
RETURN n,n2,rating.weight
UNION ALL
MATCH (n:User)
OPTIONAL MATCH (n)-[rating:rating_positive]->(n2)
RETURN n, n2, rating.weight
If this is not what you're looking for, a small subgraph at http://console.neo4j.org?init=0 would be great to help you further.
EDIT: Since comments indicated that the sum of ratings was required between a pair of users, the following query does the job:
MATCH (u:User)-[rating:rating_positive|:rating_negative]->(u2)
RETURN u,u2,sum(rating.weight)
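If the positive and negative weights are wanted side by side rather than summed, one way to sidestep the binding problem described above is to bind both users before the OPTIONAL MATCH clauses run. This is a sketch under assumptions: ratings are directed from n to n2, there is at most one relationship of each type per pair, and the `|` type alternation is written without the second colon, which may need adjusting for the Neo4j version in use.

```cypher
// Find each ordered pair of users connected by at least one rating.
MATCH (n:User)-[:rating_positive|rating_negative]->(n2:User)
WITH DISTINCT n, n2
// n and n2 are now both bound, so neither OPTIONAL MATCH restricts the other.
OPTIONAL MATCH (n)-[pos:rating_positive]->(n2)
OPTIONAL MATCH (n)-[neg:rating_negative]->(n2)
RETURN n.uid, n2.uid,
       coalesce(pos.weight, 0) AS positive,
       coalesce(neg.weight, 0) AS negative
```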
I can't be entirely sure whether this is what is causing your problem but it appears to me that you should be omitting the labels in the OPTIONAL MATCH clauses.
Perhaps try the query below
MATCH (n:`User`)
OPTIONAL MATCH (n)-[rating_positive:rating_positive]-(n2:`User`)
OPTIONAL MATCH (n)-[rating_negative:rating_negative]-(n2)
RETURN n.uid, n2.uid, rating_positive.weight, rating_negative.weight
It may also be worth including the relationship directions.
MATCH (n:`User`)
OPTIONAL MATCH (n)-[rating_positive:rating_positive]->(n2:`User`)
OPTIONAL MATCH (n)-[rating_negative:rating_negative]->(n2)
RETURN n.uid, n2.uid, rating_positive.weight, rating_negative.weight

Difference between START n = node(*) and MATCH (n)

In Neo4j 2.0 this query:
MATCH (n) WHERE n.username = 'blevine'
OPTIONAL MATCH n-[:Person]->person
OPTIONAL MATCH n-[:UserLink]->role
RETURN n AS user,person,collect(role) AS roles
returns different results than this query:
START n = node(*) WHERE n.username = 'blevine'
OPTIONAL MATCH n-[:Person]->person
OPTIONAL MATCH n-[:UserLink]->role
RETURN n AS user,person,collect(role) AS roles
The first query works as expected returning a single Node for 'blevine' and the associated Nodes mentioned in the OPTIONAL MATCH clauses. The second query returns many more Nodes which do not even have a username property. I realize that start n = node(*) is not recommended and that START is not even required in 2.0. But the second form (with OPTIONAL MATCH replaced with question marks on the relationship type) worked prior to 2.0. In the second form, why is 'n' not being constrained to the single 'blevine' node by the first WHERE clause?
To make the second query behave as expected, you just need to add WITH n. The filtered result has to be passed explicitly to the OPTIONAL MATCH clauses, which is done using WITH:
START n = node(*) WHERE n.username = 'blevine'
WITH n
OPTIONAL MATCH n-[:Person]->person
OPTIONAL MATCH n-[:UserLink]->role
RETURN n AS user,person,collect(role) AS roles
From the documentation
WHERE defines the MATCH patterns in more detail. The predicates are part of the
pattern description, not a filter applied after the matching is done.
This means that WHERE should always be put together with the MATCH clause it belongs to.
When you do start n=node(*) where n.name="xyz", you need to pass the result explicitly into your next optional matches. But when you do MATCH (n) WHERE n.name="xyz", this tells the graph specifically which node to start from.
EDIT
Here is the thing. The documentation says OPTIONAL MATCH returns null if a pattern is not found, so in your first case it also includes all those results where the n.username property is null, or where n doesn't even have a relationship suggested in the OPTIONAL MATCH pattern. So when you do a WITH n, the graph is explicitly told to use only n.
Excerpt from the documentation:
OPTIONAL MATCH matches patterns against your graph database, just like MATCH does.
The difference is that if no matches are found, OPTIONAL MATCH will use NULLs for
missing parts of the pattern. OPTIONAL MATCH could be considered the Cypher
equivalent of the outer join in SQL.
Either the whole pattern is matched, or nothing is matched. Remember that
WHERE is part of the pattern description, and the predicates will be
considered while looking for matches, not after. This matters especially
in the case of multiple (OPTIONAL) MATCH clauses, where it is crucial to
put WHERE together with the MATCH it belongs to.
A few more things to note about the behaviour of the WHERE clause, also excerpted from the documentation:
WHERE is not a clause in it’s own right — rather, it’s part of MATCH,
OPTIONAL MATCH, START and WITH.
In the case of WITH and START, WHERE simply filters the results.
For MATCH and OPTIONAL MATCH on the other hand, WHERE adds constraints
to the patterns described. It should not be seen as a filter after the
matching is finished.

Resources