I have the following scenario:
At some point in my path (in a node that lies a few links away from my start node),
I have the possibility of going down one path or another, for example:
If S is my startnode,
S-[]->..->(B)-[first:FIRST_WAY]->(...) ,
and
S-[]->..->(B)-[second:SECOND_WAY]->(...)
At the junction point, I will need to go down one path only (first or second)
Ideally, I would like to follow and include results from the second relationship, only if the first one is not present (regardless of what exists afterwards).
Is this possible with Cypher 1.9.7, in a single query?
One way would be to use OPTIONAL MATCH to match the two patterns separately. Example:
MATCH (n:Object)
OPTIONAL MATCH (n)-[r1:FIRST_WAY]->(:Object)-->(f1:Object)
OPTIONAL MATCH (n)-[r2:SECOND_WAY]->()-->(f2:Object)
RETURN coalesce(f1, f2)
This query matches both patterns optionally, and coalesce returns its first non-null argument, so f1 is listed first to prefer the FIRST_WAY result. Note that f1 is also null when a FIRST_WAY relationship exists but nothing follows it.
AFAIK, OPTIONAL MATCH was introduced in 2.0, so you can't use that clause in 1.9, but there is an alternative syntax:
CYPHER 1.9
START n=node(*)
MATCH (n)-[r1?:FIRST_WAY]->()-->(f1), (n)-[r2?:SECOND_WAY]->()-->(f2)
RETURN coalesce(f1, f2)
I'm sure there are other ways to do this, probably using the OR operator for relationship matching, i.e. ()-[r:FIRST_WAY|SECOND_WAY]->(), and then examining the patterns matched to discard some of the result paths based on the relationship type.
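The relationship-type approach mentioned above could look roughly like the sketch below, written in Cypher 2.0+ syntax. It assumes the junction nodes carry the :Object label used in the earlier example; b, next and way are illustrative names only.

```cypher
// Follow FIRST_WAY whenever it exists from the junction node b;
// follow SECOND_WAY only if b has no outgoing FIRST_WAY at all,
// regardless of what comes after that relationship.
MATCH (b:Object)-[r:FIRST_WAY|SECOND_WAY]->(next)
WHERE type(r) = 'FIRST_WAY'
   OR NOT (b)-[:FIRST_WAY]->()
RETURN b, type(r) AS way, next
```

Because the check is on the existence of the FIRST_WAY relationship itself, a dead-end FIRST_WAY branch still suppresses the SECOND_WAY results.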
I had another thread about this where someone suggested to do
MATCH (p:Person {person_id: '123'})
WHERE ANY(x IN $names WHERE
EXISTS((p)-[:BELONGS]-(:Face)-[:CORRESPONDS]-(:Image)-[:HAS_ACCESS_TO]-(:Dias {group_name: x})))
MATCH path=(p)-[:ASSOCIATED_WITH]-(:Person)
RETURN path
This does what I need it to, returns nodes that fit the criteria without returning the relationships, but now I need to include another param that is a list.
....(:Dias {group_name: x, second_name: y}))
I'm unsure of the syntax.. here's what I tried
WHERE ANY(x IN $names and y IN $names_2 WHERE..
this gives me a syntax error :/
Because ANY() iterates over a single list, it is awkward to use with two lists. It is still possible if you first build a single list of all x/y combinations, but that would be inefficient, since each combination would be tested separately.
However, the existential subquery syntax introduced in Neo4j 4.0 is very helpful for this use case (I assume the two lists are passed as the parameters names1 and names2):
MATCH (p:Person {person_id: '123'})
WHERE EXISTS {
MATCH (p)-[:BELONGS]-(:Face)-[:CORRESPONDS]-(:Image)-[:HAS_ACCESS_TO]-(d:Dias)
WHERE d.group_name IN $names1 AND d.second_name IN $names2
}
MATCH path=(p)-[:ASSOCIATED_WITH]-(:Person)
RETURN path
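For versions without existential subqueries, the combination-list idea mentioned above might look like the following sketch. It assumes the same $names1 and $names2 parameters; nested, pairs and pair are illustrative names, and the exists(pattern) function form used here is deprecated in later Neo4j releases.

```cypher
MATCH (p:Person {person_id: '123'})
// Build one list of [x, y] pairs from the two parameter lists.
WITH p, [x IN $names1 | [y IN $names2 | [x, y]]] AS nested
WITH p, reduce(acc = [], row IN nested | acc + row) AS pairs
// Test each combination separately (this is the inefficient part).
WHERE ANY(pair IN pairs WHERE
  EXISTS((p)-[:BELONGS]-(:Face)-[:CORRESPONDS]-(:Image)-[:HAS_ACCESS_TO]-(:Dias {group_name: pair[0], second_name: pair[1]})))
MATCH path=(p)-[:ASSOCIATED_WITH]-(:Person)
RETURN path
```

With n names in each list this tests up to n × n patterns, which is why the IN-based subquery is usually preferable when it is available.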
By the way, here are some more tips:
If it is possible to specify the direction of each relationship in your query, that would help to speed up the query.
If it is possible to remove any node labels from a (sub)query and still get the same results, that would also be faster. There is an exception, though: if the (sub)query has no variables that are already bound to a value, then you would normally want to specify the node label for the one node that would be used to kick off that (sub)query (you can do a PROFILE to see which node that would be).
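As a rough illustration of both tips, the query can be prefixed with PROFILE to inspect the plan. The relationship directions below are assumptions made for the sake of the example; the real directions depend on the data model.

```cypher
// PROFILE runs the query and reports the operators and database hits,
// including which node the planner uses to start the search.
PROFILE
MATCH (p:Person {person_id: '123'})
WHERE EXISTS {
  // Directions are hypothetical; adjust them to the actual model.
  MATCH (p)-[:BELONGS]->(:Face)-[:CORRESPONDS]->(:Image)-[:HAS_ACCESS_TO]->(d:Dias)
  WHERE d.group_name IN $names1 AND d.second_name IN $names2
}
MATCH path=(p)-[:ASSOCIATED_WITH]-(:Person)
RETURN path
```

Comparing the database hits with and without directions (or labels) shows whether a given change actually helps on your data.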
I am currently working with Neo4j and I need to find a path between 2 nodes in a large graph. I am using this Cypher query:
MATCH p=(acq:Acquisition {id:'1'})-[r*]->(ecs:ExternalCommunicationService {id:'1'})
RETURN p
Everything works as expected (the query returns paths of any length between the nodes), but I am getting the warning message shown below:
Warning: This feature is deprecated and will be removed in future versions. Binding relationships to a list in a variable length pattern is deprecated.
The official documentation uses the same pattern with *.
What is the correct way to find paths of any length between nodes without getting a warning (that is, without using deprecated syntax)?
Binding relationships to a list in a variable length pattern has been deprecated since 3.2.0-rc1.
According to this pull request, Cypher queries like:
MATCH (n)-[rs*]-() RETURN rs
will generate a warning and the canonical way to write the same query is:
MATCH p=(n)-[*]-() RETURN relationships(p) AS rs
Since you are not using the r variable in your query you can simply remove it from the query and the warning will disappear. This way:
MATCH p=(acq:Acquisition {id:'1'})-[*]->(ecs:ExternalCommunicationService {id:'1'})
RETURN p
All that means is that in a future release, your use of r in the pattern will no longer be permitted. Your query will need to look like the one below, or it will break in a future release. Since you are not directly using r in your results, it should be no problem for you to remove it.
MATCH p=(acq:Acquisition {id:'1'})-[*]->(ecs:ExternalCommunicationService {id:'1'})
RETURN p
You could always use relationships(p) to get a collection of the relationships in the path if you needed to do something with them after the match.
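For instance, a minimal sketch that returns each path together with the types of its relationships (relTypes is just an illustrative alias):

```cypher
MATCH p=(acq:Acquisition {id:'1'})-[*]->(ecs:ExternalCommunicationService {id:'1'})
// relationships(p) yields the list of relationships along the path.
RETURN p, [r IN relationships(p) | type(r)] AS relTypes
```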
Depending on the nature of your graph (size and complexity) though your query may become unwieldy because it is virtually unconstrained except for the ending node and the direction. There are a number of ways you could make it safer.
1 - Use shortestPath
You could use shortestPath or allShortestPaths
MATCH p=allShortestPaths((acq:Acquisition {id:'1'})-[*]->(ecs:ExternalCommunicationService {id:'1'}))
RETURN p
2 - Limit the Depth
You could add a limit on the depth of the match. It could be fixed
MATCH p=(acq:Acquisition {id:'1'})-[*10]->(ecs:ExternalCommunicationService {id:'1'})
RETURN p
or a range
MATCH p=(acq:Acquisition {id:'1'})-[*5..10]->(ecs:ExternalCommunicationService {id:'1'})
RETURN p
3 - Add Relationship TYPE
You could add one or more relationship types to reduce the potential number of paths you match. That can be used in conjunction with the depth parameters.
MATCH p=(acq:Acquisition {id:'1'})-[:TYPE_A|TYPE_B*5..10]->(ecs:ExternalCommunicationService {id:'1'})
RETURN p
4 - Use APOC
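You could use the path expander procedures in the APOC library (for example apoc.path.expandConfig), which accept depth limits and relationship filters.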
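A possible sketch using APOC's apoc.path.expandConfig, assuming the APOC plugin is installed. The relationship types TYPE_A and TYPE_B are carried over from the example above, and the depth limits are arbitrary choices:

```cypher
MATCH (acq:Acquisition {id:'1'}), (ecs:ExternalCommunicationService {id:'1'})
CALL apoc.path.expandConfig(acq, {
  relationshipFilter: 'TYPE_A>|TYPE_B>',  // outgoing TYPE_A or TYPE_B only
  minLevel: 1,
  maxLevel: 10,                           // assumed depth limit
  terminatorNodes: [ecs]                  // stop expanding once ecs is reached
}) YIELD path
RETURN path
```

The config map combines the depth limit and the type restriction from the options above in a single call.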
I have a graph in which three patterns of paths can exist between (:srcType) and (:destType):
Pattern 1
(:srcType)<-[]-()<-[]-(srcParent)<-[]-(center)-[]->(destParent)-[]->()-[]->(:destType)
Notice that, here the direction of relationships reverses as path goes through (center):<-[]-(center)-[]->
Pattern 2
In this pattern (srcParent) is itself the center. Thus the direction of the relationships reverses across (srcParent):
(:srcType)<-[]-()<-[]-(srcParent)-[]->(destParent)-[]->()-[]->(:destType)
Pattern 3
In this pattern (destParent) is itself the center. Thus the direction of the relationships reverses across (destParent):
(:srcType)<-[]-()<-[]-(srcParent)<-[]-(destParent)-[]->()-[]->(:destType)
I am given the id of a (:srcType) node and am trying to obtain all (:destType) nodes. Note that a given (:srcType) node can have one (:destType) node associated with it via the first pattern, another via the second pattern, and a few more via the third. I am trying to retrieve a single collection containing all of these (:destType) nodes, so I have combined the above patterns as follows:
MATCH (src:srcType)<-[]-()<-[]-(srcParent)<-[]-(center)-[]->(destParent)-[]->()-[]->(dest1:destType)
WHERE id(src)=3
WITH dest1
MATCH (src:srcType)<-[]-()<-[]-(srcParent)-[]->(destParent)-[]->()-[]->(dest2:destType)
WHERE id(src)=3
WITH dest1, dest2
MATCH (src:srcType)<-[]-()<-[]-(srcParent)<-[]-(destParent)-[]->()-[]->(dest3:destType)
WHERE id(src)=3
RETURN dest1, dest2, dest3
So here I am matching each pattern one by one in MATCH clauses and passing the (:destType) output of one MATCH to the next using a WITH clause. At the end I am returning all destTypes.
Q1. But this does not work. When I run just one of the patterns on its own, it correctly returns whichever (:destType) matches the path, but the combined query above returns 0 rows. Why is that?
Q2. Also, instead of returning all destTypes separately, I want to return a single collection containing the elements of all of them. Knowing that collections can be merged using +, is it possible to return something like the following?
RETURN destType1+destType2+destType3
Note
I will need to add different filters for each pattern afterwards. So the future query may look something like this:
MATCH (src:srcType)<-[]-()<-[]-(srcParent)<-[]-(center)-[]->(destParent)-[]->()-[]->(dest1:destType)
WHERE id(src)=3 AND srcParent.prop1='a'
WITH dest1
MATCH (src:srcType)<-[]-()<-[]-(srcParent)-[]->(destParent)-[]->()-[]->(dest2:destType)
WHERE id(src)=3 AND destParent.prop2='b'
WITH dest1, dest2
MATCH (src:srcType)<-[]-()<-[]-(srcParent)<-[]-(destParent)-[]->()-[]->(dest3:destType)
WHERE id(src)=3 AND srcParent.prop3='c'
RETURN dest1, dest2, dest3
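Your query returns 0 rows because every plain MATCH must succeed for a row to survive: if any one of the three patterns has no match, the whole result is empty. Given that these patterns may or may not be present, and that you want a collection of all results at the end, a good approach would be to match on the src node first, then use OPTIONAL MATCHes, and collect the results along the way, adding new ones in.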
If we modify your last query, it may look something like this:
MATCH (src:srcType)
WHERE id(src) = 3
OPTIONAL MATCH (src)<-[]-()<-[]-(srcParent)<-[]-(center)-[]->(destParent)-[]->()-[]->(dest1:destType)
WHERE srcParent.prop1='a'
WITH src, COLLECT(dest1) as dests
OPTIONAL MATCH (src)<-[]-()<-[]-(srcParent)-[]->(destParent)-[]->()-[]->(dest2:destType)
WHERE destParent.prop2='b'
WITH src, dests + COLLECT(dest2) as dests
OPTIONAL MATCH (src)<-[]-()<-[]-(srcParent)<-[]-(destParent)-[]->()-[]->(dest3:destType)
WHERE srcParent.prop3='c'
RETURN dests + COLLECT(dest3) as dests
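Some newer Neo4j versions may reject an expression that mixes an aggregate with a non-grouped variable (as in dests + COLLECT(dest2)). A more conservative variant, sketched here with the illustrative names dests1, dests2 and dests3, collects each pattern into its own list and concatenates at the end; DISTINCT is an optional addition to drop duplicates within each list:

```cypher
MATCH (src:srcType)
WHERE id(src) = 3
OPTIONAL MATCH (src)<--()<--(srcParent)<--(center)-->(destParent)-->()-->(dest1:destType)
WHERE srcParent.prop1 = 'a'
WITH src, collect(DISTINCT dest1) AS dests1
OPTIONAL MATCH (src)<--()<--(srcParent)-->(destParent)-->()-->(dest2:destType)
WHERE destParent.prop2 = 'b'
// dests1 is carried along as a grouping key, not mixed into the aggregate.
WITH src, dests1, collect(DISTINCT dest2) AS dests2
OPTIONAL MATCH (src)<--()<--(srcParent)<--(destParent)-->()-->(dest3:destType)
WHERE srcParent.prop3 = 'c'
WITH dests1, dests2, collect(DISTINCT dest3) AS dests3
RETURN dests1 + dests2 + dests3 AS dests
```

collect() skips nulls, so a pattern with no match simply contributes an empty list.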
In Neo4j 2.0 this query:
MATCH (n) WHERE n.username = 'blevine'
OPTIONAL MATCH n-[:Person]->person
OPTIONAL MATCH n-[:UserLink]->role
RETURN n AS user,person,collect(role) AS roles
returns different results than this query:
START n = node(*) WHERE n.username = 'blevine'
OPTIONAL MATCH n-[:Person]->person
OPTIONAL MATCH n-[:UserLink]->role
RETURN n AS user,person,collect(role) AS roles
The first query works as expected returning a single Node for 'blevine' and the associated Nodes mentioned in the OPTIONAL MATCH clauses. The second query returns many more Nodes which do not even have a username property. I realize that start n = node(*) is not recommended and that START is not even required in 2.0. But the second form (with OPTIONAL MATCH replaced with question marks on the relationship type) worked prior to 2.0. In the second form, why is 'n' not being constrained to the single 'blevine' node by the first WHERE clause?
To run the second query as expected, you just need to add WITH n. The result of START ... WHERE has to be filtered first and then passed on to the OPTIONAL MATCH clauses, which is what WITH does:
START n = node(*) WHERE n.username = 'blevine'
WITH n
OPTIONAL MATCH n-[:Person]->person
OPTIONAL MATCH n-[:UserLink]->role
RETURN n AS user,person,collect(role) AS roles
From the documentation
WHERE defines the MATCH patterns in more detail. The predicates are part of the
pattern description, not a filter applied after the matching is done.
This means that WHERE should always be put together with the MATCH clause it belongs to.
When you write START n=node(*) WHERE n.name="xyz", you need to pass the result explicitly into the following OPTIONAL MATCH clauses. When you write MATCH (n) WHERE n.name="xyz", the WHERE is part of the MATCH and tells the database which node to start from.
EDIT
Here is the thing. The documentation says OPTIONAL MATCH returns null if a pattern is not found, so the START form of your query also includes results where n has no username property, or where n does not have the relationship named in the OPTIONAL MATCH pattern. When you add WITH n, the database is explicitly told to use only the filtered n.
Excerpt from the documentation:
OPTIONAL MATCH matches patterns against your graph database, just like MATCH does.
The difference is that if no matches are found, OPTIONAL MATCH will use NULLs for
missing parts of the pattern. OPTIONAL MATCH could be considered the Cypher
equivalent of the outer join in SQL.
Either the whole pattern is matched, or nothing is matched. Remember that
WHERE is part of the pattern description, and the predicates will be
considered while looking for matches, not after. This matters especially
in the case of multiple (OPTIONAL) MATCH clauses, where it is crucial to
put WHERE together with the MATCH it belongs to.
A few more things to note about the behaviour of the WHERE clause.
Excerpts:
WHERE is not a clause in its own right; rather, it's part of MATCH,
OPTIONAL MATCH, START and WITH.
In the case of WITH and START, WHERE simply filters the results.
For MATCH and OPTIONAL MATCH on the other hand, WHERE adds constraints
to the patterns described. It should not be seen as a filter after the
matching is finished.
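The difference can be illustrated with two variants of the query from the question. This is only a sketch: the role.name property and the value 'admin' are assumptions introduced for the example.

```cypher
// WHERE directly after OPTIONAL MATCH constrains the optional pattern:
// the user row is always kept, and role is null when no admin role matches.
MATCH (n) WHERE n.username = 'blevine'
OPTIONAL MATCH (n)-[:UserLink]->(role)
WHERE role.name = 'admin'
RETURN n AS user, role;

// WHERE after WITH filters rows: the user is dropped entirely
// when no admin role was matched.
MATCH (n) WHERE n.username = 'blevine'
OPTIONAL MATCH (n)-[:UserLink]->(role)
WHERE role.name = 'admin'
WITH n, role
WHERE role IS NOT NULL
RETURN n AS user, role;
```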
I'm trying to create a Cypher query that will find the ingredients a chef is missing. My graph is set up like so:
(ingredient_value)-[:is_part_of]->(ingredient)
(ingredient) would have a key/value of name="dye colors". (ingredient_value) could have a key/value of value="red" and "is part of" the (ingredient, name="dye colors").
(chef)-[:has_value]->(ingredient_value)<-[:requires_value]-(recipe)-[:requires_ingredient]->(ingredient)
I'm using this query to get all the ingredients (but not their actual values) that a recipe requires, but I would like to return only the ingredients that the chef does not have, instead of all the ingredients each recipe requires. I tried
(chef)-[:has_value]->(ingredient_value)<-[:requires_value]-(recipe)-[:requires_ingredient]->(ingredient)<-[:has_ingredient*0..0]-chef
but this returned nothing.
Is this something that can be accomplished with Cypher/Neo4j, or is it best handled by returning all ingredients and sorting through them myself?
Bonus: is there also a way to use Cypher to match all values that a chef has against all values that a recipe requires? So far I have only returned the partial matches produced by chef-[:has_value]->ingredient_value<-[:requires_value]-recipe and aggregated the results myself.
Update 01/10/2013:
Came across this in the Neo4j 2.0 reference:
Try not to use optional relationships.
Above all,
don’t use them like this:
MATCH a-[r?:LOVES]->() WHERE r IS NULL where you just make sure that they don’t exist.
Instead do this like so:
MATCH (a) WHERE NOT (a)-[:LOVES]->()
Using cypher for checking if relationship doesn't exist:
...
MATCH source-[r?:someType]-target
WHERE r is null
RETURN source
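The ? makes the relationship optional.
OR
In Neo4j 2, do:
...
OPTIONAL MATCH (source)-[r:someType]-(target)
WITH source, r
WHERE r IS NULL
RETURN source
The WITH is needed because a WHERE placed directly after OPTIONAL MATCH is part of the pattern rather than a filter. Now you can check for a non-existing (null) relationship.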
Fetching nodes that do not have a given relationship
This is a good way to check whether a relationship exists or not:
MATCH (player)
WHERE NOT(player)-[:played]->()
RETURN player
You can also check multiple relationship types at once.
This returns all nodes that have neither a "played" nor a "notPlayed" relationship:
MATCH (player)
WHERE NOT (player)-[:played|notPlayed]->()
RETURN player
To fetch nodes that do not have any relationship at all:
MATCH (player)
WHERE NOT (player)--()
RETURN player
This checks that the node has no incoming or outgoing relationships.
If you need "conditional exclude" semantics, you can achieve them this way.
As of neo4j 2.2.1, you can use OPTIONAL MATCH clause and filter out the unmatched(NULL) nodes.
It is also important to use WITH clause between the OPTIONAL MATCH and WHERE clauses, so that the first WHERE defines a condition for the optional match and the second WHERE behaves like a filter.
Assume we have 2 types of nodes: Person and Communication. If I want to get all Persons who have never communicated by telephone, but may have communicated in other ways, I would write this query:
MATCH (p: Person)
OPTIONAL MATCH (p)--(c:Communication)
WHERE c.way = 'telephone'
WITH p, c
WHERE c IS NULL
RETURN p
The OPTIONAL MATCH pairs each Person with their telephone Communications; for a Person with no telephone Communication, c is NULL. The filter (the WHERE after WITH) then keeps only the rows where c is NULL, that is, the Persons who never communicated by telephone.
References:
http://neo4j.com/docs/stable/query-optional-match.html#_introduction_3
http://java.dzone.com/articles/new-neo4j-optional
I wrote a gist showing how this can be done quite naturally using Cypher 2.0
http://gist.neo4j.org/?9171581
The key point is to OPTIONAL MATCH the available ingredients and then filter for missing (null) ingredients or ingredients with the wrong value.
Note that the notation is declarative and doesn't need to describe an algorithm; you just write down what you need.
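A minimal sketch of that idea, not taken from the gist itself. It assumes a :Chef label and a name property (hypothetical, borrowed from the Gremlin example elsewhere in this thread), and missingValues is an illustrative alias:

```cypher
// For every value a recipe requires, try to find the chef's matching has_value relationship.
MATCH (chef:Chef {name: 'chef1'}), (recipe)-[:requires_value]->(required)
OPTIONAL MATCH (chef)-[has:has_value]->(required)
// Keep only the required values the chef lacks (has is null).
WITH recipe, required, has
WHERE has IS NULL
RETURN recipe, collect(required) AS missingValues
```

Recipes for which the chef has every required value do not appear in the result, since all of their rows are filtered out.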
The last query should be:
START chef = node(..)
MATCH (chef)-[:has_value]->(ingredient_value)<-[:requires_value]-(recipe)-[:requires_ingredient]->(ingredient)
WHERE NOT (ingredient)<-[:has_ingredient]-(chef)
RETURN ingredient
This pattern: (ingredient)<-[:has_ingredient*0..0]-chef
is the reason it didn't return anything. *0..0 means that the path must contain zero relationships, so ingredient and chef would have to be the same node, which they are not.
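In Cypher 2.0+ syntax, the missing-ingredient check can be expressed with a negated pattern instead of a zero-length relationship. This is a sketch only; the :Chef label and the name property are assumptions.

```cypher
// Ingredients required by recipes the chef partially matches,
// excluding those the chef already has.
MATCH (chef:Chef {name: 'chef1'})-[:has_value]->()<-[:requires_value]-(recipe)-[:requires_ingredient]->(ingredient)
WHERE NOT (chef)-[:has_ingredient]->(ingredient)
RETURN DISTINCT recipe, ingredient
```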
I completed this task using gremlin. I did
x=[]
g.idx('Chef')[[name:'chef1']].as('chef')
.out('has_ingredient').as('alreadyHas').aggregate(x).back('chef')
.out('has_value').as('values')
.in('requires_value').as('recipes')
.out('requires_ingredient').as('ingredients').except(x).path()
This returned the paths of all the missing ingredients. I was unable to formulate this in the cypher language, at least for version 1.7.
In newer versions of Neo4j, a query like this produces a deprecation warning:
MATCH (ingredient:Ingredient)
WHERE NOT (:Chef)-[:HAS_INGREDIENT]->(ingredient)
RETURN * LIMIT 100;
This feature is deprecated and will be removed in future versions.
Coercion of list to boolean is deprecated. Please consider using NOT isEmpty(...) instead.
To fix it:
MATCH (ingredient:Ingredient)
WHERE NOT EXISTS((:Chef)-[:HAS_INGREDIENT]->(ingredient))
RETURN * LIMIT 100;
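On versions that support existential subqueries (Neo4j 4.0 and later), the same check can also be written with EXISTS { ... }. This is offered as an alternative sketch, not as the only correct form:

```cypher
MATCH (ingredient:Ingredient)
// True when no Chef has a HAS_INGREDIENT relationship to this ingredient.
WHERE NOT EXISTS {
  MATCH (:Chef)-[:HAS_INGREDIENT]->(ingredient)
}
RETURN ingredient LIMIT 100;
```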