Using neo4j community edition 2.x. In Cypher, I need to MATCH nodes in (two) different ways, then combine these (two) sets of matched nodes into single set (one variable name). This set would then be used for further action.
naive graph example (I can't post images)
I would like to find all knowledge of the squirrel, including the knowledge shared by the groups she is member of. (example is fictional)
I imagine something like this:
MATCH (u:User{username:'squirrel'}), (:User{username:'squirrel'})<-[:MEMBER]-(g:Group)
WITH "COMBINATION OF u AND g" AS ug
MATCH (ug)-[:KNOW_HOW]->(k:Knowledge)
RETURN k.type
Outcome should be both "crack nuts" and "escape predators".
In the place of "COMBINATION OF u AND g" I tried variations on collect(u)+collect(g), EXTRACT, etc. Without success.
So far the simplest working way I found is using UNION.
MATCH (u:User{username:'squirrel'})-[:KNOW_HOW]->(k:Knowledge)
RETURN k.type
UNION
MATCH (u:User{username:'squirrel'})<-[:MEMBER]-(:Group)-[:KNOW_HOW]->(k:Knowledge)
RETURN k.type
This might solve this simple example, but is not good for more complex queries. I seek the solution for more general problem: MATCH several sets of nodes, glue them into single set (single variable) and continue with this new set.
Any ideas, please? Am I missing something basic? Or is this impossible? Thanks!
Something possibly similar on grokbase.
edit:
With this hacky solution to similar question I was able to solve the problem by extracting internal ids from collection of nodes:
MATCH (u:User{username:'squirrel'}), (:User{username:'squirrel'})<-[:MEMBER]-(g:Group)
WITH [x in collect(u)+collect(g)|id(x)] as collectedIds MATCH (ug) WHERE id(ug) in collectedIds
MATCH (ug)-[:KNOW_HOW]->(k:Knowledge)
RETURN k.type
Could it be done any better?
At least since Neo4j 3.0 you can use variable-length pattern matching to solve this issue. Simply set explicitly the minimum length to 0 and move the label test to a separate WHERE clause:
MATCH (:User {username:'squirrel'}) <-[:MEMBER*0..1]- (ug)
WHERE ug:User OR ug:Group
WITH ug
MATCH (ug)-[:KNOW_HOW]->(k:Knowledge)
RETURN k.type
Not sure about the general case, but for this specific case, you might try to combine the two patterns into one as follows,
MATCH (u:User{username:'squirrel'})<-[:MEMBER*0..1]-()-[:KNOW_HOW]->(k:Knowledge)
RETURN k.type
The only general solution I found so far:
match required starting points (with different names)
collect internal ids of starting points
match the starting points with collected ids (now the starting points have single name)
do whatever action you need to do with the starting points
Now the code itself:
MATCH (u:User{username:'squirrel'}), (:User{username:'squirrel'})<-[:MEMBER]-(g:Group)
WITH [x in collect(u)+collect(g)|id(x)] as collectedIds MATCH (ug) WHERE id(ug) in collectedIds
MATCH (ug)-[:KNOW_HOW]->(k:Knowledge)
RETURN k.type
Related
I had another thread about this where someone suggested to do
MATCH (p:Person {person_id: '123'})
WHERE ANY(x IN $names WHERE
EXISTS((p)-[:BELONGS]-(:Face)-[:CORRESPONDS]-(:Image)-[:HAS_ACCESS_TO]-(:Dias {group_name: x})))
MATCH path=(p)-[:ASSOCIATED_WITH]-(:Person)
RETURN path
This does what I need it to, returns nodes that fit the criteria without returning the relationships, but now I need to include another param that is a list.
....(:Dias {group_name: x, second_name: y}))
I'm unsure of the syntax.. here's what I tried
WHERE ANY(x IN $names and y IN $names_2 WHERE..
this gives me a syntax error :/
Since the ANY() function can only iterate over a single list, it would be difficult to continue to use that for iteration over 2 lists (but still possible, if you create a single list with all possible x/y combinations) AND also be efficient (since each combination would be tested separately).
However, the new existenial subquery synatx introduced in neo4j 4.0 will be very helpful for this use case (I assume the 2 lists are passed as the parameters names1 and names2):
MATCH (p:Person {person_id: '123'})
WHERE EXISTS {
MATCH (p)-[:BELONGS]-(:Face)-[:CORRESPONDS]-(:Image)-[:HAS_ACCESS_TO]-(d:Dias)
WHERE d.group_name IN $names1 AND d.second_name IN $names2
}
MATCH path=(p)-[:ASSOCIATED_WITH]-(:Person)
RETURN path
By the way, here are some more tips:
If it is possible to specify the direction of each relationship in your query, that would help to speed up the query.
If it is possible to remove any node labels from a (sub)query and still get the same results, that would also be faster. There is an exception, though: if the (sub)query has no variables that are already bound to a value, then you would normally want to specify the node label for the one node that would be used to kick off that (sub)query (you can do a PROFILE to see which node that would be).
I'm trying to fill in my understanding of the fundamentals of neo4j (version 3.4).
I believe that all of the following produce the same results -- ie they are different syntax for doing exactly the same thing:
MATCH (ee {name: "Emil"}) RETURN ee;
MATCH (ee) WHERE ee.name = "Emil" RETURN ee;
MATCH (ee:Person {name: "Emil"}) RETURN ee;
MATCH (ee:Person) WHERE ee.name = "Emil" RETURN ee;
MATCH (ee:Person) WHERE (ee).name = "Emil" RETURN ee;
I actually have several questions:
Among this list is there a "best" way of doing a Node MATCH? Obviously using a :Label makes it more efficient, but the effect of WHERE vs prop maps is mysterious.
Are any of these out-right incorrect? That is to say that although they work, it's kind of unintended or a particularly bad pattern.
Are there additional ways of making a MATCH operation, further to this list? I'm curious for an exhaustive list (wrong or right).
Your clauses are NOT all the same.
The first group of (2) MATCH clauses DO NOT require the Person label for matched nodes. But the second group of (3) clauses DO require the Person label.
Putting a label (or many labels, as appropriate) on a node is generally a good idea, as matching on a label is a quick way to filter your nodes when performing a MATCH. And labels also aid in data model understandability. In addition, a label is required if you wanted to index a node property.
But whether or not you should specify a label (or several) on a specific a MATCH clause depends on the amount of filtering you are trying to do. Omitting all labels from a MATCH would be perfectly appropriate if at that point in your query you really wanted to match nodes with any label (or even no labels).
For an exhaustive list, you might include:
MATCH (ee) WHERE ee.name in ["Emil"] RETURN ee
This is not particularly helpful with a single item, but you can have multiple comma delimited items in a list or a collection. For example,
match (n:order{item:'widget'})
with collect(distinct n.customer_id) as customerCollection
match (c:Customers) where c.customer_id in [customerCollection]
return c.Name, c.City
There is the situation in which i need to match any one of the label of node.
We can do it for relationship types like
(n)-[:KNOWS|LOVES]->(m)
Can we match node labels like this?
eg.
MATCH (c:computer)<-[:belongs_to]-(comp:HP|IBM)
return comp
Currently I have tried this and it gives results, Is there any simpler way?
MATCH (c:computer)<-[:belongs_to]-(comp)
WHERE 'HP' IN labels(comp) OR 'IBM' IN labels(comp)
return comp
I Think
WHERE 'HP' IN labels(comp) OR 'IBM' IN labels(comp)
AND
WHERE comp:HP OR comp:IBM
Will work with same manner second one is simple to use
This form of your last query is at least simpler to write and more easily understood:
MATCH (c:computer)<-[:belongs_to]-(comp)
WHERE comp:HP OR comp:IBM
return comp;
Currently facing this same problem.
Since I have quite a few labels to match on (revealing a bit of a flaw in my architecture!) I found the following to solve this problem concisely:
MATCH (n:computer)
WHERE any(label in labels(n) WHERE label in ['HP', 'IBM'])
RETURN n
Can't seem to figure out, and I'm not sure it entirely possible. I have a graph like so
a-[:granted]->b-[:granted]->...x-[granted_source]>s
where b and x are of interest. While I already know a and s, the end points, which are defined in START clause.
Note that b and c could be one ( a->b->s ) or more then one ( a->b->c->x->s ) and the goal is to find the shortest path returning only the nodes that are pointed to by a 'granted' relationship.
The closest I've got is:
start s=node(21), p=node(2)
match paths=shortestPath(p-[:granted|granted_source*]->s)
return NODES(paths)
Which gives all the nodes, including start (p) and end (s). But I can't seem to filter out, or better would be to not return them at all, only the nodes that are pointed to by a granted relationship and in the order from (s) if possible. I'm on Neo4j 2.0b and I'm wondering if Labels, which I have no issue using, would be the better way to go? Any help would be very appreciated.
So, you want to chop the head and tail off of a collection of nodes? (Am I understanding that right?) How about:
start s=node(21), p=node(2)
match paths=shortestPath(p-[:granted|granted_source*]->s)
return NODES(paths)[1..-1]
I think I resolved it using a WITH, I think this is probably the best performance given that first the p-... are fetched, then all ...->s are fetched and then using the shortestPath() is used to get the 'in between' nodes. The results appear correct.
start s=node(21), p=node(2)
match p-[:granted]-x, y-[:granted_source]->s
with x, y
match paths=shortestPath(x-[:granted*]->y)
return NODES(paths)
I'm trying to create a query using cypher that will "Find" missing ingredients that a chef might have, My graph is set up like so:
(ingredient_value)-[:is_part_of]->(ingredient)
(ingredient) would have a key/value of name="dye colors". (ingredient_value) could have a key/value of value="red" and "is part of" the (ingredient, name="dye colors").
(chef)-[:has_value]->(ingredient_value)<-[:requires_value]-(recipe)-[:requires_ingredient]->(ingredient)
I'm using this query to get all the ingredients, but not their actual values, that a recipe requires, but I would like the return only the ingredients that the chef does not have, instead of all the ingredients each recipe requires. I tried
(chef)-[:has_value]->(ingredient_value)<-[:requires_value]-(recipe)-[:requires_ingredient]->(ingredient)<-[:has_ingredient*0..0]-chef
but this returned nothing.
Is this something that can be accomplished by cypher/neo4j or is this something that is best handled by returning all ingredients and sorted through them myself?
Bonus: Also is there a way to use cypher to match all values that a chef has to all values that a recipe requires. So far I've only returned all partial matches that are returned by a chef-[:has_value]->ingredient_value<-[:requires_value]-recipe and aggregating the results myself.
Update 01/10/2013:
Came across this in the Neo4j 2.0 reference:
Try not to use optional relationships.
Above all,
don’t use them like this:
MATCH a-[r?:LOVES]->() WHERE r IS NULL where you just make sure that they don’t exist.
Instead do this like so:
MATCH (a) WHERE NOT (a)-[:LOVES]->()
Using cypher for checking if relationship doesn't exist:
...
MATCH source-[r?:someType]-target
WHERE r is null
RETURN source
The ? mark makes the relationship optional.
OR
In neo4j 2 do:
...
OPTIONAL MATCH source-[r:someType]-target
WHERE r is null
RETURN source
Now you can check for non-existing (null) relationship.
For fetching nodes with not any relationship
This is the good option to check relationship is exist or not
MATCH (player)
WHERE NOT(player)-[:played]->()
RETURN player
You can also check multiple conditions for this
It will return all nodes, which not having "played" Or "notPlayed" Relationship.
MATCH (player)
WHERE NOT (player)-[:played|notPlayed]->()
RETURN player
To fetch nodes which not having any realtionship
MATCH (player)
WHERE NOT (player)-[r]-()
RETURN player
It will check node not having any incoming/outgoing relationship.
If you need "conditional exclude" semantic, you can achieve it this way.
As of neo4j 2.2.1, you can use OPTIONAL MATCH clause and filter out the unmatched(NULL) nodes.
It is also important to use WITH clause between the OPTIONAL MATCH and WHERE clauses, so that the first WHERE defines a condition for the optional match and the second WHERE behaves like a filter.
Assuming we have 2 types of nodes: Person and Communication. If I want to get all Persons which have never communicated by the telephone, but may have communicated other ways, I would make this query:
MATCH (p: Person)
OPTIONAL MATCH p--(c: Communication)
WHERE c.way = 'telephone'
WITH p, c
WHERE c IS NULL
RETURN p
The match pattern will match all Persons with their communications where c will be NULL for non-telephone Communications. Then the filter(WHERE after WITH) will filter out telephone Communications leaving all others.
References:
http://neo4j.com/docs/stable/query-optional-match.html#_introduction_3
http://java.dzone.com/articles/new-neo4j-optional
I wrote a gist showing how this can be done quite naturally using Cypher 2.0
http://gist.neo4j.org/?9171581
The key point is to use optional match to available ingredients and then compare to filter for missing (null) ingredients or ingredients with the wrong value.
Note that the notion is declarative and doesn't need to describe an algorithm, you just write down what you need.
The last query should be:
START chef = node(..)
MATCH (chef)-[:has_value]->(ingredient_value)<-[:requires_value]-(recipe)-[:requires_ingredient]->(ingredient)
WHERE (ingredient)<-[:has_ingredient]-chef
RETURN ingredient
This pattern: (ingredient)<-[:has_ingredient*0..0]-chef
Is the reason it didn't return anything. *0..0 means that the length of the relationships must be zero, which means that ingredient and chef must be the same node, which they are not.
I completed this task using gremlin. I did
x=[]
g.idx('Chef')[[name:'chef1']].as('chef')
.out('has_ingredient').as('alreadyHas').aggregate(x).back('chef')
.out('has_value').as('values')
.in('requires_value').as('recipes')
.out('requires_ingredient').as('ingredients').except(x).path()
This returned the paths of all the missing ingredients. I was unable to formulate this in the cypher language, at least for version 1.7.
For new versions of Neo4j, you'll have this error:
MATCH (ingredient:Ingredient)
WHERE NOT (:Chef)-[:HAS_INGREDIENT]->(ingredient)
RETURN * LIMIT 100;
This feature is deprecated and will be removed in future versions.
Coercion of list to boolean is deprecated. Please consider using NOT isEmpty(...) instead.
To fix it:
MATCH (ingredient:Ingredient)
WHERE NOT EXISTS((:Chef)-[:HAS_INGREDIENT]->(ingredient))
RETURN * LIMIT 100;