Match on property types in Neo4j - neo4j

Is there a way to match nodes in Neo4j/Cypher based on the type of a property value? I'm looking for something like this:
MATCH (n:Person)
WHERE NOT(n.id_number isa STRING)
RETURN n
The closest I can think of is
MATCH (n:Person)
WHERE NOT(n.id_number = toString(n.id_number))
RETURN n
Although this is still pretty fast, it doesn't use an index, according to PROFILE, whereas I think an isa-style query could use an index.

Using apoc.meta.type procedure which returns type name of a value like INTEGER,FLOAT,STRING,BOOLEAN,RELATIONSHIP,NODE,PATH,NULL,UNKNOWN,MAP,LIST.
Ref: https://community.neo4j.com/t/data-type-of-a-property/1309/2

Aside from your workaround, Cypher has no way to match nodes by property value type.

Related

What is difference between ANY and IN clause in Neo4j?

Neo4j has two different clauses to find value in an array, ANY and IN. Please explain, how they are different as both are used to filter the data by checking if, the specified value is present in the array or not.
Query#1 : MATCH (n) WHERE any(color IN n.liked_colors WHERE color = 'yellow') RETURN n
Result#1: Node with name Eskil
Query#2 : Match (n) where 'yellow' in n.liked_colors return n
Result#2: Node with name Eskil
If both query return same results then where is the diffrence.
In your simple example case the results are the same.
However any() allows more complex means of filtering, especially when we aren't using this to simply detect membership of a known value in a list:
MATCH (n:Person)
WHERE any(color in n.liked_colors WHERE (n)<-[:KNOWS|MARRIED*]-(other:Person {eyes:color}))
Find a person where for any of their liked colors, we can traverse the given pattern to find another person with eyes of that color.

Find nodes where property is not a specific value without WHERE

A node's property has 6 categories. I'd like to leave all the nodes with this property not equal to one of the categories.
It's easy to do with WHERE like this:
MATCH (a)
WHERE a.property <> "category"
RETURN a
I'd like to do it another way without where because it seems to be more efficient. I imagine it like this:
MATCH ( a {property <> "category"} )
RETURN a
Is it possible?
Neo4j MATCH does not have a syntax to inline WHERE NOT <property>=<value>. Furthermore, Cypher is declarative, meaning it only defines what to return, not how to return it. So MATCH (n{id:1}) is equivalent (in execution) to MATCH (n) WHERE n.id=1. The only time WHERE vs inline produces different execution plans is when you don't pair the WHERE clause with the MATCH. By trying to "optimize" your cypher for execution, most of the time you will just be getting in the Cypher planners way. (Unless your original cypher was over complicated)

efficient querying of array property in neo4j

I've wanting to query for nodes that have a specific value in a string array property. For instance my nodes might have 2 properties, name (a string) and aliases (a string array). I have created an index on both properties using something like CREATE INDEX ON :F2(name).
I can query the name property using something cypher like this, and the result is immediate:
MATCH (p:F2) WHERE p.name = 'john' RETURN p;
I can query the aliases property using cypher like this, and I get the expected result but the response is very slow:
MATCH (p:F2) WHERE ANY(item IN p.aliases WHERE item = 'big john') RETURN p;
This looks like either the query is not optimal or the index is not being used.
Can someone suggest how to do this correctly. I'm pretty new to neo4j and cyper :-(
You could refactor your graph to make alias a node. So that any F2 node has zero or more aliases.
CREATE INDEX ON :Alias(name)
Then you could query it with something like this...
MATCH (p:F2)-[:HAS_ALIAS]->(:Alias {name: 'big john'})
RETURN p

How to query for multiple OR'ed Neo4j paths?

Anyone know of a fast way to query multiple paths in Neo4j ?
Lets say I have movie nodes that can have a type that I want to match (this is psuedo-code)
MATCH
(m:Movie)<-[:TYPE]-(g:Genre { name:'action' })
OR
(m:Movie)<-[:TYPE]-(x:Genre)<-[:G_TYPE*1..3]-(g:Genre { name:'action' })
(m)-[:SUBGENRE]->(sg:SubGenre {name: 'comedy'})
OR
(m)-[:SUBGENRE]->(x)<-[:SUB_TYPE*1..3]-(sg:SubGenre {name: 'comedy'})
The problem is, the first "m:Movie" nodes to be matched must match one of the paths specified, and the second SubGenre is depenedent on the first match.
I can make a query that works using MATCH and WHERE, but its really slow (30 seconds with a small 20MB dataset).
The problem is, I don't know how to OR match in Neo4j with other OR matches hanging off of the first results.
If I use WHERE, then I have to declare all the nodes used in any of the statements, in the initial MATCH which makes the query slow (since you cannot introduce new nodes in a WHERE)
Anyone know an elegant way to solve this ?? Thanks !
You can try a variable length path with a minimal length of 0:
MATCH
(m:Movie)<-[:TYPE|:SUBGENRE*0..4]-(g)
WHERE g:Genre and g.name = 'action' OR g:SubGenre and g.name='comedy'
For the query to use an index to find your genre / subgenre I recommend a UNION query though.
MATCH
(m:Movie)<-[:TYPE*0..4]-(g:Genre { name:'action' })
RETURN distinct m
UNION
(m:Movie)-[:SUBGENRE]->(x)<-[:SUB_TYPE*1..3]-(sg:SubGenre {name: 'comedy'})
RETURN distinct m
Perhaps the OPTIONAL MATCH clause might help here. OPTIONAL MATCH beavior is similar to the MATCH statement, except that instead of an all-or-none pattern matching approach, any elements of the pattern that do not match the pattern specific in the statement are bound to null.
For example, to match on a movie, its genre and a possible sub-genre:
OPTIONAL MATCH (m:Movie)-[:IS_GENRE]->(g:Genre)<-[:IS_SUBGENRE]-(sub:Genre)
WHERE m.title = "The Matrix"
RETURN m, g, sub
This will return the movie node, the genre node and if it exists, the sub-genre. If there is no sub-genre then it will return null for sub. You can use variable length paths as you have above as well with OPTIONAL MATCH.
[EDITED]
The following MATCH clause should be equivalent to your pseudocode. There is also a USING INDEX clause that assumes you have first created an index on :SubGenre(name), for efficiency. (You could use an index on :Genre(name) instead, if Genre nodes are more numerous than SubGenre nodes.)
MATCH
(m:Movie)<-[:TYPE*0..4]-(g:Genre { name:'action' }),
(m)-[:SUBGENRE]->()<-[:SUB_TYPE*0..3]-(sg:SubGenre { name: 'comedy' })
USING INDEX sg:SubGenre(name)
Here is a console that shows the results for some sample data.

Return node if relationship is not present

I'm trying to create a query using cypher that will "Find" missing ingredients that a chef might have, My graph is set up like so:
(ingredient_value)-[:is_part_of]->(ingredient)
(ingredient) would have a key/value of name="dye colors". (ingredient_value) could have a key/value of value="red" and "is part of" the (ingredient, name="dye colors").
(chef)-[:has_value]->(ingredient_value)<-[:requires_value]-(recipe)-[:requires_ingredient]->(ingredient)
I'm using this query to get all the ingredients, but not their actual values, that a recipe requires, but I would like the return only the ingredients that the chef does not have, instead of all the ingredients each recipe requires. I tried
(chef)-[:has_value]->(ingredient_value)<-[:requires_value]-(recipe)-[:requires_ingredient]->(ingredient)<-[:has_ingredient*0..0]-chef
but this returned nothing.
Is this something that can be accomplished by cypher/neo4j or is this something that is best handled by returning all ingredients and sorted through them myself?
Bonus: Also is there a way to use cypher to match all values that a chef has to all values that a recipe requires. So far I've only returned all partial matches that are returned by a chef-[:has_value]->ingredient_value<-[:requires_value]-recipe and aggregating the results myself.
Update 01/10/2013:
Came across this in the Neo4j 2.0 reference:
Try not to use optional relationships.
Above all,
don’t use them like this:
MATCH a-[r?:LOVES]->() WHERE r IS NULL where you just make sure that they don’t exist.
Instead do this like so:
MATCH (a) WHERE NOT (a)-[:LOVES]->()
Using cypher for checking if relationship doesn't exist:
...
MATCH source-[r?:someType]-target
WHERE r is null
RETURN source
The ? mark makes the relationship optional.
OR
In neo4j 2 do:
...
OPTIONAL MATCH source-[r:someType]-target
WHERE r is null
RETURN source
Now you can check for non-existing (null) relationship.
For fetching nodes with not any relationship
This is the good option to check relationship is exist or not
MATCH (player)
WHERE NOT(player)-[:played]->()
RETURN player
You can also check multiple conditions for this
It will return all nodes, which not having "played" Or "notPlayed" Relationship.
MATCH (player)
WHERE NOT (player)-[:played|notPlayed]->()
RETURN player
To fetch nodes which not having any realtionship
MATCH (player)
WHERE NOT (player)-[r]-()
RETURN player
It will check node not having any incoming/outgoing relationship.
If you need "conditional exclude" semantic, you can achieve it this way.
As of neo4j 2.2.1, you can use OPTIONAL MATCH clause and filter out the unmatched(NULL) nodes.
It is also important to use WITH clause between the OPTIONAL MATCH and WHERE clauses, so that the first WHERE defines a condition for the optional match and the second WHERE behaves like a filter.
Assuming we have 2 types of nodes: Person and Communication. If I want to get all Persons which have never communicated by the telephone, but may have communicated other ways, I would make this query:
MATCH (p: Person)
OPTIONAL MATCH p--(c: Communication)
WHERE c.way = 'telephone'
WITH p, c
WHERE c IS NULL
RETURN p
The match pattern will match all Persons with their communications where c will be NULL for non-telephone Communications. Then the filter(WHERE after WITH) will filter out telephone Communications leaving all others.
References:
http://neo4j.com/docs/stable/query-optional-match.html#_introduction_3
http://java.dzone.com/articles/new-neo4j-optional
I wrote a gist showing how this can be done quite naturally using Cypher 2.0
http://gist.neo4j.org/?9171581
The key point is to use optional match to available ingredients and then compare to filter for missing (null) ingredients or ingredients with the wrong value.
Note that the notion is declarative and doesn't need to describe an algorithm, you just write down what you need.
The last query should be:
START chef = node(..)
MATCH (chef)-[:has_value]->(ingredient_value)<-[:requires_value]-(recipe)-[:requires_ingredient]->(ingredient)
WHERE (ingredient)<-[:has_ingredient]-chef
RETURN ingredient
This pattern: (ingredient)<-[:has_ingredient*0..0]-chef
Is the reason it didn't return anything. *0..0 means that the length of the relationships must be zero, which means that ingredient and chef must be the same node, which they are not.
I completed this task using gremlin. I did
x=[]
g.idx('Chef')[[name:'chef1']].as('chef')
.out('has_ingredient').as('alreadyHas').aggregate(x).back('chef')
.out('has_value').as('values')
.in('requires_value').as('recipes')
.out('requires_ingredient').as('ingredients').except(x).path()
This returned the paths of all the missing ingredients. I was unable to formulate this in the cypher language, at least for version 1.7.
For new versions of Neo4j, you'll have this error:
MATCH (ingredient:Ingredient)
WHERE NOT (:Chef)-[:HAS_INGREDIENT]->(ingredient)
RETURN * LIMIT 100;
This feature is deprecated and will be removed in future versions.
Coercion of list to boolean is deprecated. Please consider using NOT isEmpty(...) instead.
To fix it:
MATCH (ingredient:Ingredient)
WHERE NOT EXISTS((:Chef)-[:HAS_INGREDIENT]->(ingredient))
RETURN * LIMIT 100;

Resources