Efficient querying of array property in Neo4j


I want to query for nodes that have a specific value in a string array property. For instance, my nodes might have two properties, name (a string) and aliases (a string array). I have created an index on both properties using something like CREATE INDEX ON :F2(name).
I can query the name property using Cypher like this, and the result is immediate:
MATCH (p:F2) WHERE p.name = 'john' RETURN p;
I can query the aliases property using Cypher like this, and I get the expected result, but the response is very slow:
MATCH (p:F2) WHERE ANY(item IN p.aliases WHERE item = 'big john') RETURN p;
It looks like either the query is not optimal or the index is not being used.
Can someone suggest how to do this correctly? I'm pretty new to Neo4j and Cypher :-(

You could refactor your graph to make each alias a node, so that any F2 node has zero or more Alias nodes attached. An index on the alias name can then be used for the lookup:
CREATE INDEX ON :Alias(name)
Then you could query it with something like this...
MATCH (p:F2)-[:HAS_ALIAS]->(:Alias {name: 'big john'})
RETURN p
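As a rough sketch of that refactoring, the Python snippet below only holds the Cypher text and builds a parameter map; it does not connect to a database. The one-off migration statement (UNWIND over p.aliases, then MERGE) and the helper name find_by_alias are assumptions for illustration, not part of the original answer. The {alias} placeholder follows the older parameter syntax used elsewhere on this page; Neo4j 4.0 and later expect $alias instead.

```python
# Cypher is kept as plain strings; nothing here talks to a database.

# Assumed one-off migration: turn every entry of the aliases array
# into an :Alias node linked from its :F2 node.
MIGRATE_ALIASES = """
MATCH (p:F2)
UNWIND p.aliases AS alias
MERGE (a:Alias {name: alias})
MERGE (p)-[:HAS_ALIAS]->(a)
"""

# Lookup that can start from the :Alias(name) index.
FIND_BY_ALIAS = """
MATCH (p:F2)-[:HAS_ALIAS]->(:Alias {name: {alias}})
RETURN p
"""


def find_by_alias(alias):
    """Return the (statement, parameters) pair to pass to your driver."""
    return FIND_BY_ALIAS, {"alias": alias}
```

Passing the alias as a parameter keeps the statement text constant, so the server can reuse the query plan for every lookup.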

Related

Match on property types in Neo4j

Is there a way to match nodes in Neo4j/Cypher based on the type of a property value? I'm looking for something like this:
MATCH (n:Person)
WHERE NOT(n.id_number isa STRING)
RETURN n
The closest I can think of is
MATCH (n:Person)
WHERE NOT(n.id_number = toString(n.id_number))
RETURN n
Although this is still pretty fast, it doesn't use an index, according to PROFILE, whereas I think an isa-style query could use an index.

You can use apoc.meta.type from the APOC library, which returns the type name of a value: INTEGER, FLOAT, STRING, BOOLEAN, RELATIONSHIP, NODE, PATH, NULL, UNKNOWN, MAP, or LIST.
Ref: https://community.neo4j.com/t/data-type-of-a-property/1309/2

Aside from your workaround, Cypher has no way to match nodes by property value type.

How to query for multiple OR'ed Neo4j paths?

Does anyone know of a fast way to query multiple paths in Neo4j?
Let's say I have movie nodes that can have a type that I want to match (this is pseudo-code):
MATCH
(m:Movie)<-[:TYPE]-(g:Genre { name:'action' })
OR
(m:Movie)<-[:TYPE]-(x:Genre)<-[:G_TYPE*1..3]-(g:Genre { name:'action' })
(m)-[:SUBGENRE]->(sg:SubGenre {name: 'comedy'})
OR
(m)-[:SUBGENRE]->(x)<-[:SUB_TYPE*1..3]-(sg:SubGenre {name: 'comedy'})
The problem is that the first "m:Movie" nodes to be matched must match one of the paths specified, and the second SubGenre match is dependent on the first match.
I can make a query that works using MATCH and WHERE, but it's really slow (30 seconds with a small 20MB dataset).
I don't know how to OR-match in Neo4j with other OR matches hanging off the first results.
If I use WHERE, I have to declare all the nodes used in any of the statements in the initial MATCH, which makes the query slow (since you cannot introduce new nodes in a WHERE).
Does anyone know an elegant way to solve this? Thanks!

You can try a variable length path with a minimal length of 0:
MATCH
(m:Movie)<-[:TYPE|:SUBGENRE*0..4]-(g)
WHERE (g:Genre AND g.name = 'action') OR (g:SubGenre AND g.name = 'comedy')
RETURN DISTINCT m
For the query to use an index to find your genre or subgenre, though, I recommend a UNION query. Each branch needs its own MATCH and RETURN:
MATCH
(m:Movie)<-[:TYPE*0..4]-(g:Genre { name:'action' })
RETURN distinct m
UNION
MATCH
(m:Movie)-[:SUBGENRE]->(x)<-[:SUB_TYPE*1..3]-(sg:SubGenre {name: 'comedy'})
RETURN distinct m

Perhaps the OPTIONAL MATCH clause might help here. OPTIONAL MATCH behaves like MATCH, except that instead of an all-or-nothing approach to pattern matching, variables introduced by a pattern that cannot be matched are bound to null.
For example, to match on a movie, its genre and a possible sub-genre:
MATCH (m:Movie)-[:IS_GENRE]->(g:Genre)
WHERE m.title = "The Matrix"
OPTIONAL MATCH (g)<-[:IS_SUBGENRE]-(sub:Genre)
RETURN m, g, sub
This will return the movie node, the genre node and if it exists, the sub-genre. If there is no sub-genre then it will return null for sub. You can use variable length paths as you have above as well with OPTIONAL MATCH.

[EDITED]
The following MATCH clause should be equivalent to your pseudocode. There is also a USING INDEX clause that assumes you have first created an index on :SubGenre(name), for efficiency. (You could use an index on :Genre(name) instead, if Genre nodes are more numerous than SubGenre nodes.)
MATCH
(m:Movie)<-[:TYPE*0..4]-(g:Genre { name:'action' }),
(m)-[:SUBGENRE]->()<-[:SUB_TYPE*0..3]-(sg:SubGenre { name: 'comedy' })
USING INDEX sg:SubGenre(name)
Here is a console that shows the results for some sample data.

neo4j query error "dont know how to compare"

I loaded the Pizza.owl file into Neo4j using the HermiT reasoner and Java.
When I run a simple query:
match (n) where n="name:Pizza" return n;
I get the following error:
Don't know how to compare that. Left: Node[1]{name:"owl:Thing"} (NodeProxy); Right: "name:Pizza" (String)
Is NodeProxy a datatype? How can I make the two comparable? Can I cast while querying? Is there a query to change the datatype of all nodes in the graph? How can I check the type of a node?

You are comparing a node n to a string "name:Pizza", which doesn't make sense. What you want is to compare the property name of node n with the string "Pizza": WHERE n.name = "Pizza". The whole query then looks like this
MATCH (n)
WHERE n.name = "Pizza"
RETURN n
Nodes don't really have types. Take a look at the Neo4j manual to learn more about nodes, relationships, properties and labels, about Cypher in general, and about the WHERE clause in particular.

Create node and relationship given parent node

I am creating a word tree but when I execute this cypher query:
word = "MATCH {} MERGE {}-[:contains]->(w:WORD {{name:'{}'}}) RETURN w"
.format(parent_node, parent_node, locality[i])
where parent_node is of type Node.
It throws this error:
py2neo.cypher.error.statement.InvalidSyntax: Can't create `n8823` with properties or labels here. It already exists in this context
The formatted query looks like this:
'MATCH (n8823:HEAD {name:"sanjay"}) MERGE (n8823:HEAD {name:"sanjay"})-[:contains]->(w:WORD {name:\'colony\'}) RETURN w'

The formatted query is broken and won't work, but I also don't see how that could be what the formatted query actually looks like. When you do your string format you pass the same parameter (parent_node) twice so the final string should repeat whatever that parameter looks like. It doesn't, and instead has two different patterns for the match and merge clauses.
Your query should look something like
MATCH (n8823:Head {name: "sanjay"})
MERGE (n8823)-[:CONTAINS]->(w:Word {name: "colony"})
RETURN w
It's probably a bad idea to do string formatting on a Node object. Better to either use property values from your node object in a Cypher query to match the right node (and only the variable that you bind the matched node to in the merge clause) or use the methods of the node object to do the merge.

Although the MERGE clause is able to bind identifiers (like n8823), Cypher unfortunately does not allow MERGE to re-bind an identifier that had already been bound -- even if it would not actually change the binding. (On the other hand, the MATCH clause does allow "rebinding" to the same binding.) Simply re-using a bound identifier is OK, though.
So, the workaround is to change your Cypher query to re-use the bound identifier. Also, the recommended way to dynamically specify query data without changing the overall structure of a query is to use "query parameters". For py2neo, code along these lines should work for you (note that the parent_name variable would contain a name string, like "sanjay"):
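from py2neo import Graph

graph = Graph()
cypher = graph.cypher
# {a} and {b} are Cypher query parameters filled in by the server,
# so no Python string formatting or manual quoting is needed.
# The HEAD label is taken from the formatted query in the question.
results = cypher.execute(
    "MATCH (foo:HEAD {name:{a}}) MERGE (foo)-[:contains]->(w:WORD {name:{b}}) RETURN w",
    a=parent_name, b=locality[i])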
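To make the query-parameter idea concrete for a whole list of words, here is a minimal sketch that only prepares statement and parameter pairs and never contacts a database. The names MERGE_WORD and word_statements, and the sample words, are hypothetical; the labels and the legacy {param} placeholder syntax are taken from the question (Neo4j 4.0 and later would use $parent_name and $word).

```python
# One fixed statement: the parent is matched once, then re-used by
# its variable name inside MERGE without labels or properties.
MERGE_WORD = (
    "MATCH (parent:HEAD {name: {parent_name}}) "
    "MERGE (parent)-[:contains]->(w:WORD {name: {word}}) "
    "RETURN w"
)


def word_statements(parent_name, words):
    """Yield one (statement, parameters) pair per word.

    Only the parameter map changes between words; the statement
    text stays identical.
    """
    for word in words:
        yield MERGE_WORD, {"parent_name": parent_name, "word": word}


# Sample values for illustration only.
pairs = list(word_statements("sanjay", ["colony", "street"]))
```

Each pair can then be handed to whatever execute call your driver provides, for example the cypher.execute call shown above.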

Aggregate over a list of matching nodes

I have a Cypher query which I'd like to expand to be summed up over a list of matching nodes.
My query looks like this:
MATCH (u:User {name: {input} })-[r:USES]-(t) RETURN SUM(t.weight)
This matches one User node, and I'd like to adjust it to match a list of User nodes and then perform the aggregation.
My current implementation just calls the query in a loop and performs the aggregation outside of Cypher. This results in slightly inaccurate results and a lot of API calls.
Is there a way to evaluate the Cypher query against a list of elements (strings in my case)?
I'm using Neo4j 2.1 or 2.2.
Cheers

You can use the IN operator and pass in an array of usernames:
MATCH (u:User)-[r:USES]-(t)
WHERE u.name in ['John','Jim','Jack']
RETURN u.name, SUM(t.weight)
Instead of the literal array, you can also use a parameter holding an array value:
MATCH (u:User)-[r:USES]-(t)
WHERE u.name in {userNames}
RETURN u.name, SUM(t.weight)
userNames = ['John','Jim','Jack']
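A small sketch of how the userNames parameter might be prepared on the client side follows. It builds the statement and parameter map only, without connecting to Neo4j; the helper name weights_query is hypothetical, and the {userNames} placeholder is the legacy syntax from the answer above ($userNames in Neo4j 4.0 and later).

```python
# Same statement as in the answer, kept as a constant string.
SUM_WEIGHTS = """
MATCH (u:User)-[r:USES]-(t)
WHERE u.name IN {userNames}
RETURN u.name, SUM(t.weight)
"""


def weights_query(user_names):
    """Return (statement, parameters) for one call covering all users."""
    # Convert to a list so any iterable (tuple, set, generator) is accepted.
    return SUM_WEIGHTS, {"userNames": list(user_names)}
```

This replaces the loop of one API call per user with a single call. Dropping u.name from the RETURN clause would give one grand total instead of one sum per user.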
