Find nodes where property is not a specific value without WHERE - neo4j

A node property can take one of 6 category values. I'd like to keep only the nodes whose property is not equal to one particular category.
It's easy to do with WHERE like this:
MATCH (a)
WHERE a.property <> "category"
RETURN a
I'd like to do it another way, without WHERE, because that seems like it would be more efficient. I imagine something like this:
MATCH ( a {property <> "category"} )
RETURN a
Is it possible?

Neo4j's MATCH has no syntax for inlining WHERE NOT <property>=<value>; the inline property map only expresses equality. Furthermore, Cypher is declarative, meaning it only defines what to return, not how to return it. So MATCH (n{id:1}) is equivalent (in execution) to MATCH (n) WHERE n.id=1. The only time WHERE and the inline form produce different execution plans is when the WHERE clause is not paired directly with the MATCH. By trying to "optimize" your Cypher for execution, most of the time you will just be getting in the Cypher planner's way (unless your original Cypher was overcomplicated).
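One way to check this yourself, assuming a Neo4j version that supports EXPLAIN (2.2 or later), is to compare the plans of the two forms. The queries below are only a sketch that reuses the id and property names from this thread:

```cypher
// Inline property map: only equality can be expressed this way.
EXPLAIN MATCH (n {id: 1}) RETURN n;

// The same predicate in a WHERE paired with the MATCH;
// the planner is expected to produce the same plan as above.
EXPLAIN MATCH (n) WHERE n.id = 1 RETURN n;

// Inequality has no inline form, so WHERE is required.
// Note: nodes that lack the property are not returned,
// because null <> 'category' evaluates to null, not true.
EXPLAIN MATCH (a) WHERE a.property <> 'category' RETURN a;
```

EXPLAIN only shows the plan without running the query, so it is safe to try on a large graph.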


What is correct way of finding path between 2 nodes without getting warnings in neo4j?

I am currently working with neo4j and I need to find path between 2 nodes in large graph. I am using this cypher query:
MATCH p=(acq:Acquisition {id:'1'})-[r*]->(ecs:ExternalCommunicationService {id:'1'})
RETURN p
Everything works as expected (the query returns paths of any length between the nodes), but I am getting the warning message displayed below:
Warning: This feature is deprecated and will be removed in future versions. Binding relationships to a list in a variable length pattern is deprecated.
The official documentation uses the same pattern with *.
What is the correct way of finding paths of any length between nodes without getting any warning (without using deprecated syntax)?
Binding relationships to a list in a variable length pattern is deprecated since 3.2.0-rc1.
According to this pull request, Cypher queries like:
MATCH (n)-[rs*]-() RETURN rs
will generate a warning and the canonical way to write the same query is:
MATCH p=(n)-[*]-() RETURN relationships(p) AS rs
Since you are not using the r variable in your query you can simply remove it from the query and the warning will disappear. This way:
MATCH p=(acq:Acquisition {id:'1'})-[*]->(ecs:ExternalCommunicationService {id:'1'})
RETURN p
All that means is that in a future release, your use of r in your pattern will no longer be permitted. Your query will need to look like this or it will break in a future release. Since you are not directly trying to use r in your results, it should be no problem for you to remove it.
MATCH p=(acq:Acquisition {id:'1'})-[*]->(ecs:ExternalCommunicationService {id:'1'})
RETURN p
You could always get the relationships using the relationships(p) to return a collection of relationships from the path if you needed to do something with them after the match.
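As a small sketch of that, the query from the question can return the relationships alongside the path. The aliases rels and relTypes are just illustrative names:

```cypher
MATCH p = (acq:Acquisition {id: '1'})-[*]->(ecs:ExternalCommunicationService {id: '1'})
RETURN p,
       relationships(p) AS rels,                       // list of relationships on the path
       [r IN relationships(p) | type(r)] AS relTypes   // their types, in path order
```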
Depending on the nature of your graph (size and complexity) though your query may become unwieldy because it is virtually unconstrained except for the ending node and the direction. There are a number of ways you could make it safer.
1 - Use shortestPath
You could use shortestPath or allShortestPaths
MATCH p=allShortestPaths((acq:Acquisition {id:'1'})-[*]->(ecs:ExternalCommunicationService {id:'1'}))
RETURN p
2 - Limit the Depth
You could add a limit on the depth of the match. It could be fixed
MATCH p=(acq:Acquisition {id:'1'})-[*10]->(ecs:ExternalCommunicationService {id:'1'})
RETURN p
or a range
MATCH p=(acq:Acquisition {id:'1'})-[*5..10]->(ecs:ExternalCommunicationService {id:'1'})
RETURN p
3 - Add Relationship TYPE
You could add one or more relationship types to reduce the potential number of paths you match. That can be used in conjunction with the depth parameters.
MATCH p=(acq:Acquisition {id:'1'})-[:TYPE_A|TYPE_B*5..10]->(ecs:ExternalCommunicationService {id:'1'})
RETURN p
4 - Use APOC
You could use APOC procedures.
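For example, assuming the APOC plugin is installed, its path expander apoc.path.expandConfig can express the same constraints (relationship types, direction, depth) as configuration. This is only a sketch; TYPE_A and TYPE_B are the placeholder types from option 3, and the config keys should be checked against the documentation of your APOC version:

```cypher
MATCH (acq:Acquisition {id: '1'}), (ecs:ExternalCommunicationService {id: '1'})
CALL apoc.path.expandConfig(acq, {
  relationshipFilter: 'TYPE_A>|TYPE_B>',  // outgoing TYPE_A or TYPE_B only
  minLevel: 1,
  maxLevel: 10,                           // cap the depth, as in option 2
  terminatorNodes: [ecs]                  // stop expanding once ecs is reached
})
YIELD path
RETURN path
```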

Return results if at least one node (related but possibly not a result) has a certain property, in Neo4j

I have a graph that consists of a set of disjoint family trees.
I have a working query that has a few OPTIONAL MATCH statements, which allow me to get only the immediate parents and siblings of someone in the main_person's family tree, assuming that those relatives are of interest to us:
MATCH (p:Person {main_person: 'y'})
OPTIONAL MATCH (p)<-[]-(parent:Person)
WHERE parent.`person_of_interest` = 'y'
OPTIONAL MATCH (parent:Person)-[]->(sib:Person)
WHERE sib <> p
AND sib.`person_of_interest` = 'y'
RETURN
p, parent, sib;
But say I want to qualify this by making sure:
at least one member of a family has a test_me = 'y' property. This can be a far, distant member of the family. It definitely doesn't have to be a family member that is a person_of_interest, or a close family member.
If at least one of them has this property, then we can return the family members we are looking for. But if nobody has the property, then we don't want any results for that family.
I'm not sure how to construct this. I keep trying to start with the test_me = 'y' part, and carry it with a WITH:
MATCH (p:Person)-[]-(m)
WHERE ANY m.test_me = 'y'
WITH p, m
. . .
Maybe it should be more like:
MATCH (p:Person {main_person: 'y'})
OPTIONAL MATCH (p)<-[]-(parent:Person)
OPTIONAL MATCH (parent:Person)-[]->(sib:Person)
WHERE sib <> p
HAVING <condition here>
RETURN
p, parent, sib;
If this were SQL, I'd try to use a temp table to pipe things along.
None of it is really working.
Thanks for reading this.
[UPDATED to answer updated question]
This query may work for you (or it may run out of memory or appear to run forever):
MATCH (p:Person {main_person: 'y'})
WHERE EXISTS((p)-[*0..]-({test_me: 'y'}))
OPTIONAL MATCH (p)<--(parent:Person)
WHERE parent.person_of_interest = 'y'
OPTIONAL MATCH (parent:Person)-->(sib:Person)
WHERE sib <> p AND sib.person_of_interest = 'y'
RETURN p, COLLECT(parent) AS parents, COLLECT(sib) AS sibs;
The [*0..] syntax denotes a variable length relationship search where the matching paths can have 0 or more relationships. The query uses a lower bound of 0 instead of 1 (the default) because we also want to test whether p itself has the desired test_me property value.
However, variable length relationship searches are notorious for using a lot of memory or taking a long time to finish when no upper bound is specified, so normally a query would specify a reasonable upper bound (e.g., [*0..5]).
By the way, you should probably pass values such as 'y' as parameters instead of hard-coding them.
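Putting those two suggestions together, a sketch might look like the following. It assumes Neo4j 3.x or 4.x (where parameters are written as $name and EXISTS() accepts a pattern); the parameter name $flag and the upper bound of 5 are illustrative assumptions, not values from the question:

```cypher
// $flag is a hypothetical parameter, passed as e.g. {flag: 'y'}
MATCH (p:Person {main_person: $flag})
WHERE EXISTS((p)-[*0..5]-({test_me: $flag}))
OPTIONAL MATCH (p)<--(parent:Person)
WHERE parent.person_of_interest = $flag
OPTIONAL MATCH (parent)-->(sib:Person)
WHERE sib <> p AND sib.person_of_interest = $flag
// DISTINCT avoids repeating a parent once per sibling row
RETURN p, COLLECT(DISTINCT parent) AS parents, COLLECT(DISTINCT sib) AS sibs;
```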
You're definitely on the right track. I think you already have your answer even if you don't realize it.
What you have in your description works as the start of your query, with just a few modifications:
MATCH pattern=(p:Person{main_person: 'y'})-[*]-()
WHERE ANY (person IN nodes(pattern) WHERE person.test_me = 'y')
WITH p
...
The variable-length relationship lets you consider every person in the tree (if there are non-family relationships in your graph, you'll want to use types on your relationship to ensure you're only considering a single family's tree), as well as the main_person. If nobody in p's family tree has your desired property, the MATCH produces no rows for p, and any subsequent matching that uses p will yield no results. This should let you specify the rest of the query freely, and as long as all matches include p, you shouldn't get any results at the end for families without the desired property value.
EDIT fixed my query a bit, the ANY() clause wasn't written correctly.
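For completeness, here is one way the rest of the query could be attached. This is a sketch, not a tested solution: the 0..5 bound is an assumption (0 also covers a p with no relationships), and WITH DISTINCT is added because many paths can pass the test for the same p:

```cypher
MATCH pattern = (p:Person {main_person: 'y'})-[*0..5]-()
WHERE ANY (person IN nodes(pattern) WHERE person.test_me = 'y')
// Many paths can satisfy the test for the same p; keep one row per p
WITH DISTINCT p
OPTIONAL MATCH (p)<--(parent:Person)
WHERE parent.person_of_interest = 'y'
OPTIONAL MATCH (parent)-->(sib:Person)
WHERE sib <> p AND sib.person_of_interest = 'y'
RETURN p, parent, sib;
```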

MATCH prior to MERGE in single Cypher query confuses ON MATCH and ON CREATE

I've run into this on Neo4j 2.1.5. I have a query which I'm issuing from Node.js using the Neo4j REST API. The point of this query is to be able to create or update a given Node and set its state (including labels and properties) to some known state. The MATCH and REMOVE clause prior to the WITH is to work around the fact that there's no direct way to remove all of a Node's labels nor is there a way to update a Node's labels with a given set of labels. You have to explicitly remove the labels you don't want and add the labels you do want. And there's no way to remove labels in the MERGE clause.
A somewhat simplified version of the query looks like:
MATCH (m {name:'Brian'})
REMOVE m:l1:l2
WITH m
MERGE (n {name:'Brian'})
ON MATCH SET n={mprops} ON CREATE SET n={cprops}
RETURN n
where mprops = {updated:true, created:false} and cprops = {updated:false, created:true}. I do this so that in a single Cypher query I can remove all of the Node's existing labels and set new labels using the ON MATCH clause. The problem is that including the initial MATCH seems to confuse the ON MATCH vs ON CREATE logic.
Assuming the Brian Node already exists, the result of this query should show that n.created = false and n.updated = true. However, I get the opposite result, n.created=true, n.updated=false. If I remove the initial MATCH (and WITH) clause and execute only the MERGE clause, the results are as expected. So somehow, the inclusion of the MATCH clause causes the MERGE clause to think that a CREATE vs MATCH is happening.
I realize this is a weird use of the WITH clause, but it did seem like it would work around the limitation in manipulating labels. And Cypher thinks that it's valid Cypher. I'm assuming this is just a bug and an edge case, but I wanted to get others insights and possible alternatives before I report it.
I realize that I could have created a transaction and issued the MATCH and MERGE as separate queries within that transaction, but there are reasons that this does not work well in the design of the API I'm writing.
Thanks!
If you prefix your query with MATCH it will never execute if there is no existing ('Brian') node.
You also overwrite all properties with SET n = {param}, including name. You should use SET n += {param} instead, which only adds or updates the listed properties:
MERGE (n:Label { name:'Brian' })
ON MATCH SET n += {create :false,update:true }
ON CREATE SET n += {create :true,update:false }
REMOVE n:WrongLabel
RETURN n
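Applied to the names from the question, a sketch could look like this. The label l3 is a hypothetical replacement label, and literal maps are used instead of the mprops/cprops parameters to keep the example self-contained:

```cypher
// First run (node absent): ON CREATE fires.
// Later runs (node present): ON MATCH fires and `name` is kept,
// because += merges the map into the existing properties.
MERGE (n {name: 'Brian'})
ON CREATE SET n += {created: true, updated: false}
ON MATCH SET n += {created: false, updated: true}
// Removing a label the node does not have is a no-op
REMOVE n:l1:l2
// l3 is a hypothetical replacement label
SET n:l3
RETURN n
```

Because the REMOVE and SET come after the MERGE, the query still runs when no 'Brian' node exists yet.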
I don't see why your query would not work, but the issues brought up by @FrobberOfBits are valid.
However, logically, your example query is equivalent to this one:
MATCH (m {name:'Brian'})
REMOVE m:l1:l2
SET m={mprops}
RETURN m
This query is simpler, avoids the use of MERGE entirely, and may avoid whatever issue you are seeing. Does this represent what you were trying to do?

Neo4j: Transitive query and node ordering

I am using Neo4j to track relationships in OOP architecture. Let us assume that nodes represent classes and (u) -[:EXTENDS]-> (v) if class u extends class v (i.e. for each node there is at most one outgoing edge of type EXTENDS). I am trying to find out a chain of predecessor classes for a given class (n). I have used the following Cypher query:
start n=node(...)
match (n) -[:EXTENDS*]-> (m)
return m.className
I need to process nodes in such an order that the direct predecessor of class n comes first, its predecessor comes as second etc. It seems that the Neo4j engine returns the nodes in exactly this order (given the above query) - is this something I should rely on or could this behavior suddenly change in some of the future releases?
If I should not rely on this behavior, what Cypher query would allow me to get all predecessor nodes in given ordering? I was thinking about following query:
start n=node(...)
match p = (n) -[:EXTENDS*]-> (m {className: 'Object'})
return p
Which would work quite fine, but I would like to avoid specifying the root class (Object in this case).
It's unlikely to change anytime soon as this is really the nature of graph databases at work.
The query you've written will return ALL possible "paths" of nodes that match that pattern. But given that you've specified that there is at most one :EXTENDS edge from each such node, the order is implied with the direction you've included in the query.
In other words, what's returned won't start "skipping" nodes in a chain.
What it will do, though, is give you all "sub-paths" of a path. That is, assuming you specified you wanted the predecessors for node "a", for the following path...
(a)-[:EXTENDS]->(b)-[:EXTENDS]->(c)
...your query (omitting the property name) will return "a, b, c" and "a, b". If you only want the single path that contains ALL of its predecessors, and you can use Cypher 2.x, consider using the "path" way, something like:
MATCH p = (a)-[:EXTENDS*]->(b)
WITH p
ORDER BY length(p) DESC
LIMIT 1
RETURN extract(x in nodes(p) | x.className)
Also, as a best practice, given that you're looking at paths of indefinite length, you should likely limit the number of hops your query makes to something reasonable, e.g.
MATCH (n) -[:EXTENDS*0..10]-> (m)
Or some such.
HTH
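If you would rather not rely on the implicit ordering at all, one option is to order by path length explicitly. This is a sketch assuming Cypher 2.x or later; the Class label and the 'MyClass' value are hypothetical stand-ins for however you identify n:

```cypher
// Class and 'MyClass' are hypothetical; use whatever identifies n
MATCH p = (n:Class {className: 'MyClass'})-[:EXTENDS*1..10]->(m)
RETURN m.className AS className, length(p) AS depth
ORDER BY depth   // direct predecessor first, then its predecessor, ...
```

Since each class has at most one outgoing EXTENDS edge, there is exactly one path per predecessor, so depth gives a total order without naming the root class.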

Single Cypher query to choose between 2 different paths

I have the following scenario:
At some point in my path (in a node that lies a few links away from my start node),
I have the possibility of going down one path or another, for example:
If S is my startnode,
S-[]->..->(B)-[first:FIRST_WAY]->(...) ,
and
S-[]->..->(B)-[second:SECOND_WAY]->(...)
At the junction point, I will need to go down one path only (first or second)
Ideally, I would like to follow and include results from the second relationship, only if the first one is not present (regardless of what exists afterwards).
Is this possible with Cypher 1.9.7, in a single query?
One way would be to use OPTIONAL MATCH to match the two patterns separately. Example:
MATCH (n:Object)
OPTIONAL MATCH (n)-[r1:FIRST_WAY]->(:Object)-->(f1:Object)
OPTIONAL MATCH (n)-[r2:SECOND_WAY]->()-->(f2:Object)
RETURN coalesce(f1, f2)
This query attempts both matches optionally, and the coalesce function returns the first argument that is not null, so f2 is used only when nothing was found via FIRST_WAY.
AFAIK, OPTIONAL MATCH was introduced in 2.0, so you can't use that clause in 1.9, but there is an alternate syntax:
CYPHER 1.9 START n=node(*)
MATCH (n)-[r1?:FIRST_WAY]->()-->(f1), (n)-[r2?:SECOND_WAY]->()-->(f2)
RETURN coalesce(f1, f2)
I'm sure there are other ways to do this, probably using the | (OR) operator on relationship types, i.e. ()-[r:FIRST_WAY|SECOND_WAY]->(), and then examining the matched patterns to discard some of the result paths based on the relationship type.
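A rough sketch of that last idea, assuming Neo4j 3.x or later (it will not run on 1.9) and reusing the Object label from the example above. It decides at the junction node which relationship type to follow, before looking at anything beyond it:

```cypher
MATCH (b:Object)-[r:FIRST_WAY|SECOND_WAY]->(next)
// Group the outgoing hops per junction node, remembering their type
WITH b, collect({way: type(r), next: next}) AS hops
WITH b,
     [h IN hops WHERE h.way = 'FIRST_WAY']  AS firstHops,
     [h IN hops WHERE h.way = 'SECOND_WAY'] AS secondHops
// Follow SECOND_WAY only when no FIRST_WAY relationship exists at all
WITH b, CASE WHEN size(firstHops) > 0 THEN firstHops ELSE secondHops END AS chosen
UNWIND chosen AS hop
WITH b, hop.next AS next
OPTIONAL MATCH (next)-->(f)
RETURN b, next, f
```

The choice is made on the first hop only, which matches the "regardless of what exists afterwards" requirement in the question.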
