Excluding nodes using WHERE does not work - neo4j

I run into a problem when I try to exclude nodes using MATCH and WHERE.
I have the following nodes and relationships:
MERGE (a1:accz {id: 1})
MERGE (a2:accz {id: 2})
MERGE (a3:accz {id: 3})
MERGE (a4:accz {id: 4})
MERGE (a5:accz {id: 5})
MERGE (i1:itemz {id: 1})
MERGE (i2:itemz {id: 2})
MERGE (i3:itemz {id: 3})
MERGE (i4:itemz {id: 4})
MERGE (a1)-[:AUTHOR]->(i1)
MERGE (a2)-[:AUTHOR]->(i2)
MERGE (a3)-[:AUTHOR]->(i1)
MERGE (a3)-[:AUTHOR]->(i3)
MERGE (a4)-[:AUTHOR]->(i4)
MERGE (a4)-[:AUTHOR]->(i5)
MERGE (a4)-[:AUTHOR]->(i5)
MERGE (a5)-[:AUTHOR]->(i2)
MERGE (a5)-[:AUTHOR]->(i5)
When I execute the following query (explicitly listing the items that the accz nodes must have a relationship with):
MATCH (a:accz)-[:AUTHOR]->(i:itemz) WHERE ({id: i.id} IN [({id: 3}), ({id: 4})]) RETURN a
I get the accz nodes (3, 4, 5), which is fine. But then I try to exclude some nodes using WHERE, as in the next query:
MATCH (a:accz)-[:AUTHOR]->(i:itemz) WHERE ({id: i.id} IN [({id: 3}), ({id: 4})]) AND (NOT (a)-[:AUTHOR]->(:itemz {id:5})) RETURN a
but I still get the accz node with id 5. It should be excluded, because accz {id: 5} is the AUTHOR of (:itemz {id: 5}).
What am I doing wrong?

The odd behaviors seen in your example would seem like bugs, but can be explained (after some careful thought). One conclusion, after all is said and done, is that you should avoid using unbound nodes in a MERGE clause.
The odd behaviors
Your creation query has no MERGE clause to create the itemz node i5. That is, this clause is missing: MERGE (i5:itemz {id: 5}).
Therefore, it would seem like the 2 MERGE (a4)-[:AUTHOR]->(i5) clauses should result in the creation of a new unlabelled i5 node with no properties -- but no such node is created!
And it would also seem like the MERGE (a5)-[:AUTHOR]->(i5) clause should result in a relationship with that new i5 -- but instead it unexpectedly results in a relationship with i4!
Explanation
This snippet of code causes the odd behavior (I have added comments to clarify):
MERGE (a4)-[:AUTHOR]->(i4) // Makes sure `(a4)-[:AUTHOR]->(i4)` relationship exists
MERGE (a4)-[:AUTHOR]->(i5) // Matches above relationship, so creates `i5` and binds it to `i4`!
MERGE (a4)-[:AUTHOR]->(i5) // Matches same relationship, so nothing is done.
So, after the snippet is executed, i4 and i5 are bound to the same node. This explains the odd behaviors.
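A minimal sketch to reproduce this in an empty database, assuming the explanation above holds for your Neo4j version (the sameNode column name is only for illustration):

```cypher
// Bind a4 and i4 explicitly, then create the relationship between them
MERGE (a4:accz {id: 4})
MERGE (i4:itemz {id: 4})
MERGE (a4)-[:AUTHOR]->(i4)
// i5 is unbound here, so this MERGE can match the relationship created above
// and bind i5 to the node that i4 is already bound to
MERGE (a4)-[:AUTHOR]->(i5)
// Compare the two variables: true means they refer to the same node
RETURN i4 = i5 AS sameNode
```

If i5 is bound to the i4 node as described, sameNode should come back as true.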
Conclusion
To avoid unexpected results, you should avoid using unbound nodes in MERGE clauses.
If your creation query had included a MERGE (i5:itemz {id: 5}) clause before the relationships were created, then your queries would have worked reasonably. The result of the first query would contain accz nodes 3 and 4, and the result of the second query would only contain 3.
By the way, ({id: i.id} IN [({id: 3}), ({id: 4})]) can be greatly simplified to just i.id IN [3, 4].
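Putting the two suggestions together, the corrected data and the simplified exclusion query might look like the sketch below. It assumes an empty database and that the two statements are run separately; nothing here goes beyond the MERGE (i5:itemz {id: 5}) fix and the i.id IN [3, 4] simplification described above.

```cypher
// Statement 1: creation query with i5 bound before it is used
MERGE (a1:accz {id: 1})
MERGE (a2:accz {id: 2})
MERGE (a3:accz {id: 3})
MERGE (a4:accz {id: 4})
MERGE (a5:accz {id: 5})
MERGE (i1:itemz {id: 1})
MERGE (i2:itemz {id: 2})
MERGE (i3:itemz {id: 3})
MERGE (i4:itemz {id: 4})
MERGE (i5:itemz {id: 5})   // the clause that was missing
MERGE (a1)-[:AUTHOR]->(i1)
MERGE (a2)-[:AUTHOR]->(i2)
MERGE (a3)-[:AUTHOR]->(i1)
MERGE (a3)-[:AUTHOR]->(i3)
MERGE (a4)-[:AUTHOR]->(i4)
MERGE (a4)-[:AUTHOR]->(i5)
MERGE (a5)-[:AUTHOR]->(i2)
MERGE (a5)-[:AUTHOR]->(i5);

// Statement 2: authors of item 3 or 4 that are not also authors of item 5
MATCH (a:accz)-[:AUTHOR]->(i:itemz)
WHERE i.id IN [3, 4]
  AND NOT (a)-[:AUTHOR]->(:itemz {id: 5})
RETURN a
```

With this data, the second statement should return only the accz node with id 3, as stated above.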

Related

How to return nodes that have only one given relationship

I have nodes that represent documents, and nodes that represent entities. Entities can be referenced in documents; if so, they are linked together with a relationship like this:
(doc)<-[:IS_REFERENCED_IN]-(entity)
The same entity can be referenced in several documents, and a document can reference several entities.
I'd like to delete, for a given document, every entity that is referenced in that document only.
I thought of two different ways to do this.
The first one uses Java to make a foreach and would basically be something like this:
List<Entity> entities = MATCH (d:Document {id:0})<-[:IS_REFERENCED_IN]-(e:Entity) return e
for (Entity entity : entities){
MATCH (e:Entity)-[r:IS_REFERENCED_IN]->(d:Document) WITH *, count(r) as nb_document_linked WHERE nb_document_linked = 1 DELETE e
}
This method would work, but I'd rather not use a foreach or Java code to do it. I'd like to do it in one Cypher query.
The second one uses only one Cypher query but doesn't work. It's something like this:
MATCH (d:Document {id:0})<-[:IS_REFERENCED_IN]-(e:Entity)-[r:IS_REFERENCED_IN]->(d:Document) WITH *, count(r) as nb_document_linked WHERE nb_document_linked = 1 DELETE e
The problem here is that nb_document_linked is not computed per entity; it is a single value for all the entities, which means it counts every relationship of every entity, which I don't want.
So how could I do a kind of foreach in my Cypher query to make it work?
Sorry for my English; I hope the question is clear. If you need any more information, please ask.
You can do something like:
MATCH (d:Document{key:1})<-[:IS_REFERENCED_IN]-(e:Entity)
WITH e
MATCH (d:Document)<-[:IS_REFERENCED_IN]-(e)
WITH COUNT (d) AS countD, e
WHERE countD=1
DETACH DELETE e
Which you can see working on this sample data:
MERGE (a:Document {key: 1})
MERGE (b:Document {key: 2})
MERGE (c:Document {key: 3})
MERGE (d:Entity{key: 4})
MERGE (e:Entity{key: 5})
MERGE (f:Entity{key: 6})
MERGE (g:Entity{key: 7})
MERGE (h:Entity{key: 8})
MERGE (i:Entity{key: 9})
MERGE (j:Entity{key: 10})
MERGE (k:Entity{key: 11})
MERGE (l:Entity{key: 12})
MERGE (m:Entity{key: 13})
MERGE (d)-[:IS_REFERENCED_IN]-(a)
MERGE (e)-[:IS_REFERENCED_IN]-(a)
MERGE (f)-[:IS_REFERENCED_IN]-(a)
MERGE (g)-[:IS_REFERENCED_IN]-(a)
MERGE (d)-[:IS_REFERENCED_IN]-(b)
MERGE (e)-[:IS_REFERENCED_IN]-(b)
MERGE (f)-[:IS_REFERENCED_IN]-(c)
MERGE (g)-[:IS_REFERENCED_IN]-(c)
MERGE (j)-[:IS_REFERENCED_IN]-(a)
MERGE (h)-[:IS_REFERENCED_IN]-(a)
MERGE (i)-[:IS_REFERENCED_IN]-(a)
MERGE (g)-[:IS_REFERENCED_IN]-(c)
MERGE (k)-[:IS_REFERENCED_IN]-(c)
MERGE (l)-[:IS_REFERENCED_IN]-(c)
MERGE (m)-[:IS_REFERENCED_IN]-(c)
On this data, the query removes 3 entities.
The first MATCH finds the entities that are attached to your input doc, and the second MATCH finds the number of documents that each of these entities is connected to.
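Before deleting anything, it may help to preview which entities would be removed. The sketch below reuses the two MATCH steps described above and returns the candidates instead of deleting them; the key and countD names follow the sample data and are otherwise arbitrary.

```cypher
// Step 1: entities referenced in the input document
MATCH (:Document {key: 1})<-[:IS_REFERENCED_IN]-(e:Entity)
// Step 2: all documents each of those entities is referenced in
MATCH (d:Document)<-[:IS_REFERENCED_IN]-(e)
// count(d) is aggregated per entity because e is the grouping key
WITH e, count(d) AS countD
WHERE countD = 1
RETURN e.key AS key
ORDER BY key
```

On the sample data above this should list the keys 8, 9 and 10, i.e. the three entities referenced only in document 1.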

Match paths of node types where nodes may have cycles

I'm trying to find a match pattern to match paths of certain node types. I don't care about the type of relation. Any relation type may match. I only care about the node types.
Of course the following would work:
MATCH (n)-->(:a)-->(:b)-->(:c) WHERE id(n) = 0
But, some of these paths may have relations to themselves. This could be for :b, so I'd also like to match:
MATCH (n)-->(:a)-->(:b)-->(:b)-->(:c) WHERE id(n) = 0
And:
MATCH (n)-->(:a)-->(:b)-->(:b)-->(:b)-->(:c) WHERE id(n) = 0
I can do this with relations easily enough, but I can't figure out how to do this with nodes, something like:
MATCH (n)-->(:a)-->(:b*1..)-->(:c) WHERE id(n) = 0
As a practical example, let's say I have a database with people, cars and bikes. The cars and bikes are "owned" by people, and people have relationships like son, daughter, husband, wife, etc. What I'm looking for is a query that from a specific node, gets all nodes of related types. So:
MATCH (n)-->(:person*1..)-->(:car) WHERE Id(n) = 0
I would expect that to get node "n", its parents, grandparents, children, grandchildren, all recursively. And then of those people, their cars. If I could assume that I know the full list of relations, and that they only apply to people, I could get this to work as follows:
MATCH
p = (n)-->(:person)-[:son|daughter|husband|wife|etc*0..]->(:person)-->(:car)
WHERE Id(n) = 0
RETURN nodes(p)
What I'm looking for is the same without having to specify the full list of relations; but just the node label.
Edit:
If you want to find the path from one Person node to each Car node, using only the node labels, and assuming nodes may create cycles, you can use apoc.path.expandConfig.
For example:
MERGE (mark:Person {name: "Mark"})
MERGE (lju:Person {name: "Lju"})
MERGE (praveena:Person {name: "Praveena"})
MERGE (zhen:Person {name: "Zhen"})
MERGE (martin:Person {name: "Martin"})
MERGE (joe:Person {name: "Joe"})
MERGE (stefan:Person {name: "Stefan"})
MERGE (alicia:Person {name: "Alicia"})
MERGE (markCar:Car {name: "Mark's car"})
MERGE (ljuCar:Car {name: "Lju's car"})
MERGE (praveenaCar:Car {name: "Praveena's car"})
MERGE (zhenCar:Car {name: "Zhen's car"})
MERGE (zhen)-[:CHILD_OF]-(mark)
MERGE (praveena)-[:CHILD_OF]-(martin)
MERGE (praveena)-[:MARRIED_TO]-(joe)
MERGE (zhen)-[:CHILD_OF]-(joe)
MERGE (alicia)-[:CHILD_OF]-(joe)
MERGE (zhen)-[:CHILD_OF]-(mark)
MERGE (anthony)-[:CHILD_OF]-(rik)
MERGE (martin)-[:CHILD_OF]-(mark)
MERGE (stefan)-[:CHILD_OF]-(zhen)
MERGE (lju)-[:CHILD_OF]-(stefan)
MERGE (markCar)-[:OWNED]-(mark)
MERGE (ljuCar)-[:OWNED]-(lju)
MERGE (praveenaCar)-[:OWNED]-(praveena)
MERGE (zhenCar)-[:OWNED]-(zhen)
Running a query:
MATCH (n:Person{name:'Joe'})
CALL apoc.path.expandConfig(n, {labelFilter: "Person|/Car", uniqueness: "NODE_GLOBAL"})
YIELD path
RETURN path
will return four unique paths from the Joe node to the four car nodes. There are several options for the uniqueness of the path; see the uniqueness options in the APOC documentation.
The / prefix in /Car makes Car a termination label, i.e. returned paths only go up to a node with this label.
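If only the cars are of interest rather than the full paths, the same call can be reduced to the end node of each path. This is a sketch that assumes the APOC plugin is installed and uses the sample data above; the car and hops column names are only illustrative.

```cypher
MATCH (n:Person {name: 'Joe'})
CALL apoc.path.expandConfig(n, {labelFilter: "Person|/Car", uniqueness: "NODE_GLOBAL"})
YIELD path
// Each returned path ends at a Car node because /Car is a termination label
RETURN last(nodes(path)).name AS car, length(path) AS hops
ORDER BY hops
```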

neo4j apoc.path.expandConfig - How to use relationshipFilter with a list of relationships

According to the documentation, the pipe character (|) acts as an OR in the relationshipFilter, while the comma character (,) concatenates relationships, creating a list (sequence) of them.
See, for example, the documentation's explanation of the example query (the text on the black background next to the query itself) and the section on sequences (page 14).
The question is, do commas take precedence over pipes?
That is, if I want several alternative step sequences, can I specify several strict lists, or must I specify one list in which each step has several options?
I wanted to achieve 4 relationship sequence options:
1. CREATE> or
2. REACT,REPLY or
3. CREATE>,RELATED or
4. REPLY>,CREATE
So I wrote a simple query:
MATCH(u:User{key:1})
CALL apoc.path.expandConfig(u, {maxLevel: 3,
relationshipFilter: 'CREATE>|REACT,REPLY|CREATE>,RELATED|REPLY>,CREATE',
uniqueness:"RELATIONSHIP_GLOBAL"})
YIELD path
RETURN path
Given this sample data:
MERGE (a:User{key: 1})
MERGE (b:Tags{key: 2})
MERGE (c:Post{key: 3})
MERGE (d:Comment{key: 4})
MERGE (e:Comment{key: 5})
MERGE (f:Comment{key: 6})
MERGE (g:User{key: 7})
MERGE (h:User{key: 8})
MERGE (i:Post{key: 9})
MERGE (j:Tags{key: 10})
MERGE (k:Post{key: 11})
MERGE (l:Comment{key: 12})
MERGE (a)-[:CREATE]-(b)
MERGE (a)-[:CREATE]-(c)
MERGE (a)-[:REACT]-(c)
MERGE (a)-[:CREATE]-(d)
MERGE (a)-[:REACT]-(d)
MERGE (b)-[:RELATED]-(c)
MERGE (d)-[:REPLY]-(c)
MERGE (d)-[:REPLY]-(d)
MERGE (h)-[:REACT]-(c)
MERGE (g)-[:REACT]-(c)
MERGE (h)-[:CREATE]-(j)
MERGE (j)-[:RELATED]-(c)
MERGE (g)-[:CREATE]-(i)
MERGE (e)-[:REPLY]-(i)
MERGE (f)-[:REPLY]-(i)
MERGE (a)-[:REPLY]-(i)
MERGE (h)-[:CREATE]-(k)
MERGE (l)-[:REPLY]-(k)
MERGE (a)-[:REACT]-(l)
I was expecting to get an answer including (a:User{key: 1})-[:REPLY]->(i:Post{key: 9})<-[:CREATE]-(g:User{key: 7}), which corresponds with my last part of the relationshipFilter, but did not get it.
Thank you for your time
I believe your relationshipFilter needs to be changed.
You have written: 'CREATE>|REACT,REPLY|CREATE>,RELATED|REPLY>,CREATE'
Which matches:
CREATE> OR REACT
REPLY OR CREATE>
RELATED OR REPLY>
CREATE (this step is never checked because of maxLevel: 3)
It appears you intended to use the relationshipFilter: "CREATE>,REACT|REPLY,CREATE>|RELATED,REPLY>|CREATE"
Which matches
CREATE>
REACT OR REPLY
CREATE> OR RELATED
REPLY> OR CREATE
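Because a single relationshipFilter is read as one sequence of steps, one possible way to express several alternative sequences is to run one expansion per sequence and combine the results with UNION. The sketch below is an assumption about how the four options from the question could be written, not something confirmed by the answer above; minLevel and maxLevel are set to each sequence's length so that only complete sequences are returned.

```cypher
// Option 1: CREATE>
MATCH (u:User {key: 1})
CALL apoc.path.expandConfig(u, {relationshipFilter: 'CREATE>', minLevel: 1, maxLevel: 1})
YIELD path
RETURN path
UNION
// Option 2: REACT, then REPLY
MATCH (u:User {key: 1})
CALL apoc.path.expandConfig(u, {relationshipFilter: 'REACT,REPLY', minLevel: 2, maxLevel: 2})
YIELD path
RETURN path
UNION
// Option 3: CREATE>, then RELATED
MATCH (u:User {key: 1})
CALL apoc.path.expandConfig(u, {relationshipFilter: 'CREATE>,RELATED', minLevel: 2, maxLevel: 2})
YIELD path
RETURN path
UNION
// Option 4: REPLY>, then CREATE
MATCH (u:User {key: 1})
CALL apoc.path.expandConfig(u, {relationshipFilter: 'REPLY>,CREATE', minLevel: 2, maxLevel: 2})
YIELD path
RETURN path
```

Under these assumptions, the fourth branch is the one that would be expected to cover the path from user 1 through post 9 to user 7 mentioned in the question.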

How to prevent neo4j MERGE from creating duplicate relationships?

I am attempting to create nodes and relationships if they do not exist. I do not know ahead of time if anything in the DB exists.
This is my initial query:
MERGE (t:type { name: 'aaa'})
MERGE (m:model { name: 'bbb'})
MERGE (r:region {name: 'ccc'})
MERGE (p:param {name: 'ddd'})
MERGE (i:init {value: 123})
MERGE (u:forecast {url: 'http://something.png'})
MERGE (t)-[:HAS]-(m)-[:HAS]-(r)-[:HAS]-(p)-[:HAS]-(i)-[:HAS]-(u)
This correctly produces a graph like this:
Then I run this query again, but this time I change the name of the "model" object to "bbc" (instead of "bbb"):
MERGE (t:type { name: 'aaa'})
MERGE (m:model { name: 'bbc'})
MERGE (r:region {name: 'ccc'})
MERGE (p:param {name: 'ddd'})
MERGE (i:init {value: 123})
MERGE (u:forecast {url: 'http://something.png'})
MERGE (t)-[:HAS]-(m)-[:HAS]-(r)-[:HAS]-(p)-[:HAS]-(i)-[:HAS]-(u)
Now, however, my graph looks like this:
Everything looks correct except for the three duplicated relationships.
I realize that MERGE will create the whole path if it does not exist. There must be some way to avoid creating duplicate relationships, though.
I would appreciate being pointed in the right direction!
The MERGE statement checks whether the pattern as a whole already exists. So, if even one node in the pattern is different, the whole pattern is treated as non-existent and all of its relationships are created, including the ones that already exist.
The solution is to split this MERGE statement into multiple, i.e. one MERGE for each relationship:
MERGE (t)-[:HAS]-(m)-[:HAS]-(r)-[:HAS]-(p)-[:HAS]-(i)-[:HAS]-(u)
becomes
MERGE (t)-[:HAS]-(m)
MERGE (m)-[:HAS]-(r)
MERGE (r)-[:HAS]-(p)
MERGE (p)-[:HAS]-(i)
MERGE (i)-[:HAS]-(u)
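To check whether a database already contains duplicated relationships from the single-pattern MERGE, a query along these lines can help. It is a sketch that assumes the duplicates share the same direction, which should be the case when they were created by the same undirected MERGE pattern; the rels name is arbitrary.

```cypher
// Count HAS relationships per ordered pair of nodes
MATCH (a)-[r:HAS]->(b)
WITH a, b, count(r) AS rels
// Keep only the pairs connected by more than one HAS relationship
WHERE rels > 1
RETURN a, b, rels
```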

Commas in MERGE clause like there are in MATCH clause?

The following works fine in neo4j 4:
MATCH (a)-->(b)<--(c), (b)-->(d)
RETURN a
But the following returns an error:
MERGE (a)-->(b)<--(c), (b)-->(d)
RETURN a
Error text:
Neo.ClientError.Statement.SyntaxError
Invalid input ',': expected whitespace, a relationship pattern, ON, FROM GRAPH, USE GRAPH, CONSTRUCT, LOAD CSV, START, MATCH, UNWIND, MERGE, CREATE UNIQUE, CREATE, SET, DELETE, REMOVE, FOREACH, WITH, CALL, RETURN, UNION, ';' or end of input (line 1, column 22 (offset: 21))
"MERGE (a)-->(b)<--(c), (b)-->(d)"
^
If I understand correctly, merge provides a level of upsert functionality. But is merge more restricted in matching capability than match? How do I merge complex non-linear patterns that require comma separations?
The entire MERGE pattern will be created if any item in the pattern does not yet exist. So, to be safe, you must always make sure every MERGE pattern has only one item that might not exist.
This is why it only makes sense for MERGE to support patterns with a single term.
For example, instead of this (which is not legal Cypher, anyway):
MERGE
(a:Foo {id: 'a'})-[:BAR]->(b:Foo {id: 'b'})<-[:BAR]-(c:Foo {id: 'c'}),
(b)-[:BAR]->(d:Foo {id: 'd'})
RETURN a
you should actually do this:
MERGE (a:Foo {id: 'a'})
MERGE (b:Foo {id: 'b'})
MERGE (c:Foo {id: 'c'})
MERGE (d:Foo {id: 'd'})
MERGE (a)-[:BAR]->(b)
MERGE (b)<-[:BAR]-(c)
MERGE (b)-[:BAR]->(d)
RETURN a

Resources