MERGE clause in Neo4j Cypher query not working - neo4j

I am running neo4j-community-3.0.0-M05.
I am trying out Neo4J Cypher Query Language's MERGE clause. Its explanation is given as follows
It acts like a combination of MATCH or CREATE, which checks for the existence of data first before creating it. With MERGE you define a pattern to be found or created. Usually, as with MATCH you only want to include the key property to look for in your core pattern. MERGE allows you to provide additional properties you want to set ON CREATE.
I already have following node:
(:Movie{title:"Forrest Gump", released:1994})
and now I wanted to add a dummy property addedOn with dummy value 20160108 to it just to try out MERGE clause:
MERGE (a:Movie{title:"Forrest Gump"})
ON CREATE SET a.addedOn= "20160108"
RETURN a;
However this seems not working:
Why is this so?

What you're seeing is precisely the expected behaviour.
Since MERGE finds your pre-existing Forrest Gump, this node is used. The ON CREATE handler will not fire since you didn't create anything.
If you've had a ON MATCH handler this one would have been fired since MERGE's match was successful.

Related

Optional merge on relationships

I am new to Cypher and I am trying to learn it through a small project I am trying to set up.
I have the following data model so far:
For every Thought created, I connect Tags through Categories.
The Categories only serve as intermediate between the Tags and Thoughts, this is done to improve querying, prevent Tag duplication and reduce relationships between the objects.
To prevent creation of new Tags with the same value, I thought of the following query:
CREATE (t: Thought {moment:timestamp(), message:'Testing new Thought'})
MERGE (t1: Tag{value: 'work'})
MERGE (t2: Tag{value: 'tasks'})
MERGE (t3: Tag{value: 'administration'})
MERGE (c: Category)
MERGE (t1)<-[u:CONSISTS_OF{index:0}]-(c)
MERGE (t2)<-[v:CONSISTS_OF{index:1}]-(c)
MERGE (t3)<-[w:CONSISTS_OF{index:2}]-(c)
MERGE (t)-[x:CATEGORIZED_AS{index: 0}]->(c)
This works fine, except for one thing: the Thought receives a relationship with all created Categories.
This I understand, I define no restrictions in the MERGE query.
However, I do not know how to apply restrictions to the CATEGORIZED_AS relationship?
I tried to add this to the bottom of the query, but that does not work:
WHERE (t)-[x]->(c)
Any idea how to apply a restriction like I need in my case?
EDIT:
I forgot to mention the unique connection of a Category:
A category is connect to a fixed set of Tags in a specific order.
E.g I have three tags:
work
tasks
administration
The only way the Category matches the Thought is if the Category has the following relationships with the Tags:
work <-[:CONSISTS_OF {index:0}]-(category)
tasks <-[:CONSISTS_OF {index:1}]-(category)
administration <-[:CONSISTS_OF {index:2}]-(category)
Any other order of relationships is invalid and a new Category should be created.
The Problem: Use of MERGE
MERGE will try and find a pattern in the graph, if it finds the pattern it will return it, else it will try and create the entire pattern. This works individually for each MERGE clause. So, this works great and as expected for (n:Tag) nodes, since you only want one tag for each word in the graph, but the issue comes with the later in your query when you try to merge a category.
What you want to do is try and find this (c:Category) that is connected to these three (t:Tag) nodes with these r.index properties on the relationship (:Tag)-[r:CONSISTS_OF]-(). However, you're running four merge clauses which do the following:
MERGE (c: Category)
Find or create any node c with the label `Category.
MERGE (t1)<-[u:CONSISTS_OF{index:0}]-(c)
MERGE (t2)<-[v:CONSISTS_OF{index:1}]-(c)
MERGE (t3)<-[w:CONSISTS_OF{index:2}]-(c)
Find or Create a relationship between that node and t1, then t2, t3 etc.
If you were to run that query, and then change one of the tags to something different like "rest", and run the query again, you'd expect a new category to appear. But it won't with the current query, it'll simply create a new tag, then find the existing (c:Category) node in that first MERGE clause, and create a relationship between it and the new tag. So, rather than having two categories each linked to three tags (with two tags being shared), you'll just have four tags all linked to one category with duplicate indexes on your relationships.
So, what you actually want to do is use MERGE to find the complex pattern like below.
MERGE (t1)<-[:CONSISTS_OF {index:0}]-(c:Category)-[:CONSISTS_OF {index:1}]->(t2),
(t3)<-[:CONSISTS_OF {index:2}]-(c)
Annoyingly, that will give you a syntax error, as cypher can't currently merge complex patterns like that. So, here comes the creative bit.
Solution 1: Conditional Execution with CASE and FOREACH (Easy)
This is quite a handy goto for these kinds of situation, see the commented query below. You'll essentially split the merge up, use OPTIONAL MATCH to try and find the pattern, and then use a little trick in cypher syntax to CREATE the pattern if we find it doesn't exist.
CREATE (t: Thought {moment:timestamp(), message:'Testing new Thought'})
MERGE (t1:Tag{value: 'work'})
MERGE (t2:Tag{value: 'abayo'})
MERGE (t3:Tag{value: 'rest'})
WITH *
// we can't merge this category because it's a complex pattern
// so, can we find it in the db?
OPTIONAL MATCH (t1)<-[:CONSISTS_OF {index:0}]-(c:Category)-[:CONSISTS_OF {index:1}]->(t2),
(t3)<-[:CONSISTS_OF {index:2}]-(c)
// the CASE here works in conjunction with the foreach to
// conditionally execute the create clause
WITH t, t1, t2, t3, c, CASE c WHEN NULL THEN [1] ELSE [] END AS make_cat
FOREACH (i IN make_cat |
// if no such category exists, this code will run as c is null
// if a category does exist, c will not be null, and so this won't run
CREATE (t1)<-[:CONSISTS_OF {index:0}]-(new_cat:Category)-[:CONSISTS_OF {index:1}]->(t2),
(t3)<-[:CONSISTS_OF {index:2}]-(new_cat)
)
// now we're not sure if we're referring to new_cat or cat
// remove variable c from scope
WITH t, t1, t2, t3
// and now match it, we know for sure now we'll find it
// alternatively, use conditional execution again here
MATCH (t1)<-[:CONSISTS_OF]-(c:Category)-[:CONSISTS_OF]->(t2),
(t3)<-[:CONSISTS_OF]-(c)
// now we have the category, we definitely want
// to create the relationship between the thought and the category
CREATE (t)-[:CATEGORIZED_AS]->(c)
RETURN *
Solution 2: Refactor Your Graph (Hard)
I haven't included a query here - although I can do if requested - but an alternative would be to refactor your graph to attach tags to categories in a ring (or chain - with a final member marker) structure, so that you can merge the pattern straight away without having to split it up.
Since the categories are in an order, you could express the data like the below, in one MERGE clause.
MERGE (c:Category)-[:CONSISTS_OF_TAG_SEQUENCE]->(t1)-[:NEXT_TAG_IN_SEQUENCE]->(t2)-[:NEXT_TAG_IN_SEQUENCE]->(t3)-[:NEXT_TAG_IN_SEQUENCE]->(c)
This might seem like a neat solution at first, but the problem is, that since tags will belong to multiple categories, if tags are shared between categories you will need to either:
create a composite index to identify categories and store this as a property of the sequential relationships so you know which relationships to follow in your path (i.e., so you can always find one, and only one, sequence of tags for a category)
still link each tag to the categories it is in and query on this pattern (to allow you to find that single path like in #1)
Use an intermediate node to achieve the same as 1 and 2
All of the above and more.
As you might have guessed, this will make your query much more complicated than it needs to be quite quickly. It could be fun to try, and may suit some use cases, but for the time being I'd stick with the easy solution!
My solution to your problem, is to enforce that every Category has a unique, consistently reproducible id. In your case, add a cid or id field, where the value is something along the lines of tag1<_>tag2<_>tag3<_>. (<_> is used because the chances of that being part of a tag are zero. If _ is an invalid tag character replacing <_> with _ will do just fine).
This way you can lock onto a category node without having to know anything about the nodes it is attached to. Essentially, the unique id IS your merge logic. This can even be dynamicly built up in Cypher using reduce. I usually also have a value field as a "pretty print display id value".
When running the final Cypher, you would Merge on each node alone by instance id, use Set for non node-defining fields, then use Create Unique to make make sure there was one and only one relation between the nodes.

Multi-part pattern in Merge clause of Cypher

Given the following graph:
(a)<--(b)-->(c)<--(d)-->(e)<--(f)-->(a)
I believe it is (currently) impossible to create a node (g) using the merge clause such that:
(g)-->(a)
(g)-->(c)
(g)-->(e)
The reason being that it requires a comma to describe the above pattern, and the MERGE clause will not accept a comma. e.g. (a)<--(g)-->(c), (g)-->(e)
For ease of reference, see picture below. Given that graph (except node 6), I cannot create node 6 using the MERGE command.
Can someone come up with a way to do this? I believe new functionality needs to be added, but I'd like to be more reasonably sure there's not a viable workaround before heading down that path.
There is no way to do this in Cypher, or in APOC right now. That said, there is a workaround. It's a bit manual, you'll need to acquire locks on the nodes in question (we'll use APOC for that), and we'll use OPTIONAL MATCH along with WHERE ... IS NULL to determine whether or not the center node exists, then create it only when it doesn't.
For this, I'm using the following example graph to mimic yours, before the addition of node 6:
create (zero:Node{name:0})
create (one:Node{name:1})
create (two:Node{name:2})
create (three:Node{name:3})
create (four:Node{name:4})
create (five:Node{name:5})
create (zero)<-[:TYPE]-(one)-[:TYPE]->(two)
create (two)<-[:TYPE]-(three)-[:TYPE]->(four)
create (four)<-[:TYPE]-(five)-[:TYPE]->(zero)
And now, the query to merge
match (node:Node)
where node.name in [0,2,4]
with collect(node) as nodes
call apoc.lock.nodes(nodes)
with nodes[0] as first, nodes[1] as second, nodes[2] as third
optional match (first)<-[:TYPE]-(center)-[:TYPE]->(second)
where (center)-[:TYPE]->(third)
with first, second, third, center
where center is null
// above 'where' will result in no rows if center exists, preventing creation of duplicate pattern below
create (first)<-[:TYPE]-(newCenter:Node{name:6})-[:TYPE]->(second)
create (newCenter)-[:TYPE]->(third)

Neo4J - Creating Relationship on existing nodes

I am new to Neo4J and I am looking to create a new relationship between an existing node and a new node.
I have a university node, and person node.
I am trying to assign a new person to an existing university.
I am trying to following code:
MATCH (p:Person {name:'Nick'}), (u:University {title:'Exeter'}) CREATE (p)-[:LIKES]->(u)
So in the above code: MATCH (p:Person {name:'Nick'}) is the new user
AND (u:University {title:'Exeter'}) is the exisiting univeristy.
But it is coming back (no changes, no rows)
I have even tried the query without the MATCH part but no luck either.
I have looked at few similar answers but they didn't seem to work either.
Any help would be very much appreciated. Thank you.
Match before u create new one, as suggested in the comments!
MATCH(u:University {title:'Exeter'})
CREATE(p:Person {name:'Nick'})
CREATE(p)-[w:LIKES]->(u)
return w
You could also use a MERGE statement as per the docs:
MERGE either matches existing nodes and binds them, or it creates new data and binds that. It’s like a combination of MATCH and CREATE that additionally allows you to specify what happens if the data was matched or created.
You would do a query like
MERGE (p:Person {name:'Nick'})-[:LIKES]->(u:University {title:'Exeter'})
It is because when you match you search for a nodes in your db. The db says i can't make the realtion "when the nodes dont exist".
Luckily there is something called merge it is like a match +create when he does not find the whole path he creates it.
it should be something like merge 'node1' merge'node2' create(node1)[]->(node2)

neo4j relationships between nodes using a common property (id) without creating more nodes

I have been trying to match together two different nodes (process) and (process framework) that have the same id. I want to set a relationship between those (called:SAME).
The query works but it creates 3 times the amount of processes that I wanted and I can´t figure out why:
MATCH (p:Process)
MATCH (pcf:ProcessFramework)
WHERE HAS (p.id) AND HAS (pcf.pcf_id) AND p.id=pcf.pcf_id
MERGE (p)<-[:same]-(pf)
return p,pcf
I also tried it with CREATE UNIQUE instead of MERGE but its the same result. I am not a developer so maybe I am doing a very obvious mistake but I really can´t see it!
Thanks!
You have a typo in your query, you use pf where you should use pcf in the MERGE.
I'd also change it to this and make sure you have an index on :ProcessFramework(pcf_id)
MATCH (p:Process)
MATCH (pcf:ProcessFramework)
WHERE p.id=pcf.pcf_id
MERGE (p)<-[:same]-(pcf)
return p,pcf

Clone nodes and relationships with Cypher

Is it possible to clone arbitrary nodes and relationships in a single Cypher neo4j 2.0 query?
'Arbitrary' reads 'without specifying their labels and relationship types'. Something like:
MATCH (node1:NodeType)-[e]->(n)
CREATE (clone: labels(n)) set clone=n set clone.prop=1
CREATE (node1)-[e1:type(e)]->(clone) set e1=e set e1.prop=2
is not valid in Cypher, so one cannot simply get labels from one node or relationship and assign them to another, because labels are compiled into the query literally.
Sure, labels and relation types are important for MATCH and WHERE for producing effective query plan, but isn't CREATE making another case?
The easiest way to clone parts of a graph is to use the dump command in Neo4j shell. dump generates cypher create statements from your return clauses. The result of dump can be appied to the graph database to create clones.
Today, April 2022, I believe the best approach might be using an APOC procedure
I had a similar requirement and this worked for me.
MATCH (rootA:Root{name:'A'}),
(rootB:Root{name:'B'})
MATCH path = (rootA)-[:LINK*]->(node)
WITH rootA, rootB, collect(path) as paths
CALL apoc.refactor.cloneSubgraphFromPaths(paths, {
standinNodes:[[rootA, rootB]]
})
YIELD input, output, error
RETURN input, output, error

Resources