how to create and update nodes and property using plain cypher? - neo4j

MERGE (c:contact {guid : '500010'})
ON CREATE SET
c.data_source = '1',
c.guid = '500010',
c.created = timestamp()
ON MATCH SET
c.lastUpdated = timestamp()
MERGE (s:speciality {specialtygroup_desc : 'cold'})
ON CREATE SET s.data_source = '1',
s.specialtygroup_desc = 'fever',
s.created = timestamp()
ON MATCH SET s.data_source = '1',
s.specialtygroup_desc = 'comman cold',
s.lastUpdated = timestamp()
MERGE (c)-[r:is_specialised_in]->(s)
ON CREATE SET
r.duration = 1
ON MATCH SET
r.duration = r.duration + 1
On the first run, node is created as "fever". On the second run, I have updated the specialty_group to "common cold". But it is creating new node with "fever". I am not able to update the "fever" to "common cold". What changes should I make to the above query?

Related

how to create and update nodes and property using plain cypher query?

How do I create and update nodes and property using plain cypher query?
Below is my query:
MERGE (c:contact {guid : '500010'})
ON CREATE SET
c.data_source = '1',
c.guid = '500010',
c.created = timestamp()
ON MATCH SET
c.lastUpdated = timestamp()
MERGE (s:speciality {specialtygroup_desc : 'cold'})
ON CREATE SET s.data_source = '1',
s.specialtygroup_desc = 'fever',
s.created = timestamp()
ON MATCH SET s.data_source = '1',
s.specialtygroup_desc = 'comman cold',
s.lastUpdated = timestamp()
MERGE (c)-[r:is_specialised_in]->(s)
ON CREATE SET
r.duration = 1
ON MATCH SET
r.duration = r.duration + 1
On the first run, node is created as "fever".
On the second run, I have updated the specialty_group to "common cold". But it is creating new node with "fever". I am not able to update the "fever" to "common cold".
What changes should I make to the above query?
The MERGE (s:speciality {specialtygroup_desc : 'cold'}) clause looks for a specialtygroup_desc value of "cold".
During the first execution, that MERGE clause finds no "cold" node -- so it creates one, and the subsequent ON CREATE clause changes it to "fever".
During the second execution, that MERGE again finds no "cold" node (since it is now a "fever" node), so it again creates a "cold" node and the ON CREATE clause yet again changes it to "fever". The ON MATCH clause is never used. This is why you end up with another "fever" node.
Unfortunately, you have not explained your use case in enough detail to offer a recommendation for how to fix your code.
I think you want to update all node "cold" to "common cold" and if not exists "cold" or "common cold", create new "fever" ?
My suggestion:
OPTIONAL MATCH (ss:speciality {specialtygroup_desc : 'cold'}
SET ss.specialtygroup_desc='common cold', ss.lastUpdated = timestamp()
MERGE (c:contact {guid : '500010'})
ON CREATE SET
c.data_source = '1',
c.guid = '500010',
c.created = timestamp()
ON MATCH SET
c.lastUpdated = timestamp()
MERGE (s:speciality {specialtygroup_desc : 'common cold'})
ON CREATE SET s.data_source = '1',
s.specialtygroup_desc = 'fever',
s.created = timestamp()
MERGE (c)-[r:is_specialised_in]->(s)
ON CREATE SET
r.duration = 1
ON MATCH SET
r.duration = r.duration + 1

Neo4j: second merge for relationship not working properly

I am working with email data. I have 2 outcomes in the field Outcome2 and they are FAILED_TO and TO. The first one FAILED_TO works fine if there is a failed to event the nodes are created and all properties, are updated or added. But the TO portion doesnt work. No new nodes are created. Now this was created later on in the statement. This may be a simple fix. Any help would be greatly appreciated. And I would like to avoid apoc if at all possible.
// NO ATTACHMENT OR LINK - FOLLOWING IMPORTS
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM ("file:///sessions/new_neo_test_3.csv") AS row
WITH row, datetime(row.DateTime) AS dt
MERGE (a:Sender {name: row.From, domain: row.Sender_Sub_Fld})
ON CREATE SET a.firstseen = dt
SET a.lastseen = dt
MERGE (b:Recipient {name: row.To})
ON CREATE SET b.firstseen = dt
SET b.lastseen = dt
WITH a, b, row, dt
WHERE row.Url = "false" AND row.FileHash = "false" AND row.Outcome2 = "FAILED_TO"
MERGE (a)-[rel1:FAILED_TO]->(b)
ON CREATE SET rel1.firstseen = dt
SET rel1.lastseen = dt
SET rel1.timesseen = coalesce(rel1.timesseen, 0) + 1
WITH a,b,row,dt,rel1
WHERE row.Url = "false" AND row.FileHash = "false" AND row.Outcome2 = "TO"
MERGE (a)-[rel2:TO]->(b)
ON CREATE SET rel2.firstseen = dt
SET rel2.lastseen = dt
SET rel2.timesseen = coalesce(rel2.timesseen, 0) + 1
return a,b
It is because of these two lines
WITH a, b, row, dt
WHERE row.Url = "false" AND row.FileHash = "false" AND row.Outcome2 = "FAILED_TO"
The WHERE ... AND row.Outcome2 = "FAILED_TO literally removes the other rows where row.Outcome2 = "TO".
Instead, you can do something like the following. Instead of the WHERE row.outcome2, create a collection of [1] for each case when either FAILED_TO or TO are found. Then later, use that in a FOREACH loop to create that relationship if the corresponding collection has a value.
Since roe.Woutcome2 can only be one value or the other only one of the sets of statement inside the FOREACH clause will actually be executed per row.
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM ("file:///sessions/new_neo_test_3.csv") AS row
WITH row, datetime(row.DateTime) AS dt
MERGE (a:Sender {name: row.From, domain: row.Sender_Sub_Fld})
ON CREATE SET a.firstseen = dt
SET a.lastseen = dt
MERGE (b:Recipient {name: row.To})
ON CREATE SET b.firstseen = dt
SET b.lastseen = dt
WITH a, b, row, dt
, CASE WHEN row.Outcome2 = 'FAILED_TO' THEN [1] ELSE [] END AS fail
, CASE WHEN row.Outcome2 = 'TO' THEN [1] ELSE [] END AS success
WHERE row.Url = "false" AND row.FileHash = "false"
FOREACH ( x in fail |
MERGE (a)-[rel1:FAILED_TO]->(b)
ON CREATE SET rel1.firstseen = dt
SET rel1.lastseen = dt
SET rel1.timesseen = coalesce(rel1.timesseen, 0) + 1
)
FOREACH ( x in success |
MERGE (a)-[rel2:TO]->(b)
ON CREATE SET rel2.firstseen = dt
SET rel2.lastseen = dt
SET rel2.timesseen = coalesce(rel2.timesseen, 0) + 1
)
RETURN a, b

Updating node attributes based on new nodes

I have edges like this:
(People)-[:USE]->(Product)
(People)-[:REVIEW]->(Product)
Now I have a new csv of People who are reviewers but they are missing some of the attributes I have already.
I want to do something like:
LOAD CSV WITH HEADERS FROM "file:///abcd.csv" AS row
MERGE (svc:Consumer {name: row.referring_name})
ON CREATE SET
svc.skewNum = toInteger(row.skew_num)
MERGE (p:PrimaryConsumer) WHERE p.name = svc.name
ON MATCH SET
svc.city = p.city,
svc.latitude = toFloat(p.latitude),
svc.longitude = toFloat(p.longitude),
svc.consumerId = toInteger(p.primaryConsumerId)
Which borks:
Neo.ClientError.Statement.SyntaxError: Invalid input 'H': expected 'i/I' (line 10, column 28 (offset: 346))
"MERGE (p:PrimaryConsumer) WHERE p.name = svc.name"
I am 100% assured that the names are unique and will match a unique consumer name in the existing set of nodes (to be seen).
How would I add existing attributes to new data when I have a match on unique node attributes? (I am hoping to get unique id's, but I have to be able to perform an update to the new data on match)
Thank you.
This is the entire cypher script -- modified as per #cypher's input.
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///abcde.csv" AS row
MERGE (svc:Consumer {name: row.referring_name})
ON CREATE SET
svc.skeyNum = toInteger(row.skew_num)
MATCH (p:primaryConsumer {name: svc:name})
ON MATCH SET
svc.city = p.city,
svc.latitude = toFloat(p.latitude),
svc.longitude = toFloat(p.longitude),
svc.providerId = toInteger(p.providerId)
MERGE (spec:Product {name: row.svc_prod_name})
ON CREATE SET
spec.name = row.svc_prov_name,
spec.skew = toInteger(row.skew_id),
spec.city = row.svc_prov_city,
spec.totalAllowed = toFloat(row.total_allowed)
MERGE (svc)-[r:CONFIRMED_PURCHASE]->(spec)
ON MATCH SET r.totalAllowed = r.totalAllowed + spec.totalAllowed
ON CREATE SET r.totalAllowed = spec.totalAllowed
;
MERGE does not accept a WHERE clause.
Change this:
MERGE (p:PrimaryConsumer) WHERE p.name = svc.name
to this:
MERGE (p:PrimaryConsumer {name: svc.name})
[EDIT]
You entire query should then look like this:
LOAD CSV WITH HEADERS FROM "file:///abcd.csv" AS row
MERGE (svc:Consumer {name: row.referring_name})
ON CREATE SET
svc.skewNum = toInteger(row.skew_num)
MERGE (p:PrimaryConsumer {name: svc.name})
ON MATCH SET
svc.city = p.city,
svc.latitude = toFloat(p.latitude),
svc.longitude = toFloat(p.longitude),
svc.consumerId = toInteger(p.primaryConsumerId)

Neo4J - Optimizing 3 merge queries into a single query

I am trying to make a Cypher query which makes 2 nodes and adds a relationship between them.
For adding a node I'm checking if the node is existing or not, if existing then I'm simply going ahead and setting a property.
// Query 1 for creating or updating node 1
MERGE (Kunal:PERSON)
ON CREATE SET
Kunal.name = 'Kunal',
Kunal.type = 'Person',
Kunal.created = timestamp()
ON MATCH SET
Kunal.lastUpdated = timestamp()
RETURN Kunal
// Query 2 for creating or updating node 2
MERGE (Bangalore: LOC)
ON CREATE SET
Bangalore.name = 'Bangalore',
Bangalore.type = 'Location',
Bangalore.created = timestamp()
ON MATCH SET
Bangalore.lastUpdated = timestamp()
RETURN Bangalore
Likewise I am checking if a relationship exists between the above created nodes, if not exists then creating it else updating its properties.
// Query 3 for creating relation or updating it.
MERGE (Kunal: PERSON { name: 'Kunal', type: 'Person' })
MERGE (Bangalore: LOC { name: 'Bangalore', type: 'Location' })
MERGE (Kunal)-[r:LIVES_IN]->(Bangalore)
ON CREATE SET
r.duration = 36
ON MATCH SET
r.duration = r.duration + 1
RETURN *
The problem is these are 3 separate queries which will have 3 database calls when I run it via the Python driver. Is there a way to optimize these queries into a single query.
Of course you can concatenate your three queries to one.
In this case you can omit the first and second MERGE of your last query, because it is assured by the start of new query already.
MERGE (kunal:PERSON {name: ‘Kunal'})
ON CREATE SET
kunal.type = 'Person',
kunal.created = timestamp()
ON MATCH SET
kunal.lastUpdated = timestamp()
MERGE (bangalore:LOC {name: 'Bangalore'})
ON CREATE SET
bangalore.type = 'Location',
bangalore.created = timestamp()
ON MATCH SET
bangalore.lastUpdated = timestamp()
MERGE (kunal)-[r:LIVES_IN]->(bangalore)
ON CREATE SET
r.duration = 36
ON MATCH SET
r.duration = r.duration + 1
RETURN *

Node creation using cypher Foreach

I have 2 csv files and their sructure is as follows:
1.csv
id name age
1 aa 23
2 bb 24
2.csv
id product location
1 apple CA
2 samsung PA
1 HTC AR
2 philips CA
3 sony AR
// 1.csv
LOAD CSV WITH HEADERS FROM "file:///G:/1.csv" AS csvLine
CREATE (a:first { id: toInt(csvLine.id), name: csvLine.name, age: csvLine.age})
// 2.csv
LOAD CSV WITH HEADERS FROM "file:///G:/2.csv" AS csvLine
CREATE (b:second { id: toInt(csvLine.id), product: csvLine.product, location: csvLine.location})
Now i want to create another node called "third", using the following cypher query.
LOAD CSV WITH HEADERS FROM "file:///G:/1.csv" AS csvLine
MATCH c = (a:first), d = (b.second)
FOREACH (n IN nodes(c) |
CREATE (e:third)
SET e.name = label(a) + label(b) + "id"
SET e.origin = label(a)
SET e.destination = label(b)
SET e.param = a.id)
But the above query give me duplicate entries. I think here it runs 2 time after the load. Please suggest or any alternative way for this.
CREATE always creates, even if something is already there. So that's why you're getting duplicates. You probably want MERGE which only creates an item if it doesn't already exist.
I wouldn't ever do CREATE (e:third) or MERGE (e:third) because without specifying properties, you'll end up with duplicates anyway. I'd change this:
CREATE (e:third)
SET e.name = label(a) + label(b) + "id"
SET e.origin = label(a)
SET e.destination = label(b)
SET e.param = a.id)
To this:
MERGE (e:third { name: label(a) + label(b) + "id",
origin: label(a),
destination: label(b),
param: a.id })
This then would create the same node when necessary, but avoid creating duplicates with all the same property values.
Here's the documentation on MERGE
You don't use csvLine at all for matching the :first and :second node!
So your query doesn't make sense
This doesn't make sense either:
MATCH c = (a:first), d = (b.second)
FOREACH (n IN nodes(c) |
CREATE (e:third)
c are paths with a single node, i.e. (a)
so instead of the foreach you would use a directly instead

Resources