I have created a hierarchical tree in Neo4j to represent the organization chart of a company, as in the picture below.
When I insert many relationships with LOAD CSV, I use this query:
LOAD CSV WITH HEADERS FROM "file:///newRelation.csv" AS row
MERGE (a:Person {name:row.person1Name})
MERGE(b:Person {name:row.person2Name})
FOREACH (t in CASE WHEN NOT EXISTS((a)-[*]->(b)) THEN [1] ELSE [] END |
MERGE (a)-[pr:Manage]->(b) )
With this query, I only create the relationship if the two people do not already have a hierarchical relationship.
How can I save (log) the list of relationships that are not created because the test below fails?
CASE WHEN NOT EXISTS((a)-[*]->(b))
You need to move the existence check to a level above the FOREACH, so that its result is still in scope after the relationships are merged. The rows whose check list is empty are the ones that were skipped:
LOAD CSV WITH HEADERS FROM "file:///newRelation.csv" AS row
MERGE (a:Person {name: row.person1Name})
MERGE (b:Person {name: row.person2Name})
WITH a, b, row,
  CASE WHEN NOT exists((a)-[*]->(b)) THEN [1] ELSE [] END AS check
FOREACH (t IN check |
  MERGE (a)-[pr:Manage]->(b)
)
WITH a, b, row, check WHERE size(check) = 0
RETURN a, b, row
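To actually persist the skipped rows, the returned records can be written out on the client side. The following is only a sketch under assumptions: it uses the same query as above but returns the two names instead of whole nodes, and `log_skipped` and the `out` stream are hypothetical names. `session` is expected to behave like a Neo4j Python driver session, i.e. `session.run(query)` yields records that can be indexed by column name.

```python
import csv
import sys

# Same logic as the query above; only the RETURN clause differs (names instead of nodes).
SKIPPED_QUERY = """
LOAD CSV WITH HEADERS FROM 'file:///newRelation.csv' AS row
MERGE (a:Person {name: row.person1Name})
MERGE (b:Person {name: row.person2Name})
WITH a, b,
     CASE WHEN NOT exists((a)-[*]->(b)) THEN [1] ELSE [] END AS check
FOREACH (t IN check | MERGE (a)-[:Manage]->(b))
WITH a, b, check WHERE size(check) = 0
RETURN a.name AS person1, b.name AS person2
"""


def log_skipped(session, out=sys.stderr):
    """Run the import and write one CSV line per pair that was NOT linked."""
    skipped = [(rec["person1"], rec["person2"]) for rec in session.run(SKIPPED_QUERY)]
    writer = csv.writer(out)
    writer.writerow(["person1", "person2"])  # header line of the log
    writer.writerows(skipped)
    return skipped
```

With a real driver this could be called as `log_skipped(session, open("skipped.csv", "w", newline=""))`; the file name is only an example.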
This is about importing a CSV file into Neo4j using LOAD CSV. Suppose my CSV file has the following format:
Id, OID, name, address, Parents , Children
1, mid1, ratta, hello#aa, ["mid250","mid251","mid253"], ["mid60","mid65"]
2, mid2, butta, ado#bb, ["mid350","mid365","mid320", "mid450","mid700"], ["mid20","mid25","mid30"]
3, mid3, natta, hkk#aa, ["mid50","mid311","mid543"], []
So the Parents and Children columns basically consist of mids. While importing the CSV into Neo4j using LOAD CSV, I want to create the following nodes and relationships:
Nodes for each row (one for each Id column value in the CSV).
[:PARENT] relationships, by matching the OID property of each row with the OIDs inside the Parents column. As an example, when processing the first row there should be four nodes (mid1, mid250, mid251 and mid253) and three PARENT relationships between mid1 and the other three nodes.
[:CHILD] relationships, by matching the OID property of each row with the OIDs inside the Children column.
Please help!
I tried doing it with FOREACH, but the results didn't come out correctly. I'm doing it through a Python script and just need to edit the Cypher query.
def create_AAA(tx):
tx.run(
"LOAD CSV WITH HEADERS FROM 'file:///aaa.csv' AS row MERGE (e:AAA {id: row._id,OID: row.OID,address: row.address,name: row.name})"
)
def create_parent(tx):
tx.run(
"LOAD CSV WITH HEADERS FROM 'file:///aaa.csv' AS row MERGE (a:AAA {OID: row.OID}) FOREACH (t in row.parents | MERGE (e:AAA {OID:t}) MERGE (a)-[:PARENT]->(e) )"
)
def create_child(tx):
tx.run(
"LOAD CSV WITH HEADERS FROM 'file:///aaa.csv' AS row MERGE (a:AAA {OID: row.OID}) FOREACH (t in row.children | MERGE (e:AAA {OID:t}) MERGE (a)-[:CHILD]->(e) )"
)
with driver.session() as session:
session.write_transaction(create_AAA)
session.write_transaction(create_parent)
session.write_transaction(create_child)
Please follow the steps below:
Rename the Parents and Children columns to parents and children, since Neo4j is case sensitive and the query refers to row.parents and row.children.
Remove the spaces in your CSV file so that you don't need to call trim() on each column.
In your parents and children columns, remove the commas inside the string list, because they conflict with the CSV field delimiter and cause an error. Alternatively, use a delimiter other than the comma. In my example, I used a space as the delimiter.
The script below removes the quotes and the [ ] characters, then converts the string into a list using the split() function.
Do the same for the create_child function.
LOAD CSV WITH HEADERS FROM 'file:///aaa.csv' AS row
MERGE (a:AAA {OID: row.OID})
FOREACH (t in split(replace(replace(replace(row.parents,'[', ''),']', ''),'"', ''), ' ') |
MERGE (e:AAA {OID:t}) MERGE (a)-[:PARENT]->(e) )
See sample csv here:
Id,OID,name,address,parents,children
1,mid1,ratta,hello#aa,["mid250" "mid251" "mid253"],["mid60" "mid65"]
2,mid2,butta,ado#bb,["mid350" "mid365" "mid320" "mid450" "mid700"],["mid20" "mid25" "mid30"]
3,mid3,natta,hkk#aa,["mid50" "mid311" "mid543"],[]
See sample result here:
LOAD CSV WITH HEADERS FROM 'file:///aaa.csv' AS row
WITH row WHERE row.children <>"[]"
MERGE (a:AAA {OID: row.OID})
FOREACH (t in split(replace(replace(replace(row.children,'[', ''),']', ''),'"', ''), ' ') |
MERGE (e:AAA {OID:t}) MERGE (a)-[:CHILD]->(e) )
Now it works fine.
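Since the import is driven from a Python script anyway, another option is to parse the bracketed lists in Python and send the resulting pairs as a query parameter. This is only a sketch of that idea, not the approach used above: `parse_id_list`, `build_edges` and `PARENT_QUERY` are hypothetical names, and the rows are assumed to be dictionaries such as those produced by `csv.DictReader`.

```python
import re


def parse_id_list(cell):
    """Turn '["mid60" "mid65"]' or '["mid60","mid65"]' into ['mid60', 'mid65'].

    Brackets, quotes, commas and whitespace are all treated as separators,
    so an empty list '[]' (or a missing cell) yields [].
    """
    return re.findall(r'[^\s,\[\]"]+', cell or "")


def build_edges(rows, column):
    """One {'src': ..., 'dst': ...} dict per OID found in the given list column."""
    return [
        {"src": row["OID"], "dst": target}
        for row in rows
        for target in parse_id_list(row.get(column))
    ]


# Assumed query: takes the pre-parsed pairs as the $edges parameter.
PARENT_QUERY = """
UNWIND $edges AS edge
MERGE (a:AAA {OID: edge.src})
MERGE (e:AAA {OID: edge.dst})
MERGE (a)-[:PARENT]->(e)
"""
```

Inside a transaction function this could be run as `tx.run(PARENT_QUERY, edges=build_edges(rows, "parents"))`. Rows with an empty list simply contribute no pairs, so no extra WHERE filter is needed.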
I have some user event data and would like to create a graph from it. A snapshot of the data:
The _id column has duplicate values, but they refer to the same person; there are multiple sessionField records for the same _id.
What I'd want is something like this:
Node A -> sessionNode a1 -> Action Node a11 (with event type as properties, 4 in this case)
-> sessionNode a2 -> Action Node a21 (with event type as properties, 2 in this case)
Node B -> sessionNode b1 -> Action Node b11 (with event type as properties, 3 in this case)
I've tried the following code, but being new to graphs I'm not able to get this result
(session_streams_y holds the same data as _id):
LOAD CSV WITH HEADERS FROM 'file:///df_temp.csv' AS users
CREATE (p:Person {nodeId: users._id, sessionId: users.session_streams_y})
CREATE (sn:Session {sessId: users.sessionField, sessionId: users.session_streams_y})
MATCH (p:Person)
with p as ppl
MATCH (sn:Session)
WITH ppl, sn as ss
WHERE ppl.sessionId=ss.sessionId
MERGE (ppl)-[:Sessions {sess: 'Has Sessions'}]-(ss)
WITH [ppl,ss] as ns
CALL apoc.refactor.mergeNodes(ns) YIELD node
RETURN node
This gives something different
Something like this may work for you:
LOAD CSV WITH HEADERS FROM 'file:///df_temp.csv' AS row
MERGE (p:Person {id: row._id})
MERGE (s:Session {id: row.sessionField})
FOREACH(
x IN CASE WHEN s.eventTypes IS NULL OR NOT row.eventType IN s.eventTypes THEN [1] END |
SET s.eventTypes = COALESCE(s.eventTypes, []) + row.eventType)
MERGE (p)-[:HAS_SESSION]->(s)
RETURN p, s
The resulting Person and Session nodes would be unique, each Session node would have an eventTypes list with distinct values, and the appropriate Person and Session nodes would be connected by a HAS_SESSION relationship.
An Action node does not seem to be necessary.
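To check what the query should produce before running it, the same grouping can be sketched in plain Python. This is only an illustration: `group_events` is a hypothetical helper, the column names `_id`, `sessionField` and `eventType` are taken from the query above, and each session is assumed to belong to a single person.

```python
def group_events(rows):
    """Map person id -> session id -> list of distinct event types.

    Mirrors the two MERGEs (unique Person and Session) and the FOREACH
    guard that appends an event type only if it is not in the list yet.
    """
    graph = {}
    for row in rows:
        sessions = graph.setdefault(row["_id"], {})
        event_types = sessions.setdefault(row["sessionField"], [])
        if row["eventType"] not in event_types:
            event_types.append(row["eventType"])
    return graph
```

Feeding it the rows of df_temp.csv (for example via `csv.DictReader`) gives a nested dictionary that can be compared against the Person, Session and eventTypes values in the graph.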
I can load a CSV into Neo4j for a specific label (say PERSON), and the nodes are created under the label PERSON.
I also have another CSV that describes the relationships between the persons; it looks like this:
name1, relation, name2
a, LOVE, b
a, HATE, c
I want to create a relationship between these pairs, and the type of the relationship created should be "LOVE", "HATE", etc., instead of the fixed REL type produced by the script below:
load csv with headers from "file:///d:/Resources/Neo4j/person-rel.csv" as p
match (a:PERSON) where a.name=p.name1
match (b:PERSON) where b.name=p.name2
merge (a)-[r:REL {relation: p.REL}]->(b)
By doing this, I get a bunch of REL-type relationships, but no LOVE or HATE relationships.
In other words, I want the REL in the last line of the script to be assigned dynamically, so that I can then query all the relationship types using the Neo4j API.
Is this possible?
You can install the APOC library and then use apoc.merge.relationship:
apoc.merge.relationship(startNode, relType, {key:value, ...}, {key:value, ...}, endNode) - merge relationship with dynamic type
LOAD CSV WITH HEADERS FROM "file:///d:/Resources/Neo4j/person-rel.csv" AS p
MATCH (a:PERSON) WHERE a.name = p.name1
MATCH (b:PERSON) WHERE b.name = p.name2
// the relationship type is read from the CSV's relation column
CALL apoc.merge.relationship(a, p.relation, {}, {}, b) YIELD rel
RETURN count(*);
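If installing APOC is not possible, a different technique (not the one from the answer above) is to group the CSV rows by relationship type on the client and generate one statement per type. The sketch below assumes the type has to be written into the query text, so it is validated first; `group_by_type`, `rel_query` and the parameter name `$pairs` are hypothetical.

```python
import re
from collections import defaultdict

# Only plain identifiers are accepted, because the type is spliced into the query text.
_VALID_TYPE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def group_by_type(rows):
    """Group CSV rows (dicts with name1, relation, name2) by relationship type."""
    grouped = defaultdict(list)
    for row in rows:
        rel_type = row["relation"].strip()  # the sample CSV has spaces after commas
        grouped[rel_type].append(
            {"name1": row["name1"].strip(), "name2": row["name2"].strip()}
        )
    return dict(grouped)


def rel_query(rel_type):
    """Build the Cypher statement for one relationship type."""
    if not _VALID_TYPE.match(rel_type):
        raise ValueError(f"unsafe relationship type: {rel_type!r}")
    return (
        "UNWIND $pairs AS pair "
        "MATCH (a:PERSON {name: pair.name1}) "
        "MATCH (b:PERSON {name: pair.name2}) "
        f"MERGE (a)-[:{rel_type}]->(b)"
    )
```

Each group could then be sent with something like `tx.run(rel_query(rel_type), pairs=pairs)` while iterating over `group_by_type(rows).items()`.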
I am trying to get a csv into Neo4j. As it consists of log entries, I'd like to connect nodes with a NEXT-pointer/relationship when the corresponding logs have been created at subsequent times.
LOAD CSV WITH HEADERS FROM 'http://localhost/Export.csv' AS line
CREATE (:Entry { date: line[0], ...})
MATCH (n)
RETURN n
ORDER BY n:date
MATCH (a:Entry),(b:Entry),(c:Entry)
WITH p AS min(b:date)
WHERE a:date < b:date AND c.date = p
CREATE (a)-[r:NEXT]->(c)
The last four lines do not work, however. What I am trying to do is get the earliest entry 'c' out of the group of entries 'b' that have a larger timestamp than 'a'. Can anyone help me out here?
Not sure if I understood your question correctly: you have a CSV file containing log records with a timestamp, one record per line, and you want to interconnect the events to form a linked list based on the timestamp?
In this case I'd split up the process into two steps:
using LOAD CSV, create a node with a date property for each line
afterwards, connect the entries using e.g. a Cypher statement like this:
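MATCH (e:Entry)
WITH e ORDER BY e.date
WITH collect(e) AS entries
// range() is inclusive, so i covers every index that has a successor
FOREACH (i IN range(0, size(entries) - 2) |
  // the single-element lists bind e1 and e2 so that MERGE can use them
  FOREACH (e1 IN [entries[i]] |
    FOREACH (e2 IN [entries[i+1]] |
      MERGE (e1)-[:NEXT]->(e2))))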
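The nested FOREACH is just a way of pairing each entry in the sorted list with the one that follows it. As a small illustration in Python (`next_pairs` is a hypothetical helper, and the dates are assumed to sort correctly as plain values, e.g. ISO-formatted strings):

```python
def next_pairs(dates):
    """Return (earlier, later) pairs of neighbouring entries after sorting."""
    ordered = sorted(dates)
    # zip stops at the shorter sequence, so the last entry gets no successor,
    # just like the index range that ends at the second-to-last element.
    return list(zip(ordered, ordered[1:]))
```

A list with n entries yields n - 1 pairs, which is the number of NEXT relationships to expect in the graph.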
How do I insert the following into Neo4j?
create (st:serviceticket {name:'SRT_519'})
with st as st
match (st:serviceticket) where st.name='SRT_519'
match (d:ProductID) where d.name ='PRD_1014'
with st as st , d as d
merge (d)-[:SERVICE_TICKETID]->(st);
create (st:serviceticket {name:'SRT_520'})
with st as st
match (st:serviceticket) where st.name='SRT_520'
match (d:ProductID) where d.name ='PRD_1004'
with st as st , d as d
merge (d)-[:SERVICE_TICKETID]->(st);
If I have multiple records like this to insert, how can I insert all of them in one shot? Please help me.
I'll assume that nodes for your products already exist.
Remember that your "st" is a reference and you are trying to assign the same reference twice in one batch.
The first step is to give your service tickets separate references:
match (d:ProductID), (e:ProductID)
where d.name = 'PRD_1014' and
e.name = 'PRD_1004'
create (st519:serviceticket {name:'SRT_519'})
create (st520:serviceticket {name:'SRT_520'})
merge (d)-[:SERVICE_TICKETID]->(st519)
merge (e)-[:SERVICE_TICKETID]->(st520)
If you use parameters, you can provide an array of pairs (you should have an index or constraint on :serviceticket(name) and :ProductID(name)):
WITH [['SRT_519','PRD_1014'], ['SRT_520','PRD_1004']] AS data
FOREACH (pair IN data |
  MERGE (st:serviceticket {name: pair[0]})
  MERGE (d:ProductID {name: pair[1]})
  MERGE (d)-[:SERVICE_TICKETID]->(st)
)
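From a client, the array of pairs would normally be passed as a real query parameter instead of a literal list. A minimal sketch with a Python transaction function, assuming the product nodes already exist and using the labels from the question; `insert_tickets`, `TICKET_QUERY` and the parameter name `$pairs` are hypothetical:

```python
# Assumed query: each element of $pairs is [ticket name, product name].
TICKET_QUERY = """
UNWIND $pairs AS pair
MATCH (d:ProductID {name: pair[1]})
MERGE (st:serviceticket {name: pair[0]})
MERGE (d)-[:SERVICE_TICKETID]->(st)
"""


def insert_tickets(tx, pairs):
    """Create all tickets and their links to existing products in one statement."""
    # Convert tuples to lists so the driver sends them as Cypher lists.
    tx.run(TICKET_QUERY, pairs=[list(pair) for pair in pairs])
```

With the official driver this could be invoked as `session.write_transaction(insert_tickets, [("SRT_519", "PRD_1014"), ("SRT_520", "PRD_1004")])`. Because the product is matched rather than merged, a pair whose product does not exist is silently skipped.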