I am trying to create a relationship between two existing nodes. I am reading the node ID's from a CSV and creating the relationship with the following query:
LOAD CSV WITH HEADERS FROM "file:///8245.csv" AS f
MATCH (Ev:Event) where id(Ev) =f.first
MATCH (Ev_sec:Event) where id(Ev_sec) = f.second
WITH Ev, Ev_sec
MERGE (Ev) - [:DF_mat] - > (Ev_sec)
However, it is not changing anything the database. How can I solve this problem?
Thanks!
I solved the problem. So, I again queried for the ID(node) and this time I exported them as a string (by using toString(ID(node)) ). Then while loading to the database, I converted them to Integer. The query is as follows:
LOAD CSV WITH HEADERS FROM "file:///8245_new.csv" AS csvLine
match (ev:Event) where id(ev)=toInteger(csvLine.first)
match (ev_sec:Event) where id(ev_sec)=toInteger(csvLine.second)
merge (ev)-[:DF_mat]-> (ev_sec)
Related
Apologies as I am new to neo4j and struggling with what I imagine is a very simple example.
I would like to model an org chart which I have stored as a csv like so
id,name,manager_id
1,allan,2
2,bob,4
3,john,2
4,sam,
5,Jim,2
Note that Bob has 3 direct reports and Bob reports into Sam who doesn't report into anyone.
I would like to produce a graph which shows the management chain. I have tried the following, but it produces relationships which are disjoint from the people:
LOAD CSV WITH HEADERS FROM "file///employees.csv" AS csvLine
CREATE (p:Person {id: csvLine.id, name: csvLine.name})
CREATE (p)-[:MANAGED_BY {manager: csvLine.manager_id}]->(p)
This query creates a bunch of self-referencing relationships. Is there anyway to populate the graph with one command over the single csv? I must be missing something and any help is appreciated. Thanks
I think this is what you are looking for.
In your query tou are creating a relationship between p and p thus the self referencing relationships.
I added a coalesce statement to deal with people that do not have a manager_id value. THis way Sam can report to himself.
LOAD CSV WITH HEADERS FROM "file:///employees.csv" AS csvLine
// create or match the person in the left column
MERGE (p:Person {id: csvLine.id })
// if they are created then assign their name
ON CREATE SET p.name = csvLine.name
// create or match the person/manager in the right column
MERGE (p1:Person {id: coalesce(csvLine.manager_id, csvLine.id) })
// create the reporting relationship
CREATE (p)-[:MANAGED_BY]->(p1)
I am working in a Streamsets pipeline to read data from a active file directory where .csv files are uploaded remotely and put those data in a neo4j database.
The steps I have used is-
Creating a observation node for each row in .csv
Creating a csv node and creating relation between csv & the record
Updating Timestamp taken from csv node to burn_in_test nodes, already created in graph database from different pipeline, if it is latest
creating relation from csv to burn in test
deleting outdated relation based on latest timestamp
Now I am doing all of these using jdbc query and the cypher query used is
MERGE (m:OBSERVATION{
SerialNumber: "${record:value('/SerialNumber')}",
Test_Stage: "${record:value('/Test_Stage')}",
CUR: "${record:value('/CUR')}",
VOLT: "${record:value('/VOLT')}",
Rel_Lot: "${record:value('/Rel_Lot')}",
TimestampINT: "${record:value('/TimestampINT')}",
Temp: "${record:value('/Temp')}",
LP: "${record:value('/LP')}",
MON: "${record:value('/MON')}"
})
MERGE (t:CSV{
SerialNumber: "${record:value('/SerialNumber')}",
Test_Stage: "${record:value('/Test_Stage')}",
TimestampINT: "${record:value('/TimestampINT')}"
})
WITH m
MATCH (t:CSV) where t.SerialNumber=m.SerialNumber and t.Test_Stage=m.Test_Stage and t.TimestampINT=m.TimestampINT MERGE (m)-[:PART_OF]->(t)
WITH t, t.TimestampINT AS TimestampINT
MATCH (rl:Burn_In_Test) where rl.SerialNumber=t.SerialNumber and rl.Test_Stage=t.Test_Stage and rl.TimestampINT<TimestampINT
SET rl.TimestampINT=TimestampINT
WITH t
MATCH (rl:Burn_In_Test) where rl.SerialNumber=t.SerialNumber and rl.Test_Stage=t.Test_Stage
MERGE (t)-[:POINTS_TO]->(rl)
WITH rl
MATCH (t:CSV)-[r:POINTS_TO]->(rl) WHERE t.TimestampINT<rl.TimestampINT
DELETE r
Right now this process is very slow and taking about 15 mins of time for 10 records. Can This be further optimized?
Best practices when using MERGE is to merge on a single property and then use SET to add other properties.
If I assume that serial number is property is unique for every node (might not be), it would look like:
MERGE (m:OBSERVATION{SerialNumber: "${record:value('/SerialNumber')}"})
SET m.Test_Stage = "${record:value('/Test_Stage')}",
m.CUR= "${record:value('/CUR')}",
m.VOLT= "${record:value('/VOLT')}",
m.Rel_Lot= "${record:value('/Rel_Lot')}",
m.TimestampINT = "${record:value('/TimestampINT')}",
m.Temp= "${record:value('/Temp')}",
m.LP= "${record:value('/LP')}",
m.MON= "${record:value('/MON')}"
MERGE (t:CSV{
SerialNumber: "${record:value('/SerialNumber')}"
})
SET t.Test_Stage = "${record:value('/Test_Stage')}",
t.TimestampINT = "${record:value('/TimestampINT')}"
WITH m
MATCH (t:CSV) where t.SerialNumber=m.SerialNumber and t.Test_Stage=m.Test_Stage and t.TimestampINT=m.TimestampINT MERGE (m)-[:PART_OF]->(t)
WITH t, t.TimestampINT AS TimestampINT
MATCH (rl:Burn_In_Test) where rl.SerialNumber=t.SerialNumber and rl.Test_Stage=t.Test_Stage and rl.TimestampINT<TimestampINT
SET rl.TimestampINT=TimestampINT
WITH t
MATCH (rl:Burn_In_Test) where rl.SerialNumber=t.SerialNumber and rl.Test_Stage=t.Test_Stage
MERGE (t)-[:POINTS_TO]->(rl)
WITH rl
MATCH (t:CSV)-[r:POINTS_TO]->(rl) WHERE t.TimestampINT<rl.TimestampINT
DELETE r
another thing to add is that I would probably split this into two queries.
First one would be the importing part and the second one would be the delete of relationships. Also add unique constraints and indexes where possible.
I am trying to load a CSV file using LOAD FROM CSV and build a relationship. I have a crossover table that I am using to support a many to many relationship. For my example I will use two (2) primary nodes, Car and Driver.
A Car can be driven by one or more Drivers and a Driver can drive one or more Cars.
My cross over table looks like this
CarID (int)
DriverID (int)
Here is my code to successfully load it into Neo4j
LOAD CSV WITH HEADERS FROM 'FILE:///CarToDriverXFER.csv' AS row FIELDTERMINATOR ','
MATCH (c:Cars {carID:row.carID})
MATCH (d:Drivers {driverID:row.driverID})
MERGE (c)-[:DRIVES]->(d)
I want to add an attribute into this relationship. Now the table looks like this:
CarID (int)
DriverID (int)
Rating (int)
I am not sure how to do this. I know how to do this if the object is a node but I am not getting the syntax correct with regards to build a relationship. Here is my attempt at a solution but I am getting an error.
LOAD CSV WITH HEADERS FROM 'FILE:///CarToDriverXFER.csv' AS row FIELDTERMINATOR ','
MATCH (c:Cars {carID:row.carID})
MATCH (d:Drivers {driverID:row.driverID})
CREATE ({Rating:row.Rating})
MERGE (c)-[:DRIVES]->(d)
The above script loaded the relationships but the attribute "Rating" is not listed on the attribute.
Can anyone offer help?
You can add the attribute to the relationship the same way you add attributes to nodes, that is:
LOAD CSV WITH HEADERS FROM 'FILE:///CarToDriverXFER.csv' AS row FIELDTERMINATOR ','
MATCH (c:Cars {carID:row.carID})
MATCH (d:Drivers {driverID:row.driverID})
// adding 'Rating' attribute to ':Drives' relationship between 'c' and 'd'
MERGE (c)-[:DRIVES {Rating:row.Rating}]->(d)
In shortest way my problem is below:
I need to get from the following csv file
(https...)drive.google.com/file/d/0B-y9nPaqlH6XdXZsYzAwLThacTg/view?usp=sharing
The following data-structure in neo4j (Using cypher import):
https://drive.google.com/file/d/0B-y9nPaqlH6XdlZHM216eDRSX3c/view?usp=sharing
Instead of:
[https://drive.google.com/file/d/0B-y9nPaqlH6XdE9vZ0gyNU1lR0U/view?usp=sharing]
The longer interpretation:
I thought, the solution of my problem is just need to understand to (un)bound elements.
But I tried many times, in many ways (with(out) creating single nodes first, or in empty database):
LOAD CSV with headers FROM "file:///C:/Users/user/Desktop/neo4j help/calling.csv"
AS csvLine
MERGE (u1:Person { number:(csvLine.A), name:(csvLine.name_A)}) MERGE (u2:Person { number:(csvLine.B), name:(csvLine.name_B)})
MERGE (u1:Person { number:(csvLine.A), name:(csvLine.name_A)})-[c:called]->(u2:Person { number:(csvLine.B), name:(csvLine.name_B)})
RETURN u1.name,c,u2.name
I got instead of wondered results just error message:
Can't create u1 with properties or labels here. It already exists in
this context
And without „pre-merging“ the nodes, I have the results above (in the pink picture)
What do I need to obtain the wanted result (in the first picture)?
You don't need to redefine the u1 and u2 nodes. Just reuse the identifiers and MERGE the relationship :
LOAD CSV with headers FROM "file:///C:/Users/user/Desktop/neo4j help/calling.csv"
AS csvLine
MERGE (u1:Person { number:(csvLine.A), name:(csvLine.name_A)})
MERGE (u2:Person { number:(csvLine.B), name:(csvLine.name_B)})
MERGE (u1)-[c:CALLED]->(u2)
RETURN u1.name,c,u2.name
Nb: I think your images are both the same, and you can post them in your questions, many people will skip your question because they need to open 2 or 3 more browser windows
after importing data via CSV LOAD I want to connect the imported nodes to customer nodes that are already in the DB. The idea was to look up all imported nodes with the Label TICKET and run through the result set and create the relationship.
Here is the code I come up with first approach:
# Find nodes without relationship for label Ticket
MATCH (t:Ticket), (c:Customer)
WHERE NOT (t)--(c)
RETURN t.number as ticket_number, t.type as ticket_type,t.sid as ticket_sid
# Run through the resultset and execute for each found node
MATCH (t:Ticket { number: "xxx" }), (c:Customer {code: "xxx"})
MERGE (t)-[:IS_TICKET_OF]->(c);
There is an index
ON :Ticket (number)
ON :Customer(code)
This way to handle it is very slow and it took minutes to run through the CSV file. I hope there is a way to optimize the query or maybe to find a way to create the missing relationship easier as first to look them all up and then run through a loop.
The CSV Load is :
LOAD CSV FROM "file:c:..." AS csvLine
MERGE (t:Ticket { number: csvLine[0]})
Maybe its also fine to create the relation already in the CSV import - maybe something like
MATCH (c:Customer {code:"xxx"})
MERGE (t) - [:IS_TICKET_OF]-> (c)
But I would need to figure out in the query how to extract the code from a field as I have something like "aaa/vvv/bbb/1234" in the CSV import and would need only aaa for the match above as this is stored in the customer node as ID.
Any hint is very appreciated.
Thanks!
Does this query work for you?
It stores the aaa part of the input string in num, makes sure the ticket with that number exists, and then makes sure a relationship exists to the matching customer (if there is such a customer).
LOAD CSV FROM "file:c:..." AS csvLine
WITH SPLIT(csvLine[0], '/')[0] AS num
MERGE (t:Ticket {number: num})
WITH num, t
OPTIONAL MATCH (c:Customer {code: num})
MERGE (t)-[:IS_TICKET_OF]->(c);