Load CSV in Neo4j with nodes and relationships in one CSV file

Apologies as I am new to neo4j and struggling with what I imagine is a very simple example.
I would like to model an org chart which I have stored as a csv like so
id,name,manager_id
1,allan,2
2,bob,4
3,john,2
4,sam,
5,Jim,2
Note that Bob has 3 direct reports and Bob reports into Sam who doesn't report into anyone.
I would like to produce a graph which shows the management chain. I have tried the following, but it produces relationships which are disjoint from the people:
LOAD CSV WITH HEADERS FROM "file:///employees.csv" AS csvLine
CREATE (p:Person {id: csvLine.id, name: csvLine.name})
CREATE (p)-[:MANAGED_BY {manager: csvLine.manager_id}]->(p)
This query creates a bunch of self-referencing relationships. Is there any way to populate the graph with one command over the single CSV? I must be missing something, and any help is appreciated. Thanks.

I think this is what you are looking for.
In your query you are creating a relationship between p and p, hence the self-referencing relationships.
I added a coalesce call to deal with people who do not have a manager_id value. This way Sam reports to himself.
LOAD CSV WITH HEADERS FROM "file:///employees.csv" AS csvLine
// create or match the person in the left column
MERGE (p:Person {id: csvLine.id })
// always set the name: the node may already exist because an earlier row merged it as someone's manager
SET p.name = csvLine.name
// create or match the person/manager in the right column
MERGE (p1:Person {id: coalesce(csvLine.manager_id, csvLine.id) })
// create the reporting relationship
CREATE (p)-[:MANAGED_BY]->(p1)
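If the self-loop on Sam is not wanted, one possible variant is to skip the manager step for rows that have no manager_id. This is only a sketch: it assumes, as the coalesce above does, that LOAD CSV reads the empty manager_id field as null, and the variable name m is illustrative.

```cypher
LOAD CSV WITH HEADERS FROM "file:///employees.csv" AS csvLine
// create or match the person and record their name
MERGE (p:Person {id: csvLine.id})
SET p.name = csvLine.name
// drop rows without a manager before touching the manager node
WITH p, csvLine
WHERE csvLine.manager_id IS NOT NULL
MERGE (m:Person {id: csvLine.manager_id})
// MERGE avoids duplicate relationships if the import is run again
MERGE (p)-[:MANAGED_BY]->(m)
```

With the sample file, this should leave Sam (id 4) as the only Person with no outgoing MANAGED_BY relationship.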

Related

Unable to link a node with itself using Neo4j

How can I create a relationship from a node to itself? I have one node label (p:person) and my CSV has 2 columns: name and vice. Each row in my CSV represents a person who was a CEO and their VP at the time. Sometimes a VP was also a CEO, so I want to show that relationship. Here is what I tried, with no luck. If I do not include the WITH, I receive an error saying I need it, but when I add the * or a property, it says it cannot find row. I'm stuck.
:auto USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM 'file:///ceo_vp.csv' AS row
CREATE (p:person {name:coalesce(row.name,'UNK')})
MATCH (p:person {name:row.vice })
WITH *
CREATE (p)-[:was_vp_for]->(p)
There is a typo in the variable p: you must use a different variable name for the VP. Here is the script:
LOAD CSV WITH HEADERS FROM 'file:///ceo_vp.csv' AS row
MERGE (ceo:person {name:coalesce(row.name,'UNK')})
MERGE (vice:person {name:row.vice })
CREATE (vice)-[:was_vp_for]->(ceo)
Notice that I used MERGE because, as you said, a VP can be a former CEO (and vice versa), so MERGE is better than CREATE. MERGE reuses the person node if it already exists instead of creating a duplicate.
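One caveat worth sketching: MERGE raises an error when a property value in its pattern is null. The question does not say whether the vice column can be empty, so the following is only a hedged variant that assumes some rows may lack a VP and skips the relationship for them.

```cypher
LOAD CSV WITH HEADERS FROM 'file:///ceo_vp.csv' AS row
MERGE (ceo:person {name: coalesce(row.name, 'UNK')})
// keep only rows that actually name a VP
WITH ceo, row
WHERE row.vice IS NOT NULL
MERGE (vice:person {name: row.vice})
MERGE (vice)-[:was_vp_for]->(ceo)
```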

Neo4j consolidate nodes in cypher single node relationship

I’m desperately trying to understand how to consolidate nodes in Neo4J for a more streamlined relationship.
I have a dataset with four columns: Person_1_ID, Person_1_Name, Person_2_ID, Person_2_Name
How do you consolidate IDs when they exist in two columns so that you only show one node?
For example:
My dataset looks like this:
If I do a simple match and merge like this:
LOAD CSV WITH HEADERS FROM 'file:///users.csv' AS row
MATCH (Person_1:Person_1 {Person_1_ID: row.Person_1_ID})
MATCH (Person_2:Person_2 {Person_2_ID: row.Person_2_ID })
MERGE (Person_2)-[pr:REFERRING]->(Person_1);
It produces this:
I'm trying to merge on names/IDs so that the relationships look like this:
Desperately trying to understand how you merge nodes properly here so that the relationships are consolidated. Any guidance and code example is greatly appreciated!
Doug
You have a couple of issues:
In your initial node import, don't use the Person_1 and Person_2 node labels. Stick with a single node label, Person. The same goes for the person ID property: use one property name for both columns. If I were you, I would delete the existing graph and use the following Cypher to produce the desired result:
LOAD CSV WITH HEADERS FROM 'file:///users.csv' AS row
MERGE (Person_1:Person {id: row.Person_1_ID})
MERGE (Person_2:Person {id: row.Person_2_ID })
MERGE (Person_2)-[pr:REFERRING]->(Person_1);
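The query above merges on the ID only, so the nodes carry no names. A minimal sketch that also stores the name columns listed in the question (Person_1_Name, Person_2_Name) could look like this; the name property and the variables p1 and p2 are illustrative choices, not part of the original answer.

```cypher
LOAD CSV WITH HEADERS FROM 'file:///users.csv' AS row
// one node per distinct ID, whichever column it appears in
MERGE (p1:Person {id: row.Person_1_ID})
SET p1.name = row.Person_1_Name
MERGE (p2:Person {id: row.Person_2_ID})
SET p2.name = row.Person_2_Name
MERGE (p2)-[:REFERRING]->(p1);
```

Merging on the ID alone and setting the name afterwards means that a person whose name is spelled differently in two rows still ends up as a single node.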

How to create relationships (a->b->c) among nodes by importing CSV data

I have a database of existing nodes and would like to add in additional relationships from a CSV file which looks like this:
id_from,id_to,point,nextpoint
1,2,HEILBRONN,ILSFELD
2,3,ILSFELD,MUNDELSHEIM
I would like to create a chain of relationships (a->b->c), such as HEILBRONN->ILSFELD->MUNDELSHEIM.
How can I do that? Thanks.
In Cypher, assuming the id_from and id_to in the file are the id property of the nodes (and that the property is indexed):
LOAD CSV WITH HEADERS FROM 'file:///path/to/file.csv' AS line
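// toInteger() replaces the older toInt(), which newer Neo4j versions no longer accept
MATCH (from {id: toInteger(line.id_from)}), (to {id: toInteger(line.id_to)})
// node variables must be wrapped in parentheses in current Cypher
MERGE (from)-[:RELATIONSHIP_TYPE]->(to)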
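To check that the import produced the expected chain, a variable-length pattern can be used. This is a sketch under the same assumptions as the answer (an integer id property on the nodes and the placeholder type RELATIONSHIP_TYPE); ids 1 and 3 are the first and last ids in the sample rows.

```cypher
// follow one or more RELATIONSHIP_TYPE hops from node 1 to node 3
MATCH path = (a {id: 1})-[:RELATIONSHIP_TYPE*]->(c {id: 3})
RETURN path
```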

can't create links from CSV in Neo4j

I can't figure out how to create links out of CSV tables in Neo4j. I've read several parts of the manual (match, loadCSV, etc), that free book, and several tutorials I've found. None of them seems to contemplate my use case (which is weird, because I think it's a pretty simple use case). I've tried adapting the code they have in all sorts of ways, but nothing seems to work.
So, I have three CSV tables: parent companies, child companies, and parent-child pairs. I begin by loading the first two tables (and that works fine - all the properties are there, all the info is correct):
LOAD CSV FROM "file:/C:/Users/thiago.marzagao/Desktop/CSVs/children.csv" AS node
CREATE (:Children {id: node[0], name: node[1]})
LOAD CSV FROM "file:/C:/Users/thiago.marzagao/Desktop/CSVs/parents.csv" AS node
CREATE (:Parent {id: node[0], name: node[1]})
Now, here's the structure of the third table:
child_id,parent_id
Here's some of the things I've tried:
LOAD CSV FROM "file:/C:/Users/thiago.marzagao/Desktop/CSVs/link.csv" AS rels
MATCH (FROM {Parent: rels[1]}), (TO {Children: rels[0]})
CREATE (Parent)-[:OWNS]->(Children)
This doesn't give me an error, but it returns zero rows.
LOAD CSV FROM "file:/C:/Users/thiago.marzagao/Desktop/CSVs/link.csv" AS rels
MATCH (FROM {id: rels[1]}), (TO {id: rels[0]})
CREATE (Parent)-[:OWNS]->(Children)
This doesn't give me an error, but it just returns a bunch of pairs of empty nodes. So, it creates the links, but somehow it doesn't link the actual nodes.
LOAD CSV FROM "file:/C:/Users/thiago.marzagao/Desktop/CSVs/link.csv" AS rels
MATCH (FROM {Parent.id: rels[1]}), (TO {Children.id: rels[0]})
CREATE (Parent)-[:OWNS]->(Children)
This gives me a syntax error (Neo.ClientError.Statement.InvalidSyntax)
I also tried several variations of the code blocks above, but to no avail. So, what am I doing wrong? (I'm on Neo4j 2.1.6, in case that matters.)
In your Cypher statement, the CREATE clause does not reference the identifiers bound in the MATCH clause, so Cypher just creates new, empty nodes.
Look at the difference:
MATCH (FROM {id: rels[1]}), (TO {id: rels[0]})
CREATE (Parent)-[:OWNS]->(Children)
Instead, it should be:
LOAD CSV FROM "file:/C:/Users/thiago.marzagao/Desktop/CSVs/link.csv" AS rels
MATCH (Parent {id: rels[1]}), (Children {id: rels[0]})
CREATE (Parent)-[:OWNS]->(Children)
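Note that Parent and Children in the MATCH above are variable names, not labels, so any node with a matching id can be picked up. A hedged variant that also restricts the match to the labels created during the node import (Parent and Children) might look like this; the variable names p and c are illustrative.

```cypher
LOAD CSV FROM "file:/C:/Users/thiago.marzagao/Desktop/CSVs/link.csv" AS rels
// label:property patterns keep a parent id from matching a child with the same id
MATCH (p:Parent {id: rels[1]}), (c:Children {id: rels[0]})
// MERGE instead of CREATE so re-running the import does not duplicate links
MERGE (p)-[:OWNS]->(c)
```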

Neo4j: optimize a relationship check (query)

After importing data via LOAD CSV, I want to connect the imported nodes to customer nodes that are already in the DB. The idea was to look up all imported nodes with the label Ticket, run through the result set, and create the relationships.
Here is the code I came up with as a first approach:
// Find nodes without relationship for label Ticket
MATCH (t:Ticket), (c:Customer)
WHERE NOT (t)--(c)
RETURN t.number as ticket_number, t.type as ticket_type,t.sid as ticket_sid
// Run through the result set and execute for each found node
MATCH (t:Ticket { number: "xxx" }), (c:Customer {code: "xxx"})
MERGE (t)-[:IS_TICKET_OF]->(c);
There is an index
ON :Ticket (number)
ON :Customer(code)
This approach is very slow: it took minutes to run through the CSV file. I hope there is a way to optimize the query, or an easier way to create the missing relationships than looking them all up first and then looping over them.
The CSV load is:
LOAD CSV FROM "file:c:..." AS csvLine
MERGE (t:Ticket { number: csvLine[0]})
Maybe it would also be fine to create the relationship during the CSV import, with something like
MATCH (c:Customer {code:"xxx"})
MERGE (t) - [:IS_TICKET_OF]-> (c)
But I would need to figure out how to extract the code from a field in the query: the CSV contains something like "aaa/vvv/bbb/1234", and I need only aaa for the match above, as this is what is stored in the customer node as ID.
Any hint is much appreciated.
Thanks!
Does this query work for you?
It stores the aaa part of the input string in num, makes sure the ticket with that number exists, and then makes sure a relationship exists to the matching customer (if there is such a customer).
LOAD CSV FROM "file:c:..." AS csvLine
WITH SPLIT(csvLine[0], '/')[0] AS num
MERGE (t:Ticket {number: num})
WITH num, t
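// plain MATCH drops rows with no matching customer; with OPTIONAL MATCH, c could be null and the MERGE below would fail
MATCH (c:Customer {code: num})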
MERGE (t)-[:IS_TICKET_OF]->(c);
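The query above uses the aaa prefix for both the ticket number and the customer code. If the ticket should keep its full value while only the prefix identifies the customer, a sketch along these lines may be closer to the question. It assumes the "aaa/vvv/bbb/1234" value sits in the first column and is the ticket number, which the question does not state outright; the file path is left elided as in the original.

```cypher
LOAD CSV FROM "file:c:..." AS csvLine
// keep the full field as the ticket number, and its first segment as the customer code
WITH csvLine[0] AS number, split(csvLine[0], '/')[0] AS code
MERGE (t:Ticket {number: number})
WITH t, code
// rows without a matching customer are skipped
MATCH (c:Customer {code: code})
MERGE (t)-[:IS_TICKET_OF]->(c);
```

Because both lookups use the indexed properties (Ticket.number and Customer.code), this avoids the separate pass that compares every Ticket against every Customer.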
