How to create different relationships in neo4j using py2neo reading csv files?

I would like to read in a csv file where the first two columns have node names, and the third column has the node relationship. Currently I use this in py2neo:
query2 = """
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///data.csv" AS line
MERGE (topic:Topic {name: line.Topic})
MERGE (result:Result {name: line.Result})
CREATE UNIQUE (topic)-[:DISCUSSES]->(result)
"""
How can I use the third column in the csv file to set the relationship, instead of having all relationships set as "DISCUSSES"?
I tried this, but it does not have a UNIQUE option:
query1 = """
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///data.csv" AS line
MERGE (topic:Topic {name: line.Topic})
MERGE (result:Result {name: line.Result})
MERGE (relation:Relation {name: line.Relation})
WITH topic,result,line
CALL apoc.merge.relationship(topic, line.Relation, {}, {}, result) YIELD rel as rel1
RETURN topic,result
"""

Actually, your second query is almost correct (except that it has an extraneous MERGE clause). Here is the corrected query:
USING PERIODIC COMMIT LOAD CSV WITH HEADERS FROM "file:///data.csv" AS line
MERGE (topic:Topic {name: line.Topic})
MERGE (result:Result {name: line.Result})
WITH topic, result, line
CALL apoc.merge.relationship(topic, line.Relation, {}, {}, result) YIELD rel
RETURN topic, result
The apoc.merge.relationship call is equivalent to doing a MERGE that creates the relationship (with a dynamic relationship type) if it does not already exist.
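If APOC is not installed, one possible alternative (not what the answer above uses) is to group the rows by relationship type on the client and build one statement per type, since a plain MERGE has traditionally needed a literal relationship type in the pattern. The sketch below is a minimal illustration under that assumption; the sample rows and the helper names `escape_rel_type` and `build_queries` are made up for the example.

```python
import csv
import io

# Hypothetical sample standing in for data.csv; column names follow the question.
SAMPLE = """Topic,Result,Relation
Graphs,Paths,DISCUSSES
Graphs,Cycles,MENTIONS
Trees,Paths,DISCUSSES
"""

def escape_rel_type(rel_type):
    """Backtick-quote a relationship type so CSV text can be embedded in Cypher."""
    return "`" + rel_type.replace("`", "``") + "`"

def build_queries(csv_text):
    """Return one (query, rows) pair per distinct relationship type in the CSV."""
    by_type = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        rel = (row.get("Relation") or "").strip()
        if not rel:
            continue  # skip rows that carry no relationship type
        by_type.setdefault(rel, []).append(
            {"topic": row["Topic"], "result": row["Result"]}
        )
    queries = []
    for rel, rows in sorted(by_type.items()):
        query = (
            "UNWIND $rows AS row\n"
            "MERGE (topic:Topic {name: row.topic})\n"
            "MERGE (result:Result {name: row.result})\n"
            f"MERGE (topic)-[:{escape_rel_type(rel)}]->(result)"
        )
        queries.append((query, rows))
    return queries

queries = build_queries(SAMPLE)
for query, rows in queries:
    print(query)
    print(rows)
```

Each pair could then be sent to the database with the rows as a parameter, for example `graph.run(query, rows=rows)` on a py2neo `Graph` object named `graph` (an assumed variable, not shown in the question).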

Related

Neo4j creating duplicate relationships

I am very new to Stack Overflow. I am trying to upload a CSV file for further querying, but I noticed that the import creates duplicate relationships. This is the code I have so far. I have read the documentation but can't find a solution for the duplicate relationships. Please help.
:auto USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///senti.csv" AS csvline
CREATE (:User {username: csvline.username})
CREATE (:Tweet {tweet_id: csvline.tweet_id, text: csvline.text, date_time: csvline.date_time})
CREATE (:Location {place: csvline.location})
CREATE (:Candidate {name: csvline.candidate})
CREATE (:Sentiment {sentiment_polarity: csvline.sentiment});
CREATE INDEX FOR (u:User) ON (u.username);
CREATE INDEX FOR (t:Tweet) ON (t.tweet_id);
CREATE INDEX FOR (l:Location) ON (l.place);
:auto USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///senti.csv" AS csvline
MATCH (u:User {username: csvline.username})
MATCH (t:Tweet {tweet_id: csvline.tweet_id})
MERGE (u)-[:POSTS]->(t);
MERGE (u)-[:BASED_ON]-> (l)
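You never matched the location node (l), so l is unbound when the :BASED_ON MERGE runs. Cypher then creates placeholder nodes for l and, with them, the duplicated :BASED_ON relationships. The corrected query also drops the semicolon after the :POSTS MERGE, which would otherwise end the statement there.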
:auto USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///senti.csv" AS csvline
MATCH (u:User {username: csvline.username})
MATCH (t:Tweet {tweet_id: csvline.tweet_id})
MATCH (l:Location {place: csvline.location})
MERGE (u)-[:POSTS]->(t)
MERGE (u)-[:BASED_ON]-> (l)
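A separate source of duplicates may be worth ruling out, as an assumption about the data rather than something this thread confirms: the first statement uses CREATE on every row, so a username or location that appears on several rows becomes several nodes, and a later MATCH on that value finds all of them. The sketch below counts repeated key values in a hypothetical sample of senti.csv; only the column names come from the question.

```python
import csv
import io
from collections import Counter

# Hypothetical rows; only the column names come from the question.
SAMPLE = """username,tweet_id,location,candidate,sentiment
alice,1,Austin,X,positive
alice,2,Austin,Y,negative
bob,3,Boston,X,neutral
"""

def repeated_values(csv_text, column):
    """Return {value: count} for values of `column` found on more than one row."""
    counts = Counter(row[column] for row in csv.DictReader(io.StringIO(csv_text)))
    return {value: n for value, n in counts.items() if n > 1}

for column in ("username", "tweet_id", "location"):
    print(column, repeated_values(SAMPLE, column))
```

Any column that reports repeats would get one node per row under CREATE, whereas a MERGE on that key would produce one node per distinct value.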

Is there a better approach to import a spreadsheet with multiple columns that may not have data for every row into Neo4j? Data set image included

Objective: add each value in the columns as a node. Each column represents a label type. The colored column is a property of the Field column.
The query I used to ingest this into Neo4j Aura is:
LOAD CSV WITH HEADERS FROM "https://docs.google.com/spreadsheets/d/e/2PACX-1vTlOK8lOIzMR1E6YB-KDqMwsCSSrd/pub?output=csv" AS line
MERGE (m:Module {name: line.Module})
WITH m, line
MERGE (m)-[:CONTAINS_SUBMODULE]->(s:SubModule {name: line.SubModule})
WITH s, line
MERGE (s)-[:CONTAINS_MENU]->(m:Menu {name: line.Menu})
WITH m, line
WHERE line.SubMenu IS NOT NULL
MERGE (m)-[:CONTAINS_SUB_MENU]->(sm:SubMenu{name:line.SubMenu})
WITH sm, line
WHERE line.Screen IS NOT NULL
MERGE (sm)-[:LAUNCHES]->(s:Screen{name:line})
WITH s, line
WHERE line.Panel IS NOT NULL
MERGE (s)-[:CONTAINS_PANEL]->(p:Panel{name:line})
WITH p, line
WHERE line.SubScreen IS NOT NULL
MERGE (p)-[:CONTAINS_SUBSCREEN]->(ss:SubScreen{name:line})
WITH ss, line
WHERE line.Field IS NOT NULL
MERGE (ss)-[:CONTAINS_FIELD]->(f:Field{name:line})
WITH f, line
WHERE line.Button IS NOT NULL
MERGE (f)-[:CONTAINS_BUTTON]->(b:Button{name:line})
It worked fine till I attempted to map the SubMenu with the Screen column. It threw the error:
Property values can only be of primitive types or arrays thereof. Encountered: Map{Panel -> String("Search"), Menu -> String("Block Status"), SubModule -> String("Booking"), SubMenu -> String("Status Codes"), Button -> NO_VALUE, Field -> NO_VALUE, SubScreen -> NO_VALUE, Mandatory Field -> NO_VALUE, Screen -> String("Status Codes"), NodeID -> String("115"), Module -> String("Administration")}.
Is there a more efficient way to add this spreadsheet into Neo4j Aura?
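The error itself comes from the later MERGE clauses, which use {name: line} and so pass the whole row map as the property value; each one should name its column, for example {name: line.Screen}. Beyond that, you need to load the data in separate passes, because Neo4j cannot MERGE a node on a null property value, and SubScreen, Panel, Field and Button are empty on some rows. Loading every node type in one statement can also introduce an Eager operator into the query plan (search for "Cypher eager operator"), which hurts on large files. Split the import into separate load commands like the ones below, and execute them one at a time.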
// load modules, submodule, menu and submenu
LOAD CSV WITH HEADERS FROM "file:///Test.csv" AS line
MERGE (m:Module {name: line.Module})
WITH m, line
MERGE (m)-[:CONTAINS_SUBMODULE]->(s:SubModule {name: line.SubModule})
WITH s, line
MERGE (s)-[:CONTAINS_MENU]->(m:Menu {name: line.Menu})
WITH m, line
WHERE line.SubMenu IS NOT NULL
MERGE (m)-[:CONTAINS_SUB_MENU]->(sm:SubMenu{name:line.SubMenu})
WITH sm, line
WHERE line.Screen IS NOT NULL
MERGE (sm)-[:LAUNCHES]->(s:Screen{name:line.Screen})
//Load for Panel
LOAD CSV WITH HEADERS FROM "file:///Test.csv" AS line
WITH line WHERE line.Screen is NOT NULL AND line.Panel is NOT NULL
MERGE (s:Screen {name: line.Screen})
MERGE (s)-[:CONTAINS_PANEL]->(p:Panel{name:line.Panel})
// Load for subscreen
LOAD CSV WITH HEADERS FROM "file:///Test.csv" AS line
WITH line WHERE line.Panel is NOT NULL AND line.SubScreen is NOT NULL
MERGE (p:Panel {name: line.Panel})
MERGE (p)-[:CONTAINS_SUBSCREEN]->(ss:SubScreen{name:line.SubScreen})
// Load for field
LOAD CSV WITH HEADERS FROM "file:///Test.csv" AS line
WITH line WHERE line.SubScreen is NOT NULL and line.Field is NOT NULL
MERGE (ss:SubScreen {name: line.SubScreen})
MERGE (ss)-[:CONTAINS_FIELD]->(f:Field{name:line.Field})
// Load for buttons
LOAD CSV WITH HEADERS FROM "file:///Test.csv" AS line
WITH line WHERE line.Field IS NOT NULL and line.Button is NOT NULL
MERGE (f:Field {name: line.Field})
MERGE (f)-[:CONTAINS_BUTTON]->(b:Button{name:line.Button})
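To see which rows each of the separate passes above would actually touch, the filtering can be previewed outside the database. This is a minimal sketch with made-up sample rows; it treats an empty CSV field the way LOAD CSV does, as a missing (null) value, and the names `PASSES` and `rows_for_pass` are hypothetical.

```python
import csv
import io

# Hypothetical sample; column names follow the spreadsheet in the question.
SAMPLE = """Module,SubModule,Menu,SubMenu,Screen,Panel,SubScreen,Field,Button
Administration,Booking,Block Status,Status Codes,Status Codes,Search,,,
Administration,Booking,Block Status,,,,,,
Administration,Booking,Rates,Rate Codes,Rate Setup,Details,History,Amount,Save
"""

# (parent column, child column) for each separate load pass.
PASSES = [
    ("Screen", "Panel"),
    ("Panel", "SubScreen"),
    ("SubScreen", "Field"),
    ("Field", "Button"),
]

def rows_for_pass(csv_text, parent, child):
    """Return (parent, child) value pairs for rows where both columns are filled."""
    pairs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # LOAD CSV reads an empty field as null; an empty string plays that role here.
        if row[parent] and row[child]:
            pairs.append((row[parent], row[child]))
    return pairs

for parent, child in PASSES:
    print(parent, "->", child, rows_for_pass(SAMPLE, parent, child))
```

A pass that prints an empty list would create nothing, which is a quick way to spot a misspelled column name before running the import.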

Cannot merge the following node because of null property value for 'name':

My code:
LOAD CSV WITH HEADERS FROM "file:///C:/test1.csv" AS line
MERGE (n:SiteA {name: line.SiteA})
MERGE (m:SiteB {name: line.SiteB})
MERGE (n) -[:has_device_function]-> (m);
my error:
Cannot merge the following node because of null property value for 'name': (:SiteA {name: null})
Could you please help me?
Try this:
LOAD CSV WITH HEADERS FROM "file:///C:/test1.csv" AS line
MATCH (n:SiteA {name: line.SiteA}) WHERE n.name IS NOT NULL
MATCH (m:SiteB {name: line.SiteB}) WHERE m.name IS NOT NULL
MERGE (n)-[:has_device_function]->(m);
Two things to notice: filtering out nulls and using MATCH rather than merge in the initial extraction of the nodes for the new relationship.
To avoid this kind of problem, I usually substitute a default value for null items in my CSV (usually DELETE_ME_PLEASE). Afterwards I just match and delete the nodes carrying that value.
First, just to be sure: this means that there are rows in your CSV where the SiteA column is empty.
That said, you can filter those rows out before continuing your query:
LOAD CSV WITH HEADERS FROM "file:///C:/test1.csv" AS line
WITH line WHERE line.SiteA IS NOT NULL AND line.SiteB IS NOT NULL
MERGE (n:SiteA {name: line.SiteA})
MERGE (m:SiteB {name: line.SiteB})
MERGE (n) -[:has_device_function]-> (m);
The CSV needs to use commas only as the separator; with that, it works.
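The comma remark above can be checked before importing. LOAD CSV splits fields on commas unless a FIELDTERMINATOR is given, so a semicolon-separated file yields a single header such as "SiteA;SiteB", and line.SiteA is then null on every row. The sketch below is a hedged illustration with made-up file contents; `missing_columns` is a hypothetical helper.

```python
import csv
import io

def missing_columns(csv_text, expected, delimiter=","):
    """Return the expected column names that the header row does not contain."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader, [])
    return [name for name in expected if name not in header]

# Hypothetical file contents: the same data with two different separators.
COMMA_FILE = "SiteA,SiteB\nrouter1,firewall1\n"
SEMICOLON_FILE = "SiteA;SiteB\nrouter1;firewall1\n"

# Parsed with the default comma, only the comma file exposes both columns.
print(missing_columns(COMMA_FILE, ["SiteA", "SiteB"]))
print(missing_columns(SEMICOLON_FILE, ["SiteA", "SiteB"]))
# Telling the parser about the semicolon recovers the columns.
print(missing_columns(SEMICOLON_FILE, ["SiteA", "SiteB"], delimiter=";"))
```

If the file cannot be re-exported with commas, adding FIELDTERMINATOR ';' to the LOAD CSV clause is the in-Cypher counterpart of the `delimiter` argument used here.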

GETTING Neo.ClientError.Statement.SemanticError: Cannot merge node using null property

Hello, I am trying to create a relationship between two columns in my CSV file.
The CSV file I am loading has a relationship column and looks like this:
RELATIONSHIP,AGENTID,CUSTOMERID,TXNID,TIMESTAMP,AMOUNT,CHANNEL
hasrelation,17956,2025,6C13MXSESN,2019-03-01T11:52:08,10,USSD
hasrelation,17957,2026,6C13MXSEVF,2019-03-01T11:52:09,50,BAPP
So I want to create a relationship between AGENTID and CUSTOMERID. The relationship code is:
load csv with headers from "file:///test.csv" AS row
MERGE (p1:AGENTID {name: row.AGENTID})
MERGE (p2:CUSTOMERID {name: row.CUSTOMERID})
WITH p1, p2, row
CALL apoc.create.relationship(p1, row.relationship, {}, p2) YIELD rel
RETURN rel;
This is for testing purposes, but I am getting the error below:
Neo.ClientError.Statement.SemanticError: Cannot merge node using null property value for name
I have also recently tried this:
LOAD CSV WITH HEADERS FROM "file:///test.csv" AS row
MATCH (f:AGENTID), (s:CUSTOMERID)
WHERE f.Name = row.AGENTID
AND s.Name = row.CUSTOMERID
CALL apoc.create.relationship(f, row.RELATIONSHIP,{}, s) YIELD rel
RETURN rel
I am not getting an error here, but I am not getting any relationships in the result either.
I have a feeling that I have missed something very simple. Kindly help me understand why I am getting this error and how to solve the problem. Thanks.
You are getting the SemanticError for the first query because CUSTOMERID or AGENTID has no value (null) somewhere in your file, so you are effectively trying to MERGE a node on a null value. You need to check for null before the MERGE and skip those rows. See below.
Using multiple MERGE clauses in a single query is not recommended; one MERGE per query is preferable.
I suggest you split your query in two and use your second query to create the relationships.
Load AGENTIDs:
LOAD CSV WITH HEADERS FROM "file:///test.csv" AS row
WHERE row.AGENTID IS NOT NULL
MERGE (p1:AGENTID {name: row.AGENTID});
Load CUSTOMERIDs:
LOAD CSV WITH HEADERS FROM "file:///test.csv" AS row
WHERE row.CUSTOMERID IS NOT NULL
MERGE (p2:CUSTOMERID {name: row.CUSTOMERID})
Create relationships between AGENTIDs and CUSTOMERIDs:
LOAD CSV WITH HEADERS FROM "file:///test.csv" AS row
MATCH (f:AGENTID), (s:CUSTOMERID)
WHERE f.name = row.AGENTID
AND s.name = row.CUSTOMERID
CALL apoc.create.relationship(f, row.RELATIONSHIP,{}, s) YIELD rel
RETURN rel
Thanks, Raj, for your answer. In my case it didn't work until I made some modifications. Here is my new code:
1. Load AGENTIDs:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///test.csv" AS row
WITH row
WHERE row.AGENTID IS NOT NULL
MERGE (p1:AGENTID {name: row.AGENTID});
2. Load CUSTOMERIDs:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///test.csv" AS row
WITH row
WHERE row.CUSTOMERID IS NOT NULL
MERGE (p2:CUSTOMERID {name: row.CUSTOMERID})
3. Create relationships between AGENTIDs and CUSTOMERIDs:
LOAD CSV WITH HEADERS FROM "file:///test.csv" AS row
MATCH (f:AGENTID), (s:CUSTOMERID)
WHERE f.name = row.AGENTID
AND s.name = row.CUSTOMERID
CALL apoc.create.relationship(f, row.RELATIONSHIP,{}, s) YIELD rel
RETURN rel
Actually, @Raj already mentioned this: "WHERE cannot be used with MERGE." That is why each WHERE above follows a WITH row.
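One more thing that may explain the original symptoms, offered as a possibility rather than a confirmed diagnosis: the first query in the question reads `row.relationship` while the header is RELATIONSHIP, and the second compares `f.Name` while the nodes were merged with `name`. Cypher map keys and property keys are case-sensitive, so both lookups would yield null. Python dictionaries behave the same way, which makes for a quick illustration using the two data rows from the question.

```python
import csv
import io

# The header and the two data rows quoted in the question.
SAMPLE = """RELATIONSHIP,AGENTID,CUSTOMERID,TXNID,TIMESTAMP,AMOUNT,CHANNEL
hasrelation,17956,2025,6C13MXSESN,2019-03-01T11:52:08,10,USSD
hasrelation,17957,2026,6C13MXSEVF,2019-03-01T11:52:09,50,BAPP
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
first = rows[0]

# Exact-case key: the value is found.
print(first.get("RELATIONSHIP"))
# Lower-case key: nothing is found, much as row.relationship is null in Cypher.
print(first.get("relationship"))

# A node stored with {name: ...} has no "Name" property either,
# so a comparison against it can never succeed.
node = {"name": first["AGENTID"]}
print(node.get("Name") == first["AGENTID"])
```

Matching the case of the header (`row.RELATIONSHIP`) and of the stored property (`f.name`), as the answer's final query does, avoids both problems.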

Multiple columns of data import per node?

Is it possible to store multiple columns of information on a Name node when importing a CSV? For example, the name is John Doe, with company, position (President of Sales), location (California), etc., all on a single node. If so, any suggestions on how to merge that information into a single Name node in Cypher during upload? Let's say I have columns of information such as Position, State, County, Phone. So far, all I've been able to come up with is the Name and its relation to the Company that he/she works for.
LOAD CSV WITH HEADERS FROM 'FILE:///company_name.csv' AS line
MERGE (C:Company {Company: line.Company })
MERGE (N:Name {Name: line.Name })
MERGE (C)<-[:works_for]-(N);
The best approach would be to use ON CREATE SET and ON MATCH SET after creating/matching the node with MERGE on a unique key value.
LOAD CSV WITH HEADERS FROM 'FILE:///company_name.csv' AS line
MERGE (C:Company {Company: line.Company })
MERGE (N:Name {Name: line.Name })
ON CREATE SET
N.Position = line.Position,
N.Location = line.Location,
N.Country = line.Country,
N.Phone = line.Phone
ON MATCH SET
N.Position = line.Position,
N.Location = line.Location,
N.Country = line.Country,
N.Phone = line.Phone
MERGE (C)<-[:works_for]-(N);
Alternatively, you can put every property into the MERGE pattern itself. But if several rows in your CSV file refer to the same identity and any of the property values differ between those rows, you will end up with multiple nodes for that identity in the database.
LOAD CSV WITH HEADERS FROM 'FILE:///company_name.csv' AS line
MERGE (C:Company {Company: line.Company })
MERGE (N:Name {Name: line.Name, Position: line.Position, Location: line.Location, Country: line.Country, Phone: line.Phone })
MERGE (C)<-[:works_for]-(N);
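The difference between the two approaches above can be modelled without a database. The following sketch is only an analogy, with made-up rows: a dictionary keyed by name stands in for MERGE on a unique key followed by SET, and a list of full property maps stands in for MERGE on every property.

```python
# Hypothetical rows: the same person appears twice with a changed phone number.
ROWS = [
    {"Name": "John Doe", "Position": "President of Sales", "Phone": "555-0100"},
    {"Name": "John Doe", "Position": "President of Sales", "Phone": "555-0199"},
    {"Name": "Jane Roe", "Position": "Engineer", "Phone": "555-0123"},
]

def merge_on_key(rows):
    """Model MERGE on Name followed by SET: one node per name, later rows overwrite."""
    nodes = {}
    for row in rows:
        nodes.setdefault(row["Name"], {}).update(row)
    return list(nodes.values())

def merge_on_all_properties(rows):
    """Model MERGE on the full property map: any differing value yields a new node."""
    nodes = []
    for row in rows:
        if row not in nodes:
            nodes.append(dict(row))
    return nodes

print(len(merge_on_key(ROWS)))
print(len(merge_on_all_properties(ROWS)))
```

With these rows the first model keeps a single John Doe node carrying the later phone number, while the second keeps two John Doe nodes, which is the duplication the answer warns about.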
