Upload csv with multiple relations - neo4j

I'm trying to build relationships between a list of companies (each with an employee hierarchy) that are partnered together in an ecosystem, in addition to their investors. I have 6 columns in my CSV: Company, Investor, Customer (J, labeled as a company but used for the customer relationship), CompanyX (X, labeled as a company but used for the partner-company relationship), Employee (for employees), and EmployeeL (L, for the hierarchy).
LOAD CSV WITH HEADERS FROM 'FILE:///ecosystem.csv' AS line
MERGE (C:Company {Company: line.Company })
MERGE (I:Investor {Investor: line.Investor })
MERGE (J:Customer {Company: line.Company })
MERGE (X:CompanyX {Company: line.Company })
MERGE (N:Employee {Employee: line.Employee })
MERGE (L:EmployeeL {Employee: line.Employee })
MERGE (C)<-[:works_for]-(N)
MERGE (L)<-[:reports_to]-(N)
MERGE (J)<-[:Customer]-(C)
MERGE (X)<-[:Partners]->(C)
MERGE (C)<-[:Investor]-(I);
Am I overcomplicating this? I'm new to Cypher and not sure I'm doing this right; the last time I did a similar upload I had to wipe my database clean. Also, how do I handle a null value for J/I/C, since not all hierarchies are complete? When there is a null value, I am unable to upload the CSV.

If you have only three columns in your csv, I would create a query like this:
LOAD CSV WITH HEADERS FROM 'FILE:///ecosystem.csv' AS line
MERGE (C:Company {name: line.Company })
MERGE (I:Investor {name: line.Investor })
MERGE (N:Employee {name: line.Employee })
MERGE (C)<-[:Investor]-(I)
MERGE (C)<-[:works_for]-(N)
You should avoid bidirectional relationships such as :works_for and :reports_to. Check this article.
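On the null-value part of the question: MERGE refuses a pattern whose property value is null, which is most likely why rows with empty cells fail. One possible workaround, if the import is driven from a script, is to filter the rows per relationship before they reach Cypher. This is a minimal sketch using only the Python standard library; the sample data and the helper name non_null_pairs are made up for illustration.

```python
import csv
import io

# Hypothetical sample standing in for ecosystem.csv; note the empty cells.
SAMPLE = (
    "Company,Investor,Employee\n"
    "Acme,BigFund,Alice\n"
    "Acme,,Bob\n"
    "Globex,SeedCo,\n"
)

def non_null_pairs(csv_text, left, right):
    """Return one dict per row in which both columns are filled in."""
    pairs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        a = (row.get(left) or "").strip()
        b = (row.get(right) or "").strip()
        if a and b:  # skip rows that would hand MERGE a null value
            pairs.append({left: a, right: b})
    return pairs

investor_rows = non_null_pairs(SAMPLE, "Company", "Investor")
employee_rows = non_null_pairs(SAMPLE, "Company", "Employee")
print(investor_rows)
print(employee_rows)
```

Each list could then be passed as a parameter to its own small query, for example UNWIND $rows AS row MERGE (c:Company {name: row.Company}) MERGE (i:Investor {name: row.Investor}) MERGE (c)<-[:Investor]-(i), so that an incomplete row only loses the relationships it cannot describe.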

Related

How to return nodes that have only one given relationship

I have nodes that represent documents and nodes that represent entities. Entities can be referenced in documents; if so, they are linked together with a relationship like this:
(doc)<-[:IS_REFERENCED_IN]-(entity)
The same entity can be referenced in several documents, and a document can reference several entities.
I'd like to delete, for a given document, every entity that is referenced in that document only.
I thought of two different ways to do this.
The first one uses Java with a foreach loop and would basically be something like this:
List<Entity> entities = MATCH (d:Document {id:0})<-[:IS_REFERENCED_IN]-(e:Entity) return e
for (Entity entity : entities){
MATCH (e:Entity)-[r:IS_REFERENCED_IN]->(d:Document) WITH *, count(r) as nb_document_linked WHERE nb_document_linked = 1 DELETE e
}
This method would work, but I'd rather not use a foreach or Java code for it. I'd like to do it in one Cypher query.
The second one uses only one Cypher query but doesn't work. It's something like this:
MATCH (d:Document {id:0})<-[:IS_REFERENCED_IN]-(e:Entity)-[r:IS_REFERENCED_IN]->(d:Document) WITH *, count(r) as nb_document_linked WHERE nb_document_linked = 1 DELETE e
The problem here is that nb_document_linked is not computed per entity; it is a single value for all the entities, which means it counts every relationship of every entity, which I don't want.
So how could I do a kind of foreach in my Cypher query to make it work?
Sorry for my English. I hope the question is clear; if you need any more information, please ask.
You can do something like:
MATCH (d:Document{key:1})<-[:IS_REFERENCED_IN]-(e:Entity)
WITH e
MATCH (d:Document)<-[:IS_REFERENCED_IN]-(e)
WITH COUNT (d) AS countD, e
WHERE countD=1
DETACH DELETE e
Which you can see working on this sample data:
MERGE (a:Document {key: 1})
MERGE (b:Document {key: 2})
MERGE (c:Document {key: 3})
MERGE (d:Entity{key: 4})
MERGE (e:Entity{key: 5})
MERGE (f:Entity{key: 6})
MERGE (g:Entity{key: 7})
MERGE (h:Entity{key: 8})
MERGE (i:Entity{key: 9})
MERGE (j:Entity{key: 10})
MERGE (k:Entity{key: 11})
MERGE (l:Entity{key: 12})
MERGE (m:Entity{key: 13})
MERGE (d)-[:IS_REFERENCED_IN]-(a)
MERGE (e)-[:IS_REFERENCED_IN]-(a)
MERGE (f)-[:IS_REFERENCED_IN]-(a)
MERGE (g)-[:IS_REFERENCED_IN]-(a)
MERGE (d)-[:IS_REFERENCED_IN]-(b)
MERGE (e)-[:IS_REFERENCED_IN]-(b)
MERGE (f)-[:IS_REFERENCED_IN]-(c)
MERGE (g)-[:IS_REFERENCED_IN]-(c)
MERGE (j)-[:IS_REFERENCED_IN]-(a)
MERGE (h)-[:IS_REFERENCED_IN]-(a)
MERGE (i)-[:IS_REFERENCED_IN]-(a)
MERGE (g)-[:IS_REFERENCED_IN]-(c)
MERGE (k)-[:IS_REFERENCED_IN]-(c)
MERGE (l)-[:IS_REFERENCED_IN]-(c)
MERGE (m)-[:IS_REFERENCED_IN]-(c)
On which it removes 3 Entities.
The first MATCH finds the entities that are attached to your input doc, and the second MATCH finds the number of documents that each of these entities is connected to.
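The point that makes this work is that COUNT(d) is grouped by e, since e is the only non-aggregated value in that WITH, so every entity gets its own count. The same grouping can be checked outside the database. This is only a plain-Python sketch over the sample data above (the pairs are copied from the MERGE statements), not something Neo4j runs; the function name entities_only_in is made up.

```python
from collections import defaultdict

# (entity key, document key) pairs copied from the sample MERGE statements.
EDGES = [
    (4, 1), (5, 1), (6, 1), (7, 1),
    (4, 2), (5, 2),
    (6, 3), (7, 3),
    (10, 1), (8, 1), (9, 1),
    (7, 3),  # repeated in the sample; MERGE would not create it twice
    (11, 3), (12, 3), (13, 3),
]

def entities_only_in(doc_key, edges):
    """Return entities whose only referencing document is doc_key."""
    docs_per_entity = defaultdict(set)
    for entity, doc in edges:
        docs_per_entity[entity].add(doc)
    # First MATCH: entities attached to the input document.
    candidates = [e for e, docs in docs_per_entity.items() if doc_key in docs]
    # Second MATCH + WITH: one count per entity, keep those with exactly one.
    return sorted(e for e in candidates if len(docs_per_entity[e]) == 1)

print(entities_only_in(1, EDGES))  # [8, 9, 10]
```

The three keys it prints are the three entities the query removes for the document with key 1.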

How to prevent neo4j MERGE from creating duplicate relationships?

I am attempting to create nodes and relationships if they do not exist. I do not know ahead of time if anything in the DB exists.
This is my initial query:
MERGE (t:type { name: 'aaa'})
MERGE (m:model { name: 'bbb'})
MERGE (r:region {name: 'ccc'})
MERGE (p:param {name: 'ddd'})
MERGE (i:init {value: 123})
MERGE (u:forecast {url: 'http://something.png'})
MERGE (t)-[:HAS]-(m)-[:HAS]-(r)-[:HAS]-(p)-[:HAS]-(i)-[:HAS]-(u)
This correctly produces a graph like this:
Then I run this query again, but this time I change the name of the "model" object to "bbc" (instead of "bbb"):
MERGE (t:type { name: 'aaa'})
MERGE (m:model { name: 'bbc'})
MERGE (r:region {name: 'ccc'})
MERGE (p:param {name: 'ddd'})
MERGE (i:init {value: 123})
MERGE (u:forecast {url: 'http://something.png'})
MERGE (t)-[:HAS]-(m)-[:HAS]-(r)-[:HAS]-(p)-[:HAS]-(i)-[:HAS]-(u)
Now, however, my graph looks like this:
Everything looks correct except for the three duplicated relationships.
I realize that MERGE will create the whole path if it does not exist. There must be some way to avoid creating duplicate relationships, though.
I would appreciate being pointed in the right direction!
The MERGE statement checks whether the pattern as a whole already exists. So if even one node differs, the whole pattern is treated as non-existent and every relationship in it is created again.
The solution is to split this MERGE statement into multiple, i.e. one MERGE for each relationship:
MERGE (t)-[:HAS]-(m)-[:HAS]-(r)-[:HAS]-(p)-[:HAS]-(i)-[:HAS]-(u)
becomes
MERGE (t)-[:HAS]-(m)
MERGE (m)-[:HAS]-(r)
MERGE (r)-[:HAS]-(p)
MERGE (p)-[:HAS]-(i)
MERGE (i)-[:HAS]-(u)
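To see why splitting helps, here is a toy model of the two behaviours in plain Python. It is a deliberately simplified sketch, not how Neo4j is implemented: relationships are just pairs in a list, and the function names are made up. It reproduces the three duplicates from the question.

```python
def exists(rels, a, b):
    """Undirected lookup, like the -[:HAS]- patterns in the question."""
    return (a, b) in rels or (b, a) in rels

def merge_path(rels, nodes):
    """Toy whole-pattern MERGE: if any hop is missing, create every hop."""
    hops = list(zip(nodes, nodes[1:]))
    if not all(exists(rels, a, b) for a, b in hops):
        rels.extend(hops)

def merge_each(rels, nodes):
    """Toy per-relationship MERGE: create only the hops that are missing."""
    for a, b in zip(nodes, nodes[1:]):
        if not exists(rels, a, b):
            rels.append((a, b))

first = ["aaa", "bbb", "ccc", "ddd", "123", "url"]
second = ["aaa", "bbc", "ccc", "ddd", "123", "url"]  # only the model differs

whole, split = [], []
for path in (first, second):
    merge_path(whole, path)
    merge_each(split, path)

duplicates = len(whole) - len(set(whole))
print(len(whole), duplicates)  # 10 3
print(len(split))              # 7
```

With one MERGE per relationship, the second run only adds the two relationships that touch the new model node.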

How to create different relationships in neo4j using py2neo reading csv files?

I would like to read in a csv file where the first two columns have node names, and the third column has the node relationship. Currently I use this in py2neo:
query2 = """
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///data.csv" AS line
MERGE (topic:Topic {name: line.Topic})
MERGE (result:Result {name: line.Result})
CREATE UNIQUE (topic)-[:DISCUSSES]->(result)
"""
How can I use the third column in the csv file to set the relationship, instead of having all relationships set as "DISCUSSES"?
I tried this, but it does not have a UNIQUE option:
query1 = """
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///data.csv" AS line
MERGE (topic:Topic {name: line.Topic})
MERGE (result:Result {name: line.Result})
MERGE (relation:Relation {name: line.Relation})
WITH topic,result,line
CALL apoc.merge.relationship(topic, line.Relation, {}, {}, result) YIELD rel as rel1
RETURN topic,result
"""
Actually, your second query is almost correct (except that it has an extraneous MERGE clause). Here is the corrected query:
USING PERIODIC COMMIT LOAD CSV WITH HEADERS FROM "file:///data.csv" AS line
MERGE (topic:Topic {name: line.Topic})
MERGE (result:Result {name: line.Result})
WITH topic, result, line
CALL apoc.merge.relationship(topic, line.Relation, {}, {}, result) YIELD rel
RETURN topic, result
The apoc.merge.relationship call is equivalent to doing a MERGE to create the relationship (with a dynamic type) if it does not already exist.
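If APOC is not available, one possible alternative is to group the rows by relation on the Python side and send one query per relationship type. Cypher does not accept a parameter in the relationship-type position, so the type has to be written into the query text, which is why this sketch validates it first. The helper name queries_by_relation, the template and the sample rows are assumptions for illustration only.

```python
import re
from collections import defaultdict

# Only plain identifiers are written into the query text.
SAFE_TYPE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

QUERY_TEMPLATE = (
    "UNWIND $rows AS row\n"
    "MERGE (topic:Topic {name: row.Topic})\n"
    "MERGE (result:Result {name: row.Result})\n"
    "MERGE (topic)-[:`REL_TYPE`]->(result)"
)

def queries_by_relation(rows):
    """Map each relation type to (query text, rows for that type)."""
    grouped = defaultdict(list)
    for row in rows:
        rel = row["Relation"]
        if not SAFE_TYPE.fullmatch(rel):
            raise ValueError(f"unsafe relationship type: {rel!r}")
        grouped[rel].append({"Topic": row["Topic"], "Result": row["Result"]})
    return {
        rel: (QUERY_TEMPLATE.replace("REL_TYPE", rel), batch)
        for rel, batch in grouped.items()
    }

sample_rows = [  # hypothetical contents of data.csv
    {"Topic": "A", "Result": "B", "Relation": "DISCUSSES"},
    {"Topic": "A", "Result": "C", "Relation": "CITES"},
    {"Topic": "D", "Result": "B", "Relation": "DISCUSSES"},
]
plan = queries_by_relation(sample_rows)
for rel, (query, batch) in plan.items():
    print(rel, len(batch))
```

Each (query, rows) pair could then be sent to the database with the rows as a parameter, for example graph.run(query, rows=batch) in py2neo.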

Multiple columns of data import per node?

Is it possible to have multiple columns of information on a Name node when importing a CSV? For example, Name is John Doe, Company, Position is President of Sales, Located in California, etc., all on a single node. If so, any suggestions on how to merge that information into a single Name node in Cypher during upload? Let's say I have columns of information such as Position, State, County, Phone. So far all I've been able to come up with is the Name and the relation to the Company that he/she works for.
LOAD CSV WITH HEADERS FROM 'FILE:///company_name.csv' AS line
MERGE (C:Company {Company: line.Company })
MERGE (N:Name {Name: line.Name })
MERGE (C)<-[:works_for]-(N);
The best approach would be to use ON CREATE SET and ON MATCH SET after creating/matching the node with MERGE on a unique key value.
LOAD CSV WITH HEADERS FROM 'FILE:///company_name.csv' AS line
MERGE (C:Company {Company: line.Company })
MERGE (N:Name {Name: line.Name })
ON CREATE SET
N.Position = line.Position,
N.Location = line.Location,
N.Country = line.Country,
N.Phone = line.Phone
ON MATCH SET
N.Position = line.Position,
N.Location = line.Location,
N.Country = line.Country,
N.Phone = line.Phone
MERGE (C)<-[:works_for]-(N);
Alternatively, you can put every property in the MERGE pattern itself. Be aware, though, that if multiple rows in your CSV file refer to the same person and some of those values differ between rows, you will end up with multiple nodes in the database.
LOAD CSV WITH HEADERS FROM 'FILE:///company_name.csv' AS line
MERGE (C:Company {Company: line.Company })
MERGE (N:Name {Name: line.Name, Position: line.Position, Location: line.Location, Country: line.Country, Phone: line.Phone })
MERGE (C)<-[:works_for]-(N);
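The difference between the two approaches can be shown with a toy model in plain Python. This is a simplified sketch of the matching rule only, not Neo4j itself; merge_node and the two sample rows are made up for illustration.

```python
def merge_node(nodes, match_props, set_props=None):
    """Toy MERGE: reuse a node matching all match_props, else create one."""
    for node in nodes:
        if all(node.get(k) == v for k, v in match_props.items()):
            break
    else:
        node = dict(match_props)
        nodes.append(node)
    node.update(set_props or {})  # stands in for ON CREATE SET / ON MATCH SET
    return node

# Two hypothetical CSV rows for the same person with different positions.
rows = [
    {"Name": "John Doe", "Position": "President of Sales"},
    {"Name": "John Doe", "Position": "VP of Sales"},
]

by_key, by_all = [], []
for row in rows:
    # Merge on the unique key only, then set the remaining properties.
    merge_node(by_key, {"Name": row["Name"]}, {"Position": row["Position"]})
    # Merge on every property at once.
    merge_node(by_all, row)

print(len(by_key), by_key[0]["Position"])  # 1 VP of Sales
print(len(by_all))                         # 2
```

Merging on the key alone keeps a single node whose properties reflect the last row, while merging on every property creates a second node as soon as one value differs.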

Multiple relations like join tables (neo4j)

How do I express the following in neo4j?
match or create user bob; bob works at studio; while at studio, he's allowed to doodle; while at studio, he's also allowed to type.
Here's what I have:
MERGE (u:user {name:'bob'})
MERGE (c:company {name: 'studio'})
MERGE (u)-[:works_at]->(c)-[:allowed_to]->(p:permission {name:'doodle'})
MERGE (u)-[:works_at]->(c)-[:allowed_to]->(p:permission {name:'type'})
This doesn't work as permission becomes a relation of company.
Also, is it possible to chain relations such that:
MERGE work=(u)-[:works_at]->(c)
CREATE (work)-[:allowed_to]->(p:permission {name:'doodle'})
CREATE (work)-[:allowed_to]->(p:permission {name:'type'})
where you assign a relation to a variable to continue it later on in another query?
How about modelling it so the company grants the permission? Something like this...
MERGE (u:user {name:'bob'})
MERGE (c:company {name: 'studio'})
MERGE (u)-[:works_at]->(c)
MERGE (u)-[:allowed_to]->(p1:permission {name:'doodle'})<-[:GRANTS]-(c)
MERGE (u)-[:allowed_to]->(p2:permission {name:'type'})<-[:GRANTS]-(c)
RETURN *
You can't really refer to objects via identifiers/variables you have created previously in other queries. You would have to re-match or merge those previously created objects in your new query.
Part 2 could be modelled something like this:
MERGE (u:user {name:'bob'})
MERGE (c:company {name: 'studio'})
MERGE (u)-[:DOES]->(work:Work {start_date: timestamp()} )-[:AT]->(c)
CREATE (work)-[:allowed_to]->(p1:permission {name:'doodle'})
CREATE (work)-[:allowed_to]->(p2:permission {name:'type'})
Alternatively, if you never need to look up all users with a certain permission at a company, you could keep the list of permissions as a property on the relationship.
MERGE (u:user {name:'bob'})
MERGE (c:company {name: 'studio'})
MERGE (u)-[r:works_at]->(c)
SET r.permissions = ['doodle', 'type']
