How can I create 1 million relationships with a Neo4j query?

With this query I import 75,000 Category nodes from my CSV file:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS
FROM "file:///prodcategory.csv" AS row
CREATE (:Category {id: row.idProdCategory, name: row.name, idRestaurant: row.idRestaurant});
And with this query I import 1 million Product nodes from another CSV file:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS
FROM "file:///products.csv" AS row
CREATE (:Product {id: row.idProduct, idProductCategory: row.idProductCategory,name: row.name,idRestaurant:row.idRestaurant ,description: row.description, price: row.price, shipping_price: row.shippingPrice});
I use this query to create the relationships between categories and products, matching Category.id against Product.idProductCategory:
MATCH (category:Category {id: category.id})
MATCH (Product:Product {idProductCategory: Product.idProductCategory})
WHERE Product.idProductCategory=category.id
MERGE (category)-[:OF_CATEGORY]->(Product);
This query creates only 2999 relationships, not the roughly 1 million I expect. Is there a method or configuration that would let me create all of them? I would be very grateful for any help.

Ensure you have indexes on Product.idProductCategory.
I assume that the category id is unique across categories.
CREATE CONSTRAINT ON (category:Category) ASSERT category.id IS UNIQUE;
I assume that there are multiple products with the same category ID.
CREATE INDEX ON :Product(idProductCategory);
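On newer Neo4j releases the same constraint and index are declared with a different syntax. The following is a sketch that assumes Neo4j 4.4 or later; the names category_id_unique and product_category_id are hypothetical and only serve as labels for the schema objects.

```cypher
// Assumption: Neo4j 4.4+; constraint and index names are made up for illustration.
CREATE CONSTRAINT category_id_unique IF NOT EXISTS
FOR (category:Category) REQUIRE category.id IS UNIQUE;

CREATE INDEX product_category_id IF NOT EXISTS
FOR (product:Product) ON (product.idProductCategory);
```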
Then you can simply match each category and then for each category find the appropriate products and create the relationships.
// match all of your categories
MATCH (category:Category)
// then with each category find all the products
WITH category
MATCH (Product:Product {idProductCategory: category.id })
// and then create the relationships
MERGE (category)-[:OF_CATEGORY]->(Product);
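If you run into memory constraints, you could wrap the query in APOC's apoc.periodic.commit. The wrapped statement has to work on a limited batch and return a count, because the procedure repeats it until it returns 0:
// only pairs that are not linked yet are picked up, so every batch makes progress
CALL apoc.periodic.commit("
MATCH (category:Category)
WITH category
MATCH (Product:Product {idProductCategory: category.id })
WHERE NOT (category)-[:OF_CATEGORY]->(Product)
WITH category, Product LIMIT $limit
MERGE (category)-[:OF_CATEGORY]->(Product)
RETURN count(*)
",{limit:10000})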

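For the batching idea in the answer above, apoc.periodic.iterate is another option: the first statement streams the categories and the second one runs for each of them, committed in batches. This is only a sketch; it assumes the APOC plugin is installed in a recent version that passes the returned category variable into the second statement, and the batch size of 1000 is an arbitrary choice.

```cypher
// Sketch, assuming APOC is available; batchSize is an arbitrary example value.
CALL apoc.periodic.iterate(
  "MATCH (category:Category) RETURN category",
  "MATCH (product:Product {idProductCategory: category.id})
   MERGE (category)-[:OF_CATEGORY]->(product)",
  {batchSize: 1000, parallel: false}
);
```

Keeping parallel set to false avoids lock contention when several batches touch the same nodes.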
Try changing your query to the following. Your version uses too many filters; see the docs for MATCH.
MATCH (category:Category),(Product:Product)
WHERE Product.idProductCategory=category.id
MERGE (category)-[:OF_CATEGORY]->(Product)
You can also change your second import query so that no separate linking query is needed:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS
FROM "file:///products.csv" AS row
CREATE (p:Product {id: row.idProduct, name: row.name, idRestaurant: row.idRestaurant, description: row.description, price: row.price, shipping_price: row.shippingPrice})
// a WITH is required between the CREATE and the following MATCH
WITH p, row
MATCH (c:Category {id: row.idProductCategory})
MERGE (p)-[:OF_CATEGORY]->(c)

Related

My match/merge process is not creating relationships in the Neo4J database

I am very new to Neo4j/cypher/graph databases, and have been trying to follow the Neo4j tutorial to import data I have in a csv and create relationships.
The following code does what I want in terms of reading in the data, creating nodes, and setting properties.
/* Importing data on seller-buyer relationships */
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file:///customer_rel_table.tsv' AS row
FIELDTERMINATOR '\t'
MERGE (seller:Seller {sellerID: row.seller})
ON CREATE SET seller += {name: row.seller_name,
root_eid: row.vendor_eid,
city: row.city}
MERGE (buyer:Buyer {buyerID: row.buyer})
ON CREATE SET buyer += {name: row.buyer_name};
/* Creating indices for the properties I might want to match on */
CREATE INDEX seller_id FOR (s:Seller) on (s.seller_name);
CREATE INDEX buyer_id FOR (b:Buyer) on (b.buyer_name);
/* Creating constraints to guarantee buyer-seller pairs are not duplicated */
CREATE CONSTRAINT sellerID ON (s:Seller) ASSERT s.sellerID IS UNIQUE;
CREATE CONSTRAINT buyerID on (b:Buyer) ASSERT b.buyerID IS UNIQUE;
Now I have the nodes (sellers and buyers) that I want, and I would like to link buyers and sellers. The code I have tried for this is:
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file:///customer_rel_table.tsv' AS row
MATCH (s:Seller {sellerID: row.seller})
MATCH (b:Buyer {buyerID: row.buyer})
MERGE (s)-[st:SOLD_TO]->(b)
The query runs, but I don't get any relationships:
Query executed in 294ms. Query type: WRITE_ONLY.
No results.
Since I'm not asking it to RETURN anything, I think the "No results" comment is correct, but when I look at metadata for the DB, no relationships appear. Also, my data has ~220K rows, so 294ms seems fast.
EDIT: At @cybersam's prompting, I tried this query:
MATCH p=(:Seller)-[:SOLD_TO]->(:Buyer) RETURN p
which gives No results.
For clarity, there are two fields in my data that are the heart of the relationship:
seller and buyer, where the seller sells stuff to the buyer. The seller identifiers are repeated, but for each seller there are unique seller-buyer pairs.
What do I need to fix in my code to get relationships between the sellers and buyers? Thank you!
Your second query's LOAD CSV clause does not specify FIELDTERMINATOR '\t'. The default terminator is a comma (','). That is probably why it fails to MATCH anything.
Try adding FIELDTERMINATOR '\t' at the end of that clause.
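Applied to the linking query from the question, the fix would look roughly like this. Everything here is taken from the question; only the FIELDTERMINATOR line is added.

```cypher
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file:///customer_rel_table.tsv' AS row
FIELDTERMINATOR '\t'
// with the tab terminator, row.seller and row.buyer are populated again
MATCH (s:Seller {sellerID: row.seller})
MATCH (b:Buyer {buyerID: row.buyer})
MERGE (s)-[st:SOLD_TO]->(b);
```

Without the terminator, each line is read as a single comma-separated field, so row.seller and row.buyer are null and neither MATCH finds a node.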

How to create a node, create a relationship and remove an attribute when loading from CSV?

I have written 3 queries to create the :Comment nodes and their properties, as well as their relationships with the :User and :Post nodes:
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM "http://neuromancer.inf.um.es:8080/es.stackoverflow/Comments.csv" AS row
CREATE(n)
SET n=row
WITH n AS node
CREATE (c:Comment {Id: node.Id, CreationDate: node.CreationDate,
Score: node.Score, Text: node.Text,
UserId: node.UserId, PostId: node.PostId});
MATCH (c:Comment), (u:User)
WHERE toInt(c.UserId) = toInt(u.Id)
CREATE (u)-[:AUTHOR_OF]->(c)
REMOVE c.UserId;
MATCH (c:Comment), (p:Post)
WHERE toInt(c.PostId) = toInt(p.Id)
CREATE (c)-[:INCLUDED_IN]->(p)
REMOVE c.PostId;
The first query creates the nodes, while the second and third create the relationships between :Comment and :User and between :Comment and :Post. However, I want a single query that does all of this, to make it more efficient. Is this possible, and how can I do it? I could not find a way.
First, there's no need to create a temporary node n or temporary id properties.
Second, it might be more efficient, depending on the size of your file, to do two passes over the file, one to create comments, and the second to match nodes and create relationships. Best to try/profile and see.
Here's what it would look like with just one pass:
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM "http://neuromancer.inf.um.es:8080/es.stackoverflow/Comments.csv" AS row
CREATE (c:Comment {Id: row.Id, CreationDate: row.CreationDate,Score: row.Score, Text: row.Text})
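// carry row along: WITH drops every variable that is not listed
WITH c, row
MATCH (u:User) WHERE u.Id = toInt(row.UserId)
CREATE (u)-[:AUTHOR_OF]->(c)
WITH c, row
// PostId is not stored on the comment, so read it from the row
MATCH (p:Post) WHERE p.Id = toInt(row.PostId)
CREATE (c)-[:INCLUDED_IN]->(p)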
With two passes:
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM "http://neuromancer.inf.um.es:8080/es.stackoverflow/Comments.csv" AS row
CREATE (c:Comment {Id: row.Id, CreationDate: row.CreationDate, Score: row.Score, Text: row.Text});
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM "http://neuromancer.inf.um.es:8080/es.stackoverflow/Comments.csv" AS row
MATCH (c:Comment {Id: row.Id})
MATCH (u:User) WHERE u.Id = toInt(row.UserId)
// PostId is not stored on the comment, so read it from the row
MATCH (p:Post) WHERE p.Id = toInt(row.PostId)
CREATE (u)-[:AUTHOR_OF]->(c)
CREATE (c)-[:INCLUDED_IN]->(p);
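The two-pass version looks up every comment, user and post by Id, so indexes on those properties are likely to matter for a large file. A minimal sketch, assuming the older CREATE INDEX ON syntax that matches the toInt-era Cypher used here (newer releases use CREATE INDEX ... FOR ... ON instead):

```cypher
// Assumption: Neo4j 3.x/4.x style index syntax; run these before the second pass.
CREATE INDEX ON :Comment(Id);
CREATE INDEX ON :User(Id);
CREATE INDEX ON :Post(Id);
```

Whether the User and Post indexes are actually used also depends on their Id values being stored as integers, which is what the toInt comparisons above assume.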

Load csv in neo4j with nodes and relationships in one csv file

Apologies as I am new to neo4j and struggling with what I imagine is a very simple example.
I would like to model an org chart which I have stored as a csv like so
id,name,manager_id
1,allan,2
2,bob,4
3,john,2
4,sam,
5,Jim,2
Note that Bob has 3 direct reports and Bob reports into Sam who doesn't report into anyone.
I would like to produce a graph which shows the management chain. I have tried the following, but it produces relationships which are disjoint from the people:
LOAD CSV WITH HEADERS FROM "file///employees.csv" AS csvLine
CREATE (p:Person {id: csvLine.id, name: csvLine.name})
CREATE (p)-[:MANAGED_BY {manager: csvLine.manager_id}]->(p)
This query creates a bunch of self-referencing relationships. Is there any way to populate the graph with one command over the single CSV? I must be missing something, and any help is appreciated. Thanks.
I think this is what you are looking for.
In your query you are creating a relationship between p and p, hence the self-referencing relationships.
I added a coalesce call to handle people who do not have a manager_id value. This way Sam reports to himself.
LOAD CSV WITH HEADERS FROM "file:///employees.csv" AS csvLine
// create or match the person in the left column
MERGE (p:Person {id: csvLine.id })
// always set the name: the node may already exist without one if it was first created as someone's manager
SET p.name = csvLine.name
// create or match the person/manager in the right column
MERGE (p1:Person {id: coalesce(csvLine.manager_id, csvLine.id) })
// create the reporting relationship
CREATE (p)-[:MANAGED_BY]->(p1)
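If Sam should have no MANAGED_BY relationship at all, rather than one pointing to himself, the rows without a manager can be filtered out before the relationship is created. This is a sketch under the same assumption the coalesce relies on, namely that an empty manager_id field is read as null; the variable name m is introduced here for illustration.

```cypher
LOAD CSV WITH HEADERS FROM "file:///employees.csv" AS csvLine
// create or match the person and always set the name
MERGE (p:Person {id: csvLine.id})
SET p.name = csvLine.name
// stop here for rows that have no manager
WITH p, csvLine
WHERE csvLine.manager_id IS NOT NULL
// create or match the manager and link the two
MERGE (m:Person {id: csvLine.manager_id})
MERGE (p)-[:MANAGED_BY]->(m);
```

Using MERGE for the relationship keeps the import repeatable: running it a second time does not duplicate the reporting lines.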

Create multiple relationships of same type with different properties between two nodes from csv

I am having trouble creating multiple relationships of the same type, with different properties, between two nodes in Neo4j Desktop.
Nodes dataset:
File Name: 1.csv
File Contents:
Id,Desc
A,Alpha
B,Beta
C,Charlie
D,Doyce
Relationships Dataset:
File Name: 2.csv
File Contents:
SeqNo,Date,Count,Weight,From,To
0,2018-04-01,12,308,A,B
1,2018-04-01,3,475,B,C
2,2018-04-01,23,308,C,D
3,2018-04-01,32,524,D,A
4,2018-04-01,0,308,A,C
5,2018-04-01,23,237,B,D
6,2018-04-01,54,308,B,A
7,2018-04-01,23,237,D,B
8,2018-04-01,18,308,D,C
9,2018-04-01,23,308,C,A
10,2018-04-01,78,475,B,C
11,2018-04-01,67,308,A,B
12,2018-04-01,56,237,D,B
13,2018-04-01,34,308,A,C
14,2018-04-01,27,524,A,D
15,2018-04-01,84,237,D,B
// Create Nodes
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:/1.csv" AS row
CREATE (:Node {Id: row.Id, Desc: row.Desc});
// Create Relationships
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:/2.csv" AS row
MERGE (from:Node {Id: row.From})
MERGE (to:Node {Id: row.To})
MERGE (from)-[rel:RELATED_AS]->(to)
ON CREATE SET rel.SeqNo = toInt(row.SeqNo),
rel.Date = row.flightDate,
rel.Count = toInteger(row.Count),
rel.Weight = toFloat(row.Weight)
This syntax works but creates only 11 relationships, with incoming and outgoing relationships between two nodes.
It ignores the additional relationships between A-B, B-C, A-C and D-B (D-B has 2 additional relationships).
How can I create the graph with all 16 relationships?
Thanks in advance.
Mel.
Your second query MERGEs the relationship (from)-[rel:RELATED_AS]->(to), so Cypher matches that pattern if it already exists. Later rows for the same pair of nodes therefore match the existing relationship, and their values are never written because they are set under ON CREATE.
Since you want to create a relationship for every row, you could replace your statement with something like the following:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:/2.csv" AS row
MERGE (from:Node {Id: row.From})
MERGE (to:Node {Id: row.To})
// the CSV header is Date (there is no flightDate column)
CREATE (from)-[rel:RELATED_AS {SeqNo: toInteger(row.SeqNo), Date: row.Date, Count: toInteger(row.Count), Weight: toFloat(row.Weight)}]->(to)
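A plain CREATE adds another set of relationships every time the file is loaded again. If the import needs to be repeatable, one option is to MERGE on a property that identifies the row. The sketch below assumes SeqNo is unique per row, as it is in the sample data, and reads the date from the Date column of the CSV header.

```cypher
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:/2.csv" AS row
MERGE (from:Node {Id: row.From})
MERGE (to:Node {Id: row.To})
// SeqNo is part of the MERGE pattern, so each row gets its own relationship
MERGE (from)-[rel:RELATED_AS {SeqNo: toInteger(row.SeqNo)}]->(to)
ON CREATE SET rel.Date = row.Date,
              rel.Count = toInteger(row.Count),
              rel.Weight = toFloat(row.Weight);
```

Because the relationship is matched on SeqNo as well as on its end nodes, two rows for the same pair no longer collapse into one relationship.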

Aggregate Cypher query

This is my database in Neo4j:
CREATE (Alex:Person {name:'Alex', phone:'0420965111'})
CREATE (Oxana:Person {name:'Oxana', email:'oxana@mail.com'})
CREATE (Tango:Dance {name:'Tango'})
CREATE (Ballet:Dance {name:'Ballet'})
CREATE (Zouk:Dance {name:'Zouk'})
CREATE (Saturday:Day {name:'Saturday'})
CREATE (Sunday:Day {name:'Sunday'})
CREATE (Wednesday:Day {name:'Wednesday'})
MERGE (Alex)-[:LIKES]->(Tango)
MERGE (Alex)-[:LIKES]->(Zouk)
MERGE (Oxana)-[:LIKES]->(Tango)
MERGE (Oxana)-[:LIKES]->(Ballet)
MERGE (Alex)-[:AVAILABLE_ON]->(Sunday)
MERGE (Alex)-[:AVAILABLE_ON]->(Wednesday)
MERGE (Oxana)-[:AVAILABLE_ON]->(Sunday)
MERGE (Oxana)-[:AVAILABLE_ON]->(Saturday)
I need every group of two or more people who like the same dance and are available on the same day. How can I write a query that returns this?
"Sunday", "Tango", ["Alex","Oxana"]
This almost works:
match (p:Person), (d:Dance), (day:Day) where (p)-[:LIKES]->(d) and (p)-[:AVAILABLE_ON]->(day) return day.name, d.name, collect(p.name), count(*)
But I don't know how to exclude records where count(*) is less than 2.
You can use WITH:
match (p:Person), (d:Dance), (day:Day)
where (p)-[:LIKES]->(d) and (p)-[:AVAILABLE_ON]->(day)
with day.name as day, d.name as dance, collect(p.name) as names, count(*) as count
where count >= 2
return day, dance, names
From the docs:
The WITH clause allows query parts to be chained together, piping the
results from one to be used as starting points or criteria in the
next.
You can also attach a WHERE clause to WITH to filter the piped results, which is what the query above does.
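The same filter can be written against a single path pattern, which avoids first building every combination of person, dance and day. This is a sketch of that variant; it checks the size of the collected list instead of a separate count.

```cypher
// one pattern: a person who is available on a day and likes a dance
MATCH (day:Day)<-[:AVAILABLE_ON]-(p:Person)-[:LIKES]->(d:Dance)
WITH day.name AS day, d.name AS dance, collect(p.name) AS names
// keep only day/dance pairs shared by at least two people
WHERE size(names) >= 2
RETURN day, dance, names;
```

The order of the names inside each list is not guaranteed unless the rows are sorted before the collect.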
