create dynamic relationships between nodes in neo4j from csv file - neo4j

Hi Have a CSV file and I want to create nodes and relationships simultaneously
Im using below query to create nodes
using PERIODIC COMMIT 1000
load csv from "file:///home/gaurav/sharing/dataframe6.txt" as line fieldterminator" "
MERGE (A :concept{name:line[0]})
WITH line, A
MERGE (B :concept{name:line[1]})
WITH line, A, B
create (A)-[:line[3]]->(B); // This is trouble part
but when I try to create relationship between imported nodes then I get error
Invalid input '[': expected an identifier character, whitespace, '|', a length specification, a property map or ']' (line 7, column 18 (offset: 218))
"create (A)-[:line[3]]->(B);"

If you truly want to create relationship in a dynamic fashion you need to use an APOC procedure, specifically apoc.create.relationship.
using PERIODIC COMMIT 1000
load csv from "file:///home/gaurav/sharing/dataframe6.txt" as line fieldterminator" "
MERGE (A :concept{name:line[0]})
WITH line, A
MERGE (B :concept{name:line[1]})
WITH line, A, B
CALL apoc.create.relationship(A, line[3], {}, B) YIELD rel
RETURN A,B,rel

A relationship can not contain square brackets as its type name. You're trying to create a "line[3]" relationship between the nodes A and B.

Related

How to match line of csv which is ignored by constraint and create only relationship

I have been created a graph having a constraint on primary id. In my csv a primary id is duplicate but the other proprieties are different. Based on the other properties I want to create relationships.
I tried multiple times to change the code but it does not do what I need.
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM 'file:///Trial.csv' AS line FIELDTERMINATOR '\t'
MATCH (n:Trial {id: line.primary_id})
with line.cui= cui
MATCH (m:Intervention)
where m.id = cui
MERGE (n)-[:HAS_INTERVENTION]->(m);
I already have the nodes Intervention in the graph as well as the trials. So what I am trying to do is to match a trial with the id from intervention and create only the relationship. Instead is creating me also the nodes.
This is a sample of my data, so the same primary id, having different cuis and I am trying to match on cui:
You can refer the following query which finds Trial and Intervention nodes by primary_id and cui respectively and creates the relationship between them.
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM 'file:///Trial.csv' AS line FIELDTERMINATOR '\t'
MATCH (n:Trial {id: line.primary_id}), (m:Intervention {id: line.cui})
MERGE (n)-[:HAS_INTERVENTION]->(m);
The behavior you observed is caused by 2 aspects of the Cypher language:
The WITH clause drops all existing variables except for the ones explicitly specified in the clause. Therefore, since your WITH clause does not specify the n node, n becomes an unbound variable after the clause.
The MERGE clause will create its entire pattern if any part of the pattern does not already exist. Since n is not bound to anything, the MERGE clause would go ahead and create the entire pattern (including the 2 nodes).
So, you could have fixed the issue by simply specifying the n variable in the WITH clause, as in:
WITH n, line.cui= cui
But #Raj's query is even better, avoiding the need for WITH entirely.

Longest Path Neo4j returning incorrect path

I have the following graph stored in csv format:
graphUnioned.csv:
a b
b c
The above graph denotes path from Node:a to Node:b. Note that the first column in the file denotes source and the second column denotes destination. With this logic the second path in the graph is from Node:b to Node:c. And the longest path in the graph is: Node:a to Node:b to Node:c.
I loaded the above csv in Neo4j desktop using the following command:
LOAD CSV WITH HEADERS FROM "file:\\graphUnioned.csv" AS csvLine
MERGE (s:s {s:csvLine.s})
MERGE (o:o {o:csvLine.o})
MERGE (s)-[]->(o)
RETURN *;
And then for finding longest path I run the following command:
match (n:s)
where (n:s)-[]->()
match p = (n:s)-[*1..]->(m:o)
return p, length(p) as L
order by L desc
limit 1;
However unfortunately this command only gives me path from Node: a to Node:b and does not return the longest path. Can someone please help me understand as to where am I going wrong?
There are two mistakes in your CSV import query.
First, you need to use a type when you MERGE a relationship between nodes, that query won't compile otherwise. You likely supplied one and forgot to add it when you pasted it here.
Second, the big one, is that your query is merging nodes with different labels and different properties, and this is majorly throwing it off. Your intent was to create 3 nodes, with a longest path connecting them, but your query creates 4 nodes, two isolated groups of two nodes each:
This creates 2 b nodes: (:s {s:b}) and (:o {o:b}). Each of them is connected to a different node, and this is due to treating the nodes to be created from each variable in the CSV differently.
What you should be doing is using the same label and property key for all of the nodes involved, and this will allow the match to the b node to only refer to a single node and not create two:
LOAD CSV WITH HEADERS FROM "file:\\graphUnioned.csv" AS csvLine
MERGE (s:Node {value:csvLine.s})
MERGE (o:Node {value:csvLine.o})
MERGE (s)-[:REL]->(o)
RETURN *;
You'll also want an index on :Node(value) (or whatever your equivalent is when you import real data) so that your MERGEs and subsequent MATCHes are fast when performing lookups of the nodes by property.
Now, to get to your longest path query.
If you are assuming that the start node has no relations to it, and that your end node has no relationships from it, then you can use a query like this:
match (start:Node)
where not ()-->(start)
match p = (start)-[*]->(end)
where not (end)-->()
return p, length(p) as L
order by L desc
limit 1;

Neo4j load CSV to create dynamic relationship types

I can load CSV into Neo4j for a specific label (say PERSON) and the nodes are created under the label PERSON.
I also have another CSV to illustrate the relationships between the person and it looks like:
name1, relation, name2
a, LOVE, b
a, HATE, c
I want to create a relationship between these pairs and the relationship thus created should be "LOVE", "HATE", etc, instead of a rigid RELATION as done by the below script:
load csv with headers from "file:///d:/Resources/Neo4j/person-rel.csv" as p
match (a:PERSON) where a.name=p.name1
match (b:PERSON) where b.name=p.name2
merge (a)-[r:REL {relation: p.REL}]->(b)
By doing this, I have a bunch of REL-type relations but not LOVE- and HATE-relations.
In another word, I want the REL in the last line of the script to be dynamically assigned. And then I can query out all the relationship types using Neo4j API.
Is this possible?
You can install the APOC library and then use apoc.merge.relationship
apoc.merge.relationship(startNode, relType, {key:value, ...}, {key:value, ...}, endNode) - merge relationship with dynamic type
load csv with headers from "file:///d:/Resources/Neo4j/person-rel.csv" as p
match (a:PERSON) where a.name=p.name1
match (b:PERSON) where b.name=p.name2
call apoc.merge.relationship(a,p.REL,{},{},b) yield rel
return count(*);

Neo4j load data relations with unknown labels

I have 4 Labels (A, B, C, D). All of them have a single Property {id}.
Now I have a file with relations which I would like to load. Every row has this structure:
{id_1}, {type_of_relations}, {id_2}
How can I create the relations?
My non-working guess is:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:/data.csv" AS line
FIELDTERMINATOR ','
MATCH (a:A{id:line.id_1} OR a:B{id:line.id_1} OR a:C{id:line.id_1} OR a:D{id:line.id_1})
MATCH (b:A{id:line.id_2} OR b:B{id:line.id_2} OR b:C{id:line.id_2} OR b:D{id:line.id_2})
MERGE (a)-[:line.type_of_relations]->(b)
You cannot parameterize the relationship type in Cypher.
However, you can do this using the apoc.create.relationship procedure in Neo4j apoc procedures:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///file.csv" AS row
MATCH (a) WHERE a.id = row.id_1
MATCH (b) WHERE b.id = row.id_2
CALL apoc.create.relationship(a, row.type_of_relations, {}, b) YIELD rel
RETURN count(*) AS num
The procedure takes a parameter for the relationship type which allows for creating dynamic relationship types.
I don't think you can do that. For a number of reasons.
create (f:bar {name:'NewUserA'})
create (f:foo {name:'NewUserA'})
match (f:foo {name:'NewUserA'} or f:bar {name:'NewUserA'}) return f;
Code
Invalid input 'o': expected whitespace, comment, ')' or a relationship pattern (line 1, column 32 (offset: 31))
"match (f:foo {name:'NewUserA'} or f:bar {name:'NewUserA'}) return f".
So there is a problem on the match at any rate.
If the id is globally unique then you can ignore the label and just match on the id. that will take care of your 'or' problem.
match (f) where f.name='NewUserA' match (t) where t.name='NewUserA' return f,t
would give you the nodes.
That being said, when coding parameterized queries RELATIONSHIP_TYPE is one of the items you cannot parameterize. From the docs:
5.5. Parameters
[..]
Parameters can not be used as for property names, relationship types and labels, since these patterns are part of the query structure that is compiled into a query plan.
[..]
So you may need to look to ways of building your MERGE as a string somewhere else (awk is your friend) and then running that in the shell.

Create Relationships between Consequent Nodes (on date attribute) in Neo4j

I am trying to get a csv into Neo4j. As it consists of log entries, I'd like to connect nodes with a NEXT-pointer/relationship when the corresponding logs have been created at subsequent times.
LOAD CSV WITH HEADERS FROM 'http://localhost/Export.csv' AS line
CREATE (:Entry { date: line[0], ...})
MATCH (n)
RETURN n
ORDER BY n:date
MATCH (a:Entry),(b:Entry),(c:Entry)
WITH p AS min(b:date)
WHERE a:date < b:date AND c.date = p
CREATE (a)-[r:NEXT]->(c)
The last four lines do not work however. What I try is to get the earliest entry 'c' out of the group of entries 'b' with a larger timestamp than 'a'. Can anyone help me out here?
Not sure if I understood your question correctly: you have a csv file containing log records with a timestamp. Each line contains one record. You want to interconnect the events to form a linked list based on a timestamp?
In this case I'd split up the process into two steps:
using LOAD CSV create a node with a data property for each line
afterwards connect the entries using e.g. a cypher statement like this:
.
MATCH (e:Entry)
WITH e ORDER BY e.date DESC
WITH collect(e) as entries
FOREACH(i in RANGE(0, length(entries)-2) |
FOREACH(e1 in [entries[i]] |
FOREACH(e2 in [entries[i+1]] |
MERGE (e1)-[:NEXT]->(e2))))

Resources