Extraction of unique nodes from csv in neo4j - neo4j

In short, my problem is this:
I need to get from the following CSV file
(https...)drive.google.com/file/d/0B-y9nPaqlH6XdXZsYzAwLThacTg/view?usp=sharing
to the following data structure in Neo4j (using Cypher import):
https://drive.google.com/file/d/0B-y9nPaqlH6XdlZHM216eDRSX3c/view?usp=sharing
Instead of:
[https://drive.google.com/file/d/0B-y9nPaqlH6XdE9vZ0gyNU1lR0U/view?usp=sharing]
The longer explanation:
I thought solving my problem only required understanding bound and unbound elements.
But I have tried many times, in many ways (with and without creating the single nodes first, and in an empty database):
LOAD CSV with headers FROM "file:///C:/Users/user/Desktop/neo4j help/calling.csv"
AS csvLine
MERGE (u1:Person { number:(csvLine.A), name:(csvLine.name_A)}) MERGE (u2:Person { number:(csvLine.B), name:(csvLine.name_B)})
MERGE (u1:Person { number:(csvLine.A), name:(csvLine.name_A)})-[c:called]->(u2:Person { number:(csvLine.B), name:(csvLine.name_B)})
RETURN u1.name,c,u2.name
Instead of the expected result, I just got this error message:
Can't create u1 with properties or labels here. It already exists in
this context
And without "pre-merging" the nodes, I get the result shown above (in the pink picture).
What do I need to do to obtain the wanted result (in the first picture)?

You don't need to redefine the u1 and u2 nodes. Once they are bound by the first two MERGE clauses, just reuse the identifiers and MERGE the relationship:
LOAD CSV with headers FROM "file:///C:/Users/user/Desktop/neo4j help/calling.csv"
AS csvLine
MERGE (u1:Person { number:(csvLine.A), name:(csvLine.name_A)})
MERGE (u2:Person { number:(csvLine.B), name:(csvLine.name_B)})
MERGE (u1)-[c:CALLED]->(u2)
RETURN u1.name,c,u2.name
NB: I think your images are both the same. You can also post images directly in your question; many people will skip a question if they need to open two or three extra browser windows to read it.
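A possible refinement, offered as a sketch rather than as part of the answer above: if the phone number alone identifies a person (an assumption about this data), merging on number only and setting name on creation avoids a second node when the same number appears with a differently spelled name. The constraint below uses the older CREATE CONSTRAINT ON ... ASSERT syntax; newer Neo4j versions write it as CREATE CONSTRAINT FOR ... REQUIRE.

```cypher
// Assumption: number uniquely identifies a Person.
CREATE CONSTRAINT ON (p:Person) ASSERT p.number IS UNIQUE;

LOAD CSV WITH HEADERS FROM "file:///C:/Users/user/Desktop/neo4j help/calling.csv" AS csvLine
// Match or create each person by number only; name is set once, on creation.
MERGE (u1:Person { number: csvLine.A })
  ON CREATE SET u1.name = csvLine.name_A
MERGE (u2:Person { number: csvLine.B })
  ON CREATE SET u2.name = csvLine.name_B
// u1 and u2 are already bound, so only the relationship is merged here.
MERGE (u1)-[c:CALLED]->(u2)
RETURN u1.name, c, u2.name;
```

With this shape, the name stored on a node is whichever spelling was seen first for that number.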

Related

How to create relationship between two existing nodes by using node id?

I am trying to create a relationship between two existing nodes. I am reading the node IDs from a CSV and creating the relationship with the following query:
LOAD CSV WITH HEADERS FROM "file:///8245.csv" AS f
MATCH (Ev:Event) where id(Ev) =f.first
MATCH (Ev_sec:Event) where id(Ev_sec) = f.second
WITH Ev, Ev_sec
MERGE (Ev) - [:DF_mat] - > (Ev_sec)
However, it is not changing anything in the database. How can I solve this problem?
Thanks!
I solved the problem. I queried for ID(node) again and this time exported the IDs as strings (using toString(ID(node))). Then, while loading them into the database, I converted them to integers. The query is as follows:
LOAD CSV WITH HEADERS FROM "file:///8245_new.csv" AS csvLine
match (ev:Event) where id(ev)=toInteger(csvLine.first)
match (ev_sec:Event) where id(ev_sec)=toInteger(csvLine.second)
merge (ev)-[:DF_mat]-> (ev_sec)
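The reason the conversion matters: LOAD CSV returns every field as a string, while id() returns an integer, so a comparison like id(ev) = csvLine.first never matches and the query silently does nothing. A minimal read-only check, assuming the same file and column names as above, can confirm that the IDs resolve before any relationship is written (the result column names are made up for illustration):

```cypher
LOAD CSV WITH HEADERS FROM "file:///8245_new.csv" AS csvLine
// OPTIONAL MATCH keeps the row even when no node has the given id.
OPTIONAL MATCH (ev:Event) WHERE id(ev) = toInteger(csvLine.first)
OPTIONAL MATCH (ev_sec:Event) WHERE id(ev_sec) = toInteger(csvLine.second)
// count(x) ignores nulls, so the last two columns count resolved nodes.
RETURN count(*) AS csvRows,
       count(ev) AS firstFound,
       count(ev_sec) AS secondFound;
```

If firstFound or secondFound is lower than csvRows, some IDs in the file do not correspond to an :Event node.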

I can't create a relationship between nodes and predecessors by cypher while creating the graph

I have the following file A.csv
"NODE","PREDECESSORS"
"1",""
"2","1"
"3","1;2"
I want to create the nodes 1, 2 and 3 and the relationships 1->2->3 and 1->3.
I have already tried to do so:
LOAD CSV WITH HEADERS FROM 'file:///A.csv' AS line
CREATE (:Task { NODE: line.NODE, PREDECESSORS: SPLIT(line.PREDECESSORS ';')})
FOREACH (value IN line.PREDECESSORS |
MERGE (PREDECESSORS:value)-[r:RELATIONSHIP]->(NODE) )
But it does not work: it does not create any relationships.
Could you please help me?
The problem is in your MERGE:
MERGE (PREDECESSORS:value)-[r:RELATIONSHIP]->(NODE)
This merges a node with the label :value and assigns it to the variable PREDECESSORS, which can't be what you want to do.
A better approach is to not save the predecessor data in the node at all, and just use it to match the relevant nodes and create the relationships.
It will also help to have an index on :Task(NODE) so that your matches on the predecessors are quick.
Remember also that Cypher does not run the entire query once per row; rather, each operation in the query is applied to the rows in turn. By the time the MATCH executes, the CREATE has already created all the nodes, so there is no need to MERGE the predecessor nodes.
Try something like this:
LOAD CSV WITH HEADERS FROM 'file:///A.csv' AS line
CREATE (node:Task { NODE: line.NODE})
WITH node, SPLIT(line.PREDECESSORS, ';') as predecessors
MATCH (p:Task)
WHERE p.NODE in predecessors
MERGE (p)-[:RELATIONSHIP]->(node)
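If you would rather not depend on how the planner orders the CREATE and the MATCH within one statement, a two-pass import is a common alternative. The following is a sketch under the same assumptions as above (the :Task label, the NODE property and the file A.csv); it uses the older CREATE INDEX ON syntax, which newer Neo4j versions write as CREATE INDEX FOR (t:Task) ON (t.NODE).

```cypher
// Index so that lookups by NODE are fast.
CREATE INDEX ON :Task(NODE);

// Pass 1: create every task node.
LOAD CSV WITH HEADERS FROM 'file:///A.csv' AS line
MERGE (:Task { NODE: line.NODE });

// Pass 2: link each task to its predecessors.
LOAD CSV WITH HEADERS FROM 'file:///A.csv' AS line
MATCH (node:Task { NODE: line.NODE })
// One row per predecessor id; a task without predecessors matches nothing below.
UNWIND split(line.PREDECESSORS, ';') AS predId
MATCH (p:Task { NODE: predId })
MERGE (p)-[:RELATIONSHIP]->(node);
```

Because both passes use MERGE, running the import a second time does not duplicate nodes or relationships.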

Neo4J CSV relationships

I am a Neo4J newbie and I have a simple CSV with source and dest IPs. I'd like to create a relationship between nodes with the same labels.
Something like ... source_ip >> ALERTS >> dest_ip, or the reverse.
"dest_ip","source_ip"
"130.102.82.16","54.231.19.32"
"130.102.82.116","114.30.64.11"
"130.102.82.116","114.30.64.11"
...
LOAD CSV WITH HEADERS
FROM "file:///Users/me/Desktop/query_result.csv" AS csvLine
CREATE (alert:Alert { source_ip: csvLine.source_ip, dest_ip: csvLine.dest_ip})
MATCH (n:Alert) RETURN n LIMIT 25
dest_ip 130.102.82.16 source_ip 54.231.19.32
....
This works fine. My question is: how do I create the relationship between the IPs inside the alerts? I've tried and failed a slew of times. I'm guessing I need to set up separate nodes for source and dest and then link them; I'm just unsure how.
Thanks in advance!
Peace,
Tom
First create a constraint like this, to guarantee uniqueness and speed up the MERGE operation.
CREATE CONSTRAINT ON (a:Alert) ASSERT a.ip IS UNIQUE;
You can use as many MERGE clauses as you want for the nodes, and then MERGE the relationship, like this:
LOAD CSV WITH HEADERS
FROM "file:///Users/me/Desktop/query_result.csv" AS csvLine
MERGE (node1:Alert { ip: csvLine.source_ip })
MERGE (node2:Alert { ip: csvLine.dest_ip })
MERGE (node1)-[r:ALERT]->(node2)
By the way, I'd recommend using MERGE in most places to make sure you don't end up creating duplicates. In this file a given IP address might be listed many times. You don't want a new node each time it appears; you probably want all references attached to one node for that IP address, hence MERGE here instead of CREATE.
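Since the sample file repeats some source/destination pairs, one optional extension is to record how often each pair occurs instead of just discarding the repeats. This is only a sketch: the hits property is a hypothetical name, not something from the question.

```cypher
LOAD CSV WITH HEADERS
FROM "file:///Users/me/Desktop/query_result.csv" AS csvLine
MERGE (src:Alert { ip: csvLine.source_ip })
MERGE (dst:Alert { ip: csvLine.dest_ip })
MERGE (src)-[r:ALERT]->(dst)
  // First time this pair is seen: start the counter.
  ON CREATE SET r.hits = 1
  // Pair already linked: count the repeated row.
  ON MATCH SET r.hits = r.hits + 1;
```

Note that, unlike a plain MERGE, this version is not idempotent: importing the same file twice doubles the counters.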
Assuming that your graph model is something like
(:Source)-[:ALERTS]->(:Destination)
The following Cypher query will create that relationship
LOAD CSV WITH HEADERS FROM "file:///Users/me/Desktop/query_result.csv" AS csvLine
CREATE (source:Source { ip: csvLine.source_ip })-[:ALERTS]->(dest:Destination { ip: csvLine.dest_ip})

Avoid processing duplicate data when CSV importing via Cypher

I have a set of CSV files with duplicate data, i.e. the same row might (and does) appear in multiple files. Each row is uniquely identified by one of the columns (id) and has quite a few other columns that indicate properties, as well as required relationships (i.e. ids of other nodes to link to). The files all have the same format.
My problem is that, due to size and number of the files, I want to avoid processing the rows that already exist - I know that as long as id is the same, the contents of the rows will be the same across the files.
Can any cypher wizard advise how to write a query that would create the node, set all the properties and create all the relationship if a node with given id does not exist, but skip the action altogether if such node is found? I tried with MERGE ON CREATE, something along the lines of:
LOAD CSV WITH HEADERS FROM "..." AS row
MERGE (f:MyLabel {id:row.uniqueId})
ON CREATE SET f....
WITH f,row
MATCH (otherNode:OtherLabel {id : row.otherNodeId})
MERGE (f) -[:REL1] -> (otherNode)
Unfortunately, ON CREATE only lets me avoid setting the properties again; I couldn't work out how to skip the merging of the relationships (only one is shown here, but there are quite a few more).
Thanks in advance!
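You can optionally match the node first and then skip the row if it was found, by filtering with WHERE existing IS NULL. The filter has to come after a WITH: a WHERE attached directly to OPTIONAL MATCH only restricts the match and never removes rows. The existing node also needs its own variable, because an already bound variable cannot be redeclared in the MERGE.
Make sure you have an index or constraint on :MyLabel(id).
LOAD CSV WITH HEADERS FROM "..." AS row
OPTIONAL MATCH (existing:MyLabel {id:row.uniqueId})
WITH row, existing
WHERE existing IS NULL
MERGE (f:MyLabel {id:row.uniqueId})
ON CREATE SET f....
WITH f,row
MATCH (otherNode:OtherLabel {id : row.otherNodeId})
MERGE (f)-[:REL1]->(otherNode)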

Neo4J Cypher - Case Insensitive MERGE

Is there a way to do a case insensitive MERGE in Cypher (Neo4J)?
I'm creating a graph of entities I have been able to extract from a set of documents, and want to merge entities that are the same across multiple documents (accepting the risk that the same name doesn't mean it's the same entity!). The issue is that the case can vary between documents.
At the moment, I'm using the MERGE syntax to create merged nodes, but it is sensitive to the differences in case. How can I perform a case-insensitive merge?
There is no direct way, but you can try something like the query below. MERGE is made for pattern matching, and labels that differ only in case constitute different patterns.
MERGE (a:Crew123)
WITH a,labels(a) AS t
LIMIT 1
MATCH (n)
WHERE [l IN labels(n)
WHERE lower(l)=lower(t[0])] AND a <> n
WITH a,collect(n) AS s
FOREACH (x IN s |
DELETE a)
RETURN *
The above query will give you an error, but it will delete the newly created node if a label that differs only in case already exists. You can add further patterns to the MERGE clause. If there are no similar labels, it runs successfully.
Again, this is just a workaround to prevent new labels that differ only in case.
If the data is coming for instance from a CSV or similar source (parameter) you can use a dedicated, consistent case property for the merge and set the original value separately.
e.g.
CREATE CONSTRAINT ON (u:User) ASSERT u.iname IS UNIQUE;
LOAD CSV WITH HEADERS FROM "http://some/url" AS line
WITH line, lower(line.name) as iname
MERGE (u:User {iname:iname}) ON CREATE SET u.name = line.name;
The best solution we found was to change our schema to include a labelled node that contains the upper-cased value, which we can merge on, whilst still retaining the case information on the original node. E.g. (OriginalCase)-[uppercased]->(ORIGINALCASE)
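A rough sketch of how that schema could be written, since the answer above does not give a query. The labels Entity and EntityKey, the relationship type UPPERCASED and the $name parameter are all hypothetical. toUpper() is the current name of the function; older Neo4j versions called it upper().

```cypher
// Merge on the upper-cased key, so "Acme" and "ACME" share one key node.
MERGE (key:EntityKey { value: toUpper($name) })
// Keep one node per original spelling and attach it to the shared key.
MERGE (e:Entity { name: $name })
MERGE (e)-[:UPPERCASED]->(key);

// Case-insensitive lookup: all spellings recorded for a name.
MATCH (key:EntityKey { value: toUpper($name) })<-[:UPPERCASED]-(e:Entity)
RETURN e.name;
```

Relationships that should apply to the entity regardless of spelling would then be attached to the key node rather than to the per-spelling nodes.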
