Neo4j: Split string and get position - foreach

I need to split a field into separate values and store each value in its own node. For each node created, I want to store the word's position.
Example:
Sentence        Words
My car is red   My;car;is;red
Using:
FOREACH (w IN SPLIT(line.TWords, ";") |
MERGE (wd:Word {word: w}))
I can split the field and store the different words, but I'd like to store the position on the relationship.
My car is red -[HAS_WORD {position:1}]-> My
My car is red -[HAS_WORD {position:2}]-> car
My car is red -[HAS_WORD {position:3}]-> is
My car is red -[HAS_WORD {position:4}]-> red
How can I get this?
SOLUTION
USING PERIODIC COMMIT LOAD CSV WITH HEADERS FROM 'file:///output_2016-05-06_0203_Neo4jImport.csv' AS line FIELDTERMINATOR "\t"
MERGE (s:Segment{text: line.Source})
MERGE (ta:Segment{text: line.Target})
WITH SPLIT(line.SWords, ";") AS SWords, line, s, ta
UNWIND RANGE(0, SIZE(SWords)-1) as i
MERGE (s)-[r:HAS_WORD {position:i+1}]->(w:Word {word: SWords[i]})
WITH SPLIT(line.TWords, ";") AS TWords, line, ta
UNWIND RANGE(0, SIZE(TWords)-1) as i
MERGE (ta)-[r:HAS_WORD {position:i+1}]->(w:Word {word: TWords[i]})
Make sure the first WITH carries forward every variable that the second WITH needs: WITH SPLIT(line.SWords, ";") AS SWords, line, s, ta

You can use a range based on the size of the split list, assuming the node containing the sentence is bound to the variable sentence:
WITH sentence, split(line.TWords, ';') AS splitted
UNWIND range(0, size(splitted) - 1) AS i
MERGE (w:Word {word: splitted[i]})
MERGE (sentence)-[:HAS_WORD {position: i + 1}]->(w)
Update
USING PERIODIC COMMIT LOAD CSV WITH HEADERS
FROM 'file:///output_2016-05-06_0203_Neo4jImport.csv'
AS line FIELDTERMINATOR "\t"
MERGE (s:Segment{text: line.Source})
WITH SPLIT(line.SWords, ";") AS SWords, line, s
UNWIND RANGE(0, SIZE(SWords)-1) as i
MERGE (s)-[r:HAS_WORD {position:i+1}]->(w:Word {word: SWords[i]})

Use range:
MERGE (S:Sentence {text:"My car is red"})
WITH S, SPLIT(S.text, " ") as words
UNWIND RANGE(0,SIZE(words)-1) as i
MERGE (S)-[r:HAS_WORD {position:i+1}]->(w:Word {word: words[i]})
RETURN S, r, w
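To see what the SPLIT / RANGE / UNWIND combination in the answers above produces for each row, here is a minimal Python sketch of the same indexing. The function name word_positions is made up for illustration; it only mirrors the logic and does not talk to Neo4j.

```python
def word_positions(words_field, separator=";"):
    """Pair each word with a 1-based position, like the UNWIND RANGE pattern."""
    words = words_field.split(separator)      # like SPLIT(line.TWords, ";")
    return [
        (i + 1, words[i])                     # like {position: i+1} and words[i]
        for i in range(len(words))            # like RANGE(0, SIZE(words)-1)
    ]


if __name__ == "__main__":
    for position, word in word_positions("My;car;is;red"):
        print(position, word)
```

Each (position, word) pair corresponds to one HAS_WORD relationship that the MERGE would create.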

Related

Neo4j LOAD CSV..when CSV columns contains a list of properties

This question is about importing a CSV file into Neo4j with LOAD CSV. Suppose my CSV file has the following format:
Id, OID, name, address, Parents , Children
1, mid1, ratta, hello#aa, ["mid250","mid251","mid253"], ["mid60","mid65"]
2, mid2, butta, ado#bb, ["mid350","mid365","mid320", "mid450","mid700"], ["mid20","mid25","mid30"]
3, mid3, natta, hkk#aa, ["mid50","mid311","mid543"], []
So the Parents and Children columns basically consist of OIDs (mids). While importing the CSV into Neo4j using LOAD CSV, I want to create the following nodes and relationships:
A node for each row (for each Id value in the CSV).
A [:PARENT] relationship, by matching the OID property of each row against the OIDs inside the Parents column. For example, after processing the first row there should be four nodes (mid1, mid250, mid251 and mid253) and three PARENT relationships between mid1 and the other three nodes.
A [:CHILD] relationship, by matching the OID property of each row against the OIDs inside the Children column.
Please help!
I tried doing it with FOREACH, but the results were not correct. I'm running it from a Python script and only need to fix the Cypher query.
def create_AAA(tx):
    tx.run(
        "LOAD CSV WITH HEADERS FROM 'file:///aaa.csv' AS row MERGE (e:AAA {id: row._id,OID: row.OID,address: row.address,name: row.name})"
    )

def create_parent(tx):
    tx.run(
        "LOAD CSV WITH HEADERS FROM 'file:///aaa.csv' AS row MERGE (a:AAA {OID: row.OID}) FOREACH (t in row.parents | MERGE (e:AAA {OID:t}) MERGE (a)-[:PARENT]->(e) )"
    )

def create_child(tx):
    tx.run(
        "LOAD CSV WITH HEADERS FROM 'file:///aaa.csv' AS row MERGE (a:AAA {OID: row.OID}) FOREACH (t in row.children | MERGE (e:AAA {OID:t}) MERGE (a)-[:CHILD]->(e) )"
    )

with driver.session() as session:
    session.write_transaction(create_AAA)
    session.write_transaction(create_parent)
    session.write_transaction(create_child)
Please follow the instructions below:
Rename the Parents and Children columns to parents and children, since Neo4j is case sensitive and the query refers to row.parents and row.children.
Remove the spaces in your CSV file so that you don't need to call trim() on each column.
In the parents and children columns, remove the commas inside the string lists, because they cause an error, or use a delimiter other than the comma. In my example, I used a space as the delimiter.
The script below removes the quotes and the [ ] characters, then converts the string into a list using the split() function.
Do the same for the create_child function.
LOAD CSV WITH HEADERS FROM 'file:///aaa.csv' AS row
MERGE (a:AAA {OID: row.OID})
FOREACH (t in split(replace(replace(replace(row.parents,'[', ''),']', ''),'"', ''), ' ') |
MERGE (e:AAA {OID:t}) MERGE (a)-[:PARENT]->(e) )
See sample csv here:
Id,OID,name,address,parents,children
1,mid1,ratta,hello#aa,["mid250" "mid251" "mid253"],["mid60" "mid65"]
2,mid2,butta,ado#bb,["mid350" "mid365" "mid320" "mid450" "mid700"],["mid20" "mid25" "mid30"]
3,mid3,natta,hkk#aa,["mid50" "mid311" "mid543"],[]
For the children column, skip the rows whose list is empty:
LOAD CSV WITH HEADERS FROM 'file:///aaa.csv' AS row
WITH row WHERE row.children <>"[]"
MERGE (a:AAA {OID: row.OID})
FOREACH (t in split(replace(replace(replace(row.children,'[', ''),']', ''),'"', ''), ' ') |
MERGE (e:AAA {OID:t}) MERGE (a)-[:CHILD]->(e) )
Now it works fine.
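For anyone wondering why the WHERE row.children <> "[]" filter matters, here is a small Python sketch of the same replace-then-split cleanup. The helper name parse_oid_list is hypothetical, and it assumes Cypher's split() behaves like Python's here, returning a list with one empty string for an empty input.

```python
def parse_oid_list(raw, delimiter=" "):
    """Strip '[', ']' and '"' from the column value, then split on the delimiter."""
    cleaned = raw.replace("[", "").replace("]", "").replace('"', "")
    return cleaned.split(delimiter)


if __name__ == "__main__":
    print(parse_oid_list('["mid60" "mid65"]'))   # the two OIDs from row 1
    print(parse_oid_list("[]"))                  # a list holding one empty string
```

Under that assumption, an empty list column still yields one element (an empty string), so without the filter the MERGE would create a node with an empty OID.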

Importing intra-dependent relationship in Neo4j

My CSV structure is as follows:
Origin City   Destination City   Route   Sales
A             B                  XYZ     $5
B             C                  ZED     $50
C             A                  FGH     $15
Origin and destination cities come from the same set, i.e. there should be only 3 nodes in this case (A, B, and C), and each relationship should have 2 properties: sales and route.
When I use the following code:
LOAD CSV WITH HEADERS FROM 'file:///C:/citylist2.csv' as line fieldterminator ','
MERGE (c:City {id: line.`Origin City`})
MERGE (c)-[r:SALES{id: line.Route, sales: line.Sales}]->(c)
it creates a self-referencing flower graph. How do I solve this? I need 3 nodes, with relationships between them that carry sales and route as properties.
What about this :
LOAD CSV WITH HEADERS FROM 'file:///C:/citylist2.csv' as line fieldterminator ','
MERGE (c1:City {id: line.`Origin City`})
MERGE (c2:City {id: line.`Destination City`})
MERGE (c1)-[r:SALES{id: line.Route, sales: line.Sales}]->(c2);
Hope this helps,
Regards,
Tom

Cypher query return undesirable result

I need to take texts and save them to Neo4j. After that, I split each text into words and create a [:NEXT] relationship between consecutive words, indicating which word comes after another, and a [:CONTAINS] relationship indicating that the text contains that word.
Finally, I try to find the word in the text that has the most [:NEXT] relationships, counting only the given text and not the whole database.
Unfortunately, I just get the sum over the whole database.
The query is:
query = '''
WITH split("%s"," ") as words
MERGE (p:Post {id: '%s', text: '%s'})
WITH p, words
UNWIND range(0,size(words)-2) as idx
MERGE (w1:Word {name:words[idx]})
MERGE (w2:Word {name:words[idx+1]})
MERGE (w1)-[:NEXT]->(w2)
MERGE (p)-[:CONTAINS]->(w2)
MERGE (p)-[:CONTAINS]->(w1)
WITH p
MATCH (p)-[c:CONTAINS]->(w:Word)
MATCH ()-[n1:NEXT]->(:Word {name: w.name})<-[:CONTAINS]-(p)
MATCH (p)-[:CONTAINS]-(:Word {name: w.name})-[n2:NEXT]->()
WITH COUNT(n1) + COUNT(n2)AS score, w.name AS word, p.text AS post, p.id AS _id
RETURN post, word, score, _id;
''' %(text, id, text)
I just can't find out the problem here.
Thanks!
Well, you may have a data modeling problem here.
You're using MERGE when creating your word nodes, so if a word was already added by any prior query, that same node is reused. Your more common word nodes (a, the, and, I, etc.) will therefore likely have many [:NEXT] relationships, and that number will keep growing with each query.
Is this how you mean this to behave, or are you only going to be asking your db questions about words used in only the given text in the query?
EDIT
The problem is the merging of the :Word nodes. The MERGE matches any :Word node created by a previous query, and any future query will in turn match the nodes created here. It's not enough to merge the :Word node itself; to make your words local to each associated post, you have to merge the relationship from your post to the word at the same time.
We can also clean up the patterns used to match to calculate the word score, as all we need is the number of [:NEXT] relationships of any direction from each word.
query = '''
WITH split("%s"," ") as words
MERGE (p:Post {id: '%s', text: '%s'})
WITH p, words
UNWIND range(0,size(words)-2) as idx
MERGE (p)-[:CONTAINS]->(w1:Word {name:words[idx]})
MERGE (p)-[:CONTAINS]->(w2:Word {name:words[idx+1]})
MERGE (w1)-[:NEXT]->(w2)
WITH p
MATCH (p)-[:CONTAINS]->(w:Word)
WITH size( ()-[:NEXT]-(w) ) AS score, w.name AS word, p.text AS post, p.id AS _id
RETURN post, word, score, _id;
''' %(text, id, text)
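As a rough illustration of the per-post score described in this answer (the number of [:NEXT] relationships in either direction for each word of one text), here is a Python sketch. The function name next_scores is made up, and self-loops such as a repeated word are not treated specially, so take it as an approximation of what the query counts rather than an exact equivalent.

```python
from collections import Counter


def next_scores(text):
    """Count, per word, the distinct NEXT edges touching it within a single text."""
    words = text.split(" ")
    # MERGE creates each (w1)-[:NEXT]->(w2) pair only once, hence the set.
    edges = set(zip(words, words[1:]))
    scores = Counter()
    for w1, w2 in edges:
        scores[w1] += 1   # outgoing NEXT from w1
        scores[w2] += 1   # incoming NEXT into w2
    return scores


if __name__ == "__main__":
    print(next_scores("a b a c").most_common())
```

Because the edges are built from one text only, words used in other posts cannot inflate the score, which is the effect the relationship-level MERGE aims for.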
My solution is:
query = '''
WITH split("%s"," ") AS words
MERGE (p:Post {id: "%s", text:"%s"})
WITH p, words
UNWIND range(0,size(words)-2) as idx
MERGE (w1:Word {name:words[idx]})
MERGE (w2:Word {name:words[idx+1]})
MERGE (w1)-[n:NEXT]->(w2)
ON MATCH SET n.count = n.count + 1
ON CREATE SET n.count = 1
MERGE (p)-[:CONTAINS]->(w2)
MERGE (p)-[:CONTAINS]->(w1)
''' %(text, id, text)
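One side note on the queries in this thread: they are built with % string formatting, which breaks as soon as the text contains a quote character. A possible alternative, sketched below, is to use Cypher parameters ($text, $id) and pass the values separately. The helper name build_post_query is hypothetical, and the commented session.run call assumes a session object from the Neo4j Python driver like the one used elsewhere on this page.

```python
def build_post_query(text, post_id):
    """Return a parameterised Cypher query and its parameter dict."""
    query = """
    WITH split($text, " ") AS words
    MERGE (p:Post {id: $id, text: $text})
    WITH p, words
    UNWIND range(0, size(words)-2) AS idx
    MERGE (p)-[:CONTAINS]->(w1:Word {name: words[idx]})
    MERGE (p)-[:CONTAINS]->(w2:Word {name: words[idx+1]})
    MERGE (w1)-[:NEXT]->(w2)
    """
    params = {"text": text, "id": post_id}
    return query, params


if __name__ == "__main__":
    query, params = build_post_query('He said "hi" to me', "42")
    print(params)
    # Assumed usage with an open driver session:
    # session.run(query, params)
```

The text is never spliced into the query string, so quotes in it need no escaping.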

LOAD CSV with a collection using MATCH and FOREACH to form a relationship

I have a CSV file that has information in the form:
Col1,Col2
A,B;C;D
I'm trying to generate relationships between A and each of B, C, D. In my particular case, B, C, D have already been created by importing another CSV file. Therefore, I do not need to create these nodes, but I do need to match them to get the node to form the relationship.
LOAD CSV WITH HEADERS FROM 'file:///file.csv' AS line
WITH line
MERGE (a:Item {name: line.Col1})
FOREACH (x IN SPLIT(line.Col2, ';') |
MATCH (s:Item {name: x})
CREATE UNIQUE (s)-[:rel]->(a));
Nodes B, C, D have multiple properties, only one of which is in the second CSV, like:
B: {id: 123, name: B}
Is there a way to do this with Cypher? Currently, I get the error: "Invalid use of MATCH inside FOREACH". Using MERGE instead of MATCH results in new, unwanted nodes.
Simply use MERGE instead of MATCH:
LOAD CSV WITH HEADERS FROM 'file:///file.csv' AS line
WITH line
MERGE (a:Item {name: line.Col1})
FOREACH (x IN SPLIT(line.Col2, ';') |
MERGE (s:Item {name: x})
MERGE (s)-[:rel]->(a));
If the match is more complicated, you can first collect the matching nodes and then iterate over them:
LOAD CSV WITH HEADERS FROM 'file:///file.csv' AS line
MERGE (a:Item {name: line.Col1})
WITH a, SPLIT(line.Col2, ';') as names
UNWIND names as x
MATCH (t:Item {name: x})
WITH a, collect(t) as sc
FOREACH (x IN sc |
CREATE UNIQUE (x)-[:rel]->(a));

Neo4j. Create graph from two tables

I have two csv files with nodes and edges.
nodes:
big, adjective
arm, noun
face, noun,
best, adjective
edges:
big, face
best, friend
face, arm
I want to create a graph with relationships taken from the edges file, and add each node's group: noun or adjective.
I use this command to create relationships:
LOAD CSV FROM 'file:copperfield_edges.csv' AS line MERGE (g:G {word1 : line[0]}) WITH line, g MERGE (j:J {word2 : line[1]}) WITH g,j MERGE (g)-[:From_To]->(j);
but in this case each word appears twice. How can I make the words unique, create only unique relationships between them, and add the noun and adjective group?
I want to get something like this http://joxi.ru/1A5QX6MH6LZ1AE
You're assigning the G label to all nodes in the first column and the J label to all nodes in the second column. Since you have one identifier (e.g. big, face) for every word, use one label for all of them, e.g. Word.
Try the following:
LOAD CSV FROM 'file:copperfield_edges.csv' AS line
MERGE (g:Word {word : line[0]})
MERGE (j:Word {word : line[1]})
MERGE (g)-[:From_To]->(j);
Based on your nodes csv file, you can assign an additional label indicating if the word is an adjective or noun:
LOAD CSV FROM 'file:nodes.csv' AS line
MERGE (w:Word {word: line[0]})
FOREACH (n in (CASE WHEN line[1] = "adjective" THEN [1] ELSE [] END) |
set w :Adjective )
FOREACH (n in (CASE WHEN line[1] = "noun" THEN [1] ELSE [] END) |
set w :Noun )
Since you cannot set labels dynamically, I've had to use the FOREACH trick documented at http://www.markhneedham.com/blog/2014/06/17/neo4j-load-csv-handling-conditionals/
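If the FOREACH trick looks odd, the idea is simply to loop over a list that has one element when the condition holds and none otherwise. Here is the same idea as a Python sketch; the function name labels_for is made up for illustration and nothing here touches Neo4j.

```python
def labels_for(group):
    """Emulate the CASE/FOREACH trick: loop over [1] or [] to act as an 'if'."""
    labels = {"Word"}
    # Runs once when the condition is true, zero times otherwise.
    for _ in ([1] if group == "adjective" else []):
        labels.add("Adjective")
    for _ in ([1] if group == "noun" else []):
        labels.add("Noun")
    return labels


if __name__ == "__main__":
    print(sorted(labels_for("adjective")))
    print(sorted(labels_for("noun")))
```

In Python you would just write an if statement; the loop form is only shown because a plain conditional cannot wrap the SET clause in the query above.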
If your graph has more than a handful of nodes, consider creating an index before running LOAD CSV:
CREATE INDEX ON :Word(word)
