Assume a Node "Properties". I am using "LOAD CSV with headers..."
Following is the sample file format:
fields
a=100,b=110,c=120,d=500
How do I convert the fields column into a "Properties" node whose properties are a, b, c, d with the values 100, 110, 120, 500 respectively?
LOAD CSV WITH HEADERS FROM 'file:/sample.tsv' AS row FIELDTERMINATOR '\t'
CREATE (:Properties {props: row.fields})
The above does not create individual properties, but sets a string value to props as "a=100,b=110,c=120,d=500"
Also, different rows could have different sets of keys. That is, the keys need to be dynamic. (There are other columns as well; I trimmed them for SO.)
fields
a=100,b=110,c=120,d=500
X=300,y=210,Z=420,P=600
...
I am looking for a way to load this without first splitting the key-value pairs into columns. The reason is that the keys are dynamic: today they are a,b,c,d; tomorrow they may change to aa,bb,cc,dd, etc.
I don't want to keep on changing my loader script to recognize new column headers.
Any pointers to solve this? I am using the latest 3.0.1 neo4j version.
First things first: Your file format currently defines a single header/property: fields:
fields
a=100,b=110,c=120,d=500
Since you defined a tab as the field terminator, that entire string (a=100,b=110,c=120,d=500) ends up in your node's props property.
To have properties loaded dynamically, first set up a proper header row:
"a","b","x","y"
1,2,,
,,3,4
Then you can query with something like this:
LOAD CSV WITH HEADERS FROM 'file:///Users/David/overflow.csv' AS row
CREATE (:StackOverflow { a:row.a, b:row.b,x:row.x,y:row.y})
Then when you run something like:
match(so:StackOverflow) return so
You'll get the variable properties you wanted.
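If the source file cannot be changed by hand, the header row suggested above could be generated from the key=value lines before loading. The following is a minimal Python sketch, not part of the original answer: the helper names parse_fields and to_wide_csv are hypothetical, and it assumes every item in a cell is a well-formed key=value pair with no commas or equals signs inside the values.

```python
import csv
import io


def parse_fields(cell):
    """Split 'a=100,b=110' into {'a': '100', 'b': '110'}."""
    # Assumption: every non-empty item contains exactly one key and one value.
    pairs = (item.split("=", 1) for item in cell.split(",") if item)
    return {key.strip(): value.strip() for key, value in pairs}


def to_wide_csv(cells):
    """Build CSV text with one column per key seen in any row."""
    rows = [parse_fields(cell) for cell in cells]
    # Collect keys in first-seen order so the header is stable.
    header = list(dict.fromkeys(key for row in rows for key in row))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=header, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # keys missing from a row are written as empty fields
    return out.getvalue()


wide = to_wide_csv(["a=100,b=110", "x=300,y=210"])
print(wide)
```

Because the header is rebuilt from whatever keys appear in the data, new keys such as aa or bb would show up as new columns without editing this script.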
Related
I'm using cypher and the neo4j browser to create nodes from csv input.
I want to read in each row of my csv file with headers and then create a node with that row as properties.
My current code is:
LOAD CSV WITH HEADERS FROM '<yourFilePath>' AS ROW
WITH ROW
CREATE (n:node $ROW)
This throws an error saying parameter missing.
Try this:
LOAD CSV WITH HEADERS FROM '<yourFilePath>' AS row
CREATE (n:node)
SET n+= row
In Cypher, variables that start with "$" must be passed to the query as parameters. Your Cypher code is locally binding values to the ROW variable (and not passing a parameter), so change $ROW to ROW.
In addition, if you want to make sure that you do not generate duplicate nodes, you should consider using MERGE instead of CREATE. But before you do so, you must carefully read the documentation on MERGE to understand how to use it properly.
I am new to Neo4j and graph database. While trying to import a few relationships from a CSV file, I can see that there are no records, even when the file is filled with enough data.
LOAD CSV with headers FROM 'file:/graphdata.csv' as row WITH row
WHERE row.pName is NOT NULL
MERGE(transId:TransactionId)
MERGE(refId:RefNo)
MERGE(kewd:Keyword)
MERGE(accNo:AccountNumber {bName:row.Bank_Name, pAmt:row.Amount, pName:row.Name})
Followed by:
LOAD CSV with headers FROM 'file/graphdata.csv' as row WITH row
WHERE row.pName is NOT NULL
MATCH(transId:TransactionId)
MATCH(refId:RefNo)
MATCH(kewd:Keyword)
MATCH(accNo:AccountNumber {bName:row.Bank_Name, pAmt:row.Amount, pName:row.Name})
MERGE(transId)-[:REFERENCE]->(refId)-[:USED_FOR]->(kewd)-[:AGAINST]->(accNo)
RETURN *
Edit (table replica):
TransactionId Bank_Name RefNo Keyword Amount AccountNumber AccountName
12345 ABC 78 X 1000 5421 WE
23456 DEF X 2000 5471
34567 ABC 32 Y 3000 4759 HE
Is it likely the case that the Nodes and relationships are not created at all? How do I get all these desired relationships?
Neither file:/graphdata.csv nor file/graphdata.csv is a legal URL. You should use file:///graphdata.csv instead.
By default, LOAD CSV expects a "csv" file to consist of comma separated values. You are instead using a variable number of spaces as a separator (and sometimes as a trailer). You need to either:
use a single space as the separator (and specify an appropriate FIELDTERMINATOR option). But this is not a good idea for your data, since some bank names will likely also contain spaces.
use a comma separator (or some other character that will not occur in your data).
For example, this file format would work better:
TransactionId,Bank_Name,RefNo,Keyword,Amount,AccountNumber,AccountName
12345,ABC,78,X,1000,5421,WE
23456,DEF,,X,2000,5471
34567,ABC,32,Y,3000,4759,HE
Your Cypher query is attempting to use row properties that do not exist (since the file has no corresponding column headers). For example, your file has no pName or Name headers.
Your usage of the MERGE clause is probably not doing what you want, generally. You should carefully read the documentation, and this answer may also be helpful.
I have two files. The first file contains a list of users with certain properties. I have loaded them into Neo4j as below:
USING PERIODIC COMMIT LOAD CSV WITH HEADERS FROM "file:///users.csv" AS row
CREATE (U:User{userid:row.userid, username:row.username})
Now, I have a second file that contains pincodes of the places the user stays or ever stayed at. Example:
User Pincodes
A 001
B 002
A 003
I want to add a property to each User node that holds all of that user's pincodes as a list. But when I use the query below, it stores only the latest value and not all the values as a list.
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:///user_pincode.csv" AS line
MATCH (U:User)
WHERE U.userid=line.userid
SET U.pincode=[line.pincode]
Any suggestions would be really helpful.
[UPDATED]
You can do this:
USING PERIODIC COMMIT 1000
LOAD CSV FROM "file:///user_pincode.csv" AS line
MATCH (u:User)
WHERE u.userid=line[0]
SET u.pincode = COALESCE(u.pincode, []) + line[1]
Since your CSV data has no header, this query omits the WITH HEADERS option and treats line as an array. It appends the new pincode to the end of the existing pincode list (or, if the pincode property did not already exist, initializes that property with a single-element list). The COALESCE function returns the first argument that is non-NULL.
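To make the append behavior concrete, here is a small Python sketch that replays the same logic outside Neo4j. It is only an illustration: the coalesce and append_pincodes helpers are hypothetical names, None stands in for Cypher's NULL, and a plain dictionary stands in for the User nodes (unlike the MATCH above, it creates an entry for any userid it sees).

```python
def coalesce(*values):
    """Return the first argument that is not None, like Cypher's COALESCE."""
    for value in values:
        if value is not None:
            return value
    return None


def append_pincodes(rows):
    """Replay SET u.pincode = COALESCE(u.pincode, []) + line[1] for each row."""
    users = {}
    for userid, pincode in rows:
        # users.get returns None when the property does not exist yet.
        users[userid] = coalesce(users.get(userid), []) + [pincode]
    return users


pincodes = append_pincodes([("A", "001"), ("B", "002"), ("A", "003")])
print(pincodes)
```

The original query with SET U.pincode=[line.pincode] corresponds to dropping the coalesce(...) + part, which is why each row overwrote the previous value.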
I'm trying to create a set of labeled nodes using LOAD CSV like so:
LOAD CSV WITH HEADERS FROM "file:/D:/OpenData/ProKB/tmp/ErrLink.csv" as line
CREATE (e:ErrLink {kbid:line.Kbid, errnum:line.Errnum })
The CSV file looks like this:
"Kbid:string","Errnum:string"
"S000001080","64"
"S000001096","129"
The problem I'm running into is I'm creating nodes, and they're all property-less. If I get rid of the :string suffixes on the header fields, then the load works.
This is contrary to what Chapter 29.1 of the docs says:
29.1. CSV file header format
The header row of each data source specifies how the fields should be interpreted. The same delimiter is used for the header row as for the rest of the data.
The header contains information for each field, with the format: name:field_type. The name is used as the property key for values, and ignored in other cases. The following field_type settings can be used for both nodes and relationships:
Property value Use one of int, long, float, double, boolean, byte, short, char, string to designate the data type. If no data type is given, this defaults to string. To define an array type, append [] to the type. Array values are by default delimited by a ;, but a different delimiter can be specified.
Is this functionality not working, or is it restricted to just the import tool and not the language?
That section of the documentation is for the Import Tool, which is different than the Cypher language's Load CSV clause.
If you are using the latter, that special header format is not documented, and apparently not supported.
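If the same file has to serve both tools, one option is to strip the type suffixes from the header row before handing the file to LOAD CSV. The sketch below is an assumption-laden illustration rather than anything from the Neo4j documentation: strip_header_types is a hypothetical helper, it works on CSV text held in memory, and it re-quotes every field to match the sample file.

```python
import csv
import io


def strip_header_types(csv_text):
    """Drop ':type' suffixes from the header row, leaving data rows unchanged."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    # 'Kbid:string' -> 'Kbid'; names without a suffix are left alone.
    rows[0] = [name.split(":", 1)[0] for name in rows[0]]
    out = io.StringIO()
    csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n").writerows(rows)
    return out.getvalue()


sample = '"Kbid:string","Errnum:string"\n"S000001080","64"\n'
cleaned = strip_header_types(sample)
print(cleaned)
```

With the suffixes removed, line.Kbid and line.Errnum in the query above refer to headers that actually exist in the file.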
I am using Neo4jImport.bat to perform my initial database load. I have a node file that looks like this:
application_id:ID(application_id),:LABEL
2036983247,application_id
2037028183,application_id
I would like to (sometimes) add a second :suspect label to some of these rows. For example:
application_id:ID(application_id),:LABEL
2036983247,"application_id,suspect"
2037028183,application_id
Using the above format, the files will load successfully, however, when I try and query the data using cypher I run into issues. Specifically, the below queries return 0 results:
match (n:application_id {application_id:"2036983247"}) return *
match (n:suspect) return *
Whilst the query against the row with a single label works fine:
match (n:application_id {application_id:"2037028183"}) return *
To make it more confusing, the labels() function seems to correctly show the labels as expected being returned in an array for the app with multiple labels.
According to the import documentation on labels:
LABEL
Read one or more labels from this field. For multiple labels, the values are separated by the array delimiter.
What am I doing wrong?
Each :LABEL column can hold multiple labels, separated by whatever --array-delimiter specifies (defaults to ';'). Also, as Robert mentioned, multiple :LABEL columns are also supported.
To add additional labels to a node, simply add an additional :LABEL header column for each additional label you wish to add.
application_id:ID(application_id),:LABEL,:LABEL
In the contents of the file, you then delimit your labels with whatever delimiter you are using:
2036983247,application_id,suspect
2037028183,application_id
Unlike properties, it seems that the import tool will allow :LABEL columns to be 'missing' (at least if they're the last column).
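For the single-column variant, where several labels share one :LABEL field, the node file could be generated with the labels joined by the array delimiter instead of a comma. This is a minimal Python sketch under stated assumptions: node_rows_csv is a hypothetical helper, the header is copied from the question, and ';' is used because the answer above gives it as the default for --array-delimiter.

```python
import csv
import io


def node_rows_csv(nodes, array_delimiter=";"):
    """Write node rows with all labels of a node joined into one :LABEL field."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["application_id:ID(application_id)", ":LABEL"])
    for node_id, labels in nodes:
        # Join with the array delimiter, not the comma used between columns.
        writer.writerow([node_id, array_delimiter.join(labels)])
    return out.getvalue()


node_file = node_rows_csv([
    ("2036983247", ["application_id", "suspect"]),
    ("2037028183", ["application_id"]),
])
print(node_file)
```

In the question's file the two labels were joined with a comma inside quotes, which differs from the documented array delimiter; joining with ';' keeps each label a separate value.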