Error in importing data into neo4j via csv - neo4j

I am new to neo4j.
I can't figure out what I am doing wrong; please help.
LOAD CSV WITH HEADERS FROM "file:///C:\Users\Chandra Harsha\Downloads\neo4j_module_datasets\test.csv" as line
MERGE (n:Node{name:line.Source})
MERGE (m:Node{name:line.Target})
MERGE (n)-[:TO{distance:line.dist}]->(m)
Error message:
Invalid input 's': expected four hexadecimal digits specifying a unicode character (line 1, column 41 (offset: 40))
"LOAD CSV WITH HEADERS FROM "file:///C:\Users\Chandra Harsha\Downloads\neo4j_module_datasets\test.csv" as line"

You've got two problems here.
The immediate error is that the backslashes in the path are being seen as escape characters rather than folder separators - you either have to escape the backslashes (by just doubling them up) or use forward-slashes instead. For example:
LOAD CSV WITH HEADERS FROM "file:///C:\\Users\\Chandra Harsha\\Downloads\\neo4j_module_datasets\\test.csv" as line
MERGE (n:Node{name:line.Source})
MERGE (m:Node{name:line.Target})
MERGE (n)-[:TO{distance:line.dist}]->(m)
However - Neo4j doesn't allow you to load files from arbitrary places on the filesystem anyway as a security precaution. Instead, you should move the file you want to import into the import folder under your database. If you're using Neo4j Desktop, you can find this by selecting your database project and clicking the little down-arrow icon on 'Open Folder', choosing 'Import' from the list.
Drop the CSV in there, then just use its filename in the file URL. For example:
LOAD CSV WITH HEADERS FROM "file:///test.csv" as line
MERGE (n:Node{name:line.Source})
MERGE (m:Node{name:line.Target})
MERGE (n)-[:TO{distance:line.dist}]->(m)
You can still use folders under the import directory - anything starting file:/// will be resolved relative to that import folder, so things like the following are also fine:
LOAD CSV WITH HEADERS FROM "file:///neo4j_module_datasets/test.csv" as line
MERGE (n:Node{name:line.Source})
MERGE (m:Node{name:line.Target})
MERGE (n)-[:TO{distance:line.dist}]->(m)
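If you're generating the LOAD CSV statement from a script, a path library can build the forward-slash file URL for you, so the backslash problem never comes up. A minimal sketch in Python (the path here is made up for illustration):

```python
from pathlib import PureWindowsPath

# A Windows path written as a raw string, so Python itself doesn't
# swallow the backslashes either.
csv_path = PureWindowsPath(r"C:\Users\alice\Downloads\neo4j_module_datasets\test.csv")

# as_uri() emits forward slashes, which Cypher accepts as-is --
# no backslash escaping needed inside the LOAD CSV string.
url = csv_path.as_uri()
print(url)  # file:///C:/Users/alice/Downloads/neo4j_module_datasets/test.csv
```

(Remember that a standard Neo4j install will still only resolve such URLs relative to its import folder, as described above.)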

Related

How to upload Polygons from GeoPandas to Snowflake?

I have a geometry column of a geodataframe populated with polygons and I need to upload these to Snowflake.
I have been exporting the geometry column of the geodataframe to file and have tried both CSV and GeoJSON formats, but so far I either get an error or the staging table winds up empty.
Here's my code:
design_gdf['geometry'].to_csv('polygons.csv', index=False, header=False, sep='|', compression=None)
import sqlalchemy
from sqlalchemy import create_engine
from snowflake.sqlalchemy import URL
engine = create_engine(
    URL(<Snowflake Credentials Here>)
)
with engine.connect() as con:
    con.execute("PUT file://<path to polygons.csv> @~ AUTO_COMPRESS=FALSE")
Then on Snowflake I run
create or replace table DB.SCHEMA.DESIGN_POLYGONS_STAGING (geometry GEOGRAPHY);
copy into DB.SCHEMA."DESIGN_POLYGONS_STAGING"
from @~/polygons.csv
FILE_FORMAT = (TYPE = CSV FIELD_DELIMITER = '|' SKIP_HEADER = 1 compression = None encoding = 'iso-8859-1');
Generates the following error:
"Number of columns in file (6) does not match that of the corresponding table (1), use file format option error_on_column_count_mismatch=false to ignore this error File '@~/polygons.csv.gz', line 3, character 1 Row 1 starts at line 2, column "DESIGN_POLYGONS_STAGING"[6] If you would like to continue loading when an error is encountered, use other values such as 'SKIP_FILE' or 'CONTINUE' for the ON_ERROR option. For more information on loading options, please run 'info loading_data' in a SQL client."
Can anyone identify what I'm doing wrong?
Inspired by @Simeon_Pilgrim's comment I went back to Snowflake's documentation. There I found an example of converting a string literal to a GEOGRAPHY.
https://docs.snowflake.com/en/sql-reference/functions/to_geography.html#examples
select to_geography('POINT(-122.35 37.55)');
My polygons looked like strings describing Polygons more than actual GEOGRAPHYs so I decided I needed to be treating them as strings and then calling TO_GEOGRAPHY() on them.
I quickly discovered that they needed to be explicitly enclosed in single quotes and copied into a VARCHAR column in the staging table. This was accomplished by modifying the CSV export code:
import csv
design_gdf['geometry'].to_csv(<path to polygons.csv>,
index=False, header=False, sep='|', compression=None, quoting=csv.QUOTE_ALL, quotechar="'")
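To see what those quoting options actually produce, here's a minimal sketch using the csv module directly, with a stand-in WKT value (the polygon itself is made up):

```python
import csv, io

buf = io.StringIO()
# QUOTE_ALL plus quotechar="'" wraps every field in single quotes,
# matching the export options used above.
writer = csv.writer(buf, delimiter="|", quoting=csv.QUOTE_ALL,
                    quotechar="'", lineterminator="\n")
writer.writerow(["POLYGON((0 0, 1 0, 1 1, 0 0))"])
print(buf.getvalue())  # 'POLYGON((0 0, 1 0, 1 1, 0 0))'
```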
The staging table now looks like:
create or replace table DB.SCHEMA."DESIGN_POLYGONS_STAGING" (geometry VARCHAR);
I ran into further problems copying into the staging table related to the presence of a polygons.csv.gz file I must have uploaded in a previous experiment. I deleted this file using:
remove @~/polygons.csv.gz
Finally, I converted the staging data to GEOGRAPHY:
create or replace table DB.SCHEMA."DESIGN_GEOGRAPHY" (geometry GEOGRAPHY);
insert into DB.SCHEMA."DESIGN_GEOGRAPHY"
select to_geography(geometry)
from DB.SCHEMA."DESIGN_POLYGONS_STAGING"
and I wound up with a DESIGN_GEOGRAPHY table with a single column of GEOGRAPHYs in it. Success!!!

CSV Data import in neo4j

I am trying to add relationship between existing employee nodes in my sample database from csv file using the following commands:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM
'file:///newmsg1.csv' AS line
WITH line
MATCH (e:Employee {mail: line.fromemail}), (b:Employee {mail: line.toemail})
CREATE (e)-[m:Message]->(b);
The problem I am facing is that, while there are only 71,253 entries in the csv file, each with a "fromemail" and "toemail",
I am getting "Created 240643 relationships, completed after 506170 ms." as the output. I am not able to understand what I am doing wrong. Kindly help me. Thanks in advance!
You can use MERGE to ensure uniqueness of relationships:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM
'file:///newmsg1.csv' AS line
WITH line
MATCH (e:Employee {mail: line.fromemail}), (b:Employee {mail: line.toemail})
MERGE (e)-[m:Message]->(b);
Try changing your CREATE to CREATE UNIQUE (note that CREATE UNIQUE was deprecated in later Neo4j versions in favor of MERGE):
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM
'file:///newmsg1.csv' AS line
WITH line
MATCH (e:Employee {mail: line.fromemail}), (b:Employee {mail: line.toemail})
CREATE UNIQUE (e)-[m:Message]->(b);
From the docs:
CREATE UNIQUE is in the middle of MATCH and CREATE — it will match
what it can, and create what is missing. CREATE UNIQUE will always
make the least change possible to the graph — if it can use parts of
the existing graph, it will.
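A count higher than the row count usually means a row fanned out: either the CSV contains repeated rows, or several Employee nodes share the same mail value, so one row matches multiple (e, b) pairs. Checking the CSV side is easy; a sketch (the column name fromemail is taken from the question, the sample data is made up):

```python
import csv, io
from collections import Counter

def duplicate_emails(csv_text):
    """Return fromemail values that appear in more than one row."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts[row["fromemail"]] += 1
    return {email: n for email, n in counts.items() if n > 1}

sample = "fromemail,toemail\na@x.com,b@x.com\na@x.com,c@x.com\n"
print(duplicate_emails(sample))  # {'a@x.com': 2}
```

Repeats here are expected (one sender can message many people); checking for duplicate mail values on the Employee nodes themselves would need a Cypher aggregation instead.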

Neo4j - how to import csv having special characters

I am trying to import a csv file which has some special characters like ', ", and \. When I run this query -
LOAD CSV WITH HEADERS FROM "http://192.168.11.121/movie-reco-db/movie_node.csv" as row
CREATE (:Movie_new {movieId: row.movie_id, title: row.title, runtime: row.runtime, metaScore: row.meta_score, imdbRating: row.IMDB_rating, imdbVotes: row.IMDB_votes , released: row.released , description: row.description , watchCount: row.watch_count , country: row.country ,
category: row.category , award: row.award , plot: row.plot , timestamp: row.timestamp})
It throws an error -
At http://192.168.11.121/movie-reco-db/movie_node.csv:35 - there's a field starting with a quote and whereas it ends that quote there seems to be characters in that field after that ending quote. That isn't supported. This is what I read: ' Awesome skills performed by skillful players! Football Circus! If you like my work, become a fan on facebook",0,USA,Football,N/A,N/A,1466776986
260,T.J. Miller: No Real Reason,0,0,7.4,70,2011-11-15," '
I debugged the file and removed the \ from the line causing the problem, then re-ran the above query and it worked. Is there a better way to handle this in the query, or do I have to find and remove such characters each time?
It doesn't appear that the quote character is configurable for LOAD CSV (like the FIELDTERMINATOR is):
http://neo4j.com/docs/developer-manual/current/cypher/#csv-file-format
If you're creating a new Database, you could look into using the neo4j-import tool which does appear to have the option to configure the quote character (and then just set it to a character you know won't be in your csv files (say for example a pipe symbol |):
http://neo4j.com/docs/operations-manual/current/deployment/#import-tool
Otherwise, yes, you're going to need to run a cleansing process on your input files prior to loading.
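A minimal sketch of such a cleansing pass, assuming the only problem is backslash-escaped quotes: rewrite \" as the doubled quote "" that LOAD CSV does understand (RFC 4180 style).

```python
def fix_escaped_quotes(line: str) -> str:
    # LOAD CSV expects an embedded quote to be doubled (""), not
    # backslash-escaped (\"), so rewrite one style into the other.
    return line.replace('\\"', '""')

# Made-up sample line with a backslash-escaped quote inside a field.
raw = '1,"become a fan on facebook \\" page",USA'
print(fix_escaped_quotes(raw))  # 1,"become a fan on facebook "" page",USA
```

If the files also contain stray unescaped quotes or bare backslashes, you'd need a more careful pass (e.g. re-writing the file through a proper CSV parser) rather than a blind string replace.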

Cannot successfully import csv to neo4j

I am trying to import a CSV file to neo4j database using the following query:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///I:/Traces.csv" AS row
MERGE (e:Event {SystemCall: coalesce(row.syscall, "No Value"), ReturnValue: coalesce(row.retvalue,"No Value"), ReturnTime: coalesce(row.rettime,"No Value"), CallTime: coalesce(row.calltime,"No Value")})
MERGE (s:Subject {ProcessName: coalesce(row.processname, "No Value"), Pid: coalesce(row.pid, "No Value"), tid: coalesce(row.tid, "No Value")})
MERGE (o:Object {Argument1: coalesce(row.arg1, "No Value"), Argument2: coalesce(row.arg2, "No Value")})
MERGE (e)-[:IS_GENERATED_BY]->(s)
MERGE (e)-[:AFFECTS]->(o)
MERGE (e)-[:AFFECTS] ->(s)
The CSV file is hosted at location: https://drive.google.com/open?id=0B8vCvM9jIcTzRktRTGpxOUZXQjA
The query takes almost 80,000 milliseconds to run but returns no rows. Please help.
There are two problems.
You must not have superfluous whitespace around the comma delimiters in your CSV file. (Also, you should eliminate the 11 extra commas at the end of each line.)
Every field name in your CSV file header has to match the corresponding property name in your Cypher query. Currently, most of the names are different.
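Both fixes can be scripted rather than done by hand. A sketch, assuming a comma-delimited file (the header names are borrowed from the question, the sample is made up):

```python
import csv, io

def clean_rows(csv_text):
    """Strip whitespace around each cell and drop trailing empty columns."""
    cleaned = []
    for row in csv.reader(io.StringIO(csv_text)):
        cells = [cell.strip() for cell in row]
        while cells and cells[-1] == "":   # the dangling trailing commas
            cells.pop()
        cleaned.append(cells)
    return cleaned

sample = "syscall , retvalue ,,,\nopen , 0 ,,,\n"
print(clean_rows(sample))  # [['syscall', 'retvalue'], ['open', '0']]
```

Writing the cleaned rows back out with csv.writer would then give a file LOAD CSV can consume directly.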
By the way, when talking about a Cypher query, the term "return" almost always refers to the RETURN clause. Your question has nothing to do with a RETURN clause, and is actually about data creation.
[UPDATE]
In addition, if you want to import a huge amount of data into a brand new neo4j DB, you should consider using the Import tool instead. It would be much faster than LOAD CSV.

neo4j LOAD CSV with Tabs

I am trying to load a csv and create nodes in neo4j 2.1.0 using the following:
USING PERIODIC COMMIT
LOAD CSV FROM "file://c:/temp/listings.TXT"
AS line FIELDTERMINATOR '\t'
CREATE (p:person { id: line[0] });
The columns are separated using 0x9 (tab) characters. But the created nodes have the entire row content in the id.
Any help is greatly appreciated.
Try FIELDTERMINATOR '\\t' instead;
that's what worked for me.
Try casting with toint(line[0]), since the default type when importing is string.
Open your csv file with Notepad++ and try deleting the quote characters (") or replacing them with spaces.
I had the same problem, but once I removed all the " characters it worked.
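Before editing the file or the query, it can be worth confirming what the delimiter actually is; the standard library's csv.Sniffer can guess it (the sample data here is made up):

```python
import csv

# Made-up sample: three tab-separated lines.
sample = "id\tname\n1\tAlice\n2\tBob\n"

# Restrict the sniffer to plausible candidates so it doesn't
# guess something exotic from the data.
dialect = csv.Sniffer().sniff(sample, delimiters="\t,;|")
print(repr(dialect.delimiter))  # '\t'
```

If the sniffer reports something other than a tab, no FIELDTERMINATOR setting will split the rows the way you expect.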
