I am trying to import a CSV file that contains some special characters like ', ", and \. When I run this query -
LOAD CSV WITH HEADERS FROM "http://192.168.11.121/movie-reco-db/movie_node.csv" as row
CREATE (:Movie_new {movieId: row.movie_id, title: row.title, runtime: row.runtime,
  metaScore: row.meta_score, imdbRating: row.IMDB_rating, imdbVotes: row.IMDB_votes,
  released: row.released, description: row.description, watchCount: row.watch_count,
  country: row.country, category: row.category, award: row.award, plot: row.plot,
  timestamp: row.timestamp})
It throws an error -
At http://192.168.11.121/movie-reco-db/movie_node.csv:35 - there's a field starting with a quote and whereas it ends that quote there seems to be characters in that field after that ending quote. That isn't supported. This is what I read: ' Awesome skills performed by skillful players! Football Circus! If you like my work, become a fan on facebook",0,USA,Football,N/A,N/A,1466776986
260,T.J. Miller: No Real Reason,0,0,7.4,70,2011-11-15," '
I debugged the file and removed the \ from the line causing the problem, then re-ran the above query, and it worked. Is there a better way to handle this in the query, or do I have to find and remove such characters every time?
It doesn't appear that the quote character is configurable for LOAD CSV (like the FIELDTERMINATOR is):
http://neo4j.com/docs/developer-manual/current/cypher/#csv-file-format
If you're creating a new database, you could look into using the neo4j-import tool, which does appear to have an option to configure the quote character; you could then set it to a character you know won't be in your CSV files (for example, a pipe symbol |):
http://neo4j.com/docs/operations-manual/current/deployment/#import-tool
Otherwise, yes, you're going to need to run a cleansing process on your input files prior to loading.
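If you'd rather automate that cleansing step, one option is to round-trip the file through a CSV library that re-quotes every field and doubles any backslashes, so a stray \ can no longer escape a closing quote. A minimal sketch in Python (the function name and the doubling strategy are illustrative, not anything Neo4j prescribes):

```python
import csv
import io

def clean_csv(text):
    """Re-quote every field and double backslashes so a backslash
    just before a closing quote no longer breaks LOAD CSV."""
    reader = csv.reader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_ALL)
    for row in reader:
        # Doubling turns an escaping backslash into a literal one.
        writer.writerow([field.replace('\\', '\\\\') for field in row])
    return out.getvalue()
```

You could run your input files through something like this (or an equivalent sed/awk one-liner) before pointing LOAD CSV at them.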
Related
I have a geometry column of a geodataframe populated with polygons and I need to upload these to Snowflake.
I have been exporting the geometry column of the geodataframe to a file, and have tried both CSV and GeoJSON formats, but so far I either get an error or the staging table winds up empty.
Here's my code:
design_gdf['geometry'].to_csv('polygons.csv', index=False, header=False, sep='|', compression=None)
import sqlalchemy
from sqlalchemy import create_engine
from snowflake.sqlalchemy import URL
engine = create_engine(
    URL(<Snowflake Credentials Here>)
)
with engine.connect() as con:
    con.execute("PUT file://<path to polygons.csv> @~ AUTO_COMPRESS=FALSE")
Then on Snowflake I run
create or replace table DB.SCHEMA.DESIGN_POLYGONS_STAGING (geometry GEOGRAPHY);
copy into DB.SCHEMA."DESIGN_POLYGONS_STAGING"
from @~/polygons.csv
FILE_FORMAT = (TYPE = CSV FIELD_DELIMITER = '|' SKIP_HEADER = 1 compression = None encoding = 'iso-8859-1');
Generates the following error:
"Number of columns in file (6) does not match that of the corresponding table (1), use file format option error_on_column_count_mismatch=false to ignore this error File '@~/polygons.csv.gz', line 3, character 1 Row 1 starts at line 2, column "DESIGN_POLYGONS_STAGING"[6] If you would like to continue loading when an error is encountered, use other values such as 'SKIP_FILE' or 'CONTINUE' for the ON_ERROR option. For more information on loading options, please run 'info loading_data' in a SQL client."
Can anyone identify what I'm doing wrong?
Inspired by @Simeon_Pilgrim's comment, I went back to Snowflake's documentation. There I found an example of converting a string literal to a GEOGRAPHY.
https://docs.snowflake.com/en/sql-reference/functions/to_geography.html#examples
select to_geography('POINT(-122.35 37.55)');
My polygons looked more like strings describing polygons than actual GEOGRAPHY values, so I decided I needed to treat them as strings and then call TO_GEOGRAPHY() on them.
I quickly discovered that they needed to be explicitly enclosed in single quotes and copied into a VARCHAR column in the staging table. This was accomplished by modifying the CSV export code:
import csv
design_gdf['geometry'].to_csv(<path to polygons.csv>,
    index=False, header=False, sep='|', compression=None,
    quoting=csv.QUOTE_ALL, quotechar="'")
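For illustration, the effect of quoting=csv.QUOTE_ALL with quotechar="'" can be sketched with the standard csv module (pandas' to_csv accepts the same two keyword arguments; the WKT string below is made up):

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf, delimiter='|', quoting=csv.QUOTE_ALL, quotechar="'")
# Every field gets wrapped in single quotes, which is what
# TO_GEOGRAPHY later receives as a string literal.
writer.writerow(['POLYGON ((-122.35 37.55, -122.35 37.56, -122.34 37.55, -122.35 37.55))'])
print(buf.getvalue())
```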
The staging table now looks like:
create or replace table DB.SCHEMA."DESIGN_POLYGONS_STAGING" (geometry VARCHAR);
I ran into further problems copying into the staging table related to the presence of a polygons.csv.gz file I must have uploaded in a previous experiment. I deleted this file using:
remove @~/polygons.csv.gz
Finally, I converted the staging table to GEOGRAPHY:
create or replace table DB.SCHEMA."DESIGN_GEOGRAPHY_STAGING" (geometry GEOGRAPHY);
insert into DB.SCHEMA."DESIGN_GEOGRAPHY_STAGING"
select to_geography(geometry)
from DB.SCHEMA."DESIGN_POLYGONS_STAGING"
and I wound up with a DESIGN_GEOGRAPHY_STAGING table with a single column of GEOGRAPHY values in it. Success!
I am new to Neo4j and can't figure out what I am doing wrong. Please help.
LOAD CSV WITH HEADERS FROM "file:///C:\Users\Chandra Harsha\Downloads\neo4j_module_datasets\test.csv" as line
MERGE (n:Node{name:line.Source})
MERGE (m:Node{name:line.Target})
MERGE (n)-[:TO{distance:line.dist}]->(m)
Error message:
Invalid input 's': expected four hexadecimal digits specifying a unicode character (line 1, column 41 (offset: 40))
"LOAD CSV WITH HEADERS FROM "file:///C:\Users\Chandra Harsha\Downloads\neo4j_module_datasets\test.csv" as line"
You've got two problems here.
The immediate error is that the backslashes in the path are being seen as escape characters rather than folder separators - you either have to escape the backslashes (by just doubling them up) or use forward-slashes instead. For example:
LOAD CSV WITH HEADERS FROM "file:///C:\\Users\\Chandra Harsha\\Downloads\\neo4j_module_datasets\\test.csv" as line
MERGE (n:Node{name:line.Source})
MERGE (m:Node{name:line.Target})
MERGE (n)-[:TO{distance:line.dist}]->(m)
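If you're generating the query from a script, you can sidestep the escaping issue entirely by converting the Windows path to forward slashes first. A sketch in Python, using the path from the question:

```python
from pathlib import PureWindowsPath

# Convert backslash separators to forward slashes for the file:/// URL.
csv_path = PureWindowsPath(r"C:\Users\Chandra Harsha\Downloads\neo4j_module_datasets\test.csv")
url = "file:///" + csv_path.as_posix()
print(url)  # file:///C:/Users/Chandra Harsha/Downloads/neo4j_module_datasets/test.csv
```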
However - Neo4j doesn't allow you to load files from arbitrary places on the filesystem anyway as a security precaution. Instead, you should move the file you want to import into the import folder under your database. If you're using Neo4j Desktop, you can find this by selecting your database project and clicking the little down-arrow icon on 'Open Folder', choosing 'Import' from the list.
Drop the CSV in there, then just use its filename in your query. So for example:
LOAD CSV WITH HEADERS FROM "file:///test.csv" as line
MERGE (n:Node{name:line.Source})
MERGE (m:Node{name:line.Target})
MERGE (n)-[:TO{distance:line.dist}]->(m)
You can still use folders under the import directory - anything starting file:/// will be resolved relative to that import folder, so things like the following are also fine:
LOAD CSV WITH HEADERS FROM "file:///neo4j_module_datasets/test.csv" as line
MERGE (n:Node{name:line.Source})
MERGE (m:Node{name:line.Target})
MERGE (n)-[:TO{distance:line.dist}]->(m)
I'm searching for how to convert a UICollectionView to a CSV file and send it with Mail.
I have a collection view like the one in the photo, and I want to export the table and send it. I searched and found that the best way is to convert it to a CSV file.
If you have another suggestion, just tell me.
As @Larme has pointed out, converting this to a CSV file has nothing to do with the visual representation in the collection view. You simply need to write the collection view's data source out as CSV. CSV stands for Comma Separated Values: a file format in which tabular data is encoded using a delimiter between each data point (generally a comma, but it could be anything) and a new line for each row of the table. Think of the delimiter as the vertical line between each column of the table, and the new line as the row.
So your CSV text file might look like this:
TITLEFORCOLUMN1, TITLEFORCOLUMN2, TITLEFORCOLUMN3
ROWTITLEONE, 200, 300
ROWTITLETWO, 400, 500
and so on. It's not quite this simple, and there are rules that you should follow, especially if you intend the CSV file to be consumed by third parties. There is an official specification (RFC 4180) which you can look at, but you can also get a lot of tips by searching 'CSV file specification'.
You then need to create a string by iterating through your data source. Start off by creating the line specifying the headers, then add a newline character and then add your data. So for the above example you could do something like (assuming the data is set out as a two dimensional array)
var myCSVString: String = "TITLEFORCOLUMN1, TITLEFORCOLUMN2, TITLEFORCOLUMN3\n"
for lineItem in myDataSource {
    myCSVString += lineItem[0] + ", " + lineItem[1] + ", " + lineItem[2] + "\n"
}
Then write the string to file.
You'll need to do more research yourself but hopefully that will set you off in the right direction.
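For comparison, most languages ship a CSV library that applies those quoting rules for you, so hand-rolled string concatenation is only advisable for simple, well-controlled data. A sketch in Python (the rows are made up) showing how a field containing the delimiter gets quoted automatically:

```python
import csv
import io

rows = [
    ['TITLEFORCOLUMN1', 'TITLEFORCOLUMN2', 'TITLEFORCOLUMN3'],
    ['ROWTITLEONE', '200', '300'],
    ['contains, a comma', '400', '500'],
]
buf = io.StringIO()
csv.writer(buf).writerows(rows)  # quotes the third row's first field
print(buf.getvalue())
```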
We're using Open Search Server v1.4. When the user enters a search for the text "Refrigerator temperature chart (5" we create a URL something like
http://10.192.16.160:8080/services/rest/select/search/<indexname>/json?login=<login>&key=<apikey>&template=search&query=Refrigerator%20temperature%20chart%20%285&start=0&rows=1000&filter=fileType%3afile&lang=ENGLISH
This fails with ...
HTTP Status 500 - org.apache.cxf.interceptor.Fault:
com.jaeksoft.searchlib.SearchLibException:
com.jaeksoft.searchlib.query.ParseException:
org.apache.lucene.queryParser.ParseException: Cannot parse
'content:(Refrigerator temperature chart (5) OR content:("Refrigerator
temperature chart (5") OR
So adding the escape character %5C (an encoded backslash) before the open bracket fixes the query, like so...
http://10.192.16.160:8080/services/rest/select/search/<indexname>/json?login=<login>&key=<apikey>&template=search&query=Refrigerator%20temperature%20chart%20%5C%285&start=0&rows=1000&filter=fileType%3afile&lang=ENGLISH
Can someone point me to some documentation that lists all the special characters that can be used in an Open Search select query that need to be escaped when entered as part of the search string?
Yes, you are right: the characters listed in the section "Escaping Special Characters" on the page you linked also need to be escaped in OpenSearchServer.
We recently released a fix that allows escaping those characters in queries of type Search (field), for searched fields configured with a pattern mode.
Previously, escaping of characters was only available in queries of type Search (pattern).
(More information on these two kinds of queries here: http://www.opensearchserver.com/documentation/tutorials/functionalities.html#two-kinds-of-queries)
Regards,
Alexandre
I believe Open Search Server is based on Lucene. The query syntax for the Lucene engine is described here...
http://lucene.apache.org/core/2_9_4/queryparsersyntax.html
Lucene supports escaping special characters that are part of the query
syntax. The current list special characters are
+ - && || ! ( ) { } [ ] ^ " ~ * ? : \
To escape these character use the \ before the character. For example
to search for (1+1):2 use the query:
\(1\+1\)\:2
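A small helper that applies that escaping and then URL-encodes the result for the REST query string might look like this in Python (the function name is illustrative, not part of any OpenSearchServer API):

```python
import re
import urllib.parse

def escape_lucene(term):
    """Backslash-escape Lucene query-syntax characters."""
    return re.sub(r'([+\-&|!(){}\[\]^"~*?:\\])', r'\\\1', term)

query = escape_lucene('Refrigerator temperature chart (5')
print(query)                      # Refrigerator temperature chart \(5
print(urllib.parse.quote(query))  # Refrigerator%20temperature%20chart%20%5C%285
```

Note this escapes each & and | individually, which is harmless even though Lucene only treats the doubled && and || forms as operators.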
Suppose we have a textarea in which we put an example string. The textbox contains:
Earth is revolving around the Sun.
But at the time of saving, I pressed the Enter key after "the Sun". Now the statement in the textbox is:
Earth is revolving around
the Sun
Now, a \r is stored in the database where Enter was pressed. I am trying to fetch the data but am unable to, because my query looks like this:
SELECT * FROM "volume_factors" WHERE lower(volume_notes) like E'atest\\r\\n 100'
The actual data stored in the database field:
atest\r
100
Please suggest how I can solve this issue. I have tried gsub to replace it, but nothing changed:
search_text_array[1] = search_text_array[1].gsub('\\r\\n','\r\n')
Thanks in advance!
Try this:
update volume_factors set volume_notes = regexp_replace(volume_notes, '\r\n', ' ');
That replaces CRLF with one space for data that is already in the database. You can use PostgreSQL's psql to run it.
To prevent new data containing CRLF from entering the database, you should handle it in the application. If you use Ruby's gsub, do not use single quotes; use double quotes so that \n is recognized as a newline, like this:
thestring.gsub("\n", " ")
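The same normalization sketched in Python, in case that's where your application code lives (the function name is illustrative):

```python
import re

def normalize_newlines(text):
    """Collapse CRLF, bare CR, or bare LF into a single space."""
    return re.sub(r'\r\n|\r|\n', ' ', text)

print(normalize_newlines('atest\r\n 100'))  # atest  100
```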
Here we can replace \r\n with the % wildcard to fetch the data.
The search query will be like this:
SELECT * FROM "volume_factors" WHERE lower(volume_notes) like E'atest% 100'
The gsub call:
search_text_array[1] = search_text_array[1].gsub('\\r\\n','%')
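The substitution itself is trivial to sketch in Python as well (the function name is illustrative). Note that % matches any run of characters in LIKE, so this pattern would also match rows where something other than a newline sits between the two halves:

```python
def to_like_pattern(text):
    """Replace literal CRLF with the SQL LIKE wildcard %."""
    return text.replace('\r\n', '%')

print(to_like_pattern('atest\r\n 100'))  # atest% 100
```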