Neo4j CSV file import Error - neo4j

I am trying to import a '|'-delimited CSV file into Neo4j, and it returns the error below. "Tried to read a field larger than buffer size 2097152. A common cause of this is that a field has an unterminated quote and so will try to seek until the next quote, which ever line it may be on. This should not happen if multi-line fields are disabled, given that the fields contains no new-line characters. This field started at C:\Users\10077\Documents\Neo4j\default.graphdb\import\customer.csv:0" Please help me resolve this.

The following definitely works:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///orders.csv" AS row FIELDTERMINATOR ','
CREATE (:ORDERS {OORDERKEY : row.O_ORDERKEY , OCUSTKEY: row.O_CUSTKEY, OORDERSTATUS: row.O_ORDERSTATUS,OTOTALPRICE: row.O_TOTALPRICE, OORDERDATE: row.O_ORDERDATE, OORDERPRIORITY: row.O_ORDERPRIORITY,OCLERK: row.O_CLERK, OSHIPPRIORITY: row.O_SHIPPRIORITY,OCOMMENT: row.O_COMMENT});
With this data:
O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT
1,36901,O,173665.47,02-01-1996,5-LOW,Clerk#000000951,0,nstructions sleep furiously among
2,78002,O,46929.18,01-12-1996,1-URGENT,Clerk#000000880,0, foxes. pending accounts at the pending silent asymptot
3,123314,F,193846.25,14-10-1993,5-LOW,Clerk#000000955,0,sly final accounts boost. carefully regular ideas cajole carefully. depos
4,136777,O,32151.78,11-10-1995,5-LOW,Clerk#000000124,0,sits. slyly regular warthogs cajole. regular regular theodolites acro
Can you check your end-of-line characters?
Hope this helps,
Regards,
Tom
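One way to run the check suggested above is a small script that looks at the raw bytes of the file. This is only a sketch: the function name inspect_csv_bytes is made up for illustration, and an odd number of quote characters on a line is a heuristic hint of an unterminated quote, not proof.

```python
def inspect_csv_bytes(data: bytes, quote: bytes = b'"') -> dict:
    """Count line-ending styles and list lines with an odd number of quotes."""
    crlf = data.count(b"\r\n")
    endings = {
        "CRLF": crlf,
        "LF": data.count(b"\n") - crlf,  # LF not preceded by CR
        "CR": data.count(b"\r") - crlf,  # CR not followed by LF
    }
    # Line numbers are 1-based; an odd quote count suggests an unterminated quote.
    odd_quote_lines = [
        number
        for number, line in enumerate(data.splitlines(), start=1)
        if line.count(quote) % 2 == 1
    ]
    return {"endings": endings, "odd_quote_lines": odd_quote_lines}
```

To use it on a real file, read the file in binary mode (open(path, "rb").read()) and pass the bytes in. A mix of ending styles or a non-empty odd_quote_lines list would be the first things to look at.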

Related

How to remove 0xa0 in Neo4j csv data?

I have tried
replace(row.field, '0xa0', '')
which didn't work; it still inserts whatever is present in the .csv file.
\xa0 is actually non-breaking space in Latin1 (ISO 8859-1), also chr(160). You should replace it with a space.
replace(u'\xa0', u' ')
Let me know if it works, and give me a sample CSV file to test it with.
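Note that the string '0xa0' in the original attempt is the four literal characters 0, x, a, 0, not the single non-breaking space character, which is why nothing was replaced. If cleaning the file before import is an option, a minimal Python sketch could look like this. The helper name strip_nbsp is hypothetical, and the sample assumes the file is encoded as Latin-1.

```python
NBSP = "\u00a0"  # non-breaking space; same code point as Latin-1 byte 0xA0

def strip_nbsp(text: str) -> str:
    """Replace every non-breaking space with an ordinary space."""
    return text.replace(NBSP, " ")

# Assumption: the raw bytes are Latin-1, so byte 0xA0 decodes to U+00A0.
raw = b"Acme\xa0Corp,42"
cleaned = strip_nbsp(raw.decode("latin-1"))
```

Here cleaned holds the row with a normal space between the two words, ready to be written back out before running LOAD CSV.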

Pipe character ignored in SPSS syntax

I am trying to use the pipe character "|" in SPSS syntax with strange results:
In the syntax it appears like this:
But when I copy this line from the syntax window to here, this is what I get:
SELECT IF(SEX = 1 SEX = 2).
The pipe just disappears!
If I run this line, this is the output:
SELECT IF(SEX = 1 SEX = 2).
Error # 4007 in column 20. Text: SEX
The expression is incomplete. Check for missing operands, invalid operators,
unmatched parentheses or excessive string length.
Execution of this command stops.
So the pipe is invisible to the program too!
When I save this syntax and reopen it, the pipe is gone...
The only way I found to get SPSS to work with the pipe was to edit the syntax (adding the pipe) and save it in an alternative editor (Notepad++ in this case). Then, without opening the syntax, I ran it from another syntax using the INSERT command, and it worked.
EDIT: some background info:
I have SPSS version 23 (+service pack 3) 64 bit.
The same thing happens whether I use my locale (encoding: windows-1255) or Unicode (encoding: UTF-8). Suspecting my Hebrew keyboard, I tried copying syntax from the web, with the same results.
Can anyone shed any light on this subject?
It turns out (according to SPSS support) that this is a version-specific bug (ver. 21) that was fixed in later versions.

Parsing FETCH multiple UID

I need to write IMAP code to parse the result of this command:
tag FETCH 1,2,3,4 ALL
Most of the time, the response is something like this
* 1 FETCH (FLAGS ... ) ENVELOPE ("time" "subject" ... )\r\n
* 2 FETCH (FLAGS ... ) ENVELOPE ("time" "subject" ... )\r\n
....
tag OK FETCH COMPLETE
And so on, where each envelope starts with an asterisk and a UID and ends with a CRLF, so I can use the CRLF as a parse point.
The problem is that some servers respond using IMAP string literals, i.e. {150}\r\n .... and since the \r\n is part of the string literal, I can no longer use it as a parse point.
One idea is to use the * UID as a parse point, but if someone coincidentally uses that as an email subject, or whatnot, it will break the algorithm, so I believe it's a bad idea to do that.
Can someone tell me how to effectively parse this type of response without using CRLF? Thank you very much.
edit - Hopefully to improve the question: I am trying to parse each individual ENVELOPE into its own string based on parse points, so I need a parse point that identifies the end of one string and the start of the next.
The trick you need is one to distinguish the two kinds of line feeds, and it exists: Start reading, read until you see a CRLF, then look at the start of what you have. Is it tag space OK, NO, BAD or PREAUTH? If so, you have a complete response. If not, look at the last 10-15 characters. Are they "{", a number, optionally a plus sign, and "}" and the CRLF? If so, read until the next CRLF and repeat. If not, you have a complete response.
Note that in IMAP, you have to act on a response before you can parse the next one. MSN handling breaks if you don't, and there may be other problems too.
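The idea of telling the two kinds of CRLF apart can be sketched as follows. This is a variant of the steps above rather than a literal transcription: when a line ends in a literal marker, it reads exactly the announced number of bytes instead of scanning for the next CRLF, since the literal body may itself contain CRLF. The names read_response and split_responses are made up for illustration, and the stream is assumed to be a buffered binary file-like object (for example, one obtained from a socket).

```python
import io
import re

# Matches a literal marker such as {150} or {150+} at the very end of a line.
LITERAL_AT_END = re.compile(rb"\{(\d+)\+?\}\r\n$")

def read_response(stream) -> bytes:
    """Read one complete IMAP response, including any embedded literals."""
    chunks = []
    while True:
        line = stream.readline()  # reads up to and including the next LF
        if not line:
            raise EOFError("stream ended in the middle of a response")
        chunks.append(line)
        match = LITERAL_AT_END.search(line)
        if match is None:
            return b"".join(chunks)  # ordinary CRLF: the response is complete
        size = int(match.group(1))
        chunks.append(stream.read(size))  # literal body; may contain CRLF

def split_responses(data: bytes) -> list:
    """Split an already-buffered transcript into individual responses."""
    stream = io.BytesIO(data)
    responses = []
    while stream.tell() < len(data):
        responses.append(read_response(stream))
    return responses
```

Each returned item is one full response, so checking whether it starts with the tag followed by OK, NO or BAD can be done afterwards on the complete bytes.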

RAILS 3 CSV "Illegal quoting" is a lie

I've hit a problem during parsing of a CSV file where I get the following error:
CSV::MalformedCSVError: Illegal quoting on line 3.
RAILS code in question:
csv = CSV.read(args.local_file_path, col_sep: "\t", headers: true)
Line 3 in the CSV file is:
A-067067 VO VIA CE 0 8 8 SWCH Ter 4, Loc Is Here, Mne, Per Fl Auia/Sey IMAC NEK_HW 2011-03-09 09:47:44 2011-03-09 11:50:26 2011-01-13 10:49:17 2011-02-14 14:02:43 2011-02-14 14:02:44 0 0 771 771 46273 "[O/H 15/02] B270 W31 ""TEXT TEXT 2 X TEXT SWITC" SOME_TEXT SOME_TEXT N/A Name Here RESOLVED_CLOSED RESOLVED_CLOSED
UPDATE: Tabs don't appear to have come across above. See pastebin RAW TEXT: http://pastebin.com/4gj7iUpP
I've read numerous threads all over StackOverflow and Google about why this is and I understand that. But the CSV row above has perfectly legal quoting does it not?
The CSV is tab delimited and there is only a tab followed by the quote on either side of the column in question. There is 1 quote in that field and it is double quoted to escape it. So what gives? I can't work it out. :(
Assuming I've got something wrong here, I'd like the solution to include a way to work around the issue as I don't have control over how the CSV is constructed.
This part of your CSV is at fault:
46273 "[O/H 15/02] B270 W31 ""TEXT TEXT 2 X TEXT SWITC" SOME_TEXT
At least one of these parts has a stray space:
46273 "
" SOME_TEXT
I'd guess that the "3" and the double quote are supposed to be separated by one or more tabs, but there is a space before the quote. Or there is a space after the quote on the other end, when there are only supposed to be tabs between the closing quote and the "S".
CSV escapes double quotes by doubling them, so this:
"[O/H 15/02] B270 W31 ""TEXT TEXT 2 X TEXT SWITC"
is supposed to be a single field that contains an embedded quote:
[O/H 15/02] B270 W31 "TEXT TEXT 2 X TEXT SWITC
If you have a space before the first quote or after the last quote then, since your fields are tab delimited, you have an unescaped double quote inside a field and that's where your "illegal quoting" error comes from.
Try sending your CSV file through cat -t (which should represent tabs as ^I) to find where the stray space is.
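If cat -t is not available, the same inspection can be approximated in a few lines of Python. This is a rough sketch with hypothetical helper names: show_tabs mimics the ^I display, and suspicious_fields is a heuristic that flags any tab-separated field that contains a quote but is not wrapped in quotes on both ends. It does not handle quoted fields that span several lines.

```python
def show_tabs(line: str) -> str:
    """Make tabs visible, similar to what cat -t prints."""
    return line.replace("\t", "^I")

def suspicious_fields(line: str, sep: str = "\t", quote: str = '"') -> list:
    """Return (index, field) pairs for fields with a quote that is not at both ends."""
    problems = []
    for index, field in enumerate(line.rstrip("\r\n").split(sep)):
        if quote not in field:
            continue
        wrapped = len(field) >= 2 and field.startswith(quote) and field.endswith(quote)
        if not wrapped:
            problems.append((index, field))
    return problems
```

A line where a space sits where a tab should be shows up as a flagged field whose text starts or ends with something other than the quote character.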

How does FasterCSV determine whether or not to add quote?

When I try to output some data into a text file using FasterCSV, sometimes it adds the quotes to the concatenated string and sometimes it does not.
For instance:
FasterCSV.generate do |csv|
csv << ["E"+company_code]
csv << ["A"+company_name]
end
Both company_code and company_name are Strings and contain data, but the output will show:
EtheCompanyCode
"AtheCompanyName"
I found how to force quoting in FasterCSV's docs but I need exactly the opposite and can not figure out why it quotes one line and not the other when they are both strings...
If anybody has the solution, I'll be deeply grateful for a lead :)
Thanks
If the real input is 'theCompanyName' and 'theCompanyCode' then I would also be confused by one line being quoted and the other not. But I suspect your real input is something else.
Most likely, the quoted line has some character that needs quoting, such as a comma; while the unquoted line doesn't. (Other characters that typically need quoting in Excel-style CSVs are quotation marks and newlines.)
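The same "quote only when needed" behaviour can be seen in Python's standard csv module, which is used here purely as an analogy; FasterCSV's exact rules may differ in detail. The helper name to_csv_line is made up for the example.

```python
import csv
import io

def to_csv_line(value: str) -> str:
    """Write a single-field row with the writer's default (minimal) quoting."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow([value])
    return buffer.getvalue()

plain = to_csv_line("EtheCompanyCode")      # no special characters
with_comma = to_csv_line("ACompany, Inc.")  # contains the delimiter
with_quote = to_csv_line('Asays "hi"')      # contains the quote character
```

Only the values containing a comma or a quotation mark come back wrapped in quotes, which is consistent with the explanation that something in the real company name is triggering the quoting.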
