I have tried
replace(row.field, '0xa0', '')
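which didn't work; it still inserts whatever is present in the .csv file.
\xa0 is actually the non-breaking space in Latin1 (ISO 8859-1), also chr(160). Note that '0xa0' in quotes is just the four literal characters 0, x, a, 0, so it never matches. You should replace the character itself with a space.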
replace(u'\xa0', u' ')
Let me know if it works, and give me a sample csv file to test it out.
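To make the difference concrete, here is a minimal Python sketch. The field value is made up for illustration, and it assumes the data has already been decoded to a normal string:

```python
# Hypothetical field value containing a non-breaking space (U+00A0).
raw_field = "12\xa0345"

# The quoted text '0xa0' is four ordinary characters, so this changes nothing.
unchanged = raw_field.replace("0xa0", "")

# '\xa0' is the escape for the character itself, same as chr(160).
cleaned = raw_field.replace("\xa0", " ")

print(repr(unchanged))  # still contains \xa0
print(repr(cleaned))    # '12 345'
```

If the cleanup has to happen inside the import query instead, the same idea applies: the search argument must be the actual character, not its hex spelling.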
Goal:
I want a CSV file as my result. I want to change the space char to a comma on each line of data. BUT, I also need the data for the 3rd field (Description) to remain as is with original space chars. Each line of data is terminated with a newline char.
Flipping spaces to commas on every line is easy with regex. But how do I 'bookend' the string of text which will then become the 3rd/Description field and preserve its spaces? Currently I manually change commas back to spaces just in that text string. Painful.
Example of Final result needed (including column names)
Transaction Date,Posting Date,Description,Reference Number,Account Number,Amount
12/23,12/24,GOOGLE*DOMAINS SUPPORT.GOOGLCA,7811,8550,12.00
My sample data:
12/23 12/24 GOOGLE*DOMAINS SUPPORT.GOOGLCA 7811 8550 12.00
01/02 01/04 CREPEVINE - OAKLAND OAKLAND 234567 CA 1087 8220 16.32
01/06 01/07 AB* ABEBOOKS.CO J6YDBX HTTPSWWW.ABEBWA 6289 85332 6.98
01/20 01/21 SQ *BAGEL STREET CAFE Oakland CA 2313 44444 24.43
A few of My Regex attempts
This cmd changes spaces to commas over all 5 lines by combining it with Join cmd. Easy.
And just fyi: "\n" would not work for some reason, so I press <Ctrl+Enter> to inject a newline char, hence the two lines. For now it works fine.
=regexreplace(join("
",A1:A5)," ",",")
RESULT:
12/23,12/24,GOOGLE*DOMAINS,SUPPORT.GOOGLCA,7811,8550,12.00
...
01/02,01/04,CREPEVINE,-,OAKLAND,OAKLAND,CA,1087,8550,16.32
...
Here is my poor attempt to bookend the description field, then flip commas back to spaces, but no luck either.
=REGEXREPLACE(A1,"(,[A-Z]+[A-Z],)"," ")
How do I craft a regex to do this?
cheers,
Damon
Using Regex101 to reverse learn how you did it
Can you try:
=index(if(len(A:A),regexreplace(A:A,"(?U)(.*) (.*) (.*) (\d[^A-Za-z]*) (\d.*) (\d.*)","$1,$2,$3,$4,$5,$6"),))
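For anyone doing the same thing outside of Sheets, here is a rough Python sketch of the same idea. It is not a translation of the formula above: instead of lazy matching it anchors on both ends, assuming every line has two date tokens first and exactly three space-free tokens last, so whatever sits in between is the description. The function name is made up for the example.

```python
import re

# Assumed layout: two date tokens, a free-text description, then exactly
# three space-free tokens (reference number, account number, amount).
LINE_RE = re.compile(r"(\S+) (\S+) (.+) (\S+) (\S+) (\S+)")


def to_csv_line(line):
    """Turn one space-separated statement line into a comma-separated one."""
    match = LINE_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError("unexpected line layout: " + repr(line))
    return ",".join(match.groups())


sample = [
    "12/23 12/24 GOOGLE*DOMAINS SUPPORT.GOOGLCA 7811 8550 12.00",
    "01/20 01/21 SQ *BAGEL STREET CAFE Oakland CA 2313 44444 24.43",
]
for row in sample:
    print(to_csv_line(row))
# 12/23,12/24,GOOGLE*DOMAINS SUPPORT.GOOGLCA,7811,8550,12.00
# 01/20,01/21,SQ *BAGEL STREET CAFE Oakland CA,2313,44444,24.43
```

One caveat: a description that itself contains a comma would need quoting, so in that case writing the groups out with a proper CSV writer is probably safer than a plain join.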
I'm trying to import a '|'-delimited csv file into neo4j and it returns the error below. "Tried to read a field larger than buffer size 2097152. A common cause of this is that a field has an unterminated quote and so will try to seek until the next quote, which ever line it may be on. This should not happen if multi-line fields are disabled, given that the fields contains no new-line characters. This field started at C:\Users\10077\Documents\Neo4j\default.graphdb\import\customer.csv:0" Please help me to resolve this.
The following definitely works:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///orders.csv" AS row FIELDTERMINATOR ','
CREATE (:ORDERS {OORDERKEY : row.O_ORDERKEY , OCUSTKEY: row.O_CUSTKEY, OORDERSTATUS: row.O_ORDERSTATUS,OTOTALPRICE: row.O_TOTALPRICE, OORDERDATE: row.O_ORDERDATE, OORDERPRIORITY: row.O_ORDERPRIORITY,OCLERK: row.O_CLERK, OSHIPPRIORITY: row.O_SHIPPRIORITY,OCOMMENT: row.O_COMMENT});
With this data :
O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT
1,36901,O,173665.47,02-01-1996,5-LOW,Clerk#000000951,0,nstructions sleep furiously among
2,78002,O,46929.18,01-12-1996,1-URGENT,Clerk#000000880,0, foxes. pending accounts at the pending silent asymptot
3,123314,F,193846.25,14-10-1993,5-LOW,Clerk#000000955,0,sly final accounts boost. carefully regular ideas cajole carefully. depos
4,136777,O,32151.78,11-10-1995,5-LOW,Clerk#000000124,0,sits. slyly regular warthogs cajole. regular regular theodolites acro
Can you check your end-of-line characters?
Hope this helps,
Regards,
Tom
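If it helps, a quick way to check both suspects (line endings and an unterminated quote) is to look at the raw bytes. This is only a sketch in Python with made-up helper names and an in-memory sample; for the real file you would read it with open(path, "rb"). The odd-quote check is a heuristic, since a legitimate multi-line quoted field would also trip it.

```python
def summarize_line_endings(data):
    """Count CRLF, bare LF and bare CR line endings in raw bytes."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,
        "lf": data.count(b"\n") - crlf,
        "cr": data.count(b"\r") - crlf,
    }


def lines_with_odd_quotes(data):
    """Return 1-based numbers of lines holding an odd number of '"' bytes."""
    return [
        number
        for number, line in enumerate(data.splitlines(), start=1)
        if line.count(b'"') % 2 == 1
    ]


# Hypothetical sample: bare CR line endings and one unterminated quote.
sample = b'id|name\r1|"Acme\r2|Bolt\r'
print(summarize_line_endings(sample))  # {'crlf': 0, 'lf': 0, 'cr': 3}
print(lines_with_odd_quotes(sample))   # [2]
```

A file that reports only bare CR endings, or a line with a lone quote, would be consistent with the "field larger than buffer size" message, because the reader keeps scanning for the end of the field.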
Which iOS CSV parser can work with the encodings ISO-8859-15 and NSWindowsCP1252StringEncoding?
The file contains German characters such as ä, ö, and ü (a, o, u with two dots above).
I need to parse the CSV file and turn it into [[String]].
Did you try CHCSVParser? (ISO-8859-15 is not mentioned in the README, but it may be supported.)
https://github.com/davedelong/CHCSVParser
I've hit a problem during parsing of a CSV file where I get the following error:
CSV::MalformedCSVError: Illegal quoting on line 3.
RAILS code in question:
csv = CSV.read(args.local_file_path, col_sep: "\t", headers: true)
Line 3 in the CSV file is:
A-067067 VO VIA CE 0 8 8 SWCH Ter 4, Loc Is Here, Mne, Per Fl Auia/Sey IMAC NEK_HW 2011-03-09 09:47:44 2011-03-09 11:50:26 2011-01-13 10:49:17 2011-02-14 14:02:43 2011-02-14 14:02:44 0 0 771 771 46273 "[O/H 15/02] B270 W31 ""TEXT TEXT 2 X TEXT SWITC" SOME_TEXT SOME_TEXT N/A Name Here RESOLVED_CLOSED RESOLVED_CLOSED
UPDATE: Tabs don't appear to have come across above. See pastebin RAW TEXT: http://pastebin.com/4gj7iUpP
I've read numerous threads all over StackOverflow and Google about why this error occurs, and I understand that. But the CSV row above has perfectly legal quoting, does it not?
The CSV is tab delimited and there is only a tab followed by the quote on either side of the column in question. There is 1 quote in that field and it is double quoted to escape it. So what gives? I can't work it out. :(
Assuming I've got something wrong here, I'd like the solution to include a way to work around the issue as I don't have control over how the CSV is constructed.
This part of your CSV is at fault:
46273 "[O/H 15/02] B270 W31 ""TEXT TEXT 2 X TEXT SWITC" SOME_TEXT
At least one of these parts has a stray space:
46273 "
" SOME_TEXT
I'd guess that the "3" and the double quote are supposed to be separated by one or more tabs, but there is a space before the quote. Or there is a space after the quote on the other end, where there are only supposed to be tabs between the closing quote and the "S".
CSV escapes double quotes by doubling them, so this:
"[O/H 15/02] B270 W31 ""TEXT TEXT 2 X TEXT SWITC"
is supposed to be a single field that contains an embedded quote:
[O/H 15/02] B270 W31 "TEXT TEXT 2 X TEXT SWITC
If you have a space before the first quote or after the last quote then, since your fields are tab delimited, you have an unescaped double quote inside a field and that's where your "illegal quoting" error comes from.
Try sending your CSV file through cat -t (which should represent tabs as ^I) to find where the stray space is.
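If cat isn't handy, the same check can be sketched in Python. The helper names are invented for this example, and it splits naively on tabs, so it assumes no field legitimately contains a tab inside quotes. It flags any field that contains a double quote without being wrapped in quotes from its first to its last character, which is exactly what a stray space next to a quote produces.

```python
def show_tabs(line):
    """Render tabs as ^I, similar to what cat -t prints."""
    return line.replace("\t", "^I")


def suspicious_fields(line, sep="\t"):
    """Return (index, field) pairs for fields that contain a double quote
    but are not wrapped in quotes from first to last character."""
    flagged = []
    for index, field in enumerate(line.rstrip("\r\n").split(sep)):
        wrapped = len(field) >= 2 and field.startswith('"') and field.endswith('"')
        if '"' in field and not wrapped:
            flagged.append((index, field))
    return flagged


# Same fragment twice: once with a tab before the quote, once with a space.
good = '46273\t"[O/H 15/02] B270 W31 ""TEXT TEXT 2 X TEXT SWITC"\tSOME_TEXT'
bad = '46273 "[O/H 15/02] B270 W31 ""TEXT TEXT 2 X TEXT SWITC"\tSOME_TEXT'

print(show_tabs(bad))
print(suspicious_fields(good))  # []
print(suspicious_fields(bad))   # one entry, at field index 0
```

Once the offending field is located, replacing the stray space with a tab (or stripping it) before handing the text to the CSV parser is one possible workaround when you don't control how the file is produced.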
When I try to output some data into a text file using FasterCSV, sometimes it adds the quotes to the concatenated string and sometimes it does not.
For instance:
FasterCSV.generate do |csv|
csv << ["E"+company_code]
csv << ["A"+company_name]
end
Both company_code and company_name are Strings and contain data, but the output will show:
EtheCompanyCode
"AtheCompanyName"
I found how to force quoting in FasterCSV's docs but I need exactly the opposite and can not figure out why it quotes one line and not the other when they are both strings...
If anybody has the solution, I'll be deeply grateful for a lead :)
Thanks
If the real input is 'theCompanyName' and 'theCompanyCode' then I would also be confused by one line being quoted and the other not. But I suspect your real input is something else.
Most likely, the quoted line has some character that needs quoting, such as a comma; while the unquoted line doesn't. (Other characters that typically need quoting in Excel-style CSVs are quotation marks and newlines.)
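To illustrate the general rule, here is a small sketch using Python's csv module rather than FasterCSV. That swap is an assumption made for the example: Python's default minimal quoting follows the same Excel-style convention, but it is not the same library. The company name is made up.

```python
import csv
import io


def render(rows):
    """Write rows with the csv module's default (minimal) quoting."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


company_code = "theCompanyCode"
company_name = "Acme, Inc."  # hypothetical value containing a comma

print(render([["E" + company_code], ["A" + company_name]]))
# EtheCompanyCode
# "AAcme, Inc."
```

Only the field containing the delimiter gets quoted. If the real company_name holds a comma, a quotation mark, or a newline, that would explain why just that line comes out quoted.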