I'm working on a project with some supplied CSV files that I need to parse and manipulate. One of them throws this error when I try to load it with CSV.read('path/file.csv'):
CSV::MalformedCSVError: Unquoted fields do not allow \r or \n (line 7911).
Now, looking at the file, the last line is just blank. It's a \n character. I feel like this should not break the CSV read, but it does. I could just check the end of the CSV documents and strip any excess carriage returns/newlines, since that seems like it would work, but it doesn't feel like the correct way. Anybody have some advice?
Edit: Using Ruby 2.0.0 and Rails 4.0.5
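For reference, the strip-and-reparse workaround described above can be as small as this (a rough sketch, assuming the file fits comfortably in memory and using the path from the question):

require 'csv'

# Collapse any trailing CR/LF characters down to a single newline before parsing.
raw = File.read('path/file.csv')
rows = CSV.parse(raw.sub(/[\r\n]+\z/, "\n"))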
I generate a CSV text file in Rails like this:
CSV.generate(col_sep: ';') do |csv|
  csv << ['1st line']
  csv << ['2nd line']
end
When I open the text file, the two lines are there as expected. Unfortunately, this file now has to be read by another program, and I get an error message that the second line is missing. I have a sample file that looks exactly like the file I generated and works fine, but my file can't be read properly. It also has the same encoding. Any suggestions where to look? Anything concerning line breaks?
I'm not sure this is a question that can be answered as asked. You said that a third-party program is having trouble reading a text file generated by Ruby, but provided no information about that error or how you think Ruby is related to it.
Could you please update your original post with the plaintext version of your CSV file and what program you're trying to open it in?
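In the meantime, one thing that often trips up other programs reading generated CSV is the record separator. If the consuming program expects Windows-style line endings, you can set row_sep explicitly (a hedged guess based on the line-break suspicion, not a confirmed fix):

csv_string = CSV.generate(col_sep: ';', row_sep: "\r\n") do |csv|
  csv << ['1st line']
  csv << ['2nd line']
end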
I have recently moved an application from Ubuntu to a Red Hat server, and noticed a difference when writing a file: \r\n is being written rather than simply \n.
I am explicitly setting the \n in the data to be written. So, for example:
data = "Hello\nWorld"
File.open("#{ Rails.root }/tmp/file.txt", "wb") { |f| f.write(data) }
What is being written is actually "Hello\r\nWorld".
I know Ruby sets the line breaks according to the system it is being run on, but is there a way of enforcing it to keep to \n whatever the system?
Don't put escape sequences in double quotes, because Ruby looks for substitutions and replaces them with the corresponding byte values.
If you want to force Ruby to keep the literal characters '\n', you have to use single quotes.
Example:
data = 'Hello\nWorld'
File.open("#{ Rails.root }/tmp/file.txt", "wb") { |f| f.write(data) }
It will keep it the same. :)
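If you want to confirm what actually ends up on disk (a quick check, assuming Ruby 1.9.3 or newer and the path from the question), inspect the raw bytes and normalize if needed:

# Read the file back in binary mode so nothing is translated on the way in.
raw = File.binread("#{Rails.root}/tmp/file.txt")
puts raw.bytes.to_a.inspect   # a 13 (\r) before each 10 (\n) means CRLF

# If CRLF did sneak in, normalizing is a one-liner:
File.binwrite("#{Rails.root}/tmp/file.txt", raw.gsub("\r\n", "\n"))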
I have a rails app where my users can manually set up products via a web form. This works fine and accepts foreign characters well, words like 'Svölk' for example.
I now have a need to bulk import products and am using FasterCSV to do so. Generally this works without issue, but when the CSV contains foreign characters it stalls at that point.
Am I correct to believe the file needs to be UTF-8 in the first instance?
Also, I'm running Ruby 1.8.7, so is Iconv my only solution for converting the file? This could be an issue, as the format of the original file won't be known.
Have others encountered this issue and if so, how did you overcome it?
You have two alternatives:
Use the ensure_encoding gem to find the actual encoding of the strings.
Use Ruby to determine the file encoding directly:
File.open(source_file).read.encoding
I prefer the first approach, as it tries to detect the encoding based on the strings and to convert them to your desired encoding (UTF-8); then you can set that encoding in the FasterCSV options.
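If you do end up going the Iconv route on 1.8.7, the conversion itself is short (a rough sketch, assuming the source file turns out to be Latin-1, which you would need to verify first):

require 'iconv'
require 'fastercsv'

# Convert from ISO-8859-1 (an assumption) to UTF-8 before parsing;
# bytes that cannot be converted are dropped via //IGNORE.
content = File.read(source_file)
utf8_content = Iconv.conv('UTF-8//IGNORE', 'ISO-8859-1', content)

FasterCSV.parse(utf8_content) do |row|
  # handle each row here
end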
I'm using the DocSplit gem on Ruby 1.9.3 to create Unicode UTF-8 versions of Word documents. To my surprise, while running a test today on a piece of one of these documents, I started running into character encoding inconsistencies.
I have tried a number of different methods to resolve the issue, which I will list below, but the best success I've had so far is to remove all non-ASCII characters. This is far from ideal, as I don't think the characters are really going to be all that problematic in the DB.
gsub(/[^[:ascii:]]/, "")
This is a sample of what my output looks like vs. what I'm expecting:
My CODES'S APOSTROPHE
My CODES’S APOSTROPHE
The second apostrophe should look squiggly. If you paste it into irb, you get the following: \U+FFE2
I tried regexing specifically for this character, and it appears to work in Rubular. As soon as I put it in my model, however, I got a syntax error.
syntax error, unexpected $end, expecting ')'
raw_title = raw_title.gsub(/’/, "")
I also tried forcing the encoding to UTF-8, but everything is already in UTF-8 and this does not appear to have an effect. I tried forcing the output to US-ASCII, but I get a byte sequence error.
I also tried a few of the encoding options in the Ruby standard library. These basically did the same thing as the regex.
This all comes down to the fact that I'm trying to match output for testing purposes. Should I even be concerned about these special characters? Is there a better way to match these characters without blindly removing them?
Try adding:
# encoding: utf-8
at the top of the failing rspec file. This should ensure things like:
raw_title = raw_title.gsub(/’/, "")
in your spec work.
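Note that the magic comment has to be the very first line of the file (or come directly after a shebang line). A hypothetical spec illustrating the idea (the describe target is illustrative, not from your code):

# encoding: utf-8
require 'spec_helper'

describe 'title cleanup' do
  it 'strips the curly apostrophe' do
    raw_title = "My CODES’S APOSTROPHE".gsub(/’/, "")
    raw_title.should == "My CODESS APOSTROPHE"
  end
end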
I tried using the above example, but even after that it kept failing, so I used Iconv to convert that specific character. This is what I used:
Iconv.conv('ASCII//IGNORE', 'UTF8', text_to_be_converted)
I tried what was given in the following link - How to get rid of non-ascii characters in ruby
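On Ruby 1.9.3 you can get the same effect without Iconv (which is deprecated there) by using String#encode with replacement options; a sketch, assuming you simply want to drop anything that doesn't fit in ASCII:

# Transcode to ASCII, replacing characters that are invalid in the source
# or undefined in ASCII with an empty string.
clean = text_to_be_converted.encode('US-ASCII',
                                    invalid: :replace,
                                    undef: :replace,
                                    replace: '')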
I created a Rails file outputting values separated by commas, called posts.csv.erb.
When I click on the link, it downloads as a CSV file, but the commas are now pipes. Also, Vim tries to auto-correct the commas into pipes for some reason (but I got around that, so they are actually commas). It works fine if I call the file posts.txt.erb - I see actual commas. But somehow it seems Rails or ERB wants to output pipes for a CSV file?
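One way to sidestep the ERB template entirely (a sketch of an alternative approach, not a diagnosis of the pipe behaviour; the controller and column names are illustrative) is to build the CSV in the controller with the standard library and send it directly:

# app/controllers/posts_controller.rb (illustrative)
require 'csv'

class PostsController < ApplicationController
  def index
    @posts = Post.all

    respond_to do |format|
      format.html
      format.csv do
        csv_data = CSV.generate do |csv|
          csv << ['id', 'title']                          # header row
          @posts.each { |post| csv << [post.id, post.title] }
        end
        send_data csv_data, type: 'text/csv', filename: 'posts.csv'
      end
    end
  end
end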