Ruby csv import trouble - ruby-on-rails

I tried to import data from csv in my rails app, but something went wrong:
CSV::MalformedCSVError in ArticlesController#index
Unclosed quoted field on line 1.
My csv looks like this:
"Код";"№ по каталогу (артикул)";"Наименование товара";"Ед. изм.";"Цена опт.";"Доп.";"Остатки";"Класс";"Группа";"Бренд";"Блок."
2223;15-562-44;15-562-44 (27-B07-F) VW Polo 95-R ;шт ;37,430;;;Амортизаторы ;Амортизаторы BOGE ;;
10327;24-052-1;24-052-1(46-A27-0) LAND ROVER 84- F ;шт ;68,750;;;Амортизаторы ;Амортизаторы BOGE ;;
10328;24-053-1;24-053-1(46-A28-0) LAND ROVER 84- R ;шт ;68,750;;;Амортизаторы ;Амортизаторы BOGE ;;
Maybe this is because of the first line (which has no ;;)
My code looks like this:
def csv_import
require 'csv'
file = File.open("/#{Rails.public_path}/uploads/smallcsv.csv")
#csv = CSV.parse(file)
csv = CSV.open(file, "r:ISO-8859-15:UTF-8", {:col_sep => ";", :row_sep => ";;", :headers => :first_row})
file_path = "/#{Rails.public_path}/uploads/smallcsv.csv"
##parsed_file=CSV::Reader.parse(file_path)
csv.each do |row|
ename = row[2]
eprice = row[5]
eqnt = row[7]
esupp = row[10]
logger.warn(ename)
end
end
I'm running ruby 1.9+ with fastercsv gem

I figured this out myself with the help of the question "CSV - Unquoted fields do not allow \r or \n (line 2)".
The problem was with the row separator on the first line, so setting :row_sep => :auto helped me.
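For reference, here is a minimal sketch of a semicolon-separated import that leaves row-separator detection to the library. It assumes the standard csv library that ships with Ruby 1.9+; the sample data and the method name parse_semicolon_csv are made up for illustration and are not from the original post.

```ruby
require 'csv'

# Hypothetical sample shaped like the file in the question:
# a quoted header row, unquoted data rows, ';' between fields.
SAMPLE = <<~DATA
  "Code";"Article";"Name";"Unit";"Price"
  2223;15-562-44;VW Polo 95-R ;pcs ;37,430
  10327;24-052-1;LAND ROVER 84- F ;pcs ;68,750
DATA

# Parses the text with ';' as the column separator and lets CSV
# detect the line ending itself (:auto is also the default).
def parse_semicolon_csv(text)
  CSV.parse(text, col_sep: ';', row_sep: :auto, headers: true).map do |row|
    { name: row['Name'].strip, price: row['Price'] }
  end
end

ROWS = parse_semicolon_csv(SAMPLE)
ROWS.each { |r| puts "#{r[:name]}: #{r[:price]}" }
```

With :headers enabled, fields can be looked up by header name instead of by position, which avoids off-by-one mistakes with indexes like row[2] or row[5].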

Related

Parse binary CSV file in Ruby

This should have been such an easy thing... but I can't for the life of me figure out how to parse a CSV file that doesn't seem to have a specific encoding.
File.open(Rails.root.join('data', 'mike/test-csv.csv'), 'rb') { |f| f.read }
=> "ID,\x00Q\x00u\x00a\x00n\x00t\x00i\x00t\x00y\n\x006\x00e\x005\x004\x009\x001\x00e\x007\x00-\x007\x00f\x001\x005\x00-\x004\x001\x007\x00d\x00-\x00a\x004\x000\x003\x00-345\x00,\x00\x005\x000\x00.\x000\x000\x000\x000\x000\x000\x000\x000\x00\n"
Here's a gist of it, can't figure out a way to post the specific CSV.
All I get from checking the encoding of the file is that it's in binary format, any thoughts on how I could get it into a normal csv?
Note: This is a downloaded CSV so converting it to another encoding via opening it in excel and exporting (or something like that) is not an option :)
Thanks!
Updating with attempted solution 1:
path = Rails.root.join('data', 'mike/test-csv.csv')
CSV.read(path, {:headers => true, :encoding => 'utf-8'}).each do |d|
puts d
end
Result: 6e5491e7-7f15-417d-a403-345,50.00000000
While this is correct, it ONLY works with puts, for example:
CSV.read(path, {:headers => true, :encoding => 'utf-8'}).map { |row| row }
=> [#<CSV::Row "ID":"\u00006\u0000e\u00005\u00004\u00009\u00001\u0000e\u00007\u0000-\u00007\u0000f\u00001\u00005\u0000-\u00004\u00001\u00007\u0000d\u0000-\u0000a\u00004\u00000\u00003\u0000-345\u0000" "\u0000Q\u0000u\u0000a\u0000n\u0000t\u0000i\u0000t\u0000y":"\u0000\u00005\u00000\u0000.\u00000\u00000\u00000\u00000\u00000\u00000\u00000\u00000\u0000">]
CSV.read(path, {:headers => true, :encoding => 'utf-8'}).map(&:to_s)
=> ["\u00006\u0000e\u00005\u00004\u00009\u00001\u0000e\u00007\u0000-\u00007\u0000f\u00001\u00005\u0000-\u00004\u00001\u00007\u0000d\u0000-\u0000a\u00004\u00000\u00003\u0000-345\u0000,\u0000\u00005\u00000\u0000.\u00000\u00000\u00000\u00000\u00000\u00000\u00000\u00000\u0000\n"]
It's unfortunately still not the correct string :(
Final Solution (via @ashmaroli below):
path = Rails.root.join('data', 'mike/test-csv.csv')
csv_text = ''
File.open(path, 'r') do |csv|
csv.each_line do |line|
csv_text << line.gsub(/\u0000/, '')
end
end
CSV.parse(csv_text, headers:true).map do |row| row end
Result:
[#<CSV::Row "ID":"6e5491e7-7f15-417d-a403-345" "Quantity":"50.00000000">]
Github Gist
Download Example CSV File
path = Rails.root.join('data', 'mike/test-csv.csv')
file = ""
File.open(path, 'r') do |csv|
csv.each_line do |line|
file << line.gsub(/\u0000/, '')
end
end
print file
print file.inspect # same as above just wraps the string in a
# single line with "\n" chars
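The same idea can be written without the line-by-line loop. This is only a sketch under the assumption that the NUL bytes are pure noise and can be dropped; the method name parse_without_nuls and the sample string are invented for illustration.

```ruby
require 'csv'

# Removes every NUL character from the raw text, then parses what is left.
def parse_without_nuls(raw)
  cleaned = raw.delete("\u0000")
  CSV.parse(cleaned, headers: true).map(&:to_h)
end

# Made-up sample with stray NUL characters, similar to the dump in the question.
RAW_SAMPLE = "ID,\u0000Qty\n\u0000abc\u0000,\u000050\u0000\n"
CLEAN_ROWS = parse_without_nuls(RAW_SAMPLE)
puts CLEAN_ROWS.inspect
```

With a real file, the raw text would come from reading the file into a string first, as in the File.open call shown in the question.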

Rails 4.2 - how to fix ascii code in CSV exporting without gem 'iconv'?

When exporting CSV in a Rails 4.2 app, Chinese characters (UTF-8) show up as garbled text in the CSV output:
中åˆåŒç†Šå·¥ç­‰ç”¨é¤
We tried options in send_data without luck:
send_data @payment_requests.to_csv, :type => 'text/csv; charset=utf-8; header=present'
And:
send_data @payment_requests.to_csv.force_encoding("UTF-8")
In model, there is forced encoding utf8:
# encoding: utf-8
But it does not work. There are online posts that suggest using the iconv gem. However, iconv depends on the platform's Ruby version. Is there a cleaner solution to fix the garbled characters in Rails 4.2 CSV exporting?
If @payment_requests.to_csv includes ASCII text, then you should use the encode method:
@payment_requests.to_csv.encode("UTF-8")
or
@payment_requests.to_csv.force_encoding("ASCII").encode("UTF-8")
depending on which internal encoding @payment_requests.to_csv has.
You can try:
@payment_requests.to_csv.force_encoding("ISO-8859-1")
For Chinese characters:
CSV.generate(options) do |csv|
csv << column_names
all.each do |product|
csv << product.attributes.values_at(*column_names)
end
end.encode('gb2312', :invalid => :replace, :undef => :replace, :replace => "?")
This is what worked for me:
head = 'EF BB BF'.split(' ').map{|a|a.hex.chr}.join()
csv_str = CSV.generate(csv = head) do |csv|
csv << [ , , , ...]
@elements.each do |element|
csv << [ , , , ...]
end
end
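The bytes EF BB BF in the snippet above are the UTF-8 byte order mark (BOM), which spreadsheet programs such as Excel commonly use to recognize a file as UTF-8. A minimal sketch of the same idea, assuming the BOM is simply prepended to the generated string; the method name csv_with_bom and the sample rows are hypothetical:

```ruby
require 'csv'

# "\uFEFF" encodes to the three bytes EF BB BF in UTF-8.
UTF8_BOM = "\uFEFF"

# Builds the CSV body first, then puts the BOM in front of it.
def csv_with_bom(header, rows)
  body = CSV.generate do |csv|
    csv << header
    rows.each { |row| csv << row }
  end
  UTF8_BOM + body
end

# "\u4E2D\u6587" is the two-character Chinese word for "Chinese".
OUTPUT = csv_with_bom(%w[id note], [[1, "\u4E2D\u6587"], [2, 'plain']])
puts OUTPUT.bytes.first(3).map { |b| format('%02X', b) }.join(' ')
```

In a Rails controller the resulting string could then be passed to send_data as in the question; whether the receiving program honors the BOM is outside the control of this code.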

ruby net-sftp read file line by line

I am using ruby 2.0.0 and rails 4.0.0. I have something similar to this:
require 'net/sftp'
sftp = Net::SFTP.start('ftp.app.com','username', :password => 'password')
sftp.file.open("/path/to/remote/file.csv", "r") do |f|
puts f.gets
end
This opens the file on the SFTP server, but it only prints the first line of the CSV file. I need to read this file row by row, preferably ignoring the header.
How can I read the file row by row, without downloading the file locally?
I solved this by doing this:
data = sftp.download!("/path/to/remote/file.csv").split(/\r\n/)
data.each do |line|
puts line
end
The proper answer for this would actually be to use the file.eof? value.
The code would look like:
require 'net/sftp'
sftp = Net::SFTP.start('ftp.app.com','username', :password => 'password')
sftp.file.open("/path/to/remote/file.csv", "r") do |f|
while !f.eof?
puts f.gets
end
end
Documentation can be found here
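To also skip the header, one gets call can be made before the loop starts. The sketch below uses only gets and eof?, the two calls from the answer above; a StringIO stands in for the remote file handle so the example runs without a server, and the method name data_lines is made up.

```ruby
require 'stringio'

# Collects every line after the header from an IO-like object that
# responds to gets and eof?.
def data_lines(io)
  io.gets # read and discard the header row
  lines = []
  lines << io.gets.chomp until io.eof?
  lines
end

# StringIO is only a stand-in for the handle yielded by sftp.file.open.
FAKE_REMOTE = StringIO.new("id,name\n1,alice\n2,bob\n")
DATA_LINES = data_lines(FAKE_REMOTE)
puts DATA_LINES
```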
In my case something like this worked:
data = sftp.download!("/path/to/remote/file.csv").split(/\n/).map{ |e| e.split(/,/).map{ |x| x.gsub(/"/, "")} }
data.each do |line|
puts line
end
This also splits each row of the .csv into an array of columns and removes any stray double quotes. Note that this is for Mac, where line breaks are \n.
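Splitting on commas by hand breaks as soon as a quoted field contains a comma. As an alternative sketch, the downloaded string can be handed to the standard csv library, which handles quotes and both \n and \r\n line endings. Here the string literal stands in for the value returned by sftp.download!, and the method name rows_from_download is hypothetical.

```ruby
require 'csv'

# `raw` stands in for the String returned by sftp.download!(remote_path).
# The header row is consumed by headers: true and not returned.
def rows_from_download(raw)
  CSV.parse(raw, headers: true).map(&:fields)
end

RAW_DOWNLOAD = "id,name\r\n1,\"Smith, John\"\r\n2,Ann\r\n"
PARSED = rows_from_download(RAW_DOWNLOAD)
PARSED.each { |fields| puts fields.inspect }
```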

How would I find similar lines in two CSV files?

Here is my code but it takes forever for huge files:
require 'rubygems'
require "faster_csv"
fname1 =ARGV[0]
fname2 =ARGV[1]
if ARGV.size!=2
puts "Display common lines in the two files \n Usage : ruby user_in_both_files.rb <file1> <file2> "
exit 0
end
puts "loading the CSV files ..."
file1=FasterCSV.read(fname1, :headers => :first_row)
file2=FasterCSV.read(fname2, :headers => :first_row)
puts "CSV files loaded"
#puts file2[219808].to_s.strip.gsub(/\s+/,'')
lineN1=0
lineN2=0
# count how many common lines
similarLines=0
file1.each do |line1|
lineN1=lineN1+1
#compare line 1 to all line from file 2
lineN2=0
file2.each do |line2|
puts "file1:l#{lineN1}|file2:l#{lineN2}"
lineN2=lineN2+1
if ( line1.to_s.strip.gsub(/\s+/,'') == line2.to_s.strip.gsub(/\s+/,'') )
puts "file1:l#{line1}|file2:l#{line2}->#{line1}\n"
similarLines=similarLines+1
end
end
end
puts "#{similarLines} similar lines."
Ruby has set operations available with arrays:
a_ary = [1,2,3]
b_ary = [3,4,5]
a_ary & b_ary # => [3]
So, from that you should try:
puts "loading the CSV files ..."
file1 = FasterCSV.read(fname1, :headers => :first_row)
file2 = FasterCSV.read(fname2, :headers => :first_row)
puts "CSV files loaded"
common_lines = file1.map(&:to_s) & file2.map(&:to_s) # & works on plain arrays, so convert the rows first
puts common_lines.size
If you need to preprocess the arrays, do it as you load them:
file1 = FasterCSV.read(fname1, :headers => :first_row).map{ |l| l.to_s.strip.gsub(/\s+/, '') }
file2 = FasterCSV.read(fname2, :headers => :first_row).map{ |l| l.to_s.strip.gsub(/\s+/, '') }
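For very large files, a hash-based lookup keeps the comparison close to linear instead of comparing every line with every other line. The sketch below uses the standard csv library rather than FasterCSV and normalizes each field instead of the whole line; the method names and the sample data are invented for illustration.

```ruby
require 'csv'
require 'set'

# Turns each data row into an array of whitespace-free field strings.
def normalized_rows(csv_text)
  CSV.parse(csv_text, headers: true).map do |row|
    row.fields.map { |field| field.to_s.gsub(/\s+/, '') }
  end
end

# Returns the rows of the second text that also occur in the first one.
def common_rows(text_a, text_b)
  seen = normalized_rows(text_a).to_set
  normalized_rows(text_b).select { |row| seen.include?(row) }.uniq
end

FILE_A = "id,name\n1,alice\n2,bob\n3,carol\n"
FILE_B = "id,name\n3, carol\n4,dave\n1,alice\n"
COMMON = common_rows(FILE_A, FILE_B)
puts "#{COMMON.size} similar lines."
```

With real files, the text arguments would come from File.read(fname1) and File.read(fname2).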
You're gsubing File2 every time you loop through File1. I'd do that first, and then just compare the results of that.
Edit: Something like this (untested):
file1lines = []
file1.each do |line1|
file1lines << line1.to_s.strip.gsub(/\s+/, '')
end
# Do the same for `file2lines`
file1lines.each do |line1|
lineN1=lineN1+1
#compare line 1 to all line from file 2
lineN2=0
file2lines.each do |line2|
puts "file1:l#{lineN1}|file2:l#{lineN2}"
lineN2=lineN2+1
if ( line1 == line2 )
puts "file1:l#{line1}|file2:l#{line2}->#{line1}\n"
similarLines=similarLines+1
end
end
end
I'd also get rid of all the putses in the loops unless you really need them. If the files are huge, that's probably slowing it down a noticeable amount.

FasterCSV and Non-Latin Characters

I've recently written code that exports an SQL database to CSV using FasterCSV with Rails. However, some parts of my database contain Traditional Chinese characters. When I export it, I get ?????? as the output in the CSV file. I've already tried setting $KCODE = 'u' so that FasterCSV uses UTF-8 to encode the CSV file, but no luck. Using Iconv to convert the encoding gives me strange results as well. Here is the source code:
def csv
@lists = Project.find(:all, :order=> (params[:sort] + ' ' + params[:direction]), :conditions => ["name LIKE ?", "%#{params[:selection]}%"])
csv_string = FasterCSV.generate do |csv|
csv << ["Status","Name","Summary","Description","Creator","Comment","Contact Information","Created Date","Updated Date"]
@lists.each do |project|
csv << [project.status, project.name, project.summary, project.description, project.creator, project.statusreason, project.contactinfo, project.created_at, project.updated_at]
end
end
filename = Time.now.strftime("%Y%m%d") + ".csv"
send_data(csv_string,
:type => 'text/csv; charset=utf-8; header=present',
:filename => filename)
end
Thanks,
I'm not used to working with Chinese characters, but you can try setting the :encoding option to 'u' (UTF-8) on the generate call:
...
csv_string = FasterCSV.generate(:encoding => 'u') do |csv|
...
As a side note, I'd recommend using named_scopes instead of writing this:
Project.find(:all, :order=> (params[:sort] + ' ' + params[:direction]), :conditions => ["name LIKE ?", "%#{params[:selection]}%"])
Good luck!
