I need to validate headers in a CSV file before parsing data in it.
require 'csv'

# convert the data into an array of hashes
CSV::Converters[:blank_to_nil] = lambda do |field|
  field && field.empty? ? nil : field
end
csv = CSV.new(file, :headers => true, :header_converters => :symbol, :converters => [:all, :blank_to_nil])
csv_data = csv.to_a.map { |row| row.to_hash }
I know I can use the headers method to get the headers:
headers = csv.headers
But the problem with the headers method is that it "Returns nil if headers will not be used, true if they will but have not yet been read, or the actual headers after they have been read."
So if I put headers = csv.headers above the csv_data = csv.to_a.map {|row| row.to_hash } line, headers is true, and if I put it after reading the data, headers contains the header row as an array. This imposes an order of instructions on my method, which is very hard to test and is bad programming.
Is there a way to read the header row without imposing an order in this scenario? I'm using Ruby 2.0.
CSV.open(file_path, &:readline)
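The one-liner above reads only the first line of the file, so it does not depend on when the data rows are parsed. A minimal sketch of wrapping it in a validation helper follows; REQUIRED_HEADERS, read_headers and missing_headers are hypothetical names chosen for illustration, not part of the CSV library.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical list of columns the import expects.
REQUIRED_HEADERS = %w[name email].freeze

# Reads only the first line of the file and returns it as an array of strings.
# An empty file yields nil from readline, so fall back to an empty array.
def read_headers(path)
  CSV.open(path, &:readline) || []
end

# Returns the required headers that are absent from the file.
def missing_headers(path, required = REQUIRED_HEADERS)
  required - read_headers(path)
end

# Usage with a throwaway sample file.
sample = Tempfile.new(['people', '.csv'])
sample.write("name,phone\nAda,555-0100\n")
sample.close

missing = missing_headers(sample.path)
puts "Missing headers: #{missing.join(', ')}" unless missing.empty?
```

Because the headers are read in a separate pass, the check can run before any CSV.new(...) call that parses the data, at the cost of opening the file twice.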
I get the problem! I'm having the same one. Calling read seems to do what you want (populates the headers variable):
data = CSV.new(file, **flags)
data.headers # => true
data = CSV.new(file, **flags).read
data.headers # => ['field1', 'field2']
There might be other side effects I'm not aware of, but this works for me and doesn't smell too bad.
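One thing to keep in mind with this approach is that read returns a CSV::Table rather than the CSV object, so the headers and the rows both come from the table. A small sketch, using an in-memory StringIO as a stand-in for the real file:

```ruby
require 'csv'
require 'stringio'

# StringIO stands in for the uploaded file in this sketch.
io = StringIO.new("field1,field2\na,b\n")

table = CSV.new(io, headers: true).read # a CSV::Table, already fully parsed
headers = table.headers                 # the header row as an array
rows = table.map(&:to_h)                # the data rows as an array of hashes
```

Since the table is fully parsed before either value is requested, headers and rows can be read in any order.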
I don't quite get the problem. If you use one of the iterator methods, it's quite easy to do some validation on the headers:
CSV.foreach('tmp.txt', headers: true) do |row|
  # each row carries the parsed headers, so they can be checked on the first iteration
  return if row.headers[0] == 'xyz'
end
I want to loop over a csv file using CSV.foreach, read the data, perform some operation with it, and write the result to the last column of that row, using the Row object.
So let's say I have a CSV with data I need to save to a database using Rails ActiveRecord. I validate the record; if it is valid, I write true in the last column, and if not, I write the errors.
Example csv:
id,title
1,some title
2,another title
3,yet another title
CSV.foreach(path, "r+", headers: true) do |row|
  archive = Archive.new(
    title: row["title"]
  )
  archive.save!
  row["valid"] = true
rescue ActiveRecord::RecordInvalid => e
  row["valid"] = archive.errors.full_messages.join(";")
end
When I run the code it reads the data, but it does not write anything to the csv. Is this possible?
Is it possible to write in the same csv file?
Using:
Ruby 3.0.4
The row variable in your iterator exists only in memory. You need to write the information back to the file like this:
new_csv = ["id,title,valid\n"]
CSV.foreach(path, 'r', headers: true) do |row|
  row["valid"] = 'foo'
  new_csv << row.to_s
end
# join the collected lines first; writing the array itself would write its inspect form
File.open(path, 'w+') do |f|
  f.write new_csv.join
end
[EDIT] The code originally passed 'r+' to foreach, which is not valid; the mode should be 'r'.
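A variation on the same idea, sketched under the assumption that the file is small enough to load at once: read it into a CSV::Table, set the extra column on each row, and write the table back. The mark_rows helper and the blank-title check are placeholders standing in for the ActiveRecord validation in the question.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: adds a "valid" column to every row and rewrites the file.
def mark_rows(path)
  table = CSV.read(path, headers: true)
  table.each do |row|
    # Placeholder check standing in for the ActiveRecord validation.
    row['valid'] = row['title'].to_s.empty? ? 'title is blank' : 'true'
  end
  # to_csv emits the header line (including the new column) followed by the rows.
  File.write(path, table.to_csv)
end

# Usage with a throwaway sample file.
sample = Tempfile.new(['archives', '.csv'])
sample.write("id,title\n1,some title\n2,\n")
sample.close

mark_rows(sample.path)
puts File.read(sample.path)
```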
Maybe this is over-engineering things a bit, but I would do the following:
1. Read the original CSV file.
2. Create a temporary CSV file.
3. Insert the updated headers into the temporary CSV file.
4. Insert the updated records into the temporary CSV file.
5. Replace the original CSV file with the temporary CSV file.
require 'csv'
require 'fileutils'
require 'securerandom'

csv_path = 'archives.csv'
input_csv = CSV.read(csv_path, headers: true)
input_headers = input_csv.headers
# using a UUID to prevent file conflicts
tmp_csv_path = "#{csv_path}.#{SecureRandom.uuid}.tmp"
output_headers = input_headers + %w[errors]
CSV.open(tmp_csv_path, 'w', write_headers: true, headers: output_headers) do |output_csv|
  input_csv.each do |archive_data|
    values = archive_data.values_at(*input_headers)
    archive = Archive.new(archive_data.to_h)
    archive.valid?
    # error_messages is an empty string if there are no errors
    error_messages = archive.errors.full_messages.join(';')
    output_csv << values + [error_messages]
  end
end
FileUtils.move(tmp_csv_path, csv_path)
I'm having problems importing this CSV:
municipality,province,province abbrev,country,region
Vancouver,British Columbia,BC,Canada,Metro Vancouver - North
Specifically, Vancouver is not being returned when I look for its value by its key:
municipality_name = row["municipality"]
Here's the code:
def self.import_csv(file)
  CSV.foreach(file, headers: true,
                    skip_blanks: true,
                    skip_lines: /^(?:,\s*)+$/,
                    col_sep: ",") do |row|
    municipality_name = row["municipality"]
    puts row.to_h
    puts "municipality_name: #{municipality_name}"
    puts "row[0]: #{row[0]}"
  end
end
Here's the output:
irb(main):052:0> Importers::Municipalities.import_csv('tmp/municipalities.csv')
{"municipality"=>"Vancouver", "province"=>"British Columbia", "province abbrev"=>"BC", "country"=>"Canada", "region"=>"Metro Vancouver - North"}
municipality_name:
row['municipality']:
row[0]: Vancouver
Seems like I'm missing something obvious. I thought maybe there was a hidden character in the CSV but turned on hidden characters in Sublime and no dice.
Thanks in advance.
You need to call to_h on the row if you want to access it by its keys. Otherwise, it is an array-like object, accessible by indices.
def self.import_csv(file)
  CSV.foreach(file, headers: true,
                    skip_blanks: true,
                    skip_lines: /^(?:,\s*)+$/,
                    col_sep: ",") do |row|
    row = row.to_h
    municipality_name = row["municipality"]
    puts "municipality_name: #{municipality_name}"
  end
end
Seems like it was a problem with the CSV and the code works fine. Created a new CSV, typed in the same content, and it worked. Maybe an invisible character that Sublime wasn't showing? Can't verify as I wiped the original CSV that was causing issues.
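Since the original file is gone, the cause cannot be confirmed, but one invisible character that produces exactly this symptom is a UTF-8 byte order mark (BOM) at the start of the file: it becomes part of the first header, so a lookup by the plain header name returns nil while a lookup by index still works. The sketch below reproduces that on a throwaway file and shows the 'bom|utf-8' encoding option stripping it; that a BOM was the culprit here is an assumption, not something the thread establishes.

```ruby
require 'csv'
require 'tempfile'

# Write a sample file that starts with a UTF-8 BOM (U+FEFF).
bom_file = Tempfile.new(['municipalities', '.csv'])
bom_file.write("\uFEFFmunicipality,province\nVancouver,British Columbia\n")
bom_file.close

# Read as plain UTF-8: the BOM stays glued to the first header.
plain = CSV.foreach(bom_file.path, headers: true, encoding: 'utf-8').first

# Read with BOM handling: the BOM is stripped before the headers are parsed.
clean = CSV.foreach(bom_file.path, headers: true, encoding: 'bom|utf-8').first

p plain.headers.first.codepoints.first == 0xFEFF # the hidden character is in the key
p plain['municipality']                          # lookup by name misses
p plain[0]                                       # lookup by index still works
p clean['municipality']                          # lookup by name works again
```

Note that the row is looked up by header name here without calling to_h first; CSV::Row accepts either a header or an index.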
I am new to RoR.
I want to dynamically add attributes from a csv file so that my code would be able to dynamically read any csv file and build the db (i.e. convert any CSV file into Ruby objects)
I was using the code below:
csv_data = File.read('myData.csv')
csv = CSV.parse(csv_data, :headers => true, :header_converters => :symbol)
csv.each do |row|
  MyModel.create!(row.to_hash)
end
However it will fail for the following example
myData.csv
Name,id
foo,1
bar,10
myData2.csv
Name,value
foo,1
bar,10
It results in an error for myData2 because value is not an attribute of MyModel:
unknown attribute 'value' for MyModel.
I have thought about using send(:attrAccessor, name), but I was not sure how to integrate it when reading from the CSV. Any ideas?
You are doing it properly, but you can also create the records in bulk:
csv_data =
  CSV.read("#{Rails.root}/myData.csv",
    headers: true,
    header_converters: :symbol
  ).map(&:to_hash)
MyModel.create(csv_data)
NOTE: If the data is always going to be the same, you can use seeds.rb.
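Bulk creation still raises for columns the model does not know about. One way around that, which is a different technique from adding attributes dynamically, is to drop unknown columns before creating the records. The sketch below is plain Ruby; KNOWN_ATTRIBUTES and known_attributes are hypothetical names, and in a Rails app the list could plausibly be derived from the model's column names instead of being hard-coded.

```ruby
require 'csv'

# Hypothetical list of attributes the model accepts (assumed for this sketch).
KNOWN_ATTRIBUTES = %i[name id].freeze

# Parses the CSV text and keeps only the columns listed in `allowed`.
def known_attributes(csv_text, allowed = KNOWN_ATTRIBUTES)
  CSV.parse(csv_text, headers: true, header_converters: :symbol).map do |row|
    row.to_h.select { |key, _value| allowed.include?(key) }
  end
end

# myData2.csv from the question: the "value" column is dropped.
records = known_attributes("Name,value\nfoo,1\nbar,10\n")
p records
```

The resulting array of hashes could then be passed to MyModel.create in place of the unfiltered rows.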
I have the following two lines of code that take an uploaded CSV file from params and return a hash of Contact objects. The code works fine when I input a CSV with UTF-8 encoding. If I try to upload a CSV with another type of encoding, though, it breaks. How can I adjust the code to detect the encoding of the uploaded file and convert it to UTF-8?
CSV::Converters[:blank_to_nil] = lambda { |field| field && field.empty? ? nil : field }
csv = CSV.new(params[:file].tempfile.open, headers: true, header_converters: :symbol, converters: [:all, :blank_to_nil]).to_a.map {|row| row.to_hash }
This question is not a duplicate! I've seen numerous other questions on here revolving around the same encoding issue, but the specifics of those are different from my case. Specifically, I need a way to convert the encoding of a Tempfile generated from my params hash. Other solutions I've seen involve encoding String and File objects, as well as passing an encoding option to CSV.parse or CSV.open. I've tried those solutions already without success.
I've tried passing in an encoding option to CSV.new, like so:
csv = CSV.new(params[:file].tempfile.open, encoding: 'iso-8859-1:utf-8', headers: true, header_converters: :symbol, converters: [:all, :blank_to_nil]).to_a.map {|row| row.to_hash }
I've also tried this:
csv = CSV.new(params[:file].tempfile.open, encoding: 'iso-8859-1:utf-8', headers: true, header_converters: :symbol, converters: [:all, :blank_to_nil]).to_a.map {|row| row.to_hash }
I've tried adjusting my converter as well, like so:
CSV::Converters[:blank_to_nil] = lambda { |field| field && field.empty? ? nil : field.encode('utf-8') }
I'm looking for a programmatic solution here that does not require the user to convert their CSV to the proper encoding.
I've also had to deal with this problem and here is how I finally solved it.
require 'csv'

CSV.open(new_csv_file, 'w') do |csv_object|
  lines = File.read(uploaded_file)
  lines.each_line do |line|
    # sanitize the line as UTF-8, then parse it into fields
    csv_object << line.encode!("utf-8", "utf-8", invalid: :replace, undef: :replace, replace: '').parse_csv
  end
end
CSV.new(File.read(new_csv_file))
Basically go through every line, sanitize it and shove it into a new CSV file.
Hope that leads you and others in the right direction.
You can use filemagic to detect the encoding of a file, although it's not 100% accurate. It is based on the system's file command, so I'm not sure whether it works on Windows.
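Without an external detector, a common fallback is to check whether the bytes are valid UTF-8 and, if not, transcode from one assumed legacy encoding. The sketch below does that on the raw bytes before handing them to CSV.parse. The to_utf8 helper is a hypothetical name, and treating every non-UTF-8 upload as ISO-8859-1 is an assumption of this sketch; it is not real detection and will mislabel files in other encodings.

```ruby
require 'csv'

# Hypothetical helper: returns the given bytes as a valid UTF-8 string.
# Assumption: anything that is not valid UTF-8 is ISO-8859-1.
def to_utf8(raw, fallback: Encoding::ISO_8859_1)
  text = raw.dup.force_encoding(Encoding::UTF_8)
  return text if text.valid_encoding?

  raw.dup.force_encoding(fallback).encode(Encoding::UTF_8)
end

# Bytes as they would arrive from an ISO-8859-1 upload (\xE9 is e-acute, \xE1 is a-acute).
latin1_bytes = "name,city\nJos\xE9,M\xE1laga\n".b

rows = CSV.parse(to_utf8(latin1_bytes), headers: true, header_converters: :symbol).map(&:to_h)
p rows
```

In the controller from the question, the raw bytes could be obtained by reading the uploaded tempfile in binary mode before parsing.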
I'm using Ruby 1.9.2. My CSV file is as follows:
NAME, Id, No, Dept
Tom, 1, 12, CS
Hendry, 2, 35, EC
Bahamas, 3, 21, IT
Frank, 4, 61, EE
I want to print a specific row, say 'Tom'. I tried many approaches, but I didn't get the exact result. The most recommended option is "Fastercsv". But it is applicable for my version. Also, I noticed that csv prints the fields column-wise. How do I print an entire row using csv in Rails? My Ruby code is as follows:
require 'csv'
csv_text = File.read('sampler.csv')
csv = CSV.parse(csv_text, :headers => true)
csv.each do |row|
  puts "#{row[:NAME]},#{row[:Id]},#{row[:No]},#{row[:Dept]}"
end
Use .find
csv = CSV.read('sampler.csv', headers: true)
puts csv.find {|row| row['NAME'] == 'Tom'} #=> returns first `row` that satisfies the block.
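Note that the sample file in the question has a space after each comma, so every header except the first (and every value except the first) carries a leading space. A sketch of stripping that whitespace with a converter before searching follows; the strip lambda is a hypothetical helper, and the data is inlined from the question rather than read from sampler.csv.

```ruby
require 'csv'

# Sample data copied from the question, including the spaces after the commas.
csv_text = <<~TEXT
  NAME, Id, No, Dept
  Tom, 1, 12, CS
  Hendry, 2, 35, EC
TEXT

# Hypothetical converter: strips surrounding whitespace from string values.
strip = ->(value) { value.respond_to?(:strip) ? value.strip : value }

table = CSV.parse(csv_text, headers: true, header_converters: strip, converters: strip)
tom = table.find { |row| row['NAME'] == 'Tom' }

puts tom.fields.join(',') # the whole row on one line
puts tom['Dept']          # headers other than NAME now match without the leading space
```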
Here's another approach that keeps the code within the CSV API.
csv_table is a CSV::Table
row is a CSV::Row
row_with_specified_name is a CSV::Row.
csv_table = CSV.table("./tables/example.csv", converters: :all)
row_with_specified_name = csv_table.find do |row|
  row.field(:name) == 'Bahamas'
end
p row_with_specified_name.to_csv.chomp #=> "Bahamas,3,21,IT"
FYI, CSV.table is just a shortcut for:
CSV.read( path, { headers: true,
converters: :numeric,
header_converters: :symbol }.merge(options) )
As per the docs.
If you have a large CSV file and want to find an exact row, it will be much faster and far less memory-intensive to read one line at a time.
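require 'csv'
require 'pp'

row = nil
# the block form closes the file when the search is done
CSV.open('sampler.csv', 'r', headers: true) do |csv|
  # shift reads one row at a time, so the whole file is never held in memory
  while (row = csv.shift)
    # the header in sampler.csv is NAME, and header lookup is case-sensitive
    break if row['NAME'] == 'Bahamas'
  end
end
pp row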
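The same row-at-a-time search can also be written with the enumerator that CSV.foreach returns when it is called without a block; find stops reading as soon as a row matches. A minimal sketch, where find_row is a hypothetical helper and the sample file is created only for the demonstration:

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: returns the first row whose `header` field equals `value`,
# or nil if no row matches. Rows are read lazily, one at a time.
def find_row(path, header, value)
  CSV.foreach(path, headers: true).find { |row| row[header] == value }
end

# Usage with a throwaway sample file.
sample = Tempfile.new(['sampler', '.csv'])
sample.write("NAME,Id,No,Dept\nTom,1,12,CS\nBahamas,3,21,IT\n")
sample.close

match = find_row(sample.path, 'NAME', 'Bahamas')
p match.to_h unless match.nil?
```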