Exclude headers when importing Google Spreadsheet content with Roo - ruby-on-rails

I created a rake task to import users from a Google Sheet, using the Roo gem. Everything works so far, but I can't find a way to skip the first row (the headers) during the import.
This is my code:
require 'roo'

namespace :import do
  desc "Import users from Google Sheet"
  task users: :environment do
    @counter = 0
    url = 'https://docs.google.com/spreadsheets/d/{mycode}/export?format=xlsx'
    xlsx = Roo::Spreadsheet.open(url, extension: :xlsx, headers: true)
    xlsx.each do |row|
      n = User.where(name: row[0]).first
      user = User.find_or_create_by(id: n)
      user.update(
        name: row[0],
        country_id: row[6]
      )
      user.save!
      puts user.name
      @counter += 1
    end
    puts "Imported #{@counter} lines."
  end
end

Your code says headers: true when you are opening the sheet. Have you tried turning it to false? Or are you saying it does not work when it's set to false?
Also, you are using .each rather differently than the example in the documentation. The doc shows a hash with keys derived from the headers. You are using [n] array notation. Does that work?
EDIT:
Try using .each in a way that's more similar to what the documentation says:
xlsx.each(name: 'Name', country_id: 'Country ID') do |row|
n = User.where(name: row[:name]).first
...
end
The strings 'Name' and 'Country ID' are just examples; they should be the text of whatever column headers have the name and country_id information.

There is a way to skip the headers: use the each_row_streaming(offset: 1) method.
It yields each row after the header as an array of cells, so you have to read each cell's contents with the .value method. The documentation describes this for Excelx::Cell objects, but it works for Roo::Spreadsheet objects too.
The documentation example:
xlsx.each_row_streaming(offset: 1) do |row| # Will exclude first (inevitably header) row
puts row.inspect # Array of Excelx::Cell objects
end
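The same idea can be tried without a spreadsheet at hand. The sketch below is only an illustration: it models the sheet as a plain array of rows (an assumption; Roo yields cell objects rather than bare values) and uses Enumerable#drop to skip the header row before processing. The sample data is made up.

```ruby
# Stand-in for the sheet: the first row holds the headers (made-up data).
rows = [
  ['Name', 'Country ID'],
  ['Alice', 1],
  ['Bob', 2]
]

imported = []
# drop(1) returns every row except the first, so the headers are never processed.
rows.drop(1).each do |row|
  imported << { name: row[0], country_id: row[1] }
end

puts "Imported #{imported.size} lines."
```

With Roo itself, the offset: 1 argument shown above plays the role of drop(1).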

Related

Ruby CSV foreach write to csv using Row object

I want to loop over a csv file using CSV.foreach, read the data, perform some operation with it, and write the result to the last column of that row, using the Row object.
Say I have a CSV with data that I need to save to a database using Rails ActiveRecord. I validate each record: if it is valid, I write true in the last column; if not, I write the errors.
Example csv:
id,title
1,some title
2,another title
3,yet another title
CSV.foreach(path, "r+", headers: true) do |row|
  archive = Archive.new(
    title: row["title"]
  )
  archive.save!
  row["valid"] = true
rescue ActiveRecord::RecordInvalid => e
  row["valid"] = archive.errors.full_messages.join(";")
end
When I run the code it reads the data, but it does not write anything to the CSV. Is it possible to write to the same CSV file this way?
Using:
Ruby 3.0.4
The row variable in your iterator exists only in memory. You need to write the information back to the file like this:
new_csv = ["id,title,valid\n"]
CSV.foreach(path, 'r', headers: true) do |row|
  row["valid"] = 'foo'
  new_csv << row.to_s
end
File.open(path, 'w+') do |f|
  f.write new_csv.join # join the lines; writing the array itself would write its inspect form
end
[EDIT] The 'r+' mode used in the question is not valid for foreach; it should be 'r'.
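To see the read-then-rewrite pattern end to end, here is a minimal sketch that runs against a temporary file. The sample data and the constant 'true' value are placeholders (assumptions); in the real task the value would come from the ActiveRecord validation.

```ruby
require 'csv'
require 'tempfile'

# Temporary sample file standing in for the CSV from the question.
file = Tempfile.new(['archives', '.csv'])
file.write("id,title\n1,some title\n2,another title\n")
file.close
path = file.path

# Read every row first; changes made to a row live only in memory.
rows = CSV.foreach(path, headers: true).map do |row|
  row['valid'] = 'true' # placeholder for the real validation result
  row
end

# Then rewrite the whole file, header line included.
CSV.open(path, 'w') do |csv|
  csv << rows.first.headers
  rows.each { |row| csv << row }
end

result = File.read(path)
puts result
```

Reading everything before writing avoids truncating the file while it is still being iterated.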
Maybe this is over-engineering things a bit, but I would do the following:
1. Read the original CSV file.
2. Create a temporary CSV file.
3. Insert the updated headers into the temporary CSV file.
4. Insert the updated records into the temporary CSV file.
5. Replace the original CSV file with the temporary CSV file.
require 'csv'
require 'fileutils'
require 'securerandom'

csv_path = 'archives.csv'
input_csv = CSV.read(csv_path, headers: true)
input_headers = input_csv.headers
# using a UUID to prevent file conflicts
tmp_csv_path = "#{csv_path}.#{SecureRandom.uuid}.tmp"
output_headers = input_headers + %w[errors]
CSV.open(tmp_csv_path, 'w', write_headers: true, headers: output_headers) do |output_csv|
  input_csv.each do |archive_data|
    values = archive_data.values_at(*input_headers)
    archive = Archive.new(archive_data.to_h)
    archive.valid?
    # error_messages is an empty string if there are no errors
    error_messages = archive.errors.full_messages.join(';')
    output_csv << values + [error_messages]
  end
end
FileUtils.move(tmp_csv_path, csv_path)

Unknown Attribute error when uploading a CSV file

Here is what my CSV file that I am trying to upload looks like:
1,Order,”{‘customer_name’:’Jack’,’customer_address’:’Trade St.’,’status’:’unpaid’}”
2,Order,”{‘customer_name’:’Sam’,’customer_address’:’Gecko St.’,’status’:’unpaid’}”
1,Product,”{‘name’:’Laptop’,’price’:2100,’stock_levels’:29}"
1,Order,”{‘status’:’paid’,’ship_date’:’2017-01-18’,’shipping_provider’:’DHL’}”
2,Product,”{‘name’:’Microphones’,’price’:160,’stock_levels’:1500}"
1,Invoice,”{‘order_id’:7,’product_ids’:[1,5,3],’status’:’unpaid’,’total’:2500}"
1,Invoice,”{‘status’:’paid’}”
But I'm getting this error: ActiveModel::UnknownAttributeError in CustomersController#import
And these errors in my console:
app/models/customer.rb:4:in `block in import'
app/models/customer.rb:3:in `import'
app/controllers/customers_controller.rb:65:in `import'
Here is my customer.rb model:
class Customer < ApplicationRecord
  def self.import(file)
    CSV.foreach(file.path, headers: true) do |row|
      Customer.create! row.to_hash
    end
  end
end
To be perfectly honest, part of the problem I'm having here stems from not totally understanding what row.to_hash is doing here. If this function does not iterate through a row, then we want to convert it to a hash. I feel like it may be causing other problems here that I may not be aware of though.
Here is the import function I have as well:
def import
  Customer.import(params[:file])
  redirect_to customer_path, notice: "Customer Added Successfully"
end
The error ActiveModel::UnknownAttributeError is raised when you pass an attribute that the model doesn't have to new or create, for instance:
User.create(not_a_real_attribute: 'some value')
# => ActiveModel::UnknownAttributeError: unknown attribute 'not_a_real_attribute' for User.
You are getting this because you are treating a header-less CSV as if it had headers. In Ruby, if a CSV has headers, each row can be converted to a hash that uses the column headers as keys and the row's fields as values. So, by telling the CSV library that you have headers (via headers: true in the foreach), you make it treat the first row as a header row, and you end up with weird results:
enum = CSV.foreach("./customers.csv", headers: true)
enum.first.to_hash
# => {"1"=>"2", "Order"=>"Order", "”{‘customer_name’:’Jack’"=>"”{‘customer_name’:’Sam’", "’customer_address’:’Trade St.’"=>"’customer_address’:’Gecko St.’", "’status’:’unpaid’}”"=>"’status’:’unpaid’}”"}
Then you are passing that hash to Customer.create!. If you don't expect headers, you need to treat each row as an array:
enum = CSV.foreach("./customers.csv")
enum.first
# => ["1", "Order", "”{‘customer_name’:’Jack’", "’customer_address’:’Trade St.’", "’status’:’unpaid’}”"]
Or, if you want to use a hash, you can insert a better first row, corresponding to the attributes of your model:
"Maybe an ID?","Some kind of Class?","Customer Data?"
1,Order,”{‘customer_name’:’Jack’,’customer_address’:’Trade St.’,’status’:’unpaid’}”
# ... the rest of your file ...
enum = CSV.foreach("./customers.csv", headers: true)
enum.first.to_hash
# => {"Maybe an ID?"=>"1", "Some kind of Class?"=>"Order", "Customer Data?"=>"”{‘customer_name’:’Jack’", nil=>"’status’:’unpaid’}”"}
You'll also notice throughout these examples that the special (curly) quotes in your file weren't being handled properly. If you change them to normal quotes, the result is:
# => {"Maybe an ID?"=>"1", "Some kind of Class?"=>"Order", "Customer Data?"=>"{'customer_name':'Jack','customer_address':'Trade St.','status':'unpaid'}"}
Also, if that last column is the customer data you want to create records from, you'll need to pull that column out and parse it into a Ruby hash on your own. Looks like maybe YAML?:
YAML.load enum.first.to_hash['Customer Data?']
# => {"customer_name"=>"Jack", "customer_address"=>"Trade St.", "status"=>"unpaid"}
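The difference between the modes is easy to check on a small inline sample. This is a sketch with made-up data in straight quotes; the header names id, record_type, and payload are hypothetical and only illustrate passing an explicit header list for a file that has none.

```ruby
require 'csv'

# Two data lines and no header line (made-up sample with straight quotes).
data = <<~ROWS
  1,Order,"{'status':'unpaid','total':5}"
  2,Order,"{'status':'paid','total':7}"
ROWS

# Without headers, every line is data and each row is a plain Array.
rows = CSV.parse(data)

# With headers: true, the first data line is consumed as the header row.
table = CSV.parse(data, headers: true)

# With an explicit header list, every line stays data and is addressable by name.
named = CSV.parse(data, headers: %w[id record_type payload])

puts rows.length             # 2
puts table.length            # 1
puts named.first['payload']  # {'status':'unpaid','total':5}
```

Passing the header names explicitly avoids editing the file just to add a header line.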

Rails open xls(excel) file

I have a file, b.xls, from Excel that I need to import into my Rails app.
I tried to open it:
file = File.read(Rails.root.to_s+'/b.xls')
This is what I got:
file.encoding => #<Encoding:UTF-8>
I have a few questions:
How do I open it without these symbols (as normal, readable text)?
How do I convert this file to a hash?
The file is pretty large, about 5k lines.
You need an array of all the rows first; then you can convert it to a hash if you like.
I would recommend the batch_factory gem.
The gem is very simple and relies on the roo gem under the hood.
Here is a code example:
require 'batch_factory'
factory = BatchFactory.from_file(
Rails.root.join('b.xlsx'),
keys: [:column1, :column2, ..., :what_ever_column_name]
)
Then you can do
factory.each do |row|
puts row[:column1]
end
You can also omit the keys. batch_factory will then fetch the headers from the first row automatically, but your keys would be in Russian, like:
factory.each do |row|
puts row['Товар']
end
If you want a hash with the product name as the key, you can do:
factory.inject({}) do |hash, row|
hash.merge(row['Товар'] => row)
end
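The inject call above builds one hash keyed by a column value. The following sketch shows the same step on plain Ruby hashes, without batch_factory; the rows and the 'Цена' (price) column are made-up sample data.

```ruby
# Each row is a hash keyed by the header text (made-up sample data).
rows = [
  { 'Товар' => 'Laptop', 'Цена' => 2100 },
  { 'Товар' => 'Microphone', 'Цена' => 160 }
]

# Build a lookup table keyed by product name.
by_product = rows.inject({}) do |hash, row|
  hash.merge(row['Товар'] => row)
end

puts by_product['Laptop']['Цена']
```

Because merge allocates a new hash on every iteration, each_with_object or Enumerable#to_h with a block may be a better fit for a file with thousands of rows.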

Converting a string to integer in CSV import on Rails

I have a rake task that imports CSV data into a database via Rails.
I want a specific column (row[6] below) to be stored as an integer. However, everything I try returns that value as a string.
Below is the rake task:
require 'csv'

namespace :import_site_csv do
  task :create_sites => :environment do
    CSV.foreach('../sites.csv', :headers => true) do |row|
      row[6] = row[6].to_i
      Site.create!(row.to_hash)
    end
  end
end
Does anyone have an idea how I might do this? Thanks!
You are making one small (but important) mistake here.
When you call CSV.foreach('../sites.csv') each of the rows will be an array of the values in that particular row. That would allow you to access the data you need, in the way you do it now - row[6].
But, when you add the :headers => true option to CSV.foreach, you will not get an array of values (row will not be an array). Instead, it will be a CSV::Row object (docs). As you can read in the documentation:
A CSV::Row is part Array and part Hash. It retains an order for the fields and allows duplicates just as an Array would, but also allows you to access fields by name just as you could if they were in a Hash.
For example, if you have a column with the name Title in the CSV, to get the title in each of the rows, you need to do something like:
CSV.foreach('file.csv', :headers => true) do |row|
puts row['Title']
end
Since I do not know the structure of your CSV, I cannot tell you which key you should use to get the data and convert it to an Integer, but I think that this should give you a good idea of how to proceed.
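As a rough illustration of accessing a field by header name and converting it, here is a sketch on inline data. The column names name and visits are hypothetical, since the real CSV structure is unknown; the resulting hashes are what would be passed on to something like Site.create!.

```ruby
require 'csv'

# Made-up sample; the real file's columns are not known.
csv_text = <<~ROWS
  name,visits
  Home,12
  About,7
ROWS

sites = CSV.parse(csv_text, headers: true).map do |row|
  attrs = row.to_h
  # Look the field up by header name and convert it explicitly (base 10).
  attrs['visits'] = Integer(attrs['visits'], 10)
  attrs
end

puts sites.inspect
```

Integer raises on malformed input, whereas to_i silently returns 0; which one is preferable depends on how strict the import should be.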

Rails import from csv to model

I have a CSV file containing a dump of a table's data, and I would like to import it directly into my database using Rails.
This is my current code:
csv_text = File.read("public/csv_fetch/#{model.table_name}.csv")
ActiveRecord::Base.connection.execute("TRUNCATE TABLE #{model.table_name}")
puts "\nUpdating table #{model.table_name}"
csv = CSV.parse(csv_text, :headers => true)
csv.each do |row|
  row = row.to_hash.with_indifferent_access
  ActiveRecord::Base.record_timestamps = false
  model.create!(row.to_hash.symbolize_keys)
end
with help from here..
Consider my Sample csv:
id,code,created_at,updated_at,hashcode
10,00001,2012-04-12 06:07:26,2012-04-12 06:07:26,
2,00002,0000-00-00 00:00:00,0000-00-00 00:00:00,temphashcode
13,00007,0000-00-00 00:00:00,0000-00-00 00:00:00,temphashcode
43,00011,0000-00-00 00:00:00,0000-00-00 00:00:00,temphashcode
5,00012,0000-00-00 00:00:00,0000-00-00 00:00:00,temphashcode
But the problems with this code are:
1. It generates id as an auto-increment (1, 2, 3, ...) instead of using the values in the CSV file.
2. For records whose timestamps are 0000-00-00 00:00:00, the value defaults to null automatically, which throws an error because the created_at column cannot be null.
Is there a generic way to import from CSV into models, or would I have to write custom code for each model to manipulate the attributes in each row manually?
For question 1, I suggest you output row.to_hash.symbolize_keys, e.g.
# ...
csv.each do |row|
#...
hash = row.to_hash.symbolize_keys
Rails.logger.info "hash: #{hash.inspect}"
model.create!(hash)
end
to see whether "id" is assigned.
For question 2, I don't think it's a good idea to store "0000-00-00" instead of nil for the date.
Manually assigning fields like id, and the timestamp fields too, solved it:
model.id = row[:id]
and similarly for created_at and updated_at, if they exist in the model.
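One way to keep this generic across models is to split each row into the fields that are assigned by hand and the remaining attributes. The sketch below runs on plain hashes only; the explicit_keys list is an assumption, and the step of building the record (for example, creating it from rest and then assigning the explicit values one by one) is left out because it depends on the model.

```ruby
require 'csv'

csv_text = <<~ROWS
  id,code,created_at,updated_at,hashcode
  10,00001,2012-04-12 06:07:26,2012-04-12 06:07:26,
  2,00002,0000-00-00 00:00:00,0000-00-00 00:00:00,temphashcode
ROWS

# Columns to assign by hand rather than pass along with the other attributes (assumed list).
explicit_keys = %i[id created_at updated_at]

prepared = CSV.parse(csv_text, headers: true).map do |row|
  attrs = row.to_h.transform_keys(&:to_sym)
  explicit = attrs.slice(*explicit_keys)
  rest = attrs.reject { |key, _| explicit_keys.include?(key) }
  [explicit, rest]
end

prepared.each do |explicit, rest|
  puts "assign manually: #{explicit.inspect}"
  puts "pass as attributes: #{rest.inspect}"
end
```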
