Rails import from csv to model - ruby-on-rails

I have a CSV file containing a dump of a table, and I would like to import it directly into my database using Rails.
This is my current code:
csv_text = File.read("public/csv_fetch/#{model.table_name}.csv")
ActiveRecord::Base.connection.execute("TRUNCATE TABLE #{model.table_name}")
puts "\nUpdating table #{model.table_name}"
csv = CSV.parse(csv_text, :headers => true)
csv.each do |row|
  row = row.to_hash.with_indifferent_access
  ActiveRecord::Base.record_timestamps = false
  model.create!(row.to_hash.symbolize_keys)
end
with help from here..
Consider my sample CSV:
id,code,created_at,updated_at,hashcode
10,00001,2012-04-12 06:07:26,2012-04-12 06:07:26,
2,00002,0000-00-00 00:00:00,0000-00-00 00:00:00,temphashcode
13,00007,0000-00-00 00:00:00,0000-00-00 00:00:00,temphashcode
43,00011,0000-00-00 00:00:00,0000-00-00 00:00:00,temphashcode
5,00012,0000-00-00 00:00:00,0000-00-00 00:00:00,temphashcode
The problems with this code are:
1. It generates id as an auto-incremented 1, 2, 3, ... instead of using the values from the CSV file.
2. Timestamps of 0000-00-00 00:00:00 are converted to null automatically, which raises an error because the column created_at cannot be null.
Is there a generic way to import from CSV into models, or would I have to write custom code for each model to manipulate the attributes in each row manually?

For question 1, I suggest logging row.to_hash.symbolize_keys to see whether "id" is actually being assigned, e.g.
# ...
csv.each do |row|
  # ...
  hash = row.to_hash.symbolize_keys
  Rails.logger.info "hash: #{hash.inspect}"
  model.create!(hash)
end
For question 2, I don't think it's a good idea to store "0000-00-00" instead of nil for the date.

Setting fields such as id and the timestamps manually solved it:
model.id = row[:id]
and likewise for created_at and updated_at, if they exist in the model.
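To make the manual assignment above a bit more generic, here is a minimal sketch that prepares each row before it reaches the model. The helper name normalize_row, the ZERO_TIMESTAMP constant and the fallback time are my own assumptions for illustration, not part of the original code, and it runs on plain hashes so no database is needed.

```ruby
require 'csv'

# Hypothetical helper (not from the original post): turns one CSV row into an
# attributes hash that keeps the CSV's id and replaces MySQL "zero" timestamps
# with a fallback value so a NOT NULL column does not reject them.
ZERO_TIMESTAMP = '0000-00-00 00:00:00'
TIMESTAMP_COLUMNS = %i[created_at updated_at].freeze

def normalize_row(row, fallback_time)
  attrs = row.to_h.transform_keys(&:to_sym)
  attrs[:id] = Integer(attrs[:id], 10) if attrs[:id]
  TIMESTAMP_COLUMNS.each do |column|
    next unless attrs.key?(column)

    value = attrs[column]
    attrs[column] = fallback_time if value.nil? || value == ZERO_TIMESTAMP
  end
  attrs
end

csv_text = <<~CSV
  id,code,created_at,updated_at,hashcode
  10,00001,2012-04-12 06:07:26,2012-04-12 06:07:26,
  2,00002,0000-00-00 00:00:00,0000-00-00 00:00:00,temphashcode
CSV

fallback = Time.utc(2000, 1, 1) # assumed placeholder; pick whatever suits your data
rows = CSV.parse(csv_text, headers: true).map { |row| normalize_row(row, fallback) }
```

In the import loop you would then build the record from everything except :id and assign the id separately, as in the snippet above, before saving.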

Related

Ruby CSV foreach write to csv using Row object

I want to loop over a csv file using CSV.foreach, read the data, perform some operation with it, and write the result to the last column of that row, using the Row object.
Let's say I have a CSV with data I need to save to a database using Rails ActiveRecord. I validate each record: if it is valid, I write true in the last column; if not, I write the errors.
Example csv:
id,title
1,some title
2,another title
3,yet another title
CSV.foreach(path, "r+", headers: true) do |row|
  archive = Archive.new(
    title: row["title"]
  )
  archive.save!
  row["valid"] = true
rescue ActiveRecord::RecordInvalid => e
  row["valid"] = archive.errors.full_messages.join(";")
end
When I run the code it reads the data, but it does not write anything to the csv. Is this possible?
Is it possible to write in the same csv file?
Using:
Ruby 3.0.4
The row variable in your iterator exists only in memory. You need to write the information back to the file yourself, like this:
new_csv = ["id,title,valid\n"]
CSV.foreach(path, headers: true) do |row|
  row["valid"] = 'foo'
  new_csv << row.to_s
end
File.open(path, 'w') do |f|
  # join the collected lines; writing the array itself would write its inspect form
  f.write new_csv.join
end
[EDIT] The 'r+' mode passed to foreach in the question is not needed; reading only requires 'r', which is the default.
Maybe this is over-engineering things a bit, but I would do the following:
1. Read the original CSV file.
2. Create a temporary CSV file.
3. Insert the updated headers into the temporary CSV file.
4. Insert the updated records into the temporary CSV file.
5. Replace the original CSV file with the temporary CSV file.
require 'csv'
require 'fileutils'
require 'securerandom'
csv_path = 'archives.csv'
input_csv = CSV.read(csv_path, headers: true)
input_headers = input_csv.headers
# using a UUID to prevent file name conflicts
tmp_csv_path = "#{csv_path}.#{SecureRandom.uuid}.tmp"
output_headers = input_headers + %w[errors]
CSV.open(tmp_csv_path, 'w', write_headers: true, headers: output_headers) do |output_csv|
  input_csv.each do |archive_data|
    values = archive_data.values_at(*input_headers)
    archive = Archive.new(archive_data.to_h)
    # run the validations so that archive.errors is populated
    archive.valid?
    # error_messages is an empty string if there are no errors
    error_messages = archive.errors.full_messages.join(';')
    output_csv << values + [error_messages]
  end
end
# replace the original file only after the temporary file is complete
FileUtils.move(tmp_csv_path, csv_path)

How to manipulate a CSV object in ruby?

I want to export some ActiveRecord objects in CSV format. After checking some tutorials, I found this:
def export_as_csv(equipments)
  attributes = %w[id title description category_id]
  CSV.generate(headers: true) do |csv|
    csv << attributes
    equipments.each do |equipment|
      csv << equipment.attributes.values_at(*attributes)
    end
    return csv
  end
end
The problem is that I want to manipulate everything in memory in my tests (i.e. I don't want to save the file to disk). So, when I receive this csv object as the return value, how can I iterate through its rows and columns? I come from Python, so I tried:
csv = exporter.export_as_csv(equipments)
for row in csv:
foo(row)
But obviously that didn't work. Also, the equipments are definitely not nil.
CSV.generate returns a string formatted according to CSV rules.
So the most obvious way is to parse it and iterate, like:
csv = exporter.export_as_csv(equipments)
CSV.parse(csv).each do |line|
  # line => ['a', 'b', 'c']
end
After watching some videos, I found that the return was the problem: by returning csv from inside the block, I was receiving the CSV object rather than the generated CSV string.
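For reference, here is a small sketch of the export method with the explicit return removed, followed by parsing the result in memory. The Equipment struct and the sample records are stand-ins I made up so the example runs without Rails; a real model would already provide attributes.

```ruby
require 'csv'

# Hypothetical stand-in for an ActiveRecord model; only #attributes is needed.
Equipment = Struct.new(:id, :title, :description, :category_id) do
  def attributes
    to_h.transform_keys(&:to_s)
  end
end

def export_as_csv(equipments)
  attributes = %w[id title description category_id]
  # No explicit `return` inside the block: CSV.generate itself returns the String.
  CSV.generate do |csv|
    csv << attributes
    equipments.each do |equipment|
      csv << equipment.attributes.values_at(*attributes)
    end
  end
end

equipments = [
  Equipment.new(1, 'Drill', 'Cordless drill', 3),
  Equipment.new(2, 'Saw', 'Hand saw', 3)
]

csv_string = export_as_csv(equipments)

# Parse the string back in memory; headers: true yields CSV::Row objects,
# so columns can be read by name.
CSV.parse(csv_string, headers: true).each do |row|
  puts "#{row['id']}: #{row['title']}"
end
```

Nothing touches the disk here, which is what you want in a test: assert on csv_string directly, or on the parsed rows.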

How to Dynamically add attributes from csv file

I am new to RoR.
I want to dynamically add attributes from a csv file so that my code would be able to dynamically read any csv file and build the db (i.e. convert any CSV file into Ruby objects)
I was using the below code
csv_data = File.read('myData.csv')
csv = CSV.parse(csv_data, :headers => true, :header_converters => :symbol)
csv.each do |row|
MyModel.create!(row.to_hash)
end
However it will fail for the following example
myData.csv
Name,id
foo,1
bar,10
myData2.csv
Name,value
foo,1
bar,10
It results in an error for myData2 because value is not an attribute of MyModel:
unknown attribute 'value' for MyModel.
I have thought about using send(:attrAccessor, name) but I was not sure how can I integrate it when reading from csv, any ideas ?
You are doing it properly, but you can also pass all the records to create in a single call (note that create still inserts them one at a time):
csv_data =
  CSV.read("#{Rails.root}/myData.csv",
           headers: true,
           header_converters: :symbol
  ).map(&:to_hash)
MyModel.create(csv_data)
NOTE: If the data is always going to be the same, you can use seeds.rb.
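If the goal is just to avoid the unknown attribute error, one option is to drop the columns the model does not know about before calling create. This is only a sketch under assumptions: KNOWN_ATTRIBUTES and permitted_rows are names I invented, and in a Rails app the list would more likely come from MyModel.column_names than from a hard-coded constant.

```ruby
require 'csv'

# Assumption: in a Rails app this list would come from MyModel.column_names;
# it is hard-coded here so the sketch runs without a database.
KNOWN_ATTRIBUTES = %i[name id].freeze

# Hypothetical helper: parses the CSV and keeps only the known attributes.
def permitted_rows(csv_text, known_attributes)
  CSV.parse(csv_text, headers: true, header_converters: :symbol).map do |row|
    row.to_h.slice(*known_attributes)
  end
end

my_data2 = <<~CSV
  Name,value
  foo,1
  bar,10
CSV

rows = permitted_rows(my_data2, KNOWN_ATTRIBUTES)
# Each element of rows could then be passed to MyModel.create!.
```

This silently ignores the extra value column rather than adding it to the model, so it only helps if losing that data is acceptable.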

In Rails, how do I export to CSV while translating values of a specific column?

Currently, I'm using Rails and able to export, but there are values within the DB that are in a numeric format, and I need them to be translated into an alphanumeric format. I have the translations, but I don't know how to do it while exporting to CSV
Here's my current snippet of code to export to CSV
def self.to_csv(mycolumns)
  CSV.generate() do |csv|
    csv << mycolumns
    all.each do |ccts|
      csv << ccts.attributes.values_at(*mycolumns)
    end
  end
end
So my initial thought was that I could go into each ccts and edit them, but I don't know how to access the value within the hash and alter it, and it's only for a specific column. For instance, suppose this table was for fruits and one of the column names was Name. If I wanted to change a value of 0041 into Apple, but only within the Name column, I'm not sure how to accomplish this.
The csv export code is very compact, especially this line:
csv << ccts.attributes.values_at(*mycolumns)
That makes it difficult to think about how to change it.
First think about how you would export the value for a single column. It might look something like:
# to_s so the check works whether the column names are strings or symbols
if column_name.to_s == 'name'
  lookup_fruit_name(ccts.name)
else
  ccts[column_name]
end
Now you need all the values of a ccts in an array, so it can be sent to csv:
values = mycolumns.map do |column_name|
  if column_name.to_s == 'name'
    lookup_fruit_name(ccts.name)
  else
    ccts[column_name]
  end
end
csv << values
Then just place this inside the inner loop of your original export method.
If you prefer a more functional style, write an instance method that returns the value, converting it depending on the column:
def csv_value_for(column_name)
  if column_name.to_s == 'name'
    lookup_fruit_name(self.name)
  else
    self[column_name]
  end
end
Then you can use it like this:
csv << mycolumns.map { |col| ccts.csv_value_for(col) }
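Putting the pieces together, a self-contained sketch of the whole export might look like the following. The FRUIT_NAMES table, the Fruit struct and fruits_to_csv are hypothetical names standing in for your translations, your model and your to_csv method; unknown codes are passed through unchanged, which is my choice and not something the question specifies.

```ruby
require 'csv'

# Hypothetical lookup table and record type, for illustration only.
FRUIT_NAMES = { '0041' => 'Apple', '0042' => 'Banana' }.freeze
Fruit = Struct.new(:id, :name, :color)

def fruits_to_csv(fruits, columns)
  CSV.generate do |csv|
    csv << columns
    fruits.each do |fruit|
      values = columns.map do |column|
        raw = fruit[column]
        # Translate only the name column; fall back to the raw value
        # when there is no translation for it.
        column == 'name' ? FRUIT_NAMES.fetch(raw, raw) : raw
      end
      csv << values
    end
  end
end

fruits = [Fruit.new(1, '0041', 'red'), Fruit.new(2, '0099', 'green')]
csv_output = fruits_to_csv(fruits, %w[id name color])
puts csv_output
```

Using a hash with fetch keeps the translation in one place, so adding another translated column is a matter of adding another lookup rather than another branch.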

Ruby on Rails - Import Data from a CSV file

I would like to import data from a CSV file into an existing database table. I do not want to save the CSV file, just take the data from it and put it into the existing table. I am using Ruby 1.9.2 and Rails 3.
This is my table:
create_table "mouldings", :force => true do |t|
  t.string "suppliers_code"
  t.datetime "created_at"
  t.datetime "updated_at"
  t.string "name"
  t.integer "supplier_id"
  t.decimal "length", :precision => 3, :scale => 2
  t.decimal "cost", :precision => 4, :scale => 2
  t.integer "width"
  t.integer "depth"
end
Can you give me some code to show me the best way to do this, thanks.
require 'csv'
csv_text = File.read('...')
csv = CSV.parse(csv_text, :headers => true)
csv.each do |row|
  Moulding.create!(row.to_hash)
end
A simpler version of yfeldblum's answer that also works well with large files:
require 'csv'
CSV.foreach(filename, headers: true) do |row|
  Moulding.create!(row.to_hash)
end
No need for with_indifferent_access or symbolize_keys, and no need to read the file into a string first.
It doesn't keep the whole file in memory at once; it reads the file line by line and creates one Moulding per line.
The smarter_csv gem was created specifically for this use case: reading data from a CSV file and quickly creating database entries.
require 'smarter_csv'
options = {}
SmarterCSV.process('input_file.csv', options) do |chunk|
  chunk.each do |data_hash|
    Moulding.create!( data_hash )
  end
end
You can use the option chunk_size to read N csv-rows at a time, and then use Resque in the inner loop to generate jobs which will create the new records, rather than creating them right away - this way you can spread the load of generating entries to multiple workers.
See also:
https://github.com/tilo/smarter_csv
You might try Upsert:
require 'upsert' # add this to your Gemfile
require 'csv'
u = Upsert.new Moulding.connection, Moulding.table_name
CSV.foreach(file, headers: true) do |row|
  selector = { name: row['name'] } # this treats "name" as the primary key and prevents the creation of duplicates by name
  setter = row.to_hash
  u.row selector, setter
end
If this is what you want, you might also consider getting rid of the auto-increment primary key from the table and setting the primary key to name. Alternatively, if some combination of attributes forms a primary key, use that as the selector. No index is necessary; one will just make it faster.
This can help. It has code examples too:
http://csv-mapper.rubyforge.org/
Or for a rake task for doing the same:
http://erikonrails.snowedin.net/?p=212
It is better to wrap the database-related work inside a transaction block. The code snippet below is the full process of seeding a set of languages into the Language model:
require 'csv'
namespace :lan do
  desc 'Seed initial languages data with language & code'
  task init_data: :environment do
    puts '>>> Initializing Languages Data Table'
    ActiveRecord::Base.transaction do
      csv_path = File.expand_path('languages.csv', File.dirname(__FILE__))
      csv_str = File.read(csv_path)
      csv = CSV.new(csv_str).to_a
      csv.each do |lan_set|
        lan_code = lan_set[0]
        lan_str = lan_set[1]
        Language.create!(language: lan_str, code: lan_code)
        print '.'
      end
    end
    puts ''
    puts '>>> Languages Database Table Initialization Completed'
  end
end
The snippet below is part of the languages.csv file:
aa,Afar
ab,Abkhazian
af,Afrikaans
ak,Akan
am,Amharic
ar,Arabic
as,Assamese
ay,Aymara
az,Azerbaijani
ba,Bashkir
...
A better way is to put it in a rake task. Create an import.rake file inside /lib/tasks/ and put this code in it:
desc "Imports a CSV file into an ActiveRecord table"
task :csv_model_import, [:filename, :model] => [:environment] do |task,args|
  lines = File.new(args[:filename], "r:ISO-8859-1").readlines
  header = lines.shift.strip
  keys = header.split(',')
  lines.each do |line|
    values = line.strip.split(',')
    attributes = Hash[keys.zip values]
    Module.const_get(args[:model]).create(attributes)
  end
end
After that, run this command in your terminal: rake csv_model_import[file.csv,Name_of_the_Model]
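One caveat with splitting each line on commas by hand: it breaks as soon as a field is quoted and contains a comma. A small sketch of the difference, using the standard library's CSV.parse_line (the sample line and the keys are made up for illustration):

```ruby
require 'csv'

line = '7,"Oak, stained",12.50'
keys = %w[id name cost]

# Naive splitting cuts the quoted field in two and keeps the quote characters.
naive_values = line.strip.split(',')

# CSV.parse_line understands the quoting and returns three fields.
csv_values = CSV.parse_line(line)

attributes = Hash[keys.zip(csv_values)]
puts attributes.inspect
```

So inside the rake task it is probably safer to replace the two split(',') calls with CSV.parse_line, or to read the whole file with CSV.foreach.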
I know it's an old question, but it is still in the first 10 links on Google.
Saving rows one by one is not very efficient, because it causes a database call on every loop iteration; you are better off avoiding that, especially when you need to insert large amounts of data.
It's better (and significantly faster) to use a batch insert:
INSERT INTO `mouldings` (suppliers_code, name, cost)
VALUES
('s1', 'supplier1', 1.111),
('s2', 'supplier2', 2.222)
You can build such a query manually and then run Model.connection.execute(RAW SQL STRING) (not recommended),
or use the activerecord-import gem (first released on 11 Aug 2010): in that case, just put the data in an array rows and call Model.import rows.
Refer to the gem docs for details.
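To feed a batch insert without loading the whole file, the rows can be read lazily and grouped with each_slice. The sketch below only builds the batches; each_csv_batch is a name I made up, the temporary file stands in for your real CSV, and the actual insert call is left as a comment because it depends on which gem or Rails version you use.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: reads a CSV lazily and yields arrays of attribute
# hashes, at most batch_size per array, so each array can be sent to the
# database as one multi-row insert.
def each_csv_batch(path, batch_size)
  CSV.foreach(path, headers: true).each_slice(batch_size) do |rows|
    yield rows.map(&:to_h)
  end
end

# Sample file so the sketch is runnable on its own.
file = Tempfile.new(['mouldings', '.csv'])
file.write("suppliers_code,name,cost\ns1,supplier1,1.11\ns2,supplier2,2.22\ns3,supplier3,3.33\n")
file.close

batch_sizes = []
first_batch = nil
each_csv_batch(file.path, 2) do |batch|
  first_batch ||= batch
  batch_sizes << batch.size
  # In a Rails app, this is where the bulk insert for the batch would go.
end
file.unlink
```

The batch size trades memory for the number of queries; a few hundred to a few thousand rows per batch is a common starting point, but that is something to measure rather than assume.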
Use this gem:
https://rubygems.org/gems/active_record_importer
class Moulding < ActiveRecord::Base
  acts_as_importable
end
Then you can use:
Moulding.import!(file: File.open(PATH_TO_FILE))
Just be sure that your headers match the column names of your table.
The following module can be extended on any model and it will import the data according to the column headers defined in the CSV.
Note:
This is a great internal tool; for customer use I would recommend adding safeguards and sanitization
The column names in the CSV must be exactly like the DB schema or it won't work
It can be further improved by using the table name to get the headers vs defining them in the file
Create a file named "csv_importer.rb" in your models/concerns folder
module CsvImporter
  extend ActiveSupport::Concern
  require 'csv'
  def convert_csv_to_book_attributes(csv_path)
    csv_rows = CSV.open(csv_path).each.to_a.compact
    columns = csv_rows[0].map(&:strip).map(&:to_sym)
    csv_rows.shift
    return columns, csv_rows
  end
  def import_by_csv(csv_path)
    columns, attributes_array = convert_csv_to_book_attributes(csv_path)
    message = ""
    begin
      self.import columns, attributes_array, validate: false
      message = "Import Successful."
    rescue => e
      message = e.message
    end
    return message
  end
end
Add extend CsvImporter to whichever model you would like to extend this functionality to.
In your controller you can have an action like the following to utilize this functionality:
def import_file
  model_name = params[:table_name].singularize.camelize.constantize
  csv = params[:file].path
  @message = model_name.import_by_csv(csv)
end
It's better to use CSV::Table together with String#encode(universal_newline: true), which converts CRLF and CR line endings to LF.
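A minimal sketch of that idea, assuming the file contents are already in a string (the sample data with mixed line endings is made up):

```ruby
require 'csv'

# Mixed line endings: CRLF after the header, a bare CR after the first row.
raw = "name,width\r\nOak,20\rPine,15\n"

# universal_newline: true rewrites CRLF and CR as LF.
normalized = raw.encode(universal_newline: true)

# With headers: true, CSV.parse returns a CSV::Table.
table = CSV.parse(normalized, headers: true)
table.each { |row| puts "#{row['name']} is #{row['width']} wide" }
```

Normalizing first means the parser sees a single, consistent row separator, which avoids surprises with files exported from different operating systems.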
If you want to use SmarterCSV:
all_data = SmarterCSV.process(
  params[:file].tempfile,
  {
    :col_sep => "\t",
    :row_sep => "\n"
  }
)
This reads tab-delimited data (fields separated by "\t"), with rows separated by newlines ("\n").
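If you would rather not add a gem, the standard library's CSV accepts the same two separators. This is the stdlib equivalent as I understand it, not SmarterCSV itself, and the sample data is invented:

```ruby
require 'csv'

tsv_text = "name\twidth\nOak\t20\nPine\t15\n"

# col_sep and row_sep are standard CSV options; headers: true keys each
# row by the first line of the data.
rows = CSV.parse(tsv_text, col_sep: "\t", row_sep: "\n", headers: true).map(&:to_h)
puts rows.inspect
```

Note that, unlike SmarterCSV, the keys here stay as strings and values are not converted, so any symbolizing or type casting has to be done separately.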
