I am importing a CSV file using 'csv'. Import is working as expected, but I would like to update existing records, based on a secondary key field.
I am using the following code:
CSV.foreach(path, :headers => true) do |row|
  if Product.exists?(secondary_key: row['secondary_key'])
    # Update goes here
  else
    Product.create!(row.to_hash)
  end
end
I have tried (among others):
product = Product.where(:secondary_key => row['secondary_key'])
Product.update(product, row.to_hash)
Now that trial-and-error is not bringing me anywhere, I would appreciate your help!
You can issue an update statement using this syntax:
Product.where(secondary_key: row['secondary_key']).update_all(:foo => "bar")
This will generate a query like
UPDATE products SET foo = 'bar' WHERE secondary_key = "#{row['secondary_key']}"
How about using find_or_initialize_by:
CSV.foreach(path, :headers => true) do |row|
  product = Product.find_or_initialize_by(secondary_key: row['secondary_key'])
  product.update(row.to_hash.except('secondary_key'))
end
First we either find the existing product by the secondary_key or we initialize a new one with secondary_key. Then, in either case, we update all product attributes from the row values (excluding the secondary_key value since that's already set).
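The find-or-initialize-then-update flow can be tried out without a database. What follows is only a minimal sketch in plain Ruby, assuming an in-memory Hash (`store`) stands in for the products table; `upsert_rows` and the sample columns `name` and `price` are made up for illustration, and none of this is how ActiveRecord implements it.

```ruby
require 'csv'

# Hypothetical stand-in for the products table: secondary_key => attributes.
def upsert_rows(csv_text, store)
  CSV.parse(csv_text, headers: true) do |row|
    attrs = row.to_h
    key = attrs.delete('secondary_key')
    # Find the existing record, or start a new one that has only the key set.
    record = (store[key] ||= { 'secondary_key' => key })
    # In both cases, overwrite the remaining attributes from the row.
    record.merge!(attrs)
  end
  store
end

store = { 'A1' => { 'secondary_key' => 'A1', 'name' => 'Old name', 'price' => '5' } }
csv_text = <<~CSV
  secondary_key,name,price
  A1,New name,7
  B2,Widget,3
CSV

upsert_rows(csv_text, store)
puts store['A1']['name'] # the existing record was updated
puts store.key?('B2')    # a new record was created
```

The point of the pattern is that the loop body has no if/else: both the "found" and the "new" case go through the same update step.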
Or, with first_or_initialize scoped by a where clause:
product = Product.where(secondary_key: row['secondary_key']).first_or_initialize
product.update(row.to_hash.except('secondary_key'))
I created a rake task to import users from a Google Sheet, using the 'roo' gem. Everything works so far, but I can't find a way to run the import without also importing the first row (the headers).
This is my code:
require 'roo'
namespace :import do
  desc "Import users from Google Sheet"
  task users: :environment do
    @counter = 0
    url = 'https://docs.google.com/spreadsheets/d/{mycode}/export?format=xlsx'
    xlsx = Roo::Spreadsheet.open(url, extension: :xlsx, headers: true)
    xlsx.each do |row|
      n = User.where(name:row[0]).first
      user = User.find_or_create_by(id: n)
      user.update(
        name:row[0],
        country_id:row[6]
      )
      user.save!
      puts user.name
      @counter += 1
    end
    puts "Imported #{@counter} lines."
  end
end
Your code says headers: true when you are opening the sheet. Have you tried turning it to false? Or are you saying it does not work when it's set to false?
Also, you are using .each rather differently than the example in the documentation. The doc shows a hash with keys derived from the headers. You are using [n] array notation. Does that work?
EDIT:
Try using .each in a way that's more similar to what the documentation says:
xlsx.each(name: 'Name', country_id: 'Country ID') do |row|
  n = User.where(name: row[:name]).first
  ...
end
The strings 'Name' and 'Country ID' are just examples; they should be the text of whatever column headers have the name and country_id information.
There is a way to skip the headers: use the each_row_streaming(offset: 1) method.
It skips the header and yields each remaining row as an array of cells, so you have to read each value with the .value method. The documentation describes this for Excelx::Cell objects, but it works for sheets opened with Roo::Spreadsheet too.
The documentation example:
xlsx.each_row_streaming(offset: 1) do |row| # Will exclude first (inevitably header) row
  puts row.inspect # Array of Excelx::Cell objects
end
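Applied to the import in the question, the streaming approach could look roughly like the sketch below. It assumes the name is in column 0 and the country id in column 6, as in the question. `user_attributes` is a hypothetical helper, and `FakeSheet` and `FakeCell` are stand-ins that exist only so the example runs without Roo or network access; a real sheet opened with Roo would be passed in their place.

```ruby
# Stand-ins so the sketch runs without Roo (hypothetical, for illustration only).
FakeCell = Struct.new(:value)

class FakeSheet
  def initialize(rows)
    @rows = rows
  end

  # Mimics only the shape of each_row_streaming(offset:) needed by this sketch.
  def each_row_streaming(offset: 0)
    @rows.drop(offset).each { |cells| yield cells.map { |v| FakeCell.new(v) } }
  end
end

# Hypothetical helper: collects name and country_id from every non-header row.
def user_attributes(sheet)
  records = []
  sheet.each_row_streaming(offset: 1) do |row|
    # Read the raw value out of each cell; &. tolerates a nil placeholder cell.
    values = row.map { |cell| cell&.value }
    records << { name: values[0], country_id: values[6] }
  end
  records
end

sheet = FakeSheet.new([
  ['Name', 'B', 'C', 'D', 'E', 'F', 'Country ID'],
  ['Alice', nil, nil, nil, nil, nil, 3]
])
p user_attributes(sheet)
```

Whether empty cells shift the column indexes when streaming depends on the options passed to Roo, so it is worth checking the Roo documentation before relying on fixed positions such as 0 and 6.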
I'm parsing a CSV and trying to distinguish between columns in Model and "virtual" columns that'll be added to a JSONB :data column. So far I've got this:
rows = SmarterCSV.process(csv.path)
rows.each do |row|
  row.select! { |x| Model.attribute_method?(x) } # this ignores non-matches
  Model.create(row)
end
That removes columns from the CSV row that don't match up with Model. Instead, I want to add the data from all those into a column in Model called :data. How can I do that?
Edit
Something like this before the select! maybe?
row[:data] = row.select { |x| !Model.attribute_method?(x) }
There are a number of ways you could do this. One particularly straightforward way is with Hash#slice! from Rails' ActiveSupport extensions. It keeps only the given keys in the receiver and returns a Hash of the key-value pairs it removed:
rows = SmarterCSV.process(csv.path)
attrs = Model.attribute_names.map(&:to_sym)
rows.each do |row|
  row[:data] = row.slice!(*attrs)
  Model.create(row)
end
P.S. This could probably be filed under "Stupid Ruby Tricks," but if you're using Ruby 2.0+ you can take advantage of the double-splat (**) for this compact construction:
rows.each do |row|
  Model.create(data: row.slice!(*attrs), **row)
end
P.P.S. If your CSVs are big and you find yourself having performance concerns (calling create a few thousand times—and the subsequent database INSERTs—ain't cheap), I recommend checking out the activerecord-import gem. It's designed for exactly this sort of thing. With it you'd do something like this:
rows = SmarterCSV.process(csv.path)
attrs = Model.attribute_names.map(&:to_sym)
models = rows.map do |row|
  row[:data] = row.slice!(*attrs)
  Model.new(row)
end
Model.import(models)
There are other, faster options as well in the activerecord-import docs.
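The split that Hash#slice! performs in this answer can also be sketched in plain Ruby, without ActiveSupport. This is only an illustration: `ATTRS` is a hypothetical column list standing in for Model.attribute_names, and `split_row` is a made-up helper name.

```ruby
# Hypothetical list of real columns; in the answer it comes from Model.attribute_names.
ATTRS = %i[id name email].freeze

# Returns a new hash with known columns at the top level and the rest under :data.
def split_row(row, attrs = ATTRS)
  known, extra = row.partition { |key, _value| attrs.include?(key) }.map(&:to_h)
  known.merge(data: extra)
end

row = { name: 'Ada', email: 'ada@example.com', nickname: 'Countess', shoe_size: 38 }
p split_row(row)
```

Unlike slice!, this version does not mutate the original row, which can be easier to reason about when the row is reused later.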
You can try has_attribute? with a non-mutating reject, placed before the select!:
row[:data] = row.reject { |key, _value| Model.has_attribute?(key) }
Have you tried:
data = row.reject { |k, v| Model.attribute_method?(k) }
row.delete_if { |k, v| !Model.attribute_method?(k) }
row[:data] = data
Model.create(row)
This collects the non-matching key-value pairs first, removes them from the row hash, and then adds them back to the row under a :data key.
I have the following CSV import action in my Miniatures model
def self.import(file)
  CSV.foreach(file.path, headers: true) do |row|
    Miniature.create! row.to_hash
  end
end
That works fine but what I want to do is use further columns in the CSV to create associated objects.
This is one of my attempts:
def self.import(file)
  CSV.foreach(file.path, headers: true) do |row|
    Miniature.create! row.to_hash.slice(row[0..10])
    @miniature.sizes.build(:scale_id => row[11])
  end
end
My attempts at slicing the row have been very unsuccessful. If I don't slice off the first 10 columns then Miniature.create tries to parse the 11th column, which only applies to the associated Size model. I want to slice off the first 10 and create a Miniature object with them, and then build or create a line in my Sizes join table with the supplied scale_id.
Any help very much appreciated.
Another Update
This is my latest and cleanest attempt:
Miniature.create! row.to_hash.except!(:scale_id)
That throws "unknown attribute: scale_id" as an error. Is it possible that I can't interact with the keys at all after the CSV.foreach(file.path, headers: true) do |row| ?
Updated again
I can see one reason why my above code won't work. I'm specifying a range for the fields in the row but hashes don't have an order. I've now tried specifying the keys that I want to deal with using indices but got undefined method 'indices'.
row.to_hash.indices(:name,:material,:release_date,:pcode,:notes,:set,:random,:quantity,:date_mask,:multipart)
I can't turn it into an array or I'll get stringify_keys errors, but I need to be able to specify which fields the create action should use so that I can use some for the Miniature create and some for the Size create.
My controller action by the way is as follows.
def import
  Miniature.import(params[:file])
  redirect_to miniatures_path, notice: "Miniatures imported."
end
Update
Here is the seed data I'm using to import
name,material,release_date,pcode,notes,set,random,quantity,date_mask,multipart,scale_id
A A A CSV test,Metal,03/01/2013,123123,Test notes,f,f,,6,f,1
With the above code, the error I get is
"Validation failed: Name can't be blank, Material can't be blank"
but through trying things out I've had a variety of errors which indicate my row.to_hash.slice is not being parsed in the way the simpler row.to_hash is.
The expected result is either a successfully created Miniature object and a Size object OR an error on creating the size object because it can't infer the miniature_id from my using @miniature.sizes.build and wants more params. Can't debug that until initial slicing stage is passed/parsed.
You're assuming the keys of the hash are symbols like :scale_id, but they are in fact strings like 'scale_id', and this is where you're tripping up: except!(:scale_id) finds no matching key, so scale_id is still passed to create!. Symbolize the keys before calling except!:
symbolized_row = row.to_hash.inject({}){|memo,(k,v)| memo[k.to_sym] = v; memo}
Miniature.create! symbolized_row.except!(:scale_id)
EDIT
Actually if you use except instead of the mutating except! then you'll have access to the scale id in subsequent lines.
symbolized_row = row.to_hash.inject({}){|memo,(k,v)| memo[k.to_sym] = v; memo}
miniature = Miniature.create! symbolized_row.except(:scale_id)
miniature.sizes.create!(:scale_id => symbolized_row[:scale_id])
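The string-versus-symbol key behaviour is easy to check in plain Ruby. The sketch below assumes Ruby 2.5 or later for Hash#transform_keys, uses a trimmed-down version of the seed data, and introduces a hypothetical helper `split_miniature_row`; it only shows the hash handling and does not touch any model.

```ruby
require 'csv'

# Hypothetical helper: returns [attributes for the miniature, scale_id].
def split_miniature_row(row)
  attrs = row.to_h.transform_keys(&:to_sym)
  # Non-mutating split, so :scale_id stays readable on attrs afterwards.
  miniature_attrs = attrs.reject { |key, _value| key == :scale_id }
  [miniature_attrs, attrs[:scale_id]]
end

csv_text = <<~CSV
  name,material,scale_id
  A A A CSV test,Metal,1
CSV

row = CSV.parse(csv_text, headers: true).first
p row.to_h.keys # the keys are strings, not symbols
miniature_attrs, scale_id = split_miniature_row(row)
p miniature_attrs
p scale_id
```

Note that the scale id comes back as a string, since CSV does not convert values unless converters are configured.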
I feel like this is programming 101 stuff, but I am going to swallow my pride and ask for help. I have a CSV that I am processing. Here is a sample...
person_id, name, start_date
1111, busta, 1/1/14
1111, busta, 1/4/14
1111, busta, 1/7/14
2222, mista, 1/3/14
2222, mista, 1/1/14
2222, mista, 1/11/14
...and here is a sample of the code I am using to process the rows...
def self.import(file)
  student_start_dates = Hash.new {|hsh, key| hsh[key] = [] }
  CSV.foreach(file.tempfile, :headers => true) do |row|
    student_start_dates[row["person_id"]] << row["start_date"]
    # need something in the loop that says hey...when I find a new person_id send this array to the process method
  end
end
def self.process(student)
  # process something like 1111 => ["1/1/14", "1/4/14", "1/7/14"]
end
So as you can see from the data, each student has multiple start dates associated with them. I am trying to build an array of start_dates for each student. When I find a new person_id, I then need to 'do some stuff' with my start_date array. My question is: what is the best way to add logic that looks for a change in the person_id as I loop through each row in my CSV? I know I could set some sort of flag when the person_id changes, then process my start_date array based on the state of that flag, and reset the flag. However, I've tried implementing that without much luck, and when it did work it felt 'dirty'. I'm just hoping a fresh set of eyes will give me some ideas for cleaner code.
A big part of my issue is the best way to set a flag that says "when you find a new student (new person_id), call the process method to find the earliest start date".
If I understand this correctly, you're trying to get a resulting hash that would look something like {1111 => ["1/1/14", "1/4/14", "1/7/14"], 2222 => [...], ...}
If so you could use the built in CSV parser and just construct the hash as you loop over each row.
# Create the hash, the default value will be an array
student_start_dates = Hash.new {|hsh, key| hsh[key] = [] }
CSV.foreach(file_name, :headers => true) do |row|
  student_start_dates[row["person_id"]] << row["start_date"]
end
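Once the whole file has been grouped this way, no flag for a changing person_id is needed: each student's dates can be processed after the loop. The following is a rough sketch, assuming the dates are in month/day/two-digit-year format and the CSV has no spaces after the commas; `earliest_start_dates` is a hypothetical helper name.

```ruby
require 'csv'
require 'date'

# Hypothetical helper: groups start dates by person_id, then keeps the earliest.
def earliest_start_dates(csv_text)
  student_start_dates = Hash.new { |hsh, key| hsh[key] = [] }
  CSV.parse(csv_text, headers: true) do |row|
    # Assumed date format: month/day/two-digit year, e.g. 1/4/14.
    student_start_dates[row['person_id']] << Date.strptime(row['start_date'], '%m/%d/%y')
  end
  # Reduce each student's list of dates to its minimum.
  student_start_dates.transform_values(&:min)
end

csv_text = <<~CSV
  person_id,name,start_date
  1111,busta,1/4/14
  1111,busta,1/1/14
  2222,mista,1/3/14
  2222,mista,1/11/14
CSV

earliest_start_dates(csv_text).each do |person_id, date|
  puts "#{person_id}: #{date}"
end
```

Parsing the strings into Date objects matters here: comparing the raw strings would put "1/11/14" before "1/3/14".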
In SQL I would do this:
SELECT minimummonths FROM payplans WHERE name = 'gold'
I want to do the same in Ruby on Rails and have the following in the new section of my orders controller:
@plan = params[:plan]
@Payplanrow = Payplan.where(:name => @plan).minimummonths
I then try to display @Payplanrow in my page using <%= @Payplanrow %> but it doesn't work. I get the error:
undefined method `minimummonths' for #<ActiveRecord::Relation:0x007fe30f870ec0>
I want to print the minimummonths value for the plan selected. There will only ever be one row of data corresponding to the @plan value.
I'm pretty new to Ruby on Rails so I'm just trying to get a pointer in the right direction. I looked everywhere but there doesn't seem to be an example of this.
The problem is that Payplan.where(:name => @plan) returns an ActiveRecord::Relation (a collection of Payplan objects), not a single record, so it has no minimummonths method. Assuming you are using Rails 3, you can read more about it in "Active Record Query Interface".
But, if you are certain that your query is returning only one record you could do:
@Payplanrow = Payplan.where(:name => @plan).first.try(:minimummonths)
The Rails way is to have a scope in your model:
class Payplan < ActiveRecord::Base
  scope :by_name, lambda { |name| where(:name => name) }
end
# controller
@Payplanrow = Payplan.by_name(@plan).first.try(:minimummonths)
Although it's not really optimal, you can do:
@Payplanrow = Payplan.where(:name => @plan).first.minimummonths
You can use pluck to get only the minimummonths value:
minimummonths = Payplan.where(:name => @plan).pluck(:minimummonths).first
Instead of using where then first, it's better to use find when you are expecting a single record.
@Payplanrow = Payplan.find_by_name(@plan).try(:minimummonths)
That should be:
Payplan.where(:name => @plan).first.minimummonths