How to implement bulk insert in Rails 3 - ruby-on-rails

I need to insert an array of emails as separate records into my contacts table. How can this be done?
E.g.: @email = ["a@b.com", "c@d.com", "e@f.com", ... ]
I don't want to use:
@email.each do |email|
  @contact = Contact.new
  @contact.email = email
  @contact.save
end
This causes n INSERT queries. I just need a single INSERT query to insert these values. How can this be done in Rails 3.0.9 (and ideally MySQL)? Please help.

activerecord-import implements AR#import.
activerecord-import is a library for bulk inserting data using ActiveRecord.
Here is how it works:
books = []
10.times do |i|
  books << Book.new(:name => "book #{i}")
end
Book.import books
The project's home page and its wiki are on GitHub.

You might also try upsert, which is approximately as fast as activerecord-import, but only works (currently) with MySQL, Postgres, and SQLite3:
require 'upsert'
Upsert.batch(Contact.connection, Contact.table_name) do |upsert|
  emails.each do |email|
    upsert.row(email: email)
  end
end
Note that this involves one database query per record, but it's an "upsert," so you don't have to check if a record already exists. In your example, this isn't a concern, but in most applications it becomes one eventually.

The simplest way without an additional gem is to build the statement as a string and execute it as a single multi-row INSERT (http://www.electrictoolbox.com/mysql-insert-multiple-records/).
@email = ["a@b.com", "c@d.com", "e@f.com"]
time = Time.current.to_s(:db)
values = @email.map do |email|
  # quote escapes the value so that it cannot break out of the SQL string
  "(#{Contact.connection.quote(email)}, '#{time}', '#{time}')"
end
sql = "INSERT INTO contacts (email, created_at, updated_at) VALUES #{values.join(', ')}"
Contact.connection.execute(sql)
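With a large array it may be worth splitting the work into several statements, so that no single INSERT grows too big (MySQL, for instance, caps statement size through max_allowed_packet). The following is only a sketch in plain Ruby: quote and bulk_insert_statements are hypothetical names, and the quoting shown is a deliberately simplified stand-in for Contact.connection.quote that should not be relied on in production.

```ruby
# Simplified stand-in for the adapter's quoting routine (hypothetical helper;
# in a Rails app you would call Contact.connection.quote instead).
def quote(value)
  "'" + value.to_s.gsub("'", "''") + "'"
end

# Build one multi-row INSERT statement per batch of emails.
# batch_size is an assumed tuning knob, not something the database dictates.
def bulk_insert_statements(emails, timestamp, batch_size = 500)
  emails.each_slice(batch_size).map do |batch|
    rows = batch.map do |email|
      "(#{quote(email)}, #{quote(timestamp)}, #{quote(timestamp)})"
    end
    "INSERT INTO contacts (email, created_at, updated_at) VALUES #{rows.join(', ')}"
  end
end

emails = ["a@b.com", "c@d.com", "e@f.com"]
statements = bulk_insert_statements(emails, "2012-01-01 00:00:00", 2)
statements.each { |sql| puts sql }
```

Each resulting string could then be passed to Contact.connection.execute, ideally inside one transaction so that the batches succeed or fail together.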

I just wrote a little monkey-patch for Active Record 3.2 to INSERT many new records with a single SQL query, check it out:
https://github.com/alexdowad/showcase/blob/master/activerecord/bulk_db_operations.rb

Related

Ruby on Rails. Searching ActiveRecord objects where condition is the presence of one of the associations

I have Rails code that queries the database for employee objects, each of which has many social_url_for_service entries (for various services). As implemented, it first loads all employees from the database and then searches for the ones with Twitter. Is there any way to query on this association directly (with the Employee.find() method, for example)?
@e = Employee.all
@employees = []
@tweets = []
@e.each do |employee|
  if employee.social_url_for_service(:twitter)
    @employees << employee
    @tweets.concat(Twitter.user_timeline(employee.social_url_for_service(:twitter).split('/').last, count: 3))
  end
end
Assuming social_url_for_service is a method that grabs a social_service_link association with a service_name field:
Employee.joins(:social_service_links).where('social_service_links.service_name = ?', "Twitter")
You'll need to update this for your exact table and field names. You can also drop the .where call to return all employees with a service of any kind.

Rails 3, custom raw SQL insert statement

I have an Evaluation model. Evaluation has many scores. Whenever a new evaluation is created, a score record is created for each user that needs an evaluation (see below for the current method I am using to do this.) So, for example, 40 score records might be created at once. The Evaluation owner then updates each score record with the User's score.
I'm looking to use raw SQL because each insert is its own transaction and is slow.
I would like to convert the following into a mass insert statement using raw SQL:
def build_evaluation_score_items
  self.job.active_employees.each do |employee|
    employee_score = self.scores.build
    employee_score.user_id = employee.id
    employee_score.save
  end
end
Any thoughts on how this can be done? I've tried adapting a code sample from Chris Heald's Coffee Powered site but, no dice.
Thanks to anyone willing to help!
EDIT 1
I neglected to mention the current method is wrapped in a transaction.
So, essentially, I am trying to add this to the code block so that everything is inserted in one statement. (This code snippet is from Chris Heald's Coffee Powered site, which discussed the topic. I would ask the question there, but the post is more than 3 years old.)
inserts = []
TIMES.times do
  inserts.push "(3.0, '2009-01-23 20:21:13', 2, 1)"
end
sql = "INSERT INTO user_node_scores (`score`, `updated_at`, `node_id`, `user_id`) VALUES #{inserts.join(", ")}"
I'd be happy to show the code from some of my attempts that do not work...
Thanks again!
Well, I've cobbled together something that resembles the code above but I get a SQL statement invalid error around the ('evaluation_id' portion. Any thoughts?
def build_evaluation_score_items
  inserts = []
  self.job.active_employees.each do |employee|
    inserts.push "(#{self.id}, #{employee.id}, #{Time.now}, #{Time.now})"
  end
  sql = "INSERT INTO scores ('evaluation_id', `user_id`, 'created_at', `updated_at`)VALUES #{inserts.join(", ")}"
  ActiveRecord::Base.connection.execute(sql)
end
Any idea as to what in the above SQL code is causing the error?
Well, after much trial and error, here is the final answer. The earlier attempt failed because two of the column names were wrapped in single quotes, which makes them string literals rather than identifiers, and because the timestamps were interpolated without quotes. The cool thing is that all the records are now inserted via one statement. Of course, validations are skipped (so this will not be appropriate if you require model validations on create), but in my case that's not necessary because all I'm doing is setting up the score record for each employee's evaluation. Validations still work as expected when the job leader updates the employee's evaluation score.
def build_evaluation_score_items
  inserts = []
  time = Time.now.to_s(:db)
  self.job.active_employees.each do |employee|
    inserts.push "(#{self.id}, #{employee.id}, '#{time}')"
  end
  sql = "INSERT INTO scores (evaluation_id, user_id, created_at) VALUES #{inserts.join(", ")}"
  ActiveRecord::Base.connection.execute(sql)
end
Rather than building SQL directly (and opening yourself to SQL injection and other issues), I would recommend the activerecord-import gem. It can issue multi-row INSERT commands, among other strategies.
You could then write something like:
def build_evaluation_score_items
  new_scores = job.active_employees.map do |employee|
    scores.build(:user_id => employee.id)
  end
  Score.import new_scores
end
I think what you're looking for is:
def build_evaluation_score_items
  ActiveRecord::Base.transaction do
    self.job.active_employees.each do |employee|
      employee_score = self.scores.build
      employee_score.user_id = employee.id
      employee_score.save
    end
  end
end
All child transactions are automatically "pushed" up to the parent transaction, so every save joins the single outer transaction instead of committing on its own. This avoids the overhead of so many transactions and should increase performance.
You can read more in the ActiveRecord transactions documentation.
UPDATE
Sorry, I misunderstood. Keeping the above answer for posterity. Try this:
def build_evaluation_score_items
  # Collect one "(user_id, something_else)" tuple per employee
  values = self.job.active_employees.map do |employee|
    "(%d, '%s')" % [employee.id, "something else"]
  end
  raw_sql = "INSERT INTO your_table (user_id, something_else) VALUES #{values.join(', ')}"
  ActiveRecord::Base.connection.execute raw_sql
end

Postgres ORDER BY values in IN list using Rails Active Record

I receive a list of user IDs (about 1000 at a time) sorted by 'Income'. I have User records in my system's database, but the 'Income' column is not there. I want to retrieve the Users from my system's database
in the same sorted order as in the list I receive. I tried the following with Active Record, expecting the records to come back in the order of the sorted list, but it does not work.
//PSEUDO CODE
User.all(:conditions => {:id => [SORTED LIST]})
I found an answer to a similar question at the link below, but I am not sure how to implement the suggested solution using Active Record.
ORDER BY the IN value list
Is there any other way to do it?
Please guide.
Shardul.
The answer you linked to provides exactly what you need; you just need to code it in Ruby in a flexible manner.
Something like this:
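class User < ActiveRecord::Base
  def self.find_as_sorted(ids)
    values = []
    ids.each_with_index do |id, index|
      # to_i keeps non-numeric input from being interpolated into the SQL
      values << "(#{id.to_i}, #{index + 1})"
    end
    # Join against an inline VALUES list that carries the desired position of each id
    relation = self.joins("JOIN (VALUES #{values.join(",")}) AS x (id, ordering) ON #{table_name}.id = x.id")
    relation.order('x.ordering')
  end
end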
In fact, you could easily put that in a module and mix it into any ActiveRecord classes that need it: since it uses table_name and self, it is not tied to any specific class name.
MySQL users can do this via the FIELD function, but Postgres lacks it. However, this question has workarounds: Simulating MySQL's ORDER BY FIELD() in Postgresql
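As for whether there is any other way: with only about 1000 records at a time, one alternative is to fetch them in whatever order the database returns and reorder them in Ruby. This is a different technique from the SQL join, shown here as a rough sketch; the Struct below is a hypothetical stand-in for the real User model, and sort_like is an illustrative name.

```ruby
# Hypothetical stand-in for an ActiveRecord User record.
User = Struct.new(:id, :name)

# Reorder already-fetched records to match the order of sorted_ids.
def sort_like(records, sorted_ids)
  # Map each id to its position in the externally sorted list
  position = {}
  sorted_ids.each_with_index { |id, index| position[id] = index }
  # fetch raises if a record's id was not in the list, surfacing mismatches early
  records.sort_by { |record| position.fetch(record.id) }
end

sorted_ids = [42, 7, 19]
fetched = [User.new(7, "b"), User.new(19, "c"), User.new(42, "a")]
ordered = sort_like(fetched, sorted_ids)
puts ordered.map(&:id).inspect # prints [42, 7, 19]
```

In a Rails app, fetched would presumably come from something like User.where(:id => sorted_ids). The trade-off is that the result is an Array rather than a relation, so no further scopes can be chained onto it.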

Is There a Way To Make ActiveRecord Do A Multiple Insert

I have a one-to-many relationship where one Thing :has_many Elements
I'm looking for a way to create a Thing and all its N Elements without doing N+1 queries. I tried:
[loop in Thing model]
self.elements.build({...})
...
self.save
But it does a separate insert for each Element.
This capability is not built in.
One option is to use a transaction, which will not eliminate the multiple INSERTs but will commit all of them together rather than one at a time, which helps performance somewhat. For example:
ActiveRecord::Base.transaction do
  1000.times { MyModel.create(options) }
end
To do a true bulk INSERT, though, you'll either have to write and execute a raw query, or use a gem such as activerecord-import (formerly part of ar-extensions). An example from the documentation:
books = []
10.times do |i|
  books << Book.new(:name => "book #{i}")
end
Book.import books
I think this may be the best option for you.

Does ActiveRecord perform inserts/deletes in bulk when inside a transaction?

What I need:
ensuring atomic updates (no record can get processed twice)
bulk deletion of all 1000 selected rows
@queue = Queue.where("col = 1").limit(1000)
ids = []
@queue.each do |row|
  Queue.do_something(row)
  ids << row.id
end
Queue.delete_all("id in (#{ids.join(',')}) ")
IS THE SAME AS
Queue.transaction do
  @queue.each do |row|
    Queue.do_something(row)
    Queue.delete(row.id)
  end
end
For inserts:
ActiveRecord does not perform a bulk insert when using a transaction. However, it does speed things up a bit, since a single transaction is used to execute all the INSERT statements, as opposed to one transaction per INSERT statement otherwise.
So:
Queue.transaction do
  @queue.each do |row|
    # an INSERT is done here
  end
end
is going to be faster than:
@queue.each do |row|
  # an INSERT is done here
end
For more info on how to really do bulk inserts, check out this article.
For deletes:
The ActiveRecord delete_all call is one single SQL DELETE statement, so I guess you could consider this as a bulk delete (no need to use a transaction here since it's already encapsulated in one transaction by ActiveRecord). This is not the case when calling delete on each record, which will result in multiple SQL DELETE statements, thus multiple transactions initiated and committed and overall slower performance.
I suggest you take a look at ActiveRecord Import: https://github.com/zdennis/activerecord-import.
I'm using this tool to insert between 10,000 and 100,000 rows of data.
books = []
10.times do |i|
  books << Book.new(:name => "book #{i}")
end
Book.import books
If you're using MySQL, it also supports ON DUPLICATE KEY UPDATE so you can intelligently insert new / update old rows. https://github.com/zdennis/activerecord-import/wiki/MySQL:-On-Duplicate-Key-Update-Support
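For orientation, the general shape of a MySQL multi-row INSERT with ON DUPLICATE KEY UPDATE can be sketched in plain Ruby. This is not the SQL that activerecord-import itself generates (that is not shown here); upsert_sql and quote_value are hypothetical helpers, and the quoting is simplified for illustration only.

```ruby
# Simplified value quoting (hypothetical helper; a real application should
# use the database adapter's own quoting instead).
def quote_value(value)
  value.is_a?(Numeric) ? value.to_s : "'" + value.to_s.gsub("'", "''") + "'"
end

# Build a multi-row INSERT that updates the listed columns whenever a row
# collides with an existing primary or unique key.
def upsert_sql(table, columns, rows, update_columns)
  tuples = rows.map { |row| "(" + row.map { |v| quote_value(v) }.join(", ") + ")" }
  updates = update_columns.map { |column| "#{column} = VALUES(#{column})" }
  "INSERT INTO #{table} (#{columns.join(', ')}) VALUES #{tuples.join(', ')} " \
    "ON DUPLICATE KEY UPDATE #{updates.join(', ')}"
end

sql = upsert_sql("books", %w[id name], [[1, "book 0"], [2, "book 1"]], %w[name])
puts sql
```

Rows whose id already exists have their name overwritten, while the rest are inserted as new records, all in one statement.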

Resources