Rails 3, custom raw SQL insert statement - ruby-on-rails

I have an Evaluation model. Evaluation has many scores. Whenever a new evaluation is created, a score record is created for each user that needs an evaluation (see below for the current method I am using to do this.) So, for example, 40 score records might be created at once. The Evaluation owner then updates each score record with the User's score.
I'm looking to use raw SQL because each insert is its own transaction and is slow.
I would like to convert the following into a mass insert statement using raw SQL:
def build_evaluation_score_items
  self.job.active_employees.each do |employee|
    employee_score = self.scores.build
    employee_score.user_id = employee.id
    employee_score.save
  end
end
Any thoughts on how this can be done? I've tried adapting a code sample from Chris Heald's Coffee Powered site but, no dice.
Thanks to anyone willing to help!
EDIT 1
I neglected to mention the current method is wrapped in a transaction.
So, essentially, I am trying to add this to the code block so everything is inserted in one statement (** This code snippet is from Chris Heald's Coffee Powered site, which discussed the topic. I would ask the question there, but the post is more than 3 years old.):
inserts = []
TIMES.times do
inserts.push "(3.0, '2009-01-23 20:21:13', 2, 1)"
end
sql = "INSERT INTO user_node_scores (`score`, `updated_at`, `node_id`, `user_id`)VALUES #{inserts.join(", ")}"
I'd be happy to show the code from some of my attempts that do not work...
Thanks again!
Well, I've cobbled together something that resembles the code above but I get a SQL statement invalid error around the ('evaluation_id' portion. Any thoughts?
def build_evaluation_score_items
inserts = []
self.job.active_employees.each do |employee|
inserts.push "(#{self.id}, #{employee.id}, #{Time.now}, #{Time.now})"
end
sql = "INSERT INTO scores ('evaluation_id', `user_id`, 'created_at', `updated_at`)VALUES #{inserts.join(", ")}"
ActiveRecord::Base.connection.execute(sql)
end
Any idea as to what in the above SQL code is causing the error?

Well, after much trial and error, here is the final answer. The cool thing is that all the records are inserted via one statement. Of course, validations are skipped (so this will not be appropriate if you require model validations on create) but in my case, that's not necessary because all I'm doing is setting up the score record for each employee's evaluation. Of course, validations work as expected when the job leader updates the employee's evaluation score.
def build_evaluation_score_items
  inserts = []
  time = Time.now.to_s(:db)
  self.job.active_employees.each do |employee|
    inserts.push "(#{self.id}, #{employee.id}, '#{time}')"
  end
  sql = "INSERT INTO scores (evaluation_id, user_id, created_at) VALUES #{inserts.join(", ")}"
  ActiveRecord::Base.connection.execute(sql)
end
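The string-building step in the method above can be tried without a database. What follows is only a minimal sketch in plain Ruby: the helper name `scores_insert_sql` is hypothetical (it is not part of Rails), and the `Integer()` coercion is an added safeguard, assuming both ids are integers, so that nothing but numbers gets interpolated into the statement.

```ruby
# Hypothetical helper (not part of Rails): builds one multi-row INSERT
# for the scores table from an evaluation id and a list of user ids.
def scores_insert_sql(evaluation_id, user_ids, time = Time.now)
  return nil if user_ids.empty? # "VALUES" followed by no rows is invalid SQL

  stamp = time.strftime("%Y-%m-%d %H:%M:%S")
  rows = user_ids.map do |user_id|
    # Integer() raises on non-numeric input, so only numbers are interpolated.
    "(#{Integer(evaluation_id)}, #{Integer(user_id)}, '#{stamp}')"
  end
  "INSERT INTO scores (evaluation_id, user_id, created_at) VALUES #{rows.join(', ')}"
end

sql = scores_insert_sql(7, [1, 2, 3], Time.utc(2012, 1, 2, 3, 4, 5))
puts sql
```

In the model, the resulting string would be passed to ActiveRecord::Base.connection.execute, as in the answer above. Returning nil for an empty list lets the caller skip the query when there are no active employees.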

Rather than building SQL directly (and opening yourself to SQL injection and other issues), I would recommend the activerecord-import gem. It can issue multi-row INSERT commands, among other strategies.
You could then write something like:
def build_evaluation_score_items
new_scores = job.active_employees.map do |employee|
scores.build(:user_id => employee.id)
end
Score.import new_scores
end

I think what you're looking for is:
def build_evaluation_score_items
ActiveRecord::Base.transaction do
self.job.active_employees.each do |employee|
employee_score = self.scores.build
employee_score.user_id = employee.id
employee_score.save
end
end
end
All child transactions are automatically "pushed" up to the parent transaction. This will prevent the overhead of so many transactions and should increase performance.
You can read more about ActiveRecord transactions here.
UPDATE
Sorry, I misunderstood. Keeping the above answer for posterity. Try this:
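def build_evaluation_score_items
  # Column names are left unquoted: single quotes mark string literals in SQL.
  raw_sql = "INSERT INTO your_table (user_id, something_else) VALUES "
  insert_values = "(%d, '%s')"
  # String#% takes its arguments as an array; join avoids a trailing comma.
  rows = self.job.active_employees.map do |employee|
    insert_values % [employee.id, "something else"]
  end
  ActiveRecord::Base.connection.execute(raw_sql + rows.join(", "))
end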

Related

Does splitting up an active record query over 2 methods hit the database twice?

I have a database query where I want to get an array of Users that are distinct for the set:
# @range is a predefined date range
# @shift_list is a list of filtered shifts
def listing
  Shift
    .where(date: @range, shiftname: @shift_list)
    .select(:user_id)
    .distinct
    .map { |id| User.find( id.user_id ) }
    .sort
end
and I read somewhere that for readability, isolation for testing, or code reuse, you could split this into separate methods:
def listing
  shift_list
    .select(:user_id)
    .distinct
    .map { |id| User.find( id.user_id ) }
    .sort
end
def shift_list
  Shift
    .where(date: @range, shiftname: @shift_list)
end
So I rewrote this and some other code, and now the page takes 4 times as long to load.
My question is, does this type of method splitting cause the database to be hit twice? Or is it something that I did elsewhere?
And I'd love a suggestion to improve the efficiency of this code.
Further to the need to remove mapping from the code, this shift list is being created with the following code:
def _month_shift_list
Shift
.select(:shiftname)
.distinct
.where(date: @range)
.map {|x| x.shiftname }
end
My intention is to create an array of shiftnames as strings.
I am obviously missing some key understanding in database access, as this method is clearly creating part of the problem.
And I think I have found the solution to this with the following:
def month_shift_list
  Shift
    .where(date: @range)
    .pluck(:shiftname)
    .uniq
end
No, the database will not be hit twice. Both methods return lazily evaluated relations, so splitting the chain across two methods does not add a query. The slow page load comes from the map block, which calls User.find once per row and therefore issues one SELECT per user. You can rewrite your query like this:
def listing
  User.
    joins(:shift).
    merge(Shift.where(date: @range, shiftname: @shift_list)).
    uniq.
    sort
end
This has just one hit to the DB and will be much faster and should produce the same result as above.
The assumption here is that there is a has_one/has_many relationship on the User model for Shifts
class User < ActiveRecord::Base
has_one :shift
end
If you don't want to establish the has_one/has_many relationship on User, you can re-write it to:
def listing
  User.
    joins("INNER JOIN shifts on shifts.user_id = users.id").
    merge(Shift.where(date: @range, shiftname: @shift_list)).
    uniq.
    sort
end
ALTERNATIVE:
You can use 2 queries if you experience issues with using ActiveRecord#merge.
def listing
user_ids = Shift.where(date: @range, shiftname: @shift_list).uniq.pluck(:user_id).sort
User.find(user_ids)
end

Ruby on Rails Active Record Query

Employers and Jobs. Employers have many jobs. Jobs have a boolean field started.
I am trying to query and find a count for Employers that have more than one job that is started.
How do I do this?
Employer.first.jobs.where(started: true).count
Do I use a loop with a counter or is there a way I can do it with a query?
Thanks!
You can put a condition on the join:
Employer.joins(:jobs).where(jobs: {started: true}).count
You could create a scope like this in your Employer model:
def self.with_started_job
joins(:jobs)
.where(jobs: { started: true })
.having('COUNT(jobs.id) > 0')
end
Then, to get the number of employers that have a started job, you can just use Employer.with_started_job.count.
What is missing is a group by clause. Use .group() then count. Something like Employer.select("employers.id,count(*)").joins(:jobs).where("jobs.started = 1").group("employers.id")
The query joins both tables, eliminates the records that are false, then it counts the total records for each employer.id when grouped together.
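To make the filter, group, and count sequence described above concrete, here is the same logic applied to in-memory data. This is plain Ruby rather than ActiveRecord, and the `Job` struct and sample rows are made up for illustration. Note that "more than one started job" corresponds to a HAVING COUNT(*) > 1 condition on the grouped rows.

```ruby
# In-memory stand-in for the jobs table (sample data is made up).
Job = Struct.new(:employer_id, :started)

jobs = [
  Job.new(1, true), Job.new(1, true),   # employer 1: two started jobs
  Job.new(2, true), Job.new(2, false),  # employer 2: one started job
  Job.new(3, false)                     # employer 3: no started jobs
]

# WHERE jobs.started = true
started = jobs.select(&:started)

# GROUP BY employer_id, then COUNT(*) per group
counts = started.group_by(&:employer_id).map { |id, rows| [id, rows.size] }.to_h

# HAVING COUNT(*) > 1
busy_employer_ids = counts.select { |_id, n| n > 1 }.keys

puts counts.inspect
puts busy_employer_ids.size
```

The final size is the number the question asks for: how many employers have more than one started job.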
Time for exploratory programming!
Given I don't know SQL really well, my way of doing this may not be optimal. I've been unable to use two aggregations without using a subquery. So I split the task in two:
Fetch all employers who have more than one job started
Count the number of entries in the result set
All on the database level, of course! And without raw SQL, so using Arel here and there. Here's what I've come up with:
class Employer < ActiveRecord::Base
has_many :jobs
# I explored my possibilities using this method: fetches
# all the employers and number of jobs each has started.
# Looks best> Employer.with_started_job_counts.map(&:attributes)
# In the final method this one is not used, it's just handy.
def self.with_started_job_counts
jobs = Job.arel_table
joins(:jobs).select(arel_table[Arel.star],
jobs[:id].count.as('job_count'))
.merge(Job.where(started: true))
.group(:id)
end
# Alright. Now try to apply it. Seems to work alright.
def self.busy
jobs = Job.arel_table
joins(:jobs).merge(Job.where(started: true))
.group(:id)
.having(jobs[:id].count.gt 1)
end
# This is one of the tricks I've learned while fiddling
# with Arel. Counts any relations on the database level.
def self.generic_count(subquery)
from(subquery).count
end
# So now... we get this.
def self.busy_count
generic_count(busy)
end
# ...and it seems to get what we need!
end
Resulting SQL is...large. Not huge, but you may have to solve performance issues with it.
SELECT COUNT(*)
FROM (
SELECT "employers".*
FROM "employers"
INNER JOIN "jobs" ON "jobs"."employer_id" = "employers"."id"
WHERE "jobs"."started" = ?
GROUP BY "employers"."id"
HAVING COUNT("jobs"."id") > 1
) subquery [["started", "t"]]
...still, it does seem to get the result.

Avoiding n+1 with where query active record

I'm looking at my server log and there's a lot of SELECT statements from a particular query. I'm aware of n+1 issues and this seems to be one. However I'm not sure what the solution would be for this query. Is this a single query or multiple for every user that is returned (I believe WHERE is a single query)?
User.where(profile_type: self.profile_type).where.not(id: black_list)
EDIT:
user.rb
def suggested_friends
black_list = self.friendships.map(&:friend_id)
black_list.push(self.id)
return User.where(profile_type: self.profile_type).where.not(id: black_list)
end
index.json.jbuilder
json.suggested_friends @user.suggested_friends do |friend|
json.set! :user_id, friend.id
json.(friend.profile, *friend.profile.attributes.keys)
end
The issue was leaving out includes(:profile). Using the following query solved my problem:
User.where(profile_type: self.profile_type).where.not(id: black_list).includes(:profile)
Thank you for the comments that pointed me in the right direction.
User.where(profile_type: self.profile_type).where.not(id: black_list)
This is only one query. Even with multiple where clauses chained together, Rails combines them into a single SQL statement.
If you had
users = User.where(profile_type: self.profile_type)
users.each do |u|
u.get_all_comments
end
This is an N+1 query: one query loads all the users, and then each user triggers a new query for its comments. That gives N queries for the comments plus 1 query for the users.
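The difference in query counts can be seen with a toy simulation. This is a sketch only: `FakeDb` and its methods are hypothetical stand-ins that count calls, not a real database or any ActiveRecord API. The batched variant is roughly analogous to what includes does when it preloads an association.

```ruby
# Toy stand-in for a database that counts how many queries it receives.
# FakeDb and its methods are hypothetical, for illustration only.
class FakeDb
  attr_reader :query_count

  def initialize(comments_by_user)
    @comments_by_user = comments_by_user
    @query_count = 0
  end

  # One query returning all user ids.
  def users
    @query_count += 1
    @comments_by_user.keys
  end

  # One query returning the comments of a single user.
  def comments_for(user_id)
    @query_count += 1
    @comments_by_user.fetch(user_id, [])
  end

  # One query returning the comments of many users at once.
  def comments_for_all(user_ids)
    @query_count += 1
    @comments_by_user.select { |id, _comments| user_ids.include?(id) }
  end
end

data = { 1 => ["hi"], 2 => ["yo", "ok"], 3 => [] }

# N+1: one query for the users, then one more per user.
db = FakeDb.new(data)
db.users.each { |id| db.comments_for(id) }
n_plus_one = db.query_count

# Batched: one query for the users, one for all of their comments.
db = FakeDb.new(data)
db.comments_for_all(db.users)
batched = db.query_count

puts "n+1: #{n_plus_one} queries, batched: #{batched} queries"
```

With three users the loop issues four queries while the batched version issues two, and the gap grows linearly with the number of users.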

How to implement bulk insert in Rails 3

I need to insert an array of emails as separate records into my contacts table. How can this be done?
E.g.: @email = ["a@b.com", "c@d.com", "e@f.com", ... ]
I don't want to use:
@email.each do |email|
  @contact = Contact.new
  @contact.email = email
  @contact.save
end
This causes n insert queries. I just need a single insert query to insert these values. How can this be done in Rails 3.0.9 (and ideally MySQL)? Please help.
activerecord-import implements AR#import
activerecord-import is a library for bulk inserting data using ActiveRecord.
see how it works:
books = []
10.times do |i|
books << Book.new(:name => "book #{i}")
end
Book.import books
The project's home is on GitHub, along with its wiki.
You might also try upsert, which is approximately as fast as activerecord-import, but only works (currently) with MySQL, Postgres, and SQLite3:
require 'upsert'
Upsert.batch(Contact.connection, Contact.table_name) do |upsert|
emails.each do |email|
upsert.row(email: email)
end
end
Note that this involves one database query per record, but it's an "upsert," so you don't have to check if a record already exists. In your example, this isn't a concern, but in most applications it becomes one eventually.
The simplest way without additional gem is to concat a string and execute it in one SQL insertion (http://www.electrictoolbox.com/mysql-insert-multiple-records/).
@email = ["a@b.com", "c@d.com", "e@f.com"]
time = Time.current.to_s(:db)
values = @email.map do |email|
  "('#{email}', '#{time}', '#{time}')"
end
sql = "INSERT INTO contacts (email, created_at, updated_at) VALUES #{values.join(', ')}"
Contact.connection.execute(sql)
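Because the email values are interpolated straight into the statement, a value containing a single quote would break it (or worse). Below is a minimal sketch in plain Ruby of escaping the values and splitting a long list into batches. The helpers `sql_literal` and `contact_insert_statements` are hypothetical names, and doubling single quotes assumes the database treats only `'` as special inside string literals; in a Rails app the connection's own quoting is the safer choice.

```ruby
# Hypothetical helper: quotes a value as a SQL string literal by doubling
# single quotes. Assumption: only ' is special inside literals on the
# target database; prefer the connection's own quoting in a real app.
def sql_literal(value)
  "'#{value.to_s.gsub("'", "''")}'"
end

# Builds one INSERT statement per batch of emails.
# The default batch size is arbitrary and only chosen for the demo.
def contact_insert_statements(emails, time, batch_size = 2)
  stamp = sql_literal(time.strftime("%Y-%m-%d %H:%M:%S"))
  emails.each_slice(batch_size).map do |batch|
    rows = batch.map { |email| "(#{sql_literal(email)}, #{stamp}, #{stamp})" }
    "INSERT INTO contacts (email, created_at, updated_at) VALUES #{rows.join(', ')}"
  end
end

statements = contact_insert_statements(
  ["a@b.com", "o'brien@d.com", "e@f.com"], Time.utc(2012, 1, 2, 3, 4, 5)
)
statements.each { |statement| puts statement }
```

Each resulting string could then be passed to Contact.connection.execute. Batching keeps any single statement from growing without bound when the list is large.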
I just wrote a little monkey-patch for Active Record 3.2 to INSERT many new records with a single SQL query, check it out:
https://github.com/alexdowad/showcase/blob/master/activerecord/bulk_db_operations.rb

ActiveRecord and SELECT AS SQL statements

I am developing in Rails an app where I would like to rank a list of users based on their current points. The table looks like this: user_id:string, points:integer.
Since I can't figure out how to do this "The Rails Way", I've written the following SQL code:
self.find_by_sql ['SELECT t1.user_id, t1.points, COUNT(t2.points) as user_rank FROM registrations as t1, registrations as t2 WHERE t1.points <= t2.points OR (t1.points = t2.points AND t1.user_id = t2.user_id) GROUP BY t1.user_id, t1.points ORDER BY t1.points DESC, t1.user_id DESC']
The thing is this: the only way to access the aliased column "user_rank" is by doing ranking[0].user_rank, which gives me lots of headaches if I want to easily display the resulting table.
Is there a better option?
how about:
@ranked_users = User.all :order => 'users.points'
then in your view you can say
<% @ranked_users.each_with_index do |user, index| %>
  <%= "User ##{index}, #{user.name} with #{user.points} points" %>
<% end %>
if for some reason you need to keep that numeric index in the database, you'll need to add an after_save callback to update the full list of users whenever the # of points anyone has changes. You might look into using the acts_as_list plugin to help out with that, or that might be total overkill.
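If users with equal points should share a rank, the position in the sorted list is not enough on its own. Here is a small sketch in plain Ruby, using a made-up `User` struct in place of the model, that assigns standard competition ranks (1, 2, 2, 4) after sorting by points in descending order. This is one common definition of rank and may differ from what the SQL in the question computes for ties.

```ruby
# Made-up sample data standing in for user records.
User = Struct.new(:name, :points)
users = [
  User.new("ann", 50), User.new("bob", 70),
  User.new("cat", 50), User.new("dan", 20)
]

# Sort by points, highest first.
sorted = users.sort_by { |user| -user.points }

# A user's rank is the 1-based position of the first user with the same
# points, so ties share a rank and the following rank is skipped.
ranked = sorted.map do |user|
  first_with_same_points = sorted.index { |other| other.points == user.points }
  [first_with_same_points + 1, user.name, user.points]
end

ranked.each { |rank, name, points| puts "##{rank} #{name} (#{points} points)" }
```

The lookup with index is quadratic in the number of users, which is fine for a short leaderboard but would need a single pass over the sorted list for large tables.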
Try adding user_rank to your model.
class User < ActiveRecord::Base
def rank
#determine rank based on self.points (switch statement returning a rank name?)
end
end
Then you can access it with @user.rank.
What if you did:
SELECT t1.user_id, COUNT(t1.points)
FROM registrations t1
GROUP BY t1.user_id
ORDER BY COUNT(t1.points) DESC
If you want to get all rails-y, then do
cool_users = self.find_by_sql ['(sql above)']
cool_users.each do |cool_user|
puts "#{cool_user[0]} scores #{cool_user[1]}"
end
