Avoiding n+1 with where query active record - ruby-on-rails

I'm looking at my server log and there's a lot of SELECT statements from a particular query. I'm aware of n+1 issues and this seems to be one. However I'm not sure what the solution would be for this query. Is this a single query or multiple for every user that is returned (I believe WHERE is a single query)?
User.where(profile_type: self.profile_type).where.not(id: black_list)
EDIT:
user.rb
def suggested_friends
black_list = self.friendships.map(&:friend_id)
black_list.push(self.id)
return User.where(profile_type: self.profile_type).where.not(id: black_list)
end
index.json.jbuilder
json.suggested_friends @user.suggested_friends do |friend|
json.set! :user_id, friend.id
json.(friend.profile, *friend.profile.attributes.keys)
end
The issue was that I had left out includes(:profile). The view calls friend.profile for every user, so each one triggered its own SELECT. Using the following query solved my problem:
User.where(profile_type: self.profile_type).where.not(id: black_list).includes(:profile)
Thank you for the comments that pointed me in the right direction.

User.where(profile_type: self.profile_type).where.not(id: black_list)
This is only one query. Even if you chain multiple where clauses, Rails combines them into a single SQL statement.
If you had
users = User.where(profile_type: self.profile_type)
users.each do |u|
u.get_all_comments
end
This is an n+1 query: you run 1 query to load all the users, and then for each user you make a new request for its comments, which adds N more queries.
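To make the counting concrete, here is a minimal pure-Ruby sketch. FakeDb is a hypothetical stand-in for a database (it is not part of Rails); it only counts how many "queries" are issued when comments are fetched per user versus in one batch, which is roughly what includes does for an association.

```ruby
# FakeDb is a hypothetical stand-in for a database; it only counts queries.
class FakeDb
  attr_reader :query_count

  def initialize(users, comments)
    @users = users          # array of { id: } hashes
    @comments = comments    # array of { user_id:, body: } hashes
    @query_count = 0
  end

  # Stands in for one "SELECT * FROM users"
  def all_users
    @query_count += 1
    @users
  end

  # Stands in for one "SELECT * FROM comments WHERE user_id = ?"
  def comments_for(user_id)
    @query_count += 1
    @comments.select { |c| c[:user_id] == user_id }
  end

  # Stands in for one "SELECT * FROM comments WHERE user_id IN (...)"
  def comments_for_all(user_ids)
    @query_count += 1
    @comments.select { |c| user_ids.include?(c[:user_id]) }
             .group_by { |c| c[:user_id] }
  end
end

users = [{ id: 1 }, { id: 2 }, { id: 3 }]
comments = [{ user_id: 1, body: "a" }, { user_id: 3, body: "b" }]

# N+1: one query for the users, then one more per user.
n_plus_one = FakeDb.new(users, comments)
n_plus_one.all_users.each { |u| n_plus_one.comments_for(u[:id]) }

# Batch loading: one query for the users, one for all of their comments.
eager = FakeDb.new(users, comments)
loaded = eager.all_users
by_user = eager.comments_for_all(loaded.map { |u| u[:id] })

puts n_plus_one.query_count # 4 (1 + 3 users)
puts eager.query_count      # 2
```

With N users the first approach issues N + 1 queries, while the batched one stays at 2 regardless of N.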

Related

Does splitting up an active record query over 2 methods hit the database twice?

I have a database query where I want to get an array of Users that are distinct for the set:
# @range is a predefined date range
# @shift_list is a list of filtered shifts
def listing
Shift
.where(date: @range, shiftname: @shift_list)
.select(:user_id)
.distinct
.map { |id| User.find( id.user_id ) }
.sort
end
I read somewhere that for readability, isolation for testing, or code reuse, you could split this into separate methods:
def listing
shift_list
.select(:user_id)
.distinct
.map { |id| User.find( id.user_id ) }
.sort
end
def shift_list
Shift
.where(date: @range, shiftname: @shift_list)
end
So I rewrote this and some other code, and now the page takes 4 times as long to load.
My question is, does this type of method splitting cause the database to be hit twice? Or is it something that I did elsewhere?
And I'd love a suggestion to improve the efficiency of this code.
Further to the need to remove mapping from the code, this shift list is being created with the following code:
def _month_shift_list
Shift
.select(:shiftname)
.distinct
.where(date: @range)
.map {|x| x.shiftname }
end
My intention is to create an array of shiftnames as strings.
I am obviously missing some key understanding in database access, as this method is clearly creating part of the problem.
And I think I have found the solution to this with the following:
def month_shift_list
Shift
.where(date: @range)
.pluck(:shiftname)
.uniq
end
No, the database will not be hit twice. The relations built in both methods are lazily loaded, so splitting them across methods adds no queries. The slow page load comes from the map block, which calls User.find once per row, and each call is a separate SELECT. You can rewrite your query like this:
def listing
  User.
    joins(:shift).
    merge(Shift.where(date: @range, shiftname: @shift_list)).
    distinct.
    sort
end
This hits the DB only once, so it should be much faster while producing the same result as above.
The assumption here is that there is a has_one/has_many relationship on the User model for Shifts:
class User < ActiveRecord::Base
has_one :shift
end
If you don't want to establish the has_one/has_many relationship on User, you can re-write it to:
def listing
  User.
    joins("INNER JOIN shifts on shifts.user_id = users.id").
    merge(Shift.where(date: @range, shiftname: @shift_list)).
    distinct.
    sort
end
ALTERNATIVE:
You can use 2 queries if you experience issues with using ActiveRecord#merge.
def listing
user_ids = Shift.where(date: @range, shiftname: @shift_list).distinct.pluck(:user_id).sort
User.find(user_ids)
end
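To illustrate the lazy-loading point from this answer, here is a minimal pure-Ruby sketch. LazyQuery is a hypothetical, greatly simplified stand-in for an ActiveRecord relation (it is not how Rails implements it); it only counts how often the "query" actually runs when the conditions are built up across two methods.

```ruby
# LazyQuery is a hypothetical stand-in for a relation: conditions are
# collected by `where`, and nothing runs until the results are enumerated.
class LazyQuery
  include Enumerable

  def initialize(rows, counter, filters = [])
    @rows = rows
    @counter = counter
    @filters = filters
  end

  # Returns a new query object; no "database hit" happens here.
  def where(&condition)
    LazyQuery.new(@rows, @counter, @filters + [condition])
  end

  # The "database hit" happens only when the results are enumerated.
  def each(&block)
    @counter[:executions] += 1
    @rows.select { |row| @filters.all? { |f| f.call(row) } }.each(&block)
  end
end

counter = { executions: 0 }
shifts = LazyQuery.new(
  [
    { user_id: 1, shiftname: "day" },
    { user_id: 2, shiftname: "night" },
    { user_id: 1, shiftname: "day" }
  ],
  counter
)

# The query is split across two methods, as in the question.
def day_shifts(query)
  query.where { |s| s[:shiftname] == "day" }
end

def listing(query)
  day_shifts(query).where { |s| s[:user_id] == 1 }
end

relation = listing(shifts)
executions_before = counter[:executions] # nothing has run yet
user_ids = relation.map { |s| s[:user_id] }.uniq
executions_after = counter[:executions]  # one execution for the whole chain

puts executions_before
puts executions_after
```

Chaining across methods only builds up the description of the query. The cost in the original code comes from the per-row User.find calls, not from the split.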

Ruby on Rails Active Record Query

Employers and Jobs. Employers have many jobs. Jobs have a boolean field started.
I am trying to query and find a count for Employers that have more than one job that is started.
How do I do this?
Employer.first.jobs.where(started: true).count
Do I use a loop with a counter or is there a way I can do it with a query?
Thanks!
You can put the condition on the join:
Employer.joins(:jobs).where(jobs: {started: true}).count
You could create a scope like this in your Employer model:
def self.with_started_job
joins(:jobs)
.where(jobs: { started: true })
.having('COUNT(jobs.id) > 0')
end
Then, to get the number of employers that have a started job, you can just use Employer.with_started_job.count.
What is missing is a group by clause. Use .group() then count. Something like Employer.select("employers.id,count(*)").joins(:jobs).where("jobs.started = 1").group("employers.id")
The query joins both tables, eliminates the records that are false, then it counts the total records for each employer.id when grouped together.
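As a rough illustration of what the WHERE, GROUP BY, and HAVING steps do, here is the same logic on plain Ruby hashes. The sample rows are made up for this sketch and stand in for the jobs table.

```ruby
# Hypothetical sample rows standing in for the jobs table.
jobs = [
  { id: 1, employer_id: 1, started: true },
  { id: 2, employer_id: 1, started: true },
  { id: 3, employer_id: 2, started: true },
  { id: 4, employer_id: 2, started: false },
  { id: 5, employer_id: 3, started: false }
]

# WHERE jobs.started = true
started = jobs.select { |job| job[:started] }

# GROUP BY employer_id, with COUNT(*) per group
counts = started.group_by { |job| job[:employer_id] }
                .transform_values(&:size)

# HAVING COUNT(*) > 1
busy_employer_ids = counts.select { |_id, count| count > 1 }.keys

puts counts.inspect           # {1=>2, 2=>1}
puts busy_employer_ids.size   # 1
```

Employer 3 never appears in the grouped result because its only job is filtered out before grouping, which is also what the inner join plus the WHERE clause does in SQL.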
Time for exploratory programming!
Given I don't know SQL really well, my way of doing this may not be optimal. I've been unable to use two aggregations without using a subquery. So I split the task in two:
Fetch all employers who have more than one job started
Count the number of entries in the result set
All on the database level, of course! And without raw SQL, so using Arel here and there. Here's what I've come up with:
class Employer < ActiveRecord::Base
has_many :jobs
# I explored my possibilities using this method: fetches
# all the employers and number of jobs each has started.
# Looks best> Employer.with_started_job_counts.map(&:attributes)
# In the final method this one is not used, it's just handy.
def self.with_started_job_counts
jobs = Job.arel_table
joins(:jobs).select(arel_table[Arel.star],
jobs[:id].count.as('job_count'))
.merge(Job.where(started: true))
.group(:id)
end
# Alright. Now try to apply it. Seems to work alright.
def self.busy
jobs = Job.arel_table
joins(:jobs).merge(Job.where(started: true))
.group(:id)
.having(jobs[:id].count.gt 1)
end
# This is one of the tricks I've learned while fiddling
# with Arel. Counts any relations on the database level.
def self.generic_count(subquery)
from(subquery).count
end
# So now... we get this.
def self.busy_count
generic_count(busy)
end
# ...and it seems to get what we need!
end
Resulting SQL is...large. Not huge, but you may have to solve performance issues with it.
SELECT COUNT(*)
FROM (
SELECT "employers".*
FROM "employers"
INNER JOIN "jobs" ON "jobs"."employer_id" = "employers"."id"
WHERE "jobs"."started" = ?
GROUP BY "employers"."id"
HAVING COUNT("jobs"."id") > 1
) subquery [["started", "t"]]
...still, it does seem to get the result.

Rails 3, custom raw SQL insert statement

I have an Evaluation model. Evaluation has many scores. Whenever a new evaluation is created, a score record is created for each user that needs an evaluation (see below for the current method I am using to do this.) So, for example, 40 score records might be created at once. The Evaluation owner then updates each score record with the User's score.
I'm looking to use raw SQL because each insert is its own transaction and is slow.
I would like to convert the following into a mass insert statement using raw SQL:
def build_evaluation_score_items
self.job.active_employees.each do |employee|
employee_score = self.scores.build
employee_score.user_id = employee.id
employee_score.save
end
end
Any thoughts on how this can be done? I've tried adapting a code sample from Chris Heald's Coffee Powered site but, no dice.
Thanks to anyone willing to help!
EDIT 1
I neglected to mention the current method is wrapped in a transaction.
So, essentially, I am trying to add this to the code block so everything is inserted in one statement (this code snippet is from Chris Heald's Coffee Powered site, which discussed the topic; I would ask the question there, but the post is more than 3 years old):
inserts = []
TIMES.times do
inserts.push "(3.0, '2009-01-23 20:21:13', 2, 1)"
end
sql = "INSERT INTO user_node_scores (`score`, `updated_at`, `node_id`, `user_id`)VALUES #{inserts.join(", ")}"
I'd be happy to show the code from some of my attempts that do not work...
Thanks again!
Well, I've cobbled together something that resembles the code above but I get a SQL statement invalid error around the ('evaluation_id' portion. Any thoughts?
def build_evaluation_score_items
inserts = []
self.job.active_employees.each do |employee|
inserts.push "(#{self.id}, #{employee.id}, #{Time.now}, #{Time.now})"
end
sql = "INSERT INTO scores ('evaluation_id', `user_id`, 'created_at', `updated_at`)VALUES #{inserts.join(", ")}"
ActiveRecord::Base.connection.execute(sql)
end
Any idea as to what in the above SQL code is causing the error?
Well, after much trial and error, here is the final answer. The cool thing is that all the records are inserted via one statement. Of course, validations are skipped (so this will not be appropriate if you require model validations on create) but in my case, that's not necessary because all I'm doing is setting up the score record for each employee's evaluation. Of course, validations work as expected when the job leader updates the employee's evaluation score.
def build_evaluation_score_items
inserts = []
time = Time.now.to_s(:db)
self.job.active_employees.each do |employee|
inserts.push "(#{self.id}, #{employee.id}, '#{time}')"
end
sql = "INSERT INTO scores (evaluation_id, user_id, created_at) VALUES #{inserts.join(", ")}"
ActiveRecord::Base.connection.execute(sql)
end
Rather than building SQL directly (and opening yourself to SQL injection and other issues), I would recommend the activerecord-import gem. It can issue multi-row INSERT commands, among other strategies.
You could then write something like:
def build_evaluation_score_items
new_scores = job.active_employees.map do |employee|
scores.build(:user_id => employee.id)
end
Score.import new_scores
end
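If you do end up building the statement by hand, one way to keep values out of the SQL string is to generate placeholders and pass the values separately. The following is only a sketch: build_multi_row_insert is a hypothetical helper, the "?" placeholder style is an assumption that depends on your database driver, and table and column names are assumed to be trusted, developer-supplied identifiers.

```ruby
# Builds a multi-row INSERT with "?" placeholders plus a flat list of values.
# Hypothetical helper: identifiers are assumed to be trusted, not user input.
def build_multi_row_insert(table, columns, rows)
  raise ArgumentError, "no rows to insert" if rows.empty?
  unless rows.all? { |row| row.size == columns.size }
    raise ArgumentError, "each row needs #{columns.size} values"
  end

  # One "(?, ?, ?)" group per row, joined with commas.
  row_placeholder = "(" + (["?"] * columns.size).join(", ") + ")"
  sql = "INSERT INTO #{table} (#{columns.join(', ')}) VALUES " +
        ([row_placeholder] * rows.size).join(", ")

  # Values are returned separately, in placeholder order.
  [sql, rows.flatten(1)]
end

evaluation_id = 7
employee_ids = [11, 12, 13]
now = "2013-01-01 12:00:00"

sql, binds = build_multi_row_insert(
  "scores",
  %w[evaluation_id user_id created_at],
  employee_ids.map { |id| [evaluation_id, id, now] }
)

puts sql
puts binds.inspect
```

How the placeholders get bound depends on the database driver or on a sanitizing helper, so that step is deliberately left out here.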
I think what you're looking for is:
def build_evaluation_score_items
ActiveRecord::Base.transaction do
self.job.active_employees.each do |employee|
employee_score = self.scores.build
employee_score.user_id = employee.id
employee_score.save
end
end
end
All child transactions are automatically "pushed" up to the parent transaction. This will prevent the overhead of so many transactions and should increase performance.
You can read more about ActiveRecord transactions here.
UPDATE
Sorry, I misunderstood. Keeping the above answer for posterity. Try this:
def build_evaluation_score_items
  raw_sql = "INSERT INTO your_table (user_id, something_else) VALUES "
  insert_values = "('%s', '%s')"
  rows = self.job.active_employees.map do |employee|
    # String#% needs an array when there is more than one placeholder
    insert_values % [employee.id, "something else"]
  end
  # Join the rows with commas so there is no trailing comma in the SQL
  ActiveRecord::Base.connection.execute(raw_sql + rows.join(", "))
end

rails where() sql query on array

I'll explain this as best as possible. I have a query on user posts:
@selected_posts = Posts.where(:category => "Baseball")
I would like to write the following statement. Here it is in pseudo terms:
User.where(user has a post in @selected_posts)
Keep in mind that I have a many to many relationship setup so post.user is usable.
Any ideas?
/EDIT
@posts_matches = User.includes(@selected_posts).map{ |user|
[user.company_name, user.posts.count, user.username]
}.sort
Basically, I need the above to work so that it uses the users that HAVE posts in selected_posts and not EVERY user we have in our database.
Try this:
user.posts.where("posts.category = ?", "Baseball")
Edit 1:
user.posts.where("posts.id IN (?)", @selected_posts)
Edit 2:
User.select("users.company_name, count(posts.id) userpost_count, users.username").
joins(:posts).
where("posts.id IN (?)", @selected_posts).
group("users.id, users.company_name, users.username").
order("users.company_name, userpost_count, users.username")
Just use the following:
User.find(@selected_posts.map(&:user_id).uniq)
This takes the user ids from all the selected posts, turns them into an array, and removes any duplicates. Passing an array to User.find returns all the users with matching ids. Problem solved.
To combine this with what you showed in your question, you could write:
@posts_matches = User.find(@selected_posts.map(&:user_id).uniq).map{ |user|
[user.company_name, user.posts.size, user.username]
}
Use size rather than count on an association: size reuses the records when the association is already loaded and only falls back to a COUNT query when it is not, whereas count always queries the database. This is better for performance.
Not sure what you were trying to accomplish with Array#sort at the end of your query, but because find with an array of ids returns a plain Array, ordering in the database needs where instead. You could do something like:
@users_with_posts_in_selected = User.where(:id => @selected_posts.map(&:user_id).uniq).order('username DESC')
I don't understand your question but you can pass an array to the where method like this:
where(:id => @selected_posts.map(&:id))
and it will create a SQL query like WHERE id IN (1,2,3,4)
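For illustration, here is roughly what happens to such an array before it ends up in an IN list. This is a pure-Ruby sketch: Post is a hypothetical Struct standing in for the model, and the string is built by hand only to show the shape of the clause (in a real app, let where build it).

```ruby
# Hypothetical stand-in for the Post model.
Post = Struct.new(:id, :user_id, :category)

selected_posts = [
  Post.new(1, 10, "Baseball"),
  Post.new(2, 11, "Baseball"),
  Post.new(3, 10, "Baseball")
]

# Collect the owning user ids and drop duplicates.
user_ids = selected_posts.map(&:user_id).uniq

# Integer() raises on non-numeric input, so only numbers reach the string.
in_clause = "WHERE id IN (#{user_ids.map { |id| Integer(id) }.join(', ')})"

puts user_ids.inspect # [10, 11]
puts in_clause        # WHERE id IN (10, 11)
```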
By virtue of your associations your selected posts already have the users:
@selected_posts = Posts.where("posts.category = ?", "Baseball")
@users = @selected_posts.collect(&:user)
You'll probably want to remove duplicate users from #users.

Rails - Find results from two join tables

I have 3 tables of data and 2 join tables connecting everything. I'm trying to figure out a way to query the results based on the condition that the join table data is the same.
To explain, I have User, Interest, and Event tables. These tables are linked through a HABTM relationship (which is fine for my needs since I don't need to store any other fields) and joined through two join tables. So I also have a UsersInterests table with (user_id, interest_id) and an EventsInterests table with (event_id, interest_id).
The problem comes when trying to query all the Events where the users interests match the events interests.
I thought it would look something like this...
@events = Event.find(:all, :conditions => [@user.interests = @event.interests])
but I get the error
"undefined method `interests' for nil:NilClass", Is there something wrong with my syntax or my logic?
Your problem is that either @user or @event is undefined. Even if you define them before executing this statement, the conditions option supplied, [@user.interests = @event.interests], is invalid.
This named scope on events should do the trick
class Event < ActiveRecord::Base
  ...
  named_scope :shares_interest_with_user, lambda { |user|
    { :joins => "LEFT JOIN events_interests ei ON ei.event_id = events.id " +
                "LEFT JOIN users_interests ui ON ui.interest_id = ei.interest_id",
      :conditions => ["ui.user_id = ?", user], :group => "events.id"
    }
  }
end
@events = Event.shares_interest_with_user(@user)
Given Event <-> Interest <-> User, the goal is to query all the Events whose interests match users' interests (so the following finds every Event that has at least one interest that is also an interest of at least one user).
First try, the simplest thing that could work:
@events = []
Interest.all.each do |i|
i.events.each do |e|
@events << e if i.users.any?
end
end
@events.uniq!
Highly inefficient, very resource hungry and cpu intensive. Generates lots of sql queries. But gets the job done.
Second try should incorporate some complicated join, but the more I think about it the more I see how vague your problem is. Be more precise.
Not sure I completely follow what you are trying to do. If you have one user and you want all events that that user also has interest in then something like:
Event.find(:all, :include => [:events_interests], :conditions => ['events_interests.interest_id in (?)', @user.interests.collect(&:id)])
should probably work.
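The matching logic behind that query can be sketched on plain arrays. The rows below are made-up join-table data, and event_ids_for_user is a hypothetical helper written for this illustration only; it mirrors "take the user's interest ids, then keep events linked to any of them".

```ruby
# Hypothetical join-table rows: [event_id, interest_id] and [user_id, interest_id]
events_interests = [[1, 100], [1, 101], [1, 103], [2, 102], [3, 101]]
users_interests  = [[5, 101], [5, 103], [6, 102]]

def event_ids_for_user(user_id, events_interests, users_interests)
  # Interests held by this user
  interest_ids = users_interests
                 .select { |row| row[0] == user_id }
                 .map { |row| row[1] }

  # Events linked to any of those interests, without duplicates
  events_interests
    .select { |row| interest_ids.include?(row[1]) }
    .map { |row| row[0] }
    .uniq
end

matching = event_ids_for_user(5, events_interests, users_interests)
puts matching.inspect # [1, 3]
```

Event 1 matches on two interests but appears once, which is the role the grouping (or a distinct) plays in the SQL versions.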

Resources