Preload joined associations matching a condition with Rails - ruby-on-rails

I want to display only the users who have a given skill and the following query works properly:
@users.joins(:personal_skills).where(personal_skills: search_conditions).distinct
Now, in the search results, next to each user, I want to display the personal_skill that matched the where condition.
I can simply use user.personal_skills.where(search_conditions) for each user but that would cause the N+1 query problem.
How can I avoid that?
I mean the Rails way; otherwise, just iterating over the returned rows would accomplish the task. Indeed, each row contains both the user data and the joined skill data: the problem lies in the object-relational mapping.
Simply substituting joins with includes is not a solution because that would preload user.personal_skills and not the filtered set user.personal_skills.where(search_conditions) which is what I want to achieve.

You can get the users from the personal_skills:
PersonalSkill.joins(:user).where(search_conditions).where(user_id: @users.map(&:id))
With the result, you can group the skills by user.
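The grouping step can be sketched in plain Ruby. This is a minimal illustration, not the answerer's code: the Struct below is a hypothetical stand-in for the rows the PersonalSkill query returns, and the sample data is made up.

```ruby
# Hypothetical stand-in for the records returned by the PersonalSkill query.
PersonalSkill = Struct.new(:id, :user_id, :name)

matching_skills = [
  PersonalSkill.new(1, 10, "ruby"),
  PersonalSkill.new(2, 11, "ruby"),
  PersonalSkill.new(3, 10, "rails")
]

# One pass over the result builds a user_id => [skills] lookup,
# so displaying a user's matching skills needs no further query.
skills_by_user_id = matching_skills.group_by(&:user_id)

# Look up the matching skills for one user; fall back to an empty list.
skills_for_user_10 = skills_by_user_id.fetch(10, [])
```

In a view, the same lookup would be keyed by user.id, so the whole page costs one query for the users and one for the skills.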

Related

Include vs Join

I have 3 models
User - has many debits and has many credits
Debit - belongs to User
Credit - belongs to User
Debit and credit are very similar. The fields are basically the same.
I'm trying to run a query on my models to return all fields from debit and credit where the user is current_user:
User.left_outer_joins(:debits, :credits).where("users.id = ?", @user.id)
As expected, this returned all fields from User as many times as there were records in credits and debits.
User.includes(:credits, :debits).order(created_at: :asc).where("users.id = ?", @user.id)
It ran 3 queries and I thought it should be done in one.
The second part of this question: how could I add the record type to the query?
That is, records from credits would have an extra field marking them as credits, and the same for debits.
I have looked into the ActiveRecordUnion gem, but I did not see how it would solve the problem here.
includes can't magically retrieve everything you want it to in one query; it will run one query per model (typically) that you need to hit. Instead, it eliminates future unnecessary queries. Take the following examples:
Bad
users = User.first(5)
users.each do |user|
p user.debits.first
end
There will be 6 queries in total here, one to User retrieving all the users, then one for each .debits call in the loop.
Good!
users = User.includes(:debits).first(5)
users.each do |user|
p user.debits.first
end
You'll only make two queries here: one for the users and one for their associated debits. This is how includes speeds up your application, by eagerly loading things you know you'll need.
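The query counts in the two examples above can be checked with a small simulation. This is only a sketch: FakeDb is a hypothetical class that counts calls and stands in for the database, and it does not use ActiveRecord at all.

```ruby
# Simulation only: FakeDb is a hypothetical stand-in that counts "queries".
class FakeDb
  attr_reader :query_count

  def initialize(debits_by_user_id)
    @debits_by_user_id = debits_by_user_id
    @query_count = 0
  end

  # One "query" returning the first `limit` user ids.
  def users(limit)
    @query_count += 1
    @debits_by_user_id.keys.first(limit)
  end

  # One "query" returning the debits for all the given user ids.
  def debits_for(user_ids)
    @query_count += 1
    @debits_by_user_id.slice(*user_ids)
  end
end

data = { 1 => [100], 2 => [200], 3 => [], 4 => [50], 5 => [75] }

# Lazy loading: one query for the users, then one per user in the loop.
lazy_db = FakeDb.new(data)
lazy_db.users(5).each { |id| lazy_db.debits_for([id]) }

# Eager loading: one query for the users, one for all of their debits.
eager_db = FakeDb.new(data)
ids = eager_db.users(5)
debits = eager_db.debits_for(ids)
ids.each { |id| debits[id].first }
```

With five users, the lazy version issues six simulated queries and the eager version issues two, matching the counts described above.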
As for your comment, yes, it seems to make sense to combine them into one table. Depending on your situation, I'd recommend looking into Single Table Inheritance (STI). If you don't go this route, be careful about adding a column called type: Rails reserves that name for STI by default and won't like it!
First of all, in the first query, by calling the query on the User class you are asking for records of type User; if you do not want User objects, you are performing an extra join that could be costly (could be, not will be).
If you want credit and debit records, simply run the queries on the Credit and Debit models. If you have already loaded the user object, use includes, preload or eager_load to load the linked credit and debit records all at once.
There are two ways of preloading records in Rails. In the first, Rails performs a single query for each type of record; in the second, Rails performs only one query and builds the objects of the different types from the data returned.
includes is a smart preloader that picks one of the two: it runs separate queries by default and switches to a single JOIN query when the conditions reference the included tables.
If you want to force Rails to use one query no matter what, eager_load is what you are looking for.
Please read all about includes, eager_load and preload in the article here.
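For the second part of the question (tagging each row with its record type), one option is to load credits and debits separately and label them in Ruby. The sketch below is an assumption about what the asker wants, not something the answers propose; the Structs and the sample data are hypothetical stand-ins for the Credit and Debit models.

```ruby
# Hypothetical stand-ins for the Credit and Debit models.
Credit = Struct.new(:amount, :created_at)
Debit  = Struct.new(:amount, :created_at)

credits = [Credit.new(500, Time.utc(2020, 1, 3))]
debits  = [Debit.new(200, Time.utc(2020, 1, 1)), Debit.new(50, Time.utc(2020, 1, 5))]

# Tag every record with a type derived from its class name.
tagged = (credits + debits).map do |record|
  { type: record.class.name.downcase, amount: record.amount, created_at: record.created_at }
end

# Merge both kinds of records into one list ordered by creation time.
entries = tagged.sort_by { |entry| entry[:created_at] }
```

This costs two queries (one per model) rather than one, but it avoids the repeated User columns that the join produces.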

Rails includes causing incorrect result

I am using includes to do eager loading on my one query. I am trying to query invoices to get the total amount invoiced. I get 2 different results when I use includes vs. when I don't. Can anyone explain why this is happening/how to best fix this?
all_invoices = Invoice.includes(:contractor, :invoice_items, :refunds).with_user(1).date_between(date_range).search_contractor("tester").displayed_invoices.order(created_at: 'DESC')
tester = Invoice.with_user(1).date_between(date_range).search_contractor("tester").displayed_invoices.order(created_at: 'DESC')
all_invoices.pluck(:total_in_cents).sum #this will return 80000
tester.pluck(:total_in_cents).sum #this will return 40000
The second one is the correct result, the one I'm looking for. Having includes in there is obviously helpful for speed, so I'm not trying to remove it, but I need to get the correct result from it.
Anyone have any idea why this is happening?
You are plucking :total_in_cents from two different result sets.
tester plucks from the rows selected from "invoices" only.
all_invoices plucks from the rows selected from "invoices" together with the :contractor, :invoice_items and :refunds records that meet the same criteria:
.with_user(1).date_between(date_range).search_contractor("tester")
I assume some of these are placeholders, but .with_user queries a user, and that user probably has other related records in :contractor, :invoice_items or :refunds.
You could test it by adjusting the column names and re-seeding the database or, better yet, by filtering out the records not associated with the invoices and running the same query.
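One plausible mechanism, which the source does not confirm, is that includes falls back to a single JOIN here, so each invoice appears once per related row and pluck sums the duplicates. The plain-Ruby sketch below reproduces that arithmetic with made-up data (two invoices, two items each).

```ruby
# Hypothetical data: two invoices, each with two invoice items.
invoices = [
  { id: 1, total_in_cents: 15_000 },
  { id: 2, total_in_cents: 25_000 }
]
invoice_items = [
  { invoice_id: 1 }, { invoice_id: 1 },
  { invoice_id: 2 }, { invoice_id: 2 }
]

# What a JOIN returns: one row per (invoice, item) pair,
# so every invoice is repeated once per matching item.
joined_rows = invoices.flat_map do |invoice|
  invoice_items.select { |item| item[:invoice_id] == invoice[:id] }
               .map { invoice }
end

# Summing over the joined rows counts every invoice twice.
inflated_sum = joined_rows.sum { |row| row[:total_in_cents] }

# Deduplicating by invoice id restores the real total.
correct_sum = joined_rows.uniq { |row| row[:id] }.sum { |row| row[:total_in_cents] }
```

If this is indeed the cause, computing the sum on the relation without includes, or using preload (which always runs separate queries) instead of includes, should give the smaller figure.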

Optimizing has many record association query

I have this query that I've built using Enumerable#select. The purpose is to find records that have no has-many associated records or, if they do have such records, to select only those whose associated records all have the preview attribute set to true. The code below works perfectly for that use case. However, this query does not scale well: when I test against thousands of records, it takes several hundred seconds to complete. How can this query be improved?
# User has many enrollments
# Enrollment belongs to user.
users_with_no_courses = User.includes(:enrollments).select {|user| user.enrollments.empty? || user.enrollments.where(preview: false).empty?}
So first, make sure enrollments.user_id has an index.
Second, you can speed this up by not loading all the enrollments, and doing your filtering in SQL:
User.where(<<-EOQ)
  NOT EXISTS (SELECT 1
              FROM   enrollments e
              WHERE  e.user_id = users.id
              AND    NOT e.preview)
EOQ
By the way here I'm simplifying your two conditions into one: "no enrollments or no real enrollments" is the same as "no real enrollments".
If you want you can put this condition into a scope so it is more reusable.
Third, this is still going to be slow if you're instantiating thousands of User objects. So I would look into paginating if that makes sense, or find_each if this is an offline script. Or use raw SQL to avoid all the object instances.
Oh by the way: even though you are saying includes(:enrollments), this will still go back to the database, giving you an n+1 problem:
user.enrollments.where(preview: false)
That is because the where means ActiveRecord can't use the already-loaded association. You can avoid that by filtering the loaded records in Ruby, using Enumerable#select instead of where. But not loading the enrollments in the first place is even better.
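Both points (the simplified condition and filtering in memory rather than with where) can be illustrated without a database. This is a sketch with hypothetical Structs standing in for User and Enrollment records whose enrollments are already loaded.

```ruby
# Hypothetical stand-ins for records with already-loaded enrollments.
Enrollment = Struct.new(:preview)
User = Struct.new(:name, :enrollments)

users = [
  User.new("no enrollments", []),
  User.new("preview only", [Enrollment.new(true)]),
  User.new("real enrollment", [Enrollment.new(true), Enrollment.new(false)])
]

# Original two-part condition, filtering the loaded array in Ruby.
two_part = users.select do |user|
  user.enrollments.empty? || user.enrollments.reject(&:preview).empty?
end

# Simplified condition: the user has no non-preview enrollment.
one_part = users.select { |user| user.enrollments.none? { |e| !e.preview } }
```

Both selections return the same users, because an empty list trivially contains no non-preview enrollment.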

Rails - get objects of objects WITH duplicates

I received some really good help in solving an issue in which I needed to get objects from objects in one query. That worked like a charm.
This is a follow up to that question. I chose to create a new question to this since the last one was answered according to my previous specification. Please do refer to that previous question for details.
Basically, I wanted to get all the objects of multiple objects in one single query. E.g. if a Product has several Categories, which in turn have several Products, I want to get all the Products in that relation. Put more simply (and erroneously):
all_products = @my_product.categories.products
This was solved with this query. It is this query I would (preferably) like to alter:
Product.includes(:categories).where(categories: { id: @my_product.categories.pluck(:id) } )
Now, I realized something I missed using this solution was that I only get a list of unique Products (which one would expect as well). I would however like to get a list with possible duplicates as well.
Basically, if a "Blue, Electric Car" is included in categories ("Blue", "Electric" and "Car") I would like to get three instances of that object returned, instead of one unique.
I guess this does not make Rails-sense but is there a way to alter the query above so that it does not serve me a list of unique objects in the returned list but rather the "complete" list?
The includes method of ActiveRecord chooses between two strategies to run the query: one simply runs two separate queries and the other runs a single LEFT OUTER JOIN.
In both cases the products will be distinct.
You have to write a right outer join manually:
Product.joins('RIGHT JOIN categories ON categories.product_id = products.id').where(categories: { id: @my_product.categories.pluck(:id) } )
Also add .preload(:categories) if you want to keep the eager loading of the categories.
Since you want duplicates, just change includes to joins (I tested this just now). joins inner-joins the two tables, giving you a list of records that are unique per Product and Category pair. includes eager loads the associated records and, even when it uses an outer join, the retrieved records are unique only per Product.
Product.joins(:categories).where(categories: { id: @my_product.categories.pluck(:id) } )
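Why a join yields duplicates can be seen in a plain-Ruby sketch of the row pairing. The Struct and the sample data are hypothetical; the code only mimics the one-row-per-pair shape of an inner join and does not touch ActiveRecord.

```ruby
# Hypothetical stand-in for Product records and their category names.
Product = Struct.new(:name, :category_names)

products = [
  Product.new("Blue, Electric Car", %w[Blue Electric Car]),
  Product.new("Red Bicycle", %w[Red Bicycle])
]
wanted_categories = %w[Blue Electric Car]

# An inner join emits one row per matching (product, category) pair,
# so a product in three wanted categories appears three times.
joined = products.flat_map do |product|
  (product.category_names & wanted_categories).map { product }
end

# Collapsing the rows per product gives the deduplicated list instead.
distinct = joined.uniq
```

Here joined holds three entries for the car and none for the bicycle, while distinct holds the car once.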

Loading all the data but not from all the tables

I watched this Railscast, http://railscasts.com/episodes/22-eager-loading, but I am still confused about the best way to write an efficient GET REST service for a scenario like this:
Let's say we have an Organization table and about twenty other tables related to it through belongs_to and has_many associations (so all those tables have an organization_id field).
Now I want to write GET and INDEX requests as a Rails REST service that, based on the organization id passed in the URL, reads those tables and fills the JSON, but NOT for ALL of those tables, only for a few of them: for example, the Patients, Orders and Visits tables, not all twenty.
So I still have trouble getting my head around how to write such a
.find( :all )
sort of query.
Can someone show an example so I can understand how to write this sort of query?
You can eager load all of those associations in one statement:
@organization = Organization.includes(:patients, :orders, :visits).find(1)
Now when you do something like:
@organization.patients
It will read the patients from memory, since they were already fetched by the original call. Without includes, @organization.patients would trigger another database query. This is why it's called "eager loading": you are loading the patients of the organization before you actually reference them (eagerly), because you know you will need that data later.
You can use includes anytime, whether using all or not. Personally I find it to be more explicit and clear when I chain the includes method onto the model, instead of including it as some sort of hash option (as in the Railscast episode).
