ruby on rails manys' many - ruby-on-rails

I am wondering how to do this without double each loop.
Assume that I have a user model and order model, and user has_many orders.
Now I have a users array which class is User::ActiveRecord_Relation
How can I get the orders of these users in one line?

Actually, the best way to do that is :
users.includes(:orders).map(&:orders)
Cause you eager load the orders of the users (only 2 sql queries)
Or
Order.where(user_id: users.pluck(:id))
is quite good too in term of performance

If you've got a many-to-many association and you need to quickly load in all the associated orders you will need to be careful to avoid the so-called "N plus 1" load that can result from the most obvious approach:
orders = users.collect(&:orders).flatten
This will iterate over each user and run a query like SELECT * FROM orders WHERE user_id=? without any grouping.
What you really want is this:
orders = Order.where(user_id: users.collect(&:id))
That should find all orders from all users in a single query.

An answer just come up my mind after I asked....
users.map {|user| user.orders}

Related

Include vs Join

I have 3 models
User - has many debits and has many credits
Debit - belongs to User
Credit - belongs to User
Debit and credit are very similar. The fields are basically the same.
I'm trying to run a query on my models to return all fields from debit and credit where user is current_user
User.left_outer_joins(:debits, :credits).where("users.id = ?", #user.id)
As expected returned all fields from User as many times as there were records in credits and debits.
User.includes(:credits, :debits).order(created_at: :asc).where("users.id = ?", #user.id)
It ran 3 queries and I thought it should be done in one.
The second part of this question is. How I could I add the record type into the query?
as in records from credits would have an extra field to show credits and same for debits
I have looked into ActiveRecordUnion gem but I did not see how it would solve the problem here
includes can't magically retrieve everything you want it to in one query; it will run one query per model (typically) that you need to hit. Instead, it eliminates future unnecessary queries. Take the following examples:
Bad
users = User.first(5)
users.each do |user|
p user.debits.first
end
There will be 6 queries in total here, one to User retrieving all the users, then one for each .debits call in the loop.
Good!
users = User.includes(:debits).first(5)
users.each do |user|
p user.debits.first
end
You'll only make two queries here: one for the users and one for their associated debits. This is how includes speeds up your application, by eagerly loading things you know you'll need.
As for your comment, yes it seems to make sense to combine them into one table. Depending on your situation, I'd recommend looking into Single Table Inheritance (STI). If you don't go this route, be careful with adding a column called type, Rails won't like that!
First of all, in the first query, by calling the query on User class you are asking for records of type User and if you do not want user objects you are performing an extra join which could be costly. (COULD BE not will be)
If you want credit and debit records simply call queries on Credit and Debit models. If you load user object somewhere prior to this point, use includes preload eager_load to do load linked credit and debit record all at once.
There is two way of pre-loading records in Rails. In the first, Rails performs single query of each type of record and the second one Rails perform only a one query and load objects of different types using the data returned.
includes is a smart pre-loader that performs either one of the ways depending on which one it thinks would be faster.
If you want to force Rails to use one query no matter what, eager_load is what you are looking for.
Please read all about includes, eager_load and preload in the article here.

Optimizing has many record association query

I have this query that I've built using Enumerable#select. The purpose is to find records thave have no has many associated records or if it does have those records select only those with it's preview attribute set to true. The code below works perfectly for that use case. However, this query does not scale well. When I test against thousands of records it takes several hundred seconds to complete. How can this query be improved upon?
# User has many enrollments
# Enrollment belongs to user.
users_with_no_courses = User.includes(:enrollments).select {|user| user.enrollments.empty? || user.enrollments.where(preview: false).empty?}
So first, make sure enrollments.user_id has an index.
Second, you can speed this up by not loading all the enrollments, and doing your filtering in SQL:
User.where(<<-EOQ)
NOT EXISTS (SELECT 1
FROM enrollments e
WHERE e.user_id = users.id
AND NOT e.preview)
EOQ
By the way here I'm simplifying your two conditions into one: "no enrollments or no real enrollments" is the same as "no real enrollments".
If you want you can put this condition into a scope so it is more reusable.
Third, this is still going to be slow if you're instantiating thousands of User objects. So I would look into paginating if that makes sense, or find_each if this is an offline script. Or use raw SQL to avoid all the object instances.
Oh by the way: even though you are saying includes(:enrollments), this will still go back to the database, giving you an n+1 problem:
user.enrollments.where(preview: false)
That is because the where means ActiveRecord can't use the already-loaded association. You can avoid that by using select instead of where. But not loading the enrollments in the first place is even better.

How to remove some items from a relation?

I am loading data from two models, and once the data are loaded in the variables, then I need to remove those items from the first relation, that are not in the second one.
A sample:
users = User.all
articles = Articles.order('created_at DESC').limit(100)
I have these two variables filled with relational data. Now I would need to remove from articles all items, where user_id value is not included in the users object. So in the articles would stay only items with user_id, that is in the variable users.
I tried it with a loop, but it was very slow. How do I do it effectively?
EDIT:
I know there's a way to avoid doing this by building a better query, but in my case, I cannot do that (although I agree that in the example above it's possible to do that). That thing is that I have in 2 variables loaded data from database and I would need to process them with Ruby. Is there a command for doing that?
Thank you
Assuming you have a belongs_to relation on the Article model:
articles.where.not(users: users)
This would give you at most 100, but probably less. If you want to return 100 with the condition (I haven't tested, but the idea is the same, put the conditions for users in the where statement):
Articles.includes(:users).where.not(users: true).order('created_at DESC').limit(100)
The best way to do this would probably be with a SQL join. Would this work?
Articles.joins(:user).order('created_at DESC').limit(100)

ActiveRecord return the newest record per user (unique)

I've got a User model and a Card model. User has many Cards, so card has a attribute user_id.
I want to fetch the newest single Card for each user. I've been able to do this:
Card.all.order(:user_id, :created_at)
# => gives me all the Cards, sorted by user_id then by created_at
This gets me half way there, and I could certainly iterate through these rows and grab the first one per user. But this smells really bad to me as I'd be doing a lot of this using Arrays in Ruby.
I can also do this:
Card.select('user_id, max(created_at)').group('user_id')
# => gives me user_id and created_at
...but I only get back user_ids and created_at timestamps. I can't select any other columns (including id) so what I'm getting back is worthless. I also don't understand why PG won't let me select more columns than above without putting them in the group_by or an aggregate function.
I'd prefer to find a way to get what I want using only ActiveRecord. I'm also willing to write this query in raw SQL but that's if I can't get it done with AR. BTW, I'm using a Postgres DB, which limits some of my options.
Thanks guys.
We join the cards table on itself, ON
a) first.id != second.id
b) first.user_id = second.user_id
c) first.created_at < second.created_at
Card.joins("LEFT JOIN cards AS c ON cards.id != c.id AND c.user_id = cards.user_id AND cards.created_at < c.created_at").where('c.id IS NULL')
This is a bit late, but I am working on the same matter, and i found this one works for me :
Card.all.group_by(&:user_id).map{|s| s.last.last}
What do you think ?
I've found one solution that is suboptimal performance-wise but will work for very small datasets, when time is short or it's a hobby project:
Card.all.order(:user_id, :created_at).to_a.uniq(&:user_id)
This takes the AR:Relation results, casts them into a Ruby Array, then performs a Array#uniq on the results with a Proc. After some brief testing it appears #uniq will preserve order, so as long as everything is in order before using uniq you should be good.
The feature is time sensitive so I'm going to use this for now, but I will be looking at something in raw SQL following #Gene's response and link.

Active records complex relationship query

I have the following scenario:
User->HABTM->businesses
Suppliers->HABTM->businesses
Suppliers->HAS_MANY->Payments
I am having real trouble working out how to get all the payments for a user through the HABTM relationships that describe the business->supplier and User->business relationship.
I am after all payments that belong to the user through the supplier business relationship.
I can do this with SQL very easily but am having trouble doing it the rails way.
It's similar to a post on getting all the comments that belong to a user via a post model.
Any help would be appreciated.
Is it even possible?
I am doing this at the moment:
has_many :payments,:finder_sql => Proc.new {
%Q{
SELECT DISTINCT *
FROM payments
INNER JOIN businesses_users ON businesses_users.user_id
INNER JOIN businesses_suppliers ON businesses_suppliers.business_id
WHERE payments.supplier_id = businesses_suppliers.supplier_id AND businesses_users.user_id = #{id}
ORDER BY payments.created_at
}}
This lets me do user.payments
Kinda pressed for time, but wanted to chime in. First, your models seem to suggest that there might be an IS A relationship between User and Supplier, in which case you could employ polymorphic associations. If that's not the case, I'd look into the includes option within the activerecord query interface. In that manner, you can basically force AR to eager load the relationships down the chain. An example might look like:
all_users_and_their_pmts = User.includes(:businesses => {:suppliers => :payments })
An alternative, but highly inefficient, way to do this would be:
user_record.businesses.map {|b| b.suppliers.map {|s| s.payments}}.flatten
which would give you an array of payments. Using raw sql like you have above will be far more efficient than this, since activerecord can't chain the calls within map{}. I think :include would be a more idiomatic way for you to go, but your solution isn't horrible.

Resources