Multiple joins with count and having with ActiveRecord - ruby-on-rails

My application is about Profiles that have many Wishes, that are related to Movies:
class Profile < ApplicationRecord
has_many :wishes, dependent: :destroy
has_many :movies, through: :wishes
end
class Wish < ApplicationRecord
belongs_to :profile
belongs_to :movie
end
class Movie < ApplicationRecord
has_many :wishes, dependent: :destroy
has_many :profiles, through: :wishes
end
I would like to return all the Movies that are all "wished" by profiles with id 1,2, and 3.
I managed to get this query using raw SQL (postgres), but I wanted to learn how to do it with ActiveRecord.
select movies.id
from movies
join wishes on wishes.movie_id = movies.id
join profiles on wishes.profile_id = profiles.id and profiles.id in (1,2,3)
group by movies.id
having count(*) = 3;
(I'm relying on count(*) = 3 because I have a unique index that prevents creation of Wishes with duplicated profile_id-movie_id pairs, but I'm open to better solutions)
At the moment the best approach I've found is this one:
profiles = Profile.find([1,2,3])
Wish.joins(:profile, :movie).where(profile: profiles).group(:movie_id).count.select { |_,v| v == 3 }
(Also I would begin the AR query with Movie.joins, but I didn't manage to find a way :-)

Since belongs_to puts the foreign key in the wishes table, you should be able to just query it for your profiles like so:
Wish.where("profile_id IN (?)", [1,2,3]).includes(:movie).all.map{|w| w.movie}
This should get you an array of the movies wished by any of those three profiles, eager loading the movies. (It does not restrict the result to movies wished by all three.)

Since what I want from the query is a collection of Movies, the ActiveRecord query needs to start from Movie. What I was missing was that we can specify the table in the query, like where(profiles: {id: profiles_ids}).
Here is the query I was looking for. (Yes, using count might sound a little brittle, but the alternative was an expensive SQL subquery. Also, I think it's safe if you're using a multiple-column unique index.)
profiles_ids = [1,2,3]
Movie.joins(:profiles).where(profiles: {id: profiles_ids}).group(:id).having("COUNT(*) = ?", profiles_ids.size)
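To see why counting rows is enough here, this is a minimal plain-Ruby sketch of the same WHERE / GROUP BY / HAVING logic, with no database involved. The WISHES data and the movies_wished_by_all helper are hypothetical, and the sketch assumes (like the unique index mentioned in the question) that each profile/movie pair appears at most once.

```ruby
# Hypothetical in-memory stand-in for the wishes table: [profile_id, movie_id].
WISHES = [
  [1, 10], [2, 10], [3, 10],   # movie 10 is wished by profiles 1, 2 and 3
  [1, 20], [2, 20],            # movie 20 is missing profile 3
  [3, 30], [4, 10]             # profile 4 is outside the requested set
].freeze

# Mirrors: WHERE profile_id IN (...) GROUP BY movie_id HAVING COUNT(*) = n
def movies_wished_by_all(wishes, profile_ids)
  wishes
    .select { |profile_id, _| profile_ids.include?(profile_id) } # WHERE
    .group_by { |_, movie_id| movie_id }                          # GROUP BY
    .select { |_, rows| rows.size == profile_ids.size }           # HAVING
    .keys
end

p movies_wished_by_all(WISHES, [1, 2, 3])
```

If the pairs were not unique, a movie wished three times by profile 1 alone would also reach a count of 3, which is why the unique index matters.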

Related

How do I get the records with exact has_many through number of entries on rails

I have a many to many relationship through a has_many through
class Person < ActiveRecord::Base
has_many :rentals
has_many :books, through: :rentals
end
class Rental < ActiveRecord::Base
belongs_to :book
belongs_to :person
end
class Book < ActiveRecord::Base
has_many :rentals
has_many :persons, through: :rentals
end
How can I get the persons that have only one book?
If the table for Person is called persons, you can build an appropriate SQL query using ActiveRecord's query DSL:
people_with_book_ids = Person.joins(:books)
.select('persons.id')
.group('persons.id')
.having('COUNT(books.id) = 1')
Person.where(id: people_with_book_ids)
Although it's two lines of Rails code, ActiveRecord will combine it into a single call to the database. If you run it in a Rails console, you may see a SQL statement that looks something like:
SELECT "persons".* FROM "persons" WHERE "persons"."id" IN
(SELECT persons.id FROM "persons" INNER JOIN "rentals"
ON "rentals"."person_id" = "persons"."id"
INNER JOIN "books" ON "rentals"."book_id" = "books"."id"
GROUP BY persons.id HAVING COUNT(books.id) = 1)
If this is something you want to do often, Rails offers what is called a counter cache:
The :counter_cache option can be used to make finding the number of belonging objects more efficient.
With this declaration, Rails will keep the cache value up to date, and then return that value in response to the size method.
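Effectively this gives your Person an attribute called books_count (an integer column you add yourself, kept up to date by declaring counter_cache: :books_count on the belongs_to :person side of the join model) that will allow you to quite simply filter by the number of associated books: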
Person.where(books_count: 1)
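As a rough illustration of the idea behind a counter cache, here is a plain-Ruby sketch that does not use Rails at all. The CachedPerson class and its rental_created / rental_destroyed methods are hypothetical names standing in for the bookkeeping Rails does on create and destroy.

```ruby
# Hypothetical model of a counter cache: the count is adjusted on every
# write, so reading it never has to scan or join the rentals.
class CachedPerson
  attr_reader :name, :books_count

  def initialize(name)
    @name = name
    @books_count = 0 # plays the role of the cached counter column
  end

  # Called when a rental is created for this person.
  def rental_created
    @books_count += 1
  end

  # Called when a rental is destroyed for this person.
  def rental_destroyed
    @books_count -= 1
  end
end

alice = CachedPerson.new("alice")
bob = CachedPerson.new("bob")
alice.rental_created
2.times { bob.rental_created }

people = [alice, bob]
# Comparable in spirit to Person.where(books_count: 1)
single_book_people = people.select { |person| person.books_count == 1 }
p single_book_people.map(&:name)
```

The trade-off is that every write path has to keep the counter in sync; anything that bypasses the callbacks leaves the cached value stale.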

How to combine two has_many associations for an instance and a collection in Rails?

I'm having trouble combining two has_many relations. Here are my associations currently:
class Note < ActiveRecord::Base
belongs_to :user
belongs_to :robot
end
class User < ActiveRecord::Base
has_many :notes
belongs_to :group
end
class Robot < ActiveRecord::Base
has_many :notes
belongs_to :group
end
class Group < ActiveRecord::Base
has_many :users
has_many :robots
has_many :user_notes, class_name: 'Note', through: :users, source: :notes
has_many :robot_notes, class_name: 'Note', through: :robots, source: :notes
end
I'd like to be able to get all notes, both from the user and the robots, at the same time. The way I currently do that is:
def notes
Note.where(id: (user_notes.ids + robot_notes.ids))
end
This works, but I don't know a clever way of getting all notes for a given collection of groups (without calling #collect for efficiency purposes).
I would like the following to return all user/robot notes for each group in the collection
Group.all.notes
Is there a way to do this in a single query without looping through each group?
Refer to the Active Record joins and eager loading documentation for detailed and efficient approaches.
For example, you could avoid the N+1 query problem in this case as follows:
class Group
# Add a scope to eager load user & robot notes
scope :load_notes, -> { includes(:user_notes, :robot_notes) }
def notes
# Union (not intersection) of both sets, without duplicates
user_notes.to_a | robot_notes.to_a
end
end
# Load notes for a collection of groups
Group.load_notes.flat_map(&:notes)
You can always hand the querying over to the DB, which is built for such purposes. For example, your earlier query for returning all the notes associated with users and robots can be achieved by:
Note.find_by_sql ["SELECT * FROM notes WHERE user_id IN (SELECT id FROM users) UNION SELECT * FROM notes WHERE robot_id IN (SELECT id FROM robots)"]
If you want to return the notes from users and robots associated with a given group with ID gid (say), you'll have to modify the nested SQL query:
Note.find_by_sql ["SELECT * FROM notes WHERE user_id IN (SELECT id FROM users WHERE group_id = ?) UNION SELECT * FROM notes WHERE robot_id IN (SELECT id FROM robots WHERE group_id = ?)", gid, gid]
Note:
If you want your application to scale, you may want as many DB transactions executed within a given period as possible, which means running multiple shorter queries. If you instead run as few queries as possible from ActiveRecord using the method above, the larger queries will affect the performance of your DB.
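To make the UNION semantics concrete, here is a small plain-Ruby sketch with no database. NoteRow, NOTES and group_note_ids are hypothetical names; the point is only that a note reachable through both a user and a robot of the group should appear once in the result.

```ruby
# Hypothetical stand-in for a row of the notes table.
NoteRow = Struct.new(:id, :user_id, :robot_id)

# Assume user 1 and robot 7 belong to the group; user 2 and robot 9 do not.
NOTES = [
  NoteRow.new(1, 1, 7),   # matches through both the user and the robot
  NoteRow.new(2, 1, 9),   # matches through the user only
  NoteRow.new(3, 2, 7),   # matches through the robot only
  NoteRow.new(4, 2, 9)    # matches neither
].freeze

def group_note_ids(notes, user_ids, robot_ids)
  via_users  = notes.select { |note| user_ids.include?(note.user_id) }
  via_robots = notes.select { |note| robot_ids.include?(note.robot_id) }
  # Array#| is a set union, so note 1 is not repeated (like SQL UNION).
  (via_users | via_robots).map(&:id)
end

p group_note_ids(NOTES, [1], [7])
```

Using `&` instead of `|` would keep only the notes matched through both sides (note 1 here), which is an intersection rather than "all notes".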

ActiveRecord custom has_one relations

I'm using Rails 5.0.0.1 at the moment and I've come across an issue with ActiveRecord relations while reducing the number of my DB requests.
Right now I have:
Model A (let's say 'Orders'), Model B ('OrderDispatches'), Model C ('Person') and Model D ('PersonVersion').
Table 'people' consists only of 'id' and a 'hidden' flag; the rest of the people data sits in 'person_versions' ('name', 'surname' and some things that can change over time, like scientific title).
Every Order has a 'receiving_person_id' for the person who recorded the order in the DB, and every OrderDispatch has a 'dispatching_person_id' for the person who delivered the order. Order and OrderDispatch also have a creation time.
One Order has many dispatches.
The straightforward relation is thus:
has_many :receiving_person, through: :person, foreign_key: "receiving_person_id", class_name: 'PersonVersion'
But when I list my orders with their dispatches I have to deal with an N+1 situation, because to find the accurate PersonVersion (according to the creation date of the Order/OrderDispatch) for every receiving_person_id and dispatching_person_id I'm making additional requests.
SELECT *
FROM person_versions
WHERE effective_date_from <= ? AND person_id = ?
ORDER BY effective_date_from
LIMIT 1
First '?' is Order/OrderDispatch creation date and second '?' is receiving/ordering person id.
Using this query I'm getting accurate person data for the time of Order/OrderDispatch creation.
It's fairly easy to write the query with a subquery (or subqueries, as Order comes with OrderDispatches in one list) in raw SQL, but I have no idea how to do that using ActiveRecord.
I tried to write a custom has_one relation, and this is as far as I've come:
has_one :receiving_person, -> {
where("person_versions.id = (
SELECT id
FROM person_versions sub_pv1
WHERE sub_pv1.date_from <= orders.receive_date
AND sub_pv1.person_id = orders.receiving_person_id
LIMIT 1)")},
through: :person, class_name: "PersonVersion", primary_key: "person_id", source: :person_version
It works if I use this only for receiving or dispatching person. When I try to eager_load this for joined orders and order_dispatches tables then one of 'person_versions' has to be aliased and in my custom where clause it isn't (no way to predict if it's gonna be aliased or not, it's used both ways).
A different approach would be this:
has_one :receiving_person, -> {
where(:id => PersonVersion.where("
person_versions.date_from <= orders.receive_date
AND person_versions.person_id = orders.receiving_person_id").order(date_from: :desc).limit(1))},
through: :person, class_name: "PersonVersion", primary_key: "person_id", source: :person_version
Raw 'person_versions' in the where is OK, because it's in a subquery, and using the symbol ':id' makes the generated SQL use the correct aliases for the person_versions table joined to orders and order_dispatches. But I get 'IN' instead of 'equals' for the person_versions.id subquery comparison, and MySQL can't do LIMIT in subqueries that are used with IN/ANY/ALL, so I just get a random person_version.
So TL;DR: I need to transform 'has_many through' into 'has_one' using a custom 'where' clause that looks for the newest record among those whose date is not later than the originating record's creation.
EDIT: Another TL;DR for simplification
def receiving_person
receiving_person_id = self.receiving_person_id
receive_date = self.receive_date
PersonVersion.where(:person_id => receiving_person_id, :hidden => 0).where.has{date_from <= receive_date}.order(date_from: :desc, id: :desc).first
end
I need this method converted to a 'has_one' relation so that I could 'eager_load' it.
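For reference, the lookup rule that method describes can be modelled in plain Ruby as below. This is only a sketch of the selection logic (newest visible version on or before a date, ties broken by id), not a has_one; VersionRow, VERSIONS and version_at are hypothetical names. Grouping the versions by person once is what avoids repeating the scan for every order.

```ruby
require "date"

# Hypothetical stand-in for a row of person_versions.
VersionRow = Struct.new(:id, :person_id, :date_from, :hidden)

VERSIONS = [
  VersionRow.new(1, 5, Date.new(2015, 1, 1), 0),
  VersionRow.new(2, 5, Date.new(2016, 6, 1), 0),
  VersionRow.new(3, 5, Date.new(2017, 1, 1), 0),
  VersionRow.new(4, 8, Date.new(2016, 1, 1), 0)
].freeze

# Index versions by person once, so many lookups share a single pass.
VERSIONS_BY_PERSON = VERSIONS.group_by(&:person_id).freeze

# Newest non-hidden version whose date_from is on or before the given date.
def version_at(person_id, date)
  VERSIONS_BY_PERSON
    .fetch(person_id, [])
    .select { |v| v.hidden == 0 && v.date_from <= date }
    .max_by { |v| [v.date_from, v.id] } # same ordering as date_from DESC, id DESC
end

p version_at(5, Date.new(2016, 12, 31)).id
```

When no version qualifies (for example a date before the first version), max_by on the empty list returns nil, matching what `.first` would give in the original method.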
I would change your schema, as it conflicts with your business domain; restructuring it would alleviate your N+1 problem:
class Person < ActiveRecord::Base
has_many :versions, class_name: 'PersonVersion', dependent: :destroy
has_one :current_version, class_name: 'PersonVersion'
end
class PersonVersion < ActiveRecord::Base
belongs_to :person, inverse_of: :versions
# Newest version first, so has_one :current_version picks the latest row
default_scope -> {
order("person_versions.id desc")
}
end
class Order < ActiveRecord::Base
has_many :order_dispatches, dependent: :destroy
end
class OrderDispatch < ActiveRecord::Base
belongs_to :order
belongs_to :receiving_person_version, class_name: 'PersonVersion'
has_one :receiving_person, through: :receiving_person_version, source: :person
end

How to fetch records with exactly specified has_many through records?

I feel like I have read all SO "has_many through" questions but none helped me with my problem.
So I have a standard has_many through setup like this:
class User < ActiveRecord::Base
has_many :product_associations
has_many :products, through: :product_associations
end
class ProductAssociation < ActiveRecord::Base
belongs_to :user
belongs_to :product
end
class Product < ActiveRecord::Base
has_many :product_associations
has_many :users, through: :product_associations
end
IMO, What I want is pretty simple:
Find all users that have a product association to products A, B, and C, no more, no less
So I have a couple of products and want to find all users that are connected to exactly those products (they shouldn't have any other product associations to other products).
This is the best I came up with:
products # the array of products that I want to find all connected users for
User
.joins(:product_associations)
.where(product_associations: { product_id: products.map(&:id) })
.group('products.id')
.having("COUNT(product_associations.id) = #{products.count}")
It doesn't work though, it also returns users connected to more products.
I also toyed around with merging scopes but didn't get any result.
All hints appreciated! :)
select * from users
join product_associations on product_associations.user_id = users.id
where product_associations.product_id in (2,3)
and not exists (
select *
from product_associations AS pa
where pa.user_id = users.id
and pa.product_id not in (2,3)
)
group by product_associations.user_id
having count(product_associations.product_id) = 2
It does two things: it finds users with 1) all the product associations and 2) no other product associations.
Sqlfiddle example: http://sqlfiddle.com/#!2/aee8e/5
It can be Railsified™ (somewhat) into:
User.joins(:product_associations)
.where(product_associations: { product_id: products })
.where("not exists (select *
from product_associations AS pa
where pa.user_id = users.id
and pa.product_id not in (?)
)", products.pluck(:id))
.group('product_associations.user_id')
.having('count(product_associations.product_id) = ?', products.count)
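The difference between "has all of these products" and "has exactly these products" can be checked with a small plain-Ruby sketch. ASSOCIATIONS, users_with_all and users_with_exactly are hypothetical names, and each [user_id, product_id] pair is assumed to be unique.

```ruby
require "set"

# Hypothetical stand-in for product_associations: [user_id, product_id].
ASSOCIATIONS = [
  [1, 2], [1, 3],          # user 1: exactly products 2 and 3
  [2, 2], [2, 3], [2, 4],  # user 2: products 2 and 3 plus an extra one
  [3, 2]                   # user 3: missing product 3
].freeze

# Count-only version: filter to the wanted products, then count per user.
# Extra products are filtered out before counting, so they go unnoticed.
def users_with_all(associations, product_ids)
  associations
    .select { |_, product_id| product_ids.include?(product_id) }
    .group_by { |user_id, _| user_id }
    .select { |_, rows| rows.size == product_ids.size }
    .keys
end

# Exact version: compare each user's full product set with the wanted set,
# which is what the additional NOT EXISTS condition achieves in SQL.
def users_with_exactly(associations, product_ids)
  wanted = product_ids.to_set
  associations
    .group_by { |user_id, _| user_id }
    .select { |_, rows| rows.map { |_, product_id| product_id }.to_set == wanted }
    .keys
end

p users_with_all(ASSOCIATIONS, [2, 3])
p users_with_exactly(ASSOCIATIONS, [2, 3])
```

User 2 passes the count-only check but is rejected by the exact comparison, which is the behaviour the question was running into.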

How to write Rails finder with several subqueries

This is a library system, people can borrow books here. And each book belongs to a category. We'd like to give people some suggestions according to what kind of books they borrowed most.
Here are four models:
class Person < AR
has_many :borrows
end
class Borrow < AR
belongs_to :person
belongs_to :book
end
class Category < AR
has_many :books
end
class Book < AR
has_many :borrows
belongs_to :category
end
And I wrote SQL to find the books
SELECT * FROM books WHERE category_id =
(SELECT category_id FROM books WHERE id IN
(SELECT book_id FROM borrows WHERE person_id =10000)
GROUP BY category_id ORDER BY count(*) DESC LIMIT 1)
AND id NOT IN
(SELECT book_id FROM borrows WHERE person_id =10000)
This seems to be working, but I wonder how could I write the finder in the Rails way...
You can do the following: add this to person.rb
has_many :books, :through => :borrows
has_many :categories_of_books, :through => :books, :source => :category
and
def suggested_books
Book.where("category_id IN (?) AND id NOT IN (?)", self.categories_of_books, self.books)
end
Though it results in more than one query, it's clean; you just have to do:
@user.suggested_books
With active record, you can eliminate two of the three subqueries in favor of joins:
Book.where(
category_id: Category.limit(1)
.joins(:books => :borrows)
.where("borrows.person_id = ?", 10000)
.group("categories.id")
.order("COUNT(*) DESC")
.pluck("categories.id")
).joins(:borrows).where("borrows.person_id != ?", 10000)
Still not the best solution because it generates two separate queries (one for the inner query on Category). Depending on your needs, this may not be so bad, if, say, you decide to use the result of the inner query (the most borrowed category of the user in question) for something else.
Maybe something like this:
@person = Person.find(10000)
@categories = @person.books.map { |b| b.category }.uniq
@suggestions = @categories.flat_map { |c| c.books.to_a } - @person.books.to_a
In order to have '@person.books' working, you have to add this to your Person model:
has_many :books, :through => :borrows
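To check any of these finders against the intent of the original SQL, it can help to have the rule spelled out in plain Ruby: pick the category the person borrowed from most, then suggest books in that category they have not borrowed yet. The BOOKS and BORROWS data and the suggested_book_ids helper below are hypothetical.

```ruby
# Hypothetical data: book_id => category, and [person_id, book_id] borrows.
BOOKS = { 1 => :scifi, 2 => :scifi, 3 => :scifi, 4 => :history, 5 => :history }.freeze
BORROWS = [[10000, 1], [10000, 2], [10000, 4], [20000, 3]].freeze

def suggested_book_ids(books, borrows, person_id)
  borrowed = borrows.select { |pid, _| pid == person_id }.map { |_, book_id| book_id }
  return [] if borrowed.empty?

  # Most borrowed category: GROUP BY category ORDER BY COUNT(*) DESC LIMIT 1
  top_category, _count = borrowed
    .map { |book_id| books.fetch(book_id) }
    .tally
    .max_by { |_, n| n }

  # Books in that category, excluding the ones already borrowed (NOT IN)
  books
    .select { |book_id, category| category == top_category && !borrowed.include?(book_id) }
    .keys
end

p suggested_book_ids(BOOKS, BORROWS, 10000)
```

Note that the exclusion is "not borrowed by this person", which is different from "borrowed by somebody else"; a book nobody has borrowed yet is still a valid suggestion.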
