Efficient ActiveRecord has_and_belongs_to_many query - ruby-on-rails

I have Page and Paragraph models with a has_and_belongs_to_many relation. Given a paragraph_id, I'd like to get all matching pages. e.g.:
pages = Paragraph.find(paragraph_id).pages.all
However, this takes two queries. It can be done in one query:
SELECT "pages".* FROM "pages"
INNER JOIN "pages_paragraphs" ON "pages_paragraphs"."page_id" = "pages"."id"
WHERE "pages_paragraphs"."paragraph_id" = 123
But can this be done without using find_by_sql, and without modifications to the pages_paragraphs table (e.g. adding an id)?
Update:
My page model looks like this:
class Page < ActiveRecord::Base
has_and_belongs_to_many :paragraphs, uniq: true
end

With a has_many :through relationship, you could use this:
pages = Page.joins(:pages_paragraphs).where(:pages_paragraphs => {:paragraph_id => 1})
Take a look at Specifying Conditions on the Joined Tables here: http://guides.rubyonrails.org/active_record_querying.html
If you want the pages and paragraphs together:
pages = Page.joins(:pages_paragraphs => :paragraph).includes(:pages_paragraphs => :paragraph).where(:pages_paragraphs => {:paragraph_id => 1})
With a has_and_belongs_to_many:
pages = Page.joins("join pages_paragraphs on pages.id = pages_paragraphs.page_id").where(["pages_paragraphs.paragraph_id = ?", paragraph_id])

You can use eager_load for this
pages = Paragraph.eager_load(:pages).find(paragraph_id).pages
I found an article all about this sort of thing:
3 ways to do eager loading (preloading) in Rails 3 & 4 by Robert Pankowecki

You can use includes for this:
pages = Paragraph.includes(:pages).find(paragraph_id).pages

Related

In a Rails 3 many to many association, what is the most efficient way to query objects based on conditions on their associations?

I have a many-to-many model relation:
class Movie
has_many :movie_genres
has_many :genres, :through => :movie_genres
class Genre
has_many :movie_genres
has_many :movies, :through => :movie_genres
class MovieGenre
belongs_to :movie
belongs_to :genre
I want to query all movies with a certain genre but not associated with another genre. Example: All movies that are Action but not Drama.
What I have done is this:
action_movies = Genre.find_by_name('action').movies
drama_movies = Genre.find_by_name('drama').movies
action_not_drama_movies = action_movies - drama_movies
Is there a more efficient way of doing this? It should be noted that the query can become more complex like: All movies that are Action but not Drama or All movies that are Romance and Comedy
You can indeed improve efficiency by not instantiating Movie instances for every action and drama movie: remove the drama movies from the set of action movies in the SQL statement itself.
The basic building block is a dynamic scope similar to what widjajayd proposed:
class Movie
...
# can be called with a string for a single genre or an array for multiple genres
# e.g. "action" will result in SQL like `WHERE genres.name = 'action'`
# ["romance", "comedy"] will result in SQL like `WHERE genres.name IN ('romance', 'comedy')`
def self.of_genre(genres_names)
joins(:genres).where(genres: { name: genres_names })
end
...
end
You can use that scope as a building block to get the movies you want
All movies that are action but not drama:
Movie
.of_genre('action')
.where("movies.id NOT IN (#{Movie.of_genre('drama').select('movies.id').to_sql})")
This will result in an SQL subquery. Using a join would be nicer, but this should be good enough for most cases and reads better than the join alternative.
If your app were a Rails 5 application you could even type
Movie
.of_genre('action')
.where.not(id: Movie.of_genre('drama'))
All movies that are Action but not Drama or All movies that are Romance and Comedy
Because it is a Rails 3 app, you will have to type most of the SQL by hand and cannot make much use of the scope. The or method was only introduced in Rails 5. So this will mean having to type:
Movie
.joins(:genres)
.where("(genres.name = 'action' AND movies.id NOT IN (#{Movie.of_genre('drama').select('movies.id').to_sql})) OR genres.name IN ('romance', 'comedy')")
Again, if it were a Rails 5 application this would be much simpler:
Movie
.of_genre('action')
.where.not(id: Movie.of_genre('drama'))
.or(Movie.of_genre(['romance', 'comedy']))
Using a scope is probably better. Here is a sample with an explanation (not tested). Create the scopes in the Movie model as follows:
Movie.rb
scope :action_movies, joins(movie_genres: :genre).select('movies.*','genres.*').where('genres.name = ?', 'action')
scope :drama_movies, joins(movie_genres: :genre).select('movies.*','genres.*').where('genres.name = ?', 'drama')
In your controller, you can call them as follows:
@action_movies = Movie.action_movies
@drama_movies = Movie.drama_movies
@action_not_drama_movies = @action_movies - @drama_movies
Edit for dynamic scope:
If you want the scope to be dynamic, you can pass a parameter to it. Below is a scope using a lambda.
scope :get_movies, lambda { |genre_request|
joins(movie_genres: :genre).select('movies.*','genres.*').where('genres.name = ?', genre_request)
}
genre_request is the parameter passed to the scope.
In your controller:
@action_movies = Movie.get_movies('action')
@drama_movies = Movie.get_movies('drama')
I don't see a way to do it with one query (not without using subqueries anyway). But here is an approach that I think makes it a little better:
scope :by_genres, lambda { |genres|
genres = [genres] unless genres.is_a? Array
joins(:genres).where(genres: { name: genres }).uniq
}
scope :except_ids, lambda { |ids|
where("movies.id NOT IN (?)", ids)
}
scope :intersect_ids, lambda { |ids|
where("movies.id IN (?)", ids)
}
## all movies that are action but not drama
drama_ids = Movie.by_genres("drama").ids
action_not_drama_movies = Movie.by_genres("action").except_ids(drama_ids)
## all movies that are both action and drama
action_ids = Movie.by_genres("action").ids
action_and_drama_movies = Movie.by_genres("drama").intersect_ids(action_ids)
## all movies that are either action or drama
action_or_drama_movies = Movie.by_genres(["action", "drama"])
It's possible to do except and intersect with raw SQL in Rails. But I think that's in general not a good idea as it still requires more than one query and also might make the code dependent on the DB used.
My original answer is rather naive. I'll leave it here so others won't make the same mistake:
Use joins and you can get it with one query:
Movie.joins(:genres).where("genres.name = ?", "action").where("genres.name != ?", "drama")
As noted in the comments, this will get all the movies that are both action and drama too.
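To see why the single-join filter over-selects, it can help to model the join rows in plain Ruby. The sketch below uses made-up data (the ROWS constant, the movie titles, and both helper methods are assumptions for illustration, not part of the question) and compares a per-row filter with a set difference.

```ruby
# Hypothetical join result: one row per (movie, genre) pair.
ROWS = [
  { movie: "Die Hard",  genre: "action" },
  { movie: "Gladiator", genre: "action" },
  { movie: "Gladiator", genre: "drama"  },
  { movie: "Titanic",   genre: "drama"  }
].freeze

# Per-row filter, analogous to
#   WHERE genres.name = 'action' AND genres.name != 'drama'
# Each row is tested on its own, so Gladiator's "action" row still passes.
def row_filtered(rows)
  rows.select { |r| r[:genre] == "action" && r[:genre] != "drama" }
      .map { |r| r[:movie] }
      .uniq
end

# Set difference, analogous to
#   movies.id NOT IN (subquery returning the drama movies)
def action_not_drama(rows)
  action = rows.select { |r| r[:genre] == "action" }.map { |r| r[:movie] }
  drama  = rows.select { |r| r[:genre] == "drama" }.map { |r| r[:movie] }
  (action - drama).uniq
end

p row_filtered(ROWS)      # => ["Die Hard", "Gladiator"]
p action_not_drama(ROWS)  # => ["Die Hard"]
```

The exclusion has to be decided per movie, not per joined row, which is why the subquery (or a two-step id lookup) is needed.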

Ruby on Rails - Scope to exclude tags

I have a search method where I want to give the user the option to exclude Cases, the model I'm searching, that have certain topics. The Search model has two separate many-to-many relationships with Topics, set up like so:
class Search < ActiveRecord::Base
include ViewersHelper
has_many :included_topics, through: :including_topics, source: :topic
has_many :excluded_topics, through: :excluding_topics, source: :topic
has_many :including_topics
has_many :excluding_topics
To search for cases including certain topics, I used this code.
ids = included_topics.pluck(:id)
cases = cases.includes(:topics).where('topics.id' => ids)
What would be the opposite of this query to get the excluded topics working?
Based on this question, I tried the following:
ids = excludes_topics.pluck(:id)
cases = cases.includes(:topics).where('topics.id NOT IN (?)', ids)
But rails gives me the following error from that query:
ActiveRecord::StatementInvalid in Searches#show
Showing ../app/views/searches/show.html.erb where line #6 raised:
SQLite3::SQLException: no such column: topics.id: SELECT "cases".* FROM "cases" WHERE (topics.id NOT IN (2)) LIMIT 20 OFFSET 0
Line #6 is just the first line where the @cases variable, on which the search is performed, is referenced.
You can use
ids = excludes_topics.pluck(:id)
cases = Case.joins(:topics).where('topics.id NOT IN (?)', ids)
Here's a solution I created using two queries:
ids = excludes_topics.pluck(:id)
included_cases_ids = Case.all.includes(:topics).where('topics.id' => ids).pluck(:id)
cases = cases.where.not('id' => included_cases_ids)

Is it possible to delete_all with inner join conditions?

I need to delete a lot of records at once and I need to do so based on a condition in another model that is related by a "belongs_to" relationship. I know I can loop through each checking for the condition, but this takes forever with my large record set because for each "belongs_to" it makes a separate query.
Here is an example. I have a "Product" model that "belongs_to" an "Artist", and let's say that artist has a property "is_disabled".
If I want to delete all products that belong to disabled artists, I would like to be able to do something like:
Product.delete_all(:joins => :artist, :conditions => ["artists.is_disabled = ?", true])
Is this possible? I have done this directly in SQL before, but not sure if it is possible to do through rails.
The problem is that delete_all discards all the join information (and rightly so). What you want to do is capture that as an inner select.
If you're using Rails 3 you can create a scope that will give you what you want:
class Product < ActiveRecord::Base
scope :with_disabled_artist, lambda {
where("products.id IN (#{select("products.id").joins(:artist).where("artists.is_disabled = TRUE").to_sql})")
}
end
Your query call then becomes
Product.with_disabled_artist.delete_all
You can also use the same query inline but that's not very elegant (or self-documenting):
Product.where("products.id IN (#{Product.select("products.id").joins(:artist).where("artists.is_disabled = TRUE").to_sql})").delete_all
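The "inner select" idea can be shown without Rails at all. The following is only a sketch in plain Ruby: the helper name delete_via_subquery is hypothetical, and the artist_id foreign key column is an assumption based on the belongs_to in the question. It builds the kind of SQL string the scope above is meant to produce.

```ruby
# Hypothetical helper: wraps a SELECT that returns ids in a
# DELETE ... WHERE id IN (...). Arguments are trusted strings only;
# never interpolate user input into SQL like this.
def delete_via_subquery(table, ids_subquery)
  "DELETE FROM #{table} WHERE #{table}.id IN (#{ids_subquery})"
end

# Assumes products.artist_id is the foreign key behind belongs_to :artist.
# The TRUE literal syntax can vary by database.
ids_subquery = "SELECT products.id FROM products " \
               "INNER JOIN artists ON artists.id = products.artist_id " \
               "WHERE artists.is_disabled = TRUE"

puts delete_via_subquery("products", ids_subquery)
```

The join lives entirely inside the subquery, so the outer DELETE only sees a list of ids. Some databases (MySQL, for example) reject a DELETE whose subquery reads from the table being deleted unless the subquery is wrapped in a derived table.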
In Rails 4 (I tested on 4.2) you can do almost exactly what the OP originally wanted:
Application.joins(:vacancy).where(vacancies: {status: 'draft'}).delete_all
will give
DELETE FROM `applications` WHERE `applications`.`id` IN (SELECT id FROM (SELECT `applications`.`id` FROM `applications` INNER JOIN `vacancies` ON `vacancies`.`id` = `applications`.`vacancy_id` WHERE `vacancies`.`status` = 'draft') __active_record_temp)
If you are using Rails 2 you can't do the above. An alternative is to use a joins clause in a find method and call delete on each item.
TellerLocationWidget.find(:all, :joins => [:widget, :teller_location],
:conditions => {:widgets => {:alt_id => params['alt_id']},
:retailer_locations => {:id => @teller_location.id}}).each do |loc|
loc.delete
end

Including associations optimization in Rails

I'm looking for help with Ruby optimization regarding loading of associations on demand.
This is a simplified example. I have 3 models: Post, Comment, User. The references are: Post has many comments, and Comment has a reference to User (:author). Now when I go to the post page, I expect to see the post body plus all comments (and their respective authors' names). This requires the following 2 queries:
select * from Post -- to get post data (1 row)
select * from Comment inner join User -- to get comment + usernames (N rows)
In the code I have:
Post.find(params[:id], :include => { :comments => [:author] })
But it doesn't work as expected: as I see in the back end, there're still N+1 hits (some of them are cached though). How can I optimize that?
UPD
After some investigation, it looks like code was correct, but it doesn't work as expected in case I have named belongs_to in a Comment model. Once I changed from :author to :user, it worked as expected.
In my project I have a similar relationship to your Post, Comment, and User models. I only see three actual sql queries.
Post.find(1, :include => { :comments => [:author] })
From the debug log it shows these three queries
SELECT * FROM `posts` WHERE (`posts`.`id` = 1)
SELECT `comments`.* FROM `comments` WHERE (`comments`.`post_id` = 1)
SELECT * FROM `authors` WHERE (`authors`.`id` IN (4,8,15,16,23,42))
If you are happy with 2/3 queries, you can try:
@post = Post.find params[:id]
@comments = Comment.find_all_by_post_id(params[:id], :include => [:author])
or
@comments = @post.comments.find(:all, :include => [:author])
Edit: Have you tried with:
Post.find(params[:id], :include => { :comments => :author })

Rails: how to load 2 models via join?

I am new to rails and would appreciate some help optimizing my database usage.
Is there a way to load two models associated with each other with one DB query?
I have two models Person and Image:
class Person < ActiveRecord::Base
has_many :images
end
class Image < ActiveRecord::Base
belongs_to :person
end
I would like to load a set of people and their associated images with a single trip to the DB using a join command. For instance, in SQL, I can load all the data I need with the following query:
select * from people join images on people.id = images.person_id where people.id in (2, 3) order by timestamp;
So I was hoping that this rails snippet would do what I need:
>> people_and_images = Person.find(:all, :conditions => ["people.id in (?)", [2, 3]], :joins => :images, :order => :timestamp)
This code executes the SQL statement I am expecting and loads the instances of Person I need. However, I see that accessing a Person's images leads to an additional SQL query.
>> people_and_images[0].images
Image Load (0.004889) SELECT * FROM `images` WHERE (`images`.person_id = 2)
Using the :include option in the call to find() does load both models, however it will cost me an additional SELECT by executing it along with the JOIN.
I would like to do in Rails what I can do in SQL which is to grab all the data I need with one query.
Any help would be greatly appreciated. Thanks!
You want to use :include like
Person.find(:all, :conditions => ["people.id in (?)", [2, 3]], :include => :images, :order => :timestamp)
Check out the find documentation for more details
You can use :include for eager loading of associations. It does indeed run exactly 2 queries instead of the one you get with :joins: the first query loads the primary model and the second loads the associated models. This is especially helpful in solving the infamous N+1 query problem, which you will face if you don't use :include, because :joins doesn't eager-load the associations.
Using :include costs 1 query more than :joins, but not using :include will cost a whole lot more.
You can read more here: http://guides.rubyonrails.org/active_record_querying.html#eager-loading-associations
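The query counts are easy to reproduce with a toy model. This is a sketch in plain Ruby under stated assumptions: FakeDb, its sample rows, and the two loader methods are all hypothetical and only mimic the access pattern; this is not how ActiveRecord is implemented.

```ruby
# Toy in-memory "database" that counts how many queries it receives.
class FakeDb
  attr_reader :query_count

  PEOPLE = [{ id: 2, name: "Ann" }, { id: 3, name: "Bob" }].freeze
  IMAGES = [
    { id: 10, person_id: 2 },
    { id: 11, person_id: 2 },
    { id: 12, person_id: 3 }
  ].freeze

  def initialize
    @query_count = 0
  end

  # One "query" returning the people with the given ids.
  def people(ids)
    @query_count += 1
    PEOPLE.select { |person| ids.include?(person[:id]) }
  end

  # One "query" returning all images for the given person ids.
  def images_for(person_ids)
    @query_count += 1
    IMAGES.select { |image| person_ids.include?(image[:person_id]) }
  end
end

# Lazy loading: one query for the people, then one more per person (N+1).
def lazy_load(db, ids)
  db.people(ids).map do |person|
    [person[:name], db.images_for([person[:id]]).map { |image| image[:id] }]
  end.to_h
end

# Eager loading in the style of :include: one query for the people,
# one query for all of their images, then grouping in memory.
def eager_load(db, ids)
  people = db.people(ids)
  images = db.images_for(people.map { |person| person[:id] })
             .group_by { |image| image[:person_id] }
  people.map do |person|
    [person[:name], images.fetch(person[:id], []).map { |image| image[:id] }]
  end.to_h
end

lazy_db = FakeDb.new
eager_db = FakeDb.new
lazy_load(lazy_db, [2, 3])
eager_load(eager_db, [2, 3])
puts "lazy: #{lazy_db.query_count} queries, eager: #{eager_db.query_count} queries"
# prints "lazy: 3 queries, eager: 2 queries"
```

With two people the gap is only one query, but the lazy count grows with every extra person while the eager count stays at two.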

Resources