Rails "includes" Method and Avoiding N+1 Query - ruby-on-rails

I don't understand the Rails includes method as well as I'd like, and I ran into an issue that I'm hoping to clarify. I have a Board model that has_many :members and has_many :lists (and a list has_many :cards). In the following boards controller, the show method looks as follows:
def show
  @board = Board.includes(:members, lists: :cards).find(params[:id])
  ...
end
Why is the includes method needed here? Why couldn't we just use @board = Board.find(params[:id]) and then access the members and lists via @board.members and @board.lists? I guess I'm not really seeing why we would need to prefetch. It'd be awesome if someone could detail why this is more effective in terms of SQL queries. Thanks!

Per the Rails docs:
Eager loading is the mechanism for loading the associated records of
the objects returned by Model.find using as few queries as possible.
When you simply load a record and later query its different relationships, you have to run a query each time. Take this example, also from the Rails docs:
clients = Client.limit(10)
clients.each do |client|
  puts client.address.postcode
end
This code executes a database call 11 times to do something pretty trivial, because it has to do each lookup, one at a time.
Compare the above code to this example:
clients = Client.includes(:address).limit(10)
clients.each do |client|
  puts client.address.postcode
end
This code executes a database call 2 times, because all of the necessary associations are included at the onset.
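To make the 11 versus 2 count concrete without a Rails app, here is a minimal plain-Ruby sketch. FakeDB and its methods are hypothetical stand-ins for the database (they are not Rails APIs); each method call counts as one query.

```ruby
# Toy in-memory "database" that counts how many queries it receives.
# FakeDB and its methods are hypothetical stand-ins, not Rails APIs.
class FakeDB
  attr_reader :query_count

  def initialize
    @query_count = 0
    @clients = (1..10).map { |id| { id: id, name: "Client #{id}" } }
    @addresses = (1..10).map { |id| { client_id: id, postcode: format("PC%03d", id) } }
  end

  # Roughly: SELECT * FROM clients LIMIT n
  def clients(limit)
    @query_count += 1
    @clients.first(limit)
  end

  # Roughly: SELECT * FROM addresses WHERE client_id = ? LIMIT 1
  def address_for(client_id)
    @query_count += 1
    @addresses.find { |a| a[:client_id] == client_id }
  end

  # Roughly: SELECT * FROM addresses WHERE client_id IN (...)
  def addresses_for(client_ids)
    @query_count += 1
    @addresses.select { |a| client_ids.include?(a[:client_id]) }
  end
end

# Lazy loading: one query for the clients, then one more per client.
lazy_db = FakeDB.new
lazy_postcodes = lazy_db.clients(10).map { |c| lazy_db.address_for(c[:id])[:postcode] }
lazy_queries = lazy_db.query_count

# Eager loading: one query for the clients, one for all of their addresses.
eager_db = FakeDB.new
clients = eager_db.clients(10)
addresses = eager_db.addresses_for(clients.map { |c| c[:id] })
by_client = addresses.map { |a| [a[:client_id], a] }.to_h
eager_postcodes = clients.map { |c| by_client[c[:id]][:postcode] }
eager_queries = eager_db.query_count

puts "lazy: #{lazy_queries} queries, eager: #{eager_queries} queries"
```

Both versions produce the same postcodes; only the number of round trips differs, and the lazy count grows with the number of clients while the eager count stays fixed.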
Here's a link to the pertinent section of the Rails docs.
Extra
A more recent point to note: if you write a more complex query that filters on an associated model with a SQL string, like so:
Board.includes(:members, lists: :cards).where('members.color = ?', 'foo').references(:members)
you need to append references(:members) (in general, references(:table_used_in_the_condition)) so that Rails knows to join that table into the query.

Related

Rails Controller (events to activity)

I am trying to understand the difference between code blocks 1 and 2.
Are they the same code? Thank you!
Activity : has_many :events
Event : belongs_to :activity
1)
@activity = Activity.find(params[:activity_id])
event = Event.new(event_params)
event.activity_id = @activity
2) (Edited: 'events' is supposed to be pluralized.)
@activity = Activity.find(params[:activity_id])
event = @activity.events.new(event_params)
Yes, in general the two approaches do the same thing and produce the same result.
In scenario 1: You are finding an activity and initializing an event, and then associating the event with the activity. (Strictly, the last line should assign the id, event.activity_id = @activity.id, or assign the object itself with event.activity = @activity.)
In scenario 2: You are finding an activity and then initializing one of its associated events through the events association. Note that it should be @activity.events.new(event_params), NOT @activity.event.new(event_params): events must be plural because you have a has_many association.
If you call save in both cases, you will get the same result. When you then call activity.events you get the list of events associated with that activity, and the event created above will be in that list in both cases.
However, although both scenarios do the same thing, the second is considered the more idiomatic Rails way of doing things and hence the better practice.
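A minimal plain-Ruby sketch of why the two scenarios end up equivalent: building through the association just fills in the foreign key for you. Event, EventsCollection and Activity here are simplified hypothetical stand-ins, not the Active Record classes.

```ruby
# Simplified stand-ins for the models; these are NOT Active Record classes.
Event = Struct.new(:name, :activity_id, keyword_init: true)

# Hypothetical collection proxy: it remembers which activity owns it.
class EventsCollection
  def initialize(owner)
    @owner = owner
  end

  # Builds an event with the owner's id already set as the foreign key.
  def new(attrs = {})
    Event.new(**attrs, activity_id: @owner.id)
  end
end

class Activity
  attr_reader :id

  def initialize(id)
    @id = id
  end

  def events
    EventsCollection.new(self)
  end
end

activity = Activity.new(7)

# Scenario 1: build the event, then set the foreign key by hand.
manual = Event.new(name: "Kickoff")
manual.activity_id = activity.id

# Scenario 2: build through the association; the key is set for you.
built = activity.events.new(name: "Kickoff")

puts manual == built
```

The second form leaves no room to forget the foreign key or to assign the wrong value to it, which is the practical reason it is preferred.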
The two blocks do the same thing, but they go about it differently, and only one of them is the preferred way. I explain the difference line by line in the comments below.
1)
# Find the activity
@activity = Activity.find(params[:activity_id])
# Initialise the event object from the event parameters
event = Event.new(event_params)
# Assign the activity to the event. This builds the association,
# but it is a manual process. Your ORM, Active Record, gives you
# a better way to handle that; your block 2 is the preferred one.
event.activity_id = @activity
#
# Comment:
# This is not the best practice, because it does not make use of
# Rails's ORM, Active Record.
2)
# Find the activity
@activity = Activity.find(params[:activity_id])
# Create the event through the events association.
# The association name must be the plural form, events:
event = @activity.events.new(event_params)
#
# Comment: This is the preferred way.
# You can refactor further, for example by moving the @activity
# lookup into a before_action callback so that it is not repeated
# in each of your actions.
No, they're not the same lines of code.
The association declarations tell ActiveRecord to look up particular records in specific tables, using the appropriate foreign key:
The has_many declaration will perform a query like this:
"SELECT * FROM `events` WHERE `events`.`activity_id` = ?", [activity.id]
It's querying the events table.
--
The belongs_to will pull data out of the parent table using the foreign key stored on the child record:
"SELECT * FROM `activities` WHERE `activities`.`id` = ? LIMIT 1", [event.activity_id]
It's important to note that you could also get the same result in two steps:
activity_id = "SELECT `activity_id` FROM `events` WHERE `events`.`id` = ? LIMIT 1", [event.id]
"SELECT * FROM `activities` WHERE `activities`.`id` = ?", [activity_id]
I.e. belongs_to starts from a foreign key held in the model's own table, whilst has_many pulls data from another table whose rows point back at it.
Although these look similar, they are very different in the background. The has_many association denotes the possibility of extra records in another data table; the belongs_to association has to have a "parent" object.
Thus, when using has_many / belongs_to, you have to understand which is the "parent" object. For example:
#app/models/post.rb
class Post < ActiveRecord::Base
  has_many :comments # -> there don't have to be any "comment" objects
end
#app/models/comment.rb
class Comment < ActiveRecord::Base
  belongs_to :post # -> only works if there is a "post" object
end
Hopefully that explains it a little clearer.
Also, you have to remember that Rails is built on top of a relational database.
This means that each time you use ActiveRecord or any of the adjoining functionality, you have to ensure that you understand what this means.
Relational databases work by storing a "foreign key" in one table that points at a row in a related table. This allows your ORM (Object Relational Mapper), in our case ActiveRecord, to pull the appropriate data from the other tables.
As such, all the associations you call within your application are basically ways to represent that relational database setup.

Calling a class method through an object instance

Railscasts #4 uses this sample code:
class Task < ActiveRecord::Base
  belongs_to :project
  def self.find_incomplete
    find_all_by_complete(false, :order => "created_at DESC")
  end
end
class ProjectsController < ApplicationController
  def show
    @project = Project.find(params[:id])
    @tasks = @project.tasks.find_incomplete
  end
end
Calling @project.tasks.find_incomplete finds only the incomplete tasks that belong to that specific Project instance.
I would expect that call to be equivalent to Task.find_incomplete, but it is not. How can that be? How does Rails (or Ruby) know to invoke that method for just the Tasks in that Project instance?
This works because scopes of ActiveRecord relations are merged. It's not that find_incomplete is running on individual task instances.
@project.tasks creates an ActiveRecord scope of the tasks for that project instance and then that scope is still in effect when your find_incomplete method is called.
Take a look at the documentation here: http://guides.rubyonrails.org/active_record_querying.html#scopes
Your find_incomplete method works in the same way as the self.published example in the docs.
Think of the underlying SQL query that would run:
@project.tasks would create a where condition like SELECT * FROM tasks WHERE project_id = <project_id>
find_all_by_complete then merges in an AND condition for complete = 0
I think the other piece of the puzzle that might help is that @project.tasks is not just a simple array of Task objects, although it will look like one if you type project.tasks in the Rails console, because the console loads and prints the records. project.tasks is actually an Active Record relation object (or, more precisely, a proxy).
There are a number of reasons and benefits for this but the main 2 are that it allows chaining and it allows the underlying query to be run on demand only if/when needed.
The relation has a sequence of rules for how method calls are delegated, one of which is to call class methods on the class of the associated objects. (the relation knows that it's a relation to Tasks)
So you are correct when you wrote tasks.find_incomplete is still equal to Task.find_incomplete except that when find_incomplete is called through project.tasks a scope narrowing down to the project_id is already in effect.
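The delegation described above can be sketched in plain Ruby. This is only a toy model under stated assumptions: Relation, Task.current_scope and Task::ROWS are hypothetical names invented for the illustration, not Rails internals, and the real implementation is considerably more involved.

```ruby
# Toy model of a relation that forwards class methods while keeping its scope.
# Relation, Task.current_scope and Task::ROWS are hypothetical, not Rails internals.
class Relation
  def initialize(klass, conditions = {})
    @klass = klass
    @conditions = conditions
  end

  # Returns a new relation with the extra conditions merged in (AND semantics).
  def where(extra)
    Relation.new(@klass, @conditions.merge(extra))
  end

  # "Runs the query": filters the in-memory rows by every condition.
  def to_a
    @klass::ROWS.select do |row|
      @conditions.all? { |attr, value| row.public_send(attr) == value }
    end
  end

  # Unknown methods are forwarded to the model class, with this relation
  # installed as the current scope for the duration of the call.
  def method_missing(name, *args, &block)
    return super unless @klass.respond_to?(name)

    previous = @klass.current_scope
    @klass.current_scope = self
    begin
      @klass.public_send(name, *args, &block)
    ensure
      @klass.current_scope = previous
    end
  end

  def respond_to_missing?(name, include_private = false)
    @klass.respond_to?(name, include_private) || super
  end
end

class Task
  ROWS = [] # stand-in for the tasks table
  attr_reader :project_id, :complete, :title

  class << self
    attr_accessor :current_scope

    # Starts from the current scope if one is active, else from an empty one.
    def where(conditions)
      (current_scope || Relation.new(self)).where(conditions)
    end

    def find_incomplete
      where(complete: false)
    end
  end

  def initialize(project_id, complete, title)
    @project_id = project_id
    @complete = complete
    @title = title
    ROWS << self
  end
end

Task.new(1, false, "write spec")
Task.new(1, true, "ship it")
Task.new(2, false, "triage bugs")

project_tasks = Relation.new(Task, { project_id: 1 }) # what project.tasks would return
scoped_titles = project_tasks.find_incomplete.to_a.map(&:title)
global_titles = Task.find_incomplete.to_a.map(&:title)
puts scoped_titles.inspect
puts global_titles.inspect
```

The same find_incomplete method runs in both calls; the only difference is whether a project-scoped relation is in effect when it runs.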

Override just the default scope (specifically order) and nothing else in Rails

So basically I have two classes, Book and Author. Books can have multiple authors and authors can have multiple books. Books have the following default scope.
default_scope :order => "publish_at DESC"
On the Author show page I want to list all the books associated with that author so I say the following...
@author = Author.find(params[:id])
@books = @author.books
All is good so far. The author#show page lists all books belonging to that author ordered by publication date.
I'm also working on a gem that is able to sort by the popularity of a book.
@books = @author.books.sort_by_popularity
The problem is that whenever it tries to sort, the default_scope always gets in the way. And if I try to unscope it first, it gets rid of the author relation and returns every book in the database. For example:
@books = @author.books.unscoped.sort_by_popularity # returns all books in database
I'm wondering if I can use the ActiveRelation except() method to do something like this (which seems like it should work but doesn't; it does drop an order, just not when the order comes from a default_scope):
def sort_by_popularity
  self.except(:order).do_some_joining_magic.order('popularity ASC')
end
Any ideas as to why this doesn't work? Any ideas on how to get this to work a different way? I know I can just get rid of the default_scope but I'm wondering if there is another way to do this.
You should be able to use reorder to completely replace the existing ORDER BY:
reorder(*args)
Replaces any existing order defined on the relation with the specified order.
So something like this:
def self.sort_by_popularity
  scoped.do_some_joining_magic.reorder('popularity ASC')
end
And I think you want to use a class method for that and scoped instead of self but I don't know the whole context so maybe I'm wrong.
I don't know why except doesn't work. The default_scope seems to get applied at the end (sort of) rather than the beginning but I haven't looked into it that much.
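To illustrate the difference between appending an order and replacing it, here is a small plain-Ruby sketch. OrderSketch is a hypothetical class that only models the ORDER BY clause; it is not how Active Record builds SQL.

```ruby
# Hypothetical model of a relation's ORDER BY handling; not Active Record itself.
class OrderSketch
  attr_reader :order_values

  def initialize(order_values = [])
    @order_values = order_values.dup.freeze
  end

  # order appends to whatever ordering is already present.
  def order(clause)
    OrderSketch.new(order_values + [clause])
  end

  # reorder throws the existing ordering away and starts over.
  def reorder(clause)
    OrderSketch.new([clause])
  end

  def to_sql
    sql = "SELECT books.* FROM books"
    sql += " ORDER BY #{order_values.join(', ')}" unless order_values.empty?
    sql
  end
end

# Stands in for a relation that already carries the default_scope ordering.
default_scoped = OrderSketch.new.order("publish_at DESC")

appended = default_scoped.order("popularity ASC").to_sql
replaced = default_scoped.reorder("popularity ASC").to_sql

puts appended
puts replaced
```

In the appended query the popularity column is only a tie-breaker behind publish_at, which is why the default scope appears to "get in the way" of the sort.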
You can do it without losing the default_scope or other ordering:
@books.order_values.prepend 'popularity ASC'

Eager loading associations on ActiveModel instances in Rails

In RoR, it is a pretty common mistake for newcomers to load a class and its associations like this:
# The code below generates an insane number of queries
# post has many comments
# If you have 10 posts with 5 comments each
# this will run 11 queries
posts = Post.find(:all)
posts.each do |post|
  post.comments # runs a query to get the comments for that post
end
The solution is pretty simple: eager load.
# should be 2 queries
# no matter how many posts you have
posts = Post.find(:all, :include => :comments) # runs a query to get all the comments for all the posts
posts.each do |post|
  post.comments # already loaded, so no further query
end
But what if you don't have access to the class methods, and only have access to a collection of instances?
Then you are stuck with the query-intensive lazy loading.
Is there a way to minimize queries to get all the comments for the collection of posts, from the collection of instances?
Addition for Answer (also added to the code above)
From what I can see in the rdoc for Rails, eager loading is a class method on anything that extends ActiveRecord::Associations. The problem is: say you don't have the ability to use a class method, so you need to use some sort of instance method.
A code example of what I think it would look like is something like:
posts = Post.find(:all)
posts.get_all(:comments) # hypothetical method: runs the query to build comments into each post without the class method
In Rails 3.0 and earlier you can do:
Post.send :preload_associations, posts, :comments
You can pass arrays or hashes of association names like you can to include:
Post.send :preload_associations, posts, :comments => :users
In Rails 3.1 this has been moved and you use the Preloader like this:
ActiveRecord::Associations::Preloader.new(posts, :comments).run()
And since Rails 4 its invocation has changed to:
ActiveRecord::Associations::Preloader.new.preload(posts, :comments)
I think I get what you're asking.
However, I don't think you have to worry about what methods you have access to. The foreign key relationship (and the ActiveRecord associations, such as has_many, belongs_to, etc.) will take care of figuring out how to load the associated records.
If you can provide a specific example of what you think should happen, and actual code that isn't working, it would be easier to see what you're getting at.
How are you obtaining your collection of model instances, and what version of Rails are you using?
Are you saying that you have absolutely no access to either controllers or models themselves?
Giving you the best answer depends on knowing those things.

N+1 problem in mongoid

I'm using Mongoid to work with MongoDB in Rails.
What I'm looking for is something like Active Record's include. So far I have failed to find such a method in the Mongoid ORM.
Does anybody know how to solve this problem in Mongoid, or perhaps in MongoMapper, which is known as another good alternative?
Now that some time has passed, Mongoid has indeed added support for this. See the "Eager Loading" section here:
http://docs.mongodb.org/ecosystem/tutorial/ruby-mongoid-tutorial/#eager-loading
Band.includes(:albums).each do |band|
  p band.albums.first.name # Does not hit the database again.
end
I'd like to point out:
Rails' :include does not do a join
SQL and Mongo both need eager loading.
The N+1 problem happens in this type of scenario (query generated inside of loop):
<% @posts.each do |post| %>
  <% post.comments.each do |comment| %>
    <%= comment.title %>
  <% end %>
<% end %>
Looks like the link that @amrnt posted was merged into Mongoid.
Update: it's been two years since I posted this answer and things have changed. See tybro0103's answer for details.
Old Answer
Based on the documentation of both drivers, neither of them supports what you're looking for. Probably because it wouldn't solve anything.
The :include functionality of ActiveRecord solves the N+1 problem for SQL databases. By telling ActiveRecord which related tables to include, it can build a single SQL query, by using JOIN statements. This will result in a single database call, regardless of the amount of tables you want to query.
MongoDB only allows you to query a single collection at a time. It doesn't support anything like a JOIN. So even if you could tell Mongoid which other collections it has to include, it would still have to perform a separate query for each additional collection.
Although the other answers are correct, in current versions of Mongoid the includes method is the best way to achieve the desired results. In previous versions where includes was not available I have found a way to get rid of the n+1 issue and thought it was worth mentioning.
In my case it was an n+2 issue.
class Judge
  include Mongoid::Document
  belongs_to :user
  belongs_to :photo
  def as_json(options={})
    {
      id: _id,
      photo: photo,
      user: user
    }
  end
end
class User
  include Mongoid::Document
  has_one :judge
end
class Photo
  include Mongoid::Document
  has_one :judge
end
controller action:
def index
  @judges = Judge.where(:user_id.exists => true)
  respond_with @judges
end
This as_json response results in an n+2 query issue from the Judge records, in my case giving the dev server a response time of:
Completed 200 OK in 816ms (Views: 785.2ms)
The key to solving this issue is to load the Users and the Photos in a single query each, instead of one by one per Judge.
You can do this by utilizing Mongoid's IdentityMap. Mongoid 2 and Mongoid 3 support this feature.
First turn on the identity map in the mongoid.yml configuration file:
development:
  host: localhost
  database: awesome_app
  identity_map_enabled: true
Now change the controller action to manually load the users and photos. Note: The Mongoid::Relation record will lazily evaluate the query so you must call to_a to actually query the records and have them stored in the IdentityMap.
def index
  @judges ||= Awards::Api::Judge.where(:user_id.exists => true)
  @users = User.where(:_id.in => @judges.map(&:user_id)).to_a
  @photos = Awards::Api::Judges::Photo.where(:_id.in => @judges.map(&:photo_id)).to_a
  respond_with @judges
end
This results in only 3 queries total. 1 for the Judges, 1 for the Users and 1 for the Photos.
Completed 200 OK in 559ms (Views: 87.7ms)
How does this work? What's an IdentityMap?
An IdentityMap helps to keep track of what objects or records have already been loaded. So if you fetch the first User record the IdentityMap will store it. Then if you attempt to fetch the same User again Mongoid queries the IdentityMap for the User before it queries the Database again. This will save 1 query on the database.
So by loading all of the Users and Photos that we know the Judges' JSON will need in manual queries, we pre-load the data into the IdentityMap all at once. Then when a Judge requires its User and Photo, it checks the IdentityMap and does not need to query the database.
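The idea can be sketched in a few lines of plain Ruby. IdentityMapSketch is a hypothetical class written for this illustration; it is not Mongoid's IdentityMap, and the Hash passed in merely stands in for the database.

```ruby
# Hypothetical identity map; NOT Mongoid's implementation.
class IdentityMapSketch
  attr_reader :db_hits

  # store is a Hash of id => record that stands in for the database.
  def initialize(store)
    @store = store
    @loaded = {}
    @db_hits = 0
  end

  # Returns the cached record if present, otherwise "queries" the store once.
  def find(id)
    @loaded.fetch(id) do
      @db_hits += 1
      @loaded[id] = @store.fetch(id)
    end
  end

  # Loads every missing id in a single "query", like where(:_id.in => ids).to_a.
  def preload(ids)
    missing = ids.uniq - @loaded.keys
    return if missing.empty?

    @db_hits += 1
    missing.each { |id| @loaded[id] = @store.fetch(id) }
  end
end

users = { 1 => "ann", 2 => "bob", 3 => "cy" }
judge_user_ids = [1, 2, 3, 1] # the last judge shares a user with the first

# Without preloading: one hit per distinct user; the repeat is served from the map.
lazy = IdentityMapSketch.new(users)
lazy_names = judge_user_ids.map { |id| lazy.find(id) }
lazy_hits = lazy.db_hits

# With preloading: a single hit up front, then every lookup is served from the map.
eager = IdentityMapSketch.new(users)
eager.preload(judge_user_ids)
eager_names = judge_user_ids.map { |id| eager.find(id) }
eager_hits = eager.db_hits

puts "lazy hits: #{lazy_hits}, eager hits: #{eager_hits}"
```

The map on its own only removes repeated lookups of the same record; it is the bulk preload that collapses the per-Judge queries into one.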
ActiveRecord :include typically doesn't do a full join to populate Ruby objects. It does two calls. First to get the parent object (say a Post) then a second call to pull the related objects (comments that belong to the Post).
Mongoid works essentially the same way for referenced associations.
class Post
  include Mongoid::Document
  references_many :comments
end
class Comment
  include Mongoid::Document
  referenced_in :post
end
In the controller you get the post:
@post = Post.find(params[:id])
In your view you iterate over the comments:
<%- @post.comments.each do |comment| -%>
  VIEW CODE
<%- end -%>
Mongoid will find the post in the collection. When you hit the comments iterator it does a single query to get the comments. Mongoid wraps the query in a cursor so it is a true iterator and doesn't overload the memory.
Mongoid lazy loads all queries to allow this behavior by default. The :include tag is unnecessary.
This could help https://github.com/flyerhzm/mongoid-eager-loading
You need to update your schema to avoid this N+1; MongoDB offers no way to do a join.
Embed the detail records/documents in the master record/document.
In my case I didn't have the whole collection, just a single object from it, and that caused the n+1 (the Bullet gem reports it).
So rather than writing the line below, which causes n+1:
quote.providers.officialname
I wrote:
Quote.includes(:provider).find(quote._id).provider.officialname
That didn't cause a problem, but it left me wondering whether I had repeated myself, or whether checking for n+1 is unnecessary with Mongoid.

Resources