I have two models with a has_many / belongs_to relation.
Scheme has_many navs
I need to fetch all the Schemes with only the last nav value. I have 10 Schemes and each scheme has around 100k navs, but I need only the last record, which is the current value.
Eager loading like this will load all the navs:
Scheme.all.includes(:navs)
How can I apply a condition to get only the last nav row for each scheme while eager loading?
UPDATE with Log
If I run
Scheme.includes(:current_nav).limit(3)
these are the queries executed by AR
SELECT `schemes`.* FROM `schemes` LIMIT 3
SELECT `navs`.* FROM `navs` WHERE `navs`.`schemeCode` IN ('D04', 'D01', 'D30') ORDER BY id DESC
How does the second query work? It takes all the navs whose schemeCode is in the list and orders them by id DESC, but how exactly is each nav associated with its particular scheme?
How about creating another association like this:
class Scheme < ActiveRecord::Base
  has_one :current_nav, -> { order('id DESC').limit(1) }, class_name: 'Nav'
end
Now you can:
Scheme.includes(:current_nav).all
or:
Scheme.includes(:current_nav).last(10)
This will eager load only the last nav of each queried scheme.
Explanation: includes is one of the methods for retrieving objects from the database in ActiveRecord. From the docs:
Active Record lets you specify in advance all the associations that
are going to be loaded. This is possible by specifying the includes
method of the Model.find call. With includes, Active Record ensures
that all of the specified associations are loaded using the minimum
possible number of queries.
And since we have the association set up as current_nav, all we have to do is use it with includes to eager load the data. Please read the ActiveRecord querying docs for more information.
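On the follow-up question of how the rows from the second query get tied to a particular scheme: the logged SQL has no LIMIT, so it looks as though all navs for the listed schemes are fetched and the matching then happens in Ruby by foreign key. The following is a rough plain-Ruby sketch of that idea, not Active Record's actual implementation; the Nav struct, its fields, and the sample rows are invented for illustration.

```ruby
# Toy stand-in for the rows returned by the second query, already sorted by
# id DESC. Nav, its fields, and the values are made up for illustration.
Nav = Struct.new(:id, :scheme_code, :value)

rows = [
  Nav.new(9, 'D01', 14.2),
  Nav.new(8, 'D30', 51.0),
  Nav.new(7, 'D04', 20.3),
  Nav.new(6, 'D01', 14.0),
  Nav.new(5, 'D04', 20.1),
  Nav.new(4, 'D30', 50.7)
]

# Bucket the rows by foreign key. group_by keeps the original order inside
# each bucket, so the first element is the row with the highest id.
rows_by_scheme = rows.group_by(&:scheme_code)

# A has_one holds a single record, so keep only the first row per scheme.
current_nav_by_scheme = rows_by_scheme.transform_values(&:first)

current_nav_by_scheme.each do |code, nav|
  puts "#{code}: nav id=#{nav.id} value=#{nav.value}"
end
```

If this is indeed what happens, every nav for the listed schemes is still fetched and instantiated before all but one per scheme are discarded, which is worth checking in the log when each scheme has around 100k navs.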
For this setup with default (unspecified) search_data:
class Item < ActiveRecord::Base
  searchkick
  has_many :quantities, dependent: :destroy
  scope :search_import, -> { includes(:quantities) }
end
When importing all database records by running Item.reindex, each "batch" eager loads the quantities for all of the item ids as expected.
However, if I want to specify the index model document differently than the default attributes using the search_data method including the associated model data with something like this:
class Item < ActiveRecord::Base
  searchkick
  has_many :quantities, dependent: :destroy

  def search_data
    {
      part_number: standard_part_number,
      category_id: category_id.to_i,
      content_source_name: content_source_name,
      standard_price: standard_price.to_i,
      locations: quantities.map { |q|
        [q.location_code, quantity: q.value.to_i, bins: q.location_bins]
      }.to_h
    }
  end

  scope :search_import, -> { includes(:quantities) }
end
Here I use map on the quantities to define a locations attribute. Returning to import using Item.reindex, I see that it not only eager loads all of the associated quantities each batch, but it also then loads all quantities per item with a hit to the database again for each item record it is indexing.
Am I missing something here to get Searchkick to eager load the quantities models and be able to customize the search data record using that already loaded associated model without it doing another pull from the database again per item?
Update
I've determined that something in our app is interfering with the way ActiveRecord normally responds to the association method name for eager loaded models. It may not be exclusively related to our use of Searchkick, although Searchkick did reveal the issue. I am not sure what is interfering at this time, but it has something to do with the target method on the association. Perhaps an installed gem is causing the problem. I did manage to find a workaround (for now): using item.association(:quantities).target as a replacement for item.quantities. Now the reindex makes use of the already eager loaded quantities and doesn't hit the db again for each item.
I see that it not only eager loads all of the associated quantities each batch
This is expected behaviour (and very likely performant): each batch has different quantities to load, since they belong to different items, so you don't need to keep all quantities in memory.
each batch, but it also then loads all quantities per item with a hit to the database again for each item record it is indexing.
This is actually unexpected. My guess is that one of the methods on Quantity (#location_code, #value or #location_bins), or even one of the methods you call on Item (#standard_part_number, #category_id, #content_source_name, #standard_price), contains some implementation that requires reloading the records.
Without knowing the code of those methods this is purely speculative, but the part of the code you presented looks fine.
So I've read a lot about the rails includes method but I'm still a bit confused about what's the best situation to use it.
I have a situation where I have a user record and then this user is related to multiple models like client, player, game, team_player, team, server and server_center.
I need to display specific attributes from the related models in a view. I only need around 1-2 attributes from a specific model and I don't use the others.
I have already added delegates; for example, to get server.name from a player I can call server_name. But in this situation, do I include all of the tables from which I need attributes, or should I do something else, given that I only need a couple of attributes from each model?
My query is as follows at the moment:
@user_profile = User
  .includes({:client => [:player, :team_player => [:team]]},
            :game,
            {:server_center => :server})
  .where(game_id: @master.admin.games)
Includes ensures that all of the specified associations are loaded using the minimum possible number of queries.
Let's say we have two models named User and Profile:
class User < ActiveRecord::Base
  has_one :profile
end
class Profile < ActiveRecord::Base
  belongs_to :user
end
If we iterate through the users and display each user's name, where the name field lives in the Profile model associated with User, we would normally have to retrieve the name with a separate database query each time. With the includes method, however, the associated profiles table has already been eagerly loaded, so the loop itself triggers no further queries.
without includes:
users = User.all
users.each do |user|
  puts user.profile.name # need extra database query for each time we call name
end
with includes:
# 1st query to get all users 2nd to get all profiles and loads to the memory
users = User.includes(:profile).all
users.each do |user|
  puts user.profile.name # no extra query needed instead it loads from memory.
end
Eager loading is used to prevent N+1 query problems. By default includes loads the association with one separate query (it switches to a LEFT OUTER JOIN when the query's conditions reference the included table), and this plays an important role in speeding up responses. For example, if we have a huge number of users and iterate through them and their corresponding profiles, the number of extra database hits equals the number of users. With includes, all the profiles are loaded into memory up front, so iterating through the users reads from memory instead of querying.
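The difference in query counts can be mimicked without a database. This is a toy sketch only: FakeDB and its methods are hypothetical stand-ins that count calls, not anything from Rails, and the sample users are made up.

```ruby
# Toy in-memory "database" that counts how many queries it receives.
# FakeDB and its methods are hypothetical stand-ins, not part of Rails.
class FakeDB
  attr_reader :query_count

  def initialize
    @users = [{ id: 1 }, { id: 2 }, { id: 3 }]
    @profiles = [
      { user_id: 1, name: 'Ann' },
      { user_id: 2, name: 'Bob' },
      { user_id: 3, name: 'Cy' }
    ]
    @query_count = 0
  end

  def all_users
    @query_count += 1
    @users
  end

  # One query per call: roughly what a lazy user.profile amounts to.
  def profile_for(user_id)
    @query_count += 1
    @profiles.find { |p| p[:user_id] == user_id }
  end

  # One query for many users: roughly what eager loading amounts to.
  def profiles_for(user_ids)
    @query_count += 1
    @profiles.select { |p| user_ids.include?(p[:user_id]) }
  end
end

# Without eager loading: 1 query for the users plus 1 per user.
lazy_db = FakeDB.new
lazy_names = lazy_db.all_users.map { |u| lazy_db.profile_for(u[:id])[:name] }

# With eager loading: 1 query for the users plus 1 for all their profiles.
eager_db = FakeDB.new
users = eager_db.all_users
profiles = eager_db.profiles_for(users.map { |u| u[:id] })
profile_by_user = profiles.map { |p| [p[:user_id], p] }.to_h
eager_names = users.map { |u| profile_by_user[u[:id]][:name] }

puts "lazy:  #{lazy_db.query_count} queries"
puts "eager: #{eager_db.query_count} queries"
```

With the three sample users, the lazy path issues four queries and the eager path two; the gap grows with the number of users.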
Eager loading may not always be the best cure for N+1 queries. For example, if you are dealing with complex queries, it may be preferable to look at caching solutions such as Russian Doll caching. Both approaches have their own pros and cons; at the end of the day it's up to you to determine the best approach.
One useful gem that helps detect N+1 queries is bullet.
Some of my classes :
class User
  embeds_many :notifications
  field :first_name
  field :last_name

  def name
    "#{first_name} #{last_name}"
  end
end

class Notification
  embedded_in :user
  belongs_to :sender, class_name: "User", inverse_of: nil
end
Now in my views, I implemented a small mailbox system for notifications. However, it's currently hitting the database N+1 times:
<% current_user.notifications.sort{...}.each do |notif|%>
...
<%= notif.sender.name if notif.sender %>
The problem here is notif.sender.name, which causes N hits on the database. Can I somehow preload/eager load this? Something like current_user.notifications.includes(:sender) (but which would work :D)
I currently only need the sender name.
I think you're half out of luck here. Mongoid has an error message like:
Eager loading in Mongoid only supports providing arguments to M.includes that are the names of relations on the M model, and only supports one level of eager loading. (ie, eager loading associations not on the M but one step away via another relation is not allowed).
Note the last parenthesized sentence in particular:
eager loading associations not on the M but one step away via another relation is not allowed
Embedding is a relation but you want to apply includes to the embedded relation and that's one step too far for Mongoid.
The fine manual does say that:
This will work for embedded relations that reference another collection via belongs_to as well.
but that means that you'd call includes on the embedded relation rather than what the models are embedded in. In your case, that means that you could eager load the senders for each set of embedded Notifications:
current_user.notifications.includes(:sender).sort { ... }
That still leaves you with the N+1 problem that eager loading is supposed to get around but your N will be smaller.
If that's still too heavy then you could denormalize the name into each embedded document (i.e. copy it rather than referencing it through the sender). Of course, you'd need to maintain the copies if people were allowed to change their names.
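As a rough illustration of the denormalization idea, here is a plain-Ruby sketch using Structs rather than Mongoid models. The sender_name field and the notify and rename helpers are hypothetical names chosen for the example, not Mongoid API.

```ruby
# Plain-Ruby stand-ins for the documents; these are not Mongoid models.
User = Struct.new(:id, :first_name, :last_name, :notifications) do
  def name
    "#{first_name} #{last_name}"
  end
end

# sender_name is the denormalized copy (a hypothetical field name).
Notification = Struct.new(:sender_id, :sender_name)

# Copy the sender's name at creation time instead of looking it up on display.
def notify(recipient, sender)
  recipient.notifications << Notification.new(sender.id, sender.name)
end

# When a user is renamed, every stored copy has to be refreshed as well.
def rename(user, first_name, last_name, all_users)
  user.first_name = first_name
  user.last_name = last_name
  all_users.each do |u|
    u.notifications.each do |n|
      n.sender_name = user.name if n.sender_id == user.id
    end
  end
end

alice = User.new(1, 'Alice', 'Smith', [])
bob   = User.new(2, 'Bob', 'Jones', [])
notify(bob, alice)

# Rendering the mailbox needs no lookup of the sender.
puts bob.notifications.map(&:sender_name).inspect

# The cost: a rename has to touch every copy.
rename(alice, 'Alice', 'Brown', [alice, bob])
puts bob.notifications.map(&:sender_name).inspect
```

The read path becomes trivial, while the rename path shows the maintenance cost mentioned above: every embedded copy has to be found and rewritten.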
It's not perfect, but this article presents a possible solution.
You can load all the senders in one query and use set_relation so that they are not loaded again one by one.
def notifications_with_senders
  sender_ids = notifications.map(&:sender_id)
  senders = User.in(id: sender_ids).index_by(&:id)
  notifications.each do |notification|
    notification.set_relation(:sender, senders[notification.sender_id])
  end
end
Would be great to have that as a Relation method (like includes of Rails Active Record)
I have a query like this,
company.users.select("users.id, users.state").includes(:organization)
Here I'm eager loading the organization association. I was expecting only the attributes id and user_id to be fetched in the objects, but I get all fields fetched.
Is this how Rails behaves when we eager load, or am I missing something here?
In your case you will get all company users not organizations.
Eager loading means pre-loading the database rows. It will not fetch only attributes. It loads all rows associated.
For example:
comments = Comment.all(:select => "users.name,comment_text", :include => :user)
Here, it will not just load names from the users table. It will get all user rows from the database, so you don't have to fire extra queries. One more thing: when you use include, the select clause is ignored if it contains attributes of the included tables. For more info, see Ryan Bates's Railscasts episode on include vs joins: http://railscasts.com/episodes/181-include-vs-joins
I use mongodb and mongoid gem and I'd like to get some advice.
I have an app where User has many Markets and Market has many Products.
I need to search for products, say in a specific price range, in all (or any) of the markets which belong to the user.
Which relation fits better for this, embedded or referenced?
I currently use referenced and it looks like so
class User
  has_many :markets
end
class Market
  belongs_to :user
  has_many :products
end
class Product
  belongs_to :calendar
  belongs_to :user
end
And for search, I use this query
Product.where(user_id: current_user.id).
in(market_id: marked_ids).
where(:price.gte => price)
I'm curious: since MongoDB is a document-oriented database, would I benefit in performance or design if I used embedded documents in this situation?
In your case I would advise using referenced data, because I suppose you need to manipulate each of those collections on its own (you need to be able to edit/delete/update "products" by _id and run other complicated queries, which is much easier and more efficient with a separate collection).
At the same time I would store some fully embedded data in the Users collection, just to speed up display in the visitor's browser. Let's say you have a user's page where you want to show the user's profile plus the top 5 markets and top 20 products. You can embed those newest top 5 and top 20 in the User document and update the embedded objects when there are new markets/products. In this case, showing the user's page takes just 1 query to MongoDB, so the embedded data works as a cache. If the visitor wants to view more products, they go to the next page, "Products", which queries the separate "Products" collection in MongoDB.
Use embedded documents if you only need to access the item through the parent class. If you need to query it directly or from multiple objects, use a reference.
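The rule of thumb can be pictured with plain hashes standing in for documents. This is only an in-memory analogy with made-up field names and values; it says nothing about actual MongoDB performance.

```ruby
# Plain hashes standing in for MongoDB documents; names and values are
# made up for illustration.

# Referenced: products live in their own collection and point at their owners.
products = [
  { _id: 1, user_id: 10, market_id: 100, price: 5 },
  { _id: 2, user_id: 10, market_id: 101, price: 25 },
  { _id: 3, user_id: 11, market_id: 102, price: 30 }
]

# Embedded: products are nested inside markets, which are nested in the user.
user_doc = {
  _id: 10,
  markets: [
    { _id: 100, products: [{ _id: 1, price: 5 }] },
    { _id: 101, products: [{ _id: 2, price: 25 }] }
  ]
}

min_price = 20
market_ids = [100, 101]

# Referenced: a single flat filter over one collection.
referenced_hits = products.select do |p|
  p[:user_id] == 10 && market_ids.include?(p[:market_id]) && p[:price] >= min_price
end

# Embedded: the parent is loaded first and walked to reach the products.
embedded_products = user_doc[:markets].flat_map { |m| m[:products] }
embedded_hits = embedded_products.select { |p| p[:price] >= min_price }

puts referenced_hits.map { |p| p[:_id] }.inspect
puts embedded_hits.map { |p| p[:_id] }.inspect
```

Both shapes find the same product here; the difference is that the referenced form can be filtered directly, while the embedded form is only reachable through its parent.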