Rails Eager Load and Limit - ruby-on-rails

I think I need something akin to a rails eager loaded query with a limit on it but I am having trouble finding a solution for that.
For the sake of simplicity, let us say that there will never be more than 30 Persons in the system (so Person.all is a small dataset) but each person will have upwards of 2000 comments (so Person.includes(:comments) would be a large dataset).
Parent association
class Person < ActiveRecord::Base
has_many :comments
end
Child association
class Comment < ActiveRecord::Base
belongs_to :person
end
I need to query for a list of Persons and include their comments, but I only need 5 comments per person.
I would like to do something like this:
Limited parent association
class Person < ActiveRecord::Base
has_many :comments
has_many :sample_of_comments, \
:class_name => 'Comment', :limit => 5
end
Controller
class PersonController < ApplicationController
def index
@persons = Person.includes(:sample_of_comments)
end
end
Unfortunately, this article states: "If you eager load an association with a specified :limit option, it will be ignored, returning all the associated objects"
Is there any good way around this? Or am I doomed to choose between eager loading thousands of unneeded ActiveRecord objects and an N+1 query? Also note that this is a simplified example. In the real world, I will have other associations with Person (photos, articles, etc.) in the same index action, with the same issue as comments.

Regardless of what that article says, the underlying issue is SQL: the second query that eager loading issues fetches the comments for all persons at once, so a plain LIMIT cannot restrict it to 5 rows per person.
You can, however, add a new column and filter with a WHERE clause instead:
Change your second association on Person to has_many :sample_of_comments, class_name: 'Comment', conditions: { is_sample: true }
Add an is_sample boolean column to the comments table
Add a Comment before_create hook that assigns self.is_sample = person.sample_of_comments.count < 5
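The flagging rule from the last step can be sketched in plain Ruby, without Rails, to show what the WHERE is_sample filter would return. The Struct, the create_comment helper, and the SAMPLE_SIZE constant are hypothetical stand-ins for the model, the before_create hook, and the limit; they are not part of the original answer.

```ruby
# Plain-Ruby stand-in for the comments table (hypothetical, no Rails needed).
Comment = Struct.new(:person_id, :body, :is_sample)

SAMPLE_SIZE = 5

# Mimics the before_create hook: flag the new comment as a sample only while
# the person has fewer than SAMPLE_SIZE flagged comments.
def create_comment(table, person_id, body)
  existing = table.count { |c| c.person_id == person_id && c.is_sample }
  comment = Comment.new(person_id, body, existing < SAMPLE_SIZE)
  table << comment
  comment
end

table = []
8.times { |i| create_comment(table, 1, "p1 comment #{i}") }
3.times { |i| create_comment(table, 2, "p2 comment #{i}") }

# Equivalent of WHERE is_sample = true, grouped per person.
samples = table.select(&:is_sample).group_by(&:person_id)
samples.each do |person_id, comments|
  puts "person #{person_id}: #{comments.size} sample comments"
end
```

One consequence of this design worth keeping in mind: if a flagged comment is later deleted, nothing promotes another comment to take its place unless you add that logic yourself.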

Related

Rails: how to eager load limited records of ordered association without N+1 queries

I know there're many questions about some of these topics, but I didn't find one covering all aspects.
Consider User, Activity and Like models. When I query an activity I would like to eager load the first Like for each activity in the collection without making N+1 queries and without loading more than necessary records. My code looks something like this:
class User < ActiveRecord::Base
has_many :likes, as: :liker
end
class Activity < ActiveRecord::Base
has_many :likes
has_one :first_like, -> { order(created_at: :asc) }, class_name: "Like"
end
class Like < ActiveRecord::Base
belongs_to :activity
belongs_to :liker, polymorphic: true
end
I made a comprehensive gist to test different loading strategies and methods: https://gist.github.com/thisismydesign/b08ba0ee3c1862ef87effe0e25386267
Strategies: N+1 queries, left outer join, single extra query
Methods: eager_load, includes, includes & references, includes & preload (these will result in either left outer join or single extra query)
Here're the problems I discovered:
Left outer join doesn't respect order(created_at: :asc) in the association scope nor default_scope { order(created_at: :asc) } (see: rails issue). It does respect explicit ordering i.e. .order("likes.created_at asc").
Left outer join should nevertheless be avoided because it "could result in many rows that contain redundant data and it performs poorly at scale" (see: apidoc rubydoc, rails api). This is a real issue with lots of data even with indexed searches on both sides.
Single extra query will create a query without limit, potentially fetching huge amounts of data (see: rails issue)
Adding an explicit limit(1) to the association in hope of constraining the single extra query will break things
The preferred method would be a single extra query that only queries the required records. All in all, I couldn't find a native solution with Rails. Is there any?
In my question, I'm looking for a native way using Rails. However, here's a solution using SQL and virtual attributes:
class Activity < ApplicationRecord
has_one :first_like, class_name: "Like", primary_key: :first_like_id, foreign_key: :id
scope :with_first_like, lambda {
select(
"activities.*,
(
SELECT likes.id
FROM likes
WHERE likes.activity_id = activities.id
ORDER BY likes.created_at ASC
LIMIT 1
) AS first_like_id"
)
}
end
Activity.with_first_like.includes(:first_like)
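To make the intent of the correlated subquery concrete, here is a plain-Ruby sketch of the same selection: for each activity, pick the id of its earliest like. The Struct and the sample rows are hypothetical; a real application would let the database do this.

```ruby
# Hypothetical in-memory rows standing in for the likes table.
Like = Struct.new(:id, :activity_id, :created_at)

likes = [
  Like.new(1, 10, Time.utc(2020, 1, 3)),
  Like.new(2, 10, Time.utc(2020, 1, 1)),
  Like.new(3, 20, Time.utc(2020, 1, 2)),
]

# For each activity_id, keep only the id of the like with the earliest
# created_at: the in-memory analogue of ORDER BY created_at ASC LIMIT 1.
first_like_ids = likes
  .group_by(&:activity_id)
  .transform_values { |group| group.min_by(&:created_at).id }

first_like_ids.each do |activity_id, like_id|
  puts "activity #{activity_id}: first like #{like_id}"
end
```

The scope computes one such id per activity row, which is what lets includes(:first_like) load exactly one Like per activity in a single extra query.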

Find records whose associated records do not belong to a certain record

In my system I have the following structure:
class Worker
has_many :worker_memberships
end
class WorkerMembership
belongs_to :worker
belongs_to :event
end
class Event
has_many :worker_memberships
end
Imagine I have a certain @event. How can I find all workers that have NO worker_memberships belonging to this @event?
This is pretty much synthesis of both other answers.
First: stick to has_many through as @TheChamp suggests. You're probably using it already and just forgot to write it; otherwise it wouldn't work. Well, you've been warned.
I generally do my best to avoid raw SQL in my queries. The hint about select I provided above produces a working solution, but does some unnecessary work, such as a join when there's no practical need for it. So let's avoid going through the association this time.
Here comes the reason why I prefer has_many through to has_and_belongs_to_many in many-to-many associations: we can query the join model itself without raw SQL:
WorkerMembership.select(:worker_id).where(event: @event)
It's not the result yet, but it gets us the list of worker_ids we don't want. Then we just wrap this query into a "give me all but these guys":
Worker.where.not(id: <...> )
So the final query is:
Worker.where.not(id: WorkerMembership.select(:worker_id).where(event: @event))
And it produces a single query (for an @event with id equal to 1):
SELECT `workers`.* FROM `workers` WHERE (`workers`.`id` NOT IN (SELECT `worker_memberships`.`worker_id` FROM `worker_memberships` WHERE `worker_memberships`.`event_id` = 1))
I also give credit to @apneadiving for his solution and a hint about mysql2's explain. SQLite's explain is horrible! My solution, if I read the explain's result correctly, is as performant as @apneadiving's.
@TheChamp also provided performance costs for all answers' queries. Check out the comments for a comparison.
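The "all but these" shape of the NOT IN query can be mirrored in plain Ruby to check the expected result on a small dataset. The Structs, the sample rows, and the workers_outside_event helper are hypothetical illustrations, not Rails API.

```ruby
require "set"

# Hypothetical in-memory rows standing in for the two tables.
Worker = Struct.new(:id, :name)
WorkerMembership = Struct.new(:worker_id, :event_id)

workers = [
  Worker.new(1, "Ann"),
  Worker.new(2, "Bob"),
  Worker.new(3, "Cy"),
  Worker.new(4, "Dee"), # has no memberships at all
]

memberships = [
  WorkerMembership.new(1, 1),
  WorkerMembership.new(2, 2),
  WorkerMembership.new(3, 1),
  WorkerMembership.new(3, 2),
]

# Step 1: collect the worker ids we do not want (the subquery).
# Step 2: keep every worker whose id is not in that set (the NOT IN).
def workers_outside_event(workers, memberships, event_id)
  excluded = memberships
    .select { |m| m.event_id == event_id }
    .map(&:worker_id)
    .to_set
  workers.reject { |w| excluded.include?(w.id) }
end

outside = workers_outside_event(workers, memberships, 1)
puts outside.map(&:name).join(", ")
```

Note that a worker with no memberships at all is included in the result, which is what the question asks for and what an inner join on worker_memberships would miss.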
Since you want to set up a many to many relationship between Worker and Event, I'd suggest you use the through association.
Your resulting models would be.
class Worker
has_many :worker_memberships
has_many :events, :through => :worker_memberships
end
class WorkerMembership
belongs_to :worker
belongs_to :event
end
class Event
has_many :worker_memberships
has_many :workers, :through => :worker_memberships
end
Now you can just call @event.workers to get all the workers associated with the event.
To find all workers that don't belong to the @event you can use:
# get all the ids of workers associated with the event
@worker_ids = @event.workers.select(:id)
# get all workers except the ones belonging to the event
Worker.where.not(:id => @worker_ids)
The one-liner:
Worker.where.not(:id => @event.workers.select(:id))
Try this:
Worker.where(WorkerMembership.where("workers.id = worker_memberships.worker_id").where("worker_memberships.event_id = ?", @event.id).exists.not)
Or shorter and reusable:
class WorkerMembership
belongs_to :worker
belongs_to :event
scope :event, ->(event){ where(event_id: event.id) }
end
Worker.where(WorkerMembership.where("workers.id = worker_memberships.worker_id").event(@event).exists.not)
(I assumed table and column names from conventions)

Rails - Eager Load Association on an Association

EDIT - Using 'includes' generates a SQL 'IN' clause. When using Oracle this has a 1,000 item limit. It will not work for my company. Are there any other solutions out there?
Is it possible to eager load an association on an association?
For example, let's say I have an Academy class, and an academy has many students. Each student belongs_to student_level
class Academy < ActiveRecord::Base
has_many :students
end
class Student < ActiveRecord::Base
belongs_to :academy
belongs_to :student_level
end
class StudentLevel < ActiveRecord::Base
has_many :students
end
Is it possible to tailor the association in Academy so that when I load the students, I ALWAYS load the student_level with the student?
In other words, I would like the following section of code to produce one or two queries total, not one query for every student:
@academy.students.each do |student|
puts "#{student.name} - #{student.student_level.level_name}"
end
I know I can do this if I change students from an association to a method, but I don't want to do that as I won't be able to reference students as an association in my other queries. I also know that I can do this in SQL in the following manner, but I want to know if there's a way to do this without finder_sql on my association, because now I need to update my finder_sql anytime my default scope changes, and this won't preload the association:
SELECT students.*, student_levels.* FROM students
LEFT JOIN student_levels ON students.student_level_id = student_levels.id
WHERE students.academy_id = ACADEMY_ID_HERE
Have you tried using includes to eager load the data?
class Academy < ActiveRecord::Base
has_many :students
# you can probably come up with better method name
def students_with_levels
# not sure if includes works off associations, see alternate below if it does not
self.students.includes(:student_level)
end
def alternate
Student.where("academy_id = ?", self.id).includes(:student_level)
end
end
see also: http://guides.rubyonrails.org/active_record_querying.html#eager-loading-associations
This should result in 3 queries:
the initial find on Academy
the query for the collection of Student objects
the query for all of those students' StudentLevel objects
Additions:
# the same can be done outside of methods
@academy.students.includes(:student_level).each do |student|
puts "#{student.name} - #{student.student_level.level_name}"
end
Student.where("academy_id = ?", @academy.id).includes(:student_level).each do |student|
puts "#{student.name} - #{student.student_level.level_name}"
end
ActiveRelation queries are also chainable
@academy.students_with_levels.where("name ILIKE ?", "Smi%").each do # ...
Somewhat related: a nice article on encapsulating ActiveRecord queries in methods - http://ablogaboutcode.com/2012/03/22/respect-the-active-record/

How do I optimize a polymorphic news feed in rails?

Here is my model:
class User < ActiveRecord::Base
has_many :activities
has_many :requests
end
class Activity < ActiveRecord::Base
belongs_to :user
belongs_to :object, :polymorphic => true
end
I want to get all the users activities and display them
Activity.where(:user_id => current_user.id).includes(:object)
the problem is that I can't eager load the object model because it's polymorphic
How do I overcome this problem?
Eager loading is supported with polymorphic associations. You will need to do something along the following lines:
Activity.find(:all, :include => :objectable, :conditions => {:user_id => current_user.id})
Although you need to make sure that you have defined the polymorphic relationship correctly on the associated models.
For further help refer to:
http://api.rubyonrails.org/classes/ActiveRecord/Associations/ClassMethods.html#label-Eager+loading+of+associations
The polymorphic part is at the end of "Eager loading of Associations" section.
As @Wahaj says, eager loading a polymorphic association only works with :includes, not with :joins.
Here's the explanation from the docs:
Address.find(:all, :include => :addressable)
This will execute one query to load the addresses and load the addressables with one query per addressable type. For example if all the addressables are either of class Person or Company then a total of 3 queries will be executed. The list of addressable types to load is determined on the back of the addresses loaded. This is not supported if Active Record has to fallback to the previous implementation of eager loading and will raise ActiveRecord::EagerLoadPolymorphicError. The reason is that the parent model’s type is a column value so its corresponding table name cannot be put in the FROM/JOIN clauses of that query.
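The query count described in that passage can be illustrated with a plain-Ruby simulation: one lookup for the addresses, then one batched lookup per addressable type. Everything here (the Struct, the TABLES hash, the counter) is a hypothetical stand-in for what Active Record does internally, not its actual implementation.

```ruby
# Hypothetical in-memory stand-ins for the addresses table and for the
# tables behind each addressable type.
Address = Struct.new(:id, :addressable_type, :addressable_id)

TABLES = {
  "Person"  => { 1 => "Person 1", 2 => "Person 2" },
  "Company" => { 7 => "Company 7" },
}.freeze

addresses = [
  Address.new(1, "Person", 1),
  Address.new(2, "Company", 7),
  Address.new(3, "Person", 2),
]

query_count = 1 # the query that loaded the addresses themselves
addressables = {}

# The set of types is only known after the addresses are loaded, so the
# lookups are grouped by the type column: one batched lookup per type.
addresses.group_by(&:addressable_type).each do |type, group|
  query_count += 1
  ids = group.map(&:addressable_id)
  TABLES.fetch(type).each do |id, record|
    addressables[[type, id]] = record if ids.include?(id)
  end
end

addresses.each do |address|
  key = [address.addressable_type, address.addressable_id]
  puts "address #{address.id} -> #{addressables[key]}"
end
puts "queries: #{query_count}"
```

With two addressable types this comes to 3 lookups in total, matching the count given in the quoted documentation.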
I think this is what you're after:
current_user.activities.includes(:object)
As the docs say, there will be one extra query for each associated type. I'm not sure, but you may need to define the association in the other direction so that Rails knows which AR models to search, e.g.:
class Post < ActiveRecord::Base
has_many :activities, :as => :object
end
If you're still getting an error, you might be on an earlier rails version which hadn't yet implemented this.
It's not possible to eager load the whole polymorphic relationship at once, but you can do it for one polymorphic type at a time. For example, if you have two polymorphic types, filter the records by type and then eager load each subset. It's not perfect eager loading, but partial eager loading.

Rails app using STI -- easiest way to pull these records?

I'm learning my way around Rails and am working on a sample app to keep track of beer recipes.
I have a model called Recipe which holds the recipe name and efficiency.
I have a model called Ingredient which is using STI - this is subclassed into Malt, Hop, and Yeast.
Finally, to link the recipes and ingredients, I am using a join table called rec_items which holds the recipe_id, ingredient_id, and info particular to that recipe/ingredient combo, such as amount and boil time.
Everything seems to be working well - I can find all my malts by using Malt.all, and all ingredients by using Ingredient.all. I can find a recipe's ingredients using @recipe.ingredients, etc...
However, I'm working on my recipe's show view now, and am confused as to the best way to accomplish the below:
I want to display the recipe name and associated info, and then list the ingredients, but separated by ingredient type. So, if I have a Black IPA @ 85% efficiency and it has 5 malts and 3 hop varieties, the output would be similar to:
BLACK IPA (85%)
Ingredient List
MALTS:
malt 1
malt 2
...
HOPS:
hop 1
...
Now, I can pull @recipe.rec_items and iterate through them, testing each rec_item.ingredient for type == "Malt", then do the same for the hops, but that seems neither very Rails-y nor efficient. So what is the best way to do this? I can use @recipe.ingredients.all to pull all the ingredients, but can't use @recipe.malts.all or @recipe.hops.all to pull just those types.
Is there a different syntax I should be using? Should I be using @recipe.ingredients.find_by_type("Malt")? Should I do this in the controller and pass the collection to the view, or do it right in the view? Do I need to specify the has_many relationship in my Hop and Malt models as well?
I can get it working the way I want using conditional statements or find_by_type, but my emphasis is on doing this "the Rails way" with as little DB overhead as possible.
Thanks for the help!
Current bare-bones code:
Recipe.rb
class Recipe < ActiveRecord::Base
has_many :rec_items
has_many :ingredients, :through => :rec_items
end
Ingredient.rb
class Ingredient < ActiveRecord::Base
has_many :rec_items
has_many :recipes, :through => :rec_items
end
Malt.rb
class Malt < Ingredient
end
Hop.rb
class Hop < Ingredient
end
RecItem.rb
class RecItem < ActiveRecord::Base
belongs_to :recipe
belongs_to :ingredient
end
recipes_controller.rb
class RecipesController < ApplicationController
def show
@recipe = Recipe.find(params[:id])
end
def index
@recipes = Recipe.all
end
end
Updated to add
I'm now unable to access the join table attributes, so I posted a new question:
Rails - using group_by and has_many :through and trying to access join table attributes
If anyone can help with that, I'd appreciate it!!
It's been a while since I've used STI, having been burned a time or two. So I may be skipping over some STI-fu that would make this easier. That said...
There are many ways of doing this. First, you could make a scope for each of malt, hops, and yeast.
class Ingredient < ActiveRecord::Base
has_many :rec_items
has_many :recipes, :through => :rec_items
named_scope :malt, :conditions => {:type => 'Malt'}
named_scope :hops, :conditions => {:type => 'Hop'}
...
end
This will allow you to do something like:
malts = @recipe.ingredients.malt
hops = @recipe.ingredients.hops
While this is convenient, it isn't the most efficient thing to do, database-wise. We'd have to do three queries to get all three types.
So if we're not talking about a ton of ingredients per recipe, it'll probably be better to just pull in all @recipe.ingredients, then group them with something like:
ingredients = @recipe.ingredients.group_by(&:type)
This will perform one query and then group them into a hash in ruby memory. The hash will be keyed off of type and look something like:
{"Malt" => [first_malt, second_malt],
"Hop" => [first_hop],
"Yeast" => [etc]
}
You can then refer to that collection to display the items however you wish.
ingredients["Malt"].each {|malt| malt.foo }
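As a runnable illustration of the grouping step, here is a plain-Ruby sketch that uses a Struct in place of the ActiveRecord model. The ingredient names are made-up sample data, and the type strings assume the STI class names from the question (Malt, Hop, Yeast).

```ruby
# Hypothetical stand-in for Ingredient records already loaded by one query.
Ingredient = Struct.new(:name, :type)

ingredients = [
  Ingredient.new("Pale malt", "Malt"),
  Ingredient.new("Cascade", "Hop"),
  Ingredient.new("Carafa III", "Malt"),
  Ingredient.new("US-05", "Yeast"),
]

# Enumerable#group_by builds a hash keyed by type; no further queries needed.
by_type = ingredients.group_by(&:type)

by_type.each do |type, items|
  puts "#{type.upcase}:"
  items.each { |item| puts "  #{item.name}" }
end
```

Hash keys keep insertion order, so the sections appear in the order each type is first seen; sort the records (or the keys) first if the view needs a fixed order.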
You can use group_by here.
recipe.ingredients.group_by {|i| i.type}.each do |type, ingredients|
puts type
ingredients.each do |ingredient|
puts ingredient.inspect
end
end
The utility of STI in this instance is dubious. You might be better off with a straight-forward categorization:
class Ingredient < ActiveRecord::Base
belongs_to :ingredient_type
has_many :rec_items
has_many :recipes, :through => :rec_items
end
The IngredientType defines your various types and ends up being a numerical constant from that point forward.
When you're trying to display a list this becomes easier. I usually prefer to pull out the intermediate records directly, then join out as required:
RecItem.joins(:ingredient).includes(:recipe, :ingredient).order('rec_items.recipe_id, ingredients.ingredient_type_id')
Something like that gives you the flexibility to sort and group as required. You can adjust the sort conditions to get the right ordering. This might also work with STI if you sort on the type column.
