Optimize eager loading in Rails

Rails 3.2. I have the following:

# city.rb
class City < ActiveRecord::Base
  has_many :zones, :dependent => :destroy
end

# zone.rb
class Zone < ActiveRecord::Base
  belongs_to :city
  has_many :zone_shops, :dependent => :destroy
  has_many :shops, :through => :zone_shops
end

# zone_shop.rb
class ZoneShop < ActiveRecord::Base
  belongs_to :zone
  belongs_to :shop
end

# shop.rb
class Shop < ActiveRecord::Base
end

# cities_controller.rb
def show
  @trip = City.find(params[:id], :include => [:user, :zones => [:shops]])
  @zones = @trip.zones.order("position")

  # List out all shops for the trip
  shops_list_array = []
  @zones.each do |zone, i|
    zone.shops.each do |shop|
      shops_list_array << shop.name
    end
  end
  @shops_list = shops_list_array.join(', ')
end
# development.log
City Load (0.3ms) SELECT `cities`.* FROM `cities` WHERE `cities`.`id` = 1 LIMIT 1
Zone Load (0.3ms) SELECT `zones`.* FROM `zones` WHERE `zones`.`trip_id` IN (1) ORDER BY position asc
ZoneShop Load (0.3ms) SELECT `zone_shops`.* FROM `zone_shops` WHERE `zone_shops`.`zone_id` IN (26, 23, 22) ORDER BY position asc
Shop Load (0.5ms) SELECT `shops`.* FROM `shops` WHERE `shops`.`id` IN (8, 7, 1, 9)
Zone Load (0.5ms) SELECT `zones`.* FROM `zones` WHERE `zones`.`trip_id` = 1 ORDER BY position asc, position
Shop Load (0.5ms) SELECT `shops`.* FROM `shops` INNER JOIN `zone_shops` ON `shops`.`id` = `zone_shops`.`spot_id` WHERE `zone_shops`.`zone_id` = 26
Shop Load (0.6ms) SELECT `shops`.* FROM `shops` INNER JOIN `zone_shops` ON `shops`.`id` = `zone_shops`.`spot_id` WHERE `zone_shops`.`zone_id` = 23
Shop Load (0.4ms) SELECT `shops`.* FROM `shops` INNER JOIN `zone_shops` ON `shops`.`id` = `zone_shops`.`spot_id` WHERE `zone_shops`.`zone_id` = 22
Notice in my log that the last 3 lines, which load shops for zones 26, 23 and 22, are redundant. How should I rewrite my cities_controller.rb to reduce the number of queries?
Many thanks.

@zones = @trip.zones.includes(:shops).order("position")
This eager-loads the shops association and should eliminate the N+1 query problem caused by zone.shops.each.
For more information, have a look at the Ruby on Rails Guide section 12 on eager loading associations, which was also linked by @Benjamin M.
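Applied to the original action, a minimal sketch might look like the following. It reuses the Rails 3.2 finder syntax from the question, drops the :user include (the City model shown has no such association), and relies on the position column visible in the log:
# cities_controller.rb -- a sketch, not the only possible rewrite
def show
  # Eager-load zones and their shops up front: one query per table
  @trip  = City.find(params[:id], :include => { :zones => :shops })

  # Sort the already-loaded zones in Ruby; calling .order here would
  # fire a fresh query and bypass the preloaded records
  @zones = @trip.zones.sort_by(&:position)

  # No further queries are fired in this loop
  @shops_list = @zones.flat_map { |zone| zone.shops.map(&:name) }.join(', ')
end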

I'd suggest that this
  zone.shops.each do |shop|
    shops_list_array << shop.name
  end
produces the 3 last lines of your log. This means you currently have three zones inside your database. If you put more zones in there, you will get a lot more of these Shop Load entries in the log.
The problem is the each loop, inside which zone.shops triggers lazy loading:
  @zones.each do |zone, i|
  ...
The solution depends on your needs, but I'd suggest that you read everything about Rails' eager loading feature (it covers exactly your problem: the each loop). Look it up here: http://guides.rubyonrails.org/active_record_querying.html#eager-loading-associations
It's pretty easy, short and straightforward :)
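An alternative sketch, assuming the models from the question and the position column shown in the log, is to declare the ordering on the association and eager-load both levels when fetching the city:
# city.rb -- add :order to the existing association (Rails 3.2 syntax)
class City < ActiveRecord::Base
  has_many :zones, :order => "position", :dependent => :destroy
end

# cities_controller.rb
@trip  = City.includes(:zones => :shops).find(params[:id])
@zones = @trip.zones  # already ordered and preloaded, no extra queries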

Related

How to return values in JSON that contain associated models using JBuilder

I would like to return JSON that includes the associated models, using JBuilder, but I don't know how to do it, and I encounter the error "undefined method xx".
Here is my Rails setup.
Model
app/models/item.rb
belongs_to :user
app/models/user.rb
include UserImageUploader[:image]
has_many :item
app/uploaders/user_image_uploader.rb
# MiniMagick
require 'image_processing/mini_magick'

class UserImageUploader < Shrine
  include ImageProcessing::MiniMagick

  # The determine_mime_type plugin allows you to determine and store the
  # actual MIME type of the file, analyzed from the file content.
  plugin :determine_mime_type
  plugin :store_dimensions
  plugin :pretty_location
  plugin :processing
  plugin :recache
  # The versions plugin enables your uploader to deal with versions,
  # by allowing you to return a Hash of files when processing.
  plugin :versions

  process(:store) do |io, context|
    original = io.download
    thumbnail = ImageProcessing::MiniMagick
      .source(original)
      .resize_to_limit!(600, nil)
    original.close!
    { original: io, thumbnail: thumbnail }
  end

  # plugin :versions
  # plugin :delete_promoted
  # plugin :delete_raw
end
items_controller.rb
@items = Item.includes(:user).page(params[:page] ||= 1).per(8).order('created_at DESC')
render 'index', formats: 'json', handlers: 'jbuilder'
Item/index.json.jbuilder
json.array! @items do |t|
  json.id t.id                                          # gets the value normally
  json.created_at t.created_at                          # gets the value normally
  json.user_id t.user.id                                # undefined method `id'
  json.user_original_img t.user.image_url(:original)    # undefined method `image_url'
end
As shown above, I could not get the values of the associated model.
By the way, I can check the values correctly in the Rails console.
bundle exec rails c
Item.first.user.image_url(:original)
Item Load (1.5ms) SELECT `items`.* FROM `items` ORDER BY `items`.`id` ASC LIMIT 1
User Load (0.7ms) SELECT `users`.* FROM `users` WHERE `users`.`id` = 1 LIMIT 1
=> "https://xx.s3.ap-northeast-1.amazonaws.com/store/user/1/image/original-xx”
Item.first.user.id
(19.0ms) SET NAMES utf8mb4, @@SESSION.sql_mode = CONCAT(CONCAT(@@sql_mode, ',STRICT_ALL_TABLES'), ',NO_AUTO_VALUE_ON_ZERO'), @@SESSION.sql_auto_is_null = 0, @@SESSION.wait_timeout = 2147483
Item Load (0.9ms) SELECT `items`.* FROM `items` ORDER BY `items`.`id` ASC LIMIT 1
User Load (0.8ms) SELECT `users`.* FROM `users` WHERE `users`.`id` = 1 LIMIT 1
=> 1
Please let me know where I am going wrong.
Thank you for reading my question.
It seems that some items in the @items list don't have an associated user, i.e. their user_id field is nil. Then item.user is nil, and when you call nil.image_url you get NoMethodError: undefined method `image_url' for nil:NilClass.
You could add a foreign key constraint between Item and User in your migration, to avoid problems like this:
add_foreign_key :items, :users
NOTE:
Adding the foreign key would still allow empty values. You'd also have to add the following to your migration to disallow NULL values in the user_id column:
change_column_null :items, :user_id, false
Thanks to @3limin4t0r for pointing this out.
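Putting both pieces together, a migration might look like the sketch below; the migration class name and the Rails version in the superclass are assumptions, not from the question:
# db/migrate/xxxxxxxxxxxxxx_require_user_on_items.rb -- a sketch
class RequireUserOnItems < ActiveRecord::Migration[5.2]  # adjust the version to your app
  def change
    # Reject items that point at a non-existent user
    add_foreign_key :items, :users
    # Reject items with no user at all (clean up existing NULLs first)
    change_column_null :items, :user_id, false
  end
end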
app/models/user.rb
include UserImageUploader[:image]
has_many :item
should be has_many :items. I'm not terribly confident on this, but it may be the reason you're finding blank columns in your db. A has_many / belongs_to relationship should default to required.
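A minimal sketch of the corrected models, assuming Rails 5+ (where belongs_to is required by default) and the Shrine uploader from the question:
# app/models/user.rb
class User < ApplicationRecord
  include UserImageUploader[:image]
  has_many :items
end

# app/models/item.rb
class Item < ApplicationRecord
  belongs_to :user  # required by default in Rails 5+
end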

Multiple price groups per user per product: how to minimize amount of queries?

I have the following structures in my Rails (4.2) e-commerce application:
class Product < ActiveRecord::Base
  has_many :prices

  def best_price(price_groups)
    prices.where(price_group_id: price_groups).minimum(:value)
  end

  def default_price
    prices.where(price_group_id: 1).first.value
  end
end

class Price < ActiveRecord::Base
  belongs_to :price_group
  belongs_to :product
end

class PriceGroup < ActiveRecord::Base
  has_and_belongs_to_many :users
  has_many :prices
end

class User < ActiveRecord::Base
  has_and_belongs_to_many :price_groups
end
Many users can be members of many price groups with many product price rows in each of them.
There is a default group with default prices for each product plus there can be many optional price groups with special (reduced) product prices for some users.
The problem is that this way I'm getting two queries for each product listed on my index page:
SELECT MIN("prices"."value") FROM "prices" WHERE "prices"."product_id" = $1 AND "prices"."price_group_id" IN (1, 2) [["product_id", 3]]
and
SELECT "prices".* FROM "prices" WHERE "prices"."product_id" = $1 AND "prices"."price_group_id" = $2 ORDER BY "prices"."id" ASC LIMIT 1 [["product_id", 3], ["price_group_id", 1]]
So, here is my question:
Is there some (easy) way to load everything at once, like getting a list of product objects with default and minimum price fields?
I understand how it can be done with a single SQL query, but I can't think of a more natural, Rails/ActiveRecord way.
Update:
I ended up with
# I'm using Kaminari pagination
@products = Product.page(params[:page]).per(6)
product_ids = @products.map(&:id)
@best_prices = Price.where(product_id: product_ids, price_group_id: @user_price_groups).group(:product_id).minimum(:value)
@default_prices = Price.where(product_id: product_ids, price_group_id: 1).group(:product_id).minimum(:value)
These group queries produce hashes like { product_id => value, ... }, so all I need is to use @best_prices[product.id] and so on in my views.
Thanks to everyone!
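For reference, a sketch of how those hashes can be consumed in the index view; the markup and product.name are assumptions, not from the question:
<ul>
  <% @products.each do |product| %>
    <li>
      <%= product.name %>:
      best price <%= @best_prices[product.id] %>,
      default price <%= @default_prices[product.id] %>
      <%# Plain hash lookups here; no additional SQL per product %>
    </li>
  <% end %>
</ul>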
I agree with @Frederick Cheung. I don't like custom SQL queries and I am not familiar with Arel, so my solution would be:
products = Product.limit(10)
product_ids = products.map(&:id)

products_hash = products.reduce({}) do |hash, product|
  hash[product.id] = { product: product }
  hash
end

best_prices = Price.select("MIN(value) AS min_price, product_id")
                   .where(product_id: product_ids, price_group_id: price_groups)
                   .group("product_id")
                   .reduce({}) do |hash, price|
  hash[price.product_id] = { best_price: price.min_price }
  hash
end

default_prices = Price.where(price_group_id: 1, product_id: product_ids)
                      .reduce({}) do |hash, price|
  hash[price.product_id] = { default_price: price.value }
  hash
end

# A hash like {
#   1 => { product: <Product ...>, best_price: 12, default_price: 11 },
#   2 => { product: <Product ...>, best_price: 12, default_price: 11 },
#   3 => { product: <Product ...>, best_price: 12, default_price: 11 }
# }
result = products_hash.deep_merge(best_prices).deep_merge(default_prices)
Still, three queries will be needed rather than one, but this solves the N+1 problem.
Just add a default scope that includes prices and your problem will be solved:
class Product < ActiveRecord::Base
  default_scope { includes(:prices) }
  has_many :prices

  def best_price(price_groups)
    prices.where(price_group_id: price_groups).minimum(:value)
  end

  def default_price
    prices.where(price_group_id: 1).first.value
  end
end

Why doesn't this model use a new starting point for a time based find on every request?

course.rb
has_many :current_users, :through => :user_statuses, :source => :user, :conditions => ['user_statuses.updated_at > ?', 1.hour.ago]
console
Loading development environment (Rails 3.2.2)
>> course = Course.find(1)
Course Load (0.3ms) SELECT `courses`.* FROM `courses` WHERE `courses`.`id` = 1 LIMIT 1
=> #<Course id: 1, title: "Course 1", created_at: "2012-04-17 19:17:15", updated_at: "2012-04-17 19:17:15">
>> Time.now
=> 2012-04-23 08:29:45 -0400
>> course.current_users.count
(0.4ms) SELECT COUNT(*) FROM `users` INNER JOIN `user_statuses` ON `users`.`id` = `user_statuses`.`user_id` WHERE `user_statuses`.`user_id` = 1 AND (user_statuses.updated_at > '2012-04-23 12:28:40')
=> 0
>> Time.now
=> 2012-04-23 08:30:07 -0400
>> course.current_users.count
(0.4ms) SELECT COUNT(*) FROM `users` INNER JOIN `user_statuses` ON `users`.`id` = `user_statuses`.`user_id` WHERE `user_statuses`.`user_id` = 1 AND (user_statuses.updated_at > '2012-04-23 12:28:40')
=> 0
>>
Notice that when checking the 1.hour.ago condition it uses the same starting time despite the 30-second difference between the two requests. Exiting the console and restarting it clears it out, but the same thing then happens with a new time. This behavior exists in testing and in a browser as well. How do I get a model to use a time-based condition for a has_many :through find?
I believe you want to use a dynamic condition on your model's relation.
Have a look at this SO question.
Basically, when your model class is loaded, 1.hour.ago is evaluated only once. If I understand your question, you want it to be evaluated on each request.
Something like this (Rails 3.1+):
:conditions => lambda { |course| "user_statuses.updated_at > '#{1.hour.ago}'" }
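In Rails 4 and later the same idea is expressed with a scope lambda on the association; the block is re-evaluated every time the association is queried, so 1.hour.ago stays current. A sketch reusing the names from the question:
# course.rb -- Rails 4+ syntax
has_many :user_statuses
has_many :current_users,
         -> { where('user_statuses.updated_at > ?', 1.hour.ago) },
         :through => :user_statuses,
         :source => :user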
Putting the query in the model at all didn't work, either in a has_many :through setup or in a method. So I ended up removing the association and putting the query in the controller. This allows the current time to be calculated when the request is made.
model:
has_many :user_statuses
controller:
@course = Course.find(params[:id])
@current_users = @course.user_statuses.where('updated_at > ?', 1.hour.ago)

Rails Arel selecting distinct columns

I've hit a slight block with the new scope methods (Arel 0.4.0, Rails 3.0.0.rc)
Basically I have:
A topics model, which has_many :comments, and a comments model (with a topic_id column) which belongs_to :topics.
I'm trying to fetch a collection of "Hot Topics", i.e. the topics that were most recently commented on. Current code is as follows:
# models/comment.rb
scope :recent, order("comments.created_at DESC")
# models/topic.rb
scope :hot, joins(:comments) & Comment.recent & limit(5)
If I execute Topic.hot.to_sql, the following query is fired:
SELECT "topics".* FROM "topics" INNER JOIN "comments"
ON "comments"."topic_id" = "topics"."id"
ORDER BY comments.created_at DESC LIMIT 5
This works fine, but it potentially returns duplicate topics - If topic #3 was recently commented on several times, it would be returned several times.
My question
How would I go about returning a distinct set of topics, bearing in mind that I still need to access the comments.created_at field, to display how long ago the last post was? I would imagine something along the lines of distinct or group_by, but I'm not too sure how best to go about it.
Any advice / suggestions are much appreciated - I've added a 100 rep bounty in hopes of coming to an elegant solution soon.
Solution 1
This doesn't use Arel, but Rails 2.x syntax:
Topic.all(:select => "topics.*, C.id AS last_comment_id,
                      C.created_at AS last_comment_at",
          :joins => "INNER JOIN (
                       SELECT DISTINCT A.id, A.topic_id, B.created_at
                       FROM comments A,
                       (
                         SELECT topic_id, max(created_at) AS created_at
                         FROM comments
                         GROUP BY topic_id
                         ORDER BY created_at
                         LIMIT 5
                       ) B
                       WHERE A.topic_id = B.topic_id AND
                             A.created_at = B.created_at
                     ) AS C ON topics.id = C.topic_id"
).each do |topic|
  p "topic id: #{topic.id}"
  p "last comment id: #{topic.last_comment_id}"
  p "last comment at: #{topic.last_comment_at}"
end
Make sure you index the created_at and topic_id columns in the comments table.
Solution 2
Add a last_comment_id column in your Topic model. Update the last_comment_id after creating a comment. This approach is much faster than using complex SQL to determine the last comment.
E.g:
class Topic < ActiveRecord::Base
  has_many :comments
  belongs_to :last_comment, :class_name => "Comment"
  scope :hot, joins(:last_comment).order("comments.created_at DESC").limit(5)
end

class Comment < ActiveRecord::Base
  belongs_to :topic
  after_create :update_topic

  def update_topic
    topic.last_comment = self
    topic.save
    # OR better still
    # topic.update_attribute(:last_comment_id, id)
  end
end
This is much more efficient than running a complex SQL query to determine the hot topics.
This is not that elegant in most SQL implementations. One way is to first get the list of the five most recent comments grouped by topic_id. Then get the comments.created_at by sub selecting with the IN clause.
I'm very new to Arel but something like this could work
recent_unique_comments = Comment.group(c[:topic_id]) \
                                .order('comments.created_at DESC') \
                                .limit(5) \
                                .project(comments[:topic_id])

recent_topics = Topic.where(t[:topic_id].in(recent_unique_comments))

# Another experiment (there has to be another way...)
recent_comments = Comment.join(Topic) \
                         .on(Comment[:topic_id].eq(Topic[:topic_id])) \
                         .where(t[:topic_id].in(recent_unique_comments)) \
                         .order('comments.topic_id, comments.created_at DESC') \
                         .group_by(&:topic_id).to_a.map { |hsh| hsh[1][0] }
In order to accomplish this you need a scope with a GROUP BY to get the latest comment for each topic. You can then order this scope by created_at to get the most recently commented-on topics.
The following works for me using SQLite.
class Comment < ActiveRecord::Base
  belongs_to :topic
  scope :recent, order("comments.created_at DESC")
  scope :latest_by_topic, group("comments.topic_id").order("comments.created_at DESC")
end

class Topic < ActiveRecord::Base
  has_many :comments
  scope :hot, joins(:comments) & Comment.latest_by_topic & limit(5)
end
I used the following seeds.rb to generate the test data
(1..10).each do |t|
  topic = Topic.new
  (1..10).each do |c|
    topic.comments.build(:subject => "Comment #{c} for topic #{t}")
  end
  topic.save
end
And the following are the test results
ruby-1.9.2-p0 > Topic.hot.map(&:id)
=> [10, 9, 8, 7, 6]
ruby-1.9.2-p0 > Topic.first.comments.create(:subject => 'Topic 1 - New comment')
=> #<Comment id: 101, subject: "Topic 1 - New comment", topic_id: 1, content: nil, created_at: "2010-08-26 10:53:34", updated_at: "2010-08-26 10:53:34">
ruby-1.9.2-p0 > Topic.hot.map(&:id)
=> [1, 10, 9, 8, 7]
ruby-1.9.2-p0 >
The SQL generated for SQLite (reformatted below) is extremely simple, and I hope Arel would render different SQL for other engines, as this would certainly fail in many database engines because the columns of topics are not in the GROUP BY list. If that presented a problem, you could probably overcome it by limiting the selected columns to just comments.topic_id.
puts Topic.hot.to_sql
SELECT "topics".*
FROM "topics"
INNER JOIN "comments" ON "comments"."topic_id" = "topics"."id"
GROUP BY comments.topic_id
ORDER BY comments.created_at DESC LIMIT 5
Since the question was about Arel, I thought I'd add this in, since Rails 3.2.1 adds uniq to the QueryMethods:
If you add .uniq to the Arel it adds DISTINCT to the select statement.
e.g. Topic.hot.uniq
Also works in scope:
e.g. scope :hot, joins(:comments).order("comments.created_at DESC").limit(5).uniq
So I would assume that
scope :hot, joins(:comments) & Comment.recent & limit(5) & uniq
should also probably work.
See http://apidock.com/rails/ActiveRecord/QueryMethods/uniq

Using :counter_cache and :touch in the same association

I have a Comment model that belongs_to a Message. In comment.rb I have the following:
class Comment < ActiveRecord::Base
  belongs_to :message, :counter_cache => true, :touch => true
end
I've done this because updating the counter_cache doesn't update the updated_at time of the Message, and I'd like it to for the cache_key.
However, when I looked in my log I noticed that this causes two separate SQL updates
Message Load (4.3ms) SELECT * FROM `messages` WHERE (`messages`.`id` = 552)
Message Update (2.2ms) UPDATE `messages` SET `comments_count` = COALESCE(`comments_count`, 0) + 1 WHERE (`id` = 552)
Message Update (2.4ms) UPDATE `messages` SET `updated_at` = '2009-08-12 18:03:55', `delta` = 1 WHERE `id` = 552
Is there any way this can be done with only one SQL call?
Edit I also noticed that it does a SELECT of the Message beforehand. Is that also necessary?
It probably does two queries because it's not been optimised yet.
Why not branch and create a patch :D
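If collapsing this into a single statement matters, one workaround (a sketch, not the built-in counter_cache behaviour, and assuming Rails 3+ query syntax) is to drop the automatic options and maintain the counter yourself in callbacks. update_all issues a single UPDATE and also skips the preceding SELECT of the Message:
class Comment < ActiveRecord::Base
  belongs_to :message

  after_create  :bump_message_counter
  after_destroy :drop_message_counter

  private

  # One UPDATE per change: adjust the counter and touch updated_at in the
  # same statement, without loading the Message first.
  def bump_message_counter
    Message.where(:id => message_id).update_all(
      ["comments_count = COALESCE(comments_count, 0) + 1, updated_at = ?", Time.now]
    )
  end

  def drop_message_counter
    Message.where(:id => message_id).update_all(
      ["comments_count = COALESCE(comments_count, 0) - 1, updated_at = ?", Time.now]
    )
  end
end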
