Rails includes vs all? - ruby-on-rails

I am going back to refresh my Rails knowledge by watching some tutorials, and I came across a tutorial Rails app that uses includes() in its index action.
def index
  @books = Book.all
end
vs
def index
  @books = Book.includes(:author, :genre)
end
As a side note, book belongs_to author and genre. Author has_many books and genre also has_many books.
When all is used, the log looks like this when I refresh the page:
Rendering books/index.html.erb within layouts/application
Book Load (1.4ms) SELECT "books".* FROM "books"
Author Load (0.3ms) SELECT "authors".* FROM "authors" WHERE "authors"."id" = $1 LIMIT $2 [["id", 2], ["LIMIT", 1]]
Genre Load (0.3ms) SELECT "genres".* FROM "genres" WHERE "genres"."id" = $1 LIMIT $2 [["id", 2], ["LIMIT", 1]]
Author Load (0.4ms) SELECT "authors".* FROM "authors" WHERE "authors"."id" = $1 LIMIT $2 [["id", 1], ["LIMIT", 1]]
Genre Load (0.3ms) SELECT "genres".* FROM "genres" WHERE "genres"."id" = $1 LIMIT $2 [["id", 3], ["LIMIT", 1]]
CACHE (0.0ms) SELECT "authors".* FROM "authors" WHERE "authors"."id" = $1 LIMIT $2 [["id", 1], ["LIMIT", 1]]
CACHE (0.0ms) SELECT "genres".* FROM "genres" WHERE "genres"."id" = $1 LIMIT $2 [["id", 3], ["LIMIT", 1]]
CACHE (0.0ms) SELECT "authors".* FROM "authors" WHERE "authors"."id" = $1 LIMIT $2 [["id", 1], ["LIMIT", 1]]
CACHE (0.0ms) SELECT "genres".* FROM "genres" WHERE "genres"."id" = $1 LIMIT $2 [["id", 2], ["LIMIT", 1]]
When includes is used, reloading the page shows:
Rendering books/index.html.erb within layouts/application
Book Load (0.4ms) SELECT "books".* FROM "books"
Author Load (0.5ms) SELECT "authors".* FROM "authors" WHERE "authors"."id" IN (2, 1)
Genre Load (0.4ms) SELECT "genres".* FROM "genres" WHERE "genres"."id" IN (2, 3)
I think this makes includes far more efficient than all, because it loads each associated table with a single query instead of one query per book.
My question is, why do people still use all? Why not completely eradicate all and use includes from now on? Is there any situation where I would prefer to use all and not use includes? I am using Rails 5.0.1.

Let me talk a little bit about includes.
Suppose you need to get the user names for the first five posts. You quickly write the query below and go enjoy your weekend.
posts = Post.limit(5)
posts.each do |post|
  puts post.user.name
end
Good. But let's look at the queries:
Post Load (0.5ms) SELECT `posts`.* FROM `posts` LIMIT 5
User Load (0.3ms) SELECT `users`.* FROM `users` WHERE `users`.`id` = 1 LIMIT 1
User Load (0.3ms) SELECT `users`.* FROM `users` WHERE `users`.`id` = 1 LIMIT 1
User Load (0.3ms) SELECT `users`.* FROM `users` WHERE `users`.`id` = 2 LIMIT 1
User Load (0.3ms) SELECT `users`.* FROM `users` WHERE `users`.`id` = 2 LIMIT 1
User Load (0.3ms) SELECT `users`.* FROM `users` WHERE `users`.`id` = 1 LIMIT 1
1 query to fetch all posts and 1 query to fetch users for each post results in a total of 6 queries. Check out the solution below which does the same thing, just in 2 queries:
posts = Post.includes(:user).limit(5)
posts.each do |post|
  puts post.user.name
end
#####
Post Load (0.3ms) SELECT `posts`.* FROM `posts` LIMIT 5
User Load (0.3ms) SELECT `users`.* FROM `users` WHERE `users`.`id` IN (1, 2)
There’s one little difference: includes(:user) was added to the query, and the problem is solved. Quick, nice, and easy.
But don’t just add includes in your query without understanding it properly. Using includes with joins might result in cross-joins depending on the situation, and you don’t need that in most cases.
If you want to add conditions to your included models you’ll have to explicitly reference them. For example:
User.includes(:posts).where('posts.name = ?', 'example')
Will throw an error, but this will work:
User.includes(:posts).where('posts.name = ?', 'example').references(:posts)
Note that includes works with association names while references needs the actual table name.
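The batching that includes performs can be pictured in plain Ruby. The sketch below is only a simulation under stated assumptions: FakeDb, its sample rows, and the helper method names are hypothetical stand-ins, not Rails APIs. It counts "queries" to show why lazy loading costs 1 + N lookups while eager loading costs 2.

```ruby
# Plain-Ruby simulation (no Rails required). FakeDb and its data are
# hypothetical; each method call stands in for one SQL query.
Post = Struct.new(:id, :user_id, :user)
User = Struct.new(:id, :name)

class FakeDb
  attr_reader :query_count

  def initialize
    @users = { 1 => User.new(1, "Ann"), 2 => User.new(2, "Bob") }
    @posts = [[1, 1], [2, 1], [3, 2], [4, 2], [5, 1]] # [id, user_id]
    @query_count = 0
  end

  # Stands in for: SELECT posts.* FROM posts LIMIT n
  def posts(limit)
    @query_count += 1
    @posts.first(limit).map { |id, user_id| Post.new(id, user_id) }
  end

  # Stands in for: SELECT users.* FROM users WHERE id = ? LIMIT 1
  def user(id)
    @query_count += 1
    @users[id]
  end

  # Stands in for: SELECT users.* FROM users WHERE id IN (...)
  def users_in(ids)
    @query_count += 1
    @users.values_at(*ids).compact
  end
end

# Lazy loading: one extra lookup per post (the N+1 pattern).
def names_lazily(db)
  db.posts(5).map { |post| db.user(post.user_id).name }
end

# Eager loading: collect the distinct foreign keys, fetch them in one
# lookup, then attach each user to its posts in memory.
def names_eagerly(db)
  posts = db.posts(5)
  users_by_id = db.users_in(posts.map(&:user_id).uniq).to_h { |u| [u.id, u] }
  posts.each { |post| post.user = users_by_id[post.user_id] }
  posts.map { |post| post.user.name }
end

lazy_db = FakeDb.new
eager_db = FakeDb.new
names_lazily(lazy_db)
names_eagerly(eager_db)
puts "lazy:  #{lazy_db.query_count} queries"
puts "eager: #{eager_db.query_count} queries"
```

Both helpers return the same names; only the number of simulated queries differs.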

Related

will_paginate deletes posts in a group when posts are paginated

I'm using will_paginate to paginate posts within a group.
A group might have many posts and I don't want to show all posts at once.
For some strange reason I noticed that a lot of posts are deleted from the database when I do @group.posts = paginated_posts, as shown in groups_controller.rb below.
I'm not calling save on @group, so why are the posts deleted?
Tests first
it "Pagination – get one group and its posts" do
  10.times { Fabricate(:post, group: @group, user: @user) }
  puts "POST COUNT BEFORE #{@group.posts.count}"
  # byebug
  get group_path(@group, posts_per_page: 3)
  puts "POST COUNT AFTER #{@group.posts.count}"
  expect(response).to have_http_status(200)
  expect(@group).to be_present
end
Output from the test
Groups
GET /group/:group_id?posts_per_page=10&posts_page=2
POST COUNT BEFORE 12
POST COUNT AFTER 3
groups_controller.rb
def show
  paginated_posts = @group.posts.paginate(
    page: params[:posts_page],
    per_page: params[:posts_per_page] || 100000,
  )
  @group.posts = paginated_posts
  render json: @group
end
The log
(0.3ms) RELEASE SAVEPOINT active_record_1
Started GET "/groups/33?posts_per_page=3" for 127.0.0.1 at 2018-09-18 12:52:13 +0000
Processing by GroupsController#show as HTML
Parameters: {"posts_per_page"=>"3", "id"=>"33"}
User Load (2.1ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", "78913bd2-4c72-4c73-ba75-421e154c830b"], ["LIMIT", 1]]
Group Load (1.4ms) SELECT "groups".* FROM "groups" WHERE "groups"."id" = $1 LIMIT $2 [["id", 33], ["LIMIT", 1]]
Post Load (2.2ms) SELECT "posts".* FROM "posts" WHERE "posts"."group_id" = $1 LIMIT $2 OFFSET $3 [["group_id", 33], ["LIMIT", 3], ["OFFSET", 0]]
Post Load (0.3ms) SELECT "posts".* FROM "posts" WHERE "posts"."group_id" = $1 [["group_id", 33]]
(0.4ms) SAVEPOINT active_record_1
Comment Load (0.5ms) SELECT "comments".* FROM "comments" WHERE "comments"."post_id" = $1 [["post_id", 156]]
PostImage Load (0.3ms) SELECT "post_images".* FROM "post_images" WHERE "post_images"."post_id" = $1 [["post_id", 156]]
Post Destroy (0.3ms) DELETE FROM "posts" WHERE "posts"."id" = $1 [["id", 156]]
Comment Load (0.3ms) SELECT "comments".* FROM "comments" WHERE "comments"."post_id" = $1 [["post_id", 157]]
PostImage Load (0.3ms) SELECT "post_images".* FROM "post_images" WHERE "post_images"."post_id" = $1 [["post_id", 157]]
Post Destroy (0.3ms) DELETE FROM "posts" WHERE "posts"."id" = $1 [["id", 157]]
Comment Load (0.3ms) SELECT "comments".* FROM "comments" WHERE "comments"."post_id" = $1 [["post_id", 158]]
@group.posts = ... doesn't work the way you think it does. When you assign a collection of records to a has_many association on an already-persisted group, Active Record writes the change to the database immediately. No save is required.
So when you do
@group.posts = paginated_posts
the paginated posts become the whole association, and all other records are removed from it. Instantly. They're not necessarily deleted (by default only their group_id foreign key is cleared), but they no longer belong to @group. The Post Destroy lines in your log suggest the association is declared with dependent: :destroy, in which case the removed posts are destroyed.
You may want to change your #to_json ... perhaps add
attr_accessor :page_of_posts
And then in your controller
@group.page_of_posts = paginated_posts
And ensure page_of_posts is what's rendered in the json. Others may suggest a better way.
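A minimal sketch of that idea in plain Ruby, assuming a stand-in Group class (in the real app Group is an ActiveRecord model and posts is a has_many association). The as_json override and the sample data are hypothetical; the point is only that a plain attr_accessor never touches the association.

```ruby
require "json"

# Hypothetical stand-in for the Group model, used only for illustration.
class Group
  attr_reader :id, :posts
  # Plain Ruby attribute: assigning to it does not modify `posts`.
  attr_accessor :page_of_posts

  def initialize(id, posts)
    @id = id
    @posts = posts
  end

  # Render only the current page and leave the full collection alone.
  def as_json(*)
    { "id" => id, "posts" => page_of_posts || posts }
  end

  def to_json(*args)
    as_json.to_json(*args)
  end
end

group = Group.new(33, (1..10).map { |n| { "id" => n } })
group.page_of_posts = group.posts.first(3) # stands in for posts.paginate(...)
puts group.to_json
puts "posts still in group: #{group.posts.size}"
```

The rendered JSON contains three posts while the group keeps all ten.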

Rails - Increasing performance of repeated if statement

I'm using the public_activity gem and in the output, I'm checking if the trackable owner is the same as the current user:
= a.owner == current_user ? 'You' : a.owner.name
did this activity
I get a bunch of cache calls in the log:
User Load (1.8ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 1], ["LIMIT", 1]]
Rendered public_activity/post/_create.html.haml (1.4ms)
Rendered public_activity/_snippet.html.haml (11.4ms)
CACHE (0.0ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 1], ["LIMIT", 1]]
Rendered public_activity/post/_create.html.haml (13.9ms)
Rendered public_activity/_snippet.html.haml (18.9ms)
CACHE (0.0ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 1], ["LIMIT", 1]]
Rendered public_activity/comment/_comment.html.haml (0.9ms)
Rendered public_activity/_snippet.html.haml (12.1ms)
CACHE (0.0ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 1], ["LIMIT", 1]]
Rendered public_activity/comment/_comment.html.haml (2.7ms)
Rendered public_activity/_snippet.html.haml (56.3ms)
CACHE (0.0ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 1], ["LIMIT", 1]]
Rendered public_activity/comment/_comment.html.haml (0.6ms)
Rendered public_activity/_snippet.html.haml (4.5ms)
CACHE (0.0ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 1], ["LIMIT", 1]]
Rendered public_activity/content/_comment.html.haml (2.1ms)
Rendered public_activity/_snippet.html.haml (9.5ms)
Is there any way to eager load the conditional?
@jverban is correct that you can compare the record IDs to avoid needless record loading. To answer your question about eager loading, though: yes, you can eager load using the includes method in the ActiveRecord query chain. For example:
Activity.includes(:owner).latest
That will tell Rails you intend to reference the owner relation and so they should be loaded as well.
I highly recommend adding the bullet gem to your project (only in the development and test environments). It detects N+1 queries like this one and warns you about them.
You shouldn't need to load the user record, just compare id attributes
= a.owner_id == current_user.id ? 'You' : a.owner.name
The cache calls will likely still happen if multiple activity owners are not the current user (to get the owner name).
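The effect of comparing foreign keys can be sketched without Rails. Here Activity, Owner, and the directory hash are hypothetical stand-ins; owner_loads counts how often the owner record would have to be fetched.

```ruby
# Plain-Ruby illustration; these classes are hypothetical stand-ins.
Owner = Struct.new(:id, :name)

class Activity
  attr_reader :owner_id, :owner_loads

  def initialize(owner_id, directory)
    @owner_id = owner_id
    @directory = directory # stands in for the users table
    @owner_loads = 0
  end

  # Lazily "loads" the owner, counting each load like a query.
  def owner
    @owner ||= begin
      @owner_loads += 1
      @directory.fetch(owner_id)
    end
  end
end

# Compare ids first, so the owner is only loaded when its name is needed.
def label_for(activity, current_user)
  activity.owner_id == current_user.id ? "You" : activity.owner.name
end

directory = { 1 => Owner.new(1, "Ann"), 2 => Owner.new(2, "Bob") }
current_user = directory[1]
mine = Activity.new(1, directory)
theirs = Activity.new(2, directory)

puts label_for(mine, current_user)
puts label_for(theirs, current_user)
puts "loads: #{mine.owner_loads} and #{theirs.owner_loads}"
```

Only the activity owned by someone else triggers a load, which matches the caveat above about other owners' names.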

Active Model Serializer caching of calculated field

I'm trying to cache a collection of items. This is my AMS:
class ApplicationCategorySerializer < ActiveModel::Serializer
  cache({})
  cache key: 'application_category', expires_in: 3.days
  attributes :id
  attributes :name
  attributes :active
  attributes :group
  attributes :num_of_protocols
  belongs_to :group do |serializer|
    serializer.attributes[:group][:name]
  end

  def num_of_protocols
    ApplicationCategory.find(object.id).protocol_categories.count
  end
end
My problem is that without the :num_of_protocols part the response is generated 10 times faster.
When I look at the log I can see:
Started GET "/application_categories.json" for 127.0.0.1 at 2016-08-04 19:12:27 +0300
Processing by ApplicationCategoriesController#index as JSON
ApplicationCategory Load (3.0ms) SELECT "application_categories".* FROM "application_categories"
[active_model_serializers] Group Load (1.0ms) SELECT "groups".* FROM "groups" WHERE "groups"."id" = ? LIMIT 1 [["id", 2066]]
[active_model_serializers] ApplicationCategory Load (0.0ms) SELECT "application_categories".* FROM "application_categories" WHERE "application_categories"."id" = ? LIMIT 1 [["id", 1]]
[active_model_serializers] (0.0ms) SELECT COUNT(*) FROM "protocol_categories" INNER JOIN "application_categories_protocol_categories" ON "protocol_categories"."id" = "application_categories_protocol_categories"."protocol_category_id" WHERE "application_categories_protocol_categories"."application_category_id" = ? [["application_category_id", 1]]
[active_model_serializers] CACHE (0.0ms) SELECT "groups".* FROM "groups" WHERE "groups"."id" = ? LIMIT 1 [["id", 2066]]
[active_model_serializers] ApplicationCategory Load (0.0ms) SELECT "application_categories".* FROM "application_categories" WHERE "application_categories"."id" = ? LIMIT 1 [["id", 2]]
[active_model_serializers] (0.0ms) SELECT COUNT(*) FROM "protocol_categories" INNER JOIN "application_categories_protocol_categories" ON "protocol_categories"."id" = "application_categories_protocol_categories"."protocol_category_id" WHERE "application_categories_protocol_categories"."application_category_id" = ? [["application_category_id", 2]]
[active_model_serializers] CACHE (0.0ms) SELECT "groups".* FROM "groups" WHERE "groups"."id" = ? LIMIT 1 [["id", 2066]]
...................... [many many more lines like those...]
[active_model_serializers] CACHE (0.0ms) SELECT "application_categories".* FROM "application_categories" WHERE "application_categories"."id" = ? LIMIT 1 [["id", 608]]
[active_model_serializers] CACHE (0.0ms) SELECT COUNT(*) FROM "protocol_categories" INNER JOIN "application_categories_protocol_categories" ON "protocol_categories"."id" = "application_categories_protocol_categories"."protocol_category_id" WHERE "application_categories_protocol_categories"."application_category_id" = ? [["application_category_id", 608]]
[active_model_serializers] CACHE (0.0ms) SELECT "application_categories".* FROM "application_categories" WHERE "application_categories"."id" = ? LIMIT 1 [["id", 609]]
[active_model_serializers] CACHE (0.0ms) SELECT COUNT(*) FROM "protocol_categories" INNER JOIN "application_categories_protocol_categories" ON "protocol_categories"."id" = "application_categories_protocol_categories"."protocol_category_id" WHERE "application_categories_protocol_categories"."application_category_id" = ? [["application_category_id", 609]]
[active_model_serializers] CACHE (0.0ms) SELECT "application_categories".* FROM "application_categories" WHERE "application_categories"."id" = ? LIMIT 1 [["id", 610]]
[active_model_serializers] CACHE (0.0ms) SELECT COUNT(*) FROM "protocol_categories" INNER JOIN "application_categories_protocol_categories" ON "protocol_categories"."id" = "application_categories_protocol_categories"."protocol_category_id" WHERE "application_categories_protocol_categories"."application_category_id" = ? [["application_category_id", 610]]
[active_model_serializers] Rendered ActiveModel::Serializer::CollectionSerializer with ActiveModelSerializers::Adapter::Attributes (1521.15ms)
How can I get better results with caching? When I comment out the cache lines I get approximately the same times as with them, and when I remove :num_of_protocols I get the same time with and without the cache.
What am I configuring wrong?
Can you do this?
def num_of_protocols
  object.protocol_categories.count
end
And, it's been a while since I used serializers, but I thought you were able to do this...
def num_of_protocols
  protocol_categories.count
end
And this is better Ruby (if protocol_categories is already loaded, size uses the loaded records instead of running another COUNT query)
def num_of_protocols
  protocol_categories.size
end
And, whether or not all that works, make sure you include protocol_categories and group when you load @application_category, before you render to JSON. You're doing too many queries.
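The difference between count and size on an association can be modelled in a few lines of plain Ruby. FakeAssociation below is a hypothetical simulation, not an ActiveRecord class: count always issues a COUNT query, while size reuses records that are already loaded.

```ruby
# Hypothetical simulation of an association; `queries` counts SQL calls.
class FakeAssociation
  attr_reader :queries

  def initialize(rows)
    @rows = rows
    @loaded = false
    @queries = 0
  end

  # Stands in for loading the rows (for example through includes).
  def load
    unless @loaded
      @queries += 1 # SELECT * FROM ...
      @loaded = true
    end
    self
  end

  # Always goes to the database.
  def count
    @queries += 1 # SELECT COUNT(*) FROM ...
    @rows.size
  end

  # Uses the in-memory rows when loaded, otherwise falls back to count.
  def size
    @loaded ? @rows.size : count
  end
end

categories = FakeAssociation.new(%w[http ftp ssh]).load
3.times { categories.size }
puts "after size x3:  #{categories.queries} queries"
3.times { categories.count }
puts "after count x3: #{categories.queries} queries"
```

With the rows preloaded, repeated size calls add no queries, whereas each count call adds one.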

n + 1 issue when ordering by child's attribute

The following query suffers from the 'n+1' problem of loading each order for each record:
Job.joins('LEFT JOIN orders ON orders.job_id = jobs.id').order("orders.featured")
Same for this:
Job.includes(:order).order("orders.featured")
Removing the .order(...) part removes the n + 1 issue, but then it's not ordered. Any ideas how to fix this? Do I need to create a column in the parent for the 'featured' attribute?
Output:
Order Load (0.4ms) SELECT "orders".* FROM "orders" WHERE "orders"."job_id" = $1 LIMIT 1 [["job_id", 26]]
Rendered jobs/_job.html.erb (1.9ms)
Order Load (0.3ms) SELECT "orders".* FROM "orders" WHERE "orders"."job_id" = $1 LIMIT 1 [["job_id", 3]]
Rendered jobs/_job.html.erb (2.0ms)
Order Load (0.3ms) SELECT "orders".* FROM "orders" WHERE "orders"."job_id" = $1 LIMIT 1 [["job_id", 52]]
Rendered jobs/_job.html.erb (1.7ms)
Order Load (0.3ms) SELECT "orders".* FROM "orders" WHERE "orders"."job_id" = $1 LIMIT 1 [["job_id", 13]]
Rendered jobs/_job.html.erb (1.9ms)
Order Load (0.3ms) SELECT "orders".* FROM "orders" WHERE "orders"."job_id" = $1 LIMIT 1 [["job_id", 34]]
Rendered jobs/_job.html.erb (1.9ms)
Order Load (0.4ms) SELECT "orders".* FROM "orders" WHERE "orders"."job_id" = $1 LIMIT 1 [["job_id", 64]]
Rendered jobs/_job.html.erb (2.8ms)
Order Load (0.4ms) SELECT "orders".* FROM "orders" WHERE "orders"."job_id" = $1 LIMIT 1 [["job_id", 94]]
Rendered jobs/_job.html.erb (3.2ms)
Order Load (0.4ms) SELECT "orders".* FROM "orders" WHERE "orders"."job_id" = $1 LIMIT 1 [["job_id", 60]]
Rendered jobs/_job.html.erb (3.1ms)
Order Load (0.4ms) SELECT "orders".* FROM "orders" WHERE "orders"."job_id" = $1 LIMIT 1 [["job_id", 29]]
Try using preload for the associated data (the association name must match the one on the model, which is order in the question):
Job.joins('LEFT JOIN orders ON orders.job_id = jobs.id')
   .order("orders.featured")
   .preload(:order)
Job.includes(:order).order("orders.featured")
includes does not necessarily join the tables together. It usually loads the association with a second, separate query, so it will actually do 2 queries. If you want the 2 tables to be joined so that the ordering can be applied, you need to use eager_load:
Job.eager_load(:order).order("orders.featured")
http://blog.bigbinary.com/2013/07/01/preload-vs-eager-load-vs-joins-vs-includes.html

Rails 4 - Why does using includes() not make a join between Pins and Replies?

I'm a Rails and Ruby newcomer. I want to optimise SQL queries where I can, and I was reading about using includes() to tell Rails that I want to eager load and join two tables.
In my show action on the pin controller:
def show
  @pin = Pin.includes(:replies, :user).where(id: params[:id]).first
end
If I check the log on the queries, I see the following:
Started GET "/pin/1703704382" for 127.0.0.1 at 2014-06-12 15:30:18 +0100
Processing by PinsController#show as HTML
Parameters: {"id"=>"1703704382"}
Pin Load (0.2ms) SELECT `pins`.* FROM `pins` WHERE `pins`.`id` = 145 ORDER BY `pins`.`id` ASC LIMIT 1
Reply Load (0.1ms) SELECT `replies`.* FROM `replies` WHERE `replies`.`pin_id` IN (145)
User Load (0.1ms) SELECT `users`.* FROM `users` WHERE `users`.`id` IN (22)
User Load (0.1ms) SELECT `users`.* FROM `users` WHERE `users`.`id` = 22 LIMIT 1
Profile Load (0.1ms) SELECT `profiles`.* FROM `profiles` WHERE `profiles`.`user_id` = 22 ORDER BY `profiles`.`id` ASC LIMIT 1
CACHE (0.0ms) SELECT `profiles`.* FROM `profiles` WHERE `profiles`.`user_id` = 22 ORDER BY `profiles`.`id` ASC LIMIT 1 [["user_id", 22]]
Type Load (0.1ms) SELECT `types`.* FROM `types` WHERE `types`.`id` = 1 ORDER BY `types`.`id` ASC LIMIT 1
Skill Load (0.1ms) SELECT `skills`.* FROM `skills` WHERE `skills`.`id` = 3 ORDER BY `skills`.`id` ASC LIMIT 1
Instrument Load (0.2ms) SELECT `instruments`.* FROM `instruments` WHERE `instruments`.`id` = 6 ORDER BY `instruments`.`id` ASC LIMIT 1
Genre Load (0.1ms) SELECT `genres`.* FROM `genres` INNER JOIN `genre_pins` ON `genres`.`id` = `genre_pins`.`genre_id` WHERE `genre_pins`.`pin_id` = 145
Bookmark Load (0.1ms) SELECT `bookmarks`.* FROM `bookmarks` WHERE `bookmarks`.`pin_id` = 145
User Load (0.1ms) SELECT `users`.* FROM `users` WHERE `users`.`id` = 22 ORDER BY `users`.`id` ASC LIMIT 1
(0.2ms) SELECT COUNT(*) FROM `bookmarks` WHERE `bookmarks`.`pin_id` = 145
(0.1ms) SELECT COUNT(*) FROM `replies` WHERE `replies`.`pin_id` = 145
Rendered partials/_pin.html.erb (14.3ms)
Pin Load (0.1ms) SELECT `pins`.* FROM `pins` WHERE `pins`.`id` = 145 ORDER BY `pins`.`id` ASC LIMIT 1
CACHE (0.0ms) SELECT `users`.* FROM `users` WHERE `users`.`id` = 22 ORDER BY `users`.`id` ASC LIMIT 1 [["id", 22]]
CACHE (0.0ms) SELECT `profiles`.* FROM `profiles` WHERE `profiles`.`user_id` = 22 ORDER BY `profiles`.`id` ASC LIMIT 1 [["user_id", 22]]
Rendered replies/_reply.html.erb (4.0ms)
Rendered replies/_form.html.erb (1.5ms)
Rendered pins/show.html.erb within layouts/application (21.7ms)
Rendered partials/_meta.html.erb (0.1ms)
Rendered partials/_top.html.erb (1.3ms)
Rendered partials/_tags.html.erb (0.1ms)
Rendered partials/_search.html.erb (1.0ms)
Completed 200 OK in 33ms (Views: 27.2ms | ActiveRecord: 2.0ms | Solr: 0.0ms)
It seems like it is running separate queries to get pins, replies and the user. How can I join these into one query? Surely this could be better optimised.
Thanks for your advice and patience!
You wrote: "It seems like it is running separate queries to get pins, replies and the user. How can I join these into one query? Surely this could be better optimised."
This is not automatically true. Hydration (building the nested record objects, since the database returns flat rows rather than a hierarchical structure) can be very costly with many relations. It is often better for performance to run several very simple queries that each fetch the records of one table.
If you still want a single joined query (with shallow relations, it's probably better), you can use .eager_load, which loads the associations through a LEFT OUTER JOIN. Note that .joins on its own only adds an INNER JOIN and does not load the associated records.

Resources