Effectively query the number of friends? - ruby-on-rails

I'm currently fetching all friends of a given user and then taking the size of the resulting array as the number of friends. My concern is that retrieving every friend record from the database each time is inefficient. Is there a way to query only the number of friends a user has, without loading any other information?

If your relations are declared correctly, you should be able to do user.friends.count in order to generate a DB-level count.
See here in my console the SQL queries generated (Drug has_many :details and DrugDetail belongs_to :drug):
irb(main):003:0> Drug.first.details
Drug Load (0.7ms) SELECT "drugs".* FROM "drugs" LIMIT 1
DrugDetail Load (1.9ms) SELECT "drug_details".* FROM "drug_details" WHERE "drug_details"."drug_id" = 1771
=> []
irb(main):004:0> Drug.first.details.count
Drug Load (0.7ms) SELECT "drugs".* FROM "drugs" LIMIT 1
(0.6ms) SELECT COUNT(*) FROM "drug_details" WHERE "drug_details"."drug_id" = 1771
=> 0
irb(main):006:0> Drug.first.details.to_a.size
Drug Load (2.1ms) SELECT "drugs".* FROM "drugs" LIMIT 1
DrugDetail Load (0.5ms) SELECT "drug_details".* FROM "drug_details" WHERE "drug_details"."drug_id" = 1771
=> 0
In your case, if you have your relations like this:
User has_many :friends
Friend belongs_to :user
Then this should be executed at the DB-level and be faster than your first piece of code:
User.first.friends.count
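To make the difference concrete without a database, here is a toy model in plain Ruby. FakeAssociation and its query log are hypothetical stand-ins, not ActiveRecord classes; they only mimic what the console log above shows: count asks the database for a number, while to_a.size loads every row first. The size method is modelled the way I understand ActiveRecord's size to behave (reuse loaded records if there are any, otherwise issue a COUNT); treat that as an assumption and check the log in your own console.

```ruby
# Toy stand-in for a has_many association. This is NOT ActiveRecord; it only
# logs the SQL-like statements it would send so the strategies can be compared.
class FakeAssociation
  attr_reader :queries

  def initialize(rows)
    @rows = rows        # pretend these rows live in the database
    @loaded = nil       # in-memory copy, filled by to_a
    @queries = []       # statements "sent" to the database
  end

  # Like user.friends.count: one aggregate query, no rows loaded.
  def count
    @queries << "SELECT COUNT(*) FROM friends WHERE user_id = 1"
    @rows.size
  end

  # Like user.friends.to_a: loads every row (once) into memory.
  def to_a
    if @loaded.nil?
      @queries << "SELECT friends.* FROM friends WHERE user_id = 1"
      @loaded = @rows.dup
    end
    @loaded
  end

  # Assumed behaviour of size: reuse loaded rows if present, otherwise count.
  def size
    @loaded.nil? ? count : @loaded.size
  end
end
```

Calling count on a fresh FakeAssociation logs only the COUNT statement, whereas to_a.size logs the full row load before the array is measured.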

Related

Rails: Why are has_many relations not saved in the variable/attribute like belongs_to?

So let's say I have the following models in Rails.
class Post < ApplicationRecord
belongs_to :user
end
class User < ApplicationRecord
has_many :posts
end
When I put a Post instance in a variable and call user on it, the following SQL query runs once; after that the result is cached.
post.user
User Load (0.9ms) SELECT `users`.* FROM `users` WHERE `users`.`id` = 1 LIMIT 1
#<User id: 1 ...>
post.user
#<User id: 1 ...>
etc.
However, when I go the other way around with user.posts, it always runs the query.
user.posts
Post Load (1.0ms) SELECT `posts`.* FROM `posts` WHERE `posts`.`user_id` = 1 LIMIT 11
user.posts
Post Load (1.0ms) SELECT `posts`.* FROM `posts` WHERE `posts`.`user_id` = 1 LIMIT 11
etc.
Unless I convert it to an array, in which case it does get saved.
user.posts.to_a
Post Load (1.0ms) SELECT `posts`.* FROM `posts` WHERE `posts`.`user_id` = 1 LIMIT 11
user.posts
# No query here.
But user.posts still produces an ActiveRecord::Associations::CollectionProxy object on which I can call other ActiveRecord methods, so there is no real downside here.
Why does it work like this? It seems like an unnecessary cost in SQL queries. The obvious reason I can think of is that user.posts then updates correctly when a Post is created after user is set. But in practice that doesn't happen much in my opinion, since variables are mostly set and reset in consecutively run controller actions. I know there is a caching system in place that shows a CACHE Post SQL line in the server logs, but I can't really rely on that when I'm working with ActiveRecord objects between models or with more complex queries.
Am I missing some best practice here? Or an obvious setting that fixes exactly this?
You're examining this in irb, and that's why you see the query run every time.
In an actual block of code in a controller or other class, if you were to write
@users_posts = user.posts
the query is NOT executed, not until you iterate through the collection or request #count.
irb prints the return value of every expression by calling inspect on it, and inspecting an unloaded relation runs a query to fetch a preview (hence the LIMIT 11 in your log).
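A rough way to picture this is the toy class below. LazyRelation is a hypothetical plain-Ruby model, not ActiveRecord's implementation; it only reproduces the pattern in the question's log: assigning the relation runs nothing, inspecting it (which is what a console does to print a result) runs a preview query each time, and once the records are loaded via to_a no further queries are needed.

```ruby
# Toy model of a lazily loaded relation. NOT ActiveRecord; names are made up.
class LazyRelation
  include Enumerable

  attr_reader :query_count

  # The block stands in for the database query.
  def initialize(&loader)
    @loader = loader
    @records = nil
    @query_count = 0
  end

  def loaded?
    !@records.nil?
  end

  # Runs the "query" once, then reuses the cached records.
  def load
    unless loaded?
      @query_count += 1
      @records = @loader.call
    end
    self
  end

  # Iteration (and therefore to_a, map, ...) forces a load.
  def each(&block)
    load
    @records.each(&block)
  end

  # A console prints results via inspect. If nothing is loaded yet, a
  # preview query runs and its result is deliberately not cached.
  def inspect
    return @records.inspect if loaded?
    @query_count += 1
    @loader.call.first(11).inspect
  end
end
```

With posts = LazyRelation.new { [1, 2, 3] }, query_count stays at 0 after the assignment, goes up by one for every inspect, and stops growing once to_a has been called.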

Avoiding N+1 in model instance methods by using sort_by instead of order

My app is a CRM for teachers where a Teacher belongs_to an Account that has_many Students who HABTM PhoneNumbers through CallablePhoneNumbers (since, IRL siblings can share one phone number).
(Aside: As a possible complicating factor, PhoneNumbers is Polymorphic. Both Teachers and Students are "Callable"...)
My Issue: I'm trying to avoid N+1 in a students_list view. When viewing a list of 900 students and some metadata, the database hits are pretty terrifying.
app/models/student.rb
class Student < ActiveRecord::Base
...
has_many :phone_numbers, through: :callable_phone_numbers, as: :callable_phone_numbers
...
def last_messaged_at
self.phone_numbers.order(:last_received_message_at).last.try(:last_received_message_at)
# :last_received_message_at is a simple DateTime in the database
end
...
end
When I'm showing a list of students I want to show the last_messaged_at method as a status alongside the student, and I'm attempting to avoid N+1 via .includes()
app/controllers/dashes_controller.rb
class DashesController < ApplicationController
before_action :logged_in_teacher
def show
@teacher = Teacher.includes(account: [{students: [:phone_numbers, :grade_level, :student_groups]}, :grade_levels]).includes(:student_groups).find(@current_teacher.id)
end
end
Yes, there are a lot of other associations in there. I'm focusing this question exclusively on PhoneNumbers, though feedback about my use of .includes() would not be unwelcome, since it does look convoluted.
In the console, I can go...
pry(main)> t = Teacher.includes(account: [{students: [:phone_numbers, :grade_level, :student_groups]}, :grade_levels]).includes(:student_groups).find(3)
Teacher Load (2.3ms) SELECT "teachers".* FROM "teachers" WHERE "teachers"."id" = ? LIMIT 1 [["id", 3]]
Account Load (0.4ms) SELECT "accounts".* FROM "accounts" WHERE "accounts"."id" IN (3)
Student Load (8.2ms) SELECT "students".* FROM "students" WHERE "students"."account_id" IN (3)
CallablePhoneNumber Load (7.3ms) ... ETC
pry(main)> t.account.students.first.phone_numbers
=> [#<PhoneNumber:0x007fddcc59ac98
id: 15,
number: ... ETC
...to get phone_numbers without an additional PhoneNumber Load. However, when I...
pry(main)> t.account.students.first.last_messaged_at
PhoneNumber Load (0.4ms) SELECT "phone_numbers".* FROM "phone_numbers" INNER JOIN "callable_phone_numbers" ON "phone_numbers"."id" = "callable_phone_numbers"."phone_number_id" WHERE "callable_phone_numbers"."callable_id" = ? AND "callable_phone_numbers"."callable_type" = ? ORDER BY "phone_numbers"."last_received_message_at" DESC LIMIT 1 [["callable_id", 3], ["callable_type", "Student"]]
=> Thu, 06 Aug 2015 18:01:12 UTC +00:00
I'm unexpectedly forced to ping the database again, when I would've thought those PhoneNumbers were already in memory.
I felt like an instance method was most appropriate for this, but maybe it should be a helper that I pass the Collection of Phone Numbers to? Even if that's the case, it's still unclear to me why the instance method can't "see" the loaded PhoneNumbers.
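Calling order on the association builds a new query, so it bypasses the records that includes already loaded. If the association has been eager loaded, sort in memory with sort_by instead: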
self.phone_numbers.sort_by { |pn| pn.last_received_message_at || Time.now - 20.year }.last.try(:last_received_message_at)
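If the fallback date feels awkward, the same in-memory idea can be written by dropping the nil timestamps and taking the maximum. The sketch below is plain Ruby; the PhoneNumber Struct is a hypothetical stand-in for the preloaded records, and the method takes the collection as an argument only so that it runs without Rails.

```ruby
# Hypothetical stand-in for an already-loaded PhoneNumber record.
PhoneNumber = Struct.new(:number, :last_received_message_at)

# Returns the most recent timestamp among the given phone numbers,
# or nil when none of them has ever received a message.
# Works purely in memory, so it issues no query.
def last_messaged_at(phone_numbers)
  phone_numbers.map(&:last_received_message_at).compact.max
end
```

Inside the Student model this would presumably become phone_numbers.map(&:last_received_message_at).compact.max, which reads from the preloaded association as long as no query method such as order is chained onto it.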

Includes still result in second database query when using relation with limited columns

I'm trying to use includes on a query to limit the number of subsequent database calls that fire when rendering but I also want the include calls to select a subset of columns from the related tables. Specifically, I want to get a set of posts, their comments, and just the name of the user who wrote each comment.
So I added
belongs_to :user
belongs_to :user_for_display, :select => "users.id, users.name", :class_name => "User", :foreign_key => "user_id"
to my comments model.
From the console, when I do
p = Post.where(:id => 1).includes(comments: [:user_for_display])
I see that the correct queries fire:
SELECT posts.* FROM posts WHERE posts.id = 1
SELECT comments.* FROM comments WHERE comments.attachable_type = "Post" AND comments.attachable_id IN (1)
SELECT users.id, users.name FROM users WHERE users.id IN (1,2,3)
but calling
p.first.comments.first.user.name
still results in a full user load database call:
User Load (0.5ms) SELECT `users`.* FROM `users` WHERE `users`.`id` = 11805 LIMIT 1
=> "John"
Referencing just p.first.comments does not fire a second comments query. And if I include the full :user relation instead of :user_for_display, the call to get the user name doesn't fire a second users query (but I'd prefer not to load the full user record).
Is there any way to use SELECT to limit fields in an includes?
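The includes call preloaded the user_for_display association, not user. They are separate associations, so calling user triggers its own query. Read the name through user_for_display instead: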
p.first.comments.first.user_for_display.name

ActiveRecord :includes - how to use map with loaded associations?

I have a small rails app, and I'm trying to get some order statistics.
So I have an Admin model, and an Order model, with one-to-many association.
class Admin < ActiveRecord::Base
attr_accessible :name
has_many :orders
class Order < ActiveRecord::Base
attr_accessible :operation
belongs_to :admin
And I'm trying to get specific orders using this query:
admins = Admin.where(...).includes(:orders).where('orders.operation = ?', 'new gifts!')
That works just as expected. But when I try to build JSON using map like this
admins.map {|a| [a.name, a.orders.pluck(:operation)]}
Rails loads orders again using new query, ignoring already loaded objects.
(5.6ms) SELECT "orders"."operation" FROM "orders" WHERE "orders"."admin_id" = 26
(6.8ms) SELECT "orders"."operation" FROM "orders" WHERE "orders"."admin_id" = 24
(2.9ms) SELECT "orders"."operation" FROM "orders" WHERE "orders"."admin_id" = 30
(3.3ms) SELECT "orders"."operation" FROM "orders" WHERE "orders"."admin_id" = 29
(4.8ms) SELECT "orders"."operation" FROM "orders" WHERE "orders"."admin_id" = 27
(3.3ms) SELECT "orders"."operation" FROM "orders" WHERE "orders"."admin_id" = 28
(5.1ms) SELECT "orders"."operation" FROM "orders" WHERE "orders"."admin_id" = 25
When I use a loop instead of map, it works as it should:
admins.each do |a|
p a.orders.pluck(:operation)
end
This code doesn't load all orders, and prints only those loaded in the first query.
Is it possible to get the same result using map? What are the drawbacks of using loop instead of map?
pluck always issues a new database query. I'm not sure why you think that doesn't happen in an each loop; maybe you missed the log lines because they are interleaved with your printed output?
There are two ways to avoid the additional queries:
1. Since the orders are already loaded because you include them, you can do admins.map {|a| [a.name, a.orders.collect(&:operation)]}
2. Use joins (see @tihom's comment).
Edit: I just tested the each/map behavior, and it reloads every time, as expected.
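The first option boils down to reading attributes from objects that are already in memory. A minimal plain-Ruby sketch of that, with Admin and Order as hypothetical Structs standing in for the loaded records:

```ruby
# Hypothetical stand-ins for records that were already loaded via includes.
Order = Struct.new(:operation)
Admin = Struct.new(:name, :orders)

# Builds [name, [operations...]] pairs by reading the in-memory orders;
# map/collect on an Array never goes back to the database.
def operations_by_admin(admins)
  admins.map { |admin| [admin.name, admin.orders.map(&:operation)] }
end
```

Because map (alias collect) here is Ruby's own Enumerable method, it works the same inside map or each; the extra queries in the question come from pluck, not from the choice of iterator.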

active record relations – who needs it?

Well, I'm confused about Rails queries. For example:
Affiche belongs_to :place
Place has_many :affiches
We can do this now:
@affiches = Affiche.all( :joins => :place )
or
@affiches = Affiche.all( :include => :place )
and we will get a lot of extra SELECTs, if there are many affiches:
Place Load (0.2ms) SELECT "places".* FROM "places" WHERE "places"."id" = 3 LIMIT 1
Place Load (0.3ms) SELECT "places".* FROM "places" WHERE "places"."id" = 3 LIMIT 1
Place Load (0.8ms) SELECT "places".* FROM "places" WHERE "places"."id" = 444 LIMIT 1
Place Load (1.0ms) SELECT "places".* FROM "places" WHERE "places"."id" = 222 LIMIT 1
...and so on...
And (sic!) with :joins used every SELECT is doubled!
Technically we could just write this:
@affiches = Affiche.all( )
and the result is exactly the same (because we have the relations declared). The way out, keeping all data in one query, is removing the relations and writing a big string with "LEFT OUTER JOIN", but then there is still the problem of grouping the data into a multi-dimensional array and the problem of identically named columns, such as id.
What is done wrong? Or what am I doing wrong?
UPDATE:
Well, I do get the line Place Load (2.5ms) SELECT "places".* FROM "places" WHERE ("places"."id" IN (3,444,222,57,663,32,154,20)), followed by a list of SELECTs, one per id. Strangely, I get these separate SELECTs when I do this inside the each loop:
<%= link_to a.place.name, **a.place**( :id => a.place.friendly_id ) %>
The marked a.place is the spot that produces these extra queries.
UPDATE 2:
And let me do some math. In console we have:
Affiche Load (1.8ms) SELECT affiches.*, places.name FROM "affiches" LEFT OUTER JOIN "places" ON "places"."id" = "affiches"."place_id" ORDER BY affiches.event_date DESC
<VS>
Affiche Load (1.2ms) SELECT "affiches".* FROM "affiches"
Place Load (2.9ms) SELECT "places".* FROM "places" WHERE ("places"."id" IN (3,444,222,57,663,32,154,20))
That comes out to 1.8ms versus 4.1ms, which is pretty confusing...
Something is really strange here, because the :include option is intended to gather the place_id attribute from every affiche and then fetch all places at once using a select query like this:
select * from places where id in (3, 444, 222)
You can check that in rails console. Just start it and run that snippet:
ActiveRecord::Base.logger = Logger.new STDOUT
Affiche.all :include => :place
You might be accidentally fetching affiches somewhere in your code without actually including places, and then calling place on every affiche, which makes Rails perform a separate query for each one.
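The strategy described above (collect the place_id values, fetch all places with one IN query, then attach them in memory) can be sketched in plain Ruby. Everything here is hypothetical illustration, not Rails internals: Place and Affiche are Structs, and PlaceTable is a made-up in-memory table that counts how many lookups it receives.

```ruby
# Hypothetical stand-ins for the models.
Place   = Struct.new(:id, :name)
Affiche = Struct.new(:id, :place_id, :place)

# Made-up in-memory "places table" that counts the queries it receives.
class PlaceTable
  attr_reader :query_count

  def initialize(places)
    @by_id = places.map { |place| [place.id, place] }.to_h
    @query_count = 0
  end

  # Models: SELECT * FROM places WHERE id IN (...)
  def where_id_in(ids)
    @query_count += 1
    @by_id.values_at(*ids).compact
  end
end

# Eager loading in miniature: one IN query for all distinct place_ids,
# then each affiche gets its place assigned from the in-memory result.
def preload_places(affiches, table)
  ids = affiches.map(&:place_id).compact.uniq
  places = table.where_id_in(ids).map { |place| [place.id, place] }.to_h
  affiches.each { |affiche| affiche.place = places[affiche.place_id] }
end
```

After preload_places runs, reading affiche.place for any number of affiches costs no further lookups, which is the behaviour the extra per-id SELECTs in the question are missing.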
