Rails scope with polymorphic model - ruby-on-rails

I am getting some strange results from a very basic scope on a model with a polymorphic relationship. Here's the brief summary and detail of the relationships.
models/financials.rb
class Financial < ActiveRecord::Base
  belongs_to :financiable, :polymorphic => true
  # ltm is a boolean field in the model
  scope :ltm, -> { where(ltm: true).last }
end
And then there's a basic Firm model that has many financials
models/firm.rb
class Firm < ActiveRecord::Base
  has_many :financials, :as => :financiable, dependent: :destroy
end
I get a bizarre result when a firm has no ltm financials (i.e. no financials with ltm: true). When I call firm.financials.ltm, I get an ActiveRecord relation of financials that belong to the firm but do NOT have ltm: true. However, when I call firm.financials.where(ltm: true).last, I get nil.
Summary of results for when there are no ltm financials for the firm:
firm.financials.ltm #AR relation of financials that belong to the firm but are not ltm
firm.financials.where(ltm: true).last #nil
And what makes it even stranger is that when a firm does have ltm financials, the scope works as expected.
Has anyone ever had this problem before or have any ideas? I mean the easy answer is to not use the scope but I wanted to understand what could be causing this.
---UPDATES BASED ON COMMENTS---
Thank you guys for putting a lot of thought into this.
D-side You were correct. The code was firm.financials.ltm and not firm.financials.ltm.last. That was a typo when I typed up the question. I updated the above to reflect and also below are the SQL queries.
Jiří Pospíšil - Great advice. I will update in my app but leave the same here so as not to create confusion.
Chumakoff. I force ltm to false if the user doesn't enter true with a before_save call so I don't think this is it but thanks for the thought.
So these are from the scenario where firm doesn't have any financials with ltm = true. As you can see, the scope request is making a second query to the database for all financials belonging to firm. Why is it doing that when it can't find it in the initial query?
firm.financials.ltm
Financial Load (4.6ms) SELECT "financials".* FROM "financials"
WHERE "financials"."financiable_id" = $1 AND
"financials"."financiable_type" = $2 AND "financials"."ltm" = 't'
ORDER BY "financials"."id" DESC LIMIT 1 [["financiable_id", 11],
["financiable_type", "Firm"]]
Financial Load (1.2ms) SELECT "financials".* FROM "financials" WHERE
"financials"."financiable_id"= $1 AND "financials"."financiable_type"
= $2 [["financiable_id", 11], ["financiable_type", "Firm"]]
firm.financials.where(ltm: true).last
Financial Load (16.8ms) SELECT "financials".* FROM "financials"
WHERE "financials"."financiable_id" = $1 AND
"financials"."financiable_type" = $2 AND "financials"."ltm" = 't'
ORDER BY "financials"."id" DESC LIMIT 1 [["financiable_id", 11],
["financiable_type", "Firm"]]

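Your scope is not working correctly because of the .last inside it: the scope body then returns a single record (or nil) instead of an ActiveRecord relation. When a scope body returns nil, Rails falls back to the unfiltered relation, which is why you see the second query loading all of the firm's financials. Remove .last from the scope and it will correctly return all Financials with ltm = true; call .last on the result where you need a single record.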
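The fallback can be illustrated without Rails. The sketch below is plain Ruby; ToyRelation and its scope method are hypothetical stand-ins that only mimic the "nil body result means unfiltered relation" behaviour, not ActiveRecord itself.

```ruby
# Toy stand-in for a relation: wraps an array of hashes. Not ActiveRecord.
class ToyRelation
  attr_reader :records

  def initialize(records)
    @records = records
  end

  # Keep only the records matching every key/value pair.
  def where(conditions)
    ToyRelation.new(records.select { |r| conditions.all? { |k, v| r[k] == v } })
  end

  def last
    records.last
  end

  # Mimics the scope fallback: if the body returns nil,
  # the caller gets the unfiltered relation back instead of nil.
  def scope(&body)
    result = instance_exec(&body)
    result.nil? ? self : result
  end
end

# A firm with no ltm financials.
financials = ToyRelation.new([{ id: 1, ltm: false }, { id: 2, ltm: false }])

# Body ends in .last, which returns nil here, so the scope falls back to everything.
broken = financials.scope { where(ltm: true).last }
# Body returns a relation, so the filter is kept.
fixed = financials.scope { where(ltm: true) }

puts broken.records.length # both non-ltm rows come back
puts fixed.records.length  # no rows
```

With the real model, the equivalent would be to drop .last from the scope and call firm.financials.ltm.last where a single record is needed.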

Related

Rails 5 select from two different tables and get one result

I have 3 models, Shop, Client, Product.
A shop has many clients, and a shop has many products.
Then I have 2 extra models, one is ShopClient, that groups the shop_id and client_id. The second is ShopProduct, that groups the shop_id and product_id.
Now I have a controller that receives two params, client_id and product_id. I want to select all the shops (in one instance variable, @shops) filtered by client_id and product_id, without repeating any shop. How can I do this?
I hope I was clear, thanks.
ps: I'm using Postgresql as database.
The query below should work for you.
class Shop
has_many :shop_clients
has_many :clients, through: :shop_clients
has_many :shop_products
has_many :products, through: :shop_products
end
class Client
end
class Product
end
class ShopClient
belongs_to :shop
belongs_to :client
end
class ShopProduct
belongs_to :shop
belongs_to :product
end
@shops = Shop.joins(:clients).where(clients: {id: params[:client_id]}).merge(Shop.joins(:products).where(products: {id: params[:product_id]}))
Just to riff on the answer provided by Prince Bansal. How about creating some class methods for those joins? Something like:
class Shop
has_many :shop_clients
has_many :clients, through: :shop_clients
has_many :shop_products
has_many :products, through: :shop_products
class << self
def with_clients(clients)
joins(:clients).where(clients: {id: clients})
end
def with_products(products)
joins(:products).where(products: {id: products})
end
end
end
Then you could do something like:
@shops = Shop.with_clients(params[:client_id]).with_products(params[:product_id])
By the way, I'm sure someone is going to say you should make those class methods into scopes. And you certainly can do that. I did it as class methods because that's what the Guide recommends:
Using a class method is the preferred way to accept arguments for scopes.
But, I realize some people strongly prefer the aesthetics of using scopes instead. So, whichever pleases you most.
I feel like the best way to solve this issue is to use sub-queries. I'll first collect all valid shop ids from ShopClient, followed by all valid shop ids from ShopProduct, then feed them into the where query on Shop. This will result in one SQL query.
shop_client_ids = ShopClient.where(client_id: params[:client_id]).select(:shop_id)
shop_product_ids = ShopProduct.where(product_id: params[:product_id]).select(:shop_id)
@shops = Shop.where(id: shop_client_ids).where(id: shop_product_ids)
#=> #<ActiveRecord::Relation [#<Shop id: 1, created_at: "2018-02-14 20:22:18", updated_at: "2018-02-14 20:22:18">]>
The above results in the SQL query below. I didn't specify a limit; the LIMIT 11 is most likely added by the console's #inspect call (see the note below) rather than by SQLite.
SELECT "shops".*
FROM "shops"
WHERE
"shops"."id" IN (
SELECT "shop_clients"."shop_id"
FROM "shop_clients"
WHERE "shop_clients"."client_id" = ?) AND
"shops"."id" IN (
SELECT "shop_products"."shop_id"
FROM "shop_products"
WHERE "shop_products"."product_id" = ?)
LIMIT ?
[["client_id", 1], ["product_id", 1], ["LIMIT", 11]]
Combining the two sub-queries in one where doesn't result in a correct response:
@shops = Shop.where(id: [shop_client_ids, shop_product_ids])
#=> #<ActiveRecord::Relation []>
Produces the query:
SELECT "shops".* FROM "shops" WHERE "shops"."id" IN (NULL, NULL) LIMIT ? [["LIMIT", 11]]
note
Keep in mind that when you run the statements one by one in the console this will normally result in 3 queries. This is due to the fact that the return value uses the #inspect method to let you see the result. This method is overridden by Rails to execute the query and display the result.
You can simulate the behavior of the normal application by suffixing the statements with ;nil. This makes sure nil is returned and the #inspect method is not called on the where chain, thus not executing the query and keeping the chain in memory.
edit
If you want to clean up the controller you might want to move these sub-queries into model methods (inspired by jvillian's answer).
class Shop
# ...
def self.with_clients(*client_ids)
client_ids.flatten! # allows passing of multiple arguments or an array of arguments
where(id: ShopClient.where(client_id: client_ids).select(:shop_id))
end
# ...
end
Rails sub-query vs join
The advantage of a sub-query over a join is that a join might return the same record multiple times if you query on an attribute that is not unique. For example, say a product has an attribute product_type that is either 'physical' or 'digital'. If you want to select all shops selling a digital product, you must not forget to call distinct on the chain when you're using a join, otherwise the same shop may be returned multiple times.
However, if you have to query on multiple attributes of product and use multiple helpers in the model (where each helper calls joins(:products)), multiple sub-queries are likely to be slower (assuming you set has_many :products, through: :shop_products). This is because Rails reduces all joins on the same association to a single one. For example, Shop.joins(:products).joins(:products) (from multiple class methods) will still join the products table only once, whereas sub-queries will not be reduced.
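The duplicate-row point can be seen without a database. The following plain-Ruby sketch uses hypothetical in-memory "tables" (arrays of hashes with made-up ids) to contrast a join-like lookup with a sub-query-like membership test; it only illustrates the row counts, not how ActiveRecord builds the SQL.

```ruby
# Hypothetical in-memory tables.
shops = [{ id: 1, name: "A" }, { id: 2, name: "B" }]
products = [
  { id: 10, product_type: "digital" },
  { id: 11, product_type: "digital" },
  { id: 12, product_type: "physical" }
]
shop_products = [
  { shop_id: 1, product_id: 10 },
  { shop_id: 1, product_id: 11 },
  { shop_id: 2, product_id: 12 }
]

digital_ids = products.select { |p| p[:product_type] == "digital" }.map { |p| p[:id] }

# Join-like: one output row per matching (shop, product) pair.
joined = shop_products
  .select { |sp| digital_ids.include?(sp[:product_id]) }
  .map { |sp| shops.find { |s| s[:id] == sp[:shop_id] } }

# Sub-query-like: filter shops by membership in a list of shop ids.
shop_ids = shop_products
  .select { |sp| digital_ids.include?(sp[:product_id]) }
  .map { |sp| sp[:shop_id] }
via_subquery = shops.select { |s| shop_ids.include?(s[:id]) }

puts joined.map { |s| s[:id] }.inspect        # shop 1 appears twice
puts joined.uniq.map { |s| s[:id] }.inspect   # like calling distinct
puts via_subquery.map { |s| s[:id] }.inspect  # shop 1 appears once
```

Shop 1 sells two digital products, so the join-like version yields it twice until the result is de-duplicated, while the membership test can only ever yield each shop once.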
The SQL query below might work for you.
--
-- assuming
-- tables: shops, products, clients, shop_products, shop_clients
--
SELECT DISTINCT shops.* FROM shops
JOIN shop_products
ON shop_products.shop_id = shops.id
JOIN shop_clients
ON shop_clients.shop_id = shops.id
WHERE shop_clients.client_id = ? AND shop_products.product_id = ?
If you have difficulty writing an adequate AR expression for this SQL query, let me know.

Avoiding N+1 in model instance methods by using sort_by instead of order

My app is a CRM for teachers where a Teacher belongs_to an Account that has_many Students who HABTM PhoneNumbers through CallablePhoneNumbers (since, IRL siblings can share one phone number).
(Aside: As a possible complicating factor, PhoneNumbers is Polymorphic. Both Teachers and Students are "Callable"...)
My Issue: I'm trying to avoid N+1 in a students_list view. When viewing a list of 900 students and some metadata, the database hits are pretty terrifying.
app/models/student.rb
class Student < ActiveRecord::Base
...
has_many :phone_numbers, through: :callable_phone_numbers, as: :callable_phone_numbers
...
def last_messaged_at
self.phone_numbers.order(:last_received_message_at).last.try(:last_received_message_at)
# :last_received_message_at is a simple DateTime in the database
end
...
end
When I'm showing a list of students I want to show the last_messaged_at method as a status alongside the student, and I'm attempting to avoid N+1 via .includes()
app/controllers/dashes_controller.rb
class DashesController < ApplicationController
before_action :logged_in_teacher
def show
@teacher = Teacher.includes(account: [{students: [:phone_numbers, :grade_level, :student_groups]}, :grade_levels]).includes(:student_groups).find(@current_teacher.id)
end
end
Yes, there are a lot of other associations in there. I'm focusing this question exclusively on PhoneNumbers, though feedback about my use of .includes() would not be unwelcome, since it does look convoluted.
In the console, I can go...
pry(main)> t = Teacher.includes(account: [{students: [:phone_numbers, :grade_level, :student_groups]}, :grade_levels]).includes(:student_groups).find(3)
Teacher Load (2.3ms) SELECT "teachers".* FROM "teachers" WHERE "teachers"."id" = ? LIMIT 1 [["id", 3]]
Account Load (0.4ms) SELECT "accounts".* FROM "accounts" WHERE "accounts"."id" IN (3)
Student Load (8.2ms) SELECT "students".* FROM "students" WHERE "students"."account_id" IN (3)
CallablePhoneNumber Load (7.3ms) ... ETC
pry(main)> t.account.students.first.phone_numbers
=> [#<PhoneNumber:0x007fddcc59ac98
id: 15,
number: ... ETC
...to get phone_numbers without an additional PhoneNumber Load. However, when I...
pry(main)> t.account.students.first.last_messaged_at
PhoneNumber Load (0.4ms) SELECT "phone_numbers".* FROM "phone_numbers" INNER JOIN "callable_phone_numbers" ON "phone_numbers"."id" = "callable_phone_numbers"."phone_number_id" WHERE "callable_phone_numbers"."callable_id" = ? AND "callable_phone_numbers"."callable_type" = ? ORDER BY "phone_numbers"."last_received_message_at" DESC LIMIT 1 [["callable_id", 3], ["callable_type", "Student"]]
=> Thu, 06 Aug 2015 18:01:12 UTC +00:00
I'm unexpectedly forced to ping the database again, when I would've thought those PhoneNumbers were already in memory.
I felt like an instance method was most appropriate for this, but maybe it should be a helper that I pass the Collection of Phone Numbers to? Even if that's the case, it's still unclear to me why the instance method can't "see" the loaded PhoneNumbers.
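Try sort_by if you have already eager loaded the association. Calling order on an association always builds a new SQL query, even when the records are already loaded, whereas sort_by works on the records in memory.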
self.phone_numbers.sort_by { |pn| pn.last_received_message_at || Time.now - 20.year }.last.try(:last_received_message_at)
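An in-memory variant that avoids the placeholder date is to drop the nil timestamps and take the maximum. The sketch below is plain Ruby; the PhoneNumber Struct and the sample numbers are hypothetical stand-ins for the already-loaded records.

```ruby
# Hypothetical stand-in for already-loaded PhoneNumber records.
PhoneNumber = Struct.new(:number, :last_received_message_at)

loaded = [
  PhoneNumber.new("555-0100", Time.utc(2015, 8, 6, 18, 1, 12)),
  PhoneNumber.new("555-0101", nil),
  PhoneNumber.new("555-0102", Time.utc(2015, 7, 1, 9, 0, 0))
]

# Pure Ruby over the loaded array: drop nils, take the newest timestamp.
# Returns nil when there are no phone numbers or no timestamps at all.
def last_messaged_at(phone_numbers)
  phone_numbers.map(&:last_received_message_at).compact.max
end

puts last_messaged_at(loaded).inspect
puts last_messaged_at([]).inspect
```

In the Student model the same idea would read phone_numbers.map(&:last_received_message_at).compact.max, which should reuse the eager-loaded records rather than issue a new query.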

How to order records by their latest child records attribute

I'm having trouble ordering my records by their has_one association. I'm quite sure the solution is obvious, but I just can't get it.
class Migration
has_many :checks
has_one :latest_origin_check, -> { where(origin: true).order(at: :desc) }, class_name: 'Check'
end
class Check
belongs_to :migration
end
If I order by checks.status I always get different check ids. Shouldn't they be the same but with different order?
Or is the -> { } way to get the has_one association the problem?
Migration.all.includes(:latest_origin_check).order("checks.status DESC").each do |m|
  puts m.latest_origin_check.id
end
So in one sentence: How do I order records through a custom has_one association?
I'm using Ruby 2.0.0, Rails 4.2 and PostgreSQL.
Update:
I wasn't specific enough. I've got two has_one relations on the checks relation.
Also very important: one Migration has far too many checks to include them all at once, so Migration.first.includes(:checks) would be very slow. We are talking about several thousand, and I only need the latest.
class Migration
has_many :checks
has_one :latest_origin_check, -> { where(origin: true).order(at: :desc) }, class_name: 'Check'
has_one :latest_target_check, -> { where(origin: false).order(at: :desc) }, class_name: 'Check'
end
class Check
belongs_to :migration
end
Now if I get the latest_origin_check, I get the correct Record. The query is the following.
pry(main)> Migration.last.latest_origin_check
Migration Load (1.1ms) SELECT "migrations".* FROM "migrations" ORDER BY "migrations"."id" DESC LIMIT 1
Check Load (0.9ms) SELECT "checks".* FROM "checks" WHERE "checks"."migration_id" = $1 AND "checks"."origin" = 't' ORDER BY "checks"."at" DESC LIMIT 1 [["migration_id", 59]]
How do I get the latest check of each migration and then sort the migrations by an attribute of the latest check?
I'm using ransack. Ransack seems to get it right when I order the records by "checks.at"
SELECT "migrations".* FROM "migrations" LEFT OUTER JOIN "checks" ON "checks"."migration_id" = "migrations"."id" AND "checks"."origin" = 't' WHERE (beginning between '2015-02-22 23:00:00.000000' and '2015-02-23 22:59:59.000000' or ending between '2015-02-22 23:00:00.000000' and '2015-02-23 22:59:59.000000') ORDER BY "checks"."at" ASC
But the same query returns wrong results when I order by status
SELECT "migrations".* FROM "migrations" LEFT OUTER JOIN "checks" ON "checks"."migration_id" = "migrations"."id" AND "checks"."origin" = 't' WHERE (beginning between '2015-02-22 23:00:00.000000' and '2015-02-23 22:59:59.000000' or ending between '2015-02-22 23:00:00.000000' and '2015-02-23 22:59:59.000000') ORDER BY "checks"."status" ASC
Check.status is a boolean, Check.at is a DateTime. A colleague suggested that the boolean is the problem. Do I need to convert the booleans to integers to make them sortable? How do I do that only for the :latest_origin_check? Something like this?
.order("(case when \"checks\".\"status\" then 2 when \"checks\".\"status\" is null then 0 else 1 end) DESC")
You already have a has_many relationship with Check on Migration. I think you are looking for a scope instead:
scope :latest_origin_check, -> { includes(:checks).where(origin:true).order("checks.status DESC").limit(1)}
Drop the has_one :latest_origin_check line on Migration.
Migration.latest_origin_check
I think the line above should return your desired result set.

eager loading the first record of an association

In a very simple forum made from Rails app, I get 30 topics from the database in the index action like this
def index
@topics = Topic.all.page(params[:page]).per_page(30)
end
However, when I list them in the views/topics/index.html.erb, I also want to have access to the first post in each topic to display in a tooltip, so that when users scroll over, they can read the first post without having to click on the link. Therefore, in the link to each post in the index, I add the following to a data attribute
topic.posts.first.body
each of the links looks like this
<%= link_to simple_format(topic.name), posts_path(
:topic_id => topic), :data => { :toggle => 'tooltip', :placement => 'top', :'original-title' => "#{ topic.posts.first.body }"}, :class => 'tool' %>
While this works fine, I'm worried that it's an n+1 query, namely that if there's 30 topics, it's doing this 30 times
User Load (0.8ms) SELECT "users".* FROM "users" WHERE "users"."id" = 1 ORDER BY "users"."id" ASC LIMIT 1
Post Load (0.4ms) SELECT "posts".* FROM "posts" WHERE "posts"."topic_id" = $1 ORDER BY "posts"."id" ASC LIMIT 1 [["topic_id", 7]]
I've noticed that Rails does automatic caching on some of these, but I think there might be a way to write the index action differently to avoid some of this n+1 problem; I just can't figure out how. I found out that I can use
includes(:posts)
to eager load the posts, like this
@topics = Topic.all.page(params[:page]).per_page(30).includes(:posts)
However, if I know that I only want the first post for each topic, is there a way to specify that? If a topic had 30 posts, I wouldn't want to eager load all of them.
I tried to do
.includes(:posts).first
but it broke the code
This appears to work for me, so give this a shot and see if it works for you:
Topic.includes(:posts).where("posts.id = (select id from posts where posts.topic_id = topics.id limit 1)").references(:posts)
This will create a dependent subquery in which the posts' topic_id in the subquery is matched up with the topic's id in the parent query. With the limit 1 clause in the subquery, the result is that each Topic row will contain only 1 matching Post row, eager loaded thanks to the includes(:posts).
Note that when passing an SQL string to .where, that references an eager loaded relation, the references method should be appended to inform ActiveRecord that we're referencing an association, so that it knows to perform appropriate joins in the subsequent query. Apparently it technically works without that method, but you get a deprecation warning, so you might as well throw it in lest you encounter problems in future Rails updates.
To my knowledge you can't. A custom association is often used to allow conditions on includes, but that does not work for limit.
If you eager load an association with a specified :limit option, it will be ignored, returning all the associated objects. http://api.rubyonrails.org/classes/ActiveRecord/Associations/ClassMethods.html
class Picture < ActiveRecord::Base
has_many :most_recent_comments, -> { order('id DESC').limit(10) },
class_name: 'Comment'
end
Picture.includes(:most_recent_comments).first.most_recent_comments
# => returns all associated comments.
There're a few issues when trying to solve this "natively" via Rails which are detailed in this question.
We solved it with an SQL scope, for your case something like:
class Topic < ApplicationRecord
has_one :first_post, class_name: "Post", primary_key: :first_post_id, foreign_key: :id
scope :with_first_post, lambda {
select(
"topics.*,
(
SELECT id as first_post_id
FROM posts
WHERE topic_id = topics.id
ORDER BY id asc
LIMIT 1
)"
)
}
end
Topic.with_first_post.includes(:first_post)

Specifying conditions on eager loaded associations returns ActiveRecord::RecordNotFound

The problem is that when a Restaurant does not have any MenuItems that match the condition, ActiveRecord says it can't find the Restaurant. Here's the relevant code:
class Restaurant < ActiveRecord::Base
has_many :menu_items, dependent: :destroy
has_many :meals, through: :menu_items
def self.with_meals_of_the_week
includes({menu_items: :meal}).where(:'menu_items.date' => Time.now.beginning_of_week..Time.now.end_of_week)
end
end
And the sql code generated:
Restaurant Load (0.0ms) SELECT DISTINCT "restaurants".id FROM "restaurants"
LEFT OUTER JOIN "menu_items" ON "menu_items"."restaurant_id" = "restaurants"."id"
LEFT OUTER JOIN "meals" ON "meals"."id" = "menu_items"."meal_id" WHERE
"restaurants"."id" = ? AND ("menu_items"."date" BETWEEN '2012-10-14 23:00:00.000000'
AND '2012-10-21 22:59:59.999999') LIMIT 1 [["id", "1"]]
However, according to this part of the Rails Guides, this shouldn't be happening:
Post.includes(:comments).where("comments.visible", true)
If, in the case of this includes query, there were no comments for any posts, all the posts would still be loaded.
The SQL generated is a correct translation of your query. But look at it
just at the SQL level (I shortened it a bit):
SELECT *
FROM
"restaurants"
LEFT OUTER JOIN
"menu_items" ON "menu_items"."restaurant_id" = "restaurants"."id"
LEFT OUTER JOIN
"meals" ON "meals"."id" = "menu_items"."meal_id"
WHERE
"restaurants"."id" = ?
AND
("menu_items"."date" BETWEEN '2012-10-14' AND '2012-10-21')
The left outer joins do the work you expect them to do: restaurants
are combined with menu_items and meals; if there is no menu_item to
go with a restaurant, the restaurant is still kept in the result, with
all the missing pieces (menu_items.id, menu_items.date, ...) filled in with NULL.
Now look at the second part of the WHERE: the BETWEEN operator demands
that menu_items.date is not NULL! And this
is where you filter out all the restaurants without meals.
So we need to change the query in a way that makes NULL dates OK.
Going back to Ruby, you can write:
def self.with_meals_of_the_week
includes({menu_items: :meal})
.where('menu_items.date is NULL or menu_items.date between ? and ?',
Time.now.beginning_of_week,
Time.now.end_of_week
)
end
The resulting SQL is now
.... WHERE (menu_items.date is NULL or menu_items.date between '2012-10-21' and '2012-10-28')
and the restaurants without meals stay in.
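The effect of the two WHERE clauses can be mimicked in plain Ruby. In this sketch the rows are hypothetical hand-written results of the LEFT OUTER JOIN (restaurant names and dates are made up), with nil standing in for NULL.

```ruby
require "date"

# Hypothetical rows as a LEFT OUTER JOIN would produce them:
# a restaurant without menu items still appears, with a nil date.
rows = [
  { restaurant: "Alpha", menu_item_date: Date.new(2012, 10, 16) },
  { restaurant: "Beta",  menu_item_date: nil },
  { restaurant: "Gamma", menu_item_date: Date.new(2012, 9, 1) }
]
week = Date.new(2012, 10, 15)..Date.new(2012, 10, 21)

# WHERE date BETWEEN ...: a NULL date never satisfies the condition.
strict = rows.select { |r| !r[:menu_item_date].nil? && week.cover?(r[:menu_item_date]) }

# WHERE date IS NULL OR date BETWEEN ...: rows with no menu item survive.
lenient = rows.select { |r| r[:menu_item_date].nil? || week.cover?(r[:menu_item_date]) }

puts strict.map { |r| r[:restaurant] }.inspect
puts lenient.map { |r| r[:restaurant] }.inspect
```

Note that "Gamma", which has menu items but none in the current week, is dropped by both filters; the IS NULL variant only rescues restaurants that have no menu items at all.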
As the Rails Guide says, all Posts are returned only if you do not combine includes with a where clause on the included table. Adding such a where clause generates a LEFT OUTER JOIN whose WHERE condition is applied to the joined (right-hand) table, so rows without a matching record are filtered out and the DB may return nothing.
This behaviour of includes is very helpful when you need some base objects (all of them, or some selected with a where on the base model) together with any related models that exist; if there are none, you simply get the list of base models.
On the other hand, if you put conditions on the included tables, in most cases you want only the objects that satisfy those conditions, which here means only Restaurants that have menu_items.
So in your case, if you still want to use only 2 queries (and not N+1), I would probably do something like this:
class Restaurant < ActiveRecord::Base
  has_many :menu_items, dependent: :destroy
  has_many :meals, through: :menu_items
  # Per-instance holder for this week's menu items
  # (cattr_accessor would share one value across all restaurants).
  attr_accessor :meals_of_the_week
  def self.with_meals_of_the_week
    restaurants = Restaurant.all.to_a
    week = Time.now.beginning_of_week..Time.now.end_of_week
    # One extra query for all matching menu items, grouped by restaurant.
    items_by_restaurant = MenuItem.includes(:meal)
      .where(date: week, restaurant_id: restaurants.map(&:id))
      .group_by(&:restaurant_id)
    # Restaurants without matching items get an empty list.
    restaurants.each { |r| r.meals_of_the_week = items_by_restaurant.fetch(r.id, []) }
    restaurants
  end
end
Update: Rails 4 raises a deprecation warning when you simply put conditions on included models like this.
Sorry for possible typo.
I think there is some misunderstanding of this:
If there was no where condition, this would generate the normal set of two queries.
If, in the case of this includes query, there were no comments for any
posts, all the posts would still be loaded. By using joins (an INNER
JOIN), the join conditions must match, otherwise no records will be
returned.
[from guides]
I think this statement doesn't refer to the example Post.includes(:comments).where("comments.visible", true)
but to the one without a where clause, Post.includes(:comments).
So everything works correctly! This is the way LEFT OUTER JOIN works.
So... you wrote: "If, in the case of this includes query, there were no comments for any posts, all the posts would still be loaded." OK! But this is true ONLY when there is NO where clause! You missed the context of the phrase.

Resources