Add computable column to multi-table select clause with eager_load in Ruby on Rails ActiveRecord

I have a query with a lot of joins, and I'm eager-loading some of the associations at the same time. I also need to compute a value and expose it as an attribute of one of the models.
So I'm trying this code:
ServiceObject
.joins([{service_days: :ou}, :address])
.eager_load(:address, :service_days)
.where(ous: {id: OU.where(sector_code: 5)})
.select('SDO_CONTAINS(ous.service_area_shape, SDO_GEOMETRY(2001, 8307, sdo_point_type(addresses.lat, addresses.lng, NULL), NULL, NULL) ) AS in_zone')
Here the SQL function call in select operates on data from the associated addresses and ous tables.
I get the following SQL (my in_zone column is calculated and returned as the first column, before the aliased columns of all eager-loaded models):
SELECT SDO_CONTAINS(ous.service_area_shape, SDO_GEOMETRY(2001, 8307, sdo_point_type(addresses.lat, addresses.lng, NULL), NULL, NULL) ) AS in_zone, "SERVICE_OBJECTS"."ID" AS t0_r0, "SERVICE_OBJECTS"."TYPE" AS t0_r1, <omitted for brevity> AS t2_r36 FROM "SERVICE_OBJECTS" INNER JOIN "SERVICE_DAYS" ON "SERVICE_DAYS"."SERVICE_OBJECT_ID" = "SERVICE_OBJECTS"."ID" INNER JOIN "OUS" ON "OUS"."ID" = "SERVICE_DAYS"."OU_ID" INNER JOIN "ADDRESSES" ON "ADDRESSES"."ID" = "SERVICE_OBJECTS"."ADDRESS_ID" WHERE "OUS"."ID" IN (SELECT "OUS"."ID" FROM "OUS" WHERE "OUS"."SECTOR_CODE" = :a1) [["sector_code", "5"]]
But in_zone doesn't seem to be accessible from any of the models used in the query.
I need the calculated in_zone available as an attribute of the ServiceObject model object. How can I accomplish that?
Ruby on Rails 4.2.6, Ruby 2.3.0, oracle_enhanced adapter 1.6.7, Oracle 12.1

I have successfully replicated your issue, and it turns out to be a known issue in Rails. When using eager_load, Rails maps the columns of all eager-loaded tables to table and column aliases of the form t0_r0, t0_r1, etc. (you can see these in the SQL pasted in the question). While doing that, it simply ignores the custom columns in the select, probably because it cannot determine which eager-loaded table a custom column should be attributed to. Sadly, this issue has been open for more than 2 years now...
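To make the aliasing problem concrete, here is a minimal pure-Ruby sketch of the idea. It is a simplified illustration, not Rails' actual JoinDependency code: the TABLES constant and the build_aliases and split_row helpers are hypothetical names invented for this example, and the column lists are shortened.

```ruby
# Simplified illustration only: this is NOT Rails' real implementation.
# Each eager-loaded table gets aliases of the form t<table>_r<column>.
TABLES = {
  "service_objects" => %w[id type],
  "addresses"       => %w[id lat lng]
}.freeze

# Build the alias map, e.g. "t0_r0" => ["service_objects", "id"].
def build_aliases(tables)
  tables.each_with_index.flat_map do |(table, columns), t|
    columns.each_with_index.map { |column, r| ["t#{t}_r#{r}", [table, column]] }
  end.to_h
end

# Split one flat result row into per-table attribute hashes.
# Only aliased columns are looked up, so anything else in the row is dropped.
def split_row(row, aliases)
  aliases.each_with_object(Hash.new { |h, k| h[k] = {} }) do |(name, (table, column)), out|
    out[table][column] = row[name]
  end
end

row = { "in_zone" => "TRUE", "t0_r0" => 1, "t0_r1" => "Shop",
        "t1_r0" => 7, "t1_r1" => 55.7, "t1_r2" => 37.6 }
models = split_row(row, build_aliases(TABLES))
puts models.inspect
```

In this model the in_zone value is present in the raw row but never reaches any of the per-table attribute hashes, which mirrors the behaviour described in the question.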
Nevertheless, I think I found a workaround. It seems that if you don't eager load the tables but join them manually (with joins), you can still preload them (with includes), and the custom columns will be returned because no column aliasing takes place. The point is that you must not use association names in the joins clauses; you have to write the joins yourself as SQL strings. Also note that you must explicitly select all columns of the main table too (see the service_objects.* in the select).
Try the following approach:
ServiceObject
.joins('INNER JOIN "SERVICE_DAYS" ON "SERVICE_DAYS"."SERVICE_OBJECT_ID" = "SERVICE_OBJECTS"."ID"')
.joins('INNER JOIN "OUS" ON "OUS"."ID" = "SERVICE_DAYS"."OU_ID"')
.joins('INNER JOIN "ADDRESSES" ON "ADDRESSES"."ID" = "SERVICE_OBJECTS"."ADDRESS_ID"')
.includes(:service_days, :address)
.where(ous: {id: OU.where(sector_code: 5)})
.select('service_objects.*, SDO_CONTAINS(ous.service_area_shape, SDO_GEOMETRY(2001, 8307, sdo_point_type(addresses.lat, addresses.lng, NULL), NULL, NULL) ) AS in_zone')
The computation in the select should still work as the related tables are joined together but there should be no column aliasing present.
Of course this approach means that you'll get three queries instead of just one but unless you return a huge amount of records, the following two queries run by the includes clause should be very fast as they simply load the relevant records using foreign keys.

This monkey patch helped @Envek:
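module ActiveRecord
  # Give every record an accessor that holds the raw result row.
  Base.send :attr_accessor, :_row_
  module Associations
    class JoinDependency
      # Referencing JoinBase makes sure JoinPart is loaded before it is reopened.
      JoinBase && class JoinPart
        # Wrap instantiate so the raw row is remembered on the new record.
        def instantiate_with_row(row, *args)
          instantiate_without_row(row, *args).tap { |i| i._row_ = row }
        end
        alias_method_chain :instantiate, :row
      end
    end
  end
end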
then it is possible to do:
ServiceObject
.joins([{service_days: :ou}, :address])
.eager_load(:address, :service_days)
.where(ous: {id: OU.where(sector_code: 5)})
.select('SDO_CONTAINS(ous.service_area_shape, SDO_GEOMETRY(2001, 8307, sdo_point_type(addresses.lat, addresses.lng, NULL), NULL, NULL) ) AS in_zone')
.first
._row_['in_zone']
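Note that alias_method_chain was deprecated in Rails 5.0 and removed in 5.1, so on newer versions the same wrapping idea would have to be written with Module#prepend. Below is a minimal pure-Ruby sketch of that pattern. FakeJoinPart and KeepRow are hypothetical stand-ins invented for this example, not Rails classes, and the sketch does not verify that JoinPart#instantiate keeps the same signature in later Rails versions.

```ruby
# Hypothetical stand-in for the class being patched (not a Rails class).
class FakeJoinPart
  Record = Struct.new(:attributes, :_row_)

  # Pretend only the aliased t0_* columns end up on the record.
  def instantiate(row)
    Record.new(row.select { |key, _| key.start_with?("t0_") })
  end
end

# The patch: call the original method via super, then remember the raw row.
module KeepRow
  def instantiate(row, *args)
    super.tap { |record| record._row_ = row }
  end
end

FakeJoinPart.prepend(KeepRow)

record = FakeJoinPart.new.instantiate({ "in_zone" => "TRUE", "t0_r0" => 1 })
puts record._row_["in_zone"]
```

The prepended module sits in front of the class in the method lookup chain, so super reaches the original instantiate without needing the _with_row / _without_row alias pair.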

Related

Why do database functions break the Rails query plan on includes?

A plain call works as intended. The resulting SQL uses LEFT OUTER JOINs to link tables as desired.
> Subscription.includes(plan: { student: :person }).order('persons.name')
=> #<ActiveRecord::Relation ... >
If a function is inserted into the order clause, Rails seems to go off track in its query plan: the resulting SQL does not join the tables and therefore raises this error:
> Subscription.includes(plan: { student: :person }).order('unaccent(persons.name)')
=> ActiveRecord::StatementInvalid (PG::UndefinedTable: ERROR: missing FROM-clause entry for table "persons")
LINE 1: ...subscriptions".* FROM "subscriptions" ORDER BY unaccent(persons.na...
^
: SELECT "subscriptions".* FROM "subscriptions" ORDER BY unaccent(persons.name) LIMIT $1
The same does not happen with joins, which executes the command but links the tables with INNER JOINs (not exactly the intended relationship):
> Subscription.joins(plan: { student: :person }).order('unaccent(persons.name)')
=> #<ActiveRecord::Relation ... > # GOOD
As a newbie here, what am I missing?
(Re-written from my comment above)
You need to add .references(:persons) to the query.
Rails tries to be "lazy" and avoid performing unnecessary JOINs when using includes. The usage of this unaccent SQL function is throwing off the ActiveRecord query planner - so you need to be more explicit, thus forcing rails to perform the JOIN.
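One plausible way to picture why the plain order string works while the wrapped one does not: Rails can only infer a table reference from a string that starts with a "table." prefix. The sketch below is pure Ruby and only illustrates that idea. The TABLE_PREFIX pattern and the inferred_table helper are assumptions made for this example; the exact check inside ActiveRecord may differ between versions.

```ruby
# Assumed, simplified pattern: a leading "table." prefix, optionally quoted.
TABLE_PREFIX = /\A\W?(\w+)\W?\./

# Returns the table name a leading "table.column" prefix points at, or nil.
def inferred_table(order_clause)
  order_clause[TABLE_PREFIX, 1]
end

puts inferred_table("persons.name").inspect            # "persons"
puts inferred_table("unaccent(persons.name)").inspect  # nil
```

When nothing can be inferred, the table has to be named explicitly, which is what .references(:persons) does.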
See the documentation on "conditions":
If you want to add conditions to your included models you’ll have to explicitly reference them. For example:
User.includes(:posts).where('posts.name = ?', 'example')
Will throw an error, but this will work:
User.includes(:posts).where('posts.name = ?', 'example').references(:posts)
Note that includes works with association names while references needs the actual table name.

Query on ruby on Rails

How do you write this query in Ruby on Rails, or translate it to ActiveRecord?
SELECT
orders.item_total,
orders.total,
payments.created_at,
payments.updated_at
FROM
public.payments,
public.orders,
public.line_items,
public.variants
WHERE
payments.order_id = orders.id AND
orders.id = line_items.order_id AND
This works on Postgres, but I'm new to RoR and I'm having difficulty expressing this query.
So far this is what I have:
Order.joins(:payments,:line_items,:variants).where(payments:{order_id: [Order.ids]}, orders:{id:LineItem.orders_id}).distinct.pluck(:email, :id, "payments.created_at", "payments.updated_at")
I went through a lot of references before asking; here are the links:
How to combine two conditions in a where clause?
Rails PG::UndefinedTable: ERROR: missing FROM-clause entry for table
Rails ActiveRecord: Pluck from multiple tables with same column name
ActiveRecord find and only return selected columns
https://guides.rubyonrails.org/v5.2/active_record_querying.html
From all those links I produced this code, which works for testing:
Spree::Order.joins(:payments,:line_items,:variants).where(id: [Spree::Payment.ids]).distinct.pluck(:email, :id)
But when I try to add more conditions and pluck a specific column from a different table, it gives me an error.
Update
So I'm using Ransack to query, and I produced this code:
@search = Spree::Order.ransack(
  orders_gt: params[:q][:created_at_gt],
  orders_lt: params[:q][:created_at_lt],
  payments_order_id_in: [Spree::Order.ids],
  payments_state_eq: 'completed',
  orders_id_in: [Spree::LineItem.all.pluck(:order_id)],
  variants_id_in: [Spree::LineItem.ids]
)
@payment_report = @search.result
  .includes(:payments, :line_items, :variants)
  .joins(:line_items, :payments, :variants).select('payments.response_code, orders.number, payments.number')
I don't get an error when I remove the select part, but I need to get those specific columns. Is there a way?
You just have to join the tables and then select the columns you want:
Spree::Order.joins(:payments, :line_items).pluck("spree_orders.total, spree_orders.item_total, spree_payments.created_at, spree_payments.updated_at")
or
Spree::Order.joins(:payments, :line_items).select("spree_orders.total, spree_orders.item_total, spree_payments.created_at, spree_payments.updated_at")
That is roughly equivalent to this query (joins generates INNER JOINs):
SELECT spree_orders.total,
spree_orders.item_total,
spree_payments.created_at,
spree_payments.updated_at
FROM "spree_orders"
INNER JOIN "spree_payments" ON "spree_payments"."order_id" = "spree_orders"."id"
INNER JOIN "spree_line_items" ON "spree_line_items"."order_id" = "spree_orders"."id"
You can use the select_all method. It returns an instance of the ActiveRecord::Result class, and calling to_hash on this object returns an array of hashes where each hash represents a record.
Order.connection.select_all("SELECT
orders.item_total,
orders.total,
payments.created_at,
payments.updated_at
FROM
public.payments,
public.orders,
public.line_items,
public.variants
WHERE
payments.order_id = orders.id AND
orders.id = line_items.order_id").to_hash

How to get a most recent value group by year by using SQL

I have a Company model that has_many Statement.
class Company < ActiveRecord::Base
has_many :statements
end
I want to get the statements that have the latest date field, grouped by the fiscal_year_end field.
I implemented the function like this:
c = Company.first
c.statements.to_a.group_by{|s| s.fiscal_year_end }.map{|k,v| v.max_by(&:date) }
It works OK, but if possible I want to use an ActiveRecord query (SQL), so that I don't need to load unnecessary instances into memory.
How can I write it by using SQL?
select t.username, t.date, t.value
from MyTable t
inner join (
select username, max(date) as MaxDate
from MyTable
group by username
) tm on t.username = tm.username and t.date = tm.MaxDate
For these kinds of things, I find it helpful to get the raw SQL working first, and then translate it into ActiveRecord afterwards. It sounds like a textbook case of GROUP BY:
SELECT fiscal_year_end, MAX(date) AS max_date
FROM statements
WHERE company_id = 1
GROUP BY fiscal_year_end
Now you can express that in ActiveRecord like so:
c = Company.first
c.statements.
group(:fiscal_year_end).
order(nil). # might not be necessary, depending on your association and Rails version
select("fiscal_year_end, MAX(date) AS max_date")
The reason for order(nil) is to prevent ActiveRecord from adding ORDER BY id to the query. Rails 4+ does this automatically. Since you aren't grouping by id, it will cause the error you're seeing. You could also order(:fiscal_year_end) if that is what you want.
That will give you a bunch of Statement objects. They will be read-only, and every attribute will be nil except for fiscal_year_end and the magically-present new field max_date. These instances don't represent specific statements, but statement "groups" from your query. So you can do something like this:
- @statements_by_fiscal_year_end.each do |s|
  %tr
    %td= s.fiscal_year_end
    %td= s.max_date
Note there is no n+1 query problem here, because you fetched everything you need in one query.
If you decide that you need more than just the max date, e.g. you want the whole statement with the latest date, then you should look at your options for the greatest n per group problem. For raw SQL I like LATERAL JOIN, but the easiest approach to use with ActiveRecord is DISTINCT ON.
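The difference between the two result shapes can be seen on in-memory data. This is a pure-Ruby sketch with a hypothetical Statement struct and made-up sample values, meant only to show what each kind of query returns, not how to run it against the database.

```ruby
require "date"

# Hypothetical stand-in for the Statement model, with invented sample data.
Statement = Struct.new(:id, :fiscal_year_end, :date)

SAMPLE = [
  Statement.new(1, 2014, Date.new(2015, 1, 10)),
  Statement.new(2, 2014, Date.new(2015, 3, 5)),
  Statement.new(3, 2015, Date.new(2016, 2, 1))
].freeze

# Roughly what GROUP BY + MAX(date) gives you: one maximum per group.
def max_date_per_year(statements)
  statements.group_by(&:fiscal_year_end)
            .transform_values { |group| group.map(&:date).max }
end

# Greatest-n-per-group with n = 1: the whole latest row of each group.
def latest_statement_per_year(statements)
  statements.group_by(&:fiscal_year_end)
            .map { |_year, group| group.max_by(&:date) }
end

puts max_date_per_year(SAMPLE).inspect
puts latest_statement_per_year(SAMPLE).map(&:id).inspect
```

The first helper loses the identity of the row that carried the maximum, while the second keeps it; that is exactly the gap the greatest-n-per-group techniques fill on the SQL side.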
Oh one more tip: For debugging weird errors, I find it helpful to confirm what SQL ActiveRecord is trying to use. You can use to_sql to get that:
c = Company.first
puts c.statements.
group(:fiscal_year_end).
select("fiscal_year_end, MAX(date) AS max_date").
to_sql
In that example, I'm leaving off order(nil) so you can see that ActiveRecord is adding an ORDER BY clause you don't want.
For example, if you want to get all statements grouped by the start of the month, you can use this:
@company = Company.first
@statements = @company.statements.order('due_at, id').limit(50)
Then group them as you want:
@monthly_statements = @statements.group_by { |statement| statement.due_at.beginning_of_month }
Building upon Bharat's answer, you can do this type of query in Rails using find_by_sql in this way:
Statement.find_by_sql ["SELECT t.* FROM statements t INNER JOIN (
SELECT fiscal_year_end, MAX(date) AS MaxDate FROM statements GROUP BY fiscal_year_end
) tm ON t.fiscal_year_end = tm.fiscal_year_end AND
t.date = tm.MaxDate WHERE t.company_id = ?", company.id]
Note the last where part to make sure the statements belong to a specific company instance, and that this is called from the class. I haven't tested this with the array form, but I believe you can turn this into a scope and use it like this:
# In Statement model
scope :latest_from_fiscal_year, lambda { |enterprise_id|
  find_by_sql([..., enterprise_id]) # "..." stands for the query above
}
# Wherever you need these statements for a particular company
company = Company.find(params[:id])
latest_statements = Statement.latest_from_fiscal_year(company.id)
Note that if you somehow need the latest statements for all companies, this will most likely leave you with an N+1 queries problem. But that is a beast for another day.
Note: If anyone else has a way to have this query work on the association without using the last where part (company.statements.latest_from_year and such) let me know and I'll edit this, in my case in rails 3 it just pulled em from the whole table without filtering.

Rails 4: Joins in ActiveRecord relation lambdas not included when doing a join

I'm creating a Revision system for a project where a base table contains the current revision for a given id, and a revision table contains the data tagged with a given revision, eg:
foos
- id
- revision
foo_revisions
- foo_id
- revision
{data}
For relations between these I have used the lambda syntax to specify conditions on the relation like this:
class Article
belongs_to :product, ->{ joins(:base).where("products.revision = product_revisions.revision") }, :class_name=> "Product::Revision", :primary_key => :product_id
Where article is not revisioned, but product is (Product::Revision is the model that contains the actual data, and is a ActiveRecord::Base mapping to product_revisions, while Product maps to products).
The :base relation is from Product::Revision to Product
This works fine for the normal things like
a = Article.find(..)
a.product
which produces the SQL (a.product only)
SELECT `product_revisions`.* FROM `product_revisions`
INNER JOIN `products` ON `products`.`id` = `product_revisions`.`product_id`
WHERE `product_revisions`.`product_id` = 406
AND (products.revision = product_revisions.revision) ORDER BY `product_revisions`.`id` ASC LIMIT 1
But when I do Article.joins(:product) it fails, since it doesn't join in the products table:
SELECT `articles`.* FROM `articles` INNER JOIN `product_revisions`
ON `product_revisions`.`product_id` = `articles`.`product_id`
AND (products.revision = product_revisions.revision)
with the error:
Mysql2::Error: Unknown column 'products.revision' in 'on clause'
To me it seems like ActiveRecord simply ignores the joins in the lambda when it builds the joins query, which seems stupid. Is this a bug, or is there a better/correct way to do this?
I've encountered a similar problem. Any joins specified in a lambda for a has_many are silently ignored.
I found this in the Rails issues that solves the problem for me:
https://github.com/rails/rails/pull/11518
The author mentions the problem occurring when there is an order clause but I think this muddies the water - it makes no difference whether there is an order clause or not.
I cannot say whether this is a bug or intended behaviour but I suspect the former.

How do I get Rails ActiveRecord to generate optimized SQL?

Let's say that I have 4 models which are related in the following ways:
Schedule has foreign key to Project
Schedule has foreign key to User
Project has foreign key to Client
In my Schedule#index view I want the most optimized SQL so that I can display links to the Schedule's associated Project, Client, and User. So, I should not pull all of the columns for the Project, Client, and User; only their IDs and Name.
If I were to manually write the SQL it might look like this:
select
s.id,
s.schedule_name,
s.schedule_type,
s.project_id,
p.name project_name,
p.client_id client_id,
c.name client_name,
s.user_id,
u.login user_login,
s.created_at,
s.updated_at,
s.data_count
from
Users u inner join
Clients c inner join
Schedules s inner join
Projects p
on p.id = s.project_id
on c.id = p.client_id
on u.id = s.user_id
order by
s.created_at desc
My question is: What would the ActiveRecord code look like to get Rails 3 to generate that SQL? For example, something like:
@schedules = Schedule. # ?
I already have the associations setup in the models (i.e. has_many / belongs_to).
I think this will get you (or at least help you get) what you're looking for:
Schedule.select("schedules.id, schedules.schedule_name, projects.name as project_name").joins(:user, :project=>:client).order("schedules.created_at DESC")
should yield:
SELECT schedules.id, schedules.schedule_name, projects.name as project_name FROM `schedules` INNER JOIN `users` ON `users`.`id` = `schedules`.`user_id` INNER JOIN `projects` ON `projects`.`id` = `schedules`.`project_id` INNER JOIN `clients` ON `clients`.`id` = `projects`.`client_id` ORDER BY schedules.created_at DESC
The main problem I see in your approach is that you're looking for schedule objects but basing your initial "FROM" clause on "User" and your associations given are also on Schedule, so I built this solution based on the plain assumption that you want schedules!
I also didn't include all of your selects to save some typing, but you get the idea. You will simply have to add each one qualified with its full table name.
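If typing out every qualified column gets tedious, the select list can be generated. The following is a small pure-Ruby sketch, assuming the conventional plural table names (schedules, projects, clients, users) and the columns listed in the question; SELECTED and select_list are hypothetical names invented for this example.

```ruby
# Columns wanted per table; a nil alias means "keep the column name".
SELECTED = {
  "schedules" => { "id" => nil, "schedule_name" => nil, "schedule_type" => nil,
                   "project_id" => nil, "user_id" => nil, "created_at" => nil,
                   "updated_at" => nil, "data_count" => nil },
  "projects"  => { "name" => "project_name", "client_id" => "client_id" },
  "clients"   => { "name" => "client_name" },
  "users"     => { "login" => "user_login" }
}.freeze

# Build a comma-separated list of fully qualified (and optionally aliased) columns.
def select_list(selected)
  selected.flat_map do |table, columns|
    columns.map do |column, column_alias|
      qualified = "#{table}.#{column}"
      column_alias ? "#{qualified} AS #{column_alias}" : qualified
    end
  end.join(", ")
end

puts select_list(SELECTED)
```

The resulting string could then be passed to select in place of the hand-written list, for example Schedule.select(select_list(SELECTED)) followed by the same joins and order calls as above.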
