Why do database functions break the Rails query plan on includes? - ruby-on-rails

A plain call works as intended. The resulting SQL uses LEFT OUTER JOINs to link tables as desired.
> Subscription.includes(plan: { student: :person }).order('persons.name')
=> #<ActiveRecord::Relation ... >
If a function is inserted into the order clause, Rails seems to go off track in its query plan: the resulting SQL no longer joins the tables and therefore raises this error:
> Subscription.includes(plan: { student: :person }).order('unaccent(persons.name)')
=> ActiveRecord::StatementInvalid (PG::UndefinedTable: ERROR: missing FROM-clause entry for table "persons")
LINE 1: ...subscriptions".* FROM "subscriptions" ORDER BY unaccent(persons.na...
^
: SELECT "subscriptions".* FROM "subscriptions" ORDER BY unaccent(persons.name) LIMIT $1
The same does not happen with joins, which runs the query but links the tables with INNER JOINs (not exactly the intended relationship):
> Subscription.joins(plan: { student: :person }).order('unaccent(persons.name)')
=> #<ActiveRecord::Relation ... > # GOOD
As a newbie here, what am I missing?

(Re-written from my comment above)
You need to add .references(:persons) to the query.
Rails tries to be "lazy" and avoid performing unnecessary JOINs when using includes. The usage of this unaccent SQL function is throwing off the ActiveRecord query planner - so you need to be more explicit, thus forcing rails to perform the JOIN.
See the documentation on "conditions":
If you want to add conditions to your included models you’ll have to
explicitly reference them. For example:
User.includes(:posts).where('posts.name = ?', 'example')
Will throw an error, but this will work:
User.includes(:posts).where('posts.name = ?', 'example').references(:posts)
Note that includes works with
association names while references needs the actual table name.
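A likely reason the plain order('persons.name') gets its JOIN while the unaccent(...) version does not: as far as I recall, ActiveRecord inspects string ORDER arguments for a leading table name and adds that table to references on its own. The sketch below reproduces that check in plain Ruby. The method name is hypothetical and the pattern is quoted from memory, so treat both as assumptions rather than the exact Rails source.

```ruby
# Sketch of the table-name sniffing applied to string ORDER BY arguments.
# ASSUMPTION: the pattern is recalled from ActiveRecord's query methods;
# the constant and method names are made up for this example.
TABLE_PREFIX = /^\W?(\w+)\W?\./

# Returns the table name found at the very start of the string, or nil.
def table_referenced_by(order_sql)
  order_sql[TABLE_PREFIX, 1]
end

table_referenced_by('persons.name')            # => "persons"
table_referenced_by('"persons"."name" DESC')   # => "persons"
table_referenced_by('unaccent(persons.name)')  # => nil
```

Because the function call hides the table name from a check like this, nothing is added to references, so includes falls back to separate preload queries and the ORDER BY has no persons table to point at. Calling .references(:persons) supplies the reference by hand.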

Related

Check if ActiveRecord::Relation already includes a JOIN

I'm inside a method that adds a filter (user.type) to my query/relation.
Sometimes, if grouping by the user (which needs an INNER JOIN to the users table, added in another module) is selected before filtering, I receive an error:
PostgreSQL: PG::DuplicateAlias: ERROR: table name "users" specified more than once
Before the error happens, the JOIN is already in the query:
$ pry> relation.to_sql
SELECT \"posts\".* FROM \"posts\"
INNER JOIN users ON users.id = posts.user_id
WHERE \"posts\".\"created_at\" BETWEEN '2019-05-01 00:00:00'
AND '2020-05-01 23:59:59' AND \"users\".\"type\" = 'Guest'"
I want to fix it by checking whether the table is already joined inside my ActiveRecord::Relation object. I added:
def join_users
return relation if /JOIN users/.match? relation.to_sql
relation.joins('LEFT JOIN users ON users.id = posts.user_id')
end
This solution works, but I wonder: is there a better way to check whether a JOIN is already in the relation?
Perhaps you can use joins_values. It isn't documented, but it is a public method on ActiveRecord::Relation that returns an array of the arguments passed to joins (association names or raw SQL strings) for the current query:
Post.joins(:user).joins_values # [:user]
Post.all.joins_values # []
For a simple join such as
Post.joins(:user)
you can find it via joins_values:
Post.joins(:user).joins_values # [:user]
If the post is joined with a left join,
Post.left_joins(:user)
you have to look in left_outer_joins_values instead. In this case joins_values comes back empty:
Post.left_joins(:user).joins_values # []
so check the other list:
Post.left_joins(:user).left_outer_joins_values # [:user]
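Since the join in the question was added as a raw SQL string, a check has to cope with both association names and SQL fragments. The following is a minimal sketch in plain Ruby that works on the arrays returned by joins_values and left_outer_joins_values. The helper name and its keyword arguments are hypothetical, and the caller has to say which association names map to the table.

```ruby
# Sketch: is `table` already joined, judging by a list of join values?
# The list may hold association names (symbols), nested hashes/arrays of
# them, or raw SQL strings.
# ASSUMPTION: `joined?` and its keywords are invented for this example.
def joined?(join_values, table:, associations: [])
  join_values.any? do |value|
    case value
    when Symbol then associations.include?(value)
    when Hash
      # Check both the keys and the nested values, e.g. { plan: { student: :person } }
      value.any? { |k, v| joined?([k, v].flatten, table: table, associations: associations) }
    when Array then joined?(value, table: table, associations: associations)
    when String
      # Matches JOIN users, JOIN "users" and JOIN `users`, but not JOIN users_roles
      value.match?(/\bJOIN\s+["`]?#{Regexp.escape(table.to_s)}["`]?\b/i)
    else false
    end
  end
end

joined?([:user], table: 'users', associations: [:user])                      # => true
joined?(['INNER JOIN users ON users.id = posts.user_id'], table: 'users')    # => true
joined?([], table: 'users', associations: [:user])                           # => false
```

With a real relation you would presumably pass relation.joins_values + relation.left_outer_joins_values as the first argument.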

StatementInvalid Rails Query

I've got the following query that works:
jobs = current_location.jobs.includes(:customer).all.where(complete: complete)
However, when I add a where clause that filters on the first name in the customers table, I get an error.
jobs = current_location.jobs.includes(:customer).all.where(complete: complete).where("customers.first_name = ?", "Bob")
Here is the error:
PG::UndefinedTable: ERROR: missing FROM-clause entry for table "customers"
LINE 1: ...bs"."complete" = $2 AND "jobs"."status" = $3 AND (customers....
^
: SELECT "jobs".* FROM "jobs" INNER JOIN "jobs_users" ON "jobs"."id" = "jobs_users"."job_id" WHERE "jobs_users"."user_id" = $1 AND "jobs"."complete" = $2 AND "jobs"."status" = $3 AND (customers.last_name = 'Bob') ORDER BY "jobs"."start" DESC LIMIT $4 OFFSET $5
The current_location method:
def current_location
return current_user.locations.find_by(id: cookies[:current_location])
end
Location Model
has_many :jobs
has_and_belongs_to_many :customers
Job Model
belongs_to :location
belongs_to :customer
Customer Model
has_many :jobs
has_and_belongs_to_many :locations
How can I fix this issue?
includes only joins the table if the query references the association.
When using includes, you can ensure a reference to the association in two ways:
You can use the references method, which joins the table whether or not there are any query conditions. (If you MUST use raw SQL as shown in your question, this is the method you need.) For example:
current_location.jobs
.includes(:customer)
.references(:customer)
Or you can use the hash finder version of where. (Note that when referencing an association in the where clause you must use the table name, in this case customers, not the association name customer.)
current_location.jobs
.includes(:customer)
.where(customers: {first_name: "Bob" })
Both of these will eager load the customer for the jobs referenced.
The first option (references) will OUTER JOIN the customers table, so all the jobs are loaded even if they have no customer, as long as no query conditions reference the customers table.
The second option (using where) will also OUTER JOIN the customers table, but given the query condition on the customers table it will act very much like an INNER JOIN.
If you only need to search the jobs based on customer information, then joins is a better choice, as it creates an INNER JOIN with the customers table but does not try to load any of the customer data in the query, e.g.
current_location.jobs.joins(:customer).where(customers: {first_name: "Bob" })
joins will always include the associated table regardless of a reference in the query.
Sidenote: the all in both your queries is completely unnecessary.
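To see why the hash form of where triggers the JOIN while a raw SQL string does not, here is a rough sketch of how table references can be derived from a where argument. It is modeled on my recollection of ActiveRecord's predicate builder and is not the actual Rails code; the method name is hypothetical.

```ruby
# Sketch: which tables a where() argument would add to the relation's references.
# ASSUMPTION: simplified from memory; `references_from_where` is a made-up name.
def references_from_where(conditions)
  # Raw SQL strings are not inspected, so they contribute no references.
  return [] unless conditions.is_a?(Hash)

  conditions.map do |key, value|
    if value.is_a?(Hash)
      key.to_s                     # where(customers: { first_name: "Bob" })
    elsif key.to_s.include?('.')
      key.to_s.split('.').first    # where("customers.first_name" => "Bob")
    end
  end.compact
end

references_from_where(customers: { first_name: "Bob" })   # => ["customers"]
references_from_where("customers.first_name" => "Bob")    # => ["customers"]
references_from_where("customers.first_name = 'Bob'")     # => []
references_from_where(complete: true)                     # => []
```

An empty result is the situation in the question: includes has nothing telling it to JOIN, so the customers table never appears in the FROM clause unless references (or joins) is added explicitly.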
includes(:customer) does not necessarily join the customers table into the SQL query. You need to use joins(:customer) to force Rails to join the customers table into the SQL query and make it available to query conditions.
jobs = current_location.jobs
.joins(:customer)
.includes(:customer)
.where(complete: complete)
.where(customers: { first_name: 'Bob' })

Using joins to query by attribute on associated record

I currently have this horribly written query:
membership_ids = User.where(skip_membership_renewal: true).includes(:memberships).map(&:membership_ids).flatten
Membership.where(id: membership_ids)
I have been trying to use joins so that I can just make one query.
Membership.includes(:user).where("user.skip_membership_renewal", true)
However, this doesn't work since I keep getting the error: ActiveRecord::StatementInvalid: PG::UndefinedTable: ERROR.
My relationship is:
User has_many :memberships
Membership belongs_to :user
What am I doing incorrectly?
You just have a pluralization error. In Rails, you define models as singular (User) and the database table is pluralized (users).
Membership.includes(:user).where("users.skip_membership_renewal" => true)
That said, you don't need to resort to using SQL literals for such a simple case. There are a bunch of other ways of assembling this query, like the scope option David Aldridge suggested, or either of these:
non_renewing_users = User.where(skip_membership_renewal: true)
Membership.joins(:user).merge(non_renewing_users)
Membership.where(user: non_renewing_users)
What's more, both of these execute only a single SQL query on most adapters because they use subqueries:
SELECT "memberships".*
FROM "memberships"
WHERE "memberships"."user_id" IN (
SELECT "users"."id" FROM "users"
WHERE "users"."skip_membership_renewal" = true
)
You can probably aim to use:
Membership.where(:user => User.skip_membership_renewal)
Add a scope onto User ...
def self.skip_membership_renewal
where(skip_membership_renewal: true)
end
You should find that it runs as a single query.

Add a computed column to a multi-table select clause with eager_load in Ruby on Rails ActiveRecord

I have a query with a lot of joins, and I'm eager loading some of the associations at the same time. I also need to compute a value as an attribute of one of the models.
So I'm trying this code:
ServiceObject
.joins([{service_days: :ou}, :address])
.eager_load(:address, :service_days)
.where(ous: {id: OU.where(sector_code: 5)})
.select('SDO_CONTAINS(ous.service_area_shape, SDO_GEOMETRY(2001, 8307, sdo_point_type(addresses.lat, addresses.lng, NULL), NULL, NULL) ) AS in_zone')
Here the SQL function call in select operates on data from the associated addresses and ous tables.
I get the following SQL (so my in_zone column is calculated and returned as the first column, before the other columns for all eager-loaded models):
SELECT SDO_CONTAINS(ous.service_area_shape, SDO_GEOMETRY(2001, 8307, sdo_point_type(addresses.lat, addresses.lng, NULL), NULL, NULL) ) AS in_zone, "SERVICE_OBJECTS"."ID" AS t0_r0, "SERVICE_OBJECTS"."TYPE" AS t0_r1, <omitted for brevity> AS t2_r36 FROM "SERVICE_OBJECTS" INNER JOIN "SERVICE_DAYS" ON "SERVICE_DAYS"."SERVICE_OBJECT_ID" = "SERVICE_OBJECTS"."ID" INNER JOIN "OUS" ON "OUS"."ID" = "SERVICE_DAYS"."OU_ID" INNER JOIN "ADDRESSES" ON "ADDRESSES"."ID" = "SERVICE_OBJECTS"."ADDRESS_ID" WHERE "OUS"."ID" IN (SELECT "OUS"."ID" FROM "OUS" WHERE "OUS"."SECTOR_CODE" = :a1) [["sector_code", "5"]]
But it seems that in_zone isn't accessible from any of the models used in the query.
I need the calculated in_zone as an attribute of the ServiceObject model object. How can I accomplish that?
Ruby on Rails 4.2.6, Ruby 2.3.0, oracle_enhanced adapter 1.6.7, Oracle 12.1
I have successfully replicated your issue, and it turns out that this is a known issue in Rails. The problem is that when using eager_load, Rails maps the columns of all eager-loaded tables to table and column aliases of the form t0_r0, t0_r1, etc. (you can see these in the SQL that you pasted in the question). While doing that, it simply ignores the custom columns in the select, probably because it cannot determine which eager-loaded table it should attribute the custom column to. It is sad that this issue has been open for more than 2 years now...
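One way to picture why in_zone disappears is a toy model of the aliasing step. This is not Rails' actual code: the alias table, the sample row and the method name are all invented for illustration. The point is only that when each record is built by looking up its own known aliases, a column without a tN_rM alias is never read.

```ruby
# Toy model of eager_load column aliasing (NOT the real Rails implementation).
# ASSUMPTION: the alias layout, sample values and method name are made up.
ALIASES = {
  'service_objects' => { 't0_r0' => 'id', 't0_r1' => 'type' },
  'addresses'       => { 't1_r0' => 'id', 't1_r1' => 'lat' }
}.freeze

# Build one attribute hash per table by reading only that table's aliases.
def split_row(row, aliases = ALIASES)
  aliases.transform_values do |columns|
    columns.each_with_object({}) do |(col_alias, name), attrs|
      attrs[name] = row[col_alias]
    end
  end
end

row = { 'in_zone' => 'TRUE', 't0_r0' => 1, 't0_r1' => 'Depot', 't1_r0' => 7, 't1_r1' => 55.75 }
records = split_row(row)

records['service_objects']                             # => {"id"=>1, "type"=>"Depot"}
records.values.any? { |attrs| attrs.key?('in_zone') }  # => false
```

The value is still present in the raw row, which is presumably why keeping a handle on the row itself can recover it.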
Nevertheless I think I found a workaround. It seems that if you don't eager load the tables but manually join them (with joins), you can as well include them (with includes) and the custom columns will be returned as there will be no column aliasing taking place. The point is that you must not use associations in the joins clauses but you have to specify the joins yourself. Also note that you must specify all columns from the main table in the select manually too (see the service_objects.* in the select).
Try the following approach:
ServiceObject
.joins('INNER JOIN "SERVICE_DAYS" ON "SERVICE_DAYS"."SERVICE_OBJECT_ID" = "SERVICE_OBJECTS"."ID"')
.joins('INNER JOIN "OUS" ON "OUS"."ID" = "SERVICE_DAYS"."OU_ID"')
.joins('INNER JOIN "ADDRESSES" ON "ADDRESSES"."ID" = "SERVICE_OBJECTS"."ADDRESS_ID"')
.includes(:service_days, :address)
.where(ous: {id: OU.where(sector_code: 5)})
.select('service_objects.*, SDO_CONTAINS(ous.service_area_shape, SDO_GEOMETRY(2001, 8307, sdo_point_type(addresses.lat, addresses.lng, NULL), NULL, NULL) ) AS in_zone')
The computation in the select should still work as the related tables are joined together but there should be no column aliasing present.
Of course this approach means that you'll get three queries instead of just one but unless you return a huge amount of records, the following two queries run by the includes clause should be very fast as they simply load the relevant records using foreign keys.
This monkey patch helped @Envek:
module ActiveRecord
Base.send :attr_accessor, :_row_
module Associations
class JoinDependency
JoinBase && class JoinPart
def instantiate_with_row(row, *args)
instantiate_without_row(row, *args).tap { |i| i._row_ = row }
end; alias_method_chain :instantiate, :row
end
end
end
end
then it is possible to do:
ServiceObject
.joins([{service_days: :ou}, :address])
.eager_load(:address, :service_days)
.where(ous: {id: OU.where(sector_code: 5)})
.select('SDO_CONTAINS(ous.service_area_shape, SDO_GEOMETRY(2001, 8307, sdo_point_type(addresses.lat, addresses.lng, NULL), NULL, NULL) ) AS in_zone')
.first
._row_['in_zone']

Rails 4: Joins in ActiveRecord relation lambdas not included when doing a join

I'm creating a Revision system for a project where a base table contains the current revision for a given id, and a revision table contains the data tagged with a given revision, eg:
foos
- id
- revision
foo_revisions
- foo_id
- revision
{data}
For relations between these I have used the lambda syntax to specify conditions on the relation like this:
class Article
belongs_to :product, ->{ joins(:base).where("products.revision = product_revisions.revision") }, :class_name=> "Product::Revision", :primary_key => :product_id
Where article is not revisioned, but product is (Product::Revision is the model that contains the actual data, and is a ActiveRecord::Base mapping to product_revisions, while Product maps to products).
The :base relation is from Product::Revision to Product
This works fine for the normal things like
a = Article.find(..)
a.product
which produces the SQL (a.product only)
SELECT `product_revisions`.* FROM `product_revisions`
INNER JOIN `products` ON `products`.`id` = `product_revisions`.`product_id`
WHERE `product_revisions`.`product_id` = 406
AND (products.revision = product_revisions.revision) ORDER BY `product_revisions`.`id` ASC LIMIT 1
But when I do Article.joins(:product) it fails, since it doesn't join in the products table:
SELECT `articles`.* FROM `articles` INNER JOIN `product_revisions`
ON `product_revisions`.`product_id` = `articles`.`product_id`
AND (products.revision = product_revisions.revision)
with the error:
Mysql2::Error: Unknown column 'products.revision' in 'on clause'
To me it seems like ActiveRecord simply ignores the joins in the lambda when it builds the joins query, which seems stupid. Is this a bug, or is there a better/correct way to do this?
I've encountered a similar problem. Any joins specified in a lambda for a has_many are silently ignored.
I found this in the Rails issues that solves the problem for me:
https://github.com/rails/rails/pull/11518
The author mentions the problem occurring when there is an order clause but I think this muddies the water - it makes no difference whether there is an order clause or not.
I cannot say whether this is a bug or intended behaviour but I suspect the former.
