Order and limit clauses unexpectedly passed down to scope - ruby-on-rails

(The queries here make no semantic sense; I chose them for the sake of simplicity.)
Project.limit(10).where(id: Project.select(:id))
generates as expected the following SQL query:
SELECT
"projects".*
FROM
"projects"
WHERE
"projects"."id" IN (
SELECT
"projects"."id"
FROM
"projects"
) LIMIT 10
But if I define the following method in my Project class:
def self.my_filter
  where(id: Project.select(:id))
end
Then
Project.limit(10).my_filter
generates the following query
SELECT
"projects".*
FROM
"projects"
WHERE
"projects"."id" IN (
SELECT
"projects"."id"
FROM
"projects" LIMIT 10
) LIMIT 10
Note how the LIMIT 10 has now also been applied to the subquery.
The same issue occurs with an .order clause.
It happens with Rails 4.2.2 and Rails 3.2.20. It happens when the subquery is done on the Project table; it does not happen if the subquery is done on another table.
Is there something I'm doing wrong here or do you think it is a Rails bug?
A workaround is to build my_filter by explicitly adding limit(nil).reorder(nil) to it but it is hackish.
EDIT: another workaround is to append the limit clause after the my_filter scope: Project.my_filter.limit(10).

This is actually a feature. Class methods work similarly to scopes in ActiveRecord models.
If you want to remove the scopes that have already been applied, you can either use unscoped or call the method directly on the class rather than on a scope:
def self.my_filter
  unscoped.where(id: Project.select(:id))
end
# or
Project.my_filter

Your class method is applied in a way you may not be expecting:
Project.limit(10) # => a relation, not the Project class
.my_filter # => calling a class method on a relation
# Calling it on a relation effectively does the following:
# scoping { Project.my_filter }
# scoping is a wrapper provided by the relation
From: .../ruby-2.0.0-p598/gems/activerecord-4.1.6/lib/active_record/relation.rb # line 281:
Owner: ActiveRecord::Relation
Visibility: public
Signature: scoping()
Scope all queries to the current scope.
Comment.where(post_id: 1).scoping do
Comment.first
end
# => SELECT "comments".* FROM "comments"
# WHERE "comments"."post_id" = 1 ORDER BY "comments"."id" ASC LIMIT 1
Please check unscoped if you want to remove all previous scopes (including
the default_scope) during the execution of a block.
Inside that scoping block, every query on your class includes all the scoping rules of the relation it was called from, because scoping enforces that context. This is done so class methods can be properly chained while still retaining the correct self. The catch is that any query you build on the class inside that class method, such as a nested Project.select(:id), picks up the same scope.
In your first, "expected outcome" example, where is "natively" defined on relations, so no scope enforcement takes place: it's just not necessary.
As the documentation hints, you can use unscoped in your nested query, like so:
def self.my_filter
  where(id: Project.unscoped.select(:id))
end
...since that's where you need the "bare basis". Or, as you've already found out, you can just place limit at the end:
Project.my_filter.limit(10)
...here, at the time my_filter gets to execute, scoping will do effectively nothing: there will be no context built up to this point.
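The delegation described in this answer can be modelled in plain Ruby. This is only a toy sketch: ToyRelation and ToyProject are hypothetical stand-ins, not ActiveRecord classes, and the real implementation differs in many details. It shows why a class method called on a relation leaks that relation's LIMIT into a nested query on the same class, and why an unscoped subquery, or a trailing limit, avoids the leak.

```ruby
# Toy model only: none of these classes are ActiveRecord.
class ToyRelation
  attr_reader :klass, :limit_value, :subquery

  def initialize(klass, limit_value: nil, subquery: nil)
    @klass = klass
    @limit_value = limit_value
    @subquery = subquery
  end

  def limit(n)
    ToyRelation.new(klass, limit_value: n, subquery: subquery)
  end

  # Stands in for where(id: <relation>).
  def where_id_in(sub)
    ToyRelation.new(klass, limit_value: limit_value, subquery: sub)
  end

  def to_sql(columns = "*")
    sql = "SELECT #{columns} FROM projects"
    sql += " WHERE id IN (#{subquery.to_sql('id')})" if subquery
    sql += " LIMIT #{limit_value}" if limit_value
    sql
  end

  # Make this relation the class's current scope for the duration of the block.
  def scoping
    previous = klass.current_scope
    klass.current_scope = self
    yield
  ensure
    klass.current_scope = previous
  end

  # Class methods called on a relation run inside `scoping`.
  def method_missing(name, *args, &block)
    return super unless klass.respond_to?(name)
    scoping { klass.public_send(name, *args, &block) }
  end

  def respond_to_missing?(name, include_private = false)
    klass.respond_to?(name, include_private) || super
  end
end

class ToyProject
  class << self
    attr_accessor :current_scope

    # Every query on the class starts from the current scope, if any.
    def all
      current_scope || ToyRelation.new(self)
    end

    def limit(n)
      all.limit(n)
    end

    def unscoped
      ToyRelation.new(self)
    end

    # Nested query goes through `all`, so it inherits the current scope.
    def my_filter
      all.where_id_in(ToyProject.all)
    end

    # Nested query starts from a bare relation instead.
    def my_filter_with_unscoped_subquery
      all.where_id_in(ToyProject.unscoped)
    end
  end
end

puts ToyProject.limit(10).my_filter.to_sql
puts ToyProject.limit(10).my_filter_with_unscoped_subquery.to_sql
puts ToyProject.my_filter.limit(10).to_sql
```

In this sketch only the first call repeats LIMIT 10 inside the subquery; the other two keep the subquery bare, matching the two workarounds above.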

Related

Why do database functions break the Rails query plan on includes?

A plain call works as intended. The resulting SQL uses LEFT OUTER JOINs to link tables as desired.
> Subscription.includes(plan: { student: :person }).order('persons.name')
=> #<ActiveRecord::Relation ... >
If a function is used in the order clause, Rails seems to go off track in its query plan: the resulting SQL does not join the tables and therefore raises this error:
> Subscription.includes(plan: { student: :person }).order('unaccent(persons.name)')
=> ActiveRecord::StatementInvalid (PG::UndefinedTable: ERROR: missing FROM-clause entry for table "persons")
LINE 1: ...subscriptions".* FROM "subscriptions" ORDER BY unaccent(persons.na...
^
: SELECT "subscriptions".* FROM "subscriptions" ORDER BY unaccent(persons.name) LIMIT $1
The same does not apply to joins, which runs the command successfully, but links the tables with INNER JOINs (not exactly the intended relationship):
> Subscription.joins(plan: { student: :person }).order('unaccent(persons.name)')
=> #<ActiveRecord::Relation ... > # GOOD
As a newbie here, what am I missing?
(Re-written from my comment above)
You need to add .references(:persons) to the query.
Rails tries to be "lazy" and avoid performing unnecessary JOINs when using includes. The unaccent SQL function prevents ActiveRecord from detecting that the order clause refers to the persons table, so you need to be more explicit and force Rails to perform the JOIN.
See the documentation on "conditions":
If you want to add conditions to your included models you’ll have to
explicitly reference them. For example:
User.includes(:posts).where('posts.name = ?', 'example')
Will throw an error, but this will work:
User.includes(:posts).where('posts.name = ?', 'example').references(:posts)
Note that includes works with
association names while references needs the actual table name
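Why does the plain 'persons.name' string work while the function call does not? A plausible explanation, stated here as an assumption about Rails internals that you should verify against your version's source, is that Rails scans string order arguments with a pattern of roughly the shape below to find table names. The helper name referenced_table is hypothetical.

```ruby
# Assumption: Rails derives table references from string `order` arguments
# with a regular expression of roughly this shape. Check the source of the
# Rails version you run before relying on it.
TABLE_REFERENCE = /^\W?(\w+)\W?\./

# Hypothetical helper: returns the table name the pattern detects, or nil.
def referenced_table(order_clause)
  order_clause[TABLE_REFERENCE, 1]
end

p referenced_table('persons.name')            # table name is detected
p referenced_table('"persons"."name"')        # quoted form is detected too
p referenced_table('unaccent(persons.name)')  # nothing detected
```

Under that assumption, wrapping the column in a function hides the table name from the detection step, which is why an explicit .references(:persons) is needed.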

Differences between `any?` and `exists?` in Ruby on Rails?

In Ruby on Rails, there appear to be two methods to check whether a collection has any elements in it.
Namely, they are ActiveRecord::FinderMethods’ exists? and ActiveRecord::Relation’s any?. Running these in a generic query (Foo.first.bars.exists? and Foo.first.bars.any?) generated equivalent SQL. Is there any reason to use one over the other?
#any? and #exists? are very different beasts but query similarly.
Mainly, #any? accepts a block — and with this block it retrieves the records in the relation, calls #to_a, calls the block, and then hits it with Enumerable#any?. Without a block, it's the equivalent to !empty? and counts the records of the relation.
#exists? always queries the database and never relies on preloaded records, and sets a LIMIT of 1. It's much more performant vs #any?. #exists? also accepts an options param as conditions to apply as you can see in the docs.
ActiveRecord's any? is more limited in use than exists?. When you pass a block, any? checks whether any record in the relation matches the block's criteria. It is similar to Enumerable#any?, but don't confuse the two.
ActiveRecord's any? uses Enumerable#any? internally: when a block is given, it converts the relation to an array and passes the block on to the array's any?.
Without a block, it returns the negation of empty? applied to the relation. That is why you can check in either way whether a model has any records:
User.count # 0
User.any? # false
# SELECT 1 AS one FROM "users" LIMIT ? [["LIMIT", 1]]
User.exists? # false
# SELECT 1 AS one FROM "users" LIMIT ? [["LIMIT", 1]]
You could also check in the "any?" way, if some record attribute has a specific value:
Foo.any? { |foo| foo.title == 'foo' } # SELECT "posts".* FROM "posts"
Or, more efficiently, use exists? so that the filtering happens in SQL and the code is shorter:
Foo.exists?(title: 'foo') # SELECT 1 AS one FROM "posts" WHERE "posts"."title" = ? LIMIT ? [["title", "foo"], ["LIMIT", 1]]
ActiveRecord's exists? accepts several kinds of arguments and is intended to work at the SQL level, whereas any? converts the relation you're working with into an array when you pass a block.
The answers here are all based on very outdated versions. This commit from 2016 / ActiveRecord 5.1 changes empty?, which is called by any? when no block is passed, to call exists? when not preloaded. So in vaguely-modern Rails, the only difference when no block is passed is a few extra method calls and negations, and ignoring preloaded results.
I ran into a practical issue: exists? forces a DB query while any? doesn't.
user = User.new
user.skills = [Skill.new]
user.skills.any?
# => true
user.skills.exists?
# => false
Consider having factories and a before_create hook:
class User < ActiveRecord::Base
  has_many :skills
  before_create :ensure_skills

  def ensure_skills
    # Don't want users without skills
    errors.add(:skills, :invalid) if !skills.exists?
  end
end

FactoryBot.define do
  factory :user do
    skills { [association(:skill)] }
  end
end
create(:user) will fail, because at the time of before_create skills are not yet persisted. Using .any? will solve this.
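The difference can be illustrated with a toy model in plain Ruby. ToyAssociation is hypothetical and is not how ActiveRecord implements associations; it only separates "records held in memory" from "rows already in the database" to show why the two checks disagree for unsaved records.

```ruby
# Toy model only: ToyAssociation is hypothetical and is not ActiveRecord.
class ToyAssociation
  def initialize(persisted_rows)
    @persisted_rows = persisted_rows # stands in for rows in the database
    @target = []                     # in-memory records, saved or not
  end

  def <<(record)
    @target << record
    self
  end

  # Looks at the in-memory target, so unsaved records count.
  def any?
    !@target.empty?
  end

  # Consults only the "database", ignoring unsaved in-memory records.
  def exists?
    !@persisted_rows.empty?
  end
end

skills = ToyAssociation.new([]) # nothing persisted yet
skills << :unsaved_skill

p skills.any?
p skills.exists?
```

Here any? is true and exists? is false, mirroring the user.skills example above.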

ActiveRecord pluck to SQL

I know these two statements perform the same SQL:
Using select
User.select(:email)
# SELECT `users`.`email` FROM `users`
And using pluck
User.all.pluck(:email)
# SELECT `users`.`email` FROM `users`
Now I need to get the SQL statement derived from each method. Given that the select method returns an ActiveRecord::Relation, I can call the to_sql method. However, I cannot figure out how to get the SQL statement derived from a pluck operation on an ActiveRecord::Relation object, given that the result is an array.
Please, take into account that this is a simplification of the problem. The number of attributes plucked can be arbitrarily high.
Any help would be appreciated.
You cannot chain to_sql onto pluck, because pluck doesn't return an ActiveRecord::Relation. If you try, it raises an exception like this:
NoMethodError: undefined method `to_sql' for [[""]]:Array
I cannot figure out how to get the SQL statement derived from a pluck
operation on an ActiveRecord::Relation object, given that the result
is an array.
Well, as @cschroed pointed out in the comments, both select and pluck perform the same SQL query. The only difference is that pluck returns an array instead of an ActiveRecord::Relation. No matter how many attributes you pluck, the SQL statement will be the same as for select.
Example:
User.select(:first_name,:email)
#=> SELECT "users"."first_name", "users"."email" FROM "users"
Same for pluck too
User.all.pluck(:first_name,:email)
#=> SELECT "users"."first_name", "users"."email" FROM "users"
So, you just need to take the SQL statement returned by the select and believe that it is the same for the pluck. That's it!
You could monkey-patch the ActiveRecord::LogSubscriber class and provide a singleton that registers every Active Record query, even the ones that don't return ActiveRecord::Relation objects:
require 'singleton'

class QueriesRegister
  include Singleton

  # Lazily initialised list of recorded SQL strings.
  def queries
    @queries ||= []
  end

  # Discard everything recorded so far.
  def flush
    @queries = []
  end
end

module ActiveRecord
  class LogSubscriber < ActiveSupport::LogSubscriber
    # Record each SQL statement before returning the log line.
    def sql(event)
      QueriesRegister.instance.queries << event.payload[:sql]
      "#{event.payload[:name]} (#{event.duration}) #{event.payload[:sql]}"
    end
  end
end
Run your query:
User.all.pluck(:email)
Then, to retrieve the queries:
QueriesRegister.instance.queries
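The register itself can be tried out without Rails. The following is a minimal sketch of just the singleton part; the SQL string pushed into it is hand-written for illustration, not captured from a real query.

```ruby
require 'singleton'

class QueriesRegister
  include Singleton

  # Lazily initialised list of recorded SQL strings.
  def queries
    @queries ||= []
  end

  # Discard everything recorded so far.
  def flush
    @queries = []
  end
end

register = QueriesRegister.instance

# Hand-written SQL, standing in for what the log subscriber would record.
register.queries << 'SELECT "users"."email" FROM "users"'
recorded_before_flush = register.queries.dup

register.flush
p recorded_before_flush
p register.queries
```

Because the class is a singleton, the subscriber and your own code share the same list, so remember to flush between measurements.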

Using joins to query by attribute on associated record

I currently have this horribly written query:
membership_ids = User.where(skip_membership_renewal: true).includes(:memberships).map(&:membership_ids).flatten
Membership.where(id: membership_ids)
I have been trying to use joins so that I can just make one query.
Membership.includes(:user).where("user.skip_membership_renewal", true)
However, this doesn't work since I keep getting the error: ActiveRecord::StatementInvalid: PG::UndefinedTable: ERROR.
My relationship is:
User has_many :memberships
Membership belongs_to :user
What am I doing incorrectly?
You just have a pluralization error. In Rails, you define models as singular (User) and the database table is pluralized (users).
Membership.includes(:user).where("users.skip_membership_renewal" => true)
That said, you don't need to resort to using SQL literals for such a simple case. There are a bunch of other ways of assembling this query, like the scope option David Aldridge suggested, or either of these:
non_renewing_users = User.where(skip_membership_renewal: true)
Membership.joins(:user).merge(non_renewing_users)
Membership.where(user: non_renewing_users)
What's more is that these both only execute a single SQL query for most adapters because they use subqueries:
SELECT "memberships".*
FROM "memberships"
WHERE "memberships"."user_id" IN (
SELECT "users"."id" FROM "users"
WHERE "users"."skip_membership_renewal" = true
)
You can probably aim to use:
Membership.where(:user => User.skip_membership_renewal)
Add a scope onto User ...
def self.skip_membership_renewal
  where(skip_membership_renewal: true)
end
You should find that it runs as a single query.

Order by in Rails adding its own order

In the Rails tutorial, it says that I can simply use .order("something") and it'd work. However, when I write Course.order("name DESC") I get the query:
SELECT "courses".* FROM "courses" ORDER BY name ASC, name DESC
When I really want (notice that it's just ordered by name DESC):
SELECT "courses".* FROM "courses" ORDER BY name DESC
How could I force it through?
If you have a default order defined by a default_scope, you can override it by using reorder:
Order.reorder('name DESC')
UPDATE: Using unscoped will also work but be wary that this totally removes all scopes defined on the query. For example, the following will all return the same sql
Order.where('id IS NOT NULL').unscoped.order('name DESC')
Order.unscoped.order('name DESC')
Order.scope1.scope2.unscoped.order('name DESC')
current_user.orders.unscoped.order('name DESC')
It was because I was using default_scope in the model that caused it. Running this avoids the scoping:
Course.unscoped.order("name DESC")
Edit: for future reference, this is a code smell. default_scope should be used carefully, because developers often forget (months after writing the code) that it is set, and it comes back to bite them.
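The difference between appending and replacing an order can be sketched with a toy query builder. ToyQuery is hypothetical and not part of Rails; it assumes a default scope that orders by name ASC, as the generated SQL in the question suggests.

```ruby
# Toy model only: ToyQuery is hypothetical and not part of Rails.
class ToyQuery
  def initialize(table, orders = [])
    @table = table
    @orders = orders
  end

  # Appends to any existing ordering, as a default scope's order would be kept.
  def order(clause)
    ToyQuery.new(@table, @orders + [clause])
  end

  # Discards the existing ordering and starts over.
  def reorder(clause)
    ToyQuery.new(@table, [clause])
  end

  def to_sql
    sql = "SELECT \"#{@table}\".* FROM \"#{@table}\""
    sql += " ORDER BY #{@orders.join(', ')}" unless @orders.empty?
    sql
  end
end

# Assumed default scope: order by name ascending.
default_scoped = ToyQuery.new("courses").order("name ASC")

puts default_scoped.order("name DESC").to_sql
puts default_scoped.reorder("name DESC").to_sql
```

In this sketch, order yields the doubled "name ASC, name DESC" clause from the question, while reorder yields only "name DESC".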
