ActiveRecord select with OR and exclusive limit - ruby-on-rails

I have the need to query the database and retrieve the last 10 objects that are either active or declined. We use the following:
User.where(status: [:active, :declined]).limit(10)
Now we need to get the last 10 of each status (total of 20 users)
I've tried the following:
User.where(status: :active).limit(10).or(User.where(status: :declined).limit(10))
# SELECT "users".* FROM "users" WHERE ("users"."status" = $1 OR "users"."status" = $2) LIMIT $3
This does the same as the previous query and returns only 10 users, of mixed statuses.
How can I get the last 10 active users and the last 10 declined users with a single query?

I'm not sure that SQL allows doing what you want. First thing I would try would be to use a subquery, something like this:
class User < ApplicationRecord
scope :active, -> { where status: :active }
scope :declined, -> { where status: :declined }
scope :last_active_or_declined, -> {
where(id: active.limit(10).pluck(:id))
.or(where(id: declined.limit(10).pluck(:id)))
}
end
Then somewhere else you could just do
User.last_active_or_declined()
What this does is run two separate subqueries, one for each group of users, and then fetch the users whose ids fall in either group. You could probably even drop the pluck(:id) parts, since ActiveRecord is smart enough to add the proper select clause to your SQL, but I'm not 100% sure and I don't have a Rails project at hand to try it.

limit is one of the values that must match on both sides of #or. If you check the Rails code, the error is raised here when they differ:
def or!(other) # :nodoc:
incompatible_values = structurally_incompatible_values_for_or(other)
unless incompatible_values.empty?
raise ArgumentError, "Relation passed to #or must be structurally compatible. Incompatible values: #{incompatible_values}"
end
# more code
end
You can check which methods are restricted further down in the code here:
STRUCTURAL_OR_METHODS = Relation::VALUE_METHODS - [:extending, :where, :having, :unscope, :references]
def structurally_incompatible_values_for_or(other)
STRUCTURAL_OR_METHODS.reject do |method|
get_value(method) == other.get_value(method)
end
end
You can see in the Relation class here that limit is restricted:
SINGLE_VALUE_METHODS = [:limit, :offset, :lock, :readonly, :reordering,
:reverse_order, :distinct, :create_with, :skip_query_cache,
:skip_preloading]
So you will have to resort to raw SQL, I'm afraid.
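To see what that check does with equal and unequal limits, here is a toy version in plain Ruby. The hashes stand in for relation values, and STRUCTURAL_KEYS is a made-up subset for illustration, not the real Rails constant.

```ruby
# Toy stand-in: each "relation" is just a hash of query values.
# STRUCTURAL_KEYS is an illustrative subset, not the Rails list.
STRUCTURAL_KEYS = %i[limit offset order].freeze

# Mirrors the reject-if-equal comparison quoted above.
def incompatible_values(left, right)
  STRUCTURAL_KEYS.reject { |key| left[key] == right[key] }
end

active   = { where: "status = 'active'",   limit: 10 }
declined = { where: "status = 'declined'", limit: 10 }

same_limit      = incompatible_values(active, declined)
different_limit = incompatible_values(active, declined.merge(limit: 5))

puts same_limit.inspect       # []
puts different_limit.inspect  # [:limit]
```

With the same limit on both sides nothing is flagged, which is consistent with the question's query producing a single combined LIMIT rather than an error.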

I don't think you can do it with a single query, but you can do it with two queries, get the record ids, and then build a query using those record ids.
It's not ideal but as you're just plucking ids the impact isn't too bad.
user_ids = User.where(status: :active).limit(10).pluck(:id) + User.where(status: :declined).limit(10).pluck(:id)
users = User.where(id: user_ids)

I think you can use UNION. Install active_record_union and replace or with union:
User.where(status: :active).limit(10).union(User.where(status: :declined).limit(10))
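If you would rather not add a gem, the same idea can be sketched as hand-built SQL. The helper below is hypothetical: it assumes a users table with a status column, and that "last" means highest id. It returns the [sql, *binds] array form that find_by_sql accepts. If status is an integer-backed enum, the binds would need to be the stored integers instead of the names.

```ruby
# Hypothetical helper: builds one UNION ALL query with a separate
# LIMIT per status. Table and column names are assumptions.
def last_per_status_query(statuses, per_status: 10, table: "users")
  limit = Integer(per_status) # only a real integer reaches the SQL text
  subqueries = statuses.each_index.map do |i|
    "SELECT * FROM (SELECT * FROM #{table} WHERE status = ? " \
      "ORDER BY id DESC LIMIT #{limit}) AS t#{i}"
  end
  [subqueries.join(" UNION ALL "), *statuses]
end

sql, *binds = last_per_status_query(%w[active declined])
puts sql
# With Rails loaded, something like: User.find_by_sql([sql, *binds])
```

Each subquery carries its own LIMIT, so you get up to 10 rows per status. Note that find_by_sql returns an array, not a relation.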

Related

Differences between `any?` and `exists?` in Ruby on Rails?

In Ruby on Rails, there appear to be two methods to check whether a collection has any elements in it.
Namely, they are ActiveRecord::FinderMethods’ exists? and ActiveRecord::Relation’s any?. Running these in a generic query (Foo.first.bars.exists? and Foo.first.bars.any?) generated equivalent SQL. Is there any reason to use one over the other?
#any? and #exists? are very different beasts but query similarly.
Mainly, #any? accepts a block — and with this block it retrieves the records in the relation, calls #to_a, calls the block, and then hits it with Enumerable#any?. Without a block, it's the equivalent to !empty? and counts the records of the relation.
#exists? always queries the database and never relies on preloaded records, and sets a LIMIT of 1. It's much more performant vs #any?. #exists? also accepts an options param as conditions to apply as you can see in the docs.
ActiveRecord#any? has a narrower use than ActiveRecord#exists?. When you pass a block, any? checks whether any of the loaded records matches the criteria. This is similar to Enumerable#any?, but don't confuse the two.
ActiveRecord#any? uses Enumerable#any? inside its own definition: when a block is given, it converts the relation to an array and yields each record to the block, a "hand-made" version of Ruby's any?.
The else branch returns the negation of empty? applied to the relation. That's why you can check either way whether a model has any records:
User.count # 0
User.any? # false
# SELECT 1 AS one FROM "users" LIMIT ? [["LIMIT", 1]]
User.exists? # false
# SELECT 1 AS one FROM "users" LIMIT ? [["LIMIT", 1]]
You could also check in the "any?" way, if some record attribute has a specific value:
Foo.any? { |foo| foo.title == 'foo' } # SELECT "foos".* FROM "foos"
Or, more efficiently, use exists? to shorten both the query and the code:
Foo.exists?(title: 'foo') # SELECT 1 AS one FROM "foos" WHERE "foos"."title" = ? LIMIT ? [["title", "foo"], ["LIMIT", 1]]
ActiveRecord#exists? accepts several kinds of arguments and is intended to work at the SQL level, unlike any?, which converts the relation you're working with to an array if you pass a block.
The answers here are all based on very outdated versions. This commit from 2016 / ActiveRecord 5.1 changes empty?, which is called by any? when no block is passed, to call exists? when not preloaded. So in vaguely-modern Rails, the only difference when no block is passed is a few extra method calls and negations, and ignoring preloaded results.
I ran into a practical issue: exists? forces a DB query while any? doesn't.
user = User.new
user.skills = [Skill.new]
user.skills.any?
# => true
user.skills.exists?
# => false
Consider having factories and a before_create hook:
class User < ActiveRecord::Base
has_many :skills
before_create :ensure_skills
def ensure_skills
# Don't want users without skills
errors.add(:skills, :invalid) if !skills.exists?
end
end
FactoryBot.define do
factory :user do
skills { [association(:skill)] }
end
end
create(:user) will fail, because at the time of before_create skills are not yet persisted. Using .any? will solve this.
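The difference can be imitated without a database. The class below is a toy, not ActiveRecord: it only separates objects held in memory from rows already saved, which is the distinction that matters in the before_create case.

```ruby
# Toy model of a has_many collection. Not ActiveRecord: it only
# separates unsaved in-memory objects from rows already saved.
class ToyCollection
  def initialize(saved_rows: [], in_memory: [])
    @saved_rows = saved_rows # what a database query would see
    @in_memory  = in_memory  # unsaved objects assigned in Ruby
  end

  # Looks at everything the Ruby object knows about.
  def any?
    !(@saved_rows + @in_memory).empty?
  end

  # Stands in for a SELECT ... LIMIT 1 against the database.
  def exists?
    !@saved_rows.empty?
  end
end

skills = ToyCollection.new(in_memory: [:ruby])
puts skills.any?    # true
puts skills.exists? # false
```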

Rails - Active Record: Find all records which have a count on has_many association with certain attributes

A user has many identities.
class User < ActiveRecord::Base
has_many :identities
end
class Identity < ActiveRecord::Base
belongs_to :user
end
An identity has a confirmed:boolean column. I'd like to query all users that have only ONE identity. This identity must also have confirmed set to false.
I've tried this
User.joins(:identities).group("users.id").having( 'count(user_id) = 1').where(identities: { confirmed: false })
But this returns users with one identity where confirmed is false, even if they also have additional identities where confirmed is true. I only want users with exactly one identity, with confirmed false, and no additional identities with confirmed true.
I've also tried this but obviously it's slow and I'm looking for the right SQL to just do this in one query.
def self.new_users
users = User.joins(:identities).where(identities: { confirmed: false })
users.select { |user| user.identities.count == 1 }
end
Apologies upfront if this was already answered but I could not find a similar post.
One solution is to use rails nested queries
User.joins(:identities).where(id: Identity.select(:user_id).unconfirmed).group("users.id").having( 'count(user_id) = 1')
And here's the SQL generated by the query
SELECT "users".* FROM "users"
INNER JOIN "identities" ON "identities"."user_id" = "users"."id"
WHERE "users"."id" IN (SELECT "identities"."user_id" FROM "identities" WHERE "identities"."confirmed" = 'f')
GROUP BY users.id HAVING count(user_id) = 1
I still don't think this is the most efficient way. While I'm able to generate only one SQL query (meaning only one network call to the db), I still have to do two scans: one scan on the USERS table and one scan on the IDENTITIES table. This can be optimized by indexing the identities.confirmed column, but this still doesn't solve the two full scans problem.
For those who understand the query plan here it is:
QUERY PLAN
-------------------------------------------------------------------------------------------
HashAggregate (cost=32.96..33.09 rows=10 width=3149)
Filter: (count(identities.user_id) = 1)
-> Hash Semi Join (cost=21.59..32.91 rows=10 width=3149)
Hash Cond: (identities.user_id = identities_1.user_id)
-> Hash Join (cost=10.45..21.61 rows=20 width=3149)
Hash Cond: (identities.user_id = users.id)
-> Seq Scan on identities (cost=0.00..10.70 rows=70 width=4)
-> Hash (cost=10.20..10.20 rows=20 width=3145)
-> Seq Scan on users (cost=0.00..10.20 rows=20 width=3145)
-> Hash (cost=10.70..10.70 rows=35 width=4)
-> Seq Scan on identities identities_1 (cost=0.00..10.70 rows=35 width=4)
Filter: (NOT confirmed)
(12 rows)
def self.new_users
joins(:identities).group("identities.user_id").having("count(identities.user_id) = 1").where(identities: {confirmed: false}).uniq
end
I think group_concat may be the answer here, if you have the function in your DBMS (if not, there may be an equivalent). This will collect all the values for the field from the group into a comma-separated string. We want the ones where this string is equal to "false", i.e. there's just one, and it's false (which I think is your requirement, it's a little unclear). I think this should work if we let Rails handle the translation of false into however the DB stores it.
User.joins(:identities).group("identities.user_id").having("group_concat(identities.confirmed) = ?", false)
EDIT - if your database stores false as 0 then the above will generate sql like having group_concat(identities.confirmed) = 0. Because the result of the group_concat is a string, then it may (in some DBMS's) do a string-to-integer cast on the results before comparing it to 0, which will return lots of false positives if all the other strings cast to 0. In that case you can try this:
User.joins(:identities).group("identities.user_id").having("group_concat(identities.confirmed) = '?'", false)
(note quotes around ?)
EDIT2 - postgres version.
I've not tried this but it looks like recent versions of postgres have a function array_agg() which does the same as mysql's group_concat(). Because postgres stores true/false as 't'/'f' we shouldn't need to wrap the ? in quotes. Try this:
User.joins(:identities).group("identities.user_id").having("array_agg(identities.confirmed) = ?", false)
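To pin down the requirement itself, here is the rule in plain Ruby over made-up sample rows. It shows why filtering on confirmed before grouping gives the wrong answer: the count has to be taken over all of a user's identities. The SQL fragment in the closing comment is an untested assumption about how the same rule could be expressed in a HAVING clause.

```ruby
# Plain-Ruby statement of the rule: exactly one identity, and that
# identity is unconfirmed. The hashes are made-up sample rows.
def users_with_single_unconfirmed(identities)
  identities
    .group_by { |identity| identity[:user_id] }
    .select { |_user_id, rows| rows.size == 1 && !rows.first[:confirmed] }
    .keys
end

sample = [
  { user_id: 1, confirmed: false }, # qualifies
  { user_id: 2, confirmed: false }, # user 2 has two identities
  { user_id: 2, confirmed: true },
  { user_id: 3, confirmed: true }   # single identity, but confirmed
]

result = users_with_single_unconfirmed(sample)
puts result.inspect # [1]

# One way to push the same rule into SQL (untested assumption):
#   HAVING COUNT(*) = 1
#      AND SUM(CASE WHEN identities.confirmed THEN 1 ELSE 0 END) = 0
```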

Use Ruby's select method on a Rails relation and update it

I have an ActiveRecord relation of a user's previous "votes"...
@previous_votes = current_user.votes
I need to filter these down to votes only on the current "challenge", so Ruby's select method seemed like the best way to do that...
@previous_votes = current_user.votes.select { |v| v.entry.challenge_id == Entry.find(params[:entry_id]).challenge_id }
But I also need to update the attributes of these records, and the select method turns my relation into an array which can't be updated or saved!
@previous_votes.update_all :ignore => false
# ...
# undefined method `update_all' for #<Array:0x007fed7949a0c0>
How can I filter down my relation like the select method is doing, but not lose the ability to update/save the items with ActiveRecord?
Poking around Google, it seems like named_scopes appear in all the answers to similar questions, but I can't figure out if they can specifically accomplish what I'm after.
The problem is that select is not an SQL method. It fetches all records and filters them on the Ruby side. Here is a simplified example:
votes = Vote.scoped
votes.select{ |v| v.active? }
# SQL: select * from votes
# Ruby: all.select{ |v| v.active? }
Since update_all is an SQL method you can't use it on a Ruby array. You can stick to performing all operations in Ruby or move some (all) of them into SQL.
votes = Vote.scoped
votes.select{ |v| v.active? }
# N SQL operations (N - number of votes)
votes.each{ |vote| vote.update_attribute :ignore, false }
# or in 1 SQL operation
Vote.where(id: votes.map(&:id)).update_all(ignore: false)
If you don't actually use fetched votes it would be faster to perform the whole select & update on SQL side:
Vote.where(active: true).update_all(ignore: false)
While the previous examples work fine with your select, this one requires you to rewrite it in terms of SQL. If you have set up all relationships in Rails models you can do it roughly like this:
entry = Entry.find(params[:entry_id])
current_user.votes.joins(:challenges).merge(entry.challenge.votes)
# requires following associations:
# Challenge.has_many :votes
# User.has_many :votes
# Vote.has_many :challenges
And Rails will construct the appropriate SQL for you. But you can always fall back to writing the SQL by hand if something doesn't work.
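The Array-versus-relation point can be reproduced in plain Ruby. ToyRelation below is a made-up class, not ActiveRecord. It mixes in Enumerable and adds a bulk update, which is enough to show that a block-form select hands back an Array without update_all, and that narrowing by id gets you back to something updatable.

```ruby
# Toy relation: Enumerable plus a bulk-update method. Not ActiveRecord.
class ToyRelation
  include Enumerable

  def initialize(rows)
    @rows = rows
  end

  def each(&block)
    @rows.each(&block)
  end

  # Narrowing that keeps the relation type, like where(id: ...).
  def where_ids(ids)
    ToyRelation.new(@rows.select { |row| ids.include?(row[:id]) })
  end

  # Bulk update, only available on the relation itself.
  def update_all(attrs)
    @rows.each { |row| row.merge!(attrs) }
    @rows.size
  end
end

votes = ToyRelation.new([
  { id: 1, active: true,  ignore: true },
  { id: 2, active: false, ignore: true }
])

filtered = votes.select { |v| v[:active] } # Enumerable#select gives an Array
puts filtered.class                        # Array
puts filtered.respond_to?(:update_all)     # false

updated = votes.where_ids(filtered.map { |v| v[:id] }).update_all(ignore: false)
puts updated                               # 1
```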
Use collection_select instead of select. collection_select is specifically built on top of select to return ActiveRecord objects and not an array of strings like you get with select.
@previous_votes = current_user.votes.collection_select { |v| v.entry.challenge_id == Entry.find(params[:entry_id]).challenge_id }
This should return @previous_votes as an array of objects
EDIT: Updating this post with another suggested way to return those AR objects in an array
@previous_votes = current_user.votes.collect {|v| records.detect { v.entry.challenge_id == Entry.find(params[:entry_id]).challenge_id}}
A nice approach is to use scopes. In your case, you can set up the scope as follows:
class Vote < ActiveRecord::Base
scope :for_challenge, ->(challenge_id) {
joins(:entry).where(entries: { challenge_id: challenge_id })
}
end
Then your code for getting current votes will look like:
challenge_id = Entry.find(params[:entry_id]).challenge_id
@previous_votes = current_user.votes.for_challenge(challenge_id)
I believe you can do something like:
@entry = Entry.find(params[:entry_id])
@previous_votes = Vote.joins(:entry).where(entries: { id: @entry.id, challenge_id: @entry.challenge_id })

Rails, how to sanitize SQL in find_by_sql

Is there a way to sanitize sql in rails method find_by_sql?
I've tried this solution:
Ruby on Rails: How to sanitize a string for SQL when not using find?
But it fails at
Model.execute_sql("Update users set active = 0 where id = 2")
It throws an error, but sql code is executed and the user with ID 2 now has a disabled account.
Simple find_by_sql also does not work:
Model.find_by_sql("UPDATE user set active = 0 where id = 1")
# => code executed, user with id 1 have now ban
Edit:
Well my client requested to make that function (select by sql) in admin panel to make some complex query(joins, special conditions etc). So I really want to find_by_sql that.
Second Edit:
I want to achieve that 'evil' SQL code won't be executed.
In admin panel you can type query -> Update users set admin = true where id = 232 and I want to block any UPDATE / DROP / ALTER SQL command.
I just want to be sure that ONLY SELECT can be executed here.
After some attempts, I conclude that sanitize_sql_array unfortunately doesn't do that.
Is there a way to do that in Rails?
Sorry for the confusion..
Try this:
connect = ActiveRecord::Base.connection();
connect.execute(ActiveRecord::Base.send(:sanitize_sql_array, "your string"))
You can save it in variable and use for your purposes.
I made a little snippet for this that you can put in initializers.
class ActiveRecord::Base
def self.escape_sql(array)
self.send(:sanitize_sql_array, array)
end
end
Right now you can escape your query with this:
query = User.escape_sql(["Update users set active = ? where id = ?", true, params[:id]])
And you can call the query any way you like:
users = User.find_by_sql(query)
Slightly more general-purpose:
class ActiveRecord::Base
def self.escape_sql(clause, *rest)
self.send(:sanitize_sql_array, rest.empty? ? clause : ([clause] + rest))
end
end
This one lets you call it just like you'd type in a where clause, without extra brackets, and using either array-style ? or hash-style interpolations.
User.find_by_sql(["SELECT * FROM users WHERE (name = ?)", params])
Source: http://blog.endpoint.com/2012/10/dont-sleep-on-rails-3-sql-injection.html
Though this example is for INSERT query, one can use similar approach for UPDATE queries. Raw SQL bulk insert:
users_places = []
users_values = []
timestamp = Time.now.strftime('%Y-%m-%d %H:%M:%S')
params[:users].each do |user|
users_places << "(?,?,?,?)" # Append to array
users_values << user[:name] << user[:punch_line] << timestamp << timestamp
end
bulk_insert_users_sql_arr = ["INSERT INTO users (name, punch_line, created_at, updated_at) VALUES #{users_places.join(", ")}"] + users_values
begin
sql = ActiveRecord::Base.send(:sanitize_sql_array, bulk_insert_users_sql_arr)
ActiveRecord::Base.connection.execute(sql)
rescue
"something went wrong with the bulk insert sql query"
end
Here is the reference to sanitize_sql_array method in ActiveRecord::Base, it generates the proper query string by escaping the single quotes in the strings. For example the punch_line "Don't let them get you down" will become "Don\'t let them get you down".
I prefer to do it with key parameters. In your case it may look like this:
Model.find_by_sql(["UPDATE user set active = :active where id = :id", active: 0, id: 1])
Pay attention that you pass ONLY ONE parameter to the find_by_sql method: it's an array containing two elements, the query string and a hash of params (since it's our favourite Ruby, you can omit the curly brackets).
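None of the sanitizing helpers above decide which kind of statement runs; they only escape values. For the "SELECT only" requirement in the question, a rough pre-check could look like the sketch below. The method name is hypothetical and the check is deliberately conservative. It is not a security boundary (a SELECT can still call functions with side effects, for example), so running the admin queries through a database user with read-only privileges is the firmer protection.

```ruby
# Hypothetical pre-check for an admin "run a query" box.
# Rejects anything that is not a single statement starting with SELECT.
def select_only?(sql)
  statement = sql.strip.sub(/;\s*\z/, "") # allow one trailing semicolon
  return false if statement.include?(";") # refuse stacked statements
  statement.match?(/\Aselect\b/i)
end

puts select_only?("SELECT * FROM users")                          # true
puts select_only?("Update users set admin = true where id = 232") # false
puts select_only?("SELECT 1; DROP TABLE users")                   # false
```

A semicolon inside a string literal is also rejected, which errs on the side of refusing a valid query.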

How do I combine results from two queries on the same model?

I need to return exactly ten records for use in a view. I have a highly restrictive query I'd like to use, but I want a less restrictive query in place to fill in the results in case the first query doesn't yield ten results.
Just playing around for a few minutes, and this is what I came up with, but it doesn't work. I think it doesn't work because merge is meant for combining queries on different models, but I could be wrong.
class Article < ActiveRecord::Base
...
def self.listed_articles
Article.published.order('created_at DESC').limit(25).where('listed = ?', true)
end
def self.rescue_articles
Article.published.order('created_at DESC').where('listed != ?', true).limit(10)
end
def self.current
Article.rescue_articles.merge(Article.listed_articles).limit(10)
end
...
end
Looking in console, this forces the restrictions in listed_articles on the query in rescue_articles, showing something like:
Article Load (0.2ms) SELECT `articles`.* FROM `articles` WHERE (published = 1) AND (listed = 1) AND (listed != 1) ORDER BY created_at DESC LIMIT 4
Article Load (0.2ms) SELECT `articles`.* FROM `articles` WHERE (published = 1) AND (listed = 1) AND (listed != 1) ORDER BY created_at DESC LIMIT 6 OFFSET 4
I'm sure there's some ridiculously easy method I'm missing in the documentation, but I haven't found it yet.
EDIT:
What I want to do is return all the articles where listed is true out of the twenty-five most recent articles. If that doesn't get me ten articles, I'd like to add enough articles from the most recent articles where listed is not true to get my full ten articles.
EDIT #2:
In other words, the merge method seems to string the queries together to make one long query instead of merging the results. I need the top ten results of the two queries (prioritizing listed articles), not one long query.
with your initial code:
You can join two arrays using + then get first 10 results:
def self.current
(Article.listed_articles + Article.rescue_articles)[0..9]
end
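Here is the same top-up idea with plain arrays standing in for the two query results. The names and sample values are made up. The uniq call only matters if the two queries can overlap.

```ruby
# Plain arrays stand in for the results of the two queries.
# Preferred results come first, the fallback fills the remainder.
def top_up(preferred, fallback, size: 10)
  (preferred + fallback).uniq.first(size)
end

listed   = (1..4).map  { |n| "listed-#{n}" }
unlisted = (1..10).map { |n| "unlisted-#{n}" }

current = top_up(listed, unlisted)
puts current.size               # 10
puts current.first(4) == listed # true
```

Keep in mind that both queries are loaded in full before the slice, so this is only reasonable when each query is already limited, as they are in the question.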
I suppose a really dirty way of doing it would be:
def self.current
oldest_accepted = Article.published.order('created_at DESC').limit(25).last
Article.published.where(['created_at > ?', oldest_accepted.created_at]).order('listed DESC').limit(10)
end
If you want an ActiveRecord::Relation object instead of an Array, you can use:
ActiveRecordUnion gem.
Install gem: gem install active_record_union and use:
def self.current
Article.rescue_articles.union(Article.listed_articles).limit(10)
end
UnionScope module.
Create module UnionScope (lib/active_record/union_scope.rb).
module ActiveRecord::UnionScope
def self.included(base)
base.send :extend, ClassMethods
end
module ClassMethods
def union_scope(*scopes)
id_column = "#{table_name}.id"
if (sub_query = scopes.reject { |sc| sc.count == 0 }.map { |s| "(#{s.select(id_column).to_sql})" }.join(" UNION ")).present?
where "#{id_column} IN (#{sub_query})"
else
none
end
end
end
end
Then call it in your Article model.
class Article < ActiveRecord::Base
include ActiveRecord::UnionScope
...
def self.current
union_scope(Article.rescue_articles, Article.listed_articles).limit(10)
end
...
end
All you need to do is sum the queries:
result1 = Model.where(condition)
result2 = Model.where(another_condition)
# your final result
result = result1 + result2
I think you can do all of this in one query:
Article.published.order('listed ASC, created_at DESC').limit(10)
I may have the sort order wrong on the listed column, but in essence this should work. You'll get any listed items first, sorted by created_at DESC, then non-listed items.

Resources