I have an edge case where I want to use .first only after my SQL query has been executed.
My case is the following:
User.select("sum((type = 'foo')::int) as foo_count",
"sum((type = 'bar')::int) as bar_count")
.first
.yield_self { |r| r.bar_count / r.foo_count.to_f }
However, this would throw an SQL error saying that I should include my user_id in the GROUP BY clause. I've already found a hacky solution using to_a, but I really wonder if there is a proper way to force execution before my call to .first.
The error occurs because first adds an ORDER BY on the primary key (id) when no order is defined, and that column is neither aggregated nor grouped in your query.
"Find the first record (or first N records if a parameter is supplied). If no order is defined it will order by primary key."
Instead, try take:
"Gives a record (or N records if a parameter is supplied) without any implied order. The order will depend on the database implementation. If an order is supplied it will be respected."
So
User.select("sum((type = 'foo')::int) as foo_count",
"sum((type = 'bar')::int) as bar_count")
.take
.yield_self { |r| r.bar_count / r.foo_count.to_f }
should work as intended; however, as stated, the order is indeterminate.
You may want to use pluck which retrieves only the data instead of select which just alters which fields get loaded into models:
User.pluck(
"sum((type = 'foo')::int) as foo_count",
"sum((type = 'bar')::int) as bar_count"
).map do |foo_count, bar_count|
bar_count / foo_count.to_f
end
You can probably do the division in the query as well if necessary.
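The .to_f in the snippets above matters because Ruby uses integer division when both operands are integers. Here is a minimal plain-Ruby sketch of the ratio calculation. The ratio helper is hypothetical (not part of Rails), and returning nil when foo_count is zero is one possible choice, not the only one:

```ruby
# Hypothetical helper: ratio of "bar" rows to "foo" rows.
def ratio(bar_count, foo_count)
  # Guard against division by zero when there are no "foo" rows.
  return nil if foo_count.zero?

  # Convert to Float so the result is not truncated.
  bar_count / foo_count.to_f
end

puts 3 / 4        # integer division truncates: prints 0
puts ratio(3, 4)  # prints 0.75
p ratio(3, 0)     # prints nil
```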
I was wondering if there is a way to do the following in a single query?
1) non_populated_models = PropertyPerson.select("property_people.id, count('items.recipient_person_id')").joins(:items).group('items.recipient_person_id, property_people.id')
2) populated_models = PropertyPerson.where(id: [non_populated_models])
Currently, the first group by query only returns the id and count in the PropertyPerson object. Let's say there were 15 fields in the model and I didn't want to explicitly write them all out. Is there a way I can do this operation in a single query?
The join limits the query to property_people that have items, and you will get the extra column as an attribute reader on each record.
people = PropertyPerson.select("property_people.*, count(items.recipient_person_id) as items_count")
.joins(:items)
.group("property_people.id")
people.first.items_count
I have been trying hard to find a comparison of result.blank? and result[0], and today I finally checked one query with these two methods.
Here is the code. The result variable is @categories, which is an ActiveRecord result.
This blank check makes one extra db call like SELECT COUNT(*) AS count_all
if @categories.blank?
end
But here that extra query does not show up.
if @categories[0]
end
Is there any logic behind that? I couldn't find it.
It is important to note that assigning an ActiveRecord query to a variable does not return the result of the query. Something like this:
@categories = Category.where(public: true)
does not return an array of all public categories. Instead, it returns a Relation that defines a query. The query is sent to the database only once you call a method on the relation that needs actual data, for example each, load, or count.
That said: when you call blank? on a relation, Rails needs to know whether the relation would return an empty array. Therefore Rails executes a query like:
SELECT COUNT(*) FROM categories WHERE public = 1
That query is much faster than fetching all records when the only thing you need to know is whether there are any matching records.
@categories[0], by contrast, works differently. Here Rails needs to load all records into an array holding all matching categories and then returns the first record in that array.
At this point both versions have run only one query against the database. But I guess your next step would be to iterate over the records if there were any. If you used the first version (blank?), the objects were not loaded; they were only counted. Therefore Rails would need to query for the actual records, which would result in a second query. The second example ([0]) has the records already loaded, so no second query is needed.
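The query counts described above can be traced with a small plain-Ruby model. ToyRelation is a hypothetical stand-in written for illustration. It is not ActiveRecord's implementation, it only mirrors the behaviour described in this answer (a count for blank? on an unloaded relation, a full load for [] and each), and the "SQL" strings are just log labels:

```ruby
# Toy stand-in for a lazy relation; NOT ActiveRecord's real code.
class ToyRelation
  attr_reader :queries

  def initialize(rows)
    @rows = rows      # pretend this is the database table
    @records = nil    # nothing has been loaded yet
    @queries = []     # log of the "queries" we would have sent
  end

  def loaded?
    !@records.nil?
  end

  # On an unloaded relation, only ask the "database" for a count.
  def blank?
    return @records.empty? if loaded?

    @queries << "SELECT COUNT(*)"
    @rows.empty?
  end

  # Load all records once and reuse them afterwards.
  def to_a
    unless loaded?
      @queries << "SELECT *"
      @records = @rows.dup
    end
    @records
  end

  def [](index)
    to_a[index]
  end

  def each(&block)
    to_a.each(&block)
  end
end

checked_first = ToyRelation.new(%w[books music])
checked_first.each { |c| c } unless checked_first.blank?
p checked_first.queries   # ["SELECT COUNT(*)", "SELECT *"]

indexed_first = ToyRelation.new(%w[books music])
indexed_first.each { |c| c } if indexed_first[0]
p indexed_first.queries   # ["SELECT *"]
```

In this model the blank? path costs two queries once you iterate, while the [0] path costs one, at the price of loading every record up front.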
I already have a working solution, but I would really like to know why this doesn't work:
ratings = Model.select(:rating).uniq
ratings.each { |r| puts r.rating }
It selects, but doesn't print unique values; it prints all values, including the duplicates. And it's in the documentation: http://guides.rubyonrails.org/active_record_querying.html#selecting-specific-fields
Model.select(:rating)
The result of this is a collection of Model objects. Not plain ratings. And from uniq's point of view, they are completely different. You can use this:
Model.select(:rating).map(&:rating).uniq
or this (most efficient):
Model.uniq.pluck(:rating)
Rails 5+
Model.distinct.pluck(:rating)
Update
Apparently, as of rails 5.0.0.1, it works only on "top level" queries, like above. Doesn't work on collection proxies ("has_many" relations, for example).
Address.distinct.pluck(:city) # => ['Moscow']
user.addresses.distinct.pluck(:city) # => ['Moscow', 'Moscow', 'Moscow']
In this case, deduplicate after the query
user.addresses.pluck(:city).uniq # => ['Moscow']
If you're going to use Model.select, then you might as well just use DISTINCT, as it will return only the unique values. This is better because it returns fewer rows and should be slightly faster than returning a number of rows and then telling Rails to pick the unique values.
Model.select('DISTINCT rating')
Of course, this is provided your database understands the DISTINCT keyword, and most should.
This works too.
Model.pluck("DISTINCT rating")
If you want to also select extra fields:
Model.select('DISTINCT ON (models.ratings) models.ratings, models.id').map { |m| [m.id, m.ratings] }
Model.uniq.pluck(:rating)
# SELECT DISTINCT "models"."rating" FROM "models"
This has the advantages of not using sql strings and not instantiating models
Model.select(:rating).uniq
This code works as 'DISTINCT' (not as Array#uniq) since rails 3.2
Model.select(:rating).distinct
Another way to collect uniq columns with sql:
Model.group(:rating).pluck(:rating)
If I understand correctly:
Current query
Model.select(:rating)
returns an array of objects, and you have written the query
Model.select(:rating).uniq
uniq is applied to the array of objects, and each object is a distinct object. uniq is doing its job correctly because each object in the array is unique.
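The difference between deduplicating objects and deduplicating values can be seen in plain Ruby. In this sketch, Row is a hypothetical stand-in for a loaded model instance (it does not define its own equality, so Array#uniq falls back to object identity):

```ruby
# Row is a hypothetical stand-in for a model instance.
class Row
  attr_reader :rating

  def initialize(rating)
    @rating = rating
  end
end

rows = [Row.new(5), Row.new(5), Row.new(3)]

# Three distinct objects, so nothing is removed.
p rows.uniq.size            # 3

# Extract the values first, then deduplicate them.
p rows.map(&:rating).uniq   # [5, 3]
```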
There are many ways to select distinct ratings:
Model.select('distinct rating').map(&:rating)
or
Model.select('distinct rating').collect(&:rating)
or
Model.select(:rating).map(&:rating).uniq
or
Model.select(:rating).collect(&:rating).uniq
One more thing. The first and second queries find distinct data in SQL.
These queries consider "london" and "london " the same, meaning the trailing space is ignored, which is why 'london' appears only once in the query result (whether trailing spaces are ignored can depend on the database and collation).
The third and fourth queries find data in SQL and then apply Ruby's uniq method for distinct data.
These queries consider "london" and "london " different, which is why both 'london' and 'london ' appear in the result.
Please refer to the attached image for more detail and have a look at "Toured / Awaiting RFP".
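The Ruby side of this is easy to check in isolation. This is a minimal sketch with made-up sample data; stripping whitespace before uniq is one way to make Ruby treat the two spellings as the same value:

```ruby
cities = ["london", "london ", "paris"]

# Array#uniq compares strings exactly, so the trailing space matters.
p cities.uniq               # ["london", "london ", "paris"]

# Normalize first if trailing whitespace should be ignored.
p cities.map(&:strip).uniq  # ["london", "paris"]
```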
If anyone is looking for the same with Mongoid, that is
Model.distinct(:rating)
Some answers don't take into account that the OP wants an array of values.
Other answers don't work well if your Model has thousands of records.
That said, I think a good answer is:
Model.uniq.select(:ratings).map(&:ratings)
=> "SELECT DISTINCT ratings FROM `models` "
This works because you first generate an array of Model objects (reduced in size because of the select), then extract the only attribute those selected models have (ratings).
You can use the following Gem: active_record_distinct_on
Model.distinct_on(:rating)
Yields the following query:
SELECT DISTINCT ON ( "models"."rating" ) "models".* FROM "models"
In my scenario, I wanted a list of distinct names after ordering them by their creation date, applying offset and limit. Basically a combination of ORDER BY, DISTINCT ON
All you need to do is put DISTINCT ON inside the pluck method, as follows:
Model.order("name, created_at DESC").offset(0).limit(10).pluck("DISTINCT ON (name) name")
This would return an array of distinct names.
Model.pluck("DISTINCT column_name")
I have a query like this:
locations = Location.order('id ASC').limit(10)
which returns an array of 500 or so records - all the records in the table - i.e. the limit clause is being ignored.
Yet if I put a .all on the end:
locations = Location.order('id ASC').limit(10).all
it works and returns 10 records.
This code is being run in a rake task and I am using PostgreSQL if that makes any difference.
Why is it doing that? Surely the .all should not be required. What am I missing?
I think the behaviour depends on how you are handling the locations variable after setting it. This is because Location.order('id ASC').limit(10) isn't querying records but is returning an object of type ActiveRecord::Relation. The query will only occur once you call all, first, each, map, etc. on that object.
In my testing,
Location.order('id ASC').limit(10).map { |l| l.id }
returns an array of 10 ids as you would expect. But
Location.order('id ASC').limit(10).count
returns the total number of locations in the database, because it executes the SQL
SELECT COUNT(*) FROM "locations" LIMIT 10
which returns the full count of location rows (the limit is on the number of rows returned, not the count itself).
So if you are treating the result of Location.order('id ASC').limit(10) as an array by iterating through it, you should get the same result as if you had added all. If you are calling count, you will not. Kind of unfortunate, as I think ideally they should behave the same and you shouldn't have to know that you are dealing with an ActiveRecord::Relation instead of an array.
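The order of operations behind that count result can be pictured with a plain-Ruby analogy. This is not SQL execution, and location_ids is made-up sample data standing in for the 500 or so rows mentioned in the question:

```ruby
location_ids = (1..500).to_a

# COUNT(*) ... LIMIT 10: the aggregate collapses the table to one row,
# and the limit then applies to that single-row result.
count_rows = [location_ids.size]
p count_rows.first(10)          # [500]

# Loading with LIMIT 10 and counting in Ruby: the limit applies to the
# rows themselves, so only 10 of them are counted.
p location_ids.first(10).size   # 10
```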
Ok here is my explanation
First of all, if you call Location.order('id ASC').limit(10).class you'll see ActiveRecord::Relation. According to the Rails API site, ActiveRecord::Relation doesn't define a method all itself; however, it includes ActiveRecord::FinderMethods, and if you look there you'll find the following:
# File activerecord/lib/active_record/relation/finder_methods.rb, line 142
def all(*args)
args.any? ? apply_finder_options(args.first).to_a : to_a
end
so it calls the to_a method.
As was mentioned in the Railscasts, this method is defined as
def to_a
...
@records = eager_loading? ? find_with_associations : @klass.find_by_sql(arel.to_sql)
...
@records
end
so it runs the SQL query on the third line, with @records = eager_loading? ? find_with_associations : @klass.find_by_sql(arel.to_sql)