Ruby-on-Rails: Selecting distinct values from the model

The docs:
http://guides.rubyonrails.org/active_record_querying.html#selecting-specific-fields
Clearly state that:
query = Client.select(:name).distinct
# => Returns unique names
However, when I try that in my controller, I get the following error:
undefined method `distinct' for #<ActiveRecord::Relation:0xb2f6f2cc>
To be clear, I want the distinct names, like ['George', 'Brandon'], not the clients' actual records. Is there something that I am missing?

The .distinct method was added in Rails 4, which is the version the latest guides refer to.
Rails 2
If you are still on Rails 2, the chainable select method does not exist yet, so pass the fragment to find instead:
Client.find(:all, :select => 'DISTINCT name')
Rails 3
If you are on Rails 3 you will need to use:
Client.select(:name).uniq
If you look at the equivalent section of the rails 3 guide you can see the difference between the two versions.

There are some approaches:
Rails way:
Model.select(:name).distinct
Semi-rails way
Model.select("DISTINCT ON(models.name) models.*")
The second returns the first full record for each distinct name, with all of its columns rather than only the name. Note that DISTINCT ON is PostgreSQL-specific.

If you do not want ActiveRecord::Relations returned, just an array of the names as strings, then use:
Client.distinct.pluck(:name)
To get an ordered result set:
Client.order(:name).distinct.pluck(:name)

This works in Rails 3 and 4 (in Rails 2, pretty old I know, the same SQL fragment goes into the :select option of find).
Client.select('distinct(name)')
This will actually use the SQL select distinct statement
SELECT distinct name FROM clients

Related

ActiveRecord uniq with multiple columns

I'm trying to query for records from the User model which are unique (based on both name and age), i.e. removing any duplicates but keeping the first record of each.
This works, but how can I make it work with two columns, name and age?
User.all.uniq(&:name)
Something along the lines of this
User.all.uniq(&:name, &:age)
You can do: User.select(:name, :age).distinct
Note: in Rails 5+, .uniq on a relation is deprecated; use .distinct instead. https://edgeguides.rubyonrails.org/5_0_release_notes.html#active-record-deprecations
On PostgreSQL, I believe you can also do the following, which keeps one full record per (name, age) pair:
User.select('DISTINCT ON (name, age) *')
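For comparison, the in-memory approach from the question can also be made to work on two columns. The following is only a sketch in plain Ruby: the User struct and its rows are hypothetical stand-ins for loaded records, and like User.all.uniq(&:name) it assumes everything has already been fetched, so the database-side answers above are preferable for large tables.

```ruby
# Plain-Ruby stand-in for loaded User records (hypothetical data).
User = Struct.new(:id, :name, :age)

users = [
  User.new(1, "Ann", 30),
  User.new(2, "Ann", 30), # duplicate of id 1 by name and age
  User.new(3, "Ann", 41), # same name, different age
  User.new(4, "Bob", 30)
]

# A method call accepts only one &block argument, so uniq(&:name, &:age)
# is a syntax error. Returning an array from the block makes uniq compare
# on both values; the first record of each (name, age) pair is kept.
unique_users = users.uniq { |u| [u.name, u.age] }

puts unique_users.map(&:id).inspect
```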

ActiveRecord group with alias

First question ever on here, and pretty new to coding full apps/Rails.
I was creating a method to get the counts for titles by author, and noticed that if the author is cased differently, it would count as different authors. I wanted to place some sort of validation/check to disregard the casing and count it together. I don't care about the casing of the book titles in this particular case.
So I have table like this:
Author               Book Title                               Year  Condition
William Shakespeare  Hamlet                                   1599  Poor
Stephen King         The Shining                              1977  New
Edgar Allen Poe      The Raven                                1845  Good
JK Rowling           Harry Potter and the Sorcerer's Stone    2001  New
edgar allen poe      The Tell-Tale Heart                      1843  Good
JK Rowling           Fantastic Beasts and Where to Find Them  2001  New
I want to output this:
Author               Count
William Shakespeare  1
Stephen King         1
Edgar Allen Poe      2
JK Rowling           2
My method was originally something like this:
def self.book_counts
  distinct_counts = []
  Book.group(:author).count.each do |count|
    distinct_counts << count
  end
  distinct_counts
end
To ignore casing, I referenced this page and came up with these, which didn't end up working out, unfortunately:
1) With this one I get "undefined method lower":
Book.group(lower('author')).count.each do |count|
distinct_counts << count
2) This runs, but with the select method in general, I get a bunch of ActiveRecord results/Record id: nil. I am using Rails 6 and it additionally notes "DEPRECATION WARNING: Dangerous query method (method whose arguments are used as raw SQL) called with non-attribute argument(s) ... Non-attribute arguments will be disallowed in Rails 6.1. This method should not be called with user-provided values, such as request parameters or model attributes. Known-safe values can be passed by wrapping them in Arel.sql(). (called from irb_binding at (irb):579)":
Book.select("lower(author) as dc_auth, count(*) as book_count").group("dc_auth").order("book_count desc")
3) I even tried to test a different, simplified function to see if it'd work, but I got "ActiveRecord::StatementInvalid (PG::GroupingError: ERROR: column "books.author" must appear in the GROUP BY clause or be used in an aggregate function)":
Book.pluck('lower(author) as dc_auth, count(*) as book_count')
4) I've tried various other ways, with additional different errors, e.g. "undefined local variable or method 'dc_auth'", "undefined method 'group' did you mean group_by?", and "wrong number of arguments (given 1, expected 0)" (with group_by), etc.
This query works exactly how I want it to in PostgreSQL. The same SQL actually shows up in the terminal when I run #2, but, as mentioned, the result does not come back in a usable form through ActiveRecord.
SELECT lower(author) as dc_auth, count(*) as book_count FROM books GROUP BY dc_auth;
Is there even a way to run what I want through Rails??
Maybe you can try
Book.group("LOWER(author)").count
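A grouped count in ActiveRecord returns a hash of group key to count, so with LOWER(author) the keys should come back lowercased. The sketch below reproduces that shape in plain Ruby, using only the author column from the question's table; it illustrates the expected result and is not a replacement for the query.

```ruby
# Author column from the table in the question, as plain strings.
authors = [
  "William Shakespeare",
  "Stephen King",
  "Edgar Allen Poe",
  "JK Rowling",
  "edgar allen poe",
  "JK Rowling"
]

# Group on the lowercased name, then count each group. This mirrors
# GROUP BY lower(author) with COUNT(*), so the keys are lowercased too.
counts = authors.group_by(&:downcase).transform_values(&:size)

counts.each { |author, count| puts "#{author}: #{count}" }
```

If the original casing is needed for display, it has to be chosen separately (for example, the first spelling seen per group), because the grouping key itself no longer carries it.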
You can execute your query through ActiveRecord. I would suggest putting the SQL in a heredoc:
book_count_query = <<-SQL
SELECT lower(author) as dc_auth, count(*) as book_count
FROM books
GROUP BY dc_auth;
SQL
1- result = ActiveRecord::Base.connection.execute(book_count_query)
or
2- result = ActiveRecord::Base.connection.exec_query(book_count_query)
What is the difference between line 1 and line 2?
exec_query returns an ActiveRecord::Result object, which has handy methods like .columns and .rows to access headers and values.
The raw result from .execute depends on the database adapter, can be troublesome to deal with, and gave me redundant results when I ran it with a SUM and GROUP BY clause.
If you need to read more about this topic:
example of exec_query in api.rubyonrails
active_record_querying in Rails Documentation
This resource has an example of the query and its output.
Why do you store authors in the same table as books? A better solution is to add a separate authors table and an author_id foreign key on the books table. With counter_cache you can easily count the number of books for each author.
Here is a guide with books and authors examples https://guides.rubyonrails.org/association_basics.html

Grouping by into a list with activerecord in rails

I need to achieve something exactly similar to How to get list of values in GROUP_BY clause? but I need to use active record query interface in rails 4.2.1.
I have only gotten so far.
Roles.where(id: 2)
  .select("user_roles.id, user_roles.role, GROUP_CONCAT(DISTINCT roles.group_id SEPARATOR ',') ")
  .group(:role)
But this just returns an ActiveRecord::Relation object with a single entry that has id and role.
How do I achieve the same with Active Record without having to pull in all the relationships and manually build such an object?
Roles.where(id: 2) already returns the single record. You might instead start with users and join roles table doing something like this.
User.
  joins(user_roles: :roles).
  where('roles.id = 2').
  select("user_roles.role, GROUP_CONCAT(DISTINCT roles.group_id SEPARATOR ',') ").
  group(:role)
Or, if you have the model for user_roles, start with it since you nevertheless do not query anything from users.
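To check what such a query should return on a small data set, the effect of GROUP_CONCAT(DISTINCT ... SEPARATOR ',') with GROUP BY role can be sketched in plain Ruby. The rows below are hypothetical [role, group_id] pairs standing in for the joined result; this is only meant to show the expected shape, not to replace the database-side aggregation.

```ruby
# Hypothetical joined rows: [role, group_id]
rows = [
  ["admin", 1],
  ["admin", 2],
  ["admin", 1], # repeated group_id, dropped by DISTINCT
  ["editor", 3]
]

# One entry per role, with its distinct group ids joined by commas.
grouped = rows
  .group_by { |role, _group_id| role }
  .transform_values { |pairs| pairs.map { |_role, group_id| group_id }.uniq.join(",") }

grouped.each { |role, group_ids| puts "#{role}: #{group_ids}" }
```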

Combining distinct with another condition

I'm migrating a Rails 3.2 app to Rails 5.1 (not before time) and I've hit a problem with a where query.
The code that works on Rails 3.2 looks like this,
sales = SalesActivity.select('DISTINCT batch_id').where('salesperson_id = ?', sales_id)
sales.find_each(batch_size: 2000) do |batchToProcess|
.....
When I run this code under Rails 5.1, it appears to cause the following error when it attempts the find_each,
ArgumentError (Primary key not included in the custom select clause):
I want to end up with an array(?) of unique batch_ids for the given salesperson_id that I can then traverse, as was working with Rails 3.2.
For reasons I don't understand, it looks like I might need to include the whole record to traverse through (my thinking being that I need to include the Primary key)?
I'm trying to rephrase the 'where', and have tried the following,
sales = SalesActivity.where(salesperson_id: sales_id).select(:batch_id).distinct
However, the combined ActiveRecordQuery applies the DISTINCT to both the salesperson_id AND the batch_id - that's #FAIL1
Also, because I'm still using a select (to let distinct know which column I want to be 'distinct') it also still only selects the batch_id column of course, which I am trying to avoid - that's #FAIL2
How can I efficiently pull all unique batch_id values for a given salesperson_id, so I can then iterate over them?
Thanks!
How about:
SalesActivity.where(salesperson_id: sales_id).pluck('DISTINCT batch_id')
pluck has to come after where, since it runs the query immediately; it should return a plain array of the batch_ids.

Rails: select unique values from a column

I already have a working solution, but I would really like to know why this doesn't work:
ratings = Model.select(:rating).uniq
ratings.each { |r| puts r.rating }
It selects, but doesn't print unique values; it prints all values, including the duplicates. And it's in the documentation: http://guides.rubyonrails.org/active_record_querying.html#selecting-specific-fields
Model.select(:rating)
The result of this is a collection of Model objects. Not plain ratings. And from uniq's point of view, they are completely different. You can use this:
Model.select(:rating).map(&:rating).uniq
or this (most efficient):
Model.uniq.pluck(:rating)
Rails 5+
Model.distinct.pluck(:rating)
Update
Apparently, as of rails 5.0.0.1, it works only on "top level" queries, like above. Doesn't work on collection proxies ("has_many" relations, for example).
Address.distinct.pluck(:city) # => ['Moscow']
user.addresses.distinct.pluck(:city) # => ['Moscow', 'Moscow', 'Moscow']
In this case, deduplicate after the query
user.addresses.pluck(:city).uniq # => ['Moscow']
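The point about uniq seeing the selected objects as "completely different" can be reproduced without Rails. In this sketch, Row is a hypothetical stand-in for a model instance loaded with only the rating column; it assumes uniq ends up being Array#uniq over already-loaded objects, which is the situation the answer describes.

```ruby
# Hypothetical stand-in for a model instance loaded with select(:rating).
# Like such a record, it has no id and defines no value-based equality.
class Row
  attr_reader :rating

  def initialize(rating)
    @rating = rating
  end
end

rows = [Row.new(5), Row.new(5), Row.new(3)]

# Array#uniq compares with hash/eql?, which default to object identity,
# so three separate objects remain three elements.
puts rows.uniq.size

# Extracting the plain values first gives uniq something it can compare.
puts rows.map(&:rating).uniq.inspect
```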
If you're going to use Model.select, then you might as well just use DISTINCT, as it will return only the unique values. This is better because it returns fewer rows and should be slightly faster than returning all rows and then telling Rails to pick the unique values.
Model.select('DISTINCT rating')
Of course, this is provided your database understands the DISTINCT keyword, and most should.
This works too.
Model.pluck("DISTINCT rating")
If you want to also select extra fields:
Model.select('DISTINCT ON (models.ratings) models.ratings, models.id').map { |m| [m.id, m.ratings] }
Model.uniq.pluck(:rating)
# SELECT DISTINCT "models"."rating" FROM "models"
This has the advantages of not using sql strings and not instantiating models
Model.select(:rating).uniq
This code works as 'DISTINCT' (not as Array#uniq) since rails 3.2
Model.select(:rating).distinct
Another way to collect uniq columns with sql:
Model.group(:rating).pluck(:rating)
If I understand correctly:
The current query
Model.select(:rating)
returns an array of objects, and you have written the query
Model.select(:rating).uniq
uniq is applied to the array of objects, and each object is distinct. uniq is doing its job correctly, because each object in the array is unique.
There are many ways to select distinct ratings:
Model.select('distinct rating').map(&:rating)
or
Model.select('distinct rating').collect(&:rating)
or
Model.select(:rating).map(&:rating).uniq
or
Model.select(:rating).collect(&:rating).uniq
One more thing: the first and second queries find the distinct data in SQL.
On some databases and collations, these queries treat "london" and "london " as the same value, ignoring the trailing space, so 'london' appears only once in the result.
The third and fourth queries fetch the data with SQL and then apply Ruby's uniq method to it.
They treat "london" and "london " as different values, so both 'london' and 'london ' appear in the result.
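The Ruby side of that difference is easy to verify in isolation. The sketch below uses a hypothetical list of city names; it shows that Array#uniq keeps strings that differ only by a trailing space, and one possible way to normalize them first if that is not wanted.

```ruby
# Hypothetical values, including one with a trailing space.
cities = ["london", "london ", "paris"]

# Array#uniq compares strings exactly, so the trailing space matters.
puts cities.uniq.inspect

# Stripping whitespace first makes the two spellings collapse into one.
puts cities.map(&:strip).uniq.inspect
```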
If anyone is looking for the same with Mongoid, that is
Model.distinct(:rating)
Some answers don't take into account that the OP wants an array of values.
Other answers don't work well if your Model has thousands of records.
That said, I think a good answer is:
Model.uniq.select(:ratings).map(&:ratings)
=> "SELECT DISTINCT ratings FROM `models` "
This works because you first load an array of Model objects (smaller than usual, thanks to the select and the DISTINCT), then extract the only attribute those selected models have (ratings).
You can use the following Gem: active_record_distinct_on
Model.distinct_on(:rating)
Yields the following query:
SELECT DISTINCT ON ( "models"."rating" ) "models".* FROM "models"
In my scenario, I wanted a list of distinct names after ordering them by their creation date and applying offset and limit: basically a combination of ORDER BY and DISTINCT ON.
All you need to do is put DISTINCT ON inside the pluck call, as follows:
Model.order("name, created_at DESC").offset(0).limit(10).pluck("DISTINCT ON (name) name")
This returns an array of distinct names.
Model.pluck("DISTINCT column_name")
