First question ever on here, and pretty new to coding full apps/Rails.
I was creating a method to get the counts for titles by author, and noticed that if the author is cased differently, it would count as different authors. I wanted to place some sort of validation/check to disregard the casing and count it together. I don't care about the casing of the book titles in this particular case.
So I have table like this:
Author Book Title Year Condition
William Shakespeare Hamlet 1599 Poor
Stephen King The Shining 1977 New
Edgar Allen Poe The Raven 1845 Good
JK Rowling Harry Potter and the Sorcerer's Stone 2001 New
edgar allen poe The Tell-Tale Heart 1843 Good
JK Rowling Fantastic Beasts and Where to Find Them 2001 New
I want to output this:
Author Count
William Shakespeare 1
Stephen King 1
Edgar Allen Poe 2
JK Rowling 2
My method was originally something like this:
def self.book_counts
distinct_counts = []
Book.group(:author).count.each do |count|
distinct_counts << count
end
distinct_counts
end
To ignore casing, I referenced this page and came up with these, which didn't end up working out, unfortunately:
1) With this one I get "undefined method lower":
Book.group(lower('author')).count.each do |count|
distinct_counts << count
2) This runs, but with the select method in general, I get a bunch of ActiveRecord results/Record id: nil. I am using Rails 6 and it additionally notes "DEPRECATION WARNING: Dangerous query method (method whose arguments are used as raw SQL) called with non-attribute argument(s) ... Non-attribute arguments will be disallowed in Rails 6.1. This method should not be called with user-provided values, such as request parameters or model attributes. Known-safe values can be passed by wrapping them in Arel.sql(). (called from irb_binding at (irb):579)":
Book.select("lower(author) as dc_auth, count(*) as book_count").group("dc_auth").order("book_count desc")
3) I even tried to test a different, simplified function to see if it'd work, but I got "ActiveRecord::StatementInvalid (PG::GroupingError: ERROR: column "books.author" must appear in the GROUP BY clause or be used in an aggregate function)":
Book.pluck('lower(author) as dc_auth, count(*) as book_count')
4) I've tried various other ways, with additional different errors, e.g. "undefined local variable or method 'dc_auth'", "undefined method 'group' did you mean group_by?", and "wrong number of arguments (given 1, expected 0)" (with group_by), etc.
This query works exactly how I want it to in postgresql. The syntax actually populates in the terminal when I run #2, but as mentioned, unfortunately due to ActiveRecord doesn't output properly in Rails.
SELECT lower(author) as dc_auth, count(*) as book_count FROM books GROUP BY dc_auth;
Is there even a way to run what I want through Rails??
Maybe you can try
Book.group("LOWER(author)").count
You can execute your query using ActiveRecord. And I will suggest to go with SQL block
book_count_query = <<-SQL
SELECT lower(author) as dc_auth, count(*) as book_count
FROM books
GROUP BY dc_auth;
SQL
1- result = ActiveRecord::Base.connection.execute(book_count_query)
or
2- result = ActiveRecord::Base.connection.exec_query(book_count_query)
What difference between line 1 and line 2?
exec_query it returns an ActiveRecords::Result object which has handy methods like .columns and .rows to access headers and values.
The array of hashes from .execute can be troublesome to deal with and gave me redundant results when I ran with a SUM GROUP BY clause.
If you need read more about this topic
example of exec_query in api.rubyonrails
active_record_querying in Rails Documentation
This Resource have example for query and output .
Why you store authors in the same table with books. The better solution is to add a separate table for authors and add a foreign key to author_id to books table. With counter_cache you can easily count the number of books for each author.
Here is a guide with books and authors examples https://guides.rubyonrails.org/association_basics.html
Related
Say I've got an object Assignment that has_many :exercises
What is the best way to scope assignments based on ONLY the most recent exercise for each object? Or, how can I select only the Exercise that is the most recent for each assignment?
Something like this is what I'm imagining for the latter:
Exercise.distinct(:assignment_id)
But that doesn't work. (Returns all Exercises) I've also tried
Exercise.group(:assignment_id)
But that complains that the exercises.id column must also appear in the group clause. However, adding that groups by both assignment_id and id (obviously), which, again, returns all Exercises.
My other option, which I'd also like to know how to do (Never seems to work out as well as I want) is to be able to filter assignments based on the most recent exercise.
Assignment.where("most_recent_exercise.created_at <= ?", DateTime.current.beginning_of_day)
Any help is greatly appreciated!
As an example, given the below objects (Assume created in the order the ids are given)
Assignment 1
Exercise 1
Exercise 2
Exercise 3
Assignment 2
Exercise 4
Assignment 3
Exercise 5
Exercise 6
I want to be able to call
Exercise.most_recent and get exercises 3, 4, 6 since they are the most recently created Exercise for their respective Assignments.
how can I select only the Exercise that is the most recent for each assignment?
#assignment.exercises.order(:created_at).last
This is not pretty, but may work:
Exercise.where(
id: Exercise.all.order(:assignment_id, :created_at).
map{|e| e.attributes.with_indifferent_access}.
group_by{|e| e[:assignment_id]}.
map{|k,v| v.first}.
map{|e| e[:id]}
)
I have a model Company that have columns pbr, market_cap and category.
To get averages of pbr grouped by category, I can use group method.
Company.group(:category).average(:pbr)
But there is no method for weighted average.
To get weighted averages I need to run this SQL code.
select case when sum(market_cap) = 0 then 0 else sum(pbr * market_cap) / sum(market_cap) end as weighted_average_pbr, category AS category FROM "companies" GROUP BY "companies"."category";
In psql this query works fine. But I don't know how to use from Rails.
sql = %q(select case when sum(market_cap) = 0 then 0 else sum(pbr * market_cap) / sum(market_cap) end as weighted_average_pbr, category AS category FROM "companies" GROUP BY "companies"."category";)
ActiveRecord::Base.connection.select_all(sql)
returns a error:
output error: #<NoMethodError: undefined method `keys' for #<Array:0x007ff441efa618>>
It would be best if I can extend Rails method so that I can use
Company.group(:category).weighted_average(:pbr)
But I heard that extending rails query is a bit tweaky, now I just want to know how to run the result of sql from Rails.
Does anyone knows how to do it?
Version
rails: 4.2.1
What version of Rails are you using? I don't get that error with Rails 4.2. In Rails 3.2 select_all used to return an Array, and in 4.2 it returns an ActiveRecord::Result. But in either case, it is correct that there is no keys method. Instead you need to call keys on each element of the Array or Result. It sounds like the problem isn't from running the query, but from what you're doing afterward.
In any case, to get the more fluent approach you've described, you could do this:
class Company
scope :weighted_average, lambda{|col|
select("companies.category").
select(<<-EOQ)
(CASE WHEN SUM(market_cap) = 0 THEN 0
ELSE SUM(#{col} * market_cap) / SUM(market_cap)
END) AS weighted_average_#{col}
EOQ
}
This will let you say Company.group(:category).weighted_average(:pbr), and you will get a collection of Company instances. Each one will have an extra weighted_average_pbr attribute, so you can do this:
Company.group(:category).weighted_average(:pbr).each do |c|
puts c.weighted_average_pbr
end
These instances will not have their normal attributes, but they will have category. That is because they do not represent individual Companies, but groups of companies with the same category. If you want to group by something else, you could parameterize the lambda to take the grouping column. In that case you might as well move the group call into the lambda too.
Now be warned that the parameter to weighted_average goes straight into your SQL query without escaping, since it is a column name. So make sure you don't pass user input to that method, or you'll have a SQL injection vulnerability. In fact I would probably put a guard inside the lambda, something like raise "NOPE" unless col =~ %r{\A[a-zA-Z0-9_]+\Z}.
The more general lesson is that you can use select to include extra SQL expressions, and have Rails magically treat those as attributes on the instances returned from the query.
Also note that unlike with select_all where you get a bunch of hashes, with this approach you get a bunch of Company instances. So again there is no keys method! :-)
Rails: 4.1.2
Database: PostgreSQL
For one of my queries, I am using methods from both the textacular gem and Active Record. How can I chain some of the following queries with an "OR" instead of an "AND":
people = People.where(status: status_approved).fuzzy_search(first_name: "Test").where("last_name LIKE ?", "Test")
I want to chain the last two scopes (fuzzy_search and the where after it) together with an "OR" instead of an "AND." So I want to retrieve all People who are approved AND (whose first name is similar to "Test" OR whose last name contains "Test"). I've been struggling with this for quite a while, so any help would be greatly appreciated!
I digged into fuzzy_search and saw that it will be translated to something like:
SELECT "people".*, COALESCE(similarity("people"."first_name", 'test'), 0) AS "rankxxx"
FROM "people"
WHERE (("people"."first_name" % 'abc'))
ORDER BY "rankxxx" DESC
That says if you don't care about preserving order, it will just filter the result by WHERE (("people"."first_name" % 'abc'))
Knowing that and now you can simply write the query with similar functionality:
People.where(status: status_approved)
.where('(first_name % :key) OR (last_name LIKE :key)', key: 'Test')
In case you want order, please specify what would you like the order will be after joining 2 conditions.
After a few days, I came up with the solution! Here's what I did:
This is the query I wanted to chain together with an OR:
people = People.where(status: status_approved).fuzzy_search(first_name: "Test").where("last_name LIKE ?", "Test")
As Hoang Phan suggested, when you look in the console, this produces the following SQL:
SELECT "people".*, COALESCE(similarity("people"."first_name", 'test'), 0) AS "rank69146689305952314"
FROM "people"
WHERE "people"."status" = 1 AND (("people"."first_name" % 'Test')) AND (last_name LIKE 'Test') ORDER BY "rank69146689305952314" DESC
I then dug into the textacular gem and found out how the rank is generated. I found it in the textacular.rb file and then crafted the SQL query using it. I also replaced the "AND" that connected the last two conditions with an "OR":
# Generate a random number for the ordering
rank = rand(100000000000000000).to_s
# Create the SQL query
sql_query = "SELECT people.*, COALESCE(similarity(people.first_name, :query), 0)" +
" AS rank#{rank} FROM people" +
" WHERE (people.status = :status AND" +
" ((people.first_name % :query) OR (last_name LIKE :query_like)))" +
" ORDER BY rank#{rank} DESC"
I took out all of quotation marks in the SQL query when referring to tables and fields because it was giving me error messages when I kept them there and even if I used single quotes.
Then, I used the find_by_sql method to retrieve the People object IDs in an array. The symbols (:status, :query, :query_like) are used to protect against SQL injections, so I set their values accordingly:
# Retrieve all the IDs of People who are approved and whose first name and last name match the search query.
# The IDs are sorted in order of most relevant to the search query.
people_ids = People.find_by_sql([sql_query, query: "Test", query_like: "%Test%", status: 1]).map(&:id)
I get the IDs and not the People objects in an array because find_by_sql returns an Array object and not a CollectionProxy object, as would normally be returned, so I cannot use ActiveRecord query methods such as where on this array. Using the IDs, we can execute another query to get a CollectionProxy object. However, there's one problem: If we were to simply run People.where(id: people_ids), the order of the IDs would not be preserved, so all the relevance ranking we did was for nothing.
Fortunately, there's a nice gem called order_as_specified that will allow us to retrieve all People objects in the specific order of the IDs. Although the gem would work, I didn't use it and instead wrote a short line of code to craft conditions that would preserve the order.
order_by = people_ids.map { |id| "people.id='#{id}' DESC" }.join(", ")
If our people_ids array is [1, 12, 3], it would create the following ORDER statement:
"people.id='1' DESC, people.id='12' DESC, people.id='3' DESC"
I learned from this comment that writing an ORDER statement in this way would preserve the order.
Now, all that's left is to retrieve the People objects from ActiveRecord, making sure to specify the order.
people = People.where(id: people_ids).order(order_by)
And that did it! I didn't worry about removing any duplicate IDs because ActiveRecord does that automatically when you run the where command.
I understand that this code is not very portable and would require some changes if any of the people table's columns are modified, but it works perfectly and seems to execute only one query according to the console.
I have a model with following columns
Charges Model
Date
fee
discount
Data
1/1/15, 1, 1
1/1/15, 2, 1
2/2/15, 3, 3
I have a few named scopes like this_year
I want to do something like Charges.this_year.summed_up
How do I make a named scope for this.
The returned response then should be:
1/1/15, 3, 2
2/2/15, 3, 3
Assuming you have a model with a date field(eg. published_at) and 2 integer fields(eg. fee, discount). You can use "group" method to run GROUP BY on published_at. Then just use sum method if you want only sum of one fields. If you want more than one field, you have to run a select with SQL SUMs inside, to get multiple column sums. Here is an example.
Charge..group(published_at)
.select("published_at, SUM(fee) AS sum_fee, SUM(discount) AS sum_discount")
.order("published_at")
Note: Summarized fields won't show up in rails console return value prompt. But they are there for you to use.
Depending upon what end result you want, you may want to look at .group(:attribute) rather than .group_by:
Charge.group(:date).each do |charge|
charge.where('date = ?', charge.date).sum(:fee)
charge.where('date = ?', charge.date).sum(:discount)
end
I found this approach easier, especially if setting multiple conditions on the data you want to extract from the table.
In any case, I had an accounting model that presented this kind of issue where I needed credit and debit plus type of payment info on a single table and spent a fruitful few hours learning all about group_by before realizing that .group() offered a simple solution.
I already have a working solution, but I would really like to know why this doesn't work:
ratings = Model.select(:rating).uniq
ratings.each { |r| puts r.rating }
It selects, but don't print unique values, it prints all values, including the duplicates. And it's in the documentation: http://guides.rubyonrails.org/active_record_querying.html#selecting-specific-fields
Model.select(:rating)
The result of this is a collection of Model objects. Not plain ratings. And from uniq's point of view, they are completely different. You can use this:
Model.select(:rating).map(&:rating).uniq
or this (most efficient):
Model.uniq.pluck(:rating)
Rails 5+
Model.distinct.pluck(:rating)
Update
Apparently, as of rails 5.0.0.1, it works only on "top level" queries, like above. Doesn't work on collection proxies ("has_many" relations, for example).
Address.distinct.pluck(:city) # => ['Moscow']
user.addresses.distinct.pluck(:city) # => ['Moscow', 'Moscow', 'Moscow']
In this case, deduplicate after the query
user.addresses.pluck(:city).uniq # => ['Moscow']
If you're going to use Model.select, then you might as well just use DISTINCT, as it will return only the unique values. This is better because it means it returns less rows and should be slightly faster than returning a number of rows and then telling Rails to pick the unique values.
Model.select('DISTINCT rating')
Of course, this is provided your database understands the DISTINCT keyword, and most should.
This works too.
Model.pluck("DISTINCT rating")
If you want to also select extra fields:
Model.select('DISTINCT ON (models.ratings) models.ratings, models.id').map { |m| [m.id, m.ratings] }
Model.uniq.pluck(:rating)
# SELECT DISTINCT "models"."rating" FROM "models"
This has the advantages of not using sql strings and not instantiating models
Model.select(:rating).uniq
This code works as 'DISTINCT' (not as Array#uniq) since rails 3.2
Model.select(:rating).distinct
Another way to collect uniq columns with sql:
Model.group(:rating).pluck(:rating)
If I am going right to way then :
Current query
Model.select(:rating)
is returning array of object and you have written query
Model.select(:rating).uniq
uniq is applied on array of object and each object have unique id. uniq is performing its job correctly because each object in array is uniq.
There are many way to select distinct rating :
Model.select('distinct rating').map(&:rating)
or
Model.select('distinct rating').collect(&:rating)
or
Model.select(:rating).map(&:rating).uniq
or
Model.select(:name).collect(&:rating).uniq
One more thing, first and second query : find distinct data by SQL query.
These queries will considered "london" and "london " same means it will neglect to space, that's why it will select 'london' one time in your query result.
Third and forth query:
find data by SQL query and for distinct data applied ruby uniq mehtod.
these queries will considered "london" and "london " different, that's why it will select 'london' and 'london ' both in your query result.
please prefer to attached image for more understanding and have a look on "Toured / Awaiting RFP".
If anyone is looking for the same with Mongoid, that is
Model.distinct(:rating)
Some answers don't take into account the OP wants a array of values
Other answers don't work well if your Model has thousands of records
That said, I think a good answer is:
Model.uniq.select(:ratings).map(&:ratings)
=> "SELECT DISTINCT ratings FROM `models` "
Because, first you generate a array of Model (with diminished size because of the select), then you extract the only attribute those selected models have (ratings)
You can use the following Gem: active_record_distinct_on
Model.distinct_on(:rating)
Yields the following query:
SELECT DISTINCT ON ( "models"."rating" ) "models".* FROM "models"
In my scenario, I wanted a list of distinct names after ordering them by their creation date, applying offset and limit. Basically a combination of ORDER BY, DISTINCT ON
All you need to do is put DISTINCT ON inside the pluck method, like follow
Model.order("name, created_at DESC").offset(0).limit(10).pluck("DISTINCT ON (name) name")
This would return back an array of distinct names.
Model.pluck("DISTINCT column_name")