Rails GROUP BY query using Active Record - ruby-on-rails

I want to write this query using classic Active Record, instead of raw SQL.
SELECT
user_id,
term,
SUM(views) AS Views,
SUM(clicks) AS Clicks
FROM reports
GROUP BY user_id, term
I tried this, but it doesn't work:
Report.group([:user_id, :term]).sum([:views, :clicks])
I know it's possible to use .group_by {}, but that is not very efficient because the aggregation is done in Ruby rather than by the database query.
DB: Postgresql

I think you can use this:
@reports = Report.select("reports.user_id, reports.term,
sum(reports.views) as total_views,
sum(reports.clicks) as total_clicks").
group("reports.user_id, reports.term")
Note that the totals (the SUM results) do not show up when a record is inspected, but they are there if you call them by name:
@reports.first # total_views and total_clicks do not show up in the Rails console, but
@reports.first.total_views # returns the calculated total
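To check what the grouped query should return, the same aggregation can be mirrored in plain Ruby. This is only a minimal sketch: the hashes are made-up stand-ins for rows of the reports table, and it is exactly the in-memory approach the question wants to avoid for real data.

```ruby
# Made-up sample rows standing in for the reports table (assumption).
rows = [
  { user_id: 1, term: "ruby",  views: 10, clicks: 2 },
  { user_id: 1, term: "ruby",  views: 5,  clicks: 1 },
  { user_id: 1, term: "rails", views: 7,  clicks: 0 },
  { user_id: 2, term: "ruby",  views: 3,  clicks: 3 }
]

# Equivalent of GROUP BY user_id, term with SUM(views) and SUM(clicks).
totals = rows
  .group_by { |r| [r[:user_id], r[:term]] }
  .map do |(user_id, term), group|
    {
      user_id: user_id,
      term: term,
      total_views: group.sum { |r| r[:views] },
      total_clicks: group.sum { |r| r[:clicks] }
    }
  end

totals.each { |t| puts t.inspect }
```

Each element of totals corresponds to one row of the SQL result, with total_views and total_clicks playing the role of the aliased columns.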

Related

Grouping by into a list with activerecord in rails

I need to achieve something very similar to "How to get list of values in GROUP_BY clause?", but I need to use the Active Record query interface in Rails 4.2.1.
I have only gotten this far:
Roles.where(id: 2)
.select("user_roles.id, user_roles.role, GROUP_CONCAT(DISTINCT roles.group_id SEPARATOR ',') ")
.group(:role)
But this just returns an ActiveRecord::Relation object with a single entry that has id and role.
How do I achieve the same with Active Record without having to pull in all the relationships and manually build such an object?
Roles.where(id: 2) already returns a single record. You might instead start with users and join the roles table, doing something like this:
User.
joins(user_roles: :roles).
where('roles.id = 2').
select("user_roles.role, GROUP_CONCAT(DISTINCT roles.group_id SEPARATOR ',') ").
group(:role)
Or, if you have a model for user_roles, start with it, since you do not select anything from users anyway.

Combining distinct with another condition

I'm migrating a Rails 3.2 app to Rails 5.1 (not before time) and I've hit a problem with a where query.
The code that works on Rails 3.2 looks like this,
sales = SalesActivity.select('DISTINCT batch_id').where('salesperson_id = ?', sales_id)
sales.find_each(batch_size: 2000) do |batchToProcess|
.....
When I run this code under Rails 5.1, it causes the following error when it attempts the find_each,
ArgumentError (Primary key not included in the custom select clause):
I want to end up with an array(?) of unique batch_ids for the given salesperson_id that I can then traverse, as was working with Rails 3.2.
For reasons I don't understand, it looks like I might need to include the whole record to traverse through (my thinking being that I need to include the Primary key)?
I'm trying to rephrase the 'where', and have tried the following,
sales = SalesActivity.where(salesperson_id: sales_id).select(:batch_id).distinct
However, the combined ActiveRecordQuery applies the DISTINCT to both the salesperson_id AND the batch_id - that's #FAIL1
Also, because I'm still using a select (to let distinct know which column I want to be 'distinct') it also still only selects the batch_id column of course, which I am trying to avoid - that's #FAIL2
How can I efficiently pull all unique batch_id values for a given salesperson_id, so I can then iterate over them?
Thanks!
How about:
SalesActivity.where(salesperson_id: sales_id).pluck('DISTINCT batch_id')
pluck has to come last in the chain, because it runs the query immediately and returns a plain Array of the batch_ids rather than a relation.
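Since pluck returns an Array, find_each is no longer available on the result. If processing in chunks of 2000 still matters, Array#each_slice gives similar batching. A minimal sketch, with made-up ids standing in for the pluck result:

```ruby
# Stand-in for the Array that pluck would return (made-up values).
batch_ids = (1..4500).to_a

# Walk the ids in chunks of 2000, mirroring find_each(batch_size: 2000).
chunk_sizes = []
batch_ids.each_slice(2000) do |chunk|
  chunk_sizes << chunk.size
  chunk.each do |batch_id|
    # process one batch_id here
  end
end

puts chunk_sizes.inspect
```

Unlike find_each, this does not fetch rows in batches from the database; all ids are already in memory, and each_slice only splits the iteration.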

How to get weighted average grouped by a column

I have a model Company that has columns pbr, market_cap and category.
To get averages of pbr grouped by category, I can use group method.
Company.group(:category).average(:pbr)
But there is no method for weighted average.
To get weighted averages I need to run this SQL code.
select case when sum(market_cap) = 0 then 0 else sum(pbr * market_cap) / sum(market_cap) end as weighted_average_pbr, category AS category FROM "companies" GROUP BY "companies"."category";
In psql this query works fine, but I don't know how to run it from Rails.
sql = %q(select case when sum(market_cap) = 0 then 0 else sum(pbr * market_cap) / sum(market_cap) end as weighted_average_pbr, category AS category FROM "companies" GROUP BY "companies"."category";)
ActiveRecord::Base.connection.select_all(sql)
returns an error:
output error: #<NoMethodError: undefined method `keys' for #<Array:0x007ff441efa618>>
It would be best if I could extend the Rails query methods so that I can use
Company.group(:category).weighted_average(:pbr)
But I heard that extending the Rails query interface is a bit tricky, so for now I just want to know how to run the SQL and get its result from Rails.
Does anyone know how to do it?
Version
rails: 4.2.1
What version of Rails are you using? I don't get that error with Rails 4.2. In Rails 3.2 select_all used to return an Array, and in 4.2 it returns an ActiveRecord::Result. But in either case, it is correct that there is no keys method. Instead you need to call keys on each element of the Array or Result. It sounds like the problem isn't from running the query, but from what you're doing afterward.
In any case, to get the more fluent approach you've described, you could do this:
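class Company < ActiveRecord::Base
  # Adds the weighted average of col (weighted by market_cap) to the select list.
  # col is interpolated into the SQL, so never pass user input here.
  scope :weighted_average, lambda { |col|
    select("companies.category").
      select(<<-EOQ)
        (CASE WHEN SUM(market_cap) = 0 THEN 0
         ELSE SUM(#{col} * market_cap) / SUM(market_cap)
         END) AS weighted_average_#{col}
      EOQ
  }
end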
This will let you say Company.group(:category).weighted_average(:pbr), and you will get a collection of Company instances. Each one will have an extra weighted_average_pbr attribute, so you can do this:
Company.group(:category).weighted_average(:pbr).each do |c|
puts c.weighted_average_pbr
end
These instances will not have their normal attributes, but they will have category. That is because they do not represent individual Companies, but groups of companies with the same category. If you want to group by something else, you could parameterize the lambda to take the grouping column. In that case you might as well move the group call into the lambda too.
Now be warned that the parameter to weighted_average goes straight into your SQL query without escaping, since it is a column name. So make sure you don't pass user input to that method, or you'll have a SQL injection vulnerability. In fact I would probably put a guard inside the lambda, something like raise "NOPE" unless col =~ %r{\A[a-zA-Z0-9_]+\Z}.
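The guard suggested above could be sketched as follows. The helper names safe_column! and weighted_average_sql are hypothetical, made up for this illustration; the sketch also uses \z instead of \Z, on the assumption that a trailing newline should be rejected as well.

```ruby
# Hypothetical helper: accept only plain identifiers as column names.
# \z (unlike \Z) also rejects a name that ends in a newline.
def safe_column!(col)
  name = col.to_s
  unless name.match?(/\A[a-zA-Z0-9_]+\z/)
    raise ArgumentError, "invalid column name: #{name.inspect}"
  end
  name
end

# Hypothetical helper: builds the SELECT fragment used by the scope,
# interpolating the column name only after it passed the guard.
def weighted_average_sql(col)
  name = safe_column!(col)
  "(CASE WHEN SUM(market_cap) = 0 THEN 0 " \
    "ELSE SUM(#{name} * market_cap) / SUM(market_cap) END) " \
    "AS weighted_average_#{name}"
end

puts weighted_average_sql(:pbr)
```

Inside the scope, the same check would run on col before the heredoc is built, so an invalid name raises before any SQL reaches the database.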
The more general lesson is that you can use select to include extra SQL expressions, and have Rails magically treat those as attributes on the instances returned from the query.
Also note that unlike with select_all where you get a bunch of hashes, with this approach you get a bunch of Company instances. So again there is no keys method! :-)
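To sanity-check the numbers coming back from the database, the same weighted average can be computed in plain Ruby. This is a minimal sketch with made-up rows standing in for the companies table; it mirrors the CASE expression, including the zero guard.

```ruby
# Made-up sample rows standing in for the companies table (assumption).
companies = [
  { category: "tech",   pbr: 2.0, market_cap: 100 },
  { category: "tech",   pbr: 4.0, market_cap: 300 },
  { category: "retail", pbr: 1.5, market_cap: 0 }
]

# Per category: SUM(pbr * market_cap) / SUM(market_cap), or 0 when the
# total market cap is 0 (the CASE WHEN branch of the SQL).
weighted = companies.group_by { |c| c[:category] }.transform_values do |group|
  total_cap = group.sum { |c| c[:market_cap] }
  if total_cap.zero?
    0
  else
    group.sum { |c| c[:pbr] * c[:market_cap] } / total_cap
  end
end

puts weighted.inspect
```

For the "tech" rows this gives (2.0 * 100 + 4.0 * 300) / 400 = 3.5, while the plain average of pbr would be 3.0.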

Active Record - Chain Queries with OR

Rails: 4.1.2
Database: PostgreSQL
For one of my queries, I am using methods from both the textacular gem and Active Record. How can I chain some of the following queries with an "OR" instead of an "AND":
people = People.where(status: status_approved).fuzzy_search(first_name: "Test").where("last_name LIKE ?", "Test")
I want to chain the last two scopes (fuzzy_search and the where after it) together with an "OR" instead of an "AND." So I want to retrieve all People who are approved AND (whose first name is similar to "Test" OR whose last name contains "Test"). I've been struggling with this for quite a while, so any help would be greatly appreciated!
I dug into fuzzy_search and saw that it is translated to something like:
SELECT "people".*, COALESCE(similarity("people"."first_name", 'test'), 0) AS "rankxxx"
FROM "people"
WHERE (("people"."first_name" % 'abc'))
ORDER BY "rankxxx" DESC
This means that, if you don't care about preserving the rank order, fuzzy_search only filters the result by WHERE (("people"."first_name" % 'abc')).
Knowing that, you can write a query with similar functionality yourself:
People.where(status: status_approved)
.where('(first_name % :key) OR (last_name LIKE :key)', key: 'Test')
If you do want ordering, please specify what the order should be once the two conditions are combined.
After a few days, I came up with the solution! Here's what I did:
This is the query I wanted to chain together with an OR:
people = People.where(status: status_approved).fuzzy_search(first_name: "Test").where("last_name LIKE ?", "Test")
As Hoang Phan suggested, when you look in the console, this produces the following SQL:
SELECT "people".*, COALESCE(similarity("people"."first_name", 'test'), 0) AS "rank69146689305952314"
FROM "people"
WHERE "people"."status" = 1 AND (("people"."first_name" % 'Test')) AND (last_name LIKE 'Test') ORDER BY "rank69146689305952314" DESC
I then dug into the textacular gem and found out how the rank is generated. I found it in the textacular.rb file and then crafted the SQL query using it. I also replaced the "AND" that connected the last two conditions with an "OR":
# Generate a random number for the ordering
rank = rand(100000000000000000).to_s
# Create the SQL query
sql_query = "SELECT people.*, COALESCE(similarity(people.first_name, :query), 0)" +
" AS rank#{rank} FROM people" +
" WHERE (people.status = :status AND" +
" ((people.first_name % :query) OR (last_name LIKE :query_like)))" +
" ORDER BY rank#{rank} DESC"
I took out all of the quotation marks around table and field names in the SQL query, because I got error messages when I kept them, even when I used single quotes.
Then, I used the find_by_sql method to retrieve the People object IDs in an array. The symbols (:status, :query, :query_like) are used to protect against SQL injections, so I set their values accordingly:
# Retrieve all the IDs of People who are approved and whose first name or last name matches the search query.
# The IDs are sorted in order of most relevant to the search query.
people_ids = People.find_by_sql([sql_query, query: "Test", query_like: "%Test%", status: 1]).map(&:id)
I get the IDs and not the People objects in an array because find_by_sql returns an Array and not an ActiveRecord::Relation, as where would, so I cannot use ActiveRecord query methods such as where on the result. Using the IDs, we can execute another query to get a relation. However, there's one problem: if we were to simply run People.where(id: people_ids), the order of the IDs would not be preserved, so all the relevance ranking we did would be for nothing.
Fortunately, there's a nice gem called order_as_specified that will allow us to retrieve all People objects in the specific order of the IDs. Although the gem would work, I didn't use it and instead wrote a short line of code to craft conditions that would preserve the order.
order_by = people_ids.map { |id| "people.id='#{id}' DESC" }.join(", ")
If our people_ids array is [1, 12, 3], it would create the following ORDER statement:
"people.id='1' DESC, people.id='12' DESC, people.id='3' DESC"
I learned from this comment that writing an ORDER statement in this way would preserve the order.
Now, all that's left is to retrieve the People objects from ActiveRecord, making sure to specify the order.
people = People.where(id: people_ids).order(order_by)
And that did it! I didn't worry about removing any duplicate IDs because ActiveRecord does that automatically when you run the where command.
I understand that this code is not very portable and would require some changes if any of the people table's columns are modified, but it works perfectly and seems to execute only one query according to the console.
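An alternative to building the ORDER string is to restore the ranked order in Ruby after the second query. The following is a minimal sketch with hashes standing in for People records (made-up data); the result is an Array, so it only fits when no further query methods need to be chained.

```ruby
# Ranked ids, as returned by the find_by_sql step.
people_ids = [1, 12, 3]

# Stand-in for what People.where(id: people_ids) might return,
# in whatever order the database chose (made-up rows).
unordered = [
  { id: 3,  first_name: "Cy" },
  { id: 1,  first_name: "Al" },
  { id: 12, first_name: "Bo" }
]

# Index the rows by id, then read them back in ranked order.
by_id = unordered.map { |person| [person[:id], person] }.to_h
ordered = people_ids.map { |id| by_id[id] }.compact

puts ordered.map { |person| person[:id] }.inspect
```

The compact call drops ids for which no row came back, for example if a record was deleted between the two queries.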

How do i use .sort() to create a relation?

I am using the kaminari gem for pagination. I have a resources controller which paginates perfectly (due to the simple nature of the ordering). That can be seen here:
@resources = Resource.order("created_at desc").page(params[:page]).per(25)
That just sorts them by latest first. When I call .class on it, it appears to be an ActiveRecord::Relation.
For my tags, though, I want to sort by a relationship (the number of resources assigned to each tag):
@tags = Tag.all.sort{|a, b| b.number_of_resources <=> a.number_of_resources}.page(params[:page]).per(50)
However, it gives me the error: undefined method `page' for #
Tag.all.sort { ... } returns an Array (sort always does, and in Rails 3 Tag.all itself already returns one), hence your #page call fails: it expects an ActiveRecord::Relation.
If #number_of_resources maps to a DB column, then all you need to do is:
Tag.order('number_of_resources DESC').page(params[:page]).per(50)
If it's not, you either need to add it to the Tag database table, or just do your sort/paginate in Ruby rather than using kaminari. This will be feasible if the number of tags is under ~1000 or so.
If you do add the info to the db, check out this post: Counter Cache for a column with conditions?
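Sorting and paginating in Ruby, as mentioned above, could look roughly like this. It is a minimal sketch: the hashes are made-up stand-ins for Tag records, and paginate_sorted is a hypothetical helper, not part of kaminari.

```ruby
# Made-up tags; number_of_resources stands in for the computed count.
tags = [
  { name: "ruby",  number_of_resources: 12 },
  { name: "rails", number_of_resources: 3 },
  { name: "sql",   number_of_resources: 7 }
]

# Hypothetical helper: sort by count (highest first), then return the
# requested 1-based page, or an empty Array past the last page.
def paginate_sorted(tags, page:, per_page:)
  sorted = tags.sort_by { |tag| -tag[:number_of_resources] }
  sorted.each_slice(per_page).to_a[page - 1] || []
end

first_page = paginate_sorted(tags, page: 1, per_page: 2)
puts first_page.map { |tag| tag[:name] }.inspect
```

kaminari also provides Kaminari.paginate_array, which wraps an already sorted Array so that page and per can be called on it and the view helpers keep working.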
You should do something like this: 1) join the two tables, 2) group the rows by tag, 3) count how many rows belong to each group, 4) order by that new count column.
Once that is written as a single SQL query, you can call the pagination methods on it.
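The four steps can be illustrated on in-memory data. This sketch only shows the logic, with made-up rows standing in for the joined tags and resources tables; in the database the same steps would be a JOIN, GROUP BY, COUNT and ORDER BY, which keeps the result a relation that can be paginated.

```ruby
# Step 1 (the join), as made-up rows: one entry per tag/resource assignment.
taggings = [
  { tag: "ruby",  resource_id: 1 },
  { tag: "rails", resource_id: 1 },
  { tag: "ruby",  resource_id: 2 },
  { tag: "sql",   resource_id: 3 },
  { tag: "ruby",  resource_id: 3 }
]

# Steps 2 and 3: group the joined rows by tag and count each group.
counts = taggings.group_by { |row| row[:tag] }.transform_values(&:size)

# Step 4: order by the count, highest first (ties broken by tag name).
ranked = counts.sort_by { |tag, count| [-count, tag] }

puts ranked.inspect
```

Note that a plain inner join leaves out tags without any resources; whether those should appear with a count of 0 is a design choice the original question leaves open.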

Resources