only show highest value user entry [duplicate] - ruby-on-rails

I am creating a contest where a user can submit multiple entries. Only the entry with the highest tonnage will be shown. In the index view, all the entries have to be sorted in descending order by tonnage value.
My submissions controller shows following:
@submissions = @contest.submissions.maximum(:tonnage, group: User)
The problem here is that I do not get an array back with all the submission values. I need something I can iterate through.
For example, a list that contains only one submission per user: the submission with the highest tonnage value.
When I just group I get following error:
GroupingError: ERROR: column "submissions.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT "submissions".* FROM "submissions" WHERE "submission...
UPDATE:
I found an SQL query that does approximately what I want.
select *
from submissions a
inner join
( select user_id, max(tonnage) as max_tonnage
from submissions
group by user_id) b
on
a.user_id = b.user_id and
a.tonnage = b.max_tonnage
How can I fix this in activerecord?
Comment info:

Simpler with DISTINCT ON:
SELECT DISTINCT ON (user_id) *
FROM submissions
ORDER BY user_id, tonnage DESC NULLS LAST;
NULLS LAST is only relevant if tonnage can be NULL.
Detailed explanation:
Select first row in each GROUP BY group?
Syntax in ActiveRecord:
Submission.select("DISTINCT ON (user_id) *").order("user_id, tonnage DESC NULLS LAST")
More in the Ruby documentation or this related answer:
Get a list of first record for each group
Possible performance optimization:
Optimize GROUP BY query to retrieve latest record per user
Sort result rows
Per request in comment.
SELECT * FROM (
SELECT DISTINCT ON (user_id) *
FROM submissions
ORDER BY user_id, tonnage DESC NULLS LAST
) sub
ORDER BY tonnage DESC NULLS LAST, user_id; -- 2nd item to break ties;
Alternatively use row_number() in a subquery:
SELECT * FROM (
SELECT *
, row_number() OVER (PARTITION BY user_id ORDER BY tonnage DESC NULLS LAST) AS rn
FROM submissions
) sub
WHERE rn = 1
ORDER BY tonnage DESC NULLS LAST, user_id;
Or the query you have, plus ORDER BY.
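For small result sets, the same selection can also be sketched in plain Ruby after the rows are loaded. This is only an in-memory illustration of what the queries above compute, not a replacement for them; `Entry` and `best_entries` are hypothetical names standing in for `Submission` records.

```ruby
# Hypothetical stand-in for a Submission record (illustration only).
Entry = Struct.new(:id, :user_id, :tonnage)

# Keep each user's entry with the highest tonnage, then sort the winners
# by tonnage descending. A nil tonnage sorts last, mirroring NULLS LAST;
# user_id breaks ties, as in the final ORDER BY above.
def best_entries(entries)
  rank = ->(e) { e.tonnage.nil? ? [1, 0] : [0, -e.tonnage] }
  entries
    .group_by(&:user_id)
    .map { |_user_id, group| group.min_by(&rank) }
    .sort_by { |e| rank.call(e) + [e.user_id] }
end
```

The result is an ordinary array, so a view can iterate over it with `each`. For large tables the database-side queries remain the better choice.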

Related

How to get top 5 item per category_id in rails

I am trying to get the top 5 records from each category in the items table. The items table has a category_id attribute, which I want to use to select the top 5 items per category based on the created_at attribute. So something like this pseudocode:
select top 5 from the items table
in each group
group by category_id
top value depending on created_at column
articles I have looked at:
https://stackoverflow.com/questions/32868779/get-top-n-items-per-group-in-ruby-on-rails
I want to include the associated :tags and :item_variants tables, so raw SQL with find_by_sql or ActiveRecord::Base.connection.execute(sql) is not an option.
What I have tried:
Using raw SQL to get the records like this works. The problem is that it doesn't allow includes to load the associated tags and item_variants tables:
sql = "SELECT items.*
FROM
( SELECT items.*,
ROW_NUMBER() OVER (PARTITION BY name
ORDER BY name DESC
)
AS rn
FROM items
) items
WHERE rn <= 4"
@items = Item.includes([:images_attached, :blob, :item_variants, :tags]).find_by_sql(sql)
This doesn't include the tables as mentioned and results in N+1 queries.
How can I solve this? Thank you in advance.
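The selection rule in the pseudocode above can be sketched in plain Ruby on rows that are already loaded. This only illustrates the "newest n per category" rule; it does not address eager loading, and `ItemRow` and `top_n_per_category` are hypothetical names, not part of the question's code.

```ruby
# Hypothetical stand-in for an Item record (illustration only).
ItemRow = Struct.new(:id, :category_id, :created_at)

# Return up to n newest rows per category, keyed by category_id.
# max_by(n) yields the n largest elements, largest first.
def top_n_per_category(items, n)
  items
    .group_by(&:category_id)
    .transform_values { |group| group.max_by(n, &:created_at) }
end
```

If the ids of the selected rows are collected first, a regular `Item.where(id: ids)` relation can carry the `includes` call. That two-step approach is an assumption here, not something the question confirms.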

Re-write a query to avoid PG::GroupingError: ERROR: in the GROUP BY clause or be used in an aggregate function

I tried many alternatives before posting this question.
I have a query on a table A with columns: id, num, user_id.
id is PK, user_id can be duplicate.
I need all the rows such that, for each unique user_id, only the row with the highest num value is chosen. For this, I came up with the SQL below, which will work in an Oracle database. I am on the Ruby on Rails platform with a Postgres database.
select stats.* from stats as A
where A.num > (
select B.num
from stats as B
where A.user_id == B.user_id
group by B.user_id
having B.num> min(B.num) )
I tried writing this query via active record method but still ran into
PG::GroupingError: ERROR: column "b.num" must appear in the GROUP BY
clause or be used in an aggregate function
Stat.where("stats.num > ( select B.nums from stats as B where stats.user_id = B.user_id group by B.user_id having B.num < max(B.num) )")
Can someone suggest an alternative way of writing this query?
The SELECT clause of your subquery in Rails doesn't match that of your example. Note that since you're performing an aggregate function min(B.num) in your HAVING clause, you'll have to also include it in your SELECT clause:
Stat.where("stats.num > ( select B.num from stats as B where stats.user_id = B.user_id group by B.user_id having B.num < max(B.num) )")
You may also need a condition to handle the case where select B.num from stats as B where stats.user_id = B.user_id group by B.user_id having B.num < max(B.num) returns more than one row.

Sequel -- How To Construct This Query?

I have a users table, which has a one-to-many relationship with a user_purchases table via the foreign key user_id. That is, each user can make many purchases (or may have none, in which case he will have no entries in the user_purchases table).
user_purchases has only one other field that is of interest here, which is purchase_date.
I am trying to write a Sequel ORM statement that will return a dataset with the following columns:
user_id
date of the users SECOND purchase, if it exists
So users who have not made at least 2 purchases will not appear in this dataset. What is the best way to write this Sequel statement?
Please note I am looking for a dataset with ALL users returned who have >= 2 purchases
Thanks!
EDIT FOR CLARITY
Here is a similar statement I wrote to get users and their first purchase date (as opposed to 2nd purchase date, which I am asking for help with in the current post):
DB[:users].join(:user_purchases, :user_id => :id)
.select{[:user_id, min(:purchase_date)]}
.group(:user_id)
You don't seem to be worried about the dates, just the counts so
DB[:user_purchases].group_and_count(:user_id).having(:count > 1).all
will return a list of user_ids and counts where the count (of purchases) is >= 2. Something like
[{:count=>2, :user_id=>1}, {:count=>7, :user_id=>2}, {:count=>2, :user_id=>3}, ...]
If you want to get the users with that, the easiest way with Sequel is probably to extract just the list of user_ids and feed that back into another query:
DB[:users].where(:id => DB[:user_purchases].group_and_count(:user_id).
having(:count > 1).all.map{|row| row[:user_id]}).all
Edit:
I felt like there should be a more succinct way and then I saw this answer (from Sequel author Jeremy Evans) to another question using select_group and select_more : https://stackoverflow.com/a/10886982/131226
This should do it without the subselect:
DB[:users].
left_join(:user_purchases, :user_id=>:id).
select_group(:id).
select_more{count(:purchase_date).as(:purchase_count)}.
having(:purchase_count > 1)
It generates this SQL
SELECT `id`, count(`purchase_date`) AS 'purchase_count'
FROM `users` LEFT JOIN `user_purchases`
ON (`user_purchases`.`user_id` = `users`.`id`)
GROUP BY `id` HAVING (`purchase_count` > 1)
Generally, this could be the SQL query that you need:
SELECT u.id, up1.purchase_date FROM users u
LEFT JOIN user_purchases up1 ON u.id = up1.user_id
LEFT JOIN user_purchases up2 ON u.id = up2.user_id AND up2.purchase_date < up1.purchase_date
GROUP BY u.id, up1.purchase_date
HAVING COUNT(up2.purchase_date) = 1;
Try converting that to sequel, if you don't get any better answers.
The date of the user's second purchase would be the second row retrieved if you do an order_by(:purchase_date) as part of your query.
To access that, do a limit(2) to constrain the query to two results then take the [-1] (or last) one. So, if you're not using models and are working with datasets only, and know the user_id you're interested in, your (untested) query would be:
DB[:user_purchases].where(:user_id => user_id).order_by(:user_purchases__purchase_date).limit(2).all[-1]
Here's some output from Sequel's console:
DB[:user_purchases].where(:user_id => 1).order_by(:purchase_date).limit(2).sql
=> "SELECT * FROM user_purchases WHERE (user_id = 1) ORDER BY purchase_date LIMIT 2"
Add the appropriate select clause:
.select(:user_id, :purchase_date)
and you should be done:
DB[:user_purchases].select(:user_id, :purchase_date).where(:user_id => 1).order_by(:purchase_date).limit(2).sql
=> "SELECT user_id, purchase_date FROM user_purchases WHERE (user_id = 1) ORDER BY purchase_date LIMIT 2"
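The query above handles one known user_id. For all users at once, the "second purchase" rule can be sketched in plain Ruby on rows that have already been fetched (for example with `.all`). This is an in-memory illustration under that assumption, not a Sequel statement; `second_purchase_dates` is a hypothetical helper name.

```ruby
require "date"

# Map each user_id to the date of that user's second purchase.
# purchases is an array of hashes like { user_id: 1, purchase_date: Date }.
# Users with fewer than two purchases are left out of the result.
def second_purchase_dates(purchases)
  purchases
    .group_by { |row| row[:user_id] }
    .transform_values { |rows| rows.map { |r| r[:purchase_date] }.sort[1] }
    .compact
end
```

Sorting each user's dates and taking index 1 gives nil for users with a single purchase, and `compact` drops those entries.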

"Order by" result of "group by" count?

This query
Message.where("message_type = ?", "incoming").group("sender_number").count
returns a hash.
OrderedHash {"1234"=>21, "2345"=>11, "3456"=>63, "4568"=>100}
Now I want to order by the count of each group. How can I do that within the query?
The easiest way to do this is to just add an order clause to the original query. If you give the count method a specific field, it will generate an output column with the name count_{column}, which can be used in the sql generated by adding an order call:
Message.where('message_type = ?','incoming')
.group('sender_number')
.order('count_id asc').count('id')
When I tried this, Rails gave me this error:
SQLite3::SQLException: no such column: count_id: SELECT COUNT(*) AS count_all, state AS state FROM "ideas" GROUP BY state ORDER BY count_id desc LIMIT 3
Notice that it says SELECT ... AS count_all.
So I updated the query from @Simon's answer to look like this, and it works for me:
.order('count_all desc')
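If ordering inside the query is not required, the hash returned by `count` can also be sorted in Ruby afterwards. A minimal sketch, assuming the hash maps each sender number to an integer count as in the question; `sort_by_count_desc` is a hypothetical helper name.

```ruby
# Sort a { sender_number => count } hash by count, largest first.
# Ruby hashes keep insertion order, so the sorted order is preserved.
def sort_by_count_desc(counts)
  counts.sort_by { |_number, count| -count }.to_h
end
```

This sorts after the rows have been counted, so it suits small result sets; with a LIMIT, the ordering has to happen in the query itself.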

Rails 3.1 with PostgreSQL: GROUP BY must be used in an aggregate function

I am trying to load the latest 10 Arts grouped by the user_id and ordered by created_at. This works fine with SQLite and MySQL, but gives an error on my new PostgreSQL database.
Art.all(:order => "created_at desc", :limit => 10, :group => "user_id")
ActiveRecord error:
Art Load (18.4ms) SELECT "arts".* FROM "arts" GROUP BY user_id ORDER BY created_at desc LIMIT 10
ActiveRecord::StatementInvalid: PGError: ERROR: column "arts.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT "arts".* FROM "arts" GROUP BY user_id ORDER BY crea...
Any ideas?
The SQL generated by the expression is not a valid query: you are grouping by user_id and selecting a lot of other fields based on that, but not telling the DB how it should aggregate the other fields. For example, if your data looks like this:
a | b
---|---
1 | 1
1 | 2
2 | 3
Now when you ask the database to group by a and also return b, it doesn't know how to aggregate the values 1 and 2. You need to tell it whether to select min, max, average, sum or something else. Just as I was writing the answer there have been two answers which might explain all this better.
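The ambiguity can be seen by grouping the example table in plain Ruby. A small sketch, with the table rows written as `[a, b]` pairs; the constant and helper names are made up for this illustration.

```ruby
# Rows of the example table as [a, b] pairs.
ROWS = [[1, 1], [1, 2], [2, 3]].freeze

# Group by a and collect every b seen for that a.
def b_values_by_a(rows)
  rows.group_by(&:first).transform_values { |pairs| pairs.map(&:last) }
end

# Reduce each group's b values with an explicit aggregate (:min, :max, :sum).
def aggregate_b(rows, aggregate)
  b_values_by_a(rows).transform_values { |bs| bs.public_send(aggregate) }
end
```

For a = 1 the group holds two b values, so a single b per group only exists once an aggregate such as `:min` or `:max` is named.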
In your use case though, I think you don't want a group by on db level. As there are only 10 arts, you can group them in your application. Don't use this method with thousands of arts though:
arts = Art.all(:order => "created_at desc", :limit => 10)
grouped_arts = arts.group_by {|art| art.user_id}
# now you have a hash with following structure in grouped_arts
# {
# user_id1 => [art1, art4],
# user_id2 => [art3],
# user_id3 => [art5],
# ....
# }
EDIT: Select latest_arts, but only one art per user
Just to give you the idea of the SQL (I have not tested it, as I don't have an RDBMS installed on my system):
SELECT arts.* FROM arts
WHERE (arts.user_id, arts.created_at) IN
(SELECT user_id, MAX(created_at) FROM arts
GROUP BY user_id
ORDER BY MAX(created_at) DESC
LIMIT 10)
ORDER BY created_at DESC
LIMIT 10
This solution is based on the practical assumption that no two arts for the same user can have the same highest created_at, but it may well be wrong if you are importing or programmatically creating arts in bulk. If the assumption doesn't hold true, the SQL might get more contrived.
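The effect of a tie can be sketched in plain Ruby. The helper below mimics the `(user_id, created_at) IN (...)` condition on in-memory rows and ignores the LIMIT; `ArtRow` and `latest_arts_with_ties` are hypothetical names used only for this illustration.

```ruby
# Hypothetical stand-in for an Art record (illustration only).
ArtRow = Struct.new(:id, :user_id, :created_at)

# Keep every row whose created_at equals its user's maximum created_at,
# which is what the tuple IN (SELECT user_id, MAX(created_at) ...) does.
def latest_arts_with_ties(arts)
  latest = arts.group_by(&:user_id)
               .transform_values { |group| group.map(&:created_at).max }
  arts.select { |art| art.created_at == latest[art.user_id] }
end
```

When two rows of the same user share the highest created_at, both are returned, so that user appears more than once in the result.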
EDIT: Attempt to change the query to Arel:
Art.where("(arts.user_id, arts.created_at) IN
(SELECT user_id, MAX(created_at) FROM arts
GROUP BY user_id
ORDER BY MAX(created_at) DESC
LIMIT 10)").
order("created_at DESC").
page(params[:page]).
per(params[:per])
You need to select the specific columns you need
Art.select(:user_id).group(:user_id).limit(10)
It will raise an error when you try to select title in the query, for example:
Art.select(:user_id, :title).group(:user_id).limit(10)
column "arts.title" must appear in the GROUP BY clause or be used in an aggregate function
That is because when you group by user_id, the query has no idea how to handle the title in the group, because the group contains several titles.
So, as the exception already mentions, the column needs to appear in the GROUP BY clause
Art.select(:user_id, :title).group(:user_id, :title).limit(10)
or be used in an aggregate function
Art.select("user_id, array_agg(title) as titles").group(:user_id).limit(10)
Take a look at this post SQLite to Postgres (Heroku) GROUP BY
PostgreSQL is actually following the SQL standard here, whilst SQLite and MySQL break from the standard.
Have a look at this question - Converting MySQL select to PostgreSQL. Postgres won't allow a column to be listed in the select statement that isn't in the group by clause.
