This query
Message.where("message_type = ?", "incoming").group("sender_number").count
returns a hash:
OrderedHash {"1234"=>21, "2345"=>11, "3456"=>63, "4568"=>100}
Now I want to order by the count of each group. How can I do that within the query?
The easiest way to do this is to add an order clause to the original query. If you give the count method a specific column, it generates an output column named count_{column}, which you can then reference in an order call:
Message.where('message_type = ?','incoming')
.group('sender_number')
.order('count_id asc').count('id')
When I tried this, Rails gave me this error:
SQLite3::SQLException: no such column: count_id: SELECT COUNT(*) AS count_all, state AS state FROM "ideas" GROUP BY state ORDER BY count_id desc LIMIT 3
Notice that it says SELECT ... AS count_all
So I updated the query from @Simon's answer to look like this, and it works for me:
.order('count_all desc')
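Putting the two pieces together, the full working query for the original Message example (a sketch based on the count_all alias above) would be:
Message.where("message_type = ?", "incoming")
       .group("sender_number")
       .order("count_all desc")
       .count
# => {"4568"=>100, "3456"=>63, "1234"=>21, "2345"=>11} for the sample data above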
Related
I have a model with the fields price, min_price, max_price and discount in my product table. If I want to sort ascending and descending on multiple fields, how will the query be executed when I chain several order calls? For example:
@products = Product.order("price asc").order("min_price desc").order("max_price asc").order("updated_at asc") # query might not be exact, adding it for reference
Will it order according to the sequence of the order calls?
If you append .to_sql to that, it will show the generated SQL so you can investigate yourself.
I tried a similar query:
Book.select(:id).order("id asc").order("pub_date desc").to_sql
=> "SELECT \"books\".\"id\" FROM \"books\" ORDER BY id asc, pub_date desc"
You might instead:
Book.select(:id).order(id: :asc, pub_date: :desc).to_sql
=> "SELECT \"books\".\"id\" FROM \"books\" ORDER BY \"books\".\"id\" ASC, \"books\".\"pub_date\" DESC"
... which, as you can see, adds the table name, so it is more reliable if you are accessing multiple tables.
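Applied to the Product example from the question, the hash form would look something like this (a sketch; the column names are taken from the question, and each key is appended to the ORDER BY in sequence):
@products = Product.order(price: :asc, min_price: :desc, max_price: :asc, updated_at: :asc)
@products.to_sql
# roughly: SELECT "products".* FROM "products" ORDER BY "products"."price" ASC, "products"."min_price" DESC, "products"."max_price" ASC, "products"."updated_at" ASC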
I am trying to write a function that groups by some columns in a very large table (millions of rows). Is there any way to get find_each to work with this, or is it impossible given that I do not want to order by the id column?
The SQL of my query is:
SELECT derivable_type, derivable_id FROM "mytable" GROUP BY derivable_type, derivable_id ORDER BY "mytable"."id" ASC;
The Rails find_each automatically adds an ORDER BY on the primary key using a reorder statement. I have tried changing the SQL to:
SELECT MAX(id) AS "mytable"."id", derivable_type, derivable_id FROM "mytable" GROUP BY derivable_type, derivable_id ORDER BY "mytable"."id" ASC;
but that doesn't work either. Any ideas other than writing my own find_each function or overriding the private batch_order function in batches.rb?
There are at least two approaches to solve this problem:
I. Use a subquery:
# query the table for the max id of each (derivable_type, derivable_id) pair
my_table_ids = MyTable
  .group("derivable_type, derivable_id")
  .select("MAX(id) AS my_table_id")
# use those ids in a subquery so find_each can keep its ORDER BY on id
MyTable
  .where(id: my_table_ids)
  .find_each { |row| do_something(row) }
II. Write a custom find_each function
rows = MyTable
.group("derivable_type, derivable_id")
.select("derivable_type, derivable_id")
find_each_grouped(rows, ['derivable_type', 'derivable_id']) do |row|
do_something(row)
end
def find_each_grouped(rows, columns, &block)
  offset = 0
  batch_size = 1_000
  loop do
    batch = rows
              .order(columns)
              .offset(offset)
              .limit(batch_size)
    batch.each(&block)
    break if batch.size < batch_size
    offset += batch_size
  end
end
I'm not sure I'm 100% clear on what you're trying to do, but your query looks the same as doing a DISTINCT over those two columns:
SELECT derivable_type, derivable_id FROM "mytable" GROUP BY derivable_type, derivable_id ORDER BY "mytable"."id" ASC;
---- vv
SELECT DISTINCT(derivable_type, derivable_id) FROM "mytable" ORDER BY "mytable"."id" ASC;
You should be able to use Active Record to accomplish this, combined with find_each (if Mytable is your model):
Mytable.all.group(:derivable_type, :derivable_id).distinct.find_each
# gives => #<Enumerator: #<ActiveRecord::Relation [...]>:find_each({:start=>nil, :finish=>nil, :batch_size=>1000, :error_on_ignore=>nil})>
This question already has answers here:
Select first row in each GROUP BY group?
(20 answers)
Closed 7 years ago.
I am creating a contest where users can submit multiple entries. Only the entry with the highest tonnage will be shown. In the index view, all the entries have to be sorted descending by tonnage value.
My submissions controller contains the following:
@submissions = @contest.submissions.maximum(:tonnage, group: User)
The problem here is that I do not get an array back with all the submission values. I need something I can iterate through.
e.g. a list which contains only one submission per user, namely the one with the highest tonnage value.
When I just group, I get the following error:
GroupingError: ERROR: column "submissions.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT "submissions".* FROM "submissions" WHERE "submission...
UPDATE:
I found an SQL query that does approximately what I want.
select *
from submissions a
inner join
( select user_id, max(tonnage) as max_tonnage
from submissions
group by user_id) b
on
a.user_id = b.user_id and
a.tonnage = b.max_tonnage
How can I do this in ActiveRecord?
From the comments:
Simpler with DISTINCT ON:
SELECT DISTINCT ON (user_id) *
FROM submissions
ORDER BY user_id, tonnage DESC NULLS LAST;
NULLS LAST is only relevant if tonnage can be NULL.
Detailed explanation:
Select first row in each GROUP BY group?
Syntax in ActiveRecord:
Submission.select("DISTINCT ON (user_id) *").order("user_id, tonnage DESC NULLS LAST")
More in the Ruby documentation or this related answer:
Get a list of first record for each group
Possible performance optimization:
Optimize GROUP BY query to retrieve latest record per user
Sort result rows
Per request in a comment.
SELECT * FROM (
SELECT DISTINCT ON (user_id) *
FROM submissions
ORDER BY user_id, tonnage DESC NULLS LAST
) sub
ORDER BY tonnage DESC NULLS LAST, user_id; -- 2nd item to break ties;
Alternatively use row_number() in a subquery:
SELECT * FROM (
SELECT *
, row_number() OVER (PARTITION BY user_id ORDER BY tonnage DESC NULLS LAST) AS rn
FROM submissions
) sub
WHERE rn = 1
ORDER BY tonnage DESC NULLS LAST, user_id;
Or the query you have, plus ORDER BY.
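If you want to stay in ActiveRecord for the sorted variant as well, one possible translation wraps the DISTINCT ON relation in a subquery via from (a sketch; it assumes from accepts a relation plus a table alias in your Rails version):
Submission.from(
  Submission.select("DISTINCT ON (user_id) *")
            .order("user_id, tonnage DESC NULLS LAST"),
  :submissions
).order("tonnage DESC NULLS LAST, user_id")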
Rails version 4.1.6, Postgres version not important.
I use a custom sort order, where strings come before integers and the integers are sorted as numbers:
sample sorting:
A0101
BD330
BE124
1
2
3
10
Since there is no direct way to achieve this with the query interface, I've found this Postgres-specific syntax which, in general, works fine:
default_scope {
order("substring(entries.code, '[^0-9_].*$') ASC").
order("(substring(entries.code, '^[0-9]+'))::int ASC")
}
For example, to get the first record:
2.0.0p247 :001 > Entry.first
Entry Load (3.6ms) SELECT "entries".* FROM "entries" ORDER BY substring(entries.code, '[^0-9_].*$') ASC, (substring(entries.code, '^[0-9]+'))::int ASC LIMIT 1
=> #<Entry id: ...............>
However, when I want to do a reverse search, I get DESC keywords raining down all over the query string... This is quite annoying, since I haven't found a way to get rid of them yet:
2.0.0p247 :002 > Entry.last
Entry Load (0.8ms) SELECT "entries".* FROM "entries" ORDER BY substring(entries.code DESC, '[^0-9_].*$') DESC, (substring(entries.code DESC, '^[0-9]+'))::int DESC LIMIT 1
PG::Error: ERROR: syntax error at or near "DESC"
LINE 1: ... FROM "entries" ORDER BY substring(entries.code DESC, '[^0...
^
: SELECT "entries".* FROM "entries" ORDER BY substring(entries.code DESC, '[^0-9_].*$') DESC, (substring(entries.code DESC, '^[0-9]+'))::int DESC LIMIT 1
ActiveRecord::StatementInvalid: PG::Error: ERROR: syntax error at or near "DESC"
LINE 1: ... FROM "entries" ORDER BY substring(entries.code DESC, '[^0...
To be more specific (though I don't believe it is necessary), I would like to get rid of those DESC keywords inside the substring() calls...
EDIT:
I see in the definition of reverse_sql_order that the order string is split at the commas and ASC or DESC is applied to each part...
Relying on extensive database-specific functions in a Rails project is never a good idea. Those kinds of composite statements can drive you crazy.
order("substring(entries.code, '[^0-9_].*$') ASC").
order("(substring(entries.code, '^[0-9]+'))::int ASC")
IMHO, the simplest and most effective solution is a helper column. Define, for instance, a table column called weight with type integer.
Define a model callback that, every time you save a record, stores 0 in that column if the value of the sorting field is a string, or the numeric value if it is a number. That becomes your sort index.
Run the sort queries against that weight column. You can even index the attribute, and your queries will be much cleaner and faster. You will also be able to sort ASC or DESC with no extra complexity.
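A minimal sketch of that approach, assuming an integer weight column next to the code column used in the question (the callback name is illustrative):
class Entry < ActiveRecord::Base
  before_save :set_weight

  private

  # 0 for codes that start with a non-digit, the numeric value otherwise,
  # so strings sort before numbers and numbers sort numerically
  def set_weight
    self.weight = code =~ /\A\d+\z/ ? code.to_i : 0
  end
end

# ascending: strings first (alphabetically), then numbers in numeric order
Entry.order(:weight, :code)
# descending works without any substring() trickery
Entry.order(weight: :desc, code: :desc)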
I am trying to load the latest 10 Arts grouped by user_id and ordered by created_at. This works fine with SQLite and MySQL, but gives an error on my new PostgreSQL database.
Art.all(:order => "created_at desc", :limit => 10, :group => "user_id")
ActiveRecord error:
Art Load (18.4ms) SELECT "arts".* FROM "arts" GROUP BY user_id ORDER BY created_at desc LIMIT 10
ActiveRecord::StatementInvalid: PGError: ERROR: column "arts.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT "arts".* FROM "arts" GROUP BY user_id ORDER BY crea...
Any ideas?
The SQL generated by the expression is not a valid query; you are grouping by user_id and selecting a lot of other fields based on that, but not telling the DB how it should aggregate those other fields. For example, if your data looks like this:
a | b
---|---
1 | 1
1 | 2
2 | 3
Now when you ask the DB to group by a and also return b, it doesn't know how to aggregate the values 1 and 2. You need to tell it whether to take the min, max, average, sum or something else. Just as I was writing this answer, two other answers appeared which might explain all this better.
In your use case, though, I don't think you want a GROUP BY at the DB level. As there are only 10 arts, you can group them in your application. Don't use this method with thousands of arts, though:
arts = Art.all(:order => "created_at desc", :limit => 10)
grouped_arts = arts.group_by {|art| art.user_id}
# now you have a hash with following structure in grouped_arts
# {
# user_id1 => [art1, art4],
# user_id2 => [art3],
# user_id3 => [art5],
# ....
# }
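If one entry per user is enough at the Ruby level, the newest art per user can be picked straight out of that hash (a sketch; it relies on the values keeping the created_at desc order from the query above):
latest_per_user = grouped_arts.map { |_user_id, arts| arts.first }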
EDIT: Select the latest arts, but only one art per user
Just to give you the idea of the SQL (not tested, as I don't have an RDBMS installed on my system):
SELECT arts.* FROM arts
WHERE (arts.user_id, arts.created_at) IN
(SELECT user_id, MAX(created_at) FROM arts
GROUP BY user_id
ORDER BY MAX(created_at) DESC
LIMIT 10)
ORDER BY created_at DESC
LIMIT 10
This solution is based on the practical assumption that no two arts for the same user share the same highest created_at, which may well be wrong if you are importing or programmatically creating arts in bulk. If the assumption doesn't hold, the SQL might get more contrived.
EDIT: Attempt to change the query to Arel:
Art.where("(arts.user_id, arts.created_at) IN
(SELECT user_id, MAX(created_at) FROM arts
GROUP BY user_id
ORDER BY MAX(created_at) DESC
LIMIT 10)").
order("created_at DESC").
page(params[:page]).
per(params[:per])
You need to select the specific columns you need:
Art.select(:user_id).group(:user_id).limit(10)
It will raise an error when you try to select title in the query, for example:
Art.select(:user_id, :title).group(:user_id).limit(10)
column "arts.title" must appear in the GROUP BY clause or be used in an aggregate function
That is because when you try to group by user_id, the query has no idea how to handle the title in the group, because the group contains several titles.
So, as the exception already mentions, the column needs to appear in the GROUP BY:
Art.select(:user_id, :title).group(:user_id, :title).limit(10)
... or be used in an aggregate function:
Art.select("user_id, array_agg(title) as titles").group(:user_id).limit(10)
Take a look at this post: SQLite to Postgres (Heroku) GROUP BY
Postgres is actually following the SQL standard here, whilst SQLite and MySQL break from it.
Have a look at this question: Converting MySQL select to PostgreSQL. Postgres won't allow a column to be listed in the SELECT statement unless it is in the GROUP BY clause or wrapped in an aggregate function.