Solving a PG::GroupingError: ERROR - ruby-on-rails

The following code gets all the residences which have all the amenities which are listed in id_list. It works with out a problem with SQLite but raises an error with PostgreSQL:
id_list = [48, 49]
Residence.joins(:listed_amenities).
where(listed_amenities: {amenity_id: id_list}).
references(:listed_amenities).
group(:residence_id).
having("count(*) = ?", id_list.size)
The error on the PostgreSQL version:
What do I have to change to make it work with PostgreSQL?

A few things:
references should only be used with includes; it tells ActiveRecord to perform a join, so it's redundant when using an explicit joins.
You need to fully qualify the argument to group, i.e. group('residences.id').
For example,
id_list = [48, 49]
Residence.joins(:listed_amenities).
where(listed_amenities: { amenity_id: id_list }).
group('residences.id').
having('COUNT(*) = ?", id_list.size)

The query the Ruby (?) code is expanded to is selecting all fields from the residences table:
SELECT "residences".*
FROM "residences"
INNER JOIN "listed_amenities"
ON "listed_amentities"."residence_id" = "residences"."id"
WHERE "listed_amenities"."amenity_id" IN (48,49)
GROUP BY "residence_id"
HAVING count(*) = 2
ORDER BY "residences"."id" ASC
LIMIT 1;
From the Postgres manual, When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or if the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column.
You'll need to either group by all fields that aggregate functions aren't applied to, or do this differently. From the query, it looks like you only need to scan the amentities table to get the residence ID you're looking for:
SELECT "residence_id"
FROM "listed_amenities"
WHERE "listed_amenities"."amenity_id" IN (48,49)
GROUP BY "residence_id"
HAVING count(*) = 2
ORDER BY "residences"."id" ASC
LIMIT 1
And then fetch your residence data with that ID. Or, in one query:
SELECT "residences".*
FROM "residences"
WHERE "id" IN (SELECT "residence_id"
FROM "listed_amenities"
WHERE "listed_amenities"."amenity_id" IN (48,49)
GROUP BY "residence_id"
HAVING count(*) = 2
ORDER BY "residences"."id" ASC
LIMIT 1
);

Related

Find records with ID in array of IDS and keep the order of records matching that of IDs [duplicate]

I have a simple SQL query in PostgreSQL 8.3 that grabs a bunch of comments. I provide a sorted list of values to the IN construct in the WHERE clause:
SELECT * FROM comments WHERE (comments.id IN (1,3,2,4));
This returns comments in an arbitrary order which in my happens to be ids like 1,2,3,4.
I want the resulting rows sorted like the list in the IN construct: (1,3,2,4).
How to achieve that?
You can do it quite easily with (introduced in PostgreSQL 8.2) VALUES (), ().
Syntax will be like this:
select c.*
from comments c
join (
values
(1,1),
(3,2),
(2,3),
(4,4)
) as x (id, ordering) on c.id = x.id
order by x.ordering
In Postgres 9.4 or later, this is simplest and fastest:
SELECT c.*
FROM comments c
JOIN unnest('{1,3,2,4}'::int[]) WITH ORDINALITY t(id, ord) USING (id)
ORDER BY t.ord;
WITH ORDINALITY was introduced with in Postgres 9.4.
No need for a subquery, we can use the set-returning function like a table directly. (A.k.a. "table-function".)
A string literal to hand in the array instead of an ARRAY constructor may be easier to implement with some clients.
For convenience (optionally), copy the column name we are joining to ("id" in the example), so we can join with a short USING clause to only get a single instance of the join column in the result.
Works with any input type. If your key column is of type text, provide something like '{foo,bar,baz}'::text[].
Detailed explanation:
PostgreSQL unnest() with element number
Just because it is so difficult to find and it has to be spread: in mySQL this can be done much simpler, but I don't know if it works in other SQL.
SELECT * FROM `comments`
WHERE `comments`.`id` IN ('12','5','3','17')
ORDER BY FIELD(`comments`.`id`,'12','5','3','17')
With Postgres 9.4 this can be done a bit shorter:
select c.*
from comments c
join (
select *
from unnest(array[43,47,42]) with ordinality
) as x (id, ordering) on c.id = x.id
order by x.ordering;
Or a bit more compact without a derived table:
select c.*
from comments c
join unnest(array[43,47,42]) with ordinality as x (id, ordering)
on c.id = x.id
order by x.ordering
Removing the need to manually assign/maintain a position to each value.
With Postgres 9.6 this can be done using array_position():
with x (id_list) as (
values (array[42,48,43])
)
select c.*
from comments c, x
where id = any (x.id_list)
order by array_position(x.id_list, c.id);
The CTE is used so that the list of values only needs to be specified once. If that is not important this can also be written as:
select c.*
from comments c
where id in (42,48,43)
order by array_position(array[42,48,43], c.id);
I think this way is better :
SELECT * FROM "comments" WHERE ("comments"."id" IN (1,3,2,4))
ORDER BY id=1 DESC, id=3 DESC, id=2 DESC, id=4 DESC
Another way to do it in Postgres would be to use the idx function.
SELECT *
FROM comments
ORDER BY idx(array[1,3,2,4], comments.id)
Don't forget to create the idx function first, as described here: http://wiki.postgresql.org/wiki/Array_Index
In Postgresql:
select *
from comments
where id in (1,3,2,4)
order by position(id::text in '1,3,2,4')
On researching this some more I found this solution:
SELECT * FROM "comments" WHERE ("comments"."id" IN (1,3,2,4))
ORDER BY CASE "comments"."id"
WHEN 1 THEN 1
WHEN 3 THEN 2
WHEN 2 THEN 3
WHEN 4 THEN 4
END
However this seems rather verbose and might have performance issues with large datasets.
Can anyone comment on these issues?
To do this, I think you should probably have an additional "ORDER" table which defines the mapping of IDs to order (effectively doing what your response to your own question said), which you can then use as an additional column on your select which you can then sort on.
In that way, you explicitly describe the ordering you desire in the database, where it should be.
sans SEQUENCE, works only on 8.4:
select * from comments c
join
(
select id, row_number() over() as id_sorter
from (select unnest(ARRAY[1,3,2,4]) as id) as y
) x on x.id = c.id
order by x.id_sorter
SELECT * FROM "comments" JOIN (
SELECT 1 as "id",1 as "order" UNION ALL
SELECT 3,2 UNION ALL SELECT 2,3 UNION ALL SELECT 4,4
) j ON "comments"."id" = j."id" ORDER BY j.ORDER
or if you prefer evil over good:
SELECT * FROM "comments" WHERE ("comments"."id" IN (1,3,2,4))
ORDER BY POSITION(','+"comments"."id"+',' IN ',1,3,2,4,')
And here's another solution that works and uses a constant table (http://www.postgresql.org/docs/8.3/interactive/sql-values.html):
SELECT * FROM comments AS c,
(VALUES (1,1),(3,2),(2,3),(4,4) ) AS t (ord_id,ord)
WHERE (c.id IN (1,3,2,4)) AND (c.id = t.ord_id)
ORDER BY ord
But again I'm not sure that this is performant.
I've got a bunch of answers now. Can I get some voting and comments so I know which is the winner!
Thanks All :-)
create sequence serial start 1;
select * from comments c
join (select unnest(ARRAY[1,3,2,4]) as id, nextval('serial') as id_sorter) x
on x.id = c.id
order by x.id_sorter;
drop sequence serial;
[EDIT]
unnest is not yet built-in in 8.3, but you can create one yourself(the beauty of any*):
create function unnest(anyarray) returns setof anyelement
language sql as
$$
select $1[i] from generate_series(array_lower($1,1),array_upper($1,1)) i;
$$;
that function can work in any type:
select unnest(array['John','Paul','George','Ringo']) as beatle
select unnest(array[1,3,2,4]) as id
Slight improvement over the version that uses a sequence I think:
CREATE OR REPLACE FUNCTION in_sort(anyarray, out id anyelement, out ordinal int)
LANGUAGE SQL AS
$$
SELECT $1[i], i FROM generate_series(array_lower($1,1),array_upper($1,1)) i;
$$;
SELECT
*
FROM
comments c
INNER JOIN (SELECT * FROM in_sort(ARRAY[1,3,2,4])) AS in_sort
USING (id)
ORDER BY in_sort.ordinal;
select * from comments where comments.id in
(select unnest(ids) from bbs where id=19795)
order by array_position((select ids from bbs where id=19795),comments.id)
here, [bbs] is the main table that has a field called ids,
and, ids is the array that store the comments.id .
passed in postgresql 9.6
Lets get a visual impression about what was already said. For example you have a table with some tasks:
SELECT a.id,a.status,a.description FROM minicloud_tasks as a ORDER BY random();
id | status | description
----+------------+------------------
4 | processing | work on postgres
6 | deleted | need some rest
3 | pending | garden party
5 | completed | work on html
And you want to order the list of tasks by its status.
The status is a list of string values:
(processing, pending, completed, deleted)
The trick is to give each status value an interger and order the list numerical:
SELECT a.id,a.status,a.description FROM minicloud_tasks AS a
JOIN (
VALUES ('processing', 1), ('pending', 2), ('completed', 3), ('deleted', 4)
) AS b (status, id) ON (a.status = b.status)
ORDER BY b.id ASC;
Which leads to:
id | status | description
----+------------+------------------
4 | processing | work on postgres
3 | pending | garden party
5 | completed | work on html
6 | deleted | need some rest
Credit #user80168
I agree with all other posters that say "don't do that" or "SQL isn't good at that". If you want to sort by some facet of comments then add another integer column to one of your tables to hold your sort criteria and sort by that value. eg "ORDER BY comments.sort DESC " If you want to sort these in a different order every time then... SQL won't be for you in this case.

Getting Conditional Count in Join with Laravel Query Builder

I am trying to achieve the following with Laravel Query builder.
I have a table called deals . Below is the basic schema
id
deal_id
merchant_id
status
deal_text
timestamps
I also have another table called merchants whose schema is
id
merchant_id
merchant_name
about
timestamps
Currently I am getting deals using the following query
$deals = DB::table('deals')
-> join ('merchants', 'deals.merchant_id', '=', 'merchants.merchant_id')
-> where ('merchant_url_text', $merchant_url_text)
-> get();
Since only 1 merchant is associated with a deal, I am getting deals and related merchant info with the query.
Now I have a 3rd table called tbl_deal_votes. Its schema looks like
id
deal_id
vote (1 if voted up, 0 if voted down)
timestamps
What I want to do is join this 3rd table (on deal_id) to my existing query and be able to also get the upvotes and down votes each deal has received.
To do this in a single query you'll probably need to use SQL subqueries, which doesn't seem to have good fluent query support in Laravel 4/5. Since you're not using Eloquent objects, the raw SQL is probably easiest to read. (Note the below example ignores your deals.deal_id and merchants.merchant_id columns, which can likely be dropped. Instead it just uses your deals.id and merchants.id fields by convention.)
$deals = DB::select(
DB::raw('
SELECT
deals.id AS deal_id,
deals.status,
deals.deal_text,
merchants.id AS merchant_id,
merchants.merchant_name,
merchants.about,
COALESCE(tbl_upvotes.upvotes_count, 0) AS upvotes_count,
COALESCE(tbl_downvotes.downvotes_count, 0) AS downvotes_count
FROM
deals
JOIN merchants ON (merchants.id = deals.merchant_id)
LEFT JOIN (
SELECT deal_id, count(*) AS upvotes_count
FROM tbl_deal_votes
WHERE vote = 1 && deal_id
GROUP BY deal_id
) tbl_upvotes ON (tbl_upvotes.deal_id = deals.id)
LEFT JOIN (
SELECT deal_id, count(*) AS downvotes_count
FROM tbl_deal_votes
WHERE vote = 0
GROUP BY deal_id
) tbl_downvotes ON (tbl_downvotes.deal_id = deals.id)
')
);
If you'd prefer to use fluent, this should work:
$upvotes_subquery = '
SELECT deal_id, count(*) AS upvotes_count
FROM tbl_deal_votes
WHERE vote = 1
GROUP BY deal_id';
$downvotes_subquery = '
SELECT deal_id, count(*) AS downvotes_count
FROM tbl_deal_votes
WHERE vote = 0
GROUP BY deal_id';
$deals = DB::table('deals')
->select([
DB::raw('deals.id AS deal_id'),
'deals.status',
'deals.deal_text',
DB::raw('merchants.id AS merchant_id'),
'merchants.merchant_name',
'merchants.about',
DB::raw('COALESCE(tbl_upvotes.upvotes_count, 0) AS upvotes_count'),
DB::raw('COALESCE(tbl_downvotes.downvotes_count, 0) AS downvotes_count')
])
->join('merchants', 'merchants.id', '=', 'deals.merchant_id')
->leftJoin(DB::raw('(' . $upvotes_subquery . ') tbl_upvotes'), function($join) {
$join->on('tbl_upvotes.deal_id', '=', 'deals.id');
})
->leftJoin(DB::raw('(' . $downvotes_subquery . ') tbl_downvotes'), function($join) {
$join->on('tbl_downvotes.deal_id', '=', 'deals.id');
})
->get();
A few notes about the fluent query:
Used the DB::raw() method to rename a few selected columns.
Otherwise, there would have been a conflict between deals.id
and merchants.id in the results.
Used COALESCE to default null votes to 0.
Split the subqueries into separate PHP strings to improve readability.
Used left joins for the subqueries so deals with no upvotes/downvotes still show up.

row_number() with an unspecified window `row_number() OVER ()`

I'm building a paginated scoreboard using postgres 9.1.
There are several criteria that users can sort the scoreboard by, and they can sort by ascending or descending. There is a feature to let users find "their row" across the multiple pages in the scoreboard, and it must reflect the users selected sorting criteria.
I am using postgres's row_number function to find their offset into the result set to return the page where the user can find their row.
Everything I'm reading about row_number seems to imply that bad things happen to people who don't specify an ordering within the row_number window. E.g. row_number() OVER (ORDER BY score_1) is OK, row_number() OVER () is bad.
My case is different from the examples I've read about in that I am explicitly ordering my query, I realize the DB engine may not return the results in any particular order if I don't.
But I'd like to just specify ordering at the level of the entire query and get the row_number of the results, without having to duplicate my ordering specification with the row_number's window.
So this is what I'd like to do, and it "seems to work".
SELECT
id,
row_number() OVER () AS player_position,
score_1,
score_2,
score_3,
FROM my_table
ORDER BY (score_1 ASC | score_1 DESC | score_2 ASC | score_2 DESC | score_3 ASC | score_3 DESC)
Where player_position reflects the players rank in whatever criteria I'm ordering by.
But the documentation I've read tells me I should do it like this:
SELECT
id,
row_number() OVER (ORDER BY score_1 ASC) AS player_position,
score_1,
score_2,
score_3,
FROM my_table
ORDER BY score_1 ASC
or
SELECT
id,
row_number() OVER (ORDER BY score_2 DESC) AS player_position,
score_1,
score_2,
score_3,
FROM my_table
ORDER BY score_2 DESC
The real reason that I'd like to avoid redundantly specifying the ordering for the row_number window is to keep my query amenable with the ActiveRecord ORM. I want to have my base scoreboard query, and chain on the ordering.
e.g. Ultimately, I want to be able to do this:
Players.scoreboard.order('score_1 ASC')
Players.scoreboard.order('score_2 DESC')
etc...
Is it possible?
Try moving your main query into a subquery with an ORDER BY and apply the ROW_NUMBER() to the outermost query.
SELECT y.*,
ROW_NUMBER() OVER () as player_position
FROM
(SELECT
id,
score_1,
score_2,
score_3,
FROM my_table
ORDER BY <whatever>) as y

PGError: ERROR: aggregates not allowed in WHERE clause on a AR query of an object and its has_many objects

Running the following query on a has_many association. Recommendations has_many Approvals.
I am running, rails 3 and PostgreSQL:
Recommendation.joins(:approvals).where('approvals.count = ?
AND recommendations.user_id = ?', 1, current_user.id)
This is returning the following error: https://gist.github.com/1541569
The error message tells you:
aggregates not allowed in WHERE clause
count() is an aggregate function. Use the HAVING clause for that.
The query could look like this:
SELECT r.*
FROM recommendations r
JOIN approvals a ON a.recommendation_id = r.id
WHERE r.user_id = $current_user_id
GROUP BY r.id
HAVING count(a.recommendation_id) = 1
With PostgreSQL 9.1 or later it is enough to GROUP BY the primary key of a table (presuming recommendations.id is the PK). In Postgres versions before 9.1 you had to include all columns of the SELECT list that are not aggregated in the GROUP BY list. With recommendations.* in the SELECT list, that would be every single column of the table.
I quote the release notes of PostgreSQL 9.1:
Allow non-GROUP BY columns in the query target list when the primary
key is specified in the GROUP BY clause (Peter Eisentraut)
Simpler with a sub-select
Either way, this is simpler and faster, doing the same:
SELECT *
FROM recommendations r
WHERE user_id = $current_user_id
AND (SELECT count(*)
FROM approvals
WHERE recommendation_id = r.id) = 1;
Avoid multiplying rows with a JOIN a priori, then you don't have to aggregate them back.
Looks like you have a column named count and PostgreSQL is interpreting that column name as the count aggregate function. Your SQL ends up like this:
SELECT "recommendations".*
FROM "recommendations"
INNER JOIN "approvals" ON "approvals"."recommendation_id" = "recommendations"."id"
WHERE (approvals.count = 1 AND recommendations.user_id = 1)
The error message specifically points at the approvals.count:
LINE 1: ...ecommendation_id" = "recommendations"."id" WHERE (approvals....
^
I can't reproduce that error in my PostgreSQL (9.0) but maybe you're using a different version. Try double quoting that column name in your where:
Recommendation.joins(:approvals).where('approvals."count" = ? AND recommendations.user_id = ?', 1, current_user.id)
If that sorts things out then I'd recommend renaming your approvals.count column to something else so that you don't have to worry about it anymore.

Postgres Rank As Column

I have the following query:
SELECT name, rank() OVER (PARTITION BY user_id ORDER BY love_count DESC) AS position FROM items
And I'd now like to do a where clause on the rank() function:
SELECT name, rank() OVER (PARTITION BY user_id ORDER BY love_count DESC) AS position FROM items WHERE position = 1
That is, I want to query the most loved item for each user. However, this results in:
PGError: ERROR: column "position" does not exist
Also, I'm using Rails AREL to do this and would like to enable chaining. This is the Ruby code that creates the query:
Item.select("name, rank() OVER (PARTITION BY user_id ORDER BY love_count DESC) AS position").where("position = 1")
Any ideas?
You need to "wrap" it into a derived table:
SELECT *
FROM (
SELECT name,
rank() OVER (PARTITION BY user_id ORDER BY love_count DESC) AS position
FROM items
) t
WHERE position = 1
My first thought was, "Use a common table expression", like this untested one.
WITH badly_named_cte AS (
SELECT name,
rank() OVER (PARTITION BY user_id
ORDER BY love_count DESC) AS position
FROM items
)
SELECT * FROM badly_named_cte WHERE position = 1;
The problem you're seeing has to do with the logical order of evaluation required by SQL standards. SQL has to act as if column aliases (arguments to the AS operator) don't exist until after the WHERE clause is evaluated.

Resources