Postgres Rank As Column - ruby-on-rails

I have the following query:
SELECT name, rank() OVER (PARTITION BY user_id ORDER BY love_count DESC) AS position FROM items
And I'd now like to do a where clause on the rank() function:
SELECT name, rank() OVER (PARTITION BY user_id ORDER BY love_count DESC) AS position FROM items WHERE position = 1
That is, I want to query the most loved item for each user. However, this results in:
PGError: ERROR: column "position" does not exist
Also, I'm using Rails AREL to do this and would like to enable chaining. This is the Ruby code that creates the query:
Item.select("name, rank() OVER (PARTITION BY user_id ORDER BY love_count DESC) AS position").where("position = 1")
Any ideas?

You need to "wrap" it into a derived table:
SELECT *
FROM (
SELECT name,
rank() OVER (PARTITION BY user_id ORDER BY love_count DESC) AS position
FROM items
) t
WHERE position = 1

My first thought was, "Use a common table expression", like this untested one.
WITH badly_named_cte AS (
SELECT name,
rank() OVER (PARTITION BY user_id
ORDER BY love_count DESC) AS position
FROM items
)
SELECT * FROM badly_named_cte WHERE position = 1;
The problem you're seeing has to do with the logical order of evaluation required by SQL standards. SQL has to act as if column aliases (arguments to the AS operator) don't exist until after the WHERE clause is evaluated.

Related

Converting a SQL statement to rails command

I have a situation where I need to fetch only few records from a particular active record query response.
#annotations = Annotations.select('*', ROW_NUMBER() OVER (PARTITION BY column_a) ORDER BY column_b)
Above is the query for which the #annotations is the Active Record Response on which I would like to apply the below logic. Is there a better way to write the below logic in rails way?
with some_table as
(
select *, row_number() over (partition by column_a order by column_b) rn
from the_table
)
select * from some_table
where (column_a = 'ABC' and rn <= 10) or (column_b <> 'AAA')
ActiveRecord does not provide CTEs in its high level API; however with a little Arel we can make this a sub query in the FROM clause
annotations_table = Annotation.arel_table
sub_query = annotations_table.project(
Arel.star,
Arel.sql('row_number() over (partition by column_a order by column_b)').as(:rn)
)
query = Arel::Nodes::As.new(sub_query, Arel.sql(annotations_table.name))
Annotation.from(query).where(
annotations_table[:column_a].eq('ABC').and(
annotations_table[:rn].lt(10)
).or(annotations_table[:column_b].not_eq('AAA'))
)
The result will be a collection of Annotation objects using your CTE and the filters you described.
SQL:
select annotations.*
from (
select *, row_number() over (partition by column_a order by column_b) AS rn
from annotations
) AS annotations
where (annotations.column_a = 'ABC' and annotations.rn <= 10) or (annotations.column_b <> 'AAA')
Notes:
With a little extra work we could make this a CTE but it does not seem needed in this case
We could use a bunch of hacks to transition this row_number() over (partition by column_a order by column_b) to Arel as well but it did not seem pertinent to the question.

Snowflake: Joining a Table with Effective Dates and older records are showing NULL

Summary:
In Snowflake I have a table which records the maximum number of an item which changes every so often. I want to be able to join the max number of the item for that date (effective_date). This is the most basic "example" as in my table has items "expire" when they are removed.
CREATE OR REPLACE TABLE ITEM
(
Item VARCHAR(10),
Quantity Number(5,0),
EFFECTIVE_DATE DATE
)
;
CREATE OR REPLACE TABLE REPORT
(
INVOICE_DATE DATE,
ITEM VARCHAR(10)
)
;
INSERT INTO REPORT
VALUES
('2021-02-01', '100'),
('2021-09-10', '100')
;
INSERT INTO ITEM
VALUES
('100', '10', '2021-01-01'),
('101', '15', '2021-01-01'),
('100', '5', '2021-09-01')
;
SELECT * FROM REPORT t1
LEFT JOIN
(
SELECT * FROM ITEM
QUALIFY ROW_NUMBER() OVER (PARTITION BY ITEM ORDER BY EFFECTIVE_DATE desc) = 1
) t2 on t1.ITEM = t2.ITEM AND t1.INVOICE_DATE <= t2.EFFECTIVE_DATE
;
Returns
INVOICE_DATE,ITEM,ITEM,QUANTITY,EFFECTIVE_DATE
2021-02-01,100,100,5,2021-09-01
2021-09-10,100,NULL,NULL,NULL
How do I fix this so I no longer get NULL entries on my join.
Thank you for reading this!
I am hoping to get a result like this
INVOICE_DATE,ITEM,ITEM,QUANTITY,EFFECTIVE_DATE
2021-02-01,100,100,10,2021-01-01
2021-09-10,100,100,5,2021-09-01
The issue is with your data and your expectations. Your query is this:
SELECT * FROM REPORT t1
LEFT JOIN
(
SELECT * FROM ITEM
QUALIFY ROW_NUMBER() OVER (PARTITION BY ITEM ORDER BY EFFECTIVE_DATE desc) = 1
) t2 on t1.ITEM = t2.ITEM AND t1.INVOICE_DATE <= t2.EFFECTIVE_DATE
;
which requires that the INVOICE_DATE be less than or equal to the EFFECTIVE DATE of the ITEM. This isn't the case, though. 2021-09-10 is greater than 2021-09-01 so you don't get a join hit, which is why you get NULLs. It's also why your other record is returning the wrong information from your expectations.

Can anyone help me in this dbms query?

The Question-
Write a query to display the name(s) of the students who have secured the maximum marks in each subject, ordered by subject name in ascending order. If there are multiple toppers, display their names in alphabetical order.
Display it as subject_name and student_name.
O/P: first column - subject_name
second column - student_name
My answer-
select su.subject_name,st.student_name from subject su,student st,mark m
where m.student_id = st.student_id and m.subject_id = su.subject_id
and m.value = (select max(value) from mark group by subject_id);
Error-
and m.value = (select max(value) from mark group by subject_id)
*
ERROR at line 3:
ORA-01427: single-row subquery returns more than one row
What i know is I will have to make another subquery something like
and m.value =(select..... (select max(value) from mark group by subject_id));
But I am not getting it.
Nevermind got the answer, we will have to use IN operator and club value, subject_id as they are a set.
SELECT su.subject_name, st.student_name
FROM student s
JOIN mark m ON st.student_id = m.student_id
JOIN subject su ON m.subject_id = su.subject_id
WHERE (m.value, su.subject_id) IN (
SELECT max(value), subject_id FROM mark GROUP BY subject_id
)
ORDER BY su.subject_name, st.student_name;

Solving a PG::GroupingError: ERROR

The following code gets all the residences which have all the amenities which are listed in id_list. It works with out a problem with SQLite but raises an error with PostgreSQL:
id_list = [48, 49]
Residence.joins(:listed_amenities).
where(listed_amenities: {amenity_id: id_list}).
references(:listed_amenities).
group(:residence_id).
having("count(*) = ?", id_list.size)
The error on the PostgreSQL version:
What do I have to change to make it work with PostgreSQL?
A few things:
references should only be used with includes; it tells ActiveRecord to perform a join, so it's redundant when using an explicit joins.
You need to fully qualify the argument to group, i.e. group('residences.id').
For example,
id_list = [48, 49]
Residence.joins(:listed_amenities).
where(listed_amenities: { amenity_id: id_list }).
group('residences.id').
having('COUNT(*) = ?", id_list.size)
The query the Ruby (?) code is expanded to is selecting all fields from the residences table:
SELECT "residences".*
FROM "residences"
INNER JOIN "listed_amenities"
ON "listed_amentities"."residence_id" = "residences"."id"
WHERE "listed_amenities"."amenity_id" IN (48,49)
GROUP BY "residence_id"
HAVING count(*) = 2
ORDER BY "residences"."id" ASC
LIMIT 1;
From the Postgres manual, When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or if the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column.
You'll need to either group by all fields that aggregate functions aren't applied to, or do this differently. From the query, it looks like you only need to scan the amentities table to get the residence ID you're looking for:
SELECT "residence_id"
FROM "listed_amenities"
WHERE "listed_amenities"."amenity_id" IN (48,49)
GROUP BY "residence_id"
HAVING count(*) = 2
ORDER BY "residences"."id" ASC
LIMIT 1
And then fetch your residence data with that ID. Or, in one query:
SELECT "residences".*
FROM "residences"
WHERE "id" IN (SELECT "residence_id"
FROM "listed_amenities"
WHERE "listed_amenities"."amenity_id" IN (48,49)
GROUP BY "residence_id"
HAVING count(*) = 2
ORDER BY "residences"."id" ASC
LIMIT 1
);

row_number() with an unspecified window `row_number() OVER ()`

I'm building a paginated scoreboard using postgres 9.1.
There are several criteria that users can sort the scoreboard by, and they can sort by ascending or descending. There is a feature to let users find "their row" across the multiple pages in the scoreboard, and it must reflect the users selected sorting criteria.
I am using postgres's row_number function to find their offset into the result set to return the page where the user can find their row.
Everything I'm reading about row_number seems to imply that bad things happen to people who don't specify an ordering within the row_number window. E.g. row_number() OVER (ORDER BY score_1) is OK, row_number() OVER () is bad.
My case is different from the examples I've read about in that I am explicitly ordering my query, I realize the DB engine may not return the results in any particular order if I don't.
But I'd like to just specify ordering at the level of the entire query and get the row_number of the results, without having to duplicate my ordering specification with the row_number's window.
So this is what I'd like to do, and it "seems to work".
SELECT
id,
row_number() OVER () AS player_position,
score_1,
score_2,
score_3,
FROM my_table
ORDER BY (score_1 ASC | score_1 DESC | score_2 ASC | score_2 DESC | score_3 ASC | score_3 DESC)
Where player_position reflects the players rank in whatever criteria I'm ordering by.
But the documentation I've read tells me I should do it like this:
SELECT
id,
row_number() OVER (ORDER BY score_1 ASC) AS player_position,
score_1,
score_2,
score_3,
FROM my_table
ORDER BY score_1 ASC
or
SELECT
id,
row_number() OVER (ORDER BY score_2 DESC) AS player_position,
score_1,
score_2,
score_3,
FROM my_table
ORDER BY score_2 DESC
The real reason that I'd like to avoid redundantly specifying the ordering for the row_number window is to keep my query amenable with the ActiveRecord ORM. I want to have my base scoreboard query, and chain on the ordering.
e.g. Ultimately, I want to be able to do this:
Players.scoreboard.order('score_1 ASC')
Players.scoreboard.order('score_2 DESC')
etc...
Is it possible?
Try moving your main query into a subquery with an ORDER BY and apply the ROW_NUMBER() to the outermost query.
SELECT y.*,
ROW_NUMBER() OVER () as player_position
FROM
(SELECT
id,
score_1,
score_2,
score_3,
FROM my_table
ORDER BY <whatever>) as y

Resources