Rails Active Record Count Across Calculated Field

Using Arel / Rails calculations, I'm trying to execute the following:
SELECT to_char(timestamp, 'YYYY-MM-DD') AS segment, COUNT(*) AS counter
FROM pages
GROUP BY segment
ORDER BY segment
I can run something like:
Page.order(FIELD).count(group: FIELD)
{ a: 1, b: 4, c: 1 }
However, I can't get this working across calculated fields. Any thoughts?

Came up with this:
Page.count(:all, from: "(SELECT to_char(#{SEGMENT}, 'YYYY-MM-DD') AS segment FROM pages) AS pages", group: "segment", order: "segment")
> SELECT COUNT(*) AS count_all, segment AS segment FROM (SELECT to_char(created_at, 'YYYY-MM-DD') AS segment FROM pages) AS pages GROUP BY segment ORDER BY segment
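On more recent Rails versions the hash options to count are gone, and the same result comes from chaining group directly, e.g. Page.group("to_char(created_at, 'YYYY-MM-DD')").count (assuming a Page model with a created_at column). The day-bucketed grouping itself can be sketched in plain Ruby with toy timestamps:

```ruby
# Plain-Ruby sketch of the to_char(..., 'YYYY-MM-DD') grouping.
# In Rails the equivalent would be roughly:
#   Page.group("to_char(created_at, 'YYYY-MM-DD')").count
# (Page / created_at are assumptions, not from the question.)
timestamps = [
  Time.utc(2024, 5, 1, 9, 30),
  Time.utc(2024, 5, 1, 17, 0),
  Time.utc(2024, 5, 2, 8, 15),
]

# Group by the formatted day segment, sort by segment, count each bucket
counts = timestamps
  .group_by { |t| t.strftime("%Y-%m-%d") }
  .sort.to_h
  .transform_values(&:size)
# => {"2024-05-01"=>2, "2024-05-02"=>1}
```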

Related

rewrite sql statement with max and groupby in ruby

I have this my sql view:
SELECT
`reports`.`date` AS `date`,
`reports`.`book_title` AS `book_title`,
max(
`reports`.`royalty_type`
) AS `royalty_type`,
max(
`reports`.`avg_list_price`
) AS `avg_list_price`
FROM
`reports`
GROUP BY
`reports`.`date`,
`reports`.`book_title`,
`reports`.`marketplace`
As far as I understand it, this groups results by date, then by book_title, and then by marketplace, and then selects the max royalty_type and avg_list_price within each of these small subgroups.
How do I rewrite this in Rails ActiveRecord?
I don't know how to select the max within these small groups in ActiveRecord.
Try this one:
Report.group(:date, :book_title, :marketplace).select('date, book_title, MAX(royalty_type) AS royalty_type, MAX(avg_list_price) AS avg_list_price')
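What that GROUP BY / MAX combination computes can be sketched in plain Ruby (illustrative rows, not from the question):

```ruby
# Each report row as a hash; illustrative values only
reports = [
  { date: "2024-01-01", book_title: "A", marketplace: "US", royalty_type: 35, avg_list_price: 9.99 },
  { date: "2024-01-01", book_title: "A", marketplace: "US", royalty_type: 70, avg_list_price: 12.99 },
  { date: "2024-01-01", book_title: "B", marketplace: "US", royalty_type: 35, avg_list_price: 4.99 },
]

# GROUP BY date, book_title, marketplace; MAX within each subgroup
result = reports
  .group_by { |r| r.values_at(:date, :book_title, :marketplace) }
  .map do |(date, title, _marketplace), rows|
    { date: date,
      book_title: title,
      royalty_type: rows.map { |r| r[:royalty_type] }.max,
      avg_list_price: rows.map { |r| r[:avg_list_price] }.max }
  end
# => two rows: book A with royalty_type 70, book B with royalty_type 35
```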

Better method to find the second largest element in ruby on rails active record query

I am using this query to find the 2nd largest element, querying on the value column.
Booking.where("value < ?", Booking.maximum(:value)).last
Is there any better query than this, or an alternative?
PS - value is not unique. There could be two elements with the same value.
This should work.
Booking.select("DISTINCT value").order('value DESC').offset(1).limit(1)
Which will generate this query:
SELECT DISTINCT value FROM "bookings" ORDER BY value DESC LIMIT 1 OFFSET 1
You can use offset and last:
Booking.order(:value).offset(1).last
Which will produce the following SQL statement:
SELECT `bookings`.* FROM `bookings`
ORDER BY `bookings`.`value` DESC
LIMIT 1 OFFSET 1
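The DISTINCT / ORDER BY DESC / OFFSET 1 logic of the first answer matters precisely because values repeat; here it is expressed in plain Ruby with toy data:

```ruby
# Toy stand-in for the value column; note the duplicate maximum
values = [100, 250, 250, 175, 90]

# Without DISTINCT, skipping one row just lands on the duplicate max
naive = values.sort.reverse[1]            # => 250

# DISTINCT value ORDER BY value DESC OFFSET 1 LIMIT 1
second_largest = values.uniq.sort.reverse[1]  # => 175
```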

Is the way to make this union of Postgres queries more efficient to use a materialized view?

So right now, I have the following Postgres query in a Rails 5.0 application. The first query sums viewership and groups by domestic and international stations (radio_category) as well as FM and AM (radio_type). The second query totals viewership across all domestic and international stations and groups only by FM/AM.
To make it more efficient, is it better to write a raw SELECT statement that pulls only the numbers that will eventually need to be summed into a materialized view, and then write a SUM()/GROUP BY statement that pulls from the view?
Or is there some clever use of SUM() that lets me select the raw numbers only once?
Let's say I have at least 1 million rows of data.
SELECT numbers.snapshot_id,
count(*) AS radio_count,
sum(numbers.view_count) AS view_count,
radios.category AS radio_category,
radios.type AS radio_type,
CASE
WHEN radios.type = 'AM' THEN 0
WHEN radios.type = 'FM' THEN 1
END as radio_enum_type
FROM (numbers
JOIN radios ON ((radios.id = numbers.radio_id)))
GROUP BY numbers.snapshot_id, radios.category, radios.type
UNION
SELECT numbers.snapshot_id,
count(*) AS radio_count,
sum(numbers.view_count) AS view_count,
3 AS radio_category,
radios.type AS radio_type,
CASE
WHEN radios.type = 'AM' THEN 0
WHEN radios.type = 'FM' THEN 1
END as radio_enum_type
FROM (numbers
JOIN radios ON ((radios.id = numbers.radio_id)))
GROUP BY numbers.snapshot_id, 3::integer, radios.type
You can't add the extra rows without a UNION. So I'm not sure this is better, but you could precalculate the aggregation once and then build the UNION from it. Postgres may optimize your original query to much the same plan, however:
WITH aggregated_numbers AS (
SELECT numbers.snapshot_id,
count(*) AS radio_count,
sum(numbers.view_count) AS view_count,
radios.category AS radio_category,
radios.type AS radio_type,
CASE
WHEN radios.type = 'AM' THEN 0
WHEN radios.type = 'FM' THEN 1
END as radio_enum_type
FROM (numbers
JOIN radios ON ((radios.id = numbers.radio_id)))
GROUP BY numbers.snapshot_id, radios.category, radios.type)
SELECT * FROM aggregated_numbers
UNION
SELECT
snapshot_id,
sum(radio_count) AS radio_count,
sum(view_count) AS view_count,
3 AS radio_category,
radio_type,
radio_enum_type
FROM aggregated_numbers
GROUP BY snapshot_id, radio_type, radio_enum_type
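What the UNION buys here, reduced to plain Ruby: the rollup rows are just the per-category aggregates summed again per (snapshot_id, radio_type). Illustrative data; radio_category 3 meaning "all categories" follows the question's convention:

```ruby
# Rows as the aggregating CTE would produce them (illustrative values)
aggregated = [
  { snapshot_id: 1, radio_count: 2, view_count: 50, radio_category: "domestic",      radio_type: "FM" },
  { snapshot_id: 1, radio_count: 1, view_count: 20, radio_category: "international", radio_type: "FM" },
]

# Rollup across categories: re-aggregate the already-aggregated rows
rollup = aggregated
  .group_by { |r| r.values_at(:snapshot_id, :radio_type) }
  .map do |(snapshot_id, type), rows|
    { snapshot_id: snapshot_id,
      radio_count: rows.sum { |r| r[:radio_count] },
      view_count:  rows.sum { |r| r[:view_count] },
      radio_category: 3, # 3 = "all categories", per the question's convention
      radio_type: type }
  end

all_rows = aggregated + rollup  # what the UNION returns
```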

Solving a PG::GroupingError

The following code gets all the residences that have all the amenities listed in id_list. It works without a problem with SQLite but raises an error with PostgreSQL:
id_list = [48, 49]
Residence.joins(:listed_amenities).
where(listed_amenities: {amenity_id: id_list}).
references(:listed_amenities).
group(:residence_id).
having("count(*) = ?", id_list.size)
The PostgreSQL version raises PG::GroupingError.
What do I have to change to make it work with PostgreSQL?
A few things:
references should only be used with includes; it tells ActiveRecord to perform a join, so it's redundant when using an explicit joins call.
You need to fully qualify the argument to group, i.e. group('residences.id').
For example,
id_list = [48, 49]
Residence.joins(:listed_amenities).
where(listed_amenities: { amenity_id: id_list }).
group('residences.id').
having('COUNT(*) = ?', id_list.size)
The query the Ruby code expands to selects all fields from the residences table:
SELECT "residences".*
FROM "residences"
INNER JOIN "listed_amenities"
ON "listed_amenities"."residence_id" = "residences"."id"
WHERE "listed_amenities"."amenity_id" IN (48,49)
GROUP BY "residence_id"
HAVING count(*) = 2
ORDER BY "residences"."id" ASC
LIMIT 1;
From the Postgres manual: "When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or if the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column."
You'll need to either group by all fields that aggregate functions aren't applied to, or do this differently. From the query, it looks like you only need to scan the listed_amenities table to get the residence ID you're looking for:
SELECT "residence_id"
FROM "listed_amenities"
WHERE "listed_amenities"."amenity_id" IN (48,49)
GROUP BY "residence_id"
HAVING count(*) = 2
ORDER BY "residence_id" ASC
LIMIT 1
And then fetch your residence data with that ID. Or, in one query:
SELECT "residences".*
FROM "residences"
WHERE "id" IN (SELECT "residence_id"
FROM "listed_amenities"
WHERE "listed_amenities"."amenity_id" IN (48,49)
GROUP BY "residence_id"
HAVING count(*) = 2
ORDER BY "residence_id" ASC
LIMIT 1
);
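The same "has all amenities" check can be written back in ActiveRecord as a subquery, and its grouping/having logic sketched in plain Ruby. The ListedAmenity model name below is an assumption, not from the question:

```ruby
# ActiveRecord form of the one-query version (ListedAmenity model
# name assumed):
#   Residence.where(
#     id: ListedAmenity.where(amenity_id: id_list)
#                      .group(:residence_id)
#                      .having("COUNT(*) = ?", id_list.size)
#                      .select(:residence_id)
#   )
#
# Plain-Ruby sketch of the grouping/having logic (assumes the join
# table has no duplicate (residence_id, amenity_id) pairs):
id_list = [48, 49]
listed_amenities = [
  { residence_id: 1, amenity_id: 48 },
  { residence_id: 1, amenity_id: 49 },
  { residence_id: 2, amenity_id: 48 },
]

matching_ids = listed_amenities
  .select { |row| id_list.include?(row[:amenity_id]) }   # WHERE amenity_id IN (...)
  .group_by { |row| row[:residence_id] }                 # GROUP BY residence_id
  .select { |_id, rows| rows.size == id_list.size }      # HAVING count(*) = 2
  .keys
# => [1]
```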

Sqlite where clause with multiple values in select statement

I am trying an SQLite SELECT query as below:
SELECT IndicatorText
FROM Table
where IndicatorID in('13','25','64','52','13','25','328')
AND RubricID in('1','1','1','1','1','1','6')
This gives an output, but duplicate values are not displayed. I want to display all the values of IndicatorText even when they are duplicates.
Please help me with this query.
The two IN conditions are evaluated individually.
To check both values at once, you could concatenate them so that you have a single string to compare:
SELECT IndicatorText
FROM MyTable
WHERE IndicatorID || ',' || RubricID IN (
'13,1', '25,1', '64,1', '52,1', '13,1', '25,1', '328,6')
However, doing this operation on the column values prevents the query optimizer from using indexes, so this query will be slow if the table is big.
To allow optimizations, build a table of the desired value pairs (here, an inline subquery) and join it with the original table:
SELECT IndicatorText
FROM MyTable
NATURAL JOIN (SELECT 13 AS IndicatorID, 1 AS RubricID UNION ALL
SELECT 25, 1 UNION ALL
SELECT 64, 1 UNION ALL
SELECT 52, 1 UNION ALL
SELECT 13, 1 UNION ALL
SELECT 25, 1 UNION ALL
SELECT 328, 6)
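Both answers amount to matching (IndicatorID, RubricID) tuples rather than two independent IN lists; the join keeps one output row per wanted pair, so duplicates in the wanted list survive. A plain-Ruby sketch with toy rows:

```ruby
# Desired (IndicatorID, RubricID) pairs, duplicates intact
wanted_pairs = [[13, 1], [25, 1], [64, 1], [52, 1], [13, 1], [25, 1], [328, 6]]

# Toy table rows (illustrative, not from the question)
rows = [
  { indicator_id: 13,  rubric_id: 1, text: "thirteen" },
  { indicator_id: 25,  rubric_id: 2, text: "wrong rubric" },
  { indicator_id: 328, rubric_id: 6, text: "three-two-eight" },
]

# Emit a row once per matching wanted pair, so duplicates in the
# wanted list yield duplicate output rows (unlike two separate INs)
texts = wanted_pairs.flat_map do |ind, rub|
  rows.select { |r| r[:indicator_id] == ind && r[:rubric_id] == rub }
      .map { |r| r[:text] }
end
# => ["thirteen", "thirteen", "three-two-eight"]
```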