ERROR: SELECT DISTINCT ON expressions must match initial ORDER BY expressions - ruby-on-rails

I need to fetch distinct records in a specific order:
User.joins('INNER JOIN report_posts ON posts.id = report_posts.post_id').select('DISTINCT ON (report_posts.post_id) posts.id as report_posts.id as reported_id, report_posts.reported_at').order('report_posts.reported_at desc')
I know PostgreSQL does not allow this directly; I have already read "Postgresql DISTINCT ON with different ORDER BY".
Is there an alternative way to achieve it?

You need to include the DISTINCT column in your order:
.order('report_posts.post_id, report_posts.reported_at desc')
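With post_id leading the ORDER BY, the result is no longer sorted by reported_at across posts. If that final ordering is still needed, it has to happen in a second step. The plain-Ruby sketch below (no database; the sample rows are invented for illustration) mimics both steps: pick the newest report per post, then re-sort the survivors.

```ruby
# Hypothetical sample rows standing in for report_posts.
reports = [
  { post_id: 1, reported_at: 10 },
  { post_id: 2, reported_at: 30 },
  { post_id: 1, reported_at: 20 },
  { post_id: 3, reported_at: 5 }
]

# Step 1: emulate DISTINCT ON (post_id) ... ORDER BY post_id, reported_at DESC.
# Sorting first makes the newest report the first row of each post_id group,
# and uniq keeps only that first row.
sorted = reports.sort_by { |r| [r[:post_id], -r[:reported_at]] }
latest_per_post = sorted.uniq { |r| r[:post_id] }

# Step 2: emulate an outer query's ORDER BY reported_at DESC.
result = latest_per_post.sort_by { |r| -r[:reported_at] }
```

In SQL the same shape would be a subquery with DISTINCT ON inside and the final ORDER BY outside.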

Related

Rails join and filter by primary model ids

Why are these two queries producing different result?
FbGroup.where('fb_groups.id IN (?)', ids).length
> 300
FbGroup.joins(:fb_posts).where('fb_groups.id IN (?)', ids).length
> 500
My desired result is to join fb_posts but before I join I'd like to filter fb_groups by specific ids. How can I do that?
Why are these two queries producing different result?
The join can produce 1:N (one-to-many) rows. For example, if one FbGroup has two FbPosts, the join returns two rows for that group.
My desired result is to join fb_posts but before I join I'd like to filter fb_groups by specific ids. How can I do that?
Your second query already does this. Whether you filter as part of the join, or join first and then filter in an outer query, you get the same results.
My desired result is to join fb_posts but before I join I'd like to filter fb_groups by specific ids. How can I do that?
I used fb_groups with id 1, 2 and 3 just for example
FbGroup.where(id: [1, 2, 3]).joins(:fb_posts).count
I think the join is duplicating records. Use group together with joins to eliminate the duplicates. Try:
FbGroup.joins(:fb_posts).where('fb_groups.id IN (?)', ids).group('fb_groups.id').length
Please correct me if I am wrong.
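To make the row multiplication concrete, here is a small plain-Ruby sketch (no database; the groups and posts are made-up sample data) that emulates an INNER JOIN and then de-duplicates by group id, as the group call above does.

```ruby
# Hypothetical data: group 1 has two posts, group 2 has one, group 3 has none.
groups = [{ id: 1 }, { id: 2 }, { id: 3 }]
posts  = [{ fb_group_id: 1 }, { fb_group_id: 1 }, { fb_group_id: 2 }]

# Emulate an INNER JOIN: one output row per matching (group, post) pair.
joined = groups.flat_map do |group|
  posts
    .select { |post| post[:fb_group_id] == group[:id] }
    .map { |post| group.merge(post: post) }
end

# Rows after the join: group 1 appears twice, group 3 not at all.
joined_count = joined.length

# Groups after de-duplicating on the group id.
distinct_count = joined.map { |row| row[:id] }.uniq.length
```

Note that an inner join can also lower the count, since groups without any posts drop out entirely.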

Properly format an ActiveRecord query with a subquery in Postgres

I have a working SQL query for Postgres v10.
SELECT *
FROM
(
SELECT DISTINCT ON (title) products.title, products.*
FROM "products"
) subquery
WHERE subquery.active = TRUE AND subquery.product_type_id = 1
ORDER BY created_at DESC
The goal of the query is to select distinct rows based on the title column, then filter and order them. (I used the subquery in the first place because there seemed to be no way to combine DISTINCT ON with ORDER BY without one.)
I am trying to express said query in ActiveRecord.
I have been doing
Product.select("*")
.from(Product.select("DISTINCT ON (product.title) product.title, meals.*"))
.where("subquery.active IS true")
.where("subquery.meal_type_id = ?", 1)
.order("created_at DESC")
and that works! But it's fairly messy with the string where clauses in there. Is there a better way to express this query with ActiveRecord/Arel, or am I just running into the limits of what ActiveRecord can express?
I think the resulting ActiveRecord call can be improved.
But I would start improving with original SQL query first.
Subquery
SELECT DISTINCT ON (title) products.title, products.* FROM products
(I assume products was meant instead of meals) selects products.title twice, which is unnecessary. Worse, it lacks an ORDER BY clause. As the PostgreSQL documentation says:
Note that the “first row” of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first
I would rewrite sub-query as:
SELECT DISTINCT ON (title) * FROM products ORDER BY title ASC
which gives us a call:
Product.select('DISTINCT ON (title) *').order(title: :asc)
The where calls in the main query use the Rails-generated alias for the subquery (subquery). I would not rely on this internal Rails aliasing convention, as it may change at any time. If you accept that risk, you can merge the conditions into a single where call using hash-style arguments.
The final result:
Product.select('*')
.from(Product.select('DISTINCT ON (title) *').order(title: :asc))
.where(subquery: { active: true, meal_type_id: 1 })
.order('created_at DESC')
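One consequence of this query shape is easy to miss: the subquery picks one row per title first, and the filter only sees the rows that survived. The plain-Ruby sketch below (no database; the sample products and the created_at tie-break are assumptions for illustration) compares filtering after the distinct step with filtering before it.

```ruby
# Hypothetical sample rows standing in for the products table.
products = [
  { title: "A", active: false, created_at: 3 },
  { title: "A", active: true,  created_at: 1 },
  { title: "B", active: true,  created_at: 2 }
]

# Emulate DISTINCT ON (title) with ORDER BY title. In Postgres, ordering by
# title alone leaves the winner among equal titles unspecified; this sketch
# adds created_at DESC as an explicit (assumed) tie-break.
distinct = products
  .sort_by { |p| [p[:title], -p[:created_at]] }
  .uniq { |p| p[:title] }

# Outer query: filter the chosen rows, then order them.
filter_after = distinct.select { |p| p[:active] }.sort_by { |p| -p[:created_at] }

# For comparison: filter first, then keep one row per title.
filter_before = products
  .select { |p| p[:active] }
  .sort_by { |p| [p[:title], -p[:created_at]] }
  .uniq { |p| p[:title] }
```

Here title "A" disappears from filter_after because its chosen row is inactive, while filter_before still returns the active "A" row. Which behaviour is wanted depends on the requirement.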

Arel and CTE to Wrap Query

I have to update a search builder which builds a relation with a CTE. This is necessary because a complex relation (which includes DISTINCT, JOINs etc.) is first built and then its results have to be ordered, all in one query.
Here's a simplified look at things:
rel = User.select('DISTINCT ON (users.id) users.*').where(<lotsastuff>)
rel.to_sql
# SELECT DISTINCT ON (users.id) users.*
# FROM "users"
# WHERE <lotsastuff>
rel2 = User.from_cte('cte_table', rel).order(:created_at)
rel2.to_sql
# WITH "cte_table" AS (
# SELECT DISTINCT ON (users.id) users.*
# FROM "users"
# WHERE <lotsastuff>
# ) SELECT "cte_table".* FROM "cte_table"
# ORDER BY "cte_table"."created_at" ASC
The beauty of it is that rel2 responds as expected e.g. to count.
The from_cte method is provided by the "postgres_ext" gem, which appears to have been abandoned. I'm therefore looking for another way to build the relation rel2 from rel.
The Arel docs mention a case which doesn't seem to help here.
Any hints on how to get there? Thanks a bunch!
PS: I know how to do this with two queries by selecting all user IDs in the first, then building a query with IN over the IDs and ordering there. However, I'm curious whether this is possible with one query (with or without a CTE) as well.
Since your CTE is non-recursive, you can rewrite it as a subquery in the FROM clause. The only change is that Postgres's planner will optimize it as part of the main query instead of separately (a CTE is an optimization fence in PostgreSQL versions before 12). In ActiveRecord this works for me (tested on 5.1.4):
2.4.1 :001 > rel = User.select("DISTINCT ON (users.id) users.*").where("1=1")
2.4.1 :002 > puts User.from(rel, 'users').order(:created_at).to_sql
SELECT "users".* FROM (SELECT DISTINCT ON (users.id) users.* FROM "users" WHERE (1=1)) users ORDER BY "users"."created_at" ASC
I don't see any way to squeeze a CTE into ActiveRecord without extending it though, like what postgres_ext does. Sorry!
From what you've mentioned, I do not understand why you need a CTE instead of just a nested query.
rel = User.select('DISTINCT ON (users.id) users.*').where(<lotsastuff>).arel
inner_query = Arel::Table.new(:inner_query)
composed_cte = Arel::Nodes::As.new(inner_query, rel)
select_manager = Arel::SelectManager.new(composed_cte)
rel2 = select_manager.project('COUNT(*)')
rel2.to_sql
rel3 = select_manager.order('created_at ASC')
rel3.to_sql
You can then execute the resulting SQL.

ActiveRecord subquery in select clause

So I'm getting a bunch of Volunteers records, with some filtering and sorting, which is fine. But I'd like to also get a count of the number of Children each volunteer is helping (using volunteer_id on children table), as a sub-query in the select clause to avoid having to perform a separate query for each record. As a bonus it would be good to be able to sort by this count too!
I'd like to end up with a generated query like this and be able to access the 'kids' column:
SELECT id, name, (SELECT COUNT(*) FROM children WHERE volunteer_id = volunteers.id) AS kids FROM volunteers
Is there any way of doing this with Arel? I've had a bit of a scout around and haven't found anything yet.
Alternatively, is it possible to join to the children table and get: count(children.id) ?
Thanks for any help :)
The proper way of doing this with SQL is with a GROUP BY clause:
SELECT v.id, v.name, COUNT(c.id) AS kids
FROM volunteers v
LEFT OUTER JOIN children c ON v.id = c.volunteer_id
GROUP BY v.id, v.name
There is a method .group() in AR for using GROUP BY queries.
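One detail matters with a LEFT OUTER JOIN: a volunteer without children still produces one joined row, so counting rows and counting child ids give different answers. The plain-Ruby sketch below (no database; names and rows are invented sample data) emulates the join and both aggregates.

```ruby
# Hypothetical sample data: Ann helps two children, Bob helps none.
volunteers = [{ id: 1, name: "Ann" }, { id: 2, name: "Bob" }]
children   = [{ id: 10, volunteer_id: 1 }, { id: 11, volunteer_id: 1 }]

# Emulate LEFT OUTER JOIN: unmatched volunteers get one row with a nil child.
joined = volunteers.flat_map do |v|
  matches = children.select { |c| c[:volunteer_id] == v[:id] }
  matches = [nil] if matches.empty?
  matches.map { |c| { volunteer: v, child: c } }
end

# Emulate GROUP BY on the volunteer with two different aggregates.
by_volunteer = joined.group_by { |row| row[:volunteer][:name] }

# COUNT(*): counts every joined row, including the nil-child row.
count_star = by_volunteer.transform_values(&:length)

# COUNT(c.id): counts only rows where a child was actually matched.
count_child_id = by_volunteer.transform_values { |rows| rows.count { |r| r[:child] } }
```

Counting the child id is the variant that reports zero for volunteers who are not helping anyone.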

PGError: ERROR: aggregates not allowed in WHERE clause on a AR query of an object and its has_many objects

I am running the following query on a has_many association. Recommendation has_many Approvals.
I am using Rails 3 and PostgreSQL:
Recommendation.joins(:approvals).where('approvals.count = ?
AND recommendations.user_id = ?', 1, current_user.id)
This is returning the following error: https://gist.github.com/1541569
The error message tells you:
aggregates not allowed in WHERE clause
count() is an aggregate function. Use the HAVING clause for that.
The query could look like this:
SELECT r.*
FROM recommendations r
JOIN approvals a ON a.recommendation_id = r.id
WHERE r.user_id = $current_user_id
GROUP BY r.id
HAVING count(a.recommendation_id) = 1
With PostgreSQL 9.1 or later it is enough to GROUP BY the primary key of a table (presuming recommendations.id is the PK). In Postgres versions before 9.1 you had to include all columns of the SELECT list that are not aggregated in the GROUP BY list. With recommendations.* in the SELECT list, that would be every single column of the table.
I quote the release notes of PostgreSQL 9.1:
Allow non-GROUP BY columns in the query target list when the primary
key is specified in the GROUP BY clause (Peter Eisentraut)
Simpler with a sub-select
Either way, this is simpler and faster, doing the same:
SELECT *
FROM recommendations r
WHERE user_id = $current_user_id
AND (SELECT count(*)
FROM approvals
WHERE recommendation_id = r.id) = 1;
If you avoid multiplying rows with a JOIN in the first place, you don't have to aggregate them back.
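The two query shapes can be compared in a plain-Ruby sketch (no database; the rows and current_user_id are made-up sample values). Both select the current user's recommendations that have exactly one approval.

```ruby
# Hypothetical sample data.
recommendations = [{ id: 1, user_id: 7 }, { id: 2, user_id: 7 }, { id: 3, user_id: 8 }]
approvals = [
  { recommendation_id: 1 },
  { recommendation_id: 2 },
  { recommendation_id: 2 },
  { recommendation_id: 3 }
]
current_user_id = 7

# Correlated sub-select: for each candidate row, count its approvals directly.
via_subselect = recommendations.select do |r|
  r[:user_id] == current_user_id &&
    approvals.count { |a| a[:recommendation_id] == r[:id] } == 1
end.map { |r| r[:id] }

# JOIN + WHERE + GROUP BY + HAVING: multiply rows first, then aggregate back.
joined = recommendations.flat_map do |r|
  approvals.select { |a| a[:recommendation_id] == r[:id] }.map { r }
end
via_having = joined
  .select { |r| r[:user_id] == current_user_id }
  .group_by { |r| r[:id] }
  .select { |_id, rows| rows.length == 1 }
  .keys
```

Both approaches return the same ids here; the sub-select version simply never builds the multiplied intermediate rows.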
Looks like you have a column named count and PostgreSQL is interpreting that column name as the count aggregate function. Your SQL ends up like this:
SELECT "recommendations".*
FROM "recommendations"
INNER JOIN "approvals" ON "approvals"."recommendation_id" = "recommendations"."id"
WHERE (approvals.count = 1 AND recommendations.user_id = 1)
The error message specifically points at the approvals.count:
LINE 1: ...ecommendation_id" = "recommendations"."id" WHERE (approvals....
^
I can't reproduce that error in my PostgreSQL (9.0) but maybe you're using a different version. Try double quoting that column name in your where:
Recommendation.joins(:approvals).where('approvals."count" = ? AND recommendations.user_id = ?', 1, current_user.id)
If that sorts things out then I'd recommend renaming your approvals.count column to something else so that you don't have to worry about it anymore.
