How do I translate my SQLite3 query to PostgreSQL? - ruby-on-rails

I am trying to order all the recipes in my database by the number of likes they have received. Likes are polymorphic and belong to :likeable while a recipe has many likes.
My query works in SQLite3, but it breaks when I deploy to Heroku, which uses PostgreSQL.
The scope is as follows:
Recipe.select('*').joins(:likes).group('recipes.id').order('COUNT(likes.likeable_id)')
And the error that Heroku gives me when I try to run the website:
ActionView::Template::Error (PG::GroupingError: ERROR: column "likes.id" must appear in the GROUP BY clause or be used in an aggregate function
Everything compiles, but the homepage uses that scope function so I get a server error right away.

You need to explicitly select recipes.*:
Recipe.select(
  Recipe.arel_table[Arel.star],
  # COUNT over the joined likes rows, exposed as likes_count
  Like.arel_table[Arel.star].count.as('likes_count')
)
.joins(:likes)
.group(:id)
.order(:likes_count)
Selecting the count is optional. You can skip .select entirely and just compute the aggregate in the order clause:
Recipe.joins(:likes)
.group(:id)
.order(Like.arel_table[Arel.star].count)

You cannot SELECT * when using GROUP BY.
In most SQL databases (Postgres, newer MySQL, ...) a grouped query can only SELECT:
columns you have grouped by, plus columns that are functionally dependent on the grouped column (e.g. when grouping by recipes.id you can also select recipes.title)
and aggregated columns (count, sum, max)
Try:
Recipe.select('recipes.*').joins(:likes).group(:id).order('COUNT(likes.likeable_id)')
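As a rough illustration of what the grouped query computes, here is a plain-Ruby sketch with no database involved. The recipes and likes arrays and their values are made up for the example; only the column names likeable_id and likeable_type follow the Rails polymorphic convention.

```ruby
# Hypothetical in-memory rows standing in for the recipes and likes tables.
recipes = [
  { id: 1, title: 'Pancakes' },
  { id: 2, title: 'Ramen' },
  { id: 3, title: 'Salad' }
]
likes = [
  { likeable_type: 'Recipe',  likeable_id: 2 },
  { likeable_type: 'Recipe',  likeable_id: 1 },
  { likeable_type: 'Recipe',  likeable_id: 2 },
  { likeable_type: 'Comment', likeable_id: 3 } # a like on something else
]

# Rough equivalent of JOIN + GROUP BY recipes.id + COUNT(likes.likeable_id).
like_counts = likes
  .select { |like| like[:likeable_type] == 'Recipe' }
  .group_by { |like| like[:likeable_id] }
  .transform_values(&:count)

# joins(:likes) is an inner join, so recipes without any likes drop out.
ranked = recipes
  .select { |recipe| like_counts.key?(recipe[:id]) }
  .sort_by { |recipe| like_counts[recipe[:id]] }

ranked_titles = ranked.map { |recipe| recipe[:title] }
puts ranked_titles.inspect
```

Each recipe ends up with exactly one count, which is why PostgreSQL accepts recipes.* together with the aggregate once the query groups by recipes.id.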

Related

RoR PostgresQL - Get latest, distinct values from database

I am trying to query my PostgreSQL database to get the latest (by created_at) and distinct (by user_id) Activity objects, where each user has multiple activities in the database. The activity object is structured as such:
Activity(id, user_id, created_at, ...)
I first tried to get the below query to work:
Activity.order('created_at DESC').select('DISTINCT ON (activities.user_id) activities.*')
however, kept getting the below error:
ActiveRecord::StatementInvalid: PG::InvalidColumnReference: ERROR: SELECT DISTINCT ON expressions must match initial ORDER BY expressions
According to this post: PG::Error: SELECT DISTINCT, ORDER BY expressions must appear in select list, it looks like the ORDER BY clause can only be applied after the DISTINCT has been applied. This does not help me, as I want to get the distinct activities by user_id, but also want them to be the most recently created activities. Thus, I need the activities to be sorted before getting the distinct activities.
I have come up with a solution that works, by first grouping the activities by user id and then ordering the activities within the groups by created_at. However, this takes two queries.
I was wondering if what I want is possible in just one query?
This should work. Try the following:
Solution 1
Activity.select('DISTINCT ON (activities.user_id) activities.*').order('created_at DESC')
Solution 2
If Solution 1 does not work (PostgreSQL requires the ORDER BY to start with the DISTINCT ON expression), it helps to create a scope for this
in the Activity model:
scope :latest, -> {
  select('DISTINCT ON (activities.user_id) activities.*')
    .order('activities.user_id, activities.created_at DESC')
}
Now you can call this anywhere like below
Activity.latest
Hope it helps
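To make the row-picking rule concrete: DISTINCT ON (user_id) keeps the first row of each user_id group in ORDER BY order, which is why user_id has to lead the ORDER BY. The following is only a plain-Ruby sketch of that behaviour, with hypothetical rows and integer timestamps rather than a real database.

```ruby
# Hypothetical activity rows; created_at is a plain integer for brevity.
activities = [
  { id: 1, user_id: 7, created_at: 100 },
  { id: 2, user_id: 8, created_at: 150 },
  { id: 3, user_id: 7, created_at: 300 },
  { id: 4, user_id: 8, created_at: 120 }
]

# ORDER BY user_id, created_at DESC
sorted = activities.sort_by { |a| [a[:user_id], -a[:created_at]] }

# DISTINCT ON (user_id): keep the first row seen for each user_id.
latest_per_user = sorted.uniq { |a| a[:user_id] }

latest_ids = latest_per_user.map { |a| a[:id] }
puts latest_ids.inspect
```

Sorting by created_at alone would interleave the users, so "the first row per user" would no longer be well defined for the database; that is what the InvalidColumnReference error is guarding against.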

ActiveRecord Error with PostgreSQL but not SQLite - GroupingError - column must appear in the GROUP BY clause or be used in an aggregate function

This ActiveRecord query works in SQLite:
SlotReq.group(:team_id)
In PostgreSQL, the same query gives this error "GroupingError - column slot_reqs.id must appear in the GROUP BY clause or be used in an aggregate function"
Based on the answer to this question I changed my query to:
SlotReq.select("slot_reqs.team_id").group("slot_reqs.team_id")
and it works as expected.
I would like to know if I'm doing it right, and why this works.
Yes, you are doing it right, although you could also use:
SlotReq.select(:team_id).group(:team_id)
What happens is that PG (among other DBs) requires every column in the SELECT list to be either aggregated or listed in the GROUP BY clause. A column that is neither would lead to indeterminate behavior (i.e. which value should be used in that column?).
So, by selecting just the column you group by, you leave no column ambiguous. On the other hand, using group without select is equivalent to SELECT * FROM table GROUP BY column, which selects all columns while only one of them appears in the GROUP BY clause.

Group by Error: PG::GroupingError: ERROR: column must appear in the GROUP BY clause or be used in an aggregate function [duplicate]

I am getting this error in production with pg, but it's working fine in development with sqlite3.
ActiveRecord::StatementInvalid in ManagementController#index
PG::Error: ERROR: column "estates.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT "estates".* FROM "estates" WHERE "estates"."Mgmt" = ...
^
: SELECT "estates".* FROM "estates" WHERE "estates"."Mgmt" = 'Mazzey' GROUP BY user_id
@myestate = Estate.where(:Mgmt => current_user.Company).group(:user_id).all
If user_id is the PRIMARY KEY then you need to upgrade PostgreSQL; newer versions (9.1 and later) will correctly handle grouping by the primary key.
If user_id is neither unique nor the primary key for the 'estates' relation in question, then this query doesn't make much sense, since PostgreSQL has no way to know which value to return for each column of estates where multiple rows share the same user_id. You must use an aggregate function that expresses what you want, like min, max, avg, string_agg, array_agg, etc or add the column(s) of interest to the GROUP BY.
Alternately you can rephrase the query to use DISTINCT ON and an ORDER BY if you really do want to pick a somewhat arbitrary row, though I really doubt it's possible to express that via ActiveRecord.
Some databases - including SQLite and MySQL without ONLY_FULL_GROUP_BY - will just pick an arbitrary row. This is considered incorrect and unsafe by the PostgreSQL team, so PostgreSQL follows the SQL standard and considers such queries to be errors.
If you have:
col1  col2
fred  42
bob   9
fred  44
fred  99
and you do:
SELECT col1, col2 FROM mytable GROUP BY col1;
then it's obvious that you should get the row:
bob 9
but what about the result for fred? There is no single correct answer to pick, so the database will refuse to execute such unsafe queries. If you wanted the greatest col2 for any col1 you'd use the max aggregate:
SELECT col1, max(col2) AS max_col2 FROM mytable GROUP BY col1;
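The same ambiguity can be seen without a database. This is only a plain-Ruby sketch using the rows from the example table above; the variable names are illustrative.

```ruby
# The rows from the example table.
rows = [
  { col1: 'fred', col2: 42 },
  { col1: 'bob',  col2: 9 },
  { col1: 'fred', col2: 44 },
  { col1: 'fred', col2: 99 }
]

# GROUP BY col1: each group may hold several candidate col2 values.
candidates = rows.group_by { |r| r[:col1] }
                 .transform_values { |group| group.map { |r| r[:col2] } }

# max(col2) collapses every group to exactly one value.
max_col2 = candidates.transform_values(&:max)

puts candidates.inspect
puts max_col2.inspect
```

The fred group holds three candidate values, so a bare col2 has no single answer until an aggregate such as max chooses one.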
I recently moved from MySQL to PostgreSQL and encountered the same issue. Just for reference, the best approach I've found is to use DISTINCT ON as suggested in this SO answer:
Elegant PostgreSQL Group by for Ruby on Rails / ActiveRecord
This will let you get one record for each unique value in your chosen column that matches the other query conditions:
MyModel.where(:some_col => value).select("DISTINCT ON (unique_col) *")
I prefer DISTINCT ON because I can still get all the other column values in the row. DISTINCT alone will only return the value of that specific column.
After often receiving this error myself, I realised that Rails (I am using Rails 4) automatically adds an 'order by id' at the end of your grouping query. This often results in the error above. So make sure you append your own .order(:group_by_column) at the end of your Rails query. You will then have something like this:
@problems = Problem.select('problems.username, sum(problems.weight) as weight_sum').group('problems.username').order('problems.username')
@myestate1 = Estate.where(:Mgmt => current_user.Company)
@myestate = @myestate1.select("DISTINCT(user_id)")
This is what I did.

Rails Postgres Error GROUP BY clause or be used in an aggregate function

In SQLite (development) I don't have any errors, but in production with Postgres I get the following error. I don't really understand the error.
PG::Error: ERROR: column "commits.updated_at" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: ...mmits"."user_id" = 1 GROUP BY mission_id ORDER BY updated_at...
^
: SELECT COUNT(*) AS count_all, mission_id AS mission_id FROM "commits" WHERE "commits"."user_id" = 1 GROUP BY mission_id ORDER BY updated_at DESC
My controller method:
def show
  @user = User.find(params[:id])
  @commits = @user.commits.order("updated_at DESC").page(params[:page]).per(25)
  @missions_commits = @commits.group("mission_id").count.length
end
UPDATE:
So I dug further into this PostgreSQL-specific annoyance, and I am surprised that this exception is not mentioned in the Ruby on Rails Guide.
I am using psql (PostgreSQL) 9.1.11
So from what I understand, I need to specify which columns should be used whenever I use the GROUP BY clause. I thought using SELECT would help, which can be annoying if you need to SELECT a lot of columns.
Interesting discussion here
Anyway, when I look at the error, the cursor always points to updated_at. In the SQL query, Rails always adds ORDER BY updated_at. So I have tried this horrible query:
@commits.group("mission_id, date(updated_at)")
  .select("date(updated_at), count(mission_id)")
  .having("count(mission_id) > 0")
  .order("count(mission_id)").length
which gives me the following SQL
SELECT date(updated_at), count(mission_id)
FROM "commits"
WHERE "commits"."user_id" = 1
GROUP BY mission_id, date(updated_at)
HAVING count(mission_id) > 0
ORDER BY updated_at DESC, count(mission_id)
LIMIT 25 OFFSET 0
the error is the same.
Note that no matter what, it will ORDER BY updated_at, even if I wanted to order by something else.
Also, I don't want to group the records by updated_at, just by mission_id.
This PostgreSQL error is misleading and offers little explanation of how to solve it. I have tried many formulas from the Stack Overflow sidebar; nothing works and I always get the same error.
UPDATE 2:
So I got it to work, but it needs to group by updated_at because of the automatic ORDER BY updated_at. How do I count only by mission_id?
@missions_commits = @commits.group("mission_id, updated_at").count("mission_id").size
I guess you want to show the total number of distinct Missions related to the user's Commits; in any case, that will not be the number on the current page.
Try this:
@commits = @user.commits.order("updated_at DESC").page(params[:page]).per(25)
@missions_commits = @user.commits.distinct.count(:mission_id)
However, if you want the number of distinct Missions on the current page, I suppose it should be:
@missions_commits = @commits.collect(&:mission_id).uniq.count
Update
In Rails 3, distinct does not exist, so the distinct count should be done this way:
@missions_commits = @user.commits.count(:mission_id, distinct: true)
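The difference between the two counts can be sketched in plain Ruby. The commits array and the page size of 2 are hypothetical stand-ins (the question paginates by 25), and no database is involved.

```ruby
# Hypothetical commits for one user, already sorted by updated_at DESC.
commits = [
  { id: 5, mission_id: 1 },
  { id: 4, mission_id: 1 },
  { id: 3, mission_id: 2 },
  { id: 2, mission_id: 3 },
  { id: 1, mission_id: 2 }
]
per_page = 2 # hypothetical page size

# Distinct missions across all of the user's commits.
total_missions = commits.map { |c| c[:mission_id] }.uniq.count

# Distinct missions among the commits on the first page only.
first_page = commits.first(per_page)
page_missions = first_page.map { |c| c[:mission_id] }.uniq.count

puts total_missions
puts page_missions
```

The two numbers answer different questions, so it is worth deciding which one the view should display before picking a query.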
See the docs for PostgreSQL GROUP BY here:
http://www.postgresql.org/docs/9.3/interactive/sql-select.html#SQL-GROUPBY
Basically, unlike SQLite (and MySQL), PostgreSQL requires that any column selected or ordered on must appear in an aggregate function or in the GROUP BY clause.
If you think it through, you'll see that this actually makes sense. SQLite/MySQL cheat under the hood and silently drop those fields (not sure that's technically what happens).
Or, thinking about it another way: if you are grouping by a field, what's the point of ordering by another one? How would that even make sense unless you also had an aggregate function on the ordered field?

.group not returning all columns

I have a .group query that is not returning all the columns in the select and I was wondering if someone could validate my syntax.
Here is a query with a .group and the result from my console:
Expense.select('account_number, SUM(credit_amount)').group(:account_number).first
Expense Load (548.8ms) EXEC sp_executesql N'SELECT TOP (1) account_number, SUM(credit_amount) FROM [expenses] GROUP BY account_number'
(36.9ms) SELECT table_name FROM information_schema.views
Even though I select two columns, only the first one is returned. I'm wondering if I may be dealing with a DB adapter problem.
Try giving your sum an alias:
expense = Expense.select('account_number, SUM(credit_amount) AS credit_amount').group(:account_number).first
puts expense.credit_amount
ActiveRecord doesn't create a default alias for aggregation operations such as SUM, COUNT etc... you have to do it explicitly to be able to access the results, as shown above.
The SUM(credit_amount) column from the SQL has no alias and will not have a column name by default. If you give it an alias, SUM(credit_amount) AS 'A' for example, and read the value through that alias name, it should pick it up.

Resources