Count, empty? fails for ActiveRecord with outer joins


I have two models, Monkey and Session, where Monkey has_many Session. I have a scope for Monkey:
scope :with_session_counts, -> {
  joins("LEFT OUTER JOIN `sessions` ON `sessions`.`monkey_id` = `monkeys`.`id`")
    .group(:id)
    .select("`monkeys`.*, COUNT(DISTINCT `sessions`.`id`) as session_count")
}
in order to grab the number of associated Sessions (even when 0).
Querying @monkeys = Monkey.with_session_counts works as expected. However, when I test in my view:
<% unless @monkeys.empty? %>
I get this error:
Mysql2::Error: Column 'id' in field list is ambiguous:
SELECT COUNT(*) AS count_all, id AS id FROM `monkeys`
LEFT OUTER JOIN `sessions` ON `sessions`.`monkey_id` = `monkeys`.`id`
GROUP BY `monkeys`.`id`
How would I convince Rails to prefix id with the table name in presence of the JOIN?
Or is there a better alternative for the OUTER JOIN?
This applies equally to calling @monkeys.count(:all). I'm using RoR 4.2.1.
Update:
I have a partial fix for my issue: specify group("monkeys.id") explicitly. I wonder whether this is a bug in the code that generates the SELECT clause for count(:all). Note that in both cases (group("monkeys.id") and group(:id)) the GROUP BY part is generated correctly (i.e. with monkeys.id), but in the latter case the SELECT only contains id AS id. I say 'partial' because the fix stops the call to empty? from breaking, but a call to count(:all) returns a Hash {monkey_id => number_of_sessions} instead of the number of records.
Update 2:
I guess my real question is: How can I get the number of associated sessions for each monkey, so that for all intents and purposes I can work with the query result as with Monkey.all? I know about counter cache but would prefer not to use it.

I believe it is not a bug. As you noted in your update, you have to specify the table that the id column belongs to. In this case group('monkeys.id') would do it.
How would the code responsible for generating the statement know which table to use? Without the count it worked fine because it adds monkeys.* to the projection, and that is the id used by GROUP BY. However, if you actually wanted to group by the sessions' id, you would have to specify the table anyway.
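As the question's update notes, a grouped count returns a Hash rather than an integer. A minimal sketch, with a plain Hash (made-up ids and values) standing in for that return value, of recovering the number of rows and an emptiness check from it:

```ruby
# Stand-in for the Hash a grouped count returns: { group key => count }.
# The ids and counts below are made up for illustration.
grouped_counts = { 1 => 3, 2 => 1, 5 => 2 }

# One key per group, so the number of grouped rows is the number of keys.
number_of_monkeys = grouped_counts.size

# The relation has rows exactly when the Hash has keys.
no_monkeys = grouped_counts.empty?

puts number_of_monkeys
puts no_monkeys
```

With the real relation this would presumably read Monkey.with_session_counts.count(:all).size, assuming the group("monkeys.id") fix from the update is in place.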

Related

Ruby - ActiveRecord - Select one record per 'group' based on a specific column value

I have this table:
Table: User

Name     Role
Mason    Engineer
Jackson  Engineer
Mason    Supervisor
Jackson  Supervisor
Graham   Engineer
Graham   Engineer
There can be exact duplicates (same Name/Role combination). Ignore comments about primary key.
I am writing a query that will give the distinct values from 'Name' column, with the corresponding 'Role'. To select the corresponding 'Role', if there is a 'Supervisor' role for a name, that record is returned. Otherwise, a record with the 'Engineer' role should be returned if it exists.
For the above table, the expected result is:
Name     Role
Mason    Supervisor
Jackson  Supervisor
Graham   Engineer
I tried ordering 'Role' in descending order, so that I can group by Name, Role and pick the first item - it will be a 'Supervisor' role if present, else 'Engineer' role - which matches my expectation.
I also tried doing User.select('DISTINCT ON (name) *').order(Role: :desc) - I am not seeing this clause in the SQL query that gets executed.
Also, I tried another approach to get all valid Name, Role combinations and then process it offline iterating the result set and using if-else to decide which row to display.
However, I am interested in anything that is efficient and does not over do this handling.
I am new to Ruby and therefore reaching out.

If I wanted to do this in pure SQL, I would have to use GROUP BY.
SELECT Name, MAX(Role) FROM User GROUP BY Name
So one method would be to execute this SQL statement against the base connection.
ActiveRecord::Base.connection.execute("SELECT Name, MAX(Role) FROM User GROUP BY Name")
That would provide exactly the data you need, though it wouldn't be returned as ActiveRecord models. If you need those models then I would use find_by_sql and do an inner join to provide the records.
User.find_by_sql("SELECT User.* FROM User INNER JOIN (SELECT Name AS n, MAX(Role) AS r FROM User GROUP BY Name) U2 ON Name = U2.n AND Role = U2.r")
Unfortunately that would provide both records for Graham.
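The query relies on 'Supervisor' sorting after 'Engineer', which is why MAX(Role) picks the supervisor row when one exists. A minimal pure-Ruby sketch of the same selection over the question's sample rows (plain arrays here, not ActiveRecord models), which also collapses Graham's duplicate:

```ruby
rows = [
  ["Mason", "Engineer"], ["Jackson", "Engineer"],
  ["Mason", "Supervisor"], ["Jackson", "Supervisor"],
  ["Graham", "Engineer"], ["Graham", "Engineer"]
]

# Group the rows by name, then keep the alphabetically greatest role,
# mirroring SELECT Name, MAX(Role) ... GROUP BY Name.
result = rows
  .group_by { |name, _role| name }
  .map { |name, pairs| [name, pairs.map { |_name, role| role }.max] }

result.each { |name, role| puts "#{name} #{role}" }
```

The string comparison only happens to work for these two role names; with more roles an explicit priority list would be a safer basis for the pick.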

Rails: Omit order by clause when using ActiveRecord .first query?

I'm having a problem with a .first query in Rails 4 ActiveRecord. New behavior in Rails 4 is to add an order by the id field so that all db systems will output the same order.
So this...
Foo.where(bar: baz).first
Will give the query...
select foos.* from foos order by foos.id asc limit 1
The problem I am having is my select contains two sum fields. With the order by id thrown in the query automatically, I'm getting an error that the id field must appear in the group by clause. The error is right, no need for the id field if I want the output to be the sum of these two fields.
Here is an example that is not working...
baz = Foo.find(77).fooviews.select("sum(number_of_foos) as total_number_of_foos, sum(number_of_bars) as total_number_of_bars").reorder('').first
Here is the error...
ActiveRecord::StatementInvalid: PG::GroupingError: ERROR: column "foos.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: ...bars FROM "fooviews" ORDER BY "...
Since the select is an aggregate expression, there is no need for the order by id, but AR is throwing it in automatically.
I found that I can add a reorder('') on to the end before the .first and that removes the order by id, but is that the right way to fix this?
Thank you
[UPDATE] What I neglected to mention is that I'm converting a large Rails 3 project to Rails 4. So the output from Rails 3 is an AR object. If possible, I would like the solution to keep that format so that there is less code to change in the conversion.

You will want to use take:
The take method retrieves a record without any implicit ordering.
For example:
baz = Foo.find(77).fooviews.select("sum(number_of_foos) as total_number_of_foos, sum(number_of_bars) as total_number_of_bars").take
The commit message here indicates that this was a replacement for the old first behavior.

Ambiguous reference on column when grouping by association

I'm grouping a list of Bug reports on a known collection of users that are related to the report (that is, the user that is responsible for the report and the user that is currently assigned to it).
The Model Bug (AR, Rails 4.2.x) thus has, among others, two associations assigned_to and responsible, which are resolved to the foreign keys assigned_to_id, responsible_id.
Bugs can also be related to a project, which may also have a responsible user set, thus they also possess a responsible_id foreign key.
As we're grouping on both attributes from the report itself and the associated project, we want to include the associated project in the returned query.
I can then get a hash count of <User> => count through the following statement, grouping on the association name of the bug report:
Bug.group(:assigned_to)
  .includes(:project)
  .references(:projects)
  .count
which correctly produces the desired result: A collection of Users (assignees) and the Bugs they are being assigned to.
For responsibles, the same query:
Bug.group(:responsible)
  .includes(:project)
  .references(:projects)
  .count
yields an error, since the attribute responsible_id is both contained in the query by bugs and the associated projects.
SELECT COUNT(DISTINCT "bugs"."id") AS count_id,
responsible_id AS responsible_id
FROM "bugs"
LEFT OUTER JOIN "projects" ON "projects"."id" = "bugs"."project_id"
GROUP BY "bugs"."responsible_id"
If I instead group on the explicit attribute itself using Bug.group('bugs.responsible_id'), I get a valid response, but in the form of responsible_id => count.
SELECT COUNT(DISTINCT "bugs"."id") AS count_id,
bugs.responsible_id AS bugs_responsible_id
FROM "bugs"
LEFT OUTER JOIN "projects" ON "projects"."id" = "bugs"."project_id"
WHERE <condition>
GROUP BY bugs.responsible_id
Is there a way to force using the association, but namespace the query as in the second query?
Of course I could process the result and expand it to the responsible users, however since the grouping is part of a larger querying functionality, I only get to manipulate the grouping identifier without extensive changes to the query builder.

I don't think there is a fix for this now (in Rails 4.2.4). This will however become easy in Rails 5.
If you absolutely must solve the problem now, you could patch ActiveRecord::Calculations#execute_grouped_calculation with the fix available in rails 5 for your app. Simply add an initializer at config/initializers e.g. active_record_calculations_patch.rb with the following (abbreviated) content. You can copy the original code from your rails version and then add the fix:
module ActiveRecord
  module Calculations
    def execute_grouped_calculation(operation, column_name, distinct)
      ...
      else
        group_fields = group_attrs
      end
      # LINE OF CODE COPIED OVER FROM THE FIX
      group_fields = arel_columns(group_fields)
      # END OF COPIED OVER CODE
      group_aliases = group_fields.map { |field|
        column_alias_for(field)
      ...
    end
  end
end

Rails 3 Comparing foreign key to list of ids using activerecord

I have a relationship between two models, Register and Competition. I have a very complicated dynamic query that is being built, and if the conditions are right I need to limit Register records to only those whose parent Competition meets certain criteria. To do this without selecting from the Competition table, I was thinking of something along the lines of...
Register.where("competition_id in ?", Competition.where("...").collect {|i| i.id})
Which produces this SQL:
SELECT "registers".* FROM "registers" WHERE (competition_id in 1,2,3,4...)
I don't think PostgreSQL likes the fact that the IN parameters aren't surrounded by parentheses. How can I compare the Register foreign key to a list of competition ids?

You can make it a bit shorter and skip the collect (this worked for me in 3.2.3).
Register.where(competition_id: Competition.where("..."))
This will result in the following SQL:
SELECT "registers".* FROM "registers" WHERE "registers"."competition_id" IN (SELECT "competitions"."id" FROM "competitions" WHERE "...")

Try this instead:
competitions = Competition.where("...").collect {|i| i.id}
Register.where(:competition_id => competitions)
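The SQL in the question suggests the array was expanded into a bare comma-separated list, so with a string condition the parentheses have to come from the template itself, as in "competition_id IN (?)". A rough sketch of that substitution, where expand_placeholder is a hypothetical helper written for illustration and not part of Rails:

```ruby
# Hypothetical stand-in for how an Array bound to "?" ends up in the SQL:
# the values are joined with commas and nothing else is added.
def expand_placeholder(template, values)
  template.sub("?", values.map { |v| Integer(v) }.join(","))
end

ids = [1, 2, 3]

# Template from the question: no parentheses, so the IN list is malformed.
broken = expand_placeholder("competition_id in ?", ids)

# Template with explicit parentheses around the placeholder.
fixed = expand_placeholder("competition_id IN (?)", ids)

puts broken
puts fixed
```

The hash form shown in the answers avoids the issue entirely, since no SQL template is written by hand.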

rails select and include

Can anyone explain this?
Project.includes([:user, :company])
This executes 3 queries, one to fetch projects, one to fetch users for those projects and one to fetch companies.
Project.select("name").includes([:user, :company])
This executes 3 queries, and completely ignores the select bit.
Project.select("user.name").includes([:user, :company])
This executes 1 query with proper left joins. And still completely ignores the select.
It would seem to me that rails ignores select with includes. Ok fine, but why when I put a related model in select does it switch from issuing 3 queries to issuing 1 query?
Note that the 1 query is what I want, I just can't imagine this is the right way to get it nor why it works, but I'm not sure how else to get the results in one query (.joins seems to only use INNER JOIN which I do not in fact want, and when I manually specify the join conditions to .joins the search gem we're using freaks out as it tries to re-add joins with the same name).

I had the same problem with select and includes.
For eager loading of associated models I used native Rails scope 'preload' http://apidock.com/rails/ActiveRecord/QueryMethods/preload
It provides eager load without skipping of 'select' at scopes chain.
I found it here https://github.com/rails/rails/pull/2303#issuecomment-3889821
Hope this tip will be helpful for someone as it was helpful for me.

All right, so here's what I came up with...
.joins("LEFT JOIN companies companies2 ON companies2.id = projects.company_id LEFT JOIN project_types project_types2 ON project_types2.id = projects.project_type_id LEFT JOIN users users2 ON users2.id = projects.user_id") \
.select("six, fields, I, want")
Works, pain in the butt but it gets me just the data I need in one query. The only lousy part is I have to give everything a model2 alias since we're using meta_search, which seems to not be able to figure out that a table is already joined when you specify your own join conditions.

Rails has always ignored the select argument(s) when using include or includes. If you want to use your select argument then use joins instead.
You might be having a problem with the query gem you're talking about but you can also include sql fragments using the joins method.
Project.select("name").joins(['some sql fragment for users', 'left join companies c on c.id = projects.company_id'])
I don't know your schema so I'd have to guess at the exact relationships, but this should get you started.

I might be totally missing something here but select and include are not a part of ActiveRecord. The usual way to do what you're trying to do is like this:
Project.find(:all, :select => "users.name", :include => [:user, :company], :joins => "LEFT JOIN users on projects.user_id = users.id")
Take a look at the api documentation for more examples. Occasionally I've had to go manual and use find_by_sql:
Project.find_by_sql("select users.name from projects left join users on projects.user_id = users.id")
Hopefully this will point you in the right direction.

I wanted that functionality myself, so please use it.
Include this method in your class
# ACCEPTS args in string format "ASSOCIATION_NAME:COLUMN_NAME-COLUMN_NAME"
def self.includes_with_select(*m)
  association_arr = []
  m.each do |part|
    parts = part.split(':')
    association = parts[0].to_sym
    select_columns = parts[1].split('-')
    association_macro = (self.reflect_on_association(association).macro)
    association_arr << association.to_sym
    class_name = self.reflect_on_association(association).class_name
    self.send(association_macro, association, -> {select *select_columns}, class_name: "#{class_name.to_sym}")
  end
  self.includes(*association_arr)
end
And you will be able to call like: Contract.includes_with_select('user:id-name-status', 'confirmation:confirmed-id'), and it will select those specified columns.
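The argument format is easier to see in isolation. A small sketch of just the parsing step from the method above, where parse_include_spec is a hypothetical helper name and no ActiveRecord is involved:

```ruby
# Hypothetical helper: splits "association:col-col-col" into the association
# name (as a Symbol) and the list of column names (as Strings).
def parse_include_spec(spec)
  association, columns = spec.split(':')
  [association.to_sym, columns.split('-')]
end

specs = ['user:id-name-status', 'confirmation:confirmed-id']
parsed = specs.map { |spec| parse_include_spec(spec) }

parsed.each { |association, columns| puts "#{association}: #{columns.join(', ')}" }
```

One consequence of this format is that column names containing '-' or ':' cannot be expressed.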

The preload solution doesn't seem to do the same JOINs as eager_load and includes, so to get the best of all worlds I also wrote my own, and released it as a part of a data-related gem I maintain, The Brick.
By overriding ActiveRecord::Associations::JoinDependency.apply_column_aliases() like this then when you add a .select(...) then it can act as a filter to choose which column aliases get built out.
With gem 'brick' loaded, in order to enable this selective behaviour, add the special column name :_brick_eager_load as the first entry in your .select(...), which turns on the filtering of columns while the aliases are being built out. Here's an example:
Employee.includes(orders: :order_details)
.references(orders: :order_details)
.select(:_brick_eager_load,
'employees.first_name', 'orders.order_date', 'order_details.product_id')
Because foreign keys are essential to have everything be properly associated, they are automatically added, so you do not need to include them in your select list.
Hope it can save you both query time and some RAM!
