I'm trying to find the best way to count the number of Users who have one (or many) instances of a has_many relation.
For example, User has_many :bank_accounts and :credit_accounts (and a few other relations). I want to find the number of unique Users who have at least one bank_account and at least one credit_account, and ideally implement this inside of a scope so I can run where queries on it.
At the moment I'm implementing it (poorly) using the following code:
(BankAccount.select(:user_id).uniq + CreditAccount.select(:user_id) + ...).uniq.count
I've played around a lot with some joins, however I'm not getting any results. For example, I've toyed around a lot with different forms of User.joins(:bank_accounts, :credit_accounts).uniq('users.id').count however I don't appear to be getting any results.
Any help would be greatly appreciated, thanks!
If you are fine with using normal sql. You can use the below query
select distinct(user_id) from
(select user_id from bank_accounts union select user_id from credit_accounts) a;
I am not sure if a rails way exists for this.
In this case all we need is an INNER JOIN of users with credit_accounts and bank_accounts.
User.joins(:credit_accounts, :bank_accounts).uniq.count
The above query works for me. The sql generated by this query is below
"SELECT DISTINCT COUNT(DISTINCT `users`.`id`) FROM `users`.* FROM `users` INNER JOIN `credit_accounts` ON `credit_accounts`.`user_id` = `users`.`id` INNER JOIN `bank_accounts` ON `bank_accounts`.`user_id` = `users`.`id`"
Related
There are three models that matter here: Objective, Student, and Seminar. All are associated with has_and_belongs_to_many.
There is an ObjectiveStudent join model that includes columns "ready" and "points_all_time". There is an ObjectiveSeminar join model that includes column "priority".
I need to collect all of the objectives that are associated with a given student and also with a given seminar.
They need to also be marked with a "priority" above zero in the seminar. So I think I need this line:
obj_sems = ObjectiveSeminar.where(:seminar => given_seminar).where("priority > ?", 0)
Finally, they need to also be objectives where the student is ready, but has not scored above 7. So I think I need this line:
obj_studs = ObjectiveStudent.where(:user => given_student, :ready => true).where("points_all_time <= ?", 7)
Is there a way to gather all the objectives whose join table records appear in both of the above queries? Note that neither of the lists return objectives; they return objective_seminars, and objective_students, respectively. My end goal is to collect the objectives that meet all of the above criteria.
Or am I approaching this all wrong?
Bonus question: I would also love to sort the objectives by their priority in the given seminar. But I'm afraid that would add too much to the database load. What are your thoughts on this?
Thank you in advance for any insight.
In order to get Objectives you'll need to start your query from that.
In order to query with an AND condition the associated tables, you'll need inner joins with these tables.
Finally you'll need a distinct operator to only fetch each objective once.
The extended version of what (I think) you need is:
Objective.joins(objective_seminars: :seminar, objective_student: :student).
where(seminars: seminar_search_params, strudents: student_search_params).
where('objective_seminars.priority > 0').
where('objective_students.ready = 1 AND points_all_time <= 7').
order('objective_seminars.priority ASC').
distinct
Now for the database load it all depends on your indexes and the size of your tables.
The above query will translate to the following SQL (or something similar).
SELECT DISTINCT objectives.* FROM objectives
INNER JOIN objective_students ON objective_students.objective_id = objectives.id
INNER JOIN students ON students.id = objective_students.student_id
INNER JOIN objective_seminars ON objective_seminars.objective_id = objectives.id
INNER JOIN seminars ON seminars.id = objective_seminars.seminar_id
WHERE seminars_query AND
students_query AND
objective_seminars.priority > 0 AND
objective_students.ready = 1 AND points_all_time <= 7 AND
objective_seminars.priority ASC
So you'll need to add or extend your indexes so that all 5 tables queries can have an index helping out. The actual index implementation is up to you and depends on your application's specific (read - write load, tables size, cardinality etc)
I want to expand this question.
order by foreign key in activerecord
I'm trying to order a set of records based on a value in a really large table.
When I use join, it brings all the "other" records data into the objects.. As join should..
#table users 30+ columns
#table bids 5 columns
record = Bid.find(:all,:joins=>:users, :order=>'users.ranking DESC' ).first
Now record holds 35 fields..
Is there a way to do this without the join?
Here's my thinking..
With the join I get this query
SELECT * FROM "bids"
left join users on runner_id = users.id
ORDER BY ranking LIMIT 1
Now I can add a select to the code so I don't get the full user table, but putting a select in a scope is dangerous IMHO.
When I write sql by hand.
SELECT * FROM bids
order by (select users.ranking from users where users.id = runner_id) DESC
limit 1
I believe this is a faster query, based on the "explain" it seems simpler.
More important than speed though is that the second method doesn't have the 30 extra fields.
If I build in a custom select inside the scope, it could explode other searches on the object if they too have custom selects (there can be only one)
What you would like to achieve in active record writing is something along
SELECT b.* from bids b inner join users u on u.id=b.user_id order by u.ranking desc
In active record i would write such as:
Bids.joins("inner join users u on bids.user_id=u.id").order("u.ranking desc")
I think it's the only to make a join without fetching all attributes from the user models.
So, I'm doing something like:
user.students.includes(:exams).ungraded.paginate(:page =>
params[:page]).order("exams.created_at desc")
However, this causes a subtle problem. Somewhere in the guts of active record, the limit makes it do a distinct on the student id's, like this:
SELECT DISTINCT "students".id, exams.id AS alias_0 FROM "students"
LEFT OUTER JOIN "exams" ON "exams"."student_id" = "students"."id"
WHERE "students"."ready_for_grading" = 't' ORDER BY exams.id LIMIT 10 OFFSET 0;
However, this may cause results like:
id | alias_0
----+---------
42 | 256
42 | 257
42 | 260
See the problem? Eventually the limit kicks in and we don't fetch as many student id's as we were supposed to because we've "used them up" by selecting both the student id's and the exam ids, even though we really only want the exam id's for ordering.
This is Rails 3.2.1, and PostgreSQL 9.1.
Edit
I think what is happening is that paginate is using the query to get a list of students, which it then feeds to a second query, but because of the left outer join, we're not getting distinct results for the students, so it 'underfills' the 10 slots we have and generally confuses things. I think this is a bug somewhere, but I'm not sure who to pin it on.
Ok, I finally figured it out:
user.students.joins("left outer join exams on exams.student_id = students.id").ungraded.paginate(:page =>
params[:page]).order("exams.created_at desc")
Seems to work. I'm not sure why this works out better than using 'includes', but it does.
So I'm getting a bunch of Volunteers records, with some filtering and sorting, which is fine. But I'd like to also get a count of the number of Children each volunteer is helping (using volunteer_id on children table), as a sub-query in the select clause to avoid having to perform a separate query for each record. As a bonus it would be good to be able to sort by this count too!
I'd like to end up with a generated query like this and be able to access the 'kids' column:
SELECT id, name, (SELECT COUNT(*) FROM children WHERE volunteer_id = volunteers.id) AS kids FROM volunteers
Is there any way of doing this with Arel? I've had a bit of a scout around and haven't found anything yet.
Alternatively, is it possible to join to the children table and get: count(children.id) ?
Thanks for any help :)
The proper way of doing this with SQL is with a GROUP BY clause:
SELECT v.id, v.name, COUNT(*) AS kids
FROM volunteers v
LEFT OUTER JOIN children c ON v.id = c.volunteer_id
GROUP BY v.id, v.name
There is a method .group() in AR for using GROUP BY queries.
Can anyone explain this?
Project.includes([:user, :company])
This executes 3 queries, one to fetch projects, one to fetch users for those projects and one to fetch companies.
Project.select("name").includes([:user, :company])
This executes 3 queries, and completely ignores the select bit.
Project.select("user.name").includes([:user, :company])
This executes 1 query with proper left joins. And still completely ignores the select.
It would seem to me that rails ignores select with includes. Ok fine, but why when I put a related model in select does it switch from issuing 3 queries to issuing 1 query?
Note that the 1 query is what I want, I just can't imagine this is the right way to get it nor why it works, but I'm not sure how else to get the results in one query (.joins seems to only use INNER JOIN which I do not in fact want, and when I manually specifcy the join conditions to .joins the search gem we're using freaks out as it tries to re-add joins with the same name).
I had the same problem with select and includes.
For eager loading of associated models I used native Rails scope 'preload' http://apidock.com/rails/ActiveRecord/QueryMethods/preload
It provides eager load without skipping of 'select' at scopes chain.
I found it here https://github.com/rails/rails/pull/2303#issuecomment-3889821
Hope this tip will be helpful for someone as it was helpful for me.
Allright so here's what I came up with...
.joins("LEFT JOIN companies companies2 ON companies2.id = projects.company_id LEFT JOIN project_types project_types2 ON project_types2.id = projects.project_type_id LEFT JOIN users users2 ON users2.id = projects.user_id") \
.select("six, fields, I, want")
Works, pain in the butt but it gets me just the data I need in one query. The only lousy part is I have to give everything a model2 alias since we're using meta_search, which seems to not be able to figure out that a table is already joined when you specify your own join conditions.
Rails has always ignored the select argument(s) when using include or includes. If you want to use your select argument then use joins instead.
You might be having a problem with the query gem you're talking about but you can also include sql fragments using the joins method.
Project.select("name").joins(['some sql fragement for users', 'left join companies c on c.id = projects.company_id'])
I don't know your schema so i'd have to guess at the exact relationships but this should get you started.
I might be totally missing something here but select and include are not a part of ActiveRecord. The usual way to do what you're trying to do is like this:
Project.find(:all, :select => "users.name", :include => [:user, :company], :joins => "LEFT JOIN users on projects.user_id = users.id")
Take a look at the api documentation for more examples. Occasionally I've had to go manual and use find_by_sql:
Project.find_by_sql("select users.name from projects left join users on projects.user_id = users.id")
Hopefully this will point you in the right direction.
I wanted that functionality myself,so please use it.
Include this method in your class
#ACCEPTS args in string format "ASSOCIATION_NAME:COLUMN_NAME-COLUMN_NAME"
def self.includes_with_select(*m)
association_arr = []
m.each do |part|
parts = part.split(':')
association = parts[0].to_sym
select_columns = parts[1].split('-')
association_macro = (self.reflect_on_association(association).macro)
association_arr << association.to_sym
class_name = self.reflect_on_association(association).class_name
self.send(association_macro, association, -> {select *select_columns}, class_name: "#{class_name.to_sym}")
end
self.includes(*association_arr)
end
And you will be able to call like: Contract.includes_with_select('user:id-name-status', 'confirmation:confirmed-id'), and it will select those specified columns.
The preload solution doesn't seem to do the same JOINs as eager_load and includes, so to get the best of all worlds I also wrote my own, and released it as a part of a data-related gem I maintain, The Brick.
By overriding ActiveRecord::Associations::JoinDependency.apply_column_aliases() like this then when you add a .select(...) then it can act as a filter to choose which column aliases get built out.
With gem 'brick' loaded, in order to enable this selective behaviour, add the special column name :_brick_eager_load as the first entry in your .select(...), which turns on the filtering of columns while the aliases are being built out. Here's an example:
Employee.includes(orders: :order_details)
.references(orders: :order_details)
.select(:_brick_eager_load,
'employees.first_name', 'orders.order_date', 'order_details.product_id')
Because foreign keys are essential to have everything be properly associated, they are automatically added, so you do not need to include them in your select list.
Hope it can save you both query time and some RAM!