I need to retrieve a list of collections, i.e. @medications, @treatments, @therapies etc., with a count of each collection's related records.
This works but creates the initial query and then a new query for each related record count. Is there a way I can minimize the number of queries?
@medications = Medication.includes(:records).select(:id, :name).where(office_id: current_user.selected_office)
@medications.each do |medication|
  medication.record_count = medication.records.count
end
If the @medications query has 10 results, I have a total of 11 queries. I need 10 collections with related record counts, so I would end up with 110 queries per request.
All models have the same attributes (name, office_id, etc.).
I am wondering how I can restructure the database, or the query, to handle this better. Incidentally, I am using PostgreSQL 9.6.
What about using joins and count?
Medication.left_outer_joins(:records)
  .select('medications.name, medications.id, COUNT(records.id) AS records_count')
  .group(:id)
  .where(office_id: current_user.selected_office)
If you're planning to count the records per medication, you can use joins together with a COUNT over the records table, assigning the result an alias.
Because each model has name and id columns, you need to qualify them with the table name. The group is, as far as I know (correct me if I'm wrong), mandatory in this case; otherwise you'd get a PG::GroupingError.
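To make the LEFT OUTER JOIN plus COUNT(records.id) behaviour concrete, here is a rough plain-Ruby imitation of what the grouped query computes. The sample data and variable names are made up for illustration, and no database is involved; it is only meant to show why a medication without records still appears, with a count of 0.

```ruby
# Plain-Ruby stand-ins for the two tables (hypothetical sample data).
medications = [
  { id: 1, name: "Aspirin" },
  { id: 2, name: "Ibuprofen" }
]
records = [
  { id: 10, medication_id: 1 },
  { id: 11, medication_id: 1 }
]

# LEFT OUTER JOIN: every medication yields at least one row; a medication
# without records is paired with nil, like the NULL-filled row in SQL.
joined = medications.flat_map do |med|
  matches = records.select { |r| r[:medication_id] == med[:id] }
  matches = [nil] if matches.empty?
  matches.map { |rec| [med, rec] }
end

# GROUP BY medications.id with COUNT(records.id): NULLs are not counted.
records_count = joined
  .group_by { |med, _rec| med[:id] }
  .transform_values { |rows| rows.count { |_med, rec| !rec.nil? } }

p records_count
```

With an inner join the second medication would drop out of the result entirely, which is why the answer reaches for left_outer_joins.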
Related
Assuming this simplified schema:
users has_many discount_codes
discount_codes has_many orders
I want to grab all users, and if they happen to have any orders, only include the orders that were created between two dates. But if they don't have orders, or have orders only outside of those two dates, still return the users and do not exclude any users ever.
What I'm doing now:
users = User.all.includes(discount_codes: :orders)
users = users.where("orders.created_at BETWEEN ? AND ?", date1, date2).
  or(users.where(orders: { id: nil }))
I believe my OR clause allows me to retain users who do not have any orders whatsoever, but what happens is if I have a user who only has orders outside of date1 and date2, then my query will exclude that user.
For what it's worth, I want to use this orders where clause here specifically so I can avoid n + 1 issues later in determining orders per user.
Thanks in advance!
It doesn't make sense to try and control the orders that are loaded as part of the where clause for users. If you were to control that it'd have to be part of the includes (which I think means it'd have to be a part of the association).
Although ActiveRecord can technically combine them into a single query in some cases, it is going to do this as two queries.
The first query will be executed when you go to iterate over the users and will use that where clause to limit the users found.
It will then run a second query behind the scenes based on that includes statement. This will simply be a query to get all orders which are associated with the users that were found by the previous query. As such the only way to control the orders that are found through the user's where clause is to omit users from the result set.
If I were you, I would create an instance method in the User model for what you are looking for, but use a select block instead of where:
def orders_in_timespan(start_time, end_time)
  # `end` is a reserved word in Ruby, so the parameters are renamed; compare on created_at
  orders.select { |o| o.created_at.between?(start_time, end_time) }
end
Because ActiveRecord caches the orders found by the includes on each user instance, starting your users query with an includes means that, I believe, this will not result in n queries.
Something like the following (note that methods: calls the method without arguments, so the time span would need default values):
render json: User.includes(:orders), methods: :orders_in_timespan
Of course, the easiest way to confirm the number of queries is to look at the logs. I believe this approach should have two queries regardless of the number of users being rendered (as likely does your code in the question).
Also, I'm not sure how familiar you are with SQL, but you can call .to_sql on things such as your users variable to see the SQL that would be generated. That might help shed some light on the discrepancies between what you're getting and what you're looking for.
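Here is a small plain-Ruby sketch of the in-memory filtering idea behind orders_in_timespan. The Order struct and the stripped-down User class are hypothetical stand-ins for the real models (no ActiveRecord involved), and the orders array plays the role of the collection that includes would already have loaded.

```ruby
# Hypothetical stand-ins for the models; only created_at matters here.
Order = Struct.new(:id, :created_at)

class User
  attr_reader :orders

  def initialize(orders)
    @orders = orders # pretend these were already loaded by includes
  end

  # Filters the in-memory collection, so no further query would be needed.
  def orders_in_timespan(from, to)
    orders.select { |o| o.created_at.between?(from, to) }
  end
end

user = User.new([
  Order.new(1, Time.utc(2020, 1, 5)),
  Order.new(2, Time.utc(2020, 3, 1))
])

in_range = user.orders_in_timespan(Time.utc(2020, 1, 1), Time.utc(2020, 1, 31))
out_of_range = user.orders_in_timespan(Time.utc(2021, 1, 1), Time.utc(2021, 12, 31))

p in_range.map(&:id)
p out_of_range
```

The point is that a user whose orders all fall outside the range is still there; the method just returns an empty array for them instead of the user being filtered out by the where clause.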
Option 1: Write a custom query in SQL (ugly).
Option 2: Create 2 separate queries like below...
@users = User.limit(10)
@orders = Order.joins(:discount_code)
  .where(created_at: 10.days.ago..1.day.ago, discount_codes: { user_id: @users.select(:id) })
  .group_by { |order| order.discount_code.user_id }
Now you can use it like this ...
@users.each do |user|
  orders = @orders[user.id] || [] # users without orders in the range have no key
  puts user.name
  puts user.id
  puts orders.count
end
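One thing to watch with a hash built by group_by: a user with no matching orders has no key at all, so a plain lookup returns nil. A minimal plain-Ruby sketch, with a hypothetical Order struct and made-up ids standing in for the real records:

```ruby
# Hypothetical stand-in for order rows that already carry their user's id.
Order = Struct.new(:id, :user_id)

orders_by_user = [Order.new(1, 7), Order.new(2, 7)].group_by(&:user_id)

user_ids = [7, 8] # user 8 has no orders in the range

# fetch with a default avoids calling a method on nil for user 8.
counts = user_ids.map { |uid| [uid, orders_by_user.fetch(uid, []).size] }.to_h

p counts
```

Using fetch(uid, []) (or `|| []`) keeps every user in the output with a count of zero, which is what the question asks for.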
I hope this will solve your problem.
You need to use joins instead of includes. Rails joins use inner joins and will reject all the records which don't have associations.
User.joins(discount_codes: :orders).where(orders: {created_at: [10.days.ago..1.day.ago]}).distinct
This will give you all distinct users who placed orders in a given period of time.
user = User.joins(:discount_codes).joins(:orders).where("orders.created_at BETWEEN ? AND ?", date1, date2) +
User.left_joins(:discount_codes).left_joins(:orders).group("users.id").having("count(orders.id) = 0")
How can I speed up the following query? I'm looking to find records with 6 or fewer unique values of fb_id. The select doesn't seem to add much in terms of time; it's the group and count that are slow. Is there an alternate way to query? I added an index on fb_id and it only sped up the query by 50%.
FbGroupApplication.group(:fb_id).where.not(
  fb_id: _get_exclude_fb_group_ids
).order(
  "count_fb_id desc"
).count(
  "fb_id"
).select { |k, v| v <= 6 }
The query is looking for FbGroupApplications that have 6 or fewer applications to the same fb_id.
Calling count executes the SQL and returns a Hash mapping each fb_id to its count. Passing a block to select then filters that Hash in Ruby, so the database has to compute and send back the count for every group, including the ones you then discard. This whole process is costly (Ruby is not good at this).
You can "delegate" the responsibility of comparing the count against 6 to the database with a having clause:
FbGroupApplication
  .group(:fb_id)
  .where.not(fb_id: _get_exclude_fb_group_ids)
  .having('count(fb_id) <= 6')
  .count('fb_id') # returns a Hash of fb_id => count, already filtered by the database
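For reference, this is roughly what the Ruby-side filter from the question does once the grouped count has come back. The hash below is made-up sample data in the shape a grouped count returns; the HAVING clause performs the same comparison, only inside the database before any rows are sent.

```ruby
# Made-up sample in the shape returned by a grouped count: { fb_id => count }.
counts = { "g1" => 3, "g2" => 9, "g3" => 6 }

# Hash#select keeps the pairs for which the block is truthy and returns a Hash.
small_groups = counts.select { |_fb_id, n| n <= 6 }

p small_groups
```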
I need to find 10 records first and then apply ordering on that.
Model.all.limit(10).order('cast(code as integer)')
The result of the above code is that the order is applied to the whole table first and the limit afterwards. So I get the same codes in my listing for the given model. But I want to apply the limit first and then order the fetched result.
In older Rails versions (before 4), calling .all on a model executes the query and returns all records. To apply a limit you have to write it before .all, as in Model.limit(10).all, but after that you can't use SQL functions to operate on the data. So to get the first 10 records and then order them, try this:
records = Model.limit(10).all.sort_by{|o| o.code.to_i}
or
records = Model.first(10).sort_by{|o| o.code.to_i}
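A quick plain-Ruby illustration of the "limit first, then sort in memory" idea. The Item struct and the sample codes are hypothetical; first(3) stands in for the rows the limited query would return, and to_i does the job of the SQL cast.

```ruby
# Hypothetical stand-in for model rows whose code column is stored as text.
Item = Struct.new(:code)

rows = %w[10 9 100 2].map { |c| Item.new(c) }

limited = rows.first(3) # plays the role of the limited result set

numeric = limited.sort_by { |o| o.code.to_i }.map(&:code)
textual = limited.sort_by(&:code).map(&:code)

p numeric
p textual
```

Sorting on the raw string puts "100" before "9", which is why the answers convert the code to an integer before comparing.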
Try this:
Model.limit(10).sort{|p, q| q.cost <=> p.cost}
All you need to do is to remove the .all method:
Model.limit(10).order('cast(code as integer)')
If you want to get 10 random records and then sort them without getting all records from the database then you could use
Model.limit(10).order("RANDOM()")
This uses the PostgreSQL RANDOM() function to randomise the records.
Now you have an array of 10 random records and you can use .sort_by as described by BitOfUniverse.
I have a habtm relationship between my Product and Category model.
I'm trying to write a query that searches for products with minimum of 2 categories.
I got it working with the following code:
p = Product.joins(:categories).group("product_id").having("count(product_id) > 1")
p.length # 178
When iterating on it though, for each time I call product.categories, it will do a new call to the database - not good. I want to prevent these calls and have the same result. Doing more research I've seen that I could include (includes) my categories table and it would load all the table in memory so it's not necessary to call the database again when iterating. So I got it working with the following code:
p2 = Product.includes(:categories).joins(:categories).group("product_id").having("count(product_id) > 1")
p2.length # 178 - I compared and the objects are the same as last query
Here's what I am confused about:
p.first.eql? p2.first # true
p.first.categories.eql? p2.first.categories # false
p.first.categories.length # 2
p2.first.categories.length # 1
Why, with the includes query, do I get the right objects but not the right categories relationship?
It has something to do with the group method. Your p2 only contains the first category for each product.
You could break this up into two queries:
product_ids = Product.joins(:categories).group("product_id").having("count(product_id) > 1").pluck(:product_id)
result = Product.includes(:categories).find(product_ids)
Yeah, you hit the database twice, but at least you don't go to the database when you're iterating.
You should know that includes doesn't play well with joins (joins will just suppress the former).
Also, when you include an association, ActiveRecord figures out whether to use eager_load (with a left join) or preload (with a separate query). includes is just a wrapper for one of those two.
The thing is, preload plays well with joins! So you can do this:
products = Product.preload(:categories). # this will trigger a separate query
joins(:categories). # this will build the relevant query
group("products.id").
having("count(product_id) > 1").
select("products.*")
Note that this will also hit the database twice, but you will not have an N+1 query problem.
I'm trying to count users who have dogs.
User.joins(:pets).where("pets.type = ?", :dog).count
This returns the count of the users and their dogs combined. Instead, I just want the count of actual users.
What am I doing wrong?
Update
I've also tried to just fetch the users using the above query and it returns an array of the same users repeated multiple times depending on how many dogs they have.
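Try this (the distinct: option was removed in Rails 4.1; on newer versions, chain .distinct before .count instead):
User.joins(:pets).where("pets.type = ?", :dog).count(distinct: true)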
See api doc.
The joins returns duplicate user rows whenever a user has multiple matching associations.
For example, if user U1 has one dog and user U2 has two dogs, then a total of three rows would be returned by the joins. So instead of using joins, try the includes option.
Refer to this Railscast, http://railscasts.com/episodes/181-include-vs-joins for more.
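The U1/U2 example can be mimicked in plain Ruby to show where the inflated count comes from. The hashes below are made-up sample rows, not ActiveRecord objects; the map over matching pets imitates an inner join, which yields one row per matching pet rather than one per user.

```ruby
# Made-up sample rows: U1 has one dog, U2 has two dogs and a cat.
users = [{ id: 1, name: "U1" }, { id: 2, name: "U2" }]
pets = [
  { user_id: 1, type: "dog" },
  { user_id: 2, type: "dog" },
  { user_id: 2, type: "dog" },
  { user_id: 2, type: "cat" }
]

# Inner join filtered to dogs: one user row per matching pet.
joined_users = pets
  .select { |pet| pet[:type] == "dog" }
  .map { |pet| users.find { |u| u[:id] == pet[:user_id] } }

row_count = joined_users.size                                 # what a plain COUNT sees
distinct_count = joined_users.map { |u| u[:id] }.uniq.size    # what a DISTINCT count sees

p row_count
p distinct_count
```

Counting distinct user ids collapses the repeated U2 rows, which is the effect the distinct count in the first answer is after.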