In Rails 4, I have a model Thread which has_many Emails. Each Email has a field named internal_date. I want to return a collection of threads, ordered in a way where the thread with the latest email.internal_date comes first (very similar to how Gmail would sort its inbox).
This is the current line in my controller (not ordering them so far):
@threads = selected_threads.joins(:tags).filter(params_filters).includes(:emails, [some other stuff]).distinct.all.paginate(page: params[:page], :per_page => 10)
I'm doing the joins because of the filtering; and using includes to speed things up.
Ideally I would add a scope order_by_latest_email to my Thread model, without killing the loading time with too many DB queries. Any tips?
Thanks!
I think this is only possible with a really ugly query like so:
Thread.joins(:emails)
.select('threads.*, emails.internal_date')
.joins('LEFT OUTER JOIN emails em ON (emails.internal_date < em.internal_date and emails.thread_id = em.thread_id)')
.where('em.id IS NULL').order('emails.internal_date DESC')
# additional filters here
This is a fairly common SQL problem known as greatest-n-per-group; you can see details in a blog post here.
You need to find the latest email internal date in the group of emails connected to a thread. So what you do is:
compare all the emails with each other
continue until you find one where no other email (where em represents that other email) has a later internal_date (that's what the em.id IS NULL is doing)
order by that email's internal_date.
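To make those steps concrete, here is a minimal plain-Ruby sketch of the same anti-join logic run against an in-memory array instead of the database. The Email struct and the sample rows are made up for illustration; only thread_id and internal_date come from the question.

```ruby
# Hypothetical in-memory stand-in for rows of the emails table.
Email = Struct.new(:id, :thread_id, :internal_date)

emails = [
  Email.new(1, 10, Time.utc(2015, 1, 1)),
  Email.new(2, 10, Time.utc(2015, 3, 1)),
  Email.new(3, 20, Time.utc(2015, 2, 1)),
  Email.new(4, 30, Time.utc(2015, 4, 1)),
  Email.new(5, 30, Time.utc(2015, 1, 15))
]

# Keep an email only if no other email in the same thread is later.
# This mirrors LEFT OUTER JOIN ... WHERE em.id IS NULL in the query.
latest_per_thread = emails.select do |email|
  emails.none? do |other|
    other.thread_id == email.thread_id &&
      email.internal_date < other.internal_date
  end
end

# Order the threads by the date of their latest email, newest first.
ordered_thread_ids = latest_per_thread
  .sort_by(&:internal_date)
  .reverse
  .map(&:thread_id)

p ordered_thread_ids # => [30, 10, 20]
```

The database does the same comparison with a self-join, which is why an index covering thread_id and internal_date is likely to matter for larger tables.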
Assuming this simplified schema:
users has_many discount_codes
discount_codes has_many orders
I want to grab all users, and if they happen to have any orders, only include the orders that were created between two dates. But if they don't have orders, or have orders only outside of those two dates, still return the users and do not exclude any users ever.
What I'm doing now:
users = User.all.includes(discount_codes: :orders)
users = users.where("orders.created_at BETWEEN ? AND ?", date1, date2).
or(users.where(orders: { id: nil }))
I believe my OR clause allows me to retain users who do not have any orders whatsoever, but what happens is if I have a user who only has orders outside of date1 and date2, then my query will exclude that user.
For what it's worth, I want to use this orders where clause here specifically so I can avoid n + 1 issues later in determining orders per user.
Thanks in advance!
It doesn't make sense to try and control the orders that are loaded as part of the where clause for users. If you were to control that it'd have to be part of the includes (which I think means it'd have to be a part of the association).
Although technically it can combine them into a single query in some cases, activerecord is going to do this as two queries.
The first query will be executed when you go to iterate over the users and will use that where clause to limit the users found.
It will then run a second query behind the scenes based on that includes statement. This will simply be a query to get all orders which are associated with the users that were found by the previous query. As such the only way to control the orders that are found through the user's where clause is to omit users from the result set.
If I were you I would create an instance method in User model for what you are looking for but instead of using where use a select block:
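# `end` is a reserved word in Ruby, so the parameters need other names.
def orders_in_timespan(start_time, end_time)
  # Compare each order's created_at, not the order object itself.
  orders.select { |o| o.created_at.between?(start_time, end_time) }
end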
Because ActiveRecord caches the orders loaded by includes on each instance, starting your users query with an includes should mean this does not result in n queries.
Something like:
render json: User.includes(:orders), methods: :orders_in_timespan
Of course, the easiest way to confirm the number of queries is to look at the logs. I believe this approach should have two queries regardless of the number of users being rendered (as likely does your code in the question).
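As a rough illustration of the select-block idea, here is a plain-Ruby sketch with no database involved. The User class, the Order struct and the sample dates are stand-ins I made up; in Rails, orders would be the already-loaded association.

```ruby
# Hypothetical plain-Ruby stand-ins for the models; in Rails, `orders`
# would be the association that includes has already loaded.
Order = Struct.new(:id, :created_at)

class User
  attr_reader :name, :orders

  def initialize(name, orders)
    @name = name
    @orders = orders
  end

  # Filters the in-memory collection, so no further query is needed.
  def orders_in_timespan(start_time, end_time)
    orders.select { |o| o.created_at.between?(start_time, end_time) }
  end
end

date1 = Time.utc(2020, 1, 1)
date2 = Time.utc(2020, 1, 31)

users = [
  User.new("no orders", []),
  User.new("outside range", [Order.new(1, Time.utc(2019, 6, 1))]),
  User.new("inside range", [Order.new(2, Time.utc(2020, 1, 15)),
                            Order.new(3, Time.utc(2020, 3, 1))])
]

# Every user is kept; only the orders list is narrowed.
users.each do |user|
  ids = user.orders_in_timespan(date1, date2).map(&:id)
  puts "#{user.name}: #{ids.inspect}"
end
```

The point of the sketch is that users without matching orders still appear, just with an empty list, which is the behaviour the question asks for.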
Also, I'm not sure how familiar you are with SQL, but you can call .to_sql on relations such as your users variable to see the SQL that would be generated, which might help shed some light on the discrepancies between what you're getting and what you're looking for.
Option 1: Write a custom query in SQL (ugly).
Option 2: Create 2 separate queries like below...
@users = User.limit(10)
@orders = Order.joins(:discount_code)
               .where(created_at: [10.days.ago..1.day.ago], discount_codes: { user_id: @users.select(:id) })
               .group_by { |order| order.discount_code.user_id }
Now you can use it like this ...
@users.each do |user|
  # Users with no orders in the range have no entry in the hash.
  orders = @orders[user.id] || []
  puts user.name
  puts user.id
  puts orders.count
end
I hope this will solve your problem.
You need to use joins instead of includes. Rails joins use inner joins and will reject all the records which don't have associations.
User.joins(discount_codes: :orders).where(orders: {created_at: [10.days.ago..1.day.ago]}).distinct
This will give you all distinct users who placed orders in a given period of time.
user = User.joins(:discount_codes).joins(:orders).where("orders.created_at BETWEEN ? AND ?", date1, date2) +
User.left_joins(:discount_codes).left_joins(:orders).group("users.id").having("count(orders.id) = 0")
I have users, problems, and attempts which is a join table between users and problems. I'm looking to show an index of all the problems along with the current user's most recent attempt for each, if they have one.
I've tried four things to get a left join with conditions and none of them have worked.
The naive approach is something like...
@problems = Problem.enabled
@problems.each do |prob|
  prob.last_attempt = prob.attempts
                          .where(user_id: current_user.id)
                          .last
end
This gets all the problems and the attempts I want but is N+1 queries. So...
@problems = Problem.enabled
                   .includes(:attempts)
This does the left join (or the equivalent two queries) getting all the problems but also all the attempts, not just those for the current user. So...
@problems = Problem.enabled
                   .includes(:attempts)
                   .where(attempts: {user_id: current_user.id})
This gets only those problems that the current user has already attempted.
So...
# problem.rb
has_many :user_attempts,
         -> (user) { where(user_id: user.id) },
         class_name: 'Attempt'
# problem_controller.index
@problems = Problem.enabled
                   .includes(:user_attempts, current_user)
And this gives an error message from Rails saying joins with instance arguments are not supported.
So I'm stuck. What is the best way to do this? Is Arel the right tool? Can I skip active record and just get back a JSON blob? Am I just being dumb?
This question is quite similar to this one, but I'd need an argument to the joined scope, which isn't supported. And I'm hoping Rails has added something in the last couple of years.
Thanks so much for your help.
The way I solved this was to use raw SQL. It's ugly and a security risk, but I didn't find anything better.
results = Problem.connection.exec_query(%(
SELECT *
FROM problems
LEFT JOIN (
SELECT *
-- etc.
)
))
I then manipulate the results array in memory.
I'm trying to sort users based on their most recent response to a certain question in a survey using Rails 5 and PostgreSQL 9.4.5.
So far I've got:
User.includes(responses: [answer: :question]).where(questions: {id: X}).order(...)
Not sure what to put in the order. The responses all have numerical 'scores' representing which answer it is. I'm imagining something at the end like:
.order("answers.score ASC")
But I'm struggling to get the two to attach. I only want to sort the Users by their most recent answer to that specific question. (They can take the survey multiple times)
I'm assuming I need to set a string function in some SELECT, but I'm struggling to wrap my head around it.
Any help is appreciated!
You can print the actual SQL of the rails query like this:
User.includes(responses: [answer: :question]).where(questions: {id: X}).to_sql
Then you can order by the right table.field (find the table name in the SQL returned by to_sql) and the field in db/schema.rb
It should be a created_at. User...order('responses.created_at DESC')
UPDATE
But this will sort all responses and not users by their last response on question, as you've commented below.
In this case you have to:
group the users by their responses
calculate the last response (MAX(responses.created_at)) for each user
sort the users by last response
Something like this:
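User
  # joins rather than includes: eager loading would select columns from the
  # joined tables, which PostgreSQL rejects when grouping by users.id alone
  .joins(responses: [answer: :question])
  .where(questions: {id: X})
  .group('users.id')
  .order('MAX(responses.created_at) DESC')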
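If it helps to see the group / max / sort steps outside SQL, here is a small plain-Ruby sketch over an in-memory array. The Response struct and the sample data are invented for illustration; the second ordering (by the score of the latest response) is my guess at what the question is ultimately after.

```ruby
# Hypothetical stand-in for response rows already filtered to one question.
Response = Struct.new(:user_id, :score, :created_at)

responses = [
  Response.new(1, 3, Time.utc(2017, 1, 1)),
  Response.new(1, 1, Time.utc(2017, 1, 10)),
  Response.new(2, 5, Time.utc(2017, 1, 5)),
  Response.new(3, 2, Time.utc(2017, 1, 20)),
  Response.new(3, 4, Time.utc(2017, 1, 2))
]

# Step 1 and 2: group by user and keep each user's most recent response.
latest = responses
  .group_by(&:user_id)
  .transform_values { |rs| rs.max_by(&:created_at) }

# Step 3: sort users by when they last responded, newest first.
by_recency = latest.sort_by { |_, r| r.created_at }.reverse.map(&:first)

# Variant: sort users by the score of that latest response, ascending.
by_latest_score = latest.sort_by { |_, r| r.score }.map(&:first)

p by_recency      # => [3, 1, 2]
p by_latest_score # => [1, 3, 2]
```

For anything beyond a small data set the database version is preferable, since this sketch loads every response into memory first.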
I have User model that is related to a Friend model (has_many / belongs_to)
After joining, I would like to be able to check if a certain friend object exists in the friends that were joined to users:
users = User.joins(:friends).where("some condition") # subset of total friends
fs = Friend.all
fs.each do |f|
if users.friends.includes?(f) # match!
...
else # no match
...
end
The code as-is does not work and I am having difficulties getting this functionality in code.
Try something like this:
users.friends.where(id: u.id).exists?
That should generate a query like so:
SELECT 1 AS one FROM `users` WHERE `users`.`friend_id` = 42 AND `users`.`id` = 1 LIMIT 1
exists? returns true if a matching row is found and false otherwise.
Side note: Unless you need to use your u variable later, you can probably get away with simply placing some_id directly in the where clause, and not do the second User lookup.
Edit
Just noticed a problem in your loop that might be what is causing your original problem. When you loaded up the list of users, unless you have some limit clause or invoked .first, you'll get back an array of users. So I'm guessing your application is crapping out on this line:
users.friends.includes?(f)
Because .friends is a method of a User object, not of an array.
So you'll have to do a nested loop instead like so:
fs.each do |f|
  users.each do |u|
    # The Ruby method is include?, not includes?
    u.friends.include?(f)
  end
end
Note that this method might be very slow, depending on the number of friends and users. It is a very inefficient algorithm, which is why I'm trying to understand your situation better in the comments, because I'm certain there's a more efficient way to accomplish your task.
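One possible way to avoid the nested loop, sketched in plain Ruby: collect the ids of the friends belonging to the selected users into a Set once, then test each friend against it. The Friend struct and sample data are made up; in Rails the ids might come from something like Friend.where(user_id: users.select(:id)).pluck(:id), but that depends on your schema, so treat it as an assumption.

```ruby
require "set"

# Hypothetical stand-in for Friend records.
Friend = Struct.new(:id, :user_id)

# Friends that belong to the users matched by the earlier query.
joined_friends = [Friend.new(1, 100), Friend.new(2, 100), Friend.new(3, 200)]

# Every friend in the system, including one outside the joined set.
all_friends = joined_friends + [Friend.new(4, 300)]

# Build the lookup once; Set#include? is a hash lookup, not a scan.
joined_ids = joined_friends.map(&:id).to_set

matches, misses = all_friends.partition { |f| joined_ids.include?(f.id) }

p matches.map(&:id) # => [1, 2, 3]
p misses.map(&:id)  # => [4]
```

This replaces one pass over every user per friend with a single lookup per friend.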
I'm building a report in a Ruby on Rails application and I'm struggling to understand how to use a subquery.
Each 'Survey' has_many 'SurveyResponses' and it is simple enough to retrieve these however I need to group them according to one of the fields, 'jobcode', as I only want to report the information relating to a single jobcode in one line in the report.
However I also need to know the constituent data that makes up the totals for that jobcode. The reason for this is that I need to calculate data such as medians and standard deviations and so need to know the values that make the total.
My thinking is that I retrieve the distinct jobcodes that were reported on for the survey and then as I loop through these I retrieve the individual responses for each jobcode.
Is this the correct way to do this or should I follow a different method?
You could use a named scope to simplify getting the groups of responses:
named_scope :job_group, lambda{|job_code| {:conditions => ["job_code = ?", job_code]}}
Put that in your response model, and use it like this:
job.responses.job_group('some job code')
and you'll get an array of responses. If you want the raw values of one of the attributes on the responses (for example, to compute a mean from them), you can use map:
r = job.responses.job_group('some job code')
r.map(&:total)
=> [1, 5, 3, 8]
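Once you have the values as an array, the median and standard deviation mentioned in the question can be computed in Ruby. This is only a sketch: the SurveyResponse struct, the sample rows and the helper names are mine, and std_dev here is the population form (divide by n), which may not be the one your report needs.

```ruby
# Hypothetical stand-in for survey response rows.
SurveyResponse = Struct.new(:jobcode, :total)

def mean(values)
  values.sum.to_f / values.size
end

def median(values)
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Population standard deviation (divides by n, not n - 1).
def std_dev(values)
  m = mean(values)
  Math.sqrt(values.sum { |v| (v - m)**2 } / values.size)
end

responses = [
  SurveyResponse.new("A1", 1), SurveyResponse.new("A1", 5),
  SurveyResponse.new("A1", 3), SurveyResponse.new("A1", 8),
  SurveyResponse.new("B2", 4), SurveyResponse.new("B2", 6),
  SurveyResponse.new("B2", 11)
]

# One report line per jobcode, built from its constituent values.
stats = responses.group_by(&:jobcode).transform_values do |rs|
  totals = rs.map(&:total)
  { count: totals.size, mean: mean(totals),
    median: median(totals), std_dev: std_dev(totals) }
end

stats.each { |jobcode, s| puts "#{jobcode}: #{s}" }
```

Grouping in Ruby like this needs only one query for all responses of a survey, rather than one query per jobcode.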
Alternatively, you might find it quicker to write custom SQL in order to get the mean / average / sum of groups of attributes. Going through rails for this sort of work may cause significant lag.
ActiveRecord::Base.connection.execute("Custom SQL here")
You can also use Model.find_by_sql()
For example:
class User < ActiveRecord::Base
# Your usual AR model
end
...
def index
@users = User.find_by_sql "select * from users"
# etc
end