There is an association query I can't seem to write without triggering an N+1 query.
Suppose I host Parties. I have many Friends, and each time a friend comes to a party, they create a Presence.
And so:
Presence.belongs_to :party
Presence.belongs_to :friend
Friend.has_many :presences
Party.has_many :presences
So far so good.
I want to obtain a list of every one of my Friends, knowing whether or not they are present at this Party, without triggering an N+1 query.
My dataset would look like this:
friends: [
{name: "Dave Vlopment", presences: [{created_at: "8pm", party_id: 2012}]},
{name: "Brett E. Hardproblem", presences: [nil]},
{name: "Ann Plosswan-Quarry", presences: [{created_at: "10pm", party_id: 2012}]},
...
]
and so on.
I have a lot of friends and do a lot of parties, of course. (This is of course a fictional example.)
I would do:
Friend.all.includes(:presences).map{ |them| them.presences }
# But then, `them.presences` is not filtered to tonight's party.
Friend.all.includes(:presences).map{ |them| them.presences.where(party_id: pid) }
# And there I have an N+1.
I could always filter at the Ruby layer:
Friend.all.includes(:presences).map{ |them| them.presences.select{ |it| it.party_id == party.id } }
But this works pretty badly with as_json(include: {}) and so on, and I'm finding it very error-prone since I'll be making calculations on the results.
And I make a lot of parties, you know? (still fictional)
If I where on the first query, I lose the left join:
Friend.all.includes(:presences).where(presences: { party_id: party.id })
I have no idea that tonight, Brett and a bunch of friends, who are always there, are absent. (this one is not guaranteed to be a fictional experience)
I will only see friends who are present.
And if I go through party, well of course I will not see who is absent either.
Now I know there are ways I can do this in SQL, and other ways we can wrangle around some Ruby to pull it together.
However, I'm looking for a "first-class" way to do this in ActiveRecord, without getting N+1s.
Is there a way to do this using only the ActiveRecord tools? I haven't found anything yet.
I'm not sure whether this meets your expectation of a "first-class" way or not,
but you can use this approach to avoid the N+1:
# fetch all friends
friends = Friend.all
# fetch all presences. grouped by friend_id
grouped_presences = Presence.all.group_by(&:friend_id)
# arrange data
data = []
friends.each do |friend|
json = friend.as_json
json["presences"] = grouped_presences[friend.id].as_json
data << json
end
puts data
It only executes 2 queries
SELECT `friends`.* FROM `friends`
SELECT `presences`.* FROM `presences`
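If only one party matters, the second query could be narrowed to that party (for example with `Presence.where(party_id: party.id)`) before grouping. The sketch below shows just the grouping step in plain Ruby, so it runs without a database. `FriendRow`, `PresenceRow`, and the sample data are hypothetical stand-ins for the real records, not part of the original models.

```ruby
# Plain-Ruby stand-ins for ActiveRecord rows (hypothetical names).
FriendRow   = Struct.new(:id, :name)
PresenceRow = Struct.new(:friend_id, :party_id, :created_at)

friends = [
  FriendRow.new(1, "Dave Vlopment"),
  FriendRow.new(2, "Brett E. Hardproblem"),
  FriendRow.new(3, "Ann Plosswan-Quarry")
]

# Pretend this is the result of one query already filtered to party 2012.
presences = [
  PresenceRow.new(1, 2012, "8pm"),
  PresenceRow.new(3, 2012, "10pm")
]

# Index presences by friend once, then look each friend up in the hash.
grouped = presences.group_by(&:friend_id)

data = friends.map do |friend|
  {
    name: friend.name,
    # Absent friends get an empty list instead of nil.
    presences: grouped.fetch(friend.id, []).map(&:to_h)
  }
end
```

Using `fetch` with a default keeps absent friends in the list with an empty `presences` array, which is the "left join" behaviour the question asks for.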
Assuming this simplified schema:
users has_many discount_codes
discount_codes has_many orders
I want to grab all users, and if they happen to have any orders, only include the orders that were created between two dates. But if they don't have orders, or have orders only outside of those two dates, still return the users and do not exclude any users ever.
What I'm doing now:
users = User.all.includes(discount_codes: :orders)
users = users.where("orders.created_at BETWEEN ? AND ?", date1, date2).
  or(users.where(orders: { id: nil }))
I believe my OR clause allows me to retain users who do not have any orders whatsoever, but what happens is if I have a user who only has orders outside of date1 and date2, then my query will exclude that user.
For what it's worth, I want to use this orders where clause here specifically so I can avoid n + 1 issues later in determining orders per user.
Thanks in advance!
It doesn't make sense to try and control the orders that are loaded as part of the where clause for users. If you were to control that it'd have to be part of the includes (which I think means it'd have to be a part of the association).
Although it can technically combine them into a single query in some cases, ActiveRecord will generally run this as two queries.
The first query will be executed when you go to iterate over the users and will use that where clause to limit the users found.
It will then run a second query behind the scenes based on that includes statement. This will simply be a query to get all orders which are associated with the users that were found by the previous query. As such the only way to control the orders that are found through the user's where clause is to omit users from the result set.
If I were you, I would add an instance method to the User model for what you are looking for, but use a select block instead of where:
def orders_in_timespan(start, finish)
  # `end` is a reserved word in Ruby, so the second parameter is named `finish`
  orders.select{ |o| o.created_at.between?(start, finish) }
end
Because ActiveRecord caches the orders loaded by includes on each instance, starting your users query with includes should mean this does not result in n queries.
Something like:
render json: User.includes(:orders), methods: :orders_in_timespan
Of course, the easiest way to confirm the number of queries is to look at the logs. I believe this approach should have two queries regardless of the number of users being rendered (as likely does your code in the question).
Also, I'm not sure how familiar you are with SQL, but you can call .to_sql on the end of things such as your users variable to see the SQL that would be generated, which might help shed some light on the discrepancies between what you're getting and what you're looking for.
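The in-memory filtering idea from this answer can be sketched in plain Ruby. `UserRow` and `OrderRow` below are hypothetical stand-ins for a `User` whose orders were already loaded; the point is only that every user survives and just the order lists are filtered.

```ruby
# Hypothetical stand-in for an Order row.
OrderRow = Struct.new(:id, :created_at)

# Hypothetical stand-in for a User whose orders were already loaded.
class UserRow
  attr_reader :name, :orders

  def initialize(name, orders)
    @name = name
    @orders = orders
  end

  # Filters the already-loaded orders in memory; no further query is involved.
  def orders_in_timespan(start, finish)
    orders.select { |o| o.created_at.between?(start, finish) }
  end
end

users = [
  UserRow.new("no orders", []),
  UserRow.new("old orders only", [OrderRow.new(1, Time.utc(2020, 1, 5))]),
  UserRow.new("recent orders", [OrderRow.new(2, Time.utc(2021, 3, 1)),
                                OrderRow.new(3, Time.utc(2019, 6, 1))])
]

date1 = Time.utc(2021, 1, 1)
date2 = Time.utc(2021, 12, 31)

# Every user stays in the result; only the order lists are filtered.
report = users.map { |u| [u.name, u.orders_in_timespan(date1, date2).map(&:id)] }.to_h
```

The trade-off is that all orders are loaded into memory before filtering, which may matter if users have many orders outside the window.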
Option 1: Write a custom query in SQL (ugly).
Option 2: Create 2 separate queries like below...
@users = User.limit(10)
@orders = Order.joins(:discount_code)
  .preload(:discount_code) # avoid one discount_code lookup per order in group_by
  .where(created_at: 10.days.ago..1.day.ago, discount_codes: {user_id: @users.select(:id)})
  .group_by{|order| order.discount_code.user_id}
Now you can use it like this ...
@users.each do |user|
  orders = @orders.fetch(user.id, []) # users without matching orders get an empty list
  puts user.name
  puts user.id
  puts orders.count
end
I hope this will solve your problem.
You need to use joins instead of includes. Rails joins use inner joins and will reject all the records which don't have associations.
User.joins(discount_codes: :orders).where(orders: {created_at: [10.days.ago..1.day.ago]}).distinct
This will give you all distinct users who placed orders in a given period of time.
user = User.joins(:discount_codes).joins(:orders).where("orders.created_at BETWEEN ? AND ?", date1, date2) +
User.left_joins(:discount_codes).left_joins(:orders).group("users.id").having("count(orders.id) = 0")
I have users, problems, and attempts which is a join table between users and problems. I'm looking to show an index of all the problems along with the current user's most recent attempt for each, if they have one.
I've tried four things to get a left join with conditions and none of them have worked.
The naive approach is something like...
@problems = Problem.enabled
@problems.each do |prob|
  prob.last_attempt = prob.attempts
    .where(user_id: current_user.id)
    .last
end
This gets all the problems and the attempts I want but is N+1 queries. So...
@problems = Problem.enabled
  .includes(:attempts)
This does the left join (or the equivalent two queries) getting all the problems but also all the attempts, not just those for the current user. So...
@problems = Problem.enabled
  .includes(:attempts)
  .where(attempts: {user_id: current_user.id})
This gets only those problems that the current user has already attempted.
So...
# problem.rb
has_many :user_attempts,
  -> (user) { where(user_id: user.id) },
  class_name: 'Attempt'
# problem_controller.index
@problems = Problem.enabled
  .includes(:user_attempts, current_user)
And this gives an error message from Rails saying joins with instance arguments are not supported.
So I'm stuck. What is the best way to do this? Is Arel the right tool? Can I skip active record and just get back a JSON blob? Am I just being dumb?
This question is quite similar to this one, but I'd need an argument to the joined scope, which isn't supported. And I'm hoping Rails has added something in the last couple of years.
Thanks so much for your help.
The way I solved this was to use raw SQL. It's ugly and a security risk, but I didn't find anything better.
results = Problem.connection.exec_query(%(
SELECT *
FROM problems
LEFT JOIN (
SELECT *
//etc.
)
))
And then manipulating the results array in memory.
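The in-memory step might look something like the sketch below, assuming the result can be treated as an array of hashes with string keys. The column names (`problem_id`, `attempt_id`, `attempted_at`) and the sample rows are hypothetical; the elided subquery above is not reproduced here.

```ruby
# Flat rows shaped like a LEFT JOIN result (hypothetical column names);
# a problem with no attempt has nil in the attempt columns.
rows = [
  { "problem_id" => 1, "attempt_id" => 10,  "attempted_at" => "2021-01-01" },
  { "problem_id" => 1, "attempt_id" => 11,  "attempted_at" => "2021-02-01" },
  { "problem_id" => 2, "attempt_id" => nil, "attempted_at" => nil },
  { "problem_id" => 3, "attempt_id" => 12,  "attempted_at" => "2021-01-15" }
]

# Group the rows per problem, then keep only the most recent attempt row.
last_attempt_by_problem = rows.group_by { |r| r["problem_id"] }.transform_values do |group|
  attempts = group.reject { |r| r["attempt_id"].nil? }
  # ISO 8601 date strings sort chronologically, so max_by on the string works here.
  # max_by returns nil for an empty list, i.e. for a problem with no attempts.
  attempts.max_by { |r| r["attempted_at"] }
end
```

Every problem keeps a key in the hash, so unattempted problems can still be listed with a `nil` last attempt.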
I have a query that looks something like this:
MATCH(u:USER) where u.id in {a_list}
MATCH(e:WHALE) // this is a singleton
CREATE (e)-[h:HARPOON]->(u)
SET h.a = 1, h.b = 2, h.created_at = {created_at}
So u can be multiple users. e is a singleton. Basically we're going to relate the whale to every user.
My problem is that it works fine... if I remove created_at from the query. If I leave it in, not all users are related to the whale. In fact, if I simply rename the parameter name from created_at to xcreated_at it works fine.
Is there something special about created_at?
created_at isn't special, as far as I know. It might depend on your driver, though. In the ruby neo4j gem, for instance, created_at is special, but not for any raw Cypher queries that you run.
Additionally, are you removing the parameter both from the query and from your parameter hash/map? That might cause some weirdness.
Lastly, this was probably dropped because you were making an example, but just created_at = {created_at} won't do anything. You need to specify the object which the property is being set on. I assume it's the relationship in this case so you'd want: h.created_at = {created_at}
I am developing a rails 4 app using ActiveRecord models for my db tables.
The main issue is that my model is quite complicated, and I would like to display a lot of information when I do an index of the main object.
For example, let's assume I have the following tables and columns:
Person: name(string)
Address: address(string), person_id(int)
EmailAddress: email(string), person_id(int)
Email: spam (boolean), email_address_id(int)
and the relations:
person has_many: :email_addresses
person has_one: :address
email_address has_many: :emails
Now I would like to display the following information
person.name
person.address.address
person.email_addresses.count
person.email_addresses.map do |email_address|
  email_address.emails.where(spam: false).count
end
The main issue is that I have a large number of records, and I don't want to instantiate all of them (I have some memory issues because of that). Therefore, I was wondering how to do this kind of thing directly in the database to get either an array of hashes or an array of arrays.
I managed to get the beginning using pluck:
Person.joins(:address).pluck('persons.name, addresses.address')
The problem begins with the count part.
Has someone already encountered such a situation? And is there a way to do this without writing the complete SQL query?
You can't use pluck for complex queries, but you can always use select to fetch the columns you want. First you join all the tables you need. Note I joined emails table twice, the second one with the spam: false condition. Then you define your columns, directly from the table or COUNT'ed, in your select statement:
persons = Person.joins(:address, email_addresses: :emails).
  joins('LEFT JOIN emails not_spammy_email_addresses ON not_spammy_email_addresses.email_address_id = email_addresses.id AND not_spammy_email_addresses.spam = 0').
  group('persons.id, persons.name, addresses.address').
  select('persons.name, addresses.address AS address_address,
    COUNT(DISTINCT email_addresses.id) AS email_addresses_count,
    COUNT(DISTINCT not_spammy_email_addresses.id) AS not_spammy_email_addresses_count')
And then call your result's columns like this:
person = persons.first
person.name
person.address_address # note I'm not using *address* to prevent conflict with the model Address
person.email_addresses_count
person.not_spammy_email_addresses_count
I believe this is as far as you can get with active_record and a single query, but I'd love to see other approaches. For instance, if you use Arel this query would feel less SQLish.
Say I'm modeling Students, Lessons, and Teachers. Given a single student enrolled in many lessons, how would I find all of their teachers of classes that are level 102? For that matter, how would I find all of their lessons' teachers? Right now, I have this:
s = Mongoid::Student.find_by(name: 'Billy')
l = s.lessons.where(level: 102)
t = l.map { |lesson| lesson.teachers }.flatten
Is there a way to turn the last two lines into one query?
Each collection requires at least one query; there's no way to access more than one collection in one query (i.e. no JOINs), so that's the best you can do. However, this:
t = l.map { |lesson| lesson.teachers }.flatten
is doing l.length queries to get the teachers per lesson. You can clean that up by collecting all the teacher IDs from the lessons:
teacher_ids = l.map(&:teacher_ids).flatten.uniq # Or use `inject` or ...
and then grab the teachers based on those IDs:
t = Teacher.find(teacher_ids)
# or
t = Teacher.where(:id.in => teacher_ids).to_a
If all those queries don't work for you then you're stuck with denormalizing something so that you have everything you need embedded in a single collection; this would of course mean that you'd have to maintain the copies as things change and periodically sanity check the copies for consistency problems.
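The ID-collecting step can be sketched in plain Ruby. `LessonDoc` and the sample data are hypothetical stand-ins for Mongoid documents, and `flat_map` is used here as an equivalent of `map` followed by `flatten`.

```ruby
# Hypothetical stand-in for a Lesson document that stores its teacher IDs.
LessonDoc = Struct.new(:name, :level, :teacher_ids)

lessons = [
  LessonDoc.new("Algebra", 102, [1, 2]),
  LessonDoc.new("Geometry", 102, [2, 3]),
  LessonDoc.new("History", 101, [4])
]

# Stand-in for s.lessons.where(level: 102).
level_102 = lessons.select { |lesson| lesson.level == 102 }

# Collect each teacher ID once, so a single follow-up lookup can fetch them all.
teacher_ids = level_102.flat_map(&:teacher_ids).uniq
```

With the IDs deduplicated, the teachers can then be fetched in one query rather than one per lesson.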