I have a habtm relationship between my Product and Category model.
I'm trying to write a query that searches for products with minimum of 2 categories.
I got it working with the following code:
p = Product.joins(:categories).group("product_id").having("count(product_id) > 1")
p.length # 178
When iterating on it though, for each time I call product.categories, it will do a new call to the database - not good. I want to prevent these calls and have the same result. Doing more research I've seen that I could include (includes) my categories table and it would load all the table in memory so it's not necessary to call the database again when iterating. So I got it working with the following code:
p2 = Product.includes(:categories).joins(:categories).group("product_id").having("count(product_id) > 1")
p2.length # 178 - I compared and the objects are the same as last query
Here comes what I am confused about:
p.first.eql? p2.first # true
p.first.categories.eql? p2.first.categories # false
p.first.categories.length # 2
p2.first.categories.length # 1
Why does the includes query give me the right products but not the right categories association?
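It has to do with the group method. Because categories appears in both includes and joins, Rails eager loads it through the same JOIN query, and the GROUP BY collapses that result to one row per product. As a result, each product in p2 only has its first category loaded.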
You could break this up into two queries:
product_ids = Product.joins(:categories).group("product_id").having("count(product_id) > 1").pluck(:product_id)
result = Product.includes(:categories).find(product_ids)
Yeah, you hit the database twice, but at least you don't go to the database when you're iterating.
Note that includes doesn't play well with joins (joins will effectively override it).
Also, when you include an association, ActiveRecord decides whether to use eager_load (a single query with a left join) or preload (a separate query). includes is just a wrapper around one of those two.
The thing is, preload plays well with joins! So you can do this:
products = Product.preload(:categories). # this will trigger a separate query
  joins(:categories).                    # this will build the relevant query
  group("products.id").
  having("count(product_id) > 1").
  select("products.*")
Note that this will also hit the database twice, but you will not run one extra query per product (the N+1 problem).
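To make the difference concrete, here is a minimal plain-Ruby sketch (no ActiveRecord involved) of why a grouped JOIN can only offer one category per product, while a separate lookup sees them all. The data and method names are made up for illustration; this only mimics the shape of the result sets, not what Rails does internally.

```ruby
# Toy rows standing in for the products/categories join (hypothetical values).
join_rows = [
  { product_id: 1, category: "books" },
  { product_id: 1, category: "toys" },
  { product_id: 2, category: "games" }
]

# Step 1: the GROUP BY ... HAVING count > 1 part only needs to yield product ids.
def product_ids_with_multiple_categories(rows)
  rows.group_by { |row| row[:product_id] }
      .select { |_id, group| group.size > 1 }
      .keys
end

# A grouped JOIN result has one row per product, so only one category is visible.
def categories_from_grouped_rows(rows)
  rows.group_by { |row| row[:product_id] }
      .transform_values { |group| [group.first[:category]] }
end

# Step 2: a separate lookup by id (conceptually what preload does) sees every row.
def categories_from_separate_lookup(rows, ids)
  rows.select { |row| ids.include?(row[:product_id]) }
      .group_by { |row| row[:product_id] }
      .transform_values { |group| group.map { |row| row[:category] } }
end

ids = product_ids_with_multiple_categories(join_rows)
collapsed = categories_from_grouped_rows(join_rows)
preloaded = categories_from_separate_lookup(join_rows, ids)

puts "grouped join:    #{collapsed[1].inspect}"
puts "separate lookup: #{preloaded[1].inspect}"
```

The second lookup costs one more query, but it is independent of the GROUP BY, which is why the two-query approaches return the full association.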
Assuming this simplified schema:
users has_many discount_codes
discount_codes has_many orders
I want to grab all users, and if they happen to have any orders, only include the orders that were created between two dates. But if they don't have orders, or have orders only outside of those two dates, still return the users and do not exclude any users ever.
What I'm doing now:
users = User.all.includes(discount_codes: :orders)
users = users.where("orders.created_at BETWEEN ? AND ?", date1, date2).
or(users.where(orders: { id: nil }))
I believe my OR clause allows me to retain users who do not have any orders whatsoever, but what happens is if I have a user who only has orders outside of date1 and date2, then my query will exclude that user.
For what it's worth, I want to use this orders where clause here specifically so I can avoid n + 1 issues later in determining orders per user.
Thanks in advance!
It doesn't make sense to try and control the orders that are loaded as part of the where clause for users. If you were to control that it'd have to be part of the includes (which I think means it'd have to be a part of the association).
Although it can technically combine them into a single query in some cases, ActiveRecord is going to do this as two queries.
The first query will be executed when you go to iterate over the users and will use that where clause to limit the users found.
It will then run a second query behind the scenes based on that includes statement. This will simply be a query to get all orders which are associated with the users that were found by the previous query. As such the only way to control the orders that are found through the user's where clause is to omit users from the result set.
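If I were you, I would add an instance method to the User model for what you are looking for, but use a select block on the loaded orders instead of where (this assumes User declares has_many :orders, through: :discount_codes):
def orders_in_timespan(from, to)
  # `end` is a reserved word in Ruby, so the range bounds are named from/to
  orders.select { |o| o.created_at.between?(from, to) }
end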
Because ActiveRecord caches the orders loaded by includes on each user instance, I believe this will not result in n queries as long as you start your users query with an includes.
Something like this (note that the methods: option calls the method without arguments, so the two dates would need default values or some other way in):
render json: User.includes(:orders), methods: :orders_in_timespan
Of course, the easiest way to confirm the number of queries is to look at the logs. I believe this approach should have two queries regardless of the number of users being rendered (as likely does your code in the question).
Also, I'm not sure how familiar you are with SQL, but you can call .to_sql on things such as your users variable to see the SQL that would be generated. That might help shed some light on the discrepancies between what you're getting and what you're looking for.
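Here is a rough, self-contained sketch of the in-memory filtering idea. It uses plain Structs as stand-ins for the models (DemoUser and DemoOrder are hypothetical names, not ActiveRecord classes) and assumes the orders are already loaded on each user.

```ruby
require "date"

# Minimal stand-ins for the models; in Rails these would be ActiveRecord classes.
DemoOrder = Struct.new(:id, :created_at)

DemoUser = Struct.new(:name, :orders) do
  # Filters the orders already held in memory, so no extra query would be needed.
  def orders_in_timespan(from, to)
    orders.select { |order| order.created_at.between?(from, to) }
  end
end

alice = DemoUser.new("alice", [
  DemoOrder.new(1, Date.new(2020, 1, 5)),
  DemoOrder.new(2, Date.new(2020, 3, 1))
])
bob = DemoUser.new("bob", []) # a user with no orders is still kept

from = Date.new(2020, 1, 1)
to   = Date.new(2020, 1, 31)

[alice, bob].each do |user|
  puts "#{user.name}: #{user.orders_in_timespan(from, to).map(&:id).inspect}"
end
```

Every user stays in the list, and only the per-user order list is narrowed, which is the behaviour the question asks for.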
Option 1: Write a custom query in SQL (ugly).
Option 2: Create 2 separate queries like below...
@users = User.limit(10)
@orders = Order.joins(:discount_code)
  .where(created_at: 10.days.ago..1.day.ago, discount_codes: { user_id: @users.select(:id) })
  .group_by { |order| order.discount_code.user_id }
Now you can use it like this:
@users.each do |user|
  orders = @orders[user.id] || [] # users without matching orders get an empty list
  puts user.name
  puts user.id
  puts orders.count
end
I hope this will solve your problem.
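The grouping step is easy to get subtly wrong for users who have no orders in the range, since the hash simply has no key for them. A small plain-Ruby sketch with made-up data (OrderRow is a hypothetical stand-in for an order record):

```ruby
# Stand-in for an order row that already knows its user id.
OrderRow = Struct.new(:id, :user_id)

orders   = [OrderRow.new(10, 1), OrderRow.new(11, 1), OrderRow.new(12, 3)]
user_ids = [1, 2, 3]

# Group once, then look up per user; fetch supplies [] when a user has no orders.
orders_by_user = orders.group_by(&:user_id)
counts = user_ids.map { |id| [id, orders_by_user.fetch(id, []).size] }.to_h

puts counts.inspect
```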
You need to use joins instead of includes. Rails joins use inner joins and will reject all the records which don't have associations.
User.joins(discount_codes: :orders).where(orders: {created_at: [10.days.ago..1.day.ago]}).distinct
This will give you all distinct users who placed orders in a given period of time.
user = User.joins(:discount_codes).joins(:orders).where("orders.created_at BETWEEN ? AND ?", date1, date2) +
User.left_joins(:discount_codes).left_joins(:orders).group("users.id").having("count(orders.id) = 0")
I'm using Rails 5. I have the following model ...
class Order < ApplicationRecord
...
has_many :line_items, :dependent => :destroy
The LineItem model has an attribute, "discount_applied." I would like to return all orders where there are zero instances of a line item having the "discount_applied" field being not nil. How do I write such a finder method?
First of all, this really depends on whether or not you want to use a pure Arel approach or if using SQL is fine. The former is IMO only advisable if you intend to build a library but unnecessary if you're building an app where, in reality, it's highly unlikely that you're changing your DBMS along the way (and if you do, changing a handful of manual queries will probably be the least of your troubles).
Assuming using SQL is fine, the simplest solution that should work across pretty much all databases is this:
Order.where("(SELECT COUNT(*) FROM line_items WHERE line_items.order_id = orders.id AND line_items.discount_applied IS NOT NULL) = 0")
This should also work pretty much everywhere (and has a bit more Arel and less manual SQL):
Order.left_joins(:line_items).group("orders.id").having("COUNT(line_items.discount_applied) = 0")
Depending on your specific DBMS (more specifically: its respective query optimizer), one or the other might be more performant.
Hope that helps.
Not efficient but I thought it may solve your problem:
orders = Order.includes(:line_items).select do |order|
order.line_items.all? { |line_item| line_item.discount_applied.nil? }
end
Update:
Instead of finding orders whose line items all have no discount, we can exclude every order that has a line item with a discount applied. This can be done with a subquery inside the where clause:
# Find all ids of orders which have line items with a discount applied:
excluded_ids = LineItem.select(:order_id)
.where.not(discount_applied: nil)
.distinct.map(&:order_id)
# exclude those ids from all orders:
Order.where.not(id: excluded_ids)
You can combine them in a single finder method:
Order.where.not(id: LineItem
.select(:order_id)
.where.not(discount_applied: nil))
Hope this helps
A possible code
Order.includes(:line_items).where.not(line_items: {discount_applied: nil})
I advise getting familiar with the AR documentation for Query Methods.
Update
This turns out to be more interesting than I initially thought, and more complicated, so I will not be able to give you working code. But I would look into a solution using LineItem.group(:order_id).having(discount_applied: nil), which should give you a collection of line_items, and then use it as a sub-query to find the related orders.
If you want all the orders that have a line item where discount_applied is not nil, then:
Order.includes(:line_items).where.not(line_items: {discount_applied: nil})
(use includes to avoid n+1 problem)
or
Order.joins(:line_items).where.not(line_items: {discount_applied: nil})
Here is the solution to your problem
order_ids = Order.joins(:line_items).where.not(line_items: {discount_applied: nil}).pluck(:id)
orders = Order.where.not(id: order_ids)
First query will return ids of Orders with at least one line_item having discount_applied. The second query will return all orders where there are zero instances of a line_item having the discount_applied.
I would use the NOT EXISTS feature from SQL, which is available in at least MySQL and PostgreSQL.
It should look like this:
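class Order < ApplicationRecord
  has_many :line_items
  scope :without_discounts, -> {
    # Correlated subquery: discounted line items belonging to the outer order
    discounted = LineItem.where("line_items.order_id = orders.id")
                         .where.not(discount_applied: nil)
    where("NOT EXISTS (?)", discounted)
  }
end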
If I understood correctly, you want to get all orders for which no line item (if any) has a discount applied.
One way to get those orders using ActiveRecord would be the following:
Order.distinct.left_outer_joins(:line_items).where(line_items: { discount_applied: nil })
Here's a brief explanation of how that works:
The solution uses left_outer_joins, assuming you won't be accessing the line items for each order. You can also use left_joins, which is an alias.
If you need to instantiate the line items for each Order instance, add .eager_load(:line_items) to the chain which will prevent doing an additional query for every order (N+1), i.e., doing order.line_items.each in a view.
Using distinct is essential to make sure that orders are only included once in the result.
Update
My previous solution was only checking that discount_applied IS NULL for at least one line item, not all of them. The following query should return the orders you need.
Order.left_joins(:line_items).group(:id).having("COUNT(line_items.discount_applied) = ?", 0)
This is what's going on:
The solution still needs to use a left outer join (orders LEFT OUTER JOIN line_items) so that orders without any associated items are included.
It groups the line items to get a single Order object regardless of how many items it has (GROUP BY orders.id).
It counts the line items that were given a discount for each order and keeps only the orders with zero discounted items (HAVING (COUNT(line_items.discount_applied) = 0)).
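The key detail is that COUNT(column) skips NULLs, so an order with only undiscounted items and an order with no items at all both count as zero. A plain-Ruby sketch of that counting logic, with made-up data standing in for the two tables:

```ruby
order_ids = [1, 2, 3, 4] # order 4 has no line items at all

line_items = [
  { order_id: 1, discount_applied: nil },
  { order_id: 1, discount_applied: 5 },
  { order_id: 2, discount_applied: nil },
  { order_id: 3, discount_applied: 10 }
]

# Mimics LEFT JOIN + GROUP BY orders.id + COUNT(line_items.discount_applied):
# only non-nil discount values are counted for each order.
discount_counts = order_ids.map do |order_id|
  items = line_items.select { |li| li[:order_id] == order_id }
  [order_id, items.count { |li| !li[:discount_applied].nil? }]
end.to_h

# Mimics HAVING COUNT(...) = 0.
without_discounts = discount_counts.select { |_id, count| count.zero? }.keys

puts without_discounts.inspect
```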
I hope that helps.
You cannot do this efficiently with a classic Rails left_joins, but the SQL left join was built to handle these cases:
Order.joins("LEFT JOIN line_items AS li ON li.order_id = orders.id
AND li.discount_applied IS NOT NULL")
.where("li.id IS NULL")
A simple inner join returns all orders joined with all of their line_items, but if an order has no line_items, the order is dropped (as if a where clause had excluded it).
With a left join, if no line_items are found, SQL joins the order to an empty (all-NULL) entry in order to keep it.
So we left join the line_items we don't want and keep the orders that were joined with an empty line_items entry.
Also, avoid code like where(id: pluck(:id)) or having("COUNT(*) = 0"); one day this will kill your database.
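For anyone who finds the anti-join hard to picture, here is a small plain-Ruby imitation of it. The method name and data are invented for the example; it only mirrors the row shapes a LEFT JOIN with the extra ON condition would produce.

```ruby
order_ids = [1, 2, 3, 4] # order 4 has no line items at all

line_items = [
  { order_id: 1, discount_applied: nil },
  { order_id: 1, discount_applied: 5 },
  { order_id: 2, discount_applied: nil },
  { order_id: 3, discount_applied: 10 }
]

# LEFT JOIN line_items li ON li.order_id = orders.id AND li.discount_applied IS NOT NULL
# Orders with no matching (discounted) item are kept and paired with nil.
def left_join_discounted(order_ids, line_items)
  order_ids.flat_map do |order_id|
    matches = line_items.select do |li|
      li[:order_id] == order_id && !li[:discount_applied].nil?
    end
    matches.empty? ? [[order_id, nil]] : matches.map { |li| [order_id, li] }
  end
end

joined = left_join_discounted(order_ids, line_items)

# WHERE li.id IS NULL: keep only the orders that were paired with the empty entry.
orders_without_discount = joined.select { |_id, li| li.nil? }.map(&:first)

puts orders_without_discount.inspect
```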
I have a model Company that has the columns pbr, market_cap and category.
To get averages of pbr grouped by category, I can use the group method.
Company.group(:category).average(:pbr)
But there is no method for weighted average.
To get weighted averages I need to run this SQL code.
select case when sum(market_cap) = 0 then 0 else sum(pbr * market_cap) / sum(market_cap) end as weighted_average_pbr, category AS category FROM "companies" GROUP BY "companies"."category";
In psql this query works fine, but I don't know how to run it from Rails.
sql = %q(select case when sum(market_cap) = 0 then 0 else sum(pbr * market_cap) / sum(market_cap) end as weighted_average_pbr, category AS category FROM "companies" GROUP BY "companies"."category";)
ActiveRecord::Base.connection.select_all(sql)
returns an error:
output error: #<NoMethodError: undefined method `keys' for #<Array:0x007ff441efa618>>
It would be best if I can extend Rails method so that I can use
Company.group(:category).weighted_average(:pbr)
But I heard that extending Rails query methods is a bit tricky, so for now I just want to know how to run this SQL from Rails and get the result.
Does anyone know how to do it?
Version
rails: 4.2.1
What version of Rails are you using? I don't get that error with Rails 4.2. In Rails 3.2 select_all used to return an Array, and in 4.2 it returns an ActiveRecord::Result. But in either case, it is correct that there is no keys method. Instead you need to call keys on each element of the Array or Result. It sounds like the problem isn't from running the query, but from what you're doing afterward.
In any case, to get the more fluent approach you've described, you could do this:
class Company
  scope :weighted_average, lambda { |col|
    select("companies.category").
      select(<<-EOQ)
        (CASE WHEN SUM(market_cap) = 0 THEN 0
        ELSE SUM(#{col} * market_cap) / SUM(market_cap)
        END) AS weighted_average_#{col}
      EOQ
  }
end
This will let you say Company.group(:category).weighted_average(:pbr), and you will get a collection of Company instances. Each one will have an extra weighted_average_pbr attribute, so you can do this:
Company.group(:category).weighted_average(:pbr).each do |c|
puts c.weighted_average_pbr
end
These instances will not have their normal attributes, but they will have category. That is because they do not represent individual Companies, but groups of companies with the same category. If you want to group by something else, you could parameterize the lambda to take the grouping column. In that case you might as well move the group call into the lambda too.
Now be warned that the parameter to weighted_average goes straight into your SQL query without escaping, since it is a column name. So make sure you don't pass user input to that method, or you'll have a SQL injection vulnerability. In fact I would probably put a guard inside the lambda, something like raise "NOPE" unless col =~ %r{\A[a-zA-Z0-9_]+\z}.
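A minimal sketch of such a guard as a standalone helper. The name safe_column! is made up for this example and is not part of Rails. It anchors with \z, which only matches at the very end of the string, whereas \Z would also accept a single trailing newline.

```ruby
# Only letters, digits and underscores, from start to the very end of the string.
SAFE_COLUMN_NAME = /\A[a-zA-Z0-9_]+\z/

# Returns the column name as a string, or raises if it is not a plain identifier.
def safe_column!(name)
  name = name.to_s
  unless name.match?(SAFE_COLUMN_NAME)
    raise ArgumentError, "unsafe column name: #{name.inspect}"
  end
  name
end

puts safe_column!(:pbr)

begin
  safe_column!("pbr; DROP TABLE companies")
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

Inside the lambda you would call the helper on col before interpolating it into the SQL string.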
The more general lesson is that you can use select to include extra SQL expressions, and have Rails magically treat those as attributes on the instances returned from the query.
Also note that unlike with select_all where you get a bunch of hashes, with this approach you get a bunch of Company instances. So again there is no keys method! :-)
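If it helps to sanity-check what the SQL returns, the same weighted average can be computed in plain Ruby on a handful of rows. This is only a sketch with invented numbers, and CompanyRow is a hypothetical stand-in rather than the real model.

```ruby
# Stand-in for a row of the companies table.
CompanyRow = Struct.new(:category, :pbr, :market_cap)

# Per category: SUM(pbr * market_cap) / SUM(market_cap), or 0 when the caps sum to 0.
def weighted_average_pbr(companies)
  companies.group_by(&:category).transform_values do |group|
    total_cap = group.sum(&:market_cap)
    next 0 if total_cap.zero?

    group.sum { |c| c.pbr * c.market_cap } / total_cap.to_f
  end
end

companies = [
  CompanyRow.new("tech", 2.0, 100),
  CompanyRow.new("tech", 4.0, 300),
  CompanyRow.new("retail", 1.5, 0)
]

result = weighted_average_pbr(companies)
puts result.inspect
```

For the tech rows this is (2.0 * 100 + 4.0 * 300) / 400 = 3.5, while the plain average would be 3.0, which shows why the market_cap weighting matters.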
Let's say I have a User table and a Messages table, with a has_many / belongs_to relationship between them. I want to find the ids of the users whose names are "Bob", then pull the message history for one of those ids.
x = User.where(name: "Bob")
Does that create a hash in variable x, with all the results of users whose names were Bob? The result in the console certainly looks like a hash when I run x. To include the messages tied to all the Bobs, I think I do:
x = User.where(name: "Bob").includes(:messages)
Now that I have x...how do I find the id's of the people whose names are Bob? I don't want to query the db again, I'd like to do it all via the variable, is that possible?
I then want to get the first message of the first id (the first Bob) in my table. Can that be done via the variable, or do I have to go back to the DB once I have the first id?
Thanks for all the help guys and gals!
Most ActiveRecord queries return a Relation.
You can call x = x.to_a to make Rails perform the actual query (there will be 2 SQL queries, one for users and one for messages) and then traverse the resulting array.
This will do it, as referenced in the Rails guides: http://guides.rubyonrails.org/active_record_querying.html section 13.2
x = Message.includes(:user).where(users: { name: "Bob" })
and then to get the first message just tack on .first at the end of the query.
x = Message.includes(:user).where(users: { name: "Bob" }).first
You need to query from Message, not User. Joins (inner join) and includes (left outer join) can be used for eager loading, like in your question, or to query across multiple tables.
Message.joins(:user).where(users: { name: "Bob" })
I am trying to query my Postgres DB from Rails with the following query:
def is_manager(team)
  User.where("manager <> 0 AND team_id = :team_id", { :team_id => team.id })
end
This basically checks that the manager flag is set and that team_id is the id of the team passed into the function.
I have the following code in my view:
%td= is_manager(team)
What we are getting back is:
#<ActiveRecord::Relation:0xa3ae51c>
Any help on where I have gone wrong would be great.
ActiveRecord query methods such as where return an ActiveRecord::Relation rather than the records themselves. Doing so essentially allows the lazy loading of queries. To understand why this is cool, consider this:
User.where(manager: 0).where(team_id: team_id).first
In this case, we get all users who aren't managers, and then we get all the non-manager users who are on team with id team_id, and then we select the first one. Executing this code will give you a query like:
SELECT * FROM users WHERE manager = 0 AND team_id = X LIMIT 1
As you can see, even though we chained multiple query methods in our code, ActiveRecord was able to squash all of that down into one query. This is done through the Relation. As soon as we need the actual object (i.e. when we call first), ActiveRecord goes to the DB to get the records. This prevents unnecessary queries. ActiveRecord is able to do this because these methods return Relations instead of the queried objects. A good way to think of a Relation is as a not-yet-executed query that also responds to most array methods: you can chain further query methods on it, but you can also iterate over it.
Sorry if that isn't clear.
Oh, and to solve your problem: %td= is_manager(team).to_a. This will convert the Relation object into an array of Users.
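If the lazy behaviour is still fuzzy, here is a toy imitation in plain Ruby. ToyRelation is a made-up class for illustration only and is far simpler than ActiveRecord::Relation: it just collects conditions and does nothing until to_a or first is called.

```ruby
class ToyRelation
  def initialize(rows, conditions = {}, log = [])
    @rows = rows
    @conditions = conditions
    @log = log # shared record of every "query" that actually ran
  end

  # Returns a new relation with the extra conditions; nothing is executed yet.
  def where(extra)
    ToyRelation.new(@rows, @conditions.merge(extra), @log)
  end

  # "Executes" the query: applies every accumulated condition in a single pass.
  def to_a
    @log << @conditions
    @rows.select { |row| @conditions.all? { |key, value| row[key] == value } }
  end

  def first
    to_a.first
  end

  def executions
    @log.size
  end
end

users = [
  { name: "ann", manager: 0, team_id: 1 },
  { name: "bo",  manager: 1, team_id: 1 },
  { name: "cy",  manager: 0, team_id: 2 }
]

relation = ToyRelation.new(users).where(manager: 0).where(team_id: 1)
puts "executions after chaining: #{relation.executions}"
puts "first match: #{relation.first[:name]}"
puts "executions after first: #{relation.executions}"
```

Printing the relation itself in a view, as in the question, shows the unexecuted object; calling to_a or first is what turns it into records.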
Just retrieve the first record with .first; this might help.
User.where("manager <> 0 AND team_id = :team_id", { :team_id => team.id }).first