I have a model A as:
class A < ActiveRecord::Base
belongs_to :bp, class_name: "B", foreign_key: :bp_id
belongs_to :cp, class_name: "B", foreign_key: :cp_id
end
I have a query where I am trying to get the size of all the associations to model B from model A.
I made it work by getting the count of all associations through bp and then cp and adding both up.
total_count = bp_size + cp_size
Is it possible to get the total count of all associations from model A to model B in a single query?
My exact queries are as follows:
bp_size = E.where(f_id: g.joins(:f)).joins(bp: :s).size
cp_size = E.where(f_id: g.joins(:f)).joins(cp: :s).size
The above queries go through multiple levels of associations, and then I join through bp and cp to get the appropriate size. I want to avoid running essentially the same query twice.
I don't know of any AR so-called “helpers” for this, but it is perfectly doable in SQL:
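# INNER JOIN counts only the rows of a that actually reference a row in b;
# a LEFT JOIN would count every row of a, associated or not.
query = <<-SQL
SELECT
(SELECT COUNT(*) FROM a INNER JOIN b ON (a.bp_id = b.id)) +
(SELECT COUNT(*) FROM a INNER JOIN b ON (a.cp_id = b.id))
SQL
# select_value returns the first column of the first row, whatever the adapter
A.connection.select_value(query)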
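If more than two foreign keys point at the same table, the statement above can be generated rather than typed out. The following is only a sketch in plain Ruby: the helper name combined_association_count_sql is hypothetical, it assumes the target table's primary key is id, and it builds the SQL string without touching a database.

```ruby
# Hypothetical helper: builds one SQL statement that sums the join counts
# for several foreign keys that all point at the same target table.
# Assumes the target table's primary key column is named "id".
def combined_association_count_sql(source_table, target_table, foreign_keys)
  subqueries = foreign_keys.map do |fk|
    "(SELECT COUNT(*) FROM #{source_table} " \
      "INNER JOIN #{target_table} ON #{source_table}.#{fk} = #{target_table}.id)"
  end
  "SELECT #{subqueries.join(' + ')} AS total_count"
end

sql = combined_association_count_sql("a", "b", %w[bp_id cp_id])
puts sql
```

In a Rails app the resulting string could be passed to something like A.connection.select_value(sql). Table and column names are interpolated directly, so this is only safe with trusted, hard-coded identifiers.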
Dynamic solution
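def self.associations_counts_hash(model, association_types)
# Collect the names of the associations whose macro matches (e.g. :belongs_to)
associations = model.reflections.collect do |name, reflection|
name.to_sym if association_types.include?(reflection.macro)
end.compact
# Eager-load the associations so counting does not trigger N+1 queries
query = model.all.includes(associations)
associations.map do |association|
[
association,
# Array.wrap handles singular associations (record or nil) as well as collections
query.map { |u| Array.wrap(u.send(association)).size }.sum
]
end.to_h
end
associations_counts_hash(User, [:belongs_to])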
I couldn't work out a good way to avoid one query per association, though.
Related
I have 2 models with a has_many association:
class Log < ActiveRecord::Base
has_many :log_details
end
and
class LogDetail < ActiveRecord::Base
belongs_to :log
end
The Log table has an action_type string column. The LogDetail table has 2 columns: key and value, both string, and a reference back to the Log table with a log_id.
I want to write 3 scopes on the Log model that query for some details by joining the Log table with the LogDetail table twice. Here's a sample:
has_many :payment_gateways, -> {where(key: 'payment_gateway')}, class_name: 'LogDetail', foreign_key: :log_id
has_many :coupon_codes, -> {where(key: 'coupon_code')}, class_name: 'LogDetail', foreign_key: :log_id
scope :initiate_payment, -> {where(action_type: 'INITIATE PAYMENT')}
scope :payment_gateway, -> (pg) {joins(:payment_gateways).where(log_details: {value: pg}) unless pg.blank?}
scope :coupon_code, -> (cc) {joins(:coupon_codes).where(log_details: {value: cc}) unless cc.blank?}
Using the above scopes, if I try to query for
Log.initiate_payment.payment_gateway('sample_pg').coupon_code('sample_cc')
I get the SQL query:
SELECT
`logs`.*
FROM
`logs`
INNER JOIN
`log_details` ON `log_details`.`log_id` = `logs`.`id`
AND `log_details`.`key` = 'payment_gateway'
INNER JOIN
`log_details` `coupon_codes_logs` ON `coupon_codes_logs`.`log_id` = `logs`.`id`
AND `coupon_codes_logs`.`key` = 'coupon_code'
WHERE
`logs`.`action_type` = 'INITIATE PAYMENT'
AND `log_details`.`value` = 'sample_pg'
AND `log_details`.`value` = 'sample_cc'
instead of: (notice the difference in the last AND condition)
SELECT
`logs`.*
FROM
`logs`
INNER JOIN
`log_details` ON `log_details`.`log_id` = `logs`.`id`
AND `log_details`.`key` = 'payment_gateway'
INNER JOIN
`log_details` `coupon_codes_logs` ON `coupon_codes_logs`.`log_id` = `logs`.`id`
AND `coupon_codes_logs`.`key` = 'coupon_code'
WHERE
`logs`.`action_type` = 'INITIATE PAYMENT'
AND `log_details`.`value` = 'sample_pg'
AND `coupon_codes_logs`.`value` = 'sample_cc'
The first query, because it doesn't resolve the join table references properly, gives me zero results.
How can I modify my scopes/models in such a way to generate the correct query? I think I need a reference to the join table alias inside the scope's where clause, but I'm not sure how to get that reference.
Sadly, ActiveRecord has no built-in way to specify the alias used when joining an association. Using merge to try merging the two scopes also fails as the condition is overridden.
3 solutions:
Use Arel to alias a join, but that's a bit hard to read, and you still need to repeat the association definition for payment_gateways and coupon_codes
Join directly in SQL:
scope :payment_gateway, -> (pg) { joins(<<-SQL
INNER JOIN log_details payment_gateways
ON payment_gateways.log_id = logs.id
AND payment_gateways.key = 'payment_gateway'
AND payment_gateways.value = #{connection.quote(pg)}
SQL
) if pg.present? }
But you need to manually repeat the conditions already defined in the associations
Finally, my favorite, a solution that sticks to ActiveRecord:
scope :payment_gateway, -> (pg) do
where(id: unscoped.joins(:payment_gateways).where(log_details: {value: pg})) if pg.present?
end
scope :coupon_code, -> (cc) do
where(id: unscoped.joins(:coupon_codes).where(log_details: {value: cc})) if cc.present?
end
Gotcha #1: if you use Rails < 5.2, you might need to use class methods instead of scopes.
Gotcha #2: Solution #3 might be less performant than #2, so make sure to run EXPLAIN ANALYZE on both to see the difference.
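To see why solution #3 sidesteps the alias problem, it helps to think of each scope as an independent "id IN (subquery)" filter; chaining both scopes then amounts to intersecting two sets of log ids. The sketch below models that with plain in-memory Ruby data. The LogDetailRow struct, the sample rows, and the log_ids_with helper are all hypothetical stand-ins, not ActiveRecord code.

```ruby
require "set"

# Hypothetical in-memory stand-in for the log_details table.
LogDetailRow = Struct.new(:log_id, :key, :value)

DETAILS = [
  LogDetailRow.new(1, "payment_gateway", "sample_pg"),
  LogDetailRow.new(1, "coupon_code", "sample_cc"),
  LogDetailRow.new(2, "payment_gateway", "sample_pg"),
  LogDetailRow.new(2, "coupon_code", "other_cc"),
  LogDetailRow.new(3, "coupon_code", "sample_cc")
].freeze

# Mirrors one subquery: the ids of logs that have a detail row
# with the given key and value.
def log_ids_with(key, value)
  DETAILS.select { |d| d.key == key && d.value == value }.map(&:log_id).to_set
end

# Chaining the two scopes ANDs the two "id IN (...)" conditions,
# which is a set intersection. Each subquery uses log_details under
# its own name, so the two conditions never collide.
matching = log_ids_with("payment_gateway", "sample_pg") &
           log_ids_with("coupon_code", "sample_cc")

puts matching.to_a.inspect
```

Only log 1 has both the gateway and the coupon row, so it is the only id left after the intersection.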
I have the following associations:
class Captain
has_many :boats
end
class Boat
belongs_to :captain
has_many :classifications
end
class Classification
has_many :boats
end
I want to find out which captains have boats that have classifications with :name attributes of "catamaran."
This has been my best guess so far:
Captain.includes(:boats, :classifications).where(:boats => {:classifications => {:name => "catamaran"}})
Try this:
Captain.joins(boats: :classifications).where(classifications: { name: "catamaran" })
This results in the following SQL query:
SELECT * FROM `captains`
INNER JOIN `boats` ON `boats`.`captain_id` = `captains`.`id`
INNER JOIN `join_table` ON `join_table`.`boat_id` = `boats`.`id`
INNER JOIN `classifications` ON `join_table`.`classification_id` = `classifications`.id
@Sujan Adiga is right!
If you use the includes method, Active Record by default generates two separate SQL queries: the first for your main model and the second for the included model. The included model's table is not available in the first query, so you cannot filter on it there.
When you use the joins method, Active Record generates a single SQL query with JOIN clauses, so you can use the joined model's table in your where clause.
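As a rough illustration of what the joined query returns, here is a plain-Ruby sketch with no database involved. The CaptainRow and BoatRow structs and the sample data are hypothetical. It mimics the INNER JOIN chain as one row per captain/boat/classification combination and then applies the where filter to those rows.

```ruby
# Hypothetical in-memory rows standing in for the three tables.
CaptainRow = Struct.new(:id, :name)
BoatRow = Struct.new(:id, :captain_id, :classification_names)

captains = [CaptainRow.new(1, "Ahab"), CaptainRow.new(2, "Nemo")]
boats = [
  BoatRow.new(10, 1, %w[catamaran racing]),
  BoatRow.new(11, 1, %w[catamaran]),
  BoatRow.new(12, 2, %w[submarine])
]

# joins(boats: :classifications): one row per captain/boat/classification.
joined_rows = captains.flat_map do |captain|
  boats.select { |boat| boat.captain_id == captain.id }.flat_map do |boat|
    boat.classification_names.map { |name| [captain, boat, name] }
  end
end

# where(classifications: { name: "catamaran" }) filters the joined rows.
matches = joined_rows
  .select { |_captain, _boat, name| name == "catamaran" }
  .map { |captain, _boat, _name| captain.name }

puts matches.inspect      # a captain appears once per matching boat
puts matches.uniq.inspect # roughly what adding .distinct would return
```

Because the filter runs on joined rows, a captain with two catamarans shows up twice, which is why a distinct is often added to such queries.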
I'm trying to figure out how I can replicate the following SQL query using AR given the model definitions below. The cast is necessary to perform the average. The result set should group foo by bar (which comes from the polymorphic association). Any help is appreciated.
SQL:
SELECT AVG(CAST(r.foo AS decimal)) "Average", s.bar
FROM rotations r INNER JOIN cogs c ON r.cog_id = c.id
INNER JOIN sprockets s ON s.id = c.crankable_id
INNER JOIN machinists m ON r.machinist_id = m.id
WHERE c.crankable_type = 'Sprocket' AND
r.machine_id = 123 AND
m.shop_id = 1
GROUP BY s.bar
ActiveRecord Models:
class Rotation < ActiveRecord::Base
belongs_to :cog
belongs_to :machinist
belongs_to :machine
end
class Cog < ActiveRecord::Base
belongs_to :crankable, :polymorphic => true
has_many :rotation
end
class Sprocket < ActiveRecord::Base
has_many :cogs, :as => :crankable
end
class Machinist < ActiveRecord::Base
belongs_to :shop
end
UPDATE
I've figured out a way to make it work, but it feels like cheating. Is there a better way than this?
Sprocket.joins('INNER JOIN cogs c ON c.crankable_id = sprockets.id',
'INNER JOIN rotations r ON r.cog_id = c.id',
'INNER JOIN machinists m ON r.machinist_id = m.id')
.select('sprockets.bar', 'r.foo')
.where(:r => {:machine_id => 123}, :m => {:shop_id => 1})
.group('sprockets.bar')
.average('CAST(r.foo AS decimal)')
SOLUTION
Albin's answer didn't work as-is, but did lead me to a working solution. First, I had a typo in Cog and had to change the relation from:
has_many :rotation
to the plural form:
has_many :rotations
With that in place, I am able to use the following query
Sprocket.joins(cogs: {rotations: :machinist})
.where({ machinists: { shop_id: 1 }, rotations: { machine_id: 123}})
.group(:bar)
.average('CAST(rotations.foo AS decimal)')
The only real difference is that I had to separate the where clause since a machine does not belong to a machinist. Thanks Albin!
I think this code is a little simpler and leans more on AR:
Sprocket
.joins(cogs: {rotations: :machinist})
.where({ machinists: { machine_id: 123, shop_id: 1 } } )
.group(:bar)
.average('CAST(rotations.foo AS decimal)')
The select clause was unnecessary: you don't have to select values that you only need internally in the query, because AR works out which columns the aggregate needs.
I tested this using a similar structure in one of my own projects, but the models are not exactly the same, so there might be a typo in the code above if it does not run as-is. I ran:
Activity
.joins(locations: {participants: :stuff})
.where({ stuffs: { my_field: 1 } })
.group(:title)
.average('CAST(participants.date_of_birth as decimal)')
producing this query
SELECT AVG(CAST(participants.date_of_birth as decimal)) AS average_cast_participants_date_of_birth_as_decimal, title AS title
FROM `activities`
INNER JOIN `locations` ON `locations`.`activity_id` = `activities`.`id`
INNER JOIN `participants` ON `participants`.`location_id` = `locations`.`id`
INNER JOIN `stuffs` ON `stuffs`.`id` = `participants`.`stuff_id`
WHERE `stuffs`.`my_field` = 1
GROUP BY title
which AR turns into a hash looking like this:
{"dummy title"=>#<BigDecimal:7fe9fe44d3c0,'0.19652273E4',18(18)>, "stats test"=>nil}
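The shape of that result, one average per group key, can be mimicked in plain Ruby. This is only a sketch under assumptions: the sample rows are made up, foo is assumed to be stored as text (which would explain why the CAST is needed), and Float is used here where AR returns BigDecimal.

```ruby
# Hypothetical joined rows as [bar, foo], with foo stored as text.
rows = [
  ["alpha", "10"],
  ["alpha", "20"],
  ["beta", "7.5"]
]

averages = rows
  .group_by { |bar, _foo| bar }                    # GROUP BY bar
  .transform_values do |group|
    values = group.map { |_bar, foo| foo.to_f }    # CAST(foo AS decimal)
    values.sum / values.size                       # AVG(...)
  end

puts averages.inspect
```

The keys of the hash are the grouped column values and each value is the average for that group, which matches the structure AR returns from group(...).average(...).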
After googling, browsing SO and reading, there doesn't seem to be a Rails-style way to efficiently get only those Parent objects which have at least one Child object (through a has_many :children relation). In plain SQL:
SELECT *
FROM parents
WHERE EXISTS (
SELECT 1
FROM children
WHERE parent_id = parents.id)
The closest I've come is
Parent.all.reject { |parent| parent.children.empty? }
(based on another answer), but it's really inefficient because it runs a separate query for each Parent.
Parent.joins(:children).uniq.all
As of Rails 5.1, uniq is deprecated and distinct should be used instead.
Parent.joins(:children).distinct
This is a follow-up on Chris Bailey's answer. .all is removed as well from the original answer as it doesn't add anything.
The accepted answer (Parent.joins(:children).uniq) generates SQL using DISTINCT, but it can be a slow query. For better performance, you should write SQL using EXISTS:
Parent.where <<-SQL
EXISTS (SELECT * FROM children c WHERE c.parent_id = parents.id)
SQL
EXISTS can be much faster than DISTINCT. For example, here is a Post model which has comments and likes:
class Post < ApplicationRecord
has_many :comments
has_many :likes
end
class Comment < ApplicationRecord
belongs_to :post
end
class Like < ApplicationRecord
belongs_to :post
end
The database contains 100 posts, each with 50 comments and 50 likes, plus one post that has no comments or likes:
# Create posts with comments and likes
100.times do |i|
post = Post.create!(title: "Post #{i}")
50.times do |j|
post.comments.create!(content: "Comment #{j} for #{post.title}")
post.likes.create!(user_name: "User #{j} for #{post.title}")
end
end
# Create a post without comment and like
Post.create!(title: 'Hidden post')
If you want to get the posts which have at least one comment and at least one like, you might write it like this:
# NOTE: uniq method will be removed in Rails 5.1
Post.joins(:comments, :likes).distinct
The query above generates SQL like this:
SELECT DISTINCT "posts".*
FROM "posts"
INNER JOIN "comments" ON "comments"."post_id" = "posts"."id"
INNER JOIN "likes" ON "likes"."post_id" = "posts"."id"
But this SQL generates 250,000 rows (100 posts * 50 comments * 50 likes) and then filters out the duplicates, so it could be slow.
In this case, you should write it like this:
Post.where <<-SQL
EXISTS (SELECT * FROM comments c WHERE c.post_id = posts.id)
AND
EXISTS (SELECT * FROM likes l WHERE l.post_id = posts.id)
SQL
This query generates SQL like this:
SELECT "posts".*
FROM "posts"
WHERE (
EXISTS (SELECT * FROM comments c WHERE c.post_id = posts.id)
AND
EXISTS (SELECT * FROM likes l WHERE l.post_id = posts.id)
)
This query does not generate useless duplicated rows, so it could be faster.
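The row arithmetic can be checked with a small in-memory simulation. This is plain Ruby, not a database, so it only illustrates the counts; a real query planner may execute either form differently. Each comment and like is represented by its post_id.

```ruby
require "set"

post_ids = (1..100).to_a
# 50 comments and 50 likes per post, each represented by its post_id.
comments = post_ids.flat_map { |id| [id] * 50 }
likes = post_ids.flat_map { |id| [id] * 50 }
post_ids << 101 # the "Hidden post" with no comments or likes

comments_by_post = comments.group_by(&:itself)
likes_by_post = likes.group_by(&:itself)

# INNER JOIN on comments and likes: one row per (comment, like) pair per post.
joined_rows = post_ids.flat_map do |id|
  (comments_by_post[id] || []).product(likes_by_post[id] || []).map { id }
end
distinct_posts = joined_rows.uniq # the DISTINCT step

# EXISTS: one membership check per post, no intermediate rows.
commented = comments.to_set
liked = likes.to_set
exists_posts = post_ids.select { |id| commented.include?(id) && liked.include?(id) }

puts joined_rows.size
puts distinct_posts.size
puts exists_posts.size
```

Both approaches end up with the same 100 posts, but the join-based one builds 250,000 intermediate rows first.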
Here is the benchmark:
user system total real
Uniq: 0.010000 0.000000 0.010000 ( 0.074396)
Exists: 0.000000 0.000000 0.000000 ( 0.003711)
It shows EXISTS is about 20 times faster than DISTINCT in this run.
I pushed the sample application to GitHub, so you can confirm the difference yourself:
https://github.com/JunichiIto/exists-query-sandbox
I have just modified this solution for your needs:
Parent.joins("left join children on children.parent_id = parents.id").where("children.parent_id is not null")
You just want an inner join with a distinct qualifier
SELECT DISTINCT parents.*
FROM parents
JOIN children
ON children.parent_id = parents.id
This can be done in standard active record as
Parent.joins(:children).uniq
However, if you want the more complex result of finding all parents with no children, you need an outer join:
Parent.joins("LEFT OUTER JOIN children on children.parent_id = parents.id").
where(:children => { :id => nil })
which is a clumsy solution for many reasons. I recommend Ernie Miller's squeel library, which will allow you to do
Parent.joins{children.outer}.where{children.id == nil}
Try including the children with #includes():
Parent.includes(:children).all.reject { |parent| parent.children.empty? }
This will make 2 queries:
SELECT * FROM parents;
SELECT * FROM children WHERE parent_id IN (5, 6, 8, ...);
[UPDATE]
The above solution is useful when you need to have the Child objects loaded.
But children.empty? can also use a counter cache to determine the number of children.
For this to work you need to add a new column to the parents table:
# a new migration
def up
change_table :parents do |t|
t.integer :children_count, :default => 0
end
Parent.reset_column_information
Parent.all.each do |p|
Parent.update_counters p.id, :children_count => p.children.length
end
end
def down
change_table :parents do |t|
t.remove :children_count
end
end
Now change your Child model:
class Child
belongs_to :parent, :counter_cache => true
end
At this point you can use size and empty? without touching the children table:
Parent.all.reject { |parent| parent.children.empty? }
Note that length doesn't use the counter cache whereas size and empty? do.
Here's the current query:
@feed = RatedActivity.find_by_sql(["(select *, null as queue_id, 3 as model_table_type from rated_activities where user_id in (?)) " +
"UNION (select *, null as queue_id, null as rating, 2 as model_table_type from watched_activities where user_id in (?)) " +
"UNION (select *, null as rating, 1 as model_table_type from queued_activities where user_id in (?)) " +
"ORDER BY activity_datetime DESC limit 100", friend_ids, friend_ids, friend_ids])
Now, this is a bit of a kludge, since there are actually models set up for:
class RatedActivity < ActiveRecord::Base
belongs_to :user
belongs_to :media
end
class QueuedActivity < ActiveRecord::Base
belongs_to :user
belongs_to :media
end
class WatchedActivity < ActiveRecord::Base
belongs_to :user
belongs_to :media
end
I would love to know how to use ActiveRecord in Rails 3.0 to achieve basically the same thing as the crazy union I have there.
It sounds like you should consolidate these three separate models into a single model. Statuses such as "watched", "queued", or "rated" are then all implicit based on attributes of that model.
class Activity < ActiveRecord::Base
belongs_to :user
belongs_to :media
scope :for_users, lambda { |u|
where("user_id IN (?)", u)
}
scope :rated, where("rating IS NOT NULL")
scope :queued, where("queue_id IS NOT NULL")
scope :watched, where("watched IS NOT NULL")
end
Then, you can call Activity.for_users(friend_ids) to get all three groups as you are trying to accomplish above... or you can call Activity.for_users(friend_ids).rated (or queued or watched) to get just one group. This way, all of your Activity logic is consolidated in one place. Your queries become simpler (and more efficient) and you don't have to maintain three different models.
I think that your current solution is OK if you are dealing with a legacy DB. As a native query it is also the most efficient, since your DBMS does all the hard work (union, sort, limit).
If you really want to get rid of the SQL UNION without changing the schema, then you can move the union into Ruby by concatenating arrays, but this may be slower.
result = RatedActivity.
select("*, null as queue_id, 3 as model_table_type").
where(:user_id=>friend_ids).
order("activity_datetime DESC").
limit(100).all +
QueuedActivity...
Finally you need to sort and limit the combined array with
result.sort_by(&:activity_datetime).reverse[0..99]
This is just a proof of concept; as you can see, it is inefficient in some respects (3 queries, sorting in Ruby, limit). I would stay with find_by_sql.
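The merge step on its own can be sketched in plain Ruby. The FeedItem struct and the sample timestamps below are hypothetical stand-ins for the records returned by the three queries; the point is only the concatenate, sort descending, take 100 sequence.

```ruby
# Hypothetical stand-in for rows coming back from the three queries.
FeedItem = Struct.new(:model_table_type, :activity_datetime)

rated   = [FeedItem.new(3, Time.utc(2011, 1, 3))]
watched = [FeedItem.new(2, Time.utc(2011, 1, 5))]
queued  = [FeedItem.new(1, Time.utc(2011, 1, 4))]

feed = (rated + watched + queued)    # the "UNION" as array concatenation
  .sort_by(&:activity_datetime)      # ascending by timestamp
  .reverse                           # newest first, like ORDER BY ... DESC
  .first(100)                        # like LIMIT 100

puts feed.map(&:model_table_type).inspect
```

Each per-model query has to be ordered newest-first before its own limit, otherwise the 100 rows it returns are arbitrary and the merged top 100 may be missing items.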