Multiple joins to the same model using multiple belongs_to associations - ruby-on-rails

I have a class Agreement that has two belongs_to associations to a Member class - referenced as primary and secondary:
class Agreement < ActiveRecord::Base
belongs_to :primary, class_name: 'Member'
belongs_to :secondary, class_name: 'Member'
...
def self.by_member(member_attribute_hash)
# returns any agreements that have a primary OR secondary member matching any of the values
# in the member_attribute_hash
...
end
end
The Member class has no knowledge of the association with the Agreement class - it does not need to:
class Member < ActiveRecord::Base
# contains surname, given_names, member_number
...
def self.by_any(member_attribute_hash)
# returns any member where the member matches on surname OR given_names OR member_number
...
end
end
What I would like to do is search for all agreements where the primary or secondary member matches a set of criteria.
From previous work (see question #14139609), I've sorted out how to build the conditional where clause for Member.by_any().
Trying to reuse that method while searching for Agreements led me to try this:
class Agreement < ActiveRecord::Base
...
def self.by_member(member_attribute_hash)
Agreement.joins{primary.outer}.merge(Member.by_any(member_attribute_hash)).joins{secondary.outer}.merge(Member.by_any(member_attribute_hash))
end
end
On running this in the console, with a member_attribute_hash = {surname: 'Freud'}, the generated SQL fails to honour the alias generated for the second join to member:
SELECT "agreements".*
FROM "agreements"
LEFT OUTER JOIN "members"
ON "members"."id" = "agreements"."primary_id"
LEFT OUTER JOIN "members" "secondarys_agreements"
ON "secondarys_agreements"."id" = "agreements"."secondary_id"
WHERE "members"."surname" ILIKE 'Freud%'
AND "members"."surname" ILIKE 'Freud%'
Notice the duplicate conditions in the WHERE clause. This will return Agreements where the primary Member has a surname like 'Freud', but ignores the secondary Member condition because the alias is not flowing through the merge.
Any ideas?

After struggling to understand this, I ended up replacing the Member.by_any scope with a Squeel sifter:
class Member < ActiveRecord::Base
# contains surname, given_names, member_number
...
def self.by_any(member_attribute_hash)
# returns any member where the member matches on surname OR given_names OR member_number
squeel do
[
(surname =~ "#{member_attribute_hash[:surname]}%" if member_attribute_hash[:surname].present?),
(given_names =~ "#{member_attribute_hash[:given_names]}%" if member_attribute_hash[:given_names].present?),
(member_number == member_attribute_hash[:member_number] if member_attribute_hash[:member_number].present?)
].compact.reduce(:|)
# compact to remove the nils, reduce to combine the cases with |
end
end
end
The only difference (code-wise) between the sifter and the scope is the replacement of the where in the scope with squeel in the sifter.
So, instead of using a merge to access the Member.by_any scope from the Agreement model, I was now able to reference the Member :by_any sifter from the Agreement model. It looked like:
class Agreement < ActiveRecord::Base
...
def self.by_member(member_attribute_hash)
Agreement.joins{primary.outer}.where{primary.sift :by_any, member_attribute_hash}.joins{secondary.outer}.where{secondary.sift :by_any, member_attribute_hash}
end
end
This fixed the aliasing issue - begin celebrating!:
SELECT "agreements".*
FROM "agreements"
LEFT OUTER JOIN "members"
ON "members"."id" = "agreements"."primary_id"
LEFT OUTER JOIN "members" "secondarys_agreements"
ON "secondarys_agreements"."id" = "agreements"."secondary_id"
WHERE "members"."surname" ILIKE 'Freud%'
AND "secondarys_agreements"."surname" ILIKE 'Freud%'
but I still wasn't getting the results I expected - celebration put on hold. What was wrong? The AND in the WHERE clause: it requires both the primary and the secondary member to match, whereas I wanted agreements where either one matches. After a bit more digging (and a night away from the computer), a refreshed mind decided to try this:
class Agreement < ActiveRecord::Base
...
def self.by_member(member_attribute_hash)
Agreement.joins{primary.outer}.joins{secondary.outer}.where{(primary.sift :by_any, member_attribute_hash) | (secondary.sift :by_any, member_attribute_hash)}
end
end
Producing this:
SELECT "agreements".*
FROM "agreements"
LEFT OUTER JOIN "members"
ON "members"."id" = "agreements"."primary_id"
LEFT OUTER JOIN "members" "secondarys_agreements"
ON "secondarys_agreements"."id" = "agreements"."secondary_id"
WHERE ((("members"."surname" ILIKE 'Freud%')
OR ("secondarys_agreements"."surname" ILIKE 'Freud%')))
Ah sweet... restart celebration... now I get Agreements that have a primary or secondary Member that matches according to the rules defined in a single sifter.
And for all the work he's done on Squeel, a big shout-out goes to Ernie Miller.
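To make the role of the alias concrete, here is a minimal plain-Ruby sketch (no Squeel or ActiveRecord involved). It builds the "match any attribute" fragment once per table alias and joins the two fragments with OR, which is roughly what the single sifter achieves above. The helper name member_condition and the placeholder-plus-binds return shape are assumptions made for illustration, not part of Squeel.

```ruby
# Hypothetical helper (not a Squeel API): builds the "match on any attribute"
# condition for ONE table alias and returns [sql_fragment, bind_values].
def member_condition(table_alias, attrs)
  clauses = []
  binds = []
  if attrs[:surname]
    clauses << %("#{table_alias}"."surname" ILIKE ?)
    binds << "#{attrs[:surname]}%"
  end
  if attrs[:given_names]
    clauses << %("#{table_alias}"."given_names" ILIKE ?)
    binds << "#{attrs[:given_names]}%"
  end
  if attrs[:member_number]
    clauses << %("#{table_alias}"."member_number" = ?)
    binds << attrs[:member_number]
  end
  ["(#{clauses.join(' OR ')})", binds]
end

attrs = { surname: "Freud" }

# Same rule, two aliases: "members" for primary, the generated alias for secondary.
primary_sql, primary_binds = member_condition("members", attrs)
secondary_sql, secondary_binds = member_condition("secondarys_agreements", attrs)

# OR (not AND): an agreement qualifies if either member matches.
where_sql = "#{primary_sql} OR #{secondary_sql}"
where_binds = primary_binds + secondary_binds

puts where_sql
puts where_binds.inspect
```

The point of the sketch is only that the condition has to be parameterised by the alias; merging a scope bakes in the base table name, while the sifter is evaluated against whichever association it is sifted on.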

Related

Avoiding N+1 queries in a Rails multi-table query

This is the query I've got at present:
SELECT t1.discipline_id AS discipline1,
t2.discipline_id AS discipline2,
COUNT(DISTINCT t1.product_id) as product_count
FROM (SELECT "product_disciplines".* FROM "product_disciplines") t1
INNER JOIN (SELECT "product_disciplines".* FROM "product_disciplines") t2
ON t1.product_id = t2.product_id
WHERE (t1.discipline_id < t2.discipline_id)
GROUP BY t1.discipline_id, t2.discipline_id
ORDER BY "product_count" DESC
Basically, I've got a list of Products and Disciplines, and each Product may be associated with one or more Disciplines. This query lets me figure out, for each possible (distinct) pair of disciplines, how many products are associated with them. I'll use this as input to a dependency wheel in Highcharts.
The problem arises when I involve Active Model Serializers. This is my controller:
class StatsController < ApplicationController
before_action :get_relationships, only: [:relationships]
def relationships
x = @relationships
.select('t1.discipline_id AS discipline1, t2.discipline_id AS discipline2, COUNT(DISTINCT t1.product_id) as product_count')
.order(product_count: :DESC)
.group('t1.discipline_id, t2.discipline_id')
render json: x, each_serializer: RelationshipSerializer
end
private
def get_relationships
query = ProductDiscipline.all
@relationships = ProductDiscipline
.from(query, :t1)
.joins("INNER JOIN (#{query.to_sql}) t2 on t1.product_id = t2.product_id")
.where('t1.discipline_id < t2.discipline_id')
end
end
each_serializer points to this class:
class RelationshipSerializer < ApplicationSerializer
has_many :disciplines do
Discipline.where(id: [object.discipline1, object.discipline2])
end
attribute :product_count
end
When I query the database, there are ~1300 possible pairs, which turns my single query into ~1300 additional Discipline lookups.
Is there a way to avoid the N+1 queries problem with this structure?
I ended up splitting this into two separate API queries. RelationshipSerializer now returns just the discipline IDs:
class RelationshipSerializer < ApplicationSerializer
# has_many :disciplines do
# # Discipline.where(id: [object.discipline1, object.discipline2])
# [object.discipline1, object.discipline2].to_json
# end
attributes :discipline1, :discipline2
attribute :product_count
end
Since in my app I already need the list of available disciplines, I chose to correlate them client-side.
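The correlation step amounts to one lookup table instead of one query per pair. The answer does this on the client; the sketch below shows the same idea in plain Ruby, with made-up sample data and hypothetical key names (from, to, weight) standing in for whatever the chart expects.

```ruby
# Rows shaped like the relationships endpoint's output (made-up sample data).
pairs = [
  { discipline1: 1, discipline2: 2, product_count: 7 },
  { discipline1: 1, discipline2: 3, product_count: 4 }
]

# The discipline list, fetched once (made-up sample data).
disciplines = [
  { id: 1, name: "Physics" },
  { id: 2, name: "Chemistry" },
  { id: 3, name: "Biology" }
]

# Build an id => name index once, then resolve every pair in memory.
names_by_id = disciplines.map { |d| [d[:id], d[:name]] }.to_h

correlated = pairs.map do |pair|
  {
    from: names_by_id[pair[:discipline1]],
    to: names_by_id[pair[:discipline2]],
    weight: pair[:product_count]
  }
end

puts correlated.inspect
```

However many pairs there are, the disciplines are loaded once, so the ~1300 per-pair lookups disappear.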

How to order cumulated payments in ActiveRecord?

In my Rails app I have the following models:
class Person < ApplicationRecord
has_many :payments
end
class Payment < ApplicationRecord
belongs_to :person
end
How can I get the payments for each person and order them by sum?
This is my controller:
class SalesController < ApplicationController
def index
@people = current_account.people.includes(:payments).where(:payments => { :date => @range }).order("payments.amount DESC")
end
end
It gives me the correct numbers but the order is wrong. I want it to start with the person having the highest sum of payments within a range.
This is the current Payments table:
How can this be done?
This should work for you:
payments = Payment.arel_table
sum_payments = Arel::Table.new('sum_payments')
payments_total = payments.join(
payments.project(
payments[:person_id],
payments[:amount].sum.as('total')
)
.where(payments[:date].between(@range))
.group( payments[:person_id])
.as('sum_payments'))
.on(sum_payments[:person_id].eq(Person.arel_table[:id]))
On its own this generates broken SQL (it selects nothing from payments, which is syntactically invalid, and it joins to people, which does not appear in this query), but we really only need the join, e.g.
payments_total.join_sources.first.to_sql
#=> INNER JOIN (SELECT payments.person_id,
# SUM(payments.amount) AS total
# FROM payments
# WHERE
# payments.date BETWEEN ... AND ...
# GROUP BY payments.person_id) sum_payments
# ON sum_payments.person_id = people.id
Knowing this, we can pass the join_sources to ActiveRecord::QueryMethods#joins and let Rails and Arel handle the rest, like so:
current_account
.people
.includes(:payments)
.joins(payments_total.join_sources)
.where(:payments => { :date => @range })
.order("sum_payments.total DESC")
Which should result in SQL akin to
SELECT
-- ...
FROM
people
INNER JOIN payments ON payments.person_id = people.id
INNER JOIN ( SELECT payments.person_id,
SUM(payments.amount) as total
FROM payments
WHERE
payments.date BETWEEN -- ... AND ...
GROUP BY payments.person_id) sum_payments ON
sum_payments.person_id = people.id
WHERE
payments.date BETWEEN -- ... AND ..
ORDER BY
sum_payments.total DESC
This will show all the people having made payments in a given date range (along with those payments) sorted by the sum of those payments in descending order.
This is untested as I did not bother to set up a whole rails application but it should be functional.
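For readers who want to check what the sum_payments subquery is meant to compute, here is a small in-memory sketch in plain Ruby: filter payments by date range, total them per person, and sort by that total in descending order. The Struct and the sample rows are invented for illustration; the real work should stay in SQL as shown above.

```ruby
require "date"

# Stand-in for the payments table (made-up sample data).
Payment = Struct.new(:person_id, :amount, :date)

payments = [
  Payment.new(1, 100, Date.new(2018, 1, 5)),
  Payment.new(2, 300, Date.new(2018, 1, 10)),
  Payment.new(1, 250, Date.new(2018, 1, 20)),
  Payment.new(2, 500, Date.new(2017, 12, 31)) # outside the range, ignored
]

range = Date.new(2018, 1, 1)..Date.new(2018, 1, 31)

# WHERE date BETWEEN ... / GROUP BY person_id / SUM(amount) / ORDER BY total DESC
totals = payments
  .select { |payment| range.cover?(payment.date) }
  .group_by(&:person_id)
  .map { |person_id, rows| [person_id, rows.sum(&:amount)] }
  .sort_by { |_person_id, total| -total }

puts totals.inspect
```

Ordering by payments.amount, as in the question, sorts by individual payment rows; ordering by the per-person total is what puts the highest-paying person first.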

Get dynamic table alias in Rails scope

I have two scopes that are shared by the majority of my models. They have raw SQL that directly refers to the model's table name, and that doesn't play nicely with Arel:
class ApplicationRecord < ActiveRecord::Base
valid = lambda do |positive = true|
if %w[validForBegin validForEnd].all? { |c| base_class.column_names.include?(c) }
condition = "NOW() BETWEEN #{base_class.table_name}.validForBegin AND #{base_class.table_name}.validForEnd"
condition = "!(#{condition})" unless positive
where(condition)
end
end
scope :valid, valid
scope :invalid, -> { valid(false) }
end
# Sample usage
class Party < ApplicationRecord
has_one :name,
-> { valid },
class_name: 'PartyName',
foreign_key: :partyId
has_many :expired_names,
-> { invalid },
class_name: 'PartyName',
foreign_key: :partyId
end
Since my scope refers directly to the model's table_name, I can't join on both associations at once:
Party.joins(:name, :expired_names).first
# Produces this SQL statement
SELECT `party`.*
FROM `party`
INNER JOIN `party_name` ON `party_name`.`partyId` = `party`.`id`
AND (NOW() BETWEEN party_name.validForBegin AND party_name.validForEnd)
INNER JOIN `party_name` `expired_names_party` ON `expired_names_party`.`partyId` = `party`.`id`
AND (!(NOW() BETWEEN party_name.validForBegin AND party_name.validForEnd))
ORDER BY `party`.`id` ASC LIMIT 1
Note that both 'AND' conditions on the joins are referring to the table party_name. The second one should instead be referring to expired_names_party, the dynamically generated table alias. For more complicated Rails queries where Arel assigns an alias to EVERY table, both joins will fail.
Is it possible for my scope to use the alias assigned to it by Arel at execution time?
I created this repo to help test your situation:
https://github.com/wmavis/so_rails_arel
I believe the issue is that you are trying to use the same class for both relationships. By default, rails wants to use the name of the class for the table associated with that class. Therefore, it is using the table name party_name for both queries.
To get around this issue, I created an ExpiredName class that inherits from PartyName but tells rails to use the expired_names table:
https://github.com/wmavis/so_rails_arel/blob/master/app/models/expired_name.rb
class ExpiredName < PartyName
self.table_name = 'expired_names'
end
This seems to fix the issue in my code:
Party.joins(:name, :expired_names).to_sql
=> "SELECT \"parties\".* FROM \"parties\"
INNER JOIN \"party_names\"
ON \"party_names\".\"party_id\" = \"parties\".\"id\"
INNER JOIN \"expired_names\"
ON \"expired_names\".\"party_id\" = \"parties\".\"id\""
Let me know if it doesn't work for you and I'll try to help.

How to safely create an ActiveRecord class method that unambiguously specifies a column?

I'm trying to create a utility method on ApplicationRecord, like this:
def self.created_since(time)
where('created_at > ?', time)
end
But when this is composed into a chain of methods involving other tables, the column reference becomes ambiguous and I get an error:
Foo.includes(:bars).created_since(3.days.ago)
--> SELECT "foos".* FROM "foos" INNER JOIN "bars" ON "bars"."foo_id" = "foos"."id" WHERE (created_at > '2018-01-31 22:47:30.758235')
I would like to fix this problem. What is the best way to do so?
One way:
def self.created_since(time)
where("#{table_name}.created_at > ?", time)
end
But this is interpolating directly into a SQL string, which opens me up to an injection attack in the unlikely scenario that the attacker controls table_name.
(e.g. I'm IKEA and the Table < ApplicationRecord model defines table_name by concatenating some user-supplied data)
Another attempt:
def self.created_since(time)
where("?.created_at > ?", table_name, time)
end
does not work, as we get quotes around the table name
--> SELECT "foos".* FROM "foos" INNER JOIN "bars" ON "bars"."foo_id" = "foos"."id" WHERE ('foos'.created_at > '2018-01-31 22:47:30.758235')
and that's a syntax error.
You could define a shared scope in a concern and pass the table name as an optional argument to the scope.
# app/models/concerns/created_since.rb
module CreatedSince
extend ActiveSupport::Concern
included do
scope :created_since, -> (time, table = self.table_name){ where("#{table}.created_at >= ?", time) }
end
end
# app/models/application_record.rb
class ApplicationRecord < ActiveRecord::Base
self.abstract_class = true
include CreatedSince
end
# Usage
Foo.includes(:bars).created_since(3.days.ago)
Hope this helps!
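If the injection worry from the question is the main concern, one option is to quote the table name as an identifier before interpolating it. ActiveRecord connection adapters expose quote_table_name for this purpose. The plain-Ruby sketch below only imitates the standard SQL rule (wrap the name in double quotes and double any embedded double quote) so the effect can be seen without a database; quote_identifier and created_since_sql are hypothetical helpers, not Rails methods.

```ruby
# Hypothetical stand-in for an adapter's identifier quoting:
# wrap in double quotes and double any embedded double quote.
def quote_identifier(name)
  '"' + name.to_s.gsub('"', '""') + '"'
end

# Builds the condition string; the time value stays a bind placeholder.
def created_since_sql(table_name)
  "#{quote_identifier(table_name)}.created_at > ?"
end

puts created_since_sql("foos")
# A hostile table name stays inside a single quoted identifier.
puts created_since_sql(%(foos"; DROP TABLE foos; --))
```

In a real model the equivalent would go through the connection's own quoting so that the adapter's rules (for example backticks on MySQL) apply.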

how to modify complex find_by_sql query w/ union into rails 3

here's the current query:
@feed = RatedActivity.find_by_sql(["(select *, null as queue_id, 3 as model_table_type from rated_activities where user_id in (?)) " +
"UNION (select *, null as queue_id, null as rating, 2 as model_table_type from watched_activities where user_id in (?)) " +
"UNION (select *, null as rating, 1 as model_table_type from queued_activities where user_id in (?)) " +"ORDER BY activity_datetime DESC limit 100", friend_ids, friend_ids, friend_ids])
Now, this is a bit of kludge, since there are actually models set up for:
class RatedActivity < ActiveRecord::Base
belongs_to :user
belongs_to :media
end
class QueuedActivity < ActiveRecord::Base
belongs_to :user
belongs_to :media
end
class WatchedActivity < ActiveRecord::Base
belongs_to :user
belongs_to :media
end
Would love to know how to use ActiveRecord in Rails 3.0 to achieve basically the same thing as the crazy union I have there.
It sounds like you should consolidate these three separate models into a single model. Statuses such as "watched", "queued", or "rated" are then all implicit based on attributes of that model.
class Activity < ActiveRecord::Base
belongs_to :user
belongs_to :media
scope :for_users, lambda { |u|
where("user_id IN (?)", u)
}
scope :rated, where("rating IS NOT NULL")
scope :queued, where("queue_id IS NOT NULL")
scope :watched, where("watched IS NOT NULL")
end
Then, you can call Activity.for_users(friend_ids) to get all three groups as you are trying to accomplish above... or you can call Activity.for_users(friend_ids).rated (or queued or watched) to get just one group. This way, all of your Activity logic is consolidated in one place. Your queries become simpler (and more efficient) and you don't have to maintain three different models.
I think your current solution is OK for a legacy DB. As a native query it is also the most efficient, since your DBMS does all the hard work (union, sort, limit).
If you really want to get rid of the SQL UNION without changing the schema, you can replace the union with Ruby array concatenation, but this may be slower.
result = RatedActivity.
select("*, null as queue_id, 3 as model_table_type").
where(:user_id=>friend_ids).
limit(100).all +
QueuedActivity...
Finally, you need to sort and limit the combined array with
result.sort_by(&:activity_datetime).reverse[0..99]
This is just a proof of concept; as you can see, it is inefficient at several points (three queries, sorting in Ruby, limiting in Ruby). I would stay with find_by_sql.
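The combine-then-sort step can be tried in isolation. This is a minimal sketch with in-memory stand-ins for the three result sets; the Struct and the sample timestamps are invented, and the three arrays would come from the three ActiveRecord queries in practice.

```ruby
# Stand-in for rows from the three activity tables (made-up sample data).
Activity = Struct.new(:kind, :activity_datetime)

rated   = [Activity.new(:rated, Time.at(300)), Activity.new(:rated, Time.at(100))]
watched = [Activity.new(:watched, Time.at(200))]
queued  = [Activity.new(:queued, Time.at(400))]

# Concatenate, sort newest first (ORDER BY activity_datetime DESC), keep 100.
feed = (rated + watched + queued)
  .sort_by(&:activity_datetime)
  .reverse
  .first(100)

puts feed.map(&:kind).inspect
```

Note that sort with a one-argument block such as &:activity_datetime does not work, because sort expects a two-element comparison; sort_by is the method that takes a key.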
