Rails 4 ActiveRecord group_by and nested loops - ruby-on-rails

The project I am working on has the following models:
Button: has many extra_prices (one for each Currency)
ExtraPrice: belongs to a product (in that case a Button) and has as an attribute a currency_id
Currency: there is a reference to a currency_id on an ExtraPrice as I mentioned above. FYI there are 4 currencies so far in the app.
Some of the buttons don't have an extra_price set in one of the currencies which causes an error in another part of the app.
I am trying to write a rake task that would:
- check all buttons missing an extra_price for one or more currencies
- find out which currency is missing
- create the extra price
So far I toyed with a few options but I am stuck (I am pretty junior as a dev, and especially on the back-end side/DB query).
I was thinking something like:
Button.transaction do
  currencies = Currency.all.pluck :id
  buttons_no_extra_price = Button.select { |button| button.extra_prices.length < currencies.length }
end
and then I'm stuck :)
I would like to do something like
buttons_no_extra_price.group_by(|button| button.extra_prices.currency_ids)
(wrong formatting of course since extra_prices is an array and currency_id is an attribute on each extra_price)
but instead of grouping them by currency_id, I would like to group them by the missing currency_id or ids, maybe using the currencies variable above.
missing_prices = {currency1: [button1, button2], currency2: [button192, button208], currency3: [button392, button220]...}
This way I could loop through every Currency and create an extra_price on each button object of the nested array like:
missing_prices.each do |currency, array_of_buttons|
  array_of_buttons.each do |button|
    ExtraPrice.create!(currency: currency, product: button)
  end
end
I am also thinking that from a performance standpoint it needs to be optimized so maybe work more with includes, joins, etc. but it's a bit above my current abilities to be totally honest.
So any help would be appreciated :)
Thanks!

So I think I follow your question, and if I do, this should do the trick. Let me know if you have any questions. Note that there is probably a more performant way to do this, but given that it's a rake task, performance won't need to be fully optimized unless you are dealing with millions of records.
all_currency_ids = Currency.all.pluck(:id)
# The association is has_many :extra_prices, so the eager_load argument must be plural.
Button.eager_load(:extra_prices).group('buttons.id').having('count(extra_prices.id) < ?', all_currency_ids.count).each do |button|
  missing_currency_ids = all_currency_ids - button.extra_prices.pluck(:currency_id)
  missing_currency_ids.each do |missing_currency_id|
    ExtraPrice.create!(currency: Currency.find_by(id: missing_currency_id), product: button)
  end
end
Button.eager_load(:extra_prices).group('buttons.id').having('count(extra_prices.id) < ?', all_currency_ids.count) is what gets you the buttons with missing extra prices. This hinges on each button being meant to have exactly one extra price per currency, so I hope I interpreted that correctly.
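If you do want the hash from the question, keyed by the missing currency, the grouping itself is plain Ruby once you know which currency ids each button already has. The following is only a sketch: the hashes stand in for ActiveRecord objects, and the ids and the currency_ids key are made up for illustration.

```ruby
# Hypothetical stand-ins: in the app these would come from Currency and Button.
all_currency_ids = [1, 2, 3, 4]
buttons = [
  { id: 10, currency_ids: [1, 2, 3, 4] }, # complete, should not appear in the result
  { id: 11, currency_ids: [1, 3] },       # missing currencies 2 and 4
  { id: 12, currency_ids: [] }            # missing every currency
]

# Build { missing_currency_id => [buttons lacking that currency] }.
missing_prices = Hash.new { |hash, key| hash[key] = [] }
buttons.each do |button|
  (all_currency_ids - button[:currency_ids]).each do |currency_id|
    missing_prices[currency_id] << button
  end
end

# Reduce to button ids only, to make the result easy to inspect.
missing_ids_by_currency = missing_prices.transform_values do |list|
  list.map { |button| button[:id] }
end
```

With real models you would loop over missing_prices and create one extra price per (currency, button) pair, as in the question.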

(Note: I'm not 100% familiar with Rails 4, only 5 and 6. The underlying SQL principles remain the same regardless and are adaptable.)
You can do this in a single query if your relationships are set up correctly.
class Button < ApplicationRecord
has_many :extra_prices
has_many :currencies, through: :extra_prices
end
class ExtraPrice < ApplicationRecord
belongs_to :button
belongs_to :currency
end
class Currency < ApplicationRecord
has_many :extra_prices
end
I've added a relationship between Currency and ExtraPrice. This allows us to set up a relationship between Button and Currency using has_many :currencies, through: :extra_prices. Then we can get a Button's Currencies with button.currencies.
Now we can do a left join between Button and ExtraPrice and Currency. A left join, as opposed to the normal inner join, will pick up Buttons that have no ExtraPrices nor Currencies. We can't use Rails's left_joins, it will not do the right thing.
buttons_missing_currencies = Button
.includes(:currencies)
.joins("left join extra_prices ep on ep.button_id = buttons.id")
.joins("left join currencies c on c.id = ep.currency_id")
.having("count(c.id) < ?", Currency.count)
.group("buttons.id")
That will give you all the Buttons which lack a Currency in a single, efficient query. includes(:currencies) means each Button's Currencies will already be loaded avoiding making N+1 queries.
Now we can look through each button, discover which currencies are missing, and fill them in.
all_currencies = Currency.all
buttons_missing_currencies.each do |button|
missing_currencies = all_currencies - button.currencies
missing_currencies.each do |missing_currency|
button.extra_prices.create!(currency: missing_currency)
end
end

Related

Problems with displaying the correct number of items in an has_and_belongs_to_many association

I have a model UseCases (about 6.000 rows) and EducationalObjectives (about 4.000 rows) associated with has_and_belongs_to_many(EducationalObjectivesUseCases with about 8.000 rows). Some of the EducationalObjectives belong to subjectA (about 4.500 rows in EducationalObjectivesUseCases) and some to subjectB (about 3.500 rows in EducationalObjectivesUseCases).
Now I want to display a list of all UseCases which are tied to the EducationalObjectives of subjectA. That should be about 3.500 rows, but I get about 4.500 rows (you've guessed it: the number of associations within EducationalObjectivesUseCases), since UseCases with several EducationalObjectives on subjectA are displayed once per association.
My thinking was that I can only tell through the HABTM association that I need the UseCases for subjectA, but I don't know how to avoid duplicate entries.
class UseCase < ApplicationRecord
has_and_belongs_to_many :educational_objectives
end
class EducationalObjective < ApplicationRecord
has_and_belongs_to_many :use_cases
end
class EducationalObjectivesUseCase < ApplicationRecord
belongs_to :educational_objective
belongs_to :use_case
end
class UseCasesController < ApplicationController
  def index
    @use_cases = UseCase.all.
      order(:use_case).
      joins(:educational_objectives).
      where('educational_objectives.subject_id = ?', 2)
  end
end
How do I get Rails to display only the used UseCases for subjectA once (only 3.500 rows)? Where is my mistake?
Thanks in advance!
The quickest way to solve this is to call #distinct on the where-chain. Since the select is automatically set to use_cases.* this will work and filter out duplicated records.
def index
  @use_cases = UseCase.joins(:educational_objectives)
                      .where(educational_objectives: {subject_id: 2})
                      .order(:use_case)
                      .distinct
end
Alternatively this can be solved using a sub-query.
def index
  educational_objectives = EducationalObjective.where(subject_id: 2)
  use_case_ids = EducationalObjectivesUseCase
                 .where(educational_objective_id: educational_objectives)
                 .select(:use_case_id)
  @use_cases = UseCase.where(id: use_case_ids).order(:use_case)
end
edit
The sub-query code will execute 1 SQL query, just like the code for the distinct version. When running it in the console, suffix each statement with ;nil to prevent the console from calling #inspect on the result (which it does to show you the return value). If you don't do this, the console will try to show the result and trigger the query before we have finished building it. It will still work, but it will look like multiple queries are being run.
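To see why the join returns more rows than there are use cases, here is a small plain-Ruby sketch of what the database does. The rows and ids are invented for illustration; only the shape (one join-table row per association) mirrors the question.

```ruby
# Hypothetical rows standing in for the educational_objectives_use_cases table.
join_rows = [
  { use_case_id: 1, educational_objective_id: 100 },
  { use_case_id: 1, educational_objective_id: 101 },
  { use_case_id: 2, educational_objective_id: 100 },
  { use_case_id: 3, educational_objective_id: 200 }
]
# Hypothetical lookup: educational objective id => subject id.
subject_of = { 100 => 2, 101 => 2, 200 => 1 }

# An inner join yields one result row per matching association,
# so use case 1 shows up twice for subject 2.
joined_ids = join_rows
  .select { |row| subject_of[row[:educational_objective_id]] == 2 }
  .map { |row| row[:use_case_id] }

# DISTINCT collapses those rows to one per use case.
distinct_ids = joined_ids.uniq
```

Here joined_ids has three entries while distinct_ids has two, which is the same effect as the 4.500 versus 3.500 rows in the question.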

ActiveRecord sort model on attribute of last has_many relation

I've been digging around for this for awhile... I can't find a graceful solution. I have loans and loans has_many :decisions. decisions has an attribute that I care about, called risk_rating.
I'd like to sort loans based on the most recent decision (based on created_at, per usual), but by the risk_rating.
Loan.includes(:decisions).references(:decisions).order('decisions.risk_rating DESC') doesn't work...
I want loans... sorted by their most recent decision's risk_rating. This seems like it should be easier than it is.
I'm currently doing this outside of the database like this, but it's chewing up time and memory:
Loan.all.sort do |x,y|
x.decisions.last.try(:risk_rating).to_f <=> y.decisions.last.try(:risk_rating).to_f
end
I'd like to show the performance I'm getting with the proposed answer, along with an inaccuracy...
Benchmark.bm do |x|
x.report{ Loan.joins('LEFT JOIN decisions ON decisions.loan_id = loans.id').group('loans.id').order('MAX(decisions.risk_rating) DESC').limit(10).map{|l| l.decisions.last.try(:risk_rating)} }
end
user system total real
0.020000 0.000000 0.020000 ( 20.573096)
=> [0.936775, 0.934465, 0.932088, 0.922352, 0.921882, 0.794724, 0.919432, 0.918385, 0.916952, 0.914938]
The order isn't right. That 0.794724 is out of place.
To that extent... I'm only seeing one attribute in the proposed answer. I don't see the connection =/
Alright, it looks like I'm working late tonight because I couldn't help but jump in:
class Loan < ApplicationRecord
has_many :decisions
has_one :latest_decision, -> { merge(Decision.latest) }, class_name: 'Decision'
end
class Decision < ApplicationRecord
  belongs_to :loan

  # Class method, so it can be called as Decision.latest in the association scope.
  def self.latest
    t1 = arel_table
    t2 = arel_table.alias('t2')
    # Self join based on `loan_id` prefer latest `created_at`
    join_on = t1[:loan_id].eq(t2[:loan_id]).and(
      t1[:created_at].lt(t2[:created_at]))
    where(t2[:loan_id].eq(nil)).joins(
      t1.create_join(t2, t1.create_on(join_on), Arel::Nodes::OuterJoin)
    )
  end
end
Loan.includes(:latest_decision)
This doesn't sort, just provides the latest decision for each loan. Throwing an order that references access_codes messes things up because of the table aliasing. I don't have the time to work that kink out now, but I bet you can figure it out if you check out some of the great resources on Arel and how to use it with ActiveRecord. I really enjoy this one.
First, let's write the SQL query that selects the necessary data. SO has a question which may help here: Select most recent row with GROUP BY in MySQL. My best version:
SELECT loans.*
FROM loans
LEFT JOIN (
SELECT loan_id, MAX(id) as id
FROM decisions
GROUP BY loan_id) d ON d.loan_id = loans.id
LEFT JOIN decisions ON decisions.id = d.id
ORDER BY decisions.risk_rating DESC
This code assumes that MAX(id) gives the id of the most recent row in each group.
You may do the same query by this Rails code:
sub_query =
Decision.select('loan_id, MAX(id) as id').
group(:loan_id).to_sql
Loan.
joins("LEFT JOIN (#{sub_query}) d ON d.loan_id = loans.id").
joins("LEFT JOIN decisions ON decisions.id = d.id").
order("decisions.risk_rating DESC")
Unfortunately, I don't have MySQL at hand and I can't try this code. Hope it will work.
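A likely explanation for the out-of-place 0.794724 in the benchmark is that ORDER BY MAX(decisions.risk_rating) ranks each loan by its highest rating ever, while the displayed value is the rating of its latest decision. The plain-Ruby sketch below shows the difference on made-up data; the hashes are stand-ins for decisions rows, and "latest" is taken to mean the highest id, the same assumption the SQL above makes.

```ruby
# Hypothetical decisions rows; a higher id means a more recent decision.
decisions = [
  { id: 1, loan_id: :a, risk_rating: 0.95 },
  { id: 2, loan_id: :a, risk_rating: 0.79 }, # latest for loan a
  { id: 3, loan_id: :b, risk_rating: 0.92 }, # latest for loan b
  { id: 4, loan_id: :c, risk_rating: 0.50 },
  { id: 5, loan_id: :c, risk_rating: 0.93 }  # latest for loan c
]
by_loan = decisions.group_by { |decision| decision[:loan_id] }

max_rating    = ->(rows) { rows.map { |row| row[:risk_rating] }.max }
latest_rating = ->(rows) { rows.max_by { |row| row[:id] }[:risk_rating] }

# Equivalent of ORDER BY MAX(decisions.risk_rating) DESC.
by_max = by_loan.sort_by { |_loan, rows| -max_rating.call(rows) }.map(&:first)

# Equivalent of joining on MAX(id) per loan, then ordering by that row's rating.
by_latest = by_loan.sort_by { |_loan, rows| -latest_rating.call(rows) }.map(&:first)
```

Loan a comes first when ordering by the maximum, although its latest rating is the lowest of the three; ordering by the latest decision puts it last.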

Disable Rails STI for certain ActiveRecord queries only

I can disable STI for a complete model but I'd like to just disable it when doing certain queries. Things were working swell in Rails 3.2, but I've upgraded to Rails 4.1.13 and a new thing is happening that's breaking the query.
I have a base class called Person with a fairly complex scope.
class Person < ActiveRecord::Base
include ActiveRecord::Sanitization
has_many :approvals, :dependent => :destroy
has_many :years, through: :approvals
scope :for_current_or_last_year, lambda {
joins(:approvals).merge(Approval.for_current_or_last_year) }
scope :for_current_or_last_year_latest_only, ->{
approvals_sql = for_current_or_last_year.select('person_id AS ap_person_id, MAX(year_id) AS max_year, year_id AS ap_year_id, status_name AS ap_status_name, active AS ap_active, approved AS ap_approved').group(:id).to_sql
approvals_sql = select("*").from("(#{approvals_sql}) AS ap, approvals AS ap2").where('ap2.person_id = ap.ap_person_id AND ap2.year_id = ap.max_year').to_sql
select("people.id, ap_approved, ap_year_id, ap_status_name, ap_active").
joins("JOIN (#{approvals_sql}) filtered_people ON people.id = filtered_people.person_id").uniq
}
end
And inheriting classes called Member and Staff. The only related thing I had to comment out to get my tests to pass with Rails 4 is the association below. It may be the problem, but uncommenting it hasn't helped in this case.
class Member < Person
#has_many :approvals, :foreign_key => 'person_id', :dependent => :destroy, :class_name => "MemberApproval"
end
The problem happens when I do the query Member.for_current_or_last_year_latest_only
I get the error unknown column 'people.type'
When I look at the SQL, I can see the problem line but I don't know how to remove it or make it work.
Member.for_current_or_last_year_latest_only.to_sql results in.
SELECT DISTINCT people.id, ap_approved, ap_year_id, ap_status_name, ap_active
FROM `people`
JOIN (SELECT * FROM (SELECT person_id AS ap_person_id, MAX(year_id) AS max_year, year_id AS ap_year_id, status_name AS ap_status_name, active AS ap_active, approved AS ap_approved
FROM `people` INNER JOIN `approvals` ON `approvals`.`person_id` = `people`.`id`
WHERE `people`.`type` IN ('Member') AND ((`approvals`.`year_id` = 9 OR `approvals`.`year_id` = 8))
GROUP BY `people`.`id`) AS ap, approvals AS ap2
WHERE `people`.`type` IN ('Member') AND (ap2.person_id = ap.ap_person_id AND ap2.year_id = ap.max_year)) filtered_people ON people.id = filtered_people.person_id
WHERE `people`.`type` IN ('Member')
If I remove people.type IN ('Member') AND from the beginning of the second to last WHERE clause the query runs successfully. And btw, that part isn't in the query generated from the old Rails 3.2 code, neither is the one above it (only the last one matches the Rails 3.2 query). The problem is, that part is being generated from rails Single Table Inheritance I assume, so I can't just delete it from my query. It's not the only place that is getting added into the original query, but that's the only one that is causing it to break.
Does anybody have any idea how I can either disable STI for only certain queries or add something to my query that will make it work? I've tried putting people.type in every one of the SELECT queries to try and make it available but to no avail.
Thanks for taking the time to look at this.
I was apparently making this harder than it really was... I just needed to add unscoped to the front of the two approvals_sql sub-queries. Thanks for helping my brain change gears.

ActiveRecord query array intersection?

I'm trying to figure out the count of certain types of articles. I have a very inefficient query:
Article.where(status: 'Finished').select{|x| x.tags & Article::EXPERT_TAGS}.size
In my quest to be a better programmer, I'm wondering how to make this a faster query. tags is an array of strings in Article, and Article::EXPERT_TAGS is another array of strings. I want to find the intersection of the arrays, and get the resulting record count.
EDIT: Article::EXPERT_TAGS and article.tags are defined as Mongo arrays. These arrays hold strings, and I believe they are serialized strings. For example: Article.first.tags = ["Guest Writer", "News Article", "Press Release"]. Unfortunately this is not set up properly as a separate table of Tags.
2nd EDIT: I'm using MongoDB, so actually it is using a MongoWrapper like MongoMapper or mongoid, not ActiveRecord. This is an error on my part, sorry! Because of this error, it screws up the analysis of this question. Thanks PinnyM for pointing out the error!
Since you are using MongoDB, you could also consider a MongoDB-specific solution (aggregation framework) for the array intersection, so that you could get the database to do all the work before fetching the final result.
See this SO thread How to check if an array field is a part of another array in MongoDB?
Assuming that the entire tags list is stored in a single database field and that you want to keep it that way, I don't see much scope of improvement, since you need to get all the data into Ruby for processing.
However, there is one problem with your database query
Article.where(status: 'Finished')
# This translates into the following query
SELECT * FROM articles WHERE status = 'Finished'
Essentially, you are fetching all the columns whereas you only need the tags column for your process. So, you can use pluck like this:
Article.where(status: 'Finished').pluck(:tags)
# This translates into the following query
SELECT tags FROM articles WHERE status = 'Finished'
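Once only the tag arrays are loaded, the counting can be done on them directly. Note that the block in the question returns the intersection itself, and an empty array is truthy in Ruby, so that select keeps every article. A minimal sketch, assuming pluck gives back an array of tag arrays; the tag values and the expert_tags list are made up for illustration:

```ruby
# Hypothetical stand-in for Article::EXPERT_TAGS.
expert_tags = ['Guest Writer', 'Expert Opinion']

# Hypothetical result of plucking the tags field: one array per article.
tag_lists = [
  ['Guest Writer', 'News Article'],
  ['Press Release'],
  [],
  ['Expert Opinion']
]

# The question's version: `tags & expert_tags` is an Array, and even [] is truthy,
# so nothing is filtered out.
naive_count = tag_lists.select { |tags| tags & expert_tags }.size

# Count only the articles whose intersection is non-empty.
expert_count = tag_lists.count { |tags| (tags & expert_tags).any? }
```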
I answered a question regarding general intersection like queries in ActiveRecord here.
Extracted below:
The following is a general approach I use for constructing intersection like queries in ActiveRecord:
class Service < ActiveRecord::Base
belongs_to :person
def self.with_types(*types)
where(service_type: types)
end
end
class City < ActiveRecord::Base
has_and_belongs_to_many :services
has_many :people, inverse_of: :city
end
class Person < ActiveRecord::Base
belongs_to :city, inverse_of: :people
def self.with_cities(cities)
where(city_id: cities)
end
# intersection like query
def self.with_all_service_types(*types)
types.map { |t|
joins(:services).merge(Service.with_types t).select(:id)
}.reduce(scoped) { |scope, subquery| # `scoped` is Rails 3; on Rails 4+ use `all`
scope.where(id: subquery)
}
end
end
Person.with_all_service_types(1, 2)
Person.with_all_service_types(1, 2).with_cities(City.where(name: 'Gold Coast'))
It will generate SQL of the form:
SELECT "people".*
FROM "people"
WHERE "people"."id" in (SELECT "people"."id" FROM ...)
AND "people"."id" in (SELECT ...)
AND ...
You can create as many subqueries as required with the above approach based on any conditions/joins etc so long as each subquery returns the id of a matching person in its result set.
Each subquery result set will be AND'ed together thus restricting the matching set to the intersection of all of the subqueries.
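The reduce step can be pictured in plain Ruby, with arrays of ids standing in for the subqueries. This is only an illustration of the AND'ing; the ids and the ids_by_service_type lookup are invented.

```ruby
# Hypothetical person ids that each per-type subquery would return.
ids_by_service_type = {
  1 => [10, 11, 12],
  2 => [11, 12, 13],
  3 => [12, 14]
}
all_ids = [10, 11, 12, 13, 14]

# Start from everyone and intersect with each subquery result in turn,
# mirroring scope.where(id: subquery) being chained once per type.
with_all_service_types = lambda do |*types|
  types.map { |type| ids_by_service_type.fetch(type, []) }
       .reduce(all_ids) { |remaining, ids| remaining & ids }
end

both = with_all_service_types.call(1, 2)
all_three = with_all_service_types.call(1, 2, 3)
```

Starting the reduce from the full set is what makes a call with no types return everyone, just as reducing from the unrestricted scope does in the ActiveRecord version.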

How to write complex query in Ruby

Need advice, how to write complex query in Ruby.
Query in PHP project:
$get_trustee = db_query("SELECT t.trustee_name,t.secret_key,t.trustee_status,t.created,t.user_id,ui.image from trustees t
left join users u on u.id = t.trustees_id
left join user_info ui on ui.user_id = t.trustees_id
WHERE t.user_id='$user_id' AND trustee_status ='pending'
group by secret_key
ORDER BY t.created DESC")
My guess in Ruby:
# find_by_sql takes the SQL and its bind values together in one array.
get_trustee = Trustee.find_by_sql(['SELECT t.trustee_name, t.secret_key, t.trustee_status, t.created, t.user_id, ui.image FROM trustees t
LEFT JOIN users u ON u.id = t.trustees_id
LEFT JOIN user_info ui ON ui.user_id = t.trustees_id
WHERE t.user_id = ? AND
t.trustee_status = ?
GROUP BY secret_key
ORDER BY t.created DESC',
user_id, 'pending'])
Option 1 (Okay)
Do you mean Ruby with ActiveRecord? Are you using ActiveRecord and/or Rails? #find_by_sql is a method that exists within ActiveRecord. Also it seems like the user table isn't really needed in this query, but maybe you left something out? Either way, I'll include it in my examples. This query would work if you haven't set up your relationships right:
users_trustees = Trustee.
  select('trustees.*, ui.image').
  joins('LEFT OUTER JOIN users u ON u.id = trustees.trustees_id').
  joins('LEFT OUTER JOIN user_info ui ON ui.user_id = trustees.trustees_id').
  where(user_id: user_id, trustee_status: 'pending').
  order('trustees.created DESC') # the trustees table is not aliased as t here
Also, be aware of a few things with this solution:
I have not found a super elegant way to get the columns from the join tables out of the ActiveRecord objects that get returned. You can access them by users_trustees.each { |u| u['image'] }
This query isn't really THAT complex and ActiveRecord relationships make it much easier to understand and maintain.
I'm assuming you're using a legacy database and that's why your columns are named this way. If I'm wrong and you created these tables for this app, then your life would be much easier (and conventional) with your primary keys being called id and your timestamps being called created_at and updated_at.
Option 2 (Better)
If you set up your ActiveRecord relationships and classes properly, then this query is much easier:
class Trustee < ActiveRecord::Base
self.primary_key = 'trustees_id' # wouldn't be needed if the column was id
has_one :user
has_one :user_info
end
class User < ActiveRecord::Base
belongs_to :trustee, foreign_key: 'trustees_id' # relationship can also go the other way
end
class UserInfo < ActiveRecord::Base
self.table_name = 'user_info'
belongs_to :trustee
end
Your "query" can now be ActiveRecord goodness if performance isn't paramount. The Ruby convention is readability first, reorganizing code later if stuff starts to scale.
Let's say you want to get a trustee's image:
trustee = Trustee.where(trustees_id: 5).first
if trustee
image = trustee.user_info.image
..
end
Or if you want to get all trustee's images:
Trustee.all.collect { |t| t.user_info.try(:image) } # using a #try in case user_info is nil
Option 3 (Best)
It seems like trustee is just a special-case user of some sort. You can use STI to simplify even further, if you don't mind restructuring your tables.
This is probably outside of the scope of this question so I'll just link you to the docs on this: http://api.rubyonrails.org/classes/ActiveRecord/Base.html see "Single Table Inheritance". Also see the article that they link to from Martin Fowler (http://www.martinfowler.com/eaaCatalog/singleTableInheritance.html)
Resources
http://guides.rubyonrails.org/association_basics.html
http://guides.rubyonrails.org/active_record_querying.html
Yes, find_by_sql will work, you can try this also:
Trustee.connection.execute('...')
or for generic queries:
ActiveRecord::Base.connection.execute('...')
