ActiveRecord sort model on attribute of last has_many relation - ruby-on-rails

I've been digging around for this for a while and can't find a graceful solution. I have loans, and loans has_many :decisions. decisions has an attribute that I care about, called risk_rating.
I'd like to sort loans by the risk_rating of their most recent decision (most recent by created_at, as usual).
Loan.includes(:decisions).references(:decisions).order('decisions.risk_rating DESC') doesn't work...
I want loans... sorted by their most recent decision's risk_rating. This seems like it should be easier than it is.
I'm currently doing this outside of the database like this, but it's chewing up time and memory:
Loan.all.sort do |x,y|
x.decisions.last.try(:risk_rating).to_f <=> y.decisions.last.try(:risk_rating).to_f
end
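As a stopgap while staying in memory, `sort_by` may already help: the `sort` block above looks up `decisions.last` on both sides of every comparison, whereas `sort_by` computes the key once per loan. This is only a sketch. The Structs are hypothetical stand-ins for the models so that it runs without a database, "most recent" is modelled by `created_at`, and it sorts highest rating first to match the DESC order attempted above.

```ruby
# Hypothetical stand-ins for the Loan and Decision models (no database needed).
Decision = Struct.new(:risk_rating, :created_at)
Loan = Struct.new(:id, :decisions)

# Sort key: risk_rating of the most recent decision, 0.0 when there is none.
def latest_risk_rating(loan)
  latest = loan.decisions.max_by(&:created_at)
  latest ? latest.risk_rating.to_f : 0.0
end

# sort_by evaluates the key once per loan instead of twice per comparison.
def sort_by_latest_risk(loans)
  loans.sort_by { |loan| -latest_risk_rating(loan) }
end

loans = [
  Loan.new(1, [Decision.new(0.2, 1), Decision.new(0.9, 2)]),
  Loan.new(2, [Decision.new(0.5, 1)]),
  Loan.new(3, [])
]
p sort_by_latest_risk(loans).map(&:id) # => [1, 2, 3]
```

This still loads every loan and decision, so it only trims the comparison cost, not the memory use.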
I'd like to show the performance I'm getting with the proposed answer, along with an inaccuracy...
Benchmark.bm do |x|
x.report{ Loan.joins('LEFT JOIN decisions ON decisions.loan_id = loans.id').group('loans.id').order('MAX(decisions.risk_rating) DESC').limit(10).map{|l| l.decisions.last.try(:risk_rating)} }
end
user system total real
0.020000 0.000000 0.020000 ( 20.573096)
=> [0.936775, 0.934465, 0.932088, 0.922352, 0.921882, 0.794724, 0.919432, 0.918385, 0.916952, 0.914938]
The order isn't right. That 0.794724 is out of place.
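The likely reason for the misplaced value: the query orders by the highest risk_rating a loan has ever had, while the `map` prints the rating of the latest decision. The two differ whenever a newer decision has a lower rating. A small plain-Ruby illustration, with a hypothetical Struct and invented ratings:

```ruby
# Hypothetical stand-in for the Decision model; the ratings are invented.
Decision = Struct.new(:risk_rating, :created_at)

# What ORDER BY MAX(decisions.risk_rating) sorts on: the highest rating ever.
def max_rating(decisions)
  decisions.map(&:risk_rating).max
end

# What the map reports: the rating of the most recent decision
# (modelled here by created_at).
def latest_rating(decisions)
  decisions.max_by(&:created_at).risk_rating
end

history = [Decision.new(0.92, 1), Decision.new(0.79, 2)]
puts max_rating(history)    # 0.92
puts latest_rating(history) # 0.79
```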
To that extent... I'm only seeing one attribute in the proposed answer. I don't see the connection =/

Alright, it looks like I'm working late tonight because I couldn't help but jump in:
class Loan < ApplicationRecord
has_many :decisions
has_one :latest_decision, -> { merge(Decision.latest) }, class_name: 'Decision'
end
class Decision < ApplicationRecord
belongs_to :loan
# Class method, so that it can be called as Decision.latest in the has_one scope
def self.latest
t1 = arel_table
t2 = arel_table.alias('t2')
# Self join based on `loan_id` prefer latest `created_at`
join_on = t1[:loan_id].eq(t2[:loan_id]).and(
t1[:created_at].lt(t2[:created_at]))
where(t2[:loan_id].eq(nil)).joins(
t1.create_join(t2, t1.create_on(join_on), Arel::Nodes::OuterJoin)
)
end
end
Loan.includes(:latest_decision)
This doesn't sort; it just provides the latest decision for each loan. Adding an order that references decisions messes things up because of the table aliasing. I don't have the time to work that kink out now, but I bet you can figure it out if you check out some of the great resources on Arel and how to use it with ActiveRecord. I really enjoy this one.
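For anyone unfamiliar with the self-join trick used in `latest`, here is a plain-Ruby model of what the LEFT OUTER JOIN plus `t2.loan_id IS NULL` filter expresses: a decision is kept when no other decision for the same loan is newer. The Struct is a hypothetical stand-in, not the real model.

```ruby
# Hypothetical stand-in for rows of the decisions table.
Decision = Struct.new(:id, :loan_id, :created_at)

# Keep t1 when no t2 exists with the same loan_id and a later created_at.
# This mirrors the outer join condition followed by WHERE t2.loan_id IS NULL.
def latest_decisions(decisions)
  decisions.select do |t1|
    decisions.none? do |t2|
      t1.loan_id == t2.loan_id && t1.created_at < t2.created_at
    end
  end
end

rows = [
  Decision.new(1, 1, 10),
  Decision.new(2, 1, 20),
  Decision.new(3, 2, 5)
]
p latest_decisions(rows).map(&:id) # => [2, 3]
```

One consequence of this formulation: two decisions for the same loan with an identical created_at both survive the filter, so a has_one built on it would pick one of them arbitrarily.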

First, let's write the SQL query that selects the necessary data. Stack Overflow has a question that may help here: Select most recent row with GROUP BY in MySQL. My best version:
SELECT loans.*
FROM loans
LEFT JOIN (
SELECT loan_id, MAX(id) as id
FROM decisions
GROUP BY loan_id) d ON d.loan_id = loans.id
LEFT JOIN decisions ON decisions.id = d.id
ORDER BY decisions.risk_rating DESC
This code assumes that MAX(id) gives the id of the most recent row in each group.
You can build the same query with this Rails code:
sub_query =
Decision.select('loan_id, MAX(id) as id').
group(:loan_id).to_sql
Loan.
joins("LEFT JOIN (#{sub_query}) d ON d.loan_id = loans.id").
joins("LEFT JOIN decisions ON decisions.id = d.id").
order("decisions.risk_rating DESC")
Unfortunately, I don't have MySQL at hand and I can't try this code. Hope it will work.

Related

Increase performance: avoid looking for the right element in a collection

I have this situation.
activity.rb
belongs_to :user
belongs_to :cause
belongs_to :sub_cause
belongs_to :client
def amount
duration / 60.0 * user.hourly_cost_by_year(date.year).amount rescue 0
end
user.rb
has_many :hourly_costs # one hourly_cost for year
has_many :activities
def hourly_cost_by_year(year = Date.today.year)
hourly_costs.find { |hc| hc.year == year }
end
hourly_cost.rb
belongs_to :user
I have a big report where I achieved good performance (the number of SQL queries is fixed) but I think I could do better. The query I use is
activities = Activity.includes(:client, :cause, :sub_cause, user: :hourly_costs)
And this is OK, it's fast, but I think it can be improved because of the hourly_cost_by_year method. I mean, an activity has a date, and I can use that date to know which of those hourly costs I should use. Something like this in Activity:
def self.user_with_single_hourly_cost
joins('LEFT JOIN users u ON u.id = activities.user_id').
joins('LEFT JOIN hourly_costs hc ON hc.user_id = u.id AND hc.year = EXTRACT(year from activities.date)')
end
But I don't know how to integrate this into my query. Whatever I tried did not work. I could use raw SQL, but I'm trying to use ActiveRecord. I even thought of using Redis to cache every hourly cost by user and year, which could work, but I think this query, with the EXTRACT part, should do the best job because I'd have a flat table.
Update: let me try to clarify. Whatever query I use in my action, at some point I have to do
activities.sum(&:amount)
and that method, you know, is
def amount
duration / 60.0 * user.hourly_cost_by_year(date.year).amount rescue 0
end
And I don't know how to pick the hourly_cost I want directly, without searching through hourly_costs. Is this possible?
You may consider using Arel for this. Arel is the underlying query assembler for rails/activerecord (so no new dependencies) and can be very useful when building complex queries because it offers far more depth than the high level ActiveRecord::QueryMethods.
Obviously with a broader API comes more verbosity (which actually adds quite a bit to the readability) and less syntactical sugar which takes some getting used to but has proven indispensable for me on multiple occasions.
While I did not take the time to recreate your data structure something like this may work for you
activities = Activity.arel_table
users = User.arel_table
hourly_costs = HourlyCost.arel_table
activity_users_hourly_cost = activities
.join(users,Arel::Nodes::OuterJoin)
.on(activities[:user_id].eq(users[:id]))
.join(hourly_costs,Arel::Nodes::OuterJoin)
.on(hourly_costs[:user_id].eq(users[:id])
.and(hourly_costs[:year].eq(Arel::Nodes::Extract.new(activities[:date],'year'))
)
)
Activity.includes(:client, :cause, :sub_cause).joins(activity_users_hourly_cost.join_sources)
This will add the requested join e.g.
activity_users_hourly_cost.to_sql
#=> SELECT
FROM [activities]
LEFT OUTER JOIN [users] ON [activities].[user_id] = [users].[id]
LEFT OUTER JOIN [hourly_costs] ON [hourly_costs].[user_id] = [users].[id]
AND [hourly_costs].[year] = EXTRACT(YEAR FROM [activities].[date])
Update
If you just want to add the "hourly_cost" this should work for you
Activity.includes(:client, :cause, :sub_cause)
.joins(activity_users_hourly_cost.join_sources)
.select("activities.*, activities.duration / 60.0 * ISNULL([hourly_costs].[amount],0) as hourly_cost_by_year")
Please note that this will only return Activity objects but they will now have a method called hourly_cost_by_year which will return the result of that calculation. Full SQL will look like
SELECT
[activities].*,
activities.duration / 60.0 * ISNULL([hourly_costs].[amount],0) as hourly_cost_by_year
FROM [activities]
-- Dependent upon WHERE clause
LEFT OUTER JOIN [causes] ON [activities].[cause_id] = [causes].[id]
LEFT OUTER JOIN [sub_causes] ON [activities].[sub_cause_id] = [sub_causes].[id]
LEFT OUTER JOIN [clients] ON [activities].[client_id] = [clients].[id]
--
LEFT OUTER JOIN [users] ON [activities].[user_id] = [users].[id]
LEFT OUTER JOIN [hourly_costs] ON [hourly_costs].[user_id] = [users].[id]
AND [hourly_costs].[year] = EXTRACT(YEAR FROM [activities].[date])
You could build the select portion in Arel too if you like but seems overkill for such a simple statement.
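If the hourly costs are preloaded anyway, a lighter in-memory option is to index them by year once per user, so that each lookup is a Hash access rather than a `find` scan. This is a sketch under assumptions: the `User` class and `HourlyCost` Struct below are plain-Ruby stand-ins for the models, and the memoized `@hourly_costs_by_year` variable is a name introduced here for illustration.

```ruby
# Hypothetical stand-ins for the HourlyCost and User models.
HourlyCost = Struct.new(:year, :amount)

class User
  def initialize(hourly_costs)
    @hourly_costs = hourly_costs
  end

  # Build the year => hourly_cost index on first use, then reuse it.
  def hourly_cost_by_year(year)
    @hourly_costs_by_year ||= @hourly_costs.each_with_object({}) do |hc, index|
      index[hc.year] = hc
    end
    @hourly_costs_by_year[year]
  end
end

user = User.new([HourlyCost.new(2014, 30.0), HourlyCost.new(2015, 35.0)])
puts user.hourly_cost_by_year(2015).amount # 35.0
```

With one hourly cost per year the scan is already short, so the gain is likely small compared with moving the calculation into SQL as shown above.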

activerecord exists subquery

I have the following two models:
class Client < ActiveRecord::Base
has_many :orders
end
class Order < ActiveRecord::Base
belongs_to :client
end
I want to query the clients that have any of the orders in a given list (order_1, order_2) and, at the same time, load all of each client's orders. I can do this with the SQL below:
SELECT *
FROM CLIENTS C
JOIN ORDERS O
ON C.ID = O.CLIENT_ID
WHERE EXISTS
(SELECT *
FROM CLIENTS C1
JOIN ORDERS O1
ON C1.ID = O1.CLIENT_ID
WHERE O1.ID IN ('order_1', 'order_2')
AND C1.ID = C.ID
);
Is there a Rails way to do this? The following code returns the right clients, but client.orders contains only the specified orders.
clients.includes(:orders).where(orders: { id: ['order_1', 'order_2'] })
I don't know how to get all information in one query.
I know this is years later, but here's how to do it (based on this blog post).
Order.where(client_id:
Client.joins(:orders)
.where(orders: {id: ['order_1', 'order_2']})
.select(:id)
)
A gem that exists to do that: activerecord_where_assoc (I'm the author)
With it, you can do what you want this way:
clients.includes(:orders).where_assoc_exists(:orders, id: ['order_1', 'order_2'])
Doing it without a gem makes it easy to make mistakes or hit annoying side effects, such as the one you mentioned of not getting every record. Here is a whole document about the problems.
Read more in the documentation. Here is an introduction and examples.
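To make the difference between the two behaviours concrete, here is a plain-Ruby model of the EXISTS semantics: a client qualifies when any of its orders is in the list, and its full set of orders is left untouched. The Structs are hypothetical stand-ins for the models; no database is involved.

```ruby
# Hypothetical stand-ins for the Client and Order models.
Order = Struct.new(:id, :client_id)
Client = Struct.new(:id, :orders)

# EXISTS-style filter: test the orders, but do not narrow client.orders.
def clients_with_any_order(clients, order_ids)
  clients.select do |client|
    client.orders.any? { |order| order_ids.include?(order.id) }
  end
end

clients = [
  Client.new(1, [Order.new('order_1', 1), Order.new('order_9', 1)]),
  Client.new(2, [Order.new('order_3', 2)])
]
matches = clients_with_any_order(clients, ['order_1', 'order_2'])
p matches.map(&:id)                 # => [1]
p matches.first.orders.map(&:id)    # => ["order_1", "order_9"]
```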

Exclude object if one of the has_many related entities has the attribute with value x

I came across the problem of excluding records when attribute x of one of the associated records has the value 'a'.
Example:
class Order < ActiveRecord::Base
has_many :items
end
class Item < ActiveRecord::Base
belongs_to :order
validates_presence_of :status
end
The query should return all Orders that don't have an Item with status = 'paid' (status != 'paid').
Because of the 1:n association, an Order can have many Items, and one of the Items can have status = 'paid'. Such Orders must be excluded from the result of my query even if they have other items with a status different from 'paid'.
This is how I would solve the problem now:
paid_items = Item.where(status: 'paid').pluck(:order_id)
orders_wo_paid = Order.where('id NOT IN (?)', paid_items)
Is there an ActiveRecord solution that solves this problem in one query?
Or are there other ways to solve this?
I'm not looking for a Ruby solution such as:
Order.select do |order|
!order.items.pluck(:status).include?('paid')
end
thx for ideas and inspirations.
You can do:
Order.where('orders.id NOT IN (?)', Item.where(status: 'paid').select(:order_id))
If you're using Rails 4.x then:
Order.where.not(id: Item.where(status: 'paid').select(:order_id))
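The logic of both variants can be modelled in plain Ruby: collect the order ids that have at least one paid item, then reject those orders. The Structs are hypothetical stand-ins for the models. One SQL caveat the model does not reproduce: if the subquery can return a NULL order_id, `NOT IN` matches no rows at all, so it may be worth filtering NULLs out of the subquery.

```ruby
# Hypothetical stand-ins for the Order and Item models.
Order = Struct.new(:id)
Item = Struct.new(:order_id, :status)

# Orders for which no item has status 'paid' (orders without items included).
def orders_without_paid_items(orders, items)
  paid_order_ids = items.select { |item| item.status == 'paid' }
                        .map(&:order_id)
                        .uniq
  orders.reject { |order| paid_order_ids.include?(order.id) }
end

orders = [Order.new(1), Order.new(2), Order.new(3)]
items = [Item.new(1, 'paid'), Item.new(1, 'open'), Item.new(2, 'open')]
p orders_without_paid_items(orders, items).map(&:id) # => [2, 3]
```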
The query you are interested in is the following, but creating it with ActiveRecord will be hard and not very readable:
SELECT
orders.*
FROM
orders
LEFT JOIN
items ON orders.id = items.order_id
GROUP BY
orders.id
HAVING
SUM(CASE WHEN items.status = 'paid' THEN 1 ELSE 0 END) = 0
Sorry for the SQL indentation, I have no idea what the conventions are.
One way (not the best one at all) to do it with Rails (unfortunately writing SQL for the most important parts) would be the following:
Order.joins("LEFT JOIN items ON orders.id = items.order_id").group("orders.id")
.having("SUM(CASE WHEN items.status = 'paid' THEN 1 ELSE 0 END) = 0")
Of course you can play with AREL to get rid of the hard coded sql, but in my opinion it will not be easier to read.
You can find an example of creating left joins in this gist: https://gist.github.com/mildmojo/3724189

Rails - can't access custom join data

I've got a really complicated query (which finds bus connections between two towns) and I haven't got any idea how to access data from joins (I'd like to know at which stop does the connection start and at which does it end). Is it possible to access this data using ActiveRecord?
Course.joins("INNER JOIN stop_times as start_stop ON start_stop.course_id=courses.id")
.joins("INNER JOIN stop_times as end_stop ON end_stop.course_id = courses.id")
.joins('INNER JOIN stops as start_stopi ON start_stop.stop_id = start_stopi.id')
.joins('INNER JOIN stops as end_stopi ON end_stop.stop_id = end_stopi.id')
.where('start_stop.hour>= ? OR (start_stop.hour>= ? AND start_stop.minute>= ?)',hour,(hour+1)%24,minute)
.where('start_stopi.town_id = ? and end_stopi.town_id = ?',start_town,end_town)
.where('start_stop."order"<end_stop."order"').order('start_stop.minute ASC').order('start_stop.hour ASC')
EDIT:
I've managed to rewrite it to use ActiveRecord joins; although it broke my names, it works.
Course.joins(end_stop_times: :stop).joins(start_stop_times: :stop)
.where('start_stop_times_courses.hour>= ? OR (start_stop_times_courses.hour>= ? AND start_stop_times_courses.minute>= ?)',hour,(hour+1)%24,minute)
.where('stops_stop_times.town_id = ? and stops.town_id = ?',start_town,end_town)
.where('start_stop_times_courses."order"<stop_times."order"')
.order('start_stop_times_courses.minute ASC').order('start_stop_times_courses.hour ASC')
With this new query, the models are:
class Course < ActiveRecord::Base
belongs_to :carrier
has_many :end_stop_times, class_name: 'StopTime'
has_many :start_stop_times, class_name: 'StopTime'
end
class Stop < ActiveRecord::Base
belongs_to :town
end
class StopTime < ActiveRecord::Base
belongs_to :stop
belongs_to :course
end
You need to add something like:
your_query.select('courses.*, start_stopi.id as start_stop_id, end_stopi.id as end_stop_id')
and then you can access it by calling start_stop_id and end_stop_id on the course object.
However, you should probably use associations for this kind of operation. Could you show us your models?
Check your log for the output of this query, you should find that it starts with select courses.* - therefore it will not bring through data from the included tables.
You can add some select other_table.some_column statements to your query, but this isn't the rails way.
I would suggest you separate your scope into the relevant models - put scopes in the stop_times model (and others) so that you can call the scopes on the object you actually want to get data from.
When you're constructing custom SQL of that complexity I think you've taken the Rails-way of doing things too far. You're using practically no activerecord association information to construct it, and you've built a programming construct that is horribly ugly and difficult to read.
I'd advise that you rewrite it as well formatted SQL
results = ActiveRecord::Base.connection.execute(
"select c.*
from courses c
join stop_times ss on ss.course_id = c.id
join stop_times es on es.course_id = c.id
... etc ...
where (ss.hour >= #{ActiveRecord::Base.connection.quote(hour)} or
... etc ...")
Now it could be that you can improve your models and associations to the point where this level of complexity is not required (e.g. the associations between courses, stop_times (start) and stop_times (end) could probably be encapsulated in ActiveRecord pretty well), but at the moment you seem to be falling between the pure SQL and the pure ActiveRecord approaches in a very uncomfortable way.

How to write complex query in Ruby

Need advice, how to write complex query in Ruby.
Query in PHP project:
$get_trustee = db_query("SELECT t.trustee_name,t.secret_key,t.trustee_status,t.created,t.user_id,ui.image from trustees t
left join users u on u.id = t.trustees_id
left join user_info ui on ui.user_id = t.trustees_id
WHERE t.user_id='$user_id' AND trustee_status ='pending'
group by secret_key
ORDER BY t.created DESC")
My guess in Ruby:
get_trustee = Trustee.find_by_sql(['SELECT t.trustee_name, t.secret_key, t.trustee_status, t.created, t.user_id, ui.image FROM trustees t
LEFT JOIN users u ON u.id = t.trustees_id
LEFT JOIN user_info ui ON ui.user_id = t.trustees_id
WHERE t.user_id = ? AND
t.trustee_status = ?
GROUP BY secret_key
ORDER BY t.created DESC',
user_id, 'pending'])
Option 1 (Okay)
Do you mean Ruby with ActiveRecord? Are you using ActiveRecord and/or Rails? #find_by_sql is a method that exists within ActiveRecord. Also, it seems like the users table isn't really needed in this query, but maybe you left something out? Either way, I'll include it in my examples. This query would work if you haven't set up your relationships right:
users_trustees = Trustee.
select('trustees.*, ui.image').
joins('LEFT OUTER JOIN users u ON u.id = trustees.trustees_id').
joins('LEFT OUTER JOIN user_info ui ON ui.user_id = trustees.trustees_id').
where(user_id: user_id, trustee_status: 'pending').
order('trustees.created DESC')
Also, be aware of a few things with this solution:
I have not found a super elegant way to get the columns from the join tables out of the ActiveRecord objects that get returned. You can access them by users_trustees.each { |u| u['image'] }
This query isn't really THAT complex and ActiveRecord relationships make it much easier to understand and maintain.
I'm assuming you're using a legacy database and that's why your columns are named this way. If I'm wrong and you created these tables for this app, then your life would be much easier (and conventional) with your primary keys being called id and your timestamps being called created_at and updated_at.
Option 2 (Better)
If you set up your ActiveRecord relationships and classes properly, then this query is much easier:
class Trustee < ActiveRecord::Base
self.primary_key = 'trustees_id' # wouldn't be needed if the column was id
has_one :user
has_one :user_info
end
class User < ActiveRecord::Base
belongs_to :trustee, foreign_key: 'trustees_id' # relationship can also go the other way
end
class UserInfo < ActiveRecord::Base
self.table_name = 'user_info'
belongs_to :trustee
end
Your "query" can now be ActiveRecord goodness if performance isn't paramount. The Ruby convention is readability first, reorganizing code later if stuff starts to scale.
Let's say you want to get a trustee's image:
trustee = Trustee.where(trustees_id: 5).first
if trustee
image = trustee.user_info.image
..
end
Or if you want to get all trustee's images:
Trustee.all.collect { |t| t.user_info.try(:image) } # using a #try in case user_info is nil
Option 3 (Best)
It seems like trustee is just a special-case user of some sort. You can use STI to simplify even further, if you don't mind restructuring your tables.
This is probably outside of the scope of this question so I'll just link you to the docs on this: http://api.rubyonrails.org/classes/ActiveRecord/Base.html see "Single Table Inheritance". Also see the article that they link to from Martin Fowler (http://www.martinfowler.com/eaaCatalog/singleTableInheritance.html)
Resources
http://guides.rubyonrails.org/association_basics.html
http://guides.rubyonrails.org/active_record_querying.html
Yes, find_by_sql will work, you can try this also:
Trustee.connection.execute('...')
or for generic queries:
ActiveRecord::Base.connection.execute('...')

Resources