Rails - How to use custom attribute in where?

I have an Order model with a datetime column start and integer columns arriving_duration, drop_off_duration, etc., which are durations in seconds from start.
Then in my model I have
class Order < ApplicationRecord
def finish_time
self.start + self.arriving_duration + self.drop_off_duration
end
# other def something_time ... end
end
I want to be able to do this:
Order.where(finish_time: Time.now..(Time.now+2.hours) )
But of course I can't, because there's no such column finish_time. How can I achieve such result?
I've read about 4 possible solutions on SO:
eager load all orders and filter them in Ruby - that would not scale once there are more orders
have a parametrized scope for each time I need - but that means so much code duplication
have an SQL function for each time and bind it to the model with select() - it's just a pain
somehow use http://api.rubyonrails.org/classes/ActiveRecord/Attributes/ClassMethods.html#method-i-attribute ? But I have no idea how to use it in my case, or whether it even solves the problem I have.
Do you have any idea or some 'best practice' how to solve this?
Thanks!

You have several options to implement this behaviour.
The first option is to add an additional finish_time column and update it whenever you create or update your time values. This could be done in Rails (with either before_validation or after_save callbacks) or with PostgreSQL triggers.
class Order < ApplicationRecord
  before_validation :update_finish_time
  private
  def update_finish_time
    self.finish_time = start_time + arriving_duration.seconds + drop_off_duration.seconds
  end
end
This is especially useful when you need finish_time in many places throughout your app. It has the downside that you need to manage that column with extra code and it stores data you actually already have. The upside is that you can easily create an index on that column should you ever have many orders and need to search on it.
An option could be to implement the finish-time update as a postgresql trigger instead of in rails. This has the benefit of being independent from your rails application (e.g. when other sources/scripts access your db too) but has the downside of splitting your business logic into many places (ruby code, postgres code).
Your second option is adding a virtual column just for your query.
def orders_within_the_next_2_hours
  finishing_orders = Order.select("*, (start_time + (arriving_duration + drop_off_duration) * interval '1 second') AS finish_time")
  Order.from("(#{finishing_orders.to_sql}) AS orders").where(finish_time: Time.now..(Time.now + 2.hours))
end
The code above builds the SQL query finishing_orders, which is the orders table with an additional finish_time column. In the second line we use that finishing_orders SQL as the FROM clause ("cleverly" aliased to orders so Rails is happy). This way we can query finish_time as if it were a normal column.
The SQL is written for relatively old postgresql versions (I guess it works for 9.3+). If you use make_interval instead of multiplying with interval '1 second' the SQL might be a little more readable (but needs newer postgresql version, 9.4+ I think).
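To make the make_interval variant concrete, here is a minimal sketch of a helper that builds the SELECT fragment as a string. The helper name finish_time_sql is hypothetical, the column names are the ones used in this answer, and the secs => ... named-argument spelling assumes a reasonably recent PostgreSQL (older versions spell it secs := ...). The column names must be fixed identifiers from your own code, never user input, because they are interpolated into SQL.

```ruby
# Hypothetical helper: builds the SELECT fragment for a computed finish_time.
# Column names (start_time, arriving_duration, drop_off_duration) follow the
# answer above; adjust them to your schema. Only pass trusted identifiers.
def finish_time_sql(duration_columns)
  seconds = duration_columns.join(" + ")
  "start_time + make_interval(secs => #{seconds}) AS finish_time"
end

puts finish_time_sql(%w[arriving_duration drop_off_duration])
```

The resulting string could then take the place of the hand-written interval arithmetic inside Order.select("*, ...").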

Related

Extract records which satisfy a model function in Rails

I have the following methods in a model named CashTransaction.
def is_refundable?
self.amount > self.total_refunded_amount
end
def total_refunded_amount
self.refunds.sum(:amount)
end
Now I need to extract all the records which satisfy the above function i.e records which return true.
I got that working by using following statement:
CashTransaction.all.map { |x| x if x.is_refundable? }
But the result is an Array. I am looking for an ActiveRecord_Relation object, as I need to perform a join on the result.
I feel I am missing something here, as it doesn't look that difficult. Anyway, it got me stuck. Constructive suggestions would be great.
Note: Only amount is a CashTransaction column.
EDIT
Following SQL does the job. If I can change that to ORM, it will still do the job.
SELECT `cash_transactions`.* FROM `cash_transactions` INNER JOIN `refunds` ON `refunds`.`cash_transaction_id` = `cash_transactions`.`id` WHERE (cash_transactions.amount > (SELECT SUM(`amount`) FROM `refunds` WHERE refunds.cash_transaction_id = cash_transactions.id GROUP BY `cash_transaction_id`));
Sharing Progress
I managed to get it to work with the following ORM query:
CashTransaction
.joins(:refunds)
.group('cash_transactions.id')
.having('cash_transactions.amount > sum(refunds.amount)')
But what I was actually looking for was something like:
CashTransaction.joins(:refunds).where(is_refundable? : true)
where is_refundable? is a model method. Initially I thought setting is_refundable? as an attr_accessor would work. But I was wrong.
Just a thought: can the problem be fixed in an elegant way using Arel?
There are two options.
1) Finish what you have started (which is extremely inefficient for bigger amounts of data, since everything is loaded into memory before processing):
CashTransaction.all.select(&:is_refundable?) # like what you've written, but shorter and without the nils that map leaves behind
So get the ids:
ids = CashTransaction.all.select(&:is_refundable?).map(&:id)
And now, to get an ActiveRecord relation:
CashTransaction.where(id: ids) # will return a relation
2) Move the calculation to SQL:
CashTransaction.where('amount > total_refunded_amount')
The second option is faster and more efficient in every way, but it only works if total_refunded_amount is an actual column.
When you deal with database, try to process it on the database level, with smallest Ruby involvement possible.
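For option 1, the choice between map and select matters. The following plain-Ruby sketch uses a hypothetical Txn struct as a stand-in for CashTransaction rows (so it runs without Rails) to show that map with a trailing if keeps a nil for every non-matching record, while select keeps only the matches.

```ruby
# Stand-in for a CashTransaction row; a plain Struct so the sketch runs
# without Rails. The refunded field plays the role of total_refunded_amount.
Txn = Struct.new(:id, :amount, :refunded) do
  def is_refundable?
    amount > refunded
  end
end

txns = [Txn.new(1, 100, 40), Txn.new(2, 50, 50), Txn.new(3, 80, 0)]

mapped   = txns.map { |t| t if t.is_refundable? }  # one nil per non-match
selected = txns.select(&:is_refundable?)           # only the matches
ids      = selected.map(&:id)

p mapped.map { |t| t && t.id }  # => [1, nil, 3]
p ids                           # => [1, 3]
```

With real models, those ids are what you would hand to CashTransaction.where(id: ids) to get a relation back.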
EDIT
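According to the edited question, here is how you would achieve the desired result (an aggregate such as SUM cannot be used in WHERE, so the comparison goes into HAVING):
CashTransaction.joins(:refunds).group('cash_transactions.id').having('cash_transactions.amount > SUM(refunds.amount)')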
EDIT #2
As for the updates to your question: I don't really understand why you have latched onto using is_refundable?, an instance method, inside a query, which is basically not possible in AR, but...
My suggestion is to create a scope is_refundable:
scope :is_refundable, -> {
  joins(:refunds)
    .group('cash_transactions.id')
    .having('cash_transactions.amount > sum(refunds.amount)')
}
Now it is available with a notation as short as
CashTransaction.is_refundable
which is shorter and clearer than the intended
CashTransaction.where('is_refundable = ?', true)
You can do it this way:
cash_transactions = CashTransaction.all.select { |x| x.is_refundable? } # Array
CashTransaction.where(id: cash_transactions.map(&:id)) # ActiveRecord_Relation
But this is an inefficient way of doing it, as the other answerers also mentioned.
If amount and total_refunded_amount are both columns of the cash_transactions table, you can do it in SQL, which will be much more efficient:
CashTransaction.where('amount > total_refunded_amount')
But if amount or total_refunded_amount is not an actual column in the database, you can't do it this way. Then I guess you have to do it the other way, which is less efficient than using raw SQL.
I think you should pre-compute the is_refundable result (in a new column) whenever a CashTransaction or one of its refunds (has_many, I suppose?) is updated, by using callbacks:
class CashTransaction
  before_save :update_is_refundable
  def update_is_refundable
    # self. is required; a bare "is_refundable = ..." would only set a local variable
    self.is_refundable = amount > total_refunded_amount
  end
  def total_refunded_amount
    self.refunds.sum(:amount)
  end
end
class Refund
  belongs_to :cash_transaction
  after_save :update_cash_transaction_is_refundable
  def update_cash_transaction_is_refundable
    cash_transaction.update_is_refundable
    cash_transaction.save!
  end
end
Note: the above code should certainly be optimized to avoid some queries.
Then you can query the is_refundable column:
CashTransaction.where(is_refundable: true)
I think it's not bad to do this in two queries instead of a join, something like this:
def refundable
  where('amount > ?', total_refunded_amount)
end
This does a single sum query and then uses the sum in the second query. When the tables grow larger, you might find that this is faster than doing a join in the database.

update_all with a method

Let's say I have a model:
class Result < ActiveRecord::Base
attr_accessible :x, :y, :sum
end
Instead of doing
Result.all.find_each do |s|
s.sum = compute_sum(s.x, s.y)
s.save
end
assuming compute_sum is an available method that does some computation that cannot be translated into SQL.
def compute_sum(x,y)
sum_table[x][y]
end
Is there a way to use update_all, probably something like:
Result.all.update_all(sum: compute_sum(:x, :y))
I have more than 80,000 records to update. Each record in find_each creates its own BEGIN and COMMIT queries, and each record is updated individually.
Or is there any other faster way to do this?
If the compute_sum function can't be translated into SQL, then you cannot do update_all on all records at once. You will need to iterate over the individual instances. However, if there are a lot of repeated sets of values in the columns, you could speed it up by doing the calculation only once per set of inputs, and then doing one mass update per calculation, e.g.
Result.all.group_by { |result| [result.x, result.y] }.each do |inputs, results|
  sum = compute_sum(*inputs)
  # One UPDATE per distinct (x, y) pair, restricted to the rows in that group
  Result.where(id: results.map(&:id)).update_all(sum: sum)
end
You can replace result.x, result.y with the actual inputs to the compute_sum function.
EDIT - forgot to put the square brackets around result.x, result.y in the group_by block.
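The grouping step can be tried without a database. In this plain-Ruby sketch, Row, SUM_TABLE and batched_updates are hypothetical stand-ins (Row for Result records, SUM_TABLE for the question's sum_table); it computes each value once per distinct (x, y) pair and collects the ids that would go into each mass update.

```ruby
# Stand-in for Result rows, so the sketch runs without Rails.
Row = Struct.new(:id, :x, :y)

# Hypothetical lookup table standing in for the question's sum_table.
SUM_TABLE = { 1 => { 2 => 30 }, 3 => { 4 => 70 } }

def compute_sum(x, y)
  SUM_TABLE[x][y]
end

# Returns { computed_sum => [ids...] }: one entry per distinct value to write,
# so compute_sum runs once per distinct (x, y) pair instead of once per row.
def batched_updates(rows)
  rows.group_by { |r| [r.x, r.y] }.each_with_object({}) do |(inputs, group), out|
    (out[compute_sum(*inputs)] ||= []).concat(group.map(&:id))
  end
end

rows = [Row.new(1, 1, 2), Row.new(2, 3, 4), Row.new(3, 1, 2)]
batched_updates(rows).each do |sum, ids|
  puts "sum=#{sum} ids=#{ids.join(',')}"
end
```

Each entry of the returned hash corresponds to one mass update: a where on the ids followed by update_all with that sum.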
update_all makes an SQL query, so any processing you do on the values needs to be in SQL. So you'll need to find the SQL function, in whichever DBMS you're using, to add two numbers together. In Postgres, for example, I believe you would do
Sum.update_all("sum = x + y")
which will generate this SQL:
update sums set sum = x + y;
which will calculate the x + y value for each row, and set the sum field to the result. (Pass the assignment as a string: with the hash form sum: "x + y", Rails would quote "x + y" as a string value instead of treating it as an expression.)
EDIT - for MariaDB. I've never used this, but a quick google suggests that the SQL would be
update sums set sum = sum(x + y);
Try this first, in your SQL console, for a single record. If it works, then you can do
Sum.update_all("sum = sum(x + y)")
in Rails. (If the database rejects sum() here because it is an aggregate function, plain x + y should be what you want.)
EDIT2: there are a lot of things called sum here, which is making the example quite confusing. Here's a more generic example.
Set col_c to the result of adding col_a and col_b together, in class Foo:
Foo.update_all("col_c = col_a + col_b")
I just noticed that I'd copied the (incorrect) Sum.all.update_all from your question. It should just be Sum.update_all - I've updated my answer.
I'm a complete beginner, so just wondering: why not add an instance method like the one below? Without adding a separate column to the db, you can still access sum on each record from outside.
def sum
  x + y
end

How to update thousands of records

I have to update an age column based on the value in a date of birth column. There are thousands of records to update.
How do I do this using rails?
Is this the right way to do it?
User.update_all(:age => some_method);
def some_method
age = Date.today.year - dob.year
end
Yes, update_all is the right method but no, you can't do it like this. Your some_method will only get called once to set up a database call (I assume you're persisting to a database). You'll then get an error because dob won't be recognised in the scope of the User class.
You'll need to translate your date logic to SQL functions.
Something like (for mysql):
User.update_all("age = year(now()) -
year(dob) -
(DATE_FORMAT(now(), '%m%d') < DATE_FORMAT(dob, '%m%d'))")
(NB. the date_format stuff is so that you get the right age for people whose birthdays are later in the year than the current date - see this question for more details)
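For comparison, the same birthday correction can be sketched in plain Ruby. The method name age_on is hypothetical; the logic mirrors the SQL above: subtract the years, then take one off if this year's birthday has not happened yet, using the same zero-padded month-and-day string comparison as DATE_FORMAT(..., '%m%d').

```ruby
require "date"

# Age in whole years on a given day. Mirrors the SQL above: year difference,
# minus one if the birthday has not yet occurred in the current year.
def age_on(dob, today = Date.today)
  years = today.year - dob.year
  # "%m%d" gives zero-padded strings such as "0614", which compare
  # chronologically within a year.
  years -= 1 if today.strftime("%m%d") < dob.strftime("%m%d")
  years
end

puts age_on(Date.new(1990, 6, 15), Date.new(2024, 6, 14))  # => 33
puts age_on(Date.new(1990, 6, 15), Date.new(2024, 6, 15))  # => 34
```

This is only useful for checking the logic or for per-record work; for thousands of rows the SQL version stays the faster route.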
The other option is to use one of the batch helpers in Rails.
User.where(some_condition).find_in_batches do |group_of_users|
  # some logic
  # group_of_users is an Array here, so go back through the model, e.g.
  # User.where(:id => group_of_users.map(&:id)).update_all(:age => some_logic)
end
This would lock your db for less time. Note that you should pretty much always update with a condition in mind. I can't think of many cases where you would want to update an entire table every time something happens.
There are a few options; check out the Rails docs or the API.
update_all is the right method.
There are many ways to update records in batches.
But I think update_all is the best, because it is a Rails query method that supports any condition on every database. Keep in mind that the values you pass must be constants or SQL, not per-record Ruby.
To update more than one attribute:
Model.update_all(:column1 => value1, :column2 => value2, ........)
or you can use:
Model.update_all("column1 = value1, column2 = value2, ........")

Dealing with column conversions in Ruby ActiveRecord

Dealing with a legacy database, I've come across a column in a SQL Server database where the date is stored as a decimal. E.g. 2011-04-23 is stored as 20110423.0.
Is there a general ActiveRecord mechanism for dealing with "weird" column storage conventions? Enum-like columns where they're actually stored as integers is another case that might also make use of the same mechanism, but I can't quite find what I'm looking for.
It seems like serialization gets me partly there:
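class Thing < ActiveRecord::Base
  class DecimalDate
    # Database value (e.g. 20110423.0) -> Date
    def load(date)
      if date.is_a? Numeric
        y, m, d = /^(\d\d\d\d)(\d\d)(\d\d)/.match(date.to_s)[1..3].map(&:to_i)
        Date.civil(y, m, d)
      end
    end
    # Date -> database value (0 when there is no date)
    def dump(date)
      date ? date.strftime('%Y%m%d').to_i : 0
    end
  end
  # Pass an instance: load and dump are instance methods, so the class itself is not a coder
  serialize :weird_date, DecimalDate.new
end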
rails c
> Thing.first.weird_date
=> Sun, 02 Jan 2011
But, the illusion is thin. The column doesn't "know" that it's a date stored as a decimal. E.g. comparisons fail:
rails c
> Thing.where('weird_date > ?', 1.week.ago)
...
ActiveRecord::StatementInvalid: ... Error converting data type varchar to numeric.:
If you are forced to deal with legacy data, I see two possibilities to manage it.
1 From your database
You can "convert" your data by making a view of your table which converts your (date) fields on the fly. Then you create a trigger (before insert/update) on this view which converts your data back to the old format. Finally, you tell ActiveRecord to use your view instead of your table.
2 From your application (Rails)
Find a way to tell ActiveRecord to do the same job. Have you already tried to manage it with the AR callbacks after_initialize and before_save? More information here
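Whichever route you take, the failing comparison from the question can be sidestepped by converting the Ruby date to the stored format before it reaches the query, because YYYYMMDD numbers sort in chronological order. A minimal plain-Ruby sketch of the two conversions follows; the module name DecimalDate is borrowed from the question, and treating 0 as "no date" is an assumption taken from its dump method.

```ruby
require "date"

# Converts between Date and the legacy decimal format (e.g. 20110423.0).
module DecimalDate
  # Database value -> Date; nil for non-numeric input or the 0 placeholder.
  def self.load(value)
    return nil unless value.is_a?(Numeric) && value.to_i > 0
    Date.strptime(value.to_i.to_s, "%Y%m%d")
  end

  # Date -> integer in YYYYMMDD form; 0 when there is no date.
  def self.dump(date)
    date ? date.strftime("%Y%m%d").to_i : 0
  end
end

p DecimalDate.dump(Date.new(2011, 4, 16))   # => 20110416
p DecimalDate.load(20110423.0).to_s         # => "2011-04-23"
```

In a query, the dumped value would be passed as the bind parameter, along the lines of Thing.where('weird_date > ?', DecimalDate.dump(1.week.ago.to_date)), so the database compares number with number.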

Subquery in Rails report generation

I'm building a report in a Ruby on Rails application and I'm struggling to understand how to use a subquery.
Each 'Survey' has_many 'SurveyResponses' and it is simple enough to retrieve these. However, I need to group them according to one of the fields, 'jobcode', as I only want to report the information relating to a single jobcode in one line of the report.
However I also need to know the constituent data that makes up the totals for that jobcode. The reason for this is that I need to calculate data such as medians and standard deviations and so need to know the values that make the total.
My thinking is that I retrieve the distinct jobcodes that were reported on for the survey and then as I loop through these I retrieve the individual responses for each jobcode.
Is this the correct way to do this or should I follow a different method?
You could use a named scope to simplify getting the groups of responses:
named_scope :job_group, lambda{|job_code| {:conditions => ["job_code = ?", job_code]}}
Put that in your response model, and use it like this:
job.responses.job_group('some job code')
and you'll get an array of responses. If you're looking to get the mean of the values of one of the attributes on the responses, you can use map:
r = job.responses.job_group('some job code')
r.map(&:total)
=> [1, 5, 3, 8]
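Once the values are in a plain array like this, the statistics mentioned in the question can be computed in Ruby. The following is a minimal sketch with hypothetical helper names (mean, median, std_dev); note that std_dev here is the population standard deviation, which may or may not be the variant your report needs.

```ruby
def mean(values)
  values.sum.to_f / values.size
end

# Middle value of the sorted list, or the average of the two middle values.
def median(values)
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid].to_f : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Population standard deviation (divides by n, not n - 1).
def std_dev(values)
  m = mean(values)
  Math.sqrt(values.sum { |v| (v - m)**2 } / values.size)
end

totals = [1, 5, 3, 8]
puts mean(totals)    # => 4.25
puts median(totals)  # => 4.0
puts std_dev(totals)
```

For a sample standard deviation, divide by values.size - 1 instead.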
Alternatively, you might find it quicker to write custom SQL in order to get the mean / average / sum of groups of attributes. Going through rails for this sort of work may cause significant lag.
ActiveRecord::Base.connection.execute("Custom SQL here")
You can also use Model.find_by_sql()
For example:
class User < ActiveRecord::Base
  # Your usual AR model
end
...
def index
  @users = User.find_by_sql "select * from users"
  # etc
end
