Ways to simplify and optimize my code? - ruby-on-rails

I've got some code which I would like to optimize.
The first piece is not bad at all, but maybe it could be a bit shorter or faster, mainly the update_result method:
class Round < ActiveRecord::Base
belongs_to :match
has_and_belongs_to_many :banned_champions, :class_name => "Champion", :join_table => "banned_champions_rounds"
belongs_to :clan_blue, :class_name => "Clan", :foreign_key => "clan_blue_id"
belongs_to :clan_purple, :class_name => "Clan", :foreign_key => "clan_purple_id"
belongs_to :winner, :class_name => "Clan", :foreign_key => "winner_id"
after_save {self.update_result}
def update_result
match = self.match
if match.rounds.count > 0
clan1 = match.rounds.first.clan_blue
clan2 = match.rounds.first.clan_purple
results = {clan1=>0, clan2=>0}
for round in match.rounds
round.winner == clan1 ? results[clan1] += 1 : results[clan2] += 1
end
if results[clan1] > results[clan2] then
match.winner = clan1; match.looser = clan2
match.draw_1 = nil; match.draw_2 = nil
elsif results[clan1] < results[clan2] then
match.winner = clan2; match.looser = clan1
match.draw_1 = nil; match.draw_2 = nil
else
match.draw_1 = clan1; match.draw_2 = clan2
match.winner = nil; match.looser = nil
end
match.save
end
end
end
The second piece, in seeds.rb, is really bad and slow:
require 'faker'
champions = [{:name=>"Akali"},
{:name=>"Alistar"},
{:name=>"Amumu"},
{:name=>"Anivia"},
{:name=>"Annie"},
{:name=>"Galio"},
{:name=>"Tryndamere"},
{:name=>"Twisted Fate"},
{:name=>"Twitch"},
{:name=>"Udyr"},
{:name=>"Urgot"},
{:name=>"Veigar"}
]
Champion.create(champions)
10.times do |n|
name = Faker::Company.name
clan = Clan.create(:name=>name)
6.times do |n|
name = Faker::Internet.user_name
clan.players.create(:name=>name)
end
end
for clan in Clan.all do
2.times do
match = Match.create()
c = [clan,Clan.first(:offset => rand(Clan.count))]
3.times do
round = match.rounds.create
round.clan_blue = c[0]
round.clan_purple = c[1]
round.winner = c[0]
round.save!
end
for item in c
for p in item.players.limit(5)
rand_champion = Champion.first(:offset => rand(Champion.count))
match.participations.create!(:player => p, :champion => rand_champion)
end
end
match.save!
end
2.times do
match = Match.create()
c = [clan,Clan.first(:offset => rand(Clan.count))]
3.times do
round = match.rounds.create
round.clan_blue = c[0]
round.clan_purple = c[1]
round.winner = c[1]
round.save!
end
for item in c
for p in item.players.limit(5)
rand_champion = Champion.first(:offset => rand(Champion.count))
match.participations.create!(:player => p, :champion => rand_champion)
end
end
match.save!
end
2.times do
match = Match.create()
c = [clan,Clan.first(:offset => rand(Clan.count))]
2.times do |n|
round = match.rounds.create
round.clan_blue = c[0]
round.clan_purple = c[1]
round.winner = c[n]
round.save!
end
for item in c
for p in item.players.limit(5)
rand_champion = Champion.first(:offset => rand(Champion.count))
match.participations.create!(:player => p, :champion => rand_champion)
end
end
match.save!
end
end
Any chances to optimize them?

Don't underestimate the value of whitespace in cleaning up code readability!
class Round < ActiveRecord::Base
belongs_to :match
belongs_to :clan_blue, :class_name => "Clan", :foreign_key => "clan_blue_id"
belongs_to :clan_purple, :class_name => "Clan", :foreign_key => "clan_purple_id"
belongs_to :winner, :class_name => "Clan", :foreign_key => "winner_id"
has_and_belongs_to_many :banned_champions, :class_name => "Champion", :join_table => "banned_champions_rounds"
after_save { match.update_result }
end
class Match < ActiveRecord::Base
def update_result
return unless rounds.count > 0
clan1, clan2 = rounds.first.clan_blue, rounds.first.clan_purple
clan1_wins = rounds.inject(0) {|total, round| total += round.winner == clan1 ? 1 : 0 }
clan2_wins = rounds.length - clan1_wins
self.winner = self.loser = self.draw_1 = self.draw_2 = nil
if clan1_wins == clan2_wins
self.draw_1, self.draw_2 = clan1, clan2
else
self.winner = clan1_wins > clan2_wins ? clan1 : clan2
self.loser = clan1_wins < clan2_wins ? clan1 : clan2
end
save
end
end
For your seeds, I'd replace your fixtures with a factory pattern, if it's for tests. If you're going to stick with what you have there, though, wrap the whole block in a transaction and it should become orders of magnitude faster.
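The tallying logic in update_result can be tried outside Rails. Here is a minimal plain-Ruby sketch: RoundStub is a Struct standing in for the ActiveRecord model and match_outcome is a hypothetical helper name, neither of which is in the original code. As in the original, every round not won by the first clan is counted for the second one.

```ruby
# Plain-Ruby stand-in for the Round model (assumption for illustration only).
RoundStub = Struct.new(:clan_blue, :clan_purple, :winner)

# Returns a hash describing the outcome of a match, or nil if it has no rounds.
# Hypothetical helper; in the Rails app this logic would live in Match#update_result.
def match_outcome(rounds)
  return nil if rounds.empty?

  clan1 = rounds.first.clan_blue
  clan2 = rounds.first.clan_purple
  clan1_wins = rounds.count { |round| round.winner == clan1 }
  clan2_wins = rounds.size - clan1_wins # everything else counts for clan2

  if clan1_wins == clan2_wins
    { :winner => nil, :loser => nil, :draw => [clan1, clan2] }
  elsif clan1_wins > clan2_wins
    { :winner => clan1, :loser => clan2, :draw => nil }
  else
    { :winner => clan2, :loser => clan1, :draw => nil }
  end
end

sample_rounds = [
  RoundStub.new("Blue", "Purple", "Blue"),
  RoundStub.new("Blue", "Purple", "Purple"),
  RoundStub.new("Blue", "Purple", "Blue")
]
puts match_outcome(sample_rounds).inspect
```

Using count with a block avoids the manual accumulator, and returning a value instead of assigning attributes keeps the decision logic easy to test on its own.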

Well, on your first example, it appears that you are forcing Match behavior into your Round class, which is not consistent with abstract OOP. Your update_result method actually belongs in your Match class. Once you do that, I think the code will clean itself up a bit.
On your second example, it's hard to see what you are trying to do, but it's not surprising that it's so slow. Every single create and save generates a separate database call. At first glance your code generates over a hundred separate database saves. Do you really need all those records? Can you combine some of the saves?
Beyond that, you can cut your database calls in half by using build instead of create, like this:
round = match.rounds.build
round.clan_blue = c[0]
round.clan_purple = c[1]
round.winner = c[0]
round.save!
If you want to save some lines of code, you could replace the above with this syntax:
match.rounds.create(:clan_blue_id => c[0].id, :clan_purple_id => c[1].id, :winner_id => c[0].id)

In your seeds file:
c = [clan,Clan.first(:offset => rand(Clan.count))]
This works, but it looks like you're picking a random number in Ruby. From what I understand, if you can do something in SQL instead of Ruby, it's generally faster. Try this (random() is the PostgreSQL/SQLite spelling; MySQL uses rand()):
c = [clan, Clan.first(:order => 'random()')]
You won't gain much here, since this line runs only six times per clan (60x total), but the champion lookup inside the nested loops runs far more often:
# (runs 60x total)
c = [clan,Clan.first(:offset => rand(Clan.count))]
# (runs 600x total: 60 matches x 2 clans x 5 players)
rand_champion = Champion.first(:offset => rand(Champion.count))
In general, you can almost always find something more to optimize in your program. So your time is most efficiently used by starting with the areas that are repeated the most--the most deeply nested loops. I'll leave optimizing the above 2 lines (and any others that may be similar) to you as an exercise. If you're having trouble, just let me know in a comment.
Also, I'm sure you'll get a lot of good suggestions in many of the responses, so I highly, highly recommend setting up a benchmarker so you can measure the differences. Be sure to run it several times for each version you test, so you can get a good average (programs running in the background could potentially throw off your results).
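A minimal sketch of such a benchmarker, using Ruby's standard Benchmark module. The helper name average_runtime and the two string-building workloads are placeholders I made up; in practice the blocks would wrap the two versions of the seeding code being compared.

```ruby
require 'benchmark'

# Runs the given block `runs` times and returns the average wall-clock time in seconds.
# Hypothetical helper name, not from the original post.
def average_runtime(runs = 5)
  total = 0.0
  runs.times { total += Benchmark.realtime { yield } }
  total / runs
end

# Stand-in workloads (assumptions); replace them with the code versions under test.
with_plus = average_runtime do
  s = String.new
  10_000.times { |n| s += n.to_s }
end

with_append = average_runtime do
  s = String.new
  10_000.times { |n| s << n.to_s }
end

puts "+= : #{with_plus.round(4)}s"
puts "<< : #{with_append.round(4)}s"
```

Averaging over several runs smooths out noise from whatever else the machine is doing at the time.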
As for simplicity, I think readability is pretty important. It won't make your code run any faster, but it can make your debugging faster (and your time is important!). The few things that were giving me trouble were nondescript variables like c and p. I do this too sometimes, but when you have several of these variables in the same scope, I very quickly reach a point where I think "what was that variable for again?". Something like temp_clan instead of c goes a long way.
For readability, I also prefer .each instead of for. That's entirely a personal preference, though.
btw I love League of Legends :)
Edit: (comments won't let me indent code) Upon taking a second look, I realized that this snippet can be optimized further:
for p in item.players.limit(5)
rand_champion = Champion.first(:offset => rand(Champion.count))
match.participations.create!(:player => p, :champion => rand_champion)
end
Move the champion lookup out of the loop and fetch five random champions in one query:
rand_champs = Champion.find(:all, :limit => 5, :order => 'random()')
item.players.limit(5).each_with_index do |p, i|
match.participations.create!(:player => p, :champion => rand_champs[i])
end
This turns 5 SQL queries into 1 for each team. Since the loop runs 120 times (60 matches x 2 clans), the champion lookups drop from 600 queries to 120. As an extra plus, you won't get repeated champions on the same team (or I guess that could be a downside if that was your intention).
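A different option, not the one above: since there are only a dozen champions, the whole list could be loaded once and the random picks made in Ruby with Array#sample, which needs no query inside the loops at all. A rough sketch with plain arrays standing in for Champion.all and a team's players (assign_champions is a hypothetical helper name):

```ruby
# Pairs each player with a distinct, randomly chosen champion.
# Array#sample(n) picks without repetition; if there are fewer champions
# than players, the remaining players are paired with nil.
def assign_champions(players, champions)
  players.zip(champions.sample(players.size))
end

# Stand-ins for Champion.all (loaded once) and item.players.limit(5).
all_champions = %w[Akali Alistar Amumu Anivia Annie Galio Tryndamere]
team_players  = %w[player1 player2 player3 player4 player5]

assign_champions(team_players, all_champions).each do |player, champion|
  puts "#{player} plays #{champion}"
end
```

In the seeds file each pair would then go into match.participations.create!, so only the inserts themselves still hit the database.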

Related

Rails: Faster way to perform updates on many records

In our Rails 3.2.13 app (Ruby 2.0.0 + Postgres on Heroku), we often retrieve a large amount of Order data from an API, and then need to update or create each order in our database, as well as its associations. A single order creates/updates itself plus approx. 10-15 associated objects, and we are importing up to 500 orders at a time.
The below code works, but the problem is it's not at all efficient in terms of speed. Creating/updating 500 records takes approx. 1 minute and generates 6500+ db queries!
def add_details(shop, shopify_orders)
shopify_orders.each do |shopify_order|
order = Order.where(:order_id => shopify_order.id.to_s, :shop_id => shop.id).first_or_create
order.update_details(order,shopify_order,shop) #This calls update_attributes for the Order
ShippingLine.add_details(order, shopify_order.shipping_lines)
LineItem.add_details(order, shopify_order.line_items)
Taxline.add_details(order, shopify_order.tax_lines)
Fulfillment.add_details(order, shopify_order.fulfillments)
Note.add_details(order, shopify_order.note_attributes)
Discount.add_details(order, shopify_order.discount_codes)
billing_address = shopify_order.billing_address rescue nil
if !billing_address.blank?
BillingAddress.add_details(order, billing_address)
end
shipping_address = shopify_order.shipping_address rescue nil
if !shipping_address.blank?
ShippingAddress.add_details(order, shipping_address)
end
payment_details = shopify_order.payment_details rescue nil
if !payment_details.blank?
PaymentDetail.add_details(order, payment_details)
end
end
end
def update_details(order,shopify_order,shop)
order.update_attributes(
:order_name => shopify_order.name,
:order_created_at => shopify_order.created_at,
:order_updated_at => shopify_order.updated_at,
:status => Order.get_status(shopify_order),
:payment_status => shopify_order.financial_status,
:fulfillment_status => Order.get_fulfillment_status(shopify_order),
:payment_method => shopify_order.processing_method,
:gateway => shopify_order.gateway,
:currency => shopify_order.currency,
:subtotal_price => shopify_order.subtotal_price,
:subtotal_tax => shopify_order.total_tax,
:total_discounts => shopify_order.total_discounts,
:total_line_items_price => shopify_order.total_line_items_price,
:total_price => shopify_order.total_price,
:total_tax => shopify_order.total_tax,
:total_weight => shopify_order.total_weight,
:taxes_included => shopify_order.taxes_included,
:shop_id => shop.id,
:email => shopify_order.email,
:order_note => shopify_order.note
)
end
So as you can see, we are looping through each order, finding out if it exists or not (then either loading the existing Order or creating the new Order), and then calling update_attributes to pass in the details for the Order. After that we create or update each of the associations. Each associated model looks very similar to this:
class << self
def add_details(order, tax_lines)
tax_lines.each do |shopify_tax_line|
taxline = Taxline.find_or_create_by_order_id(:order_id => order.id)
taxline.update_details(shopify_tax_line)
end
end
end
def update_details(tax_line)
self.update_attributes(:price => tax_line.price, :rate => tax_line.rate, :title => tax_line.title)
end
I've looked into the activerecord-import gem but unfortunately it seems to be more geared towards creation of records in bulk and not update as we also require.
What is the best way that this can be improved for performance?
Many many thanks in advance.
UPDATE:
I came up with this slight improvement, which essentially removes the call to update the newly created Orders (one query less per order).
def add_details(shop, shopify_orders)
shopify_orders.each do |shopify_order|
values = {:order_id => shopify_order.id.to_s, :shop_id => shop.id,
:order_name => shopify_order.name,
:order_created_at => shopify_order.created_at,
:order_updated_at => shopify_order.updated_at,
:status => Order.get_status(shopify_order),
:payment_status => shopify_order.financial_status,
:fulfillment_status => Order.get_fulfillment_status(shopify_order),
:payment_method => shopify_order.processing_method,
:gateway => shopify_order.gateway,
:currency => shopify_order.currency,
:subtotal_price => shopify_order.subtotal_price,
:subtotal_tax => shopify_order.total_tax,
:total_discounts => shopify_order.total_discounts,
:total_line_items_price => shopify_order.total_line_items_price,
:total_price => shopify_order.total_price,
:total_tax => shopify_order.total_tax,
:total_weight => shopify_order.total_weight,
:taxes_included => shopify_order.taxes_included,
:email => shopify_order.email,
:order_note => shopify_order.note}
get_order = Order.where(:order_id => shopify_order.id.to_s, :shop_id => shop.id)
if get_order.blank?
order = Order.create(values)
else
order = get_order.first
order.update_attributes(values)
end
ShippingLine.add_details(order, shopify_order.shipping_lines)
LineItem.add_details(order, shopify_order.line_items)
Taxline.add_details(order, shopify_order.tax_lines)
Fulfillment.add_details(order, shopify_order.fulfillments)
Note.add_details(order, shopify_order.note_attributes)
Discount.add_details(order, shopify_order.discount_codes)
billing_address = shopify_order.billing_address rescue nil
if !billing_address.blank?
BillingAddress.add_details(order, billing_address)
end
shipping_address = shopify_order.shipping_address rescue nil
if !shipping_address.blank?
ShippingAddress.add_details(order, shipping_address)
end
payment_details = shopify_order.payment_details rescue nil
if !payment_details.blank?
PaymentDetail.add_details(order, payment_details)
end
end
end
and for the associated objects:
class << self
def add_details(order, tax_lines)
tax_lines.each do |shopify_tax_line|
values = {:order_id => order.id,
:price => shopify_tax_line.price,
:rate => shopify_tax_line.rate,
:title => shopify_tax_line.title}
get_taxline = Taxline.where(:order_id => order.id)
if get_taxline.blank?
taxline = Taxline.create(values)
else
taxline = get_taxline.first
taxline.update_attributes(values)
end
end
end
end
Any better suggestions?
Try wrapping your entire code in a single database transaction. Since you're on Heroku, the back end will be Postgres. With that many update statements, you can probably benefit greatly from running them all in one transaction, so that Postgres commits once at the end instead of after each of the 6500 statements. Depending on the back end, you might have to split the work into smaller chunks, but even 100 orders per transaction (closing and re-opening the transaction between chunks) would greatly improve throughput into Pg.
http://api.rubyonrails.org/classes/ActiveRecord/Transactions/ClassMethods.html
http://www.postgresql.org/docs/9.2/static/sql-set-transaction.html
So before line 2 you'd add something like:
def add_details(shop, shopify_orders)
Order.transaction do
shopify_orders.each do |shopify_order|
And then at the very end of your method add another end:
if !payment_details.blank?
PaymentDetail.add_details(order, payment_details)
end
end # shopify_orders.each
end # Order.transaction
end # method
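The "100 at a time" variant can be sketched with each_slice. The FakeDatabase class below is a made-up stand-in that only counts calls, so the example runs without Rails; in the real app db.transaction would be Order.transaction and db.write would be the per-order create/update work.

```ruby
# Hypothetical stand-in for the database; it only counts transactions and writes.
class FakeDatabase
  attr_reader :transactions, :writes

  def initialize
    @transactions = 0
    @writes = 0
  end

  # Mimics the shape of Order.transaction { ... } without any real database.
  def transaction
    @transactions += 1
    yield
  end

  def write(_record)
    @writes += 1
  end
end

# Processes records in chunks, opening one transaction per chunk.
def import_in_batches(db, records, batch_size = 100)
  records.each_slice(batch_size) do |batch|
    db.transaction do
      batch.each { |record| db.write(record) }
    end
  end
end

db = FakeDatabase.new
import_in_batches(db, (1..500).to_a)
puts "#{db.writes} writes in #{db.transactions} transactions"
```

The batch size is a tuning knob: larger batches mean fewer commits, while smaller ones keep each transaction short.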
You can monkey-patch ActiveRecord like this:
class ActiveRecord::Base
#http://stackoverflow.com/questions/15317837/bulk-insert-records-into-active-record-table?lq=1
#https://gist.github.com/jackrg/76ade1724bd816292e4e
# "UPDATE THIS SET <list_of_column_assignments> FROM <table_name> THIS JOIN (VALUES (<csv1>, <csv2>,...) VALS ( <column_names> ) ON <list_of_primary_keys_comparison>"
def self.bulk_update(record_list)
pk = self.primary_key
raise "primary_key not found" unless pk.present?
raise "record_list not an Array of Hashes" unless record_list.is_a?(Array) && record_list.all? {|rec| rec.is_a? Hash }
return nil if record_list.empty?
result = nil
#test if every hash has primary keys, so we can JOIN
record_list.each { |r| raise "Primary Keys '#{self.primary_key.to_s}' not found on record: #{r}" unless hasAllPKs?(r) }
#list of primary keys comparison
pk_comparison_array = []
if (pk).is_a?(Array)
pk.each {|thiskey| pk_comparison_array << "THIS.#{thiskey} = VALS.#{thiskey}" }
else
pk_comparison_array << "THIS.#{pk} = VALS.#{pk}"
end
pk_comparison = pk_comparison_array.join(' AND ')
#SQL
record_list.each_slice(1000) do |batch|
key_list, value_list = convert_record_list(batch)
#csv values
csv_vals = value_list.map {|v| "(#{v.join(", ")})" }.join(", ")
#column names
column_names = key_list.join(", ")
#list of columns assignments
columns_assign_array = []
key_list.each {|col|
unless inPK?(col)
columns_assign_array << "THIS.#{col} = VALS.#{col}"
end }
columns_assign = columns_assign_array.join(', ')
sql = "UPDATE THIS SET #{columns_assign} FROM #{self.table_name} THIS JOIN ( VALUES #{csv_vals} ) VALS ( #{column_names} ) ON ( #{pk_comparison} )"
result = self.connection.execute(sql)
return result if result<0
end
return result
end
def self.inPK?(str)
pk = self.primary_key
test = str.to_s
if pk.is_a?(Array)
(pk.include?(test))
else
(pk==test)
end
end
#test if given hash has primary keys included as hash keys and those keys are not empty
def self.hasAllPKs?(hash)
h = hash.stringify_keys
pk = self.primary_key
if pk.is_a?(Array)
(pk.all? {|k| h.key?(k) and h[k].present? })
else
h.key?(pk) and h[pk].present?
end
end
def self.convert_record_list(record_list)
# Build the list of keys
key_list = record_list.map(&:keys).flatten.map(&:to_s).uniq.sort
value_list = record_list.map do |rec|
list = []
key_list.each {|key| list << ActiveRecord::Base.connection.quote(rec[key] || rec[key.to_sym]) }
list
end
# If table has standard timestamps and they're not in the record list then add them to the record list
time = ActiveRecord::Base.connection.quote(Time.now)
for field_name in %w(created_at updated_at)
if self.column_names.include?(field_name) && !(key_list.include?(field_name))
key_list << field_name
value_list.each {|rec| rec << time }
end
end
return [key_list, value_list]
end
end
Then you can build an array of hashes containing your models' attributes (including their primary keys) and do something like:
ActiveRecord::Base.transaction do
Model.bulk_update [ {attr1: val1, attr2: val2,...}, {attr1: val1, attr2: val2,...}, ... ]
end
It will be a single SQL command without Rails callbacks and validations.
For PostgreSQL, there are several issues that the above approach does not address:
You must specify an actual table, not just an alias, in the update target table.
You cannot repeat the target table in the FROM clause. Since you are joining the target table to a VALUES table (so there is only one table in the FROM clause), you cannot use JOIN; you must use WHERE instead.
You don't get the same "free" casts in a VALUES table that you do in a simple "UPDATE" command, so you must cast date/timestamp values as such (#val_cast does this).
class ActiveRecord::Base
def self.update!(record_list)
raise ArgumentError, "record_list not an Array of Hashes" unless record_list.is_a?(Array) && record_list.all? {|rec| rec.is_a? Hash }
return record_list if record_list.empty?
record_list.each_slice(1000) do |batch|
field_list, value_list = convert_record_list(batch)
key_field = self.primary_key
non_key_fields = field_list - [%Q["#{self.primary_key}"], %Q["created_at"]]
columns_assign = non_key_fields.map {|field| "#{field} = #{val_cast(field)}"}.join(",")
value_table = value_list.map {|row| "(#{row.join(", ")})" }.join(", ")
sql = "UPDATE #{table_name} AS this SET #{columns_assign} FROM (VALUES #{value_table}) vals (#{field_list.join(", ")}) WHERE this.#{key_field} = vals.#{key_field}"
self.connection.update_sql(sql)
end
return record_list
end
def self.val_cast(field)
field = field.gsub('"', '')
if (column = columns.find{|c| c.name == field }).sql_type =~ /time|date/
"cast (vals.#{field} as #{column.sql_type})"
else
"vals.#{field}"
end
end
def self.convert_record_list(record_list)
# Build the list of fields
field_list = record_list.map(&:keys).flatten.map(&:to_s).uniq.sort
value_list = record_list.map do |rec|
list = []
field_list.each {|field| list << ActiveRecord::Base.connection.quote(rec[field] || rec[field.to_sym]) }
list
end
# If table has standard timestamps and they're not in the record list then add them to the record list
time = ActiveRecord::Base.connection.quote(Time.now)
for field_name in %w(created_at updated_at)
if self.column_names.include?(field_name) && !(field_list.include?(field_name))
field_list << field_name
value_list.each {|rec| rec << time }
end
end
field_list.map! {|field| %Q["#{field}"] }
return [field_list, value_list]
end
end
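To make the shape of the generated statement easier to see, here is a stripped-down sketch that only builds the PostgreSQL UPDATE ... FROM (VALUES ...) string and prints it. The helper names sql_literal and bulk_update_sql are made up, the quoting is deliberately naive (real code should use the connection's quote method, as above), and it assumes every hash has the same symbol keys.

```ruby
# Naive literal quoting, for illustration only (assumption: values are
# nil, numbers, or strings). Real code should use connection.quote.
def sql_literal(value)
  case value
  when nil then "NULL"
  when Numeric then value.to_s
  else "'#{value.to_s.gsub("'", "''")}'"
  end
end

# Builds a single UPDATE statement that joins the target table to a VALUES list.
# `rows` is an array of hashes sharing the same symbol keys, including `pk`.
def bulk_update_sql(table, pk, rows)
  columns = rows.first.keys.map(&:to_s)
  tuples = rows.map do |row|
    "(" + columns.map { |col| sql_literal(row[col.to_sym]) }.join(", ") + ")"
  end
  # The SET targets are unqualified column names; only the source side uses vals.
  assignments = (columns - [pk]).map { |col| "#{col} = vals.#{col}" }

  "UPDATE #{table} AS this SET #{assignments.join(', ')} " \
    "FROM (VALUES #{tuples.join(', ')}) AS vals (#{columns.join(', ')}) " \
    "WHERE this.#{pk} = vals.#{pk}"
end

puts bulk_update_sql("orders", "id", [
  { :id => 1, :status => "paid" },
  { :id => 2, :status => "open" }
])
```

Date and timestamp columns would still need the explicit casts that val_cast adds in the answer above, because literals in a VALUES list are not typed from the target column.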

How can I refactor this Rails controller?

I have the following in my controller:
@custom_exercises = @user.exercises.all
@all_exercises = Exercise.not_the_placeholder_exercise.public.order("name").all
if @user.trainers.present?
trainer_exercises = []
@user.trainers.each do |trainer|
trainer_exercises << trainer.exercises.all
end
@my_trainer_custom_exercises = trainer_exercises
end
@exercises = @custom_exercises + @all_exercises
if @my_trainer_custom_exercises.present?
@exercises << @my_trainer_custom_exercises
@exercises.flatten!
end
This feels really messy. How could I refactor this?
First step: set up an AR relationship between users and exercises, probably along the lines of:
class User < ActiveRecord::Base
has_many :trainer_exercises,
:through => :trainers,
:foreign_key => :client_id,
:source => :exercises
end
Second step: move @all_exercises to a class method in Exercise.
class Exercise < ActiveRecord::Base
def self.all_exercises
not_the_placeholder_exercise.public.order("name").all
end
end
This way, the whole controller gets a whole lot simpler:
@custom_exercises = @user.exercises.all
@trainer_exercises = @user.trainer_exercises.all
@exercises = Exercise.all_exercises + @custom_exercises + @trainer_exercises
Purely from a fewer-lines-of-code perspective, you could start with this (more or less; not tested, but it should work):
if @user.trainers.present?
@my_trainer_custom_exercises = @user.trainers.inject([]) { |trainer_exercises, trainer|
trainer_exercises << trainer.exercises.all
}
end
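Both the original loop and the inject version push whole arrays into an array, which is why the controller needs flatten! afterwards. Enumerable#flat_map collects and flattens in one step. A small plain-Ruby sketch, with TrainerStub as a made-up Struct standing in for the model and all_trainer_exercises as a hypothetical helper name:

```ruby
# Plain-Ruby stand-in for a trainer with its exercises (assumption for illustration).
TrainerStub = Struct.new(:name, :exercises)

# Collects every trainer's exercises into one flat array.
def all_trainer_exercises(trainers)
  trainers.flat_map(&:exercises)
end

trainers = [
  TrainerStub.new("Ann", ["squat", "lunge"]),
  TrainerStub.new("Bob", ["plank"])
]

nested = trainers.map(&:exercises)      # one inner array per trainer
flat   = all_trainer_exercises(trainers) # a single flat list

puts nested.inspect
puts flat.inspect
```

With no trainers, flat_map simply returns an empty array, so the present? guard is not needed either.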

N+1 while enumerating self-referencing records

I'm doing a pretty basic thing: displaying a tree of categories in topological order, and ActiveRecord issues an extra query to enumerate each category's children.
class Category < ActiveRecord::Base
attr_accessible :name, :parent_id
belongs_to :parent, :class_name => 'Category'
has_many :children, :class_name => 'Category', :foreign_key => 'parent_id'
def self.in_order
all = Category.includes(:parent, :children).all # Three queries as it should be
root = all.find{|c| c.parent_id == nil}
queue = [root]
result = []
while queue.any?
current = queue.shift
result << current
current.children.each do |child| # SELECT * FROM categories WHERE parent_id = ?
queue << child
end
end
result
end
end
UPD. As far as I understand, what's going on here is that when a category is referenced as a child of some category, it's not the same object as the one in the initial list, so it doesn't have its children loaded. Is there a way to implement the desired behavior without resorting to creating an extra adjacency list?
UPD2: Here's the manual adjacency list solution. It uses only one query, but I'd really like to use something more idiomatic.
def self.in_order_manual
cache = {}
adj = {}
root = nil
all.each do |c|
cache[c.id] = c
if c.parent_id != nil
(adj[c.parent_id] ||= []) << c.id
else
root = c.id
end
end
queue = [root]
result = []
while queue.any?
current = queue.shift
result << current
(adj[current] || []).each{|child| queue << child}
end
result.map{|id| cache[id]}
end
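One way to shorten the manual version, assuming the categories are already loaded into an array by a single query: Enumerable#group_by builds the adjacency list in one call. This is still an adjacency list, just a more compact one. The sketch uses a Struct (CategoryStub) instead of the model, and breadth_first_order is a hypothetical helper name; unlike the original it also tolerates several roots.

```ruby
# Plain-Ruby stand-in for the Category model (assumption for illustration).
CategoryStub = Struct.new(:id, :parent_id, :name)

# Returns the categories in breadth-first order, parents before children.
def breadth_first_order(categories)
  children_of = categories.group_by(&:parent_id) # parent_id => [children]
  queue = children_of.fetch(nil, []).dup         # roots have no parent
  result = []
  until queue.empty?
    current = queue.shift
    result << current
    queue.concat(children_of.fetch(current.id, []))
  end
  result
end

categories = [
  CategoryStub.new(1, nil, "root"),
  CategoryStub.new(2, 1, "a"),
  CategoryStub.new(3, 2, "a1"),
  CategoryStub.new(4, 1, "b"),
  CategoryStub.new(5, 4, "b1")
]

puts breadth_first_order(categories).map(&:name).join(", ")
```

Because the objects themselves go into the queue, there is no need for the separate id-to-record cache.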

Mass Inserting Data into Nested Models

I've been following this fantastic tutorial about mass inserting data. All is well, I've got my transaction times down from about 30 seconds to less than 1 :)
I just don't know how to populate the fields in a child model:
has_many :check, :dependent => :destroy
accepts_nested_attributes_for :check, :reject_if => lambda { |a| a[:value].blank? }, :allow_destroy => true
Previously, I've used this:
...
User.create!(:username => username, :check_attributes => [ {:attribute_name => "User-Password", :value => password, :op => ":="}])
...
Since moving to a different method, I've now got this in my user model:
def self.activerecord_extensions_mass_insert(validate = true)
columns = [:username]
values = []
10000.times do
username = ""
5.times { username << (i = Kernel.rand(62); i += ((i < 10) ? 48 : ((i < 36) ? 55 : 61 ))).chr }
values.push [username]
end
User.import columns, values, {:validate => validate}
end
I've tried using this and a few other variations without success...
columns = [:username, :check_attributes => [ :attribute_name, :value, :op]]
Any suggestions?
import won't accept nesting so I think you are out of luck for that method.
Why not have two strings, user_sql and check_sql, and then as you loop through, manually build each SQL statement for an extended insert? Then you can run two queries instead of a bunch.
On the plus side, you will shave off some more time from the indexing penalty of multiple queries.
Edited to add code:
You build two SQL statements at once so they can relate to each other via user_id, then at the end you execute both. If this stuff might ever make it to production, wrap the SQL calls in a transaction so you don't have any accidents.
user_sql = "INSERT into users (id,username,password) VALUES\n"
check_sql = "INSERT into check_attributes (user_id,foo,bar) VALUES\n"
random_data_pool = [('0'..'9'),('A'..'Z'),('a'..'z')].collect(&:to_a).flatten
max_loop = 10
1.upto(max_loop) do |i|
seperator = (i == max_loop) ? ';' : ",\n"
username = (1..5).map{ random_data_pool[Kernel.rand(random_data_pool.size)] }.join
password = (1..5).map{ random_data_pool[Kernel.rand(random_data_pool.size)] }.join
user_sql += "(#{i},'#{username}','#{password}')" + seperator
check_sql +="(#{i},true,'potato')" + seperator
end
ActiveRecord::Base.connection.execute(user_sql)
ActiveRecord::Base.connection.execute(check_sql)
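The separator bookkeeping can be avoided by collecting the value tuples in an array and joining them at the end. A small sketch of that variation; extended_insert is a made-up helper name, and it assumes the values passed in are already valid SQL literals (quoted strings, plain numbers), so it does no escaping of its own.

```ruby
# Builds one extended INSERT statement from pre-quoted SQL literals.
# Hypothetical helper: `rows` is an array of arrays, one inner array per record.
def extended_insert(table, columns, rows)
  tuples = rows.map { |row| "(#{row.join(',')})" }
  "INSERT INTO #{table} (#{columns.join(',')}) VALUES\n#{tuples.join(",\n")};"
end

user_rows  = [[1, "'abcde'", "'s3cr3'"], [2, "'fghij'", "'pa55w'"]]
check_rows = [[1, true, "'potato'"], [2, true, "'potato'"]]

puts extended_insert("users", %w[id username password], user_rows)
puts extended_insert("check_attributes", %w[user_id foo bar], check_rows)
```

The table and column names here are the placeholder ones from the answer above, not a real schema.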

How to properly handle changed attributes in a Rails before_save hook?

I have a model that looks like this:
class StopWord < ActiveRecord::Base
UPDATE_KEYWORDS_BATCH_SIZE = 1000
before_save :update_keywords
def update_keywords
offset = 0
max_id = ((max_kw = Keyword.first(:order => 'id DESC')) and max_kw.id) || 0
while offset <= max_id
begin
conditions = ['id >= ? AND id < ? AND language = ? AND keyword RLIKE ?',
offset, offset + UPDATE_KEYWORDS_BATCH_SIZE, language]
# Clear keywords that matched the old stop word
if @changed_attributes and (old_stop_word = @changed_attributes['stop_word']) and not @new_record
Keyword.update_all 'stopword = 0', conditions + [old_stop_word]
end
Keyword.update_all 'stopword = 1', conditions + [stop_word]
rescue Exception => e
logger.error "Skipping batch of #{UPDATE_KEYWORDS_BATCH_SIZE} keywords at offset #{offset}"
logger.error "#{e.message}: #{e.backtrace.join "\n "}"
ensure
offset += UPDATE_KEYWORDS_BATCH_SIZE
end
end
end
end
This works just fine, as the unit tests show:
class KeywordStopWordTest < ActiveSupport::TestCase
def test_stop_word_applied_on_create
kw = Factory.create :keyword, :keyword => 'foo bar baz', :language => 'en'
assert !kw.stopword, 'keyword is not a stop word by default'
sw = Factory.create :stop_word, :stop_word => kw.keyword.split(' ')[1], :language => kw.language
kw.reload
assert kw.stopword, 'keyword is a stop word'
end
def test_stop_word_applied_on_save
kw = Factory.create :keyword, :keyword => 'foo bar baz', :language => 'en', :stopword => true
sw = Factory.create :keyword_stop_word, :stop_word => kw.keyword.split(' ')[1], :language => kw.language
sw.stop_word = 'blah'
sw.save
kw.reload
assert !kw.stopword, 'keyword is not a stop word'
end
end
But mucking with the @changed_attributes instance variable just feels wrong. Is there a standard Rails-y way to get the old value of an attribute that is being modified on a save?
Update: Thanks to Douglas F Shearer and Simone Carletti (who apparently prefers Murphy's to Guinness), I have a cleaner solution:
def update_keywords
offset = 0
max_id = ((max_kw = Keyword.first(:order => 'id DESC')) and max_kw.id) || 0
while offset <= max_id
begin
conditions = ['id >= ? AND id < ? AND language = ? AND keyword RLIKE ?',
offset, offset + UPDATE_KEYWORDS_BATCH_SIZE, language]
# Clear keywords that matched the old stop word
if stop_word_changed? and not @new_record
Keyword.update_all 'stopword = 0', conditions + [stop_word_was]
end
Keyword.update_all 'stopword = 1', conditions + [stop_word]
rescue StandardError => e
logger.error "Skipping batch of #{UPDATE_KEYWORDS_BATCH_SIZE} keywords at offset #{offset}"
logger.error "#{e.message}: #{e.backtrace.join "\n "}"
ensure
offset += UPDATE_KEYWORDS_BATCH_SIZE
end
end
end
Thanks, guys!
You want ActiveModel::Dirty.
Examples:
person = Person.find_by_name('Uncle Bob')
person.changed? # => false
person.name = 'Bob'
person.changed? # => true
person.name_changed? # => true
person.name_was # => 'Uncle Bob'
person.name_change # => ['Uncle Bob', 'Bob']
Full documentation: http://api.rubyonrails.org/classes/ActiveModel/Dirty.html
You're using the right feature but the wrong API.
You should use #changes and #changed?.
See this article and the official API.
Two additional notes about your code:
Never rescue Exception directly when you actually want to rescue execution errors. This is Java-style. You should rescue StandardError instead, because exceptions outside StandardError are normally system-level errors (for example NoMemoryError, SignalException, or SyntaxError).
You don't need the begin block in this case.
def update_keywords
...
rescue => e
...
ensure
...
end
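A small runnable illustration of both notes, in plain Ruby with a made-up method name (attempt). A bare rescue catches StandardError and its subclasses only, and rescue/ensure can hang directly off a method body. One caveat: the method-level form covers the whole method, so to keep the skip-this-batch-and-continue behavior from the question, the per-batch work would have to sit in its own method (or keep its begin block inside the loop).

```ruby
# Method-level rescue/ensure, no explicit begin block needed.
def attempt(log)
  yield
  :ok
rescue => e # bare rescue: StandardError and subclasses only
  log << "rescued #{e.class}"
  :rescued
ensure
  log << "ensure ran" # runs whether or not an exception was raised
end

log = []
puts attempt(log) { raise ArgumentError, "bad input" }
puts log.inspect

# NotImplementedError is a ScriptError, not a StandardError,
# so the bare rescue above lets it propagate (ensure still runs).
begin
  attempt(log) { raise NotImplementedError, "not a StandardError" }
rescue NotImplementedError => e
  puts "propagated #{e.class}"
end
```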

Resources