Ruby on Rails: best way to update 100k records

I need to update more than 100k records in the database as efficiently as possible. Here is my code:
namespace :order do
  desc "update confirmed at field for Payments::Order"
  task set_confirmed_at: :environment do
    puts "==> Updating confirmed_at for orders starts ...".blue
    Payments::Order.find_each(batch_size: 10000) do |orders|
      order_action = orders.actions.where("sender LIKE ?", "%ConfirmJob%").first if orders.actions
      if !order_action.blank?
        orders.update_attribute(:confirmed_at, order_action.created_at)
        puts "order id = #{orders.id} has been updated.".green
      end
    end
    puts "== completed ==".blue
  end
end
Here I split the records into batches of 10,000 and then update each record based on some conditions. Could anyone suggest a more efficient way to do the same task?
Thank you in advance!

You can try update_all:
Payments::Order.joins(:actions).where(Payments::OrderAction.arel_table[:sender].matches("%ConfirmJob%")).update_all("confirmed_at = actions.created_at")
So your code will look like this:
namespace :order do
  desc "update confirmed at field for Payments::Order"
  task set_confirmed_at: :environment do
    puts "==> Updating confirmed_at for orders starts ...".blue
    Payments::Order.joins(:actions).where(Payments::OrderAction.arel_table[:sender].matches("%ConfirmJob%")).update_all("confirmed_at = actions.created_at")
    puts "== completed ==".blue
  end
end
Update:
I looked into this and found that bulk updates with a joined table are a long-standing issue in Rails.
Since the SET part is passed through as a raw string, I suggest adding the FROM clause there.
namespace :order do
  desc "update confirmed at field for Payments::Order"
  task set_confirmed_at: :environment do
    puts "==> Updating confirmed_at for orders starts ...".blue
    Payments::Order.joins(:actions).
      where(Order::Action.arel_table[:sender].matches("%ConfirmJob%")).
      update_all("confirmed_at = actions.created_at FROM actions")
    puts "== completed ==".blue
  end
end

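You are calling Payments::Order.find_each, so your solution loops over every Payments::Order when you only want the ones that have an action with sender LIKE '%ConfirmJob%'. I would go with this solution:
Payments::Order
  .includes(:actions)
  .joins(:actions)
  # The wildcards go in the bound value; writing '%?%' in the SQL would
  # wrap the quoted value in extra characters and break the query.
  .where("actions.sender LIKE ?", "%ConfirmJob%")
  .find_each do |order|
    # Takes the order's first loaded action; filter again here if an order
    # can also have actions that are not from ConfirmJob.
    order_action = order.actions.first
    order.update!(confirmed_at: order_action.created_at)
  end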

Related

Automatically Generating Daily Posts For A Blog With Ruby On Rails

I have a rake task which I run daily with the Heroku Scheduler.
Each time it runs, it generates a new post for every user, as long as today's date is on or after the "start date" of the user's account.
This is the code for the rake task:
namespace :abc do
  desc "Used to generate a new daily log"
  task :create_post => :environment do
    User.find_each do |currentUser|
      starting_date = currentUser.start_date
      Post.create!(content: "RAKED", user: currentUser, status: "new") if Date.today >= starting_date && Date.today.on_weekday?
    end
    puts "It worked yo"
  end
end
My problem is if someone makes an account then sets their start date sometime in the past (so they can fill in old posts) my current rake task will not generate the backdated daily posts. Does anyone have any ideas about how to resolve this so that the rake task still performs its current job but also deals with this case?
namespace :abc do
  desc "Used to generate a new daily log"
  task :create_post => :environment do
    User.find_each do |currentUser|
      starting_date = currentUser.start_date
      if Date.today >= starting_date && Date.today.on_weekday?
        if currentUser.posts.count.zero?
          # New user with no posts yet: create one post per weekday since the start date.
          starting_date.upto(Date.today) { |date| currentUser.generate_post if date.on_weekday? }
        else
          currentUser.generate_post
        end
      end
    end
    puts "It actually worked yo!"
  end
end
In User model,
def generate_post
  posts.create!(content: "RAKED", status: "new")
end
Your logic remains the same; I just loop from the start date to the current date to create the backdated posts. Checking that the post count is zero ensures this branch runs only for a new user, or for a user who has no posts yet.
Hope it helps..
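The date loop in the answer above can be tried outside Rails. This is a minimal sketch in plain Ruby: weekday? is a stand-in for ActiveSupport's on_weekday?, and backfill_dates is a hypothetical helper name that does not appear in the original answer.

```ruby
require "date"

# Plain-Ruby stand-in for ActiveSupport's Date#on_weekday?.
def weekday?(date)
  !(date.saturday? || date.sunday?)
end

# Hypothetical helper: every weekday from start_date through end_date, inclusive.
def backfill_dates(start_date, end_date)
  start_date.upto(end_date).select { |date| weekday?(date) }
end

# 2024-01-05 is a Friday and 2024-01-09 is a Tuesday, so the weekend is skipped.
dates = backfill_dates(Date.new(2024, 1, 5), Date.new(2024, 1, 9))
puts dates.map(&:to_s).inspect
# prints ["2024-01-05", "2024-01-08", "2024-01-09"]
```

If the backdated posts should carry their own dates, each element could be passed to a version of generate_post that accepts a date. That would be a change to the answer's method, which takes no arguments.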

Logging raw SQL errors in Rake Tasks

I'm using raw sql bulk updates (for performance reasons) in the context of a rake task. Something like the following:
update_sql = Book.connection.execute("UPDATE books AS b SET
    stock = vs.stock,
    promotion = vs.promotion,
    sales = vs.sales
  FROM (values #{values_string}) AS vs
    (id, stock, promotion, sales) WHERE b.id = vs.id;")
While everything is "transparent" in local development, if this SQL fails in production while the rake task is running (for example because the promotion column is nil and the statement becomes invalid), no error is logged.
I can log this manually by catching the exception, as below, but an option that logs such errors automatically would be better.
begin
  ...
rescue ActiveRecord::StatementInvalid => e
  Rails.logger.fatal "Books update: ActiveRecord::StatementInvalid: "+ e.to_s
end
You can make your own custom class in your model folder:
app/models/custom_sql_logger.rb :
class CustomSqlLogger
  def self.debug(msg=nil)
    @custom_log ||= Logger.new("#{Rails.root}/log/custom_sql.log")
    @custom_log.debug(msg) unless msg.nil?
  end
end
Then go to the rake task where you want to log updated fields, for example lib/tasks/calculate_averages.rake, and call your custom logger:
CustomSqlLogger.debug "The field was successfully updated into DB"
Example from my project:
require 'rake'
task :calculate_averages => :environment do
  products = Product.all
  products.each do |product|
    puts "Calculating average rating for #{product.name}..."
    product.update_attribute(:average_rating, product.reviews.average("rating"))
    CustomSqlLogger.debug "#{product.name} was successfully updated into DB"
  end
end
The custom logger creates a new file in the log folder, log/custom_sql.log, and saves all messages there. Keep an eye on the size of the log file over time.
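The manual rescue shown in the question can also be wrapped in a small helper so that each task does not repeat it. The following is only a sketch in plain Ruby: with_error_logging is a hypothetical name, and the demo uses an in-memory logger where a Rails task would pass Rails.logger or a custom logger like the one above.

```ruby
require "logger"
require "stringio"

# Hypothetical helper: runs the block, logs any StandardError at FATAL level
# with the given label, then re-raises so the task still fails visibly.
def with_error_logging(logger, label)
  yield
rescue StandardError => e
  logger.fatal("#{label}: #{e.class}: #{e.message}")
  raise
end

# Demo with an in-memory log so the sketch runs without Rails.
io = StringIO.new
logger = Logger.new(io)

begin
  with_error_logging(logger, "Books update") { raise ArgumentError, "bad values" }
rescue ArgumentError
  # The error still propagates after being logged.
end

puts io.string
```

Because ActiveRecord::StatementInvalid is a StandardError, wrapping the connection.execute call in such a block would log it too. Re-raising keeps the rake task's non-zero exit status.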

rails cron job overlap

I am using Heroku Scheduler to run a task every 10 minutes. The task does something like the code below. How can I prevent the next run from overlapping with the current one? Is there anything that prevents cron jobs from overlapping?
task force_close: :environment do
  # get all unvoted wine_question
  questions = Question.where(closed: false)
  puts "Total #{questions.count} of wine_question will be closed"
  finish_count = 0
  questions.each do |question|
    begin
      question.force_close!
      finish_count += 1
    rescue StandardError => bang
      puts "question #{question.id} error when running #{bang}"
    end
  end
  puts "Total #{finish_count} of question was closed"
end
There are two ways.
First,
add a flag column to Question, such as force_closed:boolean,
and set it whenever force_close! is called,
so you can find the remaining questions with where(closed: false, force_closed: false).
Second,
create a batch history table with the columns task_name and run_at,
and save the run time of each task in that table:
# last line of task
BatchHistory.create(task_name: 'force_close', run_at: Time.now)
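The second approach can be sketched as a guard that skips a run while an earlier one is still unfinished. This is an in-memory illustration only: the BatchHistory class below is a plain-Ruby stand-in, not an ActiveRecord model, and the started_at and finished_at fields are assumptions that go beyond the run_at column suggested above.

```ruby
# In-memory stand-in for the batch history table (hypothetical; a real task
# would use a database row so that separate processes can see it).
class BatchHistory
  Run = Struct.new(:task_name, :started_at, :finished_at)

  def initialize
    @runs = []
  end

  # True when the named task has a run that started but never finished.
  def running?(task_name)
    @runs.any? { |run| run.task_name == task_name && run.finished_at.nil? }
  end

  # Runs the block unless a previous run is still in progress.
  # Returns :skipped or :done.
  def guard(task_name)
    return :skipped if running?(task_name)

    run = Run.new(task_name, Time.now, nil)
    @runs << run
    begin
      yield
    ensure
      run.finished_at = Time.now # mark finished even if the block raises
    end
    :done
  end
end

history = BatchHistory.new
outer = history.guard("force_close") do
  # A second run that starts while the first is still going is skipped.
  puts history.guard("force_close") { puts "never printed" }
end
puts outer
```

With a real table, the check and the insert would have to be atomic (for example through a database lock or a unique constraint), otherwise two processes could both pass the check at the same moment.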

Destroying a Rails 3 object in rake?

I'm stuck on a simple issue here. I'm building an application that manages a database of coupons, each of which has an expiration date. I'm trying to build a rake task that will delete the expired coupons. The relevant code from the rakefile looks like this:
desc "Deletes expired offers from the database."
task :purge_expired => :environment do
  today = Date.today.to_s
  Offer.where('expires_on < ?', today).destroy
end
That however fails with the following error message:
rake aborted!
wrong number of arguments (0 for 1)
I'm just not sure why. What arguments would be needed?
As an experiment, I found that this worked fine:
desc "Deletes expired offers from the database."
task :purge_expired => :environment do
  today = Date.today.to_s
  puts Offer.where('expires_on < ?', today).count
end
That returned the right number of records, so I assume I'm successfully gathering up the right objects.
FWIW, I tried this too, and had no luck:
desc "Deletes expired offers from the database."
task :purge_expired => :environment do
  today = Date.today.to_s
  @offers = Offer.where('expires_on < ?', today)
  @offers.destroy
end
So I'm kind of out of ideas. What am I doing wrong here?
Thanks so much for your help. I'm pretty sure I wouldn't have a job if it weren't for Stack Overflow!
You're close. You just need to use the #destroy_all method instead of #destroy. The latter requires an id argument.
today = Date.today.to_s
Offer.where('expires_on < ?', today).destroy_all
First off, to help debug things from rake, invoke it with the --trace option. Your issue here isn't rake specific though.
The Offer.where('expires_on < ?', today) is going to return a collection, and not a single instance of Offer and there isn't a destroy method available for the collection.
You can iterate over each expired offer and call destroy. Something like this:
@offers = Offer.where('expires_on < ?', today)
@offers.each { |offer| offer.destroy }

Rails - Help with rake task

I have a rake task I need to run in order to sanitize (remove forward slashes) some data in the database. Here's the task:
namespace :db do
  desc "Remove slashes from old-style URLs"
  task :substitute_slashes => :environment do
    puts "Starting"
    contents = Content.all
    contents.each do |c|
      if c.permalink != nil
        c.permalink.gsub!("/","")
        c.save!
      end
    end
    puts "Finished"
  end
end
Which allows me to run rake db:substitute_slashes --trace
If I do puts c.permalink after the gsub! I can see it's setting the attribute properly. However the save! doesn't seem to be working because the data is not changed. Can someone spot what the issue may be?
Another thing, I have paperclip installed and this task is triggering [paperclip] Saving attachments. which I would rather avoid.
try this:
namespace :db do
  desc "Remove slashes from old-style URLs"
  task :substitute_slashes => :environment do
    puts "Starting"
    contents = Content.all
    contents.each do |c|
      unless c.permalink.nil?
        c.permalink = c.permalink.gsub(/\//,'')
        c.save!
      end
    end
    puts "Finished"
  end
end
1.) Change != nil to unless record.item.nil? (I don't know if it makes a difference, but I've never used != nil. Judging by your code, you may also want to use .blank?)
2.) I wrote the pattern as a regex, which goes between two / (/ stuff /). The \ is necessary because you need to escape the /. Note that gsub also accepts a plain string pattern such as "/", so that part of your code was not the cause.
3.) Bang (!) updates the object in place. I think your biggest issue may be that you are overusing !.
4.) You're also making this very inefficient... You are looking at every record and updating every record. Rails isn't always the best option. Learn SQL and do this in one line:
"UPDATE contents SET permalink = replace(permalink, '/', '');"
If you MUST use Rails:
ActiveRecord::Base.connection.execute "UPDATE contents SET permalink = replace(permalink, '/', '');"
Wow! One query. Amazing! :)
The next thing I would try would be
c.permalink = c.permalink.gsub("/","")
As for saving without callbacks, this stackoverflow page has some suggestions.
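The difference between calling gsub! on the attribute and assigning the result of gsub can be illustrated without Rails. One likely explanation for the question, assuming an older Rails version whose change tracking only notices values set through the attribute writer, is that the in-place edit never marks the record as changed, so save! has nothing to write. FakeContent below is a hypothetical plain-Ruby stand-in, not ActiveRecord.

```ruby
# Minimal stand-in for a model with setter-based change tracking.
# This is NOT ActiveRecord; it only marks the attribute as changed when the
# setter is called, which is how in-place edits can go unnoticed.
class FakeContent
  attr_reader :permalink

  def initialize(permalink)
    @permalink = permalink
    @changed = false
  end

  def permalink=(value)
    @changed = true
    @permalink = value
  end

  def changed?
    @changed
  end
end

in_place = FakeContent.new(+"/old/url")
in_place.permalink.gsub!("/", "") # mutates the string; the setter never runs
puts in_place.permalink           # prints oldurl (the in-memory value looks right)
puts in_place.changed?            # prints false (nothing is flagged for saving)

assigned = FakeContent.new("/old/url")
assigned.permalink = assigned.permalink.gsub("/", "")
puts assigned.changed?            # prints true
```

This matches what the question describes: puts after gsub! shows the new value, yet the saved data stays the same.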

Resources