I have added the following code to my initializers:
require 'delayed/worker'
Delayed::Worker.logger = ActiveSupport::BufferedLogger.new("log/#{Rails.env}_delayed_jobs.log", Rails.logger.level)
module Delayed
class Worker
def say_with_flushing(text, level = Logger::INFO)
if logger
say_without_flushing(text, level)
logger.flush
end
end
alias_method_chain :say, :flushing
end
end
But it only seems to output stuff like:
2011-09-06T14:02:04-0700: [Worker(delayed_job host:Caribbean.local pid:80322)] 1 jobs processed at 16.5391 j/s, 0 failed ...
2011-09-06T14:03:19-0700: [Worker(delayed_job host:Caribbean.local pid:80322)] MailTest completed after 0.2072
...
Is there a way to get it to actually output what it's doing? Ideally what I am trying to get is the actual outgoing email content that would normally be seen in development mode, but have it go into the delayed_job.log (regardless of rails environment).
So without having mailing go through delayed_job, in production mode the output in the log is something like:
[Mail sent to foo#foo.com]
But I want to get the actual contents as they'd be displayed if I were in development mode-- (the entire email body, subject, etc) but again in the delayed_job log because delayed_job is handling mailing emails.
Related
I have an issue with importing a lot of records from a user provided excel file into a database. The logic for this is working fine, and I’m using ActiveRecord-import to cut down on the number of database calls. However, when a file is too large, the processing can take too long and Heroku will return a timeout. Solution: Resque and moving the processing to a background job.
So far, so good. I’ve needed to add CarrierWave to upload the files to S3 because I can’t just hold the file in memory for the background job. The upload portion is also working fine, I created a model for them and am passing the IDs through to the queued job to retrieve the file later as I understand I can’t pass a whole ActiveRecord object through to the job.
I’ve installed Resque and Redis locally, and everything seems to be setup correctly in that regard. I can see the jobs I’m creating being queued and then run without failing. The job seems to run fine, but no records are added to the database. If I run the code from my job line by line in the console, the records are added to the database as I would expect. But when the queued jobs I’m creating run, nothing happens.
I can’t quite work out where the problem might be.
Here’s my upload controller’s create action:
def create
#upload = Upload.new(upload_params)
if #upload.save
Resque.enqueue(ExcelImportJob, #upload.id)
flash[:info] = 'File uploaded.
Data will be processed and added to the database.'
redirect_to root_path
else
flash[:warning] = 'Upload failed. Please try again.'
render :new
end
end
This is a simplified version of the job with fewer sheet columns for clarity:
class ExcelImportJob < ApplicationJob
#queue = :default
def perform(upload_id)
file = Upload.find(upload_id).file.file.file
data = parse_excel(file)
if header_matches? data
# Create a database entry for each row, ignoring the first header row
# using activerecord-import
sales = []
data.drop(1).each_with_index do |row, index|
sales << Sale.new(row)
if index % 2500 == 0
Sale.import sales
sales = []
end
end
Sale.import sales
end
def parse_excel(upload)
# Open the uploaded excel document
doc = Creek::Book.new upload
# Map rows to the hash keys from the database
doc.sheets.first.rows.map do |row|
{ date: row.values[0],
title: row.values[1],
author: row.values[2],
isbn: row.values[3],
release_date: row.values[5],
units_sold: row.values[6],
units_refunded: row.values[7],
net_units_sold: row.values[8],
payment_amount: row.values[9],
payment_amount_currency: row.values[10] }
end
end
# Returns true if header matches the expected format
def header_matches?(data)
data.first == {:date => 'Date',
:title => 'Title',
:author => 'Author',
:isbn => 'ISBN',
:release_date => 'Release Date',
:units_sold => 'Units Sold',
:units_refunded => 'Units Refunded',
:net_units_sold => 'Net Units Sold',
:payment_amount => 'Payment Amount',
:payment_amount_currency => 'Payment Amount Currency'}
end
end
end
I can probably have some improved logic anyway as right now I’m holding the whole file in memory, but that isn’t the issue I’m having – even with a small file that has only 500 or so rows, the job doesn’t add anything to the database.
Like I said my code worked fine when I wasn’t using a background job, and still works if I run it in the console. But for some reason the job is doing nothing.
This is my first time using Resque so I don’t know if I’m missing something obvious? I did create a worker and as I said it does seem to run the job. Here’s the output from Resque’s verbose formatter:
*** resque-1.27.4: Waiting for default
*** Checking default
*** Found job on default
*** resque-1.27.4: Processing default since 1508342426 [ExcelImportJob]
*** got: (Job{default} | ExcelImportJob | [15])
*** Running before_fork hooks with [(Job{default} | ExcelImportJob | [15])]
*** resque-1.27.4: Forked 63706 at 1508342426
*** Running after_fork hooks with [(Job{default} | ExcelImportJob | [15])]
*** done: (Job{default} | ExcelImportJob | [15])
In the Resque dashboard the jobs aren’t logged as failed. They get executed and I can see an increment in the ‘processed’ jobs on the stats page. But as I say the DB remains untouched. What’s going on? How can I debug the job more clearly? Is there a way to get into it with Pry?
It looks like my problem was with Resque.enqueue(ExcelImportJob, #upload.id).
I changed my code to ExcelImportJob.perform_later(#upload.id) and now my code actually runs!
I also added a resque.rake task to lib/tasks as described here: http://bica.co/2015/01/20/active-job-resque/.
That link also notes how to use rails runner to call the job without running the full Rails server and triggering the job, which is useful for debugging.
Strangely, I didn't quite manage to get the job to print anything to STDOUT as suggested by #hoffm but at least it led me down a good avenue of inquiry.
I still don't fully understand the difference between why calling Resqueue.enqueue still added my jobs to the queue and indeed seemed to run them, but the code wasn't executed, so if someone has a better grasp and an explanation, that would be much appreciated.
TL;DR: calling perform_later rather than Resque.enqueue fixed the problem but I don't know why.
I found these logs here that I'm trying to duplicate in my code, specifically the start time and done times:
PID-- ThreadID----- LLvl YourKlass- JobID--------
TID-oveahmcxw INFO: HardWorker JID-oveaivtrg start
TID-oveajt7ro INFO: HardWorker JID-oveaish94 start
TID-oveahmcxw INFO: HardWorker JID-oveaivtrg done: 10.003 sec
TID-oveajt7ro INFO: HardWorker JID-oveaish94 done: 10.002 sec
Does anyone know how to access these values? I know there is a jid that gets popluated (which I'm using), but I can't for the life of me find any documentation on where start and done come from.
You can add logging by using a middleware:
class JobMiddleware
def call(worker, msg, queue)
Rails.logger.info "Job #{worker.jid} stated at #{Time.now}"
yield
Rails.logger.info "Job #{worker.jid} ended at #{Time.now}"
end
end
Add this to config/initializers/sidekiq.rb and restart your sidekiq workers.
Click here for more info about sidekiq middlewares
I use a Sidekiq queue to process communications with an unreliable, 3rd party API. Since this API is often down for a couple minutes at a time and then back up again, Sidekiq has been handy. When a connection issue happens, an error is raised and Sidekiq throws the job back in the queue to be retried again later, after some time has passed.
I use NewRelic to not only help debug crashes, but also for monitoring. My problem is that this current methodology above creates errors in NewRelic. If the 3rd party API is down for more than a couple of minutes, the error count accumulates enough to cause notifications to send out through NewRelic.
What I'd like to do is only raise an error from my worker when a certain number of retries have occurred for a job. I'm using sidekiq_retries_exhausted to do this. My problem is that I'm not quite sure how to put jobs back in the queue after they have an error without raising an error.
Does Sidekiq provide any facilities to return a job to a queue, increment the number of retries for the job, and have it sit there until it's due to run again, as if an exception was raised in the worker class?
You raise a specific error and tell the error service to ignore errors of that type. For NewRelic:
https://docs.newrelic.com/docs/agents/ruby-agent/installation-configuration/ruby-agent-configuration#error_collector.ignore_errors
Here is what I did to keep intentional retry errors out of AirBrake:
class TaskWorker
include Sidekiq::Worker
class RetryNotAnError < RuntimeError
end
def perform task_id
task = Task.find(task_id)
task.do_cool_stuff
if task.finished?
#log.debug "Task #{task_id} was successful."
return false
else
#log.debug "Task #{task_id} will try again later."
raise RetryNotAnError, task_id
end
end
end
Tell Airbrake to ignore it:
Airbrake.configure do |config|
config.ignore << 'RetryNotAnError'
end
It's good to make your exception name OBVIOUSLY not an error (e.g. RetryLaterNotAnError), as it will still show up in logs and such, and you don't want to freak people out when they see a bunch of them.
ps. That said, I would really like to see Sidekiq to provide an explicit, errorless retry mechanism.
If using Sidekiq Enterprise, one other option might be to utilize the optional set of additional error types that will then get treated as Sidekiq::Limiter::OverLimit violations.
For my purposes, I've used a new error class and then added it to the list in the config. Here are the notes from the sidekiq-ent code (not in the public sidekiq repo) on how to modify your config file:
# An optional set of additional error types which would be
# treated as a rate limit violation, so the job would automatically
# be rescheduled as with Sidekiq::Limiter::OverLimit.
#
# Sidekiq::Limiter.errors << MyApp::TooMuch
# Sidekiq::Limiter.errors = [Foo::Error, MyApp::Limited]
Inside the specific job you can specify the max_retries, or it will default to 20:
sidekiq_options max_limiter_retries: 10
Inside the job, I'll rescue the "expected" intermittent error that I'd rather not ignore completely and then raise the error I've added to the list, something like this:
rescue RestClient::RequestTimeout => e
raise SidekiqSoftRetry.new(e.inspect)
end
Here's what that looks like in my initialization file-- and Mike Perham was kind enough to respond with the option to update the global retry limit.
class SidekiqSoftRetry < RuntimeError
end
Sidekiq::Limiter::DEFAULT_OPTIONS[:reschedule] = 10
Sidekiq::Limiter.configure do |config|
config.errors.concat(
[
SidekiqSoftRetry,
]
)
end
I couldn't find anywhere a solution on how to make queue_classic write logs to a file. Scrolls, which Queue_Classic uses for logging, doesn't seem to have any example either.
Could someone provide a working example?
The logging within the method called by QC will be the source for the logging. For example, in rails. Any call to Rails.logger will go to the log file appropriate to your RAILS_ENV. The log data coming from scrolls goes to stdout, so you could pipe STDOUT from your queues to a log file when you start them.
You could control your queues with god.rb giving a god.rb configuration instance similar to this (I've left your configuration for number of queues, directories, etc up to you):
number_queues.times do |queue_num|
God.watch do |w|
w.name = "QC-#{queue_num}"
w.group = "QC"
w.interval = 5.minutes
w.start = "bundle exec rake queue:work" # This is your rake task to start QC listening
w.gid = 'nginx'
w.uid = 'nginx'
w.dir = rails_root
w.keepalive
w.env = {"RAILS_ENV" => rails_env}
w.log = "#{log_dir}/qc.stdout.log" # Or.... "#{log_dir}//qc-#{queue_num}.stdout.log"
# determine the state on startup
w.transition(:init, { true => :up, false => :start }) do |on|
on.condition(:process_running) do |c|
c.running = true
end
end
end
end
FWIW, I find the STDOUT log data to be less than helpful and may end up just sending it to the bitbucket.
If the log data is useless, we should consider removing it.
DEBUG is very nice when I'm running it from a commandline. For times when I am trying to figure out what I've done to bork everything up (usually bundling issues, path issues, etc). Or times when I am going to demo what is going on in a queue.
For me, INFO logging consists of the standard lib=queue_classic level=info action=insert_job elapsed=16 message and any STDOUT/STDERR from forked executions or from PostgreSQL. I've not extended any of the logging classes since scrolls goes to STDOUT and my tasks are in an environment that provides logging.
Sure, it could be removed. I think that REALLY depends on the environment and what the queue is doing. If I were to be doing something that did not have Rails.logger then I would use the QC.log and scrolls more effectively and instrument my tasks in that manner.
As I play with it, I may keep my configuration as-is just because of the output coming from methods/applications called by the tasks themselves. I might decide to override QC.log code to add date/time. I am still working to determine what fits my needs.
Sorry, my last line was really focused on the environment example I had given.
original here: enter link description here
Situation:
In a typical cluster setup, I have a 5 instances of mongrel running behind Apache 2.
In one of my initializer files, I schedule a cron task using Rufus::Scheduler which basically sends out a couple of emails.
Problem:
The task runs 5 times, once for each mongrel instance and each recipient ends up getting 5 mails (despite the fact I store logs of each sent mail and check the log before sending). Is it possible that since all 5 instances run the task at exact same time, they end up reading the email logs before they are written?
I am looking for a solution that will make the tasks run only once. I also have a Starling daemon up and running which can be utilized.
The rooster rails plugin specifically addresses your issue. It uses rufus-scheduler and ensures the environment is loaded only once.
The way I am doing it right now:
Try to open a file in exclusive locked mode
When lock is acquired, check for messages in Starling
If message exists, other process has already scheduled the job
Set the message again to the queue and exit.
If message is not found, schedule the job, set the message and exit
Here is the code that does it:
starling = MemCache.new("#{Settings[:starling][:host]}:#{Settings[:starling][:port]}")
mutex_filename = "#{RAILS_ROOT}/config/file.lock"
scheduler = Rufus::Scheduler.start_new
# The filelock method, taken from Ruby Cookbook
# This will ensure unblocking of the files
def flock(file, mode)
success = file.flock(mode)
if success
begin
yield file
ensure
file.flock(File::LOCK_UN)
end
end
return success
end
# open_lock method, taken from Ruby Cookbook
# This will create and hold the locks
def open_lock(filename, openmode = "r", lockmode = nil)
if openmode == 'r' || openmode == 'rb'
lockmode ||= File::LOCK_SH
else
lockmode ||= File::LOCK_EX
end
value = nil
# Kernerl's open method, gives IO Object, in our case, a file
open(filename, openmode) do |f|
flock(f, lockmode) do
begin
value = yield f
ensure
f.flock(File::LOCK_UN) # Comment this line out on Windows.
end
end
return value
end
end
# The actual scheduler
open_lock(mutex_filename, 'r+') do |f|
puts f.read
digest_schedule_message = starling.get("digest_scheduler")
if digest_schedule_message
puts "Found digest message in Starling. Releasing lock. '#{Time.now}'"
puts "Message: #{digest_schedule_message.inspect}"
# Read the message and set it back, so that other processes can read it too
starling.set "digest_scheduler", digest_schedule_message
else
# Schedule job
puts "Scheduling digest emails now. '#{Time.now}'"
scheduler.cron("0 9 * * *") do
puts "Begin sending digests..."
WeeklyDigest.new.send_digest!
puts "Done sending digests."
end
# Add message in queue
puts "Done Scheduling. Sending the message to Starling. '#{Time.now}'"
starling.set "digest_scheduler", :date => Date.today
end
end
# Sleep will ensure all instances have gone thorugh their wait-acquire lock-schedule(or not) cycle
# This will ensure that on next reboot, starling won't have any stale messages
puts "Waiting to clear digest messages from Starling."
sleep(20)
puts "All digest messages cleared, proceeding with boot."
starling.get("digest_scheduler")
Why dont you use mod_passenger (phusion)? I moved from mongrel to phusion and it worked perfect (with a timeamount of < 5 minutes)!