Rails 5 + ActiveJob + Sidekiq: Stop and log error after 10 retries

I'm trying to write a job that, after 10 retries (for any exception type), reports a failure and dies. I can't get it to work. I tried two previously suggested answers, and neither worked.
The best solution would be to access retry_count from within the perform method.

I think what you're asking for is the sidekiq_retries_exhausted hook. It is called once your retries are used up, just before the job moves to the dead queue. Set retries to 10 and implement that hook.
config.death_handlers might also be of interest.
See docs here: https://github.com/mperham/sidekiq/wiki/Error-Handling#configuration

Related

Using Ruby Mongo with ActiveJob

I am using Ruby 2.7 with the Mongo 2.17 client. I currently use Sidekiq with ActiveJob to run millions of jobs, each performing a single transaction against AWS DocumentDB. The Mongo client documentation says it is a bad idea to instantiate a client per request; you should create one and reuse it.
Currently the job, which runs millions of times, instantiates a client and closes it at the end. Each Sidekiq process runs many job threads, and I run multiple Sidekiq processes:
jobs/my_job.rb
def perform(document)
  client = Mongo::Client.new(DOCUMENTDB_HOST, DOCUMENTDB_OPTIONS)
  # insert_one is a collection method, so go through a collection
  client[:my_collection].insert_one(document)
  client.close
end
From the documentation it states:
The default configuration for a Mongo::Client works for most applications:
client = Mongo::Client.new(["localhost:27017"])
Create this client once for each process, and reuse it for all operations. It is a common mistake to create a new client for each request, which is very inefficient and not what the client was designed for.
To support extremely high numbers of concurrent MongoDB operations within one process, increase max_pool_size:
client = Mongo::Client.new(["localhost:27017"], max_pool_size: 200)
Any number of threads are allowed to wait for connections to become available, and they can wait the default (1 second) or the wait_queue_timeout setting:
client = Mongo::Client.new(["localhost:27017"], wait_queue_timeout: 0.5)
When #close is called on a client by any thread, all connections are closed:
client.close
Note that when creating a client using the block syntax described above, the client is automatically closed after the block finishes executing.
My question is whether this statement also applies to isolated Sidekiq job executions and, if so, how I could reuse the Mongo client object across a Sidekiq process. I could imagine having a global @@client in the Sidekiq initializer:
config/initializers/sidekiq.rb:
@@client = Mongo::Client.new(DOCUMENTDB_HOST, DOCUMENTDB_OPTIONS)
and then:
jobs/my_job.rb:
def perform(document)
  @@client[:my_collection].insert_one(document)
end
Note:
No significant errors are raised; the whole system just freezes, and the following exception is thrown at random after the system has been running correctly for several minutes:
OpenSSL::SSL::SSLError: SSL_connect SYSCALL returned=5 errno=0 state=SSLv3/TLS write client hello (for 10.0.0.123:27017
UPDATE:
I tried reusing the client by creating a global variable holding the connection object in an initializer:
config/initializers/mongodb_client.rb
$mongo_client = Mongo::Client.new(DOCUMENTDB_HOST, DOCUMENTDB_OPTIONS)
and then using it inside my ActiveJob class. So far it seems to work well, but I am not aware of the side effects. I started many Sidekiq processes and am watching the logs closely for exceptions; so far all good.
jobs/my_job.rb
def perform(document)
  $mongo_client[:activity_log].insert_one(document)
end
It looks like Mongo::Client is thread-safe. Set :max_pool_size to at least your Sidekiq concurrency so that every job thread can use the client at the same time.
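A minimal sketch of the "one client per process" idea, for illustration only. SharedClient and its methods are hypothetical names, not part of the Mongo driver, and a plain Object stands in for the client so the snippet runs on its own. The comment shows how the question's Mongo::Client call might be plugged in; the pool size of 10 is an assumed Sidekiq concurrency.

```ruby
# Holds one lazily built client per process so every job thread reuses it.
# SharedClient is a hypothetical helper, not part of any gem.
module SharedClient
  LOCK = Mutex.new

  # Builds the client on first use; the block is the factory.
  def self.fetch
    LOCK.synchronize { @client ||= yield }
  end

  # Drops the memoized client, e.g. in tests.
  def self.reset!
    LOCK.synchronize { @client = nil }
  end
end

# With the Mongo driver the factory block might look like (assumption):
#   SharedClient.fetch do
#     Mongo::Client.new(DOCUMENTDB_HOST, DOCUMENTDB_OPTIONS.merge(max_pool_size: 10))
#   end
# Here a plain object stands in for the client.
builds = 0
clients = Array.new(8) do
  Thread.new { SharedClient.fetch { builds += 1; Object.new } }
end.map(&:value)

puts "built #{builds} client(s), shared by #{clients.size} threads"
```

The mutex only guards creation; after that, each thread relies on the client's own connection pool, which is why the pool size should not be smaller than the number of job threads.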

Rails Initializer: An infinite loop in a separate thread to update records in the background

I want to run an infinite loop on a separate thread that starts as soon as the app initializes (in an initializer). Here's what it might look like:
# in config/initializers/item_loop.rb
Thread.new do
  loop do
    Item.find_each do |item|
      # Get price from third-party API and update record.
      item.update_price!
      # Need to wait a little between requests to avoid getting throttled.
      sleep 5
    end
  end
end
I tend to accomplish this by running batch updates in recurring background jobs. But this doesn't make sense since I don't really need parallelization, downtime, or queueing, I just want to update one item at a time in a single thread, forever.
Yet there are multiple things that concern me:
Leaked Connections: Should I open up a new connection_pool for the thread? Should I use a gem like safely to avoid crashing the thread?
Thread Safety: Should I be worried about race conditions? Should I make use of Mutex and synchronize? Does using ActiveRecord::Base.transaction impact thread safety?
Deadlock: Should I use Rails.application.executor.wrap?
Concurrent Ruby/Sleep Intervals: Should I use TimerTask from concurrent-ruby gem instead of sleep or something other than Thread.new?
Information on any of these subjects is appreciated.
To run work in a background (non-web-server) process, you would usually use a background job manager. Rails provides a common interface to such managers called ActiveJob, and there are several backends: Sidekiq, DelayedJob, Resque, and others. Sidekiq is generally preferred. Coming back to the actual problem: you can schedule a price-update worker to run at a fixed interval using the sidekiq-scheduler gem. Another useful extension, for throttling Sidekiq workers, is sidekiq-throttler.
Some code snippets:
# app/workers/update_price_worker.rb
# Actual worker class
class UpdatePriceWorker
  include Sidekiq::Worker
  sidekiq_options throttle: { threshold: 720, period: 1.hour }

  def perform(item_id)
    Item.find(item_id).update_price!
  end
end

# app/workers/update_price_master_worker.rb
# Master worker that loops over items
class UpdatePriceMasterWorker
  include Sidekiq::Worker

  def perform
    Item.find_each { |item| UpdatePriceWorker.perform_async item.id }
  end
end

# config/sidekiq.yml
:schedule:
  update_price:
    cron: '0 */4 * * *' # Runs once every 4 hours; adjust to how many Items there are
    class: UpdatePriceMasterWorker
The idea of this setup: we run the master worker every 4 hours (the interval depends on how long it takes to update all items). The master worker enqueues one job per item to update its price. UpdatePriceWorker is throttled to at most 720 runs per hour, i.e. one every 5 seconds.
I use rails runner together with the god gem or k8s in a similar case.
rails runner runs in a separate process, so we do not have to worry about connection leaks or thread safety.
The god gem or k8s handles concurrency and monitors the job for failures. Running a single process with a fixed sleep time keeps requests within the third-party API's throttle limit (running N processes with N API keys could speed things up).
I think deadlocks can happen in any concurrent setup.
I do not think this loop + sleep approach is a design flaw, because:
cron always starts jobs on schedule, so long-running jobs can end up running simultaneously, and we would need extra logic to avoid overlapping jobs. A plain loop + sleep keeps maximum throughput without any overlap.
ActiveJob is good for one-shot, long-running tasks, but it is not a good fit for a daemon.
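The loop + sleep approach could be sketched roughly as below. This is only an illustration: run_update_loop and its parameters are hypothetical names, and plain arrays stand in for the Item records. Each item's error is rescued so that one failing update does not kill the loop, and a stop callable gives the process a way to exit cleanly.

```ruby
# Calls the block for each item in turn, pausing between calls, until `stop`
# returns true. Errors from one item are logged and skipped so the loop survives.
# Note: with an empty `items` and a `stop` that never fires, this would spin.
def run_update_loop(items, pause:, stop:, log: $stderr)
  until stop.call
    items.each do |item|
      break if stop.call

      begin
        yield item
      rescue StandardError => e
        log.puts "update failed for #{item.inspect}: #{e.class}: #{e.message}"
      end
      sleep pause
    end
  end
end

# In the Rails case this might be called as (assumption):
#   run_update_loop(Item.find_each, pause: 5, stop: -> { $shutdown }) { |item| item.update_price! }
processed = []
run_update_loop([1, 2, 3], pause: 0, stop: -> { processed.size >= 5 }) do |n|
  raise "price API returned nothing" if n == 2

  processed << n
end

p processed
```

Because only one item is in flight at a time, there is nothing to overlap, which is the property the answer above contrasts with cron-triggered jobs.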

Sidekiq - Only handle error after x retries?

I'm using sidekiq to process thousands of jobs per hour - all of which ping an external API (Google). One out of X thousand requests will return an unexpected (or empty) result. As far as I can tell, this is unavoidable when dealing with an external API.
Currently, when I encounter such a response, I raise an exception so that the retry logic automatically takes care of it on the next try. Something is only really wrong when the same job fails over and over again. Exceptions are handled by Airbrake.
However, my Airbrake gets clogged up with these mini-outages that aren't really 'issues'. I'd like Airbrake to be notified only if the same job has already failed X times.
Is it possible to either:
Disable the automated Airbrake integration so that I can use sidekiq_retries_exhausted to report the error manually via Airbrake.notify?
Rescue the error somehow so it doesn't notify Airbrake but still gets retried?
Do this in a different way that I'm not thinking of?
Here's my code outline
class GoogleApiWorker
  include Sidekiq::Worker
  sidekiq_options queue: :critical, backtrace: 5

  def perform
    # Do stuff interacting with the google API
  rescue Exception => e
    if is_a_mini_google_outage? e
      # How do i make it so this harmless error DOES NOT get reported to Airbrake but still gets retried?
      raise e
    end
  end

  def is_a_mini_google_outage? e
    # check to see if this is a harmless outage
  end
end
As far as I know, Sidekiq's API has classes for retries and jobs. You can find your current job by its arguments (comparing them may not be efficient) or by its jid (in which case you would need to record the jid somewhere), check its number of retries, and then decide whether to notify Airbrake.
https://github.com/mperham/sidekiq/wiki/API
https://github.com/mperham/sidekiq/blob/master/lib/sidekiq/api.rb
(I'm not able to give more detail than that.)
If you are looking for a Sidekiq solution, see https://blog.eq8.eu/til/retry-active-job-sidekiq-when-exception.html
If you are more interested in configuring Airbrake so that you don't get these errors until a certain retry count, check Airbrake::Sidekiq::RetryableJobsFilter:
https://github.com/airbrake/airbrake#airbrakesidekiqretryablejobsfilter
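For the "report manually" route, the threshold check itself might look like the sketch below. ThresholdNotifier is a hypothetical class, not part of Sidekiq or Airbrake. It assumes the handler receives a context hash whose :job entry is the job payload with a "retry_count" key; check how your Sidekiq version populates that key on the first failure. It does not by itself switch off Airbrake's automatic reporting.

```ruby
# Wraps a reporting callback so it only fires once a job has been retried enough.
# Hypothetical helper; the payload shape is an assumption (see the text above).
class ThresholdNotifier
  def initialize(min_retries:, &report)
    @min_retries = min_retries
    @report = report
  end

  # Shaped like an error handler: (exception, context hash, optional extra arg).
  # Returns true when the failure was reported, false when it was skipped.
  def call(exception, context, _config = nil)
    job = context[:job] || {}
    return false if job["retry_count"].to_i < @min_retries

    @report.call(exception, job)
    true
  end
end

reported = []
notifier = ThresholdNotifier.new(min_retries: 3) do |error, job|
  # In a real app this is where something like Airbrake.notify would go (assumption).
  reported << [error.message, job["jid"]]
end

notifier.call(RuntimeError.new("empty response"), { job: { "jid" => "a1", "retry_count" => 1 } })
notifier.call(RuntimeError.new("empty response"), { job: { "jid" => "a1", "retry_count" => 3 } })

p reported # => [["empty response", "a1"]]
```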

How to set a timeout for jobs in Sidekiq

I ran into an issue with Sidekiq: I want to set a timeout for jobs, meaning that when a job's processing time exceeds the timeout, the job stops.
I have looked into setting a global timeout in sidekiq.yml, but I want a separate timeout for different jobs, so that each worker class has its own timeout setting.
Can you help me? Thanks so much.
There's no approved way to do this. You cannot stop a thread safely while it is executing. You need to change your job to check periodically if it should stop.
You can set network timeouts on any 3rd party calls you are making so that they time out.
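A job that checks periodically whether it should stop could be sketched like this. Deadline, DeadlineExceeded and process_all are hypothetical names chosen for the example. The job decides where it is safe to stop by calling check! between units of work, and each worker class can pass its own budget.

```ruby
# Raised by the job itself when it notices its time budget is spent.
class DeadlineExceeded < StandardError; end

# Tracks a per-job time budget using the monotonic clock.
class Deadline
  def initialize(seconds)
    @expires_at = now + seconds
  end

  def expired?
    now >= @expires_at
  end

  # Call between units of work; raises at a point the job has chosen as safe.
  def check!
    raise DeadlineExceeded, "time budget exhausted" if expired?
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

# Hypothetical job body: handles items one at a time, checking the deadline
# before each one so it never stops in the middle of an item.
def process_all(items, budget_seconds)
  deadline = Deadline.new(budget_seconds)
  items.map do |item|
    deadline.check!
    yield item
  end
end

p process_all([1, 2, 3], 60) { |n| n * 2 } # => [2, 4, 6]
```

Unlike interrupting the thread from outside, the exception here can only be raised at the check! call, so an item is never abandoned halfway through.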
You can wrap your job code inside a timeout block like the one below:
Timeout::timeout(2.hours) do
  # possibly long-running task
end
The job will fail automatically after 2 hours.
This is the same method as yassen suggested, but more concrete.
class MyCustomWorker
  include Sidekiq::Worker

  def perform
    begin
      Timeout::timeout(30.minutes) do # set timeout to 30 minutes
        perform_job()
      end
    rescue Timeout::Error
      Rails.logger.error "timeout reached for worker"
    end
  end

  def perform_job
    # worker logic here
  end
end

create recurring activejob fails

I'm trying to create an ActiveJob in rails 4.2 that runs at a regular rate. The job is being called the first time, but it does not start again. My code is throwing the exception below after trying to call perform_later.
log output
[ActiveJob] Enqueued ProcessInboxJob (Job ID: 76a63689-e330-47a1-af92-8e4838b508ae) to Inline(default)
[ActiveJob] [ProcessInboxJob] [76a63689-e330-47a1-af92-8e4838b508ae] Performing ProcessInboxJob from Inline(default)
ProcessInboxJob running...
[ActiveJob] [ProcessInboxJob] [76a63689-e330-47a1-af92-8e4838b508ae] [AWS S3 200 0.358441 0 retries] list_objects(:bucket_name=>"...",:max_keys=>1000)
[ActiveJob] [ProcessInboxJob] [76a63689-e330-47a1-af92-8e4838b508ae] Enqueued ProcessInboxJob (Job ID: dfd3dd7a-06ab-4dba-9bbf-ce1ad606f7e5) to Inline(default) with arguments: {:wait=>30 seconds}
[ActiveJob] [ProcessInboxJob] [76a63689-e330-47a1-af92-8e4838b508ae] Performed ProcessInboxJob from Inline(default) in 599.72ms
Exiting
/Users/antarrbyrd/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/activejob-4.2.0/lib/active_job/arguments.rb:60:in `serialize_argument': Unsupported argument type: ActiveSupport::Duration (ActiveJob::SerializationError)
process_inbox_job.rb
class ProcessInboxJob < ActiveJob::Base
  queue_as :default
  #FREQUENCY = 3.minutes

  def perform()
    # do some work
  end

  # reschedule job
  after_perform do |job|
    self.class.perform_later(wait: 30.seconds)
  end
end
The syntax is self.class.set(wait: 30.seconds).perform_later. But that's not a reliable way of doing it: if an exception occurs, the chain breaks. You also need to have the initial job scheduled.
If you use Resque, you can use https://rubygems.org/gems/activejob-scheduler
As @bcd said, you have to use self.class.set(wait: 30.seconds).perform_later with a queue adapter that supports queuing, i.e. not the default (inline) adapter.
I'm posting to give a different point of view on the question of rescheduling, which may help future readers.
after_perform will not be called if an exception is raised, but that does not make it a bad place to reschedule a job. If your job can raise an exception, it is better to rescue it (with the class method rescue_from) and send yourself a notification if your backend doesn't do so already.
You can then try to fix the problem (either in the data or in your code) and retry (if you can) or enqueue a similar job again.
For the scheduling part, activejob-scheduler is great and does not work only with Resque, but it has some downsides.
It uses rufus-scheduler, which keeps delays in memory, so whenever your server restarts you lose all scheduling information. That can be a real problem for some tasks (I schedule tasks 1 month in the future and update my app every week, which means a restart each time).
You also lose all the advantages of using an actual queuing backend, such as beanstalk with backburner.
activejob-scheduler also claims to perform the job at exactly the right time, which is false. The ActiveJob adapter runs at the specified time, but depending on your setup it may take some time before the job is actually performed, e.g. when you run your jobs on another server.
Lastly, for the initial scheduling you can include code that checks whether the job exists when the worker starts, and schedules it if needed.
To sum up: activejob-scheduler is great, but you lose some of ActiveJob's features and it does not do everything.
Depending on which queuing system you are using, you can try https://github.com/codez/delayed_cron_job or https://github.com/ondrejbartas/sidekiq-cron. With delayed_cron_job you can use a UI like rails_admin to edit the cron expression. sidekiq-cron gives you a Sinatra web UI where you can manually kick off a job or pause it.
