I am using Rufus Scheduler to trigger background jobs that need to run every hour.
scheduler = Rufus::Scheduler.singleton
scheduler.every '1h' do
  JobName.perform_now
end
My infrastructure is set up in AWS, and in production I have 2 instances running the app inside ECS.
As a result, the scheduler schedules each job twice, once per instance.
Instance A schedules the job at 00:00:05:01 and Instance B schedules it at 00:00:05:05.
The jobs are not failing. I am using ActiveJob. I looked into other solutions such as Delayed Job, but that has the same problem when there are multiple instances.
Can you guys provide an alternate approach to fix this issue? Or a workaround for the same?
https://github.com/jmettraux/rufus-scheduler#lockfile--mylockfiletxt
"This is useful in environments where the Ruby process holding the scheduler gets started multiple times."
Try this:
scheduler = Rufus::Scheduler.singleton(:lockfile => ".rufus-scheduler.lock")
scheduler.every '1h' do
  JobName.perform_now
end
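For intuition about what a lockfile buys you, here is a minimal stdlib-only sketch of the idea: take an exclusive, non-blocking flock on a file and only start scheduling if you get it. The helper name try_scheduler_lock and the lockfile path are made up for illustration, and this is not presented as rufus-scheduler's own code. Like any local file lock, it only coordinates processes that see the same file system.

```ruby
require 'tmpdir'

# Hypothetical helper: returns the open File if this caller won the lock,
# or nil if another holder already has it. The File must stay open (and
# referenced) for as long as the lock should be held.
def try_scheduler_lock(path)
  file = File.open(path, File::RDWR | File::CREAT, 0o644)
  if file.flock(File::LOCK_EX | File::LOCK_NB)
    file
  else
    file.close
    nil
  end
end

lock_path = File.join(Dir.tmpdir, "scheduler-sketch-#{Process.pid}.lock")

first  = try_scheduler_lock(lock_path)  # first open of the file: gets the lock
second = try_scheduler_lock(lock_path)  # separate open of the same file: refused

puts "first holder:  #{first ? 'schedules jobs' : 'stays idle'}"
puts "second holder: #{second ? 'schedules jobs' : 'stays idle'}"
```

The second caller stays idle instead of raising, which is the behavior you want from the "extra" process.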
You need a distributed lock, since ECS instances don't share files. The most common backends for one are Zookeeper, Consul and Redis.
Below is an example with Zookeeper, from the docs:
class ZookeptScheduler < Rufus::Scheduler
  def initialize(zookeeper, opts={})
    @zk = zookeeper
    super(opts)
  end
  def lock
    @zk_locker = @zk.exclusive_locker('scheduler')
    @zk_locker.lock # returns true if the lock was acquired, false else
  end
  def unlock
    @zk_locker.unlock
  end
  def confirm_lock
    return false if down?
    @zk_locker.assert!
  rescue ZK::Exceptions::LockAssertionFailedError => e
    # we've lost the lock, shutdown (and return false to at least prevent
    # this job from triggering)
    shutdown
    false
  end
end
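If Zookeeper is more than you want to run, the same "only one instance wins" idea can be sketched with an atomic set-if-absent, which is what Redis offers through SET with NX. The sketch below is an illustration under assumptions, not part of rufus-scheduler: HourlySlotLock and InMemoryStore are hypothetical names, and InMemoryStore only imitates the set(key, value, nx:, ex:) call shape of a Redis client so the example runs without a server.

```ruby
# Hypothetical stand-in for a Redis client, so the sketch runs without a server.
# It imitates only the `set(key, value, nx:, ex:)` call shape; `ex` (expiry in
# seconds) is accepted but ignored here.
class InMemoryStore
  def initialize
    @data  = {}
    @mutex = Mutex.new
  end

  # With nx: true, writes only if the key is absent. Returns true if written.
  def set(key, value, nx: false, ex: nil)
    @mutex.synchronize do
      next false if nx && @data.key?(key)
      @data[key] = value
      true
    end
  end
end

# Hypothetical helper: each instance tries to claim the current hour slot
# for a job; only the first claimant gets true and should run the job.
class HourlySlotLock
  def initialize(store, job_name, owner)
    @store    = store
    @job_name = job_name
    @owner    = owner
  end

  def claim(now = Time.now)
    slot = now.utc.strftime('%Y%m%d%H')
    @store.set("scheduler:#{@job_name}:#{slot}", @owner, nx: true, ex: 2 * 3600)
  end
end

store      = InMemoryStore.new  # shared by both "instances" in this sketch
instance_a = HourlySlotLock.new(store, 'JobName', 'instance-a')
instance_b = HourlySlotLock.new(store, 'JobName', 'instance-b')

tick = Time.utc(2024, 1, 1, 0, 5, 1)
puts instance_a.claim(tick)         # first claimant for this hour slot
puts instance_b.claim(tick + 4)     # same slot, four seconds later
puts instance_b.claim(tick + 3600)  # one hour later is a new slot
```

In the scheduler block, the job would then run only when claim returns true. Note the limitation of slotting by clock hour: two instances whose timers straddle an hour boundary land in different slots, so this works best with a schedule aligned to the hour.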
You could maybe use EFS to share a lockfile, but this is not the correct way.
Start only the scheduler on instance A.
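One low-tech way to do that is to gate the scheduling code behind an environment variable that is set on a single instance only. RUN_SCHEDULER and scheduler_enabled? are hypothetical names chosen for this sketch, and how you set the variable on exactly one ECS task is left to your deployment setup.

```ruby
# Hypothetical gate: only an instance started with RUN_SCHEDULER set to
# 1, true or yes should register recurring jobs. `env` defaults to the
# process environment and can be any object with a Hash-like `fetch`.
def scheduler_enabled?(env = ENV)
  %w[1 true yes].include?(env.fetch('RUN_SCHEDULER', '').strip.downcase)
end

# In an initializer this would wrap the scheduling code, e.g.:
#   if scheduler_enabled?
#     Rufus::Scheduler.singleton.every('1h') { JobName.perform_now }
#   end
puts scheduler_enabled?({ 'RUN_SCHEDULER' => 'true' })  # instance A
puts scheduler_enabled?({})                             # instance B
```

The trade-off is that the scheduling instance becomes a single point of failure: if it goes down, nothing is scheduled until it comes back.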
Related
I have a newsletter that I send out to my customers (~10k emails) every morning, and sometimes this Sidekiq job takes so much CPU/memory that the website (a Rails app) stops responding and goes down.
When I look at the Sidekiq dashboard, I see that the newsletter job is stuck because of some problem (probably an invalid email address, with Sidekiq repeatedly retrying the send?).
How do I prevent this behavior and stop Sidekiq from repeating the task (which I believe is what causes the outage)?
Here's my code:
rake task:
namespace :mailer do
  desc "Carrier blast - morning"
  task :newsletter_morning => [:environment] do
    NewslettertJob.perform_later
  end
end
job definition:
class NewslettertJob < ApplicationJob
  def perform
    ...
    NewsletterMailer.morning_blast(data).deliver_now
  end
end
and NewsletterMailer:
class NewsletterMailer < ApplicationMailer
  def morning_blast(data)
    ...
    customers.each do |customer|
      yield customer, nil; next if customer.email.blank?
      begin
        Retryable.retryable( tries: 1, sleep: 30, on: [Net::OpenTimeout, Net::SMTPAuthenticationError, Net::SMTPServerBusy]) do
          send_email(customer.email).deliver
        end
        send_email(customer.email).deliver
      rescue Net::SMTPSyntaxError => e
        error_msg = "Newsletter sending failed on #{Time.now} with: #{e.message}. e.inspect: #{e.inspect}"
        logger.warn error_msg
        yield customer, nil
        next
      end
    end
  end
end
What I want to achieve is that the newsletter will be sent out every morning and if Rails/Sidekiq faces a problem, it will simply shut itself down, so the newsletter will not affect the "life" on the main website (its server).
Thank you in advance for any advice. I have been stuck on this issue for a while now.
If your machine only has one core, Sidekiq and puma will fight for CPU. Lower Sidekiq's concurrency so it uses less CPU, or get a machine with multiple cores, or move Sidekiq to a different machine.
If a Sidekiq process is using 100% of a core, lower the concurrency setting. The default in Sidekiq 6.0 is 10, which is a good default but if you are just delivering emails you could probably bump that to 20. You can run multiple Sidekiq processes if you wish to utilize multiple cores to process jobs faster.
I think ideally, you should separate your background task servers from your web servers, so that background processes won't affect the performance of the web server. I work for a very high-traffic, high-load company, and we use an architecture like that.
There are explanations on how to stop retries in this answer: Disable automatic retry with ActiveJob, used with Sidekiq
Another thing: your e-mail sending is done synchronously (.deliver). This makes your task one huge monolithic process covering many customers, with a large impact on memory. Instead, you could use deliver_later, so each customer gets its own little worker. This will also help alleviate CPU and memory usage. You could even create a worker that sends the e-mail for a single customer, and use your monolithic job merely to dispatch those.
class NewslettertJob < ApplicationJob
  def perform
    ...
    customers.each do |customer|
      NewsletterMailer.morning_blast(customer, data).deliver_later if customer.email.present?
    end
  end
end
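To make the dispatch pattern concrete without Rails, here is a plain-Ruby sketch of the fan-out: the parent job only walks the customer list and hands each customer with a usable address to an enqueue callable, which in the real app would be the deliver_later call. Customer, dispatch_newsletter and the enqueue lambda are hypothetical names for illustration.

```ruby
Customer = Struct.new(:id, :email)

# Hypothetical dispatcher: enqueues one unit of work per customer with a
# non-blank email and returns how many were enqueued. It sends nothing itself.
def dispatch_newsletter(customers, enqueue)
  customers.count do |customer|
    email = customer.email.to_s.strip
    next false if email.empty?
    enqueue.call(customer)
    true
  end
end

queue = []  # stands in for the background job queue
customers = [
  Customer.new(1, 'a@example.com'),
  Customer.new(2, nil),
  Customer.new(3, ''),
  Customer.new(4, 'b@example.com')
]

enqueued = dispatch_newsletter(customers, ->(c) { queue << c.id })
puts "enqueued #{enqueued} of #{customers.size} customers: #{queue.inspect}"
```

Because each customer becomes its own small job, one failing address is retried (or discarded) on its own instead of restarting the whole blast.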
However, I think the silver bullet is separating your sidekiq server from your web server - having one server dedicated to background tasks. On your web server, you don't even start the sidekiq instances.
I have a Rails app that uses Rufus Scheduler combined with Delayed Job to execute background jobs. There are other jobs, but the one I'm having trouble with is scheduled in a controller using this code:
def create
  @harvest_plan = HarvestPlan.new(resource_params)
  @harvest_plan.start_date = Time.parse(resource_params[:start_date])
  if @harvest_plan.save
    ApplicationController.new.insert_in_messages_list(session, :success, 'Harvest plan created')
    schedule_harvest
    redirect_to farms_path
  end
end
private
def schedule_harvest
  Rufus::Scheduler.singleton.every "#{@harvest_plan.hours_between}h",
    :times => @harvest_plan.repetitions, :first_at => @harvest_plan.start_date do
    CreateHarvestFromPlanJob.perform_later
  end
end
The job is supposed to be scheduled according to the harvest plan model, which indicates how many hours must pass between jobs, when the first one should be scheduled, and how many repetitions must occur. Everything works perfectly except for the first job: it does happen at the time specified with first_at, but it is scheduled twice for some reason, and Delayed Job then executes it twice. I tried using the mutex, blocking and overlap options, but nothing changed. After the first job (scheduled twice) everything works fine: the next jobs are scheduled on time and just once. I have just one Delayed Job worker.
Why is this happening?
I am running Rails 4.2.4, Ruby 2.2.2 and Rufus 3.3.2. Since the error happens with both Passenger and WEBrick, I assume the server has nothing to do with the problem.
Why is Rufus scheduling the first job twice?
because of a bug you found: https://github.com/jmettraux/rufus-scheduler/issues/231
Thanks a lot!
We have an issue similar to this one: Sidekiq schedule same worker to queue when done
The crux of the issue is that if more than one argument is passed to perform_in, it does not schedule the job for later, but processes it as usual.
This works as expected, but is not useful because there are no arguments being passed into the new job
def perform(id, mode)
  begin
    Some::Process.remediate(App.find(id), mode)
  rescue CustomErrors::MyError => e
    # This will schedule a job, but without arguments :(
    self.class.perform_in(2.hours)
  end
end
This would be useful, but does not work as expected
The job completes and nothing is rescheduled
def perform(id, mode)
  begin
    Some::Process.remediate(App.find(id), mode)
  rescue CustomErrors::MyError => e
    # This does not schedule a job
    self.class.perform_in(2.hours, id, mode)
  end
end
Used:
Sidekiq 3.5.1
Rails 4.2.4
ruby 2.3.1p112
dev environment
Any help is highly appreciated
UPDATE
When the code is run with a binding
binding.pry
self.class.perform_in(2.hours, 2, 12345)
It does schedule a job, but only on the first time. When it comes to it the second time around the job is being run for some reason.
I have a sidekiq worker that shouldn't take more than 30 seconds, but after a few days I'll find that the entire worker queue stops executing because all of the workers are locked up.
Here is my worker:
class MyWorker
  include Sidekiq::Worker
  include Sidekiq::Status::Worker
  sidekiq_options queue: :my_queue, retry: 5, timeout: 4.minutes
  sidekiq_retry_in do |count|
    5
  end
  sidekiq_retries_exhausted do |msg|
    store({message: "Gave up."})
  end
  def perform(id)
    begin
      Timeout::timeout(3.minutes) do
        got_lock = with_semaphore("lock_#{id}") do
          # DO WORK
        end
      end
    rescue ActiveRecord::RecordNotFound => e
      # Handle
    rescue Timeout::Error => e
      # Handle
      raise e
    end
  end
  def with_semaphore(name, &block)
    Semaphore.get(name, {stale_client_timeout: 1.minute}).lock(1, &block)
  end
end
And the semaphore class we use. (redis-semaphore gem)
class Semaphore
  def self.get(name, options = {})
    Redis::Semaphore.new(name.to_sym,
      :redis => Application.redis,
      stale_client_timeout: options[:stale_client_timeout] || 1.hour,
    )
  end
end
Basically I'll stop the worker and it will state done: 10000 seconds, which the worker should NEVER be running for.
Anyone have any ideas on how to fix this or what is causing it? The workers are running on EngineYard.
Edit: One additional comment. The # DO WORK has a chance to fire off a PostgreSQL function. I have noticed in logs some mention of PG::TRDeadlockDetected: ERROR: deadlock detected. Would this cause the worker to never complete even with a timeout set?
Given that you want to ensure unique job execution, I would try removing all locks and delegating job uniqueness control to a plugin like Sidekiq Unique Jobs.
In that case, even if sidetiq enqueues the same job id twice, the plugin ensures it will be enqueued/processed a single time.
You might also try the ActiveRecord with_lock mechanism: http://api.rubyonrails.org/classes/ActiveRecord/Locking/Pessimistic.html
I have had a similar problem before. To solve this problem, you should stop using Timeout.
As explained in this article, you should never use Timeout in a Sidekiq job. If you use Timeout, Sidekiq processes and threads can easily break.
Not only Ruby, but also Java has a similar problem. Stopping a thread from the outside is inherently dangerous, regardless of the language.
If you continue to have the same problem after removing Timeout, check whether you are using threads carelessly in your code.
As Sidekiq's architecture is so sophisticated, in almost all cases, the source of the bug is outside of Sidekiq.
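One alternative to Timeout is a cooperative deadline: the job itself checks a monotonic clock between units of work and stops at a safe point, so no thread is interrupted from outside. The sketch below assumes the work can be split into small steps; Deadline and process_in_steps are hypothetical names. It does not bound a single step that blocks, so blocking calls still need their own timeouts from whatever client library makes them.

```ruby
# Hypothetical cooperative deadline based on the monotonic clock.
class Deadline
  def initialize(seconds)
    @expires_at = now + seconds
  end

  def expired?
    now >= @expires_at
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

# Processes items until done or until the deadline passes, checking only
# between items. Returns the items that were left unprocessed.
def process_in_steps(items, deadline)
  remaining = items.dup
  until remaining.empty? || deadline.expired?
    yield remaining.shift
  end
  remaining
end

done = []
leftover = process_in_steps([1, 2, 3], Deadline.new(60)) { |i| done << i }
puts "done=#{done.inspect} leftover=#{leftover.inspect}"

skipped = process_in_steps([4, 5, 6], Deadline.new(0)) { |i| done << i }
puts "with an already expired deadline, leftover=#{skipped.inspect}"
```

The leftover items can be re-enqueued as a new job, which keeps every run short without ever killing a thread mid-operation.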
I need to add a job to the Sidekiq queue when my Rails app starts, to update some data, but I don't know where is the best place to do it.
Right now, I've written this in my application.rb:
class Application < Rails::Application
  config.after_initialize do
    MyWorker.perform_async
  end
end
But the problem is that when I run the sidekiq command it will also load the Rails stack, so I'll end up with 2 jobs on the queue.
Is there any other way of doing that? This is my first big Rails app and my first time with Sidekiq, so I don't know if I'm not understanding things correctly. That might not be the right way of doing that.
Thanks!
A better solution would be to create an initializer, config/initializers/sidekiq.rb:
Sidekiq.configure_client do |config|
  Rails.application.config.after_initialize do
    # Your code goes here
  end
end
We were having issues w/ Redis connections and multiple jobs being launched.
I ended up using this and it seems to be working well:
if defined?(Sidekiq)
  Sidekiq.configure_server do |config|
    config.on(:startup) do
      already_scheduled = Sidekiq::ScheduledSet.new.any? {|job| job.klass == "MyWorker" }
      MyWorker.perform_async unless already_scheduled
    end
  end
end
Foreman would probably suit your purposes.
I know this is old, but none of this worked for me: the job would still start several times. I came up with the following solution.
I have a class to do the actual job:
class InitScheduling
  include Sidekiq::Worker
  def perform
    # your code here
  end
end
And I have an initializer, which would normally run 3 times, once every time something loads the Rails environment. So I use the job itself as a state variable indicating that it is already scheduled:
# code in initializers/your_initializer.rb
Rails.application.config.after_initialize do
  all_jobs = Sidekiq::ScheduledSet.new
  # If this is true, InitScheduling is already scheduled - don't start it again
  unless all_jobs.map(&:klass).include?("InitScheduling")
    puts "########### InitScheduling ##############"
    # give Rails time to boot before running this job & keep this initialization from re-running
    InitScheduling.perform_in(5.minutes)
    # your code here
  end
end
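One detail worth keeping in mind with this kind of check: the scheduled set only holds jobs waiting for a future time (those added with perform_in or perform_at), while jobs added with perform_async go straight to a queue. A more thorough check therefore looks in both places. The sketch below shows only the lookup logic in plain Ruby; JobEntry and already_pending? are hypothetical names, and the arrays stand in for the collections you would get from Sidekiq's API.

```ruby
# Hypothetical stand-in for the entries yielded by Sidekiq's API collections;
# the only thing this sketch relies on is that an entry responds to `klass`.
JobEntry = Struct.new(:klass)

# True if any pending entry, in any of the given collections, belongs to
# the named worker class.
def already_pending?(worker_name, *collections)
  collections.any? do |entries|
    entries.any? { |job| job.klass == worker_name }
  end
end

scheduled = [JobEntry.new('OtherWorker')]     # e.g. jobs added with perform_in
queued    = [JobEntry.new('InitScheduling')]  # e.g. jobs added with perform_async

puts already_pending?('InitScheduling', scheduled)
puts already_pending?('InitScheduling', scheduled, queued)
```

Even with both collections checked, this is a check-then-act sequence and not atomic, so two processes booting at the same instant can both pass it.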