We have an issue similar to this one: Sidekiq schedule same worker to queue when done
The crux of the issue is that if more than one argument is passed to perform_in, the job is not scheduled for later but is processed as usual.
This works as expected, but is not useful because no arguments are passed into the new job:
def perform(id, mode)
  begin
    Some::Process.remediate(App.find(id), mode)
  rescue CustomErrors::MyError => e
    # This will schedule a job, but without arguments :(
    self.class.perform_in(2.hours)
  end
end
This would be useful, but does not work as expected. The job completes and nothing is rescheduled:
def perform(id, mode)
  begin
    Some::Process.remediate(App.find(id), mode)
  rescue CustomErrors::MyError => e
    # This does not schedule a job
    self.class.perform_in(2.hours, id, mode)
  end
end
Used:
Sidekiq 3.5.1
Rails 4.2.4
ruby 2.3.1p112
dev environment
Any help is highly appreciated
UPDATE
When the code is run with a binding:
binding.pry
self.class.perform_in(2.hours, 2, 12345)
It does schedule a job, but only the first time. The second time it gets there, the job is run instead of being scheduled, for some reason.
Related
I have an API that uses a service in which I use Ruby threads to reduce the API's response time. I have tried to share the context using the following example. It was working fine with Rails 4 and Ruby 2.2.1.
Now we have upgraded Rails to 5.2.3 and Ruby to 2.6.5, after which the service has stopped working. I can call the service from the console and it works fine, but with an API call the service becomes unresponsive once it reaches CurrencyConverter.new. Any idea what the issue could be?
class ParallelTest
  def initialize
    puts "Initialized"
  end
  def perform
    # Our sample set of currencies
    currencies = ['ARS','AUD','CAD','CNY','DEM','EUR','GBP','HKD','ILS','INR','USD','XAG','XAU']
    # Create an array to keep track of threads
    threads = []
    currencies.each do |currency|
      # Keep track of the threads as you spawn them
      threads << Thread.new do
        puts currency
        CurrencyConverter.new(currency).print
      end
    end
    # Join the threads to allow them to finish
    threads.each do |thread|
      thread.join
    end
    { success: true }
  end
end
class CurrencyConverter
  def initialize(params)
    @curr = params
  end
  def print
    puts @curr
  end
end
If I remove the CurrencyConverter.new(currency), then everything works fine. CurrencyConverter is a service object that I have.
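For comparison, here is a minimal plain-Ruby sketch of the same fan-out pattern outside Rails, so no autoloading is involved. The convert helper is a hypothetical placeholder for the real service object, and Thread#value is used so that each thread's result comes back to the caller.

```ruby
# Hypothetical stand-in for CurrencyConverter: returns a label for a currency.
def convert(currency)
  "converted #{currency}"
end

# Spawn one thread per currency and collect each thread's return value.
def convert_all(currencies)
  threads = currencies.map do |currency|
    Thread.new { [currency, convert(currency)] }
  end
  # Thread#value joins the thread and returns the result of its block.
  threads.map(&:value).to_h
end

results = convert_all(%w[ARS AUD CAD])
puts results["AUD"]
```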
Found the Issue
Thanks to @anothermh for these links
https://guides.rubyonrails.org/threading_and_code_execution.html#wrapping-application-code
https://guides.rubyonrails.org/threading_and_code_execution.html#load-interlock
As the guide explains, when one thread is performing an autoload by evaluating the class definition from the appropriate file, it is important that no other thread encounters a reference to the partially-defined constant.
Only one thread may load or unload at a time, and to do either, it must wait until no other threads are running application code. If a thread is waiting to perform a load, it doesn't prevent other threads from loading (in fact, they'll cooperate, and each perform their queued load in turn, before all resuming running together).
This can be resolved by permitting concurrent loads.
https://guides.rubyonrails.org/threading_and_code_execution.html#permit-concurrent-loads
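threads = []
Rails.application.executor.wrap do
  currencies.each do |currency|
    threads << Thread.new do
      # Wrap the work of each thread so Rails knows it runs application code
      Rails.application.executor.wrap do
        CurrencyConverter.new(currency)
        puts currency
      end
    end
  end
  # Join only after all threads are started, and let them autoload meanwhile
  ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
    threads.each(&:join)
  end
end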
Thank you everybody for your time, I appreciate it.
Don't re-invent the wheel and use Sidekiq instead. 😉
From the project's page:
Simple, efficient background processing for Ruby.
Sidekiq uses threads to handle many jobs at the same time in the same process. It does not require Rails but will integrate tightly with Rails to make background processing dead simple.
With 400+ contributors and 10k+ stars on GitHub, they have built a solid parallel job execution process that is production ready and easy to set up.
Have a look at their Getting Started to see for yourself.
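To give a feel for what "threads handling many jobs in the same process" means, here is a toy worker pool built only on Ruby's standard Queue and Thread. It illustrates the general idea under my own assumptions and is not Sidekiq's actual implementation; run_pool and the lambda jobs are made-up names.

```ruby
# Run the given callables on a fixed number of worker threads and
# return their results (in completion order, which may vary).
def run_pool(jobs, worker_count)
  queue = Queue.new
  jobs.each { |job| queue << job }
  queue.close # once closed and drained, pop returns nil instead of blocking

  results = Queue.new
  workers = Array.new(worker_count) do
    Thread.new do
      while (job = queue.pop)
        results << job.call
      end
    end
  end
  workers.each(&:join)

  Array.new(results.size) { results.pop }
end

squares = run_pool((1..5).map { |n| -> { n * n } }, 2)
puts squares.sort.inspect
```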
I am using Rufus Scheduler to trigger the background jobs that need to run every 1 hour.
scheduler = Rufus::Scheduler.singleton
scheduler.every '1h' do
  JobName.perform_now
end
My infrastructure is set up in AWS, and in production I have 2 instances running the app inside ECS.
What happens is that the Scheduler schedules the jobs twice.
Instance A schedules the job at 00:00:05:01 and Instance B schedules at 00:00:05:05
The jobs are not failing. I am using ActiveJob. I was looking into other solutions like Delayed Job, but that has the same problem when there are multiple instances.
Can you guys provide an alternate approach to fix this issue? Or a workaround for the same?
https://github.com/jmettraux/rufus-scheduler#lockfile--mylockfiletxt
"This is useful in environments where the Ruby process holding the scheduler gets started multiple times."
Try this:
scheduler = Rufus::Scheduler.singleton(:lockfile => ".rufus-scheduler.lock")
scheduler.every '1h' do
  JobName.perform_now
end
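As a rough illustration of what a lockfile buys you on a single host, here is a sketch using Ruby's File#flock. I am assuming the lockfile option relies on an exclusive, non-blocking file lock of this kind; try_lock is a hypothetical helper, not part of rufus-scheduler.

```ruby
require "tmpdir"

# Try to take an exclusive, non-blocking lock on the file at path.
# Returns the open File when the lock was acquired, nil otherwise.
def try_lock(path)
  file = File.open(path, File::RDWR | File::CREAT, 0o644)
  # flock returns 0 on success and false when LOCK_NB would have blocked
  return file if file.flock(File::LOCK_EX | File::LOCK_NB)
  file.close
  nil
end

lock_path = File.join(Dir.mktmpdir, "scheduler.lock")
first = try_lock(lock_path)
second = try_lock(lock_path)
puts "first holder got the lock: #{!first.nil?}"
puts "second holder got the lock: #{!second.nil?}"
```

Both attempts here go through the same filesystem, which is what makes the second one fail. Schedulers on separate machines would each see their own copy of the file.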
ECS instances don't share files, so you need a distributed lock; the most common options are Zookeeper, Consul and Redis.
Below is an example with Zookeeper, from the docs:
class ZookeptScheduler < Rufus::Scheduler
  def initialize(zookeeper, opts={})
    @zk = zookeeper
    super(opts)
  end
  def lock
    @zk_locker = @zk.exclusive_locker('scheduler')
    @zk_locker.lock # returns true if the lock was acquired, false else
  end
  def unlock
    @zk_locker.unlock
  end
  def confirm_lock
    return false if down?
    @zk_locker.assert!
  rescue ZK::Exceptions::LockAssertionFailedError => e
    # we've lost the lock, shutdown (and return false to at least prevent
    # this job from triggering)
    shutdown
    false
  end
end
You could maybe use EFS to share a lockfile, but this is not the correct way.
Start the scheduler only on instance A.
I have a Rails app that uses Rufus Scheduler combined with Delayed Job to execute background jobs. There are other jobs, but the one I'm having trouble with is scheduled in a controller using this code:
def create
  @harvest_plan = HarvestPlan.new(resource_params)
  @harvest_plan.start_date = Time.parse(resource_params[:start_date])
  if @harvest_plan.save
    ApplicationController.new.insert_in_messages_list(session, :success, 'Harvest plan created')
    schedule_harvest
    redirect_to farms_path
  end
end
private
def schedule_harvest
  Rufus::Scheduler.singleton.every "#{@harvest_plan.hours_between}h",
    :times => @harvest_plan.repetitions, :first_at => @harvest_plan.start_date do
    CreateHarvestFromPlanJob.perform_later
  end
end
The job is supposed to be scheduled according to the harvest plan model, which indicates how many hours must pass between jobs, when the first one is supposed to be scheduled, and how many repetitions must occur. Everything works perfectly except for the first job: it does happen at the time specified with first_at, but it is scheduled twice for some reason, and Delayed Job then executes it twice. I tried using the mutex, blocking and overlap options, but nothing changed. After the first job (scheduled twice) everything works fine: the next jobs are scheduled on time and just once. I have just one Delayed Job worker.
Why is this happening?
I am running Rails 4.2.4, Ruby 2.2.2 and Rufus 3.3.2. Since the error happens with both Passenger and WEBrick, I assume the web server has nothing to do with the problem.
Why is Rufus scheduling the first job twice?
Because of a bug you found: https://github.com/jmettraux/rufus-scheduler/issues/231
Thanks a lot!
I have a sidekiq worker that shouldn't take more than 30 seconds, but after a few days I'll find that the entire worker queue stops executing because all of the workers are locked up.
Here is my worker:
class MyWorker
  include Sidekiq::Worker
  include Sidekiq::Status::Worker
  sidekiq_options queue: :my_queue, retry: 5, timeout: 4.minutes
  sidekiq_retry_in do |count|
    5
  end
  sidekiq_retries_exhausted do |msg|
    store({message: "Gave up."})
  end
  def perform(id)
    begin
      Timeout::timeout(3.minutes) do
        got_lock = with_semaphore("lock_#{id}") do
          # DO WORK
        end
      end
    rescue ActiveRecord::RecordNotFound => e
      # Handle
    rescue Timeout::Error => e
      # Handle
      raise e
    end
  end
  def with_semaphore(name, &block)
    Semaphore.get(name, {stale_client_timeout: 1.minute}).lock(1, &block)
  end
end
And the semaphore class we use (redis-semaphore gem):
class Semaphore
  def self.get(name, options = {})
    Redis::Semaphore.new(name.to_sym,
      :redis => Application.redis,
      stale_client_timeout: options[:stale_client_timeout] || 1.hour,
    )
  end
end
Basically, when I stop the worker it reports done: 10000 seconds, which the worker should NEVER be running for.
Anyone have any ideas on how to fix this or what is causing it? The workers are running on EngineYard.
Edit: One additional comment. The # DO WORK has a chance to fire off a PostgreSQL function. I have noticed in the logs some mention of PG::TRDeadlockDetected: ERROR: deadlock detected. Would this cause the worker to never complete even with a timeout set?
Given that you want to ensure unique job execution, I would try removing all locks and delegating job uniqueness control to a plugin like Sidekiq Unique Jobs.
In this case, even if Sidetiq enqueues the same job id twice, this plugin ensures it will be enqueued and processed only once.
You might also try the ActiveRecord with_lock mechanism: http://api.rubyonrails.org/classes/ActiveRecord/Locking/Pessimistic.html
I have had a similar problem before. To solve this problem, you should stop using Timeout.
As explained in this article, you should never use Timeout in a Sidekiq job. If you use Timeout, Sidekiq processes and threads can easily break.
Not only Ruby, but also Java has a similar problem. Stopping a thread from the outside is inherently dangerous, regardless of the language.
If you continue to have the same problem after removing Timeout, check whether you are using threads carelessly in your code.
As Sidekiq's architecture is so sophisticated, in almost all cases, the source of the bug is outside of Sidekiq.
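One alternative to Timeout, sketched here under my own assumptions, is to have the job check a deadline itself between units of work, so that it only ever stops at a point it chose. DeadlineExceeded and process_with_deadline are hypothetical names, and the block stands in for the real work.

```ruby
# Raised by the job itself when it runs out of its time budget.
class DeadlineExceeded < StandardError; end

# Process items one by one, checking a monotonic-clock deadline before each.
# Nothing interrupts the thread from outside; the loop stops itself.
def process_with_deadline(items, budget_seconds)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + budget_seconds
  done = []
  items.each do |item|
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
      raise DeadlineExceeded, "stopped after #{done.size} of #{items.size} items"
    end
    done << yield(item)
  end
  done
end

puts process_with_deadline([1, 2, 3], 5) { |n| n * 2 }.inspect
```

This does not interrupt a single call that blocks for a long time; for that case, a timeout offered by the client library itself (for example on the database or HTTP connection) is the usual complement.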
I am facing a very interesting problem. I have tested the Delayed Job gem 4 times, and I suspect it is either a design problem of the gem or a bug. I use the command rake jobs:work to create a worker that runs the delayed jobs.
Once I create a LongTask record, I also create a delayed job which will change the attribute minutes_delayed to 2.
The gem works perfectly if I don't update the attributes. But once I edit the description, the gem does not work properly: it does not execute the delayed job, yet the related delayed job record is removed from the database.
Interesting final result:
It seems to reference an object with attributes that are exactly the same. This picture was captured before the run time had passed.
This one was captured after all tests had gone through. You can see the delayed job record for test4 has been removed even though this delayed job didn't have any effect.
terminal results (only 2 jobs are executed)
[Worker(host:Jasonteki-MacBook-Air.local pid:1726)] Starting job worker
[Worker(host:Jasonteki-MacBook-Air.local pid:1726)] LongTask#set_delay_time_without_delay completed after 0.0343
[Worker(host:Jasonteki-MacBook-Air.local pid:1726)] 1 jobs processed at 16.6270 j/s, 0 failed ...
[Worker(host:Jasonteki-MacBook-Air.local pid:1726)] LongTask#set_delay_time_without_delay completed after 0.0105
[Worker(host:Jasonteki-MacBook-Air.local pid:1726)] 1 jobs processed at 51.4774 j/s, 0 failed ...
Code in model:
def set_delay_time(time)
  self.minutes_delayed = time
  # very important for this, otherwise cannot write the change into the database
  self.save
end
handle_asynchronously :set_delay_time, :run_at => Proc.new { 2.minutes.from_now }
Code in controller:
def create
  @long_task = LongTask.new(params[:long_task])
  respond_to do |format|
    if @long_task.save
      @long_task.set_delay_time(2)
Without seeing your code, it's impossible to tell for sure, but it's likely that both of your delayed jobs are working on serialized copies of your object, rather than reloading them from the database.
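To show in general terms how a serialized copy can go wrong, here is a plain-Ruby sketch. Everything in it is a hypothetical stand-in: Task plays the model, FAKE_DB plays the table, and Marshal plays the job serializer. It is not a reproduction of what Delayed Job does internally, only of the stale-copy effect described above.

```ruby
# Hypothetical stand-ins for a model and its database table.
Task = Struct.new(:id, :description, :minutes_delayed)
FAKE_DB = {}

def save(task)
  FAKE_DB[task.id] = task.dup
end

def find(id)
  FAKE_DB.fetch(id).dup
end

# Enqueue by snapshot: the payload carries every attribute as of now.
def enqueue_snapshot(task)
  Marshal.dump(task)
end

# A snapshot job saves the old attributes back over any newer edits.
def run_snapshot_job(payload)
  task = Marshal.load(payload)
  task.minutes_delayed = 2
  save(task)
end

# An id-based job reloads the current row before changing it.
def run_id_job(id)
  task = find(id)
  task.minutes_delayed = 2
  save(task)
end

save(Task.new(1, "original", 0))
payload = enqueue_snapshot(find(1))

edited = find(1)
edited.description = "edited"
save(edited)

run_snapshot_job(payload)
puts find(1).description # the edit has been overwritten by the stale copy
```

Passing only the id to the job and reloading inside it, as run_id_job does, avoids working on stale attributes.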