Moving a Resque job between queues - ruby-on-rails

Is there any way to move a Resque job between two different queues?
We sometimes end up with a long queue and need to "bump up the priority" of a job near the end of it. We thought an easy way might be to move it to another queue that has a worker waiting for high-priority jobs.
This happens rarely, usually when we get a special call from a customer, so scaling or re-engineering doesn't seem necessary.

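There is nothing built into Resque for this. You can use rpoplpush to move every job from one queue to another:
module Resque
  # Moves all jobs currently in `source` to `destination`, one at a time.
  def self.move_queue(source, destination)
    r = Resque.redis
    r.llen("queue:#{source}").times do
      r.rpoplpush("queue:#{source}", "queue:#{destination}")
    end
  end
end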
https://gist.github.com/rafaelbandeira3/7088498

If it's a rare occurrence you're probably better off just manually pushing a new job into a shorter queue. You'll want to make sure that your system has a way to identify that the job has already run and to bail out so that when the job in the long queue is finally reached it is not processed again (if double processing is a problem for you).
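For the single-job case in the question, here is a minimal sketch. It assumes Resque's usual storage layout, where each job is a JSON string with "class" and "args" keys in a list named queue:<name>. The names move_job and FakeRedis are hypothetical: FakeRedis is an in-memory stand-in so the example runs without a Redis server, and it models only lrange, lrem and rpush, which exist under the same names in the redis gem.

```ruby
require "json"

# Hypothetical in-memory stand-in for the three Redis list commands used below,
# so the sketch runs without a Redis server. With Resque you would pass
# Resque.redis instead.
class FakeRedis
  def initialize
    @lists = Hash.new { |hash, key| hash[key] = [] }
  end

  def rpush(key, value)
    @lists[key].push(value)
    @lists[key].length
  end

  def lrange(key, start, stop)
    @lists[key][start..stop] || []
  end

  # Only the count == 1 case (remove the first match from the head) is modelled.
  def lrem(key, count, value)
    index = @lists[key].index(value)
    return 0 unless index
    @lists[key].delete_at(index)
    1
  end
end

# Move the first job matching class name and args from one queue to another.
# Returns true if a job was moved, false if no match was found.
def move_job(redis, source, destination, class_name, args)
  payload = redis.lrange("queue:#{source}", 0, -1).find do |raw|
    job = JSON.parse(raw)
    job["class"] == class_name && job["args"] == args
  end
  return false unless payload

  # lrem returns how many entries it removed; 0 means a worker got there first.
  return false if redis.lrem("queue:#{source}", 1, payload).zero?

  redis.rpush("queue:#{destination}", payload)
  true
end

redis = FakeRedis.new
redis.rpush("queue:low", JSON.generate("class" => "ReportJob", "args" => [1]))
redis.rpush("queue:low", JSON.generate("class" => "ReportJob", "args" => [2]))

moved = move_job(redis, "low", "high", "ReportJob", [2])
```

Note that the remove and the push are two separate commands, so a crash between them would lose the job. For a rare, manual operation that may be acceptable; otherwise the two steps would need to be wrapped in something atomic.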

Related

Idempotent Design with Sidekiq Ruby on Rails Background Job

Sidekiq recommends that all jobs be idempotent (able to run multiple times without being an issue) as it cannot guarantee a job will only be run one time.
I am having trouble understanding the best way to achieve that in certain cases. For example, say you have the following table:
User
id
email
balance
The background job that is run simply adds some amount to their balance
def perform(user_id, balance_adjustment)
  user = User.find(user_id)
  user.balance += balance_adjustment
  user.save
end
If this job is run more than once their balance will be incorrect. What is best practice for something like this?
One potential solution I can think of is to create a record before scheduling the job, something like:
PendingBalanceAdjustment
user_id
balance_adjustment
When the job runs it will need to acquire a lock for this user so that there's no chance of a race condition between two workers and then will need to both update the balance and delete the record from pending balance adjustment before releasing the lock.
The job then looks something like this?
def perform(user_id, balance_adjustment_id)
  user = User.find(user_id)
  pba = PendingBalanceAdjustment.where(:balance_adjustment_id => balance_adjustment_id).take
  if pba.present?
    $redis.lock("#{user_id}/balance_adjustment") do
      user.balance += pba.balance_adjustment
      user.save
      pba.delete
    end
  end
end
This seems to solve both
a) Race condition between two workers taking the job at the same time (though you'd think Sidekiq could guarantee this already?)
b) A job being run multiple times after running successfully
Is this pattern a good solution?
You're on the right track; you want to use a database transaction, not a redis lock.
I think you're on the right track too, but your solution might be overkill, though I don't have full knowledge of your application.
A simpler solution would be to have a flag on your User model, like balance_updated:datetime, which you could check before updating.
As Mike mentions, using a transaction block should ensure it's thread-safe.
In any case, to answer your question more generally: having an updated_ column is usually good enough to start with, and if it gets complicated you can move this logic to another model.
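The advice above boils down to recording "this adjustment has been applied" in the same atomic step that changes the balance. Here is a minimal sketch of that idea in plain Ruby, so it runs without Rails. Ledger and apply_adjustment are hypothetical names, and the Mutex only stands in for a database transaction with a row lock (in Rails, something along the lines of user.with_lock).

```ruby
require "set"

# Hypothetical in-memory ledger. The Mutex plays the role that a database
# transaction with a row lock would play in a real application.
class Ledger
  attr_reader :balances

  def initialize
    @balances = Hash.new(0)
    @applied = Set.new
    @lock = Mutex.new
  end

  # Applies an adjustment at most once per adjustment_id.
  # Returns true if the balance changed, false if this was a repeat run.
  def apply_adjustment(user_id, adjustment_id, amount)
    @lock.synchronize do
      # Set#add? returns nil when the id is already present.
      return false unless @applied.add?(adjustment_id)
      @balances[user_id] += amount
      true
    end
  end
end

ledger = Ledger.new
first_run = ledger.apply_adjustment(42, "adj-1", 25)
retry_run = ledger.apply_adjustment(42, "adj-1", 25) # simulated duplicate delivery
```

The job passes an adjustment id rather than an amount to add blindly, so a second delivery of the same job becomes a no-op instead of a double credit.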

Sidekiq - how to execute the job immediately (+ does make sense to use a queue in this case)?

I have a task that I need to generate immediately after the request is created and get it done ASAP.
So for this purpose, I have created a /config/sidekiq.yml file where I defined this:
---
:queues:
- default
- [critical, 10]
And for the respective worker, I set this:
class GeneratePDFWorker
  include Sidekiq::Worker
  sidekiq_options queue: 'critical', retry: false
  def perform(order_id)
    ...
Then I call this worker:
GeneratePDFWorker.perform_async(@order.id)
So I am testing this. But I found this post, which says that if I want to execute the task immediately, I should call:
GeneratePDFWorker.new.perform(@order.id)
So my question is - should I use the combination of a (critical) queue + the new (GeneratePDFWorker.new.perform) method? Does it make sense?
Also, how can I verify that the task is executed as critical?
Thank you
So my question is - should I use the combination of a (critical) queue + the new (GeneratePDFWorker.new.perform) method? Does it make sense?
Using GeneratePDFWorker.new.perform will run the code right there and then, like normal, inline code (in a blocking manner, not async). You can't define a queue, because it's not being queued.
As Walking Wiki mentioned, GeneratePDFWorker.new.perform(@order.id) will call the worker synchronously. So if you did this from a controller action, the request would block until the perform method completed.
I think your approach of using priority queues for critical tasks with Sidekiq is the way to go. As long as you have enough Sidekiq workers, and your queue isn't backlogged, the task should run almost immediately so the benefit of running your worker in-process is pretty much nil. So I'd say yes, it does make sense to queue in this case.
Also, you're probably aware of this, but Sidekiq has a great monitoring UI: https://github.com/mperham/sidekiq/wiki/Monitoring. This should make it easy to get reliable, detailed metrics on the performance of your workers.
should I use the combination of a (critical) queue?
Me:
Yes, you can use a critical queue if you need one. A queue with a weight of 2 will be checked twice as often as a queue with a weight of 1.
Tips:
Keep the number of queues as small as possible. Sidekiq is not designed to handle a huge number of queues.
Also keep weights as simple as possible. If you want queues always processed in a specific order, just declare them in order without weights.
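One way to picture what weights do is the following illustration. This is not Sidekiq's own code, just a model of weighted polling under the assumption that each queue appears once per unit of weight and the list is shuffled before each fetch. The weights mirror the sidekiq.yml from the question, with default given an assumed weight of 1; WEIGHTS, EXPANDED and polling_order are hypothetical names.

```ruby
# Hypothetical weights mirroring the sidekiq.yml from the question.
WEIGHTS = { "default" => 1, "critical" => 10 }.freeze

# One entry per unit of weight: ten "critical" entries and one "default".
EXPANDED = WEIGHTS.flat_map { |name, weight| [name] * weight }.freeze

# Order in which queues are checked for one fetch: shuffle, then drop repeats.
def polling_order(rng)
  EXPANDED.shuffle(random: rng).uniq
end

rng = Random.new(1234)
trials = 10_000
critical_first = trials.times.count { polling_order(rng).first == "critical" }
puts "critical checked first in #{critical_first} of #{trials} fetches"
```

In this model the critical queue is checked first in roughly 10 out of 11 fetches, but default is never starved. That is the difference from declaring queues in strict order without weights.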
the new (GeneratePDFWorker.new.perform) method?
Me: No. Calling the worker's perform method directly runs it synchronously in the same thread, which defeats the purpose of using Sidekiq. Your application server will be busy for longer, which hurts your application's performance and can get expensive.

How to run the job synchronously with sidekiq

I am currently working with queued jobs in Ruby on Rails with Sidekiq. I have two jobs that depend on each other, and I want the first job to finish before the second one starts. Is there any way to do this with Sidekiq?
Yes, you can use the YourSidekiqJob.new.perform(parameters_to_the_job) pattern. This will run your jobs in order, synchronously.
However, there are 2 things to consider here:
1. What happens if the first job fails?
2. How long does each job run?
For #2, the pattern blocks execution for the length of time each job takes to run. If the jobs are extremely short in runtime, why use the jobs in the first place? If they're long, are you expecting the user to wait until they're done?
Alternatively, you can schedule the running of the second job as the last line in the body of the first one. You still need to account for the failure mode of job #1 or #2. Also, you need to consider that the job won't necessarily run when it's scheduled to run, due to the state of the queue at schedule time. How does this affect your business logic?
Hope this helps
--edit according to last comment
class SecondJob < SidekiqJob
  def perform(params)
    data = SomeData.find
    return unless data.ready?
    # do whatever you need to do with the ready data
  end
end
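The alternative mentioned above, enqueuing the second job as the last line of the first, can be sketched without Sidekiq. Everything here is hypothetical scaffolding: QUEUE is an in-memory array standing in for the real queue, drain is a toy worker loop, and with Sidekiq the last line of FirstJob#perform would be a perform_async call instead.

```ruby
# Hypothetical in-memory queue standing in for Sidekiq; jobs are [class, args] pairs.
QUEUE = []
LOG = []

class SecondJob
  def perform(order_id)
    LOG << [:second, order_id]
  end
end

class FirstJob
  def perform(order_id)
    raise ArgumentError, "order_id required" if order_id.nil?
    LOG << [:first, order_id]
    # Last line: only reached when everything above succeeded.
    QUEUE << [SecondJob, [order_id]]
  end
end

# Minimal worker loop: run jobs in FIFO order, recording failures instead of crashing.
def drain(queue)
  until queue.empty?
    job_class, args = queue.shift
    begin
      job_class.new.perform(*args)
    rescue StandardError => e
      LOG << [:failed, job_class.name, e.message]
    end
  end
end

QUEUE << [FirstJob, [7]]
QUEUE << [FirstJob, [nil]]
drain(QUEUE)
```

The failed FirstJob never enqueues its SecondJob, which covers the first failure mode. The second job for order 7 also runs only after another job has been processed, which illustrates the point about not knowing exactly when a queued job will run.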

Ruby (rails) non-blocking recursive algorithm?

I've written the following pseudo-ruby to illustrate what I'm trying to do. I've got some computers, and I want to see if anything's connected to them. If nothing is connected to them, try again for another two attempts, and if that's still the case, shut it down.
This is for a big deployment so this recursive timer could be running for hundreds of nodes. I just want to check, is this approach sound? Will it generate tonnes of threads and eat up lots of RAM while blocking the worker processes? (I expect it will be running as a delayed_job)
check_status(0)
def check_status(i)
  if instance.connected.true? then return
  if instance.connected.false? and i < 3
    wait.5.minutes
    instance.check_status(i+1)
  else
    instance.shutdown
    return
  end
end
There is not going to be a large problem when the maximum recursion depth here is 3. It should be fine. Recursing a method does not create threads, but each call does store more information about the call stack, and eventually the resources used for that storage could run out. Not after 3 calls though, that is quite safe.
However, there is no need for recursion to solve your problem. The following loop should do just as well:
def check_status
  return if instance.connected.true?
  2.times do
    sleep 5.minutes # 5.minutes needs ActiveSupport, which Rails provides
    return if instance.connected.true?
  end
  instance.shutdown
end
You got answers from other users already. However, since you are waiting 5 minutes at least twice, you might consider changing the design or using another language.
Ruby (MRI) has a global interpreter lock, which prevents Ruby code from executing in parallel, so this approach risks being inefficient.
Consider using threads (a thread pool of reasonable size might make sense), probably fed by a queue of tasks.
Make sure the threads don't busy-wait for 5 minutes. Instead, put them to sleep for that time, so that other threads can execute while some are sleeping.
You could also consider using JRuby, which has true parallelism because it is not restricted by the GIL.
Consider using another programming language that might be more performant.
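The thread-pool suggestion can be sketched with Ruby's built-in Thread and Queue. The Node struct, the pool size and the timings are all assumptions made for illustration. The interval is a hundredth of a second so the example finishes quickly, where the real check would wait five minutes.

```ruby
# Hypothetical timings and pool size; a real deployment would wait minutes, not 10 ms.
CHECK_INTERVAL = 0.01
MAX_CHECKS = 3
POOL_SIZE = 4

# Hypothetical node record: a name, whether something is connected, and a shutdown flag.
Node = Struct.new(:name, :connected, :shut_down)

# Check up to MAX_CHECKS times, sleeping between checks, then shut the node down.
def check_node(node)
  MAX_CHECKS.times do |attempt|
    return if node.connected
    sleep(CHECK_INTERVAL) if attempt < MAX_CHECKS - 1
  end
  node.shut_down = true
end

nodes = 10.times.map { |i| Node.new("node-#{i}", i.even?, false) }

tasks = Queue.new
nodes.each { |node| tasks << node }
tasks.close # once closed and drained, pop returns nil and the workers exit

workers = POOL_SIZE.times.map do
  Thread.new do
    while (node = tasks.pop)
      check_node(node)
    end
  end
end
workers.each(&:join)
```

Sleeping threads do not hold the interpreter lock, so the pool stays responsive. Each sleeping check still occupies one pool thread for the whole wait, though, so with hundreds of nodes the pool size limits throughput.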
If it's running via delayed_job why not use the gem's functionality to implement what you want? I, for one, would go for something like the following. No need to sleep the delayed jobs or anything.
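class CheckStatusJob
  # delayed_job hook: keep the job record so perform can read its attempts
  def before(job)
    @job = job
  end

  def perform
    return if instance.connected.true?
    # attempts counts earlier failed runs, so it equals max_attempts - 1 on the final run
    if @job.attempts < max_attempts - 1
      raise 'The job failed!' # failing makes delayed_job retry at reschedule_at
    else
      instance.shutdown
    end
  end

  def max_attempts
    3
  end

  def reschedule_at(current_time, attempts)
    current_time + 5.minutes
  end
end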

How to avoid meeting Heroku's API rate limit with delayed job and workless

My Survey model has about 2500 instances and I need to apply the set_state method to each instance twice. I need to apply it the second time only after every instance has had the method applied to it once. (The state of an instance can depend on the state of other instances.)
I'm using delayed_job to create delayed jobs and workless to automatically scale up/down my worker dynos as required.
The set_state method typically takes about a second to execute. So I've run the following at the heroku console:
2.times do
  Survey.all.each do |survey|
    survey.delay.set_state
    sleep(4)
  end
end
Shouldn't be any issues with overloading the API, right?
And yet I'm still seeing the following in my logs for each delayed job:
Heroku::API::Errors::ErrorWithResponse: Expected(200) <=> Actual(429 Unknown)
I'm not seeing any infinite loops -- it just returns this message as soon as I create the delayed job.
How can I avoid blowing Heroku's API rate limits?
Reviewing workless, it looks like it incurs an API call per delayed job to check the worker count and potentially a second API call to scale up/down. So if you are running 5000 (2500x2) jobs within a short period, you'll end up with 5000+ API calls, which is well in excess of the limit of 1200 requests per hour. I've commented over there to hopefully help toward reducing the overall API usage (https://github.com/lostboy/workless/issues/33#issuecomment-20982433), but I think we can offer a more specific solution for you.
In the meantime, especially if your workload is pretty predictable (like this), I'd recommend skipping workless and doing that portion yourself, i.e. it sounds like you already know WHEN the scaling would need to happen (scale up right before the loop above, scale down right after). If that is the case you could do something like this to emulate the behavior in workless:
require 'heroku-api'
heroku = Heroku::API.new(:api_key => ENV['HEROKU_API_KEY'])
# Scale up once, before enqueuing anything
heroku.post_ps_scale(ENV['APP_NAME'], 'worker', Survey.count)
2.times do
  Survey.all.each do |survey|
    survey.delay.set_state
    sleep(4)
  end
end
# Scale back down once, after the loop
min_workers = ENV['WORKLESS_MIN_WORKERS'].present? ? ENV['WORKLESS_MIN_WORKERS'].to_i : 0
heroku.post_ps_scale(ENV['APP_NAME'], 'worker', min_workers)
Note that you'll need to remove workless from these jobs also. I didn't see a particular way to do this JUST for certain jobs though, so you might want to ask on that project if you need that. Also, if this needs to be 2 pass (the first time through needs to finish before the second), the 4 second sleep may in some cases be insufficient but that is a different can of worms.
I hope that helps narrow in on what you needed, but I'm certainly happy to discuss further and/or elaborate on the above as needed. Thanks!
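As a rough check on the numbers in this thread, the arithmetic can be written out. This is a back-of-the-envelope sketch only: it uses the 1200 requests per hour limit and the one-or-two calls per job estimate quoted above, and ignores any further API calls that workers might make on their own. The helper names are hypothetical.

```ruby
# Figures taken from the thread above; calls per job is the answer's estimate.
RATE_LIMIT_PER_HOUR = 1200
SECONDS_BETWEEN_JOBS = 4

# API calls per hour if one job is enqueued every `spacing` seconds.
def api_calls_per_hour(spacing, calls_per_job)
  (3600.0 / spacing * calls_per_job).round
end

# Smallest spacing (in seconds) between jobs that stays within the limit.
def min_spacing(limit_per_hour, calls_per_job)
  3600.0 * calls_per_job / limit_per_hour
end

[1, 2].each do |calls_per_job|
  rate = api_calls_per_hour(SECONDS_BETWEEN_JOBS, calls_per_job)
  status = rate > RATE_LIMIT_PER_HOUR ? "over" : "within"
  puts "#{calls_per_job} call(s) per job: #{rate}/hour, #{status} the limit"
end
```

With one call per job the 4-second sleep gives 900 calls per hour, which fits. With two calls per job it gives 1800, which does not, and the spacing would need to be at least 6 seconds under these assumptions.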
