Limit the number of background workers per user in Rails

I have a Rails application where each user has a specific number of background workers.
Since users pay more to increase the number of workers available, I want to be able to add these workers dynamically.
I would like to use ActiveJob in combination with Sidekiq and I thought about the following solution:
when a user registers, I create a new queue in Sidekiq with the user's id.
I add a number of workers, dedicated to that specific queue, depending on how much the user is paying.
I am having trouble implementing this solution with Sidekiq, and I could not find documentation on how to add queues and workers dynamically.

If I were to do this, here's what I'd try first:
Wrap all limited jobs in a counter.
On start/dequeue, the job checks whether this user has capacity to run it.
If yes, the job runs; if not, it reschedules itself.
Something along these lines:
class MyWorker
  include Sidekiq::Worker

  def perform(user_id, *args)
    user = User.find(user_id)
    unless user.has_available_workers?
      # Re-enqueue with the same args, after a short delay to avoid busy-looping.
      self.class.perform_in(30.seconds, user_id, *args)
      return
    end
    user.checkout_worker
    begin
      # do work
    ensure
      # Release only if we actually checked a worker out.
      user.release_worker
    end
  end
end
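For the capacity helpers themselves, here is a minimal sketch of one way to back them with a Redis counter (the worker_limit column, the key scheme, and the method bodies are assumptions, not something the answer above specifies):
# Hypothetical capacity helpers on User; worker_limit is an assumed column
# holding how many concurrent workers the user has paid for.
class User < ApplicationRecord
  def worker_count_key
    "user:#{id}:active_workers"
  end

  def has_available_workers?
    Sidekiq.redis { |r| r.get(worker_count_key).to_i } < worker_limit
  end

  def checkout_worker
    Sidekiq.redis { |r| r.incr(worker_count_key) }
  end

  def release_worker
    Sidekiq.redis { |r| r.decr(worker_count_key) }
  end
end
Note that the check in has_available_workers? and the increment in checkout_worker are not atomic; under real contention you would want a Lua script or a Redis lock around the pair.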

Related

ruby on rails background application to run jobs automatically at a time dynamically defined by users?

I have a use case where a user schedules a 'command' from the web interface. The user also specifies the date and time the command needs to be triggered.
This is the sequence of steps:
1. User schedules a command 'Restart Device' at May 31, 3pm.
2. This is saved in a database table called Command.
3. A background job needs to be triggered at this specified time to do something (make an API call, send an email, etc.).
4. Once the job is executed, it is removed or marked done, until a new command is issued.
There could be multiple users concurrently performing the above sequence of steps.
Is delayed_job a good choice for the above? I couldn't find an example of how to implement the above using Delayed Job.
EDIT: the reason I was looking at delayed_job is that eventually I would need to leverage an existing relational database.
I would advise using Sidekiq. With it, you can use scheduled jobs to tell Sidekiq when to perform them.
Example:
MyWorker.perform_at(3.hours.from_now, 'mike', 1)
EDIT: worker example
# app/workers/restart_device_worker.rb
class RestartDeviceWorker
  include Sidekiq::Worker

  def perform(params)
    # Do the job
    # ...
    # update in DB
  end
end
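To tie this back to the Command table in the question, the scheduling could happen when the record is created. A sketch, assuming a run_at datetime column on Command (the column name and callback wiring are assumptions):
# app/models/command.rb (hypothetical wiring)
class Command < ApplicationRecord
  after_create :schedule_worker

  private

  # Enqueue the Sidekiq worker for the user-specified time.
  def schedule_worker
    RestartDeviceWorker.perform_at(run_at, id)
  end
end
The worker can then load the Command by id, perform it, and mark it done.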
See the docs: https://blog.codeship.com/how-to-use-rails-active-job/ and https://guides.rubyonrails.org/active_job_basics.html
If you are using Rails 5, your best option is Active Job (a built-in feature).
Use ActiveJob
"Active Job – Make work happen later. Active Job is a framework for declaring jobs and making them run on a variety of queuing backends. These jobs can be everything from regularly scheduled clean-ups, to billing charges, to mailings. Anything that can be chopped up into small units of work and run in parallel, really."
Active Job has built-in adapters for multiple queuing backends (Sidekiq, Resque, Delayed Job and others). You just need to tell Active Job which one to use.
Scenario: I want to delete a story 24 hours (1 day) after it is created. So we create a job named "StoriesCleanupJob" and call it at the time the story is created, like below:
StoriesCleanupJob.set(wait: 1.day).perform_later(story)
It will run the job after 1 day.
class StoriesCleanupJob < ApplicationJob
  queue_as :default

  def perform(story)
    if story.destroy
      # Put your own conditions here, like updating a status, whatever you want to perform.
    end
  end
end

Idempotent Design with Sidekiq Ruby on Rails Background Job

Sidekiq recommends that all jobs be idempotent (able to run multiple times without causing problems), as it cannot guarantee a job will run exactly once.
I am having trouble understanding the best way to achieve that in certain cases. For example, say you have the following table:
User
  id
  email
  balance
The background job that is run simply adds some amount to their balance:
def perform(user_id, balance_adjustment)
  user = User.find(user_id)
  user.balance += balance_adjustment
  user.save
end
If this job is run more than once their balance will be incorrect. What is best practice for something like this?
A potential solution I can come up with is to create a record before scheduling the job, something like:
PendingBalanceAdjustment
  user_id
  balance_adjustment
When the job runs, it will need to acquire a lock for this user so that there's no chance of a race condition between two workers; it will then need to both update the balance and delete the pending balance adjustment record before releasing the lock.
The job then looks something like this?
def perform(user_id, balance_adjustment_id)
  user = User.find(user_id)
  pba = PendingBalanceAdjustment.where(balance_adjustment_id: balance_adjustment_id).take
  if pba.present?
    $redis.lock("#{user_id}/balance_adjustment") do
      user.balance += pba.balance_adjustment
      user.save
      pba.delete
    end
  end
end
This seems to solve both
a) Race condition between two workers taking the job at the same time (though you'd think Sidekiq could guarantee this already?)
b) A job being run multiple times after running successfully
Is this pattern a good solution?
You're on the right track; you want to use a database transaction, not a redis lock.
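In code, a transaction-based version of the job might look like this sketch; it keeps the PendingBalanceAdjustment record from the question but uses a row-level lock instead of $redis.lock (the exact locking calls are an assumption):
def perform(user_id, balance_adjustment_id)
  PendingBalanceAdjustment.transaction do
    # SELECT ... FOR UPDATE; a second worker blocks here until we commit.
    pba = PendingBalanceAdjustment.lock.find_by(id: balance_adjustment_id)
    next if pba.nil? # already applied by a previous run

    user = User.lock.find(user_id)
    user.balance += pba.balance_adjustment
    user.save!
    pba.destroy!
  end
end
Because updating the balance and deleting the pending record commit atomically, a duplicate run finds no pending record and becomes a no-op.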
I think you're on the right track too, though your solution might be overkill; I don't have full knowledge of your application.
But a simpler solution would be to have a flag on your User model, like balance_updated: datetime, so you could check that before updating.
As Mike mentions, using a transaction block should ensure it's thread-safe.
In any case, to answer your question more generally: having an updated_* timestamp column is usually good enough to start with, and then if it gets complicated you can move this logic into another model.
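A sketch of that flag-based guard, assuming a balance_updated_at datetime column and that the enqueue time is passed along as an epoch integer (Sidekiq serializes arguments to JSON, so plain numbers travel best); both are assumptions:
def perform(user_id, balance_adjustment, enqueued_at_epoch)
  User.transaction do
    user = User.lock.find(user_id)
    # Skip if a previous run already applied an adjustment after this job was enqueued.
    next if user.balance_updated_at && user.balance_updated_at.to_i >= enqueued_at_epoch

    user.update!(balance: user.balance + balance_adjustment,
                 balance_updated_at: Time.current)
  end
end
Note this only works if adjustments for a user do not overlap; with two adjustments in flight at once, the second would be skipped, which is why the pending-record approach above is safer.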

What Rails gem should I use to make recurring jobs with Resque?

I have a "Play" button in my app that checks a stock value from an API and creates a Position object that holds that value. This action uses Resque to make a background job using Resque and Redis in the following way:
Controller - stock_controller.rb:
def start_tracking
  @stock = Stock.find(params[:id])
  Resque.enqueue(StockChecker, @stock.id)
  redirect_to :back
end
Worker:
class StockChecker
  @queue = :stock_checker_queue

  def self.perform(stock_id)
    stock = Stock.find_by(id: stock_id)
    stock.start_tracking_position
  end
end
Model - stock.rb:
def start_tracking_position
  # A Position instance that holds the stock value is created
end
I now want this to happen every 15 minutes for every Stock object. I looked at the scheduling section on the Ruby Toolbox website and am having a hard time deciding what fits my needs and how to start implementing it.
My concern is that my app will create tons of Position objects, so I need something that is simple, uses Resque, and can withstand this kind of object creation without overloading the app.
What gem should I use, and what is the simplest way to make my Resque job run every 15 minutes once the start_tracking action happens on a Stock object?
I've found resque-scheduler to be useful: https://github.com/resque/resque-scheduler.
Configure the schedule.yml for a 15-minute interval.
The biggest issue I found was making sure it keeps running after releases etc. In the end I set up God to shut it down and restart it.
In terms of load: the scheduler only triggers events; the load is determined by the number of workers you have and how you decide to implement the creation. You can set the priority of the queues, and the workers per queue, but if you don't process jobs in a timely way you get a backlog; whether that is acceptable is up to you. Normally you would run the workers on a separate server, minimising the impact on the front end.
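Since the question wants the 15-minute cycle to start when start_tracking fires, resque-scheduler's dynamic schedules may fit better than a static schedule.yml. A sketch (requires Resque::Scheduler.dynamic = true; the per-stock schedule name is my choice, not part of the answer above):
# In the controller action, instead of a one-off Resque.enqueue:
Resque.set_schedule(
  "stock_checker_#{@stock.id}",
  'class' => 'StockChecker',
  'cron'  => '*/15 * * * *', # every 15 minutes
  'args'  => [@stock.id]
)
Resque.remove_schedule("stock_checker_#{@stock.id}") can then drop the schedule when tracking should stop.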

is there a way to run a job at a set time later, without cron, say a scheduled queue?

I have a rails application where I want to run a job in the background, but I need to run the job 2 hours from the original event.
The use case might be something like this:
User posts a product listing.
A background job is queued to syndicate the listing to 3rd-party APIs. Even after the original request, the response could take a while, and the 3rd party's solution is to poll them every 2 hours to see if we can get a success acknowledgement.
So is there a way to queue a job so that a worker daemon knows to ignore it, or only pick it up at the scheduled time?
I don't want to use cron because it will load up a whole application stack and may be executed twice on overlapping long-running jobs.
Can a priority queue be used for this? What solutions are there to implement this?
Try Delayed Job: https://github.com/collectiveidea/delayed_job
Something along these lines?
class ProductCheckSyndicateResponseJob < Struct.new(:product_id)
  def perform
    product = Product.find(product_id)
    if product.still_needs_syndicate_response
      # do it ...
      # still no response, check again in two hours
      Delayed::Job.enqueue(ProductCheckSyndicateResponseJob.new(product.id), run_at: 2.hours.from_now)
    else
      # nothing to do ...
    end
  end
end
Initialize the job the first time in the controller, or maybe in a before_create callback on the model?
Delayed::Job.enqueue(ProductCheckSyndicateResponseJob.new(@product.id), run_at: 2.hours.from_now)
Use the Rufus Scheduler gem. It runs as a background thread, so you don't have to load the entire application stack again. Add it to your Gemfile, and then your code is as simple as:
# in an initializer,
SCHEDULER = Rufus::Scheduler.start_new

# then wherever you want in your Rails app,
SCHEDULER.in('2h') do
  # whatever code you want to run in 2 hours
end
The GitHub page has many more examples.

execute only one of many duplicate jobs with sidekiq?

I have a background job that does a map/reduce job on MongoDB. When the user sends in more data to the document, it kicks off the background job that runs on the document. If the user sends in multiple requests, it will kick off multiple background jobs for the same document, but only one really needs to run. Is there a way I can prevent multiple duplicate instances? I was thinking of creating a queue for each document and making sure it is empty before I submit a new job. Or perhaps I can set a job id somehow that is the same as my document id, and check that none exists before submitting it?
Also, I just found a sidekiq-unique-jobs gem. But the documentation is non-existent. Does this do what I want?
My initial suggestion would be a mutex for this specific job. But since there's a chance that you may have multiple application servers working the Sidekiq jobs, I would suggest something at the Redis level.
For instance, use redis-semaphore within your Sidekiq worker definition. An untested example:
def perform
  s = Redis::Semaphore.new(:map_reduce_semaphore, connection: "localhost")

  # verify that this sidekiq worker is the first to reach this semaphore.
  unless s.locked?
    # auto-unlocks in 90 seconds. set to what is reasonable for your worker.
    s.lock(90)
    your_map_reduce()
    s.unlock
  end
end

def your_map_reduce
  # ...
end
https://github.com/krasnoukhov/sidekiq-middleware
UniqueJobs
Provides uniqueness for jobs.
Usage
Example worker:
class UniqueWorker
  include Sidekiq::Worker

  sidekiq_options({
    # Should be set to true (enables uniqueness for async jobs)
    # or :all (enables uniqueness for both async and scheduled jobs)
    unique: :all,

    # Unique expiration (optional, default is 30 minutes).
    # For scheduled jobs it is calculated automatically based on schedule time and expiration period.
    expiration: 24 * 60 * 60
  })

  def perform
    # Your code goes here
  end
end
There also is https://github.com/mhenrixon/sidekiq-unique-jobs (SidekiqUniqueJobs).
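With that gem, uniqueness is also declared via sidekiq_options. A sketch (option names have changed across versions, so check its README; newer releases use lock:, older ones unique:):
class MapReduceWorker
  include Sidekiq::Worker
  # Drop duplicate enqueues until the running job has finished.
  sidekiq_options lock: :until_executed

  def perform(document_id)
    # run the map/reduce for this document ...
  end
end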
You can do this, assuming all the jobs are added to the Enqueued bucket.
class SidekiqUniqChecker
  def self.perform_unique_async(action, model_name, id)
    key = "#{action}:#{model_name}:#{id}"
    queue = Sidekiq::Queue.new('elasticsearch')
    # Skip enqueueing if an identical job is already waiting in the queue.
    queue.each { |job| return if job.args.join(':') == key }
    Indexer.perform_async(action, model_name, id)
  end
end
The above code is just a sample, but you may tweak it to your needs.
