I want to do this:
def perform
  # do some stuff
  # run 10 other workers in parallel
  # do some more stuff when all 10 are finished
end
How would I do this in Sidekiq? I thought I might add those ten to a custom queue and then check when that queue is empty.
This page may help: https://github.com/mperham/sidekiq/wiki/Related-Projects#execution-ordering
This also sounds like a typical use case for Sidekiq Pro, whose batches run a callback once a set of jobs has finished.
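For illustration, here is a minimal plain-Ruby sketch of the fan-out/fan-in shape the question describes. Threads stand in for the ten Sidekiq jobs, and FanIn is a hypothetical in-process stand-in for a shared counter. In a real Sidekiq setup that counter would have to live somewhere every worker can reach (Redis, for example), or you would rely on Sidekiq Pro batches. None of the names below come from Sidekiq.

```ruby
# Hypothetical helper: each child reports in, and whichever child
# brings the pending count to zero runs the follow-up step.
class FanIn
  def initialize(total, &on_complete)
    @pending = total
    @lock = Mutex.new
    @on_complete = on_complete
  end

  def child_done
    finished = @lock.synchronize { (@pending -= 1).zero? }
    @on_complete.call if finished
  end
end

results = Queue.new
log = []
fan_in = FanIn.new(10) { log << "all 10 finished, #{results.size} results" }

# "Run 10 other workers in parallel": threads stand in for the jobs.
threads = 10.times.map do |i|
  Thread.new do
    results << i * i   # the child's own work
    fan_in.child_done  # report completion last
  end
end
threads.each(&:join)
```

The point of the design is that the parent never polls a queue for emptiness; the last child to finish triggers the "do some more stuff" step itself.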
Related
I want to run an infinite loop on a separate thread that starts as soon as the app initializes (in an initializer). Here's what it might look like:
# in config/initializers/item_loop.rb
Thread.new do
  loop do
    Item.find_each do |item|
      # Get price from third-party api and update record.
      item.update_price!
      # Need to wait a little between requests to avoid getting throttled.
      sleep 5
    end
  end
end
I usually accomplish this kind of thing by running batch updates in recurring background jobs. But that doesn't make sense here, since I don't really need parallelization, downtime, or queueing. I just want to update one item at a time in a single thread, forever.
Yet there are multiple things that concern me:
Leaked Connections: Should I open up a new connection_pool for the thread? Should I use a gem like safely to avoid crashing the thread?
Thread Safety: Should I be worried about race conditions? Should I make use of Mutex and synchronize? Does using ActiveRecord::Base.transaction impact thread safety?
Deadlock: Should I use Rails.application.executor.wrap?
Concurrent Ruby/Sleep Intervals: Should I use TimerTask from concurrent-ruby gem instead of sleep or something other than Thread.new?
Information on any of these subjects is appreciated.
Usually, a background job manager is used to perform work in a background (non-web-server) process. Rails has a common interface for such managers called ActiveJob. There are several implementations of a background job manager: Sidekiq, DelayedJob, Resque, etc. Sidekiq is preferred. Returning to the actual problem: you can create a schedule that runs a price-update job at a fixed interval using the sidekiq-scheduler gem. Another nice extension, for throttling Sidekiq workers, is sidekiq-throttler.
Some code snippets:
# app/workers/update_price_worker.rb
# Actual Worker class
class UpdatePriceWorker
  include Sidekiq::Worker
  sidekiq_options throttle: { threshold: 720, period: 1.hour }
  def perform(item_id)
    Item.find(item_id).update_price!
  end
end
# app/workers/update_price_master_worker.rb
# Master worker that loops over items
class UpdatePriceMasterWorker
  include Sidekiq::Worker
  def perform
    Item.find_each { |item| UpdatePriceWorker.perform_async item.id }
  end
end
# config/sidekiq.yml
:schedule:
  update_price:
    cron: '0 */4 * * *' # Runs once per 4 hours - depends on how many Items are there
    class: UpdatePriceMasterWorker
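The idea of this setup: we run UpdatePriceMasterWorker every 4 hours (the interval depends on how long it takes to update all items). The master worker enqueues one job per item to update that item's price. UpdatePriceWorker is throttled to at most 720 requests per hour, which matches the original pace of one request every 5 seconds (3600 / 5 = 720).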
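To pick the throttle threshold and the cron interval for your own data, the arithmetic can be sketched roughly like this. The helper names and the item count of 2,880 are made up for the example; only the 5-second gap and the 720 threshold come from the thread.

```ruby
# Jobs per hour allowed when requests must be spaced `gap_seconds` apart.
def hourly_threshold(gap_seconds)
  3600 / gap_seconds
end

# Whole hours needed to work through `item_count` jobs at that rate.
def hours_for_full_pass(item_count, threshold)
  (item_count.to_f / threshold).ceil
end

threshold = hourly_threshold(5)                     # => 720
cron_hours = hours_for_full_pass(2_880, threshold)  # => 4
```

With those hypothetical numbers a full pass takes four hours, so scheduling the master worker more often than every four hours would only pile up jobs behind the throttle.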
I run the loop with rails runner under a process supervisor (the god gem or k8s) in a similar case.
rails runner starts a separate process, so we do not have to worry about connection leaks or thread safety.
The god gem or k8s handles concurrency and monitors job failures. Running a single process with a fixed sleep time keeps you under the third-party API's throttling limit (running N processes with N API keys could speed things up).
I think deadlock can happen in any concurrent setup.
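As a rough sketch of what such a single-process loop body could look like: the Item struct below is a hypothetical stand-in for the ActiveRecord model, and the injectable sleeper exists only so the example runs instantly. Errors are collected per item so that one failed API call does not end the whole pass.

```ruby
# Hypothetical stand-in for the Item model; update_price! would call the API.
Item = Struct.new(:id, :price) do
  def update_price!
    raise "API error for item #{id}" if id == 2 # simulate one failing request
    self.price = id * 10
  end
end

# One full pass over the items, pausing between requests.
# Errors are collected instead of raised so one bad item cannot stop the pass.
def update_all(items, pause: 5, sleeper: method(:sleep))
  errors = []
  items.each do |item|
    begin
      item.update_price!
    rescue StandardError => e
      errors << e.message
    end
    sleeper.call(pause)
  end
  errors
end

items = [Item.new(1), Item.new(2), Item.new(3)]
slept = []
errors = update_all(items, sleeper: ->(seconds) { slept << seconds })
```

In the real script you would presumably wrap this in `loop { ... }`, pass `Item.find_each`, and start it with rails runner so the supervisor can restart it if the process dies.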
I do not think this loop + sleep approach is a design flaw, because:
cron always starts jobs on schedule, so long-running jobs can end up running simultaneously, and we would need to add logic to avoid overlapping jobs. A plain loop + sleep keeps maximum throughput without any overlap.
ActiveJob is good for one-shot long-running tasks, but it is not a good fit for a daemon.
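If you do go the cron route, the overlap-avoidance logic mentioned above could be sketched with an advisory file lock. This is only one possible approach; the helper name and lock path are made up, and it assumes all runs happen on the same machine.

```ruby
require 'tmpdir'

# Runs the block only if no other run currently holds the lock file.
# Returns :ran or :skipped. The lock is released when the file is closed.
def with_exclusive_lock(lock_path)
  File.open(lock_path, File::RDWR | File::CREAT, 0o644) do |file|
    # LOCK_NB makes flock return false immediately instead of waiting.
    return :skipped unless file.flock(File::LOCK_EX | File::LOCK_NB)
    yield
    :ran
  end
end

lock_path = File.join(Dir.tmpdir, "update_prices_#{Process.pid}.lock")

inner = nil
outer = with_exclusive_lock(lock_path) do
  # A second run that starts while the first is still busy gets skipped.
  inner = with_exclusive_lock(lock_path) { :never_runs }
end
later = with_exclusive_lock(lock_path) { } # lock is free again
File.delete(lock_path)
```

A cron-triggered task would wrap its whole body in `with_exclusive_lock` and simply exit when it gets `:skipped`.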
I have a delayed job queue which contains particularly slow running tasks, which I want to be crunched by its own set of dedicated workers, so there is less risk it'll bottleneck the rest of the worker pipeline.
RAILS_ENV=production script/delayed_job --queue=super_slow_stuff start
However, I then also want a general worker pool for all other queues, hopefully without having to specify them separately (as their names are often changed or added to). Something akin to:
RAILS_ENV=production script/delayed_job --except-queue=super_slow_stuff start
I could use the wildcard * character, but I imagine this would cause the second worker to pick up the super slow jobs too?
Any suggestions on this?
You can define a global constant for your app with all queues.
QUEUES = {
  mailers: 'mailers',
  # etc.
}
Then use this constant in your delay method calls:
object.delay(queue: QUEUES[:mailers]).do_something
and try to build the delayed_job arguments dynamically:
system("RAILS_ENV=production script/delayed_job --pool=super_slow_stuff --pool=#{(QUEUES.values - ['super_slow_stuff']).join(',')}:number_of_workers start")
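A slightly more explicit sketch of building that command string, so you can inspect it before handing it to system. The 'reports' queue, the helper name, and the worker count of 2 are assumptions made up for the example; the --pool syntax is the one delayed_job documents.

```ruby
# Hypothetical queue registry; only 'mailers' and 'super_slow_stuff'
# appear in the thread, 'reports' is made up.
queues = {
  mailers: 'mailers',
  reports: 'reports',
  slow:    'super_slow_stuff'
}

# One dedicated pool for the slow queue, one shared pool for everything else.
def delayed_job_command(queues, slow_queue:, general_workers:)
  general = (queues.values - [slow_queue]).join(',')
  "RAILS_ENV=production script/delayed_job " \
    "--pool=#{slow_queue} --pool=#{general}:#{general_workers} start"
end

command = delayed_job_command(queues, slow_queue: 'super_slow_stuff', general_workers: 2)
```

Because the general pool is derived from the registry, adding a queue to the hash is enough for the general workers to pick it up on the next restart.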
Unfortunately, this functionality is not implemented in delayed_job.
See:
https://github.com/collectiveidea/delayed_job/pull/466
https://github.com/collectiveidea/delayed_job/pull/901
You can fork the delayed_job repository and apply the simple patches from https://github.com/collectiveidea/delayed_job/pull/466.
Then use your own GitHub repo, but please vote for https://github.com/collectiveidea/delayed_job/pull/466 so that it finally gets merged upstream.
Update:
I wrote an option to exclude queues myself. It is in the exclude_queues branch: https://github.com/one-more-alex/delayed_job/tree/exclude_queues
https://github.com/one-more-alex/delayed_job_active_record/tree/exclude_queues
The options are described in Readme.md.
The parts about exclusion:
# Option --exclude-specified-queues inverts queue processing by skipping the ones given in --queue, --queues.
# If both --pool=* and --exclude-specified-queues are given, no exclusions will be applied to "*".
If EXCLUDE_SPECIFIED_QUEUES is set to YES, then the queues defined by QUEUE, QUEUES will be skipped instead. See the description of the --exclude-specified-queues option for the special case of queue "*".
To answer the question strictly, the general worker would be started like this:
RAILS_ENV=production script/delayed_job --queue=super_slow_stuff --exclude-specified-queues start
Warning
Please note that I am not going to support delayed_job; the code is provided "as is" in the hope that it will be useful.
I opened the corresponding pull request: https://github.com/collectiveidea/delayed_job/pull/1019
And for the Active Record backend: https://github.com/collectiveidea/delayed_job_active_record/pull/151
Only the ActiveRecord backend is supported.
I have some methods that work with the API of a third-party app. Running them on a button click is no problem, but they should run as a permanent process.
How can I run them in the background? And how can I pause the cycle to do some other work with the same API, then resume the cycle once that work is done?
I have read about ActiveJob, but it seems to support only time-based scheduling...
UPDATE
I've tried to do it with whenever and Sidekiq. The task runs, but it does nothing, and I can't figure out where to look for logs.
**schedule.rb**
every 1.minute do
  runner "UpdateWorker.perform_async"
end
**update_worker.rb**
class UpdateWorker
  include Sidekiq::Worker
  include CommonMods
  def perform
    logger.info "Things are happening."
    logger.debug "Here's some info: #{hash.inspect}"
    myMethod
  end
  def myMethod
    ....
    ....
    ....
  end
end
It's not exactly what I need, but better than nothing. Can somebody explain this to me with examples?
UPDATE 2 After changing the code, it's absolutely necessary to restart Sidekiq. With that, the problem is solved, but I'm not sure this is the best way.
You can define a job which enqueues itself:
class MyJob < ActiveJob::Base
  def perform(*args)
    # Do something unless some flag is raised
  ensure
    self.class.set(wait: 1.hour).perform_later(*args)
  end
end
There are several libraries for scheduling jobs on a regular basis. For example, you could use sidekiq-cron to run a job every minute.
If you want to pause it for some time, you could set a flag somewhere (Redis/database/file) and skip execution as long as the flag is set.
On a somewhat related note: don't use sidetiq. It was really great, but it's not maintained anymore and is incompatible with current Sidekiq versions.
Just enqueue the next execution in the ensure section after the job completes, after checking a flag that indicates whether it should run again.
I also recommend adding some delay there so that you don't end up in a tight retry loop when there is an error inside the job.
I don't know ActiveJob, but I can recommend the whenever gem for creating cron (periodic background) jobs. Basically, you end up writing a rake task, like this:
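Here is a small plain-Ruby simulation of the "re-enqueue in ensure, skip while a flag is set" idea, so the control flow can be checked without a job backend. PauseFlag and PollingJob are hypothetical names; in a real app the flag would live in Redis or the database, and the line that appends to the scheduled list would be the delayed perform_later call instead.

```ruby
# Hypothetical in-memory flag; a real app would store this in Redis or the DB.
class PauseFlag
  def initialize
    @paused = false
    @lock = Mutex.new
  end

  def pause!
    @lock.synchronize { @paused = true }
  end

  def resume!
    @lock.synchronize { @paused = false }
  end

  def paused?
    @lock.synchronize { @paused }
  end
end

class PollingJob
  attr_reader :runs, :scheduled

  def initialize(flag, wait_seconds, &work)
    @flag = flag
    @wait_seconds = wait_seconds
    @work = work
    @runs = 0
    @scheduled = []
  end

  def perform
    return if @flag.paused? # skip the work, but ensure still reschedules
    @runs += 1
    @work.call
  ensure
    # Stand-in for enqueueing the next run with a delay.
    @scheduled << @wait_seconds
  end
end

flag = PauseFlag.new
job = PollingJob.new(flag, 3600) { :called_the_api }
job.perform  # runs
flag.pause!
job.perform  # skipped while paused, still rescheduled
flag.resume!
job.perform  # runs again
```

Note that the paused run still schedules its successor, which is what lets the cycle resume on its own once the flag is cleared.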
desc 'send digest email'
task send_digest_email: :environment do
  # ... set options if any
  UserMailer.digest_email_update(options).deliver!
end
I have never made a rake task invoke itself, but for repeated processing you could do something like this (from the answers to this specific question):
Rake::Task["send_digest_email"].execute
I'm using Sidetiq and Sidekiq together to run recurring jobs:
include Sidekiq::Worker
include Sidetiq::Schedulable
recurrence { secondly(3) }
def perform(id, last_occurrence)
  # magic happens
end
However, now I want to stop the entire enqueuing process. I want to remove all the jobs from Sidetiq. How can I do that?
Kind of late on this it looks like, but here we go anywho.
You can delete all scheduled Sidetiq jobs like this:
Sidetiq::scheduled.each { |occurrence| occurrence.delete }
As for preventing Sidetiq from queuing additional jobs, I'm not sure how that works or how to stop it dynamically.
I have a Rails 3 application and looked around the internet for daemons, but didn't find the right one for me.
I want a daemon that permanently fetches data (exchange rates) from a web resource and saves it to the database,
like:
while true
  Model.update_attribute(:course, http::get.new("asdasd").response)
end
I've only seen cron-like jobs, but they only run at specific times. I want it to run permanently, with each cycle starting as soon as the previous query finishes.
Do you understand what I mean?
The gem light-daemon I wrote should work very well in your case.
http://rubygems.org/gems/light-daemon
You can write your code in a class that has a perform method, use a queue system like Resque, and enqueue the job at application startup with Resque.enqueue(Updater).
Obviously, the job won't end until the application is stopped. Personally I don't like that, but if this is the requirement, it works.
For this reason, if you need to execute other tasks, you should configure more than one worker process and optionally more than one queue.
If you can change your requirements and find a trigger for the update mechanism, the same approach still works; you only have to remove the while true loop.
Sample class needed:
class Updater
  @queue = :endless_queue
  def self.perform
    while true
      Model.update_attribute(:course, http::get.new("asdasd").response)
    end
  end
end
Finally, I found a cool solution for my problem:
I use the god gem -> http://god.rubyforge.org/
with a bash script (link) for starting / stopping a simple rake task (with an infinite loop in it).
Now it works fine, and I even have some monitoring: god ensures that the rake task keeps running.
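For what it's worth, the infinite loop inside such a rake task could be written so that it finishes the current iteration and exits cleanly when the supervisor stops it. This is only a sketch under the assumption that the process is stopped with SIGTERM; PollLoop is a made-up name, and the demo stops itself after three ticks so it terminates.

```ruby
# Hypothetical loop wrapper: runs the block repeatedly until stop! is called.
class PollLoop
  def initialize(interval:, &work)
    @interval = interval
    @work = work
    @running = false
  end

  def stop!
    @running = false
  end

  def run
    @running = true
    while @running
      @work.call(self)
      sleep @interval if @running
    end
  end
end

ticks = 0
poller = PollLoop.new(interval: 0) do |loop_handle|
  ticks += 1                         # fetch and save data here
  loop_handle.stop! if ticks == 3    # demo only: stop after three rounds
end

# Let the current iteration finish, then leave the loop on SIGTERM.
Signal.trap('TERM') { poller.stop! }
poller.run
```

In a real task you would drop the three-tick guard and use a non-zero interval, leaving the trap as the only way out of the loop.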