I'd like to decide at enqueue time which queue a job goes to.
This is because if the job is scheduled by the server (cron job) it must run on a slow queue, whereas if it is triggered by the user it should go on a fast queue.
How can I run this in Resque?
Controller
MyJob.perform_later(id, :fast)
Rake task
MyJob.perform_later(id, :slow)
Job
class MyJob < ApplicationJob
  queue_as :default # <-- this has to be dynamic

  def perform(item_id, queue_name)
    # ...
  end
end
I see you are using ActiveJob. You can set the queue using the set method:
Controller
MyJob.set(queue: :fast).perform_later(id)
Rake task
MyJob.set(queue: :slow).perform_later(id)
The set method allows you to set more things than just the queue; you can also set e.g. the priority or when the job should be performed. See the documentation: https://api.rubyonrails.org/v5.2.3/classes/ActiveJob/Core/ClassMethods.html#method-i-set
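The options can also be combined. A minimal sketch (the values here are arbitrary, and the wait option needs a scheduler-capable backend such as resque-scheduler):
MyJob.set(queue: :slow, wait: 5.minutes).perform_later(id)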
Note: I presume you already have slow and fast Resque queues in place and running, and only want to use them.
Related
I am trying to implement an API endpoint that would queue a request and return immediately.
I am using the gem https://rubygems.org/gems/activejob/versions/5.2.0 (I am on an old version for historical reasons).
I have defined a job that looks something like:
class Service::ExportBooks::Job < ActiveJob::Base
  def perform
    ## ... Do the job
  rescue StandardError
    binding.pry
    raise
  end
end
In the controller, I am calling:
Service::ExportBooks::Job.perform_later
The job gets called synchronously, and the controller even receives any errors raised by the job.
I've also tried other options such as:
job = Service::ExportBooks::Job.new
job.enqueue(wait: 5.seconds)
but it behaves the same: the job is not enqueued, it is executed immediately.
UPDATE:
It looks like the method Resque.inline? returns true, so execution is inline rather than async. How can I make sure that it's async? I tried setting Resque.inline = false manually; the job was queued, but it wasn't executed...
I have started a worker using the command:
QUEUE=* PIDFILE=./tmp/resque.pid bundle exec rake environment resque:work
Two things to do here:
Make sure Resque.inline = false.
Start up the Resque workers in a separate process (for example with the rake resque:work command shown above).
This will get the job enqueued and run on the worker process.
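For the first point, a minimal initializer sketch (the file path is an assumption; inline mode is typically wanted only in tests, so make sure nothing enables it in other environments):
# config/initializers/resque.rb
# Run jobs through Redis and the worker processes everywhere except tests.
Resque.inline = Rails.env.test?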
I have a SchedulerJob:
class SchedulerJob < ActiveJob::Base
queue_as :scheduler
def perform
logger.debug("Start")
DelayedJob.set(wait: 10.seconds).perform_later
logger.debug("After Job 1")
DelayedJob.set(wait: 20.seconds).perform_later
logger.debug("After Job 2")
DelayedJob.set(wait: 30.seconds).perform_later
logger.debug("End")
end
end
and a DelayedJob:
class DelayedJob < ActiveJob::Base
queue_as :delayed_jobs
def perform
puts "I'm done"
end
end
If I call SchedulerJob.new.perform, the job runs in just a few milliseconds. If I call SchedulerJob.perform_later to run the job through Sidekiq, it takes about 90 seconds to finish, and from the logs I can tell that each of those .perform_later calls takes about 30 seconds.
Why would this happen?
By definition:
perform_later will be performed as soon as the queuing system is free;
perform will be performed regardless of your queue status.
My guess is that when you call perform_later, there are already other jobs running in the queue.
The problem was the initializer. This was the solution:
My guess is you’ve destroyed all concurrency by hacking in one global $REDIS connection into the pool. Don’t do that. Let Sidekiq create and manage the pool of connections.
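In other words, remove the global connection from the initializer and hand Sidekiq plain connection options. A minimal sketch of what config/initializers/sidekiq.rb can look like instead (the Redis URL is an assumption):
# config/initializers/sidekiq.rb
# Let Sidekiq build and manage its own connection pools.
Sidekiq.configure_server do |config|
  config.redis = { url: ENV.fetch("REDIS_URL", "redis://localhost:6379/0") }
end
Sidekiq.configure_client do |config|
  config.redis = { url: ENV.fetch("REDIS_URL", "redis://localhost:6379/0") }
end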
I want to run an infinite loop on a separate thread that starts as soon as the app initializes (in an initializer). Here's what it might look like:
# in config/initializers/item_loop.rb
Thread.new do
  loop do
    Item.find_each do |item|
      # Get price from third-party api and update record.
      item.update_price!
      # Need to wait a little between requests to avoid getting throttled.
      sleep 5
    end
  end
end
I tend to accomplish this kind of thing by running batch updates in recurring background jobs. But that doesn't quite fit here, since I don't really need parallelization, downtime, or queueing; I just want to update one item at a time in a single thread, forever.
Yet there are multiple things that concern me:
Leaked Connections: Should I open up a new connection_pool for the thread? Should I use a gem like safely to avoid crashing the thread?
Thread Safety: Should I be worried about race conditions? Should I make use of Mutex and synchronize? Does using ActiveRecord::Base.transaction impact thread safety?
Deadlock: Should I use Rails.application.executor.wrap?
Concurrent Ruby/Sleep Intervals: Should I use TimerTask from concurrent-ruby gem instead of sleep or something other than Thread.new?
Information on any of these subjects is appreciated.
Usually, to perform a job in a background process (a non-web-server process), a background worker manager is used. Rails has a specific interface for such managers called ActiveJob. There are a few implementations of background worker managers, e.g. Sidekiq, DelayedJob, and Resque; Sidekiq is preferred. Returning to the actual problem: you may create a schedule to run an UpdatePriceJob at a fixed interval using the gem sidekiq-scheduler. Another nice extension for throttling Sidekiq workers is sidekiq-throttler.
Some code snippets:
# app/workers/update_price_worker.rb
# Actual worker class
class UpdatePriceWorker
  include Sidekiq::Worker
  sidekiq_options throttle: { threshold: 720, period: 1.hour }

  def perform(item_id)
    Item.find(item_id).update_price!
  end
end

# app/workers/update_price_master_worker.rb
# Master worker that loops over items
class UpdatePriceMasterWorker
  include Sidekiq::Worker

  def perform
    Item.find_each { |item| UpdatePriceWorker.perform_async item.id }
  end
end

# config/sidekiq.yml
:schedule:
  update_price:
    cron: '0 */4 * * *' # runs once per 4 hours - depends on how many Items there are
    class: UpdatePriceMasterWorker
The idea of this setup: we run UpdatePriceMasterWorker every 4 hours (tune this to how long it takes to update all items). The master worker creates a job to update the price of each individual item. UpdatePriceWorker is throttled to a maximum of 720 jobs per hour.
I use rails runner x (supervised by the god gem or k8s) in a similar case of ours.
rails runner runs in a separate process, so we do not have to worry about connection leaks or thread safety.
The god gem or k8s supports concurrency and monitoring of job failures. Running one process with a specific sleep time honors the third-party API throttling (running N processes with N API keys could speed things up).
I think deadlock can happen in any concurrent setup.
I do not think this loop + sleep approach is a design flaw, because:
cron always starts jobs on a schedule, so long-running jobs can end up running simultaneously, and we would need extra logic to avoid job overlap. A plain loop + sleep instead keeps maximum throughput without any overlap (see the sketch below).
ActiveJob is good for a one-shot long-running task, but it does not fit a daemon.
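A minimal sketch of such a runner script (the file name and interval are assumptions), started with bundle exec rails runner script/update_prices_loop.rb and supervised by god or a k8s Deployment:
# script/update_prices_loop.rb
# One process, one thread: update each item in turn, forever.
loop do
  Item.find_each do |item|
    item.update_price! # fetch the price from the third-party API
    sleep 5            # stay under the API's rate limit
  end
end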
I'm wondering what is the most efficient way to send a bunch of emails.
Should I put the loop in the rake task, with the delayed job only doing the sending?
task :publish => :environment do
  # insert loop here do
  #   insert delayed job here
  # end
end
Or should I put the loop inside the delayed job?
task :publish => :environment do
  # insert delayed job here
end

# and on the job:
def perform
  # insert loop here
end
It depends on how many background workers you have. If you have more than one worker, then the first option (creating each job separately, with the loop inside the rake task) is far better, as it allows those tasks to be run in parallel.
It also makes it easier to write your worker method, as you don't need to worry about rerunning the worker over the entire list if it happens to fall over or be terminated. (although it's still good practice to ensure that your workers are idempotent where practical!)
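A sketch of the first option with delayed_job (NewsletterMailer, Subscriber, and weekly_digest are hypothetical names): one job per recipient, so several workers can send in parallel and a crash only affects the single email being processed:
task :publish => :environment do
  Subscriber.find_each do |subscriber|
    # .delay enqueues the mailer call as its own background job
    NewsletterMailer.delay.weekly_digest(subscriber.id)
  end
end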
I have some methods that work with the API of a third-party app. Running them on a button click is no problem, but this should be a permanent process.
How do I run them in the background? And how can I pause the cycle to do some other work with the same API, then resume the cycle after that job is done?
So far I have read about ActiveJob, but it only deals with time-based scheduling...
UPDATE
I've tried to do this with whenever and Sidekiq; the task runs, but it does nothing, and I can't figure out where to look for logs.
**schedule.rb**
every 1.minute do
  runner "UpdateWorker.perform_async"
end
**update_worker.rb**
class UpdateWorker
  include Sidekiq::Worker
  include CommonMods

  def perform
    logger.info "Things are happening."
    logger.debug "Here's some info: #{hash.inspect}"
    myMethod
  end

  def myMethod
    # ...
  end
end
It's not exactly what I need, but it's better than nothing. Can somebody explain this to me with examples?
UPDATE 2: After changing the code, it is absolutely necessary to restart Sidekiq. With that the problem is solved, but I'm not sure this is the best way.
You can define a job which enqueues itself:
class MyJob < ActiveJob::Base
  def perform(*args)
    # Do something unless some flag is raised
  ensure
    self.class.set(wait: 1.hour).perform_later(*args)
  end
end
There are several libraries to schedule jobs on a regular basis. For example, you could use sidekiq-cron to run a job every minute.
If you want to pause it for some time, you could set a flag somewhere (Redis/database/file) and skip execution as long as it is detected, as sketched below.
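A minimal sketch of that flag check with Redis (the jobs:paused key name is an assumption; the ensure block still re-enqueues, so the cycle resumes once the flag is cleared):
class MyJob < ActiveJob::Base
  def perform(*args)
    # Skip the actual work while the pause flag is set.
    return if Sidekiq.redis { |conn| conn.get("jobs:paused") }
    # ... do the actual API work
  ensure
    self.class.set(wait: 1.hour).perform_later(*args)
  end
end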
On a somewhat related note: don't use sidetiq. It was really great, but it's not maintained anymore and has incompatibilities with current Sidekiq versions.
Just enqueue the next execution in the ensure section after the job completes, after checking some flag that indicates whether it should continue.
I also recommend adding some delay there so that you don't end up in a tight loop when an error occurs inside the job.
I don't know ActiveJob, but I can recommend the whenever gem to create cron (periodic background) jobs. Basically you end up writing rake tasks, like this:
desc 'send digest email'
task send_digest_email: :environment do
  # ... set options if any
  UserMailer.digest_email_update(options).deliver!
end
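To make it periodic with whenever, a sketch of config/schedule.rb (the daily timing is an assumption):
every 1.day, at: '4:30 am' do
  rake "send_digest_email"
end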
I've never had a rake task invoke itself, but for repeated processing you could do something like this (from answers to this specific question):
Rake::Task["send_digest_email"].execute