create recurring activejob fails - ruby-on-rails

I'm trying to create an ActiveJob in rails 4.2 that runs at a regular rate. The job is being called the first time, but it does not start again. My code is throwing the exception below after trying to call perform_later.
log output
[ActiveJob] Enqueued ProcessInboxJob (Job ID: 76a63689-e330-47a1-af92-8e4838b508ae) to Inline(default)
[ActiveJob] [ProcessInboxJob] [76a63689-e330-47a1-af92-8e4838b508ae] Performing ProcessInboxJob from Inline(default)
ProcessInboxJob running...
[ActiveJob] [ProcessInboxJob] [76a63689-e330-47a1-af92-8e4838b508ae] [AWS S3 200 0.358441 0 retries] list_objects(:bucket_name=>"...",:max_keys=>1000)
[ActiveJob] [ProcessInboxJob] [76a63689-e330-47a1-af92-8e4838b508ae] Enqueued ProcessInboxJob (Job ID: dfd3dd7a-06ab-4dba-9bbf-ce1ad606f7e5) to Inline(default) with arguments: {:wait=>30 seconds}
[ActiveJob] [ProcessInboxJob] [76a63689-e330-47a1-af92-8e4838b508ae] Performed ProcessInboxJob from Inline(default) in 599.72ms
Exiting
/Users/antarrbyrd/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/activejob-4.2.0/lib/active_job/arguments.rb:60:in `serialize_argument': Unsupported argument type: ActiveSupport::Duration (ActiveJob::SerializationError)
process_inbox_job.rb
class ProcessInboxJob < ActiveJob::Base
  queue_as :default
  #FREQUENCY = 3.minutes
  def perform()
    # do some work
  end
  # reschedule job
  after_perform do |job|
    self.class.perform_later(wait: 30.seconds)
  end
end

The correct syntax is self.class.set(wait: 30.seconds).perform_later. Even so, this is not a reliable way to do it: if an exception occurs, the chain breaks. You also need to schedule the initial job yourself.
If you use Resque, you can use https://rubygems.org/gems/activejob-scheduler

As @bcd said, you have to use self.class.set(wait: 30.seconds).perform_later with a queue adapter that supports delayed queuing, i.e. not the default (inline) adapter.
I post to give a different point of view on the question of rescheduling, which may help future readers.
after_perform will not be called if an exception is raised, but that does not make it a bad place to reschedule a job. If a job can raise an exception, it is better to rescue it (with the class method rescue_from) and send yourself a notification if your backend doesn't already do so.
You can then try to fix the problem (either in the data or in your code) and retry (if you can) or enqueue a similar job again.
For the scheduling part, activejob-scheduler is great and does not work only with Resque, but it has some downsides.
It uses rufus-scheduler, which keeps delays in memory, so whenever your server restarts you lose all scheduling information. This can be a real problem for some tasks (I schedule tasks 1 month in the future and update my app every week, which means a restart each time).
You also lose all the advantages of using an actual queuing backend, such as beanstalk with backburner.
ActiveJob-scheduler also claims to perform jobs at exactly the right time, which is false. The ActiveJob adapter is called at the specified time, but depending on your setup it may take some time before the job is actually performed, e.g. when you run your jobs on another server.
Lastly, for the initial scheduling you can include code that checks whether the job exists when the worker starts, and schedules it if needed.
To sum up: yes, ActiveJob-Scheduler is great, but you lose some ActiveJob features and it does not do everything.
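The rescheduling advice in this thread (rescue errors so the chain survives, and make sure a first job exists when the worker starts) can be sketched in plain Ruby. This is only an illustration under assumptions: TinyQueue, InboxJob and ensure_scheduled are hypothetical names, and the in-memory queue is a stand-in for a real backend, not ActiveJob's API.

```ruby
# Hypothetical in-memory stand-in for a queue backend that supports delays.
class TinyQueue
  Entry = Struct.new(:job_class, :run_at)

  attr_reader :entries

  def initialize
    @entries = []
  end

  # Record that job_class should run `delay` seconds after `now`.
  def enqueue_in(delay, job_class, now: Time.now)
    @entries << Entry.new(job_class, now + delay)
  end

  def scheduled?(job_class)
    @entries.any? { |entry| entry.job_class == job_class }
  end
end

# Hypothetical recurring job: it reschedules itself even when the work fails.
class InboxJob
  INTERVAL = 30 # seconds, matching the 30-second wait in the question

  # Call this at worker start so the chain always has a first link.
  def self.ensure_scheduled(queue)
    queue.enqueue_in(INTERVAL, self) unless queue.scheduled?(self)
  end

  attr_reader :errors

  def initialize(queue, &work)
    @queue = queue
    @work = work
    @errors = []
  end

  def run
    @work.call
  rescue StandardError => e
    @errors << e.message # stand-in for sending yourself a notification
  ensure
    @queue.enqueue_in(INTERVAL, self.class) # the chain continues either way
  end
end
```

In a real application the same shape would use self.class.set(wait: 30.seconds).perform_later in place of enqueue_in, on an adapter that supports delayed jobs.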

Depending on which queuing system you are using, you can try https://github.com/codez/delayed_cron_job or https://github.com/ondrejbartas/sidekiq-cron. With delayed_cron_job you can use a UI such as rails_admin to edit the cron expression. Sidekiq-cron gives you a Sinatra web UI where you can manually kick off a job or pause it.

Related

Rails Initializer: An infinite loop in a separate thread to update records in the background

I want to run an infinite loop on a separate thread that starts as soon as the app initializes (in an initializer). Here's what it might look like:
# in config/initializers/item_loop.rb
Thread.new do
  loop do
    Item.find_each do |item|
      # Get price from third-party api and update record.
      item.update_price!
      # Need to wait a little between requests to avoid getting throttled.
      sleep 5
    end
  end
end
I tend to accomplish this by running batch updates in recurring background jobs. But this doesn't make sense since I don't really need parallelization, downtime, or queueing, I just want to update one item at a time in a single thread, forever.
Yet there are multiple things that concern me:
Leaked Connections: Should I open up a new connection_pool for the thread? Should I use a gem like safely to avoid crashing the thread?
Thread Safety: Should I be worried about race conditions? Should I make use of Mutex and synchronize? Does using ActiveRecord::Base.transaction impact thread safety?
Deadlock: Should I use Rails.application.executor.wrap?
Concurrent Ruby/Sleep Intervals: Should I use TimerTask from concurrent-ruby gem instead of sleep or something other than Thread.new?
Information on any of these subjects is appreciated.
Usually, to perform a job in a background (non-web-server) process, a background worker manager is used. Rails has a specific interface for such managers called ActiveJob. There are several implementations of a background worker manager: Sidekiq, DelayedJob, Resque, etc. Sidekiq is preferred. Returning to the actual problem: you can create a schedule that runs a price-update worker at a regular interval using the sidekiq-scheduler gem. Another nice extension, for throttling Sidekiq workers, is sidekiq-throttler.
Some code snippets:
# app/workers/update_price_worker.rb
# Actual Worker class
class UpdatePriceWorker
  include Sidekiq::Worker
  sidekiq_options throttle: { threshold: 720, period: 1.hour }
  def perform(item_id)
    Item.find(item_id).update_price!
  end
end
# app/workers/update_price_master_worker.rb
# Master worker that loops over items
class UpdatePriceMasterWorker
  include Sidekiq::Worker
  def perform
    Item.find_each { |item| UpdatePriceWorker.perform_async item.id }
  end
end
# config/sidekiq.yml
:schedule:
  update_price:
    cron: '0 */4 * * *' # Runs once every 4 hours; depends on how many Items there are
    class: UpdatePriceMasterWorker
The idea of this setup: we run the master worker every 4 hours (the interval depends on how long it takes to update all items). The master worker creates one job per item to update that item's price. UpdatePriceWorker is throttled to a maximum of 720 requests per hour.
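The numbers in this setup can be checked with a little arithmetic. The sketch below only restates the 720-per-hour throttle and the 4-hour cron interval from the snippets; the method names are made up for illustration, and it assumes the throttle is the only bottleneck.

```ruby
# Average spacing between requests when the throttle is saturated.
def seconds_per_request(threshold: 720, period_seconds: 3600)
  period_seconds.to_f / threshold
end

# Hours needed to work through item_count items at the throttled rate.
def hours_for_full_pass(item_count, threshold: 720)
  item_count.to_f / threshold
end

# Largest item count that finishes before the next cron run starts.
def max_items_per_cycle(cron_interval_hours: 4, threshold: 720)
  threshold * cron_interval_hours
end

seconds_per_request       # 5.0, the same pace as `sleep 5` in the question
hours_for_full_pass(1440) # 2.0
max_items_per_cycle       # 2880
```

If the item count grows beyond what fits in one cycle, either the cron interval or the throttle would need to change, otherwise jobs from consecutive runs pile up.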
I use rails runner together with a process supervisor (the god gem or k8s) in a similar case.
Rails runner runs in a separate process, so we do not have to worry about connection leaks or thread safety.
The god gem or k8s supports concurrency and monitors job failures. Running a single process with a fixed sleep time keeps you within the third-party API's throttle limits (running N processes with N API keys could speed things up).
I think deadlock can happen in any concurrent setup.
I do not think this loop + sleep approach is a design flaw, because:
cron always starts on schedule, so long-running jobs can end up running simultaneously, and you need extra logic to avoid overlapping jobs. A plain loop + sleep, by contrast, keeps maximum throughput without any overlap.
ActiveJob is good for one-shot long-running tasks, but it is not a good fit for a daemon.
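For the cron-style case, the "logic to avoid job overlapping" mentioned above could look like the following sketch, which uses a non-blocking file lock. It is one possible guard, not something from the original answer: run_exclusively and the lock file name are hypothetical, and it assumes all runs happen on the same host and share a filesystem.

```ruby
require "tmpdir"

# Run the block only if nobody else currently holds the lock file.
# Returns true if the block ran, false if another run is still in progress.
def run_exclusively(lock_path)
  File.open(lock_path, File::RDWR | File::CREAT, 0o644) do |file|
    # LOCK_NB makes flock return false immediately instead of waiting.
    return false unless file.flock(File::LOCK_EX | File::LOCK_NB)

    yield
    true
  end
  # The lock is released when File.open closes the file.
end

lock_path = File.join(Dir.tmpdir, "update_prices.lock") # hypothetical path
ran = run_exclusively(lock_path) do
  # update prices here
end
puts "previous run still active, skipping" unless ran
```

A run that starts while the previous one still holds the lock simply skips its turn, so two passes never hit the third-party API at the same time.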

Why is Rufus scheduling the first job twice?

I have a Rails app that uses Rufus Scheduler combined with Delayed Job to execute background jobs. There are other jobs, but the one I'm having trouble with is scheduled in a controller using this code:
def create
  @harvest_plan = HarvestPlan.new(resource_params)
  @harvest_plan.start_date = Time.parse(resource_params[:start_date])
  if @harvest_plan.save
    ApplicationController.new.insert_in_messages_list(session, :success, 'Harvest plan created')
    schedule_harvest
    redirect_to farms_path
  end
end
private
def schedule_harvest
  Rufus::Scheduler.singleton.every "#{@harvest_plan.hours_between}h",
    :times => @harvest_plan.repetitions, :first_at => @harvest_plan.start_date do
    CreateHarvestFromPlanJob.perform_later
  end
end
The job is supposed to be scheduled according to the harvest plan model, which specifies how many hours must pass between jobs, when the first one should be scheduled, and how many repetitions must occur. Everything works perfectly except for the first job: it does happen at the time specified with first_at, but for some reason it is scheduled twice, and Delayed Job then executes it twice. I tried the mutex, blocking and overlap options, but nothing changed. After the first job (scheduled twice) everything works fine: the next jobs are scheduled on time and just once. I have just one Delayed Job worker.
Why is this happening?
I am running Rails 4.2.4, Ruby 2.2.2 and Rufus 3.3.2. Since the error happens with both Passenger and WEBrick, I assume the web server has nothing to do with the problem.
Why is Rufus scheduling the first job twice?
because of a bug you found: https://github.com/jmettraux/rufus-scheduler/issues/231
Thanks a lot!

DelayedJob sometimes cannot load job class with a namespace

Sometimes I get errors in the delayed_job worker:
NameError: uninitialized constant Notifiers::MessageNotifierJob
full backtrace https://gist.github.com/olegantonyan/eeca9d612f9a10864efe
Notifiers::MessageNotifierJob is defined in app/jobs/notifiers/message_notifier_job.rb
By "sometimes" I mean that this job may fail -> retry -> succeed. The same happens with other jobs that have a namespace. Jobs without a namespace work just fine.
I tried adding app/jobs/ to the autoload paths explicitly, without any luck:
config.autoload_paths += Dir[ Rails.root.join('app', 'jobs', '**/') ]
The job itself looks like this
module Notifiers
  class MessageNotifierJob < BaseNotifierJob
    def perform(from, to, text)
      # some code to send slack notification
    end
  end
end
Solved. Neither Delayed Job nor the autoloader is to blame.
A week before adding these new jobs (like Notifiers::MessageNotifierJob), I increased the number of Delayed Job workers (using the capistrano3-delayed-job gem) from 1 to 4. However, capistrano3-delayed-job did not kill the old Delayed Job process; it only started 4 new ones. So I ended up with 1 old process that knew nothing about my new job classes. Whenever this old process picked up the job, it failed. Then one of the new processes picked up the job and succeeded.

Resque job not actually backgrounding

Instead, it takes up my processor and then effectively times out.
I have this in my controller:
after_save :handle_file
def handle_test
  Resque.enqueue UnpackFileOnS3, parent.id
end
It hits this point, and then the entire app waits for it to set up and upload the files as prescribed inside my job. Then it predictably times out, because the upload takes a while.
This occurs in my console as well. If I run:
Resque.enqueue UnpackFileOnS3, 4
then instead of enqueuing it, it locks up my console as it tries to run the entire job. I thought that normally the console would just enqueue it to Redis for a worker to pick up.
Why isn't this actually happening in the background? I assume that if it were, the timeouts would not occur.
My guess is that you are running Resque in inline mode. In this mode queuing is disabled and jobs run immediately in the calling process. Check your configs for this kind of code:
Resque.inline = ENV['RAILS_ENV'] == "cucumber"
#or whatever, important part is the inline option
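To make the effect of such a switch concrete, here is a toy queue with an inline flag. It is a simplified sketch, not Resque's implementation; ToyQueue and UnpackJob are hypothetical names used only for illustration.

```ruby
# Hypothetical toy queue that mimics an inline switch.
class ToyQueue
  attr_accessor :inline
  attr_reader :pending

  def initialize(inline: false)
    @inline = inline
    @pending = []
  end

  def enqueue(job_class, *args)
    if inline
      job_class.perform(*args)      # runs right now, blocking the caller
    else
      @pending << [job_class, args] # stored for a separate worker to pick up
    end
  end
end

# Hypothetical job that records which ids it processed.
class UnpackJob
  def self.processed
    @processed ||= []
  end

  def self.perform(id)
    processed << id
  end
end

ToyQueue.new(inline: true).enqueue(UnpackJob, 4) # performs immediately
ToyQueue.new.enqueue(UnpackJob, 5)               # only queued, not performed
```

With the flag on, the caller (a request or a console session) pays the full cost of the job, which matches the blocking behaviour described in the question.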

Delayed Job gem does not work when the referenced object changes its attributes

I am facing a very interesting problem. I have tested the Delayed Job gem 4 times, and I cannot tell whether this is a design problem of the gem or a bug. I use the command rake jobs:work to start a worker that runs the delayed jobs.
When I create a LongTask record, I also create a delayed job that changes the attribute minutes_delayed to 2.
The gem works perfectly if I don't update the attributes. But once I edit the description, the gem does not work properly: the delayed job has no effect, yet the related delayed job record is removed from the database.
Interesting final result:
It seems to reference an object whose attributes are exactly the same; this picture was captured before the scheduled run time had passed.
This one was captured after all tests had gone through. You can see that the delayed job record for test4 has been removed even though this delayed job didn't have any effect.
Terminal results (only 2 jobs are executed):
[Worker(host:Jasonteki-MacBook-Air.local pid:1726)] Starting job worker
[Worker(host:Jasonteki-MacBook-Air.local pid:1726)] LongTask#set_delay_time_without_delay completed after 0.0343
[Worker(host:Jasonteki-MacBook-Air.local pid:1726)] 1 jobs processed at 16.6270 j/s, 0 failed ...
[Worker(host:Jasonteki-MacBook-Air.local pid:1726)] LongTask#set_delay_time_without_delay completed after 0.0105
[Worker(host:Jasonteki-MacBook-Air.local pid:1726)] 1 jobs processed at 51.4774 j/s, 0 failed ...
Code in model:
def set_delay_time(time)
  self.minutes_delayed = time
  # very important for this, otherwise cannot write the change into the database
  self.save
end
handle_asynchronously :set_delay_time, :run_at => Proc.new { 2.minutes.from_now }
Code in controller:
def create
  @long_task = LongTask.new(params[:long_task])
  respond_to do |format|
    if @long_task.save
      @long_task.set_delay_time(2)
Without seeing your code, it's impossible to tell for sure, but it's likely that both of your delayed jobs are working on serialized copies of your object, rather than reloading them from the database.
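The hazard this answer guesses at can be illustrated in plain Ruby. The sketch below makes no claim about how the gem actually serializes records; LongTask here is a plain Struct, the Hash is a stand-in for the database, and all method names are hypothetical.

```ruby
# Hypothetical record type; the store is a Hash standing in for the database.
LongTask = Struct.new(:id, :description, :minutes_delayed)

def save_task(store, task)
  store[task.id] = task.dup
end

# Risky: the job payload is a full snapshot of the object at enqueue time.
def enqueue_snapshot(task)
  Marshal.dump(task)
end

def run_with_snapshot(store, payload)
  stale = Marshal.load(payload)
  stale.minutes_delayed = 2
  save_task(store, stale) # writes the old description back as well
end

# Safer: the payload is only the id, and the job reloads the current record.
def run_with_id(store, id)
  fresh = store.fetch(id).dup
  fresh.minutes_delayed = 2
  save_task(store, fresh)
end

store = {}
task = LongTask.new(1, "draft", 0)
save_task(store, task)
payload = enqueue_snapshot(task)
save_task(store, LongTask.new(1, "edited", 0)) # user edits before the job runs
run_with_snapshot(store, payload)
store[1].description # "draft": the edit was overwritten by the stale copy
```

Passing only the id and reloading inside the job avoids the stale copy, at the cost of one extra lookup when the job runs.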

Resources