Rails ActiveJob with Sidekiq is spawning multiple jobs on perform_later

I have a fairly simple Rails 5 app for monitoring site speeds. It sends jobs off to an external page speed testing service and periodically checks to see if the jobs are complete, and if so it calls and stores the data about the page.
Each project can have many pages, and each page can have many jobs, which is where the response from the testing service is stored.
There is an ActiveJob, with a Sidekiq backend, that is supposed to run every minute, check for pages that are due to run, and if any are found launch a job to enqueue them. It also checks if there are any enqueued jobs, and if any are found, it spools up a job to check the status and save data if it's complete.
def perform()
  # TODO: Add some way of setting the status into the view and check for active here - no way to stop jobs at the mo!
  pagestorun = Page.where("runtime < ?", DateTime.now)
  pagestorun.each do |page|
    job = Job.new(:status => "new")
    page.jobs << job
    updatedatetime(page)
  end
  if pagestorun.count != 0
    # Run the queuejob task to pick out all unqueued jobs and send them to webpagespeedtest
    QueuejobsJob.perform_later
  end
  # Check any jobs in the queue, then schedule the next task
  GetrunningtasksJob.perform_later
  FindjobstorunJob.set(wait: 1.minute).perform_later()
end
This seems to work as expected for a while, but after 5 minutes or so two jobs seem to end up spawning at the same time. Eventually each of those spawns more of its own, and after a few days I end up with tens of thousands trying to run per hour. There are no errors or failing jobs as best I can tell, and I can't find any reason why it'd be happening. Any help would be much appreciated :)

There's a chance the jobs are being retried due to failures, which, when overlapping with the regular 60-second schedule, could cause the double scheduling you're experiencing. For more info, see Error Handling in the Sidekiq wiki.
BTW, I'm not entirely sure ActiveJob is the best way to run periodic tasks (unless you're using Sidekiq Enterprise's periodic jobs). Instead, I'd use a cron job running a rake task every 60 seconds; the rake task would schedule the jobs to check specific pages:
QueuejobsJob.perform_later page.id
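A minimal sketch of that approach, assuming the Page model and QueuejobsJob from the question; the task name page_jobs:enqueue_due is illustrative, and the stub classes below only stand in for the real ActiveRecord model and ActiveJob class so the sketch runs outside Rails:

```ruby
require "rake"
extend Rake::DSL

# Stand-ins so the sketch runs outside Rails; in the real app these are the
# ActiveRecord model and ActiveJob class from the question.
Page = Struct.new(:id, :runtime) do
  def self.all
    @all ||= []
  end

  # Equivalent of Page.where("runtime < ?", Time.current)
  def self.due
    all.select { |p| p.runtime < Time.now }
  end
end

class QueuejobsJob
  def self.enqueued
    @enqueued ||= []
  end

  def self.perform_later(page_id)
    enqueued << page_id
  end
end

# lib/tasks/page_jobs.rake -- cron (or the whenever gem) runs this every
# minute, e.g.:  * * * * * cd /path/to/app && bin/rake page_jobs:enqueue_due
namespace :page_jobs do
  task :enqueue_due do
    # In the real app this task would also depend on :environment to load Rails.
    Page.due.each { |page| QueuejobsJob.perform_later(page.id) }
  end
end
```

Because cron owns the schedule, there is no self-re-enqueueing job to multiply: each minute either enqueues work or does nothing.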

Related

Long running schedule job

I'm new to RoR and wanted to ask something for confirmation. If I run a long scheduled job, will it block other scheduled jobs? I have other jobs running every 5 minutes, and I plan to write something that easily runs more than 3 hours. Will it block the 5-minute jobs?
The whenever gem is basically only a way to configure and handle Cron jobs.
That said: at the given time Cron will just start and run a configured job. Cron will not block other jobs, nor does it care if a job fails or if another job is still running.
Limiting factors might be:
Memory/CPU consumption: Each job consumes memory/CPU. If there are too many jobs running at the same time, your server might run out of memory or have a high load. But this doesn't really block other jobs; it just slows down the whole server.
Database locks: If your jobs perform tasks that lock database tables, other queries might be blocked and need to wait. But this is not Cron-specific; it depends on what your code actually does.
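To illustrate the point about independence, a whenever config/schedule.rb along these lines registers two independent cron entries (the job names here are illustrative); cron starts each on time regardless of whether the other is still running:

```ruby
# config/schedule.rb (whenever gem DSL; written to the crontab
# with `whenever --update-crontab`)
every 5.minutes do
  runner "QuickJob.perform_now"        # keeps firing even while the long job runs
end

every 1.day, at: "2:00 am" do
  runner "LongReportJob.perform_now"   # may run 3+ hours without blocking the job above
end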

Old Sidekiq jobs retry forever. No visibility on Sidekiq UI

I have made a decent number of changes to the process of one of my jobs over the last year: things like triggering it from after_commit instead of after_create on the respective model, as well as cleaning up the logic and covering corner cases.
I see my old jobs from months ago retrying over and over again in my Papertrail logs on my Heroku Ruby on Rails app. The new ones are fine, and I believe my changes have fixed any issues. The problem is: how do I stop all those old jobs, and why do I not see them in the Sidekiq UI? The Sidekiq UI just shows a number of completed jobs, but 0 failed, dead, busy, or enqueued. It says 0, yet I see the logs churning away.
I log the job IDs but have seen that you cannot kill a specific job. I have restarted my server multiple times with no luck. Every day they try again.
I should note that all recent jobs are fine. Anything within the last month or so does not repeat. Out of the 5000 objects that had an after_create-triggered job, only 1-60 are retrying. The others passed and are fine.
If you know the jids, you can do this from a Rails console:
queue = Sidekiq::Queue.new("my_queue")
queue.each do |job|
  job.delete if job.jid == 'abcdef1234567890'
end
If it's in the retry set, you can do:
query = Sidekiq::RetrySet.new
query.each do |job|
  job.delete if job.jid == 'abcdef1234567890'
end
If you can't delete because the jobs are in flight, stop your worker processes (i.e. shut them down) for a few minutes and then run the above.
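When there are too many jids to list one by one, the same pattern can sweep whole sets by job class instead. The sketch below stubs just enough of the Sidekiq API to run without Redis; in a real console you'd require "sidekiq/api" and use the real Sidekiq::Queue and Sidekiq::RetrySet (each job exposes its class name via klass). HardJob and "my_queue" are placeholder names:

```ruby
# Minimal stand-in for Sidekiq's API so the sketch runs without Redis.
module Sidekiq
  FakeJob = Struct.new(:klass, :jid, :set) do
    def delete
      set.jobs.delete(self)
    end
  end

  class Queue
    attr_reader :jobs

    def initialize(name)
      @jobs = []
    end

    def each(&block)
      jobs.dup.each(&block)   # dup so deleting while iterating is safe
    end
  end

  class RetrySet < Queue
    def initialize
      super("retry")
    end
  end
end

# The actual pattern: sweep the live queue and the retry set,
# deleting every job of one class.
def purge_jobs(klass_name, sets)
  sets.each do |set|
    set.each { |job| job.delete if job.klass == klass_name }
  end
end
```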

How do I handle long running jobs on Heroku?

I want to use Heroku but the fact they restart dynos every 24 hours at random times is making things a bit difficult.
I have a series of jobs dealing with payment processing that are very important, and I want them backed by the database so they're 100% reliable. For this reason, I chose DJ, which is slow.
Because I chose DJ, it means I also can't just push 5,000,000 events to the database at once (one per email send).
Because of THAT, I have longer running jobs (send 200,000 text messages over a few hours).
With these longer running jobs, it's more challenging to get them working if they're cut off right in the middle.
It appears Heroku sends SIGTERM and then expects the process to shut down within 30 seconds. That is not going to happen for my longer jobs.
Now I'm not sure how to handle them... the only way I can think of is to update the database immediately after each text is sent (for example, an sms_sent_at column), but that just means I'm destroying database performance instead of sending a single update query per batch.
This would be a lot better if I could schedule restarts; at least then I could do it at night, when I'm 99% likely not to be running any jobs that take longer than 30 seconds to shut down.
Or.. another way, can I 'listen' for SIGTERM within a long running DJ and at least abort the loop early so it can resume later?
Manual restarts reset the 24-hour clock - running heroku ps:restart at your preferred time ought to give you the control you're looking for.
More info can be found here: Dynos and the Dyno Manager
Here's the proper answer: you listen for SIGTERM (I'm using DJ here) and then rescue gracefully. It's important that the jobs are idempotent.
Long running delayed_job jobs stay locked after a restart on Heroku
class WithdrawPaymentsJob
  def perform
    begin
      term_now = false
      old_term_handler = trap('TERM') { term_now = true; old_term_handler.call }
      loop do
        puts 'doing long running job'
        sleep 1
        if term_now
          raise 'Gracefully terminating job early...'
        end
      end
    ensure
      trap('TERM', old_term_handler)
    end
  end
end
Here's how you solve it with Que:
if Que.worker_count.zero?
  raise 'Gracefully terminating job early...'
end

How can I start recurring background jobs when a user visits a web page?

I am working on a Rails web application with background workers that perform some tasks on a set interval. I am using Resque with Redis for queuing the background jobs, and resque-scheduler to run them on a set interval, e.g. every 30 seconds or so.
The background job needs to be enqueued only when a user visits a particular page, and it should run on a schedule until the user moves away from that page. Basically, I would like to set the schedule dynamically at runtime. My application is deployed in the cloud, and the main Rails app and the background workers run as separate processes communicating through Redis. How can I set the schedule dynamically, and from where?
resque_schedule.yml
  do_my_job:
    every: 1m
    class: BackgroundJob
    description: Runs the perform method in MyJob
    queue: backgroundq
Inside the controller:
  def index
    Resque.enqueue(BackgroundJob)
  end
background_job.rb
  class BackgroundJob
    @queue = :backgroundq
    @job_helper = BackgroundHelper.new

    def self.perform
      @job_helper.get_job_data
    end
  end
Resque-scheduler can do what you're describing using dynamic schedules. Every time a user visits your page, you create a schedule with Resque.set_schedule that runs every 30 seconds on that user. When the user leaves, you abort with Resque.remove_schedule. I can imagine a lot of ways the user might leave the page without you noticing (worst case, what if their computer loses power?), so I'd be worried about the schedules being left around. I'm also not sure how happy resque-scheduler remains as you add more and more schedules, so you might run into trouble with lots of users.
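The dynamic-schedule calls look roughly like this; the schedule names and job arguments are illustrative, and the real Resque.set_schedule / Resque.remove_schedule come from resque-scheduler with dynamic schedules enabled (Resque::Scheduler.dynamic = true). The module below stubs just enough of Resque to run standalone:

```ruby
# Stand-in for Resque so the sketch runs without Redis; resque-scheduler
# adds set_schedule/remove_schedule with these signatures.
module Resque
  def self.schedules
    @schedules ||= {}
  end

  def self.set_schedule(name, config)
    schedules[name] = config
  end

  def self.remove_schedule(name)
    schedules.delete(name)
  end
end

# Start polling when the user hits the page...
def start_polling(user_id)
  Resque.set_schedule("poll_user_#{user_id}",
    "class" => "BackgroundJob",
    "every" => "30s",
    "args"  => [user_id],
    "queue" => "backgroundq")
end

# ...and tear the schedule down when they leave (if you ever find out).
def stop_polling(user_id)
  Resque.remove_schedule("poll_user_#{user_id}")
end
```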
You could minimize the number of schedules by having one job that runs every 30 seconds over every user you currently need running, like:
class BackgroundJob
  def self.perform
    User.active.each { ... }
  end
end
If you do something like set User#last_seen_at when the user visits your page and have User.active load everyone seen in the past 10 minutes, then you'll be running every 30 seconds on all your active users, and there's no chance of a user's jobs lasting well after they leave, because it times out in 10 minutes. However, if you have more users than you can do work for in 30 seconds, the scheduled job won't be finished before its next copy starts up, and there will be trouble. You might be able to get around that by having the top-level job enqueue for each user a secondary job that actually does the work. Then as long as you can do that in 30 seconds and have enough resque workers for your job volume, things should be fine.
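A sketch of that fan-out, with a hypothetical User.active scope covering everyone seen in the last 10 minutes (PerUserJob is also illustrative; the plain-Ruby stand-ins let the sketch run outside the app):

```ruby
# In the real app this would be:
#   class User < ApplicationRecord
#     scope :active, -> { where("last_seen_at > ?", 10.minutes.ago) }
#   end
User = Struct.new(:id, :last_seen_at) do
  def self.all
    @all ||= []
  end

  def self.active
    cutoff = Time.now - 600   # seen in the last 10 minutes
    all.select { |u| u.last_seen_at > cutoff }
  end
end

# Stand-in capturing enqueues; the real method is Resque.enqueue.
module Resque
  def self.enqueued
    @enqueued ||= []
  end

  def self.enqueue(job_class, *args)
    enqueued << [job_class, args]
  end
end

# The scheduled top-level job: runs every 30 seconds and fans out one cheap
# job per active user, so the scheduled job itself always finishes quickly.
class BackgroundJob
  def self.perform
    User.active.each { |user| Resque.enqueue(PerUserJob, user.id) }
  end
end

class PerUserJob
  @queue = :backgroundq

  def self.perform(user_id)
    # the real 30-second work for one user goes here
  end
end
```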
A third way to approach it is to have the user's job enqueue another copy of itself, like so:
class BackgroundJob
  def self.perform(user_id)
    do_work(user_id)
    Resque.enqueue_in(30.seconds, BackgroundJob, user_id)
  end
end
You enqueue the first job when the user visits your page, and then it keeps itself going. This way is nice because every user's job runs independently, and you don't need a separate schedule for each or a top-level schedule managing everything. You'll still need a way to stop the jobs once the user is gone, perhaps with a timeout or a TTL counter.
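One way to add that TTL counter: pass a remaining-runs count through each re-enqueue so an abandoned chain dies on its own. The Resque stub here stands in for the real Resque.enqueue_in from resque-scheduler, and do_work is the hypothetical per-user work from above:

```ruby
# Stand-in capturing enqueue_in calls; the real method comes from
# resque-scheduler's delayed-job support.
module Resque
  def self.delayed
    @delayed ||= []
  end

  def self.enqueue_in(seconds, job_class, *args)
    delayed << [seconds, job_class, args]
  end
end

class BackgroundJob
  MAX_RUNS = 20   # ~10 minutes of 30-second ticks

  def self.work_done
    @work_done ||= []
  end

  def self.do_work(user_id)
    work_done << user_id   # placeholder for the real per-user work
  end

  def self.perform(user_id, runs_left = MAX_RUNS)
    do_work(user_id)
    return if runs_left <= 1          # chain expires on its own
    Resque.enqueue_in(30, BackgroundJob, user_id, runs_left - 1)
  end
end
```

When the user revisits the page, enqueueing a fresh job simply starts a new chain with a full counter.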
Finally, I'd question whether Resque is the right tool for your task. If you want something happening every 30 seconds as long as a user is on a page, presumably you want the user seeing it happen, so you should consider implementing it in the browser. For example, you could set up a 30-second JavaScript interval to hit an API endpoint that does some work and returns the value. This avoids any need for Resque at all, and will automatically stop when the user navigates away, even if they lose power and your code doesn't get a chance to clean up.

Is it possible to run a delayed job immediately but still asynchronously?

We are using DelayedJob to run tasks in the background because they could take a while, and also because if an error is thrown, we still want the web request to succeed.
The issue is, sometimes the job could be really big (changing hundreds or thousands of database rows) and sometimes it could be really small (like 5 db rows). In the case of the small ones, we'd still like to have it run as a delayed job so that the error handling can work the same way, but we'd love to not have to wait roughly 5 seconds for DJ to pick up the job.
Is there a way to queue the job so it runs in the background, but then immediately run it so we don't have to wait for the worker to execute 5 seconds later?
Edit: Yes, this is Ruby on Rails :-)
Delayed Job polls the database for new dj records at a set interval. You can reconfigure this interval in an initializer:
# config/initializers/delayed_job.rb
Delayed::Worker.sleep_delay = 2 # or 1 if you're feeling wild.
This will affect DJ globally.
How about:
SomeJob.set(
  wait: 0,
  queue: "queue_name",
).perform_later
