How to run jobs longer than 24 hours on Heroku? - ruby-on-rails

How can we run background jobs that take longer than 24 hours on Heroku? Since every dyno is killed once a day, it seems impossible, even if I have a dedicated worker dyno to handle it. Is writing my job so that it continues from where it stopped when it was killed the only way?
Thanks,
Michal

Ideally you want to break that long job up into a number of small jobs that can be handled independently. What exactly are you doing that takes more than 24 hours?
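A minimal sketch of that idea, assuming the work can be expressed as a list of independent items. The names process_in_chunks, CHUNK_SIZE and checkpoint are hypothetical, and the checkpoint hash stands in for durable storage such as a database row. Progress is saved after every chunk, so a worker restarted by Heroku resumes instead of starting over.

```ruby
# Hypothetical sketch: process a long list in small chunks and record progress
# after each one. `checkpoint` stands in for durable storage (database row,
# Redis key, ...); a plain Hash is used here only to keep the example runnable.

CHUNK_SIZE = 3 # assumed chunk size; tune to how long one item takes

def process_in_chunks(items, checkpoint, results)
  start = checkpoint.fetch(:next_index, 0)      # resume point, 0 on first run
  items.drop(start).each_slice(CHUNK_SIZE) do |chunk|
    chunk.each { |item| results << item * 2 }   # placeholder for the real work
    start += chunk.size
    checkpoint[:next_index] = start             # persist progress per chunk
    yield start if block_given?                 # hook, e.g. to simulate a restart
  end
  results
end
```

In a real application each chunk would more likely be enqueued as its own small job, which is what the answer suggests; the checkpoint variant is useful when the chunks must run in order.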

Related

How can I speed up my Rails DelayedJobs time to start?

I am using Rails / Delayed Jobs, with Rails 4.1 on Heroku. I've noticed that my jobs take anywhere from 1 second to 10 seconds to actually start. Once they start, they run pretty fast.
How can I speed that up?
I am calling them with my_thing.delay.run!
I do have some other ongoing jobs, but there's not that many of them, so it doesn't seem like that would be the cause. It just seems like a lagginess in how often it checks to run jobs.
I think you want to configure Delayed::Worker.sleep_delay which is hinted at in the delayed job README. If delayed job can't find a job then it sleeps for this many seconds before looking again. The default sleep is for 5 seconds.
So, you might set the following in config/initializers/delayed_job.rb to only sleep for 2 seconds between queries for pending jobs.
Delayed::Worker.sleep_delay = 2
Obviously the trade-off is the more frequent polling for jobs when nothing is going on.
Also, if you're not wedded to Delayed Job, you might find that Resque, or particularly Sidekiq, processes your jobs more quickly.

Should I use Delayed Job for long running background tasks?

I have an application where I want to automatically deactivate a user 72 hours after they have been activated. I have set this up with Delayed Job, but am now wondering if that is the best option.
My question is, if I set a task for 72 hours in the future, will a worker be active for that entire 72 hours? (I'm concerned about this as Heroku charges by the hour)
I'm open to suggestions for better ways of doing this. One idea I had was to set this up using an exp_date column and check against that at sign-in, thereby eliminating the need for DJ completely.
Yes, it will be up the whole time. Delayed Job continuously polls the database to see whether there is any job in its queue.
As for the best option, I would rather add a column called valid_upto holding the date until which the user stays active. I would then allow sign-in (or whatever) only for users whose valid_upto date has not yet passed. And periodically, maybe once a month, I would run a cron job to remove the invalid users.
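The expiry-column approach could look roughly like this. It is only a sketch: the User struct, the activate helper and the ACTIVE_PERIOD constant are hypothetical, and it assumes the check compares the current time with valid_upto. In Rails the column would live on the users table and the check would run at sign-in.

```ruby
# Hypothetical sketch: store an expiry timestamp at activation and compare it
# with the current time whenever the user signs in. No worker is needed.

ACTIVE_PERIOD = 72 * 60 * 60 # 72 hours, in seconds

User = Struct.new(:name, :valid_upto) do
  # A user counts as active until valid_upto has passed.
  def active?(now = Time.now)
    now <= valid_upto
  end
end

# Assumed helper: activating a user sets the expiry 72 hours ahead.
def activate(name, now = Time.now)
  User.new(name, now + ACTIVE_PERIOD)
end
```

Because the check happens on demand, nothing has to run in the background for the 72 hours; the monthly cleanup is only housekeeping.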
And, as @leesungchul suggested, you can use that; it looks cool.
You can use the workless gem, an add-on for Delayed Job, so you don't leave your worker running constantly on Heroku.
https://github.com/lostboy/workless

Long-running Sidekiq jobs keep dying

I'm using the sidekiq gem to process background jobs in Rails. For some reason, the jobs just hang after a while: the process either becomes unresponsive, showing up in top but not much else, or mysteriously vanishes without errors (nothing is reported to airbrake.io).
Has anyone had experience with this?
Use the TTIN signal to get a backtrace of all threads in the process so you can figure out where the workers are stuck.
https://github.com/mperham/sidekiq/wiki/Signals
I've experienced this, and haven't found a solution/root cause.
I couldn't resolve this cleanly, but came up with a hack.
I configured God to monitor my Sidekiq processes, and to restart them if a file changed.
I then set up a cron job that ran every 5 minutes and checked all the current Sidekiq workers for a queue. If a certain percentage of the workers had a start time more than 5 minutes in the past, it meant those workers had hung for some reason. If that happened, I touched a file, which made God restart Sidekiq. For me, 5 minutes was ideal, but it depends on how long your jobs typically run.
This is the only way I could resolve hanging Sidekiq jobs without manually checking on them every hour, and restarting it myself.
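The check performed by that cron job might be sketched as follows. This is not the poster's actual script: the function names, the 50% ratio and the restart file are assumptions, and the start times are passed in as plain Time values rather than read from Sidekiq.

```ruby
require "fileutils"

# Hypothetical sketch of the watchdog check. `started_ats` is a list of Time
# values saying when each currently busy worker picked up its job.

HUNG_AFTER = 5 * 60 # seconds; the 5-minute threshold from the answer
HUNG_RATIO = 0.5    # assumed fraction; the answer only says "a certain %"

def restart_needed?(started_ats, now = Time.now)
  return false if started_ats.empty?
  hung = started_ats.count { |t| now - t > HUNG_AFTER }
  hung.to_f / started_ats.size >= HUNG_RATIO
end

# Touch the file that God is assumed to watch, but only when needed.
def check_and_signal(started_ats, restart_file, now = Time.now)
  return false unless restart_needed?(started_ats, now)
  FileUtils.touch(restart_file)
  true
end
```

The threshold should be longer than the slowest legitimate job, otherwise healthy workers will be restarted mid-job.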

Email notification when 'updated_at' becomes 2 hours before current time

I'd like to make an email notification if SomeModel has not been updated for 2 hours.
What is the best way to implement it?
After a model has been saved, queue up a background job to run 2 hours from that time to send the email. When a new job is enqueued, remove any still-unrun jobs that are still on the queue.
resque-scheduler provides a pretty simple way of doing this, assuming you have Redis up and running.
Personally I find the solution that @x1a4 proposes to be somewhat overkill. Given the relatively large window of 2 hours, I would just run a job periodically (say, once every 10-15 minutes), then search all Models for updated_at <= 2.hours.ago and send out the emails.
As for scheduling that job to run every 15 minutes, there are several options. You may use resque-scheduler, if you are using Resque. You may also use the standard system cron, but will incur some fairly substantial overhead starting Rails each time the job runs. I also have written a distributed scheduler gem (i.e. cron that can run on multiple machines, but act like it's only running on one), which uses Redis under the hood.
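The periodic check described in this answer might look like the sketch below. It uses plain Ruby objects so it runs on its own; in Rails the selection would more likely be a query such as SomeModel.where("updated_at <= ?", 2.hours.ago). The Record struct and the notified flag are assumptions that are not in the answer; the flag is there so that a record is not emailed again on every run.

```ruby
# Hypothetical sketch of the periodic job: find records that have not been
# updated for two hours and notify once per record.

STALE_AFTER = 2 * 60 * 60 # two hours, in seconds

Record = Struct.new(:id, :updated_at, :notified)

def stale_records(records, now = Time.now)
  cutoff = now - STALE_AFTER
  records.select { |r| r.updated_at <= cutoff && !r.notified }
end

# Yields each stale record to the caller (e.g. to send the email), then marks
# it so that the next run skips it.
def notify_stale(records, now = Time.now)
  stale_records(records, now).each do |r|
    yield r
    r.notified = true
  end
end
```

With a 15-minute schedule, an email goes out at most about 15 minutes after the two-hour mark, which is the trade-off this answer accepts in exchange for simplicity.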

How to prevent backgroundrb from starting multiple copies of the same task?

Say I have a worker that's set up to run every 15 minutes using the cron scheduling feature of backgroundrb. If a single instance of the worker takes longer than 15 minutes to run, I don't want backgroundrb to start a second worker in parallel. How do I achieve that?
Okay, I guess I'll answer this one myself. The trick is to not specify reload_on_schedule true in your worker.
