If I use Heroku scheduler, do I still need delayed job? - ruby-on-rails

I'm a little confused about this. I have a couple of tasks that I would like to run asynchronously, for example my inventory sync integration. For this I have implemented delayed job, but I realize that I need to run rake jobs:work on Heroku for this. I can use the Heroku scheduler to run this rake task every 10 minutes. My question is: if I create rake tasks to run, e.g., my inventory sync method, do I still need delayed job? My understanding is that Heroku scheduler kicks off 'one-off dynos'.
Instead of using delayed job, could I not just kick off the sync method directly since a separate dyno is used anyway? What is the added value of delayed job here?

Heroku's Scheduler replaces what cron would handle on a typical server. Delayed Job and Sidekiq are for processing jobs asynchronously from your app, not on a timed schedule.
The reason you use a worker and run these jobs in the background is so that your server can return a response as soon as possible, rather than making the user wait for a potentially long-running process to finish (lots of queries, outbound e-mail, external API requests, etc.).
For example, Scheduler can run analytics or updates from a script every hour or day, but Delayed Job cannot.
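To make the "return a response as soon as possible" point concrete, here is a minimal sketch using only Ruby's standard library. The `BackgroundQueue` class is a hypothetical stand-in for a job backend such as Delayed Job, not its real API, and the 0.1 second sleep stands in for a slow inventory sync.

```ruby
# Hypothetical stand-in for a job backend: a thread-safe queue drained
# by a single worker thread.
class BackgroundQueue
  def initialize
    @jobs = Queue.new
    @results = Queue.new
    @thread = Thread.new { work_loop }
  end

  # Called from the "request": stores the job and returns immediately.
  def enqueue(sku)
    @jobs << sku
    "202 Accepted"
  end

  # Stops the worker after it has drained the queue and returns what it did.
  def shutdown
    @jobs << :stop
    @thread.join
    Array.new(@results.size) { @results.pop }
  end

  private

  # Worker loop: pops jobs until it sees the :stop sentinel.
  def work_loop
    while (job = @jobs.pop) != :stop
      sleep 0.1 # simulate a slow external API call
      @results << "synced #{job}"
    end
  end
end
```

Calling `enqueue` returns before the slow work has run; the work only shows up in the result of `shutdown`. A scheduler is a separate concern: it decides when something starts, not whether the caller has to wait for it.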

Related

Long running schedule job

I'm new to RoR and wanted to ask something for confirmation. If I run a long scheduled job, will it block other scheduled jobs? I have other jobs running every 5 minutes, and I plan to write something that can easily run for more than 3 hours. Will it block the 5-minute jobs?
The whenever gem is basically only a way to configure and handle Cron jobs.
That said: at the given time, Cron will just start and run a configured job. Cron will not block other jobs, nor does it care if a job fails or if another job is still running.
Limiting factors might be:
Memory/CPU consumption: Each job consumes memory/CPU. If there are too many jobs running at the same time, your server might run out of memory or come under high load. But this doesn't really block other jobs; it just slows down the whole server.
Database locks: If your jobs perform tasks that lock database tables, other queries might be blocked and need to wait. But this is not Cron-specific; it depends on what your code actually does.
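Since Cron does not check whether a previous run is still going, a long job that must not overlap with itself has to guard against that on its own. One possible sketch, assuming a Unix-like system and a lock file path of your choosing (`run_exclusively` is a hypothetical helper name):

```ruby
# Runs the block only if no other run currently holds the lock file.
# Returns :ran after the block finishes, or :skipped if the lock is taken.
def run_exclusively(lock_path)
  File.open(lock_path, File::RDWR | File::CREAT, 0o644) do |file|
    # LOCK_NB makes flock return false instead of waiting for the lock.
    return :skipped unless file.flock(File::LOCK_EX | File::LOCK_NB)

    yield
    :ran
  end
end
```

The lock is released when the file is closed, so a crashed run does not leave a stale lock behind. Other jobs that use a different lock file (or none) are unaffected.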

how to continuously deploy with long running jobs

We currently use delayed_job and rails to manage some long running jobs in our system. Some of these jobs take potentially hours to run, but we also like to deploy rather frequently, often many times a day. The problem with this setup is that we have to restart delayed_job during deployment to pick up code changes, so that any new jobs are processed with the latest code.
The solution we've arrived at is that for any job that needs to run for more than some small amount of time, we fork the delayed job so that it returns immediately, and the forked process handles the work. This way a deploy can restart all the delayed job processes, while the long-running 'job' keeps going until it's finished as an orphaned process.
We've looked at sidekiq, but it looks like we'd have the same issue there when trying to deploy new code.
Has anyone developed a solution they would recommend for dealing with long-running background processes that span multiple deployments?
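The fork approach described in this question might look roughly like the sketch below. It is not the asker's actual code: `run_in_forked_process` and `result_path` are hypothetical names, the sleep stands in for the long-running work, and `fork` is only available on Unix-like systems.

```ruby
# Starts the slow work in a child process and returns the child's pid at once,
# so the calling job can finish immediately.
def run_in_forked_process(result_path)
  pid = fork do
    sleep 0.2 # hypothetical stand-in for hours of work
    File.write(result_path, "finished")
    exit!(0) # leave without running at_exit hooks inherited from the parent
  end
  Process.detach(pid) # reap the child when it exits so it does not linger as a zombie
  pid
end
```

In a Rails app the child would most likely also need to open its own database connection rather than reuse the parent's, and whether it survives a worker restart depends on how the deploy signals the process group; both are assumptions to verify in your own setup.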

How can I configure Delayed jobs to not wait for a task before starting the others?

I am using Delayed jobs for my Ruby app hosted in Heroku to perform a very long task that can take up to 5 minutes.
I've noticed that, in development mode at least, when this task is running the ones that come afterwards are not started until that one finishes. I would like other tasks to be able to start running without having to wait for the other to finish (to have at least 3 concurrent tasks, for example).
I don't wish to increase the number of workers in Heroku ($$$).
I noticed the 'pool' param in delayed jobs but I don't fully understand if this is what I need or how to use it.
https://github.com/collectiveidea/delayed_job/blob/master/README.md
I achieved it using threads in the task code, but maybe this is not the best way to do it.
If you could tell me exactly how I could achieve concurrency in delayed jobs I would really appreciate it.
A DJ worker only runs a single job at a time. If you want concurrent processing of your background jobs, you'll need multiple background workers.
You are way better off implementing sidekiq.
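For reference, the "threads in the task code" workaround mentioned in the question can be sketched with the standard library alone. `process_concurrently` is a hypothetical helper, not part of Delayed Job; it runs inside a single job and spreads the items over a small, fixed number of threads.

```ruby
# Processes items on a fixed-size pool of threads and returns the block's
# results (in completion order, not input order).
def process_concurrently(items, pool_size: 3)
  queue = Queue.new
  items.each { |item| queue << item }
  results = Queue.new

  workers = Array.new(pool_size) do
    Thread.new do
      loop do
        item = begin
          queue.pop(true) # non-blocking pop raises ThreadError when empty
        rescue ThreadError
          break
        end
        results << yield(item)
      end
    end
  end

  workers.each(&:join)
  Array.new(results.size) { results.pop }
end
```

On MRI this mainly helps with I/O-bound work such as HTTP calls or database waits; CPU-bound work is still serialized by the global VM lock, which is one reason separate worker processes are the usual answer.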

What is the best way to run a long task on Heroku with Ruby On Rails?

I am looking for the best way to run a very long task in Heroku.
I use Ruby on Rails for my web application and I have a very long task that I want to run every week on Sunday night. It takes around 15 to 20 minutes. I already have Rufus-Scheduler, but I am not sure it is the most effective solution.
I also found something about backgrounding tasks in Heroku with Delayed Job. But is it the best way to handle it?
Thanks.
This is what I use for a job that I run every night: https://devcenter.heroku.com/articles/scheduler
It works really well if your job is configured as a rake task. The guide at the link shows you how to configure everything and even addresses long-running jobs.
Heroku does not recommend running long-running jobs with Heroku Scheduler.
Heroku says,
Scheduled jobs are meant to execute short running tasks or enqueue longer running tasks into a background job queue. Anything that takes longer than a couple of minutes to complete should use a worker dyno to run.
So, in my opinion, the best approach would be to use Heroku Scheduler to run a rake task for every kind of job (short or long), but if a task takes longer than a couple of minutes, I would simply enqueue a background job from within that rake task. That way the scheduled task itself will never run longer than a couple of minutes.

Should the resque-scheduler queue be expected to handle /lots/ of delayed jobs?

I am currently using resque and resque-scheduler in an application that will have to handle a lot of recurring jobs - "do this every hour", "do this every day" etc. At the moment, I simply queue up the next run of the job in the job itself, the HourlyJob queue has a .enqueue_at(1.hour.from_now, HourlyJob) etc.
Should I be doing this? It "feels" like I should have a static recurring job using resque-scheduler's cron-type functionality that then schedules, say, the next 5 minutes' worth of delayed jobs... but all I am really doing is moving the work from the (probably fast, redis based) resque-scheduler to my (probably less well implemented, mysql based) code, surely?
Is there anything wrong with how I'm doing it now?
I'd personally use the cron style provided by resque-scheduler, your use case is exactly what it was built for:
You more directly indicate that these are recurring jobs.
Everything is located in the same YAML file rather than in multiple job classes/modules.
By queuing the next run of the job inside the actual job:
You run the risk of the next run going missing when your worker/job/server fails.
You're needlessly using more memory in Redis; with the cron style, the scheduler process will not add the jobs to Redis until they're ready to be run.
Hope this helps.
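The "next run going missing" risk can be illustrated with a toy simulation in plain Ruby. Nothing here uses resque or resque-scheduler; both method names are hypothetical, and the model assumes the job re-enqueues itself as its last step.

```ruby
# Self-rescheduling: each run enqueues the next one as its final step,
# so a crash before that step ends the chain.
def completed_runs_self_rescheduling(hours, failing_hour)
  pending = [0] # hours for which a run is currently queued
  completed = []
  until pending.empty?
    hour = pending.shift
    begin
      raise "worker crashed" if hour == failing_hour
      completed << hour
      pending << hour + 1 if hour + 1 < hours # only reached on success
    rescue RuntimeError
      # The run is lost and nothing re-enqueues the next one.
    end
  end
  completed
end

# Static schedule: an external scheduler enqueues every run independently,
# so one failed run does not affect the following ones.
def completed_runs_static_schedule(hours, failing_hour)
  completed = []
  hours.times do |hour|
    begin
      raise "worker crashed" if hour == failing_hour
      completed << hour
    rescue RuntimeError
      # Only this run is lost.
    end
  end
  completed
end
```

With 5 hourly runs and a crash at hour 2, the self-rescheduling chain completes only `[0, 1]`, while the static schedule completes `[0, 1, 3, 4]`.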
