Long running schedule job - ruby-on-rails

I am new to RoR and wanted to confirm something. If I run a long scheduled job, will it block other scheduled jobs? I have other jobs running every 5 minutes, and I plan to write one that could easily run for more than 3 hours. Will it block the 5-minute jobs?

The whenever gem is basically only a way to configure and handle Cron jobs.
That said: at the configured time, Cron simply starts and runs the configured job. Cron does not block other jobs, nor does it care whether a job fails or whether another job is still running.
Limiting factors might be:
Memory/CPU consumption: Each job consumes memory and CPU. If too many jobs run at the same time, your server might run out of memory or come under high load. This doesn't really block other jobs, though; it just slows down the whole server.
Database locks: If your jobs perform tasks that lock database tables, other queries might be blocked and have to wait. This is not Cron-specific; it depends on what your code actually does.
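To illustrate the point that each job runs as its own process, here is a minimal sketch that uses only Ruby's standard library. It does not involve Cron or the whenever gem: start_job and short_job_finishes_first? are hypothetical helper names, and sleep stands in for real work.

```ruby
require "rbconfig"

# Start a command as its own OS process, roughly the way Cron launches an entry.
# The child just sleeps to simulate work of the given length.
def start_job(seconds)
  Process.spawn(RbConfig.ruby, "-e", "sleep #{seconds}")
end

# Launch a long job first, then a short one, and report whether the short job
# finished while the long one was still running.
def short_job_finishes_first?(long_seconds: 2.0, short_seconds: 0.1)
  long_pid  = start_job(long_seconds)
  short_pid = start_job(short_seconds)

  Process.wait(short_pid)

  # With WNOHANG, waitpid returns nil if the child has not exited yet.
  long_still_running = Process.waitpid(long_pid, Process::WNOHANG).nil?
  Process.wait(long_pid) if long_still_running
  long_still_running
end
```

On an otherwise idle machine the short job should finish well before the long one, because neither process waits for the other.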

Related

how to continuously deploy with long running jobs

We currently use delayed_job and rails to manage some long running jobs in our system. Some of these jobs take potentially hours to run, but we also like to deploy rather frequently, often many times a day. The problem with this setup is that we have to restart delayed_job during deployment to pick up code changes, so that any new jobs are processed with the latest code.
The solution we've arrived at is that for any job that needs to run for more than some small amount of time, we fork the delayed job so that it returns immediately, and the forked process handles the work. This way a deploy can restart all the delayed job processes, while the long-running 'job' keeps going until it's finished as an orphaned process.
We've looked at sidekiq, but it looks like we'd have the same issue there when trying to deploy new code.
Has anyone developed a solution they would recommend for dealing with long-running background processes that span multiple deployments?
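The forking approach described above might look roughly like the sketch below. It is an assumption about the general shape, not the poster's actual code: run_detached is a hypothetical helper, and in a Rails app the child would also need to re-establish its own database connection before doing any work.

```ruby
# Run the given block in a forked child so the caller can return immediately.
def run_detached(&work)
  pid = fork do
    Process.setsid # start a new session so the child is not tied to the worker's process group
    work.call
    exit!(0)       # leave without running at_exit hooks inherited from the parent
  end
  Process.detach(pid) # reap the child in a background thread to avoid a zombie
  pid
end
```

The calling job returns as soon as run_detached does, so restarting the worker during a deploy does not interrupt the child. The trade-off is that nothing supervises or retries the orphaned process if it fails.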

How can I configure Delayed jobs to not wait for a task before starting the others?

I am using Delayed jobs for my Ruby app hosted in Heroku to perform a very long task that can take up to 5 minutes.
I've noticed that, in development mode at least, when this task is running the ones that come afterwards are not started until that one finishes. I would like other tasks to be able to start running without having to wait for the other to finish (to have at least 3 concurrent tasks, for example).
I don't wish to increase the number of workers in Heroku ($$$).
I noticed the 'pool' param in delayed jobs but I don't fully understand if this is what I need or how to use it.
https://github.com/collectiveidea/delayed_job/blob/master/README.md
I achieved it using threads in the task code, but maybe this is not the best way to do it.
If you could tell me exactly how I could achieve concurrency in delayed jobs I would really appreciate it.
A DJ worker only runs a single job at a time. If you want concurrent processing of your background jobs, you'll need multiple background workers.
You are way better off implementing sidekiq.
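As a rough illustration of why a single worker serializes jobs, here is a small simulation in plain Ruby. It is not Delayed Job's API: threads stand in for separate worker processes, sleep stands in for job work, and run_with_workers is a hypothetical name.

```ruby
# Process every job in job_durations using worker_count workers.
# Each worker takes one job at a time, as a Delayed Job worker does.
# Returns the total wall-clock time in seconds.
def run_with_workers(job_durations, worker_count)
  queue = Queue.new
  job_durations.each { |duration| queue << duration }
  queue.close # once closed and empty, pop returns nil and the workers stop

  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  workers = Array.new(worker_count) do
    Thread.new do
      while (duration = queue.pop)
        sleep(duration) # simulated job
      end
    end
  end
  workers.each(&:join)
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end
```

With three jobs of 0.2 seconds each, one worker needs about 0.6 seconds in total, while three workers need about 0.2 seconds.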

If I use Heroku scheduler, do I still need delayed job?

I'm a little confused about this. I have a couple of tasks that I would like to run asynchronously, for example my inventory sync integration. For this I have implemented delayed job, but I realize that I need to run rake jobs:work on Heroku for this. I can use the Heroku scheduler to run this rake task every 10 minutes. My question is: if I create rake tasks to run, e.g., my inventory sync method, do I still need delayed job? My understanding is that Heroku scheduler kicks off 'one-off dynos'.
Instead of using delayed job, could I not just kick off the sync method directly since a separate dyno is used anyway? What is the added value of delayed job here?
Heroku's Scheduler replaces what cron would handle on a typical server. Delayed Job or Sidekiq are for processing jobs asynchronously from your app, not a timed schedule.
The reason you use a worker & run these jobs on the back-end is so that your server can return a response as soon as is possible rather than making the user wait for some potentially unnecessarily long running process to finish (lots of queries, outbound e-mail, external API requests, etc.).
For example, the scheduler can run analytics or updates from a script every hour or day, whereas Delayed Job on its own does not run jobs on a recurring schedule.

How can I speed up my Rails DelayedJobs time to start?

I am using Rails / Delayed Jobs, with Rails 4.1 on Heroku. I've noticed that my jobs take anywhere from 1 second to 10 seconds to actually start. Once they start, they run pretty fast.
How can I speed that up?
I am calling them with my_thing.delay.run!
I do have some other ongoing jobs, but there's not that many of them, so it doesn't seem like that would be the cause. It just seems like a lagginess in how often it checks to run jobs.
I think you want to configure Delayed::Worker.sleep_delay which is hinted at in the delayed job README. If delayed job can't find a job then it sleeps for this many seconds before looking again. The default sleep is for 5 seconds.
So, you might set the following in config/initializers/delayed_job.rb to only sleep for 2 seconds between queries for pending jobs.
Delayed::Worker.sleep_delay = 2
Obviously the trade-off is the more frequent polling for jobs when nothing is going on.
Also, if you're not wedded to delayed job then you might find resque or particularly sidekiq would probably process your jobs quicker than delayed job.
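To see how sleep_delay relates to start latency, consider an idealized model: an idle worker polls at times 0, d, 2d, and so on, where d is the sleep delay, and a job enqueued between polls waits for the next one. The sketch below computes that wait. It ignores query time and other running jobs, and start_latency is a hypothetical helper, not part of Delayed Job.

```ruby
# Seconds a job waits before an idle worker picks it up, assuming the worker
# polls every sleep_delay seconds starting at time 0 and the job is enqueued
# at time arrival (both in seconds).
def start_latency(arrival, sleep_delay)
  next_poll = arrival.fdiv(sleep_delay).ceil * sleep_delay
  next_poll - arrival
end

start_latency(0.5, 5.0) # 4.5 seconds with the default delay
start_latency(0.5, 2.0) # 1.5 seconds with sleep_delay = 2
```

In this model the worst-case wait is just under one full sleep_delay, which is why lowering the value shortens the lag.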

Should the resque-scheduler queue be expected to handle /lots/ of delayed jobs?

I am currently using resque and resque-scheduler in an application that will have to handle a lot of recurring jobs - "do this every hour", "do this every day" etc. At the moment, I simply queue up the next run of the job in the job itself, the HourlyJob queue has a .enqueue_at(1.hour.from_now, HourlyJob) etc.
Should I be doing this? It "feels" like I should have a static recurring job using resque-scheduler's cron-type functionality that then schedules, say, the next 5 minutes' worth of delayed jobs... but all I am really doing is moving the work from the (probably fast, Redis-based) resque-scheduler to my (probably less well implemented, MySQL-based) code, surely?
Is there anything wrong with how I'm doing it now?
I'd personally use the cron style provided by resque-scheduler, your use case is exactly what it was built for:
You more directly indicate that these are recurring jobs.
Everything is located in the same YAML file rather than in multiple job classes/modules.
By queuing the next run of the job inside the actual job:
You run the risk of the next run going missing when your worker/job/server fails.
You're needlessly using more memory in Redis; the scheduler process will not add the jobs to Redis until they're ready to be run.
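A cron-style entry for the hourly case could look like the sketch below. The HourlyJob class name comes from the question; the hourly_job key, the queue name and the description are illustrative assumptions. The YAML is parsed here with the standard library only to show its shape. In an application, the resque-scheduler README loads such a file with Resque.schedule = YAML.load_file(...).

```ruby
require "yaml"

# Hypothetical schedule file contents, one entry per recurring job.
SCHEDULE_YAML = <<~YAML
  hourly_job:
    cron: "0 * * * *"
    class: "HourlyJob"
    queue: hourly
    description: "Runs at the top of every hour"
YAML

schedule = YAML.safe_load(SCHEDULE_YAML)

schedule.each do |name, config|
  puts "#{name}: #{config['class']} on '#{config['queue']}' at #{config['cron']}"
end
```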
Hope this helps.

Resources