I am looking for a way to be able run Active Job serially. Ideally, a long running Job 1 is scheduled to run at a certain time. A similarly running Job 2 is slated to run only after Job 1 completes. Job 3 then waits for Job 2 to run to completion before it starts and so on.
I have to admit that I am rather new to background jobs in Rails but I am already using Active Job with Sidekiq as the job runner for simple fire-and-forget tasks.
I like Active Job because it provides a simple enough interface to dive almost immediately into background jobs processing. I can use Sidekiq without having to define workers, for example.
For reference, I have achieved something similar but it was on .NET using the excellent Hangfire library which has continuations where you pass the ID of a parent job ensuring that the job will run only after the parent job has successfully completed.
It would be nice to have something as clean and simple as that using Sidekiq and Active Job but really any alternative ways to achieve the same thing are welcome. It doesn't have to be Sidekiq and Active Job.
The most straightforward way to to this is to call a third job from within a second job, and the second one from within a first job
Related
Till today I used Quartz.net for a single job or different jobs that worked independently.. now I need to run 2 different jobs based on some data I've. I've created 3 jobs for now and called them one after the other (am doing ugly things I won't write here to run them one after the other...)
I was wondering if I use a queue, shared between the jobs, (or even 3 background worker that works on different queue) can I have this working with Quartz.net?
I mean I have the main job run, the other works on their on queue, but the main job is not scheduled until the worker is finished?
I've just put on the main job the [DisallowConcurrentExecution] attribute
Thanks
I am using Delayed jobs for my Ruby app hosted in Heroku to perform a very long task that can take up to 5 minutes.
I've noticed that, in development mode at least, when this task is running the ones that come afterwards are not started until that one finishes. I would like other tasks to be able to start running without having to wait for the other to finish (to have at least 3 concurrent tasks, for example).
I don't wish to increase the number of workers in Heroku ($$$).
I noticed the 'pool' param in delayed jobs but I don't fully understand if this is what I need or how to use it.
https://github.com/collectiveidea/delayed_job/blob/master/README.md
I achieved it using threads in the task code, but maybe this is not the best way to do it.
If you could tell me exactly how I could achieve concurrency in delayed jobs I would really appreciate it.
A DJ worker only runs a single job at a time. If you want concurrent processing of your background jobs, you'll need multiple background workers.
You are way better off implementing sidekiq.
I have two jobs, consider them to be the super simple jobs that just print a line and have no triggers or timeouts defines. They work fine when I call them from a controller class through: <name of my class>Job.triggerNow()
What I want is to trigger one job and, as it as it finishes, trigger a consequent different job.
I have tried using the quartzScheduler, but I can't seem to get a JobDetail from my job classes, so I'm not sure what is the correct way for doing this. I also want to pass some results from the first job onto the second one.
I know I can trigger the second job as the last line on my first job's execute method, but this is not desirable since its technically not part of the first job and couples things more than I would like.
Any help will be greatly appreciated. thanks
What it sounds like you are after is an asynchronous "pipeline" of work where there are different workers that are all in a line and pass data to be worked on from one to the next. This sort of architecture is amazingly flexible and applies to a large number of very common applications
The best way that I have found to get such an architecture in place with Grails is to use a message queue, like RabbitMQ for example, with a series of queues (one for each step in the pipeline), and then have the controller(s) put messages into the first step of the pipeline.
Then, you have a worker (just a service within the Grails app if you use the excellent RabbitMQ Grails plugin) listen to the queue that holds jobs for them to work on. As work comes into the queue, the worker will pop the job off, processes it, and then put a message into the queue of the next step in the pipeline.
I've found this to be the best way to architect just about any asynchronous pipeline, since it allows you to scale each piece separately as needed and doesn't have too much overhead. There are also ways to decouple the jobs from having to know about the next step in the pipeline, but I've found that in most cases this isn't really needed and just adds useless complexity.
Quartz is great for jobs that need to happen on a schedule, but a pipeline is much better at processing things as it comes in in a scaleable way
Please have a look #
JobListener
You can utilize
public void jobWasExecuted(JobExecutionContext context,
JobExecutionException jobException);
I built something similar to this in my web application using queue messaging technique with Redis. I simply define the dependency structure for all the jobs, and have a master job with the only purpose is to monitor/update the status of other jobs and trigger dependent jobs if needed.
Each job will have to report its status running/finish/cancel using the Redis queue. Master job pop each queue message and process it properly.
I am currently using resque and resque-scheduler in an application that will have to handle a lot of recurring jobs - "do this every hour", "do this every day" etc. At the moment, I simply queue up the next run of the job in the job itself, the HourlyJob queue has a .enqueue_at(1.hour.from_now, HourlyJob) etc.
Should I be doing this? It "feels" like I should have a static recurring job using resque-schedulers cron-type functionality that then schedules up say the next 5 minutes worth of delayed jobs... but all I am really doing is moving the work from the (probably fast, redis based) resque-scheduler to my (probably less well implemented, mysql based) code, surely?
Is there anything wrong with how I'm doing it now?
I'd personally use the cron style provided by resque-scheduler, your use case is exactly what it was built for:
Your more directly indicate these are recurring jobs.
Everything is located in the same YAML file rather then multiple job classes/modules.
By queuing the next run of the job inside the actual job:
You run the risk of the next run going missing when your worker/job/server fails.
Your needlessly using more memory in Redis, the scheduler process will not add the jobs to Redis until there ready to be run.
Hops this helps.
I'm very happy with By so far, only I have this one issue:
When one process takes 1 or 2 hours to complete, all other jobs in the queue seem to wait for that one job to finish. Worse still is when uploading to a server which time's out regularly.
My question: is Bj running jobs in parallel or one after another?
Thank you,
Damir
BackgroundJob will only allow one worker to run per webserver instance. This is by design to keep things simple. Here is a quote from Bj's README:
If one ignores platform specific details the design of Bj is quite simple: the
main Rails application submits jobs to table, stored in the database. The act
of submitting triggers exactly one of two things to occur:
1) a new long running background runner to be started
2) an existing background runner to be signaled
The background runner refuses to run two copies of itself for a given
hostname/rails_env combination. For example you may only have one background
runner processing jobs on localhost in development mode.
The background runner, under normal circumstances, is managed by Bj itself -
you need do nothing to start, monitor, or stop it - it just works. However,
some people will prefer manage their own background process, see 'External
Runner' section below for more on this.
The runner simply processes each job in a highest priority oldest-in fashion,
capturing stdout, stderr, exit_status, etc. and storing the information back
into the database while logging it's actions. When there are no jobs to run
the runner goes to sleep for 42 seconds; however this sleep is interuptable,
such as when the runner is signaled that a new job has been submitted so,
under normal circumstances there will be zero lag between job submission and
job running for an empty queue.
You can learn more on the github page: Here