My environment: CentOS, Rails 4, Ruby 2.
I'm running in Delayed Job infinite loop that getting in real time information from some another site.
The task is to run simultaniously many processes to get information from different sites. For each new process I add it to DJ queue by running search_engine.delay.track!.
So when I run one worker it successfully takes first job from queue and when job complete takes next. When I run more than one worker by ./bin/delayed_job -n5 start every DJ worker takes first job in queue and starts to track! it. But I want them to take only jobs, that was not been taked by other workers.
It was because of Delayed::Worker.max_run_time = 0. The DJ thought, that work already done and runs it again.
Related
I new to ROR. Wanted to ask something for confirmation. If I run long schedule job. Will it block others schedule job? I have others job running every 5 minutes, Plan to write something that easily run more than 3 hours. Will it block the 5 minutes job?
The whenever gem is basically only a way to configure and handle Cron jobs.
That said: At the given time Cron will just start and run a configured job. Cron will not block other jobs nor it cares if a job fails or if another job is still running.
Limiting factor might be:
Memory/CPU consumption: Each job consumes memory/CPU. If there are too many jobs running at the same time your server might run out of memory or might have a high load. But this doesn't really block other jobs it just slows down the whole server.
Database locks: If your jobs perform tasks that lock database tables other queries might be blocked and need to wait. But this is not Cron specific, this depends on what your code actually does.
We currently use delayed_job and rails to manage some long running jobs in our system. Some of these jobs take potentially hours to run, but we also like to deploy rather frequently, often many times a day. The problem with this setup is that we have to restart delayed_job during deployment to pick up code changes, so that any new jobs are processed with the latest code.
The solution we've arrived at is that for any job that needs to run for more than some small amount of time, we fork the delayed job so that it returns immediately, and the forked process handles the work. This way a deploy can restart all the delayed job processes, while the long-running 'job' keeps going until it's finished as an orphaned process.
We've looked at sidekiq, but it looks like we'd have the same issue there when trying to deploy new code.
Has anyone developed a solution they would recommend for dealing with long-running background processes that span multiple deployments?
I am using Delayed jobs for my Ruby app hosted in Heroku to perform a very long task that can take up to 5 minutes.
I've noticed that, in development mode at least, when this task is running the ones that come afterwards are not started until that one finishes. I would like other tasks to be able to start running without having to wait for the other to finish (to have at least 3 concurrent tasks, for example).
I don't wish to increase the number of workers in Heroku ($$$).
I noticed the 'pool' param in delayed jobs but I don't fully understand if this is what I need or how to use it.
https://github.com/collectiveidea/delayed_job/blob/master/README.md
I achieved it using threads in the task code, but maybe this is not the best way to do it.
If you could tell me exactly how I could achieve concurrency in delayed jobs I would really appreciate it.
A DJ worker only runs a single job at a time. If you want concurrent processing of your background jobs, you'll need multiple background workers.
You are way better off implementing sidekiq.
How can I manage to execute job after the first job that has executed is done in sidekiq. For example:
I triggered the first job for this morning
GoodWorker.perform_async(params) #=> JID-eetc
while it is still in progress I've executed again a job in the same worker dynamically
GoodWorker.perform_ascyn(params) #=> JID-eetc2
and etc.
What's going on now is Sidekiq processing the jobs all of the time,
is there a way performing the job one at a time?
Short answer: no.
Long answer: You can use a mutex to guarantee that only one instance of a worker is executing at a time. If you're running on a cluster, you'll need to use Redis or some other medium to maintain the mutex. Otherwise, you might try putting these jobs in their own queue, and firing up a separate instance of Sidekiq that only monitors that queue, with a concurrency of one.
Can you not setup Sidekiq to only have one thread? Then only one job will be executed at a time.
I'm very happy with By so far, only I have this one issue:
When one process takes 1 or 2 hours to complete, all other jobs in the queue seem to wait for that one job to finish. Worse still is when uploading to a server which time's out regularly.
My question: is Bj running jobs in parallel or one after another?
Thank you,
Damir
BackgroundJob will only allow one worker to run per webserver instance. This is by design to keep things simple. Here is a quote from Bj's README:
If one ignores platform specific details the design of Bj is quite simple: the
main Rails application submits jobs to table, stored in the database. The act
of submitting triggers exactly one of two things to occur:
1) a new long running background runner to be started
2) an existing background runner to be signaled
The background runner refuses to run two copies of itself for a given
hostname/rails_env combination. For example you may only have one background
runner processing jobs on localhost in development mode.
The background runner, under normal circumstances, is managed by Bj itself -
you need do nothing to start, monitor, or stop it - it just works. However,
some people will prefer manage their own background process, see 'External
Runner' section below for more on this.
The runner simply processes each job in a highest priority oldest-in fashion,
capturing stdout, stderr, exit_status, etc. and storing the information back
into the database while logging it's actions. When there are no jobs to run
the runner goes to sleep for 42 seconds; however this sleep is interuptable,
such as when the runner is signaled that a new job has been submitted so,
under normal circumstances there will be zero lag between job submission and
job running for an empty queue.
You can learn more on the github page: Here