concurrency in delayed_job - ruby-on-rails

I have a Ruby on Rails application and one delayed_job process started with rake jobs:work.
The application adds jobs to multiple queues.
Let's say we have queue1 and queue2.
My question is: will a task in queue1 and a task in queue2 be executed concurrently?
Currently in my application, after running rake jobs:work, only one thread is spawned, which executes the queue1 task and then the queue2 task.
If I want to execute them in parallel, I have to run two instances of the jobs:work rake task.
Is this the correct behaviour, or can they run concurrently within a single jobs:work rake task?
Also, what is a worker in Delayed::Job? Is "delayed job" used interchangeably with "worker"?
Thanks
Priyanka

No, one worker cannot run two jobs concurrently; you need more than one worker process running for that.
In the example you describe, you are starting a worker that runs in the foreground (rake jobs:work). What you could do instead is start the workers in the background by running bin/delayed_job (script/delayed_job in earlier versions). That command has multiple options you can use to specify how you want delayed_job to function.
One of the options is -n or --number_of_workers=workers. That means you can start two workers by running the following command:
bundle exec bin/delayed_job --number_of_workers=2 start
It is also possible to dedicate certain workers to only run jobs from a specific queue, or only high priority jobs.
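For example, newer versions of delayed_job offer a --pool option for this (a sketch; the queue names and worker counts here are illustrative):
bundle exec bin/delayed_job --pool=queue1 --pool=queue2:2 start
This starts one worker dedicated to queue1 and two workers dedicated to queue2.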

Related

Is there a way to tell a sleeping delayed job worker to process the queue from Rails code?

I'm using the delayed_job library as the adapter for Active Job in Rails:
config.active_job.queue_adapter = :delayed_job
My delayed_job worker is configured to sleep for 60 seconds before checking the queue for new jobs:
Delayed::Worker.sleep_delay = 60
In my Rails code I add a new job to the queue like this:
MyJob.perform_later(data)
However, the job will not be picked up by the delayed_job worker immediately, even if there are no jobs in the queue, because of the sleep_delay. Is there a way to tell the delayed_job worker to wake up and start processing the job queue if it's sleeping?
There is a MyJob.perform_now method, but it blocks the thread, so it's not what I want because I want to execute a job asynchronously.
Looking at the delayed_job code, it appears that there's no way to directly control or communicate with the workers after they are daemonized.
I think the best you can do would be to start a separate worker with a small sleep_delay that only reads a specific queue, then use that queue for these jobs. A separate command is necessary because you can't start a worker pool where the workers have different sleep delays:
Start your main worker: bin/delayed_job start
Start your fast worker: bin/delayed_job start --sleep-delay=5 --queue=fast --identifier=999 (the identifier is necessary to differentiate the worker daemons)
Update your job to use that queue:
class MyJob < ApplicationJob
  queue_as :fast

  def perform(*args)
    # job logic
  end
end
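If you'd rather not change the class default, Active Job's standard set method also lets you pick the queue per enqueue (data here is the same payload as in the question):
MyJob.set(queue: :fast).perform_later(data)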
Notes:
When you want to stop the workers you'll also have to do it separately: bin/delayed_job stop and bin/delayed_job stop --identifier=999.
This introduces some potential parallelism and extra load on the server when both workers are working at the same time.
The main worker will process jobs from the fast queue too; it's just a matter of which worker grabs the job first. If you don't want that, you need to set up the main worker to only read from the other queue(s) (by default that's only 'default'): bin/delayed_job start --queue=default.

Rails delayed job and docker: adding more workers

I run my Rails app using Docker. Delayed jobs are processed by a single worker that runs in a separate container called worker, where it is started with the command bundle exec rake jobs:work.
I have several types of jobs that I would like to move to a separate queue and create a separate worker for. Or at least have two workers processing tasks.
I tried to run my worker container with env QUEUE=default_queue bundle exec rake jobs:work && env QUEUE=another_queue bundle exec rake jobs:work, but that doesn't work: it doesn't fail, it starts, but jobs aren't processed.
Is there any way to have separate workers in one container? And is that the right approach? Or should I create a separate container for each worker I would ever want to run?
Thanks in advance!
Running the command command1 && command2 results in command2 being executed only after command1 completes (and only if it succeeds). rake jobs:work never terminates, even when it has finished executing all the jobs in the queue, so the second command will never execute.
A single "&" is probably what you're looking for: command1 & command2.
This will run the commands independently in their own processes.
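Adapted to the command from the question, that looks like this (illustrative; as noted below, the delayed_job script is the better option in production):
env QUEUE=default_queue bundle exec rake jobs:work & env QUEUE=another_queue bundle exec rake jobs:work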
You should use the delayed_job script on production, and it's a good idea to put workers of different queues into different containers in case one of the queues contains jobs that use up a lot of resources.
This will start a delayed job worker for the default_queue:
bundle exec script/delayed_job --queue=default_queue start
For Rails 4, it is: bundle exec bin/delayed_job --queue=default_queue start
Check out this answer on the topic: https://stackoverflow.com/a/6814591/6006050
You can also start multiple workers in separate processes using the -n option. This will start 3 workers in separate processes, all picking jobs from the default_queue:
bundle exec script/delayed_job --queue=default_queue -n 3 start
Differences between rake jobs:work and the delayed_job script:
It appears that the only difference is that rake jobs:work starts processing jobs in the foreground, while the delayed_job script creates a daemon which processes jobs in the background. You can use whichever is more suited to your use case.
Check this github issue: https://github.com/collectiveidea/delayed_job/issues/659
Actually, I just came across this problem when scaling delayed_job on Docker.
See this gist for a script that starts delayed_job with arbitrary arguments, listens for SIGTERM, and performs a graceful shutdown of the started jobs on container shutdown. This way you can run as many processes and queues as you want.
https://gist.github.com/jklimke/3fea1e5e7dd7cd8003de7500508364df
#!/bin/bash
# The variable DELAYED_JOB_ARGS contains the arguments for delayed_job, e.g. for defining queues and worker pools.

# Function that is called when the Docker container should stop. It stops the delayed_job processes.
_term() {
  echo "Caught SIGTERM signal! Stopping delayed jobs!"
  # unbind traps
  trap - SIGTERM
  trap - TERM
  trap - SIGINT
  trap - INT
  # stop delayed jobs
  bundle exec "./bin/delayed_job ${DELAYED_JOB_ARGS} stop"
  exit
}

# register handler for selected signals
trap _term SIGTERM
trap _term TERM
trap _term INT
trap _term SIGINT

echo "Starting delayed jobs ... with ARGs \"${DELAYED_JOB_ARGS}\""

# (re)start delayed jobs on script execution
bundle exec "./bin/delayed_job ${DELAYED_JOB_ARGS} restart"

echo "Finished starting delayed jobs... Waiting for SIGTERM / CTRL C"

# Sleep forever until exit. Running sleep in the background and waiting on it
# keeps the shell responsive: `wait` returns as soon as a trapped signal arrives,
# whereas a foreground sleep would delay the trap until the sleep finishes.
while true; do sleep 86400 & wait $!; done
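A usage sketch (the image name is a placeholder and the pool arguments are illustrative assumptions): use the script as the worker container's entrypoint and pass the desired worker layout via the environment:
docker run -e DELAYED_JOB_ARGS="--pool=default_queue --pool=another_queue:2" my-rails-worker-image
The container then supervises its delayed_job daemons and shuts them down cleanly on docker stop.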

Ruby on Rails batch processing

I am working on a Rails app that runs regularly scheduled Sidekiq jobs, and I understand how queues and background jobs work. I'm working with a 3rd party that requests that I batch jobs to them so that each worker handles one job at a time, with 50 workers running in parallel.
I've been researching this for hours, but I'm unclear on how to do it and how to tell whether it's actually working. Currently, my Procfile looks like this:
web: bundle exec unicorn -p $PORT -c ./config/unicorn.rb
worker: bundle exec sidekiq -C ./config/sidekiq.yml
Is it as simple as increasing the concurrency to -c 50 on the worker line? Or do I need to use ConnectionPool inside the worker class? The Rails docs say that find_each is "useful if you want multiple workers dealing with the same processing queue." If I run find_each inside the rake task and call the worker once for each item, will the jobs run in parallel? I read one article saying that concurrency and parallelism are often confused, so I am, in turn, a little confused about which direction to take.
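For reference, Sidekiq's per-process concurrency can be set with the -c flag or in the YAML file the worker line already loads (a sketch; the value and queue names are illustrative):
# config/sidekiq.yml
:concurrency: 50
:queues:
  - default
Each Sidekiq process then runs up to 50 jobs concurrently on separate threads; on MRI, true CPU parallelism additionally requires multiple processes.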

delayed_job rake task parameters and concurrency

The documentation states that a delayed job worker can be invoked using a rake task like so: rake jobs:work, or QUEUE=queue1 rake jobs:work if you want it to work on a specific queue.
I have a couple of questions about this way to invoke jobs:
Is there a way to pass other parameters like sleep-delay or read-ahead (like you would do if you start the worker using the script: delayed_job start --sleep-delay 30 --read-ahead 500 --queue=queue1)?
Is there any gain in processing speed if you launch 2 workers on the same queue using the rake task?
In answer to question 1: yes, you can set the sleep delay and read ahead from the command line. You do it via environment variables:
QUEUE=queue1 SLEEP_DELAY=1 rake jobs:work
for example. See this commit.
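A fuller example matching the script flags from the question (this assumes a delayed_job version that includes the READ_AHEAD variable from that commit):
QUEUE=queue1 SLEEP_DELAY=30 READ_AHEAD=500 rake jobs:work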
As for question 2: rake jobs:work is just a means to an end to bring up another worker, for development purposes or to work off a big queue (though you have rake jobs:workoff for that), so all the benefits and disclaimers of multiple workers apply:
two jobs process in parallel, so if you've got the CPU power, your queue will be worked off quicker.
I don't know about question 1 though; it's possible the rake task wasn't intended to be used outside of development.

How do I assign one worker to a queue?

In Resque, if I have the following:
bundle exec rake resque:work QUEUE=critical,high,normal,low,aggregate
How do I indicate that I only want one, and ONLY ONE worker, for a specific queue (i.e. the aggregate queue)?
I don't think that is possible.
The reason: if you look at the current code here, Resque polls all of the queues:
value = @redis.blpop(*(queue_names + [1])) until value
where queue_names is your critical,high,normal,low,aggregate.
So the point here is that, irrespective of whether you use a single Resque worker or multiple workers via resque:workers, each worker polls all of the available queues, and once a message is consumed from any of the queues, the worker starts acting upon it.
If you want to assign only one worker to the aggregate queue, it would be better to run two rake tasks like this:
bundle exec rake resque:workers COUNT=4 QUEUE=critical,high,normal,low
bundle exec rake resque:work QUEUE=aggregate
This way you can assign a single Resque worker to the aggregate queue.
Hope this helps.
