I'm trying delayed_job now, and have some questions.
From the http://github.com/collectiveidea/delayed_job page, I can see some information:
"Workers can be running on any computer, as long as they have access to the database and their clock is in sync. Keep in mind that each worker will check the database at least every 5 seconds."
1. When I invoke rake jobs:work once, it will create ONE worker, right?
2. When a worker checks the database, it will read ALL new and failed tasks EACH TIME, and run them?
3. It says a worker will check the database every 5 seconds. Can I make it 2 seconds?
4. When I create a worker (rake jobs:work), there are already 10 tasks in the database, and each will take 3 seconds. How many processes will DelayedJob create? And how many seconds will they take in total?
1. Yes.
2. Yes.
3. Delayed::Worker.sleep_delay = 2
4. One worker works through the tasks in turn, passing or failing each one before moving on to the next. That is 30 seconds of work, plus however long 9 sleep delays take (45 seconds by default). I'm not sure how to answer your question about processes: 1 worker is created, which is one process. Zero or more other processes may be created, depending on what each job does.
I want to use Heroku but the fact they restart dynos every 24 hours at random times is making things a bit difficult.
I have a series of jobs dealing with payment processing that are very important, and I want them backed by the database so they're 100% reliable. For this reason, I chose DJ which is slow.
Because I chose DJ, it means that I also can't just push 5,000,000 events to the database at once (1 per each email send).
Because of THAT, I have longer running jobs (send 200,000 text messages over a few hours).
With these longer running jobs, it's more challenging to get them working if they're cut off right in the middle.
It appears Heroku sends SIGTERM and then expects the process to shut down within 30 seconds. This is not going to happen for my longer jobs.
Now I'm not sure how to handle them. The only way I can think of is to update the database immediately after sending each text (for example, setting an sms_sent_at column), but that means I'm destroying database performance with one update per message instead of sending a single update query for every batch.
This would be a lot better if I could schedule restarts. At least then I could do it at night, when I'm 99% unlikely to be running any jobs that take longer than 30 seconds to shut down.
Or, another way: can I 'listen' for SIGTERM within a long-running DJ job and at least abort the loop early so it can resume later?
Manual restarts will reset the 24 hr clock - heroku ps:restart at your preferred time ought to give you the control you are looking for.
More info can be found here: Dynos and the Dyno Manager
Here's the proper answer: you listen for SIGTERM (I'm using DJ here) and then gracefully rescue. It's important that the jobs are idempotent.
Long running delayed_job jobs stay locked after a restart on Heroku
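class WithdrawPaymentsJob
  def perform
    begin
      term_now = false
      # Set a flag on SIGTERM, then pass the signal on to the previous handler (the worker's own).
      old_term_handler = trap('TERM') do
        term_now = true
        old_term_handler.call if old_term_handler.respond_to?(:call)
      end
      loop do
        puts 'doing long running job'
        sleep 1
        # Bail out between iterations so the job can be retried later.
        if term_now
          raise 'Gracefully terminating job early...'
        end
      end
    ensure
      # Restore whatever handler was installed before this job ran.
      trap('TERM', old_term_handler)
    end
  end
end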
Here's how you solve it with Que:
if Que.worker_count.zero?
  raise 'Gracefully terminating job early...'
end
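To tie the SIGTERM flag back to the batching concern in the question, here is a minimal sketch of a loop that only checks for termination between batches and remembers how far it got. BatchSender, deliver, request_stop, and checkpoint are hypothetical names for illustration. In a real job the checkpoint would presumably be persisted with one update per batch rather than kept in memory.

```ruby
# Hypothetical sketch: process items in batches and stop cleanly on SIGTERM.
# BatchSender, deliver, and checkpoint are illustrative names, not part of DJ.
class BatchSender
  attr_reader :checkpoint, :sent

  def initialize(items, batch_size: 100, checkpoint: 0)
    @items = items
    @batch_size = batch_size
    @checkpoint = checkpoint # index of the first item not yet processed
    @sent = []
    @term_requested = false
  end

  # Called from the signal handler (or directly, e.g. in tests).
  def request_stop
    @term_requested = true
  end

  # Returns true if every item was processed, false if stopped early.
  def run
    old_handler = trap('TERM') { request_stop }
    while @checkpoint < @items.length
      # Only stop between batches, so a batch is never left half done.
      return false if @term_requested

      batch = @items[@checkpoint, @batch_size]
      batch.each { |item| deliver(item) }
      # Assumption: a real job would persist this with one update per batch.
      @checkpoint += batch.length
    end
    true
  ensure
    # Restore the handler that was installed before run was called.
    trap('TERM', old_handler) if old_handler
  end

  private

  # Stand-in for the real side effect (sending a text, for instance).
  def deliver(item)
    @sent << item
  end
end
```

A retried job would be constructed with the saved checkpoint so it resumes at the first unprocessed batch, which is why the work inside each batch still needs to be idempotent.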
I'm making an application that needs to run a job at extremely precise intervals of time (say 30 seconds, maximum acceptable delay is +-1 second).
I'm currently doing so using an external Go application that polls an API endpoint built within my application.
Is there a way that I could run the task on a worker machine (eg a Heroku dyno) with delays less than one second?
I've investigated Sidekiq and delayed_job, but both have significant lag and therefore are unsuitable for my application.
Schedule the job for 60 seconds before you need it to run, and pass in the exact time you need it executed as a parameter. Then, inside the job, sleep until Time.now reaches exact_time_down_to_the_second?
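A rough sketch of that idea, assuming the job receives the target as a Time. The sleep_until name and its coarse_margin and step parameters are made up for illustration. It compares with < rather than ==, since Time.now carries sub-second precision and would almost never equal the target exactly.

```ruby
# Sleep until `target` (a Time): one long sleep that stops slightly early,
# then short polling sleeps so the wake-up lands close to the target.
# coarse_margin and step are illustrative defaults, in seconds.
def sleep_until(target, coarse_margin: 0.05, step: 0.001)
  remaining = target - Time.now
  sleep(remaining - coarse_margin) if remaining > coarse_margin
  sleep(step) while Time.now < target
  Time.now # the actual wake-up time, useful for logging the drift
end
```

Inside the job, the real work would start right after sleep_until returns. How close it lands depends on the host's scheduler, so it is worth measuring the drift on the actual worker machine.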
We are using DelayedJob to run tasks in the background because they could take a while, and also because if an error is thrown, we still want the web request to succeed.
The issue is, sometimes the job could be really big (changing hundreds or thousands of database rows) and sometimes it could be really small (like 5 db rows). In the case of the small ones, we'd still like to have it run as a delayed job so that the error handling can work the same way, but we'd love to not have to wait roughly 5 seconds for DJ to pick up the job.
Is there a way to queue the job so it runs in the background, but then immediately run it so we don't have to wait for the worker to execute 5 seconds later?
Edit: Yes, this is Ruby on Rails :-)
Delayed Job polls the database for new dj records at a set interval. You can reconfigure this interval in an initializer:
# config/initializers/delayed_job.rb
Delayed::Worker.sleep_delay = 2 # or 1 if you're feeling wild.
This will affect DJ globally.
How about
SomeJob.set(
  wait: 0,
  queue: "queue_name",
).perform_later
If you have a recurring task that runs once per day, you use a Scheduled Task.
If you have a recurring task that runs every 10 seconds, you use a Service.
At what point do you switch between the two? Is there official guidance on this somewhere?
I'm not sure the interval is the main issue here.
Here are a few things to consider:
How much state does this task need in memory? Do you load data from a file or a DB?
Does the system that needs this task to run need to communicate with the task other than when it's running?
Do you need more control over the process lifecycle when the task is up?
You can see where I'm going with this: a service is a resident entity, and a scheduled task isn't.
I think it depends on whether your program is made for only one task or for more. If it's just doing one "stupid" thing (like running a stored procedure in a database every 20 seconds), I would consider a scheduled task. But if it does more than that and has some dependencies (such as the time of day it runs, or some file operations), I would consider a service.
I would also consider a service if the interval between runs varies. Let's say your program runs a single stored procedure in a database, and the next run depends on whether it made "real" changes to the DB: if it did something, the next run is in 5 seconds, and if not, the next run is in 20 seconds. That's one of the perfect examples for a service.
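The variable-interval case can be sketched as a small resident loop. This is only an illustration in Ruby: next_delay, run_service, and the injectable sleeper are hypothetical names, and the 5 and 20 second values are the ones from the example above.

```ruby
# Pick the pause before the next run: short if the last run changed something.
def next_delay(made_changes, busy_delay: 5, idle_delay: 20)
  made_changes ? busy_delay : idle_delay
end

# Run `task` repeatedly. `task` is any callable that returns true when it
# made real changes. `sleeper` is injectable so the loop can run without
# actually waiting (for example in tests).
def run_service(task, iterations:, sleeper: ->(seconds) { sleep(seconds) })
  delays = []
  iterations.times do
    delay = next_delay(task.call)
    delays << delay
    sleeper.call(delay)
  end
  delays # the pauses that were chosen, in order
end
```

A scheduled task cannot easily express this, because the scheduler fixes the interval up front. The loop has to stay resident to decide its own next wake-up.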
I've got an app that could benefit from delayed_job and some background processing. The thing is, I don't really need/want delayed_job workers running all the time.
The app runs in a shared hosting environment and in multiple locations (for different users). Plus, the app doesn't get a large amount of usage.
Is there a way to start and stop processing jobs (either with the script or rake task) from my app only after certain actions/events?
You could call out to system:
system "cd #{Rails.root} && rake delayed_job:start RAILS_ENV=production"
You could just change delayed_job to check less often too. Instead of the 5 second default, set it to 15 minutes or something.
Yes, you can, but I'm not sure what the benefit will be. You say you don't want workers running all the time - what are your concerns? Memory usage? Database connections?
To keep the impact of delayed_job low on your system, I'd run only one worker, and configure it to sleep most of the time.
Delayed::Worker.sleep_delay = 60 * 5 # in your initializer
A single worker will only wake up and check the db for new jobs every 5 minutes. Running this way keeps you from 'customizing' too much.
But if you really want to start a Delayed::Worker programmatically, look in that class for work_off, and implement your own script/run_jobs_and_exit script. It should probably look much like script/delayed_job does: about 3 lines.
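A minimal sketch of what the core of such a script might do. The run_jobs_and_exit helper and its max_passes guard are hypothetical. It assumes the worker object responds to work_off and returns a [successes, failures] pair, which is how Delayed::Worker#work_off behaves as far as I know.

```ruby
# Hypothetical helper: keep calling work_off until a pass processes nothing,
# then return the totals. `worker` is assumed to respond to work_off and to
# return [successes, failures] for each pass.
def run_jobs_and_exit(worker, max_passes: 1000)
  totals = [0, 0]
  max_passes.times do
    successes, failures = worker.work_off
    break if (successes + failures).zero? # queue is drained for now

    totals[0] += successes
    totals[1] += failures
  end
  totals
end
```

In a real script you would load the Rails environment first and then call something like run_jobs_and_exit(Delayed::Worker.new), so the process exits once the queue is empty instead of sleeping and polling.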
I found this because I was looking for a way to run some background jobs without paying to keep workers running all the time when they aren't needed. Someone made a hack using Google App Engine to run the background jobs:
http://viatropos.com/blog/how-to-run-background-jobs-on-heroku-for-free/
It's a little outdated though. There is an interesting comment in the thread:
"When I need to send an e-mail, copy a file, etc I basically add it to the queue. At the end of every request it checks if there is anything in the queue. If so then it uses the Heroku API to set the worker to 1. At the end of a worker getting a task done it checks to see if there is anything left in the queue. If not then it sets the workers back to 0. The end result is the background worker will just work for a few seconds here and there. I can do all the background processing that I need and the bill at the end of the month rarely ever reaches 1 hour total worth of work. Even if it does no problem, I'll pay $0.05 for background processing. :)"
If you go to stop a worker, you are given the PID. You can simply kill -9 PID if all else fails.