I've got Resque workers that typically shouldn't take longer than about 1-5 minutes to run, but frequently those workers will get "stuck" and go idle, clogging up workers and doing nothing.
So I'd like to regularly check for workers that have been running longer than X time and purge them. But I need to do this automatically, so I don't have to personally go in and manually clear them (Resque.workers.each {|w| w.unregister_worker}) every few hours.
This needs to work on Heroku.
Put this into a rake task:
allocated_time = 60 * 60 # 1 hour
Resque::WorkerRegistry.working.each do |worker|
  if (worker.started <=> Time.now - allocated_time) < 1
    worker.unregister
  end
end
Use Heroku Scheduler; you can set it to run as often as every 10 minutes, if that suits you.
For Resque v1,
# lib/tasks/clear_stale_workers.rake
namespace :clear do
  desc 'Clearing stuck workers ...'
  task :stale_workers => :environment do
    Resque.workers.each do |w|
      w.unregister_worker unless w.started > 1.hour.ago
    end
  end
end
From the command line, run rake clear:stale_workers
On Heroku, set the scheduler to run this Rake task.
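The age check that both tasks above rely on can be tried out without Redis. This is only a sketch: stale? and the Worker struct are hypothetical stand-ins, not part of Resque. It also assumes that, depending on the Resque version, started may come back as a timestamp string rather than a Time, so it parses strings before comparing.

```ruby
require 'time'

# Hypothetical helper (not a Resque API): true when `started` lies more than
# `max_age` seconds before `now`. Accepts a Time or a timestamp string.
def stale?(started, max_age:, now: Time.now)
  started_at = started.is_a?(String) ? Time.parse(started) : started
  started_at < now - max_age
end

# Stand-in for a Resque worker, used only for this illustration.
Worker = Struct.new(:name, :started)

now = Time.utc(2024, 1, 1, 12, 0, 0)
workers = [
  Worker.new('fresh', Time.utc(2024, 1, 1, 11, 58, 0)),
  Worker.new('stuck', '2024-01-01 09:00:00 UTC')
]

# Keep only the workers that have been running for more than one hour.
stale_names = workers.select { |w| stale?(w.started, max_age: 3600, now: now) }.map(&:name)
puts stale_names.inspect
```

In a real task the select would run over the worker list from Resque, and each stale worker would then be unregistered as shown above.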
This worked for me to remove the specific workers running stale jobs. You could add it to a rake task.
Resque::Worker.working.each{|w| w.done_working }
I use a gem called whenever to manage my cron jobs.
In my cron file, I have a cron job that runs every minute and calls a task XXXX. My config/schedule.rb looks like this:
every '* * * * *' do
  rake "XXXXXXXX"
end
This cron job works fine, with a slight delay: task XXXX starts to run its first line a few seconds after the process is created. Since the task finishes in less than 1 minute, I should never have multiple processes running at the same time.
However, when the server is heavily loaded, this delay grows to a few minutes.
As a result, many unfinished processes pile up in my process list, because cron creates a new process every minute.
This puts even more load on the server and, in the worst case, brings it down completely.
Why does this happen? How can I prevent the cron job from being delayed when it calls a task?
You can add a dependent task that checks whether the server is running and fails fast (a.k.a. fails early) if it is not. For example, I want to verify that my Rails server is already started on port 3000 before calling the :test task:
# Rakefile
task :check_localhost do
  # system returns true only when lsof exits successfully,
  # i.e. when some process is listening on port 3000
  listening = system("lsof -i tcp:3000 -t")
  fail unless listening # or you can use `abort('message')`
end
task :test => :check_localhost do
  puts "****** THIS IS TEST ******"
end
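The same fail-fast check can be done without shelling out to lsof. This is a minimal sketch using Ruby's standard socket library; port_open? is a hypothetical helper name, and the host, port and timeout are assumptions for illustration.

```ruby
require 'socket'

# Hypothetical helper: true if something accepts TCP connections on host:port.
def port_open?(host, port, timeout: 1)
  # With a block, Socket.tcp closes the socket and returns the block's value.
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError
  # Connection refused, timed out, unknown host, and so on.
  false
end

puts port_open?('127.0.0.1', 3000)
```

Inside the check task, abort('server is not running') unless port_open?('127.0.0.1', 3000) would stop the dependent task early.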
I have 5 Resque workers set up like this:
QUEUE=* rake environment resque:work
QUEUE=* rake environment resque:work
QUEUE=* rake environment resque:work
When I run a heavy job like this:
100.times do
  Resque.enqueue(DoTheJob)
end
The first worker gets about 80 of the jobs and the other workers share the rest.
In my case, I may have 40 concurrent and really heavy jobs (video transcoding). They will be triggered consecutively, and I want the jobs to be distributed equally, or at least fairly, among my existing workers (there can be up to 30).
Is there an option or something like that?
How can I achieve this?
Thank you
Well, when I try the code above with the simplest possible Resque worker class, like this:
class MyWorker
  @queue = :test
  # data defaults to nil because the job below is enqueued without arguments
  def self.perform(data = nil)
    puts "Testing...."
  end
end
100.times do
  Resque.enqueue(MyWorker)
end
All the jobs go to the same worker.
BUT.
When I add a few seconds of sleep to the worker, I see that the jobs are distributed pretty fairly.
class MyWorker
  @queue = :test
  def self.perform(data = nil)
    puts "Testing...."
    sleep 3
  end
end
I don't know what algorithm Resque uses for this, but it looks like there's no problem with it.
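One possible explanation for the difference can be sketched with a toy model. This is not Resque's actual code: it assumes that all jobs are already in the queue, that each worker first checks the queue at its own wake time, and that a worker which has just finished a job asks for the next one right away. The simulate function and its parameters are hypothetical.

```ruby
# Toy model of several workers draining one queue.
# wake_times:   when each worker first checks the queue (seconds)
# job_duration: how long every job takes (seconds)
# Returns the number of jobs each worker ended up processing.
def simulate(jobs:, wake_times:, job_duration:)
  counts = Array.new(wake_times.size, 0)
  next_check = wake_times.dup # time at which each worker next looks at the queue

  jobs.times do
    # The worker that looks at the queue earliest takes the next job
    # (ties go to the lower index).
    worker = next_check.each_with_index.min.last
    counts[worker] += 1
    next_check[worker] += job_duration
  end

  counts
end

instant = simulate(jobs: 100, wake_times: [0.0, 1.0, 2.0], job_duration: 0.0)
slow    = simulate(jobs: 100, wake_times: [0.0, 1.0, 2.0], job_duration: 3.0)
puts instant.inspect # => [100, 0, 0]
puts slow.inspect    # => [34, 33, 33]
```

Under these assumptions, jobs that finish instantly are all drained by whichever worker wakes first, while jobs that take a few seconds give the other workers time to join in, which matches the behaviour described above.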
I created an app that uses the whenever gem, which creates cron jobs. I got it working locally but can't seem to get it working on Heroku Cedar. What's the command to do this?
running:
heroku run whenever --update-crontab job1
doesn't work
Short answer: use the scheduler add-on: http://addons.heroku.com/scheduler
Long answer: When you do heroku run, we
spin up a dyno
put your code on it
execute your command, wait for it to finish
throw the dyno away
Any changes you made to crontab would be immediately thrown away. Everything is ephemeral; you cannot edit files on Heroku, only push new code.
You need to add the Heroku Scheduler add-on.
You can add it directly from your dashboard or by using the following command:
install the add-on:
heroku addons:create scheduler:standard
Create a rake task in lib/tasks
# lib/tasks/scheduler.rake
task :send_reminders => :environment do
  User.send_reminders
end
Schedule job
Visit Heroku Dashboard
Open your app
Select Scheduler from add-ons list
Click Add Job, enter a task and select frequency.
e.g. Add rake send_reminders, select "Daily" and "00:00" to send reminders every day at midnight.
The other answers say you should use the Heroku Scheduler add-on. It can indeed run background tasks, but it doesn't offer the flexibility of cron.
There's another add-on, called Cron To Go, that can run your jobs on one-off dynos with cron's flexibility. You can also specify a timezone for your job and get notifications (email or webhook) when a job fails, succeeds or starts.
(Full disclosure - I work for the company that created and operates Cron To Go)
If you:
use Heroku Scheduler
want to run tasks every minute (not every 10 minutes)
don't care about dyno hours
then this is my hack for running a job every minute, assuming the task completes in under 60 seconds.
task start_my_service: :environment do
  1.upto(9) do |iteration|
    start_time = DateTime.now
    Services::MyService.call
    end_time = DateTime.now
    wait_time = 60 - ((end_time - start_time) * 24 * 60 * 60).to_i
    sleep wait_time if wait_time > 0
  end
end
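The wait-time arithmetic in that loop can also be written against a monotonic clock, which is not affected by system clock adjustments. This is a sketch of the same idea rather than the author's code; remaining_wait, monotonic_now and run_every are hypothetical names.

```ruby
# Hypothetical helper: seconds left to sleep so that consecutive iterations
# start `period` seconds apart. Never negative.
def remaining_wait(elapsed, period: 60)
  [period - elapsed, 0].max
end

# Monotonic time in seconds; only differences between two readings are meaningful.
def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# Runs the block `times` times, starting each run `period` seconds after the previous one.
def run_every(period:, times:)
  times.times do
    started = monotonic_now
    yield
    sleep remaining_wait(monotonic_now - started, period: period)
  end
end

puts remaining_wait(12.5) # => 47.5
puts remaining_wait(75)   # => 0
```

In the rake task, the body of the loop would become run_every(period: 60, times: 9) { Services::MyService.call }.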
Heroku doesn't support cron jobs, and Heroku Scheduler has two drawbacks:
you cannot choose an arbitrary interval or time at which to run jobs (it's either every 10 mins, 1 hour or daily).
your jobs are not defined in code, hence not in your versioning system and not easy to keep track of or modify.
Heroku does provide an alternative: custom clock processes. But the clock process requires its own dyno, and "Since dynos are restarted at least once a day some logic will need to exist on startup of the clock process to ensure that a job interval wasn’t skipped during the dyno restart".
Simple scheduler is a gem made specifically for scheduling on Heroku, but it seems a bit hackish.
I ended up using sidekiq-cron. Its only drawback: if Sidekiq is down right when a job is scheduled to run, the job won't run.
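The startup logic mentioned in the quote about clock processes could look roughly like this. It is a sketch under assumptions: the time of the last successful run is persisted somewhere (a database row or a Redis key, for example) and read back when the process boots. missed_runs is a hypothetical helper, not part of any scheduler gem.

```ruby
# Hypothetical helper: number of scheduled runs that fell between the last
# recorded run and now, for a job that should run every `interval` seconds.
def missed_runs(last_run_at, now, interval)
  elapsed = now - last_run_at # Time - Time gives seconds as a Float
  [(elapsed / interval).floor, 0].max
end

last_run = Time.utc(2024, 1, 1, 12, 0, 0)

# The process comes back 35 minutes later; the job runs every 10 minutes,
# so the 12:10, 12:20 and 12:30 runs were skipped.
puts missed_runs(last_run, Time.utc(2024, 1, 1, 12, 35, 0), 600) # => 3

# Restart after only 5 minutes: nothing was skipped.
puts missed_runs(last_run, Time.utc(2024, 1, 1, 12, 5, 0), 600)  # => 0
```

On boot, the clock process could run the job once when the result is positive, then resume its normal schedule.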
How do I create a delayed job for a rake task that should run every 15 minutes?
You can give it a try: https://github.com/defunkt/resque
I am using Resque + Redis with Heroku. Delayed job is also very much supported on their cloud service.
In lib/tasks/cron.rake (Rails only loads *.rake files from lib/tasks):
desc "This task is called by the Heroku cron add-on"
task :cron => :environment do
def resubmit_pending_jobs
Resque.enqueue(SomeJob, job.id)
end
end
One way I can think of is to use the cron add-on offered by Heroku, which runs every hour (not every 15 minutes). Perhaps the above code block can help you find a similar implementation for Delayed Job.
If you are interested in getting Resque set up with RedisToGo and Heroku, please consult this guide.
Hope that helps!
Take a look at SimpleWorker. It's a cloud-based background processing / worker queue for Ruby apps. It's an add-on for Heroku.
You create worker classes in your code and then queue up jobs to run right away or later, either one time or on a recurring schedule.
worker = SomeWorker.new
# Set attributes for worker to use here
worker.schedule(:start_at => 1.minute, :run_every => 900)
Is there a way to limit the number of instances of a rake task?
I have a rake task for reading emails that runs every 5 mins as a cron job.
Sometimes the rake task takes more than 5 mins to complete and another
rake task is launched before it finishes.
There are hacky workarounds to check ps -Af inside the rake file but I
am looking for a cleaner way to limit launching multiple instances of the
rake tasks, similar to how the daemon gem does.
Checking emails is just an example, I have several such rake tasks that involve
polling multiple servers.
You could also just use a PidFile.
First, install the 'pidfile' gem. Then make your task like this:
task :my_task => :environment do |task|
  PidFile.new(:piddir => Rails.root.join('tmp', 'pids'), :pidfile => task.name)
  # do some stuff
end
Still can't find a super elegant way, so I resorted to saving a unique file for
each rake task.
This is how the rake task looks now -
run_unique_rake(__FILE__) do
  puts "\n is running\n"
  sleep(40)
end
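Here is run_unique_rake:
require 'fileutils'

def self.run_unique_rake(file)
  # Rails.root replaces the RAILS_ROOT constant removed in Rails 3
  path = Rails.root.join(CONFIG['rake_log'], File.basename(file)).to_s
  return if File.exist?(path) # another instance is still running
  FileUtils.touch(path)
  begin
    yield if block_given?
  ensure
    FileUtils.rm_f(path) # remove the marker even if the block raises
  end
end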
Still hoping for an elegant way within rake to limit to a single instance.
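One alternative to a marker file is an advisory lock taken with File#flock, which the operating system releases on its own if the process dies, so no stale file can block later runs. A minimal sketch, assuming a Unix-like system; with_single_instance and the lock path are hypothetical.

```ruby
require 'tmpdir'

# Hypothetical helper: runs the block only if no one else holds the lock on
# lock_path. Returns true if the block ran, false if it was skipped.
def with_single_instance(lock_path)
  File.open(lock_path, File::RDWR | File::CREAT, 0o644) do |file|
    # LOCK_NB makes flock return false immediately instead of waiting.
    return false unless file.flock(File::LOCK_EX | File::LOCK_NB)

    yield
    true
  end
  # Closing the file (done by File.open's block form) releases the lock.
end

lock_path = File.join(Dir.tmpdir, "my_task_#{Process.pid}.lock")
ran = with_single_instance(lock_path) { puts 'task body runs here' }
puts ran # => true
```

Inside a rake task, the body would go in the block, and a second cron-launched instance would simply skip its run while the first one still holds the lock.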