Running delayed jobs on Heroku for free - ruby-on-rails

Is it possible to run delayed jobs on Heroku for free?
I'm trying to use delayed_job_active_record on Heroku. However, it requires a worker dyno, and keeping that dyno running full time would cost money.
I thought that having Unicorn's workers run the delayed jobs instead of a Heroku worker dyno would cost nothing while still getting all the jobs done. However, the Unicorn workers do not seem to start "working" the queue automatically.
I have the following in my Procfile.
web: bundle exec unicorn -p $PORT -c ./config/unicorn.rb
worker: bundle exec rake jobs:work
and the following in my unicorn.rb
worker_processes 3
timeout 30
preload_app true
before_fork do |server, worker|
  # Replace with MongoDB or whatever
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.connection.disconnect!
    Rails.logger.info('Disconnected from ActiveRecord')
  end
  # If you are using Redis but not Resque, change this
  if defined?(Resque)
    Resque.redis.quit
    Rails.logger.info('Disconnected from Redis')
  end
  sleep 1
end
after_fork do |server, worker|
  # Replace with MongoDB or whatever
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.establish_connection
    Rails.logger.info('Connected to ActiveRecord')
  end
  # If you are using Redis but not Resque, change this
  if defined?(Resque)
    Resque.redis = ENV['REDIS_URI']
    Rails.logger.info('Connected to Redis')
  end
end
Delayed jobs only seem to work when I scale the Heroku worker from 0 to 1.
Again, is it not possible to use Unicorn workers instead of a Heroku worker dyno to run the delayed jobs?
Do I have to use a gem like workless to run delayed jobs on Heroku for free? (reference)

Splitting the process like that can cause problems. Your best bet is not to try to get it 'free', but to use something like http://hirefireapp.com/, which starts up a worker only when there are jobs to perform, reducing the cost significantly compared to running a worker 24x7.
Also note that Heroku will only ever autostart a 'web' process for you; starting other named processes is a manual task.

You can use Heroku Scheduler to run the jobs using the command
rake jobs:workoff
This way the jobs can run in your web dyno. According to the delayed_job docs, this command will run all available jobs and exit.
You can configure the scheduler to run this command every 10 minutes, for example; it has no noticeable effect on the app's performance when no jobs are queued. Another option is to schedule it to run daily at a time of day with lower traffic.
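If you need more control over a single run, delayed_job's Delayed::Worker also exposes work_off, so a small custom rake task can work off a bounded batch and exit. This is only a sketch; the jobs:work_batch name and the batch size of 100 are made up for illustration:
# lib/tasks/jobs_batch.rake -- hypothetical wrapper around delayed_job's work_off
namespace :jobs do
  desc "Work off up to 100 queued delayed jobs, then exit"
  task :work_batch => :environment do
    # work_off runs at most the given number of jobs and returns success/failure counts
    successes, failures = Delayed::Worker.new.work_off(100)
    puts "Batch finished: #{successes} succeeded, #{failures} failed"
  end
end
You would then point Heroku Scheduler at rake jobs:work_batch instead of rake jobs:workoff.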

Strictly speaking there is no straightforward way to get this for free, but there are several workarounds people use to get free background jobs. One of them is http://nofail.de/2011/07/heroku-cedar-background-jobs-for-free/
Also, if you plan to use Resque, which is an excellent choice for background jobs, you will need Redis, which comes free with the nano plan of https://addons.heroku.com/redistogo. See https://devcenter.heroku.com/articles/queuing-ruby-resque
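The gist of the nofail.de approach above is to spawn a background worker from inside the web dyno itself. Very roughly (a sketch of the idea, not the post's exact code):
# config/unicorn.rb (sketch) -- run a delayed_job worker inside the web dyno
before_fork do |server, worker|
  # Spawn a single delayed_job worker alongside the Unicorn workers
  @delayed_job_pid ||= spawn("bundle exec rake jobs:work")
end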
The simple solution is to pay for one dyno for the worker, while your web dyno remains free.
Let me know if you need more help.
Thanks

Consider using the Workless gem: https://github.com/lostboy/workless

If you only have one web dyno, Heroku will idle it after an hour of inactivity.
Also, Heroku restarts all dynos at least once a day.
This makes it hard to rely on an in-process Ruby scheduler; at a minimum it has to use persistent storage (e.g. the database).

Related

How to make sure resque background jobs are always up?

I use ActiveJob with a resque back-end and use capistrano-resque to (re)start my work processes on deploy.
What I have been struggling with is making sure those processes are always up. Can such a process crash, and should I put safeguards in place so that my background jobs always get picked up by a worker?
I have searched far and wide but have not found any standard solution to this.
I am using god with resque. Here's an example script for it.
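The linked example isn't reproduced here, but a minimal god config for a single Resque worker looks roughly like this (the path, environment, and queue below are placeholders):
# config/resque.god -- a minimal sketch; adjust path, environment and queues
rails_root = "/var/www/myapp/current"
God.watch do |w|
  w.name     = "resque-production"
  w.start    = "cd #{rails_root} && bundle exec rake resque:work QUEUE=* RAILS_ENV=production"
  w.interval = 30.seconds
  w.keepalive
end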
Capistrano
desc "Restart resque workers"
task :restart_workers, roles: :resque do
run "sudo god restart resque-production"
end
after 'deploy:restart', 'deploy:restart_workers'
where resque-production is the w.name from the script example.

Two instances running on Heroku but only 1 worker. Why?

I have been troubleshooting memory usage of about 550 MB on a Heroku Rails app running Unicorn, which is causing response times of around 2,000 ms.
I looked at my New Relic graphs and realized I am running two instances, even though I have only 1 worker and am running only 1 dyno (Hobby). I don't understand why there are two instances! It seems like I am accidentally running a "ghost" instance. This only happens with Unicorn, not with Puma.
Edit: I added a worker to see what happened if I ran 2 workers. This caused 3 instances to be running, according to New Relic, so it does not double; it just adds one ghost instance.
Once every 10 minutes I run a short scheduled task, which can be seen in the graphs. ENV["WEB_CONCURRENCY"] is not set, by the way.
# Unicorn.rb:
worker_processes Integer(ENV["WEB_CONCURRENCY"] || 1)
timeout 15
preload_app true
before_fork do |server, worker|
  Signal.trap 'TERM' do
    puts 'Unicorn master intercepting TERM and sending myself QUIT instead'
    Process.kill 'QUIT', Process.pid
  end
  defined?(ActiveRecord::Base) and
    ActiveRecord::Base.connection.disconnect!
end
after_fork do |server, worker|
  Signal.trap 'TERM' do
    puts 'Unicorn worker intercepting TERM and doing nothing. Wait for master to send QUIT'
  end
  defined?(ActiveRecord::Base) and
    ActiveRecord::Base.establish_connection
end
# Procfile
web: bundle exec unicorn -p $PORT -c ./config/unicorn.rb
Running heroku ps gives:
heroku ps
=== web (Hobby): bundle exec unicorn -p $PORT -c ./config/unicorn.rb (1)
web.1: up 2018/01/12 11:34:08 +0100 (~ 5h ago)
Is this behavior to be expected, or am I doing something terribly wrong here? What could cause the second instance to run? Is it possible to accidentally start two versions of the app on boot?
I removed the scheduler, some gems, and the dalli cache, and this removed the extra/ghost instance. Then I put them back one after another, but it stayed at one instance. That is, exactly the same setup that previously had two instances was now down to one (which makes the most sense).
The memory consumption remains the same, so I will mark this down as a New Relic bug. Unfortunately.

Error R12 (Exit timeout) using Heroku's recommended Unicorn config

My Unicorn config (copied from Heroku's docs):
# config/unicorn.rb
worker_processes Integer(ENV["WEB_CONCURRENCY"] || 3)
timeout 30
preload_app true
before_fork do |server, worker|
  Signal.trap 'TERM' do
    puts 'Unicorn master intercepting TERM and sending myself QUIT instead'
    Process.kill 'QUIT', Process.pid
  end
  defined?(ActiveRecord::Base) and
    ActiveRecord::Base.connection.disconnect!
end
after_fork do |server, worker|
  Signal.trap 'TERM' do
    puts 'Unicorn worker intercepting TERM and doing nothing. Wait for master to send QUIT'
  end
  defined?(ActiveRecord::Base) and
    ActiveRecord::Base.establish_connection
end
But every time a dyno is restarted, we get this:
heroku web.5 - - Error R12 (Exit timeout) -> At least one process failed to exit within 10 seconds of SIGTERM
Ruby 2.0, Rails 3.2, Unicorn 4.6.3
We've had issues like this with Unicorn for some time . . . we also get seemingly random timeout errors, even though we never see much load and have 4 dynos with 4 workers each (we never have any request queuing). We have had 0 luck getting rid of these errors, even with help from Heroku. I get the feeling even they aren't 100% confident in the optimal settings for Unicorn on Heroku.
We just recently switched to Puma and so far so good, much better performance and no weird timeouts yet. One of the other reasons we switched to Puma is that I suspect some of our random timeouts come from "slow clients" . . . Unicorn isn't designed to handle slow clients.
I will let you know if we see continued success with Puma, but so far so good. The switch is pretty painless, assuming your app is thread-safe.
Here are the puma settings we are using. We are using "Clustered Mode".
procfile:
web: bundle exec puma -p $PORT -C ./config/puma.rb
puma.rb:
environment ENV['RACK_ENV']
threads Integer(ENV["PUMA_THREADS"] || 5), Integer(ENV["PUMA_THREADS"] || 5)
workers Integer(ENV["WEB_CONCURRENCY"] || 4)
preload_app!
on_worker_boot do
  ActiveSupport.on_load(:active_record) do
    ActiveRecord::Base.establish_connection
  end
end
We currently have WEB_CONCURRENCY set to 4 and PUMA_THREADS set to 5.
We aren't using an initializer for DB_POOL, just using the default DB_POOL setting of 5 (hence the 5 threads).
The only reason we are using WEB_CONCURRENCY as our environment variable name is so that log2viz reports the correct number of workers. Would rather call it PUMA_WORKERS but whatever, not a huge deal.
Hope this helps . . . again, will let you know if we see any issues with Puma.
I hate to add another answer, especially one this simple, but ultimately what fixed this problem for us was removing the 'rack-timeout' gem. I realize this is probably not best practice but I'm curious if there is some conflict between rack-timeout and Unicorn and/or Puma (which is odd because Heroku recommends rack-timeout for use with Unicorn).
Anyway Puma is working great for us but we did still see some random inexplicable timeouts even after the Puma upgrade . . . but removing rack-timeout got rid of the issue completely. Obviously we still get timeouts but only for code we haven't optimized or if we are getting heavy usage (basically when you would expect to see timeouts). Thus I would blame this issue on rack-timeout and not on Unicorn . . . thus contradicting my previous answer :)
Hope this helps. If anyone else wants to poke holes in my theory, feel free!

Rails, Heroku, Unicorn & Resque - how to choose the amount of web workers / resque workers?

I've just switched to using Unicorn on Heroku. I'm also going to switch to resque from delayed_job and use the setup described at http://bugsplat.info/2011-11-27-concurrency-on-heroku-cedar.html
What I don't understand from this is how config/unicorn.rb:
worker_processes 3
timeout 30
@resque_pid = nil
before_fork do |server, worker|
  @resque_pid ||= spawn("bundle exec rake " + \
    "resque:work QUEUES=scrape,geocode,distance,mailer")
end
translates into:
"This will actually result in six processes in each web dyno: 1 unicorn master, 3 unicorn web workers, 1 resque worker, 1 resque child worker when it actually is processing a job"
How many workers will actually process background jobs? 1 or 2?
Let's say I wanted to increase the number of resque workers; what would I change?
I think if you run that block, you have your unicorn master already running, plus 3 web workers that you specify at the top of the file, and then the block below launches one Resque worker if it's not already started.
I'm guessing that Resque launches a child worker by itself when it actually performs work.
It would appear that if you wanted another Resque worker, you could just do
worker_processes 3
timeout 30
@resque_pid = nil
@resque_pid2 = nil
before_fork do |server, worker|
  @resque_pid ||= spawn("bundle exec rake " + \
    "resque:work QUEUES=scrape,geocode,distance,mailer")
  @resque_pid2 ||= spawn("bundle exec rake " + \
    "resque:work QUEUES=scrape,geocode,distance,mailer")
end
In my experience with Resque, it's as simple as launching another process as specified above. The only uncertainty I have is with Heroku and how it chooses to deal with giving you more workers.
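If you'd rather not hard-code a second pid variable, the worker count could be driven by an environment variable instead. This is a sketch only; RESQUE_WORKERS is a made-up variable, not something Heroku or Resque defines:
# config/unicorn.rb (sketch) -- spawn N Resque workers from the Unicorn master
worker_processes 3
timeout 30
@resque_pids = nil
before_fork do |server, worker|
  # Spawn the requested number of Resque workers once, remembering their pids
  @resque_pids ||= Integer(ENV['RESQUE_WORKERS'] || 1).times.map do
    spawn("bundle exec rake resque:work QUEUES=scrape,geocode,distance,mailer")
  end
end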

Keeping a rake job running

I'm using delayed_job to run jobs, with new jobs being added every minute by a cronjob.
Currently I have an issue where the rake jobs:work task, which I start manually with 'nohup rake jobs:work &', randomly exits.
While God seems to be a solution for some people, the extra memory overhead is rather annoying and I'd prefer a simpler solution that can be restarted by the deployment script (Capistrano).
Is there some bash/Ruby magic to make this happen, or am I destined to run a monitoring service on my server with some horrid hacks to give the unprivileged account the site deploys to the ability to restart it?
For me the daemons gem was unreliable with delayed_job. It could have been a poorly written script (I was using the one on collectiveidea's delayed_job GitHub page) and not daemons' fault; I'm not really sure. But for whatever reason, it would restart inconsistently on deployments.
I read somewhere this was due to it not waiting for the process to actually exit, so the pid files would get overwritten or something. But I didn't really bother to investigate. I switched to the daemons-spawn gem using these instructions and it seems to be much more reliable now.
The delayed_job docs suggest that you use a monitoring service to manage the rake worker job(s). I use runit; it works well.
(You can install it in the mode where it does not replace init.)
Added:
Re: restart by Capistrano: yes, runit enables that. Just do a
sudo sv kill delayed_job
in your Capistrano recipe to kill the delayed_job worker. Runit will then restart it with your newly deployed code base.
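In Capistrano 2 style (mirroring the god-based task shown earlier), that recipe could look roughly like this; the task and role names are just an illustration:
desc "Restart the delayed_job worker via runit"
task :restart_delayed_job, roles: :app do
  run "sudo sv kill delayed_job"
end
after 'deploy:restart', 'deploy:restart_delayed_job'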
I have implemented a small rake task that restarts the jobs task over and over again:
desc "Start a delayed_job worker in an endless loop to prevent exits."
task :jobs => :environment do
  while true
    begin
      Delayed::Worker.new(:min_priority => ENV['MIN_PRIORITY'],
                          :max_priority => ENV['MAX_PRIORITY'],
                          :quiet => false).start
    rescue Exception => e
      puts "Exception occurred (#{e})"
    end
    puts "Task jobs:work exited, clearing queue and restarting"
    sleep 1
    Delayed::Job.delete_all
  end
end
Apparently it did not work, so I ended up with this simple solution:
for (( ;; )); do rake jobs:work --trace; done
Get rid of delayed_job and use either whenever or resque.
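If you go the whenever route, the recurring work is defined in config/schedule.rb and run by cron directly, so there is no long-lived worker process to keep alive. A rough sketch; the rake task name is a placeholder for whatever work your cron job currently enqueues:
# config/schedule.rb (sketch, using the whenever gem)
every 1.minute do
  rake "scrape:run"  # placeholder task that does the work directly instead of queueing a job
end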
