How to monitor background jobs (queue_classic) in production - ruby-on-rails

I am using queue_classic for background jobs, and I need to monitor those jobs in production (start, stop, restart, etc.).
I found a similar question, but it didn't help me.
I also found this god config, but how would I stop and restart the workers?
number_queues.times do |queue_num|
  God.watch do |w|
    w.name     = "QC-#{queue_num}"
    w.group    = "QC"
    w.interval = 5.minutes
    w.start    = "bundle exec rake queue:work" # This is your rake task to start QC listening
    w.gid      = 'nginx'
    w.uid      = 'nginx'
    w.dir      = rails_root
    w.keepalive
    w.env      = {"RAILS_ENV" => rails_env}
    w.log      = "#{log_dir}/qc.stdout.log" # Or... "#{log_dir}/qc-#{queue_num}.stdout.log"

    # determine the state on startup
    w.transition(:init, { true => :up, false => :start }) do |on|
      on.condition(:process_running) do |c|
        c.running = true
      end
    end
  end
end
UPDATE
This code doesn't seem to work:
namespace :queue_classic do
  desc "Start QC worker"
  task :start, roles: :web do
    run "cd #{release_path} && RAILS_ENV=production bundle exec rake qc:work"
  end

  after "deploy:restart", "queue_classic:restart"
end

As stated in the documentation, you can restart a worker by issuing
god restart QC-<worker_number>
where QC-<worker_number> is the name you assigned to the worker.
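If you want to drive this from Capistrano, here is a minimal sketch that simply delegates to god; it assumes god is already running on the server with the config above loaded, and it reuses the QC group name and release_path from the question (the task names and hook are illustrative):

namespace :queue_classic do
  desc "Stop all QC workers via god"
  task :stop, roles: :web do
    run "cd #{release_path} && bundle exec god stop QC"
  end

  desc "Start all QC workers via god"
  task :start, roles: :web do
    run "cd #{release_path} && bundle exec god start QC"
  end

  desc "Restart all QC workers via god"
  task :restart, roles: :web do
    run "cd #{release_path} && bundle exec god restart QC"
  end
end

after "deploy:restart", "queue_classic:restart"

Because QC is the group name from the god config, god start/stop/restart act on every watch in that group rather than on a single worker.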

Depending on what kind of monitoring you need, you might also look at Toro: it provides a web interface, and because jobs store a lot of data, they can easily be inspected with ActiveRecord queries. Toro also supports middleware, which may be useful for your needs.

Related

Scheduler works once

The project uses a task scheduler - gem 'clockwork'. Capistrano executes the hook:
after :'deploy:finished', :'clockwork:restart'
The scheduler fires once (right after this hook), runs all the rake tasks, and then the tasks never run again. No matter what interval I set - a day or 5 minutes - the tasks do not start anymore. The 'daemons' gem is installed. I will be glad of any help!
UPDATE
require 'clockwork'
require_relative './boot'
require_relative './environment'

module Clockwork
  handler do |job|
    puts "Running job: #{job}"
  end

  every(1.minute, 'job:some_task') do
    rake_task('job:some_task')
  end

  def rake_task(task_name)
    AppName::Application.load_tasks
    Rake::Task[task_name].invoke
  end

  configure do |config|
    config[:sleep_timeout] = 3600 # 1 hour
    config[:logger] = Logger.new("#{Rails.root}/log/clockwork.log")
    config[:tz] = 'UTC'
    config[:max_threads] = 15
    config[:thread] = true
  end
end
My guess is that you're not running clockwork as a daemon, which is why it runs only once. Have a look at this gist:
desc "Start clockwork"
task :start, :roles => clockwork_roles, :on_no_matching_servers => :continue do
run "daemon --inherit --name=clockwork --env='#{rails_env}' --output=#{log_file} --pidfile=#{pid_file} -D #{current_path} -- bundle exec clockwork config/clockwork.rb"
end
You can always SSH into your deployment and either check the process list for the PID or look at the temporary file in your Rails app where clockwork stores its PID:
.../tmp/pids/clockwork.pid
Alternatively, check clockwork's logs:
.../log/clockwork.log
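Since the daemons gem is already in your bundle, another option is clockwork's clockworkd wrapper, which daemonizes the clock process itself. A rough, untested sketch of Capistrano tasks (the config path config/clockwork.rb is an assumption; clockwork_roles, current_path and rails_env are the variables used above):

namespace :clockwork do
  desc "Start clockwork via the clockworkd daemon wrapper"
  task :start, :roles => clockwork_roles do
    run "cd #{current_path} && RAILS_ENV=#{rails_env} bundle exec clockworkd -c config/clockwork.rb start"
  end

  desc "Stop the clockwork daemon"
  task :stop, :roles => clockwork_roles do
    run "cd #{current_path} && RAILS_ENV=#{rails_env} bundle exec clockworkd -c config/clockwork.rb stop"
  end

  desc "Restart the clockwork daemon"
  task :restart, :roles => clockwork_roles do
    run "cd #{current_path} && RAILS_ENV=#{rails_env} bundle exec clockworkd -c config/clockwork.rb restart"
  end
end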

Correct way to monitor ruby background processes (like Resque) in production

I've been struggling with this for a while. What is the correct approach to starting background processes like resque and resque-scheduler? Is using God overkill?
Currently I'm trying to get God to work, but I'm not sure whether the *.god conf files should live in the app's directory or somewhere else.
This is what I use:
config
|- app.god
|- god
   |- resque.god
   |- resque_scheduler.god
# config/god/resque.god
rails_env   = ENV['RAILS_ENV']  || raise(ArgumentError, "RAILS_ENV not defined")
rails_root  = ENV['RAILS_ROOT'] || File.expand_path(File.join(File.dirname(__FILE__), '..', '..'))
num_workers = rails_env == 'production' ? 5 : 2

num_workers.times do |num|
  God.watch do |w|
    w.dir      = "#{rails_root}"
    w.name     = "resque-#{num}"
    w.group    = 'resque'
    w.interval = 30.seconds
    w.env      = {"QUEUE" => "*", "RAILS_ENV" => rails_env, "BUNDLE_GEMFILE" => "#{rails_root}/Gemfile"}
    w.start    = "/usr/bin/rake -f #{rails_root}/Rakefile environment resque:work"
    w.log      = "#{rails_root}/log/resque-scheduler.log"

    ... start/stop methods ...
  end
end
There's a root user and a MyApp user. MyApp user has an app located in: /home/myapp/apps/myapp_production/current
The God capistrano recipe I use is:
# config/deploy.rb
after "deploy:restart", "god:restart"

namespace :god do
  def try_killing_resque_workers
    run "pkill -3 -f resque"
  rescue
    nil
  end

  desc "Restart God gracefully"
  task "restart", :roles => :app do
    god_config_path = File.join(release_path, 'config', 'app.god')
    begin
      # Throws an exception if god is not running.
      run "cd #{release_path}; bundle exec god status && RAILS_ENV=#{rails_env} RAILS_ROOT=#{release_path} bundle exec god load #{god_config_path} && bundle exec god start resque"
      # Kill resque processes and have god restart them with the newly loaded config.
      try_killing_resque_workers
    rescue => ex
      # god is dead, workers should be as well, but who knows.
      try_killing_resque_workers
      # Start god.
      run "cd #{release_path}; RAILS_ENV=#{rails_env} bundle exec god -c #{god_config_path}"
    end
  end
end
When I deploy, I get "The server is not available (or you do not have permissions to access it)".
What's weird is that even when I log in as root and run god status, it returns nothing, but if I run god --version it returns the version.
Does anyone know why?
Did you set up an init.d script?
Is god running?
/etc/init.d/god status
You can try to launch it by hand
/usr/bin/god -c /etc/god/conf.god -D
and check the logs.

Need access to current hostname in Capistrano configuration variable for delayed_job named queue

I'm using named queues within delayed_job to keep tasks isolated by server:
subdomain = Socket.gethostname.split('.')[0]
MyModel.delay(:queue => (subdomain + "_queue")).get_some_records
When I start delayed_job on each server, I need to set the --queue flag. You can pass arguments to the delayed_job command line with set :delayed_job_args. AFAIK, Capistrano allows the use of $CAPISTRANO:HOST$ in run commands, but that doesn't help me with set.
As a workaround, I have overridden the delayed_job start task like this:
desc "Start the delayed_job process"
task :start, :roles => lambda { roles } do
run "cd #{current_path};#{rails_env} script/delayed_job start --queue=$CAPISTRANO:HOST$_queue #{args}"
end
Is there any way to do this as intended, using set? I would like to be able to do something like this:
set :delayed_job_args, "--queue=#{ hostname }_queue"
Update
I discovered another kludgy (and not very DRY) way to do this, but I would still like to do it with set if possible:
desc "Start the delayed_job process"
task :start, :roles => lambda { roles } do
parallel do |session|
session.when "server.host =~ /server1/", "cd #{current_path};#{rails_env} script/delayed_job start --queue=server1_queue #{args}"
session.when "server.host =~ /server2/", "cd #{current_path};#{rails_env} script/delayed_job start --queue=server2_queue #{args}"
session.else "cd #{current_path};#{rails_env} script/delayed_job restart #{args}"
end
end
Rails 3.2.8, delayed_job 3.0.3, capistrano 2.13.4.
How about … ?
set :delayed_job_args, "--queue=#{ $CAPISTRANO:HOST$ }_queue"
I have not personally tested this in Cap, but it seems logical.
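Note that Ruby string interpolation cannot evaluate $CAPISTRANO:HOST$ (the snippet above is a syntax error as written). An untested variant of the same idea is to keep the placeholder as literal text; Capistrano substitutes it per server whenever the string ends up inside a run command:

# untested sketch: leave the placeholder uninterpolated so Capistrano can
# expand it per host when :delayed_job_args reaches a `run` call
set :delayed_job_args, "--queue=$CAPISTRANO:HOST$_queue"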

God resque start gives "The server is not available"

I'm having trouble figuring out how to get God to restart resque.
I've got a Rails 3.2.2 stack on an Ubuntu 10.04.3 LTS Linode slice. It's running system Ruby 1.9.3-p194 (no RVM).
There's a God init.d service at /etc/init.d/god-service that contains:
CONF_DIR=/etc/god
GOD_BIN=/var/www/myapp.com/shared/bundle/ruby/1.9.1/bin/god
RUBY_BIN=/usr/local/bin/ruby
RETVAL=0

# Go no further if config directory is missing.
[ -d "$CONF_DIR" ] || exit 0

case "$1" in
  start)
    # Create pid directory
    $RUBY_BIN $GOD_BIN -c $CONF_DIR/master.conf
    RETVAL=$?
    ;;
  stop)
    $RUBY_BIN $GOD_BIN terminate
    RETVAL=$?
    ;;
  restart)
    $RUBY_BIN $GOD_BIN terminate
    $RUBY_BIN $GOD_BIN -c $CONF_DIR/master.conf
    RETVAL=$?
    ;;
  status)
    $RUBY_BIN $GOD_BIN status
    RETVAL=$?
    ;;
  *)
    echo "Usage: god {start|stop|restart|status}"
    exit 1
    ;;
esac

exit $RETVAL
master.conf in the above contains:
load "/var/www/myapp.com/current/config/resque.god"
resque.god in the above contains:
APP_ROOT = "/var/www/myapp.com/current"
God.log_file = "/var/www/myapp.com/shared/log/god.log"

God.watch do |w|
  w.name        = 'resque'
  w.interval    = 30.seconds
  w.dir         = File.expand_path(File.join(File.dirname(__FILE__), '..'))
  w.start       = "RAILS_ENV=production bundle exec rake resque:work QUEUE=*"
  w.uid         = "deploy"
  w.gid         = "deploy"
  w.start_grace = 10.seconds
  w.log         = File.expand_path(File.join(File.dirname(__FILE__), '..', 'log', 'resque-worker.log'))

  # restart if memory gets too high
  w.transition(:up, :restart) do |on|
    on.condition(:memory_usage) do |c|
      c.above = 200.megabytes
      c.times = 2
    end
  end

  # determine the state on startup
  w.transition(:init, { true => :up, false => :start }) do |on|
    on.condition(:process_running) do |c|
      c.running = true
    end
  end

  # determine when process has finished starting
  w.transition([:start, :restart], :up) do |on|
    on.condition(:process_running) do |c|
      c.running = true
      c.interval = 5.seconds
    end

    # failsafe
    on.condition(:tries) do |c|
      c.times = 5
      c.transition = :start
      c.interval = 5.seconds
    end
  end

  # start if process is not running
  w.transition(:up, :start) do |on|
    on.condition(:process_running) do |c|
      c.running = false
    end
  end
end
In deploy.rb I have a reload task:
task :reload_god_config do
  run "god stop resque"
  run "god load #{File.join(deploy_to, 'current', 'config', 'resque.god')}"
  run "god start resque"
end
The problem is that whether I deploy or run god (stop|start|restart|status) resque manually, I get the error message:
The server is not available (or you do not have permissions to access it)
I tried installing god into the system gems and pointing to it in god-service:
GOD_BIN=/usr/local/bin/god
but god start resque gives the same error.
However, I can start the service with:
sudo /etc/init.d/god-service start
So it's probably a permissions issue, presumably related to the fact that the init.d service runs as root while god is run from the bundle by the deploy user.
What's the best way around this issue?
You're running the god service as a different user (there's a good chance root).
Also check out: God not running: The server is not available (or you do not have permissions to access it)
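One workaround along those lines is to issue the client commands as the same user the daemon runs as. Here is a sketch of the reload task from the question, routed through sudo; it assumes the init.d service started god as root, that the deploy user may sudo, and that god is on root's PATH (e.g. installed as a system gem):

# sketch only -- same commands as in the question, but run via sudo so the
# client talks to the root-owned god daemon instead of failing with
# "The server is not available (or you do not have permissions to access it)"
task :reload_god_config do
  sudo "god stop resque"
  sudo "god load #{File.join(deploy_to, 'current', 'config', 'resque.god')}"
  sudo "god start resque"
end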
First, check whether god is installed on your machine with the god --version command. If it is available, try running a god script with the -D option, for example god -c sample.god -D; it will print error messages to standard output showing where the exact issue is. I was getting the same error when I ran the command without -D. When I ran it with -D, it told me there was a folder write-permission issue, so I was able to find and fix it.
OK, this is an issue with your config files: check all of them, plus the includes - somewhere one of them fails and throws this error. I checked mine, fixed some errors, and afterwards it worked perfectly!

Run resque in background

I have a working Rails app with a resque queue system, and it works very well. However, I lack a good way of actually daemonizing the resque workers.
I can start them just fine with rake resque:work QUEUE="*", but I guess the point isn't to have your workers running in the foreground. For some reason nobody seems to address this issue. The official resque GitHub page claims you can do something like this:
PIDFILE=./resque.pid BACKGROUND=yes QUEUE="*" rake resque:work
but it doesn't fork into the background here, at least.
A +1 for resque-pool - it really rocks. We use it in combination with God to make sure that it is always available.
# Resque
God.watch do |w|
  w.dir         = RAILS_ROOT
  w.name        = "resque-pool"
  w.interval    = 30.seconds
  w.start       = "cd #{RAILS_ROOT} && sudo -u www-data sh -c 'umask 002 && resque-pool -d -E #{RAILS_ENV}'"
  w.start_grace = 20.seconds
  w.pid_file    = "#{RAILS_ROOT}/tmp/pids/resque-pool.pid"
  w.behavior(:clean_pid_file)

  # restart if memory gets too high
  #w.transition(:up, :restart) do |on|
  #  on.condition(:memory_usage) do |c|
  #    c.above = 350.megabytes
  #    c.times = 2
  #  end
  #end

  # determine the state on startup
  w.transition(:init, { true => :up, false => :start }) do |on|
    on.condition(:process_running) do |c|
      c.running = true
    end
  end

  # determine when process has finished starting
  w.transition([:start, :restart], :up) do |on|
    on.condition(:process_running) do |c|
      c.running = true
      c.interval = 5.seconds
    end

    # failsafe
    on.condition(:tries) do |c|
      c.times = 5
      c.transition = :start
      c.interval = 5.seconds
    end
  end

  # start if process is not running
  w.transition(:up, :start) do |on|
    on.condition(:process_running) do |c|
      c.running = false
    end
  end
end
This then gives you a really elegant way to reload code in your workers without interrupting jobs - simply kill -2 your resque-pool(s) when you deploy. Idle workers will die immediately, busy workers will die when they finish their current jobs, and God will restart resque-pool with workers using your new code.
These are our Resque tasks for Capistrano:
namespace :resque do
  desc "Starts resque-pool daemon."
  task :start, :roles => :app, :only => { :jobs => true } do
    run "cd #{current_path};resque_pool -d -e #{rails_env} start"
  end

  desc "Sends INT to resque-pool daemon to close master, letting workers finish their jobs."
  task :stop, :roles => :app, :only => { :jobs => true } do
    pid = "#{current_path}/tmp/pids/resque-pool.pid"
    sudo "kill -2 `cat #{pid}`"
  end

  desc "Restart resque workers - actually uses resque.stop and lets God restart in due course."
  task :restart, :roles => :app, :only => { :jobs => true } do
    stop # let God restart.
  end

  desc "List all resque processes."
  task :ps, :roles => :app, :only => { :jobs => true } do
    run 'ps -ef f | grep -E "[r]esque-(pool|[0-9])"'
  end

  desc "List all resque pool processes."
  task :psm, :roles => :app, :only => { :jobs => true } do
    run 'ps -ef f | grep -E "[r]esque-pool"'
  end
end
You might need to reconnect any DB connections when resque-pool forks workers - check the docs.
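resque-pool documents an after_prefork hook for exactly this; a minimal sketch (the initializer path is an assumption):

# config/initializers/resque_pool.rb -- sketch: re-establish the ActiveRecord
# connection in every worker that resque-pool forks off the master
Resque::Pool.after_prefork do
  ActiveRecord::Base.establish_connection
end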
I had the same problem and the following works for me.
PIDFILE=./resque.pid BACKGROUND=yes QUEUE="*" rake resque:work >> worker1.log &
You can also redirect STDERR to the same log file.
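For example, the same command with stderr folded into the log file:
PIDFILE=./resque.pid BACKGROUND=yes QUEUE="*" rake resque:work >> worker1.log 2>&1 &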
To daemonize a process you can use nohup:
nohup cmd &
On resque's GitHub there is a config for monit that shows how to use nohup; it looks something like this (note that the redirects have to come before the trailing &):
nohup bundle exec rake resque:work QUEUE=queue_name PIDFILE=tmp/pids/resque_worker_QUEUE.pid >> log/resque_worker_QUEUE.log 2>&1 &
Another option you should look into is using the resque pool gem to manage your workers.
You can run resque pool in background by using this command:
resque-pool --daemon --environment production
The BACKGROUND environment variable was added to Resque 1.20; make sure you're not using 1.19 or lower.
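If in doubt, pinning the gem makes the requirement explicit (a minimal sketch):

# Gemfile -- require a resque version that understands BACKGROUND=yes
gem 'resque', '>= 1.20'

bundle show resque will tell you which version is actually installed.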
One good way is to use God to manage it. God launches a daemonized version of Resque and monitors it. Actually, you can choose between running Resque as a daemon itself and letting God daemonize Resque; I chose the second option.
An example resque.god file:
rails_env   = ENV['RAILS_ENV']  || "production"
rails_root  = ENV['RAILS_ROOT'] || "/path/to/my/app/current"
num_workers = rails_env == 'production' ? 5 : 2

num_workers.times do |num|
  God.watch do |w|
    w.dir      = "#{rails_root}"
    w.name     = "resque-#{num}"
    w.group    = 'resque'
    w.interval = 30.seconds
    w.env      = {"QUEUE" => "critical,mailer,high,low", "RAILS_ENV" => rails_env}
    w.start    = "bundle exec rake -f #{rails_root}/Rakefile resque:work"
    w.stop_signal  = 'QUIT'
    w.stop_timeout = 20.seconds
    w.uid = 'myappuser'
    w.gid = 'myappuser'
    w.behavior(:clean_pid_file)

    # restart if memory gets too high
    w.transition(:up, :restart) do |on|
      on.condition(:memory_usage) do |c|
        c.above = 350.megabytes
        c.times = 2
        c.notify = {:contacts => ['maxime'], :priority => 9, :category => 'myapp'}
      end
    end

    # determine the state on startup
    w.transition(:init, { true => :up, false => :start }) do |on|
      on.condition(:process_running) do |c|
        c.running = true
      end
    end

    # determine when process has finished starting
    w.transition([:start, :restart], :up) do |on|
      on.condition(:process_running) do |c|
        c.running = true
        c.interval = 5.seconds
      end

      # failsafe
      on.condition(:tries) do |c|
        c.times = 5
        c.transition = :start
        c.interval = 5.seconds
      end
    end

    # start if process is not running
    w.transition(:up, :start) do |on|
      on.condition(:process_running) do |c|
        c.running = false
        c.notify = {:contacts => ['maxime'], :priority => 1, :category => 'myapp'}
      end
    end
  end
end
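To put this to use (a sketch; the paths are illustrative), start the god daemon with this file and then manage the whole group by the name given in w.group:
god -c /path/to/my/app/current/config/resque.god
god restart resque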
I also faced this issue. I start the workers in a cap task, but I ran into problems:
With BACKGROUND=yes the worker was always stuck in "starting" mode.
The nohup'd processes are killed as soon as the cap command returns, so we must wait a couple of seconds - but I was unable to append more commands after the '&'.
In the end I had to generate a shell script that sleeps for 5 seconds after the nohup calls.
My code:
desc 'Start resque'
task :start, :roles => :app do
  run("cd #{current_path} ; echo \"nohup bundle exec rake resque:work QUEUE=* RAILS_ENV=#{rails_env} PIDFILE=tmp/pids/resque_worker_1.pid &\nnohup bundle exec rake resque:work QUEUE=* RAILS_ENV=#{rails_env} PIDFILE=tmp/pids/resque_worker_2.pid &\nsleep 5s\" > startworker.sh ")
  run("cd #{current_path} ; chmod +x startworker.sh")
  run("cd #{current_path} ; ./startworker.sh")
  run("cd #{current_path} ; rm startworker.sh")
end
I know this is a makeshift solution, but it works well in my project.
You can manage your workers with this script. Commands available:
rake resque:start_workers
rake resque:stop_workers
rake resque:restart_workers
resque-scheduler is also included. Comment out these lines to disable it:
pid = spawn(env_vars, 'bundle exec rake resque:scheduler', ops_s)
Process.detach(pid)
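The linked script is not reproduced here, but a minimal sketch of the same idea (illustrative names and paths, not the original script) looks like this:

# lib/tasks/resque_workers.rake -- illustrative sketch, not the referenced script
namespace :resque do
  pid_dir = 'tmp/pids'

  desc "Start a couple of resque workers in the background"
  task :start_workers => :environment do
    FileUtils.mkdir_p pid_dir
    2.times do |i|
      env_vars = { 'QUEUE' => '*', 'RAILS_ENV' => Rails.env.to_s }
      pid = spawn(env_vars, 'bundle exec rake resque:work')
      Process.detach(pid)                                   # don't leave zombies behind
      File.write("#{pid_dir}/resque_worker_#{i}.pid", pid)  # remember who to stop later
    end
  end

  desc "Stop the workers started above (QUIT lets them finish their current job)"
  task :stop_workers do
    Dir.glob("#{pid_dir}/resque_worker_*.pid").each do |pidfile|
      begin
        Process.kill('QUIT', File.read(pidfile).to_i)
      rescue Errno::ESRCH
        # worker already exited
      end
      File.delete(pidfile)
    end
  end

  desc "Restart resque workers"
  task :restart_workers => [:stop_workers, :start_workers]
end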
