Unicorn USR2 Restart Hanging Problems - ruby-on-rails
I’m having a peculiar problem when attempting to restart Unicorn with a USR2 signal. On a clean reboot of the VPS, I have no problem sending a USR2 signal to Unicorn and having it gracefully restart. However, after an hour or so, if I try to do it again, I am left with an old master hanging around that prevents the new master from starting. I am then forced to kill the old master so the new master can start. If I reboot the VPS the problem goes away, but after an hour or so it starts again. I'm on Rails 4, Ruby 2.0.0.
unicorn.log
I, [2014-01-07T15:37:37.118523 #19797] INFO -- : executing ["/srv/rails/current/bin/unicorn", "-c", "/srv/rails/current/config/unicorn.rb", {12=>#<Kgio::UNIXServer:fd 12>}] (in /srv/rails/releases/20140107091945)
I, [2014-01-07T15:37:37.118983 #19797] INFO -- : forked child re-executing...
I, [2014-01-07T15:37:38.998632 #19797] INFO -- : inherited addr=/srv/rails/shared/sockets/unicorn.sock fd=12
I, [2014-01-07T15:37:38.999038 #19797] INFO -- : Refreshing Gem list
I, [2014-01-07T15:37:41.927794 #19967] INFO -- : Refreshing Gem list
/srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:219:in `pid=': Already running on PID:19967 (or pid=/srv/rails/shared/pids/unicorn.pid is stale) (ArgumentError)
from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:151:in `start'
from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/bin/unicorn:126:in `<top (required)>'
from /srv/rails/current/bin/unicorn:16:in `load'
from /srv/rails/current/bin/unicorn:16:in `<main>'
/srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:219:in `pid=': Already running on PID:21250 (or pid=/srv/rails/shared/pids/unicorn.pid is stale) (ArgumentError)
from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:151:in `start'
from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/bin/unicorn:126:in `<top (required)>'
from /srv/rails/current/bin/unicorn:16:in `load'
from /srv/rails/current/bin/unicorn:16:in `<main>'
E, [2014-01-07T15:40:46.720131 #20878] ERROR -- : reaped #<Process::Status: pid 21075 exit 1> exec()-ed
E, [2014-01-07T15:40:46.720870 #20878] ERROR -- : master loop error: Already running on PID:21250 (or pid=/srv/rails/shared/pids/unicorn.pid is stale) (ArgumentError)
E, [2014-01-07T15:40:46.723525 #20878] ERROR -- : /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:219:in `pid='
E, [2014-01-07T15:40:46.723671 #20878] ERROR -- : /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:413:in `reap_all_workers'
E, [2014-01-07T15:40:46.723747 #20878] ERROR -- : /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:292:in `join'
E, [2014-01-07T15:40:46.723815 #20878] ERROR -- : /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/bin/unicorn:126:in `<top (required)>'
E, [2014-01-07T15:40:46.723880 #20878] ERROR -- : /srv/rails/current/bin/unicorn:16:in `load'
E, [2014-01-07T15:40:46.723930 #20878] ERROR -- : /srv/rails/current/bin/unicorn:16:in `<main>'
E, [2014-01-07T15:41:13.704700 #21250] ERROR -- : adding listener failed addr=/srv/rails/shared/sockets/unicorn.sock (in use)
E, [2014-01-07T15:41:13.704901 #21250] ERROR -- : retrying in 0.5 seconds (4 tries left)
E, [2014-01-07T15:41:14.205452 #21250] ERROR -- : adding listener failed addr=/srv/rails/shared/sockets/unicorn.sock (in use)
E, [2014-01-07T15:41:14.205597 #21250] ERROR -- : retrying in 0.5 seconds (3 tries left)
78.40.124.16, 173.245.49.122 - - [07/Jan/2014 15:41:14] "GET / HTTP/1.0" 200 28697 0.8345
E, [2014-01-07T15:41:14.706179 #21250] ERROR -- : adding listener failed addr=/srv/rails/shared/sockets/unicorn.sock (in use)
E, [2014-01-07T15:41:14.706335 #21250] ERROR -- : retrying in 0.5 seconds (2 tries left)
E, [2014-01-07T15:41:15.206834 #21250] ERROR -- : adding listener failed addr=/srv/rails/shared/sockets/unicorn.sock (in use)
E, [2014-01-07T15:41:15.206987 #21250] ERROR -- : retrying in 0.5 seconds (1 tries left)
E, [2014-01-07T15:41:15.707431 #21250] ERROR -- : adding listener failed addr=/srv/rails/shared/sockets/unicorn.sock (in use)
E, [2014-01-07T15:41:15.707563 #21250] ERROR -- : retrying in 0.5 seconds (0 tries left)
78.40.124.16, 149.154.158.74 - - [07/Jan/2014 15:41:15] "GET / HTTP/1.0" 200 32866 0.4528
E, [2014-01-07T15:41:16.208055 #21250] ERROR -- : adding listener failed addr=/srv/rails/shared/sockets/unicorn.sock (in use)
/srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/socket_helper.rb:158:in `initialize': Address already in use - "/srv/rails/shared/sockets/unicorn.sock" (Errno::EADDRINUSE)
from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/socket_helper.rb:158:in `new'
from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/socket_helper.rb:158:in `bind_listen'
from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:255:in `listen'
from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:801:in `block in bind_new_listeners!'
from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:801:in `each'
from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:801:in `bind_new_listeners!'
from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:146:in `start'
from /srv/rails/shared/bundle/ruby/2.0.0/gems/unicorn-4.7.0/bin/unicorn:126:in `<top (required)>'
from /srv/rails/current/bin/unicorn:16:in `load'
from /srv/rails/current/bin/unicorn:16:in `<main>'
unicorn.rb
deploy_path = "/srv/rails"
RAILS_ENV = ENV['RAILS_ENV'] || "production"
working_directory "#{deploy_path}/current"
pid "#{deploy_path}/shared/pids/unicorn.pid"
stderr_path "#{deploy_path}/shared/log/unicorn.log"
# Listen on a UNIX data socket
listen "#{deploy_path}/shared/sockets/unicorn.sock"
worker_processes 4
# Preload application before forking worker processes
preload_app true
# Restart any workers that haven't responded in 30 seconds
timeout 30
before_fork do |server, worker|
  ##
  # When sent a USR2, Unicorn will suffix its pidfile with .oldbin and
  # immediately start loading up a new version of itself (loaded with a new
  # version of our app). When this new Unicorn is completely loaded
  # it will begin spawning workers. The first worker spawned will check to
  # see if an .oldbin pidfile exists. If so, this means we've just booted up
  # a new Unicorn and need to tell the old one that it can now die. To do so
  # we send it a QUIT.
  #
  # Using this method we get 0 downtime deploys.
  old_pid = "#{server.config[:pid]}.oldbin"
  if File.exists?(old_pid) && server.pid != old_pid
    begin
      Process.kill("QUIT", File.read(old_pid).to_i)
    rescue Errno::ENOENT, Errno::ESRCH => e
      log = File.open(Rails.root.join('log/unicorn.log'), "a")
      log.puts "Error encountered when killing process:\n"
      log.puts "#{e.message}"
      log.close
    end
  end

  # the following is recommended for Rails + "preload_app true"
  # as there's no need for the master process to hold a connection
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.connection.disconnect!
  end
end

after_fork do |server, worker|
  ##
  # Unicorn master loads the app then forks off workers - because of the way
  # Unix forking works, we need to make sure we aren't using any of the parent's
  # sockets, e.g. db connection
  ActiveRecord::Base.establish_connection
  # Redis and Memcached would go here but their connections are established
  # on demand, so the master never opens a socket

  ##
  # Unicorn master is started as root, which is fine, but let's
  # drop the workers to deployer
  begin
    uid, gid = Process.euid, Process.egid
    user, group = 'deployer', 'deployer'
    target_uid = Etc.getpwnam(user).uid
    target_gid = Etc.getgrnam(group).gid
    worker.tmp.chown(target_uid, target_gid)
    if uid != target_uid || gid != target_gid
      Process.initgroups(user, target_gid)
      Process::GID.change_privilege(target_gid)
      Process::UID.change_privilege(target_uid)
    end
  rescue => e
    if RAILS_ENV == 'development'
      STDERR.puts "couldn't change user, oh well"
    else
      raise e
    end
  end
end
deploy.rb
require 'bundler/capistrano' # runs a bundle install --deployment
# https://github.com/sstephenson/rbenv/issues/101
set :keep_releases, 10
set :shared_children, shared_children + %w(public/images public/uploads)
# Multistage extension
set :stages, ["production", "staging"]
set :default_stage, "staging"
require 'capistrano/ext/multistage'
require 'underglow/capistrano'
# Whenever crontab updates
set :whenever_environment, defer { stage }
set :whenever_command, "bin/whenever"
require 'whenever/capistrano'
set :application, "rails"
set :user, "deployer"
default_run_options[:pty] = true
default_run_options[:shell] = '/bin/zsh'
set :use_sudo, false
# repository
set :repository, "XXXXXXXXXXXXXXXXX"
set :branch, fetch(:branch, "master") # can specify a branch from `cap -S branch="<branch_name>"`
set :scm, :git
set :scm_verbose, true
set :ssh_options, forward_agent: true
set :deploy_to, "/srv/rails"
set :deploy_via, :remote_cache
# We're using a rbenv user install, setup the PATH we need to access the rbenv shims
set :default_environment, {
'PATH' => "$HOME/.rbenv/shims:$HOME/.rbenv/bin:$PATH"
}
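The restart itself is done by sending USR2 to the running Unicorn master. For reference, a minimal sketch of a Capistrano 2 style task that does this (hypothetical, since the actual restart task is not shown here; the pid path is assumed from the unicorn.rb above):
namespace :deploy do
  task :restart, roles: :app, except: { no_release: true } do
    # Send USR2 to the master listed in the shared pidfile (assumed path).
    run "kill -s USR2 `cat /srv/rails/shared/pids/unicorn.pid`"
  end
end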
Has anyone seen this?
You should check the unicorn stdout/stderr logs for more evidence about why the old unicorn may be hanging or the new one failing to kill it off properly.
One gotcha: if the old Capistrano release directory has been removed during deployment of the new release, you may hit bundler errors during the hot-swap handoff. A common recommendation is to bind BUNDLE_GEMFILE to the permanent current path rather than the release-specific one:
before_exec do |server|
  ENV['BUNDLE_GEMFILE'] = "#{deploy_path}/current/Gemfile"
end
If that is the problem you're having, you should be seeing bundler errors or a failure in the unicorn logs.
This may not help you, but here is what I did to "fix" the problem.
I started getting this problem with the release of Unicorn 4.7.0. In 4.7.0 the way pid files are written changed, and that broke my restart script. The old pre-4.7.0 behavior was: move the pid file to .oldbin, write the new pid, start up the workers, shut down the old master (that last step being driven by my unicorn.rb, of course). The new behavior removes the old pid file quickly and only writes the new one after some heavy lifting has occurred. This broke my script because it could no longer trust that the restart had completed, so it attempted to restart Unicorn again, which left the freshly started Unicorn process and the script-started "full start" one fighting each other. Both lost in various ways, so both exited, leaving an old master still serving requests.
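For illustration only (this is not the answerer's actual script), a minimal Ruby sketch of a restart helper that polls the pid file instead of trusting that the handoff is instantaneous, using the pid path from the question's unicorn.rb:
pid_file   = "/srv/rails/shared/pids/unicorn.pid"  # assumed path from the question
old_master = File.read(pid_file).to_i

# Ask the running master to re-exec itself with the new code.
Process.kill("USR2", old_master)

# Wait for the new master to write its own pid instead of assuming it is instant.
30.times do
  sleep 1
  new_master = File.read(pid_file).to_i rescue nil
  if new_master && new_master != old_master
    puts "new master #{new_master} is up"
    exit 0
  end
end
abort "restart did not complete; old master #{old_master} may still be running"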
I also had a defect in my unicorn.rb file which did not properly set up bundler, as someone already mentioned.
Upgrading to Unicorn 4.8.1, released recently, fixed this problem as pid files are written as they were in pre-4.7.0 days.
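If you manage Unicorn through bundler, a minimal sketch of pinning the fixed version in the Gemfile (the exact constraint is an assumption) looks like:
# Gemfile: require at least 4.8.1 so pid files are written the pre-4.7.0 way
gem 'unicorn', '>= 4.8.1'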
Related
Discourse server fails to start with errors related to redis
Rails server fails to start in a Discourse project in either development or production. Below are the logs when trying to start the server in dev mode. The application was installed and has been working. It's deployed on AWS in production mode; restarting Unicorn loads the application for some time, and then the URL stops responding with error messages.
Development logs from $ rails s:
root#ip-XXX-XX-XX-XX-app:/var/www/discourse# vi config/environments/development.rb
root#ip-172-31-25-46-app:/var/www/discourse# rails s
=> Booting Puma
=> Rails 5.1.4 application starting in production
=> Run `rails server -h` for more startup options
Exiting
bundler: failed to load command: script/rails (script/rails)
Redis::CommandError: ERR Error running script (call to f_b06356ba4628144e123b652c99605b873107c9be): #user_script:14: #user_script: 14: -MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error.
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.5/lib/redis/client.rb:121:in `call'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.5/lib/redis.rb:2399:in `block in _eval'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.5/lib/redis.rb:58:in `block in synchronize'
/usr/local/lib/ruby/2.4.0/monitor.rb:214:in `mon_synchronize'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.5/lib/redis.rb:58:in `synchronize'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.5/lib/redis.rb:2398:in `_eval'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.5/lib/redis.rb:2450:in `evalsha'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/message_bus-2.1.1/lib/message_bus/backends/redis.rb:380:in `cached_eval'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/message_bus-2.1.1/lib/message_bus/backends/redis.rb:140:in `publish'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/message_bus-2.1.1/lib/message_bus.rb:248:in `publish'
/var/www/discourse/lib/distributed_cache.rb:72:in `publish'
Production logs:
/var/www/discourse/lib/demon/base.rb:109:in `ensure_running'
/var/www/discourse/lib/demon/base.rb:34:in `block in ensure_running'
/var/www/discourse/lib/demon/base.rb:33:in `each'
/var/www/discourse/lib/demon/base.rb:33:in `ensure_running'
config/unicorn.conf.rb:145:in `master_sleep'
/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/unicorn-5.1.0/lib/unicorn/http_server.rb:284:in `join'
/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/unicorn-5.1.0/bin/unicorn:126:in `<top (required)>'
/var/www/discourse/vendor/bundle/ruby/2.3.0/bin/unicorn:23:in `load'
/var/www/discourse/vendor/bundle/ruby/2.3.0/bin/unicorn:23:in `<main>'
E, [2018-01-04T08:43:37.949928 #60] ERROR -- : reaped #<Process::Status: pid 5870 exit 1> worker=unknown
Detected dead worker 5870, restarting...
Loading Sidekiq in process id 5883
Failed to report error: MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error.
4 Redis::CommandError (MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error.)
/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:121:in `call' web-exception
Redis logs:
47:M 17 Jan 09:38:01.070 # Can't save in background: fork: Cannot allocate memory
47:M 17 Jan 09:38:07.087 * 10000 changes in 60 seconds. Saving...
The issue has been fixed. I edited /etc/sysctl.conf, added the line vm.overcommit_memory=1 at the end, and then reloaded sysctl with: sudo sysctl -p /etc/sysctl.conf. Redis doesn't need as much memory as the OS thinks; setting the value to 1 means "always overcommit, never check". More details can be found in the Redis docs.
websocket-rails / puma: "async response must have empty headers and body"
I'm using websocket_rails to provide an API for a JS client. Locally it works great, but the exact same setup in production will (seemingly randomly) decide to stop working. My production.log yields:
RuntimeError (eventmachine not initialized: evma_install_oneshot_timer)
At first I thought this was the root issue, but my Puma error log yields this when I restart the server and try again:
RuntimeError: async response must have empty headers and body
I added some logging in the puma gem, and indeed, it's receiving Rails session headers when doing GET /websocket. Sometimes there is no issue at all, and everything works fine for a few days, and then, not. And no matter what I do it just refuses to work again. Thanks in advance. I've wasted days on this problem!
Puma config:
# Change to match your CPU core count
workers 1
# Min and Max threads per worker
threads 1, 6
app_dir = File.expand_path("../..", __FILE__)
shared_dir = "#{app_dir}/shared"
# Default to production
rails_env = ENV['RAILS_ENV'] || "production"
environment rails_env
# Set up socket location
bind "unix://#{shared_dir}/sockets/puma.sock"
# Logging
stdout_redirect "#{shared_dir}/log/puma.stdout.log", "#{shared_dir}/log/puma.stderr.log", true
# Set master PID and state locations
pidfile "#{shared_dir}/pids/puma.pid"
state_path "#{shared_dir}/pids/puma.state"
activate_control_app
on_worker_boot do
  require "active_record"
  ActiveRecord::Base.connection.disconnect! rescue ActiveRecord::ConnectionNotEstablished
  ActiveRecord::Base.establish_connection(YAML.load_file("#{app_dir}/config/database.yml")[rails_env])
end
Unicorn workers timeout after "zero downtime" deploy with capistrano
I'm running a Rails 3.2.21 app and deploy to a Ubuntu 12.04.5 box using capistrano (nginx and unicorn). I have my app set for a zero-downtime deploy (at least I thought), with my config files looking more or less like these.
Here's the problem: when the deploy is nearly done and it restarts unicorn, I watch my unicorn.log and see it fire up the new workers, reap the old ones... but then my app just hangs for 2-3 minutes. Any request to the app at this point hits the timeout window (which I set to 40 seconds) and returns my app's 500 error page.
Here is the first part of the output from unicorn.log as unicorn is restarting (I have 5 unicorn workers):
I, [2015-04-21T23:06:57.022492 #14347] INFO -- : master process ready
I, [2015-04-21T23:06:57.844273 #15378] INFO -- : worker=0 ready
I, [2015-04-21T23:06:57.944080 #15381] INFO -- : worker=1 ready
I, [2015-04-21T23:06:58.089655 #15390] INFO -- : worker=2 ready
I, [2015-04-21T23:06:58.230554 #14541] INFO -- : reaped #<Process::Status: pid 15551 exit 0> worker=4
I, [2015-04-21T23:06:58.231455 #14541] INFO -- : reaped #<Process::Status: pid 3644 exit 0> worker=0
I, [2015-04-21T23:06:58.249110 #15393] INFO -- : worker=3 ready
I, [2015-04-21T23:06:58.650007 #15396] INFO -- : worker=4 ready
I, [2015-04-21T23:07:01.246981 #14541] INFO -- : reaped #<Process::Status: pid 32645 exit 0> worker=1
I, [2015-04-21T23:07:01.561786 #14541] INFO -- : reaped #<Process::Status: pid 15534 exit 0> worker=2
I, [2015-04-21T23:07:06.657913 #14541] INFO -- : reaped #<Process::Status: pid 16821 exit 0> worker=3
I, [2015-04-21T23:07:06.658325 #14541] INFO -- : master complete
Afterwards, as the app hangs for those 2-3 minutes, here is what's happening:
E, [2015-04-21T23:07:38.069635 #14347] ERROR -- : worker=0 PID:15378 timeout (41s > 40s), killing
E, [2015-04-21T23:07:38.243005 #14347] ERROR -- : reaped #<Process::Status: pid 15378 SIGKILL (signal 9)> worker=0
E, [2015-04-21T23:07:39.647717 #14347] ERROR -- : worker=3 PID:15393 timeout (41s > 40s), killing
E, [2015-04-21T23:07:39.890543 #14347] ERROR -- : reaped #<Process::Status: pid 15393 SIGKILL (signal 9)> worker=3
I, [2015-04-21T23:07:40.727755 #16002] INFO -- : worker=0 ready
I, [2015-04-21T23:07:43.212395 #16022] INFO -- : worker=3 ready
E, [2015-04-21T23:08:24.511967 #14347] ERROR -- : worker=3 PID:16022 timeout (41s > 40s), killing
E, [2015-04-21T23:08:24.718512 #14347] ERROR -- : reaped #<Process::Status: pid 16022 SIGKILL (signal 9)> worker=3
I, [2015-04-21T23:08:28.010429 #16234] INFO -- : worker=3 ready
Eventually, after 2 or 3 minutes, the app starts being responsive again, but everything is more sluggish. You can see this very clearly in New Relic (the horizontal line marks the deploy, and the light blue area indicates Ruby). I have an identical staging server, and I cannot replicate the issue in staging... granted, staging is under no load (I'm the only person trying to make page requests).
Here is my config/unicorn.rb file:
root = "/home/deployer/apps/myawesomeapp/current"
working_directory root
pid "#{root}/tmp/pids/unicorn.pid"
stderr_path "#{root}/log/unicorn.log"
stdout_path "#{root}/log/unicorn.log"
shared_path = "/home/deployer/apps/myawesomeapp/shared"
listen "/tmp/unicorn.myawesomeapp.sock"
worker_processes 5
timeout 40
preload_app true
before_exec do |server|
  ENV['BUNDLE_GEMFILE'] = "#{root}/Gemfile"
end
before_fork do |server, worker|
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.connection.disconnect!
  end
  old_pid = "#{root}/tmp/pids/unicorn.pid.oldbin"
  if File.exists?(old_pid) && server.pid != old_pid
    begin
      Process.kill("QUIT", File.read(old_pid).to_i)
    rescue Errno::ENOENT, Errno::ESRCH
    end
  end
end
after_fork do |server, worker|
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.establish_connection
  end
end
And just to paint a complete picture, in my capistrano deploy.rb, the unicorn restart task looks like this:
namespace :deploy do
  task :restart, roles: :app, except: { no_release: true } do
    run "kill -s USR2 `cat #{release_path}/tmp/pids/unicorn.pid`"
  end
end
Any ideas why the unicorn workers timeout right after the deploy? I thought the point of a zero-downtime deploy was to keep the old ones around until the new ones are spun up and ready to serve? Thanks!
UPDATE
I did another deploy, and this time kept an eye on production.log to see what was going on there. The only suspicious thing was the following lines, which were mixed in with normal requests:
Dalli/SASL authenticating as 7510de
Dalli/SASL: 7510de
Dalli/SASL authenticating as 7510de
Dalli/SASL: 7510de
Dalli/SASL authenticating as 7510de
Dalli/SASL: 7510de
UPDATE #2
As suggested by some of the answers below, I changed the before_fork block to add sig = (worker.nr + 1) >= server.worker_processes ? :QUIT : :TTOU so the workers would be incrementally killed off. Same result, terribly slow deploy, with the same spike I illustrated in the graph above. Just for context, out of my 5 worker processes, the first 4 sent a TTOU signal, and the 5th sent QUIT. Still, it does not seem to have made a difference.
I came across a similar problem recently while trying to set up Rails/Nginx/Unicorn on Digital Ocean. I was able to get zero-downtime deploys to work after tweaking a few things. Here are a few things to try:
- Reduce the number of worker processes.
- Increase the memory of your server. I was getting timeouts on the 512MB RAM droplet. Increasing it to 1GB seemed to fix the issue.
- Use the "capistrano3-unicorn" gem. If preload_app is true, use restart (USR2). If false, use reload (HUP).
- Ensure "tmp/pids" is included in linked_dirs in deploy.rb.
- Use ps aux | grep unicorn to make sure the old processes are being removed.
- Use kill [pid] to manually stop any unicorn processes still running.
Here's my unicorn config for reference:
working_directory '/var/www/yourapp/current'
pid '/var/www/yourapp/current/tmp/pids/unicorn.pid'
stderr_path '/var/www/yourapp/log/unicorn.log'
stdout_path '/var/www/yourapp/log/unicorn.log'
listen '/tmp/unicorn.yourapp.sock'
worker_processes 2
timeout 30
preload_app true
before_fork do |server, worker|
  old_pid = "/var/www/yourapp/current/tmp/pids/unicorn.pid.oldbin"
  if old_pid != server.pid
    begin
      sig = (worker.nr + 1) >= server.worker_processes ? :QUIT : :TTOU
      Process.kill(sig, File.read(old_pid).to_i)
    rescue Errno::ENOENT, Errno::ESRCH
    end
  end
end
deploy.rb:
lock '3.4.0'
set :application, 'yourapp'
set :repo_url, 'git@bitbucket.org:username/yourapp.git'
set :deploy_to, '/var/www/yourapp'
set :linked_files, fetch(:linked_files, []).push('config/database.yml', 'config/secrets.yml', 'config/application.yml')
set :linked_dirs, fetch(:linked_dirs, []).push('log', 'tmp/pids', 'tmp/cache', 'tmp/sockets', 'vendor/bundle', 'public/system')
set :format, :pretty
set :log_level, :info
set :rbenv_ruby, '2.1.3'
namespace :deploy do
  after :restart, :clear_cache do
    on roles(:web), in: :groups, limit: 3, wait: 10 do
    end
  end
end
after 'deploy:publishing', 'deploy:restart'
namespace :deploy do
  task :restart do
    # invoke 'unicorn:reload'
    invoke 'unicorn:restart'
  end
end
Are you vendoring unicorn and having cap run a bundle install on deploy? If so, this could be an executable issue. When you do a Capistrano deploy, cap creates a new release directory for your revision and moves the current symlink to point to the new release. If you haven't told the running unicorn to gracefully update the path to its executable, it should work if you add this line:
Unicorn::HttpServer::START_CTX[0] = ::File.join(ENV['GEM_HOME'].gsub(/releases\/[^\/]+/, "current"),'bin','unicorn')
You can find some more information here. I think the before_fork block you have looks good, but I would also add the sig = (worker.nr + 1) >= server.worker_processes ? :QUIT : :TTOU line from @travisluong's answer; that will incrementally kill off the old workers as the new ones spawn. I would not remove preload_app true, incidentally, as it greatly improves worker spawn time.
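A minimal sketch of where that line could sit near the top of unicorn.rb (the GEM_HOME-based path rewrite is the suggestion above; whether your unicorn binstub actually lives under GEM_HOME is an assumption to verify):
# Re-exec the unicorn binary through the "current" symlink rather than the
# release-specific path cached at boot; path rewrite as suggested above.
if ENV['GEM_HOME']
  Unicorn::HttpServer::START_CTX[0] =
    ::File.join(ENV['GEM_HOME'].gsub(/releases\/[^\/]+/, "current"), 'bin', 'unicorn')
end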
`build': /home/releases/#{release_number}/Gemfile not found (Bundler::GemfileNotFound)
INFO -- : executing ["/home/shared/bundle/ruby/2.1.0/bin/unicorn", "-c", "/home/current/config/unicorn/production.rb", "-E", "deployment", "-D", {12=>#<Kgio::UNIXServer:fd 12>}] (in /home/releases/20140714144301)
DEBUG[b59ba27c] I, [2014-07-14T14:43:33.495683 #27897] INFO -- : forked child re-executing...
DEBUG[b59ba27c] /home/
DEBUG[b59ba27c] lic/.rvm/gems/ruby-2.1.2#global/gems/bundler-1.6.3/lib/bundler/definition.rb:23:in `build': /home/releases/20140710032913/Gemfile not found (Bundler::GemfileNotFound)
I am pretty sure the problem stems from the fact that with zero-downtime deployment, unicorn is caching this release number: 20140710032913. I have capistrano set to keep 5 releases. After 5 deploys, the old release number is rolled off. How can I force unicorn to use the current Gemfile?
Possible duplicate: Unicorn restart issue with capistrano
In my #{Rails.root}/config/unicorn/#{environment}.rb file I needed:
before_exec do |_server|
  ENV['BUNDLE_GEMFILE'] = "#{working_directory}/Gemfile"
end
Unicorn restarting on its own - No memory - gets killed
I am running two Rails apps on DigitalOcean with 512MB RAM and 4 nginx processes. The Rails apps use Unicorn. One has 2 workers and the other uses 1. My problem is with the second app that has 1 Unicorn worker (the same problem was there when it had 2 workers as well). What happens is, suddenly my app throws a 500 error. When I SSH into the server I find that the app's unicorn process is not running! When I start unicorn again everything is fine. This is my log file. As you can see, the worker gets reaped and then the master is not able to fork a new one, and the reason given is no memory.
I, [2014-01-24T04:12:28.080716 #8820] INFO -- : master process ready
I, [2014-01-24T04:12:28.110834 #8824] INFO -- : worker=0 ready
E, [2014-01-24T06:45:08.423082 #8820] ERROR -- : reaped #<Process::Status: pid 8824 SIGKILL (signal 9)> worker=0
E, [2014-01-24T06:45:08.438352 #8820] ERROR -- : Cannot allocate memory - fork(2) (Errno::ENOMEM)
/home/vocp/projects/hmd/vendor/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:523:in `fork'
/home/vocp/projects/hmd/vendor/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:523:in `spawn_missing_workers'
/home/vocp/projects/hmd/vendor/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:538:in `maintain_worker_count'
/home/vocp/projects/hmd/vendor/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:303:in `join'
/home/vocp/projects/hmd/vendor/bundle/ruby/2.0.0/gems/unicorn-4.7.0/bin/unicorn:126:in `<top (required)>'
/home/vocp/projects/hmd/vendor/bundle/ruby/2.0.0/bin/unicorn:23:in `load'
/home/vocp/projects/hmd/vendor/bundle/ruby/2.0.0/bin/unicorn:23:in `<main>'
I, [2014-01-24T08:43:53.693228 #26868] INFO -- : Refreshing Gem list
I, [2014-01-24T08:43:56.283950 #26868] INFO -- : unlinking existing socket=/tmp/unicorn.hmd.sock
I, [2014-01-24T08:43:56.284840 #26868] INFO -- : listening on addr=/tmp/unicorn.hmd.sock fd=11
I, [2014-01-24T08:43:56.320075 #26868] INFO -- : master process ready
I, [2014-01-24T08:43:56.348648 #26872] INFO -- : worker=0 ready
E, [2014-01-24T09:10:07.251846 #26868] ERROR -- : reaped #<Process::Status: pid 26872 SIGKILL (signal 9)> worker=0
I, [2014-01-24T09:10:07.300339 #27743] INFO -- : worker=0 ready
I, [2014-01-24T09:18:09.992675 #28039] INFO -- : executing ["/home/vocp/projects/hmd/vendor/bundle/ruby/2.0.0/bin/unicorn", "-D", "-c", "/home/vocp/projects/hmd/config/unicorn.rb", "-E", "production", {11=>#<Kgio::UNIXServer:/tmp/unicorn.hmd.sock>}] (in /home/vocp/projects/hmd)
I, [2014-01-24T09:18:10.426852 #28039] INFO -- : inherited addr=/tmp/unicorn.hmd.sock fd=11
I, [2014-01-24T09:18:10.427090 #28039] INFO -- : Refreshing Gem list
E, [2014-01-24T09:18:13.456986 #28039] ERROR -- : Cannot allocate memory - fork(2) (Errno::ENOMEM)
/home/vocp/projects/hmd/vendor/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:523:in `fork'
/home/vocp/projects/hmd/vendor/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:523:in `spawn_missing_workers'
/home/vocp/projects/hmd/vendor/bundle/ruby/2.0.0/gems/unicorn-4.7.0/lib/unicorn/http_server.rb:153:in `start'
/home/vocp/projects/hmd/vendor/bundle/ruby/2.0.0/gems/unicorn-4.7.0/bin/unicorn:126:in `<top (required)>'
/home/vocp/projects/hmd/vendor/bundle/ruby/2.0.0/bin/unicorn:23:in `load'
/home/vocp/projects/hmd/vendor/bundle/ruby/2.0.0/bin/unicorn:23:in `<main>'
E, [2014-01-24T09:18:13.464982 #26868] ERROR -- : reaped #<Process::Status: pid 28039 exit 1> exec()-ed
This is my unicorn.rb:
root = "/home/vocp/projects/hmd"
working_directory root
pid "#{root}/tmp/pids/unicorn.pid"
stderr_path "#{root}/log/unicorn.log"
stdout_path "#{root}/log/unicorn.log"
listen "/tmp/unicorn.hmd.sock"
worker_processes 1
timeout 30
preload_app true
# Force the bundler gemfile environment variable to
# reference the capistrano "current" symlink
before_exec do |_|
  ENV["BUNDLE_GEMFILE"] = File.join(root, 'Gemfile')
end
before_fork do |server, worker|
  defined?(ActiveRecord::Base) && ActiveRecord::Base.connection.disconnect!
  old_pid = Rails.root + '/tmp/pids/unicorn.pid.oldbin'
  if File.exists?(old_pid) && server.pid != old_pid
    begin
      Process.kill("QUIT", File.read(old_pid).to_i)
    rescue Errno::ENOENT, Errno::ESRCH
      puts "Old master alerady dead"
    end
  end
end
after_fork do |server, worker|
  defined?(ActiveRecord::Base) && ActiveRecord::Base.establish_connection
  child_pid = server.config[:pid].sub('.pid', ".#{worker.nr}.pid")
  system("echo #{Process.pid} > #{child_pid}")
end
I do not have monit or god or any monitoring tools. I find it very odd because the used server memory is generally around 380/490. And nobody uses these two apps apart from me! They are in development. Have I wrongly configured anything? Why is this happening? Please help. Should I configure god to restart unicorn when it crashes?
For Unicorn memory usage the only way is up, unfortunately. Unicorn will allocate more memory if your Rails app needs it, but it does not release it even when it no longer needs it. For example, if you load a lot of records for an index page at once, Unicorn will increase its memory usage. This is exacerbated by the fact that 512MB is not a huge amount of memory for 2 Rails apps with 3 workers. Furthermore, there are memory leaks that increase memory usage too. See this article: https://www.digitalocean.com/community/articles/how-to-optimize-unicorn-workers-in-a-ruby-on-rails-app At the end of the article they refer to the unicorn-worker-killer gem, which restarts Unicorn workers based on either a maximum number of requests or maximum memory and looks pretty straightforward. Personally I have used the bluepill gem to monitor individual unicorn processes and restart them if needed. In your case I would monitor all unicorn processes and restart them if they reach a certain memory size.
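As a sketch of the unicorn-worker-killer approach mentioned above (configured in config.ru before the app is loaded; the request and memory limits here are assumptions, and MyApp is a placeholder for your application class):
# config.ru
require 'unicorn/worker_killer'

# Restart a worker after it has served roughly 500-600 requests.
use Unicorn::WorkerKiller::MaxRequests, 500, 600

# Restart a worker once its resident memory reaches roughly 100-120 MB.
use Unicorn::WorkerKiller::Oom, (100 * 1024**2), (120 * 1024**2)

require ::File.expand_path('../config/environment', __FILE__)
run MyApp::Application
The min/max ranges are there so that the workers do not all hit their limit and restart at the same moment.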
First check the free memory on your server (for example with free -m; note that df -H reports disk usage, not memory). If the memory is OK, reboot your system with sudo reboot and it should work fine.