Cannot restart unicorn - ruby-on-rails

I have a unicorn + nginx setup, and suddenly when I run cap unicorn:upgrade (which sends a USR2 to the master process) it doesn't rename the .pid file to .pid.oldbin and it doesn't fork a new master process at all. When I open the log file I can see the line
reaped #<Process::Status: pid 32448 exit 10> exec()-ed
Can anyone suggest what I can do to figure out what's wrong?
Thanks

Does your unicorn config have preload_app true? If it does, you may need to send a QUIT signal to the old master once the new one is up.
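For reference, here is a minimal sketch of that manual upgrade sequence (the pid paths are assumptions based on Unicorn's default .oldbin naming; adjust them to your setup):
# upgrade_unicorn.rb -- a rough sketch, not the actual cap unicorn:upgrade task.
# Assumes the master pid is in tmp/pids/unicorn.pid and that a successful USR2
# renames it to unicorn.pid.oldbin while the new master writes a fresh pid file.
pid_file     = "tmp/pids/unicorn.pid"
old_pid_file = "#{pid_file}.oldbin"

Process.kill("USR2", File.read(pid_file).to_i)  # ask the running master to re-exec itself

sleep 5  # give the new master time to boot and spawn its workers

if File.exist?(old_pid_file)
  Process.kill("QUIT", File.read(old_pid_file).to_i)  # gracefully retire the old master
else
  warn "no #{old_pid_file} found -- the re-exec probably failed; check the unicorn stderr log"
end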

Related

Unicorn workers dying for no reason

All the unicorn workers are dying silently, no indication as to why, and I can't find any evidence of an external process killing them. I'm new to diagnosing this kind of stuff, and after several hours of research, experimenting, and trying to figure this out, I'm at a dead end.
Background info: it's a Rails 4.1 app, Ruby 2.0, running nginx and unicorn on an Ubuntu 14.04 server.
unicorn.rb
working_directory "/home/deployer/apps/ourapp/current"
pid "/home/deployer/apps/ourapp/current/tmp/pids/unicorn.pid"
stderr_path "/home/deployer/apps/ourapp/current/log/unicorn.log"
stdout_path "/home/deployer/apps/ourapp/current/log/unicorn.log"
listen "/tmp/unicorn.ourapp.sock"
worker_processes 2
timeout 30
excerpt from unicorn.log (last lines before it dies and after restart)
I, [2016-08-28T19:54:01.685757 #19559] INFO -- : worker=1 ready
I, [2016-08-28T19:54:01.817464 #19556] INFO -- : worker=0 ready
I, [2016-08-29T09:19:14.818267 #30343] INFO -- : unlinking existing socket=/tmp/unicorn.ourapp.sock
I, [2016-08-29T09:19:14.818639 #30343] INFO -- : listening on addr=/tmp/unicorn.ourapp.sock fd=10
I, [2016-08-29T09:19:14.818807 #30343] INFO -- : worker=0 spawning...
I, [2016-08-29T09:19:14.824358 #30343] INFO -- : worker=1 spawning...
Some pertinent info:
After a period of time ranging from about 8 - 20 hours, unicorn dies.
There's no error recorded in the unicorn log.
I searched all of /var/log for evidence of processes that were killed, and can only find one unrelated process that was killed a few days ago.
New Relic shows flat memory usage before the last random shutdown, with ruby using around 400mb. It's currently at 480mb with no problems, so I don't think it's hitting memory constraints.
Same with CPU usage...ruby was hovering around 0.1% before it died.
The last couple of times it died were in the middle of the night. The only requests coming in were from New Relic and Linode Longview monitoring.
Our production.log shows the last request before dying was a ping from New Relic. It Completed 200 OK in 264ms, so it doesn't seem to be a request timing out.
It's happening in staging as well; the log level there is set to debug, and there are no additional clues in the staging logs.
Questions:
What could be killing the Unicorn workers if it's not the out-of-memory killer or a shutdown signal?
Could it be the OOM killer or a shutdown signal that's being recorded somewhere I'm not looking, or just not being recorded for some reason?
Is there a way to capture what's happening to Unicorn in more detail?
I have no idea where to go from here, so any suggestions would be much appreciated.
UPDATE
As suggested, I used strace to find out that unicorn was being killed by an old crontab (I know I should have checked there earlier) added by the previous developers that was intended to restart the server every night. The stop command worked, but the start command was failing.
I still don't know why I wasn't able to find anything in my log searches, but after attaching strace to the main unicorn process (using something like strace -o /tmp/strace.out -s 2000 -fp <unicorn_process_id>), the strace log ended with a clear +++ killed by SIGKILL +++. I searched the logs again, and that led me to the crontab.
The underlying cause is probably pretty specific to my situation, but I'm really glad I know about strace now.

how to stop rails server (redmine) ?

I've installed Redmine and it's running on my domain. Something went wrong and I can't access the Redmine admin panel. I tried to reset the password, did some googling, and changed the password in the database, but no luck; I still can't log in. I then removed all the app files, but it's still running the same as before.
This is the command I used to run the Redmine server:
bundle exec ruby script/rails server webrick -e -s production
Now I'm trying to stop or restart it, but nothing is working.
Is there any way to stop the server?
Thanks in advance.
Kill the process
kill -INT $(cat tmp/pids/server.pid)
It's cleaner to write a rake task to do that:
task :stopserver do
  pid_file = 'tmp/pids/server.pid'
  if File.file?(pid_file)
    print "Shutting down WEBrick\n"
    pid = File.read(pid_file).to_i
    Process.kill "INT", pid
  end
  File.file?(pid_file) && File.delete(pid_file)
end
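Save the task in lib/tasks (e.g. lib/tasks/stopserver.rake, the file name is just an example) and run it from the app root with bundle exec rake stopserver.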
Delete the file tmp/pids/server.pid and then restart the server.
Usually you'll hit Ctrl-C to stop WEBrick when it was started without the -d option. Ctrl-C sends an INT signal, so you could try kill -INT <pid> to stop WEBrick when it was started with the -d option.
If that doesn't stop it, you can try kill -9 <pid>, which sends a KILL signal. That's not a proper clean shutdown, but it seems to be the only way to stop it. It's not a 'best practice', but it's the only method I've ever found.
$ killall -9 ruby
This command will kill all running instances of ruby on your system, and then you can restart your server.

Unicorn not reloading with USR2

I'm trying to reload unicorn with a USR2 signal, but I get the following error on the logs:
E, [2012-04-13T21:27:00.801192 #24474] ERROR -- : old PID:23820 running with existing pid=/home/user/app.git/tmp/unicorn.pid.oldbin, refusing rexec
I've searched the internet but don't have a clue. It seems that unicorn is trying to write to the pid file? I'm issuing kill -s USR2 PID.
Thanks
I ran into this today. I'm assuming you have previously sent USR2 to unicorn, and this is now the second time you're trying to do so.
Per the unicorn documentation on signals and USR2: "A separate QUIT should be sent to the original process once the child is verified to be up and running."
In this particular case, you'd pass the old PID to kill
kill -s QUIT 23820
Or, you can take advantage of the fact that this old PID is stored in a known file (referenced in your error message) alongside the "current" PID, and execute:
kill -s QUIT `cat /home/user/app.git/tmp/unicorn.pid.oldbin`
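If you'd rather not send the QUIT by hand on every upgrade, Unicorn's example configuration shows a before_fork hook that does it for you once the new master starts spawning workers. A minimal sketch, assuming the pid path is set via pid in the same unicorn.rb:
# unicorn.rb -- automatically retire the old master during a USR2 upgrade.
before_fork do |server, worker|
  old_pid = "#{server.config[:pid]}.oldbin"
  if File.exist?(old_pid) && server.pid != old_pid
    begin
      Process.kill("QUIT", File.read(old_pid).to_i)  # old master exits once its workers wind down
    rescue Errno::ENOENT, Errno::ESRCH
      # the old master is already gone -- nothing to do
    end
  end
end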

Restarting Unicorn with USR2 doesn't seem to reload production.rb settings

I'm running unicorn and am trying to get zero downtime restarts working.
So far it is all awesome sauce, the master process forks and starts 4 new workers, then kills the old one, everyone is happy.
Our scripts send the following command to restart unicorn:
kill -s USR2 `cat /www/app/shared/pids/unicorn.pid`
On the surface everything looks great, but it turns out unicorn isn't reloading production.rb. (Each time we deploy we change the config.action_controller.asset_host value to a new CDN container endpoint with our pre-compiled assets in it).
After restarting unicorn in this way the asset host is still pointing to the old release. Doing a real restart (ie: stop the master process, then start unicorn again from scratch) picks up the new config changes.
preload_app is set to true in our unicorn configuration file.
Any thoughts?
My guess is that your unicorns are being restarted in the old production directory rather than the new production directory -- in other words, if your working directory in unicorn.rb is <capistrano_directory>/current, you need to make sure the symlink happens before you attempt to restart the unicorns.
This would explain why stopping and starting them manually works: you're doing that post-deploy, presumably, which causes them to start in the correct directory.
When in your deploy process are you restarting the unicorns? You should make sure the USR2 signal is being sent after the new release directory is symlinked as current.
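For illustration, a minimal Capistrano 3-style hook that enforces that ordering (task name, role, and pid path are placeholders; older Capistrano versions name the symlink task differently):
# config/deploy.rb -- sketch only; adjust roles, paths, and hook names to your setup.
namespace :unicorn do
  desc "Re-exec the unicorn master after `current` points at the new release"
  task :upgrade do
    on roles(:app) do
      pid = capture(:cat, "/www/app/shared/pids/unicorn.pid").strip
      execute :kill, "-s", "USR2", pid
    end
  end
end
after "deploy:symlink:release", "unicorn:upgrade"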
If this doesn't help, please gist your unicorn.rb and deploy.rb; it'll make it a lot easier to debug this problem.
Keep in mind that your working_directory in unicorn.rb should be:
/your/cap/directory/current
and NOT:
File.expand_path("../..", __FILE__)
because of how Unicorn's re-exec interacts with Linux symlinks: the symlink gets resolved. For example, after cd current (where current is a symlink to a release directory), asking for the working directory returns the absolute path of the release, not the path through current, so the restarted master keeps running out of the old release.

Unicorn completely ignores USR2 signal

I'm experiencing a rather strange problem with unicorn on my production server.
Although the config file states preload_app true, sending USR2 to the master process does not generate any response, and it seems like unicorn is ignoring the signal altogether.
On another server, sending USR2 puts the master process into an (old) state and starts a new master process successfully.
The problematic server is using RVM & bundler, so I'm assuming it's somehow related (the other one is vanilla ruby).
Sending signals other than USR2 (QUIT, HUP) works just fine.
Is there a way to trace what's going on behind the scenes here? Unicorn's log file is completely empty.
I suspect your issue might be that your Gemfile has changed, but you haven't started your unicorn in a way that allows USR2 to use the new Gemfile. It's therefore crashing when you try to restart the app.
Check your /log/unicorn.log for details of what might be failing.
If you're using Capistrano, specify the BUNDLE_GEMFILE as the symlink, e.g.:
run "cd #{current_path} && BUNDLE_GEMFILE=#{current_path}/Gemfile bundle exec unicorn -c #{config_path} -E #{unicorn_env} -D"
Here's a PR that demonstrates this.
I experienced a similar problem, but my logs clearly identified the issue: sending USR2 would initially work on deployments, but as deployments got cleaned up, the release that the Unicorn master was initially started on would get deleted, so attempts at sending a USR2 signal would appear to do nothing / fail, with the error log stating:
forked child re-executing... 53
/var/www/application/releases/153565b36021c0b8c9cbab1cc373a9c5199073db/vendor/bundle/ruby/1.9.1/gems/unicorn-4.3.1/lib/unicorn/http_server.rb:439:in `exec': No such file or directory - /var/www/application/releases/153565b36021c0b8c9cbab1cc373a9c5199073db/vendor/bundle/ruby/1.9.1/bin/unicorn (Errno::ENOENT)
The Unicorn documents mention this potential problem at http://unicorn.bogomips.org/Sandbox.html: "cleaning up old revisions will cause revision-specific installations of unicorn to go missing and upgrades to fail", which in my case meant USR2 appeared to 'do nothing'.
I'm using Chef's application recipe to deploy applications, which creates a symlinked vendor_bundle directory that is shared across deployments, but calling bundle exec unicorn still resulted in the original Unicorn master holding a path reference that included a specific release directory.
To fix it I had to call bundle exec /var/www/application/shared/vendor_bundle/ruby/1.9.1/bin/unicorn to ensure the Unicorn master had a path to a binary that would be valid from one deployment to the next. Once that was done I could deploy to my heart's content, and kill -USR2 PID would work as advertised.
The Unicorn docs mention you can manually change the binary path reference by setting the following in the Unicorn config file and sending HUP to reload Unicorn before sending a USR2 to fork a new master: Unicorn::HttpServer::START_CTX[0] = "/some/path/to/bin/unicorn"
Perhaps this is useful to some people in similar situations, but I didn't implement this as it appears specifying an absolute path to the shared unicorn binary was enough.
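For completeness, if you did want that override, it would go in the Unicorn config file; a sketch using the shared bundle path from this answer (the path is specific to my layout):
# unicorn.rb -- pin the binary Unicorn re-execs on USR2 to a path that survives
# release cleanup; send HUP first so the running master picks up this setting.
Unicorn::HttpServer::START_CTX[0] = "/var/www/application/shared/vendor_bundle/ruby/1.9.1/bin/unicorn"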
I've encountered a similar problem on my VDS. Strace'ing revealed the cause:
write(2, "E, [2011-07-23T04:40:27.240227 #19450] ERROR -- : Cannot allocate memory - fork(2) (Errno::ENOMEM) <...>
Try increasing the memory size or the Xen memory-on-demand limits (they were too strict in my case), or maybe turn on overcommit, though the latter may have some serious unwanted side effects, so do it carefully.
