I'm having a hard time writing an upstart configuration file to start (and keep alive) the unicorn web server on an Ubuntu box.
How should I set the respawn and expect parameters? With respawn enabled the process is continuously restarted (in top I see its PID changing continuously and the old one becoming a zombie). If I remove the directive then the process isn't restarted when it dies.
According to the upstart documentation the expect parameter might be crucial: how does unicorn behave with respect to forking? Any clue?
Related
I'm using an EC2 instance to host a Rails application. I'm deploying with capistrano, and I have already included sidekiq, which is working fine. However, sometimes on deploy, and sometimes sporadically, sidekiq stops running and I don't notice until some tasks that use sidekiq don't run.
I could do something on deploy to check that, but if it stops working some time after the deploy, that would still be a problem.
I would like to know the best way, in that scenario, to check periodically whether sidekiq is running and, if not, to start it.
I thought of writing a bash script for that, but apparently, when I run sidekiq from the command line, it creates another process with a different PID from the one I launched... so I think it could get messy.
Any help is appreciated. Thanks!
Learn and use systemd to manage the service.
https://github.com/mperham/sidekiq/wiki/Deployment#running-your-own-process
A couple of days ago I noticed a strange thing: from time to time the server stops processing requests for a while. In the top output it looks like this:
ten Unicorn workers process requests;
then, for some reason, they stop doing anything. I mean, all ten workers have 'sleeping' status;
they sleep for ten to fifteen seconds;
and then suddenly all ten workers start processing requests at the same time (lots of them were queued for 10s);
I have the following setup:
nginx, unicorn 4.6.2, postgres, redis for sessions and cache, MRI ruby 2.0.0p353.
My first thought was to blame Redis (because if Redis doesn't return sessions, all processes will wait for it), but that doesn't seem to be the case, because while the unicorn workers are frozen, Redis keeps serving other processes that do background jobs.
I don't understand the reason for this strange behaviour.
If someone has some thoughts on the matter I would gladly check them. If you need additional information, just tell me what to do and I'll try to provide it.
UPDATE:
Unicorn config
strace on unicorn worker
strace on unicorn master
strace on nginx
It turned out (with the help of strace on the worker processes) that the workers were trying to write logs to disk. The disk was heavily loaded and the processes were blocked on those writes.
I'm using delayed_job to run jobs, with new jobs being added every minute by a cronjob.
Currently I have an issue where the rake jobs:work task, currently started with 'nohup rake jobs:work &' manually, is randomly exiting.
While God seems to be a solution to some people, the extra memory overhead is rather annoying and I'd prefer a simpler solution that can be restarted by the deployment script (Capistrano).
Is there some bash/Ruby magic to make this happen, or am I destined to run a monitoring service on my server, with some horrid hacks to give the unprivileged account the site deploys to the ability to restart it?
I'd suggest using foreman. It lets you start any number of processes in development with foreman start, and then export your configuration (number of processes per type, limits, etc.) as upstart scripts with foreman export, making them available to Ubuntu's upstart (why invoke God when the operating system already provides this for free?).
The configuration file, Procfile, is also exactly the same file Heroku uses for process configuration, so with just one file you get three process management systems covered.
Situation: I am using Rails + Unicorn, deploying with Capistrano. Sometimes Rails app fails to start in production mode (though it is not the real production, but a staging env). This usually happens due to errors in deploy scripts or configuration (thus usually not detectable by tests). When this happens, unicorn master process kills the worker that failed and spawns a new one, which also fails and so on and so forth. During all that time unicorn consumes lots of CPU and pollutes logs with the same message.
Manual way (not good): Go to your home page to see if it works. Look at the htop. Tail the logs. Kill unicorn manually. Cons: easy to forget. Logs are polluted, CPU is loaded while you are reacting.
Another solution: Use unicorn's preload_app true. This will cause master process to fail fast. Cons: higher memory consumption in happy scenario.
Best practice: - ???
Is there any way to cleverly detect that unicorn master uselessly tries to spawn failing children and stop it?
You have something like "unicorn start" in your Capistrano script, right? Make your Capistrano script ping Unicorn right after invoking that command. If Unicorn does not return the expected response within a timeout, then you know that something went wrong, and you can choose to roll back the deploy or perform some other action.
As for how to ping Unicorn, that depends. If you have Unicorn listening on a TCP socket then you can use curl. If you have Unicorn listening on a Unix domain socket then you have to write a little script that connects to it, like this:
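require 'socket'

# Connect to the Unix domain socket Unicorn listens on.
sock = UNIXSocket.new('/path-to-unicorn.sock')
# Send a minimal HTTP/1.0 HEAD request by hand.
sock.write("HEAD / HTTP/1.0\r\n")
sock.write("Host: www.foo.com\r\n")
sock.write("Connection: close\r\n")
sock.write("\r\n")
# Read until the server closes the connection; exit non-zero unless
# the response matches the expected pattern.
if sock.read !~ /something/
  exit 1
end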
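For the TCP case, the "ping within a timeout" idea could also be sketched in Ruby instead of curl. This is only a sketch under assumptions: the method name unicorn_up?, the address in the usage comment, and the choice to treat any 2xx/3xx status as "up" are mine, not something Unicorn or Capistrano prescribes.

```ruby
require 'net/http'
require 'uri'

# Polls `url` with HEAD requests until it answers with a 2xx/3xx status
# or `timeout` seconds have passed. Returns true if the app came up.
def unicorn_up?(url, timeout: 30, interval: 1)
  uri = URI(url)
  path = uri.path.empty? ? '/' : uri.path
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    begin
      response = Net::HTTP.start(uri.host, uri.port,
                                 open_timeout: 2, read_timeout: 5) do |http|
        http.head(path)
      end
      return true if response.code.to_i.between?(200, 399)
    rescue SystemCallError, SocketError, Timeout::Error, EOFError
      # Not listening yet, or still booting: fall through and retry.
    end
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end

# Usage from a deploy step (hypothetical address; use your own `listen` value):
#   exit 1 unless unicorn_up?('http://127.0.0.1:8080/')
```

Retrying until a deadline matters here because the master can accept the start command long before any worker is ready to answer.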
But it sounds like Phusion Passenger Enterprise solves your problem beautifully. It has a feature called "deployment error resistance". When you deploy a new version and Phusion Passenger detects that it cannot spawn any processes for your new codebase, it will stop trying to spawn your new version and keep the processes for the old version around indefinitely, until you manually give the signal that it's okay to spawn processes for the new version. In the meantime it will log all errors to the log file so that you can analyze the problem.
I would suggest brushing up on your bash skills. The functionality you need is already in Unicorn, as it leverages the Unix-y master/worker process model.
You need an init.d script. Or at the very least godrb or monit. I recommend the init.d script route AND monitoring. It's more complex, but it can more easily be leveraged by your monitoring software and also gives you an automatic start on reboot.
The gist of it is:
Send the USR2 signal to the unicorn master process; it re-executes itself, starting a new master (with new workers) alongside the old one.
Then send WINCH to the old master process; it gracefully stops its workers while the old master itself keeps running.
Once the new master is serving requests correctly, send the old master process the QUIT signal.
Unicorn Signals
This will spin up a new master process running the new code and label the old one as (old). If it fails the old one should be returned to its prior state and you shouldn't suffer an outage, just a restart error. This is the beauty of unicorn. You can almost get instantaneous deploys of your code.
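As a rough illustration of that signal sequence, here is a Ruby sketch. The method name upgrade_unicorn and its find_new_master / healthy callbacks are hypothetical helpers I made up; only the signal names come from Unicorn. The rollback branch (HUP to the old master, QUIT to the new one) follows my reading of the Unicorn signals documentation, and as far as I remember WINCH is only honoured by a daemonized master.

```ruby
# Runs the USR2 -> WINCH -> QUIT sequence against an old master pid.
# `signal` defaults to Process.kill and can be swapped out for a dry run.
def upgrade_unicorn(old_master_pid, find_new_master:, healthy:,
                    signal: Process.method(:kill))
  signal.call('USR2', old_master_pid)   # old master re-executes; new master boots
  new_master_pid = find_new_master.call # e.g. read the new pid file
  signal.call('WINCH', old_master_pid)  # old master gracefully stops its workers
  if healthy.call
    signal.call('QUIT', old_master_pid) # new code is serving; retire old master
    :upgraded
  else
    signal.call('HUP', old_master_pid)  # old master brings its workers back
    signal.call('QUIT', new_master_pid) # discard the broken new master
    :rolled_back
  end
end

# Dry run that only records what would be sent:
#   log = []
#   upgrade_unicorn(100, find_new_master: -> { 200 }, healthy: -> { true },
#                   signal: ->(sig, pid) { log << [sig, pid] })
```

Keeping the old master alive until the health check passes is what makes the rollback cheap: its workers can be restarted without loading the old code again.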
I'm using a lot of hedge words because I did this work on my apps over a year ago so there are a lot of cobwebs upstairs. Hope this helps!
This is by no means a correct script. It's a good starting point though ... feel free to update the gist if you can improve upon it! :-)
Example Unicorn Control Script
Suppose I make a little change to my rails app such as changing the html layout. How would I do a rolling restart with Unicorn? Effectively one would like to bring up unicorn processes(or workers instead?) for the newest version of the rails app and then switch traffic from the old unicorn processes/workers to the new ones atomically. From Google searches I couldn't quite get a concrete definitive explanation of how to do this and all the gotchas surrounding it.
There are multiple methods, but one of them is as follows:
Send SIGUSR2 to the master process. Unicorn starts a new master with its own worker processes, which live in parallel with your old master and old worker processes.
Wait until the new master and worker processes have started.
Kill the old master.
Source: http://unicorn.bogomips.org/SIGNALS.html
This is not very memory friendly though. You temporarily need twice the memory usage.
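The "wait until the new master has started" step can be sketched by watching the pid files, assuming a pid path is set in the Unicorn config and relying on Unicorn renaming the old pid file with an .oldbin suffix after USR2. The method name wait_for_new_master and the path in the usage comment are hypothetical.

```ruby
# Waits until `pid_path` holds a pid different from the one stored in
# "#{pid_path}.oldbin". Returns [old_pid, new_pid], or nil on timeout.
def wait_for_new_master(pid_path, timeout: 60, interval: 0.5)
  old_path = "#{pid_path}.oldbin"
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    begin
      old_pid = File.read(old_path).to_i
      new_pid = File.read(pid_path).to_i
      return [old_pid, new_pid] if new_pid > 0 && new_pid != old_pid
    rescue Errno::ENOENT
      # One of the pid files is not there yet; keep waiting.
    end
    return nil if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end

# Usage (hypothetical path; match the `pid` setting in your config):
#   pid_path = '/path/to/unicorn.pid'
#   Process.kill('USR2', File.read(pid_path).to_i)
#   old_pid, _new_pid = wait_for_new_master(pid_path)
#   abort 'new master did not start' unless old_pid
#   Process.kill('QUIT', old_pid)
```

A new pid file only shows that the new master exists, not that its workers answer requests, so checking a real request before sending QUIT is still worthwhile.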
Phusion Passenger Enterprise supports rolling restarts (along with other cool features), but it restarts processes one by one and so does not need as much memory. It is possible to script one-by-one rolling restarts in Unicorn using the TTIN and TTOU signals, but Phusion Passenger does everything automatically for you without scripting.
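For what it's worth, one way to script the one-by-one variant in Unicorn is a before_fork hook in the config file, modeled on the commented-out hook in Unicorn's example config. Treat it as a sketch: the helper name signal_for_old_master is mine, it assumes a pid path is configured, and the one-second sleep is an arbitrary throttle.

```ruby
# Signal that new worker number `worker_nr` (0-based) sends to the old master:
# TTOU retires one old worker; the last new worker sends QUIT instead.
def signal_for_old_master(worker_nr, worker_count)
  worker_nr + 1 >= worker_count ? :QUIT : :TTOU
end

# Only runs when Unicorn evaluates this file as its config, where
# `before_fork` is available; plain `ruby` skips this part.
if defined?(Unicorn) && respond_to?(:before_fork, true)
  before_fork do |server, worker|
    old_pid_file = "#{server.config[:pid]}.oldbin"
    if server.pid != old_pid_file
      begin
        sig = signal_for_old_master(worker.nr, server.worker_processes)
        Process.kill(sig, File.read(old_pid_file).to_i)
      rescue Errno::ENOENT, Errno::ESRCH
        # No old master (normal start) or it has already exited.
      end
    end
    sleep 1 # throttle forking so old workers are phased out gradually
  end
end
```

After a USR2, each new worker that gets forked takes one old worker out of service, so the peak is lower than running two full sets of workers side by side.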