Is it possible to disable the retry mechanism for uWSGI after harakiri?

I am running a web.py web server with uWSGI and there are some requests that take too long. After setting up harakiri mode I noticed that after a request is killed it is retried one more time.
Is it possible to disable this behavior?
The running command is:
/usr/local/bin/uwsgi --http-socket={socket} --chdir={dir} --master --module=start --max-requests=1500 --harakiri=20 --carbon-max-retry=0 --rawrouter-max-retries=0 --sslrouter-max-retries=0 --processes=1 --enable-threads --ignore-sigpipe --die-on-term --worker-reload-mercy=5 --pidfile=/tmp/uwsgi.pid
Thank you!

uWSGI IRC user damjan helped me debug this. There is no problem with uWSGI; the retries are generated by the browser. If you try with curl, requests are killed without being retried.
If you have nginx on top of uWSGI you might also want to take a look at: https://news.ycombinator.com/item?id=11217477
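If nginx does turn out to be the one retrying, that behaviour is controlled on the nginx side rather than in uWSGI. A minimal sketch of how it could be switched off, assuming a standard uwsgi_pass setup (the location and socket path are placeholders, not taken from the question):
location / {
    include uwsgi_params;
    uwsgi_pass unix:/tmp/uwsgi.sock;  # placeholder socket
    uwsgi_next_upstream off;          # do not pass a failed/timed-out request to the next upstream server
}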

Related

Preventing uwsgi_response_write_body_do() TIMEOUT

We use uwsgi with the python3 plugin, under nginx, to serve potentially hundreds of megabytes of data per query. Sometimes when nginx is queried by a client over a slow network connection, a uwsgi worker dies with "uwsgi_response_write_body_do() TIMEOUT !!!".
I understand the uwsgi python plugin reads from the iterator our app returns as fast as it can, trying to send the data over the uwsgi-protocol unix socket to nginx. The HTTPS/TCP connection from nginx to the client gets backed up by the slow network connection, and nginx pauses reading from its uwsgi socket. uwsgi then fails some writes towards nginx, logs that message, and dies.
Normally we run nginx with uwsgi buffering disabled. I tried enabling buffering, but it doesn't help as the amount of data it might need to buffer is 100s of MBs.
Our data is not simply read out of a file, so we can't use file offload.
Is there a way to configure uwsgi to pause reading from our python iterator if that unix socket backs up?
The existing question here "uwsgi_response_write_body_do() TIMEOUT - But uwsgi_read_timeout not helping" doesn't help, as we have buffering off.
To answer my own question, adding socket-timeout = 60 is helping for all but the slowest client connection speeds.
That's sufficient so this question can be closed.
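For reference, the setting lives in the uWSGI config alongside the existing socket options; a minimal ini-style sketch (the module and socket values are placeholders, not from the original setup):
[uwsgi]
# placeholder WSGI entry point and socket
module = myapp:application
socket = /tmp/uwsgi.sock
# give slow peers up to 60 seconds on uwsgi-socket reads/writes before timing out
socket-timeout = 60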

Sidekiq can't connect to database?

I have "mariadb" set to 127.0.0.1 in my /etc/hosts file and sidekiq occasionally throws errors such as:
Mysql2::Error::ConnectionError: Unknown MySQL server host 'mariadb' (16)
The VM is not under significant load or anything like that.
Later edit: it seems other gems have trouble resolving hosts too:
WARN -- : Unable to record event with remote Sentry server (Errno::EBUSY - Failed to open TCP connection to XXXX.ingest.sentry.io:443 (Device or resource busy - getaddrinfo)):
Anyone have any idea why that may happen?
I figured this out a couple of weeks ago but wanted to be sure before posting an answer.
I still can't figure out the exact mechanics of this issue, but it was caused by fail2ban.
I had it running in a container, polling the httpd logs and blocking the tremendous number of bots scraping my sites.
I also increased the maximum number of file handles and inotify watches:
fs.file-max = 131070
fs.inotify.max_user_watches = 65536
As soon as I got rid of fail2ban and increased the inotify watches, the errors disappeared.
Obviously fail2ban goes on the "do not touch" list because of this, and we've rolled out a 404/403/500 handler at the application layer that pushes unknown IPs to Cloudflare.
Although this is probably an edge case I'm leaving this here in hope it helps someone at some point.
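In case it is useful, a sketch of how those two sysctl values could be persisted and applied (the file name is arbitrary):
# /etc/sysctl.d/99-tuning.conf (hypothetical file name)
fs.file-max = 131070
fs.inotify.max_user_watches = 65536
# then reload all sysctl configuration without rebooting
sudo sysctl --system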

Err max clients reached Redis/Sidekiq/Rails

I have been stuck on this issue for the past 3 days and am unsure where to look now.
I have a simple Sidekiq implementation into my rails app.
I am working on: Rails 4.2.0, Sidekiq 4.1.2, Redis 3.0.6
The production app is running live on Heroku, and I have 1 worker dyno and 1 web dyno.
The issue is this, and I am unsure how to approach it or what I did to cause it.
When I run the redis-cli on heroku I can see the clients that I have running. At most I have 2 or 3 clients running at any given time. I can easily kill the clients with
CLIENT KILL TYPE normal
So that's all fine and dandy. The part where things get a little tricky is when I fire up my server locally and work in development. All of a sudden my redis-cli shows that I have 19 clients running. This results in the error
Err max clients reached
My assumption is that somehow my local environment is directing Sidekiq to work off the production Redis URL. I have to admit what I know about Redis and Sidekiq is limited, but I do have a basic understanding of how it should be working.
Any help or guidance would be appreciated.
Try using sidekiq -c 3 to limit your concurrency.
This ended up being a configuration error. Just in case anyone stumbles upon this question hopefully this will help them not overlook something like I did.
This issue was happening only when I was firing up my local server, so I knew it had something to do with my local setup. I noticed that on my production redis-cli I was seeing clients with my local IP in the ADDR column.
This led me to believe that my local machine was pushing clients to my production Redis server. Looking at my logs when I fired up my Procfile, I saw the production Redis URL there, which only confirmed it.
Finally, after searching through my code, I discovered that I had actually added that URL to my .env, so when I fired up my server it was using the production Redis. I changed it in my .env file to the appropriate URL for local development, redis://127.0.0.1:6379, and everything is now working as normal.
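For anyone wanting to guard against the same mistake, a sketch of a Sidekiq initializer that only uses REDIS_URL when it is explicitly set and otherwise falls back to a local Redis (the file name and fallback are common conventions, not something from the original post):
# config/initializers/sidekiq.rb (hypothetical)
redis_url = ENV.fetch("REDIS_URL", "redis://127.0.0.1:6379/0")

Sidekiq.configure_server do |config|
  config.redis = { url: redis_url }
end

Sidekiq.configure_client do |config|
  config.redis = { url: redis_url }
end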

How can I automatically restart my Heroku app when there's a server error?

I have a Rails 4.2 app running on Heroku. Occasionally there is an issue that causes most incoming requests to get a server error. For example, there could be a memory leak or a max database connection issue. How can I set up a script or service to automatically restart the server when it detects errors?
I think this service could ping the app every few minutes and if it detects an error, it should confirm there's really a problem and then run heroku restart. How could this be set up?
After Googling this topic, I came across Neptune.io, which seems to provide a useful service for this task.
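For a rough do-it-yourself alternative, a sketch of a cron-able shell script along the lines described in the question, assuming the Heroku CLI is installed and authenticated; the app name and health-check URL are placeholders:
#!/bin/sh
# check-and-restart.sh (hypothetical) - run every few minutes from cron
APP="my-heroku-app"
URL="https://my-heroku-app.herokuapp.com/up"

# treat any 5xx response as a potential problem
status=$(curl -s -o /dev/null -w "%{http_code}" "$URL")
if [ "$status" -ge 500 ]; then
  # confirm it is really down before restarting all dynos
  sleep 30
  status=$(curl -s -o /dev/null -w "%{http_code}" "$URL")
  [ "$status" -ge 500 ] && heroku restart --app "$APP"
fi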

Nginx + unicorn (rails) often gives "Connection refused" in nginx error log

At work we're running some high traffic sites in rails. We often get a problem with the following being spammed in the nginx error log:
2011/05/24 11:20:08 [error] 90248#0: *468577825 connect() to unix:/app_path/production/shared/system/unicorn.sock failed (61: Connection refused) while connecting to upstream
Our setup is nginx on the frontend server (load balancing), and unicorn on our 4 app servers. Each unicorn is running with 8 workers. The setup is very similar to the one GitHub uses.
Most of our content is cached, and when the request hits nginx it looks for the page in memcached and serves it if it can find it - otherwise the request goes to rails.
I can solve the above issue - SOMETIMES - by doing a pkill of the unicorn processes on the servers followed by a:
cap production unicorn:check (removing all the pids)
cap production unicorn:start
Do you guys have any clue how I can debug this issue? We don't have any significantly high load on our database server when these problems occur.
Something killed your unicorn process on one of the servers, or it timed out. Or you have an old app server in your upstream app_server { } block that is no longer valid. Nginx will retry it from time to time. The default is to retry another upstream if it gets a connection error, so hopefully your clients didn't notice anything.
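To make that retry behaviour concrete, a sketch of the kind of nginx configuration involved, reusing the socket path from the error message above (the directive value shown is nginx's default):
upstream app_server {
    # one entry per unicorn socket / app server
    server unix:/app_path/production/shared/system/unicorn.sock fail_timeout=0;
}

server {
    location / {
        proxy_pass http://app_server;
        # "error timeout" is the default: on a refused connection nginx
        # silently tries the next server in the upstream block
        proxy_next_upstream error timeout;
    }
}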
I don't think this is an nginx issue for me; restarting nginx didn't help. It seems to be gunicorn. A quick and dirty way to avoid this is to recycle the gunicorn instances when the system is not being used, say at 1 AM, if that is an acceptable maintenance window. I run gunicorn as a service that will come back up if killed, so a pkill script takes care of the recycle/respawn (the service itself is this upstart job):
start on runlevel [2345]
stop on runlevel [06]
respawn
respawn limit 10 5
exec /var/web/proj/server.sh
I am starting to wonder if this is at all related to memory allocation. I have MongoDB running on the same system and it reserves all the memory for itself but it is supposed to yield if other applications require more memory.
Another thing worth trying is getting rid of eventlet or other dependent modules when running gunicorn. uWSGI can also be used as an alternative to gunicorn.
