Nginx Reverse Proxy to Docker 502 Bad Gateway

Spent all week on this one and tried every related stackoverflow post. Thanks for being here.
I have an Ubuntu VM running nginx with reverse proxies pointing to various Docker containers running concurrently on different ports. All my static sites work flawlessly. However, I have one container running an Express.js app.
After restarting the server, I get responses for about an hour. Then I get 502 Bad Gateway. A refresh brings the site back up for roughly 5 seconds until it permanently goes down. This is reproducible.
The Docker container has Express listening on 0.0.0.0:8090 inside the container.
The container is running:
02e1917991e6 docker/express-site "docker-entrypoint.s…" About an hour ago Up About an hour 127.0.0.1:8090->8090/tcp express-site
The 8090 port is EXPOSEd in the Dockerfile.
I tried other ports.
When the site is down, I can still curl the app from inside the container (exact commands after this list).
When down, curling the site from within the VM yields:
curl: (52) Empty reply from server
Memory and CPU usage within the container and within the VM barely reach 5%.
The site normally runs over SSL, but I tried plain HTTP as well.
I tried various nginx proxy settings (see the commented lines in the config below).
I'm using the out-of-the-box nginx.conf.
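For reference, here are the checks I mean (a sketch; the container name express-site comes from the docker ps output above, and the request path is just an example):
# inside the container: the app still answers when the site is "down"
docker exec -it express-site sh -c 'curl -v http://127.0.0.1:8090/'
# from the VM itself: empty reply
curl -v http://127.0.0.1:8090/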
I'm starting to suspect it's related to a timeout or to Docker network settings...
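If it is a timeout, these are the directives I'd experiment with next (a sketch only; the values are guesses, not a confirmed fix):
location / {
    proxy_pass http://127.0.0.1:8090;
    proxy_connect_timeout 5s;   # fail fast if the container stops accepting connections
    proxy_read_timeout 60s;     # wait this long for a response from Express
    proxy_send_timeout 60s;     # wait this long between writes to the upstream
}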
My sites-available config file looks like this:
server {
    server_name example.com www.example.com;

    location / {
        proxy_pass http://127.0.0.1:8090;
        #proxy_set_header Host $host;
        #proxy_buffering off;
        #proxy_buffer_size 16k;
        #proxy_busy_buffers_size 24k;
        #proxy_buffers 64 4k;
    }

    listen 80;
    listen [::]:80;
    #listen 443 ssl; # managed by Certbot
    #ssl_certificate /etc/letsencrypt/live/www.example.com/fullchain.pem; # managed by Certbot
    #ssl_certificate_key /etc/letsencrypt/live/www.example.com/privkey.pem; # managed by Certbot
    #include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
    #ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
}
Nginx Error Log shows:
2021/01/02 23:50:00 [error] 13901#13901: *46 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: ***.**.**.***, server: example.com, request: "GET /favicon.ico HTTP/1.1", upstream: "http://127.0.0.1:8090/favicon.ico", host: "www.example.com", referrer: "http://www.example.com"
Anyone else have ideas?

I didn't get much feedback, but I did more research, and the site is now stable, so I wanted to post my findings.
I have isolated the issue to the Docker container: nginx works fine with the same app running directly on the VM.
I updated my docker container image from node:12-alpine to node:14-alpine. The site has been up for 42 hours without issue.
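For clarity, the only change was the base image in the Dockerfile (everything else stayed the same):
# before
FROM node:12-alpine
# after
FROM node:14-alpine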
If it randomly fails again, then it's probably due to load.
I hope this solves someone's issue.
Update 2021-10-24
The same issue started again, and I've narrowed it down to the port and/or Docker on my version of Ubuntu. May I recommend the following (sketched below):
changing the port
rebooting your PC
installing the latest OS and docker updates
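A sketch of what I mean (the port numbers and container name here are hypothetical):
# change the port: move the host mapping, e.g. from 8090 to 8091,
# and update proxy_pass in the nginx config to match
docker run -d -p 127.0.0.1:8091:8090 --name express-site docker/express-site
sudo nginx -s reload
# install the latest OS and docker updates, then reboot
sudo apt update && sudo apt upgrade -y
sudo reboot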

Related

Nginx + Docker - "no live upstreams while connecting to upstream" with upstream but works fine with proxy_pass

So I've been facing a weird problem, and I'm not sure where the fault is. I'm running a container using docker-compose, and the following nginx configuration works great:
server {
    location / {
        proxy_pass http://container_name1:1337;
    }
}
where container_name1 is the name of the service I gave in the docker-compose.yml file. It resolves to the IP perfectly, and it works. However, the moment I change the above file to this:
upstream backend {
    least_conn;
    server container_name1:1337;
    server container_name2:1337;
}

server {
    location / {
        proxy_pass http://backend;
    }
}
It stops working completely, and in the error logs I get the following:
2020/03/17 13:16:03 [error] 8#8: *11 no live upstreams while connecting to upstream, client: xxxxxx, server: codedamn.com, request: "GET / HTTP/1.1", upstream: "http://backend/", host: "xxxxx"
Why is that? Is nginx not able to resolve DNS when inside upstream blocks? Could anyone help with this problem?
NOTE: This happens only in production (Ubuntu 16.04); locally (macOS Catalina), the same configuration works fine. I'm totally confused after discovering this.
Update 1: The following works:
upstream backend {
    least_conn;
    server container_name1:1337;
}
But not with more than one server. Why?!
Alright, figured it out. This is because docker-compose creates containers in an arbitrary order, and nginx quickly marks the containers as down (I was deploying this to production while there was some traffic). The app containers weren't ready, but nginx was, so it marked them as down and stopped forwarding any traffic.
For now, instead of syncing up the docker-compose container creation order (which turned out to be a bit hacky), I disabled nginx's failure tracking, which automatically marks a server as down, by writing:
server app_inst1:1337 max_fails=0;
which lets nginx keep forwarding traffic to that service (and my Docker is configured to restart the container in case it crashes), which is fine.
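Applied to the upstream block from the question, that looks like this (a sketch; the original service names are kept):
upstream backend {
    least_conn;
    server container_name1:1337 max_fails=0;
    server container_name2:1337 max_fails=0;
}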

nginx responds to HTTPS but not HTTP

I am using the dockerized Nextcloud as shown here: https://github.com/nextcloud/docker/tree/master/.examples/docker-compose/with-nginx-proxy-self-signed-ssl/mariadb/fpm
I set this up with port 80 mapped to 12345 and port 443 mapped to 12346. When I go to https://mycloud.example.com:12346, I get the self-signed certificate prompt, but otherwise everything is fine and I see the Nextcloud web UI. But when I go to http://mycloud.example.com:12345, nginx (the proxy container) gives the error "503 Service Temporarily Unavailable". The error also shows up in the proxy's logs.
How can I diagnose the issue? Why is HTTPS working but not HTTP?
Can you provide the docker command you use to start Nextcloud, or your docker-compose file?
Diagnosis is as usual with Docker stuff: get the ID of the currently running container:
docker ps
Then check the logs:
docker logs [id or name of your container]
docker-compose logs [name of your service]
Connect into the container:
docker exec -ti [id or name of your container] [bash, or ash if alpine-based container]
Then read the nginx conf files involved. In your case, I'd check the redirect being made from HTTP to HTTPS. Most likely it's something like the block below, with no specific port given for HTTPS, hence port 443, hence not working:
server {
    listen 80;
    server_name my.domain.com;
    return 301 https://$server_name$request_uri;   # <======== no port = 443
}

server {
    listen 443 ssl;
    server_name my.domain.com;

    # add Strict-Transport-Security to prevent man in the middle attacks
    add_header Strict-Transport-Security "max-age=31536000" always;
    [....]
}
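If that's what your config is doing, a possible fix is to put the externally mapped HTTPS port into the redirect (12346 here, taken from the question's port mapping; an untested sketch):
return 301 https://$server_name:12346$request_uri;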

nginx reverse proxy in docker

I'm having a trivial problem with nginx. For a start, I'm just running nginx and Portainer as containers. Portainer is running on port 9000, and the containers are on the same Docker network, so it's not a visibility issue. Nginx exposes port 80 and works fine, and so does Portainer when accessing port 9000 directly. I'm mapping the nginx volumes /etc/nginx/nginx.conf:ro and /usr/share/nginx/html:ro locally, and they react to changes, so I should be hooked up correctly. In my mapped nginx.conf (http section) I have:
server {
    location /portainer {
        proxy_pass http://portainer:9000;
    }
}
where portainer is named, well, portainer. I've also tried an upstream directive plus server, but that didn't work either.
When accessing localhost/portainer, the nginx log shows:
2018/04/30 09:21:32 [error] 7#7: *1 open() "/usr/share/nginx/html/portainer" failed (2: No such file or directory), client: 172.18.0.1, server: localhost, request: "GET /portainer HTTP/1.1", host: "localhost"
which would indicate that the location directive is not even being hit(?). I've tried adding / in various places, but to no avail. I'm guessing it's something trivial I'm missing.
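For completeness, a sketch of the setup I described (a docker-compose fragment; the image tags and file layout are hypothetical):
services:
  nginx:
    image: nginx
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ./html:/usr/share/nginx/html:ro
  portainer:
    image: portainer/portainer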
Thanks in advance,
Nik
I had to add a trailing slash to both lines. Because proxy_pass now has a URI part (the trailing /), nginx replaces the matched /portainer/ prefix with / before passing the request upstream, so Portainer sees the paths it expects:
server {
    location /portainer/ {
        proxy_pass http://portainer:9000/;
    }
}
Try this instead:
location ~* ^/portainer/(.*)$ {
    proxy_pass http://portainer:9000/$1$is_args$args;
}
Ref: http://nginx.org/en/docs/http/ngx_http_core_module.html

NGINX reverse proxy to docker applications

I am currently learning to set up nginx, but I am already having an issue. GitLab and Nextcloud are running on my VPS, and both are accessible on the right port. Therefore I created an nginx config with a simple proxy_pass directive, but I always receive 502 Bad Gateway.
Nextcloud, GitLab and nginx are Docker containers, and nginx has port 80 open. The remaining two containers have ports 3000 and 3100 open.
/etc/nginx/conf.d/gitlab.domain.com.conf
upstream gitlab {
    server x.x.x.x:3000;
}

server {
    listen 80;
    server_name gitlab.domain.com;

    location / {
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $host;
        proxy_set_header X-NginX-Proxy true;
        proxy_pass http://gitlab/;
    }
}
/var/logs/error.log
2018/04/12 08:10:41 [error] 7#7: *1 connect() failed (113: Host is unreachable) while connecting to upstream, client: xx.201.226.19, server: gitlab.domain.com, request: "GET / HTTP/1.1", upstream: "http://xxx.249.7.15:3000/", host: "gitlab.domain.com"
2018/04/12 08:10:42 [error] 7#7: *1 connect() failed (113: Host is unreachable) while connecting to upstream, client: xx.201.226.19, server: gitlab.domain.com, request: "GET /favicon.ico HTTP/1.1", upstream: "http://xxx.249.7.15:3000/favicon.ico", host: "gitlab.domain.com", referrer: "http://gitlab.domain.com/
What is wrong with my configuration?
I think you could get away with a much simpler config. Maybe something like this:
http {
    ...
    server {
        listen 80;
        charset utf-8;
        ...
        location / {
            proxy_pass http://gitlab:3000;
        }
    }
}
I assume you are using Docker's internal DNS to access the containers, i.e. gitlab resolves to the gitlab container's internal IP. If that is the case, then you can open up a container and try pinging the gitlab container from another container.
For example, you can ping the gitlab container from the nginx container like this:
$ docker ps (use this to get the container id)
Now do:
$ docker exec -it <container_id_for_nginx_container> bash
# apt-get update -y
# apt-get install iputils-ping -y
# ping -c 2 gitlab
If you can't ping it, then the containers are having trouble communicating with each other. Are you using docker-compose? If you are, then I would suggest looking at the "links" keyword, which is used to link containers that should be able to communicate with each other. So, for example, you would probably link the gitlab container to PostgreSQL.
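A minimal sketch of the links idea (a docker-compose fragment; service names are assumed from the question):
services:
  nginx:
    image: nginx
    ports:
      - "80:80"
    links:
      - gitlab
  gitlab:
    image: gitlab/gitlab-ce
    # gitlab would in turn link to its own database, e.g. postgresql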
Let me know if this helps.
Another option, which takes advantage of the fact that your Docker containers are just processes in their own control groups, is to bind each process (container) to a port on the host network (instead of an isolated network). This bypasses Docker routing, so beware of the caveat that ports may not overlap on the host machine (no different from any normal process sharing the same host network).
You mentioned running nginx and Nextcloud (I assume you are using the Nextcloud fpm image because of FastCGI support). In this case, I had to do the following on my Arch Linux machine:
/usr/share/webapps/nextcloud on the host is bind-mounted into the container at /var/www/html.
The UID of the host and container processes must be the same (in my case, host user http and container user www-data are both UID 33).
The 443 server block in nginx.conf must set root to the host's Nextcloud path: root /usr/share/webapps/nextcloud;.
The FastCGI script path in each server block that calls php-fpm over FastCGI must be adjusted to refer to the Docker container's Nextcloud base path: fastcgi_param SCRIPT_FILENAME /var/www/html$fastcgi_script_name;. In other words, you cannot use $document_root as you normally would, because that points to the host's Nextcloud root path.
Optional: adjust the paths to the database and Redis in config.php to use the host machine's hostname rather than localhost; localhost seems to resolve inside the container despite the process being bound to the host machine's main network.
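A condensed sketch of points 3 and 4 above (the two paths come from the list; everything else, including the php-fpm address, is an assumption):
server {
    listen 443 ssl;
    # point 3: root is the HOST path to Nextcloud
    root /usr/share/webapps/nextcloud;

    location ~ \.php$ {
        include fastcgi_params;
        # point 4: SCRIPT_FILENAME uses the CONTAINER path, not $document_root
        fastcgi_param SCRIPT_FILENAME /var/www/html$fastcgi_script_name;
        fastcgi_pass 127.0.0.1:9000;   # assumed: fpm container bound to the host network
    }
}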

Re-resolve of backend in nginx SNI docker swarm

I am using nginx to do TCP forwarding based on hostname as discussed here: Nginx TCP forwarding based on hostname
When the upstream containers are taken down for a short period of time (5 or so minutes) and then brought back up, nginx doesn't seem to re-resolve them (I continue to get a "111: Connection refused" error).
I've attempted to put a resolver in the server block of the nginx config:
server {
    listen 443;
    resolver x.x.x.x valid=30s;
    proxy_pass $name;
    ssl_preread on;
}
I still get the same behaviour with this in place.
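For reference, the shape I'd expect to work, assuming Docker's embedded DNS at 127.0.0.11 and the $name map from the linked question (an unverified sketch):
server {
    listen 443;
    resolver 127.0.0.11 valid=30s;   # Docker's embedded DNS
    ssl_preread on;
    proxy_pass $name;                # a variable here forces runtime resolution via the resolver
}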
Like BMitch says, you can scale to 0 to ensure DNS remains available to nginx.
But really, if you're using nginx in Swarm, I recommend using a Swarm-aware proxy solution that dynamically updates the nginx/haproxy config based on services that carry the proper labels. In those setups, when the service is removed, its config is also removed from the proxy. Ones I've used include:
Traefik
Docker Flow Proxy
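For example, with Traefik the routing is driven by service labels instead of a static nginx config (a sketch in Traefik v2 syntax; the service name, host and port are hypothetical):
services:
  app:
    image: myapp
    deploy:
      labels:
        - traefik.http.routers.app.rule=Host(`app.example.com`)
        - traefik.http.services.app.loadbalancer.server.port=8080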
