This is the first time something like this has happened to me. I'm really scared.
I've been coding and testing a Django webapp on my laptop. The app is running on Docker, with docker-compose. Both the host and guest are Ubuntu 18.04. It consists of 3 images: Django+Gunicorn, Nginx and Postgres.
Nothing really fancy and it worked perfectly, until 5 minutes ago.
When I tried to refresh the page (accessible via 127.0.0.1) in Chrome Incognito, it got stuck loading. Same thing with curl. At the time, I was logged into the Django container (to run collectstatic whenever I needed it) and it was still running as usual.
I thought something was stuck somewhere, so I checked whether anything was listening on port 80. Nothing special:
tcp6 0 0 :::80 :::* LISTEN 10815/docker-proxy
So, wanting to get back to coding as fast as possible, I tried to (sudo) down then kill the containers, to no avail:
ERROR: for xxxxxxxx_nginx_1 Cannot kill container: e94e64a75b1726ccd27623024a4223ffb3d77c6578b4d69f6240bea51e8e641b: Cannot kill container e94e64a75b1726ccd27623024a4223ffb3d77c6578b4d69f6240bea51e8e641b: unknown error after kill: docker-runc did not terminate sucessfully: container_linux.go:393: signaling init process caused "permission denied"
: unknown
No problem, I thought, and I just stopped the Docker service:
sudo systemctl stop docker
I refreshed the 127.0.0.1 page expecting to see a This site can’t be reached page ... only to see the webapp loading!
I tried to see which containers were running so I could stop them, but docker ps returned this:
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
This confirms the Docker service was down, and systemctl status confirmed the same. I also checked whether the server-side code was still running. It was. I even changed some frontend code, and the page loaded the new version.
Can someone tell me what's going on, and how to stop this 'zombie' app from running?
Thanks!
EDIT
I just had the idea to run ps aux | grep docker and here's what I found:
root 1661 0.5 0.9 670260 74136 ? Ssl 17:47 1:15 dockerd -G docker --exec-root=/var/snap/docker/384/run/docker --data-root=/var/snap/docker/common/var-lib-docker --pidfile=/var/snap/docker/384/run/docker.pid --config-file=/var/snap/docker/384/config/daemon.json --debug
root 2148 0.3 0.4 756640 34944 ? Ssl 17:47 0:47 docker-containerd --config /var/snap/docker/384/run/docker/containerd/containerd.toml
root 4105 0.0 0.0 7508 4112 ? Sl 17:48 0:01 docker-containerd-shim -namespace moby -workdir /var/snap/docker/common/var-lib-docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/7709ab085e470228c120eff4c9b36590348dac483a40d9b107cfb8d62146e060 -address /var/snap/docker/384/run/docker/containerd/docker-containerd.sock -containerd-binary /snap/docker/384/bin/docker-containerd -runtime-root /var/snap/docker/384/run/docker/runtime-runc -debug
root 10618 0.0 0.0 7508 4464 ? Sl 17:57 0:01 docker-containerd-shim -namespace moby -workdir /var/snap/docker/common/var-lib-docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/3a689a845ef012584e46d631c053ca0a00dbe34bb430f5e52a4de879c7efe966 -address /var/snap/docker/384/run/docker/containerd/docker-containerd.sock -containerd-binary /snap/docker/384/bin/docker-containerd -runtime-root /var/snap/docker/384/run/docker/runtime-runc -debug
root 10815 0.0 0.0 425952 2956 ? Sl 17:58 0:07 /snap/docker/384/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 80 -container-ip 172.20.0.4 -container-port 80
root 10822 0.0 0.0 9172 5032 ? Sl 17:58 0:01 docker-containerd-shim -namespace moby -workdir /var/snap/docker/common/var-lib-docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/e94e64a75b1726ccd27623024a4223ffb3d77c6578b4d69f6240bea51e8e641b -address /var/snap/docker/384/run/docker/containerd/docker-containerd.sock -containerd-binary /snap/docker/384/bin/docker-containerd -runtime-root /var/snap/docker/384/run/docker/runtime-runc -debug
ahmed 26359 0.0 0.0 21536 1048 pts/5 S+ 21:52 0:00 grep --color=auto docker
EDIT 2
After manually killing some of the processes above, the situation is back to normal. But still, I'd love to get an explanation if there's one.
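For anyone who ends up in the same state, a small filter can make the leftover processes easier to spot before killing anything by hand. This is only a sketch: the function name list_docker_helpers is made up for illustration, and it assumes the ps aux column layout shown in the EDIT above (PID in column 2, executable in column 11).

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker): read `ps aux`-style lines on
# stdin and print "PID COMMAND" for the Docker daemon and its helpers.
# Assumes the PID is field 2 and the executable path is field 11.
list_docker_helpers() {
    awk '$11 ~ /(dockerd|docker-proxy|containerd|containerd-shim)$/ { print $2, $11 }'
}

# Typical use on a live host:
#   ps aux | list_docker_helpers
```

Since the dockerd in the listing runs from /snap/docker, stopping it through snap (sudo snap stop docker) may be cleaner than killing PIDs one by one; that is a guess based on the paths in the ps output, not something I have verified on this setup.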
[root#k8s001 ~]# docker exec -it f72edf025141 /bin/bash
root#b33f3b7c705d:/var/lib/ghost# ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 1012 4 ? Ss 02:45 0:00 /pause
root 8 0.0 0.0 10648 3400 ? Ss 02:57 0:00 nginx: master process nginx -g daemon off;
101 37 0.0 0.0 11088 1964 ? S 02:57 0:00 nginx: worker process
node 38 0.9 0.0 2006968 116572 ? Ssl 02:58 0:06 node current/index.js
root 108 0.0 0.0 3960 2076 pts/0 Ss 03:09 0:00 /bin/bash
root 439 0.0 0.0 7628 1400 pts/0 R+ 03:10 0:00 ps aux
This output comes from the internet. It says the pause container is the parent process of the other containers in the pod, so if you attach to the pod or to any of its containers and run ps aux, you should see /pause as PID 1.
Is that correct? In my own k8s cluster the result is different: PID 1 is not /pause.
...Is that correct? In my own k8s cluster the result is different: PID 1 is not /pause.
This has changed. Pause no longer holds PID 1, despite being the first container created by the container runtime to set up the pod (e.g. cgroups, namespaces, etc.). Pause is isolated (hidden) from the rest of the containers in the pod, regardless of your ENTRYPOINT/CMD. See here for more background information.
By default, Docker will run your entrypoint (or the command, if there is no entrypoint) as PID 1. However, that is not necessarily always the case, since, depending on how you start the container, Docker (or your orchestrator) can also run its custom init process as PID 1:
$ docker run -d --init --name test alpine sleep infinity
849efe38ecec439550738e981065ec4aff55ef5607f03b9fed975e2d3146b9b0
$ docker exec -ti test ps
PID USER TIME COMMAND
1 root 0:00 /sbin/docker-init -- sleep infinity
7 root 0:00 sleep infinity
8 root 0:00 ps
For more information on why you would want your entrypoint not to be PID 1, you can check this explanation from a tini developer:
Now, unlike other processes, PID 1 has a unique responsibility, which is to reap zombie processes.
Zombie processes are processes that:
Have exited.
Were not waited on by their parent process (wait is the syscall parent processes use to retrieve the exit code of their children).
Have lost their parent (i.e. their parent exited as well), which means they'll never be waited on by their parent.
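To check whether a container or host is accumulating zombies, one option is to filter ps output on the STAT column, where a leading Z marks a zombie. A minimal sketch, assuming the procps ps aux layout (PID in column 2, STAT in column 8); list_zombies is a made-up name, not an existing tool:

```shell
#!/bin/sh
# Hypothetical helper: read `ps aux`-style lines on stdin and print the PIDs
# of zombie processes, i.e. rows whose STAT field (column 8) starts with "Z".
# The header row is ignored automatically because "STAT" does not start with Z.
list_zombies() {
    awk '$8 ~ /^Z/ { print $2 }'
}

# Typical use where procps `ps` is available:
#   ps aux | list_zombies
```

Note that BusyBox ps (as in the alpine example above) prints different columns, so the field numbers would need adjusting there.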
After starting a SCADA LTS Docker container as suggested on https://github.com/SCADA-LTS/Scada-LTS with the following command:
docker run -it -e DOCKER_HOST_IP=$(docker-machine ip) -p 81:8080 scadalts/scadalts /root/start.sh
the container works well for some time and then suddenly shows an "HTTP Status 404" error like the following:
http://[IP]/ScadaBR/
HTTP Status 404 - /ScadaBR/
type Status report
message /ScadaBR/
description The requested resource is not available.
Apache Tomcat/7.0.85
Here [IP] is the Docker host address and port, which most of the time is localhost:81.
Any idea how to solve it?
Thank you in advance!
TL;DR
After running for some time, the MySQL service dies. It has to be restarted manually like this:
docker exec scada service mysql restart
docker exec scada killall tail
DETAILED REPORT
When the error is shown, you can check if all the services are running on the container (in this case named 'scada'):
>docker exec scada ps -A
PID TTY TIME CMD
1 ? 00:00:00 start.sh
790 ? 01:00:22 java
791 ? 00:01:27 tail
858 ? 00:00:00 ps
As can be seen, no MySQL service is running. This explains why Tomcat is running but SCADA-LTS is not.
You can restart MySQL service inside the container with:
docker exec scada service mysql restart
After that, SCADA-LTS is still down and you have to restart Tomcat, which can be done this way:
docker exec scada killall tail
After a minute or less, all the services are running:
>docker exec scada ps -A
PID TTY TIME CMD
1 ? 00:00:00 start.sh
43 ? 00:00:00 mysqld_safe
398 ? 00:00:00 mysqld
481 ? 00:00:31 java
482 ? 00:00:00 sleep
618 ? 00:00:00 ps
Now SCADA-LTS is running!
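The manual check above can be turned into a small test that could be run periodically. This is only a sketch under a few assumptions: the container is named 'scada' as in this report, ps -A prints the command name in column 4, and mysqld_running is a made-up function name.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -A` output on stdin and succeed (exit 0)
# only if a process whose CMD column (field 4) is exactly "mysqld" is listed.
mysqld_running() {
    awk '$4 == "mysqld" { found = 1 } END { exit !found }'
}

# Possible use, repeating the two recovery commands from this answer:
#   if ! docker exec scada ps -A | mysqld_running; then
#       docker exec scada service mysql restart
#       docker exec scada killall tail
#   fi
```

This only automates the workaround; it does not explain why MySQL dies in the first place.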
Is it possible to reload HAProxy when the backend server IP has changed? If so, how?
This is essential for docker stack. On every deploy, new containers with different IPs replace the old containers.
In our implementation, services occasionally return 503 because the old HAProxy process is not terminated and keeps accepting requests after the backend server is already gone. The HTTP log shows that some requests are forwarded to a backend that no longer exists.
# ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 893 0.0 0.0 0 0 ? Zs 19:39 0:01 [haproxy] <defunct>
root 898 0.3 0.0 49416 9640 ? Ss 19:49 0:13 /usr/local/sbin/haproxy -D -f /app/haproxy.cfg -p /var/run/haproxy.pid
root 915 0.2 0.0 0 0 ? Zs 19:49 0:12 [haproxy] <defunct>
root 920 0.2 0.0 49308 10196 ? Ss 20:57 0:01 /usr/local/sbin/haproxy -D -f /app/haproxy.cfg -p /var/run/haproxy.pid
root 937 0.0 0.0 0 0 ? Zs 20:57 0:00 [haproxy] <defunct>
root 942 0.3 0.0 49296 9880 ? Ss 20:58 0:01 /usr/local/sbin/haproxy -D -f /app/haproxy.cfg -p /var/run/haproxy.pid
root 959 0.2 0.0 49296 9852 ? Ss 20:58 0:01 /usr/local/sbin/haproxy -D -f /app/haproxy.cfg -p /var/run/haproxy.pid
[Edit]
I am using Docker swarm mode. I did try publishing the service's port to the host; however, the performance of the swarm’s internal load balancer is bad, so I am trying to avoid it.
While it should be possible to change the HAProxy configuration to point to a different backend server, it seems like it would be easier to bind the Docker containers' ports to predictable ports on the Docker host, so the HAProxy config does not need to change.
For example:
docker run -d -p 9999:9999 hello_world
And your HAProxy config could look like
backend something
# Assuming the Docker host's IP address is 192.0.2.123
server some-server 192.0.2.123:9999
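If the HAProxy configuration does have to change, HAProxy's -sf option starts a new process and asks the listed old PIDs to finish once their current connections are done. The sketch below only builds and prints such a command rather than running it. The function name haproxy_reload_cmd is made up, and the paths in the usage comment are the ones from the ps output in the question.

```shell
#!/bin/sh
# Hypothetical helper: print a HAProxy soft-reload command line.
#   $1 = config file, $2 = pid file written by the running HAProxy (-p).
# If the pid file has PIDs, they are passed to -sf so the old processes
# are asked to finish gracefully; otherwise a plain start command is printed.
haproxy_reload_cmd() {
    cfg=$1
    pidfile=$2
    # Join the PIDs (one per line in the pid file) into a space-separated list.
    old_pids=$(cat "$pidfile" 2>/dev/null | tr '\n' ' ')
    old_pids=${old_pids% }
    if [ -n "$old_pids" ]; then
        echo "haproxy -D -f $cfg -p $pidfile -sf $old_pids"
    else
        echo "haproxy -D -f $cfg -p $pidfile"
    fi
}

# Possible use:
#   haproxy_reload_cmd /app/haproxy.cfg /var/run/haproxy.pid
```

Printing the command first makes it easy to review before running it. Whether this also clears the <defunct> entries depends on what runs as PID 1 in the container, which this sketch does not address.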
My OS is Ubuntu 16.04.
I want to stop docker, so I run in the terminal:
sudo systemctl stop docker
But this command doesn't help:
gridsim1103 ~: ps ax | grep docker
11347 ? Sl 0:00 containerd-shim 487e3784f983274131d37bde1641db657e76e41bdd056f43ef4ad5adc1bfc518 /var/run/docker/libcontainerd/487e3784f983274131d37bde1641db657e76e41bdd056f43ef4ad5adc1bfc518 runc
14299 pts/2 S+ 0:00 grep --color=auto docker
29914 ? S 0:00 sudo dockerd -H gridsim1103:2376
29915 ? Sl 4:45 dockerd -H gridsim1103:2376
29922 ? Ssl 0:24 containerd -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --shim containerd-shim --metrics-interval=0 --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --runtime runc
30107 ? Sl 1:01 /usr/bin/docker-proxy -proto tcp -host-ip 188.184.80.77 -host-port 8500 -container-ip 192.17.0.2 -container-port 8500
30139 ? Sl 0:00 /usr/bin/docker-proxy -proto tcp -host-ip 188.184.80.77 -host-port 8400 -container-ip 192.17.0.2 -container-port 8400
Version of docker server is:
Server:
Version: 1.12.1
API version: 1.24 (minimum version )
Go version: go1.6.2
Git commit: 23cf638
Built: Tue, 27 Sep 2016 12:25:38 +1300
OS/Arch: linux/amd64
Experimental: false
I also unsuccessfully tried:
sudo service docker stop
The output of ps aux looks like you did not start docker through systemd/systemctl.
It looks like you started it with:
sudo dockerd -H gridsim1103:2376
When you try to stop it with systemctl, nothing should happen as the resulting dockerd process is not controlled by systemd. So the behavior you see is expected.
The correct way to start docker is to use systemd/systemctl:
systemctl enable docker
systemctl start docker
After this, docker should start on system start.
EDIT: As you already have the Docker process running, simply stop it by pressing CTRL+C in the terminal where you started it, or send a termination signal to the process.
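To find the process to signal, the ps ax output from the question can be filtered for the daemon itself. A minimal sketch, assuming the ps ax column layout shown there (PID in column 1, executable in column 5); dockerd_pids is a made-up name:

```shell
#!/bin/sh
# Hypothetical helper: read `ps ax`-style lines on stdin and print the PID of
# every process whose executable (field 5) is dockerd, with or without a path.
# The `sudo dockerd ...` wrapper is skipped because its field 5 is "sudo".
dockerd_pids() {
    awk '$5 ~ /(^|\/)dockerd$/ { print $1 }'
}

# Possible use:
#   ps ax | dockerd_pids        # prints e.g. the PID of the manual dockerd
#   sudo kill <pid>             # default signal is SIGTERM
```

Running systemctl is-active docker beforehand is one way to confirm that systemd does not consider the service to be running.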
First I stop Docker with the following command:
sudo systemctl stop docker
Then I get the message: Warning: Stopping docker.service, but it can still be activated by: docker.socket.
So I stop the socket as well:
sudo systemctl stop docker.socket
Note: you only need to stop docker.socket when the service is activated through it, i.e. when systemd prints the warning above.
In my case, it was neither systemd nor a cron job, but it was snap.
So I had to run:
sudo snap stop docker
sudo snap remove docker
... and the last command actually never ended, I don't know why: this snap thing is really a pain. So I also ran:
sudo apt purge snapd
:-)
If you have no systemctl and started the Docker daemon with:
sudo service docker start
you can stop it by:
sudo service docker stop
Stop a container:
docker stop docker_id
e.g.:
docker stop 1fec077018w4
Remove a container:
docker rm docker_id
e.g.:
docker rm 1fec077018w4
If it does not stop:
docker-compose kill -s SIGINT
Restart with docker-compose:
docker-compose restart
I have a Docker container based on Ubuntu 12.04 and want to start it on Scaleway. The InstantApp there runs Ubuntu 15.04 with systemd, but my container needs upstart. I switched to upstart following this recommendation:
Install the upstart-sysv package, which will remove ubuntu-standard and systemd-sysv (but should not remove anything else -- if it does, yell!), and run sudo update-initramfs -u. After that, grub's "Advanced options" menu will have a corresponding "Ubuntu, with Linux ... (systemd)" entry where you can do an one-time boot with systemd.
Now my server is running with upstart:
# ps aux|grep upstart
root 1447 0.0 0.0 2632 1744 ? S 13:44 0:00 upstart-udev-bridge --daemon
root 1598 0.0 0.0 2044 176 ? S 13:44 0:00 upstart-file-bridge --daemon
root 2571 0.0 0.0 2032 1128 ? S 13:44 0:00 upstart-socket-bridge --daemon
root 32408 0.0 0.0 3156 1472 pts/4 S+ 14:27 0:00 grep --color=auto upstart
but Docker is not running:
# service docker status
* Docker is managed via upstart, try using service docker status
# service docker start
* Docker is managed via upstart, try using service docker start
How can I start Docker as a daemon?
See the answer to this Ask Ubuntu question. It's a workaround to get things running again until the kernel bug is addressed: https://askubuntu.com/questions/683462/docker-is-managed-via-upstart-try-using-service-docker