How to turn a docker container to a zombie - docker

A few years ago. When I just started playing docker. I remember there are some blog posts mentioned if you don't handle your pid(1) process well. You will create a zombie docker container. At that time. I chose just follow the suggestion start using a init tool called dumb-init. And I never really see a zombie container be created.
But I am still curious why it's a problem. If I remember correctly, docker stop xxx by default will send SIGTERM to the container pid(1) process. And if the process can not gracefully stop within 10s (default). Docker will force kill it by sending SIGKILL to pid(1) process. And I also know that pid(1) process is special in Linux system. It can ignore SIGKILL signal (link). But I think even if the process's PID in docker container is 1. It just because it's using namespaces to scope its processes. In the host machine, you should see the process is another PID. Which can be killed by the kernel.
So my questions are:
Why can't docker engine just kill the container in the host kernel level? So no matter what. The user can ensure the container be killed properly.
How can I create a zombie process in docker container? (If someone can share a Gist will be great!)

Not zombie containers, but zombie processes. Write this zombie.py:
#!/usr/bin/env python3
import subprocess
import time
p = subprocess.Popen(['/bin/sleep', '1'])
time.sleep(2)
subprocess.run(['/bin/ps', '-ewl'])
Write this Dockerfile:
FROM python:3
COPY zombie.py /
CMD ["/zombie.py"]
Build and run it:
chmod +x zombie.py
docker build -t zombie .
docker run --rm zombie
What happens here is the /bin/sleep command runs to execution. The parent process needs to use the wait call to clean up after it, but it doesn't so when it runs ps, you'll see a "Z" zombie process.
But wait, there's more! Say your process does carefully clean up after itself. In this specific example, subprocess.run() includes the required wait call, for instance, and you might change the Popen call to run. If that subprocess launches another subprocess, and it exits (or crashes) without waiting for it, the init process with pid 1 becomes the new parent process of the zombie. (It's worked this way for 40 years.) In a Docker container, though, the main container process runs with pid 1, and if it's not expecting "extra" child processes, you could wind up with stale zombie processes for the life of the container.
This leads to the occasional suggestion that a Docker container should always run some sort of "real" init process, maybe something as minimal as tini, so that something picks up after zombie processes and your actual container job doesn't need to worry about it.

Related

Is there a way to hibernate a docker container

I like to use Jupyter Notebook. If I run it in a VM in virtualbox, I can save the state of the VM, and then pick up right where I left off the next day. Can I do something similar if I were to run it in a docker container? i.e. dump the "state" of the container to disk, then crank it back up and reload the "state"?
It looks like docker checkpoint may be the thing I'm attempting to accomplish here. There's not much in the docs that describes it as such. In fact, the docs for docker checkpoint say "Manage checkpoints" which is massively unhelpful.
UPDATE: This IS, in fact, what docker checkpoint is supposed to accomplish. When I checkpoint my jupyter notebook container, it saves it, I can start it back up with docker start --checkpoint [my_checkpoint] jupyter_notebook, and it shows the things I had running as being in a Running state. However, attempts to then use the Running notebooks fail. I'm not sure if this is a CRIU issue or a Jupyter issue, but I'll bring it up in the appropriate git issue tracker.
Anyhoo docker checkpoint is the thing that is supposed to provide VM-save-state/hibernate style functionality.
The closest approach I can see is docker pause <container-id>
https://docs.docker.com/engine/reference/commandline/pause/
The docker pause command suspends all processes in the specified containers. On Linux, this uses the cgroups freezer. Traditionally, when suspending a process the SIGSTOP signal is used, which is observable by the process being suspended. With the cgroups freezer the process is unaware, and unable to capture, that it is being suspended, and subsequently resumed.
Take into account as an important difference against VirtualBox hibernation, that there is no disk persistence of the memory state of the containerized process.
If you just stop the container, it hibernates:
docker stop myjupyter
(hours pass)
docker start myjupyter
docker attach myjupyter
I do this all the time, especially with docker containers which have web browers in them.

Wondering about differences between Docker vs Supervisor

They seem to accomplish the same thing of managing processes. What's the difference between Docker and Supervisor?
You can use supervisor in a docker container actually: when you can to make sure that exiting your container will kill all your processes.
A Container isolate one main process: as long as that process runs, the container runs.
But if your container needs to run several processes, you need a supervisor to manage the propagation of signals, especially the one indicating a process needs to be terminated.
See more at "Use of Supervisor in docker" to avoid the PID 1 zombie reaping problem. (zombie processes are processes which are never stopped, and remains "zombie", without any parent process)
Since Docker 1.12 (Q3 2016), you don't need supervisor anymore if you have multiple processes:
docker run --init
See PR 26061

Is it possible to run a command when stopping a Docker container?

I'm using docker-compose to organize containers for a JS application.
The source container sports command: npm start, pretty standard, to spin up the live application. However, that timeouts when I ask it to stop.
I was wondering if it's possible to have docker-compose stop to run a command inside the container - that could properly terminate the application.
docker-compose stop just sends a SIGTERM to your container and if it does not stop after 10secs (configurable) SIGKILL follows. So if you want to customize this behavior, you should handle signal inside your entrypoint (if you have one).

Docker Process Management

I have a deployed application running inside a Docker container, which is, in effect, an websocket client that runs forever. Every deploy I'm rebuilding the container and starting it with docker run using the command set in the Dockerfile.
Now, I've noticed a few times that the process occasionally dies without restarting. When running docker ps, I can see that the container is up, and has been up for 2 weeks, however the process running inside of it has died without the host being any the wiser
Do I need to go so far as to have a process manager inside of the docker container to manage the containerized process?
EDIT:
Dockerfile: https://github.com/DVG/catpen-edi/blob/master/Dockerfile
We've developed a process-manager tailor-made for Docker containers and have been using it with quite a bit of success to solve exactly the problem you describe. The best starting point is to take a look at chaperone-docker on github. The readme on the first page contains a quick link to a minimal base image as well as a fully configured LAMP stack so you can try it out and see what a fully-configured image would look like. It's open-source and fully documented.
This is a very interesting problem here related to PID1 and the fact that docker replaces PID1 with the command specified in CMD or ENTRYPOINT. What's happening is that the child process isn't automagically adopted by anything if the parent dies and it becomes an orphan (since there is no PID1 in the sense of a traditional init system like you're used to). Here is some excellent reading to give you a few ideas. You may get some mileage out of their baseimage-docker image which comes with their simplified init system ("my_app"), which will solve some of this problem for you. However, I would strongly caution you against automatically adopting the Phusion mindset for all of your containers, as there exists some ideological friction in that space. I can't recall any discussion on Docker's Github about a potential minimal init system to solve this problem, but I can't imagine it will be a problem forever. Good luck!
If you have two ruby processes it sounds like the child hasn't exited, the application has just stopped working. It's likely the EventMachine reactor is sitting in the background.
Does the EDI app really need to spawn the additional Ruby process? This only adds another layer between Docker and your app. Run the server directly with CMD [ "ruby", "boot.rb" ]. If you find the problem still occurs with a single process then you will need to find what is causing your app to hang.
When a process is running as PID 1 is docker it will need handle the SIGINT and SIGTERM signals too.
# Trap ^C
Signal.trap("INT") {
shut_down
exit
}
# Trap `Kill `
Signal.trap("TERM") {
shut_down
exit
}
Docker also has restart policies for when the container does actually die.
docker run --restart=always
no
Do not automatically restart the container when it exits. This is
the default.
on-failure[:max-retries]
Restart only if the container
exits with a non-zero exit status. Optionally, limit the number of
restart retries the Docker daemon attempts.
always
Always restart the
container regardless of the exit status. When you specify always, the
Docker daemon will try to restart the container indefinitely. The
container will also always start on daemon startup, regardless of the
current state of the container.
unless-stopped
Always restart the
container regardless of the exit status, but do not start it on daemon
startup if the container has been put to a stopped state before.

How can I know why a Docker container stopped?

I have a Docker container that contains a JVM process. When the process ends the container completes and stops.
While thankfully rare, my JVM can go legs-up suddenly with a hard-failure, e.g. OutOfMemoryError. When this happens my container just stops, like a normal JVM exit.
I can have distributed logging, etc., for normal JVM logging, but in this hard-fail case I want to know the JVM's dying words, which are typically uttered on stderr.
Is there a way to know why my container stopped, look around in logs, stderr, or something along these lines?
You can run the docker logs [ContainerName] or docker logs [ContainerID] command even on stopped containers. You can see them with docker ps -a.
Source: the comment of Usman Ismail above: this is just a way to convert his comment into a proper answer.

Resources