pause vs stop in docker

I am trying to understand the difference between the commands docker stop ContainerID and docker pause ContainerID. According to this page, both of them are used to pause an existing Docker container.

The docker pause command suspends all processes in the specified containers. On Linux, this uses the cgroups freezer. Traditionally, when suspending a process the SIGSTOP signal is used, which is observable by the process being suspended.
https://docs.docker.com/engine/reference/commandline/pause/
The docker stop command stops a running container. The main process inside the container will receive SIGTERM, and after a grace period, SIGKILL.
https://docs.docker.com/engine/reference/commandline/stop/#options
SIGTERM is the termination signal. The default behavior is to terminate the process, but it can also be caught or ignored. The intention is to kill the process, gracefully or not, but to first allow it a chance to clean up.
SIGKILL is the kill signal. The only behavior is to kill the process, immediately. As the process cannot catch the signal, it cannot clean up, and thus this is a signal of last resort.
SIGSTOP is the pause signal. The only behavior is to pause the process; the signal cannot be caught or ignored. The shell uses pausing (and its counterpart, resuming via SIGCONT) to implement job control.
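A quick way to see the two behaviours side by side (a rough sketch; the container name demo and the alpine image are arbitrary choices):
docker run -d --name demo alpine sleep 300   # start a long-running container
docker pause demo                            # freeze all its processes (cgroups freezer on Linux)
docker ps --filter name=demo                 # STATUS shows "Up ... (Paused)"
docker unpause demo                          # thaw the processes; they continue where they left off
docker stop demo                             # SIGTERM to the main process, SIGKILL after the grace period
docker ps -a --filter name=demo              # STATUS shows "Exited (...)"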

In addition to the answers added earlier:
Running docker events after docker stop shows the events:
kill (signal 15): where signal 15 = SIGTERM
die
stop
Running docker events after docker pause shows only one event:
pause
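To reproduce this yourself, stream the events for the container in one terminal while issuing the commands in another (a sketch; the container name demo is just an example):
docker events --filter container=demo    # terminal 1: watch events for this container
docker stop demo                         # terminal 2: produces kill (signal 15), die, stop
docker start demo && docker pause demo   # terminal 2: produces start, then a single pause event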
Also, docker pause keeps the container's memory allocated while it is paused; that memory is used again when the container is resumed. docker stop, on the other hand, releases the memory once the container has stopped.
This table has even more details.

What is the difference between docker stop and docker pause?
docker stop: Send SIGTERM (termination signal), and if needed SIGKILL (kill signal)
docker pause: Send SIGSTOP (pause signal)
SIGTERM: The default behavior is to terminate the process, but it also can be caught or ignored. The intention is to kill the process, gracefully or not, but first give it a chance to clean up.
SIGKILL: The only behavior is to kill the process, immediately. As the process cannot catch the signal, it cannot clean up; thus, this is a signal of last resort.
SIGSTOP: The only behavior is to pause the process; the signal cannot be caught or ignored. The shell uses pausing (and its counterpart, resuming via SIGCONT) to implement job control.
When to use docker stop and docker pause?
docker stop: When you wish to free up memory or discard the processes' cached data. Simply put, you no longer care about the processes in the container and are comfortable with killing them.
docker pause: When you only want to suspend the processes in the container; you do not want them to lose data or state.
Example:
Consider a container with a counter. Assume the counter has reached 3000. Running docker stop will cause the counter to lose its value and you will be unable to retrieve it. Using docker pause, on the other hand, will maintain the counter state and value.
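A rough way to try this out (not the exact counter from the example; the container name, image and loop are arbitrary):
docker run -d --name counter alpine sh -c 'i=0; while true; do i=$((i+1)); echo $i; sleep 1; done'
docker pause counter && docker unpause counter
docker logs --tail 1 counter    # the counter kept its value across pause/unpause
docker stop counter             # may take the full 10s grace period, since sh ignores SIGTERM as PID 1
docker start counter && sleep 2
docker logs --tail 1 counter    # the loop has restarted from 1; the old in-memory value is gone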
Hope it's clear now!

docker pause pauses (i.e., sends SIGSTOP to, and thereby suspends) all the processes in a container.
docker stop stops the container (i.e., sends SIGTERM, and if needed SIGKILL, to the container's main process).

When a running container is issued the docker pause command, the SIGSTOP signal is sent, which puts the processes inside the container (essentially the container itself) into a paused state.
When docker unpause is issued, the SIGCONT signal is sent to the container's processes to resume them.
When the docker stop command is issued to a running container, the SIGTERM signal is sent to the container's processes, which stops them and thereby the container.
Hence, when docker pause has been issued to a container and the Docker service is then restarted, the cgroups allocated to it are released (as SIGTERM is sent to all of the container's processes).
So after the restart, unpause is of no help, because the containers are already stopped.
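A minimal way to observe this on a systemd-based host (a sketch; it assumes the default daemon configuration, i.e. live-restore disabled, and the name pausedemo is arbitrary):
docker run -d --name pausedemo alpine sleep 600
docker pause pausedemo
sudo systemctl restart docker
docker ps -a --filter name=pausedemo    # the container is now Exited, so there is nothing left to unpause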

Related

How to turn a docker container to a zombie

A few years ago, when I had just started playing with Docker, I remember some blog posts mentioning that if you don't handle your pid(1) process well, you will create a zombie docker container. At the time, I chose to just follow the suggestion and started using an init tool called dumb-init, and I never really saw a zombie container get created.
But I am still curious why it's a problem. If I remember correctly, docker stop xxx by default will send SIGTERM to the container's pid(1) process, and if the process cannot gracefully stop within 10s (the default), Docker will force-kill it by sending SIGKILL to the pid(1) process. I also know that the pid(1) process is special in a Linux system: it can ignore the SIGKILL signal (link). But I think that even if the process's PID inside the docker container is 1, that is only because namespaces are used to scope its processes; on the host machine the same process has another PID, which can be killed by the kernel.
So my questions are:
Why can't the docker engine just kill the container at the host kernel level? That way, no matter what, the user can be sure the container is killed properly.
How can I create a zombie process in docker container? (If someone can share a Gist will be great!)
Not zombie containers, but zombie processes. Write this zombie.py:
#!/usr/bin/env python3
import subprocess
import time

# Start a short-lived child process but never wait() on it
p = subprocess.Popen(['/bin/sleep', '1'])
time.sleep(2)
# The child has exited by now; since it was never reaped, ps shows it as a zombie
subprocess.run(['/bin/ps', '-ewl'])
Write this Dockerfile:
FROM python:3
COPY zombie.py /
CMD ["/zombie.py"]
Build and run it:
chmod +x zombie.py
docker build -t zombie .
docker run --rm zombie
What happens here is that the /bin/sleep command runs to completion. The parent process needs to use the wait call to clean up after it, but it doesn't, so when it runs ps you'll see a "Z" (zombie) process.
But wait, there's more! Say your process does carefully clean up after itself. In this specific example, subprocess.run() includes the required wait call, for instance, and you might change the Popen call to run. If that subprocess launches another subprocess, and it exits (or crashes) without waiting for it, the init process with pid 1 becomes the new parent process of the zombie. (It's worked this way for 40 years.) In a Docker container, though, the main container process runs with pid 1, and if it's not expecting "extra" child processes, you could wind up with stale zombie processes for the life of the container.
This leads to the occasional suggestion that a Docker container should always run some sort of "real" init process, maybe something as minimal as tini, so that something picks up after zombie processes and your actual container job doesn't need to worry about it.
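For the record, Docker ships such a minimal init behind the --init flag, so you can get the reaping behaviour without changing the image (a sketch reusing the zombie image built above). Note that this only covers the re-parenting case from the previous paragraph; it cannot reap the zombie in the first example, whose parent is still alive and simply never calls wait:
docker run --init --rm zombie    # a tini-like init runs as PID 1 and reaps orphaned children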

Wondering about differences between Docker vs Supervisor

They seem to accomplish the same thing of managing processes. What's the difference between Docker and Supervisor?
You can actually use supervisor in a docker container: when you want to make sure that exiting your container will kill all your processes.
A container isolates one main process: as long as that process runs, the container runs.
But if your container needs to run several processes, you need a supervisor to manage the propagation of signals, especially the one indicating that a process needs to be terminated.
See more at "Use of Supervisor in docker" on avoiding the PID 1 zombie reaping problem (zombie processes are processes that have already exited but were never reaped, so they linger as "zombies" with no parent waiting on them).
Since Docker 1.12 (Q3 2016), you no longer need supervisor just for signal handling and zombie reaping when you run multiple processes:
docker run --init
See PR 26061
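For completeness, a minimal sketch of the supervisor approach; the two http.server programs and the port numbers are just stand-ins for real services.
supervisord.conf:
[supervisord]
nodaemon=true

; first managed process (stand-in for a real service)
[program:web]
command=python3 -m http.server 8000

; second managed process
[program:api]
command=python3 -m http.server 9000
Dockerfile:
FROM python:3
RUN pip install supervisor
COPY supervisord.conf /etc/supervisord.conf
CMD ["supervisord", "-c", "/etc/supervisord.conf"]
Here supervisord stays in the foreground as the container's main process; if it receives SIGTERM from docker stop, it shuts both programs down.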

Stopping paused Docker containers

In my application consisting of several containers, I pause containers which are currently not needed. When they are needed again, I unpause them. This works fine.
However, if something goes wrong in one of the running containers (the container exits with an exit code != 0), docker-compose (which I am also using) tries to stop all the other containers. If a container is paused, it cannot be stopped or killed.
A small example to illustrate what happens. (all of these commands are automated in my case)
docker start cd1d8ad01f56
docker pause cd1d8ad01f56
docker stop cd1d8ad01f56
Error response from daemon: Cannot stop container cd1d8ad01f56:
Container cd1d8ad01f56c695a598e168e2eacdcd20a5231b9240029db1579bc0f1dcb903
is paused. Unpause the container before stopping
Error: failed to stop containers: [cd1d8ad01f56]
I want the containers to be stopped, even if they are paused.
Solutions I thought of:
First unpause every paused container, then stop or kill it. This is an unsuitable solution that requires manual work. But it works...
I could write a script that looks for paused containers and then unpauses and kills them (a sketch of this follows below). But I want Compose to just kill all the other stuff and be done with it; I do not want to have to issue another command to run my script.
Is there a way to hook into the exit of a container (i.e. have it unpause the other containers), so that the containers are no longer paused when Compose tries to stop them?
First unpausing every paused container and then stopping or killing it is tedious work, which I would like to automate. I am working in a test environment and do not care how the containers shut down; I just want them to end together with the failed container(s).
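One way to automate the workaround sketched above is a small wrapper that unpauses anything paused before tearing everything down (a sketch; it relies on the status filter of docker ps):
paused=$(docker ps --quiet --filter status=paused)
[ -n "$paused" ] && docker unpause $paused    # paused containers cannot be stopped directly
docker-compose down                           # now everything can be stopped and removed as usual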

Can I clone a paused Docker container?

I can suspend the processes running inside a container with the PAUSE command. Is it possible to clone the Docker container whilst paused, so that it can be resumed (i.e. via the UNPAUSE command) several times in parallel?
The use case for this is a process which takes a long time to start (i.e. ~20 seconds). Given that I want to have a pool of short-lived Docker containers running that process in parallel, I would reduce the start-up time for each container a lot if this were somehow possible.
No, you can only clone the container's disk image, not any running processes.
Yes, you can, using docker checkpoint (CRIU). This does not have anything to do with pause, though; it is a separate docker command.
Also see here.
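For reference, a rough sketch of the checkpoint flow (experimental: the daemon must run in experimental mode and CRIU must be installed; the names slowstart, myimage and cp1 are made up):
docker run -d --name slowstart myimage         # the slow-to-start service
docker checkpoint create slowstart cp1         # snapshot the process state (stops the container by default)
docker start --checkpoint cp1 slowstart        # resume from the saved state instead of starting from scratch
This restores the same container from its checkpoint; restoring a checkpoint into several fresh containers in parallel additionally involves the --checkpoint-dir options and is considerably more fiddly.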

Resource Usage by stopped Docker containers

Docker makes it easy to stop & restart containers. It also has the ability to pause and then unpause containers. The Docker docs state
When the container is exited, the state of the file system and its exit value is preserved. You can start, stop, and restart a container. The processes restart from scratch (their memory state is not preserved in a container), but the file system is just as it was when the container was stopped.
I tested this out by setting up a container with memcached running, wrote a value to memcache, and then:
Stopped & then restarted the container - the memcached value was gone
Paused & then unpaused the container - the memcached value was still intact
Somewhere in the docs - I can no longer find the precise document - I read that stopped containers do not consume CPU or memory. However:
I suppose the fact that the file system state is preserved means that the container still does consume some space on the host's file system?
Is there a performance hit (other than host disk space consumption) associated with having 10s, or even 100s, of stopped containers in the system? For instance, does it make it any harder for Docker to startup and manage new containers?
And finally, if Paused containers retain their memory state when Unpaused - as demonstrated by their ability to remember memcached keys - do they have a different impact on CPU and memory?
I'd be most obliged to anyone who might be able to clarify these issues.
I am not an expert on the Docker core, but I will try to answer some of these questions.
I suppose the fact that the file system state is preserved means that the container still does consume some space on the host's file system?
Yes. Docker saves all the container and image data under /var/lib/docker. By default the container and image data are stored using aufs. The data of each layer is saved under /var/lib/docker/aufs/diff. When a new container is created, a new layer is also created with its own folder, and the changes relative to the layers of the source image are stored there.
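To see how much space containers actually take on the host, Docker can report it directly (a sketch; docker system df needs a reasonably recent Docker version):
docker ps --all --size    # per-container writable-layer size in the SIZE column
docker system df          # aggregate disk usage of images, containers and volumes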
Is there a performance hit (other than host disk space consumption) associated with having 10s, or even 100s, of stopped containers in the system? For instance, does it make it any harder for Docker to startup and manage new containers?
As far as I know, there should not be any performance hit. When you stop a container, the docker daemon sends SIGTERM and then SIGKILL to all the processes of that container, as described in the docker CLI documentation:
Usage: docker stop [OPTIONS] CONTAINER [CONTAINER...]
Stop a running container by sending SIGTERM and then SIGKILL after a grace period
-t, --time=10    Number of seconds to wait for the container to stop before killing it. Default is 10 seconds.
And finally, if Paused containers retain their memory state when Unpaused - as demonstrated by their ability to remember memcached keys - do they have a different impact on CPU and memory?
As @Usman said, docker implements pause/unpause using the cgroup freezer. If I'm not wrong, when you put a process in the freezer (or its cgroup), you block the scheduling of new work for that process by the kernel task scheduler (i.e. it stops the process), but you don't kill it, and it keeps consuming the memory it was using (although the kernel may move that memory to swap). The CPU resources used by a paused container I would consider insignificant. For more information about this, I would check the pull request for this feature, Docker issue #5948.
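Both points are easy to verify from the host (a sketch; the container name somecontainer is a placeholder):
docker inspect --format '{{.State.Paused}}' somecontainer    # prints true while the container is frozen
docker stats --no-stream somecontainer                       # memory stays allocated, CPU usage drops to ~0% while paused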
