Docker makes it easy to stop & restart containers. It also has the ability to pause and then unpause containers. The Docker docs state:
When the container is exited, the state of the file system and its exit value is preserved. You can start, stop, and restart a container. The processes restart from scratch (their memory state is not preserved in a container), but the file system is just as it was when the container was stopped.
I tested this out by setting up a container with memcached running, writing a value to memcache, and then:
Stopped & then restarted the container - the memcached value was gone
Paused & then unpaused the container - the memcached value was still intact
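For reference, the same memory-state behaviour can be sketched for an ordinary process without Docker, since stop/start vs. pause/unpause roughly maps onto kill-and-restart vs. SIGSTOP/SIGCONT. This is my own analogue of the test, not from the docs; the value and file names are invented:

```shell
# Analogue of the memcached test: in-memory state survives STOP/CONT
# ("pause/unpause") but would not survive a kill + fresh restart ("stop/start").
out=$(mktemp)
sh -c "v=cached-value; trap 'echo \$v > $out; exit 0' USR1; while :; do sleep 0.2; done" &
pid=$!
sleep 0.5                  # give the trap time to install
kill -STOP "$pid"          # "pause": process frozen, memory kept
kill -CONT "$pid"          # "unpause"
kill -USR1 "$pid"          # ask the process to dump its in-memory value
wait "$pid" 2>/dev/null || true
cat "$out"                 # the value survived the pause
```

A fresh `sh -c ...` after killing the process would start with no `v` at all, which is the stop/start case.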
Somewhere in the docs - I can no longer find the precise document - I read that stopped containers do not consume CPU or memory. However:
I suppose the fact that the file system state is preserved means that the container still does consume some space on the host's file system?
Is there a performance hit (other than host disk space consumption) associated with having 10s, or even 100s, of stopped containers in the system? For instance, does it make it any harder for Docker to startup and manage new containers?
And finally, if Paused containers retain their memory state when Unpaused - as demonstrated by their ability to remember memcached keys - do they have a different impact on CPU and memory?
I'd be most obliged to anyone who might be able to clarify these issues.
I am not an expert on docker core but I will try to answer some of these questions.
I suppose the fact that the file system state is preserved means that the container still does consume some space on the host's file system?
Yes. Docker saves all the container and image data in /var/lib/docker. The default storage driver here is aufs (newer Docker releases default to overlay2), and with it the data of each layer is saved under /var/lib/docker/aufs/diff. When a new container is created, a new layer is also created with its own folder, and the changes relative to the layers of the source image are stored there.
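As a toy illustration of the idea (the directory names are invented for the sketch, not Docker's actual on-disk layout): each layer directory only stores what changed relative to the layers below it, so a stopped container's disk cost is just its own diff directory:

```shell
# Toy model of layered storage: each directory holds only its diff.
work=$(mktemp -d); cd "$work"
mkdir -p image/layer1 image/layer2 container/diff
echo "base config"    > image/layer1/app.conf
echo "patched config" > image/layer2/app.conf     # layer2 overrides layer1
echo "runtime change" > container/diff/state.log  # container's writable layer
# The disk cost of the (stopped) container is only its diff directory:
du -sh container/diff
```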
Is there a performance hit (other than host disk space consumption) associated with having 10s, or even 100s, of stopped containers in the system? For instance, does it make it any harder for Docker to startup and manage new containers?
As far as I know, there should not be any performance hit. When you stop a container, the docker daemon sends SIGTERM and then, after a grace period, SIGKILL to the processes of that container, as described in the docker CLI documentation:
Usage: docker stop [OPTIONS] CONTAINER [CONTAINER...]

Stop a running container by sending SIGTERM and then SIGKILL after a grace period

  -t, --time=10    Number of seconds to wait for the container to stop before killing it. Default is 10 seconds.
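The TERM-then-KILL pattern is easy to sketch with plain signals. In this sketch a `sleep` process stands in for a container's main process, and the grace period is shortened from docker's 10-second default to keep the example quick:

```shell
# Sketch of docker stop's behaviour: SIGTERM first, SIGKILL after a grace period.
sleep 300 &                 # stand-in for the container's PID 1
pid=$!
kill -TERM "$pid"           # polite shutdown request
sleep 1                     # (shortened) grace period
if kill -0 "$pid" 2>/dev/null; then
    kill -KILL "$pid"       # still alive after the grace period: force it
fi
wait "$pid" 2>/dev/null || true
```

Since `sleep` exits on SIGTERM, the SIGKILL branch is never reached here; it only fires for processes that ignore or delay handling SIGTERM.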
And finally, if Paused containers retain their memory state when Unpaused - as demonstrated by their ability to remember memcached keys - do they have a different impact on CPU and memory?
As @Usman said, docker implements pause/unpause using the cgroup freezer. If I'm not wrong, when you put a process (or its cgroup) in the freezer, the kernel task scheduler stops running its tasks (i.e. it stops the process), but it is not killed and it keeps consuming the memory it was using (although the kernel may move that memory to swap). The CPU resources used by a paused container I would consider insignificant. For more information about this I would check the pull request for this feature, Docker issue #5948.
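The resource behaviour is easy to observe with SIGSTOP, the traditional analogue of the freezer (unlike the freezer, SIGSTOP is observable by the process, but in both cases the process keeps its memory and gets no CPU time):

```shell
# A stopped process keeps its memory but is no longer scheduled on the CPU.
sleep 300 &
pid=$!
kill -STOP "$pid"                          # analogue of docker pause
state=$(ps -o stat= -p "$pid" | tr -d ' ')
echo "state: $state"                       # a leading T means stopped, not runnable
kill -CONT "$pid"                          # analogue of docker unpause
kill -TERM "$pid"
wait "$pid" 2>/dev/null || true
```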
Related
My server is at 0% free space. I just deleted 100 GB of data in one of my docker volumes in an actively running container.
How do I free up the space and release it to the host system so that I am no longer at 0%? Do I need to stop the docker container to release it?
Thanks!
If the container is still running, and deleting files doesn't show any drop in disk usage, odds are good that the process inside your container has those file handles open. The OS won't release the underlying file and reclaim the space until all processes with open file handles close those file handles, or the process exits. In short, you likely need to restart the container.
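This is ordinary Unix behaviour, not something Docker-specific; you can reproduce it with any process holding a deleted file open (the file descriptor number 3 here is arbitrary):

```shell
# Deleting a file does not free its space while a process holds it open.
tmp=$(mktemp)
exec 3>"$tmp"                     # this shell now holds the file open on fd 3
rm -- "$tmp"                      # unlinked from the directory tree...
link=$(readlink "/proc/$$/fd/3")
echo "$link"                      # ...but still open: Linux shows "(deleted)"
exec 3>&-                         # closing the last handle frees the space
```

Tools like `lsof +L1` list exactly these open-but-deleted files when you're hunting for "missing" disk space.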
What you need to do is clean your docker system using:
docker system prune: https://docs.docker.com/config/pruning/
This will remove stopped containers, dangling images, unused networks and more.
I like to use Jupyter Notebook. If I run it in a VM in virtualbox, I can save the state of the VM, and then pick up right where I left off the next day. Can I do something similar if I were to run it in a docker container? i.e. dump the "state" of the container to disk, then crank it back up and reload the "state"?
It looks like docker checkpoint may be the thing I'm attempting to accomplish here. There's not much in the docs that describes it as such. In fact, the docs for docker checkpoint say "Manage checkpoints" which is massively unhelpful.
UPDATE: This IS, in fact, what docker checkpoint is supposed to accomplish. When I checkpoint my jupyter notebook container, it saves it, I can start it back up with docker start --checkpoint [my_checkpoint] jupyter_notebook, and it shows the things I had running as being in a Running state. However, attempts to then use the Running notebooks fail. I'm not sure if this is a CRIU issue or a Jupyter issue, but I'll bring it up in the appropriate git issue tracker.
Anyhoo docker checkpoint is the thing that is supposed to provide VM-save-state/hibernate style functionality.
The closest approach I can see is docker pause <container-id>
https://docs.docker.com/engine/reference/commandline/pause/
The docker pause command suspends all processes in the specified containers. On Linux, this uses the cgroups freezer. Traditionally, when suspending a process the SIGSTOP signal is used, which is observable by the process being suspended. With the cgroups freezer the process is unaware, and unable to capture, that it is being suspended, and subsequently resumed.
As an important difference from VirtualBox hibernation, take into account that there is no disk persistence of the memory state of the containerized process.
If you just stop the container, it hibernates:
docker stop myjupyter
(hours pass)
docker start myjupyter
docker attach myjupyter
I do this all the time, especially with docker containers which have web browsers in them.
I can suspend the processes running inside a container with the PAUSE command. Is it possible to clone the Docker container whilst paused, so that it can be resumed (i.e. via the UNPAUSE command) several times in parallel?
The use case for this is a process which takes long time to start (i.e. ~20 seconds). Given that I want to have a pool of short-living Docker containers running that process in parallel, I would reduce start-up time for each container a lot if this was somehow possible.
No, you can only clone the container's disk image, not any running processes.
Yes, you can, using docker checkpoint (CRIU). This does not have anything to do with pause, though; it is a separate docker command.
Also see here.
I have a running docker container with a base image fedora:latest.
I would like to preserve the state of my running applications, but still update a few packages which got security fixes (i.e. gnutls, openssl and friends) since I first deployed the container.
How can I do that without interrupting service or losing the current state?
So optimally I would like to get a bash/csh/dash/sh on the running container, or any fleet magic?
It's important to note that you may run into some issues with the container shutting down.
For example, imagine that you have a Dockerfile for an Apache container which runs Apache in the foreground. Imagine that you attach a shell to your container (via docker exec) and you start updating. You have to apply a fix to Apache and, in the process of updating, Apache restarts. The instant that Apache shuts down, the container will stop. You're going to lose the current state of the applications. This is going to require extremely careful planning and some luck, and some updates will probably not be possible.
The better way to do it is rebuild the image upon which the container is based with all the appropriate updates, then re-run the container. There will be a (brief) interruption in service. However, in order for you to be able to save the state of your applications, you would need to design the images in such a way that any state information that needs to be preserved is stored in a persistent manner - either in the host file system by mounting a directory or in a data container.
In short, if you're going to lose important information when your container shuts down, then your system is fragile & you're going to run into problems sooner or later. Better to redesign it so that everything that needs to be persistent is saved outside the container.
If the docker container has a running bash
docker attach <containerIdOrName>
Otherwise execute a new program in the same container (here: bash)
docker exec -it <containerIdOrName> bash
Assume I am starting a big number of docker containers which are based on the same docker image. It means that each docker container is running the same application. It could be the case that the application is big enough to require a lot of disk space.
How is docker dealing with it?
Do all docker containers share the static part defined in the docker image?
If not, does it make sense to copy the application into some directory on the machine which is used to run docker containers and to mount this app directory for each docker container?
Docker shares resources at kernel level. This means application logic is never replicated when it is run. If you start notepad 1000 times, it is still stored only once on your hard disk; the same goes for docker instances.
If you run 100 instances of the same docker image, all you really do is keep the state of the same piece of software in your RAM in 100 separate timelines. The host's processor(s) advance the in-memory state of each of these container instances, so you DO consume 100 times the RAM required for running the application.
There is no point in physically storing the exact same byte-code for the software 100 times because this part of the application is always static and will never change. (Unless you write some crazy self-altering piece of software, or you choose to rebuild and redeploy your container's image)
This is why containers don't allow persistence out of the box, and how docker differs from regular VMs that use virtual hard disks. However, this is only true for the persistence inside the container. The files that are changed by docker software on the hard disk are "mounted" into containers using docker volumes and thus aren't really part of the docker environments, but are just mounted into them. (Read more about this at: https://docs.docker.com/userguide/dockervolumes/)
Another question that you might want to ask when you think about this is how docker stores changes that it makes to its disk at runtime. What is really sweet to check out is how docker actually manages to get this working. The original state of the container's hard disk is what is given to it from the image. It can NOT write to this image. Instead of writing to the image, a diff is made of what is changed in the container's internal state in comparison to what is in the docker image.
Docker uses a technology called "Union Filesystem", which creates a diff layer on top of the initial state of the docker image.
This "diff" (the writable container layer) is stored on the host's disk and disappears when you delete your container. (Unless you use the command "docker commit"; however, I don't recommend this: the state of your new docker image is not represented in a Dockerfile and can not easily be regenerated from a rebuild.)
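A toy version of that diff idea, with plain directories standing in for the image layers and the container's view (`docker diff <container>` shows the real thing):

```shell
# Toy "docker diff": the writable layer is the difference between the
# image's files and the container's current view of them.
work=$(mktemp -d); cd "$work"
mkdir img ctr
echo "v1"  > img/app.conf
cp -r img/. ctr/                # container starts from the image's view
echo "v2"  > ctr/app.conf       # changed at runtime
echo "log" > ctr/state.log      # created at runtime
diff -rq img ctr || true        # this difference IS the writable layer
```

A union filesystem computes and stores exactly this kind of difference as an overlay, without ever touching the read-only image layers underneath.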