Kubernetes/Docker uses too much disk space

I have a Kubernetes cluster with 1 master node and 3 worker nodes. All nodes are running CentOS 7 with Docker 19.06. I'm also running Longhorn for dynamic provisioning of volumes (if that's important).
My problem is that every few days the disk usage on one of the worker nodes grows to 85% (43 GB). This is not a linear increase; it happens over a few hours, sometimes quite rapidly. I can "solve" the problem for a few days by first restarting the Docker service and then running docker system prune -a. If I don't restart the service first, the prune removes next to nothing (only a few MB).
I also tried to find out which container is taking up all that space, but docker system df says the space isn't being used. I used df and du to crawl the /var/lib/docker subdirectories too, and none of the folders (alone or together) seems to take up much space either. Continuing this across the rest of the system, I can't find any other big directories. There are 24 GB that I just can't account for. What makes me think this is a Docker problem nonetheless is that a restart and prune solves it every time.
Googling around, I found a lot of similar issues where most people simply decided to increase disk space. I'm not keen on accepting that as the preferred solution, as it feels like kicking the can down the road.
Would you have any smart ideas on what to do instead of increasing disk space?
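For reference, a minimal sketch of the inspection and cleanup steps described above, assuming a systemd-managed Docker daemon on CentOS 7 (the depth and tail counts are just example values):

# What Docker itself thinks it is using (images, containers, local volumes, build cache)
docker system df -v

# Crawl the Docker data directory to see which subdirectories actually hold the space
sudo du -h --max-depth=2 /var/lib/docker | sort -h | tail -n 20

# The workaround described above: restart the daemon first, then prune
sudo systemctl restart docker
docker system prune -a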

This seems to be expected behavior. From the Docker documentation:
Docker takes a conservative approach to cleaning up unused objects
(often referred to as “garbage collection”), such as images,
containers, volumes, and networks: these objects are generally not
removed unless you explicitly ask Docker to do so. This can cause
Docker to use extra disk space. For each type of object, Docker
provides a prune command. In addition, you can use docker system prune to clean up multiple types of objects at once. This topic shows
how to use these prune commands.
So it seems you have to clean things up manually using docker system prune, docker image prune, or docker container prune. Another possible cause is that your containers produce a lot of logs, which you may also need to clean up (see the log-rotation sketch below).
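If the logs turn out to be the culprit, one hedged option is to enable log rotation for the default json-file logging driver via /etc/docker/daemon.json. The size and file-count values below are only example numbers, and you should merge them into any existing daemon.json rather than overwriting it blindly:

# Example log-rotation settings for the json-file driver
sudo tee /etc/docker/daemon.json <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
EOF

# The settings only apply to containers created after the daemon restart
sudo systemctl restart docker

# Optional one-off cleanup of logs that have already grown (verify the paths first)
sudo sh -c 'truncate -s 0 /var/lib/docker/containers/*/*-json.log'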

Related

How to find and get rid of docker residual data?

I am running Docker on Windows, and even though I run docker system prune, it is using more and more space somewhere on my hard disk.
Often, after restarting the laptop and running prune, I can reclaim some more, but it's still less than Docker actually takes.
I know that Docker is using this space because the free space on my HDD drops when building new images and running containers, but pruning always gives back much less.
It has eaten over 50 GB of my 256 GB SSD.
I'd appreciate any help in finding and efficiently locating all the files Docker leaves behind when building and running containers.
I have tried many of the commands from here, and most of them work, but I always fail to reclaim all the space; given that I have a very small SSD, I really need every bit of space I can get back.
Many thanks in advance!
I suggest adding the --all flag to docker system prune, because of:
-a, --all : Remove all unused images, not just dangling ones
I use this to free up all the disk space I no longer need.
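A hedged example of what that looks like in practice; note that --volumes also removes unused volumes (potentially data you want to keep), and on newer Docker versions the build cache is pruned separately:

# Remove stopped containers, unused networks, and all unused images, not just dangling ones
docker system prune --all

# Also remove unused volumes -- destructive, so review them first
docker volume ls
docker system prune --all --volumes

# The BuildKit build cache has its own prune command
docker builder prune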

/var/lib/docker/containers/* eats my hard disk space

My Raspberry Pi suddenly had no more free space.
By looking at the folder sizes with the following command:
sudo du -h --max-depth=3
I noticed that a Docker folder eats an incredible amount of hard disk space. It's the folder
/var/lib/docker/containers/*
The folder seems to contain data for the currently running Docker containers; the first characters of each folder name correspond to a Docker container ID. One folder seemed to grow dramatically fast. After stopping and removing the affected container, the related folder disappeared, so it seems to have belonged to that container.
Problem solved.
I now wonder what could cause this folder to grow so much, and what the best way is to avoid running into the same problem again later.
I could write a bash script that removes the affected container at boot and starts it again. Better ideas are very welcome.
The container IDs are directories, so you can look inside them to see what is using the space. The two main causes are:
Logs from stdout/stderr. These can be limited with logging options. You can view them with docker logs.
Filesystem changes. The underlying image filesystem is not changed, so any write triggers a copy-on-write into a directory under each container ID. You can view these changes with docker diff.
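A rough sketch of checking both causes, assuming the default json-file logging driver and using nginx purely as an example image:

# Find the biggest per-container log files under /var/lib/docker/containers
sudo sh -c 'du -sh /var/lib/docker/containers/*/*-json.log | sort -h | tail -n 5'

# Show the filesystem changes a container has made on top of its image
docker diff <container-id>

# Cap log growth for a single container at start time (example values)
docker run -d --log-opt max-size=10m --log-opt max-file=3 nginx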

Docker taking up a lot of disk space

I am using Docker Desktop for Windows on Windows 10.
I was experiencing issues with the system SSD always being full, so I moved the 'docker-desktop-data' distro (which is used to store Docker images and other data) off the system drive to drive D:, which is an HDD, using this guide.
Finally I was happy to have a lot of space on my SSD... but the Docker containers started to run more slowly. I guess this happens because HDD read/write operations are slower than on an SSD.
Is there a better way to solve the problem of the continuously growing Docker distros without impacting how fast the containers actually run and images are built?
Really only by design. As you know, a Docker container is layered, so it might be feasible to check whether you can create something like a "base container" from which your actual image is derived.
It might also be sensible to check whether your base distro is small enough. I have often seen containers built from full-blown Debian or Ubuntu distros; that's not the best idea. Try deriving from an Alpine version, or look for even smaller options, as in the rough comparison below.
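As a rough illustration of the size difference between base images (the tags are examples and exact sizes vary by tag and architecture):

# Pull a few common base images and compare the SIZE column
docker pull alpine:3.19
docker pull debian:bookworm-slim
docker pull ubuntu:22.04
docker images | grep -E '^(alpine|debian|ubuntu) '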

GlusterFS storage for docker images

I have a Docker Swarm with 10 Docker worker nodes, and I'm experiencing issues with Docker image storage (in the thin pool). It keeps filling up because I have rather small disks (30-60 GB).
The error:
Thin Pool has 7330 free data blocks which is less than minimum required 8455 free data blocks. Create more free space in thin pool or use dm.min_free_space option to change behavior
Because of that, the cleaning strategy has to be aggressive, meaning all images are deleted three times a day. This aggressive cleaning results in broken pulls (when cleaning happens at the same time someone is pulling an image), and developers cannot use cached images; instead they have to re-download the images that the cleaning mechanism just deleted.
However, there is an option to use GlusterFS storage, and I want to mount GlusterFS volumes on each Docker node and use them to create the thin pool for Docker images and /var/lib/docker.
I'm looking for a guide on how to do that exactly. Has anyone tried it?
P.S. I did my research on shared storage for Docker images between multiple Docker nodes, and it seems it's not possible, as stated here. However, mounting a separate volume on each Docker node should be possible.
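There's no ready-made guide in this thread, but a rough per-node sketch might look like the following. The names gluster-server:/docker-vol-node01 and /mnt/docker-data are hypothetical, each node needs its own volume (since /var/lib/docker must never be shared between daemons), and be aware that the storage driver has to support the backing filesystem; overlay2 generally cannot run directly on a FUSE mount, so this is untested ground:

# On each worker node: mount a dedicated GlusterFS volume (hypothetical names)
sudo yum install -y glusterfs-fuse
sudo mkdir -p /mnt/docker-data
sudo mount -t glusterfs gluster-server:/docker-vol-node01 /mnt/docker-data

# Point the daemon's data directory at the mount via /etc/docker/daemon.json:
#   { "data-root": "/mnt/docker-data" }
# then stop Docker, move the old data over, and start it again
sudo systemctl stop docker
sudo rsync -a /var/lib/docker/ /mnt/docker-data/
sudo systemctl start docker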

Docker container memory usage

I have a LAMP Docker image.
I want to start 500 containers of this image; how much RAM do I need?
I have tracked the memory usage of each new container, and it is nearly the same as for any other container of the same image.
So, if a single container uses 200 MB, I can start 5 containers on a Linux machine with 1 GB of RAM.
My question is:
Does a Docker container use the same amount of memory as, for example, the same image running as a virtual machine?
Maybe I am doing something wrong in my Docker configuration or Dockerfiles?
docker stats might give you the feedback you need. https://docs.docker.com/engine/reference/commandline/stats/
I don't know the exact details of the Docker internals, but the general idea is that Docker tries to reuse as much as it can. So if you start five identical containers, it should run much faster than a virtual machine, because Docker only needs one instance of the base image and filesystem, which all the containers refer to.
Any change to the filesystem of one container will be added as a layer on top, only recording the change. The underlying image is not modified, so the five containers can still refer to the same single base image.
A virtual machine, however (I believe), will have a complete copy of the filesystem for each of the five instances, because it doesn't use a layered filesystem.
So I'm not sure how you can determine exactly how much memory you need, but this should make the concept clearer. You could start one container to see the 'base memory' needed for one; each new container should then only add a smaller, roughly constant amount of memory, which should give you a broad idea of how much you need. The sketch below shows how you might measure and cap this.
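A small hedged sketch of measuring and capping per-container memory; my-lamp-image is a placeholder name, and 256m/128m are just example values:

# One-shot snapshot of memory and CPU usage for all running containers
docker stats --no-stream

# Start a container with a hard memory limit and a soft reservation (example values)
docker run -d --memory=256m --memory-reservation=128m my-lamp-image

# Inspect the effective hard limit afterwards (reported in bytes)
docker inspect --format '{{.HostConfig.Memory}}' <container-id>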

Resources