How to avoid Docker data clutter during rapid development cycles - docker

I am developing a dockerized application. From time to time I rebuild and run the image/container on my machine to do some testing. To do so I naively executed the following:
docker image build -t myapp .
docker container run -p 8020:5000 -v $(pwd)/data:/data myapp
This seems to work quite well. But today I realized that the docker folder on my machine has grown quite big in the last three weeks (> 200 GB).
So it seems that all the supposedly temporary containers, images, volumes etc. have piled up on my disk.
What would be the right way to solve this? I only need one version of my image, so it would be great everything was simply overwritten every time I start a new test cycle.
Alternatively I could execute a docker system prune, but that seems overkill.

Majority of space is occupied by images and volumes so pruning only them would be better option then docker system prune.
docker image prune
docker volume prune
image prune will clean all dangling images that were untagged from myapp due to a new image built and tag to myapp.
Also as the containers are started for testing/dev, they can be started with --rm for cleaning when stopped.

Related

Reduce the disk space Docker uses [duplicate]

(Post created on Oct 05 '16)
I noticed that every time I run an image and delete it, my system doesn't return to the original amount of available space.
The lifecycle I'm applying to my containers is:
> docker build ...
> docker run CONTAINER_TAG
> docker stop CONTAINER_TAG
> rm docker CONTAINER_ID
> rmi docker image_id
[ running on a default mac terminal ]
The containers in fact were created from custom images, running from node and a standard redis. My OS is OSX 10.11.6.
At the end of the day I see I keep losing Mbs. How can I face this problem?
EDITED POST
2020 and the problem persists, leaving this update for the community:
Today running:
macOS 10.13.6
Docker Engine 18.9.2
Docker Desktop Cli 2.0.0.3
The easiest way to workaround the problem is to prune the system with the Docker utilties.
docker system prune -a --volumes
WARNING:
By default, volumes are not removed to prevent important data from being deleted if there is currently no container using the volume. Use the --volumes flag when running the command to prune volumes as well:
Docker now has a single command to do that:
docker system prune -a --volumes
See the Docker system prune docs
There are three areas of Docker storage that can mount up, because Docker is cautious - it doesn't automatically remove any of them: exited containers, unused container volumes, unused image layers. In a dev environment with lots of building and running, that can be a lot of disk space.
These three commands clear down anything not being used:
docker rm $(docker ps -f status=exited -aq) - remove stopped containers
docker rmi $(docker images -f "dangling=true" -q) - remove image layers that are not used in any images
docker volume rm $(docker volume ls -qf dangling=true) - remove volumes that are not used by any containers.
These are safe to run, they won't delete image layers that are referenced by images, or data volumes that are used by containers. You can alias them, and/or put them in a CRON job to regularly clean up the local disk.
It is also worth mentioning that file size of docker.qcow2 (or Docker.raw on High Sierra with Apple Filesystem) can seem very large (~64GiB), larger than it actually is, when using the following command:
ls -klsh Docker.raw
This can be somehow misleading because it will output the logical size of the file rather than its physical size.
To see the physical size of the file you can use this command:
du -h Docker.raw
Source: https://docs.docker.com/docker-for-mac/faqs/#disk-usage
Why does the file keep growing?
If Docker is used regularly, the size of the Docker.raw (or Docker.qcow2) can keep growing, even when files are deleted.
To demonstrate the effect, first check the current size of the file on the host:
$ cd ~/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/
$ ls -s Docker.raw
9964528 Docker.raw
Note the use of -s which displays the number of filesystem blocks actually used by the file. The number of blocks used is not necessarily the same as the file “size”, as the file can be sparse.
Next start a container in a separate terminal and create a 1GiB file in it:
$ docker run -it alpine sh
# and then inside the container:
/ # dd if=/dev/zero of=1GiB bs=1048576 count=1024
1024+0 records in
1024+0 records out
/ # sync
Back on the host check the file size again:
$ ls -s Docker.raw
12061704 Docker.raw
Note the increase in size from 9964528 to 12061704, where the increase of 2097176 512-byte sectors is approximately 1GiB, as expected. If you switch back to the alpine container terminal and delete the file:
/ # rm -f 1GiB
/ # sync
then check the file on the host:
$ ls -s Docker.raw
12059672 Docker.raw
The file has not got any smaller! Whatever has happened to the file inside the VM, the host doesn’t seem to know about it.
Next if you re-create the “same” 1GiB file in the container again and then check the size again you will see:
$ ls -s Docker.raw
14109456 Docker.raw
It’s got even bigger! It seems that if you create and destroy files in a loop, the size of the Docker.raw (or Docker.qcow2) will increase up to the upper limit (currently set to 64 GiB), even if the filesystem inside the VM is relatively empty.
The explanation for this odd behaviour lies with how filesystems typically manage blocks. When a file is to be created or extended, the filesystem will find a free block and add it to the file. When a file is removed, the blocks become “free” from the filesystem’s point of view, but no-one tells the disk device. Making matters worse, the newly-freed blocks might not be re-used straight away – it’s completely up to the filesystem’s block allocation algorithm. For example, the algorithm might be designed to favour allocating blocks contiguously for a file: recently-freed blocks are unlikely to be in the ideal place for the file being extended.
Since the block allocator in practice tends to favour unused blocks, the result is that the Docker.raw (or Docker.qcow2) will constantly accumulate new blocks, many of which contain stale data. The file on the host gets larger and larger, even though the filesystem inside the VM still reports plenty of free space.
TRIM
A TRIM command (or a DISCARD or UNMAP) allows a filesystem to signal to a disk that a range of sectors contain stale data and they can be forgotten. This allows:
an SSD drive to erase and reuse the space, rather than spend time shuffling it around; and
Docker for Mac to deallocate the blocks in the host filesystem, shrinking the file.
So how do we make this work?
Automatic TRIM in Docker for Mac
In Docker for Mac 17.11 there is a containerd “task” called trim-after-delete listening for Docker image deletion events. It can be seen via the ctr command:
$ docker run --rm -it --privileged --pid=host walkerlee/nsenter -t 1 -m -u -i -n ctr t ls
TASK PID STATUS
vsudd 1741 RUNNING
acpid 871 RUNNING
diagnose 913 RUNNING
docker-ce 958 RUNNING
host-timesync-daemon 1046 RUNNING
ntpd 1109 RUNNING
trim-after-delete 1339 RUNNING
vpnkit-forwarder 1550 RUNNING
When an image deletion event is received, the process waits for a few seconds (in case other images are being deleted, for example as part of a docker system prune ) and then runs fstrim on the filesystem.
Returning to the example in the previous section, if you delete the 1 GiB file inside the alpine container
/ # rm -f 1GiB
then run fstrim manually from a terminal in the host:
$ docker run --rm -it --privileged --pid=host walkerlee/nsenter -t 1 -m -u -i -n fstrim /var/lib/docker
then check the file size:
$ ls -s Docker.raw
9965016 Docker.raw
The file is back to (approximately) it’s original size – the space has finally been freed!
Hopefully this blog will be helpful, also checkout the following macos docker utility scripts for this problem:
https://github.com/wanliqun/macos_docker_toolkit
Docker on Mac has an additional problem that is hurting a lot of people: the docker.qcow2 file can grow out of proportions (up to 64gb) and won't ever shrink back down on its own.
https://github.com/docker/for-mac/issues/371
As stated in one of the replies by djs55 this is in the planning to be fixed, but its not a quick fix. Quote:
The .qcow2 is exposed to the VM as a block device with a maximum size
of 64GiB. As new files are created in the filesystem by containers,
new sectors are written to the block device. These new sectors are
appended to the .qcow2 file causing it to grow in size, until it
eventually becomes fully allocated. It stops growing when it hits this
maximum size.
...
We're hoping to fix this in several stages: (note this is still at the
planning / design stage, but I hope it gives you an idea)
1) we'll switch to a connection protocol which supports TRIM, and
implement free-block tracking in a metadata file next to the qcow2.
We'll create a compaction tool which can be run offline to shrink the
disk (a bit like the qemu-img convert but without the dd if=/dev/zero
and it should be fast because it will already know where the empty
space is)
2) we'll automate running of the compaction tool over VM reboots,
assuming it's quick enough
3) we'll switch to an online compactor (which is a bit like a GC in a
programming language)
We're also looking at making the maximum size of the .qcow2
configurable. Perhaps 64GiB is too large for some environments and a
smaller cap would help?
Update 2019: many updates have been done to Docker for Mac since this answer was posted to help mitigate problems (notably: supporting a different filesystem).
Cleanup is still not fully automatic though, you may need to prune from time to time. For a single command that can help to cleanup disk space, see zhongjiajie's answer.
docker container prune
docker system prune
docker image prune
docker volume prune
Since nothing here was working for me, here's what I did. Check file size:
ls -lhks ~/Library/Containers/com.docker.docker//Data/vms/0/data/Docker.raw
Then in the docker desktop simply reduce the disk image size (I was using raw format). It will say it will delete everything, but by the time you are reading this post, you probably already have. So that creates a fresh new empty file.
i'm not sure if it is related to the current topic , but this been a solution for me personally
open docker settings -> resources -> disk image size - 16gb
There are several options on how to limit docker diskspace, I'd start by limiting/rotating the logs: Docker container logs taking all my disk space
E.g. if you have a recent docker version, you can start it with an --log-opt max-size=50m option per container.
Also - if you've got old, unused containers, you can consider having a look at the docker logs which are located at /var/lib/docker/containers/*/*-json.log
$ sudo docker system prune
WARNING! This will remove:
all stopped containers
all networks not used by at least one container
all dangling images
all dangling build cache

How do I clean docker?

An error occurred because there is not enough disk space
I decided to check how much is free and came across this miracle
Cleaned up via docker system prune -a and
docker container prune -f
docker image prune -f
docker system prune -f
But only 9GB was cleared
Prune removes containers/images that have not been used for a while/stopped. I would suggest you do a docker ps -a and then remove/stop all the containers that you don't want with docker stop <container-id>, and then move on to remove docker images by docker images ps and then remove them docker rmi <image-name>
Once you have stooped/removed all the unwanted containers run docker system prune --volumes to remove all the volumes/cache and dangling images.
Don't forget to prune the volumes! Those almost always take up way more space than images and containers. docker volume prune. Be careful if you have anything of value stored in them though.
It could be your logging of the running containers. I've seen Docker logging writing disks full with logging. By default Docker containers can write unlimited logs.
I always add logging configuration to my docker-compose restrict total size. Docker Logging Configuration.
From the screenshot I think there's some confusion on what's taking up how much space. Overlay filesystems are assembled from directories on the parent filesystem. In this case, that parent filesystem is within /var/lib/docker which is part of / in your example. So the df output for each overlay filesystem is showing how much disk space is used/available within /dev/vda2. Each container isn't using 84G, that's just how much is used in your entire / filesystem.
To see how much space is being used by docker containers, images, volumes, etc, you can use docker system df. If you have running containers (likely from the above screenshot), docker will not clean those up, you need to stop them before they are eligible for pruning. Once containers have been deleted, you can then prune images and volumes. Note that deleting volumes deletes any data you were storing in that volume, so be sure it's not data you wanted to save.
It's not uncommon for docker to use a lot of disk space from downloading lots of images (docker makes it easy to try new things) and those are the easiest to prune when the containers are stopped. However what's harder to see are the logs that containers are outputting which will slowly fill the drive within the containers directory. For more details on how to clean the logs automatically, see this answer.
If you want, you could dig deep at granular level and pinpoint the file(s) which are the cause of this much disk usage.
du -sh /var/lib/docker/overley2/<container hash>/merged/* | sort -h
this would help you coming to a conclusion much easily.

What have to be done to deliver on Docker and avoid to accumulate images?

I use Docker to execute a website I make.
When a release have to be delivered, I have to build a new Docker image and start a new Container from it.
The problem is that images et containers are accumulating and taking huge space.
Besides the delivery, I need to stop the running container and delete it and the source image too.
I don't need Docker command lines but a checklist or a process to not forget anything.
For instance:
-Stop running container
-Delete stopped container
-Delete old image
-Build new image
-Start new container
Am I missing something?
I'm not used to Docker, maybe there are best practices to this pretty classical use case?
The local workflow that works for me is:
Do core development locally, without Docker. Things like interactive debuggers and live reloading work just fine in a non-Docker environment without weird hacks or root access, and installing the tools I need usually involves a single brew or apt-get step. Make all of my pytest/junit/rspec/jest/... tests pass.
docker build a new image.
docker stop && docker rm the old container.
docker run a new container.
When the number of old images starts to bother me, docker system prune.
If you're using Docker Compose, you might be able to replace the middle set of steps with docker-compose up --build.
In a production environment, the sequence is slightly different:
When your CI system sees a new commit, after running the repository's local tests, it docker build && docker push a new image. The image has a unique tag, which could be a timestamp or source control commit ID or version tag.
Your deployment system (could be the CI system or a separate CD system) tells whatever cluster manager you're using (Kubernetes, a Compose file with Docker Swarm, Nomad, an Ansible playbook, ...) about the new version tag. The deployment system takes care of stopping, starting, and removing containers.
If your cluster manager doesn't handle this already, run a cron job to docker system prune.
You should use:
docker system df
to investigate the space used by docker.
After that you can use
docker system prune -a --volumes
to remove unused components. Containers you should stop them yourself before doing this, but this way you are sure to cover everything.

Docker remove container with changes

How can I remove a container without removing all the changes that have been made in this container?
I already used:
Stop all running containers: docker stop $(docker ps -a -q)
Delete all stopped containers: docker rm $(docker ps -a -q)
You can create a docker image from your container with docker commit.
This image can be used to start up new containers or can be pushed to a docker repository for later use.
https://docs.docker.com/engine/reference/commandline/commit/
You don’t. Anything you change in a Docker container will be lost as soon as the container is deleted. Also, you need to routinely delete containers to change options like the base image (if there’s some security patch you need), networking options, or environment variables.
The standard pattern for this involves two things:
Don’t install software or make other configuration changes inside a running container. Instead, write a Dockerfile that does that work for you, check it into source control, and use docker build to build an image from it. (If you get something wrong, you can easily fix it and re-run it; if you need to update something in six months you have a written record of what you did.)
If your container does need some persistent state, start it with docker run -v or a similar option to mount a host directory or named volume into the container. Data stored there will outlive the container.
In theory docker commit can turn a container into an image, but using it really isn’t a best practice. Images you build this way can’t be recreated or updated (again, imagine a critical security fix in the underlying base image). In some cases a committed container won’t even have the data you want to save (for example you can’t usefully commit a MySQL or PostgreSQL container).

Quick docker container refresh workflow

This is probably a duplicate, but all answers that I saw didn't work for me.
I'm using docker (17.06.2-ce), docker-compose (1.16.1).
I have an image of solr which I use for development and testing purposes (and on CI too).
When making changes to the image I need to rebuild the image and recreate containers, so that the containers use the latest possible image, which, in turn, takes the latest possible code from the local repo.
I've created my own image which is based on official solr-docker image. The repo is a folder with additional steps that I'm applying to the image, such as copying files and making changes to existing configs using sed.
I'm working in the repo and have the containers running in the background.
When I need to refresh the containers, I usually do these commands
sudo docker-compose stop
sudo docker rm $(sudo docker ps -a -q)
sudo docker rmi $(sudo docker images -q)
sudo docker-compose up
The above 4 commands is the only way it works for me. All other approaches that I've tried din't rebuild the images and didn't create the containers based off the new, rebuilt images. In other words, the code in the image would be stale.
Questions:
Is it possible to refresh the image + rebuild the container using fewer commands?
Every time I'm running above 4 commands, docker would download ~500MB of dependencies. Is it possible to not to download them and just rebuild the image using updated local code and existing cached dependencies?
I usually do docker-compose rm && docker-compose build && docker-compose up for recreating docker containers: it won't download 500mb.
You can use docker-compose down which does the following:
down Stop and remove containers, networks, images, and volumes
Therefore the command to use will be: docker-compose down --rmi local && docker-compose up
The --rmi local option will remove the built image, and thus forcing a rebuild on up

Resources