Is it safe to clean docker/overlay2/ - docker

I have some docker containers running on AWS EC2, and the /var/lib/docker/overlay2 folder grows very fast in disk size.
I'm wondering if it is safe to delete its contents, or if docker has some kind of command to free up disk usage.
UPDATE:
I actually tried docker system prune -a already, which reclaimed 0 KB.
Also, my /docker/overlay2 disk size is much larger than the output of docker system df.
After reading the docker documentation and BMitch's answer, I believe it is a bad idea to touch this folder, and I will try other ways to reclaim my disk space.

Docker uses /var/lib/docker to store your images, containers, and local named volumes. Deleting this can result in data loss and possibly stop the engine from running. The overlay2 subdirectory specifically contains the various filesystem layers for images and containers.
To clean up unused containers and images, see docker system prune. There are also options to remove volumes and even tagged images, but they aren't enabled by default due to the possibility of data loss:
$ docker system prune --help
Usage: docker system prune [OPTIONS]
Remove unused data
Options:
  -a, --all             Remove all unused images not just dangling ones
      --filter filter   Provide filter values (e.g. 'label=<key>=<value>')
  -f, --force           Do not prompt for confirmation
      --volumes         Prune volumes
What a prune will never delete includes:
running containers (list them with docker ps)
logs on those containers (see this post for details on limiting the size of logs)
filesystem changes made by those containers (visible with docker diff)
Additionally, anything created outside of the normal docker folders may not be seen by docker during this garbage collection. This could be from some other app writing to this directory, or a previous configuration of the docker engine (e.g. switching from AUFS to overlay2, or possibly after enabling user namespaces).
What would happen if this advice is ignored and you deleted a single folder like overlay2 out from this filesystem? The container filesystems are assembled from a collection of filesystem layers, and the overlay2 folder is where docker is performing some of these mounts (you'll see them in the output of mount when a container is running). Deleting some of these when they are in use would delete chunks of the filesystem out from a running container, and likely break the ability to start a new container from an impacted image. See this question for one of many possible results.
To completely refresh docker to a clean state, you can delete the entire directory, not just sub-directories like overlay2:
# danger, read the entire text around this code before running
# you will lose data
sudo -s
systemctl stop docker
rm -rf /var/lib/docker
systemctl start docker
exit
The engine will restart in a completely empty state, which means you will lose all:
images
containers
named volumes
user created networks
swarm state

I found this worked best for me:
docker image prune --all
By default Docker will not remove named images, even if they are unused. This command removes all unused images.
Note that each layer in an image is a folder inside /var/lib/docker/overlay2/.

I had this issue... It was the logs that were huge. Logs are here:
/var/lib/docker/containers/<container id>/<container id>-json.log
You can manage this on the docker run command line or in the compose file. See: Configure logging drivers.
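Before capping future logs, you can reclaim the space the existing ones already occupy. On the host, sudo du -h /var/lib/docker/containers/*/*-json.log | sort -h lists the offenders; then truncate them in place rather than rm them, because the daemon keeps an open handle on the file and deleting it would not free the space until a restart. A sketch, demonstrated on a scratch file since the real paths need root:

```shell
# Stand-in for /var/lib/docker/containers/<id>/<id>-json.log:
log=$(mktemp)
head -c 1048576 /dev/zero > "$log"   # pretend this is a 1 MiB container log
truncate -s 0 "$log"                 # reclaim the space; the open handle stays valid
stat -c %s "$log"                    # prints 0
rm -f "$log"
```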
I personally added these 3 lines to my docker-compose.yml file :
my_container:
  logging:
    options:
      max-size: 10m
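If you'd rather enforce the limit engine-wide instead of per service, the same options can go in the daemon config (a sketch, assuming a systemd-based Linux host; it only affects containers created after the restart):

```shell
# Hypothetical engine-wide equivalent of the compose options above.
sudo tee /etc/docker/daemon.json <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": { "max-size": "10m", "max-file": "3" }
}
EOF
sudo systemctl restart docker   # existing containers keep their old log settings
```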

I also had problems with a rapidly growing overlay2.
/var/lib/docker/overlay2 is the folder where docker stores the writable layers for your containers.
docker system prune -a may only reclaim the space after a container is stopped and removed.
In my case I was able to figure out what consumed the space by going into overlay2 and investigating.
That folder contains other, hash-named folders. Each of those has several subfolders, including a diff folder.
The diff folder contains the actual differences written by a container, with the exact folder structure of your container (at least it was in my case - Ubuntu 18...).
So I used du -hsc /var/lib/docker/overlay2/LONGHASHHHHHHH/diff/tmp to figure out that /tmp inside my container was the folder which got polluted.
As a workaround I used the -v /tmp/container-data/tmp:/tmp parameter for the docker run command to map the inner /tmp folder to the host, and set up a cron on the host to clean up that folder.
cron task was simple:
sudo nano /etc/crontab
*/30 * * * * root rm -rf /tmp/container-data/tmp/*
save and exit
NOTE: overlay2 is a system docker folder, and its structure may change at any time. Everything above is based on what I saw in there. I only went into the docker folder structure because the system was completely out of space and wouldn't even allow me to ssh into the docker container.
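A gentler variant of that cron rule (a sketch; /tmp/container-data/tmp is the hypothetical bind mount from above) removes only entries untouched for 60+ minutes, so files the container is still actively writing are left alone:

```shell
# Crontab line (shown as a comment here):
#   */30 * * * * root find /tmp/container-data/tmp -mindepth 1 -mmin +60 -delete
# The find invocation itself, demonstrated on a scratch directory:
dir=$(mktemp -d)
touch -d '2 hours ago' "$dir/stale"   # old enough to be deleted
touch "$dir/fresh"                    # too new, will survive
find "$dir" -mindepth 1 -mmin +60 -delete
ls "$dir"                             # prints only: fresh
rm -rf "$dir"
```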

Background
The blame for the issue can be split between our misconfiguration of container volumes and a problem with docker leaking (failing to release) temporary data written to these volumes. We should be mapping (either to host folders or other persistent storage claims) all of our containers' temporary / log / scratch folders where our apps write frequently and/or heavily. Docker does not take responsibility for the cleanup of all automatically created so-called EmptyDirs, located by default in /var/lib/docker/overlay2/*/diff/*. Contents of these "non-persistent" folders should be purged automatically by docker after the container is stopped, but apparently are not (they may even be impossible to purge from the host side while the container is still running - and it can run for months at a time).
Workaround
A workaround requires careful manual cleanup, and while already described elsewhere, you still may find some hints from my case study, which I tried to make as instructive and generalizable as possible.
So what happened is that the culprit app (in my case clair-scanner) managed to write, over a few months, hundreds of gigs of data to the /diff/tmp subfolder of docker's overlay2:
du -sch /var/lib/docker/overlay2/<long random folder name seen as bloated in df -haT>/diff/tmp
271G total
So as all those subfolders in /diff/tmp were pretty self-explanatory (all were of the form clair-scanner-* and had obsolete creation dates), I stopped the associated container (docker stop clair) and carefully removed these obsolete subfolders from diff/tmp, starting prudently with a single (oldest) one, and testing the impact on docker engine (which did require restart [systemctl restart docker] to reclaim disk space):
rm -rf $(ls -at /var/lib/docker/overlay2/<long random folder name seen as bloated in df -haT>/diff/tmp | grep clair-scanner | tail -1)
I reclaimed hundreds of gigs of disk space without the need to re-install docker or purge its entire folders. All running containers did have to be stopped at one point, because a docker daemon restart was required to reclaim the disk space, so make sure first that your failover containers are running correctly on other nodes. I do wish, though, that the docker prune command could cover the obsolete /diff/tmp (or even /diff/*) data as well (via yet another switch).
It's a 3-year-old issue now, you can read its rich and colorful history on Docker forums, where a variant aimed at application logs of the above solution was proposed in 2019 and seems to have worked in several setups: https://forums.docker.com/t/some-way-to-clean-up-identify-contents-of-var-lib-docker-overlay/30604

Friends, to keep everything clean you can use these commands:
docker system prune -a && docker volume prune

WARNING: DO NOT USE IN A PRODUCTION SYSTEM
/# df
...
/dev/xvda1 51467016 39384516 9886300 80% /
...
Ok, let's first try system prune
/# docker system prune --volumes
...
/# df
...
/dev/xvda1 51467016 38613596 10657220 79% /
...
Not so great, it seems like it only cleaned up a few hundred megabytes. Let's go crazy now:
/# sudo su
/# service docker stop
/# cd /var/lib/docker
/var/lib/docker# rm -rf *
/# service docker start
/var/lib/docker# df
...
/dev/xvda1 51467016 8086924 41183892 17% /
...
Nice!
Just remember that this is NOT recommended in anything but a throw-away server. At this point Docker's internal database won't be able to find any of these overlays and it may cause unintended consequences.

Adding to the above comments, in which people suggest pruning the system - clearing dangling volumes, images, exited containers, etc. - sometimes your app itself is the culprit: it generated too many logs in a short time, and if you are using an empty-directory volume (a local volume), this fills the /var partition. In that case I found the command below very useful to figure out what is consuming space on my /var partition:
du -ahx /var/lib | sort -rh | head -n 30
This command lists the top 30 entries consuming the most space on a single disk. If you are using external storage with your containers, plain du can take a lot of time to run; this command does not descend into other mounted filesystems, so it is much faster. You will get the exact directories/files that are consuming space. You can then go to those directories and check which files are useful. If the files are required, you can move them to some persistent storage, either by changing the app to use persistent storage for that location or by changing the location of those files. The rest you can clear.
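The shape of that output is easy to preview on a scratch directory before pointing the pipeline at /var/lib (hypothetical paths; a 1 MiB file and a 1 KiB file stand in for a bloated log directory):

```shell
dir=$(mktemp -d)
mkdir -p "$dir/big" "$dir/small"
head -c 1048576 /dev/zero > "$dir/big/blob"    # 1 MiB
head -c 1024    /dev/zero > "$dir/small/blob"  # 1 KiB
# -a: include files, -h: human-readable sizes, -x: stay on this filesystem
du -ahx "$dir" | sort -rh | head -n 5          # biggest entries first
rm -rf "$dir"
```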

If your system is also used for building images, you might have a look at cleaning up the garbage created by the builders, using:
docker buildx prune --all
and
docker builder prune --all

DON'T DO THIS IN PRODUCTION
The answer given by @ravi-luthra technically works, but it has some issues!
In my case, I was just trying to recover disk space. The lib/docker/overlay folder was taking 30GB of space and I only run a few containers regularly. It looks like docker has some issue with data leakage, and some of the temporary data is not cleared when a container stops.
So I went ahead and deleted all the contents of the lib/docker/overlay folder. After that, my docker instance became unusable. When I tried to run or build any container, it gave me this error:
failed to create rwlayer: symlink ../04578d9f8e428b693174c6eb9a80111c907724cc22129761ce14a4c8cb4f1d7c/diff /var/lib/docker/overlay2/l/C3F33OLORAASNIYB3ZDATH2HJ7: no such file or directory
Then with some trial and error, I solved this issue by running
(WARNING: This will delete all your data inside docker volumes)
docker system prune --volumes -a
So it is not recommended to do such dirty cleanups unless you completely understand how the system works.

The "official" answer, cleaning with the prune commands, does not actually clean the garbage in the overlay2 folder.
So, to answer the original question, what can be done is:
Disclaimer: Be careful when applying this. It may result in breaking your Docker objects!
List the folder names (hashes) in overlay2.
Inspect the Docker objects (images, containers, ...) that you need (a stopped container, or an image currently not inside any container, does not mean that you do not need them).
When you inspect, you will see that it gives you the hashes that are related to your object, including overlay2's folders.
grep overlay2's folder names against that output.
Note all folders that are found with grep.
Now you can delete the folders of overlay2 that are not referred to by any Docker object that you need.
Example:
Let say there are these folders inside your overlay2 directory,
a1b28095041cc0a5ded909a20fed6dbfbcc08e1968fa265bc6f3abcc835378b5
021500fad32558a613122070616963c6644c6a57b2e1ed61cb6c32787a86f048
And what you only have is one image with ID c777cf06a6e3.
Then, do this:
docker inspect c777cf06a6e3 | grep a1b2809
docker inspect c777cf06a6e3 | grep 021500
Imagine that the first command found something, whereas the second found nothing.
Then you can delete the 0215... folder of overlay2:
rm -r 021500fad32558a613122070616963c6644c6a57b2e1ed61cb6c32787a86f048
To answer the title of the question:
Yes, it is safe to delete an overlay2 folder directly if you find out that it is not in use.
No, it is not safe to delete it directly if you find out that it is in use, or if you are not sure.

In my case, systemctl stop docker followed by systemctl start docker somehow automatically freed space under /var/lib/docker/*.

I had the same problem. In my instance it was because the /var/lib/docker directory was mounted into a running container (in my case google/cadvisor), which blocked docker prune from cleaning the folder. Stopping the container, running the prune, and then rerunning the container solved the problem.

Based on Mert Mertce's answer I wrote the following script, complete with spinners and progress bars.
Since writing the script, however, I noticed that the extra directories on our build servers are transient - that is, Docker appears to be cleaning up, albeit slowly. I don't know if Docker will get upset if there is contention for removing directories. Our current solution is to use docuum with a lot of extra overhead (150+ GB).
#!/bin/bash
[[ $(id -u) -eq 0 ]] || exec sudo /bin/bash -c "$(printf '%q ' "$BASH_SOURCE" "$@")"
progname=$(basename "$0")
quiet=false
no_dry_run=false
while getopts ":qn" opt
do
  case "$opt" in
    q)
      quiet=true
      ;;
    n)
      no_dry_run=true
      ;;
    ?)
      echo "unexpected option ${opt}"
      echo "usage: ${progname} [-q|--quiet]"
      echo "  -q: no output"
      echo "  -n: no dry run (will remove unused directories)"
      exit 1
      ;;
  esac
done
shift "$((OPTIND - 1))"
[[ ${quiet} = false ]] || exec /bin/bash -c "$(printf '%q ' "$BASH_SOURCE" "$@")" > /dev/null
echo "Running as: $(id -un)"
progress_bar() {
  local w=80 p=$1; shift
  # create a string of spaces, then change them to dots
  printf -v dots "%*s" "$(( p * w / 100 ))" ""; dots=${dots// /.};
  # print those dots on a fixed-width space plus the percentage etc.
  printf "\r\e[K|%-*s| %3d %% %s" "$w" "$dots" "$p" "$*";
}
cd /var/lib/docker/overlay2 || exit 1
echo "cleaning in ${PWD}"
i=1
spi=1
sp="/-\|"
directories=( $(find . -mindepth 1 -maxdepth 1 -type d | cut -d/ -f2) )
images=( $(docker image ls --all --format "{{.ID}}") )
total=$(( ${#directories[@]} * ${#images[@]} ))
used=()
for d in "${directories[@]}"
do
  for id in "${images[@]}"
  do
    ((++i))
    progress_bar "$(( i * 100 / total ))" "scanning for used directories ${sp:spi++%${#sp}:1} "
    # mark the directory as used as soon as any image references its hash
    if docker inspect "$id" | grep -q "$d"
    then
      used+=("$d")
      i=$(( i + ${#images[@]} - i % ${#images[@]} ))
      break
    fi
  done
done
echo -e "\b\b " # get rid of spinner
i=1
used=( $(printf '%s\n' "${used[@]}" | sort -u) )
unused=( $(find . -mindepth 1 -maxdepth 1 -type d | cut -d/ -f2) )
for d in "${used[@]}"
do
  ((++i))
  progress_bar "$(( i * 100 / ${#used[@]} ))" "scanning for unused directories ${sp:spi++%${#sp}:1} "
  for uni in "${!unused[@]}"
  do
    if [[ ${unused[uni]} = "$d" ]]
    then
      unset 'unused[uni]'
      break
    fi
  done
done
echo -e "\b\b " # get rid of spinner
if [ ${#unused[@]} -gt 0 ]
then
  [[ ${no_dry_run} = true ]] || echo 'Could remove: (to automatically remove, use the -n, "no-dry-run" flag)'
  for d in "${unused[@]}"
  do
    if [[ ${no_dry_run} = true ]]
    then
      echo "Removing $(realpath "${d}")"
      rm -rf "${d}"
    else
      echo "  $(realpath "${d}")"
    fi
  done
  echo Done
else
  echo "All directories are used, nothing to clean up."
fi

I navigated to the folder containing overlay2. Using du -shc overlay2/*, I found that there was 25G of junk in overlay2. Running docker system prune -af said Total Reclaimed Space: 1.687MB, so I thought it had failed to clean it up. However, I then ran du -shc overlay2/* again only to see that overlay2 had only 80K in it, so it did work.
Be careful, docker lies :).

Everything in /var/lib/docker is container and image filesystem data. If you stop all your containers and prune everything, you should end up with the folder being empty. You probably don't really want that, so don't go randomly deleting stuff in there. Do not delete things in /var/lib/docker directly. You may get away with it sometimes, but it's inadvisable for so many reasons.
Do this instead:
sudo bash
cd /var/lib/docker
find . -type f -print0 | xargs -0 du -b | sort -n
What you will see is the largest files shown at the bottom. If you want, figure out which containers those files are in, enter those containers with docker exec -ti containername /bin/sh, and delete some files.
You can also put docker system prune -a -f on a daily/weekly cron job as long as you aren't leaving stopped containers and volumes around that you care about. It's better to figure out the reasons why it's growing, and correct them at the container level.

Docker apparently keeps image layers of old versions of an image for running containers. It may happen if you update your running container's image (same tag) without stopping it, for example:
docker-compose pull
docker-compose up -d
Running docker-compose down before updating solved it, the downtime is not an issue in my case.

I recently had a similar issue: overlay2 grew bigger and bigger, but I couldn't figure out what consumed the bulk of the space.
df showed me that overlay2 was about 24GB in size.
With du I tried to figure out what occupied the space... and failed.
The difference came from the fact that deleted files (mostly log files in my case) were still being held open by a process (Docker). Thus the files don't show up with du, but the space they occupy shows up with df.
A reboot of the host machine helped. Restarting the docker container would probably have been enough...
This article on linuxquestions.org helped me to figure that out.
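That df/du discrepancy is easy to reproduce without Docker (a self-contained sketch; tail stands in for the daemon holding a deleted log open):

```shell
dir=$(mktemp -d)
head -c 1048576 /dev/zero > "$dir/app.log"   # a 1 MiB "log"
tail -f "$dir/app.log" > /dev/null & pid=$!  # a process holds the file open
sleep 1
rm "$dir/app.log"                            # delete it while it is still open
du -sh "$dir"                                # du no longer counts the file...
ls -l /proc/$pid/fd | grep deleted           # ...but the handle (and space) remain
kill $pid
rm -rf "$dir"
```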

Maybe this folder is not your problem; don't rely on the result of df -h with docker.
Use the commands below to see the size of each of your folders:
echo; pwd; echo; ls -AlhF; echo; du -h --max-depth=1; echo; du -sh

docker system prune -af && docker image prune -af

I used "docker system prune -a" and it cleaned all files under volumes and overlay2:
[root@jasontest volumes]# docker system prune -a
WARNING! This will remove:
- all stopped containers
- all networks not used by at least one container
- all images without at least one container associated to them
- all build cache
Are you sure you want to continue? [y/N] y
Deleted Images:
untagged: ubuntu:12.04
untagged: ubuntu@sha256:18305429afa14ea462f810146ba44d4363ae76e4c8dfc38288cf73aa07485005
deleted: sha256:5b117edd0b767986092e9f721ba2364951b0a271f53f1f41aff9dd1861c2d4fe
deleted: sha256:8c7f3d7534c80107e3a4155989c3be30b431624c61973d142822b12b0001ece8
deleted: sha256:969d5a4e73ab4e4b89222136eeef2b09e711653b38266ef99d4e7a1f6ea984f4
deleted: sha256:871522beabc173098da87018264cf3e63481628c5080bd728b90f268793d9840
deleted: sha256:f13e8e542cae571644e2f4af25668fadfe094c0854176a725ebf4fdec7dae981
deleted: sha256:58bcc73dcf4050a4955916a0dcb7e5f9c331bf547d31e22052f1b5fa16cf63f8
untagged: osixia/openldap:1.2.1
untagged: osixia/openldap@sha256:6ceb347feb37d421fcabd80f73e3dc6578022d59220cab717172ea69c38582ec
deleted: sha256:a562f6fd60c7ef2adbea30d6271af8058c859804b2f36c270055344739c06d64
deleted: sha256:90efa8a88d923fb1723bea8f1082d4741b588f7fbcf3359f38e8583efa53827d
deleted: sha256:8d77930b93c88d2cdfdab0880f3f0b6b8be191c23b04c61fa1a6960cbeef3fe6
deleted: sha256:dd9f76264bf3efd36f11c6231a0e1801c80d6b4ca698cd6fa2ff66dbd44c3683
deleted: sha256:00efc4fb5e8a8e3ce0cb0047e4c697646c88b68388221a6bd7aa697529267554
deleted: sha256:e64e6259fd63679a3b9ac25728f250c3afe49dbe457a1a80550b7f1ccf68458a
deleted: sha256:da7d34d626d2758a01afe816a9434e85dffbafbd96eb04b62ec69029dae9665d
deleted: sha256:b132dace06fa7e22346de5ca1ae0c2bf9acfb49fe9dbec4290a127b80380fe5a
deleted: sha256:d626a8ad97a1f9c1f2c4db3814751ada64f60aed927764a3f994fcd88363b659
untagged: centos:centos7
untagged: centos@sha256:2671f7a3eea36ce43609e9fe7435ade83094291055f1c96d9d1d1d7c0b986a5d
deleted: sha256:ff426288ea903fcf8d91aca97460c613348f7a27195606b45f19ae91776ca23d
deleted: sha256:e15afa4858b655f8a5da4c4a41e05b908229f6fab8543434db79207478511ff7
Total reclaimed space: 533.3MB
[root@jasontest volumes]# ls -alth
total 32K
-rw------- 1 root root 32K May 23 21:14 metadata.db
drwx------ 2 root root 4.0K May 23 21:14 .
drwx--x--x 14 root root 4.0K May 21 20:26 ..

Related

Reduce the disk space Docker uses [duplicate]

(Post created on Oct 05 '16)
I noticed that every time I run an image and delete it, my system doesn't return to the original amount of available space.
The lifecycle I'm applying to my containers is:
> docker build ...
> docker run CONTAINER_TAG
> docker stop CONTAINER_TAG
> docker rm CONTAINER_ID
> docker rmi IMAGE_ID
[ running on a default mac terminal ]
The containers were in fact created from custom images, running Node and a standard Redis. My OS is OSX 10.11.6.
At the end of the day I see I keep losing MBs. How can I deal with this problem?
EDITED POST
2020 and the problem persists, leaving this update for the community:
Today running:
macOS 10.13.6
Docker Engine 18.9.2
Docker Desktop Cli 2.0.0.3
The easiest way to work around the problem is to prune the system with the Docker utilities. Docker now has a single command to do that:
docker system prune -a --volumes
WARNING:
By default, volumes are not removed, to prevent important data from being deleted if there is currently no container using the volume. Use the --volumes flag when running the command to prune volumes as well.
See the Docker system prune docs.
There are three areas of Docker storage that can mount up, because Docker is cautious - it doesn't automatically remove any of them: exited containers, unused container volumes, unused image layers. In a dev environment with lots of building and running, that can be a lot of disk space.
These three commands clear down anything not being used:
docker rm $(docker ps -f status=exited -aq) - remove stopped containers
docker rmi $(docker images -f "dangling=true" -q) - remove image layers that are not used in any images
docker volume rm $(docker volume ls -qf dangling=true) - remove volumes that are not used by any containers.
These are safe to run, they won't delete image layers that are referenced by images, or data volumes that are used by containers. You can alias them, and/or put them in a CRON job to regularly clean up the local disk.
It is also worth mentioning that file size of docker.qcow2 (or Docker.raw on High Sierra with Apple Filesystem) can seem very large (~64GiB), larger than it actually is, when using the following command:
ls -klsh Docker.raw
This can be somewhat misleading because it outputs the logical size of the file rather than its physical size.
To see the physical size of the file you can use this command:
du -h Docker.raw
Source: https://docs.docker.com/docker-for-mac/faqs/#disk-usage
Why does the file keep growing?
If Docker is used regularly, the size of the Docker.raw (or Docker.qcow2) can keep growing, even when files are deleted.
To demonstrate the effect, first check the current size of the file on the host:
$ cd ~/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/
$ ls -s Docker.raw
9964528 Docker.raw
Note the use of -s which displays the number of filesystem blocks actually used by the file. The number of blocks used is not necessarily the same as the file “size”, as the file can be sparse.
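Sparseness itself is easy to demonstrate with any file (a side note, no Docker needed): seeking past the end of an empty file sets its logical size without allocating blocks.

```shell
f=$(mktemp)
# Logical size 10 MiB, but no data written, so (almost) no blocks allocated:
dd if=/dev/zero of="$f" bs=1 count=0 seek=10485760 2>/dev/null
ls -l "$f" | awk '{print $5}'   # logical size: 10485760
du -k "$f" | cut -f1            # allocated KiB: far less than 10240
rm -f "$f"
```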
Next start a container in a separate terminal and create a 1GiB file in it:
$ docker run -it alpine sh
# and then inside the container:
/ # dd if=/dev/zero of=1GiB bs=1048576 count=1024
1024+0 records in
1024+0 records out
/ # sync
Back on the host check the file size again:
$ ls -s Docker.raw
12061704 Docker.raw
Note the increase in size from 9964528 to 12061704, where the increase of 2097176 512-byte sectors is approximately 1GiB, as expected. If you switch back to the alpine container terminal and delete the file:
/ # rm -f 1GiB
/ # sync
then check the file on the host:
$ ls -s Docker.raw
12059672 Docker.raw
The file has not got any smaller! Whatever has happened to the file inside the VM, the host doesn’t seem to know about it.
Next if you re-create the “same” 1GiB file in the container again and then check the size again you will see:
$ ls -s Docker.raw
14109456 Docker.raw
It’s got even bigger! It seems that if you create and destroy files in a loop, the size of the Docker.raw (or Docker.qcow2) will increase up to the upper limit (currently set to 64 GiB), even if the filesystem inside the VM is relatively empty.
The explanation for this odd behaviour lies with how filesystems typically manage blocks. When a file is to be created or extended, the filesystem will find a free block and add it to the file. When a file is removed, the blocks become “free” from the filesystem’s point of view, but no-one tells the disk device. Making matters worse, the newly-freed blocks might not be re-used straight away – it’s completely up to the filesystem’s block allocation algorithm. For example, the algorithm might be designed to favour allocating blocks contiguously for a file: recently-freed blocks are unlikely to be in the ideal place for the file being extended.
Since the block allocator in practice tends to favour unused blocks, the result is that the Docker.raw (or Docker.qcow2) will constantly accumulate new blocks, many of which contain stale data. The file on the host gets larger and larger, even though the filesystem inside the VM still reports plenty of free space.
TRIM
A TRIM command (or a DISCARD or UNMAP) allows a filesystem to signal to a disk that a range of sectors contain stale data and they can be forgotten. This allows:
an SSD drive to erase and reuse the space, rather than spend time shuffling it around; and
Docker for Mac to deallocate the blocks in the host filesystem, shrinking the file.
So how do we make this work?
Automatic TRIM in Docker for Mac
In Docker for Mac 17.11 there is a containerd “task” called trim-after-delete listening for Docker image deletion events. It can be seen via the ctr command:
$ docker run --rm -it --privileged --pid=host walkerlee/nsenter -t 1 -m -u -i -n ctr t ls
TASK PID STATUS
vsudd 1741 RUNNING
acpid 871 RUNNING
diagnose 913 RUNNING
docker-ce 958 RUNNING
host-timesync-daemon 1046 RUNNING
ntpd 1109 RUNNING
trim-after-delete 1339 RUNNING
vpnkit-forwarder 1550 RUNNING
When an image deletion event is received, the process waits for a few seconds (in case other images are being deleted, for example as part of a docker system prune ) and then runs fstrim on the filesystem.
Returning to the example in the previous section, if you delete the 1 GiB file inside the alpine container
/ # rm -f 1GiB
then run fstrim manually from a terminal in the host:
$ docker run --rm -it --privileged --pid=host walkerlee/nsenter -t 1 -m -u -i -n fstrim /var/lib/docker
then check the file size:
$ ls -s Docker.raw
9965016 Docker.raw
The file is back to (approximately) its original size – the space has finally been freed!
Hopefully this blog will be helpful, also checkout the following macos docker utility scripts for this problem:
https://github.com/wanliqun/macos_docker_toolkit
Docker on Mac has an additional problem that is hurting a lot of people: the docker.qcow2 file can grow out of proportions (up to 64gb) and won't ever shrink back down on its own.
https://github.com/docker/for-mac/issues/371
As stated in one of the replies by djs55, this is planned to be fixed, but it's not a quick fix. Quote:
The .qcow2 is exposed to the VM as a block device with a maximum size
of 64GiB. As new files are created in the filesystem by containers,
new sectors are written to the block device. These new sectors are
appended to the .qcow2 file causing it to grow in size, until it
eventually becomes fully allocated. It stops growing when it hits this
maximum size.
...
We're hoping to fix this in several stages: (note this is still at the
planning / design stage, but I hope it gives you an idea)
1) we'll switch to a connection protocol which supports TRIM, and
implement free-block tracking in a metadata file next to the qcow2.
We'll create a compaction tool which can be run offline to shrink the
disk (a bit like the qemu-img convert but without the dd if=/dev/zero
and it should be fast because it will already know where the empty
space is)
2) we'll automate running of the compaction tool over VM reboots,
assuming it's quick enough
3) we'll switch to an online compactor (which is a bit like a GC in a
programming language)
We're also looking at making the maximum size of the .qcow2
configurable. Perhaps 64GiB is too large for some environments and a
smaller cap would help?
Update 2019: many updates have been done to Docker for Mac since this answer was posted to help mitigate problems (notably: supporting a different filesystem).
Cleanup is still not fully automatic though, you may need to prune from time to time. For a single command that can help to cleanup disk space, see zhongjiajie's answer.
docker container prune
docker system prune
docker image prune
docker volume prune
Since nothing here was working for me, here's what I did. Check file size:
ls -lhks ~/Library/Containers/com.docker.docker/Data/vms/0/data/Docker.raw
Then in the docker desktop simply reduce the disk image size (I was using raw format). It will say it will delete everything, but by the time you are reading this post, you probably already have. So that creates a fresh new empty file.
I'm not sure if it is related to the current topic, but this has been a solution for me personally:
open docker settings -> resources -> disk image size -> 16 GB
There are several options on how to limit docker diskspace, I'd start by limiting/rotating the logs: Docker container logs taking all my disk space
E.g. if you have a recent docker version, you can start it with an --log-opt max-size=50m option per container.
Also - if you've got old, unused containers, you can consider having a look at the docker logs which are located at /var/lib/docker/containers/*/*-json.log
$ sudo docker system prune
WARNING! This will remove:
all stopped containers
all networks not used by at least one container
all dangling images
all dangling build cache

Docker Cleanup - Can I delete old directories under /var/lib/docker/containers

I have 16 docker containers running on my system which store a lot of data under /var/lib/docker/overlay2/<id> (around 30 GB per directory, 30*16 GB in total). The space is primarily consumed by the diff and merged directories inside it.
Every time I do a docker-compose down followed by docker-compose up, it creates another set of 16 directories and starts storing data under those, but it does not clean up the old overlay2 directories. This leads to a space crunch.
Please let me know if I can do rm -rf /var/lib/docker/overlay2/ on the old directories, or how else I can free up space.
Do I need to wait for a couple of hours after docker-compose down to reclaim space?
Note: I did a docker system prune -a also.
1) Please let me know if I can do rm -rf /var/lib/docker/overlay2/ of old directories or how I can free up space.
Only you can decide that.
When I suspect unused data in a /var/lib/docker/overlay2/foo123 folder, I first inspect the content of that folder. It generally contains image content. With the modification dates and the content of the files, I have a high probability of determining whether the folder is useless.
When I am sure that it is useless, I delete it with rm -rf .... Note that the delete may fail because of mounted files; in that case I identify them and unmount them first.
If I am not sure that it is useless, I perform a backup first, such as cp -a /var/lib/docker/overlay2/foo123 /var/lib/docker/overlay2/backup-foo123, before deleting.
2) Do I need to wait for a couple of hours after docker-compose down to reclaim space?
With just Docker, not at all. Data has to be explicitly removed (such as docker container rm FOO, docker system prune, docker system prune -a, docker image rm FOO, and so forth).
But Docker is not perfect, so sometimes you may have stale data.
Generally, from time to time, I inspect docker's big folders with du -sh */ | sort -h.

How can I reinit docker layers?

Using docker under Kubuntu 18, I ran out of free space on the device.
I ran these commands to clear space:
docker stop $(docker ps -a -q)
docker rm $(docker ps -a -q)
docker-compose down --remove-orphans
docker system prune --force --volumes
As I was still out of free space, I opened the /var/lib/docker/overlay2/ directory and
deleted a lot of subdirectories under it.
After that I got error :
$ docker-compose up -d --build
Creating network "master_default" with the default driver
ERROR: stat /var/lib/docker/overlay2/36af81b800ebb595a24b6c724318c1126932d2bfae61e2c98bfc65a203b2b928: no such file or directory
Looks like that was not a good way to free space. Which way is good in my case?
Is there a way to reinit my docker apps?
Thanks!
As I was still out of free space, I opened the /var/lib/docker/overlay2/ directory and deleted a lot of subdirectories under it.
At this point, the docker filesystem has been corrupted. To repair, the best you can do is backup anything you want to save, particularly any volumes, stop the docker engine (systemctl stop docker), delete the entire docker filesystem (rm -rf /var/lib/docker), and restart docker (systemctl start docker).
At that point the engine will be completely empty without any images, containers, etc. You'll need to pull/rebuild your images and recreate the containers you were running. Hopefully that's as easy as a docker-compose up.
/var/lib/docker/overlay2/ is where docker stores the image layers.
Now, docker system prune -a removes all unused images, stopped containers, and the build cache, if I'm not wrong.
One piece of advice: since you are building images, check out Docker BuildKit. I know docker-compose added support for it, but I don't know if your version supports it. In short, building your images will be way faster.
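For reference, on engines where BuildKit is not yet the default it can be opted into with environment variables (a sketch; recent Docker releases enable BuildKit by default, in which case this is a no-op):

```shell
# Enable BuildKit for docker build and for docker-compose builds
# (no effect on engines where BuildKit is already the default)
export DOCKER_BUILDKIT=1
export COMPOSE_DOCKER_CLI_BUILD=1
# then rebuild as usual, e.g.:
#   docker-compose build
```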

How to clean up Docker ZFS legacy shares

Summary
Given that:
The storage driver docker uses is ZFS;
Only docker creates legacy datasets;
Bash:
$ docker ps -a | wc -l
16
$ docker volume ls | wc -l
12
$ zfs list | grep legacy | wc -l
157
16 containers (both running and stopped). 12 volumes. 157 datasets. This seems like an awful lot of legacy datasets. I'm wondering if a lot of them are so orphaned that not even docker knows about them anymore, so they don't get cleaned up.
Rationale
There is a huge list of legacy volumes in my Debian zfs pool. They started appearing when I started using Docker on this machine:
$ sudo zfs list | grep legacy | wc -l
486
They are all in the form of:
pool/var/<64-char-hash> 202K 6,18T 818M legacy
This location is used solely by docker.
$ docker info | grep -e Storage -e Dataset
Storage Driver: zfs
Parent Dataset: pool/var
I started cleaning up.
$ docker system prune -a
(...)
$ sudo zfs list | grep legacy | wc -l
154
That's better. However, I'm only running about 15 containers, and after running docker system prune -a, the history of every container shows that only the last image layer is still available. The rest are <missing> (because they were cleaned up).
$ docker images | wc -l
15
If all containers use only the last image layer after pruning the rest, shouldn't docker only use 15 image layers and 15 running containers, totalling 30 volumes?
$ sudo zfs list | grep legacy | wc -l
154
Can I find out if they are in use by a container/image? Is there a command that traverses all pool/var/<hash> datasets in ZFS and figures out to what docker container/image they belong? Either a lot of them can be removed, or I don't understand how to figure out (beyond just trusting docker system prune) they cannot.
The excessive use of zfs volumes by docker messes up my zfs list command, both visually and performance-wise. Listing zfs volumes now takes ~10 seconds instead of <1.
Proof that docker sees no more dangling objects
$ docker ps -qa --no-trunc --filter "status=exited"
(no output)
$ docker images --filter "dangling=true" -q --no-trunc
(no output)
$ docker volume ls -qf dangling=true
(no output)
zfs list example:
NAME USED AVAIL REFER MOUNTPOINT
pool 11,8T 5,81T 128K /pool
pool/var 154G 5,81T 147G /mnt/var
pool/var/0028ab70abecb2e052d1b7ffc4fdccb74546350d33857894e22dcde2ed592c1c 1,43M 5,81T 1,42M legacy
pool/var/0028ab70abecb2e052d1b7ffc4fdccb74546350d33857894e22dcde2ed592c1c#211422332 10,7K - 1,42M -
# and 150 more of the last two with different hashes
I had the same question but couldn't find a satisfactory answer. Adding what I eventually found, since this question is one of the top search results.
Background
The ZFS storage driver for Docker stores each layer of each image as a separate legacy dataset.
Even just a handful of images can result in a huge number of layers, each layer corresponding to a legacy ZFS dataset.
Quote from the Docker ZFS driver docs:
The base layer of an image is a ZFS filesystem. Each child layer is a ZFS clone based on a ZFS snapshot of the layer below it. A container is a ZFS clone based on a ZFS Snapshot of the top layer of the image it’s created from.
Investigate
You can check the datasets used by one image by running:
$ docker image inspect [IMAGE_NAME]
Example output:
...
"RootFS": {
    "Type": "layers",
    "Layers": [
        "sha256:f2cb0ecef392f2a630fa1205b874ab2e2aedf96de04d0b8838e4e728e28142da",
        ...
        "sha256:2e8cc9f5313f9555a4decca744655ed461e21fbe48a0f078ed5f7c4e5292ad2e"
    ]
},
...
This explains why you can see 150+ datasets created when only running a dozen containers.
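To go one step further, you can ask docker which dataset backs each container. The sketch below assumes the zfs storage driver exposes the dataset name under GraphDriver.Data (field names may vary between Docker versions), and it skips quietly when docker isn't installed:

```shell
# Hedged sketch: print the ZFS dataset backing each container.
# Assumes GraphDriver.Data.Dataset is populated by the zfs storage driver;
# verify the field name with a plain `docker inspect` first.
if command -v docker >/dev/null 2>&1; then
  for c in $(docker ps -aq); do
    docker inspect --format '{{.Name}} -> {{.GraphDriver.Data.Dataset}}' "$c"
  done
else
  echo "docker not installed; nothing to inspect"
fi
```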
Solution
Prune and delete unused images.
$ docker image prune -a
To avoid a slow zfs list, restrict it to the dataset of interest.
Suppose you store docker in tank/docker and other files in tank/data; list only the data datasets with the recursive option:
# recursively list tank/data/*
$ zfs list tank/data -r
I use docker-in-docker containers, which also generate a lot of unused snapshots.
Based on @Redsandro's comment, I've used the following commands:
sudo zfs list -t snapshot -r pool1 | wc -l
sudo zpool list
(sudo zfs get mounted | grep "mounted no" | awk '/docker\// { print $1 }' | xargs -l sudo zfs destroy -R) 2> /dev/null
Just deleting all snapshots ruined the consistency of docker.
But since docker mounts all the images it uses under /var/lib/docker/zfs/graph
(the same goes for the docker-in-docker images), destroying only the datasets that are not mounted should remove just the dangling images/volumes/containers that were never properly freed. You need to run this repeatedly until the number of snapshots stops decreasing.
See the prune introduction on docker.com.
I assume your docker version is lower than v17.06. Since you've executed docker system prune -a, the old layers' build information and volumes are gone. The -a/--all flag means all images without at least one associated container are deleted; without it, only dangling images are deleted.
In addition, I think you have a misunderstanding about the <missing> mark and dangling images. <missing> doesn't mean that the layers marked as missing are really missing; it just means that these layers may have been built on other machines. Dangling images are unreferenced images. Even if the name and tag are marked <none>, an image can still be referenced by other images, which you can check with docker history image_id.
In your case, these layers are marked as missing because you deleted the old versions of the images, which included the build information. You said above that only the latest version of each image is available; thus, only the latest layers are not marked as missing.
Note: docker system prune is a lazy way to manage all Docker objects (images/containers/volumes/networks/cache).
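To see that layer referencing in practice, docker history prints one row per layer; layers whose IDs are not available locally show up as <missing> in the IMAGE column. A minimal, hedged example (ubuntu:latest is just a placeholder; the command is skipped when docker is absent or the image isn't pulled):

```shell
# Show the layer history of an image; <missing> entries are layers whose
# IDs are not available locally, not layers that are actually lost.
if command -v docker >/dev/null 2>&1; then
  docker history ubuntu:latest 2>/dev/null || true
fi
```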

How to clean up Docker

I've just noticed that I ran out of disk space on my laptop. Quite a lot is used by Docker as found by mate-disk-usage-analyzer:
The docker/aufs/diff folder contains 152 folders ending in -removing.
I already ran the following commands to clean up
Kill all running containers:
# docker kill $(docker ps -q)
Delete all stopped containers
# docker rm $(docker ps -a -q)
Delete all images
# docker rmi $(docker images -q)
Remove unused data
# docker system prune
And some more
# docker system prune -af
But the screenshot was taken after I executed those commands.
What is docker/aufs/diff, why does it consume that much space and how do I clean it up?
I have Docker version 17.06.1-ce, build 874a737. It happened after a cleanup, so this is definitely still a problem.
The following is a radical solution. IT DELETES ALL YOUR DOCKER STUFF. INCLUDING VOLUMES.
$ sudo su
# service docker stop
# cd /var/lib/docker
# rm -rf *
# service docker start
See https://github.com/moby/moby/issues/22207#issuecomment-295754078 for details
It might not be /var/lib/docker
The docker location might be different in your case. You can use a disk usage analyzer (such as mate-disk-usage-analyzer) to find the folders which need most space.
See Where are Docker images stored on the host machine?
This dir is where container rootfs layers are stored when using the AUFS storage driver (default if the AUFS kernel modules are loaded).
If you have a bunch of *-removing dirs, this is caused by a failed removal attempt. This can happen for various reasons, the most common is that an unmount failed due to device or resource busy.
Before Docker 17.06, if you used docker rm -f to remove a container all container metadata would be removed even if there was some error somewhere in the cleanup of the container (e.g., failing to remove the rootfs layer).
In 17.06 it will no longer remove the container metadata and instead flag the container with a Dead status so you can attempt to remove it again.
You can safely remove these directories, but I would stop docker first, then remove, then start docker back up.
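A hedged sketch of that order of operations (assumes systemd manages the engine; adjust for your init system). The DRY_RUN prefix only prints the commands; clear it once you have verified the paths:

```shell
# Safety first: with DRY_RUN=echo the commands are printed, not executed.
# Set DRY_RUN= (empty) only after verifying the paths on your system.
DRY_RUN=echo
$DRY_RUN sudo systemctl stop docker
$DRY_RUN sudo rm -rf /var/lib/docker/aufs/diff/*-removing
$DRY_RUN sudo systemctl start docker
```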
Docker takes up gigabytes in three main areas:
1) Downloaded and built images. Clean unused and dead images by running:
docker image prune -a
2) Volumes. Docker creates a lot of volumes, some of them left behind by dead containers that are no longer used. Clean the volumes and reclaim the space using:
docker system prune -af && \
docker image prune -af && \
docker system prune -af --volumes && \
docker system df
3) Container logs, which are notorious for generating GBs of data, and the overlay2 storage for container layers, another source of eaten-up GBs.
A better approach is to calculate the size of your docker image and then run the container with an upper cap on storage and logs, as below.
These features require Docker v19 or above.
docker run -it --storage-opt size=2G --log-opt mode=non-blocking --log-opt max-buffer-size=4m fedora /bin/bash
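If you'd rather cap log growth engine-wide instead of per container, the json-file logging driver accepts rotation options in /etc/docker/daemon.json. A sketch (written to /tmp here so nothing on your system is overwritten; copy it to /etc/docker/daemon.json and restart the daemon to apply):

```shell
# Rotate json-file logs at 10 MB, keeping at most 3 files per container.
# Written to /tmp as a sketch; install to /etc/docker/daemon.json yourself.
cat > /tmp/daemon.json <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
EOF
```

Note that this only affects newly created containers, not existing ones.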
Note that this is actually a known, yet still pending, issue: https://github.com/moby/moby/issues/37724
If you have the same issue, I recommend giving it a "Thumbs Up" on GitHub so that it gets addressed soon.
I had the same issue.
In my case the solution was:
View all images:
docker images
Remove old unused images:
docker rmi IMAGE_ID
Possibly you will also need to prune stopped containers:
docker container prune
P.S. docker --help is a good place to start :)

Resources