How to 'squash' base image layers into a single layer? - docker

I am trying to upgrade the base image version in one of my images, but due to a layer limit of 40, my builds are failing.
I noticed that in the older image, docker history gave this output:
IMAGE CREATED CREATED BY SIZE
092c6c14cf83 8 days ago /bin/sh -c #(nop) ENV LANG 0B
376101232840 8 days ago /bin/sh -c /tmp/tmp.sh 65.4MB
6c58dda60477 8 days ago /bin/sh -c #(nop) COPY file:a 1.27kB
1264065f6ae8 7 months ago 4.72kB
<missing> 7 months ago 207MB
But after updating the image version, this is the output:
IMAGE CREATED CREATED BY SIZE
3233036cf707 41 hours ago /bin/sh -c #(nop) 0B
1e72b109fe29 41 hours ago /bin/sh -c /tmp/tmp.sh 65.5MB
e4bb0f8240aa 41 hours ago /bin/sh -c #(nop) COPY file:a 1.27kB
dea12a7906f5 12 days ago /bin/sh -c rm -f /tmp/tls-ca- 207MB
<missing> 12 days ago /bin/sh -c rm -f '/etc/yum.re' 0B
<missing> 12 days ago /bin/sh -c #(nop) LABEL "dist 0B
<missing> 12 days ago /bin/sh -c #(nop) ADD file:7 0B
<missing> 12 days ago /bin/sh -c #(nop) ADD file:ad 0B
<missing> 12 days ago /bin/sh -c #(nop) LABEL relea 0B
<missing> 12 days ago /bin/sh -c mkdir -p /var/log 0B
<missing> 12 days ago /bin/sh -c rm -rf /var/log/ 0B
<missing> 12 days ago /bin/sh -c #(nop) CMD ["/bin 0B
<missing> 12 days ago /bin/sh -c #(nop) ENV PATH / 0B
<missing> 12 days ago /bin/sh -c #(nop) ENV contain 0B
<missing> 12 days ago /bin/sh -c #(nop) LABEL io. 0B
<missing> 12 days ago /bin/sh -c #(nop) LABEL io. 0B
<missing> 12 days ago /bin/sh -c #(nop) LABEL io. 0B
<missing> 12 days ago /bin/sh -c #(nop) LABEL des 0B
<missing> 12 days ago /bin/sh -c #(nop) LABEL sum 0B
<missing> 12 days ago /bin/sh -c #(nop) LABEL com 0B
<missing> 12 days ago /bin/sh -c #(nop) LABEL com 0B
<missing> 12 days ago /bin/sh -c #(nop) LABEL 0B
<missing> 12 days ago /bin/sh -c #(nop) ADD multi:3 0B
<missing> 12 days ago /bin/sh -c #(nop) ADD file:21 0B
<missing> 12 days ago /bin/sh -c #(nop) ADD file:09 0B
Any clue why this may be happening? These extra layers that are being added seem to come from the base image. Any clue how to make them appear as a single layer like shown in the older build?
Thanks in advance!

So I realized a couple of things regarding this question at a later point of time.
First off, docker history does not give you the true layer count*. To find the true layer count, I used docker inspect --format='{{join .RootFS.Layers "\n"}}' $INSTANCE_ID | wc -l.
Secondly, if you still need to squash layers, explore using the --squash flag of docker build (experimental feature), or use docker-squash.
* - The output of docker history shows the list of hashes mapped to the diff of layers. Since these hashes are randomly generated, the mapping is handled by the docker engine. When an image is pulled, this mapping is lost and you get the <missing> value in the IMAGE field. Couldn't have figured this out without this amazing blog post
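As a cross-check on the shell pipeline above, the same count can be read from the JSON that docker inspect prints. This is only a minimal sketch: count_layers and the sample string are made up for illustration, and the only thing taken from the answer is the RootFS.Layers field.

```python
import json


def count_layers(inspect_output: str) -> int:
    """Count filesystem layers in the JSON printed by `docker inspect <image>`."""
    # `docker inspect` prints a JSON array with one object per inspected item.
    image = json.loads(inspect_output)[0]
    return len(image["RootFS"]["Layers"])


# Made-up sample in the shape of `docker inspect` output (digests shortened).
sample = '[{"RootFS": {"Type": "layers", "Layers": ["sha256:aaa", "sha256:bbb"]}}]'
print(count_layers(sample))  # prints 2
```

In practice the string would come from something like subprocess.run(["docker", "inspect", image], capture_output=True, text=True).stdout.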

Related

Missing <missing> Image Layer ID for subsequent child layers - DOCKER_BUILDKIT=1 docker build and docker history <image id>

Docker version 19.03.12, build 48a66213fe
OS: Red Hat Enterprise Linux Server release 7.9 (Maipo)
Note: the Dockerfile has minimal steps and it builds successfully.
To create docker image, I ran (this command will capture stdout/stderr in a .log file):
DOCKER_BUILDKIT=1 docker build \
--network=host \
-t project-opensuse-docker-image:15.2 . \
|& tee -a /tmp/project-opensuse-docker-image.log
When I use DOCKER_BUILDKIT=1 with the docker build command, the image IDs of the underlying layers are listed as <missing>. If I re-run docker build as soon as the first run completes successfully, it does not use the layer cache to build faster, even though nothing has changed in the Dockerfile or in any folder/file on the file system (e.g. files used in COPY / ADD steps).
It builds from scratch every time, and I think the reason is that there is no valid image ID for the layers under the final image ID a07eebd0a9ba.
See the IMAGE column in the docker history <image> output below for the IDs of those layers.
Why am I getting <missing> under IMAGE for these layers?
[gigauser#rh79maipo_machine opensuse-x]$ sudo docker history a07eebd0a9ba
IMAGE CREATED CREATED BY SIZE COMMENT
a07eebd0a9ba 3 hours ago CMD ["ls -l"] 0B buildkit.dockerfile.v0
<missing> 3 hours ago WORKDIR /home/nonroot_user 0B buildkit.dockerfile.v0
<missing> 3 hours ago VOLUME [/home/nonroot_user/git] 0B buildkit.dockerfile.v0
<missing> 3 hours ago RUN /bin/sh -c ls -l /home/nonroot_user /home/rayd… 174MB buildkit.dockerfile.v0
<missing> 3 hours ago COPY ./boost_1_68_0.tar.gz /home/nonroot_user/tool… 109MB buildkit.dockerfile.v0
<missing> 3 hours ago RUN /bin/sh -c ls -l /home/nonroot_user /home/rayd… 253MB buildkit.dockerfile.v0
<missing> 3 hours ago COPY ./wxWidgets-3.1.3.tar.bz2 /home/nonroot_user/… 21.3MB buildkit.dockerfile.v0
<missing> 3 hours ago RUN /bin/sh -c echo -e "\n-- Installing Zypper… 1.71GB buildkit.dockerfile.v0
<missing> 26 hours ago RUN /bin/sh -c useradd -r -m -l … 953kB buildkit.dockerfile.v0
<missing> 26 hours ago COPY ./CompanyCertBundle/PEM/*.cer /usr/sha… 34.7kB buildkit.dockerfile.v0
<missing> 26 hours ago ENV http_proxy=http://company.proxy.com:80/ … 0B buildkit.dockerfile.v0
<missing> 26 hours ago LABEL Project=aPROJECT IM CentOS_Version=CentOS… 0B buildkit.dockerfile.v0
<missing> 4 weeks ago KIWI 9.23.20 109MB
[gigauser#rh79maipo_machine opensuse-x]$

What can be the cause of different permissions inside a fresh Docker container?

I have the situation that a fresh Docker container leads to different results when executed on different machines.
Specifically the file system permissions are different:
docker run --rm my-private-image /bin/sh -c "ls -l /"
...
drwxr-xr-t 2 root root 4096 Dec 18 23:33 tmp
vs.
drwxrwxrwt 2 root root 4096 May 29 2020 tmp
The problem appeared only recently, after I moved the /var/lib/docker directory to another partition. So most likely I screwed it up myself.
I already deleted the image in question to force Docker to fetch it freshly to correct the mistake, but no luck here (it corrected different owners/groups, but not the permissions).
The image in question (I called it my-private-image) is based upon trafex/alpine-nginx-php7, which is responsible for /tmp.
My question now is:
How/Why does Docker keep the different layers even after I deleted the image?
And what can I do to rectify the situation?
(I could of course just delete the whole /var/lib/docker and reinstall Docker to solve that, but I want to understand Docker's internals better)
Doing a quick check of the image history, nothing jumps out as modifying /tmp, so I'm pretty sure it will be the base layer:
$ docker history trafex/alpine-nginx-php7
IMAGE CREATED CREATED BY SIZE COMMENT
d03c5e607375 5 months ago /bin/sh -c #(nop) HEALTHCHECK &{["CMD-SHELL… 0B
<missing> 5 months ago /bin/sh -c #(nop) CMD ["/usr/bin/supervisor… 0B
<missing> 5 months ago /bin/sh -c #(nop) EXPOSE 8080 0B
<missing> 5 months ago /bin/sh -c #(nop) COPY --chown=nobodydir:922… 58B
<missing> 5 months ago /bin/sh -c #(nop) WORKDIR /var/www/html 0B
<missing> 5 months ago /bin/sh -c #(nop) USER nobody 0B
<missing> 5 months ago /bin/sh -c chown -R nobody.nobody /var/www/h… 1.15kB
<missing> 5 months ago /bin/sh -c mkdir -p /var/www/html 0B
<missing> 5 months ago /bin/sh -c #(nop) COPY file:12908bc96c18db8f… 459B
<missing> 5 months ago /bin/sh -c #(nop) COPY file:ba2b24ac43720041… 27B
<missing> 5 months ago /bin/sh -c #(nop) COPY file:791d5f77ccca3899… 2.16kB
<missing> 5 months ago /bin/sh -c #(nop) COPY file:e7e19bb0340c77dd… 2.91kB
<missing> 5 months ago /bin/sh -c ln -s /usr/bin/php8 /usr/bin/php 13B
<missing> 5 months ago /bin/sh -c apk --no-cache add curl nginx… 121MB
<missing> 5 months ago /bin/sh -c #(nop) LABEL Description=Lightwe… 0B
<missing> 5 months ago /bin/sh -c #(nop) LABEL Maintainer=Tim de P… 0B
<missing> 6 months ago /bin/sh -c #(nop) CMD ["/bin/sh"] 0B
<missing> 6 months ago /bin/sh -c #(nop) ADD file:f278386b0cef68136… 5.6MB
Inspecting the image, that sha looks like the 72e8... layer:
"RootFS": {
"Type": "layers",
"Layers": [
"sha256:72e830a4dff5f0d5225cdc0a320e85ab1ce06ea5673acfe8d83a7645cbd0e9cf",
"sha256:8e1de91f5d76729f122777c082e5a0a04bf157438b40223add6e5b0d74974b4a",
"sha256:02bef6095ce13955a5feda9394ceb41e3ff06b1547c179550495f0ea51fa7d81",
"sha256:3dc8fd1f904d7ed46a9ef40a6083f7da2dd76c5d33acbfe5b49a57054a7e691a",
"sha256:5e1da4b6ee09de869cc02ebd7b0ef093da7b1b3cf5e1a5258398960679252d1d",
"sha256:3c93b50985ab6d6ff15f51828432c04d0163232e46d510131ce6fafb44832da9",
"sha256:4da6c12a08b014c74acc2348d4e5fec101a10f27caf2ac2cd80596c1775cd6ad",
"sha256:f44b520b457aa8b531c86e63f2c9c391d73f5eca898e698713c679b684637890",
"sha256:a7a1e7d3913b5bd1b0d13653840349ce407c81beec236055edab400c5a1a0096",
"sha256:e68f2db5551b5b4c15ad89dae5ee54ea4a3b864c71475146cedfda1ee6d75ec7"
]
},
In docker, those are currently stored within /var/lib/docker/image/overlay2/layerdb and that has a pointer to the overlay2 folder:
# cat image/overlay2/layerdb/sha256/72e830a4dff5f0d5225cdc0a320e85ab1ce06ea5673acfe8d83a7645cbd0e9cf/cache-id
a2cd81767ad8c4e1fc556585df7f9904089e4d3884304f1c3a343c234b9a8f08
# ls -al overlay2/a2cd81767ad8c4e1fc556585df7f9904089e4d3884304f1c3a343c234b9a8f08/diff/tmp
total 8
drwxrwxrwt 2 root root 4096 Jun 15 2021 .
drwxr-xr-x 19 root root 4096 Jul 7 15:51 ..
If your filesystem has been corrupted at this level in the move, I'd personally delete the copy (the entire /var/lib/docker directory, not just pieces from it) and start over. Docker will not pull layers that it already has locally, and you are probably seeing just the first issue with the /tmp folder permissions.
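To repeat the lookup above for layers other than the base one, note that the layerdb directories are named by chain ID rather than by the diff ID shown under RootFS.Layers. The two coincide only for the first layer; for later layers the chain ID is the SHA-256 of "<parent chain ID> <diff ID>", as defined in the OCI image spec. A rough sketch, assuming the overlay2 on-disk layout shown above (an internal detail that can differ between Docker versions); chain_ids and overlay_diff_dir are hypothetical helper names:

```python
import hashlib
from pathlib import Path


def chain_ids(diff_ids):
    """Derive chain IDs from the diff IDs listed under RootFS.Layers (base first)."""
    chain = []
    for diff_id in diff_ids:
        if not chain:
            # The base layer's chain ID is simply its diff ID.
            chain.append(diff_id)
        else:
            digest = hashlib.sha256(f"{chain[-1]} {diff_id}".encode()).hexdigest()
            chain.append("sha256:" + digest)
    return chain


def overlay_diff_dir(docker_root, chain_id):
    """Follow layerdb/<algo>/<hex>/cache-id to the layer's overlay2 diff directory."""
    algo, hex_digest = chain_id.split(":", 1)
    root = Path(docker_root)
    layer_dir = root / "image" / "overlay2" / "layerdb" / algo / hex_digest
    cache_id = (layer_dir / "cache-id").read_text().strip()
    return root / "overlay2" / cache_id / "diff"
```

Running overlay_diff_dir("/var/lib/docker", chain_id) for each chain ID (as root) would give the directories to compare against a known-good machine.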

Retrieve file that was deleted during the image build

I have a docker image.
When I use the docker history command on the image, I can see
85d9bf810d44 9 days ago /bin/sh -c apk add vim 26.9MB
<missing> 9 days ago /bin/sh -c apk update 1.78MB
<missing> 9 days ago /bin/sh -c rm -f file.txt 0B
<missing> 9 days ago /bin/sh -c a=$(base64 -d < file.txt) && echo $a … 49B
<missing> 9 days ago /bin/sh -c #(nop) COPY file:98f5646751cb4985… 68B
<missing> 6 weeks ago /bin/sh -c #(nop) CMD ["/bin/sh"] 0B
<missing> 6 weeks ago /bin/sh -c #(nop) ADD file:f17f65714f703db90… 5.57MB
So there was a file.txt at some point in the image, but it was later removed. I would like to know if there is a way to retrieve the content of that file from the image layers.
I have looked into Dive and all sorts of stuff. Also navigating through Docker's overlay files (as indicated here) seemed promising, but I am using macOS and I couldn't find the corresponding directories...
docker image save will export a tarball that contains a tarball per layer. The layer created by the COPY step still contains file.txt, so you can extract it from there.
https://docs.docker.com/engine/reference/commandline/image_save/
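Searching the per-layer tarballs can be scripted. The sketch below assumes the archive was written with docker image save <image> -o image.tar and contains a top-level manifest.json whose "Layers" entry lists the layer tarballs, base layer first; find_file_in_layers is a hypothetical helper, not part of Docker.

```python
import json
import posixpath
import tarfile


def _normalize(path):
    """Turn './file.txt' or '/file.txt' into 'file.txt'."""
    return posixpath.normpath(path).lstrip("/")


def find_file_in_layers(archive, wanted):
    """Return [(layer_path, content)] for each layer that still holds `wanted`.

    `archive` is a seekable binary file object for the saved image tarball.
    """
    wanted = _normalize(wanted)
    hits = []
    with tarfile.open(fileobj=archive, mode="r:*") as image:
        manifest = json.load(image.extractfile("manifest.json"))
        for layer_path in manifest[0]["Layers"]:
            # Each layer is itself a tar archive; "r:*" also accepts compressed ones.
            layer_file = image.extractfile(layer_path)
            with tarfile.open(fileobj=layer_file, mode="r:*") as layer:
                for member in layer:
                    if member.isfile() and _normalize(member.name) == wanted:
                        hits.append((layer_path, layer.extractfile(member).read()))
    return hits
```

Usage would look like find_file_in_layers(open("image.tar", "rb"), "file.txt"). The later rm only adds a whiteout entry (.wh.file.txt) in its own layer, so the earlier layer's copy is still returned.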

Can I show data in specific layer from an docker image? And how?

Each docker image consists of a series of layers.
Ex: custom-elasticsearch:latest
$ docker history custom-elasticsearch
IMAGE CREATED CREATED BY SIZE COMMENT
5f14f49e0f6b 8 days ago /bin/sh -c #(nop) EXPOSE 9091/tcp 9200/tcp 9 0 B
c1b5b6bdc8d8 8 days ago /bin/sh -c /usr/share/elasticsearch/bin/plugi 3 MB
a406ab7ba4ed 8 days ago /bin/sh -c #(nop) COPY file:cf296a4961a04abc0 489 B
6b0d046baaa8 8 days ago /bin/sh -c #(nop) COPY file:81c04951307f0688f 83 B
6f609da577b7 20 months ago /bin/sh -c #(nop) CMD ["elasticsearch"] 0 B
<missing> 20 months ago /bin/sh -c #(nop) EXPOSE 9200/tcp 9300/tcp 0 B
<missing> 20 months ago /bin/sh -c #(nop) ENTRYPOINT &{["/docker-entr 0 B
<missing> 20 months ago /bin/sh -c #(nop) COPY file:d25889029dd34582c 672 B
//...
Can I show or copy a file from the image as it was at the fourth layer, the one with ID 6b0d046baaa8?
Thanks
Update:
There is a very useful tool called dive that allows you to navigate through the Docker layers and view the filesystem.
dlayer does this quite nicely:
For example, to see what is in each layer of docker.io/moby/buildkit:v0.10.3:
docker image pull docker.io/moby/buildkit:v0.10.3
docker save docker.io/moby/buildkit:v0.10.3 | dlayer
Example output:
$ docker save docker.io/moby/buildkit:v0.10.3 \
| dlayer
(elided content...)
====================================================================================================
1.6 kB $ COPY examples/buildctl-daemonless/buildctl-daemonless.sh /usr/bin/ # buildkit
====================================================================================================
1.6 kB usr/bin/buildctl-daemonless.sh
====================================================================================================
116 MB $ COPY / /usr/bin/ # buildkit
====================================================================================================
40 MB usr/bin/buildkitd
25 MB usr/bin/buildctl
21 MB usr/bin/buildkit-runc
5.5 MB usr/bin/buildkit-qemu-aarch64
3.9 MB usr/bin/buildkit-qemu-arm
3.9 MB usr/bin/buildkit-qemu-ppc64le
3.5 MB usr/bin/buildkit-qemu-riscv64
3.4 MB usr/bin/buildkit-qemu-mips64
3.3 MB usr/bin/buildkit-qemu-mips64el
3.0 MB usr/bin/buildkit-qemu-s390x
3.0 MB usr/bin/buildkit-qemu-i386
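If neither dive nor dlayer is at hand, a similar per-layer listing can be approximated with the standard library. This is a sketch under the assumption that the input is a docker save archive with a top-level manifest.json; layer_listing is a hypothetical name. Keep in mind that the manifest lists layers base first and only includes layers with filesystem content, so its indices do not line up one-to-one with the rows of docker history.

```python
import json
import tarfile


def layer_listing(archive, index):
    """Return [(size, path)] for regular files in layer `index` (0 = base), largest first."""
    with tarfile.open(fileobj=archive, mode="r:*") as image:
        layers = json.load(image.extractfile("manifest.json"))[0]["Layers"]
        # Open the selected layer tarball directly from inside the image archive.
        with tarfile.open(fileobj=image.extractfile(layers[index]), mode="r:*") as layer:
            files = [(member.size, member.name) for member in layer if member.isfile()]
    return sorted(files, reverse=True)
```

For example, layer_listing(open("image.tar", "rb"), 0) would list the files added by the base layer.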

Docker /vfs folder size

I have a problem with my docker.
I've downloaded an image, and Docker shows its size as about 600 MB.
But on disk, under /usr/lib/docker/, it uses almost 6 GB.
Here is my folder before image pull:
/..
27.9 MiB [##########] /tmp
236.0 KiB [ ] /image
60.0 KiB [ ] /network
8.0 KiB [ ] /vfs
e 4.0 KiB [ ] /volumes
e 4.0 KiB [ ] /trust
e 4.0 KiB [ ] /containers
And here is after image pull:
/..
5.8 GiB [##########] /vfs
27.9 MiB [ ] /tmp
2.2 MiB [ ] /image
60.0 KiB [ ] /network
e 4.0 KiB [ ] /volumes
e 4.0 KiB [ ] /trust
e 4.0 KiB [ ] /containers
Image itself:
REPOSITORY TAG IMAGE ID CREATED SIZE
docker.io/bitnami/mariadb latest f5dbed792113 8 days ago 598.1 MB
And its history:
IMAGE CREATED CREATED BY SIZE COMMENT
f5dbed792113 8 days ago /bin/sh -c #(nop) CMD ["/run.sh"] 0 B
<missing> 8 days ago /bin/sh -c #(nop) ENTRYPOINT &{["/app-entrypo 0 B
<missing> 8 days ago /bin/sh -c #(nop) EXPOSE 3306/tcp 0 B
<missing> 8 days ago /bin/sh -c #(nop) VOLUME [/bitnami/mariadb] 0 B
<missing> 8 days ago /bin/sh -c #(nop) ENV ALLOW_EMPTY_PASSWORD=no 0 B
<missing> 8 days ago /bin/sh -c #(nop) COPY dir:c5bea93fb9ce36dc47 3.758 kB
<missing> 8 days ago /bin/sh -c bitnami-pkg unpack mariadb-10.1.23 482.1 MB
<missing> 8 days ago /bin/sh -c install_packages libaio1 libc6 lib 12.29 MB
<missing> 8 days ago /bin/sh -c #(nop) LABEL maintainer=Bitnami <c 0 B
<missing> 3 weeks ago /bin/sh -c #(nop) ENTRYPOINT &{["/entrypoint. 0 B
<missing> 3 weeks ago /bin/sh -c #(nop) COPY dir:21a422cab8e9367936 10.17 kB
<missing> 3 weeks ago /bin/sh -c #(nop) ENV BITNAMI_IMAGE_VERSION=j 0 B
<missing> 3 weeks ago /bin/sh -c #(nop) ENV PATH=/opt/bitnami/nami/ 0 B
<missing> 3 weeks ago /bin/sh -c cd /tmp && gpg --keyserver hkp:/ 1.423 MB
<missing> 3 weeks ago /bin/sh -c #(nop) ENV GOSU_VERSION=1.10 GOSU_ 0 B
<missing> 3 weeks ago /bin/sh -c cd /tmp && gpg --keyserver hkp:/ 40.76 kB
<missing> 3 weeks ago /bin/sh -c #(nop) ENV TINI_VERSION=v0.13.2 0 B
<missing> 3 weeks ago /bin/sh -c cd /tmp && curl -sSLO https://na 16.77 MB
<missing> 3 weeks ago /bin/sh -c #(nop) ENV NAMI_VERSION=0.0.6-0 0 B
<missing> 3 weeks ago /bin/sh -c install_packages curl ca-certifica 34.3 MB
<missing> 3 weeks ago /bin/sh -c #(nop) LABEL maintainer=Bitnami <c 0 B
<missing> 3 weeks ago 51.14 MB from Bitnami with love
I'm new to Docker, so is this normal?
I find that quite insane. A whole Linux VM might be smaller than this mariadb image.
How can I resolve this?
My docker info:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 1
Server Version: 1.10.3
Storage Driver: vfs
Execution Driver: native-0.2
Logging Driver: journald
Plugins:
Volume: local
Network: bridge null host
Kernel Version: 2.6.32-042stab120.20
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 2
CPUs: 1
Total Memory: 512 MiB
WARNING: No oom kill disable support
WARNING: bridge-nf-call-ip6tables is disabled
Registries: docker.io (secure)
The problem is the vfs storage driver.
Quoting "Storage Drivers in Docker: A Deep Dive":
First, let’s get the one special graphdriver out of the way–vfs is the
“naive” implementation of the interface that does not use a union
filesystem or CoW techniques at all, but rather copies all the layers
in order into a static subdirectory and mounts the end result as the
container root filesystem. It is not meant for real (production) use
but is very valuable for simple validation and testing of other parts
of the Docker engine.
My advice is to upgrade to the latest CentOS 7.2 to obtain the latest kernel version they support and use overlay2:
https://docs.docker.com/engine/userguide/storagedriver/overlayfs-driver/
With the VFS storage driver, each RUN/COPY/ADD instruction in a Dockerfile creates a full copy of the image's filesystem during the image build. So if your Dockerfile contains ten RUNs, it will create ten layers/copies.
The solution is to reduce the number of RUN/COPY/ADD instructions or to use another driver (e.g. overlay2) that copies only changed files.
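To get a feel for how quickly full copies add up, here is a rough model. It assumes every layer is stored as a complete copy of the filesystem up to that point and ignores deleted files, block-size overhead and temporary data, so treat it as an illustration rather than a way to reproduce the exact numbers above. vfs_disk_usage and the sizes are made up.

```python
from itertools import accumulate


def vfs_disk_usage(layer_sizes):
    """Estimate disk usage when each layer holds a full copy of everything below it."""
    # Layer n stores the sum of the sizes of layers 0..n, so add up the running totals.
    return sum(accumulate(layer_sizes))


# Hypothetical image: three layers adding 50, 30 and 500 MB (base first).
sizes_mb = [50, 30, 500]
print(sum(sizes_mb))             # size reported for the image: 580
print(vfs_disk_usage(sizes_mb))  # rough vfs footprint: 50 + 80 + 580 = 710
```

A copy-on-write driver such as overlay2 stores roughly the first number; vfs moves toward the second, and the gap widens with every additional layer.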
The final number of layers can also be reduced by adding the instructions below at the end of the Dockerfile:
FROM scratch
COPY --from=0 / /
As a result, all layers created before the FROM scratch instruction can be forgotten and freed (with docker image prune). The COPY --from=0 / / instruction copies all files from the first stage into the new one. Note that image metadata such as ENV, CMD and ENTRYPOINT is not copied and has to be declared again after FROM scratch.
You can check this Dockerfile for a usage example.
