Retrieve file that was deleted during the image build - docker

I have a docker image.
When I use the docker history command on the image, I can see
85d9bf810d44 9 days ago /bin/sh -c apk add vim 26.9MB
<missing> 9 days ago /bin/sh -c apk update 1.78MB
<missing> 9 days ago /bin/sh -c rm -f file.txt 0B
<missing> 9 days ago /bin/sh -c a=$(base64 -d < file.txt) && echo $a … 49B
<missing> 9 days ago /bin/sh -c #(nop) COPY file:98f5646751cb4985… 68B
<missing> 6 weeks ago /bin/sh -c #(nop) CMD ["/bin/sh"] 0B
<missing> 6 weeks ago /bin/sh -c #(nop) ADD file:f17f65714f703db90… 5.57MB
So there was a file.txt in the image at some point, but it was later removed. Is there a way to retrieve the content of that file from the image layers?
I have looked into Dive and similar tools. Navigating through Docker's overlay directories also seemed promising, but I am using macOS and couldn't find the corresponding directories.

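docker image save exports the image as a tarball that contains one tarball per layer. A later rm only hides the file in a newer layer, so the layer created by the COPY step should still contain file.txt.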
https://docs.docker.com/engine/reference/commandline/image_save/
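As a rough sketch of how that export could be searched, the Python function below opens the archive written by docker image save, reads the layer list from manifest.json, and returns the contents of a given path from every layer that still holds it. It assumes the archive has a top-level manifest.json with a Layers array, as current Docker versions write; find_in_layers and image.tar are names made up for this example. The layer produced by rm -f typically holds only a whiteout marker named .wh.file.txt, so a match would come from an earlier layer.

```python
import json
import tarfile


def _norm(path):
    # Layer tarballs may store names as "file.txt", "./file.txt" or "/file.txt".
    while path.startswith("./"):
        path = path[2:]
    return path.lstrip("/")


def find_in_layers(image_tar_path, wanted):
    """Return [(layer_name, file_bytes)] for every layer that contains `wanted`.

    `wanted` is a path relative to the image root, e.g. "file.txt".
    """
    wanted = _norm(wanted)
    found = []
    with tarfile.open(image_tar_path) as image:
        # manifest.json lists the layer archives from base layer to top layer.
        manifest = json.load(image.extractfile("manifest.json"))
        for layer_name in manifest[0]["Layers"]:
            layer_file = image.extractfile(layer_name)
            # "r:*" lets tarfile detect whether the layer is compressed.
            with tarfile.open(fileobj=layer_file, mode="r:*") as layer:
                for member in layer:
                    if member.isfile() and _norm(member.name) == wanted:
                        found.append((layer_name, layer.extractfile(member).read()))
    return found
```

After docker image save -o image.tar <image>, calling find_in_layers("image.tar", "file.txt") should return one (layer, bytes) pair for each layer that still stores the file.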

Related

How to 'squash' base image layers into a single layer?

I am trying to upgrade the base image version in one of my images, but due to a layer limit of 40, my builds are failing.
I noticed that in the older image, docker history gave this output:
IMAGE CREATED CREATED BY SIZE
092c6c14cf83 8 days ago /bin/sh -c #(nop) ENV LANG 0B
376101232840 8 days ago /bin/sh -c /tmp/tmp.sh 65.4MB
6c58dda60477 8 days ago /bin/sh -c #(nop) COPY file:a 1.27kB
1264065f6ae8 7 months ago 4.72kB
<missing> 7 months ago 207MB
But after updating the image version, this is the output:
IMAGE CREATED CREATED BY SIZE
3233036cf707 41 hours ago /bin/sh -c #(nop) 0B
1e72b109fe29 41 hours ago /bin/sh -c /tmp/tmp.sh 65.5MB
e4bb0f8240aa 41 hours ago /bin/sh -c #(nop) COPY file:a 1.27kB
dea12a7906f5 12 days ago /bin/sh -c rm -f /tmp/tls-ca- 207MB
<missing> 12 days ago /bin/sh -c rm -f '/etc/yum.re' 0B
<missing> 12 days ago /bin/sh -c #(nop) LABEL "dist 0B
<missing> 12 days ago /bin/sh -c #(nop) ADD file:7 0B
<missing> 12 days ago /bin/sh -c #(nop) ADD file:ad 0B
<missing> 12 days ago /bin/sh -c #(nop) LABEL relea 0B
<missing> 12 days ago /bin/sh -c mkdir -p /var/log 0B
<missing> 12 days ago /bin/sh -c rm -rf /var/log/ 0B
<missing> 12 days ago /bin/sh -c #(nop) CMD ["/bin 0B
<missing> 12 days ago /bin/sh -c #(nop) ENV PATH / 0B
<missing> 12 days ago /bin/sh -c #(nop) ENV contain 0B
<missing> 12 days ago /bin/sh -c #(nop) LABEL io. 0B
<missing> 12 days ago /bin/sh -c #(nop) LABEL io. 0B
<missing> 12 days ago /bin/sh -c #(nop) LABEL io. 0B
<missing> 12 days ago /bin/sh -c #(nop) LABEL des 0B
<missing> 12 days ago /bin/sh -c #(nop) LABEL sum 0B
<missing> 12 days ago /bin/sh -c #(nop) LABEL com 0B
<missing> 12 days ago /bin/sh -c #(nop) LABEL com 0B
<missing> 12 days ago /bin/sh -c #(nop) LABEL 0B
<missing> 12 days ago /bin/sh -c #(nop) ADD multi:3 0B
<missing> 12 days ago /bin/sh -c #(nop) ADD file:21 0B
<missing> 12 days ago /bin/sh -c #(nop) ADD file:09 0B
Any clue why this may be happening? These extra layers that are being added seem to come from the base image. Any clue how to make them appear as a single layer like shown in the older build?
Thanks in advance!
So I realized a couple of things regarding this question at a later point of time.
First off, docker history does not give you the true layer count*. To find the true layer count, I used docker inspect --format='{{join .RootFS.Layers "\n"}}' $INSTANCE_ID | wc -l.
Secondly, if you still need to squash layers, explore the --squash flag of docker build (an experimental feature), or use docker-squash.
* - The IMAGE column of docker history lists the IDs of the intermediate images created at each build step, and the Docker engine that built the image maintains the mapping between those IDs and the layer diffs. That mapping is not transferred when an image is pulled, so you get the <missing> value in the IMAGE field. Couldn't have figured this out without this amazing blog post.
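The same count can be taken from the JSON that docker inspect prints, without going through wc -l. This is a minimal sketch; count_layers is a name chosen for this example, and the function only relies on the Id and RootFS.Layers fields of the inspect output.

```python
import json


def count_layers(inspect_output):
    """Map each image ID to its number of filesystem layers.

    `inspect_output` is the JSON text printed by `docker inspect <image>...`,
    which is a list with one object per inspected image.
    """
    return {
        image["Id"]: len(image["RootFS"]["Layers"])
        for image in json.loads(inspect_output)
    }
```

The input text could come from something like subprocess.run(["docker", "inspect", image_id], capture_output=True, text=True, check=True).stdout, where image_id is whatever image you want to check.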

Missing <missing> Image Layer ID for subsequent child layers - DOCKER_BUILDKIT=1 docker build and docker history <image id>

Docker version 19.03.12, build 48a66213fe
OS: Red Hat Enterprise Linux Server release 7.9 (Maipo)
Please note: the Dockerfile has minimal steps and builds successfully.
To create the Docker image, I ran (this command also captures stdout/stderr in a .log file):
DOCKER_BUILDKIT=1 docker build \
--network=host \
-t project-opensuse-docker-image:15.2 . \
|& tee -a /tmp/project-opensuse-docker-image.log
When I set DOCKER_BUILDKIT=1 for the docker build command, the image IDs of the subsequent layers are listed as <missing>. If I re-run docker build as soon as the first run completes successfully, it does not reuse the layer cache to build faster, even though nothing has changed in the Dockerfile or in any folder/file on the file system (i.e. those used in COPY / ADD steps).
It builds from scratch every time, and I think the reason is that there is no valid image ID for the layers under the final image ID a07eebd0a9ba.
See the IMAGE column in the docker history <image> output below, which shows <missing> for the subsequent layers' IDs.
Why am I getting <missing> under IMAGE for these layers?
[gigauser#rh79maipo_machine opensuse-x]$ sudo docker history a07eebd0a9ba
IMAGE CREATED CREATED BY SIZE COMMENT
a07eebd0a9ba 3 hours ago CMD ["ls -l"] 0B buildkit.dockerfile.v0
<missing> 3 hours ago WORKDIR /home/nonroot_user 0B buildkit.dockerfile.v0
<missing> 3 hours ago VOLUME [/home/nonroot_user/git] 0B buildkit.dockerfile.v0
<missing> 3 hours ago RUN /bin/sh -c ls -l /home/nonroot_user /home/rayd… 174MB buildkit.dockerfile.v0
<missing> 3 hours ago COPY ./boost_1_68_0.tar.gz /home/nonroot_user/tool… 109MB buildkit.dockerfile.v0
<missing> 3 hours ago RUN /bin/sh -c ls -l /home/nonroot_user /home/rayd… 253MB buildkit.dockerfile.v0
<missing> 3 hours ago COPY ./wxWidgets-3.1.3.tar.bz2 /home/nonroot_user/… 21.3MB buildkit.dockerfile.v0
<missing> 3 hours ago RUN /bin/sh -c echo -e "\n-- Installing Zypper… 1.71GB buildkit.dockerfile.v0
<missing> 26 hours ago RUN /bin/sh -c useradd -r -m -l … 953kB buildkit.dockerfile.v0
<missing> 26 hours ago COPY ./CompanyCertBundle/PEM/*.cer /usr/sha… 34.7kB buildkit.dockerfile.v0
<missing> 26 hours ago ENV http_proxy=http://company.proxy.com:80/ … 0B buildkit.dockerfile.v0
<missing> 26 hours ago LABEL Project=aPROJECT IM CentOS_Version=CentOS… 0B buildkit.dockerfile.v0
<missing> 4 weeks ago KIWI 9.23.20 109MB
[gigauser#rh79maipo_machine opensuse-x]$

What can be the cause of different permissions inside a fresh Docker container?

I have the situation that a fresh Docker container leads to different results when executed on different machines.
Specifically the file system permissions are different:
docker run --rm my-private-image /bin/sh -c "ls -l /"
...
drwxr-xr-t 2 root root 4096 Dec 18 23:33 tmp
vs.
drwxrwxrwt 2 root root 4096 May 29 2020 tmp
The problem appeared only recently, after I moved the /var/lib/docker directory to another partition. So most likely I screwed it up myself.
I already deleted the image in question to force Docker to fetch it again and correct the mistake, but no luck (that fixed the differing owners/groups, but not the permissions).
The image in question (I called it my-private-image) is based upon trafex/alpine-nginx-php7, which is responsible for /tmp.
My question now is:
How/Why does Docker keep the different layers even after I deleted the image?
And what can I do to rectify the situation?
(I could of course just delete the whole /var/lib/docker and reinstall Docker to solve that, but I want to understand Docker's internals better)
Doing a quick check of the image history, nothing jumps out as modifying /tmp, so I'm pretty sure it will be the base layer:
$ docker history trafex/alpine-nginx-php7
IMAGE CREATED CREATED BY SIZE COMMENT
d03c5e607375 5 months ago /bin/sh -c #(nop) HEALTHCHECK &{["CMD-SHELL… 0B
<missing> 5 months ago /bin/sh -c #(nop) CMD ["/usr/bin/supervisor… 0B
<missing> 5 months ago /bin/sh -c #(nop) EXPOSE 8080 0B
<missing> 5 months ago /bin/sh -c #(nop) COPY --chown=nobodydir:922… 58B
<missing> 5 months ago /bin/sh -c #(nop) WORKDIR /var/www/html 0B
<missing> 5 months ago /bin/sh -c #(nop) USER nobody 0B
<missing> 5 months ago /bin/sh -c chown -R nobody.nobody /var/www/h… 1.15kB
<missing> 5 months ago /bin/sh -c mkdir -p /var/www/html 0B
<missing> 5 months ago /bin/sh -c #(nop) COPY file:12908bc96c18db8f… 459B
<missing> 5 months ago /bin/sh -c #(nop) COPY file:ba2b24ac43720041… 27B
<missing> 5 months ago /bin/sh -c #(nop) COPY file:791d5f77ccca3899… 2.16kB
<missing> 5 months ago /bin/sh -c #(nop) COPY file:e7e19bb0340c77dd… 2.91kB
<missing> 5 months ago /bin/sh -c ln -s /usr/bin/php8 /usr/bin/php 13B
<missing> 5 months ago /bin/sh -c apk --no-cache add curl nginx… 121MB
<missing> 5 months ago /bin/sh -c #(nop) LABEL Description=Lightwe… 0B
<missing> 5 months ago /bin/sh -c #(nop) LABEL Maintainer=Tim de P… 0B
<missing> 6 months ago /bin/sh -c #(nop) CMD ["/bin/sh"] 0B
<missing> 6 months ago /bin/sh -c #(nop) ADD file:f278386b0cef68136… 5.6MB
Inspecting the image, that sha looks like the 72e8... layer:
"RootFS": {
    "Type": "layers",
    "Layers": [
        "sha256:72e830a4dff5f0d5225cdc0a320e85ab1ce06ea5673acfe8d83a7645cbd0e9cf",
        "sha256:8e1de91f5d76729f122777c082e5a0a04bf157438b40223add6e5b0d74974b4a",
        "sha256:02bef6095ce13955a5feda9394ceb41e3ff06b1547c179550495f0ea51fa7d81",
        "sha256:3dc8fd1f904d7ed46a9ef40a6083f7da2dd76c5d33acbfe5b49a57054a7e691a",
        "sha256:5e1da4b6ee09de869cc02ebd7b0ef093da7b1b3cf5e1a5258398960679252d1d",
        "sha256:3c93b50985ab6d6ff15f51828432c04d0163232e46d510131ce6fafb44832da9",
        "sha256:4da6c12a08b014c74acc2348d4e5fec101a10f27caf2ac2cd80596c1775cd6ad",
        "sha256:f44b520b457aa8b531c86e63f2c9c391d73f5eca898e698713c679b684637890",
        "sha256:a7a1e7d3913b5bd1b0d13653840349ce407c81beec236055edab400c5a1a0096",
        "sha256:e68f2db5551b5b4c15ad89dae5ee54ea4a3b864c71475146cedfda1ee6d75ec7"
    ]
},
In docker, those are currently stored within /var/lib/docker/image/overlay2/layerdb and that has a pointer to the overlay2 folder:
# cat image/overlay2/layerdb/sha256/72e830a4dff5f0d5225cdc0a320e85ab1ce06ea5673acfe8d83a7645cbd0e9cf/cache-id
a2cd81767ad8c4e1fc556585df7f9904089e4d3884304f1c3a343c234b9a8f08
# ls -al overlay2/a2cd81767ad8c4e1fc556585df7f9904089e4d3884304f1c3a343c234b9a8f08/diff/tmp
total 8
drwxrwxrwt 2 root root 4096 Jun 15 2021 .
drwxr-xr-x 19 root root 4096 Jul 7 15:51 ..
If your filesystem has been corrupted at this level in the move, I'd personally delete the copy (the entire /var/lib/docker directory, not just pieces from it) and start over. Docker will not pull layers that it already has locally, and you are probably seeing just the first issue with the /tmp folder permissions.
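One caveat about the layerdb lookup shown earlier: the directories under layerdb/sha256 are named by chain ID rather than by the diff IDs listed under RootFS. The two coincide only for the first layer, which is why the 72e8... value can be used directly. For later layers the chain ID is derived as described in the OCI image specification. The sketch below computes it from the RootFS list; chain_ids is a name chosen for this example.

```python
import hashlib


def chain_ids(diff_ids):
    """Return the chain ID for each layer, given diff IDs in base-to-top order.

    Rule from the OCI image spec:
      ChainID(L0)     = DiffID(L0)
      ChainID(L0..Ln) = sha256(ChainID(L0..Ln-1) + " " + DiffID(Ln))
    Every ID is a string of the form "sha256:<hex>".
    """
    chain = []
    for diff_id in diff_ids:
        if not chain:
            # The base layer's chain ID is simply its diff ID.
            chain.append(diff_id)
        else:
            digest = hashlib.sha256(f"{chain[-1]} {diff_id}".encode()).hexdigest()
            chain.append("sha256:" + digest)
    return chain
```

Passing the ten digests from RootFS in order should give the ten directory names to look for under layerdb/sha256, once the sha256: prefix is dropped.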

how to get size of docker image layers

I have pulled a couple of images from my private repo. I am able to see the size of the layers using docker history <image-id>, but I don't see the actual sha256 ID of each layer; it shows <missing> instead. So I am not sure how to get the size of each layer.
What I actually want is the size of each layer in the image.
I am able to get the layer details from the docker inspect command: docker inspect <image-id> | jq .[].RootFS.Layers
docker history f183414e30ab
IMAGE CREATED CREATED BY SIZE COMMENT
f183414e30ab 16 months ago /bin/sh -c apt-get update && apt-get install… 317MB
<missing> 16 months ago /bin/sh -c #(nop) CMD ["/bin/bash"] 0B
<missing> 16 months ago /bin/sh -c mkdir -p /run/systemd && echo 'do… 7B
<missing> 16 months ago /bin/sh -c set -xe && echo '#!/bin/sh' > /… 745B
<missing> 16 months ago /bin/sh -c [ -z "$(apt-get indextargets)" ] 987kB
<missing> 16 months ago /bin/sh -c #(nop) ADD file:3ddd02d976792b6c6… 63.2MB
Problem I am trying to solve: I want to get the size of Docker image layers that are also shared with other images.
You are looking for the docker system df command. With the -v flag it also reports the shared and unique size of each image.
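If you need the size next to each layer's sha256 digest, one possible approach is to read an archive written by docker image save. The sketch below pairs the diff IDs from the image config with the size of each layer archive. It assumes a top-level manifest.json with Config and Layers entries, as current Docker versions write; layer_sizes is a name made up for this example, and the sizes are those of the layer tar files, so they may differ slightly from what docker history reports.

```python
import json
import tarfile


def layer_sizes(image_tar_path):
    """Return [(diff_id, size_in_bytes)] in base-to-top order."""
    with tarfile.open(image_tar_path) as image:
        manifest = json.load(image.extractfile("manifest.json"))[0]
        # The image config lists the sha256 diff IDs in the same order
        # as the manifest lists the layer archives.
        config = json.load(image.extractfile(manifest["Config"]))
        diff_ids = config["rootfs"]["diff_ids"]
        # Size of each layer archive as stored inside the export.
        sizes = [image.getmember(name).size for name in manifest["Layers"]]
    return list(zip(diff_ids, sizes))
```

The diff IDs returned here are the same values that docker inspect prints under RootFS.Layers, so layers shared between two images can be spotted by comparing the results for both.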

How do you know what you're getting when you pull an existing Docker image?

When you create your own Docker image, you usually start a Dockerfile with FROM, and base your image off something that already exists on Docker Hub. How can I learn more about what is actually in the image I am referencing?
For example, I'm interested in starting with this image:
https://hub.docker.com/_/swift/
Besides what's listed in the description fields on that webpage, how can I verify what is actually getting installed? Is there a way to view a Dockerfile for an existing image on docker hub?
Thanks
Dockerfile links
Many images, especially "official" images, will contain Dockerfile links. You'll find them in the description on Docker Hub. For instance, right now at the link you posted in your question, you'll find a few image tags and a couple of links to Dockerfile.
3.1.0, 3.1, 3, latest (Dockerfile)
Simply click on "Dockerfile" and it will take you to the Dockerfile that was used to build that version of the image.
It should be noted that this is metadata associated with the Docker Hub account. You can't completely trust that it is correct, because it's just a link. (To GitHub, in this case, but it can be anywhere.)
Since you can't completely trust that, you may want to look also at...
Docker history
If you docker pull swift to fetch the image, you can then use the docker history command to take a closer look at it. Currently, that looks like this:
IMAGE CREATED CREATED BY SIZE COMMENT
d505ae70cb39 2 weeks ago /bin/sh -c swift --version 0B
<missing> 2 weeks ago /bin/sh -c SWIFT_URL=https://swift.org/bui... 403MB
<missing> 2 weeks ago /bin/sh -c #(nop) ENV SWIFT_PLATFORM=ubun... 0B
<missing> 2 weeks ago /bin/sh -c #(nop) ARG SWIFT_VERSION=swift... 0B
<missing> 2 weeks ago /bin/sh -c #(nop) ARG SWIFT_BRANCH=swift-... 0B
<missing> 2 weeks ago /bin/sh -c #(nop) ARG SWIFT_PLATFORM=ubun... 0B
<missing> 2 weeks ago /bin/sh -c apt-get -q update && apt-ge... 626MB
<missing> 2 weeks ago /bin/sh -c #(nop) MAINTAINER Haris Amin <... 0B
<missing> 2 weeks ago /bin/sh -c #(nop) CMD ["/bin/bash"] 0B
<missing> 2 weeks ago /bin/sh -c mkdir -p /run/systemd && echo '... 7B
<missing> 2 weeks ago /bin/sh -c sed -i 's/^#\s*\(deb.*universe\... 2.76kB
<missing> 2 weeks ago /bin/sh -c rm -rf /var/lib/apt/lists/* 0B
<missing> 2 weeks ago /bin/sh -c set -xe && echo '#!/bin/sh' >... 745B
<missing> 2 weeks ago /bin/sh -c #(nop) ADD file:5aff8c59a707833... 118MB
You'll notice that the commands used to build each layer of the image are truncated for display. That makes this display not especially useful, but you can use the --no-trunc flag to get a much more verbose output.
docker history --no-trunc swift:latest
Then you will get a lot of output (more than I will paste here), but here is a one-entry sample:
<missing> 2 weeks ago /bin/sh -c SWIFT_URL=https://swift.org/builds/$SWIFT_BRANCH/$(echo "$SWIFT_PLATFORM" | tr -d .)/$SWIFT_VERSION/$SWIFT_VERSION-$SWIFT_PLATFORM.tar.gz && curl -fSsL $SWIFT_URL -o swift.tar.gz && curl -fSsL $SWIFT_URL.sig -o swift.tar.gz.sig && export GNUPGHOME="$(mktemp -d)" && set -e; for key in 7463A81A4B2EEA1B551FFBCFD441C977412B37AD 1BE1E29A084CB305F397D62A9F597F4D21A56D5F A3BAFD3556A59079C06894BD63BC1CFE91D306C6 ; do gpg --quiet --keyserver ha.pool.sks-keyservers.net --recv-keys "$key"; done && gpg --batch --verify --quiet swift.tar.gz.sig swift.tar.gz && tar -xzf swift.tar.gz --directory / --strip-components=1 && rm -r "$GNUPGHOME" swift.tar.gz.sig swift.tar.gz 403MB
Most of the text is simply the commands executed by Dockerfile RUN statements. You will also see the other Dockerfile commands like ARG, CMD, ADD, COPY, etc.
Since this is recorded in the image's own metadata, it is probably more reliable (if less readable) than the Dockerfile links found in the Docker Hub readme file.
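The same build history can also be read programmatically. As a sketch, the function below pulls the untruncated created_by strings out of the image config inside an archive written by docker image save, and flags which steps produced a filesystem layer. It assumes the archive has a top-level manifest.json pointing at the config file, as current Docker versions write; build_steps is a name chosen for this example.

```python
import json
import tarfile


def build_steps(image_tar_path):
    """Return [(created_by, makes_layer)] from oldest step to newest."""
    with tarfile.open(image_tar_path) as image:
        manifest = json.load(image.extractfile("manifest.json"))[0]
        config = json.load(image.extractfile(manifest["Config"]))
    return [
        # Steps marked "empty_layer" (ENV, CMD, LABEL, ...) only changed metadata.
        (entry.get("created_by", ""), not entry.get("empty_layer", False))
        for entry in config.get("history", [])
    ]
```

For an image exported with docker image save -o swift.tar swift:latest, build_steps("swift.tar") should list the same commands as docker history --no-trunc, in reverse order.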

Resources