How to find out the base image for a docker image - docker

I have a docker image and I would like to find out from which image it has been created. Of course there are multiple layers, but I'd like to find out the last image (the FROM statement in the dockerfile for this image).
I tried docker image history and docker image inspect, but I can't find this information there.
I also tried the following command, but it gives me an error message:
alias dfimage="sudo docker run -v /var/run/docker.sock:/var/run/docker.sock --rm xyz/mm:9e945ff"
dfimage febae8978318
This is the error message I'm getting
container_linux.go:235: starting container process caused "exec: \"febae8978318\": executable file not found in $PATH"
/usr/bin/docker-current: Error response from daemon: oci runtime error: container_linux.go:235: starting container process caused "exec: \"febae8978318\": executable file not found in $PATH".

An easy way is to use
docker image history deno
This command lists the layers of the image, newest first.
Look at the IMAGE column and take the image ID that sits just above the first <missing> entry (a24bb4013296 in my case).
Then search your local images for that ID.
On Linux:
docker image ls | grep a24bb4013296
On Windows:
docker image ls | findstr a24bb4013296
This will give you the base image name, provided the base image is still present locally.
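The same lookup can be sketched as a small shell helper. It reads docker image history output on standard input and prints the ID found on the row just above the first <missing> entry. The function name id_above_first_missing is made up for illustration, and the sketch assumes the usual column layout with the image ID in the first column.

```shell
#!/usr/bin/env sh
# Hypothetical helper: print the image ID that sits directly above the
# first "<missing>" row in `docker image history` output read from stdin.
id_above_first_missing() {
    awk '
        NR == 1 { next }                                      # skip the header row
        $1 == "<missing>" { if (prev != "") print prev; exit }
        { prev = $1 }                                         # remember the last real ID
    '
}

# Usage (assumes the image is available locally):
#   docker image history deno | id_above_first_missing
```

If the very first row is already <missing>, the helper prints nothing, which means this approach cannot identify a base image for that image.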

The information doesn't really exist, exactly. An image will contain the layers of its parent(s) but there's no easy way to reverse layer digests back to a FROM statement, unless you happen to have (or are able to figure out) the image that contains those layers.
If you have the parent image(s) on-hand (or can find them), you can infer which image(s) your image used for its FROM statement (or ancestry) by cross-referencing the layers.
Theoretical example
Suppose your image, FOO, contains the layers 1 2 3 4 5 6. If you have another image, BAR on your system containing layers 1 2 3, you could infer that image BAR is an ancestor of image FOO -- I.E. that FROM BAR would have been used at some point in its hierarchy.
Suppose further that you have another image, BAZ which contains the layers 1 2 3 4 5. You could infer that image BAZ has image BAR in its ancestry and that image FOO inherits from image BAZ (and therefore indirectly from BAR).
From this information, you could infer that the dockerfiles for these images might have looked something like this:
# Dockerfile of image BAR
FROM scratch
# layers 1 2 and 3
COPY ./one /
COPY ./two /
COPY ./three /
# Dockerfile of Image BAZ
FROM BAR
RUN echo "this makes layer 4" > /four
RUN echo "this makes layer 5" > /five
# Dockerfile of image FOO
FROM BAZ
RUN echo "this makes layer 6" > /six
You could get the exact commands by looking at docker image history for each image.
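The cross-referencing described above can be sketched as a small shell check. It assumes you have already written each image's layer digests to a text file, one digest per line with the base layer first and a trailing newline. The function name is_layer_prefix and the file names in the usage comment are hypothetical.

```shell
#!/usr/bin/env sh
# Hypothetical helper: succeed if the layer list in file $1 is a proper
# prefix of the layer list in file $2 (one digest per line, base layer first).
is_layer_prefix() {
    base_count=$(( $(wc -l < "$1") ))
    child_count=$(( $(wc -l < "$2") ))
    [ "$base_count" -gt 0 ] || return 1
    [ "$base_count" -lt "$child_count" ] || return 1
    # Compare the base list with the same number of leading lines of the child.
    head -n "$base_count" "$2" | cmp -s - "$1"
}

# Usage (file names are placeholders):
#   docker image inspect BAR | jq -r '.[].RootFS.Layers[]' > bar.layers
#   docker image inspect FOO | jq -r '.[].RootFS.Layers[]' > foo.layers
#   is_layer_prefix bar.layers foo.layers && echo "BAR is an ancestor of FOO"
```

The check requires the shared layers to appear in the same order from the bottom of the stack, which is slightly stricter than a plain subset test.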
One important thing to keep in mind here, however, is that docker tags are mutable; maintainers make new images and move the tags to those images. So if you built an image with FROM python:3.8.1 today, it won't contain the same layers as if you had built an image with that same FROM line a few weeks ago. You'll need the SHA256 digest to be sure you're using the exact same image.
Practical Example, local images
Now that we understand the theory behind identifying images and their bases, let's put it to practice with a real-world example.
Note: because the tags I use will change over time (see above RE: tag mutability), I'll be using the SHA256 digest to pull the images in this example so it can be reproduced by viewers of this answer.
Let's say we have a particular image and we want to find its base(s). We'll use the official maven image here.
First, we'll take a look at its layers.
# maven:3.6-jdk-11-slim at time of writing, on my platform
IMAGE="docker.io/maven@sha256:55f1c145a04e01706233d68fe0b6b20bf76f765ab32f3fe6e29c8ef933917af6"
docker pull $IMAGE
docker image inspect $IMAGE | jq -r '.[].RootFS.Layers[]'
This will output the layers:
sha256:6e06900bc10223217b4c78081a857866f674c462e4f90593b01894da56df336d
sha256:eda2f4da9b1e70500ac340d40ee039ef3877e8be13b9a24cd345406bf6693412
sha256:6bdb7b3c3e226bdfaa911ba72a95fca13c3979cd150061d570cf569e93037ce6
sha256:ce217e530345060ca0973807a3288560e1e15cf1a4eeec44d6aa594a926c92dc
sha256:f256c980a7d17a00f57fd42a19f6323fcc2341fa46eba128def04824cafa5afa
sha256:446b1af848de2dcb92bbd229ca6ecaabf2f48dab323c19f90d02622e09a8fa67
sha256:10652cf89eaeb5b5d8e0875a6b1867b5cf92c509a9555d3f57d87fab605115a3
sha256:d9a4cf86bf01eb170242ca3b0ce456159fd3fddc9c4d4256208a9d19bae096ca
Now, from here, we can try to find other images that have a (strict) subset of these layers. Assuming you have the images on-hand, you can find them by cross-referencing the layers of images you have on disk, for example, using docker image inspect.
In this case, I just happen to know what these images are and have them on-hand (I'll discuss later what you might do if you don't have the images on-hand) so we'll go ahead and pull those images and take a look at the layers.
If you want to follow along:
# openjdk:11.0.10-jdk-slim at time of writing, on my platform
OPENJDK='docker.io/openjdk@sha256:fe6a46a26ff7d6c31b258e07b3d53f0c42fe68f55f646cc39d60d0b17cbc827b'
# debian:buster-20210329-slim at time of writing, on my platform
DEBIAN='docker.io/debian@sha256:088be7d6017ad3ae98325f47707112e1f61687c371be1865e55d5e5531ca97fd'
docker pull $OPENJDK
docker pull $DEBIAN
If we inspect these images and compare them against the layers we saw in the output of docker image inspect for the maven image, we can confirm that the layers from openjdk and debian are present in our original maven image.
$ docker image inspect $DEBIAN | jq -r '.[].RootFS.Layers[]'
sha256:6e06900bc10223217b4c78081a857866f674c462e4f90593b01894da56df336d
$ docker image inspect $OPENJDK | jq -r '.[].RootFS.Layers[]'
sha256:6e06900bc10223217b4c78081a857866f674c462e4f90593b01894da56df336d
sha256:eda2f4da9b1e70500ac340d40ee039ef3877e8be13b9a24cd345406bf6693412
sha256:6bdb7b3c3e226bdfaa911ba72a95fca13c3979cd150061d570cf569e93037ce6
sha256:ce217e530345060ca0973807a3288560e1e15cf1a4eeec44d6aa594a926c92dc
As stated, because the 4 layers of the openjdk image (the first of which is the single debian layer) are a strict subset of the 8 layers from the maven image, we can conclude the openjdk and debian images are, at least, both in the ancestry path of the maven image.
We can further infer that the last 4 layers most likely come from the maven image itself (or, potentially, some unknown image).
Caveats, when you don't have images locally
Now, of course the above only works because I happen to have all the images on-hand. So, you'd either need to have the images or be able to locate them by the layer digests.
You might still be able to figure this out using information that may be available from registries like Docker Hub or your own private repositories.
For official images, the docker-library/repo-info contains historical information about the official images, including the layer digests for the various tags cataloged over the last several years. You could use this, for example, as a source of layer information.
If you can imagine this like a database of layer digests, you could infer ancestry of at least these official images.
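To make the "database of layer digests" idea concrete, here is a minimal sketch that treats the database as a plain text catalog. The catalog format (one image per line: a name followed by its layer digests, base layer first) and the function name find_ancestors are assumptions made for this illustration; repo-info does not ship its data in this form, so you would have to build such a file yourself.

```shell
#!/usr/bin/env sh
# Hypothetical catalog format: one image per line, "NAME LAYER1 LAYER2 ...",
# base layer first.
# Usage: find_ancestors CATALOG_FILE LAYER1 LAYER2 ...
# Prints every catalog entry whose layers are a proper prefix of the given layers.
find_ancestors() {
    catalog=$1
    shift
    awk -v target="$*" '
        {
            # Rebuild this entry as a space-separated layer string.
            layers = ""
            for (i = 2; i <= NF; i++) layers = layers (i > 2 ? " " : "") $i
            # An ancestor has fewer layers, all matching the start of the target.
            if (layers != "" && layers != target && index(target " ", layers " ") == 1)
                print $1
        }
    ' "$catalog"
}
```

Remember that the digests in the catalog and the digests you pass in must be of the same kind (both local or both remote).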
"Distribution" (remote) digests vs "Content" (local) digests
An important caveat to note is that, when you inspect an image for its layer digests locally, you are getting the content digest of the layers. If you are looking at layer digests in a registry manifest (like what appears in the docker-library/repo-info project), you get the distribution digest of the compressed layer, which cannot be compared with a local content digest.
So you can compare digests local <--> local OR remote <--> remote only.
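A small illustration of why the two kinds of digest never match: the content digest is computed over the uncompressed layer tar, while the distribution digest is computed over the compressed blob. The sketch below hashes the same bytes before and after gzip compression. The function names are hypothetical, the input file merely stands in for a layer tarball, and sha256sum and gzip are assumed to be installed.

```shell
#!/usr/bin/env sh
# Illustration only: the same bytes hash differently once they are compressed.

# Digest of the file as-is (analogous to a local "content" digest).
content_digest() {
    sha256sum < "$1" | awk '{print "sha256:" $1}'
}

# Digest of the gzip-compressed file (analogous to a "distribution" digest).
# -n omits the name and timestamp so the result is repeatable.
distribution_digest() {
    gzip -nc < "$1" | sha256sum | awk '{print "sha256:" $1}'
}

# Usage (layer.tar is a placeholder for any file):
#   content_digest layer.tar
#   distribution_digest layer.tar
```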
Example, using remote images
Suppose I want to do this same thing, but for an image in a remote repository, without pulling it. We can find its base(s) the same way by looking at the layers in the remote manifest.
You can find references on how to do this for your particular registry, as described in this answer for dockerhub.
Using the same images from the example above, we would find that the distribution layer digests also match in the same way.
$ get-remote-layers $IMAGE
sha256:6fcf2156bc23db75595b822b865fbc962ed6f4521dec8cae509e66742a6a5ad3
sha256:96fde6667c188c81fcddee021ccbb3e054ebe83350fd4609e17a3d37f0ec7f9d
sha256:74d17759dd2a1b51afc740fadd96f655260689a2087308e40d1865a0098c5fae
sha256:bbe8ebb5d0a64d265558901c7c6c66e1d09f664da57cdb2e5f69ba52a7109d31
sha256:b2edaadd7dd62cfe7f551b902244ee67b84bc5c0b6538b9480ac9ca97a0a4986
sha256:0fca65d33e353bdfdd5edd8d4c8ab5efde52c078bd25e2dcf454f995e5420725
sha256:d6d771d0512387eee1e419a965b929a9a3b0365cf1935b3719d60bf9feffcf63
sha256:dee8cd26669373102db07820072127c46bbfdad340a586ee9dfe60ae933eac2b
$ get-remote-layers $DEBIAN
sha256:6fcf2156bc23db75595b822b865fbc962ed6f4521dec8cae509e66742a6a5ad3
$ get-remote-layers $OPENJDK
sha256:6fcf2156bc23db75595b822b865fbc962ed6f4521dec8cae509e66742a6a5ad3
sha256:96fde6667c188c81fcddee021ccbb3e054ebe83350fd4609e17a3d37f0ec7f9d
sha256:74d17759dd2a1b51afc740fadd96f655260689a2087308e40d1865a0098c5fae
sha256:bbe8ebb5d0a64d265558901c7c6c66e1d09f664da57cdb2e5f69ba52a7109d31
One other caveat with distribution digests in repositories is that you can only compare digests of the same manifest schema version. So, if an image was pushed with manifest v1, it won't have the same digest when pushed again with manifest v2.
TL;DR
Images contain the layers of their ancestor image(s). Therefore, if an image A contains a strict subset of image B's layers, you know that image B is a descendant of image A.
You can use this property of Docker images to determine the base images from which your images were derived.

You can use the method suggested in this answer:
https://stackoverflow.com/a/53841690/3691891
First, pull chenzj/dfimage:
docker pull chenzj/dfimage
Get ID of your image:
docker images | grep <IMAGE_NAME> | awk '{print $3}'
Replace <IMAGE_NAME> with the name of your image. Use this ID as
the parameter to chenzj/dfimage:
docker run -v /var/run/docker.sock:/var/run/docker.sock --rm chenzj/dfimage <IMAGE_ID>
If you find this too hard just pull the chenzj/dfimage image and then
use the following docker-get-dockerfile.sh script:
#!/usr/bin/env sh
if [ "$#" -lt 1 ]
then
    printf "Image name needed\n" >&2
    exit 1
fi
image_id="$(docker images | grep "^$1 " | awk '{print $3}')"
if [ -z "$image_id" ]
then
    printf "Image not found\n" >&2
    exit 2
fi
docker run -v /var/run/docker.sock:/var/run/docker.sock --rm chenzj/dfimage "$image_id"
You need to pass image name as the parameter. Example usage:
$ ./docker-get-dockerfile.sh alpine
FROM alpine:latest
ADD file:fe64057fbb83dccb960efabbf1cd8777920ef279a7fa8dbca0a8801c651bdf7c in /
CMD ["/bin/sh"]

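docker run --rm image:tag sh -c 'cat /etc/*release*'
Run a container from that image with the command above (replace "image:tag" with your image name and tag). The glob is wrapped in sh -c so that it is expanded inside the container rather than by your host shell. The container will print the distribution details of the image, which is often enough to tell which base image it was built on.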
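If you only need one field from that output, a sketch like the following can pick it out. It assumes the image ships /etc/os-release in the usual KEY=value format (most current distributions do, but that is an assumption here), and the function name os_release_field is made up for this example.

```shell
#!/usr/bin/env sh
# Hypothetical helper: print the value of one KEY from os-release text on stdin,
# with surrounding double quotes removed.
os_release_field() {
    key=$1
    sed -n "s/^${key}=//p" | head -n 1 | sed 's/^"//; s/"$//'
}

# Usage (replace image:tag with your image):
#   docker run --rm image:tag cat /etc/os-release | os_release_field PRETTY_NAME
```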

Related

How do I check a registry image creation date without pulling it for a CI CD incremental build

I want to cancel a CI CD build if none of the files changed that would go into a docker image.
For that I want to get the creation timestamp of a docker image without pulling it first.
docker pull groovy:latest
docker image inspect groovy:latest | jq -r '.[].Created'
This would give me the creation timestamp
docker image remove groovy:latest || true
docker image inspect groovy:latest | jq -r '.[].Created'
If the image is not locally available the result would be 'Error: No such image: groovy:latest'
I can get a manifest list like so
docker image remove groovy:latest
docker manifest inspect groovy:latest
docker images | grep groovy
This does not give me a creation timestamp, nor should it according to the API documentation.
I am unable to get a timestamp without pulling the image.
Is there a way to achieve what I am looking for?
I need the creation timestamp of the image in the remote registry
So that I can cancel a build if no files are newer than the image.
I originally posted some more example code here. It was not a minimal example and took a while to execute, so I recommended testing it on an image with only a few layers. The approach had flaws:
It does not produce timestamps, since the data is not in the manifests.
In its current form, execution takes longer than pulling the image.
Due to the long execution time, I am not entirely sure the behavior is as intended.
I removed the example since it does not help, and playing with it kills your Docker rate limit.
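For the comparison step itself (cancelling the build when no file is newer than the image), a sketch could look like this. It does not solve the problem of getting the timestamp without pulling; it only shows what would be done with the timestamp once it is known. The function name has_files_newer_than is hypothetical, and the -newermt test assumes GNU find.

```shell
#!/usr/bin/env sh
# Hypothetical helper: succeed if any regular file under DIR was modified
# after TIMESTAMP (e.g. the "Created" value of an image). Assumes GNU find.
# Usage: has_files_newer_than DIR TIMESTAMP
has_files_newer_than() {
    dir=$1
    timestamp=$2
    [ -n "$(find "$dir" -type f -newermt "$timestamp" -print | head -n 1)" ]
}

# Usage in a CI job (created holds the timestamp, however it was obtained):
#   has_files_newer_than ./src "$created" || { echo "nothing changed"; exit 0; }
```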

Docker registry space if pushing two images from same docker file

What happens to the storage space on the docker registry server when an image is built again from the same docker file? For example, in the case below, I push an image with tag 1.0, then build another image from the same docker file and push it with tag 1.1. Is it going to take any additional space on the docker registry?
docker build . -t myRegistry.com/myImage:1.0
docker push myRegistry.com/myImage:1.0
docker build . -t myRegistry.com/myImage:1.1
docker push myRegistry.com/myImage:1.1
docker build . -t myRegistry.com/myImage:1.2
docker push myRegistry.com/myImage:1.2
docker build . -t myRegistry.com/myImage:1.3
docker push myRegistry.com/myImage:1.3
In your sample case, the container registry will use the same image, which is identified by the image's sha256 value (also known as the IMAGE ID); the tag is simply an alias for that unique image.
It's a one-to-many relationship, i.e., you can have many tags point to the same image. You can use docker images --no-trunc to see the full value of the IMAGE ID. (Note this is useful if you have consistency issues using common tags like "latest" or "develop" since you can't be sure which image it actually is unless you use the sha256 value.)
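To see this one-to-many relationship on your own machine, a sketch like the following groups tags by image ID. The function name shared_image_ids is hypothetical; it expects "REPOSITORY:TAG IMAGE_ID" pairs on standard input, which the docker images --format call in the usage comment is assumed to produce.

```shell
#!/usr/bin/env sh
# Hypothetical helper: read "REPOSITORY:TAG IMAGE_ID" pairs on stdin and print
# each image ID that carries more than one tag, followed by those tags.
shared_image_ids() {
    awk '
        { tags[$2] = tags[$2] " " $1; count[$2]++ }
        END { for (id in count) if (count[id] > 1) print id tags[id] }
    ' | sort
}

# Usage:
#   docker images --no-trunc --format '{{.Repository}}:{{.Tag}} {{.ID}}' | shared_image_ids
```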
For builds on different machines/environments, using the same Dockerfile with the same files may result in the same hash, but it depends on many variables like how dynamic your dependencies are, if timestamps have changed, etc.
As @Henry mentioned, this further applies (largely behind the scenes) to individual layers of an image:
Docker images have intermediate layers that increase reusability,
decrease disk usage, and speed up docker build by allowing each step
to be cached. These intermediate layers are not shown by default.
see docs
Btw, to see an image's sha256 digest, so you can tell exactly which image a tag refers to, you can inspect it, e.g., docker inspect --format='{{index .RepoDigests 0}}' mongo:3.4-jessie

How to make a Docker image a single layer

Docker images are created with multiple layers. I want to convert an image to a single layer. Is there any docker build command to achieve this? I googled for it but can't find anything.
There is no command to achieve that, and a single-layer image goes against Docker's design. The "Understand images, containers, and storage drivers" doc describes why a Docker image has multiple layers. In short, image layers are one of the reasons Docker is so lightweight. When you change a Docker image, such as when you update an application to a new version, a new layer is built and replaces only the layer it updates. Besides, even if your image has only one layer, when you create a container from that image, Docker will still add a thin read/write container layer on top of your image layer.
If you just want to move your image around and think a single layer would make that easier, you should probably try the docker save command to create a tar file of it.
Or, if you have more complicated requirements, you may need to use a VM image rather than a Docker image.
I worked around this by using a multi-stage build (the last stage is just a COPY from the previous stage):
FROM alpine as build1
RUN echo "The 1st Build"
FROM scratch
COPY --from=build1 / /
First option:
# docker image build -t <your-image> .
# docker run <your-image>
# docker container export <container-id created by the previous command> -o myimage.tar
# docker image import myimage.tar
The imported image will be a single-layer filesystem image.
Second option (not a complete solution): use multi-stage builds to reduce the number of image layers.
During the build, we can also pass the --squash option to squash the newly built layers into one. From the docs:
--squash: Experimental (daemon), API 1.25+. Squash newly built layers into a single new layer.
https://docs.docker.com/engine/reference/commandline/image_build/
Flattening a Docker Image to a Single Layer:
docker run -d --name flat_container nginx
docker export flat_container > flat.tar
cat flat.tar | docker import - flat:latest
docker image history flat

Docker save only non public layers

I can export images with
docker save -o <save image to path> <image name>
but this packs all layers, and the file is big.
Is it possible to pack only the layers that are not publicly available, so that only the difference from the last public layer is exported?
You can try undocker. The tool can extract all or part of the layers of a Docker image onto the local filesystem. You can extract one or more specific layers:
$ docker save busybox |
undocker -vi -o busybox -l ea13149945cb6b1e746bf28032f02e9b5a793523481a0a18645fc77ad53c4ea2
INFO:undocker:extracting image busybox (4986bf8c15363d1c5d15512d5266f8777bfba4974ac56e3270e7760f6f0a8125)
INFO:undocker:extracting layer ea13149945cb6b1e746bf28032f02e9b5a793523481a0a18645fc77ad53c4ea2
Of course, it doesn't automatically sort out publicly available layers, but this is something you can start with, here is the tool intro article by original author.
The docker-save-last-layer command line utility combined with docker build --squash is made to accomplish exactly this.
It exports only the last layer of the specified docker image.
It works by using a patched version of the docker daemon inside a docker image that can access the images on your host machine. So it doesn't require doing a full docker save before using it like the undocker answer. This makes it much more performant for large base images.
Typical usage is simple and looks like:
pip install d-save-last
docker build -t myimage --squash .
d-save-last myimage -o ./myimage.tar

Docker build command with --tag unable to tag images

I am trying to build a Docker image using a Dockerfile available locally.
docker build -t newimage .
I have used this command many times before, but somehow it's not working now, and I am stuck finding the reason.
It would be really helpful if someone could suggest a possible solution or an area to look into.
I have already looked at other posts that could be related, for example:
Docker build tag repository name
Okay! I found out the reason for the issue.
DOCKER BUILD PROCESS
When we build a Docker image, several intermediate images are generated in the process. We never see them in docker images because, when the next intermediate image is generated, the earlier one is removed. In the end we have only one, which is the final image.
The tag we provide using -t or --tag is for the final build, and obviously no intermediate image is tagged with it.
ISSUE EXPLANATION
When we try to build a Docker image from a Dockerfile, sometimes the process does not complete successfully, i.e. it does not end with a message like Successfully built IMAGEID.
Obviously, a build that has failed will not be listed in docker images.
Now, the image with tag <none> is some other (intermediate) image. This creates the confusion that the image exists but without a tag. Actually, that image is not what the final build should be, hence it is not tagged.
If your Dockerfile's last line is RUN then it may hang on that during build.
I changed RUN npm start to CMD ["npm", "start"] and it's tagging now.
There's nothing wrong with Docker.
An image can have multiple tags:
alpine 3.4 4e38e38c8ce0 6 weeks ago 4.799 MB
alpine latest 4e38e38c8ce0 6 weeks ago 4.799 MB
In this example the image with id 4e38e38c8ce0 is tagged alpine:latest and alpine:3.4. If you were to execute docker build -t alpine . the latest tag would be removed from the image 4e38e38c8ce0 and assigned to the newly built image (which has a different id).
If you remove the last tag from an image, the image isn't deleted automatically. It shows up as <none>.
Docker also uses a cache. So if you build an image from a Dockerfile, change that file, build the image again, and then undo the change and build again, you will have two images: the images you built in the first and last steps are the same. The second image will be "tagged" <none>.
If you want to keep multiple versions of an image, use docker build -t image:tag . where tag is changed every time you make some changes.
Edit:
What I called tag is actually the image name and what I called version is called the tag: https://docs.docker.com/engine/reference/commandline/tag/
One workaround for this case is to run the command below immediately after the build command, provided all other images have already been tagged.
docker tag $(docker image ls | grep "<none>" | awk '{print $3}') newimage:latest
As far as I know, this will only work on Linux.
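One weakness of the command above is that grep may match several <none> rows, while docker tag accepts only a single source image. A slightly more defensive sketch picks just the first untagged row. The function name first_untagged_id is hypothetical, and the sketch assumes the default column layout of docker image ls (REPOSITORY, TAG, IMAGE ID).

```shell
#!/usr/bin/env sh
# Hypothetical helper: print the ID of the first row in `docker image ls`
# output (read from stdin) whose REPOSITORY and TAG are both "<none>".
first_untagged_id() {
    awk '$1 == "<none>" && $2 == "<none>" { print $3; exit }'
}

# Usage:
#   id=$(docker image ls | first_untagged_id)
#   [ -n "$id" ] && docker tag "$id" newimage:latest
```

Whether the first untagged row is really the image you just built is still something to verify by hand before tagging it.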

Resources