Docker registry space when pushing two images built from the same Dockerfile

What happens to storage space on the Docker registry server when an image is rebuilt from the same Dockerfile? For example, in the case below, if I push an image with tag 1.0, then build another image from the same Dockerfile and push it with tag 1.1, is it going to take any additional space on the registry?
docker build . -t myRegistry.com/myImage:1.0
docker push myRegistry.com/myImage:1.0
docker build . -t myRegistry.com/myImage:1.1
docker push myRegistry.com/myImage:1.1
docker build . -t myRegistry.com/myImage:1.2
docker push myRegistry.com/myImage:1.2
docker build . -t myRegistry.com/myImage:1.3
docker push myRegistry.com/myImage:1.3

In your sample case, the container registry will reuse the same image, which is identified by its sha256 value (also known as the IMAGE ID); the tag is simply an alias for that unique image.
It's a one-to-many relationship, i.e., you can have many tags point to the same image. You can use docker images --no-trunc to see the full value of the IMAGE ID. (Note this is useful if you have consistency issues using common tags like "latest" or "develop" since you can't be sure which image it actually is unless you use the sha256 value.)
For builds on different machines/environments, using the same Dockerfile with the same files may result in the same hash, but it depends on many variables like how dynamic your dependencies are, if timestamps have changed, etc.
As @Henry mentioned, this further applies (largely behind the scenes) to individual layers of an image:
Docker images have intermediate layers that increase reusability,
decrease disk usage, and speed up docker build by allowing each step
to be cached. These intermediate layers are not shown by default.
see docs
By the way, to find the sha256 digest of the image a container came from, you can inspect the image, e.g., docker inspect --format='{{index .RepoDigests 0}}' mongo:3.4-jessie
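To make the one-to-many relationship between an image and its tags visible, here is a minimal sketch that groups tags by image ID. The helper name tags_by_image_id is made up for illustration, and it assumes input lines of the form "<image-id> <repo:tag>".

```shell
#!/usr/bin/env sh
# Group tags by image ID. Reads "<image-id> <repo:tag>" lines on stdin.
# tags_by_image_id is a hypothetical helper name, not a Docker command.
tags_by_image_id() {
    awk '
        { if ($1 in tags) tags[$1] = tags[$1] "," $2; else tags[$1] = $2 }
        END { for (id in tags) print id, tags[id] }
    ' | sort
}
```

One way to feed it is docker images --no-trunc --format '{{.ID}} {{.Repository}}:{{.Tag}}' | tags_by_image_id. If the 1.0 and 1.1 builds produced the same image, both tags should appear on a single line next to one ID.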

Related

Dockerfile FROM command - Does it always download from Docker Hub?

I just started working with Docker this week and came across a Dockerfile. I was reading up on what this file does, and the official documentation basically mentions that the FROM keyword is needed to specify a "base image". These base images are pulled (downloaded) from Docker Hub.
Silly question - Are base images always pulled from docker hub?
If so, and if I understand correctly, I assume that building an image from the Dockerfile is not done very often (only when an image needs to be created), and once the image is created, the image is what's run all the time?
So the Dockerfile can then be moved to whichever environment and things can be set up all over again quickly?
Pardon the silly question; I am just trying to understand the overall flow and how the Dockerfile fits into things.
If the local (on your host) Docker daemon (already) has a copy of the container image (i.e. it's been docker pull'd) specified by FROM in a Dockerfile then it's cached and won't be repulled.
Container images include a tag (be wary of ever using latest). The image name, e.g. foo, combined with the tag (which defaults to latest if not specified) is the full name of the image that's checked. For example, if you have foo:v0.0.1 locally and the Dockerfile says FROM foo:v0.0.1, then the local copy is used, but FROM foo:v0.0.2 will pull foo:v0.0.2.
There's an implicit docker.io prefix i.e. docker.io/foo:v0.0.1 that references the Docker registry that's being used.
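As a rough illustration of how the implicit parts of an image name get filled in, the sketch below expands a short reference into registry, name, and tag. It is a simplified approximation, not Docker's actual parser: the function name normalize_image_ref is hypothetical, and digest references (name@sha256:...) are not handled. It also applies the library/ namespace that Docker Hub uses for official images such as foo.

```shell
#!/usr/bin/env sh
# Expand an image reference into "<registry>/<name>:<tag>".
# Hypothetical helper; a simplified approximation of Docker's naming rules.
normalize_image_ref() {
    ref=$1
    first=${ref%%/*}
    case $ref in
        */*)
            # The first path component is a registry host if it contains
            # a dot or a port, or is exactly "localhost".
            case $first in
                *.*|*:*|localhost) registry=$first; rest=${ref#*/} ;;
                *) registry=docker.io; rest=$ref ;;
            esac ;;
        *) registry=docker.io; rest=$ref ;;
    esac
    # The tag is whatever follows a ':' in the last path component.
    last=${rest##*/}
    case $last in
        *:*) tag=${last##*:}; name=${rest%:*} ;;
        *) tag=latest; name=$rest ;;
    esac
    # Official images on Docker Hub live under the "library/" namespace.
    if [ "$registry" = docker.io ]; then
        case $name in
            */*) ;;
            *) name=library/$name ;;
        esac
    fi
    printf '%s/%s:%s\n' "$registry" "$name" "$tag"
}
```

For instance, normalize_image_ref foo should print docker.io/library/foo:latest, while a reference that already names a registry host is returned unchanged.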
You could repeatedly docker build container images on the machines where the container is run but this is inefficient and the more common mechanism is that, once a container image is built, it is pushed to a registry (e.g. DockerHub) and then pulled from there by whatever machines need it.
There are many container registries: DockerHub, Google Artifact Registry, Quay etc.
There are tools other than docker that can be used to interact with containers e.g. (Red Hat's) Podman.

Check if local docker image latest

In my use case I always fetch the image tagged with "latest" tag. This "latest" tag gets updated regularly. So even if the latest tag image is updated on registry, the "docker run" command does not update it on local host. This is expected behavior as the "latest" image exists on local host.
But I want to make sure that if the "latest" image on local host and registry are different then it should pull the latest image from registry.
Is there any way to achieve this?
You can manually docker pull the image before you run it. This is fairly inexpensive, especially if the image hasn't changed. You can do it while the old container is still running to minimize downtime.
docker pull the-image
docker stop the-container
docker rm the-container
docker run -d ... --name the-container the-image
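If you only want to restart the container when the pull actually brought in a different image, one possible approach is to compare the local image ID before and after the pull. This is a sketch under that assumption; pull_if_newer is a hypothetical helper name.

```shell
#!/usr/bin/env sh
# Pull IMAGE and report whether the local image ID changed.
# Returns 0 if a different image was pulled, 1 if nothing changed,
# and 2 if the pull itself failed. Hypothetical helper name.
pull_if_newer() {
    image=$1
    # The image may not exist locally yet, so tolerate a failed inspect.
    before=$(docker image inspect --format '{{.Id}}' "$image" 2>/dev/null || true)
    docker pull "$image" >/dev/null || return 2
    after=$(docker image inspect --format '{{.Id}}' "$image")
    [ "$before" != "$after" ]
}
```

A caller could then write pull_if_newer the-image && followed by the stop, rm, and run commands shown above, so the running container is left alone when the registry copy has not changed.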
In an automated environment you might consider avoiding the latest tag and other similar fixed strings due to exactly this ambiguity. In Kubernetes, for example, the default behavior is to reuse a local image that has the same name, which can result in different nodes running different latest images. If you label your images with a date stamp or source-control ID or something else such that every image has a unique tag, you can just use that tag.
Finding the tag value can be problematic outside the context of a continuous-deployment system; Docker doesn't have any built-in way to find the most recent tag for an image.
# docker pull the-image:20220704 # optional
docker stop the-container
docker rm the-container
docker run -d ... --name the-container the-image:20220704
docker rmi the-image:20220630
One notable advantage of this last approach is that it's very easy to go back to an earlier build if today's build happens to be broken; just switch the image tag back a build or two.
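The unique-tag workflow, including the rollback, can be wrapped in a small function. This is only a sketch: redeploy is a hypothetical name, and the extra docker run options that the commands above elide with "..." are left out here and would need to be added for a real service.

```shell
#!/usr/bin/env sh
# Replace the running container with one started from IMAGE:TAG.
# Hypothetical helper; container name, image name, and tag come from the caller.
redeploy() {
    container=$1 image=$2 tag=$3
    docker pull "$image:$tag" || return 1
    # Ignore failures here: the container may not exist on a first deploy.
    docker stop "$container" >/dev/null 2>&1 || true
    docker rm "$container" >/dev/null 2>&1 || true
    docker run -d --name "$container" "$image:$tag"
}
```

With this, redeploy the-container the-image 20220704 rolls forward, and calling it again with 20220630 goes back to the earlier build, provided that image has not been removed in the meantime.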

How to find out the base image for a docker image

I have a Docker image and I would like to find out which image it was created from. Of course there are multiple layers, but I'd like to find the last image (the FROM statement in the Dockerfile for this image).
I tried to use docker image history and docker image inspect, but I can't find this information in there.
I tried to use the following command, but it gives me an error message:
alias dfimage="sudo docker run -v /var/run/docker.sock:/var/run/docker.sock --rm xyz/mm:9e945ff"
dfimage febae8978318
This is the error message I'm getting
container_linux.go:235: starting container process caused "exec: \"febae8978318\": executable file not found in $PATH"
/usr/bin/docker-current: Error response from daemon: oci runtime error: container_linux.go:235: starting container process caused "exec: \"febae8978318\": executable file not found in $PATH".
Easy way is to use
docker image history deno
This command prints the layer history of the image.
Then look at the IMAGE column and take the image ID listed just above the first <missing> entry (a24bb4013296 in this example).
Then search your local images for that ID:
For Linux
docker image ls | grep a24bb4013296
For Windows
docker image ls | findstr a24bb4013296
This will give you the base image name
The information doesn't really exist, exactly. An image will contain the layers of its parent(s) but there's no easy way to reverse layer digests back to a FROM statement, unless you happen to have (or are able to figure out) the image that contains those layers.
If you have the parent image(s) on-hand (or can find them), you can infer which image(s) your image used for its FROM statement (or ancestry) by cross-referencing the layers.
Theoretical example
Suppose your image, FOO, contains the layers 1 2 3 4 5 6. If you have another image, BAR on your system containing layers 1 2 3, you could infer that image BAR is an ancestor of image FOO -- I.E. that FROM BAR would have been used at some point in its hierarchy.
Suppose further that you have another image, BAZ which contains the layers 1 2 3 4 5. You could infer that image BAZ has image BAR in its ancestry and that image FOO inherits from image BAZ (and therefore indirectly from BAR).
From this information, you could infer that the Dockerfiles for these images might have looked something like this:
# Dockerfile of image BAR
FROM scratch
# layers 1 2 and 3
COPY ./one /
COPY ./two /
COPY ./three /
# Dockerfile of Image BAZ
FROM BAR
RUN echo "this makes layer 4" > /four
RUN echo "this makes layer 5" > /five
# Dockerfile of image FOO
FROM BAZ
RUN echo "this makes layer 6" > /six
You could get the exact commands by looking at docker image history for each image.
One important thing to keep in mind here, however, is that docker tags are mutable; maintainers make new images and move the tags to those images. So if you built an image with FROM python:3.8.1 today, it won't contain the same layers as if you had built an image with that same FROM line a few weeks ago. You'll need the SHA256 digest to be sure you're using the exact same image.
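The layer comparison from the theoretical example can be sketched in a few lines of shell. Because layers are stacked in order, this sketch checks for a strict prefix, which is a slightly stronger condition than the subset described above. The function name is_strict_layer_prefix is hypothetical, and the files simply hold the placeholder layer names 1 to 6 from the example rather than real digests.

```shell
#!/usr/bin/env sh
# Succeed if the layer list in file $1 is a strict prefix of the list in file $2.
# Each file holds one layer digest per line, base layer first.
is_strict_layer_prefix() {
    n_base=$(($(wc -l < "$1")))
    n_child=$(($(wc -l < "$2")))
    [ "$n_base" -gt 0 ] && [ "$n_base" -lt "$n_child" ] || return 1
    head -n "$n_base" "$2" | cmp -s - "$1"
}

# Demo with the BAR, BAZ, and FOO images from the example above.
tmp=$(mktemp -d)
printf '%s\n' 1 2 3       > "$tmp/BAR"
printf '%s\n' 1 2 3 4 5   > "$tmp/BAZ"
printf '%s\n' 1 2 3 4 5 6 > "$tmp/FOO"
is_strict_layer_prefix "$tmp/BAR" "$tmp/FOO" && echo "BAR is an ancestor of FOO"
is_strict_layer_prefix "$tmp/BAZ" "$tmp/FOO" && echo "BAZ is an ancestor of FOO"
is_strict_layer_prefix "$tmp/FOO" "$tmp/BAZ" || echo "FOO is not an ancestor of BAZ"
rm -rf "$tmp"
```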
Practical Example, local images
Now that we understand the theory behind identifying images and their bases, let's put it to practice with a real-world example.
Note: because the tags I use will change over time (see above RE: tag mutability), I'll be using the SHA256 digest to pull the images in this example so it can be reproduced by viewers of this answer.
Let's say we have a particular image and we want to find its base(s). We'll use the official maven image here.
First, we'll take a look at its layers.
# maven:3.6-jdk-11-slim at time of writing, on my platform
IMAGE="docker.io/maven@sha256:55f1c145a04e01706233d68fe0b6b20bf76f765ab32f3fe6e29c8ef933917af6"
docker pull $IMAGE
docker image inspect $IMAGE | jq -r '.[].RootFS.Layers[]'
This will output the layers:
sha256:6e06900bc10223217b4c78081a857866f674c462e4f90593b01894da56df336d
sha256:eda2f4da9b1e70500ac340d40ee039ef3877e8be13b9a24cd345406bf6693412
sha256:6bdb7b3c3e226bdfaa911ba72a95fca13c3979cd150061d570cf569e93037ce6
sha256:ce217e530345060ca0973807a3288560e1e15cf1a4eeec44d6aa594a926c92dc
sha256:f256c980a7d17a00f57fd42a19f6323fcc2341fa46eba128def04824cafa5afa
sha256:446b1af848de2dcb92bbd229ca6ecaabf2f48dab323c19f90d02622e09a8fa67
sha256:10652cf89eaeb5b5d8e0875a6b1867b5cf92c509a9555d3f57d87fab605115a3
sha256:d9a4cf86bf01eb170242ca3b0ce456159fd3fddc9c4d4256208a9d19bae096ca
Now, from here, we can try to find other images that have a (strict) subset of these layers. Assuming you have the images on-hand, you can find them by cross-referencing the layers of images you have on disk, for example, using docker image inspect.
In this case, I just happen to know what these images are and have them on-hand (I'll discuss later what you might do if you don't have the images on-hand) so we'll go ahead and pull those images and take a look at the layers.
If you want to follow along:
# openjdk:11.0.10-jdk-slim at time of writing, on my platform
OPENJDK='docker.io/openjdk@sha256:fe6a46a26ff7d6c31b258e07b3d53f0c42fe68f55f646cc39d60d0b17cbc827b'
# debian:buster-20210329-slim at time of writing on my platform
DEBIAN='docker.io/debian@sha256:088be7d6017ad3ae98325f47707112e1f61687c371be1865e55d5e5531ca97fd'
docker pull $OPENJDK
docker pull $DEBIAN
If we inspect these images and compare them against the layers we saw in the output of docker image inspect for the maven image, we can confirm that the layers from openjdk and debian are present in our original maven image.
$ docker image inspect $DEBIAN | jq -r '.[].RootFS.Layers[]'
sha256:6e06900bc10223217b4c78081a857866f674c462e4f90593b01894da56df336d
$ docker image inspect $OPENJDK | jq -r '.[].RootFS.Layers[]'
sha256:6e06900bc10223217b4c78081a857866f674c462e4f90593b01894da56df336d
sha256:eda2f4da9b1e70500ac340d40ee039ef3877e8be13b9a24cd345406bf6693412
sha256:6bdb7b3c3e226bdfaa911ba72a95fca13c3979cd150061d570cf569e93037ce6
sha256:ce217e530345060ca0973807a3288560e1e15cf1a4eeec44d6aa594a926c92dc
As stated, because the layers of the debian and openjdk images form a strict subset of the 8 layers of the maven image (they are its first layers, in the same order), we can conclude that the openjdk and debian images are, at least, both in the ancestry path of the maven image.
We can further infer that the remaining layers, those after the last openjdk layer, most likely come from the maven image itself (or, potentially, some unknown image).
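If you would rather not compare the digests by eye, the cross-referencing can be automated for whatever images happen to be on disk. The following is a rough sketch: layers_of and find_local_ancestors are hypothetical names, and it uses a Go template instead of jq to list the layers.

```shell
#!/usr/bin/env sh
# Print the layer digests of an image, one per line, base layer first.
layers_of() {
    docker image inspect --format '{{range .RootFS.Layers}}{{println .}}{{end}}' "$1"
}

# List local images whose layers are a strict prefix of TARGET's layers.
# Both function names are hypothetical helpers, not Docker commands.
find_local_ancestors() {
    nl='
'
    target_layers=$(layers_of "$1") || return 1
    docker image ls --quiet --no-trunc | sort -u | while read -r id; do
        candidate=$(layers_of "$id") || continue
        [ -n "$candidate" ] || continue
        # An identical layer list is the target itself (or a retag of it).
        [ "$candidate" = "$target_layers" ] && continue
        case "$target_layers" in
            "$candidate$nl"*) docker image inspect --format '{{.Id}} {{.RepoTags}}' "$id" ;;
        esac
    done
}
```

Running find_local_ancestors "$IMAGE" after pulling the three images above should list the debian and openjdk images, since their layers are the leading layers of the maven image.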
Caveats, when you don't have images locally
Now, of course the above only works because I happen to have all the images on-hand. So, you'd either need to have the images or be able to locate them by the layer digests.
You might still be able to figure this out using information that may be available from registries like Docker Hub or your own private repositories.
For official images, the docker-library/repo-info contains historical information about the official images, including the layer digests for the various tags cataloged over the last several years. You could use this, for example, as a source of layer information.
If you can imagine this like a database of layer digests, you could infer ancestry of at least these official images.
"Distribution" (remote) digests vs "Content" (local) digests
An important caveat to note is that, when you inspect an image for its layer digests locally, you get the content digests of the layers, computed over the uncompressed layer data. If you look at layer digests in a registry manifest (like what appears in the docker-library/repo-info project), you get the distribution digests, computed over the compressed layers, so you won't be able to compare them with the local content digests.
So you can compare digests local <--> local OR remote <--> remote only.
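The reason the two kinds of digest never match is simply that they are hashes of different bytes. The sketch below shows this with a stand-in file; it is not a real layer tarball, and it assumes the sha256sum and gzip tools are available (as on most Linux systems).

```shell
#!/usr/bin/env sh
# Stand-in for a layer: a real layer would be a tar archive.
layer=$(mktemp)
printf 'pretend this is a layer tarball\n' > "$layer"

# Content digest: hash of the uncompressed bytes.
content_digest=$(sha256sum < "$layer" | cut -d' ' -f1)
# Distribution digest: hash of the compressed bytes, as stored in a registry.
distribution_digest=$(gzip -n -c "$layer" | sha256sum | cut -d' ' -f1)

echo "content:      sha256:$content_digest"
echo "distribution: sha256:$distribution_digest"
rm -f "$layer"
```

The two printed values differ even though they describe the same data, which is why local digests can only be matched against other local digests.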
Example, using remote images
Suppose I want to do this same thing, but I want to associate images in a remote repository and find its base(s). We can do the same thing by looking at the layers in the remote manifest.
You can find references how to do this for your particular registry, as described in this answer for dockerhub.
Using the same images from the example above, we would find that the distribution layer digests also match in the same way.
$ get-remote-layers $IMAGE
sha256:6fcf2156bc23db75595b822b865fbc962ed6f4521dec8cae509e66742a6a5ad3
sha256:96fde6667c188c81fcddee021ccbb3e054ebe83350fd4609e17a3d37f0ec7f9d
sha256:74d17759dd2a1b51afc740fadd96f655260689a2087308e40d1865a0098c5fae
sha256:bbe8ebb5d0a64d265558901c7c6c66e1d09f664da57cdb2e5f69ba52a7109d31
sha256:b2edaadd7dd62cfe7f551b902244ee67b84bc5c0b6538b9480ac9ca97a0a4986
sha256:0fca65d33e353bdfdd5edd8d4c8ab5efde52c078bd25e2dcf454f995e5420725
sha256:d6d771d0512387eee1e419a965b929a9a3b0365cf1935b3719d60bf9feffcf63
sha256:dee8cd26669373102db07820072127c46bbfdad340a586ee9dfe60ae933eac2b
$ get-remote-layers $DEBIAN
sha256:6fcf2156bc23db75595b822b865fbc962ed6f4521dec8cae509e66742a6a5ad3
$ get-remote-layers $OPENJDK
sha256:6fcf2156bc23db75595b822b865fbc962ed6f4521dec8cae509e66742a6a5ad3
sha256:96fde6667c188c81fcddee021ccbb3e054ebe83350fd4609e17a3d37f0ec7f9d
sha256:74d17759dd2a1b51afc740fadd96f655260689a2087308e40d1865a0098c5fae
sha256:bbe8ebb5d0a64d265558901c7c6c66e1d09f664da57cdb2e5f69ba52a7109d31
One other caveat with distribution digests in repositories is that you can only compare digests of the same manifest schema version. So, if an image was pushed with manifest v1, it won't have the same digest when pushed again with manifest v2.
TL;DR
Images contain the layers of their ancestor image(s). Therefore, if the layers of image A are a strict subset of the layers of image B, you know that image B is a descendant of image A.
You can use this property of Docker images to determine the base images from which your images were derived.
You can use method suggested in this answer:
https://stackoverflow.com/a/53841690/3691891
First, pull chenzj/dfimage:
docker pull chenzj/dfimage
Get ID of your image:
docker images | grep <IMAGE_NAME> | awk '{print $3}'
Replace <IMAGE_NAME> with the name of your image. Use this ID as
the parameter to chenzj/dfimage:
docker run -v /var/run/docker.sock:/var/run/docker.sock --rm chenzj/dfimage <IMAGE_ID>
If you find this too hard just pull the chenzj/dfimage image and then
use the following docker-get-dockerfile.sh script:
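#!/usr/bin/env sh
# Usage: ./docker-get-dockerfile.sh <IMAGE_NAME>
if [ "$#" -lt 1 ]
then
    printf "Image name needed\n" >&2
    exit 1
fi
# Look up the image ID (third column of `docker images`) by repository name.
image_id="$(docker images | grep "^$1 " | awk '{print $3}')"
if [ -z "$image_id" ]
then
    printf "Image not found\n" >&2
    exit 2
fi
# Let chenzj/dfimage reconstruct the Dockerfile through the Docker socket.
docker run -v /var/run/docker.sock:/var/run/docker.sock --rm chenzj/dfimage "$image_id"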
You need to pass image name as the parameter. Example usage:
$ ./docker-get-dockerfile.sh alpine
FROM alpine:latest
ADD file:fe64057fbb83dccb960efabbf1cd8777920ef279a7fa8dbca0a8801c651bdf7c in /
CMD ["/bin/sh"]
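docker run --rm image:tag sh -c 'cat /etc/*release*'
Run a container from the image with the command above (replace image:tag with your image name and tag). The glob is wrapped in sh -c so that it is expanded inside the container rather than by your local shell. The container prints the OS release information, which tells you the distribution the image is based on, though not the exact base image.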

Docker create independent image

Hi, I built a small Docker image on top of the debian:jessie image from Docker Hub.
First I got debian:jessie from Docker Hub:
docker pull debian:jessie
Then I started this image with a bash shell:
docker run -it debian:jessie
Then I installed my stuff, e.g. an SSH server, and configured it.
Next, from a second shell, I committed the changes:
docker commit <running container id> debian-sshd
Now I have two images:
debian:jessie and debian-sshd
If I now want to delete debian:jessie, Docker tells me I can't delete it because it has child images (debian-sshd).
Is there a way I can make debian-sshd an independent image?
Most Dockerfiles start from a parent image. If you need to completely
control the contents of your image, you might need to create a base
image instead. Here’s the difference:
A parent image is the image that your image is based on. It refers to the contents of the FROM directive in the Dockerfile. Each
subsequent declaration in the Dockerfile modifies this parent image.
Most Dockerfiles start from a parent image, rather than a base image.
However, the terms are sometimes used interchangeably.
A base image either has no FROM line in its Dockerfile, or has FROM scratch.
Having quoted from docs, I would say that images are made up of layers, and since you have based your image on debian:jessie, one of the layers of debian-sshd is the debian:jessie image. If you want your independent image, build from scratch.
Other than that, many public Docker images publish their Dockerfile, so you can often browse it and modify it to suit your needs. Also, you could build from scratch if you want your own base image.
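A different technique from the one described above, offered here only as a hedged sketch, is to flatten the container's filesystem into a new single-layer image with docker export and docker import. The resulting image shares no layers with debian:jessie. Note that image metadata such as CMD, ENV, and EXPOSE is not carried over. The function name flatten_container and the image name used in the usage note are assumptions for illustration.

```shell
#!/usr/bin/env sh
# Flatten a container's filesystem into a new single-layer image.
# Hypothetical helper; image metadata (CMD, ENV, EXPOSE) is not preserved.
flatten_container() {
    container=$1 new_image=$2
    docker export "$container" | docker import - "$new_image"
}
```

For example, flatten_container <running container id> debian-sshd-flat would create an image named debian-sshd-flat, after which the original debian-sshd and debian:jessie images could be removed if nothing else uses them.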

Is there any way to pull an image from private registry and cut URL?

I have some private Docker registry: http://some-registry-somewhere.com:5000.
When I need to run my compose configuration, I need to pull a target image.
$ docker pull some-registry-somewhere.com:5000/target/image:tag1
In the docker-compose.yml file, I have to use the same full URL path, because the pulled image is named some-registry-somewhere.com:5000/target/image:tag1.
To refer to the image by its name only, we can tag it:
$ docker tag some-registry-somewhere.com:5000/target/image:tag1 target/image:tag1
But is there any way to automatically strip the registry URL through Docker?
There is no such way, because of the API specification. The image name is not just a label; it also tells the Docker engine which registry to use for pushes and pulls of this image.
The first name, some-registry-somewhere.com:5000/target/image:tag1, is the image target/image:tag1 located in the registry some-registry-somewhere.com:5000.
The second name, target/image:tag1, is in other words the image docker.io/target/image:tag1, located in the official registry (Docker Hub).
In most cases these can be two different images.
One way to do it, which is not actually a good one because it can be confusing (see the point about registries above), is to chain the two commands with &&:
docker pull some-registry-somewhere.com:5000/target/image:tag1 && docker tag some-registry-somewhere.com:5000/target/image:tag1 target/image:tag1
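If you do go down that route, the pull-and-tag pair can be wrapped so the short name is derived rather than typed twice. This is a sketch with hypothetical helper names (strip_registry, pull_and_retag), and the same caveat applies: the short name will then look like a Docker Hub image.

```shell
#!/usr/bin/env sh
# Strip a leading registry host from an image reference.
# The first path component counts as a registry host if it contains
# a dot or a port, or is exactly "localhost". Hypothetical helper.
strip_registry() {
    ref=$1
    first=${ref%%/*}
    case $ref in
        */*)
            case $first in
                *.*|*:*|localhost) printf '%s\n' "${ref#*/}"; return ;;
            esac ;;
    esac
    printf '%s\n' "$ref"
}

# Pull from the private registry, then add a short local alias.
pull_and_retag() {
    docker pull "$1" && docker tag "$1" "$(strip_registry "$1")"
}
```

Calling pull_and_retag some-registry-somewhere.com:5000/target/image:tag1 would leave both the full name and target/image:tag1 pointing at the same local image.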
