Is the sha256 hash of a docker image different between repositories? - docker

I have just copied a docker image from one repository to another, by pulling an explicit sha256 hash tag from our OpenShift 3.11 external repository, retagging it to our Harbor 1.9.2 repository and pushing that tag.
In that process the sha256 key for the new image was shown, and it was different to the sha256 key I started out with. This was unexpected as I did not change anything with the image, except assigning it another tag, so the bytes should be the same giving the same hash.
Does this mean that the algorithms for some reason are different? That the repository name is included in the hash key calculation? Or something else?

You are confusing the image id digest with the layer digests. If you docker inspect those images you will notice that the underlying layer digests will match exactly.
Each image in the registry gets an image id. Run docker images --digests --no-trunc and notice that you will see a digest column and an image id column and they are not the same. The digest column is the digest of the manifest and is shown as RepoDigests in docker inspect output. If the manifest contains the name and the tag, then the digest will be different as well.
Also try diff <(docker inspect image_id_1) <(docker inspect image_id_2) to see what is going on.
See this answer and this article for additional details.

Related

2 Docker images, different TAG have the same IMAGE ID

Why do 2 docker images with a different TAG have the same IMAGE ID?
My idea of IMAGE ID was that this is unique for the image/tag.
$ docker images | grep -e "airflow" -e "IMAGE"
REPOSITORY TAG IMAGE ID CREATED SIZE
bitnami/airflow 2 28660a21473c 2 weeks ago 2.12GB
bitnami/airflow 2.2.5 28660a21473c 2 weeks ago 2.12GB
The image id is based on a hash of the image config. That config uniquely identifies an image since it not only contains all settings of the image, but also hashes of the filesystem layers that make up the image.
Tags, both on docker and in registries (though they reference a slightly different thing) are pointers to that unique digest. Multiple pointers can be made to the same digest, and one pointer can be modified to point to a new digest.

Searching docker hub registry images/layers by their SHA digest

If you ever attentively browse for docker images on https://hub.docker.com you may have once dissect all the commands composing an image of interest within a certain tag.
Great, but then you may have seen this kind of "translated" command when you click on a specific line of a command:
I may be wrong here because I'm not a Docker expert, but this seems to be an SHA-256 digest which refers to... something else inside the Hub.
My question is; how to find what exactly does it refer to, knowing the SHA value (3a7bff4e139bcacc5831fd70a035c130a91b5da001dd91c08b2acd635c7064e8)?
The SHA value you see is the digest of the file that was added.
For example, suppose you have the following dockerfile:
FROM scratch
ADD foo.txt /foo.txt
If you were to push that to dockerhub, you would see something like:
ADD file:b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c in /
where b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c is the digest of foo.txt.
There's no definitive way to reverse this information to obtain the file, considering ADD can do things like unpack tar archives.
With modern versions of Docker/buildkit, you might see the filename instead.
This is the same thing you see when using docker image history.

Using container image tags together with digests

Problem
It is well-known that digests are immutable while tags are mutable. And referring to digests is preferable in a number of cases where you need to guarantee that the same exact image should be used during the lifetime of the service. It is considered the best practice for production environments (see this and this).
The problem with the digests that while being immutable and useful they are not human-readable. And all the examples use either digest or tags. But practice shows that you can use both. It is just that tag is omitted.
For instance:
docker run --rm k8s.gcr.io/pause-amd64:3.1#sha256:59eec8837a4d942cc19a52b8c09ea75121acc38114a2c68b98983ce9356b8610
kubectl create deployment pause --image k8s.gcr.io/pause-amd64:3.1#sha256:59eec8837a4d942cc19a52b8c09ea75121acc38114a2c68b98983ce9356b8610
Question
is it safe to use the <image_path>:<tag>#sha256:<digest> pattern?
Edit
Seems like it's absolutely legit.
Yes, the tag is ignored when the digest is present.
You can even make up a tag and it will still pull the image by digest:
docker image pull k8s.gcr.io/pause-amd64:some-made-up-tag#sha256:59eec8837a4d942cc19a52b8c09ea75121acc38114a2c68b98983ce9356b8610
sha256:59eec8837a4d942cc19a52b8c09ea75121acc38114a2c68b98983ce9356b8610: Pulling from pause-amd64
Digest: sha256:59eec8837a4d942cc19a52b8c09ea75121acc38114a2c68b98983ce9356b8610
Status: Image is up to date for k8s.gcr.io/pause-amd64#sha256:59eec8837a4d942cc19a52b8c09ea75121acc38114a2c68b98983ce9356b8610

How can I check when was a docker image pulled?

I want to know when I pulled a certain image, when you run docker images The Created field appear but the date that the image was pulled don't.
If you installed docker-engine from official repositories on your linux, it should be installed in /var/lib/docker, for your own configuration, find the respective path.
There is /var/lib/docker/image/aufs/repositories.json file where docker stores images with their sha256 values.
cat /var/lib/docker/image/aufs/repositories.json
Find the image you are looking after and copy it's sha256 hash somewhere.
Then:
ls /var/lib/docker/image/aufs/imagedb/content/sha256 -lash
Find the sha265 value you found in repositories.json then look at the date.

Where can I find the sha256 code of a docker image?

I'd like to pull the images of CentOS, Tomcat, ... using their sha256 code, like in
docker pull myimage#sha256:0ecb2ad60
But I can't find the sha256-code to use anywhere.
I checked the DockerHub repository for any hint of the sha256-code, but couldn't find any. I downloaded the images by their tag
docker pull tomcat:7-jre8
and checked the image with docker inspect to see if there's a sha256 code in the metadata, but there is none (adding the sha256 code of the image would probably change the sha256 code).
Do I have to compute the sha256 code of an image myself and use that?
Latest answer
Edit suggested by OhJeez in the comments.
docker inspect --format='{{index .RepoDigests 0}}' $IMAGE
Original answer
I believe you can also get this using
docker inspect --format='{{.RepoDigests}}' $IMAGE
Works only in Docker 1.9 and if the image was originally pulled by the digest. Details are on the docker issue tracker.
You can get it by docker images --digests
REPOSITORY TAG DIGEST IMAGE ID CREATED SIZE
docker/ucp-agent 2.1.0 sha256:a428de44a9059f31a59237a5881c2d2cffa93757d99026156e4ea544577ab7f3 583407a61900 3 weeks ago 22.3 MB
Simplest and most concise way is:
docker images --no-trunc --quiet $IMAGE
This returns only the sha256:... string and nothing else.
e.g.:
$ docker images --no-trunc --quiet debian:stretch-slim
sha256:220611111e8c9bbe242e9dc1367c0fa89eef83f26203ee3f7c3764046e02b248
Edit:
NOTE: this only works for images that are local. You can docker pull $IMAGE first, if required.
Just saw it:
When I pull an image, the sha256 code is diplayed at the bottom of the output (Digest: sha....):
docker pull tomcat:7-jre8
7-jre8: Pulling from library/tomcat
902b87aaaec9: Already exists
9a61b6b1315e: Already exists
...
4dcef5c50d60: Already exists
Digest: sha256:c34ce3c1fcc0c7431e1392cc3abd0dfe2192ffea1898d5250f199d3ac8d8720f
Status: Image is up to date for tomcat:7-jre8
This sha code
sha256:c34ce3c1fcc0c7431e1392cc3abd0dfe2192ffea1898d5250f199d3ac8d8720f
can be used to pull the image afterwards with
docker pull tomcat#sha256:c34ce3c1fcc0c7431e1392cc3abd0dfe2192ffea1898d5250f199d3ac8d8720f
This way you can be sure that the image is not changed and can be safely used for production.
I found the above methods to not work in some cases. They either:
don't deal well with multiple images with the same hash (in the case of .RepoDigests suggestion - when you want to use a specific registry path)
don't work well when pushing the image to registries
(in the case of .Id where it's a local hash, not the hash in the
registry).
The below method is delicate, but works for extracting the specific full 'name' and hash for a specific pushed container.
Here's the scenario - An image is uploaded separately to 2 different projects in the same repo, so querying RepoDigests returns 2 results.
$ docker inspect --format='{{.RepoDigests}}' gcr.io/alpha/homeapp:latest
[gcr.io/alpha/homeapp#sha256:ce7395d681afeb6afd68e73a8044e4a965ede52cd0799de7f97198cca6ece7ed gcr.io/beta/homeapp#sha256:ce7395d681afeb6afd68e73a8044e4a965ede52cd0799de7f97198cca6ece7ed]
I want to use the alpha result, but I can't predict which index it will be. So I need to manipulate the text output to remove the brackets and get each entry on a separate line. From there I can easily grep the result.
$ docker inspect --format='{{.RepoDigests}}' gcr.io/alpha/homeapp:latest | sed 's:^.\(.*\).$:\1:' | tr " " "\n" | grep alpha
gcr.io/alpha/homeapp#sha256:ce7395d681afeb6afd68e73a8044e4a965ede52cd0799de7f97198cca6ece7ed
In addition to the existing answers, you can use the --digests option while doing docker images to get a list of digests for all the images you have.
docker images --digests
You can add a grep to drill down further
docker images --digests | grep tomcat
You can find it at the time of pulling the image from the respective repository. Below command mentions Digest: sha256 at the time of pulling the docker image.
09:33 AM##~::>docker --version
Docker version 19.03.4, build 9013bf5
Digest: sha256:6e9f67fa63b0323e9a1e587fd71c561ba48a034504fb804fd26fd8800039835d
09:28 AM##~::>docker pull ubuntu
Using default tag: latest
latest: Pulling from library/ubuntu
7ddbc47eeb70: Pull complete
c1bbdc448b72: Pull complete
8c3b70e39044: Pull complete
45d437916d57: Pull complete
**Digest: sha256:6e9f67fa63b0323e9a1e587fd71c561ba48a034504fb804fd26fd8800039835d**
Status: Downloaded newer image for ubuntu:latest
docker.io/library/ubuntu:latest
Once, the image is downloaded, we can do the following
"ubuntu#sha256:6e9f67fa63b0323e9a1e587fd71c561ba48a034504fb804fd26fd8800039835d"
09:36 AM##~::>docker inspect ubuntu | grep -i sha256
"Id": "sha256:775349758637aff77bf85e2ff0597e86e3e859183ef0baba8b3e8fc8d3cba51c",
**"ubuntu#sha256:6e9f67fa63b0323e9a1e587fd71c561ba48a034504fb804fd26fd8800039835d"**
"Image": "sha256:f0caea6f785de71fe8c8b1b276a7094151df6058aa3f22d2902fe6b51f1a7a8f",
"Image": "sha256:f0caea6f785de71fe8c8b1b276a7094151df6058aa3f22d2902fe6b51f1a7a8f",
"sha256:cc967c529ced563b7746b663d98248bc571afdb3c012019d7f54d6c092793b8b",
"sha256:2c6ac8e5063e35e91ab79dfb7330c6154b82f3a7e4724fb1b4475c0a95dfdd33",
"sha256:6c01b5a53aac53c66f02ea711295c7586061cbe083b110d54dafbeb6cf7636bf",
"sha256:e0b3afb09dc386786d49d6443bdfb20bc74d77dcf68e152db7e5bb36b1cca638"
This should have been the Id field, that you could see in the old deprecated Docker Hub API
GET /v1/repositories/foo/bar/images HTTP/1.1
Host: index.docker.io
Accept: application/json
Parameters:
namespace – the namespace for the repo
repo_name – the name for the repo
Example Response:
HTTP/1.1 200
Vary: Accept
Content-Type: application/json
[{"id": "9e89cc6f0bc3c38722009fe6857087b486531f9a779a0c17e3ed29dae8f12c4f",
"checksum": "b486531f9a779a0c17e3ed29dae8f12c4f9e89cc6f0bc3c38722009fe6857087"},
{"id": "ertwetewtwe38722009fe6857087b486531f9a779a0c1dfddgfgsdgdsgds",
"checksum": "34t23f23fc17e3ed29dae8f12c4f9e89cc6f0bsdfgfsdgdsgdsgerwgew"}]
BUT: this is not how it is working now with the new docker distribution.
See issue 628: "Get image ID with tag name"
The /v1/ registry response /repositories/<repo>/tags used to list the image ID along with the tag handle.
/v2/ only seems to give the handle.
It would be useful to get the ID to compare to the ID found locally. The only place I can find the ID is in the v1Compat section of the manifest (which is overkill for the info I want)
The current (mid 2015) answer is:
This property of the V1 API was very computationally expensive for the way images are stored on the backend. Only the tag names are enumerated to avoid a secondary lookup.
In addition, the V2 API does not deal in Image IDs. Rather, it uses digests to identify layers, which can be calculated as property of the layer and are independently verifiable.
As mentioned by #zelphir, using digests is not a good way since it doesn't exist for a local-only image. I assume the image ID sha is the most accurate and consistent across tags/pull/push etc.
docker inspect --format='{{index .Id}}' $IMAGE
Does the trick.
Just issue docker pull tomcat:7-jre8 again and you will get what you want.

Resources