Why are digests different depending on the registry?


AFAIK, an image digest is a hash of the image's manifest body.
When I pull the busybox image from Docker Hub and push it to my private registry, the digest changes.
$ docker pull busybox
...
Digest: sha256:2605a2c4875ce5eb27a9f7403263190cd1af31e48a2044d400320548356251c4
Status: Downloaded newer image for busybox:latest
$ docker tag busybox myregistry/busybox
$ docker push myregistry/busybox
...
08c2295a7fa5: Pushed
latest: digest: sha256:8573b4a813d7b90ef3876c6bec33db1272c02f0f90c406b25a5f9729169548ac size: 527
$ docker images --digests
myregistry/busybox latest sha256:8573b4a813d7b90ef3876c6bec33db1272c02f0f90c406b25a5f9729169548ac efe10ee6727f 2 weeks ago 1.13MB
busybox latest sha256:2605a2c4875ce5eb27a9f7403263190cd1af31e48a2044d400320548356251c4 efe10ee6727f 2 weeks ago 1.13MB
The images are not changed at all, and the image IDs are identical.
So why are the image digests different?
Updated:
Interestingly, the digest from another private registry is exactly the same as the digest from my private registry.
$ docker image inspect efe10ee6727f
...
"RepoDigests": [
"myregistry/busybox@sha256:8573b4a813d7b90ef3876c6bec33db1272c02f0f90c406b25a5f9729169548ac",
"busybox@sha256:2605a2c4875ce5eb27a9f7403263190cd1af31e48a2044d400320548356251c4",
"anotherregistry/busybox@sha256:8573b4a813d7b90ef3876c6bec33db1272c02f0f90c406b25a5f9729169548ac"
],

The digests you are looking at are registry digests, which are different from the image ID digest. A single image ID can have different registry references (and possibly different digests) for each place it has been pushed. You can see the two IDs in the inspect output:
$ docker inspect busybox --format 'Id: {{.Id}}
Repo Digest: {{index .RepoDigests 0}}'
Id: sha256:efe10ee6727fe52d2db2eb5045518fe98d8e31fdad1cbdd5e1f737018c349ebb
Repo Digest: busybox@sha256:2605a2c4875ce5eb27a9f7403263190cd1af31e48a2044d400320548356251c4
If the registry is using an old v1 manifest, the repository name and tag are part of that manifest, which means it will change as it's moved between registries:
{
"name": <name>,
"tag": <tag>,
"fsLayers": [
{
"blobSum": "<digest>"
},
...
],
"history": <v1 images>,
"signature": <JWS>
}
However for OCI manifests and Docker's v2 manifests, this is no longer the case and you should see the same registry digest for the same image:
{
"schemaVersion": 2,
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"config": {
"mediaType": "application/vnd.docker.container.image.v1+json",
"size": 7023,
"digest": "sha256:b5b2b2c507a0944348e0303114d8d93aaaa081732b86451d9bce1f432a537bc7"
},
"layers": [
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 32654,
"digest": "sha256:e692418e4cbaf90ca69d05a66403747baa33ee08806650b51fab815ad7fc331f"
},
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 16724,
"digest": "sha256:3c3a4604a545cdc127456d94e421cd355bca5b528f4a9c1905b15da2eb4a4c6b"
},
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 73109,
"digest": "sha256:ec4b8955958665577945c89419d1af06b5f7636b4ac3da7f12184802ad867736"
}
]
}
Digests themselves are a sha256 digest of the content, which you can also find in OCI's implementation. When you pull an image locally, some things change, including the layers being decompressed, and multi-platform images are dereferenced to your local platform. Because of those changes, the digest on the content will change and the image ID will not match the registry digest.
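As a rough illustration of why a name embedded in the manifest changes the digest, here is a minimal Python sketch. The manifests are trimmed stand-ins, the layer digest is a placeholder, and `serialize` is a hypothetical helper: a real registry hashes the exact bytes that were uploaded rather than a re-serialized dict.

```python
import hashlib
import json


def digest(content: bytes) -> str:
    """Return a registry-style digest string for raw content."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def serialize(manifest: dict) -> bytes:
    # Hypothetical fixed serialization, used only so that equal dicts
    # produce equal bytes in this sketch.
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")


PLACEHOLDER_LAYER = "sha256:" + "0" * 64  # not a real layer digest

# v1-style manifest: the repository name and tag are part of the hashed content.
v1_hub = {"name": "busybox", "tag": "latest",
          "fsLayers": [{"blobSum": PLACEHOLDER_LAYER}]}
v1_private = dict(v1_hub, name="myregistry/busybox")

# v2-style manifest: only content descriptors, no repository name or tag.
v2_manifest = {"schemaVersion": 2,
               "config": {"digest": PLACEHOLDER_LAYER},
               "layers": [{"digest": PLACEHOLDER_LAYER}]}

print(digest(serialize(v1_hub)) == digest(serialize(v1_private)))              # False
print(digest(serialize(v2_manifest)) == digest(serialize(dict(v2_manifest))))  # True
```

The v2-style manifest hashes to the same value no matter which repository it is pushed to, because nothing about the repository is part of the hashed bytes.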
Therefore, to compare images between registries, make sure you specify you want a v2 schema with the accept header, otherwise the registry will convert the result back to a v1 schema. In curl, passing those headers looks like:
curl \
-H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
-H "Accept: application/vnd.docker.distribution.manifest.list.v2+json" \
http://$registry/v2/$repo/manifests/$tag
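The same request can be sketched in Python with only the standard library. This is an illustration under assumptions: the function names are made up for this example, the registry is assumed to be reachable over plain HTTP without authentication, and both media types are sent in a single comma-separated Accept header.

```python
import hashlib
import urllib.request

# Media types taken from the curl example above.
MANIFEST_TYPES = [
    "application/vnd.docker.distribution.manifest.v2+json",
    "application/vnd.docker.distribution.manifest.list.v2+json",
]


def manifest_request(registry: str, repo: str, tag: str) -> urllib.request.Request:
    """Build a manifest request that asks for the v2 schema."""
    url = f"http://{registry}/v2/{repo}/manifests/{tag}"
    return urllib.request.Request(url, headers={"Accept": ", ".join(MANIFEST_TYPES)})


def manifest_digest(body: bytes) -> str:
    """Digest of the manifest exactly as served (no re-serialization)."""
    return "sha256:" + hashlib.sha256(body).hexdigest()


def fetch_digest(registry: str, repo: str, tag: str) -> str:
    """Fetch the manifest and hash the raw response body (needs network access)."""
    with urllib.request.urlopen(manifest_request(registry, repo, tag)) as resp:
        return manifest_digest(resp.read())
```

Calling `fetch_digest` against two registries that hold the same v2 manifest should return the same string. A registry that requires a token or TLS would need extra handling that this sketch leaves out.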

Related

Creating multiplatform Docker manifest results in all images being linux/amd64

I have a script which runs on my Mac and caches Docker images I use in my build process in a local registry, to avoid issues with Docker Hub's rate limits.
This was fine when I was running an Intel Mac, but now I am using Apple Silicon (but the build server is still on Intel) I am having problems caching both. If I run this script:
docker pull --platform linux/amd64 redis:6-alpine
docker tag redis:6-alpine registry.example.com/redis:6-alpine-amd64
docker push registry.example.com/redis:6-alpine-amd64
docker pull --platform linux/arm64/v8 redis:6-alpine
docker tag redis:6-alpine registry.example.com/redis:6-alpine-arm64v8
docker push registry.example.com/redis:6-alpine-arm64v8
docker manifest create registry.example.com/redis:6-alpine --amend registry.example.com/redis:6-alpine-amd64 --amend registry.example.com/redis:6-alpine-arm64v8
docker manifest push registry.example.com/redis:6-alpine
and then run docker manifest inspect -v registry.example.com/redis:6-alpine I get a manifest which has both tagged versions, but they are both for the "amd64" architecture.
[
{
"Ref": "registry.example.com/redis:6-alpine-amd64",
"Descriptor": {
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"digest": "sha256:55f1ee5c3b34c41e3d0b9e7bf3baedada5792d14069ab4b2921b22b43c06646b",
"size": 1571,
"platform": {
"architecture": "amd64",
"os": "linux"
}
},
"SchemaV2Manifest": {
…
}
},
{
"Ref": "registry.example.com/redis:6-alpine-arm64v8",
"Descriptor": {
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"digest": "sha256:55f1ee5c3b34c41e3d0b9e7bf3baedada5792d14069ab4b2921b22b43c06646b",
"size": 1571,
"platform": {
"architecture": "amd64",
"os": "linux"
}
},
"SchemaV2Manifest": {
…
}
}
]
If I reverse the order of the first two sections and pull/re-push the arm64 platform first, it makes no difference: both still have the "amd64" architecture, so it doesn't seem to be the second pull failing to overwrite the first.
DOCKER_DEFAULT_PLATFORM is not set. Activity Monitor shows all Docker processes are running native, rather than under Rosetta. I'm running Docker Desktop 4.6.1, which has engine version 20.10.13, and have experimental features enabled.
If anything, given it's running on Apple Silicon, I would have expected both to be "arm64" rather than "amd64"!
Could somebody point me in the right direction to successfully build a manifest containing both architectures? I don't really want to go through the hassle of setting up a container registry proxy when I already have a registry set up through a local GitLab instance…

Why is a parent container image SHA not listed in the layers of the child?

Consider a container image, let's call it BaseContainerImage. I build this container image based off a container image on Docker Hub (the .NET Core 3.1 runtime, if it matters). By "based off" I mean that the FROM references that Docker Hub image.
When I build it, it gets a SHA of: sha256:9ec7b7481feee3eb141f7321be1df353b1ab8b6bdf0d871717b6f7e90e6ed0f6. (Found by checking config.digest of the container image's manifest.)
Then I go make a new container image, let's call it ApplicationContainerImage. It is based off the BaseContainerImage, using a tag that refers to the above SHA. After I build it, I look at the container image's manifest.
I expected the layers section to contain the SHA of the "parent" container image. But it does not.
When I compare the layers of both, all the layers of the BaseContainerImage are in the ApplicationContainerImage. So I know that the FROM is working. But I just don't understand why the SHA of the BaseContainerImage is left out of the layers of the ApplicationContainerImage.
Why is the SHA of the BaseContainerImage not listed in layers of the ApplicationContainerImage?
Later Notes:
When I went and downloaded the BaseContainerImage from a remote repository, it told me (as part of the pull command output) that the digest is sha256:a1dd2dfdfc51e7abba1d2db319ba457e7b72f7258f5cefca0ee6ec6845f564b6, which clearly does not match the above digest. But when I run docker manifest inspect on the exact same image, the config.digest is sha256:9ec7b7481feee3eb141f7321be1df353b1ab8b6bdf0d871717b6f7e90e6ed0f6, matching what I got earlier.
Why are there two different SHA values? Is one just for the pull action somehow?

You're mixing up digests for different objects. An image in a registry consists of:
Manifest: this is the top level; it has its own digest, but is commonly referred to by tag
Config: the manifest points to this, and it includes the default settings you see when you inspect a docker image
Layers: each layer has its own digest; these are typically tar+gzip on the registry, and tar (uncompressed) when pulled locally
The manifest digest is the most commonly used digest, it's used to pin an image for pulling. Note that you can have a manifest list that points to multiple platform specific manifests, and each of those have their own digest.
The config digest shouldn't be compared to anything locally, it's needed to pull the config blob from the registry, but it isn't directly associated with layer digests and isn't the manifest digest.
The layer digests are sometimes confused because they change when you go from compressed on the registry to uncompressed locally.
What is a digest? It's just the sha256sum on the content. That file is pushed to the registry as a blob or manifest. Because of how the manifest includes digests of the other files, you end up with a directed acyclic graph (DAG).
To see the layer reuse, look at the actual layers within the image manifest. Or you can look at the layers section of the config blob (these digests will be different because the layer digests in the config are on the uncompressed layer).
Here's an example of layer reuse looking at two images on docker hub:
$ regctl image manifest alpine:3.13
{
"schemaVersion": 2,
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"config": {
"mediaType": "application/vnd.docker.container.image.v1+json",
"size": 1472,
"digest": "sha256:6dbb9cc54074106d46d4ccb330f2a40a682d49dda5f4844962b7dce9fe44aaec"
},
"layers": [
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 2811969,
"digest": "sha256:540db60ca9383eac9e418f78490994d0af424aab7bf6d0e47ac8ed4e2e9bcbba"
}
]
}
$ regctl image manifest redis:alpine
{
"schemaVersion": 2,
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"config": {
"mediaType": "application/vnd.docker.container.image.v1+json",
"size": 6390,
"digest": "sha256:1690b63e207f6651429bebd716ace700be29d0110a0cfefff5038bb2a7fb6fc7"
},
"layers": [
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 2811969,
"digest": "sha256:540db60ca9383eac9e418f78490994d0af424aab7bf6d0e47ac8ed4e2e9bcbba"
},
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 1258,
"digest": "sha256:29712d301e8c43bcd4a36da8a8297d5ff7f68c3d4c3f7113244ff03675fa5e9c"
},
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 384200,
"digest": "sha256:8173c12df40f1578a7b2dfbbc0034a4fbc8ec7c870fd32b9236c2e5e1936616a"
},
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 7692532,
"digest": "sha256:8cc52074f78e0a2fd174bdd470029cf287b7366bf1b8d3c1f92e2aa8789b92ae"
},
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 135,
"digest": "sha256:aa7854465cce07929842cb49fc92f659de8a559cf521fc7ea8e1b781606b85cd"
},
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 412,
"digest": "sha256:6ab1d05b49730290d3c287ccd34640610423d198e84552a4c2a4e98a46680cfd"
}
]
}
From that you can see the config blobs are completely different (as expected, these aren't the same image), but the layer from the alpine image is the same as the first layer of the redis:alpine image.
The regctl tool shown here is available from github. Disclaimer, I'm the author.
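To make that comparison mechanical, here is a small Python sketch that works on already-parsed manifests. The helper names are made up for this example, the two dicts are trimmed copies of the manifests above (redis:alpine is cut to its first two layers), and treating "the child's layer list starts with the base's layers" as evidence of a FROM relationship is a heuristic, not something the registry records.

```python
def layer_digests(manifest: dict) -> list:
    """Layer digests of an image manifest, in order."""
    return [layer["digest"] for layer in manifest["layers"]]


def shared_layers(base: dict, child: dict) -> list:
    """Digests that appear in both manifests, in the child's order."""
    base_set = set(layer_digests(base))
    return [d for d in layer_digests(child) if d in base_set]


def looks_based_on(base: dict, child: dict) -> bool:
    """Heuristic: the child's layer list begins with all of the base's layers."""
    base_layers = layer_digests(base)
    return layer_digests(child)[:len(base_layers)] == base_layers


# Trimmed copies of the manifests shown above.
alpine = {"layers": [
    {"digest": "sha256:540db60ca9383eac9e418f78490994d0af424aab7bf6d0e47ac8ed4e2e9bcbba"},
]}
redis_alpine = {"layers": [
    {"digest": "sha256:540db60ca9383eac9e418f78490994d0af424aab7bf6d0e47ac8ed4e2e9bcbba"},
    {"digest": "sha256:29712d301e8c43bcd4a36da8a8297d5ff7f68c3d4c3f7113244ff03675fa5e9c"},
]}

print(shared_layers(alpine, redis_alpine))
print(looks_based_on(alpine, redis_alpine))
```

Note that neither function looks at the config digest or the manifest digest of the base image: the only trace of the parent in the child's manifest is the reused layer digests.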

How can I be sure that I am pulling a trusted image from docker?

We are using notary service along with third party provider aujas for signing the docker images.
I have a build machine from where we run the scripts to sign the images. So far so good.
When my customer pulls the image that we have signed, how can he be sure that the image is signed and can be trusted?
I tested from a different machine (other than the build machine) and I am able to pull the image successfully when I unset DOCKER_CONTENT_TRUST. The moment I enable DOCKER_CONTENT_TRUST I get an error that
Error: remote trust data does not exist for docker.io/xxx/xxxx:
notary.docker.io does not have trust data for docker.io/xxxx/xxxx
How does my customer know that the image he is pulling is signed?
Thanks,
Madhav

Not all images have been signed by their maintainers. The error you are getting means that this particular image was not signed by its maintainer.
An image that is signed for example is the nginx image.
$ docker trust inspect nginx:alpine
[
{
"Name": "nginx:alpine",
"SignedTags": [
{
"SignedTag": "alpine",
"Digest": "1e9c503db9913a59156f78c6420f6e2f01c8a3b71ceeeddcd7f604c4db0f045e",
"Signers": [
"Repo Admin"
]
}
],
"Signers": [],
"AdministrativeKeys": [
{
"Name": "Root",
"Keys": [
{
"ID": "d2f02ea35ebffce87d31673efbff44c199b1af0be042989d4655a176e8aad40d"
}
]
},
{
"Name": "Repository",
"Keys": [
{
"ID": "ec92eb8e988506253f8590cb924b6becdbb0520f2fb430257d8879e2d3bed2cc"
}
]
}
]
}
]
Therefore you can pull this image with content trust enabled.
$ DOCKER_CONTENT_TRUST=true docker pull nginx:alpine
Pull (1 of 1): nginx:alpine@sha256:1e9c503db9913a59156f78c6420f6e2f01c8a3b71ceeeddcd7f604c4db0f045e
docker.io/library/nginx@sha256:1e9c503db9913a59156f78c6420f6e2f01c8a3b71ceeeddcd7f604c4db0f045e: Pulling from library/nginx
Digest: sha256:1e9c503db9913a59156f78c6420f6e2f01c8a3b71ceeeddcd7f604c4db0f045e
Status: Image is up to date for nginx@sha256:1e9c503db9913a59156f78c6420f6e2f01c8a3b71ceeeddcd7f604c4db0f045e
Tagging nginx@sha256:1e9c503db9913a59156f78c6420f6e2f01c8a3b71ceeeddcd7f604c4db0f045e as nginx:alpine
docker.io/library/nginx:alpine
However an image that is not signed can not be pulled with content trust enabled.
$ docker trust inspect docker/whalesay
[]
No signatures or cannot access docker/whalesay
As you can see, pulling it with content trust enabled gives the same error you got.
$ DOCKER_CONTENT_TRUST=true docker pull docker/whalesay
Using default tag: latest
Error: remote trust data does not exist for docker.io/docker/whalesay: notary.docker.io does not have trust data for docker.io/docker/whalesay
If you would like to work with signed images, one way to achieve that is to sign them yourself and push them to your own repository.
export DOCKER_CONTENT_TRUST=true # enable content trust globally
DOCKER_CONTENT_TRUST=false docker pull docker/whalesay # download unsigned image by disabling content trust
docker tag docker/whalesay marcofranssen/whalesay
docker push marcofranssen/whalesay # pushes and signs the image in my own repository with my keys
docker trust inspect marcofranssen/whalesay
[
{
"Name": "marcofranssen/whalesay",
"SignedTags": [
{
"SignedTag": "latest",
"Digest": "4a79736c5f63638261bc21228b48e9991340ca6d977b73de3598be20606e5d87",
"Signers": [
"marcofranssen"
]
}
],
"Signers": [
{
"Name": "marcofranssen",
"Keys": [
{
"ID": "eb9dd99255f91efeba139941fbfdb629f11c2353704de07a2ad653d22311c88b"
}
]
}
],
"AdministrativeKeys": [
{
"Name": "Root",
"Keys": [
{
"ID": "0428c356406a6ea3543012855c117d13d784774e49aa6db461cfbad5726d187b"
}
]
},
{
"Name": "Repository",
"Keys": [
{
"ID": "b635efeddff59751e8b6b59abb45383555103d702e7d3f46fbaaa9a8ac144ab8"
}
]
}
]
}
]
Now you can use your own repository with signed versions of the image. It goes without saying that you should only sign images after verifying their contents.
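If a consumer wants to check this in a script rather than by eye, the JSON printed by docker trust inspect can be parsed. A minimal Python sketch, assuming the output has the shape shown above; `signed_digest` is a made-up helper name and the sample string is a trimmed copy of the nginx output.

```python
import json


def signed_digest(trust_output: str, tag: str):
    """Return the signed digest for a tag, or None if the tag is not signed."""
    for repo in json.loads(trust_output):
        for entry in repo.get("SignedTags", []):
            if entry["SignedTag"] == tag:
                return entry["Digest"]
    return None


# Trimmed copy of the `docker trust inspect nginx:alpine` output above.
NGINX_TRUST = """
[
  {
    "Name": "nginx:alpine",
    "SignedTags": [
      {
        "SignedTag": "alpine",
        "Digest": "1e9c503db9913a59156f78c6420f6e2f01c8a3b71ceeeddcd7f604c4db0f045e",
        "Signers": ["Repo Admin"]
      }
    ]
  }
]
"""

print(signed_digest(NGINX_TRUST, "alpine"))
print(signed_digest("[]", "latest"))  # unsigned repository: empty list
```

This only reads what the trust data claims. The actual signature verification still happens inside Docker when DOCKER_CONTENT_TRUST is enabled.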

Images that are pulled through Docker Content Trust can be trusted as their cryptographic signatures are automatically verified. From the Docker documentation:
Image consumers can enable DCT to ensure that images they use were signed. If a consumer enables DCT, they can only pull, run, or build with trusted images. Enabling DCT is a bit like applying a “filter” to your registry. Consumers “see” only signed image tags and the less desirable, unsigned image tags are “invisible” to them.

Difference between OCI image manifest and Docker V2.2 image manifest

I have a requirement to convert an OCI image manifest to the Docker v2.2 image format and vice versa. But I am not able to find any difference between the two. Is there an actual difference, or are they the same?

Docker Image Manifest V 2, Schema 2
Registry image manifests define the components that make up an image on a container registry (see section on container registries). The more common manifest format we’ll be working with is the Docker Image Manifest V2, Schema 2 (more simply, V2.2). There is also a V2, Schema 1 format that is commonly used but more complicated than V2.2 due to backwards-compatibility reasons against V1.
The V2.2 manifest format is a JSON blob with the following top-level fields:
schemaVersion - 2 in this case
mediaType - application/vnd.docker.distribution.manifest.v2+json
config - descriptor of container configuration blob
layers - list of descriptors of layer blobs, in the same order as the rootfs of the container configuration
Blob descriptors are JSON objects containing 3 fields:
mediaType - application/vnd.docker.container.image.v1+json for a container configuration or application/vnd.docker.image.rootfs.diff.tar.gzip for a layer
size - the size of the blob, in bytes
digest - the digest of the content
Here is an example of a V2.2 manifest format (for the Docker Hub busybox image):
{
"schemaVersion": 2,
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"config": {
"mediaType": "application/vnd.docker.container.image.v1+json",
"size": 1497,
"digest": "sha256:3a093384ac306cbac30b67f1585e12b30ab1a899374dabc3170b9bca246f1444"
},
"layers": [
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 755724,
"digest": "sha256:57c14dd66db0390dbf6da578421c077f6de8e88edd0815b4caa94607ba5f4c09"
}
]
}
OCI Image Manifest
The OCI image format is essentially the same as the Docker V2.2 format, with a few differences.
mediaType - must be set to application/vnd.oci.image.manifest.v1+json
config.mediaType - must be set to application/vnd.oci.image.config.v1+json
Each object in layers must have mediaType be either application/vnd.oci.image.layer.v1.tar+gzip or application/vnd.oci.image.layer.v1.tar.
Source: https://containers.gitbook.io/build-containers-the-hard-way/#registry-format-oci-image-manifest
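Based only on the differences listed above, a conversion can be sketched as a media type rewrite. The Python below is an illustration under assumptions: the function names are made up, only the gzip layer type is mapped (an uncompressed OCI layer raises a KeyError here), and whether a particular registry or tool accepts the converted manifest is not guaranteed.

```python
import copy

# Media type pairs taken from the lists above.
DOCKER_TO_OCI = {
    "application/vnd.docker.distribution.manifest.v2+json": "application/vnd.oci.image.manifest.v1+json",
    "application/vnd.docker.container.image.v1+json": "application/vnd.oci.image.config.v1+json",
    "application/vnd.docker.image.rootfs.diff.tar.gzip": "application/vnd.oci.image.layer.v1.tar+gzip",
}
OCI_TO_DOCKER = {oci: docker for docker, oci in DOCKER_TO_OCI.items()}


def _convert(manifest: dict, mapping: dict) -> dict:
    """Return a copy of the manifest with every mediaType rewritten."""
    out = copy.deepcopy(manifest)
    out["mediaType"] = mapping[out["mediaType"]]
    out["config"]["mediaType"] = mapping[out["config"]["mediaType"]]
    for layer in out["layers"]:
        layer["mediaType"] = mapping[layer["mediaType"]]
    return out


def docker_to_oci(manifest: dict) -> dict:
    return _convert(manifest, DOCKER_TO_OCI)


def oci_to_docker(manifest: dict) -> dict:
    return _convert(manifest, OCI_TO_DOCKER)


# The busybox manifest from the example above.
busybox = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
    "config": {
        "mediaType": "application/vnd.docker.container.image.v1+json",
        "size": 1497,
        "digest": "sha256:3a093384ac306cbac30b67f1585e12b30ab1a899374dabc3170b9bca246f1444",
    },
    "layers": [{
        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
        "size": 755724,
        "digest": "sha256:57c14dd66db0390dbf6da578421c077f6de8e88edd0815b4caa94607ba5f4c09",
    }],
}

print(docker_to_oci(busybox)["mediaType"])
```

Because the manifest bytes change, the converted manifest will have a different manifest digest, even though the config and layer digests it points to stay the same.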

Pull docker image for different architecture

I have an air gapped system that I'm running some docker containers on. I'm trying to get some images on it however the system is a different architecture than what I'm running. For a few images (like GCC) I was able to just say docker pull repo/gcc and that worked fine, however for some reason when I try to do docker pull repo/python I get:
Using default tag: latest
latest: Pulling from repo/python
no matching manifest for linux/amd64 in the manifest list entries
Is there a way to specify the architecture in my pull request?

With the latest docker, you can specify platform when you're pulling
docker pull --platform linux/arm64 alpine:latest

Docker 20.10.0 and later (released on 2020-12-08) supports explicitly selecting the platform via the --platform flag, e.g.:
docker pull --platform linux/arm64 repo/python
Of course, source must contain an image for the requested platform.
Answer for Docker versions before 20.10.0:
To answer the question from the title: you can pull an image by digest.
Example: list all supported architectures (manifest):
$ docker manifest inspect ckulka/multi-arch-example
{
"schemaVersion": 2,
"mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
"manifests": [
{
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"size": 2200,
"digest": "sha256:6eaeab9bf8270ce32fc974c36a15d0bac4fb6f6cd11a0736137c4248091b3646",
"platform": {
"architecture": "amd64",
"os": "linux"
}
},
{
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"size": 2413,
"digest": "sha256:f02e0fd2918a894ecd49d67b802c22082dc3c6424f6566e1753a83ba833b0993",
"platform": {
"architecture": "arm",
"os": "linux",
"variant": "v5"
}
},
...
And then pull by digest, e.g. arm arch (pulled on linux machine):
$ docker pull ckulka/multi-arch-example@sha256:f02e0fd2918a894ecd49d67b802c22082dc3c6424f6566e1753a83ba833b0993
But you can't run every architecture on your machine, so an image pulled for a different architecture may be of no use locally.
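Picking the right digest out of a manifest list can also be scripted. A minimal Python sketch, assuming the JSON shape printed by docker manifest inspect above; `digest_for_platform` is a made-up helper name and the dict is a trimmed copy of that output.

```python
def digest_for_platform(manifest_list: dict, os_name: str, architecture: str, variant=None):
    """Return the digest of the first entry matching the platform, or None."""
    for entry in manifest_list["manifests"]:
        platform = entry["platform"]
        if platform["os"] != os_name or platform["architecture"] != architecture:
            continue
        # Only compare the variant when the caller asked for a specific one.
        if variant is not None and platform.get("variant") != variant:
            continue
        return entry["digest"]
    return None


# Trimmed copy of the manifest list shown above.
multi_arch = {"manifests": [
    {"digest": "sha256:6eaeab9bf8270ce32fc974c36a15d0bac4fb6f6cd11a0736137c4248091b3646",
     "platform": {"architecture": "amd64", "os": "linux"}},
    {"digest": "sha256:f02e0fd2918a894ecd49d67b802c22082dc3c6424f6566e1753a83ba833b0993",
     "platform": {"architecture": "arm", "os": "linux", "variant": "v5"}},
]}

print(digest_for_platform(multi_arch, "linux", "arm", "v5"))
```

A result of None corresponds to the "no matching manifest" situation from the question: the list has no entry for the requested platform.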

If the image is not multi-arch, you can't, unless you emulate the target architecture of the manifest.
