Identify FROM line from Docker Image - docker

I have an image from a colleague, but the original Dockerfile used to create it was lost.
I used alpine/dfimage to reconstruct everything in the Dockerfile except the first line (FROM).
According to Artifactory, the digest for the layer is
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 2813316,
"digest": "sha256:cbdbe7a5bc2a134ca8ec91be58565ec07d037386d1f1d8385412d224deafca08"
}
and the ADD line is:
ADD
file:b91adb67b670d3a6ff9463e48b7def903ed516be66fc4282d22c53e41512be49
in /
I'm fairly confident the base image was pulled from a public repo. I've been trying to find a way to search Docker Hub for images by digest (as I can in Artifactory), but I haven't found one yet. For example, library/alpine:3.15.0 has a digest of sha256:21a3deaa0d32a8057914f36584b5288d2e5ecc984380bc0118285c70fa8c9300 (different from the repo digest of sha256:c74f1b1166784193ea6c8f9440263b9be6cae07dfe35e32a5df7a31358ac2060 as advertised on Docker Hub).
If I knew the namespace and repo, I could simply specify the digest with docker image pull server/namespace/repo@digest. Unfortunately, the only thing I have is a previous version of the Dockerfile, and the namespace and repo referenced there didn't work.
FROM mcr.microsoft.com/dotnet/core/runtime:2.1.11-alpine3.9
In other words, docker image pull mcr.microsoft.com/dotnet/core/runtime@sha256:cbdbe7a5bc2a134ca8ec91be58565ec07d037386d1f1d8385412d224deafca08 did not return success. Nor did docker image pull mcr.microsoft.com/dotnet/runtime@sha256:cbdbe7a5bc2a134ca8ec91be58565ec07d037386d1f1d8385412d224deafca08.
Any help would be greatly appreciated. My only alternative at this point is to reinvent the wheel.

The digest for a layer blob and the digest for the image manifest are two different things. The manifest contains the digests for the layer blobs, so if any layer changes, the digest for the manifest also changes. That's what provides the immutability of images. Inspecting the image you found:
$ regctl manifest get localhost:5000/library/alpine:3.11.6
{
"schemaVersion": 2,
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"config": {
"mediaType": "application/vnd.docker.container.image.v1+json",
"size": 1507,
"digest": "sha256:f70734b6a266dcb5f44c383274821207885b549b75c8e119404917a61335981a"
},
"layers": [
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 2813316,
"digest": "sha256:cbdbe7a5bc2a134ca8ec91be58565ec07d037386d1f1d8385412d224deafca08"
}
]
}
You can see your one blob, but to pull the image, you need the manifest digest, and unfortunately there's no way to list all the manifests that point to a specific blob (at least not yet, we may get that as part of OCI reference types, but that's a ways off). However, the manifest itself has a digest that matches what you would see when inspecting the image:
$ regctl manifest get localhost:5000/library/alpine:3.11.6 --format raw-body | sha256sum
39eda93d15866957feaee28f8fc5adb545276a64147445c64992ef69804dbf01 -
$ regctl manifest get localhost:5000/library/alpine:3.11.6 --list --format raw-body | sha256sum
9a839e63dad54c3a6d1834e29692c8492d93f90c59c978c1ed79109ea4fb9a54 -
As for why there are two digests above, the alpine image is a multi-platform image, so there's a digest for the parent manifest list, and then a digest for each image manifest. Here's what the manifest list looks like:
$ regctl manifest get localhost:5000/library/alpine:3.11.6 --list --format raw-body | jq .
{
"manifests": [
{
"digest": "sha256:39eda93d15866957feaee28f8fc5adb545276a64147445c64992ef69804dbf01",
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"platform": {
"architecture": "amd64",
"os": "linux"
},
"size": 528
},
{
"digest": "sha256:0ff8a9dffabb5ed8dcba4ee898f62683305b75b4086f433ee722db99138f4f53",
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"platform": {
"architecture": "arm",
"os": "linux",
"variant": "v6"
},
"size": 528
},
{
"digest": "sha256:19c4e520fa84832d6deab48cd911067e6d8b0a9fa73fc054c7b9031f1d89e4cf",
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"platform": {
"architecture": "arm",
"os": "linux",
"variant": "v7"
},
"size": 528
},
{
"digest": "sha256:ad295e950e71627e9d0d14cdc533f4031d42edae31ab57a841c5b9588eacc280",
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"platform": {
"architecture": "arm64",
"os": "linux",
"variant": "v8"
},
"size": 528
},
{
"digest": "sha256:b28e271d721b3f6377cb5bae6cd4506d2736e77ef6f70ed9b0c4716da8bdf17c",
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"platform": {
"architecture": "386",
"os": "linux"
},
"size": 528
},
{
"digest": "sha256:e095eb9ac24e21bf2621f4d243274197ef12b91c67cde023092301b2db1e073c",
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"platform": {
"architecture": "ppc64le",
"os": "linux"
},
"size": 528
},
{
"digest": "sha256:41ba0806c6113064dd4cff12212eea3088f40ae23f182763ccc07f430b3a52f8",
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"platform": {
"architecture": "s390x",
"os": "linux"
},
"size": 528
}
],
"mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
"schemaVersion": 2
}
Once we have the manifest list, either by tag or digest, we can walk the structure down to the manifests and the blobs they contain. But the blobs don't contain the digest of the parent (adding it would change the digest of the blob, which changes the digest of the parent, and you get a circular dependency).
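To make the direction of that walk concrete, here is a minimal Python sketch. It assumes the manifests have already been fetched and parsed into a dict keyed by digest (the REGISTRY dict and the function name are hypothetical, and the arm/v6 layer digest is a placeholder since it isn't shown above). It only illustrates that you can go from a manifest list down to blobs, so finding a blob's parent means checking candidate images one by one.

```python
from typing import Dict, List

LIST_DIGEST = "sha256:9a839e63dad54c3a6d1834e29692c8492d93f90c59c978c1ed79109ea4fb9a54"
AMD64_DIGEST = "sha256:39eda93d15866957feaee28f8fc5adb545276a64147445c64992ef69804dbf01"
ARMV6_DIGEST = "sha256:0ff8a9dffabb5ed8dcba4ee898f62683305b75b4086f433ee722db99138f4f53"
LAYER = "sha256:cbdbe7a5bc2a134ca8ec91be58565ec07d037386d1f1d8385412d224deafca08"
PLACEHOLDER_LAYER = "sha256:" + "0" * 64  # hypothetical, not a real layer digest

# Hypothetical in-memory stand-in for a registry: digest -> parsed manifest JSON.
REGISTRY: Dict[str, dict] = {
    LIST_DIGEST: {"manifests": [{"digest": AMD64_DIGEST}, {"digest": ARMV6_DIGEST}]},
    AMD64_DIGEST: {"layers": [{"digest": LAYER}]},
    ARMV6_DIGEST: {"layers": [{"digest": PLACEHOLDER_LAYER}]},
}


def manifests_with_layer(registry: Dict[str, dict], root: str, layer: str) -> List[str]:
    """Walk down from a manifest list (or a single manifest) and return the
    digests of the image manifests whose layers include the given blob."""
    doc = registry[root]
    if "manifests" in doc:  # manifest list: recurse into each child manifest
        found: List[str] = []
        for child in doc["manifests"]:
            found.extend(manifests_with_layer(registry, child["digest"], layer))
        return found
    layers = [entry["digest"] for entry in doc.get("layers", [])]
    return [root] if layer in layers else []


print(manifests_with_layer(REGISTRY, LIST_DIGEST, LAYER))
```

With a real registry, each lookup in REGISTRY would be a manifest fetch, which is why this only scales to a short list of candidate tags.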
One thing that was recently added to the list of OCI standard annotations is org.opencontainers.image.base.digest and org.opencontainers.image.base.name, which if implemented would allow users of an image to identify their base image by both name (including the tag) and digest. That can be useful for determining when an image needs to be rebuilt (when the tag no longer refers to the same digest).
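As a rough sketch of how such annotations could be used for the rebuild check, assuming the image was built with them set (most images are not) and that the keys are org.opencontainers.image.base.name and org.opencontainers.image.base.digest as I understand the OCI annotation list. The function name and the sample manifest are hypothetical.

```python
from typing import Optional

BASE_NAME = "org.opencontainers.image.base.name"
BASE_DIGEST = "org.opencontainers.image.base.digest"


def base_is_stale(manifest: dict, current_base_digest: str) -> Optional[bool]:
    """Compare the base digest recorded at build time with the digest the
    base reference resolves to now. Returns None when not annotated."""
    annotations = manifest.get("annotations") or {}
    recorded = annotations.get(BASE_DIGEST)
    if recorded is None:
        return None
    return recorded != current_base_digest


# Hypothetical manifest of a child image that recorded its base at build time.
child_manifest = {
    "schemaVersion": 2,
    "annotations": {
        BASE_NAME: "docker.io/library/alpine:3.11.6",
        BASE_DIGEST: "sha256:9a839e63dad54c3a6d1834e29692c8492d93f90c59c978c1ed79109ea4fb9a54",
    },
}

same = "sha256:9a839e63dad54c3a6d1834e29692c8492d93f90c59c978c1ed79109ea4fb9a54"
print(base_is_stale(child_manifest, same))
```

The current digest would come from resolving the recorded base name against the registry; that lookup is left out here.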

So, I was able to find this, but none of the data I have seems to correlate, and I'd welcome a discussion on it if there is, in fact, any correlating data.
The image of interest is alpine:3.11.6 with a digest of sha256:9a839e63dad54c3a6d1834e29692c8492d93f90c59c978c1ed79109ea4fb9a54.

Related

Docker Image Layer Remote URL - How do you set the URL value for referencing blobs externally from the registry?

This is a common example from MCR for Windows images. I'm not interested in the mediaType, but I am interested in setting urls. The endpoint is barely documented, and I haven't seen any implementation that uses it.
"layers": [
{
"mediaType": "application/vnd.docker.image.rootfs.foreign.diff.tar.gzip",
"size": 1534685324,
"digest": "sha256:65014b3c312172f10bd6701a063f9b5aaf9a916c2d2cb843d406a6f77ded3f8d",
"urls": [
"https://go.microsoft.com/fwlink/?linkid=2041275"
]
}
I've looked through and through Docker, Podman, Registry, ORAS, etc with no luck.

filter docker images with json output

I have a couple of requests.
I'd like to get JSON output from docker images, which I can do this way:
$ docker images --format "{{json . }}" |jq .
{
"Containers": "N/A",
"CreatedAt": "2022-07-27 11:12:07 +1000 AEST",
"CreatedSince": "8 days ago",
"Digest": "<none>",
"ID": "b673851840e8",
"Repository": "python",
"SharedSize": "N/A",
"Size": "915MB",
"Tag": "3.9",
"UniqueSize": "N/A",
"VirtualSize": "914.7MB"
}
{
"Containers": "N/A",
"CreatedAt": "2022-07-19 07:00:15 +1000 AEST",
"CreatedSince": "2 weeks ago",
"Digest": "<none>",
"ID": "d7d3d98c851f",
"Repository": "alpine",
"SharedSize": "N/A",
"Size": "5.53MB",
"Tag": "latest",
"UniqueSize": "N/A",
"VirtualSize": "5.529MB"
}
Then I want to get the image name with tag and ID:
$ docker images --format "{{json . }}" |jq -r "[.Repository,.Tag,.ID]|@csv"
"python","3.9","b673851840e8"
"alpine","latest","d7d3d98c851f"
So my question is, how can I get the output as
python:3.9 b673851840e8
alpine:latest d7d3d98c851f
(Optional) Second request: how can I filter the output so that it only shows images matching *python*?
You don't need jq for this; use the native --format flag instead:
docker images --format "{{.Repository}}:{{.Tag}} {{.ID}}"
Or, to show only names starting with python:
docker images --filter=reference='python*' --format "{{.Repository}}:{{.Tag}} {{.ID}}"
Using jq, collect the required fields into an array and use any of the string concatenation operators. You could also use join("\t") in place of @tsv to use join consistently.
jq -r '[([.Repository, .Tag] | join(":")), .ID] | @tsv'
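If neither jq nor --filter is convenient, the same reshaping can be sketched in a few lines of Python. This assumes the input is the line-delimited JSON that docker images --format "{{json . }}" prints (trimmed here to the three fields used). The function name and the glob-style pattern argument are my own choices, not part of any Docker tooling.

```python
import json
from fnmatch import fnmatchcase
from typing import List

# Sample input, trimmed from the docker images JSON output shown in the question.
SAMPLE = """\
{"Repository": "python", "Tag": "3.9", "ID": "b673851840e8"}
{"Repository": "alpine", "Tag": "latest", "ID": "d7d3d98c851f"}
"""


def format_images(json_lines: str, pattern: str = "*") -> List[str]:
    """Turn one JSON object per line into 'repo:tag id' rows, keeping only
    repositories that match the glob pattern."""
    rows = []
    for line in json_lines.splitlines():
        if not line.strip():
            continue  # skip blank lines
        image = json.loads(line)
        if fnmatchcase(image["Repository"], pattern):
            rows.append(f'{image["Repository"]}:{image["Tag"]} {image["ID"]}')
    return rows


print("\n".join(format_images(SAMPLE)))
print("\n".join(format_images(SAMPLE, "*python*")))
```

In practice the string would come from piping the docker command into the script's standard input.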

Docker container OS and host OS on Windows incompatibility skip OS validation?

I am hitting an incompatibility between the container OS and the host OS.
Sending build context to Docker daemon 5.724GB
Step 1/6 : FROM mcr.microsoft.com/windows/servercore:1909-amd64
1909-amd64: Pulling from windows/servercore
a Windows version 10.0.18363-based image is incompatible with a 10.0.17763 host
I have seen other SO threads where the workaround is to update the host OS to match the targeted container OS version.
Is there any way to skip this pull validation?
We are using VMs only to build Docker images and ship them, not to run containers (even though docker build creates containers along the way).
I think there is no way other than changing the system.
I once ran into a similar issue when I was using arm64 machines for amd64 images, and it didn't work:
https://forums.docker.com/t/standard-init-linux-go-190-exec-user-process-caused-exec-format-error/49368/5
Usually you can find the compatibility info for an image in output like the following.
You can try it on your system for mcr.microsoft.com/windows/servercore:1909-amd64:
docker manifest inspect --verbose rust:1.42-slim-buster
[
{
"Ref": "docker.io/library/rust:1.42-slim-buster@sha256:1bf29985958d1436197c3b507e697fbf1ae99489ea69e59972a30654cdce70cb",
"Descriptor": {
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"digest": "sha256:1bf29985958d1436197c3b507e697fbf1ae99489ea69e59972a30654cdce70cb",
"size": 742,
"platform": {
"architecture": "amd64",
"os": "linux"
}
},
"SchemaV2Manifest": { ... }
},
{
"Ref": "docker.io/library/rust:1.42-slim-buster@sha256:116d243c6346c44f3d458e650e8cc4e0b66ae0bcd37897e77f06054a5691c570",
"Descriptor": {
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"digest": "sha256:116d243c6346c44f3d458e650e8cc4e0b66ae0bcd37897e77f06054a5691c570",
"size": 742,
"platform": {
"architecture": "arm",
"os": "linux",
"variant": "v7"
}
},
"SchemaV2Manifest": { ... }
...
]

Why is a parent container image SHA not listed in the layers of the child?

Consider a container image, let's call it BaseContainerImage. I build this container image based off a container image on Docker Hub (the .NET Core 3.1 runtime, if it matters). By "based off" I mean that the FROM references that Docker Hub image.
When I build it, it gets a SHA of: sha256:9ec7b7481feee3eb141f7321be1df353b1ab8b6bdf0d871717b6f7e90e6ed0f6. (Found by checking config.digest of the container image's manifest.)
Then I make a new container image, let's call it ApplicationContainerImage. It is based off the BaseContainerImage, using a tag that refers to the above SHA. After I build it, I look at the container image's manifest.
I expected the layers section to contain the SHA of the "parent" container image. But it does not.
When I compare the layers of both, all the layers of the BaseContainerImage are in the ApplicationContainerImage. So I know that the FROM is working. But I just don't understand why the SHA of the BaseContainerImage is left out of the layers of the ApplicationContainerImage.
Why is the SHA of the BaseContainerImage not listed in layers of the ApplicationContainerImage?
Later Notes:
When I downloaded the BaseContainerImage from a remote repository, the output of the pull command reported Digest: sha256:a1dd2dfdfc51e7abba1d2db319ba457e7b72f7258f5cefca0ee6ec6845f564b6, which clearly does not match the above digest. But when I run docker manifest inspect on the exact same image, the config.digest is sha256:9ec7b7481feee3eb141f7321be1df353b1ab8b6bdf0d871717b6f7e90e6ed0f6, matching what I got earlier.
Why are there two different SHA values? Is one just for the pull action somehow?
You're mixing up digests for different objects. An image in a registry consists of:
Manifest: this is the top level, has its own digest, but is commonly referred to by tag
Config: the manifest points to this, and it includes the default settings you see when you inspect a docker image
Layers: each layer has its own digest, these are typically tar+gzip on the registry, and tar (uncompressed) when pulled locally
The manifest digest is the most commonly used digest; it's used to pin an image for pulling. Note that you can have a manifest list that points to multiple platform-specific manifests, and each of those has its own digest.
The config digest shouldn't be compared to anything locally, it's needed to pull the config blob from the registry, but it isn't directly associated with layer digests and isn't the manifest digest.
The layer digests are sometimes confused because they change when you go from compressed on the registry to uncompressed locally.
What is a digest? It's just the sha256sum on the content. That file is pushed to the registry as a blob or manifest. Because of how the manifest includes digests of the other files, you end up with a directed acyclic graph (DAG).
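A small Python sketch of that idea, using made-up config and layer bytes rather than a real image. The manifest built here is a simplified stand-in (no mediaType fields) and the helper names are hypothetical. The point is only that a digest is a hash of the bytes, and that the manifest's digest changes when any layer it references changes.

```python
import hashlib
import json
from typing import List


def digest(content: bytes) -> str:
    """A digest is the sha256 of the raw bytes, prefixed with the algorithm."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def build_manifest(config: bytes, layers: List[bytes]) -> bytes:
    """Build a simplified manifest body that points at its config and layers
    by digest, the way a real image manifest does."""
    doc = {
        "schemaVersion": 2,
        "config": {"size": len(config), "digest": digest(config)},
        "layers": [{"size": len(blob), "digest": digest(blob)} for blob in layers],
    }
    return json.dumps(doc, sort_keys=True).encode()


config = b'{"architecture": "amd64", "os": "linux"}'
original = build_manifest(config, [b"layer one", b"layer two"])
changed = build_manifest(config, [b"layer one", b"layer 2"])

# Changing one layer changes its digest, which changes the manifest bytes,
# which changes the manifest digest.
print(digest(original) != digest(changed))
```

Nothing in a layer or the config points back up to the manifest, which is why the graph stays acyclic.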
To see the layer reuse, look at the actual layers within the image manifest. Or you can look at the layers section of the config blob (these digests will be different because the layer digests in the config are on the uncompressed layer).
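To illustrate why the two sets of layer digests differ, here is a short Python sketch with placeholder bytes standing in for a layer tarball (the variable names are mine). The same content hashes differently once it is gzip-compressed, which is the form the registry stores and the manifest references.

```python
import gzip
import hashlib

# Placeholder bytes standing in for an uncompressed layer tarball.
layer_tar = b"pretend this is an uncompressed layer tarball\n" * 100

# mtime=0 keeps the gzip header stable so repeated runs give the same bytes.
compressed = gzip.compress(layer_tar, mtime=0)

uncompressed_digest = "sha256:" + hashlib.sha256(layer_tar).hexdigest()
compressed_digest = "sha256:" + hashlib.sha256(compressed).hexdigest()

# The config lists digests of the uncompressed layers; the manifest lists
# digests of the compressed blobs, so the two values do not match.
print(uncompressed_digest != compressed_digest)
```

Because compression settings affect the output bytes, recompressing the same layer can also produce a different blob digest.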
Here's an example of layer reuse looking at two images on docker hub:
$ regctl image manifest alpine:3.13
{
"schemaVersion": 2,
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"config": {
"mediaType": "application/vnd.docker.container.image.v1+json",
"size": 1472,
"digest": "sha256:6dbb9cc54074106d46d4ccb330f2a40a682d49dda5f4844962b7dce9fe44aaec"
},
"layers": [
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 2811969,
"digest": "sha256:540db60ca9383eac9e418f78490994d0af424aab7bf6d0e47ac8ed4e2e9bcbba"
}
]
}
$ regctl image manifest redis:alpine
{
"schemaVersion": 2,
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"config": {
"mediaType": "application/vnd.docker.container.image.v1+json",
"size": 6390,
"digest": "sha256:1690b63e207f6651429bebd716ace700be29d0110a0cfefff5038bb2a7fb6fc7"
},
"layers": [
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 2811969,
"digest": "sha256:540db60ca9383eac9e418f78490994d0af424aab7bf6d0e47ac8ed4e2e9bcbba"
},
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 1258,
"digest": "sha256:29712d301e8c43bcd4a36da8a8297d5ff7f68c3d4c3f7113244ff03675fa5e9c"
},
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 384200,
"digest": "sha256:8173c12df40f1578a7b2dfbbc0034a4fbc8ec7c870fd32b9236c2e5e1936616a"
},
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 7692532,
"digest": "sha256:8cc52074f78e0a2fd174bdd470029cf287b7366bf1b8d3c1f92e2aa8789b92ae"
},
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 135,
"digest": "sha256:aa7854465cce07929842cb49fc92f659de8a559cf521fc7ea8e1b781606b85cd"
},
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 412,
"digest": "sha256:6ab1d05b49730290d3c287ccd34640610423d198e84552a4c2a4e98a46680cfd"
}
]
}
From that you can see the config blobs are completely different (as expected, these aren't the same image), but the layer from the alpine image is the same as the first layer of the redis:alpine image.
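The same comparison can be sketched in Python using the layer digests from the two manifests above. The function names are hypothetical. A matching prefix suggests the child was built on top of that base, though it does not prove which tag was used, since several tags can share the same layers.

```python
from typing import List

ALPINE_LAYERS = [
    "sha256:540db60ca9383eac9e418f78490994d0af424aab7bf6d0e47ac8ed4e2e9bcbba",
]
REDIS_LAYERS = [
    "sha256:540db60ca9383eac9e418f78490994d0af424aab7bf6d0e47ac8ed4e2e9bcbba",
    "sha256:29712d301e8c43bcd4a36da8a8297d5ff7f68c3d4c3f7113244ff03675fa5e9c",
    "sha256:8173c12df40f1578a7b2dfbbc0034a4fbc8ec7c870fd32b9236c2e5e1936616a",
    "sha256:8cc52074f78e0a2fd174bdd470029cf287b7366bf1b8d3c1f92e2aa8789b92ae",
    "sha256:aa7854465cce07929842cb49fc92f659de8a559cf521fc7ea8e1b781606b85cd",
    "sha256:6ab1d05b49730290d3c287ccd34640610423d198e84552a4c2a4e98a46680cfd",
]


def shared_base_layers(a: List[str], b: List[str]) -> List[str]:
    """Return the leading layers that two images have in common, in order."""
    shared = []
    for left, right in zip(a, b):
        if left != right:
            break
        shared.append(left)
    return shared


def could_be_based_on(base: List[str], child: List[str]) -> bool:
    """True when every layer of base appears, in order, at the start of child."""
    return len(base) <= len(child) and child[: len(base)] == base


print(could_be_based_on(ALPINE_LAYERS, REDIS_LAYERS))
print(shared_base_layers(ALPINE_LAYERS, REDIS_LAYERS))
```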
The regctl tool shown here is available from github. Disclaimer, I'm the author.

Pull docker image for different architecture

I have an air gapped system that I'm running some docker containers on. I'm trying to get some images on it however the system is a different architecture than what I'm running. For a few images (like GCC) I was able to just say docker pull repo/gcc and that worked fine, however for some reason when I try to do docker pull repo/python I get:
Using default tag: latest
latest: Pulling from repo/python
no matching manifest for linux/amd64 in the manifest list entries
Is there a way to specify the architecture in my pull request?
With the latest docker, you can specify platform when you're pulling
docker pull --platform linux/arm64 alpine:latest
Docker 20.10.0 and later (released on 2020-12-08) supports explicitly specifying the platform via the --platform flag, e.g.:
docker pull --platform linux/arm64 repo/python
Of course, the source must contain an image for the requested platform.
Answer for Docker versions before 20.10.0:
To answer the question from the title: you can pull an image by digest.
Example: list all supported architectures (manifest):
$ docker manifest inspect ckulka/multi-arch-example
{
"schemaVersion": 2,
"mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
"manifests": [
{
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"size": 2200,
"digest": "sha256:6eaeab9bf8270ce32fc974c36a15d0bac4fb6f6cd11a0736137c4248091b3646",
"platform": {
"architecture": "amd64",
"os": "linux"
}
},
{
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"size": 2413,
"digest": "sha256:f02e0fd2918a894ecd49d67b802c22082dc3c6424f6566e1753a83ba833b0993",
"platform": {
"architecture": "arm",
"os": "linux",
"variant": "v5"
}
},
...
And then pull by digest, e.g. for the arm architecture (pulled on a Linux machine):
$ docker pull ckulka/multi-arch-example@sha256:f02e0fd2918a894ecd49d67b802c22082dc3c6424f6566e1753a83ba833b0993
But you can't run every architecture on a given host, so pulling an image for a different architecture may be of little use there.
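The lookup step can be sketched in Python as follows. It assumes the manifest list JSON from docker manifest inspect has already been parsed (only the two entries visible above are included), and the function name is hypothetical. It returns a repo@digest reference that could then be passed to docker pull.

```python
from typing import Optional

# The two entries visible in the docker manifest inspect output above.
MANIFEST_LIST = {
    "manifests": [
        {
            "digest": "sha256:6eaeab9bf8270ce32fc974c36a15d0bac4fb6f6cd11a0736137c4248091b3646",
            "platform": {"architecture": "amd64", "os": "linux"},
        },
        {
            "digest": "sha256:f02e0fd2918a894ecd49d67b802c22082dc3c6424f6566e1753a83ba833b0993",
            "platform": {"architecture": "arm", "os": "linux", "variant": "v5"},
        },
    ]
}


def reference_for_platform(repo: str, manifest_list: dict, os_name: str,
                           architecture: str, variant: Optional[str] = None) -> Optional[str]:
    """Return 'repo@digest' for the first entry matching the platform, or None."""
    for entry in manifest_list["manifests"]:
        platform = entry.get("platform", {})
        if platform.get("os") != os_name or platform.get("architecture") != architecture:
            continue
        if variant is not None and platform.get("variant") != variant:
            continue
        return f"{repo}@{entry['digest']}"
    return None


print(reference_for_platform("ckulka/multi-arch-example", MANIFEST_LIST, "linux", "arm", "v5"))
```

A None result corresponds to the "no matching manifest" error from the question: the list has no entry for that platform.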
If the image is not multi-arch, you can't, unless you emulate the target architecture of the manifest.

Resources