Are tags exclusive to one image in a Docker repo? - docker

I thought that tags in Docker worked like tags on Stack Overflow, where millions of questions can share the same tag. But when I tag a second image in Docker, the first one loses its tag.
So is the image-to-tag relationship one-to-many, i.e. one image can have multiple tags in a repo, but a tag cannot be applied to two or more images in the same repo?

Pushing a new tag replaces the old tag, but if you know the digest, you can pull the old manifest until the registry garbage collects it.
A tag is a pointer to a manifest in the registry, and it can only point to a single manifest, similar to a symlink in Linux. This is needed since everything else in the registry is content addressable, so you need the tag to avoid needing to remember long digests.
There are a couple of manifest types: an image manifest and a manifest list. A manifest list contains references to other manifests and is commonly used for multi-platform images, so a tag pointing to a manifest list can refer to multiple images. But a runtime will only pull a single image out of that list. And that list is generated by the tool pushing the image; the registry does not create it dynamically by merging previously pushed images into a list (that would break the content-addressable logic, since it would change the digest).
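The pointer behaviour described in this answer can be sketched with a toy in-memory model. This is only an illustration, not how any real registry is implemented: the `ToyRepo` class, its methods, and the layer names are all hypothetical, and the digest is computed over a simple JSON serialization rather than a real manifest format.

```python
import hashlib
import json


def digest(manifest):
    """Content address: sha256 over a canonical serialization of the manifest."""
    payload = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return "sha256:" + hashlib.sha256(payload).hexdigest()


class ToyRepo:
    """Hypothetical repository: manifests keyed by digest, tags as pointers."""

    def __init__(self):
        self.manifests = {}  # digest -> manifest
        self.tags = {}       # tag name -> digest (exactly one per tag)

    def push(self, tag, manifest):
        d = digest(manifest)
        self.manifests[d] = manifest
        self.tags[tag] = d  # overwrites any previous pointer for this tag
        return d

    def pull(self, ref):
        # A reference is either a digest or a tag that resolves to one.
        d = ref if ref.startswith("sha256:") else self.tags[ref]
        return self.manifests[d]


repo = ToyRepo()
old = repo.push("latest", {"layers": ["layer-a"]})
new = repo.push("latest", {"layers": ["layer-a", "layer-b"]})
also = repo.push("v2", {"layers": ["layer-a", "layer-b"]})

print(repo.tags["latest"] == new)  # the tag moved to the second manifest
print(repo.pull(old))              # the first manifest is still reachable by digest
print(also == new)                 # two tags can point at the same manifest
```

In this model one manifest can carry many tags, but each tag resolves to exactly one digest, which matches the one-to-many relationship asked about in the question.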

Related

Deleting Docker Images from Artifactory

We are doing an artifactory cleanup, removing Docker images from our docker repo.
I understand from the article at https://jfrog.com/knowledge-base/how-can-i-delete-docker-images-older-than-a-certain-time-period
that if a layer is shared between two different images, and only one of them is a candidate for deletion, then that layer will not be deleted from the binary storage.
Our policy is to delete some specific tag versions (those not used in production), and we now have a question based on the above article.
Is there a possibility that we will end up with partially deleted (corrupted) images? Say some of the layers of the image we are trying to delete are referenced by another image: would the layers be partially deleted, leaving us with a corrupted image that could be pulled but would then fail?
You are correct, if the layer is referenced by another tag, then it will not be deleted.
The worst that can happen is that you will leave a tag behind that points to a non-existent image.
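As a rough illustration of why shared layers survive, the check can be thought of as a set difference between the layers of the tag being deleted and the layers of every other tag. This is a minimal sketch, not Artifactory's actual implementation; the function name, tag names, and layer names are made up for the example.

```python
def layers_safe_to_delete(tags, tag_to_delete):
    """Return layers of tag_to_delete that no remaining tag references."""
    candidates = set(tags[tag_to_delete])
    for name, layers in tags.items():
        if name != tag_to_delete:
            candidates -= set(layers)  # keep anything another tag still uses
    return candidates


# Hypothetical repository contents: two tags sharing two layers.
tags = {
    "app:1.0": ["base", "deps-v1", "app-1.0"],
    "app:1.1": ["base", "deps-v1", "app-1.1"],
}

removable = layers_safe_to_delete(tags, "app:1.0")
print(removable)
```

Only the layer unique to `app:1.0` ends up in the removable set, so `app:1.1` keeps every layer it needs.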

What is shared tags / Simple tags for docker images

What are shared and simple tags for Docker images?
Why are they in place?
I can see shared/simple tags listed for
https://hub.docker.com/_/hello-world
A detailed explanation would help me.
From the docker library documentation (mongo),
in short: "...(images) are identified by simple tags. Simple tags are like main tags. Shared tags group images together under common name"
To my understanding, you can have multiple images under a shared tag, which I'd view as a version (following the mongo example):
3.2.21-jessie
3.2.21-windowsservercore-ltsc2016
3.2.21-windowsservercore-1709
This would make it possible to pull the shared tag 3.2.21 and have the correct image pulled from the registry based on the host it runs on.
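A shared tag like this is typically backed by a manifest list, and the client picks the entry matching its platform. The selection might look roughly like the sketch below. The `platform`, `os`, `architecture`, `os.version`, and `digest` fields follow the manifest list format, but the digests are placeholders, the `os.version` values are illustrative, and `select_manifest` is a hypothetical helper; real clients apply more matching rules than this.

```python
def select_manifest(manifest_list, os_name, arch, os_version=None):
    """Return the digest of the first entry matching the host platform."""
    for entry in manifest_list["manifests"]:
        plat = entry["platform"]
        if plat["os"] != os_name or plat["architecture"] != arch:
            continue
        # Optionally narrow Windows entries by OS build prefix.
        if os_version is not None and not plat.get("os.version", "").startswith(os_version):
            continue
        return entry["digest"]
    raise LookupError(f"no manifest for {os_name}/{arch}")


# Hypothetical manifest list behind the shared tag 3.2.21.
shared_tag = {
    "manifests": [
        # simple tag: 3.2.21-jessie
        {"digest": "sha256:jessie-placeholder",
         "platform": {"os": "linux", "architecture": "amd64"}},
        # simple tag: 3.2.21-windowsservercore-ltsc2016
        {"digest": "sha256:ltsc2016-placeholder",
         "platform": {"os": "windows", "architecture": "amd64", "os.version": "10.0.14393"}},
        # simple tag: 3.2.21-windowsservercore-1709
        {"digest": "sha256:1709-placeholder",
         "platform": {"os": "windows", "architecture": "amd64", "os.version": "10.0.16299"}},
    ]
}

print(select_manifest(shared_tag, "linux", "amd64"))
print(select_manifest(shared_tag, "windows", "amd64", os_version="10.0.16299"))
```

Each simple tag still resolves to exactly one image; the shared tag only adds a level of indirection on top.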

How Do I Know What Is In A Public Docker Image?

Is there any way to know what is in a public image other than downloading it and checking it out manually?
For example, I can see various Java images and various Ansible images on Docker Hub, but I would have to download quite a lot of them to determine which one to use and whether any contain both.
The dockerfile lists some info but often there is inheritance and so you can't see all the info.
Is there anything that lists all the contained packages or an online service that lets you try them out without downloading the whole image?
MicroBadger lists the docker history of a Docker image and shows matching base images (with their layers as well). E.g. https://microbadger.com/images/ansible/ansible

GC collection of Docker Registry

Since v2.4.0 a garbage collector command is included within the registry binary. I read about how it works in the official documentation.
To use the garbage-collection:
bin/registry garbage-collect [--dry-run] /path/to/config.yml
I see the config in /etc/docker/registry/config.yml
When I perform a dry run, I see a lot of blobs marked and, at the end, the blobs which would have been deleted without --dry-run.
But I don't see how I can easily link these blobs to images.
Which images will be deleted? Am I able to choose which image should be deleted, or do I need to use another command first and then run the GC?
Can someone maybe provide an example in which case an image/blob will be deleted? Thanks
From your referenced documentation:
In the context of the Docker registry, garbage collection is the process of removing blobs from the filesystem which are no longer referenced by a manifest. Blobs can include both layers and manifests.
Manifests are groups of blobs (layers) used to represent an image tag. The only blobs deleted are those no longer referenced by any image. So to answer your question: if GC is working correctly, no one should be able to give an example of it deleting an image, but any useful GC will delete unreferenced blobs, including your own.
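The quoted definition amounts to a mark-and-sweep pass: mark every blob that some manifest references, then sweep the rest. The sketch below is a simplified model under that assumption, not the registry's actual code; `garbage_collect`, the blob names, and the image names are hypothetical.

```python
def garbage_collect(blobs, manifests, dry_run=True):
    """Mark blobs referenced by any manifest, then sweep the unreferenced ones."""
    marked = set()
    for layer_digests in manifests.values():
        marked.update(layer_digests)  # mark phase
    unreferenced = set(blobs) - marked
    if not dry_run:
        for d in unreferenced:        # sweep phase
            del blobs[d]
    return unreferenced


# Hypothetical storage: three blobs, two images sharing blob "a".
blobs = {"a": b"...", "b": b"...", "c": b"..."}
manifests = {"image-one": ["a", "b"], "image-two": ["a", "c"]}

print(garbage_collect(blobs, manifests))  # nothing to delete yet

# Deleting an image's manifest is a separate step; GC only cleans up afterwards.
del manifests["image-two"]
print(garbage_collect(blobs, manifests))                 # dry run reports "c"
print(garbage_collect(blobs, manifests, dry_run=False))  # actually removes "c"
print(sorted(blobs))
```

In this model the GC never decides which image goes away. Blob `c` only becomes eligible once the manifest for `image-two` has been removed, and the shared blob `a` stays because `image-one` still references it.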

What is the relationship between a docker image ID and the IDs in the manifests?

I am trying to understand the connection between the image ID as reported by docker images (or docker inspect) and the actual layers or images in the registry or manifest (using v2).
I run docker images, I get (abbreviated and changed to protect the not-so-innocent):
REPOSITORY                     TAG      IMAGE ID
my.local.registry/some/image   latest   abcdefg
If I pull the manifest for the above image using the API, I get one that contains fsLayers, not one of which matches the (full) ID for the image. I get that, since the image is the sum of the layers.
However, if I pull that image elsewhere, I get the same ID. If I update the image, push and pull it, the new version has a new ID.
I thought it might be the hash of the manifest. However, (a) pulling the manifest via the API does not return the hash of the manifest in the JSON, and (b) looking in the registry directory itself, the sha256 of the given manifest in /var/lib/registry/v2/repositories/some/image/_manifests/tags/latest/current/link (or those in index/sha256/) gives the correct link for the manifest that was downloaded, but does not match the image ID.
I had assumed the image ID matches the blob, but perhaps I am in error?
When we push an image to a registry:
We create a manifest that describes the image and the layers inside it, and push the manifest and the layers independently.
We compress the layers before pushing them.
So on our host, the hashes we have are hashes of the uncompressed content of those layers, called the Content Hashes.
But we send compressed layers to the registry, and compression changes the bytes, so the hashes change too. Before the compressed layers are sent, their hashes are calculated; these are called the Distribution Hashes, and they are what goes into the manifest file.
Because the Content Hashes and Distribution Hashes differ, you see a difference in the IDs.
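The effect of compression on the hash is easy to reproduce. The following sketch hashes the same bytes before and after gzip compression; the byte string stands in for a layer tarball and is purely illustrative, and the variable names simply mirror the terms used in this answer.

```python
import gzip
import hashlib

# Stand-in for an uncompressed layer tarball (illustrative bytes only).
layer_tar = b"pretend this is an uncompressed layer tarball\n" * 100

# Hash of the uncompressed content, as seen on the host.
content_hash = hashlib.sha256(layer_tar).hexdigest()

# Hash of the compressed bytes, as they would be sent to a registry.
compressed = gzip.compress(layer_tar, mtime=0)
distribution_hash = hashlib.sha256(compressed).hexdigest()

print(content_hash == distribution_hash)  # same layer, different bytes, different hash

# Decompressing recovers the original bytes, so the content hash is reproducible.
restored_hash = hashlib.sha256(gzip.decompress(compressed)).hexdigest()
print(restored_hash == content_hash)
```

This is why a digest found in the manifest will not match a hash computed over the unpacked layer on the host, even though both describe the same layer.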
