GC collection of Docker Registry - docker

Since v2.4.0 a garbage collector command is included within the registry binary. I read about how it works in the official documentation.
To run garbage collection:
bin/registry garbage-collect [--dry-run] /path/to/config.yml
My config is at /etc/docker/registry/config.yml
When I perform a dry run, I see a lot of blobs marked and, at the end, the blobs that would have been deleted without --dry-run.
But I don't see how I can easily link these blobs to images.
Which images will be deleted? Can I specify which image should be deleted, or do I need to run another command first and then run the GC?
Can someone provide an example of a case in which an image/blob will be deleted? Thanks

From your referenced documentation:
In the context of the Docker registry, garbage collection is the process of removing blobs from the filesystem which are no longer referenced by a manifest. Blobs can include both layers and manifests.
A manifest references the group of blobs (layers) that make up an image, and a tag points to a manifest. The only blobs deleted are those no longer referenced by any image. So to answer your question: if GC is working correctly, it never deletes an image that is still referenced, so no one should be able to give an example of that. What any useful GC does delete, this one included, is blobs that nothing references anymore.
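To make this concrete: a blob only becomes eligible for collection once no manifest references it, so removing an image is usually a two-step process (delete the manifest, then run the GC). Below is a minimal sketch, not a definitive procedure. It assumes a registry reachable over plain HTTP at localhost:5000, a repository named myrepo/myimage with a tag old-tag (all three are hypothetical placeholders), and that deletion is enabled in the registry config (storage.delete.enabled: true). The helper function names are illustrative only.

```shell
#!/bin/sh
# Sketch: untag/delete an image in a private registry so GC can reclaim it.
# REGISTRY, REPO and TAG are placeholders (assumptions), adjust to your setup.
REGISTRY="${REGISTRY:-localhost:5000}"
REPO="${REPO:-myrepo/myimage}"
TAG="${TAG:-old-tag}"

# Build the manifest URL for a tag or digest reference.
manifest_url() {
    printf 'http://%s/v2/%s/manifests/%s\n' "$1" "$2" "$3"
}

# Extract the manifest digest from HTTP response headers read on stdin.
parse_digest() {
    tr -d '\r' | awk 'tolower($1) == "docker-content-digest:" { print $2 }'
}

# Step 1: ask the registry which digest the tag currently points to.
# (Only the schema 2 image manifest type is requested here.)
lookup_digest() {
    curl -sI -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' \
        "$(manifest_url "$REGISTRY" "$REPO" "$TAG")" | parse_digest
}

# Step 2: delete the manifest; the registry expects a digest here, not a tag.
delete_manifest() {
    curl -s -X DELETE "$(manifest_url "$REGISTRY" "$REPO" "$1")"
}

# Step 3 (run on the registry host, ideally while nothing is being pushed):
#   bin/registry garbage-collect --dry-run /etc/docker/registry/config.yml
```

Calling delete_manifest "$(lookup_digest)" removes the manifest. Layers that were referenced only by that manifest should then show up in the dry-run output as eligible for deletion, while layers still shared with other images are kept.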

Related

Why is a new docker image the same size of the original one from which the commit was made?

I downloaded a Docker image and made some changes inside a container based on it. Then I committed those changes to create a new image that I would actually like to use.
docker images says that these images have about the same size. So, it seemed to me that Docker copied everything it needs to the new image.
Yet I can't remove the old image which I no longer need. It seems like I'm getting the worst of both worlds: neither is space conserved by a parenting relationship, nor can I delete the unwanted files.
What gives? Am I interpreting docker images output wrong (maybe it's not reporting the actual on-disk size)?
You can remove the first image by force:
docker image rm -f $IMAGE_ID
As for the sizes being the same, that depends mainly on your changes. You can check whether they match exactly, down to the byte, with:
docker image inspect IMAGE_NAME:$TAG --format='{{.Size}}'
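One likely explanation for the matching sizes: the SIZE column of docker images reports the total size of an image, including the layers it shares with its parent, while those shared layers are stored on disk only once. The following is a minimal sketch for checking which layers two local images have in common. The image names base:latest and committed:latest and the helper function names are hypothetical.

```shell
#!/bin/sh
# List the layer digests of a local image, one per line.
image_layers() {
    docker image inspect \
        --format '{{range .RootFS.Layers}}{{println .}}{{end}}' "$1"
}

# Print the lines of list $1 that also appear in list $2.
# Both arguments are newline-separated layer digests.
shared_layers() {
    printf '%s\n' "$1" | grep -Fx -e "$2"
}

# Usage (requires a Docker daemon; the image names are placeholders):
#   shared_layers "$(image_layers base:latest)" "$(image_layers committed:latest)"
#   docker system df -v    # also reports shared vs. unique size per image
```

If every layer of the original image appears in the committed one, the new image adds only its own top layer on disk, even though both images report a similar total size.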

How to add/mount large files kept in SharePoint to Docker Container through Dockerfile

I'm new to using Docker and wanted to understand how to add large folders (combined ~1GB) kept elsewhere (such as in SharePoint) to the Docker container using Dockerfile. What is the best way to add the files and can someone explain the commands to be used? For example, one method I have come across is the following:
ADD http://example.com/big.tar.xz /usr/src/things/
Does the /usr/src/things/ specify the location where I want to save the folders (not individual files) with respect to my original repository?
This answer is from "Adding large files to docker during build", which covers the question at a high level. Can someone share details/commands for each step involved? One answer mentions not adding the files to the image but mounting them as a volume. Is that a better option than using ADD in the Dockerfile?
Thanks!

Deleting Docker Images from Artifactory

We are doing an Artifactory cleanup, removing Docker images from our Docker repo.
I understand from the article at https://jfrog.com/knowledge-base/how-can-i-delete-docker-images-older-than-a-certain-time-period
that if a layer is shared between two different images and only one of them is a candidate for deletion, then that layer will not be deleted from the binary storage.
Our policy is to delete some specific tag versions (those not used in production), and we have a question based on the above article:
Is there a possibility that we will end up with partially deleted (corrupted) images? Say some of the layers of the image we are trying to delete are referenced by some other image: would the layers be partially deleted, leaving us with a corrupted image that could be pulled but would then fail?
You are correct, if the layer is referenced by another tag, then it will not be deleted.
The worst that can happen is that you will leave a tag behind that points to a non-existent image.

Are tags exclusive to one image in a Docker repo?

I thought that tags in Docker worked like on Stack Overflow, where millions of questions can be tagged with the same tag. But when I tag a second image in Docker, the first one loses its tag.
So are images to tags one-to-many, i.e. one image can have multiple tags in a repo, but a tag cannot be applied to 2 or more images in the same repo?
Pushing a new tag replaces the old tag, but if you know the digest, you can pull the old manifest until the registry garbage collects it.
A tag is a pointer to a manifest in the registry, and it can only point to a single manifest, similar to a symlink in Linux. This is needed since everything else in the registry is content addressable, so you need the tag to avoid needing to remember long digests.
There are a couple of manifest types: an image manifest and a manifest list. The manifest list contains references to other manifests and is commonly used for multi-platform images. So a tag pointing to a manifest list can refer to multiple images. But a runtime will pull only a single image out of that list. And that list is generated by the tool pushing the image, not dynamically created by the registry by merging the previous images into a list (that would break the content-addressable logic, since it would change the digest).
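The symlink analogy can be tried out directly on a filesystem. This is a toy model only, not how a registry is implemented: manifests are stored under a name derived from their content (cksum stands in for a real digest), and the tag latest is a symlink that can point at exactly one of them. The function name put_manifest and the manifest contents are made up for illustration.

```shell
#!/bin/sh
# Toy model of a registry: manifests are stored under a name derived from
# their content, and a tag is a symlink pointing at exactly one of them.
store="$(mktemp -d)"

# Store a manifest under its checksum and print that checksum (the "digest").
put_manifest() {
    digest="$(printf '%s' "$1" | cksum | cut -d' ' -f1)"
    printf '%s\n' "$1" > "$store/$digest"
    printf '%s\n' "$digest"
}

digest_a="$(put_manifest 'manifest of image A')"
digest_b="$(put_manifest 'manifest of image B')"

ln -s  "$digest_a" "$store/latest"   # push image A as "latest"
ln -sf "$digest_b" "$store/latest"   # push image B as "latest": the tag moves

cat "$store/latest"       # the tag now resolves to image B
cat "$store/$digest_a"    # image A is untagged but still reachable by digest
```

With a real registry, the equivalent of the last line is pulling by digest, for example docker pull IMAGE@sha256:<digest>, which keeps working until the untagged manifest is garbage collected.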

How Do I Know What Is In A Public Docker Image?

Is there any way to know what is in a public image other than downloading it and checking it out manually?
For example, I can see various Java images and various Ansible images on Docker Hub. I would have to download quite a lot of them to determine which one to use and whether any of them has both.
The Dockerfile lists some info, but often there is inheritance, so you can't see all of it.
Is there anything that lists all the contained packages or an online service that lets you try them out without downloading the whole image?
MicroBadger lists the docker history of a Docker image and shows matching base images (with their layers as well). E.g. https://microbadger.com/images/ansible/ansible
