Does building docker images on different machines prevent sharing the layers?

If we build an image on machine 1, tag it as machine1:latest and push it to our docker registry, then build another image from the same Dockerfile on machine 2, tag it as machine2:latest and push it to the registry, will the registry reuse the layers of machine1:latest? Or, because we built the image on a different machine, will the layers be different?
In general, what factors affect layer sharing in docker?

The registry uses content hashes to detect layers and share them where possible. If we build on different machines, sharing may not happen, because each docker daemon can produce different hashes for the same Dockerfile (for example, because of timestamps embedded in the layer contents), and differing hashes prevent layer sharing.
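One way to check this in practice, assuming both tags are available locally, is to compare the layer digests that docker records for each image:
# print the content-addressed layer digests of each image
docker image inspect --format '{{json .RootFS.Layers}}' machine1:latest
docker image inspect --format '{{json .RootFS.Layers}}' machine2:latest
If the two lists match, the registry will share the layers; any digest that differs will be uploaded again.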

Related

How to copy multi-arch docker images to a different container registry?

There's a well-known approach for copying docker images from one container registry to another. When the original registry is Docker Hub, the typical workflow is something like this:
docker pull <image:tag>
docker tag <image:tag> <new-reg-url/uid/image:tag>
docker push <new-reg-url/uid/image:tag>
Now, how do you do the above when dealing with multi-architecture images?
As per the information in this link, you can rely on buildx to construct multi-arch images, and while doing that you can also upload them to whichever repo you wish. But how do I do this without having to first build the images?
It looks like the buildx CLI has unnecessarily (?) coupled the uploading process with the building one. Any suggestions?
Thanks!
While the docker pull ...; docker tag ...; docker push ... syntax is the easy way to move images between registries, it has a couple of drawbacks. The first, as you've seen, is that it dereferences a multi-platform image to a single platform. The second is that it pulls all layers to the docker engine even if the remote registry already has those layers, making it a bad method for ephemeral CI workers that would always need to pull every layer.
To do this, I prefer talking directly to the registry servers rather than to the docker engine itself. You don't need the engine's functionality to run the images; all you need is the registry API. Docker has documented the original registry API, and OCI recently went 1.0 on the distribution-spec, which should get us some standardization.
There's a variety of tooling based on those specs, from the docker engine itself and containerd, to skopeo, Google's crane, and regclient, which I've been working on. Doing this with regclient's regctl command looks like:
regctl image copy <source_image:tag> <target_image:tag>
The result is that the various layers, image config, manifests, and multi-platform manifest list are copied between registries, but only the layers that don't already exist on the target registry are transferred.
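For comparison, a rough skopeo equivalent that also keeps all platforms looks like this (registry names are illustrative):
# copy every platform of a multi-arch image directly between registries
skopeo copy --all docker://registry-a.example.com/app:tag docker://registry-b.example.com/app:tag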
2022 (docker built-in) solution
It's possible to perform the copy using the sparsely documented built-in command docker buildx imagetools create with --tag:
# i.e.
OLD_TAG=registry.example.com/namespaced/repository/example-image:old-tag
NEW_TAG=registry.example.com/namespaced/repository/example-image:new-tag
# we can
docker buildx imagetools create --tag "$NEW_TAG" "$OLD_TAG"
Reference documentation
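To confirm that the new tag still references every platform, you can inspect it afterwards:
# list the manifests (one per platform) behind the new tag
docker buildx imagetools inspect "$NEW_TAG"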
IMPORTANT NOTE: At the moment there is no support for performing this operation across different repositories. Given tags like
OLD_TAG=registry.example.com/namespaced/repository/example-image:latest
NEW_TAG=registry.example.com/other-repository/example-image:latest
You end up with an error like:
error: multiple repositories currently not supported
For that situation I'm going to try the currently accepted answer.
As @laconbass has written, this can be done with docker buildx imagetools create. The ability to do this across multiple repositories was added in this PR:
docker buildx imagetools create -t <NEW-TAG> <OLD-TAG>
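For example, reusing the tag names from the answer above, a cross-repository copy would look like this (assuming a buildx version that includes that PR):
docker buildx imagetools create -t registry.example.com/other-repository/example-image:latest registry.example.com/namespaced/repository/example-image:latest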

Can Kubernetes ever create a Docker image?

I'm new to Kubernetes and I'm learning about it.
Are there any circumstances where Kubernetes is used to create a Docker image instead of pulling one from a repository?
Kubernetes does not natively create images, but you can run a tool such as kaniko inside the Kubernetes cluster to achieve this. Kaniko is a tool for building container images from a Dockerfile, inside a container or Kubernetes cluster.
The kaniko executor image is responsible for building an image from a Dockerfile and pushing it to a registry. Within the executor image, we extract the filesystem of the base image (the FROM image in the Dockerfile). We then execute the commands in the Dockerfile, snapshotting the filesystem in userspace after each one. After each command, we append a layer of changed files to the base image (if there are any) and update image metadata.
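As a minimal sketch, assuming a public git repository as the build context and leaving out the registry credentials a real push would need (usually a docker config secret mounted into the pod), a one-off kaniko build could be launched like this (names and URLs are illustrative):
# run the kaniko executor as a one-off pod
kubectl run kaniko --image=gcr.io/kaniko-project/executor:latest --restart=Never -- \
  --dockerfile=Dockerfile \
  --context=git://github.com/example/repo.git \
  --destination=registry.example.com/example/app:latest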
Several options exist to create docker images inside Kubernetes.
If you are already familiar with docker and want a mature project, you could use docker CE running inside Kubernetes. Check here: https://hub.docker.com/_/docker and look for the dind tag (docker-in-docker). Keep in mind there are pros and cons to this approach, so take care to understand them.
Kaniko seems to have potential but there's no version 1 release yet.
I've been using docker dind (docker-in-docker) to build docker images that run in a production Kubernetes cluster, with good results.
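A minimal local smoke test of that setup looks like this (the container name is illustrative):
# start a Docker daemon inside a container; dind requires privileged mode
docker run -d --privileged --name dind docker:dind
# talk to the inner daemon to confirm it is up
docker exec dind docker version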

Syncing docker images

I have 2 machines (separate hosts) running docker and I am using the same image on both machines. How do I keep the two images in sync? For example, suppose I make changes to the image on one of the hosts and want the changes to reflect on the other host as well. I can commit the image and copy it over to the other host. Is there any other, more efficient way of doing this?
Some ways I can think of:
1. with a Docker registry
the workflow here is:
HOST A: docker commit, docker push
HOST B: docker pull
2. by saving the image to a .tar file (see the sketch after this list)
the workflow here is:
HOST A: docker save
HOST B: docker load
3. with a Dockerfile and by building the image again
the workflow here is:
provide a Dockerfile together with your code / files required
every time your code changes and you want to make a release, use docker build to create a new image.
on the hosts that should receive the update, get the updated source code (for example with a version control system like Git) and then docker build the image
4. CI/CD pipeline
you can see a video here: docker.com/use-cases/cicd
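As a concrete sketch of option 2 (hostnames and image names are illustrative):
# HOST A: export the image to a tarball and copy it across
docker save -o myimage.tar myimage:latest
scp myimage.tar hostB:/tmp/myimage.tar
# HOST B: import the tarball into the local engine
docker load -i /tmp/myimage.tar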
Keep in mind that containers are considered to be ephemeral. This means that updating an image inside another host will then require:
to stop and remove any old container (running with the outdated image)
to run a new one (with the updated image)
I quote from: Best practices for writing Dockerfiles
General guidelines and recommendations
Containers should be ephemeral
The container produced by the image your Dockerfile defines should be as ephemeral as possible. By “ephemeral,” we mean that it can be stopped and destroyed and a new one built and put in place with an absolute minimum of set-up and configuration.
You can perform docker push to upload your image to a docker registry, and docker pull to get the latest image on another host.
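For example (the registry name is illustrative):
# HOST A
docker tag myimage:latest registry.example.com/myimage:latest
docker push registry.example.com/myimage:latest
# HOST B
docker pull registry.example.com/myimage:latest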

Clone an image from a docker registry to another

I have a private registry with a set of images. It can be visualized as a store of applications.
My app can take these applications and run them on other machines.
To achieve this, my app first pulls the image from the private registry and then copies it to a local registry for later use.
The steps are as follows:
docker pull privateregistry:5000/company/app:tag
docker tag privateregistry:5000/company/app:tag localregistry:5000/company/app:tag
docker push localregistry:5000/company/app:tag
Then later on a different machine in my network:
docker pull localregistry:5000/company/app:tag
Is there a way to efficiently copy an image from one registry to another without using a docker client in between?
You can use docker save to save the image to a tar archive, copy the tar to the new host, and then use docker load to load it there.
Read the link below for more:
https://docs.docker.com/engine/reference/commandline/save/
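If the two hosts can reach each other over SSH, the archive can also be streamed without an intermediate file (the hostname is illustrative):
# stream the image straight into the other host's docker engine
docker save privateregistry:5000/company/app:tag | ssh otherhost docker load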
Is there a way to efficiently copy an image from one registry to another without using a docker client in between?
Yes, there's a variety of tools that implement this today. RedHat has been pushing their skopeo, Google has crane, and I've been working on my own with regclient. Each of these tools talks directly to the registry server without needing a docker engine. And at least with regclient (I haven't tested the others), these will only copy the layers that are not already in the target registry, avoiding the need to pull layers again. Additionally, you can move a multi-platform image, retaining all of the available platforms, which you would lose with a docker pull since that dereferences the image to a single platform.
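For example, with crane (using the image names from the question), a registry-to-registry copy is a single command that never touches a local docker engine:
crane copy privateregistry:5000/company/app:tag localregistry:5000/company/app:tag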

Does docker reuse images when multiple containers run on the same host?

My understanding is that Docker creates an image layer at every stage of a Dockerfile.
If I have X containers running on the same machine (where X >=2) and every container has a common underlying image layer (ie. debian), will docker keep only one copy of the base image on that machine, or does it have multiple copies for each container?
Is there a point where this breaks down, or is it true for every layer in the Dockerfile?
How does this work?
Does Kubernetes affect this in any way?
Docker's Understand images, containers, and storage drivers documentation covers most of this.
From Docker 1.10 onwards, all the layers that make up an image have an SHA256 secure content hash associated with them at build time. This hash is consistent across hosts and builds, as long as the content of the layer is the same.
If any number of images share a layer, only the 1 copy of that layer will be stored and used by all images on that instance of the Docker engine.
A tag like debian can refer to different SHA256 image hashes over time as new releases come out. Two images built with FROM debian don't necessarily share layers; they do only if the SHA256 hashes match.
Anything that runs the Docker Engine underneath will use this storage setup.
This sharing also works in the Docker Registry (>2.2 for the best results). If you were to push images with layers that already exist on that registry, the existing layers are skipped. Same with pulling layers to your local engine.
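You can observe this sharing on a single engine, for instance by comparing the layer digests of two local images (names are illustrative) and checking the space docker reports as shared:
# identical digests in both lists are stored only once
docker image inspect --format '{{json .RootFS.Layers}}' app-one:latest app-two:latest
# the SHARED SIZE column shows space reused between images
docker system df -v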
