Azure DevOps replace variables in web.config in docker image on deployment - docker

Is there a way to do variable/file transforms in the *.config file of a docker image as it's deployed (to a Kubernetes cluster - AKS)?
This could be done by performing the replacement and building a separate Docker container image for each configuration, but that would lead to a lot of extra container images whose only difference is the configuration file.

A Docker image is built as a stack of read-only layers; when a container runs, a thin readable/writeable layer is added on top of them. The read-only layers (also called intermediate images) are generated when the commands in the Dockerfile are executed during the Docker image build.
These intermediate layers are shared across Docker images to increase reusability, decrease disk usage, and speed up docker build by allowing each step to be cached.
So, if you create multiple images by changing only the configuration setting, each image reuses the shared layers that are unchanged, and only the changed configuration adds to the size of the new images.
Alternatively, if you use Kubernetes, you could define these configurations as a ConfigMap and mount them on the pod as volumes or environment variables.
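For example, here is a minimal sketch of the ConfigMap approach; the web-config ConfigMap, the web-app name, the image path, and the mount path are all placeholders/assumptions, not anything from the original question. The Web.config from the ConfigMap is mounted over the one baked into the image:
apiVersion: v1
kind: ConfigMap
metadata:
  name: web-config                  # hypothetical name
data:
  Web.config: |
    <configuration>
      <appSettings>
        <add key="Environment" value="Staging" />
      </appSettings>
    </configuration>
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app                     # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: web-app
          image: myregistry.azurecr.io/web-app:1.0   # placeholder image
          volumeMounts:
            - name: config
              mountPath: /app/Web.config             # assumed path inside the image
              subPath: Web.config                    # mount only this one file
      volumes:
        - name: config
          configMap:
            name: web-config
The same image can then be deployed to every environment, with only the ConfigMap differing per cluster or namespace.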

Related

Is there any way to configure Skaffold to build images on my local Docker daemon and not on the minikube's one?

I use minikube with the Docker driver on Linux. For a manual workflow I can enable the registry addon in minikube, push my images there, and refer to them in the deployment config file simply as localhost:5000/anything. They are then pulled into minikube's environment by its Docker daemon and the deployments start successfully. As a result, all the base images are saved only on my local device (since I build my images with my local Docker daemon) and minikube's environment is cluttered only by the images its Docker daemon pulls.
Can I implement the same workflow with Skaffold? By default Skaffold uses minikube's environment both for building images and for running containers from them, and it also duplicates (sometimes even triplicates) my images inside minikube (I don't know why).
Skaffold builds directly to Minikube's Docker daemon as an optimization so as to avoid the additional retrieve-and-unpack required when pushing to a registry.
I believe your duplicates are like the following:
$ (eval $(minikube docker-env); docker images node-example)
REPOSITORY TAG IMAGE ID CREATED SIZE
node-example bb9830940d8803b9ad60dfe92d4abcbaf3eb8701c5672c785ee0189178d815bf bb9830940d88 3 days ago 92.9MB
node-example v1.17.1-38-g1c6517887 bb9830940d88 3 days ago 92.9MB
Although these images have different tags, those tags are just pointers to the same Image ID so there is a single image being retained.
Skaffold normally cleans up left-over images from previous runs. So you shouldn't see the minikube daemon's space continuously growing.
An aside: even if those Image IDs were different, an image is made up of multiple layers, and those layers are shared across images, so Docker's reported image sizes may not match the actual disk space consumed.
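If you want to confirm this yourself, something like the following (run against minikube's daemon, using the two tags from the output above) should print the same image ID twice:
$ eval $(minikube docker-env)
$ docker image inspect --format '{{.Id}}' \
    node-example:v1.17.1-38-g1c6517887 \
    node-example:bb9830940d8803b9ad60dfe92d4abcbaf3eb8701c5672c785ee0189178d815bf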

Can Kubernetes ever create a Docker image?

I'm new to Kubernetes and I'm learning about it.
Are there any circumstances where Kubernetes is used to create a Docker image instead of pulling it from a repository?
Kubernetes natively does not create images, but you can run a tool such as kaniko in the Kubernetes cluster to achieve it. Kaniko is a tool to build container images from a Dockerfile, inside a container or Kubernetes cluster.
The kaniko executor image is responsible for building an image from a Dockerfile and pushing it to a registry. Within the executor image, we extract the filesystem of the base image (the FROM image in the Dockerfile). We then execute the commands in the Dockerfile, snapshotting the filesystem in userspace after each one. After each command, we append a layer of changed files to the base image (if there are any) and update image metadata.
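A minimal sketch of running the kaniko executor as a Kubernetes Pod; the git context, destination, and regcred registry secret below are placeholders you would need to replace:
apiVersion: v1
kind: Pod
metadata:
  name: kaniko-build
spec:
  restartPolicy: Never
  containers:
    - name: kaniko
      image: gcr.io/kaniko-project/executor:latest
      args:
        - --dockerfile=Dockerfile
        - --context=git://github.com/<your-org>/<your-repo>.git   # placeholder build context
        - --destination=<your-registry>/<your-image>:<tag>        # where the built image is pushed
      volumeMounts:
        - name: docker-config
          mountPath: /kaniko/.docker            # registry credentials used for the push
  volumes:
    - name: docker-config
      secret:
        secretName: regcred                     # assumed docker-registry secret
        items:
          - key: .dockerconfigjson
            path: config.json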
Several options exist to create docker images inside Kubernetes.
If you are already familiar with Docker and want a mature project, you could use Docker CE running inside Kubernetes. Check here: https://hub.docker.com/_/docker and look for the dind tag (docker-in-docker). Keep in mind there are pros and cons to this approach, so take care to understand them.
Kaniko seems to have potential but there's no version 1 release yet.
I've been using docker dind (docker-in-docker) to build Docker images that run in a production Kubernetes cluster, with good results.
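For reference, a rough (not production-hardened) sketch of what the docker-in-docker approach can look like as a Pod, with a privileged dind container and a client container that talks to it; the names and tags here are assumptions:
apiVersion: v1
kind: Pod
metadata:
  name: dind-builder
spec:
  containers:
    - name: dind
      image: docker:dind
      securityContext:
        privileged: true               # dind requires a privileged container
      env:
        - name: DOCKER_TLS_CERTDIR
          value: ""                    # disable TLS so the daemon listens on tcp://:2375
    - name: builder
      image: docker:cli
      command: ["sh", "-c", "sleep 3600"]   # keep the pod around so you can exec builds into it
      env:
        - name: DOCKER_HOST
          value: tcp://localhost:2375  # point the docker CLI at the dind sidecar
You can then kubectl exec into the builder container and run docker build / docker push against the sidecar daemon.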

Why does docker have to create an image from a dockerfile then create a container from the image instead of creating a container from a Dockerfile?

Why does docker have to create an image from a dockerfile then create a container from the image instead of creating a container directly from a Dockerfile?
What is the purpose/benefit of creating the image first from the Dockerfile then from that create a container?
-----EDIT-----
This question, What is the difference between a Docker image and a container?, does not answer my question.
My question is: why do we need to create a container from an image and not from a Dockerfile? What is the purpose/benefit of creating the image first from the Dockerfile and then creating a container from it?
the Dockerfile is the recipe to create an image
the image is a virtual filesystem
the container is a running process on a host machine
You don't want every host to build its own image based on the recipe. It's easier for some hosts to just download an image and work with that.
Creating an image can be very expensive. I have complicated Dockerfiles that may take hours to build, may download 50 GB of data, yet still only create a 200 MB image that I can send to different hosts.
Spinning up a container from an existing image is very cheap.
If all you had was the Dockerfile, and every host had to build its own image before it could spin up containers, the entire workflow would become very cumbersome.
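A simple illustration of that split (the image name and registry are made up):
$ docker build -t registry.example.com/myapp:1.0 .   # expensive: runs every step in the Dockerfile
$ docker push registry.example.com/myapp:1.0         # publish the finished image once
$ # on any other host:
$ docker pull registry.example.com/myapp:1.0
$ docker run -d registry.example.com/myapp:1.0       # cheap: just starts a process from the image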
Images and Containers are two different concepts.
Basically, images are like a snapshot of a filesystem, along with some meta-data.
A container is a process (or group of processes) actually running on a host, based on an image. As soon as the processes end, your container no longer exists (well, it is stopped, to be exact).
You can view the image as the base that you will make your container run on.
Thus, your Dockerfile will create an image (which is static), which you can store locally or push to a repository, to be able to use it later.
The container cannot be "stored" because it is a "living" thing.
You can think of Images vs Containers similar to Classes vs Objects or the Definition vs Instance. The image contains the filesystem and default settings for creating the container. The container contains the settings for a specific instance, and when running, the namespaces and running process.
As for why you'd want to separate them: efficiency and portability. Since we have separate images, we also have inheritance, where one image extends another. The key detail of that inheritance is that filesystem layers in the image are not copied for each image. Those layers are static, and you can only change them by creating a new image with new layers. Using the overlay filesystem (or one of the other union filesystem drivers), we can append additional changes to that filesystem with our new image. Containers do the same when they run the image. That means you can have a 1 Gig base image, extend it with a child image with 100 Megs of changes, and run 5 containers that each write 1 Meg of files, and the overall disk space used on the docker host is only 1.105 Gigs rather than 7.6 Gigs.
The portability part comes into play when you use registries, e.g. Docker Hub. The image is the part of the container that is generic, reusable, and transferable. It's not associated with an instance on any host. So you can push and pull images, but containers are tightly bound to the host they are running on, named volumes on that host, networks defined on that host, etc.
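You can see that sharing on a host directly; for example (the image and container names here are arbitrary):
$ docker run -d --name c1 debian:12 sleep 300
$ docker run -d --name c2 debian:12 sleep 300
$ docker ps --size     # SIZE: each container's small writable layer vs the shared "virtual" image size
$ docker system df     # how much disk images, containers, and volumes actually consume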

Shared volume during build?

I have a docker-compose environment set up like so:
Oracle
Filesystem
App
...
etc...
The filesystem container downloads the latest code from our repo and exposes its volume for other containers to mount. This works great except that containers that need to use the code to do builds can't access it since the volume isn't mounted until the containers are run.
I'd like to avoid checkout/downloading the code since the codebase is over 3 gig right now... Hence trying to do something spiffier.
Is there a better way to do this?
As you mentioned, Docker volumes won't work, as volumes are only mounted when the container starts.
The best solution for your situation is to use Docker multi-stage builds. The idea here is to have an image which holds the code base, so that other images can access the code directly from that image.
You basically have an image that is responsible for pulling the code:
FROM alpine/git
RUN git clone ...
You then build this image, either separately or as the first image in a compose file.
Other images can then use this image in a multi-stage build, copying the code into whatever base image they need:
FROM code-image AS code
FROM <base-image>
COPY --from=code /git/<code-repository> /code
This will make the code available to all the images, and it will only be pulled once from the remote repo.
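Usage would look roughly like this, assuming the hypothetical Dockerfile names below and the code-image tag used above:
$ docker build -t code-image -f Dockerfile.code .   # clones the repo once and caches it in a layer
$ docker build -t app-image  -f Dockerfile.app .    # COPY --from=code pulls the code out of code-image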

Does docker reuse images when multiple containers run on the same host?

My understanding is that Docker creates an image layer for every step in a Dockerfile.
If I have X containers running on the same machine (where X >= 2) and every container has a common underlying image layer (i.e. debian), will Docker keep only one copy of the base image on that machine, or does it store a separate copy for each container?
Is there a point this breaks down, or is it true for every layer in the dockerfile?
How does this work?
Does Kubernetes affect this in any way?
Docker's Understand images, containers, and storage drivers documentation covers most of this.
From Docker 1.10 onwards, all the layers that make up an image have an SHA256 secure content hash associated with them at build time. This hash is consistent across hosts and builds, as long as the content of the layer is the same.
If any number of images share a layer, only one copy of that layer will be stored and used by all images on that instance of the Docker Engine.
A tag like debian can refer to multiple SHA256 image hashes over time as new releases come out. Two images that are built with FROM debian don't necessarily share layers; they do only if the SHA256 hashes match.
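You can check this directly by comparing layer digests; any digest that appears in both lists is stored only once on that host (the second image name is a placeholder):
$ docker image inspect --format '{{json .RootFS.Layers}}' debian
$ docker image inspect --format '{{json .RootFS.Layers}}' <your-image-built-from-debian>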
Anything that runs the Docker Engine underneath will use this storage setup.
This sharing also works in the Docker Registry (>2.2 for the best results). If you were to push images with layers that already exist on that registry, the existing layers are skipped. Same with pulling layers to your local engine.
