Does docker build --no-cache build different layers? - docker

A few months ago I decided to set up the CI of my project to build Docker images with the --no-cache flag: I thought it would be best not to risk letting Docker reuse a stale cached layer.
I only realized now that the SHAs of my image's layers are always different (even though a fresh build should produce a layer identical to the previous build's), and whenever I pull the newly built image, every layer is downloaded from scratch.
I now suspect the issue is the --no-cache flag. I know it sounds obvious, but honestly I thought --no-cache was merely slower to execute; I assumed it was implemented in a functional way (same command + same content = same layer).
Can someone confirm that the --no-cache flag is the problem?

The thing with containers is that, practically speaking, you will never build the same layer with the same SHA twice. You only get the same SHA by reusing a layer you previously built.
Think about it this way: every time you build a layer there is at least a log file or a timestamp that differs, and that is before we even mention external dependencies being pulled in.
The --no-cache flag simply stops the Docker engine from using cached layers; it downloads and builds everything again. So that flag is indeed the (indirect) reason why your hashes differ, but that is the intended behavior. Building from the cache makes your builds faster by reusing previously built layers, hence the identical SHAs (it may also reuse stale results, which is exactly why the flag exists).
Have a look at this article for further info:
https://thenewstack.io/understanding-the-docker-cache-for-faster-builds/
If you wish to guarantee that some layers keep the same SHA while still not using the cache, you may want to look into multi-stage Docker builds:
https://docs.docker.com/develop/develop-images/multistage-build/
This way you can have a fixed base image and build everything else on top of it.
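One hedged sketch of that idea (the image name and digest are placeholders, not from the question): pin the base stage by digest so its layers are byte-for-byte identical on every pull, and let only the application layers change under --no-cache:

```Dockerfile
# Base pinned by digest (placeholder value): these layers never change,
# so clients that already have them will not re-download them.
FROM node:16-alpine@sha256:<digest-of-the-base-you-tested> AS base

# Application layers: rebuilt under --no-cache, so their SHAs will differ,
# but pulls only fetch these fresh layers, not the pinned base.
FROM base
WORKDIR /app
COPY . .
RUN npm ci
```

With this layout, a --no-cache rebuild still regenerates the application layers, but every pull reuses the digest-pinned base.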

Related

Why do docker containers rely on uploading (large) images rather than building from the spec files?

Having needed several times in the last few days to upload a 1 GB image after some micro change, I can't help but wonder why there isn't a deploy path built into Docker and related tech (e.g. k8s) to push just the application files (Dockerfile, docker-compose.yml and app-related code) and have the infrastructure built from within the (live) Docker host.
In other words, why do I have to upload an entire Linux machine whenever I change my app code?
Isn't the whole point of Docker that the configs describe a purely deterministic infrastructure output? I can't even see why one would need to upload the whole container image unless they make changes to it manually, outside of Dockerfile, and then wish to upload that modified image. But that seems like bad practice at the very least...
Am I missing something, or is this just a peculiarity of the system?
Good question.
Short answer:
Because storage is cheaper than processing power, building images "live" would be complex, time-consuming, and unpredictable.
On your Kubernetes cluster, for example, you just want to pull the "cached" layers of an image you know works and run it, in seconds, instead of compiling binaries and downloading things (as your Dockerfile specifies).
About building images:
You don't have to build these images locally; you can use your CI/CD runners and run docker build and docker push from the pipelines that run when you push your code to a git repository.
Also, if the image is too big, look into ways of reducing its size: use multi-stage builds, use lighter/minimal base images, use fewer layers (for example, multiple RUN apt install steps can be grouped into one apt install command listing several packages), and use a .dockerignore file so unnecessary files are not shipped into the image. Finally, read up on caching in Docker builds, as it may reduce the size of the layers you push when making changes.
Long answer:
Think of the Dockerfile as the source code, and the Image as the final binary. I know it's a classic example.
But just consider how long it would take to build/compile the binary every time you wanted to use it (either by running it, or by importing it as a library in another piece of software). Then consider how nondeterministic it would be to download that software's dependencies, or compile them, on a different machine every time you run it.
You can take for example Node.js's Dockerfile:
https://github.com/nodejs/docker-node/blob/main/16/alpine3.16/Dockerfile
Which is based on Alpine: https://github.com/alpinelinux/docker-alpine
You don't want your application to perform all the operations specified in these files (and their scripts) at runtime before actually starting, as that would be unpredictable, time-consuming, and more complex than it should be. For example, you would need firewall exceptions for egress traffic from the cluster to the internet to download dependencies that may or may not still be available.
Instead, you ship an image based on the base image you tested and built your code to run on. That image is built, sent to the registry, and then run by k8s as a black box, which is predictable and deterministic.
Then, about your point of how annoying it is to push huge Docker images every time:
You can cut that size down by following some best practices and designing your Dockerfile well, for example:
Reduce your layers: pass multiple arguments to a command whenever possible instead of re-running it several times.
Use multi-stage builds, so you only push the final image, not the stages you needed to compile and configure your application.
Avoid baking data into your images; you can pass it to the containers at runtime.
Order your layers so that untouched layers don't have to be rebuilt when you make changes.
Don't include unnecessary files, and use .dockerignore.
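Those points combined, a hypothetical Dockerfile for a Node app might look like this (base images, paths, and commands are illustrative, not taken from the question):

```Dockerfile
# Build stage: compilers and dev dependencies live here, not in the final image.
FROM node:16-alpine AS build
WORKDIR /app
# Rarely-changing layers first, so code edits don't invalidate the install layer.
COPY package.json package-lock.json ./
RUN npm ci
# Frequently-changing code last; .dockerignore keeps junk out of this COPY.
COPY . .
RUN npm run build

# Final stage: only the built artifacts are shipped and pushed.
FROM nginx:alpine
COPY --from=build /app/dist /usr/share/nginx/html
```

Only the final stage is pushed, and a code change invalidates just the last few layers rather than the dependency install.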
And last but not least:
You don't have to push images from your machine; you can do it with CI/CD runners (for example the build-push GitHub Action), or you can use your cloud provider's build products (like Cloud Build on GCP or AWS CodeBuild).

How to improve automation of running container's base image updates?

I want all running containers on my server to always use the latest version of an official base image, e.g. node:16.13, in order to get security updates. To achieve that I have implemented an image update mechanism for all container images in my registry using a CI workflow, which has some limitations described below.
I have read the answers to this question but they either involve building or inspecting images on the target server which I would like to avoid.
I am wondering whether there might be an easier way to achieve the container image updates or to alleviate some of the caveats I have encountered.
Current Image Update Mechanism
I build my container images using the FROM directive with the minor version I want to use:
FROM node:16.13
COPY . .
This image is pushed to a registry as my-app:1.0.
To check for changes in the node:16.13 image compared to when I built the my-app:1.0 image, I periodically compare the SHA256 digests of the layers of node:16.13 with those of the first n = (number of layers of node:16.13) layers of my-app:1.0, as suggested in this answer. I retrieve the SHA256 digests with docker manifest inspect <image>:<tag> -v.
If they differ, I rebuild my-app:1.0 and push it to my registry, thus ensuring that my-app:1.0 always uses the latest node:16.13 base image.
I keep the running containers on my server up to date by periodically running docker pull my-app:1.0 on the server using a cron job.
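As an illustration, the layer comparison can be sketched in plain shell; the digest values below are made-up stand-ins for what docker manifest inspect <image>:<tag> -v actually returns:

```shell
# Hypothetical layer digest lists, as extracted from the two image manifests.
base_layers="sha256:aaa sha256:bbb"
app_layers="sha256:aaa sha256:bbb sha256:ccc"

# The base is unchanged iff the app image's first layers equal the base's layers.
case "$app_layers" in
  "$base_layers"*) result="base unchanged, no rebuild needed" ;;
  *)               result="base changed, rebuild my-app:1.0" ;;
esac
echo "$result"
```

In the real workflow the rebuild-and-push step would run in the second branch of the case.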
Limitations
When I check for updates I need to download the manifests for all my container images and their base images. For images hosted on Docker Hub this unfortunately counts against the download rate limit.
Since I always update the same image my-app:1.0, it is hard to track which version is currently running on the server. This information is especially important when the update process breaks a service. I keep track of the updates by logging the output of the docker pull command from the cron job.
To be able to revert the container image on the server, I have to keep previous versions of the my-app:1.0 image as well. I do that by pushing incremental patch version tags along with the my-app:1.0 tag to my registry, e.g. my-app:1.0.1, my-app:1.0.2, ...
Because of the way the layers of the base image and the app image are compared, it is not possible to detect a change in the base image in which only the uppermost layers were removed. However, I do not expect this to happen frequently.
Thank you for your help!
There are a couple of things I'd do to simplify this.
docker pull already does essentially the sequence you describe: downloading the image's manifest and then downloading only the layers you don't already have. If you docker build a new image with an identical base image, an identical Dockerfile, and identical COPY source files, it won't actually produce a new image, just put a new name on the existing image ID. So it's possible to unconditionally run docker build --pull on a schedule, and it won't really use additional space. (It could trigger extra redeploys even when neither the base image nor the application changed.)
[...] this unfortunately counts against the download rate limit.
There's not a lot you can do about that beyond running your own mirror of Docker Hub or ensuring your CI system has a Docker Hub login.
Since I always update the same image my-app:1.0 it is hard to track which version is currently running on the server. [...] To be able to revert the container image on the server [...]
I'd recommend always using a unique image tag per build. A sequential build ID, as you have now, works; date stamps or source-control commit IDs are usually easy to come up with as well. When you deploy, always use the full image tag, not the abbreviated one.
docker pull registry.example.com/my-app:1.0.5
docker stop my-app
docker rm my-app
docker run -d ... registry.example.com/my-app:1.0.5
docker rmi registry.example.com/my-app:1.0.4
Now you're absolutely sure which build your server is running, and it's easy to revert should you need to.
(If you're using Kubernetes as your deployment environment, this is especially important. Changing the text value of a Deployment object's image: field triggers Kubernetes's rolling-update mechanism. That approach is much easier than trying to ensure that every node has the same version of a shared tag.)

Should I actively be deleting old docker images to clear disk space?

I use docker to build a web app (a Rails app specifically).
Each build is tagged with the git SHA value and the :latest tag points to the latest SHA value (e.g. 4bfcf8d) in this case.
# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
feeder_web 4bfcf8d c2f766746901 About a minute ago 1.61GB
feeder_web latest c2f766746901 About a minute ago 1.61GB
feeder_web c14c3e6 4cb983fbf407 13 minutes ago 1.61GB
feeder_web cc1ecd9 3923b2c0c77f 18 minutes ago 1.61GB
Each version only differs by some minor copy in the app's frontend, but other than that they are largely the same.
Each one is listed at 1.61GB. Does it really require an additional 1.61GB for each build if I just change a few lines in the web app? And if so, should I actively be clearing old builds?
Each version only differs by some minor copy in the app's frontend, but other than that they are largely the same. Each one is listed at 1.61GB. Does it really require an additional 1.61GB for each build if I just change a few lines in the web app?
Whether or not you can benefit from layer caching depends largely on how you write your Dockerfile.
For example if you write
FROM debian
COPY ./code /code
RUN apt-get update && apt-get install -y build-essential   # ...and all that jazz
...
and you change one iota of that ./code, the whole layer is tossed, and every layer after it. Docker has to rerun (and re-store) those layers, creating a few hundred megabytes of new layers every time you build. But if you run
FROM debian
RUN apt-get update && apt-get install -y my-system-deps && rm -rf /var/lib/apt/lists/*
WORKDIR /code
COPY ./code/requirements.txt /code/requirements.txt
RUN pip install -r /code/requirements.txt   # or bundle install, etc.
COPY ./code /code
...
Now you don't have to install the requirements every time, so the bulk of your environment (system and language libraries) doesn't need to change. You should only need the space for whatever you copy in ./code and the layers after it, usually on the order of 0.1 GB.
The community tends to tout minimizing the number of layers in an image, and for steps with the same lifespan and dependencies (apt-get install / cleanup) that makes sense. But it actually works against efficiency if you can make good use of caching. For example, if you change your Gemfile, you probably don't need to change the system libraries, so there is no need to rebuild those lower layers unless you want to update them. Good caching also drastically reduces build time, since you don't have to install libffi-dev or whatever every time.
Likely the biggest thing you can do to keep final image sizes down is to use multi-stage builds. Python and Ruby containers often pull in complex build-time dependencies that are then kept in the final image. Removing them from the final image is also a potential security bonus, at least in terms of CVE exposure. So look up multi-stage builds if you haven't yet, and spend an hour seeing whether it's easy to get some of the build-time dependencies out of your final image. Full disclosure: I cannot say at the moment whether intermediate build stages are automatically cleaned up.
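As a hedged sketch only (image tags, paths, and commands are assumptions, since the question gives no Dockerfile), a multi-stage build for a Rails-style app could look like:

```Dockerfile
# Build stage: the native-extension toolchain is needed only at bundle time.
FROM ruby:3.1 AS build
WORKDIR /app
COPY Gemfile Gemfile.lock ./
RUN bundle install

# Final stage: compiled gems are copied over; compilers and dev headers stay behind.
FROM ruby:3.1-slim
WORKDIR /app
COPY --from=build /usr/local/bundle /usr/local/bundle
COPY . .
CMD ["bundle", "exec", "rails", "server"]
```

The pushed image is only the final slim stage, which cuts both size and CVE surface.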
And if so, should I actively be clearing old builds?
Since disk space is a fundamentally limited resource, the only question is how actively, and to what extent you want to mitigate by adding disk space.
And don't forget to clean up old containers, too. I try to make docker run --rm a habit whenever possible, but I still find myself pruning them after they inevitably build up.
If one build requires 1.6G then just changing a few lines is not going to change the size.
If you are not planning to use them anymore, I would suggest clearing out the old builds. Every so often I do a docker system prune -a which removes unused data (images, containers, volumes, etc.).
Each build is a full container image, not just a layer on top of an older one. In other words, if you update your image 4 times and rebuild, you have 4 full copies of your app. Each one takes up approximately the same amount of space because they are very similar to one another, apart from the incremental changes you've made between builds.
In this context, I don't think there's any reason not to clear out the old images. They are clearly consuming a lot of space. Presumably you've updated them for a reason, so unless you are continuing to use or test the old version of your app, it probably makes sense to at least periodically clear them out. This assumes you're doing all this in a development environment (i.e., your personal or work machine), where old versions lying around dormant don't pose a security risk because they aren't actively connecting to external services. If these are running on a live server somewhere, dump the old ones as soon as they are no longer being used, so that you aren't unnecessarily exposing extra attack surface on your production servers.

How to do deterministic builds of Docker images?

I'm trying to build Docker images and I would like my Docker images to be deterministic. Much to my surprise I found that even a trivial Dockerfile such as
FROM scratch
ENV a b
produces different IDs when built repeatedly with docker build --no-cache .
How can I make my builds deterministic, and what is causing the changes in image IDs? When caching is enabled, the same ID is produced.
The reason I'm after this reproducibility is to be able to produce the same layers in a distributed build environment. I cannot control where a build runs, therefore I cannot know what is in the cache.
Also, the Docker build downloads files using wget from an FTP server whose contents may or may not have changed; currently I cannot easily tell Docker from within a Dockerfile whether the results of a RUN should invalidate the cache. If I could just produce the same ID for identical layers (when no cache is used), those layers would not have to be pushed and pulled again.
Also all the reasons listed here: https://reproducible-builds.org/
AFAIK, Docker images currently do not hash to byte-exact digests, since the metadata contains stateful information such as the created date. You can check out the design doc from 1.10. Unfortunately, it looks like the history metadata is an important part of image validity and identification.
Don't get me wrong, I'm all for reproducible builds. However, I don't believe hash-exactness is the best criterion for measuring the reproducibility of a Docker image. A Docker image isn't a compiled binary. There is no way to guarantee that the results of a stage can ever be reproduced, so even if the datetime metadata were absent, builds would still not be guaranteed reproducible. Take this pathological example:
RUN curl "https://www.random.org/strings/?num=1&len=20&digits=on&unique=on&format=plain&rnd=new" -o nonce.txt
The image ID is a SHA256 hash of the image's configuration object (what you get when you run docker image inspect). Run that against the images you are creating and you will see differences between them.
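To see why, you can reproduce the idea with plain sha256sum; the JSON below is a heavily trimmed, made-up stand-in for a real image configuration object:

```shell
# Two hypothetical image configs that are identical except for "created".
config_a='{"created":"2024-01-01T00:00:00Z","rootfs":{"diff_ids":["sha256:abc"]}}'
config_b='{"created":"2024-01-02T00:00:00Z","rootfs":{"diff_ids":["sha256:abc"]}}'

# The image ID is the SHA256 of this JSON, so the two IDs differ.
id_a=$(printf '%s' "$config_a" | sha256sum | cut -d' ' -f1)
id_b=$(printf '%s' "$config_b" | sha256sum | cut -d' ' -f1)
echo "sha256:$id_a"
echo "sha256:$id_b"
```

One changed timestamp in the metadata is enough to produce a different image ID, even with identical layers.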

Why doesn't Docker Hub cache Automated Build Repositories as the images are being built?

Note: It appears the premise of my question is no longer valid since the new Docker Hub appears to support caching. I haven't personally tested this. See the new answer below.
Docker Hub's Automated Build repositories don't seem to cache images. As it builds, it removes all intermediate containers. Is this the way it's intended to work, or am I doing something wrong? It would be really nice not to have to rebuild everything for every small change. I thought that was supposed to be one of the biggest advantages of Docker, and it seems weird that their builder doesn't use it. So why doesn't it cache images?
UPDATE:
I've started using Codeship to build my app and then run remote commands on my DigitalOcean server to copy the built files and run the docker build command. I'm still not sure why Docker Hub doesn't cache.
Disclaimer: I am a lead software engineer at Quay.io, a private Docker container registry, so this is an educated guess based on the same problem we faced in our own build system implementation.
Given my experience with Dockerfile build systems, I would suspect that Docker Hub does not support caching because of the way caching is implemented in the Docker Engine. Caching for Docker builds works by comparing the commands to be run against the layers already present on the local machine.
For example, if the Dockerfile has the form:
FROM somebaseimage
RUN somecommand
ADD somefile somefile
Then the Docker build code will:
Check to see if an image matching somebaseimage exists
Check if there is a local image with the command RUN somecommand whose parent is the previous image
Check if there is a local image with the command ADD somefile somefile + a hashing of the contents of somefile (to make sure it is invalidated when somefile changes), whose parent is the previous image
If any of the above steps match, that command is skipped during the Dockerfile build, and the cached image is used instead. However, the one key issue with this process is that the cached images must be present on the build machine in order to find and verify the matches. Keeping all of everyone's images on the build nodes would be highly inefficient, making this a harder problem to solve.
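A toy illustration of that matching logic in shell (the key derivation here is simplified and is not Docker's actual implementation):

```shell
# Hypothetical cache lookup key: parent image ID + the instruction text
# (an ADD/COPY would also mix in a content hash), mirroring the steps above.
parent="sha256:base1111"
instruction="RUN somecommand"
key_1=$(printf '%s|%s' "$parent" "$instruction" | sha256sum | cut -d' ' -f1)
key_2=$(printf '%s|%s' "$parent" "$instruction" | sha256sum | cut -d' ' -f1)

# Identical parent + identical instruction -> identical key -> cache hit,
# but only if an image with that key is actually on the build machine.
test "$key_1" = "$key_2" && echo "cache hit"
```

The catch the answer describes is the last comment: the key may match, but the layer it names still has to exist locally.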
At Quay.io, we solved the caching problem by creating a variation of the Docker caching code that could precompute these commands/hashes and then ask our registry for the cached layers, downloading them to the machine only after we had found the most efficient caching set. This required significant data model changes in our registry code.
If you'd like more information, we gave a technical overview into how we do so in this talk: https://youtu.be/anfmeB_JzB0?list=PLlh6TqkU8kg8Ld0Zu1aRWATiqBkxseZ9g
The new Docker Hub came out with a new Automated Build system that supports Build Caching.
https://blog.docker.com/2018/12/the-new-docker-hub/
