Since BuildKit has been made the default backend of Docker, can we build a cache with the plain docker build ... command, without the help of docker buildx build ... --cache-to?
I checked the options list of docker build, but I cannot find a --cache-to option.
The Docker technology stack has evolved multiple tools for building container images: docker build, docker-compose build, docker buildx build, docker buildx bake, and buildctl.
How do these relate to each other? And what is recommended for a new Docker-based project?
Here is my incomplete understanding.
At first Docker Engine provided an API endpoint for requesting a build of a container image from a Dockerfile. The Docker CLI build command, docker build, invoked this API according to its command line options.
Then docker-compose (Docker Compose V1) came around and provided a way to describe the build options for a container image in YAML, invoking the same Docker Engine API.
In 2017, Docker started the Moby Project with the intention of creating open source backend components for working with containers.
One of these components is BuildKit, which is like a compiler for container images. It builds images from a low-level build definition format called LLB. Frontend tools can translate inputs including, but not limited to, Dockerfiles into LLB. (buildctl is just a command-line tool for interacting with BuildKit.)
An important thing to note is that BuildKit is independent from the Docker Engine. As BuildKit matured, existing tools aimed to switch from using the Docker Engine API to using BuildKit.
The Docker CLI buildx plugin provides one command line interface for invoking BuildKit. docker buildx build builds a single container image from command line options, and docker buildx bake can read those options from a docker-compose.yml file.
Eventually, the normal docker build command learned to be able to use BuildKit. This mode has to be enabled by setting the DOCKER_BUILDKIT environment variable, or a daemon.json configuration option. This is the default on Docker Desktop.
Until recently, docker-compose used the Docker Engine API directly. A COMPOSE_DOCKER_CLI_BUILD environment variable was added to make it invoke docker build instead. Most recently, the tool was rewritten from scratch, and Docker Compose V2 (now docker compose) can invoke BuildKit itself; the COMPOSE_DOCKER_CLI_BUILD option is no longer supported.
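To summarize, all of these invocations can end up at the same BuildKit backend. Here is a dry-run sketch of the equivalents (demo/app:dev is a made-up image name; the run wrapper merely prints each command instead of executing it):

```shell
# Dry-run wrapper: prints each command instead of executing it.
# Swap `echo "+ $*"` for `"$@"` to actually run the builds.
run() { echo "+ $*"; }

# Classic CLI, opting in to BuildKit via the environment:
export DOCKER_BUILDKIT=1
run docker build -t demo/app:dev .

# The buildx plugin invokes BuildKit directly:
run docker buildx build -t demo/app:dev .

# buildx bake and Compose V2 read the same build options from YAML:
run docker buildx bake
run docker compose build
```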
The docker buildx build command accepts an option --cache-to type=registry,ref=my/repo:buildcache that pushes the build cache to the registry (see the documentation).
How can this cache be inspected?
docker buildx imagetools inspect my/repo:buildcache crashes from a null pointer dereference.
docker manifest inspect my/repo:buildcache triggers a 500 Internal Server Error.
The Docker Hub website provides no information about a tag for a pushed build cache, except that it exists.
If I run a build with the symmetric --cache-from type=registry,ref=my/repo:buildcache option, it does print importing cache manifest from my/repo:buildcache but I don't know what it has imported.
When I want to create multiarch builds with docker, I use the command:
docker buildx build --push --platform <list of archs> -t <tag1> -t <tag2> .
This works perfectly fine, but it seems to build the images concurrently. In most situations that may be acceptable, but in some cases it is too heavy: it requires too much RAM and causes network failures (too many parallel connections).
Is there any way to build the images sequentially?
The only solution I found is to build for each arch separately and then use "docker manifest create" to assemble:
docker buildx build --push --platform <arch1> -t <tag-arch1> .
docker buildx build --push --platform <arch2> -t <tag-arch2> .
[...]
docker manifest create ... --amend ...
docker manifest push
This may be fine, but it seems that each image has to be pushed to the registry for "docker manifest create" to be able to assemble them. This is not ideal, as it pollutes the image list with tags I don't want.
Would it be possible to use "docker manifest create" on local images, without the need to upload each of them separately with a tag to the registry? Is there any better way to simply build the images sequentially?
Thanks!
At present, there's no way to limit the concurrency of a single build; this is an open issue with BuildKit. There's also a similar issue about supporting cgroup limits, which would let you run the builds concurrently but with a lower maximum CPU time, with a similar effect for some use cases.
Suppose I am building an image using Docker Buildkit. My image is from a multistage Dockerfile, like so:
FROM node:12 AS some-expensive-base-image
...
FROM some-expensive-base-image AS my-app
...
I am now trying to build both images. Suppose that I push these to Docker Hub. If I were to use Docker BuildKit's external caching feature, then I would want to save build time on my CI pipeline by pulling in the remote some-expensive-base-image:latest image as the cache when building the some-expensive-base-image target. And I would want to pull in both the just-built some-expensive-base-image image and the remote my-app:latest image as the caches for the latter image. I believe that I need both in order to prevent the steps of some-expensive-base-image from being rebuilt, since...well...they are expensive.
This is what my build script looks like:
export DOCKER_BUILDKIT=1
docker build --build-arg BUILDKIT_INLINE_CACHE=1 --cache-from some-expensive-base-image:latest --target some-expensive-base-image -t some-expensive-base-image:edge .
docker build --build-arg BUILDKIT_INLINE_CACHE=1 --cache-from some-expensive-base-image:edge --cache-from my-app:latest --target my-app -t my-app:edge .
My question: Does the order of the --cache-from arguments matter for the second docker build?
I have been getting inconsistent results on my CI pipeline for this build. There are cache misses when building the latter image, even though there haven't been any code changes that would have busted the cache. The cache manifest can be pulled without issue. Sometimes the cache image is pulled, but other times all steps of the latter target need to be rerun. I don't know why.
By chance, should I instead try to docker pull both images before running the docker build commands in my script?
Also, I know that I referred to Docker Hub in my example, but in real life, my application uses AWS ECR for its remote Docker repository. Would that matter for proper Buildkit functionality?
Yes, the order of --cache-from matters!
See the explanation on Github from the person who implemented the feature, quoting here:
When using multiple --cache-from they are checked for a cache hit in the order that user specified. If one of the images produces a cache hit for a command only that image is used for the rest of the build.
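To make the ordering concrete, here is a dry-run sketch using the image names from the question (the run wrapper only prints the command; nothing is built):

```shell
# Dry-run wrapper: prints the command instead of executing it.
run() { echo "+ $*"; }

# Cache sources are checked left to right; the first image that produces
# a cache hit is used for the rest of the build. Listing the just-built
# base image first means its layers win over the older my-app:latest.
run docker build \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  --cache-from some-expensive-base-image:edge \
  --cache-from my-app:latest \
  --target my-app -t my-app:edge .
```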
I've had similar problems in the past; you might find it useful to check this answer, where I've shared my experience with using the Docker cache in CI.
I recently switched to multi-stage docker builds, and it doesn't appear that there's any caching on intermediate builds. I'm not sure if this is a docker limitation, something which just isn't available or whether I'm doing something wrong.
I am pulling down the final build and doing a --cache-from at the start of the new build, but it always runs the full build.
This appears to be a limitation of docker itself and is described under this issue - https://github.com/moby/moby/issues/34715
The workaround is to:
Build the intermediate stages with a --target
Push the intermediate images to the registry
Build the final image with a --target and use multiple --cache-from paths, listing all the intermediate images and the final image
Push the final image to the registry
For subsequent builds, pull the intermediate + final images down from the registry first
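The five steps above can be sketched as a CI script. This is a dry-run sketch: the registry path and the stage name builder are hypothetical, and the run wrapper prints each command instead of executing it.

```shell
set -e
REG=registry.example.com/myapp   # hypothetical registry path

# Dry-run wrapper; swap `echo "+ $*"` for `"$@"` to execute for real.
run() { echo "+ $*"; }

# 1. Build the intermediate stage with --target
run docker build --target builder -t "$REG:builder" .
# 2. Push the intermediate image to the registry
run docker push "$REG:builder"
# 3. Build the final image, listing intermediate + final as cache sources
run docker build --target final \
    --cache-from "$REG:builder" --cache-from "$REG:latest" \
    -t "$REG:latest" .
# 4. Push the final image
run docker push "$REG:latest"
# 5. On subsequent builds, pull the intermediate + final images down first
run docker pull "$REG:builder"
run docker pull "$REG:latest"
```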
Since the previous answer was posted, there is now a solution using the BuildKit backend: https://docs.docker.com/engine/reference/commandline/build/#specifying-external-cache-sources
This involves passing the argument --build-arg BUILDKIT_INLINE_CACHE=1 to your docker build command. You will also need to ensure BuildKit is being used by setting the environment variable DOCKER_BUILDKIT=1 (on Linux; I think BuildKit might be the default backend on Windows when using recent versions of Docker Desktop). A complete command line solution for CI might look something like:
export DOCKER_BUILDKIT=1
# Use cache from remote repository, tag as latest, keep cache metadata
docker build -t yourname/yourapp:latest \
--cache-from yourname/yourapp:latest \
--build-arg BUILDKIT_INLINE_CACHE=1 .
# Push new build up to remote repository replacing latest
docker push yourname/yourapp:latest
Some of the other commenters are asking about docker-compose. This works too, although you additionally need to set the environment variable COMPOSE_DOCKER_CLI_BUILD=1 so that docker-compose uses the docker CLI (and hence BuildKit, thanks to DOCKER_BUILDKIT=1). You can then set BUILDKIT_INLINE_CACHE: 1 in the args: section of the build: section of your YAML file to ensure the required --build-arg is set.
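Putting that together, a minimal sketch might look like this (the service and image names are placeholders, and the actual build command is left commented out):

```shell
# Write a minimal compose file with the BuildKit cache arg wired in.
cat > docker-compose.yml <<'EOF'
services:
  app:
    image: yourname/yourapp:latest
    build:
      context: .
      cache_from:
        - yourname/yourapp:latest
      args:
        BUILDKIT_INLINE_CACHE: 1
EOF

# Make docker-compose go through the docker CLI with BuildKit enabled:
export DOCKER_BUILDKIT=1
export COMPOSE_DOCKER_CLI_BUILD=1
# docker-compose build   # uncomment to run the actual build
```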
I'd like to add another important point to the answer
--build-arg BUILDKIT_INLINE_CACHE=1 embeds cache metadata only for the layers of the final image (min mode), so intermediate stages are not cached and a hit effectively only helps when nothing changed.
So, to enable caching of the layers from all stages of the build, export the cache in max mode instead, e.g. --cache-to type=registry,ref=my/repo:buildcache,mode=max (the inline exporter only supports min mode). See the documentation.
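A dry-run sketch of what that could look like, reusing the my/repo names from the earlier examples (the run wrapper only prints the command). Note that mode=max needs a cache exporter such as type=registry, since the inline exporter, which embeds the cache in the image itself, only supports min mode:

```shell
# Dry-run wrapper; swap `echo "+ $*"` for `"$@"` to execute.
run() { echo "+ $*"; }

# Export cache for all build stages (mode=max) to a dedicated cache ref,
# and import it back on the next build:
run docker buildx build \
  --cache-to type=registry,ref=my/repo:buildcache,mode=max \
  --cache-from type=registry,ref=my/repo:buildcache \
  -t my/repo:latest --push .
```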