I'm doing cross-platform testing (tooling, not kernel), so I have a custom image (used for ephemeral Jenkins slaves) for each OS, based on standard base images: centos6, centos7, ubuntu14, sles11, sles12, etc.
Aside from the base being different, my images have a lot in common with each other (all of them get a copy of pre-built and frequently changing maven/gradle/npm repositories, for speed).
Here is a simplified example of the way the images are created (the tarball is the same across images):
# Dockerfile one
FROM centos:centos6
ADD some-files.tar.gz /
# Dockerfile two
FROM ubuntu:14.04
ADD some-files.tar.gz /
This results in large images (multi-GB) that have to be rebuilt regularly. Some layer reuse occurs between rebuilds thanks to the docker build cache, but if I can stop having to rebuild images altogether it would be better.
How can I reliably share the common contents among my images?
The images don't change much outside of these directories. This cannot be a simple mounted volume, because the directories in this layer are modified in use: it cannot be read-only, and the source must not be changed. (What I'm looking for is closer to copy-on-write, but applied to a specific subset of the image.)
Problem with --cache-from:
The suggestion to use --cache-from will not work:
$ cat df.cache-from
FROM busybox
ARG UNIQUE_ARG=world
RUN echo Hello ${UNIQUE_ARG}
COPY . /files
$ docker build -t test-from-cache:1 -f df.cache-from --build-arg UNIQUE_ARG=docker .
Sending build context to Docker daemon 26.1MB
Step 1/4 : FROM busybox
---> 54511612f1c4
Step 2/4 : ARG UNIQUE_ARG=world
---> Running in f38f6e76bbca
Removing intermediate container f38f6e76bbca
---> fada1443b67b
Step 3/4 : RUN echo Hello ${UNIQUE_ARG}
---> Running in ee960473d88c
Hello docker
Removing intermediate container ee960473d88c
---> c29d98e09dd8
Step 4/4 : COPY . /files
---> edfa35e97e86
Successfully built edfa35e97e86
Successfully tagged test-from-cache:1
$ docker build -t test-from-cache:2 -f df.cache-from --build-arg UNIQUE_ARG=world --cache-from test-from-cache:1 .
Sending build context to Docker daemon 26.1MB
Step 1/4 : FROM busybox
---> 54511612f1c4
Step 2/4 : ARG UNIQUE_ARG=world
---> Using cache
---> fada1443b67b
Step 3/4 : RUN echo Hello ${UNIQUE_ARG}
---> Running in 22698cd872d3
Hello world
Removing intermediate container 22698cd872d3
---> dc5f801fc272
Step 4/4 : COPY . /files
---> addabd73e43e
Successfully built addabd73e43e
Successfully tagged test-from-cache:2
$ docker inspect test-from-cache:1 -f '{{json .RootFS.Layers}}' | jq .
[
"sha256:6a749002dd6a65988a6696ca4d0c4cbe87145df74e3bf6feae4025ab28f420f2",
"sha256:01bf0fcfc3f73c8a3cfbe9b7efd6c2bf8c6d21b6115d4a71344fa497c3808978"
]
$ docker inspect test-from-cache:2 -f '{{json .RootFS.Layers}}' | jq .
[
"sha256:6a749002dd6a65988a6696ca4d0c4cbe87145df74e3bf6feae4025ab28f420f2",
"sha256:c70c7fd4529ed9ee1b4a691897c2a2ae34b192963072d3f403ba632c33cba702"
]
The build output shows exactly where it stops using the cache: as soon as a command changes. And the inspect output shows that the second layer id changes even though the same COPY command was run in each build. Anytime the preceding layer differs, the cache from the other image's build cannot be used.
The --cache-from option is there to allow you to trust the build steps from an image pulled from a registry. By default, docker only trusts layers that were locally built. But the same rules apply even when you provide this option.
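For illustration, trusting a pulled image's cache looks roughly like this (a sketch only; the registry path is a placeholder):
docker pull registry.example.com/myimage:latest
docker build --cache-from registry.example.com/myimage:latest -t myimage .
Even with the pull, cached steps are only reused while every preceding layer matches exactly.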
Option 1:
If you want to reuse the build cache, you must have the preceding layers identical in both images. You could try using a multi-stage build if the base image for each is small enough. However, doing this would lose all of the settings outside of the filesystem (environment variables, entrypoint specification, etc), so you'd need to recreate that as well:
ARG base_image
FROM ${base_image} as base
# the above from line makes the base image available for later copying
FROM scratch
COPY large-content /content
COPY --from=base / /
# recreate any environment variables, labels, entrypoint, cmd, or other settings here
And then build that with:
docker build --build-arg base_image=base1 -t image1 .
docker build --build-arg base_image=base2 -t image2 .
docker build --build-arg base_image=base3 -t image3 .
This could also be multiple Dockerfiles if you need to change other settings. This will result in the entire contents of each base image being copied, so make sure your base image is significantly smaller than the shared content to make this worth the effort.
Option 2:
Reorder your build to keep common components at the top. I understand this won't work for you, but it may help others coming across this question later. It's the preferred and simplest solution that most people use.
Option 3:
Remove the large content from your image and add it to your containers externally as a volume. You lose the immutability and copy-on-write features of docker filesystem layers, and you'll need to manually ship the volume content to each of your docker hosts (or use a network shared filesystem). I've seen solutions where a "sync container" is run on each of the docker hosts, performing a git pull, rsync, or equivalent command to keep the volume updated. If you can, mount the volume with :ro at the end to make it read-only inside the container that uses it, which gives you back immutability.
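A rough sketch of this approach (the paths and image name are made up for illustration):
# keep the shared content updated on each docker host, e.g. via rsync, git pull, or similar
rsync -a build-server:/exports/shared-content/ /srv/shared-content/
# then mount it read-only into the containers that need it
docker run -v /srv/shared-content:/content:ro my-test-image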
Given that it sounds like the content of this additional 4GB of data is unrelated to the underlying container image, is there any way to mount that data outside of the container build/creation process? I know this creates an additional management step (getting the data everywhere you want the image), but assuming it can be a read-only shared mount (and then untarred by the image's main process into the container filesystem as needed), this might be easier than building it into every image.
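One possible shape of that idea, assuming the data can sit on a shared read-only mount (the paths, image name, and run-tests command are placeholders):
docker run \
  -v /srv/shared/some-files.tar.gz:/seed/some-files.tar.gz:ro \
  my-test-image \
  sh -c 'tar -xzf /seed/some-files.tar.gz -C /opt && exec run-tests'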
Turns out that as of Docker 1.13, you can use the --cache-from OTHER_IMAGE flag. (Docs)
In this situation, the solution would look like this:
docker build -t image1 .
docker build -t image2 --cache-from image1 .
docker build -t image3 --cache-from image1 --cache-from image2 .
... and so on
This will ensure that any layer these images have in common is reused.
UPDATE: as mentioned in other answers, this doesn't do what I expected. I admit I still don't understand what it does, since it definitely changes the push behavior, but the layers are not ultimately reused.
The most reliable and docker-native way to share common content between different docker images is to refactor the commonalities into base images that the other images extend.
For example, if all the images build on top of a base image and install packages x, y, and z in it, then refactor the installation of packages x, y, and z into a new base image that the downstream images build on top of.
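As a sketch of that refactoring (the package names x, y, z and the image names are only placeholders):
# Dockerfile.base -- install the shared packages once
FROM centos:centos7
RUN yum install -y x y z
# build it with: docker build -f Dockerfile.base -t my-base .

# Dockerfile for each downstream image -- extend the shared base instead of repeating the install
FROM my-base
COPY app/ /opt/app/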
Related
I've searched site:stackoverflow.com dockerfile: ENV, RUN - layers or images and have read Does Docker EXPOSE make a new layer? and What are Docker image "layers"?.
While reading the docs, Best practices for writing Dockerfiles, and trying to understand this part:
Each ENV line creates a new intermediate layer, just like RUN
commands. This means that even if you unset the environment variable
in a future layer, it still persists in this layer and its value can
be dumped.
That made me recall this other part:
In older versions of Docker, it was important that you minimized the
number of layers in your images to ensure they were performant. The
following features were added to reduce this limitation:
Only the instructions RUN, COPY, ADD create layers. Other instructions
create temporary intermediate images, and do not increase the size of
the build.
I've read How to unset "ENV" in dockerfile? and redid the example given on the docs page; it indeed shows that ENV is not unset:
$ docker build -t alpine:envtest -<<HITHERE
> FROM alpine
> ENV ADMIN_USER="mark"
> RUN unset ADMIN_USER
> HITHERE
Sending build context to Docker daemon 3.072kB
Step 1/3 : FROM alpine
latest: Pulling from library/alpine
89d9c30c1d48: Already exists
Digest: sha256:c19173c5ada610a5989151111163d28a67368362762534d8a8121ce95cf2bd5a
Status: Downloaded newer image for alpine:latest
---> 965ea09ff2eb
Step 2/3 : ENV ADMIN_USER="mark"
---> Running in 5d34f829a387
Removing intermediate container 5d34f829a387
---> e9c50b16c0e1
Step 3/3 : RUN unset ADMIN_USER
---> Running in dbcf57ca390d
Removing intermediate container dbcf57ca390d
---> 2cb4de2e0257
Successfully built 2cb4de2e0257
Successfully tagged alpine:envtest
$ docker run --rm alpine:envtest sh -c 'echo $ADMIN_USER'
mark
And the output says same "Removing intermediate container" for both ENV and RUN.
I've recently downloaded docker, so I don't think it is that old:
$ docker --version
Docker version 19.03.5, build 633a0ea
Maybe a RUN instruction and a RUN command are different things?
ENV, RUN - do they create layers, images or containers?
Docker, as a containerization system, is based on two main concepts: the image and the container. The major difference between them is the top writable layer. When you create a new container, a new writable layer is put on top of the last image layer. This layer is often called the container layer.
All the underlying image content remains unchanged, and every change in the running container (creating new files, modifying existing files, and so on) is written to this thin writable layer.
This way, Docker stores only the actual container data plus one image instance, which decreases storage usage and simplifies the underlying workflow. I would compare it to static versus dynamic linking in C: Docker behaves like dynamic linking.
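You can observe that writable container layer with docker diff; for example (the container name is arbitrary):
docker run --name demo alpine:latest touch /new-file
docker diff demo   # prints "A /new-file": the change exists only in the container layer
docker rm demo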
The image is a combination of layers. Each layer is only a set of differences from the layer before it.
The documentation says:
Only the instructions RUN, COPY, ADD create layers. Other instructions
create temporary intermediate images, and do not increase the size of
the build.
The description here is neither entirely clear nor accurate: in recent versions of Docker, these aren't the only instructions that create layers, despite what the documentation says.
For example, by default WORKDIR creates the given path if it does not exist and changes the working directory to it. If the path had to be created, WORKDIR generates a new layer.
By the way, ENV doesn't lead to layer creation, but the data is stored permanently in the image and container config, and there is no easy way to get rid of it. Basically, there are two options for how to organize your workflow:
Temporary environment variables, available only until the end of the current RUN instruction:
RUN export NAME='megatron' && echo $NAME # 'megatron'
RUN echo $NAME # blank
Blank out the environment variable: if there is no difference for you between an absent variable and a blank one, you could do:
ENV NAME='megatron'
# some instructions
ENV NAME=''
RUN echo $NAME
In the context of Docker, there is no distinction between commands and instructions here. For RUN, any command that doesn't change the filesystem content won't trigger permanent layer creation. Consider the following Dockerfile:
FROM alpine:latest
RUN echo "Hello World" # no layer
RUN touch file.txt # new layer
WORKDIR /no/existing/path # new layer
In the end, the output would be:
Step 1/4 : FROM alpine:latest
---> 965ea09ff2eb
Step 2/4 : RUN echo "Hello World"
---> Running in 451adb70f017
Hello World
Removing intermediate container 451adb70f017
---> 816ccbd1e8aa
Step 3/4 : RUN touch file.txt
---> Running in 9edc6afdd1e5
Removing intermediate container 9edc6afdd1e5
---> ea0040ec0312
Step 4/4 : WORKDIR /no/existing/path
---> Running in ec0feaf6710d
Removing intermediate container ec0feaf6710d
---> f2fe46478f7c
Successfully built f2fe46478f7c
Successfully tagged envtest:latest
There is the inspect command for inspecting Docker objects:
docker inspect --format='{{json .RootFS.Layers}}' <image_id>
It shows the list of SHAs of the three layers created by the FROM, the second RUN, and the WORKDIR instructions. I would recommend using dive for exploring each layer in a docker image.
So why does it say "Removing intermediate container" and not "removing intermediate layer"? To execute a RUN command, Docker needs to instantiate a container from the intermediate image built up to that line of the Dockerfile and run the actual command in it. It then "commits" the state of that container as a new intermediate image and continues the build process.
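As a rough manual equivalent of a single RUN step (the names here are only illustrative):
docker run --name tmp 965ea09ff2eb touch file.txt   # run the command in a container based on the previous intermediate image
docker commit tmp                                   # commit that container's filesystem as a new intermediate image
docker rm tmp                                       # this is the "Removing intermediate container ..." part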
I have already created an image locally and it contains two layers:
$ docker image inspect existingimagename
"RootFS": {
"Type": "layers",
"Layers": [
"sha256:e21695bdc8e8432b1a119d610d5f8497e2509a7d040ad778b684bbccd067099f",
"sha256:3ff73e68714cf1e9ba79b30389f4085b6e31b7a497f986c7d758be51595364de"
]
},
Now I am building another image and want to save space. The first layer of the previous image is the main filesystem, so I decided to use it:
FROM sha256:e21695bdc8e8432b1a119d610d5f8497e2509a7d040ad778b684bbccd067099f
ENV LANG=en_US.UTF-8
CMD ["/usr/bin/bash"]
Then I try to build the new image:
$ docker build -t newimage -f Dockerfile .
Sending build context to Docker daemon 443.5MB
Step 1/3 : FROM sha256:e21695bdc8e8432b1a119d610d5f8497e2509a7d040ad778b684bbccd067099f
pull access denied for sha256, repository does not exist or may require 'docker login'
It gives an error.
How do I deal with this?
An easy way to benefit from the image layer cache is to create a base image containing just that first layer.
Then use FROM <base image> in your other Dockerfiles.
This way, disk space is spared because multiple images share the same layer, and builds are also faster.
Dockerfile-base:
FROM scratch
ADD ./system.tar.gz /
docker build -f Dockerfile-base -t base .
Dockerfile-1:
FROM base
COPY ./somefiles /
docker build -f Dockerfile-1 -t image1 .
Dockerfile-2:
FROM base
COPY ./otherfiles /
docker build -f Dockerfile-2 -t image2 .
Recommended reads
Best practices for writing Dockerfiles § Leverage build cache
Problem: I can't reproduce docker layers from exactly the same content (on one machine, or in a CI cluster where something is built from a git repo).
Consider this simple example:
$ echo "test file" > test.txt
$ cat > Dockerfile <<EOF
FROM alpine:3.8
COPY test.txt /test.txt
EOF
If I build the image on one machine with caching enabled, the last layer with the copied file is shared across images:
$ docker build -t test:1 .
Sending build context to Docker daemon 3.072kB
Step 1/2 : FROM alpine:3.8
3.8: Pulling from library/alpine
cd784148e348: Already exists
Digest: sha256:46e71df1e5191ab8b8034c5189e325258ec44ea739bba1e5645cff83c9048ff1
Status: Downloaded newer image for alpine:3.8
---> 3f53bb00af94
Step 2/2 : COPY test.txt /test.txt
---> decab6a3fbe3
Successfully built decab6a3fbe3
Successfully tagged test:1
$ docker build -t test:2 .
Sending build context to Docker daemon 3.072kB
Step 1/2 : FROM alpine:3.8
---> 3f53bb00af94
Step 2/2 : COPY test.txt /test.txt
---> Using cache
---> decab6a3fbe3
Successfully built decab6a3fbe3
Successfully tagged test:2
But with the cache disabled (or simply using another machine) I get different hash values:
$ docker build -t test:3 --no-cache .
Sending build context to Docker daemon 3.072kB
Step 1/2 : FROM alpine:3.8
---> 3f53bb00af94
Step 2/2 : COPY test.txt /test.txt
---> ced4dff22d62
Successfully built ced4dff22d62
Successfully tagged test:3
At the same time, the history command shows that the file content was the same:
$ docker history test:1
IMAGE CREATED CREATED BY SIZE COMMENT
decab6a3fbe3 6 minutes ago /bin/sh -c #(nop) COPY file:d9210c40895e
$ docker history test:3
IMAGE CREATED CREATED BY SIZE COMMENT
ced4dff22d62 27 seconds ago /bin/sh -c #(nop) COPY file:d9210c40895e
Am I missing something, or is this behavior by design?
Are there any techniques to get reproducible/reusable layers that do not force me to do one of the following:
Share docker cache across machines
Do a pull of "previous" image before building next
Ultimately this problem prevents me from getting thin layers of constantly changing app code while keeping my dependencies in a separate, infrequently changing layer.
After some extra googling, I found a great post describing a solution to this problem.
Starting from 1.13, docker has the --cache-from option that can be used to tell docker to look at other images for layers. The important thing is that the image must be explicitly pulled for it to work, and you still need to point at which image to use. It could be latest or any other "rolling" tag you have.
Given that, there is unfortunately no way to produce the same layer in "isolation", but --cache-from solves the root problem: how to eventually reuse some layers during a CI build.
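In a CI job that roughly translates to a sequence like this (the registry path is a placeholder):
docker pull registry.example.com/app:latest || true   # pre-seed the local cache; ignore failure on the first run
docker build --cache-from registry.example.com/app:latest -t registry.example.com/app:latest .
docker push registry.example.com/app:latest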
I have a docker image with the following Dockerfile:
FROM scratch
RUN echo "Hello World - Dockerfile"
And I build my image in a PowerShell prompt like this:
docker build -t imagename .
Here is what I get when I build my image:
Sending build context to Docker daemon 194.5MB
Step 1/2 : FROM scratch
--->
Step 2/2 : RUN echo "Hello World - Dockerfile"
---> Running in 42d5e5add10e
invalid reference format
I want to run my image as a Windows container.
What is missing to make it work?
Thanks
Your image doesn't have a command called echo.
A FROM scratch image contains absolutely nothing at all. No shells, no libraries, no system programs, nothing. The two most common uses for it are to build a base image from a tar file or to build an extremely minimal image from a statically-linked binary; both are somewhat advanced uses.
Usually you'll want to start from an image that contains a more typical set of operating system tools. On a Linux base (where I'm more familiar) ubuntu and debian are common, and alpine as well (though it has some occasional compatibility issues). #gp. suggests FROM microsoft/windowsservercore in a comment, and that's probably a good place to start for a Windows container.
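A minimal sketch along those lines, assuming your daemon is switched to Windows containers and the base image matches your host's Windows version:
FROM microsoft/windowsservercore
RUN echo "Hello World - Dockerfile"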
With this new version of Docker, multi-stage builds get introduced; at least I'd never heard of them before. The question I have now is: should I use it like a standard Compose file?
I used docker-compose.yaml to start containers where many images were involved, one for the web server and one for the database. With this new multi-stage build, can I use one single Dockerfile with two FROM commands and that's it?
Will this Multi-stage build eventually kill Compose (since images are smaller)?
Multi-stage builds don't impact the use of docker-compose (though you may want to look into using docker stack deploy with swarm mode to run that compose file on a swarm). Compose is still needed to connect multiple microservices together, e.g. running a proxy, a few applications, and an in-memory cache. Compose also simplifies passing all the configuration options to a complex docker image, attaching networks and volumes, configuring restart policies, swarm constraints, etc. All of this could be done with lots of scripting, but it is made easier by a simple yaml definition.
What multi-stage builds do replace is a multiple step build where you may have a build environment that should be different than a runtime environment. This is all prior to the docker-compose configuration of running your containers.
The popular example is a go binary. That binary is statically compiled so it doesn't really need anything else to run. But the build environment for it is much larger as it pulls in the compiler and various libraries. Here's an example hello.go:
package main

import "fmt"

func main() {
	fmt.Printf("Hello, world.\n")
}
And the corresponding Dockerfile:
ARG GOLANG_VER=1.8
FROM golang:${GOLANG_VER} as builder
WORKDIR /go/src/app
COPY . .
RUN go-wrapper download
RUN go-wrapper install
FROM scratch
COPY --from=builder /go/bin/app /app
CMD ["/app"]
The two FROM lines in that Dockerfile are what make it a multi-stage build. The first FROM line creates the first stage with the go compiler. The second FROM line is also the last, which makes it the default image to tag when you build. In this case, that stage is the runtime for a single binary. Other stages are all cached on the build server but don't get copied into the final image. You can target the build at different stages if you need to build a single piece, with the docker build --target=builder . command.
This becomes important when you look at the result of the build:
$ docker build -t test-mult-stage .
Sending build context to Docker daemon 4.096kB
Step 1/9 : ARG GOLANG_VER=1.8
--->
Step 2/9 : FROM golang:${GOLANG_VER} as builder
---> a0c61f0b0796
Step 3/9 : WORKDIR /go/src/app
---> Using cache
---> af5177aae437
Step 4/9 : COPY . .
---> Using cache
---> 976490d44468
Step 5/9 : RUN go-wrapper download
---> Using cache
---> e31ac3ce83c3
Step 6/9 : RUN go-wrapper install
---> Using cache
---> 2630f482fe78
Step 7/9 : FROM scratch
--->
Step 8/9 : COPY --from=builder /go/bin/app /app
---> 96b9364cdcdc
Removing intermediate container ed558a4da820
Step 9/9 : CMD /app
---> Running in 55db8ed593ac
---> 5fd74a4d4235
Removing intermediate container 55db8ed593ac
Successfully built 5fd74a4d4235
Successfully tagged test-mult-stage:latest
$ docker images | grep 2630
<none> <none> 2630f482fe78 5 weeks ago 700MB
$ docker images | grep test-mult-stage
test-mult-stage latest 5fd74a4d4235 33 seconds ago 1.56MB
Note the runtime image is only 1.5 MB, while the untagged builder image with the compiler is 700 MB. Previously, to get the same space savings, you would need to compile your application outside of docker and deal with all the dependency issues that docker would normally solve for you. Or you could do the build in one container, copy the result out of that container, and use that copied file as the input to another build. The multi-stage build turns this second option into a single reproducible and portable command.
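Before multi-stage builds, that second option looked roughly like this (the file and image names are illustrative):
docker build -t app-builder -f Dockerfile.build .   # image containing the compiler and build deps
docker create --name extract app-builder            # create (but don't run) a container from it
docker cp extract:/go/bin/app ./app                 # copy the compiled binary out to the host
docker rm extract
docker build -t app -f Dockerfile.run .             # a second Dockerfile that just COPYs ./app into a minimal base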
The multi-stage feature allows you to create temporary build stages and extract their files for use in your final build. For example, you need gcc to build your libraries, but you don't need gcc in the production container. Although you could do multiple builds with a few lines of bash scripting, the multi-stage feature lets you do it in a single Dockerfile. Compose only uses your final image(s), regardless of how you built them, so the two are unrelated.