I am setting up an automatic build from which I would like to produce 2 images.
The use-case is in building and distributing a library:
- one image with the dependencies which will be reused for building and testing on Travis
- one image to provide the built software libs
Basically, I need to be able to push an image of the container at a certain point (before building) and another later (after building and installing).
Is this possible? I did not find anything relevant in Dockerfile docs.
You can do that using Docker multi-stage builds. Have two Dockerfiles:
Dockerfile
FROM alpine
RUN apk update && apk add gcc
RUN echo "This is a test" > /tmp/builtfile
Dockerfile-prod
FROM myapp:testing as source
FROM alpine
COPY --from=source /tmp/builtfile /tmp/builtfile
RUN cat /tmp/builtfile
build.sh
docker build -t myapp:testing .
docker build -t myapp:production -f Dockerfile-prod .
So to explain: we build the image with the dependencies first. Then in our second Dockerfile-prod, we simply include a FROM of our previously built image, and copy the built file into the production image.
Truncated output from my build
vagrant@vagrant:~/so$ ./build.sh
Step 1/3 : FROM alpine
Step 2/3 : RUN apk update && apk add gcc
Step 3/3 : RUN echo "This is a test" > /tmp/builtfile
Successfully tagged myapp:testing
Step 1/4 : FROM myapp:testing as source
Step 2/4 : FROM alpine
Step 3/4 : COPY --from=source /tmp/builtfile /tmp/builtfile
Step 4/4 : RUN cat /tmp/builtfile
This is a test
Successfully tagged myapp:production
For more information refer to https://docs.docker.com/engine/userguide/eng-image/multistage-build/#name-your-build-stages
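A variant worth knowing (assuming Docker 17.05+, same version that introduced multi-stage builds): both stages can live in a single Dockerfile, and docker build --target lets you tag the intermediate stage on its own, so the two-file setup above becomes optional. A sketch; the stage name "builder" is my own choice:

```Dockerfile
# Single Dockerfile covering both images
FROM alpine AS builder
RUN apk update && apk add gcc
RUN echo "This is a test" > /tmp/builtfile

FROM alpine
COPY --from=builder /tmp/builtfile /tmp/builtfile
```

The two tags are then produced with:
docker build --target builder -t myapp:testing .
docker build -t myapp:production .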
Related
I'm trying to use the custom build output within docker to export some files to the build environment as artifacts, but even though the files seem to be output, there's nothing I can find on the build agent.
For simplicity, here's a cut-down version of a Dockerfile that uses a scratch image as the final image with a single message file.
FROM debian as build
RUN echo "Hello World" > message
FROM scratch as final
COPY --from=build message .
And here's an example of the scripts run for our CI pipeline within the bitbucket-pipelines.yml:
image: atlassian/default-image:2
pipelines:
default:
- step:
name: Build
script:
- docker build -o build-output .
- ls
- find / -name "build-output" -type d
services:
- docker
Here's the log output from Bitbucket pipelines.
Images used:
build : docker.io/atlassian/default-image@sha256:d8ae266b47fce4078de5d193032220e9e1fb88106d89505a120dfe41cb592a7b
+ docker build -o build-output .
Sending build context to Docker daemon 65.02kB
Step 1/4 : FROM debian as build
latest: Pulling from library/debian
627b765e08d1: Pulling fs layer
627b765e08d1: Verifying Checksum
627b765e08d1: Download complete
627b765e08d1: Pull complete
Digest: sha256:cc58a29c333ee594f7624d968123429b26916face46169304f07580644dde6b2
Status: Downloaded newer image for debian:latest
---> 0980b84bde89
Step 2/4 : RUN echo "Hello World" > message
---> Running in dfcb5471840d
Removing intermediate container dfcb5471840d
---> 245eb3333d6c
Step 3/4 : FROM scratch as final
--->
Step 4/4 : COPY --from=build message .
---> a8805c31d962
Successfully built a8805c31d962
+ ls
Dockerfile
bitbucket-pipelines.yml
+ find / -name "build-output" -type d
As you can see, the build-output directory is nowhere to be seen! 👻
Here's a public build with the whole example - https://bitbucket.org/kev_bite/dockerbuildoutput/addon/pipelines/home#!/results/2
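One thing worth checking, hedged since I can't verify the Docker version in Bitbucket's docker service: docker build -o/--output is a BuildKit feature, and the log above shows the classic builder running (note the "Sending build context to Docker daemon" line). Enabling BuildKit in the step environment may be all that's missing; a sketch of the adjusted pipeline:

```yaml
image: atlassian/default-image:2
pipelines:
  default:
    - step:
        name: Build
        script:
          - export DOCKER_BUILDKIT=1
          - docker build -o build-output .
          - ls build-output
        services:
          - docker
```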
I've been reading the docs Best practices for writing Dockerfiles. I encountered a small inaccuracy (IMHO) whose meaning became clear after reading further:
Using apt-get update alone in a RUN statement causes caching issues
and subsequent apt-get install instructions fail.
Why would they fail, I wondered. The explanation of what they meant by "fail" came later:
Because the apt-get update is not run, your build can potentially get
an outdated version of the curl and nginx packages.
However, for the following I still cannot understand what they mean by "If not, the cache is invalidated.":
Starting with a parent image that is already in the cache, the next
instruction is compared against all child images derived from that
base image to see if one of them was built using the exact same
instruction. If not, the cache is invalidated.
That part is mentioned in some answers on SO e.g. How does Docker know when to use the cache during a build and when not? and as a whole the concept of cache invalidation is clear to me, I've read below:
When does Docker image cache invalidation occur?
Which algorithm Docker uses for invalidate cache?
But what is the meaning of "if not"? At first I was sure the phrase meant "if no such image is found". That would be overkill: invalidating a cache that may be useful later for other builds. And indeed the cache is not invalidated when no image is found, as I saw when I tried the following:
$ docker build -t alpine:test1 - <<HITTT
> FROM apline
> RUN echo "test1"
> RUN echo "test1-2"
> HITTT
Sending build context to Docker daemon 3.072kB
Step 1/3 : FROM apline
pull access denied for apline, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
(base) nb0408:docker a.martianov$ docker build -t alpine:test1 - <<HITTT
> FROM alpine
> RUN echo "test1"
> RUN echo "test1-2"
> HITTT
Sending build context to Docker daemon 3.072kB
Step 1/3 : FROM alpine
---> 965ea09ff2eb
Step 2/3 : RUN echo "test1"
---> Running in 928453d33c7c
test1
Removing intermediate container 928453d33c7c
---> 0e93df31058d
Step 3/3 : RUN echo "test1-2"
---> Running in b068bbaf8a75
test1-2
Removing intermediate container b068bbaf8a75
---> daeaef910f21
Successfully built daeaef910f21
Successfully tagged alpine:test1
$ docker build -t alpine:test1-1 - <<HITTT
> FROM alpine
> RUN echo "test1"
> RUN echo "test1-3"
> HITTT
Sending build context to Docker daemon 3.072kB
Step 1/3 : FROM alpine
---> 965ea09ff2eb
Step 2/3 : RUN echo "test1"
---> Using cache
---> 0e93df31058d
Step 3/3 : RUN echo "test1-3"
---> Running in 74aa60a78ae1
test1-3
Removing intermediate container 74aa60a78ae1
---> 266bcc6933a8
Successfully built 266bcc6933a8
Successfully tagged alpine:test1-1
$ docker build -t alpine:test1-2 - <<HITTT
> FROM alpine
> RUN "test2"
> RUN
(base) nb0408:docker a.martianov$ docker build -t alpine:test2 - <<HITTT
> FROM alpine
> RUN echo "test2"
> RUN echo "test1-3"
> HITTT
Sending build context to Docker daemon 3.072kB
Step 1/3 : FROM alpine
---> 965ea09ff2eb
Step 2/3 : RUN echo "test2"
---> Running in 1a058ddf901c
test2
Removing intermediate container 1a058ddf901c
---> cdc31ac27a45
Step 3/3 : RUN echo "test1-3"
---> Running in 96ddd5b0f3bf
test1-3
Removing intermediate container 96ddd5b0f3bf
---> 7d8b901f3939
Successfully built 7d8b901f3939
Successfully tagged alpine:test2
$ docker build -t alpine:test1-3 - <<HITTT
> FROM alpine
> RUN echo "test1"
> RUN echo "test1-3"
> HITTT
Sending build context to Docker daemon 3.072kB
Step 1/3 : FROM alpine
---> 965ea09ff2eb
Step 2/3 : RUN echo "test1"
---> Using cache
---> 0e93df31058d
Step 3/3 : RUN echo "test1-3"
---> Using cache
---> 266bcc6933a8
Successfully built 266bcc6933a8
Successfully tagged alpine:test1-3
The cache was again used for the last build. So what do the docs mean by "if not"?
Let's focus on your original problem (regarding apt-get update) to make things easier. The following example is not based on any best practices. It just illustrates the point you are trying to understand.
Suppose you have the following Dockerfile:
FROM ubuntu:18.04
RUN apt-get update
RUN apt-get install -y nginx
You build a first image using docker build -t myimage:latest .
What happens is:
The ubuntu image is pulled if it does not exist locally
A layer is created and cached to run apt-get update
A layer is created and cached to run apt-get install -y nginx
Now suppose you modify your Dockerfile to be:
FROM ubuntu:18.04
RUN apt-get update
RUN apt-get install -y nginx openssl
and you run a build again with the same command as before. What happens is:
There is already an ubuntu image locally, so it will not be pulled (unless you force it with --pull)
A layer was already created with command apt-get update against the existing local image so it uses the cached one
The next command has changed, so a new layer is created to install nginx and openssl. Since the apt database was created in the preceding layer and taken from the cache, if a new nginx and/or openssl version has been released since then, you will not see it and will install the outdated packages.
Does this help you grasp the concept of cached layers?
In this particular example, the best approach is to do everything in a single layer, making sure you clean up after yourself:
FROM ubuntu:18.04
RUN apt-get update \
&& apt-get install -y nginx openssl \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
The line would be better phrased as:
If not, there is a cache miss and the cache is not used for this build step and any following build step of this stage of the Dockerfile.
That gets a bit verbose, because a multi-stage Dockerfile can fail to find a cache match in one stage and then find a match in another stage. Different builds can all use the cache. The cache is "invalidated" only for a specific build process; the cache itself is not removed from the docker host, and it remains available for future builds.
The Dockerfile contains several steps to build and create a docker image.
E.g. RUN dotnet restore, RUN dotnet build and RUN dotnet publish.
Is it possible to export the results/logging of each step to a separate file which can be displayed/ formatted in several Jenkins stages?
You can also export the Jenkins build log to a file using this plugin: https://github.com/cboylan/jenkins-log-console-log.
But if you want to view each step's log in your Jenkins console log, try this approach.
Create a job, build your docker image from a bash script, and run that script from Jenkins.
docker build --compress --no-cache --build-arg DOCKER_ENV=staging --build-arg DOCKER_REPO="${docker_name}" -t "${docker_name}:${docker_tag}" .
If you run this command from Jenkins or create a bash file, you will see each step's log as shown below. If you are looking for something more, let me know.
Building in workspace /var/lib/jenkins/workspace/testlog
[testlog] $ /bin/sh -xe /tmp/jenkins8370164159405243093.sh
+ cd /opt/containers/
+ ./scripts/abode_docker.sh build alpine base
verifying docker name: alpine
Docker name verified
verify_config retval= 0
comparing props
LIST: alpine:3.7
Now each step will be displayed under
http://jenkins.domain.com/job/testlog/1/console
Step 1/5 : FROM alpine:3.7
Step 2/5 : COPY id_rsa /root/.ssh/id_rsa
---> 6645bd2838c9
Step 3/5 : COPY supervisord.conf /etc/supervisord.conf
---> 635e37d9503e
.....
Step 5/5 : ONBUILD RUN ls /usr/share/zoneinfo && cp /usr/share/zoneinfo/Europe/Brussels /etc/localtime && echo "US/Eastern" > /etc/timezone && date
---> Running in 7b8517d90264
Removing intermediate container 7b8517d90264
---> 3ead0f40b7b4
Successfully built 3ead0f40b7b4
Successfully tagged alpine:3.7
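If you need each step in its own file rather than just visible in the console, one approach (a sketch; the step-NN.log file names are my own invention) is to split the classic, non-BuildKit build output on its "Step N/M :" marker lines:

```shell
# Split a docker build log, read on stdin, into one file per build step.
# A new step-NN.log file is started at every "Step N/M :" marker; lines
# before the first marker (e.g. "Sending build context...") are skipped.
split_steps() {
  awk '
    /^Step [0-9]+\/[0-9]+ :/ { close(out); n++; out = sprintf("step-%02d.log", n) }
    out != "" { print > out }
  '
}

# Usage (stderr merged, since docker build writes progress there too):
# docker build . 2>&1 | split_steps
```

Each resulting file can then be displayed by a separate Jenkins stage.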
I have a Dockerfile like this:
FROM python:2.7
RUN echo "Hello World"
When I build this the first time with docker build -f Dockerfile -t test ., or build it with the --no-cache option, I get this output:
Sending build context to Docker daemon 40.66MB
Step 1/2 : FROM python:2.7
---> 6c76e39e7cfe
Step 2/2 : RUN echo "Hello World"
---> Running in 5b5b88e5ebce
Hello World
Removing intermediate container 5b5b88e5ebce
---> a23687d914c2
Successfully built a23687d914c2
My echo command executes.
If I run it again without busting the cache, I get this:
Sending build context to Docker daemon 40.66MB
Step 1/2 : FROM python:2.7
---> 6c76e39e7cfe
Step 2/2 : RUN echo "Hello World"
---> Using cache
---> a23687d914c2
Successfully built a23687d914c2
Successfully tagged test-requirements:latest
Cache is used for Step 2/2, and Hello World is not executed. I could get it to execute again by using --no-cache. However, each time, even when I am using --no-cache it uses a cached python:2.7 base image (although, unlike when the echo command is cached, it does not say ---> Using cache).
How do I bust the cache for the FROM python:2.7 line? I know I can do FROM python:latest, but that also seems to just cache whatever the latest version is the first time you build the Dockerfile.
If I understood the context correctly, you can use --pull while using docker build to get the latest base image -
$ docker build -f Dockerfile.test -t test . --pull
So using both --no-cache and --pull will create an absolutely fresh image from the Dockerfile:
$ docker build -f Dockerfile.test -t test . --pull --no-cache
Issue - https://github.com/moby/moby/issues/4238
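Another common trick, separate from the answer above: insert a throwaway build argument before the lines you want to re-run. Changing its value invalidates the cache from that point onward (the ARG name CACHEBUST is arbitrary):

```Dockerfile
FROM python:2.7
ARG CACHEBUST=1
RUN echo "Hello World"
```

Then docker build --build-arg CACHEBUST=$(date +%s) -t test . re-runs the echo on every build while still reusing the cached base image.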
FROM pulls an image from the registry (DockerHub in this case).
After the image is pulled to produce your build, you will see it if you run docker images.
You may remove it by running docker rmi python:2.7.
I made a Docker container which is fairly large. When I commit the container to create an image, the image is about 7.8 GB. But when I export the container (not save the image!) to a tarball and re-import it, the image is only 3 GB. Of course the history is lost, but that is OK for me, since the image is "done" in my opinion and ready for deployment.
How can I flatten an image/container without exporting it to the disk and importing it again? And: Is it a wise idea to do that or am I missing some important point?
Now that Docker has released the multi-stage builds in 17.05, you can reformat your build to look like this:
FROM buildimage as build
# your existing build steps here
FROM scratch
COPY --from=build / /
CMD ["/your/start/script"]
The result is that your build-environment layers are cached on the build server, but only a flattened copy exists in the resulting image that you tag and push.
Note: you would typically reformulate this to have a complex build environment and copy over only a few directories. Here's an example with Go that makes a single-binary image from source code with a single build command, without installing Go on the host or compiling outside of docker:
$ cat Dockerfile
ARG GOLANG_VER=1.8
FROM golang:${GOLANG_VER} as builder
WORKDIR /go/src/app
COPY . .
RUN go-wrapper download
RUN go-wrapper install
FROM scratch
COPY --from=builder /go/bin/app /app
CMD ["/app"]
The go file is a simple hello world:
$ cat hello.go
package main
import "fmt"
func main() {
fmt.Printf("Hello, world.\n")
}
The build creates both environments, the build environment and the scratch one, and then tags the scratch one:
$ docker build -t test-multi-hello .
Sending build context to Docker daemon 4.096kB
Step 1/9 : ARG GOLANG_VER=1.8
--->
Step 2/9 : FROM golang:${GOLANG_VER} as builder
---> a0c61f0b0796
Step 3/9 : WORKDIR /go/src/app
---> Using cache
---> af5177aae437
Step 4/9 : COPY . .
---> Using cache
---> 976490d44468
Step 5/9 : RUN go-wrapper download
---> Using cache
---> e31ac3ce83c3
Step 6/9 : RUN go-wrapper install
---> Using cache
---> 2630f482fe78
Step 7/9 : FROM scratch
--->
Step 8/9 : COPY --from=builder /go/bin/app /app
---> Using cache
---> 5645db256412
Step 9/9 : CMD /app
---> Using cache
---> 8d428d6f7113
Successfully built 8d428d6f7113
Successfully tagged test-multi-hello:latest
Looking at the images, only the single binary is in the image being shipped, while the build environment is over 700MB:
$ docker images | grep 2630f482fe78
<none> <none> 2630f482fe78 6 days ago 700MB
$ docker images | grep 8d428d6f7113
test-multi-hello latest 8d428d6f7113 6 days ago 1.56MB
And yes, it runs:
$ docker run --rm test-multi-hello
Hello, world.
As of Docker 1.13, you can use the --squash flag.
Before version 1.13:
To my knowledge, you cannot do this using the Docker API. docker export and docker import are designed for this scenario, as you yourself already mention.
If you don't want to save to disk, you could probably pipe the output stream of export into the input stream of import. I have not tested this, but try:
docker export red_panda | docker import - exampleimagelocal:new
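One caveat with this route, worth hedging: docker import produces a bare filesystem image and drops metadata such as CMD, ENV and ENTRYPOINT. They can be reapplied with --change (the CMD value below is a placeholder):

```shell
docker export red_panda | docker import --change 'CMD ["/your/start/script"]' - exampleimagelocal:new
```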
Take a look at docker-squash
Install with:
pip install docker-squash
Then, if you have an image, you can squash all layers into one with:
docker-squash -f <nr_layers_to_squash> -t new_image:tag existing_image:tag
A quick one-liner that is useful for me to squash all layers:
docker-squash -f $(($(docker history $IMAGE_NAME | wc -l | xargs)-1)) -t ${IMAGE_NAME}:squashed $IMAGE_NAME
Build the image with the --squash flag:
https://docs.docker.com/engine/reference/commandline/build/#squash-an-images-layers---squash-experimental
Also consider mopping up unneeded files, such as the apt cache:
RUN apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*