How to flatten a Docker image?

How to flatten a Docker image? - docker

I made a Docker container which is fairly large. When I commit the container to create an image, the image is about 7.8 GB big. But when I export the container (not save the image!) to a tarball and re-import it, the image is only 3 GB big. Of course the history is lost, but this OK for me, since the image is "done" in my opinion and ready for deployment.
How can I flatten an image/container without exporting it to the disk and importing it again? And: Is it a wise idea to do that or am I missing some important point?

Now that Docker has released the multi-stage builds in 17.05, you can reformat your build to look like this:
FROM buildimage as build
# your existing build steps here
FROM scratch
COPY --from=build / /
CMD ["/your/start/script"]
The result will be your build environment layers are cached on the build server, but only a flattened copy will exist in the resulting image that you tag and push.
Note, you would typically reformulate this to have a complex build environment and only copy over a few directories. Here's an example with Go to make a single binary image from source code and a single build command without installing Go on the host and compiling outside of docker:
$ cat Dockerfile
ARG GOLANG_VER=1.8
FROM golang:${GOLANG_VER} as builder
WORKDIR /go/src/app
COPY . .
RUN go-wrapper download
RUN go-wrapper install
FROM scratch
COPY --from=builder /go/bin/app /app
CMD ["/app"]
The go file is a simple hello world:
$ cat hello.go
package main
import "fmt"
func main() {
fmt.Printf("Hello, world.\n")
}
The build creates both environments, the build environment and the scratch one, and then tags the scratch one:
$ docker build -t test-multi-hello .
Sending build context to Docker daemon 4.096kB
Step 1/9 : ARG GOLANG_VER=1.8
--->
Step 2/9 : FROM golang:${GOLANG_VER} as builder
---> a0c61f0b0796
Step 3/9 : WORKDIR /go/src/app
---> Using cache
---> af5177aae437
Step 4/9 : COPY . .
---> Using cache
---> 976490d44468
Step 5/9 : RUN go-wrapper download
---> Using cache
---> e31ac3ce83c3
Step 6/9 : RUN go-wrapper install
---> Using cache
---> 2630f482fe78
Step 7/9 : FROM scratch
--->
Step 8/9 : COPY --from=builder /go/bin/app /app
---> Using cache
---> 5645db256412
Step 9/9 : CMD /app
---> Using cache
---> 8d428d6f7113
Successfully built 8d428d6f7113
Successfully tagged test-multi-hello:latest
Looking at the images, only the single binary is in the image being shipped, while the build environment is over 700MB:
$ docker images | grep 2630f482fe78
<none> <none> 2630f482fe78 6 days ago 700MB
$ docker images | grep 8d428d6f7113
test-multi-hello latest 8d428d6f7113 6 days ago 1.56MB
And yes, it runs:
$ docker run --rm test-multi-hello
Hello, world.

Up from Docker 1.13, you can use the --squash flag.
Before version 1.13:
To my knowledge, you cannot using the Docker api. docker export and docker import are designed for this scenario, as you yourself already mention.
If you don't want to save to disk, you could probably pipe the outputstream of export into the input stream of import. I have not tested this, but try
docker export red_panda | docker import - exampleimagelocal:new

Take a look at docker-squash
Install with:
pip install docker-squash
Then, if you have a image, you can squash all layers into 1 with
docker-squash -f <nr_layers_to_squash> -t new_image:tag existing_image:tag
A quick 1-liner that is useful for me to squash all layers:
docker-squash -f $(($(docker history $IMAGE_NAME | wc -l | xargs)-1)) -t ${IMAGE_NAME}:squashed $IMAGE_NAME

Build the image with the --squash flag:
https://docs.docker.com/engine/reference/commandline/build/#squash-an-images-layers---squash-experimental
Also consider mopping up unneeded files, such as the apt cache:
RUN apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

Related

Docker not able to find/run binary

I have what I believe is a pretty simple setup.
I build a binary file outside of docker and then try to add it using this Dockerfile
FROM alpine
COPY apps/dist/apps /bin/
RUN chmod +x /bin/apps
RUN ls -al /bin | grep apps
CMD /bin/apps
And I think this should work.
The binary on its own seems to work on my host machine and I don't understand why it wouldn't on the docker image.
Anyways, the output I get is this:
docker build -t apps -f app.Dockerfile . && docker run apps
Sending build context to Docker daemon 287.5MB
Step 1/5 : alpine
---> d05cf6536f67
Step 2/5 : COPY apps/dist/apps /bin/
---> Using cache
---> c54d6d57154e
Step 3/5 : RUN chmod +x /bin/apps
---> Using cache
---> aa7e6adb0981
Step 4/5 : RUN ls -al /bin | grep apps
---> Running in 868c5e235d68
-rwxr-xr-x 1 root root 68395166 Dec 20 13:35 apps
Removing intermediate container 868c5e235d68
---> f052c06269b0
Step 5/5 : CMD /bin/apps
---> Running in 056fd02733e1
Removing intermediate container 056fd02733e1
---> 331600154cbe
Successfully built 331600154cbe
Successfully tagged apps:latest
/bin/sh: /bin/apps: not found
does this make sense, and am I just missing something obvious?

Your binary likely has dynamic links to libraries that don't exist inside the image filesystem. You can check those dynamic links with the ldd apps/dist/apps command.

Docker build not using cache

When I run
docker build -f docker/webpack.docker services/webpack --build-arg env=production
twice in a row, Docker builds my image each time, starting from the first RUN (the COPY uses the cache).
FROM node:lts
ARG env=production
ENV NODE_ENV=$env
WORKDIR /app
COPY package.json yarn.lock ./
RUN yarn install --frozen-lockfile --production=false --non-interactive
COPY . .
RUN node --max-old-space-size=20000 node_modules/.bin/svg2fonts icons -o assets/markons -b mrkn -f markons -n Markons
RUN node --max-old-space-size=20000 node_modules/.bin/webpack --progress
How can I get it to cache those RUNs?
Output looks like:
Sending build context to Docker daemon 3.37MB
Step 1/9 : FROM node:lts
---> 0c601cba9f11
Step 2/9 : ARG env=production
---> Using cache
---> dd38b2167c75
Step 3/9 : ENV NODE_ENV=$env
---> Using cache
---> 800f5afd416c
Step 4/9 : WORKDIR /app
---> Using cache
---> d15b93dce11d
Step 5/9 : COPY package.json yarn.lock ./
---> Using cache
---> a049dd1609a8
Step 6/9 : RUN yarn install --frozen-lockfile --production=false --non-interactive
---> Using cache
---> d5e51b0d556c
Step 7/9 : COPY . .
---> 92990e326d4b
Step 8/9 : RUN node --max-old-space-size=20000 node_modules/.bin/svg2fonts icons -o assets/markons -b mrkn -f markons -n Markons
---> Running in a23878db7b0e
Wrote assets/markons/markons.css
Wrote assets/markons/markons.js
Wrote assets/markons/markons.html
Wrote assets/markons/markons-chars.json
Wrote assets/markons/markons.svg
Wrote assets/markons/markons.ttf
Wrote assets/markons/markons.woff
Wrote assets/markons/markons.woff2
Wrote assets/markons/markons.eot
Removing intermediate container a23878db7b0e
---> 3bce79d0ecf0
Step 9/9 : RUN node --max-old-space-size=20000 node_modules/.bin/webpack --progress
---> Running in b6d460488950
<s> [webpack.Progress] 0% compiling
...

See the description:
If the contents of all external files on the first COPY command are
the same, the layer cache will be used and all subsequent commands
until the next ADD or COPY command will use the layer cache.
However, if the contents of one or more external files are different,
then all subsequent commands will be executed without using the layer
cache.
So every time the content is changed two last RUN will be executed with no cache. There is no way to control caching yet. Maybe it's a better option to specify volumes?

docker /bin/sh: 1: .: Can't open /path

I have an sbt project projectA under home/demo/projectA my Dockerfile resides in /home/demo/ for some reason i don't want it to be inside projectA
so hierarchy looks like this
home/demo
Dockerfile
projectA
here i am trying to run sbt command in the image build process here is the contents of my Dockerfile
FROM hseeberger/scala-sbt:11.0.2_2.12.8_1.2.8 as stripecommon
MAINTAINER sara <sarawaheed3191#gmail.com>
WORKDIR /aa
RUN \
. /home/demo/projectA sbt
I am getting this error when building the image
:~/home/demo$ docker build -t testapp .
Sending build context to Docker daemon 1.297GB
Step 1/4 : FROM hseeberger/scala-sbt:11.0.2_2.12.8_1.2.8 as stripecommon
---> 349a7e4f4029
Step 2/4 : MAINTAINER sara <sarawaheed3191#gmail.com>
---> Using cache
---> 8603662d3730
Step 3/4 : WORKDIR /aa
---> Using cache
---> f07ec5bb4d34
Step 4/4 : RUN . /home/demo/projectA sbt
---> Running in 7509ee45f622
/bin/sh: 1: .: Can't open /home/demo/projectA
The command '/bin/sh -c . /home/demo/projectA sbt' returned a non-zero code: 2
what is the right way to do this also i am a beginner in docker help will be appreciated

You need to make sure that projectA exists inside the container. so for this either you pick code from github or copy it using COPY or ADD command. After that you can build it using sbt.

Go 1.11 unknown import path for own package in Docker build

I am migrating some code to work with Go 1.11 modules, and I am able to build it from the shell but not in Docker.
Relevant Dockerfile sections:
WORKDIR /goscout
COPY ["go.mod", "go.sum", "./"]
RUN GO111MODULE=on go get -u=patch
COPY *.go ./
RUN GO111MODULE=on go build -v -ldflags "-linkmode external -extldflags -static" -o GoScout -a .
When Docker is running the last command in the above excerpt, I get this error:
can't load package: package github.com/triplestrange/StrangeScout/goscout: unknown import path "github.com/triplestrange/StrangeScout/goscout": ambiguous import: found github.com/triplestrange/StrangeScout/goscout in multiple modules:
github.com/triplestrange/StrangeScout/goscout (/goscout)
github.com/triplestrange/StrangeScout v0.3.0 (/go/pkg/mod/github.com/triplestrange/!strange!scout#v0.3.0/goscout)
I don't get this in the shell, so I'm guessing I am not copying some files correctly. But before this command runs I have copied go.mod, go.sum, and *.go, so I don't know what could be missing.

Make sure that you initialized modules properly for your project
go mod init github.com/triplestrange/StrangeScout/goscout
so that the content of your go.mod is
module github.com/triplestrange/StrangeScout/goscout
And then you can use your current Dockerfile without any changes.
There is no need to set GO111MODULE=on since you're running go commands outside of the $GOPATH
➜ docker build -t goscout .
Sending build context to Docker daemon 47.1kB
Step 1/11 : FROM golang:latest AS builder
---> fb7a47d8605b
Step 2/11 : WORKDIR /goscout
---> Running in e9786fe5ab53
Removing intermediate container e9786fe5ab53
---> 6d101e346175
Step 3/11 : COPY ./ ./
---> 7081c0b47dc9
Step 4/11 : RUN go get -d -v ./...
---> Running in 3ce69359ae88
go: finding github.com/go-sql-driver/mysql v1.4.0
go: finding github.com/gorilla/mux v1.6.2
go: downloading github.com/gorilla/mux v1.6.2
go: downloading github.com/go-sql-driver/mysql v1.4.0
Removing intermediate container 3ce69359ae88
...
---> 3df0dbca80e5
Successfully built 3df0dbca80e5
Successfully tagged goscout:latest

Dockerfile not using cache in RUN composer install command

I thought i understand Docker already, but today i found some problem about utilizing docker cache.
Here is my dockerfile
FROM quay.io/my_company/phpjenkins
WORKDIR /usr/src/my_project
ADD composer.json composer.json
ADD composer.lock composer.lock
RUN composer install -o
ADD . .
RUN mkdir -p temp/unittest/cache log
RUN cp app/config/config.unittest.template.neon app/config/config.unittest.neon
CMD ["tail", "-f", "/dev/null"]
I expect docker to use the cache until ADD . .
However, every build, look like docker try to do composer install every time.
Here is some output
+ docker-compose -f docker-compose.yml run app vendor/bin/phpunit -d memory_limit=2048M
Creating network "xxx_default" with the default driver
Creating xxx_rabbitmq_1
Creating xxx_mysql_1
Building app
Step 1/9 : FROM quay.io/my_company/phpjenkins
---> f10ea65fb7df
Step 2/9 : WORKDIR /usr/src/my_project
---> Using cache
---> 07ad76770cd2
Step 3/9 : ADD composer.json composer.json
---> Using cache
---> 0d22314b81af
Step 4/9 : ADD composer.lock composer.lock
---> Using cache
---> 3d41825efcb3
Step 5/9 : RUN composer install -o
---> Running in 38de5f08eb46
Warning: This development build of composer is over 60 days old. It is recommended to update it by running "/usr/local/bin/composer self-update" to get the latest version.
Do not run Composer as root/super user! See https://getcomposer.org/root for details ....
...
---> aa05dc9ddc5f
Removing intermediate container 581aa7e4b00f
Step 6/9 : ADD . .
---> 8796a9235b9a
Removing intermediate container b7354231fbd7
I run out of lead, what could be possible thing that dockerfile didn't use cache for RUN composer install command
I'm using Docker version 17.05.0-ce, build 89658be on Debian, if this help for investigation.
Please advise.

As a work-around you could create two Dockerfiles. One that creates an image at the point where you would like to cache. The second Dockerfile can then use the first image as its base and make modifications as required.
FROM quay.io/my_company/phpjenkins
WORKDIR /usr/src/my_project
ADD composer.json composer.json
ADD composer.lock composer.lock
RUN composer install -o
CMD ["tail", "-f", "/dev/null"]
Build this file to mycomposerimage using
docker build -t mycomposerimage .
Then second dockerfile picks up from there
FROM mycomposerimage
WORKDIR /usr/src/my_project
ADD . .
RUN mkdir -p temp/unittest/cache log
RUN cp app/config/config.unittest.template.neon app/config/config.unittest.neon
CMD ["tail", "-f", "/dev/null"]

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart