How to correctly install programs in Docker?

I know that every RUN line adds a layer to the Docker image and that it is recommended to chain RUN commands with &&. But my question is:
Which is better, this:
RUN apk update && apk upgrade \
&& apk add openssh \
&& apk add --update nodejs nodejs-npm \
&& npm install -g @angular/cli \
&& apk add openjdk11 \
&& apk add maven \
&& apk add git
Or this:
RUN apk update && apk upgrade
RUN apk add openssh
RUN apk add --update nodejs nodejs-npm
RUN npm install -g @angular/cli
RUN apk add openjdk11
RUN apk add maven
RUN apk add git
The first one creates just one layer, but when the version of anything changes, the whole layer has to be rebuilt from the beginning instead of coming from cache. The second approach creates more layers, but when only the version of git changes, only the git layer needs to be rebuilt and all previous layers can be reused from cache.

I'd recommend:
Install all the OS packages in a single apk invocation. There is some overhead in starting the package manager (more noticeable with dpkg/apt), and it is faster to start it once and install several packages.
If you need to run an update command, always run it in the same RUN command as your other package-manager steps. This avoids a problem with Docker layer caching (again, very noticeable with apt): docker build reuses the cached update layer but re-runs a changed install step. The install then uses yesterday's package index, and if a newer upload of a package has since replaced the file that index points to, the download fails.
Don't npm install single packages. That means your package.json file is incomplete. Add the package there instead.
I've seen recommendations both ways as to whether or not to run a full upgrade. Keeping up-to-date on security fixes is important; the underlying base images on Docker Hub also update pretty regularly. So if your image is FROM alpine:latest, doing a docker build --pull will get you much of the effect of an explicit apk upgrade.
Stylistically, if I need any substantial number of packages, I find the list a little more maintainable if I sort it alphabetically and put one package on a line, but this is purely personal preference.
Putting this all together would transform your example into:
RUN apk update \
 && apk upgrade \
 && apk add \
      git \
      maven \
      nodejs \
      nodejs-npm \
      openjdk11 \
      openssh
# package.json includes @angular/cli
COPY package.json package-lock.json ./
RUN npm ci
Don't be afraid to use multiple containers, if that makes sense. (What's your application that uses both Java and Node together; can it be split into two single-language parts?) Don't install unnecessary developer-oriented tools in your image. (Does your application invoke git while it's running; do you install a dependency directly from GitHub; or can you remove git?) Don't try to run an ssh daemon in your container. (It breaks the "one process per container" rule which instantly makes things harder to manage; 90% of the SO examples have a hard-coded user password plus sudo rights, which is not really a security best practice; managing the credentials is essentially impossible.)

Both approaches are extremes. You need to balance two goals: splitting layers so they can be reused from cache, and keeping the total number of layers low.
Based on your example, the build can be organized as follows:
RUN apk update && apk upgrade \
&& apk add openssh \
&& apk add --update nodejs nodejs-npm \
&& apk add openjdk11 \
&& apk add maven \
&& apk add git
RUN npm install -g @angular/cli
Now there are only 2 layers: the first one brings in the OS packages and the second one deals with the Node.js packages. This split can be reused better in other builds.
Once you have made this change, you can move to a multi-stage build, which gives you better control over the intermediate stages and lets you reuse them.
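A multi-stage build along those lines might look like the following minimal sketch. It assumes a Maven project that produces a jar. The alpine:3.14 tag, the /src path, the app.jar name, and the openjdk11-jre-headless runtime package are illustrative assumptions, not taken from the question.

```dockerfile
# Stage 1: build environment with the Java toolchain from the question.
FROM alpine:3.14 AS build
RUN apk add --no-cache openjdk11 maven
WORKDIR /src
# Copy pom.xml first so the dependency layer is reused while only sources change.
COPY pom.xml .
RUN mvn -B dependency:go-offline
COPY src ./src
RUN mvn -B package

# Stage 2: runtime image; only the built artifact is copied over.
FROM alpine:3.14
# Assumed runtime package; a JRE is enough to run the jar.
RUN apk add --no-cache openjdk11-jre-headless
# app.jar is a hypothetical artifact name.
COPY --from=build /src/target/app.jar /app/app.jar
CMD ["java", "-jar", "/app/app.jar"]
```

The final image then contains neither Maven nor the JDK, and the first stage can be built on its own with docker build --target build.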

Related

Cannot explain why Alpine apk upgrade command does not update ncurses package although a newer version exists

I have a Dockerfile as:
FROM nginx:1.21.3-alpine
RUN apk update && apk add bash \
&& apk upgrade
I can see that the ncurses package is installed at version 6.2_p20210612-r0.
There is a newer package available in the edge branch of the main repository, with version 6.2_p20211002-r0.
As far as I understand, after building an image from the Dockerfile above, ncurses should be updated to 6.2_p20211002-r0, but it stays at 6.2_p20210612-r0. Why?
I confirmed this by running a container after build and running:
apk info -a ncurses
The output was:
ncurses-6.2_p20210612-r0 installed size:
284 KiB
The nginx:1.21.3-alpine image is based on Alpine 3.14 (see cat /etc/os-release), and therefore ncurses is updated with the version of the Alpine 3.14 repository, which is currently 6.2_p20210612-r0.
For installing ncurses from edge (currently version 6.2_p20211002-r0), you could specify the edge repository explicitly in the apk command:
apk add ncurses --repository=http://dl-cdn.alpinelinux.org/alpine/edge/main
Mixing and matching packages from different repositories this way might be OK in some cases, but has to be tested carefully. For ncurses, some functionality might break unless the matching ncurses-libs package is installed as well. Some of the packages in the image depend on ncurses-libs, so re-installing it triggers an update of those packages. Moreover, the dependent nginx-module-njs package must be removed. If this is acceptable, you could modify the Dockerfile as follows:
FROM nginx:1.21.3-alpine
RUN apk update && \
apk del ncurses ncurses-libs nginx-module-njs && \
apk add ncurses ncurses-libs --repository=http://dl-cdn.alpinelinux.org/alpine/edge/main && \
apk add bash && \
apk upgrade

Docker multistage build vs. keeping artifacts in git

My target container is a build environment container, so my team would build an app in a uniform environment.
This app doesn't necessarily run as a container - it runs on physical machine. The container is solely for building.
The app depends on third parties.
Some I can apt-get install with Dockerfile RUN command.
And some I must build myself because they require special building.
I was wondering which way is better.
Using multistage build seems cool; Dockerfile for example:
From ubuntu:18.04 as third_party
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
...
ADD http://.../boost.tar.gz /
RUN tar boost.tar.gz && \
... && \
make --prefix /boost_out ...
From ubuntu:18.04 as final
COPY --from=third_party /boost_out/ /usr/
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
...
CMD ["bash"]
...
Pros:
Automatically built when I build my final container
Easy to change third party version (boost in this example)
Cons
ADD command downloads ~100MB file each time, makes image build process slower
I want to use --cache-from so I would be able to cache third_party and build from different docker host machine. Meaning I need to store ~1.6GB image in a docker registry. That's pretty heavy to pull/push.
On the other hand
I could just build boost (with this third_party image) and store its artifacts on some storage, git for example. It would take ~200MB which is better than storing 1.6GB image.
Pros:
Smaller disc space
Cons:
Cumbersome build
Manually build and push artifacts to git when changing boost version.
Somehow link Docker build and git to pull newest artifacts and COPY to the final image.
Either way I need a third_party image that uniformly and automatically builds the third parties. In option 1 that image is bigger than in option 2, where it contains just the build tools and not the build artifacts.
Is this the trade-off?
1. is more automatic but consumes more disk space and push/pull time,
2. is cumbersome but consumes less disk space and push/pull time?
Are there any other virtues for any of these ways?
I'd like to propose changing your first attempt to something like this:
FROM ubuntu:18.04 as third_party
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
...
RUN wget http://.../boost.tar.gz -O /boost.tar.gz && \
tar xvf boost.tar.gz && \
... && \
make --prefix /boost_out ... && \
find -name \*.o -delete && \
rm /boost.tar.gz # this is important!
FROM ubuntu:18.04 as final
COPY --from=third_party /boost_out/ /usr/
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
...
CMD ["bash"]
This way, you are paying for the download of boost only once (when building the image without a cache), and you do not pay for the storage/pull-time of the original tar-ed sources. Additionally, you should remove unneeded target files (.o?) from the build in the same step in which they are generated. Otherwise, they are stored and pulled as well.
If you are at liberty to post the whole Dockerfile, I'll gladly take a deeper look at it and give you some hints.

Code changes break apk add cache

Is it possible to split the apk add and go build commands so that a code change doesn't re-install the apk dependencies?
FROM golang:1.8-alpine AS go-build-env
RUN apk update && apk upgrade && apk add --no-cache bash git
RUN go build /bin/webui main.go
EDIT: updated
FROM golang:1.8-alpine AS go-build-env
RUN apk update && apk upgrade && apk add --no-cache bash git openssh curl g++ \
make perl; go-wrapper download
RUN mkdir -p /go/src/github.com/markwallsgrove/saml_federation_proxy \
/go/src/github.com/markwallsgrove/saml_federation_proxy/models \
/go/src/github.com/markwallsgrove/saml_federation_proxy/webui
COPY webui/main.go /go/src/github.com/markwallsgrove/saml_federation_proxy/webui
COPY models /go/src/github.com/markwallsgrove/saml_federation_proxy/models
WORKDIR /go/src/github.com/markwallsgrove/saml_federation_proxy/webui
The Dockerfile as written does not contain any ADD instructions, so main.go isn't present.
You're also not dealing with an "apt-get" cache, as you're using Alpine and apk, but looking beyond those errors...
To keep Docker layers cached regardless of code changes, put them above any ADD / COPY instructions -- when the copied files change, these invalidate all layers below them.
In your example dockerfile it would look something like this:
FROM golang:1.8-alpine AS go-build-env
RUN apk update && apk upgrade && apk add --no-cache bash git
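# Add the source only after the package layer so code changes don't invalidate it
ADD main.go .
# -o names the output binary
RUN go build -o /bin/webui main.go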

Docker + older version of Elixir/Phoenix

I have been requested to move an Elixir/Phoenix app to Docker, with which I have no prior experience. The app uses non-latest versions of Elixir and Phoenix so I have had to diverge from the code online which generally focuses on latest versions. That led me to write this Dockerfile
# FROM bitwalker/alpine-elixir:latest
FROM bitwalker/alpine-elixir:1.3.4
MAINTAINER Paul Schoenfelder <paulschoenfelder@gmail.com>
# Important! Update this no-op ENV variable when this Dockerfile
# is updated with the current date. It will force refresh of all
# of the base images and things like `apt-get update` won't be using
# old cached versions when the Dockerfile is built.
ENV REFRESHED_AT=2017-07-26 \
# Set this so that CTRL+G works properly
TERM=xterm
# Install NPM
RUN \
mkdir -p /opt/app && \
chmod -R 777 /opt/app && \
apk update && \
apk --no-cache --update add \
git make g++ wget curl inotify-tools \
nodejs nodejs-current-npm && \
npm install npm -g --no-progress && \
update-ca-certificates --fresh && \
rm -rf /var/cache/apk/*
# Add local node module binaries to PATH
ENV PATH=./node_modules/.bin:$PATH \
HOME=/opt/app
# Install Hex+Rebar
RUN mix local.hex --force && \
mix local.rebar --force
WORKDIR /opt/app
CMD ["/bin/sh"]
<then it goes on to add some Elixir dependencies>
On running
sudo docker build -t phoenix .
I end up with the error below and am wondering how to get around it. Noting 'current' in the package name, I'm wondering whether I should use an older version of nodejs, and if so, how to do that. Beyond that I am open to any and all suggestions.
ERROR: unsatisfiable constraints:
nodejs-current-npm (missing):
required by: world[nodejs-current-npm]
musl-1.1.14-r14:
breaks: musl-dev-1.1.14-r15[musl=1.1.14-r15]
That looks like bitwalker/alpine-elixir issue 5:
when using tagged images, you may sometimes need to explicitly upgrade packages, as the installed packages are at the versions found when building the image.
Generally it's as simple as adding apk --update upgrade before any commands which install packages.
Indeed, when you compare the old elixir 1.4.4-based Dockerfile, and the latest one, you will see an upgrade first in the latter:
apk --no-cache --update upgrade && \
apk add --no-cache --update --virtual .elixir-build \
...
Try and add that to your Dockerfile.
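Applied to the Dockerfile in the question, that might look like the sketch below. It is untested against this particular base image. The package list is copied from the question, and whether nodejs-current-npm resolves after the upgrade depends on the Alpine repositories configured in the image.

```dockerfile
FROM bitwalker/alpine-elixir:1.3.4

# Upgrade the preinstalled packages first so that newly added packages
# resolve against current versions (intended to address the musl / musl-dev mismatch).
RUN mkdir -p /opt/app && \
    chmod -R 777 /opt/app && \
    apk --no-cache --update upgrade && \
    apk --no-cache add \
      git make g++ wget curl inotify-tools \
      nodejs nodejs-current-npm && \
    npm install npm -g --no-progress && \
    update-ca-certificates --fresh
```

The upgrade and the install stay in one RUN instruction so that a cached layer never pairs an old package index with a new install step.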

Does alpine `apk` have an ubuntu `apt` `--no-install-recommends` equivalent

I'm trying to make the absolute smallest Docker image I can get away with, so I have switched from ubuntu as my base to alpine.
For apt, I used to use --no-install-recommends to minimize "dependencies" installed with my desired packages. Is there an equivalent flag I need to pass along with apk or is this the default behavior for this slimmed down OS?
No, it doesn't have the same flag, I think because apk does not install recommended packages in the first place.
However there is another flag --virtual which helps keep your images smaller:
apk add --virtual somename package1 package2
and then
apk del somename
This is useful for packages that are needed only at build time and not for execution later.
Note that you must do the add and the del in one RUN command, otherwise the packages cannot be removed from the previous Docker image layer.
For example, if pything1 needs package1 and package2 to run, but only needs package3 and package4 during its install/build, this would be optimal:
RUN apk add --no-cache package1 package2
RUN apk add --no-cache --virtual builddeps package3 package4 && \
pip install pything1 && \
apk del builddeps
package3 and package4 are not added to the "world" packages, and they are removed before the layer is written.
A related question asks this the other way round: What is .build-deps for apk add --virtual command?
