Does alpine `apk` have an ubuntu `apt` `--no-install-recommends` equivalent - docker

I'm trying to make the absolute smallest Docker image I can get away with, so I have switched from ubuntu as my base to alpine.
For apt, I used to use --no-install-recommends to minimize "dependencies" installed with my desired packages. Is there an equivalent flag I need to pass along with apk or is this the default behavior for this slimmed down OS?

No, I don't think it has an equivalent flag, because apk does not install recommended packages in the first place.
However, there is another flag, --virtual, which helps keep your images smaller:
apk add --virtual somename package1 package2
and then
apk del somename
This is useful for packages that are needed only at build time and not at runtime.
Note that you must run both commands in a single RUN instruction; otherwise the packages cannot be removed from the earlier image layer.
For example, if pything1 needs package1 and package2 to run, but only needs package3 and package4 during installation, this would be optimal:
RUN apk add --no-cache package1 package2
RUN apk add --no-cache --virtual builddeps package3 package4 && \
pip install pything1 && \
apk del builddeps
package3 and package4 are not added to the "world" packages, and they are removed before the layer is written.
This question asks the same thing the other way round: What is .build-deps for apk add --virtual command?

Related

Cannot explain why Alpine apk upgrade command does not update ncurses package although a newer version exists

I have a Dockerfile as:
FROM nginx:1.21.3-alpine
RUN apk update && apk add bash \
&& apk upgrade
I can see that package ncurses is installed and the version is 6.2_p20210612-r0.
Now, there is a newer package available in the edge branch of the main repository, with version 6.2_p20211002-r0.
As far as I understand, after building an image from the above Dockerfile, ncurses should be updated to 6.2_p20211002-r0, but instead it stays at 6.2_p20210612-r0. I cannot understand why.
I confirmed this by running a container after build and running:
apk info -a ncurses
The output was:
ncurses-6.2_p20210612-r0 installed size:
284 KiB
The nginx:1.21.3-alpine image is based on Alpine 3.14 (see cat /etc/os-release), so apk upgrade only brings ncurses up to the version in the Alpine 3.14 repository, which is currently 6.2_p20210612-r0.
For installing ncurses from edge (currently version 6.2_p20211002-r0), you could specify the edge repository explicitly in the apk command:
apk add ncurses --repository=http://dl-cdn.alpinelinux.org/alpine/edge/main
Mixing and matching packages from different repositories this way might be OK in some cases, but it has to be tested carefully. For ncurses, some functionality might break unless the matching ncurses-libs package is installed as well; however, other packages in the image depend on ncurses-libs, so re-installing it triggers an update of those packages. Moreover, the dependent nginx-module-njs package must be removed. If this is acceptable, you could modify the Dockerfile as follows:
FROM nginx:1.21.3-alpine
RUN apk update && \
apk del ncurses ncurses-libs nginx-module-njs && \
apk add ncurses ncurses-libs --repository=http://dl-cdn.alpinelinux.org/alpine/edge/main && \
apk add bash && \
apk upgrade
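To check which release branch an image tracks before reaching for edge, something like the following sketch may help. The helper names alpine_branch and alpine_repo_url are made up for illustration, and the parsing assumes a stable release whose VERSION_ID looks like 3.14.2 (edge images may not map cleanly).

```shell
#!/bin/sh
# Sketch: work out which Alpine release branch an image tracks.
# alpine_branch and alpine_repo_url are hypothetical helper names.

# Print the branch (for example "v3.14") for an os-release file.
# $1: path to the file, defaults to /etc/os-release
alpine_branch() {
    # VERSION_ID looks like 3.14.2; keep only MAJOR.MINOR
    version_id=$(sed -n 's/^VERSION_ID=//p' "${1:-/etc/os-release}" | tr -d '"')
    printf 'v%s\n' "$(printf '%s\n' "$version_id" | cut -d. -f1,2)"
}

# Print the repository URL for a branch and a repository name.
# $1: branch such as v3.14 or edge, $2: main or community
alpine_repo_url() {
    printf 'http://dl-cdn.alpinelinux.org/alpine/%s/%s\n' "$1" "$2"
}
```

Inside the container, alpine_repo_url "$(alpine_branch)" main gives the repository that apk upgrade draws from by default, and apk policy ncurses should list which repositories offer which version of the package.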

How to correctly install programs in docker?

I know that every RUN ... line adds a layer to the Docker image and that it is recommended to chain RUN commands with &&. But my question is:
Which is better, this:
RUN apk update && apk upgrade \
&& apk add openssh \
&& apk add --update nodejs nodejs-npm \
&& npm install -g @angular/cli \
&& apk add openjdk11 \
&& apk add maven \
&& apk add git
Or this:
RUN apk update && apk upgrade
RUN apk add openssh
RUN apk add --update nodejs nodejs-npm
RUN npm install -g @angular/cli
RUN apk add openjdk11
RUN apk add maven
RUN apk add git
The first one creates just one layer, but when the version of anything changes the image has to be built from the beginning, not from cache. The second approach creates more layers, but when only the version of git changes, only the git layer needs to be built again and all previous layers can be reused from cache.
I'd recommend:
Install all the OS packages in a single apk invocation. There is some overhead in starting the package manager (more noticeable with dpkg/apt), and it is faster to start it once and install several packages.
If you need to run an update command, always run it in the same RUN command as your other package-manager steps. This avoids trouble with Docker layer caching (again, very noticeable with apt): docker build reuses the cached update layer but re-runs a changed install step, which then tries to install a package using yesterday's package index. If a newer upload of that package has replaced yesterday's file, the download fails.
Don't npm install single packages. That means your package.json file is incomplete. Add it there.
I've seen recommendations both ways as to whether or not to run a full upgrade. Keeping up-to-date on security fixes is important; the underlying base images on Docker Hub also update pretty regularly. So if your image is FROM alpine:latest, doing a docker build --pull will get you much of the effect of an explicit apk upgrade.
Stylistically, if I need any substantial number of packages, I find the list a little more maintainable if I sort it alphabetically and put one package on a line, but this is purely personal preference.
Putting this all together would transform your example into:
RUN apk update \
&& apk upgrade \
&& apk add \
git \
maven \
nodejs \
nodejs-npm \
openjdk11 \
openssh
# package.json includes @angular/cli
COPY package.json package-lock.json ./
RUN npm ci
Don't be afraid to use multiple containers, if that makes sense. (What's your application that uses both Java and Node together; can it be split into two single-language parts?) Don't install unnecessary developer-oriented tools in your image. (Does your application invoke git while it's running; do you install a dependency directly from GitHub; or can you remove git?) Don't try to run an ssh daemon in your container. (It breaks the "one process per container" rule which instantly makes things harder to manage; 90% of the SO examples have a hard-coded user password plus sudo rights, which is not really a security best practice; managing the credentials is essentially impossible.)
Both approaches are extremes; you need to balance layer reusability against keeping the number of layers low.
Based on your example, the build can be organized as follows:
RUN apk update && apk upgrade \
&& apk add openssh \
&& apk add --update nodejs nodejs-npm \
&& apk add openjdk11 \
&& apk add maven \
&& apk add git
RUN npm install -g @angular/cli
Now I have only 2 layers: the first one brings in the OS packages and the second one deals with the Node.js packages. This can now be reused better in other builds.
Once you have made this change, you can move to a multistage build, where you will be able to better control and reuse the intermediate stages.

Alpine package py3-scipy is missing

I'm in an alpine linux docker container, and I'm trying to install py3-scipy. Here is info on that package: https://pkgs.alpinelinux.org/package/edge/community/x86/py3-scipy. I want to do this because pip install scipy takes way too long.
Here is what I get:
/ # apk add py3-scipy
ERROR: unsatisfiable constraints:
py3-scipy (missing):
required by: world[py3-scipy]
My Dockerfile:
FROM alpine:3.9
RUN apk add --update python3-dev g++ gcc libxslt-dev cython lapack-dev gfortran build-base py3-scipy
What is causing this error?
I think this issue sometimes happens when Alpine moves a package from edge/testing to edge/community, so older Alpine versions keep referring to the old URL. Try using the latest Alpine version (alpine:latest) instead of a specific version.
I was missing the echo step; with that I was able to install it:
echo "@testing http://dl-cdn.alpinelinux.org/alpine/edge/testing" >> /etc/apk/repositories
apk add --update --no-cache py3-scipy
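A tagged repository line has the form @tag URL. As a rough sketch of adding one without duplicating it on repeated runs, assuming a helper called add_tagged_repo (a made-up name, not part of apk):

```shell
#!/bin/sh
# Sketch: append a tagged repository to an apk repositories file once.
# add_tagged_repo is a hypothetical helper name.
# $1: tag without the leading @, $2: repository URL
# $3: repositories file, defaults to /etc/apk/repositories
add_tagged_repo() {
    line="@$1 $2"
    file=${3:-/etc/apk/repositories}
    # -x matches the whole line, -F treats the pattern as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example use inside the container (not run here):
#   add_tagged_repo testing http://dl-cdn.alpinelinux.org/alpine/edge/testing
#   apk add --no-cache py3-scipy@testing
```

As far as I know, apk only considers packages from a tagged repository when they are requested with the tag, hence the py3-scipy@testing spelling in the example comment.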

What is .build-deps for apk add --virtual command?

What is .build-deps in the following command? I can't find an explanation in the Alpine docs. Is this a file that is predefined? I see this referenced in many Dockerfiles.
RUN apk add --no-cache --virtual .build-deps \
gcc \
freetype-dev \
musl-dev
RUN pip install --no-cache-dir <packages_that_require_gcc...>
RUN apk del .build-deps
If you look at the documentation:
-t, --virtual NAME Instead of adding all the packages to 'world', create a new
virtual package with the listed dependencies and add that
to 'world'; the actions of the command are easily reverted
by deleting the virtual package
What that means is that when you install packages this way, they are not added to the global package list, and the change can easily be reverted. Say I need gcc to compile a program, but once the program is compiled I no longer need gcc.
I can install gcc and the other required packages under a virtual package, and then all of them, along with their dependencies, can be removed by deleting that virtual package name. Below is an example usage:
RUN apk add --virtual mypacks gcc vim \
&& apk del mypacks
The second command (apk del mypacks) deletes all 18 packages installed by the first.
In Docker, both must be executed in a single RUN command (as shown above); otherwise the image size will not be reduced.
.build-deps is an arbitrary name for a "virtual package" in Alpine, to which you add packages.
It creates an extra 'world' of packages that you need only for a limited period of time (e.g. compilers for building other things).
Its main purpose is to keep your image as lean and light as possible, because you can easily get rid of those packages once they have been used.
Please remember that the install and the removal must be in the same RUN if you want to keep the image lightweight.
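The install, build, and remove steps described above could be wrapped roughly as follows. This is only a sketch: with_build_deps is a made-up helper name, .build-deps is the arbitrary virtual package name, and somepackage in the example comment is a placeholder.

```shell
#!/bin/sh
# Sketch: install build-only packages, run one build command, remove them.
# with_build_deps is a hypothetical helper; .build-deps is an arbitrary name.
# $1: space-separated list of build-only packages (quoted as one argument)
# remaining arguments: the build command to run while they are installed
with_build_deps() {
    build_pkgs=$1
    shift
    # $build_pkgs is left unquoted on purpose so it splits into package names
    apk add --no-cache --virtual .build-deps $build_pkgs &&
        "$@" &&
        apk del .build-deps
}

# Example use from a single RUN instruction (not run here):
#   with_build_deps "gcc freetype-dev musl-dev" pip install --no-cache-dir somepackage
```

To use it in a Dockerfile, the function would have to live in a script copied into the image; otherwise the same three chained commands can be written directly in one RUN.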

Code changes breaks apk add cache

Is it possible to split the apk add and go build commands so that a code change doesn't re-install the apk dependencies?
FROM golang:1.8-alpine AS go-build-env
RUN apk update && apk upgrade && apk add --no-cache bash git
RUN go build -o /bin/webui main.go
EDIT: updated
FROM golang:1.8-alpine AS go-build-env
RUN apk update && apk upgrade && apk add --no-cache bash git openssh curl g++ \
make perl; go-wrapper download
RUN mkdir -p /go/src/github.com/markwallsgrove/saml_federation_proxy \
/go/src/github.com/markwallsgrove/saml_federation_proxy/models \
/go/src/github.com/markwallsgrove/saml_federation_proxy/webui
COPY webui/main.go /go/src/github.com/markwallsgrove/saml_federation_proxy/webui
COPY models /go/src/github.com/markwallsgrove/saml_federation_proxy/models
WORKDIR /go/src/github.com/markwallsgrove/saml_federation_proxy/webui
The Dockerfile as written does not contain any ADD instructions, so main.go isn't present.
You're also not dealing with an "apt-get" cache, as you're using Alpine and apk, but looking beyond those errors...
To keep Docker layers cached regardless of code changes, place them above any ADD / COPY instructions; when the copied files change, every layer below those instructions is invalidated.
In your example dockerfile it would look something like this:
FROM golang:1.8-alpine AS go-build-env
RUN apk update && apk upgrade && apk add --no-cache bash git
ADD main.go .
RUN go build -o /bin/webui main.go
