Minimize size of docker image - docker

I am building a docker image for running yarn jobs.
In order to install yarn, I need curl to fetch the package repository. After installing yarn, I am not really interested in curl anymore so I purge it again.
But this has no effect on the resulting docker image size, since the layer with curl installed is still an underlying image layer (as far as I understand docker images).
I am less interested in this specific case (curl and yarn) than in how to minimize a docker image in such a scenario in general. How can I "purge" an underlying layer that is no longer needed?
Example Dockerfile for reference:
FROM ubuntu:focal
# Updating and installing curl (not required in final image)
RUN apt update && apt install -y curl
# Using curl to install yarn
RUN curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | apt-key add - &&\
echo "deb https://dl.yarnpkg.com/debian/ stable main" | tee /etc/apt/sources.list.d/yarn.list && \
apt update && apt install -y yarn
# Doing cleanup (no positive effect on image size)
RUN apt purge -y curl && rm -rf /var/lib/apt/lists/* && apt autoremove -y && apt clean -y
EDIT:
Just for clarification:
ubuntu:focal on its own is just 74 MB in image size.
After running apt update it's at 95 MB
After apt installing curl wget git it's at 198 MB
Even purging all these installations doesn't bring me back to the 74 MB
Multi-stage builds are a nice concept which I will look into.
This question, though, is about whether or not it is possible to reduce the size of a single image again.

You can build your image using a multi stage Dockerfile.
For example:
FROM ubuntu:focal AS building_stage
RUN apt update && apt install -y curl
RUN curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | apt-key add - &&\
echo "deb https://dl.yarnpkg.com/debian/ stable main" | tee /etc/apt/sources.list.d/yarn.list && \
apt update && apt install -y yarn
RUN yarn install # or whatever you want to do with yarn
FROM ubuntu:focal AS running_stage
COPY --from=building_stage /root/node_modules .
After building this Dockerfile, the final image contains neither yarn nor curl, but it does have the files your final image needs to run. I didn't know what you wanted to do with yarn, so I couldn't give an exact example based on your sample, but multi-stage builds are probably what you want to use.
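On the narrower question of shrinking a single image: a layer cannot be removed after the fact, but a package that is installed, used, and purged within one RUN instruction is never committed to any layer. A minimal sketch, assuming that gnupg and ca-certificates are needed alongside curl for apt-key and the HTTPS repository to work on ubuntu:focal, and that nothing at runtime depends on curl or gnupg:

```dockerfile
FROM ubuntu:focal
# Install curl, use it, and purge it inside a single RUN so that no
# committed layer ever contains it.
# Assumption: gnupg and ca-certificates are added because apt-key and the
# HTTPS repository need them; adjust to what your base image provides.
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl gnupg \
 && curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | apt-key add - \
 && echo "deb https://dl.yarnpkg.com/debian/ stable main" > /etc/apt/sources.list.d/yarn.list \
 && apt-get update \
 && apt-get install -y yarn \
 && apt-get purge -y --auto-remove curl gnupg \
 && rm -rf /var/lib/apt/lists/*
```

The result will still be larger than the bare base image, since yarn and its dependencies are now part of it, but the size of curl and the apt index files should no longer be counted.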

Related

Create a complete Application Server Container using Multi Stage Build Docker

Problem: I want to combine multiple images into one application server, using a multi-stage build to keep the size small and end up with only one final image.
Question 1: Is that possible?
Question 2: Is a container always expected to have a base OS, only one application dependency, and then compiled code running on top of that one dependency?
I have not found example online where we can create a complete application server (Web Server+Database Server+Application Server) using Docker multistage build.
I would like to have the following in an application server:
Install Alpine
Install nodejs
Install nginx
Install jenkins
Install mongodb
Install Python
This will be the base Image that can be replicated.
I don't want to use RUN apk or RUN apt-get to install the applications, as the image size grows too big.
Instead, I want to use a multi-stage build to end up with one small final image.
FROM alpine:3.17.2 as base1
FROM python:alpine3.17 as base2
FROM nodejs:lts-alpine3.17 as base3
FROM nginx:stable-alpine as base4
FROM jenkins:2.375.3-lts-alpine as base5
FROM mongo:jammy as base6
COPY --from=base1 / /
COPY --from=base2 / /
COPY --from=base3 / /
COPY --from=base4 / /
COPY --from=base5 / /
COPY --from=base6 / /
[this will overwrite some directories]
Expectation
When I run the final image "base7", I can run any nodejs, mongo, or python command: ingest a data file into mongodb, analyse it with python, and display it using nodejs.
Previously working without multi-stage build (Issue is size is big and many image layers created)
FROM ubuntu
RUN apt update
RUN apt upgrade
RUN apt-get -y install git
RUN apt-get -y install nodejs
RUN apt-get -y install wget
RUN apt-get -y install gnupg
RUN wget -q -O - https://pkg.jenkins.io/debian-stable/jenkins.io.key | apt-key add -
RUN sh -c 'echo deb http://pkg.jenkins.io/debian-stable binary/ > /etc/apt/sources.list.d/jenkins.list'
RUN apt update
RUN apt-get -y install jenkins
RUN apt-get -y install gnupg
RUN wget -qO - https://www.mongodb.org/static/pgp/server-6.0.asc | apt-key add -
RUN echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/6.0 multiverse" | tee /etc/apt/sources.list.d/mongodb-org-6.0.list
RUN wget http://archive.ubuntu.com/ubuntu/pool/main/o/openssl/libssl1.1_1.1.1f-1ubuntu2.16_amd64.deb
RUN apt update
RUN dpkg -i libssl1.1_1.1.1f-1ubuntu2.16_amd64.deb
RUN apt-get -y install libssl-dev
RUN apt-get -y install libssl1.1
RUN apt-get install -y mongodb-org
COPY . /learn
WORKDIR /learn

How to add/install cypress in dockerimage

How can I add or install cypress in my docker base image? This is my base image Dockerfile, where I install common dependencies.
I don't want to install cypress via package.json; I want it to be pre-installed.
FROM node:lts-stretch-slim
RUN apt-get update && apt-get install -y curl wget gnupg
RUN apt-get install python3-dev -y
RUN curl -O https://bootstrap.pypa.io/get-pip.py
RUN python3 get-pip.py
RUN pip3 install awscli --upgrade
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add -
RUN sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list'
RUN apt-key update && apt-get update && apt-get install -y google-chrome-stable
There are docker images available with cypress already in them.
CircleCI have one for their CI testing.
For convenience, CircleCI maintains several Docker images. These
images are typically extensions of official Docker images and include
tools especially useful for CI/CD. All of these pre-built images are
available in the CircleCI org on Docker Hub. Visit the circleci-images
GitHub repo for the source code for the CircleCI Docker images. Visit
the circleci-dockerfiles GitHub repo for the Dockerfiles for the
CircleCI Docker images
https://circleci.com/docs/2.0/circleci-images/

Debian with nginx docker image won't update

I'm building nginx in a Debian-based docker image. Every time I run it, it shows me the current nginx version as nginx/1.10.3. I need it to download the latest stable nginx.
This is my Dockerfile:
FROM debian:latest
RUN apt-get -y update
RUN apt-get install -yq gnupg2
RUN apt-get install -yq software-properties-common
RUN apt-get install -yq lsb-release
RUN apt-get install -yq curl
RUN add-apt-repository "deb http://archive.canonical.com/ $(lsb_release -sc) partner"
RUN add-apt-repository "deb http://nginx.org/packages/debian `lsb_release -cs` nginx"
RUN apt-get install -y nginx
RUN rm -rf /var/lib/apt/lists/
RUN echo "\ndaemon off;" >> /etc/nginx/nginx.conf
EXPOSE 80
CMD ["/usr/sbin/nginx"]
Docker image layers serve as a cache for subsequent builds. Without some sort of change in the Dockerfile, you're likely getting nginx 1.10.3 because it was cached from a previous build.
Instead of building your own nginx image, you should use the official nginx image, and choose the tag (e.g., 1.15.9) for the version you want.
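A minimal sketch of that approach, assuming a site configuration file named default.conf sits next to the Dockerfile (the file name is hypothetical). The official image already starts nginx in the foreground, so the "daemon off;" edit and the CMD line from the question are not needed:

```dockerfile
# Pin the nginx version through the image tag instead of installing via apt.
FROM nginx:1.15.9
# Hypothetical config file; drop this line to keep the image's default site.
COPY default.conf /etc/nginx/conf.d/default.conf
EXPOSE 80
```

Changing the tag in the FROM line is then enough to move to a different nginx release.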
First off, trivially, you need to apt-get update to fetch the index files from the repos you added before apt will find any packages there.
RUN add-apt-repository blah blah
RUN apt-get update -y # Add this
RUN apt-get install -y whatever
But also, you have invalid repos in the add-apt-repository section. The output of lsb_release -sc is a Debian code name like stretch, which of course the Canonical partner repo doesn't have a section for; and the nginx repo only supports Debian squeeze (though I would expect the packages to also work on newer versions of Debian).
Finally, you need to manage the keys of these repos, or otherwise mark them as safe. As a small bonus, I tried to condense your apt-get downloads slightly. Try this Dockerfile:
FROM debian:latest
RUN apt-get -y update
RUN apt-get install -yq gnupg2 \
software-properties-common curl # lsb-release
# XXX FIXME: the use of [trusted=yes] is really quick and dirty
RUN add-apt-repository "deb [trusted=yes] http://archive.canonical.com/ bionic partner"
RUN add-apt-repository "deb [trusted=yes] http://nginx.org/packages/debian squeeze nginx"
RUN apt-get update -y
RUN apt-get install -y nginx
RUN rm -rf /var/lib/apt/lists/
RUN echo "\ndaemon off;" >> /etc/nginx/nginx.conf
EXPOSE 80
CMD ["/usr/sbin/nginx"]

How to create the smallest possible Docker image after installing apt dependencies

I've created a Docker image using debian as the parent image. In my Dockerfile I've installed some dependencies using apt and pip.
Now, I want to get rid of everything that is not strictly necessary to run my app, which, of course, needs the dependencies installed.
For now I have the following lines in my Dockerfile after installing the dependencies.
RUN rm -rf /var/lib/apt/lists/* \
&& rm -Rf /usr/share/doc && rm -Rf /usr/share/man \
&& apt-get clean
I've also installed the dependencies using the --no-install-recommends option.
Anything else I can do to reduce the footprint of my Docker image?
PS: just in case, this is how I installed the dependencies:
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
sudo systemd \
build-essential libffi-dev libssl-dev \
python-pip python-dev python-setuptools python-wheel
To reduce the size of the image, you need to combine your RUN commands into one. When you create files in one layer and delete them in another, the files still exist on the drive and are shipped over the network. Their existence is just hidden when the layers of the filesystem are assembled for your container.
The Dockerfile best practices explain this in more detail: https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#run
I'd also recommend building with docker build --rm=false --no-cache . (temporarily) and then reviewing the output of docker diff on each of the created images to see what files are created in each layer.
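Applied to the snippets in the question, that might look like the sketch below. It assumes that the compiler toolchain is only needed while pip builds wheels and that the app's Python dependencies are listed in a requirements.txt file (a hypothetical name); both assumptions should be checked against the real app.

```dockerfile
# The python-* package names come from the question and only exist on
# older, Python 2 era Debian releases; pin a tag that still ships them.
FROM debian
# Hypothetical file listing the app's Python dependencies.
COPY requirements.txt /tmp/requirements.txt
# Install, build, and clean up in ONE layer so that deleted files are
# never committed to the image.
RUN apt-get update \
 && apt-get install -y --no-install-recommends \
      sudo systemd \
      build-essential libffi-dev libssl-dev \
      python-pip python-dev python-setuptools python-wheel \
 && pip install --no-cache-dir -r /tmp/requirements.txt \
 # Assumption: these build-time packages are not needed at runtime.
 && apt-get purge -y --auto-remove build-essential libffi-dev libssl-dev python-dev \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/* /usr/share/doc /usr/share/man
```

If the app turns out to need one of the purged packages at runtime, remove it from the purge line rather than reinstalling it in a later layer.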

Docker commands require keyboard interaction

I'm trying to create a Docker image for ripping CDs (using abcde).
Here's the relevant portion of the Dockerfile:
FROM ubuntu:17.10
MAINTAINER Graham Nicholls <graham#rockcons.co.uk>
RUN apt update && apt -y install eject vim ruby abcde
...
Unfortunately, the package "abcde" pulls in a mail client (not sure which), and apt tries to configure that by asking what type of mail connection to configure (smarthost/relay etc).
When docker runs, it does not appear to read from stdin, so I can't redirect input into the docker process.
I've tried using --nodeps with apt (and replacing apt with apt-get); unfortunately --nodeps no longer seems to be a supported option and returns:
E: Command line option --nodeps is not understood in combination with the other options
Someone has suggested using expect in response to a similar question, which I'd rather avoid. This seems to be a "difficult to google" problem - I can't find anything.
So, is there a way of passing in the answer to the config prompt in apt, or, better, of preventing apt from pulling in a mail client? I'm not planning on sending updates to cddb.
The typical template to install apt packages in a docker container looks like:
RUN apt-get update \
&& DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
eject \
vim \
ruby \
abcde \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
Running it with the "noninteractive" value removes any prompts. You don't want to set that as an ENV since that would also impact any interactive commands you run inside the container.
You also want to clean up the package database when finished to reduce the layer size and avoid reusing a stale cached package database in a later step.
The no-install-recommends option will reduce the number of packages installed by only installing the required dependencies, not the additional recommended packages. This cuts the size of the root filesystem down by half for me.
If you need to pass a non-default configuration to a package, then use debconf. First, run your install somewhere interactively and enter the options you want to save. Install debconf-utils. Then run:
debconf-get-selections | grep "${package_name}"
to view all the options you configured for that package. You can then pipe these options to debconf-set-selections in your container before running your install, e.g.:
RUN echo "postfix postfix/main_mailer_type select No configuration" \
| debconf-set-selections \
&& apt-get update \
&& DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
....
or save your selections to a file that you copy in:
COPY debconf-selections /
RUN debconf-set-selections </debconf-selections \
&& apt-get update \
&& DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
....
