Docker best practice: use OS or application as base image? - docker

I would like to build a docker image which contains Apache and Python3. What is the suggested base image to use in this case?
There is a offical apache:2.4.43-alpine image I can use as the base image, or I can install apache on top of a alpine base image.
What would be the best approach in this case?
Option1:
FROM apache:2.4.43-alpine
<Install python3>
Option2:
FROM alpine:3.9.6
<Install Apache>
<Install Python3>

Here are my rules.
rule 1: If the images are official images (such as node, python, apache, etc), it is fine to use them as your application's base image directly, more than you build your own.
rule 2:, if the images are built by the owner, such as hashicorp/terraform, hashicorp is the owner of terraform, then it is better to use it, more than build your own.
rule 3: If you want to save time only, choice the most downloaded images with similar applications installed as base image
Make sure you can view its Dockerfile. Otherwise, don't use it at all, whatever how many download counted.
rule 4: never pull images from public registry servers, if your company has security compliance concern, build your own.
Another reason to build your own is, the exist image are not built on the operation system you prefer. Such as some images proved by aws, they are built with amazon linux 2, in most case, I will rebuild with my own.
rule 5: When build your own, never mind from which base image, no need reinvent the wheel and use exis image's Dockerfile from github.com if you can.

Avoid Alpine, it will often make Python library installs muuuuch slower (https://pythonspeed.com/articles/alpine-docker-python/)
In general, Python version is more important than Apache version. Latest Apache from stable Linux distro is fine even if not latest version, but latest Python might be annoyingly too old. Like, when 3.9 comes out, do you want to be on 3.7?
As such, I would recommend python:3.8-slim-buster (or whatever Python version you want), and install Apache with apt-get.

Related

How to compose it?

Target: build opencv docker
Dockerfile creation:
From Ubuntu14.04
or
From Python3.7
Which to choose and why?
I was trying to write dockerfile from scratch without copy paste from others dockerfile.
I would usually pick the highest-level Docker Hub library image that matches what I need. It's also worth searching the https://hub.docker.com/ search box which will often find relevant things, though of rather varied ownership and maintenance levels.
The official Docker Hub images tend to have thought through a lot of issues around persistence and configuration and first-time setup. Compare "I'll just apt-get install mysql-server" with all of the parts that go into the official mysql image; just importing that real-world experience and reusing it can save you some trouble.
I'd consider building my own from an OS base like ubuntu:16.04 if:
There is a requirement that Docker images must be built from some specific distribution base ("my job requires everything to be built off of CentOS so I need a CentOS-based MySQL image")
I need a combination of software versions or patches that the Docker Hub image no longer supports (jruby:9.1.16.0 is no longer being built, so if I need OS updates, I need to build my own base image)
I need an especially exotic set of build options for whatever reason ("I have a C extension that only works if the interpreter is specifically built with UTF-16 Unicode support")
I need or want very detailed control over what version(s) of software are embedded; for example if it's something Java-based where there's a JVM version and a runtime version and an application version that all could matter
In my opinion you should choose From Python3.7.
Since you are writing a dockerfile for opencv which is an open source computer vision and machine learning software library so you may require python also in your container.
Now if you use From Ubuntu14.04 you may need to add python also in the dockerfile whereas with From Python3.7 that will become redundant and will also make the dockerfile a bit shorter.

Which docker base image to use in the Dockerfile?

I'm having a web application, which consists of two projects:
using VueJS as a front-end part;
using ExpressJS as a back-end part;
I now need to docker-size my application using a docker, but I'm not sure about the very first line in my docker files (which is referring to the used environment I guess, source).
What I will need to do now is separate docker images for both projects, but since I'm very new to this, I can't figure out what should be the very first lines for both of the Dockerfiles (in both of the projects).
I was developing the project in Windows 10 OS, where I'm having node version v8.11.1 and expressjs version 4.16.3.
I tried with some of the versions which I found (as node:8.11.1-alpine) but what I got a warning: `
SECURITY WARNING: You are building a Docker image from Windows against
a non-Windows Docker host.
Which made me to think that I should not only care about node versions, instead to care about OS as well. So not sure which base images to use now.
node:8.11.1-alpine is a perfectly correct tag for a Node image. This particular one is based on Alpine Linux - a lightweight Linux distro, which is often used when building Docker images because of it's small footprint.
If you are not sure about which base image you should choose, just read the documentation at DockerHub. It lists all currently supported tags and describes different flavours of the Node image ('Image Variants' section).
Quote:
Image Variants
The node images come in many flavors, each designed for a specific use case.
node:<version>
This is the defacto image. If you are unsure about what your needs are, you probably want to use this one. It is designed to be used both as a throw away container (mount your source code and start the container to start your app), as well as the base to build other images off of. This tag is based off of buildpack-deps. buildpack-deps is designed for the average user of docker who has many images on their system. It, by design, has a large number of extremely common Debian packages. This reduces the number of packages that images that derive from it need to install, thus reducing the overall size of all images on your system.
node:<version>-alpine
This image is based on the popular Alpine Linux project, available in the alpine official image. Alpine Linux is much smaller than most distribution base images (~5MB), and thus leads to much slimmer images in general.
This variant is highly recommended when final image size being as small as possible is desired. The main caveat to note is that it does use musl libc instead of glibc and friends, so certain software might run into issues depending on the depth of their libc requirements. However, most software doesn't have an issue with this, so this variant is usually a very safe choice. See this Hacker News comment thread for more discussion of the issues that might arise and some pro/con comparisons of using Alpine-based images.
To minimize image size, it's uncommon for additional related tools (such as git or bash) to be included in Alpine-based images. Using this image as a base, add the things you need in your own Dockerfile (see the alpine image description for examples of how to install packages if you are unfamiliar).
node:<version>-onbuild
The ONBUILD image variants are deprecated, and their usage is discouraged. For more details, see docker-library/official-images#2076.
While the onbuild variant is really useful for "getting off the ground running" (zero to Dockerized in a short period of time), it's not recommended for long-term usage within a project due to the lack of control over when the ONBUILD triggers fire (see also docker/docker#5714, docker/docker#8240, docker/docker#11917).
Once you've got a handle on how your project functions within Docker, you'll probably want to adjust your Dockerfile to inherit from a non-onbuild variant and copy the commands from the onbuild variant Dockerfile (moving the ONBUILD lines to the end and removing the ONBUILD keywords) into your own file so that you have tighter control over them and more transparency for yourself and others looking at your Dockerfile as to what it does. This also makes it easier to add additional requirements as time goes on (such as installing more packages before performing the previously-ONBUILD steps).
node:<version>-slim
This image does not contain the common packages contained in the default tag and only contains the minimal packages needed to run node. Unless you are working in an environment where only the node image will be deployed and you have space constraints, we highly recommend using the default image of this repository.

What Docker scratch contains by default?

There is an option to use FROM scratch for me it looks like a really attractive way of building my Go containers.
My question is what does it still have natively to run binaries do I need to add anything in order to reliably run Go binaries? Compiled Go binary seems to run it at least on my laptop.
My goal is to keep image size to a minimum both for security and infra management reasons. In an optimal situation, my container would not be able to execute binaries or shell commands outside of build phase.
The scratch image contains nothing. No files. But actually, that can work to your advantage. It turns out, Go binaries built with CGO_ENABLED=0 require absolutely nothing, other than what they use. There are a couple things to keep in mind:
With CGO_ENABLED=0, you can't use any C code. Actually not too hard.
With CGO_ENABLED=0, your app will not use the system DNS resolver. I don't think it does by default anyways because it's blocking and Go's native DNS resolver is non-blocking.
Your app may depend on some things that are not present:
Apps that make HTTPS calls (as in, to other services, i.e. Amazon S3, or the Stripe API) will need ca-certs in order to confirm HTTPS certificate authenticity. This also has to be updated over time. This is not needed for serving HTTPS content.
Apps that need timezone awareness will need the timezone info files.
A nice alternative to FROM scratch is FROM alpine, which will include a base Alpine image - which is very tiny (5 MiB I believe) and includes musl libc, which is compatible with Go and will allow you to link to C libraries as well as compile without setting CGO_ENABLED=0. You can also leverage the fact that alpine is regularly updated, using its tzinfo and ca-certs.
(It's worth noting that the overhead of Docker layers is amortized a bit because of Docker's deduplication, though of course that is negated by how often your base image is updated. Still, it helps sell the idea of using the quite small Alpine image.)
You may not need tzinfo or ca-certs now, but it's better to be safe than sorry; you can accidentally add a dependency without realizing it breaks your build. So I recommend using alpine as your base. alpine:latest should be fine.
Bonus: If you want the advantages of reproducible builds inside Docker, but with small image sizes, you can use the new Docker multi-stage builds available in Docker 17.06+.
It works a bit like this:
FROM golang:alpine
ADD . /go/src/github.com/some/gorepo # may need some go getting if you don't vendor
RUN go build -o /app github.com/some/gorepo
FROM scratch # or alpine
COPY --from=0 /app /app
ENTRYPOINT ["/app"]
(I apologize if I've made any mistakes, I'm typing that from memory.)
Note that when using FROM scratch you must use the exec form of ENTRYPOINT, because the shell form won't work (it depends on the Docker image having /bin/sh, which it won't.) This will work fine in Alpine.

Docker image just contain software without OS

I have a PROD environment running on RHEL 7 server. I want to use docker for deployment. I want to package all the software and apps in a Docker image, without a base OS. Because I don't want to add an additional layer on top of RHEL. Also, I could not find an official base image for RHEL. Is that possible?
I see some old posts mentioned about "FROM scratch" but looks it does not work in the latest version of Docker -- 1.12.5.
If this is impossible, any suggestions for this?
Docker is designed to also abstract the OS dependencies - that is what it has been build for. Beside it also encapsulates the runtime, memory and things, it specifically is used as a extreme-better variant of chroot ( lets say chroot on ultra-steroids ).
It seems like you neither want the runtime seperation nor the OS layer seperation ( dependencies ) - thus docker makes absolutely no sense for you.
Deploying with docker the is not "simple" or simpler as using other tools. You can use capistrano or, probably something like https://www.habitat.sh/ which actually does not require a software to be bundled in docker containers to be "deployable", it also works on barebones and uses its own packaging format. Thus you have a state-of-the-art deployment solution, and with habitat, you can later even upgrade using docker-containers.

Docker Image Requirements for a GitLab-CI Runner

I have GitLab, GitLab-CI and gitlab-ci-multi-runner running on different machines. I've successfully added a runner using docker and the ruby:2.1 image as detailed on https://gitlab.com/gitlab-org/gitlab-ci-multi-runner/blob/master/docs/install/linux-repository.md
What I'd like to do next is have runners for a minimal Ubuntu 12.04, 14.04 configured. Not knowing docker, I thought I'd try to use the ubuntu:14.04 and ubuntu:12.04 images. However when the start up and try to clone my project repo, they complain about git not being found. I then assumed that with these images, git wasn't installed. So my questions are:
What tools need to be available in a docker image to be used "out-of-the-box" by the gitlab-ci-multi-runner
Are there a set of images already available for various OS's with these already included
Should I really be looking to create my own docker images for this purpose?
Popular base images that contain commonly used software for building software are (based on) buildpack-deps https://hub.docker.com/r/library/buildpack-deps/ (e. g. openjdk is based on that)
In your case you could specify FROM buildpack-deps:stretch-scm which is based on Debian Stretch and includes source code management (SCM) tools like git.
I suppose you could create a Gitlab Runner that uses these images as default. But I think you should always specify the needed image in your .gitlab-ci.yml-file.
I would always hesitate in creating my own Docker Images, because there are a lot available. Still many specific ones are maintained (or better not maintained) by one person for a specific use case that often does not fit your application. The best choice would be to pick a base-image and only add minor RUN commands for the customization.

Resources