Docker - only image with operating system?

So far I have only seen images with some operating system as a base layer. Is this necessary? Is it possible to run a container without an operating system?

An operating system consists of the kernel and userland utilities. So, there are no Docker images with "operating systems" as the base layer, but there are a lot of images named after operating system distributions that ship those distributions' userland utilities.
You can create a Docker image from a tarball containing anything you like. But it won't be very useful if it lacks /bin/sh and you want to include a RUN instruction in the Dockerfile.
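For example (a minimal sketch; the directory and image names here are made up), a plain root-filesystem tarball can be turned into an image with docker import:

    # rootfs/ is any directory tree you want to become the image's filesystem
    tar -C rootfs -cf rootfs.tar .
    docker import rootfs.tar my-minimal-image:latest

    # without /bin/sh inside, RUN steps in a Dockerfile based on this image will fail
    docker run --rm my-minimal-image:latest /bin/sh   # errors if the shell is absent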

There is FROM scratch, which you'll find if you follow how the images on Docker Hub have been built (you may have to follow various FROM x lines through different images until you reach scratch). Scratch is empty, completely empty: no libraries, no shells, no files at all. It's designed as a starting point that you can copy a bunch of files into.
You can also use scratch if you have a statically linked binary and don't want anything else inside the container. However, note that debugging the container becomes more difficult since you can't launch a shell inside it, but that also means attackers can't do that either.
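A minimal sketch of that pattern, assuming hello is a statically linked binary you built yourself (for example a Go program compiled with CGO_ENABLED=0):

    FROM scratch
    COPY hello /hello
    ENTRYPOINT ["/hello"]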

Related

Windows Containers Fundamental Question - Base Image Always Required?

I am a complete newbie when it comes to containers.
I am particularly interested in Windows Containers running in Process Isolation (not Hyper-V Isolation).
I have been doing a lot of reading and watching of videos, but there is one fundamental question which has not been explained to me in the reading I have done so far.
Is it mandatory for every Windows container/image to include a base image/layer of either nanoserver or servercore?
What confuses me are comments such as those made at 5m35sec in the following video:
Windows Container 101 Video on Channel9
He makes a statement (and I'm paraphrasing) that "the only thing necessary to build a docker image is a statically linked binary."
That to me implies that if my HOST operating system, which is running the containers, has all the necessary dependencies, then it is possible to virtualise the kernel from the base operating system, negating the requirement for a base operating system image/layer in the docker image.
What am I missing? Why do I need the nanoserver or servercore base image layer?
If my host operating system is v1903 and the docker image requires a kernel of v1903, why can't it virtualise the kernel from the HOST operating system?
Thanks in Advance!
The basic idea of docker is to reuse the kernel of the host system; see this for Windows containers:
Windows Server containers provide application isolation through process and namespace isolation technology, which is why these containers are also referred to as process-isolated containers. A Windows Server container shares a kernel with the container host and all containers running on the host. These process-isolated containers don't provide a hostile security boundary and shouldn't be used to isolate untrusted code. Because of the shared kernel space, these containers require the same kernel version and configuration.
But as you know, a kernel alone is not enough to make an OS run; you also need the file system.
This is where the base image comes in, see this.
A file system is built up from a series of layers, which makes it possible to put some layers in one image and other layers in another image. With a base image, which is nanoserver or servercore here, different apps can reuse the same base image and add just the app binary on top of it.
Just as the next diagram shows: different containers, each with its own binary, can share the base image (ubuntu:15.04 here for example), and each container's own image plus the shared common image makes up a complete file system that lets the container run.
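The sharing is easy to see with two hypothetical Dockerfiles that differ only in the binary they add; the base layers pulled in by FROM are downloaded and stored once (ubuntu:15.04 here to match the diagram, but a nanoserver or servercore tag works the same way for Windows containers):

    # Dockerfile for app1
    FROM ubuntu:15.04
    COPY app1 /usr/local/bin/app1
    CMD ["/usr/local/bin/app1"]

    # Dockerfile for app2 - reuses exactly the same base layers
    FROM ubuntu:15.04
    COPY app2 /usr/local/bin/app2
    CMD ["/usr/local/bin/app2"]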

Purpose of FROM command - Dockerfile

The main purpose of a Docker container is to avoid carrying a guest OS in every container, as shown below.
As mentioned here, "The FROM instruction initializes a new build stage and sets the Base Image for subsequent instructions. As such, a valid Dockerfile must start with a FROM instruction."
My understanding is that FROM <image> allows a container to run on its own OS.
Why must a valid Dockerfile have a FROM instruction?
Containers don't run a full OS, they share the kernel of the host OS (typically, the Linux kernel). That's the "Host Operating System" box in your right image.
They do provide what's called "user space isolation" though - roughly speaking, this means that every container manages its own copy of the part of the OS which runs in user mode - typically, that's a Linux distribution such as Ubuntu. In your right image, that would be contained in the "Bins/Libs" box.
You can use FROM scratch to start from a blank base image, then add all the user mode pieces on top of the shared kernel yourself.
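As noted above, the kernel itself always comes from the host; you can check this directly (the alpine image here is just a convenient small example):

    # prints the host's kernel version, regardless of which base image is used
    docker run --rm alpine uname -r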
Another common use of FROM is to chain builds together to form a multi-stage build of smaller images.
This would be useful for instance to limit redundant rebuilding during failed auto-builds.
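A minimal multi-stage sketch (the Go toolchain image and program are only illustrative); everything needed to compile lives in the first stage, and only the small final stage becomes the resulting image:

    # build stage: needs the full toolchain
    FROM golang:1.21 AS build
    WORKDIR /src
    COPY . .
    RUN CGO_ENABLED=0 go build -o /out/app .

    # final stage: just the statically linked binary
    FROM scratch
    COPY --from=build /out/app /app
    ENTRYPOINT ["/app"]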
The FROM instruction specifies the base image you are going to use to build your image. You have to use some form of base image to get started with building an image. It can be ubuntu, centos or any minimal Linux image like Alpine, which is only about 5 MB! The idea is to install only the packages you need rather than having everything bundled and packaged as a distribution. This makes the size of docker images very small compared to a full-blown OS distribution. I hope this answers your question. Let me know if you have any questions.
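For instance, a hedged sketch of that "small base plus only the packages you need" idea (the Alpine tag and the curl package are just examples):

    FROM alpine:3.19
    # install only what the application actually needs
    RUN apk add --no-cache curl
    CMD ["curl", "--version"]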

Why provide a Linux distro as a Dockerfile base when my host has all the software I need installed?

I want to start writing a Docker image. I have a .net Core 2.0 Web Api service that I have deployed to an Amazon Linux machine. It runs fine, but I would like to automate the build and deployment process a bit.
As far as I am concerned, there is no need for a parent image for the image I need to build. I might grab some files from a location, run some dotnet CLI commands, and run the service using Apache as a reverse proxy. I don't really see the need for a parent image in any of that.
I am asking this question because most of the examples I have seen include a base image. Most of the time it's something very generic, like "From Ubuntu". I have read that most images will include a parent image. According to Docker's documentation:
A parent image is the image that your image is based on. It refers to the contents of the FROM directive in the Dockerfile. Each subsequent declaration in the Dockerfile modifies this parent image. Most Dockerfiles start from a parent image, rather than a base image. However, the terms are sometimes used interchangeably.
What exactly is the point of inheriting from Ubuntu? Even the Docker docs suggest using Debian "since it’s very tightly controlled and kept minimal". Does that just ensure that your Linux machine has an Ubuntu distribution? Does it even matter if I am using Amazon Linux but use the Debian image as my base?
A Docker container runs in a set of filesystem namespaces which are disconnected from the host's except where you've chosen to bind-mount a volume. This means that tools installed on the host are unavailable to the container: just because the host runs Amazon Linux doesn't mean that the userspace commands Amazon Linux provides (and the libraries those commands use to run) are available to the guests.
Without a Linux distro available inside the container, you wouldn't have a package management tool (yum, apt-get, etc) with which to install the tools you need to download a file, run software (that presumably needs to be linked to a libc, a copy of OpenSSL, or other shared components). There are also runtime parts of a working Linux system such as the resolver that are provided in userland by your distro and not shared from the host in a Docker install.
Using a base image ensures that you have tools available inside your container -- and it ensures that that container will work consistently on any Linux system with a compatible kernel and hardware architecture.
It's possible in theory to bind-mount many of the tools from the host (as by exposing all of /usr as a volume), but doing so would defeat many of the advantages Docker offers in portability.
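A quick illustration of that isolation (the image tag and commands are only examples): tools installed on the Amazon Linux host are not visible inside a container, while the distro inside the image is what provides a package manager:

    # fails: the dotnet CLI installed on the host is not part of the image
    docker run --rm debian:stable-slim dotnet --info

    # works: apt-get comes from the Debian userland inside the image
    docker run --rm debian:stable-slim apt-get update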

Docker base images, what do they consist of?

I am trying to wrap my head around the Docker architecture, in particular figuring out what exactly a base image consists of, and in doing so I have been exploring some of the images found on Docker Hub. Specifically, when looking at the following repo, I noticed it references the centos-7.2.1511-docker.tar.xz file.
I've downloaded and examined the contents of the tar and it has your typical Linux filesystem.
As I understand it, this is not a complete Linux OS; it is just a replica of a Linux filesystem with all the non-essentials removed, with all other requirements drawn from the host OS when a container is run(?)
My question essentially boils down to: how would one go about creating that tar file? What exactly do you need? My intention is not to create one, but rather to understand what portion of files/data/dependencies comes from a target OS to create an image and what gets used from the host OS.
A Docker container is a set of processes, running in a sandbox enabled by Linux namespaces, on top of the host kernel.
A Docker image is a set of layers, which are often simply tarballs, of files that are unpacked, and made to look as if they are the root of the filesystem when used to start a container.
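You can see this layer structure for yourself (debian:stable-slim is just an example image): docker save writes an archive whose entries are the individual layer tarballs plus some metadata:

    # list the contents of an image archive; each layer is itself a tarball
    docker save debian:stable-slim | tar -tf - | head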
A Docker image could be just a single statically-linked executable! You can create your own Docker image from scratch by simply creating a tarball of a single executable and giving it to docker import, which will store it in the appropriate internal format and register it as an image.
As you can see then, a Docker image need not be much. It certainly doesn't need a kernel, or any of the components normally used for configuring the system, networking daemons, or even things like cron. Those are all left to the host.
Things that are usually available in an image are a dynamic library runtime, and files like /etc/hosts, /etc/resolv.conf, and other files which are referenced directly by libc. This allows you to add typical dynamically-linked executables which interact with the system as if they're running on a traditional OS.
I have successfully "Dockerized" a legacy CentOS 6-based VM by uninstalling as many packages as possible, then tar-ing up the filesystem (excluding directories like /proc, /sys, /dev, etc.) and bringing this in via docker import. Afterwards, I started a container and (sometimes forcefully) removed additional "system" packages that serve no purpose in a Docker image, like kernel, udev, etc.
This blog post goes into the specifics of docker save/load versus docker export/import:
http://tuhrig.de/difference-between-save-and-export-in-docker/
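A hedged sketch of that VM-to-image workflow (all paths and names are illustrative); note that docker import is the command that accepts a plain filesystem tarball, whereas docker load expects an archive produced by docker save:

    # on the legacy VM: archive the root filesystem; --one-file-system keeps
    # pseudo-filesystems such as /proc, /sys and /dev out of the tarball
    tar --numeric-owner --one-file-system -czpf /tmp/centos6-rootfs.tar.gz -C / .

    # on the Docker host: turn the tarball into an image and try it out
    docker import /tmp/centos6-rootfs.tar.gz legacy/centos6:imported
    docker run --rm -it legacy/centos6:imported /bin/bash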

docker - can I mount the OS from another container?

I have been setting up a runtime with several images. I have been keeping them lean, with one process and a minimal OS, based on debian (because I'm used to that).
However, I wonder why I need all these copies of the OS. Could I build one image with the OS (to separate it from the host OS) and then have other images mount the relevant parts (read-only, or copied where necessary - /etc/ ...)?
I tried googling for this pattern but didn't find it. Are there any pitfalls? Does docker need "something" present to be able to boot an image even before mounting?
As long as you're using FROM debian as the base of each of your images, you only have one copy of debian. That's the beauty of using a copy-on-write filesystem like AUFS or btrfs.
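One way to see this sharing, sketched with hypothetical image names: the layer digests that two debian-based images have in common are stored only once by the copy-on-write storage driver:

    # compare the layer digests of two images built FROM debian
    docker image inspect --format '{{json .RootFS.Layers}}' service-a:latest
    docker image inspect --format '{{json .RootFS.Layers}}' service-b:latest
    # matching digests at the start of both lists are the shared debian base layers,
    # kept on disk only once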
