Docker: base image

I am trying to understand Docker concepts, but there is one thing I cannot grasp:
As I understand it, an image (and consequently a container) can be based on different Linux distributions, such as Ubuntu, CentOS and others.
Let's say on the host machine I run standard Ubuntu 14.04.
What happens if I use a container that is not based on the same distribution?
Not 14.04?
Not Ubuntu (or any other Debian-based distribution)?
What are the disadvantages when the images you use have different base images? (Let's say I use image A that uses Ubuntu as a base image, image B that uses Debian as a base image, and image C that uses CentOS as a base image.)
Bonus question: how can I tell which base image was used for an image if the developer didn't specify it in the Docker Hub description?
Thank you in advance!

Docker does not use LXC (not since Docker 0.9) but libcontainer (now runc), a built-in execution driver which manipulates namespaces, control groups, capabilities, apparmor profiles, network interfaces and firewalling rules – all in a consistent and predictable way, and without depending on LXC or any other userland package.
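As a rough illustration of that (the alpine image and the container name ns-demo are just examples here, and reading another process's /proc entries needs root), you can see the namespaces and control group a containerized process is placed in:
docker run -d --name ns-demo alpine sleep 300
pid=$(docker inspect --format '{{.State.Pid}}' ns-demo)
sudo ls -l /proc/$pid/ns        # the kernel namespaces this process lives in
cat /proc/$pid/cgroup           # its control groups in the host's cgroup hierarchy
docker rm -f ns-demo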
A Docker image represents a set of files which will run as a container in their own memory, disk and user space, while accessing the host kernel.
This differs from a VM, which does not access the host kernel but includes its own hardware/software stack through its hypervisor.
A container just has to set limits (disk, memory, CPU) on the host. An actual VM has to build an entire new host.
That Docker image (group of files) can be anything, as long as:
it does not depend on host libraries (since it is isolated in its own disk space, it does not have access to the host's files, unless volumes are mounted)
it only makes system calls: see "What is meant by shared kernel in Docker?"
That means an image can be anything: another Linux distro, or even a single executable file. Any executable compiled in Go (https://golang.org/), for instance, could be packaged in its own Docker image without any Linux distro:
# scratch is the empty base image
FROM scratch
# copy the statically linked Go binary into the image
COPY my_go_exe /
# exec form: run the binary directly (scratch has no shell for the shell form)
ENTRYPOINT ["/my_go_exe"]
scratch is the "empty" image, and a Go executable is statically linked, so it is self-contained and only depends on system calls to the kernel.
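As a rough sketch of how such an image might be produced (the binary name my_go_exe and the tag my-go-image are just example names, the Dockerfile above is assumed to sit in the current directory, and a Go toolchain must be installed):
CGO_ENABLED=0 go build -o my_go_exe .   # disabling CGO keeps the binary statically linked
docker build -t my-go-image .
docker run --rm my-go-image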

The main thing shared between the host OS and docker container is the kernel. The main risk of running docker containers from different distributions/versions is that they may depend on kernel functionality not present on the host system, for example if the container expects a newer kernel than the host has installed.
In theory, the Linux kernel is backwards compatible. As long as the host kernel is newer than the kernel the container's software was built for, it should work.
From an operational perspective, each time you start depending on a different base image that is another dependency that you need to monitor for updates and security issues. Standardizing on one distribution reduces the workload for your ops team when the next big vulnerability is discovered.

Docker uses operating-system-level virtualization (originally via LXC, now via its own libcontainer/runc), a method for running multiple isolated Linux systems (containers) on a control host using a single Linux kernel.
You can compare this to a VM on your machine, where you start another Linux distro that does not have to be the same as your host OS. So it does not matter whether your host OS is the same as the base image of your container.
In Docker, the container is built from layers. Each step (command) in your Dockerfile represents one layer, and they are applied one after the other. The first step is to apply the base OS layer, which is indicated by FROM.
So to answer your bonus question: you can have a look inside the Dockerfile of the container you're using (it's the third tab on Docker Hub) and see from the first FROM statement which base image (OS) is used.
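If the Dockerfile isn't published, you can still get hints from the image itself (my-image is a placeholder name, and the /etc/os-release check only works if the image actually ships a distro userland):
docker history my-image                         # layer history; the bottom layers come from the base image
docker run --rm my-image cat /etc/os-release    # most distro images identify themselves here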

Related

What are the problems with running Docker containers where the Linux kernel version of the base image differs from the OS kernel version?

How can Docker on a Debian host run, say, an openSUSE container? It would use a different kernel, with separate modules. Also, older Debian versions used older kernels, so how can they run on a kernel version 3.10+? Older kernels have only older built-in functions, so how can an old distro use newer features?
What is "the trick" in it?
Docker never uses a different kernel: the kernel is always your host kernel.
If your host kernel is "compatible enough" with the software in the container you want to run it will work; otherwise, it won't.
"Containers" Are Just Process Configuration
The key thing to understand is that a Docker container is not a virtual machine: it doesn't create a new virtual computer on which to run the software. Instead, Docker starts processes in your existing OS, just like you start new processes from the command line.
The difference between a "containerized" process and an ordinary process is the restrictions put on the containerized process and the changes to how it sees the environment around it. (These are passed on to any child processes started by the containerized process.) Typical restrictions and changes include:
Instead of using the host's root filesystem, mount a different filesystem on / (usually one supplied with the container's image). Parts of the host filesystem may be mounted underneath the new process' root filesystem, e.g. by using docker run -v /u/myprogram-data:/var/data/myprogram so that when the containerized process reads or writes /var/data/myprogram/file this reads/writes /u/myprogram-data/file in the host filesystem.
Create a separate process space for the containerized process so that it can see only itself and its children (with ps or similar commands), but cannot see other processes running on the host.
Create a separate user namespace so that the users in the container are different from those in the host: e.g., UID 1234 in the containerized process will not be the same as UID 1234 for a non-containerized process.
Create a separate set of network interfaces with their own IP addresses, often using a "virtual router" and address translation between those and the host network interfaces. (E.g., the host, when it receives a packet on port 8080, forwards it to port 80 on the container processes' virtual network interface.)
All of this is done by facilities built into the kernel; you can do any of it yourself without Docker if you write a program to do the appropriate setup and set the appropriate parameters when it starts a new process.
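For example, here is a minimal sketch without Docker at all (it assumes util-linux's unshare is available and you have root):
sudo unshare --fork --pid --mount-proc bash
# inside this shell, ps shows only this bash and its children, much like inside a container
ps aux
exit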
Compatibility
So what does "compatible enough" mean? It depends on what requests the program makes of the kernel (system calls) and what features it expects the kernel to support. Some programs make requests that will break things; others don't. For example, on an Ubuntu 18.04 (kernel 4.19) or similar host:
docker run centos:7 bash works fine.
docker run centos:6 bash fails with exit code 139, meaning it terminated with a segmentation violation signal; this is because the 4.19 kernel doesn't support something that that build of bash tried to do.
docker run centos:6 ls works fine because it's not making a request the kernel can't handle, as bash was.
If you try docker run centos:6 bash on an older kernel, say 4.9 or earlier, you'll find it will work fine. (At least as far as I tested it.)
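If you want to reproduce those observations yourself, a rough sketch (the results depend entirely on your host kernel version):
docker run --rm centos:7 bash -c 'echo ok'                     # expected to print ok
docker run --rm centos:6 bash -c 'echo ok'; echo "exit: $?"    # 139 on newer kernels, ok on older ones
docker run --rm centos:6 ls /                                  # expected to work either way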
How can Docker on a Debian host run, say, an openSUSE container?
Because the kernel is the same and can run all those container images: the host kernel should be 3.10 or newer, and its list of system calls is fairly stable.
See "Architecting Containers: Why Understanding User Space vs. Kernel Space Matters":
Applications contain business logic, but rely on system calls.
Once an application is compiled, the set of system calls that an application uses (i.e. relies upon) is embedded in the binary (in higher level languages, this is the interpreter or JVM).
Containers don’t abstract the need for the user space and kernel space to share a common set of system calls.
In a containerized world, this user space is bundled up and shipped around to different hosts, ranging from laptops to production servers.
Over the coming years, this will create challenges.
From time to time new system calls are added, and old system calls are deprecated; this should be considered when thinking about the lifecycle of your container infrastructure and the applications that will run within it.
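As a rough way to see which system calls a particular binary actually relies on (this assumes strace is installed on the host; ls is only an example):
strace -c ls > /dev/null
# the summary printed on stderr lists every system call made; these are what the host kernel must support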
See also "Why kernel version doesn't match Ubuntu version in a Docker container?":
There's no kernel inside a container. Even if you install a kernel, it won't be loaded when the container starts. The very purpose of a container is to isolate processes without the need to run a new kernel.

How does a Docker container use the host OS?

In every Docker tutorial, one of the main advantages cited for Docker is that a container uses the host OS. But if that is true, I don't understand why I need to include an OS in the image. For example, here is the image of CentOS. I understand that if I want to run CentOS in a container I must pull this image, but where does the host OS come in then? It would be great if someone could point me to a link to read about this, because I cannot find an appropriate one.
What Docker uses of the host is actually only the OS's kernel.
What you include in the Docker container is not the actual OS (i.e., the kernel), but rather all the files that make up a specific distribution, such as Ubuntu or Fedora, or whatever…
This is also the reason why you can't run Linux containers on Windows and vice-versa (without a VM), because Linux software of course doesn't work with the Windows kernel, and Windows software doesn't work with the Linux kernel.
So, all Docker containers running on a given host share the host OS's kernel.
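A quick way to see this for yourself (the ubuntu image is just an example):
uname -r                            # kernel version on the host
docker run --rm ubuntu uname -r     # the very same version, even though the userland is Ubuntu's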
It actually shares the host OS's kernel, which is why those images are really small and not like traditional ISO files. Docker primarily utilizes a union file system, cgroups and namespaces to manage the images and containers.
You can give the links below a quick read:
https://kjanshair.github.io/2017/07/04/Docker-Containers-vs-System-Processes/
How is Docker different from a normal virtual machine?

Run Different Linux OS in Docker Container?

I have been trying to learn Docker, and one thing that puzzles me is how a different flavour of Linux (from the host OS) actually runs in the Docker container.
If we assume my Docker host is running RedHat and I start a container from an Ubuntu image then are the following true?:
Logically speaking, if the Ubuntu image footprint is around 550 MB, will the Docker daemon actually download (from an image registry) 550 MB worth of Ubuntu image in order to create the container?
is the instance of Ubuntu running in the container essentially no different than if I had downloaded and installed the same version manually?
I'm aware that the Docker container shares the same kernel used by the host OS and that one of the fundamental points of Docker is the efficiency gained by the container using the underlying OS. So I'm a bit confused about what actually happens when you start a container created from a different Linux version than the host.
I think this previous post may help you understand it a little more - Docker container isolation, does it care about underlying Linux OS?.
The crux of the matter is that if the host OS is RedHat, then it is the RedHat kernel which will be used by whatever build of Linux you run in your Docker container, i.e. Ubuntu in your example.
This comes down to understanding what the difference is between a Linux OS and a Linux Image. You will not be running a full Ubuntu OS inside the Docker Container but an image of Ubuntu.
For the purpose of your question, think:
OS = kernel + filesystem/libraries
Image = filesystem/libraries
The Ubuntu image running inside your Docker container is just the Ubuntu filesystem/libraries - it will not contain the Ubuntu kernel. This partly explains the efficiencies you get from a Docker container which is leveraging the Kernel (among other things) of the underlying Host.
The Ubuntu image running inside the Docker container runs in what is called the user space for that container. This image can make kernel system calls to the RedHat host OS kernel (as part of transferring control from user space to kernel space for some user operations). Since the core kernel is common technology, the system calls are expected to be compatible even when the call is made from Ubuntu user-space code to RedHat kernel code. This compatibility makes it possible to share the kernel across containers which may all have different base OS images.

Which Docker base image should be used to install Apps in a container without any additional OS?

I am running a Docker daemon on my GUEST OS which is CentOS. I want to install software services on top of that in an isolated manner and I do not need another OS image inside my Docker container.
I want to have a Docker container with just the additional binaries and libraries for the software application I am going to install.
Is there a "whiteglove/blank" base image in Docker I can use? I want a very lean container that uses as a starting point what my GUEST OS has to offer. Is that possible?
What you're asking for isn't possible out-of-the-box with Docker. Each Docker image has its own root filesystem, which needs to have some sort of OS installed.
Your options are:
Use a minimal base image, such as the BusyBox image. This will give you the absolute minimum you need to get a container running.
Use the CentOS base image, in which case your container will be running the same or very similar OS.
The reason Docker images are like this is because they're meant to be portable. Any Docker image is meant to run anywhere Docker is running, regardless of the operating system. This means that the Docker image must contain an entire root filesystem and OS installation.
What you can do if you need stuff from the host OS is share a directory using Docker volumes. However, this is generally meant to be used for mounting data directories, and it still necessitates the Docker image having an OS.
That said, if you have a statically-linked binary that has absolutely no dependencies, it becomes easy to create a very minimal image. This is called a "microcontainer", and Go in particular is well-suited to producing these. Here is some further reading on microcontainers and how to produce them.
One other option you could look into if all you want is the resource management part of containers is using lxc-execute, as described in this answer. But you lose out on all the other nice Docker features as well. Unfortunately, what you're trying to do is just not what Docker is built for.
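As a small sketch of the "minimal base image" option mentioned above (image names are examples, and exact sizes vary by tag):
docker pull busybox
docker images busybox                              # only a few MB, compared to hundreds for full distro images
docker run --rm busybox sh -c 'ls /bin | wc -l'    # one multi-call binary provides all of these tools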
As I understand Docker, when you use a base image you do not really install an additional OS.
It is just a directory structure with preinstalled programs, or we can say the file system of the actual base-image OS.
In most cases [click this link for the exception], Docker itself [the Docker engine] runs in a Linux VM when used on Mac and Windows.
If you are confused about virtualization: there is no virtualization inside a Docker container. Containers run in user space on top of the host operating system's kernel. So the containers and the host OS share the same kernel.
So, to summarize:
Consider the host OS to be Windows or Mac.
Docker, when installed, runs inside a Linux VM on these host OSes [use this resource for more info].
The base Linux images inside the Docker container then use this Linux VM as their host OS, not the native Windows or Mac.
On Linux, the base Linux images inside the Docker container directly use the host OS, which is Linux itself, without any virtualization.
The base image inside the Docker container is just a snapshot of that Linux distribution's programs and tools.
The base image makes use of the host kernel (which, in all three cases, is Linux).
Hence, there is no virtualization inside a container, but Docker can use a single parent Linux virtual machine to run itself [the Docker engine] inside it.
Conclusion:
When you use a base image inside Docker, no additional OS is installed inside the container; just a copy of a filesystem with minimal programs and tools is created.
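A rough way to convince yourself of that (fs-demo and the centos image are just example names):
docker create --name fs-demo centos /bin/true
docker export fs-demo | tar -tf - | head    # just ordinary files and directories, no kernel
docker rm fs-demo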
From Docker's best practices:
Whenever possible, use current Official Repositories as the basis for your image. We recommend the Debian image since it’s very tightly controlled and kept extremely minimal (currently under 100 mb), while still being a full distribution.
What you're asking for is completely against the idea of using Docker containers. You don't want to have any dependency on your GUEST OS; if you do, your Docker image won't be portable.
When you create a container, you want it to run on any machine that runs Docker. Be it CentOS, Ubuntu, Mac, or Microsoft Azure :)
Ideally, there is no advantage in your container's base OS having anything to do with your host OS.
For any container, you need at least a root file system. That is why you need to use a base image that provides one. Your idea is not completely against the container usage paradigm; as opposed to VMs, we want containers to be minimal, without repeating elements that they can leverage from the underlying OS.
Following the links of Rohan Singh, I found some related info that doesn't generally contradict the above, but relates to the core idea of the question:
The base image for all Docker images is the scratch image. It has essentially nothing in it. This may sound useless, but you can actually use it to create the smallest possible image for your application, if you can compile your application to a static binary with zero dependencies like you can with Go or C.

What is the relationship between the docker host OS and the container base image OS?

I'm not certain that I'm asking the right question... but while reading everything about Docker that I can get my hands on, I see that I can install Docker on Ubuntu 12.04 (for example) and then install a Fedora container or a different version of Ubuntu? (There is an example where the user installed BusyBox in the container.)
And of course I could be completely wrong.
But it would be my expectation that there was an ephemeral connection between the base system and the container.
Restated: what is the relationship between the host OS and the container base image's OS?
As mentioned by BraveNewCurrency, the only relationship between the host OS and the container is the Kernel.
It is one of the main differences between Docker and 'regular' virtual machines: there is no overhead; everything takes place directly within the host's kernel.
This is why you can run only Linux based distribution/binaries within the container. If you want to run something else, it is not impossible, but you would need some kind of virtualization within the container (qemu, kvm, etc.)
Docker manages images, which are file system representations. You can install any Linux distribution in them or simply put binaries.
Indeed, for the convenience of the example, we often rely on the base images, but you could also create your image without any of the distribution libraries/binaries. That way you would have a really tiny yet functional container.
One more point regarding the distributions: as the kernel is still the kernel of the host, you will not have any specific kernel module/patches provided by the distribution.
Literally, the only thing they have in common is the kernel. Their whole world (file system) is in the docker container.
There is another consideration: even if both kernels are the same, there is a problem if the host OS does not support Docker, like RHEL 6: https://access.redhat.com/solutions/1378023
So you won't be able to spin up a container on RHEL 6, even if the image is a Linux one.

Resources