How does Docker handle different kernel versions? - docker

Let's say that I make an image for an OS that uses a kernel of version 10. What behavior does Docker exhibit if I run a container for that image on a host OS running a kernel of version 9? What about version 11?
Does the backward compatibility of the versions matter? I'm asking out of curiosity because the documentation only talks about "minimum Linux kernel version", etc. This sounds like it doesn't matter what kernel version the host is running beyond that minimum. Is this true? Are there caveats?

Let's say that I make an image for an OS that uses a kernel of version 10.
I think this is a bit of a misconception, unless you are talking about specific software that relies on newer kernel features inside your Docker image, which should be pretty rare. Generally speaking a Docker image is just a custom file/directory structure, assembled in layers via FROM and RUN instructions in one or more Dockerfiles, with a bit of meta data like what ports to open or which file to execute on container start. That's really all there is to it. The basic principle of Docker is very much like a classic chroot jail, only a bit more modern and with some candy on top.
What behavior does Docker exhibit if I run a container for that image on a host OS running a kernel of version 9? What about version 11?
If the kernel can run the Docker daemon it should be able to run any image.
Are there caveats?
As noted above, Docker images that include software which relies on bleeding edge kernel features will not work on kernels that do not have those features, which should be no surprise. Docker will not stop you from running such an image on an older kernel as it simply does not care whats inside an image, nor does it know what kernel was used to create the image.
The only other thing I can think of is compiling software manually with aggressive optimizations for a specific cpu like Intel or Amd. Such images will fail on hosts with a different cpu.

Docker's behaviour is no different: it doesn't concern itself (directly) with the behaviour of the containerized process. What Docker does do is set up various parameters (root filesystem, other mounts, network interfaces and configuration, separate namespaces or restrictions on what PIDs can be seen, etc.) for the process that let you consider it a "container," and then it just runs the initial process in that environment.
The specific software inside the container may or may not work with your host operating system's kernel. Using a kernel older than the software was built for is not infrequently problematic; more often it's safe to run older software on a newer kernel.
More often, but not always. On a host with kernel 4.19 (e.g. Ubuntu 18.04) try docker run centos:6 bash. You'll find it segfaults (exit code 139) because that old build of bash does something that greatly displeases the newer kernel. (On a 4.9 or lower kernel, docker run centos:6 bash will work fine.) However, docker run centos:6 ls will not die in the same way because that program is not dependent on particular kernel facilities that have changed (at least, not when run with no arguments).

This sounds like it doesn't matter what kernel version the host is running beyond that minimum. Is this true?
As long as your kernel meets Docker's minimum requirements (which mostly involve having the necessary APIs to support the isolated execution environment that Docker sets up for each container), Docker doesn't really care what kernel you're running.
In many way, this isn't entirely a Docker question: for the most part, user-space tools aren't tied particularly tightly to specific kernel versions. This isn't unilaterally true; there are some tools that by design interact with a very specific kernel version, or that can take advantage of APIs in recent kernel versions for improved performance, but for the most part your web server or database just doesn't care.
Are there caveats?
The kernel version you're running may dictate things like which storage drivers are available to Docker, but this doesn't really have any impact on your containers.
Older kernel versions may have security vulnerabilities that are fixed in more recent versions, and newer versions may have fixes that offer improved performance.

Related

Does Docker image include OS?

I have below Dockerfile"
FROM openjdk:12.0.2
EXPOSE 8080
ADD ./build/libs/*.jar app.jar
ENTRYPOINT ["java","-jar","/app.jar"]
The resulting Docker image encapsulates Java program. When I deploy this Docker image to Windows Server or Linux, does the image always include OS like Linux which runs on top of host OS (Windows Server or Linux) ?
I am asking this question in the sense of Docker image being physical box which contains other boxes (one being openjdk), does this box also contain Linux OS box that I can pull out of it ( assuming if this was possible) and install it as Linux OS on empty machine?
That depends on what you call the "OS". It will always contain stuff from the distribution image, it is built on.
For example, a debian based images will include apt and other debian-specific tools. But most of the stuff you need on a "complete" machine (as in non-container), will have been removed to keep the image as small as possible.
It will not contain the kernel, as it is running on the host machine and is controlled by the host's kernel.
The "official" OpenJDK images from the Docker Hub are available in variants based on a number of different Linux distributions. There is a cut-down Debian, an Alpine, and others. There are advantages and disadvantages to each.
The image will need to contain enough operating system dependencies to allow the JVM to run. It may also include basic diagnostic and management tools -- enough to carry out rudimentary troubleshooting in the container, anyway. You can expect all the images to contain at least basic console shell tools like "cp" and "cat", although they differ in implementation. For example, the Alpine variant gets these utilities from BusyBox, not from a conventional GNU/Linux installation.
It's possible to create a Docker image that contains no platform dependencies at all, but there's little incentive to be that minimal -- you'd just have to build more stuff into the application program itself.
It doesn't include the entire operating system, but the image will be dependent on either linux or windows, you can't build an image that runs on both in one Dockerfile.
The reason for the dependency is that a docker container shares resources with it's host machine in a carefully fenced off way, this mechanism is different on windows and linux (though to you, as a docker user, the difference is invisible).

docker container does not need an OS, but each container has one. Why?

"docker" is a buzz word these days and I'm trying to figure out, what it is and how does it work. And more specifically, how is it different from the normal VM (e.g. VirtualBox, HyperV or WMWare solutions).
The introduction section of the documentation (https://docs.docker.com/get-started/#a-brief-explanation-of-containers) reads:
Containers run apps natively on the host machine’s kernel. They have better performance characteristics than virtual machines that only get virtual access to host resources through a hypervisor. Containers can get native access, each one running in a discrete process, taking no more memory than any other executable.
Bingo! Here is the difference. Containers run directly on the kernel of hosting OS, this is why they are so lightweight and fast (plus they provide isolation of processes and nice distribution mechanism in the shape of docker hub, which plays well with the ability to connect containers with each other).
But wait a second. I can run Linux applications on windows using docker - how can it be? Sure, there is some VM. Otherwise we would just not get job done...
OK, but how does it look like, when we work on Linux host??? And here comes real confusion... there one still defines OS as a base image for every image we want to create. Even if we say "FROM scratch" - scratch is still some minimalistic kernel... So here comes
QUESTION 1: If I run e.g. CentOS host, can I create the container, which would directly use kernel of this host operating system (and not VM, which includes its own OS)? If yes, how can I do it? If no, why the documentaion of docker lies to us (as then docker images always run within some VM and it is not too much different from other VMs, or ist it?)?
After some thinking about it and looking around I was wondering, if some optimization is done for running the images. Here comes
QUESTION 2: If I run two containers, images of both of which are based on the same parent image, will this parent image be loaded into memory only once? Will there be one VM for each container or just one, which runs both containers? And what if we use different OSs?
The third question is quite beaten:
QUESTION 3: Are there somewhere some resources, which describe this kind of things... because most of the articles, which discuss docker just tell "it is so cool, you must definitely use ut. Just run one command and be happy"... which does not explain too much.
Thanks.
Docker "containers" are not virtual machines; they are just regular processes running on the host system (and thus always on the host's Linux kernel) with some special configuration to partition them off from the rest of the system.
You can see this for yourself by starting a process in a container and doing a ps outside the container; you'll see that process in the host's list of all processes. Running ps in the containerized process, however, will show only processes in that container; limiting the view of processes on the system is one of the facilities that containerization provides.
The container is also usually given a limited or separate view of many other system resources, such as files, network interfaces and users. In particular, containerized processes are often given a completely different root filesystem and set of users, making it look almost as if it's running on a separate machine. (But it's not; it still shares the host's CPU, memory, I/O bandwidth and, most importantly, Linux kernel of the host.)
To answer your specific questions:
On CentOS (or any other system), all containers you create are using the host's kernel. There is no way to create a container that uses a different kernel; you need to start a virtual machine for that.
The image is just files on disk; these files are "loaded into memory" in the same way any files are. So no, for any particular disk block of a file in a shared parent image there will never be more than one copy of that disk block in memory at once. However, each container has its own private "transparent" filesystem layer above the base image layer that is used to handle writes, so if you change a file the changed blocks will be stored there, and will now be separate from the underlying image that that other processes (who have not changed any blocks in that file) see.
In Linux you can try man cgroups and man cgroup_namespaces to get some fairly technical details about the cgroup mechanism, which is what Docker (and any other containerization scheme on Linux) uses to limit and change what a containerized process sees. I don't have any other particular suggestions on readings directly related to this, but I think it might help to learn the technical details of how processes and various other systems work on Unix and POSIX systems in general, because understanding that gives you the background to understand what kinds of things containerization does. Perhaps start with learning about the chroot(2) system call and programming with it a bit (or even playing around with the chroot(8) program); that would give you a practical hands-on example of how one particular area of containerization.
Follow-up questions:
There is no kernel version matching; only the one host kernel is ever used. If the program in the container doesn't work on that version of that kernel, you're simply out of luck. For example, try runing the Docker official centos:6 or centos:5 container on a Linux system with a 4.19 or later kernel, and you'll see that /bin/bash segfaults when you try to start it. The kernel and userland program are not compatible. If the program tries to use newer facilities that are not in the kernel, it will similarly fail. This is no different from running the same binaries (program and shared libraries!) outside of a container.
Windows and Macintosh systems can't run Linux containers directly, since they're not Linux kernels with the appropriate facilities to run even Linux programs, much less supporting the same extra cgroup facilities. So when you install Docker on these, generally it installs a Linux VM on which to run the containers. Almost invariably it will install only a single VM and run all containers in that one VM; to do otherwise would be a waste of resources for no benefit. (Actually, there could be benefit in being able to have several different kernel versions, as mentioned above.)
Docker does not has an OS in its containers. In simple terms, a docker container image just has a kind of filesystem snapshot of the linux-image the container image is dependent on.
The container-image includes some basic programs like bash-shell, vim-editor etc to facilitate developer to work easily with the docker image. Also, docker images can include pre-installed dependencies like nodeJS, redis-server etc as we can find on docker hub.
Docker behind the scene uses the host OS which is linux itself to run its containers. The programs included in linux-like filesystem snapshot that we see in form of docker containers actually runs on the host OS in isolation.
The container-images may sound like different linux distros but they are the filesystem snapshot of those distros. All Linux distributions are based on the same kernel. They differ in the programs, tools and dependencies that they ships with.
Also take note of this comment [click]. It is very much relevant to this question.
Hope this helps.
It's now long time since I posted this question, but it seems, like it still get hits... So I decided to answer it - in fact mainly the question, which is in the title (the questions in the text are carefully answered by Curt J. Sampson).
So, the discussion of the "main" question: if containers are not VMs, then why do we need VMs for them?
As you may guess, I am working on windows (on Linux this question would not emerge, because on Linux one does not need VMs for docker).
The reason, why we need a VM for containers in Winodows is pretty obvious (probably this is the reason, why nobody mentions it explicitly). As was already mentioned here and it many other FAQs, containers reuse kernel and some other resources of the hosting OS. Taking into account, that most of the containers available out there are based on Linux, one may conclude, that those containers need host OS to provide Linux kernel for them to run. Which is not natively easy on Windows (I am not sure, may be it is now possible with Linux subsystem). This is why on Windows we need one VM, which runs Linux and docker service inside this VM. And then, when we start the containers, they are also started inside this VM (and reuse the resources of its Linux OS). All the containers run inside the same VM. Getting a bit more technical: by default docker uses Hyper-V to run this linux VM, but one can also use Docker-Toolbox, which uses Oracle VirtualBox. By the way, VM can be freely seen in the Virtual Box interface. Nice part is that Docker (or Docker toolbox) takes care about managing this VM and we don't need to care about it.
Now some bonus question, which that time confused me even more. One may think: "Ok, it is clear now. If we run Linux container on Winodws OS, then we need Linux kernel and thus need VM with Linux. But if we run Windows container on Windows (by the way, it exists), then VM should not be needed, right?..." Answer: "wrong" (or almost wrong). :) The problem is, that the Windows based containers (at least those, which I saw) use windows server kernel, which is not available e.g. in Windows 10. Thus one still need VM with special version of Windows Server running on it. In fact MS even created special version of Windows Server, which can be run on VM for development purposes free of charge specifically to enable development of Windows-Server based containers. If my understanding is correct, those containers should be possible to run without VM on Windows Server. I should admit, that I never checked it though.
I hope, that this messy explanation may help someone to better understand the topic.
We need a VM to run a docker on the host machine ( this is achieved through the docker toolbox) if it is windows, on Linux we don't even need this. Once we have a docker toolbox container in itself doesn't need a VM, each container has a baseline image which is very minimal and reuses a lot of stuff with the host kernel hence making it lightweight compared to VM. You can run many such container using single host kernel.

What happens if docker container requires kernel features not provided by host?

Can issues result if a docker image requires a kernel feature not provided by host OS kernel (e.g an image which requires a very specific kernel version)? Is this issue guaranteed to be prevented in some way?
Can issues result if a docker image requires a kernel feature not provided by host OS kernel
Yes, but note that the docker installation page recommend a minimun kernel level for docker itself to run.
For instance on RedHat "your kernel must be 3.10 at minimum".
If the image you run requires more recent kernel features, it won't work even though docker itself will.
Is this issue guaranteed to be prevented in some way?
Not really, as illustrated in "Docker - the pain of finding the right distribution+kernel+hardware combination "
As noted in "Can a docker image based on Ubuntu run in Redhat?"
Most Linux kernel variants are sufficiently similar that applications will not notice. However if the code relies on something specific in the kernel that is not there, Docker can't help you.
Plus:
system architecture is a limitation.
x86_64 images won't run on ARM for exemple. I.E. you won't run the official ubuntu image on a Raspberry PI.

"Dockerized" apps frequently are built on top of OS containers. Why doesn't this defeat the purpose?

A question came up as I was giving a presentation on Docker to my team that I didn't know how to answer.
Many of the prebuilt containers on Docker Hub, for just one example the jboss/wildfly container, are built on top of containers for a specific OS (Ubuntu, CentOS, etc.). A few of these containers ARE in fact nothing but containers for these OSes.
Yet Docker's main raison d'etre, it's prime claim to fame, the basis of its claim that it is better than Virtual Machine technologies, is that it is lighter weight because it doesn't need to be built on top of an OS. But if this is so and most containers include an OS does this not defeat the purpose and invalidate the claim?
So what IS in these OS Docker images, and how is the claim of lighter weight still able to be made? Is it some stripped down version of an OS?
Can one make a Docker image that is not built on top of an OS?
What determines when an application gets OS services from the OS embedded in the container, as opposed to getting OS services from the host?
A Docker image (which will most likely contain the base system from a Linux distribution), is read only and is augmented with several layers that are enabled as you write to a location. So you can share the base image and have "add-ons" if you will. This is called a union file system. The docker documentation provides more information here. This kind of sharing makes Docker consume less resources (fs space in this case) compared to VMs, where you'd have to install a new distribution on each.
Note that you don't have to have a full Ubuntu installation (the kernel is shared with the host system, anyway), it is just that most of it is usually required by the applications you want to run in your container. You can easily find images that are stripped down, omitting files not needed to run most applications while still being viable for many targets (so you can still share the base image, see above).

Can a docker image based on Ubuntu run in Redhat?

Read some PPTs, it seems that one container can run on different linux vendors. Is is true?
Yes. That's the main idea of docker.
It creates a "static container" in a chrooted env that is able to run on any linux because all the needed user-land dependencies are included in the image.
Since linux (the kernel) maintains a backward compatibility on system calls and their call-schemes, the idea can work across versions and even different distributions of Linux.
Of course, the binary architecture (say amd64) needs to be the same on the source and target system.
Yes, for most applications this works. The kernel is whatever you are really running on (RedHat in your example) while the userspace is supplied by the container (Ubuntu).
Most Linux kernel variants are sufficiently similar that applications will not notice. However if the code relies on something specific in the kernel that is not there, Docker can't help you.
Docker itself relies on certain minimum kernel features, version 3.8 at the time of writing. https://docs.docker.com/engine/installation/binaries/

Resources