Disable Kernel Spectre/Meltdown Mitigations - google-container-optimized-os

Is it possible to disable spectre/meltdown mitigations on container optimized OS instances? Because the root filesystem is readonly it doesn't seem to be possible to modify boot flags.
These are instances that run in isolated environments without internet access doing high performance processing.

Related

Setting CPU and Memory limits globally for all docker containers

There are many examples which talks about setting memory, cpu etc using docker run. Is it possible to set it globally so every container created with those values?
There should be other ways like using AppArmor maybe, I'll check, but the first thing that comes to my mind is this project from a friend
Docker enforcer
https://github.com/piontec/docker-enforcer
This project is a docker plugin that will kill containers if they don't meet certain pre-defined policies such as having strict memory and cpu limits

Each docker container runs a separate JVM at run-time?

We are running hundreds of Mule Java 8 apps in a machine currently. A single JVM appears to be running that is sharing hundreds of megs of Jars with the apps at run-time. If we were to run each app in a docker container would each container run a separate JVM at run-time? If so this would incur a major RAM penalty!
Yes, each container would run its own java process and thus its own JVM.
You may consider partitioning your apps though so that, rather than going from many apps on one server|VM to many containers each running one app on one server, that you go to some number of containers each running several apps on one server.
Yes, you'd have to duplicate shared jars for each container. Yes, you'd increase CPU, RAM and network traffic.
But, you'd gain more flexibility in duplicating app servers for scale-out and moving them onto different machines to better reflect their CPU, memory, bandwidth needs.

Is it possible to use a kernel module built from within Docker?

I have a custom kernel module I need to build for a specific piece of hardware. I want to automate setting up my system so I have been containerizing several applications. One of the things I need is this kernel module. Assuming the kernel headers et al in the Docker container and the kernel on the host are for the exact same version, is it possible to have my whole build process containerized and allow the host to use that module?
Many tasks that involve controlling the host system are best run directly on the host, and I would avoid Docker here.
At a insmod(8) level, Docker containers generally run with a restricted set of permissions and can’t make extremely invasive changes like this over the host. There’s probably a docker run --cap-add option that would theoretically make it possible, but a significant design statement of Docker is that container processes aren’t supposed to be able to impact other containers or the host like this.
At an even broader Linux level, the build version of custom kernel modules has to match the host’s kernel exactly. This means, if you update the host kernel (for a routine security update for example) you have to also rebuild and reinstall any custom modules. Mainstream Linux distributions have support for this, but if you’ve boxed away management of this into a container, you have to remember how to rebuild the container with the newer kernel headers and make sure it doesn’t get restarted until you reboot the host. That can be tricky.
At a Docker level, you’re in effect building an image that can only be used on one very specific system. Usually the concept is to build an image that can be reused in multiple contexts; you want to be able to push the image to a registry and run it on another system with minimal configuration. It’s hard to do this if an image is tied to an extremely specific kernel version or other host-level dependency.

How do Docker and other container services differ from KVMs?

Looking at this question and its answers, it's clear that a few points make container services fairly different from traditional VMs:
They can save on performance and space by sharing the host's operating system
They further save space using the AuFS filesystem, which allows them to share the hard drive with the host
All of this allows them to boot in a fraction of the time it takes for a full VM.
I may have some misconceptions about how KVMs work and about the hypervisor model, but aren't containers much like KVMs? In what do they differ, and what are the performance gains/losses for either of them?
I may have some misconceptions about how KVMs work and about the
hypervisor model, but aren't containers much like KVMs? In what do
they differ, and what are the performance gains/losses for either of
them?
A virtual machine is just that -- "virtual" hardware that can boot pretty much any compatible operating system. For example, you can run Windows in a VM on your Linux host. A VM provides a variety of emulated hardware, including the CPU, network cards, storage interfaces, and so forth.
In contrast, a container is nothing more than a collection of processes on your host. Processes running inside the container are no different from processes running outside the container -- from the host you can see them with ps, manage them using tools like kill, etc. Because of this, processes running in containers are using your host kernel -- you can't, say, run a Windows binary inside a container on your Linux host.
Because they're not performing any sort of hardware virtualization, containers are substantially lighter weight than virtual machines. As long as you are able to work with their limitations (ie., the fact that they are limited to the host operating system kernel), they will yield better utilization of hardware than running the same services inside a virtual machine.

How is Docker different from a virtual machine?

I keep rereading the Docker documentation to try to understand the difference between Docker and a full VM. How does it manage to provide a full filesystem, isolated networking environment, etc. without being as heavy?
Why is deploying software to a Docker image (if that's the right term) easier than simply deploying to a consistent production environment?
Docker originally used LinuX Containers (LXC), but later switched to runC (formerly known as libcontainer), which runs in the same operating system as its host. This allows it to share a lot of the host operating system resources. Also, it uses a layered filesystem (AuFS) and manages networking.
AuFS is a layered file system, so you can have a read only part and a write part which are merged together. One could have the common parts of the operating system as read only (and shared amongst all of your containers) and then give each container its own mount for writing.
So, let's say you have a 1 GB container image; if you wanted to use a full VM, you would need to have 1 GB x number of VMs you want. With Docker and AuFS you can share the bulk of the 1 GB between all the containers and if you have 1000 containers you still might only have a little over 1 GB of space for the containers OS (assuming they are all running the same OS image).
A full virtualized system gets its own set of resources allocated to it, and does minimal sharing. You get more isolation, but it is much heavier (requires more resources). With Docker you get less isolation, but the containers are lightweight (require fewer resources). So you could easily run thousands of containers on a host, and it won't even blink. Try doing that with Xen, and unless you have a really big host, I don't think it is possible.
A full virtualized system usually takes minutes to start, whereas Docker/LXC/runC containers take seconds, and often even less than a second.
There are pros and cons for each type of virtualized system. If you want full isolation with guaranteed resources, a full VM is the way to go. If you just want to isolate processes from each other and want to run a ton of them on a reasonably sized host, then Docker/LXC/runC seems to be the way to go.
For more information, check out this set of blog posts which do a good job of explaining how LXC works.
Why is deploying software to a docker image (if that's the right term) easier than simply deploying to a consistent production environment?
Deploying a consistent production environment is easier said than done. Even if you use tools like Chef and Puppet, there are always OS updates and other things that change between hosts and environments.
Docker gives you the ability to snapshot the OS into a shared image, and makes it easy to deploy on other Docker hosts. Locally, dev, qa, prod, etc.: all the same image. Sure you can do this with other tools, but not nearly as easily or fast.
This is great for testing; let's say you have thousands of tests that need to connect to a database, and each test needs a pristine copy of the database and will make changes to the data. The classic approach to this is to reset the database after every test either with custom code or with tools like Flyway - this can be very time-consuming and means that tests must be run serially. However, with Docker you could create an image of your database and run up one instance per test, and then run all the tests in parallel since you know they will all be running against the same snapshot of the database. Since the tests are running in parallel and in Docker containers they could run all on the same box at the same time and should finish much faster. Try doing that with a full VM.
From comments...
Interesting! I suppose I'm still confused by the notion of "snapshot[ting] the OS". How does one do that without, well, making an image of the OS?
Well, let's see if I can explain. You start with a base image, and then make your changes, and commit those changes using docker, and it creates an image. This image contains only the differences from the base. When you want to run your image, you also need the base, and it layers your image on top of the base using a layered file system: as mentioned above, Docker uses AuFS. AuFS merges the different layers together and you get what you want; you just need to run it. You can keep adding more and more images (layers) and it will continue to only save the diffs. Since Docker typically builds on top of ready-made images from a registry, you rarely have to "snapshot" the whole OS yourself.
It might be helpful to understand how virtualization and containers work at a low level. That will clear up lot of things.
Note: I'm simplifying a bit in the description below. See references for more information.
How does virtualization work at a low level?
In this case the VM manager takes over the CPU ring 0 (or the "root mode" in newer CPUs) and intercepts all privileged calls made by the guest OS to create the illusion that the guest OS has its own hardware. Fun fact: Before 1998 it was thought to be impossible to achieve this on the x86 architecture because there was no way to do this kind of interception. The folks at VMware were the first who had an idea to rewrite the executable bytes in memory for privileged calls of the guest OS to achieve this.
The net effect is that virtualization allows you to run two completely different OSes on the same hardware. Each guest OS goes through all the processes of bootstrapping, loading kernel, etc. You can have very tight security. For example, a guest OS can't get full access to the host OS or other guests and mess things up.
How do containers work at a low level?
Around 2006, people including some of the employees at Google implemented a new kernel level feature called namespaces (however the idea long before existed in FreeBSD). One function of the OS is to allow sharing of global resources like network and disks among processes. What if these global resources were wrapped in namespaces so that they are visible only to those processes that run in the same namespace? Say, you can get a chunk of disk and put that in namespace X and then processes running in namespace Y can't see or access it. Similarly, processes in namespace X can't access anything in memory that is allocated to namespace Y. Of course, processes in X can't see or talk to processes in namespace Y. This provides a kind of virtualization and isolation for global resources. This is how Docker works: Each container runs in its own namespace but uses exactly the same kernel as all other containers. The isolation happens because the kernel knows the namespace that was assigned to the process and during API calls it makes sure that the process can only access resources in its own namespace.
The limitations of containers vs VMs should be obvious now: You can't run completely different OSes in containers like in VMs. However you can run different distros of Linux because they do share the same kernel. The isolation level is not as strong as in a VM. In fact, there was a way for a "guest" container to take over the host in early implementations. Also you can see that when you load a new container, an entire new copy of the OS doesn't start like it does in a VM. All containers share the same kernel. This is why containers are light weight. Also unlike a VM, you don't have to pre-allocate a significant chunk of memory to containers because we are not running a new copy of the OS. This enables running thousands of containers on one OS while sandboxing them, which might not be possible if we were running separate copies of the OS in their own VMs.
Good answers. Just to get an image representation of container vs VM, have a look at the one below.
Source
I like Ken Cochrane's answer.
But I want to add additional point of view, not covered in detail here. In my opinion Docker differs also in whole process. In contrast to VMs, Docker is not (only) about optimal resource sharing of hardware, moreover it provides a "system" for packaging application (preferable, but not a must, as a set of microservices).
To me it fits in the gap between developer-oriented tools like rpm, Debian packages, Maven, npm + Git on one side and ops tools like Puppet, VMware, Xen, you name it...
Why is deploying software to a docker image (if that's the right term) easier than simply deploying to a consistent production environment?
Your question assumes some consistent production environment. But how to keep it consistent?
Consider some amount (>10) of servers and applications, stages in the pipeline.
To keep this in sync you'll start to use something like Puppet, Chef or your own provisioning scripts, unpublished rules and/or lot of documentation... In theory servers can run indefinitely, and be kept completely consistent and up to date. Practice fails to manage a server's configuration completely, so there is considerable scope for configuration drift, and unexpected changes to running servers.
So there is a known pattern to avoid this, the so called immutable server. But the immutable server pattern was not loved. Mostly because of the limitations of VMs that were used before Docker. Dealing with several gigabytes big images, moving those big images around, just to change some fields in the application, was very very laborious. Understandable...
With a Docker ecosystem, you will never need to move around gigabytes on "small changes" (thanks aufs and Registry) and you don't need to worry about losing performance by packaging applications into a Docker container at runtime. You don't need to worry about versions of that image.
And finally you will even often be able to reproduce complex production environments even on your Linux laptop (don't call me if doesn't work in your case ;))
And of course you can start Docker containers in VMs (it's a good idea). Reduce your server provisioning on the VM level. All the above could be managed by Docker.
P.S. Meanwhile Docker uses its own implementation "libcontainer" instead of LXC. But LXC is still usable.
Docker isn't a virtualization methodology. It relies on other tools that actually implement container-based virtualization or operating system level virtualization. For that, Docker was initially using LXC driver, then moved to libcontainer which is now renamed as runc. Docker primarily focuses on automating the deployment of applications inside application containers. Application containers are designed to package and run a single service, whereas system containers are designed to run multiple processes, like virtual machines. So, Docker is considered as a container management or application deployment tool on containerized systems.
In order to know how it is different from other virtualizations, let's go through virtualization and its types. Then, it would be easier to understand what's the difference there.
Virtualization
In its conceived form, it was considered a method of logically dividing mainframes to allow multiple applications to run simultaneously. However, the scenario drastically changed when companies and open source communities were able to provide a method of handling the privileged instructions in one way or another and allow for multiple operating systems to be run simultaneously on a single x86 based system.
Hypervisor
The hypervisor handles creating the virtual environment on which the guest virtual machines operate. It supervises the guest systems and makes sure that resources are allocated to the guests as necessary. The hypervisor sits in between the physical machine and virtual machines and provides virtualization services to the virtual machines. To realize it, it intercepts the guest operating system operations on the virtual machines and emulates the operation on the host machine's operating system.
The rapid development of virtualization technologies, primarily in cloud, has driven the use of virtualization further by allowing multiple virtual servers to be created on a single physical server with the help of hypervisors, such as Xen, VMware Player, KVM, etc., and incorporation of hardware support in commodity processors, such as Intel VT and AMD-V.
Types of Virtualization
The virtualization method can be categorized based on how it mimics hardware to a guest operating system and emulates a guest operating environment. Primarily, there are three types of virtualization:
Emulation
Paravirtualization
Container-based virtualization
Emulation
Emulation, also known as full virtualization runs the virtual machine OS kernel entirely in software. The hypervisor used in this type is known as Type 2 hypervisor. It is installed on the top of the host operating system which is responsible for translating guest OS kernel code to software instructions. The translation is done entirely in software and requires no hardware involvement. Emulation makes it possible to run any non-modified operating system that supports the environment being emulated. The downside of this type of virtualization is an additional system resource overhead that leads to a decrease in performance compared to other types of virtualizations.
Examples in this category include VMware Player, VirtualBox, QEMU, Bochs, Parallels, etc.
Paravirtualization
Paravirtualization, also known as Type 1 hypervisor, runs directly on the hardware, or “bare-metal”, and provides virtualization services directly to the virtual machines running on it. It helps the operating system, the virtualized hardware, and the real hardware to collaborate to achieve optimal performance. These hypervisors typically have a rather small footprint and do not, themselves, require extensive resources.
Examples in this category include Xen, KVM, etc.
Container-based Virtualization
Container-based virtualization, also known as operating system-level virtualization, enables multiple isolated executions within a single operating system kernel. It has the best possible performance and density and features dynamic resource management. The isolated virtual execution environment provided by this type of virtualization is called a container and can be viewed as a traced group of processes.
The concept of a container is made possible by the namespaces feature added to Linux kernel version 2.6.24. The container adds its ID to every process and adding new access control checks to every system call. It is accessed by the clone() system call that allows creating separate instances of previously-global namespaces.
Namespaces can be used in many different ways, but the most common approach is to create an isolated container that has no visibility or access to objects outside the container. Processes running inside the container appear to be running on a normal Linux system although they are sharing the underlying kernel with processes located in other namespaces, same for other kinds of objects. For instance, when using namespaces, the root user inside the container is not treated as root outside the container, adding additional security.
The Linux Control Groups (cgroups) subsystem, the next major component to enable container-based virtualization, is used to group processes and manage their aggregate resource consumption. It is commonly used to limit the memory and CPU consumption of containers. Since a containerized Linux system has only one kernel and the kernel has full visibility into the containers, there is only one level of resource allocation and scheduling.
Several management tools are available for Linux containers, including LXC, LXD, systemd-nspawn, lmctfy, Warden, Linux-VServer, OpenVZ, Docker, etc.
Containers vs Virtual Machines
Unlike a virtual machine, a container does not need to boot the operating system kernel, so containers can be created in less than a second. This feature makes container-based virtualization unique and desirable than other virtualization approaches.
Since container-based virtualization adds little or no overhead to the host machine, container-based virtualization has near-native performance
For container-based virtualization, no additional software is required, unlike other virtualizations.
All containers on a host machine share the scheduler of the host machine saving need of extra resources.
Container states (Docker or LXC images) are small in size compared to virtual machine images, so container images are easy to distribute.
Resource management in containers is achieved through cgroups. Cgroups does not allow containers to consume more resources than allocated to them. However, as of now, all resources of host machine are visible in virtual machines, but can't be used. This can be realized by running top or htop on containers and host machine at the same time. The output across all environments will look similar.
Update:
How does Docker run containers in non-Linux systems?
If containers are possible because of the features available in the Linux kernel, then the obvious question is how do non-Linux systems run containers. Both Docker for Mac and Windows use Linux VMs to run the containers. Docker Toolbox used to run containers in Virtual Box VMs. But, the latest Docker uses Hyper-V in Windows and Hypervisor.framework in Mac.
Now, let me describe how Docker for Mac runs containers in detail.
Docker for Mac uses https://github.com/moby/hyperkit to emulate the hypervisor capabilities and Hyperkit uses hypervisor.framework in its core. Hypervisor.framework is Mac's native hypervisor solution. Hyperkit also uses VPNKit and DataKit to namespace network and filesystem respectively.
The Linux VM that Docker runs in Mac is read-only. However, you can bash into it by running:
screen ~/Library/Containers/com.docker.docker/Data/vms/0/tty.
Now, we can even check the Kernel version of this VM:
# uname -a
Linux linuxkit-025000000001 4.9.93-linuxkit-aufs #1 SMP Wed Jun 6 16:86_64 Linux.
All containers run inside this VM.
There are some limitations to hypervisor.framework. Because of that Docker doesn't expose docker0 network interface in Mac. So, you can't access containers from the host. As of now, docker0 is only available inside the VM.
Hyper-v is the native hypervisor in Windows. They are also trying to leverage Windows 10's capabilities to run Linux systems natively.
Most of the answers here talk about virtual machines. I'm going to give you a one-liner response to this question that has helped me the most over the last couple years of using Docker. It's this:
Docker is just a fancy way to run a process, not a virtual machine.
Now, let me explain a bit more about what that means. Virtual machines are their own beast. I feel like explaining what Docker is will help you understand this more than explaining what a virtual machine is. Especially because there are many fine answers here telling you exactly what someone means when they say "virtual machine". So...
A Docker container is just a process (and its children) that is compartmentalized using cgroups inside the host system's kernel from the rest of the processes. You can actually see your Docker container processes by running ps aux on the host. For example, starting apache2 "in a container" is just starting apache2 as a special process on the host. It's just been compartmentalized from other processes on the machine. It is important to note that your containers do not exist outside of your containerized process' lifetime. When your process dies, your container dies. That's because Docker replaces pid 1 inside your container with your application (pid 1 is normally the init system). This last point about pid 1 is very important.
As far as the filesystem used by each of those container processes, Docker uses UnionFS-backed images, which is what you're downloading when you do a docker pull ubuntu. Each "image" is just a series of layers and related metadata. The concept of layering is very important here. Each layer is just a change from the layer underneath it. For example, when you delete a file in your Dockerfile while building a Docker container, you're actually just creating a layer on top of the last layer which says "this file has been deleted". Incidentally, this is why you can delete a big file from your filesystem, but the image still takes up the same amount of disk space. The file is still there, in the layers underneath the current one. Layers themselves are just tarballs of files. You can test this out with docker save --output /tmp/ubuntu.tar ubuntu and then cd /tmp && tar xvf ubuntu.tar. Then you can take a look around. All those directories that look like long hashes are actually the individual layers. Each one contains files (layer.tar) and metadata (json) with information about that particular layer. Those layers just describe changes to the filesystem which are saved as a layer "on top of" its original state. When reading the "current" data, the filesystem reads data as though it were looking only at the top-most layers of changes. That's why the file appears to be deleted, even though it still exists in "previous" layers, because the filesystem is only looking at the top-most layers. This allows completely different containers to share their filesystem layers, even though some significant changes may have happened to the filesystem on the top-most layers in each container. This can save you a ton of disk space, when your containers share their base image layers. However, when you mount directories and files from the host system into your container by way of volumes, those volumes "bypass" the UnionFS, so changes are not stored in layers.
Networking in Docker is achieved by using an ethernet bridge (called docker0 on the host), and virtual interfaces for every container on the host. It creates a virtual subnet in docker0 for your containers to communicate "between" one another. There are many options for networking here, including creating custom subnets for your containers, and the ability to "share" your host's networking stack for your container to access directly.
Docker is moving very fast. Its documentation is some of the best documentation I've ever seen. It is generally well-written, concise, and accurate. I recommend you check the documentation available for more information, and trust the documentation over anything else you read online, including Stack Overflow. If you have specific questions, I highly recommend joining #docker on Freenode IRC and asking there (you can even use Freenode's webchat for that!).
Through this post we are going to draw some lines of differences between VMs and LXCs. Let's first define them.
VM:
A virtual machine emulates a physical computing environment, but requests for CPU, memory, hard disk, network and other hardware resources are managed by a virtualization layer which translates these requests to the underlying physical hardware.
In this context the VM is called as the Guest while the environment it runs on is called the host.
LXCs:
Linux Containers (LXC) are operating system-level capabilities that make it possible to run multiple isolated Linux containers, on one control host (the LXC host). Linux Containers serve as a lightweight alternative to VMs as they don’t require the hypervisors viz. Virtualbox, KVM, Xen, etc.
Now unless you were drugged by Alan (Zach Galifianakis- from the Hangover series) and have been in Vegas for the last year, you will be pretty aware about the tremendous spurt of interest for Linux containers technology, and if I will be specific one container project which has created a buzz around the world in last few months is – Docker leading to some echoing opinions that cloud computing environments should abandon virtual machines (VMs) and replace them with containers due to their lower overhead and potentially better performance.
But the big question is, is it feasible?, will it be sensible?
a. LXCs are scoped to an instance of Linux. It might be different flavors of Linux (e.g. a Ubuntu container on a CentOS host but it’s still Linux.) Similarly, Windows-based containers are scoped to an instance of Windows now if we look at VMs they have a pretty broader scope and using the hypervisors you are not limited to operating systems Linux or Windows.
b. LXCs have low overheads and have better performance as compared to VMs. Tools viz. Docker which are built on the shoulders of LXC technology have provided developers with a platform to run their applications and at the same time have empowered operations people with a tool that will allow them to deploy the same container on production servers or data centers. It tries to make the experience between a developer running an application, booting and testing an application and an operations person deploying that application seamless, because this is where all the friction lies in and purpose of DevOps is to break down those silos.
So the best approach is the cloud infrastructure providers should advocate an appropriate use of the VMs and LXC, as they are each suited to handle specific workloads and scenarios.
Abandoning VMs is not practical as of now. So both VMs and LXCs have their own individual existence and importance.
Docker encapsulates an application with all its dependencies.
A virtualizer encapsulates an OS that can run any applications it can normally run on a bare metal machine.
They both are very different. Docker is lightweight and uses LXC/libcontainer (which relies on kernel namespacing and cgroups) and does not have machine/hardware emulation such as hypervisor, KVM. Xen which are heavy.
Docker and LXC is meant more for sandboxing, containerization, and resource isolation. It uses the host OS's (currently only Linux kernel) clone API which provides namespacing for IPC, NS (mount), network, PID, UTS, etc.
What about memory, I/O, CPU, etc.? That is controlled using cgroups where you can create groups with certain resource (CPU, memory, etc.) specification/restriction and put your processes in there. On top of LXC, Docker provides a storage backend (http://www.projectatomic.io/docs/filesystems/) e.g., union mount filesystem where you can add layers and share layers between different mount namespaces.
This is a powerful feature where the base images are typically readonly and only when the container modifies something in the layer will it write something to read-write partition (a.k.a. copy on write). It also provides many other wrappers such as registry and versioning of images.
With normal LXC you need to come with some rootfs or share the rootfs and when shared, and the changes are reflected on other containers. Due to lot of these added features, Docker is more popular than LXC. LXC is popular in embedded environments for implementing security around processes exposed to external entities such as network and UI. Docker is popular in cloud multi-tenancy environment where consistent production environment is expected.
A normal VM (for example, VirtualBox and VMware) uses a hypervisor, and related technologies either have dedicated firmware that becomes the first layer for the first OS (host OS, or guest OS 0) or a software that runs on the host OS to provide hardware emulation such as CPU, USB/accessories, memory, network, etc., to the guest OSes. VMs are still (as of 2015) popular in high security multi-tenant environment.
Docker/LXC can almost be run on any cheap hardware (less than 1 GB of memory is also OK as long as you have newer kernel) vs. normal VMs need at least 2 GB of memory, etc., to do anything meaningful with it. But Docker support on the host OS is not available in OS such as Windows (as of Nov 2014) where as may types of VMs can be run on windows, Linux, and Macs.
Here is a pic from docker/rightscale :
1. Lightweight
This is probably the first impression for many docker learners.
First, docker images are usually smaller than VM images, makes it easy to build, copy, share.
Second, Docker containers can start in several milliseconds, while VM starts in seconds.
2. Layered File System
This is another key feature of Docker. Images have layers, and different images can share layers, make it even more space-saving and faster to build.
If all containers use Ubuntu as their base images, not every image has its own file system, but share the same underline ubuntu files, and only differs in their own application data.
3. Shared OS Kernel
Think of containers as processes!
All containers running on a host is indeed a bunch of processes with different file systems. They share the same OS kernel, only encapsulates system library and dependencies.
This is good for most cases(no extra OS kernel maintains) but can be a problem if strict isolations are necessary between containers.
Why it matters?
All these seem like improvements, not revolution. Well, quantitative accumulation leads to qualitative transformation.
Think about application deployment. If we want to deploy a new software(service) or upgrade one, it is better to change the config files and processes instead of creating a new VM. Because Creating a VM with updated service, testing it(share between Dev & QA), deploying to production takes hours, even days. If anything goes wrong, you got to start again, wasting even more time. So, use configuration management tool(puppet, saltstack, chef etc.) to install new software, download new files is preferred.
When it comes to docker, it's impossible to use a newly created docker container to replace the old one. Maintainance is much easier!Building a new image, share it with QA, testing it, deploying it only takes minutes(if everything is automated), hours in the worst case. This is called immutable infrastructure: do not maintain(upgrade) software, create a new one instead.
It transforms how services are delivered. We want applications, but have to maintain VMs(which is a pain and has little to do with our applications). Docker makes you focus on applications and smooths everything.
Docker, basically containers, supports OS virtualization i.e. your application feels that it has a complete instance of an OS whereas VM supports hardware virtualization. You feel like it is a physical machine in which you can boot any OS.
In Docker, the containers running share the host OS kernel, whereas in VMs they have their own OS files. The environment (the OS) in which you develop an application would be same when you deploy it to various serving environments, such as "testing" or "production".
For example, if you develop a web server that runs on port 4000, when you deploy it to your "testing" environment, that port is already used by some other program, so it stops working. In containers there are layers; all the changes you have made to the OS would be saved in one or more layers and those layers would be part of image, so wherever the image goes the dependencies would be present as well.
In the example shown below, the host machine has three VMs. In order to provide the applications in the VMs complete isolation, they each have their own copies of OS files, libraries and application code, along with a full in-memory instance of an OS.
Whereas the figure below shows the same scenario with containers. Here, containers simply share the host operating system, including the kernel and libraries, so they don’t need to boot an OS, load libraries or pay a private memory cost for those files. The only incremental space they take is any memory and disk space necessary for the application to run in the container. While the application’s environment feels like a dedicated OS, the application deploys just like it would onto a dedicated host. The containerized application starts in seconds and many more instances of the application can fit onto the machine than in the VM case.
Source: https://azure.microsoft.com/en-us/blog/containers-docker-windows-and-trends/
There are three different setups that providing a stack to run an application on (This will help us to recognize what a container is and what makes it so much powerful than other solutions):
1) Traditional Servers(bare metal)
2) Virtual machines (VMs)
3) Containers
1) Traditional server stack consist of a physical server that runs an operating system and your application.
Advantages:
Utilization of raw resources
Isolation
Disadvantages:
Very slow deployment time
Expensive
Wasted resources
Difficult to scale
Difficult to migrate
Complex configuration
2) The VM stack consist of a physical server which runs an operating system and a hypervisor that manages your virtual machine, shared resources, and networking interface. Each Vm runs a Guest Operating System, an application or set of applications.
Advantages:
Good use of resources
Easy to scale
Easy to backup and migrate
Cost efficiency
Flexibility
Disadvantages:
Resource allocation is problematic
Vendor lockin
Complex configuration
3) The Container Setup, the key difference with other stack is container-based virtualization uses the kernel of the host OS to rum multiple isolated guest instances. These guest instances are called as containers. The host can be either a physical server or VM.
Advantages:
Isolation
Lightweight
Resource effective
Easy to migrate
Security
Low overhead
Mirror production and development environment
Disadvantages:
Same Architecture
Resource heavy apps
Networking and security issues.
By comparing the container setup with its predecessors, we can conclude that containerization is the fastest, most resource effective, and most secure setup we know to date. Containers are isolated instances that run your application. Docker spin up the container in a way, layers get run time memory with default storage drivers(Overlay drivers) those run within seconds and copy-on-write layer created on top of it once we commit into the container, that powers the execution of containers. In case of VM's that will take around a minute to load everything into the virtualize environment. These lightweight instances can be replaced, rebuild, and moved around easily. This allows us to mirror the production and development environment and is tremendous help in CI/CD processes. The advantages containers can provide are so compelling that they're definitely here to stay.
In relation to:-
"Why is deploying software to a docker image easier than simply
deploying to a consistent production environment ?"
Most software is deployed to many environments, typically a minimum of three of the following:
Individual developer PC(s)
Shared developer environment
Individual tester PC(s)
Shared test environment
QA environment
UAT environment
Load / performance testing
Live staging
Production
Archive
There are also the following factors to consider:
Developers, and indeed testers, will all have either subtlely or vastly different PC configurations, by the very nature of the job
Developers can often develop on PCs beyond the control of corporate or business standardisation rules (e.g. freelancers who develop on their own machines (often remotely) or contributors to open source projects who are not 'employed' or 'contracted' to configure their PCs a certain way)
Some environments will consist of a fixed number of multiple machines in a load balanced configuration
Many production environments will have cloud-based servers dynamically (or 'elastically') created and destroyed depending on traffic levels
As you can see the extrapolated total number of servers for an organisation is rarely in single figures, is very often in triple figures and can easily be significantly higher still.
This all means that creating consistent environments in the first place is hard enough just because of sheer volume (even in a green field scenario), but keeping them consistent is all but impossible given the high number of servers, addition of new servers (dynamically or manually), automatic updates from o/s vendors, anti-virus vendors, browser vendors and the like, manual software installs or configuration changes performed by developers or server technicians, etc. Let me repeat that - it's virtually (no pun intended) impossible to keep environments consistent (okay, for the purist, it can be done, but it involves a huge amount of time, effort and discipline, which is precisely why VMs and containers (e.g. Docker) were devised in the first place).
So think of your question more like this "Given the extreme difficulty of keeping all environments consistent, is it easier to deploying software to a docker image, even when taking the learning curve into account ?". I think you'll find the answer will invariably be "yes" - but there's only one way to find out, post this new question on Stack Overflow.
There are many answers which explain more detailed on the differences, but here is my very brief explanation.
One important difference is that VMs use a separate kernel to run the OS. That's the reason it is heavy and takes time to boot, consuming more system resources.
In Docker, the containers share the kernel with the host; hence it is lightweight and can start and stop quickly.
In Virtualization, the resources are allocated in the beginning of set up and hence the resources are not fully utilized when the virtual machine is idle during many of the times.
In Docker, the containers are not allocated with fixed amount of hardware resources and is free to use the resources depending on the requirements and hence it is highly scalable.
Docker uses UNION File system .. Docker uses a copy-on-write technology to reduce the memory space consumed by containers. Read more here
With a virtual machine, we have a server, we have a host operating system on that server, and then we have a hypervisor. And then running on top of that hypervisor, we have any number of guest operating systems with an application and its dependent binaries, and libraries on that server. It brings a whole guest operating system with it. It's quite heavyweight. Also there's a limit to how much you can actually put on each physical machine.
Docker containers on the other hand, are slightly different. We have the server. We have the host operating system. But instead a hypervisor, we have the Docker engine, in this case. In this case, we're not bringing a whole guest operating system with us. We're bringing a very thin layer of the operating system, and the container can talk down into the host OS in order to get to the kernel functionality there. And that allows us to have a very lightweight container.
All it has in there is the application code and any binaries and libraries that it requires. And those binaries and libraries can actually be shared across different containers if you want them to be as well. And what this enables us to do, is a number of things. They have much faster startup time. You can't stand up a single VM in a few seconds like that. And equally, taking them down as quickly.. so we can scale up and down very quickly and we'll look at that later on.
Every container thinks that it’s running on its own copy of the operating system. It’s got its own file system, own registry, etc. which is a kind of a lie. It’s actually being virtualized.
Source: Kubernetes in Action.
I have used Docker in production environments and staging very much. When you get used to it you will find it very powerful for building a multi container and isolated environments.
Docker has been developed based on LXC (Linux Container) and works perfectly in many Linux distributions, especially Ubuntu.
Docker containers are isolated environments. You can see it when you issue the top command in a Docker container that has been created from a Docker image.
Besides that, they are very light-weight and flexible thanks to the dockerFile configuration.
For example, you can create a Docker image and configure a DockerFile and tell that for example when it is running then wget 'this', apt-get 'that', run 'some shell script', setting environment variables and so on.
In micro-services projects and architecture Docker is a very viable asset. You can achieve scalability, resiliency and elasticity with Docker, Docker swarm, Kubernetes and Docker Compose.
Another important issue regarding Docker is Docker Hub and its community.
For example, I implemented an ecosystem for monitoring kafka using Prometheus, Grafana, Prometheus-JMX-Exporter, and Docker.
For doing that, I downloaded configured Docker containers for zookeeper, kafka, Prometheus, Grafana and jmx-collector then mounted my own configuration for some of them using YAML files, or for others, I changed some files and configuration in the Docker container and I build a whole system for monitoring kafka using multi-container Dockers on a single machine with isolation and scalability and resiliency that this architecture can be easily moved into multiple servers.
Besides the Docker Hub site there is another site called quay.io that you can use to have your own Docker images dashboard there and pull/push to/from it. You can even import Docker images from Docker Hub to quay then running them from quay on your own machine.
Note: Learning Docker in the first place seems complex and hard, but when you get used to it then you can not work without it.
I remember the first days of working with Docker when I issued the wrong commands or removing my containers and all of data and configurations mistakenly.
This is how Docker introduces itself:
Docker is the company driving the container movement and the only
container platform provider to address every application across the
hybrid cloud. Today’s businesses are under pressure to digitally
transform but are constrained by existing applications and
infrastructure while rationalizing an increasingly diverse portfolio
of clouds, datacenters and application architectures. Docker enables
true independence between applications and infrastructure and
developers and IT ops to unlock their potential and creates a model
for better collaboration and innovation.
So Docker is container based, meaning you have images and containers which can be run on your current machine. It's not including the operating system like VMs, but like a pack of different working packs like Java, Tomcat, etc.
If you understand containers, you get what Docker is and how it's different from VMs...
So, what's a container?
A container image is a lightweight, stand-alone, executable package of
a piece of software that includes everything needed to run it: code,
runtime, system tools, system libraries, settings. Available for both
Linux and Windows based apps, containerized software will always run
the same, regardless of the environment. Containers isolate software
from its surroundings, for example differences between development and
staging environments and help reduce conflicts between teams running
different software on the same infrastructure.
So as you see in the image below, each container has a separate pack and running on a single machine share that machine's operating system... They are secure and easy to ship...
There are a lot of nice technical answers here that clearly discuss the differences between VMs and containers as well as the origins of Docker.
For me the fundamental difference between VMs and Docker is how you manage the promotion of your application.
With VMs you promote your application and its dependencies from one VM to the next DEV to UAT to PRD.
Often these VM's will have different patches and libraries.
It is not uncommon for multiple applications to share a VM. This requires managing configuration and dependencies for all the applications.
Backout requires undoing changes in the VM. Or restoring it if possible.
With Docker the idea is that you bundle up your application inside its own container along with the libraries it needs and then promote the whole container as a single unit.
Except for the kernel the patches and libraries are identical.
As a general rule there is only one application per container which simplifies configuration.
Backout consists of stopping and deleting the container.
So at the most fundamental level with VMs you promote the application and its dependencies as discrete components whereas with Docker you promote everything in one hit.
And yes there are issues with containers including managing them although tools like Kubernetes or Docker Swarm greatly simplify the task.
Feature
Virtual Machine
(Docker) Containers
OS
Each VM Does contains an Operating System
Each Docker Container Does Not contains an Operating System
H/W
Each VM contain a virtual copy of the hardware that OS requires to run.
There is No virtualization of H/W with containers
Weight
VM's are heavy -- reason sited above--
containers are lightweight and, thus, fast
Required S/W
Virtuliazation achieve using software called a hypervisor
Containerzation achieve using software called a Docker
Core
Virtual machines provide virtual hardware (or hardware on which an operating system and other programs can be installed)
Docker containers don’t use any hardware virtualization. **It helps to use container
Abstraction
Virtual machines provide hardware abstractions so you can run multiple operating systems.
Containers provide OS abstractions so you can run multiple containers.
Boot-Time
It takes a long time (often minutes) to create and require significant resource overhead because they run a whole operating system in addition to the software you want to use.
It takes less time because Programs running inside Docker containers interface directly with the host’s Linux kernel.
Containers isolates libraries and software packages from the system so that you can install different versions of same software and libraries without conflict. It uses minimal storage and ram, almost no overhead using same base os kernel and available libraries with a small delta difference if possible. You can expose your hardware directly or indirectly to containers so that you can use acceleration such as gpu for computations.
In practice you use docker for pre-made containers. You install them and run them in one line. Installing tensorflow-gpu is as easy as docker run -it tensorflow-gpu. Although I could not stumble upon many premade containers of lxd (lxc containers),I find them easier to customize and more stable and performant.
Both containers and VMs can be used to distribute the load. But since containers has almost no overhead, container management software are focused on creating container clusters so that you distribute them, thus the load, to metal machines easily.
Real Life example:
Suppose that you need more than 50 types of computation environment and 50 types of services such as mysql, webhosting and cloud based services (like jenkins and object storage) and you have more than 50 different bare metal servers. Typically its an academic environment with many faculties. And you need to use resources efficiently and you need high availability. When one server goes down users should not experience any problem.
To solve this, what you do is basically installing all types of containers on all servers. And distribute the load to all metal machines. As one type of container is needed more it is possible to automatically spawn more of them on one or more bare metal machines. So that many different users can use different services and environments continuously and flexibly.
In that setup suppose there are 100 students using the system at the same time. 95 of them are using servers for rudimentary services such as checking GPA's, curriculum, library database etc. But 5 of them are performing 5 different types of engineering simulations. You will see that 49 bare metal servers are fully dedicated to engineering simulation each having 5 different types of computation containers tying to race each but balanced as %20 hw resource use. When you add 2500 more students for rudimentary tasks, that either will use %5 of all bare metal machines. Rest will be used for computations.
Thus the most important distinguishing features of container providing such flexibility benefits are:
ready to deploy premade containers, almost no overhead, fast spawnability
with live-adjustable quotas
using .cpu_allowencess , .ram_allowances or directly cgroup.
Kubernetes does all of this for you. After fiddling with docker and lxd you may want to check it out.
In my opinion it depends, it can be seen from the needs of your application, why decide to deploy to Docker because Docker breaks the application into small parts according to its function, this becomes effective because when one application / function is an error it has no effect on other applications , in contrast to using full vm, it will be slower and more complex in configuration, but in some ways safer than docker
The docker documentation (and self-explanation) makes a distinction between "virtual machines" vs. "containers". They have the tendency to interpret and use things in a little bit uncommon ways. They can do that because it is up to them, what do they write in their documentation, and because the terminology for virtualization is not yet really exact.
Fact is what the Docker documentation understands on "containers", is paravirtualization (sometimes "OS-Level virtualization") in the reality, contrarily the hardware virtualization, which is docker not.
Docker is a low quality paravirtualisation solution. The container vs. VM distinction is invented by the docker development, to explain the serious disadvantages of their product.
The reason, why it became so popular, is that they "gave the fire to the ordinary people", i.e. it made possible the simple usage of typically server ( = Linux) environments / software products on Win10 workstations. This is also a reason for us to tolerate their little "nuance". But it does not mean that we should also believe it.
The situation is made yet more cloudy by the fact that docker on Windows hosts used an embedded Linux in HyperV, and its containers have run in that. Thus, docker on Windows uses a combined hardware and paravirtualization solution.
In short, Docker containers are low-quality (para)virtual machines with a huge advantage and a lot of disadvantages.

Resources