I have a unique Docker issue. I am developing an application which needs to connect to multiple Docker containers. The gist is, that this application will use the Docker SDK to spin up containers and connect to them as needed.
However, due to the nature of the application, we should assume that each one of these containers is compromised and unsafe. Therefore, I need to separate them from the host network (so they cannot access my devices and the WAN). I still have the constraint of needing to connect to them from my application.
It is a well-known problem that the macOS networking stack doesn't support connecting to a docker network. Normally, I'd get around this by exposing a port I need. However, this is not possible with my application, as I am using internal networks with Docker.
I'd like to accomplish something like the following. Imagine Container 2 and Container 3 are on their own private internal network. The host (which isn't a container) is controlling the Docker SDK and can query their internal IPs. Thus, it can easily connect to these machines without this network being exposed to the network of the host. Fortunately, this sort of setup works on Linux. However, I'd like to come up with a cross platform solution that works on macOS.
I had a similar situation. What I ended up doing was:
The app manages a dynamic container-to-port mapping (just a hash table).
When my app (on the host) wants to launch a container, it finds an unused port in a pre-defined range (e.g. 28000-29000).
Once it has a port, it maps the container's port to some port in a pre-determined range (e.g. -p 28003:80).
When my app needs to refer to a container, it uses localhost:<port> (e.g. localhost:28001).
It turns out to not be a lot of code, but if you go that route, make sure you encapsulate the way you refer to containers (i.e. don't hard-code the hostname and port, use a class that generates the string).
All that said, you should really do some testing with a VM deployment option before you rule it out as too slow.
I have a custom kernel module I need to build for a specific piece of hardware. I want to automate setting up my system so I have been containerizing several applications. One of the things I need is this kernel module. Assuming the kernel headers et al in the Docker container and the kernel on the host are for the exact same version, is it possible to have my whole build process containerized and allow the host to use that module?
Many tasks that involve controlling the host system are best run directly on the host, and I would avoid Docker here.
At a insmod(8) level, Docker containers generally run with a restricted set of permissions and can’t make extremely invasive changes like this over the host. There’s probably a docker run --cap-add option that would theoretically make it possible, but a significant design statement of Docker is that container processes aren’t supposed to be able to impact other containers or the host like this.
At an even broader Linux level, the build version of custom kernel modules has to match the host’s kernel exactly. This means, if you update the host kernel (for a routine security update for example) you have to also rebuild and reinstall any custom modules. Mainstream Linux distributions have support for this, but if you’ve boxed away management of this into a container, you have to remember how to rebuild the container with the newer kernel headers and make sure it doesn’t get restarted until you reboot the host. That can be tricky.
At a Docker level, you’re in effect building an image that can only be used on one very specific system. Usually the concept is to build an image that can be reused in multiple contexts; you want to be able to push the image to a registry and run it on another system with minimal configuration. It’s hard to do this if an image is tied to an extremely specific kernel version or other host-level dependency.
I am migrating legacy application deployed on two physical servers[web-app(node1) and DB(node2)].
Though following blog fullfilled my requirement. but still some questions
https://codeblog.dotsandbrackets.com/multi-host-docker-network-without-swarm/#comment-2833
1- For mentioned scenario web-app(node1) and DB(node2), we can use expose port options and webapp will use that port, why to create overlay network?
2- By using swarm-mode with replica=1 we can achieve same, so what advantage we will get by using creating overlay network without swarm mode?
3- if node on which consul is installed, it goes down our whole application is no more working.(correct if understanding is wrong)
4- In swarm-mode if manager node goes down(which also have webapp) my understanding is swarm will launch both containers on available host? please correct me if my understanding is not correct?
That article is describing an outdated mode of operation for 'Swarm'. What's described is 'Classic Swarm' that needed an external kv store (like consul) but now Docker primarily uses 'Swarm mode' (which is an orchestration capability built in to the engine itself). To answer what I think your questions are:
I think you're asking, if we can expose a port for a service on a host, why do we need an overlay network? If so, what happens if the host goes down and the container gets re-scheduled to another node? The overlay network takes care of that by keeping track of where containers are and routing traffic appropriately.
Not sure what you mean by this.
If consul was a key piece of discovery infra then yes, it would be a single point of failure so you'd want to run it HA. This is one of the reasons that the dependency on an external kv was removed with 'Swarm Mode'.
Not sure what you mean by this, but maybe about rebalancing? If so then yes, if a host (with containers) goes down, those containers will be re-scheduled on another node.
"docker" is a buzz word these days and I'm trying to figure out, what it is and how does it work. And more specifically, how is it different from the normal VM (e.g. VirtualBox, HyperV or WMWare solutions).
The introduction section of the documentation (https://docs.docker.com/get-started/#a-brief-explanation-of-containers) reads:
Containers run apps natively on the host machine’s kernel. They have better performance characteristics than virtual machines that only get virtual access to host resources through a hypervisor. Containers can get native access, each one running in a discrete process, taking no more memory than any other executable.
Bingo! Here is the difference. Containers run directly on the kernel of hosting OS, this is why they are so lightweight and fast (plus they provide isolation of processes and nice distribution mechanism in the shape of docker hub, which plays well with the ability to connect containers with each other).
But wait a second. I can run Linux applications on windows using docker - how can it be? Sure, there is some VM. Otherwise we would just not get job done...
OK, but how does it look like, when we work on Linux host??? And here comes real confusion... there one still defines OS as a base image for every image we want to create. Even if we say "FROM scratch" - scratch is still some minimalistic kernel... So here comes
QUESTION 1: If I run e.g. CentOS host, can I create the container, which would directly use kernel of this host operating system (and not VM, which includes its own OS)? If yes, how can I do it? If no, why the documentaion of docker lies to us (as then docker images always run within some VM and it is not too much different from other VMs, or ist it?)?
After some thinking about it and looking around I was wondering, if some optimization is done for running the images. Here comes
QUESTION 2: If I run two containers, images of both of which are based on the same parent image, will this parent image be loaded into memory only once? Will there be one VM for each container or just one, which runs both containers? And what if we use different OSs?
The third question is quite beaten:
QUESTION 3: Are there somewhere some resources, which describe this kind of things... because most of the articles, which discuss docker just tell "it is so cool, you must definitely use ut. Just run one command and be happy"... which does not explain too much.
Thanks.
Docker "containers" are not virtual machines; they are just regular processes running on the host system (and thus always on the host's Linux kernel) with some special configuration to partition them off from the rest of the system.
You can see this for yourself by starting a process in a container and doing a ps outside the container; you'll see that process in the host's list of all processes. Running ps in the containerized process, however, will show only processes in that container; limiting the view of processes on the system is one of the facilities that containerization provides.
The container is also usually given a limited or separate view of many other system resources, such as files, network interfaces and users. In particular, containerized processes are often given a completely different root filesystem and set of users, making it look almost as if it's running on a separate machine. (But it's not; it still shares the host's CPU, memory, I/O bandwidth and, most importantly, Linux kernel of the host.)
To answer your specific questions:
On CentOS (or any other system), all containers you create are using the host's kernel. There is no way to create a container that uses a different kernel; you need to start a virtual machine for that.
The image is just files on disk; these files are "loaded into memory" in the same way any files are. So no, for any particular disk block of a file in a shared parent image there will never be more than one copy of that disk block in memory at once. However, each container has its own private "transparent" filesystem layer above the base image layer that is used to handle writes, so if you change a file the changed blocks will be stored there, and will now be separate from the underlying image that that other processes (who have not changed any blocks in that file) see.
In Linux you can try man cgroups and man cgroup_namespaces to get some fairly technical details about the cgroup mechanism, which is what Docker (and any other containerization scheme on Linux) uses to limit and change what a containerized process sees. I don't have any other particular suggestions on readings directly related to this, but I think it might help to learn the technical details of how processes and various other systems work on Unix and POSIX systems in general, because understanding that gives you the background to understand what kinds of things containerization does. Perhaps start with learning about the chroot(2) system call and programming with it a bit (or even playing around with the chroot(8) program); that would give you a practical hands-on example of how one particular area of containerization.
Follow-up questions:
There is no kernel version matching; only the one host kernel is ever used. If the program in the container doesn't work on that version of that kernel, you're simply out of luck. For example, try runing the Docker official centos:6 or centos:5 container on a Linux system with a 4.19 or later kernel, and you'll see that /bin/bash segfaults when you try to start it. The kernel and userland program are not compatible. If the program tries to use newer facilities that are not in the kernel, it will similarly fail. This is no different from running the same binaries (program and shared libraries!) outside of a container.
Windows and Macintosh systems can't run Linux containers directly, since they're not Linux kernels with the appropriate facilities to run even Linux programs, much less supporting the same extra cgroup facilities. So when you install Docker on these, generally it installs a Linux VM on which to run the containers. Almost invariably it will install only a single VM and run all containers in that one VM; to do otherwise would be a waste of resources for no benefit. (Actually, there could be benefit in being able to have several different kernel versions, as mentioned above.)
Docker does not has an OS in its containers. In simple terms, a docker container image just has a kind of filesystem snapshot of the linux-image the container image is dependent on.
The container-image includes some basic programs like bash-shell, vim-editor etc to facilitate developer to work easily with the docker image. Also, docker images can include pre-installed dependencies like nodeJS, redis-server etc as we can find on docker hub.
Docker behind the scene uses the host OS which is linux itself to run its containers. The programs included in linux-like filesystem snapshot that we see in form of docker containers actually runs on the host OS in isolation.
The container-images may sound like different linux distros but they are the filesystem snapshot of those distros. All Linux distributions are based on the same kernel. They differ in the programs, tools and dependencies that they ships with.
Also take note of this comment [click]. It is very much relevant to this question.
Hope this helps.
It's now long time since I posted this question, but it seems, like it still get hits... So I decided to answer it - in fact mainly the question, which is in the title (the questions in the text are carefully answered by Curt J. Sampson).
So, the discussion of the "main" question: if containers are not VMs, then why do we need VMs for them?
As you may guess, I am working on windows (on Linux this question would not emerge, because on Linux one does not need VMs for docker).
The reason, why we need a VM for containers in Winodows is pretty obvious (probably this is the reason, why nobody mentions it explicitly). As was already mentioned here and it many other FAQs, containers reuse kernel and some other resources of the hosting OS. Taking into account, that most of the containers available out there are based on Linux, one may conclude, that those containers need host OS to provide Linux kernel for them to run. Which is not natively easy on Windows (I am not sure, may be it is now possible with Linux subsystem). This is why on Windows we need one VM, which runs Linux and docker service inside this VM. And then, when we start the containers, they are also started inside this VM (and reuse the resources of its Linux OS). All the containers run inside the same VM. Getting a bit more technical: by default docker uses Hyper-V to run this linux VM, but one can also use Docker-Toolbox, which uses Oracle VirtualBox. By the way, VM can be freely seen in the Virtual Box interface. Nice part is that Docker (or Docker toolbox) takes care about managing this VM and we don't need to care about it.
Now some bonus question, which that time confused me even more. One may think: "Ok, it is clear now. If we run Linux container on Winodws OS, then we need Linux kernel and thus need VM with Linux. But if we run Windows container on Windows (by the way, it exists), then VM should not be needed, right?..." Answer: "wrong" (or almost wrong). :) The problem is, that the Windows based containers (at least those, which I saw) use windows server kernel, which is not available e.g. in Windows 10. Thus one still need VM with special version of Windows Server running on it. In fact MS even created special version of Windows Server, which can be run on VM for development purposes free of charge specifically to enable development of Windows-Server based containers. If my understanding is correct, those containers should be possible to run without VM on Windows Server. I should admit, that I never checked it though.
I hope, that this messy explanation may help someone to better understand the topic.
We need a VM to run a docker on the host machine ( this is achieved through the docker toolbox) if it is windows, on Linux we don't even need this. Once we have a docker toolbox container in itself doesn't need a VM, each container has a baseline image which is very minimal and reuses a lot of stuff with the host kernel hence making it lightweight compared to VM. You can run many such container using single host kernel.
I have experimented with packaging my site-deployment script in a Docker container. The idea is that my services will all be inside containers and then using the special management container to manage the other containers.
The idea is that my host machine should be as dumb as absolutely possible (currently I use CoreOS with the only state being a systemd config starting my management container).
The management container be used as a push target for creating new containers based on the source code I send to it (using SSH, I think, at least that is what I use now). The script also manages persistent data (database files, logs and so on) in a separate container and manages back-ups for it, so that I can tear down and rebuild everything without ever touching any data. To accomplish this I forward the Docker Unix socket using the -v option when starting the management container.
Is this a good or a bad idea? Can I run into problems by doing this? I did not read anywhere that it is discouraged, but I also did not find a lot of examples of others doing this.
This is totally OK, and you're not the only one to do it :-)
Another example of use is to use the management container to hande authentication for the Docker REST API. It would accept connections on an EXPOSE'd TCP port, itself published with -p, and proxy requests to the UNIX socket.
As this question is still of relevance today, I want to answer with a bit more detail:
It is possible to work with this setup, where you pass the docker socket into a running container. This is done by many solutions and works well. BUT you have to think about the problems, that come with this:
If you want to use the socket, you have to be root inside the container. This allows the execution of any command inside the container. So for example if an intruder controlls this container, he controls all other docker containers.
If you expose the socket with a TCP Port as sugested by jpetzzo, you will have the same problem even worse, because now you won't even have to compromise the container but just the network. If you filter the connections (like sugested in his comment) the first problem stays.
TLDR;
You could do this and it will work, but then you have to think about security for a bit.