I’m trying to understand the swarm and volumes in scale. I followed the https://docs.docker.com/get-started until the end and everything was clear and working as expected. Now I want to do the same thing (have two nodes, 1 manager, 1 worker) but on my physical devices. Let’s say I have two computers on the same network or two AWS instances.
I tried to set it up between my two computers on the local network, but it seems that the manager’s IP is responding to the manager nodes and worker’s IP to worker nodes. Also, the Redis is not working on the worker’s IP.
I think it should be possible, but I can’t find an article where they are doing a similar thing.
I mean there should be a way, I don’t think that companies are running all their Docker containers on the same server (AWS instances)
P.S.: I'm using all the same files and setups provided in the link above.
I'm using Docker I have implemented a system to deploy environments (on a single server) based on Git branches using Traefik (*.dev.domain.com) and Docker Compose templates.
I like Kubernetes and I've never switched to it since I'm limited to one single server for my infrastructure. I've only used it using local installations (Docker for Windows).
So, my question is: does it make sense to run a Kubernetes "cluster" (master and nodes) on a single server to orchestrate and route containers (in place of Traefik/Rancher/Docker Compose)?
This use is for development and staging only for the moment, so high availability is not a prerequisite.
Thanks.
If it is not a production environment, it doesn't matter how many nodes you are using. So yes, it should be just fine in this case. But make sure all the k8s features you will need in production are available in test/dev, to keep things similar and portable.
AFAIU,
I do not see a requirement for kubernetes unless we are doing below at least for single host using native docker run or docker-compose or docker engine swarm mode -
Make sure there are enough(>=2) replicas of your app in a single server and you are balancing the load across those apps docker containers.
If you want to go bit advanced, we should be able to scale up & down dynamically (docker swarm mode supports this out of the box else use jwilder nginx proxy).
Your deployment should not cause a downtime. Make sure a single container is always healthy at any instant of time while deploying.
Container should auto heal(restart automatically) in case your HTTP or TCP health check fails.
Doing all of the above will certainly put you in a better place but single host is still a single source of failure which you got to deal with at regular intervals.
Preferred : if possible try to start with docker engine swarm mode or kubernetes single master or minikube. This will automatically take care of all the above scenarios out of the box and will also allow you to further scale up anytime by adding more nodes without changing much in your YML files for docker swarm or kubernetes.
Ref -
https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/
https://docs.docker.com/engine/swarm/
I would use single host k8s only if I managed clusters with the same project that I would like to deploy to the said host. This enables you to reuse manifests and all the automation you've created for your clusters.
Have I had single host environments only, I would probably stick to docker-compose.
If you're looking to try it out your easiest options are probably minikube (easy to run single-node cluster locally but without some features) or using one of the free trial accounts for a managed Kubernetes service from one of the big cloud providers (fully-featured and multi-node but limited use before you have to pay).
I'm using docker on a bare metal server. I'm pretty happy with docker-compose to configure and setup applications.
Still some features are missing, like configuration management and monitoring maybe there are other solutions to solve this issues but I'm a bit overwhelmed by the feature set of Kubernetes and can't judge if it would help me here.
I'm also open for recommendations to solve the requirements separately:
Configuration / Secret management
Monitoring of my docker hostes applications (e.g. having some kind of dashboard)
Remot container control (SSH is okay with only one Server)
Being ready to scale my environment (based on multiple different Dockerized applications) to more than one server in future - already thinking about networking/service discovery issues with a pure docker-compose setup
I'm sure Kubernetes covers some of these features, but I have the feeling that it's too much focused on Cloud platforms where Machines are created on the fly (since I only have at most few bare metal Servers)
I hope the questions scope is not too broad, else please use the comment section and help me to narrow down the question.
Thanks.
I think the Kubernetes is absolutely much your requests and it is what you need.
Let's start one by one.
I have the feeling that it's too much focused on Cloud platforms where Machines are created on the fly (since I only have at most few bare metal Servers)
No, it is not focused on Clouds. Kubernates can be installed almost on any bare-metal platform (include ARM) and have many tools and instructions which can help you to do it. Also, it is easy to deploy it on your local PC using Minikube, which will prepare local cluster for you within VMs or right in your OS (only for Linux).
Configuration / Secret management
Kubernates has a powerful configuration and management based on special objects which can be attached to your containers. You can read more about configuration management in that article.
Moreover, some tools like Helm can provide you more automation and range of preconfigured applications, which you can install using a single command. And you can prepare your own charts for it.
Monitoring of my docker hostes applications (e.g. having some kind of dashboard)
Kubernetes has its own dashboard where you can get many kinds of information: current applications status, configuration, statistics and many more. Also, Kubernetes has great integration with Heapster which can be used with Grafana for powerful visualization of almost anything.
Remot container control (SSH is okay with only one Server)
Kubernetes controlling tool kubectl can get logs and connect to containers in the cluster without any problems. As an example, to connect a container "myapp" you just need to call kubectl exec -it myapp sh, and you will get sh session in the container. Also, you can connect to any application inside your cluster using kubectl proxy command, which will forward a port you need to your PC.
Being ready to scale my environment (based on multiple different Dockerized applications) to more than one server in future - already thinking about networking/service discovery issues with a pure docker-compose setup
Kubernetes can be scaled up to thousands of nodes. Or can have only one. It is your choice. Independent of a cluster size, you will get production-grade networking, service discovery and load balancing.
So, do not afraid, just try to use it locally with Minikube. It will make many of operation tasks more simple, not more complex.
My goal is to simulate a cluster environment where I can test my applications and tools.
I need to have minimum of 3 Docker nodes (not container) running, and have access to them over ssh.
I have tried the following:
1 - Installing multiple VMs Machines from Ubuntu MinimalCD
Result: ended up with huge files to maintain, repeating the process is really harmful, and unpleasant.
2- Downloading Vagrant box that has docker inside (there are some here).
Result: I can't access them over ssh, and can't really fire up more than one box (Ok, I can but it is still not optimal).
3- Tried to run "Kitematic" multiple times, but had no success with it.
What do you do to test your clustering tools for docker?
My only "easy" solution is to run multiple instances from some provider and pay per hour usage, but that is really not that easy when I am offline, and when I just don't want to pay.
I don't need to run multiple "containers", but multiple "hosts", which I can then join together into a single Cluster to simulate distributed Data Center.
You could use docker-machine to create a few VMs locally. You can connect to all of them by changing the environment variables.
You might also be interested in something like https://github.com/dnephin/compose-swarm-sandbox/. It creates multiple docker hosts inside containers using https://github.com/dnephin/docker-swarm-slave.
If you are using something other than swarm, you would just remove that service from /srv/.
I would recommend you to use docker-machine for this purpose as they are very light weight and very easy to install, run and manage.
Try creating 3-4 docker machine, pull the swarm image on them and make a cluster and use docker compose to manage the cluster in one go.
Option 2 should be a valid option but what you looked at was to use a VM box using the docker provisionner. I would recommend looking at vagrant docker provider you do not need a vagrant box in this scenario but docker images. The Vagrant file though is still there and you can easily setup your multiple machines from the single Vagrant file
here is a nice blog but I am sure there are plenty of other good articles that explains in detail
I recommend Running CoreOS on Vagrant, it has been designed for your request with cluster enable and 3 instances will be started by default.
With etcd and fleetd, you should be fine to get cluster work properly.
I am trying to wrap my head around CoreOS and I perused their official docs, some random articles, and even watched this excellent presentation by their CTO.
My understanding of CoreOS is that its a stripped down, bare bones Linux distribution that requires anything running on it to be an OCF-compliant container, not just a Docker container.
My understanding of fleet is that its systemd at the cluster level
My understanding of flannel is that its a network layer that is used by both etcd and fleet to route network requests to containers living in the cluster
So first off, if my above assertions are incorrect or misled in any way, please begin by correcting me! Assuming that I'm more or less on track, I have a few concerns here:
What concrete benefit(s) does CoreOS offer Docker-contained apps that is not present with other Linux distros, such as Ubuntu or Debian? In other words, what objective benefits do I gain by going Docker/CoreOS vs. Docker/Ubuntu?
Fleet just seems like a scheduling engine, like Mesos or Kubernetes. Is it a direct competitor to these projects, or do they handle scheduling at different "layers" (different types of responsibilities)? If so, what are these distinctions?
There are a lot of moving parts here. The answer already posted is very good. I think there are going to be opinions in any answer you get. I thought I'd go through your punch list in my attempt at 100 bounty points :-)
I've been using CoreOS/Flannel/Kubernetes/Fleet everyday now for about 6 months. When you posted the url to the introduction I decided to watch it. Wow, great presentation. I think Brandon Philips is a very good teacher. I like the way he built upon each technology as he introduced it. I would recommend that tutorial to anyone.
CoreOS is a linux based operating system. It is very stripped down, nothing extra running. For me, it does these things:
Auto updates. Does this well. Dual partitions, updates non-active, swaps active, falls back (I think, I have never experienced a fallback). The have tackled the 'how to update your operating system after you deploy' issue and made it relatively painless.
systemd init system. This one took me a bit longer to like (being a /etc/init.d guy) but, after a while it grows on you. There is a pretty steep learning curve. Once you get what is going on you will like how systemd keeps the machine running specific things, dependencies, restarts (if you want), listening on sockets (like super initd) and spawning processes, d-bus (although I don't know much about this part yet). systemd lets you specify 'units' and units can have dependencies, pre and post processes, etc.
basic services. I've copied the brief description line from each of the services that are running on my CoreOS system.
systemd - It provides a system and service manager that runs as PID 1 and starts the rest of the system
docker - Docker is an open source project to pack, ship and run any application as a lightweight container
etcd - etcd is a distributed, consistent key-value store for shared configuration and service discovery
sshd - sshd (OpenSSH Daemon) is the daemon program for ssh(1). Together these programs replace rlogin and rsh, and provide secure encrypted communications between two untrusted hosts over an insecure network.
locksmithd - locksmith is a reboot manager for the CoreOS update engine which uses etcd to ensure that only a subset of a cluster of machines are rebooting at any given time. locksmithd runs as a daemon on CoreOS machines and is responsible for controlling the reboot behaviour after updates.
journald - systemd-journald is a system service that collects and stores logging data.
timesyncd - systemd-timesyncd is a system service that may be used to synchronize the local system clock with a remote Network Time Protocol server
update_engine
udevd - systemd-udevd listens to kernel uevents. For every event, systemd-udevd executes matching instructions specified in udev rules. See udev(7).
logind - systemd-logind is a system service that manages user logins.
resolved - systemd-resolved is a system service that manages network name resolution. It implements a caching DNS stub resolver and an LLMNR resolver and responder.
hostnamed - This is a tiny daemon that can be used to control the host name and related machine meta data from user programs.
networkd - systemd-networkd is a system service that manages networks. It detects and configures network devices as they appear, as well as creating virtual network devices.
CoreOS doesn't necessarily require that everything that you want to run must be a container. It will run anything that a unix box will run. yum and apt-get are conspicuously missing, but wget is included. So, you can 'install' programs, libraries, even apt-get via wget and be on your way to polluting the CoreOS base. That wouldn't be good, though. You really do want to keep it pristine. To that end, they include a 'toolbox' which lets you run a container like sandbox to do your work that goes away when you log out of it.
My favorite part of CoreOS is the cloud-config. On first boot you can provide user_data called a cloud-config. It is a yaml file which tells the base CoreOS what to do when it boots the first time. This is where you install things like fleet, flannel, kubernetes, etc. It is a real easy way to get a repeatable install of a combination of your choosing on a VM. In a typical cloud-config I will write configuration files, copy files from other machines to install on the new machine, and create unit files that control the other processes we want CoreOS' systemd to manage (like flannel, fleet, etc). And it is completely repeatable.
Here is another interesting thing about CoreOS. You can modify the dependency and configuration of existing units. For example, CoreOS starts docker. But, I want to modify the startup sequence of docker, so I can add a drop-in configuration that augments the existing system docker configuration. I use this to drop-in the dependency for flannel before docker starts, so I can configure docker to use a flannel provided network. This isn't necessarily CoreOS, but, it does make it all fit together.
I think you can use cloud-config with Ubuntu as well as CoreOS, and you can do the same things. So, I think the benefit you get from CoreOS over Ubuntu would be that you get a new release often, the operating system is auto-updated, and you don't have anything 'extra' running (it's lean, and a reduced attack vector is fallout). CoreOS is tuned for docker (it is already running) and ubuntu doesn't have it already running. Although, you can create a cloud-config file that will make ubuntu run docker... In summary, I think you have CoreOS understood.
Another thing that you can get with CoreOS is support, directly from the company, either paid or unpaid. I have had many questions answered by the people at CoreOS via this forum and CoreOS Dev/CoreOS User Google groups.
Your fleet description is also pretty good. Fleet manages a cluster. A cluster is one or more CoreOS machines. So, if you are going to use fleet you must use CoreOS, I guess this would be another of those benefits of CoreOS over Ubuntu.
Much like how a Unit File for systemd controls running a process on a host, a Unit File for fleetd controls running a process on a cluster. There is a bit of syntactic sugar, but a Unit file for fleet is about the same as a unit file for systemd. They fit very well together. Fleet's unit files are saved in etcd's database, so once ingested the unit is persistent, even if the machine(s) that host the unit service go down, the unit description exists in etcd.
Fleet also has commands for listing my machines in my cluster, listing a unit file, showing the units that are running, etc. Basically you can submit units to run on the cluster (or all machines, or on a specific kind of machine (like with ssd drives), or on the same machine as something else is running (affinity), etc, etc).
Fleet keeps it running. If the machine goes away its units are going to be run on some other machine in the cluster.
In the tutorial you reference Brandon uses Fleet to launch Kubernetes. It is very simple to do. By making the Fleet unit files place Kubernetes on all machines in the fleet cluster, as machines are added and subtracted from the fleet cluster Kubernetes automatically uses that machine and schedules the Kubernetes to run on them. I have run my Kubernetes cluster like this as well. However, I don't do that much anymore. I am sure there is a benefit that I don't see, but, I feel like it is not necessary in my environment. Since I already boot my machines with a cloud-config file, it is trivial to put the Kubernetes node services directly in there. In fact, with cloud-config, if I wanted to use Fleet to boot the Kubernetes stuff, I would have to write the Fleet unit files, start Fleet, submit the unit files I wrote to Fleet, when I could just write a unit file to start the Kubernetes node. But I digress...
Fleet is a scheduling mechanism, just like Kubernetes. However, Fleet can start any executable just like systemd via a unit file, where Kubernetes is geared towards containers. Kubernetes allows definition of:
replication controllers
services
pods
containers
(other stuff as well).
So, the assertion that Fleet is just a different 'layer' of scheduling is a good one. You might add that Fleet schedules different things. In my work I don't use the Fleet layer, I just jump directly to the Kubernetes because I am working only with containers.
Finally, the assertion about flannel is incorrect. Flannel uses etcd for its database. Flannel creates a private network for each host that it routes between them. The flannel network is handed to docker, and docker is told to use that network to assign ip addresses from. So, docker processes that use flannel can communicate with each other over ip. All of the port mapping stuff can be skipped since each container gets its own ip address. These docker processes can communicate infra and intra machine on the flannel network. I could be wrong, but I don't think there is any connection between Fleet and flannel. Also, I don't think etcd or Fleet use flannel to route their data. Etcd and Fleet route whether or not flannel is being used. Docker containers route their traffic over flannel.
-g
Yes, your understanding is pretty much correct.
Coreos is designed as a more secure operating system that autoupdates itself by default and runs the bare minimum in services to lessen any attack vector. http://www.activestate.com/blog/2013/08/alex-polvi-explains-coreos
Everything needs to either run in a container, be statically compiled (golang binaries as an example), or be a shell script. There is no python or ruby installed.
Containers/systemd units started by Fleet can be rescheduled on another node should its server fail (assuming you container is ephemeral) and should keep the requested number of instances running over the cluster, whilst obeying deployment constraints. https://coreos.com/using-coreos/clustering/
Mesos is more of a framework for a scheduler, you still need something else (chronos/marathon) to provide jobs to execute, but it is very flexible in that regard and handles utilising server resources better.
I don't have much experience with Flannel, but the new networking plugins coming in a future version of Docker may give you more options for container networking. http://blog.docker.com/2015/06/networking-receives-an-upgrade/
what objective benefits do I gain by going Docker/CoreOS vs.
Docker/Ubuntu?
Technical benefits
The features that attract me to CoreOS are:
It's a cluster, not a single-machine, OS
It's built with failure in mind
It's self-updating
CoreOS is a cluster, while Ubuntu is a single machine. With CoreOS when the machine a container is on disappears, the cluster starts the container somewhere else. When that Ubuntu server fails, its containers go down with it. CoreOS allows the machine to be disposable, which is a good thing.
With that said, keep in mind CoreOS does not handle data persistence; data stored in a container does not exist! ;) In my case I dynamically attach EBS volumes where needed.
Design benefits
To me more importantly, the technical benefits above bring along design benefits. Going into a system designing knowing, "this process will randomly disappear," is great for building resiliency. From the beginning, services are stateless and because you literally have no idea what system a dependent service is on, they must also be discoverable. CoreOS's etcd, a distributed configuration store, can be used to discover where a service is located. Finally, because processes may not be on the same machine, network-accessible services -- a must for horizontally-scalable systems -- are the only way to go.
All in all, I find CoreOS a great for building Twelve-Factor Apps and you get Chaos Monkey for free.
Fleet just seems like a scheduling engine, like Mesos or Kubernetes.
Is it a direct competitor to these projects, or do they handle
scheduling at different "layers" (different types of
responsibilities)? If so, what are these distinctions?
Yes, Fleet schedules a container and determines where in a cluster it runs. If that machine disappears, Fleet also takes responsibility for re-launching it on a working machine.
I haven't taken a deep dive into Kubernetes, but there does appear to be overlap. The way I understand it thus far is that Fleet handles running a single container (a "unit"), while Kubernetes is complementary and orchestrates multiple units comprising a system. For example, Fleet ensures Postgres stays running; Kubernetes ensures your application, e.g. comprised of Postgres, Redis, and Django, are all humming away.