Docker Swarm scheduling strategies

Docker Swarm scheduling strategies - docker

How can I make Docker Swarm distribute containers based on CPU load and not equally across all nodes (right now it seems to be doing the latter)?

Related

What is a cluster and a node oriented to containers?

Sorry for this question, but I just started with Docker and Docker Compose and I really didn't need any of this until I read that I need to use Docker Swarn or Kuebernetes to have more stability in production. I started reading about Docker Swarn and they mentioned nodes and clusters.
I was really happy not knowing about this as I understood docker-compose:
Is that I could manage my services/containers from a single file
and only have to run several commands to launch, build, delete, etc.
all my services based on the docker-compose configuration.
But now the nodes and cluster have come out and I've really gone a bit crazy, and that's why if you can help me understand this next step in the life of containers. I've been googling and it's not very clear to me.
I hope you can help me and explain it to me in a way that I can understand.
Thank you!

A node is just a physical or virtual machine.
In Kubernetes/Docker Swarm context each node must have the relevant binaries installed (Docker Engine, kubelet etc..)
A cluster is a grouping of one or more nodes.
If you have just been testing on your local machine you have a single node.
If you were to add a second machine and link both machines together using docker swarm/kubernetes then you would have created a 2 node cluster
You can then use docker swarm/kubernetes to run your services/containers on any or all nodes in your cluster. This allows your services to be more resilient and fault tolerant.

By default Docker Compose runs a set of containers on a single system. If you need to run more containers than fit on one system, or you're just afraid of that system crashing, you need more than one system to do it. The cluster is the group of all of the systems (physical computers, virtual machines, cloud instances) that are working together to run the containers. Each of those individual systems is a node.
The other important part of the cluster container setups is that you can generally run multiple replicas of a give container, and you don't care where in the cluster they run. Say you have five nodes, and a Web server container, and you'd like to run three copies of it for redundancy. Instead of having to pick a node, ssh to it, and manually docker run there, you just tell the cluster manager "run me three of these", and it chooses a node and launches the container for you. You can also scale the containers up and down at runtime, or potentially set the cluster to do the scaling on its own based on load.
If your workload is okay running a single copy of containers on a single server, you don't need a cluster setup. (You might have some downtime during updates or if the single server dies.) Swarm has the advantages of being bundled with Docker and being able to use Docker-native tools (docker-compose can deploy to a Swarm cluster). Kubernetes is much more complex, but at this point most public cloud providers will sell you a preconfigured Kubernetes cluster, and it has better stories around security, storage management, and autoscaling. There are also a couple other less-prominent alternatives like Nomad and Mesos out there.

Difference between Minikube, Kubernetes, Docker Compose, Docker Swarm, etc

I am new to cluster container management, and this question is the basis for all the freshers over here.
I read some documentation, but still, my understanding is not too clear, so any leads.. helping to understand?
Somewhere it is mentioned, Minikube is used to run Kubernetes locally. So if we want to maintain cluster management in my four-node Raspberry Pi, then Minikube is not the option?
Does Minikube support only a one-node system?
Docker Compose is set of instructions and a YAML file to configure and start multiple Docker containers. Can we use this to start containers of the different hosts? Then for simple orchestration where I need to call container of the second host, I don't need any cluster management, right?
What is the link between Docker Swarm and Kubernetes? Both are independent cluster management. Is it efficient to use Kubernetes on Raspberry Pi? Any issue, because I was told that Kubernetes in single node takes the complete memory and CPU usage? Is it true?
Is there other cluster management for Raspberry Pi?
I think this 4-5 set will help me better.

Presuming that your goal here is to run a set of containers over a number of different Raspberry Pi based nodes:
Minikube isn't really appropriate. This starts a single virtual machine on a Windows, MacOS or Linux and installs a Kubernetes cluster into it. It's generally used by developers to quickly start-up a cluster on their laptops or desktops for development and testing purposes.
Docker Compose is a system for managing sets of related containers. So for example if you had a web server and database that you wanted to manage together you could put them in a single Docker Compose file.
Docker Swarm is a system for managing sets of containers across multiple hosts. It's essentially an alternative to Kubernetes. It has fewer features than Kubernetes, but it is much simpler to set up.
If you want a really simple multi-node Container cluster, I'd say that Docker swarm is a reasonable choice. If you explicitly want to experiment with Kubernetes, I'd say that kubeadm is a good option here. Kubernetes in general has higher resource requirements than Docker Swarm, so it could be somewhat less suited to it, although I know people have successfully run Kubernetes clusters on Raspberry Pis.

Docker Compose
A utility to to start multiple docker containers on a single host using a single docker-compose up. This makes it easier to start multiple containers at once, rather than having do mutliple docker run commands.
Docker swarm
A native container orchestrator for Docker. Docker swarm allows you to create a cluster of docker containers running on multiple machines. It provides features such as replication, scaling, self-healing i.e. starting a new container when one dies ...
Kubernetes
Also a container orchestrator. Kubernetes and Docker swarm can be considered as alternatives to one another. They both try to handle managing containers starting in a cluster
Minikube
Creating a real kubernetes cluster requires having multiple machines either on premise or on a cloud platform. This is not always convenient if someone is just new to Kubernetes and trying to learn by playing around with Kubernetes. To solve that minikube allows you to start a very basic Kubernetes cluster that consists of a single VM on you machine, which you can use to play around with Kubernetes.
Minikube is not for a production or multi-node cluster. There are many tools that can be used to create a multi-node Kubernetes cluster such as kubeadm

Containers are the future of application deployment. Containers are smallest unit of deployment in docker. There are three components in docker as docker engine to run a single container, docker-compose to run a multi-container application on a single host and docker-swarm to run multi-container application across hosts which also an orchestration tool.
In kubernetes, the smallest unit of deployment is Pod(which is composed of multiple container). Minikube is a single node cluster where you can install it locally and try, test and feel the kubernetes features locally. But, you can't scale this to more than a single machine. Kubernetes is an orchestration tool like Docker Swarm but more prominent than Docker Swarm with respect to features, scaling, resiliency, and security.
You can do the analysis and think about which tool will be fit for your requirements. Each one having their own pros or cons like docker swarm is good and easy to manage small clusters whereas kubernetes is much better for larger once. There is another orchestration tool Mesos which is also popular and used in largest size clusters.
Check this out, Choose your own Adventure but, it's just a general analogy and only to understand because all the three technologies are evolving rapidly.

I get the impression you're mostly looking for confirmation and am happy to help with that if I can.
Yes, minikube is local-only
Yes, minikube is intended to be single-node
Docker-compose isn't really an orchestration system like swarm and Kubernetes are. It helps with running related containers on a single host, but it is not used for multi-host.
Kubernetes and Docker Swarm are both container orchestration systems. These systems are good at managing scaling up, but they have an overhead associated with them so they're better suited to multi-node.
I don't know the range of orchestration options for Raspberry Pi, but there are Kubernetes examples out there such as Build Your Own Cloud with Kubernetes and Some Raspberry Pi.

For Pi, you can use Docker Swarm Mode on one or more Pi's. You can even run ARM emulation for testing on Docker for Windows/Mac before trying to get it all working directly on a Pi. Same goes for Kubernetes, as it's built-in to Docker for Windows/Mac now (no minikube needed).
Alex Ellis has a good blog on Pi and Docker and this post may help too.

I've been playing around with orchestrating Docker containers on a subnet of Raspberry Pis (3Bs).
I found Docker-swarm easiest to set up and work with, and adequate for my purposes. Guide: https://docs.docker.com/engine/swarm/swarm-tutorial/
For Kubernetes there are two main options; k3s and microk8s. Some guides:
k3s
https://bryanbende.com/development/2021/05/07/k3s-raspberry-pi-initial-setup
microk8s
https://ubuntu.com/tutorials/how-to-kubernetes-cluster-on-raspberry-pi#1-overview

Can Docker-Swarm run in fail-over-mode only?

I am facing a situation, where I need to run about 100 different applications in Docker containers. It is not reasonable to run all 100 containers on one hardware, so I need to spread the applications over several machines.
As far as I understood, docker-swarm is for scaling only, which means when I run my containers in a swarm, than all 100 containers will automatically be deployed and started on every node of my docker-swarm. But this is not what I am searching for. I want to split the applications and for example run 50 on node1 and 50 on node2.
Question 1:
Is there a way to configure docker-swarm in a way, where my applications will be automatically dispatched on the node which has the most idling resources?
Question 2:
Is there a kind of fail-over-mode in docker swarm which can stop a container on one node and try to start it on on another in case something goes wrong?

all 100 containers will automatically be deployed and started on every node of my docker-swarm
This is not true. When you deploy 100 containers in a swarm, the containers will be distributed on the available nodes in the swarm. You will mostly get an even distribution of containers on all nodes.
Question 1: Is there a way to configure docker-swarm in a way, where my applications will be automatically dispatched on the node which has the most idling resources?
Docker swarm does not check the available resources (memory, cpu ...) available on the nodes before deploying a container on it. The distribution of containers is balanced per nodes, without taking into account the availability of resources on each node.
You can however build a strategy of distributing container on the nodes. You can use placement constraints were you can restrict where a container can be deployed. You can label nodes having a lot of resources and restrict some heavy containers to only run on these nodes.
Question 2: Is there a kind of fail-over-mode in docker swarm which can stop a container on one node and try to start it on on another in case something goes wrong?
If a container crashes, docker swarm will ensure that a new container is started. Again, the decision of what node to deploy the new container on cannot be predetermined.

Difference between Docker container and service

I'm wondering whether there are any differences between the following docker setups.
Administrating two separate docker engines via the remote api.
Administrating two docker swarm nodes via one single docker engine.
I'm wondering if you can administrate a swarm with the ability run a container on a specific node are there any use cases to have separate docker engines?

The difference between the two is swarm mode. When a docker engine is running services in swarm mode you get:
Orchestration from the manager to continuously try to correct any differences between the current state and the target state. This can also include HA using the quorum model (as long as a majority of the managers are reachable to make decisions).
Overlay networking which allows containers on different hosts to talk to each other on their own container network. That can also involve IPSEC for security.
Mesh networking for published ports and a VIP for the service that doesn't change like container IP's do. The latter prevents problems from DNS caching. And the former has all nodes in the swarm publish the port and routes traffic to a container providing this service.
Rolling upgrades to avoid any downtime with replicated services.
Load balancing across multiple nodes when scaling up a service.
More details on swarm mode are available from docker's documentation.
The downside of swarm mode is that you are one layer removed from the containers when they run on a remote node. You can't run an exec command on a task to investigate a container, you need to do that on a container and be on the node it's currently using. Docker also removed some options from services like --volumes-from which don't apply when containers may be running on different machines.
If you think you may grow beyond running containers on a single node, need to communicate between the containers on different nodes, or simply want the orchestration features like rolling upgrades, then I would recommend swarm mode. I'd only manage containers directly on the hosts if you have a specific requirement that prevents swarm mode from being an option. And you can always do both, manage some containers directly and others as a service or stack inside of swarm, on the same nodes.

How does Docker Swarm load balance?

I have a cluster of 10 Swarm nodes started via docker swarm join command
If i want to scale a docker instance to 15 via
docker service create --replicas 15
how does docker swarm know where to start the container?
is it round-robin or does it take into consideration of compute resource (how much cpu/mem being used)?

When you create a service or scale it in the Swarm mode, scheduler on Elected Leader (one of the managers) will choose a node to run the service on. There are 3 strategies currently available to the leader.
spread
binpack
random
The spread and binpack strategies compute rank according to a node’s available CPU, its RAM, and the number of containers it has. The random strategy uses no computation. It selects a node at random and is primarily intended for debugging.
Under the spread strategy, Swarm optimizes for the node with the least number of containers. The binpack strategy causes Swarm to optimize for the node which is most packed.
Swarm uses spread by default.
Keep in mind you can Constraint Containers on specific nodes too.
It's not possible to set strategies in docker version 1.12.1 (Latest release to posting date)

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart