How services are distributed in Docker Swarm

Can I somehow configure how the manager node distributes services in Docker Swarm? I thought it would look at the free resources on the worker nodes and schedule new services onto the least loaded node.
Currently I have a problem: a service is scheduled onto a node that is already full (90% RAM) and becomes laggy, while at the same time the second node runs only a few services and could easily handle another one.
docker node ls
ID                            HOSTNAME        STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
wdkklpy6065zxckxyuj000ei4 *   docker-master   Ready    Drain          Leader           18.09.6
sk45rol2whdr5eh2jqozy0035     docker-node01   Ready    Active         Reachable        18.09.6
o4zwwbwwcrbwo4tsd00pxkfuc     docker-node02   Ready    Active                          18.09.6
Now I have 36 (very similar) services; 28 run on docker-node01 and 8 on docker-node02. I thought the ideal state would be 18 services on each node.
Both docker nodes have identical specs.
How does Docker Swarm decide where to run a service? What algorithm does it use?
Is it possible to change/update the algorithm used for selecting a node?

According to the swarmkit project README, the only available strategy is spread, so it schedules tasks onto the least loaded nodes.
Note that the swarm won't move tasks around to maintain this strategy, so if you added node02 after node01 was already full, node02 will remain mostly empty. You could drain both nodes and then reactivate them to see if the load distributes better.
You can find a more detailed description of the scheduling algorithm in the project documentation: scheduling-algorithm
For the older standalone Swarm manager this strategy was configurable:
https://docs.docker.com/swarm/reference/manage/#--strategy--scheduler-placement-strategy
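As a rough sketch of the drain-and-reactivate idea above, plus the usual way to nudge an existing service to reschedule (docker-node01 comes from the question; my_service is a placeholder name):
# Temporarily drain the overloaded node so its tasks reschedule onto other nodes:
docker node update --availability drain docker-node01
# Once the tasks have moved, let the node accept work again:
docker node update --availability active docker-node01
# Alternatively, force a single service's tasks to be rescheduled so the
# spread strategy can pick the currently least loaded nodes:
docker service update --force my_service
Forcing an update restarts the service's tasks, so expect a brief disruption unless you run multiple replicas.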

I also found https://docs.docker.com/swarm/scheduler/strategy/, which explains a lot about Docker Swarm strategies.

Related

How to handle when leader node goes down in docker swarm

I have two Docker nodes running in a swarm as shown below. The second node I promoted to act as a manager.
imb9cmobjk0fp7s6h5zoivfmo *   Node1   Ready   Active   Leader      19.03.11-ol
a9gsb12wqw436zujakdpbqu5p     Node2   Ready   Active   Reachable   19.03.11-ol
This works fine when the leader node goes into drain/pause. But as part of my test I stopped the Node1 instance, and then I got the error below when trying to list the nodes (docker node ls) and the running services (docker service ls) from the second node.
Error response from daemon: rpc error: code = Unknown desc = The swarm does not have a leader. It's possible that too few managers are online. Make sure more than half of the managers are online
Also, none of the containers that were running on Node1 came up on Node2 after I stopped the instance; only the existing processes kept running. My expectation was that after stopping the Node1 instance, the processes that had been running on Node1 would move to Node2. This works fine when a node goes into drain status.
The Raft consensus algorithm fails when it can't find a clear majority.
This means: never run with 2 manager nodes, because one node going down leaves the other with 50% - which is not a majority, so quorum cannot be reached.
In fact, avoid even numbers in general, especially when splitting managers between availability zones, as a zone split can leave you with a 50/50 partition - again no majority, no quorum, and a dead swarm.
So the valid numbers of swarm managers to try are generally 1, 3, 5, or 7. Going higher than 7 generally reduces performance and doesn't help availability.
1 should only be used if you are running a 1 or 2 node swarm, and in those cases loss of the manager node equates to loss of the swarm anyway.
3 managers is really the minimum you should aim for. If you only have 3 nodes, then prefer to run workloads on the managers rather than running 1 manager and 2 workers.
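As a hedged sketch of getting from the two-node setup in the question to 3 managers (node3 is an assumed extra host; Node2 is already a manager), you could join a third node and promote it:
# On the new third host, join as a worker using the token printed by
# "docker swarm join-token worker" on an existing manager, then, from a manager:
docker node promote node3
# Verify: with 3 managers, the quorum (majority) is floor(3/2) + 1 = 2,
# so the swarm survives the loss of any single manager.
docker node ls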

Kubernetes - is it ok to have 1 Control Plane and 1 worker Node for dev/test purposes?

We need to initiate a Kubernetes cluster and start our development.
Is it OK to have 1 control plane (master) node and 1 worker node running our containers to start development?
We can afford for services to be unavailable during upgrades, scaling and so on; I'm just worried that I'm missing some more important consideration.
I was planning to have 8 CPUs and 64 GB of RAM, since those are similar to the resources we have on one of our non-containerized VMs running the same apps.
We will deploy the cluster with Azure Kubernetes Service.
Thank you
Sure, you can also have single node clusters. Just as you said, that means if one node goes down, the cluster is unavailable.
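For illustration only, a minimal AKS dev cluster along the lines described might be created like this (resource group name, cluster name, region, and VM size are assumptions; Standard_E8s_v3 gives 8 vCPUs and 64 GiB, and the control plane itself is managed by Azure, so --node-count only covers worker nodes):
az group create --name dev-rg --location westeurope
az aks create \
  --resource-group dev-rg \
  --name dev-aks \
  --node-count 1 \
  --node-vm-size Standard_E8s_v3 \
  --generate-ssh-keys
az aks get-credentials --resource-group dev-rg --name dev-aks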

What happens if master node dies in kubernetes? How to resolve the issue?

I've started learning Kubernetes with Docker and I've been wondering what happens if the master node dies or fails. I've already read the answers here, but they don't explain the remedy.
Who is responsible for bringing it back? And how do you bring it back? Can there be a backup master node to avoid this? If yes, how?
Basically, I'm asking a recommended way to handle master failure in kubernetes setup.
You should have multiple VMs serving as master nodes to avoid a single point of failure. An odd number of 3 or 5 master nodes is recommended for quorum. Put a load balancer in front of all the VMs serving as master nodes; it can balance traffic across them, and if one master node dies the load balancer should mark its IP as unhealthy and stop sending traffic to it.
Also, the etcd cluster is the brain of a Kubernetes cluster, so you should have multiple VMs serving as etcd nodes. Those VMs can be the same VMs as the master nodes, or, for a reduced blast radius, you can have separate VMs for etcd. Again, the number of VMs should be an odd number such as 3 or 5. Make sure to take periodic backups of the etcd nodes' data so that you can restore the cluster to a previous state in case of a disaster.
Check the official docs on how to install an HA Kubernetes cluster using kubeadm.
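As a hedged example of the periodic etcd backup mentioned above (the endpoint and certificate paths assume a default kubeadm-provisioned stacked etcd; adjust them for your setup):
# Take a snapshot of etcd and write it to a dated file:
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save /backup/etcd-snapshot-$(date +%F).db
Running this from cron (and shipping the snapshots off the node) gives you the periodic backups the answer recommends.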
In short, Kubernetes needs its master nodes to function properly at all times. There are different methods to replicate the master node so it remains available on failure. As an example, check this: https://kubernetes.io/docs/tasks/administer-cluster/highly-available-master/
Abhishek, you can run the master node in high availability; as a first step you should set up the control plane (aka master node) behind a load balancer. If you plan to upgrade a single control-plane kubeadm cluster to high availability, you should specify --control-plane-endpoint to set the shared endpoint for all control-plane nodes. Such an endpoint can be either a DNS name or the IP address of a load balancer.
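For illustration, bootstrapping the first control-plane node with such a shared endpoint might look like this (cluster-endpoint.example.com is a placeholder DNS name pointing at your load balancer):
# Initialize the first control-plane node behind the load balancer endpoint:
kubeadm init \
  --control-plane-endpoint "cluster-endpoint.example.com:6443" \
  --upload-certs
# Additional control-plane nodes then join using the "kubeadm join ... --control-plane"
# command printed in the init output.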
By default, for security reasons, the master node does not host pods. If you want to allow pods to be scheduled on the master node, you can remove its taint with the following command:
kubectl taint nodes --all node-role.kubernetes.io/master-
If you want to manually restore the master, make sure you back up the etcd data directory, /var/lib/etcd. You can restore it on the new master and it should work. Read about highly available Kubernetes here.
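A rough sketch of the directory-level backup this answer describes (it assumes the default /var/lib/etcd path and that etcd is not writing while you copy; an etcdctl snapshot, as shown earlier, is the more commonly documented route):
# On the old master, archive the etcd data directory:
tar czf etcd-data-backup.tar.gz /var/lib/etcd
# On the replacement master, stop the control-plane components, unpack the archive
# back to /var/lib/etcd, then start the control plane again:
tar xzf etcd-data-backup.tar.gz -C /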

Backup Docker Swarm - How many Manager Nodes Required

In the official Docker docs there is a statement (quoted below) that looks confusing to me. From my understanding, don't we only need to pick any one of the healthy manager nodes to back up for future restoration purposes?
"You must perform a manual backup on each manager node, because logs contain node IP address information and are not transferable to other nodes. If you do not backup the raft logs, you cannot verify workloads or Swarm resource provisioning after restoring the cluster."
Link: https://docs.docker.com/ee/admin/backup/back-up-swarm/
It depends on how you want to recover. If you want to restore a specific node, you need a backup from that node.
If you are rebuilding your swarm cluster from an old backup, then you only need one healthy node's backup. See the following guide for performing a backup and restore:
https://docs.docker.com/engine/swarm/admin_guide/#back-up-the-swarm
If you restore the cluster from a single node's backup, you will need to reset and rejoin the swarm on the other managers, since the restore leaves you with a single-node cluster. What is restored in that scenario are the services, stacks, and other definitions, but not the nodes.
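As a hedged sketch of that backup/restore flow (it assumes the default /var/lib/docker data root and a systemd-managed Docker engine; see the linked admin guide for the authoritative steps):
# Backup, run on a manager node:
systemctl stop docker
tar czf swarm-backup.tar.gz /var/lib/docker/swarm
systemctl start docker
# Restore on the new manager:
systemctl stop docker
rm -rf /var/lib/docker/swarm
tar xzf swarm-backup.tar.gz -C /
systemctl start docker
docker swarm init --force-new-cluster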

Limit starting containers after complete shutdown

We have a bare metal Docker Swarm cluster, with a lot of containers.
Recently we had a complete shutdown of the physical server.
The main problem happened on Docker startup, when all containers tried to start at the same time.
I would like to know if there is a way to limit the number of containers starting at once.
Or if there is another way to avoid overloading the physical server.
At present, I'm not aware of any ability to limit how fast swarm mode will start containers. There is a todo entry to add an exponential backoff in the code, and various open issues in swarmkit, e.g. 1201, that may eventually help with this scenario. Ideally you would have an HA cluster with nodes spread across different AZs, so that when one node fails the workload migrates to another node and you don't end up with a single overloaded node.
What you can use are resource constraints. You can configure each service with a minimum CPU and memory reservation. This would prevent swarm mode from scheduling more containers on a node than it could handle during a significant outage. The downside is that some services may go unscheduled during an outage and you cannot prioritize which are more important to schedule.
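A minimal sketch of such reservations using the docker CLI (service name, image, and values are placeholders to adjust to your workloads):
# Reserve CPU/memory so swarm won't pack more tasks onto a node than it can hold:
docker service create \
  --name web \
  --reserve-cpu 0.5 \
  --reserve-memory 256M \
  --limit-memory 512M \
  nginx:alpine
# Or add a reservation to an existing service:
docker service update --reserve-memory 256M web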
