I am playing with a multi-node Docker swarm in the cloud. I set up a 4-node swarm with 2 managers (one leader, the other a reachable manager) and 2 worker nodes. While reading the docs, I found out that we should choose an odd number of manager nodes, like 1, 3, and so on. I am not sure what the technical reasoning behind this recommendation is.
This is related to how consensus across managers is determined when maintaining cluster consistency during an outage. See Raft consensus in swarm mode.
The algorithm used to derive consensus for a cluster of N managers requires a majority, (N/2)+1, of them to agree. With a cluster of 2 managers you would actually be reducing reliability, because if either of them goes down the remaining one cannot form a majority and is unable to do anything. In general, having an even number of managers provides no fault-tolerance benefit over having one fewer.
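To make the arithmetic concrete, here is a quick sketch (the failure counts just apply the majority rule above; `docker node ls` is the standard way to check manager status):

```
# Quorum = majority of managers = floor(N/2) + 1
#   N = 1 -> quorum 1 -> tolerates 0 failures
#   N = 2 -> quorum 2 -> tolerates 0 failures (two machines, still no tolerance)
#   N = 3 -> quorum 2 -> tolerates 1 failure
#   N = 4 -> quorum 3 -> tolerates 1 failure (no better than 3)

# Check which managers are currently Leader / Reachable / Unreachable
docker node ls --filter "role=manager"
```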
I am new to Docker swarm. I have read the documentation and googled the topic, but the results were vague.
Is it possible to add worker or manager nodes from distinct and separate virtual private servers?
The idea is to connect many unrelated hosts into a swarm, which then gives distribution over many systems and resiliency in case of any hardware failures. The only things you need to watch out for are that the internet connection between the hosts is stable and that all of the ports required by the official documentation are open. Then you are good to go :)
Oh, and between the managers you want a VERY stable connection without random ping spikes, or you may encounter weird behaviour (because of Raft consensus and decision making).
Other than that, it works well.
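For reference, a minimal sketch of the join flow across separate VPSs (the IP address is a placeholder; the ports are the ones the swarm networking documentation asks you to open):

```
# On the first VPS, initialise the swarm and advertise its reachable IP
docker swarm init --advertise-addr 203.0.113.10

# Print the join commands (including tokens) for workers and managers
docker swarm join-token worker
docker swarm join-token manager

# On every other VPS, run the printed command, e.g.:
docker swarm join --token <token> 203.0.113.10:2377

# Ports that must be open between all hosts:
#   2377/tcp      - cluster management traffic
#   7946/tcp+udp  - node-to-node communication
#   4789/udp      - overlay network (VXLAN) traffic
```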
Refer to Administer and maintain a swarm of Docker Engines
In production, the best practice to maximise swarm HA is to spread your swarm managers across multiple availability zones. Availability zones are geographically co-located but distinct sites: instead of having a single London data centre, have three, each connected to a different internet and power utility. That way, if any single ISP or power utility has an outage, you still have two data centres connected to the internet.
Swarm was designed with this kind of highly available topology in mind and can scale to having its managers and workers distributed across nodes in different data centres.
However, swarm is sensitive to latency over longer distances, so global distribution is not a good idea. Within a single city, data centre to data centre latencies will be in the low tens of milliseconds, which is fine.
Connecting data centres in different cities or continents moves the latency to the low-to-mid hundreds of milliseconds, which does cause problems and leads to instability.
Otherwise, go ahead and build your swarm across AZ-distributed nodes.
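If you label each node with its availability zone, you can also ask swarm to spread a service's replicas across zones. A sketch, assuming a node label called `az` and example node/service names:

```
# Label each node with the zone it runs in
docker node update --label-add az=london-1 node1
docker node update --label-add az=london-2 node2
docker node update --label-add az=london-3 node3

# Spread the service's replicas evenly across the az labels
docker service create --name web --replicas 6 \
  --placement-pref 'spread=node.labels.az' nginx
```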
There are a lot of articles online about running an Elasticsearch multi-node cluster using docker-compose, including the official documentation for Elasticsearch 8.0. However, I cannot find a reason why you would set up multiple nodes on the same docker host. Is this the recommended setup for a production environment? Or is it an example of theory in practice?
You shouldn't consider this a production environment. The guides are examples, often for lab environments and for testing scenarios with the application. I would not consider them production ready, and Compose is often not considered a production grade tool, since everything it does targets a single Docker node, whereas in production you typically want multiple nodes spread across multiple availability zones.
Since a single ES node's heap should never be given more than half the available memory (and should stay below ~30.5 GB), one reason it makes sense to have several nodes on a given host is when you have hosts with ample memory (say 128 GB+). In that case you could run 2 ES nodes (each with 64 GB of memory: 30.5 GB for the heap and the rest for Lucene) on the same host by correctly constraining each Docker container.
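A sketch of what that constraining could look like with plain `docker run`, assuming a 128 GB host and the official 8.0 image (container names and port mappings are illustrative, and cluster discovery/security settings are omitted):

```
# Two ES nodes on the same host, each capped at 64 GB of RAM and given a
# ~30 GB JVM heap, leaving the rest of each allotment to Lucene / page cache.
docker run -d --name es-node-1 \
  --memory=64g \
  -e ES_JAVA_OPTS="-Xms30g -Xmx30g" \
  -p 9200:9200 -p 9300:9300 \
  docker.elastic.co/elasticsearch/elasticsearch:8.0.0

docker run -d --name es-node-2 \
  --memory=64g \
  -e ES_JAVA_OPTS="-Xms30g -Xmx30g" \
  -p 9201:9200 -p 9301:9300 \
  docker.elastic.co/elasticsearch/elasticsearch:8.0.0
```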
Note that the above is not specific to Docker; you can always run several nodes per host, whether you use Docker or not.
Regarding production, and given that 2+ nodes would run on the same host: if you lose that host, you lose two nodes, which is not good. However, depending on how many hosts you have, it may be less of a problem, provided that each host is in a different availability zone and you have the appropriate cluster/shard allocation awareness settings configured, which ensure that your data is redundantly copied across 2+ availability zones. In that case, losing a host (2 nodes) would still keep your cluster running, although in degraded mode.
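As a sketch of those awareness settings, assuming a node attribute named `zone` (the official image maps environment variables like these onto Elasticsearch settings; the names and values are examples):

```
# Tag each node with the zone its host lives in and enable allocation awareness
docker run -d --name es-zone-a \
  -e node.attr.zone=zone-a \
  -e cluster.routing.allocation.awareness.attributes=zone \
  docker.elastic.co/elasticsearch/elasticsearch:8.0.0

docker run -d --name es-zone-b \
  -e node.attr.zone=zone-b \
  -e cluster.routing.allocation.awareness.attributes=zone \
  docker.elastic.co/elasticsearch/elasticsearch:8.0.0
```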
It's worth noting that Elastic Cloud Enterprise (which powers Elastic Cloud) is designed to run several nodes per host depending on the sizing of the nodes and the available hardware. You can find more info on the hardware prerequisites, as well as on how medium and large scale deployments make use of one or more large 256 GB hosts per availability zone.
We are currently operating a backend stack in Central Europe, Japan and Taiwan and are preparing our stack to transition to Docker swarm.
We are working with real-time data streams from sensor networks to provide fast disaster warnings, which means that latency is critical for some services. Therefore, we currently have brokers (RabbitMQ) running on dedicated servers in each region, as well as a backend instance digesting the data that is sent across these brokers.
I'm uncertain how best to achieve a comparable topology using Docker swarm. Is it possible to group nodes, let's say by region, and then deploy a latency-critical service stack to each of these groups? Or should I create separate swarms for each region (which feels conceptually contradictory to Docker swarm)?
The swarm managers should be in a low latency zone. Swarm workers can be anywhere. You can use a node label to indicate the location of the node, and restrict your workloads to a particular label as needed.
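A sketch of that label-and-constrain pattern, with example node names, a `region` label and a placeholder image:

```
# Tag each node with its region
docker node update --label-add region=eu node-eu-1
docker node update --label-add region=jp node-jp-1
docker node update --label-add region=tw node-tw-1

# Pin a latency-critical service to the Japanese nodes
docker service create --name ingest-jp \
  --constraint 'node.labels.region == jp' \
  your-ingest-image
```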
Latency on the container-to-container network across large regional boundaries may also be relevant, depending on your required data path. If the only latency-critical data path is to the RabbitMQ service that is external to the swarm, then you won't need to worry about container-to-container latency.
It is also a valid pattern to have one swarm per region. If you need to be able to lose any region without impacting services in another region, then you'd want to split it up. If you have multiple low-latency regions, then you can spread the manager nodes across those.
Let's say we have a test setup of 10 nodes: 4 managers and 6 workers.
When the leader manager fails, the other 3 managers will choose another manager as leader.
When this leader fails as well, only 2 of the original 4 managers are left. The remaining managers then say:
Error response from daemon: rpc error: code = Unknown desc = The swarm does not have a leader. It's possible that too few managers are online. Make sure more than half of the managers are online.
Because no more than half of the managers are left, they are not able to choose a new leader, even though 2 managers of the cluster are still up.
My question is: what is the point of this rule? The cluster is left without a leader and cannot be managed any more until additional managers are added, even though 2 managers are still available.
Why should I choose the worker role for nodes at all? What advantages are there to having worker nodes? Managers also act as workers by default; workers only have the disadvantage that they cannot take over when manager nodes fail.
Docker recommends using an odd number of manager nodes, so your initial setup of 4 managers is only as good as having 3 manager nodes. Since you are losing 2 managers, it would be better to start with 5, which tolerates 2 failures. Also, isn't there a more serious issue to be addressed in the way you are running the cluster? Losing that many nodes is not a good sign.
If the swarm loses the quorum of managers, the swarm cannot perform management tasks. If your swarm has multiple managers, always have more than two. To maintain quorum, a majority of managers must be available. An odd number of managers is recommended, because the next even number does not make the quorum easier to keep. For instance, whether you have 3 or 4 managers, you can still only lose 1 manager and maintain the quorum. If you have 5 or 6 managers, you can still only lose two.
Having dedicated worker nodes makes sure that they won't participate in the Raft distributed state, make scheduling decisions, or serve the swarm mode HTTP API. So the complete compute power of these nodes is dedicated specifically to running containers.
because manager nodes use the Raft consensus algorithm to replicate data in a consistent way, they are sensitive to resource starvation
The quotes above are taken from the official Docker documentation.
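Two standard CLI commands are worth knowing in this context (the address and node name are placeholders): `--force-new-cluster` recovers a swarm that has lost quorum, and draining a manager keeps it from running service tasks so Raft is not starved of resources.

```
# Recover a swarm that has lost quorum: run on a surviving manager.
# This rebuilds a single-manager swarm from that node's state; you can
# then add or promote managers again.
docker swarm init --force-new-cluster --advertise-addr 203.0.113.10:2377

# Keep a manager from running service tasks
docker node update --availability drain manager-1
```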
I have a 3-node Docker swarm cluster. We might want to have 2 managers. I know that at any one time there is only one leader. Since it is a 3-node cluster, I am trying to find some literature to understand the pros and cons of multiple managers. I need this because, in my 3-node cluster with 2 managers and 1 worker, what would be the downside if I simply made all 3 nodes managers? Any thoughts would be helpful.
A Docker swarm with two managers is not recommended.
Why?
Docker swarm uses the Raft consensus algorithm:
Raft tolerates up to (N-1)/2 failures and requires a majority or quorum of (N/2)+1 members to agree on values proposed to the cluster. This means that in a cluster of 5 Managers running Raft, if 3 nodes are unavailable, the system will not process any more requests to schedule additional tasks
So with 2 managers, if one is down, the other will not be able to schedule additional tasks (no cluster upgrades, no new services, etc...).
The docs are also clear about the number of managers you should have for high availability:
Size your deployment
To make the cluster tolerant to more failures, add additional replica nodes to your cluster.
Manager nodes    Failures tolerated
      1                   0
      3                   1
      5                   2
      7                   3
So in brief, as the doc states here:
Adding more managers does NOT mean increased scalability or higher performance. In general, the opposite is true.
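If you do decide that all 3 nodes should be managers (which tolerates one manager failure, versus none with 2 managers), the change is one command per node; node names here are placeholders:

```
# Promote the remaining worker so the swarm has 3 managers (quorum = 2)
docker node promote node-3

# Or go the other way and demote a manager back to a worker
docker node demote node-2
```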