Can we create a node group in docker swarm? - docker

Our current Docker cluster has nodes of mixed sizes, i.e. some nodes have more memory and storage than others.
Is there any way I can create two separate node groups for low-end and high-end nodes, so that I can provision heavy containers on high-end nodes only?
I understand that using the constraint filter (https://docs.docker.com/swarm/scheduler/filter/) I can provision a container on a particular node by ID or name. But I can't scale that dynamically if that node goes down or new nodes are added to the cluster.

No, you can't subdivide nodes within a swarm; you need to do it through labels. You apply labels either on the Docker Engine, for the old Docker Swarm, or on the node, for the new swarm mode.
Adding a label would be part of your onboarding for a new node, so that all nodes have the appropriate labels and the scheduler can manage your services as you want.
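As a rough sketch of that workflow (the label key tier, the node name node-3, and the image name are placeholders, not anything from the question):

# tag a high-end node during onboarding
docker node update --label-add tier=high-memory node-3

# the heavy service only runs on nodes carrying that label
docker service create \
  --name heavy-app \
  --constraint 'node.labels.tier == high-memory' \
  example/heavy-app:latest

Any new node onboarded with the same label automatically becomes eligible for the service, so placement is no longer tied to a single node ID.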

Related

Docker Swarm scaling

I am using Docker Swarm in an AWS environment.
I also use auto-scaling.
By default there are two instances, and when auto-scaling kicks in and the instance count grows, I want the number of swarm tasks to grow accordingly.
For example, the service is currently set to replicas=4, with two containers each on the manager and the worker node. If auto-scaling adds one instance, that instance should also run two containers, so the total number of containers becomes 6. As another instance is added, the total should become 8.
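One swarm feature related to this scenario is global mode, which runs exactly one task per node, so the task count grows as auto-scaling adds nodes. A minimal compose sketch (the image name is a placeholder):

version: "3.8"
services:
  worker:
    image: example/app:latest   # placeholder image
    deploy:
      mode: global              # one task per node; a new node gets a new task automatically

Getting exactly two containers per node, as described above, would need two such services or a replicated service whose replica count is adjusted when nodes join; swarm does not resize replicated services on its own.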

Docker services: choose a preferred node to run on, or rearrange all services if the leader goes down

I have 2 swarm nodes, and I wish that in case one node shuts down, the other one rearranges all services onto itself.
Right now I have one leader (manager) and one worker, and it works perfectly if the worker goes down, because the leader rearranges all services onto itself.
My problem is when the leader goes down and nothing takes over the services that were running on it.
I already tried with two managers, but it didn't work.
So I am thinking about keeping all my services on the worker node, so that if the leader node goes down there is no problem at all, and if the worker node goes down, the leader node rearranges all services onto itself.
I tried with:
deploy:
  placement:
    constraints:
      - "node.role!=manager"
But that does not work either, because the service will then never be scheduled on a manager node.
So I would like to ask: is there any way to make those two nodes rearrange all services onto whichever one remains when the other goes down?
Or:
Is there a way to configure a service to "preferably" be deployed on one specific node if that node is available, and otherwise be deployed on any other node?
The rub of it is, you need 3 nodes, all managers. It is not a good idea, even with a 2-node swarm, to make both nodes managers: Docker Swarm uses the Raft protocol for manager quorum, and this protocol requires a clear majority. With two manager nodes, if either node goes down, the remaining manager only represents 50% of the swarm managers and so will not represent the swarm until quorum is restored.
Once you have 3 nodes, all managers, the swarm will tolerate any single node's failure and move tasks to the other two nodes.
Don't bother with 4 manager nodes: they don't provide extra protection from single-node failures, and they don't protect from two-node failures either, as 2 out of 4 again does not represent more than 50%. To survive 2 node failures you want 5 managers.
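As a minimal sketch of getting from one manager to three, assuming the other two nodes have already joined as workers (the node names are placeholders):

# run on the current manager
docker node promote node-2 node-3
docker node ls    # MANAGER STATUS should now show one Leader and two Reachable managers

With 3 managers a majority is 2, so the swarm keeps working after any single node failure; with 5 managers a majority is 3, which tolerates two failures.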

Docker constraints that apply only until a node goes down

I have 5 Docker nodes in a cluster [swarm].
Let's say I constrain NGINX [it is not about NGINX, it is just an example] to be deployed only on Docker node 1.
Can I create that constraint in such a way that if Docker node 1 goes down, the constraint no longer applies?
That is, the constraint only applies while the node is reachable, and when it isn't, it is automatically removed?
Yes, you can use a placement preference (--placement-pref) with a spread strategy keyed on node.hostname=your.node1.hostname, as documented here:
https://docs.docker.com/engine/reference/commandline/service_create/#specify-service-placement-preferences---placement-pref.
If the nodes in one category (for example, those with node.labels.datacenter=south) can't handle their fair share of tasks due to constraints or resource limitations, the extra tasks will be assigned to other nodes instead, if possible.
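A minimal sketch of that approach, assuming nodes are labeled with a datacenter label as in the quoted example (the node name, label value, and service name are placeholders):

docker node update --label-add datacenter=south node1
docker service create \
  --name nginx \
  --replicas 2 \
  --placement-pref 'spread=node.labels.datacenter' \
  nginx:latest

Unlike a constraint, a placement preference is soft: tasks are spread over the label's values where possible, but can still be scheduled elsewhere when a preferred node is unavailable.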
The downside is that when your node 1 comes back online, the service won't be updated and rebalanced until the service is updated again (manually, or because it goes down).
Additionally, it's not a good design if your service has to be placed on a special node; it should be designed to work everywhere, so you can balance server load across all nodes. Regarding NGINX, it's stateless, so you can deploy it to all of your nodes and let the Docker routing mesh do the load balancing. If your service is stateful, then even if it is re-deployed to a second node, your data will not be available and your service is interrupted anyway. So my real answer is that what you ask is possible, but it is not how Docker Swarm is designed to be used and may not be a good idea.
If you have a good reason to stick with that approach, you can think about putting a load balancer in front of your NGINX (or other service), such as another NGINX or HAProxy, which gives you more control to route requests to a primary node and use the secondary nodes for backup purposes only, and so on. The design would be a stateless load balancer deployed in global mode, with your core service running behind the LB. This design gives you no downtime when node 1 is down or the service is updating or relocating.
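A rough compose sketch of that design, assuming HAProxy as the load balancer; the image tags are placeholders and the actual HAProxy backend configuration is not shown:

version: "3.8"
services:
  lb:
    image: haproxy:2.8        # stateless load balancer, one instance per node
    ports:
      - "80:80"
    deploy:
      mode: global
  web:
    image: nginx:latest       # core service running behind the LB
    deploy:
      replicas: 2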

Preventing untagged containers from running on specific nodes

I have some nodes in my swarm where I want to only allow containers with a specific tag to be scheduled. Any container without that specific tag must not be scheduled on these nodes.
OpenShift has this functionality in taints and tolerations.
A taint allows a node to refuse a pod to be scheduled unless that pod has a matching toleration.
Is there something similar in Swarm that I haven't been able to find, or some way to achieve the same outcome?
Swarm mode has the concept of constraints, which is the inverse of taints and tolerations. With constraints, you add labels to your swarm nodes, and you then use those labels with constraints on a service to permit or deny a workload from running on specific nodes. The inverse part is that each service needs to specify constraints to exclude specific nodes; the scheduling model includes all nodes by default, regardless of their labels.
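A minimal sketch of that pattern (the label dedicated=gpu, the node name, and the service names are placeholders):

# mark the restricted nodes
docker node update --label-add dedicated=gpu node-4

# the intended workload opts in to those nodes
docker service create --name gpu-job \
  --constraint 'node.labels.dedicated == gpu' example/gpu-job:latest

# every other service must explicitly opt out
docker service create --name web \
  --constraint 'node.labels.dedicated != gpu' nginx:latest

Unlike a taint, nothing prevents a service that simply omits the != constraint from landing on the labeled nodes.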

AWS EKS Cluster Auto scale

I have an AWS EKS cluster (version 1.12) for my applications. We have deployed 6 apps in the cluster and everything is working fine. While creating nodes I added an auto-scaling node group which spans availability zones, with a minimum of 3 and a maximum of 6 nodes, so the desired 3 nodes are running fine.
I have a scenario like this:
When a memory spike happens I need to get more nodes, up to the max I set on the Auto Scaling group, but at the time of cluster setup I didn't add the Cluster Autoscaler.
Can somebody please address the following doubts?
As per the AWS documentation, the Cluster Autoscaler doesn't support a node group that spans multiple AZs.
If we do need to create multiple node groups as per the AWS docs, how do we specify min/max nodes? Is that per node group or for the entire cluster?
How can I achieve auto-scaling on a memory metric, since it doesn't come by default like the CPU metric?
You should create one node group for every AZ. So if your cluster size is 6 nodes, create node groups of 2 instances each, with each group in a single AZ. You can also spread the pods across AZs for high availability. If you look at the cluster autoscaler documentation, it recommends:
Cluster autoscaler does not support Auto Scaling Groups which span multiple Availability Zones; instead you should use an Auto Scaling Group for each Availability Zone and enable the --balance-similar-node-groups feature. If you do use a single Auto Scaling Group that spans multiple Availability Zones you will find that AWS unexpectedly terminates nodes without them being drained because of the rebalancing feature.
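If the node groups happen to be managed with eksctl (an assumption, not something stated in the question), a per-AZ layout with min/max set per node group rather than per cluster might look roughly like this; the cluster name, region, AZs, and sizes are placeholders:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster
  region: us-east-1
nodeGroups:
  - name: ng-1a
    availabilityZones: ["us-east-1a"]
    minSize: 1
    maxSize: 2
    desiredCapacity: 1
  - name: ng-1b
    availabilityZones: ["us-east-1b"]
    minSize: 1
    maxSize: 2
    desiredCapacity: 1
  - name: ng-1c
    availabilityZones: ["us-east-1c"]
    minSize: 1
    maxSize: 2
    desiredCapacity: 1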
I am assuming you want to scale the pods based on memory. For that you will have to use the metrics server or Prometheus and create an HPA which scales based on memory. You can find a working example here.
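A minimal sketch of such a memory-based HPA, assuming the metrics server is installed; the Deployment name my-app and the thresholds are placeholders (autoscaling/v2beta2 matches older clusters such as 1.12, while newer clusters use autoscaling/v2):

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80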
