I have three nodes in my swarm: one manager and two workers (worker1 and worker2). I have a couple of services that should preferably run on the first worker node (worker1); however, when that node goes down I want them to start running on the second worker node.
From what I've gathered, I could put a constraint like this:
placement:
  constraints:
    - node.hostname==worker1
This, however, forces the service to run on worker1, and when that node goes down the service simply goes down with it.
I could also do this:
placement:
  constraints:
    - node.role==worker
This restricts the service to one of the worker nodes but doesn't prioritize worker1 the way I want. Is there a way to prioritize the service onto a specific node rather than putting a constraint on it?
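For reference, the two placement blocks above map onto the CLI roughly like this (a sketch only; the service name myservice and the nginx image are placeholders):
# Pin the service to worker1 only; it stays down while worker1 is down.
docker service create --name myservice --constraint node.hostname==worker1 nginx
# Allow it on any worker node, with no preference for worker1.
docker service create --name myservice --constraint node.role==worker nginx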
I wish to understand the difference between a docker swarm node running as a Leader and one running as a Manager.
I also understand that there can be several docker managers, but can there be multiple docker swarm Leader nodes, and what would be the reasons for that?
Note: I'm aware of what a docker worker node is.
Docker swarm has the following terminology:
Manager node (can be the leader or just a manager)
Worker node
In a simple docker swarm mode setup there is a single manager and the rest are worker nodes; that single manager is the leader.
It is possible to have more than one manager node, for example 2 managers (an odd number such as 1, 3 or 5 is preferred). In that case exactly one of them is the leader, and the leader is responsible for scheduling tasks on the worker nodes. The manager nodes also talk to each other to maintain the cluster state. To keep the environment highly available, scheduling must not stop when the manager that is currently the leader goes down; at that moment another manager is automatically promoted to leader and takes over the responsibility of scheduling tasks (containers) on the worker nodes.
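To see this from the CLI, a quick sketch (the node name node2 is a placeholder):
# On any manager: the MANAGER STATUS column shows "Leader" for the current
# leader and "Reachable" for the other managers; workers show nothing there.
docker node ls
# Promote an extra node to manager, or demote it again, from an existing manager.
docker node promote node2
docker node demote node2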
I have a swarm running with one manager and multiple workers.
I want a specific service to be deployed once (and only once) per node, but only on the workers.
The manager still runs other services.
What I found doesn't fit my needs:
mode: global does what I want for the 'once per node' part, but it does not exclude the manager.
mode: replicated
replicas: 6
placement:
  constraints:
    - node.role == worker
This limits the service to the workers, but with that solution there could be more than one replica on a node. And --max-replicas-per-node doesn't exist yet.
docker node update --availability drain manager1 takes the manager out of the pool of nodes that receive tasks, but that's not an option either because my manager should still run other services.
You can combine your first solution with the second one. Something like this works well for me in my environment:
mode: global
placement:
  constraints:
    - node.role == worker
The only caveat is that if you constrain on a custom node label (e.g. node.labels.worker) instead of the built-in node.role, you need to assign that label to every node you want the service to run on.
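The same combination can also be expressed on the command line (a sketch; the service name myapp and the image are placeholders):
# One task on every node whose role is "worker"; the manager is skipped.
docker service create --name myapp --mode global --constraint node.role==worker myimage:latest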
I have a single node Docker Swarm setup with a dozen services created by simply calling docker service create [...].
Can anyone tell me what will happen to my services if I reboot my node? Will they automatically restart, or will I have to recreate them all?
I understand that Swarm services and docker-compose setups are different, but in the case of having to recreate the services upon reboot, is there a way to save a docker-compose.yml file for each of my services (i.e. something that parses the output of docker service inspect)? Is there a better way of "saving" my services' configuration?
There is no need to recreate the services; they remain the same even after the node restarts. I have tested this in my swarm cluster. I have a three-node swarm setup (1 manager & 2 workers). I completely stopped the worker nodes, and the services on the workers moved to the active node (the manager). I then restarted the active node (the manager), and I can still see the services up and running on the manager node.
Before restart / after restart: (screenshots attached)
So even if you are running a one-node swarm, there is no need to worry about the services; they will be recreated automatically. I have attached the screenshots for your reference.
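If you want to check this yourself after a reboot, two commands are enough (my_service is a placeholder name):
# List all services with their desired and running replica counts.
docker service ls
# Show which node each task of a service runs on and its current state.
docker service ps my_service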
I have a docker swarm cluster with 2 nodes on AWS. I stopped both instances and then started the swarm manager first and the worker afterwards. Before stopping the instances I had a service running with 4 replicas distributed between the manager and the worker.
When I started the swarm manager node first, all the replica containers started on the manager itself and never moved to the worker at all.
Please tell me how to rebalance the load.
Isn't the swarm manager responsible for doing this when the worker comes back up?
Swarm currently (18.03) does not move or replace containers when new nodes are started, if services are in the default "replicated mode". This is by design. If I were to add a new node, I don't necessarily want a bunch of other containers stopped, and new ones created on my new node. Swarm only stops containers to "move" replicas when it has to (in replicated mode).
docker service update --force <servicename> will rebalance a service across all nodes that match its requirements and constraints.
Further advice: as with other container orchestrators, you need to leave spare capacity on your nodes in order to handle the workloads of any service replicas that move during outages. Your spare capacity should match the level of redundancy you plan to support. If you want to handle two nodes failing at once, for instance, you need enough free resources across the remaining nodes for those workloads to shift onto them.
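One way to make that headroom explicit (a sketch only; the service name, image and numbers are arbitrary) is to reserve resources per task so the scheduler accounts for them when placing replicas:
# Each replica reserves half a CPU and 256 MB, so the scheduler will only
# place it on nodes that still have that much capacity free.
docker service create --name web \
  --replicas 4 \
  --reserve-cpu 0.5 \
  --reserve-memory 256M \
  nginx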
Here's a bash script I use to rebalance:
#!/usr/bin/env bash
set -e

# Service names matching this pattern are skipped; NAME filters out the header row of docker service ls.
EXCLUDE_LIST="(_db|portainer|broker|traefik|prune|logspout|NAME)"

# Force an update of every remaining service so its tasks are rescheduled across the nodes.
for service in $(docker service ls | egrep -v "$EXCLUDE_LIST" | awk '{print $2}'); do
  docker service update --force "$service"
done
Swarm doesn't do auto-balancing once containers are created. You can scale up/down once all your workers are up and it will distribute containers per your config requirements/roles/etc.
see: https://github.com/moby/moby/issues/24103
There are problems with new nodes getting "mugged" as they are added.
We also avoid pre-emption of healthy tasks. Rebalancing is done over time, rather than killing working processes. Pre-emption is being considered for the future.
As a workaround, scaling a service up and down should rebalance the tasks. You can also trigger a rolling update, as that will reschedule new tasks.
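As a concrete example of that workaround (my_service and the replica counts are placeholders):
# Scale up and back down; Swarm schedules the extra tasks on the emptier
# nodes and then removes the surplus tasks, which evens out the distribution.
docker service scale my_service=10
docker service scale my_service=5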
In docker-compose.yml, you can define:
version: "3"
services:
app:
image: repository/user/app:latest
networks:
- net
ports:
- 80
deploy:
restart_policy:
condition: any
mode: replicated
replicas: 5
placement:
constraints: [node.role == worker]
update_config:
delay: 2s
Remark: the constraint is node.role == worker
Using the --replicas flag implies we don't care which nodes the tasks are placed on; if we want one task per node we can use --mode=global instead.
In Docker 1.13 and higher, you can use the --force or -f flag with the docker service update command to force the service to redistribute its tasks across the available worker nodes.
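For example (my_service is a placeholder):
# Reschedule every task of the service even though nothing in its spec changed;
# the tasks are placed across all nodes that satisfy its constraints.
docker service update --force my_service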
I have a 3-node swarm, each node with a static IP address. I have a leader node-0 on 192.168.2.100, a backup manager node-1 on 192.168.2.101, and a worker node-2 on 192.168.2.102. node-0 is the leader that initialized the swarm, so the --advertise-addr is 192.168.2.100. I can deploy services that can land on any node, and node-0 handles the load balancing. So, if I have a database on node-2 (192.168.2.102:3306), it is still reachable via node-0 at 192.168.2.100:3306, even though the service is not running directly on node-0.
However, when I reboot node-0 (let's say it loses power), the next manager in line assumes leader role (node-1) - as expected.
But now if I want to access a service, say an API or database, from a client (a computer that's not in the swarm), I have to use 192.168.2.101:3306 as my entry-point IP, because node-1 is handling the load balancing. So, essentially, from the outside world (other computers on the network) the IP address of the swarm has changed, and this is unacceptable and impractical.
Is there a way to resolve this such that a given manager has priority over another manager? Otherwise, how is this sort of issue resolved so that the entry-point IP of the swarm does not depend on the acting leader?
Make all three of your nodes managers and use some sort of load balanced DNS to point to all three of your manager nodes. If one of the managers goes down, your DNS will route to one of the other two managers (seamlessly or slightly less seamlessly depending on how sophisticated your DNS routing/health-check/failover setup is). When you come to scale out with more nodes, nodes 4, 5, 6 etc can all be worker nodes but you will benefit from having three managers rather than one.
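A rough sketch of the first half of that (the node name matches the question; swarm.example.com is a made-up name, and the DNS part depends entirely on your own infrastructure):
# Run on an existing manager: turn the remaining worker into a manager so the
# swarm has three managers and can tolerate the loss of one of them.
docker node promote node-2
# Then publish one DNS name, e.g. swarm.example.com, with A records for
# 192.168.2.100, 192.168.2.101 and 192.168.2.102, and point clients at that
# name instead of at any single node's IP.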