How to reschedule containers with swarm when the server dies for a moment - docker-swarm

I run two servers using docker-compose and swarm.
When I stop server A, the container on server A is moved to server B.
But when server A comes back up, the container that moved to server B is not moved back to server A.
I want to know how to properly rearrange the dead server's containers when a server goes down for a moment.

First, for your Swarm to be able to re-create a task when a node goes down, you still need to have a majority of manager nodes available... so if it was only a two-node Swarm, this wouldn't work, because you'd need three managers for one to fail and another to take the leader role and re-schedule the failed replicas (just an FYI).
I think what you're asking for is "re-balancing". When a node comes back online (or a new one is added), Swarm does nothing with services that are set to the default replicated mode. Swarm doesn't "move" containers, it destroys and re-creates containers, so it considers the service still healthy on Node B and won't move it back to Node A. It wouldn't want to disrupt your active/healthy services on Node B just because Node A came back online.
If Node B does fail, then Swarm would again re-schedule the task on the next best node.
If Node B has a lot of containers and work is unbalanced (i.e. Node A is empty and Node B has 3 tasks running), then you can force a service update, which will destroy and re-create all replicas of that service and will try to spread them out by default, which may result in one of the tasks ending up back on Node A.
docker service update --force <servicename>
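As a rough illustration (the service name web is a placeholder), you can check where the tasks of a service currently run and then force the re-spread:
# Show which node each task of the service is running on:
docker service ps web --format '{{.Node}}\t{{.CurrentState}}'
# Force a rolling re-create of all replicas; by default the scheduler
# tries to spread them across the available nodes again:
docker service update --force web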

Related

How to handle when leader node goes down in docker swarm

I have two Docker nodes running in swarm mode as shown below. I promoted the second node to work as a manager.
imb9cmobjk0fp7s6h5zoivfmo * Node1 Ready Active Leader 19.03.11-ol
a9gsb12wqw436zujakdpbqu5p Node2 Ready Active Reachable 19.03.11-ol
This works fine when the leader node goes to drain/pause. But as part of my test I stopped the Node1 instance, and then I got the error below when trying to see the nodes (docker node ls) and to list the running services (docker service ls) from the second node.
Error response from daemon: rpc error: code = Unknown desc = The swarm does not have a leader. It's possible that too few managers are online. Make sure more than half of the managers are online
Also, none of the Docker processes that were running on Node1 before I stopped the instance are coming up on Node2; only the existing processes are running. My expectation is that after stopping the Node1 instance, the processes that were running on Node1 would move to Node2. This works fine when a node goes to drain status.
The Raft consensus algorithm fails when it can't find a clear majority.
This means: never run with 2 manager nodes, as one node going down leaves the other with 50% - which is not a majority, so quorum cannot be reached.
Generally, in fact, avoid even numbers, especially when splitting managers between availability zones, as a zone split can leave you with a 50/50 partition - again no majority, no quorum, and a dead swarm.
So, valid numbers of swarm managers to try are generally 1, 3, 5, or 7. Going higher than 7 generally reduces performance and doesn't help availability.
1 should only be used if you are running a 1- or 2-node swarm, and in those cases loss of the manager node equates to loss of the swarm anyway.
3 managers is really the minimum you should aim for. If you only have 3 nodes, then prefer to make all 3 of them managers (managers can run workloads too) rather than running 1 manager and 2 workers.
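As a rough sketch of getting from the two-node setup above to three managers (Node3 is a hypothetical extra machine; the join command comes from any existing manager):
# On an existing manager, print the manager join token/command:
docker swarm join-token manager
# Run the printed "docker swarm join --token ..." command on Node3, or join
# it as a worker first and then, from a manager, promote it:
docker node promote Node3
# Verify: one node should show "Leader" and the others "Reachable":
docker node ls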

What steps does docker swarm take when doing a rolling update with start-first?

When Docker Swarm does a rolling update with stop-first on multiple running container instances, it takes, among others, the following steps in order for each container:
Remove the container from its internal load balancer
Send a SIGTERM signal to the container.
After the stop-grace-period has elapsed, send a SIGKILL signal.
Start a new container
Add the new container to its internal load balancer
But what order of steps is taken when I want to do a rolling update with start-first?
Will the old and new container be available through the load balancer at the same time (until the old one has been stopped and removed from the LB)?
Or will the new container first be started but not added to the load balancer until the old container has been stopped and removed from the load balancer?
The latter would be necessary for processes that are bound to a specific instance of a service (container).
But what order of steps is taken when I want to do a rolling update with start-first?
It's basically the reverse: the new container starts and is added to the LB, then the old one is removed from the LB and sent a shutdown signal.
Will the old and new container be available through the load balancer at the same time (until the old one has been stopped and removed from the LB)?
Yes.
A reminder that most of this will not be seamless (or near zero downtime) unless you (at a minimum) have healthchecks enabled in the service. I talk about this a little in this YouTube video.
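As a minimal sketch of a start-first update combined with a healthcheck (the service name, image, and health endpoint are placeholders, and the health command assumes curl is available inside the image):
docker service create \
  --name web \
  --replicas 3 \
  --update-order start-first \
  --update-delay 10s \
  --health-cmd "curl -fs http://localhost:8080/ || exit 1" \
  --health-interval 5s \
  mytomcat:latest
# With start-first and a healthcheck, each replacement task is only added to
# the load balancer once it reports healthy; the old task is then removed
# from the LB and shut down, so both can briefly receive traffic.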

restore a docker swarm

Let's say we have a swarm1 (1 manager and 2 workers). I am going to back up this swarm on a daily basis, so that if there is a problem some day, I can restore the whole swarm to a new one (swarm2 = 1 manager and 2 workers too).
I followed what is described here, but it seems that while restoring, the new manager gets the same token as the old manager. As a result, the 2 workers get disconnected and I end up with a new swarm2 with 1 manager and 0 workers.
Any ideas / solution?
I don't recommend restoring workers. Assuming you've only lost your single manager, just run docker swarm leave on the workers, then join them again. Then, on the manager, you can always clean up old workers later (this does not affect uptime) with docker node rm.
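A rough sketch of that flow (node IDs and tokens are placeholders):
# On each worker that is still pointing at the lost manager:
docker swarm leave --force
# On the restored manager, print a fresh worker join command:
docker swarm join-token worker
# Run the printed "docker swarm join --token ..." command on each worker,
# then remove the stale node entries on the manager:
docker node rm <old-node-id>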
Note that if you lose the manager quorum, this doesn't mean the apps you're running go down, so you'll want to keep your workers up and serving your apps to your users until you fix your manager.
If your last manager fails or you lose quorum, then focus on restoring the Raft DB so the swarm manager has quorum again. Then rejoin workers, or create new workers in parallel and only shut down old workers once the new ones are running your app. Here's a great talk by Laura Frank that goes into it at DockerCon.
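For reference, a rough sketch of the documented backup/restore flow on a manager (paths assume the default Docker data root):
# Back up the swarm state; stopping the engine first keeps the Raft DB consistent:
systemctl stop docker
tar -czf /backup/swarm-backup.tar.gz -C /var/lib/docker swarm
systemctl start docker
# To restore on a new manager: stop the engine, unpack the backup into
# /var/lib/docker/swarm, start the engine, then re-initialize from the
# restored state so that node becomes a single-manager swarm again:
docker swarm init --force-new-cluster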

Docker swarm load balancing - How to give common name to the service?

I read about the swarm routing mesh.
I created a simple service which uses a Tomcat server and listens on 8080.
docker swarm init: I created a manager node at node1.
docker swarm join <token>: I used the token provided by the manager on node 2 and node 3 to create workers.
docker node ls: shows 5 instances of my service, 3 running on node 1, 1 running on node 2, and another one on node 3.
docker service create <image>: I created the service.
docker service scale imageid=5: I scaled it to 5 replicas.
My application uses an atomic number which is maintained at the JVM level.
If I hit http://node1:8080/service 25 times, all requests go to node1. How does it balance across nodes?
If I hit http://node2:8080/service, it goes to node 2.
Why is it not using round-robin?
Doubts:
Is anything wrong in the above steps?
Did I miss something?
I feel I am missing something, like a common service name (e.g. http://domain:8080/service) through which swarm would work in round-robin fashion.
I would like to understand only swarm mode. I am not interested in an external load balancer as of now.
How do I see swarm load balance in action?
Docker does round robin load balancing per connection to the port. As long as a connection is up, it will continue to go to the same instance.
HTTP allows a connection to be kept alive and reused. Browsers take advantage of this behavior to speed up later requests by leaving connections open. To test the round-robin load balancing, you'd need to either disable that keep-alive behavior or switch to a command-line tool like curl or wget.
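A quick way to see this in action (assuming the service publishes port 8080 and the endpoint returns something identifying the container that answered, such as its hostname):
# Each curl invocation opens a fresh TCP connection, so successive
# requests should rotate across the service's tasks:
for i in $(seq 1 10); do
  curl -s http://node1:8080/service
  echo
done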

Docker swarm mode: scale down a node and remove services

I have a set of tasks for a given service, t1, t2, ..., tk, across nodes N1, N2, ..., Nw.
Due to lower usage, I do not need as many tasks as k.
I need only l tasks (l < k).
In fact, I do not need w nodes so I want to start removing machines and pay less. Removing one machine at a time is fine.
Each service has its own state.
The services are started in replicated mode.
1) How can I remove a single node and force the docker swarm not to recreate the same number of tasks for the service?
Notes:
I can ensure that no work is rerouted to tasks running on a specific node, so removing the specific node is safe.
This is the easiest solution; I will end up with w - 1 nodes and l tasks, assuming that k - l tasks were being served on the removed node.
or
2) How can I remove specific containers (tasks) from docker swarm and keep the number of replicas of the service lower by the number of removed tasks?
Notes:
I assume that I have already removed a node. The services from that node were redeployed to other nodes.
I monitor the containers (tasks) that serve no traffic myself, so no state needs to be maintained.
or
3) Any other solution?
To use a concrete example, let's say you have 3 nodes and 9 tasks. You now want to go to 2 nodes and 6 tasks, without any unnecessary rescheduling (e.g. 2 nodes and 9 tasks, or 3 nodes and 6 tasks).
To scale down a service and 'drain' a node at the same time, you can do this:
docker service update --replicas 6 --constraint-add "node.hostname != node_to_be_removed_hostname" service_name
If your existing setup is balanced, this should only cause the tasks running on the host being removed to be killed.
After this, you can proceed to drain the node (docker node update --availability drain), remove it from the swarm, and remove the constraint that was just added, as sketched below.
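A sketch of those follow-up steps (the hostname and service name are carried over from the command above as placeholders):
# Drain the node so no new tasks are scheduled on it:
docker node update --availability drain node_to_be_removed_hostname
# Remove the node's entry on a manager once its tasks are gone
# (run docker swarm leave on the node itself first, or add --force):
docker node rm node_to_be_removed_hostname
# Finally, drop the temporary placement constraint:
docker service update --constraint-rm "node.hostname != node_to_be_removed_hostname" service_name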
To answer your questions
Q1 -> You can simply drain the node in the cluster and verify that the services running on it are started on other nodes. Once they are, you can safely remove the node from the swarm cluster.
docker node update --availability drain <node-name>
Q2 -> You must have specified a replica count while starting the services; you can simply scale it down to a lower count.
docker service scale <service-name>=<replica-count>
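For example (the service name is a placeholder), scaling a service from 9 down to 6 replicas:
docker service scale my_service=6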
