How to remove a node from a swarm? - docker

I added three nodes to a swarm cluster in static file mode. I want to remove host1 from the cluster, but I can't find a docker swarm remove command:
Usage: swarm [OPTIONS] COMMAND [arg...]
Commands:
create, c Create a cluster
list, l List nodes in a cluster
manage, m Manage a docker cluster
join, j join a docker cluster
help, h Shows a list of commands or help for one command
How can I remove the node from the swarm?

Using Docker Version: 1.12.0, docker help offers:
➜ docker help swarm
Usage: docker swarm COMMAND
Manage Docker Swarm
Options:
--help Print usage
Commands:
init Initialize a swarm
join Join a swarm as a node and/or manager
join-token Manage join tokens
update Update the swarm
leave Leave a swarm
Run 'docker swarm COMMAND --help' for more information on a command.
So, next try:
➜ docker swarm leave --help
Usage: docker swarm leave [OPTIONS]
Leave a swarm
Options:
--force Force leave ignoring warnings.
--help Print usage

With the swarm mode introduced in Docker Engine 1.12, you can simply run docker swarm leave on the node.

The reference to "static file mode" implies the container-based standalone swarm (classic swarm) that predated the Swarm Mode most people now know as Swarm. These are two completely different "Swarm" products from Docker and are managed with completely different methods.
The other answers here focus on Swarm Mode. With Swarm Mode, docker swarm leave on the target node causes the node to leave the swarm. Once that engine is no longer talking to the manager, docker node rm for the specific node on an active manager will clean up any lingering references inside the cluster.
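A minimal sketch of that two-step sequence, using host1 from the question as the node to remove:
# on the node that should leave the swarm (host1 here)
docker swarm leave

# then, on any active manager, confirm it shows as Down and remove it
docker node ls
docker node rm host1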
With the container-based classic swarm, you would recreate the manager container with an updated static node list. If you find yourself doing this a lot, an external discovery backend would make more sense (e.g. Consul, etcd, or ZooKeeper). Given that classic swarm is deprecated and no longer maintained, I'd suggest using either Swarm Mode or Kubernetes for any new projects.
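If you do stay on classic swarm for now, a rough sketch of that recreation step might look like the following; the file path, container name, and port are illustrative, and it assumes the manager was started with file:// discovery:
# drop host1 from the static node list (path is illustrative)
sed -i '/^host1:2375$/d' /etc/swarm/cluster

# recreate the classic swarm manager container with the updated list mounted in
docker rm -f swarm-manager
docker run -d --name swarm-manager -p 4000:4000 \
  -v /etc/swarm/cluster:/cluster:ro \
  swarm manage -H :4000 file:///cluster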

Try this:
docker node list # to get a list of nodes in the swarm
docker node rm <node-id>
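One caveat worth noting: docker node rm won't remove a node that is still a manager or still reachable, so demote it first (or force the removal of a node that is already down):
# demote a manager before removing it
docker node demote <node-id>
docker node rm <node-id>

# or force-remove a node that is down/unreachable
docker node rm --force <node-id>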

Using the Docker CLI
I work with Docker Swarm clusters, and there are two options for removing a node from the cluster.
It depends on where you want to run the command: on the node you want to remove, or on a manager node other than the one being removed.
The important thing is that the desired node must be drained before being removed to maintain cluster integrity.
First option:
So I think the best approach is (following the steps in the official documentation):
SSH into one of the manager nodes;
Optionally list your cluster nodes;
Change the availability of the node you want to remove to drain;
Then remove it;
# step 1
ssh user@node1cluster3
# step 2, list the nodes in your cluster
docker node ls
# step 3, drain one of them
docker node update --availability drain node4cluster3
# step 4, remove the drained node
docker node rm node4cluster3
Second option:
The second option needs two terminal logins, one on a manager node and one on the node you want to remove.
Perform the 3 initial steps described in the first option to drain the desired node.
Afterwards, log in to the node you want to remove and run the docker swarm leave command.
# remove from swarm using leave
docker swarm leave
# OR, if the desired node is a manager, you can use force (be careful*)
docker swarm leave --force
*For information about maintaining a quorum and disaster recovery, refer to the Swarm administration guide.
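With the second option, the node that left will usually still appear as Down in docker node ls on the managers, so it's worth finishing with docker node rm there; reusing the example node name from above:
# back on a manager node, clean up the lingering entry
docker node ls
docker node rm node4cluster3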
My environment information
I use Ubuntu 20.04 for nodes inside VMs;
With Docker version 20.10.9;
Swarm: active;

Related

adding a manager back into swarm cluster

I have a swarm cluster with two nodes: 1 manager and 1 worker. I am running my application on the worker node, and to run a test case I force-removed the manager from the docker swarm cluster.
My application continues to work, but I would like to know if there is any way to add the force-removed manager back into the cluster. (I don't remember the join token, and I haven't copied it anywhere.)
I understand Docker advises having an odd number of manager nodes to maintain quorum, but I would like to know if Docker has addressed such scenarios anywhere.
docker swarm init --force-new-cluster --advertise-addr node01:2377
When you run the docker swarm init command with the --force-new-cluster flag, the Docker Engine where you run the command becomes the manager node of a single-node swarm which is capable of managing and running services. The manager has all the previous information about services and tasks, worker nodes are still part of the swarm, and services are still running. You need to add or re-add manager nodes to achieve your previous task distribution and ensure that you have enough managers to maintain high availability and prevent losing the quorum.
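A minimal sketch of re-adding managers after that recovery; node01 is the same placeholder address as in the command above:
# on the recovered single-node manager, print the manager join command (includes a new token)
docker swarm join-token manager

# run the printed command on each node you want back as a manager, e.g.
# docker swarm join --token <manager-token> node01:2377

# or, for a node that is already in the swarm as a worker, promote it instead
docker node promote <node-hostname>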
How about creating a new cluster from your previous manager machine and letting the worker leave the previous cluster to join the new one? That seems like the only possible solution to me, since your cluster now does not have any managers and the previous manager is not in any cluster; you can just create a new cluster and let the other workers join the new one.
# On the previous manager machine
$ docker swarm init --advertise-addr <manager ip address>
# *copy the generated join command that this prints for the worker
# On the worker machine
$ docker swarm leave
# *paste and execute the copied join command here

Docker Swarm Mode - Show containers per node

I am using Docker version 17.12.1-ce.
I have set up a swarm with two nodes, and I have a stack running on the manager, while I want to instantiate new containers on the worker (not within a service, but as stand-alone containers).
So far I have been unable to find a way to instantiate containers on the worker specifically, and/or to verify that the new container actually got deployed on the worker.
I have read the answer to this question, which led me to run containers with the -e option specifying constraint:Role==worker, constraint:node==<nodeId>, or constraint:<custom label>==<value>, and this GitHub issue from 2016 showing the docker info command outputting just the information I would need (i.e. how many containers are on each node at any given time). However, I am not sure if that is a feature of the stand-alone swarm, since for me docker info only shows the number of nodes, with no detailed info for each node. I have also tried docker -D info.
Specifically, I need to:
Manually specify which node to deploy a stand-alone container to (i.e. not related to a service).
Check that a container is running on a specific swarm node, or check how many containers are running on a node.
Swarm commands only deal with (and show) service-related containers. If you create a container with docker run, then you'll need to use something like ssh node2 docker ps to see all containers on that node.
I recommend you do your best in a Swarm to have all containers as part of a service. If you need a container to run on nodeX, then you can create a service with a "node constraint" using labels and constraints. In this case you could restrict the single replica of that service to a node's hostname.
docker service create --constraint Node.Hostname==swarm2 nginx
To see all tasks on a node from any swarm manager:
docker node ps <nodename_or_id>
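For a quick per-node overview from a manager, a small loop over the node IDs works; note this only shows service tasks, so standalone docker run containers still need the ssh approach mentioned above:
# list the tasks scheduled on each node, one node at a time
for n in $(docker node ls -q); do
  echo "== node $n =="
  docker node ps "$n"
done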

docker-compose swarm without docker-machine

After looking through the official Docker Swarm explanations, GitHub issues, and Stack Overflow answers, I'm still at a loss as to why I'm having the problem that I have.
Issue at hand: docker-compose up starts services not in the swarm, even though the swarm is active and has 2 nodes.
I'm using Docker version 1.12.1.
Following the swarm tutorial, I was able to start and scale my swarm using docker service create without any issues.
Running docker-compose up with a version 2 docker-compose.yml results in services starting outside of the swarm; I can see them through docker ps but not docker service ls.
I can see that docker-machine is presented as the tool that solves this problem, but then again it needs VirtualBox to be installed.
So my questions would be:
Can I use docker-compose with docker-swarm (NOT docker-engine) without docker-machine and without the experimental build/bundle functionality?
If docker service create can start a service on any node, is that an indication that the network configuration of the swarm is correct?
What are the advantages/disadvantages of docker-machine versus the experimental build functionality?
1) No. Docker Compose isn't integrated with the new Swarm Mode yet. Issue 3656 in GitHub is tracking that. If you start containers on a swarm with Docker Compose at the moment, it uses docker run to start containers, which is why you see them all on one node.
2) Yes. Actually you can use docker node ls on the manager to confirm all the nodes are up and active, and docker node inspect to check a particular node; you don't need to create a service to validate the swarm.
3) Docker Machine is also behind the 1.12 release, so if you start a swarm with Docker Machine it will be the 'old' type of swarm. The old Docker Swarm product needed a whole lot of extra setup for a key-value store, TLS, etc., which Swarm Mode handles for free.
1) You can't start services using docker-compose on the new Docker "Swarm Mode". There's a feature to convert a docker-compose file to the new DAB format, which is understood by the new swarm mode, but that's incomplete and experimental at this point. You basically need to use bash scripts that call docker service create to start services at the moment (see the sketch after this answer).
2) The nodes in a swarm (swarm mode) interact using their own overlay network. It's the one named ingress when you do docker network ls. You need to setup your own overlay network to run services in. eg:
docker network create -d overlay mynet
docker service create --name serv1 --network mynet nginx
3) I'm not sure what feature you mean by "experimental build". docker-machine is just a way to create hosts (the nodes). It facilitates the setting up of the docker daemon on each host and the certificates, and allows some basic maintenance (renewing the certs, stopping/starting a host if you're the one who created it). It doesn't create services, volumes, or networks, or manage them. That's the job of the docker api.
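For point 1, a minimal sketch of such a script; the service names, images, and port are illustrative stand-ins for whatever your compose file defines, and mynet is the overlay network from point 2:
#!/bin/sh
# create the overlay network (ignore the error if it already exists) and start the services
docker network create -d overlay mynet || true
docker service create --name web --network mynet -p 80:80 nginx
docker service create --name cache --network mynet redis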

How to run the same container on all Docker Swarm nodes

I'm just getting my feet wet with Docker Swarm because we're looking at ways to configure our compute cluster to make it more containerized.
Basically we have a small farm of 16 computers and I'd like to be able to have each node pull down the same image, run the same container, and accept jobs from an OpenMPI program running on a master node.
Nothing about this is really OpenMPI-specific; the containers just have to be able to open SSH ports and the master must be able to log into them. I've got this working with a single Docker container.
Now I'm learning Docker Machine and Docker Swarm as a way to manage the 16 nodes. From what I can tell, once I set up a swarm, I can then set it as the DOCKER_HOST (or use -H) to send a "docker run", and the swarm manager will decide which node runs the requested container. I got this basically working using a simple node list instead of messing with discovery services, and so far so good.
But I actually want to run the same container on all nodes in one command. Is this possible?
Docker 1.12 introduced global services: if you pass --mode global to docker service create, Docker will schedule the service on every node.
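A minimal sketch, assuming an SSH-enabled image of your own (the image and service names here are placeholders):
# schedule exactly one task of the service on every node in the swarm
docker service create --mode global --name openmpi-node my-openmpi-ssh-image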
With the classic (standalone) Docker Swarm, you can use labels and negative affinity filters to get the same result:
openmpi:
  environment:
    - "affinity:container!=*openmpi*"
  labels:
    - "com.myself.name=openmpi"

Checking the reason behind node failure

I have a docker swarm set up with nodes node-1, node-2, and node-3. For some reason, every day one of my nodes fails; basically, it exits. I ran docker logs <container id of swarm>, but the logs don't contain any info related to the node failure.
So, is there any log file where something related to this failure can be seen? Or is this due to some memory allocation problem?
Can anyone suggest how to dig into this problem and find a proper solution? Every day I have to restart my swarm nodes.
Like most containers, Swarm containers run and exit unless you use docker run with the -d option to "daemonize" them. For example:
$ docker run -d swarm join --advertise=172.30.0.69:2375 consul://172.30.0.161:8500
On the other hand, if you used Docker Machine to create the VMs, then also use Docker Machine to create the Swarm manager and nodes. By default, Docker Machine applies TLS authentication to the Docker Engine nodes. The easiest thing to do is to also create the Swarm manager and nodes at the same time as you create the Docker Engine nodes.
For more info, check out the brand new Swarm doc.
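To dig into why a given swarm container exited, a few standard Docker commands are usually enough; swarm-agent is a placeholder container name, and the join arguments mirror the example above:
# find the exited container and check how it died (exit code, OOM kill, error message)
docker ps -a --filter status=exited
docker inspect --format '{{.State.ExitCode}} {{.State.OOMKilled}} {{.State.Error}}' swarm-agent
docker logs swarm-agent

# optionally recreate it with a restart policy so the daemon restarts it automatically
docker run -d --restart=unless-stopped swarm join --advertise=172.30.0.69:2375 consul://172.30.0.161:8500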
