Docker swarm cluster: how to add manager nodes as Reachable - docker-swarm

I am using Docker with VirtualBox (via docker-machine) on a Windows 7 machine.
$ docker-machine ls
NAME       ACTIVE   DRIVER       STATE     URL                    SWARM   DOCKER        ERRORS
default    *        virtualbox   Running   tcp://1.2.3.101:2376           v17.04.0-ce
manager1   -        virtualbox   Running   tcp://1.2.3.106:2376           v17.04.0-ce
manager2   -        virtualbox   Running   tcp://1.2.3.105:2376           v17.04.0-ce
worker1    -        virtualbox   Running   tcp://1.2.3.102:2376           v17.04.0-ce
worker2    -        virtualbox   Running   tcp://1.2.3.104:2376           v17.04.0-ce
worker3    -        virtualbox   Running   tcp://1.2.3.103:2376           v17.04.0-ce
$ docker node ls
ID                           HOSTNAME   STATUS   AVAILABILITY   MANAGER STATUS
e8kum3w0xqd4g02cx1tfps9ni    manager1   Down     Active
aibbgvqtiv9bhzbs8l20lbx2m *  default    Ready    Active         Leader
sbt75u8ayvf7lqj7y3zppjwvk    worker1    Ready    Active
ny2j5556w4tyflf3tjfqzjrte    worker2    Ready    Active
veipdd0qs2gjnogftxvr1kfhq    worker3    Ready    Active
Now I am planning to set up a docker swarm cluster with three manager nodes (default, manager1, manager2) and three worker nodes (worker1, worker2, worker3).
Using the default node I initialized the swarm with:
$ docker swarm init --advertise-addr 1.2.3.101:2376
output starting
Swarm initialized: current node (acbbgvqtiv6bhzbs8l20lbx1e) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-1ie1b420bhs452ubt4iy01brfc97801q0ya608spbt0fnuzkp0-1h2a86acczxe4qta164np487r 1.2.3.101:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
output ending
Using this output I easily added the worker nodes. Now my question is: how do I add the other managers (manager1, manager2) so that they show up as Reachable? Note that the default node still acts as Leader.
Could anyone please help with this?
Thanks

Sorry for the late answer.
On the existing manager host, get the manager join token:
docker swarm join-token manager
Then execute the command it prints on the prospective manager host.
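As a sketch, on the current leader (the default machine from the question) this would look roughly as follows; the token below is a placeholder, not a real one:
$ docker swarm join-token manager
To add a manager to this swarm, run the following command:
docker swarm join --token SWMTKN-1-<manager-token> 1.2.3.101:2377
Running that printed join command on manager1 and manager2 should make them appear in docker node ls with MANAGER STATUS Reachable.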

Run this command on a manager node:
docker swarm join-token manager
It prints the token needed to add other nodes as managers; the output is similar to the worker join command you got above.
SSH to the machine you want to add as a manager node to the swarm.
Once there, run that printed command.
For the joining manager you can also provide the --advertise-addr and --listen-addr flags; they take host:port as a parameter.
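For example (a rough sketch: the IP 1.2.3.106 is just manager1's address from the question, and the token is a placeholder), joining manager1 with those flags could look like:
docker-machine ssh manager1
docker swarm join --token SWMTKN-1-<manager-token> --advertise-addr 1.2.3.106:2377 --listen-addr 1.2.3.106:2377 1.2.3.101:2377
(the second command is run inside the manager1 VM)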
Hope this helps

Related

How to deploy a compose file in docker swarm which is present in Worker node

On system1 (the hostname of the master node), the swarm is started using
docker swarm init
Later, the compose files available on system1 (*.yml) are deployed using
docker stack deploy --compose-file file_1.yml system1
docker stack deploy --compose-file file_2.yml system1
docker stack deploy --compose-file file_3.yml system1
Next, on system2 (the hostname of the worker node),
I join the manager node (system1) using the join --token command. I run the command below on system1, copy its output, and use that output to join the swarm:
docker swarm join-token worker
Once I ran that output on system2, it successfully joined the swarm.
I also cross-verified with
docker node ls
I could see both the manager node and the worker in the Ready and Active state.
In my case I'm using the worker node (system2) for failover.
I have the same compose files (*.yml) on system2.
How do I get them deployed in the docker swarm?
Since system2 is a worker node, I cannot deploy from system2.
First, I'm not sure what you mean by
"In my case I'm using the worker node (system2) for failover."
We are running Docker Swarm in production, and the only way you can achieve failover for managers is to run more of them. Because Docker Swarm keeps its manager state in a Raft store that requires a quorum, go with an odd number of managers: 1, 3, 5, ...
As for deployments from non-manager nodes: that is not possible in Docker Swarm unless you use a management service that proxies the docker socket. Such a service runs on a manager, and since it all lives inside the swarm you can then invoke the calls from the worker.
But there is no way to directly deploy to or administer the swarm from a worker node.
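If the point of system2 is manager failover, one option (a sketch, not something your setup requires) is to promote it to a manager from system1 instead of trying to deploy from a worker:
docker node promote system2
docker node ls
After promotion, system2 can run docker stack deploy itself. Note, however, that with only two managers you gain no fault tolerance, since Raft quorum needs a majority, hence the 1, 3, 5 rule above.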
Some things:
First:
Docker contexts are used to communicate with a swarm manager remotely so that you do not have to be on the manager when executing docker commands.
i.e. to deploy remotely to a swarm you could create then use a context like this:
docker context create swarm1 --docker "host=ssh://user@node1"
docker --context swarm1 stack deploy --compose-file stack.yml stack1
Second:
Once the swarm is set up, you always communicate with a manager node, and it orchestrates the deployment of services to the available worker nodes. If worker nodes are added after services are deployed, Docker will not move tasks onto them until new deployments are performed, because it prefers not to interrupt running tasks; the goal is eventual balance. If you want to force Docker to rebalance and consider the new worker node immediately, redeploy the stack, or run
docker service update --force some-service
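For example, to force every service in a stack to reschedule so the new workers are considered (assuming the stack is named stack1, as in the context example above):
for s in $(docker stack services -q stack1); do docker service update --force "$s"; done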
Third:
To control which worker nodes services run tasks on you can use placement constraints and node labels.
docker service create --constraint node.role==worker ... would only deploy onto nodes that have the worker role (are not managers)
or
docker service update --constraint-add "node.labels.is-nvidia-enabled==1" some-service would only deploy tasks to nodes that you have explicitly labeled with the corresponding label and value,
e.g. docker node update --label-add is-nvidia-enabled=1 node1 (and again for node3; docker node update takes one node at a time)
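To verify what is currently set (node1 and some-service are the example names from above), you can inspect the node labels and the service constraints:
docker node inspect --format '{{ .Spec.Labels }}' node1
docker service inspect --format '{{ .Spec.TaskTemplate.Placement.Constraints }}' some-service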

Docker swarm - Manager node cannot access the containers in worker node

In our docker swarm environment there is 1 manager node and 2 worker nodes.
We also installed Portainer and Swarmpit, and their agents, on all nodes.
Yesterday, one of the virtual servers hosting a worker node rebooted unexpectedly.
When we checked the docker service it was stopped, so we restarted it with this command:
systemctl restart docker
Then all the containers seemed to work fine on the worker node. But when we check the containers from Portainer, which runs on the master node, the containers look stopped. Swarmpit reports the worker node as active and ready.
What could be the problem?
(Screenshots: worker node, master node running containers, Swarmpit)
We found out that the firewall caused the error.
After rebooting CentOS, the firewall is enabled automatically, and it conflicted with the docker engine, so we disabled the firewall with this command:
systemctl disable firewalld
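A less drastic alternative (a sketch for CentOS/firewalld, using the standard swarm ports also listed later in this thread) is to open just the ports swarm needs instead of disabling the firewall entirely:
firewall-cmd --permanent --add-port=2377/tcp --add-port=7946/tcp --add-port=7946/udp --add-port=4789/udp
firewall-cmd --reload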

Error response from daemon: attaching to network failed, make sure your network options are correct and check manager logs: context deadline exceeded

I am trying to set up docker swarm with an overlay network. I have some hosts on AWS while others are laptops running Ubuntu (the same as on AWS). Every node has a static public IP. I have created an overlay network as:
docker network create --driver=overlay --attachable test-net
I have created the swarm on one of the AWS hosts. Every other node is able to join that swarm.
However when I run docker run -it --name alpine2 --network test-net alpine on any node not on aws, I get the error: docker: Error response from daemon: attaching to network failed, make sure your network options are correct and check manager logs: context deadline exceeded.
But if I run the same on any AWS host, then everything works fine. Is there anything more I need to do in terms of networking/ports if some nodes are on AWS while others are not?
I have opened the ports required for swarm networking on all machines.
EDIT: All the nodes are marked as "active" when listing in the manager node.
UPDATE: Solved this issue by opening the respective ports. It now works if all the nodes are Linux-based. But when I try to make a swarm with a Linux (Ubuntu) manager, macOS machines are not able to join the swarm.
Check if the node is in the drain state:
docker node inspect --format '{{ .Spec.Availability }}' node
If it is, update its availability:
docker node update --availability active node
Here is the explanation:
Resolution
When a node is in the drain state, it is expected behavior that you should not be able to allocate swarm mode resources, such as multi-host overlay network IP addresses, to the node. However, swarm mode does not currently provide a messaging mechanism from the swarm leader, where IP address management occurs, back to the worker node that requested the IP address. So docker run fails with "context deadline exceeded".
Internal engineering issue escalation/292 has been opened to provide a better error message in a future release of the Docker daemon.
source
Check that the ports below are open on both machines:
TCP port 2377
TCP and UDP port 7946
UDP port 4789
You may use ufw to allow the ports:
ufw allow 2377/tcp
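The remaining ports from the list above are opened the same way:
ufw allow 7946/tcp
ufw allow 7946/udp
ufw allow 4789/udp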
I had a similar issue and managed to fix it by making sure the ENGINE VERSION of all nodes was the same:
sudo docker node ls
Another common cause for this is the Ubuntu server installer installing docker via snap; that package is buggy. Uninstall the snap and install docker using apt. And reconsider Ubuntu. :-/

Docker Swarm: deploy containers on Worker node

I am trying to deploy the application on multiple instances.
On the master node, I used this bunch of commands:
docker swarm init
docker network create --attachable --driver overlay fabric
docker stack deploy --compose-file docker-compose-org2.yaml fabric
The services were deployed on the master node and are running properly.
Now I have another compose file, docker-compose-orderer.yaml, which I want to deploy on another AWS instance.
I used the following command on worker node:
docker swarm join --token SWMTKN-1-29jg0j594eluoy8g86dniy3opax0jphhe3a4w3hjuvglekzt1b-525ene2t4297pgpxp5h5ayf89 <IP>:2377
docker stack deploy --compose-file docker-compose-org1.yaml fabric
The command docker stack deploy --compose-file docker-compose-org1.yaml fabric says: this node is not a swarm manager. Use "docker swarm init" or "docker swarm join" to connect this node to swarm and try again
Does anyone know how to deploy the compose file from the worker node?
Any help/suggestion would be appreciated.
Update 1:
The worker node joined the swarm manager successfully:
ID                          HOSTNAME           STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
qz9y7p1ba3prp23xtuv3uo2dk   ip-172-31-18-206   Ready    Active                          18.06.1-ce
no97mrg6f7eftbbeu86xg88d9 * ip-172-31-40-235   Ready    Active         Leader           18.06.1-ce
You must run all docker service and docker stack commands on manager nodes; the swarm will automatically deploy the containers onto the least-used nodes. When you want to explicitly deploy a container on a specific node, you must label that node and work with placement constraints.
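A rough sketch of that (the label role=orderer is made up for illustration; the node name and compose file come from the question): run everything on the manager, label the worker, then deploy the stack whose orderer service is constrained to that label:
docker node update --label-add role=orderer ip-172-31-18-206
docker stack deploy --compose-file docker-compose-orderer.yaml fabric
Here the orderer service in docker-compose-orderer.yaml would carry a placement constraint such as node.labels.role == orderer under deploy.placement.constraints.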

Host unreachable after docker swarm init

I have Windows Server 2016 Core (a Hyper-V VM). Docker is installed and working, and I want to create a swarm.
IP config at the beginning:
1. Ethernet - 192.168.0.1
2. vEthernet (HSN Internal NIC) - 172.30.208.1
Then I run
docker swarm init --advertise-addr 192.168.0.1
Swarm is created, but I have lost my main IP address. IP config:
1. vEthernet (HNS internal NIC) - 172.30.208.1
2. vEthernet (HNS Transparent) - 169.254.225.229
The newly created swarm manager node is not reachable on the main address 192.168.0.1. I can't connect to it, and swarm workers are not able to join with this IP. Where is the problem?
A little late answering this, but... Docker is going to take over your network card when you bring up the swarm. What I did was use two network cards: one I left alone for Docker to use, and the second I used for everything else, including virtual machines.
Currently, you cannot use Docker for Mac or Docker for Windows alone to test a multi-node swarm. For single node swarm cluster,
If you are using Docker for Mac or Docker for Windows to test single-node swarm, simply run docker swarm init with no arguments
However, you can use the included version of Docker Machine to create the swarm nodes (see Get started with Docker Machine and a local VM), then follow the tutorial for all multi-node features
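A rough sketch of that Docker Machine approach (the VM names are arbitrary, and the virtualbox driver is just one option; on Windows you might use the hyperv driver instead):
docker-machine create --driver virtualbox manager1
docker-machine create --driver virtualbox worker1
eval $(docker-machine env manager1)
docker swarm init --advertise-addr $(docker-machine ip manager1)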
For further info read this.
Edit:
Also refer to this
