Should Swarm Master Join As Node in a Single Node Cluster? - docker

We are building a small cluster, and a (strange) requirement is to set up everything on one machine, which other machines can join in the future.
I set up consul with:
docker run -d -p 8500:8500 --name=consul progrium/consul -server -bootstrap
and the master with:
docker run -d -p 4000:4000 swarm manage -H :4000 --advertise <ip_here>:4000 consul://<ip_here>:8500
where docker is run with:
sudo docker daemon -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock
At this stage, docker -H :4000 info lists Nodes as 0, and I cannot run any images with docker -H :4000 run <image>; it fails with "No healthy node available in the cluster".
When I join the master node to the cluster with:
docker run -d swarm join --advertise=<ip_here>:2375 consul://<ip_here>:8500
Then docker -H :4000 info lists the Nodes as 1, and I can run containers.
Note that every <ip_here> refers to the same IP address, that of this one machine.
Is this the intended behaviour? If not, what am I doing wrong?

After seeing how Docker Machine creates a Swarm cluster, and after using the Swarm mode integrated into Docker v1.12.0, I wanted to post an update. The Swarm master does join the Swarm cluster: the machine runs two containers, a master and an agent.
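A minimal sketch of that two-container layout on one host, reusing the legacy commands from the question. The helper name print_single_host_swarm and the example IP 192.0.2.10 are hypothetical; the function only prints the commands so they can be reviewed before being run.

```shell
#!/usr/bin/env bash
# Sketch: print the legacy Swarm commands for a one-machine cluster.
# print_single_host_swarm is a hypothetical helper; pass the IP that the
# machine advertises to nodes that may join later.
print_single_host_swarm() {
  local ip="$1"
  # Discovery backend (Consul), as in the question.
  echo "docker run -d -p 8500:8500 --name=consul progrium/consul -server -bootstrap"
  # Swarm manager: schedules containers but does not host any itself.
  echo "docker run -d -p 4000:4000 swarm manage -H :4000 --advertise ${ip}:4000 consul://${ip}:8500"
  # Swarm agent: registers the local Docker daemon (port 2375) as a node.
  echo "docker run -d swarm join --advertise=${ip}:2375 consul://${ip}:8500"
}

# Example address only; replace with the machine's real IP.
print_single_host_swarm "192.0.2.10"
```

Without the third command the manager has no node to schedule on, which matches the "Nodes: 0" state described in the question.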

As for me, I use the Swarm Master as the Consul server. This answer may help you. So, to answer: the Swarm master does not join as a node in a single-node cluster.
You can't deploy Swarm on a single node. That's not its purpose, and it cannot work that way. Swarm turns a pool of Docker hosts into a single, virtual Docker host, so if the pool contains zero hosts, there is no Docker agent to host containers.

Related

Is there a way to setup a test docker swarm on a single machine?

I am trying to set up a Docker swarm on WSL2 for testing purposes. I want to know whether it is possible to have a swarm with multiple "dummy" nodes on a single machine.
Here are the two ways that I tried:
Approach 1: run multiple WSL instances, as suggested here.
PS C:\Users\jdu> wsl -l
Windows Subsystem for Linux Distributions:
Ubuntu3
Ubuntu
Ubuntu2
Docker is installed and running in each WSL instance, so I managed to initialize a swarm on Ubuntu and have Ubuntu2 and Ubuntu3 join as workers.
On Ubuntu
$ docker swarm init
Swarm initialized: current node (hude19jo7t9dqpe0akg55ipmy) is now a manager.
On Ubuntu2
$ docker swarm join --token SWMTKN-1-xxxxxxxxx-xxxxxxxxx 192.168.189.5:2377 --listen-addr 0.0.0.0:12377
This node joined a swarm as a manager.
Then if I check on Ubuntu
$ docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
hude19jo7t9dqpe0akg55ipmy * laptop-ebc155 Ready Active Leader 20.10.21
ozeq43yukgfbltjnfya0tlx08 laptop-ebc155 Ready Active Reachable 20.10.20
Approach 2: inspired by the ideas here, I tried docker-in-docker containers, i.e. I deploy multiple Docker instances on a single WSL instance.
# Init Swarm master
docker swarm init
# Get join token:
SWARM_TOKEN=$(docker swarm join-token -q worker)
echo $SWARM_TOKEN
# Get Swarm master IP (Docker for Mac xhyve VM IP)
SWARM_MASTER_IP=$(docker info | grep -w 'Node Address' | awk '{print $3}')
echo $SWARM_MASTER_IP
DOCKER_VERSION=dind
# setup deploy Docker-in-Docker containers and join them to a swarm
docker run -d --privileged --name worker-1 --hostname=worker-1 -p 12377:2377 docker:${DOCKER_VERSION}
docker exec worker-1 docker swarm join --token ${SWARM_TOKEN} ${SWARM_MASTER_IP}:2377
docker run -d --privileged --name worker-2 --hostname=worker-2 -p 22377:2377 docker:${DOCKER_VERSION}
docker exec worker-2 docker swarm join --token ${SWARM_TOKEN} ${SWARM_MASTER_IP}:2377
docker run -d --privileged --name worker-3 --hostname=worker-3 -p 32377:2377 docker:${DOCKER_VERSION}
docker exec worker-3 docker swarm join --token ${SWARM_TOKEN} ${SWARM_MASTER_IP}:2377
After that
$ docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
s371tmygu9h640xfosn6kyca4 * laptop-ebc155 Ready Active Leader 20.10.21
w1ina9ttvje4hn6r13p3gzbge worker-1 Ready Active 20.10.20
m8mqky6jchjao01nz8t5e392a worker-2 Ready Active 20.10.20
n29afhbb090tlyn9p0byga9au worker-3 Ready Active 20.10.20
To test the two swarm setups above, I use a very simple compose file as suggested by the official docs. As you might expect, neither setup worked well :/
If MongoDB and MongoExpress are deployed on different nodes, both swarm setups show the same error: MongoNetworkError: failed to connect to server [mongo:27017] on first connect. My understanding of this error is that MongoExpress cannot reach MongoDB at mongo:27017, which looks like a problem with Docker's internal DNS. Can someone help me out? Or feel free to tell me not to try this single-machine, multi-node idea anymore :D I would really appreciate any help!
I just tried the same two exercises :)
Approach 1 - swarm nodes in WSL instances
I think this is currently impossible because of the WSL2 design; see https://github.com/microsoft/WSL/issues/4304. WSL2 instances in fact share their network setup: IP, interfaces, network namespaces, and so on. Every change made in one of them is immediately visible in all the others, and this conflicts with the virtual interfaces and namespaces that docker swarm nodes create when they start up.
I tried configuring multiple IP addresses on the eth0 interface so that each node could have its own (like here), and then used the --advertise-addr and --listen-addr options in the docker swarm init and docker swarm join commands. I still get this error in the dockerd logs:
moving interface ov-001000-yis5e to host ns failed, invalid argument, after config error error setting interface \"ov-001000-yis5e\" IP to 10.0.0.1/24: cannot program address 10.0.0.1/24 in sandbox interface because it conflicts with existing route {Ifindex: 4 Dst: 10.0.0.0/24 Src: 10.0.0.1 Gw: <nil> Flags: [] Table: 254}"
I believe docker swarm hits a problem here because it already sees the master's interfaces when it tries to set up routing mesh networking for the worker, all because the master and the worker share one network config.
Approach 2 - swarm nodes as docker containers (docker-in-docker)
But I got approach 2 working with just a small change to the swarm init command:
# advertise swarm on default bridge network
docker swarm init --advertise-addr 172.17.0.1
For me, the standard docker swarm init selected the eth0 address by default, which only worked for communication from dind to WSL, but not the other way round.
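Rather than hard-coding 172.17.0.1, the bridge address can be read from the host. This is only a sketch: first_ipv4 is a hypothetical helper, and it assumes the default bridge is named docker0 and that ip prints the usual "inet ADDRESS/PREFIX" line.

```shell
#!/usr/bin/env bash
# Sketch: extract the first IPv4 address from `ip -4 addr show <iface>` output.
# first_ipv4 is a hypothetical helper that reads that output on stdin.
first_ipv4() {
  awk '$1 == "inet" { sub(/\/.*/, "", $2); print $2; exit }'
}

# Demo on canned output, so the parsing can be checked without Docker installed.
printf '4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500\n    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0\n' | first_ipv4

# On a real host (assumption: the bridge interface is called docker0):
#   ADVERTISE_ADDR=$(ip -4 addr show docker0 | first_ipv4)
#   docker swarm init --advertise-addr "$ADVERTISE_ADDR"
```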
Another, probably unrelated, problem was that I could not access services/stacks run this way from the Windows host. This seems to be a WSL bug, and luckily there is a workaround.
One last hint about this mongo stack: patience. The stack consists of 2 services: mongo, the database, and mongo-express, the client. The mongo image is a lot bigger (~600MB) than the mongo-express image (~135MB). The mongo-express image will be downloaded faster, and its container will be recreated by swarm multiple times before mongo has even started. Note also that in this setup each worker downloads the images independently, so rebalancing may also take some time.
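Instead of checking by hand, a small polling loop can wait until every service reports its desired replica count. This is a sketch only: all_replicas_ready is a hypothetical helper, and it assumes its input has the "NAME CURRENT/DESIRED" shape produced by docker service ls --format '{{.Name}} {{.Replicas}}'.

```shell
#!/usr/bin/env bash
# Sketch: succeed only when every "NAME CURRENT/DESIRED" line on stdin has
# CURRENT equal to DESIRED. all_replicas_ready is a hypothetical helper.
# Note: empty input also counts as ready.
all_replicas_ready() {
  awk '{ split($2, r, "/"); if (r[1] != r[2]) bad = 1 } END { exit (bad ? 1 : 0) }'
}

# Demo on canned output: mongo has not started yet, so this is not ready.
printf 'mongo_mongo 0/1\nmongo_mongo-express 1/1\n' | all_replicas_ready || echo "still converging"

# On a real swarm:
#   until docker service ls --format '{{.Name}} {{.Replicas}}' | all_replicas_ready; do
#     sleep 10
#   done
```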
I found these commands useful to see what is really happening:
# overview of services
docker service ls
# containers in each swarm service
docker service ps $(docker service ls --format '{{.Name}}')
# images in each dind worker
for i in $(seq "${NUM_WORKERS}"); do
  docker exec worker-${i} docker images
done
# containers in each dind worker
for i in $(seq "${NUM_WORKERS}"); do
  docker exec worker-${i} docker ps -a
done
Full listing of commands necessary to get working docker swarm using dind:
docker swarm init --advertise-addr docker0
SWARM_TOKEN=$( docker swarm join-token -q worker)
echo $SWARM_TOKEN
SWARM_MASTER_IP=$( docker info 2>&1 | grep -w 'Node Address' | awk '{print $3}')
echo $SWARM_MASTER_IP
DOCKER_VERSION=20.10.12-dind
NUM_WORKERS=3
# Run NUM_WORKERS workers with SWARM_TOKEN
for i in $(seq "${NUM_WORKERS}"); do
  docker run -d --privileged --name worker-${i} --hostname=worker-${i} docker:${DOCKER_VERSION}
  sleep 5
  docker exec worker-${i} docker swarm join --token ${SWARM_TOKEN} ${SWARM_MASTER_IP}:2377
done
# Setup the visualizer
docker service create \
--detach=true \
--name=viz \
--publish=8000:8080/tcp \
--constraint=node.role==manager \
--mount=type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock \
dockersamples/visualizer
####### play with mongo
mkdir mongodemo && cd mongodemo
wget https://raw.githubusercontent.com/docker-library/docs/f6c9b596064e2eed9c3b6ac75bea606cb6d94099/mongo/stack.yml
docker stack deploy -c stack.yml mongo
# from windows:
# mongo-express will be available under <eth0>:8081
# visualizer under <eth0>:8000
ip -4 addr | grep eth0

How can I replicate containers inside docker service when I restart docker daemon?

I created a Docker service from the Percona XtraDB Cluster image with 3 replicas, using the following command:
docker service create \
--name mysql-galera \
--replicas 3 \
-p 3306:3306 \
--network mynet \
--env MYSQL_ROOT_PASSWORD=mypassword \
--env DISCOVERY_SERVICE=10.0.0.2:2379 \
--env XTRABACKUP_PASSWORD=mypassword \
--env CLUSTER_NAME=galera \
perconalab/percona-xtradb-cluster:5.6
I had already initialized a Docker swarm with three machines (named mach1, mach2, and mach3), all joined as managers. The replicas were distributed evenly, one on each of the three machines.
When I stopped the Docker daemon on mach2, Docker created one more replica container on mach3. When I restarted the daemon, mach3 was still running two replicas and mach2 none. Only after I manually removed one container on mach3 did mach2 come up with the third replica.
What should I do to automatically replicate containers on a restarted Docker machine?
I don't think a service rebalance option is available in Docker swarm. You can redistribute the containers by updating the service with the --force option:
docker service update mysql-galera --force
What you are looking for is a way to rebalance your replica set automatically, based on the availability of manager/worker nodes with capacity. However, doing this would involve killing healthy containers and rescheduling them.
There is an open issue in docker swarm for this:
https://github.com/moby/moby/issues/24103
Right now, if you want to do this, the best way seems to be to scale up/down or to trigger a rollout, which will attempt to reschedule the containers in the swarm.
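Before forcing a rollout, it can help to see how the tasks are actually spread. The following is a sketch under assumptions: tasks_per_node is a hypothetical helper that counts node names fed to it on stdin, one per running task. Nodes with zero tasks simply do not appear in its output.

```shell
#!/usr/bin/env bash
# Sketch: count running tasks per node from a list of node names on stdin.
# tasks_per_node is a hypothetical helper; it prints "NODE COUNT" per line.
tasks_per_node() {
  sort | uniq -c | awk '{ print $2, $1 }'
}

# Demo with the situation from the question: mach2 is empty, mach3 holds two.
printf 'mach1\nmach3\nmach3\n' | tasks_per_node

# On a real swarm:
#   docker service ps --filter desired-state=running --format '{{.Node}}' mysql-galera | tasks_per_node
```

If a node is missing from the output or another node holds more than its share, docker service update mysql-galera --force is the manual way to trigger rescheduling.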

Adding services in different consul clients running on same host

I've followed the section on testing a Consul cluster on a single host. Three Consul servers were successfully added and are running on the same host for testing purposes. Afterwards, I also followed the tutorial and created a Consul client, node4, to expose ports. Is it possible to add more services and bind them to one of those Consul clients?
Use the new 'swarm mode' instead of the legacy Swarm. Swarm mode doesn't require Consul; service discovery and the key/value store are now part of the Docker daemon. Here's how to create a highly available 3-node cluster (3 managers).
Create three nodes
docker-machine create --driver vmwarefusion node01
docker-machine create --driver vmwarefusion node02
docker-machine create --driver vmwarefusion node03
Find the ip of node01
docker-machine ls
Set one as the initial swarm master
docker $(docker-machine config node01) swarm init --advertise-addr <ip-of-node01>
Retrieve the token to let other nodes join as master
docker $(docker-machine config node01) swarm join-token manager
This will print out something like
docker swarm join \
--token SWMTKN-1-0siwp7rzqeslnhuf42d16zcwodk543l99liy0wuq1mern8s8u9-8mbsrxzu9mgfw7x6ehpxh0dof \
192.168.40.144:2377
Add the other two nodes to the swarm as masters
docker $(docker-machine config node02) swarm join \
--token SWMTKN-1-0siwp7rzqeslnhuf42d16zcwodk543l99liy0wuq1mern8s8u9-8mbsrxzu9mgfw7x6ehpxh0dof \
192.168.40.144:2377
docker $(docker-machine config node03) swarm join \
--token SWMTKN-1-0siwp7rzqeslnhuf42d16zcwodk543l99liy0wuq1mern8s8u9-8mbsrxzu9mgfw7x6ehpxh0dof \
192.168.40.144:2377
Examine the swarm
docker node ls
You should now be able to shut down the leader node and see another node take over as leader.
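The reason three managers survive the loss of one is that swarm managers need a majority to keep working, so N managers tolerate (N - 1) / 2 failures, rounded down. A small sketch of that arithmetic (manager_fault_tolerance is a hypothetical helper name):

```shell
#!/usr/bin/env bash
# Sketch: how many manager failures a swarm with N managers can tolerate
# while keeping a majority. manager_fault_tolerance is a hypothetical helper.
manager_fault_tolerance() {
  local managers="$1"
  # Integer division rounds down, which is what a majority quorum requires.
  echo $(( (managers - 1) / 2 ))
}

for n in 1 3 5; do
  echo "${n} managers tolerate $(manager_fault_tolerance "$n") failure(s)"
done
```

With the three managers created above, the result is 1, so stopping the leader leaves two managers, still a majority.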
Best practice for Consul is to run one Consul agent per host and, when you want to talk to Consul, always talk to the local one. In general, everything one Consul node knows, every other Consul node also knows, so you can just talk to your localhost Consul (127.0.0.1:8500) and do everything you need to do. When you add services, you add them to the local Consul node that runs the service's process. There are projects like Registrator (https://github.com/gliderlabs/registrator) that automatically add services from running Docker containers, which makes life easier.
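As a sketch of registering a service with the local agent by hand, the helper below builds a minimal payload for Consul's /v1/agent/service/register HTTP endpoint. The helper name consul_service_payload and the example service "web" on port 8080 are hypothetical, and the agent is assumed to listen on 127.0.0.1:8500.

```shell
#!/usr/bin/env bash
# Sketch: build a minimal JSON payload for registering a service with the
# local Consul agent. consul_service_payload is a hypothetical helper; it
# does no JSON escaping, so keep the service name to plain characters.
consul_service_payload() {
  local name="$1" port="$2"
  printf '{"Name":"%s","Port":%d}\n' "$name" "$port"
}

# Print the payload for an example service (name and port are made up).
consul_service_payload web 8080

# To register it with the agent on this host:
#   consul_service_payload web 8080 |
#     curl --request PUT --data @- http://127.0.0.1:8500/v1/agent/service/register
```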
Overall, welcome to Consul, it's great stuff!

Docker ping container on other nodes

I have 2 virtual machines (VM1 with IP 192.168.56.101 and VM2 with IP 192.168.56.102, which can ping each other), and these are the steps I follow:
- Create consul container on VM1 with 'docker run -d -p 8500:8500 --name=consul progrium/consul -server -bootstrap'
- Create swarm manager on VM1 with 'docker run -d -p 3376:3376 swarm manage -H 0.0.0.0:3376 --advertise 192.168.56.101:3376 consul://192.168.56.101:8500'
- Create swarm agents on each VM with 'docker run -d swarm join --advertise <VM-IP>:2376 consul://192.168.56.101:8500'
If I run docker -H 0.0.0.0:3376 info I can see both nodes connected to the swarm, and both are healthy. I can also run containers, and they are scheduled to the nodes. However, if I create a network, attach a few containers to it, and then SSH into one of them and try to ping all the others, I can only reach those running on the same virtual machine.
Both Virtual Machines have these DOCKER_OPTS:
DOCKER_OPTS="--cluster-store=consul://192.168.56.101:8500 --cluster-advertise=<VM-IP>:0 -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock"
I don't have a direct quote, but from what I've read on Docker GitHub issue tracker, ICMP packets (ping) are never routed between containers on different nodes.
TCP connection to explicitly opened ports should work. But as of Docker 1.12.1 it is buggy.
Docker 1.12.2 has some bug fixes wrt establishing a connection to containers on other hosts. But ping is not going to work across hosts.
You can only ping containers on the same node because you attach them to a local scope network.
As suggested in the comments, if you want to ping containers across hosts (meaning from a container on VM1 to a container on VM2) using docker swarm (or docker swarm mode) without explicitly opening ports, you need to create an overlay network (or globally scoped network) and assign/start containers on that network.
To create an overlay network:
docker network create -d overlay mynet
Then start the containers using that network:
For Docker Swarm mode:
docker service create --replicas 2 --network mynet --name web nginx
For Docker Swarm (legacy):
docker run -itd --network=mynet busybox
For example, if we create two containers (on legacy Swarm):
docker run -itd --network=mynet --name=test1 busybox
docker run -itd --network=mynet --name=test2 busybox
You should be able to docker attach to test2 and ping test1, and vice versa.
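As an alternative to attaching interactively, the same check can be run non-interactively with docker exec. This sketch only prints the commands: ping_pairs is a hypothetical helper, and it assumes the containers' image ships a ping that accepts -c, as busybox does.

```shell
#!/usr/bin/env bash
# Sketch: print a ping command for every ordered pair of container names.
# ping_pairs is a hypothetical helper; it prints commands, it does not run them.
ping_pairs() {
  local src dst
  for src in "$@"; do
    for dst in "$@"; do
      if [ "$src" != "$dst" ]; then
        echo "docker exec ${src} ping -c 3 ${dst}"
      fi
    done
  done
}

ping_pairs test1 test2
```

Containers on the same overlay network resolve each other by name, so the container names double as ping targets.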
For more details you can refer to the networking documentation.
Note: If containers still can't ping each other after the creation of an overlay network and attaching containers to it, check the firewall configurations of the VMs and make sure that these ports are open:
data plane / vxlan: UDP 4789
control plane / gossip: TCP/UDP 7946
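A sketch of opening those ports, assuming the VMs use ufw as their firewall (an assumption; adapt it for iptables or firewalld). The helper overlay_firewall_rules is hypothetical and only prints the commands for review.

```shell
#!/usr/bin/env bash
# Sketch: print ufw commands for the overlay-network ports listed above.
# overlay_firewall_rules is a hypothetical helper; ufw itself is an assumption.
overlay_firewall_rules() {
  local rule
  # 4789/udp: VXLAN data plane; 7946/tcp and 7946/udp: gossip control plane.
  for rule in 4789/udp 7946/tcp 7946/udp; do
    echo "ufw allow ${rule}"
  done
}

overlay_firewall_rules
# Run the printed commands as root on every VM, then retry the ping test.
```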

Need Explanation for the docker documentation on the swarm

I finished this documentation:
https://docs.docker.com/swarm/install-w-machine/
It works fine.
Next I tried to set this up on EC2 instances by following this documentation:
https://docs.docker.com/swarm/install-manual/
I am at Step 4, "Set up a discovery backend".
I cannot understand what I need to do next.
I created 5 nodes in EC2: manager0, manager1, consul0, node0, node1. Now I need to know how to set up service discovery with swarm.
In that document they ask us to connect to manager0 and consul0 and run ifconfig, and then they refer to the eth0 interface. I don't know where this comes from.
Ultimately I need to know where (in which node?) to run this command:
$ docker run -d -p 8500:8500 --name=consul progrium/consul -server -bootstrap
Any suggestions on how to get past this step?
Consul will run on the consul0 server you created. So basically you first need to be able to run docker on worker0 and worker1 remotely; this is step 3. A better way of doing this is to edit the daemon configuration directly with the command:
echo 'DOCKER_OPTS="-H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock"' | sudo tee /etc/default/docker
Then restart docker. Afterwards you will find that you can run docker remotely from master0, master1, or any other instance behind your firewall with docker commands that start with:
docker -H $WORKER0_IPADDRESS:2375
For example, if your worker's IP address was 1.2.3.4, this would run the docker ps command remotely:
docker -H 1.2.3.4:2375 ps
This is what swarm runs on. Then start up your Consul server with the command you quoted; you got that one right. That's it: you won't do anything else with the consul0 server except use its IP address when you run your swarm commands.
So if $CONSUL0 holds the IP address of your Consul server, this is how you would set up the rest of swarm, running each command locally on the respective node:
On consul0:
docker run -d -p 8500:8500 --restart=unless-stopped --name=consul progrium/consul -server -bootstrap
On master0 and master1:
docker run --name=master -d -p 4000:4000 swarm manage -H :4000 --replication --advertise $(hostname -i):4000 consul://$CONSUL0:8500
On worker0 and worker1:
docker run -d --name=worker swarm join --advertise=$(hostname -i):2375 consul://$CONSUL0:8500/

Resources