As we know, a Docker swarm can have more than one manager. Let's suppose that we have 2 nodes and 2 managers (so each node is both manager and worker).
Now, let a client (using the CLI tool) execute the following two separate scenarios:
1. docker service create --name some_service <image>
2. docker service update --force some_service
where the client is launched on one of the swarm nodes.
Where will the above requests be sent? Only to the leader, or to each worker node? How does Docker deal with simultaneous requests?
I assume you're talking about the docker cli talking to the manager api.
The docker cli on a node will default to connecting to localhost. Assuming you're on a manager, you can see which node your cli is talking to with docker node ls.
The * next to a node name indicates that's the one you're talking to.
From there if that node isn't the Leader, it will relay the commands to the Leader node and wait on a response to return to your cli. This all means:
Just ensure you're running the docker cli on a manager node or your cli is configured to talk to one.
It doesn't matter which manager, as they will all relay your command to the current Leader.
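To check which manager currently holds the Leader role and which node your CLI is connected to, something like the following could be used. It is a minimal sketch that assumes the CLI points at a manager; the function names `swarm_leader` and `swarm_self` are made up for illustration.

```shell
# Print the hostname of the current swarm Leader.
# Must run against a manager; workers reject `docker node ls`.
swarm_leader() {
  docker node ls --format '{{.Hostname}} {{.ManagerStatus}}' |
    awk '$2 == "Leader" { print $1 }'
}

# Print the hostname of the node the CLI is talking to
# (the one marked with * in the default `docker node ls` output).
swarm_self() {
  docker node ls --format '{{.Hostname}} {{.Self}}' |
    awk '$2 == "true" { print $1 }'
}
```

If `swarm_self` and `swarm_leader` print different hostnames, your commands are being relayed to the Leader as described above.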
I have a swarm cluster with two nodes: 1 manager and 1 worker. I am running my application on the worker node, and to run a test case I force-removed the manager from the Docker swarm cluster.
My application continues to work, but I would like to know if there is any possibility to add back the force removed manager in the cluster again. (I don't remember the join-token and neither have them copied anywhere)
I understand docker advises to have odd number of manager nodes to maintain quorum, but would like to know if docker has addressed such scenarios anywhere.
docker swarm init --force-new-cluster --advertise-addr node01:2377
When you run the docker swarm init command with the --force-new-cluster flag, the Docker Engine where you run the command becomes the manager node of a single-node swarm which is capable of managing and running services. The manager has all the previous information about services and tasks, worker nodes are still part of the swarm, and services are still running. You need to add or re-add manager nodes to achieve your previous task distribution and ensure that you have enough managers to maintain high availability and prevent losing the quorum.
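Once a manager is back, a forgotten join token is not a problem: any manager can print the tokens again with `docker swarm join-token`. The sketch below builds a ready-to-paste join command; the function name `swarm_join_command` and the address `node01:2377` are illustrative assumptions, not part of the original answer.

```shell
# Print a join command for the given role.
# $1 = role ("worker" or "manager")
# $2 = manager address as host:port (hypothetical example: node01:2377)
swarm_join_command() {
  role=$1
  addr=$2
  # -q prints only the token; must run on a manager node.
  token=$(docker swarm join-token -q "$role") || return 1
  printf 'docker swarm join --token %s %s\n' "$token" "$addr"
}
```

For example, `swarm_join_command manager node01:2377` prints the command to run on the node being re-added. Running `docker swarm join-token manager` without `-q` prints a similar command directly.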
How about creating a new cluster from your former manager machine and letting the worker leave the previous cluster to join the new one? That seems like the only possible solution to me: since your cluster no longer has any managers and the former manager is no longer in any cluster, you can just create a new cluster and let the other workers join it.
# On the former manager machine
$ docker swarm init --advertise-addr <manager ip address>
# Copy the worker join command that this prints
# On the worker machine
$ docker swarm leave
# Paste and run the copied join command here
I am using Docker version 17.12.1-ce.
I have set up a swarm with two nodes, and I have a stack running on the manager, while I need to start new containers on the worker (not within a service, but as stand-alone containers).
So far I have been unable to find a way to instantiate containers on the worker specifically, and/or to verify that the new container actually got deployed on the worker.
I have read the answer to this question, which led me to run containers with the -e option specifying constraint:Role==worker, constraint:node==<nodeId> or constraint:<custom label>==<value>. I also found this GitHub issue from 2016 showing the docker info command outputting just the information I need (i.e. how many containers are on each node at any given time). However, I am not sure whether this is a feature of the stand-alone swarm only, since my docker info output shows only the number of nodes, with no detailed info for each node. I have also tried docker -D info.
Specifically, I need to:
Manually specify which node to deploy a stand-alone container to (i.e. not related to a service).
Check that a container is running on a specific swarm node, or check how many containers are running on a node.
Swarm commands will only care/show service-related containers. If you create one with docker run, then you'll need to use something like ssh node2 docker ps to see all containers on that node.
I recommend you do your best in a Swarm to have all containers as part of a service. If you need a container to run on nodeX, then you can create a service with a "node constraint" using labels and constraints. In this case you could restrict the single replica of that service to a node's hostname.
docker service create --constraint node.hostname==swarm2 nginx
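The label-based variant of the same idea could look like the sketch below: label a node, then constrain the service to nodes carrying that label. The function name `pin_service_to_label` and the example label `tier=batch` are hypothetical.

```shell
# Label a node, then pin a one-replica service to nodes with that label.
# $1 = node name, $2 = label as key=value, $3 = service name, $4 = image
pin_service_to_label() {
  node=$1 label=$2 name=$3 image=$4
  key=${label%%=*}     # text before the first '='
  value=${label#*=}    # text after the first '='
  docker node update --label-add "$label" "$node" &&
    docker service create --name "$name" --replicas 1 \
      --constraint "node.labels.$key==$value" "$image"
}
```

For example, `pin_service_to_label swarm2 tier=batch web nginx` would label swarm2 and place the single nginx replica only on nodes labeled `tier=batch`. Both commands have to be run on a manager.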
To see all tasks on a node from any swarm manager:
docker node ps <nodename_or_id>
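To turn that into a count of service tasks that should be running on a node, a small wrapper along these lines could work. It is only a sketch: the name `running_task_count` is made up, and containers started with plain docker run are not included, since swarm does not track them.

```shell
# Count service tasks whose desired state is "running" on a node.
# $1 = node name or ID; run on a manager.
running_task_count() {
  docker node ps --filter desired-state=running --format '{{.Name}}' "$1" |
    grep -c .
}
```

For example, `running_task_count swarm2` prints the number of tasks scheduled on swarm2.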
I am having a problem trying to implement the best way to add new container to an existing cluster while all containers run in docker.
Assume I have a Docker swarm, and whenever a container stops or fails for some reason, the swarm brings up a new container and expects it to add itself to the cluster.
How can I make any container be able to add itself to a cluster?
I mean, for example, if I want to create a RabbitMQ HA cluster, I need to create a master, and then create slaves, assuming every instance of RabbitMQ (master or slave) is a container, let's now assume that one of them fails, we have 2 options:
1) slave container has failed.
2) master container has failed.
Usually, a service that can run as a cluster can also elect a new leader to be the master. So, assuming this scenario works seamlessly without any intervention, how would a new container added to the swarm (using docker swarm) be able to add itself to the cluster?
The problem here is that the new container is not created with new arguments every time; it is always created as it was first deployed. That means I can't just change its command-line arguments, and since this is a cloud, I can't hard-code an IP to use.
Something here is missing.
Maybe declaring a "Service" at the Docker Swarm level will actually give the new container the ability to add itself to the cluster without really knowing anything about the other machines in the cluster...
There are quite a few options for scaling out containers with Swarm. It can range from being as simple as passing in the information via a container environment variable to something as extensive as service discovery.
Here are a few options:
Pass in IP as container environment variable. e.g. docker run -td -e HOST_IP=$(ifconfig wlan0 | awk '/t addr:/{gsub(/.*:/,"",$2);print$2}') somecontainer:latest
This would set the internal container environment variable HOST_IP to the IP of the machine it was started on.
Service Discovery. Querying a known point of entry to determine the information about any required services such as IP, port, etc.
This is the most common type of scale-out option. You can read more about it in the official Docker docs. The high-level overview is that you set up a service like Consul on the masters, which your services query to find the information of other relevant services. Example: a web server requires a DB. The DB would add itself to Consul, and the web server would start up and query Consul for the database's IP and port.
Network Overlay. Creating a network in swarm for your services to communicate with each other.
$ docker network create -d overlay mynet
$ docker service create --name frontend --replicas 5 -p 80:80/tcp --network mynet mywebapp
$ docker service create --name redis --network mynet redis:latest
This allows the web app to communicate with redis by placing them on the same network.
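On a shared overlay network, swarm's embedded DNS resolves a service name to that service, and the special name `tasks.<service-name>` to the IPs of all its tasks. A freshly started container can use this to find its peers without any hard-coded address. The sketch below assumes `getent` is available inside the image; the function name `peer_ips` is illustrative.

```shell
# List the IPs of all tasks of a service, as seen from inside a container
# attached to the same overlay network.
# $1 = service name (e.g. redis)
peer_ips() {
  getent hosts "tasks.$1" | awk '{ print $1 }' | sort -u
}
```

An entrypoint script could call `peer_ips redis` at startup and pass the result to whatever join mechanism the clustered application provides.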
Lastly, in your example above it would be best to deploy it as 2 separate services which you scale individually, e.g. deploy one MASTER service and one SLAVE service. Then you would scale each depending on the number you need. For example, to scale to 3 slaves you would run docker service scale <SERVICE-ID>=<NUMBER-OF-TASKS>, which would start the additional slaves. In this scenario, if one of the scaled slaves fails, swarm would start a new one to bring the number of tasks back to 3.
Docker images support a HEALTHCHECK instruction.
Add a health check to your image's Dockerfile, for example:
RUN ./
HEALTHCHECK CMD exit 1 (or any command you want to run)
HEALTHCHECK checks the exit status of the command (0 for success, 1 for failure) and reports the container's state as one of:
1. healthy
2. unhealthy
3. starting
Docker swarm automatically replaces unhealthy containers in the swarm cluster.
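A health check can also be attached when the service is created, without rebuilding the image. The following is a rough sketch: the function names are made up, the interval, timeout and retry values are arbitrary, and the check command is whatever makes sense for your image.

```shell
# Create a service with a health check attached at creation time.
# $1 = service name, $2 = image, $3 = command run inside the container
create_checked_service() {
  docker service create --name "$1" \
    --health-cmd "$3" \
    --health-interval 30s --health-timeout 5s --health-retries 3 \
    "$2"
}

# Print a container's health state: starting, healthy, or unhealthy.
# $1 = container name or ID (run on the node hosting the container)
container_health() {
  docker inspect --format '{{.State.Health.Status}}' "$1"
}
```

For example, `create_checked_service web mywebapp 'curl -f http://localhost/ || exit 1'` assumes that curl exists inside the (hypothetical) mywebapp image.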
In Docker swarm mode I can run docker node ls to list swarm nodes, but it does not work on worker nodes. I need a similar function. I know worker nodes do not have a strongly consistent view of the cluster, but there should be a way to get the current leader or a reachable manager.
So is there a way to get current leader/manager on worker node on docker swarm mode 1.12.1?
You can get manager addresses by running docker info from a worker.
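To extract just the addresses, a format template can be applied to the docker info output. This sketch assumes the `Swarm.RemoteManagers` field is populated on your Docker version; the function names are illustrative. Note that it lists the managers known to the worker, not specifically the Leader.

```shell
# Print the manager addresses known to this node, one per line.
manager_addrs() {
  docker info --format '{{range .Swarm.RemoteManagers}}{{println .Addr}}{{end}}'
}

# Print the first known manager address.
first_manager() {
  manager_addrs | head -n 1
}
```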
The docs and the error message from a worker node mention that you have to be on a manager node to execute swarm commands or view cluster state:
Error message from a worker node: "This node is not a swarm manager. Worker nodes can't be used to view or modify cluster state. Please run this command on a manager node or promote the current node to a manager."
After further thought:
One way you could crack this nut is to use an external key/value store like etcd or any other key/value store that Swarm supports and store the elected node there so that it can be queried by all the nodes. You can see examples of that in the Shipyard Docker management / UI project:
Another simple way would be to run a redis service on the cluster and another service to announce the elected leader. This announcement service would have a constraint to only run on the manager node(s): --constraint node.role == manager
There are four machines:
Local, where swarm create will run
My understanding is swarm-master will control agents, but what is Local used for?
It is for generating the discovery token using the Docker Swarm image.
That token is used when creating the swarm master.
This discovery service associates a token with instances of the Docker Daemon running on each node. Other discovery service backends such as etcd, consul, and zookeeper are available.
So the "local" machine is there to make sure the swarm manager discovers nodes. Its functions are:
register: register a new node
watch: callback method for the swarm manager
fetch: fetch the list of entries
See this introduction: