Cannot join Docker manager node in Windows using tokens

My friend and I are trying to connect our Docker daemons using Docker Swarm. We both are using Windows OS and we are NOT on the same network. According to the Docker docs, each Docker host must have the following ports open:
TCP port 2377 for cluster management communications
TCP and UDP port 7946 for communication among nodes
UDP port 4789 for overlay network traffic
We have both added inbound and outbound firewall rules for these ports. Even so, we keep getting the same two errors when trying to join with the token created by the manager node, using the docker swarm join --token command:
1. error response from daemon: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 192.168.65.3:2377: connect: connection refused"
2. Timeout error
Also, if either of us runs docker swarm init, it shows the IP address 192.168.65.3, which isn't part of any network we're connected to. What does that mean?
The Docker overlay tutorial also states that, to connect to the manager node, the worker node should supply the manager's IP address:
docker swarm join --token <TOKEN> \
  --advertise-addr <IP-ADDRESS-OF-WORKER-1> \
  <IP-ADDRESS-OF-MANAGER>:2377
Does this mean that in our case we have to use the public IP address of the manager node, after enabling port forwarding?

Potential network issues aside, here is your problem:
We both are using Windows OS
I have seen this issue in other threads when attempting to use Windows nodes in a multi-node swarm. Here are some important pieces of information from the Docker overlay networks documentation:
Before you can create an overlay network, you need to either initialize your Docker daemon as a swarm manager using docker swarm init or join it to an existing swarm using docker swarm join. Either of these creates the default ingress overlay network which is used by swarm services by default.
Overlay network encryption is not supported on Windows. If a Windows node attempts to connect to an encrypted overlay network, no error is detected but the node cannot communicate.
By default, Docker encrypts all swarm service management traffic. As far as I know, disabling this encryption is not possible. Do not confuse this with the --opt encrypted option, as that involves encrypting application data, not swarm management traffic.
For a single-node swarm, using Windows is just fine. For a multi-node swarm, which would be deployed using Docker stack, I highly recommend using Linux for all worker and manager nodes.
A while ago I was using Linux as a manager node and Windows as a worker node. I noticed that joining the swarm would only work if the Linux machine was the swarm manager; if the Windows machine was the manager, joining would not work. Even after the Windows machine joined the swarm, container-to-container communication over a user-defined overlay network would not work no matter what. Replacing the Windows machine with a Linux machine fixed all issues.

Related

Add a VM running Ubuntu as a worker node in Docker Swarm

I'm trying to create a swarm consisting of 2 nodes. Using docker-machine, it is easy to provision a VM and add it as a node, but I want to create a swarm with an Ubuntu VM as the worker and Docker on Windows as the manager, without using docker-machine.
Running
docker swarm init
in Windows (the host machine) gives me a token to add a worker. I have Ubuntu running in VirtualBox with Docker installed, and I'm able to ssh into it and run commands. But whenever I try to add this Ubuntu machine as a worker node using the token generated on the Windows machine, it says
Error response from daemon: Timeout was reached before node joined. The attempt to join the swarm will continue in the background. Use the "docker info" command to see the current swarm status of your node.
I think it is related to port forwarding. I'm forwarding my VM port 22 to 127.0.0.1:22 in VirtualBox for connecting via SSH. I have tried several forwarding combinations, but the VM is still not able to join the swarm that I created in Windows.
Any guidance will be of great value.
Check if you have connectivity from your Ubuntu to your Windows machine. First, ssh to your Ubuntu and check:
Windows is addressable, for example using ping windows-ip.
If it is not, make sure both machines are on the same network, for example by configuring a bridged network adapter in your VM settings.
Windows is listening on the ports needed by Docker Swarm:
TCP port 2376 for secure Docker client communication. This port is required for Docker Machine to work. Docker Machine is used to orchestrate Docker hosts.
TCP port 2377. This port is used for communication between the nodes of a Docker Swarm or cluster. It only needs to be opened on manager nodes.
TCP and UDP port 7946 for communication among nodes (container network discovery).
UDP port 4789 for overlay network traffic (container ingress networking).
You can check this using telnet windows-ip port.
If they are not reachable, check your Windows firewall.
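The TCP part of this check can also be scripted instead of running telnet once per port. The following is a minimal sketch, assuming bash and the coreutils timeout command are available on the Ubuntu VM. It swaps telnet for bash's built-in /dev/tcp redirection; the function names and the 3-second timeout are illustrative choices. UDP ports (7946/udp and 4789/udp) cannot be verified this way and are not covered.

```shell
# Sketch: probe the swarm TCP ports on a remote host.
# Function names and the 3-second timeout are illustrative assumptions.

# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
tcp_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Print one line per TCP port: "<port>/tcp open" or "<port>/tcp unreachable".
check_swarm_tcp_ports() {
  for port in 2377 7946; do
    if tcp_open "$1" "$port"; then
      echo "${port}/tcp open"
    else
      echo "${port}/tcp unreachable"
    fi
  done
}
```

Call it as check_swarm_tcp_ports windows-ip. Any "unreachable" line points at the Windows firewall or at the VM network mode.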
I hope it helps!
I tried to create a similar Swarm with a Windows manager node but never really got it to work. You can initialize a single-node Swarm from Windows with docker swarm init. However, adding multiple worker nodes does not appear to be supported at the moment:
https://docs.docker.com/engine/swarm/swarm-tutorial/.
"Currently, you cannot use Docker Desktop for Mac or Docker Desktop for Windows alone to test a multi-node swarm".
The following options are possible:
Pure Linux swarm (Linux manager + Linux workers) which runs only Linux containers
Hybrid Swarm (Linux manager + Windows workers + Linux workers) which runs Windows and Linux containers
(Sometimes) Pure Windows Swarm using Win Server 2019 as the manager. The regular Windows updates have been known to break various features of Swarm. For example, https://github.com/moby/moby/issues/40998
Then everyone either tries workarounds or waits for the next Windows update to fix the problem.
Personally I've had good luck with hybrid Swarm. It works fine with simple Ubuntu manager + standard Windows 10 workers. No need for Win Server.

Docker swarm overlay Connect: no route to host

I have a swarm with 2 nodes. One is an Ubuntu VM on Azure and the other is a VM on my local machine.
When the containers try to make requests to each other, I get this error: dial tcp 10.0.0.88:9999: connect: no route to host
I've opened all the required swarm communication ports on both nodes: TCP 2377, TCP/UDP 7946, and UDP 4789.
Communication works if I run everything locally.
Any ideas?
Thanks
An overlay network doesn't create connectivity between two nodes; it requires existing connectivity, and then uses it to connect the containers running on each node. Per the prerequisites, each node needs to be able to reach the overlay ports on every other node in the cluster. See the documentation for more details:
https://docs.docker.com/network/overlay/

Error response from daemon: attaching to network failed, make sure your network options are correct and check manager logs: context deadline exceeded

I am trying to set up Docker Swarm with an overlay network. Some hosts are on AWS, while others are laptops running Ubuntu (the same OS as on AWS). Every node has a static public IP. I created an overlay network with:
docker network create --driver=overlay --attachable test-net
I initialized the swarm on one of the AWS hosts. Every other node is able to join that swarm.
However, when I run docker run -it --name alpine2 --network test-net alpine on any node not on AWS, I get the error: docker: Error response from daemon: attaching to network failed, make sure your network options are correct and check manager logs: context deadline exceeded.
But if I run the same command on any AWS host, everything works fine. Is there anything more I need to do in terms of networking or ports if some nodes are on AWS while others are not?
I have opened the ports required for swarm networking on all machines.
EDIT: All the nodes are marked as "active" when listing in the manager node.
UPDATE: I solved this issue by opening the respective ports. It now works if all the nodes are Linux-based. However, when the manager runs Linux (Ubuntu), macOS machines are not able to join the swarm.
Check whether the node is in the drain state:
docker node inspect --format {{.Spec.Availability}} node
If it is, set its availability back to active:
docker node update --availability active node
Here is the explanation:
Resolution
When a node is in drain state, it is expected behavior that you should
not be able to allocate swarm mode resources such as multi-host
overlay network IP addresses to the node. However, swarm mode does not
currently provide a messaging mechanism between the swarm leader where
IP address management occurs back to the worker node that requested
the IP address. So docker run fails with context deadline exceeded.
Internal engineering issue escalation/292 has been opened to provide a
better error message in a future release of the Docker daemon.
source
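The inspect and update commands from this answer can be combined so that the update only runs when the node really is drained. The following is a minimal sketch, assuming it is run on a manager node; the function name reactivate_if_drained and its node argument are illustrative and not part of the Docker CLI.

```shell
# Sketch: set a node back to "active" only if it is currently drained.
# $1 is a node name or ID as shown by "docker node ls" (run on a manager).
reactivate_if_drained() {
  # Ask the manager for the node's current availability.
  state=$(docker node inspect --format '{{.Spec.Availability}}' "$1") || return 1
  if [ "$state" = "drain" ]; then
    docker node update --availability active "$1"
  else
    echo "node $1 availability is '$state'; nothing to do"
  fi
}
```

A node that is already active or paused is left untouched, so the function is safe to run repeatedly.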
Check that the ports below are open on both machines:
TCP port 2377
TCP and UDP port 7946
UDP port 4789
You may use ufw to allow the ports:
ufw allow 2377/tcp
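The command above opens only port 2377. A minimal sketch that covers the whole list, assuming ufw is the active firewall and the shell is running as root; the function name open_swarm_ports is made up for illustration:

```shell
# Sketch: allow every swarm port listed above through ufw (run as root).
open_swarm_ports() {
  for rule in 2377/tcp 7946/tcp 7946/udp 4789/udp; do
    # Stop at the first rule that ufw rejects.
    ufw allow "$rule" || return 1
  done
}
```

Run open_swarm_ports on every node, then confirm the result with ufw status.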
I had a similar issue and managed to fix it by making sure the ENGINE VERSION was the same on all nodes. That column is shown by:
sudo docker node ls
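To compare versions without reading the table by eye, the ENGINE VERSION column can be pulled out with a format template. The following is a sketch, assuming it runs on a manager node with permission to call docker; both function names are illustrative.

```shell
# Sketch: list the distinct engine versions present in the swarm.
swarm_engine_versions() {
  docker node ls --format '{{.EngineVersion}}' | sort -u
}

# Succeed only if every node reports the same engine version.
engine_versions_match() {
  [ "$(swarm_engine_versions | wc -l)" -eq 1 ]
}
```

If engine_versions_match fails, swarm_engine_versions shows which versions are in play.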
Another common cause is the Ubuntu Server installer installing Docker via snap; that package is buggy. Uninstall the snap and install Docker using apt. And reconsider Ubuntu. :-/

Host unreachable after docker swarm init

I have Windows Server 2016 Core (a Hyper-V VM). Docker is installed and working, and I want to create a swarm.
IP config at the beginning:
1. Ethernet - 192.168.0.1
2. vEthernet (HNS Internal NIC) - 172.30.208.1
Then I run
docker swarm init --advertise-addr 192.168.0.1
Swarm is created, but I have lost my main IP address. IP config:
1. vEthernet (HNS internal NIC) - 172.30.208.1
2. vEthernet (HNS Transparent) - 169.254.225.229
The created swarm manager node is not reachable on its main address, 192.168.0.1. I can't connect to it, and swarm workers are not able to join using this IP. What is the problem?
A little late answering this but ... Docker is going to take over your network card when you bring up the Swarm. What I did was use two network cards: one I left alone for Docker to use and the second I used for everything else including virtual machines.
Currently, you cannot use Docker for Mac or Docker for Windows alone to test a multi-node swarm.
For a single-node swarm: if you are using Docker for Mac or Docker for Windows, simply run docker swarm init with no arguments.
However, you can use the included version of Docker Machine to create the swarm nodes (see Get started with Docker Machine and a local VM), then follow the tutorial for all multi-node features.
For further info read this
Edit:
Also refer to this

Can Consul be run inside a Docker container using Docker for Windows?

I am trying to make Consul work inside a Docker container, using Docker for Windows with Linux containers. I am using the official Consul Docker image. The documentation states that the container must use --net=host for Consul's consensus and gossip protocols.
The problem is, as far as I can tell, that Docker for Windows uses a Linux VM under the hood, and the "host" of the container is not the actual host machine, but that VM. I could not find a combination of -bind, -client and -advertise parameters (IP addresses), so that:
Other Consul agents on other hosts can connect to the local agent using the host machine's IP address.
Other containerized services on the same host can query the local agent's REST interface.
Whenever I pass the host machine's LAN IP address through -advertise, I get these errors inside the container:
2018/04/03 15:15:55 [WARN] consul: error getting server health from "linuxkit-00155d02430b": rpc error getting client: failed to get conn: dial tcp 127.0.0.1:0->10.241.2.67:8300: connect: invalid argument
2018/04/03 15:15:56 [WARN] consul: error getting server health from "linuxkit-00155d02430b": context deadline exceeded
Also, other agents on other hosts cannot connect to that agent.
Using -bind on that address fails. My guess is that, since the container is inside the Linux VM, the host machine's address is not the address of the container's host, and therefore cannot be bound.
I have tried various combinations of -bind, -client and -advertise, using addresses like 0.0.0.0, 127.0.0.1, 10.0.75.2 (the address on the Docker virtual switch) and the host machine's IP, but to no avail.
I am now wondering whether this is achievable at all. I have been trying this for quite some time, and I am despairing. Any advice would be appreciated!
I have tried the whole process without using --net=host, and everything works fine. I can connect agents across hosts, and I can query the local agent's REST interface from other containerized applications. Is --net=host really crucial to the functioning of Consul?
