Node cannot join Swarm Cluster - docker

I have 3 VMs. They all have Docker 1.12 and they are running CentOS 7.
All the ports are open and the VMs are able to ping each other.
I started my cluster with
docker swarm init --advertise-addr 192.168.140.12
Docker info showed me:
Swarm: active
NodeID: 0drcj2nku1mv8t16fxva48edxx
Is Manager: true
ClusterID: cchn0yzospwoe1h9f55d7omxx
Managers: 1
Nodes: 1
Now I try to join nodes (the other VMs) to the cluster, using the command that was recommended after initializing my manager.
docker swarm join \
--token SWMTKN-1-48ythur5k6ckkz90ttlprw37p9z3ldclws51qirw5wdyfmvevr-3sb2t66b2fj6e4dhmfo1vavxx \
192.168.140.12:2377
But I got:
Error response from daemon: Timeout was reached before node was joined. Attempt to join the cluster will continue in the background. Use "docker info" command to see the current swarm status of your node.
Docker info showed me:
Swarm: pending
NodeID:
Error: rpc error: code = 1 desc = context canceled
Is Manager: false
Node Address: 192.168.140.14
On the cluster manager:
# netstat -tulpn | grep docker
tcp6 0 0 :::2377 :::* LISTEN 1602/dockerd
tcp6 0 0 :::7946 :::* LISTEN 1602/dockerd
tcp6 0 0 :::8080 :::* LISTEN 3398/docker-proxy
tcp6 0 0 :::32768 :::* LISTEN 3199/docker-proxy
tcp6 0 0 :::32769 :::* LISTEN 3219/docker-proxy
tcp6 0 0 :::32770 :::* LISTEN 3341/docker-proxy
tcp6 0 0 :::32771 :::* LISTEN 3436/docker-proxy
tcp6 0 0 :::2375 :::* LISTEN 1602/dockerd
udp6 0 0 :::7946 :::* 1602/dockerd
How can I debug this issue, or did I forget to perform some important step? Do the servers need SSH access to each other? Thanks
Logs on the node:
Aug 8 09:50:24 localhost dockerd: time="2016-08-08T09:50:24.393432145-04:00" level=error msg="Handler for POST /v1.24/swarm/leave returned error: This node is not part of swarm"
Aug 8 09:51:01 localhost su: (to root) worker1 on pts/1
Aug 8 09:51:34 localhost dockerd: time="2016-08-08T09:51:34.384408514-04:00" level=error msg="Handler for POST /v1.24/swarm/join returned error: Timeout was reached before node was joined. Attempt to join the cluster will continue in the background. Use \"docker info\" command to see the current swarm status of your node."
Aug 8 09:51:40 localhost su: (to root) worker1 on pts/1
Aug 8 09:52:47 localhost dhclient[1277]: DHCPREQUEST on eno16777736 to 192.168.140.254 port 67 (xid=0x11f8fba8)
Aug 8 09:52:47 localhost dhclient[1277]: DHCPACK from 192.168.140.254 (xid=0x11f8fba8)
Aug 8 09:52:47 localhost NetworkManager[953]: <info> address 192.168.140.13
Aug 8 09:52:47 localhost NetworkManager[953]: <info> plen 24 (255.255.255.0)
Aug 8 09:52:47 localhost NetworkManager[953]: <info> gateway 192.168.140.2
Aug 8 09:52:47 localhost NetworkManager[953]: <info> server identifier 192.168.140.254
Aug 8 09:52:47 localhost NetworkManager[953]: <info> lease time 1800
Aug 8 09:52:47 localhost NetworkManager[953]: <info> nameserver '192.168.140.2'
Aug 8 09:52:47 localhost NetworkManager[953]: <info> domain name 'localdomain'
Aug 8 09:52:47 localhost NetworkManager[953]: <info> (eno16777736): DHCPv4 state changed bound -> bound
Aug 8 09:52:47 localhost dbus[878]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Aug 8 09:52:47 localhost dbus-daemon: dbus[878]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Aug 8 09:52:47 localhost systemd: Starting Network Manager Script Dispatcher Service...
Aug 8 09:52:47 localhost dhclient[1277]: bound to 192.168.140.13 -- renewal in 713 seconds.
Aug 8 09:52:47 localhost dbus[878]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Aug 8 09:52:47 localhost dbus-daemon: dbus[878]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Aug 8 09:52:47 localhost nm-dispatcher: Dispatching action 'dhcp4-change' for eno16777736
Aug 8 09:52:47 localhost systemd: Started Network Manager Script Dispatcher Service.
Sometimes there are warnings like:
level=warning msg="failed to retrieve remote root CA certificate: rpc error: code = 1 desc = context canceled"

Maybe you are using an HTTP proxy.
You can use the following command to see what dockerd is doing:
# strace -Fp `pidof dockerd` 2>&1 | grep -v futex | grep -v epoll_wait | grep -v pselect

As explained by wenjianhn, make sure that you did not configure an HTTP proxy for Docker on your worker node (as described here). Swarm nodes talk to the manager on TCP port 2377 by default, so if an HTTP proxy is configured, your worker node will send the join request through that proxy, even though the manager node sits in your LAN.
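If a proxy is configured, a common fix on systemd-based distros is to exempt your LAN from it in Docker's proxy drop-in. This is a sketch; the proxy address is a placeholder, and the subnet must match your own:
# /etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://proxy.example.com:3128"
Environment="NO_PROXY=localhost,127.0.0.1,192.168.140.0/24"
Then reload and restart the daemon: sudo systemctl daemon-reload && sudo systemctl restart docker.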
Also, make sure that no firewall is blocking the traffic on port 2377:
user@workernode$ telnet ip-of-manager 2377
If you can't open a telnet connection on port 2377, the port is blocked by a firewall (either the worker node's or the manager's).

The hostnames on all my VMs were localhost.localdomain.
I changed the hostnames in /etc/hosts on each server and rebooted.
Now I'm able to create my swarm cluster and add nodes successfully.
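For reference, the fix amounts to giving each VM a unique hostname and making the names resolvable; a sketch with example names:
# on each VM, pick a unique name
sudo hostnamectl set-hostname worker1
# in /etc/hosts on every VM
192.168.140.12 manager1
192.168.140.13 worker1
192.168.140.14 worker2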

I had the same problem, and solved it by syncing each worker node's date with the manager node's date:
pi@workernode$ sudo date --set="$(ssh username@masternode date)"
After this, try the join from the worker node again; it should work.
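A more durable fix is to keep the clocks synchronized with NTP rather than setting them by hand; on CentOS 7, something like:
sudo yum install -y chrony
sudo systemctl enable chronyd
sudo systemctl start chronyd
chronyc tracking    # verify the clock is synchronized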

If none of the other solutions has worked, try disabling the firewall on the manager and see if the join works.
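Rather than disabling the firewall outright, you can open just the ports swarm needs: 2377/tcp (cluster management), 7946/tcp and 7946/udp (node communication), and 4789/udp (overlay network traffic). On CentOS 7 with firewalld:
sudo firewall-cmd --permanent --add-port=2377/tcp
sudo firewall-cmd --permanent --add-port=7946/tcp --add-port=7946/udp
sudo firewall-cmd --permanent --add-port=4789/udp
sudo firewall-cmd --reload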

I had a similar problem. I checked that no proxy was configured and no firewall was in the way, but "docker swarm join" still did not work. I noticed that "docker info" listed managers that were no longer present, while "docker node ls" did not show anything strange.
Eventually I solved my issue with "service docker restart" on the node addressed by "docker swarm join". Evidently some internal bookkeeping within the Docker daemon had gotten out of sync.

Related

Haproxy always giving 503 Service Unavailable

I've installed HAProxy 1.8 in a Kubernetes container.
Whenever I make a request to /test, I always get a 503 Service Unavailable response. I want the stats page to be returned for requests to /test.
Following is my configuration file:
/etc/haproxy/haproxy.cfg:
global
    daemon
    maxconn 256
defaults
    mode http
    timeout connect 15000ms
    timeout client 150000ms
    timeout server 150000ms
frontend stats
    bind *:8404
    mode http
    stats enable
    stats uri /stats
    stats refresh 10s
frontend http-in
    bind *:8083
    default_backend servers
    acl ar1 path -i -m sub /test
    use_backend servers if ar1
backend servers
    mode http
    server server1 10.1.0.46:8404/stats maxconn 32
    # 10.1.0.46 is my container IP
I can access the /stats page using:
curl -ik http://10.1.0.46:8404/stats
But when I do:
curl -ik http://10.1.0.46:8083/test
I always get following response:
HTTP/1.0 503 Service Unavailable
Cache-Control: no-cache
Content-Type: text/html
<html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>
I started haproxy using:
/etc/init.d/haproxy restart
and then subsequently reloaded it using:
haproxy -f haproxy.cfg -p /var/run/haproxy.pid -sf $(cat /var/run/haproxy.pid)
Following is the output of netstat -anlp:
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:5000 0.0.0.0:* LISTEN 54/python3.5
tcp 0 0 0.0.0.0:8083 0.0.0.0:* LISTEN 802/haproxy
tcp 0 0 0.0.0.0:8404 0.0.0.0:* LISTEN 802/haproxy
tcp 0 0 10.1.0.46:8404 10.0.15.225:20647 TIME_WAIT -
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags Type State I-Node PID/Program name Path
Following is the output of ps -eaf:
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 Jul22 ? 00:00:00 /bin/sh -c /bin/bash -x startup_script.sh
root 6 1 0 Jul22 ? 00:00:00 /bin/bash -x startup_script.sh
root 54 6 0 Jul22 ? 00:00:09 /usr/local/bin/python3.5 /usr/local/bin/gunicorn --bind 0.0.0.0:5000 runner:app?
root 57 54 0 Jul22 ? 00:02:50 /usr/local/bin/python3.5 /usr/local/bin/gunicorn --bind 0.0.0.0:5000 runner:app?
root 61 0 0 Jul22 pts/0 00:00:00 bash
root 739 0 0 07:02 pts/1 00:00:00 bash
root 802 1 0 08:09 ? 00:00:00 haproxy -f haproxy.cfg -p /var/run/haproxy.pid -sf 793
root 804 739 0 08:10 pts/1 00:00:00 ps -eaf
Why could I be getting 503 unavailable always?
Why do you use HAProxy 1.8 when 2.2.x already exists?
You will need to set the path in the backend, since it can't be set at the server level.
backend servers
    mode http
    http-request set-path /stats
    server server1 10.1.0.46:8404 maxconn 32
    # 10.1.0.46 is my container IP
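With that change, the request that previously returned 503 should now be rewritten to the stats page; assuming the same container IP as above:
curl -ik http://10.1.0.46:8083/test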

Can't connect to docker after resuming VM

For some reason, whenever I suspend my VM and resume it, I can no longer connect to the docker container hosted within the VM. Usually I pass -p 3000:3000 to the docker container so that I can access the Rails instance within it, and this works fine, but when I suspend the VM and resume it later, I can no longer connect to port 3000, even though it's listening within the docker image.
This results in me having to reboot the VM, as service docker restart does not change anything.
Is there something else I should be looking at to resolve this issue? I've been suspending/resuming my VM with docker in it for quite a while and have never run into this issue before.
EDIT
To reproduce this issue, I simply resumed my VM and tried connecting to localhost port 3000 from the VM itself (not within the docker image) and it cannot connect. However, below shows that port 3000 is listening:
[root:kali:~/app]# curl http://localhost:3000
curl: (56) Recv failure: Connection reset by peer
[root:kali:~/app]# netstat -antp | grep -i listen
tcp 0 0 127.0.0.1:43050 0.0.0.0:* LISTEN 84770/autossh
tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN 1/systemd
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 20478/sshd
tcp6 0 0 :::3000 :::* LISTEN 32731/docker-proxy
tcp6 0 0 :::3001 :::* LISTEN 32715/docker-proxy
tcp6 0 0 :::111 :::* LISTEN 1/systemd
tcp6 0 0 :::22 :::* LISTEN 20478/sshd
From within docker, I can see that rails is working:
[root:77f444beafff:~/app]# rails s --binding 0.0.0.0
=> Booting Puma
=> Rails 5.2.3 application starting in development
=> Run `rails server -h` for more startup options
Puma starting in single mode...
* Version 3.12.1 (ruby 2.5.1-p57), codename: Llamas in Pajamas
* Min threads: 5, max threads: 5
* Environment: development
* Listening on tcp://0.0.0.0:3000
Use Ctrl-C to stop
And here's the netstat from within docker:
[root:77f444beafff:~/app]# netstat -antp | grep -i listen
tcp 0 0 0.0.0.0:6379 0.0.0.0:* LISTEN 478/redis-server *:
tcp 0 0 0.0.0.0:3000 0.0.0.0:* LISTEN 765/puma 3.12.1 (tc
tcp 0 0 127.0.0.1:5432 0.0.0.0:* LISTEN -
tcp6 0 0 :::6379
If I curl from within the docker image, I can see it hits the rails app just fine:
[root:77f444beafff:~/app]# curl http://localhost:3000/ -I
HTTP/1.1 200 OK
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
X-Download-Options: noopen
X-Permitted-Cross-Domain-Policies: none
Referrer-Policy: strict-origin-when-cross-origin
Content-Type: text/html; charset=utf-8
ETag: W/"5078d30a6c1a5f6fc5cb7f9a82cd89f5"
Cache-Control: max-age=0, private, must-revalidate
Set-Cookie: _vspm_session=Cace%2FN0zB%2F6QJOiietbuHxTHOMZUMuRmEukYqQTNaHQ91hskaN%2BPJzev0KdGUAAtYx9a35Mqdkr8eRkPdH4qOl6vOaCcPU0gy8s7IMfkb9VhRGPPbecepmI%2F9leA2dnD694P8ctXSBklOCnjhN0%3D--SglWrWvx3BFEAI3z--IkylACdXbR6eF27Hgn0Cgg%3D%3D; path=/; HttpOnly
X-Request-Id: 29aa7251-f29a-4309-adec-6af479e7bd9b
X-Runtime: 12.241723
I'm having exactly the same issue with my VMware virtual machine (VMware running on Windows).
The only workaround that is working for me is:
docker stop $(docker ps -aq) && sudo systemctl restart NetworkManager docker
If I had to guess, I would say it may be related to firewall rules docker sets up on start; maybe when you resume the virtual machine, a change in the network configuration breaks those rules.
Similar issue: https://github.com/docker/for-mac/issues/1990 (Doesn't seem specific to docker for mac).
I was able to solve this issue with the hint given by lannox in the comment. It's necessary to mark the network interfaces of the docker containers as unmanaged by NetworkManager.
To do that, create a new file /etc/NetworkManager/conf.d/10-unmanage-docker-interfaces.conf with the following content:
[keyfile]
unmanaged-devices=interface-name:docker*;interface-name:veth*;interface-name:br-*;interface-name:vmnet*;interface-name:vboxnet*
This configures NetworkManager to ignore all interfaces whose names match docker*, veth*, br-*, vmnet*, or vboxnet*.
Then restart NetworkManager with sudo systemctl restart NetworkManager.
Next time the host suspends and resumes, the docker containers keep their network connectivity.
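To verify the change took effect, nmcli should now report those interfaces as unmanaged (interface names will vary on your host):
nmcli device status | grep -E 'docker|veth|br-'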
Several questions here might help you solve this (a combined sketch follows the list):
Is your docker container still running? Run docker ps and find your container.
Since the -p 3000:3000 option is set, I guess the port is exposed, but you might want to check that you really ran your container with this option this time.
Is your app really listening? Run lsof -nP -i | grep LISTEN and find your app listening on port 3000.
Connect to your container with docker exec -it <your_container> bash and try running lsof -nP -i | grep LISTEN there to see whether this is a docker issue or your app.
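A compact version of these checks; the container name myapp is a placeholder, and the last step assumes lsof is installed in the image:
docker ps --filter name=myapp             # is the container running?
docker port myapp                         # is port 3000 actually published?
docker exec -it myapp lsof -nP -i :3000   # is the app listening inside the container?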
It seems that when you run netstat on your VM you get the following line :
tcp6 0 0 :::3000 :::* LISTEN 32731/docker-proxy
On Docker you get :
tcp 0 0 0.0.0.0:3000 0.0.0.0:* LISTEN 765/puma 3.12.1 (tc
There are two differences here:
:::3000 vs 0.0.0.0:3000, the first means it is listening on IPv6 and the second on IPv4 (found the info in this question).
tcp6 vs tcp, again IPv6 vs IPv4.
According to this other question, it seems you have to run rails with the -b :: option.
The -b option binds Rails to the specified IP; by default it is localhost. You can run the server as a daemon by passing a -d option.
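For example, from inside the container (port shown for clarity):
rails server -b :: -p 3000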
Please run:
sudo docker ps
If you do not see your container, run:
sudo docker ps -a
Is your container stopped? If so, start it with:
sudo docker start CONTAINER_ID

Why I have been accessing the docker daemon on another machine and it always fails

I want to access the docker daemon on another machine, but it always fails.
Both machines are virtual machines.
systemctl status docker.service:
[root@localhost ~]# systemctl status docker.service
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
Active: active (running) since Tue 2019-04-16 07:38:52 EDT; 3s ago
Docs: https://docs.docker.com
Main PID: 5191 (dockerd)
Memory: 128.5M
CGroup: /system.slice/docker.service
└─5191 /usr/bin/dockerd -H unix://var/run/docker.sock -H tcp://0.0.0.0:2375
ps -ef | grep docker
[root@localhost ~]# ps -ef | grep docker
root 5191 1 1 07:38 ? 00:00:01 /usr/bin/dockerd -H unix://var/run/docker.sock -H tcp://0.0.0.0:2375
root 5800 5161 0 07:40 pts/0 00:00:00 grep --color=auto docker
netstat -tulp
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 localhost:smtp 0.0.0.0:* LISTEN 4373/master
tcp 0 0 0.0.0.0:ssh 0.0.0.0:* LISTEN 4134/sshd
tcp6 0 0 localhost:smtp [::]:* LISTEN 4373/master
tcp6 0 0 [::]:2375 [::]:* LISTEN 5191/dockerd
tcp6 0 0 [::]:2377 [::]:* LISTEN 5191/dockerd
tcp6 0 0 [::]:7946 [::]:* LISTEN 5191/dockerd
tcp6 0 0 [::]:ssh [::]:* LISTEN 4134/sshd
Result of access on another machine
[root@localhost dack]# docker -H tcp://192.168.233.150:2375 images
error during connect: Get http://192.168.233.150:2375/v1.39/images/json: dial tcp 192.168.233.150:2375: connect: no route to host
[root@localhost dack]# docker -H tcp://192.168.233.150:2375 info
error during connect: Get http://192.168.233.150:2375/v1.39/info: dial tcp 192.168.233.150:2375: connect: no route to host
Check the firewall: "no route to host" usually means a firewall is rejecting the connection. Make sure port 2375 is allowed to accept connections from other servers.
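If that is the cause, a sketch of opening the port on CentOS with firewalld follows; note that an unauthenticated TCP socket on 2375 gives root-equivalent access to the host, so only expose it on trusted networks:
sudo firewall-cmd --permanent --add-port=2375/tcp
sudo firewall-cmd --reload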

Worker cannot connect to docker swarm manager

I have set up the docker swarm manager on one machine with IP 192.168.XXX.XXX using this command:
docker swarm init --advertise-addr=192.168.XXX.XXX
and I got this message :
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-0jpgak7bm7t4mzluz48gdub06f5036q8yaoo99awkjmlz48vtb-1eutz0k1vp37ztmiuxdnglka2 192.168.XXX.XXX:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
On the other machine I tried the following command:
docker swarm join --token SWMTKN-1-0jpgak7bm7t4mzluz48gdub06f5036q8yaoo99awkjmlz48vtb-1eutz0k1vp37ztmiuxdnglka2 192.168.XXX.XXX:2377
and the result was:
error response from daemon: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 192.168.XXX.XXX:2377: connect: connection refused"
Docker version :
Client: Docker Engine - Community
Version: 18.09.0
API version: 1.39
Go version: go1.10.4
Git commit: 4d60db4
Built: Wed Nov 7 00:47:51 2018
OS/Arch: windows/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 18.09.0
API version: 1.39 (minimum version 1.12)
Go version: go1.10.4
Git commit: 4d60db4
Built: Wed Nov 7 00:55:00 2018
OS/Arch: linux/amd64
Experimental: false
This is a connection refused error you are running into. Most likely, the other machine cannot connect to the manager because it is not on the same network. Solution: the two machines have to be able to talk to each other. If the machines are in a cloud like Azure or AWS, create a virtual network and add both machines to it.
Debugging connection refused error:
To confirm the machines cannot talk to each other, try pinging the manager from the other machine. It will most likely fail in your current setup.
$ ping 192.168.XXX.XXX
If it succeeds then check if port 2377 is open and listening on the manager. Run below from the other machine:
$ nc -vz 192.168.XXX.XXX 2377
Another tool that can help in debugging is to run netstat on the manager:
$ netstat -tuplen
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State User Inode PID/Program name
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN 101 51772227 -
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 0 25361290 -
tcp6 0 0 :::2377 :::* LISTEN 0 1423965 -
tcp6 0 0 :::7946 :::* LISTEN 0 1423980 -
Verify that you see :::2377 and :::7946 and that their state is LISTEN, as these ports are required by docker swarm [1].
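If ping succeeds but the port check fails, open the swarm ports on the manager; with ufw, for example (adjust for whatever firewall your distribution uses):
sudo ufw allow 2377/tcp
sudo ufw allow 7946
sudo ufw allow 4789/udp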

Error response from daemon: service endpoint with name

I'm getting this strange error when I try to run a container with a name:
docker: Error response from daemon: service endpoint with name qc.T8 already exists.
However, there is no container with this name.
> docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
> sudo docker info
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 3
Server Version: 1.12.3
Storage Driver: aufs
Root Dir: /ahdee/docker/aufs
Backing Filesystem: extfs
Dirs: 28
Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: null bridge host overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: apparmor
Kernel Version: 3.13.0-101-generic
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 64
Total Memory: 480.3 GiB
Is there any way I can flush this out?
Just in case someone else needs this: as @Jmons pointed out, it was a weird networking issue, so I solved it by forcing a removal:
docker network disconnect --force bridge qc.T8
TLDR: restart your docker daemon or restart your docker-machine (if you're using one, e.g. on a Mac).
Edit: As there are more recent posts below, they answer the question better than mine. The network adapter is stuck on the daemon. I'm updating mine as it's possibly 'on top' of the list and people might not scroll down.
Restarting your docker daemon / docker service / docker-machine is the easiest answer.
the better answer (via Shalabh Negi):
docker network inspect <network name>
docker network disconnect <network name> <container id/ container name>
This is also faster in practice, if you can find the network, since restarting the docker machine/daemon/service is in my experience a slow thing. If you use that, please scroll down and +1 their answer.
So the problem is probably your network adapter (virtual, docker thing, not real): have a quick peek at this: https://github.com/moby/moby/issues/23302.
To prevent it happening again is a bit tricky. It seems there may be an issue with docker where a container exits with a bad status code (e.g. non-zero) that holds the network open. You can't then start a new container with that endpoint.
You can also try:
docker network prune
docker volume prune
docker system prune
These commands will help clear zombie containers, volumes, and networks. If no command works, then run
sudo service docker restart
and your problem should be solved.
docker network rm <network name>
Worked for me
Restarting docker solved it for me.
I created a script a while back; I think this should help people working with swarm. Using docker-machine, this can help a bit:
https://gist.github.com/lcamilo15/7aaaebe71852444ea8f1da5c4c9c84b7
declare -a NODE_NAMES=("node_01" "node_02")
declare -a CONTAINER_NAMES=("container_a" "container_b")
declare -a NETWORK_NAMES=("network_1" "network_2")
for x in "${NODE_NAMES[@]}"; do
    docker-machine env $x
    eval $(docker-machine env $x)
    for CONTAINER_NAME in "${CONTAINER_NAMES[@]}"; do
        for NETWORK_NAME in "${NETWORK_NAMES[@]}"; do
            echo "Disconnecting $CONTAINER_NAME from $NETWORK_NAME"
            docker network disconnect -f $NETWORK_NAME $CONTAINER_NAME
        done
    done
done
You could try seeing if there's any network with that container name by running:
docker network ls
If there is, copy the network id then go on to remove it by running:
docker network rm network-id
This could be because an abrupt removal of a container may leave the network open for that endpoint (container-name).
Try stopping the container first before removing it.
docker stop <container-name>. Then docker rm <container-name>.
Then docker run <same-container-name>.
I think restarting the docker daemon will solve the problem.
Even a reboot did not help in my case. It turned out that port 80, which the nginx container was supposed to grab automatically, was already in use, even after reboot. How come?
root@IONOS_2: /root/2_proxy # netstat -tlpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:873 0.0.0.0:* LISTEN 1378/rsync
tcp 0 0 0.0.0.0:5355 0.0.0.0:* LISTEN 1565/systemd-resolv
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 1463/nginx: master
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 1742/sshd
tcp6 0 0 :::2377 :::* LISTEN 24139/dockerd
tcp6 0 0 :::873 :::* LISTEN 1378/rsync
tcp6 0 0 :::7946 :::* LISTEN 24139/dockerd
tcp6 0 0 :::5355 :::* LISTEN 1565/systemd-resolv
tcp6 0 0 :::21 :::* LISTEN 1447/vsftpd
tcp6 0 0 :::22 :::* LISTEN 1742/sshd
tcp6 0 0 :::5000 :::* LISTEN 24139/dockerd
No idea what nginx: master means or where it came from. And indeed 1463 is the PID:
root@IONOS_2: /root/2_proxy # ps aux | grep "nginx"
root 1463 0.0 0.0 43296 908 ? Ss 00:53 0:00 nginx: master process /usr/sbin/nginx
root 1464 0.0 0.0 74280 4568 ? S 00:53 0:00 nginx: worker process
root 30422 0.0 0.0 12108 1060 pts/0 S+ 01:23 0:00 grep --color=auto nginx
So I tried this:
root@IONOS_2: /root/2_proxy # kill 1463
root@IONOS_2: /root/2_proxy # ps aux | grep "nginx"
root 30783 0.0 0.0 12108 980 pts/0 S+ 01:24 0:00 grep --color=auto nginx
And the problem was gone.
