Error trying to run kubeadm init on CentOS 7

I am new to Kubernetes and I can't get "kubeadm init" to succeed.
Here is what I did, step by step:
I installed the latest version of Docker using yum, following the Docker documentation
(I have configured 'Environment="HTTP_PROXY=http://usuario:password@proxy:port/" "HTTPS_PROXY=http://usuario:password@proxy:port/"' in /etc/systemd/system/docker.service.d/http-proxy.conf).
I have disabled SELINUXTYPE, disabled swap with the command "swapoff -a", and commented out the swap line in /etc/fstab ("#/dev/mapper/centos-swap swap swap defaults 0 0").
I ran "modprobe br_netfilter" and "echo '1' > /proc/sys/net/bridge/bridge-nf-call-iptables" to activate the "br_netfilter" module.
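The swap and bridge-netfilter steps above are easy to get half right, so a small check before rerunning kubeadm init may help. The following is only a minimal sketch and not part of the original setup: the function names active_swap_entries and bridge_nf_enabled are made up for illustration, and the default paths are the standard /etc/fstab and the /proc entry quoted above.

```shell
# Hypothetical helpers; the names are illustrative and not from any tool.

# Print every uncommented fstab line whose filesystem type (3rd field) is "swap".
# No output means no swap entry is left to be activated at the next boot.
active_swap_entries() {
    awk '$1 !~ /^#/ && $3 == "swap"' "${1:-/etc/fstab}"
}

# Succeed only if the given sysctl file contains the value 1.
bridge_nf_enabled() {
    [ "$(cat "${1:-/proc/sys/net/bridge/bridge-nf-call-iptables}" 2>/dev/null)" = "1" ]
}

# Usage (on the node itself):
#   active_swap_entries        # should print nothing
#   bridge_nf_enabled && echo "bridge-nf-call-iptables is on"
```

Keep in mind that a value written with echo into /proc does not survive a reboot, so the second check is worth repeating after restarting the machine.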
"kubernetes.repo" file to install "kubelet kubeadm kubectl" using yum:
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg
       https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
Opened ports:
firewall-cmd --permanent --add-port=6443/tcp
firewall-cmd --permanent --add-port=2379-2380/tcp
firewall-cmd --permanent --add-port=10250/tcp
firewall-cmd --permanent --add-port=10251/tcp
firewall-cmd --permanent --add-port=10252/tcp
firewall-cmd --permanent --add-port=10255/tcp
firewall-cmd --reload
I created the "10-kubeadm.conf" file:
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
Reload and enable services:
systemctl daemon-reload
systemctl restart docker
systemctl enable docker
systemctl restart kubelet
systemctl enable kubelet
(both services with status: active(running))
Error:
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
- 'docker ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
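The last hint in the kubeadm output (list the containers, then read the logs of the failing one) can be scripted. This is a minimal sketch under assumptions: the helper name exited_kube_containers is invented here, and it relies on the naming scheme the kubelet uses with Docker, where container names start with k8s_ and the pause (sandbox) containers start with k8s_POD_.

```shell
# Hypothetical helper: read "ID|STATUS|NAME" lines on stdin and print the IDs
# of exited Kubernetes containers, skipping the pause containers (k8s_POD_*).
exited_kube_containers() {
    awk -F '|' '$3 ~ /^k8s_/ && $3 !~ /^k8s_POD_/ && $2 ~ /^Exited/ { print $1 }'
}

# Usage (on the node, with Docker as the container runtime):
#   docker ps -a --format '{{.ID}}|{{.Status}}|{{.Names}}' | exited_kube_containers |
#       while read -r id; do docker logs --tail 20 "$id"; done
```

The filter is kept separate from the docker call so it can be tried on saved output first.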
Thanks in advance for the help.
Best Regards.

Please disable swap:
swapoff -a
vim /etc/fstab
(comment out the swap line)
After that, install these packages:
yum install -y yum-utils device-mapper-persistent-data lvm2
Add the Docker repository:
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
Then install Docker:
yum install -y docker-ce
Create the Kubernetes repository file and install the Kubernetes packages:
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg
       https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
yum install -y kubelet kubeadm kubectl
Then reboot, and start and enable the services:
systemctl start docker && systemctl enable docker
systemctl start kubelet && systemctl enable kubelet
systemctl daemon-reload
systemctl restart kubelet
Initialize the cluster, replacing MASTER_IP with the address of your master node:
kubeadm init --apiserver-advertise-address=MASTER_IP --pod-network-cidr=10.244.0.0/16
Do not change 10.244.0.0/16; it is the pod network that the flannel manifest used below expects.
Set up kubectl access for your user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Next, deploy the flannel network to the Kubernetes cluster using the kubectl command:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
These are the complete steps for bringing up Kubernetes; I have created clusters with these commands many times.

Related

Problems starting docker instance

Not sure this is a docker problem specifically but this is how it goes: I tried sudo docker stop 7f8c9285465c which resulted in
Error response from daemon: cannot stop container: 7f8c9285465c: Cannot kill container...unknown error after kill: runc did not terminate sucessfully: container_linux.go:392: signaling init process caused "permission denied"
Following this stackoverflow suggestion I did sudo aa-remove-unknown. Now the docker stop succeeded but subsequent docker-compose up resulted in:
snap-confine has elevated permissions and is not confined but should be. Refusing to continue to avoid permission escalation attacks.
Next I ran the command sudo apt purge snapd snap-confine && sudo apt install -y snapd. Now running docker-compose up results in
bash: /snap/bin/docker-compose: No such file or directory.
The command sudo docker container ls results in:
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
sudo service docker status returns Active: active (running).
I tried reinstalling Docker. Running sudo docker run hello-world returns the same Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running? (although the status is active).

Create file /etc/systemd/network/bridge.network with contents:
[Network]
IPForward=kernel
If you do not have permission to save the file, do so as root (sudo su -).
Then, run:
sudo systemctl restart systemd-networkd.service # (disconnected network)
sudo apt remove docker-ce # If you hadn't done so before
sudo apt install docker-ce # Should start docker.service
sudo systemctl status docker.service # Verify docker.service is running
This information has been taken from this docker forum discussion.

I think you installed Docker with snap and the installation of snapd is not complete. The error "snap-confine has elevated permissions and is not confined but should be. Refusing to continue to avoid permission escalation attacks." indicates that the "apparmor" service is not enabled. Enable it and check its status:
sudo systemctl enable --now apparmor
sudo systemctl status apparmor
If apparmor is not installed, install it:
sudo apt-get install apparmor
The error "bash: /snap/bin/docker-compose" indicates that "/snap/bin" is not in your PATH. If you run snap --version, you might get a message saying that "/snap/bin" is not in PATH. Open /etc/environment:
sudo nano /etc/environment
Add "/snap/bin" to the PATH line:
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/games:/usr/games:/snap/bin
Now restart your system. Everything should be fixed.
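Before editing /etc/environment, it may be worth checking whether /snap/bin is really missing from PATH. A minimal sketch follows; path_contains is a hypothetical helper name, and the second argument exists only so that the check can be run against any PATH-style string.

```shell
# Hypothetical helper: succeed if directory $1 is an entry of the
# colon-separated list $2 (defaults to the current PATH).
path_contains() {
    case ":${2:-$PATH}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   path_contains /snap/bin || echo "/snap/bin is not in PATH"
```

Wrapping the list in colons on both sides means the first and last entries match the same way as the ones in the middle.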

How to fix: yum install with error in docker container?

I would like to install sshd in the centos:latest image.
I tried to install 'passwd' with this command:
yum install passwd
But I get an error like this:
Failed to set locale, defaulting to C.UTF-8
CentOS-8 - AppStream 0.0 B/s | 0 B 00:30
Errors during downloading metadata for repository 'AppStream':
- Curl error (6): Couldn't resolve host name for http://mirrorlist.centos.org/?release=8&arch=x86_64&repo=AppStream&infra=container [Could not resolve host: mirrorlist.centos.org]
Error: Failed to download metadata for repo 'AppStream': Cannot prepare internal mirrorlist: Curl error (6): Couldn't resolve host name for http://mirrorlist.centos.org/?release=8&arch=x86_64&repo=AppStream&infra=container [Could not resolve host: mirrorlist.centos.org]
Why can't I install packages with yum inside the Docker container?

Masquerading allows Docker ingress and egress:
firewall-cmd --zone=public --add-masquerade --permanent
Specifically allow incoming traffic on port 80/443:
firewall-cmd --zone=public --add-port=443/tcp
Reload the firewall to apply permanent rules:
firewall-cmd --reload
Restart Docker:
systemctl restart docker

Just tested this on my local machine:
docker run -it -d --name test centos:latest;
docker exec -it test /bin/bash;
In the Docker container:
[root@f3b8b3fe70df /]# yum update -y;
[root@f3b8b3fe70df /]# yum install passwd;

Give the container access to the host network with --network host:
docker run --network host -it -d --name test centos:latest
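The suggestions above all target the same symptom in the error output: the container cannot resolve mirrorlist.centos.org. To see which host names are failing before changing firewall or network settings, the names can be pulled out of the yum output and tested one by one. This is only a sketch; unresolved_hosts is a hypothetical helper, and the usage line assumes getent is available in the image.

```shell
# Hypothetical helper: read yum/dnf output on stdin and print each distinct
# host name reported in a "Could not resolve host: NAME" message.
unresolved_hosts() {
    sed -n 's/.*Could not resolve host: \([^] ]*\).*/\1/p' | sort -u
}

# Usage (inside the container):
#   yum install -y passwd 2>&1 | unresolved_hosts |
#       while read -r h; do getent hosts "$h" || echo "cannot resolve $h"; done
```

If the same lookup works with --network host but fails on the default bridge network, the problem is more likely on the host side (firewall or DNS forwarding) than in the image.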

No nodes available on Minikube

I am running minikube inside a VMware VM (Ubuntu 16.04).
Everything was fine for a couple of weeks.
One day I noticed that all my pods were stuck in "Pending".
I described one of the pods and saw:
no nodes available to schedule pods
I uninstalled minikube:
minikube stop; minikube delete &&
docker stop $(docker ps -aq) &&
rm -rf ~/.kube ~/.minikube &&
rm -rf /usr/local/bin/localkube /usr/local/bin/minikube &&
rm -rf /etc/kubernetes/ &&
docker system prune -af --volumes
systemctl stop kubelet
systemctl disable kubelet
Installed it again:
curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64 && chmod +x minikube && mv minikube /usr/local/bin/
swapoff -a
minikube start --vm-driver=none
mv /root/.minikube $HOME/.minikube # this will write over any previous configuration
chown -R $USER $HOME/.minikube
chgrp -R $USER $HOME/.minikube
When I ran kubectl get nodes I received: The connection to the server 192.168.21.129:8443 was refused - did you specify the right host or port?
I ran docker ps -a:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
602e5d1a27c1 3193be46e0b3 "kube-scheduler --ad…" About a minute ago Exited (2) 22 seconds ago k8s_kube-scheduler_kube-scheduler-minikube_kube-system_9729a196c4723b60ab401eaff722982d_1
29208eeb5f46 k8s.gcr.io/pause:3.1 "/pause" 2 minutes ago Exited (0) 22 seconds ago k8s_POD_kube-scheduler-minikube_kube-system_9729a196c4723b60ab401eaff722982d_0
I have only 2 containers, and both have exited...
What is happening here?
I uninstalled it as thoroughly as I could (or did I?).
How can I troubleshoot and fix this?
EDIT:
I uninstalled Minikube again, and this time I also removed the kubelet service.
I followed the instructions for removing a service from here:
systemctl stop kubelet
systemctl disable kubelet
rm -rf /etc/systemd/system/kubelet.service.d
rm -rf /lib/systemd/system/kubelet.service
rm -rf /var/lib/kubelet
rm -rf /usr/libexec/kubernetes/kubelet-plugins
rm -rf /usr/bin/kubelet
systemctl daemon-reload
systemctl reset-failed
I searched for leftovers on my system with find / | grep kubelet and found lots of files under /sys/kernel/slab. I restarted the machine and they were gone.
I installed minikube again, and at first I received errors about the kubelet:
sudo /usr/bin/kubeadm init --config /var/lib/kubeadm.yaml --ignore-preflight-errors=DirAvailable--etc-kubernetes-manifests --ignore-preflight-errors=DirAvailable--data-minikube --ignore-preflight-errors=Port-10250 --ignore-preflight-errors=FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml --ignore-preflight-errors=FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml --ignore-preflight-errors=FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml --ignore-preflight-errors=FileAvailable--etc-kubernetes-manifests-etcd.yaml --ignore-preflight-errors=Swap --ignore-preflight-errors=CRI
output: [init] Using Kubernetes version: v1.13.2
[preflight] Running pre-flight checks
[WARNING FileExisting-ebtables]: ebtables not found in system path
[WARNING FileExisting-socat]: socat not found in system path
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.02.0-ce. Latest validated version: 18.06
[WARNING Hostname]: hostname "minikube" could not be reached
[WARNING Hostname]: hostname "minikube": lookup minikube on 127.0.1.1:53: server misbehaving
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/var/lib/minikube/certs/"
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [minikube localhost] and IPs [192.168.21.129 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [minikube localhost] and IPs [192.168.21.129 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Using existing ca certificate authority
[certs] Using existing apiserver certificate and key on disk
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
- 'docker ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
: running command:
sudo /usr/bin/kubeadm init --config /var/lib/kubeadm.yaml --ignore-preflight-errors=DirAvailable--etc-kubernetes-manifests --ignore-preflight-errors=DirAvailable--data-minikube --ignore-preflight-errors=Port-10250 --ignore-preflight-errors=FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml --ignore-preflight-errors=FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml --ignore-preflight-errors=FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml --ignore-preflight-errors=FileAvailable--etc-kubernetes-manifests-etcd.yaml --ignore-preflight-errors=Swap --ignore-preflight-errors=CRI
I tried to start it again and this time it reported success, but then all the containers were deleted again.
I think it is related to the kubelet, but I removed it and reinstalled it completely.

One of the errors was:
[WARNING SystemVerification]: this Docker version is not on the list
of validated versions: 18.02.0-ce. Latest validated version: 18.06
I decided to also remove docker.
This is the full uninstall process I did:
# Remove minikube
minikube stop; minikube delete &&
docker stop $(docker ps -aq) &&
rm -rf ~/.kube ~/.minikube &&
rm -rf /usr/local/bin/localkube /usr/local/bin/minikube &&
rm -rf /etc/kubernetes/ &&
docker system prune -af --volumes
# remove kubelet
systemctl stop kubelet
systemctl disable kubelet
rm -rf /etc/systemd/system/kubelet.service.d
rm -rf /lib/systemd/system/kubelet.service
rm -rf /var/lib/kubelet
rm -rf /usr/libexec/kubernetes/kubelet-plugins
rm -rf /usr/bin/kubelet
systemctl daemon-reload
systemctl reset-failed
# uninstall docker
dpkg -l | grep -i docker
apt-get purge -y docker.io
rm -rf /var/lib/docker
apt-get autoremove -y --purge docker.io
apt-get autoclean
# remove other pieces
rm -rf /home/myuser/.minikube
rm -rf ~/.kube
rm -f /var/lib/dpkg/info/kubelet*
rm -f /var/cache/apt/archives/kubelet_1.13.2-00_amd64.deb
rm -f /var/lib/systemd/deb-systemd-helper-enabled/kubelet.service.dsh-also
rm -f /var/lib/systemd/deb-systemd-helper-enabled/multi-user.target.wants/kubelet.service
rm -f /etc/default/kubelet
I restarted my machine and installed everything again:
apt-get install -y docker.io
curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64 && chmod +x minikube && mv minikube /usr/local/bin/
swapoff -a
minikube start --v=3 --vm-driver=none
Now everything works fine.

Unable to start Docker service with error "Failed to start docker.service: Unit not found."

I have installed Docker with yum install docker:
$ uname -a
Linux caspgval4 3.10.0-229.20.1.el7.x86_64 #1 SMP Wed Nov 4 10:08:36 CST 2015 x86_64 x86_64 x86_64 GNU/Linux
$ docker --version
Docker version 1.12.6, build 3a094bd/1.12.6
$ docker info
Cannot connect to the Docker daemon. Is the docker daemon running on this host?
$ sudo systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
Active: inactive (dead)
Docs: http://docs.docker.com
I am trying to install and run Docker, but it is giving an error as below:
$ sudo service docker start
Redirecting to /bin/systemctl start docker.service
Failed to start docker.service: Unit not found.
How do I resolve this issue? I tried the following commands, but no luck:
$ sudo systemctl start docker
Failed to start docker.service: Unit not found.
Extra information:
$ journalctl -u docker
No journal files were found.
-- No entries --
$ cat /usr/lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.com
After=network.target
Wants=docker-storage-setup.service
Requires=rhel-push-plugin.socket
Requires=docker-cleanup.timer
[Service]
Type=notify
NotifyAccess=all
EnvironmentFile=-/etc/sysconfig/docker
EnvironmentFile=-/etc/sysconfig/docker-storage
EnvironmentFile=-/etc/sysconfig/docker-network
Environment=GOTRACEBACK=crash
Environment=DOCKER_HTTP_HOST_COMPAT=1
Environment=PATH=/usr/libexec/docker:/usr/bin:/usr/sbin
ExecStart=/usr/bin/dockerd-current \
--add-runtime docker-runc=/usr/libexec/docker/docker-runc-current \
--default-runtime=docker-runc \
--authorization-plugin=rhel-push-plugin \
--exec-opt native.cgroupdriver=systemd \
--userland-proxy-path=/usr/libexec/docker/docker-proxy-current \
$OPTIONS \
$DOCKER_STORAGE_OPTIONS \
$DOCKER_NETWORK_OPTIONS \
$ADD_REGISTRY \
$BLOCK_REGISTRY \
$INSECURE_REGISTRY
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=1048576
LimitNPROC=1048576
LimitCORE=infinity
TimeoutStartSec=0
Restart=on-abnormal
MountFlags=slave
[Install]
WantedBy=multi-user.target
I tried the following:
$ sudo systemctl daemon-reload
$ sudo systemctl start docker
Failed to start docker.service: Unit not found.
$ sudo journalctl -u docker
-- No entries --
More debug information:
$ sudo systemctl status network.target
● network.target - Network
Loaded: loaded (/usr/lib/systemd/system/network.target; static; vendor preset: disabled)
Active: active since Mon 2017-01-23 02:54:39 PST; 2 months 29 days ago
Docs: man:systemd.special(7)
http://www.freedesktop.org/wiki/Software/systemd/NetworkTarget
Jan 23 02:54:39 mymachine systemd[1]: Starting Network.
Jan 23 02:54:39 mymachine systemd[1]: Reached target Network.
$ sudo systemctl status docker-storage-setup.service
● docker-storage-setup.service - Docker Storage Setup
Loaded: loaded (/usr/lib/systemd/system/docker-storage-setup.service; disabled; vendor preset: disabled)
Active: inactive (dead)
$ sudo systemctl status rhel-push-plugin.socket
Unit rhel-push-plugin.socket could not be found.
$ sudo systemctl status docker-cleanup.timer
● docker-cleanup.timer - Run docker-cleanup every hour
Loaded: loaded (/usr/lib/systemd/system/docker-cleanup.timer; disabled; vendor preset: disabled)
Active: inactive (dead)

If you installed Docker using snap, you can check and start the service with:
sudo snap services docker # check the status
sudo snap start docker # start the service

Run this command to list all the services:
sudo systemctl list-units --type=service
Look for the correct Docker service name (in my case it is snap.docker.dockerd.service) then run:
sudo systemctl restart snap.docker.dockerd.service

It looks like you're missing the rhel-push-plugin.socket unit which is presumably part of a rhel-push-plugin package. You can try fixing that install, or you can install the upstream Docker package directly from Docker with the following as root:
curl -sSL https://get.docker.com/ | sh
Or, more appropriately, follow the CentOS install guide from Docker. (The CentOS install tends to work even on a RHEL system when you have a supported version, which until the recent 20.10 release did not include CentOS 8 or RHEL 8.)
The upstream Docker install will be a more recent version of Docker, but it will not have the various RHEL modifications like the rhel-push-plugin.
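One way to check this diagnosis is to list the units that docker.service declares with Requires= and ask systemd whether each of them exists. The following is a minimal sketch, not a definitive procedure: required_units is a hypothetical helper name, and the unit file path in the usage comment is the one shown in the question.

```shell
# Hypothetical helper: print, one per line, every unit named in a
# Requires= directive of the given unit file.
required_units() {
    sed -n 's/^Requires=//p' "$1" | tr -s ' ' '\n'
}

# Usage:
#   for u in $(required_units /usr/lib/systemd/system/docker.service); do
#       systemctl cat "$u" >/dev/null 2>&1 || echo "unit not found: $u"
#   done
```

For the unit file in the question, this would be expected to flag rhel-push-plugin.socket, which systemctl status already reports as not found.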

Use the following command if your OS is Ubuntu, it will install Docker successfully:
apt install docker.io

If you have installed Docker with snap, you can check the status of the daemon like this:
sudo snap services docker
If the Current column shows inactive, then it is not running. Try starting it with:
sudo snap start docker
Check the service again. If it is still not running, you can check the logs:
sudo snap logs docker
which might give more hints about what is stopping it from running.
For me, the error said that Docker could not create a symlink in /etc/docker/ because another file was in the way. Emptying the directory was not enough due to a known bug in the Docker snap. The workaround was to delete the directory and then refresh the snap (see https://forum.snapcraft.io/t/layouts-still-brittle-when-refreshing-snaps/26252/2):
sudo rm -rf /etc/docker
sudo snap refresh

I worked around the issue by simply doing the following:
$ sudo apt-get purge containerd.io docker-ce
$ sudo rm -rf /var/lib/containerd
[reboot]
$ sudo apt-get install containerd.io docker-ce

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/docker-basics.html
Note
In some cases, you may need to reboot your instance to provide permissions for the ec2-user to access the Docker daemon. Try rebooting your instance if you see the following error:

Try installing Docker as root (sudo):
sudo yum install docker
See Installation on Red Hat Enterprise Linux.

Ubuntu docker swarm error "docker: Cannot connect to the Docker daemon. Is the docker daemon running on this host?."

I am trying to set up docker swarm with consul on some Ubuntu 14.04 vagrant boxes, however there is an issue with the docker daemon. I already have a progrium/consul container running and a swarm manager container running. 172.28.128.3 is the master machine running everything, 172.28.128.4 is the machine I am trying to start a docker swarm container on. Here is my command and the output:
vagrant@ubuntu-14:~$ docker -H=172.28.128.4:2375 run -d swarm join \
> --advertise=172.28.128.4:2375 \
> consul://172.28.128.3:8500/
docker: Cannot connect to the Docker daemon. Is the docker daemon running on this host?.
See 'docker run --help'.
There is no other problem with Docker, and attempting to start the daemon the same way I would with boot2docker on my Mac gives the following output:
vagrant@ubuntu-14:~$ eval "$(docker-machine env default)"
docker-machine: command not found
Update: here is the output of sudo docker info and docker info (they are exactly the same except for one line, described below):
vagrant@ubuntu-14:~$ sudo docker info
Containers: 8
Running: 2
Paused: 0
Stopped: 6
Images: 8
Server Version: 1.11.1
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 81
Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: null host bridge
Kernel Version: 3.13.0-24-generic
Operating System: Ubuntu 14.04 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 993.9 MiB
Name: ubuntu-14
ID: BBEM:JVHD:UXV7:AGQR:ITUY:3KGT:K4RS:7KSR:ESCJ:2VZQ:QTOG:J26U
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No kernel memory limit support
The only difference between the two outputs is that docker info (without sudo) has the following entry for Network:
Network: host bridge null
On my second machine there is no difference at all between the two command outputs.
UPDATE: after adding DOCKER_OPTS="-H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock" to the file /etc/default/docker on my worker machine and restarting the Docker service there (sudo service docker restart), swarm is working correctly.
Thank you JorelC for the solution.

You have to configure every machine whose Docker daemon you want to reach over TCP to listen on TCP. On your remote machine (172.28.128.4 in your question), edit the /etc/default/docker file and add something like this to DOCKER_OPTS:
DOCKER_OPTS="-H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock"
After that, restart the service:
sudo service docker restart
You should then be able to use Docker over TCP. To test whether it works, run this from your client machine:
docker -H=172.28.128.4:2375 info
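When several machines are involved, it can help to check which of them already have a TCP listener configured before editing anything. A minimal sketch follows; docker_opts_has_tcp is a hypothetical helper name, and it only inspects the DOCKER_OPTS line of /etc/default/docker, not the running daemon. Note that a plain tcp://0.0.0.0:2375 listener is unencrypted and unauthenticated, so it is best kept to private networks such as the Vagrant one in the question.

```shell
# Hypothetical helper: succeed if the file has an uncommented DOCKER_OPTS
# line that contains a "-H tcp://..." entry.
docker_opts_has_tcp() {
    grep -Eq '^[[:space:]]*DOCKER_OPTS=.*-H[ =]*tcp://' "${1:-/etc/default/docker}"
}

# Usage (on each machine):
#   docker_opts_has_tcp || echo "no TCP listener configured in /etc/default/docker"
```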

There can also be issues if you are using clones of instances or instance images that have docker preinstalled on them.
To get around that use the following shell script:
#UNINSTALL
sudo apt-get purge -y docker-engine
sudo apt-get autoremove -y --purge docker-engine
#CLONES
sudo rm /etc/docker/key.json
#INSTALL
sudo apt-get install -y curl
sudo curl -sSL http://get.docker.com | sudo sh
sudo usermod -aG docker $(whoami)
sudo su root
If you want to use the newest version of Docker (1.12, the one with swarm built in), use the following script instead:
# DOCKER 1.12.0
sudo apt-get update
sudo apt-get purge -y lxc-docker docker-engine
sudo apt-get autoremove -y --purge docker-engine
sudo curl -fsSL https://experimental.docker.com/ | sudo sh
sudo chmod 777 /etc/default/docker
echo 'DOCKER_OPTS="-H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock"' > /etc/default/docker
sudo chmod 755 /etc/default/docker
sudo rm /etc/docker/key.json
sudo service docker restart
sudo usermod -aG docker $(whoami)
sudo su root
