Hi, I have two virtual machines on a local server running Ubuntu 20.04, and I want to build a small cluster for my microservices. I ran the following steps to set up my cluster, but I have an issue with the calico-node pods: they are running with READY 0/1.
master.domain.com
ubuntu 20.04
docker --version = Docker version 20.10.7, build f0df350
kubectl version = Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.4", GitCommit:"e87da0bd6e03ec3fea7933c4b5263d151aafd07c", GitTreeState:"clean", BuildDate:"2021-02-18T16:12:00Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
worker.domain.com
ubuntu 20.04
docker --version = Docker version 20.10.2, build 20.10.2-0ubuntu1~20.04.2
kubectl version = Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.4", GitCommit:"e87da0bd6e03ec3fea7933c4b5263d151aafd07c", GitTreeState:"clean", BuildDate:"2021-02-18T16:12:00Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
STEP-1
On the master.domain.com virtual machine I ran the following commands:
sudo kubeadm init --pod-network-cidr=192.168.0.0/16
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
curl https://docs.projectcalico.org/manifests/calico.yaml -O
kubectl apply -f calico.yaml
kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system calico-kube-controllers-7f4f5bf95d-gnll8 1/1 Running 0 38s 192.168.29.195 master <none> <none>
kube-system calico-node-7zmtm 1/1 Running 0 38s 195.251.3.255 master <none> <none>
kube-system coredns-74ff55c5b-ltn9g 1/1 Running 0 3m49s 192.168.29.193 master <none> <none>
kube-system coredns-74ff55c5b-nkhzf 1/1 Running 0 3m49s 192.168.29.194 master <none> <none>
kube-system etcd-kubem 1/1 Running 0 4m6s 195.251.3.255 master <none> <none>
kube-system kube-apiserver-kubem 1/1 Running 0 4m6s 195.251.3.255 master <none> <none>
kube-system kube-controller-manager-kubem 1/1 Running 0 4m6s 195.251.3.255 master <none> <none>
kube-system kube-proxy-2cr2x 1/1 Running 0 3m49s 195.251.3.255 master <none> <none>
kube-system kube-scheduler-kubem 1/1 Running 0 4m6s 195.251.3.255 master <none> <none>
STEP-2
On the worker.domain.com virtual machine I ran the following command:
sudo kubeadm join 195.251.3.255:6443 --token azuist.xxxxxxxxxxx --discovery-token-ca-cert-hash sha256:xxxxxxxxxxxxxxxxxxxxxxxxxxxx
STEP-3
Back on the master.domain.com virtual machine I ran the following commands:
kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system calico-kube-controllers-7f4f5bf95d-gnll8 1/1 Running 0 6m37s 192.168.29.195 master <none> <none>
kube-system calico-node-7zmtm 0/1 Running 0 6m37s 195.251.3.255 master <none> <none>
kube-system calico-node-wccnb 0/1 Running 0 2m19s 195.251.3.230 worker <none> <none>
kube-system coredns-74ff55c5b-ltn9g 1/1 Running 0 9m48s 192.168.29.193 master <none> <none>
kube-system coredns-74ff55c5b-nkhzf 1/1 Running 0 9m48s 192.168.29.194 master <none> <none>
kube-system etcd-kubem 1/1 Running 0 10m 195.251.3.255 master <none> <none>
kube-system kube-apiserver-kubem 1/1 Running 0 10m 195.251.3.255 master <none> <none>
kube-system kube-controller-manager-kubem 1/1 Running 0 10m 195.251.3.255 master <none> <none>
kube-system kube-proxy-2cr2x 1/1 Running 0 9m48s 195.251.3.255 master <none> <none>
kube-system kube-proxy-kxw4m 1/1 Running 0 2m19s 195.251.3.230 worker <none> <none>
kube-system kube-scheduler-kubem 1/1 Running 0 10m 195.251.3.255 master <none> <none>
kubectl logs -n kube-system calico-node-7zmtm
...
...
2021-06-20 17:10:25.064 [INFO][56] monitor-addresses/startup.go 774: Using autodetected IPv4 address on interface eth0: 195.251.3.255/24
2021-06-20 17:10:34.862 [INFO][53] felix/summary.go 100: Summarising 11 dataplane reconciliation loops over 1m3.5s: avg=4ms longest=13ms ()
kubectl logs -n kube-system calico-node-wccnb
...
...
2021-06-20 17:10:59.818 [INFO][55] felix/summary.go 100: Summarising 8 dataplane reconciliation loops over 1m3.6s: avg=3ms longest=13ms (resync-filter-v4,resync-nat-v4,resync-raw-v4)
2021-06-20 17:11:05.994 [INFO][51] monitor-addresses/startup.go 774: Using autodetected IPv4 address on interface br-9a88318dda68: 172.21.0.1/16
As you can see, both calico-node pods show 0/1 Running. Why?
Any idea how to solve this problem?
Thank you.
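One detail in the logs above may be worth checking: on the worker, calico-node-wccnb autodetected its IPv4 address on br-9a88318dda68 (172.21.0.1/16), which looks like a Docker bridge rather than the node's real interface, while the master picked eth0. As a rough sketch (the function name is my own, and I'm assuming the log format stays as in the lines quoted above), this helper pulls the autodetected interface and address out of calico-node logs:

```shell
# Print "<interface> <address>" for every autodetection line in calico-node logs.
# Reads log text on stdin; the pattern is based only on the log lines quoted above.
autodetected_ifaces() {
  sed -n 's/.*Using autodetected IPv4 address on interface \([^:]*\): \(.*\)$/\1 \2/p'
}

# Example with the worker's log line from the question:
sample='2021-06-20 17:11:05.994 [INFO][51] monitor-addresses/startup.go 774: Using autodetected IPv4 address on interface br-9a88318dda68: 172.21.0.1/16'
printf '%s\n' "$sample" | autodetected_ifaces
# prints: br-9a88318dda68 172.21.0.1/16

# Against a live cluster this would be (not run here):
#   kubectl logs -n kube-system calico-node-wccnb | autodetected_ifaces
```

If the wrong interface is being picked, Calico's IP_AUTODETECTION_METHOD setting can pin it, for example with kubectl set env daemonset/calico-node -n kube-system IP_AUTODETECTION_METHOD=interface=eth0. That assumes eth0 is the interface your nodes use to reach each other, which I can't verify from the question.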
I've got exactly the same issue.
CentOS 8
kubectl kubeadm kubelet v1.22.3
docker-ce version 20.10.9
The only difference worth mentioning is that I had to comment out the line
- --port=0
in /etc/kubernetes/manifests/kube-scheduler.yaml; otherwise the scheduler is reported as unhealthy by
kubectl get componentstatuses
Kubernetes API is advertised on a public IP address.
The public IP address of the control plane node is replaced with 42.42.42.42 in the kubectl output.
The public IP address of the worker node is replaced with 21.21.21.21.
The public domain name (which is also the hostname of the control plane node) is replaced with public-domain.work.
>kubectl get pods -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-kube-controllers-5d995d45d6-rk9cq 1/1 Running 0 76m 192.168.231.193 public-domain.work <none> <none>
calico-node-qstxm 0/1 Running 0 76m 42.42.42.42 public-domain.work <none> <none>
calico-node-zmz5s 0/1 Running 0 75m 21.21.21.21 node1.public-domain.work <none> <none>
coredns-78fcd69978-5xsb2 1/1 Running 0 81m 192.168.231.194 public-domain.work <none> <none>
coredns-78fcd69978-q29fn 1/1 Running 0 81m 192.168.231.195 public-domain.work <none> <none>
etcd-public-domain.work 1/1 Running 3 82m 42.42.42.42 public-domain.work <none> <none>
kube-apiserver-public-domain.work 1/1 Running 3 82m 42.42.42.42 public-domain.work <none> <none>
kube-controller-manager-public-domain.work 1/1 Running 2 82m 42.42.42.42 public-domain.work <none> <none>
kube-proxy-5kkks 1/1 Running 0 81m 42.42.42.42 public-domain.work <none> <none>
kube-proxy-xsc66 1/1 Running 0 75m 21.21.21.21 node1.public-domain.work <none> <none>
kube-scheduler-public-domain.work 1/1 Running 1 (78m ago) 78m 42.42.42.42 public-domain.work <none> <none>
>kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
public-domain.work Ready control-plane,master 4h56m v1.22.3 42.42.42.42 <none> CentOS Stream 8 4.18.0-348.el8.x86_64 docker://20.10.9
node1.public-domain.work Ready <none> 4h50m v1.22.3 21.21.21.21 <none> CentOS Stream 8 4.18.0-348.el8.x86_64 docker://20.10.10
>kubectl logs -n kube-system calico-node-qstxm
2021-11-09 15:27:38.996 [INFO][86] felix/int_dataplane.go 1539: Received interface addresses update msg=&intdataplane.ifaceAddrsUpdate{Name:"eth1", Addrs:set.mapSet{}}
2021-11-09 15:27:38.996 [INFO][86] felix/hostip_mgr.go 85: Interface addrs changed. update=&intdataplane.ifaceAddrsUpdate{Name:"eth1", Addrs:set.mapSet{}}
2021-11-09 15:27:38.997 [INFO][86] felix/ipsets.go 130: Queueing IP set for creation family="inet" setID="this-host" setType="hash:ip"
2021-11-09 15:27:38.998 [INFO][86] felix/ipsets.go 785: Doing full IP set rewrite family="inet" numMembersInPendingReplace=7 setID="this-host"
2021-11-09 15:27:40.198 [INFO][86] felix/iface_monitor.go 201: Netlink address update. addr="here:is:some:ipv6:address:that:has:nothing:to:do:with:my:control:panel:server:public:ipv6" exists=true ifIndex=3
2021-11-09 15:27:40.198 [INFO][86] felix/int_dataplane.go 1071: Linux interface addrs changed. addrs=set.mapSet{"fe80::9132:a0df:82d8:e26c":set.empty{}} ifaceName="eth1"
2021-11-09 15:27:40.198 [INFO][86] felix/int_dataplane.go 1539: Received interface addresses update msg=&intdataplane.ifaceAddrsUpdate{Name:"eth1", Addrs:set.mapSet{"here:is:some:ipv6:address:that:has:nothing:to:do:with:my:control:panel:server:public:ipv6":set.empty{}}}
2021-11-09 15:27:40.199 [INFO][86] felix/hostip_mgr.go 85: Interface addrs changed. update=&intdataplane.ifaceAddrsUpdate{Name:"eth1", Addrs:set.mapSet{"here:is:some:ipv6:address:that:has:nothing:to:do:with:my:control:panel:server:public:ipv6":set.empty{}}}
2021-11-09 15:27:40.199 [INFO][86] felix/ipsets.go 130: Queueing IP set for creation family="inet" setID="this-host" setType="hash:ip"
2021-11-09 15:27:40.200 [INFO][86] felix/ipsets.go 785: Doing full IP set rewrite family="inet" numMembersInPendingReplace=7 setID="this-host"
2021-11-09 15:27:48.010 [INFO][81] monitor-addresses/startup.go 713: Using autodetected IPv4 address on interface eth0: 42.42.42.42/24
>kubectl logs -n kube-system calico-node-zmz5s
2021-11-09 15:25:56.669 [INFO][64] felix/int_dataplane.go 1071: Linux interface addrs changed. addrs=set.mapSet{} ifaceName="eth1"
2021-11-09 15:25:56.669 [INFO][64] felix/int_dataplane.go 1539: Received interface addresses update msg=&intdataplane.ifaceAddrsUpdate{Name:"eth1", Addrs:set.mapSet{}}
2021-11-09 15:25:56.669 [INFO][64] felix/hostip_mgr.go 85: Interface addrs changed. update=&intdataplane.ifaceAddrsUpdate{Name:"eth1", Addrs:set.mapSet{}}
2021-11-09 15:25:56.669 [INFO][64] felix/ipsets.go 130: Queueing IP set for creation family="inet" setID="this-host" setType="hash:ip"
2021-11-09 15:25:56.670 [INFO][64] felix/ipsets.go 785: Doing full IP set rewrite family="inet" numMembersInPendingReplace=7 setID="this-host"
2021-11-09 15:25:56.769 [INFO][64] felix/iface_monitor.go 201: Netlink address update. addr="here:is:some:ipv6:address:that:has:nothing:to:do:with:my:worknode:server:public:ipv6" exists=false ifIndex=3
2021-11-09 15:26:07.050 [INFO][64] felix/summary.go 100: Summarising 14 dataplane reconciliation loops over 1m1.7s: avg=5ms longest=11ms ()
2021-11-09 15:26:33.880 [INFO][59] monitor-addresses/startup.go 713: Using autodetected IPv4 address on interface eth0: 21.21.21.21/24
It seems the issue was that the BGP port (179/tcp) was blocked by the firewall.
These commands on the master node solved it for me:
>firewall-cmd --add-port 179/tcp --zone=public --permanent
>firewall-cmd --reload
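If you want to confirm that the port really is the problem before (or after) touching the firewall, you can probe it from the other node. This is only a sketch: the function name is mine, and it assumes bash and the coreutils timeout command are available on the node you run it from.

```shell
# Return 0 if a TCP connection to <host> <port> succeeds within 3 seconds,
# non-zero otherwise. Uses bash's /dev/tcp pseudo-device via an explicit bash call.
tcp_port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Hypothetical usage from the worker node (42.42.42.42 is the placeholder master IP used above):
#   tcp_port_open 42.42.42.42 179 && echo "BGP port reachable" || echo "BGP port blocked"
```

A successful connect only shows that something accepts TCP connections on 179; it does not prove the BGP session itself is established, so the calico-node READY column is still the final check.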
kube-system coredns-f68dcb75-f6smn 0/1 Pending 0 34m
kube-system coredns-f68dcb75-npc48 0/1 Pending 0 34m
kube-system etcd-master 1/1 Running 0 33m
kube-system kube-apiserver-master 1/1 Running 0 34m
kube-system kube-controller-manager-master 1/1 Running 0 33m
kube-system kube-flannel-ds-amd64-lngrx 1/1 Running 1 32m
kube-system kube-flannel-ds-amd64-qz2gn 1/1 Running 0 32m
kube-system kube-flannel-ds-amd64-w5lpc 1/1 Running 0 34m
kube-system kube-proxy-9l9nv 1/1 Running 0 32m
kube-system kube-proxy-hvd5g 1/1 Running 0 32m
kube-system kube-proxy-vdgth 1/1 Running 0 34m
kube-system kube-scheduler-master 1/1 Running 0 33m
I am using the latest k8s version: 1.16.0.
kubeadm init --pod-network-cidr=10.244.0.0/16 --image-repository=<some-repo> --token=TOKEN --apiserver-advertise-address=<IP> --kubernetes-version=1.16.0
This is the command I am using to initialize the cluster.
The current state of the cluster:
master NotReady master 42m v1.16.0
slave1 NotReady <none> 39m v1.16.0
slave2 NotReady <none> 39m v1.16.0
Please comment if you need any other info.
I think you need to wait for k8s v1.17.0 or update your current installation; this issue is fixed here.
Original issue
I have a Kubernetes cluster with 1 master and 3 worker nodes.
Calico v3.7.3 and Kubernetes v1.16.0, installed via kubespray: https://github.com/kubernetes-sigs/kubespray
Until now, I deployed all pods without any problems.
Now I can't start a few pods (Ceph):
kubectl get all --namespace=ceph
NAME READY STATUS RESTARTS AGE
pod/ceph-cephfs-test 0/1 Pending 0 162m
pod/ceph-mds-665d849f4f-fzzwb 0/1 Pending 0 162m
pod/ceph-mon-744f6dc9d6-jtbgk 0/1 CrashLoopBackOff 24 162m
pod/ceph-mon-744f6dc9d6-mqwgb 0/1 CrashLoopBackOff 24 162m
pod/ceph-mon-744f6dc9d6-zthpv 0/1 CrashLoopBackOff 24 162m
pod/ceph-mon-check-6f474c97f-gjr9f 1/1 Running 0 162m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/ceph-mon ClusterIP None <none> 6789/TCP 162m
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/ceph-osd 0 0 0 0 0 node-type=storage 162m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/ceph-mds 0/1 1 0 162m
deployment.apps/ceph-mon 0/3 3 0 162m
deployment.apps/ceph-mon-check 1/1 1 1 162m
NAME DESIRED CURRENT READY AGE
replicaset.apps/ceph-mds-665d849f4f 1 1 0 162m
replicaset.apps/ceph-mon-744f6dc9d6 3 3 0 162m
replicaset.apps/ceph-mon-check-6f474c97f 1 1 1 162m
But the other ones are OK:
kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-6d57b44787-xlj89 1/1 Running 19 24d
calico-node-dwm47 1/1 Running 310 19d
calico-node-hhgzk 1/1 Running 15 24d
calico-node-tk4mp 1/1 Running 309 19d
calico-node-w7zvs 1/1 Running 312 19d
coredns-74c9d4d795-jrxjn 1/1 Running 0 2d23h
coredns-74c9d4d795-psf2v 1/1 Running 2 18d
dns-autoscaler-7d95989447-7kqsn 1/1 Running 10 24d
kube-apiserver-master 1/1 Running 4 24d
kube-controller-manager-master 1/1 Running 3 24d
kube-proxy-9bt8m 1/1 Running 2 19d
kube-proxy-cbrcl 1/1 Running 4 19d
kube-proxy-stj5g 1/1 Running 0 19d
kube-proxy-zql86 1/1 Running 0 19d
kube-scheduler-master 1/1 Running 3 24d
kubernetes-dashboard-7c547b4c64-6skc7 1/1 Running 591 24d
nginx-proxy-worker1 1/1 Running 2 19d
nginx-proxy-worker2 1/1 Running 0 19d
nginx-proxy-worker3 1/1 Running 0 19d
nodelocaldns-6t92x 1/1 Running 2 19d
nodelocaldns-kgm4t 1/1 Running 0 19d
nodelocaldns-xl8zg 1/1 Running 0 19d
nodelocaldns-xwlwk 1/1 Running 12 24d
tiller-deploy-8557598fbc-7f2w6 1/1 Running 0 131m
I use CentOS 7:
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
The error log:
Get https://10.2.67.203:10250/containerLogs/ceph/ceph-mon-744f6dc9d6-mqwgb/ceph-mon?tailLines=5000&timestamps=true: dial tcp 10.2.67.203:10250: connect: no route to host
Maybe someone has come across this and can help me? I will provide any additional information.
Events from the pending pods:
Warning FailedScheduling 98s (x125 over 3h1m) default-scheduler 0/4 nodes are available: 4 node(s) didn't match node selector.
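The event above says that none of the 4 nodes matched the pods' node selector, and the ceph-osd DaemonSet in the listing has the selector node-type=storage with 0 desired pods. As a sketch for checking which nodes carry a given label (the function name and the sample node data are hypothetical, and I'm assuming the pending pods use a selector of this kind, which the question does not show):

```shell
# Print the names of nodes whose LABELS column contains the given key=value.
# Reads `kubectl get nodes --show-labels` output on stdin (labels are the last column).
nodes_with_label() {
  awk -v want="$1" 'NR > 1 {
    n = split($NF, labels, ",")
    for (i = 1; i <= n; i++) if (labels[i] == want) { print $1; break }
  }'
}

# Hypothetical sample output; real data would come from: kubectl get nodes --show-labels
sample='NAME STATUS ROLES AGE VERSION LABELS
worker1 Ready <none> 19d v1.16.0 kubernetes.io/hostname=worker1,node-type=storage
worker2 Ready <none> 19d v1.16.0 kubernetes.io/hostname=worker2'
printf '%s\n' "$sample" | nodes_with_label node-type=storage
```

If nothing is printed for the real cluster, labelling a node (for example kubectl label node worker1 node-type=storage, with worker1 standing in for a real node name) is the usual way to make the selector match.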
It seems that a firewall is blocking ingress traffic to port 10250 on the 10.2.67.203 node.
You can open it by running the commands below (I'm assuming firewalld is installed; otherwise run the equivalent commands for your firewall):
sudo firewall-cmd --add-port=10250/tcp --permanent
sudo firewall-cmd --reload
sudo firewall-cmd --list-all # you should see that port `10250` is updated
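To check for the port in a script rather than reading the --list-all output by eye, something like the following could work. It is a sketch only: the function name is mine, and the sample port list is made up for illustration.

```shell
# Succeed if the given port/protocol (e.g. 10250/tcp) appears in the
# space-separated output of `firewall-cmd --list-ports`, read from stdin.
port_is_listed() {
  tr ' ' '\n' | grep -qxF -- "$1"
}

# Example with made-up output; on a real node:
#   sudo firewall-cmd --list-ports | port_is_listed 10250/tcp
if printf '%s\n' '6443/tcp 10250/tcp 179/tcp' | port_is_listed 10250/tcp; then
  echo "10250/tcp is listed"
else
  echo "10250/tcp is not listed"
fi
```

Keep in mind that a rule added with --permanent only shows up in the runtime configuration after firewall-cmd --reload, which is why the reload step above matters.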
tl;dr: It looks like your cluster itself is fairly broken and should be repaired before looking at Ceph specifically.
Get https://10.2.67.203:10250/containerLogs/ceph/ceph-mon-744f6dc9d6-mqwgb/ceph-mon?tailLines=5000&timestamps=true: dial tcp 10.2.67.203:10250: connect: no route to host
10250 is the port that the Kubernetes API server uses to connect to a node's Kubelet to retrieve the logs.
This error indicates that the Kubernetes API server is unable to reach the node. This has nothing to do with your containers, pods or even your CNI network. no route to host indicates that either:
The host is unavailable
A network segmentation has occurred
The Kubelet is unable to answer the API server
Before addressing issues with the Ceph pods I would investigate why the Kubelet isn't reachable from the API server.
After you have solved the underlying network connectivity issues I would address the crash-looping Calico pods (You can see the logs of the previously executed containers by running kubectl logs -n kube-system calico-node-dwm47 -p).
Once you have both the underlying network and the pod network sorted I would address the issues with the Kubernetes Dashboard crash-looping, and finally, start to investigate why you are having issues deploying Ceph.
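To decide where to look first, it can help to list the system pods with the highest restart counts. A minimal sketch, assuming the plain kubectl get pods -n kube-system column layout shown in the question (NAME READY STATUS RESTARTS AGE); the function name and the threshold of 100 are my own choices:

```shell
# Print "<pod> <restarts>" for pods whose RESTARTS column is at or above a threshold.
# Reads `kubectl get pods -n <namespace>` output on stdin; RESTARTS is the 4th column.
high_restarts() {
  awk -v min="$1" 'NR > 1 && $4 + 0 >= min { print $1, $4 }'
}

# A few rows taken from the kube-system listing in the question:
sample='NAME READY STATUS RESTARTS AGE
calico-node-dwm47 1/1 Running 310 19d
calico-node-hhgzk 1/1 Running 15 24d
kubernetes-dashboard-7c547b4c64-6skc7 1/1 Running 591 24d'
printf '%s\n' "$sample" | high_restarts 100

# Live usage (not run here):
#   kubectl get pods -n kube-system | high_restarts 100
```

With the rows above this reports calico-node-dwm47 and the dashboard pod, which matches the order of investigation suggested in this answer.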
I have a Kubernetes cluster running on 4 Raspberry Pi devices: 1 acts as the master and the other 3 are workers (w1, w2, w3). I have started a DaemonSet deployment, so each worker runs a pod with 2 containers.
On w2, if I exec into either container and ping www.google.com, I get a response. But if I do the same on w1 or w3, it says "temporary failure in name resolution". All the pods in kube-system are running. I am using Weave for networking. Below are all the pods in kube-system:
NAME READY STATUS RESTARTS AGE
etcd-master-pi 1/1 Running 1 23h
kube-apiserver-master-pi 1/1 Running 1 23h
kube-controller-manager-master-pi 1/1 Running 1 23h
kube-dns-7b6ff86f69-97vtl 3/3 Running 3 23h
kube-proxy-2tmgw 1/1 Running 0 14m
kube-proxy-9xfx9 1/1 Running 2 22h
kube-proxy-nfgwg 1/1 Running 1 23h
kube-proxy-xbdxl 1/1 Running 3 23h
kube-scheduler-master-pi 1/1 Running 1 23h
weave-net-7sh5n 2/2 Running 1 14m
weave-net-c7x8p 2/2 Running 3 23h
weave-net-mz4c4 2/2 Running 6 22h
weave-net-qtgmw 2/2 Running 10 23h
If I start the containers with the plain docker command instead of through the Kubernetes deployment, I do not see this issue. I think this is because of kube-dns. How can I debug this issue?
You can start by checking whether DNS is working.
Run nslookup on kubernetes.default from inside the pod and check that it resolves:
[root@metrics-master-2 /]# nslookup kubernetes.default
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: kubernetes.default.svc.cluster.local
Address: 10.96.0.1
Check the local DNS configuration inside the pods:
[root@metrics-master-2 /]# cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local ec2.internal
options ndots:5
Finally, check the kube-dns container logs while you run the ping command; they should indicate why the name is not resolving.
kubectl logs kube-dns-86f4d74b45-7c4ng -c kubedns -n kube-system
Hope this helps.
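The resolv.conf check above can also be scripted, which is handy when comparing pods on w1, w2 and w3. This is a sketch under assumptions: the function name is mine, and 10.96.0.10 is simply the DNS service address from my own output above, so yours may differ.

```shell
# Print the nameserver addresses found in resolv.conf-style text read from stdin.
resolv_nameservers() {
  awk '$1 == "nameserver" { print $2 }'
}

# resolv.conf content copied from the output above.
sample_resolv='nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local ec2.internal
options ndots:5'

# Assumed cluster DNS service address; on a live cluster it could be read with:
#   kubectl get svc -n kube-system kube-dns -o jsonpath='{.spec.clusterIP}'
expected_dns=10.96.0.10

if printf '%s\n' "$sample_resolv" | resolv_nameservers | grep -qxF -- "$expected_dns"; then
  echo "resolv.conf points at the cluster DNS service"
else
  echo "resolv.conf does NOT point at the cluster DNS service"
fi
```

If the nameserver is right on every worker but lookups still fail on w1 and w3, the problem is more likely in reaching the DNS pod over the pod network than in the pod's own configuration.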
This might not be applicable to your scenario, but I wanted to document the solution I found. My issues ended up being related to a flannel network overlay setup on our master nodes.
# kubectl get pods --namespace kube-system
NAME READY STATUS RESTARTS AGE
coredns-qwer 1/1 Running 0 4h54m
coredns-asdf 1/1 Running 0 4h54m
etcd-h1 1/1 Running 0 4h53m
etcd-h2 1/1 Running 0 4h48m
etcd-h3 1/1 Running 0 4h48m
kube-apiserver-h1 1/1 Running 0 4h53m
kube-apiserver-h2 1/1 Running 0 4h48m
kube-apiserver-h3 1/1 Running 0 4h48m
kube-controller-manager-h1 1/1 Running 2 4h53m
kube-controller-manager-h2 1/1 Running 0 4h48m
kube-controller-manager-h3 1/1 Running 0 4h48m
kube-flannel-ds-amd64-asdf 1/1 Running 0 4h48m
kube-flannel-ds-amd64-qwer 1/1 Running 1 4h48m
kube-flannel-ds-amd64-zxcv 1/1 Running 0 3h51m
kube-flannel-ds-amd64-wert 1/1 Running 0 4h54m
kube-flannel-ds-amd64-sdfg 1/1 Running 1 4h41m
kube-flannel-ds-amd64-xcvb 1/1 Running 1 4h42m
kube-proxy-qwer 1/1 Running 0 4h42m
kube-proxy-asdf 1/1 Running 0 4h54m
kube-proxy-zxcv 1/1 Running 0 4h48m
kube-proxy-wert 1/1 Running 0 4h41m
kube-proxy-sdfg 1/1 Running 0 4h48m
kube-proxy-xcvb 1/1 Running 0 4h42m
kube-scheduler-h1 1/1 Running 1 4h53m
kube-scheduler-h2 1/1 Running 1 4h48m
kube-scheduler-h3 1/1 Running 0 4h48m
tiller-deploy-asdf 1/1 Running 0 4h28m
If I exec'd into any container and pinged google.com from it, I got a bad address response.
# ping google.com
ping: bad address 'google.com'
# ip route
default via 10.168.3.1 dev eth0
10.168.3.0/24 dev eth0 scope link src 10.168.3.22
10.244.0.0/16 via 10.168.3.1 dev eth0
The output of ip route inside the pod differs from ip route run on the master node.
Altering my pod's deployment configuration to include hostNetwork: true allowed me to ping outside my container.
ip route from my newly running pod:
# ip route
default via 172.25.10.1 dev ens192 metric 100
10.168.0.0/24 via 10.168.0.0 dev flannel.1 onlink
10.168.1.0/24 via 10.168.1.0 dev flannel.1 onlink
10.168.2.0/24 via 10.168.2.0 dev flannel.1 onlink
10.168.3.0/24 dev cni0 scope link src 10.168.3.1
10.168.4.0/24 via 10.168.4.0 dev flannel.1 onlink
10.168.5.0/24 via 10.168.5.0 dev flannel.1 onlink
172.17.0.0/16 dev docker0 scope link src 172.17.0.1
172.25.10.0/23 dev ens192 scope link src 172.25.11.35 metric 100
192.168.122.0/24 dev virbr0 scope link src 192.168.122.1
# ping google.com
PING google.com (172.217.6.110): 56 data bytes
64 bytes from 172.217.6.110: seq=0 ttl=55 time=3.488 ms
Update 1
My associate and I found a number of different websites which advise against setting hostNetwork: true. We then found this issue and are currently investigating it as a possible solution, sans hostNetwork: true.
Usually you'd do this with the '--ip-masq' flag to flannel which is 'false' by default and is defined as "setup IP masquerade rule for traffic destined outside of overlay network". Which sounds like what you want.
Update 2
It turns out that our flannel network overlay was misconfigured. We needed to ensure that our configmap for flannel had net-conf\.json.network matching our networking.podSubnet (kubeadm config view). Changing these networks to match alleviated our networking woes. We were then able to remove hostNetwork: true from our deployments.
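For anyone wanting to make the same comparison, here is a rough sketch of the check. The function name, the sample net-conf.json and the podSubnet value are illustrative assumptions, not our actual configuration.

```shell
# Extract the "Network" value from flannel's net-conf.json text read on stdin.
flannel_network() {
  sed -n 's/.*"Network"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Illustrative net-conf.json and podSubnet values (both hypothetical).
net_conf='{
  "Network": "10.244.0.0/16",
  "Backend": { "Type": "vxlan" }
}'
pod_subnet='10.168.0.0/16'

network=$(printf '%s\n' "$net_conf" | flannel_network)
if [ "$network" = "$pod_subnet" ]; then
  echo "match: $network"
else
  echo "mismatch: flannel Network=$network, podSubnet=$pod_subnet"
fi

# Live value, assuming the ConfigMap is named kube-flannel-cfg in kube-system (not run here):
#   kubectl get configmap kube-flannel-cfg -n kube-system -o jsonpath='{.data.net-conf\.json}' | flannel_network
```

This is a plain string comparison, so it only catches the case we hit, where the two networks were simply different.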
I have a local cluster (no cloud provider) made up of 3 VMs: the master and two nodes. I created an NFS-backed volume so that it can be reused if a pod dies and is rescheduled on another node, but I think some component is not working properly. To create the cluster I just followed this guide: kubernetes guide. After creating the cluster, this is the current state:
master@master-VirtualBox:~/Documents/KubeT/nfs$ sudo kubectl get pod --all-namespaces
[sudo] password for master:
NAMESPACE NAME READY STATUS RESTARTS AGE
default mysqlnfs3 1/1 Running 0 27m
kube-system etcd-master-virtualbox 1/1 Running 0 46m
kube-system kube-apiserver-master-virtualbox 1/1 Running 0 46m
kube-system kube-controller-manager-master-virtualbox 1/1 Running 0 46m
kube-system kube-dns-86f4d74b45-f6hpf 3/3 Running 0 47m
kube-system kube-flannel-ds-nffv6 1/1 Running 0 38m
kube-system kube-flannel-ds-rqw9v 1/1 Running 0 39m
kube-system kube-flannel-ds-s5wzn 1/1 Running 0 44m
kube-system kube-proxy-6j7p8 1/1 Running 0 38m
kube-system kube-proxy-7pj8d 1/1 Running 0 39m
kube-system kube-proxy-jqshs 1/1 Running 0 47m
kube-system kube-scheduler-master-virtualbox 1/1 Running 0 46m
master@master-VirtualBox:~/Documents/KubeT/nfs$ sudo kubectl get node
NAME STATUS ROLES AGE VERSION
host1-virtualbox Ready <none> 39m v1.10.2
host2-virtualbox Ready <none> 40m v1.10.2
master-virtualbox Ready master 48m v1.10.2
And this is the pod:
master@master-VirtualBox:~/Documents/KubeT/nfs$ sudo kubectl get pod
NAME READY STATUS RESTARTS AGE
mysqlnfs3 1/1 Running 0 29m
It is scheduled on host2. If I open a shell on host2 and run docker exec, I can use the container without problems, and the data is stored and retrieved; but when I try kubectl exec, it does not work:
master@master-VirtualBox:~/Documents/KubeT/nfs$ sudo kubectl exec -it -n default mysqlnfs3 -- /bin/bash
error: unable to upgrade connection: pod does not exist
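One commonly reported cause of this error on VirtualBox setups is that every VM registers with the same NAT address (often 10.0.2.15), so the API server ends up talking to the wrong kubelet. I can't tell from the output above whether that applies here, but it is cheap to check. A sketch (the function name and the sample data are hypothetical):

```shell
# Print every IP address that is reported by more than one node.
# Reads lines of the form "<node-name> <internal-ip>" on stdin.
duplicate_node_ips() {
  awk 'NF >= 2 { count[$2]++ } END { for (ip in count) if (count[ip] > 1) print ip }'
}

# Hypothetical input: three VirtualBox VMs that all registered their NAT address.
printf '%s\n' \
  'master-virtualbox 10.0.2.15' \
  'host1-virtualbox 10.0.2.15' \
  'host2-virtualbox 10.0.2.15' | duplicate_node_ips

# Live input could be produced with (not run here):
#   kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}'
```

If an address does show up more than once, the usual remedy is to start each kubelet with --node-ip set to the address of the interface the VMs share.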
Overview
kube-dns can't start (SetupNetworkError) after kubeadm init and network setup:
Error syncing pod, skipping: failed to "SetupNetwork" for
"kube-dns-654381707-w4mpg_kube-system" with SetupNetworkError:
"Failed to setup network for pod
\"kube-dns-654381707-w4mpg_kube-system(8ffe3172-a739-11e6-871f-000c2912631c)\"
using network plugins \"cni\": open /run/flannel/subnet.env:
no such file or directory; Skipping pod"
Kubernetes version
Client Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.4", GitCommit:"3b417cc4ccd1b8f38ff9ec96bb50a81ca0ea9d56", GitTreeState:"clean", BuildDate:"2016-10-21T02:48:38Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.4", GitCommit:"3b417cc4ccd1b8f38ff9ec96bb50a81ca0ea9d56", GitTreeState:"clean", BuildDate:"2016-10-21T02:42:39Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
Environment
VMWare Fusion for Mac
OS
NAME="Ubuntu"
VERSION="16.04.1 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.1 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
Kernel (e.g. uname -a)
Linux ubuntu-master 4.4.0-47-generic #68-Ubuntu SMP Wed Oct 26 19:39:52 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
What is the problem
kube-system kube-dns-654381707-w4mpg 0/3 ContainerCreating 0 2m
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
3m 3m 1 {default-scheduler } Normal Scheduled Successfully assigned kube-dns-654381707-w4mpg to ubuntu-master
2m 1s 177 {kubelet ubuntu-master} Warning FailedSync Error syncing pod, skipping: failed to "SetupNetwork" for "kube-dns-654381707-w4mpg_kube-system" with SetupNetworkError: "Failed to setup network for pod \"kube-dns-654381707-w4mpg_kube-system(8ffe3172-a739-11e6-871f-000c2912631c)\" using network plugins \"cni\": open /run/flannel/subnet.env: no such file or directory; Skipping pod"
What I expected to happen
kube-dns Running
How to reproduce it
root@ubuntu-master:~# kubeadm init
Running pre-flight checks
<master/tokens> generated token: "247a8e.b7c8c1a7685bf204"
<master/pki> generated Certificate Authority key and certificate:
Issuer: CN=kubernetes | Subject: CN=kubernetes | CA: true
Not before: 2016-11-10 11:40:21 +0000 UTC Not After: 2026-11-08 11:40:21 +0000 UTC
Public: /etc/kubernetes/pki/ca-pub.pem
Private: /etc/kubernetes/pki/ca-key.pem
Cert: /etc/kubernetes/pki/ca.pem
<master/pki> generated API Server key and certificate:
Issuer: CN=kubernetes | Subject: CN=kube-apiserver | CA: false
Not before: 2016-11-10 11:40:21 +0000 UTC Not After: 2017-11-10 11:40:21 +0000 UTC
Alternate Names: [172.20.10.4 10.96.0.1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local]
Public: /etc/kubernetes/pki/apiserver-pub.pem
Private: /etc/kubernetes/pki/apiserver-key.pem
Cert: /etc/kubernetes/pki/apiserver.pem
<master/pki> generated Service Account Signing keys:
Public: /etc/kubernetes/pki/sa-pub.pem
Private: /etc/kubernetes/pki/sa-key.pem
<master/pki> created keys and certificates in "/etc/kubernetes/pki"
<util/kubeconfig> created "/etc/kubernetes/kubelet.conf"
<util/kubeconfig> created "/etc/kubernetes/admin.conf"
<master/apiclient> created API client configuration
<master/apiclient> created API client, waiting for the control plane to become ready
<master/apiclient> all control plane components are healthy after 14.053453 seconds
<master/apiclient> waiting for at least one node to register and become ready
<master/apiclient> first node is ready after 0.508561 seconds
<master/apiclient> attempting a test deployment
<master/apiclient> test deployment succeeded
<master/discovery> created essential addon: kube-discovery, waiting for it to become ready
<master/discovery> kube-discovery is ready after 1.503838 seconds
<master/addons> created essential addon: kube-proxy
<master/addons> created essential addon: kube-dns
Kubernetes master initialised successfully!
You can now join any number of machines by running the following on each node:
kubeadm join --token=247a8e.b7c8c1a7685bf204 172.20.10.4
root@ubuntu-master:~#
root@ubuntu-master:~# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system dummy-2088944543-eo1ua 1/1 Running 0 47s
kube-system etcd-ubuntu-master 1/1 Running 3 51s
kube-system kube-apiserver-ubuntu-master 1/1 Running 0 49s
kube-system kube-controller-manager-ubuntu-master 1/1 Running 3 51s
kube-system kube-discovery-1150918428-qmu0b 1/1 Running 0 46s
kube-system kube-dns-654381707-mv47d 0/3 ContainerCreating 0 44s
kube-system kube-proxy-k0k9q 1/1 Running 0 44s
kube-system kube-scheduler-ubuntu-master 1/1 Running 3 51s
root@ubuntu-master:~#
root@ubuntu-master:~# kubectl apply -f https://git.io/weave-kube
daemonset "weave-net" created
root@ubuntu-master:~#
root@ubuntu-master:~# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system dummy-2088944543-eo1ua 1/1 Running 0 47s
kube-system etcd-ubuntu-master 1/1 Running 3 51s
kube-system kube-apiserver-ubuntu-master 1/1 Running 0 49s
kube-system kube-controller-manager-ubuntu-master 1/1 Running 3 51s
kube-system kube-discovery-1150918428-qmu0b 1/1 Running 0 46s
kube-system kube-dns-654381707-mv47d 0/3 ContainerCreating 0 44s
kube-system kube-proxy-k0k9q 1/1 Running 0 44s
kube-system kube-scheduler-ubuntu-master 1/1 Running 3 51s
kube-system weave-net-ja736 2/2 Running 0 1h
It looks like you configured flannel before running kubeadm init. You can try to fix this by removing flannel (it may be sufficient to remove its CNI config file with rm -f /etc/cni/net.d/*flannel*), but it's best to start fresh.
Open the file below (create it if it does not exist) and paste in the following data:
vim /run/flannel/subnet.env
FLANNEL_NETWORK=10.240.0.0/16
FLANNEL_SUBNET=10.240.0.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
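If you would rather script this than edit the file by hand, a minimal sketch could look like the following. The function name is my own, and the values are just the ones listed above, so adjust them to your pod network.

```shell
# Write a flannel subnet.env file to the given path (default: /run/flannel/subnet.env).
# Creates the parent directory if it is missing.
write_subnet_env() {
  target=${1:-/run/flannel/subnet.env}
  mkdir -p "$(dirname "$target")" || return 1
  cat > "$target" <<'EOF'
FLANNEL_NETWORK=10.240.0.0/16
FLANNEL_SUBNET=10.240.0.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
EOF
}

# Try it against a scratch location first; writing under /run needs root.
scratch=$(mktemp -d)
write_subnet_env "$scratch/flannel/subnet.env"
cat "$scratch/flannel/subnet.env"
```

Note that /run is normally a tmpfs, so a hand-written file will not survive a reboot; when flannel is actually running, flanneld writes this file itself.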