kube-dns not working on kubernetes arm - docker

I deploy a kubernetes cluster following the guide: https://blog.hypriot.com/post/setup-kubernetes-raspberry-pi-cluster/. It basically uses hypriotOS and kubernetes from the debian repository.
After the deployment, all the pods were running and no faults were shown. However, the dns server was not working properly on the worker node.
master
$ kubectl -n kube-system get svc
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns 10.96.0.10 <none> 53/UDP,53/TCP 34m
kubernetes-dashboard 10.103.97.112 <nodes> 80:30518/TCP 31m
# I installed the dnsutils to have the dig command
$ dig #10.96.0.10 || echo "FAIL"
# shows a valid response (note that we are not resolving anything)
worker
$ dig #10.96.0.10 || echo "FAIL"
....
FAIL

It turn out that the answer was in one of the comments from , but it was not clear that this was my issue.
As the author of the comment stated is due to the iptables policies from Docker versions > 1.13.
To solve it, execute the following on both nodes:
sudo iptables -A FORWARD -i cni0 -j ACCEPT
sudo iptables -A FORWARD -o cni0 -j ACCEPT

Related

Kubernetes CoreDNS in CrashLoopBackOff

I understand that this question is asked dozen times, but nothing has helped me through internet searching.
My set up:
CentOS Linux release 7.5.1804 (Core)
Docker Version: 18.06.1-ce
Kubernetes: v1.12.3
Installed by official guide and this one:https://www.techrepublic.com/article/how-to-install-a-kubernetes-cluster-on-centos-7/
CoreDNS pods are in Error/CrashLoopBackOff state.
kube-system coredns-576cbf47c7-8phwt 0/1 CrashLoopBackOff 8 31m
kube-system coredns-576cbf47c7-rn2qc 0/1 CrashLoopBackOff 8 31m
My /etc/resolv.conf:
nameserver 8.8.8.8
Also tried with my local dns-resolver(router)
nameserver 10.10.10.1
Setup and init:
kubeadm init --apiserver-advertise-address=10.10.10.3 --pod-network-cidr=192.168.1.0/16
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
I tried to solve this with:
Editing the coredns: root#kub~]# kubectl edit cm coredns -n kube-system
and changing
proxy . /etc/resolv.conf
directly to
proxy . 10.10.10.1
or
proxy . 8.8.8.8
Also tried to:
kubectl -n kube-system get deployment coredns -o yaml | sed 's/allowPrivilegeEscalation: false/allowPrivilegeEscalation: true/g' | kubectl apply -f -
And still nothing helps me.
Error from the logs:
plugin/loop: Seen "HINFO IN 7847735572277573283.2952120668710018229." more than twice, loop detected
The other thread - coredns pods have CrashLoopBackOff or Error state didnt help at all, becouse i havent hit any solutions that were described there. Nothing helped.
Even I have got such error and I successfully managed to work by below steps.
However, you missed 8.8.4.4
sudo nano /etc/resolv.conf
nameserver 8.8.8.8
nameserver 8.8.4.4
run following commands to restart daemon and docker service
sudo systemctl daemon-reload
sudo systemctl restart docker
If you are using kubeadm make sure you delete an entire cluster from master and provision cluster again.
kubectl drain <node_name> --delete-local-data --force --ignore-daemonsets
kubectl delete node <node_name>
kubeadm reset
Once You Provision the new cluster
kubectl get pods --all-namespaces
It Should give below expected Result
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-node-gldlr 2/2 Running 0 24s
kube-system coredns-86c58d9df4-lpnj6 1/1 Running 0 40s
kube-system coredns-86c58d9df4-xnb5r 1/1 Running 0 40s
kube-system kube-proxy-kkb7b 1/1 Running 0 40s
kube-system kube-scheduler-osboxes 1/1 Running 0 10s
$kubectl edit cm coredns -n kube-system
delete ‘loop’ ,save and exit
restart master node. It was work for me.
I faced the the same issue in my local k8s in Docker (KIND) setup. CoreDns pod gets crashloop backoff error.
Steps followed to make the pod into running state:
As Tim Chan said in this post and by referring the github issues link, I did the following
kubectl -n kube-system edit configmaps coredns -o yaml
modify the section
forward . /etc/resolv.conf with forward . 172.16.232.1 (mycase i set 8.8.8.8 for the timebeing)
Delete one of the Coredns Pods, or can wait for sometime - the pods will be in running state.
Usually happens when coredns can't talk to the kube-apiserver:
Check that your kubernetes service is in the default namespace:
$ kubectl get svc kubernetes
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 130d
Then (you might have to create a pod):
$ kubectl -n kube-system exec -it <any-pod-with-shell> sh
# ping kubernetes.default.svc.cluster.local
PING kubernetes.default.svc.cluster.local (10.96.0.1): 56 data bytes
Also, try hitting port 443 from the port:
# telnet kubernetes.default.svc.cluster.local 443 # or
# curl kubernetes.default.svc.cluster.local:443
I got the error is:
connect: no route to host","time":"2021-03-19T14:42:05Z"}
crashloopbackoff
in the log showed by kubectl -n kube-system logs coredns-d9fdb9c9f-864rz
The issue is mentioned in https://github.com/coredns/coredns/tree/master/plugin/loop#troubleshooting-loops-in-kubernetes-clusters
tldr;
Reason: /etc/resolv.conf got updated somehow. The original one is at /run/systemd/resolve/resolv.conf:
e.g:
nameserver 172.16.232.1
Quick fix, edit Corefile:
$ kubectl -n kube-system edit configmaps coredns -o yaml
to replace forward . /etc/resolv.conf with forward . 172.16.232.1
e.g:
apiVersion: v1
data:
Corefile: |
.:53 {
errors
health {
lameduck 5s
}
ready
kubernetes cluster.local in-addr.arpa ip6.arpa {
pods insecure
fallthrough in-addr.arpa ip6.arpa
ttl 30
}
prometheus :9153
forward . 172.16.232.1 {
max_concurrent 1000
}
cache 30
loop
reload
loadbalance
}
kind: ConfigMap
metadata:
creationTimestamp: "2021-03-18T15:58:07Z"
name: coredns
namespace: kube-system
resourceVersion: "49996"
uid: 428a03ff-82d0-4812-a3fa-e913c2911ebd
Done, after that, may need to restart the docker
sudo systemctl restart docker
Update: it could be fixed by just sudo systemctl restart docker

Coredns in pending state in Kubernetes cluster

I am trying to configure a 2 node Kubernetes cluster. First I am trying to configure the master node of the cluster on a CentOS VM. I have initialized the cluster using 'kubeadm init --apiserver-advertise-address=172.16.100.6 --pod-network-cidr=10.244.0.0/16' and deployed the flannel network to the cluster. But when I do 'kubectl get nodes', I get the following output ----
[root#kubernetus ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
kubernetus NotReady master 57m v1.12.0
Following is the output of 'kubectl get pods --all-namespaces -o wide ' ----
[root#kubernetus ~]# kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
kube-system coredns-576cbf47c7-9x59x 0/1 Pending 0 58m <none> <none> <none>
kube-system coredns-576cbf47c7-l52wc 0/1 Pending 0 58m <none> <none> <none>
kube-system etcd-kubernetus 1/1 Running 2 57m 172.16.100.6 kubernetus <none>
kube-system kube-apiserver-kubernetus 1/1 Running 2 57m 172.16.100.6 kubernetus <none>
kube-system kube-controller-manager-kubernetus 1/1 Running 1 57m 172.16.100.6 kubernetus <none>
kube-system kube-proxy-hr557 1/1 Running 1 58m 172.16.100.6 kubernetus <none>
kube-system kube-scheduler-kubernetus 1/1 Running 1 57m 172.16.100.6 kubernetus <none>
coredns is in a pending state for a very long time. I have removed docker and kubectl, kubeadm, kubelet a no of times & tried to recreate the cluster, but every time it shows the same output. Can anybody help me with this issue?
Try to install Pod network add-on (Base on this guide).
Run this line:
kubectl apply -f https://docs.projectcalico.org/v3.14/manifests/calico.yaml
Unable to update cni config: No networks found in /etc/cni/net.d .....
Oct 02 19:21:32 kubernetus kubelet[19007]: E1002 19:21:32.886170 19007
kubelet.go:2167] Container runtime network not ready:
NetworkReady=false reason:NetworkPluginNotReady message:docker:
network plugin is not ready: cni config uninitialized
According to this error, you forgot to initialize a Kubernetes Pod network add-on. Looking at your settings, I suppose it should be Flannel.
Here is the instruction from the official Kubernetes documentation:
For flannel to work correctly, you must pass
--pod-network-cidr=10.244.0.0/16 to kubeadm init.
Set /proc/sys/net/bridge/bridge-nf-call-iptables to 1 by running
sysctl net.bridge.bridge-nf-call-iptables=1 to pass bridged IPv4
traffic to iptables’ chains. This is a requirement for some CNI
plugins to work, for more information please see here.
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.10.0/Documentation/kube-flannel.yml
Note that flannel works on amd64, arm, arm64 and ppc64le, but until
flannel v0.11.0 is released you need to use the following manifest
that supports all the architectures:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/c5d10c8/Documentation/kube-flannel.yml
For more information, you can visit this link.
For the Kubernetes cluster to be available, the cluster should have a Container Networking Interface (CNI). A pod-network is required to be configured for the dns pod to be functional.
Install any of the CNI Providers like:
- Flannel
- Calico
- Canal
- WeaveNet, etc.,
Without this, the hosted Kubernetes cluster would have the master in the NotReady State.
Check if docker and kubernetes are using the same cgroup driver.
I faced the same issue (CentOS 7, kubernetes v1.14.1), and setting same cgroup driver (systemd) fixed it.
I installed kubernetes with 1 master + 1 work-node.
After I made kubeadm init ..., I faced two issues:
On the master node, the coredns were pending.
On the work-node, kubectl command didn't work out
On the work-node, I did the following and fixed the both issues:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/kubelet.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config**
For me, I've restarted the system and re-applied calico.yaml, coredns and calico pods started creating.
take this solution at least priority and try changing instance type (preferably higher cpu core/ram)
in my case i have changed linux instance t3.micro to t2.medium and its works

Flannel fails in kubernetes cluster due to failure of subnet manager

I am running etcd, kube-apiserver, kube-scheduler, and kube-controllermanager on a master node as well as kubelet and kube-proxy on a minion node as follows (all kube binaries are from kubernetes 1.7.4):
# [master node]
./etcd
./kube-apiserver --logtostderr=true --etcd-servers=http://127.0.0.1:2379 --service-cluster-ip-range=10.10.10.0/24 --insecure-port 8080 --secure-port=0 --allow-privileged=true --insecure-bind-address 0.0.0.0
./kube-scheduler --address=0.0.0.0 --master=http://127.0.0.1:8080
./kube-controller-manager --address=0.0.0.0 --master=http://127.0.0.1:8080
# [minion node]
./kubelet --logtostderr=true --address=0.0.0.0 --api_servers=http://$MASTER_IP:8080 --allow-privileged=true
./kube-proxy --master=http://$MASTER_IP:8080
After this, if I execute kubectl get all --all-namespaces and kubectl get nodes, I get
NAMESPACE NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default svc/kubernetes 10.10.10.1 <none> 443/TCP 27m
NAME STATUS AGE VERSION
minion-1 Ready 27m v1.7.4+793658f2d7ca7
Then, I apply flannel as follows:
kubectl apply -f kube-flannel-rbac.yml -f kube-flannel.yml
Now, I see a pod is created, but with error:
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system kube-flannel-ds-p8tcb 1/2 CrashLoopBackOff 4 2m
When I check the logs inside the failed container in the minion node, I see the following error:
Failed to create SubnetManager: unable to initialize inclusterconfig: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory
My question is: how to resolve this? Is this a SSL issue? What step am I missing in setting up my cluster?
Maybe it is your flannel yaml file has something wrong,
you can try this to install your flannel,
check the old ip link
ip link
if it show flannel,please delete it
ip link delete flannel.1
and install , its default pod network cdir is 10.244.0.0/16
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.9.0/Documentation/kube-flannel.yml
You could try to pass --etcd-prefix=/your/prefix and --etcd-endpoints=address to flanneld instead of --kube-subnet-mgr so flannel get net-conf from etcd server and not from api server.
Keep in mind that you must to push net-conf to etcd server.
UPDATE
The problem (/var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory) can appear when execute apiserver without --admission-control=...,ServiceAccount,... or if kubelet is inside a container (eg: hypercube) and this last was my case. If you want execute k8s components inside a container you need to pass 'shared' option to kubelet volume
/var/lib/kubelet/:/var/lib/kubelet:rw,shared
Furthermore enable same option to docker in docker.service
MountFlags=shared
Now the question is: is there a security hole with shared mount?

kubernetes.default: Name does not resolve

Im running OpenShift.
OpenShift Master: v3.3.1.7
Kubernetes Master: v1.3.0+52492b4
But am having problems trying to run a build in Jenkins (running in a pod). This is not a problem with the java code that I'm trying to build, but is a problem in the Kubernetes/Openshift setup.
The builds fail with:
Caused by: java.net.UnknownHostException: kubernetes.default: Name does not resolve
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
at java.net.InetAddress.getAllByName(InetAddress.java:1192)
at java.net.InetAddress.getAllByName(InetAddress.java:1126)
at okhttp3.Dns$1.lookup(Dns.java:39)
...
Does anyone know how to fix this?
First confirm that DNS is actually working with:
› kubectl run -i -t busybox --image=busybox --restart=Never
Waiting for pod default/busybox to be running, status is Pending, pod ready: false
If you don't see a command prompt, try pressing enter.
/ # nslookup kubernetes.default
Server: 192.168.60.10
Address 1: 192.168.60.10 kube-dns.kube-system.svc.cluster.local
Name: kubernetes.default
Address 1: 192.168.60.1 kubernetes.default.svc.cluster.local
If that doesn't work check if the DNS pods are running:
kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
Will respond sth like:
NAME READY STATUS RESTARTS AGE
kube-dns-v14-3u5zi 3/3 Running 36 166d
Finally checking the related logs is worth a try:
kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c kube-dns
kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c dnsmasq
kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c healthz
Full instructions can be found on kubernetes.io
Please check on manage jenkins -> Configure Global Security -> Agent port is 50000 and Fixed

kubectl run does not create replicacontroller

I'm newbie of the Kubernetes while I'm using Google Cloud Container. I just follow the tutorials as belows:
https://cloud.google.com/container-engine/docs/tutorials/http-balancer
http://kubernetes.io/docs/hellonode/#create-your-pod
In these tutorials, I'll get the replicacontroller after I run the "kubectl run" but there is no replicacontrollers so that I cannot run the command of "kubectl expose rc" in order to open a port.
Here is my result of the commands:
ChangMatthews-MacBook-Pro:frontend changmatthew$ kubectl run nginx --image=nginx --port=80
deployment "nginx" created
ChangMatthews-MacBook-Pro:frontend changmatthew$ kubectl expose rc nginx --target-port=80 --type=NodePort
Error from server: replicationcontrollers "nginx" not found
Here is my result when I run "kubectl get rc,svc,ingress,deployments,pods":
ChangMatthews-MacBook-Pro:frontend changmatthew$ kubectl get rc,svc,ingress,deployments,pods
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes 10.3.240.1 <none> 443/TCP 12m
NAME RULE BACKEND ADDRESS AGE
basic-ingress - nginx:80 107.178.247.247 12m
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
nginx 1 1 1 1 11m
NAME READY STATUS RESTARTS AGE
nginx-198147104-zgo7m 1/1 Running 0 11m
One of my solution is to create yaml file which define the replicacontroller. But is there any way to create replicacontroller via kubectl run command like above tutorials?
Thanks,
Now that kubectl run creates a deployment, you specify that the type being exposed in a deployment rather than a replication controller:
kubectl expose deployment nginx --target-port=80 --type=NodePort
The team might still be updating the docs to reflect 1.2. Note the output you got:
$ kubectl run nginx --image=nginx --port=80
deployment "nginx" created
kubectl run now creates a deployemtn+replica-set.
To view these you can do kubectl get deployment, and get rs respectively.
Deployments are essentially a nicer way to perform rolling update server side, but there's a little more to it. See docs: http://kubernetes.io/docs/user-guide/deployments/
In version 1.15.0, it works as follows.
root#k8smaster ~]# kubectl run guestbook --image=coolguy/k8s_guestbook:1.0 --port=8080 --generator=run/v1
kubectl run --generator=run/v1 is DEPRECATED and will be removed in a future version. Use kubectl run --generator=run-pod/v1 or kubectl create
instead.
***replicationcontroller/guestbook created***
In version 1.19.0:
[root#k8smaster ~]# kubectl run guestbook --image=dmsong2008/k8s_guestbook:1.0 --port=8080 --generator=run/v1
***Flag --generator has been deprecated, has no effect and will be removed in the future.***
pod/guestbook created

Resources