Kubernetes describe pod - Error from server (NotFound) - docker

I am trying to debug a pod with the status "ImagePullBackOff".
The pod is in the namespace minio-operator, but when I try to describe the pod, it is apparently not found.
Why does that happen?
[psr-admin@zon-psr-2-u001 ~]$ kubectl get all -n minio-operator
NAME                                  READY   STATUS             RESTARTS   AGE
pod/minio-operator-5dd99dd858-n6fdj   0/1     ImagePullBackOff   0          7d

NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/minio-operator   0       1            0           7d

NAME                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/minio-operator-5dd99dd858   1         1         0       7d
[psr-admin@zon-psr-2-u001 ~]$ kubectl describe pod minio-operator-5dd99dd858-n6fdj
Error from server (NotFound): pods "minio-operator-5dd99dd858-n6fdj" not found

You've not specified the namespace in your describe pod command.
You did kubectl get all -n minio-operator, which gets all resources in the minio-operator namespace, but your kubectl describe has no namespace, so it's looking in the default namespace for a pod that isn't there.
kubectl describe pod -n minio-operator <pod name>
Should work OK.
Most resources in Kubernetes are namespaced, so they will require the -n <namespace> argument unless you switch namespaces.
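If you work in one namespace most of the time, you can also make it the default for your current context instead of passing -n on every command (a convenience sketch, not something the original answer mentions):
kubectl config set-context --current --namespace=minio-operator
kubectl config view --minify | grep namespace: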

Kubernetes CoreDNS in CrashLoopBackOff

I understand that this question has been asked dozens of times, but nothing I have found through internet searches has helped me.
My set up:
CentOS Linux release 7.5.1804 (Core)
Docker Version: 18.06.1-ce
Kubernetes: v1.12.3
Installed following the official guide and this one: https://www.techrepublic.com/article/how-to-install-a-kubernetes-cluster-on-centos-7/
CoreDNS pods are in Error/CrashLoopBackOff state.
kube-system coredns-576cbf47c7-8phwt 0/1 CrashLoopBackOff 8 31m
kube-system coredns-576cbf47c7-rn2qc 0/1 CrashLoopBackOff 8 31m
My /etc/resolv.conf:
nameserver 8.8.8.8
Also tried with my local dns-resolver(router)
nameserver 10.10.10.1
Setup and init:
kubeadm init --apiserver-advertise-address=10.10.10.3 --pod-network-cidr=192.168.1.0/16
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
I tried to solve this with:
Editing the coredns ConfigMap: [root@kub ~]# kubectl edit cm coredns -n kube-system
and changing
proxy . /etc/resolv.conf
directly to
proxy . 10.10.10.1
or
proxy . 8.8.8.8
Also tried to:
kubectl -n kube-system get deployment coredns -o yaml | sed 's/allowPrivilegeEscalation: false/allowPrivilegeEscalation: true/g' | kubectl apply -f -
And still nothing helps me.
Error from the logs:
plugin/loop: Seen "HINFO IN 7847735572277573283.2952120668710018229." more than twice, loop detected
The other thread - coredns pods have CrashLoopBackOff or Error state - didn't help at all, because none of the solutions described there applied to my case. Nothing helped.
I too got this error and successfully managed to fix it with the steps below.
However, you missed 8.8.4.4:
sudo nano /etc/resolv.conf
nameserver 8.8.8.8
nameserver 8.8.4.4
Run the following commands to restart the daemon and the Docker service:
sudo systemctl daemon-reload
sudo systemctl restart docker
If you are using kubeadm, make sure you delete the entire cluster from the master and provision the cluster again:
kubectl drain <node_name> --delete-local-data --force --ignore-daemonsets
kubectl delete node <node_name>
kubeadm reset
Once you provision the new cluster, run:
kubectl get pods --all-namespaces
It should give the expected result below:
NAMESPACE     NAME                       READY   STATUS    RESTARTS   AGE
kube-system   calico-node-gldlr          2/2     Running   0          24s
kube-system   coredns-86c58d9df4-lpnj6   1/1     Running   0          40s
kube-system   coredns-86c58d9df4-xnb5r   1/1     Running   0          40s
kube-system   kube-proxy-kkb7b           1/1     Running   0          40s
kube-system   kube-scheduler-osboxes     1/1     Running   0          10s
$ kubectl edit cm coredns -n kube-system
Delete the 'loop' line, save and exit.
Restart the master node. That worked for me.
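Instead of rebooting the whole master, it is usually enough to restart just the CoreDNS pods after editing the ConfigMap. On kubectl 1.15+ a rollout restart does this; a quick sketch, not part of the original answer:
$ kubectl -n kube-system rollout restart deployment coredns
$ kubectl -n kube-system get pods -l k8s-app=kube-dns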
I faced the same issue in my local Kubernetes-in-Docker (kind) setup: the CoreDNS pods went into CrashLoopBackOff.
Steps I followed to get the pods into the Running state:
As Tim Chan said in this post, and by referring to the GitHub issues link, I did the following:
kubectl -n kube-system edit configmaps coredns -o yaml
Modify the forward section: replace forward . /etc/resolv.conf with forward . 172.16.232.1 (in my case I set 8.8.8.8 for the time being).
Delete one of the CoreDNS pods, or wait for some time - the pods will come back up in the Running state.
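If you don't want to wait, you can delete the CoreDNS pods by label so the Deployment recreates them with the new Corefile, then watch them come back (a small sketch; the label matches the default CoreDNS deployment):
kubectl -n kube-system delete pod -l k8s-app=kube-dns
kubectl -n kube-system get pods -l k8s-app=kube-dns -w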
This usually happens when CoreDNS can't talk to the kube-apiserver:
Check that your kubernetes service is in the default namespace:
$ kubectl get svc kubernetes
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 130d
Then (you might have to create a pod):
$ kubectl -n kube-system exec -it <any-pod-with-shell> sh
# ping kubernetes.default.svc.cluster.local
PING kubernetes.default.svc.cluster.local (10.96.0.1): 56 data bytes
Also, try hitting port 443 from the pod:
# telnet kubernetes.default.svc.cluster.local 443 # or
# curl kubernetes.default.svc.cluster.local:443
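If those connections fail, it is also worth confirming that the kubernetes Service actually has apiserver endpoints behind it; the ENDPOINTS column should show the apiserver address (an extra check, not in the original answer):
$ kubectl get endpoints kubernetes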
The error I got was:
connect: no route to host","time":"2021-03-19T14:42:05Z"}
and CrashLoopBackOff, in the log shown by kubectl -n kube-system logs coredns-d9fdb9c9f-864rz.
The issue is mentioned in https://github.com/coredns/coredns/tree/master/plugin/loop#troubleshooting-loops-in-kubernetes-clusters
tl;dr:
Reason: /etc/resolv.conf got updated somehow. The original one is at /run/systemd/resolve/resolv.conf, e.g.:
nameserver 172.16.232.1
Quick fix: edit the Corefile:
$ kubectl -n kube-system edit configmaps coredns -o yaml
to replace forward . /etc/resolv.conf with forward . 172.16.232.1
e.g.:
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health {
           lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        prometheus :9153
        forward . 172.16.232.1 {
           max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  creationTimestamp: "2021-03-18T15:58:07Z"
  name: coredns
  namespace: kube-system
  resourceVersion: "49996"
  uid: 428a03ff-82d0-4812-a3fa-e913c2911ebd
Done. After that, you may need to restart Docker:
sudo systemctl restart docker
Update: it can be fixed by just running sudo systemctl restart docker.
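The loop plugin's troubleshooting page linked above also describes a more permanent fix: point the kubelet at the real upstream resolver file instead of the stub /etc/resolv.conf, so CoreDNS never inherits the looping nameserver. A rough sketch for a kubeadm host using systemd-resolved (paths assume the default kubelet config location):
# see which resolv.conf the kubelet hands to pods
grep resolvConf /var/lib/kubelet/config.yaml
# point it at the real upstream resolver list written by systemd-resolved
sudo sed -i 's#resolvConf: /etc/resolv.conf#resolvConf: /run/systemd/resolve/resolv.conf#' /var/lib/kubelet/config.yaml
sudo systemctl restart kubelet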

Failed to start container while "Setting up Jenkins on Container Engine"

I'm going through this tutorial
Setting up Jenkins on Container Engine
https://cloud.google.com/solutions/jenkins-on-container-engine-tutorial
and failing on "Creating the Jenkins deployment and services" step
I got this error at one point:
jenkins- 0/1 rpc error: code = 2 desc = failed to start container "": Error response from daemon: {"message":"linux spec user: unable to find user jenkins: no matching entries in passwd file"}
And I get these results for the following commands:
> kubectl apply -f jenkins/k8s/
deployment "jenkins" configured
service "jenkins-ui" configured
service "jenkins-discovery" configured
> kubectl get pods --namespace jenkins
NAME READY STATUS RESTARTS AGE
jenkins-<some id> 0/1 CrashLoopBackOff 5 10m
I get that it is looking for the jenkins user in the passwd file, but I'm still not sure why this error took place or what the correct way to fix it is. Any insight would be highly appreciated.
Edit: output of running "kubectl get pods --namespace jenkins":
The very first time running it:
> kubectl get pods --namespace jenkins
NAME READY STATUS RESTARTS AGE
jenkins-1937056428-fp7vr 0/1 ContainerCreating 0 16s
Second time running it:
> kubectl get pods --namespace jenkins
NAME READY STATUS RESTARTS AGE
jenkins-1937056428-fp7vr 0/1 rpc error: code = 2 desc = failed to start container "10a8ab7e3eb0ad153fd6055d86336b1cdfe9642b6993684a7e01fefbeca7a566": Error response from
daemon: {"message":"linux spec user: unable to find user jenkins: no matching entries in passwd file"} 1 39s
Third and after:
> kubectl get pods --namespace jenkins
NAME READY STATUS RESTARTS AGE
jenkins-1937056428-fp7vr 0/1 CrashLoopBackOff 270 22h
It appears that the persistent disk volume for Jenkins is not properly set up. Try running the following commands to reconfigure the disk volumes and rerun the Jenkins pod:
kubectl delete -f jenkins/k8s/
gcloud compute disks delete jenkins-home
gcloud compute images delete jenkins-home-image
gcloud config set compute/zone us-east1-d
gcloud compute images create jenkins-home-image --source-uri https://storage.googleapis.com/solutions-public-assets/jenkins-cd/jenkins-home-v3.tar.gz
gcloud compute disks create jenkins-home --image jenkins-home-image --zone us-east1-d
kubectl apply -f jenkins/k8s/
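After reapplying the manifests, you can watch the pod until it reaches Running to confirm the volume fix took effect (a simple follow-up check, not part of the original answer):
kubectl get pods --namespace jenkins -w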
I basically did one step wrong:
Provision a Kubernetes cluster using Container Engine.
gcloud container clusters create jenkins-cd \
--network jenkins \
--scopes "https://www.googleapis.com/auth/projecthosting,storage-rw"
Here, make sure the options --network and --scopes actually get passed in. I guess I copied the command without fixing it up, so the options got dropped.
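If you are unsure whether the scopes made it onto an existing cluster, you can inspect them with gcloud; a hypothetical check (substitute your own cluster name and zone):
gcloud container clusters describe jenkins-cd --zone us-east1-d --format="value(nodeConfig.oauthScopes)"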

kubernetes.default: Name does not resolve

I'm running OpenShift.
OpenShift Master: v3.3.1.7
Kubernetes Master: v1.3.0+52492b4
But I am having problems trying to run a build in Jenkins (running in a pod). This is not a problem with the Java code that I'm trying to build, but a problem in the Kubernetes/OpenShift setup.
The builds fail with:
Caused by: java.net.UnknownHostException: kubernetes.default: Name does not resolve
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
at java.net.InetAddress.getAllByName(InetAddress.java:1192)
at java.net.InetAddress.getAllByName(InetAddress.java:1126)
at okhttp3.Dns$1.lookup(Dns.java:39)
...
Does anyone know how to fix this?
First confirm that DNS is actually working with:
$ kubectl run -i -t busybox --image=busybox --restart=Never
Waiting for pod default/busybox to be running, status is Pending, pod ready: false
If you don't see a command prompt, try pressing enter.
/ # nslookup kubernetes.default
Server: 192.168.60.10
Address 1: 192.168.60.10 kube-dns.kube-system.svc.cluster.local
Name: kubernetes.default
Address 1: 192.168.60.1 kubernetes.default.svc.cluster.local
If that doesn't work check if the DNS pods are running:
kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
It will respond with something like:
NAME READY STATUS RESTARTS AGE
kube-dns-v14-3u5zi 3/3 Running 36 166d
Finally checking the related logs is worth a try:
kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c kube-dns
kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c dnsmasq
kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c healthz
Full instructions can be found on kubernetes.io
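On newer clusters that ship CoreDNS instead of kube-dns, the same checks apply; the pods still carry the k8s-app=kube-dns label but have a single container, so the logs can be fetched without -c (a sketch assuming the default CoreDNS deployment):
kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
kubectl logs --namespace=kube-system -l k8s-app=kube-dns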
Please check, under Manage Jenkins -> Configure Global Security, that the agent port is 50000 and set to Fixed.

How to retry image pull in a Kubernetes pod?

I am new to Kubernetes. I have an issue with the pods. When I run the command
kubectl get pods
Result:
NAME READY STATUS RESTARTS AGE
mysql-apim-db-1viwg 1/1 Running 1 20h
mysql-govdb-qioee 1/1 Running 1 20h
mysql-userdb-l8q8c 1/1 Running 0 20h
wso2am-default-813fy 0/1 ImagePullBackOff 0 20h
Due to an issue with the "wso2am-default-813fy" pod, I need to restart it. Any suggestion?
In case of not having the yaml file:
kubectl get pod PODNAME -n NAMESPACE -o yaml | kubectl replace --force -f -
Usually in the case of "ImagePullBackOff" the pull is retried after a few seconds/minutes. In case you want to retry manually, you can delete the old pod and recreate it. The one-line command to delete and recreate the pod would be:
kubectl replace --force -f <yml_file_describing_pod>
$ kubectl replace --force -f <resource-file>
If all goes well, you should see something like:
<resource-type> <resource-name> deleted
<resource-type> <resource-name> replaced
Details of this can be found in the Kubernetes documentation, on the "manage-deployment" and kubectl cheat sheet pages at the time of writing.
If the Pod is part of a Deployment or Service, deleting it will restart the Pod and, potentially, place it onto another node:
$ kubectl delete po $POD_NAME
Replace it if it's an individual Pod:
$ kubectl get po -n $namespace $POD_NAME -o yaml | kubectl replace -f -
Try deleting the pod; it will pull the image again.
kubectl delete pod <pod_name> -n <namespace_name>
First try to see what's wrong with the pod:
kubectl logs -p <your_pod>
In my case it was a problem with the YAML file.
So, I needed to correct the configuration file and replace it:
kubectl replace --force -f <yml_file_describing_pod>
Most probably the ImagePullBackOff is due to either the image not being present or an issue with the pod YAML file.
What I would do is this:
kubectl get pod -n $namespace $POD_NAME -o yaml --export > pod.yaml
kubectl apply -f pod.yaml
I would also look at pod.yaml to see why the earlier pod didn't work.
There is also a possibility that the pull policy is not defined, or that Kubernetes is configured to pull from a registry but fails due to network issues. Try setting up a local secure registry and executing a pull; it should work.
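If the image lives in a private registry, a common cause of ImagePullBackOff is missing credentials. One way to handle that is an image pull secret attached to the service account; a sketch with hypothetical registry details, not something the answers above spell out:
# create a docker-registry secret with your registry credentials (values are placeholders)
kubectl create secret docker-registry regcred --docker-server=registry.example.com --docker-username=myuser --docker-password=mypassword
# let the default service account use it for image pulls
kubectl patch serviceaccount default -p '{"imagePullSecrets": [{"name": "regcred"}]}'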

kubectl run does not create replicacontroller

I'm a newbie to Kubernetes, and I'm using Google Cloud Container Engine. I just followed the tutorials below:
https://cloud.google.com/container-engine/docs/tutorials/http-balancer
http://kubernetes.io/docs/hellonode/#create-your-pod
In these tutorials, I should get a replication controller after I run "kubectl run", but there are no replication controllers, so I cannot run "kubectl expose rc" to open a port.
Here is my result of the commands:
ChangMatthews-MacBook-Pro:frontend changmatthew$ kubectl run nginx --image=nginx --port=80
deployment "nginx" created
ChangMatthews-MacBook-Pro:frontend changmatthew$ kubectl expose rc nginx --target-port=80 --type=NodePort
Error from server: replicationcontrollers "nginx" not found
Here is my result when I run "kubectl get rc,svc,ingress,deployments,pods":
ChangMatthews-MacBook-Pro:frontend changmatthew$ kubectl get rc,svc,ingress,deployments,pods
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes 10.3.240.1 <none> 443/TCP 12m
NAME RULE BACKEND ADDRESS AGE
basic-ingress - nginx:80 107.178.247.247 12m
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
nginx 1 1 1 1 11m
NAME READY STATUS RESTARTS AGE
nginx-198147104-zgo7m 1/1 Running 0 11m
One solution is to create a YAML file which defines the replication controller. But is there any way to create a replication controller via the kubectl run command, like in the tutorials above?
Thanks,
Now that kubectl run creates a deployment, you specify that the type being exposed is a deployment rather than a replication controller:
kubectl expose deployment nginx --target-port=80 --type=NodePort
The team might still be updating the docs to reflect 1.2. Note the output you got:
$ kubectl run nginx --image=nginx --port=80
deployment "nginx" created
kubectl run now creates a deployment + replica set.
To view these you can do kubectl get deployments and kubectl get rs respectively.
Deployments are essentially a nicer way to perform rolling updates server side, but there's a little more to it. See docs: http://kubernetes.io/docs/user-guide/deployments/
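For example, a server-side rolling update on the deployment created above could look like this (a sketch; the nginx:1.25 tag is only an illustrative version):
kubectl set image deployment/nginx nginx=nginx:1.25
kubectl rollout status deployment/nginx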
In version 1.15.0, it works as follows.
[root@k8smaster ~]# kubectl run guestbook --image=coolguy/k8s_guestbook:1.0 --port=8080 --generator=run/v1
kubectl run --generator=run/v1 is DEPRECATED and will be removed in a future version. Use kubectl run --generator=run-pod/v1 or kubectl create
instead.
replicationcontroller/guestbook created
In version 1.19.0:
[root@k8smaster ~]# kubectl run guestbook --image=dmsong2008/k8s_guestbook:1.0 --port=8080 --generator=run/v1
Flag --generator has been deprecated, has no effect and will be removed in the future.
pod/guestbook created
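On current Kubernetes versions, where the --generator flag has been removed and kubectl run only creates bare pods, the closest equivalent is to create a Deployment explicitly and expose it (a sketch of the modern commands, not something the answers above show):
kubectl create deployment guestbook --image=coolguy/k8s_guestbook:1.0
kubectl expose deployment guestbook --port=8080 --type=NodePort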
