How To Stop a Stuck Pod in Kubernetes

Background
I am trying to learn to automate deployments with Jenkins on my laptop. I did not check the resource settings in the Helm chart when I deployed Jenkins, and I ended up over-provisioning the memory and CPU requests.
The pod was initializing for several minutes and then eventually ended up in the CrashLoopBackOff status.
Software and Versions
$ minikube start
😄 minikube v1.17.1 on Microsoft Windows 10 Enterprise 10.0.19042 Build 19042
...
...
🐳 Preparing Kubernetes v1.20.2 on Docker 20.10.2
...
Note that Docker was installed from Visual Studio Code with Docker Desktop and Windows 10 WSL Ubuntu 20.04 LTS enabled.
$ helm version
version.BuildInfo{Version:"v3.5.2", GitCommit:"167aac70832d3a384f65f9745335e9fb40169dc2", GitTreeState:"dirty", GoVersion:"go1.15.7"}
Installation
$ helm repo add stable https://charts.jenkins.io
$ helm repo ls
NAME URL
stable https://charts.jenkins.io
$ kubectl create namespace devops-cicd
namespace/devops-cicd created
$ helm install jenkins stable/jenkins --namespace devops-cicd
$ kubectl get svc -n devops-cicd -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
jenkins ClusterIP 10.108.169.104 <none> 8080/TCP 7m1s app.kubernetes.io/component=jenkins-controller,app.kubernetes.io/instance=jenkins
jenkins-agent ClusterIP 10.103.213.213 <none> 50000/TCP 7m app.kubernetes.io/component=jenkins-controller,app.kubernetes.io/instance=jenkins
$ kubectl get pod -n devops-cicd --output wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
jenkins-0 1/2 Running 1 8m13s 172.17.0.10 minikube <none> <none>
The pod eventually failed, ending with the status CrashLoopBackOff.
Unfortunately, I forgot to extract the logs for the pod.
In full disclosure, I got it deployed successfully by pulling the chart to my local file system and halving the memory and CPU settings.
Questions
I fear that I will run into the same over-provisioning situation in a Production environment one day. So how does one stop a failed pod from respawning/restarting and undo/roll back the deployment?
I tried to set the Deployment replicas to 0, but it had no effect. Actually, the only resources I could see were a couple of Services, the Pod itself, a PersistentVolume and some Secrets.
I had to delete the namespace to remove the pod. This is not ideal. So what is the best way to tackle this situation (i.e. just deal with the problematic pod)?

Drawing on the feedback I have gathered, I confirmed that the pod is managed by a StatefulSet, not a Deployment. I am attempting to answer my own question in the hope that it is useful for newbies like me.
My question was how to stop a pod (from respawning).
So here I get the info on the StatefulSet:
$ kubectl get statefulsets -n devops-cicd -o wide
NAME READY AGE CONTAINERS IMAGES
jenkins 0/1 33s jenkins,config-reload jenkins/jenkins:2.303.1-jdk11,kiwigrid/k8s-sidecar:1.12.2
Then scale it down to zero replicas:
$ kubectl scale statefulset jenkins --replicas=0 -n devops-cicd
statefulset.apps/jenkins scaled
Result:
$ kubectl get statefulsets -n devops-cicd -o wide
NAME READY AGE CONTAINERS IMAGES
jenkins 0/0 6m35s jenkins,config-reload jenkins/jenkins:2.303.1-jdk11,kiwigrid/k8s-sidecar:1.12.2
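Before scaling, it can help to confirm which controller actually owns the pod, since scaling a Deployment that does not exist has no effect. The following is a minimal Python sketch, assuming the pod description was fetched with kubectl get pod jenkins-0 -n devops-cicd -o json. The sample JSON is a trimmed, made-up stand-in for that output, and owner_of and scale_to_zero_command are hypothetical helper names, not part of any Kubernetes client.

```python
import json

# Trimmed, illustrative stand-in for:
#   kubectl get pod jenkins-0 -n devops-cicd -o json
SAMPLE_POD_JSON = """
{
  "metadata": {
    "name": "jenkins-0",
    "namespace": "devops-cicd",
    "ownerReferences": [
      {"apiVersion": "apps/v1", "kind": "StatefulSet",
       "name": "jenkins", "controller": true}
    ]
  }
}
"""

# Owner kinds that `kubectl scale` accepts.
SCALABLE_KINDS = {"StatefulSet", "Deployment", "ReplicaSet", "ReplicationController"}


def owner_of(pod):
    """Return (kind, name) of the controlling owner, or None for a bare pod."""
    for ref in pod.get("metadata", {}).get("ownerReferences", []):
        if ref.get("controller"):
            return ref["kind"], ref["name"]
    return None


def scale_to_zero_command(pod):
    """Build the kubectl command that scales the pod's owner to zero replicas."""
    owner = owner_of(pod)
    if owner is None or owner[0] not in SCALABLE_KINDS:
        return None  # bare pod or an owner that cannot be scaled (e.g. DaemonSet)
    kind, name = owner
    namespace = pod["metadata"].get("namespace", "default")
    return ["kubectl", "scale", kind.lower(), name, "--replicas=0", "-n", namespace]


pod = json.loads(SAMPLE_POD_JSON)
print(owner_of(pod))
print(" ".join(scale_to_zero_command(pod)))
```

If the owner turns out to be a ReplicaSet, it is usually owned by a Deployment in turn, so the same lookup would need to be repeated one level up before scaling.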

Related

How to stop & exit jenkins from Kubernetes cluster

I have installed Jenkins through Helm with below commands.
$ chart=jenkinsci/jenkins
$ helm install jenkins -n jenkins -f jenkins-values.yaml $chart
Now how do I stop and remove Jenkins completely from my Kubernetes cluster?
# kubectl delete pod jenkins-0 -n jenkins
pod "jenkins-0" deleted
After the delete pod command, a new pod is still created.
# kubectl get pods -n jenkins
NAME READY STATUS RESTARTS AGE
jenkins-0 0/2 Init:0/1 0 6s

Since I had installed Jenkins through Helm, I removed the pod and the complete deployment inside my namespace using helm uninstall jenkins -n jenkins.

Why is my Kubernetes deployment registering as unavailable even though it runs in Docker?

I have a Docker image I have created that works on Docker like this (local Docker)...
docker run -p 4000:8080 jrg/hello-kerb
Now I am trying to run it as a Kubernetes pod. To do this I create the deployment...
kubectl create deployment hello-kerb --image=jrg/hello-kerb
Then I run kubectl get deployments but the new deployment comes as unavailable...
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
hello-kerb 1 1 1 0 17s
I was using this site as the instructions. It shows that the status should be available...
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
hello-node 1 1 1 1 1m
What am I missing? Why is the deployment unavailable?
UPDATE
$ kubectl get pod
NAME READY STATUS RESTARTS AGE
hello-kerb-6f8f84b7d6-r7wk7 0/1 ImagePullBackOff 0 12s

If you are running a local image (from docker build), it is directly available to the Docker daemon and can be executed. If you are using a remote daemon, e.g. in a Kubernetes cluster, it will try to get the image from the default registry, since the image is not available locally. This is usually Docker Hub. I checked https://hub.docker.com/u/jrg/ and there seems to be no repository and therefore no jrg/hello-kerb.
So how can you solve this? When using minikube, you can build (and provide) the image using the docker daemon that is provided by minikube.
eval $(minikube docker-env)
docker build -t jrg/hello-kerb .
You could also push the image to a registry that is reachable from the container runtime in your Kubernetes cluster, e.g. Docker Hub.

I solved this by running kubectl edit deployment hello-kerb and then finding "imagePullPolicy" (:/PullPolicy in the editor). Finally, I changed the value from "Always" to "Never". After saving, kubectl get pod shows...
NAME READY STATUS RESTARTS AGE
hello-kerb-6f744b6cc5-x6dw6 1/1 Running 0 6m
And I can access it.
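The change works because of how Kubernetes defaults imagePullPolicy when none is given: an image with no tag or with the :latest tag defaults to Always, while any other tag or a digest defaults to IfNotPresent. With Always, the kubelet contacts the registry even when the image was built locally. A small Python sketch of that defaulting rule follows; default_pull_policy is a hypothetical helper written for illustration, and the :1.0 tag is made up.

```python
def default_pull_policy(image):
    """Mirror the documented defaulting of imagePullPolicy for an image reference."""
    if "@" in image:
        return "IfNotPresent"  # pinned by digest
    # Only look at the last path segment so a registry host:port is not
    # mistaken for a tag (e.g. localhost:5000/hello-kerb).
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return "Always"  # no tag is treated as :latest
    tag = last_segment.rsplit(":", 1)[1]
    return "Always" if tag == "latest" else "IfNotPresent"


for image in ["jrg/hello-kerb", "jrg/hello-kerb:1.0", "localhost:5000/hello-kerb"]:
    print(image, "->", default_pull_policy(image))
```

Under that rule, giving the local build an explicit tag other than latest would likely avoid the registry lookup as well, without having to set the policy to Never.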

How do I diagnose a Kubernetes cluster that never becomes ready?

I deployed an image to Kubernetes, but it never becomes ready, even after hours.
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
myapp-b8dd974db-9jbsl 0/1 ImagePullBackOff 0 21m
All this happens with the Quickstart Hello app, as well as my own Docker image.
Attempts to attach fail.
$ kubectl attach -it myapp-b8dd974db-9jbsl
Unable to use a TTY - container myapp did not allocate one
If you don't see a command prompt, try pressing enter.
error: unable to upgrade connection: container
myapp not found in pod myapp-b8dd974db-9jbsl_default
Attempts to access it over HTTP fail.
In Stackdriver Logging I see messages like
skipping: failed to "StartContainer" for "myapp"
with ImagePullBackOff: "Back-off pulling image
\"gcr.io/myproject/myapp-image:1.0\""
and No such image
Yet I did deploy these images and the Cloud Console shows that the pods are "green."
And kubectl seems to tell me that the cluster is OK.
$ kubectl get service myapp
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
myapp LoadBalancer 10.43.248.78 35.193.107.141 8222:31840/TCP 29m
How can I diagnose this?

You can use kubectl describe pod myapp-b8dd974db-9jbsl to get more information on your pod.
But from the status message 'ImagePullBackOff', it is probably trying to download the Docker image and failing.
This can happen for several reasons, and kubectl describe will tell you more, but most likely you either don't have permission to pull from that Docker repository or the image/image:tag does not exist.
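The same information that kubectl describe prints is available in the pod's status, under status.containerStatuses[].state.waiting. Here is a minimal Python sketch that pulls out the waiting reason and message for each stuck container, assuming the pod was fetched with kubectl get pod <pod-name> -o json. The sample dict is a trimmed stand-in built from the messages quoted in the question, and waiting_reasons is a hypothetical helper name.

```python
def waiting_reasons(pod):
    """Return {container name: (reason, message)} for containers in 'waiting' state."""
    stuck = {}
    for status in pod.get("status", {}).get("containerStatuses", []):
        waiting = status.get("state", {}).get("waiting")
        if waiting:
            stuck[status["name"]] = (
                waiting.get("reason", ""),
                waiting.get("message", ""),
            )
    return stuck


# Trimmed, illustrative stand-in for `kubectl get pod <pod-name> -o json`.
sample_pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "myapp",
                "state": {
                    "waiting": {
                        "reason": "ImagePullBackOff",
                        "message": 'Back-off pulling image "gcr.io/myproject/myapp-image:1.0"',
                    }
                },
            }
        ]
    }
}

for name, (reason, message) in waiting_reasons(sample_pod).items():
    print(f"{name}: {reason}: {message}")
```

A reason of ImagePullBackOff or ErrImagePull points at the image reference or registry credentials rather than at the application itself.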

Kubernetes Docker process inside pod

I have a Docker image with the CMD to run a Java application.
This application is being deployed as a container into Kubernetes. Since I am deploying it as a Docker image, I was expecting it to run as a Docker process. So I logged into the pod and tried "docker ps".
But I was surprised that it is running as a Java process and not as a Docker process. I am able to see the process with "ps -ef".
I am confused; how does it work internally?

As others stated, Kubernetes uses Docker internally to deploy the containers (when Docker is the configured container runtime, as in this cluster). To explain in detail, consider a cluster which has 4 nodes: 1 master and 3 workers.
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
******.mylabserver.com Ready master 13d v1.10.5
******.mylabserver.com Ready <none> 13d v1.10.5
******.mylabserver.com Ready <none> 13d v1.10.5
******.mylabserver.com Ready <none> 13d v1.10.5
I am deploying a pod with the alpine Docker image.
$ cat pod-nginx.yml
apiVersion: v1
kind: Pod
metadata:
  name: alpine
  namespace: default
spec:
  containers:
  - name: alpine
    image: alpine
    command:
      - sleep
      - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
You can get the status of the pod as below:
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
alpine 1/1 Running 0 21s 10.244.3.4 ******.mylabserver.com
Kube-scheduler will schedule the pod on one of the available nodes.
Now the pod is deployed to a server, where you can login to that particular server and find the information that you are looking for.
root@******:/home/user# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
6486de4410ad alpine@sha256:e1871801d30885a610511c867de0d6baca7ed4e6a2573d506bbec7fd3b03873f "sleep 3600" 58 seconds ago Up 57 seconds k8s_alpine_alpine_default_2e2b3016-79c8-11e8-aaab-
Run the docker exec command on that server to see the process running inside.
root@******:/home/user# docker exec -it 6486de4410ad /bin/sh
/ # ps -eaf
PID USER TIME COMMAND
1 root 0:00 sleep 3600
7 root 0:00 /bin/sh
11 root 0:00 ps -eaf
/ #
https://kubernetes.io/docs/home/ - this can give you more info about pods and how deployments happen with pods/containers.
Hope this helps.

Kubernetes, using the YAML file that the user provides, deploys a pod (the smallest deployable unit in Kubernetes) with one or more containers in it.
You can access the containers inside the pod using the kubectl tool.
For example, in case your pod has one container you can open a shell inside it:
kubectl exec -ti <pod-name> -n <pod-namespace> -- bash
Through this shell, you can run ps commands and your output will be the isolated processes running inside your container.
In case you want to observe the Docker containers which Kubernetes has deployed in a node, you can connect to that node and run docker ps commands.

Kubernetes pods are running but docker ps does not give any output

I have been trying to run a Tomcat container on port 5000 on a cluster using Kubernetes. But when I use kubectl create -f tmocat_pod.yaml, it creates the pod but docker ps does not give any output. Why is that?
Ideally, when a pod is running, it means a container is running inside that pod, and that container is defined in the YAML file.
Why does docker ps not show any running containers?
I am following the below URLs:
http://containertutorials.com/get_started_kubernetes/k8s_example.html
https://blog.jetstack.io/blog/k8s-getting-started-part2/
How can I get it running and see Tomcat in the browser on port 5000?

The Docker containers should be running on the virtual machine. Since I only installed minikube on my local machine, I confirmed that the following will give you what you want:
minikube ssh
...
docker ps
Just try the equivalent of minikube ssh for your Kubernetes setup.

In Kubernetes, Docker containers run in Pods, Pods run on Nodes, and Nodes run on your machine or in the cloud (minikube/GKE).
When you run kubectl create -f tmocat_pod.yaml, you basically create a pod, and the Docker container runs in that pod.
The node that holds this pod is basically a virtual instance; if you could SSH into that node, docker ps would work.
What you need is:
kubectl get pods <-- Like docker ps; it shows you all the pods (think of them as Docker containers) running.
kubectl get nodes <-- View the host machines for your pods.
kubectl describe pods <pod-name> <-- View the detailed state and recent events for your pod.
kubectl logs <pod-name> <-- Gives you the logs for the specific pod.

You can connect your terminal to the Docker server that is running inside your Node/VM.
Run this command in your terminal: eval $(minikube docker-env)
This only configures your current terminal window.

Maybe you are not using Docker as the container runtime.
I faced the same issue, and I had forgotten that I had switched to gVisor with runsc as the handler.
cat /etc/default/kubelet
KUBELET_EXTRA_ARGS="--container-runtime remote --container-runtime-endpoint unix:///run/containerd/containerd.sock"
If so, you need to use the runsc command instead of docker.

I'm not sure where you are running the docker ps command, but if you are trying to do that from your host machine and the k8s cluster is located elsewhere, i.e. your machine is not a node in the cluster, docker ps will not return anything since the containers are not tied to your docker host.
Assuming your pod is running, kubectl get pods will display all of your running pods. To check further details, you can use kubectl describe pod <yourpodname> to check the status of each container (in great detail). To get the pod names, you should be able to use tab-complete with the kubernetes cli. Also, if your pod contains multiple containers, you will need to give the container name as well, which you can use tab-complete for after you've selected your pod.
The output will look similar to:
kubectl describe pod comparison-api-dply-reborn-6ffb88b46b-s2mtx
Name: comparison-api-dply-reborn-6ffb88b46b-s2mtx
Namespace: default
Node: aks-nodepool1-99697518-0/10.240.0.5
Start Time: Fri, 20 Apr 2018 14:08:21 -0400
Labels: app=comparison-pod-reborn
pod-template-hash=2996446026
...
Status: Running
IP: *.*.*.*
Controlled By: ReplicaSet/comparison-api-dply-reborn-6ffb88b46b
Containers:
rabbit-mq:
...
Port: 5672/TCP
State: Running
...
If your containers and pods are already running, then you shouldn't need to troubleshoot them too much. To make them accessible from the Public Internet, take a look at Services (https://kubernetes.io/docs/concepts/services-networking/service/) to make your API's IP address fixed and easily reachable.

Have you tried "docker ps -a" to see if the container is dead? If it is there, you can see its logs with "docker logs <container-id>", and maybe this gives you a hint.

If your pod is running successfully and you are looking for the container on the node where the pod is scheduled, the issue could be that Kubernetes is using a different container runtime.
Example
root@renjith-laptop:/home/renjith/raspbery-k8s# kubectl exec -it nginx-8586cf59-h92ct bash
root@nginx-8586cf59-h92ct:/# exit
exit
root@renjith-laptop:/home/renjith/raspbery-k8s# kubectl get po -o wide
NAME READY STATUS RESTARTS AGE IP NODE
nginx-8586cf59-h92ct 1/1 Running 0 47s 10.20.0.3 renjith-laptop
root@renjith-laptop:/home/renjith/raspbery-k8s# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
root@renjith-laptop:/home/renjith/raspbery-k8s#
Here I am able to exec into the pod, and I am on the same node where the pod is scheduled, but docker ps doesn't show the container. In my case the kubelet is using a different container runtime; one of the arguments to the kubelet service is --container-runtime-endpoint=unix:///var/run/cri-containerd.sock
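One way to check which runtime started a container without logging in to the node is the containerID field in the pod status, which is prefixed with the runtime name (for example docker:// or containerd://). The following Python sketch reads that prefix from pod JSON as returned by kubectl get pod <pod-name> -o json. The sample dict and its container ID are made up for illustration, and container_runtimes is a hypothetical helper name.

```python
def container_runtimes(pod):
    """Map each container name to the runtime prefix of its containerID."""
    runtimes = {}
    for status in pod.get("status", {}).get("containerStatuses", []):
        container_id = status.get("containerID", "")
        runtime, separator, _ = container_id.partition("://")
        # A container that has not started yet has no containerID.
        runtimes[status["name"]] = runtime if separator else None
    return runtimes


# Trimmed, illustrative stand-in for `kubectl get pod <pod-name> -o json`.
sample_pod = {
    "status": {
        "containerStatuses": [
            {"name": "nginx", "containerID": "containerd://0123abcd"},  # made-up ID
        ]
    }
}

print(container_runtimes(sample_pod))
```

If the prefix is not docker, docker ps on the node is expected to come back empty even though the pod is running. kubectl get nodes -o wide shows the same information per node in its CONTAINER-RUNTIME column.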

From the Kubernetes documentation, to list the container images running in your cluster:
kubectl get pods --all-namespaces -o jsonpath="{.items[*].spec.containers[*].image}" |\
tr -s '[[:space:]]' '\n' |\
sort |\
uniq -c
Then you get back something like:
2 registry.k8s.io/coredns/coredns:v1.9.3
1 registry.k8s.io/etcd:3.5.4-0
1 registry.k8s.io/kube-apiserver:v1.25.1
1 registry.k8s.io/kube-controller-manager:v1.25.1
3 registry.k8s.io/kube-proxy:v1.25.1
1 registry.k8s.io/kube-scheduler:v1.25.1
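For comparison, the same count can be sketched in Python over the JSON form of the pod list (kubectl get pods --all-namespaces -o json). The sample list below is a small made-up stand-in that reuses two image names from the output above, and image_counts is a hypothetical helper name.

```python
from collections import Counter


def image_counts(pod_list):
    """Count how many containers use each image across all pods in the list."""
    images = [
        container["image"]
        for pod in pod_list.get("items", [])
        for container in pod["spec"]["containers"]
    ]
    return Counter(images)


# Trimmed, illustrative stand-in for `kubectl get pods --all-namespaces -o json`.
sample_pod_list = {
    "items": [
        {"spec": {"containers": [{"image": "registry.k8s.io/coredns/coredns:v1.9.3"}]}},
        {"spec": {"containers": [{"image": "registry.k8s.io/coredns/coredns:v1.9.3"}]}},
        {"spec": {"containers": [{"image": "registry.k8s.io/etcd:3.5.4-0"}]}},
    ]
}

for image, count in sorted(image_counts(sample_pod_list).items()):
    print(count, image)
```

Like the jsonpath expression, this only looks at spec.containers, so images used by init containers are not counted.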
