Kubernetes DNS Disk Utilization is High. Is there a way to tailor the logging to assist? - kube-dns

Can someone suggest what level of logging should be enabled for kube-dns and what parameters to use? My kube-dns pod is using 23GB of disk space and I fear it's related to logging.
Has anyone else seen this behavior?

There are a few ways to resolve your issue:
You can change the log verbosity in the config of your deployment.
kubectl get -o yaml --export deployments kube-dns --namespace=kube-system > file
Edit the file, change --v=2 to --v=0 (this reduces kube-dns logging to the minimum verbosity), and deploy it.
kubectl apply -f ./file --namespace=kube-system
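For reference, the verbosity flag lives in the container args of the kube-dns Deployment. A minimal sketch of the relevant fragment; the container name and the other args shown here are illustrative and vary by Kubernetes version:

```yaml
# Illustrative fragment of the kube-dns Deployment spec
containers:
- name: kubedns
  args:
  - --domain=cluster.local.
  - --dns-port=10053
  - --v=0   # lowered from --v=2 to minimize logging
```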
Then clear the logs on your pods:
kubectl get pods --namespace=kube-system
kubectl exec -it POD_NAME --namespace=kube-system -- /bin/sh
You can also configure log rotation using any of the available tools, for example fluentd.
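If the nodes use Docker's default json-file logging driver, container log growth can also be capped at the runtime level. A sketch, assuming you can edit /etc/docker/daemon.json on the node and restart Docker; this is a generic config fragment, not specific to kube-dns:

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
```

With this in place, each container keeps at most three log files of 10MB each.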

Related

How To Stop a Stuck Pod in Kubernetes

Background
I am trying to learn to automate deployments with Jenkins on my laptop. I did not check the resource settings in the Helm chart when I deployed Jenkins, and I ended up over-provisioning the memory and CPU requests.
The pod was initializing for several minutes and then eventually ended up in the status of CrashLoopBackOff.
Software and Versions
$ minikube start
😄 minikube v1.17.1 on Microsoft Windows 10 Enterprise 10.0.19042 Build 19042
...
...
🐳 Preparing Kubernetes v1.20.2 on Docker 20.10.2
...
Note that Docker was installed from Visual Studio Code with Docker Desktop and Windows 10 WSL Ubuntu 20.04 LTS enabled.
$ helm version
version.BuildInfo{Version:"v3.5.2", GitCommit:"167aac70832d3a384f65f9745335e9fb40169dc2", GitTreeState:"dirty", GoVersion:"go1.15.7"}
Installation
$ helm repo add stable https://charts.jenkins.io
$ helm repo ls
NAME URL
stable https://charts.jenkins.io
$ kubectl create namespace devops-cicd
namespace/devops-cicd created
$ helm install jenkins stable/jenkins --namespace devops-cicd
$ kubectl get svc -n devops-cicd -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
jenkins ClusterIP 10.108.169.104 <none> 8080/TCP 7m1s app.kubernetes.io/component=jenkins-controller,app.kubernetes.io/instance=jenkins
jenkins-agent ClusterIP 10.103.213.213 <none> 50000/TCP 7m app.kubernetes.io/component=jenkins-controller,app.kubernetes.io/instance=jenkins
$ kubectl get pod -n devops-cicd --output wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
jenkins-0 1/2 Running 1 8m13s 172.17.0.10 minikube <none> <none>
The pod failed eventually, ending with the status of CrashLoopBackOff
Unfortunately, I forgot to extract the logs for the pod.
In full disclosure, I eventually got it deployed successfully by pulling the chart to my local file system and halving the memory and CPU settings.
Questions
I fear that this over-provisioning situation could happen in the Production environment one day. So how does one stop a failed pod from respawning/restarting and undo/roll back the deployment?
I tried to set Deployment replicas=0 but it had no effect. Actually, the only resources I could see were a couple of Services, the Pod itself, a PersistentVolume and some secrets.
I had to delete the namespace to remove the pod. This is not ideal. So what is the best way to tackle this situation (i.e. just deal with the problematic pod)?
Drawing on the feedback I have gathered, I confirmed that the pod is managed by a StatefulSet. I am attempting to answer my own question in the hope that it is useful for newbies like me.
My question was how to stop a pod (from respawning).
So here I get the info on the StatefulSet:
$ kubectl get statefulsets -n devops-cicd -o wide
NAME READY AGE CONTAINERS IMAGES
jenkins 0/1 33s jenkins,config-reload jenkins/jenkins:2.303.1-jdk11,kiwigrid/k8s-sidecar:1.12.2
Then scale in:
$ kubectl scale statefulset jenkins --replicas=0 -n devops-cicd
statefulset.apps/jenkins scaled
Result:
$ kubectl get statefulsets -n devops-cicd -o wide
NAME READY AGE CONTAINERS IMAGES
jenkins 0/0 6m35s jenkins,config-reload jenkins/jenkins:2.303.1-jdk11,kiwigrid/k8s-sidecar:1.12.2
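Since the release was installed with Helm, another option for undoing the whole deployment, rather than just scaling it to zero, is to remove the release. A sketch with Helm 3 (matching the v3.5.2 shown above); note that the chart's PersistentVolumeClaim may survive the uninstall, and the label selector below is an assumption based on the labels shown in the service output:

```shell
# Remove the whole Jenkins release and the objects it created
helm uninstall jenkins -n devops-cicd

# The PVC created by the chart may remain; delete it explicitly if needed
kubectl delete pvc -l app.kubernetes.io/instance=jenkins -n devops-cicd
```

This avoids having to delete the entire namespace just to get rid of one problematic workload.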

how to obtain AKS logs

A few days ago some of my pods crashed, and in their logs I don't see anything unusual.
I was using the following command:
kubectl logs mypod -n namespace
How do I view the AKS logs to check whether there's a problem there?
If you're creating your pods using a Kubernetes Deployment, pods will restart automatically if they crash. The new pod won't have the logs of the crashed pod.
To see the logs of the previously terminated container, add the -p (--previous) argument:
kubectl logs -n <namespace> <pod> -p
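Besides the previous-container logs, cluster events often reveal why a pod crashed (OOMKilled, failed probes, evictions). A couple of standard kubectl commands, sketched generically:

```shell
# Events for the namespace, most recent last
kubectl get events -n <namespace> --sort-by=.lastTimestamp

# Per-pod details, including the last state and exit reason of each container
kubectl describe pod <pod> -n <namespace>
```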

task.Container moving from docker to kubernetes

I'm trying to move a Docker container to Kubernetes. In Docker you can query the DNS with:
host tasks.containername
The result is the internal IPs of all running containers with this name.
I do the same on Kubernetes by using headless services. I can do
host pod-name
The result is likewise the internal IPs of the pods.
So far so good, but there are a lot of run.sh scripts that use the tasks.XXX query. Does anyone have an idea how to fix this without editing all the run.sh scripts?
Maybe something in CoreDNS, with a rewrite mapping?
Best, and thank you.
I agree that editing the scripts is the wisest solution, but here is how you can edit CoreDNS in Kubernetes.
kubectl edit configmap coredns -n kube-system
and then add the rewrite config as below.
rewrite name tasks.containername.default.svc.cluster.local containername.default.svc.cluster.local
for example
.:53 {
    errors
    log
    health
    rewrite name tasks.containername.default.svc.cluster.local containername.default.svc.cluster.local
    kubernetes cluster.local 10.0.0.0/24
    proxy . /etc/resolv.conf
    cache 30
}
And then reload coredns as below
kubectl exec -n kube-system coredns-xxxxxxx -- kill -SIGUSR1 1
I think editing them is the wisest solution.
For the sake of clarity: it's not host pod-name but host headless-service-name.
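To check that the rewrite works, you can resolve the tasks. name from a throwaway pod. busybox:1.28 is a common choice for nslookup tests; containername here is a placeholder for your actual service name:

```shell
kubectl run -it --rm dnstest --image=busybox:1.28 --restart=Never -- \
  nslookup tasks.containername.default.svc.cluster.local
```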

Kubernetes :: Restart terminated pod

I'm using Kubernetes to run Jobs with a RestartPolicy of Never.
Sometimes I would like to be able to debug a failed/terminated pod. In short, I'm trying to find a way to restart it with a sleep XXX command so I can connect (exec) to the container and get the same state.
In Docker this is doable using docker ps --all and then docker start X, but I didn't find anything similar with kubectl or client-go.
Thanks!
Not sure about client-go as I have no experience there. But if I understood the question correctly, you can check the reason for the failure:
kubectl get pods (if you do not see your pod here add --all-namespaces)
NAME READY STATUS RESTARTS AGE
pi-c2x4r 0/1 Completed 0 19m
pi-test-c5hln 0/1 Error 0 16m
And then run:
kubectl describe pod pi-test-c5hln (name of your pod).
kubectl logs pi-test-c5hln
You can also find more information when you run:
kubectl describe job <job-name>
You can find useful information about Jobs and how to work with them (including cleanup, termination, and patterns) here.
Not sure if it needs to be added, but Terminating is an ongoing process, so you can work with the pod once it moves from Terminating to another status (Error, Completed).
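As for recreating the failed pod's state with a sleep, one approach is to launch a pod with the same image but with the command overridden, then exec into it. A minimal sketch; the image name below is a placeholder for whatever your Job uses:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: debug-pod
spec:
  restartPolicy: Never
  containers:
  - name: debug
    image: your-job-image:tag   # placeholder: use the image from your failed Job
    command: ["sleep", "3600"]  # keep the container alive instead of running the job
```

Then kubectl apply -f debug-pod.yaml followed by kubectl exec -it debug-pod -- /bin/sh gives you a shell in the same image and environment.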

how to debug container images using openshift

Let's say I have a docker image created using a Dockerfile. At the time of writing the Dockerfile I had to test it repeatedly to realize what I did wrong. To debug a docker image I can simply run a test container and look at its stdout/stderr to see what's wrong with the image.
IMAGE_NAME=authoritative-dns-bind
IMAGE_OPTIONS="
-v $(pwd)/config.yaml:/config.yaml:ro
-p 127.0.0.1:53:53
-p 127.0.0.1:53:53/udp"
docker run -t -i $IMAGE_OPTIONS $IMAGE_NAME
Learning the above was good enough to iteratively create and debug a minimal working Docker container. Now I'm looking for a way to do the same for OpenShift.
I'm pretty much aware of the fact that the container is not ready for OpenShift. My plan is to run it and watch its stdout/stderr like I did with Docker. One of the people I asked for help came up with a command that looked like exactly what I need.
oc run -i -t --image $IMAGE_NAME --command test-pod -- bash
The above command worked for me with the fedora:24 and fedora:latest images from the Docker registry, and I got a working shell. But the same wouldn't happen with my derived image containing a containerized service. My explanation is that it probably does an entirely different thing: instead of starting the command interactively, it starts it non-interactively and then tries to run bash inside a failed container.
So what I'm looking for is a reasonable way to debug a container image in OpenShift. I expected that I would be able to at least capture and view stdin/stdout of OpenShift containers.
Any ideas?
Update
According to the comment by Graham, oc run should indeed work like docker run, but that doesn't seem to be the case. With the original Fedora images, bash always appears, at least upon hitting Enter.
# oc run -i -t --image authoritative-dns-bind --command test-auth13 -- bash
Waiting for pod myproject/test-auth13-1-lyng3 to be running, status is Pending, pod ready: false
Waiting for pod myproject/test-auth13-1-lyng3 to be running, status is Pending, pod ready: false
Waiting for pod myproject/test-auth13-1-lyng3 to be running, status is Pending, pod ready: false
...
Waiting for pod myproject/test-auth13-1-lyng3 to be running, status is Pending, pod ready: false
^C
#
I wasn't able to try the suggested oc debug yet, as it seems to require more configuration than just a simple image. There's another problem with oc run: that command keeps creating new containers that I don't really need. I hope there is a way to start the debug session easily and get the container automatically destroyed afterwards.
There are three main commands to debug pods:
oc describe pod $pod-name -- detailed info about the pod
oc logs $pod-name -- stdout and stderr of the pod
oc exec -ti $pod-name -- bash -- get a shell in running pod
To your specific problem: the default pull policy of oc run is set to Always. This means that OpenShift will try to pull the image until successful and will refuse to use the local one.
Once this kubernetes patch lands in OpenShift Origin, the pull policy will be easily configurable.
Please do not consider this a final answer to the question and supersede it with your own better answers...
I'm now using a pod configuration file like the following...
apiVersion: v1
kind: Pod
metadata:
  name: "authoritative-dns-server"   # pod name, your reference from command line
  namespace: "myproject"             # default namespace in `oc cluster up`
spec:
  containers:
  - command:
    - "bash"
    image: "authoritative-dns-bind"            # use your image!
    name: "authoritative-dns-bind-container"   # required
    imagePullPolicy: "Never"   # important! you want openshift to use your local image
    stdin: true
    tty: true
  restartPolicy: "Never"
Note the command is explicitly set to bash. You can then create the pod, attach to the container and run the docker command yourself.
oc create -f pod.yaml
oc attach -t -i authoritative-dns-server
/files/run-bind.py
This looks far from ideal and it doesn't really help you debug an ordinary openshift container with standard pod configuration, but at least it's possible to debug, now. Looking forward to better answers.
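For completeness: newer OpenShift versions make this workflow easier with oc debug, which starts a copy of a pod (or of a controller's pod template) with an interactive shell and removes it when you exit. A sketch; the resource names are placeholders:

```shell
# Debug a copy of an existing pod
oc debug pod/<pod-name>

# Or debug a copy of a deployment config's pod template
oc debug dc/<name>
```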

Resources