Kubernetes pod random timeout - docker

I have a Kubernetes deployment containing a very simple Spring Boot web application. I am experiencing random timeouts trying to connect to this application externally.
Some requests return instantly whereas others hang for minutes.
I am unable to see any issues in the logs.
When connecting to the pod directly, I am able to curl the application and get a response immediately so it feels more like a networking issue.
I also have other applications with the identical configuration running in the same cluster which are experiencing no problems.
I am still quite new to Kubernetes so my question would be:
Where and how should I go about diagnosing network issues?
I can provide more information if it helps.

Since you have narrowed the issue down to networking, the core components of the cluster, such as the kubelet and kube-proxy, are presumably healthy.
You can check their status with the systemctl utility. For example:
systemctl status kubelet
systemctl status kube-proxy
You can get more detail with the journalctl utility. For example:
journalctl -xeu kubelet
journalctl -f -u docker
Now, if you want to know the fate of the packets, you need the iptables utility. It is what decides the forwarding, routing, and final verdict of every packet, incoming or outgoing.
My plan of action is: do not make any assumptions. I work through the following utilities to clear up any doubts:
kubectl (e.g. kubectl describe pod podName or kubectl describe svc svcName)
systemctl
journalctl
etcdctl
curl
iptables
If I still cannot solve the issue, it means I have made an assumption somewhere.
Please let me know about any other tools; I would love to add them to my utility set.
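For a concrete starting point, the checklist above can be run end to end as a small script. This is a minimal sketch, assuming a systemd host; "my-app" is a hypothetical Service name, and each step is allowed to fail so the whole list always executes (root is needed for the iptables step):

```shell
# Run the diagnostic checklist; "my-app" is a hypothetical Service name.
steps_run=0
for step in \
  "systemctl is-active kubelet" \
  "systemctl is-active kube-proxy" \
  "kubectl describe svc my-app" \
  "kubectl get endpoints my-app"; do
  echo "== $step"
  $step 2>/dev/null || true     # keep going even if a step fails
  steps_run=$((steps_run + 1))
done
# Service VIPs are implemented as iptables NAT rules; grep for yours
# (needs root; the chain name KUBE-SERVICES is created by kube-proxy):
iptables -t nat -L KUBE-SERVICES -n 2>/dev/null | grep my-app || true
```

If the endpoints list is empty or the NAT rules are missing for the service, the problem is usually between the Service and its pods rather than inside the application.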

Related

Kubectl commands not working after adding proxy

I have installed a Kubernetes deployment (version 1.19.14) with Docker version 20.10.8 on Ubuntu 18.04.
I was able to install it and it was working fine.
At some point the host lost internet connectivity, and on investigation I found that the proxy settings had been erased.
When I added the proxy back, internet connectivity started working, but strangely I could no longer run kubectl commands.
While trying kubectl commands after exporting proxy, the following error pops up:
Unable to connect to the server: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
We exported the proxy in the following manner:
export http_proxy=http://proxy.example.com:80
export https_proxy=$http_proxy
I searched around and found the suggestion to make the proxy persistent through http-proxy.conf and reload the daemon:
sudo systemctl daemon-reload
sudo systemctl restart docker
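For reference, a minimal sketch of such an http-proxy.conf drop-in (under /etc/systemd/system/docker.service.d/, reusing the proxy host from the exports above):

```ini
[Service]
Environment="HTTP_PROXY=http://proxy.example.com:80"
Environment="HTTPS_PROXY=http://proxy.example.com:80"
Environment="NO_PROXY=localhost,127.0.0.1"
```

Note that this drop-in only sets the proxy for the Docker daemon itself; kubectl instead reads http_proxy/https_proxy from the shell, so the API server's address usually needs to be excluded via no_proxy in the shell as well.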
Even after doing this, the kubectl commands didn't work.
Please let me know how I can resolve this issue.
Kubectl is just a CLI that communicates with the api-server of the Kubernetes control plane. First of all you need to make sure that the api-server is running and healthy, and that this is not the source of your problem.
You can use the tool crictl to debug pods when kubectl is not working. It talks directly to the underlying container runtime, which would be containerd if you are using Docker.
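A minimal sketch of that, assuming crictl is installed and Docker's bundled containerd with its default socket path (adjust the endpoint for your setup):

```shell
# Probe the container runtime directly, bypassing the api-server.
ENDPOINT=unix:///run/containerd/containerd.sock   # path varies by install
if command -v crictl >/dev/null 2>&1; then
  crictl --runtime-endpoint "$ENDPOINT" pods    # pod sandboxes the runtime knows
  crictl --runtime-endpoint "$ENDPOINT" ps -a   # all containers, incl. the api-server
  # crictl --runtime-endpoint "$ENDPOINT" logs CONTAINER_ID  # read api-server logs
  probe=ran
else
  echo "crictl not installed on this host"
  probe=missing
fi
```

If the api-server container shows as exited or is restarting, its logs from crictl will usually say why before kubectl becomes usable again.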

No outbound networking on Kubernetes pods

I am running a one-node Kubernetes cluster in a VM for development and testing purposes. I used Rancher Kubernetes Engine (RKE, Kubernetes version 1.18) to deploy it and MetalLB to enable the LoadBalancer service type. Traefik is version 2.2, deployed via the official Helm chart (https://github.com/containous/traefik-helm-chart). I have a few dummy containers deployed to test the setup (https://hub.docker.com/r/errm/cheese).
I can access the Traefik dashboard just fine through the node's IP (so MetalLB seems to work). It registers the services and routes for the test containers. Everything looks fine, but when I try to access the test containers in my browser I get a 502 Bad Gateway error.
Some probing showed that there seems to be an issue with outbound traffic from the pods. When I SSH into the node I can reach all pods by their service or pod IP. DNS from node to pod works as well. However, if I start an interactive busybox pod I can't reach any other pod or host from there. When I wget to any other container (all in the default namespace) I only get wget: can't connect to remote host (10.42.0.7): No route to host. The same is true for servers on the internet.
I have not installed any network policies and there are none installed by default that I am aware of.
I have also gone through this: https://kubernetes.io/docs/tasks/debug-application-cluster/debug-service
Everything in the guide is working fine, except that the pods don't seem to have any network connectivity whatsoever.
My RKE config is standard, except that I turned off the standard Nginx ingress and enabled etcd encryption-at-rest.
Any ideas?
Maybe just double check that your node's ip forwarding is turned on: sysctl net.ipv4.ip_forward
If for some reason it doesn't return:
net.ipv4.ip_forward = 1
Then you can set it with:
sudo sysctl -w net.ipv4.ip_forward=1
And to make it permanent:
edit /etc/sysctl.conf
add or uncomment net.ipv4.ip_forward = 1
and reload via sysctl -p /etc/sysctl.conf
Ok, so I was being stupid (or rather: a noob). I had an old iptables rule lying around on the host that dropped all traffic on the FORWARD chain; removing that rule fixed the problem.
I feel a bit uneasy just removing that rule, but I have to admit that I don't fully understand the security implications. That might take some further research, but it's another topic. And since I'm not currently planning to run this cluster in production but will use a hosted cluster instead, it's not really a problem anyway.
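For anyone hitting the same thing, a minimal sketch of how to find such a rule (run as root; the rule number used for deletion is hypothetical and must be read off the listing first):

```shell
# List FORWARD chain rules with their positions; needs root.
rules=$(iptables -L FORWARD -n --line-numbers 2>/dev/null \
        || echo "need root, or iptables not available")
echo "$rules"
# Then delete the offending DROP rule by its printed number, e.g.:
# iptables -D FORWARD 3
```

Deleting by number is safer than flushing the whole chain, since other rules on the FORWARD chain may still be needed.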

IBM Cloud Private node appears to be running but services are unresponsive

One of my ICP nodes appears to be running, but the services on that node are unresponsive and at times return a 504 Gateway Timeout.
When I SSH into the unresponsive node and run journalctl -u kubelet -f, I see error messages such as transport: dial unix /var/run/docker/containerd/docker-containerd.sock: connect: connection refused
Furthermore, when I run top I see dockerd using an unusually high percentage of my CPU.
What is causing this behavior and how can I return my node to its normal working condition?
These errors might be due to a known issue with Docker where an old containerd reference is used even after the containerd daemon was restarted. This defect causes the Docker daemon to go into an internal error loop that uses a high amount of CPU resources and logs a high number of errors. For more information about this error, please see the Refresh containerd remotes on containerd restarted pull request against the Moby project.
To work around this issue, restart the Docker service on the node from the host operating system. After some time, the services should resume.
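A minimal sketch of the workaround, run as root on the affected node (the socket path is the one from the kubelet error above):

```shell
# Restart Docker from the host OS and check the containerd socket.
systemctl restart docker 2>/dev/null \
  || echo "restart failed (not root, or no systemd here)"
# The socket path below is the one from the kubelet error message.
if [ -S /var/run/docker/containerd/docker-containerd.sock ]; then
  sock_state=present
else
  sock_state=absent
fi
echo "containerd socket: $sock_state"
```

Once the socket is back, the kubelet's "connection refused" errors in journalctl -u kubelet should stop and the node's services should recover.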

How kubelet - docker container communication happens?

I am wondering how the kubelet communicates with docker containers. Where is this configuration defined? I searched a lot but didn't find anything informative. I am using an https kube API server. I am able to create pods, but the containers are not getting spawned. Does anyone know what the cause might be? Thanks in advance.
Kubelet talks to the docker daemon using the docker API over the docker socket. You can override this with the --docker-endpoint= argument to the kubelet.
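To see what that socket API looks like by hand, here is a minimal sketch assuming the default socket path; any HTTP client that can speak over a Unix socket will do:

```shell
# Query the docker daemon the same way the kubelet does: HTTP over the
# Unix socket. Default socket path assumed.
SOCK=/var/run/docker.sock
if [ -S "$SOCK" ]; then
  curl --silent --unix-socket "$SOCK" http://localhost/version
else
  echo "no docker socket at $SOCK"
fi
```

A JSON response from /version confirms the daemon is reachable on the socket; if this fails as root, the kubelet cannot spawn containers either.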
Pods may fail to be spawned for any number of reasons. Check the logs of your scheduler, controller-manager and kubelet.

Which Kubernetes component creates a new pod?

I have a problem to understand the kubernetes workflow:
So as I understand the flow:
You have a master which contains etcd, api-server, controller manager and scheduler.
You have nodes, which contain pods (which contain containers), a kubelet and a proxy.
The proxy works as a basic proxy to make it possible for a service to communicate with other nodes.
When a pod dies, the controller manager will see this (it 'reads' the replication controller, which describes how many pods there should normally be).
unclear:
The controller manager informs the API server (I'm not sure about this).
The API-server will tell the scheduler to search a new place for the pod.
After the scheduler has found a good place, the API will inform kubelet to create a new pod.
I'm not sure about the last part. Can you explain the right process in a clear way?
Which component is creating the pod and container? Is it kubelet?
So it's the kubelet that actually creates the pods and talks to the docker daemon. If you run docker ps -a on your nodes (not the master), you'll see your pods' containers running. The workflow is: you run a kubectl command, which goes to the API server. Say that command was to spawn a pod: the controller manager notices, through the API server, that a pod is missing from the desired state; the scheduler is then asked to pick a node for it; and finally the kubelet on that node is told to spawn the pod.
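That sequence can be observed from kubectl itself; a minimal sketch assuming a working cluster, with "demo" as an example pod name:

```shell
# Create a pod and watch each component act on it via the event stream.
if command -v kubectl >/dev/null 2>&1; then
  kubectl run demo --image=nginx --restart=Never 2>/dev/null || true
  # Typical event order: Scheduled (scheduler), then Pulling, Pulled,
  # Created and Started (kubelet on the chosen node).
  kubectl get events --field-selector involvedObject.name=demo \
    --sort-by=.metadata.creationTimestamp 2>/dev/null || true
  watched=yes
else
  echo "kubectl not available here"
  watched=no
fi
```

The "Scheduled" event is emitted by the scheduler and the container lifecycle events by the kubelet, which matches the division of labour described above.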
I suggest reading the Borg paper that Kubernetes is based on to better understand things in further detail. http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43438.pdf
