How to kill a multi-container pod if one container fails? - jenkins

I'm using Jenkins Kubernetes Plugin which starts Pods in a Kubernetes Cluster which serve as Jenkins agents. The pods contain 3 containers in order to provide the slave logic, a Docker socket as well as the gcloud command line tool.
The usual workflow is that the slave does its job and notifies the master that it completed. Then the master terminates the pod. However, if the slave container crashes due to a lost network connection, the container terminates with error code 255, but the other two containers keep running and so does the pod. This is a problem because the pods have large CPU requests. Setup is cheap as long as the slaves run only when they have to, but having multiple machines running for 24 hours or over the weekend causes noticeable financial damage.
I'm aware that starting multiple containers in the same pod is not considered good Kubernetes practice, but it is fine if I know what I'm doing, and I assume I do. I'm sure it's hard to solve this differently given the way the Jenkins Kubernetes Plugin works.
Can I make the pod terminate if one container fails, without it respawning? A solution with a timeout is acceptable as well, although less preferred.

Disclaimer: I have rather limited knowledge of Kubernetes, but given the question:
Maybe you can run a fourth container that exposes one simple "liveness" endpoint.
It can run ps -ef, or use any other way to contact the 3 existing containers, just to make sure they're alive.
This endpoint could return "OK" only if all the containers are running, and "ERROR" if at least one of them was detected as crashed.
Then you could set up a Kubernetes liveness probe so that it would stop the pod upon an error returned from that fourth container.
Of course, if this fourth process crashes by itself for any reason (it shouldn't, unless there is a bug or something), then the liveness probe won't get a response and Kubernetes is supposed to stop the pod anyway, which is probably what you really want to achieve.
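A minimal sketch of such a fourth-container endpoint, using only the Python standard library. The check functions, their names, and port 8080 are hypothetical; how each check actually reaches the other containers (a localhost port, or ps -ef with a shared process namespace) is left open. Also note that a failing liveness probe makes the kubelet kill the probed container and then apply the pod's restartPolicy, so whether this really ends the pod depends on that policy.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def aggregate(checks):
    """Return (http_status, body): 200/"OK" only if every check passes.

    `checks` maps a container name to a zero-argument function that
    returns True while that container looks alive (hypothetical checks).
    """
    failed = [name for name, check in checks.items() if not check()]
    if failed:
        return 500, "ERROR: " + ", ".join(sorted(failed))
    return 200, "OK"


def make_handler(checks):
    """Build a request handler class that answers every GET with the aggregate status."""

    class LivenessHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            status, body = aggregate(checks)
            payload = body.encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, fmt, *args):
            # Keep probe traffic out of the container log.
            pass

    return LivenessHandler


def serve(checks, port=8080):
    """Serve the liveness endpoint forever (port 8080 is an assumption)."""
    HTTPServer(("", port), make_handler(checks)).serve_forever()
```

An HTTP liveness probe treats a 500 response as a failure, so the endpoint only has to map "at least one container is down" to a non-success status code.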

Related

Jenkins slave pods on Kubernetes disappear when there is an influx of running pods

I have a Kubernetes cluster running Jenkins master in a single pod and each build running in a separate slave pod. When there are many builds running, there are many pods being spun up and down and often I will see an error in a job like this:
Cannot contact slave-jenkins-0g9p0: hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel#197b6a38:JNLP4-connect connection from 10.10.3.90/10.10.3.90:54418": Remote call on JNLP4-connect connection from 10.10.3.90/10.10.3.90:54418 failed. The channel is closing down or has closed down
Could not connect to slave-jenkins-0g9p0 to send interrupt signal to process
The pod, for example slave-jenkins-0g9p0, just disappears. There is no trace that it existed. While watching information like kubectl describe pod slave-jenkins-0g9p0, there is no error message, it simply stops existing.
I have a feeling that, because there are multiple pods spinning up and down, Kubernetes attempts to balance the load on the nodes and reschedule the pod, but after killing it, it cannot spin up the pod on another node. I cannot be sure though. Maybe there is a way to tell K8s to tie a pod to a node until it exits by itself? I'm not really sure what or how to debug in this case.
Kubernetes version: v1.16.13-eks-2ba888 on AWS EKS
Jenkins version: 2.257
Kubernetes plugin version 1.27.2
Any advice would be appreciated
Thanks
UPDATE:
I have uploaded three slave pod manifest examples here where you can see the resources allocated. The above issue occurs in each of these running pods.
The node pool is controlled by the Kubernetes autoscaler (v1.14.6) and uses AWS t3a.large (2 CPU, 8GB mem) instances.
UPDATE 2:
I believe that I have found the cause of the problem. I disabled the cluster-autoscaler (https://github.com/kubernetes/autoscaler) (v1.14.6) and the problem stopped.
So what seems to be happening is that the autoscaler is removing the node that the slave pod is running on. I know that taints can be used to tell the autoscaler not to remove a node, but is there a way to do this dynamically, so that it won't remove a node if a certain pod is running on it, without having to develop something new?
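One thing that may help here, as far as I know, is the pod annotation cluster-autoscaler.kubernetes.io/safe-to-evict: "false", which asks the cluster-autoscaler not to evict that pod when it considers a node for scale-down. The sketch below only shows where the annotation goes in a pod manifest held as a Python dict; the helper name and the sample manifest are hypothetical, and whether your autoscaler version honors the annotation should be verified.

```python
import copy

# Annotation read by the cluster-autoscaler; the value must be the string "false".
SAFE_TO_EVICT = "cluster-autoscaler.kubernetes.io/safe-to-evict"


def protect_from_scale_down(pod_manifest):
    """Return a copy of a pod manifest (dict) with the annotation set to "false"."""
    manifest = copy.deepcopy(pod_manifest)
    annotations = manifest.setdefault("metadata", {}).setdefault("annotations", {})
    annotations[SAFE_TO_EVICT] = "false"
    return manifest


# Hypothetical agent pod manifest, reduced to the relevant fields.
agent_pod = {"apiVersion": "v1", "kind": "Pod", "metadata": {"name": "slave-jenkins"}}
protected = protect_from_scale_down(agent_pod)
```

With the Jenkins Kubernetes plugin, the same annotation would go into the pod template used for the agents, so that every build pod carries it for as long as it runs.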

Not able to connect to a container (created via REST API) in Kubernetes

I am creating a Docker container (using docker run) in a Kubernetes environment by invoking a REST API.
I have mounted the docker.sock of the host machine, and I am building an image and running that image from the REST API.
Now I need to connect to this container from some other container, which is actually started by kubectl from a deployment.yml file.
But when I use kubectl describe pod (Pod name), my container created using the REST API is not there. So where is this container running, and how can I connect to it from some other container?
Are you running the container in the same namespace as the one used by deployment.yml? One option to check that would be to run:
kubectl get pods --all-namespaces
If you are not able to find the Docker container there, then I would suggest performing the steps below:
docker ps -a {verify running docker status}
Ensuring that while mounting docker.sock there are no permission errors
If there are permission errors, escalate privileges to the appropriate level
To answer the second question, a connection between two containers should be possible by referencing the cluster DNS name in the format below:
"<servicename>.<namespacename>.svc.cluster.local"
I would also ask you to detail the steps, code, and errors (if there are any) so that I can better answer the question.
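As a small illustration of that DNS format, here is a sketch that builds the name from its parts. The service and namespace names in the usage line are hypothetical, and cluster.local is the default cluster domain, which a cluster may override.

```python
def service_dns_name(service, namespace, cluster_domain="cluster.local"):
    """Build <servicename>.<namespacename>.svc.<cluster domain> for a Service."""
    if not service or not namespace:
        raise ValueError("both service and namespace are required")
    return f"{service}.{namespace}.svc.{cluster_domain}"


# Hypothetical Service "builder" in namespace "ci".
name = service_dns_name("builder", "ci")
```

This only resolves for a Kubernetes Service, so it does not help with a container started directly through docker run, which Kubernetes does not know about.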
You probably shouldn't be directly accessing the Docker API from anywhere in Kubernetes. Kubernetes will be totally unaware of anything you manually docker run (or equivalent) and, as you note, normal administrative calls like kubectl get pods won't see it; the CPU and memory used by the container won't be known to Kubernetes, and this could cause a node to become overutilized. The Kubernetes network environment is also pretty complicated, and unless you know the details of your specific CNI provider it'll be hard to make your container accessible at all, much less from a pod running on a different node.
A process running in a pod can access the Kubernetes API directly, though. That page notes that all of the official client libraries are aware of the conventions this uses. This means that you should be able to directly create a Job that launches your target pod, and a Service that connects to it, and get the normal Kubernetes features around this. (For example, servicename.namespacename.svc.cluster.local is a valid DNS name that reaches any Pod connected to the Service.)
You should also consider whether you actually need this sort of interface. For many applications, it will work just as well to deploy some sort of message-queue system (e.g., RabbitMQ) and then launch a pool of workers that connects to it. You can control the size of the worker pool using a Deployment. This is easier to develop since it avoids a hard dependency on Kubernetes, and easier to manage since it prevents a flood of dynamic jobs from overwhelming your cluster.

How to detect an exception that occurred in a Pod in Kubernetes?

I have a multinode Kubernetes cluster. Multiple services are deployed as Pods. They communicate with each other via RabbitMQ, which also exists as a Pod in the cluster.
Problem Scenario:
Many times, services fail to connect to the required queue in RabbitMQ. Logs for this are reported in the RabbitMQ pod logs and on the services' Pods as well. This occurs primarily due to connectivity issues and is inconsistent. Because of this failure, functionality breaks. And since this is NOT a crash, the pod is always in the Running state in Kubernetes. To fix this, we have to manually go and restart the pod.
I want to create a liveness probe for every pod. But how should this work to catch the exception? Since many processes in a service can be trying to access the connection, any one of them can fail.
I'd suggest implementing an HTTP endpoint for the liveness probe that checks the state of the connection to RabbitMQ, or actually failing hard and exiting the whole process when the RabbitMQ connection does not work.
But... the best solution would be to retry the connection indefinitely when it fails, so that a temporary networking issue is recovered from transparently. A well-written service should wait for the services it depends on to become operational instead of cascading the failure up the stack.
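A minimal sketch of such an indefinite retry with capped exponential backoff. The connect callable, the delay values, and the choice of ConnectionError are assumptions; with a real RabbitMQ client you would pass a function that opens the connection and list the exception types that library raises.

```python
import time


def connect_with_retry(connect, base_delay=1.0, max_delay=60.0,
                       retry_on=(ConnectionError,), sleep=time.sleep):
    """Call connect() until it succeeds and return its result.

    After each failure the wait doubles, up to max_delay, so a broker that
    is down for a while is not hammered with reconnect attempts.
    """
    delay = base_delay
    while True:
        try:
            return connect()
        except retry_on:
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

The sleep parameter exists only so the waiting can be replaced in tests. Exceptions that are not listed in retry_on still propagate, so a genuine bug is not hidden behind endless retries.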
Imagine you have a liveness check like the one you ask about on 20 services using that RabbitMQ or some other service. That service goes down for some time, and what you end up with is a cluster with 20+ services in the CrashLoopBackOff state due to the incremental backoff on failure. This means your cluster will take some time to recover once the originally failing service is back, and the picture will be pretty messed up, making it harder to understand at first glance what happened.

Is there a best practice to reboot a cluster

I followed Alex Ellis' excellent tutorial that uses kubeadm to spin-up a K8s cluster on Raspberry Pis. It's unclear to me what the best practice is when I wish to power-cycle the Pis.
I suspect sudo systemctl reboot is going to result in problems. I'd prefer not to delete and recreate the cluster each time starting with kubeadm reset.
Is there a way that I can shut down and restart the machines without deleting the cluster?
Thanks!
This question is quite old but I imagine others may eventually stumble upon it so I thought I would provide a quick answer because there is, in fact, a best practice around this operation.
The first thing that you're going to want to ensure is that you have a highly available cluster. This consists of at least 3 masters and 3 worker nodes. Why 3? This is so that at any given time they can always form a quorum for eventual consistency.
Now that you have an HA Kubernetes cluster, you're going to have to go through every single one of your application manifests and ensure that you have specified resource requests and limits. This is so that you can ensure that a pod will never be scheduled on a node without the required resources. Furthermore, in the event that a pod has a bug that causes it to consume a highly abnormal amount of resources, the limit will prevent it from taking down your cluster.
Now that that is out of the way, you can begin the process of rebooting the cluster. The first thing you're going to do is reboot your masters. So run kubectl drain $MASTER against one of your (at least) three masters. The API Server will now reject any scheduling attempts and immediately start the process of evicting any scheduled pods and migrating their workloads to your other masters.
Use kubectl describe node $MASTER to monitor the node until all pods have been removed. Now you can safely connect to it and reboot it. Once it has come back up, you can run kubectl uncordon $MASTER and the API Server will once again begin scheduling Pods to it. Once again, use kubectl describe node $MASTER until you have confirmed that all pods are READY.
Repeat this process for all of the masters. After the masters have been rebooted, you can safely repeat this process for all three (or more) worker nodes. If you perform this operation properly, you can ensure that all of your applications will maintain 100% availability, provided they are using multiple pods per service and have a proper Deployment strategy configured.
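The drain, reboot, uncordon cycle above can be sketched as a plan of commands, one node at a time. This only builds the command lists and does not run anything; the node names, the use of ssh for the reboot, and the --ignore-daemonsets flag on drain are assumptions about the setup, and the operator still has to confirm the node is back and Ready before running the uncordon step.

```python
def rolling_reboot_plan(nodes,
                        reboot_template=("ssh", "{node}", "sudo", "systemctl", "reboot")):
    """Return the ordered list of commands for rebooting nodes one at a time."""
    plan = []
    for node in nodes:
        # 1. Stop scheduling onto the node and evict its pods.
        plan.append(["kubectl", "drain", node, "--ignore-daemonsets"])
        # 2. Reboot the machine itself (how you reach it is an assumption).
        plan.append([part.format(node=node) for part in reboot_template])
        # 3. Run only after the node has come back up and reports Ready.
        plan.append(["kubectl", "uncordon", node])
    return plan


# Hypothetical node names: masters first, then workers, as described above.
plan = rolling_reboot_plan(["master-1", "master-2", "master-3", "worker-1"])
```

Keeping the plan as data makes it easy to review the order before anything is executed, for example by printing each command and running it by hand.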

Openshift PaaS/Kubernetes Docker Container Monitoring and Orchestration

Kubernetes deployments and replication controllers provide the ability to self-heal by ensuring that a minimum number of replicas is present.
Also, the autoscaling feature allows increasing the number of replicas when a specific CPU threshold is reached.
Are there tools available that would provide flexibility in the auto-healing and auto-scale features?
Example :
Auto-adjust number of replicas during peak hours or days.
When the pod dies, and is due to external issues, prevent the system from re-creating container and wait for a condition to succeed, i.e. ping or telnet test.
You can block pod startup by waiting for external services in an entrypoint script or init container. That's the closest that exists today to waiting for external conditions.
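A minimal sketch of that kind of wait, written as something an entrypoint script or init container could call before starting the application. The function name, timeout, and interval are assumptions; it simply tries a TCP connection (similar in spirit to the telnet test mentioned in the question) until it succeeds or the deadline passes.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once host:port accepts a TCP connection, False after timeout seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # The connection is closed again right away; we only care that it opened.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

An init container would exit non-zero when this returns False, so the pod's main containers are not started until the dependency is reachable.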
There is no time-based autoscaler today, although it would be possible to script it fairly easily on a schedule.
In Openshift, you can easily scale your app by running this command in a cron job.
Scale command
oc scale dc app --replicas=5
And of course, you can scale it down again by changing the number of replicas.
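As a sketch of what such a cron-driven script could compute, the snippet below picks a replica count from the hour of day and builds the oc scale command shown above. The peak hours and the replica counts are made-up values, and the command is only built, not executed.

```python
def replicas_for_hour(hour, peak_hours=range(8, 18), peak=5, off_peak=2):
    """Return the replica count for an hour of day (0-23); values are assumptions."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    return peak if hour in peak_hours else off_peak


def scale_command(dc_name, replicas):
    """Build the `oc scale dc <name> --replicas=<n>` command as an argument list."""
    return ["oc", "scale", "dc", dc_name, f"--replicas={replicas}"]


# At 09:00 the hypothetical peak setting applies to the "app" deployment config.
command = scale_command("app", replicas_for_hour(9))
```

A cron job could run this once an hour and pass the resulting list to subprocess.run, which covers both scaling up for peak hours and scaling back down afterwards.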
Autoscale
This is what OpenShift for Developers says about autoscaling:
OpenShift also supports automatic scaling, defining upper and lower thresholds for CPU usage by pod.
If the upper threshold is consistently exceeded by the running pods for your application, a new instance of your application will be started. When CPU usage drops back below the lower threshold, because your application is no longer working as hard, the number of instances will be scaled back again.
I think Kubernetes has now released version 1.3, which allows autoscaling, but it is not integrated in OpenShift yet.
Health Check
When it comes to health checks, OpenShift has:
readiness checks: check the status of the test you configure before the router starts to send traffic to the instance.
liveness probe: liveness probe is run periodically once traffic has been switched to an instance of your application to ensure it is still behaving correctly. If the liveness probe fails, OpenShift will automatically shut down that instance of your application and replace it with a new one.
You can perform these kinds of tests: HTTP check, container execution check, and TCP socket check.
So with these tools, I guess you can create readiness and liveness checks to ensure that your pod is running properly; if not, a new deployment will be triggered until the readiness status comes back OK.

Resources