Run the Kubernetes pod from the point of failure without restarting - docker

I have deployed an application in Kubernetes that prints the numbers from 1 to 20.
While it is printing, there is suddenly a network failure and the pod crashes after printing the numbers 1 to 10. The basic pod lifecycle says that the pod will restart and the numbers will print again starting from 1, but I want to print the numbers from where it failed, i.e. 10.
So basically I am searching for a way to resume the application running in the pod from the point of failure, without starting over.
Is there a way to do it? I have read about persistent storage and volumes, but they are basically used to assign volumes to pods so that they can retain data and files.
Please help me understand how I can achieve this, and demonstrate it in the form of a POC.

Can a StatefulSet be of use here?
https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
Using StatefulSets
StatefulSets are valuable for applications that require one or more of the following.
Stable, unique network identifiers.
Stable, persistent storage.
Ordered, graceful deployment and scaling.
Ordered, automated rolling updates.
In the above, stable is synonymous with persistence across Pod (re)scheduling. If an application doesn't require any stable identifiers or ordered deployment, deletion, or scaling, you should deploy your application using a workload object that provides a set of stateless replicas. Deployment or ReplicaSet may be better suited to your stateless needs.
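As a rough proof-of-concept sketch (the names, the busybox image, and the /data/last checkpoint path are all made up for illustration, and a headless Service named counter is assumed to exist): the counter writes its progress to a file on a persistent volume after every number, and on restart it resumes from that checkpoint. A StatefulSet with a volumeClaimTemplate gives the pod a stable PVC to hold the checkpoint file.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: counter
spec:
  serviceName: counter
  replicas: 1
  selector:
    matchLabels:
      app: counter
  template:
    metadata:
      labels:
        app: counter
    spec:
      containers:
      - name: counter
        image: busybox:1.36
        command: ["sh", "-c"]
        args:
        # Resume from the last checkpointed number (default 1) and persist
        # progress after every print, so a restart continues from the checkpoint.
        - |
          start=$(cat /data/last 2>/dev/null || echo 1)
          i=$start
          while [ "$i" -le 20 ]; do
            echo "$i"
            echo "$i" > /data/last
            i=$((i + 1))
            sleep 1
          done
        volumeMounts:
        - name: data
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi

The important point is that Kubernetes itself always restarts the container from the beginning; resuming at 10 is something the application has to implement by persisting its own progress to the volume, which is exactly what the persistent storage is for here.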

Related

Kubernetes POD Failover

I am toying around with Kubernetes and have managed to deploy a stateful application (a Jenkins instance) to a single node.
It uses a PVC to make sure that I can persist my Jenkins data (jobs, plugins, etc.).
Now I would like to experiment with failover.
My cluster has 2 DigitalOcean droplets.
Currently my Jenkins pod is running on just one node.
When that goes down, Jenkins becomes unavailable.
I am now looking at how to accomplish failover in the sense that, when the Jenkins pod goes down on my node, it will spin up on the other node (so a short downtime during this process is OK).
Of course it has to use the same PVC, so that my data remains intact.
I believe, from what I have read, that a StatefulSet can be used for this?
Any pointers are much appreciated!
Best regards
DigitalOcean's Kubernetes service only supports the ReadWriteOnce access mode for PVCs (see here). This means the volume can only be attached to one node at a time.
I came across this blog post which, while focused on Jenkins on Azure, describes the same situation of only supporting ReadWriteOnce. The author states:
the drawback for me though lies in the fact that the access mode for Azure Disk persistent volumes is ReadWriteOnce. This means that an Azure disk can be attached to only one cluster node at a time. In the event of a node failure or update, it could take anywhere between 1-5 minutes for the Azure disk to get detached and attached to the next available node.
Note that Pod failures and node failures are different things. Since DO only supports ReadWriteOnce, there's no benefit to trying anything more sophisticated than what you have right now in terms of tolerance to node failure. Because the volume is ReadWriteOnce, it will need to be unmounted from the failing node and re-mounted to the new node, and then a new Pod will get scheduled on the new node. Kubernetes will do this for you, and there's not much you can do to optimize it.
For Pod failure, you could use a Deployment; since you want to read and write the same data, you don't want different PVs attached to the different replicas. There may be very limited benefit to this: you would have multiple replicas of the Pod all running on the same node, so it depends on how the Jenkins process scales and whether it can support that kind of horizontal scale-out model with all replicas writing to the same volume (as opposed to simply scaling memory or CPU requests vertically).
If you really want to achieve higher availability in the face of node and/or Pod failures, and the Jenkins workload you're deploying has a hard requirement on local volumes for persistent state, you will need to consider an alternative volume plugin like NFS, or moving to a different cloud provider like GKE.
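For illustration only, this is roughly what a ReadWriteMany claim would look like; the nfs-client StorageClass name is an assumption that depends on which NFS provisioner you install, since DigitalOcean block storage itself only offers ReadWriteOnce:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jenkins-home
spec:
  # ReadWriteMany requires a volume plugin that supports it (e.g. an NFS
  # provisioner); DigitalOcean block storage volumes are ReadWriteOnce only.
  accessModes:
  - ReadWriteMany
  storageClassName: nfs-client   # hypothetical NFS-backed StorageClass
  resources:
    requests:
      storage: 10Gi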
Yes, you would use a Deployment or StatefulSet depending on the use case. For Jenkins, a StatefulSet would be appropriate. If the running pod becomes unavailable, the StatefulSet controller will see that and spawn a new one.
What you are describing is the default behaviour of Kubernetes for Pods that are managed by a controller, such as a Deployment.
You should deploy any application as a Deployment (or another controller) even if it consists of just a single Pod. You never really deploy Pods directly to Kubernetes. So, in this case, there's nothing special you need to do to get this behaviour.
When one of your nodes dies, the Pod dies too. This is detected by the Deployment controller, which creates a new Pod. This is in turn detected by the scheduler, which assigns the new Pod to a node. Since one of the nodes is down, it will assign the Pod to the other node that is still running. Once the Pod is assigned to this node, the kubelet of this node will run the container(s) of this Pod on this node.
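As a minimal sketch (the names are placeholders and the PVC is assumed to already exist), a single-replica Deployment that mounts the existing claim is enough to get this rescheduling behaviour:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: jenkins
spec:
  replicas: 1
  strategy:
    type: Recreate        # avoid two pods trying to mount the same RWO volume during updates
  selector:
    matchLabels:
      app: jenkins
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      containers:
      - name: jenkins
        image: jenkins/jenkins:lts
        volumeMounts:
        - name: jenkins-home
          mountPath: /var/jenkins_home
      volumes:
      - name: jenkins-home
        persistentVolumeClaim:
          claimName: jenkins-home   # assumes an existing PVC with this name

The Recreate strategy is used here so that Kubernetes does not try to run two pods against the same ReadWriteOnce volume at once.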
Ok, let me try to answer my own question here.
I think Amit Kumar Gupta came the closest to what I believe is going on here.
Since I am using a Deployment and my PVC is ReadWriteOnce, I am basically stuck with one pod, running Jenkins, on one node.
weibeld's answer made me realise that I was asking questions about a concept that Kubernetes handles by default.
If my pod goes down (in my case I am shutting down a node on purpose by doing a hard power-down to simulate a failure), the cluster (controller?) will detect this and spawn a new pod on another node.
All is fine so far, but then I noticed that my new pod was stuck in the ContainerCreating state.
Running a describe on my new pod (the one in ContainerCreating state) showed this:
Warning FailedAttachVolume 16m attachdetach-controller Multi-Attach error for volume "pvc-cb772fdb-492b-4ef5-a63e-4e483b8798fd" Volume is already used by pod(s) jenkins-deployment-6ddd796846-dgpnm
Warning FailedMount 70s (x7 over 14m) kubelet, cc-pool-bg6u Unable to mount volumes for pod "jenkins-deployment-6ddd796846-wjbkl_default(93747d74-b208-421c-afa4-8d467e717649)": timeout expired waiting for volumes to attach or mount for pod "default"/"jenkins-deployment-6ddd796846-wjbkl". list of unmounted volumes=[jenkins-home]. list of unattached volumes=[jenkins-home default-token-wd6p7]
Then it started to hit me: this makes sense.
It's a pity, but it makes sense.
Since I did a hard power-down on the node, the PV went down with it.
So now the controller tries to start a new pod on a new node, but it can't transfer the PV, since the volume attached to the previous node became unreachable.
As I read more on this, I found that DigitalOcean only supports ReadWriteOnce, which now leaves me wondering: how can I achieve a simple failover for a stateful application on a Kubernetes cluster on DigitalOcean that consists of just a couple of simple droplets?

HPA implementation on a single-node Kubernetes cluster

I am running a Kubernetes cluster on GKE. I am running a monolithic application and am now migrating to microservices, so both are running in parallel on the cluster.
The monolithic application is a simple Python app using around 200 MB of memory.
The K8s cluster is a simple single-node GKE cluster with 15 GB of memory and 4 vCPUs.
Now I am thinking of applying HPA to my microservices and the monolithic application.
On the single node I have also installed the Graylog stack, which includes Elasticsearch, MongoDB, and Graylog pods, separated into the Devops namespace.
In another namespace, monitoring, there are Grafana, Prometheus, and Alertmanager running.
There are also an ingress controller and cert-manager running.
In the default namespace there is another Elasticsearch for application use, plus Redis and RabbitMQ. These are all single pods, of type StatefulSet or Deployment, with volumes.
Now I am thinking of applying HPA to the microservices and the application.
Can someone suggest how to add a node pool on GKE and auto-scale? When I added a node to the pool and deleted the old node from the GCP console, the whole cluster restarted and services went down for a while.
I am also thinking of using affinity/anti-affinity, so can someone suggest how to divide the infrastructure and implement HPA?
From the wording in your question, I suspect that you want to move your current workloads to the new pool without disruption.
Since this action represents a voluntary disruption, you can start by defining a PodDisruptionBudget to control the number of pods that can be evicted in this voluntary disruption operation:
A PDB limits the number of pods of a replicated application that are down simultaneously from voluntary disruptions.
The settings in the PDB depend on your application and your business needs; for a reference on the values to apply, you can check this.
Following this, you can drain the nodes where your application is scheduled, since it will be "protected" by the budget; drain uses the Eviction API instead of directly deleting the pods, which keeps evictions graceful.
Regarding affinity, I'm not sure how it fits into the aforementioned goal you're trying to achieve. However, there is an answer on this particular point in the comments.
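For reference, a minimal PodDisruptionBudget could look like the sketch below; the name, label selector, and minAvailable value are assumptions and should be matched to your own Deployments:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  # Keep at least one replica of the app running during voluntary
  # disruptions such as a node drain.
  minAvailable: 1
  selector:
    matchLabels:
      app: my-app          # hypothetical label; match your Deployment's labels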

How to change a k8s pod's limits without killing the original pod?

Request: the limits of a pod may be set too low at the beginning; to make full use of the node's resources, we need to set the limits higher. However, when the node's resources are not enough, to keep the node working well, we need to set the limits lower. It is better not to kill the pod, because that may affect the cluster.
Background: I am currently a beginner in K8s and Docker, and my mentor gave me this request. Can this request be fulfilled normally? Or is there a better way to solve this kind of problem? Thanks for your help!
All I tried: I am trying to do this by editing the cgroups, but I can only do that from inside a container, so the container would probably have to run in privileged mode.
I expect a reasonable plan for this request.
Thanks...
The clue is you want to change limits without killing the pod.
This is not the way Kubernetes works, as Markus W Mahlberg explained in his comment above. Kubernetes has no "hot plug CPU/memory" or "live migration" facilities of the kind that conventional hypervisors provide. Kubernetes treats pods as ephemeral instances and does not take care of keeping a particular pod instance running. Whether you need to change resource limits for the application, change the app configuration, install app updates, or repair a misbehaving application, the "kill-and-recreate" approach is applied to pods.
Unfortunately, the solutions suggested here will not work for you:
Increasing limits for the running container within the pod (the docker update command) will lead to breaching the pod's limits and Kubernetes killing the pod.
Vertical Pod Autoscaler is part of Kubernetes project and relies on the "kill-and-recreate" approach as well.
If you really need to keep the containers running and manage their allocated resource limits "on the fly", perhaps Kubernetes is not a suitable solution in this particular case. You should probably consider using pure Docker or a VM-based solution.
I do not think this is possible; there is an old issue tracking this on the Kubernetes GitHub (https://github.com/kubernetes/kubernetes/issues/9043) from 2015, and it is still open.
Also, you should not rely on a pod not being recreated while using Kubernetes. Applications should be stateless to the point where, if one dies in the middle of a process, it can handle the failure and start over from the beginning once it is restarted.
I understand the idea behind trying to optimize resource usage to its maximum, but you should also be concerned about having a reliable process.
I think you should check out Kubernetes' Vertical Pod Autoscaler, as it automatically adjusts the resources of a pod depending on its usage. Maybe that could be an alternative: https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler
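As an illustrative sketch (the target Deployment name is a placeholder, and the VPA components have to be installed in the cluster separately), a VerticalPodAutoscaler object looks roughly like this; note that in its default mode it still evicts and recreates pods in order to apply new requests:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app              # hypothetical Deployment name
  updatePolicy:
    # "Auto" lets VPA evict and recreate pods with updated requests;
    # "Off" only produces recommendations without touching the pods.
    updateMode: "Auto"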
You have to find the IDs of the containers running inside the pods and run the command below to increase the resources.
docker update --cpu-shares NewValue -m NewValue DockerContainerID

Migrating Karaf from a legacy deployment to the cloud with an active-standby deployment

Currently we have a Karaf deployment like the following:
It is active-standby.
The standby is there with a start level of 50, so failover is fast because system bundles are in the started state and user bundles are in the installed state.
As soon as the active instance goes down, the standby takes over.
We have now planned to migrate to Kubernetes. As per our current research, Kubernetes will create 1 pod for us and we declare the desired state as 1, which means that if the pod goes down it will automatically launch a new one. My concern is that launching a new pod will take more time than in the legacy deployment, where the standby was in a semi-active state.
How can we achieve such an active-standby setup in Kubernetes?
Answer provided by Tomislav Mikulin in comments:
That depends on whether your app is stateless or not. If it's stateless, then just use more than one pod (like 2 or 3) for your app, and you should be all set.
If you don't want (or can't) have multiple pods of your app already running, you could make two different Deployments of the same app and just switch between them through the Service object. That way you could have an extra pod running that you can switch to at any moment.
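A rough sketch of the Service-switching idea (all names, labels, and ports are made up): both Deployments carry the same app label plus a role label, and the Service selects only the active one, so failover becomes a matter of changing the Service's selector rather than restarting anything.

apiVersion: v1
kind: Service
metadata:
  name: karaf
spec:
  selector:
    app: karaf
    role: active          # change to "standby" to switch traffic to the other Deployment
  ports:
  - port: 8181            # illustrative port
    targetPort: 8181

The two Deployments themselves (one labelled role: active, the other role: standby) are omitted here; the switch could then be done with kubectl patch or kubectl edit on the Service.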

Is there a best practice to reboot a cluster

I followed Alex Ellis' excellent tutorial that uses kubeadm to spin up a K8s cluster on Raspberry Pis. It's unclear to me what the best practice is when I wish to power-cycle the Pis.
I suspect sudo systemctl reboot is going to result in problems. I'd prefer not to delete and recreate the cluster each time, starting with kubeadm reset.
Is there a way that I can shutdown and restart the machines without deleting the cluster?
Thanks!
This question is quite old but I imagine others may eventually stumble upon it so I thought I would provide a quick answer because there is, in fact, a best practice around this operation.
The first thing that you're going to want to ensure is that you have a highly available cluster. This consists of at least 3 masters and 3 worker nodes. Why 3? So that at any given time the control plane can still form a quorum (etcd needs a majority of its members) while one node is being rebooted.
Now that you have an HA Kubernetes cluster, you're going to have to go through every single one of your application manifests and ensure that you have specified resource requests and limits. This is so that you can ensure a pod will never be scheduled on a node without the required resources. Furthermore, in the event that a pod has a bug that causes it to consume an abnormally high amount of resources, the limit will prevent it from taking down your cluster.
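As a small illustration (the pod name, image, and values are arbitrary), requests and limits are set per container in the pod spec:

apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: app                # hypothetical container name
    image: nginx:1.25        # placeholder image
    resources:
      requests:              # what the scheduler reserves on the node
        cpu: 100m
        memory: 128Mi
      limits:                # hard caps enforced at runtime
        cpu: 500m
        memory: 256Mi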
Now that that is out of the way, you can begin the process of rebooting the cluster. The first thing you're going to do is reboot your masters. Run kubectl drain $MASTER against one of your (at least) three masters. The node will be cordoned so no new pods get scheduled on it, and the drain will start evicting the pods scheduled there so they are recreated on your other nodes.
Use kubectl describe node $MASTER to monitor the node until all pods have been removed. Now you can safely connect to it and reboot it. Once it has come back up, run kubectl uncordon $MASTER and the API server will once again begin scheduling Pods to it. Then use kubectl describe node $NODE until you have confirmed that all pods are READY.
Repeat this process for all of the masters. After the masters have been rebooted, you can safely repeat this process for all three (or more) worker nodes. If you perform this operation properly, you can ensure that all of your applications will maintain 100% availability, provided they are using multiple pods per service and have a proper deployment strategy configured.

Resources