Autoscale volume and pods simultaneously (Kubernetes) - docker

I'm using a Kubernetes Deployment with a persistent volume to run my application, like this example:
https://github.com/kubernetes/kubernetes/tree/master/examples/mysql-wordpress-pd
But when I try to add more replicas or autoscale, all the new pods try to connect to the same volume.
How can I automatically create a new volume for each new pod, the way StatefulSets (PetSets) are able to do?
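For reference, the per-pod volumes that StatefulSets provide come from `volumeClaimTemplates`: the controller creates one PersistentVolumeClaim per replica, named `<template>-<statefulset>-<ordinal>`. A minimal Python sketch of that manifest shape is below. The names `web` and `data`, the image, and the 1Gi size are placeholders and not taken from the question, and clusters as old as 1.6 would use an older `apiVersion` than `apps/v1`.

```python
import json


def statefulset_manifest(name, replicas, claim_name="data", size="1Gi"):
    """Build a StatefulSet manifest in which every pod gets its own PVC."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": "example/app:latest",  # placeholder image
                        "volumeMounts": [
                            {"name": claim_name, "mountPath": "/data"}
                        ],
                    }]
                },
            },
            # One PVC is created from this template for each replica.
            "volumeClaimTemplates": [{
                "metadata": {"name": claim_name},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": size}},
                },
            }],
        },
    }


def expected_pvc_names(manifest):
    """List the PVC names the controller would create for this manifest."""
    sts_name = manifest["metadata"]["name"]
    replicas = manifest["spec"]["replicas"]
    names = []
    for template in manifest["spec"]["volumeClaimTemplates"]:
        claim = template["metadata"]["name"]
        names.extend(f"{claim}-{sts_name}-{i}" for i in range(replicas))
    return names


if __name__ == "__main__":
    manifest = statefulset_manifest("web", 3)
    print(json.dumps(manifest, indent=2))
    print(expected_pvc_names(manifest))
```

A Deployment has no equivalent field, which is why all of its replicas end up referencing the same claim.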

The conclusion I reached for K8S 1.6 is that you can't. However, you can use NFS. If, like CrateDB, your clustered application can create a folder for each node under the volume mount, then you can auto-scale. So, I auto-scale CrateDB as a Deployment using this configuration:
https://github.com/erik777/kubernetes-cratedb
which relies on an nfs-server, which I deploy as an RC with PVC/PV:
SAME_BASE/kubernetes-nfs-server
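To illustrate the "folder for each node under the volume mount" idea, here is a rough Python sketch of what an application could do at startup when several pods mount the same NFS share. It is not how CrateDB implements this; the function name and the use of the hostname as the folder name are assumptions made for the example.

```python
import os
import socket
import tempfile


def node_data_dir(mount_root, node_name=None):
    """Return a private data directory for this node under a shared mount.

    Each replica derives its folder from its own name, so pods that
    share one volume do not write into each other's data.
    """
    # Assumption: the hostname is unique per pod (by default it is the pod name).
    node_name = node_name or socket.gethostname()
    path = os.path.join(mount_root, node_name)
    os.makedirs(path, exist_ok=True)
    return path


if __name__ == "__main__":
    # A temporary directory stands in for the NFS mount point.
    with tempfile.TemporaryDirectory() as shared:
        print(node_data_dir(shared, "crate-0"))
        print(node_data_dir(shared, "crate-1"))
```

Note that pods created by a Deployment get a new name when they are recreated, so a folder keyed on the hostname is not reattached automatically after a restart.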
It is on my TODO list to explore distributed file systems such as GlusterFS. For K8S Deployments, your choice of file system is your remedy.
You can also engage the scalability and storage SIGs in the K8S community to help prioritize this use case. Adding the capability to K8S would remove the requirement for a clustering solution to handle node separation in a shared volume, and would avoid introducing additional points of failure between the clustered app and the PV.
GITHUB kubernetes/community
Hopefully, we can see a K8S OTB solution by 2.0.
(NOTE: Had to change 2 of the GITHUB links because I don't have "10 reputation")

Related

Docker container live migration in kubernetes

I am searching for a tutorial or a good reference on performing Docker container live migration in Kubernetes between two hosts (embedded devices, arm64 architecture).
As far as I could search online, I could not find complete documentation about it. I am a newbie, and it would be really helpful if someone could point me to good reference material so that I can improve.
Posting this as a community wiki, feel free to edit and expand.
As @David Maze said, in terms of containers and pods it's not really a live migration. Usually pods are managed by Deployments, which own ReplicaSets that control pod state: pods are created and kept at the requested count. Any change in the number of pods (e.g. you delete one) or in the image used will trigger pod recreation.
This can also be used to schedule pods on different nodes, for instance when you need to perform maintenance on a node or add/remove one.
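The "requested amount" behaviour can be pictured as a reconcile step that compares the desired count with the pods that currently exist. The Python below is a toy model only, not the actual ReplicaSet controller code, and the pod names are made up.

```python
def reconcile(desired_replicas, running_pods):
    """Return (number_to_create, pods_to_delete) to reach the desired count."""
    missing = desired_replicas - len(running_pods)
    if missing > 0:
        # Too few pods, e.g. one was deleted or its node was drained.
        return missing, []
    # Too many (or exactly enough) pods: the surplus is removed.
    # Which pods get picked is arbitrary in this toy model.
    return 0, running_pods[desired_replicas:]


if __name__ == "__main__":
    # A pod disappeared from a 3-replica set, so one replacement is created.
    print(reconcile(3, ["app-a", "app-b"]))
    # Scaled down from 3 to 2, so one pod is deleted.
    print(reconcile(2, ["app-a", "app-b", "app-c"]))
```

The replacement pod is a new pod started from the same template, possibly on another node, which is why this is recreation rather than migration of a running container.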
As for your question in the comments, it doesn't necessarily have to be the same volume, since I suppose the application can tolerate a short downtime.
Sharing volumes between Kubernetes clusters on premises (cloud may differ) is not a built-in feature. You may want to look at an NFS server deployed in your network:
Mounting external NFS share to pods
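As a rough idea of what mounting an external NFS share looks like, the Python sketch below builds a Pod manifest with an `nfs` volume. The server address, export path, and image are hypothetical values for illustration.

```python
import json


def pod_with_nfs(name, image, server, export_path, mount_path="/mnt/nfs"):
    """Build a Pod manifest that mounts an external NFS export."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": name,
                "image": image,
                "volumeMounts": [
                    {"name": "nfs-share", "mountPath": mount_path}
                ],
            }],
            "volumes": [{
                "name": "nfs-share",
                # The NFS server only has to be reachable from the nodes.
                "nfs": {"server": server, "path": export_path},
            }],
        },
    }


if __name__ == "__main__":
    # Hypothetical server and export path.
    manifest = pod_with_nfs("reader", "busybox", "10.0.0.5", "/exports/data")
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and passed to `kubectl apply -f`, since kubectl accepts JSON as well as YAML.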

Unsure on how to Orchestrate docker containers

I'm new to Docker and want to accomplish something, but I am unsure how to orchestrate my Docker containers to do it.
What I want to do:
I have an API that, put simply, does a calculation on a requested file. It loads the file (around 80 MB) from disk into memory and then keeps it in memory for 2 hours (caching).
I want an architecture where, for example, when the container gets overwhelmed with requests a new one fires up, and when the original container frees its memory and the requests slow down, the container shuts down.
Is container orchestration based on memory and CPU possible?
Thank You,
/Jeremy
Docker itself is not dedicated to orchestrating multiple containers. You need to use a container orchestration environment. The most popular are Kubernetes, Docker Swarm, and Apache Mesos. Or, if you want to run in the cloud, a vendor-specific one such as AWS ECS.
Here's a good list of container clustering toolkits.
In all these environments it's possible to configure what you described. If you're completely new to the topic, I recommend installing Docker Desktop, which comes with built-in Kubernetes, and playing with that locally.
For sure, a container orchestration system is what you want in order to manage your Docker containers efficiently.
You can find a current, complete list of solutions for production environments in this spreadsheet.
Tools like Kubernetes will give you a rich set of benefits, e.g.:
Provisioning and deployment of containers
Redundancy and availability of containers
Scaling up or removing containers to spread application load evenly across host infrastructure
Allocation of resources between containers
Load balancing of service discovery between containers
Health monitoring of containers and hosts
In Kubernetes there is a Horizontal Pod Autoscaler, that
automatically scales the number of pods in a replication controller,
deployment, replica set or stateful set based on observed CPU
utilization (or, with custom metrics support, on some other
application-provided metrics). Note that Horizontal Pod Autoscaling
does not apply to objects that can’t be scaled, for example,
DaemonSets.
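The scaling rule documented for the Horizontal Pod Autoscaler is desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), skipped when the ratio is close enough to 1. A minimal Python sketch of that rule is below. The function name and the replica bounds are illustrative, the 0.1 tolerance is the documented default, and the real controller also handles details this sketch ignores (pods that are not ready, missing metrics, stabilization windows).

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the HPA replica calculation for a single metric."""
    ratio = current_metric / target_metric
    # Within the tolerance band the autoscaler leaves the count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # The result is clamped to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 2 pods at 90% average CPU with a 50% target.
    print(desired_replicas(2, 90, 50))
    # 4 pods at 10% average CPU with a 50% target.
    print(desired_replicas(4, 10, 50))
```

For the memory-cached API in the question, the same calculation applies to a memory metric, provided the pods declare resource requests so that utilization can be computed.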
To begin with, I would recommend starting with minikube.
More advanced options are to set up a cluster manually using kubeadm, or to look into the cloud providers.
Please be aware that you will not have the option to modify a cloud-based control plane. More info in my related answer.

How to Convert NFS into a Storage Class in kubernetes

I work in a media organisation where we deploy all our applications on monolithic VMs. We now want to move to Kubernetes, but we have a major problem: we have 40+ NFS servers from which we consume terabytes of data.
The major problem is how to read all this data from containers.
The solutions we tried:
1. Creating a Persistent Volume and Persistent Volume Claim for the NFS share. In our view this is not feasible, because as the data grows we have to create a new PV and PVC and create a deployment.
2. Mounting volumes on Kubernetes. If we do this, there would be no difference between Kubernetes and VMs.
3. Adding Docker volumes to containers. We were able to add the volume, but we cannot see the data in the container.
How can we turn the existing NFS servers into a storage class and use it, or how can we mount all 40+ NFS servers on pods?
It sounds like you need some form of object storage or block storage platform that manages the disks and automatically provisions them for you.
You could use something like Rook to deploy Ceph into your cluster.
This makes disk management much friendlier and helps to automatically provision the NFS disks into your cluster.
Take a look at this: https://docs.ceph.com/docs/mimic/radosgw/nfs/
There is also the option of creating your own implementation, using CRDs to trigger PV/PVC creation on certain actions or when disks are mounted on your servers.
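A much simpler variant of the "trigger PV/PVC creation" idea is to generate the manifests from a list of exports. The Python below is a sketch of that under stated assumptions: it is not a CRD-based controller, the server names and paths are hypothetical, and the read-only access mode and nominal size are choices made for the example. Each PVC is bound statically to its PV through `volumeName`, with an empty `storageClassName` so that no dynamic provisioner is involved.

```python
import json


def nfs_pv_and_pvc(name, server, export_path, size="1Ti"):
    """Build a statically bound PV/PVC pair for one existing NFS export."""
    pv = {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},  # nominal value for an NFS export
            "accessModes": ["ReadOnlyMany"],
            "persistentVolumeReclaimPolicy": "Retain",
            "storageClassName": "",
            "nfs": {"server": server, "path": export_path},
        },
    }
    pvc = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadOnlyMany"],
            "storageClassName": "",
            "volumeName": name,  # bind to the PV defined above
            "resources": {"requests": {"storage": size}},
        },
    }
    return pv, pvc


def build_all(exports):
    """Turn [(server, path), ...] into a flat list of PV and PVC manifests."""
    manifests = []
    for index, (server, path) in enumerate(exports):
        manifests.extend(nfs_pv_and_pvc(f"nfs-{index:02d}", server, path))
    return manifests


if __name__ == "__main__":
    # Hypothetical exports; a real list would cover all 40+ servers.
    exports = [("nfs01.example.internal", "/exports/media"),
               ("nfs02.example.internal", "/exports/archive")]
    print(json.dumps({"apiVersion": "v1", "kind": "List",
                      "items": build_all(exports)}, indent=2))
```

Because the exports already exist and are only read, the PVs do not need to be recreated as the data grows; the capacity value is not enforced for NFS.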

HPA Implementation on single node kubernetes cluster

I am running a Kubernetes cluster on GKE. I run a monolithic application and am now migrating to microservices, so both are running in parallel on the cluster.
The monolithic application is a simple Python app using around 200 MB of memory.
The K8s cluster is a simple single-node GKE cluster with 15 GB of memory and 4 vCPUs.
Now I am thinking of applying HPA to my microservices and the monolithic application.
On the single node I have also installed the Graylog stack (Elasticsearch, MongoDB, and Graylog pods), separated into the namespace Devops.
In another namespace, monitoring, Grafana, Prometheus, and Alertmanager are running.
There is also an ingress controller and cert-manager running.
In the default namespace there is another Elasticsearch for application use, plus Redis and RabbitMQ. These are all single pods, of type StatefulSet or Deployment with a volume.
So I am thinking of applying HPA to the microservices and the application.
Can someone suggest how to add a node pool on GKE and autoscale it? When I added a node to the pool and deleted the old node from the GCP console, the whole cluster restarted and the services went down for a while.
Also, I am thinking of using affinity/anti-affinity, so can someone suggest how to divide the infrastructure and implement HPA?
From the wording in your question, I suspect that you want to move your current workloads to the new pool without disruption.
Since this action represents a voluntary disruption, you can start by defining a PodDisruptionBudget to control the number of pods that can be evicted in this voluntary disruption operation:
A PDB limits the number of pods of a replicated application that are down simultaneously from voluntary disruptions.
The settings in the PDB depend on your application and your business needs; for a reference on the values to apply, you can check this.
Following this, you can drain the nodes where your application is scheduled, since it will be "protected" by the budget. Drain uses the Eviction API instead of directly deleting the pods, which should make evictions graceful.
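As a sketch of what such a budget looks like, the Python below builds a PodDisruptionBudget manifest and models how many evictions it would permit. The `app` label, the names, and the `minAvailable` value are assumptions for the example, and older clusters use `policy/v1beta1` instead of `policy/v1`.

```python
import json


def pdb_manifest(name, app_label, min_available):
    """Build a PodDisruptionBudget for pods labelled app=<app_label>."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


def disruptions_allowed(healthy_pods, min_available):
    """Number of pods that may be evicted right now under the budget."""
    return max(0, healthy_pods - min_available)


if __name__ == "__main__":
    print(json.dumps(pdb_manifest("api-pdb", "api", 2), indent=2))
    # Three healthy replicas with minAvailable=2: one eviction at a time.
    print(disruptions_allowed(3, 2))
    # A single replica with minAvailable=1: evictions are refused.
    print(disruptions_allowed(1, 1))
```

The second case matters for the setup in the question: with single-pod workloads a budget of `minAvailable: 1` blocks the drain rather than preventing downtime, so the workload needs more than one replica for the budget to help.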
Regarding affinity, I'm not sure how it fits the aforementioned goal that you're trying to achieve. However, there is an answer on this particular point in the comments.

Kubernetes Deployments, Pod and Container concepts

I have recently started getting familiar with Kubernetes. While I get the concept, I have some questions I am unable to answer clearly from the Kubernetes Concepts documentation, and some understandings that I'd like to confirm.
A Deployment is a group of one or more container images (Docker, etc.) that is deployed within a Pod, and through the Kubernetes Deployment controller such deployments are monitored and created, updated, or deleted.
A Pod is a group of one or more containers. Are those containers from the same Deployment, or can they be from multiple Deployments?
"[A pod] contains one or more application containers which are relatively tightly coupled". Are there any clear criteria for when to deploy containers within the same pod rather than in separate pods?
"Pods are the smallest deployable units of computing that can be created and managed in Kubernetes" - Pods, Kubernetes Documentation. Does that mean that the Kubernetes API is unable to monitor and manage containers (at least directly)?
Appreciate your input.
Your question is actually too broad for StackOverflow, but I'll quickly answer before it is closed.
Maybe it gets clearer when you look at the API documentation, which you could read like this:
A Deployment describes a specification of the desired behavior for the contained objects.
This is done within the spec field, which is of type DeploymentSpec.
A DeploymentSpec defines what the related Pods should look like, with a template through the PodTemplateSpec.
The PodTemplateSpec then holds the PodSpec with all the required parameters, which defines what the containers within this Pod should look like through a Container definition.
This is not a punchy one-line statement, but maybe it makes it easier to see how things relate to each other.
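The nesting described above can also be read off a manifest. The Python sketch below assembles a Deployment from the inside out, with each variable named after the API type it corresponds to; the application name and image are placeholders.

```python
import json


def deployment(name, image, replicas=2):
    """Assemble a Deployment manifest, one API type per step."""
    labels = {"app": name}
    container = {"name": name, "image": image}           # Container
    pod_spec = {"containers": [container]}                # PodSpec
    pod_template_spec = {                                 # PodTemplateSpec
        "metadata": {"labels": labels},
        "spec": pod_spec,
    }
    deployment_spec = {                                   # DeploymentSpec
        "replicas": replicas,
        "selector": {"matchLabels": labels},
        "template": pod_template_spec,
    }
    return {                                              # Deployment
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": deployment_spec,
    }


if __name__ == "__main__":
    print(json.dumps(deployment("web", "nginx:1.25"), indent=2))
```

Adding a second entry to the `containers` list is what puts two containers into the same Pod; every replica created from the template then runs both.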
Regarding the criteria for what's a good size and what's too big for a Pod or a Container: this is very opinion-laden, and the best way to figure it out is to read through the opinions on the size of microservices.
To cover your last point: Kubernetes is able to monitor and manage containers, but the "user" is not able to schedule single containers. They have to be embedded in a Pod definition. You can of course access container status and details per container, e.g. through kubectl logs <pod> -c <container> (details) or through the metrics API.
I hope this helps a bit and doesn't add to the confusion.
A Pod is an abstraction provided by Kubernetes; it corresponds to a group of containers which share a subset of namespaces, most importantly the network namespace. For instance, the applications running in these containers can interact the way applications in the same VM would, except that they don't share the same filesystem hierarchy.
Workloads run in the form of pods, but a Pod is a lower-level abstraction. Workloads are typically scheduled in terms of Kubernetes Deployments, Jobs, CronJobs, DaemonSets, etc., which in turn create the Pods.

Resources