I have added HPA support for my containers and they are scaling out and in as expected. But I'm not sure about the internal state of the Docker containers during scaling.
Let's say I have an ongoing process and the number of replicas is 1. If the CPU usage goes above the threshold, the replicas scale out to 2 or 3. I understand the new replicas are ready to serve new requests, but what happens to the ongoing process? Also, how would it be impacted in the case of stateless and stateful processes?
Your ongoing process shouldn't be affected by the scaling operation: existing requests will keep being processed in the existing container, while new requests will be routed to the new containers the cluster provisioned.
In the case of stateless processes the scaling shouldn't affect the response, as the service doesn't hold any state in any of its containers.
The case for stateful services is a lot more complicated and somewhat beyond the scope of an SO question. You can examine the Kubernetes StatefulSet object to see how it tackles this issue, and consider it for stateful processes running in a cluster. Check out the documentation for more info.
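For reference, here is a minimal StatefulSet sketch (the name, image, and storage size are placeholders, not from the question) showing the stable per-replica identity and per-replica storage that make scaling stateful workloads more predictable:

```yaml
# Minimal StatefulSet sketch: each replica gets a stable name
# (my-stateful-app-0, my-stateful-app-1, ...) and its own PersistentVolumeClaim,
# so scaling out and in does not mix up per-replica state.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-stateful-app
spec:
  serviceName: my-stateful-app        # headless Service providing stable DNS per pod
  replicas: 2
  selector:
    matchLabels:
      app: my-stateful-app
  template:
    metadata:
      labels:
        app: my-stateful-app
    spec:
      containers:
        - name: app
          image: my-stateful-app:1.0  # placeholder image
          volumeMounts:
            - name: data
              mountPath: /var/lib/app
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```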
Related
I am running a k8s cluster on GKE.
It has 4 node pools with different configurations:
Node pool 1 (single node, cordoned)
Running Redis & RabbitMQ
Node pool 2 (single node, cordoned)
Running Monitoring & Prometheus
Node pool 3 (single large node)
Application pods
Node pool 4 (single node with auto-scaling enabled)
Application pods
Currently I am running a single replica of each service on GKE,
however 3 replicas of the main service, which mostly manages everything.
When scaling this main service with HPA, I have sometimes seen the node crash or the kubelet restart frequently, with pods going into an Unknown state.
How to handle this scenario? If the node crashes, GKE takes time to auto-repair it, which causes service downtime.
Question 2:
Node pools 3 and 4 run the application pods. Inside the application there are 3-4 memory-intensive microservices, and I am thinking of using a node selector to pin them to one node,
while only the small node pool runs the main service, which has HPA, with node auto-scaling working for that node pool.
However, I feel like a node selector is not the best way to do this.
It's always best to run more than one replica of each service, but currently we are running only a single replica of each service, so please advise with that in mind.
As Patrick W rightly suggested in his comment:
if you have a single node, you leave yourself with a single point of failure. Also keep in mind that autoscaling takes time to kick in and is based on resource requests. If your node suffers OOM because of memory-intensive workloads, you need to readjust your memory requests and limits – Patrick W
You may need to redesign your infrastructure a bit so that you have more than a single node in every node pool, as well as readjust memory requests and limits.
You may want to take a look at the following sections in the official Kubernetes docs and the Google Cloud blog:
Managing Resources for Containers
Assign CPU Resources to Containers and Pods
Configure Default Memory Requests and Limits for a Namespace
Resource Quotas
Kubernetes best practices: Resource requests and limits
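As a minimal, hedged sketch (the Deployment name, image, and values below are made up for illustration), requests and limits are set per container in the pod spec; the scheduler places pods based on the requests, and the kubelet enforces the limits:

```yaml
# Sketch: memory/CPU requests and limits on a Deployment's container.
# The scheduler uses the requests to pick a node with enough capacity;
# exceeding the memory limit OOM-kills only this container instead of
# destabilizing the whole node.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: main-service              # placeholder name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: main-service
  template:
    metadata:
      labels:
        app: main-service
    spec:
      containers:
        - name: main-service
          image: main-service:1.0 # placeholder image
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
```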
How to handle this scenario? If the node crashes, GKE takes time to auto-repair it, which causes service downtime.
That's why having more than just one node in a single node pool can be a much better option. It greatly reduces the likelihood that you'll end up in the situation described above. The GKE auto-repair feature needs to take its time (usually a few minutes), and if this is your only node, you cannot do much about it and need to accept possible downtime.
Node pools 3 and 4 run the application pods. Inside the application there are 3-4 memory-intensive microservices, and I am thinking of using a node selector to pin them to one node, while only the small node pool runs the main service, which has HPA, with node auto-scaling working for that node pool. However, I feel like a node selector is not the best way to do this.
You may also take a look at node affinity and anti-affinity, as well as taints and tolerations.
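As a hedged sketch of what that could look like (the label key `pool`, the taint key `dedicated`, and all names are illustrative, not taken from your cluster):

```yaml
# Pin memory-intensive pods to a dedicated pool via node affinity, and keep
# other workloads off that pool with a taint this pod explicitly tolerates.
# (Assumes the nodes were labeled pool=memory-intensive and tainted with
#  something like: kubectl taint nodes <node> dedicated=memory:NoSchedule)
apiVersion: v1
kind: Pod
metadata:
  name: memory-heavy-service            # placeholder name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: pool
                operator: In
                values: ["memory-intensive"]
  tolerations:
    - key: dedicated
      operator: Equal
      value: memory
      effect: NoSchedule
  containers:
    - name: app
      image: memory-heavy-service:1.0   # placeholder image
```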
I can understand how this is helpful when scaling over multiple different machines.
But here we have just one single machine (or node). However, Docker still supports scaling a service to run multiple tasks (each served by one container) like this:
docker service scale serviceName=num_of_replicas
Let's take the example of running a Web API. I really don't see how scaling can help in this case. One machine hosting a Web API can only serve up to its maximum capacity; using multiple containers on it cannot increase that maximum. With the request-handling pipeline of a Web API, one server can handle multiple requests at the same time, independently, as long as the server has enough resources (CPU, RAM). So we don't need multiple (unnecessary) tasks in this case with Docker service scaling.
The only benefit I can see here is that Docker service scaling may provide better isolation between tasks (containers) compared with serving all requests from the same server (container).
Could you please let me know some other benefits of scaling a Docker service this way? Is there anything wrong with my assumption above?
Using multiple containers on it cannot increase that maximum.
That really depends on the implementation. Some inefficient implementations may use only a single process/thread/CPU, and scaling helps with their performance.
Another benefit: scaling on a single node also helps with high availability. There is always a small but nonzero chance of a non-recoverable error, an out-of-memory issue, etc., stopping a single container, causing downtime until the orchestration scheduler restarts it; with more than one replica, the remaining containers keep serving in the meantime.
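As a small sketch (service and image names are placeholders), the same `docker service scale` idea expressed declaratively in a Compose/stack file deployed with `docker stack deploy`, with a restart policy so a crashed task is replaced automatically:

```yaml
# docker-compose.yml sketch for swarm mode: several replicas of one service,
# even on a single node, restarted automatically if a task dies.
version: "3.8"
services:
  webapi:
    image: my-webapi:1.0        # placeholder image
    deploy:
      replicas: 3               # multiple tasks even on a single node
      restart_policy:
        condition: on-failure
    ports:
      - "8080:80"
```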
If you have a host machine with, say, 3 VMs (or Docker containers) running a different service each, what's the point of adding a replica of one of these VMs/containers on the same host machine, or when would you need to do so? If the host machine is under a lot of traffic, which will lead to problems with CPU utilization and memory, how will creating even more instances help?
Docker Swarm also allows users to create new instances of a running container without adding new nodes to the cluster. How can this possibly help?
When your traffic grows, you will want more instances of your containers. With orchestrators such as Kubernetes you can spread your instances across many hosts and make them accessible with a single address.
Your assumption that replicas are supposed to be on the same host is wrong.
The very idea of replicas is supposed to provide fault-tolerance, and thus they need to be on different hosts so that if one host goes down, your service is still available on the different node. [Think node clusters]
That said, there's nobody stopping you from creating the new instances on the same node, but that makes no sense and provides no added advantage of fault-tolerance.
Coming to the part where you ask: if the host machine is already under stress due to load, how will it help to spawn a new instance there?
Well, it won't. That is precisely why we spawn it on a different node in the cluster. And behind the same IP, Kubernetes/Docker Swarm makes sure to load-balance between them.
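A minimal sketch of that idea in Kubernetes (all names and the port are placeholders): a Deployment lets the scheduler spread replicas across nodes, and a Service gives them one stable address that load-balances between them.

```yaml
# Sketch: three replicas spread by the scheduler across available nodes,
# reachable under one stable Service address that load-balances between them.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service              # placeholder name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
    spec:
      containers:
        - name: app
          image: my-service:1.0 # placeholder image
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-service
  ports:
    - port: 80
      targetPort: 8080
```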
Container-level scaling increases fault-tolerance.
Node-level scaling increases throughput.
You might just run all services on 1 node (e.g. 1 VM) and, when it is overloaded, add another instance. This is fine for on-demand resources such as CPU and disk I/O, which are unused when idle, but each service also adds some fixed overhead, like RAM or database connections, so you waste that overhead when scaling up everything rather than just the containers that need it.
By being able to scale both at the container level and at the node level, you can add resources to your cluster (RAM, CPU) but only allocate them to the services that need them.
So the scaling should be something like this:
Node 1:
service A
service B
service C
Node 2:
service A
service B
Node 3:
service B
Running duplicates of a service across nodes additionally helps with fault tolerance.
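For instance, the layout above could be declared as per-service replica counts and reservations in a stack file (all names and values are illustrative); the orchestrator then spreads the tasks across the available nodes:

```yaml
# Stack file sketch matching the layout above: the scheduler spreads
# 2 replicas of A, 3 of B and 1 of C across the nodes, reserving only
# the resources each service actually needs.
version: "3.8"
services:
  service-a:
    image: service-a:1.0        # placeholder image
    deploy:
      replicas: 2
      resources:
        reservations:
          memory: 256M
  service-b:
    image: service-b:1.0        # placeholder image
    deploy:
      replicas: 3
      resources:
        reservations:
          memory: 128M
  service-c:
    image: service-c:1.0        # placeholder image
    deploy:
      replicas: 1
```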
Our cloud application consists of 3 tightly coupled Docker containers: Nginx, Web and Mongo. Currently we run these containers on a single machine. However, as our users are increasing, we are looking for a solution to scale. Using Kubernetes we would form a multi-container pod. If we are to replicate, we need to replicate all 3 containers as a unit. Our cloud application is consumed by mobile app users. Our app can only handle approx. 30,000 users per worker node, and we intend to place a single pod on a single worker node. Once a mobile device is connected to a worker node it must continue to use only that machine (unique IP address).
We plan on using Kubernetes to manage the containers. Load balancing doesn't work for our use case, as a mobile device needs to be tied to a single machine once assigned, and each pod works independently with its own persistent volume. However, we need a way of spinning up new pods on worker nodes if the number of users goes over 30,000, and so on.
The idea is that we have some sort of custom scheduler which assigns a mobile device a worker node (domain/IP address) depending on the number of users on that node.
Is Kubernetes a good fit for this design, and how could we implement a custom pod-scaling algorithm?
Thanks
Piggy-Backing on the answer of Jonah Benton:
While this is technically possible - your problem is not with Kubernetes it's with your Application! Let me point you the problem:
Our cloud application consists of 3 tightly coupled Docker containers, Nginx, Web, and Mongo.
Here is your first problem: if you can only deploy these three containers together and not independently, you cannot scale one without the others!
While MongoDB can be scaled to insane loads, if it's bundled with your web server and web application it won't be able to scale...
So the first step for you is to break up these three components so they can be managed independently of each other. Next:
Currently we run these containers on a single machine.
While not strictly a problem, I have serious doubts about what it would mean to scale your application and what challenges would come with that scalability!
Once a mobile device is connected to a worker node it must continue to use only that machine (unique IP address).
Now, this IS a problem. You're looking to run an application on Kubernetes, but I don't think you understand the consequences of doing that: Kubernetes orchestrates your resources. This means it will move pods (by killing and recreating them) between nodes (and, if necessary, onto the same node). It does this fully autonomously (which is awesome and gives you a good night's sleep). If you're relying on clients sticking to a single node's IP, you're going to get up in the middle of the night because Kubernetes tried to correct for a node failure and moved your pod, which is now gone, and your users can't connect anymore. You need to leverage the load-balancing features (Services) in Kubernetes. Only they are able to handle the dynamic changes that happen in Kubernetes clusters.
Using Kubernetes we would form a multi container pod.
And we have another winner - No! You're trying to treat Kubernetes as if it were your on-premises infrastructure! If you keep doing so, you're going to fail and curse Kubernetes in the process!
Now that I've told you some of the things you're getting wrong, what kind of person would I be if I did not offer some advice on how to make this work:
In Kubernetes your three applications should not run in one pod! They should run in separate pods:
your web server's work should be done by an Ingress, and since you're already familiar with nginx, this is probably the ingress controller you are looking for!
Your web application should be a simple Deployment, exposed to the Ingress through a Service
your database should be a separate deployment, which you can either manage manually through a StatefulSet or (more advanced) through an operator, and also expose to the web application through a Service (see the sketch after this list for how the pieces fit together)
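A hedged sketch of how the ingress piece could look (the names and host are placeholders; the web Deployment/Service and the Mongo StatefulSet would be defined separately, as described above):

```yaml
# Sketch: an Ingress routes external traffic to the web application's
# Service, which load-balances over the web Deployment's pods.
# The database runs separately (StatefulSet or operator) behind its own Service.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress                  # placeholder name
spec:
  ingressClassName: nginx            # assumes the nginx ingress controller is installed
  rules:
    - host: app.example.com          # placeholder host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web            # Service in front of the web Deployment
                port:
                  number: 80
```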
Feel free to ask if you have any more questions!
Building a custom scheduler and running multiple schedulers at the same time is supported:
https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/
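For context, a pod opts into a particular scheduler simply by naming it in its spec; here is a minimal sketch (the scheduler name is a placeholder):

```yaml
# Sketch: this pod is ignored by the default scheduler and only placed
# by whichever scheduler registered itself as "my-custom-scheduler".
apiVersion: v1
kind: Pod
metadata:
  name: custom-scheduled-pod           # placeholder name
spec:
  schedulerName: my-custom-scheduler   # placeholder custom scheduler name
  containers:
    - name: app
      image: nginx:1.25
```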
That said, to the question of whether Kubernetes is a good fit for this design, my answer is: not really.
K8s can be difficult to operate, with the payoff being the level of automation and resiliency that it provides out of the box for whole classes of workloads.
This workload is not one of those. In order to gain any benefit you would have to write a scheduler to handle the edge failure and error cases this application has (what happens when you lose a node for a short period of time...) in a way that makes sense for k8s. And you would have to come up to speed with normal k8s operations.
With the information provided, I'm hard-pressed to see why one would use k8s for this workload over just running Docker on some VMs and scripting some of the automation.
I have been looking into the new Docker Swarm mode that will be available in Docker 1.12. In this Docker Swarm Mode Walkthrough video, they create a simple Nginx service that is composed of a single Nginx container. In the video, they have 4 nodes in the Swarm cluster. During the scaling demonstration, they increase the replication factor to 10, thus creating 10 copies of the Nginx container across all 4 machines in the cluster.
I get that the video is just a demonstration, but in the real world, what is the point of creating more replicas of a container (or service) than there are nodes in the Swarm cluster? It seems pointless, since two containers on the same machine would be sharing that machine's finite computing resources anyway. I don't get what the benefit is.
So my question is, is there any real world benefit to replicating a Docker service or container beyond the number of nodes in the Swarm cluster?
Thanks
It depends on how the application handles threading and multiple requests. A single threaded application, or job that only handles one request at a time, may use a fraction of the OS resources and benefit from running multiple instances on a single host. An application that's been tuned to process requests concurrently and which fully utilizes the OS will see no benefit and will in fact incur a penalty of taking away resources to run multiple instances of the application.
One advantage can be performing live, zero-downtime software updates. See the Docker 1.12rc2 Swarm tutorial on rolling updates.
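As a small, hedged sketch (image and numbers are illustrative), extra replicas give a rolling update something to rotate through, so some tasks keep serving while others are being replaced:

```yaml
# Stack file sketch: 10 replicas of one service with rolling-update settings,
# so bumping the image tag and redeploying replaces tasks a few at a time.
version: "3.8"
services:
  nginx:
    image: nginx:1.25           # change this tag and redeploy to roll the update
    deploy:
      replicas: 10
      update_config:
        parallelism: 2          # update two tasks at a time
        delay: 10s              # wait between batches
        order: start-first      # start the new task before stopping the old one
    ports:
      - "80:80"
```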
Say you have RabbitMQ or another queue system under a high data load. You can start more worker containers than you have nodes to work through the load on your RabbitMQ.
Hardware resource constraints are not the only thing to consider when you have your services replicated.
A simple example would be a service that provides security details. The resource consumption of this service will be low (read a record from the DB/cache and send it out). However, if there are 20 or 30 requests to be handled by the same single instance, the requests will queue up.
Yes, there are better ways to implement my example, but I believe it is good enough to illustrate why one might replicate a service on the same host/node.