Horizontally Scaling a web based application - scalability

I have a web application which runs 3 services simultaneously: "A", "B" and "C" and they're sharing the same data from an external database.
"A" may be using more resources than "B" and "C", and might need its own dedicated machine.
I'm thinking of scaling the application horizontally (by deploying new machines instead of upgrading the current servers' configuration) in the following way:
measure the load on the server at one-minute intervals
if the load is over 90% for more than 30 minutes, deploy a new machine running a new instance of service "A", "B" or "C", depending on which has the highest load.
if the load of the currently monitored machine is lower than 10% and it's not the only machine running, shut down the machine.
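The rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `MachineMonitor`, `decide` and `busiest_service` are made up for the example, load is a fraction between 0 and 1, one sample arrives per minute, and "more than 30 minutes" is approximated as 30 consecutive samples above the threshold. It does not start or stop any machines; it only returns the decision.

```python
from collections import deque
from dataclasses import dataclass, field

SCALE_UP_LOAD = 0.90    # "over 90%"
SCALE_DOWN_LOAD = 0.10  # "lower than 10%"
WINDOW_MINUTES = 30     # one load sample per minute


@dataclass
class MachineMonitor:
    """Keeps the most recent one-minute load samples for a single machine."""
    samples: deque = field(default_factory=lambda: deque(maxlen=WINDOW_MINUTES))

    def record(self, load: float) -> None:
        # The deque drops the oldest sample once WINDOW_MINUTES are stored.
        self.samples.append(load)

    def decide(self, machines_running: int) -> str:
        """Return 'scale_up', 'shut_down' or 'hold' for this machine."""
        window_full = len(self.samples) == WINDOW_MINUTES
        if window_full and all(s > SCALE_UP_LOAD for s in self.samples):
            return "scale_up"
        # The question gives no duration for the low-load rule,
        # so this sketch only looks at the latest sample.
        if machines_running > 1 and self.samples and self.samples[-1] < SCALE_DOWN_LOAD:
            return "shut_down"
        return "hold"


def busiest_service(load_by_service: dict[str, float]) -> str:
    """Pick which of the services should get the new instance."""
    return max(load_by_service, key=load_by_service.get)
```

In practice the shut-down rule would probably also need a time window, otherwise a single quiet minute could remove a machine that was just added.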
Is there any book or recommended website on this topic as well?
How about software tools to help? (More specifically, I might consider Amazon EC2.)
Many thanks,
Vlad

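Amazon's Elastic Load Balancer seems to be exactly what you need. It distributes traffic across your instances if they are running in EC2 and, combined with Auto Scaling, lets you add or remove instances as load changes. It supports HTTP/HTTPS as well as other, non-web TCP services.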
RightScale would also be an option. It also supports other clouds in addition to EC2.

Related

Does it make sense to cluster Node.js (in order to take advantage of multiple CPUs) if it will be deployed with an orchestration tool like Kubernetes?

Right now I am struggling to debug a Node.js application which is clustered and running on Docker. I found the following on this link:
Remember, Node.js is still single-threaded in most cases, so even on a
single server you’ll likely want to spin up multiple container
replicas to take advantage of multiple CPU’s
So does this mean that clustering a Node.js app is pointless when it is meant to be deployed on Kubernetes?
EDIT: I should also say that by clustering I mean forking workers with cluster.fork(), and that the goal of the application is to provide a simple REST API under high traffic load.
Short answer: yes, clustering is pointless in that case.
Containers behave like mini VMs, and Kubernetes is the orchestration tool that manages all the running containers: checking health, resource allocation, load, etc.
So, if you are running your Node application in a container with an orchestration tool like Kubernetes, then clustering is moot, as each container will be using 1 CPU or a fraction of a CPU, depending on how you have it configured. Adding more containers essentially just places another mini VM in rotation, and Kubernetes will direct traffic to each.
Clustering Node really comes into play when using tools like PM2. Say you have a beefy server with 8 CPUs: Node can only use 1 per instance, so tools like PM2 set up a cluster and route traffic across the running instances.
One thing to keep in mind, though, is that your application needs to be cluster or container ready. That means nothing should be stored on the ephemeral disk: with each container restart that data is lost, and in a cluster there is no guarantee that the folders will be available to each running instance. If you cluster across multiple servers you are asking for trouble :D (this is where an object store like S3 comes into play).

How to use Kubernetes effectively for 2 distant nodes

I have wanted to move all of my operations over to K8S for a long time, but I am still hesitant to do so. This question will likely be broad, but bear with me. Let me first describe the existing system.
I host a lot of different websites (>30). Many of them are for my own experimentation, but some are for actual clients. I have 1 VM in New York (I'm using DigitalOcean) with multiple Docker containers, often managed using docker-compose. There is 1 container for every site. A request first comes in to the front container, which runs HAProxy. This strips away SSL, then forwards the request to 2 proxy containers running Nginx. These 2 containers then forward the request to the other containers for their service. All of my certificates come from LetsEncrypt and have to be renewed every 3 months. To do so, I stop front and run certbot --apache so it binds to port 80. It gets the certificates, then I stop apache and recreate the front container.
There are several reasons why I do it this way:
I change site configs a lot, as well as how all of them are wired together. So front is expected to run forever, unless I'm getting certificates, and the proxies are expected to change a lot. I change the proxy image, then stop and recreate the 1st container, then stop and recreate the 2nd container, so that there is no downtime at all.
I really don't know how to get certificates when there are multiple nodes. In fact, I'm a total noob at the whole certificate thing, and LetsEncrypt is pretty much the only way I know of to do this.
I want to directly edit files on the remote server. I have a bad practice of editing production code directly, mainly because I get impatient with setting up dev, staging and production environments. It takes too much time, and the gains feel small. My clients are typically small businesses with <10 employees, and they regularly want aesthetic changes to their websites. I can have a video call with them, they tell me exactly what they want, I code that in, it gets uploaded to the server immediately, and they see the changes right away. Then they can critique the design, and we can iterate back and forth. If I were to set up different environments, they couldn't see it right away, and there would be this long process of committing to git, deploying to staging, then to production. This takes a long time, and I don't think it is justified.
I realize that my systems are not that well maintained. Images are not getting security updates, and I don't know whether they are still running unless I check them manually, which is tedious, so I don't do it at all. Furthermore, I have an Asian background, which means I have clients in both the US and Asia, pretty much as far from each other as possible, and that increases latency a lot. A client in Asia has to wait around 1-2 seconds for a page to actually load, which feels like an eternity. I also moved to Asia in the past week, so accessing the New York server via ssh is now incredibly slow, and my productivity just plummets. So now might be the best time to revamp everything and move to K8S once and for all. However, there are major problems in the planning process, and currently K8S seems to lack a lot of things that are deal breakers for me. So please criticize my plans, and improve them however you see fit.
What I plan to do now is this:
There will be 2 servers, 1 in New York and 1 in Singapore. These 2 servers will have 2 different IP addresses. Both will be running K8S Pods. Preferably, they should have exactly the same configs, website containers, database containers, etc. Then, for each website's DNS record, I will modify the A and AAAA records so that they contain the 2 IP addresses of the 2 servers.
My questions are:
Will DNS always route to Singapore if the user is in China, and always route to New York if the user is in England?
How do I actually get certificates for 2 nodes? My understanding is that when certbot issues a certificate, it associates the domain name with the node's IP address. That would mean 2 nodes can't have the same certificate for the same domain name. Is this correct? If you can get certificates for 2 nodes, how do you do that?
How do I keep files in sync between servers? Say I edit the file tree on the Singapore server; I want that file to also be modified in New York several seconds later. For databases, I can have a master database in either Singapore or New York, then have slave databases at both locations that update whenever the master updates, so the slaves can serve as a low-latency database for each server.
How do I actually route requests from the servers to the containers inside? I initially planned to use NodePort to direct requests to the front Pods, which would then distribute requests to the other Pods, but I was heartbroken to learn that NodePort can't attach to ports below 30000. The only other option that I am aware of is an external load balancing service that directs traffic to the 2 servers. But that costs about $15/site/month, and because I have >30 sites, doing so would bankrupt me. I could also have 4 servers in total: 2 for the K8S cluster, and 2 serving as load balancers that forward to NodePort. Will this plan work? How would automatic renewal of certificates even work here?
Please note that maybe my questions are the wrong questions to ask (for example, maybe I shouldn't use A and AAAA records for directing traffic) and there's a different way to do this entirely, so feel free to ask the right questions.
I read your question. Hats off for writing the whole thing down, though half of it is not needed.
Answers to your questions:
Can you add multiple entries for the same name in DNS? Is it possible to have example.com with several A records?
You might need to set up a regional K8s cluster with regional ingress support. You can use cert-manager with Let's Encrypt, which will manage your certificates at the LB level and terminate SSL at the front.
If you are planning to use two VMs, put one LB in front of both and set up SSL there.
If you are using K8s with stateless Pods, editing files directly inside a container is not an option. It is better to push your changes to GitHub and have the containers deployed to both clusters at the same time; you can set up CI/CD for that. You are right about the database server setup: with the master/slave concept you can use read replicas.
To route traffic from the server to the internal application in K8s, you can use an internal LB or expose services with node ports (above 30000, but change the target port in the SVC); the target port lets you redirect requests to a specific port.
Still, I do not understand this part: "I can also have 4 servers in total, 2 for the K8S cluster, and 2 serves as a load balancer that will forward to NodePort. Will this plan works? How will automatic renewing of certificates even work here?" Which servers would be in front and which ones in the backend?
If all your services are websites (run over HTTP), you could use a k8s ingress to route traffic to pods based on the Host header (domain name) and use only one LB with one IP address. The most popular ingress controller seems to be the Nginx Ingress Controller.
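To make the idea of Host-based routing concrete, here is a minimal sketch of the lookup an ingress performs conceptually. It is not how the Nginx Ingress Controller is implemented or configured; the domain names, backend addresses and the `pick_backend` function are hypothetical and exist only for this example.

```python
# Hypothetical routing table: domain name -> backend service address.
ROUTES = {
    "shop.example.com": "shop-service:8080",
    "blog.example.com": "blog-service:8080",
}
# Hypothetical fallback for unknown domains.
DEFAULT_BACKEND = "default-backend:8080"


def pick_backend(host_header: str) -> str:
    """Choose a backend from the HTTP Host header.

    Strips an optional ":port" suffix and ignores case. IPv6 literals
    are not handled in this sketch.
    """
    host = host_header.split(":", 1)[0].strip().lower()
    return ROUTES.get(host, DEFAULT_BACKEND)
```

Because the decision depends only on the Host header, all 30+ sites can share one entry point and one IP address, which is the point made above.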
If you don't want to use an LB, you can use hostPort to expose the nginx ingress, but as soon as you have a k8s cluster with more than one node, use an LB, because hostPort is generally not advised unless you have a very good reason to use it.
Speaking of DNS, you can use something like AWS Route 53 routing policies for location-based routing. You don't necessarily need to use AWS. I just want to show you that there are solutions to this problem, so use whatever you like.
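Conceptually, location-based routing boils down to answering the same DNS name with a different address depending on where the client is. The sketch below only illustrates that lookup; it is not the Route 53 API. The server names, the addresses (taken from the 203.0.113.0/24 documentation range), the continent mapping and the `resolve` function are all assumptions made for this example.

```python
# Hypothetical servers; addresses come from the documentation range 203.0.113.0/24.
SERVERS = {
    "new-york": "203.0.113.10",
    "singapore": "203.0.113.20",
}

# Assumed mapping from the client's continent code to the closer server.
CONTINENT_TO_SERVER = {
    "NA": "new-york",
    "SA": "new-york",
    "EU": "new-york",
    "AF": "new-york",
    "AS": "singapore",
    "OC": "singapore",
}
DEFAULT_SERVER = "new-york"  # used when the client's location is unknown


def resolve(client_continent: str) -> str:
    """Return the IP address a location-aware DNS service would answer with."""
    server = CONTINENT_TO_SERVER.get(client_continent.upper(), DEFAULT_SERVER)
    return SERVERS[server]
```

Plain A records listing both addresses do not make this kind of decision on their own, which is why a DNS provider with location-aware routing is suggested here.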
For certificates, use cert-manager with the DNS-01 challenge.
From the Let's Encrypt docs about the DNS-01 challenge:
It works well even if you have multiple web servers.
cert-manager will also handle certificate renewal for you.
About keeping files in sync between servers: it depends on the files, but for static content it might be best to use a CDN that replicates content from one source to other locations.
For simultaneous deploys to 2 separate clusters, you can use a CI/CD pipeline such as GitHub Actions.

Kubernetes scaling pods using custom algorithm

Our cloud application consists of 3 tightly coupled Docker containers: Nginx, Web and Mongo. Currently we run these containers on a single machine. However, as our user base is growing, we are looking for a way to scale. Using Kubernetes we would form a multi-container pod. If we replicate, we need to replicate all 3 containers as a unit. Our cloud application is consumed by mobile app users. Our app can only handle approximately 30000 users per worker node, and we intend to place a single pod on each worker node. Once a mobile device is connected to a worker node, it must continue to use only that machine (unique IP address).
We plan on using Kubernetes to manage the containers. Load balancing doesn't work for our use case, as a mobile device needs to be tied to a single machine once assigned, and each Pod works independently with its own persistent volume. However, we need a way of spinning up new Pods on worker nodes when the number of users goes over 30000, and so on.
The idea is to have some sort of custom scheduler which assigns a mobile device to a worker node (domain/IP address) depending on the number of users on that node.
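The assignment logic described here could look roughly like the sketch below. It is only an illustration of the idea, not a Kubernetes scheduler: the class `NodeAssigner`, its methods and the node names are hypothetical, state is kept in memory, and provisioning a new worker node is left to the caller.

```python
from typing import Optional

NODE_CAPACITY = 30_000  # approximate users per worker node, from the question


class NodeAssigner:
    """Assigns each device to one worker node and keeps that assignment sticky."""

    def __init__(self, nodes: list[str], capacity: int = NODE_CAPACITY):
        self.capacity = capacity
        self.users_per_node = {node: 0 for node in nodes}
        self.assignment: dict[str, str] = {}  # device id -> node

    def assign(self, device_id: str) -> Optional[str]:
        # A device that already has a node always gets the same one back.
        if device_id in self.assignment:
            return self.assignment[device_id]
        # Otherwise use the first node that still has room.
        for node, count in self.users_per_node.items():
            if count < self.capacity:
                self.users_per_node[node] += 1
                self.assignment[device_id] = node
                return node
        # Every node is full: the caller should provision a new node first.
        return None

    def add_node(self, node: str) -> None:
        """Register a newly provisioned worker node with zero users."""
        self.users_per_node.setdefault(node, 0)
```

A real version would need to persist the assignments and decide what happens to a device when its node goes away.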
Is Kubernetes a good fit for this design, and how could we implement a custom pod scaling algorithm?
Thanks
Piggy-Backing on the answer of Jonah Benton:
While this is technically possible - your problem is not with Kubernetes, it's with your application! Let me point out the problems:
Our cloud application consists of 3 tightly coupled Docker containers, Nginx, Web, and Mongo.
Here is your first problem: if you can only deploy these three containers together and not independently, you cannot scale one without the others!
While MongoDB can be scaled to insane loads - if it's bundled with your web server and web application it won't be able to...
So the first step for you is to break up these three components so they can be managed independently of each other. Next:
Currently we run these containers on a single machine.
While not strictly a problem - I seriously doubt you know what it would mean to scale your application and what challenges come with scalability!
Once a mobile device is connected to worker node it must continue to only use that machine ( unique IP address )
Now, this IS a problem. You're looking to run an application on Kubernetes, but I do not think you understand the consequences of doing that: Kubernetes orchestrates your resources. This means it will move pods (by killing and recreating them) between nodes (and, if necessary, recreate them on the same node). It does this fully autonomously (which is awesome and gives you a good night's sleep). If you're relying on clients sticking to a single node's IP, you're going to get up in the middle of the night because Kubernetes tried to correct for a node failure and moved your pod, which is now gone, and your users can't connect anymore. You need to leverage the load-balancing features (Services) in Kubernetes. Only they are able to handle the dynamic changes that happen in Kubernetes clusters.
Using Kubernetes we would form a multi container pod.
And we have another winner - No! You're trying to treat Kubernetes as if it were your on-premise infrastructure! If you keep doing so you're going to fail and curse Kubernetes in the process!
Now that I have told you some of the things you're getting wrong - what kind of person would I be if I did not offer some advice on how to make this work:
In Kubernetes your three applications should not run in one pod! They should run in separate pods:
Your web server's work should be done by an Ingress, and since you're already familiar with nginx, this is probably the ingress you are looking for!
Your web application should be a simple Deployment, exposed to the ingress through a Service.
Your database should be a separate deployment, which you can either set up manually through a StatefulSet or (more advanced) through an operator, and which is also exposed to the web application through a Service.
Feel free to ask if you have any more questions!
Building a custom scheduler and running multiple schedulers at the same time is supported:
https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/
That said, to the question of whether Kubernetes is a good fit for this design, my answer is: not really.
K8s can be difficult to operate, with the payoff being the level of automation and resiliency that it provides out of the box for whole classes of workloads.
This workload is not one of those. In order to gain any benefit you would have to write a scheduler that handles this application's edge, failure and error cases (what happens when you lose a node for a short period of time...) in a way that makes sense for k8s. And you would have to come up to speed with normal k8s operations.
With the information provided, I am hard pressed to see why one would use k8s for this workload over just running Docker on some VMs and scripting some of the automation.

Valid CoreOS multi tenancy scenario?

I'm currently tinkering with a scenario for using CoreOS. It's probably not the 1st class use case. But I'd like to get a pointer if it's valid though. As I'm really at the beginning of getting a grip on CoreOS I hope that my "use case" is not totally off.
Imagine a multi-tenant application where every tenant should get its own runtime environment. Let's take a web app running on Node.js with PostgreSQL for data storage as given. Each tenant environment would be running on CoreOS in its respective containers. Data persistence is left out for now. For me it's currently more about the general feasibility.
So why CoreOS?
Currently I try to stick with the idea of separated environments per tenant. To optimise the density of DB and web server instances per hardware host I thought CoreOS might be the right choice instead of "classic" virtualisation.
Another reason is that a lot of tenants might not need more than a single, smallish DB instance and a single, smallish web server. But there might be other tenants that need some constantly scaled out deployments. Others might need a temporary scale out during burst times. CoreOS sounds like a good fit here as well.
On the other side, there must be a scalable messaging infrastructure (RabbitMQ) behind it that will handle a lot of messages. This infrastructure will be used by all tenants and ideally needs to be dynamically scalable. There will probably be a "to be scaled" Elasticsearch infrastructure as well. Viewed through my current "CoreOS for everything goggles", this seems a good fit as well.
In case this whole scenario is generally valid, I currently cannot see how it would be possible to route the traffic for a generally available website to the different tenant containers.
Imagine the app is running at app.greatthing.tld. A user can log in and should be presented with the app served for their tenant. Is this something socketplane and/or flannel are there to solve? Or what would a solution look like to get the tenant served by the right containers? I think it's kind of a general issue. But at least in the context of a CoreOS containerized environment I cannot see how to deal with this at all.
CoreOS takes care of scheduling your containers in the cluster with its own tools such as fleetctl/etcd/systemd, and also takes care of persistent storage when rescheduled to a different container using flocker (experimental). They have their own load balancers.

openstack overkill for HA website stack?

Some background:
I'm building a pretty involved website (as far as the stack used is concerned). The components, among some other smaller stuff, include:
Elasticsearch
Redis
ZeroMQ
Couchbase
RethinkDB
traffic through Nginx -> Node
The intention is to have a highly available website running while staying pretty lean (and low cost) at the same time.
Current topology I'm considering:
2 webservers in active/active config with DNS load balancing (Nginx, static asset serving, etc.) + load balancing to the second tier:
2 appservers in active/active. Most of the components, like Elasticsearch, can do sharding/replication themselves, so this should not be that hard to set up (fingers crossed)
session handling in replicated Redis
Naturally I want monitoring and alerting when something is wrong, and ideally the system should be able to handle failures automatically. Stuff like: promote Redis from slave to master, or even initialize a new EC2 instance, if I were to be on EC2, that is.
However, I want to be free from a particular hosting provider, which I believe (please correct me if I'm wrong) is where OpenStack comes in.
Is it correct that:
- OpenStack allows me to control the entire lifecycle of my website stack (covering multiple boxes / virtual machines)?
- it allows me (with work on config, of course) to spin up instances, monitor, alert when something goes wrong, take appropriate actions in those scenarios, etc.?
Or is OpenStack just entirely the wrong tool for the job? Is there anything else that would fit better as a sort of "management layer" on top of my entire website?
Thanks
OpenStack isn't VMware ESX. It's not a very good straight-up, simple virtual machine hosting environment. If what you want is a way to easily manage virtual machines, I might suggest Ganeti. It even has HA failover of virtual machines. In a two-physical-host environment, this is probably the way to go.
What OpenStack gives you that Ganeti won't is RESTful APIs. It has AWS-compatible APIs, but it also has OpenStack APIs that are even better. If you want to automate elasticity or healability, this is huge. Being able to link up in Python using existing client APIs and just write scripts that spin up instances as needed is something joe DevOps is all about.
So I guess it comes down to what your level of commitment is and what you need. For 2 physical machines, OpenStack probably isn't the best solution. But down the line, when you've got more apps and more VMs than you can manage manually, OpenStack will be there to help you write code that makes your datacenter dance to your melodic tunes.

Resources