Should an external load balancer route to Swarm managers, workers, or both?

I have an architecture of microservices running in a Docker Swarm stack.
My swarm is composed of:
3 Managers (manager only)
3 workers
I have an external load balancer with a single public IP that dispatches requests to the nodes of my stack.
I'm wondering if my external load balancer should route traffic only to the managers, only to the workers, or to all nodes.
I didn't find a direct answer to this question in the Swarm documentation, but I think it is better to route traffic to workers only in order to save the managers' resources. Is that the right way to do it?

yes. no. maybe.
If you have only one vip, you probably want it to point to the managers, because you want HA access to the managers to manage the swarm.
For example, on my internal swarm, "swarm.example.com" is a VIP that points to the managers. My CI/CD pipelines use it as their target when doing docker stack deploy operations, which means I can perform node maintenance without breaking pipeline deployments. "*.swarm.example.com" is also, for convenience, a CNAME to swarm.example.com, so all my HTTP (and other) ingress arrives on the managers, which is where I deploy traefik (which needs access to the manager API via /var/run/docker.sock) for ingress routing to services.
Now, a more sophisticated setup would use separate VIP pools for the control plane and for ingress routing, and having traefik on the manager nodes is itself a security concern, but that speaks to a much larger deployment with stricter security requirements than an on-prem swarm running CI/CD for devs.
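For reference, a minimal sketch of what that manager-pinned traefik deployment might look like as a stack file, assuming a traefik v2-style configuration and an overlay network named ingress-net (the image tag, network name, and ports are illustrative):

```yaml
# stack.yml -- illustrative sketch, not a drop-in config
version: "3.8"

services:
  traefik:
    image: traefik:v2.11
    command:
      - --providers.docker=true
      - --providers.docker.swarmMode=true        # read services from the Swarm API
      - --providers.docker.exposedbydefault=false
      - --entrypoints.web.address=:80
    ports:
      - "80:80"
    volumes:
      # traefik needs the manager/engine API, hence the socket mount
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - ingress-net
    deploy:
      placement:
        constraints:
          - node.role == manager   # keep the proxy on the managers behind swarm.example.com

networks:
  ingress-net:
    driver: overlay
```

A pipeline would then deploy it with docker stack deploy -c stack.yml ingress against the swarm.example.com VIP.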

Related

Do replicas decrease traffic on a single-node Kubernetes cluster?

I am new to the world of kubernetes. I am trying to implement kubernetes advantages on my personal project.
I have an api service in a docker container which fetches data from back end.
I plan on creating multiple replicas of this API service container on a single external port in the Kubernetes cluster. Do replicas share traffic if they're on a single node?
My end goal is to create multiple instances of this API service to make my application faster (users can access one of the multiple API services, which should reduce traffic on a single instance).
Am I thinking right in terms of Kubernetes functionality?
You are right, the multiple replicas of your API service will share the load. In Kubernetes there is a concept of Services, which send traffic to the backend, in this case your API application running in the Pods. By default, the choice of backend is random. It also doesn't matter whether the Pods are running on a single node or on different nodes: traffic is distributed among all the Pods that match the Service's label selector.
This also makes your application highly available, because you will use a Deployment to specify the number of replicas, and whenever the number of available replicas is less than the desired number, Kubernetes will provision new Pods to meet the desired state.
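A minimal sketch of that pattern, with purely illustrative names and ports (an api Deployment with three replicas behind a NodePort Service):

```yaml
# Deployment keeps 3 replicas of the api container running
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: my-registry/api:1.0   # illustrative image name
          ports:
            - containerPort: 8080
---
# Service spreads traffic across all Pods carrying the app=api label,
# regardless of which node they run on
apiVersion: v1
kind: Service
metadata:
  name: api
spec:
  selector:
    app: api
  ports:
    - port: 80
      targetPort: 8080
  type: NodePort   # exposes the Service on a port of every node
```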
If you add multiple instances/replicas of your web server, they will share the load and you avoid a single point of failure.
However, to achieve this you will have to create and expose a Service. You communicate via the Service endpoint rather than using each Pod's IP directly.
A Service exposes an endpoint and load-balances across the backends behind it, usually distributing requests round robin.
Kubernetes manages Pods. Pods are wrappers around containers. Kubernetes can schedule multiple Pods on the same node (hardware) or across multiple nodes, depending on how you configure it. You can use Deployments to manage ReplicaSets, which in turn manage Pods.
It is usually recommended to avoid managing Pods directly. Pods can crash or stop abruptly; Kubernetes will create a new one for you automatically, based on the ReplicaSet's configured replica count.
Using Deployments you can also do rolling updates.
You can refer to Kubernetes Docs to read about this in detail.
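On the rolling-update point, the update behaviour is declared on the Deployment itself; a sketch of the relevant stanza (the values are illustrative):

```yaml
# Excerpt of a Deployment spec -- controls how Pods are replaced during an update
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # at most one Pod down at a time
      maxSurge: 1         # at most one extra Pod during the rollout
```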
Yes. It's called Braess's paradox.

Is it possible to run Kubernetes nodes on hosts that can be physically compromised?

Currently I am working on a project where we have a single trusted master server, and multiple untrusted (physically in an unsecured location) hosts (which are all replicas of each other in different physical locations).
We are using Ansible to automate the setup and configuration management; however, I am unhappy with how big a gap there is between our development and testing environments and production, as well as with the general complexity of configuring the network and the containers themselves.
I'm curious if Kubernetes would be a good option for orchestrating this? Basically, multiple unique copies of the same pod(s) on all untrusted hosts must be kept running, and communication should be restricted between the hosts, and only allowed between specific containers in the same host and specific containers between the hosts and the main server.
There's a little bit of a lack of info here. I'm going to make the following assumptions:
K8s nodes are untrusted
K8s masters are trusted
K8s nodes cannot communicate with each other
Containers on the same host can communicate with each other
Kubernetes operates on the model that:
all containers can communicate with all other containers without NAT
all nodes can communicate with all containers (and vice-versa) without NAT
the IP that a container sees itself as is the same IP that others see it as
Bearing this in mind, you're going to have some difficulty here doing what you want.
If you can change your physical network requirements, and ensure that all nodes can communicate with each other, you might be able to use Calico's Network Policy to segregate access at the pod level, but that depends entirely on your flexibility.
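As a sketch of what that pod-level segregation could look like with NetworkPolicy (all labels, the namespace, and the port below are illustrative assumptions, and a CNI that enforces NetworkPolicy, such as Calico, is required):

```yaml
# Pod-level segregation sketch: replica pods accept traffic only from
# the trusted control component, and nothing else
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: replica-isolation
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: replica           # the untrusted workload pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: control   # only the trusted control component may connect
      ports:
        - protocol: TCP
          port: 8443         # illustrative port
```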

Docker Swarm with external Load Balancer - how to get collective healthcheck

I'm hoping that there are some docker swarm experts out there who have configured a load balancer to front a docker swarm multi-node setup. In such a simplified architecture, if the load balancer needs to detect if a manager node is down and stop routing traffic to it, what is the "best practice" for that? Does Docker swarm provide a health endpoint (api) that can be tested for each manager node? I'm new to some of this and there doesn't seem to be a lot out there that describes what I'm looking for. Thanks in advance
There is the metrics endpoint of the engine, and then the engine API, but I don't think that's what you want an application load balancer to check.
What I see most people do is put a load balancer in front of the Swarm nodes that should handle incoming traffic for a specific app running as a service. Since that LB needs to know whether the containers are responding (not just whether the node's engine is healthy), it should hit the app's health endpoint and take nodes in and out of that app's pool based on the app's response.
This is how AWS ELB's work out of the box, for example.
If you had a service published on port 80 in the Swarm, you would set up your ELB to point at the nodes you want to handle incoming traffic and expect a healthy 2xx/3xx response from them. It will remove nodes from the pool if they return something else or don't respond.
Then you could use a full monitoring solution that checks node health and optionally responds to issues, for example by replacing nodes.
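To make the "healthy 2xx/3xx" check concrete, here is a sketch of an ALB target group in CloudFormation; the /health path, port, thresholds, and resource names are illustrative assumptions about the app:

```yaml
# CloudFormation excerpt: the target group that fronts the Swarm nodes
AppTargetGroup:
  Type: AWS::ElasticLoadBalancingV2::TargetGroup
  Properties:
    VpcId: !Ref SwarmVpc           # illustrative reference
    Protocol: HTTP
    Port: 80                       # the published Swarm service port
    HealthCheckPath: /health       # the app's health endpoint, not the engine's
    HealthCheckIntervalSeconds: 15
    HealthyThresholdCount: 2
    UnhealthyThresholdCount: 3
    Matcher:
      HttpCode: "200-399"          # treat 2xx/3xx as healthy, as described above
```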

Configuring AWS ECS cluster to be load-balanced and auto-scaling behind a DNS

I have a recently-Dockerized web app that I'm trying to get running in AWS ECS. I'm using Route 53 for the DNS.
Although I haven't set it up yet in Route 53, my plan is to create a DNS record for api.uat.myapp.example.com, and what I want is to have that domain name backed by a load-balanced, auto-scaling cluster of my containers living in ECS.
I'm in the ECS Container Network Configuration tab:
Press the "I believe!" button for a minute and let's pretend that I've already created the api.uat.myapp.example.com domain name in Route 53. What values/configs do I need to add here so that:
When remote clients try to connect to api.uat.myapp.example.com they get routed to a load-balanced container running in my ECS cluster?; and
That load-balanced ECS cluster is auto-scaling (once I figure out where I can configure auto-scaling properties, I'm sure I can figure out how to configure them!)
Your question is extremely broad. As such, this answer is likely too generic to be of much specific use, but hopefully it will point you in the right direction.
First things first
You're misunderstanding the configuration you've taken a screenshot of. This is the network configuration within your container. So, what is your container's hostname, what other containers does it link to, what DNS Server do you want it to use, or alternatively, what specific other hosts should it know about... these are all optional, but may be required for your specific setup.
This has nothing to do with how your application scales.
Load Balancing
First, you need to decide what type of load balancer to use for your application. You can then point Route 53 to that load balancer. Behind your load balancer will be your EC2 instances. For traffic to reach your containers, you need to make sure your EC2 ports map to the listening ports of your containers. ECS can help with this.
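A sketch of that wiring in CloudFormation, assuming a cluster, task definition, and ALB target group already exist (all resource names are illustrative):

```yaml
# CloudFormation excerpt: an ECS service registered with a load balancer
ApiService:
  Type: AWS::ECS::Service
  Properties:
    Cluster: !Ref UatCluster            # illustrative cluster reference
    TaskDefinition: !Ref ApiTaskDef
    DesiredCount: 2
    LoadBalancers:
      - ContainerName: api              # container name from the task definition
        ContainerPort: 8080             # the container's listening port
        TargetGroupArn: !Ref ApiTargetGroup
```

Route 53 then gets an alias record for api.uat.myapp.example.com pointing at the load balancer, not at individual instances.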
Autoscaling
This happens in multiple ways.
You can do application autoscaling within ECS by having an ECS Service spin up and tear down containers on your EC2 instances.
You can have your Auto Scaling group scale your EC2 instances up and down, but if you do, you'll need a way to automatically add those instances to your ECS cluster.
Multiple scaling strategies exist. You'll need to decide for yourself which one is the most appropriate for your application, based on what metrics are most important to your scaling decisions.
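For the first approach (scaling the ECS service itself), a sketch using Application Auto Scaling with a target-tracking policy on CPU; the cluster/service names, capacity bounds, and target value are illustrative:

```yaml
# CloudFormation excerpt: scale the ECS service between 2 and 10 tasks on CPU
ApiScalableTarget:
  Type: AWS::ApplicationAutoScaling::ScalableTarget
  Properties:
    ServiceNamespace: ecs
    ScalableDimension: ecs:service:DesiredCount
    ResourceId: service/uat-cluster/api-service   # illustrative cluster/service names
    MinCapacity: 2
    MaxCapacity: 10
    RoleARN: !GetAtt AutoScalingRole.Arn          # illustrative IAM role

ApiScalingPolicy:
  Type: AWS::ApplicationAutoScaling::ScalingPolicy
  Properties:
    PolicyName: api-cpu-target-tracking
    PolicyType: TargetTrackingScaling
    ScalingTargetId: !Ref ApiScalableTarget
    TargetTrackingScalingPolicyConfiguration:
      PredefinedMetricSpecification:
        PredefinedMetricType: ECSServiceAverageCPUUtilization
      TargetValue: 60.0      # aim for ~60% average CPU across tasks
```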

AutoScaling in Docker Containers

I have been looking into Docker containerization for a while now, but a few things are still confusing to me. I understand that containers are grouped into a cluster, and cluster management tools like Docker Swarm, DC/OS, Kubernetes or Rancher can be used to manage them. I have been testing out container cluster management with DC/OS and Kubernetes, but a few questions remain unanswered for me.
How does auto scaling in container level help us in production servers? How does the application serve traffic from multiple containers?
Suppose we have deployed a web application using containers and they have auto scaled. How does the traffic flow to the containers? How are the sessions managed?
What metrics are calculated for autoscaling containers?
Autoscaling in DC/OS (note: Mesosphere is the company, DC/OS the open source project) is described in detail in the docs. Essentially, it works the same as with Kubernetes: you can use either low-level metrics such as CPU utilization to decide when to increase the number of instances of an app, or higher-level signals such as app throughput, for example using the Microscaling approach.
Regarding your question about how the routing works (how requests are forwarded to an instance, that is, a single running container): you need a load balancer, and again DC/OS provides this out of the box. The options are detailed in the docs; essentially, HAProxy-based north-south load balancers or IPtables-based east-west (cluster-internal) load balancers.
Kubernetes has a concept called a Service. A Kubernetes Service is an abstraction which defines a logical set of Pods and a policy by which to access them. Kubernetes uses Services to serve traffic from multiple containers. You can read more about Services here.
AFAIK, sessions are managed outside Kubernetes, but client-IP-based session affinity can be selected by setting service.spec.sessionAffinity to "ClientIP". You can read more about Services and session affinity here.
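A sketch of that setting on a Service (the service name, ports, and timeout are illustrative):

```yaml
# Service excerpt: pin each client IP to the same backend Pod
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  sessionAffinity: ClientIP          # default is "None" (random backend per request)
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800          # affinity window, defaults to 3 hours
  ports:
    - port: 80
      targetPort: 8080
```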
Multiple metrics, such as CPU and memory, can be used for autoscaling containers. There is a good blog post you can read about when and how to autoscale.
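In Kubernetes terms, the CPU-based case typically looks like a HorizontalPodAutoscaler; a sketch, with the Deployment name, replica bounds, and threshold as illustrative values:

```yaml
# Scale the "web" Deployment between 2 and 10 replicas based on average CPU
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add replicas above ~70% average CPU
```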
