I'm hoping that there are some docker swarm experts out there who have configured a load balancer to front a docker swarm multi-node setup. In such a simplified architecture, if the load balancer needs to detect if a manager node is down and stop routing traffic to it, what is the "best practice" for that? Does Docker swarm provide a health endpoint (api) that can be tested for each manager node? I'm new to some of this and there doesn't seem to be a lot out there that describes what I'm looking for. Thanks in advance
There is the engine's metrics endpoint, and there is the Engine API, but I don't think either is what you want behind an application load balancer.
What I see most people do is put a load balancer in front of the Swarm nodes that should handle incoming traffic for the apps running in services. Since that LB needs to know whether the containers are responding (not just whether the node's engine is healthy), it should hit the app's health endpoint and take nodes in and out of that app's pool based on the app's response.
This is how AWS ELBs work out of the box, for example.
If you had a published service on port 80 in the Swarm, you would set up your ELB to point at the nodes you want to handle incoming traffic and have its health check expect a healthy response in the 200/300 range from those nodes. It will remove nodes from the pool if they return something else or don't respond.
Then you could use a full monitoring solution that checks node health and optionally responds to issues, for example by replacing nodes.
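As a rough illustration of the health-check loop described above, here is a minimal Python sketch of what a load balancer does on each check interval. The node names, the /health path and the function names are assumptions made up for this example, not something Swarm or ELB provides. Note that urllib follows redirects by default, so a 3xx is usually seen as the status of the final page.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List


def http_probe(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status for url, or 0 if the node does not answer."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the app answered, but with a 4xx/5xx
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, timeout, DNS failure, ...


def healthy_nodes(
    nodes: Iterable[str],
    path: str = "/health",  # hypothetical app health endpoint
    port: int = 80,
    probe: Callable[[str], int] = http_probe,
) -> List[str]:
    """Keep only the nodes whose app answers with a 2xx/3xx status."""
    pool = []
    for node in nodes:
        status = probe(f"http://{node}:{port}{path}")
        if 200 <= status < 400:
            pool.append(node)
    return pool
```

Calling healthy_nodes(["node1", "node2", "node3"]) on a schedule and routing only to the returned list is the same idea a managed load balancer implements for you.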
I've got a database running in a private network (say IP 1.2.3.4).
In my own computer, I can do these steps in order to access the database:
Start a Docker container using something like docker run --privileged --sysctl net.ipv4.ip_forward=1 ...
Get the container IP
Add a routing rule, such as ip route add 1.2.3.4/32 via $container_ip
And then I'm able to connect to the database as usual.
I wonder if there's a way to route traffic through a specific pod in Kubernetes for certain IPs in order to achieve the same results. We use GKE, by the way, I don't know if this helps in any way.
PS: I'm aware of the sidecar pattern, but I don't think this would be ideal for our use case, as our jobs are short-lived tasks, and we are not able to run multiple "gateway" containers at the same time.
I wonder if there's a way to route traffic through a specific pod in Kubernetes for certain IPs in order to achieve the same results. We use GKE, by the way, I don't know if this helps in any way.
You can start a GKE cluster in a fully private network like this, then run the application that needs to be fully private in that cluster. Access to the cluster is possible only when explicitly granted, much like the commands in your question, except that you now use the cloud platform's controls (e.g. service controls, a bastion host, etc.). There is then no need to "route traffic through a specific pod in Kubernetes for certain IPs". If you have to run everything in one cluster, a fully private cluster will likely not work for you; in that case you can use a network policy to control access to your database pod.
GKE doesn't support the use case you mentioned, #gabriel Milan.
What's your requirement? Do you need to know which IP the pod will use to reach the database so that you can open a firewall for it?
Replying here as the comments have limited character count
Unfortunately GKE doesn't support that use case.
However, you have a couple of options:
Option #1: Create a dedicated node pool with a couple of nodes and force the pods to be scheduled on those nodes using taints and tolerations [1]. Use the IP addresses of those nodes in your firewall.
Option #2: Install a service mesh such as Istio and use its egress gateway [2] to route traffic toward your on-prem system, forcing the gateways to be deployed on a specific set of nodes so that you have a known IP address. This is quite a complicated solution.
[1] https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
[2] https://istio.io/latest/docs/tasks/traffic-management/egress/egress-gateway/
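To make Option #1 a bit more concrete, here is a small Python sketch that builds a Pod manifest pinned to a dedicated node pool. The pool name, the taint key/value (dedicated=egress) and the image are hypothetical. The sketch assumes the pool's nodes were tainted with effect NoSchedule; the toleration lets the pod onto those nodes, while the nodeSelector (using the node-pool label that GKE sets on its nodes) keeps it off all the others.

```python
import json


def pinned_pod_manifest(name, image, pool,
                        taint_key="dedicated", taint_value="egress"):
    """Build a Pod manifest that may only run on the given GKE node pool.

    taint_key/taint_value are assumptions: they must match whatever taint
    was actually applied to the dedicated nodes.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Attract the pod to the dedicated pool.
            "nodeSelector": {"cloud.google.com/gke-nodepool": pool},
            # Allow the pod past the taint that repels everything else.
            "tolerations": [{
                "key": taint_key,
                "operator": "Equal",
                "value": taint_value,
                "effect": "NoSchedule",
            }],
            "containers": [{"name": name, "image": image}],
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON as well as YAML, e.g. piped to `kubectl apply -f -`.
    print(json.dumps(
        pinned_pod_manifest("db-job", "example/db-job:latest", "egress-pool"),
        indent=2))
```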
I would suggest using or creating a NAT gateway instead of using a container as the gateway.
Using a container or Istio is a good idea, but it has its own limitations: it is hard to implement and manage, and the gateway containers consume resources.
Ultimately, you want a single IP for your K8s cluster, so that requests go out from that IP instead of from the IP of whichever node the pod is scheduled on.
Here is the Terraform module for a GKE NAT gateway that you can use:
https://registry.terraform.io/modules/GoogleCloudPlatform/nat-gateway/google/latest/examples/gke-nat-gateway
The NAT gateway will forward all pod traffic through a single VM, and you can whitelist that VM's IP on the database.
After implementation, there will be a single egress point for your cluster.
GitHub Repo link - Click to deploy available GCP magic ;)
Not sure if this is a silly question. When the same app/service is running in multiple containers, how do the instances report themselves to zookeeper/etcd and identify themselves, so that load balancers know about the different instances and know who to talk to, where to probe and dispatch, etc.? Or would the service instances use some ID from the container to identify themselves?
Thanks in advance
To begin with, let me explain in a few sentences how it works:
The basic building block starts with the Pod, which is just a resource that can be created and destroyed on demand. Because a Pod can be moved or rescheduled to another Node, any internal IPs that this Pod is assigned can change over time.
If we were to connect to this Pod directly to access our application, the connection would break on the next re-deployment. To make a Pod reachable from external networks or clusters without relying on any internal IPs, we need another layer of abstraction. K8s offers that abstraction with what it calls a Service.
This way, you can create a website that will be identified, for example, by a load balancer.
Services provide network connectivity to Pods that work uniformly across clusters. Service discovery is the actual process of figuring out how to connect to a service.
You can also find some information about Service in the official documentation:
An abstract way to expose an application running on a set of Pods as a network service.
With Kubernetes you don't need to modify your application to use an unfamiliar service discovery mechanism. Kubernetes gives Pods their own IP addresses and a single DNS name for a set of Pods, and can load-balance across them.
Kubernetes supports 2 primary modes of finding a Service - environment variables and DNS. You can read more about this topic here and here.
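The two discovery modes can be sketched from the client's side in a few lines of Python. The service name my-web and the helper names are made up for illustration; the sketch assumes the default cluster domain cluster.local and relies on the Kubernetes convention that the service name is upper-cased with dashes turned into underscores for the environment variables.

```python
import os


def service_env_names(service_name):
    """Names of the env vars Kubernetes injects for a Service."""
    prefix = service_name.upper().replace("-", "_")
    return f"{prefix}_SERVICE_HOST", f"{prefix}_SERVICE_PORT"


def service_dns_name(service_name, namespace="default",
                     cluster_domain="cluster.local"):
    """Cluster-internal DNS name of a Service (cluster domain is assumed)."""
    return f"{service_name}.{namespace}.svc.{cluster_domain}"


def resolve_service(service_name, namespace="default", env=None,
                    default_port=80):
    """Prefer the injected env vars; otherwise fall back to the DNS name."""
    env = os.environ if env is None else env
    host_var, port_var = service_env_names(service_name)
    if host_var in env and port_var in env:
        return env[host_var], int(env[port_var])
    return service_dns_name(service_name, namespace), default_port
```

Either way, the client addresses the Service rather than an individual Pod, so instances do not need to register themselves anywhere.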
I have a recently-Dockerized web app that I'm trying to get running in AWS ECS. I'm using Route 53 for the DNS.
Although I haven't set it up yet in Route 53, my plan is to create a DNS record of api.uat.myapp.example.com, and what I want is to have that domain name backed by a load-balanced, autoscalable cluster of my containers living in ECS.
I'm in the ECS Container Network Configuration tab:
Press the "I believe!" button for a minute and let's pretend that I've already created the api.uat.myapp.example.com domain name in Route 53. What values/configs do I need to add here so that:
When remote clients try to connect to api.uat.myapp.example.com they get routed to a load-balanced container running in my ECS cluster?; and
That load-balanced ECS cluster is auto-scaling (once I figure out where I can configure auto-scaling properties, I'm sure I can figure out how to configure them!)
Your question is extremely broad. As such, this answer is likely overly generic to be of specific use, but hopefully will point you in the right direction.
First things first
You're misunderstanding the configuration you've taken a screenshot of. This is the network configuration within your container. So, what is your container's hostname, what other containers does it link to, what DNS Server do you want it to use, or alternatively, what specific other hosts should it know about... these are all optional, but may be required for your specific setup.
This has nothing to do with how your application scales.
Load Balancing
You need to decide on what type of load balancer you should use for your application, first. You then can point Route53 to that load balancer. Behind your load balancer will be your EC2 instances. For traffic to route to your containers, you need to make sure your EC2 ports map to the listening ports of your container. ECS can help with this.
Autoscaling
This happens in multiple ways.
You can do application autoscaling within ECS by having an ECS Service spin up and tear down containers on your EC2 instances.
You can have your Autoscaling Group scale up and down your EC2 instances -- but doing so, you'll need a way to automatically add those instances to your ECS Cluster.
Multiple scaling strategies exist. You'll need to decide for yourself which one is the most appropriate for your application, based on what metrics are most important to your scaling decisions.
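For the first kind of scaling (an ECS Service adding and removing tasks), one possible setup is a target-tracking policy through Application Auto Scaling. The sketch below only builds the request parameters and applies them with boto3 in a separate step. The cluster and service names, the task limits and the 75% CPU target are all assumptions for illustration, and CPU is just one of the metrics you might choose.

```python
def build_ecs_scaling_requests(cluster, service, min_tasks, max_tasks,
                               target_cpu_percent):
    """Return (scalable_target, scaling_policy) keyword arguments."""
    common = {
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
    }
    target = {**common, "MinCapacity": min_tasks, "MaxCapacity": max_tasks}
    policy = {
        **common,
        "PolicyName": f"{service}-cpu-target-tracking",  # hypothetical name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(target_cpu_percent),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization",
            },
        },
    }
    return target, policy


def apply_ecs_scaling(target, policy):
    """Send both requests to AWS (needs boto3 and valid credentials)."""
    import boto3  # third-party; imported lazily so the builder runs without it

    client = boto3.client("application-autoscaling")
    client.register_scalable_target(**target)
    client.put_scaling_policy(**policy)


if __name__ == "__main__":
    # Hypothetical names; replace with your own cluster and service.
    apply_ecs_scaling(*build_ecs_scaling_requests(
        "uat-cluster", "myapp-api", min_tasks=2, max_tasks=10,
        target_cpu_percent=75))
```

This covers only the task count. Scaling the EC2 instances underneath is a separate Auto Scaling Group concern, as noted above.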
I read about the swarm routing mesh.
I created a simple service which uses a Tomcat server and listens on 8080.
docker swarm init: I created a manager node on node1.
docker swarm join /tokens: I used the token provided by the manager on node2 and node3 to create workers.
docker node ls: shows 5 instances of my service, 3 running on node1, 1 on node2 and 1 on node3.
docker service create image: I created the service.
docker service scale imageid=5: I scaled it.
My application uses an atomic number which is maintained at the JVM level.
If I hit http://node1:8080/service 25 times, all requests go to node1. How does it balance across nodes?
If I hit http://node2:8080/service, it goes to node 2.
Why is it not using round-robin?
Doubts:
Is anything wrong in the above steps?
Did I miss something?
I feel I am missing something, like a common service name such as http://domain:8080/service, after which swarm would work in round-robin fashion.
I would like to understand only swarm mode. I am not interested in an external load balancer as of now.
How do I see swarm load balance in action?
Docker does round robin load balancing per connection to the port. As long as a connection is up, it will continue to go to the same instance.
HTTP allows a connection to be kept alive and reused. Browsers take advantage of this behavior to speed up later requests by leaving connections open. To test the round-robin load balancing, you'd need to either disable that keep-alive setting or switch to a command-line tool like curl or wget.
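The difference between the two client behaviors can be shown with a toy model in Python. This is only a simulation of per-connection round robin under the assumptions above, not Docker's actual implementation; the class names and task names are invented.

```python
from itertools import cycle


class Connection:
    """A connection stays bound to the backend chosen when it was opened."""

    def __init__(self, backend):
        self.backend = backend

    def request(self):
        return self.backend  # every request on this connection hits it


class PerConnectionBalancer:
    """Picks the next backend in turn for each NEW connection."""

    def __init__(self, backends):
        self._next_backend = cycle(backends)

    def connect(self):
        return Connection(next(self._next_backend))


def keep_alive_client(lb, n):
    """Browser-like: open one connection and reuse it for n requests."""
    conn = lb.connect()
    return [conn.request() for _ in range(n)]


def new_connection_per_request(lb, n):
    """curl-like: open a fresh connection for each of the n requests."""
    return [lb.connect().request() for _ in range(n)]


if __name__ == "__main__":
    tasks = ["task1", "task2", "task3"]
    print(keep_alive_client(PerConnectionBalancer(tasks), 4))
    print(new_connection_per_request(PerConnectionBalancer(tasks), 4))
```

In this model the keep-alive client sees the same task for all of its requests, while the client that reconnects each time sees the tasks rotate, which matches what the question observed from a browser.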
I'm trying to understand a good way to handle a Kubernetes cluster where there are several nodes and a master.
I host the cluster within the cloud of my company, plain Ubuntu boxes (so no Google Cloud or AWS).
Each pod contains the webapp (which is stateless) and I run any number of pods via replication controllers.
I see that with Services I can declare PublicIPs; however, this is confusing because after adding the IP addresses of my minion nodes, each IP only exposes the pod that runs on it and doesn't do any sort of load balancing. Because of this, if a node doesn't have any active pod running (as created pods are randomly allocated among nodes), the request simply times out and I end up with some IP addresses that don't respond. Am I understanding this wrong?
How can I truly do proper external load balancing for my web app? Should I do load balancing at the Pod level instead of using a Service?
If so, pods are considered mortal and may dynamically die and be born, so how do I keep track of this?
The PublicIP thing is changing lately and I don't know exactly where it landed. But, services are the ip address and port that you reference in your applications. In other words, if I create a database, I create it as a pod (with or without a replication controller). I don't connect to the pod, however, from another application. I connect to a service which knows about the pod (via a label selector). This is important for a number of reasons.
If the database fails and is recreated on a different host, the application accessing it still references the (stationary) service ip address, and the kubernetes proxies take care of getting the request to the correct pod.
The service address is known by all Kubernetes nodes. Any node can proxy the request appropriately.
I think a variation of the theme applies to your problem. You might consider creating an external load balancer which forwards traffic to all of your nodes for the specific (web) service. You still need to take the node out of the balancer's targets if the node goes down, but, I think that any node will forward the traffic for any service whether or not that service is on that node.
All that said, I haven't had direct experience with external (public) ip addresses load balancing to the cluster, so there are probably better techniques. The main point I was trying to make is the node will proxy the request to the appropriate pod whether or not that node has a pod.
-g