Kubernetes: Getting the IP Addresses of Other Pods on the Network

What's the best way to get the IP addresses of the other kubernetes pods on a local network?
Currently, I'm using the following command and parsing the output: kubectl describe pods.
Unfortunately, the command above often takes many seconds to complete (at least 3, and often 30+ seconds), and if a number of requests happen nearly simultaneously, I get 503-style errors. I've built a caching system around this command to cache the IP addresses on the local pod, but when 10 or so pods wake up and need to build this cache, there is a large delay and often many errors. I feel like I'm doing something wrong. Getting the IP addresses of other pods on a network seems like it should be a straightforward process. So what's the best way to get them?
For added details, I'm using Google's kubernetes system on their container engine. Running a standard Ubuntu image.
Context: I'm trying to put together a shared memcached between the pods on the cluster. To do that, they all need to know each other's IP address. If there's an easier way to link pods/instances for the purposes of memcached, that would also be helpful.

Have you tried
kubectl get pods -o wide
This also returns the IP addresses of the pods. Since it does not return all the information that describe returns, it should be faster.
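If you only need the IPs for scripting, jsonpath output skips the table formatting entirely; a minimal sketch, assuming the pods carry a hypothetical app=memcached label:

kubectl get pods -l app=memcached -o jsonpath='{.items[*].status.podIP}'

This is a single API call and prints just the space-separated pod IPs.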

For your described use case you should be using Services. A headless Service would allow you to reach the pods behind it via my-svc.my-namespace.svc.cluster.local. This assumes you don't need to address individual pods, only to reach one of them, as DNS will round-robin between them.
If you do need fixed network identities attached to the pods in your cluster, you can set up a StatefulSet and reference the pods as app-0.my-svc.my-namespace.svc.cluster.local, app-1.my-svc.my-namespace.svc.cluster.local, and so on.
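For illustration, a minimal headless Service sketch; the name my-svc and the app=memcached label are assumptions, not from the question:

apiVersion: v1
kind: Service
metadata:
  name: my-svc
spec:
  clusterIP: None        # headless: DNS returns the backing pod IPs directly
  selector:
    app: memcached
  ports:
  - port: 11211

A StatefulSet that sets serviceName: my-svc then gets the stable app-0.my-svc... names mentioned above.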
You should never need to contact specific pod IPs in any other way, especially since pods can be rescheduled at any time and have their IPs changed.
For your use case specifically, it might be easier to just use the memcached Helm chart, which supports running a cluster in a StatefulSet: https://github.com/kubernetes/charts/tree/master/stable/memcached

Related

Sending requests from Google Kubernetes Engine, multiple deployments, under one external IP address

The Google Cloud Platform Kubernetes Engine based backend deployment I work on has between 4 and 60 nodes running at all times, spanning two different services.
However, I want to interface with an API that employs IP whitelisting, which means that all outgoing requests would have to be funneled through one single IP address.
How do I do this? The deployment uses an Nginx Ingress controller, which doesn't allow many options when it comes to the egress part of things.
I tried setting up a VM outside of the deployment, but still on GCP in the same region, and was unable to set up a forward proxy. At least, not one that I could connect to from my local device. I'm not sure if this was because of GCP's firewall or something of that sort. This was using Squid, as well as Apache, with no success in either.
I also looked at the Cloud NAT option, but it seems like I would have to recreate all the services, CI/CD pipelines, and DNS settings etc. I would ideally avoid that, as it would be a few days worth of work and would call for some downtime of the systems as well.
Ideally I would have a working forward proxy. I tried looking for Docker images that would function as one, but that does not seem to be a thing, sadly. SSHing into a VM to set up such a proxy hasn't led to success yet, either.
You have already found the solution: you have to rebuild things using either Cloud NAT or an equivalent solution you make yourself. Even that option is relatively recent, and I've not actually tried it myself; as recently as six months ago we were told this was not supported for GKE. Our solution was the proxy idea you mentioned: an HTTP proxy running outside of GKE, with traffic directed through it at the app-code level rather than at the infrastructure level. It was not fun.
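For reference, the Cloud NAT setup is roughly the following; a sketch with assumed router/config names and region, not a tested recipe. Note that Cloud NAT only applies to nodes without external IPs, which is part of why the cluster rebuild comes up:

gcloud compute routers create nat-router \
    --network=default --region=us-central1
gcloud compute routers nats create nat-config \
    --router=nat-router --region=us-central1 \
    --auto-allocate-nat-external-ips --nat-all-subnet-ip-ranges

For a stable, whitelistable address you would reserve a static IP and pass it via --nat-external-ip-pool instead of auto-allocating.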

Why not use the host network in Docker, since Docker and Kubernetes networking is so complex?

Using Docker can simplify CI/CD, but it also introduces complexity: not everybody is able to master Docker networking, even when choosing open-source solutions like Flannel or Calico.
So why not use the host network in Docker, and what is lost if you do?
I know port conflicts are one point; are there any others?
There are two parts to an answer to your question:
1. Pods must have individual, cluster-routable IP addresses, and one should be very cautious about recycling them
2. You can, if you wish, not use any software-defined network (SDN)
So with the first part, it is usually a huge hassle to provision a CIDR big enough to house the address range required to support every Pod running across every Namespace, with enough room to avoid recycling addresses for a very long time. Thus, having an SDN allows using "fake" addresses that the "real" network need never know about. No routers need to be updated, no firewalls, no DHCP, whatever.
That said, as with the second part, you don't have to use an SDN: that's exactly what the container network interface (CNI) is designed to paper over. You can use the CNI provider that makes you the happiest, including using static IP addresses or the outer network's DHCP server.
But your comment about port collisions is pretty high up the list of reasons one wouldn't want to just set hostNetwork: true and be done with it; I'm actually not certain whether the default kubernetes scheduler is aware of hostNetwork: true and the declared ports: on the containers: in order to avoid co-scheduling two containers that would conflict. I guess try it and see, or, better yet, don't try it -- use CNI so the next poor person who tries to interact with your cluster doesn't find a snowflake setup.
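To make the port-collision point concrete, a minimal sketch of opting into the host network; the pod name and image are assumptions:

apiVersion: v1
kind: Pod
metadata:
  name: host-net-example
spec:
  hostNetwork: true          # pod shares the node's network namespace
  containers:
  - name: web
    image: nginx
    ports:
    - containerPort: 80      # nginx binds port 80 on the node itself

Two such pods on the same node would fight over the node's port 80, which is exactly the conflict described above.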

Do the nodes of a Kubernetes cluster share memory?

We want to deploy an application that uses a memory cache, using Docker and Kubernetes with horizontal pod autoscaling, but we have no idea whether the containerized applications inside the pods would use the same cache, since it isn't guaranteed that the pods will be on the same node when scaled by the autoscaler.
I've tried searching for information regarding cache memory on Kubernetes clusters, and all I found is a Medium article stating
the CPU and RAM resources of all nodes are effectively pooled and managed by the cluster
and a sentence in a Mirantis blog
Containers in a Pod share the same IPC namespace, which means they can also communicate with each other using standard inter-process communications such as SystemV semaphores or POSIX shared memory.
But I can't find anything regarding pods on different nodes having access to the same cache. And these are all third-party sites, not the official Kubernetes site.
I'm expecting the cache to be shared between all pods in all nodes, but I just want confirmation regarding the matter.
No, separate pods do not generally share anything, even if they are running on the same physical node. There are ways around this if you are very, very careful and fancy, but the idea is for pods to be independent anyway. Within a single pod it's easier; you can use normal shmem. But this is pretty rare, since there usually isn't much reason to do that.
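For completeness, sharing memory within one pod can look like the sketch below: a RAM-backed emptyDir mounted at /dev/shm in both containers, so POSIX shared memory works across them. All names and images are assumptions:

apiVersion: v1
kind: Pod
metadata:
  name: shared-shm-example
spec:
  volumes:
  - name: shm
    emptyDir:
      medium: Memory       # tmpfs, i.e. RAM-backed
  containers:
  - name: producer
    image: alpine
    command: ["sleep", "3600"]
    volumeMounts:
    - name: shm
      mountPath: /dev/shm
  - name: consumer
    image: alpine
    command: ["sleep", "3600"]
    volumeMounts:
    - name: shm
      mountPath: /dev/shm

Nothing comparable exists across pods on different nodes; a cache shared cluster-wide has to be its own networked service (memcached, Redis, etc.).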

Kubernetes scaling pods using custom algorithm

Our cloud application consists of 3 tightly coupled Docker containers: Nginx, Web and Mongo. Currently we run these containers on a single machine. However, as our users are increasing, we are looking for a solution to scale. Using Kubernetes we would form a multi-container pod. If we are to replicate, we need to replicate all 3 containers as a unit. Our cloud application is consumed by mobile app users. Our app can only handle approx 30,000 users per worker node, and we intend to place a single pod on a single worker node. Once a mobile device is connected to a worker node, it must continue to use only that machine (unique IP address).
We plan on using Kubernetes to manage the containers. Load balancing doesn't work for our use case as a mobile device needs to be tied to a single machine once assigned and each Pod works independently with its own persistent volume. However we need a way of spinning up new Pods on worker nodes if the number of users goes over 30000 and so on.
The idea is we have some sort of custom scheduler which assigns a mobile device a Worker Node ( domain/ IPaddress) depending on the number of users on that node.
Is Kubernetes a good fit for this design, and how could we implement a custom pod-scaling algorithm?
Thanks
Piggy-backing on the answer of Jonah Benton:
While this is technically possible, your problem is not with Kubernetes, it's with your application! Let me point out the problem:
Our cloud application consists of 3 tightly coupled Docker containers, Nginx, Web, and Mongo.
Here is your first problem: if you can only deploy these three containers together and not independently, you cannot scale one or the other!
While MongoDB can be scaled to insane loads, if it's bundled with your web server and web application it won't be able to...
So the first step for you is to break up these three components so they can be managed independently of each other. Next:
Currently we run these containers on a single machine.
While not strictly a problem, I have serious doubts about what it would mean to scale your application and about the challenges that come with scalability!
Once a mobile device is connected to worker node it must continue to only use that machine ( unique IP address )
Now, this IS a problem. You're looking to run an application on Kubernetes, but I do not think you understand the consequences of doing that: Kubernetes orchestrates your resources. This means it will move pods (by killing and recreating them) between nodes (and, if necessary, back to the same node). It does this fully autonomously (which is awesome and gives you a good night's sleep). If you're relying on clients sticking to a single node's IP, you're going to get up in the middle of the night because Kubernetes tried to correct for a node failure and moved your pod, which is now gone, and your users can't connect anymore. You need to leverage the load-balancing features (Services) in Kubernetes. Only they are able to handle the dynamic changes that happen in Kubernetes clusters.
Using Kubernetes we would form a multi container pod.
And we have another winner - No! You're trying to treat Kubernetes as if it were your on-premise infrastructure! If you keep doing so you're going to fail and curse Kubernetes in the process!
Now that I told you some of the things you're thinking wrong - what a person would I be if I did not offer some advice on how to make this work:
In Kubernetes your three applications should not run in one pod! They should run in separate pods:
your web server's work should be done by an Ingress, and since you're already familiar with nginx, the nginx Ingress controller is probably the ingress you are looking for!
your web application should be a simple Deployment, exposed through the Ingress via a Service (a minimal sketch follows below)
your database should be deployed separately, either manually as a StatefulSet or (more advanced) through an operator, and also exposed to the web application through a Service
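A minimal sketch of the Deployment-plus-Service pattern from the list above; all names, labels, and the image are assumptions:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: my-web-app:1.0   # hypothetical image
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080

An Ingress would then route external traffic to the web Service, and the database gets its own StatefulSet and Service in the same way.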
Feel free to ask if you have any more questions!
Building a custom scheduler and running multiple schedulers at the same time is supported:
https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/
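Once a second scheduler is running, pointing a pod at it is a single field in the spec; a sketch, assuming the custom scheduler registered itself under the hypothetical name my-custom-scheduler:

apiVersion: v1
kind: Pod
metadata:
  name: sticky-app
spec:
  schedulerName: my-custom-scheduler   # the default scheduler ignores this pod
  containers:
  - name: app
    image: my-app:1.0   # hypothetical image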
That said, to the question of whether kubernetes is a good fit for this design- my answer is: not really.
K8s can be difficult to operate, with the payoff being the level of automation and resiliency that it provides out of the box for whole classes of workloads.
This workload is not one of those. In order to gain any benefit you would have to write a scheduler to handle the edge failure and error cases this application has (what happens when you lose a node for a short period of time...) in a way that makes sense for k8s. And you would have to come up to speed with normal k8s operations.
With the information provided, I'm hard pressed to see why one would use k8s for this workload over just running Docker on some VMs and scripting some of the automation.

Kubernetes on Mesos

I Have the following setup in mind:
Kubernetes on Mesos (based on the kubernetes-mesos project) within a /16 network.
Each pod will have its own IP, and I believe this will make roughly 65,000 pod addresses available (a /16 holds 65,536).
The idea is to provide isolation for each app i.e. Each app gets its own mysql within the same pod - the app accesses mysql on localhost(within the pod).
If an additional service were needed, I'd use Kubernetes rolling updates to add the service's container to the pod; the app would be able to access this new service on localhost as well.
Each application needs as much isolation as possible.
Are there any defects to such an implementation?
Do I have to use weave?
There's an option to specify the service-ip-range while running the kubernetes-mesos install.
One hole is: how do I scale a service? Is this really viable?
Is there a better way to do this? i.e. Offering isolated services
Thanks.
PS: I'm obviously a newbie at this and I'm trying to get the best possible setup running.
A common misconception is that a Pod should manage a vertical, multi-tier stack: for example a web tier + DB tier together.
It's interesting to read the Kubernetes design intent of Pods: they're for collecting 'helper' processes rather than composing a vertical stack.
To answer your questions, I'd recommend:
Define a Pod template for the web tier only. This can be scaled to any size required, using a replication controller (questions #1 and #3); a rough sketch follows after this list.
Define another Pod for MySQL.
Use the Service abstraction to locate these components.
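A rough sketch of a replication controller for the web tier plus a Service for locating MySQL; all names, labels, and the image are assumptions:

apiVersion: v1
kind: ReplicationController
metadata:
  name: web
spec:
  replicas: 3
  selector:
    app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: my-web-app:1.0   # hypothetical image
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  selector:
    app: mysql      # assumes the MySQL pod carries this label
  ports:
  - port: 3306

The web pods can then reach the database at the DNS name mysql, regardless of where the MySQL pod is scheduled.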
This sort of design will work for small applications, but you're right that it'll be tough to scale up if you suddenly want to have a couple of instances of a service hit the same mysql backend.
You may want to look into putting each service into a separate namespace. Then a service's DNS lookups will be scoped to its own namespace by default so that it won't find other services' resources unless it's explicitly looking for them. This would let you put mysql (and any other dependencies) in a separate pod so that the frontend could be scaled independently.
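To illustrate the scoping, assuming two hypothetical namespaces, frontend and backend, with a mysql Service living in backend:

mysql                              # from a pod in backend: the short name resolves locally
mysql.backend.svc.cluster.local    # from a pod in frontend: requires the qualified name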
