Do the nodes of a Kubernetes cluster share memory? - docker

We want to deploy an application that uses an in-memory cache, using Docker and Kubernetes with the Horizontal Pod Autoscaler, but we have no idea whether the containerized application inside the pods would use the same cache, since there is no guarantee that the pods will be on the same node when scaled by the autoscaler.
I've tried searching for information regarding cache memory on Kubernetes clusters, and all I found is a statement in a Medium article:
"the CPU and RAM resources of all nodes are effectively pooled and managed by the cluster"
and a sentence in a Mirantis blog:
"Containers in a Pod share the same IPC namespace, which means they can also communicate with each other using standard inter-process communications such as SystemV semaphores or POSIX shared memory."
But I can't find anything regarding pods on different nodes having access to the same cache. And these are all on third-party sites, not on the official Kubernetes site.
I'm expecting the cache to be shared between all pods on all nodes, but I just want confirmation on the matter.

No, separate pods do not generally share anything, even if they are running on the same physical node. There are ways around this if you are very, very careful and fancy, but the idea is for pods to be independent anyway. Within a single pod it's easier: you can use normal shared memory (shmem), but this is pretty rare since there usually isn't much reason to do it.

Related

Kubernetes scaling pods using custom algorithm

Our cloud application consists of 3 tightly coupled Docker containers, Nginx, Web and Mongo. Currently we run these containers on a single machine. However, as our user numbers increase, we are looking for a way to scale. Using Kubernetes we would form a multi container pod. If we replicate, we need to replicate all 3 containers as a unit. Our cloud application is consumed by mobile app users. Our app can only handle approx. 30,000 users per worker node, and we intend to place a single pod on each worker node. Once a mobile device is connected to worker node it must continue to only use that machine ( unique IP address )
We plan on using Kubernetes to manage the containers. Load balancing doesn't work for our use case as a mobile device needs to be tied to a single machine once assigned and each Pod works independently with its own persistent volume. However we need a way of spinning up new Pods on worker nodes if the number of users goes over 30000 and so on.
The idea is that we have some sort of custom scheduler which assigns a mobile device to a worker node (domain/IP address) depending on the number of users on that node.
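The assignment idea described above can be sketched roughly as follows. This is only an illustration of "sticky" capacity-based assignment, not a Kubernetes scheduler: the `WorkerNode` and `Assigner` names, the in-memory bookkeeping, and the first-fit policy are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

MAX_USERS_PER_NODE = 30_000  # capacity figure taken from the question


@dataclass
class WorkerNode:
    """Hypothetical record of one worker node and the devices pinned to it."""
    address: str  # domain or IP address handed to the device
    capacity: int = MAX_USERS_PER_NODE
    users: set = field(default_factory=set)

    def has_room(self) -> bool:
        return len(self.users) < self.capacity


class Assigner:
    """Pins each device to one node and keeps returning that same node."""

    def __init__(self, nodes: List[WorkerNode]):
        self.nodes = list(nodes)
        self.by_device: Dict[str, str] = {}

    def assign(self, device_id: str) -> Optional[str]:
        # A device that was already assigned always gets its original node.
        if device_id in self.by_device:
            return self.by_device[device_id]
        # First-fit: pick the first node that still has spare capacity.
        for node in self.nodes:
            if node.has_room():
                node.users.add(device_id)
                self.by_device[device_id] = node.address
                return node.address
        # Every node is full: the caller has to bring up another node/pod.
        return None
```

A `None` result is the signal that a new pod on a new worker node is needed. In a real deployment this state would have to live somewhere durable, and the sketch says nothing about what happens when a node disappears.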
Is Kubernetes a good fit for this design, and how could we implement a custom pod scaling algorithm?
Thanks
Piggy-Backing on the answer of Jonah Benton:
While this is technically possible - your problem is not with Kubernetes, it's with your application! Let me point out the problems:
Our cloud application consists of 3 tightly coupled Docker containers, Nginx, Web, and Mongo.
Here is your first problem: If you can only deploy these three containers together and not independently - you cannot scale one or the other!
While MongoDB can be scaled to insane loads - if it's bundled with your web server and web application it won't be able to...
So the first step for you is to break up these three components so they can be managed independently of each other. Next:
Currently we run these containers on a single machine.
While not strictly a problem - I have serious doubts that you know what it would mean to scale your application and what challenges come with scalability!
Once a mobile device is connected to worker node it must continue to only use that machine ( unique IP address )
Now, this IS a problem. You're looking to run an application on Kubernetes, but I do not think you understand the consequences of doing that: Kubernetes orchestrates your resources. This means it will move pods (by killing and recreating them) between nodes (and, if necessary, recreate them on the same node). It does this fully autonomously (which is awesome and gives you a good night's sleep). If you're relying on clients sticking to a single node's IP, you're going to get up in the middle of the night because Kubernetes tried to correct for a node failure and moved your pod, which is now gone, and your users can't connect anymore. You need to leverage the load-balancing features (Services) in Kubernetes. Only they are able to handle the dynamic changes that happen in Kubernetes clusters.
Using Kubernetes we would form a multi container pod.
And we have another winner - No! You're trying to treat Kubernetes as if it were your on-premise infrastructure! If you keep doing so you're going to fail and curse Kubernetes in the process!
Now that I've told you some of the things you're getting wrong - what kind of person would I be if I did not offer some advice on how to make this work:
In Kubernetes your three applications should not run in one pod! They should run in separate pods:
Your web server's work should be done by an Ingress, and since you're already familiar with nginx, this is probably the ingress you are looking for!
Your web application should be a simple Deployment and be exposed to the ingress through a Service.
Your database should be a separate deployment, which you can either do manually through a StatefulSet or (more advanced) through an operator, and it should also be exposed to the web application through a Service.
Feel free to ask if you have any more questions!
Building a custom scheduler and running multiple schedulers at the same time is supported:
https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/
That said, to the question of whether Kubernetes is a good fit for this design, my answer is: not really.
K8s can be difficult to operate, with the payoff being the level of automation and resiliency that it provides out of the box for whole classes of workloads.
This workload is not one of those. In order to gain any benefit you would have to write a scheduler to handle the edge failure and error cases this application has (what happens when you lose a node for a short period of time...) in a way that makes sense for k8s. And you would have to come up to speed with normal k8s operations.
With the information provided, I'm hard pressed to see why one would use k8s for this workload over just running Docker on some VMs and scripting some of the automation.

Multiple Pods for multiple clients on a single Kubernetes Instance

I'm trying to wrap my head around how/if Kubernetes manages multiple Pods in terms of a clustered client model. Based on the Multi-container documentation, it sounds as though Kubernetes is only concerned with the health of a pod and the containers within it. This means that a single Kubernetes instance could manage multiple clients' pods, which contain containers running each client's applications, microservices etc.
Is this correct?
Please see my diagram for a clearer idea of what I'm asking.
The diagram has the right idea, but not quite the right terminology.
The diagram would be more accurate if the "Pod" label was replaced with "Namespace", and the "Container" label was replaced with "Pod".
A single Kubernetes cluster is intended to be able to support multi-tenancy, where the workloads of individual clients can run with proper security, resource allocation, isolation, and other important tenancy management attributes.
The unit of tenancy, however, is a namespace, not a pod. A namespace is a logical layer of abstraction into which workloads are deployed, usually for an individual client. The unit of replication for workload processing is the pod (comprising one or more containers), not an individual container.

Kubernetes: Getting the IP Addresses of Other Pods on the Network

What's the best way to get the IP addresses of the other Kubernetes pods on a local network?
Currently, I'm using the following command and parsing the output: kubectl describe pods.
Unfortunately, the command above often takes many seconds to complete (at least 3, and often 30+ seconds), and if a number of requests happen nearly simultaneously, I get 503-style errors. I've built a caching system around this command to cache the IP addresses on the local pod, but when 10 or so pods wake up and need to create this cache, there is a large delay and often many errors. I feel like I'm doing something wrong. Getting the IP addresses of other pods on a network seems like it should be a straightforward process. So what's the best way to get them?
For added details, I'm using Google's kubernetes system on their container engine. Running a standard Ubuntu image.
Context: I'm trying to put together a shared memcached between the pods on the cluster. To do that, they all need to know each other's IP addresses. If there's an easier way to link pods/instances for the purposes of memcached, that would also be helpful.
Have you tried
kubectl get pods -o wide
This also returns IP addresses of the pods. Since this does not return ALL information describe returns, this might be faster.
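If the output has to be parsed by a program, asking kubectl for JSON avoids scraping the human-readable table. The following is a minimal sketch, assuming kubectl is on the PATH and already configured for the cluster; the helper names `pod_ips_from_json` and `fetch_pod_ips` are made up for this example.

```python
import json
import subprocess
from typing import Dict


def pod_ips_from_json(raw: str) -> Dict[str, str]:
    """Map pod name -> pod IP from the output of `kubectl get pods -o json`."""
    doc = json.loads(raw)
    ips = {}
    for item in doc.get("items", []):
        ip = item.get("status", {}).get("podIP")
        if ip:  # pods that are not scheduled yet have no IP
            ips[item["metadata"]["name"]] = ip
    return ips


def fetch_pod_ips(namespace: str = "default") -> Dict[str, str]:
    """Run kubectl once and parse its JSON output (needs cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return pod_ips_from_json(result.stdout)
```

This still goes through the API server on every call, so it does not remove the load problem described in the question; it only makes the parsing more robust.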
For your described use case you should be using services. A headless service would allow you to reference them with my-svc.my-namespace.svc.cluster.local. This assumes you don't need to know individual nodes, only how to reach one of them, as it will round robin between them.
If you do need to have fixed network identities in your cluster attached to the pods, you can set up a StatefulSet and reference them with: app-0.my-svc.my-namespace.svc.cluster.local, app-1.my-svc.my-namespace.svc.cluster.local and so on.
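The naming pattern above can be generated instead of hard-coded. Here is a small sketch, assuming the default cluster domain `cluster.local`; the function names are invented for illustration, and `resolve_all` only works from inside the cluster, where these DNS names exist.

```python
import socket
from typing import List


def statefulset_pod_hosts(
    name: str,
    service: str,
    namespace: str,
    replicas: int,
    cluster_domain: str = "cluster.local",  # assumed default cluster domain
) -> List[str]:
    """Build the stable per-pod DNS names of a StatefulSet behind a headless service."""
    return [
        f"{name}-{i}.{service}.{namespace}.svc.{cluster_domain}"
        for i in range(replicas)
    ]


def resolve_all(host: str, port: int) -> List[str]:
    """Return every address DNS reports for host (several for a headless service)."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})
```

For example, `statefulset_pod_hosts("app", "my-svc", "my-namespace", 2)` yields the two names mentioned in the answer, which a memcached client could use as its server list without ever looking up pod IPs itself.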
You should never need to contact specific pod IPs in other ways, especially since pods can be rescheduled at any time and have their IPs changed.
For your use case specifically, it might be easier to just use the memcached Helm chart, which supports a cluster in a StatefulSet: https://github.com/kubernetes/charts/tree/master/stable/memcached

Kubernetes on Mesos

I have the following setup in mind:
Kubernetes on Mesos (based on the kubernetes-mesos project) within a /16 network.
Each pod will have its own IP, and I believe this will allow for about 64,000 pods.
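The address arithmetic behind that estimate can be checked with Python's standard ipaddress module. This is only a sketch of the raw address count; the 10.0.0.0/16 range is an arbitrary example, and the number of pods actually schedulable will typically be lower because it depends on how the network setup carves the range into per-node subnets.

```python
import ipaddress


def address_count(cidr: str) -> int:
    """Total number of addresses in the network, including network/broadcast."""
    return ipaddress.ip_network(cidr).num_addresses


def usable_hosts(cidr: str) -> int:
    """Addresses left after excluding the network and broadcast addresses.

    Valid for ordinary IPv4 networks with a prefix of /30 or shorter.
    """
    return address_count(cidr) - 2


print(address_count("10.0.0.0/16"))  # 65536
print(usable_hosts("10.0.0.0/16"))   # 65534
```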
The idea is to provide isolation for each app, i.e. each app gets its own MySQL within the same pod, and the app accesses MySQL on localhost (within the pod).
If an additional service were needed, I'd use kubernetes rolling updates to add the service's container to the pod, the app will be able to access this new service on localhost as well.
Each application needs as much isolation as possible.
Are there any defects to such an implementation?
Do I have to use weave?
There's an option to specify the service-ip-range while running the kubernetes-mesos install.
One hole is how do I scale a service, is this really viable?
Is there a better way to do this? i.e. Offering isolated services
Thanks.
PS: I'm obviously a newbie at this and I'm trying to get the best possible setup running.
A common misconception is that a Pod should manage a vertical, multi-tier stack: for example a web tier + DB tier together.
It's interesting to read the Kubernetes design intent of Pods: they're for collecting 'helper' processes rather than composing a vertical stack.
To answer your questions, I'd recommend:
Define a Pod template for the web tier only. This can be scaled to any size required, using a replication controller (questions #1 and #3).
Define another Pod for MySQL.
Use the Service abstraction to locate these components.
This sort of design will work for small applications, but you're right that it'll be tough to scale up if you suddenly want to have a couple of instances of a service hit the same MySQL backend.
You may want to look into putting each service into a separate namespace. Then a service's DNS lookups will be scoped to its own namespace by default so that it won't find other services' resources unless it's explicitly looking for them. This would let you put mysql (and any other dependencies) in a separate pod so that the frontend could be scaled independently.

What's the difference between Kubernetes and Flynn/Deis

I have read some introduction of these projects, but still cannot get a clear idea of the difference between Kubernetes and Flynn/Deis. Can anyone help?
Kubernetes is really three things:
A way to dynamically schedule containers (actually, sets of containers called pods) onto a cluster of machines.
A way to manage and horizontally scale a lot of those pods using labels and helpers (ReplicationController).
A way to communicate between sets of pods via services, expose a set of pods externally on a public IP, and easily consume external services. This is necessary to deal with the horizontal scaling and the dynamic nature of how pods get placed/scheduled.
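The role labels play in grouping pods can be illustrated with a toy model. This is a conceptual sketch of equality-based label matching only, not Kubernetes code; the pod names, labels, and helper functions are invented for the example.

```python
from typing import Dict, List


def matches(selector: Dict[str, str], labels: Dict[str, str]) -> bool:
    """True if every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select_pods(selector: Dict[str, str], pods: List[dict]) -> List[str]:
    """Names of the pods whose labels satisfy the selector."""
    return [pod["name"] for pod in pods if matches(selector, pod["labels"])]


# Hypothetical pods; a service or replication controller would target them by label.
pods = [
    {"name": "web-1", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "web-2", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "db-1", "labels": {"app": "db", "tier": "backend"}},
]

print(select_pods({"app": "web"}, pods))  # ['web-1', 'web-2']
```

Because membership is decided by labels rather than by a fixed list of machines or IPs, the set of pods behind a selector can change as pods are scaled or rescheduled.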
This is all very much a tool set for managing compute across a set of machines. It isn't a full application PaaS. Kubernetes doesn't have any idea what an "application" is. Generally PaaS systems provide an easy way to take code and get it deployed and managed as an application. In fact, I expect to see specialized PaaS systems built on top of Kubernetes -- that is what RedHat OpenShift is doing.
One way to think about Kubernetes is as a system for "logical" infrastructure (vs. traditional VM cloud systems which are

Resources