I have recently started getting familiar with Kubernetes. While I do get the concept, I have some questions I am unable to answer clearly through the Kubernetes Concepts and Documentation, and some understandings that I'd like to confirm.
A Deployment is a group of one or more container images (Docker etc.) that is deployed within a Pod, and through the Kubernetes Deployment Controller such deployments are monitored and created, updated, or deleted.
A Pod is a group of one or more containers. Are those containers from the same Deployment, or can they be from multiple Deployments?
"A pod models contains one or more application containers which are relatively tightly coupled". Is there any clear criteria on when to deploy containers within the same pod, rather than separate pods?
"Pods are the smallest deployable units of computing that can be created and managed in Kubernetes" - Pods, Kuberenets Documentation. Is that to mean that Kubernetes API is unable to monitor, and manage containers (at least directly)?
Appreciate your input.
Your question is actually too broad for Stack Overflow, but I'll quickly answer before this one is closed.
Maybe it gets clearer when you look at the API documentation, which you could read like this:
A Deployment describes a specification of the desired behavior for the contained objects.
This is done within the spec field which is of type DeploymentSpec.
A DeploymentSpec defines what the related Pods should look like with a template, through the PodTemplateSpec.
The PodTemplateSpec then holds the PodSpec with all the required parameters, and that defines what containers within this Pod should look like through a Container definition.
This is not a punchy one-line statement, but maybe it makes it easier to see how things relate to each other.
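To make the nesting a bit more concrete, here is a minimal Deployment manifest as a sketch (the names and image are placeholders, not from the question):

apiVersion: apps/v1
kind: Deployment               # Deployment
metadata:
  name: my-app
spec:                          # DeploymentSpec
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:                    # PodTemplateSpec
    metadata:
      labels:
        app: my-app
    spec:                      # PodSpec
      containers:              # Container definitions
      - name: my-app
        image: nginx:1.25
        ports:
        - containerPort: 80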
Regarding the criteria for what's a good size and what's too big for a Pod or a Container: this is very opinion-loaded, and the best way to figure it out is to read through the opinions on the size of microservices.
To cover your last point - Kubernetes is able to monitor and manage containers, but the "user" is not able to schedule single containers. They have to be embedded in a Pod definition. You can of course access container status and details per container (e.g. through kubectl logs <pod> -c <container>) or through the metrics API.
I hope this helps a bit and doesn't add to the confusion.
A Pod is an abstraction provided by Kubernetes and it corresponds to a group of containers which share a subset of namespaces, most importantly the network namespace. For instance, the applications running in these containers can interact the way applications in the same VM would interact, except for the fact that they don't share the same filesystem hierarchy.
The workloads are run in the form of Pods, but a Pod is a lower-level abstraction. The workloads are typically scheduled in terms of Kubernetes Deployments / Jobs / CronJobs / DaemonSets etc., which in turn create the Pods.
I am searching for a tutorial or a good reference to perform docker container live migration in Kubernetes between two hosts (embedded devices - arm64 architecture).
As far as I have searched internet resources, I could not find complete documentation about it. I am a newbie, and it would be really helpful if someone could provide me with any good reference materials so that I can improve myself.
Posting this as a community wiki, feel free to edit and expand.
As @David Maze said, in terms of containers and pods it's not really a live migration. Usually pods are managed by deployments, which have ReplicaSets that control the pods' state: they are created and kept in the requested amount. Any change in the number of pods (e.g. you delete one) or in the image used will trigger pod recreation.
This can also be used for scheduling pods on different nodes when, for instance, you need to perform maintenance on a node or remove/add one.
As for your question in the comments, it's not necessarily the same volume, and I suppose it can have a short downtime.
Sharing volumes between Kubernetes clusters on premises (cloud may differ) is not a built-in feature. You may want to look at an NFS server deployed in your network:
Mounting external NFS share to pods
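As a rough sketch of what that can look like in a pod definition (the server address and export path below are placeholders, not taken from the linked guide), a pod can mount an external NFS share with the built-in nfs volume type:

apiVersion: v1
kind: Pod
metadata:
  name: nfs-client
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: shared-data
      mountPath: /data            # the NFS share appears here inside the container
  volumes:
  - name: shared-data
    nfs:
      server: 10.0.0.10           # placeholder NFS server address
      path: /exports/shared       # placeholder export path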
We developed an application which consists of a few Go/Java services, a MongoDB and a reverse proxy which forwards REST calls to the specific service. Each service runs in its own Docker container.
The whole app is deployable with a single docker-compose file.
We successfully managed to deploy the app in a kubernetes cluster.
Now the "tricky" part: We want to deploy one isolated instance of the app for each customer. (remember one instance consists of approximately 10 containers)
In the past we reached this goal by deploying multiple instances of the docker-compose file.
What is the recommended way to achieve this in Kubernetes?
Thank you very much.
Applications can be separated via simple naming and labels or namespaces. Separation could go even further into restricting the nodes an instance may run on or even running separate clusters.
Network policies can be applied on top of deployments to improve network isolation. This would be needed to emulate the docker-compose "network bridge per instance" setup.
"Isolated" can mean a lot of things though as there are various layers where the term can be applied in various ways.
Naming
Many instances of a deployment can run intermingled on a cluster as long as the naming of each Kubernetes resource doesn't clash. This includes the labels (and sometimes annotations) applied, which are used to select or report on apps, so you can uniquely identify a customer's resources.
kubectl create -f deployment-customer1.yaml
kubectl create -f deployment-customer2.yaml
This type of naming is easier to manage with a deployment mechanism like Helm. Helm "charts" describe a release and are built around the base concept of a variable "release name", so YAML templates can rely on variables. The average Helm release would be:
helm install -f customer1-values.yaml customer1-app me/my-app-chart
helm install -f customer2-values.yaml customer2-app me/my-app-chart
Namespaces
A namespace is a logical grouping of resources in a cluster. By itself, a namespace only provides naming isolation but a lot of other k8s resources can then depend on a namespace to apply to:
Authorization/Role based access to k8s
Pod security policy
Resource quotas
A namespace per customer/instance may be useful, for example if you had a "premium" customer that gets a bigger slice of the resource quotas. It may also make labelling and selecting instances easier, which Network Policy will use.
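As an illustrative sketch only (the namespace name and limits are made up), a per-customer ResourceQuota could look like this:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: customer1-quota
  namespace: customer1            # applies only to this customer's namespace
spec:
  hard:
    requests.cpu: "4"             # total CPU requested by all pods in the namespace
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "30"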
Environments can be a good fit for a namespace, so a similar deployment can go to the dev/test/prod ns. If you are giving users access to manage or query Kubernetes resources themselves, namespaces make management much easier.
Managing namespaced resources might look like:
kubectl create ns customer1
kubectl create -f deployment.yaml -n customer1
kubectl create ns customer2
kubectl create -f deployment.yaml -n customer2
Again, helm is equally applicable to namespaced deployments.
DNS is probably worth a mention too: containers will look up host names in their own namespace by default. In the namespace customer1, looking up the host name service-name will resolve to service-name.customer1.svc.cluster.local.
Similarly, in the namespace customer2 a lookup for service-name resolves to service-name.customer2.svc.cluster.local.
Nodes
Customers could be pinned to particular nodes (VM or physical) to provide security and/or resource isolation from other customers.
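One common way to do that pinning (sketched here with a made-up node label and image) is to label the dedicated nodes and use a nodeSelector in the pod spec:

kubectl label nodes node-1 customer-pool=customer1

apiVersion: v1
kind: Pod
metadata:
  name: app-customer1
spec:
  nodeSelector:
    customer-pool: customer1      # only schedule onto nodes carrying this label
  containers:
  - name: app
    image: my-app:1.0             # placeholder image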
Clusters
Cluster separation can provide full security, resource and network isolation without relying on kubernetes to manage it.
Large apps can often end up using a complete cluster per "grouping". This has a huge overhead of management for each cluster but allows closer to complete independence between instances. Security can be a big driver for this, as you can provide a layer of isolation between clusters outside of the Kubernetes masters.
Network Policy
A network policy lets you restrict network access between Pods/Services via label selectors. Kubernetes will actively manage the firewall rules wherever the Pods are scheduled in the cluster. This would be required to provide similar network isolation to docker-compose creating a network per instance.
The cluster will need to use a network plugin (CNI) that supports network policies, like Calico.
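As a rough sketch (the namespace and policy name are placeholders), a policy that only allows traffic from pods in the same customer namespace could look like this:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: customer1-isolation
  namespace: customer1
spec:
  podSelector: {}                 # applies to all pods in the namespace
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector: {}             # only traffic from pods in the same namespace is allowed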
In Kubernetes, you can package all your resources into a Helm chart (https://helm.sh/docs/topics/charts/) so that you can deploy different instances of it and manage their lifecycle. You can also pass parameters for each of the instances if required.
Another method is deploying your application instances using Kubernetes operators (https://kubernetes.io/docs/concepts/extend-kubernetes/operator/). This also helps in managing your application components.
I have the same problem as the following:
Dual nginx in one Kubernetes pod
In my Kubernetes Deployment template, I have 2 containers that are using the same port 80.
I understand that containers within a Pod are actually under the same network namespace, which enables accessing another container in the Pod with localhost or 127.0.0.1.
It means containers can't use the same port.
It's very easy to achieve this with the help of docker run or docker-compose, by using 8001:80 for the first container and 8002:80 for the second container.
Is there any similar or better solution to do this in a Kubernetes Pod, without separating these 2 containers into different Pods?
Basically I totally agree with @David's and @Patric's comments, but I decided to add a few more things, expanding it into an answer.
I have the same problem as the following: Dual nginx in one Kubernetes pod
And there is already a pretty good answer for that problem in the mentioned thread. From the technical point of view it provides a ready solution to your particular use case; however, it doesn't question the idea itself.
It's very easy to achieve this with the help of docker run or docker-compose, by using 8001:80 for the first container and 8002:80 for the second container.
It's also very easy to achieve in Kubernetes. Simply put both containers in different Pods and you will not have to manipulate the nginx config to make it listen on a port different than 80. Note that those two docker containers you mentioned don't share a single network namespace, and that's why they can both listen on port 80, which is mapped to different ports on the host system (8001 and 8002). This is not the case with Kubernetes Pods. Read more about microservices architecture and especially how it is implemented on k8s and you'll notice that placing a few containers in a single Pod is a really rare use case and definitely should not be applied in a case like yours. There should be a good reason to put 2 or more containers in a single Pod. Usually the second container has some complementary function to the main one.
There are 3 design patterns for multi-container Pods, commonly used in Kubernetes: sidecar, ambassador and adapter. Very often all of them are simply referred to as sidecar containers.
Note that 2 or more containers coupled together in a single Pod in all of the above-mentioned use cases have totally different functions. Even if you put more than just one container in a single Pod, in practice it is never a container of the same type (like two nginx servers listening on different ports in your case). They should be complementary and there should be a good reason why they are put together, why they should start and shut down at the same time and share the same network namespace. A sidecar container with a monitoring agent running in it has a complementary function to the main container, which can be e.g. an nginx webserver. You can read more about container design patterns in general in this article.
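As an illustrative sketch of the sidecar idea (image names, command and paths are made up, not from the linked article), an nginx container paired with a log-shipping helper sharing a volume might look like this:

apiVersion: v1
kind: Pod
metadata:
  name: web-with-log-shipper
spec:
  volumes:
  - name: logs
    emptyDir: {}                  # scratch space shared by both containers
  containers:
  - name: web
    image: nginx:1.25
    volumeMounts:
    - name: logs
      mountPath: /var/log/nginx   # nginx writes its logs here
  - name: log-shipper
    image: busybox
    command: ["sh", "-c", "touch /var/log/nginx/access.log && tail -f /var/log/nginx/access.log"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/nginx   # the sidecar reads the same logs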
I don't have a very firm use case, because I'm still very new to Kubernetes and the concept of a cluster.
So definitely don't go this way if you don't have a particular reason for such an architecture.
My initial planning of the cluster is putting all my containers of the system into a pod. So that I can replicate this pod as many as I want.
You don't need a single Pod for replication. You can have a lot of ReplicaSets in your cluster (usually managed by Deployments), each of them taking care of running the declared number of replicas of a Pod of a certain kind.
But according to all the feedback that I have now, it seems like I'm going in the wrong direction.
Yes, this is definitely the wrong direction, but it was actually already said. I'd only like to highlight why exactly this direction is wrong. Such an approach is totally against the idea of microservices architecture, which is what Kubernetes is designed for. Putting all your infrastructure in a single huge Pod and binding all your containers tightly together makes no sense. Remember that a Pod is the smallest deployable unit in Kubernetes and when one of its containers crashes, the whole Pod crashes. There is no way you can manually restart just one container in a Pod.
I'll review my structure and try with the suggestions you all provided. Thank you, everyone! =)
This is a good idea :)
I believe what you need to do is specify a different container port for each container in the pod. Kubernetes allows you to specify the port each container exposes using this parameter in the pod definition file. You can then create services pointing to the same pod but different ports.
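As a rough sketch of that idea (image names are placeholders, and it assumes the second server is actually configured to listen on 8080 instead of 80), the pod and one of the services could look like this:

apiVersion: v1
kind: Pod
metadata:
  name: two-servers
  labels:
    app: two-servers
spec:
  containers:
  - name: web-a
    image: nginx:1.25
    ports:
    - containerPort: 80
  - name: web-b
    image: my-second-server:1.0   # placeholder; must listen on 8080 itself
    ports:
    - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: web-b
spec:
  selector:
    app: two-servers
  ports:
  - port: 80
    targetPort: 8080              # routes to the second container's port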
I'm trying to figure out and learn the patterns and best practices for moving a bunch of Docker containers I have for an application into Kubernetes. Things like pod design, services, deployments, etc. For example, I could create a single Pod with the web and application containers in it, but that'd not be a good design.
Searching for things like architecture and design with Kubernetes just seems to yield topics on the product’s architecture or how to implement a Kubernetes cluster, and not the overlay of designing the pods, services, etc.
What does the community generally call this application-layer design in the Kubernetes world, and can anyone refer me to a 101 on this topic, please?
Thanks.
Kubernetes is a complex system, and learning step by step is the best way to gain expertise. What I recommend is the Kubernetes documentation, where you can learn about each of its components.
Another good option is to review 70 best K8S tutorials, which are categorized in many ways.
Designing and running applications with scalability, portability, and robustness in mind can be challenging. Here are great resources about it:
Architecting applications for Kubernetes
Using Kubernetes in production, lessons learned
Kubernetes Design Principles from Google
Well, there's no Kubernetes approach but rather a Cloud Native one: I would suggest Designing Distributed Systems: Patterns and Paradigms by Brendan Burns.
It's really good because it provides several scenarios along with pattern approaches and related code.
Most of the examples are obviously based on Kubernetes but I think that the implementation is not so important, since you have to understand why and when to use an Ambassador pattern or a FaaS according to the application needs.
The answer to this can be quite complex and that's why it is important that software/platform architects understand K8s well.
Mostly you will find answers that tell you "put each application component in its own pod". And basically that's correct, as the main reasons for K8s are high availability, fault tolerance of the infrastructure and things like this. This leads us to the following: if you put every single component into its own pod and run it with a replica count higher than 2, it will reach better availability.
But you also need to know why you want to go to K8s. At the moment it is a trending topic. But if you don't want to operate a cluster and actually don't need HA or so, why not run on stuff like AWS ECS, DigitalOcean droplets and co?
The best answers you will currently find are all around how to design and cut microservices, as each microservice could be represented by a pod. Also, a good starting point is Red Hat's Principles of Container-based Application Design
or InfoQ.
A Kubernetes cluster is composed of:
A master server, called the control plane
Nodes: servers which execute the applications / containers or pods
By design, a production Kubernetes cluster must have at least one master server and 2 nodes, according to the Kubernetes documentation.
Here is a summary of the components of a kubernetes cluster:
Master = control plane:
kube-apiserver: exposes the Kubernetes API
etcd: key-value store for the cluster
kube-scheduler: distributes the pods onto the nodes
kube-controller-manager: runs the controllers for nodes, pods and other cluster components.
Nodes = Servers that run applications
kubelet: runs on each node; it makes sure that the containers are running in a pod.
kube-proxy: allows the pods to communicate within the cluster and with the outside
Container runtime: the software that runs the containers / pods
Complementary modules = addons
DNS: DNS server that serves DNS records for Kubernetes services.
Webui: Graphical dashboard for the cluster
Container Resource Monitoring: Records metrics on containers in a central DB, provides UI to browse them
Cluster-level Logging: Records container logs in a central log with a search / browse interface.
In Kubernetes a Pod is considered a single unit of deployment which might have one or more containers, so if we scale, all the containers in the Pod are scaled together.
If the Pod has only one container it's easier to scale that particular Pod, so what's the purpose of packaging more than one container inside a Pod?
From the documentation:
Pods can be used to host vertically integrated application stacks (e.g. LAMP), but their primary motivation is to support co-located, co-managed helper programs
The most common example of this is sidecar containers which contain helper applications like log shipping utilities.
A deeper dive can be found here
The reason for using a Pod rather than a container directly is that Kubernetes requires more information to orchestrate the containers, like a restart policy, a liveness probe, and a readiness probe. A liveness probe defines whether the container inside the pod is alive or not, the restart policy defines what to do with the container when it fails, and a readiness probe defines whether the container is ready to start serving.
So, instead of adding those properties to the existing container, Kubernetes decided to write a wrapper around containers with all the necessary additional information.
Also, Kubernetes supports the multi-container pod, which is mainly required for sidecar containers, e.g. log or data collectors or proxies for the main container. Another advantage of a multi-container pod is that very tightly coupled application containers can run together sharing the same data, the same network namespace and the same IPC namespace, which would not be possible if you used containers directly without any wrapper around them.
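As an illustrative sketch (the image and probe endpoints are placeholders), this is roughly where those extra orchestration properties live in a pod definition:

apiVersion: v1
kind: Pod
metadata:
  name: probed-app
spec:
  restartPolicy: Always           # what to do when a container fails
  containers:
  - name: app
    image: my-app:1.0             # placeholder image
    ports:
    - containerPort: 8080
    livenessProbe:                # is the container still alive?
      httpGet:
        path: /healthz            # placeholder health endpoint
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 10
    readinessProbe:               # is the container ready to serve traffic?
      httpGet:
        path: /ready              # placeholder readiness endpoint
        port: 8080
      periodSeconds: 5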
The following is a very nice article to give you a brief idea:
https://www.mirantis.com/blog/multi-container-pods-and-container-communication-in-kubernetes/