I have 2 pods running in my Kubernetes cluster. One is simply a WordPress application and the second one contains a MySQL DB. WordPress communicates with the MySQL DB.
I want to find these dependencies between pods. Is there any kubectl command, or any tool like Prometheus, with which I can find dependencies between pods inside a Kubernetes cluster?
No, there is no native kubernetes primitive which can define dependencies between pods. An easy thing you can do is to define labels like dependsOn and attach them to the corresponding pod.
For example, your wordpress pod can have a label which says dependsOn: mysql where mysql can either be the name or another label of your mysql pod.
But this will only help a human reader understand what this pod depends on. Kubernetes works on the principle of eventual consistency. Even if mysql doesn't start before wordpress, eventually they will start working together and the system will become consistent. The wordpress pod will crash when it cannot find mysql, and Kubernetes will keep restarting crashing pods.
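The labeling idea above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `dependsOn` label is a convention you define yourself (Kubernetes attaches no meaning to it), and the pod manifests below are hypothetical and trimmed to the fields the sketch reads.

```python
def dependency_edges(pods):
    """Return sorted (pod name, dependency) pairs read from a 'dependsOn' label."""
    edges = []
    for pod in pods:
        meta = pod.get("metadata", {})
        target = meta.get("labels", {}).get("dependsOn")
        if target:
            edges.append((meta["name"], target))
    return sorted(edges)


# Hypothetical pod manifests; 'dependsOn' is a self-chosen label, not a Kubernetes feature.
PODS = [
    {"metadata": {"name": "wordpress", "labels": {"app": "wordpress", "dependsOn": "mysql"}}},
    {"metadata": {"name": "mysql", "labels": {"app": "mysql"}}},
]

if __name__ == "__main__":
    for pod, dependency in dependency_edges(PODS):
        print(f"{pod} -> {dependency}")
```

In a real cluster, the list of pods could come from the `items` array of `kubectl get pods -o json` instead of the hard-coded `PODS` above.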
If you want to define dependencies between applications on Kubernetes and require deployments to happen in a particular order etc. you can take a look at tools like Aptomi.
I'm trying to deploy pods in my Kubernetes cluster, and some of the pods are giving me an error about storage problems. A screenshot is given below:
I am sure the problem is with one of my worker nodes; I don't think it's a problem with Pulsar. I'll also share the YAML file here to give a clear view of what the problem is.
Link to YAML file: https://github.com/apache/pulsar/blob/master/deployment/kubernetes/generic/k8s-1-9-and-above/zookeeper.yaml
I need help tweaking the YAML file a little so that the pods can be created within the constraints of my existing worker nodes. I'll be happy to provide more information if you need it.
Thanks in advance
It looks like the affinity rules are preventing the pods from starting. In production, you want to make sure the Zookeeper pods (and other pod groups like BookKeeper) don't run on the same worker node, which is why those rules are configured that way. You can increase your Kubernetes setup to 3 worker nodes, or remove the affinity rules from the various stateful sets and deployment files.
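For the second option (removing the affinity rules), a minimal Python sketch is shown here. It assumes the manifest has already been loaded into a dictionary and that the rules live under `spec.template.spec.affinity`. The `STATEFUL_SET` below is a hypothetical, heavily trimmed stand-in, not the actual Pulsar file.

```python
import copy


def without_affinity(workload):
    """Return a copy of a StatefulSet/Deployment manifest without pod affinity rules."""
    result = copy.deepcopy(workload)
    pod_spec = result.get("spec", {}).get("template", {}).get("spec", {})
    pod_spec.pop("affinity", None)  # no-op when the manifest has no affinity block
    return result


# Hypothetical, trimmed StatefulSet used only to exercise the function.
STATEFUL_SET = {
    "apiVersion": "apps/v1",
    "kind": "StatefulSet",
    "metadata": {"name": "zk"},
    "spec": {
        "replicas": 3,
        "template": {
            "spec": {
                "affinity": {"podAntiAffinity": {"requiredDuringSchedulingIgnoredDuringExecution": []}},
                "containers": [{"name": "zookeeper", "image": "example/zookeeper"}],
            }
        },
    },
}

if __name__ == "__main__":
    relaxed = without_affinity(STATEFUL_SET)
    print("affinity" in relaxed["spec"]["template"]["spec"])
```

Dropping the rules is reasonable for a small development cluster only; for production the anti-affinity rules are there on purpose, as explained above.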
Alternatively, you can use this Helm chart (full disclosure: I am the creator) to deploy Pulsar to Kubernetes:
https://helm.kafkaesque.io
See the section "Installing Pulsar for development" for settings that will enable Pulsar to run in smaller Kubernetes setups, including disabling affinity rules.
I have the same problem as the following:
Dual nginx in one Kubernetes pod
In my Kubernetes Deployment template, I have 2 containers that are using the same port 80.
I understand that containers within a Pod are actually under the same network namespace, which enables accessing another container in the Pod with localhost or 127.0.0.1.
It means containers can't use the same port.
It's very easy to achieve this with the help of docker run or docker-compose, by using 8001:80 for the first container and 8002:80 for the second container.
Is there any similar or better solution to do this in a Kubernetes Pod, without separating these 2 containers into different Pods?
Basically I totally agree with #David's and #Patric's comments but I decided to add to it a few more things expanding it into an answer.
I have the same problem as the following: Dual nginx in one Kubernetes pod
And there is already a pretty good answer to that problem in the mentioned thread. From a technical point of view it provides a ready solution for your particular use case; however, it doesn't question the idea itself.
It's very easy to achieve this with the help of docker run or
docker-compose, by using 8001:80 for the first container and 8002:80
for the second container.
It's also very easy to achieve in Kubernetes. Simply put the two containers in different Pods and you will not have to modify the nginx config to make it listen on a port other than 80. Note that the two docker containers you mentioned don't share a single network namespace, and that's why they can both listen on port 80, which is mapped to different ports on the host system (8001 and 8002). This is not the case with Kubernetes Pods. Read more about microservices architecture, and especially how it is implemented on k8s, and you'll notice that placing a few containers in a single Pod is a really rare use case and definitely should not be applied in a case like yours. There should be a good reason to put 2 or more containers in a single Pod. Usually the second container has some complementary function to the main one.
There are 3 design patterns for multi-container Pods, commonly used in Kubernetes: sidecar, ambassador and adapter. Very often all of them are simply referred to as sidecar containers.
Note that in all of the above-mentioned use cases, the 2 or more containers coupled together in a single Pod have totally different functions. Even if you put more than one container in a single Pod (a single container per Pod is the most common case), in practice they are never containers of the same type (like two nginx servers listening on different ports in your case). They should be complementary, and there should be a good reason why they are put together, why they should start and shut down at the same time and share the same network namespace. A sidecar container with a monitoring agent running in it has a complementary function to the main container, which can be e.g. an nginx webserver. You can read more about container design patterns in general in this article.
I don't have a very firm use case, because I'm still
very new to Kubernetes and the concept of a cluster.
So definitely don't go this way if you don't have a particular reason for such an architecture.
My initial planning of the cluster is putting all my containers of the system
into a pod. So that I can replicate this pod as many as I want.
You don't need a single Pod to replicate it. You can have a lot of ReplicaSets in your cluster (usually managed by Deployments), each of them taking care of running the declared number of replicas of a Pod of a certain kind.
But according to all the feedback that I have now, it seems like I'm going
in the wrong direction.
Yes, this is definitely the wrong direction, but that has actually already been said. I'd only like to highlight why exactly this direction is wrong. Such an approach goes completely against the idea of microservices architecture, which is what Kubernetes is designed for. Putting all your infrastructure in a single huge Pod and binding all your containers tightly together makes no sense. Remember that a Pod is the smallest deployable unit in Kubernetes: all of its containers are scheduled, scaled, and deleted together, a failing container affects the readiness of the whole Pod, and kubectl has no command to restart just one container in a Pod.
I'll review my structure and try with the
suggestions you all provided. Thank you, everyone! =)
This is a good idea :)
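I believe what you need to do is specify a different container port for each container in the pod. Kubernetes allows you to specify the port each container exposes using the containerPort field in the pod definition file. Because the containers share one network namespace, the application inside each container must also actually listen on its own port. You can then create Services pointing to the same pod but to different ports.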
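As a rough sketch of this setup, the Python below builds a Pod manifest with two containers on different ports plus one Service per port, and prints them as JSON. All names (`dual-web`, `web-a`, `web-b`) and the choice of port 8080 are assumptions made for illustration. Note that `containerPort` only declares a port; the second nginx would still need its own configuration to listen on 8080.

```python
import json


def two_container_pod(name, app_label):
    """Pod manifest whose two containers declare different ports."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": app_label}},
        "spec": {
            "containers": [
                {"name": "web-a", "image": "nginx", "ports": [{"containerPort": 80}]},
                # Assumption: this nginx is configured to listen on 8080;
                # containerPort declares the port, it does not remap it.
                {"name": "web-b", "image": "nginx", "ports": [{"containerPort": 8080}]},
            ]
        },
    }


def service_for(name, app_label, target_port):
    """Service that selects the pod by label and forwards port 80 to target_port."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "selector": {"app": app_label},
            "ports": [{"port": 80, "targetPort": target_port}],
        },
    }


def declared_ports(pod):
    """List every containerPort declared in the pod, in container order."""
    return [p["containerPort"] for c in pod["spec"]["containers"] for p in c.get("ports", [])]


if __name__ == "__main__":
    pod = two_container_pod("dual-web", "dual-web")
    items = [pod, service_for("web-a", "dual-web", 80), service_for("web-b", "dual-web", 8080)]
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
```

Each Service exposes port 80 to its clients, so callers of `web-a` and `web-b` never need to know that the second container listens on a different port inside the pod.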
I'm trying to learn Kubernetes to push up my microservices solution to some Kubernetes in the Cloud (e.g. Azure Kubernetes Service, etc)
As part of this, I'm trying to understand the main concepts, specifically around Pods + Workers and (in the yml file) Pods + Services. To do this, I'm trying to compare what I have inside my docker-compose file against the new concepts.
Context
I currently have a docker-compose.yml file which contains about 10 images. I've split the solution up into two 'networks': frontend and backend. The backend network contains 3 microservices and cannot be accessed at all via a browser. The frontend network contains a reverse-proxy (aka. Traefik, which is just like nginx) which is used to route all requests to the appropriate backend microservice and a simple SPA web app. All works 100% awesome.
Each backend Microservice has at least one of these:
Web API host
Background tasks host
So this means, I could scale out the WebApi hosts, if required .. but I should never scale out the background tasks hosts.
Here's a simple diagram of the solution:
So if the SPA app tries to request some data with the following route:
https://api.myapp.com/account/1 this will hit the reverse-proxy and match a rule to then forward onto <microservice b>/account/1
So it's from here that I'm trying to learn how to write a Kubernetes deployment file based on these docker-compose concepts.
Questions
Each 'Pod' has its own IP so I should create a Pod per container. (Yes, a Pod can have multiple containers and to me, that's like saying 'install these software products on the same machine')
A 'Worker Node' is what we replicate/scale out, so we should put our Pods into a Node based on the scaling scenario. For example, the Background Task hosts should go into one Node because they shouldn't be scaled. Also, the hardware requirements for that node are really small. While the Web APIs should go into another Node so they can be replicated/scaled out.
If I'm on the right path with the understanding above, then I'll have a lot of nodes and pods ... which feels ... weird?
The pod is the unit of Workload, and has one or more containers. Exactly one container is normal. You scale that workload by changing the number of Pod Replicas in a ReplicaSet (or Deployment).
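To make the scaling model concrete, here is a small Python sketch that builds Deployment manifests and changes only the replica count. The workload names and images (`account-api`, `account-tasks`, `example/...`) are hypothetical placeholders for the Web API host and the background tasks host from the question.

```python
def deployment(name, image, replicas):
    """Minimal apps/v1 Deployment manifest running `replicas` copies of one pod."""
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


def scaled(manifest, replicas):
    """Return a copy with a different replica count; the input is left untouched."""
    updated = dict(manifest)
    updated["spec"] = dict(manifest["spec"], replicas=replicas)
    return updated


if __name__ == "__main__":
    # Hypothetical workloads: the web API scales out, the background host stays at one.
    web_api = deployment("account-api", "example/account-api:1.0", 3)
    background = deployment("account-tasks", "example/account-tasks:1.0", 1)
    print(scaled(web_api, 5)["spec"]["replicas"], background["spec"]["replicas"])
```

Scaling is a property of the Deployment, not of the node: the Web API gets more replicas while the background host stays at one, and both can land on whichever nodes have room.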
A Pod is mostly an accounting construct with no direct parallel to base docker. It's similar to docker-compose's Service. A pod is mostly immutable after creation. Like every resource in kubernetes, a pod is a declaration of desired state - containers to be running somewhere. All containers defined in a pod are scheduled together and share resources (IP, memory limits, disk volumes, etc).
All Pods within a ReplicaSet are both fungible and mortal - a request can be served by any pod in the ReplicaSet, and any pod can be replaced at any time. Each pod does get its own IP, but a replacement pod will probably get a different IP. And if you have multiple replicas of a pod they'll all have different IPs. You don't want to manage or track pod IPs. Kubernetes Services provide discovery (how do I find those pods' IPs) and routing (connect to any Ready pod without caring about its identity) and load balancing (round robin over that group of Pods).
A Node is the compute machine (VM or Physical) running a kernel and a kubelet and a dockerd. (This is a bit of a simplification. Other container runtimes than just dockerd exist, and the virtual-kubelet project aims to turn this assumption on its head.)
All pods are scheduled on Nodes. When a pod (with containers) is scheduled on a node, the kubelet running on that node is responsible for bringing it up. The kubelet talks to dockerd to start containers.
Once scheduled on a node, a pod is not moved to another node. Nodes are fungible & mortal too, though. If a node goes down or is being decommissioned, the pod will be evicted/terminated/deleted. If that pod was created by a ReplicaSet (or Deployment) then the ReplicaSet Controller will create a new replica of that pod to be scheduled somewhere else.
You normally start many (1-100) pods+containers on the same node+kubelet+dockerd. If you have more pods than that (or they need a lot of cpu/ram/io), you need more nodes. So the nodes are also a unit of scale, though very much indirectly wrt the web-app.
You do not normally care which Node a pod is scheduled on. You let kubernetes decide.
I am creating a docker container (using docker run) in a Kubernetes environment by invoking a REST API.
I have mounted the docker.sock of the host machine, and I am building an image and running that image from the REST API.
Now I need to connect to this container from another container which is started by kubectl from a deployment.yml file.
But when I use kubectl describe pod (pod name), the container created using the REST API is not there. So where is this container running, and how can I connect to it from another container?
Are you running the container in the same namespace as the one used by deployment.yml? One option to check that would be to run:
kubectl get pods --all-namespaces
If you are not able to find the docker container there, then I would suggest performing the steps below:
docker ps -a (to verify the container's status)
Ensure that there are no permission errors while mounting docker.sock
If there are permission errors, escalate privileges to the appropriate level
To answer the second question, a connection between two containers should be possible by referencing the cluster DNS in the format below:
"<servicename>.<namespacename>.svc.cluster.local"
I would also ask you to detail the steps, code and errors (if there are any) so that I can better answer the question.
You probably shouldn't be directly accessing the Docker API from anywhere in Kubernetes. Kubernetes will be totally unaware of anything you manually docker run (or equivalent) and, as you note, normal administrative calls like kubectl get pods won't see it; the CPU and memory used by that container won't be known to Kubernetes, and this could cause a node to become overutilized. The Kubernetes network environment is also pretty complicated, and unless you know the details of your specific CNI provider it'll be hard to make your container accessible at all, much less from a pod running on a different node.
A process running in a pod can access the Kubernetes API directly, though. The Kubernetes documentation notes that all of the official client libraries are aware of the conventions this uses. This means that you should be able to directly create a Job that launches your target pod, and a Service that connects to it, and get the normal Kubernetes features around this. (For example, servicename.namespacename.svc.cluster.local is a valid DNS name that reaches any Pod connected to the Service.)
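The DNS naming scheme mentioned above is simple enough to capture in a helper. The Python sketch below assumes the default cluster domain `cluster.local` (clusters can be configured with a different one), and the service and namespace names in the example are hypothetical.

```python
def service_dns_name(service, namespace, cluster_domain="cluster.local"):
    """Build the in-cluster DNS name of a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def service_url(service, namespace, port, scheme="http"):
    """Build a URL for a Service port, using the default cluster domain."""
    return f"{scheme}://{service_dns_name(service, namespace)}:{port}"


if __name__ == "__main__":
    # Hypothetical names: a Service called 'builder' in the 'default' namespace.
    print(service_dns_name("builder", "default"))
    print(service_url("builder", "default", 8080))
```

A container in another pod would connect to that name instead of to a pod IP, so it keeps working when the target pod is replaced.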
You should also consider whether you actually need this sort of interface. For many applications, it will work just as well to deploy some sort of message-queue system (e.g., RabbitMQ) and then launch a pool of workers that connects to it. You can control the size of the worker pool using a Deployment. This is easier to develop since it avoids a hard dependency on Kubernetes, and easier to manage since it prevents a flood of dynamic jobs from overwhelming your cluster.
I have recently started getting familiar with Kubernetes. While I do get the concept, I have some questions I am unable to answer clearly through the Kubernetes Concepts documentation, and some understandings that I'd like to confirm.
A Deployment is a group of one or more container images (Docker etc.) that is deployed within a Pod, and through the Kubernetes Deployment Controller such deployments are monitored and created, updated, or deleted.
A Pod is a group of one or more containers. Are those containers from the same Deployment, or can they be from multiple Deployments?
"A pod contains one or more application containers which are relatively tightly coupled". Are there any clear criteria on when to deploy containers within the same pod, rather than in separate pods?
"Pods are the smallest deployable units of computing that can be created and managed in Kubernetes" - Pods, Kubernetes Documentation. Does that mean that the Kubernetes API is unable to monitor and manage containers (at least directly)?
Appreciate your input.
Your question is actually too broad for StackOverflow, but I'll quickly answer before this one is closed.
Maybe it gets clearer when you look at the API documentation, which you could read like this:
A Deployment describes a specification of the desired behavior for the contained objects.
This is done within the spec field which is of type DeploymentSpec.
A DeploymentSpec defines what the related Pods should look like with a template, through the PodTemplateSpec.
The PodTemplateSpec then holds the PodSpec with all the required parameters, and that defines what the containers within this Pod should look like through a Container definition.
This is not a punchy oneline statement, but maybe makes it easier to see how things relate to each other.
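The same chain can be read off a manifest directly. The Python sketch below writes a minimal Deployment as a dictionary, with comments marking which API type each level corresponds to. The `wordpress` names and image are assumptions chosen only for illustration.

```python
DEPLOYMENT = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",                          # Deployment
    "metadata": {"name": "wordpress"},
    "spec": {                                      # DeploymentSpec
        "replicas": 1,
        "selector": {"matchLabels": {"app": "wordpress"}},
        "template": {                              # PodTemplateSpec
            "metadata": {"labels": {"app": "wordpress"}},
            "spec": {                              # PodSpec
                "containers": [                    # list of Container
                    {"name": "wordpress", "image": "wordpress"},
                ],
            },
        },
    },
}


def container_names(deployment):
    """Walk Deployment -> DeploymentSpec -> PodTemplateSpec -> PodSpec -> Container."""
    pod_spec = deployment["spec"]["template"]["spec"]
    return [container["name"] for container in pod_spec["containers"]]


if __name__ == "__main__":
    print(container_names(DEPLOYMENT))
```

The containers only ever appear inside the pod template, which is why a container always belongs to exactly one Pod and is never deployed on its own.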
Related to the criteria on what's a good size and what's too big for a Pod or a Container. This is very opinion loaded and the best way to figure that out is to read through the opinions on the size of Microservices.
To cover your last point: Kubernetes is able to monitor and manage containers, but the "user" is not able to schedule single containers. They have to be embedded in a Pod definition. You can of course access container status and details per container, e.g. through kubectl logs <pod> -c <container> or through the metrics API.
I hope this helps a bit and doesn't add to the confusion.
A Pod is an abstraction provided by Kubernetes, and it corresponds to a group of containers which share a subset of namespaces, most importantly the network namespace. For instance, the applications running in these containers can interact the way applications in the same VM would interact, except for the fact that they don't share the same filesystem hierarchy.
Workloads are run in the form of pods, but a Pod is a lower-level abstraction. Workloads are typically scheduled in terms of Kubernetes Deployments / Jobs / CronJobs / DaemonSets etc., which in turn create the Pods.