I have 1 master Kubernetes server and 9 nodes. Of those, I want to run the backend on 2 nodes, the frontend on 2 nodes and the DB on 3 nodes.
For the backend, frontend and DB I have Docker images ready.
How do I run an image on only the desired nodes (2 or 3) using Kubernetes?
Please share some ideas to achieve this.
The Kubernetes scheduler will, most of the time, do a good job of distributing pods across the cluster. Unless you have very specific requirements, you may want to leave that responsibility to the scheduler.
If you want to control this, you can use:
Node selectors
Node Affinity or Anti-Affinity
Directly specify the node name in the deployment spec
Of these three, the recommended approach is node affinity or anti-affinity, because it is the most flexible.
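For example, here is a minimal node-affinity sketch for the backend pods, assuming you have labelled your two chosen nodes yourself (the tier=backend label is hypothetical, applied beforehand with kubectl label nodes <node-name> tier=backend):

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: tier          # hypothetical node label you applied yourself
              operator: In
              values:
                - backend

This goes under the pod template's spec; the same pattern with different label values pins the frontend and DB pods to their nodes.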
Run the frontend as a Deployment with the desired replica count and let Kubernetes manage it for you.
Run the backend as a Deployment with the desired number of replicas; Kubernetes will figure out how to schedule it. Use node selectors if you prefer specific nodes.
Run the DB as a Deployment or a StatefulSet; again, Kubernetes will figure out how to run it.
https://kubernetes.io/docs/tutorials/stateful-application/mysql-wordpress-persistent-volume/
Use network policies to restrict traffic.
You can use labels and nodeSelector. See:
https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
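As a minimal sketch of that approach (image name and labels are placeholders), a backend Deployment pinned to the labelled nodes via nodeSelector:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
spec:
  replicas: 2
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      nodeSelector:
        tier: backend            # hypothetical label: kubectl label nodes <node> tier=backend
      containers:
        - name: backend
          image: my-registry/backend:latest   # placeholder image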
I'm learning Spark, and I'm confused about running a Docker container which contains Spark code on a Kubernetes cluster.
I read that Spark utilizes multiple nodes (servers) and can run code on different nodes in order to complete jobs faster (and to use the memory of each node when the data is too big).
On the other hand, I read that a Kubernetes pod (which contains Docker containers) runs on one node.
For example, I'm running the following Spark code from Docker:
from pyspark import SparkContext

sc = SparkContext(appName="double-example")  # in the pyspark shell, sc already exists
num = [1, 2, 3, 4, 5]
num_rdd = sc.parallelize(num)
double_rdd = num_rdd.map(lambda x: x * 2)
Some notes and reminders (from my understanding):
When using the map command, each value of the num array is processed on a different Spark node (worker)
A k8s pod runs on one node
So I'm confused: how does Spark utilize multiple nodes when the pod runs on one node?
Do the Spark workers run on different nodes, and is that how the pod which runs the code above can communicate with those nodes in order to utilize the Spark framework?
When you run Spark on Kubernetes, you have a few ways to set things up. The most common way is to run Spark in client mode.
Basically, Spark can run on Kubernetes in a Pod; the application itself, given the endpoints of the k8s API server, is then able to spawn its own worker Pods, as long as everything is correctly configured.
What this setup needs is to deploy the Spark application on Kubernetes (usually with a StatefulSet, but it's not a must) along with a headless ClusterIP Service (which is required so that the worker Pods are able to communicate with the driver application that spawned them).
You also need to give the Spark application all the correct configuration, such as the k8s API endpoint, the Pod name and other parameters.
There are other ways to set up Spark, and there's no obligation to spawn worker Pods: you can run all the stages of your code locally (the configuration is easy, and if you have small jobs with small amounts of data to process you don't need workers).
Or you can submit the Spark application from outside the Kubernetes cluster, so not from a Pod, while giving it the Kubernetes API endpoints so that it can still spawn its workers on the cluster (aka cluster mode).
You can find a lot more info in the Spark documentation, which explains almost everything needed to set this up (https://spark.apache.org/docs/latest/running-on-kubernetes.html#client-mode).
You can read about StatefulSets and their use of headless ClusterIP Services here (https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/).
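For reference, a minimal sketch of the headless ClusterIP Service described above; every name and port here is illustrative rather than mandated by Spark:

apiVersion: v1
kind: Service
metadata:
  name: spark-driver-svc        # hypothetical name
spec:
  clusterIP: None               # headless: DNS resolves directly to the driver Pod IP
  selector:
    app: spark-driver           # must match the labels on the driver Pod/StatefulSet
  ports:
    - name: driver-rpc
      port: 7078
    - name: blockmanager
      port: 7079

The driver application would then set spark.driver.host to this Service's DNS name so that the executor Pods it spawns can call back to it.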
We developed an application which consists of a few Go/Java services, a MongoDB and a reverse proxy which forwards REST calls to the specific service. Each service runs in its own Docker container.
The whole app is deployable with a single docker-compose file.
We successfully managed to deploy the app in a kubernetes cluster.
Now the "tricky" part: We want to deploy one isolated instance of the app for each customer. (remember one instance consists of approximately 10 containers)
In the past we reached this goal by deploying multiple instances of the docker-compose file.
What is the recommended way in Kubernetes to reach this?
Thank you very much.
Applications can be separated via simple naming and labels, or via namespaces. Separation can go even further, restricting the nodes an instance may run on, or even running separate clusters.
Network policies can be applied on top of a deployment to improve network isolation. This would be needed to emulate the docker-compose "network bridge per instance" setup.
"Isolated" can mean a lot of things though as there are various layers where the term can be applied in various ways.
Naming
Many instances of a deployment can run intermingled on a cluster as long as the naming of each Kubernetes resource doesn't clash. This includes the labels (and sometimes annotations) that are used to select or report on apps, so that you can uniquely identify a customer's resources.
kubectl create -f deployment-customer1.yaml
kubectl create -f deployment-customer2.yaml
This type of naming is easier to manage with a deployment mechanism like Helm. Helm "charts" describe a release and are built around the base concept of a variable "release name", so the YAML templates can rely on variables. A typical Helm release would be:
helm install -f customer1-values.yaml customer1-app me/my-app-chart
helm install -f customer2-values.yaml customer2-app me/my-app-chart
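Each values file carries the per-customer parameters consumed by the chart templates; a hypothetical customer1-values.yaml might be as simple as:

customerName: customer1              # made-up keys, defined by your own chart
replicaCount: 2
ingressHost: customer1.example.com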
Namespaces
A namespace is a logical grouping of resources in a cluster. By itself, a namespace only provides naming isolation, but a number of other k8s mechanisms can then be applied per namespace:
Authorization/Role based access to k8s
Pod security policy
Resource quotas
A namespace per customer/instance may be useful, for example if you had a "premium" customer that gets a bigger resource quota. It may also make labelling and selecting instances easier, which Network Policy will make use of.
Environments can also be a good fit for namespaces, so a similar deployment can go to the dev/test/prod ns. If you are giving users access to manage or query Kubernetes resources themselves, namespaces make management much easier.
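For example, a sketch of a per-namespace quota for such a premium customer (all figures made up):

apiVersion: v1
kind: ResourceQuota
metadata:
  name: customer1-quota
  namespace: customer1
spec:
  hard:
    requests.cpu: "4"        # total CPU all pods in the namespace may request
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "30"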
Managing namespaced resources might look like:
kubectl create ns customer1
kubectl create -f deployment.yaml -n customer1
kubectl create ns customer2
kubectl create -f deployment.yaml -n customer2
Again, helm is equally applicable to namespaced deployments.
DNS is probably worth a mention too: containers look up host names in their own namespace by default. In the namespace customer1, looking up the host name service-name will resolve to service-name.customer1.svc.cluster.local.
Similarly, in the namespace customer2, a lookup for service-name resolves to service-name.customer2.svc.cluster.local.
Nodes
Customers could be pinned to particular nodes (VM or physical) to provide security and/or resource isolation from other customers.
Clusters
Cluster separation can provide full security, resource and network isolation without relying on kubernetes to manage it.
Large apps can often end up using a complete cluster per "grouping". This carries a huge management overhead for each cluster, but allows something close to complete independence between instances. Security can be a big driver for this, as you can provide a layer of isolation between clusters outside of the Kubernetes masters.
Network Policy
A network policy lets you restrict network access between Pods/Services via label selectors. Kubernetes will actively manage the firewall rules wherever the Pods are scheduled in the cluster. This would be required to provide similar network isolation to docker-compose creating a network per instance.
The cluster will need to use a network plugin (CNI) that supports network policies, like Calico.
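As a minimal sketch, a policy like the following (applied in each customer namespace) only admits traffic from pods in the same namespace, roughly emulating a per-instance bridge network:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: same-namespace-only
  namespace: customer1        # hypothetical namespace
spec:
  podSelector: {}             # applies to every pod in the namespace
  ingress:
    - from:
        - podSelector: {}     # allow traffic only from pods in this same namespace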
In Kubernetes, you can package all your resources into a Helm chart (https://helm.sh/docs/topics/charts/) so that you can deploy different instances and manage each one's lifecycle. You can also pass parameters to each of the instances if required.
Another method is to deploy your application instances using a Kubernetes operator (https://kubernetes.io/docs/concepts/extend-kubernetes/operator/). This also helps in managing your application's components.
I am new to the world of Kubernetes and am trying to apply its advantages to a personal project.
I have an API service in a Docker container which fetches data from the back end.
I plan on creating multiple replicas of this API service container behind a single external port in the Kubernetes cluster. Do replicas share traffic if they're on a single node?
My end goal is to create multiple instances of this API service to make my application faster (users can access one of the multiple API services, which should reduce the load on any single instance).
Am I thinking along the right lines in terms of Kubernetes functionality?
You are right: the multiple replicas of your API service will share the load. In Kubernetes there is a concept of Services, which send traffic to a backend; in this case that is your API application running in the pods. By default, the choice of backend pod is random. It also doesn't matter whether the pods are running on a single node or on different nodes; the traffic will be distributed among all the pods that match the Service's label selector.
This will also make your application highly available, because you will use a Deployment to specify the number of replicas, and whenever the number of available replicas drops below the desired count, Kubernetes will provision new pods to restore the desired state.
If you add multiple instances/replicas of your web server, they will share the load and you will avoid a single point of failure.
However, to achieve this you will have to create and expose a Service. You will have to communicate using the Service endpoint, not each pod's IP directly.
A Service exposes an endpoint and load-balances across the pods behind it, usually distributing requests round robin.
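A minimal sketch of such a Service in front of the API replicas; the names and ports are placeholders:

apiVersion: v1
kind: Service
metadata:
  name: api-service
spec:
  type: NodePort              # exposes the Service on a port of every node
  selector:
    app: api                  # matches the label carried by every API pod
  ports:
    - port: 80                # Service port inside the cluster
      targetPort: 8080        # container port of the API pods

Clients talk to the Service's stable endpoint, and kube-proxy spreads the requests across whichever pods currently match the selector.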
Kubernetes manages Pods. Pods are wrappers around containers. Kubernetes can schedule multiple pods on the same node (hardware) or across multiple nodes, depending on how you configure it. You can use Deployments to manage ReplicaSets, which in turn manage Pods.
It is usually recommended to avoid managing pods directly. Pods can crash or stop abruptly; Kubernetes will create a new one for you automatically according to the ReplicaSet configuration.
Using deployments you can do rolling updates also.
You can refer to Kubernetes Docs to read about this in detail.
Yes. It's called Braess's paradox.
I am trying to implement a CI/CD pipeline using Kubernetes and Jenkins with my private SVN repository. I am planning to use a Kubernetes cluster with 3 master and 15 worker machines/nodes, and to use Jenkins to deploy the microservices developed with Spring Boot. So when I deploy using Jenkins, how can I define which microservice needs to be deployed on which node in the Kubernetes cluster? Do I need to specify it in the Pod? Or some other definition?
How can I define which microservice needs to be deployed on which node in the Kubernetes cluster? Do I need to specify it in the Pod? Or some other definition?
As said in other answers, you don't need to do this, but you can if there is any reason to do so, using nodeSelector or, preferably, affinities. They are well worth the time to read, since you can group the pods of specific services/microservices together, or keep them away from each other, across the available nodes, allowing for a more flexible and resilient architecture and a proper spread. This way you are helping the scheduler decide where to place what, in order to achieve the desired layout. For most basic needs the previously mentioned resource allocation can do the trick, but for any fine-grained control you have affinity and anti-affinity at your disposal. The documentation detailing this is here: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
Kubernetes figures out which nodes should run which pods. You don't have to do that. You do have to indicate how much memory and CPU each pod needs; to a first approximation, k8s figures out the rest.
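For example, a sketch of what that declaration looks like in a pod template (names and figures are illustrative):

containers:
  - name: my-microservice
    image: my-registry/my-microservice:1.0   # placeholder image
    resources:
      requests:
        cpu: 250m            # what the scheduler reserves when placing the pod
        memory: 256Mi
      limits:
        cpu: "1"             # hard ceilings enforced at runtime
        memory: 512Mi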
That said, what you do have to do is figure out how to partition the full set of workloads you need to run into namespaces, say by environment (dev/stage/prod) or by tenant (team X/team Y/team Z or client X/client Y/client Z), then figure out what workflow makes sense for that partitioning, and configure the CI to satisfy that workflow.
I am trying to deploy my set of microservices on different nodes. For installing kubeadm and creating the cluster I am following the documentation below:
https://medium.com/@SystemMining/setup-kubenetes-cluster-on-ubuntu-16-04-with-kubeadm-336f4061d929
https://medium.com/@Grigorkh/install-kubernetes-on-ubuntu-1ac2ef522a36
https://www.youtube.com/watch?v=b_fOIELGMDY&t=108s
I need one master and 2 worker machines. I now have a clear idea of how to create Kubernetes clusters.
My requirements: I have an application made up of separate sets of microservices. I need to deploy the Docker images for one set of microservices on node1, the images for another set on node2, a third set on node3, and so on. This is my deployment plan; please correct me if I am going in the wrong direction, since I have only just started exploring Docker, Kubernetes, Jenkins and DevOps.
My confusions:
According to my requirement of region-wise deployment by node, is this deployment strategy possible with Kubernetes? And is it a standard approach?
If I am using Jenkins to implement the CI/CD pipeline, do I need to install Jenkins on each VM, i.e. on the master machine and also on the machines hosting the nodes?
These are all my confusions about this Kubernetes deployment. Please correct me if my thoughts are wrong, since I am only a beginner in the DevOps world. How can I clarify my doubts about deployment using Kubernetes?
To answer your first question: you basically need to allocate each node to a tenant. If there are compliance/regulatory reasons then you should do it (though it won't be very efficient). Here is how you can do it:
On node1, add a taint:
kubectl taint nodes node1.compute.companyname.com reservedfor=tenant1:NoSchedule
What the above means is that node1 will only schedule pods which have a matching toleration, and no other pods. For the microservice which you need to schedule on node1, you will have to add a toleration to the pod YAML file like:
tolerations:
- key: "reservedfor"
operator: "Equal"
value: "tenant1"
effect: "NoSchedule"
The same logic can be extended: if tenant1 needs 4 machines, all 4 machines can be tainted with the above key/value pair, and the pods given the matching toleration will be schedulable on those nodes. Check out the Kubernetes documentation on taints and tolerations for more details and a worked example.
You can also use pod/node affinity to achieve the above.
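For instance, a sketch (the tenant pod label is hypothetical) of podAntiAffinity that refuses to co-locate a tenant1 pod on any node already running a pod of another tenant:

affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
            - key: tenant        # hypothetical pod label
              operator: NotIn
              values:
                - tenant1
        topologyKey: kubernetes.io/hostname   # use the node as the co-location domain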
On your second question about Jenkins: no, you don't need to install Jenkins on each node, but beyond that more details would be needed to answer that part of the question.