I am trying to implement a CI/CD pipeline using Kubernetes and Jenkins with my private SVN repository. I am planning to use a Kubernetes cluster with 3 master and 15 worker nodes, and to use Jenkins to deploy microservices developed with Spring Boot. When I deploy using Jenkins, how can I define which microservice should be deployed on which node in the Kubernetes cluster? Do I need to specify this in the Pod, or in some other definition?
How can I define which microservice should be deployed on which node in the Kubernetes cluster? Do I need to specify this in the Pod, or in some other definition?
As said in other answers, you don't need to do this, but if you have a reason to, you can use the simple nodeSelector or, preferably, the more expressive affinities. They are well worth the time to read about, since they let you group the pods of specific services/microservices together, or keep them away from each other, across the available nodes, which allows for a more flexible and resilient architecture and a proper spread. This way you help the scheduler decide where to place what, so that you get the layout you want. For most basic needs the previously mentioned resource allocation can do the trick, but for anything more fine-grained you have affinity (and anti-affinity) at your disposal. Documentation detailing this is here: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
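To make the affinity idea concrete, here is a minimal sketch, not a definitive setup. The Deployment name, the image, and the node label workload-group=backend are all assumptions made up for illustration; you would have to label your own nodes accordingly.

```yaml
# Hypothetical Deployment; names, labels and image are illustrative assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: orders-service
  template:
    metadata:
      labels:
        app: orders-service
    spec:
      affinity:
        nodeAffinity:
          # Hard rule: only schedule on nodes carrying the (assumed) label
          # workload-group=backend.
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: workload-group
                    operator: In
                    values:
                      - backend
        podAntiAffinity:
          # Soft rule: prefer not to put two replicas of this service
          # on the same node.
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: orders-service
                topologyKey: kubernetes.io/hostname
      containers:
        - name: orders-service
          image: registry.example.com/orders-service:1.0.0
```

The node affinity is written as a hard requirement and the anti-affinity as a preference, so the scheduler can still place all replicas when there are fewer matching nodes than replicas.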
Kubernetes figures out which nodes should run which pods. You don't have to do that. You have to indicate how much memory and CPU each pod needs, and Kubernetes, to a first approximation, figures out the rest.
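As a rough sketch of what "indicating memory and CPU" looks like, a container can declare requests (what the scheduler reserves on a node) and limits (the ceiling at runtime). The pod name, image and numbers below are placeholders, not recommendations.

```yaml
# Hypothetical pod; name, image and sizes are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: billing-service
spec:
  containers:
    - name: billing-service
      image: registry.example.com/billing-service:1.0.0
      resources:
        # The scheduler only places the pod on a node with this much
        # unreserved capacity.
        requests:
          cpu: "250m"
          memory: "512Mi"
        # Upper bounds enforced while the container runs.
        limits:
          cpu: "500m"
          memory: "1Gi"
```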
That said, what you do have to do is figure out how to partition the full set of workloads you need to run (say, by environment: dev/stage/prod, or by tenant: team X/team Y/team Z or client X/client Y/client Z) into namespaces, then figure out what workflow makes sense for that partitioning, then configure the CI to satisfy that workflow.
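One possible way to express such a partitioning is a namespace per tenant and environment. The names and labels here are invented for illustration; how you slice things depends on your own workflow.

```yaml
# Hypothetical namespaces: one per (tenant, environment) pair.
apiVersion: v1
kind: Namespace
metadata:
  name: team-x-dev
  labels:
    tenant: team-x
    environment: dev
---
apiVersion: v1
kind: Namespace
metadata:
  name: team-x-prod
  labels:
    tenant: team-x
    environment: prod
```

The CI job for a given stage would then deploy into the matching namespace only.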
I don't know much about kubernetes, but as far as I know, it is a system that enables you to control and manage containerized applications. So, generally speaking, the essence of the benefit that we get from kubernetes is the ability to "tell" kubernetes what containers we want running, how many of them, on which machines, among other details, and kubernetes will take care of doing that for us. Is that correct?
If so, I just can't see the benefit of running a CI pipeline using a kubernetes pod, as I understand that some people do. Let's say you have your build tools on Docker containers instead of having them installed on a specific machine, that's great - you can just use those containers in the build process, why kubernetes? Is there any performance gain or something like this?
Appreciate some insights.
It is highly recommended to get a good understanding of what Kubernetes is and what it can and cannot do.
Generally, containers combined with an orchestration tool can provide better management of your machines and services. This can significantly improve the reliability of your application and reduce the time and resources spent on DevOps.
Some of the features worth noting are:
Horizontal infrastructure scaling: New servers can be added or removed easily.
Auto-scaling: Automatically change the number of running containers, based on CPU utilization or other application-provided metrics.
Manual scaling: Manually scale the number of running containers through a command or the interface.
Replication controller: The replication controller makes sure your cluster always has the specified number of pod replicas running. If there are too many pods, it terminates the extra ones. If there are too few, it starts more.
Health checks and self-healing: Kubernetes can check the health of nodes and containers ensuring your application doesn’t run into any failures. Kubernetes also offers self-healing and auto-replacement so you don’t need to worry about if a container or pod fails.
Traffic routing and load balancing: Traffic routing sends requests to the appropriate containers. Kubernetes also comes with built-in load balancers so you can balance resources in order to respond to outages or periods of high traffic.
Automated rollouts and rollbacks: Kubernetes handles rollouts for new versions or updates without downtime while monitoring the containers’ health. If the rollout doesn’t go well, the change can be rolled back to the previous version.
Canary Deployments: Canary deployments enable you to test the new deployment in production in parallel with the previous version.
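Several of the features listed above (a fixed replica count, health checks, rolling updates, CPU-based auto-scaling) are declared in manifests. The following is only a sketch: the name web, the nginx image, the probe path and the thresholds are assumptions chosen for illustration.

```yaml
# Hypothetical Deployment plus autoscaler; all names and numbers are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                  # desired number of pods
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0        # never drop below the desired count during a rollout
      maxSurge: 1              # add at most one extra pod at a time
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: "100m"      # CPU utilization is measured against this request
          readinessProbe:      # traffic is only routed to pods that pass this check
            httpGet:
              path: /
              port: 80
          livenessProbe:       # the container is restarted if this check keeps failing
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 5
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```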
However you should also know what Kubernetes is not:
Kubernetes is not a traditional, all-inclusive PaaS (Platform as a
Service) system. Since Kubernetes operates at the container level
rather than at the hardware level, it provides some generally
applicable features common to PaaS offerings, such as deployment,
scaling, load balancing, and lets users integrate their logging,
monitoring, and alerting solutions. However, Kubernetes is not
monolithic, and these default solutions are optional and pluggable.
Kubernetes provides the building blocks for building developer
platforms, but preserves user choice and flexibility where it is
important.
Especially in your use case note that Kubernetes:
Does not deploy source code and does not build your application.
Continuous Integration, Delivery, and Deployment (CI/CD) workflows are
determined by organization cultures and preferences as well as
technical requirements.
The decision is yours but having in mind the main concepts above will help you make it.
An important detail is that you do not tell Kubernetes what nodes a given pod should run on; it picks itself, and if the cluster is low on resources, in many cases it can actually allocate more nodes on its own (via the cluster autoscaler).
So if your CI system is fairly busy, and uses containers for everything, it could make more sense to run each individual build job as a Kubernetes Job. If you have 100 builds that all start at the same time, it's possible for the cluster to give itself more hardware, and the build queue will clear out faster. Particularly if you're using Kubernetes for other tasks, this can save you some administrative effort over maintaining a dedicated pool of CI-system workers that need to be separately updated and will sit mostly idle until that big set of builds arrives.
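For illustration, a single build expressed as a Kubernetes Job could look roughly like this. The job name, the image choice and the command are placeholders; a real build would also need to fetch the source and publish its artifacts, which is left out here.

```yaml
# Hypothetical build job; name, image and command are illustrative assumptions.
apiVersion: batch/v1
kind: Job
metadata:
  name: build-1234
spec:
  backoffLimit: 0                  # do not retry a failed build automatically
  ttlSecondsAfterFinished: 3600    # clean up the finished job after an hour
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: build
          image: maven:3.9-eclipse-temurin-17
          # Placeholder command; the actual checkout and build steps go here.
          command: ["mvn", "--version"]
          resources:
            # Requests tell the scheduler (and the cluster autoscaler)
            # how much capacity each build needs.
            requests:
              cpu: "1"
              memory: "2Gi"
```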
Kubernetes's security settings are also substantially better than Docker's. Say your CI system needs to launch containers as part of a build. In Kubernetes, it can run under a service account, and be given permissions to create and delete deployments in a specific namespace, and nothing else. In Docker the standard approach is to give your CI system access to the host's Docker socket, but this can be easily exploited to take over the host.
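A sketch of the scoped permissions described above, using standard RBAC objects. The namespace ci-builds and the object names are made up for this example, and the namespace is assumed to exist already.

```yaml
# Hypothetical RBAC setup; namespace and names are illustrative assumptions.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ci-deployer
  namespace: ci-builds
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deployment-manager
  namespace: ci-builds
rules:
  # Only Deployments, and only inside the ci-builds namespace.
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ci-deployer-deployment-manager
  namespace: ci-builds
subjects:
  - kind: ServiceAccount
    name: ci-deployer
    namespace: ci-builds
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: deployment-manager
```

A CI pod running under the ci-deployer service account can then manage Deployments in that one namespace and nothing else.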
I’m trying to figure out and learn the patterns and best practices on moving a bunch of Docker containers I have for an application into Kubernetes. Things like, pod design, services, deployments, etc. For example, I could create a Pod with the single web and application containers in them, but that’d not be a good design.
Searching for things like architecture and design with Kubernetes just seems to yield topics on the product’s architecture or how to implement a Kubernetes cluster, and not the overlay of designing the pods, services, etc.
What does the community generally call this application-layer design in the Kubernetes world, and can anyone refer me to a 101 on this topic please?
Thanks.
Kubernetes is a complex system, and learning step by step is the best way to gain expertise. What I recommend is the Kubernetes documentation, where you can learn about each of the components.
Another good option is to review the 70 best K8S tutorials, which are categorized in many ways.
Designing and running applications with scalability, portability, and robustness in mind can be challenging. Here are great resources about it:
Architecting applications for Kubernetes
Using Kubernetes in production, lessons learned
Kubernetes Design Principles from Google
Well, there's no Kubernetes-specific approach but rather a Cloud Native one: I would suggest Designing Distributed Systems: Patterns and Paradigms by Brendan Burns.
It's really good because it provides several scenarios along with pattern approaches and related code.
Most of the examples are obviously based on Kubernetes, but I think the implementation is not so important, since what you have to understand is why and when to use an Ambassador pattern or a FaaS according to the application's needs.
The answer to this can be quite complex and that's why it is important that software/platform architects understand K8s well.
Mostly you will find an answer that tells you "put each application component in a single pod". And basically that's correct, as the main reasons for using K8s are high availability, fault tolerance of the infrastructure and things like that. This means that if you put every single component in its own pod and run it with a replica count higher than 2, it will reach better availability.
But you also need to know why you want to go to K8s. At the moment it is a trending topic. But if you don't want to operate a cluster and don't actually need HA or the like, why not run on something like AWS ECS, Digital Ocean droplets and co?
The best answers you will currently find are all around how to design and cut microservices, as each microservice could be represented by a pod. Also, a good starting point is RedHat's Principles of container-based Application Design, or InfoQ.
A Kubernetes cluster is composed of:
A master server, called the control plane
Nodes: the machines that execute the applications (containers or pods)
By design, a production kubernetes cluster must have at least a master server and 2 nodes according to the kubernetes documentation.
Here is a summary of the components of a kubernetes cluster:
Master = control plane:
kube-apiserver: exposes the Kubernetes API
etcd: key-value store for the cluster
kube-scheduler: distributes the pods across the nodes
kube-controller-manager: runs the controllers for nodes, pods and other cluster components.
Nodes = Servers that run applications
kubelet: runs on each node. It makes sure that the containers described in a pod are running.
kube-proxy: allows the pods to communicate inside the cluster and with the outside
Container runtime: runs the containers / pods
Complementary modules = addons
DNS: DNS server that serves DNS records for Kubernetes services.
Webui: Graphical dashboard for the cluster
Container Resource Monitoring: Records metrics on containers in a central DB, provides UI to browse them
Cluster-level Logging: Records container logs in a central log with a search / browse interface.
I am trying to deploy my set of microservices on different nodes. For installing kubeadm and creating the cluster I am following the documentation below.
https://medium.com/#SystemMining/setup-kubenetes-cluster-on-ubuntu-16-04-with-kubeadm-336f4061d929
https://medium.com/#Grigorkh/install-kubernetes-on-ubuntu-1ac2ef522a36
https://www.youtube.com/watch?v=b_fOIELGMDY&t=108s
I need one master with 2 worker machines. I have a clear idea of how to create the Kubernetes cluster.
My requirements: I have an application which consists of separate sets of microservices. I need to deploy the Docker images for one set of microservices on node1, the Docker images for another set on node2, a third set of microservices on node3, and so on. This is my deployment plan. Please correct me if I am going in the wrong direction, since I have only just started exploring Docker, Kubernetes, Jenkins and DevOps.
My confusions:
My requirement is a region-wise deployment by node. Is this deployment strategy possible with Kubernetes? And is it one of the standard ways?
If I am using Jenkins to implement the CI/CD pipeline, do I need to install Jenkins on each VM, meaning on the master machine and also on the machines that host the nodes?
These are all my points of confusion about this Kubernetes deployment. Please correct me if my thoughts are wrong, since I am only a beginner in the DevOps world. How can I clarify my doubts about deployment using Kubernetes?
To answer your first question - you basically need to allocate each node for a tenant. If there are compliance/regulatory reasons then you should do it (Though it won't be very efficient). Here is how you can do it:
On the node1 add a taint:
kubectl taint nodes node1.compute.companyname.com reservedfor=tenant1:NoSchedule
This means that node1 will only accept pods that have a matching toleration, and no other pods. For each microservice that you need to schedule on node1, add a toleration to the pod YAML file like:
tolerations:
- key: "reservedfor"
  operator: "Equal"
  value: "tenant1"
  effect: "NoSchedule"
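The same logic can be extended: if tenant1 needs 4 machines, all 4 machines can be tainted with the above key-value pair, and the tenant's pods will tolerate every one of them. Check out the documentation here and blog with an example here
You can also use pod/node affinity to achieve the above. Keep in mind that a taint only keeps other pods off the node; a toleration does not by itself force a pod onto that node, so a nodeSelector or node affinity is what actually pins the tenant's pods to the reserved machines.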
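As a sketch of combining the two mechanisms: the toleration below matches the taint from the command above, while the nodeSelector assumes you have additionally labelled the node yourself with reservedfor=tenant1 (that label, the pod name and the image are assumptions, not something the taint creates for you).

```yaml
# Hypothetical pod for tenant1; name, image and the node label are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: tenant1-service
spec:
  # Pins the pod to nodes carrying the label reservedfor=tenant1.
  # The label must be added separately; tainting a node does not label it.
  nodeSelector:
    reservedfor: tenant1
  # Allows the pod onto nodes tainted with reservedfor=tenant1:NoSchedule.
  tolerations:
    - key: "reservedfor"
      operator: "Equal"
      value: "tenant1"
      effect: "NoSchedule"
  containers:
    - name: tenant1-service
      image: registry.example.com/tenant1-service:1.0.0
```

The taint keeps everyone else out and the selector keeps this pod in, which together give the dedicated-node behaviour.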
Your second question on Jenkins - No, you don't need to install Jenkins on each node, but other than that more details are needed for that question.
I have recently started getting familiar with Kubernetes. While I do get the concept, I have some questions I am unable to answer clearly through the Kubernetes concepts documentation, and some understandings that I'd like to confirm.
A Deployment is a group of one or more container images (Docker ..etc) that is deployed within a Pod, and through Kubernetes Deployment Controller such deployments are monitored and created, updated, or deleted.
A Pod is a group of one or more containers, are those containers from the same Deployment, or can they be from multiple deployments?
"A pod models contains one or more application containers which are relatively tightly coupled". Is there any clear criteria on when to deploy containers within the same pod, rather than separate pods?
"Pods are the smallest deployable units of computing that can be created and managed in Kubernetes" - Pods, Kuberenets Documentation. Is that to mean that Kubernetes API is unable to monitor, and manage containers (at least directly)?
Appreciate your input.
your question is actually too broad for StackOverflow but I'll quickly answer before this one is closed.
Maybe it gets clearer when you look at the API documentation, which you could read like this:
A Deployment describes a specification of the desired behavior for the contained objects.
This is done within the spec field which is of type DeploymentSpec.
A DeploymentSpec defines what the related Pods should look like, with a template given through the PodTemplateSpec
The PodTemplateSpec then holds the PodSpec with all the required parameters, and that defines what the containers within this Pod should look like through a Container definition.
This is not a punchy oneline statement, but maybe makes it easier to see how things relate to each other.
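The same chain can be read off a manifest. In this sketch the comments mark which API type each section corresponds to; the names and images are placeholders chosen for illustration.

```yaml
# Hypothetical Deployment; names and images are illustrative assumptions.
apiVersion: apps/v1
kind: Deployment              # Deployment
metadata:
  name: example
spec:                         # DeploymentSpec
  replicas: 2
  selector:
    matchLabels:
      app: example
  template:                   # PodTemplateSpec
    metadata:
      labels:
        app: example
    spec:                     # PodSpec
      containers:             # list of Container definitions
        - name: app
          image: nginx:1.25
        - name: log-shipper
          image: busybox:1.36
          command: ["sh", "-c", "tail -f /dev/null"]
```

Every Pod created from this Deployment contains both containers, since they come from the one pod template; a Pod does not mix containers from different Deployments.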
Related to the criteria on what's a good size and what's too big for a Pod or a Container. This is very opinion loaded and the best way to figure that out is to read through the opinions on the size of Microservices.
To cover your last point: Kubernetes is able to monitor and manage containers, but the "user" is not able to schedule single containers. They have to be embedded in a Pod definition. You can of course access status and details per container, e.g. through kubectl logs <pod> -c <container> or through the metrics API.
I hope this helps a bit and doesn't add to the confusion.
A Pod is an abstraction provided by Kubernetes, and it corresponds to a group of containers which share a subset of namespaces, most importantly the network namespace. For instance, the applications running in these containers can interact the way applications in the same VM would, except for the fact that they don't share the same filesystem hierarchy.
Workloads run in the form of pods, but a Pod is a lower-level abstraction. Workloads are typically scheduled in terms of Kubernetes Deployments / Jobs / CronJobs / DaemonSets etc., which in turn create the Pods.
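A small sketch of the shared network namespace: two containers in one Pod, where the second reaches the first over localhost. The pod name, images and the polling loop are assumptions made for illustration.

```yaml
# Hypothetical two-container pod; names, images and command are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: web-with-poller
spec:
  containers:
    - name: web
      image: nginx:1.25
      ports:
        - containerPort: 80
    - name: poller
      image: busybox:1.36
      # Both containers share the Pod's network namespace, so the web
      # container is reachable on localhost from here.
      command:
        - sh
        - -c
        - "while true; do wget -q -O /dev/null http://localhost:80/ && echo ok; sleep 10; done"
```

Each container still has its own filesystem; sharing files between them would need an explicitly mounted volume.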
I am trying to deploy multiple Usergrid containers on different machines and make them point to a Cassandra cluster. But I cannot find documentation about running multiple Usergrid nodes; I only found instructions for the Cassandra cluster.
Is this the right way to scale up my Usergrid services ? Or, what is the best practice to run multiple Usergrid nodes ?
My understanding is that this is the correct way to go about it. You just need to deploy the ROOT.war file to a new Tomcat instance.
Docs for configuring the usergrid-deployment.properties file so that UG knows where Cass and ES instances are, then deploying to Tomcat are steps 4 and 5 here: https://usergrid.apache.org/docs/installation/deployment-guide.html#deploying-the-usergrid-stack
You can also use the AWS cloudformation scripts in the repo to have AWS handle this for you (https://github.com/apache/usergrid/tree/master/deployment/aws)
There is no documented architecture for a scalable Usergrid deployment. You need to configure your own deployment based on your requirements. Some samples can be found on the internet; this presentation helped me to configure our Usergrid installation: http://events.linuxfoundation.org/sites/events/files/slides/Intro-To-Usergrid%20-%20ApacheCon%20EU%202014.pdf (pages 47-48).
And here is my deployment strategy: All the components (tomcat, C*, es) are java applications, so putting them on to the same machine will be expensive on RAM. So, separate the layers, and scale them independently. For example, if your application chokes on incoming user connections, just scale up tomcat cluster (behind a LB probably). Spend time on configuring Cassandra, and don't stick to the default values - your data will be there and you don't want to lose it.