I've worked quite a lot with Docker over the past few years, but I'm a newbie when it comes to Kubernetes. I'm starting today and I'm struggling with the usefulness of the Pod concept compared with the way I used to do things with Docker Swarm.
Let's say that I have a cluster with 7 powerful machines and I have the following stack:
I want three Cassandra replicas, each running on a dedicated machine (3/7)
I want two Kafka replicas, each running on a dedicated machine (5/7)
I want a MyProducer replica running on its own machine, receiving messages from the web and pushing them into Kafka (6/7)
I want 3 MyConsumer replicas all running on the last machine (7/7), which pull from Kafka and insert into Cassandra.
With Docker Swarm I used to handle container distribution with node labels, e.g. I would label three machines C_HOST and constrain the Cassandra service to them, label two machines K_HOST for Kafka, and so on. The Swarm deployment would then place each container correctly.
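Roughly, a sketch of what that looked like (the label value true and the image tag are just examples):

docker node update --label-add C_HOST=true <node-name>

services:
  cassandra:
    image: cassandra:3.11
    deploy:
      replicas: 3
      placement:
        constraints:
          - node.labels.C_HOST == true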
I have the following questions:
Do Kubernetes Pods bring any advantage compared to my previous approach (e.g. simplicity)? I understand that I am still required to configure labels; if so, I don't see the appeal.
What would be the correct way to configure these pods? Would it be one pod for Cassandra replicas, one pod for Kafka replicas, one pod for MyConsumer replicas and one pod for MyProducer?
Using pod anti-affinity, you can ensure that a pod is not co-located with other pods with specific labels.
So say you have a label "app" with values "cassandra", "kafka", "my-producer" and "my-consumer".
Since you want to have cassandra, kafka and my-producer on dedicated nodes all by themselves, you simply configure an anti-affinity against ALL the existing app label values:
(see https://kubernetes.io/docs/concepts/configuration/assign-pod-node/ for full schema)
podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
  - labelSelector:
      matchExpressions:
      - key: app
        operator: In
        values:
        - cassandra
        - kafka
        - my-producer
        - my-consumer
    topologyKey: kubernetes.io/hostname # required field; the rule is evaluated per node
This goes on the Pod spec, so you'd define it in the pod template of a Deployment (where you also define how many replicas).
Since you want three instances of my-consumer running on the same node (or really, you don't care where they run, since by now only one node is left), you do not need to define anything about affinity or anti-affinity:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-consumer
  namespace: default
  labels:
    app: my-consumer
spec:
  selector:
    matchLabels:
      app: my-consumer
  replicas: 3 # here you set the number of replicas that should run
  template: # this is the pod template
    metadata:
      labels:
        app: my-consumer # now this is the label you can set an anti-affinity to
    spec:
      containers:
      - image: ${IMAGE}
        name: my-consumer
      # affinity:
      #   here, below this, you'd put the affinity settings from above
      #   for the other deployments
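For illustration, a sketch of how e.g. the cassandra Deployment could look with the affinity block filled in (the image tag is just a placeholder; in practice Cassandra is usually run as a StatefulSet, but the scheduling part is identical):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  replicas: 3
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - cassandra
                - kafka
                - my-producer
                - my-consumer
            topologyKey: kubernetes.io/hostname
      containers:
      - name: cassandra
        image: cassandra:3.11 # placeholder

Because the rule also matches the pod's own app=cassandra label, the three replicas are forced onto three different nodes as well.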
You can still use node labels, via the nodeSelector field.
You can add node labels by using kubectl:
kubectl label nodes <node-name> <label-key>=<label-value> to add a label to the node you've chosen.
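As a minimal sketch of that approach (the label host-role=cassandra and the image are just example names, not something from the question):

kubectl label nodes node1 host-role=cassandra

Then in the pod template:

spec:
  nodeSelector:
    host-role: cassandra # only schedule onto nodes carrying this label
  containers:
  - name: cassandra
    image: cassandra:3.11 # placeholder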
But a more advanced way is to use affinity for pod distribution, as described above.
I am trying to avoid having to create three different images for separate deployment environments.
Some context on our current ci/cd pipeline:
For the CI portion, we build our app into a docker container and then submit that container to a security scan. Once the security scan is successful, the container gets put into a private container repository.
For the CD portion, using helm charts, we pull the container from the repository and then deploy to a company managed Kubernetes cluster.
There was a request, and the solution was to use a piece of software in the container. For some reason (I'm the DevOps person, not the software engineer) the software needs environment variables (specific to the deployment environment) passed to it when it starts. How can we start this software and pass environment variables to it at deployment time?
I could just create three different images with the environment variables but I feel like that is an anti-pattern. It takes away from the flexibility of having one image that can be deployed to different environments.
Can anyone point me to resources that explain how to start an application with specific environment variables using Helm? I've looked but did not find a solution or anything that pointed me in the right direction. As plan B, I'll just create three different images, but I want to make sure that there is not a better way.
Depending on the container orchestrator, you can pass environment variables in different ways:
Plain docker:
docker run -e MY_VAR=MY_VAL <image>
Docker Compose:
version: '3'
services:
  app:
    image: '<image>'
    environment:
      - MY_VAR=my-value
Check on docker-compose docs
Kubernetes:
apiVersion: v1
kind: Pod
metadata:
  name: envar-demo
spec:
  containers:
  - name: app
    image: <image>
    env:
    - name: MY_VAR
      value: "my value"
Check the Kubernetes docs.
Helm:
Add the values in your values.yaml:
myKey: myValue
Then reference it in your helm template:
apiVersion: v1
kind: Pod
metadata:
  name: envar-demo
spec:
  containers:
  - name: app
    image: <image>
    env:
    - name: MY_VAR
      value: {{ .Values.myKey | quote }} # quote so the value always renders as a string
Check out the helm docs.
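To keep a single image across deployment environments, a common pattern (a sketch; the file and release names here are just examples) is one values file per environment, selected at deploy time:

# values-dev.yaml
myKey: dev-value

# values-prod.yaml
myKey: prod-value

helm upgrade --install my-app ./chart -f values-prod.yaml

Individual values can also be overridden on the command line with --set myKey=some-value, so the image itself never has to change.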
I am new to Kubernetes but familiar with docker.
Docker Use Case
Usually, when I want to persist data I just create a volume with a name then attach it to the container, and even when I stop it then start another one with the same image I can see the data persisting.
So this is what I used to do in Docker:
docker volume create nginx-storage
docker run -it --rm -v nginx-storage:/usr/share/nginx/html -p 80:80 nginx:1.14.2
then I:
Create a new html file in /usr/share/nginx/html
Stop the container
Run the same docker run command again (this creates another container with the same volume)
The html file still exists (which means the data persisted in that volume); see the command sketch below
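As a concrete sketch of those steps (the container names and the test file are just examples):

docker volume create nginx-storage
docker run -d --name web -v nginx-storage:/usr/share/nginx/html -p 80:80 nginx:1.14.2
docker exec web sh -c 'echo hello > /usr/share/nginx/html/test.html'
docker rm -f web
docker run -d --name web2 -v nginx-storage:/usr/share/nginx/html -p 80:80 nginx:1.14.2
curl localhost/test.html   # still prints "hello": the data lives in the volume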
Kubernetes Use Case
Usually, when I work with Kubernetes volumes I specify a PVC (PersistentVolumeClaim) and a PV (PersistentVolume) using hostPath, which bind-mounts a directory or a file from the host machine into the container.
What I want to do is reproduce the same behavior as in the previous example (Docker Use Case), so how can I do that? Is the process of creating volumes in Kubernetes different from Docker? If possible, a YAML file would help me understand.
To a first approximation, you can't (portably) do this. Build your content into the image instead.
There are two big practical problems, especially if you're running a production-oriented system on a cloud-hosted Kubernetes:
If you look at the list of PersistentVolume types, very few of them can be used in ReadWriteMany mode. It's very easy to get, say, an AWSElasticBlockStore volume that can only be used on one node at a time, and something like this will probably be the default cluster setup. That means you'll have trouble running multiple pod replicas serving the same (static) data.
Once you do get a volume, it's very hard to edit its contents. Consider the aforementioned EBS volume: you can't edit it without being logged into the node on which it's mounted, which means finding the node, convincing your security team that you can have root access over your entire cluster, enabling remote logins, and then editing the file. That's not something that's actually possible in most non-developer Kubernetes setups.
The thing you should do instead is build your static content into a custom image. An image registry of some sort is all but required to run Kubernetes and you can push this static content server into the same registry as your application code.
FROM nginx:1.14.2
COPY . /usr/share/nginx/html
# Base image has a working CMD, no need to repeat it
Then in your deployment spec, set image: registry.example.com/nginx-frontend:20220209 or whatever you've chosen to name this build of this image, and do not use volumes at all. You'd deploy this the same way you deploy other parts of your application; you could use Helm or Kustomize to simplify the update process.
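A minimal sketch of such a Deployment (the image name and tag are placeholders for whatever you actually push):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-frontend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-frontend
  template:
    metadata:
      labels:
        app: nginx-frontend
    spec:
      containers:
      - name: nginx
        image: registry.example.com/nginx-frontend:20220209 # your built image
        ports:
        - containerPort: 80
      # note: no volumes or volumeMounts at all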
Correspondingly, in the plain-Docker case, I'd avoid volumes here. You don't discuss how files get into the nginx-storage named volume; if you're using imperative commands like docker cp or debugging tools like docker exec, those approaches are hard to script and are intrinsically local to the system they're running on. It's not easy to copy a Docker volume from one place to another. Images, though, can be pushed and pulled through a registry.
I managed to do that by creating only a PVC. This is how I did it (with an Nginx image):
nginx-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nginx-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
nginx-deployment.yaml
# Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template: # template for the pods
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          volumeMounts:
            - mountPath: /usr/share/nginx/html
              name: nginx-data
      volumes:
        - name: nginx-data
          persistentVolumeClaim:
            claimName: nginx-data
      restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: nginx
  ports:
    - name: http
      port: 80
      nodePort: 30080
  type: NodePort
Once I run kubectl apply on the PVC and then on the deployment, going to localhost:30080 shows a 404 Not Found page, which means that all the data in /usr/share/nginx/html was deleted once the container started. That is because a directory from the k8s cluster node is bind-mounted into that container as a volume:
/usr/share/nginx/html <-- dir in volume
/var/lib/k8s-pvs/nginx2-data/pvc-9ba811b0-e6b6-4564-b6c9-4a32d04b974f <-- dir from node (was automatically created)
I tried adding a new index.html file into that container's html dir, then deleted the container; a new container was created by the pod, and localhost:30080 served the newly created home page.
I also tried deleting the deployment and reapplying it (without deleting the PVC), checked localhost:30080, and everything still persists.
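For reference, the apply order as a sketch (the file names are the ones above):

kubectl apply -f nginx-pvc.yaml
kubectl apply -f nginx-deployment.yaml
kubectl get pvc nginx-data   # should eventually report STATUS Bound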
An alternative solution, kubernetes.io/docs/tasks/configure-pod-container/…, was pointed out in the comments by larsks.
After building a docker image named my-http I can create a deployment from it with
kubectl create deploy http-deployment --image=my-http
This does not work, because imagePullPolicy is Always and the image exists only locally, so it cannot be pulled.
So then run
kubectl edit deploy http-deployment
and change the imagePullPolicy to Never, then it runs.
But for automation purposes I've created a yaml to create the deployment and set the imagePullPolicy at the same time.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: http-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: http
  template:
    metadata:
      labels:
        app: http
    spec:
      containers:
        - name: my-http
          image: my-http
          imagePullPolicy: Never
          ports:
            - containerPort: 8080
Then I kubectl apply -f the file and the pods start running, but after a while a CrashLoopBackOff starts with the message
container image my-http already present on machine
Apparently it has something to do with the container port, but what should I use for that port to get it running? There is no container running...
Edit: the "image already present" message is just informational; this is the last line in the pod description:
Warning  BackOff  7s (x8 over 91s)  kubelet, minikube  Back-off restarting failed container
If you are using a Kubernetes cluster, your images are only available on the nodes where you built them.
You have to push the images to a container registry; Kubernetes will then pull the image onto the node that will run the container.
If you want to run the container only on the nodes where you built the images, you have to use nodeSelector or pod/node affinity.
https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/
Your image is probably a private image, which Kubernetes can't pull if you didn't specify imagePullSecrets.
This shouldn't be the problem here, however, because imagePullPolicy: Never just uses the image already present on the nodes. You can diagnose the real problem with kubectl describe pod <pod_name>, or by getting the logs of the previous pod with the --previous flag, because the newer pod may not have encountered the problem yet.
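For example (the pod name is a placeholder for whatever kubectl get pods shows):

kubectl describe pod http-deployment-xxxxx          # check the Events section at the bottom
kubectl logs http-deployment-xxxxx --previous       # logs of the last crashed container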
I have a Jenkins pipeline using the Kubernetes plugin to run a Docker-in-Docker container and build images:
pipeline {
  agent {
    kubernetes {
      label 'kind'
      defaultContainer 'jnlp'
      yaml """
apiVersion: v1
kind: Pod
metadata:
  labels:
    name: dind
...
I also have a pool of persistent volumes in the jenkins namespace each labelled app=dind. I want one of these volumes to be picked for each pipeline run and used as /var/lib/docker in my dind container in order to cache any image pulls on each run. I want to have a pool and caches, not just a single one, as I want multiple pipeline runs to be able to happen at the same time. How can I configure this?
This can be achieved natively in kubernetes by creating a persistent volume claim as follows:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dind
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  selector:
    matchLabels:
      app: dind
and mounting it into the Pod, but I'm not sure how to configure the pipeline to create and clean up such a persistent volume claim.
First of all, I don't think the way you propose to achieve this natively in Kubernetes would work. You would either have to re-use the same PVC, which would make the build pods access the same PV concurrently, or, if you want a PV per build, your PVs will be stuck in the Released status and not automatically available for new PVCs.
There are more details and discussion here: https://issues.jenkins.io/browse/JENKINS-42422.
It so happens that I wrote two simple controllers: an automatic PV releaser (which finds Released PVs and makes them Available again for new PVCs) and a dynamic PVC provisioner (for the Jenkins Kubernetes plugin specifically, so you can define a PVC as an annotation on a Pod). Check them out here: https://github.com/plumber-cd/kubernetes-dynamic-reclaimable-pvc-controllers. There is a full Jenkinsfile example here: https://github.com/plumber-cd/kubernetes-dynamic-reclaimable-pvc-controllers/tree/main/examples/jenkins-kubernetes-plugin-with-build-cache.
I'm now using Kubernetes to run a Docker container. I just create the container and I use SSH to connect to my pods.
I need to make some system config changes, so I need to reboot the container, but when I reboot the container it loses all the data in the pod. Kubernetes then runs a new pod that looks just like the original Docker image.
So how can I reboot the pod and keep the data in it?
The Kubernetes cluster is offered by Bluemix.
You need to learn more about containers as your question suggests that you are not fully grasping the concepts.
Running SSH in a container is an anti-pattern; a container is not a virtual machine. So remove the SSH server from it.
The fact that you run SSH indicates that you may be running more than one process per container. This is usually bad practice, so remove that supervisor and call your main process directly in your entrypoint.
Set up your container image's main process to use environment variables or configuration files for its configuration at runtime.
The last item means that you can define environment variables in your Pod manifest or use a Kubernetes ConfigMap to store a configuration file. Your Pod will read those and the process in your container will get configured properly. If not, your Pod will die or your process will not run properly, and you can simply edit the environment variable or the ConfigMap.
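As a sketch of the environment-variable flavour (the names app-config, APP_MODE and the image are made up for illustration):

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  APP_MODE: "production"
---
apiVersion: v1
kind: Pod
metadata:
  name: myapplication
spec:
  containers:
    - name: myapplication
      image: registry.example.com/myimage:latest # placeholder
      envFrom:
        - configMapRef:
            name: app-config # every key in the ConfigMap becomes an environment variable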
My main suggestion here is to not use Kubernetes until you have your Docker image properly written and your configuration thought through; you should not have to exec into the container to get your process running.
Finally, more generally, you should not keep state inside a container.
To store your data you need to set up persistent storage. If you're using, for example, Google Cloud as your platform, you would need to create a disk to store your data on and declare the use of this disk in your manifest.
With Bluemix it looks like you just have to create the volumes and use them.
bx ic volume-create myapplication_volume ext4
bx ic run --volume myapplication_volume:/data --name myapplication registry.eu-gb.bluemix.net/<my_namespace>/my_image
Bluemix - Persistent storage documentation
I don't use Bluemix myself, so I'll proceed with an example manifest using Google's persistent disks.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapplication
  namespace: default
spec:
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  selector:
    matchLabels:
      app: myapplication
  template:
    metadata:
      labels:
        app: myapplication
    spec:
      containers:
        - name: myapplication
          image: eu.gcr.io/myproject/myimage:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 80
            - containerPort: 443
          volumeMounts:
            - mountPath: /data
              name: myapplication-volume
      volumes:
        - name: myapplication-volume
          gcePersistentDisk:
            pdName: mydisk-1
            fsType: ext4
Here the disk mydisk-1 is mapped to the /data mountpoint.
The only data that will persist after reboots will be inside that folder.
If you want to store your logs, for example, you could symlink the logs folder:
/var/log/someapplication -> /data/log/someapplication
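That is, roughly (a sketch):

ln -s /data/log/someapplication /var/log/someapplication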
It works, but this is NOT recommended!
It's not clear to me whether you're SSHing into the nodes or using some tool to execute a shell inside the containers. Even though running multiple processes per container is bad practice, it seems to work very well if you keep tabs on memory and CPU use.
Running an SSH server and cron jobs in the same container, for example, will absolutely work, though it's not the best of solutions.
We've been using supervisor with multiple (2-5) processses in production for over a year now and it's working surprisingly well.
For more information about persistent volumes on a variety of platforms, see:
https://kubernetes.io/docs/concepts/storage/persistent-volumes/