How to attach a volume to a Kubernetes pod container like in Docker? - docker

I am new to Kubernetes but familiar with docker.
Docker Use Case
Usually, when I want to persist data I just create a volume with a name then attach it to the container, and even when I stop it then start another one with the same image I can see the data persisting.
This is what I used to do in Docker:
docker volume create nginx-storage
docker run -it --rm -v nginx-storage:/usr/share/nginx/html -p 80:80 nginx:1.14.2
then I:
Create a new html file in /usr/share/nginx/html
Stop container
Run the same docker run command again (will create another container with same volume)
html file exists (which means data persisted in that volume)
Kubernetes Use Case
Usually, when I work with Kubernetes volumes, I specify a PVC (PersistentVolumeClaim) and a PV (PersistentVolume) using hostPath, which bind-mounts a directory or a file from the host machine into the container.
What I want to do is reproduce the behavior from the previous example (Docker Use Case). How can I do that? Is the process of creating volumes in Kubernetes different from Docker? If possible, a YAML file would help me understand.

To a first approximation, you can't (portably) do this. Build your content into the image instead.
There are two big practical problems, especially if you're running a production-oriented system on a cloud-hosted Kubernetes:
If you look at the list of PersistentVolume types, very few of them can be used in ReadWriteMany mode. It's very easy to get, say, an AWSElasticBlockStore volume that can only be used on one node at a time, and something like this will probably be the default cluster setup. That means you'll have trouble running multiple pod replicas serving the same (static) data.
Once you do get a volume, it's very hard to edit its contents. Consider the aforementioned EBS volume: you can't edit it without being logged into the node on which it's mounted, which means finding the node, convincing your security team that you can have root access over your entire cluster, enabling remote logins, and then editing the file. That's not something that's actually possible in most non-developer Kubernetes setups.
The thing you should do instead is build your static content into a custom image. An image registry of some sort is all but required to run Kubernetes and you can push this static content server into the same registry as your application code.
FROM nginx:1.14.2
COPY . /usr/share/nginx/html
# Base image has a working CMD, no need to repeat it
Then in your deployment spec, set image: registry.example.com/nginx-frontend:20220209 or whatever you've chosen to name this build of this image, and do not use volumes at all. You'd deploy this the same way you deploy other parts of your application; you could use Helm or Kustomize to simplify the update process.
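As a rough illustration of that deployment spec, here is a minimal sketch. The resource name, labels, and replica count are assumptions chosen for the example; only the image reference comes from the text above.

```yaml
# Sketch only: name, labels, and replica count are illustrative assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-frontend
spec:
  replicas: 3                 # no shared volume, so replicas can land on any node
  selector:
    matchLabels:
      app: nginx-frontend
  template:
    metadata:
      labels:
        app: nginx-frontend
    spec:
      containers:
        - name: nginx
          # Image built from the Dockerfile above and pushed to your registry
          image: registry.example.com/nginx-frontend:20220209
          ports:
            - containerPort: 80
      # No volumes or volumeMounts: the static content ships inside the image
```

Publishing new content then means building and pushing a new tag and changing the image: line, which is the same workflow as any other application update.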
Correspondingly, in the plain-Docker case, I'd avoid volumes here. You don't discuss how files get into the nginx-storage named volume; if you're using imperative commands like docker cp or debugging tools like docker exec, those approaches are hard to script and are intrinsically local to the system they're running on. It's not easy to copy a Docker volume from one place to another. Images, though, can be pushed and pulled through a registry.

I managed to do that by creating only a PVC. This is how I did it (with an Nginx image):
nginx-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nginx-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
nginx-deployment.yaml
# Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template: # template for the pods
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          volumeMounts:
            - mountPath: /usr/share/nginx/html
              name: nginx-data
      volumes:
        - name: nginx-data
          persistentVolumeClaim:
            claimName: nginx-data
      restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: nginx
  ports:
    - name: http
      port: 80
      nodePort: 30080
  type: NodePort
Once I run kubectl apply on the PVC and then on the deployment, going to localhost:30080 shows a 404 Not Found page. This means the content that the image ships in /usr/share/nginx/html is no longer visible once the container starts, because an (initially empty) directory from the k8s cluster node is bind-mounted over that path as a volume:
/usr/share/nginx/html <-- dir in volume
/var/lib/k8s-pvs/nginx2-data/pvc-9ba811b0-e6b6-4564-b6c9-4a32d04b974f <-- dir from node (was automatically created)
I tried adding a new index.html file to the html dir inside that container, then deleted the container. The pod created a new container, and localhost:30080 served the newly created home page.
I tried deleting the deployment and reapplying it (without deleting the PVC), then checked localhost:30080, and everything still persisted.
An alternative solution was suggested in the comments by larsks: kubernetes.io/docs/tasks/configure-pod-container/…

Related

Copy files from container to host while inside the container

I'm working on automation pipeline using Kubernetes and Jenkins. All my commands are running from inside the jnlp-slave container. The jnlp-slave is deployed onto a worker node by Kubernetes. I have -v /var/run/docker.sock on my jnlp-slave so it can run docker commands from inside the container.
Issue:
I'm trying to copy files inside the jnlp-slave container to the host machine (worker node), but the command below does not copy files to host machine, but to destination of the container itself:
docker cp <container_id>:/home/jenkins/workspace /home/jenkins/workspace
Clarification:
Since the container is executing the command, files located inside the container are copied to the destination path, which is also inside the container.
Normally, docker commands are executed on the host machine, so docker cp can be used to copy files from container to host and from host to container. But in this case, docker cp is executed from inside the container.
How can I make the container copy files to the host machine without running docker commands on the host? Is there a command the container can run to copy files to the host?
P.S. I've tried mounting a volume on the host, but the files can only be shared from the host to the container and not the other way around. Any help is appreciated, thanks.
As suggested in the comments, you should probably redesign your solution entirely.
But let's summarize what you currently have and figure out what you can do with it without making your solution even more complicated.
What I did was copy the files from jnlp-slave container to the other containers.
Copying files from one container to all the others is a bit of an overkill (by the way, how many of them do you place in one Pod?).
Maybe your containers don't have to be deployed within the same Pod? If for some reason that is currently impossible, maybe the content of the /home/jenkins/workspace directory at least shouldn't be an integral part of your Docker image, as it is now? If this is also impossible, you have no other remedy than to copy those files from the original container based on that image to a shared location that is also available to the other containers in the same Pod.
emptyDir, mentioned in the comments, might be an option that lets you achieve this. However, keep in mind that the data stored in an emptyDir volume is not persistent in any way and is deleted along with the pod. If you're OK with that, it may be a solution for you.
If your data is originally part of your image, it must first be transferred to your emptyDir volume, since by its very definition the volume is initially empty. Simply mounting it under /home/jenkins/workspace won't make the data originally available in this directory in your container automagically appear in the emptyDir volume. From the moment you mount your emptyDir under /home/jenkins/workspace, that path will show the content of the emptyDir, i.e. nothing.
Therefore you need to pre-populate it somehow, and one of the available ways to do that is an initContainer. As your data is originally an integral part of your Docker image, you must use the same image for the initContainer.
Your deployment may look similar to the one below:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      initContainers:
        - name: pre-populate-empty-dir
          image: <image-1>
          command: ['sh', '-c', 'cp -a /home/jenkins/workspace/* /mnt/empty-dir-content/']
          volumeMounts:
            - name: cache-volume
              mountPath: "/mnt/empty-dir-content/"
      containers:
        - name: app-container-1
          image: <image-1>
          ports:
            - containerPort: 8081
          volumeMounts:
            - name: cache-volume
              mountPath: "/home/jenkins/workspace"
        - name: app-container-2
          image: <image-2>
          ports:
            - containerPort: 8082
          volumeMounts:
            - name: cache-volume
              mountPath: "/home/jenkins/workspace"
        - name: app-container-3
          image: <image-3>
          ports:
            - containerPort: 8083
          volumeMounts:
            - name: cache-volume
              mountPath: "/home/jenkins/workspace"
      volumes:
        - name: cache-volume
          emptyDir: {}
Once your data has been copied to an emptyDir volume by the initContainer, it will be available for all the main containers and can be mounted under the same path by each of them.

Is there any better way for changing the source code of a container instead of creating a new image?

What is the best way to change the source code of my application running as Kubernetes pod without creating a new version of image so I can avoid time taken for pushing and pulling image from repository?
You may enter the container using bash, if it is installed in the image, and modify the code from there:
docker exec -it <CONTAINERID> /bin/bash
However, this isn't an advisable solution. If your modifications succeed, you should update the Dockerfile accordingly, or else you risk losing your work and the ability to share it with others.
Have the container pull from git on creation?
Setup CI/CD?
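The first suggestion above (pulling from git when the container is created) could be sketched roughly as follows. Everything specific here is an assumption for illustration: the repository URL, the runtime image name, the mount paths, and the use of the public alpine/git image (whose entrypoint is assumed to be git) are not from the original answer.

```yaml
# Sketch only: repository URL, image names, and paths are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: app-from-git
spec:
  initContainers:
    - name: fetch-source
      image: alpine/git        # assumed: public image whose entrypoint is git
      args: ["clone", "--depth=1", "https://example.com/your/repo.git", "/src"]
      volumeMounts:
        - name: source
          mountPath: /src      # empty at start, so git can clone into it
  containers:
    - name: app
      image: registry.example.com/my-app-runtime:dev   # hypothetical runtime image
      volumeMounts:
        - name: source
          mountPath: /app      # the application reads its code from here
  volumes:
    - name: source
      emptyDir: {}             # lives only as long as the pod
```

With this layout, deleting the pod and letting it be recreated fetches the latest commit without rebuilding the image, at the cost of the running code no longer being pinned to an image tag.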
Another way to achieve a similar result is to leave the application source outside of the container and mount the application source folder in the container.
This is especially useful when developing web applications in environments such as PHP: your container is setup with your Apache/PHP stack and /var/www/html is configured to mount your local filesystem.
If you are using minikube, it already mounts a host folder within the minikube VM. You can find the exact paths mounted, depending on your setup, here:
https://kubernetes.io/docs/getting-started-guides/minikube/#mounted-host-folders
Putting it all together, this is what an nginx deployment would look like on Kubernetes, mounting a local folder that contains the website being served:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.7.9
          ports:
            - containerPort: 80
          volumeMounts:
            # Default document root of the official nginx image;
            # use /var/www/html/ instead for an Apache/PHP image.
            - mountPath: /usr/share/nginx/html
              name: sources
              readOnly: true
      volumes:
        - name: sources
          hostPath:
            path: /Users/<username>/<source_folder>
            type: Directory
We finally resolved the issue by changing our image repository from Docker Hub to AWS ECR in the same region where our Kubernetes cluster runs. Pushing and pulling images now takes much less time.
This is definitely not recommended for production.
But if your intention is local development with kubernetes, take a look at these tools:
Telepresence
Telepresence is an open source tool that lets you run a single service
locally, while connecting that service to a remote Kubernetes cluster.
Kubectl warp
Warp is a kubectl plugin that allows you to execute your local code
directly in Kubernetes without slow image build process.
The kubectl warp command runs your command inside a container, the same
way as kubectl run does, but before executing the command, it
synchronizes all your files into the container.
I think creating a new image for each deployment should be treated as the standard process.
A few benefits:
immutable images: nobody modifies a running instance, which ensures the same image runs in any environment
rollback: if you encounter issues in a new version, you can roll back to the previous version
dependencies: new versions may have new dependencies

Kubernetes: Specify a tarball docker image to run pod

I have saved a docker image as a tar file locally using the command,
docker save -o ./dockerImage:version.tar docker.io/image:latest-1.0
How can I specify this file in my pod.yaml so that the pod starts from this tarball instead of pulling the image (or using an already pulled one) to launch the container?
Current pod.yaml file:
apiVersion: myApp/v1
kind: myKind
metadata:
  name: myPod2
spec:
  baseImage: docker.io/image
  version: latest-1.0
I want something similar to this:
apiVersion: myApp/v1
kind: myKind
metadata:
  name: myPod2
spec:
  baseImage: localDockerImage.tar:latest-1.0
  version: latest-1.0
There's no direct way to achieve that in Kubernetes.
See the discussions here: https://github.com/kubernetes/kubernetes/issues/1668
The issue was eventually closed for the following reason:
Given that there are a number of ways to do this (your own cluster startup scripts, run a daemonset to side load your custom images, create VM images with images pre-loaded, run a cluster-local docker registry), and the fact that there have been no substantial updates in over two years, I'm going to close this as obsolete.
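One of the workarounds named in that closing comment is pre-loading images onto the nodes. A minimal sketch under that assumption: if the tarball has already been loaded on every node that may run the pod (for example with docker load -i on a Docker-based node), a plain Pod can reference the image by name and tell the kubelet never to pull it. The pod and container names are hypothetical, and whether the custom myKind resource from the question exposes a pull policy is not known, so a standard Pod is shown instead.

```yaml
# Sketch only: assumes the image was side-loaded onto the node beforehand.
apiVersion: v1
kind: Pod
metadata:
  name: mypod2
spec:
  containers:
    - name: app
      # Must match the name and tag stored in the loaded tarball
      image: docker.io/image:latest-1.0
      # Never contact a registry; the pod fails to start if the image is absent
      imagePullPolicy: Never
```

The tarball itself is never referenced from the manifest; Kubernetes only sees the image name that the node's container runtime already knows.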

how to inspect the content of persistent volume by kubernetes on azure cloud service

I have packaged my software into a container, and I need to deploy the container to a cluster using Azure Container Service. The software writes its output to the directory /src/data/, and I want to access the content of the whole directory.
After searching, I found two possible solutions:
Use Blob Storage on Azure, but after searching I can't find a workable method.
Use a Persistent Volume, but all the official Azure documentation and the pages I found are about the Persistent Volume itself, not about how to inspect it.
I need to access and manage my output directory on the Azure cluster. In other words, I need a savior.
As I've explained here and here, in general, if you can interact with the cluster using kubectl, you can create a pod/container, mount the PVC inside, and use the container's tools to, e.g., ls the contents. If you need more advanced editing tools, replace the container image busybox with a custom one.
Create the inspector pod
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: pvc-inspector
spec:
  containers:
    - image: busybox
      name: pvc-inspector
      command: ["tail"]
      args: ["-f", "/dev/null"]
      volumeMounts:
        - mountPath: /pvc
          name: pvc-mount
  volumes:
    - name: pvc-mount
      persistentVolumeClaim:
        claimName: YOUR_CLAIM_NAME_HERE
EOF
Inspect the contents
kubectl exec -it pvc-inspector -- sh
$ ls /pvc
Clean Up
kubectl delete pod pvc-inspector

How to reboot kubernetes pod and keep the data

I'm now using Kubernetes to run a Docker container. I just create the container and use SSH to connect to my pods.
I need to make some system config changes, so I need to reboot the container, but when I reboot the container it loses all the data in the pod. Kubernetes runs a new pod that is just like the original Docker image.
So how can I reboot the pod and keep the data in it?
The Kubernetes cluster is provided by Bluemix.
You need to learn more about containers as your question suggests that you are not fully grasping the concepts.
Running SSH in a container is an anti-pattern; a container is not a virtual machine. So remove the SSH server from it.
The fact that you run SSH indicates that you may be running more than one process per container. This is usually bad practice. So remove that supervisor and call your main process directly in your entrypoint.
Set up your container image's main process to use environment variables or configuration files for configuration at runtime.
The last item means that you can define environment variables in your Pod manifest or use Kubernetes ConfigMaps to store configuration files. Your Pod will read those and the process in your container will be configured properly. If not, your Pod will die or your process will not run properly, and you can just edit the environment variable or ConfigMap.
My main suggestion here is to not use Kubernetes until you have your Docker image properly written and your configuration thought through. You should not have to exec into the container to get your process running.
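To make that last item concrete, here is a minimal sketch of a ConfigMap mounted as a configuration file alongside an environment variable. The resource names, the image, the variable, the file name, and the file contents are all hypothetical; substitute whatever your process actually reads.

```yaml
# Sketch only: names, image, and configuration keys are hypothetical.
apiVersion: v1
kind: ConfigMap
metadata:
  name: myapplication-config
data:
  # Each key becomes a file in the mounted directory
  app.conf: |
    log_level = info
---
apiVersion: v1
kind: Pod
metadata:
  name: myapplication
spec:
  containers:
    - name: myapplication
      image: registry.example.com/myapplication:1.0
      env:
        - name: LOG_LEVEL          # read by the main process at startup
          value: "info"
      volumeMounts:
        - name: config
          mountPath: /etc/myapplication   # app.conf appears in this directory
          readOnly: true
  volumes:
    - name: config
      configMap:
        name: myapplication-config
```

Changing the configuration then means editing the manifest or the ConfigMap and recreating the pod, rather than logging into a running container.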
Finally, more generally, you should not keep state inside a container.
To store your data you need to set up persistent storage. If you're using Google Cloud as your platform, for example, you would need to create a disk to store your data on and define the use of this disk in your manifest.
With Bluemix it looks like you just have to create the volumes and use them.
bx ic volume-create myapplication_volume ext4
bx ic run --volume myapplication_volume:/data --name myapplication registry.eu-gb.bluemix.net/<my_namespace>/my_image
Bluemix - Persistent storage documentation
I don't use Bluemix myself, so I'll proceed with an example manifest using Google's persistent disks.
# apps/v1 replaces extensions/v1beta1, which stopped serving Deployments in Kubernetes 1.16
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapplication
  namespace: default
spec:
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  selector:
    matchLabels:
      app: myapplication
  template:
    metadata:
      labels:
        app: myapplication
    spec:
      containers:
        - name: myapplication
          image: eu.gcr.io/myproject/myimage:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 80
            - containerPort: 443
          volumeMounts:
            - mountPath: /data
              name: myapplication-volume
      volumes:
        - name: myapplication-volume
          gcePersistentDisk:
            pdName: mydisk-1
            fsType: ext4
Here the disk mydisk-1 is mapped to the /data mountpoint.
The only data that will persist after reboots will be inside that folder.
If you want to store your logs for example you could symlink the logs folder.
/var/log/someapplication -> /data/log/someapplication
It works, but this is NOT recommended!
It's not clear to me if you're sshing to the nodes or using some tool to execute a shell inside the containers. Even though running multiple processes per container is bad practice it seems to be working very well, if you keep tabs on memory and cpu use.
Running a ssh server and cronjobs in the same container for example will absolutely work though it's not the best of solutions.
We've been using supervisor with multiple (2-5) processes in production for over a year now and it's working surprisingly well.
For more information about persistent volumes on a variety of platforms, see:
https://kubernetes.io/docs/concepts/storage/persistent-volumes/

Resources