Unable to write file from docker run inside a kubernetes pod - docker

I have a docker image that uses a volume to write files:
docker run --rm -v /home/dir:/out/ image:cli args
When I try to run this inside a pod, the container exits normally but no file is written.
I don't get it.
The container throws errors if it does not find the volume; for example, if I run it without the -v option it throws:
Unhandled Exception: System.IO.DirectoryNotFoundException: Could not find a part of the path '/out/file.txt'.
But I don't get any error from the container.
It finishes as if it wrote the files, but the files do not exist.
I'm quite new to Kubernetes, but this is driving me crazy.
Does Kubernetes prevent files from being written, or am I missing something obvious?
The whole Kubernetes context is managed by GCP composer-airflow, if it helps...
docker -v: Docker version 17.03.2-ce, build f5ec1e2

If you want that behavior in Kubernetes, you can use a hostPath volume.
Essentially, you specify it in your pod spec, and the volume is mounted from the directory on the node where your pod runs, so the file should still be there on the node after the pod exits.
apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
  - image: image:cli
    name: test-container
    volumeMounts:
    - mountPath: /out
      name: test-volume
  volumes:
  - name: test-volume
    hostPath:
      path: /home/dir
      type: Directory
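After the pod completes, you can check which node it ran on and then look in the hostPath directory on that node; for example (pod name from the spec above, host path from the question):
# Show which node the pod was scheduled to
kubectl get pod test-pd -o wide
# Then, on that node, list the host directory
ls /home/dir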

when I try to run this inside a pod the container exits normally but no file is written
First of all, there is no need to run the docker run command inside the pod :). A spec file (YAML) should be written for the pod, and Kubernetes will run the container in the pod using Docker for you. Ideally, you don't need to run docker commands when using Kubernetes (unless you are debugging Docker-related issues).
This link has useful kubectl commands for docker users.
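For example, a rough one-off equivalent of the docker run command from the question would be the following sketch (my-cli is a placeholder pod name, and the volume still needs a pod spec like the one in the previous answer):
# Run image:cli once as a bare pod that is not restarted when it exits
kubectl run my-cli --image=image:cli --restart=Never -- args
# Check its output afterwards
kubectl logs my-cli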
If you are used to docker-compose, see Kompose to go from docker-compose to Kubernetes:
https://github.com/kubernetes/kompose
http://kompose.io
Some options for mounting data as a volume inside a container in Kubernetes:
hostPath
emptyDir
configMap

Related

How to merge data from docker image with kubernetes volume (NFS)

I have a docker image that contains data in the directory /opt/myfiles. Let's say the following:
/opt/myfiles/file1.txt
/opt/myfiles/file2.dat
I want to deploy that image to kubernetes and mount an NFS volume to that directory so that the changes to these files are persisted when I delete the pod.
When I do this in docker swarm, I simply mount an empty NFS volume to /opt/myfiles/ and then start my docker swarm service; the volume is populated with the files from the image and I can then work with my service. When I delete the service, I still have the files on my NFS server, so on the next start of the service I have my previous state back.
In kubernetes, when I mount an empty NFS volume to /opt/myfiles/, the pod is started and /opt/myfiles/ is overwritten with an empty directory, so my pod does not see the files from the image anymore.
My volume mount and volume definition:
[...]
    volumeMounts:
    - name: myvol
      mountPath: /opt/myfiles
[...]
  volumes:
  - name: myvol
    nfs:
      server: nfs-server.mydomain.org
      path: /srv/shares/myfiles
I read some threads about similar problems (for example K8s doesn't mount files on Persistent Volume) and tried some things using subPath and subPathExpr as in the documentation (https://kubernetes.io/docs/concepts/storage/volumes/#using-subpath), but none of my changes does what docker swarm does by default.
The behaviour of Kubernetes seems strange to me, as I have worked with docker swarm for quite a while now and I am familiar with the way docker swarm handles this. But I am sure there is a reason why Kubernetes handles it differently and that there is some way to get what I need.
So please, can someone have a look at my problem and help me find a way to get the following behaviour?
I have files in my image in some directory
I want to mount an NFS volume to that directory, because I need the data to be persisted when, for example, my pod crashes and moves to another host, or when my pod is temporarily shut down for whatever reason.
When my pod starts, I want the volume to be populated with the files from the image
And of course I would be really happy if someone could explain to me why Kubernetes and docker swarm behave so differently.
Thanks a lot in advance
This can usually be achieved in Kubernetes with init containers.
Let's say you have an image with files stored in the /mypath folder. If you are able to reconfigure the container to use a different path (like /persistent), you can use an init container to copy the files from /mypath to /persistent on the pod's startup.
containers:
- name: myapp-container
  image: myimage
  env:
  - name: path
    value: /persistent
  volumeMounts:
  - name: myvolume
    mountPath: /persistent
initContainers:
- name: copy-files
  image: myimage
  volumeMounts:
  - name: myvolume
    mountPath: /persistent
  command: ['sh', '-c', 'cp /mypath/*.* /persistent']
volumes:
# the NFS volume from the question
- name: myvolume
  nfs:
    server: nfs-server.mydomain.org
    path: /srv/shares/myfiles
In this case you have a main container, myapp-container, using files from the /persistent folder on the NFS volume, and each time the pod starts, the files from /mypath are copied into that folder by the init container copy-files.
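To verify that the copy worked, you can list the volume contents from the main container once the pod is running (myapp-pod is a placeholder pod name):
kubectl exec myapp-pod -c myapp-container -- ls -l /persistent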

Kubernetes mountPropagation clarifications

I have a docker image A that contains a folder I need to share with another container B in the same K8s pod.
At first I decided to use a shared volume (emptyDir) and launched A as an init container to copy all the content of the folder into the shared volume. This works fine.
Then, looking at the k8s docs, I realised I could use mountPropagation between the containers.
So I changed the initContainer to a plain container (sidecar) in the same pod and performed a mount of the container A folder I want to share with container B. This works fine, but I need to keep container A running with a wait loop. Or not...
Then I decided to come back to the initContainer pattern and do the same, meaning mount the folder in A inside the shared volume; the container then finishes because it is an initContainer, and I use the newly mounted folder in container B. And it works!
So my question is: can someone explain whether this is expected on all Kubernetes clusters, and why the folder mounted from A, which is no longer running as a container, can still be seen by my other container?
Here is a simple manifest to demonstrate it.
apiVersion: v1
kind: Pod
metadata:
  name: testvol
spec:
  initContainers:
  - name: busybox-init
    image: busybox
    securityContext:
      privileged: true
    command: ["/bin/sh"]
    args: ["-c", "mkdir -p /opt/connectors; echo \"bar\" > /opt/connectors/foo.txt; mkdir -p /opt/connectors_new; mount --bind /opt/connectors /opt/connectors_new; echo connectors mount is ok"]
    volumeMounts:
    - name: connectors
      mountPath: /opt/connectors_new
      mountPropagation: Bidirectional
  containers:
  - name: busybox
    image: busybox
    command: ["/bin/sh"]
    args: ["-c", "cat /opt/connectors/foo.txt; trap : TERM INT; (while true; do sleep 1000; done) & wait"]
    volumeMounts:
    - name: connectors
      mountPath: /opt/connectors
      mountPropagation: HostToContainer
  volumes:
  - name: connectors
    emptyDir: {}
This works because your containers run in a pod. The pod is where your volume is defined, not the container. So you are creating a volume in your pod that is an empty directory. Then you are mounting it in your init container and making changes. That makes changes to the volume on the pod.
Then when your init container finishes, the files at the pod level don't go away, they are still there, so your second container picks up the files when it mounts the same volume from the pod.
This is expected behavior and doesn't need mountPropagation fields at all. The mountPropagation fields may have some effect on emptyDir volumes, but it is not related to preserving the files:
https://kubernetes.io/docs/concepts/storage/volumes/#emptydir
All containers in the Pod can read and write the same files in the emptyDir volume, though that volume can be mounted at the same or different paths in each container. When a Pod is removed from a node for any reason, the data in the emptyDir is deleted permanently.
Note: A container crashing does not remove a Pod from a node. The data in an emptyDir volume is safe across container crashes.
The note here doesn't explicitly state it, but this implies it is also safe across initContainer to Container transitions. As long as your pod exists on the node, your data will be there in the volume.
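To illustrate that, here is a simplified sketch of the same pod (names reused from the manifest above) with no privileged init container, no mount --bind, and no mountPropagation fields; the init container writes directly into the emptyDir mount and the main container still sees the file:
apiVersion: v1
kind: Pod
metadata:
  name: testvol-simple
spec:
  initContainers:
  - name: busybox-init
    image: busybox
    command: ["/bin/sh", "-c", "echo bar > /opt/connectors/foo.txt"]
    volumeMounts:
    - name: connectors
      mountPath: /opt/connectors
  containers:
  - name: busybox
    image: busybox
    command: ["/bin/sh", "-c", "cat /opt/connectors/foo.txt; sleep 3600"]
    volumeMounts:
    - name: connectors
      mountPath: /opt/connectors
  volumes:
  - name: connectors
    emptyDir: {}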

Docker volume vs Kubernetes persistent volume

The closest answer I found is this.
But what I want to know is: will the Dockerfile VOLUME command be totally ignored by Kubernetes? Or will data be persisted in two places, one in the docker volume (on the host where the pod is running) and another in Kubernetes's PV?
The reason for asking is that I deploy some containers from Docker Hub which contain a VOLUME command. Meanwhile, I also attach a PVC to my pod. I am wondering whether a local volume (a docker volume, not a K8s PV) will be created on the node. If my pod is scheduled to another node, will another new volume be created?
On top of this, thanks to @Rico for pointing out that the -v command and Kubernetes's mount take precedence over the Dockerfile VOLUME command, but what about the scenario below:
dockerfile VOLUME onto '/myvol'
Kubernetes mount PVC to '/anotherMyVol'
In this case, will /myvol be mounted on my local node's hard disk and cause data to be persisted locally without my being aware of it?
It will not be ignored unless you override it on your Kubernetes pod spec. For example, if you follow this example from the Docker documentation:
$ docker run -it container bash
root@7efcf5ef12a2:/# mount | grep myvol
/dev/nvmeXnXpX on /myvol type ext4 (rw,relatime,discard,data=ordered)
root@7efcf5ef12a2:/#
You'll see that it's mounted on the root drive of the host where the container is running. Docker actually creates a volume on the host filesystem under /var/lib/docker/volumes (/var/lib/docker is your Docker graph directory):
$ pwd
/var/lib/docker/volumes
$ find . | grep greeting
./d0bc20d085243c39c4f386dce2f6cafcd8146128d6b0c8f9dcb27cfb61a7ecab/_data/greeting
You can override this with the -v option in Docker:
$ docker run -it -v /mnt:/myvol container bash
root@1c7211cf43d0:/# cd /myvol/
root@1c7211cf43d0:/myvol# touch hello
root@1c7211cf43d0:/myvol# exit
exit
$ pwd # <= on the host
/mnt
$ ls
hello
So on Kubernetes you can override it in the pod spec:
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
  - name: mycontainer
    image: container
    volumeMounts:
    - name: storage
      mountPath: /myvol
  volumes:
  - name: storage
    hostPath:
      path: /mnt
      type: Directory
You need to explicitly define a PersistentVolumeClaim and/or PersistentVolume. This is not done for you.
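For reference, a minimal sketch of such a claim (the claim name, access mode, size, and the assumption of a default StorageClass are all placeholders here):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: storage-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
The pod would then reference it with a persistentVolumeClaim volume (claimName: storage-claim) instead of the hostPath volume shown above.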

Is there any better way for changing the source code of a container instead of creating a new image?

What is the best way to change the source code of my application running as a Kubernetes pod without creating a new version of the image, so I can avoid the time taken to push and pull the image from the repository?
You may enter the container using bash, if it is installed in the image, and modify it using:
docker exec -it <CONTAINERID> /bin/bash
However, this isn't an advisable solution. If your modifications succeed, you should update the Dockerfile accordingly, or else you risk losing your work and the ability to share it with others.
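Since the container runs inside a Kubernetes pod, the equivalent would be kubectl exec (pod and container names below are placeholders):
kubectl exec -it <POD_NAME> -c <CONTAINER_NAME> -- /bin/bash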
Have the container pull from git on creation?
Set up CI/CD?
Another way to achieve a similar result is to leave the application source outside of the container and mount the application source folder in the container.
This is especially useful when developing web applications in environments such as PHP: your container is set up with your Apache/PHP stack, and /var/www/html is configured to mount your local filesystem.
If you are using minikube, it already mounts a host folder within the minikube VM. You can find the exact paths mounted, depending on your setup, here:
https://kubernetes.io/docs/getting-started-guides/minikube/#mounted-host-folders
Putting it all together, this is what an nginx deployment would look like on Kubernetes, mounting a local folder containing the website being served:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
        volumeMounts:
        - mountPath: /var/www/html/
          name: sources
          readOnly: true
      volumes:
      - name: sources
        hostPath:
          path: /Users/<username>/<source_folder>
          type: Directory
Finally, we have resolved the issue. We changed our image repository from Docker Hub to AWS ECR in the same region where we are running our Kubernetes cluster. Now it takes much less time to push/pull images.
This is definitely not recommended for production.
But if your intention is local development with kubernetes, take a look at these tools:
Telepresence
Telepresence is an open source tool that lets you run a single service
locally, while connecting that service to a remote Kubernetes cluster.
Kubectl warp
Warp is a kubectl plugin that allows you to execute your local code
directly in Kubernetes without slow image build process.
The kubectl warp command runs your command inside a container, the same
way as kubectl run does, but before executing the command, it
synchronizes all your files into the container.
I think creating a new image for each deployment should be taken as the standard process.
A few benefits:
immutable images: no intervention in the running instance, which ensures the image runs the same in any environment
rollback: if you encounter issues in the new version, roll back to the previous version
dependencies: new versions may have new dependencies

How to inspect the content of a persistent volume in Kubernetes on Azure cloud service

I have packed the software into a container. I need to deploy the container to a cluster via Azure Container Service. The software writes its output to a directory, /src/data/, and I want to access the content of that whole directory.
After searching, I have two solutions:
Use Blob Storage on Azure, but after searching I can't find a practical method to do this.
Use a Persistent Volume, but all the official Azure documentation and other pages I found are about the Persistent Volume itself, not about how to inspect its contents.
I need to access and manage my output directory on the Azure cluster. In other words, I need a savior.
As I've explained here and here, in general, if you can interact with the cluster using kubectl, you can create a pod/container, mount the PVC inside, and use the container's tools to, e.g., ls the contents. If you need more advanced editing tools, replace the container image busybox with a custom one.
Create the inspector pod
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: pvc-inspector
spec:
  containers:
  - image: busybox
    name: pvc-inspector
    command: ["tail"]
    args: ["-f", "/dev/null"]
    volumeMounts:
    - mountPath: /pvc
      name: pvc-mount
  volumes:
  - name: pvc-mount
    persistentVolumeClaim:
      claimName: YOUR_CLAIM_NAME_HERE
EOF
Inspect the contents
kubectl exec -it pvc-inspector -- sh
$ ls /pvc
Clean Up
kubectl delete pod pvc-inspector
