Mounting a large file in Kubernetes - docker

We are running a pod in Kubernetes that needs to load a file during runtime. This file has the following properties:
It is known at build time
It should be mounted read-only by multiple pods (the same kind)
It might change (externally to the cluster) and needs to be updated
For various reasons (security being the main concern) the file cannot be inside the docker image
It is potentially quite large, theoretically up to 100 MB, but in practice between 200 kB and 10 MB.
We have considered various options:
Creating a persistent volume, mounting it in a temporary pod to write (update) the file, unmounting it, and then mounting it in the service with ROX (ReadOnlyMany) claims. This solution requires downtime during upgrades, and it is hard to automate (due to timing).
Creating multiple secrets using the secrets management of Kubernetes, and then "assembling" the file in an init container or something similar before loading it.
Both of these solutions feel a little bit hacky. Is there a better solution out there that we could use for solving this?
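For reference, the second option (splitting the file across several secrets and reassembling it in an init container) could look roughly like the sketch below. It is only an illustration under some assumptions: each part stays well below the 1 MiB size limit of a single Secret, all parts end up mounted as files in one directory (for example through a projected volume), and the part-NNNN naming and the checksum argument are made up for the example.

```python
import hashlib
from pathlib import Path

# Assumption: keep each part well below the 1 MiB limit of a single Secret.
CHUNK_SIZE = 512 * 1024


def split_file(data: bytes, chunk_size: int = CHUNK_SIZE) -> dict:
    """Split the payload into numbered parts, one per Secret key (hypothetical naming)."""
    parts = {}
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        parts[f"part-{index:04d}"] = data[offset:offset + chunk_size]
    return parts


def assemble(parts_dir: Path, target: Path, expected_sha256: str) -> None:
    """Concatenate the mounted parts in name order and verify the checksum."""
    digest = hashlib.sha256()
    tmp = target.with_suffix(".tmp")
    with tmp.open("wb") as out:
        for part in sorted(parts_dir.glob("part-*")):
            chunk = part.read_bytes()
            digest.update(chunk)
            out.write(chunk)
    if digest.hexdigest() != expected_sha256:
        tmp.unlink()
        raise ValueError("assembled file does not match the expected checksum")
    # Rename only after verification so readers never see a partial file.
    tmp.replace(target)
```

The init container would call assemble() with the mount directory and a target path on a volume shared with the main container (an emptyDir, for instance), so the service only starts once the complete file is in place.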

You need to use a shared filesystem that supports read/write access from multiple pods (the ReadWriteMany access mode).
Here is a link to the CSI drivers that can be used with Kubernetes and provide that access:
https://kubernetes-csi.github.io/docs/drivers.html
Ideally, you want a solution that is not an appliance and can run anywhere, whether in the cloud or on-prem.
The platforms that could work for you are Ceph, GlusterFS, and Quobyte (disclaimer: I work for Quobyte).

Related

Should I use the application code inside the docker images or in volumes?

I am working on a DevOps project and want to find the best solution.
I am torn between two options: should I put the application code inside the Docker images or in volumes?
Your code should almost never be in volumes, developer-only setups aside (and even then). This is doubly true if you have something like the common developer-only Node setup that puts the node_modules directory into a Docker-managed anonymous volume: since Docker will refuse to update that directory on its own, the primary effect of this is to cause Docker to ignore any changes to the package.json file.
More generally, in this context, you should think of the image as a way to distribute the application code. Consider clustered environments like Kubernetes: the cluster manager knows how to pull versioned Docker images on its own, but you need to work around a lot of the standard machinery to try to push code into a volume. You should not need to both distribute a Docker image and also separately distribute the code in the image.
I'd suggest using host-directory mounts for injecting configuration files and for storing file-based logs (if the container can't be configured to log to stdout). Use either host-directory or named-volume mounts for stateful containers' data (host directories are easier to back up, named volumes are faster on non-Linux platforms). Do not use volumes at all for your application code or libraries.
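To make the split concrete, here is a small sketch that builds a docker run command following the advice above: configuration comes from a read-only host directory, data goes to a named volume, and nothing is mounted over the application code. The image name, host path, volume name and the /app/config and /app/data targets are all invented for the example.

```python
import shlex


def docker_run_command(image: str, config_dir: str, data_volume: str) -> list:
    """Build a `docker run` command that mounts config and data, but no code."""
    return [
        "docker", "run",
        # Host directory with configuration files, mounted read-only.
        "--mount", f"type=bind,source={config_dir},target=/app/config,readonly",
        # Named volume for the container's persistent data.
        "--mount", f"type=volume,source={data_volume},target=/app/data",
        # The application code itself comes from the versioned image.
        image,
    ]


# Hypothetical image, host path and volume name, for illustration only.
print(shlex.join(docker_run_command(
    "registry.example.com/myapp:1.4.2", "/srv/myapp/config", "myapp-data")))
```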
(Consider that, if you're just overwriting all of the application code with volume mounts, you may as well just use the base node image and not build a custom image; and if you're doing that, you may as well use your automation system (Salt Stack, Ansible, Chef, etc.) to just install Node and ignore Docker entirely.)

Dask +SLURM over ftp mount (CurlFtpFS)

I have a working Dask/SLURM cluster of 4 Raspberry Pis with a common NFS share, on which I can run Python jobs successfully.
However, I want to add some more ARM devices to my cluster that do not support NFS mounts (kernel module missing), so I wish to move to FUSE-based FTP mounts with CurlFtpFS.
I have set up the mounts successfully with an anonymous username and no password, and the common FTP share can be seen by all the nodes (just as before, when it was an NFS share).
I can still run SLURM jobs (since they do not use the share), but when I try to run a Dask job the master node times out, complaining that no worker nodes could be started.
I am not sure what exactly the problem is, since the share is open to anyone for read/write access (e.g. logs and Dask queue intermediate files).
Any ideas how I can troubleshoot this?
I don't believe anyone has a cluster like yours!
At a guess, filesystem access via FUSE, FTP and the Pi is much slower than the OS is expecting, and you are seeing the effects of low-level timeouts, i.e., from Dask's point of view it appears that file reads are failing. Dask needs access to storage for configuration and sometimes temporary files. You would want to make sure that these locations are on local storage or turned off. However, if this is happening during import of modules, which you have on the shared drive by design, there may be no fixing it (Python loads many small files during import). Why not use rsync to move the files to the nodes?
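One way to check the "many small files are slow" guess is to time small-file reads on local storage and on the FTP mount and compare. The script below is a rough probe using only the standard library, not a proper benchmark: caching can flatter either side, and the file count and size are arbitrary choices. If even writing the probe files to the share fails or hangs, that is informative too.

```python
import os
import sys
import tempfile
import time
from pathlib import Path


def time_small_reads(directory: Path, count: int = 200, size: int = 4096) -> float:
    """Write `count` small files under `directory`, read them back, return read seconds."""
    with tempfile.TemporaryDirectory(dir=directory) as scratch:
        paths = []
        for i in range(count):
            path = Path(scratch) / f"probe_{i}.bin"
            path.write_bytes(os.urandom(size))
            paths.append(path)
        start = time.perf_counter()
        for path in paths:
            path.read_bytes()
        return time.perf_counter() - start


if __name__ == "__main__":
    # Example: python probe.py /tmp /mnt/ftpshare  (both paths are placeholders)
    for arg in sys.argv[1:]:
        print(f"{arg}: {time_small_reads(Path(arg)):.3f} s")
```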

rbind usage on local volume mounting

I have a directory that is configured to be managed by an automounter (as described here). I need to use this directory (and all directories that are mounted inside) in multiple pods as a Local Persistent Volume.
I am able to trigger the automounter within the containers, but there are some use cases where this directory is not empty when the container starts up. This makes the sub-directories appear empty, and the automounter cannot be triggered (within the container).
I did some investigation and discovered that when using Local PVs, there is a mount -o bind command between the source directory and some internal directory managed by the kubelet (this is the line in the source code).
What I actually need is for rbind to be used (recursive binding - here is a good explanation).
Using rbind also requires some changes to the part that unmounts the volume (recursive unmounting is needed).
I don't want to patch the kubelet and recompile it... yet.
So my question is: is there an official way to provide Kubernetes with a custom mounter/unmounter?
Meanwhile, I did find a solution for this use-case.
Based on Kubernetes docs there is something called Out-Of-Tree Volume Plugins
The Out-of-tree volume plugins include the Container Storage Interface (CSI) and FlexVolume. They enable storage vendors to create custom storage plugins without adding them to the Kubernetes repository
Even though CSI is the encouraged option, I chose FlexVolume to implement my custom driver. Here is the detailed documentation.
This driver is actually a Python script that supports three actions: init/mount/unmount (--rbind is used to mount the directory managed by the automounter, and it is unmounted like this). It is deployed using a DaemonSet (docs here).
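Not the exact script, but a minimal sketch of what such a driver can look like. It follows the FlexVolume call convention (init, mount with a target directory and JSON options, unmount with a target directory) and replies with a JSON status. The "source" option name is hypothetical, and using umount -R for the recursive unmount is an assumption about how the unmount side is done.

```python
#!/usr/bin/env python3
import json
import subprocess
import sys


def handle(argv, run=subprocess.run):
    """Dispatch one FlexVolume call and return the JSON-serializable reply."""
    action = argv[0] if argv else ""
    if action == "init":
        # No attach/detach step is needed for a plain bind mount.
        return {"status": "Success", "capabilities": {"attach": False}}
    if action == "mount" and len(argv) >= 3:
        mount_dir, options = argv[1], json.loads(argv[2])
        # "source" is a hypothetical option name set in the volume definition.
        cmd = ["mount", "--rbind", options["source"], mount_dir]
    elif action == "unmount" and len(argv) >= 2:
        # Assumption: a recursive unmount undoes the recursive bind.
        cmd = ["umount", "-R", argv[1]]
    else:
        return {"status": "Not supported"}
    result = run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        return {"status": "Failure", "message": result.stderr.strip()}
    return {"status": "Success"}


if __name__ == "__main__" and len(sys.argv) > 1:
    # The kubelet reads the driver's reply as JSON on stdout.
    print(json.dumps(handle(sys.argv[1:])))
```

The command runner is passed in as a parameter only so the dispatch logic can be exercised without root privileges; a real driver would just call mount and umount directly.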
And this is it!

Analyze container file system on kubernetes after it exits/crashes

Perhaps a silly question that makes no sense:
In a Kubernetes deployment (or minikube), when a pod container crashes, I would like to analyze the file system at that moment. That way, I could see core dumps or any other useful information.
I know that I could mount a volume or PVC to get core dumps from a host-defined core pattern location, and I could also get logs by means of an rsyslog sidecar or some other way, but I would still like to do "post-mortem" analysis if possible. I assume that Kubernetes provides some mechanism for these forensic tasks (I don't know how, which is the reason for my question), because in a production system we may need to analyze killed/exited containers.
I tried playing directly with docker run without the --rm option, but I can't get anything useful from inspection, nor recreate the file system as it was at the last moment the container was alive.
Thank you very much!
When a pod container crashes, I would like to analyze the file system at that moment.
Pods (containers) natively use non-persistent storage.
When a container exits or terminates, so does the container's storage.
A pod (container) can be connected to external storage. This allows persistent data to be stored (for example, you can configure a volume mount as the path for core dumps). Since this external storage is not removed when a container is stopped or killed, it gives you more flexibility to analyze the file system afterwards. The storage can be backed by commonly used file systems such as NFS.
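As an illustration of the volume-mount approach, the sketch below generates a Pod manifest that mounts a PersistentVolumeClaim at the directory where dumps are expected to land. The pod name, image, claim name and /var/crash path are placeholders, and the core pattern itself still has to point at that directory; that part is not covered here.

```python
import json


def pod_with_dump_volume(name: str, image: str, claim: str, dump_dir: str) -> dict:
    """Return a Pod manifest that mounts an existing PVC at `dump_dir`."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": name,
                "image": image,
                # Files written here outlive the container.
                "volumeMounts": [{"name": "dumps", "mountPath": dump_dir}],
            }],
            "volumes": [{
                "name": "dumps",
                "persistentVolumeClaim": {"claimName": claim},
            }],
        },
    }


# All names below are hypothetical; kubectl accepts JSON manifests as well as YAML.
print(json.dumps(pod_with_dump_volume(
    "myapp", "registry.example.com/myapp:1.0", "crash-dumps", "/var/crash"), indent=2))
```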

How to Convert NFS into a Storage Class in kubernetes

I work in a media organisation where we deploy all our applications on monolithic VMs. We now want to move to Kubernetes, but we have a major problem: we have 40+ NFS servers from which we consume terabytes of data.
The major problem is how to read all this data from containers.
The solutions we tried:
1. Creating a Persistent Volume and Persistent Volume Claim for the NFS share. We don't consider this feasible: as the data grows, we have to create a new PV and PVC and create a deployment.
2. Mounting volumes on Kubernetes. If we do this, there would be no difference between Kubernetes and VMs.
3. Adding Docker volumes to containers. We were able to add the volume, but we cannot see the data in the container.
How can we turn the existing NFS servers into a storage class and use it, or how can we mount all 40+ NFS servers on pods?
It sounds like you need some form of object storage or block storage platform to manage the disks and automatically provision them for you.
You could use something like Rook for deploying Ceph into your cluster.
This will enable disk management in a much more friendly way, and help to automatically provision the NFS disks into your cluster.
Take a look at this: https://docs.ceph.com/docs/mimic/radosgw/nfs/
There is also the option of creating your own implementation using CRDs to trigger PV/PVC creation on certain actions/disks being mounted in your servers.
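If you go the route of creating the PV/PVC objects yourself, a small script can at least generate one pair per NFS export instead of writing 40+ manifests by hand. A rough sketch, where the server names, export paths and sizes are invented and the empty storage class name is just one way to pin each claim to its pre-created volume:

```python
import json


def nfs_volume_manifests(name, server, path, size="1Ti", access_mode="ReadWriteMany"):
    """Return a matching PersistentVolume and PersistentVolumeClaim for one NFS export."""
    pv = {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": [access_mode],
            "persistentVolumeReclaimPolicy": "Retain",
            "storageClassName": "",
            "nfs": {"server": server, "path": path},
        },
    }
    pvc = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": [access_mode],
            "storageClassName": "",
            "volumeName": name,  # bind to the PV created above
            "resources": {"requests": {"storage": size}},
        },
    }
    return [pv, pvc]


# Hypothetical inventory of exports; in practice this would come from your own records.
EXPORTS = [
    ("media-01", "nfs01.example.internal", "/exports/media"),
    ("media-02", "nfs02.example.internal", "/exports/media"),
]

items = []
for export_name, export_server, export_path in EXPORTS:
    items.extend(nfs_volume_manifests(export_name, export_server, export_path))
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
```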
