Docker data volume vs Kubernetes persistent storage

Docker Engine supports data volumes.
A Docker data volume persists after a container is deleted.
Both docker run and docker-compose support it:
docker run --volume data_vol:/mount/point
docker-compose with named volumes, using the top-level volumes key
Kubernetes also supports persistent volumes, but does it support the same concept of a data volume, that is, a volume which resides within a container?
If Kubernetes supports a data volume (within a container):
I would appreciate any reference to the documentation (or an example).
Does it also support migrating the data volume in the same way it supports migrating regular containers?
I found some related questions, but couldn't find the answer I am looking for.

What you are describing is this:
If you do not specify a host path for a Docker volume mount, Docker dynamically provisions a path on the host and persists it between restarts.
Regarding "that is, a volume which resides within a container":
The volume is created outside of the container and mounted into it afterwards.
For example:
# data_vol location is decided by docker installation
docker run --volume data_vol:/mount/point
# host path is explicitly given
docker run --volume /my/host/path:/mount/point
In Kubernetes terms, this is similar to dynamic provisioning. If you want dynamic provisioning, you need StorageClasses that match your storage backend.
Please read https://kubernetes.io/docs/concepts/storage/dynamic-provisioning/ .
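As a minimal sketch of what dynamic provisioning looks like from the user's side, assuming the cluster has a StorageClass named standard (a hypothetical name; the claim name is hypothetical too), a claim like the following asks the provisioner to create a matching volume on demand:

```yaml
# Sketch only: "standard" and "data-vol-claim" are assumed names.
# List the classes that actually exist with: kubectl get storageclass
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-vol-claim
spec:
  storageClassName: standard   # the StorageClass decides which backend provisions the volume
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

A Pod then refers to the claim by name under volumes, much as docker run refers to data_vol, and the backing storage outlives the Pod.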
If you want to specify a host path, the following is an example. You can achieve similar results with NFS, block storage, etc.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
  labels:
    app: my-ss  # must match the selector in volumeClaimTemplates below, otherwise the claim cannot bind
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 10Gi
  hostPath:
    path: /home/user/my-vol
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-ss
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-ss
  serviceName: my-svc
  template:
    metadata:
      labels:
        app: my-ss
    spec:
      containers:
      - image: ubuntu
        name: my-container
        volumeMounts:
        - mountPath: /my-vol
          name: my-vol
  volumeClaimTemplates:
  - metadata:
      name: my-vol
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
      selector:
        matchLabels:
          app: my-ss

Related

Deploying bluespice-free in Kubernetes

According to this source, I can store data in /my/data/folder by running the following:
docker run -d -p 80:80 -v {/my/data/folder}:/data bluespice/bluespice-free
I have created the following deployment, but I am not sure how to use a persistent volume with it.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bluespice
  namespace: default
  labels:
    app: bluespice
spec:
  replicas: 1
  selector:
    matchLabels:
      app: bluespice
  template:
    metadata:
      labels:
        app: bluespice
    spec:
      containers:
      - name: bluespice
        image: bluespice/bluespice-free
        ports:
        - containerPort: 80
        env:
        - name: bs_url
          value: "https://bluespice.mycompany.local"
My persistent volume claim is named bluespice-pvc.
I have also already deployed the pod without a persistent volume. Can I attach a persistent volume on the fly to keep the data?

If you want to mount a local directory, you don't need a PVC, since you can't force a specific host path in a PersistentVolumeClaim. For local testing, you can use hostPath, as explained in the documentation:
A hostPath volume mounts a file or directory from the host node's filesystem into your Pod. This is not something that most Pods will need, but it offers a powerful escape hatch for some applications.
For example, some uses for a hostPath are:
running a container that needs access to Docker internals; use a hostPath of /var/lib/docker
running cAdvisor in a container; use a hostPath of /sys
allowing a Pod to specify whether a given hostPath should exist prior to the Pod running, whether it should be created, and what it should exist as
In addition to the required path property, you can optionally specify a type for a hostPath volume.
hostPath configuration example:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bluespice
  namespace: default
  labels:
    app: bluespice
spec:
  replicas: 1
  selector:
    matchLabels:
      app: bluespice
  template:
    metadata:
      labels:
        app: bluespice
    spec:
      containers:
      - image: bluespice/bluespice-free
        name: bluespice
        volumeMounts:
        - mountPath: /data
          name: bluespice-volume
      volumes:
      - name: bluespice-volume
        hostPath:
          # directory location on host
          path: /my/data/folder
          # this field is optional
          type: Directory
However, if you move to a production cluster, you should consider a more reliable option, since hostPath volumes are a security risk and are not portable:
HostPath volumes present many security risks, and it is a best practice to avoid the use of HostPaths when possible. When a HostPath volume must be used, it should be scoped to only the required file or directory, and mounted as ReadOnly.
If restricting HostPath access to specific directories through AdmissionPolicy, volumeMounts MUST be required to use readOnly mounts for the policy to be effective.
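The read-only recommendation quoted above can be sketched as follows. This is only an illustration under assumptions: the Pod name, image, and host path are hypothetical, and a read-only mount would not suit the BlueSpice /data folder itself, which the application needs to write to.

```yaml
# Sketch only: all names and paths below are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-readonly-demo
spec:
  containers:
  - name: reader
    image: busybox
    command: ["sleep", "3600"]   # keep the container alive for inspection
    volumeMounts:
    - name: host-config
      mountPath: /config
      readOnly: true             # the container cannot write through this mount
  volumes:
  - name: host-config
    hostPath:
      path: /etc/my-app          # scope the mount to the one directory that is needed
      type: Directory            # the Pod fails to start if the directory does not exist
```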
For more information about PersistentVolumes, see the official Kubernetes documentation:
A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes. It is a resource in the cluster just like a node is a cluster resource. PVs are volume plugins like Volumes, but have a lifecycle independent of any individual Pod that uses the PV. This API object captures the details of the implementation of the storage, be that NFS, iSCSI, or a cloud-provider-specific storage system.
A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a Pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany or ReadWriteMany, see AccessModes).
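Therefore, I would recommend using a cloud storage solution such as those offered by GCP or AWS, or at least an NFS share mounted directly from Kubernetes. Also check this topic on Stack Overflow.
About your last question: it is not possible to attach a PersistentVolume to a running Pod on the fly, because the volumes of an existing Pod cannot be changed. You can add the volume to the Deployment, but that replaces the Pod with a new one.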
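For the claim mentioned in the question, a minimal sketch could look like the following. It assumes that bluespice-pvc already exists in the default namespace and is bound; the volume name bluespice-data is hypothetical. Applying it changes the Pod template, so the Deployment replaces the running Pod rather than modifying it, and anything the old Pod wrote to its own filesystem under /data is not carried over unless it is copied out first.

```yaml
# Sketch only: assumes the PVC "bluespice-pvc" exists in the same namespace.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bluespice
  namespace: default
  labels:
    app: bluespice
spec:
  replicas: 1
  selector:
    matchLabels:
      app: bluespice
  template:
    metadata:
      labels:
        app: bluespice
    spec:
      containers:
      - name: bluespice
        image: bluespice/bluespice-free
        ports:
        - containerPort: 80
        volumeMounts:
        - name: bluespice-data       # hypothetical volume name
          mountPath: /data           # data path used in the docker run command
      volumes:
      - name: bluespice-data
        persistentVolumeClaim:
          claimName: bluespice-pvc   # claim name given in the question
```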

How to mount folder with files in kubernetes

I am running a Docker image that has certain configuration files within it. I need to persist/mount the same folder to disk, as new files will get added later on. When I use a standard volume mount in Kubernetes, it mounts an empty directory without the initial configuration files. How do I make sure my initial files are copied to the volume while mounting?
        - mountPath: /tmp
          name: my-vol
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - name: my-vol
        persistentVolumeClaim:
          claimName: wso2-disk2

One possible solution is to use node storage mounted into the containers (the easiest way), or a distributed file system (DFS) such as NFS, GlusterFS, and so on.
Another, recommended way to achieve what you need is to use a persistent volume to share the same files between your containers.
Assuming you have a Kubernetes cluster with only one node, and you want to share the path /mnt/data of your node with your pods (Source):
Create a PersistentVolume:
A hostPath PersistentVolume uses a file or directory on the Node to emulate network-attached storage.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
Create a PersistentVolumeClaim:
Pods use PersistentVolumeClaims to request physical storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim
spec:
  storageClassName: manual
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
Look at the PersistentVolumeClaim:
kubectl get pvc task-pv-claim
The output shows that the PersistentVolumeClaim is bound to your PersistentVolume, task-pv-volume.
NAME            STATUS   VOLUME           CAPACITY   ACCESSMODES   STORAGECLASS   AGE
task-pv-claim   Bound    task-pv-volume   10Gi       RWO           manual         30s
Create a deployment with 2 replicas for example:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      volumes:
      - name: task-pv-storage
        persistentVolumeClaim:
          claimName: task-pv-claim
      containers:
      - name: task-pv-container
        image: nginx
        ports:
        - containerPort: 80
          name: "http-server"
        volumeMounts:
        - mountPath: "/mnt/data"
          name: task-pv-storage
Now you can check inside both containers that the path /mnt/data has the same files.
If you have a cluster with more than one node, I recommend looking at the other types of persistent volumes or using a DFS.
References:
Configure persistent volumes
Persistent volumes
Volume Types

The suggested way to provide configuration to your pod is to create a ConfigMap for it and mount it in your pod using volumes. This guide (https://kubernetes.io/docs/concepts/storage/volumes/#configmap) describes how to do that.
Another way is to create a persistent volume and a persistent volume claim in your cluster, copy your configuration files to that path, and mount the persistent volume in your pod.
You can also copy your configuration to one of the nodes in your cluster and mount that path using hostPath, but this requires your pod to run on that same node, because it looks for the path on the node where it is scheduled. (Not a recommended approach.)
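As a hedged sketch of the hostPath approach and its scheduling constraint: the Pod below is pinned to one node with a nodeSelector on the well-known kubernetes.io/hostname label. The node name worker-1, the host path, and the mount path are hypothetical; check the actual label values with kubectl get nodes --show-labels.

```yaml
# Sketch only: node name, paths and object names are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: config-from-host
spec:
  nodeSelector:
    kubernetes.io/hostname: worker-1   # schedule only on the node that holds the files
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: host-config
      mountPath: /etc/app-config
  volumes:
  - name: host-config
    hostPath:
      path: /opt/app-config            # directory copied onto worker-1 beforehand
      type: Directory
```

Pinning a Pod to a single node removes the scheduler's freedom to move it, which is one reason this approach is not recommended.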

Create a ConfigMap from the folder you would like to mount. The following creates a ConfigMap consisting of all the files in your-folder:
kubectl create configmap your-config --from-file=your-folder/
Then mount this ConfigMap as a volume and you will have the initial files in your folder. Note that you will need to mount it with subPath, since you don't want it to hide everything else in the directory.
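A minimal sketch of such a subPath mount, assuming your-folder contained a file called app.conf (a hypothetical file name, as are the Pod name, image, and target path): each key in the ConfigMap is a file name, and subPath places just that one file at the target path while the rest of the directory keeps the contents from the image.

```yaml
# Sketch only: app.conf, the mount path and the Pod name are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: configmap-subpath-demo
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: config
      mountPath: /etc/app/app.conf   # full path of the single file inside the container
      subPath: app.conf              # key in the ConfigMap (file name from your-folder)
  volumes:
  - name: config
    configMap:
      name: your-config              # created by the kubectl command above
```

One trade-off to be aware of: a file mounted through subPath does not pick up later changes to the ConfigMap until the Pod is restarted.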

Docker container does/doesnt work inside kubernetes

I am a bit confused here. It works as a normal Docker container, but when it runs inside a pod it doesn't. Here is how I do it.
Dockerfile on my local machine, used to create the image and publish it to the Docker registry:
FROM alpine:3.7
COPY . /var/www/html
CMD tail -f /dev/null
Now if I just pull the image (after deleting the local copy) and run it as a container, it works and I can see my files inside /var/www/html.
Now I want to use that inside my Kubernetes cluster.
Def : Minikube --vm-driver=none
I am running Kubernetes inside Minikube with the none driver option, so it is a single-node cluster.
EDIT
I can see my data inside /var/www/html if I remove the volume mounts and the claim from the deployment file.
Deployment file
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    io.kompose.service: app
  name: app
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      creationTimestamp: null
      labels:
        io.kompose.service: app
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 1000
      containers:
      - image: kingshukdeb/mycode
        name: pd-mycode
        resources: {}
        volumeMounts:
        - mountPath: /var/www/html
          name: claim-app-storage
      restartPolicy: Always
      volumes:
      - name: claim-app-storage
        persistentVolumeClaim:
          claimName: claim-app-nginx
status: {}
PVC file
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  creationTimestamp: null
  labels:
    io.kompose.service: app-nginx1
  name: claim-app-nginx
spec:
  storageClassName: testmanual
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
status: {}
PV file
apiVersion: v1
kind: PersistentVolume
metadata:
  name: app-nginx1
  labels:
    type: local
spec:
  storageClassName: testmanual
  capacity:
    storage: 100Mi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: "/data/volumes/app"
Now when I apply these files, the pod, PV, and PVC are created, and the PVC is bound to the PV. But if I go inside my container, I don't see my files. The hostPath is /data/volumes/app. Any ideas will be appreciated.

When a PVC is bound to a pod, the volume is mounted at the location given in the pod/deployment YAML file, in your case mountPath: /var/www/html. The mounted volume hides whatever the image already had at that path, which is why the files "baked into" the container image are not accessible.
You can confirm this by opening a shell in the container with kubectl exec YOUR_POD -i -t -- /bin/sh and running mount | grep "/var/www/html".
Solution
You may solve this in many ways. It's best practice to keep your static data separate (i.e. in a PV), and keep the container image as small and fast as possible.
If you copy the files you want to serve from the PV into the host path /data/volumes/app, they will be accessible in your pod, and you can then build a new image that omits the COPY operation. This way, changes your app makes to the files are kept even if the pod crashes.
If the PV will be claimed by more than one pod, you need to change accessModes as described here:
The access modes are:
ReadWriteOnce – the volume can be mounted as read-write by a single node
ReadOnlyMany – the volume can be mounted read-only by many nodes
ReadWriteMany – the volume can be mounted as read-write by many nodes
In-depth explanation of Volumes in Kubernetes docs: https://kubernetes.io/docs/concepts/storage/persistent-volumes/
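As a rough sketch of the accessModes change, reusing the PV and PVC names from the question: both objects have to list the new mode, since a claim only binds to a volume whose access modes include the ones it requests. This is an illustration under assumptions, not a recommendation for hostPath in particular; hostPath storage lives on a single node, so for pods spread across nodes a network-backed volume type would be needed. An existing claim's accessModes cannot be edited in place, so the claim would have to be deleted and recreated.

```yaml
# Sketch only: same objects as in the question, with the access mode changed.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: app-nginx1
  labels:
    type: local
spec:
  storageClassName: testmanual
  capacity:
    storage: 100Mi
  accessModes:
  - ReadWriteMany              # was ReadWriteOnce
  hostPath:
    path: "/data/volumes/app"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim-app-nginx
spec:
  storageClassName: testmanual
  accessModes:
  - ReadWriteMany              # must be offered by the PV above for the claim to bind
  resources:
    requests:
      storage: 100Mi
```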

Kubernetes Persistent Volume and hostpath

I was experimenting with Kubernetes Persistent Volumes. I can't find a clear explanation in the Kubernetes documentation, and the behaviour is not what I expect, so I would like to ask here.
I configured the following Persistent Volume and Persistent Volume Claim.
kind: PersistentVolume
apiVersion: v1
metadata:
  name: store-persistent-volume
  namespace: test
spec:
  storageClassName: hostpath
  capacity:
    storage: 2Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: "/Volumes/Data/data"
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: store-persistent-volume-claim
  namespace: test
spec:
  storageClassName: hostpath
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
and the following Deployment and Service configuration.
kind: Deployment
apiVersion: apps/v1beta2
metadata:
  name: store-deployment
  namespace: test
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: store
  template:
    metadata:
      labels:
        k8s-app: store
    spec:
      volumes:
      - name: store-volume
        persistentVolumeClaim:
          claimName: store-persistent-volume-claim
      containers:
      - name: store
        image: localhost:5000/store
        ports:
        - containerPort: 8383
          protocol: TCP
        volumeMounts:
        - name: store-volume
          mountPath: /data
---
#------------ Service ----------------#
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: store
  name: store
  namespace: test
spec:
  type: LoadBalancer
  ports:
  - port: 8383
    targetPort: 8383
  selector:
    k8s-app: store
As you can see, I defined '/Volumes/Data/data' as the Persistent Volume and expect it to be mounted at '/data' in the container.
So I am assuming that whatever is in '/Volumes/Data/data' on the host should be visible in the '/data' directory in the container. Is this assumption correct? Because this is definitely not happening at the moment.
My second assumption is that whatever I save at '/data' should be visible on the host, which is also not happening.
I can see from the Kubernetes console that everything started correctly (Persistent Volume, Claim, Deployment, Pod, Service...).
Am I understanding the persistent volume concept correctly at all?
PS. I am trying this on a Mac with Docker (18.05.0-ce-mac67 (25042), Channel edge); maybe it is not supposed to work on a Mac?
Thanks for any answers.

Assuming you are using a multi-node Kubernetes cluster, you should be able to see the data mounted locally at /Volumes/Data/data on the specific worker node that the pod is running on.
You can check which worker your pod is scheduled on by using the command kubectl get pods -o wide -n test
Please note that, per the Kubernetes docs, a HostPath PersistentVolume is for single-node testing only: local storage is not supported in any way and WILL NOT WORK in a multi-node cluster.
It does work in my case.

As you are using a host path, you should check for this data on the worker node where the pod is running.

As mentioned above, you need to run 'kubectl get po -n test -o wide' to see which node the pod is hosted on. Then, if you SSH into that worker, you can see the volume.

Kubernetes access Persistent volume mount externally

I have set up a Kubernetes cluster and mounted a volume as a gcePersistentDisk in Google Cloud. It is claimed and mounted successfully in the Pods.
But I want to access this volume externally so that I can write to it through git/ssh or manually. As the disk is already in use and mounted, I cannot access it.
How can I write files to it externally?

gcePersistentDisk is a network-based disk, and provisioned volumes can only be used by GCE instances in the same project and zone.
This kind of resource supports ReadWriteOnce and ReadOnlyMany.
You can use a GCE persistent disk to share data as read-only between multiple pods in the same zone.
Back to your question: you can write to this volume from only one pod. No other pods can use it as writable storage - neither external ones nor ones from the same project.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: php
  labels:
    app: php
spec:
  replicas: 1
  selector:
    matchLabels:
      app: php
  template:
    metadata:
      labels:
        app: php
    spec:
      containers:
      - image: php:7.1-apache
        imagePullPolicy: Always
        name: php
        resources:
          requests:
            cpu: 200m
        ports:
        - containerPort: 80
          name: php
        volumeMounts:
        - name: php-persistent-storage
          mountPath: /var/www
      volumes:
      - name: php-persistent-storage
        gcePersistentDisk:
          pdName: php-phantomjs-disk
          fsType: ext4
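The read-only sharing mentioned above can be sketched as follows, reusing the disk name from the manifest. The Deployment name and labels are hypothetical, and the sketch assumes the disk is not attached read-write anywhere at the same time, since a GCE persistent disk can be attached read-only to several instances only while no instance has it attached in read-write mode.

```yaml
# Sketch only: "php-readers" and its labels are assumed names.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-readers
spec:
  replicas: 2                        # several pods may share the disk when it is read-only
  selector:
    matchLabels:
      app: php-reader
  template:
    metadata:
      labels:
        app: php-reader
    spec:
      containers:
      - name: php
        image: php:7.1-apache
        volumeMounts:
        - name: php-persistent-storage
          mountPath: /var/www
          readOnly: true
      volumes:
      - name: php-persistent-storage
        gcePersistentDisk:
          pdName: php-phantomjs-disk
          fsType: ext4
          readOnly: true             # attach the disk itself in read-only mode
```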
