I was experimenting with Kubernetes Persistent Volumes. I can't find a clear explanation in the Kubernetes documentation, and the behaviour is not what I expect, so I'd like to ask here.
I configured the following Persistent Volume and Persistent Volume Claim.
kind: PersistentVolume
apiVersion: v1
metadata:
  name: store-persistent-volume
  namespace: test
spec:
  storageClassName: hostpath
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/Volumes/Data/data"
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: store-persistent-volume-claim
  namespace: test
spec:
  storageClassName: hostpath
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
and the following Deployment and Service configuration.
kind: Deployment
apiVersion: apps/v1beta2
metadata:
  name: store-deployment
  namespace: test
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: store
  template:
    metadata:
      labels:
        k8s-app: store
    spec:
      volumes:
        - name: store-volume
          persistentVolumeClaim:
            claimName: store-persistent-volume-claim
      containers:
        - name: store
          image: localhost:5000/store
          ports:
            - containerPort: 8383
              protocol: TCP
          volumeMounts:
            - name: store-volume
              mountPath: /data
---
#------------ Service ----------------#
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: store
  name: store
  namespace: test
spec:
  type: LoadBalancer
  ports:
    - port: 8383
      targetPort: 8383
  selector:
    k8s-app: store
As you can see, I defined '/Volumes/Data/data' as the Persistent Volume and expect it to be mounted at '/data' in the container.
So I am assuming that whatever is in '/Volumes/Data/data' on the host should be visible in the '/data' directory in the container. Is this assumption correct? Because this is definitely not happening at the moment.
My second assumption is that whatever I save at '/data' should be visible on the host, which is also not happening.
I can see from the Kubernetes console that everything started correctly (Persistent Volume, Claim, Deployment, Pod, Service...).
Am I understanding the persistent volume concept correctly at all?
PS. I am trying this on a Mac with Docker (18.05.0-ce-mac67(25042), Edge channel); maybe it is not supposed to work on a Mac?
Thanks for any answers.
Assuming you are using a multi-node Kubernetes cluster, you should be able to see the data at /Volumes/Data/data on the specific worker node that the pod is running on.
You can check which worker your pod is scheduled on with the command kubectl get pods -o wide -n test.
Please note, as per the Kubernetes docs: HostPath (Single node testing only – local storage is not supported in any way and WILL NOT WORK in a multi-node cluster); see the PersistentVolume documentation.
It does work in my case.
As you are using hostPath, you should check that path on the worker node on which the pod is running.
As the answer above says, run 'kubectl get po -n test -o wide' and you will see which node the pod is hosted on. Then, if you SSH into that worker, you can see the volume.
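For example (a sketch; the pod and node names below are hypothetical placeholders):
kubectl get pods -n test -o wide                           # note the NODE column
kubectl exec -n test store-deployment-abc123 -- ls /data   # what the container sees
# then, on that node:
ls /Volumes/Data/data                                      # what the host path contains
If the two listings differ, the pod is most likely not running on the node you are inspecting.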
Related
We have a Docker image that processes some files on a Samba share.
For this we created a CIFS share which is mounted at /mnt/dfs, and files can be accessed in the container with:
docker run -v /mnt/dfs/project1:/workspace image
Now what I was asked to do is get the container into k8s. To access a CIFS share from a pod, a CIFS volume driver using FlexVolume can be used. That's where some questions pop up.
I installed this repo as a daemonset
https://k8scifsvol.juliohm.com.br/
and it's up and running.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cifs-volumedriver-installer
spec:
  selector:
    matchLabels:
      app: cifs-volumedriver-installer
  template:
    metadata:
      name: cifs-volumedriver-installer
      labels:
        app: cifs-volumedriver-installer
    spec:
      containers:
        - image: juliohm/kubernetes-cifs-volumedriver-installer:2.4
          name: flex-deploy
          imagePullPolicy: Always
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /flexmnt
              name: flexvolume-mount
      volumes:
        - name: flexvolume-mount
          hostPath:
            path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/
The next thing to do is add a PersistentVolume, but that needs a capacity, 1Gi in the example. Does this mean that we lose all data on the SMB server? Why should there be a capacity for an already existing server?
Also, how can we access a subdirectory of the mount /mnt/dfs from within the pod? So how to access data from /mnt/dfs/project1 in the pod?
Do we even need a PV? Could the pod just read from the host's mounted share?
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mycifspv
spec:
  capacity:
    storage: 1Gi
  flexVolume:
    driver: juliohm/cifs
    options:
      opts: sec=ntlm,uid=1000
      server: my-cifs-host
      share: /MySharedDirectory
    secretRef:
      name: my-secret
  accessModes:
    - ReadWriteMany
No, that field has no effect on the FlexVol plugin you linked. It doesn't even bother parsing out the size you pass in :)
Managed to get it working with the fstab/cifs plugin.
Copy its cifs script to /usr/libexec/kubernetes/kubelet-plugins/volume/exec and give it execute permissions. Also restart kubelet on all nodes.
https://github.com/fstab/cifs
Then I added:
containers:
  - name: pablo
    image: "10.203.32.80:5000/pablo"
    volumeMounts:
      - name: dfs
        mountPath: /data
volumes:
  - name: dfs
    flexVolume:
      driver: "fstab/cifs"
      fsType: "cifs"
      secretRef:
        name: "cifs-secret"
      options:
        networkPath: "//dfs/dir"
        mountOptions: "dir_mode=0755,file_mode=0644,noperm"
Now there is the /data mount inside the container, pointing to //dfs/dir.
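For completeness, the flexVolume above references a Secret named cifs-secret that isn't shown. A minimal sketch of what it could look like, based on my reading of the fstab/cifs README (double-check the type and key names against that README; the base64 values are placeholders):
apiVersion: v1
kind: Secret
metadata:
  name: cifs-secret
  namespace: default
type: fstab/cifs
data:
  username: dXNlcm5hbWU=   # base64-encoded CIFS username (placeholder)
  password: cGFzc3dvcmQ=   # base64-encoded CIFS password (placeholder)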
I am a bit confused here. It works as a normal Docker container, but when it goes inside a pod it doesn't. So here is how I do it.
This is the Dockerfile I use locally to create the image and publish it to the Docker registry:
FROM alpine:3.7
COPY . /var/www/html
CMD tail -f /dev/null
Now, if I just pull the image (after deleting the local copy) and run it as a container, it works and I can see my files inside /var/www/html.
Now I want to use that inside my Kubernetes cluster.
Setup: Minikube with --vm-driver=none
I am running Kubernetes inside Minikube with the none driver option, so it is a single-node cluster.
EDIT
I can see my data inside /var/www/html if I remove the volume mounts and the claim from the deployment file.
Deployment file
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    io.kompose.service: app
  name: app
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      creationTimestamp: null
      labels:
        io.kompose.service: app
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 1000
      containers:
        - image: kingshukdeb/mycode
          name: pd-mycode
          resources: {}
          volumeMounts:
            - mountPath: /var/www/html
              name: claim-app-storage
      restartPolicy: Always
      volumes:
        - name: claim-app-storage
          persistentVolumeClaim:
            claimName: claim-app-nginx
status: {}
PVC file
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  creationTimestamp: null
  labels:
    io.kompose.service: app-nginx1
  name: claim-app-nginx
spec:
  storageClassName: testmanual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
status: {}
PV file
apiVersion: v1
kind: PersistentVolume
metadata:
  name: app-nginx1
  labels:
    type: local
spec:
  storageClassName: testmanual
  capacity:
    storage: 100Mi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/data/volumes/app"
Now, when I apply these files, the pod, PV and PVC are created and the PVC is bound to the PV. But if I go inside my container I don't see my files. The hostPath is /data/volumes/app. Any ideas would be appreciated.
When a PVC is mounted into a pod, the volume is mounted at the location described in the pod/deployment yaml file; in your case mountPath: /var/www/html. That's why the files "baked into" the container image are not accessible: the mount hides them (simple explanation why here).
You can confirm this by exec'ing into the container with kubectl exec YOUR_POD -i -t -- /bin/sh and running mount | grep "/var/www/html".
Solution
You may solve this in many ways. It's best practice to keep your static data separate (i.e. in a PV) and to keep the container image as small and fast as possible.
If you transfer the files you want in the PV to the host path /data/volumes/app, they will be accessible in your pod; you can then build a new image that omits the COPY operation. This way, even if the pod crashes, the changes your app makes to the files will be preserved.
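Another option, if you want to keep the COPY in the image, is to seed the volume from the image with an init container on first start. This is only a sketch (not part of the original answer), reusing the names from the question's manifests:
spec:
  initContainers:
    - name: seed-html
      image: kingshukdeb/mycode                  # same image, still contains /var/www/html
      # copy the baked-in files onto the volume only if it is still empty
      command:
        - sh
        - -c
        - '[ "$(ls -A /seed)" ] || cp -r /var/www/html/. /seed'
      volumeMounts:
        - name: claim-app-storage
          mountPath: /seed
  containers:
    - name: pd-mycode
      image: kingshukdeb/mycode
      volumeMounts:
        - name: claim-app-storage
          mountPath: /var/www/html
  volumes:
    - name: claim-app-storage
      persistentVolumeClaim:
        claimName: claim-app-nginx
After the first run the files live on the PV, so later changes made by the app survive pod restarts.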
If the PV will be claimed by more than one pod, you need to change the accessModes as described here:
The access modes are:
ReadWriteOnce – the volume can be mounted as read-write by a single node
ReadOnlyMany – the volume can be mounted read-only by many nodes
ReadWriteMany – the volume can be mounted as read-write by many nodes
In-depth explanation of Volumes in Kubernetes docs: https://kubernetes.io/docs/concepts/storage/persistent-volumes/
I'm working with Kubernetes in GCP and I'm having problems with volumes and persistent disks.
I'm using Directus 7 (a headless CMS), which saves most of its information in the database except for uploaded files; these files live in the /var/www/html/public/uploads folder (tested locally with docker-compose, and it works fine), and that folder is the one I'm trying to persist on the disk.
No error occurs, but when the Kubernetes Pod restarts I lose the uploaded images (they are not being saved on the disk).
This is my configuration:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: directus-pv
  namespace: default
spec:
  storageClassName: ""
  capacity:
    storage: 100G
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: directus-disk
    fsType: ext4
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: directus-pvc
  namespace: default
  labels:
    app: .....
spec:
  storageClassName: ""
  volumeName: directus-pv
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100G
And in the deploy.yaml:
volumeMounts:
  - name: api-disk
    mountPath: /var/www/html/public/uploads
    readOnly: false
volumes:
  - name: api-disk
    persistentVolumeClaim:
      claimName: directus-pvc
Thanks for the help
Remove the namespace property from the PV manifest; PersistentVolumes are cluster-scoped resources (PersistentVolumeClaims, by contrast, are namespaced).
Remove the storageClassName property as well.
I presume that your manually provisioned persistent volume directus-pv is somehow being created with persistentVolumeReclaimPolicy=Recycle*. That's the only likely reason that could cause data to be erased on each pod restart.
I'm not able to reproduce your case with the provided manifest files, but I tried the following test:
Create gcePersistentDisk
Create PersistentVolume
Create PersistentVolumeClaim
Create ReplicaSet (replicas=1) like this one
apiVersion: extensions/v1beta1
kind: ReplicaSet
metadata:
  name: busybox-list-uploads
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: busybox-list-uploads
        version: "2"
    spec:
      containers:
        - image: busybox
          args: [/bin/sh, -c, 'sleep 9999']
          volumeMounts:
            - mountPath: /var/www/html/public/uploads
              name: api-disk
          name: busybox
      volumes:
        - name: api-disk
          persistentVolumeClaim:
            claimName: directus-pvc
Write some files into the mounted folder /var/www/html/public/uploads
Restart the pod (i.e. kill it) by scaling the replicas to 0 and then back to 1
List the contents of /var/www/html/public/uploads on the newly created pod:
for i in busybox-list-uploads-dgfbc; do kubectl exec -it $i -- ls /var/www/html/public/uploads; done;
lost+found picture_from_busybox-list-uploads-ng4t6.png
As you can see, the output clearly shows that the data survives a pod restart.
* you can verify it with cmd: kubectl get pv/directus-pv -o yaml
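If the reclaim policy does turn out to be Recycle (or Delete), one way to switch it to Retain is kubectl patch; for example (run it against your own PV name):
kubectl patch pv directus-pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'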
I have set up a small cluster (one physical machine as the master and two VMs as the nodes). Now I have created an NFS directory to share as a persistent volume:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs # reference name
spec:
  capacity:
    storage: 100Mi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 192.168.57.1
    path: "/mnt/shardisk"
and a claim that uses it:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 50Mi
and finally a simple pod to use it:
kind: Pod
apiVersion: v1
metadata:
  name: nginx-nfs
spec:
  volumes:
    - name: storage
      persistentVolumeClaim:
        claimName: test-pvc
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: storage
Now, I have created the cluster on the physical machine and joined it from the VMs. I used Calico for networking (flannel fails to start; if someone knows why, it would be wonderful to solve that too).
Now, kubectl describe pod shows everything working fine, and so does kubectl logs nginx-nfs, but if I try kubectl exec -it nginx-nfs /bin/bash,
everything freezes for a very long time and after that I get this:
Error from server: error dialing backend: dial tcp 10.0.2.15:10250: getsockopt: connection timed out
I have "solve" it, i use kubernetes in 2 different lan and so the admin.conf have an ip that no match the current ip and it will not work, i have solve it creating same vm internal to the host and nat a static ip on it
I'm trying to set up Jenkins to run in a container on Kubernetes, but I'm having trouble persisting the volume for the Jenkins home directory.
Here's my deployment.yml file. The image is based on jenkins/jenkins.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: jenkins-deployment
  labels:
    app: jenkins
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jenkins
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      containers:
        - name: jenkins
          image: 1234567.dkr.ecr.us-east-1.amazonaws.com/mycompany/jenkins
          imagePullPolicy: "Always"
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: jenkins-home
              mountPath: /var/jenkins_home
      volumes:
        - name: jenkins-home
          emptyDir: {}
However, if I then push a new container image to my repository and update the pods using the commands below, Jenkins comes back online but asks me to start from scratch (enter the admin password, none of my Jenkins jobs are there, no plugins, etc.):
kubectl apply -f kubernetes (where my manifests are stored)
kubectl set image deployment/jenkins-deployment jenkins=1234567.dkr.ecr.us-east-1.amazonaws.com/mycompany/jenkins:$VERSION
Am I misunderstanding how this volume mount is meant to work?
As an aside, I also have backup and restore scripts which backup the Jenkins home directory to s3, and download it again, but that's somewhat outside the scope of this issue.
You should use PersistentVolumes along with a StatefulSet instead of a Deployment resource if you wish your data to survive re-deployments and restarts of your pod.
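A minimal sketch of what that could look like for Jenkins (the names, image tag, size, and the assumption that a default StorageClass exists are mine, not from your manifests):
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: jenkins
spec:
  serviceName: jenkins               # requires a headless Service with this name
  replicas: 1
  selector:
    matchLabels:
      app: jenkins
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      containers:
        - name: jenkins
          image: jenkins/jenkins:lts
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: jenkins-home
              mountPath: /var/jenkins_home
  volumeClaimTemplates:              # one PVC is created per replica and survives pod restarts
    - metadata:
        name: jenkins-home
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 30Gi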
You have specified the volume type emptyDir. This essentially mounts an empty directory on the kube node that runs your pod, and its contents only live as long as that pod does. Every time you redeploy, the pod is recreated (possibly on another node) and the directory starts out empty again, so your data isn't persisting across restarts.
I see you're pulling your image from an ECR repository, so I'm assuming you're running k8s in AWS.
You'll need to configure a StorageClass for AWS. If you've provisioned k8s using something like kops, this will already be configured. You can confirm this by doing kubectl get storageclass - the provisioner should be configured as EBS:
NAME PROVISIONER
gp2 (default) kubernetes.io/aws-ebs
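If no such StorageClass exists, a minimal EBS-backed one might look like this (a sketch using the in-tree kubernetes.io/aws-ebs provisioner):
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"   # optional: make it the default
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2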
Then, you need to specify a PersistentVolumeClaim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jenkins-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2 # must match your storageclass from above
  resources:
    requests:
      storage: 30Gi
You can now use the PV claim in your deployment:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: jenkins-deployment
  labels:
    app: jenkins
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jenkins
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      containers:
        - name: jenkins
          image: 1234567.dkr.ecr.us-east-1.amazonaws.com/mycompany/jenkins
          imagePullPolicy: "Always"
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: jenkins-home
              mountPath: /var/jenkins_home
      volumes:
        - name: jenkins-home
          persistentVolumeClaim:
            claimName: jenkins-data # must match the claim name from above
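Once applied, you can confirm the claim actually bound to an EBS-backed volume before redeploying:
kubectl get pvc jenkins-data    # STATUS should be Bound
kubectl get pv                  # shows the dynamically provisioned volume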