Persistent disk problem on Kubernetes GCP - docker

I'm working with Kubernetes on GCP and I'm having problems with volumes and persistent disks.
I'm using Directus 7 (a headless CMS), which saves most of its information in the database, except for uploaded files; those end up in the /var/www/html/public/uploads folder (tested locally with docker-compose and it works fine), and that folder is the one I'm trying to persist on the disk.
No error occurs, but when the Kubernetes Pod restarts I lose the uploaded images (they are not being saved on the disk).
This is my configuration:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: directus-pv
  namespace: default
spec:
  storageClassName: ""
  capacity:
    storage: 100G
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: directus-disk
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: directus-pvc
  namespace: default
  labels:
    app: .....
spec:
  storageClassName: ""
  volumeName: directus-pv
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100G
And in the deploy.yaml:
volumeMounts:
  - name: api-disk
    mountPath: /var/www/html/public/uploads
    readOnly: false
volumes:
  - name: api-disk
    persistentVolumeClaim:
      claimName: directus-pvc
Thanks for the help

Remove the namespace property from the PV and PVC manifests; a PersistentVolume is a cluster-scoped (shared) resource.
Remove the storageClassName property as well.
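For reference, a minimal sketch of the trimmed manifests as suggested above (labels omitted for brevity, all other values unchanged from the question):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: directus-pv
spec:
  capacity:
    storage: 100G
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: directus-disk
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: directus-pvc
spec:
  volumeName: directus-pv
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100G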

I presume that your manually provisioned PersistentVolume directus-pv is somehow being created with persistentVolumeReclaimPolicy=Recycle. That's the only reason I can think of that would cause the data to be erased on each Pod restart.
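A quick sketch of how to check, and if needed change, the reclaim policy of that PV (the PV name comes from your manifest):

# Show the current reclaim policy
kubectl get pv directus-pv -o jsonpath='{.spec.persistentVolumeReclaimPolicy}'

# Switch it to Retain so the disk contents are kept when the claim is released
kubectl patch pv directus-pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'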
I'm not able to reproduce your case with the provided manifest files, but I tried the following test:
Create a gcePersistentDisk
Create the PersistentVolume
Create the PersistentVolumeClaim
Create a ReplicaSet (replicas=1) like this one:
apiVersion: extensions/v1beta1
kind: ReplicaSet
metadata:
  name: busybox-list-uploads
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: busybox-list-uploads
        version: "2"
    spec:
      containers:
      - image: busybox
        args: [/bin/sh, -c, 'sleep 9999']
        volumeMounts:
        - mountPath: /var/www/html/public/uploads
          name: api-disk
        name: busybox
      volumes:
      - name: api-disk
        persistentVolumeClaim:
          claimName: directus-pvc
Write a file into the mounted folder /var/www/html/public/uploads
Restart the Pod (i.e. kill it) by scaling the replicas to 0 and then back to 1 (scale commands shown at the end of this answer)
List the contents of /var/www/html/public/uploads on the newly created Pod:
for i in busybox-list-uploads-dgfbc; do kubectl exec -it $i -- ls /var/www/html/public/uploads; done;
lost+found picture_from_busybox-list-uploads-ng4t6.png
As you can see, the output clearly shows that the data survives a Pod restart.
* You can verify the PV settings with: kubectl get pv/directus-pv -o yaml
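For completeness, the restart step in the test above (scaling to 0 and back to 1) can be done roughly like this, using the ReplicaSet name from the manifest:

# Kill the Pod by scaling the ReplicaSet down, then bring it back
kubectl scale replicaset busybox-list-uploads --replicas=0
kubectl scale replicaset busybox-list-uploads --replicas=1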

Related

Accessing CIFS files from pods

We have a docker image that processes some files on a Samba share.
For this we created a CIFS share which is mounted to /mnt/dfs, and files can be accessed in the container with:
docker run -v /mnt/dfs/project1:/workspace image
Now what I was asked to do is get the container into k8s. To access a CIFS share from a pod, a CIFS volume driver using FlexVolume can be used. That's where some questions pop up.
I installed this repo as a DaemonSet and it's up and running:
https://k8scifsvol.juliohm.com.br/
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cifs-volumedriver-installer
spec:
  selector:
    matchLabels:
      app: cifs-volumedriver-installer
  template:
    metadata:
      name: cifs-volumedriver-installer
      labels:
        app: cifs-volumedriver-installer
    spec:
      containers:
        - image: juliohm/kubernetes-cifs-volumedriver-installer:2.4
          name: flex-deploy
          imagePullPolicy: Always
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /flexmnt
              name: flexvolume-mount
      volumes:
        - name: flexvolume-mount
          hostPath:
            path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/
The next thing to do is add a PersistentVolume, but that needs a capacity, 1Gi in the example. Does this mean that we lose all data on the SMB server? Why should there be a capacity for an already existing share?
Also, how can we access a subdirectory of the mount /mnt/dfs from within the pod? So how do we access data from /mnt/dfs/project1 in the pod?
Do we even need a PV? Could the pod just read from the host's mounted share?
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mycifspv
spec:
  capacity:
    storage: 1Gi
  flexVolume:
    driver: juliohm/cifs
    options:
      opts: sec=ntlm,uid=1000
      server: my-cifs-host
      share: /MySharedDirectory
    secretRef:
      name: my-secret
  accessModes:
    - ReadWriteMany
No, that field has no effect on the FlexVol plugin you linked. It doesn't even bother parsing out the size you pass in :)
I managed to get it working with the fstab/cifs plugin.
Copy its cifs script to /usr/libexec/kubernetes/kubelet-plugins/volume/exec and give it execute permissions; also restart the kubelet on all nodes.
https://github.com/fstab/cifs
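A rough sketch of those installation steps on each node; the fstab~cifs subdirectory follows the FlexVolume vendor~driver naming convention, and the plugin directory may differ depending on how your kubelet is configured:

# On every node: install the FlexVolume script and make it executable
sudo mkdir -p /usr/libexec/kubernetes/kubelet-plugins/volume/exec/fstab~cifs
sudo cp cifs /usr/libexec/kubernetes/kubelet-plugins/volume/exec/fstab~cifs/cifs
sudo chmod +x /usr/libexec/kubernetes/kubelet-plugins/volume/exec/fstab~cifs/cifs

# Restart the kubelet so it picks up the new plugin
sudo systemctl restart kubelet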
Then I added:
containers:
  - name: pablo
    image: "10.203.32.80:5000/pablo"
    volumeMounts:
      - name: dfs
        mountPath: /data
volumes:
  - name: dfs
    flexVolume:
      driver: "fstab/cifs"
      fsType: "cifs"
      secretRef:
        name: "cifs-secret"
      options:
        networkPath: "//dfs/dir"
        mountOptions: "dir_mode=0755,file_mode=0644,noperm"
Now the /data mount inside the container points to //dfs/dir.
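The cifs-secret referenced above holds the share credentials. Going by the plugin's README, it is a Secret of type fstab/cifs with base64-encoded username and password keys; the values below are placeholders:

apiVersion: v1
kind: Secret
metadata:
  name: cifs-secret
  namespace: default
type: fstab/cifs
data:
  # e.g. echo -n 'myuser' | base64
  username: '<base64-encoded username>'
  # e.g. echo -n 'mypassword' | base64
  password: '<base64-encoded password>'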

Postgres running on Kubernetes loses data upon Pod recreation or cluster reboot

I have a postgres container running in a Pod on GKE and a PersistentVolume set up to store the data. However, all of the data in the database is lost if the cluster reboots or if the Pod is deleted.
If I run kubectl delete <postgres_pod> to delete the existing Pod and then check the newly created replacement Pod, the database no longer has the data it had before the deletion.
Here are the yaml files I used to deploy postgres.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: custom-storage
parameters:
  type: pd-standard
provisioner: kubernetes.io/gce-pd
reclaimPolicy: Retain
volumeBindingMode: Immediate
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: postgres-volume-claim
spec:
  storageClassName: custom-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
spec:
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:11.5
          resources: {}
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_DB
              value: "dbname"
            - name: POSTGRES_USER
              value: "user"
            - name: POSTGRES_PASSWORD
              value: "password"
          volumeMounts:
            - mountPath: /var/lib/postgresql/
              name: postgresdb
      volumes:
        - name: postgresdb
          persistentVolumeClaim:
            claimName: postgres-volume-claim
I double checked that the persistentVolumeReclaimPolicy has the value Retain.
What am I missing?
Is the cluster creating a new volume each time you delete a pod? Check with kubectl get pv.
Is this a multi-zone cluster? Your storage class is not provisioning regional disks, so you might be getting a new disk when the pod moves from one zone to another.
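If the multi-zone scenario is the cause, one option is a StorageClass that provisions regional disks; a sketch, assuming the GCE PD provisioner's replication-type parameter is available in your cluster version:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: custom-storage-regional
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
  replication-type: regional-pd          # replicate the disk across two zones
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer  # bind only after the pod is scheduled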
Possibly related to your problem: the postgres container reference recommends mounting at /var/lib/postgresql/data/pgdata and setting the PGDATA env variable:
https://hub.docker.com/_/postgres#pgdata
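A sketch of that layout adapted to the Deployment above: the volume is mounted at /var/lib/postgresql/data and PGDATA points at a subdirectory of it, so postgres does not initialize directly on the mount point (which contains lost+found on an ext4 disk). Treat the exact paths as something to verify against the image documentation:

env:
  - name: PGDATA
    value: /var/lib/postgresql/data/pgdata   # data dir inside the mounted volume
volumeMounts:
  - mountPath: /var/lib/postgresql/data
    name: postgresdb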

Kubernetes Persistent Volume and hostpath

I was experimenting with Kubernetes Persistent Volumes. I can't find a clear explanation in the Kubernetes documentation and the behaviour is not the one I'm expecting, so I'd like to ask here.
I configured the following PersistentVolume and PersistentVolumeClaim.
kind: PersistentVolume
apiVersion: v1
metadata:
  name: store-persistent-volume
  namespace: test
spec:
  storageClassName: hostpath
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/Volumes/Data/data"
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: store-persistent-volume-claim
  namespace: test
spec:
  storageClassName: hostpath
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
and the following Deployment and Service configuration.
kind: Deployment
apiVersion: apps/v1beta2
metadata:
  name: store-deployment
  namespace: test
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: store
  template:
    metadata:
      labels:
        k8s-app: store
    spec:
      volumes:
        - name: store-volume
          persistentVolumeClaim:
            claimName: store-persistent-volume-claim
      containers:
        - name: store
          image: localhost:5000/store
          ports:
            - containerPort: 8383
              protocol: TCP
          volumeMounts:
            - name: store-volume
              mountPath: /data
---
#------------ Service ----------------#
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: store
  name: store
  namespace: test
spec:
  type: LoadBalancer
  ports:
    - port: 8383
      targetPort: 8383
  selector:
    k8s-app: store
As you can see, I defined '/Volumes/Data/data' as the PersistentVolume path and I expect it to be mounted at '/data' in the container.
So I'm assuming whatever is in '/Volumes/Data/data' on the host should be visible in the '/data' directory in the container. Is this assumption correct? Because this is definitely not happening at the moment.
My second assumption is that whatever I save at '/data' should be visible on the host, which is also not happening.
I can see from the Kubernetes console that everything started correctly (PersistentVolume, Claim, Deployment, Pod, Service...).
Am I understanding the persistent volume concept correctly at all?
PS. I am trying this on a Mac with Docker (18.05.0-ce-mac67 (25042), edge channel); maybe it isn't supposed to work on a Mac?
Thanks for the answers
Assuming you are using a multi-node Kubernetes cluster, you should be able to see the data mounted locally at /Volumes/Data/data on the specific worker node where the pod is running.
You can check which worker your pod is scheduled on with the command kubectl get pods -o wide -n test.
Please note, as per the Kubernetes docs, a hostPath PersistentVolume is for single-node testing only; local storage is not supported in any way and WILL NOT WORK in a multi-node cluster.
It does work in my case.
As you are using a hostPath volume, you should check '/data' on the worker node where the pod is running.
Like the answer above says, run kubectl get po -n test -o wide and you will see which node the pod is hosted on. Then if you SSH into that worker you can see the volume.
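A minimal sketch of that check; the node address is a placeholder taken from the kubectl output:

# See which node the pod landed on
kubectl get pods -n test -o wide

# SSH into that node and inspect the hostPath directory
ssh <worker-node-address>
ls /Volumes/Data/data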

How to mount entire directory in Kubernetes using configmap?

I want to be able to mount an unknown number of config files into /etc/configs.
I have added some files to the configmap using:
kubectl create configmap etc-configs --from-file=/tmp/etc-config
The number of files and the file names will never be known in advance, and I would like to be able to recreate the configmap and have the folder in the Kubernetes container updated after the sync interval.
I have tried to mount this but I'm not able to; the folder is always empty even though there is data in the configmap.
bofh$ kubectl describe configmap etc-configs
Name:         etc-configs
Namespace:    default
Labels:       <none>
Annotations:  <none>

Data
====
file1.conf:
----
{
    ... truncated ...
}
file2.conf:
----
{
    ... truncated ...
}
file3.conf:
----
{
    ... truncated ...
}
Events:  <none>
This is the volumeMounts section in the container:
- name: etc-configs
  mountPath: /etc/configs
And this is the volumes section:
- name: etc-configs
  configMap:
    name: etc-configs
I can mount individual items but not the entire directory.
Any suggestions on how to solve this?
You can mount the ConfigMap as a special volume into your container.
In this case, the mount folder will show each of the keys as a file, and the files will have the map values as their content.
From the Kubernetes documentation:
apiVersion: v1
kind: Pod
metadata:
  name: dapi-test-pod
spec:
  containers:
    - name: test-container
      image: k8s.gcr.io/busybox
      ...
      volumeMounts:
        - name: config-volume
          mountPath: /etc/config
  volumes:
    - name: config-volume
      configMap:
        # Provide the name of the ConfigMap containing the files you want
        # to add to the container
        name: special-config
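Once that Pod is running, each key of the ConfigMap should appear as a file under the mount path; a quick sanity check, using the pod name from the example above:

kubectl exec dapi-test-pod -- ls /etc/config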
I'm feeling really stupid now.
Sorry, my fault.
The Docker container did not start, so I was manually starting it with docker run -it --entrypoint='/bin/bash' and I could not see any files from the configMap.
That can't work, since Docker doesn't know anything about my deployment until Kubernetes starts it.
The Docker image was failing and the Kubernetes config was correct all the time.
I was debugging it the wrong way.
With your config, you're going to mount each file listed in your configmap.
If you need to mount all the files in a folder, you shouldn't use a configmap, but a PersistentVolume and PersistentVolumeClaim:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-volume-jenkins
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/data/pv-jenkins"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pv-claim-jenkins
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""
  resources:
    requests:
      storage: 50Gi
In your deployment.yml:
volumeMounts:
  - name: jenkins-persistent-storage
    mountPath: /data
volumes:
  - name: jenkins-persistent-storage
    persistentVolumeClaim:
      claimName: pv-claim-jenkins
You can also use the following:
kubectl create configmap my-config --from-file=/etc/configs
to create the config map with all the files in that folder.
Hope this helps.

Volume mounting in Jenkins on Kubernetes

I'm trying to set up Jenkins to run in a container on Kubernetes, but I'm having trouble persisting the volume for the Jenkins home directory.
Here's my deployment.yml file. The image is based off jenkins/jenkins.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: jenkins-deployment
  labels:
    app: jenkins
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jenkins
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      containers:
        - name: jenkins
          image: 1234567.dkr.ecr.us-east-1.amazonaws.com/mycompany/jenkins
          imagePullPolicy: "Always"
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: jenkins-home
              mountPath: /var/jenkins_home
      volumes:
        - name: jenkins-home
          emptyDir: {}
However, if I then push a new container to my image repository and update the pods using the commands below, Jenkins comes back online but asks me to start from scratch (enter the admin password; none of my Jenkins jobs are there, no plugins, etc.):
kubectl apply -f kubernetes (where my manifests are stored)
kubectl set image deployment/jenkins-deployment jenkins=1234567.dkr.ecr.us-east-1.amazonaws.com/mycompany/jenkins:$VERSION
Am I misunderstanding how this volume mount is meant to work?
As an aside, I also have backup and restore scripts which back up the Jenkins home directory to S3 and download it again, but that's somewhat outside the scope of this issue.
You should use PersistentVolumes along with a StatefulSet instead of a Deployment resource if you want your data to survive re-deployments and restarts of your pod.
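A minimal sketch of that approach, reusing the image and mount path from the question; the gp2 storage class is an assumption for an AWS cluster (see the next answer), so adjust it to whatever kubectl get storageclass shows:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: jenkins
spec:
  serviceName: jenkins
  replicas: 1
  selector:
    matchLabels:
      app: jenkins
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      containers:
        - name: jenkins
          image: 1234567.dkr.ecr.us-east-1.amazonaws.com/mycompany/jenkins
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: jenkins-home
              mountPath: /var/jenkins_home
  volumeClaimTemplates:            # one PVC per replica, retained across pod restarts
    - metadata:
        name: jenkins-home
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: gp2      # assumption: default EBS-backed class on AWS
        resources:
          requests:
            storage: 30Gi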
You have specified the volume type emptyDir. This essentially mounts an empty directory on the kube node that runs your pod. Every time you restart your deployment, the pod can move between kube hosts and the empty dir isn't carried over, so your data isn't persisting across restarts.
I see you're pulling your image from an ECR repository, so I'm assuming you're running k8s in AWS.
You'll need to configure a StorageClass for AWS. If you've provisioned k8s using something like kops, this will already be configured. You can confirm this with kubectl get storageclass - the provisioner should be configured as EBS:
NAME            PROVISIONER
gp2 (default)   kubernetes.io/aws-ebs
Then, you need to specify a PersistentVolumeClaim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jenkins-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2 # must match your storageclass from above
  resources:
    requests:
      storage: 30Gi
You can now use the PVC in your deployment:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: jenkins-deployment
  labels:
    app: jenkins
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jenkins
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      containers:
        - name: jenkins
          image: 1234567.dkr.ecr.us-east-1.amazonaws.com/mycompany/jenkins
          imagePullPolicy: "Always"
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: jenkins-home
              mountPath: /var/jenkins_home
      volumes:
        - name: jenkins-home
          persistentVolumeClaim:
            claimName: jenkins-data # must match the claim name from above
