k8s unable to mount vsphere vsan disk to pod - storage

I have a problem with persistent volumes and VMware vSAN.
I'm using Kubernetes v1.9.5 on CentOS 7, deployed with Kubespray (cloud provider: vsphere).
The pods are not able to mount the vSAN devices:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m default-scheduler Successfully assigned pvpod to k8s-worker-5
Normal SuccessfulMountVolume 2m kubelet, k8s-worker-5 MountVolume.SetUp succeeded for volume "default-token-c5mqh"
Warning FailedMount 56s kubelet, k8s-worker-5 Unable to mount volumes for pod "pvpod_default(62b1704e-3e27-11e8-9cfc-0050568f0ce4)": timeout expired waiting for volumes to attach/mount for pod "default"/"pvpod". list of unattached/unmounted volumes=[test-volume]
The volume is attached to the Kubernetes node k8s-worker-5 (as /dev/sdb):
[root@k8s-worker-5 ~]# fdisk -l
Disk /dev/sda: 107.4 GB, 107374182400 bytes, 209715200 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000ac2d7
Device Boot Start End Blocks Id System
/dev/sda1 * 2048 2099199 1048576 83 Linux
/dev/sda2 2099200 3147775 524288 82 Linux swap / Solaris
/dev/sda3 3147776 209712509 103282367 83 Linux
Disk /dev/sdb: 2147 MB, 2147483648 bytes, 4194304 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
The PV and PVC are created:
[root@k8s-jump ]# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-60b0f1d9-3e27-11e8-9cfc-0050568f0ce4 2Gi RWO Delete Bound default/pvcsc001 vsan 5m
[root@k8s-jump ]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pvcsc001 Bound pvc-60b0f1d9-3e27-11e8-9cfc-0050568f0ce4 2Gi RWO vsan 5m
This is my test deployment manifest:
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: vsan
provisioner: kubernetes.io/vsphere-volume
parameters:
  datastore: vsanDatastore
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvcsc001
  annotations:
    volume.beta.kubernetes.io/storage-class: vsan
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: pvpod
spec:
  containers:
  - name: test-container
    image: k8s.gcr.io/test-webserver
    volumeMounts:
    - name: test-volume
      mountPath: /test-vmdk
  volumes:
  - name: test-volume
    persistentVolumeClaim:
      claimName: pvcsc001

The node VMs need the VMware flag disk.enableUUID=1 set; see:
https://vmware.github.io/vsphere-storage-for-kubernetes/documentation/existing.html#step-3-enable-disk-uuid-on-node-virtual-machines
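If govc is available, the flag can be set per node VM along these lines (a sketch; the VM name is a placeholder, and the VM typically has to be powered off first):

```shell
# Hypothetical VM name; repeat for every node VM in the cluster
govc vm.change -vm 'k8s-worker-5' -e="disk.enableUUID=1"
```

The same setting can also be added through the vSphere UI as an advanced configuration parameter.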

Related

Jenkins is not mounting the AWS EFS file system and using the default volume instead

I'm trying to use Jenkins with an EFS persistent volume on EKS. However, all my attempts to make it use the provided EFS file system have failed. What makes me wonder is that when I tested with a busybox image, the EFS was successfully mounted and I could see data written to the shared storage.
The EFS definition:
resource "aws_efs_file_system" "jenkins_shared_file_system" {
  creation_token   = "Jenkins shared file system"
  performance_mode = "generalPurpose"
  throughput_mode  = "bursting"
  encrypted        = true

  tags = {
    Name = "Jenkins shared file system"
  }
}

resource "aws_efs_mount_target" "jenkins_efs_private_subnet_1_mount_target" {
  file_system_id  = aws_efs_file_system.jenkins_shared_file_system.id
  subnet_id       = aws_subnet.ci_cd_private_subnet_1.id
  security_groups = [aws_security_group.jenkins_efs_sg.id]
}

resource "aws_efs_mount_target" "jenkins_efs_private_subnet_2_mount_target" {
  file_system_id  = aws_efs_file_system.jenkins_shared_file_system.id
  subnet_id       = aws_subnet.ci_cd_private_subnet_2.id
  security_groups = [aws_security_group.jenkins_efs_sg.id]
}

resource "aws_efs_access_point" "jenkins_efs_access_point" {
  file_system_id = aws_efs_file_system.jenkins_shared_file_system.id

  tags = {
    Name = "Jenkins EFS access point"
  }

  posix_user {
    gid = 1000
    uid = 1000
  }

  root_directory {
    path = "/jenkins"
    creation_info {
      owner_uid   = 1000
      owner_gid   = 1000
      permissions = 777
    }
  }
}
The CSI driver is installed following the instructions from https://docs.aws.amazon.com/eks/latest/userguide/efs-csi.html
Here are the persistence configurations:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv
  namespace: jenkins
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-12345::fsap-12345
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-pvc
  namespace: jenkins
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
And the Jenkins Helm values:
controller:
  componentName: jenkins-controller
  image: "jenkins/jenkins"
  tag: lts-jdk11
  imagePullPolicy: IfNotPresent
  installPlugins: false
  disableRememberMe: false
  resources:
    requests:
      cpu: 2
      memory: 2Gi
    limits:
      cpu: 6
      memory: 4Gi
  runAsUser: 1000
  fsGroup: 1000
  serviceType: ClusterIP
  persistence:
    enabled: true
    existingClaim: efs-pvc
    storageClassName: efs-sc
  ingress:
    enabled: true
    apiVersion: "networking.k8s.io/v1"
    ingressClassName: nginx
    annotations:
      kubernetes.io/ingress.class: nginx
    rules:
    - host: foo.jenkins.com
      http:
        paths:
        - path: /
          pathType: Prefix
          backend:
            service:
              name: jenkins
              port:
                number: 80
    tls:
    - secretName: jenkins-tls
      hosts:
      - foo.jenkins.com
The output before deploying Jenkins with Helm:
kubernetes git:(jenkins) ✗ kc get sc,pv,pvc -n jenkins
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
storageclass.storage.k8s.io/efs-sc efs.csi.aws.com Delete Immediate false 11m
storageclass.storage.k8s.io/gp2 (default) kubernetes.io/aws-ebs Delete WaitForFirstConsumer false 69m
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/efs-pv 5Gi RWX Retain Bound jenkins/efs-pvc efs-sc 11m
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/efs-pvc Bound efs-pv 5Gi RWX efs-sc 11m
and after deploying
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
storageclass.storage.k8s.io/efs-sc efs.csi.aws.com Delete Immediate false 15m
storageclass.storage.k8s.io/gp2 (default) kubernetes.io/aws-ebs Delete WaitForFirstConsumer false 73m
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/efs-pv 5Gi RWX Retain Bound jenkins/efs-pvc efs-sc 15m
persistentvolume/pvc-94adfdfb-a1db-4f16-8189-84ac20474607 8Gi RWO Delete Bound jenkins/jenkins gp2 12s
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/efs-pvc Bound efs-pv 5Gi RWX efs-sc 15m
persistentvolumeclaim/jenkins Bound pvc-94adfdfb-a1db-4f16-8189-84ac20474607 8Gi RWO gp2 17s
The output of mount when I exec inside the pod shows no NFS-mounted volume, which is really weird.
Any help is really appreciated. Thank you!
A good rest and a clear mind helped me fix this issue after an entire day of hitting my head against the wall.
The problem is that the persistence block should be at the top level, not under the controller block:
persistence:
  enabled: true
  existingClaim: efs-pvc
  storageClassName: efs-sc
controller:
  componentName: jenkins-controller
  image: "jenkins/jenkins"
  tag: lts-jdk11
  imagePullPolicy: IfNotPresent
  installPlugins: false
  disableRememberMe: false
  resources:
    requests:
      cpu: 2
      memory: 2Gi
    limits:
      cpu: 6
      memory: 4Gi
  runAsUser: 1000
  fsGroup: 1000
  serviceType: ClusterIP
  ingress:
    enabled: true
    apiVersion: "networking.k8s.io/v1"
    ingressClassName: nginx
    annotations:
      kubernetes.io/ingress.class: nginx
    rules:
    - host: foo.jenkins.com
      http:
        paths:
        - path: /
          pathType: Prefix
          backend:
            service:
              name: jenkins
              port:
                number: 80
    tls:
    - secretName: jenkins-tls
      hosts:
      - foo.jenkins.com
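To confirm the fix took effect, one can check for an nfs4 mount inside the running pod and verify that no extra gp2 PVC is created (the deploy/jenkins target and container name are assumptions about the chart's object names):

```shell
kubectl -n jenkins exec deploy/jenkins -c jenkins -- mount | grep nfs4
kubectl -n jenkins get pvc    # only efs-pvc should be listed
```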

How do I know how much memory I should provide in a k8s pod?

I have deployed Elasticsearch on k8s on my Mac (a minikube cluster). The Elasticsearch configuration file is:
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 7.10.0
  nodeSets:
  - name: default
    count: 1
    config:
      node.store.allow_mmap: false
    podTemplate:
      spec:
        containers:
        - name: elasticsearch
          env:
          - name: ES_JAVA_OPTS
            value: -Xms2g -Xmx2g
          resources:
            requests:
              memory: 4Gi
              cpu: 4
            limits:
              memory: 4Gi
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        storageClassName: standard
        resources:
          requests:
            storage: 1Gi
After I run kubectl apply -f es.yaml, the pod and services are created, but the pod stays Pending.
$ kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 28h
quickstart-es-default ClusterIP None <none> 9200/TCP 21m
quickstart-es-http ClusterIP 10.103.177.195 <none> 9200/TCP 21m
quickstart-es-transport ClusterIP None <none> 9300/TCP 21m
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
quickstart-es-default-0 0/1 Pending 0 21m
The output of kubectl describe pod is:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 22m (x2 over 22m) default-scheduler 0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
Warning FailedScheduling 40s (x17 over 22m) default-scheduler 0/1 nodes are available: 1 Insufficient memory.
It seems that I don't have enough memory for my pod. How can I allocate more memory to it?
Minikube itself starts with DefaultMemory = 2048 MB; you are hitting this limit.
With minikube you should think in advance about how much memory your pods/replicas use in total, so that you can use the --memory flag to allocate the needed amount of RAM.
You already got an answer in a separate question. In addition to that, I would add that you should always run minikube delete before starting minikube with the --memory= option, e.g. minikube start --memory=8192.
You can always check your current memory configuration with kubectl describe node, in the Capacity section, e.g.
Capacity:
cpu: 1
...
memory: 3785984Ki
pods: 110
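Putting the answer's steps together in one sequence (8192 MB is an example value; size it to the sum of your pods' memory requests):

```shell
minikube delete                  # --memory only takes effect on a fresh cluster
minikube start --memory=8192
kubectl describe node minikube | grep -A 5 Capacity   # verify the new capacity
```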

How can I use an existing PVC to helm install stable/jenkins

I am stuck with a Helm install of Jenkins :( please help!
I have predefined a storage class via:
$ kubectl apply -f generic-storage-class.yaml
with generic-storage-class.yaml:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: generic
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  zones: us-east-1a, us-east-1b, us-east-1c
  fsType: ext4
I then define a PVC via:
$ kubectl apply -f jenkins-pvc.yaml
with jenkins-pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jenkins-pvc
  namespace: jenkins-project
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
I can then see the PVC go into the BOUND status:
$ kubectl get pvc --all-namespaces
NAMESPACE NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
jenkins-project jenkins-pvc Bound pvc-a173294f-7cea-11e9-a90f-161c7e8a0754 20Gi RWO gp2 27m
But when I try to Helm install jenkins via:
$ helm install --name jenkins \
--set persistence.existingClaim=jenkins-pvc \
stable/jenkins --namespace jenkins-project
I get this output:
NAME: jenkins
LAST DEPLOYED: Wed May 22 17:07:44 2019
NAMESPACE: jenkins-project
STATUS: DEPLOYED
RESOURCES:
==> v1/ConfigMap
NAME DATA AGE
jenkins 5 0s
jenkins-tests 1 0s
==> v1/Deployment
NAME READY UP-TO-DATE AVAILABLE AGE
jenkins 0/1 1 0 0s
==> v1/PersistentVolumeClaim
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
jenkins Pending gp2 0s
==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
jenkins-6c9f9f5478-czdbh 0/1 Pending 0 0s
==> v1/Secret
NAME TYPE DATA AGE
jenkins Opaque 2 0s
==> v1/Service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
jenkins LoadBalancer 10.100.200.27 <pending> 8080:31157/TCP 0s
jenkins-agent ClusterIP 10.100.221.179 <none> 50000/TCP 0s
NOTES:
1. Get your 'admin' user password by running:
printf $(kubectl get secret --namespace jenkins-project jenkins -o jsonpath="{.data.jenkins-admin-password}" | base64 --decode);echo
2. Get the Jenkins URL to visit by running these commands in the same shell:
NOTE: It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status of by running 'kubectl get svc --namespace jenkins-project -w jenkins'
export SERVICE_IP=$(kubectl get svc --namespace jenkins-project jenkins --template "{{ range (index .status.loadBalancer.ingress 0) }}{{ . }}{{ end }}")
echo http://$SERVICE_IP:8080/login
3. Login with the password from step 1 and the username: admin
For more information on running Jenkins on Kubernetes, visit:
https://cloud.google.com/solutions/jenkins-on-container-engine
where I see Helm creating a new PersistentVolumeClaim called jenkins.
How come Helm did not use the existingClaim?
These are the only Helm values I see for the jenkins release:
$ helm get values jenkins
persistence:
  existingClaim: jenkins-pvc
and indeed it has just made its own PVC instead of using the pre-created one.
$ kubectl get pvc --all-namespaces
NAMESPACE NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
jenkins-project jenkins Bound pvc-a9caa3ba-7cf1-11e9-a90f-161c7e8a0754 8Gi RWO gp2 6m11s
jenkins-project jenkins-pvc Bound pvc-a173294f-7cea-11e9-a90f-161c7e8a0754 20Gi RWO gp2 56m
I feel like I am close but missing something basic. Any ideas?
So per Matthew L Daniel's comment I ran helm repo update and then re-ran the helm install command. This time it did not re-create the PVC but instead used the pre-made one.
My previous jenkins chart version was "jenkins-0.35.0"
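For anyone retracing this, the sequence that worked was roughly the following (Helm 2 syntax, since the deployment labels show Tiller; release and namespace names are taken from the question):

```shell
helm repo update               # pull the newer jenkins chart
helm delete --purge jenkins    # remove the release that made its own PVC
helm install --name jenkins \
  --set persistence.existingClaim=jenkins-pvc \
  stable/jenkins --namespace jenkins-project
```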
For anyone wondering what the deployment looked like:
Name: jenkins
Namespace: jenkins-project
CreationTimestamp: Wed, 22 May 2019 22:03:33 -0700
Labels: app.kubernetes.io/component=jenkins-master
app.kubernetes.io/instance=jenkins
app.kubernetes.io/managed-by=Tiller
app.kubernetes.io/name=jenkins
helm.sh/chart=jenkins-1.1.21
Annotations: deployment.kubernetes.io/revision: 1
Selector: app.kubernetes.io/component=jenkins-master,app.kubernetes.io/instance=jenkins
Replicas: 1 desired | 1 updated | 1 total | 0 available | 1 unavailable
StrategyType: Recreate
MinReadySeconds: 0
Pod Template:
Labels: app.kubernetes.io/component=jenkins-master
app.kubernetes.io/instance=jenkins
app.kubernetes.io/managed-by=Tiller
app.kubernetes.io/name=jenkins
helm.sh/chart=jenkins-1.1.21
Annotations: checksum/config: 867177d7ed5c3002201650b63dad00de7eb1e45a6622e543b80fae1f674a99cb
Service Account: jenkins
Init Containers:
copy-default-config:
Image: jenkins/jenkins:lts
Port: <none>
Host Port: <none>
Command:
sh
/var/jenkins_config/apply_config.sh
Limits:
cpu: 2
memory: 4Gi
Requests:
cpu: 50m
memory: 256Mi
Environment:
ADMIN_PASSWORD: <set to the key 'jenkins-admin-password' in secret 'jenkins'> Optional: false
ADMIN_USER: <set to the key 'jenkins-admin-user' in secret 'jenkins'> Optional: false
Mounts:
/tmp from tmp (rw)
/usr/share/jenkins/ref/plugins from plugins (rw)
/usr/share/jenkins/ref/secrets/ from secrets-dir (rw)
/var/jenkins_config from jenkins-config (rw)
/var/jenkins_home from jenkins-home (rw)
/var/jenkins_plugins from plugin-dir (rw)
Containers:
jenkins:
Image: jenkins/jenkins:lts
Ports: 8080/TCP, 50000/TCP
Host Ports: 0/TCP, 0/TCP
Args:
--argumentsRealm.passwd.$(ADMIN_USER)=$(ADMIN_PASSWORD)
--argumentsRealm.roles.$(ADMIN_USER)=admin
Limits:
cpu: 2
memory: 4Gi
Requests:
cpu: 50m
memory: 256Mi
Liveness: http-get http://:http/login delay=90s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:http/login delay=60s timeout=5s period=10s #success=1 #failure=3
Environment:
JAVA_OPTS:
JENKINS_OPTS:
JENKINS_SLAVE_AGENT_PORT: 50000
ADMIN_PASSWORD: <set to the key 'jenkins-admin-password' in secret 'jenkins'> Optional: false
ADMIN_USER: <set to the key 'jenkins-admin-user' in secret 'jenkins'> Optional: false
Mounts:
/tmp from tmp (rw)
/usr/share/jenkins/ref/plugins/ from plugin-dir (rw)
/usr/share/jenkins/ref/secrets/ from secrets-dir (rw)
/var/jenkins_config from jenkins-config (ro)
/var/jenkins_home from jenkins-home (rw)
Volumes:
plugins:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
tmp:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
jenkins-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: jenkins
Optional: false
plugin-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
secrets-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
jenkins-home:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: jenkins-pvc
ReadOnly: false
Conditions:
Type Status Reason
---- ------ ------
Available False MinimumReplicasUnavailable
Progressing True ReplicaSetUpdated
OldReplicaSets: jenkins-86dcf94679 (1/1 replicas created)
NewReplicaSet: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 42s deployment-controller Scaled up replica set jenkins-86dcf94679 to 1

kubernetes with ceph rbd

I want to use Ceph RBD with Kubernetes.
I have a Kubernetes 1.9.2 and Ceph 12.2.5 cluster, and I have installed the ceph-common package on my k8s nodes.
[root@docker09 manifest]# ceph auth get-key client.admin | base64
QVFEcmxmcGFmZXlZQ2hBQVFJWkExR0pXcS9RcXV4QmgvV3ZFWkE9PQ==
[root@docker09 manifest]# cat ceph-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret
data:
  key: QVFEcmxmcGFmZXlZQ2hBQVFJWkExR0pXcS9RcXV4QmgvV3ZFWkE9PQ==
kubectl create -f ceph-secret.yaml
Then:
[root@docker09 manifest]# cat ceph-pv.yaml | grep -v "#"
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ceph-pv
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  rbd:
    monitors:
      - 10.211.121.61:6789
      - 10.211.121.62:6789
      - 10.211.121.63:6789
    pool: rbd
    image: ceph-image
    user: admin
    secretRef:
      name: ceph-secret
    fsType: ext4
    readOnly: false
  persistentVolumeReclaimPolicy: Recycle
[root@docker09 manifest]# rbd info ceph-image
rbd image 'ceph-image':
size 2048 MB in 512 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.341d374b0dc51
format: 2
features: layering
flags:
create_timestamp: Fri Jun 15 15:58:04 2018
[root@docker09 manifest]# cat task-claim.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: ceph-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
[root@docker09 manifest]# kubectl get pv,pvc
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pv/ceph-pv 2Gi RWO Recycle Bound default/ceph-claim 54m
pv/host 10Gi RWO Retain Bound default/hostv 24d
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pvc/ceph-claim Bound ceph-pv 2Gi RWO 53m
pvc/hostv Bound host 10Gi RWO 24d
I create a pod that uses this PVC:
[root@docker09 manifest]# cat ceph-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: ceph-pod2
spec:
  containers:
  - name: ceph-busybox
    image: busybox
    command: ["sleep", "60000"]
    volumeMounts:
    - name: ceph-vol1
      mountPath: /usr/share/busybox
      readOnly: false
  volumes:
  - name: ceph-vol1
    persistentVolumeClaim:
      claimName: ceph-claim
[root@docker09 manifest]# kubectl get pod ceph-pod2 -o wide
NAME READY STATUS RESTARTS AGE IP NODE
ceph-pod2 0/1 ContainerCreating 0 14m <none> docker10
The pod is still in ContainerCreating status.
[root@docker09 manifest]# kubectl describe pod ceph-pod2
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 15m default-scheduler Successfully assigned ceph-pod2 to docker10
Normal SuccessfulMountVolume 15m kubelet, docker10 MountVolume.SetUp succeeded for volume "default-token-85rc7"
Warning FailedMount 1m (x6 over 12m) kubelet, docker10 Unable to mount volumes for pod "ceph-pod2_default(56af9345-7073-11e8-aeb6-1c98ec29cbec)": timeout expired waiting for volumes to attach/mount for pod "default"/"ceph-pod2". list of unattached/unmounted volumes=[ceph-vol1]
I don't know why this is happening; I need your help. Best regards.
There's no need to reinvent the wheel here. There's already a project called Rook, which deploys Ceph on Kubernetes, and it's super easy to run.
https://rook.io/
rbd -v (included in ceph-common) should return the same version as your cluster. You should also check the kubelet messages.
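A possible checklist for that, assuming a systemd-managed kubelet on the node (docker10) where the pod was scheduled:

```shell
rbd -v                                    # client version on the k8s node
ceph versions                             # cluster version (luminous and later)
# Try mapping the image by hand to surface krbd/auth/feature errors
rbd map ceph-image --pool rbd --id admin
# Look for the actual mount error in the kubelet logs
journalctl -u kubelet | grep -i rbd
```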

Google Kubernetes Engine: Not seeing mount persistent volume in the instance

I created a 200G disk with the command gcloud compute disks create --size 200GB my-disk,
then created a PersistentVolume:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-volume
spec:
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: my-disk
    fsType: ext4
then created a PersistentVolumeClaim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi
then created a StatefulSet that mounts the volume at /mnt/disks, which is an existing directory. statefulset.yaml:
apiVersion: apps/v1beta2
kind: StatefulSet
metadata:
  name: ...
spec:
  ...
  spec:
    containers:
    - name: ...
      ...
      volumeMounts:
      - name: my-volume
        mountPath: /mnt/disks
    volumes:
    - name: my-volume
      emptyDir: {}
  volumeClaimTemplates:
  - metadata:
      name: my-claim
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 200Gi
I ran kubectl get pv and saw that a disk was successfully provisioned and bound for each instance:
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
my-volume 200Gi RWO Retain Available 19m
pvc-17c60f45-2e4f-11e8-9b77-42010af0000e 200Gi RWO Delete Bound default/my-claim-xxx_1 standard 13m
pvc-5972c804-2e4e-11e8-9b77-42010af0000e 200Gi RWO Delete Bound default/my-claim standard 18m
pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e 200Gi RWO Delete Bound default/my-claimxxx_0 standard 18m
but when I SSH into an instance and run df -hT, I do not see the mounted volume. Below is the output:
Filesystem Type Size Used Avail Use% Mounted on
/dev/root ext2 1.2G 447M 774M 37% /
devtmpfs devtmpfs 1.9G 0 1.9G 0% /dev
tmpfs tmpfs 1.9G 0 1.9G 0% /dev/shm
tmpfs tmpfs 1.9G 744K 1.9G 1% /run
tmpfs tmpfs 1.9G 0 1.9G 0% /sys/fs/cgroup
tmpfs tmpfs 1.9G 0 1.9G 0% /tmp
tmpfs tmpfs 256K 0 256K 0% /mnt/disks
/dev/sda8 ext4 12M 28K 12M 1% /usr/share/oem
/dev/sda1 ext4 95G 3.5G 91G 4% /mnt/stateful_partition
tmpfs tmpfs 1.0M 128K 896K 13% /var/lib/cloud
overlayfs overlay 1.0M 148K 876K 15% /etc
Does anyone have any idea?
Also worth mentioning: I'm trying to mount the disk into a Docker image running in Kubernetes Engine. The pod was created with the commands below:
docker build -t gcr.io/xxx .
gcloud docker -- push gcr.io/xxx
kubectl create -f statefulset.yaml
The instance I SSHed into is the one that runs the Docker image. I do not see the volume in either the instance or the Docker container.
UPDATE
I found the volume. I ran df -ahT on the instance and saw the relevant entries:
/dev/sdb - - - - - /var/lib/kubelet/plugins/kubernetes.io/gce-pd/mounts/gke-xxx-cluster-c-pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
/dev/sdb - - - - - /var/lib/kubelet/plugins/kubernetes.io/gce-pd/mounts/gke-xxx-cluster-c-pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
/dev/sdb - - - - - /home/kubernetes/containerized_mounter/rootfs/var/lib/kubelet/plugins/kubernetes.io/gce-pd/mounts/gke-xxx-cluster-c-pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
/dev/sdb - - - - - /home/kubernetes/containerized_mounter/rootfs/var/lib/kubelet/plugins/kubernetes.io/gce-pd/mounts/gke-xxx-cluster-c-pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
/dev/sdb - - - - - /var/lib/kubelet/pods/61bb679b-2e4e-11e8-9b77-42010af0000e/volumes/kubernetes.io~gce-pd/pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
/dev/sdb - - - - - /var/lib/kubelet/pods/61bb679b-2e4e-11e8-9b77-42010af0000e/volumes/kubernetes.io~gce-pd/pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
/dev/sdb - - - - - /home/kubernetes/containerized_mounter/rootfs/var/lib/kubelet/pods/61bb679b-2e4e-11e8-9b77-42010af0000e/volumes/kubernetes.io~gce-pd/pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
/dev/sdb - - - - - /home/kubernetes/containerized_mounter/rootfs/var/lib/kubelet/pods/61bb679b-2e4e-11e8-9b77-42010af0000e/volumes/kubernetes.io~gce-pd/pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
Then I went into the Docker container and ran df -ahT, and got:
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda1 ext4 95G 3.5G 91G 4% /mnt/disks
Why am I seeing a 95G total size instead of 200G, the size of my volume?
More info:
kubectl describe pod
Name: xxx-replicaset-0
Namespace: default
Node: gke-xxx-cluster-default-pool-5e49501c-nrzt/10.128.0.17
Start Time: Fri, 23 Mar 2018 11:40:57 -0400
Labels: app=xxx-replicaset
controller-revision-hash=xxx-replicaset-755c4f7cff
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"StatefulSet","namespace":"default","name":"xxx-replicaset","uid":"d6c3511f-2eaf-11e8-b14e-42010af0000...
kubernetes.io/limit-ranger=LimitRanger plugin set: cpu request for container xxx-deployment
Status: Running
IP: 10.52.4.5
Created By: StatefulSet/xxx-replicaset
Controlled By: StatefulSet/xxx-replicaset
Containers:
xxx-deployment:
Container ID: docker://137b3966a14538233ed394a3d0d1501027966b972d8ad821951f53d9eb908615
Image: gcr.io/sampeproject/xxxstaging:v1
Image ID: docker-pullable://gcr.io/sampeproject/xxxstaging#sha256:a96835c2597cfae3670a609a69196c6cd3d9cc9f2f0edf5b67d0a4afdd772e0b
Port: 8080/TCP
State: Running
Started: Fri, 23 Mar 2018 11:42:17 -0400
Ready: True
Restart Count: 0
Requests:
cpu: 100m
Environment:
Mounts:
/mnt/disks from my-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-hj65g (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
my-claim:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: my-claim-xxx-replicaset-0
ReadOnly: false
my-volume:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
default-token-hj65g:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-hj65g
Optional: false
QoS Class: Burstable
Node-Selectors:
Tolerations: node.alpha.kubernetes.io/notReady:NoExecute for 300s
node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 10m (x4 over 10m) default-scheduler PersistentVolumeClaim is not bound: "my-claim-xxx-replicaset-0" (repeated 5 times)
Normal Scheduled 9m default-scheduler Successfully assigned xxx-replicaset-0 to gke-xxx-cluster-default-pool-5e49501c-nrzt
Normal SuccessfulMountVolume 9m kubelet, gke-xxx-cluster-default-pool-5e49501c-nrzt MountVolume.SetUp succeeded for volume "my-volume"
Normal SuccessfulMountVolume 9m kubelet, gke-xxx-cluster-default-pool-5e49501c-nrzt MountVolume.SetUp succeeded for volume "default-token-hj65g"
Normal SuccessfulMountVolume 9m kubelet, gke-xxx-cluster-default-pool-5e49501c-nrzt MountVolume.SetUp succeeded for volume "pvc-902c57c5-2eb0-11e8-b14e-42010af0000e"
Normal Pulling 9m kubelet, gke-xxx-cluster-default-pool-5e49501c-nrzt pulling image "gcr.io/sampeproject/xxxstaging:v1"
Normal Pulled 8m kubelet, gke-xxx-cluster-default-pool-5e49501c-nrzt Successfully pulled image "gcr.io/sampeproject/xxxstaging:v1"
Normal Created 8m kubelet, gke-xxx-cluster-default-pool-5e49501c-nrzt Created container
Normal Started 8m kubelet, gke-xxx-cluster-default-pool-5e49501c-nrzt Started container
It seems like it did not mount the correct volume. I ran lsblk in the Docker container:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 100G 0 disk
├─sda1 8:1 0 95.9G 0 part /mnt/disks
├─sda2 8:2 0 16M 0 part
├─sda3 8:3 0 2G 0 part
├─sda4 8:4 0 16M 0 part
├─sda5 8:5 0 2G 0 part
├─sda6 8:6 0 512B 0 part
├─sda7 8:7 0 512B 0 part
├─sda8 8:8 0 16M 0 part
├─sda9 8:9 0 512B 0 part
├─sda10 8:10 0 512B 0 part
├─sda11 8:11 0 8M 0 part
└─sda12 8:12 0 32M 0 part
sdb 8:16 0 200G 0 disk
Why is this happening?
When you use PVCs, K8s manages persistent disks for you.
Exactly how PVs are defined is determined by the provisioner in the storage class. Since you use GKE, your default SC uses the kubernetes.io/gce-pd provisioner (https://kubernetes.io/docs/concepts/storage/storage-classes/#gce).
In other words, a new PV is created for each pod.
If you would like to use an existing disk, you can use a Volume instead of a PVC (https://kubernetes.io/docs/concepts/storage/volumes/#gcepersistentdisk).
The PVC is not mounted into your container because you did not actually specify the PVC in your container's volumeMounts; only the emptyDir volume was specified.
I actually recently modified the GKE StatefulSet tutorial. Before, some of the steps were incorrect and said to manually create the PD and PV objects; it has been corrected to use dynamic provisioning.
Please try the updated steps and see if they work for you.
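A minimal sketch of the change that answer describes, using the names from the question: mount the claim generated by volumeClaimTemplates instead of the emptyDir, so the 200G PD ends up at /mnt/disks (the container name is a placeholder):

```yaml
apiVersion: apps/v1beta2
kind: StatefulSet
spec:
  template:
    spec:
      containers:
      - name: my-container            # placeholder
        volumeMounts:
        - name: my-claim              # must match the volumeClaimTemplates name
          mountPath: /mnt/disks
  volumeClaimTemplates:
  - metadata:
      name: my-claim
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 200Gi
```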
