Google Kubernetes Engine: Not seeing mounted persistent volume in the instance - docker

I created a 200G disk with the command gcloud compute disks create --size 200GB my-disk,
then created a PersistentVolume:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-volume
spec:
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: my-disk
    fsType: ext4
then created a PersistentVolumeClaim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi
then created a StatefulSet and mounted the volume at /mnt/disks, which is an existing directory. statefulset.yaml:
apiVersion: apps/v1beta2
kind: StatefulSet
metadata:
  name: ...
spec:
  ...
    spec:
      containers:
      - name: ...
        ...
        volumeMounts:
        - name: my-volume
          mountPath: /mnt/disks
      volumes:
      - name: my-volume
        emptyDir: {}
  volumeClaimTemplates:
  - metadata:
      name: my-claim
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 200Gi
I ran the command kubectl get pv and saw that the disk was successfully mounted to each instance:
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
my-volume 200Gi RWO Retain Available 19m
pvc-17c60f45-2e4f-11e8-9b77-42010af0000e 200Gi RWO Delete Bound default/my-claim-xxx_1 standard 13m
pvc-5972c804-2e4e-11e8-9b77-42010af0000e 200Gi RWO Delete Bound default/my-claim standard 18m
pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e 200Gi RWO Delete Bound default/my-claimxxx_0 standard 18m
but when I SSH into an instance and run df -hT, I do not see the mounted volume. Below is the output:
Filesystem Type Size Used Avail Use% Mounted on
/dev/root ext2 1.2G 447M 774M 37% /
devtmpfs devtmpfs 1.9G 0 1.9G 0% /dev
tmpfs tmpfs 1.9G 0 1.9G 0% /dev/shm
tmpfs tmpfs 1.9G 744K 1.9G 1% /run
tmpfs tmpfs 1.9G 0 1.9G 0% /sys/fs/cgroup
tmpfs tmpfs 1.9G 0 1.9G 0% /tmp
tmpfs tmpfs 256K 0 256K 0% /mnt/disks
/dev/sda8 ext4 12M 28K 12M 1% /usr/share/oem
/dev/sda1 ext4 95G 3.5G 91G 4% /mnt/stateful_partition
tmpfs tmpfs 1.0M 128K 896K 13% /var/lib/cloud
overlayfs overlay 1.0M 148K 876K 15% /etc
Does anyone have any idea?
Also worth mentioning that I'm trying to mount the disk into a Docker container running in Kubernetes Engine. The pod was created with the commands below:
docker build -t gcr.io/xxx .
gcloud docker -- push gcr.io/xxx
kubectl create -f statefulset.yaml
The instance I SSHed into is the one that runs the Docker container. I do not see the volume in either the instance or the Docker container.
UPDATE
I found the volume. I ran df -ahT on the instance and saw the relevant entries:
/dev/sdb - - - - - /var/lib/kubelet/plugins/kubernetes.io/gce-pd/mounts/gke-xxx-cluster-c-pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
/dev/sdb - - - - - /var/lib/kubelet/plugins/kubernetes.io/gce-pd/mounts/gke-xxx-cluster-c-pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
/dev/sdb - - - - - /home/kubernetes/containerized_mounter/rootfs/var/lib/kubelet/plugins/kubernetes.io/gce-pd/mounts/gke-xxx-cluster-c-pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
/dev/sdb - - - - - /home/kubernetes/containerized_mounter/rootfs/var/lib/kubelet/plugins/kubernetes.io/gce-pd/mounts/gke-xxx-cluster-c-pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
/dev/sdb - - - - - /var/lib/kubelet/pods/61bb679b-2e4e-11e8-9b77-42010af0000e/volumes/kubernetes.io~gce-pd/pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
/dev/sdb - - - - - /var/lib/kubelet/pods/61bb679b-2e4e-11e8-9b77-42010af0000e/volumes/kubernetes.io~gce-pd/pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
/dev/sdb - - - - - /home/kubernetes/containerized_mounter/rootfs/var/lib/kubelet/pods/61bb679b-2e4e-11e8-9b77-42010af0000e/volumes/kubernetes.io~gce-pd/pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
/dev/sdb - - - - - /home/kubernetes/containerized_mounter/rootfs/var/lib/kubelet/pods/61bb679b-2e4e-11e8-9b77-42010af0000e/volumes/kubernetes.io~gce-pd/pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
Then I went into the Docker container, ran df -ahT, and got:
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda1 ext4 95G 3.5G 91G 4% /mnt/disks
Why am I seeing a 95G total size instead of 200G, which is the size of my volume?
More info:
kubectl describe pod
Name: xxx-replicaset-0
Namespace: default
Node: gke-xxx-cluster-default-pool-5e49501c-nrzt/10.128.0.17
Start Time: Fri, 23 Mar 2018 11:40:57 -0400
Labels: app=xxx-replicaset
controller-revision-hash=xxx-replicaset-755c4f7cff
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"StatefulSet","namespace":"default","name":"xxx-replicaset","uid":"d6c3511f-2eaf-11e8-b14e-42010af0000...
kubernetes.io/limit-ranger=LimitRanger plugin set: cpu request for container xxx-deployment
Status: Running
IP: 10.52.4.5
Created By: StatefulSet/xxx-replicaset
Controlled By: StatefulSet/xxx-replicaset
Containers:
xxx-deployment:
Container ID: docker://137b3966a14538233ed394a3d0d1501027966b972d8ad821951f53d9eb908615
Image: gcr.io/sampeproject/xxxstaging:v1
Image ID: docker-pullable://gcr.io/sampeproject/xxxstaging@sha256:a96835c2597cfae3670a609a69196c6cd3d9cc9f2f0edf5b67d0a4afdd772e0b
Port: 8080/TCP
State: Running
Started: Fri, 23 Mar 2018 11:42:17 -0400
Ready: True
Restart Count: 0
Requests:
cpu: 100m
Environment:
Mounts:
/mnt/disks from my-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-hj65g (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
my-claim:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: my-claim-xxx-replicaset-0
ReadOnly: false
my-volume:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
default-token-hj65g:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-hj65g
Optional: false
QoS Class: Burstable
Node-Selectors:
Tolerations: node.alpha.kubernetes.io/notReady:NoExecute for 300s
node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 10m (x4 over 10m) default-scheduler PersistentVolumeClaim is not bound: "my-claim-xxx-replicaset-0" (repeated 5 times)
Normal Scheduled 9m default-scheduler Successfully assigned xxx-replicaset-0 to gke-xxx-cluster-default-pool-5e49501c-nrzt
Normal SuccessfulMountVolume 9m kubelet, gke-xxx-cluster-default-pool-5e49501c-nrzt MountVolume.SetUp succeeded for volume "my-volume"
Normal SuccessfulMountVolume 9m kubelet, gke-xxx-cluster-default-pool-5e49501c-nrzt MountVolume.SetUp succeeded for volume "default-token-hj65g"
Normal SuccessfulMountVolume 9m kubelet, gke-xxx-cluster-default-pool-5e49501c-nrzt MountVolume.SetUp succeeded for volume "pvc-902c57c5-2eb0-11e8-b14e-42010af0000e"
Normal Pulling 9m kubelet, gke-xxx-cluster-default-pool-5e49501c-nrzt pulling image "gcr.io/sampeproject/xxxstaging:v1"
Normal Pulled 8m kubelet, gke-xxx-cluster-default-pool-5e49501c-nrzt Successfully pulled image "gcr.io/sampeproject/xxxstaging:v1"
Normal Created 8m kubelet, gke-xxx-cluster-default-pool-5e49501c-nrzt Created container
Normal Started 8m kubelet, gke-xxx-cluster-default-pool-5e49501c-nrzt Started container
It seems like it did not mount the correct volume. I ran lsblk in the Docker container:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 100G 0 disk
├─sda1 8:1 0 95.9G 0 part /mnt/disks
├─sda2 8:2 0 16M 0 part
├─sda3 8:3 0 2G 0 part
├─sda4 8:4 0 16M 0 part
├─sda5 8:5 0 2G 0 part
├─sda6 8:6 0 512B 0 part
├─sda7 8:7 0 512B 0 part
├─sda8 8:8 0 16M 0 part
├─sda9 8:9 0 512B 0 part
├─sda10 8:10 0 512B 0 part
├─sda11 8:11 0 8M 0 part
└─sda12 8:12 0 32M 0 part
sdb 8:16 0 200G 0 disk
Why is this happening?

When you use PVCs, Kubernetes manages persistent disks for you.
How PVs are defined is determined by the provisioner in the storage class. Since you use GKE, your default StorageClass uses the kubernetes.io/gce-pd provisioner (https://kubernetes.io/docs/concepts/storage/storage-classes/#gce).
In other words, a new PV is created for each pod.
If you would like to use an existing disk, you can use a Volume instead of a PVC (https://kubernetes.io/docs/concepts/storage/volumes/#gcepersistentdisk).
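For reference, mounting the pre-created disk directly would look roughly like this (a sketch using the disk name and mount path from the question; the pod name and container image are illustrative):

```yaml
# Sketch: mount an existing GCE PD directly, bypassing PVC provisioning.
# Assumes the disk "my-disk" from the question; pd-pod/busybox are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: pd-pod
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: my-volume
          mountPath: /mnt/disks
  volumes:
    - name: my-volume
      gcePersistentDisk:
        pdName: my-disk
        fsType: ext4
```

Note that the pod must be scheduled onto a node in the same zone as the disk, and a GCE PD mounted read-write can only be attached to one node at a time.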

The PVC is not mounted into your container because you did not actually specify the PVC in your container's volumeMounts. Only the emptyDir volume was specified.
I actually recently modified the GKE StatefulSet tutorial. Before, some of the steps were incorrect and said to manually create the PD and PV objects. It has since been corrected to use dynamic provisioning.
Please try that out and see if the updated steps work for you.
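Concretely, the fix in the question's statefulset.yaml is to point the volumeMount at the claim template's name and drop the emptyDir. A sketch using the names from the question:

```yaml
# Sketch: the volumeMount name must match the volumeClaimTemplate's metadata.name.
spec:
  template:
    spec:
      containers:
        - name: ...
          volumeMounts:
            - name: my-claim        # was "my-volume", which pointed at the emptyDir
              mountPath: /mnt/disks
      # the emptyDir volume "my-volume" is no longer needed for this path
  volumeClaimTemplates:
    - metadata:
        name: my-claim
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 200Gi
```

With this, each replica gets its own dynamically provisioned PD mounted at /mnt/disks, which also explains the earlier observation: the 95G filesystem seen there was the node's boot disk (sda1), not the 200G PD (sdb).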

Related

Unable to bring up Jenkins using Helm

I'm following the doc on the Jenkins page. I'm running a 2-node K8s cluster (1 master, 1 worker) with the service type set to NodePort, and for some reason the init container crashes and never comes up.
kubectl describe pod jenkins-0 -n jenkins
Name: jenkins-0
Namespace: jenkins
Priority: 0
Node: vlab048009.dom047600.lab/10.204.110.35
Start Time: Wed, 09 Dec 2020 23:19:59 +0530
Labels: app.kubernetes.io/component=jenkins-controller
app.kubernetes.io/instance=jenkins
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=jenkins
controller-revision-hash=jenkins-c5795f65f
statefulset.kubernetes.io/pod-name=jenkins-0
Annotations: checksum/config: 2a4c2b3ea5dea271cb7c0b8e8582b682814d39f8e933e0348725b0b9a7dbf258
Status: Pending
IP: 10.244.1.28
IPs:
IP: 10.244.1.28
Controlled By: StatefulSet/jenkins
Init Containers:
init:
Container ID: docker://95e3298740bcaed3c2adf832f41d346e563c92add728080cfdcfcac375e0254d
Image: jenkins/jenkins:lts
Image ID: docker-pullable://jenkins/jenkins@sha256:1433deaac433ce20c534d8b87fcd0af3f25260f375f4ee6bdb41d70e1769d9ce
Port: <none>
Host Port: <none>
Command:
sh
/var/jenkins_config/apply_config.sh
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Wed, 09 Dec 2020 23:41:28 +0530
Finished: Wed, 09 Dec 2020 23:41:29 +0530
Ready: False
Restart Count: 9
Limits:
cpu: 2
memory: 4Gi
Requests:
cpu: 50m
memory: 256Mi
Environment: <none>
Mounts:
/usr/share/jenkins/ref/plugins from plugins (rw)
/var/jenkins_config from jenkins-config (rw)
/var/jenkins_home from jenkins-home (rw)
/var/jenkins_plugins from plugin-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from jenkins-token-ppfw7 (ro)
Containers:
jenkins:
Container ID:
Image: jenkins/jenkins:lts
Image ID:
Ports: 8080/TCP, 50000/TCP
Host Ports: 0/TCP, 0/TCP
Args:
--httpPort=8080
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Limits:
cpu: 2
memory: 4Gi
Requests:
cpu: 50m
memory: 256Mi
Liveness: http-get http://:http/login delay=0s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:http/login delay=0s timeout=5s period=10s #success=1 #failure=3
Startup: http-get http://:http/login delay=0s timeout=5s period=10s #success=1 #failure=12
Environment:
POD_NAME: jenkins-0 (v1:metadata.name)
JAVA_OPTS: -Dcasc.reload.token=$(POD_NAME)
JENKINS_OPTS:
JENKINS_SLAVE_AGENT_PORT: 50000
CASC_JENKINS_CONFIG: /var/jenkins_home/casc_configs
Mounts:
/run/secrets/chart-admin-password from admin-secret (ro,path="jenkins-admin-password")
/run/secrets/chart-admin-username from admin-secret (ro,path="jenkins-admin-user")
/usr/share/jenkins/ref/plugins/ from plugin-dir (rw)
/var/jenkins_config from jenkins-config (ro)
/var/jenkins_home from jenkins-home (rw)
/var/jenkins_home/casc_configs from sc-config-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from jenkins-token-ppfw7 (ro)
config-reload:
Container ID:
Image: kiwigrid/k8s-sidecar:0.1.275
Image ID:
Port: <none>
Host Port: <none>
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Environment:
POD_NAME: jenkins-0 (v1:metadata.name)
LABEL: jenkins-jenkins-config
FOLDER: /var/jenkins_home/casc_configs
NAMESPACE: jenkins
REQ_URL: http://localhost:8080/reload-configuration-as-code/?casc-reload-token=$(POD_NAME)
REQ_METHOD: POST
REQ_RETRY_CONNECT: 10
Mounts:
/var/jenkins_home from jenkins-home (rw)
/var/jenkins_home/casc_configs from sc-config-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from jenkins-token-ppfw7 (ro)
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
plugins:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
jenkins-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: jenkins
Optional: false
plugin-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
jenkins-home:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
sc-config-volume:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
admin-secret:
Type: Secret (a volume populated by a Secret)
SecretName: jenkins
Optional: false
jenkins-token-ppfw7:
Type: Secret (a volume populated by a Secret)
SecretName: jenkins-token-ppfw7
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 22m default-scheduler Successfully assigned jenkins/jenkins-0 to vlab048009.dom047600.lab
Normal Pulled 22m kubelet Successfully pulled image "jenkins/jenkins:lts" in 4.648858149s
Normal Pulled 21m kubelet Successfully pulled image "jenkins/jenkins:lts" in 1.407161762s
Normal Pulled 21m kubelet Successfully pulled image "jenkins/jenkins:lts" in 4.963056101s
Normal Created 21m (x4 over 22m) kubelet Created container init
Normal Started 21m (x4 over 22m) kubelet Started container init
Normal Pulled 21m kubelet Successfully pulled image "jenkins/jenkins:lts" in 8.0749493s
Normal Pulling 20m (x5 over 22m) kubelet Pulling image "jenkins/jenkins:lts"
Warning BackOff 2m1s (x95 over 21m) kubelet Back-off restarting failed container
kubectl logs -f jenkins-0 -c init -n jenkins
Error from server: Get "https://10.204.110.35:10250/containerLogs/jenkins/jenkins-0/init?follow=true": dial tcp 10.204.110.35:10250: connect: no route to host
kubectl get events -n jenkins
LAST SEEN TYPE REASON OBJECT MESSAGE
23m Normal Scheduled pod/jenkins-0 Successfully assigned jenkins/jenkins-0 to vlab048009.dom047600.lab
21m Normal Pulling pod/jenkins-0 Pulling image "jenkins/jenkins:lts"
23m Normal Pulled pod/jenkins-0 Successfully pulled image "jenkins/jenkins:lts" in 4.648858149s
22m Normal Created pod/jenkins-0 Created container init
22m Normal Started pod/jenkins-0 Started container init
23m Normal Pulled pod/jenkins-0 Successfully pulled image "jenkins/jenkins:lts" in 1.407161762s
3m30s Warning BackOff pod/jenkins-0 Back-off restarting failed container
23m Normal Pulled pod/jenkins-0 Successfully pulled image "jenkins/jenkins:lts" in 4.963056101s
22m Normal Pulled pod/jenkins-0 Successfully pulled image "jenkins/jenkins:lts" in 8.0749493s
23m Normal SuccessfulCreate statefulset/jenkins create Pod jenkins-0 in StatefulSet jenkins successful
Every 2.0s: kubectl get all -n jenkins Wed Dec 9 23:48:31 2020
NAME READY STATUS RESTARTS AGE
pod/jenkins-0 0/2 Init:CrashLoopBackOff 10 28m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/jenkins NodePort 10.103.209.122 <none> 8080:32323/TCP 28m
service/jenkins-agent ClusterIP 10.103.195.120 <none> 50000/TCP 28m
NAME READY AGE
statefulset.apps/jenkins 0/1 28m
I'm using Helm 3 to deploy Jenkins, with changes made pretty much as per the doc.
I'm not sure how to debug this issue with the init container crashing. Any leads or a solution would be appreciated, thanks.
First, make sure that you have executed:
$ helm repo update
Also execute:
$ kubectl logs <pod-name> -c <init-container-name>
to inspect the init container. Then you will be able to debug this setup properly.
This might be a connection issue to the Jenkins update site. You can build an image which contains the required plugins and disable the plugin download. Take a look: jenkins-kubernetes.
See more: jenkins-helm-issues - in that case the problem lay in plug-in compatibility.
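As a sketch of the "bake the plugins into the image" approach (the plugins.txt file is a hypothetical placeholder; jenkins-plugin-cli ships with recent jenkins/jenkins:lts images, while older images used install-plugins.sh instead):

```dockerfile
# Hypothetical sketch: preinstall plugins at build time so the init
# container never needs to reach the Jenkins update site at startup.
FROM jenkins/jenkins:lts
COPY plugins.txt /usr/share/jenkins/ref/plugins.txt
RUN jenkins-plugin-cli --plugin-file /usr/share/jenkins/ref/plugins.txt
```

You would then point the chart at the resulting image and disable its plugin download (e.g. by setting the chart's install-plugins list to empty; the exact value name depends on the chart version).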

How can I use an existing PVC to helm install stable/jenkins

I am stuck with a Helm install of Jenkins :( Please help!
I have predefined a storage class via:
$ kubectl apply -f generic-storage-class.yaml
with generic-storage-class.yaml:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: generic
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  zones: us-east-1a, us-east-1b, us-east-1c
  fsType: ext4
I then define a PVC via:
$ kubectl apply -f jenkins-pvc.yaml
with jenkins-pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jenkins-pvc
  namespace: jenkins-project
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
I can then see the PVC go into the BOUND status:
$ kubectl get pvc --all-namespaces
NAMESPACE NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
jenkins-project jenkins-pvc Bound pvc-a173294f-7cea-11e9-a90f-161c7e8a0754 20Gi RWO gp2 27m
But when I try to Helm install jenkins via:
$ helm install --name jenkins \
--set persistence.existingClaim=jenkins-pvc \
stable/jenkins --namespace jenkins-project
I get this output:
NAME: jenkins
LAST DEPLOYED: Wed May 22 17:07:44 2019
NAMESPACE: jenkins-project
STATUS: DEPLOYED
RESOURCES:
==> v1/ConfigMap
NAME DATA AGE
jenkins 5 0s
jenkins-tests 1 0s
==> v1/Deployment
NAME READY UP-TO-DATE AVAILABLE AGE
jenkins 0/1 1 0 0s
==> v1/PersistentVolumeClaim
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
jenkins Pending gp2 0s
==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
jenkins-6c9f9f5478-czdbh 0/1 Pending 0 0s
==> v1/Secret
NAME TYPE DATA AGE
jenkins Opaque 2 0s
==> v1/Service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
jenkins LoadBalancer 10.100.200.27 <pending> 8080:31157/TCP 0s
jenkins-agent ClusterIP 10.100.221.179 <none> 50000/TCP 0s
NOTES:
1. Get your 'admin' user password by running:
printf $(kubectl get secret --namespace jenkins-project jenkins -o jsonpath="{.data.jenkins-admin-password}" | base64 --decode);echo
2. Get the Jenkins URL to visit by running these commands in the same shell:
NOTE: It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status of by running 'kubectl get svc --namespace jenkins-project -w jenkins'
export SERVICE_IP=$(kubectl get svc --namespace jenkins-project jenkins --template "{{ range (index .status.loadBalancer.ingress 0) }}{{ . }}{{ end }}")
echo http://$SERVICE_IP:8080/login
3. Login with the password from step 1 and the username: admin
For more information on running Jenkins on Kubernetes, visit:
https://cloud.google.com/solutions/jenkins-on-container-engine
where I see Helm creating a new PersistentVolumeClaim called jenkins.
How come Helm did not use the existingClaim?
I see this as the only helm values for the jenkins release
$ helm get values jenkins
persistence:
existingClaim: jenkins-pvc
and indeed it has just made its own PVC instead of using the pre-created one.
kubectl get pvc --all-namespaces
NAMESPACE NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
jenkins-project jenkins Bound pvc-a9caa3ba-7cf1-11e9-a90f-161c7e8a0754 8Gi RWO gp2 6m11s
jenkins-project jenkins-pvc Bound pvc-a173294f-7cea-11e9-a90f-161c7e8a0754 20Gi RWO gp2 56m
I feel like I am close but missing something basic. Any ideas?
So per Matthew L Daniel's comment I ran helm repo update and then re-ran the helm install command. This time it did not re-create the PVC but instead used the pre-made one.
My previous jenkins chart version was "jenkins-0.35.0"
For anyone wondering what the deployment looked like:
Name: jenkins
Namespace: jenkins-project
CreationTimestamp: Wed, 22 May 2019 22:03:33 -0700
Labels: app.kubernetes.io/component=jenkins-master
app.kubernetes.io/instance=jenkins
app.kubernetes.io/managed-by=Tiller
app.kubernetes.io/name=jenkins
helm.sh/chart=jenkins-1.1.21
Annotations: deployment.kubernetes.io/revision: 1
Selector: app.kubernetes.io/component=jenkins-master,app.kubernetes.io/instance=jenkins
Replicas: 1 desired | 1 updated | 1 total | 0 available | 1 unavailable
StrategyType: Recreate
MinReadySeconds: 0
Pod Template:
Labels: app.kubernetes.io/component=jenkins-master
app.kubernetes.io/instance=jenkins
app.kubernetes.io/managed-by=Tiller
app.kubernetes.io/name=jenkins
helm.sh/chart=jenkins-1.1.21
Annotations: checksum/config: 867177d7ed5c3002201650b63dad00de7eb1e45a6622e543b80fae1f674a99cb
Service Account: jenkins
Init Containers:
copy-default-config:
Image: jenkins/jenkins:lts
Port: <none>
Host Port: <none>
Command:
sh
/var/jenkins_config/apply_config.sh
Limits:
cpu: 2
memory: 4Gi
Requests:
cpu: 50m
memory: 256Mi
Environment:
ADMIN_PASSWORD: <set to the key 'jenkins-admin-password' in secret 'jenkins'> Optional: false
ADMIN_USER: <set to the key 'jenkins-admin-user' in secret 'jenkins'> Optional: false
Mounts:
/tmp from tmp (rw)
/usr/share/jenkins/ref/plugins from plugins (rw)
/usr/share/jenkins/ref/secrets/ from secrets-dir (rw)
/var/jenkins_config from jenkins-config (rw)
/var/jenkins_home from jenkins-home (rw)
/var/jenkins_plugins from plugin-dir (rw)
Containers:
jenkins:
Image: jenkins/jenkins:lts
Ports: 8080/TCP, 50000/TCP
Host Ports: 0/TCP, 0/TCP
Args:
--argumentsRealm.passwd.$(ADMIN_USER)=$(ADMIN_PASSWORD)
--argumentsRealm.roles.$(ADMIN_USER)=admin
Limits:
cpu: 2
memory: 4Gi
Requests:
cpu: 50m
memory: 256Mi
Liveness: http-get http://:http/login delay=90s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:http/login delay=60s timeout=5s period=10s #success=1 #failure=3
Environment:
JAVA_OPTS:
JENKINS_OPTS:
JENKINS_SLAVE_AGENT_PORT: 50000
ADMIN_PASSWORD: <set to the key 'jenkins-admin-password' in secret 'jenkins'> Optional: false
ADMIN_USER: <set to the key 'jenkins-admin-user' in secret 'jenkins'> Optional: false
Mounts:
/tmp from tmp (rw)
/usr/share/jenkins/ref/plugins/ from plugin-dir (rw)
/usr/share/jenkins/ref/secrets/ from secrets-dir (rw)
/var/jenkins_config from jenkins-config (ro)
/var/jenkins_home from jenkins-home (rw)
Volumes:
plugins:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
tmp:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
jenkins-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: jenkins
Optional: false
plugin-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
secrets-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
jenkins-home:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: jenkins-pvc
ReadOnly: false
Conditions:
Type Status Reason
---- ------ ------
Available False MinimumReplicasUnavailable
Progressing True ReplicaSetUpdated
OldReplicaSets: jenkins-86dcf94679 (1/1 replicas created)
NewReplicaSet: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 42s deployment-controller Scaled up replica set jenkins-86dcf94679 to 1

kubernetes with ceph rbd

I want to use Ceph RBD with Kubernetes.
I have a Kubernetes 1.9.2 and Ceph 12.2.5 cluster, and on my k8s nodes I have installed the ceph-common package.
[root@docker09 manifest]# ceph auth get-key client.admin|base64
QVFEcmxmcGFmZXlZQ2hBQVFJWkExR0pXcS9RcXV4QmgvV3ZFWkE9PQ==
[root@docker09 manifest]# cat ceph-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret
data:
  key: QVFEcmxmcGFmZXlZQ2hBQVFJWkExR0pXcS9RcXV4QmgvV3ZFWkE9PQ==
kubectl create -f ceph-secret.yaml
Then:
[root@docker09 manifest]# cat ceph-pv.yaml |grep -v "#"
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ceph-pv
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  rbd:
    monitors:
      - 10.211.121.61:6789
      - 10.211.121.62:6789
      - 10.211.121.63:6789
    pool: rbd
    image: ceph-image
    user: admin
    secretRef:
      name: ceph-secret
    fsType: ext4
    readOnly: false
  persistentVolumeReclaimPolicy: Recycle
[root@docker09 manifest]# rbd info ceph-image
rbd image 'ceph-image':
    size 2048 MB in 512 objects
    order 22 (4096 kB objects)
    block_name_prefix: rbd_data.341d374b0dc51
    format: 2
    features: layering
    flags:
    create_timestamp: Fri Jun 15 15:58:04 2018
[root@docker09 manifest]# cat task-claim.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: ceph-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
[root@docker09 manifest]# kubectl get pv,pvc
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pv/ceph-pv 2Gi RWO Recycle Bound default/ceph-claim 54m
pv/host 10Gi RWO Retain Bound default/hostv 24d
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pvc/ceph-claim Bound ceph-pv 2Gi RWO 53m
pvc/hostv Bound host 10Gi RWO 24d
I created a pod using this PVC:
[root@docker09 manifest]# cat ceph-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: ceph-pod2
spec:
  containers:
    - name: ceph-busybox
      image: busybox
      command: ["sleep", "60000"]
      volumeMounts:
        - name: ceph-vol1
          mountPath: /usr/share/busybox
          readOnly: false
  volumes:
    - name: ceph-vol1
      persistentVolumeClaim:
        claimName: ceph-claim
[root@docker09 manifest]# kubectl get pod ceph-pod2 -o wide
NAME READY STATUS RESTARTS AGE IP NODE
ceph-pod2 0/1 ContainerCreating 0 14m <none> docker10
The pod is still in ContainerCreating status.
[root@docker09 manifest]# kubectl describe pod ceph-pod2
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 15m default-scheduler Successfully assigned ceph-pod2 to docker10
Normal SuccessfulMountVolume 15m kubelet, docker10 MountVolume.SetUp succeeded for volume "default-token-85rc7"
Warning FailedMount 1m (x6 over 12m) kubelet, docker10 Unable to mount volumes for pod "ceph-pod2_default(56af9345-7073-11e8-aeb6-1c98ec29cbec)": timeout expired waiting for volumes to attach/mount for pod "default"/"ceph-pod2". list of unattached/unmounted volumes=[ceph-vol1]
I don't know why this is happening. I need your help... Best regards.
There's no need to reinvent the wheel here. There's already a project called Rook, which deploys Ceph on Kubernetes, and it's super easy to run.
https://rook.io/
rbd -v (included in ceph-common) should return the same version as your cluster. You should also check the kubelet logs.

k8s unable to mount vsphere vsan disk to pod

I have a problem with persistent volumes and VMware vSAN.
I'm using Kubernetes v1.9.5 on CentOS 7 deployed with Kubespray (cloudprovider: vsphere).
The pods are not able to mount vSAN devices:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m default-scheduler Successfully assigned pvpod to k8s-worker-5
Normal SuccessfulMountVolume 2m kubelet, k8s-worker-5 MountVolume.SetUp succeeded for volume "default-token-c5mqh"
Warning FailedMount 56s kubelet, k8s-worker-5 Unable to mount volumes for pod "pvpod_default(62b1704e-3e27-11e8-9cfc-0050568f0ce4)": timeout expired waiting for volumes to attach/mount for pod "default"/"pvpod". list of unattached/unmounted volumes=[test-volume]
The volume is assigned to Kubernetes worker-5 (sdb):
[root@k8s-worker-5 ~]# fdisk -l
Disk /dev/sda: 107.4 GB, 107374182400 bytes, 209715200 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000ac2d7
Device Boot Start End Blocks Id System
/dev/sda1 * 2048 2099199 1048576 83 Linux
/dev/sda2 2099200 3147775 524288 82 Linux swap / Solaris
/dev/sda3 3147776 209712509 103282367 83 Linux
Disk /dev/sdb: 2147 MB, 2147483648 bytes, 4194304 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
The PV and PVC are created:
[root@k8s-jump ]# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-60b0f1d9-3e27-11e8-9cfc-0050568f0ce4 2Gi RWO Delete Bound default/pvcsc001 vsan 5m
[root@k8s-jump ]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pvcsc001 Bound pvc-60b0f1d9-3e27-11e8-9cfc-0050568f0ce4 2Gi RWO vsan 5m
This is my test deployment manifest:
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: vsan
provisioner: kubernetes.io/vsphere-volume
parameters:
  datastore: vsanDatastore
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvcsc001
  annotations:
    volume.beta.kubernetes.io/storage-class: vsan
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: pvpod
spec:
  containers:
    - name: test-container
      image: k8s.gcr.io/test-webserver
      volumeMounts:
        - name: test-volume
          mountPath: /test-vmdk
  volumes:
    - name: test-volume
      persistentVolumeClaim:
        claimName: pvcsc001
The node VMs need the VMware flag "disk.enableUUID=1":
https://vmware.github.io/vsphere-storage-for-kubernetes/documentation/existing.html#step-3-enable-disk-uuid-on-node-virtual-machines
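Per the linked documentation, this is a per-VM advanced setting; in the VM's .vmx configuration it looks like the fragment below (a sketch; it can also be set through the vSphere client or govc while the VM is powered off):

```text
disk.EnableUUID = "TRUE"
```

Without it, the guest cannot see consistent UUIDs for attached VMDKs, so the kubelet cannot identify and mount the provisioned volume.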

Kubernetes 1.7 on Google Cloud: FailedSync Error syncing pod, SandboxChanged Pod sandbox changed, it will be killed and re-created

My Kubernetes pods and containers are not starting. They are stuck in with the status ContainerCreating.
I ran the command kubectl describe po PODNAME, which lists the events and I see the following error:
Type Reason Message
Warning FailedSync Error syncing pod
Normal SandboxChanged Pod sandbox changed, it will be killed and re-created.
The Count column indicates that these errors are being repeated over and over again, roughly once a second. The full output from this command is below, but how do I go about debugging this? I'm not even sure what these errors mean.
Name: ocr-extra-2939512459-3hkv1
Namespace: ocr-da-cluster
Node: gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2/10.240.0.11
Start Time: Tue, 24 Oct 2017 21:05:01 -0400
Labels: component=ocr
pod-template-hash=2939512459
role=extra
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"ocr-da-cluster","name":"ocr-extra-2939512459","uid":"d58bd050-b8f3-11e7-9f9e-4201...
Status: Pending
IP:
Created By: ReplicaSet/ocr-extra-2939512459
Controlled By: ReplicaSet/ocr-extra-2939512459
Containers:
ocr-node:
Container ID:
Image: us.gcr.io/ocr-api/ocr-image
Image ID:
Ports: 80/TCP, 443/TCP, 5555/TCP, 15672/TCP, 25672/TCP, 4369/TCP, 11211/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Requests:
cpu: 31
memory: 10Gi
Liveness: http-get http://:http/ocr/live delay=270s timeout=30s period=60s #success=1 #failure=5
Readiness: http-get http://:http/_ah/warmup delay=180s timeout=60s period=120s #success=1 #failure=3
Environment:
NAMESPACE: ocr-da-cluster (v1:metadata.namespace)
Mounts:
/var/log/apache2 from apachelog (rw)
/var/log/celery from cellog (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-dhjr5 (ro)
log-apache2-error:
Container ID:
Image: busybox
Image ID:
Port: <none>
Args:
/bin/sh
-c
echo Apache2 Error && sleep 90 && tail -n+1 -F /var/log/apache2/error.log
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Requests:
cpu: 20m
Environment: <none>
Mounts:
/var/log/apache2 from apachelog (ro)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-dhjr5 (ro)
log-worker-1:
Container ID:
Image: busybox
Image ID:
Port: <none>
Args:
/bin/sh
-c
echo Celery Worker && sleep 90 && tail -n+1 -F /var/log/celery/worker*.log
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Requests:
cpu: 20m
Environment: <none>
Mounts:
/var/log/celery from cellog (ro)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-dhjr5 (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
apachelog:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
cellog:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
default-token-dhjr5:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-dhjr5
Optional: false
QoS Class: Burstable
Node-Selectors: beta.kubernetes.io/instance-type=n1-highcpu-32
Tolerations: node.alpha.kubernetes.io/notReady:NoExecute for 300s
node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
10m 10m 2 default-scheduler Warning FailedScheduling No nodes are available that match all of the following predicates:: Insufficient cpu (10), Insufficient memory (2), MatchNodeSelector (2).
10m 10m 1 default-scheduler Normal Scheduled Successfully assigned ocr-extra-2939512459-3hkv1 to gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2
10m 10m 1 kubelet, gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2 Normal SuccessfulMountVolume MountVolume.SetUp succeeded for volume "apachelog"
10m 10m 1 kubelet, gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2 Normal SuccessfulMountVolume MountVolume.SetUp succeeded for volume "cellog"
10m 10m 1 kubelet, gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2 Normal SuccessfulMountVolume MountVolume.SetUp succeeded for volume "default-token-dhjr5"
10m 1s 382 kubelet, gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2 Warning FailedSync Error syncing pod
10m 0s 382 kubelet, gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2 Normal SandboxChanged Pod sandbox changed, it will be killed and re-created.
Check your resource limits. I faced the same issue, and the reason in my case was that I was using m instead of Mi for memory limits and memory requests.
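To illustrate the unit pitfall: in Kubernetes resource quantities, the lowercase m suffix means milli (appropriate for CPU), while Mi means mebibytes (appropriate for memory), so "256m" of memory is a fraction of a byte, not 256 MiB. A corrected fragment might look like this (values are illustrative):

```yaml
# Sketch: correct suffixes for resource requests and limits.
resources:
  requests:
    cpu: "50m"        # m = milli-cores: 0.05 of a CPU
    memory: "256Mi"   # Mi = mebibytes; "256m" would mean 0.256 bytes
  limits:
    cpu: "2"
    memory: "4Gi"
```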
Are you sure you need 31 cpu as initial request (ocr-node)?
This will require a very big node.
I'm seeing similar issues with some of my pods. Deleting them and allowing them to be recreated sometimes helps. Not consistent.
I'm sure there are enough resources available.
See Kubernetes pods failing on "Pod sandbox changed, it will be killed and re-created"
