kubernetes with ceph rbd - docker

I want to use ceph rbd with kubernetes.
I have a kubernetes 1.9.2 and ceph 12.2.5 cluster and on my k8s nodes I have installed ceph-common pag.
[root#docker09 manifest]# ceph auth get-key client.admin|base64
QVFEcmxmcGFmZXlZQ2hBQVFJWkExR0pXcS9RcXV4QmgvV3ZFWkE9PQ==
[root#docker09 manifest]# cat ceph-secret.yaml
apiVersion: v1
kind: Secret
metadata:
name: ceph-secret
data:
key: QVFEcmxmcGFmZXlZQ2hBQVFJWkExR0pXcS9RcXV4QmgvV3ZFWkE9PQ==
kubectl create -f ceph-secret.yaml
Then:
[root#docker09 manifest]# cat ceph-pv.yaml |grep -v "#"
apiVersion: v1
kind: PersistentVolume
metadata:
name: ceph-pv
spec:
capacity:
storage: 2Gi
accessModes:
- ReadWriteOnce
rbd:
monitors:
- 10.211.121.61:6789
- 10.211.121.62:6789
- 10.211.121.63:6789
pool: rbd
image: ceph-image
user: admin
secretRef:
name: ceph-secret
fsType: ext4
readOnly: false
persistentVolumeReclaimPolicy: Recycle
[root#docker09 manifest]# rbd info ceph-image
rbd image 'ceph-image':
size 2048 MB in 512 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.341d374b0dc51
format: 2
features: layering
flags:
create_timestamp: Fri Jun 15 15:58:04 2018
[root#docker09 manifest]# cat task-claim.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: ceph-claim
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
[root#docker09 manifest]# kubectl get pv,pvc
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pv/ceph-pv 2Gi RWO Recycle Bound default/ceph-claim 54m
pv/host 10Gi RWO Retain Bound default/hostv 24d
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pvc/ceph-claim Bound ceph-pv 2Gi RWO 53m
pvc/hostv Bound host 10Gi RWO 24d
I create a pod use this pvc .
[root#docker09 manifest]# cat ceph-pod.yaml
apiVersion: v1
kind: Pod
metadata:
name: ceph-pod2
spec:
containers:
- name: ceph-busybox
image: busybox
command: ["sleep", "60000"]
volumeMounts:
- name: ceph-vol1
mountPath: /usr/share/busybox
readOnly: false
volumes:
- name: ceph-vol1
persistentVolumeClaim:
claimName: ceph-claim
[root#docker09 manifest]# kubectl get pod ceph-pod2 -o wide
NAME READY STATUS RESTARTS AGE IP NODE
ceph-pod2 0/1 ContainerCreating 0 14m <none> docker10
The pod is still in ContainerCreating status.
[root#docker09 manifest]# kubectl describe pod ceph-pod2
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 15m default-scheduler Successfully assigned ceph-pod2 to docker10
Normal SuccessfulMountVolume 15m kubelet, docker10 MountVolume.SetUp succeeded for volume "default-token-85rc7"
Warning FailedMount 1m (x6 over 12m) kubelet, docker10 Unable to mount volumes for pod "ceph-pod2_default(56af9345-7073-11e8-aeb6-1c98ec29cbec)": timeout expired waiting for volumes to attach/mount for pod "default"/"ceph-pod2". list of unattached/unmounted volumes=[ceph-vol1]
I don't know why this happening, need your help... Best regards.

There's no need to reinvent a wheel here. There's already project called ROOK, which deploys ceph on kubernetes and it's super easy to run.
https://rook.io/

rbd -v (included in ceph-common) should return the same version as your cluster. You should also check the messages of kubelet.

Related

Deploying on Kubernetes with shared volume

I am trying to deploy this app here on Kubernetes GKE.
annotations:
kompose.cmd: kompose convert -f docker-compose.yml
kompose.version: 1.26.1 (a9d05d509)
creationTimestamp: null
labels:
io.kompose.network/taiga: "true"
io.kompose.service: taiga-gateway
spec:
containers:
- image: nginx:1.19-alpine
name: taiga-gateway
ports:
- containerPort: 80
resources: {}
volumeMounts:
- mountPath: /etc/nginx/conf.d/default.conf
name: taiga-gateway-claim0
- mountPath: /taiga/static
name: taiga-static-data
- mountPath: /taiga/media
name: taiga-media-data
restartPolicy: Always
volumes:
- name: taiga-gateway-claim0
persistentVolumeClaim:
claimName: taiga-gateway-claim0
- name: taiga-static-data
persistentVolumeClaim:
claimName: taiga-static-data
- name: taiga-media-data
persistentVolumeClaim:
claimName: taiga-media-data
status: {}
In kubernetes I am creating the volume this way
requests:
storage: 5Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: taiga-static-data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 5Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: taiga-media-data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 5Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: taiga-db-data
..etc
So, I end up with the following error
Normal Scheduled 4m39s default-scheduler
Successfully assigned default/taiga-gateway-77976dc77-ppbvb to
gke-taiga-cluster-default-pool-cccc58aa-jwfr Warning
FailedAttachVolume 4m39s attachdetach-controller Multi-Attach
error for volume "pvc-a28ae890-fe9d-4985-8183-ce54d7ed57d8" Volume is
already used by pod(s) taiga-async-6c7d9dbd7b-vl79s Warning
FailedAttachVolume 4m39s attachdetach-controller Multi-Attach
error for volume "pvc-3cab3ec0-88d9-4c70-96c8-97d7b48e755c" Volume is
already used by pod(s) taiga-async-6c7d9dbd7b-vl79s Normal
SuccessfulAttachVolume 4m32s attachdetach-controller
AttachVolume.Attach succeeded for volume
"pvc-d2e39951-094f-447c-86ff-c36639786111" Warning FailedMount
2m36s kubelet Unable to attach or mount volumes:
unmounted volumes=[taiga-static-data taiga-media-data], unattached
volumes=[taiga-static-data taiga-media-data kube-api-access-6lw4n
taiga-gateway-claim0]: timed out waiting for the condition Warning
FailedMount 19s kubelet Unable to
attach or mount volumes: unmounted volumes=[taiga-media-data
taiga-static-data], unattached volumes=[taiga-media-data
kube-api-access-6lw4n taiga-gateway-claim0 taiga-static-data]: timed
out waiting for the condition
PVC status
testuser#cloudshell:~/kube-deploy (taiga-pursuit)$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
taiga-async-rabbitmq-data Bound pvc-3c8eb896-1bca-4047-915f-01e9a2ca5911 1Gi RWO standard 23m
taiga-db-data Bound pvc-ab03a878-1783-4f03-92bc-c9ef96a7d36d 5Gi RWO standard 23m
taiga-events-rabbitmq-data Bound pvc-0c4da3b5-bfb6-4cd9-b45d-e4e7b44a83e5 1Gi RWO standard 23m
taiga-gateway-claim0 Bound pvc-d2e39951-094f-447c-86ff-c36639786111 5Gi RWO standard 24m
taiga-media-data Bound pvc-a28ae890-fe9d-4985-8183-ce54d7ed57d8 5Gi RWO standard 23m
taiga-static-data Bound pvc-3cab3ec0-88d9-4c70-96c8-97d7b48e755c 5Gi RWO standard 23m
Is the warning that these are being shared causing it to fail mounting or is it something else? I am not able to figure out

How should I deploy Persistent Volume(PV) for JupyterHub on Kubernetes?

Environment information:
Computer detail: One master node and four slave nodes. All are CentOS Linux release 7.8.2003 (Core).
Kubernetes version: v1.18.0.
Zero to JupyterHub version: 0.9.0.
Helm version: v2.11.0
I recently try to deploy an online code environment(like Google Colab) in new lab servers via Zero to JupyterHub. Unfortunately, I failed to deploy Persistent Volume(PV) for JupyterHub and I got a failure message such below:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 4s (x27 over 35m) default-scheduler running "VolumeBinding" filter plugin for pod "hub-7b9cbbcf59-747jl": pod has unbound immediate PersistentVolumeClaims
I followed the installing process by the tutorial of JupyterHub, and I was used Helm to install JupyterHub on k8s. That config file such below:
config.yaml
proxy:
secretToken: "2fdeb3679d666277bdb1c93102a08f5b894774ba796e60af7957cb5677f40706"
singleuser:
storage:
dynamic:
storageClass: local-storage
Here, I was config a local-storage for JupyterHub, the local-storage was observed k8s: Link. And its yaml file
such like that:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
Then I use kubectl get storageclass to check it work, I got the message below:
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
local-storage kubernetes.io/no-provisioner Delete WaitForFirstConsumer false 64m
So, I thought I deployed a storage for JupyterHub, but I so naive. I am so disappointed about that because my other Pods(JupyterHub) are all running. And I have been search some solutions so long, but also failed.
So now, my problems are:
What is the true way to solve the PV problems? (Better using local storage.)
Is the local storage way will using other nodes disk not only master?
In fact, my lab had a could storage service, so if Q2 answer is No, and how I using my lab could storage service to deploy PV?
I had been addressed above problem with #Arghya Sadhu's solution. But now, I got a new problem is the Pod hub-db-dir also pending, it result my service proxy-public pending.
The description of hub-db-dir such below:
Name: hub-7b9cbbcf59-jv49z
Namespace: jhub
Priority: 0
Node: <none>
Labels: app=jupyterhub
component=hub
hub.jupyter.org/network-access-proxy-api=true
hub.jupyter.org/network-access-proxy-http=true
hub.jupyter.org/network-access-singleuser=true
pod-template-hash=7b9cbbcf59
release=jhub
Annotations: checksum/config-map: c20a64c7c9475201046ac620b057f0fa65ad6928744f7d265bc8705c959bce2e
checksum/secret: 1beaebb110d06103988476ec8a3117eee58d97e7dbc70c115c20048ea04e79a4
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/hub-7b9cbbcf59
Containers:
hub:
Image: jupyterhub/k8s-hub:0.9.0
Port: 8081/TCP
Host Port: 0/TCP
Command:
jupyterhub
--config
/etc/jupyterhub/jupyterhub_config.py
--upgrade-db
Requests:
cpu: 200m
memory: 512Mi
Readiness: http-get http://:hub/hub/health delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
PYTHONUNBUFFERED: 1
HELM_RELEASE_NAME: jhub
POD_NAMESPACE: jhub (v1:metadata.namespace)
CONFIGPROXY_AUTH_TOKEN: <set to the key 'proxy.token' in secret 'hub-secret'> Optional: false
Mounts:
/etc/jupyterhub/config/ from config (rw)
/etc/jupyterhub/cull_idle_servers.py from config (rw,path="cull_idle_servers.py")
/etc/jupyterhub/jupyterhub_config.py from config (rw,path="jupyterhub_config.py")
/etc/jupyterhub/secret/ from secret (rw)
/etc/jupyterhub/z2jh.py from config (rw,path="z2jh.py")
/srv/jupyterhub from hub-db-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from hub-token-vlgwz (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: hub-config
Optional: false
secret:
Type: Secret (a volume populated by a Secret)
SecretName: hub-secret
Optional: false
hub-db-dir:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: hub-db-dir
ReadOnly: false
hub-token-vlgwz:
Type: Secret (a volume populated by a Secret)
SecretName: hub-token-vlgwz
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 61s (x43 over 56m) default-scheduler 0/5 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 4 node(s) didn't find available persistent volumes to bind.
The information with kubectl get pv,pvc,sc.
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/hub-db-dir Pending local-storage 162m
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
storageclass.storage.k8s.io/local-storage (default) kubernetes.io/no-provisioner Delete WaitForFirstConsumer false 8h
So, how to fix it?
In addition to #Arghya Sadhu answer, in order to make it work using local storage you have to create a PersistentVolume manually.
For example:
apiVersion: v1
kind: PersistentVolume
metadata:
name: hub-db-pv
spec:
capacity:
storage: 5Gi
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Retain
storageClassName: local-storage
local:
path: <path_to_local_volume>
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- <name_of_the_node>
Then you can deploy the chart:
helm upgrade --install $RELEASE jupyterhub/jupyterhub \
--namespace $NAMESPACE \
--version=0.9.0 \
--values config.yaml
The config.yaml file can be left as is:
proxy:
secretToken: "<token>"
singleuser:
storage:
dynamic:
storageClass: local-storage
I think you need to make local-storage as default storage class
kubectl patch storageclass local-storage -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
Local storage will use the local disk storage of the node where the pod get scheduled.
Hard to tell without more details. You can either create PV manually or use a storage class which does dynamic volume provisioning.

Pod cannot find mount path on Docker Desktop(Win)

On Docker Desktop(Win, open file sharing already), create pv/pvc and bound successfully, but start pod failed shows:
Warning Failed 4s (x4 over 34s) kubelet, docker-desktop Error: stat /c/cannot-found: no such file or directory
Have already created the "cannot-found" folder on my root which path is C:\cannot-found
Here are my pv/pvc and pod yaml:
apiVersion: v1
kind: PersistentVolume
metadata:
name: global-volume
labels:
pv_pvc_label: nfs
spec:
accessModes:
- ReadWriteMany
capacity:
storage: 1Gi
hostPath:
path: "/c/cannot-found"
persistentVolumeReclaimPolicy: Delete
storageClassName: standard
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: global-volume
spec:
storageClassName: standard
accessModes:
- ReadWriteMany
resources:
requests:
storage: 1Gi
selector:
matchLabels:
pv_pvc_label: nfs
volumeName: global-volume
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: alpine
namespace: default
spec:
selector:
matchLabels:
component: centos
module: centos
replicas: 1
template:
metadata:
labels:
component: centos
module: centos
spec:
volumes:
- name: nfs
persistentVolumeClaim:
claimName: global-volume
containers:
- image: centos
command:
- /bin/sh
- "-c"
- "touch /root/test.txt; echo \"test mount\">/root/test.txt; cp /root/test.txt /tmp/test.txt; sleep 60m"
imagePullPolicy: IfNotPresent
name: alpine
volumeMounts:
- name: nfs
mountPath: /tmp
subPath: mountPath
restartPolicy: Always
My pv/pvc status:
leih#CNleih01 MINGW64 /c/dev/SAW/x-workspace/deployers/common (leih/workspace)
$ kubectl create -f global-volume-tmp.yaml
persistentvolume/global-volume created
persistentvolumeclaim/global-volume created
leih#CNleih01 MINGW64 /c/dev/SAW/x-workspace/deployers/common (leih/workspace)
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
global-volume Pending global-volume 0 standard 7s
leih#CNleih01 MINGW64 /c/dev/SAW/x-workspace/deployers/common (leih/workspace)
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
global-volume Bound global-volume 1Gi RWX standard 10s
leih#CNleih01 MINGW64 /c/dev/SAW/x-workspace/deployers/common (leih/workspace)
$ kubectl scale deployment alpine --replicas=1
deployment.apps/alpine scaled
leih#CNleih01 MINGW64 /c/dev/SAW/x-workspace/deployers/common (leih/workspace)
$ kubectl get po |grep alpine
alpine-6559ddcb88-n262l 0/1 CreateContainerConfigError 0 14s
leih#CNleih01 MINGW64 /c/dev/SAW/x-workspace/deployers/common (leih/workspace)
$ kubectl describe po alpine-6559ddcb88-n262l |grep Error
Reason: CreateContainerConfigError
Warning Failed 4s (x4 over 34s) kubelet, docker-desktop Error: stat /c/cannot-found: no such file or directory
File sharing:
k8s version: v1.16.6-beta.0
docker desktop version(win): Docker Desktop Community 2.3.0.2
docker version: v19.03.8
How can I solve it?
Update:
After rollback docker desktop from v2.3.0.2 to v2.2.0.5, it works fine.

Unable to mount volumes for pod "metadata-api-local": timeout expired waiting for volumes to attach or mount for pod "metadata"/"metadata-api-local"

I have a failing service with Kubernetes, it seems that service doesnt want to mount volume.
Unable to mount volumes for pod "metadata-api-local": timeout expired waiting for volumes to attach or mount for pod "metadata"/"metadata-api-local". list of unmounted volumes=[metadata-api-claim]. list of unattached volumes=[metadata-api-claim default-token-8lqmp]
Here is the log:
➜ metadata_api git:(develop) ✗ kubectl describe pod -n metadata metadata-api-local-f5bddb8f7-clmwq
Name: metadata-api-local-f5bddb8f7-clmwq
Namespace: metadata
Priority: 0
Node: minikube/192.168.0.85
Start Time: Wed, 18 Sep 2019 16:59:02 +0200
Labels: app=metadata-api-local
pod-template-hash=f5bddb8f7
Annotations: <none>
Status: Pending
IP:
Controlled By: ReplicaSet/metadata-api-local-f5bddb8f7
Containers:
metadata-api-local:
Container ID:
Image: metadata_api:local
Image ID:
Port: 18000/TCP
Host Port: 0/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment Variables from:
metadata-env Secret Optional: false
Environment: <none>
Mounts:
/var/lib/nodered-peer from metadata-api-claim (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-8lqmp (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
metadata-api-claim:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: metadata-api-claim
ReadOnly: false
default-token-8lqmp:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-8lqmp
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 14m default-scheduler Successfully assigned metadata/metadata-api-local-f5bddb8f7-clmwq to minikube
Warning FailedMount 47s (x6 over 12m) kubelet, minikube Unable to mount volumes for pod "metadata-api-local-f5bddb8f7-clmwq_metadata(94cbb26c-4907-4512-950a-29a25ad1ef20)": timeout expired waiting for volumes to attach or mount for pod "metadata"/"metadata-api-local-f5bddb8f7-clmwq". list of unmounted volumes=[metadata-api-claim]. list of unattached volumes=[metadata-api-claim default-token-8lqmp]
Here is my metadata_pvc.yml:
apiVersion: v1
kind: PersistentVolume
metadata:
name: metadata-api-pv
namespace: metadata
spec:
capacity:
storage: 1Gi
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Retain
claimRef:
namespace: metadata
name: metadata-api-claim
hostPath:
path: /data/metadata-api
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: metadata-api-claim
namespace: metadata
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: metadata-postgres-volume
namespace: metadata
spec:
capacity:
storage: 1Gi
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Retain
claimRef:
namespace: metadata
name: metadata-postgres-claim
hostPath:
path: /data/metadata-postgres
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: metadata-postgres-claim
namespace: metadata
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
When I list pv, I get:
➜ metadata_api git:(develop) ✗ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
metadata-api-pv 1Gi RWO Retain Available metadata/metadata-api-claim 12m
metadata-postgres-volume 1Gi RWO Retain Available metadata/metadata-postgres-claim 12m
➜ metadata_api git:(develop) ✗ kubectl get pvc
No resources found.
What is failing ?
You shouldn't specify claimRef, that field is automatically generated by Kubernetes controllers. Instead you should use storage classes for both your PersistentVolumes and PersistentVolumeClaims as that is the mechanism used to match them. Adding the storageClassName: name field to both your PersistentVolumes and PersistentVolumeClaims should fix your issue.

Unable to update image of StatefulSet in Kubernetes

I recently evaluated Kubernetes with a simple test project and I was able to update image of StatefulSet with command like this:
kubectl set image statefulset/cloud-stateful-set cloud-stateful-container=ncccloud:v716
I'm now trying to get our real system to work in Kubernetes and the pods don't do anything when I try to update image, even though I'm using basically the same command.
It says:
statefulset.apps "cloud-stateful-set" image updated
And kubectl describe statefulset.apps/cloud-stateful-set says:
Image: ncccloud:v716"
But kubectl describe pod cloud-stateful-set-0 and kubectl describe pod cloud-stateful-set-1 say:
"Image: ncccloud:latest"
The ncccloud:latest is an image which doesn't work:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
cloud-stateful-set-0 0/1 CrashLoopBackOff 7 13m
cloud-stateful-set-1 0/1 CrashLoopBackOff 7 13m
mssql-deployment-6cd4ff766-pzz99 1/1 Running 1 55m
Another strange thing is that every time I try to apply the StatefulSet it says configured instead of unchanged.
$ kubectl apply -f k8s/cloud-stateful-set.yaml
statefulset.apps "cloud-stateful-set" configured
Here is my cloud-stateful-set.yaml:
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: cloud-stateful-set
labels:
app: cloud
group: service
spec:
replicas: 2
# podManagementPolicy: Parallel
serviceName: cloud-stateful-set
selector:
matchLabels:
app: cloud
template:
metadata:
labels:
app: cloud
group: service
spec:
containers:
- name: cloud-stateful-container
image: ncccloud:latest
imagePullPolicy: Never
ports:
- containerPort: 80
volumeMounts:
- name: cloud-stateful-storage
mountPath: /cloud-stateful-data
volumeClaimTemplates:
- metadata:
name: cloud-stateful-storage
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Mi
Here is full output of kubectl describe pod/cloud-stateful-set-1:
Name: cloud-stateful-set-1
Namespace: default
Node: docker-for-desktop/192.168.65.3
Start Time: Tue, 02 Jul 2019 11:03:01 +0300
Labels: app=cloud
controller-revision-hash=cloud-stateful-set-5c9964c897
group=service
statefulset.kubernetes.io/pod-name=cloud-stateful-set-1
Annotations: <none>
Status: Running
IP: 10.1.0.20
Controlled By: StatefulSet/cloud-stateful-set
Containers:
cloud-stateful-container:
Container ID: docker://3ec26930c1a81caa39d5c5a16c4e25adf7584f90a71e0110c0b03ecb60dd9592
Image: ncccloud:latest
Image ID: docker://sha256:394427c40e964e34ca6c9db3ce1df1f8f6ce34c4ba8f3ab10e25da6e89678830
Port: 80/TCP
Host Port: 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 139
Started: Tue, 02 Jul 2019 11:19:03 +0300
Finished: Tue, 02 Jul 2019 11:19:03 +0300
Ready: False
Restart Count: 8
Environment: <none>
Mounts:
/cloud-stateful-data from cloud-stateful-storage (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-gzxpx (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
cloud-stateful-storage:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: cloud-stateful-storage-cloud-stateful-set-1
ReadOnly: false
default-token-gzxpx:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-gzxpx
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 19m default-scheduler Successfully assigned cloud-stateful-set-1 to docker-for-desktop
Normal SuccessfulMountVolume 19m kubelet, docker-for-desktop MountVolume.SetUp succeeded for volume "pvc-4c9e1796-9c9a-11e9-998f-00155d64fa03"
Normal SuccessfulMountVolume 19m kubelet, docker-for-desktop MountVolume.SetUp succeeded for volume "default-token-gzxpx"
Normal Pulled 17m (x5 over 19m) kubelet, docker-for-desktop Container image "ncccloud:latest" already present on machine
Normal Created 17m (x5 over 19m) kubelet, docker-for-desktop Created container
Normal Started 17m (x5 over 19m) kubelet, docker-for-desktop Started container
Warning BackOff 4m (x70 over 19m) kubelet, docker-for-desktop Back-off restarting failed container
Here is full output of kubectl describe statefulset.apps/cloud-stateful-set:
Name: cloud-stateful-set
Namespace: default
CreationTimestamp: Tue, 02 Jul 2019 11:02:59 +0300
Selector: app=cloud
Labels: app=cloud
group=service
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"apps/v1","kind":"StatefulSet","metadata":{"annotations":{},"labels":{"app":"cloud","group":"service"},"name":"cloud-stateful-set","names...
Replicas: 2 desired | 2 total
Pods Status: 2 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: app=cloud
group=service
Containers:
cloud-stateful-container:
Image: ncccloud:v716
Port: 80/TCP
Host Port: 0/TCP
Environment: <none>
Mounts:
/cloud-stateful-data from cloud-stateful-storage (rw)
Volumes: <none>
Volume Claims:
Name: cloud-stateful-storage
StorageClass:
Labels: <none>
Annotations: <none>
Capacity: 10Mi
Access Modes: [ReadWriteOnce]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 25m statefulset-controller create Pod cloud-stateful-set-0 in StatefulSet cloud-stateful-set successful
Normal SuccessfulCreate 25m statefulset-controller create Pod cloud-stateful-set-1 in StatefulSet cloud-stateful-set successful
I'm using Docker Desktop on Windows, if it matters.
in my case imagePullPolicy was set to Always already:
kubectl patch statefulset web --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value":"gcr.io/google_containers/nginx-slim:0.8"}]'
helped in my case, see k8s docs: https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/#rolling-update
In the stateful set yaml, change
imagePullPolicy: Never
to
imagePullPolicy: Always

Resources