Pod not getting created due to Volume Count - azure-aks

All,
I tried setting up a SQL Server Big Data Cluster (BDC) on my personal Azure account with Standard_E4s_v3 VMs. I had earlier requested an increase of the vCPU quota from its maximum of 10, which is how I have been able to get this far.
However, this time, the deployment seems to be stuck while creating the sparkhead-0 pod. I checked the pod's description and below is what I got:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 51s (x43 over 61m) default-scheduler 0/3 nodes are available: 3 node(s) exceed max volume count.
Does this mean that I have to request an increase in the number of drives? It seems a bit odd that an odd number of VMs (3) could result in an even number of drives (26 = (3 * 8) + 2).
Below is the status of the various Pods
C:\Users\rgn>kubectl get pods -n mssql-cluster
NAME READY STATUS RESTARTS AGE
appproxy-pgmhq 2/2 Running 0 67m
compute-0-0 3/3 Running 0 67m
control-gm6nh 3/3 Running 0 75m
controldb-0 2/2 Running 0 75m
controlwd-rc4g2 1/1 Running 0 71m
data-0-0 3/3 Running 0 67m
data-0-1 3/3 Running 0 67m
gateway-0 2/2 Running 0 66m
logsdb-0 1/1 Running 0 71m
logsui-7qhp8 1/1 Running 0 71m
master-0 3/3 Running 0 66m
metricsdb-0 1/1 Running 0 71m
metricsdc-2mc7w 1/1 Running 0 71m
metricsdc-fw96x 1/1 Running 0 71m
metricsdc-xgnmh 1/1 Running 0 71m
metricsui-spd2v 1/1 Running 0 71m
mgmtproxy-bmkld 2/2 Running 0 71m
nmnode-0-0 2/2 Running 0 67m
sparkhead-0 0/4 Pending 0 66m
storage-0-0 4/4 Running 0 66m
storage-0-1 4/4 Running 0 66m
Here is a listing of the volume claims, and I do see the claims bound for sparkhead-0 (data-sparkhead-0 and logs-sparkhead-0):
C:\Users\rgn>kubectl get pvc -n mssql-cluster
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
data-compute-0-0 Bound pvc-13aa6838-7d94-4485-b362-5a93e3dff650 15Gi RWO default 18m
data-controldb Bound pvc-d0a98bd0-481d-4893-b7e3-4b0a3fe177c9 15Gi RWO default 26m
data-controller Bound pvc-636ee4dd-1ad4-49dc-921d-2a08b932bc6f 15Gi RWO default 26m
data-data-0-0 Bound pvc-93a6c67f-aa73-4120-ac48-aa543bd4e256 15Gi RWO default 18m
data-data-0-1 Bound pvc-3ef5af39-9434-4933-8e31-ba3454f3532e 15Gi RWO default 18m
data-gateway-0 Bound pvc-2e66a32a-c1af-4fbc-bd6f-e486034f3fcf 15Gi RWO default 17m
data-logsdb-0 Bound pvc-d3005d81-dfef-4eb7-9513-a9ffa22b25cf 15Gi RWO default 22m
data-master-0 Bound pvc-5d94c6db-a061-4e48-9ac6-d9073405901c 15Gi RWO default 18m
data-metricsdb-0 Bound pvc-b79e7de9-a996-4266-bd4b-e9757b74c286 15Gi RWO default 22m
data-nmnode-0-0 Bound pvc-cb042a08-a55f-4911-aabd-dd71b8371674 15Gi RWO default 18m
data-sparkhead-0 Bound pvc-b7090ce9-d3f7-4250-8327-4bb2e83ac64c 15Gi RWO default 17m
data-storage-0-0 Bound pvc-ad248251-78af-4c82-a0c6-95aa78baeb72 15Gi RWO default 17m
data-storage-0-1 Bound pvc-ca1eca42-9a9f-4db9-b844-a39fed60def0 15Gi RWO default 17m
logs-compute-0-0 Bound pvc-883dafe4-76df-493b-8b3e-489ee3b26c10 10Gi RWO default 18m
logs-controldb Bound pvc-c0135496-3268-471a-958f-8bd3c9d346c6 10Gi RWO default 26m
logs-controller Bound pvc-0b2aba14-9272-4db9-bd35-12dd4d810ade 10Gi RWO default 26m
logs-data-0-0 Bound pvc-92a13038-a14a-48fb-9a9f-7c92083087f8 10Gi RWO default 18m
logs-data-0-1 Bound pvc-10e3668c-f172-47d2-86f3-cdff99763b36 10Gi RWO default 18m
logs-gateway-0 Bound pvc-6d0525e3-83fb-480d-96f8-3d7e6778d3c0 10Gi RWO default 17m
logs-logsdb-0 Bound pvc-81ad630a-e8d5-443e-9629-262a3b769819 10Gi RWO default 22m
logs-master-0 Bound pvc-4cdbdf8f-ed9d-4fdc-bdb4-a776a6928868 10Gi RWO default 18m
logs-metricsdb-0 Bound pvc-93e054bb-df90-4714-ae40-033a539316b7 10Gi RWO default 22m
logs-nmnode-0-0 Bound pvc-6fd75daa-6a54-4ce0-8d82-e98e6c14dbad 10Gi RWO default 18m
logs-sparkhead-0 Bound pvc-dd989502-b612-490c-95d0-2afa5127e5b1 10Gi RWO default 17m
logs-storage-0-0 Bound pvc-6ffcfdc4-b1dd-4bd7-81b1-3b16aaa8b1c8 10Gi RWO default 17m
logs-storage-0-1 Bound pvc-29ac02b3-c4f6-49f0-80bc-6551bae26fb6 10Gi RWO default 17m
Can someone point me in the right direction?
Thanks,
rgn

I think the problem is that you can create multiple PVCs from the default storage class, and each one provisions an Azure managed disk. But each node of size Standard_E4s_v3 can only attach 8 data disks, so when the 25th data disk needs to be attached to a node, it fails. – Charles Xu
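For reference, a quick way to confirm the limit is to check each node's allocatable attach count (if your Kubernetes version exposes it, it shows up as attachable-volumes-azure-disk under Allocatable in kubectl describe node) and then either scale out or add a node pool with a VM size that allows more data disks. A rough sketch, with the resource group, cluster, and pool names as placeholders:
kubectl describe node <node-name> | grep -i attachable
az aks nodepool add --resource-group <rg> --cluster-name <aks-cluster> --name bdcpool --node-count 3 --node-vm-size Standard_E8s_v3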

Related

Pod coredns stuck in ContainerCreating state with Weave on k8s

First of all, let me thank you for this amazing guide. I'm very new to Kubernetes, and having a guide like this to follow helps a lot when trying to set up my first cluster!
That said, I'm having some issues with creating deployments, as there are two pods that aren't being created and remain stuck in the state ContainerCreating.
[root@master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready control-plane 25h v1.24.0
node1 Ready <none> 24h v1.24.0
node2 Ready <none> 24h v1.24.0
[root@master ~]# kubectl cluster-info
Kubernetes control plane is running at https://192.168.3.200:6443
CoreDNS is running at https://192.168.3.200:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
The problem:
[root@master ~]# kubectl get all --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system pod/coredns-6d4b75cb6d-v5pvk 0/1 ContainerCreating 0 114m
kube-system pod/coredns-7599c5f99f-q6nwq 0/1 ContainerCreating 0 114m
kube-system pod/coredns-7599c5f99f-sg4wn 0/1 ContainerCreating 0 114m
kube-system pod/etcd-master 1/1 Running 3 (3h26m ago) 25h
kube-system pod/kube-apiserver-master 1/1 Running 3 (3h26m ago) 25h
kube-system pod/kube-controller-manager-master 1/1 Running 3 (3h26m ago) 25h
kube-system pod/kube-proxy-ftxzx 1/1 Running 2 (3h11m ago) 24h
kube-system pod/kube-proxy-pcl8q 1/1 Running 3 (3h26m ago) 25h
kube-system pod/kube-proxy-q7dpw 1/1 Running 2 (3h23m ago) 24h
kube-system pod/kube-scheduler-master 1/1 Running 3 (3h26m ago) 25h
kube-system pod/weave-net-2p47z 2/2 Running 5 (3h23m ago) 24h
kube-system pod/weave-net-k5529 2/2 Running 4 (3h11m ago) 24h
kube-system pod/weave-net-tq4bs 2/2 Running 7 (3h26m ago) 25h
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 25h
kube-system service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 25h
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-system daemonset.apps/kube-proxy 3 3 3 3 3 kubernetes.io/os=linux 25h
kube-system daemonset.apps/weave-net 3 3 3 3 3 <none> 25h
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
kube-system deployment.apps/coredns 0/2 2 0 25h
NAMESPACE NAME DESIRED CURRENT READY AGE
kube-system replicaset.apps/coredns-6d4b75cb6d 1 1 0 25h
kube-system replicaset.apps/coredns-7599c5f99f 2 2 0 116m
Note that the first three pods, from coredns, fail to start.
[root@master ~]# kubectl get events
LAST SEEN TYPE REASON OBJECT MESSAGE
93m Warning FailedCreatePodSandBox pod/nginx-deploy-99976564d-s4shk (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "fd79c77289f42b3cb0eb0be997a02a42f9595df061deb6e2d3678ab00afb5f67": failed to find network info for sandbox "fd79c77289f42b3cb0eb0be997a02a42f9595df061deb6e2d3678ab00afb5f67"
[root@master ~]# kubectl describe pod coredns-6d4b75cb6d-v5pvk -n kube-system
Name: coredns-6d4b75cb6d-v5pvk
Namespace: kube-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Node: node2/192.168.3.202
Start Time: Thu, 12 May 2022 19:45:58 +0000
Labels: k8s-app=kube-dns
pod-template-hash=6d4b75cb6d
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/coredns-6d4b75cb6d
Containers:
coredns:
Container ID:
Image: k8s.gcr.io/coredns/coredns:v1.8.6
Image ID:
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/etc/coredns from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-4bpvz (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
kube-api-access-4bpvz:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: CriticalAddonsOnly op=Exists
node-role.kubernetes.io/control-plane:NoSchedule
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedCreatePodSandBox 93s (x393 over 124m) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "7d0f8f4b3dbf2dffcf1a8c01b41368e16b1f80bc97ff3faa611c1fd52c0f6967": failed to find network info for sandbox "7d0f8f4b3dbf2dffcf1a8c01b41368e16b1f80bc97ff3faa611c1fd52c0f6967"
Versions:
[root@master ~]# docker --version
Docker version 20.10.15, build fd82621
[root@master ~]# kubelet --version
Kubernetes v1.24.0
[root@master ~]# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.0", GitCommit:"4ce5a8954017644c5420bae81d72b09b735c21f0", GitTreeState:"clean", BuildDate:"2022-05-03T13:44:24Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/amd64"}
I have no idea where to go from here. I googled keywords like "rpc error weave k8s" and "Failed to create pod sandbox: rpc error", but none of the results I found solved my problem. I saw some issues mentioning Weave Net; could this be the problem? Maybe I got it wrong, but I'm fairly sure I followed the instructions closely.
Any help would be greatly appreciated!
Looks like you got pretty far! Support for Docker as a container runtime was dropped in 1.24.0. I can't tell whether that is what you are using, but if it is, that could be your problem.
https://kubernetes.io/blog/2022/05/03/kubernetes-1-24-release-announcement/
You could switch to containerd as your container runtime, but for the purposes of learning you could also try the latest 1.23.x version of Kubernetes. Get that to work, then circle back and tackle containerd with Kubernetes v1.24.0.
You can still use Docker on your laptop/desktop, but on the k8s servers you will not be able to use Docker as the runtime on 1.24.x or later.
Hope that helps and good luck!
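If it helps, a quick way to see which runtime each node is actually using is the CONTAINER-RUNTIME column of the wide node listing, or the System Info section of a node description (node1 here is just one of the nodes from the listing above):
kubectl get nodes -o wide
kubectl describe node node1 | grep "Container Runtime"
A node still on Docker reports something like docker://20.10.15, while containerd reports containerd://1.x.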

How do I know how much memory I should provide in k8s pod?

I have deployed Elasticsearch on k8s on my Mac (a minikube cluster). The Elasticsearch configuration file is:
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 7.10.0
  nodeSets:
  - name: default
    count: 1
    config:
      node.store.allow_mmap: false
    podTemplate:
      spec:
        containers:
        - name: elasticsearch
          env:
          - name: ES_JAVA_OPTS
            value: -Xms2g -Xmx2g
          resources:
            requests:
              memory: 4Gi
              cpu: 4
            limits:
              memory: 4Gi
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        storageClassName: standard
        resources:
          requests:
            storage: 1Gi
After I run kubectl apply -f es.yaml, the pod and services are created, but the pod stays Pending.
$kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 28h
quickstart-es-default ClusterIP None <none> 9200/TCP 21m
quickstart-es-http ClusterIP 10.103.177.195 <none> 9200/TCP 21m
quickstart-es-transport ClusterIP None <none> 9300/TCP 21m
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
quickstart-es-default-0 0/1 Pending 0 21m
the output of describe pods is:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 22m (x2 over 22m) default-scheduler 0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
Warning FailedScheduling 40s (x17 over 22m) default-scheduler 0/1 nodes are available: 1 Insufficient memory.
It seems that I don't have enough memory for my pod. How can I allocate more memory to it?
Minikube itself starts with a default memory of 2048 MB; you are hitting this limit.
When using minikube you should think in advance about how many resources your pods/replicas use in total, so that you can use the --memory flag to allocate the needed amount of RAM.
You already got an answer in a separate question. In addition to that, I would add that you should always do minikube delete before starting minikube with the --memory= option, e.g. minikube start --memory=8192.
You can always check your current memory configuration with kubectl describe node, in the Capacity section, e.g.
Capacity:
cpu: 1
...
memory: 3785984Ki
pods: 110
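Putting that together, a minimal sketch (the memory and CPU values are illustrative; size them to cover the 4Gi request in the Elasticsearch spec plus system overhead, and the node name assumes the default minikube profile):
minikube delete
minikube start --memory=8192 --cpus=4
kubectl describe node minikube | grep -A 5 Capacity
After that, re-apply es.yaml; the pod should schedule once the node advertises enough allocatable memory and the PVC is bound.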

K8s SQL2019 HA Containers - Dude, where are my Pods?

New to K8s. So far I have the following:
docker-ce-19.03.8
docker-ce-cli-19.03.8
containerd.io-1.2.13
kubelet-1.18.5
kubeadm-1.18.5
kubectl-1.18.5
etcd-3.4.10
Use Flannel for Pod Overlay Net
Performed all of the host-level work (SELinux permissive, swapoff, etc.)
All CentOS 7 in an on-prem vSphere environment (6.7U3)
I've built all my configs and currently have:
a 3-node external/stand-alone etcd cluster with peer-to-peer and client-server encrypted transmissions
a 3-node control plane cluster -- kubeadm init is bootstrapped with x509s and targets to the 3 etcds (so stacked etcd never happens)
HAProxy and Keepalived are installed on two of the etcd cluster members, load-balancing access to the API server endpoints on the control plane (TCP6443)
6-worker nodes
Storage configured with the in-tree VMware Cloud Provider (I know it's deprecated), and yes, this is my DEFAULT SC
Status Checks:
kubectl cluster-info reports:
[me@km-01 pods]$ kubectl cluster-info
Kubernetes master is running at https://k8snlb:6443
KubeDNS is running at https://k8snlb:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
kubectl get all --all-namespaces reports:
[me@km-01 pods]$ kubectl get all --all-namespaces -owide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ag1 pod/mssql-operator-68bcc684c4-rbzvn 1/1 Running 0 86m 10.10.4.133 kw-02.bogus.local <none> <none>
kube-system pod/coredns-66bff467f8-k6m94 1/1 Running 4 20h 10.10.0.11 km-01.bogus.local <none> <none>
kube-system pod/coredns-66bff467f8-v848r 1/1 Running 4 20h 10.10.0.10 km-01.bogus.local <none> <none>
kube-system pod/kube-apiserver-km-01.bogus.local 1/1 Running 8 10h x.x.x..25 km-01.bogus.local <none> <none>
kube-system pod/kube-controller-manager-km-01.bogus.local 1/1 Running 2 10h x.x.x..25 km-01.bogus.local <none> <none>
kube-system pod/kube-flannel-ds-amd64-7l76c 1/1 Running 0 10h x.x.x..30 kw-01.bogus.local <none> <none>
kube-system pod/kube-flannel-ds-amd64-8kft7 1/1 Running 0 10h x.x.x..33 kw-04.bogus.local <none> <none>
kube-system pod/kube-flannel-ds-amd64-r5kqv 1/1 Running 0 10h x.x.x..34 kw-05.bogus.local <none> <none>
kube-system pod/kube-flannel-ds-amd64-t6xcd 1/1 Running 0 10h x.x.x..35 kw-06.bogus.local <none> <none>
kube-system pod/kube-flannel-ds-amd64-vhnx8 1/1 Running 0 10h x.x.x..32 kw-03.bogus.local <none> <none>
kube-system pod/kube-flannel-ds-amd64-xdk2n 1/1 Running 0 10h x.x.x..31 kw-02.bogus.local <none> <none>
kube-system pod/kube-flannel-ds-amd64-z4kfk 1/1 Running 4 20h x.x.x..25 km-01.bogus.local <none> <none>
kube-system pod/kube-proxy-49hsl 1/1 Running 0 10h x.x.x..35 kw-06.bogus.local <none> <none>
kube-system pod/kube-proxy-62klh 1/1 Running 0 10h x.x.x..34 kw-05.bogus.local <none> <none>
kube-system pod/kube-proxy-64d5t 1/1 Running 0 10h x.x.x..30 kw-01.bogus.local <none> <none>
kube-system pod/kube-proxy-6ch42 1/1 Running 4 20h x.x.x..25 km-01.bogus.local <none> <none>
kube-system pod/kube-proxy-9css4 1/1 Running 0 10h x.x.x..32 kw-03.bogus.local <none> <none>
kube-system pod/kube-proxy-hgrx8 1/1 Running 0 10h x.x.x..33 kw-04.bogus.local <none> <none>
kube-system pod/kube-proxy-ljlsh 1/1 Running 0 10h x.x.x..31 kw-02.bogus.local <none> <none>
kube-system pod/kube-scheduler-km-01.bogus.local 1/1 Running 5 20h x.x.x..25 km-01.bogus.local <none> <none>
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
ag1 service/ag1-primary NodePort 10.104.183.81 x.x.x..30,x.x.x..31,x.x.x..32,x.x.x..33,x.x.x..34,x.x.x..35 1433:30405/TCP 85m role.ag.mssql.microsoft.com/ag1=primary,type=sqlservr
ag1 service/ag1-secondary NodePort 10.102.52.31 x.x.x..30,x.x.x..31,x.x.x..32,x.x.x..33,x.x.x..34,x.x.x..35 1433:30713/TCP 85m role.ag.mssql.microsoft.com/ag1=secondary,type=sqlservr
ag1 service/mssql1 NodePort 10.96.166.108 x.x.x..30,x.x.x..31,x.x.x..32,x.x.x..33,x.x.x..34,x.x.x..35 1433:32439/TCP 86m name=mssql1,type=sqlservr
ag1 service/mssql2 NodePort 10.109.146.58 x.x.x..30,x.x.x..31,x.x.x..32,x.x.x..33,x.x.x..34,x.x.x..35 1433:30636/TCP 86m name=mssql2,type=sqlservr
ag1 service/mssql3 NodePort 10.101.234.186 x.x.x..30,x.x.x..31,x.x.x..32,x.x.x..33,x.x.x..34,x.x.x..35 1433:30862/TCP 86m name=mssql3,type=sqlservr
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 23h <none>
kube-system service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 20h k8s-app=kube-dns
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE CONTAINERS IMAGES SELECTOR
kube-system daemonset.apps/kube-flannel-ds-amd64 7 7 7 7 7 <none> 20h kube-flannel quay.io/coreos/flannel:v0.12.0-amd64 app=flannel
kube-system daemonset.apps/kube-flannel-ds-arm 0 0 0 0 0 <none> 20h kube-flannel quay.io/coreos/flannel:v0.12.0-arm app=flannel
kube-system daemonset.apps/kube-flannel-ds-arm64 0 0 0 0 0 <none> 20h kube-flannel quay.io/coreos/flannel:v0.12.0-arm64 app=flannel
kube-system daemonset.apps/kube-flannel-ds-ppc64le 0 0 0 0 0 <none> 20h kube-flannel quay.io/coreos/flannel:v0.12.0-ppc64le app=flannel
kube-system daemonset.apps/kube-flannel-ds-s390x 0 0 0 0 0 <none> 20h kube-flannel quay.io/coreos/flannel:v0.12.0-s390x app=flannel
kube-system daemonset.apps/kube-proxy 7 7 7 7 7 kubernetes.io/os=linux 20h kube-proxy k8s.gcr.io/kube-proxy:v1.18.7 k8s-app=kube-proxy
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
ag1 deployment.apps/mssql-operator 1/1 1 1 86m mssql-operator mcr.microsoft.com/mssql/ha:2019-CTP2.1-ubuntu app=mssql-operator
kube-system deployment.apps/coredns 2/2 2 2 20h coredns k8s.gcr.io/coredns:1.6.7 k8s-app=kube-dns
NAMESPACE NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
ag1 replicaset.apps/mssql-operator-68bcc684c4 1 1 1 86m mssql-operator mcr.microsoft.com/mssql/ha:2019-CTP2.1-ubuntu app=mssql-operator,pod-template-hash=68bcc684c4
kube-system replicaset.apps/coredns-66bff467f8 2 2 2 20h coredns k8s.gcr.io/coredns:1.6.7 k8s-app=kube-dns,pod-template-hash=66bff467f8
To the problem: there are a number of articles talking about a SQL 2019 HA build. It appears that every single one, however, is in the cloud, whereas mine is on-prem in a vSphere environment. They all appear to be very simple: run 3 scripts in this order: operator.yaml, sql.yaml, and ag-service.yaml.
My YAMLs are based on: https://github.com/microsoft/sql-server-samples/tree/master/samples/features/high%20availability/Kubernetes/sample-manifest-files
Going by the blogs that actually screenshot the environment afterward, there should be at least 7 pods (1 Operator, 3 SQL Init, 3 SQL). If you look at my aforementioned get all --all-namespaces output, I have everything (and in a running state) but no pods other than the running Operator...?
I actually broke the control plane back to a single node just to try to isolate the logs. /var/log/container/* and /var/log/pod/* contain nothing of value to indicate a problem with storage or any other reason why the Pods are non-existent. It's probably also worth noting that I started with the latest SQL 2019 label, 2019-latest, but when I got the same behavior there, I decided to try the older bits since so many blogs are based on CTP 2.1.
I can create PVs and PVCs using the VCP storage provider. I have my Secrets and can see them in the Secrets store.
I'm at a loss to explain why the pods are missing or where to look next. After checking journalctl, the daemons themselves, and /var/log, I don't see any indication there's even an attempt to create them; the kubectl apply -f mssql-server2019.yaml that I adapted runs to completion without error, indicating 3 sql objects and 3 sql services get created. But here is the file anyway, targeting CTP 2.1:
cat << EOF > mssql-server2019.yaml
apiVersion: mssql.microsoft.com/v1
kind: SqlServer
metadata:
  labels: {name: mssql1, type: sqlservr}
  name: mssql1
  namespace: ag1
spec:
  acceptEula: true
  agentsContainerImage: mcr.microsoft.com/mssql/ha:2019-CTP2.1
  availabilityGroups: [ag1]
  instanceRootVolumeClaimTemplate:
    accessModes: [ReadWriteOnce]
    resources:
      requests: {storage: 5Gi}
    storageClass: default
  saPassword:
    secretKeyRef: {key: sapassword, name: sql-secrets}
  sqlServerContainer: {image: 'mcr.microsoft.com/mssql/server:2019-CTP2.1'}
---
apiVersion: v1
kind: Service
metadata: {name: mssql1, namespace: ag1}
spec:
  ports:
  - {name: tds, port: 1433}
  selector: {name: mssql1, type: sqlservr}
  type: NodePort
  externalIPs:
  - x.x.x.30
  - x.x.x.31
  - x.x.x.32
  - x.x.x.33
  - x.x.x.34
  - x.x.x.35
---
apiVersion: mssql.microsoft.com/v1
kind: SqlServer
metadata:
  labels: {name: mssql2, type: sqlservr}
  name: mssql2
  namespace: ag1
spec:
  acceptEula: true
  agentsContainerImage: mcr.microsoft.com/mssql/ha:2019-CTP2.1
  availabilityGroups: [ag1]
  instanceRootVolumeClaimTemplate:
    accessModes: [ReadWriteOnce]
    resources:
      requests: {storage: 5Gi}
    storageClass: default
  saPassword:
    secretKeyRef: {key: sapassword, name: sql-secrets}
  sqlServerContainer: {image: 'mcr.microsoft.com/mssql/server:2019-CTP2.1'}
---
apiVersion: v1
kind: Service
metadata: {name: mssql2, namespace: ag1}
spec:
  ports:
  - {name: tds, port: 1433}
  selector: {name: mssql2, type: sqlservr}
  type: NodePort
  externalIPs:
  - x.x.x.30
  - x.x.x.31
  - x.x.x.32
  - x.x.x.33
  - x.x.x.34
  - x.x.x.35
---
apiVersion: mssql.microsoft.com/v1
kind: SqlServer
metadata:
  labels: {name: mssql3, type: sqlservr}
  name: mssql3
  namespace: ag1
spec:
  acceptEula: true
  agentsContainerImage: mcr.microsoft.com/mssql/ha:2019-CTP2.1
  availabilityGroups: [ag1]
  instanceRootVolumeClaimTemplate:
    accessModes: [ReadWriteOnce]
    resources:
      requests: {storage: 5Gi}
    storageClass: default
  saPassword:
    secretKeyRef: {key: sapassword, name: sql-secrets}
  sqlServerContainer: {image: 'mcr.microsoft.com/mssql/server:2019-CTP2.1'}
---
apiVersion: v1
kind: Service
metadata: {name: mssql3, namespace: ag1}
spec:
  ports:
  - {name: tds, port: 1433}
  selector: {name: mssql3, type: sqlservr}
  type: NodePort
  externalIPs:
  - x.x.x.30
  - x.x.x.31
  - x.x.x.32
  - x.x.x.33
  - x.x.x.34
  - x.x.x.35
---
EOF
Edit1: kubectl logs -n ag1 mssql-operator-*
[sqlservers] 2020/08/14 14:36:48 Creating custom resource definition
[sqlservers] 2020/08/14 14:36:48 Created custom resource definition
[sqlservers] 2020/08/14 14:36:48 Waiting for custom resource definition to be available
[sqlservers] 2020/08/14 14:36:49 Watching for resources...
[sqlservers] 2020/08/14 14:37:08 Creating ConfigMap sql-operator
[sqlservers] 2020/08/14 14:37:08 Updating mssql1 in namespace ag1 ...
[sqlservers] 2020/08/14 14:37:08 Creating ConfigMap ag1
[sqlservers] ERROR: 2020/08/14 14:37:08 could not process update request: error creating ConfigMap ag1: v1.ConfigMap: ObjectMeta: v1.ObjectMeta: readObjectFieldAsBytes: expect : after object field, parsing 627 ...:{},"k:{\"... at {"kind":"ConfigMap","apiVersion":"v1","metadata":{"name":"ag1","namespace":"ag1","selfLink":"/api/v1/namespaces/ag1/configmaps/ag1","uid":"33af6232-4464-4290-bb14-b21e8f72e361","resourceVersion":"314186","creationTimestamp":"2020-08-14T14:37:08Z","ownerReferences":[{"apiVersion":"mssql.microsoft.com/v1","kind":"ReplicationController","name":"mssql1","uid":"e71a7246-2776-4d96-9735-844ee136a37d","controller":false}],"managedFields":[{"manager":"mssql-server-k8s-operator","operation":"Update","apiVersion":"v1","time":"2020-08-14T14:37:08Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:ownerReferences":{".":{},"k:{\"uid\":\"e71a7246-2776-4d96-9735-844ee136a37d\"}":{".":{},"f:apiVersion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}}}}]}}
[sqlservers] 2020/08/14 14:37:08 Updating ConfigMap sql-operator
[sqlservers] 2020/08/14 14:37:08 Updating mssql2 in namespace ag1 ...
[sqlservers] ERROR: 2020/08/14 14:37:08 could not process update request: error getting ConfigMap ag1: v1.ConfigMap: ObjectMeta: v1.ObjectMeta: readObjectFieldAsBytes: expect : after object field, parsing 627 ...:{},"k:{\"... at {"kind":"ConfigMap","apiVersion":"v1","metadata":{"name":"ag1","namespace":"ag1","selfLink":"/api/v1/namespaces/ag1/configmaps/ag1","uid":"33af6232-4464-4290-bb14-b21e8f72e361","resourceVersion":"314186","creationTimestamp":"2020-08-14T14:37:08Z","ownerReferences":[{"apiVersion":"mssql.microsoft.com/v1","kind":"ReplicationController","name":"mssql1","uid":"e71a7246-2776-4d96-9735-844ee136a37d","controller":false}],"managedFields":[{"manager":"mssql-server-k8s-operator","operation":"Update","apiVersion":"v1","time":"2020-08-14T14:37:08Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:ownerReferences":{".":{},"k:{\"uid\":\"e71a7246-2776-4d96-9735-844ee136a37d\"}":{".":{},"f:apiVersion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}}}}]}}
[sqlservers] 2020/08/14 14:37:08 Updating ConfigMap sql-operator
[sqlservers] 2020/08/14 14:37:08 Updating mssql3 in namespace ag1 ...
[sqlservers] ERROR: 2020/08/14 14:37:08 could not process update request: error getting ConfigMap ag1: v1.ConfigMap: ObjectMeta: v1.ObjectMeta: readObjectFieldAsBytes: expect : after object field, parsing 627 ...:{},"k:{\"... at {"kind":"ConfigMap","apiVersion":"v1","metadata":{"name":"ag1","namespace":"ag1","selfLink":"/api/v1/namespaces/ag1/configmaps/ag1","uid":"33af6232-4464-4290-bb14-b21e8f72e361","resourceVersion":"314186","creationTimestamp":"2020-08-14T14:37:08Z","ownerReferences":[{"apiVersion":"mssql.microsoft.com/v1","kind":"ReplicationController","name":"mssql1","uid":"e71a7246-2776-4d96-9735-844ee136a37d","controller":false}],"managedFields":[{"manager":"mssql-server-k8s-operator","operation":"Update","apiVersion":"v1","time":"2020-08-14T14:37:08Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:ownerReferences":{".":{},"k:{\"uid\":\"e71a7246-2776-4d96-9735-844ee136a37d\"}":{".":{},"f:apiVersion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}}}}]}}
I've looked over my operator and mssql2019.yamls (specifically around the kind: SqlServer, since that seems to be where it's failing) and can't identify any glaring inconsistencies or differences.
So your operator is running:
ag1 pod/mssql-operator-68bcc684c4-rbzvn 1/1 Running 0 86m 10.10.4.133 kw-02.bogus.local <none> <none>
I would start by looking at the logs there:
kubectl -n ag1 logs pod/mssql-operator-68bcc684c4-rbzvn
Most likely it needs to interact with a cloud provider (e.g. Azure) and VMware is not supported, but check what the logs say 👀.
Update:
Based on the logs you posted, it looks like you are using K8s 1.18 and the operator is incompatible with it: it creates the ConfigMap but then fails to parse the object returned by the kube-apiserver (the readObjectFieldAsBytes error trips over the managedFields metadata that newer API servers add).
✌️
YAMLs mine are based off of: https://github.com/microsoft/sql-server-samples/tree/master/samples/features/high%20availability/Kubernetes/sample-manifest-files
Run 3 scripts in this order: operator.yaml, sql.yaml, and ag-service.yaml.
I have just run it on my GKE cluster and got a similar result when running only these 3 files.
If you run it without preparing the PV and PVC first ( .././sample-deployment-script/templates/pv*.yaml ):
$ git clone https://github.com/microsoft/sql-server-samples.git
$ cd sql-server-samples/samples/features/high\ availability/Kubernetes/sample-manifest-files/
$ kubectl create -f operator.yaml
namespace/ag1 created
serviceaccount/mssql-operator created
clusterrole.rbac.authorization.k8s.io/mssql-operator-ag1 created
clusterrolebinding.rbac.authorization.k8s.io/mssql-operator-ag1 created
deployment.apps/mssql-operator created
$ kubectl create -f sqlserver.yaml
sqlserver.mssql.microsoft.com/mssql1 created
service/mssql1 created
sqlserver.mssql.microsoft.com/mssql2 created
service/mssql2 created
sqlserver.mssql.microsoft.com/mssql3 created
service/mssql3 created
$ kubectl create -f ag-services.yaml
service/ag1-primary created
service/ag1-secondary created
You'll have:
kubectl get pods -n ag1
NAME READY STATUS RESTARTS AGE
mssql-initialize-mssql1-js4zc 0/1 CreateContainerConfigError 0 6m12s
mssql-initialize-mssql2-72d8n 0/1 CreateContainerConfigError 0 6m8s
mssql-initialize-mssql3-h4mr9 0/1 CreateContainerConfigError 0 6m6s
mssql-operator-658558b57d-6xd95 1/1 Running 0 6m33s
mssql1-0 1/2 CrashLoopBackOff 5 6m12s
mssql2-0 1/2 CrashLoopBackOff 5 6m9s
mssql3-0 0/2 Pending 0 6m6s
I see that the failed mssql<N> pods are part of statefulset.apps/mssql<N> and the mssql-initialize-mssql<N> pods are part of job.batch/mssql-initialize-mssql<N>.
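As an aside, CreateContainerConfigError usually means a ConfigMap or Secret referenced by the pod is missing. Since the manifests reference secretKeyRef: {key: sapassword, name: sql-secrets}, it is worth making sure that secret exists in the ag1 namespace before the init jobs run; a minimal sketch, with the password value as a placeholder:
kubectl get secret sql-secrets -n ag1
kubectl create secret generic sql-secrets -n ag1 --from-literal=sapassword='MyStrongP@ssw0rd'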
Upon adding the PV and PVC, it looks the following way:
$ kubectl get all -n ag1
NAME READY STATUS RESTARTS AGE
mssql-operator-658558b57d-pgx74 1/1 Running 0 20m
And 3 sqlservers.mssql.microsoft.com objects
$ kubectl get sqlservers.mssql.microsoft.com -n ag1
NAME AGE
mssql1 64m
mssql2 64m
mssql3 64m
So it looks exactly as specified in the abovementioned files.
However, if you run:
sql-server-samples/samples/features/high availability/Kubernetes/sample-deployment-script/$ ./deploy-ag.py deploy --dry-run
the configs will be generated automatically.
Running it without --dry-run, with those generated configs (and with properly set PV+PVC), gives us the 7 pods.
It will be useful to compare the auto-generated configs with the ones you have (and to compare running only the subset of 3 files vs. the full deploy-ag.py flow).
P.S.
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"15+" GitVersion:"v1.15.11-dispatcher"
Server Version: version.Info{Major:"1", Minor:"15+" GitVersion:"v1.15.12-gke.2"

Kubernetes pods is in status pending state

I am trying to install kubectl, but when I type this in the terminal:
kubectl get pods --namespace knative-serving -w
I get this:
NAME READY STATUS RESTARTS AGE
activator-69b8474d6b-jvzvs 2/2 Running 0 2h
autoscaler-6579b57774-cgmm9 2/2 Running 0 2h
controller-66cd7d99df-q59kl 0/1 Pending 0 2h
webhook-6d9568d-v4pgk 1/1 Running 0 2h
controller-66cd7d99df-q59kl 0/1 Pending 0 2h
controller-66cd7d99df-q59kl 0/1 Pending 0 2h
controller-66cd7d99df-q59kl 0/1 Pending 0 2h
controller-66cd7d99df-q59kl 0/1 Pending 0 2h
controller-66cd7d99df-q59kl 0/1 Pending 0 2h
controller-66cd7d99df-q59kl 0/1 Pending 0 2h
I don't understand why controller-66cd7d99df-q59kl is still Pending.
When I tried kubectl describe pods -n knative-serving controller-66cd7d99df-q59kl, I got this:
Name: controller-66cd7d99df-q59kl
Namespace: knative-serving
Node: <none>
Labels: app=controller
pod-template-hash=66cd7d99df
Annotations: sidecar.istio.io/inject=false
Status: Pending
IP:
Controlled By: ReplicaSet/controller-66cd7d99df
Containers:
controller:
Image: gcr.io/knative-releases/github.com/knative/serving/cmd/controller#sha256:5a5a0d5fffe839c99fc8f18ba028375467fdcd83cbee9c7015c1a58d01ca6929
Port: 9090/TCP
Limits:
cpu: 1
memory: 1000Mi
Requests:
cpu: 100m
memory: 100Mi
Environment: <none>
Mounts:
/etc/config-logging from config-logging (rw)
/var/run/secrets/kubernetes.io/serviceaccount from controller-token-d9l64 (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
config-logging:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: config-logging
Optional: false
controller-token-d9l64:
Type: Secret (a volume populated by a Secret)
SecretName: controller-token-d9l64
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 40s (x98 over 2h) default-scheduler 0/1 nodes are available: 1 Insufficient cpu.
Please consider the comments above: you have kubectl installed correctly (it's working) and kubectl describe pod/<pod> would help...
But, the information you provide appears sufficient for an answer:
FailedScheduling because of Insufficient cpu
The pod that you show (one of several) requests cpu: 100m and memory: 100Mi, with limits of cpu: 1 and memory: 1000Mi.
The cluster has insufficient capacity to deploy this pod (and apparently the others).
You should increase the number (and/or size) of the nodes in your cluster to accommodate the capacity needed by the pods.
You needn't delete these pods because, once the cluster's capacity increases, you should see these pods deploy successfully.
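The question doesn't say what the cluster runs on, but on a managed offering adding capacity is usually a single command. For example, on GKE (cluster and node pool names are placeholders):
gcloud container clusters resize <cluster-name> --node-pool <pool-name> --num-nodes 2
On minikube you would instead recreate the VM with more CPUs, e.g. minikube start --cpus=4.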
Please verify your CPU resources by running:
kubectl get nodes
kubectl describe node <your-node>
Also take a look at all the information related to:
Capacity:
cpu:
Allocatable:
cpu:
The CPU Requests and CPU Limits information can also be helpful.
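As a rough sketch of that check (the node name is a placeholder): the scheduler compares the sum of the pods' CPU requests against the node's allocatable CPU, so the Allocated resources section of the node description shows how much headroom is left:
kubectl describe node <node-name> | grep -A 10 "Allocated resources"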

Google Kubernetes Engine: Not seeing mount persistent volume in the instance

I created a 200G disk with the command gcloud compute disks create --size 200GB my-disk
then created a PersistentVolume
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-volume
spec:
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: my-disk
    fsType: ext4
then created a PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi
then created a StatefulSet and mounted the volume at /mnt/disks, which is an existing directory. statefulset.yaml:
apiVersion: apps/v1beta2
kind: StatefulSet
metadata:
  name: ...
spec:
  ...
    spec:
      containers:
      - name: ...
        ...
        volumeMounts:
        - name: my-volume
          mountPath: /mnt/disks
      volumes:
      - name: my-volume
        emptyDir: {}
  volumeClaimTemplates:
  - metadata:
      name: my-claim
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 200Gi
I ran the command kubectl get pv and saw that a disk was successfully provisioned and bound for each instance:
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
my-volume 200Gi RWO Retain Available 19m
pvc-17c60f45-2e4f-11e8-9b77-42010af0000e 200Gi RWO Delete Bound default/my-claim-xxx_1 standard 13m
pvc-5972c804-2e4e-11e8-9b77-42010af0000e 200Gi RWO Delete Bound default/my-claim standard 18m
pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e 200Gi RWO Delete Bound default/my-claimxxx_0 standard 18m
but when I SSH into an instance and run df -hT, I do not see the mounted volume. Below is the output:
Filesystem Type Size Used Avail Use% Mounted on
/dev/root ext2 1.2G 447M 774M 37% /
devtmpfs devtmpfs 1.9G 0 1.9G 0% /dev
tmpfs tmpfs 1.9G 0 1.9G 0% /dev/shm
tmpfs tmpfs 1.9G 744K 1.9G 1% /run
tmpfs tmpfs 1.9G 0 1.9G 0% /sys/fs/cgroup
tmpfs tmpfs 1.9G 0 1.9G 0% /tmp
tmpfs tmpfs 256K 0 256K 0% /mnt/disks
/dev/sda8 ext4 12M 28K 12M 1% /usr/share/oem
/dev/sda1 ext4 95G 3.5G 91G 4% /mnt/stateful_partition
tmpfs tmpfs 1.0M 128K 896K 13% /var/lib/cloud
overlayfs overlay 1.0M 148K 876K 15% /etc
Does anyone have any idea?
It's also worth mentioning that I'm trying to mount the disk into a Docker image which is running in Kubernetes Engine. The pod was created with the commands below:
docker build -t gcr.io/xxx .
gcloud docker -- push gcr.io/xxx
kubectl create -f statefulset.yaml
The instance I SSHed into is the one that runs the Docker image. I do not see the volume in either the instance or the Docker container.
UPDATE
I found the volume. I ran df -ahT on the instance and saw the relevant entries:
/dev/sdb - - - - - /var/lib/kubelet/plugins/kubernetes.io/gce-pd/mounts/gke-xxx-cluster-c-pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
/dev/sdb - - - - - /var/lib/kubelet/plugins/kubernetes.io/gce-pd/mounts/gke-xxx-cluster-c-pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
/dev/sdb - - - - - /home/kubernetes/containerized_mounter/rootfs/var/lib/kubelet/plugins/kubernetes.io/gce-pd/mounts/gke-xxx-cluster-c-pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
/dev/sdb - - - - - /home/kubernetes/containerized_mounter/rootfs/var/lib/kubelet/plugins/kubernetes.io/gce-pd/mounts/gke-xxx-cluster-c-pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
/dev/sdb - - - - - /var/lib/kubelet/pods/61bb679b-2e4e-11e8-9b77-42010af0000e/volumes/kubernetes.io~gce-pd/pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
/dev/sdb - - - - - /var/lib/kubelet/pods/61bb679b-2e4e-11e8-9b77-42010af0000e/volumes/kubernetes.io~gce-pd/pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
/dev/sdb - - - - - /home/kubernetes/containerized_mounter/rootfs/var/lib/kubelet/pods/61bb679b-2e4e-11e8-9b77-42010af0000e/volumes/kubernetes.io~gce-pd/pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
/dev/sdb - - - - - /home/kubernetes/containerized_mounter/rootfs/var/lib/kubelet/pods/61bb679b-2e4e-11e8-9b77-42010af0000e/volumes/kubernetes.io~gce-pd/pvc-61b9daf9-2e4e-11e8-9b77-42010af0000e
Then I went into the Docker container and ran df -ahT, and got:
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda1 ext4 95G 3.5G 91G 4% /mnt/disks
Why am I seeing a 95G total size instead of 200G, which is the size of my volume?
More info:
kubectl describe pod
Name: xxx-replicaset-0
Namespace: default
Node: gke-xxx-cluster-default-pool-5e49501c-nrzt/10.128.0.17
Start Time: Fri, 23 Mar 2018 11:40:57 -0400
Labels: app=xxx-replicaset
controller-revision-hash=xxx-replicaset-755c4f7cff
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"StatefulSet","namespace":"default","name":"xxx-replicaset","uid":"d6c3511f-2eaf-11e8-b14e-42010af0000...
kubernetes.io/limit-ranger=LimitRanger plugin set: cpu request for container xxx-deployment
Status: Running
IP: 10.52.4.5
Created By: StatefulSet/xxx-replicaset
Controlled By: StatefulSet/xxx-replicaset
Containers:
xxx-deployment:
Container ID: docker://137b3966a14538233ed394a3d0d1501027966b972d8ad821951f53d9eb908615
Image: gcr.io/sampeproject/xxxstaging:v1
Image ID: docker-pullable://gcr.io/sampeproject/xxxstaging#sha256:a96835c2597cfae3670a609a69196c6cd3d9cc9f2f0edf5b67d0a4afdd772e0b
Port: 8080/TCP
State: Running
Started: Fri, 23 Mar 2018 11:42:17 -0400
Ready: True
Restart Count: 0
Requests:
cpu: 100m
Environment:
Mounts:
/mnt/disks from my-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-hj65g (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
my-claim:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: my-claim-xxx-replicaset-0
ReadOnly: false
my-volume:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
default-token-hj65g:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-hj65g
Optional: false
QoS Class: Burstable
Node-Selectors:
Tolerations: node.alpha.kubernetes.io/notReady:NoExecute for 300s
node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 10m (x4 over 10m) default-scheduler PersistentVolumeClaim is not bound: "my-claim-xxx-replicaset-0" (repeated 5 times)
Normal Scheduled 9m default-scheduler Successfully assigned xxx-replicaset-0 to gke-xxx-cluster-default-pool-5e49501c-nrzt
Normal SuccessfulMountVolume 9m kubelet, gke-xxx-cluster-default-pool-5e49501c-nrzt MountVolume.SetUp succeeded for volume "my-volume"
Normal SuccessfulMountVolume 9m kubelet, gke-xxx-cluster-default-pool-5e49501c-nrzt MountVolume.SetUp succeeded for volume "default-token-hj65g"
Normal SuccessfulMountVolume 9m kubelet, gke-xxx-cluster-default-pool-5e49501c-nrzt MountVolume.SetUp succeeded for volume "pvc-902c57c5-2eb0-11e8-b14e-42010af0000e"
Normal Pulling 9m kubelet, gke-xxx-cluster-default-pool-5e49501c-nrzt pulling image "gcr.io/sampeproject/xxxstaging:v1"
Normal Pulled 8m kubelet, gke-xxx-cluster-default-pool-5e49501c-nrzt Successfully pulled image "gcr.io/sampeproject/xxxstaging:v1"
Normal Created 8m kubelet, gke-xxx-cluster-default-pool-5e49501c-nrzt Created container
Normal Started 8m kubelet, gke-xxx-cluster-default-pool-5e49501c-nrzt Started container
It seems like it did not mount the correct volume. I ran lsblk in the Docker container:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 100G 0 disk
├─sda1 8:1 0 95.9G 0 part /mnt/disks
├─sda2 8:2 0 16M 0 part
├─sda3 8:3 0 2G 0 part
├─sda4 8:4 0 16M 0 part
├─sda5 8:5 0 2G 0 part
├─sda6 8:6 0 512B 0 part
├─sda7 8:7 0 512B 0 part
├─sda8 8:8 0 16M 0 part
├─sda9 8:9 0 512B 0 part
├─sda10 8:10 0 512B 0 part
├─sda11 8:11 0 8M 0 part
└─sda12 8:12 0 32M 0 part
sdb 8:16 0 200G 0 disk
Why is this happening?
When you use PVCs, K8s manages persistent disks for you.
The exact way PVs are defined is determined by the provisioner configured in the storage class. Since you use GKE, your default SC uses the kubernetes.io/gce-pd provisioner (https://kubernetes.io/docs/concepts/storage/storage-classes/#gce).
In other words, a new PV is created for each pod.
If you would like to use an existing disk, you can use Volumes instead of PVCs (https://kubernetes.io/docs/concepts/storage/volumes/#gcepersistentdisk).
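A minimal sketch of that approach, referencing the pre-created my-disk directly in the pod spec instead of going through a PVC (mount path taken from the question's manifest):
  containers:
  - name: ...
    volumeMounts:
    - name: my-volume
      mountPath: /mnt/disks
  volumes:
  - name: my-volume
    gcePersistentDisk:
      pdName: my-disk
      fsType: ext4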
The PVC is not mounted into your container because you did not actually specify the PVC in your container's volumeMounts. Only the emptyDir volume was specified.
I actually recently modified the GKE StatefulSet tutorial. Before, some of the steps were incorrect, saying to manually create the PD and PV objects; it has since been corrected to use dynamic provisioning.
Please try that out and see if the updated steps work for you.
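For reference, a sketch of what the StatefulSet's pod template and claim template would look like with the claim actually mounted (names taken from the manifests above; the emptyDir volume is dropped so the mount points at the PVC):
    spec:
      containers:
      - name: ...
        volumeMounts:
        - name: my-claim          # must match the volumeClaimTemplate name
          mountPath: /mnt/disks
  volumeClaimTemplates:
  - metadata:
      name: my-claim
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 200Gi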

Resources