I'm stuck with an annoying issue, where my pod can't access the mounted persistent volume.
Kubeadm: v1.19.2
Docker: 19.03.13
Zookeeper image: library/zookeeper:3.6
Cluster info: Locally hosted, no Cloud Provide
K8s configuration:
apiVersion: v1
kind: Service
metadata:
name: zk-hs
labels:
app: zk
spec:
selector:
app: zk
ports:
- port: 2888
targetPort: 2888
name: server
protocol: TCP
- port: 3888
targetPort: 3888
name: leader-election
protocol: TCP
clusterIP: ""
type: LoadBalancer
---
apiVersion: v1
kind: Service
metadata:
name: zk-cs
labels:
app: zk
spec:
selector:
app: zk
ports:
- name: client
protocol: TCP
port: 2181
targetPort: 2181
type: LoadBalancer
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
name: zk-pdb
spec:
selector:
matchLabels:
app: zk
maxUnavailable: 1
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: zk
spec:
selector:
matchLabels:
app: zk
serviceName: zk-hs
replicas: 1
updateStrategy:
type: RollingUpdate
podManagementPolicy: OrderedReady
template:
metadata:
labels:
app: zk
spec:
volumes:
- name: zoo-config
configMap:
name: zoo-config
- name: datadir
persistentVolumeClaim:
claimName: zoo-pvc
containers:
- name: zookeeper
imagePullPolicy: Always
image: "library/zookeeper:3.6"
resources:
requests:
memory: "1Gi"
cpu: "0.5"
ports:
- containerPort: 2181
name: client
- containerPort: 2888
name: server
- containerPort: 3888
name: leader-election
volumeMounts:
- name: datadir
mountPath: /var/lib/zookeeper/data
- name: zoo-config
mountPath: /conf
securityContext:
fsGroup: 2000
runAsUser: 1000
runAsNonRoot: true
volumeClaimTemplates:
- metadata:
name: datadir
annotations:
volume.beta.kubernetes.io/storage-class: local-storage
spec:
accessModes: [ "ReadWriteOnce" ]
storageClassName: local-storage
resources:
requests:
storage: 10Gi
---
apiVersion: v1
kind: ConfigMap
metadata:
name: zoo-config
namespace: default
data:
zoo.cfg: |
tickTime=10000
dataDir=/var/lib/zookeeper/data
clientPort=2181
initLimit=10
syncLimit=4
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
kind: PersistentVolume
apiVersion: v1
metadata:
name: zoo-pv
labels:
type: local
spec:
storageClassName: local-storage
persistentVolumeReclaimPolicy: Retain
hostPath:
path: "/mnt/data"
capacity:
storage: 10Gi
accessModes:
- ReadWriteOnce
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- <node-name>
I've tried running the pod as root with the following security context, which I know is a terrible idea, purely as a test. This however caused a bunch of other issues.
securityContext:
fsGroup: 0
runAsUser: 0
Once the pod starts up the logs contain the following,
Zookeeper JMX enabled by default
Using config: /conf/zoo.cfg
<log4j Warnings>
Unable too access datadir, exiting abnormally
Inspecting the pod, provides me with the following information,
~$ kubectl describe pod/zk-0
Name: zk-0
Namespace: default
Priority: 0
Node: <node>
Start Time: Sat, 26 Sep 2020 15:48:00 +0200
Labels: app=zk
controller-revision-hash=zk-6c68989bd
statefulset.kubernetes.io/pod-name=zk-0
Annotations: <none>
Status: Running
IP: <IP>
IPs:
IP: <IP>
Controlled By: StatefulSet/zk
Containers:
zookeeper:
Container ID: docker://281e177d677394604785542c231d21b71f1666a22e74c1c10ef88491dad7a522
Image: library/zookeeper:3.6
Image ID: docker-pullable://zookeeper#sha256:6c051390cfae7958ff427834937c353fc6c34484f6a84b3e4bc8c512b53a16f6
Ports: 2181/TCP, 2888/TCP, 3888/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 3
Started: Sat, 26 Sep 2020 16:04:26 +0200
Finished: Sat, 26 Sep 2020 16:04:27 +0200
Ready: False
Restart Count: 8
Requests:
cpu: 500m
memory: 1Gi
Environment: <none>
Mounts:
/conf from zoo-config (rw)
/var/lib/zookeeper/data from datadir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-88x56 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
datadir:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: datadir-zk-0
ReadOnly: false
zoo-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: zoo-config
Optional: false
default-token-88x56:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-88x56
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 17m default-scheduler Successfully assigned default/zk-0 to <node>
Normal Pulled 17m kubelet Successfully pulled image "library/zookeeper:3.6" in 1.932381527s
Normal Pulled 17m kubelet Successfully pulled image "library/zookeeper:3.6" in 1.960610662s
Normal Pulled 17m kubelet Successfully pulled image "library/zookeeper:3.6" in 1.959935633s
Normal Created 16m (x4 over 17m) kubelet Created container zookeeper
Normal Pulled 16m kubelet Successfully pulled image "library/zookeeper:3.6" in 1.92551645s
Normal Started 16m (x4 over 17m) kubelet Started container zookeeper
Normal Pulling 15m (x5 over 17m) kubelet Pulling image "library/zookeeper:3.6"
Warning BackOff 2m35s (x71 over 17m) kubelet Back-off restarting failed container
To me, it seems like the pod has full rw access to the volume, so I'm unsure why it's still refusing to access the directory. Any help will be appreciated!
After quite some digging, I finally figured out why it wasn't working. The logs were actually telling me all I needed to know in the end, the mounted persistentVolumeClaim simply did not have the correct file permissions to read from the mounted hostpath /mnt/data directory
To fix this, in a somewhat hacky way, I gave read, write & execute permissions to all.
chmod 777 /mnt/data
Overview can be found here
This is definitely not the most secure way, of fixing the issue, and I would strongly advise against using this in any production like environment.
Probably a better approach would be the following
sudo usermod -a -G 1000 1000
Related
I have defined a K8S configuration which deploy a metricbeat container. Below is the configuration file. But I got an error when run kubectl describe pod:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 5m24s default-scheduler Successfully assigned default/metricbeat-6467cc777b-jrx9s to ip-192-168-44-226.ap-southeast-2.compute.internal
Warning FailedMount 3m21s kubelet Unable to attach or mount volumes: unmounted volumes=[config], unattached volumes=[default-token-4dxhl config modules]: timed out waiting for the condition
Warning FailedMount 74s (x10 over 5m24s) kubelet MountVolume.SetUp failed for volume "config" : configmap "metricbeat-daemonset-config" not found
Warning FailedMount 67s kubelet Unable to attach or mount volumes: unmounted volumes=[config], unattached volumes=[config modules default-token-4dxhl]: timed out waiting for the condition
based on the error message, it says configmap "metricbeat-daemonset-config" not found but it does exist in below configuration file. why does it report this error?
apiVersion: v1
kind: ConfigMap
metadata:
name: metricbeat-daemonset-config
namespace: kube-system
labels:
k8s-app: metricbeat
data:
metricbeat.yml: |-
metricbeat.config.modules:
# Mounted `metricbeat-daemonset-modules` configmap:
path: ${path.config}/modules.d/*.yml
# Reload module configs as they change:
reload.enabled: false
processors:
- add_cloud_metadata:
output.elasticsearch:
hosts: ['${ELASTICSEARCH_HOST}:${ELASTICSEARCH_PORT:9200}']
---
apiVersion: v1
kind: ConfigMap
metadata:
name: metricbeat-daemonset-modules
labels:
k8s-app: metricbeat
data:
aws.yml: |-
- module: aws
access_key_id: 'xxxx'
secret_access_key: 'xxxx'
period: 600s
regions:
- ap-southeast-2
metricsets:
- elb
- natgateway
- rds
- transitgateway
- usage
- vpn
- cloudwatch
metrics:
- namespace: "*"
---
# Deploy a Metricbeat instance per node for node metrics retrieval
apiVersion: apps/v1
kind: Deployment
metadata:
name: metricbeat
labels:
k8s-app: metricbeat
spec:
selector:
matchLabels:
k8s-app: metricbeat
template:
metadata:
labels:
k8s-app: metricbeat
spec:
terminationGracePeriodSeconds: 30
hostNetwork: true
containers:
- name: metricbeat
image: elastic/metricbeat:7.11.1
env:
- name: ELASTICSEARCH_HOST
value: es-entrypoint
- name: ELASTICSEARCH_PORT
value: '9200'
resources:
limits:
memory: 200Mi
requests:
cpu: 100m
memory: 100Mi
volumeMounts:
- name: config
mountPath: /etc/metricbeat.yml
readOnly: true
subPath: metricbeat.yml
- name: modules
mountPath: /usr/share/metricbeat/modules.d
readOnly: true
volumes:
- name: config
configMap:
defaultMode: 0640
name: metricbeat-daemonset-config
- name: modules
configMap:
defaultMode: 0640
name: metricbeat-daemonset-modules
There's a good chance that the other resources are ending up in namespace default because they do not specify a namespace, but the config does (kube-system). You should probably put all of this in its own metricbeat namespace.
I am working with Docker Desktop on Windows 10 as well as Windows Subsystem for Linux (WSL). I have a containerized app that I deploy to the local K8s cluster (courtesy of Docker Desktop). Typical story: all was working fine until one day a Docker Desktop update came and ruined everything). DD version I have now is 2.3.0.2 stable.
I have a pod with MySQL with defined pv, pvc and storage class. When I deploy my app to the cluster I can see that pv and pvc are bound but the pod is stuck at ContainerCreating state:
$ kubectl describe pod mysql-6779d8fb8b-d25wz
Name: mysql-6779d8fb8b-d25wz
Namespace: typo3-connector
Priority: 0
Node: docker-desktop/192.168.65.3
Start Time: Wed, 13 May 2020 14:21:43 +0200
Labels: app=mysql
pod-template-hash=6779d8fb8b
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/mysql-6779d8fb8b
Containers:
mysql:
Container ID:
Image: lw-mysql
Image ID:
Port: 3306/TCP
Host Port: 0/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment Variables from:
mysql-credentials Secret Optional: false
Environment: <none>
Mounts:
/var/lib/mysql from mysql-persistent-storage (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-wr6g9 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
mysql-persistent-storage:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: mysql-pv-claim
ReadOnly: false
default-token-wr6g9:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-wr6g9
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling <unknown> default-scheduler persistentvolumeclaim "mysql-pv-claim" not found
Warning FailedScheduling <unknown> default-scheduler persistentvolumeclaim "mysql-pv-claim" not found
Normal Scheduled <unknown> default-scheduler Successfully assigned typo3-connector/mysql-6779d8fb8b-d25wz to docker-desktop
Warning FailedMount 35s (x8 over 99s) kubelet, docker-desktop MountVolume.NewMounter initialization failed for volume "mysql-pv" : path "/c/kubernetes/typo3-8/mysql-storage/" does not exist
The error is
MountVolume.NewMounter initialization failed for volume "mysql-pv" : path "/c/kubernetes/typo3-8/mysql-storage/" does not exist
But the path actually exists on disk (in WSL that is).
The pv:
$ kubectl describe pv mysql-pv
Name: mysql-pv
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"PersistentVolume","metadata":{"annotations":{},"name":"mysql-pv"},"spec":{"accessModes":["ReadWriteOnce"],"capa...
pv.kubernetes.io/bound-by-controller: yes
Finalizers: [kubernetes.io/pv-protection]
StorageClass: local-storage
Status: Bound
Claim: typo3-connector/mysql-pv-claim
Reclaim Policy: Retain
Access Modes: RWO
VolumeMode: Filesystem
Capacity: 2Gi
Node Affinity:
Required Terms:
Term 0: kubernetes.io/hostname in [docker-desktop]
Message:
Source:
Type: LocalVolume (a persistent volume backed by local storage on a node)
Path: /c/kubernetes/typo3-8/mysql-storage/
Events: <none>
The pvc:
$ kubectl describe pvc mysql-pv-claim
Name: mysql-pv-claim
Namespace: typo3-connector
StorageClass: local-storage
Status: Bound
Volume: mysql-pv
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"mysql-pv-claim","namespace":"typo3-connector"},"spe...
pv.kubernetes.io/bind-completed: yes
pv.kubernetes.io/bound-by-controller: yes
Finalizers: [kubernetes.io/pvc-protection]
Capacity: 2Gi
Access Modes: RWO
VolumeMode: Filesystem
Mounted By: mysql-6779d8fb8b-d25wz
Events: <none>
I tried running it from PowerShell but no luck. I get the same result.
But it was all working fine before the update.
Is it configuration based issue?
EDIT
Here's the manifest file:
apiVersion: v1
kind: Namespace
metadata:
name: typo3-connector
---
apiVersion: v1
data:
MYSQL_PASSWORD: ZHVtbXk=
MYSQL_ROOT_PASSWORD: ZHVtbXk=
MYSQL_USER: ZHVtbXk=
kind: Secret
metadata:
name: mysql-credentials
namespace: typo3-connector
type: Opaque
---
apiVersion: v1
kind: Service
metadata:
labels:
app: mysql
name: mysql
namespace: typo3-connector
spec:
ports:
- name: mysql-backend
port: 3306
protocol: TCP
selector:
app: mysql
type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: mysql
namespace: typo3-connector
spec:
replicas: 1
selector:
matchLabels:
app: mysql
template:
metadata:
labels:
app: mysql
spec:
containers:
- envFrom:
- secretRef:
name: mysql-credentials
image: mysql:5.6
imagePullPolicy: IfNotPresent
name: mysql
ports:
- containerPort: 3306
name: mysql
volumeMounts:
- mountPath: /var/lib/mysql
name: mysql-persistent-storage
imagePullSecrets:
- name: lwdockerregistry
volumes:
- name: mysql-persistent-storage
persistentVolumeClaim:
claimName: mysql-pv-claim
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: mysql-pv
spec:
accessModes:
- ReadWriteOnce
capacity:
storage: 2Gi
local:
path: /c/kubernetes/typo3-8/mysql-storage/
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- docker-desktop
persistentVolumeReclaimPolicy: Retain
storageClassName: local-storage
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: mysql-pv-claim
namespace: typo3-connector
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 2Gi
storageClassName: local-storage
You must use this format for the path:
/run/desktop/mnt/host/c/someDir/volumeDir
So in your case it is:
/run/desktop/mnt/host/c/kubernetes/typo3-8/mysql-storage/
source for solution
UPDATE: You really don't want to use windows host directories inside WSL2 distro because it is really really poor performance. Use the WSL2 distro directories instead that you can access just as easily from the host windows on this address \\wsl$
As described in the following example taken from here, the path shouldn't end with /.
apiVersion: v1
kind: PersistentVolume
metadata:
name: example-local-pv
spec:
capacity:
storage: 500Gi
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Retain
storageClassName: local-storage
local:
path: /mnt/disks/vol1
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- my-node
Changing your path to path: /c/kubernetes/typo3-8/mysql-storage makes the magic and your PVC works as designed.
I am trying to use gcePersistentDisk as ReadOnlyMany so that my pods on multiple nodes can read the data on this disk. Following the documentation here for the same.
To create and later format the gce Persistent Disk, I have followed the instructions in the documentation here. Following this doc, I have sshed into one of the nodes and have formatted the disk. See below the complete error and also the other yaml files.
kubectl describe pods -l podName
Name: punk-fly-nodejs-deployment-5dbbd7b8b5-5cbfs
Namespace: default
Priority: 0
PriorityClassName: <none>
Node: gke-mycluster-default-pool-b1c1d316-d016/10.160.0.12
Start Time: Thu, 25 Apr 2019 23:55:38 +0530
Labels: app.kubernetes.io/instance=punk-fly
app.kubernetes.io/name=nodejs
pod-template-hash=1866836461
Annotations: kubernetes.io/limit-ranger=LimitRanger plugin set: cpu request for container nodejs
Status: Pending
IP:
Controlled By: ReplicaSet/punk-fly-nodejs-deployment-5dbbd7b8b5
Containers:
nodejs:
Container ID:
Image: rajesh12/smartserver:server
Image ID:
Port: 3002/TCP
Host Port: 0/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Requests:
cpu: 100m
Environment:
MYSQL_HOST: mysqlservice
MYSQL_DATABASE: app
MYSQL_ROOT_PASSWORD: password
Mounts:
/usr/src/ from helm-vol (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-jpkzg (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
helm-vol:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: my-readonly-pvc
ReadOnly: true
default-token-jpkzg:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-jpkzg
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m default-scheduler Successfully assigned default/punk-fly-nodejs-deployment-5dbbd7b8b5-5cbfs to gke-mycluster-default-pool-b1c1d316-d016
Normal SuccessfulAttachVolume 1m attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-9c796180-677e-11e9-ad35-42010aa0000f"
Warning FailedMount 10s (x8 over 1m) kubelet, gke-mycluster-default-pool-b1c1d316-d016 MountVolume.MountDevice failed for volume "pvc-9c796180-677e-11e9-ad35-42010aa0000f" : failed to mount unformatted volume as read only
Warning FailedMount 0s kubelet, gke-mycluster-default-pool-b1c1d316-d016 Unable to mount volumes for pod "punk-fly-nodejs-deployment-5dbbd7b8b5-5cbfs_default(86293044-6787-11e9-ad35-42010aa0000f)": timeout expired waiting for volumes to attach or mount for pod "default"/"punk-fly-nodejs-deployment-5dbbd7b8b5-5cbfs". list of unmounted volumes=[helm-vol]. list of unattached volumes=[helm-vol default-token-jpkzg]
readonly_pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
name: my-readonly-pv
spec:
storageClassName: ""
capacity:
storage: 1G
accessModes:
- ReadOnlyMany
gcePersistentDisk:
pdName: mydisk0
fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: my-readonly-pvc
spec:
accessModes:
- ReadOnlyMany
resources:
requests:
storage: 1G
deployment.yaml
volumes:
- name: helm-vol
persistentVolumeClaim:
claimName: my-readonly-pvc
readOnly: true
containers:
- name: {{ .Values.app.backendName }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tagServer }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
env:
- name: MYSQL_HOST
value: mysqlservice
- name: MYSQL_DATABASE
value: app
- name: MYSQL_ROOT_PASSWORD
value: password
ports:
- name: http-backend
containerPort: 3002
volumeMounts:
- name: helm-vol
mountPath: /usr/src/
It sounds like your PVC is dynamically provisioning a new volume that is not formatted with the default StorageClass
It could be that your Pod is being created in a different availability from where you have the PV provisioned. The gotcha with having multiple Pod readers for the gce volume is that the Pods always have to be in the same availability zone.
Some options:
Simply create and format the PV on the same availability zone where your node is.
When you define your PV you could specify Node Affinity to make sure it always gets assigned to a specific node.
Define a StorageClass that specifies the filesystem
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: mysc
provisioner: kubernetes.io/aws-ebs
parameters:
type: gp2
fsType: ext4
And then use it in your PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: my-pvc
spec:
accessModes:
- ReadOnlyMany
resources:
requests:
storage: 1G
storageClassName: mysc
The volume will be automatically provisioned and formatted.
I received the same error message when trying to provision a persistent disk with an access mode of ReadWriteOnce. What fixed the issue for me was removing the property readOnly: true from the volumes declaration of the Deployment spec. In the case of your deployment.yaml file, that would be this block:
volumes:
- name: helm-vol
persistentVolumeClaim:
claimName: my-readonly-pvc
readOnly: true
Try removing that line and see if the error goes away.
I had the same error and managed to fix it using a few lines from a related article about using preexisting disks
https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/preexisting-pd
You need to add storageClassName and volumeName to your persistent volume claim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: pv-claim-demo
spec:
# It's necessary to specify "" as the storageClassName
# so that the default storage class won't be used, see
# https://kubernetes.io/docs/concepts/storage/persistent-volumes/#class-1
storageClassName: ""
volumeName: pv-demo
accessModes:
- ReadOnlyMany
resources:
requests:
storage: 500G
I'm having a few issues getting Ambassador to work correctly. I'm new to Kubernetes and just teaching myself.
I have successfully managed to work through the demo material Ambassador provide - e.g /httpbin/ endpoint is working correctly, but when I try to deploy a Go service it is falling over.
When hitting the 'qotm' endpoint, the page this is the response:
upstream request timeout
Pod status:
CrashLoopBackOff
From my research, it seems to be related to the yaml file not being configured correctly but I'm struggling to find any documentation relating to this use case.
My cluster is running on AWS EKS and the images are being pushed to AWS ECR.
main.go:
package main
import (
"fmt"
"net/http"
"os"
)
func main() {
var PORT string
if PORT = os.Getenv("PORT"); PORT == "" {
PORT = "3001"
}
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
fmt.Fprintf(w, "Hello World from path: %s\n", r.URL.Path)
})
http.ListenAndServe(":" + PORT, nil)
}
Dockerfile:
FROM golang:alpine
ADD ./src /go/src/app
WORKDIR /go/src/app
EXPOSE 3001
ENV PORT=3001
CMD ["go", "run", "main.go"]
test.yaml:
apiVersion: v1
kind: Service
metadata:
name: qotm
annotations:
getambassador.io/config: |
---
apiVersion: ambassador/v1
kind: Mapping
name: qotm_mapping
prefix: /qotm/
service: qotm
spec:
selector:
app: qotm
ports:
- port: 80
name: http-qotm
targetPort: http-api
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: qotm
spec:
replicas: 1
strategy:
type: RollingUpdate
template:
metadata:
labels:
app: qotm
spec:
containers:
- name: qotm
image: ||REMOVED||
ports:
- name: http-api
containerPort: 3001
readinessProbe:
httpGet:
path: /health
port: 5000
initialDelaySeconds: 30
periodSeconds: 3
resources:
limits:
cpu: "0.1"
memory: 100Mi
Pod description:
Name: qotm-7b9bf4d499-v9nxq
Namespace: default
Priority: 0
PriorityClassName: <none>
Node: ip-192-168-89-69.eu-west-1.compute.internal/192.168.89.69
Start Time: Sun, 17 Mar 2019 17:19:50 +0000
Labels: app=qotm
pod-template-hash=3656908055
Annotations: <none>
Status: Running
IP: 192.168.113.23
Controlled By: ReplicaSet/qotm-7b9bf4d499
Containers:
qotm:
Container ID: docker://5839996e48b252ac61f604d348a98c47c53225712efd503b7c3d7e4c736920c4
Image: IMGURL
Image ID: docker-pullable://IMGURL
Port: 3001/TCP
Host Port: 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Sun, 17 Mar 2019 17:30:49 +0000
Finished: Sun, 17 Mar 2019 17:30:49 +0000
Ready: False
Restart Count: 7
Limits:
cpu: 100m
memory: 200Mi
Requests:
cpu: 100m
memory: 200Mi
Readiness: http-get http://:3001/health delay=30s timeout=1s period=3s #success=1 #failure=3
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-5bbxw (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
default-token-5bbxw:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-5bbxw
Optional: false
QoS Class: Guaranteed
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 12m default-scheduler Successfully assigned default/qotm-7b9bf4d499-v9nxq to ip-192-168-89-69.eu-west-1.compute.internal
Normal Pulled 10m (x5 over 12m) kubelet, ip-192-168-89-69.eu-west-1.compute.internal Container image "IMGURL" already present on machine
Normal Created 10m (x5 over 12m) kubelet, ip-192-168-89-69.eu-west-1.compute.internal Created container
Normal Started 10m (x5 over 11m) kubelet, ip-192-168-89-69.eu-west-1.compute.internal Started container
Warning BackOff 115s (x47 over 11m) kubelet, ip-192-168-89-69.eu-west-1.compute.internal Back-off restarting failed container
In your kubernetes deployment file you have exposed a readiness probe on port 5000 while your application is exposed on port 3001, also while running the container a few times I got OOMKilled so increased the memory limit. Anyways below deployment file should work fine.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: qotm
spec:
replicas: 1
strategy:
type: RollingUpdate
template:
metadata:
labels:
app: qotm
spec:
containers:
- name: qotm
image: <YOUR_IMAGE>
imagePullPolicy: Always
ports:
- name: http-api
containerPort: 3001
readinessProbe:
httpGet:
path: /health
port: 3001
initialDelaySeconds: 30
periodSeconds: 3
resources:
limits:
cpu: "0.1"
memory: 200Mi
I've created an example Rails 5 app that uses Google Cloud PostgreSQL.
I'm able to run the app locally with docker-compose up, but I'm not able to connect to it remote when I deploy it to GCP.
I tried to replicate https://cloud.google.com/ruby/tutorials/bookshelf-on-kubernetes-engine where they use targetPort: http-server
The rails app is published on Github.
Am I doing anything obviously wrong? :-|
Running the app locally works
git clone git#github.com:stabenfeldt/k8s-colors.git
docker-compose up -d
docker-compose run colors rake db:create db:migrate
open http://localhost:3000
Create a GKE cluster
gcloud container clusters create color-cluster --num-nodes=2
Setup PostgreSQL Cloud SQL
I followed the instructions from https://cloud.google.com/sql/docs/postgres/connect-kubernetes-engine?authuser=1
and updated my config/database.yml and k8s/colors.yml with these values.
Deployed but stuck on ContainerCreating
kubectl apply -f k8s/colors.yml
kubectl get pods
NAME READY STATUS RESTARTS AGE
colors-d9f744dc-d5l5v 0/2 ContainerCreating 0 5m
colors-d9f744dc-spmws 0/2 ContainerCreating 0 5m
kubectl logs d9f744dc-d5l5v -c colors # => Nothing logged
kubectl get deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
colors 2 2 2 0 7m
But fails to connect to the app
kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
colors LoadBalancer 10.55.245.192 35.228.111.217 80:30746/TCP 1h
kubernetes ClusterIP 10.55.240.1 <none> 443/TCP 1h
curl 35.228.111.217 # => No response! :-/
kubectl describe svc colors
Name: colors
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"name":"colors","namespace":"default"},"spec":{"ports":[{"port":80,"targetPort":3000}]...
Selector: app=colors
Type: LoadBalancer
IP: 10.55.252.91
LoadBalancer Ingress: 35.228.203.46
Port: <unset> 80/TCP
TargetPort: 3000/TCP
NodePort: <unset> 30964/TCP
Endpoints: <none>
Session Affinity: None
External Traffic Policy: Cluster
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Type 1m service-controller ClusterIP -> LoadBalancer
Normal EnsuringLoadBalancer 1m service-controller Ensuring load balancer
Normal EnsuredLoadBalancer 30s service-controller Ensured load balancer
k8s/service.yml
apiVersion: apps/v1
kind: Deployment
metadata:
name: colors
labels:
app: colors
spec:
replicas: 2
selector:
matchLabels:
app: colors
template:
metadata:
labels:
app: colors
spec:
containers:
- name: colors
image: docker.io/stabenfeldt/colors:latest
ports:
- name: http-server
containerPort: 3000
env:
- name: POSTGRES_HOST
value: 127.0.0.1:5432
- name: POSTGRES_USER
valueFrom:
secretKeyRef:
name: cloudsql-db-credentials
key: username
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: cloudsql-db-credentials
key: password
- name: cloudsql-proxy
image: gcr.io/cloudsql-docker/gce-proxy:1.11
command: ["/cloud_sql_proxy",
"-instances=PROJECT_ID:europe-west1:staging=tcp:5432",
"-credential_file=/secrets/cloudsql/credentials.json"]
volumeMounts:
- name: cloudsql-instance-credentials
mountPath: /secrets/cloudsql
readOnly: true
volumes:
- name: cloudsql-instance-credentials
secret:
secretName: cloudsql-instance-credentials
---
apiVersion: v1
kind: Service
metadata:
name: colors
spec:
type: LoadBalancer
ports:
- port: 80
targetPort: 3000
selector:
app: colors
kubectl describe deployment
Name: colors
Namespace: default
CreationTimestamp: Fri, 13 Jul 2018 10:37:06 +0200
Labels: app=colors
Annotations: deployment.kubernetes.io/revision=1
kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"labels":{"app":"colors"},"name":"colors","namespace":"default"},"spec":{"repl...
Selector: app=colors
Replicas: 2 desired | 2 updated | 2 total | 0 available | 2 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=colors
Containers:
colors:
Image: docker.io/stabenfeldt/colors:latest
Port: 3000/TCP
Environment:
POSTGRES_HOST: 127.0.0.1:5432
POSTGRES_USER: <set to the key 'username' in secret 'cloudsql-db-credentials'> Optional: false
POSTGRES_PASSWORD: <set to the key 'password' in secret 'cloudsql-db-credentials'> Optional: false
Mounts: <none>
cloudsql-proxy:
Image: gcr.io/cloudsql-docker/gce-proxy:1.11
Port: <none>
Command:
/cloud_sql_proxy
-instances=MY-INSTANCE:europe-west1:staging=tcp:5432
-credential_file=/secrets/cloudsql/credentials.json
Environment: <none>
Mounts:
/secrets/cloudsql from cloudsql-instance-credentials (ro)
Volumes:
cloudsql-instance-credentials:
Type: Secret (a volume populated by a Secret)
SecretName: cloudsql-instance-credentials
Optional: false
Conditions:
Type Status Reason
---- ------ ------
Available False MinimumReplicasUnavailable
Progressing True ReplicaSetUpdated
OldReplicaSets: <none>
NewReplicaSet: colors-d9f744dc (2/2 replicas created)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 1m deployment-controller Scaled up replica set colors-d9f744dc to 2
kubectl describe service
Name: colors
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"name":"colors","namespace":"default"},"spec":{"ports":[{"port":80,"targetPort":3000}]...
Selector: app=colors
Type: LoadBalancer
IP: 10.55.252.91
LoadBalancer Ingress: 35.228.203.46
Port: <unset> 80/TCP
TargetPort: 3000/TCP
NodePort: <unset> 30964/TCP
Endpoints: <none>
Session Affinity: None
External Traffic Policy: Cluster
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Type 4m service-controller ClusterIP -> LoadBalancer
Normal EnsuringLoadBalancer 4m service-controller Ensuring load balancer
Normal EnsuredLoadBalancer 3m service-controller Ensured load balancer
Name: kubernetes
Namespace: default
Labels: component=apiserver
provider=kubernetes
Annotations: <none>
Selector: <none>
Type: ClusterIP
IP: 10.55.240.1
Port: https 443/TCP
TargetPort: 443/TCP
Endpoints: 35.228.79.249:443
Session Affinity: ClientIP
Events: <none>
I don't see anything wrong outright, but here are a few tips to verifying your Kubernetes Objects look like they should compared to your yamls:
Use the describe command to get more information about objects and make sure they are set up correctly.
For example, if you do kubectl describe deployment <deployment_name> you should verify the following line is present:
Port: 3000/TCP
And for your Service - kubectl describe service <service_name>:
LoadBalancer Ingress: <PUBLIC_IP>
Port: <unset> 80/TCP
TargetPort: 3000/TCP
Finally, I'm not sure if you want to apply the following in your LoadBalancer:
labels:
app: colors
Since you are using this label as a selector, it may be doing something funky and trying to load balance to itself instead of your containers with the apps in it.
Also as a side note on your terminology, GCP (Google Cloud Platform) is the overarching name of Google's Services, GKE (Google Kubernetes Engine) is the service providing you with a managed Kuberenetes Cluster.
Hope this helps.
A working setup can be in my example Rails app at Github.
k8s/colors.yml
# Remember to update MY-INSTANCE
apiVersion: v1
kind: Service
metadata:
name: colors-frontend
labels:
app: colors
tier: frontend
spec:
type: LoadBalancer
ports:
- port: 80
targetPort: http-server
selector:
app: colors
tier: frontend
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: colors-frontend
labels:
app: colors
tier: frontend
spec:
replicas: 3
template:
metadata:
labels:
app: colors
tier: frontend
spec:
volumes:
- name: cloudsql-instance-credentials
secret:
secretName: cloudsql-instance-credentials
containers:
- name: cloudsql-proxy
image: gcr.io/cloudsql-docker/gce-proxy:1.11
command: ["/cloud_sql_proxy",
"-instances=MY-INSTANCE:europe-west1:development=tcp:5432",
"-credential_file=/secrets/cloudsql/credentials.json"]
volumeMounts:
- name: cloudsql-instance-credentials
mountPath: /secrets/cloudsql
readOnly: true
- name: colors-app
image: docker.io/stabenfeldt/colors:1
imagePullPolicy: Always
env:
- name: RAILS_LOG_TO_STDOUT
value: "true"
- name: RAILS_ENV
value: development
- name: POSTGRES_HOST
value: 127.0.0.1
- name: POSTGRES_USERNAME
valueFrom:
secretKeyRef:
name: cloudsql-db-credentials
key: username
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: cloudsql-db-credentials
key: password
ports:
- name: http-server
containerPort: 3000
Your POSTGRES_HOST environment variable needs to be localhost instead of 127.0.0.01:5432. You do not need to add port in the POSTGRES_HOST