I would like to run nexus3 within the Google Container Engine.
I created a persistent disk and configured the following deployment file:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: nexus3
  labels:
    app: nexus3
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nexus3
  template:
    metadata:
      labels:
        app: nexus3
        tier: web
    spec:
      containers:
        - image: gcr.io/nexustest-182520/nexus3:3.6.0
          name: nexus3
          volumeMounts:
            - mountPath: /nexus-data
              name: nexus3-persistent-storage
          ports:
            - containerPort: 8081
      volumes:
        - name: nexus3-persistent-storage
          gcePersistentDisk:
            pdName: nexus3-disk
            fsType: ext4
The deployment fails with this problem:
kubectl get pods -o=wide
NAME READY STATUS RESTARTS AGE IP NODE
nexus3-1260341461-mj7rf 0/1 Error 2 36s x.x.x.x gke-nexus-cluster-default-pool-9a58e4f2-p1t9
kubectl describe po/nexus3-1260341461-mj7rf
[...]
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
1m 1m 1 default-scheduler Normal Scheduled Successfully assigned nexus3-1260341461-mj7rf to gke-nexus-cluster-default-pool-9a58e4f2-p1t9
1m 1m 1 kubelet, gke-nexus-cluster-default-pool-9a58e4f2-p1t9 Normal SuccessfulMountVolume MountVolume.SetUp succeeded for volume "default-token-gsnbn"
1m 1m 1 kubelet, gke-nexus-cluster-default-pool-9a58e4f2-p1t9 Normal SuccessfulMountVolume MountVolume.SetUp succeeded for volume "nexus3-persistent-storage"
1m 12s 4 kubelet, gke-nexus-cluster-default-pool-9a58e4f2-p1t9 spec.containers{nexus3} Normal Pulled Container image "gcr.io/nexustest-182520/nexus3:3.6.0" already present on machine
1m 12s 4 kubelet, gke-nexus-cluster-default-pool-9a58e4f2-p1t9 spec.containers{nexus3} Normal Created Created container
1m 12s 4 kubelet, gke-nexus-cluster-default-pool-9a58e4f2-p1t9 spec.containers{nexus3} Normal Started Started container
56s 8s 4 kubelet, gke-nexus-cluster-default-pool-9a58e4f2-p1t9 spec.containers{nexus3} Warning BackOff Back-off restarting failed container
56s 8s 4 kubelet, gke-nexus-cluster-default-pool-9a58e4f2-p1t9 Warning FailedSync Error syncing pod
I think the restart happens because nexus itself is not able to start.
I found this in the logs:
mkdir: cannot create directory '../sonatype-work/nexus3/log': Permission denied
and
Unable to update instance pid: Unable to create directory /nexus-data/instances
Where is my mistake? What needs to be done to let Nexus write to the disk and the folder?
Best,
Lars
Well, I solved it myself directly after creating the question. :)
According to https://github.com/sonatype/docker-nexus3, the application runs as a user other than root.
Adding this to the deployment file did the trick:
spec:
  securityContext:
    fsGroup: 200
The fsGroup parameter, as indicated by reschifl, is an elegant solution, but it did not work for me. As an alternative, I launch an initContainer that fixes the permissions so that Nexus can start. It is described here:
https://github.com/sonatype/docker-nexus/issues/31
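The initContainer workaround mentioned above might look roughly like the sketch below. It shows only the pod template's spec section. The busybox image, the container name fix-permissions, and the 200:200 owner are assumptions for illustration (200 mirrors the fsGroup value used in the other answer); the volume name, mount path, and Nexus image come from the deployment in the question, and the volumes section stays as it is there.

```yaml
# Sketch only: pod template spec with an init container that fixes ownership
# of the mounted disk before Nexus starts. Merge into the existing Deployment.
spec:
  initContainers:
    - name: fix-permissions            # hypothetical name
      image: busybox                   # assumption: any small image that ships chown
      # Assumption: the Nexus process runs with UID/GID 200
      command: ["sh", "-c", "chown -R 200:200 /nexus-data"]
      volumeMounts:
        - name: nexus3-persistent-storage
          mountPath: /nexus-data
  containers:
    - name: nexus3
      image: gcr.io/nexustest-182520/nexus3:3.6.0
      volumeMounts:
        - name: nexus3-persistent-storage
          mountPath: /nexus-data
```

The init container runs to completion before the nexus3 container is started, so the directory is already writable when Nexus tries to create its log and instance folders.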
kubectl get all -n migration:
NAME READY STATUS RESTARTS AGE
pod/nginx2-7b8667968c-zxtq7 0/1 ImagePullBackOff 0 5m38s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx2 0/1 1 0 5m38s
NAME DESIRED CURRENT READY AGE
replicaset.apps/nginx2-7b8667968c 1 1 0 5m38s
kubectl describe pod nginx2-7b8667968c-zxtq7 -n migration:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 44s default-scheduler Successfully assigned migration/nginx2-7b8667968c-zxtq7 to k8s-master01
Normal SandboxChanged 33s kubelet Pod sandbox changed, it will be killed and re-created.
Normal Pulling 18s (x2 over 43s) kubelet Pulling image "nginx"
Warning Failed 14s (x2 over 34s) kubelet Failed to pull image "nginx": rpc error: code = Unknown desc = Error response from daemon: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
Warning Failed 14s (x2 over 34s) kubelet Error: ErrImagePull
Normal BackOff 3s (x4 over 32s) kubelet Back-off pulling image "nginx"
Warning Failed 3s (x4 over 32s) kubelet Error: ImagePullBackOff
After logging in with a different Docker account:
I can manually pull an image from Docker Hub using docker pull nginx.
But when I create the deployment, it shows the same error again.
The deployment YAML is:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx2
  namespace: migration
spec:
  replicas: 1
  selector:
    matchLabels:
      name: nginx2
  template:
    metadata:
      labels:
        name: nginx2
    spec:
      containers:
        - name: nginx2
          imagePullPolicy: Always
          image: nginx
          ports:
            - containerPort: 3000
          volumeMounts:
            - name: game-demo
              mountPath: /usr/src/app/config
            - name: secret-basic-auth
              mountPath: /usr/src/app/secret
            - name: site-data2
              mountPath: /var/www/html
      volumes:
        - name: game-demo
          configMap:
            name: game-demo
        - name: secret-basic-auth
          secret:
            secretName: secret-basic-auth
        - name: site-data2
          persistentVolumeClaim:
            claimName: demo-pvc-claim2
Also, as the nginx image is present locally, I tried changing imagePullPolicy to Never as well as IfNotPresent.
But nothing works. Please advise.
Here is your error:
Warning Failed 14s (x2 over 34s) kubelet Failed to pull image "nginx": rpc error: code = Unknown desc = Error response from daemon: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
Basically, Docker Hub now enforces pull rate limits, and you have to authenticate or upgrade to a paid plan to raise them.
You can use the publicly hosted "nginx" image from Amazon's ECR instead:
docker pull public.ecr.aws/nginx/nginx:latest
That should be the same as the one you are using; just double-check over here:
https://gallery.ecr.aws/nginx/nginx
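Applied to the deployment from the question, the change would presumably be a one-line swap of the image field. The fragment below is only a sketch of the containers section; everything except the image value is copied from the question, and whether the :latest tag suits your setup is an assumption you should check against the gallery page.

```yaml
# Sketch only: containers section of the nginx2 Deployment, pulling from
# the public ECR gallery instead of Docker Hub.
spec:
  containers:
    - name: nginx2
      imagePullPolicy: Always
      image: public.ecr.aws/nginx/nginx:latest   # was: nginx (Docker Hub)
      ports:
        - containerPort: 3000
```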
I set up a Harbor registry, which has worked successfully for a couple of weeks. For each deployment and namespace I have a secret with the credentials from my ~/.docker/config.json file to get access to the registry. Since last weekend I have not been able to pull images from that registry anymore, and I didn't change anything! The cluster is running on GKE v1.12.5, by the way.
What works?
I can pull and push images from my local machine with docker.
What does not work?
My Kubernetes cluster cannot pull images anymore and runs into a timeout.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 13m default-scheduler Successfully assigned k8s-test7/nginx-k8s-test7-6f7b8fdd79-2ffmp to gke-k8s-cloudops-test-default-pool-72fccd21-hrhk
Normal SandboxChanged 12m kubelet, gke-k8s-cloudops-test-default-pool-72fccd21-hrhk Pod sandbox changed, it will be killed and re-created.
Warning Failed 11m (x3 over 12m) kubelet, gke-k8s-cloudops-test-default-pool-72fccd21-hrhk Failed to pull image "core.k8s-harbor-test.my-domain.com/nginx-test/nginx:1.15.10": rpc error: code = Unknown desc = Error response from daemon: Get https://core.k8s-harbor-test.my-domain.com/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Failed 11m (x3 over 12m) kubelet, gke-k8s-cloudops-test-default-pool-72fccd21-hrhk Error: ErrImagePull
Normal BackOff 11m (x7 over 12m) kubelet, gke-k8s-cloudops-test-default-pool-72fccd21-hrhk Back-off pulling image "core.k8s-harbor-test.my-domain.com/nginx-test/nginx:1.15.10"
Normal Pulling 10m (x4 over 13m) kubelet, gke-k8s-cloudops-test-default-pool-72fccd21-hrhk pulling image "core.k8s-harbor-test.my-domain.com/nginx-test/nginx:1.15.10"
Warning Failed 3m2s (x38 over 12m) kubelet, gke-k8s-cloudops-test-default-pool-72fccd21-hrhk Error: ImagePullBackOff
deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx-k8s-test7
  namespace: k8s-test7
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx-k8s-test7
    spec:
      containers:
        - name: nginx-k8s-test7
          image: core.k8s-harbor-test.my-domain.com/nginx-test/nginx:1.15.10
          volumeMounts:
            - name: webcontent
              mountPath: /usr/share/nginx/html
          ports:
            - containerPort: 80
      volumes:
        - name: webcontent
          configMap:
            name: webcontent
      imagePullSecrets:
        - name: harborcred
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: webcontent
  namespace: k8s-test7
  annotations:
    volume.alpha.kubernetes.io/storage-class: default
spec:
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 5Gi
The secret "harborcred" exists in every namespace so that the deployment can access it. The secret was created according to the Kubernetes documentation:
https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
kubectl create secret generic harborcred \
--from-file=.dockerconfigjson=~/.docker/config.json \
--type=kubernetes.io/dockerconfigjson \
--namespace=k8s-test7
Any help would be appreciated!
Hi, at first look, could you please:
Change the image source to a public one, e.g. nginx, to verify that your deployment doesn't have other issues.
https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/ also provides more details about inspecting the Secret.
Please also run additional connectivity tests directly from your node, as described in this post: How to debug "ImagePullBackOff"?
Additional steps to find the root cause:
1. Decode your secret's data:
kubectl get secret harborcred -n k8s-test7 \
  --output="jsonpath={.data.\.dockerconfigjson}" | base64 --decode
2. Compare the result of decoding the "auth" field from step 1 with your Docker credentials using:
echo "your auth data" | base64 --decode
3. To find the root cause please use also:
kubectl get events -n k8s-test7 | grep pull
Please share your logs.
I have 2 similar deployments on k8s that pull the same image from GitLab. Apparently this caused my second deployment to go into CrashLoopBackOff, and I can't connect to the port to check the /healthz endpoint of my pod. The pod's logs show that it received an interrupt signal, while describing the pod shows the following messages.
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
29m 29m 1 default-scheduler Normal Scheduled Successfully assigned java-kafka-rest-kafka-data-2-development-5c6f7f597-5t2mr to 172.18.14.110
29m 29m 1 kubelet, 172.18.14.110 Normal SuccessfulMountVolume MountVolume.SetUp succeeded for volume "default-token-m4m55"
29m 29m 1 kubelet, 172.18.14.110 spec.containers{consul} Normal Pulled Container image "..../consul-image:0.0.10" already present on machine
29m 29m 1 kubelet, 172.18.14.110 spec.containers{consul} Normal Created Created container
29m 29m 1 kubelet, 172.18.14.110 spec.containers{consul} Normal Started Started container
28m 28m 1 kubelet, 172.18.14.110 spec.containers{java-kafka-rest-development} Normal Killing Killing container with id docker://java-kafka-rest-development:Container failed liveness probe.. Container will be killed and recreated.
29m 28m 2 kubelet, 172.18.14.110 spec.containers{java-kafka-rest-development} Normal Created Created container
29m 28m 2 kubelet, 172.18.14.110 spec.containers{java-kafka-rest-development} Normal Started Started container
29m 27m 10 kubelet, 172.18.14.110 spec.containers{java-kafka-rest-development} Warning Unhealthy Readiness probe failed: Get http://10.5.59.35:7533/healthz: dial tcp 10.5.59.35:7533: getsockopt: connection refused
28m 24m 13 kubelet, 172.18.14.110 spec.containers{java-kafka-rest-development} Warning Unhealthy Liveness probe failed: Get http://10.5.59.35:7533/healthz: dial tcp 10.5.59.35:7533: getsockopt: connection refused
29m 19m 8 kubelet, 172.18.14.110 spec.containers{java-kafka-rest-development} Normal Pulled Container image "r..../java-kafka-rest:0.3.2-dev" already present on machine
24m 4m 73 kubelet, 172.18.14.110 spec.containers{java-kafka-rest-development} Warning BackOff Back-off restarting failed container
I have tried redeploying the deployments with different images, and that seems to work just fine. However, I don't think this is efficient, as the images are otherwise the same. How should I go about this?
Here's what my deployment file looks like:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: "java-kafka-rest-kafka-data-2-development"
  labels:
    repository: "java-kafka-rest"
    project: "java-kafka-rest"
    service: "java-kafka-rest-kafka-data-2"
    env: "development"
spec:
  replicas: 1
  selector:
    matchLabels:
      repository: "java-kafka-rest"
      project: "java-kafka-rest"
      service: "java-kafka-rest-kafka-data-2"
      env: "development"
  template:
    metadata:
      labels:
        repository: "java-kafka-rest"
        project: "java-kafka-rest"
        service: "java-kafka-rest-kafka-data-2"
        env: "development"
        release: "0.3.2-dev"
    spec:
      imagePullSecrets:
        - name: ...
      containers:
        - name: java-kafka-rest-development
          image: registry...../java-kafka-rest:0.3.2-dev
          env:
            - name: DEPLOYMENT_COMMIT_HASH
              value: "0.3.2-dev"
            - name: DEPLOYMENT_PORT
              value: "7533"
          livenessProbe:
            httpGet:
              path: /healthz
              port: 7533
            initialDelaySeconds: 30
            timeoutSeconds: 1
          readinessProbe:
            httpGet:
              path: /healthz
              port: 7533
            timeoutSeconds: 1
          ports:
            - containerPort: 7533
          resources:
            requests:
              cpu: 0.5
              memory: 6Gi
            limits:
              cpu: 3
              memory: 10Gi
          command:
            - /envconsul
            - -consul=127.0.0.1:8500
            - -sanitize
            - -upcase
            - -prefix=java-kafka-rest/
            - -prefix=java-kafka-rest/kafka-data-2
            - java
            - -jar
            - /build/libs/java-kafka-rest-0.3.2-dev.jar
          securityContext:
            readOnlyRootFilesystem: true
        - name: consul
          image: registry.../consul-image:0.0.10
          env:
            - name: SERVICE_NAME
              value: java-kafka-rest-kafka-data-2
            - name: SERVICE_ENVIRONMENT
              value: development
            - name: SERVICE_PORT
              value: "7533"
            - name: CONSUL1
              valueFrom:
                configMapKeyRef:
                  name: consul-config-...
                  key: node1
            - name: CONSUL2
              valueFrom:
                configMapKeyRef:
                  name: consul-config-...
                  key: node2
            - name: CONSUL3
              valueFrom:
                configMapKeyRef:
                  name: consul-config-...
                  key: node3
            - name: CONSUL_ENCRYPT
              valueFrom:
                configMapKeyRef:
                  name: consul-config-...
                  key: encrypt
          ports:
            - containerPort: 8300
            - containerPort: 8301
            - containerPort: 8302
            - containerPort: 8400
            - containerPort: 8500
            - containerPort: 8600
          command: [ entrypoint, agent, -config-dir=/config, -join=$(CONSUL1), -join=$(CONSUL2), -join=$(CONSUL3), -encrypt=$(CONSUL_ENCRYPT) ]
      terminationGracePeriodSeconds: 30
      nodeSelector:
        env: ...
To those having this problem: I found the cause and the solution to my question. The problem was in my service.yml, where targetPort pointed to a port different from the one I opened in my Docker image. Make sure the Service's targetPort matches the port that is opened in the Docker image.
Hope this helps.
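For illustration, a Service whose targetPort lines up with the containerPort from the deployment above could look like the sketch below. The original service.yml was not posted, so the Service name and the external port 80 are assumptions; the selector labels and port 7533 are taken from the deployment in the question.

```yaml
# Sketch only: the real service.yml is not shown in the question.
apiVersion: v1
kind: Service
metadata:
  name: java-kafka-rest-kafka-data-2     # hypothetical name
spec:
  selector:
    service: "java-kafka-rest-kafka-data-2"
    env: "development"
  ports:
    - port: 80             # hypothetical port exposed by the Service
      targetPort: 7533     # must match the containerPort the app listens on
```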
You can also check the logs of the pod.
For me, the error showed up in the pod logs:
kubectl logs <pod> -n your-namespace
In my Docker image I have a directory /opt/myapp/etc which contains some files and directories. I want to create a StatefulSet for my app. In that StatefulSet I create a persistent volume claim and attach it to /opt/myapp/etc. The StatefulSet YAML is below. Can anyone tell me how to attach the volume to the container in this case?
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: statefulset
  labels:
    app: myapp
spec:
  serviceName: myapp
  replicas: 1
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - image: 10.1.23.5:5000/redis
          name: redis
          ports:
            - containerPort: 6379
              name: redis-port
        - image: 10.1.23.5:5000/myapp:18.1
          name: myapp
          ports:
            - containerPort: 8181
              name: port
          volumeMounts:
            - name: data
              mountPath: /opt/myapp/etc
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: standard
        resources:
          requests:
            storage: 5Gi
Here is the output of kubectl describe pod:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 3m (x4 over 3m) default-scheduler pod has unbound PersistentVolumeClaims
Normal Scheduled 3m default-scheduler Successfully assigned controller-statefulset-0 to dev-k8s-2
Normal SuccessfulMountVolume 3m kubelet, dev-k8s-2 MountVolume.SetUp succeeded for volume "default-token-xpskd"
Normal SuccessfulAttachVolume 3m attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-77d2cef8-a674-11e8-9358-fa163e3294c1"
Normal SuccessfulMountVolume 3m kubelet, dev-k8s-2 MountVolume.SetUp succeeded for volume "pvc-77d2cef8-a674-11e8-9358-fa163e3294c1"
Normal Pulling 2m kubelet, dev-k8s-2 pulling image "10.1.23.5:5000/redis"
Normal Pulled 2m kubelet, dev-k8s-2 Successfully pulled image "10.1.23.5:5000/redis"
Normal Created 2m kubelet, dev-k8s-2 Created container
Normal Started 2m kubelet, dev-k8s-2 Started container
Normal Pulled 1m (x4 over 2m) kubelet, dev-k8s-2 Container image "10.1.23.5:5000/myapp:18.1" already present on machine
Normal Created 1m (x4 over 2m) kubelet, dev-k8s-2 Created container
Normal Started 1m (x4 over 2m) kubelet, dev-k8s-2 Started container
Warning BackOff 1m (x7 over 2m) kubelet, dev-k8s-2 Back-off restarting failed container
StorageClass definition:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: standard
  namespace: controller
provisioner: kubernetes.io/cinder
reclaimPolicy: Retain
parameters:
  availability: nova
Check whether you have a storage class defined in your cluster:
kubectl get storageclass
If you are using the default storage class, such as host-path in the case of minikube, you do not need to include a storage class in your template:
volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 5Gi
When no storage class is specified, Kubernetes provisions the persistent volume with the default storage class, which is host-path in the case of minikube. Note that /opt/myapp/etc is the mount path inside the container; it does not have to exist on the node where the pod is scheduled.
Kubernetes will not allow mounting two volumes on the same directory, and a volume mounted on a directory hides the files that were already there.
In my case the Docker image had some files in the etc directory, which were no longer visible after mounting the volume. I solved the problem using subPath.
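The exact subPath setup was not posted, so the following is only a sketch of one way it might look: instead of mounting the claim over the whole /opt/myapp/etc directory, only one subdirectory is mounted, which leaves the other files shipped in the image visible. The subdirectory name "data" under /opt/myapp/etc and the subPath value are hypothetical; the volume name comes from the StatefulSet in the question.

```yaml
# Sketch only: volumeMounts of the myapp container, using subPath so that
# the volume covers a single subdirectory rather than all of /opt/myapp/etc.
volumeMounts:
  - name: data
    mountPath: /opt/myapp/etc/data   # hypothetical subdirectory of etc
    subPath: data                    # hypothetical directory inside the volume
```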
Using a Kubernetes cluster with 3 hosts (1 master and 2 nodes).
Kubernetes version: 1.7
Deployed a Rails app to the Kubernetes cluster.
Here is the deployment.yaml file:
apiVersion: v1
kind: Service
metadata:
  name: server
  labels:
    app: server
spec:
  ports:
    - port: 80
  selector:
    app: server
    tier: backend
  type: LoadBalancer
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: server
  labels:
    app: server
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: server
        tier: backend
    spec:
      containers:
        - image: 192.168.33.13/myapp/server
          name: server
          ports:
            - containerPort: 3000
              name: server
          imagePullPolicy: Always
Deploy it:
$ kubectl create -f deployment.yaml
Then check the pods status:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
server-962161505-kw3jf 0/1 CrashLoopBackOff 6 9m
server-962161505-lxcfb 0/1 CrashLoopBackOff 6 9m
server-962161505-mbnkn 0/1 CrashLoopBackOff 6 9m
At the beginning the status was Completed, but it soon changed to CrashLoopBackOff. Is there anything wrong in the config YAML file?
(By the way, I don't want to run an entrypoint.sh script here; instead I use a job.yaml file with a Kubernetes Job to do that.)
Edit
Result of kubectl describe pod server-962161505-kw3jf:
Name: server-962161505-kw3jf
Namespace: default
Node: node1/192.168.33.11
Start Time: Mon, 13 Nov 2017 17:45:47 +0900
Labels: app=server
pod-template-hash=962161505
tier=backend
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"server-962161505","uid":"0acadda6-c84f-11e7-84b8-02178ad2db9a","...
Status: Running
IP: 10.42.254.104
Created By: ReplicaSet/server-962161505
Controlled By: ReplicaSet/server-962161505
Containers:
server:
Container ID: docker://29eca3d9a20c60c83314101b036d742c5868c3bf25a39f28c5e4208bcdbfcede
Image: 192.168.33.13/myapp/server
Image ID: docker-pullable://192.168.33.13/myapp/server@sha256:0e056e3ff5b1f1084e0946bc4211d33c6f48bc06dba7e07340c1609bbd5513d6
Port: 3000/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Tue, 14 Nov 2017 10:13:12 +0900
Finished: Tue, 14 Nov 2017 10:13:13 +0900
Ready: False
Restart Count: 26
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-csjqn (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
default-token-csjqn:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-csjqn
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.alpha.kubernetes.io/notReady:NoExecute for 300s
node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulMountVolume 22m kubelet, node1 MountVolume.SetUp succeeded for volume "default-token-csjqn"
Normal SandboxChanged 22m kubelet, node1 Pod sandbox changed, it will be killed and re-created.
Warning Failed 20m (x3 over 21m) kubelet, node1 Failed to pull image "192.168.33.13/myapp/server": rpc error: code = 2 desc = Error response from daemon: {"message":"Get http://192.168.33.13/v2/: dial tcp 192.168.33.13:80: getsockopt: connection refused"}
Normal BackOff 20m (x5 over 21m) kubelet, node1 Back-off pulling image "192.168.33.13/myapp/server"
Normal Pulling 4m (x7 over 21m) kubelet, node1 pulling image "192.168.33.13/myapp/server"
Normal Pulled 4m (x4 over 20m) kubelet, node1 Successfully pulled image "192.168.33.13/myapp/server"
Normal Created 4m (x4 over 20m) kubelet, node1 Created container
Normal Started 4m (x4 over 20m) kubelet, node1 Started container
Warning FailedSync 10s (x99 over 21m) kubelet, node1 Error syncing pod
Warning BackOff 10s (x91 over 20m) kubelet, node1 Back-off restarting failed container