Harbor Registry: Failed to pull image in Kubernetes - docker

I set up a Harbor registry which has worked fine for a couple of weeks. For each deployment and namespace I have a secret with the credentials from my ~/.docker/config.json file to get access to the registry. Since last weekend I have not been able to pull images from that registry anymore, and I didn't change anything! The cluster is running on GKE v1.12.5, btw.
What works?
I can pull and push images from my local machine with Docker.
What does not work?
My Kubernetes cluster cannot pull images anymore and runs into a timeout.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 13m default-scheduler Successfully assigned k8s-test7/nginx-k8s-test7-6f7b8fdd79-2ffmp to gke-k8s-cloudops-test-default-pool-72fccd21-hrhk
Normal SandboxChanged 12m kubelet, gke-k8s-cloudops-test-default-pool-72fccd21-hrhk Pod sandbox changed, it will be killed and re-created.
Warning Failed 11m (x3 over 12m) kubelet, gke-k8s-cloudops-test-default-pool-72fccd21-hrhk Failed to pull image "core.k8s-harbor-test.my-domain.com/nginx-test/nginx:1.15.10": rpc error: code = Unknown desc = Error response from daemon: Get https://core.k8s-harbor-test.my-domain.com/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Failed 11m (x3 over 12m) kubelet, gke-k8s-cloudops-test-default-pool-72fccd21-hrhk Error: ErrImagePull
Normal BackOff 11m (x7 over 12m) kubelet, gke-k8s-cloudops-test-default-pool-72fccd21-hrhk Back-off pulling image "core.k8s-harbor-test.my-domain.com/nginx-test/nginx:1.15.10"
Normal Pulling 10m (x4 over 13m) kubelet, gke-k8s-cloudops-test-default-pool-72fccd21-hrhk pulling image "core.k8s-harbor-test.my-domain.com/nginx-test/nginx:1.15.10"
Warning Failed 3m2s (x38 over 12m) kubelet, gke-k8s-cloudops-test-default-pool-72fccd21-hrhk Error: ImagePullBackOff
deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx-k8s-test7
  namespace: k8s-test7
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx-k8s-test7
    spec:
      containers:
      - name: nginx-k8s-test7
        image: core.k8s-harbor-test.my-domain.com/nginx-test/nginx:1.15.10
        volumeMounts:
        - name: webcontent
          mountPath: /usr/share/nginx/html
        ports:
        - containerPort: 80
      volumes:
      - name: webcontent
        configMap:
          name: webcontent
      imagePullSecrets:
      - name: harborcred
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: webcontent
  namespace: k8s-test7
  annotations:
    volume.alpha.kubernetes.io/storage-class: default
spec:
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 5Gi
The secret "harborcred" is part of every namespace so that the deployment can access it. The secret was created per kubernetes documentation:
https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
kubectl create secret generic harborcred \
  --from-file=.dockerconfigjson=~/.docker/config.json \
  --type=kubernetes.io/dockerconfigjson \
  --namespace=k8s-test7
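For reference, an equivalent secret could also be created directly from the registry credentials instead of ~/.docker/config.json (a sketch; the registry host is taken from the image reference above, and the username/password/email values are placeholders):
kubectl create secret docker-registry harborcred \
  --docker-server=core.k8s-harbor-test.my-domain.com \
  --docker-username=<harbor-username> \
  --docker-password=<harbor-password> \
  --docker-email=<email> \
  --namespace=k8s-test7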
Any help would be appreciated!

Hi, at first glance, could you please:
Change the image source to a public one, e.g. nginx, to verify your deployment doesn't have other issues.
https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/ also provides more details about inspecting the Secret.
Please also perform additional connectivity tests directly from your node, as described in the post "How to debug ImagePullBackOff?".
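A minimal connectivity check from the affected GKE node might look like this (a sketch; the node and registry names are taken from the events above, and you may need to add --zone to the gcloud commands):
# Probe the registry's /v2/ endpoint from the node that failed the pull
gcloud compute ssh gke-k8s-cloudops-test-default-pool-72fccd21-hrhk \
  --command='curl -v --max-time 10 https://core.k8s-harbor-test.my-domain.com/v2/'

# Try the same pull the kubelet attempts, using the node's Docker daemon
gcloud compute ssh gke-k8s-cloudops-test-default-pool-72fccd21-hrhk \
  --command='sudo docker pull core.k8s-harbor-test.my-domain.com/nginx-test/nginx:1.15.10'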
Additional steps to find the root cause:
1. Decode your secret's data:
kubectl get secret harborcred -n k8s-test7 \
  --output="jsonpath={.data.\.dockerconfigjson}" | base64 --decode
2. Compare the decoded "auth" field from step 1 with your docker credentials using:
echo "your auth data" | base64 --decode
3. To help find the root cause, please also check:
kubectl get events -n k8s-test7 | grep pull
Please share your logs.

Related

private docker registry for gitlab secret not working in kubernetes

I am trying to set up a GitLab private registry for my Kubernetes container images.
I've cut the irrelevant code out below.
My replica set is defined as:
kind: ReplicaSet
...
spec:
  containers:
  - name: redacted
    image: registry.gitlab.com/redacted/redacted/redacted:latest
    ports:
    - containerPort: 8080
    volumeMounts:
    - name: redacted-data
      mountPath: /var/www/html
  imagePullSecrets:
  - name: github-auth
...
I'm setting my secret with the following kubectl command:
kubectl create -n redacted secret docker-registry gitlab-auth \
  --docker-server="registry.gitlab.com:5000" \
  --docker-username="redacted" \
  --docker-password="redacted" \
  --docker-email="redacted" \
  --namespace="redacted"
Here is the failing container output:
Name: redacted-cgbrk
...
Containers:
redacted:
Container ID:
Image: registry.gitlab.com/redacted/redacted/redacted:latest
Image ID:
Port: 8080/TCP
Host Port: 0/TCP
State: Waiting
Reason: ErrImagePull
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-qv24l (ro)
/var/www/html from redacted-data (rw)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 64s default-scheduler 0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.
Normal Scheduled 62s default-scheduler Successfully assigned redacted/redacted-cgbrk to pool-2t9lbcb5l-7d37n
Normal SuccessfulAttachVolume 55s attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-6c4aac85-bb60-44e8-b557-7f65d62543fa"
Normal Pulling 16s (x3 over 54s) kubelet Pulling image "registry.gitlab.com/redacted/mpro/redacted:latest"
Warning Failed 16s (x3 over 54s) kubelet Failed to pull image "registry.gitlab.com/redacted/redacted/redacted:latest": rpc error: code = Unknown desc = failed to pull and unpack image "registry.gitlab.com/redacted/redacted/redacted:latest": failed to resolve reference "registry.gitlab.com/redacted/redacted/redacted:latest": failed to authorize: failed to fetch anonymous token: unexpected status: 403 Forbidden
...
Kubernetes uses authentication separate from docker login. Check that you have configured Kubernetes with the required credentials so it can pull from your private registries.
Follow the steps below (from the Kubernetes docs; the same flow applies to a GitLab registry):
1) Log in to Docker Hub (or your private registry)
2) Create a Secret based on existing credentials
3) Create a Secret by providing credentials on the command line
4) Inspect the Secret regcred
5) Create a Pod that uses your Secret
Please see the Kubernetes documentation here: Pull an Image from a Private Registry for more information.
Also refer to this similar SO question for more information.
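A rough sketch of steps 2-4 using the names from the question (secret gitlab-auth in namespace redacted; credential values are placeholders):
# --docker-server should match the registry host used in the image reference (registry.gitlab.com)
kubectl create secret docker-registry gitlab-auth \
  --docker-server=registry.gitlab.com \
  --docker-username=<gitlab-username> \
  --docker-password=<gitlab-password-or-token> \
  --docker-email=<email> \
  --namespace=redacted

# Inspect the secret and decode the stored .dockerconfigjson
kubectl get secret gitlab-auth -n redacted \
  --output="jsonpath={.data.\.dockerconfigjson}" | base64 --decode
Also make sure the name listed under imagePullSecrets in the ReplicaSet matches the secret that actually exists in that namespace.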

Minikube: Can't pull image from private repo on dockerhub

I have pushed some docker images to my private repo on dockerhub, which I am now trying to use to create deployments in a Minikube Kubernetes cluster.
I have done the following:
docker login -u [username] -p [password]
docker tag [mslearn-microservices-pizzabackend] [username]/[mslearn-microservices-pizzabackend]
docker tag [mslearn-microservices-pizzafrontend] [username]/[mslearn-microservices-pizzafrontend]
I can see both the images in my private dockerhub repo. To be able to use them in a deployment, I have done the following:
kubectl create secret docker-registry dockerhub-credentials --docker-server="docker.io" --docker-username="[username]" --docker-password="[password]" --docker-email="[email]"
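One way to sanity-check what that secret actually contains (a sketch using the names from the commands above) is to decode it and compare the registry key and auth entry against what docker login wrote locally:
kubectl get secret dockerhub-credentials \
  --output="jsonpath={.data.\.dockerconfigjson}" | base64 --decode

# Compare with the local entry created by docker login (usually keyed as https://index.docker.io/v1/)
cat ~/.docker/config.json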
After that, I try to create a deployment for the first image using the following manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mslearn-microservices-pizzabackend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mslearn-microservices-pizzabackend
  template:
    metadata:
      labels:
        app: mslearn-microservices-pizzabackend
    spec:
      imagePullSecrets:
      - name: dockerhub-credentials
      containers:
      - name: mslearn-microservices-pizzabackend
        image: [username]/mslearn-microservices-pizzabackend:latest
        ports:
        - containerPort: 80
        env:
        - name: ASPNETCORE_URLS
          value: http://*:80
---
apiVersion: v1
kind: Service
metadata:
  name: mslearn-microservices-pizzabackend
spec:
  type: ClusterIP
  ports:
  - port: 80
  selector:
    app: mslearn-microservices-pizzabackend
But when I check the events of the pod that gets created by the deployment, I can see the following:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 18s default-scheduler Successfully assigned default/mslearn-microservices-pizzabackend-79dcd6677d-cgh7z to minikube
Normal BackOff 15s kubelet Back-off pulling image "[username]/mslearn-microservices-pizzabackend:latest"
Warning Failed 15s kubelet Error: ImagePullBackOff
Normal Pulling 3s (x2 over 18s) kubelet Pulling image "[username]/mslearn-microservices-pizzabackend:latest"
Warning Failed 1s (x2 over 16s) kubelet Failed to pull image "[username]/mslearn-microservices-pizzabackend:latest": rpc error: code = Unknown desc = Error response from daemon: pull access denied for [username]/mslearn-microservices-pizzabackend, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
Warning Failed 1s (x2 over 16s) kubelet Error: ErrImagePull
I have tried searching for solutions on the web and I can see that other people have had similar issues, but none of their solutions have worked for me.
Any suggestions?

How to resolve pods failing due to exceeding pull rate limit from docker?

kubectl get all -n migration:
NAME READY STATUS RESTARTS AGE
pod/nginx2-7b8667968c-zxtq7 0/1 ImagePullBackOff 0 5m38s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx2 0/1 1 0 5m38s
NAME DESIRED CURRENT READY AGE
replicaset.apps/nginx2-7b8667968c 1 1 0 5m38s
kubectl describe pod nginx2-7b8667968c-zxtq7 -n migration:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 44s default-scheduler Successfully assigned migration/nginx2-7b8667968c-zxtq7 to k8s-master01
Normal SandboxChanged 33s kubelet Pod sandbox changed, it will be killed and re-created.
Normal Pulling 18s (x2 over 43s) kubelet Pulling image "nginx"
Warning Failed 14s (x2 over 34s) kubelet Failed to pull image "nginx": rpc error: code = Unknown desc = Error response from daemon: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
Warning Failed 14s (x2 over 34s) kubelet Error: ErrImagePull
Normal BackOff 3s (x4 over 32s) kubelet Back-off pulling image "nginx"
Warning Failed 3s (x4 over 32s) kubelet Error: ImagePullBackOff
After logging in with a different Docker account:
I can manually pull an image from Docker Hub using docker pull nginx.
But when I create the deployment, it shows the same error again.
Deployment yaml is as:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx2
  namespace: migration
spec:
  replicas: 1
  selector:
    matchLabels:
      name: nginx2
  template:
    metadata:
      labels:
        name: nginx2
    spec:
      containers:
      - name: nginx2
        imagePullPolicy: Always
        image: nginx
        ports:
        - containerPort: 3000
        volumeMounts:
        - name: game-demo
          mountPath: /usr/src/app/config
        - name: secret-basic-auth
          mountPath: /usr/src/app/secret
        - name: site-data2
          mountPath: /var/www/html
      volumes:
      - name: game-demo
        configMap:
          name: game-demo
      - name: secret-basic-auth
        secret:
          secretName: secret-basic-auth
      - name: site-data2
        persistentVolumeClaim:
          claimName: demo-pvc-claim2
Also, since the nginx image is present locally, I tried modifying imagePullPolicy to Never as well as IfNotPresent.
But nothing works. Please guide.
Here is your error:
Warning Failed 14s (x2 over 34s) kubelet Failed to pull image "nginx": rpc error: code = Unknown desc = Error response from daemon: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
Basically, Docker Hub now rate-limits image pulls, and you have to authenticate (or upgrade to a paid plan) to raise the limit.
You can use the publicly hosted "nginx" image from Amazon's ECR Public Gallery instead:
docker pull public.ecr.aws/nginx/nginx:latest
That should be the same as the image you were using; just double-check here:
https://gallery.ecr.aws/nginx/nginx
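To switch the existing deployment over without editing the manifest, something like this should work (deployment, container, and namespace names taken from the question):
kubectl set image deployment/nginx2 nginx2=public.ecr.aws/nginx/nginx:latest -n migration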

Why is Kubernetes not attaching my secret to my pod?

I already created my secret as recommended by Kubernetes and followed the tutorial, but the pod does not have my secret attached.
As you can see, I created the secret and described it.
After that, I created my pod.
$ kubectl get secret my-secret --output="jsonpath={.data.\.dockerconfigjson}" | base64 --decode
{"auths":{"my-private-repo.com":{"username":"<username>","password":"<password>","email":"<email>","auth":"<randomAuth>="}}}
$ kubectl create -f my-pod.yaml
pod "my-pod" created
$ kubectl describe pods trunfo
Name: my-pod
Namespace: default
Node: gke-trunfo-default-pool-07eea2fb-3bh9/10.233.224.3
Start Time: Fri, 28 Sep 2018 16:41:59 -0300
Labels: <none>
Annotations: kubernetes.io/limit-ranger=LimitRanger plugin set: cpu request for container container-trunfo
Status: Pending
IP: 10.10.1.37
Containers:
container-trunfo:
Container ID:
Image: <my-image>
Image ID:
Port: 9898/TCP
State: Waiting
Reason: ErrImagePull
Ready: False
Restart Count: 0
Requests:
cpu: 100m
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-hz4mf (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
default-token-hz4mf:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-hz4mf
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4s default-scheduler Successfully assigned trunfo to gke-trunfo-default-pool-07eea2fb-3bh9
Normal SuccessfulMountVolume 4s kubelet, gke-trunfo-default-pool-07eea2fb-3bh9 MountVolume.SetUp succeeded for volume "default-token-hz4mf"
Normal Pulling 3s kubelet, gke-trunfo-default-pool-07eea2fb-3bh9 pulling image "my-private-repo.com/my-image:latest"
Warning Failed 3s kubelet, gke-trunfo-default-pool-07eea2fb-3bh9 Failed to pull image "my-private-repo.com/my-image:latest": rpc error: code = Unknown desc = Error response from daemon: Get https://my-private-repo.com/v1/_ping: dial tcp: lookup my-private-repo.com on 169.254.169.254:53: no such host
Warning Failed 3s kubelet, gke-trunfo-default-pool-07eea2fb-3bh9 Error: ErrImagePull
Normal BackOff 3s kubelet, gke-trunfo-default-pool-07eea2fb-3bh9 Back-off pulling image "my-private-repo.com/my-image:latest"
Warning Failed 3s kubelet, gke-trunfo-default-pool-07eea2fb-3bh9 Error: ImagePullBackOff
What can I do to fix it?
EDIT
This is my pod:
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: my-container
    image: my-private-repo/images/<my-image>
    ports:
    - containerPort: 9898
  imagePullSecrets:
  - name: my-secret
As we can see, the secret is defined as expected, but not attached correctly.
You did not get as far as secrets yet. Your logs say
Failed to pull image "my-private-repo.com/my-image:latest": rpc error: code = Unknown desc = Error response from daemon: Get https://my-private-repo.com/v1/_ping: dial tcp: lookup my-private-repo.com on 169.254.169.254:53: no such host
Warning Failed 3s kubelet, gke-trunfo-default-pool-07eea2fb-3bh9 Error: ErrImagePull
This means that your pod cannot even start because the image is not available. Fix that first, and if you still have a problem with secrets after you observe the pod in the "Ready" state, post your YAML definition.
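Since the error is a DNS lookup failure, a quick check is whether the registry hostname resolves at all; a throwaway busybox pod is one way to test it from inside the cluster (a sketch; the hostname is taken from the error message, and keep in mind the kubelet itself resolves names via the node's DNS):
kubectl run dns-check --rm -it --restart=Never --image=busybox -- \
  nslookup my-private-repo.com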

GKE: nexus disk not writable

I would like to run Nexus 3 on Google Container Engine.
I created a persistent disk and configured the following deployment file:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: nexus3
  labels:
    app: nexus3
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nexus3
  template:
    metadata:
      labels:
        app: nexus3
        tier: web
    spec:
      containers:
      - image: gcr.io/nexustest-182520/nexus3:3.6.0
        name: nexus3
        volumeMounts:
        - mountPath: /nexus-data
          name: nexus3-persistent-storage
        ports:
        - containerPort: 8081
      volumes:
      - name: nexus3-persistent-storage
        gcePersistentDisk:
          pdName: nexus3-disk
          fsType: ext4
The deployment fails with this problem:
kubectl get pods -o=wide
NAME READY STATUS RESTARTS AGE IP NODE
nexus3-1260341461-mj7rf 0/1 Error 2 36s x.x.x.x gke-nexus-cluster-default-pool-9a58e4f2-p1t9
kubectl describe po/nexus3-1260341461-mj7rf
[...]
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
1m 1m 1 default-scheduler Normal Scheduled Successfully assigned nexus3-1260341461-mj7rf to gke-nexus-cluster-default-pool-9a58e4f2-p1t9
1m 1m 1 kubelet, gke-nexus-cluster-default-pool-9a58e4f2-p1t9 Normal SuccessfulMountVolume MountVolume.SetUp succeeded for volume "default-token-gsnbn"
1m 1m 1 kubelet, gke-nexus-cluster-default-pool-9a58e4f2-p1t9 Normal SuccessfulMountVolume MountVolume.SetUp succeeded for volume "nexus3-persistent-storage"
1m 12s 4 kubelet, gke-nexus-cluster-default-pool-9a58e4f2-p1t9 spec.containers{nexus3} Normal Pulled Container image "gcr.io/nexustest-182520/nexus3:3.6.0" already present on machine
1m 12s 4 kubelet, gke-nexus-cluster-default-pool-9a58e4f2-p1t9 spec.containers{nexus3} Normal Created Created container
1m 12s 4 kubelet, gke-nexus-cluster-default-pool-9a58e4f2-p1t9 spec.containers{nexus3} Normal Started Started container
56s 8s 4 kubelet, gke-nexus-cluster-default-pool-9a58e4f2-p1t9 spec.containers{nexus3} Warning BackOff Back-off restarting failed container
56s 8s 4 kubelet, gke-nexus-cluster-default-pool-9a58e4f2-p1t9 Warning FailedSync Error syncing pod
I think the restart happens because nexus itself is not able to start.
I found this in the logs:
mkdir: cannot create directory '../sonatype-work/nexus3/log': Permission denied
and
Unable to update instance pid: Unable to create directory /nexus-data/instances
Where is my mistake? What needs to be done to enable Nexus to write to the disk and the folder?
Best,
Lars
Well, I solved it myself directly after creating the question. :)
According to https://github.com/sonatype/docker-nexus3, the application runs as a non-root user (UID 200), not as root.
Adding this to the deployment file, under the pod template's spec, did the trick:
spec:
  securityContext:
    fsGroup: 200
The fsGroup parameter, as indicated by reschifl, is an elegant solution, but it did not work for me. I used an alternative: launching an initContainer that fixes the permissions so that Nexus can start (sketched below). It is described here:
https://github.com/sonatype/docker-nexus/issues/31
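A minimal sketch of that initContainer approach, added under the nexus3 deployment's pod spec from the question (the chown to UID/GID 200 follows the linked issue; treat it as a starting point rather than the exact fix):
      initContainers:
      - name: fix-nexus-data-permissions
        image: busybox
        command: ["sh", "-c", "chown -R 200:200 /nexus-data"]
        volumeMounts:
        - mountPath: /nexus-data
          name: nexus3-persistent-storage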
