Deploying Docker images from Gitlab's private Docker registry to Openshift

I'm working on a Docker-based project. The project code is hosted in a private Gitlab installation, git.example.com, and the private Docker registry that ships with Gitlab is deployed alongside it as registry.example.com.
The project has a CI setup which builds Docker images and pushes them to the registry; this part works as expected. Since Gitlab's Docker registry does not yet support multiple images for the same Git repo, I'm using the tags workaround, which specifies an image as:
registry.example.com/group/my.project:web
registry.example.com/group/my.project:app
etc.
I've created a user, attached it to the projects, logged in with it locally, and pulled the above images; that works as expected.
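For reference, the local check was along these lines (the username is just an example):
docker login registry.example.com -u deploy-user
docker pull registry.example.com/group/my.project:web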
I've added the ImageStream block like so:
apiVersion: v1
kind: ImageStream
metadata:
  name: web
spec:
  tags:
    - from:
        kind: DockerImage
        name: registry.example.com/group/my.project:web
      name: latest
This adds the image in the Images section, but it cannot be pulled because Openshift doesn't have access to the Docker registry yet. I add a new Docker secret as described here (see the sketch below) and am then able to see the image metadata in Openshift; everything looks as expected.
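Creating and linking such a pull secret amounts to something like this (the names and credentials are examples, not my exact values):
oc secrets new-dockercfg gitlab-registry \
  --docker-server=registry.example.com \
  --docker-username=deploy-user \
  --docker-password=SECRET \
  --docker-email=deploy@example.com
oc secrets link default gitlab-registry --for=pull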
But, if I add a deployment config, like so:
apiVersion: v1
kind: DeploymentConfig
metadata:
  creationTimestamp: null
  labels:
    service: web
  name: web
spec:
  replicas: 1
  selector:
    service: web
  strategy:
    resources: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        service: web
    spec:
      containers:
        - name: web
          ports:
            - containerPort: 80
          resources: {}
      restartPolicy: Always
  test: false
  triggers:
    - type: ConfigChange
    - type: ImageChange
      imageChangeParams:
        automatic: true
        containerNames:
          - web
        from:
          kind: ImageStreamTag
          name: 'web:latest'
status: {}
I keep getting this error:
Failed to pull image "registry.example.com/group/my.project@sha256:3333022641e571d7e4dcae2953d35be8cdf9416b13967b99537c4e8f150f74e4": manifest unknown: manifest unknown
in the Events tab of the created pod. This basically kills my plan to deploy prebuilt images to Openshift.
I know about the Docker 1.9 -> 1.10 incompatibility, but this is Openshift 1.4.1 and the images were pushed with Docker 1.13, so that shouldn't be a problem.
How do I even start debugging this? Is there a way to access any sort of log that would explain what's going on? Why is the ImageStream able to find everything it needs (and access my registry), but the DeploymentConfig is not?

To answer my own question: it seems Docker's Distribution (the registry daemon) has a bug which manifests in quite a weird way.
Basically, the problem is:
- the registry is behind an Apache reverse proxy
- the image gets built and pushed from the CI runner to Gitlab's registry with digest SHA:1234 (an example, of course)
- the image gets imported into Openshift, which queries the metadata; Docker Distribution claims the digest is SHA:ABCD. You can reproduce this by pushing and then pulling right away (see the sketch after this list): the digests are supposed to be identical both times, as explained in the link
- when Openshift tries to actually pull the image, it gets the dreaded "manifest unknown" error above, as it's trying to fetch the image by an invalid digest, through no fault of its own
- all the symptoms look exactly like the Docker v1 => v2 API changes, except for totally different reasons
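A minimal way to observe the mismatch, using the image name from the question (compare the digests the two commands print):
docker push registry.example.com/group/my.project:web
# note the sha256 digest printed at the end of the push, then:
docker pull registry.example.com/group/my.project:web
# behind the buggy proxy, the digest printed by the pull differs from the pushed one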
I've since moved my Gitlab instance to another machine (where it's behind Nginx) and it works without a problem.

Related

Connect kubernetes to GitLab Container Registry

I have a problem connecting my k3s cluster to the GitLab Docker Registry.
On the cluster, I created a secret in the default namespace like this:
kubectl create secret docker-registry regcred --docker-server=https://gitlab.domain.tld:5050 --docker-username=USERNAME --docker-email=EMAIL --docker-password=TOKEN
Then I included this secret in the Deployment config:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
  labels:
    app.kubernetes.io/name: "app"
    app.kubernetes.io/version: "1.0"
  namespace: default
spec:
  selector:        # required for apps/v1 Deployments; restored here
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      imagePullSecrets:
        - name: regcred
      containers:
        - image: gitlab.domain.tld:5050/group/appproject:1.0
          name: app
          imagePullPolicy: Always
          ports:
            - containerPort: 80
But the created pod is still unable to pull this image.
The error message is:
failed to authorize: failed to fetch anonymous token: unexpected status: 403 Forbidden
Can you help me figure out where the error may be?
If I connect to this GitLab registry with the credentials above from local Docker, everything works fine: docker login succeeds, and so does pulling the image.
Thanks
To pull from a private container registry on Gitlab, you must first create a Deploy Token, similar to how a pipeline or other "service" would access it. Go to the repository, then go to Settings -> Repository -> Deploy Tokens.
Give the deploy token a name and a username (it says optional, but we'll be able to use this custom username with the token), and make sure it has read_registry access. That is all it needs to pull from the registry; if you later need to push, you would also need write_registry. Once you click "create deploy token", it will show you the token. Be sure to copy it, as you won't see it again.
Now just recreate your secret in your k8s cluster.
kubectl create secret docker-registry regcred --docker-server=<private gitlab registry> --docker-username=<deploy token username> --docker-password=<deploy token>
Make sure to create the secret in the same namespace as the deployment that is pulling the image, for example (the namespace name below is hypothetical):
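kubectl -n my-app-namespace create secret docker-registry regcred \
  --docker-server=<private gitlab registry> \
  --docker-username=<deploy token username> \
  --docker-password=<deploy token>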
See the docs: https://docs.gitlab.com/ee/user/project/deploy_tokens/#gitlab-deploy-token

Why are microservices not restarted on GKE - "failed to copy: httpReaderSeeker: failed open: could not fetch content descriptor"

I'm running microservices on GKE and using skaffold for management.
Everything works fine for a week, and then suddenly all services are killed (I'm not sure why).
Logging shows the same info for all services: there is no indication in the logs that something went wrong before the services failed. It looks like the pods were all killed at the same time by GKE for whatever reason.
What confuses me is why the services do not restart.
kubectl describe pod auth shows an "ImagePullBackOff" error.
When I simulate this situation on the test system (same setup) by deleting a pod manually, all services restart just fine.
To deploy the microservices, I use skaffold.
---deployment.yaml for one of the microservices---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: auth-depl
  namespace: development
spec:
  replicas: 1
  selector:
    matchLabels:
      app: auth
  template:
    metadata:
      labels:
        app: auth
    spec:
      volumes:
        - name: google-cloud-key
          secret:
            secretName: pubsub-key
      containers:
        - name: auth
          image: us.gcr.io/XXXX/auth
          volumeMounts:
            - name: google-cloud-key
              mountPath: /var/secrets/google
          env:
            - name: NATS_CLIENT_ID
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: NATS_URL
              value: 'http://nats-srv:4222'
            - name: NATS_CLUSTER_ID
              value: XXXX
            - name: JWT_KEY
              valueFrom:
                secretKeyRef:
                  name: jwt-secret
                  key: JWT_KEY
            - name: GOOGLE_APPLICATION_CREDENTIALS
              value: /var/secrets/google/key.json
Any idea why the microservices don't restart? Again, they run fine after deploying them with skaffold and also after simulating pod shutdown on the test system ... what changed here?
---- Update 2021.10.30 -------------
After some further digging in the cloud log explorer, I figured out that the pod is trying to pull the previously built image but fails. If I pull the image manually in the cloud console, using the image name plus tag as listed in the logs, it works just fine (thus the image is there).
The log gives the following reason for the error:
Failed to pull image "us.gcr.io/scout-productive/client:v0.002-72-gaa98dde@sha256:383af5c5989a3e8a5608495e589c67f444d99e7b705cfff29e57f49b900cba33": rpc error: code = NotFound desc = failed to pull and unpack image "us.gcr.io/scout-productive/client@sha256:383af5c5989a3e8a5608495e589c67f444d99e7b705cfff29e57f49b900cba33": failed to copy: httpReaderSeeker: failed open: could not fetch content descriptor sha256:4e9f2cdf438714c2c4533e28c6c41a89cc6c1b46cf77e54c488db30ca4f5b6f3 (application/vnd.docker.image.rootfs.diff.tar.gzip) from remote: not found"
Where is that "sha256:4e9f2cdf438714c2c4533e28c6c41a89cc6c1b46cf77e54c488db30ca4f5b6f3" coming from that it cannot find according to the error message?
Any help/pointers are much appreciated!
Thanks
When Skaffold causes an image to be built, the image is pushed to a repository (us.gcr.io/scout-productive/client) using a tag generated with your specified tagging policy (v0.002-72-gaa98dde; this tag looks like the result of Skaffold's default gitCommit tagging policy). The remote registry returns the image digest of the received image, a SHA-256 value computed from the image metadata and image file contents (sha256:383af5c5989a3e8a5608495e589c67f444d99e7b705cfff29e57f49b900cba33). The image digest is unique, like a fingerprint, whereas a tag is just a name → image digest mapping, which may be updated to point to a different image digest.
When deploying your application, Skaffold rewrites your manifests on the fly to reference the specific built container image by both the generated tag and the image digest. Container runtimes ignore the tag when an image digest is specified, since the digest identifies a unique image.
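Putting the two together, the image reference in the deployed manifest ends up looking like this (values taken from the error message above):
image: us.gcr.io/scout-productive/client:v0.002-72-gaa98dde@sha256:383af5c5989a3e8a5608495e589c67f444d99e7b705cfff29e57f49b900cba33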
So the fact that your specific image cannot be resolved means that the image has been deleted from your repository. Are you, or someone on your team, deleting images?
Some people run image garbage collectors to delete old images. Skaffold users do this to delete the interim images generated by skaffold dev between dev loops. But some of these collectors are indiscriminate, for example keeping only images tagged latest. Since Skaffold tags images using your configured tagging policy, such collectors can delete your Skaffold-built images. To avoid problems, either tune your collector (e.g., delete only untagged images, see the sketch below), or have Skaffold build your dev and production images to different repositories. For example, GCP's Artifact Registry allows multiple independent repositories in the same project, and your GKE cluster can access all of them.
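As a hedged sketch, deleting only untagged images from GCR can look like this (PROJECT/IMAGE are placeholders; verify the list before deleting):
gcloud container images list-tags us.gcr.io/PROJECT/IMAGE \
  --filter='-tags:*' --format='get(digest)' | \
while read -r digest; do
  gcloud container images delete -q "us.gcr.io/PROJECT/IMAGE@${digest}"
done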
The ImagePullBackOff means that Kubernetes couldn't download the image from the registry: either the image with that name/tag doesn't exist, or the credentials for the registry are wrong/expired.
From what I see in your deployment.yml, no credentials for the registry are provided at all. You can do that with an imagePullSecret. I have never used Skaffold, but my assumption is that you log in to the private registry in Skaffold and use it to deploy the images, so when Kubernetes later tries to re-download the image from the registry by itself, it fails because of the missing authorization.
I can propose two solutions:
1. ImagePullSecret
You can create a secret resource which will contain the credentials to the private registry, then reference that secret in the deployment.yml; that way Kubernetes will be able to authenticate against your private registry and pull the images (see the sketch after the link below).
https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
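A minimal sketch of the relevant part of deployment.yml, assuming the pull secret is named regcred (the name is up to you):
spec:
  template:
    spec:
      imagePullSecrets:
        - name: regcred   # must exist in the same namespace as the Deployment
      containers:
        - name: auth
          image: us.gcr.io/XXXX/auth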
2. ImagePullPolicy
You can prevent re-downloading your images by setting imagePullPolicy: IfNotPresent in the deployment.yml; it will reuse the image that is already available locally (see the sketch after the link below). Using this method requires defining image tags properly: for example, using it with the :latest tag can lead to never pulling the "newest latest" image, because from the cluster's perspective it already has an image with that tag, so it won't download a new one.
https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy
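A minimal sketch, assuming the image from this question and a pinned tag (v1.2.3 is a placeholder):
containers:
  - name: auth
    image: us.gcr.io/XXXX/auth:v1.2.3
    imagePullPolicy: IfNotPresent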
To get to the bottom of the pods being killed in the first place, can you share the Kubernetes events from the time when the pods are killed (kubectl -n NAMESPACE get events)? That might give more information.

Kubernetes fails to pull images from gitlab registry unknown-sha256: <4ca..252> unexpected commit digest precondition

I've been learning kubernetes for the past several weeks. I recently built a bare-metal kubernetes cluster with three master nodes and three worker nodes (containerd runtime), and installed another stand-alone bare-metal gitlab server with the container registry enabled.
I was successful in building a simple nginx container with a custom index.html using docker build and pushed it to the registry; up until this point everything works great.
Now I wanted to create a simple pod using the image built above.
So I did the following steps:
1. Created a deploy token with read_registry access
2. Created a secret in kubernetes with the username and the token as the password
3. Inserted imagePullSecrets into the deployment yaml file
4. Ran kubectl apply -f nginx.yaml
The Kubernetes pod status stays in ImagePullBackOff, with this error:
Failed to pull image "<gitlab-host>:5050/<user>/<project>/nginx:v1": rpc error: code = FailedPrecondition desc = failed to pull and unpack image
"<gitlab-host>:5050/<user>/<project>/nginx:v1": failed commit on ref "unknown-sha256:4ca40a571e91ac4c425500a504490a65852ce49c1f56d7e642c0ec44d13be252": unexpected commit digest sha256:0d899af03c0398a85e36d5cd7ee9a8828e5618db255770a4a96331785ff26d9c, expected sha256:4ca40a571e91ac4c425500a504490a65852ce49c1f56d7e642c0ec44d13be252: failed precondition.
Troubleshooting steps followed:
- docker login from another server works
- docker pull works
- On one of the worker nodes where kubernetes was scheduling the pod, I did a ctr image pull, which works (see the sketch below)
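For reference, the ctr pull in the last step was along these lines (kubelet-pulled images live in containerd's k8s.io namespace; placeholders as in the error message):
ctr -n k8s.io images pull --user <user>:<token> <gitlab-host>:5050/<user>/<project>/nginx:v1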
I did some googling but couldn't find any solutions, so here I am, as a last resort, trying to figure this out.
Appreciate any help that I get.
My Deployment nginx.yml file
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - image: <gitlab-host>:5050/<username>/<project>/nginx:v1
          imagePullPolicy: IfNotPresent
          name: nginx
      imagePullSecrets:
        - name: regcred
I found the problem. I made a silly mistake in /etc/containerd/config.toml: in the registry section, I didn't specify the endpoint with the port number, <gitlab-host>:5050.
Also, adding private registries to config.toml is not necessary at all unless you want to run the ctr command on the k8s nodes.
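For anyone hitting the same thing, the config.toml section I mean looks roughly like this (containerd's CRI registry config; host and port are placeholders):
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."<gitlab-host>:5050"]
  endpoint = ["https://<gitlab-host>:5050"]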

What is the proper way to reference a Dockerfile when creating a Deployment in Kubernetes?

Let's say I have a deployment that looks something like this:
apiVersion: v1
kind: Deployment
metadata:
  name: myapp-deployment
spec:
  replicas: 3
  template:
    kind: Pod
    metadata: myapp-pod
    labels:
      apptype: front-end
    containers:
      - name: nginx
        containers: <--what is supposed to go here?-->
How do I properly build a container using an existing Dockerfile without having to push a built image up to Docker Hub?
Kubernetes can't build images. You are all but required to use an image registry. This isn't necessarily Docker Hub: the various public-cloud providers (AWS, Google, Azure) all have their own registry offerings, there are some third-party ones out there, or you can run your own.
If you're using a cloud-hosted Kubernetes installation (EKS, GKE, ...) the "right" way to do this is to push your built image to the corresponding image registry (ECR, GCR, ...) before you run it.
docker build -t gcr.io/my/image:20201116 .
docker push gcr.io/my/image:20201116
containers:
  - name: anything
    image: gcr.io/my/image:20201116
There are some limited exceptions to this in a very local development environment. For example, if you're using Minikube as a local Kubernetes installation, you can point docker commands at it, so that docker build builds an image inside the Kubernetes context.
eval $(minikube docker-env)
docker build -t my-image:20201116 .
containers:
  - name: anything
    image: my-image:20201116  # matches `docker build -t` option
    imagePullPolicy: Never    # since you manually built it inside the minikube Docker
Check out https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#creating-a-deployment.
Make sure you give the documentation a good read :)

Docker for Windows Kubernetes pod gets ImagePullBackOff after creating a new deployment

I have successfully built Docker images and ran them in a Docker swarm. When I attempt to build an image and run it with Docker Desktop's Kubernetes cluster:
docker build -t myimage -f myDockerFile .
(the above successfully creates an image in Docker's local image store)
kubectl run myapp --image=myimage:latest
(as far as I understand, this is the same as using the kubectl create deployment command)
The above command successfully creates a deployment, but when it makes a pod, the pod status always shows:
NAME                                    READY   STATUS             RESTARTS   AGE
myapp-<a random alphanumeric string>    0/1     ImagePullBackOff   0          <age>
I am not sure why it is having trouble pulling the image - does it maybe not know where the local Docker images are?
I just had the exact same problem. It boils down to the imagePullPolicy:
PC:~$ kubectl explain deployment.spec.template.spec.containers.imagePullPolicy
KIND:     Deployment
VERSION:  extensions/v1beta1

FIELD:    imagePullPolicy <string>

DESCRIPTION:
     Image pull policy. One of Always, Never, IfNotPresent. Defaults to Always
     if :latest tag is specified, or IfNotPresent otherwise. Cannot be updated.
     More info:
     https://kubernetes.io/docs/concepts/containers/images#updating-images
Specifically, the part that says: "Defaults to Always if :latest tag is specified."
That means you created a local image, but because you used the :latest tag, Kubernetes will try to find it in whatever remote repository you configured (by default Docker Hub) rather than using your local one. Simply change your command to:
kubectl run myapp --image=myimage:latest --image-pull-policy Never
or
kubectl run myapp --image=myimage:latest --image-pull-policy IfNotPresent
I had this same ImagePullBackOff error while running a pod deployment with a YAML file, also on Docker Desktop.
For anyone else who finds this via Google (like I did), the imagePullPolicy that Lucas mentions above can also be set in the deployment YAML file. See spec.template.spec.containers.imagePullPolicy in the YAML snippet below (3 lines from the bottom).
I added that, and my app deployed successfully into my local kube cluster, using the kubectl deploy command: kubectl apply -f .\Deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app-deployment
  labels:
    app: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: web-app
          image: node-web-app:latest
          imagePullPolicy: Never
          ports:
            - containerPort: 3000
You didn't specify where myimage:latest is hosted, but essentially ImagePullBackOff means that the image cannot be pulled because either:
- You don't have networking set up in your Docker VM that can reach your Docker registry (Docker Hub?)
- myimage:latest doesn't exist in your registry or is misspelled
- myimage:latest requires credentials (you are pulling from a private registry); you can take a look at this to configure container credentials in a Pod
