Spring Cloud Data Flow - Task execution fails with error ErrImagePull - spring-cloud-dataflow

Added the task application using a Docker image by following the syntax: docker://url_docker_img_for_task_app_from_private_repo. This was successful.
Next, created a task and executed it. Looking at the logs for the app pod created by SCDF, the pod fails to pull the Docker image from the private registry and ends up with the ErrImagePull status.
I have already created a Secret to access the private registry by following the steps here: https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#_private_docker_registry.
Tried both the deployer.* property approach and the global server-level configuration. Both failed to load the image.
Any clue how I can troubleshoot this issue?
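For reference, this is roughly what I tried (a minimal sketch; regcred and mytask are placeholder names, and the exact global property path may differ depending on the SCDF version, see the linked doc):
# Registry secret created in the namespace SCDF launches tasks into (regcred is a placeholder name)
kubectl create secret docker-registry regcred --docker-server=<private-registry-url> --docker-username=<user> --docker-password=<password>
# Per-launch deployer property, from the Data Flow shell (mytask is a placeholder task name)
task launch mytask --properties "deployer.mytask.kubernetes.imagePullSecret=regcred"
# Global alternative: set the image pull secret on the SCDF server itself (property/env var name depends on the version)
SPRING_CLOUD_DEPLOYER_KUBERNETES_IMAGE_PULL_SECRET=regcred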

Related

Amazon ECR Image is not downloaded when launching an EC2 task on ECS

I am pretty new to AWS. I have created a private repository in ECR and uploaded an image to it. I defined a task definition in ECS that uses that image. I have a cluster backed by an EC2 Auto Scaling group, and a service that should run the task with the ECR image. When I scale the Auto Scaling group to 1 instance and then run my task from my service, the service shows the task starting (status in progress), but it seems that the image is never downloaded.
I have verified that by looking at the summary on the repository, which doesn't show a download for this image but does show downloads for other repositories when they are run. When I ssh to the instance and poke around in the logs under /var/log, I never see any log where the image is being downloaded or fails to download. Docker only shows the images for the ecs-agent and something else, not the image that I am trying to run. When I use the AWS CLI to describe the task after it inevitably fails later, the output includes this for the reason the task is stopped:
"stopCode": "TaskFailedToStart",
"stoppedAt": "2022-12-20T07:44:46.292000-08:00",
"stoppedReason": "Task failed to start",
"stoppingAt": "2022-12-20T07:44:46.292000-08:00",
I have run a different container on this cluster successfully. In this failing case there are no logs that make it to the log stream and no events other than the task starting; I am baffled. Any help would be greatly appreciated.
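For reference, the stopped-task details above come from something like this (cluster name and task ARN are placeholders):
# Shows why ECS stopped the task, including per-container failure reasons if any
aws ecs describe-tasks --cluster <cluster-name> --tasks <task-arn> --query 'tasks[0].{stopCode:stopCode,stoppedReason:stoppedReason,containers:containers[*].reason}'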

Deploying an Azure durable function using a docker image in vscode

I have created a durable function in VS Code. It works perfectly fine locally, but when I deploy it to Azure it is missing some dependencies which cannot be included in the Python environment (Playwright). I created a Dockerfile and a Docker image on a private Docker Hub repository, which I want to use to deploy the function app, but I don't know how to deploy the function app using this image.
I have already tried commands such as:
az functionapp config container set --docker-custom-image-name <docker-id>/<image>:latest --name <function> --resource-group <rg>
Then when I deploy, nothing happens, and I simply get The service is unavailable. I also tried adding the environment variables DOCKER_REGISTRY_SERVER_USERNAME, DOCKER_REGISTRY_SERVER_PASSWORD and DOCKER_REGISTRY_SERVER_URL. However, it is unclear whether the URL should be <docker-id>/<image>:latest, docker.io/<image>:latest, https://docker.io/<image>:latest, etc. Still the deployment gets stuck on The service is unavailable, which is not a very useful error message.
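For completeness, this is roughly how I set those values (a sketch with placeholder values; the correct form of the URL is exactly what I'm unsure about):
az functionapp config appsettings set --name <function> --resource-group <rg> --settings DOCKER_REGISTRY_SERVER_URL=<registry-url> DOCKER_REGISTRY_SERVER_USERNAME=<docker-id> DOCKER_REGISTRY_SERVER_PASSWORD=<password>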
So I basically have the function app project ready and the Dockerfile/image. How can it be so difficult to simply deploy using the given image? The documentation here is very elaborate, but I am missing the details for a private repository. Also, it is very different from my usual VS Code deployment, making it very tough to follow and execute.
Created the Python 3.9 Azure Durable Functions in VS Code.
Created a Container Registry in Azure and pushed the function app image to ACR using docker push.
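A minimal sketch of that push step, reusing the registry name from the command below (the local image name is a placeholder):
# Authenticate against ACR, then tag and push the locally built image
az acr login --name customcontainer4funapp
docker tag <local-image> customcontainer4funapp.azurecr.io/<local-image>:latest
docker push customcontainer4funapp.azurecr.io/<local-image>:latest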
az functionapp config container set --docker-custom-image-name customcontainer4funapp --docker-registry-server-password <login-server-pswd> --docker-registry-server-url https://customcontainer4funapp.azurecr.io --docker-registry-server-user customcontainer4funapp --name krisdockerfunapp --resource-group AzureFunctionsContainers-rg
Following the same MS doc, pushed the function app image to the private custom container registry and deployed it to the Azure Function App. It is working as expected.
Refer to this similar issue resolution regarding the error The service is unavailable after deployment of the Azure Functions project, as there are several possible causes that need to be diagnosed step by step.

GKE can't pull image from GCR

This one is a real head-scratcher, because everything had worked fine for years until yesterday. I have a google cloud account and the billing is set up correctly. I have private images in my GCR registry which I can 'docker pull' and 'docker push' from my laptop (MacBook Pro with Big Sur 11.4) with no problems.
The problem I detail here started happening yesterday after I deleted a project in the google cloud console, then created it again from scratch with the same name. The previous project had no problem pulling GCR images, the new one couldn't pull the same images. I have now used the cloud console to create new, empty test projects with a variety of names, with new clusters using default GKE values. But this new problem persists with all of them.
When I use kubectl to create a deployment on GKE that uses any of the GCR images in the same project, I get ErrImagePull errors. When I 'describe' the pod that won't load the image, the error (with project id obscured) is:
Failed to pull image "gcr.io/test-xxxxxx/test:1.0.0": rpc error: code = Unknown desc = failed to pull and unpack image "gcr.io/test-xxxxxx/test:1.0.0": failed to resolve reference "gcr.io/test-xxxxxx/test:1.0.0": unexpected status code [manifests 1.0.0]: 401 Unauthorized.
This happens when I use kubectl from my laptop (including after wiping out and creating a new .kube/config file with proper credentials), but happens exactly the same when I use the cloud console to set up a deployment by choosing 'Deploy to GKE' for the GCR image... no kubectl involved.
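For reference, the error above comes from the pod events (the pod name is a placeholder):
kubectl describe pod <pod-name>
# The Events section lists the Failed / ErrImagePull entries with the 401 message quoted above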
If I ssh into a node in any of these new clusters and try to 'docker pull' a GCR image (in the same project), I get a similar error:
Error response from daemon: unauthorized: You don't have the needed permissions to perform this operation, and you may have invalid credentials. To authenticate your request, follow the steps in: https://cloud.google.com/container-registry/docs/advanced-authentication
My understanding from numerous articles is that no special authorization needs to be set up for GKE to pull GCR images from within the same project, and I've NEVER had this issue in the past.
I hope I'm not the only one on this deserted island. Thanks in advance for your help!
I tried implementing the setup and faced the same error both on the GKE cluster and on the cluster's nodes. This was caused by access to the Cloud Storage API being "Disabled" on the cluster nodes, which can be verified from the node (VM instance) details under the "Cloud API access scopes" section.
We can rectify this by changing the "Access scopes" to "Set access for each API" and modifying access to the specific API in the Node Pools -> default-pool -> Security section when creating the cluster. In our case we need at least "Read Only" access for the Storage API so the nodes can reach Cloud Storage, where the images are stored. See Changing the service account and access scopes for an instance for more information.
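A minimal sketch of creating a cluster with the default GKE scopes, which already include read-only access to Cloud Storage (cluster name and zone are placeholders):
gcloud container clusters create <cluster-name> --zone <zone> --scopes gke-default
# gke-default includes devstorage.read_only, which is what pulling images from GCR requires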

How to configure docker/docker-compose to use Nexus by default instead of docker.io?

I'm trying to use TestContainers to run JUnit tests.
However, I'm getting an InternalServerErrorException: Status 500: {"message":"Get https://registry-1.docker.io/v2/: Forbidden"} error.
Please note, that I am on a secure network.
I can replicate this by doing docker pull testcontainers/ryuk on the command line.
$ docker pull testcontainers/ryuk
Using default tag: latest
Error response from daemon: Get https://registry-1.docker.io/v2/: Forbidden
However, I need it to pull from our Nexus service: https://nexus.company.com:18443.
Inside the docker-compose file I'm already using the correct Nexus image path (verified by manually starting it with docker-compose). However, Testcontainers also pulls in additional images which are outside the docker-compose file, and it is these images that are causing the failure.
I'd be glad for either a Docker Desktop or TestContainers configuration change that would fix this for me.
Note: I've already tried adding the host URL for nexus to the Docker Engine JSON configuration on the dashboard, with no change to the resulting error when doing docker pull.
Since version 1.15.1, Testcontainers allows automatically prepending a prefix to every Docker image it pulls. If your private registry is configured as a Docker Hub mirror, this functionality should help with the mentioned issue.
Quote from the documentation:
You can then configure Testcontainers to apply the prefix registry.mycompany.com/mirror/ to every image that it tries to pull from Docker Hub. This can be done in one of two ways:
Setting environment variables TESTCONTAINERS_HUB_IMAGE_NAME_PREFIX=registry.mycompany.com/mirror/
Via config file, setting hub.image.name.prefix in either:
the ~/.testcontainers.properties file in your user home directory, or
a file named testcontainers.properties on the classpath
Basically, set the same prefix you used for the images in your docker-compose file.
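For example, a ~/.testcontainers.properties using the example prefix from the quoted docs (swap in your own Nexus prefix) would contain:
# ~/.testcontainers.properties
hub.image.name.prefix=registry.mycompany.com/mirror/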
If you're stuck with older versions for some reason, a deprecated solution would be to override just the ryuk.container.image property. Read about it here.
The process is described on this page:
Add the following to your Docker daemon config:
{
"registry-mirrors": ["https://nexus.company.com:18443"]
}
Make sure to restart the daemon to apply the changes.
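After restarting, you can confirm the mirror was picked up, for example:
docker info | grep -A 1 "Registry Mirrors"
# Should list https://nexus.company.com:18443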

GCP container push not working - "denied: Please enable Google Container Registry API in Cloud Console at ..."

I'm having trouble uploading my docker image to GCP Container registry. I was following the instructions here.
As you can see in the screenshot below, I've:
Logged into my google cloud shell and built a docker image via a dockerfile
Tagged my image correctly (I think)
Tried to push the image using the correct command (I think)
However, I'm getting this error:
denied: Please enable Google Container Registry API in Cloud Console at https://console.cloud.google.com/apis/api/containerregistry.googleapis.com/overview?project=bentestproject-184220 before performing this operation.
When I follow that link, it takes me to the wrong project:
When I select my own project, I can see that "Google Container Registry API" is indeed enabled:
How do I upload my docker images?
It seems that you mistyped your project ID. Your project name is BensTestsProject but the ID is bentestproject-184220.
I had the same issue and solved it. In my case the project ID in the image tag was wrong. You should double-check that the "bentestproject-184220" part of your image tag is really your project ID.
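A minimal sketch of tagging and pushing with that project ID (my-image is a placeholder name):
docker tag my-image gcr.io/bentestproject-184220/my-image:latest
docker push gcr.io/bentestproject-184220/my-image:latest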
