Pulling docker image in GKE - docker

Apologies if this is a duplicate, I haven't found a solution in similar questions.
I'm trying to upload a docker image to Google Kubernetes Engine.
I've done it successfully before, but I can't seem to find my luck this time around.
I have Google SDK set up locally with kubectl and my Google Account, which is project owner and has all required permissions.
When I use
kubectl create deployment hello-app --image=gcr.io/{project-id}/hello-app:v1
I see the deployment on my GKE console, consistently crashing as it "cannot pull the image from the repository.ErrImagePull Cannot pull image '' from the registry".
It provides 4 recommendations, which I have by now triple checked:
Check for spelling mistakes in the image names.
Check for errors when pulling the images manually (all fine in Cloud Shell)
Check the image pull secret settings
So, based on this https://blog.container-solutions.com/using-google-container-registry-with-kubernetes, I manually added 'gcr-json-key' from a new service account with project view permissions as well as 'gcr-access-token' to kubectl default service account.
Check the firewall for the cluster to make sure the cluster can connect to the ''. Afaik, this should not be an issue with a newly set up cluster.
The pods themselves provide the following error:
Failed to pull image "gcr.io/{project id}/hello-app:v1":
[rpc error: code = Unknown desc = Error response from daemon:
Get https://gcr.io/v2/{project id}/hello-app/manifests/v1: unknown: Unable to parse json key.,
rpc error: code = Unknown desc = Error response from daemon:
Get https://gcr.io/v2/{project id}/hello-app/manifests/v1:
unauthorized: Not Authorized., rpc error: code = Unknown desc = Error response from daemon:
pull access denied for gcr.io/{project id}/hello-app,
repository does not exist or may require 'docker login': denied:
Permission denied for "v1" from request "/v2/{project id}/hello-app/manifests/v1".]
My question now: what am I doing wrong, and how can I find out why my pods can't pull my image?
Kubernetes default serviceaccount spec:
kubectl get serviceaccount -o json
{
  "apiVersion": "v1",
  "imagePullSecrets": [
    {
      "name": "gcr-json-key"
    },
    {
      "name": "gcr-access-token"
    }
  ],
  "kind": "ServiceAccount",
  "metadata": {
    "creationTimestamp": "2020-11-25T15:49:16Z",
    "name": "default",
    "namespace": "default",
    "resourceVersion": "6835",
    "selfLink": "/api/v1/namespaces/default/serviceaccounts/default",
    "uid": "436bf59a-dc6e-49ec-aab6-0dac253e2ced"
  },
  "secrets": [
    {
      "name": "default-token-5v5fb"
    }
  ]
}

It does take several steps, and the blog post you referenced appears to have them right. So I suspect your error is in one of the steps.
A couple of things:
The error message says Failed to pull image "gcr.io/{project id}/hello-app:v1". Did you edit the error message to remove your {project id}? If not, that's one problem.
My next concern is the second line: Unable to parse json key. This suggests that you created the secret incorrectly:
Create the service account and generate a key
Create the Secret exactly as shown: kubectl create secret docker-registry gcr-json-key... (in the default namespace unless --namespace=... differs)
Update the Kubernetes spec with ImagePullSecrets
Because of the ImagePullSecrets requirement, I'm not aware of an equivalent kubectl run command, but you can try accessing your image using Docker from your host:
See: https://cloud.google.com/container-registry/docs/advanced-authentication#json-key
And then try docker pull gcr.io/{project id}/hello-app:v1 ensuring that {project id} is replaced with the correct GCP Project ID.
This proves:
The Service Account & Key are correct
The Container Image is correct
That leaves your creation of the Secret and your Kubernetes spec to test.
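One way to test the Secret half is to decode it and check that the stored password really is a parseable service-account key, since "Unable to parse json key" points in that direction. The following is a minimal Python sketch and not part of the original answer. The function name check_gcr_pull_secret is hypothetical, and the default registry key https://gcr.io is an assumption: it must match whatever --docker-server value was used when the Secret was created.

```python
import base64
import json

# Fields that a Google service-account JSON key file contains.
REQUIRED_KEY_FIELDS = {"type", "project_id", "private_key", "client_email"}


def check_gcr_pull_secret(dockerconfigjson_b64, registry="https://gcr.io"):
    """Return a list of problems found in a docker-registry Secret payload.

    dockerconfigjson_b64 is the base64 value stored under
    .data[".dockerconfigjson"] in the Secret. An empty list only means the
    payload is structurally sound; it does not prove the key is authorized.
    """
    try:
        config = json.loads(base64.b64decode(dockerconfigjson_b64))
    except ValueError:
        return ["payload is not base64-encoded JSON"]
    if not isinstance(config, dict):
        return ["payload is not a JSON object"]

    entry = config.get("auths", {}).get(registry)
    if not isinstance(entry, dict):
        return [f"no auths entry for {registry}"]

    problems = []
    if entry.get("username") != "_json_key":
        problems.append("username should be _json_key for JSON key auth")

    try:
        key = json.loads(entry.get("password", ""))
    except (ValueError, TypeError):
        problems.append("password is not valid JSON")
        return problems
    if not isinstance(key, dict):
        problems.append("password is JSON but not a key object")
        return problems

    missing = sorted(REQUIRED_KEY_FIELDS - key.keys())
    if missing:
        problems.append("key is missing fields: " + ", ".join(missing))
    return problems
```

The payload can be read with kubectl get secret gcr-json-key -o jsonpath='{.data.\.dockerconfigjson}'. If, for instance, the path of the key file rather than its contents had been passed as --docker-password, the password check above would flag it.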
NOTE: The Project Viewer IAM role is overly broad for a Service Account that only needs GCR access.
Use Storage Object Viewer (roles/storage.objectViewer) if the Service Account only needs to pull images.

Related

Docker push to AWS ECR hangs immediately and times out

I'm trying to push my first Docker image to ECR. I've followed the steps provided by AWS and things seem to go smoothly until the final push, which immediately times out. Specifically, I pass my AWS ECR credentials to Docker and get a "login succeeded" message. I then tag the image, which also works. When pushing to the ECR repo, I get no error message, just the following:
The push refers to repository [xxxxxxxxxxx.dkr.ecr.ca-central-1.amazonaws.com/reponame]
714c1b96dd83: Retrying in 1 second
d2cdc77dd068: Retrying in 1 second
30aad807caf5: Retrying in 1 second
0559774c4ea2: Retrying in 1 second
285b8616682f: Retrying in 1 second
4aeea0ec2b15: Waiting
1b1312f842d8: Waiting
c310009e0ef3: Waiting
a48777e566d3: Waiting
2a0c9f28029a: Waiting
EOF
It tries a bunch of times and then exits with no message. Any idea what's wrong?
I figured out my issue. I wasn't using the correct credentials. I had a personal AWS account as my default credentials and needed to add my work profile to my credentials.
EDIT
If you have multiple AWS profiles, you can pass the profile name when logging in to Docker, as below (assuming you have already run aws configure --profile someprofile):
aws ecr get-login-password --region us-east-1 --profile someprofile | docker login ....
You will get the same behaviour if you forget to create ECR repo before pushing.
Use CloudTrail to get a clue what is wrong.
Also make sure that you have configured the correct policy for your user, for example AmazonEC2ContainerRegistryFullAccess.
Make sure the name of your repository is the same name as your images.
image:latest 756839881602.dkr.ecr.us-east-1.amazonaws.com/image:latest in this case my repository name is image and my image name is image as well. This worked for me.
In my case, the repository I wanted to push to didn't exist (For example, I tried pushing to my-app/backend:latest but only the my-app/cms repository exists). So make sure your repository exists in the AWS ECR Console in the right region. The error returned from AWS CLI (EOF) didn't help at all.
Check your AWS permissions. In addition to the AmazonEC2ContainerRegistryFullAccess permission, the actions below have to be granted for the correct resource. In particular, check the "arn:aws:ecr:${REGION}:${ACCOUNT_ID}:repository/${REGISTRY_NAME}" part.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecr:BatchGetImage",
        "ecr:BatchCheckLayerAvailability",
        "ecr:CompleteLayerUpload",
        "ecr:DescribeImages",
        "ecr:DescribeRepositories",
        "ecr:GetDownloadUrlForLayer",
        "ecr:InitiateLayerUpload",
        "ecr:ListImages",
        "ecr:PutImage",
        "ecr:UploadLayerPart"
      ],
      "Resource": "arn:aws:ecr:${REGION}:${ACCOUNT_ID}:repository/${REGISTRY_NAME}"
    },
    {
      "Effect": "Allow",
      "Action": "ecr:GetAuthorizationToken",
      "Resource": "*"
    }
  ]
}
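Because the placeholders in the Resource ARN are easy to get wrong, it can help to generate the policy rather than edit it by hand. The following is a small Python sketch of that idea. The function name ecr_push_policy is hypothetical, and the account ID and repository name in the example call are made-up placeholders.

```python
import json

# The repository-scoped actions listed in the policy above.
PUSH_PULL_ACTIONS = [
    "ecr:BatchGetImage",
    "ecr:BatchCheckLayerAvailability",
    "ecr:CompleteLayerUpload",
    "ecr:DescribeImages",
    "ecr:DescribeRepositories",
    "ecr:GetDownloadUrlForLayer",
    "ecr:InitiateLayerUpload",
    "ecr:ListImages",
    "ecr:PutImage",
    "ecr:UploadLayerPart",
]


def ecr_push_policy(region, account_id, repository):
    """Build the policy document above for one concrete repository."""
    repo_arn = f"arn:aws:ecr:{region}:{account_id}:repository/{repository}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": PUSH_PULL_ACTIONS, "Resource": repo_arn},
            # GetAuthorizationToken is not repository-scoped, hence "*".
            {"Effect": "Allow", "Action": "ecr:GetAuthorizationToken", "Resource": "*"},
        ],
    }


if __name__ == "__main__":
    # Placeholder values; substitute your own region, account and repository.
    print(json.dumps(ecr_push_policy("ca-central-1", "123456789012", "reponame"), indent=2))
```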
In my case it was related to MFA (Multi-Factor-Authentication).
I had to create a session token. The docker login seemed to be successful, but pushing does not work.
The following script does all of this for you and creates an AWS profile "mfa" used to log in: get_mfa_credentials.py
After running it, you can log in with:
aws ecr get-login-password --region <YOUR_REGION> --profile mfa | docker login --username AWS --password-stdin <Your_REPO>
I do not know who wrote it, but I'm very grateful to this guy.
And thanks to AWS for bad tools that do not help.
Assuming you authenticated successfully to AWS and have permission to read and write to ECR, check whether the repository exists:
aws ecr describe-repositories --repository-names reponame
If you get a RepositoryNotFoundException error, create the repository with the following command:
aws ecr create-repository --repository-name reponame
After that, try to push again; it should work.
I also was able to login to the registry, yet the pushing of the image would just timeout.
The solution for me was to add AmazonEC2ContainerRegistryFullAccess to my IAM user.
After adding that permission to my IAM user account, I could docker push to the ECR registry just fine.
I have to add for anyone else encountering this problem. Go to IAM and make sure you have put permissions. I don't want to say how long I wasted before figuring that out.
Edit to help @zac's answer:
The policies that need to be attached are AmazonEC2ContainerRegistryFullAccess and AWSAppRunnerServicePolicyForECRAccess
For those who tried the solution above and it didn't work, make sure the image name you are pushing is the same as the repository name.
Ensure you are using the correct profile and that the repository exists
Command to log in with a profile: aws ecr get-login-password --region <region> --profile=<profile-name> | docker login --username AWS --password-stdin <aws-account-id>.dkr.ecr.<region>.amazonaws.com
Command to create the repo if it does not exist:
aws ecr describe-repositories --repository-names ${REPO_NAME} || aws ecr create-repository --repository-name ${REPO_NAME}
If anyone is still stuck with the issue. I would highly recommend watching this short vid https://www.youtube.com/watch?v=89ZeXaZEf80&ab_channel=IdenticalCloud
Here are the steps I took to fix the issue (if you prefer not to watch the video):
Create a new IAM user with "Access keys" checked
Under permissions, click on "attach existing policies directly" and choose "AmazonEC2ContainerRegistryFullAccess"
Download the CSV file
Run "aws configure" in your terminal and pass in the credentials from the CSV file
Set the location to the location you created your ECR (mine was us-east-1)
Go to ECR and follow the steps to push the image
For me, I had to delete the stack and re-deploy the stack. Then, I was able to push the docker image to ECR.
Please check the CloudTrail event logs; this is where API errors are clearly highlighted.
In my case it was because I had a - in my image name, and hence it was throwing the following error in the CloudTrail logs:
"The repository with name 'myimage-nginx' does not exist in the registry with id '516583196897'"
Please note the - in the image name.
Fixing the image name to remove the - resolved the issue for me.
Commands
docker tag nginx:latest 516583196897.dkr.ecr.ap-south-1.amazonaws.com/myimage:latest
docker push 516583196897.dkr.ecr.ap-south-1.amazonaws.com/myimage:latest
In my case I was creating the repo in us-east-2 and attempting to push to us-east-1, so docker couldn't find it.
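Several of the answers above come down to the same two checks: the region embedded in the image reference must be the region you logged in to, and the repository part must name a repository that exists. The following is a minimal Python sketch of those checks. parse_ecr_ref and push_target_problems are hypothetical helpers, and the list of existing repositories is assumed to come from aws ecr describe-repositories.

```python
import re
from typing import NamedTuple

# <account>.dkr.ecr.<region>.amazonaws.com/<repository>[:<tag>]
ECR_REF = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repository>[^:@]+)(?::(?P<tag>[^@]+))?$"
)


class EcrRef(NamedTuple):
    account: str
    region: str
    repository: str
    tag: str


def parse_ecr_ref(ref):
    """Split an ECR image reference into account, region, repository, tag."""
    match = ECR_REF.match(ref)
    if match is None:
        raise ValueError(f"not an ECR image reference: {ref!r}")
    # A reference without a tag is treated as :latest, as docker does.
    return EcrRef(match["account"], match["region"],
                  match["repository"], match["tag"] or "latest")


def push_target_problems(ref, existing_repos, login_region):
    """List mismatches that would make a push to ref fail."""
    target = parse_ecr_ref(ref)
    problems = []
    if target.region != login_region:
        problems.append(
            f"image targets {target.region} but login was for {login_region}")
    if target.repository not in existing_repos:
        problems.append(f"repository {target.repository!r} does not exist")
    return problems
```

For example, push_target_problems("123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app/backend:latest", {"my-app/cms"}, "us-east-2") reports both a region mismatch and a missing repository (the account ID here is a placeholder).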
Make sure your assumed aws role has the ability to push images to AWS ECR. Easiest is to check the role via the command:
aws sts get-caller-identity --profile=saml
I was following this documentation and hit this error. What addressed the problem was using the repository id instead of the account name.
aws ecr create-repository creates a repo and returns a repositoryUri. The docker login, docker tag and docker push should then be done using that repository URI instead of the user one.
I had this problem with sam deploy
sam delete --stack-name ...
sam deploy --guided
worked for me

pull access denied for "aaaaa" repository does not exist or may require 'docker login'

After cloning a project from GitLab on Ubuntu, I tried to run it through Docker. I opened a terminal, went to the directory where the .yml file was and typed:
I was greeted with this message:
The image for the service you're trying to recreate has been removed.
If you continue, volume data could be lost. Consider backing up your
data before continuing.
Continue with the new image?
I pressed y.
Then for a few seconds, I got:
Pulling "name of the module here"
Then I was greeted with this message:
ERROR: pull access denied for "name of the module here", the repository does not exist or may require 'docker login'
pedroesteves@pedro:~/Desktop/project$
Any help would be very appreciated. I already looked to some similar posts but none was able to help me.
You can pull an image from a private registry, but with newer versions of Docker you will get a certificate-related error on the client side when pulling.
First, modify the Docker daemon configuration file on the client side
(create the file if it does not exist):
/etc/docker/daemon.json
Add the following
{ "insecure-registries":["gitlab.my-site.com:5000"] }
Then restart the Docker daemon so the change takes effect, and you are good to go:
docker login gitlab.my-site.com:5000
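If /etc/docker/daemon.json already exists with other settings, the new entry should be merged in rather than overwriting the file. The following is a minimal Python sketch of that merge; add_insecure_registry is a hypothetical helper that works on the file's text and leaves reading and writing the file (which needs root) to the caller.

```python
import json


def add_insecure_registry(daemon_json_text, registry):
    """Return daemon.json text with registry added to insecure-registries.

    Existing keys are preserved, an empty or missing file is treated as {},
    and adding the same registry twice has no further effect.
    """
    config = json.loads(daemon_json_text) if daemon_json_text.strip() else {}
    registries = config.setdefault("insecure-registries", [])
    if registry not in registries:
        registries.append(registry)
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Example with a pre-existing, unrelated setting.
    print(add_insecure_registry('{"log-driver": "json-file"}', "gitlab.my-site.com:5000"))
```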

Argo artifact passing can't save output

I am trying to run the artifact passing example on Argoproj. However, I am getting the following error:
failed to save outputs: verify serviceaccount platform:default has necessary privileges
This error is appearing in the first step (generate-artifact) itself.
Selecting the generate-artifact component and clicking YAML gives following line highlighted
Nothing appears on clicking LOGS.
I need to understand the correct sequence of steps for running the YAML file so that this error does not appear and artifacts are passed. I could not find many resources on this issue other than this page, where the issue is discussed on the Argo repository.
All pods in a workflow run with the service account specified in workflow.spec.serviceAccountName, or if omitted, the default service account of the workflow's namespace.
Here the default service account of that namespace doesn't seem to be given any roles by default.
Try granting a role to the “default” service account in a namespace:
kubectl create rolebinding argo-default-binding \
--clusterrole=cluster-admin \
--serviceaccount=platform:default \
--namespace=platform
Since the default service account now gets all access via the 'cluster-admin' role, the example should work now.

How to reject docker registries in kubernetes?

I want to reject all Docker registries except my own. I'm looking for some kind of policy for Docker registries and their images.
For example, my registry name is registry.my.com. I want to make Kubernetes pull and run images only from registry.my.com, so:
image: prometheus:2.6.1
or any another should be rejected, while:
image: registry.my.com/prometheus:2.6.1
shouldn't.
Is there a way to do that?
Admission controllers are what you are looking for.
Admission controllers intercept operations to validate what should happen before the operation is committed by the api-server.
An example is the ImagePolicyWebhook, an admission controller that intercepts image operations to decide whether they should be allowed or rejected.
It will make a call to a REST endpoint with a payload like:
{
  "apiVersion": "imagepolicy.k8s.io/v1alpha1",
  "kind": "ImageReview",
  "spec": {
    "containers": [
      {
        "image": "myrepo/myimage:v1"
      },
      {
        "image": "myrepo/myimage@sha256:beb6bd6a68f114c1dc2ea4b28db81bdf91de202a9014972bec5e4d9171d90ed"
      }
    ],
    "annotations": {
      "mycluster.image-policy.k8s.io/ticket-1234": "break-glass"
    },
    "namespace": "mynamespace"
  }
}
and the API answers with Allowed:
{
  "apiVersion": "imagepolicy.k8s.io/v1alpha1",
  "kind": "ImageReview",
  "status": {
    "allowed": true
  }
}
or Rejected:
{
  "apiVersion": "imagepolicy.k8s.io/v1alpha1",
  "kind": "ImageReview",
  "status": {
    "allowed": false,
    "reason": "image currently blacklisted"
  }
}
The endpoint could be a Lambda function or a container running in the cluster.
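For the question as asked (only registry.my.com allowed), the decision logic behind such an endpoint could be as small as the following Python sketch. It only covers turning an ImageReview request into a response; the HTTP server, TLS and the webhook configuration on the API server are left out, and the function name review is hypothetical.

```python
ALLOWED_PREFIX = "registry.my.com/"  # registry from the question


def review(image_review, allowed_prefix=ALLOWED_PREFIX):
    """Answer an ImageReview request: allow only images under allowed_prefix."""
    containers = image_review.get("spec", {}).get("containers", [])
    rejected = [
        c.get("image", "")
        for c in containers
        if not c.get("image", "").startswith(allowed_prefix)
    ]
    status = {"allowed": not rejected}
    if rejected:
        status["reason"] = (
            "images not from " + allowed_prefix + ": " + ", ".join(rejected)
        )
    return {
        "apiVersion": "imagepolicy.k8s.io/v1alpha1",
        "kind": "ImageReview",
        "status": status,
    }
```

Matching on the prefix including the trailing slash matters: a bare "registry.my.com" prefix would also accept a host such as registry.my.com.evil.example.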
This GitHub repo, github.com/flavio/kube-image-bouncer, implements a sample using ImagePolicyWebhook to reject containers using the tag "latest".
There is also the option to pass the flag registry-whitelist on startup with a comma-separated list of allowed registries; this will be used by the ValidatingAdmissionWebhook to validate whether the registry is whitelisted.
The other alternative is the Open Policy Agent (OPA) project.
OPA is a flexible engine used to create policies based on rules that match resources and take decisions according to the result of these expressions. It is a mutating and a validating webhook that gets called for matching Kubernetes API server requests by the admission controller mentioned above. In summary, the operation works similarly to what is described above; the only difference is that the rules are written as configuration instead of code. The same example rewritten to use OPA would be similar to this:
package admission

import data.k8s.matches

deny[{
    "id": "container-image-whitelist",  # identifies type of violation
    "resource": {
        "kind": "pods",                 # identifies kind of resource
        "namespace": namespace,         # identifies namespace of resource
        "name": name                    # identifies name of resource
    },
    "resolution": {"message": msg},     # provides human-readable message to display
}] {
    matches[["pods", namespace, name, matched_pod]]
    container = matched_pod.spec.containers[_]
    not re_match("^registry.acmecorp.com/.+$", container.image) # The actual validation
    msg := sprintf("invalid container registry image %q", [container.image])
}
The above translates to: deny any pod where a container image does not come from the registry registry.acmecorp.com.
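A side note on the pattern in that rule: the dots are unescaped, and in a regular expression an unescaped dot matches any character, so the pattern accepts slightly more hosts than intended. The following Python sketch illustrates the difference. It uses Python's re module rather than OPA's regex engine, on the assumption that the meaning of "." and "\." is the same for this pattern; the names LOOSE, STRICT and allowed are hypothetical.

```python
import re

# Pattern as written in the rule: "." matches any single character.
LOOSE = re.compile(r"^registry.acmecorp.com/.+$")
# Same pattern with the dots escaped so they only match a literal ".".
STRICT = re.compile(r"^registry\.acmecorp\.com/.+$")


def allowed(image, pattern=STRICT):
    """Return True if the image reference matches the registry pattern."""
    return pattern.match(image) is not None


if __name__ == "__main__":
    for image in ("registry.acmecorp.com/app:1", "registryXacmecorp.com/app:1"):
        print(image, "loose:", allowed(image, LOOSE), "strict:", allowed(image, STRICT))
```

In a double-quoted Rego string the backslashes themselves would need escaping, so the strict pattern is shown here only in Python form.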
This is currently not something that you can enable or disable with one command, but there are admission controllers that you can use.
If you are on a Red Hat platform and running just Docker or Kubernetes nodes on RHEL, with RHEL Docker as the container runtime, you can whitelist registries there.
Whitelisting Docker Registries
You can specify a whitelist of docker registries, allowing you to
curate a set of images and templates that are available for download
by OpenShift Container Platform users. This curated set can be placed
in one or more docker registries, and then added to the whitelist.
When using a whitelist, only the specified registries are accessible
within OpenShift Container Platform, and all other registries are
denied access by default.
To configure a whitelist:
Edit the /etc/sysconfig/docker file to block all registries:
BLOCK_REGISTRY='--block-registry=all'
You may need to uncomment the BLOCK_REGISTRY line.
In the same file, add registries to which you want to allow access:
ADD_REGISTRY='--add-registry=<registry1> --add-registry=<registry2>'
Allowing Access to Registries
ADD_REGISTRY='--add-registry=registry.access.redhat.com'
There is also a GitHub project that you can use to whitelist registries:
https://github.com/flavio/kube-image-bouncer
I think registry whitelisting is already implemented in it; you just need to provide the list when you run the binary.
In case you are dealing with an Azure-managed AKS cluster you can make use of Azure Policies. Here is a summary. I wrote about it in more detail in my blog post which can be found here.
Activate the Policy Insights resource provider on your subscription
az provider register --namespace Microsoft.PolicyInsights
Enable AKS Azure Policy Add-On
az aks enable-addons --addons azure-policy --name <cluster> --resource-group rg-demo
Assign one of the built-in policies that allow just for that use case
# Define parameters for Azure Policy
$param = @{
"effect" = "deny";
"excludedNamespaces" = "kube-system", "gatekeeper-system", "azure-arc", "playground";
"allowedContainerImagesRegex" = "myregistry\.azurecr\.io\/.+$";
}
# Set a name and display name for the assignment
$name = 'restrict-container-registries'
# Retrieve the Azure Policy object
$policy = Get-AzPolicyDefinition -Name 'febd0533-8e55-448f-b837-bd0e06f16469'
# Retrieve the resource group for scope assignment
$scope = Get-AzResourceGroup -Name rg-demo
# Assign the policy
New-AzPolicyAssignment -DisplayName $name -name $name -Scope $scope.ResourceId -PolicyDefinition $policy -PolicyParameterObject $param
A couple of things worth noting:
Installing the add-on installs Gatekeeper for you
It can take up to 20 minutes until the policies are applied
I excluded the namespace playground on purpose, for demo purposes only

Error response from daemon: Get https://xxxxxxxxx.dkr.ecr.us-east-2.amazonaws.com/v2/xxxx/manifests/v_50: no basic auth credentials

I'm trying to implement a CI/CD workflow with Jenkins, Docker and AWS. I've reached the point where the job is properly configured, but I'm getting an error at deployment time on EC2.
In AWS I see the following error:
Status reason CannotPullContainerError: API error (404): repository xxxxxxxxx.dkr.ecr.us-east-2.amazonaws.com/xxxxxxxxx not found
My repository exists in AWS ECR. So, debugging and trying to pull the image that is in the Repository, I've executed the following commands to confirm everything is fine:
1.- Got "Login Succeeded" by executing the output of:
aws ecr get-login --no-include-email
2.- Checked my ~/.docker/config.json. At first it showed the registry URL without the protocol, but I added it after reading some recommendations to do so:
{
"auths": {
"https://xxxxxxxx.dkr.ecr.us-west-1.amazonaws.com": {
"auth": "long key..."
}
},
"HttpHeaders": {
"User-Agent": "Docker-Client/17.12.1-ce (linux)"
}
}
So, after these checks, when I execute the pull command I'm still getting:
[ec2-user@ip-xxxxxx .docker]$ docker pull xxxxxxxxx.dkr.ecr.us-east-2.amazonaws.com/xxxxxxxxx:v_50
Error response from daemon: Get https://xxxxxxxxx.dkr.ecr.us-east-2.amazonaws.com/v2/davidtest/manifests/v_50: no basic auth credentials
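"no basic auth credentials" generally means Docker found no stored login for the registry host in the image reference. In the listing above, the config.json entry is for a us-west-1 host while the pull targets a us-east-2 host. The following is a minimal Python sketch of that comparison. has_credentials is a hypothetical helper, and it only inspects the "auths" keys, whereas Docker's real lookup also involves credential helpers and credential stores.

```python
import json


def registry_host(image):
    """Return the registry host part of a fully qualified image reference."""
    return image.split("/", 1)[0]


def has_credentials(config_json, image):
    """Check whether a docker config.json has an auths entry for the image's host.

    Keys are compared with any scheme prefix and trailing slash removed,
    since entries may be stored with or without "https://".
    """
    auths = json.loads(config_json).get("auths", {})
    known_hosts = {key.split("://", 1)[-1].rstrip("/") for key in auths}
    return registry_host(image) in known_hosts
```

With the config.json shown above, a pull from a us-east-2 registry host has no matching entry, so logging in against the same region as the repository would be the first thing to check.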

Resources