I've set up a Google Cloud Build pipeline that builds a Docker image from a Dockerfile, tests the image and pushes it to Google Container Registry.
Upon running the pipeline I noticed that all of the defined steps passed with SUCCESS status, but the build summary itself returned a FAILURE status, even though I can see the image being pushed to Google Container Registry.
I used the following command to build the image:
gcloud builds submit --config cloudbuild.yml --gcs-log-dir 'gs://<bucket>' .
and below is the error message returned:
ERROR: (gcloud.builds.submit) build www-xxxx-yyyy-zzzz completed with status "FAILURE"
🚨 Error: The command exited with status 1
Is there any reason for the gcloud builds submit command to exit with code 1 as above if all the steps were marked as SUCCESS?
Below is some filtered log data taken from the gcloud builds describe command for that specific build.
steps:
- args:
  - build
  - -t
  - <host>/<project>/<image>:<tag>
  - .
  name: gcr.io/cloud-builders/docker
  status: SUCCESS
- args:
  - test
  - --image
  - <host>/<project>/<image>:<tag>
  - --config
  - test_config.yml
  - -o
  - json
  name: gcr.io/gcp-runtimes/container-structure-test
  status: SUCCESS
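For reference, that kind of filtered view can be produced with a gcloud format projection; for example (the build ID is the one from the error above):
gcloud builds describe www-xxxx-yyyy-zzzz --format="yaml(steps)"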
Below is the Google Cloud Build setup:
# cloudbuild.yml
steps:
  # Build the image
  - name: 'gcr.io/cloud-builders/docker'
    args: [ 'build', '-t', '<host>/<project>/<image>:<tag>', '.' ]
  # Test the image
  - name: 'gcr.io/gcp-runtimes/container-structure-test'
    args: [
      'test',
      '--image',
      '<host>/<project>/<image>:<tag>',
      '--config',
      'test_config.yml',
      '-o',
      'json'
    ]
# Push the image
images: [ '<host>/<project>/<image>:<tag>' ]
I've finally resolved this issue with the assistance of the Google Cloud support team.
They found a 403 Permission Denied error that occurred when the Cloud Build worker tried to access Google Cloud Storage to delete a log object stored in the bucket; this error message only shows up in Cloud Build's backend, which users/clients have no access to. The 403 Permission Denied error is the result of the object retention policy applied to the bucket.
In my case, I replaced the retention policy with a lifecycle policy and that resolved the issue. We did this because keeping the Cloud Build log size under control is our primary objective, and to prevent any accidental deletion or modification of the log files we ended up granting read-only access to the resources in the log bucket for everything except the service account used by Cloud Build.
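The change itself is small; a minimal sketch, assuming the bucket passed via --gcs-log-dir is the one carrying the retention policy and that a simple 30-day age rule is enough (adjust to taste):
# remove the retention policy (only possible while it is not locked)
gsutil retention clear gs://<bucket>
# lifecycle.json: delete log objects older than 30 days
echo '{"rule": [{"action": {"type": "Delete"}, "condition": {"age": 30}}]}' > lifecycle.json
# apply the lifecycle rule to the log bucket
gsutil lifecycle set lifecycle.json gs://<bucket>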
- task: Docker@2
  displayName: Build an image
  inputs:
    command: build
    repository: weather-update-project
    dockerfile: '**/Dockerfile'
    buildContext: '$(Build.SourcesDirectory)'
    tags: 'latest'
- task: ECRPushImage@1
  inputs:
    awsCredentials: 'weather'
    regionName: us-west-2
    imageSource: 'imagename'
    sourceImageName: 'weather-update-project'
    sourceImageTag: 'latest'
    pushTag: 'latest'
    repositoryName: 'weather-update-project'
I'm building an image and then trying to push that image to ECR. When it gets to the ECR push image task, it tries to push a few times and then gives me the error "The process '/usr/bin/docker' failed with exit code 1", and that's it. There's no other information in my logs regarding the error like there normally is. What could be happening? My ECR is public and all of my credentials are correct. The YAML for the Docker build and ECRPushImage tasks in my Azure DevOps pipeline is shown above.
The repository that contains my Dockerfile is named 'weather-update-project', and my ECR repository also has the name 'weather-update-project'.
Can you validate which agent this is running on and whether Docker is available on it?
Is the image being created properly?
When the ECRPushImage task starts executing, it should show at least a configuration log like the one below; if it does not, the issue is related to Docker on that agent.
Configuring credentials for task
...configuring AWS credentials from service endpoint 'xxxxxxxxxxxx'
...endpoint defines standard access/secret key credentials
Configuring region for task
...configured to use region us-east-1, defined in task
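If that configuration log never appears, a quick diagnostic step before ECRPushImage can confirm whether Docker is present on the agent and whether the image was actually built (the step below is just an illustration):
- script: |
    docker --version
    docker images
  displayName: 'Check Docker and locally built images on the agent'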
I would like to use the machine executor so that I can run some component tests with docker-compose. My workflow fails on the checkout step and throws this error:
Making checkout directory "/opt/my-app"
Error: mkdir /opt/my-app: permission denied
Here is the yaml for the component_test stage in my workflow:
component_test:
  machine: true
  working_directory: /opt/my-app
  steps:
    - checkout
If I use docker instead of the machine executor then I don't get any permission issues:
component_test:
  docker:
    - image: <base-image>
  working_directory: /opt/my-app
  steps:
    - checkout
But, I'd like to be able to use docker-compose and thus need to be able to run the machine executor. Has anyone seen a permission issue like this before?
You need to either change the working directory to something under /home/circleci or just omit it completely, as it's optional.
Right now the circleci user runs the checkout step, and it doesn't have permission to git clone into the working directory you chose.
Also, I wouldn't use machine: true, as that is deprecated. Specify an image: https://circleci.com/docs/2.0/configuration-reference/#available-machine-images
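For example, a minimal sketch, assuming one of the currently listed Ubuntu machine images (check the link above for the exact tag) and a working directory under the home folder:
component_test:
  machine:
    image: ubuntu-2004:current
  working_directory: ~/my-app
  steps:
    - checkout
    # docker-compose is preinstalled in the machine images
    - run: docker-compose up -d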
I have created a React app and set up a fully working pipeline in Azure DevOps using Docker to build, test and publish my application.
Snippet from my pipeline yaml file:
#----------------------------------------------------------------------
# Build the docker image
#----------------------------------------------------------------------
- task: Docker@2
  inputs:
    containerRegistry: 'XXXcontainerrepo'
    repository: 'XXX'
    command: 'build'
    Dockerfile: '**/production.Dockerfile'
    tags: |
      $(Build.BuildNumber)
    arguments: '--build-arg buildNumber="$(Build.BuildNumber)"'
  displayName: 'Docker build'
#----------------------------------------------------------------------
# Export and publish test results
#----------------------------------------------------------------------
- script: |
    export id=$(docker images --filter "label=test=$(Build.BuildNumber)" -q | head -1)
    echo "Container ID: ${id}"
    docker create --name testcontainer $id
    docker cp testcontainer:/app/coverage ./coverage
    docker cp testcontainer:/app/junit.xml ./junit.xml
    docker rm testcontainer
  displayName: 'Copy test results and code coverage reports'
- task: PublishCodeCoverageResults@1
  inputs:
    codeCoverageTool: 'cobertura'
    summaryFileLocation: '$(System.DefaultWorkingDirectory)/coverage/cobertura-coverage.xml'
  displayName: 'Publish code coverage reports'
- task: PublishTestResults@2
  inputs:
    testResultsFormat: 'JUnit'
    testResultsFiles: '**/junit.xml'
    mergeTestResults: true
    failTaskOnFailedTests: true
    testRunTitle: 'Jest Unit Tests'
  displayName: 'Publish test results'
Snippet from the Dockerfile:
# run tests
LABEL test=$buildNumber
RUN npm run test -- --reporters=default --reporters=jest-junit --coverage --watchAll=false
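For context, the buildNumber referenced by that LABEL comes in through the --build-arg shown in the Docker build task above, so the Dockerfile declares it with ARG; a rough sketch of the relevant section (the node base image is just an assumption):
# base image chosen for illustration only
FROM node:16 AS test
WORKDIR /app
COPY . .
RUN npm ci

# buildNumber is supplied by the pipeline via --build-arg
ARG buildNumber

# label this image so the pipeline can find it later with
# docker images --filter "label=test=$(Build.BuildNumber)"
LABEL test=$buildNumber

# run tests
RUN npm run test -- --reporters=default --reporters=jest-junit --coverage --watchAll=false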
When all the test cases pass, a test report is generated, which I then copy out of my Docker container and publish using a task in the Azure pipeline.
But when a test fails, I get an error in the Docker container and my build fails without any report being created. Of course I can see the failing test in the pipeline build log, but I don't get the report in Azure DevOps, since it is never created and thus not available to publish.
When I run the same command locally I get a test report created in both cases, whether all tests pass or one or more fail.
So my question is: is it possible to create the report even if one or more tests fail when running under Docker? As I understand it, the number of failing tests becomes the exit code, and when the exit code != 0 the Docker build step fails.
I would still like the Docker build step to fail after the report has been created, but in case that is not possible I have set a flag in the test result publish task to fail the build if one or more tests fail.
failTaskOnFailedTests: true
Update
When I add the continueOnError: true flag to the Docker build step, the build continues to the next step, but the problem still exists. It seems like the test run is halted before it can create the test and coverage reports, probably because the non-zero exit code it produces makes the Docker step exit.
In the copy test results step I then get this output:
Error: No such container:path: testcontainer:/app/coverage
Error: No such container:path: testcontainer:/app/junit.xml
This tells me the test run didn't create the reports, because the non-zero exit code caused the Docker build step to exit before they were written.
Solution
I ended up delaying the exit code so that the container with the test label finished and thus was available to the next step to extract the report files.
RUN npm run test -- --reporters=default --reporters=jest-junit --coverage --watchAll=false; \
    echo $? > /npm.exitcode;

# if the npm command failed, fail the build here so that we can get the
# test-report files regardless of exit code
RUN exit $(cat /npm.exitcode)
I also added the condition: succeededOrFailed() to the copy test results step and the publish code coverage step so that they always run, even if the build step "fails".
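For completeness, a rough sketch of how that looks on the copy step (the condition is the only addition relative to the snippet above):
- script: |
    export id=$(docker images --filter "label=test=$(Build.BuildNumber)" -q | head -1)
    docker create --name testcontainer $id
    docker cp testcontainer:/app/coverage ./coverage
    docker cp testcontainer:/app/junit.xml ./junit.xml
    docker rm testcontainer
  displayName: 'Copy test results and code coverage reports'
  condition: succeededOrFailed()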
You may try the workarounds in the following links:
https://github.com/MicrosoftDocs/azure-devops-docs/issues/2183
https://github.com/microsoft/vstest/issues/1848
Try adding "exit 0" to the end of this line as this: RUN dotnet test MyProject.Tests.csproj -c Release --logger "trx;LogFileName=testresults.trx"; exit 0
You can set the continueOnError field on the Docker step, which enables the next steps to run. However, this way your build ends with the status SucceededWithIssues. If that is OK for you, you are done. If not, you can put a step at the end with a condition on Agent.JobStatus, like
- pwsh: exit 1
  condition: eq(variables['Agent.JobStatus'], 'SucceededWithIssues')
to fail a build when it happens.
This similar question is not applicable because I am not using Kubernetes or my own registered runner.
I am attempting to build a Ruby-based image in my GitLabCI pipeline in order to have my gems pre-installed for use by subsequent pipeline stages. In order to build this image, I am attempting to use Kaniko in a job that runs in the .pre stage.
build_custom_dockerfile:
  stage: .pre
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  variables:
    IMAGE_TAG: ${CI_COMMIT_REF_SLUG}-${CI_COMMIT_SHORT_SHA}
  script:
    - echo "{\"auths\":{\"${CI_REGISTRY}\":{\"username\":\"${CI_REGISTRY_USER}\",\"password\":\"${CI_REGISTRY_PASSWORD}\"}}}" > /kaniko/.docker/config.json
    - /kaniko/executor --context ${CI_PROJECT_DIR} --dockerfile ${CI_PROJECT_DIR}/dockerfiles/custom/Dockerfile --destination ${CI_REGISTRY_IMAGE}:${IMAGE_TAG}
This is of course based on the official GitLabCI Kaniko documentation.
However, when I run my pipeline, this job returns an error with the following message:
error checking push permissions -- make sure you entered the correct tag name, and that you are authenticated correctly, and try again: getting tag for destination: registries must be valid RFC 3986 URI authorities: registry.gitlab.com
The Dockerfile path is correct, and from testing with invalid Dockerfile paths passed to the --dockerfile argument, it is clear to me that this is not the source of the issue.
As far as I can tell, I am using the correct pipeline environment variables for authentication and following the documentation for using Kaniko verbatim. I am running my pipeline jobs with GitLab's shared runners.
According to this issue comment from May, others were experiencing a similar issue which was then resolved by reverting to the debug-v0.16.0 Kaniko image. Likewise, I changed the image name line to name: gcr.io/kaniko-project/executor:debug-v0.16.0, but this resulted in the same error message.
Finally, I tried creating a generic user to access the registry, using a deployment key as indicated here. Via the GitLabCI environment variables project settings interface, I added two variables corresponding to the username and key, and substituted these variables in my pipeline script. This resulted in the same error message.
I tried several variations on this approach, including renaming these custom variables to "CI_REGISTRY_USER" and "CI_REGISTRY_PASSWORD" (the predefined variables). I also made sure neither of these variables was marked as "protected". None of this solved the problem.
I have also tried running the tutorial script verbatim (without custom image tag), and this too results in the same error message.
Has anyone had any recent success in using Kaniko to build Docker images in their GitLabCI pipelines? It appears others are experiencing similar problems but as far as I can tell, no solutions have been put forward and I am not certain whether the issue is on my end. Please let me know if any additional information would be useful to diagnose potential problem sources. Thanks all!
I have run into this issue many times before, forgetting that the variable was set to protected and thus would only be exported to protected branches.
Hey, I got it working, but it was quite a hassle to figure out.
The credentials I had to use were my Git username and password, not the registry user/password!
Here is what my gitlab-ci.yml looks like (of course you would need to replace everything with variables, but I was too lazy to do that until now):
build:
  stage: build
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  tags:
    - k8s
  script:
    - echo "{\"auths\":{\"registry.mydomain.de/myusername/mytag\":{\"username\":\"myGitusername\",\"password\":\"myGitpassword\"}}}" > /kaniko/.docker/config.json
    - /kaniko/executor --context $CI_PROJECT_DIR --dockerfile $CI_PROJECT_DIR/Dockerfile --destination registry.mydomain.de/myusername/mytag:$CI_COMMIT_SHORT_SHA
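For reference, the same job with the credentials moved into CI/CD variables; REGISTRY_USER and REGISTRY_PASSWORD are made-up names for unprotected variables holding the Git username/password mentioned above, and the auth key uses the registry host as in the official docs:
build:
  stage: build
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  tags:
    - k8s
  script:
    - echo "{\"auths\":{\"${CI_REGISTRY}\":{\"username\":\"${REGISTRY_USER}\",\"password\":\"${REGISTRY_PASSWORD}\"}}}" > /kaniko/.docker/config.json
    - /kaniko/executor --context $CI_PROJECT_DIR --dockerfile $CI_PROJECT_DIR/Dockerfile --destination $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA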
I was trying to push a Docker image to Azure Container Registry using an Azure YAML pipeline, but while pushing I am getting an error like:
"denied: requested access to the resource is denied
##[error]denied: requested access to the resource is denied
##[error]/usr/bin/docker failed with return code: 1"
Below is the Azure YAML pipeline code I used for this. I also tried removing includeSourceTags and additionalImageTags.
variables:
  BuildConfiguration: "Release"
  AzureSubscription: "ea158397-eb3f-461a-94df-0eb6bbaada60"
  AzureContainerRegistry: "microservicecontainerregistry01.azurecr.io"
  KubernetesServiceEndpoint: "AKSServiceConnection"
  ResourceGroup: "microservicedelivery"

- task: Docker@1
  condition: eq(variables['fullCI'],True)
  displayName: 'Push runtime image'
  inputs:
    azureSubscriptionEndpoint: ${{variables.AzureSubscription}}
    azureContainerRegistry: ${{variables.AzureContainerRegistry}}
    command: 'Push an image'
    imageName: '$(imageName)'
    includeSourceTags: false
    additionalImageTags: $(Build.BuildId)
The same error happens on the Microsoft-hosted machine as well as on the private machine.
Please refer to the documentation here, which has a sample YAML snippet to help you push images to ACR.
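For reference, a minimal sketch using the Docker@2 task with a Docker registry service connection pointing at the ACR instance (the service connection and repository names are placeholders):
- task: Docker@2
  displayName: 'Build and push image to ACR'
  inputs:
    containerRegistry: 'myAcrServiceConnection'
    repository: 'my-repository'
    command: 'buildAndPush'
    Dockerfile: '**/Dockerfile'
    tags: |
      $(Build.BuildId)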