Disable cache for Docker directly in Dockerfile - docker

I use Gitpod as my online IDE. Gitpod builds a Docker container from a user-provided Dockerfile. The user doesn't have access to the terminal that runs the docker build command, so no flags can be passed. At the moment, my Dockerfile fails to build because Docker appears to cache instructions incorrectly, including mkdir commands. Specifically, given the Dockerfile:
# Base image is one of Ubuntu's official distributions.
FROM ubuntu:20.04
# Install curl.
RUN apt-get update
RUN apt-get -y install sudo
RUN sudo apt-get install -y curl
RUN sudo apt-get install -y python3-pip
# Download Google Cloud CLI installation script.
RUN mkdir -p /tmp/google-cloud-download
RUN curl -sSL https://sdk.cloud.google.com > /tmp/google-cloud-download/install.sh
# Install Google Cloud CLI.
RUN mkdir -p /tmp/google-cloud-cli
RUN bash /tmp/gcloud.sh --install-dir=/tmp/google-cloud-cli --disable-prompts
# Move the content of /tmp/gcloud into the container.
COPY /tmp/google-cloud-cli /google-cloud-cli
The build fails with the following log:
#1 [internal] load .dockerignore
#1 transferring context: 114B done
#1 DONE 0.0s
#2 [internal] load build definition from Dockerfile
#2 transferring dockerfile: 1.43kB done
#2 DONE 0.0s
#3 [internal] load metadata for docker.io/library/ubuntu:20.04
#3 DONE 1.2s
#4 [ 1/13] FROM docker.io/library/ubuntu:20.04@sha256:af5efa9c28de78b754777af9b4d850112cad01899a5d37d2617bb94dc63a49aa
#4 resolve docker.io/library/ubuntu:20.04@sha256:af5efa9c28de78b754777af9b4d850112cad01899a5d37d2617bb94dc63a49aa done
#4 sha256:3b65ec22a9e96affe680712973e88355927506aa3f792ff03330f3a3eb601a98 0B / 28.57MB 0.1s
#4 ...
#5 [internal] load build context
#5 transferring context: 1.70MB 0.1s done
#5 DONE 0.1s
#6 [ 5/13] RUN sudo apt-get install -y python3-pip
#6 CACHED
#7 [ 9/13] RUN bash /tmp/gcloud.sh --install-dir=/tmp/google-cloud-cli --disable-prompts
#7 CACHED
#8 [ 4/13] RUN sudo apt-get install -y curl
#8 CACHED
#9 [ 7/13] RUN curl -sSL https://sdk.cloud.google.com > /tmp/google-cloud-download/install.sh
#9 CACHED
#10 [ 8/13] RUN mkdir -p /tmp/google-cloud-cli
#10 CACHED
#11 [ 3/13] RUN apt-get -y install sudo
#11 CACHED
#12 [ 6/13] RUN mkdir -p /tmp/google-cloud-download
#12 CACHED
#13 [10/13] COPY /tmp/google-cloud-cli /google-cloud-cli
#13 ERROR: failed to calculate checksum of ref j0t2zzxkw0572xeibprcp5ebn::w8exf03p6f5luerwcumrkxeii: "/tmp/google-cloud-cli": not found
#14 [ 2/13] RUN apt-get update
#14 CANCELED
------
> [10/13] COPY /tmp/google-cloud-cli /google-cloud-cli:
------
Dockerfile:22
--------------------
20 |
21 | # Move the content of /tmp/gcloud into the container.
22 | >>> COPY /tmp/google-cloud-cli /google-cloud-cli
23 |
24 | # Copy local code to the container image.
--------------------
error: failed to solve: failed to compute cache key: failed to calculate checksum of ref j0t2zzxkw0572xeibprcp5ebn::w8exf03p6f5luerwcumrkxeii: "/tmp/google-cloud-cli": not found
{"@type":"type.googleapis.com/google.devtools.clouderrorreporting.v1beta1.ReportedErrorEvent","command":"build","error":"exit status 1","level":"error","message":"build failed","serviceContext":{"service":"bob","version":""},"severity":"ERROR","time":"2022-08-28T05:31:11Z"}
exit
headless task failed: exit status 1
Other than to stop using Gitpod altogether, which I'm considering, how could I solve this issue?

When you COPY /tmp/google-cloud-cli /google-cloud-cli, it tries to copy a file from outside of Docker space (the build context, the directory argument to docker build, frequently the same directory as the Dockerfile) into the image.
In your case, you already have the file inside the image, so you need to RUN cp or mv or another command to relocate the existing file.
RUN bash /tmp/gcloud.sh --install-dir=/tmp/google-cloud-cli --disable-prompts
RUN mv /tmp/google-cloud-cli /google-cloud-cli
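Putting the fix above together, a minimal sketch of the corrected Dockerfile could look like the following. The scratch directory name gcloud-demo is hypothetical. The question runs /tmp/gcloud.sh, which the Dockerfile as shown never creates, so this sketch assumes the intended script is the downloaded install.sh. The script only writes the file; the build command stays in a comment because Gitpod runs the build itself.

```shell
# Write the sketch into a scratch directory (gcloud-demo is a hypothetical name).
mkdir -p gcloud-demo
cat > gcloud-demo/Dockerfile <<'EOF'
FROM ubuntu:20.04

# The build already runs as root, so sudo is not required here.
RUN apt-get update && apt-get install -y curl python3-pip

# Download the installer and run it from the path it was saved to
# (assumption: the question's /tmp/gcloud.sh was meant to be this file).
RUN mkdir -p /tmp/google-cloud-download \
 && curl -sSL https://sdk.cloud.google.com > /tmp/google-cloud-download/install.sh \
 && bash /tmp/google-cloud-download/install.sh --install-dir=/tmp/google-cloud-cli --disable-prompts

# The files already live inside the image, so relocate them with RUN mv.
# COPY would look for them in the build context on the host instead.
RUN mv /tmp/google-cloud-cli /google-cloud-cli
EOF

# Where a terminal is available, the image could be built with:
#   docker build -t gcloud-demo gcloud-demo
```

The key change is the last instruction: nothing is read from the build context, so the "not found" checksum error for /tmp/google-cloud-cli should no longer be triggered.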

A way to invalidate caches of docker layers in Gitpod is to put in an environment variable above all the layers you want to invalidate and change its value.
FROM gitpod/workspace-full
ENV INVALIDATE_CACHE=1
...
(If this doesn't help, please share a repository with the mentioned Dockerfile to reproduce)

Related

Docker : failed to compute cache key

I am trying to build a docker image for my sample-go app.
I am running it from the sample-app folder itself and using the goland editor's terminal. But the build is failing and giving me certain errors.
My docker file looks like this:
FROM alpine:latest
RUN mkdir -p /src/build
WORKDIR /src/build
RUN apk add --no-cache tzdata ca-certificates
COPY ./configs /configs
COPY main /main
EXPOSE 8000
CMD ["/main"]
command for building:
docker build --no-cache --progress=plain - < Dockerfile
Error And Logs:
#1 [internal] load build definition from Dockerfile
#1 sha256:8bb9ee83603259cf748d90ce42602f12527fa720d7417da22799b2ad4e503497
#1 transferring dockerfile: 222B done
#1 DONE 0.0s
#2 [internal] load .dockerignore
#2 sha256:f93d938488588cd0e0a94d9d343fe69dcfd28d0cb1da95ad7aab00aac50235c3
#2 transferring context: 2B done
#2 DONE 0.0s
#3 [internal] load metadata for docker.io/library/alpine:latest
#3 sha256:13549c58a76bcb5dac9d52bc368a8fb6b5cf7659f94e3fa6294917b85546978d
#3 DONE 0.0s
#10 [1/6] FROM docker.io/library/alpine:latest
#10 sha256:d20daa00e252bfb345a1b4f53b6bb332aafe702d8de5e583a76fcd09ba7ea1c1
#10 CACHED
#7 [internal] load build context
#7 sha256:0f7a8a6082a837c139acc2855e1b745bba9f28cc96709d45cd0b7be42442c0e8
#7 transferring context: 2B done
#7 DONE 0.0s
#4 [2/6] RUN mkdir -p /src/build
#4 sha256:b9fa3007a44471d47414dd29b3ff07ead6af28ede820a2b4bae0ce84cf2c5a83
#4 CACHED
#5 [3/6] WORKDIR /src/build
#5 sha256:b2ec58a365fdd74c4f9030b0caff2e2225eea33617da306678ad037fce675388
#5 CACHED
#6 [4/6] RUN apk add --no-cache tzdata ca-certificates
#6 sha256:0966097abf956d5781bc2330d49cf715cd52c3807e8fedfff07dec50907ff03b
#6 CACHED
#9 [6/6] COPY main /main
#9 sha256:f4b81960427c014a020361bea0903728f289e1d796892fe0adc6409434f3ca76
#9 ERROR: "/main" not found: not found
#8 [5/6] COPY ./configs /configs
#8 sha256:630f272dd60dd307f40dbbdaef277ee0dfc24b71fa11e10a3b8efd64d3c05086
#8 ERROR: "/configs" not found: not found
#4 [2/6] RUN mkdir -p /src/build
#4 sha256:b9fa3007a44471d47414dd29b3ff07ead6af28ede820a2b4bae0ce84cf2c5a83
#4 DONE 0.2s
------
> [5/6] COPY ./configs /configs:
------
------
> [6/6] COPY main /main:
------
failed to compute cache key: "/main" not found: not found
PS: I am not able to find where is the problem? Help Please
The two folders /main and /configs do not exist.
The COPY command can't copy into these folders.
1. Solution
Create the folders on build
RUN mkdir -p /main
RUN mkdir -p /configs
And then use COPY
2. Solution
Try to build without COPY and CMD
Then run the new image
exec into the running container with bash or sh
Create the folders
Exit the container
Create a new image from the running container with docker commit
Stop the container and delete it
Build again with your new image and include COPY and CMD
This is a basic mistake.
COPY ./configs /configs: copy the folder configs from the host to the Docker image.
COPY main /main: copy the executable file main from the host to the Docker image.
The problems are:
The base Docker images do not have these folders /configs, /main. You must create them manually (Docker understood your command this way).
But I have some advice:
Create 2 Docker images for 2 purposes: build, production.
Copy the source code into the Docker builder image, which is used for building your app.
Copy necessary output files from the Docker builder image into the Docker production image.
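The builder/production split described above is usually written as a multi-stage build. Here is a minimal sketch, assuming the sample app is a Go module with its main package at the root of the build context. The golang:1.21-alpine tag, the /out path and the multistage-demo directory are assumptions for illustration, not taken from the question.

```shell
# Write a two-stage Dockerfile into a scratch directory (hypothetical name).
mkdir -p multistage-demo
cat > multistage-demo/Dockerfile <<'EOF'
# Stage 1: builder image that compiles the app.
# Assumption: go.mod and the main package sit at the root of the build context.
FROM golang:1.21-alpine AS builder
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/main .

# Stage 2: production image that receives only the build output and configs.
FROM alpine:latest
RUN apk add --no-cache tzdata ca-certificates
COPY --from=builder /out/main /main
COPY ./configs /configs
EXPOSE 8000
CMD ["/main"]
EOF

# Run from the app directory so that the sources and configs/ are in the context:
#   docker build -t sample-go-app .
```

With this layout the compiled binary no longer has to exist on the host before the build, since it is produced in the first stage and copied across with COPY --from.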
In my case, the issue was the VPN/proxy network my machine was connected to.
It worked after I disconnected from the VPN/proxy network.
In my case, the folder entry was missing from the .dockerignore file. Something like this worked:
**/*
!docker-images
!configs
!main
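One more thing worth checking in the question itself: the build command reads the Dockerfile from stdin (docker build ... - < Dockerfile). In that form Docker receives no build context, which matches the "transferring context: 2B" line in the log, so COPY has nothing on the host side to copy from. A minimal sketch, assuming configs/ and the main binary sit next to the Dockerfile; the sample-app directory and the placeholder files are hypothetical stand-ins.

```shell
# Hypothetical layout: Dockerfile, configs/ and the main binary in one directory.
mkdir -p sample-app/configs
printf 'placeholder binary\n' > sample-app/main
cat > sample-app/Dockerfile <<'EOF'
FROM alpine:latest
RUN apk add --no-cache tzdata ca-certificates
COPY ./configs /configs
COPY main /main
EXPOSE 8000
CMD ["/main"]
EOF

# Passing a directory as the last argument makes it the build context,
# so COPY can find configs/ and main:
#   docker build --no-cache --progress=plain sample-app
#
# By contrast, piping the Dockerfile through stdin sends no context at all:
#   docker build --no-cache --progress=plain - < sample-app/Dockerfile
```

If a .dockerignore file is present, it also has to let configs and main through, as the last answer points out.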

buildkit extremely slow on zypper commands

I am running into a problem with BuildKit and I cannot figure out what the reason is.
I have a Dockerfile that uses a SLES base image and installs some packages via zypper. Every time this step runs uncached, it takes a very long time to complete.
This is a dummy Dockerfile for verification of this issue.
# syntax=docker/dockerfile:1.3
FROM registry.suse.com/suse/sles12sp4
RUN zypper search iproute2
This is execution when I enable Buildkit:
docker build --no-cache --progress=plain --pull -t test_zypper .
#1 [internal] load build definition from Dockerfile
#1 sha256:1e8bc50247fba08161184996db9e2b6bca36c339623376a360765244d9d3ed8b
#1 transferring dockerfile: 202B done
#1 DONE 0.0s
#2 [internal] load .dockerignore
#2 sha256:bfa4297d1f77b21d1d84347ff3f9c338cef560c9f5c8ef8f6843338b88a83178
#2 transferring context: 2B done
#2 DONE 0.0s
#3 resolve image config for docker.io/docker/dockerfile:1.3
#3 sha256:4fcd28d33487ad029eab28c03869fd56295f3902c713674c129a438f7a780653
#3 DONE 1.1s
#4 docker-image://docker.io/docker/dockerfile:1.3@sha256:42399d4635eddd7a9b8a24be879d2f9a930d0ed040a61324cfdf59ef1357b3b2
#4 sha256:7862c1373501a4a9cd96ccd04641bb1d96c86d034546e74fe74585e3dd12f952
#4 CACHED
#5 [internal] load build definition from Dockerfile
#5 sha256:adf8dd6b4b2604f820e4a4112252c8bfd5984ffa809d1fc7c5330e387575a53d
#5 DONE 0.0s
#6 [internal] load .dockerignore
#6 sha256:59c105584afe8ac8255febcea4650f6e8891b4b14fcdd7b93254039769df3828
#6 DONE 0.0s
#7 [internal] load metadata for registry.suse.com/suse/sles12sp4:latest
#7 sha256:30c143f62f5a593ad20fd34265d2933e13da97368f12f3e0c990b52851933dff
#7 DONE 0.5s
#8 [1/2] FROM registry.suse.com/suse/sles12sp4@sha256:06390bd3b9903f3d4bb1345deb7fc35e18af73de0263d0f4d5c619267bee2adf
#8 sha256:3d15a7aaf66ed6810de2347b0da9787e5a57b9c536d85ccc4b01e9eb5831bcc1
#8 CACHED
#9 [2/2] RUN zypper search iproute2
#9 sha256:17060fcd75740edd49881abc4d1b5a4f7de80f59cde5b2b6f32e97ff02bbc29d
#9 377.9 Refreshing service 'container-suseconnect-zypp'.
#9 556.7 Problem retrieving the repository index file for service 'container-suseconnect-zypp':
#9 556.7 [container-suseconnect-zypp|file:/usr/lib/zypp/plugins/services/container-suseconnect-zypp]
#9 556.7 Warning: Skipping service 'container-suseconnect-zypp' because of the above error.
#9 556.7 Loading repository data...
#9 556.7 Warning: No repositories defined. Operating only with the installed resolvables. Nothing can be installed.
#9 556.7 Reading installed packages...
#9 556.7 No matching items found.
#9 ERROR: executor failed running [/bin/sh -c zypper search iproute2]: exit code: 104
------
> [2/2] RUN zypper search iproute2:
------
executor failed running [/bin/sh -c zypper search iproute2]: exit code: 104
This is execution when I don't enable Buildkit:
time docker build --no-cache --progress=plain --pull -t test_zypper .
Sending build context to Docker daemon 678.5MB
Step 1/2 : FROM registry.suse.com/suse/sles12sp4
latest: Pulling from suse/sles12sp4
Digest: sha256:06390bd3b9903f3d4bb1345deb7fc35e18af73de0263d0f4d5c619267bee2adf
Status: Image is up to date for registry.suse.com/suse/sles12sp4:latest
---> 3126dff9c7fd
Step 2/2 : RUN zypper search iproute2
---> Running in 3efe8a741628
Refreshing service 'container-suseconnect-zypp'.
Problem retrieving the repository index file for service 'container-suseconnect-zypp':
[container-suseconnect-zypp|file:/usr/lib/zypp/plugins/services/container-suseconnect-zypp]
Warning: Skipping service 'container-suseconnect-zypp' because of the above error.
Loading repository data...
Warning: No repositories defined. Operating only with the installed resolvables. Nothing can be installed.
Reading installed packages...
No matching items found.
The command '/bin/sh -c zypper search iproute2' returned a non-zero code: 104
real 0m23.972s
user 0m1.987s
sys 0m2.161s
The problem is not missing repositories: in my original Dockerfile they are all defined and the build eventually works, but each zypper command takes 20 minutes or more.
Is something wrong with the way I use BuildKit?
Thanks in advance!

Why does Docker invalidate the build cache after changing permissions on a file COPY'd with the --chmod flag?

My working directory contains two files:
Dockerfile given below
FROM ubuntu:20.04
COPY --chmod=755 ./hello.txt /opt/
an empty hello.txt file with 644 chmod permissions
I ran the following sequence of commands, using Docker 20.10.8:
1. docker build . => first build, nothing is re-used from the cache
#6 [2/2] COPY --chmod=755 ./hello.txt /opt/
#6 sha256:2cb62e01e8edffaed06aeda7d8fa85b5f89850564f7458ab0be5cbe3d90478bf
#6 DONE 0.0s
2. docker build . => second build without any change, all layers are re-used from the cache
#6 [2/2] COPY --chmod=755 ./hello.txt /opt/
#6 sha256:7d6fd54f3d6cc543bd4d186861d7ae116c5ed16c7ebca40a37fdbe027bce9ecc
#6 CACHED
3. chmod 777 hello.txt => change permissions from 644 to 777
4. docker build . => another build after changing permissions, COPY --chmod layer is not re-used from the cache
#4 [2/2] COPY --chmod=755 ./hello.txt /opt/
#4 sha256:3f74f87506d9be5cbb722a723bf1422ce0a24b538201d719f712bbb4915de5a6
#4 DONE 0.0s
Why does command 4 not re-use the build cache like command 2, even though the permissions and the content of the file in the image are identical thanks to the --chmod flag in the COPY instruction?
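A plausible explanation, offered with some caution: Docker's build cache documentation describes the checksum for COPY as being calculated from the metadata of the source files in the build context as well as their content (modification time excluded), while --chmod only affects the file written into the image. A permission change on the host would therefore change the cache key even though the resulting layer is identical. A minimal sketch of a workaround, assuming it is acceptable to pin the mode of hello.txt before every build:

```shell
# Create the empty file from the question if it is not there yet.
touch hello.txt

# Pin the source mode so that stray local chmod changes do not alter
# the checksum computed for the COPY source.
chmod 644 hello.txt

# Show the mode that the next build will see.
ls -l hello.txt

# Where Docker is available, a rebuild can then be compared against
# the earlier run that was also made with mode 644:
#   docker build .
```

This does not change how the cache key is computed. It only keeps the input stable, so whether the cached layer is actually re-used still depends on an earlier build having seen the same mode.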

Docker: using SSH in builds with Buildkit

Following the documentation I'm trying to pass an SSH key to my container. This is my original Dockerfile
# syntax=docker/dockerfile:experimental
FROM tiangolo/uvicorn-gunicorn-fastapi:python3.6
RUN mkdir -p -m 0600 ~/.ssh && ssh-keyscan github.com >> ~/.ssh/known_hosts
RUN --mount=type=ssh git clone git@github.com:USER/REPO.git
and this works
DOCKER_BUILDKIT=1 docker build --ssh default=~/github .
However, if I try to install anything with apt:
# syntax=docker/dockerfile:experimental
FROM tiangolo/uvicorn-gunicorn-fastapi:python3.6
RUN apt update
RUN mkdir -p -m 0600 ~/.ssh && ssh-keyscan github.com >> ~/.ssh/known_hosts
RUN --mount=type=ssh git clone git@github.com:USER/REPO.git
I get the following error:
[+] Building 1.8s (7/9)
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 306B 0.0s
=> resolve image config for docker.io/docker/dockerfile:experimental 1.1s
=> CACHED docker-image://docker.io/docker/dockerfile:experimental@sha256:de85b2f3a3e8a2f7fe48e8e84a65f6fdd5cd5183afa6412fff9caa6871649c44 0.0s
=> [internal] load metadata for docker.io/tiangolo/uvicorn-gunicorn-fastapi:python3.6 0.0s
=> CACHED [1/4] FROM docker.io/tiangolo/uvicorn-gunicorn-fastapi:python3.6 0.0s
=> ERROR [2/4] RUN apt update 0.4s
------
> [2/4] RUN apt update:
#7 0.352
#7 0.352 WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
#7 0.352
#7 0.359 Reading package lists...
#7 0.375 E: Could not get lock /var/lib/apt/lists/lock - open (13: Permission denied)
#7 0.375 E: Unable to lock directory /var/lib/apt/lists/
------
failed to solve with frontend dockerfile.v0: failed to solve with frontend gateway.v0: rpc error: code = Unknown desc = failed to build LLB: executor failed running [/bin/sh -c apt update]: runc did not terminate sucessfully
However, the second Dockerfile actually works if Buildkit is disabled. Any suggestions on what might be the problem?
I had this exact same issue. For me the solution was to upgrade Docker. I had this issue with 19.03.11 which my Ubuntu install was pulling in as a snap. 20.10.1 (latest as of this post) worked for me.
More info here: https://github.com/moby/moby/issues/39106#issuecomment-752246367
edit: Unfortunately this doesn't work when the build is run non-interactively (for example, as a systemd-based CI agent) - at least for me.

Docker build --ulimit flag has no effect

My docker builds are failing because of a file handle limit error. They crash out with
Error: EMFILE: too many open files
when I check ulimit -n on the container I see
-n: file descriptors 1024
So I pass the following flags to my build command
docker build --ulimit nofile=65536:65536 -t web .
but this does not change anything, my container still shows
-n: file descriptors 1024
No matter what I do, I don't seem to be able to change that file descriptor limit.
What am I doing wrong here?
So, I discovered the cause. Posting the answer in case anyone else is having the same issue, as I just wasted most of a day on this.
I have been debugging a very long running build and have been using
export DOCKER_BUILDKIT=1
to enable some extended build information (very useful timings, etc.). However, it appears that with DOCKER_BUILDKIT enabled, ulimit flags passed to the docker build command are completely ignored.
When I set
export DOCKER_BUILDKIT=0
it works. Long story short: avoid using BuildKit with ulimit parameters.
I wrote a simple test and it seems to work fine on Docker 18.06:
> $ docker -v
Docker version 18.06.1-ce, build e68fc7a
I created a Dockerfile like this:
FROM alpine
RUN ulimit -n > /tmp/ulimit.txt
And then:
> $ docker build --ulimit nofile=65536:65536 .
Sending build context to Docker daemon 2.048kB
Step 1/2 : FROM alpine
---> e21c333399e0
Step 2/2 : RUN ulimit -n > /tmp/ulimit.txt
---> Running in 1aa4391d057d
Removing intermediate container 1aa4391d057d
---> 18dd1953d365
Successfully built 18dd1953d365
> $ docker run -ti 18dd1953d365 cat /tmp/ulimit.txt
65536
> $ docker build --ulimit nofile=1024:1024 --no-cache .
Sending build context to Docker daemon 2.048kB
Step 1/2 : FROM alpine
---> e21c333399e0
Step 2/2 : RUN ulimit -n > /tmp/ulimit.txt
---> Running in c20067d1fe10
Removing intermediate container c20067d1fe10
---> 134fc7252574
Successfully built 134fc7252574
> $ docker run -ti 134fc7252574 cat /tmp/ulimit.txt
1024
When using BuildKit, Docker seems to execute the command in the context of the daemon's systemd unit, so the build inherits that unit's ulimit.
I used the Dockerfile to test:
> cat <<EOF >Dockerfile
FROM alpine
RUN echo -e "\n\n-----------------\nulimit: $(ulimit -n)\n-----------------\n\n"
EOF
Check first the actual limit values for docker service:
> systemctl show docker.service | grep LimitNOFILE
LimitNOFILE=infinity
LimitNOFILESoft=infinity
The value set inside a running container is 1048576:
> docker run -it --rm alpine sh -c "ulimit -n"
1048576
The value set inside a BuildKit build is 1073741816:
> DOCKER_BUILDKIT=1 docker build --progress=plain --no-cache .
#2 [internal] load build definition from Dockerfile
#2 transferring dockerfile: 195B done
#2 DONE 0.0s
#1 [internal] load .dockerignore
#1 transferring context: 2B done
#1 DONE 0.0s
#3 [internal] load metadata for docker.io/library/alpine:latest
#3 DONE 0.0s
#5 [1/2] FROM docker.io/library/alpine
#5 CACHED
#4 [2/2] RUN echo -e "\n\n-----------------\nulimit: $(ulimit -n)\n--------...
#4 0.452
#4 0.452
#4 0.452 -----------------
#4 0.452 ulimit: 1073741816
#4 0.452 -----------------
#4 0.452
#4 0.452
#4 DONE 0.5s
#6 exporting to image
#6 exporting layers 0.0s done
#6 writing image sha256:facf7aee0b81d814d5b23a663e4f859ec8ba54d7e5fe6fdbbf8beacf0194393b done
#6 DONE 0.0s
Configure docker.service to set a different default value (LimitNOFILE=1024) that will also be used by BuildKit (be careful not to overwrite an existing file):
> mkdir -p /etc/systemd/system/docker.service.d
> cat <<EOF >/etc/systemd/system/docker.service.d/service.conf.ok
[Service]
LimitNOFILE=1024
EOF
> systemctl daemon-reload
> systemctl restart docker.service
The value set inside a running container remains unchanged at 1048576:
> docker run -it --rm alpine sh -c "ulimit -n"
1048576
The value set inside a BuildKit build is now 1024:
> DOCKER_BUILDKIT=1 docker build --progress=plain --no-cache .
#2 [internal] load build definition from Dockerfile
#2 transferring dockerfile: 195B done
#2 DONE 0.0s
#1 [internal] load .dockerignore
#1 transferring context: 2B done
#1 DONE 0.0s
#3 [internal] load metadata for docker.io/library/alpine:latest
#3 DONE 0.0s
#5 [1/2] FROM docker.io/library/alpine
#5 CACHED
#4 [2/2] RUN echo -e "\n\n-----------------\nulimit: $(ulimit -n)\n--------...
#4 0.452
#4 0.452
#4 0.452 -----------------
#4 0.452 ulimit: 1024
#4 0.452 -----------------
#4 0.452
#4 0.452
#4 DONE 0.5s
#6 exporting to image
#6 exporting layers 0.0s done
#6 writing image sha256:7e40c8a8d5f0ca8f2b2b53515f11f47655f6e1693ffcd5f5a118402c13a44ab4 done
#6 DONE 0.0s
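The same mechanism can presumably be used to raise the limit to the value the original question asked for, rather than lowering it to 1024. A minimal sketch, assuming a systemd-managed Docker daemon; the file name limits.conf and the scratch directory dropin-demo are hypothetical, and the drop-in is written locally first so that nothing under /etc is overwritten by accident.

```shell
# Write the drop-in into a scratch directory first (hypothetical names).
mkdir -p dropin-demo
cat > dropin-demo/limits.conf <<'EOF'
[Service]
LimitNOFILE=65536
EOF

# Review the file before installing it.
cat dropin-demo/limits.conf

# Installing it and restarting the daemon needs root, so these steps are
# left as comments:
#   sudo mkdir -p /etc/systemd/system/docker.service.d
#   sudo cp dropin-demo/limits.conf /etc/systemd/system/docker.service.d/limits.conf
#   sudo systemctl daemon-reload
#   sudo systemctl restart docker.service
#
# Afterwards the test Dockerfile from this answer can be rebuilt to check the value:
#   DOCKER_BUILDKIT=1 docker build --progress=plain --no-cache .
```

Restarting docker.service interrupts running containers unless live restore is configured, so this is better tried on a build machine than on a production host.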
