I have a Dockerfile with a RUN instruction like this:
RUN echo "$PWD"
When generating the Docker image I only get this in console:
Step 6/17 : RUN echo "$PWD"
---> Using cache
---> 0e27924953b9
How can I get the output of the echo command in my console?
Each line in the dockerfile creates a new layer in the resulting filesystem. Unless a modification is made to your dockerfile that hasn't previously been encountered, Docker optimizes the build by reusing existing layers that have previously been built. You can see these intermediate images with the command docker images --all.
This means that Docker only needs to build from the 1st changed line in the dockerfile onwards, saving much time with repeated builds on a well-crafted dockerfile. You've already built this layer in a previous build, so it's being skipped and taken from cache.
docker build --no-cache .
should prevent the build process from using cached layers.
Change the final path parameter above to suit your environment.
Related
I've searched site:stackoverflow.com dockerfile: ENV, RUN - layers or images and have read Does Docker EXPOSE make a new layer? and What are Docker image "layers"?.
While reading docs Best practices for writing Dockerfiles and trying to understand this part:
Each ENV line creates a new intermediate layer, just like RUN
commands. This means that even if you unset the environment variable
in a future layer, it still persists in this layer and its value can
be dumped.
I recalled that part above:
In older versions of Docker, it was important that you minimized the
number of layers in your images to ensure they were performant. The
following features were added to reduce this limitation:
Only the instructions RUN, COPY, ADD create layers. Other instructions
create temporary intermediate images, and do not increase the size of
the build.
I've read How to unset "ENV" in dockerfile?. and redid the example given on doc page, it indeed proves ENV is not unset:
$ docker build -t alpine:envtest -<<HITHERE
> FROM alpine
> ENV ADMIN_USER="mark"
> RUN unset ADMIN_USER
> HITHERE
Sending build context to Docker daemon 3.072kB
Step 1/3 : FROM alpine
latest: Pulling from library/alpine
89d9c30c1d48: Already exists
Digest: sha256:c19173c5ada610a5989151111163d28a67368362762534d8a8121ce95cf2bd5a
Status: Downloaded newer image for alpine:latest
---> 965ea09ff2eb
Step 2/3 : ENV ADMIN_USER="mark"
---> Running in 5d34f829a387
Removing intermediate container 5d34f829a387
---> e9c50b16c0e1
Step 3/3 : RUN unset ADMIN_USER
---> Running in dbcf57ca390d
Removing intermediate container dbcf57ca390d
---> 2cb4de2e0257
Successfully built 2cb4de2e0257
Successfully tagged alpine:envtest
$ docker run --rm alpine:envtest sh -c 'echo $ADMIN_USER'
mark
And the output says same "Removing intermediate container" for both ENV and RUN.
I've recently downloaded docker, don't think it is that old:
$ docker --version
Docker version 19.03.5, build 633a0ea
Maybe RUN instruction and RUN command is a different thing?
ENV, RUN - do they create layers, images or containers?
The Docker as containerization system based on two main concepts of image and container. The major difference between them is the top writable layer. When you generate a new container the new writable layer will be put above the last image layer. This layer is often called the container layer.
All the underlying image content remains unchanged and each change in the running container that creating new files, modifying existing files and so on will be copied in this thin writable layer.
In this case, Docker only stores the actual containers data and one image instance, which decreases storage usage and simplifies the underlying workflow. I would compare it with static and dynamic linking in C language, so Docker uses dynamic linking.
The image is a combination of layers. Each layer is only a set of differences from the layer before it.
The documentation says:
Only the instructions RUN, COPY, ADD create layers. Other instructions
create temporary intermediate images, and do not increase the size of
the build.
The description here is neither really clear nor accurate, and generally speaking these aren't the only instructions that create layers in the latest versions of Docker, as the documentation outlines.
For example, by default WORKDIR creates a given path if it does not exist and change directory to it. If the new path was created WORKDIR will generate a new layer.
By the way, ENV doesn't lead to layer creation. The data will be stored permanently in image and container config and there is no easy way to get rid of it. Basically, there are two options, how to organize workflow:
Temporal environment variables, they will be available until the end of the current RUN directive:
RUN export NAME='megatron' && echo $NAME # 'megatron'
RUN echo $NAME # blank
Clean environment variable, if there is no difference for you between the absence of env or blank content of it, then you could do:
ENV NAME='megatron'
# some instructions
ENV NAME=''
RUN echo $NAME
In the context of Docker, there is no distinction between commands and instructions. For RUN any commands that don't change filesystem content won't trigger permanent layers creation. Consider the following Dockerfile:
FROM alpine:latest
RUN echo "Hello World" # no layer
RUN touch file.txt # new layer
WORKDIR /no/existing/path # new layer
In the end, the output would be:
Step 1/4 : FROM alpine:latest
---> 965ea09ff2eb
Step 2/4 : RUN echo "Hello World"
---> Running in 451adb70f017
Hello World
Removing intermediate container 451adb70f017
---> 816ccbd1e8aa
Step 3/4 : RUN touch file.txt
---> Running in 9edc6afdd1e5
Removing intermediate container 9edc6afdd1e5
---> ea0040ec0312
Step 4/4 : WORKDIR /no/existing/path
---> Running in ec0feaf6710d
Removing intermediate container ec0feaf6710d
---> f2fe46478f7c
Successfully built f2fe46478f7c
Successfully tagged envtest:lastest
There is inspect command for inspecting Docker objects:
docker inspect --format='{{json .RootFS.Layers}}' <image_id>
Which shows us the list of SHA of three layers getting FROM, second RUN and WORKDIR directives, I would recommend using dive for exploring each layer in a docker image.
So why does it say removing intermediate container and not removing intermediate layer? Actually to execute RUN commands Docker needs to instantiate a container with the intermediate image up to that line of the Dockerfile and run the actual command. It will then "commit" the state of the container as a new intermediate image and continue the building process.
We have a Dockerfile, where at a certain point would like no caching to happen.
Currently we are using
ENV CACHE_BUST=$($RANDOM)
Upon further inspection, funny enough that gets cached:
Step 1/1 : ENV CACHE_BUST=$($RANDOM)
---> Using cache
Is there any way from inside the Dockerfile to bust the cache without passing in a unique build-arg (Like docker build . --build-arg CACHE_BUST=$(date +%s)) in the build step?
Update: Reviewing this one, it looks like you injected the cache busting option incorrectly in two ways:
ENV is not an ARG
The $(x) syntax is not a variable expansion, you need curly brackets (${}), not parenthesis ($()).
To break the cache on the next run line, the syntax is:
ARG CACHE_BUST
RUN echo "command with external dependencies"
And then build with:
docker build --build-arg CACHE_BUST=$(date +%s) .
Why does that work? Because during the build, the values for ARG are injected into RUN commands as environment variables. Changing an environment variable results in a cache miss on the new build.
To bust the cache, one of the inputs needs to change. If the command being run is the same, the cache will be reused even if the command has external dependencies that have changed, since docker cannot see those external dependencies.
Options to work around this include:
Passing a build arg that changes (e.g. setting it to a date stamp).
Changing a file that gets included into the image with COPY or ADD.
Running your build with the --no-cache option.
Since you do not want to do option 1, there is a way to do option 3 on a specific line, but only if you can split up your Dockerfile into 2 parts. The first Dockerfile has all the lines as you have today up to the point you want to break the cache. Then the second Dockerfile has a FROM line to depend on the first Dockerfile, and you build that with the --no-cache option. E.g.
Dockerfile1:
FROM base
RUN normal steps
Dockerfile2
FROM intermediate
RUN curl external.jar>file.jar
RUN other lines that cannot be cached
CMD your cmd
Then build with:
docker build -f Dockerfile1 -t intermediate .
docker build -f Dockerfile2 -t final --no-cache .
The only other option I can think of is to make a new frontend with BuildKit that allows you to inject an explicit cache break, or unique variable that results in a cache break.
You can add ADD layer with the downloading of some dynamic page from stable source in the beginning of Dockerfile. Image will be always re-built without using a cache.
Just an example of Dockerfile:
FROM alpine:3.9
ADD https://google.com cache_bust
RUN apk add --no-cache wget
p.s. I believe you are aware of docker build --no-cache option.
I'm confused about when should I use CMD vs RUN. For example, to execute bash/shell commands (i.e. ls -la) I would always use CMD or is there a situation where I would use RUN? Trying to understand the best practices about these two similar Dockerfile directives.
RUN is an image build step, the state of the container after a RUN command will be committed to the container image. A Dockerfile can have many RUN steps that layer on top of one another to build the image.
CMD is the command the container executes by default when you launch the built image. A Dockerfile will only use the final CMD defined. The CMD can be overridden when starting a container with docker run $image $other_command.
ENTRYPOINT is also closely related to CMD and can modify the way a container is started from an image.
RUN - command triggers while we build the docker image.
CMD - command triggers while we launch the created docker image.
I found this article very helpful to understand the difference between them:
RUN -
RUN instruction allows you to install your application and packages
required for it. It executes any commands on top of the current image
and creates a new layer by committing the results. Often you will find
multiple RUN instructions in a Dockerfile.
CMD -
CMD instruction allows you to set a default command, which will be
executed only when you run container without specifying a command.
If Docker container runs with a command, the default command will be
ignored. If Dockerfile has more than one CMD instruction, all but last
CMD instructions are ignored.
The existing answers cover most of what anyone looking at this question would need. So I'll just cover some niche areas for CMD and RUN.
CMD: Duplicates are Allowed but Wasteful
GingerBeer makes an important point: you won't get any errors if you put in more than one CMD - but it's wasteful to do so. I'd like to elaborate with an example:
FROM busybox
CMD echo "Executing CMD"
CMD echo "Executing CMD 2"
If you build this into an image and run a container in this image, then as GingerBeer states, only the last CMD will be heeded. So the output of that container will be:
Executing CMD 2
The way I think of it is that "CMD" is setting a single global variable for the entire image that is being built, so successive "CMD" statements simply overwrite any previous writes to that global variable, and in the final image that's built the last one to write wins. Since a Dockerfile executes in order from top to bottom, we know that the bottom-most CMD is the one gets this final "write" (metaphorically speaking).
RUN: Commands May not Execute if Images are Cached
A subtle point to notice about RUN is that it's treated as a pure function even if there are side-effects, and is thus cached. What this means is that if RUN had some side effects that don't change the resultant image, and that image has already been cached, the RUN won't be executed again and so the side effects won't happen on subsequent builds. For example, take this Dockerfile:
FROM busybox
RUN echo "Just echo while you work"
First time you run it, you'll get output such as this, with different alphanumeric IDs:
docker build -t example/run-echo .
Sending build context to Docker daemon 9.216kB
Step 1/2 : FROM busybox
---> be5888e67be6
Step 2/2 : RUN echo "Just echo while you work"
---> Running in ed37d558c505
Just echo while you work
Removing intermediate container ed37d558c505
---> 6f46f7a393d8
Successfully built 6f46f7a393d8
Successfully tagged example/run-echo:latest
Notice that the echo statement was executed in the above. Second time you run it, it uses the cache, and you won't see any echo in the output of the build:
docker build -t example/run-echo .
Sending build context to Docker daemon 9.216kB
Step 1/2 : FROM busybox
---> be5888e67be6
Step 2/2 : RUN echo "Just echo while you work"
---> Using cache
---> 6f46f7a393d8
Successfully built 6f46f7a393d8
Successfully tagged example/run-echo:latest
RUN - Install Python , your container now has python burnt into its image
CMD - python hello.py , run your favourite script
Note: Don’t confuse RUN with CMD. RUN actually runs a command and
commits the result; CMD does not execute anything at build time, but
specifies the intended command for the image.
from docker file reference
https://docs.docker.com/engine/reference/builder/#cmd
RUN Command:
RUN command will basically, execute the default command, when we are building the image. It also will commit the image changes for next step.
There can be more than 1 RUN command, to aid in process of building a new image.
CMD Command:
CMD commands will just set the default command for the new container. This will not be executed at build time.
If a docker file has more than 1 CMD commands then all of them are ignored except the last one. As this command will not execute anything but just set the default command.
RUN: Can be many, and it is used in build process, e.g. install multiple libraries
CMD: Can only have 1, which is your execute start point (e.g. ["npm", "start"], ["node", "app.js"])
There has been enough answers on RUN and CMD. I just want to add a few words on ENTRYPOINT. CMD arguments can be overwritten by command line arguments, while ENTRYPOINT arguments are always used.
This article is a good source of information.
I'm confused about when should I use CMD vs RUN. For example, to execute bash/shell commands (i.e. ls -la) I would always use CMD or is there a situation where I would use RUN? Trying to understand the best practices about these two similar Dockerfile directives.
RUN is an image build step, the state of the container after a RUN command will be committed to the container image. A Dockerfile can have many RUN steps that layer on top of one another to build the image.
CMD is the command the container executes by default when you launch the built image. A Dockerfile will only use the final CMD defined. The CMD can be overridden when starting a container with docker run $image $other_command.
ENTRYPOINT is also closely related to CMD and can modify the way a container is started from an image.
RUN - command triggers while we build the docker image.
CMD - command triggers while we launch the created docker image.
I found this article very helpful to understand the difference between them:
RUN -
RUN instruction allows you to install your application and packages
required for it. It executes any commands on top of the current image
and creates a new layer by committing the results. Often you will find
multiple RUN instructions in a Dockerfile.
CMD -
CMD instruction allows you to set a default command, which will be
executed only when you run container without specifying a command.
If Docker container runs with a command, the default command will be
ignored. If Dockerfile has more than one CMD instruction, all but last
CMD instructions are ignored.
The existing answers cover most of what anyone looking at this question would need. So I'll just cover some niche areas for CMD and RUN.
CMD: Duplicates are Allowed but Wasteful
GingerBeer makes an important point: you won't get any errors if you put in more than one CMD - but it's wasteful to do so. I'd like to elaborate with an example:
FROM busybox
CMD echo "Executing CMD"
CMD echo "Executing CMD 2"
If you build this into an image and run a container in this image, then as GingerBeer states, only the last CMD will be heeded. So the output of that container will be:
Executing CMD 2
The way I think of it is that "CMD" is setting a single global variable for the entire image that is being built, so successive "CMD" statements simply overwrite any previous writes to that global variable, and in the final image that's built the last one to write wins. Since a Dockerfile executes in order from top to bottom, we know that the bottom-most CMD is the one gets this final "write" (metaphorically speaking).
RUN: Commands May not Execute if Images are Cached
A subtle point to notice about RUN is that it's treated as a pure function even if there are side-effects, and is thus cached. What this means is that if RUN had some side effects that don't change the resultant image, and that image has already been cached, the RUN won't be executed again and so the side effects won't happen on subsequent builds. For example, take this Dockerfile:
FROM busybox
RUN echo "Just echo while you work"
First time you run it, you'll get output such as this, with different alphanumeric IDs:
docker build -t example/run-echo .
Sending build context to Docker daemon 9.216kB
Step 1/2 : FROM busybox
---> be5888e67be6
Step 2/2 : RUN echo "Just echo while you work"
---> Running in ed37d558c505
Just echo while you work
Removing intermediate container ed37d558c505
---> 6f46f7a393d8
Successfully built 6f46f7a393d8
Successfully tagged example/run-echo:latest
Notice that the echo statement was executed in the above. Second time you run it, it uses the cache, and you won't see any echo in the output of the build:
docker build -t example/run-echo .
Sending build context to Docker daemon 9.216kB
Step 1/2 : FROM busybox
---> be5888e67be6
Step 2/2 : RUN echo "Just echo while you work"
---> Using cache
---> 6f46f7a393d8
Successfully built 6f46f7a393d8
Successfully tagged example/run-echo:latest
RUN - Install Python , your container now has python burnt into its image
CMD - python hello.py , run your favourite script
Note: Don’t confuse RUN with CMD. RUN actually runs a command and
commits the result; CMD does not execute anything at build time, but
specifies the intended command for the image.
from docker file reference
https://docs.docker.com/engine/reference/builder/#cmd
RUN Command:
RUN command will basically, execute the default command, when we are building the image. It also will commit the image changes for next step.
There can be more than 1 RUN command, to aid in process of building a new image.
CMD Command:
CMD commands will just set the default command for the new container. This will not be executed at build time.
If a docker file has more than 1 CMD commands then all of them are ignored except the last one. As this command will not execute anything but just set the default command.
RUN: Can be many, and it is used in build process, e.g. install multiple libraries
CMD: Can only have 1, which is your execute start point (e.g. ["npm", "start"], ["node", "app.js"])
There has been enough answers on RUN and CMD. I just want to add a few words on ENTRYPOINT. CMD arguments can be overwritten by command line arguments, while ENTRYPOINT arguments are always used.
This article is a good source of information.
All, i'm trying to persistently copy files from my host to an image so those files are available with every container launched based on that image. Running on debian wheezy 64bit as virtualbox guest.
the Dockerfile is fairly simple (installing octave image):
FROM debian:jessie
MAINTAINER GG_Python <[redacted]#gmail.com>
RUN apt-get update
RUN apt-get update
RUN apt-get install -y octave octave-image octave-missing-functions octave-nan octave-statistics
RUN mkdir /octave
RUN mkdir /octave/libs
RUN mkdir /octave/libs/jsonlab
COPY ~/octave/jsonlab/loadjson.m /octave/libs/jsonlab/.
I'm getting the following trace after issuing a build command: docker build -t octave .
Sending build context to Docker daemon 423.9 kB
Sending build context to Docker daemon
Step 0 : FROM debian:jessie
---> 58052b122b60
Step 1 : MAINTAINER GG_Python <[..]#gmail.com>
---> Using cache
---> 90d2dd2f7ee8
Step 2 : RUN apt-get update
---> Using cache
---> 4c72c25cd829
Step 3 : RUN apt-get update
---> Using cache
---> b52f0bcb9f86
Step 4 : RUN apt-get install -y octave octave-image octave-missing-functions octave-nan octave-statistics
---> Using cache
---> f0637ab96d5e
Step 5 : RUN mkdir /octave
---> Using cache
---> a2d278b2819b
Step 6 : RUN mkdir /octave/libs
---> Using cache
---> 65efbbe01c99
Step 7 : RUN mkdir /octave/libs/jsonlab
---> Using cache
---> e41b80901266
Step 8 : COPY ~/octave/jsonlab/loadjson.m /octave/libs/jsonlab/.
INFO[0000] ~/octave/jsonlab/loadjson.m: no such file or directory
Docker absolutely refuses to copy this file from the host into the image. Needless to say a the file loadjson.m is there (cat displays), all my attempts to change the path (relative, absolute, etc.) failed. Any advice why this simple task is problematic?
At the time I originally wrote this, Docker didn’t expand ~ or $HOME. Now it does some expansions inside the build context, but even so they are probably not what you want—they aren’t your home directory outside the context. You need to reference the file explicitly, or package it relative to the Dockerfile itself.
Docker can only copy files from the context, the folder you are minus any file listed in the dockerignore file.
When you run 'docker build' docker tars the context and it sends it to the docker daemon you are connected to. It only lets you copy files inside of the context because the daemon might be a remote machine.
I couldn't get COPY to work until I understood the context (I was trying to copy a file from outside of the context)
The docker build command builds an image from a Dockerfile and a context. The build’s context is the files at a specified location PATH. The PATH is a directory on your local filesystem.
A context is processed recursively. So, a PATH includes any subdirectories.
The build is run by the Docker daemon, not by the CLI. The first thing a build process does is send the entire context (recursively) to the daemon. In most cases, it’s best to start with an empty directory as context and keep your Dockerfile in that directory. Add only the files needed for building the Dockerfile.
Warning: Do not use your root directory, /, as the PATH as it causes the build to transfer the entire contents of your hard drive to the Docker daemon.
Reference:
https://docs.docker.com/engine/reference/builder/#usage
I had similar issue. I solved it by checking two things:
Inside your, docker-compose.yaml check context of the service, docker will not copy any file outside of this directory. For example if the context is app/ then you cannot copy anything from ../app
Check .dockerignore to be sure that you are not ignoring the file you want to copy.
I got it working by first checking what the context was,
setting an absolute path before the source file in
your Dockerfile to get that information:
# grep COPY Dockerfile
COPY /path/to/foo /whatever/path/in/destination/foo
Building with that:
docker build -t bar/foo .
you'll get an error, which states the context-path that Docker
is apparently looking into for its files, e.g.
it turns out to be:
/var/lib/docker/tmp # I don't remember
Copying(!) your set of build-files in that directory (here: /var/lib/docker/tmp),
cd into it, build from there.
See if that works, and don't forget to do some housekeeping cleaning up
the tmp, deleting your files before the next visit(or).
HTH
Michael
Got this error using a Dockerfile for a linux container on a Windows machine:
#24 1.160 Skipping project "/src/Common/MetaData/Metadata.csproj" because it was not found.
Restore worked perfectly on the host machine.
Turned out to be the error mentioned here:
https://stackoverflow.com/a/68592423/3850405
A .csproj file did not match casing in Visual Studio vs the file system.