What context does the WORKDIR keyword in a Dockerfile refer to? Is it in the context I run docker build from or inside the container I am producing?
I find myself often putting RUN cd && ... in my docker files and am hoping there's another way, I feel like I'm missing something.
It is inside the container.
Taken from the Dockerfile reference: https://docs.docker.com/engine/reference/builder/#workdir
The WORKDIR instruction sets the working directory for any RUN, CMD, ENTRYPOINT, COPY and ADD instructions that follow it in the Dockerfile. If the WORKDIR doesn’t exist, it will be created even if it’s not used in any subsequent Dockerfile instruction.
So rather than adding RUN cd && ... you could do:
WORKDIR /path/to/dir
RUN command
All paths in a Dockerfile, except the first half of COPY and ADD instructions, refer to image filesystem paths. The source paths for COPY and ADD are relative paths (even if they start with /) relative to the build context (the directory at the end of the docker build command, frequently the directory containing the Dockerfile). Nothing in a Dockerfile can ever reference an absolute path on the host or content outside the build context tree.
The only difference between these two Dockerfiles is the directory the second command gets launched in.
RUN cd /dir && command1
RUN command2
WORKDIR /dir
RUN command1
RUN command2
WORKDIR sets the directory inside the image and hence allows you to avoid RUN cd calls.
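As a minimal sketch of the point above (the base image tag and the /app path are illustrative assumptions, not taken from the question), WORKDIR can replace repeated cd calls, and relative paths in later instructions resolve against it:

```dockerfile
# Sketch only: the base image and all paths are hypothetical.
FROM ubuntu:22.04

# Creates /app if it does not exist and makes it the working
# directory for the instructions that follow.
WORKDIR /app

# The destination "." is relative to WORKDIR, so the files land in /app.
# The source "." is the build context on the host side.
COPY . .

# Runs in /app; no "cd /app &&" prefix is needed.
RUN ls -l

# A relative WORKDIR is resolved against the previous one: /app/build.
WORKDIR build
RUN pwd
```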
I've been experiencing some odd behavior with volumes. We have a container that holds a database and is expected to bind-mount host folders containing the data. I'm trying to create a child image that ships with test data, since it is only used for testing.
This requires that, during the build, some data is copied from the host machine and then some scripts run that create additional files. However, when I look at the running container, only the copied files exist; the ones created by the scripts do not. I've boiled the steps down to the following Dockerfile:
FROM ubuntu:xenial-20180112.1
VOLUME /test
COPY /test/copydir/copyfile.txt /test/copydir/copyfile.txt
RUN mkdir -p /test/mkdir && \
touch /test/mkdir/touch.txt
Note that when I bash into the running container and do an
ls -l /test
I only get the 'copydir' folder. If I run an ls in my dockerfile however, I see that both folders exist.
What's going on here?
edit:
For additional context, the following prints out that both directories exist:
FROM ubuntu:xenial-20180112.1
VOLUME /test
COPY /test/copydir/copyfile.txt /test/copydir/copyfile.txt
RUN mkdir -p /test/mkdir && \
touch /test/mkdir/touch.txt && \
ls -l /test
But the following only shows that copydir exists:
FROM ubuntu:xenial-20180112.1
VOLUME /test
COPY /test/copydir/copyfile.txt /test/copydir/copyfile.txt
RUN mkdir -p /test/mkdir && \
touch /test/mkdir/touch.txt
RUN ls -l /test
The Dockerfile reference covers this under VOLUME: if any build steps change the data within the volume after it has been declared, those changes are discarded. Your RUN step writes to /test after VOLUME /test, so its files do not end up in the image; within that same RUN step they are still visible, which is why an ls in the same command shows both folders.
RUN steps that write outside a declared volume (apt-get or yum installs, for example) persist as usual.
Try to change your Dockerfile to:
FROM ubuntu:xenial-20180112.1
RUN mkdir -p /test
COPY /test/copydir/copyfile.txt /test/copydir/copyfile.txt
RUN mkdir -p /test/mkdir && \
touch /test/mkdir/touch.txt
VOLUME /test
In a comment you said, "The example I provided was cut down for clarity, in actuality the volume is defined by a parent image." In that case the problem comes down to the fact that a derived image cannot undeclare a volume declared by its parent. If you can remove the declaration some other way (e.g. using docker-copyedit), your problem may go away. ;)
In my Dockerfile, I want to copy a file from ~/.ssh on my host machine into the container, so I wrote it like this:
# create ssh folder and copy ssh keys from local into container
RUN mkdir -p /root/.ssh
COPY ~/.ssh/id_rsa /root/.ssh/
But when I run docker build -t foo to build it, it stopped with an error:
Step 2 : RUN mkdir -p /root/.ssh
---> Using cache
---> db111747d125
Step 3 : COPY ~/.ssh/id_rsa /root/.ssh/
~/.ssh/id_rsa: no such file or directory
It seems the ~ symbol is not recognized in a Dockerfile. How can I resolve this?
In Docker, it is not possible to copy files from arbitrary locations on the host into the image, since this would be considered a security risk. COPY paths are always resolved relative to the build context, which is the directory you pass to the docker build command (often the current directory).
This is described in the documentation: https://docs.docker.com/reference/builder/#copy
As a result, ~ has no useful meaning here, since it would point to a location that is not part of the context.
If you want to put your local id_rsa file into the image, put it into the context first, e.g. copy it alongside the Dockerfile, and refer to it from there.
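One way this could look, assuming the key has first been copied next to the Dockerfile (for example with cp ~/.ssh/id_rsa . on the host, where ~ is expanded by your shell rather than by Docker). The base image is an arbitrary choice for illustration:

```dockerfile
# Sketch only: assumes id_rsa was copied into the build context beforehand.
FROM ubuntu:22.04

# Create the target directory with the permissions ssh expects.
RUN mkdir -p /root/.ssh && chmod 700 /root/.ssh

# The source path is relative to the build context,
# not to the home directory on the host.
COPY id_rsa /root/.ssh/id_rsa

RUN chmod 600 /root/.ssh/id_rsa
```

Keep in mind that anything copied this way is stored in an image layer, so anyone with access to the image can read the key.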
What is the difference between the COPY and ADD commands in a Dockerfile, and when would I use one over the other?
COPY <src> <dest>
The COPY instruction will copy new files from <src> and add them to the container's filesystem at path <dest>
ADD <src> <dest>
The ADD instruction will copy new files from <src> and add them to the container's filesystem at path <dest>.
You should check the ADD and COPY documentation for a more detailed description of their behaviors, but in a nutshell, the major difference is that ADD can do more than COPY:
ADD allows <src> to be a URL
Referring to comments below, the ADD documentation states that:
If <src> is a local tar archive in a recognized compression format (identity, gzip, bzip2 or xz) then it is unpacked as a directory. Resources from remote URLs are not decompressed.
Note that the Best practices for writing Dockerfiles suggests using COPY where the magic of ADD is not required. Otherwise, you (since you had to look up this answer) are likely to get surprised someday when you mean to copy keep_this_archive_intact.tar.gz into your container, but instead, you spray the contents onto your filesystem.
COPY is
Same as 'ADD', but without the tar and remote URL handling.
Reference straight from the source code.
There is some official documentation on that point: Best Practices for Writing Dockerfiles
Because image size matters, using ADD to fetch packages from remote URLs is strongly discouraged; you should use curl or wget instead. That way you can delete the files you no longer need after they've been extracted and you won't have to add another layer in your image.
RUN mkdir -p /usr/src/things \
&& curl -SL http://example.com/big.tar.xz \
| tar -xJC /usr/src/things \
&& make -C /usr/src/things all
For other items (files, directories) that do not require ADD’s tar auto-extraction capability, you should always use COPY.
From Docker docs:
ADD or COPY
Although ADD and COPY are functionally similar, generally speaking, COPY is preferred. That’s because it’s more transparent than ADD. COPY only supports the basic copying of local files into the container, while ADD has some features (like local-only tar extraction and remote URL support) that are not immediately obvious. Consequently, the best use for ADD is local tar file auto-extraction into the image, as in ADD rootfs.tar.xz /.
More: Best practices for writing Dockerfiles
Suppose you want to add xx.tar.gz to /usr/local in the container, extract it, and then remove the archive, which is no longer needed.
For COPY:
COPY resources/jdk-7u79-linux-x64.tar.gz /tmp/
RUN tar -zxvf /tmp/jdk-7u79-linux-x64.tar.gz -C /usr/local
RUN rm /tmp/jdk-7u79-linux-x64.tar.gz
For ADD:
ADD resources/jdk-7u79-linux-x64.tar.gz /usr/local/
ADD supports extraction of local tar archives only. Besides that, the COPY approach creates three layers, whereas the ADD approach creates only one.
COPY copies a file/directory from your host to your image.
ADD copies a file/directory from your host to your image, but can also fetch remote URLs, extract TAR files, etc...
Use COPY for simply copying files and/or directories from the build context into the image.
Use ADD for downloading remote resources, extracting TAR files, etc..
When creating a Dockerfile, there are two commands that you can use to copy files/directories into the image: ADD and COPY. Although there are slight differences in the scope of their function, they essentially perform the same task.
So, why do we have two commands, and how do we know when to use one or the other?
DOCKER ADD COMMAND
===
Let’s start by noting that the ADD command is older than COPY. Since the launch of the Docker platform, the ADD instruction has been part of its list of commands.
The command copies files/directories to a file system of the specified container.
The basic syntax for the ADD command is:
ADD <src> … <dest>
It includes the source you want to copy (<src>) followed by the destination where you want to store it (<dest>). If the source is a directory, ADD copies everything inside of it (including file system metadata).
For instance, if the file is locally available and you want to add it to the directory of an image, you type:
ADD /source/file/path /destination/path
ADD can also copy files from a URL. It can download an external file and copy it to the wanted destination. For example:
ADD http://source.file/url /destination/path
An additional feature is that it copies compressed files, automatically extracting the content to the given destination. This feature only applies to locally stored compressed files/directories.
ADD source.file.tar.gz /temp
Bear in mind that you cannot download and extract a compressed file/directory from a URL. The command does not unpack external packages when copying them to the local filesystem.
DOCKER COPY COMMAND
===
Due to some functionality issues, Docker had to introduce an additional command for duplicating content – COPY.
Unlike its closely related ADD command, COPY has only one assigned function. Its role is to duplicate files/directories in a specified location in their existing format. This means that it doesn’t deal with extracting a compressed file, but rather copies it as-is.
The instruction can be used only for locally stored files. Therefore, you cannot use it with URLs to copy external files to your container.
To use the COPY instruction, follow the basic command format:
Type in the source and the destination where you want the command to copy the content, as follows:
COPY <src> … <dest>
For example:
COPY /source/file/path /destination/path
Which command to use? (Best Practice)
Considering the circumstances in which the COPY command was introduced, it is evident that keeping ADD was a matter of necessity. Docker released an official document outlining best practices for writing Dockerfiles, which advises preferring COPY over ADD in most cases.
Docker’s official documentation notes that COPY should always be the go-to instruction as it is more transparent than ADD.
If you need to copy from the local build context into a container, stick to using COPY.
The Docker team also strongly discourages using ADD to download and copy a package from a URL. Instead, it’s safer and more efficient to use wget or curl within a RUN command. By doing so, you avoid creating an additional image layer and save space.
Ref: https://phoenixnap.com/kb/docker-add-vs-copy
From Docker docs:
https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/#add-or-copy
"Although ADD and COPY are functionally similar, generally speaking, COPY is preferred. That’s because it’s more transparent than ADD. COPY only supports the basic copying of local files into the container, while ADD has some features (like local-only tar extraction and remote URL support) that are not immediately obvious. Consequently, the best use for ADD is local tar file auto-extraction into the image, as in ADD rootfs.tar.xz /.
If you have multiple Dockerfile steps that use different files from your context, COPY them individually, rather than all at once. This will ensure that each step’s build cache is only invalidated (forcing the step to be re-run) if the specifically required files change.
For example:
COPY requirements.txt /tmp/
RUN pip install --requirement /tmp/requirements.txt
COPY . /tmp/
Results in fewer cache invalidations for the RUN step, than if you put the COPY . /tmp/ before it.
Because image size matters, using ADD to fetch packages from remote URLs is strongly discouraged; you should use curl or wget instead. That way you can delete the files you no longer need after they’ve been extracted and you won’t have to add another layer in your image. For example, you should avoid doing things like:
ADD http://example.com/big.tar.xz /usr/src/things/
RUN tar -xJf /usr/src/things/big.tar.xz -C /usr/src/things
RUN make -C /usr/src/things all
And instead, do something like:
RUN mkdir -p /usr/src/things \
&& curl -SL http://example.com/big.tar.xz \
| tar -xJC /usr/src/things \
&& make -C /usr/src/things all
For other items (files, directories) that do not require ADD’s tar auto-extraction capability, you should always use COPY."
Source: https://nickjanetakis.com/blog/docker-tip-2-the-difference-between-copy-and-add-in-a-dockerile:
COPY and ADD are both Dockerfile instructions that serve similar purposes. They let you copy files from a specific location into a Docker image.
COPY takes in a src and destination. It only lets you copy in a local file or directory from your host (the machine building the Docker image) into the Docker image itself.
ADD lets you do that too, but it also supports 2 other sources. First, you can use a URL instead of a local file / directory. Secondly, you can extract a tar file from the source directly into the destination.
A valid use case for ADD is when you want to extract a local tar file into a specific directory in your Docker image.
If you’re copying in local files to your Docker image, always use COPY because it’s more explicit.
Since Docker 17.05 COPY is used with the --from flag in multi-stage builds to copy artifacts from previous build stages to the current build stage.
from the documentation
Optionally COPY accepts a flag --from=<name|index> that can be used to set the source location to a previous build stage (created with FROM .. AS ) that will be used instead of a build context sent by the user.
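A minimal sketch of such a multi-stage build follows. The stage name build, the image tags, the /out/app path, and the assumption that the build context contains a Go module at its root are all illustrative and not taken from the documentation quoted above:

```dockerfile
# Sketch only: stage name, images, and paths are hypothetical.

# First stage: compile the program with the full toolchain.
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
# CGO is disabled so the binary does not depend on the build image's libc.
RUN CGO_ENABLED=0 go build -o /out/app .

# Second stage: start from a small base image.
FROM alpine:3.19
# --from=build reads from the first stage's filesystem
# instead of the build context sent by the user.
COPY --from=build /out/app /usr/local/bin/app
CMD ["app"]
```

Only the final stage ends up in the resulting image, so the toolchain and source files from the first stage are left behind.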
COPY doesn't support a <src> with a URL scheme.
COPY doesn't unpack compressed archives.
For an instruction of the form <src> <dest>, if <src> is a compressed tar file and <dest> doesn't end with a trailing slash:
ADD considers <dest> a directory and unpacks <src> into it.
COPY considers <dest> a file and writes <src> to it.
COPY supports replacing the build context as the source via the --from flag.
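The trailing-slash rule above can be sketched as follows, assuming a local archive named app.tar.gz in the build context (the file name, base image, and destination paths are made up for illustration):

```dockerfile
# Sketch only: app.tar.gz and the destination paths are hypothetical.
FROM ubuntu:22.04

# ADD treats /opt/app as a directory and unpacks the archive into it.
ADD app.tar.gz /opt/app

# COPY treats /opt/archive as a regular file and writes the
# archive to it unchanged.
COPY app.tar.gz /opt/archive

# With a trailing slash, COPY keeps the original file name:
# the result is /opt/archives/app.tar.gz.
COPY app.tar.gz /opt/archives/
```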
The ADD instruction copies files or folders from a local or remote source and adds them to the container's file system. When it is used to copy local files, they must be inside the build context. ADD also unpacks local .tar files into the destination directory in the image.
Example
ADD http://someserver.com/filename.pdf /var/www/html
COPY copies files from the build context and adds them to the container's file system. It is not possible to copy a remote file using its URL with this Dockerfile instruction.
Example
COPY Gemfile Gemfile.lock ./
COPY ./src/ /var/www/html/
If you have a foo.tar.gz file, compare the following two approaches.
The ADD approach creates fewer layers than the COPY approach, and saves a lot of network traffic when pushing the Docker image.
ADD foo.tar.gz /
COPY foo.tar.gz /
RUN tar -zxvf foo.tar.gz
RUN rm -rf foo.tar.gz
Let's say you have a tar file and you want to uncompress it after placing it in your container and then remove it. You can use the COPY command to do this, but the steps would be: 1) copy the tar file to the destination, 2) uncompress it, 3) remove the tar file. If you do this in 3 steps, a new image is created after each step. You can do it in one step using && but it becomes a hassle.
But if you use ADD, Docker will take care of everything for you and only one intermediate image will be created.
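For comparison, here is roughly what the chained variant could look like, reusing the foo.tar.gz name from the earlier example (the base image and the /tmp staging path are assumptions). Note that COPY itself is still a separate instruction, so the archive remains in that layer even though the later RUN removes it from the final filesystem:

```dockerfile
# Sketch only: base image and staging path are hypothetical.
FROM ubuntu:22.04

# COPY always produces its own layer, which contains the archive.
COPY foo.tar.gz /tmp/

# Extracting and deleting in a single RUN keeps these two
# actions in one layer instead of two.
RUN tar -xzf /tmp/foo.tar.gz -C / \
 && rm /tmp/foo.tar.gz
```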
ADD and COPY both copy files and directories from a source to a destination, but ADD additionally supports tar extraction and fetching files from URLs. The best practice is to use COPY when you only need to copy, and to avoid ADD in most cases. The link explains the difference with some simple examples: difference between COPY and ADD in dockerfile
docker build -t {image name} -v {host directory}:{temp build directory} .
This is another way to get files into an image. The -v option temporarily mounts a volume that is used during the build process. Note that upstream docker build does not provide this option; it is available in some other builders, such as podman build.
This differs from other volumes because it mounts a host directory for the build only. Files can then be copied using a standard cp command.
Also, like curl and wget, the copy can run as part of a command chain (in a single RUN step) without multiplying the image size. ADD and COPY cannot be chained this way: each creates its own layer, and subsequent commands that modify those files in later steps add further layers, which increases the image size:
With the options set thus:
-v /opt/mysql-staging:/tvol
The following will execute in one container:
RUN cp -r /tvol/mysql-5.7.15-linux-glibc2.5-x86_64 /u1 && \
mv /u1/mysql-5.7.15-linux-glibc2.5-x86_64 /u1/mysql && \
mkdir /u1/mysql/mysql-files && \
mkdir /u1/mysql/innodb && \
mkdir /u1/mysql/innodb/libdata && \
mkdir /u1/mysql/innodb/innologs && \
mkdir /u1/mysql/tmp && \
chmod 750 /u1/mysql/mysql-files && \
chown -R mysql /u1/mysql && \
chgrp -R mysql /u1/mysql