Docker COPY and keep directory - docker

Need to copy multiple directories in my Dockerfile. Currently, I'm doing:
COPY dir1 /opt/dir1
COPY dir2 /opt/dir2
COPY dir3 /opt/dir3
I would prefer to consolidate those into one single statement, specifying all the sources in one go. However, this way the contents are copied, and I lose the dir1, dir2, dir3 structure:
COPY dir1 dir2 dir3 /opt/
Same in this case:
COPY dir1/ dir2/ dir3/ /opt/
Is there some way to achieve this with one line?

You should consider ADD instead of COPY: see Dockerfile ADD
If <src> is a local tar archive in a recognized compression format (identity, gzip, bzip2 or xz) then it is unpacked as a directory.
That means you can wrap your docker build step into a script which would first tar -cvf dirs.tar dir1 dir2 dir3
Your Dockerfile can then ADD dirs.tar: you will find your folders in your image.
See also Dockerfile Best Practices: ADD or COPY.
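The reason the tar approach keeps dir1, dir2 and dir3 is that the archive stores the directory names themselves, not just their contents. A minimal local sketch of that, assuming a scratch directory and made-up file names (a.txt, b.txt, c.txt), with a plain tar extraction standing in for what ADD does inside the image:

```shell
#!/bin/sh
# Sketch only: scratch paths and file names are illustrative, not from the question.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Stand-in for the build context with the three directories.
mkdir -p "$work/ctx/dir1" "$work/ctx/dir2" "$work/ctx/dir3" "$work/opt"
echo one   > "$work/ctx/dir1/a.txt"
echo two   > "$work/ctx/dir2/b.txt"
echo three > "$work/ctx/dir3/c.txt"

# Same tar invocation as in the answer, run from the build context.
(cd "$work/ctx" && tar -cf dirs.tar dir1 dir2 dir3)

# Rough local stand-in for `ADD dirs.tar /opt/`: unpack into the target directory.
tar -xf "$work/ctx/dirs.tar" -C "$work/opt"

# The three directory names should still be present under the target.
ls "$work/opt"
```

In a real build, the tar step would run in the wrapper script before docker build, and the Dockerfile would contain the single ADD line.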

Related

How to use cp command in dockerfile

I want to decrease the number of layers used in my Dockerfile.
So I want to combine the COPY commands in a RUN cp.
dependencies
folder1
file1
file2
Dockerfile
The following commands work, and I want to combine them into a single RUN cp command:
COPY ./dependencies/file1 /root/.m2
COPY ./dependencies/file2 /root/.sbt/
COPY ./dependencies/folder1 /root/.ivy2/cache
The following command fails with a "No such file or directory" error. Where am I going wrong?
RUN cp ./dependencies/file1 /root/.m2 && \
cp ./dependencies/file2 /root/.sbt/ && \
cp ./dependencies/folder1 /root/.ivy2/cache
You can't do that.
COPY copies from the host to the image.
RUN cp copies from a location in the image to another location in the image.
To get it all into a single statement, you can create the file structure you want on the host and then use tar to make it a single file. When you ADD that tar file, Docker will unpack it and put the files in the correct place (COPY would copy the archive as-is, without unpacking it). With the current structure your files have on the host, it's not possible to do this in a single COPY command.
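As a rough sketch of that staging idea: the script below builds the target layout in a scratch directory and packs it into one archive. The scratch paths, the archive name deps.tar and the file lib.jar are assumptions made for illustration; only file1, file2, folder1 and the /root destinations come from the question.

```shell
#!/bin/sh
# Sketch only: scratch paths, deps.tar and lib.jar are hypothetical.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Stand-in for the question's ./dependencies directory.
mkdir -p "$work/dependencies/folder1"
echo settings > "$work/dependencies/file1"
echo repos    > "$work/dependencies/file2"
echo jar      > "$work/dependencies/folder1/lib.jar"

# Build the tree as it should appear under / in the image.
stage="$work/stage"
mkdir -p "$stage/root/.sbt" "$stage/root/.ivy2/cache"
# As with the question's COPY lines, file1 becomes the file /root/.m2.
cp    "$work/dependencies/file1"     "$stage/root/.m2"
cp    "$work/dependencies/file2"     "$stage/root/.sbt/"
cp -R "$work/dependencies/folder1/." "$stage/root/.ivy2/cache/"

# One archive holding the whole layout, with paths relative to /.
tar -cf "$work/deps.tar" -C "$stage" root

# List the archive to check the layout before using it in a build.
tar -tf "$work/deps.tar"
```

A Dockerfile could then unpack the archive with a single ADD deps.tar / line. Extracting over an existing /root also applies the archived directory's ownership and permissions, so that is worth checking in a real image.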
Problem
COPY is used to copy files from your host (the build context) into your image. So, when you run
COPY ./dependencies/file1 /root/.m2
COPY ./dependencies/file2 /root/.sbt/
COPY ./dependencies/folder1 /root/.ivy2/cache
Docker will look for file1, file2, and folder1 on your host.
However, when you do it with RUN, the commands are executed inside the container, and ./dependencies/file1 (and so on) does not exist in your container yet, which leads to file not found error.
In short, COPY and RUN are not interchangeable.
How to fix
If you don't want to use multiple COPY commands, you can use one COPY to copy all files from your host to your container, then use the RUN command to move them to the proper location.
To avoid copying unnecessary files, use .dockerignore. For example:
.dockerignore
./dependencies/no-need-file
./dependencies/no-need-directory/
Dockerfile
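# Copy into a known directory; COPY places the contents of ./dependencies/ at the destination
COPY ./dependencies/ /root/dependencies/
# Create the parent directories first, then move using absolute paths
RUN mkdir -p /root/.sbt /root/.ivy2 && \
mv /root/dependencies/file1 /root/.m2 && \
mv /root/dependencies/file2 /root/.sbt/ && \
mv /root/dependencies/folder1 /root/.ivy2/cache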
You are missing the final slash in /root/.ivy2/cache/

How to COPY or ADD multiple files and directories in one layer with Dockerfile

I'm trying to ADD/COPY files and directories in one layer with Dockerfile like this:
ADD file.txt dir1 /app/
But it is only copying the content of dir1 instead of the dir itself, how can I copy/add the files and directories in one layer?
The docs on ADD clearly note that only the content of the directory is copied, not the directory itself.
So for this to work you could create a subdirectory foo containing file.txt and dir1 and then do
ADD foo /app/
which would copy the content of the foo directory into your image.
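The wrapper-directory trick can be tried locally. In this sketch, cp -R of foo/. stands in for the "copy the contents, not the directory" behaviour of ADD foo /app/. The scratch paths and inner.txt are assumptions; foo, file.txt and dir1 are the names used above.

```shell
#!/bin/sh
# Sketch only: scratch paths and inner.txt are illustrative.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Stand-in for the build context and the image's /app directory.
mkdir -p "$work/ctx/dir1" "$work/app"
echo hello > "$work/ctx/file.txt"
echo data  > "$work/ctx/dir1/inner.txt"

# Wrap both sources in one parent directory named foo.
mkdir "$work/ctx/foo"
cp    "$work/ctx/file.txt" "$work/ctx/foo/"
cp -R "$work/ctx/dir1"     "$work/ctx/foo/"

# Local stand-in for `ADD foo /app/`: the contents of foo land in app, foo itself does not.
cp -R "$work/ctx/foo/." "$work/app/"

# Show what ended up under app.
find "$work/app" | sort
```

Because dir1 is now content of foo rather than a source itself, its name survives the copy.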

Recursively COPY files matching filename and preserve directory structure

Assume I have the following dir structure:
languages
-en-GB
--page1.json
--page2.json
-fr-FR
--page1.json
--page2.json
Now let's assume I want to copy the folder structure, but only page1.json content:
I've tried this:
COPY ["languages/**/*page1.json", "./"]
Which results in the folders being copied, but no files.
What I want to end up with is
languages
-en-GB
--page1.json
-fr-FR
--page1.json
Copied into my image
I am not sure you can use wildcards to produce the filtered result you are looking for.
I believe there are at least three clean and clear ways to achieve this:
Option 1: Copy everything, and cleanup later:
FROM alpine
WORKDIR /languages
COPY languages .
RUN rm -r **/page2.json
Option 2: Add files you don't want into your .dockerignore
# .dockerignore
languages/**/page*.json
!languages/**/page1.json
Option 3: Copy all to a temporary directory, and copy what you need from inside the container using more flexible tools
FROM alpine
WORKDIR /languages
COPY languages /tmp/langs
RUN cd /tmp/langs && find . -name 'page1.json' -exec cp --parents {} /languages \;
CMD ls -lR /languages
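The filtering step in Option 3 relies on cp --parents, which is not available in every cp implementation. A portable sketch of the same idea, assuming a scratch directory in place of the image filesystem, recreates each parent directory by hand:

```shell
#!/bin/sh
# Sketch only: scratch paths stand in for /tmp/langs and /languages.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Recreate the directory layout from the question.
for lang in en-GB fr-FR; do
  mkdir -p "$work/src/languages/$lang"
  echo '{}' > "$work/src/languages/$lang/page1.json"
  echo '{}' > "$work/src/languages/$lang/page2.json"
done

dest="$work/dest"

# For every page1.json, recreate its parent directory under dest, then copy the file.
(
  cd "$work/src" || exit 1
  find languages -name 'page1.json' | while IFS= read -r f; do
    mkdir -p "$dest/$(dirname "$f")"
    cp "$f" "$dest/$f"
  done
)

# List the files that were copied.
find "$dest" -type f | sort
```

Inside a Dockerfile the same loop would sit in a RUN instruction, after a COPY into the temporary directory.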

Dockerfile - only copy files which match an extension whilst maintaining folder structure

Suppose I have a very nested folder structure with lots of project files:
src
projectA
projectA.csproj
someFile.txt
projectB
projectB.csproj
someFile.txt
projectC
projectC.csproj
someFile.txt
In this case I want my DockerFile to copy over the full folder structure, but only include .csproj files:
src
projectA
projectA.csproj
projectB
projectB.csproj
projectC
projectC.csproj
I can do this for each file line by line, but is there a cleaner way?
COPY src/projectA/projectA.csproj src/projectA/projectA.csproj
COPY src/projectB/projectB.csproj src/projectB/projectB.csproj
COPY src/projectC/projectC.csproj src/projectC/projectC.csproj
I've faced a similar situation and the only solution I've found was to prepare a .tgz file containing what I needed and copy it in the docker image using the ADD directive.
e.g.
this is a run.sh script similar to what I used:
#!/bin/bash
tar cvfz csproj.tgz $( find src -name "*.csproj" )
docker build -t test .
docker run -it --rm test
this is a test Dockerfile:
FROM alpine
# The archive paths already start with src/, so unpack at / to get /src/projectA/...
ADD csproj.tgz /
CMD ls -alR /src
This solution is not very pleasant but it did do what I needed at the time.
The ADD directive (src: https://docs.docker.com/engine/reference/builder/#add) is able to copy files (like the COPY directive), and:
If <src> is a local tar archive in a recognized compression format (identity, gzip, bzip2 or xz) then it is unpacked as a directory. Resources from remote URLs are not decompressed.
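One weakness of building the archive with $( find ... ) is that paths containing spaces get split into separate arguments. A possible variant, assuming a tar that accepts --null together with -T (GNU tar and bsdtar do), passes NUL-separated names instead. The scratch directory is an assumption; the project names come from the question.

```shell
#!/bin/sh
# Sketch only: assumes a tar supporting `--null -T -` (GNU tar or bsdtar).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Recreate the project layout from the question.
for p in projectA projectB projectC; do
  mkdir -p "$work/src/$p"
  echo '<Project />' > "$work/src/$p/$p.csproj"
  echo 'notes'       > "$work/src/$p/someFile.txt"
done

# NUL-separated names survive spaces in paths, unlike $(find ...).
(cd "$work" && find src -name '*.csproj' -print0 | tar --null -T - -czf csproj.tgz)

# Every entry starts with src/, which matters when choosing the ADD destination.
tar -tzf "$work/csproj.tgz" | sort
```

Listing the archive before the build is a cheap way to confirm that only the .csproj files were picked up.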

What is the difference between the 'COPY' and 'ADD' commands in a Dockerfile?

What is the difference between the COPY and ADD commands in a Dockerfile, and when would I use one over the other?
COPY <src> <dest>
The COPY instruction will copy new files from <src> and add them to the container's filesystem at path <dest>
ADD <src> <dest>
The ADD instruction will copy new files from <src> and add them to the container's filesystem at path <dest>.
You should check the ADD and COPY documentation for a more detailed description of their behaviors, but in a nutshell, the major difference is that ADD can do more than COPY:
ADD allows <src> to be a URL
Referring to comments below, the ADD documentation states that:
If <src> is a local tar archive in a recognized compression format (identity, gzip, bzip2 or xz) then it is unpacked as a directory. Resources from remote URLs are not decompressed.
Note that the Best practices for writing Dockerfiles suggests using COPY where the magic of ADD is not required. Otherwise, you (since you had to look up this answer) are likely to get surprised someday when you mean to copy keep_this_archive_intact.tar.gz into your container, but instead, you spray the contents onto your filesystem.
COPY is
Same as 'ADD', but without the tar and remote URL handling.
Reference straight from the source code.
There is some official documentation on that point: Best Practices for Writing Dockerfiles
Because image size matters, using ADD to fetch packages from remote URLs is strongly discouraged; you should use curl or wget instead. That way you can delete the files you no longer need after they've been extracted and you won't have to add another layer in your image.
RUN mkdir -p /usr/src/things \
&& curl -SL http://example.com/big.tar.xz \
| tar -xJC /usr/src/things \
&& make -C /usr/src/things all
For other items (files, directories) that do not require ADD’s tar auto-extraction capability, you should always use COPY.
From Docker docs:
ADD or COPY
Although ADD and COPY are functionally similar, generally speaking, COPY is preferred. That’s because it’s more transparent than ADD. COPY only supports the basic copying of local files into the container, while ADD has some features (like local-only tar extraction and remote URL support) that are not immediately obvious. Consequently, the best use for ADD is local tar file auto-extraction into the image, as in ADD rootfs.tar.xz /.
More: Best practices for writing Dockerfiles
Suppose you want to add xx.tar.gz to /usr/local in the container, unpack it, and then remove the no-longer-needed archive.
For COPY:
COPY resources/jdk-7u79-linux-x64.tar.gz /tmp/
RUN tar -zxvf /tmp/jdk-7u79-linux-x64.tar.gz -C /usr/local
RUN rm /tmp/jdk-7u79-linux-x64.tar.gz
For ADD:
ADD resources/jdk-7u79-linux-x64.tar.gz /usr/local/
ADD supports local-only tar extraction. In addition, the COPY approach above uses three layers, while ADD uses only one.
COPY copies a file/directory from your host to your image.
ADD copies a file/directory from your host to your image, but can also fetch remote URLs, extract TAR files, etc...
Use COPY for simply copying files and/or directories from the build context into the image.
Use ADD for downloading remote resources, extracting TAR files, etc..
When creating a Dockerfile, there are two commands that you can use to copy files/directories into it – ADD and COPY. Although there are slight differences in the scope of their function, they essentially perform the same task.
So, why do we have two commands, and how do we know when to use one or the other?
DOCKER ADD COMMAND
===
Let’s start by noting that the ADD command is older than COPY. Since the launch of the Docker platform, the ADD instruction has been part of its list of commands.
The command copies files/directories to a file system of the specified container.
The basic syntax for the ADD command is:
ADD <src> … <dest>
It includes the source you want to copy (<src>) followed by the destination where you want to store it (<dest>). If the source is a directory, ADD copies everything inside of it (including file system metadata).
For instance, if the file is locally available and you want to add it to the directory of an image, you type:
ADD /source/file/path /destination/path
ADD can also copy files from a URL. It can download an external file and copy it to the wanted destination. For example:
ADD http://source.file/url /destination/path
An additional feature is that it copies compressed files, automatically extracting the content to the given destination. This feature only applies to locally stored compressed files/directories.
ADD source.file.tar.gz /temp
Bear in mind that you cannot download and extract a compressed file/directory from a URL. The command does not unpack external packages when copying them to the local filesystem.
DOCKER COPY COMMAND
===
Due to some functionality issues, Docker had to introduce an additional command for duplicating content – COPY.
Unlike its closely related ADD command, COPY has only one assigned function. Its role is to duplicate files/directories in a specified location in their existing format. This means that it doesn’t deal with extracting a compressed file, but rather copies it as-is.
The instruction can be used only for locally stored files. Therefore, you cannot use it with URLs to copy external files to your container.
To use the COPY instruction, specify the source and the destination where you want the content copied, in this basic format:
COPY <src> … <dest>
For example:
COPY /source/file/path /destination/path
Which command to use? (Best Practice)
Considering the circumstances in which the COPY command was introduced, it is evident that keeping ADD was a matter of necessity. Docker released an official document outlining best practices for writing Dockerfiles, which explicitly advises against using the ADD command.
Docker’s official documentation notes that COPY should always be the go-to instruction as it is more transparent than ADD.
If you need to copy from the local build context into a container, stick to using COPY.
The Docker team also strongly discourages using ADD to download and copy a package from a URL. Instead, it’s safer and more efficient to use wget or curl within a RUN command. By doing so, you avoid creating an additional image layer and save space.
Ref: https://phoenixnap.com/kb/docker-add-vs-copy
From Docker docs:
https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/#add-or-copy
"Although ADD and COPY are functionally similar, generally speaking, COPY is preferred. That’s because it’s more transparent than ADD. COPY only supports the basic copying of local files into the container, while ADD has some features (like local-only tar extraction and remote URL support) that are not immediately obvious. Consequently, the best use for ADD is local tar file auto-extraction into the image, as in ADD rootfs.tar.xz /.
If you have multiple Dockerfile steps that use different files from your context, COPY them individually, rather than all at once. This will ensure that each step’s build cache is only invalidated (forcing the step to be re-run) if the specifically required files change.
For example:
COPY requirements.txt /tmp/
RUN pip install --requirement /tmp/requirements.txt
COPY . /tmp/
Results in fewer cache invalidations for the RUN step, than if you put the COPY . /tmp/ before it.
Because image size matters, using ADD to fetch packages from remote URLs is strongly discouraged; you should use curl or wget instead. That way you can delete the files you no longer need after they’ve been extracted and you won’t have to add another layer in your image. For example, you should avoid doing things like:
ADD http://example.com/big.tar.xz /usr/src/things/
RUN tar -xJf /usr/src/things/big.tar.xz -C /usr/src/things
RUN make -C /usr/src/things all
And instead, do something like:
RUN mkdir -p /usr/src/things \
&& curl -SL http://example.com/big.tar.xz \
| tar -xJC /usr/src/things \
&& make -C /usr/src/things all
For other items (files, directories) that do not require ADD’s tar auto-extraction capability, you should always use COPY."
Source: https://nickjanetakis.com/blog/docker-tip-2-the-difference-between-copy-and-add-in-a-dockerile:
COPY and ADD are both Dockerfile instructions that serve similar purposes. They let you copy files from a specific location into a Docker image.
COPY takes in a src and destination. It only lets you copy in a local file or directory from your host (the machine building the Docker image) into the Docker image itself.
ADD lets you do that too, but it also supports 2 other sources. First, you can use a URL instead of a local file / directory. Secondly, you can extract a tar file from the source directly into the destination.
A valid use case for ADD is when you want to extract a local tar file into a specific directory in your Docker image.
If you’re copying in local files to your Docker image, always use COPY because it’s more explicit.
Since Docker 17.05 COPY is used with the --from flag in multi-stage builds to copy artifacts from previous build stages to the current build stage.
from the documentation
Optionally COPY accepts a flag --from=<name|index> that can be used to set the source location to a previous build stage (created with FROM .. AS <name>) that will be used instead of a build context sent by the user.
COPY doesn't support a <src> with a URL scheme.
COPY doesn't unpack compressed archives.
For the instruction form <src> <dest>, if <src> is a local tar archive and <dest> doesn't end with a trailing slash:
ADD treats <dest> as a directory and unpacks <src> into it.
COPY treats <dest> as a file and writes <src> to it.
COPY can use a source other than the build context via the --from flag.
The ADD instruction copies files or folders from a local or remote source and adds them to the container's file system. When it is used to copy local files, they must be in the build context. ADD also unpacks local .tar files into the destination directory in the image.
Example
ADD http://someserver.com/filename.pdf /var/www/html
COPY copies files from the build context and adds them to the container's file system. It is not possible to copy a remote file using its URL with this Dockerfile instruction.
Example
COPY Gemfile Gemfile.lock ./
COPY ./src/ /var/www/html/
If you have a foo.tar.gz file, compare the following two approaches.
The ADD version creates fewer layers than the COPY version, and saves a lot of network traffic when pushing the Docker image.
ADD foo.tar.gz /
COPY foo.tar.gz /
RUN tar -zxvf foo.tar.gz
RUN rm -rf foo.tar.gz
Let's say you have a tar file and you want to unpack it after placing it in your container, then remove it. You can use the COPY command for this, but the steps would be: 1) copy the tar file to the destination, 2) unpack it, 3) remove the tar file. If you do this in 3 steps, a new intermediate image is created after each step. You can combine steps 2 and 3 into one RUN using &&, but it becomes a hassle, and the archive still remains in the layer created by COPY.
But if you use ADD, Docker takes care of everything for you and only one intermediate image is created.
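For reference, the copy, unpack and remove sequence looks like this when chained with &&. This is a local sketch against a scratch directory, with a hypothetical payload/readme.txt inside the archive; in a Dockerfile the first step would be a separate COPY instruction and only the tar and rm steps could share one RUN.

```shell
#!/bin/sh
# Sketch only: scratch paths and the payload directory are illustrative.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Build a small foo.tar.gz to work with.
mkdir -p "$work/payload" "$work/image"
echo data > "$work/payload/readme.txt"
tar -czf "$work/foo.tar.gz" -C "$work" payload

# 1) copy the archive in, 2) unpack it, 3) delete it.
# Chaining with && stops the sequence at the first failing step.
cp "$work/foo.tar.gz" "$work/image/" && \
  tar -xzf "$work/image/foo.tar.gz" -C "$work/image" && \
  rm "$work/image/foo.tar.gz"

# Only the unpacked content should remain.
ls "$work/image"
```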
ADD and COPY both copy files and directories from a source to a destination, but ADD additionally supports archive extraction and fetching files from URLs. The best practice is to use COPY when you only need to copy, and to avoid ADD in most cases. This link explains it with some simple examples: difference between COPY and ADD in dockerfile
docker build -t {image name} -v {host directory}:{temp build directory} .
This is another way to copy files into an image. The -v option temporarily creates a volume that is used during the build process. Note that upstream docker build does not accept -v; the option exists in some other builders, such as Podman's build command.
This is different from other volumes because it mounts a host directory for the build only. Files can be copied using a standard cp command.
Also, like curl and wget, the cp can run as part of a single chained RUN command (one container, one layer), so it does not multiply the image size. ADD and COPY cannot be chained this way: each creates its own layer, and subsequent commands that modify those files in later layers add to the image size:
With the options set thus:
-v /opt/mysql-staging:/tvol
The following will execute in one container:
RUN cp -r /tvol/mysql-5.7.15-linux-glibc2.5-x86_64 /u1 && \
mv /u1/mysql-5.7.15-linux-glibc2.5-x86_64 /u1/mysql && \
mkdir /u1/mysql/mysql-files && \
mkdir /u1/mysql/innodb && \
mkdir /u1/mysql/innodb/libdata && \
mkdir /u1/mysql/innodb/innologs && \
mkdir /u1/mysql/tmp && \
chmod 750 /u1/mysql/mysql-files && \
chown -R mysql /u1/mysql && \
chgrp -R mysql /u1/mysql

Resources