I'm trying to uncompress a file and delete the original compressed archive in my Dockerfile image build instructions. I need to do this because the file in question is larger than the 2GB limit GitHub sets on large files (see here). The solution I'm pursuing is to compress the file (bringing it under the 2GB limit) and then decompress it when I build the application. I know it's bad practice to build large images and I plan to integrate an external database into the project, but I don't have time to do that now.
I've tried various options, but have been unsuccessful.
Compress the file in .zip format and use apt-get to install unzip and then decompress the file with unzip:
FROM python:3.8-slim
#install unzip
RUN apt-get update && apt-get install unzip
WORKDIR /app
COPY /data/databases/file.db.zip /data/databases
RUN unzip /data/databases/file.db.zip && rm -f /data/databases/file.db.zip
COPY ./ ./
This fails with unzip: cannot find or open /data/databases/file.db.zip, /data/databases/file.db.zip.zip or /data/databases/file.db.zip.ZIP. I don't understand this, as I thought COPY added files to the image.
Following this advice, I compressed the large file with gzip and tried to use the Docker native ADD command to uncompress it, i.e.:
FROM python:3.8-slim
WORKDIR /app
ADD /data/databases/file.db.gz /data/databases/file.db
COPY ./ ./
While this builds without error, it does not decompress the file, which I can verify by using docker exec -t -i clean-dash /bin/bash to explore the image's directory structure. Since the large file is a gzip file, my understanding from the docs is that ADD should decompress it.
How can I solve these requirements?
ADD only decompresses local tar files, not necessarily compressed single files. It may work to package the contents in a tar file, even if it only contains a single file:
# in the Dockerfile
ADD ./data/databases/file.tar.gz /data/databases/
Then, on the host, package the file and run the build:
(cd data/databases && tar cvzf file.tar.gz file.db)
docker build .
If you're using the first approach, you must use a multi-stage build here. The problem is that each RUN command generates a new image layer, so the resulting image is always the previous layer plus whatever changes the RUN command makes; RUN rm a-large-file will actually result in an image that's slightly larger than the image that contains the large file.
The BusyBox tool set includes, among other things, an implementation of unzip(1), so you should be able to split this up into a stage that just unpacks the large file and then a stage that copies the result in:
FROM busybox AS unpack
WORKDIR /unpack
COPY data/databases/file.db.zip /
RUN unzip /file.db.zip
FROM python:3.8-slim
COPY --from=unpack /unpack/ /data/databases/
In terms of the Docker image any of these approaches will create a single very large layer. In the past I've run into operational problems with single layers larger than about 1 GiB, things like docker push hanging up halfway through. With the multi-stage build approach, if you have multiple files you're trying to copy, you could have several COPY steps that break the batch of files into multiple layers. (But if it's a single SQLite file, there's nothing you can really do.)
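For illustration, a hedged sketch of how several unpacked files could be split across layers (the file names here are invented):
FROM busybox AS unpack
WORKDIR /unpack
COPY data/databases/files.zip /
RUN unzip /files.zip

FROM python:3.8-slim
# one COPY per file keeps each file in its own image layer
COPY --from=unpack /unpack/part1.db /data/databases/
COPY --from=unpack /unpack/part2.db /data/databases/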
Based on #David Maze's answer, the following worked, which I post here for completeness.
#unpacks zipped database
FROM busybox AS unpack
WORKDIR /unpack
COPY data/databases/file.db.zip /
RUN unzip /file.db.zip
FROM python:3.8-slim
COPY --from=unpack /unpack/file.db /
WORKDIR /app
COPY ./ ./
#move the unpacked db and delete the original
RUN mv /file.db ./data/databases && rm -f ./data/databases/file.db.zip
Related
I want to decrease the number of layers used in my Dockerfile.
So I want to combine the COPY commands into a single RUN cp.
dependencies
  folder1
  file1
  file2
Dockerfile
The following commands work; I want to combine them using a single RUN cp command:
COPY ./dependencies/file1 /root/.m2
COPY ./dependencies/file2 /root/.sbt/
COPY ./dependencies/folder1 /root/.ivy2/cache
The following command fails with a No such file or directory error. Where could I be going wrong?
RUN cp ./dependencies/file1 /root/.m2 && \
cp ./dependencies/file2 /root/.sbt/ && \
cp ./dependencies/folder1 /root/.ivy2/cache
You can't do that.
COPY copies from the host to the image.
RUN cp copies from a location in the image to another location in the image.
To get it all into a single COPY statement, you can create the file structure you want on the host and then use tar to make it a single file. Then when you ADD that tar file, Docker will unpack it and put the files in the correct place. But with the current structure your files have on the host, it's not possible to do in a single COPY command.
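A hedged sketch of that tar-based approach, reusing the paths from the question (the staging/ directory and root-files.tar are placeholder names, and the targets are assumed to be directories):
# on the host: rebuild the desired /root layout and tar it up
mkdir -p staging/.m2 staging/.sbt staging/.ivy2/cache
cp dependencies/file1 staging/.m2/
cp dependencies/file2 staging/.sbt/
cp -r dependencies/folder1/. staging/.ivy2/cache/
tar -C staging -cf root-files.tar .
# in the Dockerfile: ADD unpacks a local tar archive into the destination
ADD root-files.tar /root/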
Problem
COPY is used to copy files from your host to your container. So, when you run
COPY ./dependencies/file1 /root/.m2
COPY ./dependencies/file2 /root/.sbt/
COPY ./dependencies/folder1 /root/.ivy2/cache
Docker will look for file1, file2, and folder1 on your host.
However, when you do it with RUN, the commands are executed inside the container, and ./dependencies/file1 (and so on) does not exist in your container yet, which leads to file not found error.
In short, COPY and RUN are not interchangeable.
How to fix
If you don't want to use multiple COPY commands, you can use one COPY to copy all files from your host to your container, then use the RUN command to move them to the proper location.
To avoid copying unnecessary files, use .dockerignore. For example:
.dockerignore
./dependencies/no-need-file
./dependencies/no-need-directory/
Dockerfile
COPY ./dependencies/ /root/dependencies/
RUN mv /root/dependencies/file1 /root/.m2 && \
    mv /root/dependencies/file2 /root/.sbt/ && \
    mv /root/dependencies/folder1 /root/.ivy2/cache
You are missing the final slash in /root/.ivy2/cache/
I am new to Docker. I want to build a Docker image that builds a C++ library using the make command. The way I am doing it in my Dockerfile is:
copy the source code from host
install required packages
run make
copy the libraries (.so) into a different folder inside the image
delete the source code
The Dockerfile code is written below.
The problem I am facing is that even after deleting the source code the final image size is big.
Since each line of a Dockerfile creates a different layer, there is a way to download the source code using curl or wget and delete it later in the same layer, but I don't like that solution.
FROM alpine
RUN apk update && apk add <required_packages>
COPY source_code /tmp/source_code
RUN make -C /tmp/source_code && \
    mkdir /libraries/ && \
    cp /tmp/lib/* /libraries/ && \
    rm -rf /tmp/*
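(For reference, the single-layer curl/wget variant mentioned above would replace the COPY and the RUN with something roughly like this; the URL is a placeholder.)
RUN wget -q -O - https://example.com/source_code.tar.gz | tar -xzf - -C /tmp && \
    make -C /tmp/source_code && \
    mkdir /libraries/ && \
    cp /tmp/lib/* /libraries/ && \
    rm -rf /tmp/*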
I just want to minimize the final image size. Is this the right way to do it, or is there a better way? Please help.
You can do a multi-stage build and copy the artifacts on a new image from the previous one. Also install any required runtime dependencies (if any).
FROM alpine AS builder
RUN apk add --no-cache <build_dependencies>
COPY source_code /tmp/source_code
RUN make -C /tmp/source_code && \
    mkdir /libraries/ && \
    cp /tmp/lib/* /libraries/ && \
    rm -rf /tmp/*
FROM alpine
RUN apk add --no-cache <runtime_dependencies>
COPY --from=builder /libraries/ /libraries/
Another way for compacting the resulting image, aside from using a multistage Docker build, is using the --squash build option. Example image build command line:
docker image build --squash -t your-image .
When deleting files in the Docker image, the files themselves aren't truly gone, but remain in previous Docker filesystem layers, so they still take up space.
Squashing collapses all filesystem layers of your image, so files that are deleted with rm will be removed from the resulting single layer. This is an effective way for removing the source code from your image and compacting it.
Note that squashing is an experimental Docker feature, and has to be enabled in the Docker configuration.
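Enabling it typically means turning on experimental mode in the daemon configuration (for example in /etc/docker/daemon.json) and restarting the daemon:
{
  "experimental": true
}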
For more details on docker build --squash, see:
Docker image build reference
How does the new Docker squash option work?
I'm building a Rust program in Docker (rust:1.33.0).
Every time code changes, it re-compiles (good), which also re-downloads all dependencies (bad).
I thought I could cache dependencies by adding VOLUME ["/usr/local/cargo"]. Edit: I've also tried moving this dir with CARGO_HOME, without luck.
I thought that making this a volume would persist the downloaded dependencies, which appear to be in this directory.
But it didn't work, they are still downloaded every time. Why?
Dockerfile
FROM rust:1.33.0
VOLUME ["/output", "/usr/local/cargo"]
RUN rustup default nightly-2019-01-29
COPY Cargo.toml .
COPY src/ ./src/
RUN ["cargo", "build", "-Z", "unstable-options", "--out-dir", "/output"]
Built with just docker build ..
Cargo.toml
[package]
name = "mwe"
version = "0.1.0"
[dependencies]
log = { version = "0.4.6" }
Code: just hello world
Output of second run after changing main.rs:
...
Step 4/6 : COPY Cargo.toml .
---> Using cache
---> 97f180cb6ce2
Step 5/6 : COPY src/ ./src/
---> 835be1ea0541
Step 6/6 : RUN ["cargo", "build", "-Z", "unstable-options", "--out-dir", "/output"]
---> Running in 551299a42907
Updating crates.io index
Downloading crates ...
Downloaded log v0.4.6
Downloaded cfg-if v0.1.6
Compiling cfg-if v0.1.6
Compiling log v0.4.6
Compiling mwe v0.1.0 (/)
Finished dev [unoptimized + debuginfo] target(s) in 17.43s
Removing intermediate container 551299a42907
---> e4626da13204
Successfully built e4626da13204
A volume inside the Dockerfile is counter-productive here. That would mount an anonymous volume at each build step, and again when you run the container. The volume during each build step is discarded after that step completes, which means you would need to download the entire contents again for any other step needing those dependencies.
The standard model for this is to copy your dependency specification, run the dependency download, copy your code, and then compile or run your code, in 4 separate steps. That lets docker cache the layers in an efficient manner. I'm not familiar with rust or cargo specifically, but I believe that would look like:
FROM rust:1.33.0
RUN rustup default nightly-2019-01-29
COPY Cargo.toml .
RUN cargo fetch # this should download dependencies
COPY src/ ./src/
RUN ["cargo", "build", "-Z", "unstable-options", "--out-dir", "/output"]
Another option is to turn on some experimental features with BuildKit (available in 18.09, released 2018-11-08) so that docker saves these dependencies in what is similar to a named volume for your build. The directory can be reused across builds, but never gets added to the image itself, making it useful for things like a download cache.
# syntax=docker/dockerfile:experimental
FROM rust:1.33.0
VOLUME ["/output", "/usr/local/cargo"]
RUN rustup default nightly-2019-01-29
COPY Cargo.toml .
COPY src/ ./src/
RUN --mount=type=cache,target=/root/.cargo \
["cargo", "build", "-Z", "unstable-options", "--out-dir", "/output"]
Note that the above assumes cargo is caching files in /root/.cargo. You'd need to verify this and adjust as appropriate. I also haven't mixed the mount syntax with a json exec syntax to know if that part works. You can read more about the BuildKit experimental features here: https://github.com/moby/buildkit/blob/master/frontend/dockerfile/docs/experimental.md
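One quick way to check where the base image points the cargo home (run outside the build; rust:1.33.0 is the image from the question):
docker run --rm rust:1.33.0 sh -c 'echo $CARGO_HOME'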
Turning on BuildKit from 18.09 and newer versions is as easy as export DOCKER_BUILDKIT=1 and then running your build from that shell.
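Concretely:
export DOCKER_BUILDKIT=1
docker build .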
I would say the nicer solution would be to resort to a Docker multi-stage build, as pointed out here and there.
This way you can create a first image that builds both your application and your dependencies, then use only the dependency folder from that first image in the second one.
This is inspired by both your comment on #Jack Gore's answer and the two issue comments linked here above.
FROM rust:1.33.0 as dependencies
WORKDIR /usr/src/app
COPY Cargo.toml .
RUN rustup default nightly-2019-01-29 && \
mkdir -p src && \
echo "fn main() {}" > src/main.rs && \
cargo build -Z unstable-options --out-dir /output
FROM rust:1.33.0 as application
# Those are the lines instructing this image to reuse the files
# from the previous image that was aliased as "dependencies"
COPY --from=dependencies /usr/src/app/Cargo.toml .
COPY --from=dependencies /usr/local/cargo /usr/local/cargo
COPY src/ src/
VOLUME /output
RUN rustup default nightly-2019-01-29 && \
cargo build -Z unstable-options --out-dir /output
PS: having only one run will reduce the number of layers you generate; more info here
Here's an overview of the possibilities. (Scroll down for my original answer.)
Add Cargo files, create fake main.rs/lib.rs, then compile dependencies. Afterwards remove the fake source and add the real ones. [Caches dependencies, but several fake files with workspaces].
Add Cargo files, create fake main.rs/lib.rs, then compile dependencies. Afterwards create a new layer with the dependencies and continue from there. [Similar to above].
Externally mount a volume for the cache dir. [Caches everything, relies on caller to pass --mount].
Use RUN --mount=type=cache,target=/the/path cargo build in the Dockerfile in new Docker versions. [Caches everything, seems like a good way, but currently too new to work for me. Executable not part of image. Edit: See here for a solution.]
Run sccache in another container or on the host, then connect to that during the build process. See this comment in Cargo issue 2644.
Use cargo-build-deps. [Might work for some, but does not support Cargo workspaces (in 2019)].
Wait for Cargo issue 2644. [There's willingness to add this to Cargo, but no concrete solution yet].
Using VOLUME ["/the/path"] in the Dockerfile does NOT work, this is per-layer (per command) only.
Note: one can set ENV CARGO_HOME and ENV CARGO_TARGET_DIR in the Dockerfile to control where the download cache and the compiled output go.
Also note: cargo fetch can at least cache downloading of dependencies, although not compiling.
Cargo workspaces suffer from having to manually add each Cargo file, and for some solutions, having to generate a dozen fake main.rs/lib.rs. For projects with a single Cargo file, the solutions work better.
I've got caching to work for my particular case by adding
ENV CARGO_HOME /code/dockerout/cargo
ENV CARGO_TARGET_DIR /code/dockerout/target
Where /code is the directory where I mount my code.
This is externally mounted, not from the Dockerfile.
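For illustration, the external mount might look roughly like this (the image name is a placeholder):
docker run --rm -v "$PWD":/code -w /code my-rust-image cargo build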
EDIT1: I was confused why this worked, but #b.enoit.be and #BMitch cleared up that it's because volumes declared inside the Dockerfile only live for one layer (one command).
You do not need to use an explicit Docker volume to cache your dependencies. Docker will automatically cache the different "layers" of your image. Basically, each command in the Dockerfile corresponds to a layer of the image. The problem you are facing is based on how Docker image layer caching works.
The rules that Docker follows for image layer caching are listed in the official documentation:
Starting with a parent image that is already in the cache, the next
instruction is compared against all child images derived from that
base image to see if one of them was built using the exact same
instruction. If not, the cache is invalidated.
In most cases, simply comparing the instruction in the Dockerfile with
one of the child images is sufficient. However, certain instructions
require more examination and explanation.
For the ADD and COPY instructions, the contents of the file(s) in the
image are examined and a checksum is calculated for each file. The
last-modified and last-accessed times of the file(s) are not
considered in these checksums. During the cache lookup, the checksum
is compared against the checksum in the existing images. If anything
has changed in the file(s), such as the contents and metadata, then
the cache is invalidated.
Aside from the ADD and COPY commands, cache checking does not look at
the files in the container to determine a cache match. For example,
when processing a RUN apt-get -y update command the files updated in
the container are not examined to determine if a cache hit exists. In
that case just the command string itself is used to find a match.
Once the cache is invalidated, all subsequent Dockerfile commands
generate new images and the cache is not used.
So the problem is with the positioning of the command COPY src/ ./src/ in the Dockerfile. Whenever there is a change in one of your source files, the cache will be invalidated and all subsequent commands will not use the cache. Therefore your cargo build command will not use the Docker cache.
To solve your problem, it is as simple as reordering the commands in your Dockerfile, to this:
FROM rust:1.33.0
RUN rustup default nightly-2019-01-29
COPY Cargo.toml .
RUN ["cargo", "build", "-Z", "unstable-options", "--out-dir", "/output"]
COPY src/ ./src/
Doing it this way, your dependencies will only be reinstalled when there is a change in your Cargo.toml.
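Note that cargo build needs at least one source file to succeed before the real src/ is copied in, so in practice this reordering is usually combined with the placeholder main.rs trick shown elsewhere in this thread; a hedged sketch:
FROM rust:1.33.0
RUN rustup default nightly-2019-01-29
COPY Cargo.toml .
# placeholder source so the dependency-only build can run
RUN mkdir -p src && echo "fn main() {}" > src/main.rs && \
    cargo build -Z unstable-options --out-dir /output
# changing the real sources only invalidates the layers from here on
COPY src/ ./src/
RUN cargo build -Z unstable-options --out-dir /output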
Hope this helps.
With the integration of BuildKit into docker, if you are able to avail yourself of the superior BuildKit backend, it's now possible to mount a cache volume during a RUN command, and IMHO, this has become the best way to cache cargo builds. The cache volume retains the data that was written to it on previous runs.
To use BuildKit, you'll mount two cache volumes, one for the cargo dir, which caches external crate sources, and one for the target dir, which caches all of your built artifacts, including external crates and the project bins and libs.
If your base image is rust, $CARGO_HOME is set to /usr/local/cargo, so your command looks like this:
RUN --mount=type=cache,target=/usr/local/cargo,from=rust,source=/usr/local/cargo \
--mount=type=cache,target=target \
cargo build
If your base image is something else, you will need to change the /usr/local/cargo bit to whatever is the value of $CARGO_HOME, or else add an ENV CARGO_HOME=/usr/local/cargo line. As a side note, the clever thing would be to set literally target=$CARGO_HOME and let Docker do the expansion, but it doesn't seem to work right: expansion happens, but BuildKit still doesn't persist the same volume across runs when you do this.
Other options for achieving Cargo build caching (including sccache and the cargo wharf project) are described in this github issue.
I figured out how to get this also working with cargo workspaces, using romac's fork of cargo-build-deps.
This example has my_app, and two workspaces: utils and db.
FROM rust:nightly as rust
# Cache deps
WORKDIR /app
RUN sudo chown -R rust:rust .
RUN USER=root cargo new myapp
# Install cache-deps
RUN cargo install --git https://github.com/romac/cargo-build-deps.git
WORKDIR /app/myapp
RUN mkdir -p db/src/ utils/src/
# Copy the Cargo tomls
COPY myapp/Cargo.toml myapp/Cargo.lock ./
COPY myapp/db/Cargo.toml ./db/
COPY myapp/utils/Cargo.toml ./utils/
# Cache the deps
RUN cargo build-deps
# Copy the src folders
COPY myapp/src ./src/
COPY myapp/db/src ./db/src/
COPY myapp/utils/src/ ./utils/src/
# Build for debug
RUN cargo build
I'm sure you can adjust this code for use with a Dockerfile, but I wrote a dockerized drop-in replacement for cargo that you can save to a package and run as ./cargo build --release. This just works for (most) development (uses rust:latest), but isn't set up for CI or anything.
Usage: ./cargo build, ./cargo build --release, etc
It will use the current working directory and save the cache to ./.cargo. (You can ignore the entire directory in your version control and it doesn't need to exist beforehand.)
Create a file named cargo in your project's folder, run chmod +x ./cargo on it, and place the following code in it:
#!/bin/bash
# This is a drop-in replacement for `cargo`
# that runs in a Docker container as the current user
# on the latest Rust image
# and saves all generated files to `./cargo/` and `./target/`.
#
# Be sure to make this file executable: `chmod +x ./cargo`
#
# # Examples
#
# - Running app: `./cargo run`
# - Building app: `./cargo build`
# - Building release: `./cargo build --release`
#
# # Installing globally
#
# To run `cargo` from anywhere,
# save this file to `/usr/local/bin`.
# You'll then be able to use `cargo`
# as if you had installed Rust globally.
sudo docker run \
--rm \
--user "$(id -u)":"$(id -g)" \
--mount type=bind,src="$PWD",dst=/usr/src/app \
--workdir /usr/src/app \
--env CARGO_HOME=/usr/src/app/.cargo \
rust:latest \
cargo "$#"
Attempting to create a container with microsoft/dotnet:2.1-aspnetcore-runtime. The .NET Core solution file has multiple projects nested underneath the solution, each with its own .csproj file. I am attempting to create a more elegant COPY instruction for the sub-projects.
The sample available here https://github.com/dotnet/dotnet-docker/tree/master/samples/aspnetapp has a solution file with only one .csproj so creates the Dockerfile thusly:
COPY *.sln .
COPY aspnetapp/*.csproj ./aspnetapp/
RUN dotnet restore
It works this way
COPY my_solution_folder/*.sln .
COPY my_solution_folder/project/*.csproj my_solution_folder/
COPY my_solution_folder/subproject_one/*.csproj subproject_one/
COPY my_solution_folder/subproject_two/*.csproj subproject_two/
COPY my_solution_folder/subproject_three/*.csproj subproject_three/
for a solution folder structure of:
my_solution_folder\my_solution.sln
my_solution_folder\project\my_solution.csproj
my_solution_folder\subproject_one\subproject_one.csproj
my_solution_folder\subproject_two\subproject_two.csproj
my_solution_folder\subproject_three\subproject_three.csproj
but this doesn't (was a random guess)
COPY my_solution_folder/*/*.csproj working_dir_folder/*/
Is there a more elegant solution?
2021: with BuildKit, see ".NET package restore in Docker cached separately from build" from Palec.
2018: Considering that wildcards are not well-supported by COPY (moby issue 15858), you can:
either experiment with adding .dockerignore files in the folder you don't want to copy (while excluding folders you do want): it is cumbersome
or, as shown here, make a tar of all the folders you want
Here is an example, to be adapted in your case:
find .. -name '*.csproj' -o -name 'Finomial.InternalServicesCore.sln' -o -name 'nuget.config' \
| sort | tar cf dotnet-restore.tar -T - 2> /dev/null
With a Dockerfile including:
ADD docker/dotnet-restore.tar ./
The idea is: the archive gets automatically expanded with ADD.
The OP sturmstrike mentions in the comments "Optimising ASP.NET Core apps in Docker - avoiding manually copying csproj files (Part 2)" from Andrew Lock "Sock"
The alternative solution actually uses the wildcard technique I previously dismissed, but with some assumptions about your project structure, a two-stage approach, and a bit of clever bash-work to work around the wildcard limitations.
We take the flat list of csproj files, and move them back to their correct location, nested inside sub-folders of src.
# Copy the main source project files
COPY src/*/*.csproj ./
RUN for file in $(ls *.csproj); do mkdir -p src/${file%.*}/ && mv $file src/${file%.*}/; done
L01nl suggests in the comments an alternative approach that doesn't require compression: "Optimising ASP.NET Core apps in Docker - avoiding manually copying csproj files", from Andrew Lock "Sock".
FROM microsoft/aspnetcore-build:2.0.6-2.1.101 AS builder
WORKDIR /sln
COPY ./*.sln ./NuGet.config ./
# Copy the main source project files
COPY src/*/*.csproj ./
RUN for file in $(ls *.csproj); do mkdir -p src/${file%.*}/ && mv $file src/${file%.*}/; done
# Copy the test project files
COPY test/*/*.csproj ./
RUN for file in $(ls *.csproj); do mkdir -p test/${file%.*}/ && mv $file test/${file%.*}/; done
RUN dotnet restore
# Remainder of build process
This solution is much cleaner than my previous tar-based effort, as it doesn't require any external scripting, just standard docker COPY and RUN commands.
It gets around the wildcard issue by copying across csproj files in the src directory first, moving them to their correct location, and then copying across the test project files.
One other option to consider is using a multi-stage build to prefilter / prep the desired files. This is mentioned on the same moby issue 15858.
For those building on .NET Framework, you can take it a step further and leverage robocopy.
For example:
FROM mcr.microsoft.com/dotnet/framework/sdk:4.8 AS prep
# Gather only artifacts necessary for NuGet restore, retaining directory structure
COPY / /temp/
RUN Invoke-Expression 'robocopy C:/temp C:/nuget /s /ndl /njh /njs *.sln nuget.config *.csproj packages.config'
[...]
# New build stage, independent cache
FROM mcr.microsoft.com/dotnet/framework/sdk:4.8 AS build
# Copy prepped NuGet artifacts, and restore as distinct layer
COPY --from=prep ./nuget ./
RUN nuget restore
# Copy everything else, build, etc
COPY src/ ./src/
RUN msbuild
[...]
The big advantage here is that there are no assumptions made about the structure of your solution. The robocopy '/s' flag will preserve any directory structure for you.
Note the '/ndl /njh /njs' flags are there just to cut down on log noise.
In addition to VonC's answer (which is correct), I am building from a Windows 10 OS and targeting Linux containers. The equivalent to the above answer using Windows and 7z (which I normally have installed anyway) is:
7z a -r -ttar my_project_files.tar .\*.csproj .\*.sln .\*nuget.config
followed by the ADD in the Dockerfile to decompress.
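In the Dockerfile that would look something like this (the restore step is shown just for context):
ADD my_project_files.tar ./
RUN dotnet restore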
Be aware that after installing 7-zip, you will need to add the installation folder to your environment path to call it in the above fashion.
Looking at the moby issue 15858, you will see the execution of the BASH script to generate the tar file and then the subsequent execution of the Dockerfile using ADD to extract.
Fully automate it either with a batch file or use PowerShell execution as given in the example below.
Pass PowerShell variables to Docker commands
Another solution, maybe a bit slower but all in one.
Everything in one file and one command: docker build .
I've split my Dockerfile into 2 stages:
First image to tar the *.csproj files
Second image uses the tar and sets up the project
code:
FROM ubuntu:18.04 as tar_files
WORKDIR /tar
COPY . .
RUN find . -name "*.csproj" -print0 | tar -cvf projectfiles.tar --null -T -
FROM mcr.microsoft.com/dotnet/core/sdk:3.1 AS build
WORKDIR /source
# copy sln
COPY *.sln .
# Copy all the csproj files from previous image
COPY --from=tar_files /tar/projectfiles.tar .
RUN tar -xvf projectfiles.tar
RUN rm projectfiles.tar
RUN dotnet restore
# Remainder of build process
I use this script
COPY SolutionName.sln SolutionName.sln
COPY src/*/*.csproj ./
COPY tests/*/*.csproj ./
RUN cat SolutionName.sln \
| grep "\.csproj" \
| awk '{print $4}' \
| sed -e 's/[",]//g' \
| sed 's#\\#/#g' \
| xargs -I {} sh -c 'mkdir -p $(dirname {}) && mv $(basename {}) $(dirname {})/'
RUN dotnet restore "/src/Service/Service.csproj"
COPY ./src ./src
COPY ./tests ./tests
RUN dotnet build "/src/Service/Service.csproj" -c Release -o /app/build
Copy solution file
Copy project files
(optional) Copy test project files
Make linux magic (scan sln-file for projects and restore directory structure)
Restore packages for service project
Copy sources
(optional) Copy test sources
Build service project
This is working for all Linux containers
What is the difference between the COPY and ADD commands in a Dockerfile, and when would I use one over the other?
COPY <src> <dest>
The COPY instruction will copy new files from <src> and add them to the container's filesystem at path <dest>
ADD <src> <dest>
The ADD instruction will copy new files from <src> and add them to the container's filesystem at path <dest>.
You should check the ADD and COPY documentation for a more detailed description of their behaviors, but in a nutshell, the major difference is that ADD can do more than COPY:
ADD allows <src> to be a URL
Referring to comments below, the ADD documentation states that:
If <src> is a local tar archive in a recognized compression format (identity, gzip, bzip2 or xz) then it is unpacked as a directory. Resources from remote URLs are not decompressed.
Note that the Best practices for writing Dockerfiles suggests using COPY where the magic of ADD is not required. Otherwise, you (since you had to look up this answer) are likely to get surprised someday when you mean to copy keep_this_archive_intact.tar.gz into your container, but instead, you spray the contents onto your filesystem.
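For illustration, using the filename from the example above:
# COPY keeps the archive as a single file at the destination
COPY keep_this_archive_intact.tar.gz /opt/
# ADD would instead unpack a local tar archive into /opt/
# ADD keep_this_archive_intact.tar.gz /opt/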
COPY is
Same as 'ADD', but without the tar and remote URL handling.
Reference straight from the source code.
There is some official documentation on that point: Best Practices for Writing Dockerfiles
Because image size matters, using ADD to fetch packages from remote URLs is strongly discouraged; you should use curl or wget instead. That way you can delete the files you no longer need after they've been extracted and you won't have to add another layer in your image.
RUN mkdir -p /usr/src/things \
&& curl -SL http://example.com/big.tar.xz \
| tar -xJC /usr/src/things \
&& make -C /usr/src/things all
For other items (files, directories) that do not require ADD’s tar auto-extraction capability, you should always use COPY.
From Docker docs:
ADD or COPY
Although ADD and COPY are functionally similar, generally speaking, COPY is preferred. That’s because it’s more transparent than ADD. COPY only supports the basic copying of local files into the container, while ADD has some features (like local-only tar extraction and remote URL support) that are not immediately obvious. Consequently, the best use for ADD is local tar file auto-extraction into the image, as in ADD rootfs.tar.xz /.
More: Best practices for writing Dockerfiles
If you want to add an xx.tar.gz to /usr/local in the container, unzip it, and then remove the useless compressed package:
For COPY:
COPY resources/jdk-7u79-linux-x64.tar.gz /tmp/
RUN tar -zxvf /tmp/jdk-7u79-linux-x64.tar.gz -C /usr/local
RUN rm /tmp/jdk-7u79-linux-x64.tar.gz
For ADD:
ADD resources/jdk-7u79-linux-x64.tar.gz /usr/local/
ADD supports local-only tar extraction. Besides that, the COPY approach uses three layers, but ADD uses only one.
COPY copies a file/directory from your host to your image.
ADD copies a file/directory from your host to your image, but can also fetch remote URLs, extract TAR files, etc...
Use COPY for simply copying files and/or directories from the build context into the image.
Use ADD for downloading remote resources, extracting TAR files, etc..
When creating a Dockerfile, there are two commands that you can use to copy files/directories into it – ADD and COPY. Although there are slight differences in the scope of their function, they essentially perform the same task.
So, why do we have two commands, and how do we know when to use one or the other?
DOCKER ADD COMMAND
===
Let’s start by noting that the ADD command is older than COPY. Since the launch of the Docker platform, the ADD instruction has been part of its list of commands.
The command copies files/directories to a file system of the specified container.
The basic syntax for the ADD command is:
ADD <src> … <dest>
It includes the source you want to copy (<src>) followed by the destination where you want to store it (<dest>). If the source is a directory, ADD copies everything inside of it (including file system metadata).
For instance, if the file is locally available and you want to add it to the directory of an image, you type:
ADD /source/file/path /destination/path
ADD can also copy files from a URL. It can download an external file and copy it to the wanted destination. For example:
ADD http://source.file/url /destination/path
An additional feature is that it copies compressed files, automatically extracting the content to the given destination. This feature only applies to locally stored compressed files/directories.
ADD source.file.tar.gz /temp
Bear in mind that you cannot download and extract a compressed file/directory from a URL. The command does not unpack external packages when copying them to the local filesystem.
DOCKER COPY COMMAND
===
Due to some functionality issues, Docker had to introduce an additional command for duplicating content – COPY.
Unlike its closely related ADD command, COPY has only one assigned function. Its role is to duplicate files/directories in a specified location in their existing format. This means that it doesn't deal with extracting a compressed file, but rather copies it as-is.
The instruction can be used only for locally stored files. Therefore, you cannot use it with URLs to copy external files to your container.
To use the COPY instruction, follow the basic command format:
Type in the source and the destination where you want the content copied, as follows:
COPY <src> … <dest>
For example:
COPY /source/file/path /destination/path
Which command to use? (Best Practice)
Considering the circumstances in which the COPY command was introduced, it is evident that keeping ADD was a matter of necessity. Docker released an official document outlining best practices for writing Dockerfiles, which explicitly advises against using the ADD command.
Docker’s official documentation notes that COPY should always be the go-to instruction as it is more transparent than ADD.
If you need to copy from the local build context into a container, stick to using COPY.
The Docker team also strongly discourages using ADD to download and copy a package from a URL. Instead, it’s safer and more efficient to use wget or curl within a RUN command. By doing so, you avoid creating an additional image layer and save space.
Ref: https://phoenixnap.com/kb/docker-add-vs-copy
From Docker docs:
https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/#add-or-copy
"Although ADD and COPY are functionally similar, generally speaking, COPY is preferred. That’s because it’s more transparent than ADD. COPY only supports the basic copying of local files into the container, while ADD has some features (like local-only tar extraction and remote URL support) that are not immediately obvious. Consequently, the best use for ADD is local tar file auto-extraction into the image, as in ADD rootfs.tar.xz /.
If you have multiple Dockerfile steps that use different files from your context, COPY them individually, rather than all at once. This will ensure that each step’s build cache is only invalidated (forcing the step to be re-run) if the specifically required files change.
For example:
COPY requirements.txt /tmp/
RUN pip install --requirement /tmp/requirements.txt
COPY . /tmp/
Results in fewer cache invalidations for the RUN step, than if you put the COPY . /tmp/ before it.
Because image size matters, using ADD to fetch packages from remote URLs is strongly discouraged; you should use curl or wget instead. That way you can delete the files you no longer need after they’ve been extracted and you won’t have to add another layer in your image. For example, you should avoid doing things like:
ADD http://example.com/big.tar.xz /usr/src/things/
RUN tar -xJf /usr/src/things/big.tar.xz -C /usr/src/things
RUN make -C /usr/src/things all
And instead, do something like:
RUN mkdir -p /usr/src/things \
&& curl -SL http://example.com/big.tar.xz \
| tar -xJC /usr/src/things \
&& make -C /usr/src/things all
For other items (files, directories) that do not require ADD’s tar auto-extraction capability, you should always use COPY."
Source: https://nickjanetakis.com/blog/docker-tip-2-the-difference-between-copy-and-add-in-a-dockerile:
COPY and ADD are both Dockerfile instructions that serve similar purposes. They let you copy files from a specific location into a Docker image.
COPY takes in a src and destination. It only lets you copy in a local file or directory from your host (the machine building the Docker image) into the Docker image itself.
ADD lets you do that too, but it also supports 2 other sources. First, you can use a URL instead of a local file / directory. Secondly, you can extract a tar file from the source directly into the destination
A valid use case for ADD is when you want to extract a local tar file into a specific directory in your Docker image.
If you’re copying in local files to your Docker image, always use COPY because it’s more explicit.
Since Docker 17.05 COPY is used with the --from flag in multi-stage builds to copy artifacts from previous build stages to the current build stage.
from the documentation
Optionally COPY accepts a flag --from=<name|index> that can be used to set the source location to a previous build stage (created with FROM .. AS <name>) that will be used instead of a build context sent by the user.
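A minimal sketch (the stage name and paths are made up for illustration):
FROM python:3.8-slim AS build
RUN echo "built artifact" > /artifact.txt

FROM python:3.8-slim
# copies from the "build" stage instead of from the build context
COPY --from=build /artifact.txt /artifact.txt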
COPY doesn't support <src> with URL scheme.
COPY doesn't unpack compression file.
Given <src> <dest>, if <src> is a compressed tar file and <dest> doesn't end with a trailing slash:
ADD consider <dest> as a directory and unpack <src> to it.
COPY consider <dest> as a file and write <src> to it.
COPY supports replacing the build context as its source with the --from argument (to copy from a previous build stage).
The ADD instruction copies files or folders from a local or remote source and adds them to the container's filesystem. When used to copy local files, those must be in the build context. The ADD instruction also unpacks local .tar files into the destination image directory.
Example
ADD http://someserver.com/filename.pdf /var/www/html
COPY copies files from the build context and adds them to the container's filesystem. It is not possible to copy a remote file using its URL with this Dockerfile instruction.
Example
COPY Gemfile Gemfile.lock ./
COPY ./src/ /var/www/html/
If you have a foo.tar.gz file, compare the following commands.
The ADD command creates fewer layers than the COPY approach, and saves a lot of network traffic when pushing the Docker image.
ADD foo.tar.gz /
COPY foo.tar.gz /
RUN tar -zxvf foo.tar.gz
RUN rm -rf foo.tar.gz
Let's say you have a tar file and you want to uncompress it after placing it in your container and then remove it. You can use the COPY command to do this, but then you need several commands: 1) copy the tar file to the destination, 2) uncompress it, 3) remove the tar file. If you do this in 3 separate steps, a new image layer is created after each step. You can do it in one step using &&, but that becomes a hassle.
But if you use ADD, Docker takes care of everything for you and only one intermediate image is created.
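The single-step && variant would look roughly like this (the destination path is a placeholder):
COPY foo.tar.gz /tmp/
RUN mkdir -p /opt/app && tar -xzf /tmp/foo.tar.gz -C /opt/app && rm /tmp/foo.tar.gz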
ADD and COPY both copy files and directories from a source to a destination, but ADD additionally supports tar extraction and fetching files from URLs. The best practice is to use COPY for plain copy operations and to avoid ADD in most cases. The link explains it with some simple examples: difference between COPY and ADD in dockerfile.
docker build -t {image name} -v {host directory}:{temp build directory} .
This is another way to copy files into an image. The -v option temporarily creates a volume that is used during the build process.
This is different from other volumes because it mounts a host directory for the build only. Files can be copied using a standard cp command.
Also, like the curl and wget approach, it can be run in a command stack (running in a single container) and does not multiply the image size. ADD and COPY are not stackable, because they run in a standalone container, and subsequent commands on those files that execute in additional containers will multiply the image size:
With the options set thus:
-v /opt/mysql-staging:/tvol
The following will execute in one container:
RUN cp -r /tvol/mysql-5.7.15-linux-glibc2.5-x86_64 /u1 && \
mv /u1/mysql-5.7.15-linux-glibc2.5-x86_64 /u1/mysql && \
mkdir /u1/mysql/mysql-files && \
mkdir /u1/mysql/innodb && \
mkdir /u1/mysql/innodb/libdata && \
mkdir /u1/mysql/innodb/innologs && \
mkdir /u1/mysql/tmp && \
chmod 750 /u1/mysql/mysql-files && \
chown -R mysql /u1/mysql && \
chgrp -R mysql /u1/mysql