Docker: created files disappear between layers

Running Docker version 17.06.0-ce, build 02c1d87, I have a Dockerfile that looks like this:
FROM maven:3.5.2-jdk-8-alpine as builder
RUN chmod -R 777 /root/.m2 &&\
mkdir -p /root/.m2/repository/com/foo/bar &&\
echo "Text" > /root/.m2/repository/com/foo/bar/baz.txt &&\
ls -R -a -l /root/.m2/repository/com/foo
RUN ls -R -a -l /root/.m2/repository/com/foo
The first RUN command successfully creates a file, but the second command can't find it:
Step 1/46 : FROM maven:3.5.2-jdk-8-alpine as builder
---> 293423a981a7
Step 2/46 : RUN chmod -R 777 /root/.m2 && mkdir -p /root/.m2/repository/com/foo/bar && echo "Text" > /root/.m2/repository/com/foo/bar/baz.txt && ls -R -a -l /root/.m2/repository/com/foo
---> Running in a1c0fd142856
/root/.m2/repository/com/foo:
total 12
drwxr-xr-x 3 root root 4096 Nov 30 13:32 .
drwxr-xr-x 3 root root 4096 Nov 30 13:32 ..
drwxr-xr-x 2 root root 4096 Nov 30 13:32 bar
/root/.m2/repository/com/foo/bar:
total 12
drwxr-xr-x 2 root root 4096 Nov 30 13:32 .
drwxr-xr-x 3 root root 4096 Nov 30 13:32 ..
-rw-r--r-- 1 root root 5 Nov 30 13:32 baz.txt
---> b997ccbfd5b0
Step 3/46 : RUN ls -R -a -l /root/.m2/repository/com/foo
---> Running in 603671c87ecc
ls: /root/.m2/repository/com/foo: No such file or directory
The command '/bin/sh -c ls -R -a -l /root/.m2/repository/com/foo' returned a non-zero code: 1
What's going on? (NB. this is a toy example, but there is a real issue in that JARs installed into the Maven repository seem to disappear between layers.)

The upstream maven image defines this directory as a volume. Once an image does this, you cannot reliably make changes to that directory in the image.
From their Dockerfile:
ARG USER_HOME_DIR="/root"
...
VOLUME "$USER_HOME_DIR/.m2"
The Dockerfile documentation describes this behavior:
Changing the volume from within the Dockerfile: If any build steps change the data within the volume after it has been declared, those changes will be discarded.
Your options are to:
Use another directory for your build
Request that the upstream image removes this VOLUME definition
Build your own image without this definition (it's fairly easy to fork their repo and do your own build)
For more details, you can see an old blog post by me about this behavior and the problems it creates.
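The first option above (use another directory for the build) can be sketched as follows. This is a minimal sketch, not the upstream image's recommended setup: /opt/m2repo is a hypothetical path chosen for illustration, and the commented-out mvn line assumes Maven's standard maven.repo.local property is acceptable for your build.

```dockerfile
FROM maven:3.5.2-jdk-8-alpine as builder

# /opt/m2repo is a hypothetical directory; any path that the parent image
# has NOT declared as a VOLUME will be committed to the layer as usual.
RUN mkdir -p /opt/m2repo/com/foo/bar && \
    echo "Text" > /opt/m2repo/com/foo/bar/baz.txt

# Because /opt/m2repo is not a volume, the file is still there in the next step.
RUN ls -R -a -l /opt/m2repo/com/foo

# Assumption: point Maven at that directory when building, for example:
# RUN mvn -Dmaven.repo.local=/opt/m2repo install
```

Files written under /root/.m2 would still be discarded after each step, so everything that has to survive between layers needs to live under the alternative directory.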

Related

How to save data from a docker container on a local host?

I run R-Studio in a container on GitLab. R-Studio builds a lot of csv and pdf files. When I run
docker run --rm -it registry.gitlab.com/user/paperboy /bin/bash
I can find the files in the folders /home/output/csv and /home/output/pdf. I want to save all these files in /output/csv and /output/pdf directories on the host, in my case on GitLab. The question is: how do I save data outside the Docker container?
Here is my Dockerfile.
FROM rocker/r-base:latest
RUN apt-get update \
&& apt-get install -yq --no-install-recommends groff \
&& rm -rf /var/lib/apt/lists/*
# Create directories
RUN mkdir -p /home/output/ /home/output/csv/ /home/output/pdf/ /home/script/
WORKDIR /home/script
# Install R-packages
COPY /src/install_packages.R /home/script/install_packages.R
RUN Rscript /home/script/install_packages.R
# Copy data
COPY /src/pairs.csv /home/script/pairs.csv
COPY /src/master.R /home/script/master.R
COPY /src/paperboy.ms /home/script/paperboy.ms
# Run the script
RUN ["Rscript", "master.R"]
$ docker run -d
-v $(pwd)/output/:/home/output
-v $(pwd)/output/csv/:/home/output/csv
-v $(pwd)/output/pdf/:/home/output/pdf
$CONTAINER_IMAGE/$DOCKER_IMAGE
5d11eb7e3d93e8b98b6381f1970c25be426ff67abef5e378b715263f174849c9
This is part of the .gitlab-ci.yml:
run:
  stage: run
  script:
    - git remote set-url origin https://$GIT_CI_USER:$GIT_CI_PASS@gitlab.com/$CI_PROJECT_PATH.git
    - git config --global user.name ""
    - git config --global user.email ""
    - git checkout
    - docker login registry.gitlab.com --username gitlab+deploy-token-aaaa --password bbbb
    - docker pull $CONTAINER_IMAGE/$DOCKER_IMAGE
    - docker image ls
    - docker run -t -d
      -v $(pwd)/output/:/home/output
      -v $(pwd)/user/paperboy/output/csv/:/home/output/csv
      -v $(pwd)/user/paperboy/output/pdf/:/home/output/pdf
      $CONTAINER_IMAGE/$DOCKER_IMAGE
    - rm -rf "%CACHE_PATH%/%CI_PIPELINE_ID%"
    - pwd
    - ls -la
    - ls -laR output
    - git status
  only:
    - master
The csv and pdf folders are empty.
$ ls -laR output
output:
total 32
drwxr-xr-x 4 root root 4096 Oct 8 11:37 .
drwxrwxrwx 5 root root 4096 Oct 8 11:37 ..
drwxr-xr-x 2 root root 4096 Oct 8 11:37 csv
drwxr-xr-x 2 root root 4096 Oct 8 11:37 pdf
output/csv:
total 16
drwxr-xr-x 2 root root 4096 Oct 8 11:37 .
drwxr-xr-x 4 root root 4096 Oct 8 11:37 ..
output/pdf:
total 16
drwxr-xr-x 2 root root 4096 Oct 8 11:37 .
drwxr-xr-x 4 root root 4096 Oct 8 11:37 ..

Docker is deleting downloaded files during build when I use a VOLUME, why?

I have this simple Dockerfile:
FROM fabric8/java-centos-openjdk8-jdk
VOLUME /tmp
RUN curl -k -Lo /tmp/oc.tar.gz "https://mirror.openshift.com/pub/openshift-v3/clients/3.6.173.0.21/linux/oc.tar.gz" && ls -l /tmp
RUN ls -l /tmp && tar zxf /tmp/oc.tar.gz -C /usr/local/bin
It downloads a file, prints the contents of /tmp, then lists /tmp again and extracts the downloaded archive.
The problem is that the file is there right after the download (&& ls -l /tmp), but in the next RUN ls -l /tmp it isn't there anymore.
Step 6/17 : RUN curl -k -Lo /tmp/oc.tar.gz "https://mirror.openshift.com/pub/openshift-v3/clients/3.6.173.0.21/linux/oc.tar.gz" && ls -l /tmp
---> Running in 5ad24909ed82
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 34.4M 100 34.4M 0 0 2489k 0 0:00:14 0:00:14 --:--:-- 5660k
total 35308
drwxr-xr-x 2 root root 4096 Mar 17 11:10 hsperfdata_root
-rwx------ 1 root root 836 Mar 2 01:07 ks-script-IAlIsB
-rw-r--r-- 1 root root 36145614 May 24 08:07 oc.tar.gz
-rw------- 1 root root 0 Mar 2 01:06 yum.log
Removing intermediate container 5ad24909ed82
---> 09e50e6d4d84
Step 7/17 : RUN ls -l /tmp && tar zxf /tmp/oc.tar.gz -C /usr/local/bin
---> Running in 49c305788ac9
total 8
drwxr-xr-x 2 root root 4096 Mar 17 11:10 hsperfdata_root
-rwx------ 1 root root 836 Mar 2 01:07 ks-script-IAlIsB
-rw------- 1 root root 0 Mar 2 01:06 yum.log
tar (child): /tmp/oc.tar.gz: Cannot open: No such file or directory
It has something to do with the VOLUME /tmp; without it, everything works fine. What's the explanation for this?
Once the volume is defined, you won't be able to modify it in later build steps. My best guess at what happens internally during the build is that a temporary volume is set up for the temporary container used to perform the RUN step, and when the RUN step completes, the changes to the image are captured, which does not include any changes to the files in the temporary volume. This behavior is documented by Docker:
Changing the volume from within the Dockerfile: If any build steps
change the data within the volume after it has been declared, those
changes will be discarded.
I've also blogged on the topic here.
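One way to work within this behavior is sketched below, under the assumption that the archive is only needed during the build: download and extract in a single RUN step, and declare the volume only after the steps that write to it. The rm at the end is an addition for illustration and not part of the original Dockerfile.

```dockerfile
FROM fabric8/java-centos-openjdk8-jdk

# Download, extract and clean up in ONE step, so nothing has to survive
# in /tmp between build steps.
RUN curl -k -Lo /tmp/oc.tar.gz "https://mirror.openshift.com/pub/openshift-v3/clients/3.6.173.0.21/linux/oc.tar.gz" && \
    tar zxf /tmp/oc.tar.gz -C /usr/local/bin && \
    rm /tmp/oc.tar.gz

# Declare the volume after the build steps that modify the directory;
# changes made to /tmp in later steps would be discarded.
VOLUME /tmp
```

If the VOLUME declaration comes from a parent image instead of your own Dockerfile, reordering is not possible, and downloading to a directory that is not a volume is the simpler route.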

mkdir .ssh in a Dockerfile, folder is not there?

I want my Dockerfile to mkdir .ssh/
But it does not, why not?
FROM jenkinsci/jnlp-slave
MAINTAINER Johnny5 isAlive <johnny5@hotmail.com>
USER root
RUN rm /bin/sh && ln -s /bin/bash /bin/sh
RUN apt-get update
RUN apt-get install unzip git curl vim -y
USER jenkins
RUN mkdir -p /home/jenkins/.ssh && touch /home/jenkins/.ssh/aFile
...building...
Looks fine?
Step 12 : RUN mkdir -p /home/jenkins/.ssh && touch /home/jenkins/.ssh/aFile
---> Running in ca19a679580d
---> 5980df7db482
Removing intermediate container ca19a679580d
Successfully built 5980df7db482
Running and looking around, the .ssh/ folder and aFile inside are not there ...
$ docker run -it -u 0 --entrypoint /bin/bash 5980df7db482
root@4aa40a18baf2:~# pwd
/home/jenkins
root@4aa40a18baf2:~# ls -al
total 24
drwxr-xr-x 3 jenkins jenkins 4096 Oct 17 23:17 .
drwxr-xr-x 4 root root 4096 Sep 14 08:50 ..
-rw-r--r-- 1 jenkins jenkins 220 Nov 12 2014 .bash_logout
-rw-r--r-- 1 jenkins jenkins 3515 Nov 12 2014 .bashrc
-rw-r--r-- 1 jenkins jenkins 675 Nov 12 2014 .profile
drwxr-xr-x 2 jenkins jenkins 4096 Sep 14 08:50 .tmp
root@4aa40a18baf2:~#
If I pull the parent image, jenkinsci/jnlp-slave, and inspect it with docker inspect jenkinsci/jnlp-slave, I can see that it already has a volume defined at /home/jenkins:
[
{
...
"ContainerConfig": {
...
"Volumes": {
"/home/jenkins": {}
},
...
}
]
This means that during each build step, any changes you make to that location won't be committed to your new layer.
Here's a simplified version of your Dockerfile to highlight what's going on:
FROM jenkinsci/jnlp-slave
RUN mkdir -p /home/jenkins/.ssh
Now, let's build with the following command: docker build --no-cache --rm=false -t jns .:
Sending build context to Docker daemon 2.56 kB
Step 1 : FROM jenkinsci/jnlp-slave
---> d7731d944ad7
Step 2 : RUN mkdir -p /home/jenkins/.ssh
---> Running in 520a8e2f7cae
---> 962189878d5e
Successfully built 962189878d5e
The --no-cache option makes the command easier to work with on repeat invocations. The --rm=false will cause the builder to not remove the containers created for each step.
In this case, the builder ran Step 2 in container 520a8e2f7cae on my system. I can now run docker inspect 520a8e2f7cae to see the actual container used for this step. Specifically, I'm interested in its mounts:
[
{
...
"Mounts": [
{
"Name": "e34fd82bd190f21dbd63b5cf70167a16674cd00d95fdc6159314c25c6d08e10e",
"Source": "/var/lib/docker/volumes/e34fd82bd190f21dbd63b5cf70167a16674cd00d95fdc6159314c25c6d08e10e/_data",
"Destination": "/home/jenkins",
"Driver": "local",
"Mode": "",
"RW": true,
"Propagation": ""
}
],
...
}
]
I see that there's an anonymous volume with id e34fd82bd190f21dbd63b5cf70167a16674cd00d95fdc6159314c25c6d08e10e for /home/jenkins.
I can inspect the contents of that volume like this:
$ docker run --rm -v e34fd82bd190f21dbd63b5cf70167a16674cd00d95fdc6159314c25c6d08e10e:/volume alpine ls -lah /volume
total 28
drwxr-xr-x 4 10000 10000 4.0K Oct 18 02:49 .
drwxr-xr-x 25 root root 4.0K Oct 18 02:55 ..
-rw-r--r-- 1 10000 10000 220 Nov 12 2014 .bash_logout
-rw-r--r-- 1 10000 10000 3.4K Nov 12 2014 .bashrc
-rw-r--r-- 1 10000 10000 675 Nov 12 2014 .profile
drwxr-xr-x 2 10000 10000 4.0K Oct 18 02:49 .ssh
drwxr-xr-x 2 10000 10000 4.0K Sep 14 08:50 .tmp
The .ssh directory created in the RUN step is in this volume. Since volumes aren't part of the container's write layer, it won't get committed. I can confirm this by doing a docker diff on this container:
docker diff 520a8e2f7cae
There is no output, showing no changes to the container's filesystem, which is why it doesn't come forward into this layer of the image.
The other contents at this location are files in the parent image that were committed before the VOLUME directive that made /home/jenkins into a volume.
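As a rough illustration of the consequence, files created outside the volume path are committed normally. The sketch below assumes the jenkins user exists in the parent image (the original Dockerfile switches to it) and uses /opt/jenkins-ssh as a hypothetical location; whether that location suits your SSH setup is a separate question.

```dockerfile
FROM jenkinsci/jnlp-slave
USER root

# /opt/jenkins-ssh is a hypothetical path OUTSIDE the /home/jenkins volume,
# so this directory and file are committed to the image layer.
RUN mkdir -p /opt/jenkins-ssh && \
    touch /opt/jenkins-ssh/aFile && \
    chown -R jenkins /opt/jenkins-ssh

USER jenkins

# By contrast, anything created under /home/jenkins in a RUN step lands in
# the temporary anonymous volume and is not part of the resulting image.
```

The files could then be copied or linked into /home/jenkins when the container starts, since the volume is populated at run time, not at build time.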

"No such file or directory" what's wrong in this Dockerfile?

I am playing with a Dockerfile and I have this:
ARG PUID=1000
ARG PGID=1000
RUN groupadd -g $PGID docker-user && \
useradd -u $PUID -g docker-user -m docker-user && \
mkdir /home/docker-user/.composer
COPY container-files/home/docker-user/.composer/composer.json /home/docker-user/.composer
RUN chown -R docker-user:docker-user /home/docker-user/.composer
USER docker-user
RUN composer global install
But when I try to build the image it ends with the following error:
Step 6 : COPY container-files/home/docker-user/.composer/composer.json /home/docker-user/.composer
lstat container-files/home/docker-user/.composer/composer.json: no such file or directory
The file does exist on the host as per this output:
$ ls -la workspace/container-files/home/docker-user/.composer/
total 12
drwxrwxr-x 2 rperez rperez 4096 Oct 5 11:34 .
drwxrwxr-x 3 rperez rperez 4096 Oct 5 11:14 ..
-rw-rw-r-- 1 rperez rperez 208 Oct 5 11:20 composer.json
I have tried also this syntax:
COPY container-files /
But that didn't work either. So I have to ask: what's wrong? Why does this keep failing over and over? What am I missing here?
The documentation addresses this with:
By default the docker build command will look for a Dockerfile at
the root of the build context. The -f, --file, option lets you
specify the path to an alternative file to use instead. This is useful
in cases where the same set of files are used for multiple builds. The
path must be to a file within the build context. If a relative path is specified then it is interpreted as relative to the root of the context.
In this case, I think it should be:
COPY workspace/container-files/home/docker-user/.composer/composer.json /home/docker-user/.composer

Cannot change owner of Docker Volume directory to non-root user

I am using Docker 1.4.1 on Ubuntu 14.04.1 LTS with Kernel 3.13.0-4.
Consider the following Dockerfile
FROM debian:wheezy
VOLUME /var/myvol
# RUN mkdir /var/myvol
# copy content to volume
ADD foo /var/myvol/foo
# create user, and make it new owner of directory
RUN useradd nonroot \
&& chown -R nonroot:nonroot /var/myvol/ \
&& ls -al /var/myvol
# switch to new user
USER nonroot
# remove directory owned by user
RUN ls -al /var/myvol && rm /var/myvol/foo && ls -al /var/myvol
and build it with
touch foo
docker build -t test .
then the resulting output is
Step 0 : FROM debian:wheezy
---> c90d655b99b2
Step 1 : VOLUME /var/myvol
---> Running in d3bc83df9451
---> b860e18186d8
Removing intermediate container d3bc83df9451
Step 2 : ADD foo /var/myvol/foo
---> aded36dba841
Removing intermediate container db5dd1b08958
Step 3 : RUN useradd nonroot && chown -R nonroot:nonroot /var/myvol/ && ls -al /var/myvol
---> Running in 148941cb7858
total 8
drwxr-xr-x 2 nonroot nonroot 4096 Feb 6 09:55 .
drwxr-xr-x 13 root root 4096 Feb 6 09:55 ..
-rw-rw-r-- 1 nonroot nonroot 0 Feb 6 09:30 foo
---> 144e4ff90439
Removing intermediate container 148941cb7858
Step 4 : USER nonroot
---> Running in 924f317b6718
---> 345c1586c69f
Removing intermediate container 924f317b6718
Step 5 : RUN ls -al /var/myvol && rm /var/myvol/foo && ls -al /var/myvol
---> Running in 16c8c2349f27
total 8
drwxr-xr-x 2 root root 4096 Feb 6 09:55 .
drwxr-xr-x 13 root root 4096 Feb 6 09:55 ..
-rw-rw-r-- 1 root root 0 Feb 6 09:30 foo
rm: cannot remove `/var/myvol/foo': Permission denied
INFO[0000] The command [/bin/sh -c ls -al /var/myvol && rm /var/myvol/foo && ls -al /var/myvol] returned a non-zero code: 1
If I replace the VOLUME line with the commented-out one below it, it works perfectly. What is really strange is the output of ls -al: while the first says the owner is nonroot, the second shows the owner as root, so the chown command seems to be somehow discarded, or the permissions are reset after switching to the new user.
Am I understanding Docker volumes in a wrong way? Is only root allowed to work with them, or may this be a bug that I should report?
[Edit]
What I want to achieve is to use a volume as data-storage for a containerized service. This service isn't required to run as root (and so I would prefer to use a non-root user), but is required to delete directories and files that are no longer needed.
When you declare a directory as a VOLUME, you effectively can't use it in a Dockerfile any more. The basic reason is that volumes are set up when the container is run, not built.
In this case, you could simply move the VOLUME statement to the end of the Dockerfile. Any data in the image at that directory will be copied into the volume when the container is started.
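Applied to the Dockerfile from the question, that suggestion might look like the sketch below. It reuses the question's own instructions and only changes their order; the final rm step from the question is left out because it was only there to test permissions.

```dockerfile
FROM debian:wheezy

# Create and populate the directory as ordinary image content first.
RUN mkdir /var/myvol
ADD foo /var/myvol/foo

# Create the user and make it the owner of the directory.
RUN useradd nonroot \
 && chown -R nonroot:nonroot /var/myvol/

# Switch to the new user.
USER nonroot

# Declare the volume last: the contents and ownership set above are copied
# into the volume when a container is started.
VOLUME /var/myvol
```

Any RUN step placed after the VOLUME line would again have its changes to /var/myvol discarded, so the declaration needs to stay at the end.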
