Why does a file within a docker volume not get overwritten? - docker

I'm trying to understand volumes.
When I build and run this image with docker build -t myserver . and docker run -dp 8080:80 myserver, the web server on it prints "Hallo". When I change "Hallo" to "Huhu" in the Dockerfile and rebuild & run the image/container, it shows "Huhu". So far, no surprises.
Next, I added a docker-compose.yaml file that defines two volumes. One volume is mounted on the existing path where the Dockerfile creates index.html. The other is mounted on a new, unused path. I build and run everything with docker compose up --build.
On the first build, the web server prints "Hallo" as expected. I can also see the two volumes and their contents in the Docker GUI. The index.html that was written to the image is now present in the volume. (I guess the volume gets mounted before the Dockerfile can write to it.)
On the second build (swap "Hallo" with "Huhu" and run docker compose up --build again) I was expecting the web server to print "Huhu". But it prints "Hallo". So I'm not sure why the data on the volume was not overwritten by the Dockerfile.
Can you explain?
Here are the files:
Dockerfile
FROM nginx
# First build
RUN echo "Hallo" > /usr/share/nginx/html/index.html
# Second build
# RUN echo "Huhu" > /usr/share/nginx/html/index.html
docker-compose.yaml
services:
  web:
    build: .
    ports:
      - "8080:80"
    volumes:
      - html:/usr/share/nginx/html
      - persistent:/persistent

volumes:
  html:
  persistent:

There are three different cases here:
When you build the image, it knows nothing about volumes. Whatever string is in that RUN echo line is stored in the image. Volumes are not mounted when you run the docker-compose build step, and the Dockerfile cannot write to a volume at all.
The first time you run a container with the volume mounted, and the first time only, if the volume is empty, Docker copies content from the mount point in the image into the volume. This only happens with named volumes and not bind mounts; it only happens on native Docker and not Kubernetes; the volume content is never updated at all after this happens.
The second time you run a container with the volume mounted, since the volume is already populated, the content from the volume hides the content in the image.
You routinely see various setups that use named volumes to "pass through" content from the image (especially Node applications) or to "share files" with another container (frequently an Nginx server). These only work because Docker (and only Docker) automatically populates empty named volumes, and therefore they only work the first time. If you change your package.json, your Node application that mounts a volume over node_modules won't see updates; if you change the static assets you're sharing with a Web server, the named volume will hide those changes in both the application and HTTP-server containers.
Since the named-volume auto-copy only happens in this one very specific case, I'd try to avoid using it, and more generally try to avoid mounting anything over non-empty directories in your image.
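If you do want the rebuilt index.html to show up in this setup, the named volume has to be deleted so that Docker repopulates it from the new image on the next start. A minimal sketch, assuming the Compose file shown above:
# stop the containers and also delete the named volumes from docker-compose.yaml
docker compose down --volumes

# rebuild the image; the recreated (empty) "html" volume is repopulated
# from the new image the first time the container starts
docker compose up --build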

Related

Populate a volume using multiple containers

I am checking the docker documentation on how to use named volumes to share data between containers.
In Populate a volume using a container it is specified that:
If you start a container which creates a new volume, as above, and the container has files or directories in the directory to be mounted (such as /app/ above), the directory’s contents are copied into the volume. The container then mounts and uses the volume, and other containers which use the volume also have access to the pre-populated content.
So I did a simple example where:
I start a container which creates the volume and mounts it to a directory with existing files
I start a second container on which I mount the volume and indeed I can see the first container's files.
So far so good.
However, I wanted to see if it is possible to have pre-populated content from more than one container.
What I did was
Create two simple images which have their respective configuration files in the same directory
FROM alpine:latest
WORKDIR /opt/test
RUN mkdir -p "/opt/test/conf" && \
    echo "container from image 1" > /opt/test/conf/config_1.cfg

FROM alpine:latest
WORKDIR /opt/test
RUN mkdir -p "/opt/test/conf" && \
    echo "container from image 2" > /opt/test/conf/config_2.cfg
Create a docker compose which defines a named volume which is mounted on both services
services:
  test_container_1:
    image: test_image_1
    volumes:
      - test_volume:/opt/test/conf
    tty: true
  test_container_2:
    image: test_image_2
    volumes:
      - test_volume:/opt/test/conf
    tty: true

volumes:
  test_volume:
Started the services.
> docker-compose -p example up
Creating network "example_default" with the default driver
Creating volume "example_test_volume" with default driver
Creating example_test_container_2_1 ... done
Creating example_test_container_1_1 ... done
Attaching to example_test_container_1_1, example_test_container_2_1
According to the logs, container_2 was created first and it pre-populated the volume. However, the volume was then mounted to container_1, and the only file available on the mount was apparently /opt/test/conf/config_2.cfg, effectively removing config_1.cfg.
So my question is whether it is possible to have a volume populated with data from two or more containers.
The reason I want to explore this is so that I can have additional app configuration loaded from different containers, to support a multi-tenant scenario, without having to rework the app to read the tenant configuration from different folders.
Thank you in advance
Once there is any content in a named volume at all, Docker will never automatically copy content into it. It will not merge content from two different images, update the volume if one of the images changes, or anything else.
I'd advise you to ignore the paragraph you quote in the Docker documentation. Assume any volume you mount into the container is initially empty. This matches the behavior you'll get with Docker bind-mounts (host directories), Kubernetes persistent volumes, and basically any other kind of storage besides Docker named volumes proper. Don't mount a volume over the content in your image.
If you can, restructure your application to avoid sharing files at all. One common use of named volumes I see is trying to republish static assets to a reverse proxy, for example; rather than trying to use a named volume (which will never update itself) you can COPY the static assets into a dedicated Web server image. This avoids the various complexities around trying to use a volume here.
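A sketch of that approach, assuming the built assets land in a ./static directory next to the Dockerfile:
# dedicated web-server image: the assets are baked in at build time,
# so rebuilding the image is what updates them (no volume involved)
FROM nginx:alpine
COPY ./static/ /usr/share/nginx/html/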
If you really don't have a choice in the matter, then you can approach this with dedicated code in both of the containers. The basic setup here is:
Have a data directory somewhere outside your application directory, and mount the volume there.
Include the original files in the image somewhere different.
In an entrypoint wrapper script, copy the original files into the data directory (the mounted volume).
Let's say for the sake of argument that you've installed the application into /opt/test, and the data directory will be /etc/test. The entrypoint wrapper script can be as simple as
#!/bin/sh
# Copy config files from the application tree into the config tree
# (overwriting anything that's already there)
cp /opt/test/* "$TEST_CONFIG_DIR"
# Run the main container command
exec "$#"
In the Dockerfile, you need to make sure that directory exists (and if you'll use a non-root user, that user needs permission to write to it).
FROM alpine
WORKDIR /opt/test
COPY ./ ./
ENV TEST_CONFIG_DIR=/etc/test
RUN mkdir "$TEST_CONFIG_DIR"
ENTRYPOINT ["./entrypoint.sh"]
CMD ["./my_app"]
Finally, in the Compose setup, mount the volume on that data directory (you can't use the environment variable, but consider the filesystem path part of the image's API):
version: '3.8'

volumes:
  test_config:

services:
  one:
    build: ./one
    volumes:
      - test_config:/etc/test
  two:
    build: ./two
    volumes:
      - test_config:/etc/test
You would be able to run, for example,
docker-compose run one ls /etc/test
docker-compose run two ls /etc/test
to see both sets of files appear there.
The entrypoint script is code you control. There's nothing especially magical about it beyond the final exec "$@" line to run the main container command. If you want to ignore files that already exist, for example, or if you have a way to merge in changes, then you can implement something more clever than a simple cp command.
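For example, a variant that only copies files that aren't already present might look something like this; a sketch only, assuming the same /opt/test layout and a POSIX shell:
#!/bin/sh
# Copy each item from the application tree into the config tree,
# but leave anything the volume already contains untouched
for f in /opt/test/*; do
  dest="$TEST_CONFIG_DIR/$(basename "$f")"
  [ -e "$dest" ] || cp -r "$f" "$dest"
done
# Run the main container command
exec "$@"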

Docker Compose - services unable to share a named volume when built with context

My goal is to build two containers using docker-compose. Both containers will read/write to a shared volume.
For simplicity, let's say Dockerfile.a looks like:
FROM busybox:latest
WORKDIR /app
RUN touch /app/apple.txt
Dockerfile.b:
FROM busybox:latest
WORKDIR /app
RUN touch /app/banana.txt
and docker-compose.yml
version: "3.8"
services:
a:
image: sandbox/a:v1.0
volumes:
- setupapp-vol:/app
build:
context: .
dockerfile: Dockerfile.a
b:
image: sandbox/b:v1.0
volumes:
- setupapp-vol:/app
build:
context: .
dockerfile: Dockerfile.b
volumes:
setupapp-vol:
Run docker-compose up and check what's in each container.
# ls
banana.txt
# exit
~/sandbox$ docker run -v setupapp-vol:/app -it sandbox/a:v1.0 sh
# ls
banana.txt
# exit
My question is: Why don't I find both apple.txt and banana.txt?
If I don't use a separate build context, the following docker-compose.yml does result in both services writing to the shared volume. (But I really need a separate build context because the real Dockerfiles are more complicated than what I've described.)
version: "3.8"
services:
a:
image: busybox:latest
volumes:
- setupapp:/app
command: "touch /app/apple.txt"
b:
image: busybox:latest
volumes:
- setupapp:/app
command: "touch /app/banana.txt"
volumes:
setupapp:
driver:local
There's a lot going on here. If you can restructure your broader application to not need file sharing between containers, it will be more robust.
In your first example, you have two separate images. Image a contains the file /app/apple.txt and b contains the file /app/banana.txt. When you launch containers from both of them, you mount the named volume setupapp-vol over the /app directory.
The first time, and only the first time, a container gets launched with an empty named volume, the contents of its image get copied into the volume. If there is already content in the volume, it hides what was in the image. This means that if image b gets updated, those updates will get lost, and if image a contains different content, there's no way to see it. A sequence consistent with your observation is:
Container b gets started.
The named volume is empty, so /app/banana.txt gets duplicated from image b into the volume.
Container a gets started.
The named volume is not empty, so its content hides the contents of image a.
You should be able to further demonstrate this by manipulating the volume content:
# manipulates the contents of the shared volume
docker-compose run b \
  mv /app/banana.txt /app/coconut.txt

# will show only coconut, not apple or banana
docker-compose run a \
  ls /app
Similarly, if you change the filename in Dockerfile.b and re-run docker-compose up --build, you won't see that change in the volume contents: there is already content in the volume and it hides what's in the image.
(This is the same behavior behind many SO questions around changes in a Node application's package.json being ignored: the node_modules directory is kept in a volume and the old volume contents override the updated image.)
In your second example, you are overriding the main container command to create the files. There's no relevant content in the image; it is a plain busybox image. In your original Dockerfiles, this would be similar to changing RUN to CMD for the touch command. Now the sequence is:
Container b gets started, mounting the volume. There's no /app directory in the image, so there's no auto-duplication behavior.
The main container command creates /app/banana.txt in the volume and the container exits.
Container a gets started, mounting the volume.
The main container command creates /app/apple.txt in the volume and the container exits.
That is, the difference here is that first the volume contents get mounted into the container (hiding the image contents) and then the main container command runs. In your first example the file is in the image (and gets hidden), but in the second it is created by the main container command (after the mount).
So say you have two images, and they each have some data, and you can't get around trying to merge the data into a single volume. From the examples above we have to do that after the containers start (and the volume gets mounted).
It's a little bit awkward, but you can do this with an entrypoint wrapper script that runs at container startup time. That copies the data from somewhere else in the image, then runs whatever it was given as the main container command:
#!/bin/sh
# Copy the image's template data into the mounted volume
cp -a /data/* /app
# Run the main container command
exec "$@"
In your image, COPY the files into this template directory, make the script above be the ENTRYPOINT, and leave the CMD as it was before. (If you previously split the command between ENTRYPOINT and CMD, combine it into a single CMD; for example CMD ["python3", "app.py"].)
FROM busybox:latest
# Template data lives outside the mount point
RUN mkdir -p /data && touch /data/apple.txt
# entrypoint.sh must be executable (chmod +x entrypoint.sh on the host)
COPY entrypoint.sh /usr/local/bin/
WORKDIR /app
ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
# CMD ...
Now the data will be built into the image (as in the first example) but when you start it up it will copy its own data into the volume (as in the second example).
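With that in place, a quick way to check that the merge worked; this assumes both Dockerfiles get the same entrypoint treatment and the Compose file from your first example:
docker-compose up --build -d

# either service should now see both files through the shared volume;
# this should list both apple.txt and banana.txt
docker-compose run a ls /app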

Mount files in read-only volume (where source is in .dockerignore)

My app depends on secrets, which I have stored in the folder .credentials (e.g. .credentials/.env, .credentials/.google_api.json, etc...) I don't want these files built into the docker image, however they need to be visible to the docker container.
My solution is:
Add .credentials to my .dockerignore
Mount the credentials folder in read-only mode with a volume:
# docker-compose.yaml
version: '3'
services:
  app:
    build: .
    volumes:
      - ./.credentials:/app/.credentials:ro
This is not working (I do not see any credentials inside the docker container). I'm wondering if the .dockerignore is causing the volume to break, or if I've done something else wrong?
Am I going about this the wrong way? e.g. I could just pass the .env file with docker run --env-file .env IMAGE_NAME
Edit:
My issue was to do with how I was running the image. I was doing docker-compose build and then docker run IMAGE_NAME, assuming that the volumes were built into the image. However, this seems not to be the case.
Instead, the above code works when I do docker-compose run app (where app is the service name) after building.
From the comments, the issue here is in looking at the docker-compose.yml file for your container definition while starting the container with docker run. The docker run command does not use the compose file, so no volumes were defined on the resulting container.
The build process itself creates an image, and you do not specify the source of volumes there. Only the Dockerfile and your build context are used as input to the build. The rest of the compose file consists of run-time settings that apply to containers. Many projects do not even use the compose file for building the image; for those projects, the compose file is simply a way to define the default settings for the containers being created.
The solution is to use docker-compose up -d to test your docker-compose.yml.
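If you do want to start a container with plain docker run, you have to repeat the relevant run-time settings from the compose file yourself; roughly, for the compose file above:
# docker run does not read docker-compose.yml, so the read-only
# bind mount has to be specified again on the command line
docker run --rm -v "$(pwd)/.credentials:/app/.credentials:ro" IMAGE_NAME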

Docker COPY command not mounting a directory

Host OS: Linux
Container OS: Linux
I'm trying to learn how to use docker. I use docker-compose and I'm successfully building images and running containers.
Now if I want to mount some directory inside the container the documentation says that I should use the COPY command inside Dockerfile.
COPY /path/to/my/addons/ /path/to/directory/inside/container
Sadly when I compose this container the COPY command is ignored and my stuff from /path/to/my/addons doesn't make it to the container.
I've also tried with ADD command, but same problem.
Absolute paths
First, you can't use absolute host paths for COPY. All source paths must be inside the build context, which usually means relative to the directory containing the Dockerfile. If the folder structure on your host is like this
my-docker-directory
-- Dockerfile
-- docker-compose.yml
-- addons
then you're able to use COPY addons /path/to/directory/inside/container. For all subsequent explanations, I assume that you have an addons folder relative to the Dockerfile.
Mounting a directory
COPY doesn't mount a folder into the container at runtime. It doesn't mount the directory at all. Instead, addons is copied to /path/to/directory/inside/container inside the image. It's important to understand that this process is unidirectional (host > image) and only happens when the image is built.
COPY is designed to add dependencies to the image that are required at build time, like source code that gets compiled to binaries. That's also the reason why you can't use absolute paths: a Dockerfile is usually placed together with source code and config files at the top level of the project.
The build process of an image happens only on the first run, unless you force it using docker-compose up --build. But that doesn't seem to be what you want. To mount a directory from the host at runtime, use a volume in the docker-compose file:
version: '3'
services:
  test:
    build: .
    volumes:
      - ./addons/:/path/to/directory/inside/container
When to use COPY and when volumes?
It's important to realize that COPY and ADD copy files into the image at build time, whereas volumes mount them from the host at runtime (without including them in the image). So you usually copy general things into the image that all users need, like default configuration files.
Volumes are required to include files from the host, like customized configuration files, or for persistent data such as the data directory of a database. Without volumes those containers still work, but they are not persistent: all content would get lost when the container is removed.
Please note that one doesn't exclude the other. It's fine to COPY a default configuration for some application into the image, which the user may then override with volumes. Especially during development this can make things easier because you don't have to rebuild the entire image for a single changed config file.*
* Although it's good practice to optimize Dockerfiles for the integrated caching mechanism. If a Dockerfile is well written, rebuilding after a small config change often doesn't take too long. But that's another topic, out of scope here.
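To make the persistence point concrete, here is a minimal sketch of a named volume keeping a database's data across container recreation; it assumes the official postgres image, whose data directory is /var/lib/postgresql/data:
# the database files live in the named volume "pgdata", not in the container
docker run -d --name db \
  -e POSTGRES_PASSWORD=example \
  -v pgdata:/var/lib/postgresql/data \
  postgres:16

# removing and recreating the container keeps the data,
# because it is stored in the volume
docker rm -f db
docker run -d --name db \
  -e POSTGRES_PASSWORD=example \
  -v pgdata:/var/lib/postgresql/data \
  postgres:16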
More detailed explanation with an example
Basic setup with COPY in Dockerfile
As a simple example, we create a Dockerfile from the nginx web server image and copy some HTML into it
FROM nginx:alpine
COPY my-html /usr/share/nginx/html
Let's create the folder with demo content
mkdir my-html
echo "Dockerfile content" > my-html/index.html
and add a minimalistic docker-compose.yml
version: '3'
services:
  test:
    build: .
If we run it for the first time using docker-compose up -d, the image gets built and our test page is served:
root@server2:~/docker-so-example# docker-compose up -d
Creating network "docker-so-example_default" with the default driver
Creating docker-so-example_test_1 ... done
root@server2:~/docker-so-example# curl $(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' docker-so-example_test_1)
Dockerfile content
Let's manipulate our testfile:
echo "NEW Modified content" > my-html/index.html
If we request our server with curl again, we get the old response:
root@server2:~/docker-so-example# curl $(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' docker-so-example_test_1)
Dockerfile content
To apply our content, a rebuild is required:
docker-compose down && docker-compose up -d --build
Now we can see our changes:
root@server2:~/docker-so-example# curl $(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' docker-so-example_test_1)
NEW Modified content
Use volumes in docker-compose
To show the difference, we use volumes by modifying our docker-compose.yml file like this:
version: '3'
services:
  test:
    build: .
    volumes:
      - ./my-html:/usr/share/nginx/html
Now restart the containers using docker-compose down && docker-compose up -d and try it again:
root@server2:~/docker-so-example# echo "Again changed content" > my-html/index.html
root@server2:~/docker-so-example# curl $(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' docker-so-example_test_1)
NEW Modified content
root@server2:~/docker-so-example# echo "Some content" > my-html/index.html
root@server2:~/docker-so-example# curl $(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' docker-so-example_test_1)
Some content
Notice that we didn't rebuild the image and our modifications apply immediately. With volumes, the files are not included in the image.
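You can double-check that the copy baked into the image is unchanged by building the same Dockerfile under an explicit tag (the tag name here is just an assumption) and running it without the bind mount:
# the file served here is the one copied in at build time, not the host's copy
docker build -t copy-vs-volume-demo .
docker run --rm copy-vs-volume-demo cat /usr/share/nginx/html/index.html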
The COPY command inside a Dockerfile copies the content into the image at build time. Mounting a volume is a different thing; for mounting you need something like
docker run -v <host_path_or_volume_name>:<path_inside_container> ...
What exactly do you want to achieve? Do you want to see the folders inside the container on your host?
Move your addons folder to the location where your Dockerfile is and then add to your Dockerfile:
RUN mkdir -p /path/to/directory/inside/container
COPY ./addons/* /path/to/directory/inside/container/

Docker Data/Named Volumes

I am trying to wrap my mind around Docker volumes but I must have some things missing to understand it.
Let's say I have a Python app that requires some initialisation depending on env variables. What I'm trying to achieve is having a "code only" image from which I can start containers that get mounted into the main container at execution time. The entrypoint script of the main container will then read and generate some files from/on the code-only container.
I tried to create an image to have a copy of the code
FROM ubuntu
COPY ./app /usr/local/code/app
Then docker create --name code_volume
And with docker-compose:
app:
  image: python/app
  hostname: app
  ports:
    - "2443:443"
  environment:
    - ENV=stuff
  volumes_from:
    - code_volume
I get an error from the app container saying it can't find a file in /usr/local/code/app/src, but when I run code_volume with bash and ls into the folder, the file is sitting there...
I tried changing access rights and adding /bin/true (having seen it in some examples), but I just can't get what I want working. I checked the docker volume create feature, but it seems to be for storing/sharing data afterwards.
What am I missing ? Is the entrypoint script executed before volumes are mounted ? Is there any best practices for cases like this that don't involve mounting folders and keeping one copy for every container ? Should I be thinking my containers over again ?
You did not declare the volume on the code_volume container when you created it.
docker create -v /usr/local/code/app --name code_volume <your-code-image>
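A fuller sketch of the data-volume-container pattern, with <your-code-image> standing in for whatever tag you gave the image built from the Dockerfile above:
# build the code-only image under some tag
docker build -t <your-code-image> .

# create the data container; -v declares the volume at the code path,
# so the code COPY'd into the image is copied into that volume
docker create -v /usr/local/code/app --name code_volume <your-code-image> /bin/true

# containers using volumes_from (or --volumes-from) now see the code
docker run --rm --volumes-from code_volume python/app ls /usr/local/code/app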
