Maybe I'm missing this when reading the docs, but is there a way to overwrite files on the container's file system when issuing a docker run command?
Something akin to the Dockerfile COPY command? The key desire here is to be able to take a particular Docker image, and spin several of the same image up, but with different configuration files. (I'd prefer to do this with environment variables, but the application that I'm Dockerizing is not partial to that.)
You have a few options. Using something like docker-compose, you could automatically build a unique image for each container using your base image as a template. For example, if you had a docker-compose.yml that looked like:
container0:
  build: container0
container1:
  build: container1
And then inside container0/Dockerfile you had:
FROM larsks/thttpd
COPY index.html /index.html
And inside container0/index.html you had whatever content you
wanted, then running docker-compose build would generate unique
images for each entry (and running docker-compose up would start
everything up).
I've put together an example of the above
here.
Using just the Docker command line, you can use host volume mounts,
which allow you to mount files into a container as well as
directories. Using my thttpd as an example again, you could use the
following -v argument to override /index.html in the container
with the content of your choice:
docker run -v "$(pwd)/index.html:/index.html" larsks/thttpd
And you could accomplish the same thing with docker-compose via the
volume entry:
container0:
  image: larsks/thttpd
  volumes:
    - ./container0/index.html:/index.html
container1:
  image: larsks/thttpd
  volumes:
    - ./container1/index.html:/index.html
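For the plain docker run form, the source path is worth spelling out: a -v source with no leading slash is treated as a volume name rather than a host file. Here is a minimal sketch of a dry-run helper that resolves the path first and only prints the command. The function name bind_mount_cmd and the -d and :ro choices are assumptions made for illustration, not part of the answer above:

```shell
#!/bin/sh
# Print (not run) a docker run command that bind-mounts one host file
# over /index.html. The helper name and options are illustrative only.
bind_mount_cmd() {
    host_file=$1   # path to the replacement file on the host
    image=$2       # image to start, e.g. larsks/thttpd
    # Resolve the directory to an absolute path so the -v source
    # cannot be mistaken for a volume name.
    dir=$(cd "$(dirname "$host_file")" && pwd) || return 1
    printf 'docker run -d -v %s/%s:/index.html:ro %s\n' \
        "$dir" "$(basename "$host_file")" "$image"
}
```

Calling bind_mount_cmd container0/index.html larsks/thttpd and then bind_mount_cmd container1/index.html larsks/thttpd would print one command per configuration. Paths containing spaces would need extra quoting that this sketch leaves out.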
I would suggest that using the build mechanism makes more sense if you are trying to override many files, while using volumes is fine for one or two files.
A key difference between the two mechanisms is that when building images, each image will have its own copy of the files, while with volume mounts, changes made to the file within the container will be reflected on the host filesystem.
I'm trying to understand volumes.
When I build and run this image with docker build -t myserver . and docker run -dp 8080:80 myserver, the web server on it prints "Hallo". When I change "Hallo" to "Huhu" in the Dockerfile and rebuild & run the image/container, it shows "Huhu". So far, no surprises.
Next, I added a docker-compose.yaml file that has two volumes. One volume is mounted on an existing path of where the Dockerfile creates the index.html. The other is mounted on a new and unused path. I build and run everything with docker compose up --build.
On the first build, the web server prints "Hallo" as expected. I can also see the two volumes and their contents in the Docker GUI. The index.html that was written to the image is now present in the volume. (I guess the volume gets mounted before the Dockerfile can write to it.)
On the second build (swap "Hallo" with "Huhu" and run docker compose up --build again) I was expecting the web server to print "Huhu". But it prints "Hallo". So I'm not sure why the data on the volume was not overwritten by the Dockerfile.
Can you explain?
Here are the files:
Dockerfile
FROM nginx
# First build
RUN echo "Hallo" > /usr/share/nginx/html/index.html
# Second build
# RUN echo "Huhu" > /usr/share/nginx/html/index.html
docker-compose.yaml
services:
  web:
    build: .
    ports:
      - "8080:80"
    volumes:
      - html:/usr/share/nginx/html
      - persistent:/persistent
volumes:
  html:
  persistent:
There are three different cases here:
When you build the image, it knows nothing about volumes. Whatever string is in that RUN echo line, it is stored in the image. Volumes are not mounted when you run the docker-compose build step, and the Dockerfile cannot write to a volume at all.
The first time you run a container with the volume mounted, and the first time only, if the volume is empty, Docker copies content from the mount point in the image into the volume. This only happens with named volumes and not bind mounts; it only happens on native Docker and not Kubernetes; the volume content is never updated at all after this happens.
The second time you run a container with the volume mounted, since the volume is already populated, the content from the volume hides the content in the image.
You routinely see setups that use named volumes to "pass through" to the image (especially Node applications) or to "share files" with another container (frequently an Nginx server). These only work because Docker (and only Docker) automatically populates empty named volumes, and therefore they only work the first time. If you change your package.json, your Node application that mounts a volume over node_modules won't see updates; if you change the static assets that you're sharing with a Web server, the named volume will hide those changes in both the application and HTTP-server containers.
Since the named-volume auto-copy only happens in this one very specific case, I'd try to avoid using it, and more generally try to avoid mounting anything over non-empty directories in your image.
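The three cases above can be modelled with plain directories. This is only a sketch of the copy-once rule, not Docker itself. The function name mount_named_volume is made up for illustration, and the two arguments stand in for the image content at the mount point and for the named volume:

```shell
#!/bin/sh
# Model of the named-volume behaviour using plain directories.
# $1 = directory standing in for the image content at the mount point
# $2 = directory standing in for the named volume
mount_named_volume() {
    image_dir=$1
    volume_dir=$2
    mkdir -p "$volume_dir"
    # Copy from the image only while the volume is still empty.
    if [ -z "$(ls -A "$volume_dir")" ]; then
        cp -R "$image_dir"/. "$volume_dir"/
    fi
    # The container sees the volume content, never the image content.
    cat "$volume_dir/index.html"
}
```

Calling it first with a directory whose index.html holds "Hallo", and then with one holding "Huhu" against the same volume directory, prints Hallo both times. Removing the volume directory in between (the analogue of deleting the named volume, for example with docker compose down -v) makes the next call print Huhu.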
I am checking the docker documentation on how to use named volumes to share data between containers.
In Populate a volume using a container it is specified that:
If you start a container which creates a new volume, as above, and the container has files or directories in the directory to be mounted (such as /app/ above), the directory’s contents are copied into the volume. The container then mounts and uses the volume, and other containers which use the volume also have access to the pre-populated content.
So I did a simple example where:
I start a container which creates the volume and mounts it to a directory with existing files
I start a second container on which I mount the volume and indeed I can see the first container's files.
So far so good.
However, I wanted to see if it is possible to have pre-populated content from more than one container.
What I did was
Create two simple images which have their respective configuration files in the same directory
FROM alpine:latest
WORKDIR /opt/test
RUN mkdir -p "/opt/test/conf" && \
    echo "container from image 1" > /opt/test/conf/config_1.cfg

FROM alpine:latest
WORKDIR /opt/test
RUN mkdir -p "/opt/test/conf" && \
    echo "container from image 2" > /opt/test/conf/config_2.cfg
Create a docker compose which defines a named volume which is mounted on both services
services:
  test_container_1:
    image: test_image_1
    volumes:
      - test_volume:/opt/test/conf
    tty: true
  test_container_2:
    image: test_image_2
    volumes:
      - test_volume:/opt/test/conf
    tty: true
volumes:
  test_volume:
Started the services.
> docker-compose -p example up
Creating network "example_default" with the default driver
Creating volume "example_test_volume" with default driver
Creating example_test_container_2_1 ... done
Creating example_test_container_1_1 ... done
Attaching to example_test_container_1_1, example_test_container_2_1
According to the logs container_2 was created first and it pre-populated the volume. However, the volume was then mounted to container_1 and the only file available on the mount was apparently /opt/test/conf/config_2.cfg effectively removing config_1.
So my question is whether it is possible to have a volume populated with data from two or more containers.
The reason I want to explore this, is so that I can have additional app configuration loaded from different containers, to support a multi tenant scenario, without having to rework the app to read the tenant configuration from different folders.
Thank you in advance
Once there is any content in a named volume at all, Docker will never automatically copy content into it. It will not merge content from two different images, update the volume if one of the images changes, or anything else.
I'd advise you to ignore the paragraph you quote in the Docker documentation. Assume any volume you mount into the container is initially empty. This matches the behavior you'll get with Docker bind-mounts (host directories), Kubernetes persistent volumes, and basically any other kind of storage besides Docker named volumes proper. Don't mount a volume over the content in your image.
If you can, restructure your application to avoid sharing files at all. One common use of named volumes I see is trying to republish static assets to a reverse proxy, for example; rather than trying to use a named volume (which will never update itself) you can COPY the static assets into a dedicated Web server image. This avoids the various complexities around trying to use a volume here.
If you really don't have a choice in the matter, then you can approach this with dedicated code in both of the containers. The basic setup here is:
Have a data directory somewhere outside your application directory, and mount the volume there.
Include the original files in the image somewhere different.
In an entrypoint wrapper script, copy the original files into the data directory (the mounted volume).
Let's say for the sake of argument that you've installed the application into /opt/test, and the data directory will be /etc/test. The entrypoint wrapper script can be as little as
#!/bin/sh
# Copy config files from the application tree into the config tree
# (overwriting anything that's already there)
cp /opt/test/* "$TEST_CONFIG_DIR"
# Run the main container command
exec "$@"
In the Dockerfile, you need to make sure that directory exists (and if you'll use a non-root user, that user needs permission to write to it).
FROM alpine
WORKDIR /opt/test
COPY ./ ./
ENV TEST_CONFIG_DIR=/etc/test
RUN mkdir "$TEST_CONFIG_DIR"
ENTRYPOINT ["./entrypoint.sh"]
CMD ["./my_app"]
Finally, in the Compose setup, mount the volume on that data directory (you can't use the environment variable, but consider the filesystem path part of the image's API):
version: '3.8'
volumes:
  test_config:
services:
  one:
    build: ./one
    volumes:
      - test_config:/etc/test
  two:
    build: ./two
    volumes:
      - test_config:/etc/test
You would be able to run, for example,
docker-compose run one ls /etc/test
docker-compose run two ls /etc/test
to see both sets of files appear there.
The entrypoint script is code you control. There's nothing especially magical about it beyond the final exec "$@" line to run the main container command. If you want to ignore files that already exist, for example, or if you have a way to merge in changes, then you can implement something more clever than a simple cp command.
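As one possible "more clever" variant, here is a sketch of a copy step that leaves files already present in the volume untouched. The function name copy_missing is hypothetical, and it only handles regular files at the top level of the source directory:

```shell
#!/bin/sh
# Copy regular files from src_dir into dest_dir, but keep anything
# that already exists in dest_dir. Names here are illustrative.
copy_missing() {
    src_dir=$1
    dest_dir=$2
    for src in "$src_dir"/*; do
        # Skip directories and the literal pattern of an unmatched glob.
        [ -f "$src" ] || continue
        dest="$dest_dir/$(basename "$src")"
        # Copy only when nothing is there yet.
        [ -e "$dest" ] || cp "$src" "$dest"
    done
}

# In the wrapper script this would replace the plain cp line, e.g.:
#   copy_missing /opt/test "$TEST_CONFIG_DIR"
```

With this variant, whichever container starts first wins for a given file name, so it suits per-tenant files with distinct names better than files that several images ship under the same name.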
My app depends on secrets, which I have stored in the folder .credentials (e.g. .credentials/.env, .credentials/.google_api.json, etc...) I don't want these files built into the docker image, however they need to be visible to the docker container.
My solution is:
Add .credentials to my .dockerignore
Mount the credentials folder in read-only mode with a volume:
# docker-compose.yaml
version: '3'
services:
  app:
    build: .
    volumes:
      - ./.credentials:/app/.credentials:ro
This is not working (I do not see any credentials inside the docker container). I'm wondering if the .dockerignore is causing the volume to break, or if I've done something else wrong?
Am I going about this the wrong way? e.g. I could just pass the .env file with docker run --env-file .env IMAGE_NAME
Edit:
My issue had to do with how I was running the image. I was doing docker-compose build and then docker run IMAGE_NAME, assuming that the volumes were built into the image. However, this is not the case.
Instead, the above code works when I do docker-compose run app (where app is the service name) after building.
From the comments, the issue here is in looking at the docker-compose.yml file for your container definition while starting the container with docker run. The docker run command does not use the compose file, so no volumes were defined on the resulting container.
The build process itself creates an image, and you do not specify the source of volumes there. Only the Dockerfile and your build context are used as inputs to the build. The rest of the compose file consists of run-time settings that apply to containers. Many projects do not even use the compose file for building the image, so for those projects all settings in the compose file are a way to define the default settings for containers being created.
The solution is to use docker-compose up -d to test your docker-compose.yml.
I'm trying to create a relatively simple setup to develop and test npm packages. The problem is that once you mount a code volume into the container, it replaces node_modules.
I tried a lot of generally logical stuff, mostly aimed to move node_modules to another location and then reference it within configuration files. It works, but the solution is ugly. Also, it's not good practice to install webpack globally, but my solution requires it.
However, after some time I found this solution, which looks elegant and is just what I needed, but it also has one problem: I don't completely understand how it works.
This is my version of how everything operates.
Docker reorders volume mounting based on container paths
Docker mounts sub dir volume at first
Docker mounts parent dir volume but due to an unexplained mechanism, it does not override the sub dir volume...
???
PROFIT. node_modules dir is in place and webpack runs perfectly.
So, I really want to understand how it actually does all of this black magic. Because without this knowledge I feel like I'm missing something important.
So, guys, how does it work?
Thanks in advance.
services:
  react-generic-form:
    image: react-generic-form:package
    container_name: react-generic-form-package
    build:
      dockerfile: dev.Dockerfile
      context: ./package
    volumes:
      - "./package:/package"
      - "/package/node_modules"
The Docker daemon, when it creates the container, sorts all of the mount points to avoid shadowing. (On non-Windows, this happens in (*github.com/docker/docker/daemon.Daemon).setupMounts.) So, in your example:
The Docker daemon sees that both /package and /package/node_modules contain data that's stored outside the container filespace.
It sorts these shortest to longest.
It mounts /package, as a bind-mount to the named host directory. (First, because it's a shorter path name.)
It mounts /package/node_modules, shadowing the equivalent directory in the previous mount, probably as a bind-mount to a directory with a long hex identifier name somewhere in /var/lib/docker/volumes.
You can experiment more with this with a docker-compose.yml file like
version: '3'
services:
  touch:
    image: busybox
    volumes:
      - ./b:/a/b
      - ./a:/a
    command: touch /a/b/c
Notice that whichever order you put the volumes: in, you will get an empty directory ./a/b (which becomes the mount point inside the container), plus an empty file ./b/c (the result of the touch command).
Also note what the volume declaration states: the node_modules directory contains data that should be persisted across container invocations and that has a lifecycle separate from both the container and its base image. Changing the image and re-running docker-compose up will have no effect on this volume's content.
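The sorting step described in this answer can be sketched in a few lines of shell. This assumes that "shortest" means fewest path components; the daemon's actual implementation may differ in detail, and the function name sort_mounts is made up for illustration:

```shell
#!/bin/sh
# Order container mount targets so that parent paths come before
# their children. Reads one path per line on standard input.
# Assumption: ordering is by number of path components.
sort_mounts() {
    # Prefix each path with its field count, sort numerically on that
    # count, then strip the prefix again.
    awk -F/ '{ print NF "\t" $0 }' | sort -k1,1n | cut -f2-
}
```

Feeding it /package/node_modules and /package in either order yields /package first, which is why the node_modules mount ends up layered on top of the bind-mounted source tree rather than hidden beneath it.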
I am working on a docker container that is being created from a generic image. The entry point of this container is dependent on a file in the local file system and not in the generic image. My docker-compose file looks something like this:
service_name:
  image: base_generic_image
  container_name: container_name
  entrypoint:
    - "/user/dlc/bin"
    - "-p"
    - "localFolder/fileName.ext"
    - more parameters
The challenge that I am facing is removing this dependency and adding it to the base_generic_image at run time so that I can deploy it independently. Should I add this file to the base generic image and then proceed (this file is not required by others), or should this be done when creating the container? If so, what is the best way of going about it?
You should create a separate image for each part of your application. These can be based on the base image if you'd like; the Dockerfile might look like
FROM base_generic_image
COPY dlc /usr/bin
CMD ["dlc"]
Your Docker Compose setup might have a subdirectory for each component and could look like
servicename:
  image: my/servicename
  build:
    context: ./servicename
  command: ["dlc", "-p", ...]
In general Docker volumes and bind-mounts are good things to use for persistent data (when absolutely required; stateless containers with external databases are often easier to manage), getting log files out of containers, and pushing complex configuration into containers. The actual program that's being run generally should be built into the base image. The goal is that you can take the image and docker run it on a clean system without any of your development environment on it.