Modifying volume data inherited from parent image - docker

Say there is an image A described by the following Dockerfile:
FROM bash
RUN mkdir "/data" && echo "FOO" > "/data/test"
VOLUME "/data"
I want to specify an image B that inherits from A and modifies /data/test. I don't want to mount the volume; I want it to have some default data that I specify in B:
FROM A
RUN echo "BAR" > "/data/test"
The problem is that the test file keeps the content it had at the moment of the VOLUME instruction in A's Dockerfile. The test file in image B contains FOO instead of the BAR I would expect.
The following Dockerfile demonstrates the behavior:
FROM bash
# overwriting volume file
RUN mkdir "/volume-data" && echo "FOO" > "/volume-data/test"
VOLUME "/volume-data"
RUN echo "BAR" > "/volume-data/test"
RUN cat "/volume-data/test" # prints "FOO"
# overwriting non-volume file
RUN mkdir "/regular-data" && echo "FOO" > "/regular-data/test"
RUN echo "BAR" > "/regular-data/test"
RUN cat "/regular-data/test" # prints "BAR"
Building the Dockerfile will print FOO and BAR.
Is it possible to modify file /data/test in B Dockerfile?

It seems that this is intended behavior. From the Dockerfile reference, on changing the volume from within the Dockerfile: "If any build steps change the data within the volume after it has been declared, those changes will be discarded."

VOLUMEs are not part of your image, so what is the use case for seeding data into them? When you run the image somewhere else, it starts with an empty volume; the Dockerfile behavior reminds you of that.
So basically, if you want to keep the data along with the app code, you should not use VOLUMEs. If the volume declaration exists in the parent image, you need to remove it before starting your own image build (docker-copyedit can do this).
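For reference, a hedged sketch of what that can look like with docker-copyedit (the script name and edit syntax follow that project's documentation, so double-check them against the version you install; note that real image names must be lowercase, unlike the A/B placeholders above):
# rewrite the parent image into a copy with its VOLUME declarations stripped,
# then build image B FROM the stripped copy instead of FROM the original parent
./docker-copyedit.py FROM image-a INTO image-a-novolume REMOVE ALL VOLUMES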

There are a few non-obvious ways to do this, and all of them have their obvious flaws.
Hijack the parent docker file
Perhaps the simplest, but least reusable, way is to simply take the parent Dockerfile and modify that. A quick Google of docker <image-name:version> source should find the GitHub repository hosting the parent image's Dockerfile. This is good for optimizing the final image, but it destroys the point of reusing layers.
Use an on start script
While a Dockerfile can't make further modifications to a volume, a running container can. Add a script to the image, and change the Entrypoint to call that (and have that script call the original entry point). This is what you will HAVE to do if you are using a singleton-type container, and need to partially 'reset' a volume on start up. Of course, since volumes are persisted outside the container, just remember that 1) your changes may already be made, and 2) Another container started at the same time may already be making those changes.
Since volumes are (virtually) forever, I just use one-time setup scripts after starting the containers for the first time. That way I easily control when the default data is set up or reset. (You can use docker volume inspect <volume-name> to get the host location of the volume if you need to.)
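A minimal sketch of that entrypoint approach for the /data example above (the script path and the final CMD are assumptions; in a real image you would chain to the parent's original entrypoint):
FROM A
COPY seed-data.sh /seed-data.sh
RUN chmod +x /seed-data.sh
ENTRYPOINT ["/seed-data.sh"]
CMD ["bash"]
where seed-data.sh runs at container start, after the volume is mounted, and then hands off to the requested command:
#!/bin/sh
# seed or reset the default data in the (possibly empty) mounted volume
echo "BAR" > /data/test
# exec whatever command the container was asked to run
exec "$@"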
The common middle ground on this one seems to be to have a one-off image whose only purpose is to run once to do the volume configurations, and then clean it up.
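For example, a throwaway container can populate a named volume once before the real service ever uses it (the volume name is a placeholder, and A is the parent image from the question):
# create the volume and seed it a single time
docker volume create app-data
docker run --rm -v app-data:/data bash bash -c 'echo "BAR" > /data/test'
# every container started afterwards sees the seeded data
docker run -d --name my-app -v app-data:/data A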
Bind to a new volume
Copy the contents of the old volume to a new one, and configure everything to use the new volume instead.
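A sketch of that idea at build time, assuming the application can be reconfigured to read from a path of your choosing (/data2 is a placeholder):
FROM A
# /data is still a volume inherited from A, but /data2 is ordinary image content,
# so it can be populated during the build and modified freely
RUN mkdir /data2 && cp -a /data/. /data2/ && echo "BAR" > /data2/test
# then point the application's configuration at /data2 instead of /data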
And finally... reconsider if Docker is right for you
You probably already wasted more time on this than it was worth. (In my experience, the maintenance pain of Docker has always far outweighed the benefits. However, Docker is a tool, and with any tool, you need to sometimes take a moment to reflect if you are using it right, and if there are better tools for the job.)

Related

How can I replace a set of ENV commands in Dockerfiles with placing them to some other file, which can be reused?

I have two Dockerfiles (and may have more later) with a list of environment variables that is the same for both files. Let's say:
ENV VAR1="value1"
ENV VAR2="value2"
ENV VAR3="value3"
Can I somehow move this setup to a file, which can be used in all the Dockerfiles, where it's required?
I want to remove duplicates and have a common place for setting those variables.
You can split these into a custom base image. That image would look like
# or whatever base image you're actually using
FROM ubuntu:18.04
ENV VAR1="value1"
ENV VAR2="value2"
ENV VAR3="value3"
# and that's all
You would have to manually build this in most situations
docker build -t my/env-base -f Dockerfile.env .
and then you can refer to it in the downstream Dockerfiles
FROM my/env-base
# the rest of the Dockerfile commands as normal
Tooling like Docker Compose won't really be aware of this image layering. There's no good way to list a base image that needs to be built as a dependency of other things, but shouldn't run a container on its own. If you do change these values you'll have to manually rebuild the base image, then rebuild the application images.
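In practice that manual rebuild is just a couple of commands whenever the shared variables change (the tags and paths are the same hypothetical ones as above):
# rebuild the shared base first...
docker build -t my/env-base -f Dockerfile.env .
# ...then rebuild every application image whose Dockerfile starts with FROM my/env-base
docker build -t my/app-one ./app-one
docker build -t my/app-two ./app-two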
You should also consider whether you need all of these environment variables. In other SO questions I see variables used for filesystem paths (which can be fixed in an isolated Docker image), usernames (not a Docker concept really), credentials (keep far away from the image, it's really easy to get them back out), versions, and URLs. You might be able to get away with using fixed values for these (use /app rather than $INSTALL_PATH), or have a sensible default in your application code.

Deriving FROM existing Dockerfile + setting USER to non-root

I'm trying to find a generic best practice for how to:
Take an arbitrary (parent) Dockerfile, e.g. one of the official Docker images that run their containerized service as root,
Derive a custom (child) Dockerfile from it (via FROM ...),
Adjust the child in the way that it runs the same service as the parent, but as non-root user.
I've been searching and trying for days now but haven't been able to come up with a satisfying solution.
I'd like to come up with an approach e.g. similar to the following, simply for adjusting the user the original service runs as:
FROM mariadb:10.3
RUN chgrp -R 0 /var/lib/mysql && \
chmod g=u /var/lib/mysql
USER 1234
However, the issue I'm running into again and again is that whenever the parent Dockerfile declares some path as a VOLUME (in the example above, VOLUME /var/lib/mysql), it effectively becomes impossible for the child Dockerfile to adjust file permissions for that specific path. The chgrp and chmod have no effect in that case, so the resulting container won't be able to start successfully, due to file permission issues.
I understand that the VOLUME directive works that way by design and also why it's like that, but to me it seems that it completely prevents a simple solution for the given problem: Taking a Dockerfile and adjusting it in a simple, clean and minimalistic way to run as non-root instead of root.
The background is: I'm trying to run arbitrary Docker images on an OpenShift cluster. OpenShift by default prevents running containers as root, which I'd like to keep that way, as it seems quite sane and a step in the right direction, security-wise.
This implies that a solution like gosu, which expects the container to be started as root in order to drop privileges at runtime, isn't good enough here. I'd like an approach that doesn't require the container to be started as root at all, but only as the specified USER or even with a random UID.
The unsatisfying approaches that I've found until now are:
Copy the parent Dockerfile and adjust it in the way necessary (effectively duplicating code)
sed/awk through all the service's config files during build time to replace the original VOLUME path with an alternate path, so the chgrp and chmod can work (leaving the original VOLUME path orphaned).
I really don't like these approaches, as they require to really dig into the logic and infrastructure of the parent Dockerfile and how the service itself operates.
So there must be better ways to do this, right? What is it that I'm missing? Help is greatly appreciated.
Permissions on volume mount points don't matter at all; the mount covers up whatever underlying permissions were there to start with. Additionally, you can set this kind of thing at the Kubernetes level rather than worrying about the Dockerfile at all. This is usually done through a PodSecurityPolicy, but you can also set it in the securityContext on the pod itself.

Is there a good, standard way to "bootstrap" a containerized application?

I'm working on an application which needs to be initialized the first time it is run.
Practically, what this will do is initialize a database with some starter values, and save some files in a persistent volume. If I stop the container and then restart it, I don't want to re-run that bootstrapping routine. In other words, if the container is present and populated - skip the initialization routine.
The way I was going to implement this was to have an entrypoint script which checks if the configuration files are present, and if so skips the bootstrapping routine. However, I was wondering if there is a better way to do it?
For example, is there a way to run a script which is specifically triggered by the need to create a volume? If I could do that, the only circumstance under which I'd run the bootstrapper would be when the application was initializing for the first time.
Or, is there a better, more Dockerish pattern that defines how I should go about this problem?
"Do the initialization in an entrypoint script if the files don't already exist" seems to be reasonably idiomatic. For example, the standard postgres:9.6 image checks for a $PGDATA/PG_VERSION file.
Hypothetically this can look something like:
#!/bin/sh
# one-time initialization: only run the setup if the config file doesn't exist yet
if [ ! -f /data/config.ini ]; then
  /opt/myapp/setup-data.sh /data
fi
# then replace this script with the container's main command
exec "$@"
Remember that it's very routine to delete and recreate containers for a variety of reasons (in my experience, stop and start as actions are rare, but some of this is habits born of an earlier age of Docker); this ties in well with your intuition to use the entrypoint for this, since it will get launched on every docker run. From within your container you can't really tell whether a directory is or isn't a volume, and there aren't any hooks you can tie into; at the point the entrypoint begins, the container environment is fully set up, with whatever networks and volumes are already attached.
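A Dockerfile fragment wiring that entrypoint in might look like this (the image, script, and command names are illustrative, not taken from any particular project):
FROM python:3.5
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
# the entrypoint receives the CMD as its arguments ("$@") and execs it after the one-time setup
ENTRYPOINT ["/entrypoint.sh"]
CMD ["/opt/myapp/run-server.sh"]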

Is there a reason why people write everything in the Dockerfile instead of a separate shell script?

I somehow don't like the RUN x && y && z ... syntax we currently use in Dockerfiles. As far as I understand, I could just run a shell script instead, like RUN xyz.sh, and do the same tasks in my favorite language. Does the latter have any disadvantage?
Update:
In addition to the point made by David about complexity, I believe writing everything in the Dockerfile makes it easier to share (thus creating a survivorship bias in what you see). Namely, on Docker Hub you usually have a "Dockerfile" tab to quickly get an idea of how the image is built. If the author uses COPY and RUN xyz.sh, they would have to host the script elsewhere, or the Dockerfile alone becomes meaningless.
CMD is executed at runtime, that is, when the container is created from the image. RUN is a build-time instruction. So the question is really why people run things with RUN at build time instead of with CMD at runtime. (You can of course COPY script.sh /script.sh and then RUN bash /script.sh.)
If you do things like installing dependencies at runtime, it can take a lot of time; when scaling up your service, this makes auto-scaling much less useful because new containers can't start fast enough to absorb the peak.
At build time, RUN can be cached, so next time the build will be a lot faster.
Because of the way the Docker filesystem works, creating 10 containers from the same image takes only a little more space than creating 1 container. So you can save disk space by installing packages in the image, whereas if you install them at runtime, each container occupies its own extra disk space.
RUN executes commands in a new layer and creates a new image. This happens when you build the image using docker build.
CMD specifies a default command and parameters to be run when a container is launched from the image.
In summary: RUN and CMD are not interchangeable. RUN runs when an image is created, CMD when a container is launched.
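A small Dockerfile makes the difference concrete (the package and command are just examples):
FROM python:3.5
# RUN executes now, during `docker build`, and its result is saved in a cached image layer
RUN pip install requests
# CMD only records the default command; nothing runs until `docker run` starts a container
CMD ["python", "-m", "http.server", "8000"]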

How do I update docker images?

I read that Docker works with layers, so when creating a container with a Dockerfile, you start with the base image, then each subsequent command adds a layer, so if you save the state of that new container, you have a new image. There are a couple of things I'm wondering about this.
1.) If I start from an Ubuntu image, which is pretty big and bulky since it's a complete OS, then I add a few tools to it and save this as a new image which I upload to the hub. If someone downloads my image, and they already have an Ubuntu image saved in their images folder, does this mean they can skip downloading Ubuntu since they already have the image? If so, how does this work when I modify parts of the original image? Does Docker use its cached data to selectively apply those changes to the Ubuntu image after it loads it?
2.) How do I update an image that I built by modifying the Dockerfile? I setup a simple django project with this Dockerfile:
FROM python:3.5
ENV PYTHONBUFFERED 1
ENV APPLICATION_ROOT /app
ENV APP_ENVIRONMENT L
RUN mkdir -p $APPLICATION_ROOT
WORKDIR $APPLICATION_ROOT
ADD requirements.txt $APPLICATION_ROOT
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
ADD . $APPLICATION_ROOT
and used this to create the image in the beginning. So every time I create a box, it loads all these environment variables; if I rebuild the box completely, it reinstalls the packages and all the extras. I need to add a new environment variable, so I added it to the bottom of the Dockerfile, along with a test variable:
ENV COMPOSE_CONVERT_WINDOWS_PATHS 1
ENV TEST_ENV_VAR TEST
When I delete the container and the image and build a new one, it all seems to go accordingly; the build output shows the new step:
Step 4 : ENV COMPOSE_CONVERT_WINDOWS_PATHS 1
 ---> Running in 75551ea311b2
 ---> b25b60e29f18
Removing intermediate container 75551ea311b2
So it's like something gets lost in some of these intermediate container transitions. Is this how the caching system works, with every new layer being an intermediate container? With that in mind, how do you add a new layer? Do you always have to add the new data at the bottom of the Dockerfile? Or would it be better to leave the Dockerfile alone once the image is built, and just modify the container and build a new image?
EDIT: I just tried pulling an image, bwawrik/bioinformatics, which is a CentOS-based image with a wide range of tools installed.
It froze halfway through, so I exited it and then ran it again to see if everything was installed:
$ docker pull bwawrik/bioinformatics
Using default tag: latest
latest: Pulling from bwawrik/bioinformatics
a3ed95caeb02: Already exists
a3ed95caeb02: Already exists
7e78dbe53fdd: Already exists
ebcc98113eaa: Already exists
598d3c8fd678: Already exists
12520d1e1960: Already exists
9b4912d2bc7b: Already exists
c64f941884ae: Already exists
24371a4298bf: Already exists
993de48846f3: Already exists
2231b3c00b9e: Already exists
2d67c793630d: Already exists
d43673e70e8e: Already exists
fe4f50dda611: Already exists
33300f752b24: Already exists
b4eec31201d8: Already exists
f34092f697e8: Already exists
e49521d8fb4f: Already exists
8349c93680fe: Already exists
929d44a7a5a1: Already exists
09a30957f0fb: Already exists
4611e742e0b5: Already exists
25aacf0148db: Already exists
74da82504b6c: Already exists
3e0aac083b86: Already exists
f52c7e0ac000: Already exists
35eee92aaf2f: Already exists
5f6d8eb70885: Already exists
536920bfe266: Already exists
98638e678c51: Already exists
9123956b991d: Already exists
1c4c8a29cd65: Already exists
1804bf352a97: Already exists
aa6fe9359956: Already exists
e7e38d1250a9: Already exists
05e935c831dc: Already exists
b7dfc22c26f3: Already exists
1514d4797ffd: Already exists
Digest: sha256:0391808e21b7b5cc0eb44fc2dad0d7f5415115bdaafb4534c0b6a12efd47a88b
Status: Image is up to date for bwawrik/bioinformatics:latest
So it definitely installed the package in pieces, not all in one go. Are these pieces different images?
image vs. container
First, let me clarify some terminology.
image: A static, immutable object. This is the thing you build when you run docker build using a Dockerfile. An image is not a thing that runs.
Images are composed of layers. An image might have only one layer, or it might have many layers.
container: A running thing. It uses an image as its starting template.
This is similar to a binary program and a process. You have a binary program on disk (such as /bin/sh), and when you run it, it is a process on your system. This is similar to the relationship between images and containers.
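The same analogy in CLI terms (the names are illustrative):
docker build -t myapp .        # build an image: the on-disk "binary"
docker run --name web1 myapp   # start a container: a running "process" based on that image
docker run --name web2 myapp   # a second, independent container from the same image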
Adding layers to a base image
You can build your own image from a base image (such as ubuntu in your example). Some commands in your Dockerfile will create a new layer in the ultimate image. Some of those are RUN, COPY, and ADD.
The very first layer has no parent layer. But every other layer will have a parent layer. In this way they link to one another, stacking up like pancakes.
Each layer has a unique ID (the long hexadecimal hashes you have already seen). They can also have human-friendly names, known as tags (e.g. ubuntu:16.04).
What is a layer vs. an image?
Technically, each layer is also an image. If you build a new image and it has 5 layers, you can use that image and it will contain all 5 layers. If you run a container using the third layer in the stack as your image ID, you can do that too, but it would only contain 3 layers: the one you specify and the two that are its ancestors.
But as a matter of convention, the term "image" generally means the layer that has a tag associated. When you run docker images, it will show you all of the top-level images, and hide the layers beneath (but you can show them all with -a).
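You can see both views from the command line; docker history in particular lists the stack of layers behind a single tagged image:
docker images                  # tagged, top-level images
docker images -a               # also show untagged intermediate layers (on older Docker versions)
docker history ubuntu:16.04    # the individual layers one image is built from, newest first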
What is an intermediate container?
When docker build runs, it does all of its work inside of containers (naturally!) So if it encounters a RUN step, it will create a container from the current top layer, run the specified commands in there, and then save the result as a new layer. Then it will create a container from this new layer, run the next thing... etc.
The intermediate containers are only used for the build process, and are discarded after the build.
How layer filesystems work
You asked whether someone downloading your ubuntu-based image is only doing a partial download if they already have the ubuntu image locally.
Yes! That's exactly right.
Every layer uses the layer beneath it as a base. The new layer is basically a diff between that layer and a new state. It's not a diff in the same way as a git commit might work, though. It works at the file level, not at the line level.
Say you started from ubuntu, and you ran this Dockerfile.
FROM ubuntu:16.04
RUN groupadd dan && useradd -g dan dan
This would result in a two-layer image. The first layer would be the ubuntu image. The second would probably have only a handful of changes:
A newer copy of /etc/passwd with user "dan"
A newer copy of /etc/group with group "dan"
A new directory /home/dan
A couple of default files like /home/dan/.bashrc
And that's it. If you start a container from this image, those few files would be in the topmost layer, and everything else would come from the filesystem in the ubuntu image.
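If you build that Dockerfile and inspect the result, the layering is visible directly: the ubuntu layers sit unchanged at the bottom, and the RUN step adds only a tiny layer on top (the tag is illustrative):
docker build -t dan-image .
docker history dan-image   # the top layer created by the RUN step is tiny;
                           # everything beneath it is the shared ubuntu image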
The top-most read-write layer in a container
One other point. When you run a container, you can write files in the filesystem. But if you stop the container and run another container from the same image, everything is reset. So where are the files written?
Images are immutable, so once they are created, they can't be changed. You can build a new version, but that's a new image. It would have a different ID and would not be the same image.
A container has a top-level read-write layer which is put on top of the image layers. Any writes happen in that layer. It works just like the other layers. If you need to modify a file (or add one, or delete one), that is done in the top layer, and doesn't affect the lower layers. If the file exists already, it is copied into the read-write layer, and then modified. This is known as copy-on-write (CoW).
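You can watch copy-on-write happen with docker diff, which lists what a container's read-write layer has added or changed relative to its image (the names are placeholders):
docker run --name scratch ubuntu:16.04 bash -c 'echo hello > /root/note.txt'
docker diff scratch                               # shows the file added in the read-write layer
docker run --rm ubuntu:16.04 cat /root/note.txt   # fails: a fresh container starts from the unchanged image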
Where to add changes
Do you have to add new things to the bottom of the Dockerfile? No, you can add anything anywhere (or change anything).
However, how you do things does affect your build times because of how the build caching works.
Docker will try to cache results during builds. If it finds, as it reads through the Dockerfile, that the FROM is the same, the first RUN is the same, the second RUN is the same... it will assume it has already done those steps and will use the cached results. If it encounters something that is different from the last build, it will invalidate the cache. Everything from that point on will be re-run fresh.
Some things can invalidate the cache even when the command text hasn't changed. For instance, with ADD or COPY, Docker also checksums the files being copied; if their contents have changed since the last build, the cache is invalidated from that step onward, because the result of the copy would be different.
So it is a common practice to start with FROM, then put very static things like RUN commands that install packages with e.g. apt-get, etc. Those things tend to not change a lot after your Dockerfile has been initially written. Later in the file is a more convenient place to put things that change more often.
It's hard to concisely give good advice on this, because it really depends on the project in question. But it pays to learn how the build caching works and try to take advantage of it.
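Concretely, the django Dockerfile in your question already follows the cache-friendly pattern: requirements.txt is added and pip install is run before ADD . copies the frequently-changing application code. A generic shape for that looks like (the CMD is illustrative):
FROM python:3.5
WORKDIR /app
# changes rarely, so these layers stay cached across most builds
COPY requirements.txt .
RUN pip install -r requirements.txt
# changes on almost every commit, so it goes last; only the layers from here down get rebuilt
COPY . .
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]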
