"a bind mount won't copy the container contents to the host automatically, unlike a named volume" - docker

Need clarity on a comment here:
The only 'problem' with a bind mount is that it won't copy the
container contents to the host automatically, unlike a named volume.
docs.docker.com/compose/compose-file/#volumes
Is this accurate? If yes, then:
how does one get the container's "new data" (e.g. a growing database) into the host when using a bind mount (to persist the data in case of a container restart)?
how did Docker persist data across container restarts before there were named volumes?

The only 'problem' with a bind mount is that it won't copy the
container contents to the host automatically, unlike a named volume.
Is this accurate?
Close to accurate, but I can see the confusion. Host volumes, aka bind mounts, do not have an initialization feature from docker. With anonymous and named volumes, docker will initialize the volume with the contents of the image at that path. This initialization includes ownership and permissions, which helps avoid permission errors. This initialization only runs when the container is created and the volume is new or empty, so subsequent containers will not pick up changes to the image made in newer image versions.
If yes, then:
how does one get the container's "new data" (e.g. a growing database) into the host when using a bind mount (to persist the data
in case of a container restart)?
Reads and writes from the app in the container will continue through to the host filesystem used in the bind mount as expected. It's only the initialization step that doesn't run.
how did Docker persist data across container restarts before there were named volumes?
There were data containers, mounting volumes from other containers, but this was inflexible (all volume paths were fixed to the path in the data container) and mixed management of persistent data with ephemeral containers, and has therefore been phased out.
Volumes are used to handle data persistence between containers. A single container restarting (rather than being replaced) will still have all the container specific filesystem changes. The docker rm command deletes these filesystem changes, along with container logs and metadata/configuration of the container.
The container specific changes are the read/write top layer of an overlay filesystem used by docker. Volume mounts are all separate mounts into subdirectories of this overlay filesystem (just like /home or /var are often separate filesystem mounts in the / filesystem of a Linux host, all reads and writes to those other paths go to a separate underlying filesystem).

If you're going to mount a volume into a container, and you want that volume to reliably contain some content from the image, you need to manually copy it there at container startup time. One way to do this is with an entrypoint wrapper script:
#!/bin/sh
# Copy data into a possibly-mounted location
cp -a /app/static /var/www
# Then run the image's CMD
exec "$@"
You'd include this in your image's Dockerfile
# Must use JSON-array syntax
ENTRYPOINT ["/app/entrypoint.sh"]
CMD same as it was before
There are two important details about Docker named volumes' initialization behavior to be aware of here. The first, which you note, is that Docker only copies content into a volume for Docker named volumes; it doesn't happen for bind mounts, and it doesn't happen in other environments like Kubernetes.
The second, more subtle detail is that the initialization only happens the first time the container runs. If there's already content in a volume that you mount into a container, it will hide what was already there. In other SO questions you can see this manifest as, for example, "I added a package to my Node package.json file, but when I put the node_modules directory in a volume, it ignores the update" or "I'm using a volume to export content to an nginx proxy but it doesn't update".

I think @BMitch having the accepted answer is correct, but I will just try to add in some details with the hope of being useful.
Is this accurate? If yes, then:
Given it is my claim being scrutinised - I totally defer to @BMitch here :)!
However I would also add:
https://github.com/docker/compose/issues/4581#issuecomment-389559090
Provides a layman explanation of how named volumes / host volumes behave
My explanation needs updated to reflect the notion of 'initialization'
https://stackoverflow.com/a/40030535/3080207
This is how I would recommend setting up volumes in docker-compose at the moment, courtesy of @kaiser
how does one get the container's "new data" (e.g. a growing database) into the host when using a bind mount (to persist the data in case of a container restart)?
Both host volumes and named volumes can achieve this.
I think the point of contention is what you want to happen on the:
first run of the container
subsequent runs of the container and
the location/accessibility of the volume on the host system.
Once a volume is attached to a container (be it a named volume or bind mount), whatever is stored to that volume should be persisted between restarts - that effectively comes for free. This assumes the same docker-compose config, and no manual removal of volumes.
Previously it was a bit limiting using a named volume, as you couldn't tail logs, or edit code directly from the host as easily as you could with a bind mount - but it seems that problem is resolved / has a work around now.
Bind mounts are able to persist data between restarts. I personally find that bind mounts do what I want 99% of the time. That being said, named volumes can now 'do it all', and I'd be using those moving forward.
There are differences between them though, and I'm sure they'll still bite people occasionally, requiring them to reach out to actual experts, instead of users like me :).

Related

Does docker container maintain volume data?

This might come across as a stupid question, but I am unable to figure something about docker volumes. Going through the official documentation I can see that we can map the host machine file system on the container for persistent storage. Following the instruction I was successfully able to mount a folder on my container.
Once I exec bash into the container, I can see the mapped directory structure there as expected. My question is, how is the data mapped between these two paths, that is from the container to the mount volume on host OS. Is the data duplicated or the container directly stores the data on the volume on host OS and the mapped paths are shown for something like symlink ?
This question comes across since we are trying to maintain a large amount of data on a mounted disk but accessible by the container, with the assumption that mounting volume would directly store the data on the disk and nothing on the container.
The Docker documentation refers to this type of mount as a "bind mount"; that's also a technical Linux term for a mount that makes one part of the filesystem also appear somewhere else, and there's a mount --bind option you can use outside of Docker (usually a pretty specialized option).
On native Linux, the host content and the container-visible content are literally the exact same disk content. If you have a bind-mounted host directory or a named Docker volume mounted over a container directory, all reads and writes will use that mounted content, and in fact nothing will be written to the container filesystem on that path.
You mention symlinks; these are always resolved as filenames in their respective filesystem space. If the mounted filesystem has a symlink passwd -> /etc/passwd then reading it will yield the host's password file on the host, and the container's password file inside the container. If it has a symlink f -> ../f then it will look at the directory above the mount point in whichever the local filesystem is.
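The relative-symlink case can be tried without Docker. This is only a rough sketch: a real bind mount needs root, so it stands in for "the same link seen at a different place" by moving the link under another parent directory. The function name and every directory name are made up for the demonstration.

```shell
#!/bin/sh
# Sketch: a symlink stores only a path string, and that string is
# resolved relative to wherever the link is being read from.
symlink_demo() {
  demo=$(mktemp -d)
  mkdir -p "$demo/host/share" "$demo/container/mnt"
  echo "host copy" > "$demo/host/f"
  echo "container copy" > "$demo/container/f"

  # Like the f -> ../f example: the link points one level above its own directory.
  ln -s ../f "$demo/host/share/f"
  cat "$demo/host/share/f"

  # Stand-in for seeing the same content at another mount point:
  # mv renames the link itself, so its stored text is unchanged.
  mv "$demo/host/share/f" "$demo/container/mnt/f"
  cat "$demo/container/mnt/f"
  readlink "$demo/container/mnt/f"

  rm -rf "$demo"
}

symlink_demo
```

The link text stays ../f throughout; only the directory it is resolved against changes, which is why the two reads return different files.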
On non-Linux this process is a little bit more technically complex since there is typically a Linux virtual machine involved in the mix. This usually manifests as file synchronization appearing slow. For data you don't need to directly access as a human, storing it in a named Docker volume will usually be faster.

Docker force usage of volume

I know that you can specify a volume inside the dockerfile, but I see the problem that the user is not required to create such a volume.
What if he forgot to specify a volume and then there are many, possibly expensive to create, files saved there, but they are not persistent, because there is no volume specified?
So my question is if it is possible to force the user to create a volume for that mountpoint, or at least check at start time (inside the container) if there is a volume mounted, so that it can react to the missing volume?
EDIT: With the new information that there are automatically created unnamed volumes, I would also accept a user-side solution (not changing the container in such a way that it checks the volume, but a Docker daemon setting that warns me about, or prevents me from, creating unnamed volumes by mistake).
I think the VOLUME declaration is the best you can do here.
In general, a container cannot force itself to be run with any particular options. You could make a similar argument that a container "must" be run with published port or with an attached stdin to be useful, but Docker doesn't allow an image to force these on either. (And more importantly, an image can't require direct access to the host filesystem, host networking, or privileged mode.)
As @masseyb notes in a comment, the key effect of the Dockerfile VOLUME directive is to create a new anonymous volume on the given directory if nothing else is mounted there. docker volume ls will show it and you should be able to use the volume ID directly in docker run -v options, so you won't actually lose data here. (There doesn't seem to be a command to give a name to the volume, surprisingly.)
In principle it's possible to check some things in an entrypoint wrapper script, but that won't work well for this volume case. The container can't tell whether a directory is an automatically-created anonymous volume or a new empty named volume.
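For what it's worth, here is a rough sketch of the kind of check an entrypoint wrapper could make on Linux. It only answers "is anything mounted on this path?" by reading the process's mount table; it cannot say what kind of volume it is. The function name is made up, and the optional second argument exists only so the function can be tried against a sample file.

```shell
#!/bin/sh
# is_mounted PATH [MOUNTINFO]
# Succeeds when PATH is listed as a mount point in the mount table.
# MOUNTINFO defaults to /proc/self/mountinfo (Linux only).
# Note: mount points containing spaces are octal-escaped in that file,
# so this simple comparison assumes a path without spaces.
is_mounted() {
  path=$1
  table=${2:-/proc/self/mountinfo}
  # Field 5 of each mountinfo line is the mount point as this process sees it.
  awk -v p="$path" '$5 == p { found = 1 } END { exit !found }' "$table"
}

# Hypothetical use in an entrypoint (/data is an assumed path):
# if ! is_mounted /data; then
#   echo "warning: nothing is mounted on /data" >&2
# fi
# exec "$@"
```

If the image declares VOLUME for that path, Docker mounts an anonymous volume there whenever nothing else is mounted, so the check would always succeed; it is only informative for images that do not declare VOLUME.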
(Also remember that volumes, including automatically-created anonymous volumes, are never committed to images. In your Dockerfile you can't change the directory content after you declare it a VOLUME; if an end user tries to docker commit a derived image it won't include the volume data. Unless you're sure it's what you want, I usually advise against declaring VOLUME. The case you describe in the question is pretty much the one case where it's useful.)

How does volume mount from container to host and vice versa work?

docker run -ti --rm -v DataVolume3:/var ubuntu
Let's say I have a volume DataVolume3 which pulls the contents of /var in the ubuntu container
even after killing this ubuntu container the volume remains and I can use this volume DataVolume3 to mount it to other containers.
This means with the deletion of container the volume mounts are not deleted.
How does this work ?
Does that volume mount mean that it copies the contents of /var into some local directory because this does not look like a symbolic link ?
If I have the container running and I create a file in the container then the same file gets copied to the host path ?
How does this whole process of volume mount from container to host and host to container work ?
Volumes are used for persistent storage and the volumes persists independent of the lifecycle of the container.
We can go through a demo to understand it clearly.
First, let's create a container using the named volumes approach as:
docker run -ti --rm -v DataVolume3:/var ubuntu
This will create a docker volume named DataVolume3 and it can be viewed in the output of docker volume ls:
docker volume ls
DRIVER VOLUME NAME
local DataVolume3
Docker stores the information about these named volumes in the directory /var/lib/docker/volumes/ (*):
ls /var/lib/docker/volumes/
1617af4bce3a647a0b93ed980d64d97746878564b141f30b6110d0818bf32b76 DataVolume3
Next, let's write some data from the ubuntu container at the mounted path /var:
echo "hello" > /var/file1
root@2b67a89a0050:/# cat /var/file1
hello
We can see this data with cat even after deleting the container:
cat /var/lib/docker/volumes/DataVolume3/_data/file1
hello
Note: Although we are able to access the volume's data as shown above, it is not a recommended practice to access it like this.
The next time another container uses the same volume, the data from the volume is mounted at the container directory specified in the -v flag.
(*) The location may vary based on OS, as pointed out by David, and can be seen with the docker volume inspect command.
Docker has a concept of a named volume. By default the storage for this lives somewhere on your host system and you can't directly access it from outside Docker (*). A named volume has its own lifecycle, it can be independently docker volume rm'd, and if you start another container mounting the same volume, it will have the same persistent content.
The docker run -v option takes some unit of storage, either a named volume or a specific host directory, and mounts it (as in the mount(8) command) in a specific place in the container filesystem. This will hide what was originally in the image and replace it with the volume content.
As you note, if the thing you mount is an empty named volume, it will get populated from the image content at container initialization time. There are some really important caveats on this functionality:
Named volume initialization happens only if the volume is totally empty.
The contents of the named volume never automatically update.
If the volume isn't empty, the volume contents completely replace what's in the image, even if it's changed.
The initialization happens only on native Docker, and not for example in Kubernetes.
The initialization happens only on named volumes, and not for bind-mounted host directories.
With all of these caveats, I'd avoid relying on this functionality.
If you need to mount a volume into a container, assume it will be empty when your entrypoint or the main container command starts. If you need a particular directory layout or file structure there, an entrypoint script can create it; if you're expecting it to hold particular data, keep a copy of it somewhere else in your image and copy it in if it's not already there (or, perhaps, always).
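A minimal sketch of the "copy it in if it's not already there" approach, assuming the image keeps a pristine copy of the data in a separate directory. The function name and the example paths are hypothetical.

```shell
#!/bin/sh
# seed_if_empty SRC DEST
# Copies the contents of SRC into DEST only when DEST has no entries,
# so data already present in a mounted volume is never overwritten.
seed_if_empty() {
  src=$1
  dest=$2
  mkdir -p "$dest"
  # ls -A lists everything except . and ..; empty output means an empty directory.
  if [ -z "$(ls -A "$dest")" ]; then
    # The trailing /. copies what is inside SRC (dotfiles included), not SRC itself.
    cp -a "$src/." "$dest/"
  fi
}

# Hypothetical use in an entrypoint wrapper (both paths are assumptions):
# seed_if_empty /app/static-default /var/www
# exec "$@"
```

Because the check runs in the container itself, it behaves the same for named volumes, bind mounts, and Kubernetes volumes. To always refresh the content instead, drop the emptiness test and copy unconditionally.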
(*) On native Linux you can find a filesystem location for it, but accessing this isn't a best practice. On other OSes this will be hidden inside a virtual machine or other opaque storage. If you need to directly access the data (or inject config files, or read log files) a docker run -v /host/path:/container/path bind mount is a better choice.
Volumes are part of neither the container nor the host. Well, technically everything resides in the host machine. But the docker directories are only accessible by users in "docker" group. The files in these directories are separately managed by docker.
"Volumes are stored in a part of the host filesystem which is managed by Docker (/var/lib/docker/volumes/ on Linux)."
Hence volumes act like storage shared between the docker container and the host itself. Anything added on either end lands in the volume (/var/lib/docker/volumes); it is not a hard copy, but rather something like a symbolic link.
As volumes can be shared across different containers, deleting a container does not cascade to the volumes associated with it.
To remove unused volumes:
docker volume prune

Creating a volume in a dockerfile without a persistent volume (claim) in Kubernetes?

I have an application that I am converting into a docker container.
I am going to test some different configuration for the application regarding persisted vs non persisted storage.
E.g. in one scenario I am going to create a persisted volume and mount some data into that volume.
In another scenario I am going to test not having any persisted volume (and accept that any data generated while the container is running is gone when it's stopped/restarted).
Regarding the first scenario that works fine. But when I am testing the second scenario - no persisted storage - I am not quite sure what to do on the docker side.
Basically, does it make any sense to define a volume in my Dockerfile when I don't plan to have any persisted volumes in Kubernetes?
E.g. here is the end of my Dockerfile
...
ENTRYPOINT ["./bin/run.sh"]
VOLUME /opt/application-x/data
So does it make any sense at all to have the last line when I don't create any Kubernetes volumes?
Or to put it in another way, are there scenarios where creating a volume in a dockerfile makes sense even though no corresponding persistent volumes are created?
It usually doesn’t make sense to define a VOLUME in your Dockerfile.
You can use the docker run -v option or Kubernetes’s container volume mount setting on any directory in the container filesystem space, regardless of whether or not its image originally declared it as a VOLUME. Conversely, a VOLUME can leak anonymous volumes in an iterative development sequence, and breaks RUN commands later in the Dockerfile.
In the scenario you describe, if you don’t have a VOLUME, everything is straightforward: if you mount something on to that path, in either plain Docker or Kubernetes, storage uses the mounted volume, and if not, data stays in the container filesystem and is lost when the container exits (which you want). I think if you do have a VOLUME then the container runtime will automatically create an anonymous volume for you; the overall behavior will be similar (it’s hard for other containers to find/use the anonymous volume) but in plain Docker at least you need to remember to clean it up.

What's the point of data-only docker containers?

Instead of using a data-only container, I can ...
create a directory on the host (say /opt/shared_data)
Run every container with -v /opt/shared_data:/some/mount/point_inside/container
voila, now /opt/shared_data is effectively shared amongst all containers , correct?
If my understanding is correct, if I create a data-only container and then use "--volumes-from" when running other containers, I am stuck mounting them in the same location they were mounted, whereas, this way I get to choose which directory they are mounted as in my containers.
So why do I need "data-only" containers? Besides, the volume just points to somewhere on the host (/var/lib/docker/volumes?) which is functionally equivalent to my /opt/shared_data anyway right? Whats the advantage of the former?
Data containers have been largely deprecated in favor of named volumes. There's really no advantage to using a data container over a named volume, and it comes with the disadvantage of being stuck with the mount points.
To compare named volumes with host volumes (aka bind mounts), you have a few differences:
Host volumes come with permission issues: users inside the container will differ from those outside the container, and files may not be easily accessed from both environments.
Named volumes add the ability to use any volume driver so you can mount your data from remote locations.
Named volumes are initialized to the contents of the image at that path, including all files and any directory permissions.
The last point is a big one for me: it means you can create an initial default value for a data folder in your image, but update it using the container and keep those changes in a named volume. With bind mounts, if the directory is empty or doesn't exist, that's also what you get when you mount it in your container.

Resources