Combining VOLUME + docker run -v

I was looking for an explanation of the VOLUME entry when writing a Dockerfile and came across this statement:
A volume is persistent data stored in /var/lib/docker/volumes/...
You can either declare it in a Dockerfile, which means each time a container is started from the image, the volume is created (empty), even if you don't have any -v option.
You can declare it at runtime with docker run -v [host-dir:]container-dir.
Combining the two (VOLUME + docker run -v) means that you can mount the content of a host folder into your volume persisted by the container in /var/lib/docker/volumes/...
docker volume create creates a volume without having to define a Dockerfile and build an image and run a container. It is used to quickly allow other containers to mount said volume.
But I'm having a hard time understanding this line:
...combining the two (VOLUME + docker run -v) means that you can mount the content of a host folder into your volume persisted by the container in /var/lib/docker/volumes/...
For example, let's say I have a config file on my host machine and I run a container based on the image I built from the Dockerfile I wrote. Will it copy the config file into the volume I declared in the VOLUME entry?
Would it be something like (pseudocode)
# Dockerfile
FROM ubuntu
RUN apt-get update
RUN apt-get install -y mysql-server
# VOLUME takes only a container path; there is no host-path argument
VOLUME /etc/mysql/conf.d
# systemctl doesn't work inside a container; run the server directly
CMD ["mysqld"]
And when I run it
docker run -it -v /path/to/config:/etc/mysql/conf.d ubuntu_based_image
Is this what they mean?

You probably don't want VOLUME in your Dockerfile. It's not necessary to mount files or directories at runtime, and it has confusing side effects like making subsequent RUN commands silently lose state.
If an image does have a VOLUME, and you don't mount anything else there when you start the container, Docker will create an anonymous volume and mount it for you. This can result in space leaks if you don't clean these volumes up.
You can use a docker run -v option on any container directory regardless of whether or not it's declared as a VOLUME.
If you docker run -v /host/path:/container/path, the two directories are actually the same; nothing is copied, and writes to one are (supposed to be) immediately visible on the other.
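To see this in action, you could run something like the following with a throwaway busybox container (the host path here is just an example):
mkdir -p /tmp/demo
docker run --rm -v /tmp/demo:/data busybox sh -c 'echo hello > /data/greeting'
cat /tmp/demo/greeting
# hello: the write inside the container is immediately visible on the host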
docker run -v /host/path:/container/path bind mounts aren't visible in /var/lib/docker at all.
You shouldn't usually be looking at content in /var/lib/docker (and can't if you're not on a native-Linux host). If you need to access the volume file content directly, use a bind mount rather than a named or anonymous volume.
Bind mounts like you've shown are appropriate for injecting config files into containers, and for reading log files back out. Named volumes are appropriate for stateful applications' storage, like the data for a MySQL database. Neither type of volume is appropriate for code or libraries; build these directly into Docker images instead.
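As a rough sketch of that split (the image tag, password, and directory names here are illustrative, not prescriptive): the config directory is injected read-only with a bind mount, while the database state lives in a named volume.
docker run -d \
  -e MYSQL_ROOT_PASSWORD=example \
  -v "$PWD/conf.d:/etc/mysql/conf.d:ro" \
  -v mysql-data:/var/lib/mysql \
  mysql:8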

Related

Can you mount a directory from an image into another container

Is it currently possible with docker to do something like this conceptually?
docker run --mount type=xxx,image=imageX,subdir=/somedir,dst=/mnt-here imageY ...
I understand this can be done at docker build time with COPY --from=...; however, in my use case it would only really be beneficial if it can be done at container creation time.
The only things it's possible to mount into a container are arbitrary host directories, tmpfs directories, and Docker named volumes. You can make a named volume use anything you could mount with the Linux mount(8) command, and you can potentially install additional volume drivers to mount other things. But these are all of the possible options.
None of these options allow you to mount image or container content into a different container. The COPY --from=other-image syntax you suggest is probably the best approach here.
If you really absolutely needed it in a volume, one option is to create a volume yourself, copy the content from the source image into it, and then mount that into the destination container.
docker volume create some-volume
# Since the volume is empty, mounting it into the container will
# copy the contents from the image into the volume. This only happens
# with native Docker volumes and only if the volume is totally empty.
# Docker will never modify the contents of this volume after this.
# Create an empty temporary container to set up the volume
docker run -v some-volume:/somedir --rm some-image /bin/true
# Now you can mount the volume into the actual container
docker run -v some-volume:/mnt-here ...
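To confirm the copy happened, you could list the volume's contents from another throwaway container (busybox here is just a convenient small image):
docker run --rm -v some-volume:/check busybox ls /check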

How does volume mount from container to host and vice versa work?

docker run -ti --rm -v DataVolume3:/var ubuntu
Let's say I have a volume DataVolume3 which pulls in the contents of /var from the ubuntu container.
Even after killing this ubuntu container, the volume remains, and I can mount DataVolume3 into other containers.
This means that deleting a container does not delete the volumes mounted into it.
How does this work?
Does the volume mount copy the contents of /var into some local directory? It does not look like a symbolic link.
If I have the container running and I create a file in the container, does the same file get copied to the host path?
How does this whole process of volume mounting, from container to host and host to container, work?
Volumes are used for persistent storage, and a volume persists independently of the lifecycle of the container.
We can go through a demo to understand this clearly.
First, let's create a container using the named-volume approach:
docker run -ti --rm -v DataVolume3:/var ubuntu
This will create a docker volume named DataVolume3 and it can be viewed in the output of docker volume ls:
docker volume ls
DRIVER              VOLUME NAME
local               DataVolume3
Docker stores the information about these named volumes in the directory /var/lib/docker/volumes/ (*):
ls /var/lib/docker/volumes/
1617af4bce3a647a0b93ed980d64d97746878564b141f30b6110d0818bf32b76 DataVolume3
Next, let's write some data from the ubuntu container at the mounted path /var:
root@2b67a89a0050:/# echo "hello" > /var/file1
root@2b67a89a0050:/# cat /var/file1
hello
We can see this data with cat even after deleting the container:
cat /var/lib/docker/volumes/DataVolume3/_data/file1
hello
Note: although we can access the volume's data as shown above, it is not recommended practice to access it this way.
The next time another container uses the same volume, the data in the volume is mounted at the container directory specified in the -v flag.
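To illustrate, a new container mounting DataVolume3 sees the same data:
docker run --rm -v DataVolume3:/data ubuntu cat /data/file1
# hello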
(*) The location may vary by OS, as pointed out by David, and can be seen with the docker volume inspect command.
Docker has a concept of a named volume. By default the storage for this lives somewhere on your host system and you can't directly access it from outside Docker (*). A named volume has its own lifecycle, it can be independently docker volume rm'd, and if you start another container mounting the same volume, it will have the same persistent content.
The docker run -v option takes some unit of storage, either a named volume or a specific host directory, and mounts it (as in the mount(8) command) in a specific place in the container filesystem. This will hide what was originally in the image and replace it with the volume content.
As you note, if the thing you mount is an empty named volume, it will get populated from the image content at container initialization time. There are some really important caveats on this functionality:
Named volume initialization happens only if the volume is totally empty.
The contents of the named volume never automatically update.
If the volume isn't empty, the volume contents completely replace what's in the image, even if it's changed.
The initialization happens only on native Docker, and not for example in Kubernetes.
The initialization happens only on named volumes, and not for bind-mounted host directories.
With all of these caveats, I'd avoid relying on this functionality.
If you need to mount a volume into a container, assume it will be empty when your entrypoint or the main container command starts. If you need a particular directory layout or file structure there, an entrypoint script can create it; if you're expecting it to hold particular data, keep a copy of it somewhere else in your image and copy it in if it's not already there (or, perhaps, always).
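A minimal sketch of that entrypoint approach might look like this (the file names are hypothetical):
#!/bin/sh
# entrypoint.sh: seed the mounted volume if it's empty, then run the main command
if [ ! -f /data/config.yml ]; then
    cp /opt/app/default-config.yml /data/config.yml
fi
exec "$@"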
(*) On native Linux you can find a filesystem location for it, but accessing this isn't a best practice. On other OSes this will be hidden inside a virtual machine or other opaque storage. If you need to directly access the data (or inject config files, or read log files) a docker run -v /host/path:/container/path bind mount is a better choice.
Volumes are part of neither the container nor the host. Well, technically everything resides on the host machine, but the Docker-managed directories are normally only accessible with root privileges, and the files in them are managed separately by Docker.
"Volumes are stored in a part of the host filesystem which is managed by Docker (/var/lib/docker/volumes/ on Linux)."
Hence a volume is a single directory shared by the container and the host's Docker area. Any addition on either side lands in the volume (/var/lib/docker/volumes); nothing is hard-copied back and forth, and the mechanism is closer to a bind mount of one backing directory than to a symbolic link.
As volumes can be shared across different containers, deleting a container does not cascade to the volumes associated with it.
To remove unused volumes:
docker volume prune
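To see which volumes would be removed before actually pruning, you can list the dangling (unreferenced) ones first:
docker volume ls -f dangling=true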

Docker volume bind empty volume or convert files to folders

I'm running a container with the Docker daemon's socket mounted into it so it can run a sibling container. In that container I then try to run another container and mount a volume to access some data; however, in the sibling container, the volume is either empty or the file is converted to a folder...
Running the first container:
$ docker run -v /var/run/docker.sock:/var/run/docker.sock -it example /bin/bash
root@3aa35965846a:/home/node/example# ls some_volume/
test.txt
root@3aa35965846a:/home/node/example# cat some_volume/test.txt
hello
# Running the second container
root@3aa35965846a:/home/node/example# docker run -v /home/node/example/some_volume/:/some_volume/ -it node:10 /bin/bash
root@6a84739fbb92:/# ls /some_volume/
test.txt
root@6a84739fbb92:/# cat /some_volume/test.txt/
cat: /some_volume/test.txt/: Is a directory
The first time I run the second container, the volume is empty. If I try to mount a file directly, it is converted to a folder, and after that, if I try to mount the folder like the example above, it contains only the file I tried to mount earlier, now as a folder.
How is this possible? If I try to mount a volume outside the first container I don't have any problem. How can I fix this?
The first path in the docker run -v option is always on the host system. For example, if you
docker run -v /etc:/x busybox cat /x/shadow
it will dump out the host's encrypted password file, regardless of whether you ran this command directly from the host or from a container.
There isn't a way to share an arbitrary directory from one container to another. If the launching container knows something about its own directory structure (in particular that some directory was mounted from a specific host path or named volume) then it can replicate that to the other container, but that's not a generic answer. The other behaviors you're seeing are just a consequence of those directories not existing on the host system.
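For example, if the launching container was itself started with a bind mount, it knows the original host path and can pass that, not its own container path, to the sibling; the /host/shared path below is illustrative:
# on the host: give the launcher a bind-mounted directory it knows about
docker run -v /var/run/docker.sock:/var/run/docker.sock \
           -v /host/shared:/home/node/example/some_volume \
           -it example /bin/bash
# inside the launcher: pass the host path, which the daemon can actually see
docker run -v /host/shared:/some_volume -it node:10 /bin/bash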
In general I would advise not using Docker for short-lived processes that principally interact with the outside world through the filesystem. Take whatever program you'd run in the other container, install it in your image's Dockerfile, and run it directly without going through Docker.
If you really can't avoid this workflow, the only thing I've found to work reliably is to docker create the container, docker cp files in, docker start it, and docker wait for it to finish. When it's done, docker cp the result out before docker rm it. That's a kind of painstaking workflow but it gets around the problem of the two containers not sharing any filesystem space.
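Sketched out, that workflow might look like this (the image name and file paths are placeholders):
CID=$(docker create some-image)
docker cp ./input.txt "$CID:/work/input.txt"
docker start "$CID"
docker wait "$CID"
docker cp "$CID:/work/output.txt" ./output.txt
docker rm "$CID"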

When mounting named volumes, under what conditions is data copied from the container?

I know that, in some cases, mounting a named volume on a container can cause data from the container to be automatically copied to the volume, similar to what happens with anonymous volumes.
However, even after reading the docs, it's unclear to me exactly which conditions trigger the copy.
I created a test image with two files /root/a/a.txt and /root/b/b.txt and performed the following experiment:
$ docker volume create --name hello
hello
$ docker run -v hello:/root/a test /bin/true
$ docker run -v hello:/root/b test /bin/true
$ docker run -ti -v hello:/tmp/result test /bin/bash
(now inside the container)
# cd /tmp/result
# ls
a.txt
It seems that the data is copied into the volume after the first mount, and not afterwards (despite not having specified nocopy for the second mount).
Do named volumes have hidden state that tracks if they haven't been mounted yet? Or is the copy triggered just by the volume being empty? If copying only happens the first time, what is the purpose of the nocopy modifier to the -v option of docker run?
I'm using Docker version 1.11.1.
From my testing, the copy appears to be triggered whenever the volume is empty at the time you run the container.
I tested with a volume that was populated from a first mount point in one container. Using this named volume on a second mount point of another container showed the contents of the first mount point. I then deleted the contents while the container was still running and the volume still mounted. Finally, I started another container with a different mount point, and the contents from that second mount point were then populated.
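Using the test image from the question (with /root/a/a.txt and /root/b/b.txt), the behavior can be reproduced like this:
docker volume create hello2
docker run --rm -v hello2:/root/a test /bin/true         # volume empty: a.txt copied in
docker run --rm -v hello2:/root/b test ls /root/b        # a.txt: volume wasn't empty, no copy
docker run --rm -v hello2:/root/a test rm /root/a/a.txt  # volume is now empty again
docker run --rm -v hello2:/root/b test ls /root/b        # b.txt: volume was empty, copied again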

How do Docker container volumes work even when the containers aren't running?

Take a typical data only Docker container:
FROM stackbrew/busybox:latest
RUN mkdir /data
VOLUME /data
Now I have seen a great deal of them that are run like this:
docker run --name my-data data true
The true command exits as soon as it runs, and so does the container. But surprisingly it continues to serve the volume when you connect it to another container via --volumes-from my-data.
My question is: how does that work? How does a stopped container still allow access to its volumes?
Volumes in Docker are not a top-level thing. They are "simply" part of a container's metadata.
When you have VOLUME in your Dockerfile or start a container with -v, Docker will create a directory in /var/lib/docker/volumes/* with a random ID (this is the exact same process as creating an image with commit, except it is empty) and add that random ID to the container's metadata.
When the container starts, Docker will bind-mount the directory /var/lib/docker/volumes/<volume id> at the given location for that volume.
When you use --volumes-from, Docker just looks up the volume ID and the location from the other container, running or not, and bind-mounts the directory at the set location.
Volumes are not tied to the runtime; they are just directories that are mounted.
* With newer versions, Docker uses the vfs driver for volume storage, and /var/lib/docker/volumes/ holds only metadata like size and creation time. The actual data is stored in /var/lib/docker/vfs/dir/<volume id>.
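Putting the pieces together with the data-only container from the question (assuming its image is built as data):
docker build -t data .
docker run --name my-data data true
# the container has exited, but its volume metadata still points at the directory
docker run --rm --volumes-from my-data busybox ls /data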
