Docker Compose: Which syntax produces a bind mount, which produces a volume

The Docker Compose documentation gives the following example for the volumes section of docker-compose.yml files:
volumes:
  # (1) Just specify a path and let the Engine create a volume
  - /var/lib/mysql
  # (2) Specify an absolute path mapping
  - /opt/data:/var/lib/mysql
  # (3) Path on the host, relative to the Compose file
  - ./cache:/tmp/cache
  # (4) User-relative path
  - ~/configs:/etc/configs/:ro
  # (5) Named volume
  - datavolume:/var/lib/mysql
Which syntaxes produce a bind mount and which produce a Docker volume?
In some places the documentation strictly distinguishes the two concepts, but here they are mixed together, so it is not clear to me.

Whenever you see "volume" in the comment, a volume is created: (1) is an anonymous volume and (5) is a named volume.
When the comment does not mention a volume, the entry is a bind mount: (2), (3) and (4), which all have a host path on the left-hand side of the colon.
The documentation on volumes in docker-compose says:
Mount host paths or named volumes, specified as sub-options to a service.
You can mount a host path as part of a definition for a single service, and there is no need to define it in the top level volumes key.
But, if you want to reuse a volume across multiple services, then define a named volume in the top-level volumes key.
The top-level volumes key defines a named volume and references it from each service’s volumes list. This replaces volumes_from in earlier versions of the Compose file format. See Use volumes and Volume Plugins for general information on volumes.
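The rule of thumb from this answer can be written down as a small Python sketch. It is only an illustration, not Compose's actual parser: the function name classify_short_volume is made up for this example, and it assumes POSIX-style paths (a Windows path such as C:\data contains a colon and would need extra handling).

```python
def classify_short_volume(entry: str) -> dict:
    """Classify one short-syntax volume entry (POSIX-style paths only)."""
    parts = entry.split(":")
    if len(parts) == 1:
        # Only a container path: the Engine creates an anonymous volume.
        return {"kind": "anonymous volume", "source": None,
                "target": parts[0], "read_only": False}
    source, target = parts[0], parts[1]
    options = parts[2].split(",") if len(parts) > 2 else []
    # A source that looks like a path is a host path; anything else is a name.
    is_host_path = source.startswith(("/", ".", "~"))
    return {"kind": "bind mount" if is_host_path else "named volume",
            "source": source, "target": target, "read_only": "ro" in options}


# The five entries from the question, in order (1) to (5).
for entry in ["/var/lib/mysql", "/opt/data:/var/lib/mysql", "./cache:/tmp/cache",
              "~/configs:/etc/configs/:ro", "datavolume:/var/lib/mysql"]:
    print(f"{entry!r:35} -> {classify_short_volume(entry)['kind']}")
```

The only thing that separates a named volume from a bind mount in the short syntax is whether the part before the first colon looks like a path.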

Those are two completely different concepts. A volume is storage that Docker creates and manages itself, so the data in the given container directory persists between container runs. Think of a MySQL database: you don't want to lose your data. A bind mount, on the other hand, attaches a directory of your choosing on the host to a directory in the container. Whatever the container writes there appears in your file system and vice versa, because both sides see the same directory (nothing is copied or synchronized).
As a side note, with the default local driver a volume is nothing more than a directory on your machine that Docker mounts into the container :) (under /var/lib/docker/volumes/... by default)

Related

How docker named volume works?

I am new to Docker and volumes and am confused about how named volumes work. I have two scenarios in which I want to know how named volumes behave.
First Scenario
I have to set up two projects with Docker, and each has a separate database. How will the database volumes be mapped to /var/lib/mysql? Does Docker maintain separate data based on the database name?
Second Scenario
I have two services using the same named volume. In each service, the container path mapped to the named volume is different. How will this work?
services:
  s1:
    volumes:
      - vol:/var/lib/s1
  s2:
    volumes:
      - vol:/var/lib/s2
volumes:
  vol:
Since you are using docker-compose, it does some things for you. If your Compose project name is project_a, the volume declared as vol will be named project_a_vol. Verify this by running docker volume ls. By "project name" I mean the name of the project, which by default equals the name of the directory in which docker-compose was run, or a custom one if the --project-name parameter was set (e.g. docker-compose --project-name xxx up).
I assume you're using the default (local) volume driver. A named volume is nothing more than a directory inside the /var/lib/docker/volumes folder (try sudo ls -l /var/lib/docker/volumes). By mounting a volume using vol:/var/lib/s1 you tell Docker to mount that directory into the container:
Local /var/lib/docker/volumes/project_a_vol/_data at the container directory /var/lib/s1.
If you compose your services this way:
services:
  s1:
    volumes:
      - vol:/var/lib/s1
  s2:
    volumes:
      - vol:/var/lib/s2
The same directory will be mounted into two services, s1 and s2, and you will most probably have a problem because both services will try to read and write the same directory at the same time, unless those services can handle such a case.
It's better to have separate volumes, though. In that case a volume for one service can be purged, leaving the other one intact.
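To make the naming concrete, here is a minimal Python sketch of how the two services end up on the same host directory. The helper names are invented for this illustration, the project name project_a is taken from the example above, and the path layout assumes the default local volume driver with the default data root, where a volume's files live in a _data subdirectory.

```python
DEFAULT_VOLUMES_ROOT = "/var/lib/docker/volumes"  # default data root; yours may differ


def compose_volume_name(project: str, volume: str) -> str:
    """Name Compose gives a volume declared in the top-level volumes key."""
    return f"{project}_{volume}"


def host_mountpoint(volume_name: str, root: str = DEFAULT_VOLUMES_ROOT) -> str:
    # The local driver keeps a volume's files in a "_data" subdirectory.
    return f"{root}/{volume_name}/_data"


# service -> (volume key, container path), as in the Compose file above
service_mounts = {"s1": ("vol", "/var/lib/s1"), "s2": ("vol", "/var/lib/s2")}
for service, (volume, target) in service_mounts.items():
    name = compose_volume_name("project_a", volume)
    print(f"{service}: {host_mountpoint(name)} -> {target}")
```

Both lines print the same host path, which is why the two container directories always show the same files. You can compare the result with the Mountpoint field of docker volume inspect project_a_vol.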
Some hints regarding your questions.
First scenario: two Docker containers with a DB each.
- Since the databases are different, each runs in its own container.
- You can create a Docker volume or use a Docker bind mount to provide storage for your database's /var/lib/mysql.
- If you use volumes, create one volume per database and the data stays isolated.
- If you use bind mounts, make sure you mount different host locations; if you use the same location, the second database container will overwrite the first database's data.
Second scenario: since both services reference the same volume, they share one set of data even though the container paths differ, so whatever one service writes is visible to the other and can overwrite it.

Best practice - Anonymous volume vs bind mount

In a container, an anonymous volume can be created either with the VOLUME /build instruction in a Dockerfile, or with a /build entry under volumes in the Compose file, as below:
cache:
  build: ../../
  dockerfile: docker/dev/Dockerfile
  volumes:
    - /tmp/cache:/cache
    - /build
  entrypoint: "true"
My understanding is that both approaches (above) keep the /build volume available after the container goes into the Exited state.
The volume is anonymous because /build points to some random new location (in the /var/lib/docker/volumes directory) on the Docker host.
It seems to me that anonymous volumes are safer than named volumes (like /tmp/cache:/cache), because a location such as /tmp/cache is more likely to be used by more than one Docker container.
1) Why is anonymous volume usage discouraged?
2) Is VOLUME /build in a Dockerfile not the same as
volumes:
  - /build
in the docker-compose.yml file? Is there a scenario where we need to specify both?
You're missing a key third option, named volumes. If you declare:
version: '3'
volumes:
  build: {}
services:
  cache:
    image: ...
    volumes:
      - build:/build
Docker Compose will create a named volume for you; you can see it with docker volume ls, for example. You can explicitly manage named volumes' lifetime, and set several additional options on them which are occasionally useful. The Docker documentation has a page describing named volumes in some detail.
I'd suggest that named volumes are strictly superior to anonymous volumes, for being able to explicitly see when they are created and destroyed, and for being able to set additional options on them. You can also mount the same named volume into several containers. (In this sequence of questions you've been asking, I'd generally encourage you to use a named volume and mount it into several containers and replace volumes_from:.)
Named volumes vs. bind mounts have advantages and disadvantages in both directions. Bind mounts are easy to back up and manage, and for content like log files that you need to examine directly they are much easier to work with; on macOS systems, however, they are extremely slow. Named volumes can run independently of any host-system directory layout and translate well to clustered environments like Kubernetes, but it's much harder to examine them or back them up.
You almost never need a VOLUME directive. You can mount a volume or host directory into a container regardless of whether it's declared as a volume. Its technical effect is to mount a new anonymous volume at that location if nothing else is mounted there; its practical effect is that it prevents future Dockerfile steps from modifying that directory. If you have a VOLUME line you can almost always delete it without affecting anything.
Actually, anonymous volumes (/build) usage is encouraged over the use of bind mounts (/tmp/cache:/cache):
Volumes have several advantages over bind mounts:
- Volumes are easier to back up or migrate than bind mounts.
- You can manage volumes using Docker CLI commands or the Docker API.
- Volumes work on both Linux and Windows containers.
- Volumes can be more safely shared among multiple containers.
- Volume drivers let you store volumes on remote hosts or cloud providers, to encrypt the contents of volumes, or to add other functionality.
- New volumes can have their content pre-populated by a container.
Regarding your second question, yes. You can create anonymous volumes in docker-compose file or in the Dockerfile. No need to specify in both places.

docker-compose read-only bind volumes merge

Currently I'm trying to mount two folders (./app + ./test/public) and one file (./test/test.py) into a shared folder in the container (type: bind), so I always have the current code in the container without restarting. The problem is that the content in /test is also mounted to /app in the host system. Can this be avoided?
Here is my example file:
volumes:
  - "./app:/app"
  - "./test/public:/app/test/public"
  - "./test/test.py:/app/test.py"
I've searched the internet for about an hour now and read the docker-compose documentation, but I couldn't figure out how to solve this problem.
Hope you can help :)
edit: after docker-compose up, ./app on the host machine contains ./app/test/public and ./app/test.py too. So I simply want to mount and merge these folders without changing the host files.
I don't think you can avoid this behavior: Docker needs to create filesystem entries to which it can attach the bind mounts for your test/... mounts. If you're bind-mounting a file, a file must exist at the target location first; similarly for a directory.
That means that before performing your bind mounts, Docker first creates a new (empty) file or directory to provide the target for the mount. This is what you see inside your app directory.
Your options are either (a) just live with it, or (b) restructure your project so that you don't need to bind mount things into an existing bind mount.
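As a rough illustration of which host paths get created, the following Python sketch lists the mount-point stubs for a set of bind mounts. It is a simplified model of the behavior described in this answer, not Docker code, and the function name mountpoint_stubs is invented for the example.

```python
def mountpoint_stubs(mounts):
    """Return host paths that appear because a mount target lies inside another bind mount.

    mounts is a list of (host_source, container_target) pairs.
    """
    stubs = []
    for _, target in mounts:
        # Find the deepest other mount whose target contains this target.
        parent = None
        for parent_src, parent_target in mounts:
            prefix = parent_target.rstrip("/") + "/"
            if target.startswith(prefix) and (parent is None or len(prefix) > len(parent[1])):
                parent = (parent_src, prefix)
        if parent is not None:
            relative = target[len(parent[1]):]
            stubs.append(parent[0].rstrip("/") + "/" + relative)
    return stubs


mounts = [("./app", "/app"),
          ("./test/public", "/app/test/public"),
          ("./test/test.py", "/app/test.py")]
print(mountpoint_stubs(mounts))
```

For the mounts in the question this yields ./app/test/public and ./app/test.py, the two entries the asker found in ./app on the host.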

Docker Compose: volumes without colon (:)

I have a docker-compose.yml file with the following:
volumes:
  - .:/usr/app/
  - /usr/app/node_modules
First option maps current host directory to /usr/app, but what does the second option do?
[Refreshing this answer since it seems others have similar questions]
There are three kinds of volumes in docker:
Host volumes: these map a path from the host into the container with a bind mount. They have the short syntax /path/on/host:/path/in/container. Whatever exists on the host is what will be visible in the container, there's no merging of files or initialization from the image, and uid/gid's do not get any special mapping so you need to take care to allow the container uid/gid read and write access to this location (an exception is Docker for Mac with OSXFS). If the path on the host does not exist, docker will create an empty directory as root, and if it is a file, you can mount a single file into the container this way.
Named volumes: these have a name, instead of a host path, as the source. They have the short syntax name:/path/in/container, and in a compose file you also need to define the named volume at the top level. By default, these are also a bind mount, but to a Docker-specific directory under /var/lib/docker/volumes that should be considered internal. However, these defaults can be changed to allow things like NFS mounts, mounting disks, or even your own bind mounts to other locations. Named volumes also have an initialization feature: when they are new or empty and first used, Docker copies the contents of the image at that path into the named volume before mounting it. This includes files, directories, uid/gid owners, and permissions. After that, they behave identically to a host volume: whatever is inside the volume overlays the image location.
Anonymous volumes: these only have a path inside the container. They have the short syntax /path/in/container, and Docker creates a volume with a randomly generated ID as its name. They share the behaviors of named volumes (storing files under /var/lib/docker/volumes, initializing with the contents of the image), except that the random ID gives you no indication of how, or even if, they are being used. You can mount the volume in another container and inspect the contents, or you can find the container using the volume by inspecting each container to find the ID. If you create a container with the --rm flag, anonymous volumes will also be deleted automatically.
tmpfs: Wait, I said 3, and this is 4? That's because tmpfs isn't considered a volume, the syntax to mount it is different. The result is a pointer to an empty in memory filesystem. This is useful if you have temporary files you don't wish to save, they are relatively small, and you either need speed or want to be sure they aren't saved to disk.
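The difference in initialization between these kinds can be modelled in a few lines of Python. This is a toy model of the behavior described above, with plain dictionaries standing in for directories; the function name and its parameters are made up for the illustration.

```python
def contents_seen_in_container(mount_type, image_files, source_files):
    """Return the files visible at the mount path inside the container.

    image_files: files the image ships at that path.
    source_files: current contents of the host directory or volume
    (mutated for volumes to model the one-time copy from the image).
    """
    if mount_type == "tmpfs":
        return {}  # always starts as an empty in-memory filesystem
    if mount_type == "volume" and not source_files:
        source_files.update(image_files)  # copy from the image on first use
    # Bind mounts and already-populated volumes: the source is shown as-is
    # and whatever the image had at that path is hidden.
    return dict(source_files)


image = {"package.json": "from image"}
print(contents_seen_in_container("bind", image, {}))      # empty host dir hides image files
print(contents_seen_in_container("volume", image, {}))    # new volume is seeded from the image
print(contents_seen_in_container("volume", image, {"old.txt": "kept"}))
```

Here "volume" covers both named and anonymous volumes, since the answer describes them as sharing the same initialization behavior.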
In the OP's case:
/usr/app is mounted from the host, commonly used for development
/usr/app/node_modules is an anonymous volume initialized from the image
Why do this? Likely because you do not want to modify the node_modules directory on the host, particularly if there's platform specific data and you're running on Docker desktop where it's Mac/Win on the host and Linux in the container. It's also possible there's data in the image you want to get access to within the directory structure of the other volume mount.
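One way to picture how the two entries interact is that, for any path in the container, the most specific (deepest) mount point covering it wins. The Python sketch below models that lookup; it is an illustration of the idea rather than how the kernel or Docker implements it, and resolve_mount and the labels are invented names.

```python
def resolve_mount(path, mounts):
    """Return the label of the deepest mount covering path.

    mounts maps a container mount point to a descriptive label.
    """
    best_target, best_label = "", "image filesystem"
    for target, label in mounts.items():
        normalized = target.rstrip("/")
        covers = path == normalized or path.startswith(normalized + "/")
        if covers and len(normalized) > len(best_target):
            best_target, best_label = normalized, label
    return best_label


mounts = {"/usr/app/": "bind mount of the host project directory",
          "/usr/app/node_modules": "anonymous volume"}
for path in ["/usr/app/src/index.js",
             "/usr/app/node_modules/express/index.js",
             "/etc/hosts"]:
    print(f"{path} -> {resolve_mount(path, mounts)}")
```

Everything under /usr/app comes from the host except node_modules, which is served by the anonymous volume and therefore never touches the host's copy.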
Are there downsides to anonymous volumes? Two that I can think of:
If there's anything in /usr/app/node_modules that you want to reuse in a future container, you're unlikely to find the old volume. I tend to consider any data written to these as likely lost.
You'll often find the volumes on the host full of guids over time, and it's unclear which are in use and which can be deleted. Unused anonymous volumes are one of several causes of excessive disk use in docker.
For more details on docker volumes, see: https://docs.docker.com/storage/
Original answer:
The second one creates an anonymous volume. It will be listed in docker volume ls with a long unique id rather than a name. Docker-compose will be able to reuse this if you update your image, but it's easy to lose track of which volume belongs to what with those names, so I recommend always giving your volume a name.
Just to complement the accepted answer, according to Docker's Knowledge Base there are three types of volumes: host, anonymous, and named:
A host volume lives on the Docker host's filesystem and can be accessed from within the container. Example volume path:
/path/on/host:/path/in/container
An anonymous volume is useful for when you would rather have Docker handle where the files are stored. It can be difficult, however, to refer to the same volume over time when it is an anonymous volume. Example volume path:
/path/in/container
A named volume is similar to an anonymous volume. Docker manages where on disk the volume is created, but you give it a volume name. Example volume path:
name:/path/in/container
The path used in your example is an anonymous volume.
I had the same question while I was going through this tutorial, and the answer to what those lines could actually be doing is this:
Without the anonymous volume ('/usr/src/app/node_modules'), the node_modules directory would effectively disappear when the host directory is mounted at runtime:
Build - The node_modules directory is created in the image.
Run - The current directory is mounted over /usr/src/app in the container, hiding the node_modules that were installed when the image was built.
The docker-compose.yml file for this:
version: '3.5'
services:
  something-clever:
    container_name: something-clever
    build:
      context: .
      dockerfile: Dockerfile
    volumes:
      - '.:/usr/src/app'
      - '/usr/src/app/node_modules'
    ports:
      - '4200:4200'

Docker Anonymous Volumes

I've seen Docker volume definitions in docker-compose.yml files like so:
-v /path/on/host/modules:/var/www/html/modules
I noticed that the docker-compose.yml file for Drupal's official image uses anonymous volumes.
Notice the comments:
volumes:
  - /var/www/html/modules
  - /var/www/html/profiles
  - /var/www/html/themes
  # this takes advantage of the feature in Docker that a new anonymous
  # volume (which is what we're creating here) will be initialized with the
  # existing content of the image at the same location
  - /var/www/html/sites
Is there a way to associate an anonymous volume with a path on the host machine after the container is running? If not, what is the point of having anonymous volumes?
Full docker-compose.yml example:
version: '3.1'
services:
  drupal:
    image: drupal:8.2-apache
    ports:
      - 8080:80
    volumes:
      - /var/www/html/modules
      - /var/www/html/profiles
      - /var/www/html/themes
      # this takes advantage of the feature in Docker that a new anonymous
      # volume (which is what we're creating here) will be initialized with the
      # existing content of the image at the same location
      - /var/www/html/sites
    restart: always
  postgres:
    image: postgres:9.6
    environment:
      POSTGRES_PASSWORD: example
    restart: always
Adding a bit more info in response to a follow-up question/comment from @JeffRSon asking how anonymous volumes add flexibility, and also to answer this question from the OP:
Is there a way to associate an anonymous volume with a path on the host machine after the container is running? If not, what is the point of having anonymous volumes?
TL;DR: You can associate a specific anonymous volume with a running container via a 'data container', but that provides flexibility to cover a use case that is now much better served by the use of named volumes.
Anonymous volumes were helpful before the addition of volume management in Docker 1.9. Prior to that, you didn't have the option of naming a volume. With the 1.9 release, volumes became discrete, manageable objects with their own lifecycle.
Before 1.9, without the ability to name a volume, you had to reference it by first creating a data container
docker create -v /data --name datacontainer mysql
and then mounting the data container's anonymous volume into the container that needed access to the volume
docker run -d --volumes-from datacontainer --name dbinstance mysql
These days, it's better to use named volumes since they are much easier to manage and much more explicit.
Anonymous volumes are equivalent to having these directories defined as VOLUME's in the image's Dockerfile. In fact, directories defined as VOLUME's in a Dockerfile are anonymous volumes if they are not explicitly mapped to the host.
The point of having them is added flexibility.
P.S.:
Anonymous volumes already reside in the host somewhere in /var/lib/docker (or whatever directory you configured). To see where they are:
docker inspect --type container -f '{{range $i, $v := .Mounts }}{{printf "%v\n" $v}}{{end}}' $CONTAINER
Note: Substitute $CONTAINER with the container's name.
One possible use case for anonymous volumes these days is in combination with bind mounts, when you want to bind a folder but exclude specific subfolders. Those subfolders should then be set up as named or anonymous volumes. This guarantees that the subfolders are present inside the bind-mounted folder in the container, but you do not need to have them in the bound folder on the host machine at all.
For example, you can have a frontend NodeJS project built in a container, which needs the node_modules folder, while you don't need that folder for your coding at all. You can then map your project folder from the host into the container and set the node_modules folder as an anonymous volume. The node_modules folder will be present in the container all the time, even if you do not have it in your working folder on the host machine.
Not sure why Drupal developers suggest such settings. Anyways, I can think of two differences:
With named volumes you have a name that suggests to which project it belongs.
After docker-compose down && docker-compose up -d a new empty anonymous volume gets attached to the container. (But the old one doesn't disappear. docker doesn't delete volumes unless you tell it to.) With named volumes you'll get the volume that was attached to the container before docker-compose down.
As such, you probably don't want to put data you don't want to lose into an anonymous volume (like db or something). Again, they won't disappear by themselves. But after docker-compose down && docker-compose up -d && docker volume prune a named volume will survive.
For something less critical (like node_modules) I don't have strong argument for or against named volumes.
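The lifecycle difference described in this answer can be played through with a toy model in Python. This is not Docker code: ToyEngine is an invented class that tracks a single container, and it only covers the docker-compose down followed by docker-compose up sequence, with prune modelled as removing every volume no container uses.

```python
import uuid


class ToyEngine:
    """Minimal model of volumes surviving container removal."""

    def __init__(self):
        self.volumes = set()
        self.container_mounts = {}  # container path -> volume name

    def up(self, spec):
        """spec maps container path -> volume name, or None for an anonymous volume."""
        mounts = {}
        for target, name in spec.items():
            if name is None:
                name = uuid.uuid4().hex  # fresh random ID for each new container
            self.volumes.add(name)  # a named volume that already exists is reused
            mounts[target] = name
        self.container_mounts = mounts

    def down(self):
        self.container_mounts = {}  # container removed; volumes are kept

    def prune(self):
        # Drop every volume that no container currently uses.
        self.volumes &= set(self.container_mounts.values())


engine = ToyEngine()
spec = {"/var/lib/mysql": "db_data", "/cache": None}
engine.up(spec)
first = dict(engine.container_mounts)
engine.down()
engine.up(spec)
second = dict(engine.container_mounts)
print(first["/var/lib/mysql"] == second["/var/lib/mysql"])  # same named volume again
print(first["/cache"] == second["/cache"])                  # different anonymous volume
engine.prune()
print(sorted(engine.volumes) == sorted(second.values()))    # only in-use volumes remain
```

The old anonymous volume is still on disk after the second up and only disappears with the prune, while the named volume is reattached and survives.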
Is there a way to associate an anonymous volume with a path on the host machine after the container is running?
For that you need to change the settings, e.g. /var/www/html/modules -> ./modules:/var/www/html/modules, and do docker-compose up -d. But that will turn an anonymous volume into a bind mount. And you will need to copy the data from the volume to ./modules. Similarly, you can turn an anonymous volume into a named volume.
