Description
The official Docker documentation is often not very helpful, and a lot of things remain unclear even after reading through its sections.
Many things are unclear, but in this question I only want to target these:
When running docker volume create:
--driver
--opt device
--opt type
When I run docker volume create --driver local --opt device=:/var/www/html/customfolder --opt type=volume customvolume I actually do get a volume:
$ docker volume inspect customvolume
[
    {
        "CreatedAt": "2020-08-03T09:28:10Z",
        "Driver": "local",
        "Labels": {},
        "Mountpoint": "/var/lib/docker/volumes/customvolume/_data",
        "Name": "customvolume",
        "Options": {
            "device": ":/var/www/html/customfolder",
            "type": "volume"
        },
        "Scope": "local"
    }
]
Trying to mount this new volume:
docker run --name test-with-volume \
--mount source=customvolume,target=/var/www/html/app77 \
my-app-only:latest
Error:
Error response from daemon: error while mounting
volume '/var/lib/docker/volumes/customvolume/_data': failed to
mount local volume: mount :/var/www/html/customfolder:/var/lib/docker/volumes/customvolume/_data: no such device.
Questions
Clearly the options allow you to do some unexpected things: I was able to create a volume at a custom location, but it is not mountable.
What are the options for type (with the difference between each explained) when using docker volume create? They are unclear to me.
The docker run --mount documentation talks about volume, bind, and tmpfs, but for docker volume create the docs only show examples, which use tmpfs, btrfs, and nfs.
When can you use device?
I thought this could be used to create a custom location on the source host for the volume type (a.k.a. named volumes), similar to how bind mounts work.
I assumed I could use the recommended named-volume approach with a custom folder location instead of host mounts (bind mounts).
Finally, how could you correctly set up a custom volume driver in a docker-compose.yml as well?
I think the confusion stems from the fact that docker run --mount and docker volume create seem inconsistent with each other, and the Docker documentation does not make the relationship clear.
There are two main categories of data — persistent and non-persistent.
Persistent is the data you need to keep: things like customer records, financial data, research results, audit logs, and even some types of application log data. Non-persistent is the data you don't need to keep.
Both are important, and Docker has solutions for both.
To deal with non-persistent data, every Docker container gets its own non-persistent storage. This is automatically created for every container and is tightly coupled to the lifecycle of the container. As a result, deleting the container will delete the storage and any data on it.
To deal with persistent data, a container needs to store it in a volume. Volumes are separate objects that have their lifecycles decoupled from containers. This means you can create and manage volumes independently, and they’re not tied to the lifecycle of any container. Net result, you can delete a container that’s using a volume, and the volume won’t be deleted.
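As a quick illustration of that decoupling (a minimal sketch; the volume and container names below are arbitrary), the data survives the container that wrote it:

docker volume create myvol
docker run --name writer -v myvol:/data alpine sh -c 'echo hello > /data/hello.txt'
docker rm writer                                            # the container is gone...
docker run --rm -v myvol:/data alpine cat /data/hello.txt   # ...but the volume still prints "hello"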
This writable layer of local storage is managed on every Docker host by a storage driver (not to be confused with a volume driver). If you’re running Docker in production on Linux, you’ll need to make sure you match the right storage driver with the Linux distribution on your Docker host. Use the following list as a guide:
Red Hat Enterprise Linux: use the overlay2 driver with modern versions of RHEL running Docker 17.06 or higher; use the devicemapper driver with older versions. This applies to Oracle Linux and other Red Hat related upstream and downstream distros.
Ubuntu: use the overlay2 or aufs drivers. If you're using a Linux 4.x kernel or higher, go with overlay2.
SUSE Linux Enterprise Server: use the btrfs storage driver.
Windows: Windows only has one driver, and it is configured by default.
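If you want to check which storage driver a host is currently using, docker info reports it; for example:

docker info --format '{{.Driver}}'
# prints e.g. "overlay2"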
By default, Docker creates new volumes with the built-in local driver. As the name suggests, volumes created with the local driver are only available to containers on the same node as the volume. You can use the -d flag to specify a different driver. Third-party volume drivers are available as plugins. These provide Docker with seamless access to external storage systems such as cloud storage services and on-premises storage systems, including SAN or NAS.
$ docker volume inspect myvol
[
    {
        "CreatedAt": "2020-05-02T17:44:34Z",
        "Driver": "local",
        "Labels": {},
        "Mountpoint": "/var/lib/docker/volumes/myvol/_data",
        "Name": "myvol",
        "Options": {},
        "Scope": "local"
    }
]
Notice that the Driver and Scope are both local. This means the volume was created with the local driver and is only available to containers on this Docker host. The Mountpoint property tells us where in the Docker host’s filesystem the volume exists.
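If you only need that path, an inspect format string extracts it directly:

docker volume inspect --format '{{ .Mountpoint }}' myvol
# /var/lib/docker/volumes/myvol/_data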
With bind mounts
version: '3.7'
services:
  maria_db:
    image: mariadb:10.4.13
    environment:
      MYSQL_ROOT_PASSWORD: Test123#123
      MYSQL_DATABASE: database
    ports:
      - 3306:3306
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - ./data_mariadb/:/var/lib/mysql/
With volume mount
version: "3.8"
services:
web:
image: mariadb:10.4.13
volumes:
- type: volume
source: dbdata
target: /var/lib/mysql/
volumes:
dbdata:
Bind mounts explanation
Bind mounts have been around since the early days of Docker. Bind mounts have limited functionality compared to volumes. When you use a bind mount, a file or directory on the host machine is mounted into a container. The file or directory is referenced by its full or relative path on the host machine. By contrast, when you use a volume, a new directory is created within Docker’s storage directory on the host machine, and Docker manages that directory’s contents.
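For instance, a minimal sketch of a bind mount using the --mount syntax (the host and container paths here are placeholders; note that unlike volumes, the host path must already exist):

docker run --rm \
  --mount type=bind,source=/srv/app-config,target=/etc/myapp,readonly \
  alpine ls /etc/myapp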
tmpfs mounts explanation
Volumes and bind mounts let you share files between the host machine and container so that you can persist data even after the container is stopped. If you're running Docker on Linux, you have a third option: tmpfs mounts. When you create a container with a tmpfs mount, the container can create files outside the container's writable layer. As opposed to volumes and bind mounts, a tmpfs mount is temporary and only persisted in host memory. When the container stops, the tmpfs mount is removed, and files written there won't be persisted.
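A small sketch (the target path and size are illustrative):

docker run --rm \
  --mount type=tmpfs,destination=/app/cache,tmpfs-size=64m \
  alpine df -h /app/cache
# the mount lives in RAM and vanishes when the container stops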
Volume explanation
Volumes are the preferred mechanism for persisting data generated by and used by Docker containers. While bind mounts are dependent on the directory structure of the host machine, volumes are completely managed by Docker.
Recently I searched for something similar: how to force a Docker volume into writing its data to a custom path that is actually the mount point of a persistent disk. There were two motives:
first, avoid the docker volume being stuck inside the VM image's disk space;
second, have the data outlive the docker volume itself (e.g. easy to reuse on another VM instance with a freshly created docker volume).
This seemed feasible by passing extra options to the standard local driver when executing docker volume create. For example, the command below makes the docker volume tmp-volume write into the device argument's value. Note that docker volume inspect still outputs a completely different but unused Mountpoint. It worked when Ubuntu was the host OS inside that VM instance:
docker volume create -d local --name tmp-volume \
--opt device="/mnt/disks/disk-instance-test-volume" \
--opt type="none" \
--opt o="bind"
Maybe this overlaps with your use-case? I blogged the whole story in more detail here: https://medium.com/@francis.meyvis/how-to-force-a-docker-volume-on-a-gce-disk-45b59d4973e?source=friends_link&sk=0e71ef39db84f4cb0ecccc7cd0f3c254
Damith's detailed explanation of named volumes vs bind mounts is a good reference for anyone to read. To answer my question, he pointed to 3rd-party plugins, so I had to investigate further.
There seems to be no way to use a custom location with a named volume in a default Docker installation (only bind mounts can do that), but there is indeed a plugin that acts like named volumes with some extra functionality.
While this only partially answers some of the things I mentioned in the question (and some are still unclear), use this for reference if you want a named volume that acts like a bind mount.
Solution
For my particular use case, the Docker plugin local-persist seems to solve my requirements, it has the capability to 1) persist data when containers get deleted and 2) provide a way to use a custom location.
Matchbooklab Docker local-persist
Installation:
Confirmed to work with Ubuntu 20.04 installation
Run this install script (note: there are also manual installation instructions at the GitHub link):
curl -fsSL https://raw.githubusercontent.com/MatchbookLab/local-persist/master/scripts/install.sh | sudo bash
This will install local-persist and set up a startup script to monitor volumes.
Setup volume
Create a new local-persist volume:
docker volume create -d local-persist --opt mountpoint=/custom/path/on/host --name new-volume-name
Usage
Attach the volume to a container:
Newer --mount syntax:
docker run --name container-name --mount 'source=new-volume-name,target=/path/inside/container' imagename:version
-v syntax: (not tested - as shown in github readme)
docker run -d -v images:/path/inside/container/ imagename:version
Or with docker-compose.yml: (example shows v2; not tested yet)
version: '2'
services:
  one:
    image: alpine
    working_dir: /one/
    command: sleep 600
    volumes:
      - data:/one/
  two:
    image: alpine
    working_dir: /two/
    command: sleep 600
    volumes:
      - data:/two/
volumes:
  data:
    driver: local-persist
    driver_opts:
      mountpoint: /data/local-persist/data
Related
I'm trying to learn docker at the moment and I'm getting confused about where data volumes actually exist.
I'm using Docker Desktop for Windows. (Windows 10)
In the docs they say that running docker inspect on the object will give you the source: https://docs.docker.com/engine/tutorials/dockervolumes/#locating-a-volume
$ docker inspect web
"Mounts": [
{
"Name": "fac362...80535",
"Source": "/var/lib/docker/volumes/fac362...80535/_data",
"Destination": "/webapp",
"Driver": "local",
"Mode": "",
"RW": true,
"Propagation": ""
}
]
However, I don't see this; I get the following:
$ docker inspect blog_postgres-data
[
    {
        "Driver": "local",
        "Labels": null,
        "Mountpoint": "/var/lib/docker/volumes/blog_postgres-data/_data",
        "Name": "blog_postgres-data",
        "Options": {},
        "Scope": "local"
    }
]
Can anyone help me? I just want to know where my data volume actually exists. Is it on my host machine? If so, how can I get the path to it?
I am on Windows + WSL 2 (Ubuntu 18.04).
Type this in the Windows file explorer:
For Docker version 20.10.+ : \\wsl$\docker-desktop-data\data\docker\volumes
For Docker Engine v19.03: \\wsl$\docker-desktop-data\version-pack-data\community\docker\volumes\
You will have one directory per volume.
Your volume directory is /var/lib/docker/volumes/blog_postgres-data/_data, and /var/lib/docker is usually mounted in C:\Users\Public\Documents\Hyper-V\Virtual hard disks. Anyway, you can check it by looking in Docker settings.
You can refer to these docs for info on how to share drives with Docker on Windows.
BTW, Source is the location on the host and Destination is the location inside the container in the following output:
"Mounts": [
{
"Name": "fac362...80535",
"Source": "/var/lib/docker/volumes/fac362...80535/_data",
"Destination": "/webapp",
"Driver": "local",
"Mode": "",
"RW": true,
"Propagation": ""
}
]
Updated to answer questions in the comment:
My main curiosity here is that sharing images etc is great but how do I share my data?
Actually, volumes are designed for exactly this purpose (managing data in Docker containers). The data in a volume is persisted on the host FS and isolated from the life-cycle of a Docker container/image. You can share your data in a volume by:
Mount Docker volume to host and reuse it
docker run -v /path/on/host:/path/inside/container image
Then all your data will persist in /path/on/host; you could back it up, copy it to another machine, and re-run your container with the same volume.
Create and mount a data container.
Create a data container: docker create -v /dbdata --name dbstore training/postgres /bin/true
Run other containers based on this container using --volumes-from: docker run -d --volumes-from dbstore --name db1 training/postgres, then all data generated by db1 will persist in the volume of container dbstore.
For more information you could refer to the official Docker volumes docs.
Simply speaking, a volume is just a directory on your host with all your container data, so you can use any method you used before to back up/share your data.
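For example, a minimal backup sketch using a throwaway container (the volume name is taken from the question; the archive name is arbitrary):

docker run --rm \
  -v blog_postgres-data:/volume \
  -v "$(pwd)":/backup \
  alpine tar czf /backup/blog_postgres-data.tgz -C /volume .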
can I push a volume to docker-hub like I do with images?
No. A Docker image is something you can push to a Docker hub (a.k.a. registry), but data is not. You can back up/persist/share your data with any method you like, but pushing data to a Docker registry to share it does not make sense.
can I make backups etc?
Yes, as posted above :-)
For Windows 10 + WSL 2 (Ubuntu 20.04), Docker version 20.10.2, build 2291f61
Docker artifacts can be found in
DOCKER_ARTIFACTS == \\wsl$\docker-desktop-data\version-pack-data\community\docker
Data volumes can be found in
DOCKER_ARTIFACTS\volumes\[VOLUME_ID]\_data
\\wsl$\docker-desktop-data\version-pack-data\community\docker\volumes\
Worked for me as well (Windows 10 Home), great stuff.
I have found that my setup of Docker with WSL 2 (Ubuntu 20.04) uses this location at Windows 10:
C:\Users\Username\AppData\Local\Docker\wsl\data\ext4.vhdx
Where Username is your username.
When running Linux-based containers on a Windows host, the actual volumes are stored within the Linux VM and are not available on the host's filesystem; for Windows containers running on a Windows host, they are under C:\ProgramData\Docker\volumes\.
Also, docker inspect <container_id> will list the container configuration; see the Mounts section for more details about the persistence layer.
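To print just that section, a format string works here too:

docker inspect --format '{{ json .Mounts }}' <container_id>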
Update:
Not applicable for Docker running on WSL.
If you have WSL 2 enabled, you can find it in the file explorer under \\wsl$\docker-desktop\mnt\host\wsl\docker-desktop-data\data\docker
For Docker Desktop (Windows), you can find the volumes at the path below:
\\wsl$\docker-desktop-data\version-pack-data\community\docker\volumes
In my case, I installed Docker Desktop on WSL 2, Windows 10 Home. I found my image files in
\\wsl$\docker-desktop-data\version-pack-data\community\docker\overlay2
\\wsl$\docker-desktop-data\version-pack-data\community\docker
Container, image, and volume info is all there.
All image files are stored there, separated into several folders with long string names. Looking into each folder, you can find the real image files in the "diff" folders.
Although the terminal shows the path /var/lib/docker, that folder doesn't exist there and the actual files are not stored in it; I think there is no error, /var/lib/docker is just linked or mapped to the real folder.
For me I found my volumes in
\\wsl$\docker-desktop-data\data\docker\volumes\
Using WSL2 and Windows 21H1
Mounting NTFS-based directories did not work for my purpose (MongoDB; as far as I'm aware this is also the case for at least Redis and CouchDB): NTFS permissions did not allow the access such DBs need when running in containers. The following is a setup with named volumes on Hyper-V.
The following approach starts an SSH server within a service, set up with docker-compose so that it starts automatically and uses public-key encryption between host and container for authorization. This way, data can be uploaded/downloaded via scp or sftp.
The full docker-compose.yml for a webapp + mongodb is below, together with some documentation on how to use ssh service:
version: '3'
services:
  foo:
    build: .
    image: localhost.localdomain/${repository_name}:${tag}
    container_name: ${container_name}
    ports:
      - "3333:3333"
    links:
      - mongodb-foo
    depends_on:
      - mongodb-foo
      - sshd
    volumes:
      - "${host_log_directory}:/var/log/app"
  mongodb-foo:
    container_name: mongodb-${repository_name}
    image: "mongo:3.4-jessie"
    volumes:
      - mongodata-foo:/data/db
    expose:
      - '27017'
  #since mongo data on Windows only works within a HyperV virtual disk (as of 2019-4-3), the following allows upload/download of mongo data
  #setup: you need to copy your ~/.ssh/id_rsa.pub into $DOCKER_DATA_DIR/.ssh/id_rsa.pub, then run this service again
  #download (all mongo data): scp -r -P 2222 user@localhost:/data/mongodb [target-dir within /c/]
  #upload (all mongo data): scp -r -P 2222 [source-dir within /c/] user@localhost:/data/mongodb
  sshd:
    image: maltyxx/sshd
    volumes:
      - mongodata-foo:/data/mongodb
      - $DOCKER_DATA_DIR/.ssh/id_rsa.pub:/home/user/.ssh/keys/id_rsa.pub:ro
    ports:
      - "2222:22"
    command: user::1001
#please note: using a named volume like this for mongo is necessary on Windows rather than mounting an NTFS directory.
#mongodb (and probably most other databases) are not compatible with Windows native data directories due to permissions issues.
#this means that there is no direct access to this data; it needs to be dumped elsewhere if you want to reimport something.
#it will however be persisted as long as you don't delete the HyperV virtual drive that the docker host is using.
#on Linux and Docker for Mac it is not an issue, named volumes are directly accessible from the host.
volumes:
  mongodata-foo:
This is unrelated, but for a fully working example, the following script needs to be run before any docker-compose call:
#!/usr/bin/env bash
set -o errexit
set -o pipefail
set -o nounset
working_directory="$(pwd)"
host_repo_dir="${working_directory}"
repository_name="$(basename "${working_directory}")"
branch_name="$(git rev-parse --abbrev-ref HEAD)"
container_name="${repository_name}-${branch_name}"
host_log_directory="${DOCKER_DATA_DIR}/log/${repository_name}"
tag="${branch_name}"
export host_repo_dir
export repository_name
export container_name
export tag
export host_log_directory
Update: Please note that you can also just use docker cp nowadays, so the sshd container outlined above is probably not necessary anymore, except if you need remote access to the file system running in a container under a Windows host.
If you find \\wsl$ a pain to enter or remember, there's a more GUI-friendly method in Windows 10 release 2004 and onwards. With WSL 2, you can safely navigate to all the special WSL shares via the new Linux icon in File Explorer.
From there you can drill down to (e.g.) \docker-desktop-data\data\docker\volumes, as mentioned in other answers.
For more details, refer to Microsoft's official WSL filesystems documentation, which mentions these access methods. For the technically curious, Microsoft's deep dive video should answer a lot of questions.
If you're searching where the data is actually located when you put a volume that is pointing to the docker "vm" like here:
version: '3.0'
services:
  mysql-server:
    image: mysql:latest
    container_name: mysql-server
    restart: always
    ports:
      - 3306:3306
    volumes:
      - /opt/docker/mysql/data:/var/lib/mysql
The "/opt/docker/mysql/data" or just the / is located in \\wsl$\docker-desktop\mnt\version-pack\containers\services\docker\rootfs
Hope it's helping :)
In Windows 11 with Docker Desktop v4.15.0 with WSL2 enabled, the path to navigate to the volumes folder is \\wsl.localhost\docker-desktop-data\data\docker\volumes
If you're on Windows and use Docker For Windows, then Docker works via a VM (MobyLinuxVM). Your volumes (like everything else) are in this VM! Here is how to find them:
# get a privileged container with access to Docker daemon
docker run --privileged -it --rm -v /var/run/docker.sock:/var/run/docker.sock -v /usr/bin/docker:/usr/bin/docker alpine sh
# in second power-shell run a container with full root access to MobyLinuxVM
docker run --net=host --ipc=host --uts=host --pid=host -it --security-opt=seccomp=unconfined --privileged --rm -v /:/host alpine /bin/sh
# switch to host FS
chroot /host
# and then go to the volume you asked for
cd /var/lib/docker/volumes/YOUR_VOLUME_NAME/_data
Each container has its own filesystem which is independent from the host filesystem. If you run your container with the -v flag you can mount volumes so that the host and container see the same data (as in docker run -v hostFolder:containerFolder).
The first output you printed describes such a mounted volume (hence mounts) where /var/lib/docker/volumes/fac362...80535/_data (host) is mounted to /webapp (container).
I assume you did not use -v, hence the folder is not mounted and is only accessible in the container filesystem, which you can find in /var/lib/docker/volumes/blog_postgres-data/_data. This data will be deleted if you remove the container (docker rm), so it might be a good idea to mount the folder.
As to the question of where you can access this data from Windows: as far as I know, Docker for Windows uses the bash subsystem in Windows 10. I would try to run bash for Windows 10 and go to that folder, or find out how to access the Linux folders from Windows 10. Check this page for a FAQ on the Linux subsystem in Windows 10.
Update: You can also use docker cp to copy files between host and container.
If you're using Windows, your Docker files (in this case your volumes) exist on a virtual machine that Docker uses on Windows, either Hyper-V or WSL. However, if you need to access those files, you can copy your container files, store them locally on your machine, and access the data that way.
docker cp container_Id_Here:/var/lib/mysql path_To_Your_Local_Machine_Here
I am trying to mount a network drive as a volume. This is the command I am trying
docker run -v //NetworkDirectory/Folder:/data alpine ls /data
I am running this command on windows and the data directory is coming up empty. How can I mount this network directory as a volume on the windows host and access it inside the container?
Working with local directories works just fine, so the following command works as expected.
docker run -v c:/Users/:/data alpine ls /data
I can make it work in linux since I can mount the share with cifs-utils on a local directory and use that directory as the volume.
Edit: Looks like this is not possible: How to mount network Volume in Docker for Windows (Windows 10)
My colleague came up with this and it works with our company network drive and it might help someone out there.
We start by creating a docker volume named mydockervolume.
docker volume create --driver local \
  --opt type=cifs \
  --opt device=//networkdrive-ip/Folder \
  --opt o=user=yourusername,domain=yourdomain,password=yourpassword \
  mydockervolume
--driver specifies the volume driver name.
--opt sets driver-specific options. I guess they are given to the Linux mount command when the container starts up.
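In other words, the first time the volume is used, the local driver performs roughly the equivalent of the following mount (shown only as an illustration of where the options end up, not something you run yourself):

mount -t cifs -o user=yourusername,domain=yourdomain,password=yourpassword //networkdrive-ip/Folder /var/lib/docker/volumes/mydockervolume/_data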
We can then test that the volume works with
docker run -v mydockervolume:/data alpine ls /data
Here you can read more about driver specific options and docker volume create
I found this when looking for something similar, but though it's old, it's missing some key information, possibly because it wasn't available at the time.
CIFS storage is, I believe, only for when you are connecting to a Windows system, as I do not believe it is used by Linux at all.
EDIT: It looks like Docker considers SMB (Samba) shares to be CIFS volumes.
The same thing can be done with NFS, which is less secure but is supported by almost everything.
You can create an NFS volume in a similar way to the CIFS one, just with a few changes. I'll list both so they can be seen side by side.
When using NFS on WSL 2, you first need to install the NFS service into the Linux host OS. I believe CIFS requires a similar one, but as I don't use it I'm not certain.
EDIT: It looks like on WSL 2 Docker, at least for SMB (Samba), CIFS volumes either don't require any dependencies or I already have them, possibly the same one I install for NFS below.
In my case the host OS is Ubuntu, but you should be able to find the appropriate package by finding your system's equivalent of the nfs-common installation:
sudo apt update
sudo apt install nfs-common
That's it. That will install the service so NFS works on Docker (It took me forever to realize that was the problem since it doesn't seem to be mentioned as needed anywhere)
On the network device you need to have set NFS permissions for the NFS folder. In my case this is done at the shared-folder level, with the mount then pointing to a folder inside it; that's fine. In my case the NAS that is my server mounts to #IP#/volume1/folder; within the NAS I never see volume1 in the directory structure, but the full path to the shared folder is shown on the settings page when I set the NFS permissions. I'm not including the volume1 part as your system will likely differ; you want the FULL PATH after the IP (use the numeric IP, NOT the hostname), according to your NFS share, whatever it may be.
The nolock option is often needed but may not be on your system. It just disables the ability to "lock" files.
The soft option means that if the system cannot connect to the mount directory it will not hang. If you need it to only work if the mount is there you can change this to hard instead.
The rw (read/write) option is for Read/Write, ro (read-only) would be for Read Only
As I don't personally use the CIFS volume the options set are just ones in the examples I found, whether they are necessary for you will need to be looked into.
The username & password are required & must be included for CIFS
uid & gid are Linux user & group settings & should be set, I believe, to what your container needs as Windows doesn't use them to my knowledge
file_mode=0777 & dir_mode=0777 are Linux Read/Write Permissions essentially like chmod 0777 giving anything that can access the file Read/Write/Execute permissions (More info Link #4) & this should also be for the Docker Container not the CIFS host
noexec has to do with execution permissions but I don't think actually function here, nosuid limits it's ability to access files that are specific to a specific user ID & shouldn't need to be removed unless you know you need it to be, as it's a protection, nosetuids means that it won't set UID & GUID for newly created files, nodev means no access to/creation of devices on the mount point, vers=1.0 I think is a fallback for compatibility, I personally would not include it unless there is a problem or it doesn't work without it
In these examples I'm mounting //NET.WORK.DRIVE.IP/folder/on/addr/device to a volume named "my-docker-volume" in Read/Write mode. The CIFS volume is using the user supercool with password noboDyCanGue55
NFS from the CLI
docker volume create --driver local --opt type=nfs --opt o=addr=NET.WORK.DRIVE.IP,nolock,rw,soft --opt device=:/folder/on/addr/device my-docker-volume
CIFS from CLI (May not work if Docker is installed on a system other than Windows, will only connect to an IP on a Windows system)
docker volume create --driver local --opt type=cifs --opt o=user=supercool,password=noboDyCanGue55,rw --opt device=//NET.WORK.DRIVE.IP/folder/on/addr/device my-docker-volume
This can also be done within Docker Compose or Portainer.
When you do it there, you will need to add a volumes: section at the bottom of the compose file, with no indent, on the same level as services:
In this example I am mounting the volumes
my-nfs-volume from //10.11.12.13/folder/on/NFS/device to "my-nfs-volume" in Read/Write mode & mounting that in the container to /nfs
my-cifs-volume from //10.11.12.14/folder/on/CIFS/device with permissions from user supercool with password noboDyCanGue55 to "my-cifs-volume" in Read/Write mode & mounting that in the container to /cifs
version: '3'
services:
  great-container:
    image: imso/awesome/youknow:latest
    container_name: totally_awesome
    environment:
      - PUID=1000
      - PGID=1000
    ports:
      - 1234:5432
    volumes:
      - my-nfs-volume:/nfs
      - my-cifs-volume:/cifs

volumes:
  my-nfs-volume:
    name: my-nfs-volume
    driver_opts:
      type: "nfs"
      o: "addr=10.11.12.13,nolock,rw,soft"
      device: ":/folder/on/NFS/device"
  my-cifs-volume:
    driver_opts:
      type: "cifs"
      o: "username=supercool,password=noboDyCanGue55,uid=1000,gid=1000,file_mode=0777,dir_mode=0777,noexec,nosuid,nosetuids,nodev,vers=1.0"
      device: "//10.11.12.14/folder/on/CIFS/device/"
More details can be found here:
https://docs.docker.com/engine/reference/commandline/volume_create/
https://www.thegeekdiary.com/common-nfs-mount-options-in-linux/
https://web.mit.edu/rhel-doc/5/RHEL-5-manual/Deployment_Guide-en-US/s1-nfs-client-config-options.html
https://www.maketecheasier.com/file-permissions-what-does-chmod-777-means/
I didn't find a native CIFS storage driver in Docker.
You can use an external volume plugin like this one, which supports NFS, AWS EFS & Samba/CIFS: https://github.com/ContainX/docker-volume-netshare
We can have a data volume in docker:
$ docker run -v /path/to/data/in/container --name test_container debian
$ docker inspect test_container
...
Mounts": [
{
"Name": "fac362...80535",
"Source": "/var/lib/docker/volumes/fac362...80535/_data",
"Destination": "/path/to/data/in/container",
"Driver": "local",
"Mode": "",
"RW": true
}
]
...
But if the data volume lives in /var/lib/docker/volumes/fac362...80535/_data, is it any different from having the data in a folder mounted using -v /path/to/data/in/container:/home/user/a_good_place_to_have_data?
Although using volumes and bind mounts feels the same (with the only change being the location of the directory), there are differences in behavior.
Volumes vs Bind Mounts
With Bind Mount, a file or directory on the host machine is mounted into a container. The file or directory is referenced by its full or relative path on the host machine.
With Volume, a new directory is created within Docker's storage directory on the host machine, and Docker manages that directory's content.
Volumes advantages over bind mounts:
Volumes are easier to back up or migrate than bind mounts.
You can manage volumes using Docker CLI commands or the Docker API.
Volumes work on both Linux and Windows containers.
Volumes can be more safely shared among multiple containers.
Volume drivers allow you to store volumes on remote hosts or cloud providers, to encrypt the contents of volumes, or to add other functionality.
A new volume’s contents can be pre-populated by a container.
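As a quick illustration of the last point (image and volume names below are placeholders): mounting an empty named volume over a directory that already has content in the image copies that content into the volume.

docker volume create webroot
docker run --rm -v webroot:/usr/share/nginx/html nginx:alpine ls /usr/share/nginx/html
# the image's default html files were copied into (and now persist in) the webroot volume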
EDIT (9.9.2019):
According to @Sebi2020's comment, bind mounts are much easier to back up. Docker doesn't provide any command to back up volumes; you have to use temporary containers with a bind mount to create backups.
Volumes
Created and managed by Docker. You can create a volume explicitly
using the docker volume create command, or Docker can create a volume
during container or service creation.
When you create a volume, it is stored within a directory on the
Docker host. When you mount the volume into a container, this
directory is what is mounted into the container. This is similar to
the way that bind mounts work, except that volumes are managed by
Docker and are isolated from the core functionality of the host
machine.
A given volume can be mounted into multiple containers simultaneously.
When no running container is using a volume, the volume is still
available to Docker and is not removed automatically. You can remove
unused volumes using docker volume prune.
When you mount a volume, it may be named or anonymous. Anonymous
volumes are not given an explicit name when they are first mounted
into a container, so Docker gives them a random name that is
guaranteed to be unique within a given Docker host. Besides the name,
named and anonymous volumes behave in the same ways.
Volumes also support the use of volume drivers, which allow you to
store your data on remote hosts or cloud providers, among other
possibilities.
Bind mounts
Available since the early days of Docker. Bind mounts have limited
functionality compared to volumes. When you use a bind mount, a file
or directory on the host machine is mounted into a container. The file
or directory is referenced by its full path on the host machine. The
file or directory does not need to exist on the Docker host already.
It is created on demand if it does not yet exist. Bind mounts are very
performant, but they rely on the host machine’s filesystem having a
specific directory structure available. If you are developing new
Docker applications, consider using named volumes instead. You can’t
use Docker CLI commands to directly manage bind mounts.
There is also tmpfs mounts.
tmpfs mounts
A tmpfs mount is not persisted on disk, either on the Docker host or
within a container. It can be used by a container during the lifetime
of the container, to store non-persistent state or sensitive
information. For instance, internally, swarm services use tmpfs mounts
to mount secrets into a service’s containers.
Reference:
https://docs.docker.com/storage/
is it any different from having the data in a folder mounted using -v /path/to/data/in/container:/home/user/a_good_place_to_have_data?
It is because, as mentioned in "Mount a host directory as a data volume"
The host directory is, by its nature, host-dependent. For this reason, you can’t mount a host directory from Dockerfile because built images should be portable. A host directory wouldn’t be available on all potential hosts.
If you have some persistent data that you want to share between containers, or want to use from non-persistent containers, it’s best to create a named Data Volume Container, and then to mount the data from it.
You can combine both approaches:
docker run --volumes-from dbdata -v $(pwd):/backup ubuntu tar cvf /backup/backup.tar /dbdata
Here we’ve launched a new container and mounted the volume from the dbdata container.
We’ve then mounted a local host directory as /backup.
Finally, we've passed a command that uses tar to back up the contents of the dbdata volume to a backup.tar file inside our /backup directory. When the command completes and the container stops, we'll be left with a backup of our dbdata volume.
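Restoring works the same way in reverse; a hedged sketch, with dbdata2 being a hypothetical fresh data container to restore into:

docker run --rm --volumes-from dbdata2 -v "$(pwd)":/backup ubuntu \
  bash -c "cd /dbdata && tar xvf /backup/backup.tar --strip 1"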
Yes, this is quite different from a few perspectives. Like you wrote in the question's title, it is about understanding why we need data volumes vs bind mount to host.
Part 1 - Basic scenarios with examples
Lets take 2 scenarios.
Case 1: Web server
We want to provide our web server a configuration file that might change frequently. For example: exposing ports according to the current environment.
We could rebuild the image each time with the relevant setup, or create 2 different images, one per environment. Neither solution is very efficient.
With Bind mounts Docker mounts the given source directory into a location inside the container.
(The original directory / file in the read-only layer inside the union file system will simply be overridden).
For example - binding a dynamic port to nginx:
version: "3.7"
services:
web:
image: nginx:alpine
volumes:
- type: bind #<-----Notice the type
source: ./mysite.template
target: /etc/nginx/conf.d/mysite.template
ports:
- "9090:8080"
environment:
- PORT=8080
command: /bin/sh -c "envsubst < /etc/nginx/conf.d/mysite.template >
/etc/nginx/conf.d/default.conf && exec nginx -g 'daemon off;'"
(*) Notice that this example could also be solved using Volumes.
Case 2 : Databases
Docker containers do not store persistent data -- any data that will be written to the writable layer in container’s union file system will be lost once the container stops running.
But what if we have a database running on a container, and the container stops - that means that all the data will be lost?
Volumes to the rescue.
Those are named file system trees which are managed for us by Docker.
For example - persisting PostgreSQL data:
services:
  db:
    image: postgres:latest
    volumes:
      - type: volume #<-----Notice the type
        source: dbdata
        target: /var/lib/postgresql/data
volumes:
  dbdata:
Notice that in this case, for named volumes, the source is the name of the volume
(for anonymous volumes, this field is omitted).
Part 2 - Comparison
Differences in management and isolation on the host
Bind mounts exist on the host file system and are managed by the host maintainer. Applications/processes outside of Docker can also modify them.
Volumes can also be implemented on the host, but Docker manages them for us and they cannot be accessed outside of Docker.
Volumes are a much wider solution
Although both solutions help us to separate the data lifecycle from containers,
by using Volumes you gain much more power and flexibility over your system.
With Volumes we can design our data effectively and decouple it from other parts of the system by storing it in dedicated remote locations (e.g., in the cloud) and integrate it with external services like backups, monitoring, encryption and hardware management.
The difference between a host directory and a data volume is that Docker manages the latter by placing it in the $DOCKER-DATA-DIR/volumes directory and attaching a reference to it (a name or a randomly generated id). That is, you get a little bit of convenience.
Both host directories and data volumes are directories on the host; both are host-dependent. You can't reference either of them in a Dockerfile: the VOLUME directive creates a new nameless volume (with a randomly generated id) every time you launch a new container, and cannot reference an existing volume.
* $DOCKER-DATA-DIR is /var/lib/docker here, unless you changed the defaults.