Where should live docker volumes on the host? - docker

On the host side, should all the mount points be located in the same location? Or should they reflect the locations which are inside the containers?
For example, what is the best place to mount /var/jenkins_home on the host side in order to be consistent with its Unix filesystem?
/var/jenkins_home
/srv/jenkins_home
/opt/docker-volumes/jenkins/var/jenkins_home
Other location ?

It absolutely depends on you where you want to mount the volume on the host. Just don't map it to any system file locations.
In my opinion the volumes reflecting the locations inside the container is not a great idea since you will have many containers, and all will have similar file system structure, so you will never be able to isolate container writes.
With jenkins, since the official Jenkins docker image runs with user "jenkins", it will be not a bad idea for you to create jenkins user on the host and map /home/jenkins on the host to /var/jenkins_home on the container.

Rather than using explicit host:container mounts, consider using named volumes. This has several benefits:
They can be shared easily into other containers
They are host-agnostic (if you don't have the specific mount on that machine, it will fail)
They can be managed as first-class citizens in the Docker world (docker volume)
You don't have to worry about where to put them on your host ;)

Related

Are Bind Mounts and Host Volumes the same thing in Docker?

I have seen the terms "bind mount" and "host volume" being used in various articles but none of them mention whether they are the same thing or not. But looking at their function, it looks like they are pretty much the same thing. Can anyone answer whether it is the same thing or not? If not, what is the difference?
Ref:
Docker Docs - Use bind mounts
https://blog.logrocket.com/docker-volumes-vs-bind-mounts/
They are different concepts.
As mentioned in bind mounts:
Bind mounts have been around since the early days of Docker. Bind mounts have limited functionality compared to volumes. When you use a bind mount, a file or directory on the host machine is mounted into a container. The file or directory is referenced by its absolute path on the host machine. By contrast, when you use a volume, a new directory is created within Docker’s storage directory on the host machine, and Docker manages that directory’s contents.
And as mentioned in volumes:
Volumes are the preferred mechanism for persisting data generated by
and used by Docker containers. While bind mounts are dependent on the
directory structure and OS of the host machine, volumes are completely
managed by Docker. Volumes have several advantages over bind mounts:
Volumes are easier to back up or migrate than bind mounts.
You can manage volumes using Docker CLI commands or the Docker API.
Volumes work on both Linux and Windows containers.
Volumes can be more safely shared among multiple containers.
Volume drivers let you store volumes on remote hosts or cloud providers, to encrypt the contents of volumes, or to add other functionality.
New volumes can have their content pre-populated by a container.
Volumes on Docker Desktop have much higher performance than bind mounts from Mac and Windows hosts.
A "bind mount" is when you let your container see and use a normal directory in a normal filesystem on your host. Changes made by programs running in the container will be visible in your host's filesystem.
A "volume" is a single file on your host that acts like a whole filesystem visible to the container. You can't normally see what's inside it from the host.
I was able to figure it out.
There are 3 types of storage in Docker.
1. Bind mounts-also known as host volumes.
2. Anonymous volumes.
3. Named volumes.
So bind mount = host volume. They are the same thing. "Host volume" must be a deprecating term though, as I cannot see it in Docker docs. But it can be seen in various articles published 1-2 years ago.
Examples for where it is referred to as "host volume":
https://docs.drone.io/pipeline/docker/syntax/volumes/host/
https://spin.atomicobject.com/2019/07/11/docker-volumes-explained/
This docs page here Manage data in Docker is quite helpful
Volumes are stored in a part of the host filesystem which is managed by Docker (/var/lib/docker/volumes/ on Linux). Non-Docker processes should not modify this part of the filesystem. Volumes are the best way to persist data in Docker.
Bind mounts may be stored anywhere on the host system. They may even be important system files or directories. Non-Docker processes on the Docker host or a Docker container can modify them at any time.

On Windows, where are the files of my CMS running with Docker?

I already found a command line to get a path of the files corresponding to my CMS (Prestashop) that runs with Docker, i.e:
docker exec -it <mycontainer> bash
But, it brings me to:
root#4c3cae74d5b1:/var/www/html#
Which looks like a Linux path. So, do you know how to know where the files are situated on my Windows file system ?
Thanks a lot !
Aymeric
If you have not otherwise specified, the files are only inside the one container filesystem, not at all on your host filesystem. The files are in your Windows file system only if you have used bind mounts when running your container and mapped host files/directories to container volume mounts.
In general Docker files can exist in three places:
layered container filesystem (default)
volumes (persistent volumes in your Docker host, volumes can be shared between multiple containers running on the same host)
bind mounts (files or directories in your Docker host filesystem)
You did not provide the actual docker run command you have used to run your Prestashop. This would reveal what option your setup is. More info on Dockker volumes can be found here: https://docs.docker.com/storage/
Which ever way you have stored the volume, you can use docker cp command to copy data between your container and host operating system.
Technically of course also the container filesystems and volumes are stored on your host disk but are not meant to be accessible directly. It is not recommended to access them directly and different versions of Docker have different restrictions. Some info on where to find it on Docker for Windows can be found from answers to this question: Locating data volumes in Docker Desktop (Windows)

Accessibility in Docker volumes

I'm reading a document from Microsoft that states about Docker volumes
Volumes are stored within directories on the host filesystem. Docker
will mount and manage the volumes in the container. Once mounted,
these volumes are isolated from the host machine.
Multiple containers can simultaneously use the same volumes. Volumes
also don't get removed automatically when a container stops using the
volume.
In our example, we can create a directory on our container host and
mount this volume into the container when we create the tracking
portal container. When our tracking portal logs data, we can access
this information via the container host's filesystem. We'll have
access to this log file even if our container is removed.
I'm confused as I understand that the volumes are isolated from the host machine, but how can that be if we can access to the data via the host.
I'm less familiar with Docker on Windows, but I'm sure it's probably the same as Linux in this regard...
Docker volumes are "isolated on the host machine" by being in a particular location with particular permissions on the host's filesystem (i.e. via namespaces). Users/accounts with elevated permissions would still be granted access to those directories/files.
By contrast a bind mount can be made to (pretty much) any directory on the host's file system.

Docker, Volumes vs Bind Mounts for persistent data such as DB, elasticsearch?

docker data volume vs mounted host directory
says volumes should be preferred over bind mounts
I have a few questions regarding the issue. The post says:
When you create a volume, it is stored within a directory on the Docker host
Bear with me, but I'm new to docker, and I'm wondering what is docker host here.
Is it a machine where I build the image (probably not)?
Is it the machine where the image will be run? If it is so, what happens if I run the image on multiple machines, will it create two independent volumes?
When I have developement and production setup, how docker manages two separate volumes for each environment?
Besides it seems fairly easy to lose data by doing docker-compose down when I use data volumes, that's the first obstacle that makes me to hesitate to use data volumes, is there an obvious solution to mitigate the issue?
That's not a doctrine actually - not using bind mounts. Yes, they can damage your host's file system if mounted inaccurately (like -v /bin:/var/log) as soon as your have root privileges inside container by default; also they are less portable but they facilitate file exchange between host and container. When you want to provide initial configuration for your service, or put source code for compilation into container, I believe you would prefer to bind mount instead of creating and running temporary container for docker volume cp operations. Also, you should always use :ro option when possible (read only) to prevent data modification from inside container.
Docker host - it is a machine (PC), where Docker daemon is running.
Is it a machine where I build the image (probably not)?
Not true. You can build using docker CLI or docker API remotely.
Is it the machine where the image will be run?
Yes, images are run by docker daemon and thus it will be the host.
If it is so, what happens if I run the image on multiple machines,
will it create two independent volumes?
It depends. Running images on different machines can be achieved in different ways, staring with orchestrators like kubernetes or docker swarm and ending with manual launch on separate docker daemons. With orchestrators it is possible to have same volume, shared among different hosts, but in this case you can't use bind mounts, you use volumes.
When I have developement and production setup, how docker manages two
separate volumes for each environment?
Docker doesn't it is you who manages.
Besides it seems fairly easy to lose data by doing docker-compose down
when I use data volumes, that's the first obstacle that makes me to
hesitate to use data volumes, is there an obvious solution to mitigate
the issue?
Volumes can easily persist between docker-compose sessions. The most explicit way to achieve that is to declare volume in advance with
docker volume create foo
and then use it in your compose files:
version: '3'
services:
abc:
volumes:
foo:/foo
volumes:
foo:
external: true
Feature
Bind
Volume                                 
Internal soul
Bind mounts attach a user-specified location on host filesystem to a specific point in a container file tree.
Volume attach with disk storage on the host filesystem or cloud storage.
command
--mount type=bind,src="",dst=""
Docker CLI docker volume command
Dependency
dependent on location on to the host filesystem.
Container-independent data management
Separation of concerns
No
Yes
Conflict with other containers
Yes Example: multiple instances of Cassandra that all use the same host location as a bind mount for data storage. In that case, each of the instances would compete for the same set of files. Without other tools such as file locks, that would likely result in corruption of the database.
No. By default, Docker creates volumes by using the local volume plugin.
When to choose
1- Bind mounts are useful when the host provides a file or directory that is needed by a program running in a container, or when that containerized program produces a file or log that is processed by users or programs running outside containers. 2- appropriate tools for workstations, machines with specialized concerns 3- systems with more traditional configuration management tooling.
Working with Persistent storage 1. Databases 2. Cloud storage
When not to choose
Better to avoid these kinds of specific bindings in generalized platforms or hardware pools.
To be written

Docker Volume Binded with Container from Shared location

I am trying to make a centralized location for postgresql data, and use that for multiple containers on the same network of docker. For this i have to use the shared location in the volume, for example something like
docker -v 0.0.0.0\data:/var/lib/postgrsql/data
How can i specify the shared location as volume's host path and make a linkage with the container's binded folder.
Environment details:
Ubuntu 17.10
Docker 17
Any help or guidance to achieve this would be appreciated.
In Swarm you want to use a volume driver. This way 1. you don't have manual nfs mounts on each node host OS, and 2. you can ensure that the volume you want for a specific service is connected to the host it plans to run on.
The Docker Store has a list of volume plugin drivers for various storage solutions.
If you're using cloud storage or simple NFS, REX-Ray is likely the plugin you want.

Resources