Sharing volumes between Docker containers

I am using Docker to deploy some services and I want to share the Docker volumes between different containers.
Suppose I have a Docker container A which mounts a volume at /data. Here is its Dockerfile:
VOLUME /data
From my understanding, this will attach a volume to the container but it will not mount a host directory to the container. So the data inside this volume is still inside the container A.
I have another container B which provides an FTP service. It accesses the data under volume /public. Its Dockerfile is:
VOLUME /public
Now I want to link them together so that I can use B to manage A's data. According to the Docker documentation (https://docs.docker.com/engine/userguide/containers/dockervolumes/), I should use the --volumes-from flag to mount A's data volume in B. But this mounts A's data at /data in B instead of /public, so container B cannot access the data where it expects it. I didn't see any way to rename the mount point.
Any suggestions or best practices to handle this case?
The data-only container gives a good solution for this case. But if you want to use volumes-from and mount the data to different mount point, this question may be helpful!
How to map volume paths using Docker's --volumes-from?

You may find a lot of pointers mentioning data-only containers and --volumes-from. However, since Docker 1.9, volumes have become first-class citizens: they can have names and offer more flexibility.
It's now easy to achieve the behavior you want. Here's an example:
Create a named data volume with name service-data:
docker volume create --name service-data
You can then create a container that mounts it in your /public folder by using the -v flag:
docker run -t -i -v service-data:/public debian:jessie /bin/bash
For testing purposes, create a small text file in the mapped folder:
cd /public
echo 'hello' > hello.txt
You may then attach your named volume to a second container, but this time under the data folder:
docker run -t -i -v service-data:/data debian:jessie /bin/bash
ls /data   # shows hello.txt
Just remember, if both containers are using different images, be careful with ownership and permissions!

Related

Combining VOLUME + docker run -v

I was looking for an explanation on the VOLUME entry when writing a Dockerfile and came across this statement
A volume is a persistent data stored in /var/lib/docker/volumes/...
You can either declare it in a Dockerfile, which means each time a container is started from the image, the volume is created (empty), even if you don't have any -v option.
You can declare it on runtime docker run -v [host-dir:]container-dir.
combining the two (VOLUME + docker run -v) means that you can mount the content of a host folder into your volume persisted by the container in /var/lib/docker/volumes/...
docker volume create creates a volume without having to define a Dockerfile and build an image and run a container. It is used to quickly allow other containers to mount said volume.
But I'm having a hard time understanding this line:
...combining the two (VOLUME + docker run -v) means that you can mount the content of a host folder into your volume persisted by the container in /var/lib/docker/volumes/...
For example, let's say I have a config file on my host machine and I run a container based on the image I built from my Dockerfile. Will it copy the config file into the location I declared in the VOLUME entry?
Would it be something like (pseudocode)
# Dockerfile
FROM ubuntu
RUN apt-get update && apt-get install -y mysql-server
VOLUME /etc/mysql/conf.d
CMD ["mysqld_safe"]
And when I run it:
docker run -it -v /path/to/config:/etc/mysql/conf.d ubuntu_based_image
Is this what they mean?
You probably don't want VOLUME in your Dockerfile. It isn't needed in order to mount files or directories at runtime, and it has confusing side effects, such as later RUN commands silently losing any changes they make under that directory.
If an image does have a VOLUME, and you don't mount anything else there when you start the container, Docker will create an anonymous volume and mount it for you. This can result in space leaks if you don't clean these volumes up.
You can use a docker run -v option on any container directory regardless of whether or not it's declared as a VOLUME.
If you docker run -v /host/path:/container/path, the two directories are actually the same; nothing is copied, and writes to one are (supposed to be) immediately visible on the other.
docker run -v /host/path:/container/path bind mounts aren't visible in /var/lib/docker at all.
You shouldn't usually be looking at content in /var/lib/docker (and can't if you're not on a native-Linux host). If you need to access the volume file content directly, use a bind mount rather than a named or anonymous volume.
Bind mounts like you've shown are appropriate for injecting config files into containers, and for reading log files back out. Named volumes are appropriate for stateful applications' storage, like the data for a MySQL database. Neither type of volume is appropriate for code or libraries; build these directly into Docker images instead.
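To make the anonymous-volume leak mentioned above easier to spot, here is a minimal sketch that lists volumes no container currently references. The function name list_dangling_volumes is hypothetical; the dangling=true filter and the --quiet flag belong to docker volume ls.

```shell
#!/bin/sh
# Hypothetical helper: print the names of volumes not referenced by any container.
# Anonymous volumes created for a VOLUME declaration show up here once their
# container has been removed without also removing its volumes.
list_dangling_volumes() {
    docker volume ls --quiet --filter dangling=true
}

# Usage (requires a running Docker daemon):
#   list_dangling_volumes
```

Reviewing this list before running a cleanup command is a cautious habit, since a dangling named volume may still hold data you care about.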

How does volume mount from container to host and vice versa work?

docker run -ti --rm -v DataVolume3:/var ubuntu
Let's say I have a volume DataVolume3 which holds the contents of /var from the ubuntu container. Even after killing this ubuntu container, the volume remains, and I can mount DataVolume3 into other containers.
This means that deleting the container does not delete its volume mounts.
How does this work?
Does the volume mount copy the contents of /var into some local directory? It does not look like a symbolic link.
If I have the container running and I create a file in the container, does the same file get copied to the host path?
How does this whole process of mounting a volume from container to host and from host to container work?
Volumes are used for persistent storage, and they persist independently of the container's lifecycle.
We can go through a demo to understand it clearly.
First, let's create a container using the named volumes approach as:
docker run -ti --rm -v DataVolume3:/var ubuntu
This will create a docker volume named DataVolume3 and it can be viewed in the output of docker volume ls:
docker volume ls
DRIVER    VOLUME NAME
local     DataVolume3
Docker stores the information about these named volumes in the directory /var/lib/docker/volumes/ (*):
ls /var/lib/docker/volumes/
1617af4bce3a647a0b93ed980d64d97746878564b141f30b6110d0818bf32b76 DataVolume3
Next, let's write some data from the ubuntu container to the mounted path /var:
root@2b67a89a0050:/# echo "hello" > /var/file1
root@2b67a89a0050:/# cat /var/file1
hello
We can see this data with cat even after deleting the container:
cat /var/lib/docker/volumes/DataVolume3/_data/file1
hello
Note: although we are able to access the volume's data as shown above, accessing it this way is not recommended.
The next time another container uses the same volume, the volume's data is mounted at the container directory given in the -v flag.
(*) The location may vary by OS, as pointed out by David; docker volume inspect reports the volume's mount point.
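As a hedged illustration of the docker volume inspect suggestion above, the following sketch wraps the command in a small shell function. The function name volume_mountpoint is hypothetical; the --format template reads the Mountpoint field that docker volume inspect reports for a volume.

```shell
#!/bin/sh
# Hypothetical helper: print where Docker keeps the data for a named volume.
# $1 is the volume name, e.g. DataVolume3 from the demo above.
volume_mountpoint() {
    docker volume inspect --format '{{ .Mountpoint }}' "$1"
}

# Usage (requires a running Docker daemon and an existing volume):
#   volume_mountpoint DataVolume3
```

On a non-Linux host the reported path refers to the inside of Docker's virtual machine, so it is mainly useful for confirming that the volume exists rather than for browsing files.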
Docker has a concept of a named volume. By default the storage for this lives somewhere on your host system and you can't directly access it from outside Docker (*). A named volume has its own lifecycle: it can be removed independently with docker volume rm, and if you start another container mounting the same volume, it will have the same persistent content.
The docker run -v option takes some unit of storage, either a named volume or a specific host directory, and mounts it (as in the mount(8) command) in a specific place in the container filesystem. This will hide what was originally in the image and replace it with the volume content.
As you note, if the thing you mount is an empty named volume, it will get populated from the image content at container initialization time. There are some really important caveats on this functionality:
Named volume initialization happens only if the volume is totally empty.
The contents of the named volume never automatically update.
If the volume isn't empty, the volume contents completely replace what's in the image, even if the image content has changed.
The initialization happens only on native Docker, and not for example in Kubernetes.
The initialization happens only on named volumes, and not for bind-mounted host directories.
With all of these caveats, I'd avoid relying on this functionality.
If you need to mount a volume into a container, assume it will be empty when your entrypoint or the main container command starts. If you need a particular directory layout or file structure there, an entrypoint script can create it; if you're expecting it to hold particular data, keep a copy of it somewhere else in your image and copy it in if it's not already there (or, perhaps, always).
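A minimal sketch of the entrypoint approach described above, assuming the image keeps a seed copy of the data in one directory and the volume is mounted at another. The function name seed_if_empty and both paths in the usage comment (/opt/seed and /data) are hypothetical, not Docker conventions.

```shell
#!/bin/sh
# Hypothetical entrypoint helper: populate a mounted directory from a seed
# copy stored in the image, but only when the mounted directory is empty.
# $1 = seed directory baked into the image, $2 = directory where the volume is mounted.
seed_if_empty() {
    seed_dir=$1
    data_dir=$2
    mkdir -p "$data_dir"
    # `ls -A` prints nothing for an empty directory (it omits . and ..).
    if [ -z "$(ls -A "$data_dir")" ]; then
        # Copy recursively, preserving permissions and ownership where possible.
        cp -Rp "$seed_dir"/. "$data_dir"/
    fi
}

# In a real entrypoint script this would be followed by something like:
#   seed_if_empty /opt/seed /data
#   exec "$@"
```

Because the copy is done by the script rather than by Docker, it behaves the same for named volumes, bind mounts, and other runtimes. Dropping the emptiness check would give the "always copy" variant mentioned above.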
(*) On native Linux you can find a filesystem location for it, but accessing this isn't a best practice. On other OSes this will be hidden inside a virtual machine or other opaque storage. If you need to directly access the data (or inject config files, or read log files) a docker run -v /host/path:/container/path bind mount is a better choice.
Volumes are part of neither the container nor the host. Well, technically everything resides on the host machine, but Docker's storage directories are normally readable only by root, and the files in them are managed separately by Docker.
"Volumes are stored in a part of the host filesystem which is managed by Docker (/var/lib/docker/volumes/ on Linux)."
Hence a volume is mounted into the container, so the container path and the volume's directory on the host (/var/lib/docker/volumes/ on Linux) refer to the same files. A file added on either side shows up on the other; nothing is copied, and it is a mount rather than a symbolic link.
As volumes can be shared across different containers, deleting a container does not cascade to the volumes associated with it.
To remove unused volumes:
docker volume prune

docker volume and VOLUME inside Dockerfile

I'm confused about the difference between docker volume create my-vol and VOLUME ["/var/www"].
My understanding is:
1) docker volume create my-vol creates a persistent volume on our machine and each container could be linked to my-vol.
2) VOLUME ["/var/www"] creates a volume inside its container.
And when I create another container, I could link a volume such as myvol2 when running it, as follows:
$ docker run -d --name devtest --mount source=myvol2,target=/app nginx:latest
At that point, if I added VOLUME ["/var/www"] to my Dockerfile, would all of the container's data be stored in both myvol2 and /var/www?
The Dockerfile VOLUME command says two things:
If the operator doesn't explicitly mount a volume on the specific container directory, create an anonymous one there anyways.
No Dockerfile step will ever be able to make further changes to that directory tree.
As an operator, you can mount a volume (either a named volume or a host directory) into a container with the docker run -v option. You can mount it over any directory in the container, regardless of whether or not there was a VOLUME declared for it in the Dockerfile.
(Since you can use docker run -v regardless of whether or not you declare a VOLUME, and it has confusing side effects, I would generally avoid declaring VOLUME in Dockerfiles.)
Just like in ordinary Linux, only one thing can be (usefully) mounted on any given directory. With the setup you describe, data will be stored in the myvol2 you create and mount, and it will be visible in /var/www in the container, but the data will only actually be stored in one place. If you deleted and recreated the container without the volume mount the data would not be there any more.
There are two types of persistent storage used in Docker: Docker volumes and bind mounts. The difference between them is that volumes are internal to Docker and stored in the Docker store (usually under /var/lib/docker), while bind mounts use a location you choose on your machine to store persistent data.
If you want to use a Docker Volume for nginx:
docker volume create nginx-vol
docker run -d --name devtest -v nginx-vol:/usr/share/nginx/html nginx
If you want to use a bind mount:
docker run -d --name devtest -v [path]:/usr/share/nginx/html nginx
[path] is the location in which you want to store the container's data.

In docker, can I publish a volume with initial data?

I want to share a file storage between two containers. From the documentation, I've seen that you can create and use volumes like this:
docker volume create --name DataVolume1
docker run -ti --rm -v DataVolume1:/datavolume1 ubuntu
However, I want containers to be able to access an initial set of shared data. Does docker support publishing of volumes? If not, does this mean I should write the initial data manually, after creating the volume, or is there another solution for publishing the data along with the images?
With a named volume (not with a host volume, aka bind mount) docker will initialize an empty named volume to the contents of the image at the location you mount it. So if you have files in your image at /datavolume1, and DataVolume1 is empty, docker will copy those files into the named volume.

How to remove a mount for existing container?

I'm learning Docker and reading the chapter "Manage data in containers". In the section "Mount a host directory as a data volume", they include the following paragraph:
In addition to creating a volume using the -v flag you can also mount a directory from your Docker engine’s host into a container.
$ docker run -d -P --name web -v /src/webapp:/opt/webapp training/webapp python app.py
This command mounts the host directory, /src/webapp, into the container at /opt/webapp. If the path /opt/webapp already exists inside the container’s image, the /src/webapp mount overlays but does not remove the pre-existing content. Once the mount is removed, the content is accessible again. This is consistent with the expected behavior of the mount command.
Experiment 1
When I ran this command and inspected the container, I found that the container doesn't actually stay running. I then used docker logs web and found this error:
can't open file 'app.py': [Errno 2] No such file or directory
I assume that the /src/webapp mount overlays /opt/webapp, and that /src/webapp has no content.
Question 1
How can I remove this mount and check if the content is still there as the quote said?
Experiment 2
When I tried to run
$ docker run -d -P --name web2 -v newvolume:/opt/webapp training/webapp python app.py
I found that the container ran correctly. I then used docker exec -it web2 /bin/bash and found that all of the existing content is still inside /opt/webapp. I can also add more files there. So in this case, it looks like the volume is not overlaid but combined. If I use docker inspect web2 and check Mounts, I see that the volume is created under /var/lib/docker/volumes/newvolume/_data.
Question 2
If I give a name instead of a host-dir absolute path, then the volume will not overlay the container-dir /opt/webapp but connect the two dir together?
An alternative solution is to commit the container (or export it) using docker cli and re-create it without doing the mapping.
Question 1 How can I remove this mount and check if the content is still there as the quote said?
You would create a new container without the volume mount. E.g.
$ docker run -d -P --name web training/webapp python app.py
(Theoretically it's possible to perform some privileged operations to remove the mount on a running container, but inside the container you will not normally have this permission, and it's a good practice to get into the habit of treating containers as ephemeral.)
Question 2 If I give a name instead of a host-dir absolute path, then the volume will not overlay the container-dir /opt/webapp but connect the two dir together?
Almost. What's happening with named volumes is that Docker provides an initialization step when the volume is empty and the container is created with that volume mount. The initialization step copies the contents of the image at that directory into the volume, including all files and directories recursively, with their ownership and permissions. This is very useful for running containers as a non-root user with a volume directory that the user inside the container needs to be able to write into. After that initialization has happened, future containers with the same named volume will skip the initialization, even if the image content has changed, e.g. if you add new content to the image.
