Dockerfile: understanding VOLUME instruction

Dockerfile: understanding VOLUME instruction - docker

Let's take an example.
The following is the VOLUME instruction for the nginx image:
VOLUME ["/etc/nginx/sites-enabled", "/etc/nginx/certs", "/etc/nginx/conf.d", "/var/log/nginx", "/var/www/html"]
Here are my questions:
When you start the container, will these directories show up on my host? And when I stop my container, the directories will stay?
If some (or all) of these directories already exist in my host, what will happen? For example, let's say the image comes with a default config file within the /etc/nginx directory of the container, and I also have a config file within /etc/nginx on my host. When the container starts, which of these files will get priority?
What's the key difference between -v /host/dir:container/dir and VOLUME?
References:
https://github.com/dockerfile/nginx/blob/master/Dockerfile
http://www.tech-d.net/2014/11/03/docker-indepth-volumes/
How to mount host volumes into docker containers in Dockerfile during build
http://jpetazzo.github.io/2015/01/19/dockerfile-and-data-in-volumes/

A container's volumes are just directories on the host regardless of what method they are created by. If you don't specify a directory on the host, Docker will create a new directory for the volume, normally under /var/lib/docker/vfs.
However the volume was created, it's easy to find where it is on the host by using the docker inspect command e.g:
$ ID=$(docker run -d -v /data debian echo "Data container")
$ docker inspect -f {{.Mounts}} $ID
[{0d7adb21591798357ac1e140735150192903daf3de775105c18149552a26f951 /var/lib/docker/volumes/0d7adb21591798357ac1e140735150192903daf3de775105c18149552a26f951/_data /data local true }]
We can see that Docker has created a directory for the volume at /var/lib/docker/volumes/0d7adb21591798357ac1e140735150192903daf3de775105c18149552a26f951/_data.
You are free to modify/add/delete files in this directory from the host, but note that you may need to use sudo for permissions.
Docker will only delete volume directories in two circumstances:
If the --rm option is given to docker run, any volumes will be deleted when the container exits
If a container is deleted with docker rm -v CONTAINER, any volumes will be removed.
In both cases, volumes will only be deleted if no other containers refer to them. Volumes mapped to specific host directories (the -v HOST_DIR:CON_DIR syntax) are never deleted by Docker. However, if you remove the container for a volume, the naming scheme means you will have a hard time figuring out which directory contains the volume.
So, specific questions:
Yes and yes, with above caveats.
Each Docker managed volume gets a new directory on the host
The VOLUME instruction is identical to -v without specifying the host dir. When the host dir is specified, Docker does not create any directories for the volume, will not copy in files from the image and will never delete the volume (docker rm -v CONTAINER will not delete volumes mapped to user-specified host directories).
More information here:
https://blog.container-solutions.com/understanding-volumes-docker

Related

Combing VOLUME + docker run -v

I was looking for an explanation on the VOLUME entry when writing a Dockerfile and came across this statement
A volume is a persistent data stored in /var/lib/docker/volumes/...
You can either declare it in a Dockerfile, which means each time a container is started from the image, the volume is created (empty), even if you don't have any -v option.
You can declare it on runtime docker run -v [host-dir:]container-dir.
combining the two (VOLUME + docker run -v) means that you can mount the content of a host folder into your volume persisted by the container in /var/lib/docker/volumes/...
docker volume create creates a volume without having to define a Dockerfile and build an image and run a container. It is used to quickly allow other containers to mount said volume.
But I'm having a hard time understanding this line:
...combining the two (VOLUME + docker run -v) means that you can mount the content of a host folder into your volume persisted by the container in /var/lib/docker/volumes/...
For example, let's say I have a config file on my host machine and I run the container based off the image I made with the Dockerfile I wrote. Will it copy the config file into where the volume that I stated in my the volume entry?
Would it be something like (pseudocode)
#dockerfile
From Ubuntu
Run apt-get update
Run apt-get install mysql
Volume . /etc/mysql/conf.d
Cmd systemcl start MySQL
And when I run it
docker run -it -v /path/to/config/file: ubuntu_based_image
Is this what they mean?

You probably don't want VOLUME in your Dockerfile. It's not necessary to mount files or directories at runtime, and it has confusing side effects like making subsequent RUN commands silently lose state.
If an image does have a VOLUME, and you don't mount anything else there when you start the container, Docker will create an anonymous volume and mount it for you. This can result in space leaks if you don't clean these volumes up.
You can use a docker run -v option on any container directory regardless of whether or not it's declared as a VOLUME.
If you docker run -v /host/path:/container/path, the two directories are actually the same; nothing is copied, and writes to one are (supposed to be) immediately visible on the other.
docker run -v /host/path:/container/path bind mounts aren't visible in /var/lib/docker at all.
You shouldn't usually be looking at content in /var/lib/docker (and can't if you're not on a native-Linux host). If you need to access the volume file content directly, use a bind mount rather than a named or anonymous volume.
Bind mounts like you've shown are appropriate for injecting config files into containers, and for reading log files back out. Named volumes are appropriate for stateful applications' storage, like the data for a MySQL database. Neither type of volume is appropriate for code or libraries; build these directly into Docker images instead.

docker volume and VOLUME inside Dockerfile

I'm confused with what is different between creating docker volume create my-vol and VOLUME ["/var/www"].
My understanding is:
1) docker volume create my-vol creates a persistent volume on our machine and each container could be linked to my-vol.
2) VOLUME ["/var/www"] creates a volume inside its container.
And when I create another container, I could link my-vol as follows:
when running a container
$ docker run -d --name devtest --mount source=myvol2,target=/app nginx:latest
At that time, if I added VOLUME ["/var/www"] in my Dockerfile, all data of this docker file will be stored in both myvol2 and /var/www?

The Dockerfile VOLUME command says two things:
If the operator doesn't explicitly mount a volume on the specific container directory, create an anonymous one there anyways.
No Dockerfile step will ever be able to make further changes to that directory tree.
As an operator, you can mount a volume (either a named volume or a host directory) into a container with the docker run -v option. You can mount it over any directory in the container, regardless of whether or not there was a VOLUME declared for it in the Dockerfile.
(Since you can use docker run -v regardless of whether or not you declare a VOLUME, and it has confusing side effects, I would generally avoid declaring VOLUME in Dockerfiles.)
Just like in ordinary Linux, only one thing can be (usefully) mounted on any given directory. With the setup you describe, data will be stored in the myvol2 you create and mount, and it will be visible in /var/www in the container, but the data will only actually be stored in one place. If you deleted and recreated the container without the volume mount the data would not be there any more.

There are two types of persistent storage used in Docker,the first one is Docker Volumes and the second one is bind mounts. The differebce between them is that volumes are internal to Docker and stored in the Docker store (which is usually all under /var/lib/docker) and bind mounts use a physical location on your machine to store persistent data.
If you want to use a Docker Volume for nginx:
docker volume create nginx-vol
docker run -d --name devtest -v nginx-vol:/usr/share/nginx/html nginx
If you want to use a bind mount:
docker run -d --name devtest -v [path]:/usr/share/nginx/html nginx
[path] is the location in which you want to store the container's data.

Bind-mount a host directory into a volume of a running docker container

Let's say that I start a docker container with a bind-mounted local folder:
docker run --rm -v /ux1/dmtest:/data -it ubuntu
Then, locally - not inside the container, I bind-mount a directory from another fs into /ux1/dmtest:
mkdir /ux1/dmtest/bm
mount --bind /ux0/bm /ux1/dmtest/bm
Now, from the container, I see /data/bm/ and I can write content to it, but this content will not be visible on the host on /ux0/bm.
Where is this content stored?
And is there any way to mount additional storage into a running docker container (this workaround clearly doesn't work)?

Mounts done after the fact won't be seen by the container due to mount namespaces that Docker uses. The files will be in the /ux1/dmtest directory that was in place before your second bind mount.
If you do want to use a bind mount, put it in place, and then start the docker daemon, and then your container will see it.

How to remove a mount for existing container?

I'm learning docker and reading their chapter "Manage data in containers". In the "Mount a host directory as a data volume". They mentioned the following paragraph:
In addition to creating a volume using the -v flag you can also mount a directory from your Docker engine’s host into a container.
$ docker run -d -P --name web -v /src/webapp:/opt/webapp training/webapp python app.py
This command mounts the host directory, /src/webapp, into the container at /opt/webapp. If the path /opt/webapp already exists inside the container’s image, the /src/webapp mount overlays but does not remove the pre-existing content. Once the mount is removed, the content is accessible again. This is consistent with the expected behavior of the mount command.
Experiment 1
Then when I tried to run this command and try to inspect the container, I found that that actually container doesn't even run. Then I use docker logs web and find this error:
can't open file 'app.py': [Errno 2] No such file or directory
I assume that the /src/webapp mount overlays on the /opt/webapp, which there is no content.
Question 1
How can I remove this mount and check if the content is still there as the quote said?
Experiment 2
When I tried to run
$ docker run -d -P --name web2 -v newvolume:/opt/webapp training/webapp python app.py
I found that the container ran correctly. Then I use docker exec -it web2 /bin/bash and find that all of the existing content are still inside the /opt/webapp. I can also add more files inside here. So in this case, it looks like that the volume is not overlay but combined. If I use docker inspect web and check Mounts, then I'll see that the volume is created under /var/lib/docker/volumes/newvolume/_data
Question 2
If I give a name instead of a host-dir absolute path, then the volume will not overlay the container-dir /opt/webapp but connect the two dir together?

An alternative solution is to commit the container (or export it) using docker cli and re-create it without doing the mapping.

Question 1 How can I remove this mount and check if the content is still there as the quote said?
You would create a new container without the volume mount. E.g.
$ docker run -d -P --name web training/webapp python app.py
(Theoretically it's possible to perform some privileged operations to remove the mount on a running container, but inside the container you will not normally have this permission, and it's a good practice to get into the habit of treating containers as ephemeral.)
Question 2 If I give a name instead of a host-dir absolute path, then the volume will not overlay the container-dir /opt/webapp but connect the two dir together?
Almost. What's happening with named volumes is that docker provides an initialization step when the volume is empty and the container is created with that volume mount. The initialization step copies the contents of the image at that directory into the volume, including all files and directories recursively, ownership, and permissions. This is very useful to running containers as a non-root user with a volume directory that the user inside the container needs to be able to write into. After that initialization has happened, future containers with the same named volume will skip the initialization, even if the image content has changed, e.g. if you add new content into the image.

How to mount a directory in a Docker container to the host?

Assume that i have an application with this simple Dockerfile:
//...
RUN configure.sh --logmyfiles /var/lib/myapp
ENTRYPOINT ["starter.sh"]
CMD ["run"]
EXPOSE 8080
VOLUME ["/var/lib/myapp"]
And I run a container from that:
sudo docker run -d --name myapp -p 8080:8080 myapp:latest
So it works properly and stores some logs in /var/lib/myapp of docker container.
My question
I need these log files to automatically saved in host too, So how can i mount the /var/lib/myapp from the container to the /var/lib/myapp in host server (without removing current container) ?
Edit
I also see Docker - Mount Directory From Container to Host, but it doesn't solve my problem i need a way to backup my files from docker to host.

First, a little information about Docker volumes. Volume mounts occur only at container creation time. That means you cannot change volume mounts after you've started the container. Also, volume mounts are one-way only: From the host to the container, and not vice-versa. When you specify a host directory mounted as a volume in your container (for example something like: docker run -d --name="foo" -v "/path/on/host:/path/on/container" ubuntu), it is a "regular ole" linux mount --bind, which means that the host directory will temporarily "override" the container directory. Nothing is actually deleted or overwritten on the destination directory, but because of the nature of containers, that effectively means it will be overridden for the lifetime of the container.
So, you're left with two options (maybe three). You could mount a host directory into your container and then copy those files in your startup script (or if you bring cron into your container, you could use a cron to periodically copy those files to that host directory volume mount).
You could also use docker cp to move files from your container to your host. Now that is kinda hacky and definitely not something you should use in your infrastructure automation. But it does work very well for that exact purpose. One-off or debugging is a great situation for that.
You could also possibly set up a network transfer, but that's pretty involved for what you're doing. However, if you want to do this regularly for your log files (or whatever), you could look into using something like rsyslog to move those files off your container.

So how can i mount the /var/lib/myapp from the container to the /var/lib/myapp in host server
That is the opposite: you can mount an host folder to your container on docker run.
(without removing current container)
I don't think so.
Right now, you can check docker inspect <containername> and see if you see your log in the /var/lib/docker/volumes/... associated to the volume from your container.
Or you can redirect the result of docker logs <containername> to an host file.
For more example, see this gist.
The alternative would be to mount a host directory as the log folder and then access the log files directly on the host.
me#host~$ docker run -d -p 80:80 -v <sites-enabled-dir>:/etc/nginx/sites-enabled -v <certs-dir>:/etc/nginx/certs -v <log-dir>:/var/log/nginx dockerfile/nginx
me#host~$ ls <log-dir>
(again, that apply to a container that you start, not an existing running one)

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart