docker persistent storage via VOLUME

Everything I have read says that we can create Docker persistent storage via the VOLUME instruction, e.g., here and here.
However, I have the following in my Dockerfile:
VOLUME ["/home", "/root"]
but nothing was kept there (I tried touch abc, and when I exited and got back in again the file was not there).
I see that the official usage has no other special controls or treatments, yet it provides persistent storage, even when the apt-cacher-ng service container is stopped.
How is that possible? I've checked out the following but still don't have a clue:
Docker volume persistent
Persistent Docker volume
Docker volume vs. persistent volume on official Docker beginner tutorial
In conclusion, as explained here:
The first mechanism will create an implicit storage sandbox for the container that requested host-based persistence. ... The key thing to understand is that the data stored in the sandbox is not available to other containers, except the one that requested it.
How can I make persistent storage work? Why are my changes wiped each time, while the apt-cacher-ng service container maintains its persistent storage even when it is stopped and restarted?
What are the "other special controls or treatments" that I failed to see here?
Please explain with a plain, simple Dockerfile (not docker-compose.yml).

The problem is that the VOLUME instruction creates an anonymous volume when docker run is used without a mount option for that specific folder. In other words, every time you create a container from that image, Docker creates a new volume with a random name; since those volumes carry no meaningful name, new containers cannot reuse the volumes generated for earlier containers.
e.g.
FROM ubuntu
RUN mkdir /myvol
RUN echo "hello world" > /myvol/greeting
VOLUME /myvol
Given this Dockerfile, if you simply build and run this docker image using the following commands:
echo "There are currently $(docker volume ls | wc -l) volumes"
docker build -t my_volume_test:latest .
docker run --name another_test my_volume_test:latest
echo "There are currently $(docker volume ls | wc -l) volumes"
you'll see that the number of volumes on your machine has increased. That container is now using a volume to store its data, but that specific volume has no name or label, so it is bound only to that specific container. If you delete and recreate the container, Docker will generate a new volume with a random name, unless you explicitly mount the volume created earlier.
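If you want to check which anonymous volume a given container is using, one way (a sketch, using the container name from the example above) is to read its Mounts via docker inspect:
docker inspect -f '{{ range .Mounts }}{{ .Name }}{{ "\n" }}{{ end }}' another_test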
If you want to make this easy, I suggest creating the volume first and then mounting it, e.g.
docker rm -f another_test
docker volume create my-vol
docker run \
--name another_test \
--mount source=my-vol,target=/myvol \
my_volume_test:latest
# alternative
docker rm -f another_test
docker run \
--name another_test \
-v my-vol:/myvol \
my_volume_test:latest
In this case you can create and remove as many containers as you want; they will all use the same volume.
Check the VOLUME reference and Use volumes for more info.

Related

What is the use of VOLUME in this official Dockerfile of postgresql

I found the following code in the Dockerfile of the official postgresql image: https://github.com/docker-library/postgres/blob/master/11/Dockerfile
ENV PGDATA /var/lib/postgresql/data
RUN mkdir -p "$PGDATA" && chown -R postgres:postgres "$PGDATA" && chmod 777 "$PGDATA" # this 777 will be replaced by 700 at runtime (allows semi-arbitrary "--user" values)
VOLUME /var/lib/postgresql/data
I want to know what is the purpose of VOLUME in this regard.
VOLUME /var/lib/postgresql/data
As per my understanding, it will create a new storage volume when we start a container, and that storage volume will also be deleted permanently when the container is removed (docker stop containerid; docker rm containerid).
So if the data is not going to persist, why use this at all? VOLUME is supposed to be used when we want data to persist.
My question is: what is its use if the postgres data only remains while the container is running, and after that everything is wiped out? If I have done a lot of work and in the end everything is gone, then what's the use?
As per my understanding, it will create a new storage volume when we start a container, and that storage volume will also be deleted permanently when the container is removed (docker stop containerid; docker rm containerid).
If you run a container with the --rm option, anonymous volumes are deleted when the container exits. If you do not pass the --rm option when creating the container, then the -v option to docker container rm will also delete volumes. Otherwise, these anonymous volumes will persist after a stop/rm.
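To see this in practice, here is a sketch (borrowing the my_volume_test image built in the first answer on this page):
docker run --name keep_test my_volume_test:latest
docker rm keep_test          # the anonymous volume is left behind (check docker volume ls)
docker run --name keep_test2 my_volume_test:latest
docker rm -v keep_test2      # the anonymous volume is deleted along with the container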
That said, anonymous volumes are difficult to manage since it's not clear which volume contains what data. Particularly with images like postgresql, I would prefer if they removed the VOLUME line from their Dockerfile, and instead provided a compose file that defined the volume with a name. You can see more about what the VOLUME line does and why it creates problems in my answer over here.
Your understanding of how volumes work is almost, but not completely, correct.
As you stated, when you create a container from an image defining a VOLUME, docker will indeed create an anonymous volume (i.e. with a random name).
When you stop/remove the container, the volume itself will not be deleted and will still be accessible via the docker volume family of commands.
Indeed, in order to remove a container and delete its associated volumes you have to use the -v flag, as in docker rm -v container-name. This command removes the container and deletes all the anonymous volumes associated with it (named volumes are never deleted unless explicitly requested via docker volume rm volume-name).
So, to sum up, the VOLUME directive inside a Dockerfile identifies the places that will host persistent data, ensuring the following:
data will survive the life of the container by default
data can be shared with other containers (e.g. via --volumes-from)
The most important aspect to me is that it also serves as a sort of implicit documentation for your users, letting them know where the persistent state is kept (so that they can name the volumes via the -v flag of docker run).
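As a sketch of that implicit documentation: you can list the volumes an image declares before ever running it, e.g. for the postgres image from the question:
docker image inspect -f '{{ .Config.Volumes }}' postgres:11
# map[/var/lib/postgresql/data:{}]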

Creating docker named volume pointing to folder on host

I think that my question is simple but I want to make sure I am taking the right approach. On my host computer, I have a path, e.g. /my/docs, which contains HTML files which get updated automatically.
I have a Docker container with a small web server serving these HTML files. I would like to create a named Docker volume called my-docs pointing to /my/docs, so that I can start the container as docker run -v my-docs:/public ...
Is this the right approach, and if so, what is the docker volume create command?
The default "local" driver for named volumes places the data inside of the docker directories. The correct way to do what you want is to use the built in host volume, instead of a named volume:
docker run -v /my/docs:/public ...
Alternatively, you can first copy the contents from /my/docs into the named volume and then use that named volume:
docker run --rm \
-v /my/docs:/source -v my-docs:/target \
busybox cp -av /source/. /target/
docker run --rm -it -v my-docs:/public busybox /bin/sh
It's also possible that someone has created a driver for this, or for you to create one yourself. See the volume driver plugin docs for more details.
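For completeness, the built-in local driver can also create a named volume that is backed by a host directory via bind-mount options; a sketch using the names from the question:
docker volume create --driver local \
--opt type=none \
--opt o=bind \
--opt device=/my/docs \
my-docs
docker run -v my-docs:/public ...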

What is the purpose of VOLUME in Dockerfile

I'm trying to go deeper in my understanding of Docker volumes, and I'm having a hard time figuring out the differences / use-cases of:
The docker volume create command
The docker run -v /host_path:/container_path flag
The VOLUME entry in the Dockerfile
I particularly don't understand what happens if you combine the VOLUME entry with the -v flag.
A volume is persistent data stored in /var/lib/docker/volumes/...
You can either declare it in a Dockerfile, which means that each time a container is started from the image, the volume is created (empty), even if you don't pass any -v option.
You can declare it at runtime with docker run -v [host-dir:]container-dir.
Combining the two (VOLUME + docker run -v) means that you can mount the content of a host folder into your volume, persisted by the container in /var/lib/docker/volumes/...
docker volume create creates a volume without having to define a Dockerfile and build an image and run a container. It is used to quickly allow other containers to mount said volume.
If you had persisted some content in a volume, but have since deleted the container (which by default does not delete its associated volumes, unless you use docker rm -v), you can re-attach said volume to a new container (declaring the same volume).
See "Docker - How to access a volume not attached to a container?".
With docker volume create, it is easy to re-attach a named volume to a container.
docker volume create --name aname
docker run -v aname:/apath --name acontainer
...
# modify data in /apath
...
docker rm acontainer
# let's mount aname volume again
docker run -v aname:/apath --name acontainer
ls /apath
# you find your data back!
The VOLUME instruction becomes interesting when you combine it with the --volumes-from runtime parameter.
Given the following Dockerfile:
FROM busybox
VOLUME /myvolume
Build an image with:
docker build -t my-busybox .
And spin up a container with:
docker run --rm -it --name my-busybox-1 my-busybox
The first thing to notice is that you will have a folder named /myvolume in this container. But it is not particularly interesting yet, since the volume will be removed as well when we exit the container (because of --rm).
Create an empty file in this folder, so run the following in the container:
cd myvolume
touch hello.txt
Now spin up a new container, but share the same volume with my-busybox-1:
docker run --rm -it --volumes-from my-busybox-1 --name my-busybox-2 my-busybox
You will see that my-busybox-2 contains the file hello.txt in myvolume folder.
Once you exit both containers, the volume will be removed as well.
Specifying VOLUME in a Dockerfile ensures that the folder is treated as a volume (i.e., stored outside the container's writable layer) at runtime, as opposed to being a regular directory inside the container. Note the performance and accessibility implications.
If you forgot to specify -v on the docker run command line, the above is still true; the volume simply ends up with an anonymous (random) name. There are still ways to access or recover data from such anonymous volumes.
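For example, one way to peek into (or copy data out of) such an anonymous volume is to mount it into a throwaway container; a sketch, where <volume-name> stands for whatever docker volume ls shows:
docker volume ls
docker run --rm -it -v <volume-name>:/data busybox ls -l /data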
Using MySQL from Docker Hub:
Running the below command as an example:
$ docker run --name some-mysql -v /my/own/datadir:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=my-secret-pw -d mysql:tag
The -v /my/own/datadir:/var/lib/mysql part of the command mounts the /my/own/datadir directory from the underlying host system as /var/lib/mysql inside the container, where MySQL by default will write its data files.
Therefore, a host directory that persists when the container is killed is mounted and available, which also provides higher performance for some operations, such as database writes.

How to write data to host file system from Docker container

I have a Docker container which is running some code and creating some HTML reports. I want these reports to be published into a specific directory on the host machine, i.e. at /usr/share/nginx/reports
The way I have gone about doing this is to mount this host directory as a data volume, i.e. docker run -v /usr/share/nginx/reports --name my-container com.containers/my-container
However, when I ssh into the host machine, and check the contents of the directory /usr/share/nginx/reports, I don't see any of the report data there.
Am I doing something wrong?
The host machine is an Ubuntu server, and the Docker container is also Ubuntu, no boot2docker weirdness going on here.
From "Managing data in containers", mounting a host folder to a container would be:
docker run -v /Users/<path>:/<container path>
(see "Use volume")
Using only -v /usr/share/nginx/reports would declare the internal container path /usr/share/nginx/reports as a volume, but would have nothing to do with the host folder.
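So, assuming the reports are generated under the same path inside the container, a sketch of the fixed command from the question would be:
docker run -v /usr/share/nginx/reports:/usr/share/nginx/reports --name my-container com.containers/my-container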
The answer to this question is problematic because it varies depending on your operating system and your full requirements. The answer by VonC makes some assumptions that should be addressed and is therefore only correct in some contexts. Other answers on this topic generally ignore the fact that some people are running linux, others windows, and still others are on OSX or other weird OS's.
As VonC mentioned in his answer, in a lot of cases it is possible to bind-mount a host directory straight into the container, using a -v host-path:container-path argument to the docker command (you can also use --volume for added readability or --mount for rocket-science).
One of the biggest problems (in 2020) is the use of the Windows Subsystem for Linux (WSL), where bind-mounting a host volume is fraught with error and may or may not work as expected depending on whether the path mounted is in the linux filesystem or the windows filesystem. VonC's answer was written before WSL became a big problem, but it still makes assumptions about the local filesystem being real rather than mounted into a virtual-machine of some kind.
I have found that a lot of engineers prefer to bypass this unnecessary confusion through the use of docker volumes. A docker volume can be created with the command:
docker volume create <name>
Listed with
docker volume ls
and removed with
docker volume rm <name>
You can mount this by specifying the name of the volume on the left-hand-side of the --volume argument. If your volume was called, for example, 'logs', you could use something like --volume logs:/usr/share/nginx/reports to bind it to the log dir you're interested in. You can view the contents of the directory with something like this:
docker run -it --rm --volume logs:/logs alpine ls -AlF /logs/
This should list the files in that directory. If you have a file called 'nginx.log' for example, you could view it like this:
docker run -it --rm --volume logs:/logs alpine less /logs/nginx.log
And the contents would be paged to your terminal.
You can bind this volume to multiple containers simultaneously if needed. This is useful if, for example, you're writing to your logs with one container, and paging them to a console with another.
If you want to copy the example log file from above into a tmp directory on your local filesystem you can achieve that with:
docker run -it --rm --volume logs:/logs --volume /tmp:/local_tmp alpine cp /logs/nginx.log /local_tmp/
I am using Docker Toolbox on Windows, working on a Spring Boot application with Docker. My application writes logs to
users/path/service.log
When I started my application from the host terminal, the log file was successfully updated.
But when I did the same in Docker, no file was created and nothing was updated.
So I changed my log file location to match the container's directories:
var/log/service.log
I started my container again and my file was updated again.
You can choose any location as long as it matches a directory inside the container. Just bash into the container and see what suits you.
The next step is to copy the log files from the container to the host.
In order to copy those logs to your host, you can use one of the two ways I know of:
1- Use volumes in Docker
2- Use the following Docker command to copy a file from a Docker container to the host:
docker cp <containerId>:/file/path/within/container /host/path/target
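For example, assuming the log lives at /var/log/service.log in the container (container name hypothetical):
docker cp my-spring-app:/var/log/service.log /tmp/service.log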
First, you need to create a directory where you want to share the data:
mkdir -p /abc/def/
Now, you need to create a docker volume using the below command. As we see here, we are specifying device as '/abc/def/'
docker volume create --driver local \
--opt type=none \
--opt device=/abc/def/ \
--opt o=bind \
spark-volume
Now, start your container with the below command:
docker run -d \
--mount type=volume,dst=/abc/def/,volume-driver=local,volume-opt=type=none,volume-opt=o=bind,volume-opt=device=/abc/def/ \
--network host \
img:tag
Now the Docker container will use /abc/def/ on the local filesystem as its storage, and all contents of /abc/def/ will be available both inside the container and on the local filesystem.
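You can verify the mapping from either side; a sketch, with a hypothetical container ID:
touch /abc/def/created-on-host
docker exec <container-id> ls /abc/def/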
In your application, if you set a working directory for your PHP code (the report path), the path must be the one inside the container. Docker will then automatically propagate the writes to your host directory. It wasn't a Docker misconfiguration, but my application writing to the wrong place. Weird at first, but it worked in my case.

How to move Docker containers between different hosts?

I cannot find a way of moving running Docker containers from one host to another.
Is there any way I can push my containers to repositories like we do for images?
Currently, I am not using data volumes to store the data associated with applications running inside containers. So some data resides inside containers, which I want to persist before redesigning the setup.
Alternatively, if you do not wish to push to a repository:
Export the container to a tarball
docker export <CONTAINER ID> > /home/export.tar
Move your tarball to new machine
Import it back
cat /home/export.tar | docker import - some-name:latest
You cannot move a running docker container from one host to another.
You can commit the changes in your container to an image with docker commit, move the image onto a new host, and then start a new container with docker run. This will preserve any data that your application has created inside the container.
NB: it does not preserve data stored inside volumes; you need to move data volumes to the new host manually.
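A common way to move a named volume by hand is to pack it with a helper container, copy the tarball, and unpack it on the new host; a sketch, with my-vol as a hypothetical volume name:
# old host: pack the volume's contents
docker run --rm -v my-vol:/from busybox tar cf - -C /from . > my-vol.tar
# copy my-vol.tar over (scp, rsync, ...), then on the new host:
docker volume create my-vol
docker run --rm -i -v my-vol:/to busybox tar xf - -C /to < my-vol.tar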
What eventually worked for me, after lots of confusing manuals and tutorials (Docker was, at the time of writing, at the peak of inflated expectations), is:
Save the docker image into an archive:
docker save image_name > image_name.tar
Copy it to another machine.
On that other Docker machine, run docker load in the following way:
cat image_name.tar | docker load
Export and import, as proposed in other answers, do not export the ports and variables that might be required for your container to run, so you might end up with errors like "No command specified" when you try to load the image on another machine.
So, the difference between save and export is that save keeps the whole image with its history and metadata, while export captures only the file structure (without history or metadata).
Needless to say, if those ports are already taken on the Docker host you are importing to, by some other container, you will end up with a conflict and will have to reconfigure the exposed ports.
Note: in order to move data with Docker, you may have persistent storage somewhere, which should also be moved along with the containers.
Use this script:
https://github.com/ricardobranco777/docker-volumes.sh
This does preserve data in volumes.
Example usage:
# Stop the container
docker stop $CONTAINER
# Create a new image
docker commit $CONTAINER $CONTAINER
# Save image
docker save -o $CONTAINER.tar $CONTAINER
# Save the volumes (use ".tar.gz" if you want compression)
docker-volumes.sh $CONTAINER save $CONTAINER-volumes.tar
# Copy image and volumes to another host
scp $CONTAINER.tar $CONTAINER-volumes.tar $USER@$HOST:
# On the other host:
docker load -i $CONTAINER.tar
docker create --name $CONTAINER [<PREVIOUS CONTAINER OPTIONS>] $CONTAINER
# Load the volumes
docker-volumes.sh $CONTAINER load $CONTAINER-volumes.tar
# Start container
docker start $CONTAINER
From Docker documentation:
docker export does not export the contents of volumes associated with the container. If a volume is mounted on top of an existing directory in the container, docker export will export the contents of the underlying directory, not the contents of the volume. Refer to Backup, restore, or migrate data volumes in the user guide for examples on exporting data in a volume.
I tried many solutions for this, and this is the one that worked for me:
1. Commit/save the container to a new image:
++ commit the container:
# docker stop CONTAINER_NAME
# docker commit CONTAINER_NAME IMAGE_NAME:TAG
# docker save --output IMAGE_NAME.tar IMAGE_NAME:TAG
PS: "Our container CONTAINER_NAME has a mounted volume at '/var/home'" (you have to inspect your container to find its volume path: # docker inspect CONTAINER_NAME)
++ save its volume: we will use an ubuntu image to do it.
# mkdir backup
# docker run --rm --volumes-from CONTAINER_NAME -v $(pwd)/backup:/backup ubuntu \
bash -c "cd /var/home && tar cvf /backup/volume_backup.tar ."
Now when you look at $(pwd)/backup, you will find our volume in tar format.
Until now, we have our container's image 'IMAGE_NAME.tar' and its volume 'volume_backup.tar'.
Now you can recreate the same old container on a new host.
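On the new host, the reverse is a sketch like this (same placeholders as above):
# docker load --input IMAGE_NAME.tar
# docker run -d --name CONTAINER_NAME -v /var/home IMAGE_NAME:TAG
# docker run --rm --volumes-from CONTAINER_NAME -v $(pwd)/backup:/backup ubuntu \
bash -c "cd /var/home && tar xvf /backup/volume_backup.tar"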
docker export <container-name> | gzip > <name>.tar.gz
# new host
gunzip < /mnt/usb/<name>.tar.gz | docker import - <image-name>
docker run -i -p 80:80 <image-name> /bin/bash
