Docker volume content does not persist - docker

I am trying to capture the state of a docker container as an image, in a way that includes files I have added to a volume within the container. So, if I run the original container in this way:
$ docker run -ti -v /cookbook ubuntu:14.04 /bin/bash
root@b78f3599d936:/# cd cookbook
root@b78f3599d936:/cookbook# touch foo.txt
Now, if I either export or commit the container as a new docker image and then run a container from the new image, the file foo.txt is never included in the /cookbook directory.
My question is whether there is a way to create an image from a container in a way that allows the image to include file content within its volumes.

Is there a way to create an image from a container in a way that allows the image to include file content within its volumes?
No. A volume is designed to manage data inside and between your Docker containers; it's used to persist and share data. What's in an image is usually your program (artifacts, executables, libraries, etc.) along with its environment, so building or updating data into an image does not make much sense.
The documentation on volumes tells us:
Changes to a data volume will not be included when you update an image.
Likewise, the docker commit documentation says:
The commit operation will not include any data contained in volumes mounted inside the container.

Well, by putting the changes in a volume, you're excluding them from the actual container. The documentation for docker export includes this:
The docker export command does not export the contents of volumes associated with the container. If a volume is mounted on top of an existing directory in the container, docker export will export the contents of the underlying directory, not the contents of the volume.
Refer to Backup, restore, or migrate data volumes in the user guide for examples on exporting data in a volume.
This points to this documentation. Please follow the steps there to export the information stored in the volume.
You're probably looking for something like this:
docker run --rm --volumes-from <containerId> -v $(pwd):/backup ubuntu tar cvf /backup/backup.tar /cookbook
This would create a file backup.tar with the contents of the container's /cookbook directory and store it in the current directory of the host. You could then use this tar file to restore the data into another container, as sketched below.
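To restore, the same pattern works in reverse; something along these lines (a sketch, with made-up container names) unpacks the backup into a fresh container's volume:
docker run -v /cookbook --name cookbook2 ubuntu /bin/bash
docker run --rm --volumes-from cookbook2 -v $(pwd):/backup ubuntu bash -c "cd /cookbook && tar xvf /backup/backup.tar --strip 1"
This is the same backup/restore approach described in the linked documentation.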

Essentially, there are three ways to do persistence in Docker:
You can keep files in a volume, which is a filesystem managed by Docker. This is what happens in your example: because the /cookbook directory is part of a volume, your file does not get committed/exported with the image. It does however get stored in the volume, so if you remount the same volume in a different container, you will find your file there (see the example at the end of this answer). You can list your volumes using docker volume ls. As you can see, you should probably give your volumes names if you plan to reuse them. You can mount an existing volume, or create a new one if the name does not exist, with
docker run -v name:/directory ubuntu
You can keep files as part of the image. If you commit the container, all changes to its file hierarchy are stored in the new image except those made to mounted volumes. So if you just get rid of the -v flag, your file shows up in the commit.
You can bind mount a directory from the host machine to the container, by using the -v /hostdir:/targetdir syntax. The container then simply has access to a directory of the host machine.
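For example, a quick sketch of the named-volume case (the volume name cookbook-data is made up):
docker run -v cookbook-data:/cookbook ubuntu:14.04 touch /cookbook/foo.txt
docker run -v cookbook-data:/cookbook ubuntu:14.04 ls /cookbook
The second container sees foo.txt even though the first one has already exited, because the file lives in the volume rather than in either container.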

Docker commit allows you to create an image from a container and its data (mounted volumes will be ignored)
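For example, something like this (a sketch; the container and image names are made up) preserves the file, because /cookbook is an ordinary image directory here rather than a volume:
docker run --name cookbook-src ubuntu:14.04 bash -c "mkdir -p /cookbook && touch /cookbook/foo.txt"
docker commit cookbook-src cookbook:with-foo
docker run --rm cookbook:with-foo ls /cookbook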

Related

Combining VOLUME + docker run -v

I was looking for an explanation of the VOLUME entry when writing a Dockerfile and came across this statement:
A volume is persistent data stored in /var/lib/docker/volumes/...
You can either declare it in a Dockerfile, which means each time a container is started from the image, the volume is created (empty), even if you don't have any -v option.
You can declare it at runtime with docker run -v [host-dir:]container-dir.
combining the two (VOLUME + docker run -v) means that you can mount the content of a host folder into your volume persisted by the container in /var/lib/docker/volumes/...
docker volume create creates a volume without having to define a Dockerfile and build an image and run a container. It is used to quickly allow other containers to mount said volume.
But I'm having a hard time understanding this line:
...combining the two (VOLUME + docker run -v) means that you can mount the content of a host folder into your volume persisted by the container in /var/lib/docker/volumes/...
For example, let's say I have a config file on my host machine and I run a container based off the image I made with the Dockerfile I wrote. Will it copy the config file into the volume that I declared in my VOLUME entry?
Would it be something like (pseudocode)
# Dockerfile
FROM ubuntu
RUN apt-get update
RUN apt-get install -y mysql-server
VOLUME /etc/mysql/conf.d
CMD systemctl start mysql
And when I run it
docker run -it -v /path/to/config/file: ubuntu_based_image
Is this what they mean?
You probably don't want VOLUME in your Dockerfile. It's not needed in order to mount files or directories at runtime, and it has confusing side effects, like making subsequent RUN commands silently lose state.
If an image does have a VOLUME, and you don't mount anything else there when you start the container, Docker will create an anonymous volume and mount it for you. This can result in space leaks if you don't clean these volumes up.
You can use a docker run -v option on any container directory regardless of whether or not it's declared as a VOLUME.
If you docker run -v /host/path:/container/path, the two directories are actually the same; nothing is copied, and writes to one are (supposed to be) immediately visible on the other.
docker run -v /host/path:/container/path bind mounts aren't visible in /var/lib/docker at all.
You shouldn't usually be looking at content in /var/lib/docker (and can't if you're not on a native-Linux host). If you need to access the volume file content directly, use a bind mount rather than a named or anonymous volume.
Bind mounts like you've shown are appropriate for injecting config files into containers, and for reading log files back out. Named volumes are appropriate for stateful applications' storage, like the data for a MySQL database. Neither type of volume is appropriate for code or libraries; build these directly into Docker images instead.
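For instance, something along these lines (a sketch; the file name, volume name, and mysql image are just examples):
# config injected read-only via a bind mount; database files kept in a named volume
docker run -d \
  -e MYSQL_ROOT_PASSWORD=example \
  -v "$PWD/my.cnf:/etc/mysql/conf.d/my.cnf:ro" \
  -v mysql-data:/var/lib/mysql \
  mysql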

How to keep file from mounted volume and create new container with that file?

I have a docker container image that requires me to mount a volume containing a specific configuration file, in order for that container to properly start (this image is not one that I have control over, and is vendor supplied). If that volume is not mounted, the container will exit because the file is not found. So I need to put a configuration file in /host/folder/, and then:
docker run --name my_app -v /host/folder:/container/folder image_id
The application will then look in /container/folder/ for the file it needs to start.
I want to create/commit a new image with that file inside /container/folder/, but when that folder is mounted as a volume from the host, docker cp will not help me do this, as far as I have tried. I think, as far as docker is concerned, the file copied there is no different from the files in the mounted volume, and will disappear when the container is stopped.
Part of the reason I want to do this, is because the file will not be changed, and should be there by default. The other reason is that I want to run this container in Kubernetes, and avoid using persistent volumes there to mount these directories. I have looked into using configmaps, but I'm not seeing how I can use those for this purpose.
If you can store the file in a ConfigMap, you can mount that ConfigMap as a volume and use it inside Kubernetes.
I am not sure what type of file you have to use.
The ConfigMap will inject the file into a volume in the Pod so the application can access and use it.
In this case no PVC is required.
You can also follow this nice example showing how to mount the file into a volume inside a pod.
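For example, creating the ConfigMap from the file could look like this (a sketch; the name and path are made up):
kubectl create configmap app-config --from-file=/host/folder/config.file
The Pod spec would then list app-config under volumes and mount it at /container/folder, so the application finds the file without any PVC.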
OR
Alternatively, if you can extend the vendor docker image, you can add the file at the required path, something like:
FROM <docker image>
COPY file /container/folder/
In this case, you would have to check whether you are allowed to use the vendor docker image as a base image and add the file into it.
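Then build and run the derived image, something like (the image names are placeholders):
docker build -t vendor-image-with-config .
docker run --name my_app vendor-image-with-config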

How does volume mount from container to host and vice versa work?

docker run -ti --rm -v DataVolume3:/var ubuntu
Let's say I have a volume DataVolume3 which pulls in the contents of /var in the ubuntu container.
Even after killing this ubuntu container, the volume remains and I can mount DataVolume3 into other containers.
This means that when a container is deleted, its volume mounts are not deleted.
How does this work?
Does that volume mount mean that it copies the contents of /var into some local directory? It does not look like a symbolic link.
If I have the container running and I create a file in the container, does the same file get copied to the host path?
How does this whole process of volume mounting from container to host and host to container work?
Volumes are used for persistent storage, and they persist independently of the lifecycle of the container.
We can go through a demo to understand it clearly.
First, let's create a container using the named volumes approach as:
docker run -ti --rm -v DataVolume3:/var ubuntu
This will create a docker volume named DataVolume3 and it can be viewed in the output of docker volume ls:
docker volume ls
DRIVER VOLUME NAME
local DataVolume3
Docker stores the information about these named volumes in the directory /var/lib/docker/volumes/ (*):
ls /var/lib/docker/volumes/
1617af4bce3a647a0b93ed980d64d97746878564b141f30b6110d0818bf32b76 DataVolume3
Next, let's write some data from the ubuntu container at the mounted path /var:
echo "hello" > var/file1
root#2b67a89a0050:/# cat /var/file1
hello
We can see this data with cat even after deleting the container:
cat /var/lib/docker/volumes/DataVolume3/_data/file1
hello
Note: although we are able to access the volume data as shown above, it is not a recommended practice to access it like this.
The next time another container uses the same volume, the data in the volume gets mounted at the container directory specified in the -v flag.
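For example (continuing the demo; the mount point /data is arbitrary, to show it need not match the original /var):
docker run --rm -v DataVolume3:/data ubuntu cat /data/file1
hello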
(*) The location may vary based on the OS, as pointed out by David, and can be seen with the docker volume inspect command.
Docker has a concept of a named volume. By default the storage for this lives somewhere on your host system and you can't directly access it from outside Docker (*). A named volume has its own lifecycle, it can be independently docker volume rm'd, and if you start another container mounting the same volume, it will have the same persistent content.
The docker run -v option takes some unit of storage, either a named volume or a specific host directory, and mounts it (as in the mount(8) command) in a specific place in the container filesystem. This will hide what was originally in the image and replace it with the volume content.
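For example (a sketch; the names and paths are arbitrary):
docker run -v mydata:/var/lib/app ubuntu ls /var/lib/app    # named volume "mydata"
docker run -v /srv/config:/etc/app ubuntu ls /etc/app       # bind mount of a host directory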
As you note, if the thing you mount is an empty named volume, it will get populated from the image content at container initialization time. There are some really important caveats on this functionality:
Named volume initialization happens only if the volume is totally empty.
The contents of the named volume never automatically update.
If the volume isn't empty, the volume contents completely replace what's in the image, even if it's changed.
The initialization happens only on native Docker, and not for example in Kubernetes.
The initialization happens only on named volumes, and not for bind-mounted host directories.
With all of these caveats, I'd avoid relying on this functionality.
If you need to mount a volume into a container, assume it will be empty when your entrypoint or the main container command starts. If you need a particular directory layout or file structure there, an entrypoint script can create it; if you're expecting it to hold particular data, keep a copy of it somewhere else in your image and copy it in if it's not already there (or, perhaps, always).
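A minimal entrypoint sketch (the /seed and /data paths are made up):
#!/bin/sh
# Seed the mounted volume from a copy kept elsewhere in the image, if it is empty
if [ -z "$(ls -A /data 2>/dev/null)" ]; then
  cp -a /seed/. /data/
fi
# Then hand off to the main container command
exec "$@"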
(*) On native Linux you can find a filesystem location for it, but accessing this isn't a best practice. On other OSes this will be hidden inside a virtual machine or other opaque storage. If you need to directly access the data (or inject config files, or read log files) a docker run -v /host/path:/container/path bind mount is a better choice.
Volumes are part of neither the container nor the host. Well, technically everything resides on the host machine, but the docker directories are only accessible with root privileges, and the files in them are managed separately by docker.
"Volumes are stored in a part of the host filesystem which is managed by Docker (/var/lib/docker/volumes/ on Linux)."
Hence a volume is shared between the container and the host: any addition on either end is written to the volume (/var/lib/docker/volumes), not as a hard copy, but more like a mount or symbolic link.
As volumes can be shared across different containers, deleting a container does not cascade to the volumes associated with it.
To remove unused volumes:
docker volume prune

In docker, can I publish a volume with initial data?

I want to share a file storage between two containers. From the documentation, I've seen that you can create and use volumes like this:
docker volume create --name DataVolume1
docker run -ti --rm -v DataVolume1:/datavolume1 ubuntu
However, I want containers to be able to access an initial set of shared data. Does docker support publishing of volumes? If not, does this mean I should write the initial data manually, after creating the volume, or is there another solution for publishing the data along with the images?
With a named volume (not with a host volume, aka bind mount) docker will initialize an empty named volume to the contents of the image at the location you mount it. So if you have files in your image at /datavolume1, and DataVolume1 is empty, docker will copy those files into the named volume.
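For example (a sketch; my-seeded-image is a made-up image that has files baked in at /datavolume1):
docker volume create DataVolume1
docker run --rm -v DataVolume1:/datavolume1 my-seeded-image ls /datavolume1
On first use with the empty volume, the image's files at /datavolume1 are copied into DataVolume1 and show up in the listing; they will also be there for any later container that mounts the same volume.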

How to mount current directory as read-only but still allow changes inside the container?

I have a situation where:
I want to mount a directory ~/tmp/mycode to /mycode readonly
I want to be able to edit the files in the directory, so I can't just run -v /my/local/path/tmp/mycode:/mycode
I want it to not persist changes on the host filesystem though so I can't mount it read/write
~/tmp/mycode is rather large
Basically I want to be able to edit the files in the mounted volume but not have those changes persisted.
My current workflow is to create a dummy container using a dockerfile:
ADD . /mycode
and then execute that container.
However, as the repository grows, this step takes longer and longer to perform, because the only way I can think of is to make a complete copy of ~/tmp/mycode in order to be able to manipulate the files in the container.
I've also thought about mounting the directory and copying it inside the container and committing that container, but that has the same issue.
Is there a way to run a docker container to allow file edits without persisting them on the host short of copying the whole directory?
I am using the latest docker for mac, currently Version 17.03.1-ce-mac5 (16048).
This is fairly trivial to do with docker and overlay:
docker run --name myenv --privileged -v /my/local/path/tmp/mycode:/mnt/rocode:ro -it ubuntu /bin/bash
docker exec myenv mkdir -p /mycode /mnt/code-workdir
docker exec myenv mount -t overlay overlay -o lowerdir=/mnt/rocode,upperdir=/mycode,workdir=/mnt/code-workdir /mycode
This should mount the code from your directory read only and create the overlay inside the container so that /mnt/rocode is read only, but /mycode is writable.
Make sure that your kernel is 3.18+ and that you have overlay in your /proc/filesystems.
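To check that writes stay inside the container (a sketch, assuming the overlay mount above succeeded):
docker exec myenv sh -c 'echo test > /mycode/scratch.txt'
docker exec myenv ls /mnt/rocode    # scratch.txt is not listed here
ls /my/local/path/tmp/mycode        # and it does not appear on the host either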
