Docker: how to set up file ownership in a data-only container?

I've dockerized a PHP application using a data-only container. I used the docker-symfony container stack.
This is a simple definition of a data-only container:
FROM debian:jessie
MAINTAINER Vincent Composieux <vincent.composieux@gmail.com>
VOLUME /var/www/symfony
Everything works really well apart from ownership and permissions. I've noticed that when I mount volumes (my local directory) into the data-only container, the mounted files remain owned by my current user on the host, which is not recognized inside the container.
For example, if I start the containers with docker-compose up as ltarasiewicz and then log into the data container, I can see that the mounted files have ownership set to:
drwxrwxr-x 7 1000 1000 4096 Jun 10 21:27 symfony
The uid and gid of 1000 correspond to my host user's uid and gid. Because there is no such user inside the container, only the numeric IDs are displayed for the symfony directory. This makes it impossible to run the application.
So my question is: how can I mount volumes into a data-only container and assign the correct ownership to the mounted files, e.g. root:www-data or any other user I choose?

Use the same image you use for running the application to make your data container. For example, if you want to make a PostgreSQL database, use the postgres image for your data container:
$ docker run --name dc postgres echo "Data Container"
This command creates the data container and exits immediately; note that data containers aren't left running.
In the case of postgres, the volume ownership won't be set correctly until you use the volume to start a db:
$ docker run -d --volumes-from dc postgres
But other images will set up the ownership correctly in the Dockerfile.
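For example, an image can prepare the directory and its ownership in the Dockerfile before declaring the volume, so the volume is initialized with that ownership. A minimal sketch (the www-data user and the symfony path are just illustrative here):
FROM debian:jessie
# Create the directory and hand it to the application user *before* the VOLUME
# declaration, so the volume is initialized with this ownership.
RUN mkdir -p /var/www/symfony && chown -R www-data:www-data /var/www/symfony
VOLUME /var/www/symfony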
Of course, if you only want to fix the permissions in a data container, you can mount it and run chown:
$ docker run --volumes-from dc postgres chown -R postgres:postgres /var/lib/postgresql/data
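If the underlying issue is the host uid/gid (1000 here) not matching any user in the container, one possible approach, not part of the original answer, is to remap an existing user in the image to that uid/gid. A sketch assuming a Debian-based image where www-data already exists:
FROM debian:jessie
# Assumption: the host files are owned by uid/gid 1000; give www-data the same ids
RUN usermod -u 1000 www-data && groupmod -g 1000 www-data
VOLUME /var/www/symfony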

Related

docker: get files from one container to another

I have 2 docker containers which I created from 2 Dockerfiles.
docker run container1 # It updates a txt file (update.txt) every minute and stores it in the same container
docker run container2 --link container1 # A web server which is intended to read the updated file from container1
Now I want to access the file update.txt in container2, but I can't do that. I don't want to just copy the file, since the copy would be static; I want to read the dynamically updated file so I always see the latest updates. Can anyone suggest a way out?
Use a named volume to store update.txt on the host.
Mount this volume in both containers.
All changes that container1 writes will then be accessible in container2.
First, create a Docker volume using the command below:
$ docker volume create --name sharedVolume
sharedVolume
Then start the first container with the volume mounted, and write data to the location where the volume is mounted:
$ docker run -it -v sharedVolume:/dataToWrite ubuntu
root@1021d9260d7b:/# echo "DATA Written" >> /dataToWrite/Example.txt
root@1021d9260d7b:/# cat /dataToWrite/Example.txt
DATA Written
Now start the second container, mount the same volume created above, and check whether the same file is present in the second container:
$ docker run -it -v sharedVolume:/dataToWrite alpine
/ # cat /dataToWrite/Example.txt
DATA Written
As you can see above, the first container is ubuntu and the second container is alpine. Content written in the first container is present in the second container.
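Applied to the original question (container1 and container2 are the question's image names; the /data path and the writer/reader container names are assumptions, and both applications would need to be pointed at that shared path), this would look roughly like:
$ docker volume create sharedVolume
$ docker run -d --name writer -v sharedVolume:/data container1      # keeps updating /data/update.txt
$ docker run -d --name reader -v sharedVolume:/data:ro container2   # the web server reads the same /data/update.txt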

docker volume and VOLUME inside Dockerfile

I'm confused about the difference between docker volume create my-vol and VOLUME ["/var/www"] in a Dockerfile.
My understanding is:
1) docker volume create my-vol creates a persistent volume on our machine and each container could be linked to my-vol.
2) VOLUME ["/var/www"] creates a volume inside its container.
And when I create another container, I could link my-vol as follows:
when running a container
$ docker run -d --name devtest --mount source=myvol2,target=/app nginx:latest
In that case, if I added VOLUME ["/var/www"] to my Dockerfile, would the data be stored in both myvol2 and /var/www?
The Dockerfile VOLUME command says two things:
If the operator doesn't explicitly mount a volume on that specific container directory, create an anonymous one there anyway.
No Dockerfile step will ever be able to make further changes to that directory tree.
As an operator, you can mount a volume (either a named volume or a host directory) into a container with the docker run -v option. You can mount it over any directory in the container, regardless of whether or not there was a VOLUME declared for it in the Dockerfile.
(Since you can use docker run -v regardless of whether or not you declare a VOLUME, and it has confusing side effects, I would generally avoid declaring VOLUME in Dockerfiles.)
Just like in ordinary Linux, only one thing can be (usefully) mounted on any given directory. With the setup you describe, data will be stored in the myvol2 you create and mount, and it will be visible in /var/www in the container, but the data will only actually be stored in one place. If you deleted and recreated the container without the volume mount the data would not be there any more.
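To see this behaviour, you can inspect a container started from an image that declares VOLUME ["/var/www"] without passing any -v/--mount option (my-image and voltest are made-up names here):
$ docker run -d --name voltest my-image
$ docker inspect -f '{{ json .Mounts }}' voltest   # shows an anonymous volume with a random ID mounted at /var/www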
There are two types of persistent storage in Docker: the first is Docker volumes and the second is bind mounts. The difference between them is that volumes are internal to Docker and stored in the Docker store (usually under /var/lib/docker), while bind mounts use a physical location on your machine to store persistent data.
If you want to use a Docker Volume for nginx:
docker volume create nginx-vol
docker run -d --name devtest -v nginx-vol:/usr/share/nginx/html nginx
If you want to use a bind mount:
docker run -d --name devtest -v [path]:/usr/share/nginx/html nginx
[path] is the location in which you want to store the container's data.
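If you want to confirm where a named volume's data actually lives inside the Docker store, you can inspect it and look at the Mountpoint field:
$ docker volume inspect nginx-vol
# the "Mountpoint" field typically points to /var/lib/docker/volumes/nginx-vol/_data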

Is it possible to change the read-only/read-write status of a docker mount at runtime?

I have a dockerized application that uses the filesystem to store lots of state. The application code is contained in the Docker image.
I am considering a update strategy which involves sharing the volume between two containers, but making sure that at most one container at a time can write to that filesystem.
The workflow would be (a sketch of the corresponding docker run commands follows the list):
start container A with /data mounted rw
start container B with /data mounted ro, and a newer version of the application
stop serving requests to container A
for container A, make the /data mount read-only
for container B, make the /data mount read-write
start serving requests to container B
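In docker run terms, the first two steps might look like this (data, A, B, and the app:v1/app:v2 tags are placeholder names):
$ docker run -d --name A -v data:/data app:v1        # container A, /data mounted read-write
$ docker run -d --name B -v data:/data:ro app:v2     # container B, same volume mounted read-only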
You can remount the volume from inside the container in rw mode, like this:
mount -o remount,rw /mnt/data
The catch is that the mount syscall is not allowed inside Docker containers by default, so you would have to run the container in privileged mode:
docker run --privileged ...
or enable the SYS_ADMIN capability
SYS_ADMIN Perform a range of system administration operations.
docker run --cap-add=SYS_ADMIN --security-opt apparmor:unconfined
(Note that I also had to add --security-opt apparmor:unconfined to make this work on Ubuntu.)
Also, remounting the rw volume back to ro might be tricky, as some processes might already have files inside it open for writing, in which case the remount will fail with an "is busy" error message.
But my guess is that you can just restart the container instead (as it would be the one running an old version of the app).
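As a sketch of that check, assuming the volume is mounted at /mnt/data, the image ships the usual psmisc tools, and the container (named containerA here as a placeholder) was started with --privileged or --cap-add=SYS_ADMIN as described above:
$ docker exec containerA fuser -vm /mnt/data            # lists processes that still have files open on the mount
$ docker exec containerA mount -o remount,ro /mnt/data  # will fail with a "busy" error while writers remain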
Not exactly what the OP requested, but I've had a similar question where I needed to get data OUT of a running container, but the volume had been mounted read-only (RO).
Other ways to extract the data would have taken too long.
My approach? Stash the container as an image and start a new container from that image with the mount as RW :D
Initial container start:
docker run -p 80:8080 --mount type=bind,source="C:\data-folder-local\",target=/data-folder-container-ro,readonly -d imageName:imageTag
Make an image from the container (you can stop this container before or after if you want):
docker commit -a "mud" -m "Damn, mount should be rw, stashing a snapshot to reuse." CONTAINER_ID_HERE snapshotImageName:snapshotImageTag
where CONTAINER_ID_HERE is taken from the output of docker ps (https://docs.docker.com/engine/reference/commandline/ps/).
Start a new container from the image you just made, but this time with a writable mount:
docker run -p 80:8080 --mount type=bind,source="C:\data-folder-local\",target=/data-folder-container-rw -d snapshotImageName:snapshotImageTag
Now you can write files to the mounted folder (on the local system) from within your container :D
Hope that helps somebody.

How to remove a mount for existing container?

I'm learning Docker and reading their chapter "Manage data in containers". In the section "Mount a host directory as a data volume", they mention the following paragraph:
In addition to creating a volume using the -v flag you can also mount a directory from your Docker engine’s host into a container.
$ docker run -d -P --name web -v /src/webapp:/opt/webapp training/webapp python app.py
This command mounts the host directory, /src/webapp, into the container at /opt/webapp. If the path /opt/webapp already exists inside the container’s image, the /src/webapp mount overlays but does not remove the pre-existing content. Once the mount is removed, the content is accessible again. This is consistent with the expected behavior of the mount command.
Experiment 1
Then when I tried to run this command and inspect the container, I found that the container doesn't actually run. I used docker logs web and found this error:
can't open file 'app.py': [Errno 2] No such file or directory
I assume that the /src/webapp mount overlays /opt/webapp, and since my host directory has no content, app.py cannot be found.
Question 1
How can I remove this mount and check if the content is still there as the quote said?
Experiment 2
When I tried to run
$ docker run -d -P --name web2 -v newvolume:/opt/webapp training/webapp python app.py
I found that the container ran correctly. Then I used docker exec -it web2 /bin/bash and found that all of the existing content is still inside /opt/webapp. I can also add more files there. So in this case, it looks like the volume does not overlay the directory but is combined with it. If I use docker inspect web2 and check Mounts, I see that the volume is created under /var/lib/docker/volumes/newvolume/_data
Question 2
If I give a name instead of an absolute host path, will the volume not overlay the container dir /opt/webapp, but instead connect the two directories together?
An alternative solution is to commit the container (or export it) using the Docker CLI and re-create it without the mapping.
Question 1 How can I remove this mount and check if the content is still there as the quote said?
You would create a new container without the volume mount. E.g.
$ docker run -d -P --name web training/webapp python app.py
(Theoretically it's possible to perform some privileged operations to remove the mount on a running container, but inside the container you will not normally have this permission, and it's a good practice to get into the habit of treating containers as ephemeral.)
Question 2 If I give a name instead of a host-dir absolute path, then the volume will not overlay the container-dir /opt/webapp but connect the two dir together?
Almost. What's happening with named volumes is that Docker provides an initialization step when the volume is empty and a container is created with that volume mount. The initialization step copies the contents of the image at that directory into the volume, including all files and directories recursively, along with their ownership and permissions. This is very useful for running containers as a non-root user with a volume directory that the user inside the container needs to be able to write to. After that initialization has happened, future containers with the same named volume will skip the initialization, even if the image content has changed, e.g. if you add new content into the image.
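A quick way to observe that initialization, reusing the training/webapp image from the question with a fresh named volume (newvol is just an example name):
$ docker volume create newvol
$ docker run --rm -v newvol:/opt/webapp training/webapp ls /opt/webapp
# the image's /opt/webapp content, including app.py, has been copied into the empty volume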

How do Docker container volumes work even when the container isn't running?

Take a typical data-only Docker container:
FROM stackbrew/busybox:latest
RUN mkdir /data
VOLUME /data
Now I have seen a great deal of them that are run like this:
docker run --name my-data data true
The true command exits as soon as it runs, and so does the container. But surprisingly it continues to serve the volume when you connect it with another container via --volumes-from my-data.
My question is, how does that work? How does a stopped container still allow access to its volumes?
Volumes in Docker are not a top-level thing. They are "simply" part of a container's metadata.
When you have VOLUME in your Dockerfile or start a container with -v, Docker will create a directory in /var/lib/docker/volumes* with a random ID (this is the exact same process as creating an image with commit, except it is empty) and add that random ID to the container's metadata.
When the container starts, Docker will bind-mount the directory /var/lib/docker/volumes/* at the given location for that volume.
When you use --volumes-from, Docker will just look up the volume ID and location from the other container, running or not, and bind-mount the directory at the set location.
Volumes are not tied to the container's runtime state; they are just directories that get mounted.
* With newer versions, Docker uses the vfs driver for volume storage, and /var/lib/docker/volumes/ is used only for metadata like size, creation time, etc. The actual data is stored in /var/lib/docker/vfs/dir/<volume id>
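You can see that metadata directly on the stopped container. With a reasonably recent Docker (older versions exposed this under .Volumes rather than .Mounts):
$ docker run --name my-data data true
$ docker inspect -f '{{ json .Mounts }}' my-data
# the volume's host directory is listed even though the container has exited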
