Does Docker swarm ignore VOLUME commands from Dockerfiles? - docker

We are deploying a stack; the compose file lists two services:
mediawiki
mySQL
We saw that the mysql Dockerfile has a VOLUME directive to persist the databases to a Docker volume.
Yet, if we docker stack rm then docker stack deploy our compose file, we lose all the database content.
Is that the expected behaviour? What would be the rationale for it?

Swarm mode doesn't build docker images, so it's safe to say that docker swarm is ignoring every line of your Dockerfile. The resulting image will have a volume defined, which will be created when the container runs, whether that is inside or outside of swarm mode. If you do not specify a volume mount at runtime, docker will give you an anonymous volume at that location, which is difficult to identify later on and is local to the node where the container is running. With a new stack, the previous anonymous volume won't be used, and in various situations the anonymous volume can be automatically deleted (e.g. when containers are configured to automatically delete on exit, though I'm not sure whether that applies to swarm mode).
A named volume can make the data easier to reuse in a single-node cluster. When you get to a multi-node cluster, you need to move this data off of the node where the container is running. For details on how to use something like NFS for external data storage, see this answer to a related question.
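As a rough illustration, a named volume backed by NFS can be declared straight in the compose file; the server address, export path, and volume name here are just placeholders:

volumes:
  mysql-data:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=nfs.example.com,rw"
      device: ":/exports/mysql-data"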

The Dockerfile declares a volume, but you probably haven't mounted that volume anywhere accessible outside of that container, which means it probably isn't being persisted in shared storage that can be reused between containers/container deployments.
In your database service in the docker-compose.yml file you're going to need a volumes section. I'm omitting your current setup since you didn't provide it, and I'm going to replace it with an ellipsis (...).
mySQL:
  ...
  volumes:
    - mySQL-data:/var/lib/mysql
This tells the system that you want to persist the /var/lib/mysql directory outside of the container, in a shared volume that I arbitrarily named mySQL-data.
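Note that in a compose/stack file, a named volume referenced by a service also has to be declared in the top-level volumes: section, otherwise the deploy will complain about an undefined volume; a minimal sketch using the same placeholder name:

volumes:
  mySQL-data: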

Being in swarm mode is not relevant in this case.
If you are only relying on the anonymous volume defined in the Dockerfile, running a new container will create and mount a new fresh volume. You need to specifically mount a named volume at container start (in your case add it in your compose file) to remount the same volume between runs.
If you need to remount a lost data volume, it might still be possible if you did not prune data on your server. You will just need to find the relevant volume (the one whose name is a hash), possibly rename it, and remount it in your new container.
I ran through the following scenario to illustrate my point:
First get the image and have a look:
$ docker pull mysql:latest
$ docker image inspect mysql:latest
From this last command we can see there is a volume declared for /var/lib/mysql
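If you only want that one field rather than the full output, a format string narrows it down, e.g.:
$ docker image inspect -f '{{ json .Config.Volumes }}' mysql:latest
{"/var/lib/mysql":{}}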
I'm on my dev machine and have cleaned up everything, so I have no volumes at this time:
$ docker volume ls
DRIVER VOLUME NAME
$
Start a container then look at volumes again
$ docker run -d --name test -e MYSQL_ALLOW_EMPTY_PASSWORD=1 mysql:latest
37a92341f52b189d00636d1f03ecfbd4e3e7e5d55b685f5ec254971d7732566c
$ docker volume ls
DRIVER VOLUME NAME
local 50f9810d13271c7c91b7e025e139db480f288a936f0a55f85d9580ef3aa83af7
Now add some data to the container
$ docker exec -it test bash
root@37a92341f52b:/# mysql
Welcome to the MySQL monitor. [... snip ...]
mysql> create database testso;
Query OK, 1 row affected (0.03 sec)
mysql> use testso;
Database changed
mysql> create table test (id int not null primary key);
Query OK, 0 rows affected (0.08 sec)
mysql> insert into test values (1);
Query OK, 1 row affected (0.05 sec)
mysql> select * from test;
+----+
| id |
+----+
| 1 |
+----+
1 row in set (0.00 sec)
mysql> exit
Bye
root@37a92341f52b:/# exit
$
Create a new container
$ docker rm -f test
test
$ docker run -d --name test -e MYSQL_ALLOW_EMPTY_PASSWORD=1 mysql:latest
79148de09d7a3e13db338da133cfd7d44fe3590dc1c7ffe6129722c5c6baea21
$ docker volume ls
DRIVER VOLUME NAME
local 50f9810d13271c7c91b7e025e139db480f288a936f0a55f85d9580ef3aa83af7
local ef6fb2647c11c10ef98d25ca0dc2bd43231729ed18386c65768a0ad808fca93b
As you can see, we have a second volume for the new container. I will not show it here, but if I connect, the data is empty, as in your case. Now let's try to recover.
First a little cleanup
$ docker rm -f test
test
$ docker volume rm ef6fb2647c11c10ef98d25ca0dc2bd43231729ed18386c65768a0ad808fca93b
ef6fb2647c11c10ef98d25ca0dc2bd43231729ed18386c65768a0ad808fca93b
We want a human-readable name, but we cannot rename a volume. What I did is mount the old volume and a new named volume in a busybox container to transfer the data over:
$ docker run -it --rm -v 50f9810d13271c7c91b7e025e139db480f288a936f0a55f85d9580ef3aa83af7:/mysql/old -v mysql_data:/mysql/new busybox:latest
/ # cd /mysql/
/mysql # mv old/* new/
/mysql # exit
We now have this new volume and we can get rid of the anonymous one:
$ docker volume ls
DRIVER VOLUME NAME
local 50f9810d13271c7c91b7e025e139db480f288a936f0a55f85d9580ef3aa83af7
local mysql_data
$ docker volume rm 50f9810d13271c7c91b7e025e139db480f288a936f0a55f85d9580ef3aa83af7
50f9810d13271c7c91b7e025e139db480f288a936f0a55f85d9580ef3aa83af7
And finally we remount the named volume in a fresh mysql container to get our data back
$ docker run -d --name test -e MYSQL_ALLOW_EMPTY_PASSWORD=1 -v mysql_data:/var/lib/mysql mysql:latest
3f6eff0b7660f3f8e9518564affc6555acb17184845156099d18300b3e76f4a2
$ docker exec -it test bash
root@3f6eff0b7660:/# mysql
Welcome to the MySQL monitor. [... snip ...]
mysql> use testso
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select * from test;
+----+
| id |
+----+
| 1 |
+----+
1 row in set (0.01 sec)

Related

My changes were lost in new Docker container

Steps to reproduce:
Download and run postgres:9.6.24:
docker run --name my_container --restart=always -d -p 127.0.0.1:5432:5432 -e POSTGRES_PASSWORD=pgmypass postgres:9.6.24
Here is the result:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
879883bfc84a postgres:9.6.24 "docker-entrypoint.s…" 26 seconds ago Up 25 seconds 127.0.0.1:5432->5432/tcp my_container
OK.
Open the file /var/lib/postgresql/data/pg_hba.conf inside the container:
docker exec -it my_container bash
root@879883bfc84a:/# cat /var/lib/postgresql/data/pg_hba.conf
IPv4 local connections:
host all all 127.0.0.1/32 trust
Replace the file /var/lib/postgresql/data/pg_hba.conf inside the container with my file. Copy and overwrite my file from host to container:
tar --overwrite -c pg_hba.conf | docker exec -i my_container /bin/tar -C /var/lib/postgresql/data/ -x
Make sure the file has been modified. Go inside the container and open the changed file:
docker exec -it my_container bash
root@879883bfc84a:/# cat /var/lib/postgresql/data/pg_hba.conf
IPv4 local connections:
host all all 0.0.0.0/0 trust
As you can see, the content of the file was changed.
Create a new image from the container:
docker commit my_container
See result:
docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
<none> <none> ee57ad4bc6b4 3 seconds ago 200MB
postgres 9.6.24 027ccf656dc1 12 months ago 200MB
Now tag my new image
docker tag ee57ad4bc6b4 my_new_image:1.0.0
See result:
docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
my_new_image 1.0.0 ee57ad4bc6b4 About a minute ago 200MB
postgres 9.6.24 027ccf656dc1 12 months ago 200MB
OK.
Stop and delete the old container:
docker stop my_container
docker rm my_container
See result:
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
As you can see, no container exists anymore. OK.
Create a new container from the new image:
docker run --name my_new_container --restart=always -d -p 127.0.0.1:5432:5432 -e POSTGRES_PASSWORD=pg1210 my_new_image:1.0.0
See result:
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
3a965dbbd991 my_new_image:1.0.0 "docker-entrypoint.s…" 7 seconds ago Up 6 seconds 127.0.0.1:5432->5432/tcp my_new_container
Open the file /var/lib/postgresql/data/pg_hba.conf inside the container:
docker exec -it my_new_container bash
root@879883bfc84a:/# cat /var/lib/postgresql/data/pg_hba.conf
IPv4 local connections:
host all all 127.0.0.1/32 trust
As you can see, my changes to the file are lost. The content of the file is the original, not my changes.
P.S. This problem occurs only with the file pg_hba.conf. E.g. if I create a folder and file in the container, such as /Downloads/myfile.txt, that file is not lost in my container "my_new_container".
Editing files inside a container with docker exec will, in general, cause you to lose work. You mention docker commit, but that's almost never a best practice. (If this were successful, but then you discovered that PostgreSQL 9.6.24 specifically had some critical bug and you had to upgrade, could you recreate the exact same image?)
In the case of the postgres image, the files in /var/lib/postgresql/data are always stored in a Docker volume or mount point. In your case you didn't use a docker run -v option, but the image is configured to create an anonymous volume in that directory. The volume is not included in docker commit, which is why you're not seeing it on the rebuilt container. (Also see docker postgres with initial data is not persisted over commits.)
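If all you need is for the database contents to survive recreating the container, explicitly naming that volume at run time would be enough; a sketch of your original command, where the volume name pg_data is arbitrary:
docker run --name my_container --restart=always -d \
  -p 127.0.0.1:5432:5432 -e POSTGRES_PASSWORD=pgmypass \
  -v pg_data:/var/lib/postgresql/data postgres:9.6.24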
For editing a configuration file, the easiest thing to do is to store the data on the host system. Create a directory to hold it, and extract the configuration file from the image. (Since the data directory is created by the image's startup script, you need a slightly longer path to get it out.)
mkdir pgdata
docker run -d --name pgtmp postgres:9.6.24
docker cp pgtmp:/var/lib/postgresql/data/pg_hba.conf ./pgdata
docker stop pgtmp
docker rm pgtmp
$EDITOR pgdata/pg_hba.conf
Now when you run the container, provide this data directory as a bind mount. That will inject the configuration file, but also cause the database data to persist over container exits.
docker run -v "$PWD/pgdata:/var/lib/postgresql/data" -u $(id -u) ... postgres:9.6.24
Note that this sequence doesn't use docker exec or "go inside" containers at all, and you haven't created an image without corresponding source. Everything is run with commands from the host. If you do need to reset the database data, in this setup, it's just files, and you can rm -rf pgdata, maybe saving the modified configuration file along the way.
(If I'm reading this configuration change correctly, you're trying to globally disable passwords and instead allow trust authentication for all inbound connections. That's not usually a good idea, especially since username/password authentication is standard in every database library I've encountered. You probably still want the volume to persist data, but I might not make this change to pg_hba.conf.)
A Docker container is based on a read-only image, which means that if you create a file inside the container, then remove and re-create the container, the file is not supposed to be there.
What you want to do is one of two things:
Map your container to a local directory (volume)
Create a Dockerfile based on the postgres image, and apply these modifications in a script that your Dockerfile copies into the image (see the sketch below).
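For the second option, the postgres image runs scripts placed in /docker-entrypoint-initdb.d on first initialization, so a Dockerfile along these lines could apply the change (the script name is made up for the example):
FROM postgres:9.6.24
COPY update-pg-hba.sh /docker-entrypoint-initdb.d/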
docker volume usages
Dockerfile Reference

Strange issue about applying docker named volume for new container

As I understand from all the information I found, a docker volume can be created in three ways:
1- by omitting the host path (it will automatically create a directory with a random ID)
2- by specifying the host path (it will also automatically create a directory with a random ID)
3- by creating a named volume and specifying it for the container path
So I tried the first two ways:
$ docker run --name mongo-docker -v /data/db -p 27017:27017 -d mongo
$ docker run --name mongo-docker2 -v $(pwd)/data/:/data/db -p 27222:27017 -d mongo
And I look at the docker volume list:
$ docker volume ls
DRIVER VOLUME NAME
local 5b829a731245cb7fe3a1f28aca4c4c3c3791105be228182ccb9b2f72319180c8
local fb058e804412fb56b2096e2cb903e3ae73647ef6ca076ad9003708b80f94ffc5
It looks just like what I expected.
But when I tried the last one, by first creating a volume:
$ docker volume create mongoVol
mongoVol
$ docker volume ls
DRIVER VOLUME NAME
local mongoVol
and used it as the volume for the container, it came up like this:
$ docker run --name mongo-docker3 -v mongoVol:/data/db -p 27322:27017 -d mongo
86bea0e52c9f395268665e191edc59f795d07266f17667502c7fa32879a6e021
$ docker volume ls
DRIVER VOLUME NAME
local 0de25c92be504d0a6b9bb9c83aa8a6fe17bf9bc195562314ca49edb1c4cf4377 <=== create a new one for new container?
local mongoVol
Why does this create a new volume? Shouldn't it just use the "mongoVol" volume?
I can't find answers to this in any forum, post, or video....
The mongo image's Dockerfile has two directories named in a VOLUME statement. You're mounting content on /data/db but not on /data/configdb.
If the Dockerfile declares a directory as a VOLUME and nothing is explicitly mounted there, Docker automatically creates an anonymous volume (your first case). That's what results in the additional volume appearing in the docker volume ls listing.
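If you want to avoid the extra anonymous volume entirely, you could name both declared directories; a sketch, where mongoConfig is just a name chosen for the example:
$ docker run --name mongo-docker3 -v mongoVol:/data/db -v mongoConfig:/data/configdb -p 27322:27017 -d mongo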

Why some volumes are already created inside docker engine?

Whenever I run the below command:
docker volume ls
I can see some volumes already created in my docker engine.
DRIVER VOLUME NAME
local 5df9458932cd504e10b2b37856c434cbdf3876733684b100cbf390c965ac9581
local 6f7037bc33861a5e42a9f8bcd699f8184ff1916a297a718ccc4df5f369d07530
local 8a86c462020f35f1051b47c48555228a1df359251f2496c32ed45a9081bb1872
local 85ed838d2e081eddc672fd8ddb15bbb3eecc73adb270678c98b7c50a03ecb2fc
Why are those volumes created?
How can I find out for what purpose they exist?
If you start a Docker container with a volume that doesn't have a name or host mount point, Docker will create a unique name for it. These docs briefly mention anonymous volumes like this. Most likely, a Dockerfile had a VOLUME section and wasn't run with a corresponding --mount or -v flag to bind some local volume to the container's volume.
Also see this devops stack exchange answer.
Here's an example of when an anonymous volume is created:
Dockerfile with anonymous volumes:
FROM alpine:3.9
VOLUME ["/root", "/test"]
Building/running container without mounting or otherwise naming the /root, /test volumes:
$ docker volume ls
DRIVER VOLUME NAME
$ docker build -t test .
$ docker run -it --rm -d --name volume-test test:latest sh
$ docker volume ls
DRIVER VOLUME NAME
local 5b332abd25b77c1ac324a0e3c00dc9a554cfe80c996a20bd77ef10c35c8ef98a
local 05c903f47f3f3666e03ee06154ff54b23547a5cc65750ca18bb40be40ed4049c
local 6f595aada6ae7c9fb16831996c2bdd8d652bec55a7cedf96afef95aec8f4e6e1
local 7f54c9dbbec46acc5a843499c65a50e23a78baa884facd026704d0dcb0362c9e
local 47a791197d6164757b015df1e2aba48bac3999720ead6b5981820a3aaece4113
local 214155fe63200cc859c1eddd2b31aa990fd6eb7c8614aa02bd8b57690b0fe53e
Of course, you can always inspect the volumes to try to find out where they came from but this may or may not be useful for you:
docker inspect 5b332abd25b77c1ac324a0e3c00dc9a554cfe80c996a20bd77ef10c35c8ef98a
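Another way to trace an anonymous volume back is to list the containers (including stopped ones) that reference it, for example:
docker ps -a --filter volume=5b332abd25b77c1ac324a0e3c00dc9a554cfe80c996a20bd77ef10c35c8ef98a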

Can one docker user hide data from another?

Alice and Bob are both members of the docker group on the same host. Alice wants to run some long-running calculations in a docker container, then copy the results to her home folder. Bob is very nosy, and Alice doesn't want him to be able to read the data that her calculation is using.
Is there anything that the system administrator can do to keep Bob out of Alice's docker containers?
Here's how I think Alice should get data in and out of her container, based on named volumes and the docker cp command, as described in this question and this one.
$ pwd
/home/alice
$ date > input1.txt
$ docker volume create sandbox1
sandbox1
$ docker run --name run1 -v sandbox1:/data alpine echo OK
OK
$ docker cp input1.txt run1:/data/input1.txt
$ docker run --rm -v sandbox1:/data alpine sh -c "cp /data/input1.txt /data/output1.txt && date >> /data/output1.txt"
$ docker cp run1:/data/output1.txt output1.txt
$ cat output1.txt
Thu Oct 5 16:35:30 PDT 2017
Thu Oct 5 23:36:32 UTC 2017
$ docker container rm run1
run1
$ docker volume rm sandbox1
sandbox1
$
I create an input file, input1.txt and a named volume, sandbox1. Then I start a container named run1 just so I can copy files into the named volume. That container just prints an "OK" message and quits. I copy the input file, then run the main calculation. In this example, it copies the input to the output and adds a second timestamp to it.
After the calculation finishes, I copy the output file, then remove the container and the named volume.
Is there any way to stop Bob from loading his own container that mounts the named volume and shows him Alice's data? I've set up Docker to use a user namespace, so Alice and Bob don't have root access to the host, but I can't see how to make Alice and Bob use different user namespaces.
Alice and Bob have been granted virtual root access to the host by being in the docker group.
The docker group grants them access to the Docker API via a socket file. There is no facility in Docker at the moment to differentiate between users of the Docker API. The Docker daemon runs as root and by virtue of what the Docker API allows, Alice and Bob will be able to work around any barriers that you did try to put in place.
User Namespaces
User namespace isolation stops users inside a container from breaking out of the container as a privileged or different user; in effect, the container process now runs as an unprivileged user on the host.
An example would be
Alice is given ssh access to container A running in namespace_a.
Bob is given ssh access to container B in namespace_b.
Because the users are now only inside the container, they won't be able to modify each other's files on the host. Even if both containers mapped the same host volume, files without world read/write/execute permissions would be safe from each other's containers. As they have no control over the daemon, they can't do anything to break out.
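(For reference, this is the kind of isolation that the daemon-wide userns-remap setting enables; a sketch of /etc/docker/daemon.json is below. It changes how container processes map to host users, but does nothing to restrict API access.)
{
  "userns-remap": "default"
}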
Docker Daemon
The namespace doesn't secure the Docker daemon and API itself, which is still a privileged process. The first way around a user namespace is setting the host namespace on the command line:
docker run --privileged --userns=host busybox fdisk -l
The docker exec, docker cp and docker export commands will give someone with access to the Docker API the contents of any created containers.
Restricting Docker Access
It is possible to restrict access to the API, but then you can't put users with shell access in the docker group.
One option is allowing a limited set of docker commands via sudo, or providing sudo access to scripts that hard-code the docker parameters:
#!/bin/sh
docker run --userns=whom image command
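Such a wrapper script could then be whitelisted in sudoers so that, for example, Alice can run only that exact command (the path and username are placeholders):
alice ALL=(root) NOPASSWD: /usr/local/bin/run-alice-job.sh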
For automated systems, access can be provided via an additional shim API with appropriate access controls in front of the Docker API that then passes on the "controlled" request to Docker. dockerode or docker-py can be easily plugged into a REST service and interface with Docker.

Port data out of docker container

I use this method below to port data out of one container.
docker run --volumes-from <data container> ubuntu tar -cO <volume path> | gzip -c > volume.tgz
But there is one problem with it: every time it performs a backup, a zombie container is left behind. What is a good way to get that container's ID and remove the zombie container afterward?
Thanks
Apparently, you just want to be able to export volume data. To do that, you just need to start your initial container with a volume pointing to a directory on the host with the -v option. You can tar on the host without creating a container for it. Your current tactic seems a bit over-engineered ;)
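A rough sketch of that approach, with placeholder paths and command: run the job with a bind-mounted host directory, then archive directly on the host:
docker run --rm -v "$PWD/data:/data" ubuntu <your command writing to /data>
tar -czf volume.tgz -C ./data .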
The easy way to remove the container after executing the command is to use the --rm option, from here:
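Applied to your original command, that looks like:
docker run --rm --volumes-from <data container> ubuntu tar -cO <volume path> | gzip -c > volume.tgz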
However, if you feel that the container you are creating will have data that you will need to
1. update in real time
2. access after the container has been created
then you may also mount a host directory as a container volume and access the contents of that directory from the host.
If you start a container using the --volume (-v) option, you can also reference the volume directory created on the host:
$ docker run -v /volume_directory ubuntu
$ container=$(docker ps -n=1 -q)
$ docker inspect -f '{{.Volumes}}' $container
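(On newer Docker versions the .Volumes template field is gone from the inspect output; the equivalent information lives under .Mounts, e.g.:)
$ docker inspect -f '{{ json .Mounts }}' $container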
