I am Dockerising an old project. A feature in the project pulls in user-specified Git repos, and since the size of a repo could cause the filing system to be overwhelmed, I created a local filing system of a fixed size, and then mounted it. This was intended to prevent the web host from having its file system filled up.
The general approach is this:
IMAGE=filesystem/image.img
MOUNT_POINT=filesystem/mount
SIZE=20
PROJECT_ROOT=`pwd`
# Number of M to set aside for this filing system
dd if=/dev/zero of=$IMAGE bs=1M count=$SIZE &> /dev/null
# Format: the -F permits creation even though it's not a "block special device"
mkfs.ext3 -F -q $IMAGE
# Mount if the filing system is not already mounted
$MOUNTCMD | cut -d ' ' -f 3 | grep -q "^${PROJECT_ROOT}/${MOUNT_POINT}$"
if [ $? -ne 0 ]; then
# -p Create all parent dirs as necessary
mkdir -p $MOUNT_POINT
/bin/mount -t ext3 $IMAGE $MOUNT_POINT
fi
This works fine in a Linux local or remote VM. However, I'd like to run this shell code, or something like it, inside a container. Part of the reason I'd like to do that is to contain all fiddly stuff inside a container, so that building a new host machine is as kept as simple as possible (in my view, setting up custom mounts and cron-restart rules on the host works against that).
So, this command does not work inside a container ("filesystem" is an on-host Docker volume)
mount -t ext3 filesystem/image.img filesystem/mount
mount: can't setup loop device: No space left on device
It also does not work on a container folder ("filesystem2" is a container directory):
dd if=/dev/zero of=filesystem2/image.img bs=1M count=20
mount -t ext3 filesystem2/image.img filesystem2/mount
mount: can't setup loop device: No space left on device
I wonder whether containers just don't have the right internal machinery to do mounting, and thus whether I should change course. I'd prefer not to spend too much time on this (I'm just moving a project to a Docker-only server) which is why I would like to get mount working if I can.
Other options
If that's not possible, then a size-limited Docker volume, that works with both Docker and Swarm, may be an alternative I'd need to look into. There are conflicting reports on the web as to whether this actually works (see this question).
There is a suggestion here to say this is supported in Flocker. However, I am hesitant to use that, as it appears to be abandoned, presumably having been affected by ClusterHQ going bust.
This post indicates I can use --storage-opt size=120G with docker run. However, it does not look like it is supported by docker service create (unless perhaps the option has been renamed).
Update
As per the comment convo, I made some progress; I found that adding --privileged to the docker run enables mounting, at the cost of removing security isolation. A helpful commenter says that it is better to use the more fine-grained control of --cap-add SYS_ADMIN, allowing the container to retain some of its isolation.
However, Docker Swarm has not yet implemented either of these flags, so I can't use this solution. This lengthy feature request suggests to me that this feature is not going to be added in a hurry; it's been pending for two years already.
You won't be able to safely do this inside of a container. Docker removes the mount privilege from containers because using this you could mount the host filesystem and escape the container. However, you can do this outside of the container and mount the filesystem into the container as a volume using the default local driver. The size option isn't supported by most filesystems, tmpfs being one of the few exceptions. Most of them use the size of the underlying device which you defined with the image file creation command:
dd if=/dev/zero of=filesystem/image.img bs=1M count=$SIZE
I had trouble getting docker to create the loop device dynamically, so here's the process to create it manually:
$ sudo losetup --find --show ./vol-image.img
/dev/loop0
$ sudo mkfs -t ext3 /dev/loop0
mke2fs 1.43.4 (31-Jan-2017)
Creating filesystem with 10240 1k blocks and 2560 inodes
Filesystem UUID: 25c95fcd-6c78-4b8e-b923-f808517b28df
Superblock backups stored on blocks:
8193
Allocating group tables: done
Writing inode tables: done
Creating journal (1024 blocks): done
Writing superblocks and filesystem accounting information: done
When defining the volume mount options are passed almost verbatim from the mount command you run on the command line:
docker volume create --driver local --opt type=ext3 \
--opt device=filesystem/image.img app_vol
docker service create --mount type=volume,src=app_vol,dst=/filesystem/mount ...
or in a single service create command:
docker service create \
--mount type=volume,src=app_vol,dst=/filesystem/mount,volume-driver=local,volume-opt=type=ext3,volume-opt=device=filesystem/image.img ...
With docker run, the command looks like:
$ docker run -it --rm --mount type=volume,dst=/data,src=ext3vol,volume-driver=local,volume-opt=type=ext3,volume-opt=device=/dev/loop0 busybox /bin/sh
/ # ls -al /data
total 17
drwxr-xr-x 3 root root 1024 Sep 19 14:39 .
drwxr-xr-x 1 root root 4096 Sep 19 14:40 ..
drwx------ 2 root root 12288 Sep 19 14:39 lost+found
The only prerequisite is that you create this file and loop device before creating the service, and that this file is accessible wherever the service is scheduled. I would also suggest making all of the paths in these commands fully qualified rather than relative to the current directory. I'm pretty sure there are a few places that relative paths don't work.
I have found a size-limiting solution I am happy with, and it does not use the Linux mount command at all. I've not implemented it yet, but the tests documented below are satisfying enough. Readers may wish to note the minor warnings at the end.
I had not tried mounting Docker volumes prior to asking this question, since part of my research stumbled on a Stack Overflow poster casting doubt on whether Docker volumes can be made to respect a size limitation. My test indicates that they can, but you may wish to test this on your own platform to ensure it works for you.
Size limit on Docker container
The below commands have been cobbled together from various sources on the web.
To start with, I create a volume like so, with a 20m size limit:
docker volume create \
--driver local \
--opt o=size=20m \
--opt type=tmpfs \
--opt device=tmpfs \
hello-volume
I then create an Alpine Swarm service with a mount on this container:
docker service create \
--mount source=hello-volume,target=/myvol \
alpine \
sleep 10000
We can ensure the container is mounted by getting a shell on the single container in this service:
docker exec -it amazing_feynman.1.lpsgoyv0jrju6fvb8skrybqap
/ # ls - /myvol
total 0
OK, great. So, while remaining in this shell, let's try slowly overwhelming this disk, in 5m increments. We can see that it fails on the fifth try, which is what we would expect:
/ # cd /myvol
/myvol # ls
/myvol # dd if=/dev/zero of=image1 bs=1M count=5
5+0 records in
5+0 records out
/myvol # dd if=/dev/zero of=image2 bs=1M count=5
5+0 records in
5+0 records out
/myvol # ls -l
total 10240
-rw-r--r-- 1 root root 5242880 Sep 16 13:11 image1
-rw-r--r-- 1 root root 5242880 Sep 16 13:12 image2
/myvol # dd if=/dev/zero of=image3 bs=1M count=5
5+0 records in
5+0 records out
/myvol # dd if=/dev/zero of=image4 bs=1M count=5
5+0 records in
5+0 records out
/myvol # ls -l
total 20480
-rw-r--r-- 1 root root 5242880 Sep 16 13:11 image1
-rw-r--r-- 1 root root 5242880 Sep 16 13:12 image2
-rw-r--r-- 1 root root 5242880 Sep 16 13:12 image3
-rw-r--r-- 1 root root 5242880 Sep 16 13:12 image4
/myvol # dd if=/dev/zero of=image5 bs=1M count=5
dd: writing 'image5': No space left on device
1+0 records in
0+0 records out
/myvol #
Finally, let's see if we can get an error by overwhelming the disk in one go, in case the limitation only applies to newly opened file handles in a full disk:
/ # cd /myvol
/ # rm *
/myvol # dd if=/dev/zero of=image1 bs=1M count=21
dd: writing 'image1': No space left on device
21+0 records in
20+0 records out
It turns out we can, so that looks pretty robust to me.
Nota bene
The volume is created with a type and a device of "tmpfs", which sounded to me worryingly like a RAM disk. I've successfully checked that the volume remains connected and intact after a system reboot, so it looks good to me, at least for now.
However, I'd say that when it comes to organising your data persistence systems, don't just copy what I have. Make sure the volume is robust enough for your use case before you put it into production, and of course, make sure you include it in your back-up process.
(This is for Docker version 18.06.1-ce, build e68fc7a).
Related
I'm working on a build step that handles common deployment tasks in a Docker Swarm Mode cluster. As this is a common problem for us and for others, we've shared this build step as a BitBucket pipe: https://bitbucket.org/matchory/swarm-secret-pipe/
The pipe needs to use the docker command to work with a remote Docker installation. This doesn't work, however, because the docker executable cannot be found when the pipe runs.
The following holds true for our test repository pipeline:
The docker option is set to true:
options:
docker: true
The docker service is enabled for the build step:
main:
- step:
services:
- docker: true
Docker works fine in the repository pipeline itself, but not within the pipe.
Pipeline log shows the docker path being mounted into the pipe container:
docker container run \
--volume=/opt/atlassian/pipelines/agent/build:/opt/atlassian/pipelines/agent/build \
--volume=/opt/atlassian/pipelines/agent/ssh:/opt/atlassian/pipelines/agent/ssh:ro \
--volume=/usr/local/bin/docker:/usr/local/bin/docker:ro \
--volume=/opt/atlassian/pipelines/agent/build/.bitbucket/pipelines/generated/pipeline/pipes:/opt/atlassian/pipelines/agent/build/.bitbucket/pipelines/generated/pipeline/pipes \
--volume=/opt/atlassian/pipelines/agent/build/.bitbucket/pipelines/generated/pipeline/pipes/matchory/swarm-secret-pipe:/opt/atlassian/pipelines/agent/build/.bitbucket/pipelines/generated/pipeline/pipes/matchory/swarm-secret-pipe \
--workdir=$(pwd) \
--label=org.bitbucket.pipelines.system=true \
radiergummi/swarm-secret-pipe:1.3.7#sha256:baf05b25b38f2a59b044e07f4ad07065de90257a000137a0e1eb71cbe1a438e5
The pipe is pretty standard and uses a recent Alpine image; nothing special in that regard. The PATH is never overwritten. Now for the fun part: If I do ls /usr/local/bin/docker inside the pipe, it shows an empty directory:
ls /usr/local/bin
total 16K
drwxr-xr-x 1 root root 4.0K May 13 13:06 .
drwxr-xr-x 1 root root 4.0K Apr 4 16:06 ..
drwxr-xr-x 2 root root 4.0K Apr 29 09:30 docker
ls /usr/local/bin/docker
total 8K
drwxr-xr-x 2 root root 4.0K Apr 29 09:30 .
drwxr-xr-x 1 root root 4.0K May 13 13:06 ..
ls: /usr/local/bin/docker/docker: No such file or directory
As far as I understand pipelines and Docker, /usr/local/bin/docker should be the docker binary file. Instead, it appears to be an empty directory for some reason.
What is going on here?
I've also looked at other, official, pipes. They don't do anything differently, but seem to be using the docker command just fine (eg. the Azure pipe).
After talking to BitBucket support, I solved the issue. As it turns out, if the docker context is changed, any docker command is sent straight to the remote docker binary, which (on our services) lives at a different path than in BitBucket Pipelines!
As we changed the docker context before using the pipe, and the docker instance mounted into the pipe still has the remote context set, but the pipe searches for the docker binary at another place, the No such file or directory error is thrown.
TL;DR: Always restore the default docker host/context before passing control to a pipe, e.g.:
script:
- export DEFAULT_DOCKER_HOST=$DOCKER_HOST
- unset DOCKER_HOST
- docker context create remote --docker "host=ssh://${DEPLOY_SSH_USER}#${DEPLOY_SSH_HOST}"
- docker context use remote
# do your thing
- export DOCKER_HOST=$DEFAULT_DOCKER_HOST # <------ restore the default host
- pipe: matchory/swarm-secret-pipe:1.3.16
I would like to monitor data written "inside" a Docker container, meaning data written to the backing filesystem by the overlay storage driver. Not data written to volumes, tmpfs or bind mounts. Typical monitoring tools, such as docker stats seem to report the total amount of data written.
BLOCK I/O The amount of data the container has read to and written from [sic] block devices on the host
Source: docker stats
The idea is to keep containers as read-only as possible, by finding "write-heavy" files / folders and moving them to volumes or bind mounts. So an ideal solution would not (only) show the data currently written, but the total amount of data written since the container was started, ideally breaking it down to single files.
At the moment I'm simply using find -type f -mtime x from the container shell, where x is a smaller than the image age, but there must be a better solution for this.
I'm using: Server Version: 18.06.1-ce, Storage Driver: overlay2, Backing Filesystem: extfs
Actually the docker storage driver itself provides the answer already.
Taking the overlay2 storage driver, which is the default driver on most distributions, as an example, we see that the container layer, where all data written to the container is stored, is kept in a separate folder:
Source: How the overlay driver works
Total amount of data written to the container layer
For a complete overview of what has been written to the container, we only have to take a look at the upperdir, which is called diff on the backing (host) file system.
The path of the diff folder can be found with
docker container inspect <container_name> --format='{{.GraphDriver.Data.UpperDir}}' # or
docker container inspect <container_name> | grep UpperDir
With default settings, this path points to /var/lib/docker/overlay2/. Note that access to the "inner workings" of docker requires root access on the host, and it's a good idea not to do any writes to these folders.
Now that we have the folder on the backing file system, we can simply du in much detail as we want. As a test example, I've used an alpine image that runs a script, which writes a 10 MB dummy file every 10 seconds.
root#testbox:/var/lib/docker/overlay2/83a825d...# du -h -d 1
8.0K ./work
216M ./diff
216M .
root#testbox:/var/lib/docker/overlay2/83a825d...# ll diff/tmp
total 220164
drwxrwxrwt 2 root root 4096 Okt 21 22:57 ./
drwxr-xr-x 3 root root 4096 Okt 21 22:53 ../
-rw-r--r-- 1 root root 9266613 Okt 21 22:53 dummy0.tar.gz
-rw-r--r-- 1 root root 9266613 Okt 21 22:55 dummy10.tar.gz
-rw-r--r-- 1 root root 9266613 Okt 21 22:55 dummy11.tar.gz
[...]
Hence, seeing all the files and folders written to the container is as easy as with any other directory.
I am trying to set up a multi-container service with docker-compose.
Some of the containers need to be restarted from a fresh container (eg. the file system should be like in the image) when they restart.
How can I achieve this?
I've found the restart: always option I can put on my service in the docker-compose.yml file, but that doesn't give me a fresh file system as it uses the same container.
I've also seen the --force-recreate option of docker-compose up, but that doesn't apply as that only recreates the containers when the command is runned.
EDIT:
This is probably not a docker-compose issue, but more of a general docker question: What is the best way to make sure a container is in a fresh state when it is restarted? With fresh state, I mean a state identical to that of a brand new container from the same image. Restarted is the docker command docker restart or docker stop and docker start.
In docker, immutability typically refers to the image layers. They are immutable, and any changes are pushed to a container specific copy-on-write layer of the filesystem. That container specific layer will last for the lifetime of the container. So to have those files not-persist, you have two options:
Recreate the container instead of just restart it
Don't write the changes to the container filesystem, and don't write them to any persistent volumes.
You cannot do #1 with a restart policy by it's very definition. A restart policy gives you the same container filesystem, with the application restarted. But if you use docker's swarm mode, it will recreate containers when they exit, so if you can migrate to swarm mode, you can achieve this result.
Option #2 looks more difficult than it is. If you aren't writing to the container filesystem, or to a volume, then where? The answer is a tmpfs volume that is only stored in memory and is lost as soon as the container exits. In compose, this is a tmpfs: /data/dir/to/not/persist line. Here's an example on the docker command line.
First, let's create a container with a tmpfs mounted at /data, add some content, and exit the container:
$ docker run -it --tmpfs /data --name no-persist busybox /bin/sh
/ # ls -al /data
total 4
drwxrwxrwt 2 root root 40 Apr 7 21:50 .
drwxr-xr-x 1 root root 4096 Apr 7 21:50 ..
/ # echo 'do not save' >>/data/tmp-data.txt
/ # cat /data/tmp-data.txt
do not save
/ # ls -al /data
total 8
drwxrwxrwt 2 root root 60 Apr 7 21:51 .
drwxr-xr-x 1 root root 4096 Apr 7 21:50 ..
-rw-r--r-- 1 root root 12 Apr 7 21:51 tmp-data.txt
/ # exit
Easy enough, it behaves as a normal container, let's restart it and check the directory contents:
$ docker restart no-persist
no-persist
$ docker attach no-persist
/ # ls -al /data
total 4
drwxr-xr-x 2 root root 40 Apr 7 21:51 .
drwxr-xr-x 1 root root 4096 Apr 7 21:50 ..
/ # echo 'still do not save' >>/data/do-not-save.txt
/ # ls -al /data
total 8
drwxr-xr-x 2 root root 60 Apr 7 21:52 .
drwxr-xr-x 1 root root 4096 Apr 7 21:50 ..
-rw-r--r-- 1 root root 18 Apr 7 21:52 do-not-save.txt
/ # exit
As you can see, the directory returned empty, and we can add data as needed back to the directory. The only downside of this is the directory will be empty even if you have content in the image at that location. I've tried combinations of named volumes, or using the mount syntax and passing the volume-nocopy option to 0, without luck. So if you need the directory to be initialized, you'll need to do that as part of your container entrypoint/cmd by copying from another location.
In order to not persist any changes to your containers it is enough that you don't map any directory from host to the container.
In this way, every time the containers runs (with docker run or docker-compose up ), it starts with a fresh file system.
docker-compose down also removes the containers, deleting any data.
The best solution I have found so far, is for the container itself to make sure to clean up when starting or stopping. I solve this by cleaning up when starting.
I copy my app files to /srv/template with the docker COPY directive in my Dockerfile, and have something like this in my ENTRYPOINT script:
rm -rf /srv/server/
cp -r /srv/template /srv/server
cd /srv/server
The following command works perfectly and the riak service starts as expected:
docker run --name=riak -d -p 8087:8087 -p 8098:8098 -v $(pwd)/schemas:/etc/riak/schema basho/riak-ts
The local schemas directory is mounted successfully and the sql file in it is read by riak. However if I try to mount the riak's data or log directories, the riak service does not start and timeouts after 15 seconds:
docker run --name=riak -d -p 8087:8087 -p 8098:8098 -v $(pwd)/logs:/var/log/riak -v $(pwd)/schemas:/etc/riak/schema basho/riak-ts
Output of docker logs riak:
+ /usr/sbin/riak start
riak failed to start within 15 seconds,
see the output of 'riak console' for more information.
If you want to
wait longer, set the environment variable
WAIT_FOR_ERLANG to the
number of seconds to wait.
Why does riak not start when it's logs or data directories are mounted to local directories?
This issue is with the directory owner of mounted log folder. The folder $GROUP and $USER are expected to be riak as follow:
root#20e489124b9a:/var/log# ls -l
drwxr-xr-x 2 riak riak 4096 Jul 19 10:00 riak
but with volumes you are getting:
root#3546d261a465:/var/log# ls -l
drwxr-xr-x 2 root root 4096 Jul 19 09:58 riak
One way to solve this is to have the directory ownership as riak user and group on host before starting the container. I looked the UID/GID (/etc/passwd) in docker image and they were:
riak:x:102:105:Riak user,,,:/var/lib/riak:/bin/bash
now change the ownership on host directories before starting the container as:
sudo chown 102:105 logs/
sudo chown 102:105 data/
This should solve it. At least for now. Details here.
I've been looking into setting up a data volume for a Docker container that I'm running on my server. The container is from this FreePBX image https://hub.docker.com/r/jmar71n/freepbx/
Basically I want persistent data so I don't lose my VoIP extensions and settings in the case of Docker stopping. I've tried many guides, ones here on stack overflow, and on the Docker manpages, but I just can't quite get it to work.
Can anyone help me with what commands I need to run in order to attach a volume to the FreePBX image I linked above?
You can do this by running a container with the -v option and mapping to a host directory - you just need to know where the container's storing the data.
Looking at the Dockerfile for that image, I'm assuming that the data you're interested in is stored in MySql. In the MySql config the data directory the container's using is /var/lib/mysql.
So you can start your container like this, mapping the MySql data directory to /docker/pbx-data on your host:
> docker run -d -t -v /docker/pbx-data:/var/lib/mysql jmar71n/freepbx
20b45b8fb2eec63db3f4dcab05f89624ef7cb1ff067cae258e0f8a910762fb1a
Use docker inpect to confirm that the mount is mapped as expected:
> docker inspect --format '{{json .Mounts}}' 20b
[{"Source":"/docker/pbx-data",
"Destination":"/var/lib/mysql",
"Mode":"","RW":true,"Propagation":"rprivate"}]
When the container runs it bootstraps the database, so on the host you'll be able to see the contents of the MySql data directory the container is using:
> ls -l /docker/pbx-data
total 28684
-rw-r----- 1 103 root 2062 Sep 21 09:30 20b45b8fb2ee.err
-rw-rw---- 1 103 messagebus 18874368 Sep 21 09:30 ibdata1
-rw-rw---- 1 103 messagebus 5242880 Sep 21 09:30 ib_logfile0
-rw-rw---- 1 103 messagebus 5242880 Sep 21 09:30 ib_logfile1
drwx------ 2 103 root 4096 Sep 21 09:30 mysql
drwx------ 2 103 messagebus 4096 Sep 21 09:30 performance_schema
If you kill the container and run another one with the same volume mapping, it will have all the data files from the previous container, and your app state should be preserved.
I'm not familiar with FreePBX, but if there is state being stored in other directories, you can find the locations in config and map them to the host in the same way, with multiple -v options.
Hi Elton Stoneman and user3608260!
Yes, you assuming correctly for data saves in Mysql (records, users, configs, etc.).
But in asterisk, all configurations are saved in files '.conf' and similars.
In this case, the archives looked for user3608260 are storaged in '/etc/asterisk/*'
Your answer is perfectly with more one command: -v /local_to_save:/etc/asterisk
the final docker command:
docker run -d -t -v /docker/pbx-data:/var/lib/mysql -v /docker/pbx-asterisk:/etc/asterisk jmar71n/freepbx
[Assuming /docker/pbx-asterisk is a host directory. ]