Is multiple Docker data-roots possible and how? - docker

I have one container which needs a lot of space and I want it to use a dedicated drive on my server.
This answer comprehensively explains how to move docker data-root. But is it possible to have two data-roots and assign a specific container to the second one?

You seem to have needs that are specific to one container, so moving the Docker data-root to another location does not seem to be the right answer here (though you may do it anyway).
What you need are "volumes" (strictly speaking, bind mounts of host paths).
Wrap your image in a docker-compose file and mount some container directories as volumes pointing to a "host" path (outside the Docker data-root). These should be the directories that need a lot of space, and they should point to a volume group (VG) or an external mount point (e.g. NFS) with sufficient space.
E.g.:
    ...
    my-service:
      image: my-image
      volumes:
        - "/path/within/host/opt/data/tmp/:/path/within/container/cache/:rw"
        - "/path/within/host/opt/data/layers/:/path/within/container/layers/:rw"
        - "/path/within/host/opt/data/logs/:/path/within/container/logs/:rw"
    ...
(Note that "rw" can be omitted here, since it's the default value.)

Related

Docker compose. The volume inside another volume on the same container. How does it work?

I'm trying to create a relatively simple setup to develop and test npm packages. The problem is that once you mount a code volume into the container, it replaces node_modules.
I tried a lot of generally logical things, mostly aimed at moving node_modules to another location and then referencing it from configuration files. It works, but the solution is ugly. Also, it's not good practice to install webpack globally, but my solution requires it.
However, after some time I found this solution, which looks elegant and is just what I needed, but it has one problem: I don't completely understand how it works.
This is my understanding of how everything operates:
1. Docker reorders volume mounting based on container paths.
2. Docker mounts the subdirectory volume first.
3. Docker mounts the parent directory volume, but due to an unexplained mechanism it does not override the subdirectory volume...
4. ???
5. PROFIT. The node_modules dir is in place and webpack runs perfectly.
So I really want to understand how it actually does all of this black magic, because without this knowledge I feel like I'm missing something important.
So, how does it work?
Thanks in advance.
    services:
      react-generic-form:
        image: react-generic-form:package
        container_name: react-generic-form-package
        build:
          dockerfile: dev.Dockerfile
          context: ./package
        volumes:
          - "./package:/package"
          - "/package/node_modules"
The Docker daemon, when it creates the container, sorts all of the mount points to avoid shadowing. (On non-Windows, this happens in (*github.com/docker/docker/daemon.Daemon).setupMounts.) So, in your example:
1. The Docker daemon sees that both /package and /package/node_modules contain data that's stored outside the container filespace.
2. It sorts these from shortest to longest.
3. It mounts /package as a bind mount to the named host directory. (First, because it's the shorter path name.)
4. It mounts /package/node_modules, shadowing the equivalent directory in the previous mount, probably as a bind mount to a directory with a long hex identifier as its name somewhere in /var/lib/docker/volumes.
You can experiment more with this using a docker-compose.yml file like:
    version: '3'
    services:
      touch:
        image: busybox
        volumes:
          - ./b:/a/b
          - ./a:/a
        command: touch /a/b/c
Notice that whichever order you put the volumes: in, you will get an empty directory ./a/b (which becomes the mount point inside the container), plus an empty file ./b/c (the result of the touch command).
Also note what this setup declares: that the node_modules directory contains data that should be persisted across container invocations and has a lifecycle separate from both the container and its base image. Changing the image and re-running docker-compose up will have no effect on this volume's content.

What's the difference between declaring in docker-compose.yml volume as section and under a service?

What's the difference between declaring in the docker-compose.yml file a volume section and just using the volumes keyword under a service?
For example, I map a volume this way for a container:
    services:
      mysqldb:
        volumes:
          - ./data:/var/lib/mysql
This will map to the folder called data from my working directory.
But I could also map a volume by declaring a volume section and use its alias for the container:
    services:
      mysqldb:
        volumes:
          - data_volume:/var/lib/mysql
    volumes:
      data_volume:
        driver: local
In this method, the actual location of where the mapped files are stored appears to be somewhat managed by docker compose.
What are the differences between these 2 methods or are they the same? Which one should I really use?
Are there any benefits of using one method over the other?
The difference between the methods you've described is that the first is a bind mount and the second is a volume. These are Docker features rather than Docker Compose features, and volumes provide several benefits over mounting a path from your host's filesystem. As described in the documentation, they:
- are easier to back up or migrate
- can be managed with docker volume commands or the API (as opposed to the raw filesystem)
- work on both Linux and Windows containers
- can be safely shared among multiple containers
- can have content pre-populated by a container (with bind mounts you sometimes have to copy data out, then restart the container)
Another massive benefit of using volumes is the volume drivers, which you'd specify in place of local. They allow you to store volumes remotely (e.g. in the cloud) or add other features like encryption. This is core to the concept of containers: if the running container is stateless and uses remote volumes, you can move the container across hosts and run it without reconfiguring it.
Therefore, the recommendation is to use Docker volumes. Another good example is the following:
    services:
      webserver_a:
        volumes:
          - ./serving/prod:/var/www
      webserver_b:
        volumes:
          - ./serving/prod:/var/www
      cache_server:
        volumes:
          - ./serving/prod:/cache_root
If you move the ./serving directory somewhere else, the bind mounts break because they use a relative path. As you noted, volumes have aliases and their path is managed by Docker, so:
- you wouldn't need to find and replace the path 3 times
- the volume using the local driver stores its data somewhere else on your system and would continue mounting just fine
TL;DR: try to use volumes. They're portable, and they encourage practices that reduce dependencies on your host machine.
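As a sketch, the three-service example above could be rewritten with a single named volume. The volume name serving_prod is made up for illustration, and the service definitions are trimmed to the volumes key just as in the snippet above, so they would still need an image or build entry to run:

```yaml
services:
  webserver_a:
    volumes:
      - serving_prod:/var/www
  webserver_b:
    volumes:
      - serving_prod:/var/www
  cache_server:
    volumes:
      - serving_prod:/cache_root

# Hypothetical volume name; Docker decides where the data lives on the host.
volumes:
  serving_prod:
    driver: local
```

With this layout, renaming or relocating the project directory does not affect where the data is stored, and the volume definition only has to change in one place.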

Using docker-compose to mount volume that will change across hosts

In Docker Compose I know you can set volume binds by putting something like
    volumes:
      - /some_path_on_host:/some_path_in_container
However, the /some_path_on_host will be different for me depending on the host machine.
I tried looking in the Docker documentation and couldn't find anything specific to this case.
Basically, I want to make sure that even someone without any Docker experience can set the path for the volume on the host machine without having to edit docker-compose.yml.
From my understanding, Docker Compose also allows paths to be set with environment variables:
    volumes:
      - ${SOME_ENV_VAR}:/some_path_in_container
Is there any other way to set volumes that is more user friendly, or should I just tell them to set SOME_ENV_VAR? Would using SOME_ENV_VAR be best practice?
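For what it's worth, Compose variable substitution also accepts an inline default, so a sketch along these lines would let users either do nothing or set a single variable. The variable name comes from the question; the service name, the image, and the ./data fallback are assumptions for illustration only:

```yaml
version: '3'
services:
  app:                  # hypothetical service name
    image: my-image     # placeholder image
    volumes:
      # Uses SOME_ENV_VAR when it is set and non-empty, otherwise falls back to ./data
      - ${SOME_ENV_VAR:-./data}:/some_path_in_container
```

A .env file in the project directory containing a line such as SOME_ENV_VAR=/srv/shared is read automatically by docker-compose, so nobody has to edit the compose file or export variables by hand.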

Scaling Docker containers in Rancher with different but persistent volumes

I'm currently trying to bridge the gap between persistent, but unique volumes while scaling containers with Rancher (alternatively Docker Compose, since this is more of an abstract question).
Take a Minecraft server as an example: I have a service defined in Rancher/Compose which uses a named volume as its data/world directory (e.g. -v minecraft_data:/data, where the Minecraft image loads its world files from this /data directory). The reason I'm using such a named volume is that I want it to persist between service upgrades (e.g. when I'm changing the image version or some environment variables), which would not be possible with an anonymous volume.
Now when trying to scale up my service, I'm either getting multiple containers accessing the same data (not good for many use cases), or losing the service upgradeability when using anonymous volumes.
Are there any tools, best practices or patterns that might help with this issue?
In current versions of Rancher (v1.4 at this time), storage drivers can be plugged in at the environment infrastructure level. This allows you to create volumes that are scoped to the environment, stack, or container.
For your use case, it sounds like per-container scope is what you need. Using rancher-compose you do something like:
    version: '2'
    services:
      foo:
        image: busybox
        volumes:
          - bar:/var/lib/storage
        command: /bin/sh -c 'while true; do sleep 500; done'
    volumes:
      bar:
        per_container: true
Then, rancher-compose up -d will create the stack and service with one container and a unique volume. rancher scale foo=2 will create another container with its own volume, etc. You can also specify volume storage drivers for each volume like rancher-ebs or rancher-nfs with their respective options.
I think what you want is to have different instances of the entire project. scale implies identical clones, but if they have different data, they are not identical.
Instead of using scale, I would start different instances with different project names: https://docs.docker.com/compose/overview/#multiple-isolated-environments-on-a-single-host

chown docker volumes on host (possibly through docker-compose)

I have the following example
    version: '2'
    services:
      proxy:
        container_name: proxy
        hostname: proxy
        image: nginx
        ports:
          - 80:80
          - 443:443
        volumes:
          - proxy_conf:/etc/nginx
          - proxy_htdocs:/usr/share/nginx/html
    volumes:
      proxy_conf: {}
      proxy_htdocs: {}
which works fine. When I run docker-compose up it creates those named volumes in /var/lib/docker/volumes and all is good. However, from the host, I can only access /var/lib/docker as root, because it's root:root (makes sense). I was wondering if there is a way of chowning the host's directories to something more sensible/safe (like, my relatively unprivileged user that I use to do most things on the host) or if I just have to suck it up and chown them manually. I'm starting to have a number of scripts already to work around other issues, so having an extra couple of lines won't be much of a problem, but I'd really like to keep my self-written automation minimal, if I can -- fewer chances for stupid mistakes.
By the way, no: if I mount host directories instead of creating volumes, they get overlaid, meaning that if they start empty, they stay empty, and I don't get the default configuration (or whatever) from inside the container.
Extra points: can I just move the volumes to a more convenient location? Say, /home/myuser/myserverstuff/volumes?
It's best not to try to access files inside /var/lib/docker directly. Those directories are meant to be managed by the Docker daemon, not to be messed with.
To access the data inside a volume, there are a number of options:
- use a bind-mounted directory (you considered that, but it didn't fit your use case).
- use a "service" container that uses the same volume and makes it accessible through that container, for example a container running ssh (to use scp) or a Samba container (such as svendowideit/samba).
- use a volume-driver plugin. There are various plugins around that offer all kinds of options. For example, the local persist plugin is a really simple plugin that allows you to specify where Docker should store the volume data (so outside of /var/lib/docker).
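As a sketch of the "store the volume data elsewhere" idea without an extra plugin (so this is not the local persist plugin mentioned above): the built-in local driver accepts mount options, so a named volume can be backed by a host directory of your choosing. The proxy_conf subdirectory under the path from the question is an assumption, and the directory must already exist on the host. Ownership and permissions behaviour should be verified on your own setup.

```yaml
version: '2'
services:
  proxy:
    image: nginx
    volumes:
      - proxy_conf:/etc/nginx
volumes:
  proxy_conf:
    driver: local
    driver_opts:
      type: none    # no filesystem type, just a bind mount
      o: bind
      # Assumed location based on the question; must exist beforehand.
      device: /home/myuser/myserverstuff/volumes/proxy_conf
```

The volume is still created and listed through Docker like any other named volume, but its contents live in the directory given by device.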
