Docker access to future mounted volume - docker

I am wondering how I can access an encrypted USB key from a container without --privileged. Let's say I have /dev/sda1, a LUKS-encrypted key, and a running Docker container. The key is opened via cryptsetup luksOpen /dev/sda1 encrypted_sda1 --key-file=key-file, which makes /dev/mapper/encrypted_sda1 accessible. I then run mount /dev/mapper/encrypted_sda1 /media/sda1, where /media is shared between my host and my container.
From my host, I can access the content of the key via /media/sda1. But from my container (without --privileged), I can't: I just see an empty directory called sda1 in /media.
The strange thing is that if I start my container after mounting the USB key in /media, I can access /media/sda1 from the container. So I suspect the volume is not being synced correctly, or that something is wrong with permissions.
I don't understand why I can't access /media/sda1 when I mount the USB key on the host while the container is already running. Any leads?
Have a nice day!

Instead of --privileged, I think you need to configure bind propagation on the /media mount. The default is rprivate, meaning that no mount points anywhere within the original or replica mount points propagate in either direction. If you use rslave, submounts (e.g., /media/sda1) of the original mount are visible in the replica.
docker container run --mount type=bind,source=/media,target=/media,bind-propagation=rslave …
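To check which propagation mode a mount actually ended up with, you can look at the optional fields in /proc/self/mountinfo (inside the container or on the host). This is only a minimal sketch: the function name mount_propagation is made up for illustration, and it only interprets the shared:, master: and unbindable tags documented in proc(5).

```shell
# Report the propagation type of a mount point.
# Usage: mount_propagation /media < /proc/self/mountinfo
mount_propagation() {
    awk -v mp="$1" '
        $5 == mp {
            shared = 0; slave = 0; unbindable = 0
            # Optional fields start at column 7 and end at a lone "-".
            for (i = 7; i <= NF && $i != "-"; i++) {
                if ($i ~ /^shared:/) shared = 1
                else if ($i ~ /^master:/) slave = 1
                else if ($i == "unbindable") unbindable = 1
            }
            if (unbindable) result = "unbindable"
            else if (shared && slave) result = "shared,slave"
            else if (shared) result = "shared"
            else if (slave) result = "slave"
            else result = "private"
        }
        # If several entries match, the last one (the topmost mount) wins.
        END { if (result != "") print result }
    '
}
```

With the default rprivate I would expect "private" for /media inside the container, and with rslave I would expect "slave", provided the mount is shared on the host side.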

Related

Does docker container maintain volume data?

This might come across as a stupid question, but I am unable to figure something out about Docker volumes. Going through the official documentation, I can see that we can map the host machine's filesystem into the container for persistent storage. Following the instructions, I was able to mount a folder in my container.
Once I exec bash in the container, I can see the mapped directory structure there as expected. My question is: how is the data mapped between these two paths, that is, from the container to the mounted volume on the host OS? Is the data duplicated, or does the container store the data directly on the volume on the host OS, with the mapped path acting as something like a symlink?
This question comes up because we are trying to keep a large amount of data on a mounted disk that is accessible from the container, on the assumption that a mounted volume stores the data directly on the disk and nothing in the container.
The Docker documentation refers to this type of mount as a "bind mount"; that's also the technical Linux term for a mount that makes one part of the filesystem also appear somewhere else, and there's a mount --bind option you can use outside of Docker (usually a pretty specialized option).
On native Linux, the host content and the container-visible content are literally the exact same disk content. If you have a bind-mounted host directory or a named Docker volume mounted over a container directory, all reads and writes will use that mounted content, and in fact nothing will be written to the container filesystem on that path.
You mention symlinks; these are always resolved as filenames in the filesystem space of whoever reads them. If the mounted filesystem has a symlink passwd -> /etc/passwd then reading it will yield the host's password file on the host, and the container's password file inside the container. If it has a symlink f -> ../f then it will look at the directory above the mount point in whichever filesystem is the local one.
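A small sketch of why this happens: a symlink stores only the target path as text, and that text is interpreted by whichever side opens it. The scratch directory here is just for illustration; it stands in for the mounted directory.

```shell
# Create a scratch directory with one absolute and one relative symlink.
dir=$(mktemp -d)
ln -s /etc/passwd "$dir/passwd"
ln -s ../f "$dir/f"

# readlink prints the stored target text; nothing about the host or the
# container is recorded in the link itself.
abs_target=$(readlink "$dir/passwd")
rel_target=$(readlink "$dir/f")
echo "$abs_target"
echo "$rel_target"

# Clean up the scratch directory.
rm -rf "$dir"
```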
On non-Linux this process is a little bit more technically complex since there is typically a Linux virtual machine involved in the mix. This usually manifests as file synchronization appearing slow. For data you don't need to directly access as a human, storing it in a named Docker volume will usually be faster.

How does volume mount from container to host and vice versa work?

docker run -ti --rm -v DataVolume3:/var ubuntu
Let's say I have a volume DataVolume3 which picks up the contents of /var in the ubuntu container.
Even after killing this ubuntu container the volume remains, and I can mount DataVolume3 into other containers.
This means that deleting the container does not delete its volume mounts.
How does this work?
Does the volume mount mean that the contents of /var are copied into some local directory? It does not look like a symbolic link.
If I have the container running and I create a file in the container, does the same file get copied to the host path?
How does this whole process of volume mounting from container to host and host to container work?
Volumes are used for persistent storage, and a volume persists independently of the lifecycle of the container.
We can go through a demo to understand it clearly.
First, let's create a container using the named volumes approach as:
docker run -ti --rm -v DataVolume3:/var ubuntu
This will create a docker volume named DataVolume3 and it can be viewed in the output of docker volume ls:
docker volume ls
DRIVER VOLUME NAME
local DataVolume3
Docker stores the information about these named volumes in the directory /var/lib/docker/volumes/ (*):
ls /var/lib/docker/volumes/
1617af4bce3a647a0b93ed980d64d97746878564b141f30b6110d0818bf32b76 DataVolume3
Next, let's write some data from the ubuntu container at the mounted path /var:
root@2b67a89a0050:/# echo "hello" > /var/file1
root@2b67a89a0050:/# cat /var/file1
hello
We can see this data with cat even after deleting the container:
cat /var/lib/docker/volumes/DataVolume3/_data/file1
hello
Note: Although we are able to access the volume data as shown above, it is not a recommended practice.
The next time another container uses the same volume, the data from the volume is mounted at the container directory specified in the -v flag.
(*) The location may vary based on the OS, as pointed out by David, and can probably be seen with the docker volume inspect command.
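For the footnote above, a small sketch of asking Docker for the location instead of guessing it. The wrapper name volume_mountpoint is made up; the docker volume inspect call needs the docker CLI and a running daemon.

```shell
# Print the host path that backs a named volume.
volume_mountpoint() {
    docker volume inspect --format '{{ .Mountpoint }}' "$1"
}
```

On a native Linux install with the default data root, volume_mountpoint DataVolume3 should print the same /var/lib/docker/volumes/DataVolume3/_data path used in the demo.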
Docker has a concept of a named volume. By default the storage for this lives somewhere on your host system and you can't directly access it from outside Docker (*). A named volume has its own lifecycle, it can be independently docker volume rm'd, and if you start another container mounting the same volume, it will have the same persistent content.
The docker run -v option takes some unit of storage, either a named volume or a specific host directory, and mounts it (as in the mount(8) command) in a specific place in the container filesystem. This will hide what was originally in the image and replace it with the volume content.
As you note, if the thing you mount is an empty named volume, it will get populated from the image content at container initialization time. There are some really important caveats on this functionality:
Named volume initialization happens only if the volume is totally empty.
The contents of the named volume never automatically update.
If the volume isn't empty, the volume contents completely replace what's in the image, even if it's changed.
The initialization happens only on native Docker, and not for example in Kubernetes.
The initialization happens only on named volumes, and not for bind-mounted host directories.
With all of these caveats, I'd avoid relying on this functionality.
If you need to mount a volume into a container, assume it will be empty when your entrypoint or the main container command starts. If you need a particular directory layout or file structure there, an entrypoint script can create it; if you're expecting it to hold particular data, keep a copy of it somewhere else in your image and copy it in if it's not already there (or, perhaps, always).
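As a rough sketch of that entrypoint approach: the helper below copies default content into the mount point only when the mount point is empty. The function name seed_volume is made up, and it assumes you keep a copy of the defaults somewhere else in the image.

```shell
# Copy default content into a mount point, but only when it is empty.
seed_volume() {
    src=$1
    dst=$2
    mkdir -p "$dst"
    # ls -A lists dotfiles too, so a volume holding only hidden files
    # still counts as non-empty.
    if [ -z "$(ls -A "$dst")" ]; then
        # The trailing /. copies the contents of src, not src itself.
        cp -R "$src"/. "$dst"/
    fi
}
```

An entrypoint script could call something like seed_volume /opt/defaults /data (both paths hypothetical) and then finish with exec "$@" to hand over to the main container command.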
(*) On native Linux you can find a filesystem location for it, but accessing this isn't a best practice. On other OSes this will be hidden inside a virtual machine or other opaque storage. If you need to directly access the data (or inject config files, or read log files) a docker run -v /host/path:/container/path bind mount is a better choice.
Volumes are part of neither the container nor the host. Well, technically everything resides on the host machine, but Docker's directories are normally accessible only to root, and the files in them are managed separately by Docker.
"Volumes are stored in a part of the host filesystem which is managed by Docker (/var/lib/docker/volumes/ on Linux)."
Hence a volume is a Docker-managed directory on the host (under /var/lib/docker/volumes) that is mounted into the container. Anything added from either side ends up in that one directory; it is not a hard copy, and it is a mount rather than a symbolic link.
As volumes can be shared across different containers, deleting a container does not cascade to the volumes associated with it.
To remove unused volumes:
docker volume prune

Any unix wizardry to mount a device in the bare metal OS from a container

We use containers to provision storage on our storage nodes, but I can't for the life of me figure out a way to mount a device on the bare metal OS from a container. Both bare metal and containers are running Oracle Linux 7.5.
We cannot use ssh in any form for this. This is an isolated compute environment and the only access is through the orchestration we use to manage containers.
I'm mainly a Solaris guy, so I'm wondering if there is any Linux magic I can work here.
I can mount any bare metal devices or filesystems into the container and I can run the container in privileged mode.
Thx for any help
* clarification *
This is not about mounting a volume into a container.
This container is a temporary provisioning container, i.e. it does stuff like mount iSCSI volumes, create volume groups, create logical volumes and make filesystems.
This part is all working fine.
The last step this container needs to do is somehow tell the BARE METAL OPERATING SYSTEM TO MOUNT A DEVICE INTO ITS FILESYSTEM. NOT IN THE CONTAINER.
Simplistic example: I need this container to somehow tell the OS to "mount /dev/sdg /data".
This mount does not need to be available to the container. The container is destroyed once it has allocated the storage and mounted it.
And we can't use SSH for this.
There are several problems you need to overcome.
(1) By default, Docker does not have access to block devices on the host.
(2) A docker container is unable to modify its own mount namespace.
(3) A docker container runs in a private mount namespace, so even after solving (1) and (2), any mounts you make inside the container will not be visible from the host.
Fortunately, there are solutions to all of the above!
We can solve (1) and (2) by passing the --privileged flag to
docker run. This removes all the restrictions that Docker normally
places on a container.
For solving (3), we need to use the --mount option instead of the
-v option, since we need to modify the style of mount propagation
used. Reading through the documentation on
bind-mounts,
we see that the --mount option supports the following options:
The type of the mount, which can be bind, volume, or tmpfs. This topic discusses bind mounts, so the type will always be bind.
The source of the mount. For bind mounts, this is the path to the file or directory on the Docker daemon host. May be specified as source or src.
The destination takes as its value the path where the file or directory will be mounted in the container. May be specified as destination, dst, or target.
The readonly option, if present, causes the bind mount to be mounted into the container as read-only.
The bind-propagation option, if present, changes the bind propagation. May be one of rprivate, private, rshared, shared, rslave, slave.
The consistency option, if present, may be one of consistent, delegated, or cached. This setting only applies to Docker for Mac, and is ignored on all other platforms.
The one we care about is the bind-propagation option. The values for
that are described later on in the same
document.
Reading through them, we probably want rshared.
Armed with this knowledge, I can run:
docker run -it \
--mount type=bind,source=/,dst=/host,bind-propagation=rshared \
--privileged alpine sh
Then inside the container I can run, for example:
mount /dev/sdd1 /host/mnt
And on the host I see the contents of /dev/sdd1 mounted on /mnt. The mount will persist after the container exits.

Docker volume vs mount bind for external hdd

First-time Docker user here, running on a Raspberry Pi 3 (Hypriot OS). I have an external HDD attached to my Raspberry Pi to store all the files. The OS is on the SD card.
I am setting up several containers on Docker: sonarr, radarr, emby server and a bittorrent client.
I created all the containers following the instructions on their Docker Hub pages, so I attached all of the folders using bind mounts (-v /some/path:/some/path).
Now the documentation says a volume is better because it doesn't rely on the host filesystem. Also, I am having problems because I want to use hard links between files on my external HDD, but because I am using bind mounts, hard-linking from one mount to another on the same HDD does not seem to work. I think using a single bind mount should solve this, but I want to get the config right now.
Is a volume an option to store all the movies, or should I keep using bind mounts?
In case of a volume, can I specify the external HDD to store the movies? I have Docker installed on the SD card but I need the movies on my external HDD.
I have used docker volume create --name something -o device=/myhddmount/ but I am not sure if this is OK, because docker volume inspect shows a mountpoint on the SD card. Also, when I create the volume, should I set -o type=ext4? According to the manual, ext4 doesn't have a device= option.
Thanks!

How to list Docker mounted volumes from within the container

I want to list all container directories that are mounted volumes.
I.e. to be able to get similar info I get from
docker inspect --format "{{ .Volumes }}" <self>
But from within the container and without having docker installed in there.
I tried cat /proc/mounts, but I couldn't find a proper filter for it.
(EDIT - this may no longer work on Mac) If your Docker host is OS X, the mounted volumes will be type osxfs (or fuse.osxfs). You can run a
mount | grep osxfs | awk '{print $3}'
and get a list of all the mounted volumes.
If your Docker host is Linux (at least Ubuntu 14+, maybe others), the volumes appear to all be on /dev, but not on a device that is in your container's /dev filesystem. The volumes will be alongside /etc/resolv.conf, /etc/hostname, and /etc/hosts. If you do a mount | grep ^/dev to start, then filter out any of the files in ls /dev/*, then filter out the three files listed above, you should be left with host volumes.
mount | grep ^/dev/ | grep -v /etc | awk '{print $3}'
My guess is the specifics may vary from Linux to Linux. Not ideal, but at least possible to figure out.
Assuming you want to check what volumes are mounted from inside a linux based container you can look up entries beginning with "/dev" in /etc/mtab, removing the /etc entries
$ grep "^/dev" /etc/mtab | grep -v " \/etc/"
/dev/nvme0n1p1 /var/www/site1 ext4 rw,relatime,discard,data=ordered 0 0
/dev/nvme0n1p1 /var/www/site2 ext4 rw,relatime,discard,data=ordered 0 0
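The same filter can be folded into a single awk call that reads /proc/mounts (or /etc/mtab) directly. This is only a sketch of the heuristic above, with the same caveats: the function name list_volume_mounts is made up, and it drops mount points under /etc/ rather than every line that mentions /etc.

```shell
# Print mount points whose source is a /dev/ device, skipping the files
# Docker injects under /etc (resolv.conf, hostname, hosts).
# Usage: list_volume_mounts < /proc/mounts
list_volume_mounts() {
    awk '$1 ~ /^\/dev\// && $2 !~ /^\/etc\// { print $2 }'
}
```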
As you can read from many of the comments you got, a container is initially nothing but a restricted, reserved set of resources that is cut off from the rest of your machine. It is not aware of being a Docker container, and inside it everything behaves as if it were a separate machine. Sort of like the Matrix, I guess ;)
You get access to the host machine's kernel and its resources, but only as a restricted, filtered-out subset. This is done with the namespaces and "cgroups" functionality of the Linux kernel.
Now the good news: there are multiple ways to provide this information to your container, but it is something you will have to provide and build yourself.
The easiest and most powerful way is to mount the Unix socket located on your host at /var/run/docker.sock inside your container at the same location. That way, when you use the Docker client inside your container, you are talking directly to the Docker engine on your host.
However, with great power comes great responsibility. This is a nice setup, but it is not very secure. Anyone who manages to get into your container gets root access to your host system this way.
A better way would be to provide a list of mounts through environment variables, or to stick to some made-up conventions so that the mounts can be predicted.
(Do you realize that there is a parameter for mounting that gives mounts an alias inside your container?)
The docker exec command is probably what you are looking for.
This will let you run arbitrary commands inside an existing container.
For example:
docker exec -it <mycontainer> bash
Of course, whatever command you are running must exist in the container filesystem.
docker cp: copy files/folders between a container and the local filesystem
docker cp [OPTIONS] CONTAINER:SRC_PATH DEST_PATH
docker cp [OPTIONS] SRC_PATH CONTAINER:DEST_PATH
To copy a full folder:
docker cp ./src/build b081dbbb679b:/usr/share/nginx/html
Note: if the destination directory already exists, this copies the build directory itself into the container's …/nginx/html/ directory.
To copy only the contents of the folder, end the source path with /. :
docker cp ./src/build/. b081dbbb679b:/usr/share/nginx/html
Note: this copies the contents of the build directory into the container's …/nginx/html/ directory.
Docker Storage options:
Volumes are stored in a part of the host filesystem which is managed by Docker(/var/lib/docker/volumes/ on Linux). Non-Docker processes should not modify this part of the filesystem. Volumes are the best way to persist data in Docker.
When you create a volume, it is stored within a directory on the Docker host. When you mount the volume into a container, this directory is what is mounted into the container. This is similar to the way that bind mounts work, except that volumes are managed by Docker and are isolated from the core functionality of the host machine.
A given volume can be mounted into multiple containers simultaneously. When no running container is using a volume, the volume is still available to Docker and is not removed automatically. You can remove unused volumes using docker volume prune.
When you mount a volume, it may be named or anonymous. Anonymous volumes are not given an explicit name when they are first mounted into a container, so Docker gives them a random name that is guaranteed to be unique within a given Docker host. Besides the name, named and anonymous volumes behave in the same ways.
Volumes also support the use of volume drivers, which allow you to store your data on remote hosts or cloud providers, among other possibilities.
Bind mounts may be stored anywhere on the host system. They may even be important system files or directories. Non-Docker processes on the Docker host or a Docker container can modify them at any time.
Bind mounts have been available since the early days of Docker. They have limited functionality compared to volumes. When you use a bind mount, a file or directory on the host machine is mounted into a container. The file or directory is referenced by its full path on the host machine. The file or directory does not need to exist on the Docker host already. It is created on demand if it does not yet exist. Bind mounts are very performant, but they rely on the host machine’s filesystem having a specific directory structure available. If you are developing new Docker applications, consider using named volumes instead. You can’t use Docker CLI commands to directly manage bind mounts.
One side effect of using bind mounts, for better or for worse, is that you can change the host filesystem via processes running in a container, including creating, modifying, or deleting important system files or directories. This is a powerful ability which can have security implications, including impacting non-Docker processes on the host system.
tmpfs mounts are stored in the host system’s memory only, and are never written to the host system’s filesystem.
A tmpfs mount is not persisted on disk, either on the Docker host or within a container. It can be used by a container during the lifetime of the container, to store non-persistent state or sensitive information. For instance, internally, swarm services use tmpfs mounts to mount secrets into a service’s containers.
If you need to specify volume driver options, you must use --mount.
• -v or --volume: Consists of three fields, separated by colon characters (:). The fields must be in the correct order, and the meaning of each field is not immediately obvious.
o In the case of named volumes, the first field is the name of the volume, and is unique on a given host machine. For anonymous volumes, the first field is omitted.
o The second field is the path where the file or directory will be mounted in the container.
o The third field is optional, and is a comma-separated list of options, such as ro. These options are discussed below.
• --mount: Consists of multiple key-value pairs, separated by commas and each consisting of a <key>=<value> tuple. The --mount syntax is more verbose than -v or --volume, but the order of the keys is not significant, and the value of the flag is easier to understand.
o The type of the mount, which can be bind, volume, or tmpfs. This topic discusses volumes, so the type will always be volume.
o The source of the mount. For named volumes, this is the name of the volume. For anonymous volumes, this field is omitted. May be specified as source or src.
o The destination takes as its value the path where the file or directory will be mounted in the container. May be specified as destination, dst, or target.
o The readonly option, if present, causes the bind mount to be mounted into the container as read-only.
o The volume-opt option, which can be specified more than once, takes a key-value pair consisting of the option name and its value.
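To make the mapping between the two syntaxes concrete, here is a small sketch that rewrites a -v style named-volume spec as the equivalent --mount value. The function name volume_spec_to_mount is made up, and it only handles the simple name:path[:options] form with the ro option; it is not a full parser for everything -v accepts.

```shell
# Convert "name:/container/path[:opts]" into a --mount key=value string.
volume_spec_to_mount() {
    spec=$1
    name=${spec%%:*}        # first field: volume name
    rest=${spec#*:}         # everything after the first colon
    target=${rest%%:*}      # second field: path inside the container
    opts=
    case $rest in
        *:*) opts=${rest#*:} ;;   # optional third field: comma-separated options
    esac
    out="type=volume,source=$name,target=$target"
    case ",$opts," in
        *,ro,*) out="$out,readonly" ;;
    esac
    printf '%s\n' "$out"
}
```

For example, volume_spec_to_mount DataVolume3:/var gives a value that can be passed as docker run --mount type=volume,source=DataVolume3,target=/var.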
