Docker - Exposing container directories to host directories without obscuring original contents - docker

I am still relatively new to Docker, so I haven't quite figured out all the nuances yet, so forgive me if this has already been resolved elsewhere.
I would like to share files between the container and the host.
So far I have been using volumes, and mounting a specific host directory to a container directory - but this presents an issue in that, if the host directory is messed around with, these changes are also present in the container.
Another problem I am experiencing is that if the host directory is empty, and is mounted to a pre-existing directory on the container, then the contents of the directory are made invisible. I do understand that this behaviour is consistent with mounting, so I know this is not technically an issue.
But I wonder if it would be possible to set up a volume, or an alternative solution, that:
firstly cannot be edited on the host side,
and secondly allows for the host directory contents to be merged with the container's directory?

I wonder if it would be possible to set up a volume, or an alternative solution, that
firstly cannot be edited on the host side,
That would be a data volume container (with docker create on an image with a VOLUME directive)
and secondly allows for the host directory contents to be merged with the container's directory?
Not that I know of. You would have to do some kind of copy on startup.

Related

Overlay a folder in docker by one from host

My situation is the following:
I am having a docker image/container in which I am compiling. I had to install some components to $HOME via the Dockerfile (so while creating the image).
Let's say one of those components is in ~/.config, but also other folders.
I would like to have the possibility to override the files in .config by mounting a home folder from the host on top of the one inside docker. Whenever you place a file in the mounted folder, it overrides the one which is already inside the container.
So in theory, this is exactly what an OverlayFS does, right? While the lower directory would be the one inside the Docker container, the upper directory would be the one on my Host.
Is there a way to accomplish that?
Until now I found the following related topics:
https://serverfault.com/questions/841238/how-to-use-overlayfs-with-docker-volumes
Drawback: The answer does only show how to use overlayfs on the host, but getting acccess to the lower container/image directory is not that self-explaining and also feels dirty.
Can I mount docker host directory as copy on write/overlay?
Drawback: Using mount -t overlay inside docker does not work on newer kernels because of the disabled overlay-on/over-overlay option
I also thought about manipulating the docker files on host directly, i.e. the directories where docker stores the files, but that feels a bit dirty.
To do so, I would declare VOLUME /home/user at the end of the Dockerfile. Then I would find my files of that directory in /var/lib/docker/volumes/user/_data. I could then create a overlayfs on my host, using that directory as lower, my other folder as upper. I could then remount that new directory using docker run --volume. Unfortunately this would involve su rights to access the /var/lib directory.
The other way around would be to bind-mount single files, but that's maybe a bit hackish too.

Does docker container maintain volume data?

This might come across as a stupid question, but I am unable to figure something about docker volumes. Going through the official documentation I can see that we can map the host machine file system on the container for persistent storage. Following the instruction I was successfully able to mount a folder on my container.
Once I exec bash into the container, I can see the mapped directory structure there as expected. My question is, how is the data mapped between these two paths, that is from the container to the mount volume on host OS. Is the data duplicated or the container directly stores the data on the volume on host OS and the mapped paths are shown for something like symlink ?
This question comes across since we are trying to maintain a large amount of data on a mounted disk but accessible by the container, with the assumption that mounting volume would directly store the data on the disk and nothing on the container.
The Docker documentation refers to this type of mount as a "bind mount"; that's also a technical Linux term that allows one part of the filesystem to also appear somewhere else, and there's a mount --bind option you can use outside of Docker (usually a pretty specialized option).
On native Linux, the host content and the container-visible content are literally the exact same disk content. If you have a bind-mounted host directory or a named Docker volume mounted over a container directory, all reads and writes will use that mounted content, and in fact nothing will be written to the container filesystem on that path.
You mention symlinks; these are always resolved as filenames in their respective filesystem space. If the mounted filesystem has a symlink passwd -> /etc/passwd then reading it will yield the host's password file on the host, and the container's password file inside the container. If it has a symlink f -> ../f then it will look at the directory above the mount point in whichever the local filesystem is.
On non-Linux this process is a little bit more technically complex since there is typically a Linux virtual machine involved in the mix. This usually manifests as file synchronization appearing slow. For data you don't need to directly access as a human, storing it in a named Docker volume will usually be faster.

Docker: Handling user uploads and saving files

I have been reading about Docker, and one of the first things that I read about docker was that it runs images in a read-only manner. This has raised this question in my mind, what happens if I need users to upload files? In that case where would the file go (are they appended to the image)? or in other words, how to handle uploaded files?
Docker containers are meant to be immutable and replaceable - you should be able to stop a container and replace it with a newer version without any ill effects. It's bad practice to store any configuration or operational data inside the container.
The situation you describe with file uploads would typically be resolved with a volume, which mounts a folder from the host filesystem into the container. Any modifications performed by the container to the mounted folder would persist on the host filesystem. When the container is replaced, the folder is re-mounted when the new container is started.
It may be helpful to read up on volumes: https://docs.docker.com/storage/volumes/
docker containers use file systems similar to their underlying operating system, as it seems in your case Windows Nano Server(windows optimized to be used in a container).
so any uploads to your container will be placed on the corresponding path you provided when uploading the file.
but this data is ephemeral, this means your data will persist until the container is for whatever reason stopped.
to use persistent storage you must provide a volume for your docker container, you can think of volumes as external disks attached to a container that mount on a path inside the container. this will persist data regardless of container state

How can I share data between host and container using mounts

I've been attempting to share data between my host and my container. I've been reading a lot about volumes and I believe I have misunderstood some of the fundamentals around sharing data.
Here's how I've been doing it (with Docker Compose)
version: "2"
services:
my-server:
volumes:
- type: bind
source: ./test/
target: /var/logs
The problem with this approach is that the initial creation of the mount destroys any data in the target folder. So for example if my image was built from another image that had some logs in that folder (for whatever reason), the logs would be destroyed.
This is a major problem with my use case. I need to mount a volume (a folder, basically) so that I can share data between my host and guest, similar to how a shared folder with a VM would work.
I've looked into named volumes but from what I understand, named and anonymous volumes are designed to share data between containers, and not to share data with the host (which is what I need for my use case).
So besides bind mounts, is it possible share data between the host and container?
This is not really a Docker problem. I think you'll run into this with any mount. Basically you are already using the correct mechanism for sharing data between the host and your container.
When you mount something in linux, the mount target (i.e. the path at which you mount something) is always replaced with the root of whatever you mount. It does not merge the contents of the mount target with the contents of the (in this case) bind mounted directory. I'm surprised that works with VM shared folders because you run a high risk of a collision. e.g. same file in both locations. How would it resolve that? File system mounts are not the same as a dropbox like synchronisation of files between two locations.
I suggest that you do your bind mount to somewhere else in your container which has no contents and then modify your in-container workflow to handle this. In your example it sounds like you are attempting to collect logs. It also sounds like the containers configured log directory might have some contents which you want to be copied to the host. You could achieve this by having your container init itself by configuring a new log directory before starting your services/running anything, and copying any existing logs to that location. This new location would be the bind mount. Your init script could also detect if the bind mount was already used in this fashion and not sync over the data. This is really an application specific problem.

Give Docker access to host directory but discard changes later

I want to achieve the following with Docker: I want to give a container access to a host directory, such that the container can make changes, but the changes are discarded once the container is exiting/removed (pretty much like an overlayfs).
Simply mounting the directory as a volume for the docker container seems like the wrong way to me, since changes made to a volume persist and I don't want that.
How do I tackle this problem?
The only way for a container to modify the host is to mount a directory between the host and the container. But the changes made by host or container will persist.
You could try the other way: COPY the files you want from host to container using a Dockerfile. The files will be only on the container. When you remove and launch another one, the new container will start with the original files.

Resources