tl;dr What are best practices for exposing read-only data to a Dockerized app, with transparent decompression?
I have a Dockerized app which needs read-only access to a set of data files totalling 80 GB. Currently, I manually copy these to hosts, and expose them via a bind-mount. I would like to migrate these files to a Docker volume, but their size is an issue.
These files compress well, down to 15 GB. Is it possible to take advantage of that compression in the Docker volume, to avoid eagerly decompressing the full 80 GB to the host? For example, can a Docker volume use SquashFS or similar for on-demand decompression?
Things I've tried:
Using the btrfs storage driver, but this requires that the host be configured with a dedicated block device.
Using fuse-zip and mounting inside the container, but this is very hacky and requires granting the container the additional SYS_ADMIN capability.
In case it's relevant, the files are accessed linearly, not random-access. Thank you for any help!
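One approach that may fit, sketched here under assumptions rather than tested against this setup: pack the files into a SquashFS image, mount that image read-only on the host, and bind-mount the mount point into the container. SquashFS decompresses blocks as they are read, so only the compressed image needs to sit on disk, and because the mount happens on the host the container needs no extra capabilities. The function name, paths, and image name below are hypothetical, and the host still needs root to perform the mount.

```shell
# Hypothetical paths and image name; adjust to your environment.
# Pack a data directory into a compressed, read-only SquashFS image,
# mount it on the host, and expose the mount point to the container.
publish_squashfs() {
  src_dir=$1      # directory holding the uncompressed files
  image_file=$2   # SquashFS image to create
  mount_point=$3  # where the host mounts the image
  app_image=$4    # Docker image to run

  mksquashfs "$src_dir" "$image_file" -noappend &&
  sudo mkdir -p "$mount_point" &&
  sudo mount -t squashfs -o loop,ro "$image_file" "$mount_point" &&
  docker run --rm \
    --mount "type=bind,source=$mount_point,target=/data,readonly" \
    "$app_image"
}
```

For example, `publish_squashfs /srv/data /srv/data.sqsh /mnt/data my_image`. In practice the image would likely be built once and copied to each host, with only the mount and run steps repeated there.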
Related
I have been reading about Docker, and one of the first things I read was that it runs images in a read-only manner. This raised a question: what happens if I need users to upload files? Where would the files go (are they appended to the image)? In other words, how should uploaded files be handled?
Docker containers are meant to be immutable and replaceable - you should be able to stop a container and replace it with a newer version without any ill effects. It's bad practice to store any configuration or operational data inside the container.
The situation you describe with file uploads would typically be resolved with a volume, which mounts a folder from the host filesystem into the container. Any modifications performed by the container to the mounted folder would persist on the host filesystem. When the container is replaced, the folder is re-mounted when the new container is started.
It may be helpful to read up on volumes: https://docs.docker.com/storage/volumes/
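As a minimal sketch of the host-folder approach described above (the function name, directory names, and image name are hypothetical), the container can be started with a host directory mounted over the path where the app writes uploads:

```shell
# Hypothetical host directory, container path, and image name.
# Start the app with a host directory mounted over its upload path,
# so uploaded files land on the host instead of in the container layer.
run_with_upload_dir() {
  host_dir=$1       # e.g. /srv/uploads on the host
  container_dir=$2  # e.g. /app/uploads inside the container
  app_image=$3

  # --mount type=bind requires the source directory to exist already.
  mkdir -p "$host_dir" &&
  docker run -d \
    --mount "type=bind,source=$host_dir,target=$container_dir" \
    "$app_image"
}
```

For example, `run_with_upload_dir /srv/uploads /app/uploads my_image`. Replacing the container later with the same mount arguments should make the existing uploads visible to the new container.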
Docker containers use a filesystem similar to that of their underlying operating system; in your case that appears to be Windows Nano Server (a Windows edition optimized for use in containers).
Any upload to your container will therefore be written to the path you specified when uploading the file.
However, this data is ephemeral: it lasts only as long as the container itself and is lost when the container is removed or replaced.
To get persistent storage, you must provide a volume for your Docker container. You can think of a volume as an external disk attached to the container and mounted at a path inside it. Data on the volume persists regardless of the container's state.
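To illustrate the "external disk" idea, here is a hedged sketch using a named volume. The function, volume, container, and image names are hypothetical, and /data is an assumed mount path. The volume is created once and then reattached whenever the container is replaced:

```shell
# Hypothetical volume, container, and image names.
# Replace a container while keeping its data: the named volume outlives
# the container, so the new container sees the same files at /data.
replace_keeping_data() {
  volume_name=$1
  container_name=$2
  app_image=$3

  # Creating a volume that already exists is harmless.
  docker volume create "$volume_name" &&
  # Remove the old container if there is one; ignore "no such container".
  { docker rm -f "$container_name" >/dev/null 2>&1 || true; } &&
  docker run -d --name "$container_name" \
    --mount "type=volume,source=$volume_name,target=/data" \
    "$app_image"
}
```

Calling `replace_keeping_data appdata app my_image` twice in a row should leave anything written under /data by the first container available to the second.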
I understand Docker volumes and the way they refer to directories on the host. What about the rest of the filesystem within a container?
To think about it a different way: suppose you have a server with most of the storage on a remote drive, meaning reads and writes take longer than usual. If you don't mount any volumes, would it keep any/some/most/all of the container filesystem in RAM? Or does it write some amount of it to disk, meaning it would be just as slow as a volume in this case?
Non-volume data is stored in a layered overlay filesystem (in most distributions, this will be either an AUFS or DeviceMapper filesystem). The principle is the same in both cases.
As already mentioned in comments, I can recommend reading the section "Understand images, containers and storage drivers" from the official documentation. This answer is just a short summary.
Each Docker image consists of multiple layers of filesystem images. For example, an Apache+PHP image might consist of (1) a generic Ubuntu base layer, (2) an additional layer with the Apache HTTP server installed and (3) another layer on top with PHP-FPM and configuration files (just an example).
When you start a new container from an image, a new per-container layer is added on top of the existing image layers. This layer will contain all changes that are written within the container itself (to non-volume directories).
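The per-container layer can be inspected from the host. The sketch below (the function name and the container name passed to it are hypothetical) prints the storage driver backing a container and then lists what its writable layer has changed relative to the image:

```shell
# Hypothetical container name passed as the first argument.
# Print the storage driver behind the container's layers, then list the
# paths its writable layer has added (A), changed (C), or deleted (D).
show_container_layer() {
  container=$1
  docker inspect --format '{{.GraphDriver.Name}}' "$container" &&
  docker diff "$container"
}
```

For example, `show_container_layer web`. Paths that live on a volume do not show up in the `docker diff` output, since they bypass the container layer.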
Regarding your specific questions:
If you don't mount any volumes, would it keep any/some/most/all of the container filesystem in RAM?
Nope, there's nothing in RAM (besides the usual filesystem caches). It's all in the overlay filesystem, which is mounted using AUFS, DeviceMapper or another storage driver.
Or does it write some amount of it to disk, meaning it would be just as slow as a volume in this case?
In general, filesystem access in volumes is more performant than in the overlay filesystem. After all, a volume (at least a regular host-based volume, leaving aside volume drivers that add network storage volumes) is simply a bind mount to a regular directory in the host filesystem, bypassing the layered filesystem entirely. The performance of volumes compared with the layered filesystem is (among other topics) investigated in this paper:
AUFS introduces significant overhead which is not surprising since I/O is going through several layers, [...]. Applications that are filesystem or disk intensive should bypass AUFS by using volumes. [...] Although containers themselves have almost no overhead, Docker is not without performance gotchas. Docker volumes have noticeably better performance than files stored in AUFS.
I have a Docker container which does a lot of reads and writes to disk. I would like to test what happens when my entire Docker filesystem is in memory. I have seen some answers here that say it will not be a real performance improvement, but this is for testing.
The ideal solution I would like to test is sharing the common parts of each image and copying them into memory when needed.
The files each container creates at runtime should also be in memory, kept separate per container. The filesystem shouldn't exceed 5 GB when idle and 7 GB during processing.
Simple solutions would duplicate all shared files (even those part of the OS you never use) for each container.
There's no difference between the storage of the image and the base filesystem of the container: the layered FS accesses the image layers directly as a RO layer, with the container using a RW layer above to catch any changes. Therefore your goal of having the container run in memory while the Docker installation remains on disk doesn't have an easy implementation.
If you know where your RW activity is occurring (it's fairly easy to check the docker diff of a running container), the best option to me would be a tmpfs mounted at that location in your container, which is natively supported by docker (from the docker run reference):
$ docker run -d --tmpfs /run:rw,noexec,nosuid,size=65536k my_image
Docker stores image, container, and volume data in its data directory (/var/lib/docker by default on Linux). A container's filesystem is made of the original image plus the 'container layer'.
You might be able to set this up using a RAM disk. You would allocate a fixed amount of RAM, mount it, and format it with your file system of choice. Then move your Docker installation to the mounted RAM disk and symlink it back to the original location.
Setting up a Ram Disk
Best way to move the Docker directory
Obviously this is only useful for testing, as Docker and its images, volumes, containers, etc. would be lost on reboot.
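A rough sketch of those steps follows, with several assumptions stated up front: a systemd host, Docker data in /var/lib/docker, and a tmpfs mount standing in for a formatted RAM disk (tmpfs needs no formatting, so that step is dropped). The function name, mount point, and size are hypothetical, and whether your storage driver behaves well on tmpfs should be checked before relying on this.

```shell
# Assumptions: systemd host, Docker data in /var/lib/docker, enough free RAM.
# tmpfs is used in place of a formatted RAM disk.
move_docker_to_ram() {
  ram_dir=$1  # hypothetical mount point, e.g. /mnt/docker-ram
  size=$2     # tmpfs size limit, e.g. 8g

  sudo systemctl stop docker &&
  sudo mkdir -p "$ram_dir" &&
  sudo mount -t tmpfs -o "size=$size" tmpfs "$ram_dir" &&
  # Copy the existing data, preserving ownership and permissions.
  sudo cp -a /var/lib/docker/. "$ram_dir"/ &&
  # Keep the on-disk copy so the change is easy to undo.
  sudo mv /var/lib/docker /var/lib/docker.on-disk &&
  sudo ln -s "$ram_dir" /var/lib/docker &&
  sudo systemctl start docker
}
```

For example, `move_docker_to_ram /mnt/docker-ram 8g`. To undo it, stop Docker, remove the symlink, and move /var/lib/docker.on-disk back into place.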
I have a very large file in my Docker container (it's a VirtualBox image) which, unfortunately, must be modified as part of running it. Docker's copy-on-write policy works against me here: any mutation or copy of the file takes about 10 minutes, compared to about 10 seconds to copy the same file on the host.
Can anything be done to speed up the creation/copy of very large files within a docker container? Note that this is an entirely transient file that I do not need to persist after the container is closed.
Declare the folder the file is in as a volume. If you do this, the copy-on-write policy is not applied. Note that you don't have to mount this volume from the host system; it is sufficient to declare it as a volume.
For more information: https://docs.docker.com/userguide/dockervolumes/
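As a small sketch of declaring such a volume at run time (the function name, directory, and image name are hypothetical), passing `-v` with only a container path creates an anonymous volume there, and `--rm` removes it along with the container, which suits a purely transient file:

```shell
# Hypothetical container directory and image name.
# -v with only a container path creates an anonymous volume at that path,
# so writes there bypass the copy-on-write layer. --rm removes the
# container and its anonymous volumes when the container exits.
run_with_scratch_volume() {
  scratch_dir=$1  # e.g. /vm, the directory holding the large file
  app_image=$2
  docker run --rm -v "$scratch_dir" "$app_image"
}
```

For example, `run_with_scratch_volume /vm my_image`. The same effect can be baked into the image with a `VOLUME` instruction in the Dockerfile.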
I've recently discovered docker. It looks very useful for us.
But what I don't understand is the role of the registry beyond getting initial docker images. We'll likely be starting with some images based on those from docker.io, but will be customizing those and adding some private closed source software.
My concern is that, if the images are large enough, I could run out of space on my / drive.
Can /var/lib/docker just be a mount to a shared file system like cephfs or nfs?
I'm also interested in using CoreOS in a PXE or iPXE configuration. It appears that in that scenario / is mounted as tmpfs using up to 50% of RAM, which is needlessly wasteful for pulling images that could be available on a shared file system. However, I've read comments that for some reason /var/lib/docker needs to be on btrfs. Is this true? Why?
OK, I've found an answer to my last question. CoreOS requires /var/lib/docker to be mounted on btrfs because it uses the btrfs backend. This backend uses btrfs snapshots to implement the layers Docker uses to represent its images.
That helps with my second question: can /var/lib/docker just be a mount to a shared file system? By the looks of it, no, not unless the super slow vfs backend is used.
It's easy and cheap to store your registry in S3.
I would recommend against mounting /var/lib/docker on nfs. If someone hammers the nfs, all your services will essentially stop working, since the file systems of the containers live there.
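Before moving anything, it may help to check where Docker currently keeps its data, which storage driver it uses, and how much space is left on that filesystem. A minimal sketch, with a hypothetical function name:

```shell
# Report Docker's data directory, its storage driver, and the free space
# on the filesystem that holds the data directory.
docker_storage_report() {
  root_dir=$(docker info --format '{{.DockerRootDir}}') || return 1
  echo "Docker root: $root_dir"
  docker info --format 'Storage driver: {{.Driver}}' &&
  df -h "$root_dir"
}
```

Running `docker_storage_report` on a host shows whether / is really the filesystem at risk of filling up, and which backend would be affected by relocating the directory.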