docker overlay2 increase size

I am very new to Docker, so please pardon me if I say anything stupid :P
I have Docker running on my cloud server and was running out of space because of the Docker overlay files. So I mounted 100GB of storage to the server at
/home/<user>/data
and in daemon.json configured the Docker root directory to point to this newly mounted storage, then copied all the old files over. But even after that, when I check
df -h
the overlay filesystem still shows a size of 36G. Am I doing something wrong?
How can I grow this overlay filesystem to fully utilize the storage?
PS: Also, when it starts filling up, it doesn't grow; it just fills up and all the apps stop working.

Docker stores images, containers, and volumes under /var/lib/docker by default. If you haven't mounted another filesystem there, you are likely looking at the free space on your root filesystem.
When mounting another filesystem in this location, you likely want to move the current directory aside first so you can copy its contents into the new filesystem. If you do restore the content, be sure to use a command that preserves ownership, permissions, and symlinks (I believe cp -a and tar both do this).
Also, make sure the Docker engine is not running when you replace this directory, and be sure the new filesystem type matches your current root filesystem type, or is at least compatible with your graph driver.
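A minimal sketch of that procedure, assuming the new filesystem is mounted at /home/<user>/data (the exact paths here are placeholders):

$ sudo systemctl stop docker
$ sudo cp -a /var/lib/docker /home/<user>/data/docker
$ cat /etc/docker/daemon.json
{
  "data-root": "/home/<user>/data/docker"
}
$ sudo systemctl start docker
$ docker info | grep 'Docker Root Dir'   # should now report the new location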

Related

Overlay a folder in Docker with one from the host

My situation is the following:
I have a Docker image/container in which I compile code. I had to install some components into $HOME via the Dockerfile (so while creating the image).
Let's say one of those components is in ~/.config, but there are other folders as well.
I would like to be able to override the files in .config by mounting a home folder from the host on top of the one inside the container: whenever you place a file in the mounted folder, it overrides the one that is already inside the container.
So in theory, this is exactly what OverlayFS does, right? The lower directory would be the one inside the Docker container, and the upper directory would be the one on my host.
Is there a way to accomplish that?
So far I have found the following related topics:
https://serverfault.com/questions/841238/how-to-use-overlayfs-with-docker-volumes
Drawback: The answer only shows how to use OverlayFS on the host, but getting access to the lower container/image directory is not self-explanatory and also feels dirty.
Can I mount docker host directory as copy on write/overlay?
Drawback: Using mount -t overlay inside Docker does not work on newer kernels, because mounting an overlay on top of an overlay is disabled.
I also thought about manipulating the Docker files on the host directly, i.e. the directories where Docker stores the files, but that feels a bit dirty.
To do so, I would declare VOLUME /home/user at the end of the Dockerfile. Then I would find the files of that directory in /var/lib/docker/volumes/user/_data. I could then create an OverlayFS on my host, using that directory as the lower directory and my other folder as the upper one, and mount the merged directory into the container using docker run --volume (see the sketch below). Unfortunately this would require root rights to access the /var/lib directory.
The other way around would be to bind-mount single files, but that's maybe a bit hackish too.
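A rough sketch of that host-side OverlayFS idea (the /home/me/* paths are hypothetical, and it still needs root):

$ sudo mount -t overlay overlay \
    -o lowerdir=/var/lib/docker/volumes/user/_data,upperdir=/home/me/upper,workdir=/home/me/work \
    /home/me/merged
$ docker run --volume /home/me/merged:/home/user my-image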

Does a Docker container maintain volume data?

This might come across as a stupid question, but I am unable to figure out something about Docker volumes. Going through the official documentation, I can see that we can map the host machine's filesystem onto the container for persistent storage. Following the instructions, I was able to successfully mount a folder on my container.
Once I exec bash into the container, I can see the mapped directory structure there as expected. My question is: how is the data mapped between these two paths, i.e. from the container to the mounted volume on the host OS? Is the data duplicated, or does the container store the data directly on the host volume, with the mapped paths shown as something like a symlink?
This question comes up because we are trying to maintain a large amount of data on a mounted disk that is accessible to the container, with the assumption that mounting the volume stores the data directly on the disk and nothing on the container.
The Docker documentation refers to this type of mount as a "bind mount"; that's also a technical Linux term for making one part of the filesystem appear somewhere else as well, and there's a mount --bind option you can use outside of Docker (usually a fairly specialized option).
On native Linux, the host content and the container-visible content are literally the exact same disk content. If you have a bind-mounted host directory or a named Docker volume mounted over a container directory, all reads and writes will use that mounted content, and in fact nothing will be written to the container filesystem on that path.
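A quick way to see this in action (the host path /srv/data is just an example):

$ docker run --rm -v /srv/data:/data busybox sh -c 'echo hello > /data/file'
$ cat /srv/data/file   # prints "hello": the write went straight to the host disk, not into the container layer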
You mention symlinks; these are always resolved as filenames in their respective filesystem space. If the mounted filesystem has a symlink passwd -> /etc/passwd, then reading it will yield the host's password file on the host and the container's password file inside the container. If it has a symlink f -> ../f, then it will look at the directory above the mount point in whichever filesystem it is being read from.
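For example (a sketch; the /tmp/shared path is hypothetical):

$ mkdir -p /tmp/shared && ln -s /etc/passwd /tmp/shared/passwd
$ docker run --rm -v /tmp/shared:/shared busybox cat /shared/passwd   # prints the container's /etc/passwd, not the host's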
On non-Linux hosts this is a little more technically complex, since there is typically a Linux virtual machine in the mix. This usually manifests as file synchronization appearing slow. For data you don't need to access directly as a human, storing it in a named Docker volume will usually be faster.

When to use an auto-mapped Docker data volume

What is the main purpose of a Docker data volume created by the -v option without a specified host path? For example docker run -v /data -ti my-image. The docs say it creates a new filesystem mapped to the host filesystem to persist data (at some random-ish location). I understand that. But containers also persist all their data when they are stopped and started again. So what is the difference between persisted data in a stopped container vs. a data volume?
I understand the use case for its more advanced usage of mapping a specific host directory with -v /data:/data/host.
Off the top of my head:
If you are planning on using docker commit at some point, then an ephemeral volume like that can be used to intentionally prevent some contents from getting committed to the new filesystem image, because the contents of volumes are not preserved as part of the image (see the sketch after this list).
If you will be generating a lot of temporary data and you are worried about filling up the root container filesystem, using a volume will give you more space, because your data won't be sharing space with operating-system files.
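A small sketch of the first point (the container name and image tag are placeholders):

$ docker run --name scratch -v /data -ti my-image sh -c 'echo temp > /data/file'
$ docker commit scratch my-image:snapshot
$ docker run --rm my-image:snapshot ls /data   # empty: the volume's contents were not committed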

How to copy /var/lib/docker with its overlayfs directory structure and data *as-is*, without the copy growing in size

I have a Docker installation with several images and about 150 GB of data in /var/lib/docker. This setup uses overlayfs as its storage driver. There are several directories, one per layer, under /var/lib/docker/overlay holding the actual data. The partition size is 160G.
My requirement is to copy the docker directory from /var/lib/docker to a new 1TB disk, so that I can point Docker at this new partition and continue to use my old images.
Now the problem is, when I use rsync or cp with -a to copy /var/lib/docker to the new partition, instead of the 150G of actual data, the total copied data comes to as much as 600G (and counting...).
Docker is stopped as well, so I am not sure how the OS sees 160G of data yet copies 600G+. I hope it is not overlayfs (merged directories) at work; there is no overlay information in df -aTh, nor did unloading the kernel overlayfs driver with rmmod overlay help.
How is it possible to copy this data as-is, without any expansion/merging taking place?
It turned out that Docker uses hardlinks within those directories under /var/lib/docker/overlay. Using -H with rsync (copy hardlinks as hardlinks) solved the issue:
rsync -avPHSX /var/lib/docker /new/partition/
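To confirm that hardlinks are the cause, you can look for files with a link count above one before copying:

$ find /var/lib/docker/overlay -type f -links +1 | head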

Docker: in-memory filesystem

I have a Docker container that does a lot of reading from and writing to disk. I would like to test what happens when my entire Docker filesystem is in memory. I have seen some answers here saying it will not be a real performance improvement, but this is for testing.
The ideal solution I would like to test shares the common parts of each image and copies them into memory only when needed.
Files each container creates during runtime should be in memory as well, and kept separate. It shouldn't be more than a 5GB filesystem at idle time and up to 7GB at processing time.
Simple solutions would duplicate all shared files (even those parts of the OS you never use) for each container.
There's no difference between the storage of the image and the base filesystem of the container: the layered FS accesses the image layers directly as read-only layers, with the container using a read-write layer on top to catch any changes. Therefore your goal of having the container run in memory while the Docker installation remains on disk doesn't have an easy implementation.
If you know where your RW activity is occurring (it's fairly easy to check with docker diff on a running container), the best option to me would be a tmpfs mounted at that location in your container, which is natively supported by Docker (from the docker run reference):
$ docker run -d --tmpfs /run:rw,noexec,nosuid,size=65536k my_image
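To find where that RW activity is happening in the first place (the container name is a placeholder):

$ docker diff my_container   # lists files added (A), changed (C), or deleted (D) in the container's RW layer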
Docker stores image, container, and volume data in its data directory by default. A container's filesystem is made up of the original image plus the 'container layer' on top.
You might be able to set this up using a RAM disk. You would hard-allocate some RAM, mount it, and format it with your filesystem of choice. Then move your Docker installation to the mounted RAM disk and symlink it back to the original location.
Setting up a Ram Disk
Best way to move the Docker directory
Obviously this is only useful for testing, as Docker and its images, volumes, containers, etc. would be lost on reboot.
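A rough sketch of that approach, using tmpfs instead of a formatted RAM block device (the size and paths are just examples, and everything here disappears on reboot):

$ sudo systemctl stop docker
$ sudo mkdir /mnt/docker-ram
$ sudo mount -t tmpfs -o size=8g tmpfs /mnt/docker-ram
$ sudo rsync -aH /var/lib/docker/ /mnt/docker-ram/
$ sudo mv /var/lib/docker /var/lib/docker.bak
$ sudo ln -s /mnt/docker-ram /var/lib/docker
$ sudo systemctl start docker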
