I have a docker container which does a lot of reads and writes to disk. I would like to test what happens when my entire docker filesystem is in memory. I have seen some answers here that say it will not be a real performance improvement, but this is for testing.
The ideal solution I would like to test shares the common parts of each image and copies them into memory only when needed.
Files that each container creates at runtime should be in memory as well, and kept separate per container. The filesystem shouldn't take more than 5GB when idle and up to 7GB during processing.
Simple solutions would duplicate all shared files (even those parts of the OS you never use) for each container.
There's no difference between the storage of the image and the base filesystem of the container: the layered FS accesses the image's layers directly as RO layers, with the container using a RW layer above to catch any changes. Therefore your goal of having the container running in memory while the Docker installation remains on disk doesn't have an easy implementation.
If you know where your RW activity is occurring (it's fairly easy to check the docker diff of a running container), the best option to me would be a tmpfs mounted at that location in your container, which is natively supported by docker (from the docker run reference):
$ docker run -d --tmpfs /run:rw,noexec,nosuid,size=65536k my_image
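To decide where a tmpfs mount would help, the output of docker diff can be grouped by top-level directory. What follows is only a minimal sketch: summarize_diff is a hypothetical helper name, the sample input is made up, and it assumes the usual docker diff format of a change letter (A, C or D) followed by a path.

```shell
#!/bin/sh
# summarize_diff: read `docker diff` output on stdin and count changed
# paths per top-level directory, most-changed first.
# (Hypothetical helper; assumes lines look like "A /run/app.pid".)
summarize_diff() {
    awk '{
        # $2 is the path; splitting on "/" puts the top-level name in parts[2]
        n = split($2, parts, "/")
        if (n >= 2 && parts[2] != "") count["/" parts[2]]++
    }
    END { for (d in count) print count[d], d }' | sort -rn
}

# Made-up sample standing in for the output of: docker diff <container>
sample='C /run
A /run/app.pid
A /run/lock/app.lock
C /tmp
A /tmp/cache.bin'

summary=$(printf '%s\n' "$sample" | summarize_diff)
printf '%s\n' "$summary"
```

Against a real container this would be used as docker diff my_container | summarize_diff, where my_container is a placeholder name. The directories at the top of the list are the likeliest candidates for --tmpfs.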
Docker stores image, container, and volume data in its data directory (/var/lib/docker) by default. A container's filesystem is made of the original image and the 'container layer'.
You might be able to set this up using a RAM disk. You would hard allocate some RAM, format it with your file system of choice, and mount it. Then move your docker installation to the mounted RAM disk and symlink it back to the original location.
Setting up a Ram Disk
Best way to move the Docker directory
Obviously this is only useful for testing, as Docker and its images, volumes, containers, etc. would be lost on reboot.
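The steps above could be sketched roughly as follows. This is a dry run that only prints the commands unless DRY_RUN=0 is set, and several things in it are assumptions rather than part of the original answer: a systemd host, the brd kernel module providing /dev/ram0, ext4 as the filesystem, the mount point /mnt/docker-ram, and a 7 GiB size taken from the upper bound in the question. The existing Docker data must fit into the RAM disk for the copy to succeed.

```shell
#!/bin/sh
# Dry-run sketch: prints each command instead of running it unless
# DRY_RUN=0 is set. Paths, sizes and tools below are assumptions.
DRY_RUN=${DRY_RUN:-1}
RAM_DEV=/dev/ram0                     # first brd RAM disk (assumed)
RAM_MOUNT=/mnt/docker-ram             # hypothetical mount point
DOCKER_DIR=/var/lib/docker            # Docker's default data directory
RAM_SIZE_KIB=$((7 * 1024 * 1024))     # 7 GiB, expressed in KiB for brd

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

plan() {
    run systemctl stop docker
    run modprobe brd rd_nr=1 "rd_size=$RAM_SIZE_KIB"
    run mkfs.ext4 "$RAM_DEV"
    run mkdir -p "$RAM_MOUNT"
    run mount "$RAM_DEV" "$RAM_MOUNT"
    # -a preserves ownership, permissions and symlinks
    run cp -a "$DOCKER_DIR/." "$RAM_MOUNT/"
    # keep the on-disk copy, then point the old path at the RAM disk
    run mv "$DOCKER_DIR" "${DOCKER_DIR}.disk"
    run ln -s "$RAM_MOUNT" "$DOCKER_DIR"
    run systemctl start docker
}

plan_output=$(plan)
printf '%s\n' "$plan_output"
```

Printing the plan first makes it easy to review before anything touches the real Docker directory. Keeping the original as /var/lib/docker.disk leaves a way back after the test.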
I am very new to docker so please pardon if anything stupid :P
I have docker running on my cloud server and was running out of space because of docker overlay files. So I mounted 100GB of storage on the server at
/home/<user>/data
and in daemon.json I configured the docker root directory to point to this newly mounted storage and copied over all the old files. Even after that, when I check
df -h
the overlay filesystem still shows a size of 36G. Am I doing something wrong?
How can I increase this overlay to fully utilize the storage?
PS: Also, when it starts filling up, it doesn't grow; it just fills up and all the apps stop working.
Docker stores images, containers, and volumes under /var/lib/docker by default. If you haven't mounted another filesystem there, you are likely looking at the free space on your root filesystem.
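One way to check which filesystem actually backs a directory is to ask df about that path directly. This is a small sketch; backing_mount is a hypothetical helper name, and it assumes mount points without spaces in their names.

```shell
#!/bin/sh
# backing_mount: print the mount point of the filesystem that holds
# the given path (hypothetical helper built on POSIX `df -P`).
backing_mount() {
    # line 2 of `df -P` describes the filesystem; the last field is
    # its mount point
    df -P "$1" | awk 'NR == 2 { print $NF }'
}

docker_dir=/var/lib/docker   # Docker's default data directory
if [ -d "$docker_dir" ]; then
    printf '%s is stored on the filesystem mounted at %s\n' \
        "$docker_dir" "$(backing_mount "$docker_dir")"
fi

root_mount=$(backing_mount /)
printf 'root filesystem is mounted at %s\n' "$root_mount"
```

If the Docker directory reports the same mount point as the root filesystem, the new storage is not the one Docker is writing to.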
When mounting another filesystem in this location, you likely want to move the current directory aside so you can copy it into the new filesystem. If you do restore the content, be sure to use a command that preserves ownership, permissions, and symlinks (I believe cp -a and tar both do this).
Also, make sure the docker engine is not running when you replace this directory, and be sure the filesystem type matches your current root filesystem type, or is compatible with your graph driver.
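To illustrate what preserving permissions and symlinks means in practice, here is a small self-contained demonstration in a temporary directory. The directory and file names are made up and merely stand in for a Docker data directory; ownership is only preserved when the copy runs as root, which this sketch does not require.

```shell
#!/bin/sh
# Build a tiny stand-in for a Docker data directory in a temp location.
work=$(mktemp -d)
mkdir -p "$work/src/overlay2"
printf 'layer data\n' > "$work/src/overlay2/file"
chmod 600 "$work/src/overlay2/file"
ln -s overlay2/file "$work/src/link"

# -a (archive) keeps symlinks as symlinks and preserves file modes;
# the trailing "/." copies the directory contents, including dotfiles.
mkdir "$work/dst"
cp -a "$work/src/." "$work/dst/"

# Same idea with tar: -p preserves permissions on extraction.
mkdir "$work/dst_tar"
tar -C "$work/src" -cf - . | tar -C "$work/dst_tar" -xpf -

# Both listings should show the 600 mode and "link" as a symlink.
ls -lR "$work/dst" "$work/dst_tar"
```

A plain cp -r would typically follow or flatten some of this metadata depending on the platform, which is why an archive-style copy is the safer choice for this directory.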
I am running a docker compose setup on an AWS EC2 instance with three docker containers.
After a few weeks of running my docker images, the size of the /containers dir has increased quite a bit:
8,1G /var/lib/docker/containers
0 /var/lib/docker/plugins
3,1G /var/lib/docker/overlay2
When I stop all my images and remove them and the containers and restart my docker images it looks like this:
96K /var/lib/docker/containers
0 /var/lib/docker/plugins
3,1G /var/lib/docker/overlay2
A docker image prune --all did not free anything.
So how can I prevent /var/lib/docker/containers from growing that much?
This happens because you are writing data into the container itself. You should write data to an external volume. Everything you write inside the container lands in the container's writable layer, on top of the image's read-only layers.
After a while, /var/lib/docker/containers will have collected a lot of changed and newly written files, and it will keep growing.
When you remove the container, its writable layer is deleted, and you are back to the original state of the image as it was built.
Quote:
Containers that write a lot of data consume more space than containers
that do not. This is because most write operations consume new space
in the container’s thin writable top layer.
Note: for write-heavy applications, you should not store the data in the container. Instead, use Docker volumes, which are independent
of the running container and are designed to be efficient for I/O. In
addition, volumes can be shared among containers and do not increase
the size of your container’s writable layer.
Reference: https://docs.docker.com/storage/storagedriver/
I understand Docker volumes and the way they refer to directories on the host. What about the rest of the filesystem within a container?
To think about it a different way: suppose you have a server with most of the storage on a remote drive, meaning reads and writes take longer than usual. If you don't mount any volumes, would it keep any/some/most/all of the container filesystem in RAM? Or does it write some amount of it to disk, meaning it would be just as slow as a volume in this case?
Non-volume data is stored in a layered overlay filesystem (in most distributions, this will be either an AUFS or DeviceMapper filesystem). The principle is the same in both cases.
As already mentioned in comments, I can recommend reading the section "Understand images, containers and storage drivers" from the official documentation. This answer is just a short summary.
Each Docker image consists of multiple layers of filesystem images. For example, an Apache+PHP image might consist of (1) a generic Ubuntu base layer, (2) an additional layer with the Apache HTTP server installed and (3) another layer on top with PHP-FPM and configuration files (just an example).
When you start a new container from an image, a new per-container layer is added to the existing image layers. This layer will contain all changes that are written within the container itself (to non-volume directories).
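The layering can be imitated with two plain directories. The following is a toy model only, not how AUFS, DeviceMapper or any real storage driver is implemented; layer_read and layer_append are hypothetical helper names. It shows the two rules that matter: reads prefer the per-container layer, and the first write to an image file copies the whole file up before changing it.

```shell
#!/bin/sh
# Toy model of a layered filesystem: "lower" is the read-only image
# layer, "upper" is the per-container writable layer.
root=$(mktemp -d)
mkdir -p "$root/lower" "$root/upper"
printf 'from image\n' > "$root/lower/app.conf"

# layer_read: the upper layer wins; otherwise fall through to the image.
layer_read() {
    if [ -f "$root/upper/$1" ]; then
        cat "$root/upper/$1"
    else
        cat "$root/lower/$1"
    fi
}

# layer_append: copy the whole file up on first write, then modify
# the copy. The image layer is never touched.
layer_append() {
    if [ ! -f "$root/upper/$1" ] && [ -f "$root/lower/$1" ]; then
        cp "$root/lower/$1" "$root/upper/$1"
    fi
    printf '%s\n' "$2" >> "$root/upper/$1"
}

before=$(layer_read app.conf)
layer_append app.conf 'changed in container'
after=$(layer_read app.conf)
printf 'before: %s\nafter:\n%s\n' "$before" "$after"
```

The copy-up step is also why modifying a very large file inside a container can be slow: in this model the entire file is duplicated into the writable layer before the first change is applied.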
Regarding your specific questions:
If you don't mount any volumes, would it keep any/some/most/all of the container filesystem in RAM?
Nope, there's nothing in RAM (besides the usual filesystem caches). It's all in the overlay filesystem, which is mounted using AUFS, DeviceMapper or another storage driver.
Or does it write some amount of it to disk, meaning it would be just as slow as a volume in this case?
In general, filesystem access in volumes is faster than in the overlay filesystem. After all, a volume (at least a regular host-based volume, leaving aside volume drivers that add network storage volumes) is simply a bind mount to a regular directory in the host filesystem, bypassing the layered filesystem entirely. The performance of volumes in comparison to the layered filesystem is (among other topics) investigated in this paper:
AUFS introduces significant overhead which is not surprising since I/O is going through several layers, [...]. Applications that are filesystem or disk intensive should bypass AUFS by using volumes. [...] Although containers themselves have almost no overhead, Docker is not without performance gotchas. Docker volumes have noticeably better performance than files stored in AUFS.
I have a very large file in my docker container (it's a VirtualBox image) which, unfortunately, must be modified as part of running it. Docker's copy-on-write policy works against me here: any mutation or copying of the file takes about 10 minutes, compared to about 10 seconds to copy the same file on the host.
Can anything be done to speed up the creation/copy of very large files within a docker container? Note that this is an entirely transient file that I do not need to persist after the container is closed.
Declare the folder the file is in as a volume. If you do this, the copy-on-write policy is not applied. Note that you don't have to mount this volume from the host system; it is sufficient to declare it as a volume.
For more information: https://docs.docker.com/userguide/dockervolumes/
Assume I am starting a large number of docker containers which are based on the same docker image. This means that each docker container is running the same application. The application could be big enough to require a lot of disk space.
How is docker dealing with it?
Do all docker containers share the static part defined in the docker image?
If not, does it make sense to copy the application into some directory on the machine which is used to run docker containers and to mount this app directory for each docker container?
Docker shares resources at kernel level. This means application logic is never replicated when it is run. If you start notepad 1000 times it is still stored only once on your hard disk, and the same goes for docker instances.
If you run 100 instances of the same docker image, all you really do is keep the state of the same piece of software in your RAM in 100 different separated timelines. The host's processor(s) shift the in-memory state of each of these container instances against the software controlling it, so you DO consume 100 times the RAM required for running the application.
There is no point in physically storing the exact same byte-code for the software 100 times because this part of the application is always static and will never change. (Unless you write some crazy self-altering piece of software, or you choose to rebuild and redeploy your container's image)
This is why containers don't allow persistence out of the box, and how docker differs from regular VMs that use virtual hard disks. However, this is only true for persistence inside the container. The files that are changed by docker software on the hard disk are "mounted" into containers using docker volumes and thus aren't really part of the docker environments, but just mounted into them. (Read more about this at: https://docs.docker.com/userguide/dockervolumes/)
Another question that you might want to ask when you think about this is how docker stores the changes that it makes to its disk at runtime. What is really sweet to check out is how docker actually manages to get this working. The original state of the container's hard disk is what is given to it from the image. It can NOT write to this image. Instead of writing to the image, a diff is made of what has changed in the container's internal state in comparison to what is in the docker image.
Docker uses a technology called "Union Filesystem", which creates a diff layer on top of the initial state of the docker image.
This "diff" (the container's writable layer) is stored on the host's disk alongside the image layers, not in RAM, and disappears when you delete your container. (Unless you use the command "docker commit"; however, I don't recommend this. The state of your new docker image is not represented in a Dockerfile and cannot easily be regenerated from a rebuild.)