Disk size of a docker image - docker

I'm new to the Docker world and come from a virtual machine mindset. I downloaded a Docker image for Elasticsearch from Docker Hub. I'm thinking about what configuration I need to do, because a lot of data will be forwarded to this image and I need to be mindful of the available disk space. In a virtual machine world, I can always add additional VHDs to increase the disk size. What's the similar operation in the Docker world?

A Docker container uses copy-on-write storage drivers such as aufs, btrfs, ... to manage the container layers. All writes done by a container are persisted in the top read-write layer. These layers (especially the write layer) are what determine a container's size.
With the devicemapper storage driver, there is a size limit on the container known as the base device size. The default value is 10GB. This value can be raised to give the container more space using dockerd --storage-opt dm.basesize=50G. This will give the container a rootFS size of 50GB.
However, this is not the recommended way to handle heavy write operations that increase the container size. The recommended way to do that is to use Docker volumes. Volumes are not persisted within the Docker local storage area for container layers (i.e. /var/lib/docker/<storage-driver>/...), and are thus independent of the container's storage driver. Therefore, they do not contribute to the size of the container.
There is no limit on the number of volumes a container can have. It is recommended to map directories inside the container that will grow in size to volumes.
For more info about storage drivers, see "About storage drivers" in the Docker documentation.
Note: for write-heavy applications, you should not store the data in
the container. Instead, use Docker volumes, which are independent of
the running container and are designed to be efficient for I/O. In
addition, volumes can be shared among containers and do not increase
the size of your container’s writable layer.
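As a rough illustration of mapping a growing directory to a volume, here is a minimal Python sketch that only builds the docker run command line (it does not execute it). The volume name esdata, the container name es, and the data path /usr/share/elasticsearch/data are assumptions for the example, not details from the question; check the image's documentation for the directory it actually writes to.

```python
import shlex


def docker_run_with_volume(image, volume, container_path, name=None):
    """Build a `docker run` argv that mounts a named volume at container_path."""
    argv = ["docker", "run", "-d"]
    if name:
        argv += ["--name", name]
    # -v <volume>:<path> keeps writes to <path> out of the container's writable layer.
    argv += ["-v", f"{volume}:{container_path}", image]
    return argv


# Hypothetical names: "esdata", "es" and the data path are assumptions.
# The image name is a placeholder; append the tag you actually pulled.
cmd = docker_run_with_volume(
    "elasticsearch", "esdata", "/usr/share/elasticsearch/data", name="es"
)
print(shlex.join(cmd))
```

Building the argv as a list makes it easy to pass to subprocess.run later without shell quoting problems.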

Use volumes mounted to the container for storage. That way the size of your containers does not become too big.
https://docs.docker.com/storage/volumes/

Related

Does Docker's mount option -v mean that there are two identical data files in Linux at the same time?

In Docker, mounting a volume maps a directory in the container to a directory on the host, so that container data persists on the host. Should I understand this as the container's data being an extension of the Linux host's storage, with the mounted host directory and the container directory holding the same data and occupying a single space on Linux? If there is only one file, will it take up space twice while the container is running, once on the host and once in the container?
From https://docs.docker.com/storage/volumes/ :
In addition, volumes are often a better choice than persisting data in a container’s writable layer, because a volume does not increase the size of the containers using it, and the volume’s contents exist outside the lifecycle of a given container.

Bandwidth and Disk space for Docker container

Does a Docker container get the same bandwidth as the host? Or do we need to configure a min and/or max? I've noticed that we need to override the default RAM (which is 2 GB) and swap space configuration if we need to run CPU-intensive jobs.
Also, do we need to configure the disk space? Or does it by default get as much space as the actual hard disk?
Memory and CPU are controlled using cgroups by docker. If you do not configure these, they are unrestricted and can use all of the memory and CPU on the docker host. If you run in a VM, which includes all Docker for Desktop installs, then you will be limited to that VM's resources.
Disk space is usually limited to the disk space available in /var/lib/docker. For that reason, many make this a different mount. If you use devicemapper for Docker's graph driver (this has been largely deprecated), it creates preallocated blocks of disk space, and you can control that block size. You can restrict containers by running them with read-only root filesystems and mounting volumes into the container that have limited disk space. I've seen this done with loopback device mounts, but it requires some configuration outside of Docker to set up the loopback device. With a VM, you will again be limited by the disk space allocated to that VM.
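To see how much room that filesystem actually has, a minimal Python sketch using only the standard library could look like the following. It assumes the default data directory /var/lib/docker; if your daemon uses a different data root, pass that path instead. The function name is made up for this example.

```python
import os
import shutil


def free_space_report(path="/var/lib/docker"):
    """Return (total, used, free) in GiB for the filesystem that holds `path`."""
    # Walk up to the nearest existing directory so the sketch also runs on
    # machines where Docker is not installed.
    probe = os.path.abspath(path)
    while not os.path.exists(probe):
        probe = os.path.dirname(probe)
    usage = shutil.disk_usage(probe)
    gib = 1024 ** 3
    return tuple(round(v / gib, 2) for v in (usage.total, usage.used, usage.free))


total, used, free = free_space_report()
print(f"total={total} GiB used={used} GiB free={free} GiB")
```

If /var/lib/docker is its own mount, the numbers reflect only that mount; otherwise they reflect whatever filesystem contains it.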
Network bandwidth is by default unlimited. I have seen an interesting project called docker-tc which monitors containers for their labels and updates bandwidth settings for a container using tc (traffic control).
Does a Docker container get the same bandwidth as the host?
Yes. There is no limit imposed on network utilization. You could maybe impose limits using a bridge network.
Also, do we need to configure the disk space? Or does it by default get as much space as the actual hard disk?
It depends on which storage driver you're using, because each has its own options. For example, devicemapper uses 10G by default but can be configured to use more. The recommended driver now is overlay2. To configure a limit, start the Docker daemon with the overlay2.size storage option.
This depends some on what your host system is and how old it is.
In all cases network bandwidth isn't explicitly limited or allocated between the host and containers; a container can do as much network I/O as it wants up to the host's limitations.
On current native Linux there isn't a desktop application and docker info will say something like Storage driver: overlay2 (overlay and aufs are good here too). There are no special limitations on memory, CPU, or disk usage; in all cases a container can use up to the full physical host resources, unless limited with a docker run option.
On older native Linux there isn't a desktop application and docker info says Storage driver: devicemapper. (Consider upgrading your host!) All containers and images are stored in a separate filesystem and the size of that is limited (it is included in the docker info output); named volumes and host bind mounts live outside this space. Again, memory and CPU are not intrinsically limited.
Docker Toolbox and Docker for Mac both use virtual machines to provide a Linux kernel to non-Linux hosts. If you see a "memory" slider you are probably using a solution like this. Disk use for containers, images, and named volumes is limited to the VM capacity, along with memory and CPU. Host bind mounts generally get passed through to the host system.
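The "docker run option" limits mentioned above can be sketched as follows. This Python snippet only assembles the command line; --memory and --cpus are existing docker run flags, while the function name and the example values are made up for illustration.

```python
def docker_run_limited(image, memory=None, cpus=None):
    """Build a `docker run` argv with optional memory and CPU limits."""
    argv = ["docker", "run", "-d"]
    if memory is not None:
        # e.g. "2g"; without this flag the container may use all host memory.
        argv += ["--memory", memory]
    if cpus is not None:
        # e.g. 1.5 means at most one and a half CPUs' worth of time.
        argv += ["--cpus", str(cpus)]
    argv.append(image)
    return argv


# Example values are arbitrary.
limited = docker_run_limited("nginx", memory="2g", cpus=1.5)
print(" ".join(limited))
```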

Container storage in container optimized OS images

I'm trying to build a Container-Optimized VM in Google Cloud to host a Docker container. This Docker container needs storage, but the optimized container VM images have almost no writable storage. I then created a persistent disk to attach to the VM and mount in the container, but the VM's /etc is also read-only, so I'm unable to write to fstab or mount the disk anywhere in the filesystem. How is this supposed to be accomplished in a VM that is designed specifically to host Docker containers?
The storage space in the instances is independent of the image used.
You can change the boot disk size at creation time or later. This will give you more storage space in the instance.
If you want to use Kubernetes Engine, it is also possible to change the boot disk size at creation time.

Using different raid partitions with docker

I have a Linux (Ubuntu) machine with a partition on an SSD RAID and a partition on an HDD RAID. I want to put my Docker containers with high traffic (like a database) on the SSD part and the other containers on the cheaper HDD part. I can't find an answer here or on other sites. Is this possible?
Docker itself doesn't provide that level of control over Docker storage on a per container basis.
You can use the devicemapper storage driver and use a specific raid logical volume for the container file systems. There's no way to choose between multiple storage devices at container run time, or via some policy.
Docker does have volumes that can be added to a container, and volume plugins to use different storage backends for volumes. These can be controlled on a per-container basis.
There is an LVM volume plugin. You could assign SSDs to an LVM volume group and mount data volumes from that in any container where you want the extra write performance.
Another option would be to run multiple Docker daemons, one with each storage configuration, but that would be difficult to maintain.
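A simpler variant of the per-container idea, swapped in here for the LVM plugin, is to bind-mount a host directory that already lives on the array you want. The Python sketch below only builds the command lines. The mount points /mnt/ssd and /mnt/hdd, the images, and the container paths are all hypothetical, and the host directories must exist before the containers start.

```python
def run_on_host_dir(image, host_dir, container_path):
    """Build a `docker run` argv that bind-mounts host_dir at container_path."""
    mount = f"type=bind,source={host_dir},target={container_path}"
    return ["docker", "run", "-d", "--mount", mount, image]


# Hypothetical layout: /mnt/ssd is on the SSD array, /mnt/hdd on the HDD array.
fast = run_on_host_dir("postgres", "/mnt/ssd/pgdata", "/var/lib/postgresql/data")
slow = run_on_host_dir("nginx", "/mnt/hdd/www", "/usr/share/nginx/html")
print(" ".join(fast))
print(" ".join(slow))
```

Only the mounted data directory lands on the chosen array; the container's own layers still live under the daemon's storage area.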

When does a running Docker container run out of disk space?

I've read through so much documentation, and I'm still not sure how this really works. It's a bit of a Docker vs. VM question.
If I start a VM with a 2GB hard drive and fill its disk with files, I know it runs out after 2GB of files.
Does Docker work the same way? I would assume so. But from what I've read about "UnionFS" it seems like it does not run out of space.
So then why do Docker "volumes" exist? Is that automagically expanding Docker disk space transient in some way? Will the files I've saved inside of my Docker container disappear after a reboot? How about after restarting the container?
Docker's usage (1.12+) depends on the Docker storage driver and possibly the physical file system in use.
TL;DR Storage will be shared between all containers and local volumes unless you are using the devicemapper storage driver or have set a limit via docker run --storage-opt size=X when running on the zfs or btrfs drivers. Docker 1.13+ also supports a quota size with overlay2 on an xfs backed file system.
Containers
For all storage drivers except devicemapper, the container and local volume storage is limited by the underlying file system hosting /var/lib/docker and its subdirectories. A container can fill the shared file system, and then other containers can't write any more.
When using the devicemapper driver, a default volume size of 10G is "thin allocated" for each container. The default size can be overridden with the daemon option --storage-opt dm.basesize or set on a per-container basis with docker run --storage-opt size=2G.
The same per container quota support is available for the zfs and btrfs drivers as both file systems provide simple built in support for creating volumes with a size or quota.
The overlay2 storage driver on xfs supports per-container quotas as of Docker 1.13. This will probably be extended to ext4 when new 4.5+ kernels become standard/common and ext4 and xfs quotas share a common API.
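The per-container quota described above can be sketched as a command line. This Python snippet only builds the argv; whether the daemon accepts it depends on the storage driver and backing file system as explained in the text. The function name and the example image are made up for illustration.

```python
def docker_run_with_quota(image, size):
    """Build a `docker run` argv that requests a root filesystem quota."""
    # --storage-opt size=<N> is only honoured by drivers that support quotas.
    return ["docker", "run", "-d", "--storage-opt", f"size={size}", image]


# Example value taken from the answer text: a 2G quota.
quota_cmd = docker_run_with_quota("ubuntu", "2G")
print(" ".join(quota_cmd))
```

On a driver without quota support, docker run is expected to reject the option rather than silently ignore it, so a failure here is a useful signal about the host's configuration.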
Volumes
Docker volumes are separate from a container and can be viewed as a persistent storage area for an ephemeral container.
Volumes are stored separately from Docker storage and have their own plugins for different backends. local is the default backend, which writes data to /var/lib/docker/volumes, so it is held outside of the container's storage and any quota system.
Other volume plugins could be used if you wanted to set per volume limits on a local file system that supports it.
Containers will keep their own file state over a container restart and reboot, until you docker rm the container. Files in a volume will survive a container removal and can be mounted on creation of the new container.

Resources