I'm fairly new to Docker and Docker Compose, and I recently switched back to Ubuntu after a year or so of using OS X.
I am working with some docker-compose projects that are quite resource-consuming, and when configuring the environment on Ubuntu I stumbled across a problem: with Docker for Mac (https://docs.docker.com/docker-for-mac/) you can specify maximum resource allocation (disk space, memory, CPU) for the whole engine, so to speak, in the Docker app, but on Ubuntu I couldn't find such a setting anywhere.
I saw that there is a way to do this for a specific container, but what if I want to, say, allow a maximum of 6 GB of RAM for ALL containers combined? Is there a way to do this that I'm not seeing?
Thanks a lot!
You need to set up a cgroup with limited CPU and memory and point the Docker engine at it.
Example cgroup config in /etc/systemd/system/my_docker_slice.slice:
[Unit]
Description=my cgroup for Docker
Before=slices.target
[Slice]
MemoryAccounting=true
MemoryHigh=2G
MemoryMax=2.5G
CPUAccounting=true
CPUQuota=50%
Then update your Docker daemon.json in /etc/docker/:
{
"cgroup-parent": "/my_docker_slice.slice"
}
Note:
If the cgroup has a leading forward slash (/), the cgroup is created under the root cgroup, otherwise the cgroup is created under the daemon cgroup.
You can read more by searching for "Default cgroup parent" here.
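A quick way to check that the slice is actually being used (a rough sketch, assuming the slice file above and the systemd cgroup driver; the container name and image are just placeholders):
sudo systemctl daemon-reload              # pick up the new slice unit
sudo systemctl restart docker             # apply the daemon.json change
docker run -d --rm --name cgtest alpine sleep 300
systemctl status my_docker_slice.slice    # the container's scope should appear under this slice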
When trying to remove a Docker container (for example when running docker-compose down) I always get these errors:
ERROR: for <my_container> container d8424f80ef124c2f3dd8f22a8fe8273f294e8e63954e7f318db93993458bac27: driver "overlay2" failed to remove root filesystem: unlinkat /var/lib/docker/overlay2/64311f2ee553a5d42291afa316db7aa392a29687ffa61971b4454a9be026b3c4/merged: device or resource busy
Common advice like restarting the Docker service, pruning, or force-removing the containers doesn't work. The only thing I found that works is manually unmounting with sudo umount /home/virtfs/xxxxx/var/lib/docker/overlay2/<container_id>/merged, after which I'm able to remove the container.
My OS is CentOS Linux release 7.9.2009 (Core) with kernel version 3.10.0-1127.19.1.el7.x86_64. I thought this was maybe due to overlay2 clashing with CentOS, but according to this page my CentOS/kernel version should work. It would be great to find a solution to this, because ideally I want to docker-compose down without having to use elevated privileges to umount beforehand.
From the error log, it looks like a filesystem mount on the container's overlay2 directory is still in place.
Run the following commands to find the processes that still hold the mount:
grep 64311f2ee553a5d42291afa316db7aa392a29687ffa61971b4454a9be026b3c4 /proc/*/mountinfo
ps -ef | grep <PID obtained from the grep above>
Stop the process that is holding the mount
Then delete the container
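For example, a rough sketch using the overlay2 ID from the error above (the PID 12345 is purely hypothetical; use whatever the grep turns up):
grep -l 64311f2ee553a5d42291afa316db7aa392a29687ffa61971b4454a9be026b3c4 /proc/*/mountinfo
# suppose it prints /proc/12345/mountinfo
ps -ef | grep 12345        # identify what process 12345 is
kill 12345                 # or stop its owning service cleanly
docker rm <my_container>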
Often this happens when no processes are listed as holding the mount; in that case you know it's a kernel module blocking it.
The most likely culprits are NFS (not sure why you'd run that in Docker), or files inside Docker that are bind-mounted, sometimes automatic ones, such as perhaps those created by systemd-networkd.
Overlay2 was phased out by Ubuntu for a reason. CentOS is at its end of life, so this problem might already be resolved on your most likely upgrade path, Rocky Linux. Alternatively, you can enter the jungle of migrating your Docker storage driver.
Or you can just get rid of the package or software hogging it in the first place, if you can. But you'll have to share more info on what it is for help on that.
I have a host with 8 CPU cores and 16 GB of RAM. We use cgroups to allocate CPU and memory for our custom application. We are trying to create a static partition of resources between our custom application and Docker. For example, we are trying to allocate the following:
4 CPU cores / 8 GB RAM --> docker
3 CPU cores / 6 GB RAM --> custom_app_1
the remaining for OS
We have managed to perform the segregation for custom_app_1. The question is how to set a default memory and CPU limit for our containers without having to use the --memory or --cpus flags on each individual container. I don't need to limit each container, but I need to make sure that all containers running on the host together cannot exceed 8 GB of RAM and 4 CPU cores; otherwise they will be fighting for resources with my custom_app_1.
When I run docker stats, each container sees 16 GB of RAM. How do I configure things so that they only see 8 GB of RAM and 4 CPU cores instead?
What you need to do is create a systemd slice for the memory limit.
# /etc/systemd/system/limit-docker-memory.slice
[Unit]
Description=Slice with MemoryLimit=8G for docker
Before=slices.target
[Slice]
MemoryAccounting=true
MemoryLimit=8G
Then configure that slice in /etc/docker/daemon.json
{
"cgroup-parent": "limit-docker-memory.slice"
}
Reload systemd and restart Docker:
systemctl daemon-reload
systemctl restart docker
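Since you also want to cap containers at 4 CPU cores, the same slice can carry a CPU limit as well (a sketch; CPUQuota is expressed per core, so 400% corresponds to 4 full cores):
# add to the [Slice] section of /etc/systemd/system/limit-docker-memory.slice
CPUAccounting=true
CPUQuota=400%
Then run systemctl daemon-reload and systemctl restart docker again.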
See the relevant section in the documentation:
DEFAULT CGROUP PARENT
The --cgroup-parent option allows you to set the default cgroup parent to use for containers. If this option is not set, it defaults to /docker for fs cgroup driver and system.slice for systemd cgroup driver.
If the cgroup has a leading forward slash (/), the cgroup is created under the root cgroup, otherwise the cgroup is created under the daemon cgroup.
Assuming the daemon is running in cgroup daemoncgroup, --cgroup-parent=/foobar creates a cgroup in /sys/fs/cgroup/memory/foobar, whereas using --cgroup-parent=foobar creates the cgroup in /sys/fs/cgroup/memory/daemoncgroup/foobar
The systemd cgroup driver has different rules for --cgroup-parent. Systemd represents hierarchy by slice and the name of the slice encodes the location in the tree. So --cgroup-parent for systemd cgroups should be a slice name. A name can consist of a dash-separated series of names, which describes the path to the slice from the root slice. For example, --cgroup-parent=user-a-b.slice means the memory cgroup for the container is created in /sys/fs/cgroup/memory/user.slice/user-a.slice/user-a-b.slice/docker-.scope.
This setting can also be set per container, using the --cgroup-parent option on docker create and docker run, and takes precedence over the --cgroup-parent option on the daemon.
To make use of SGX enclaves, applications have to talk to the SGX driver, which is exposed via /dev/isgx on the host. We execute such applications inside Docker containers, mapping /dev/isgx in with the --device command-line option.
Is there an option to add a device (/dev/isgx in this case) to any container ever started by a docker engine?
Edit:
Progress on my side so far:
Docker uses containerd and runc to create a container's configuration before it is started. Docker's configuration file /etc/docker/daemon.json has a runtimes field where one can provide arbitrary arguments to runc:
[...]
"runtimes": {
"runc": {
"path": "runc"
},
"custom": {
"path": "/usr/local/bin/my-runc-replacement",
"runtimeArgs": [
"--debug"
]
}
},
[...]
Sadly, it seems runc does not accept many arguments that are useful for my purposes (see runc --help and runc spec --help, the latter of which creates the configuration).
I found interesting source code regarding DefaultSimpleDevices and DefaultAllowedDevices in runc's codebase. The last commit to this file says 'Do not create /dev/fuse by default' which is promising, but would involve building my own runc. I was hoping for a generic solution via a configuration option.
UPDATE
This is not the correct answer. It turns out that Docker's default parent cgroup already has open device permissions:
/# cat /sys/fs/cgroup/devices/docker/devices.list
a *:* rwm
Upon container creation a new cgroup for that container is created with more restricted devices rules.
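You can see those per-container rules yourself (a sketch, assuming cgroup v1 and the default cgroupfs driver; busybox is just a throwaway image):
CID=$(docker run -d busybox sleep 600)
cat /sys/fs/cgroup/devices/docker/$CID/devices.list   # much more restrictive than "a *:* rwm"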
ORIGINAL ANSWER
I think you could use cgroups to achieve what you want.
You could create a new cgroup on your host machine which allows access to /dev/isgx and start your docker daemon with --cgroup-parent=<my-cgroup-name>.
You could also set the cgroup-parent option in your /etc/docker/daemon.json.
If you have never worked with cgroups before, though, it might not be trivial to set up.
How to create a new cgroup depends on your host system, but you must use the devices controller to whitelist specific devices for a cgroup.
E.g., one way is to use libcgroup's /etc/cgconfig.conf and give read/write access to a block device for cgroup dockerdaemon in the following way:
group dockerdaemon {
devices {
devices.allow = b <device-major>:<device-minor> rw
}
}
Here is one example of how to find out the major/minor numbers of your block device:
sudo udevadm info -n /dev/isgx
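If udevadm is not handy, the major/minor numbers can also be read directly (a sketch; for a device node, ls -l prints "major, minor" where the file size would normally be):
ls -l /dev/isgx
stat -c '%t %T' /dev/isgx   # major and minor in hexadecimal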
Here are some further links that might give you more insights into the whole cgroup topic:
cgroups in CentOS6
cgroups in redhat
cgroups in Ubuntu
You need something like this in your docker-compose.yaml file (or something similar for other Docker-based technologies):
devices:
- "/dev/isgx:/dev/isgx"
I want to test Docker on my CentOS 7.1 box, and I got this warning:
[root@docker1 ~]# docker run busybox /bin/echo Hello Docker
Usage of loopback devices is strongly discouraged for production use. Either use `--storage-opt dm.thinpooldev` or use `--storage-opt dm.no_warn_on_loop_devices=true` to suppress this warning.
Hello Docker
I want to know the reason and how to suppress this warning.
The CentOS instance is running in VirtualBox, created by Vagrant.
The warning message occurs because your Docker storage configuration is using a "loopback device" -- a virtual block device such as /dev/loop0 that is actually backed by a file on your filesystem. This was never meant as anything more than a quick hack to get Docker up and running quickly as a proof of concept.
You don't want to suppress the warning; you want to fix your storage configuration such that the warning is no longer issued. The easiest way to do this is to assign some local disk space for use by Docker's devicemapper storage driver and use that.
If you're using LVM and have some free space available on your volume group, this is relatively easy. For example, to give docker 100G of space, first create a data and metadata volume:
# lvcreate -n docker-data -L 100G /dev/my-vg
# lvcreate -n docker-metadata -L1G /dev/my-vg
And then configure Docker to use this space by editing /etc/sysconfig/docker-storage to look like:
DOCKER_STORAGE_OPTIONS=-s devicemapper --storage-opt dm.datadev=/dev/my-vg/docker-data --storage-opt dm.metadatadev=/dev/my-vg/docker-metadata
If you're not using LVM or don't have free space available on your VG, you could expose some other block device (e.g., a spare disk or partition) to Docker in a similar fashion.
There are some interesting notes on this topic here.
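After restarting the daemon you can check that it picked up the new volumes (a sketch; the exact docker info fields vary by Docker version):
sudo systemctl restart docker
docker info | grep -iE 'storage driver|data file|metadata file'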
Thanks. This was driving me crazy. I thought bash was outputting this message. I was about to submit a bug against bash. Unfortunately, none of the options presented are viable on a laptop or similar machine where the disk is fully utilized. Here is my answer for that scenario.
Here is what I used in the /etc/sysconfig/docker-storage on my laptop:
DOCKER_STORAGE_OPTIONS="--storage-opt dm.no_warn_on_loop_devices=true"
Note: I had to restart the docker service for this to have an effect. On Fedora the command for that is:
systemctl stop docker
systemctl start docker
There is also just a restart command (systemctl restart docker), but it is a good idea to check to make sure stop really worked before starting again.
If you don't mind disabling SELinux in your containers, another option is to use overlay. Here is a link that describes that fully:
http://www.projectatomic.io/blog/2015/06/notes-on-fedora-centos-and-docker-storage-drivers/
In summary for /etc/sysconfig/docker:
OPTIONS='--selinux-enabled=false --log-driver=journald'
and for /etc/sysconfig/docker-storage:
DOCKER_STORAGE_OPTIONS=-s overlay
When you change a storage type, restarting Docker will destroy your complete image and container store. You may as well clean everything up in the /var/lib/docker folder when doing this:
systemctl stop docker
rm -rf /var/lib/docker
dnf reinstall docker
systemctl start docker
In RHEL 6.6 any user with docker access can access my private keys, and run applications as root with the most trivial of hacks via volumes. SELinux is the one thing that prevents that in Fedora and RHEL 7. That said, it is not clear how much of the additional RHEL 7 security comes from SELinux outside the container and how much inside the container...
Generally, loopback devices are fine for instances where the 100GB maximum and slightly reduced performance are not a problem. The only issue I can find is that the Docker store can become corrupted if you hit a disk-full error while running... That can probably be avoided with quotas, or other simple solutions.
However, for a production instance it is definitely worth the time and effort to set this up correctly.
100G may be excessive for your production instance. Containers and images are fairly small. Many organizations are running Docker containers within VMs as an additional measure of security and isolation. If so, you might have a fairly small number of containers running per VM, in which case even 10G might be sufficient.
One final note: even if you are using direct-lvm, you probably want an additional filesystem for /var/lib/docker. The reason is that the docker load command creates an uncompressed version of the images being loaded in this folder before adding them to the data store. So if you are trying to keep things small and light, then explore options other than direct-lvm.
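If you do want a separate filesystem for /var/lib/docker, a rough sketch (assuming an LVM volume group named my-vg with free space, and Docker stopped; any existing data under /var/lib/docker would need to be copied over first):
systemctl stop docker
lvcreate -n docker-lib -L 20G my-vg          # size is arbitrary here
mkfs.xfs /dev/my-vg/docker-lib
echo '/dev/my-vg/docker-lib /var/lib/docker xfs defaults 0 0' >> /etc/fstab
mount /var/lib/docker
systemctl start docker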
@Igor Ganapolsky and @Mincă Daniel Andrei
Check this:
systemctl edit docker --full
If the EnvironmentFile directive is not listed in the [Service] block, then no luck (I also have this problem on CentOS 7), but you can extend the standard systemd unit like this:
systemctl edit docker
[Service]
EnvironmentFile=-/etc/sysconfig/docker
ExecStart=
ExecStart=/usr/bin/dockerd $OPTIONS
And create a file /etc/sysconfig/docker with content:
OPTIONS="-s overlay --storage-opt dm.no_warn_on_loop_devices=true"
I have a physical host machine with Ubuntu 14.04 running on it. It has 100G disk and 100M network bandwidth. I installed Docker and launched 10 containers. I would like to limit each container to a maximum of 10G disk and 10M network bandwidth.
After going through the official documentation and searching on the Internet, I still can't find a way to allocate a specific amount of disk and network bandwidth to a container.
I think this may not be possible in Docker directly; maybe we need to bypass Docker. Does this mean we should use something "underlying", such as LXC or cgroups? Can anyone give some suggestions?
Edit:
@Mbarthelemy, your suggestion seems to work but I still have some questions about disk:
1) Is it possible to allocate a different size (such as 20G, 30G, etc.) to each container? You said it is hardcoded in Docker so it seems impossible.
2) I use the command below to start the Docker daemon and container:
docker -d -s devicemapper
docker run -i -t training/webapp /bin/bash
Then I use df -h to view the disk usage; it gives the following output:
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/docker-longid 9.8G 276M 9.0G 3% /
/dev/mapper/Chris--vg-root 27G 5.5G 20G 22% /etc/hosts
From the above, I think the maximum disk a container can use is still larger than 10G. What do you think?
I don't think this is possible right now using Docker default settings. Here's what I would try.
About disk usage: You could tell Docker to use the DeviceMapper storage backend instead of AuFS. This way each container would run on a block device (Devicemapper dm-thin target) limited to 10GB (this is a Docker default, luckily enough it matches your requirement!).
According to this link, it looks like the latest versions of Docker now accept advanced storage backend options. Using the devicemapper backend, you can now change the default container rootfs size using --storage-opt dm.basesize=20G (that would be applied to any newly created container).
To change the storage backend: use the --storage-driver=devicemapper Docker option. Note that your previous containers won't be seen by Docker anymore after the change.
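Put together with the daemon invocation from the question, that could look something like this (a sketch using the old standalone-daemon syntax; 20G is just an example size):
docker -d --storage-driver=devicemapper --storage-opt dm.basesize=20G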
About network bandwidth: you could tell Docker to use LXC under the hood: use the -e lxc option.
Then, create your containers with a custom LXC directive to put them into a traffic class:
docker run --lxc-conf="lxc.cgroup.net_cls.classid = 0x00100001" your/image /bin/stuff
Check the official documentation about how to apply bandwidth limits to this class.
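For reference, the matching host-side tc setup could look roughly like this (a sketch, assuming the traffic leaves through eth0; classid 0x00100001 corresponds to tc class 10:1):
tc qdisc add dev eth0 root handle 10: htb
tc class add dev eth0 parent 10: classid 10:1 htb rate 10mbit             # the 10M cap from the question
tc filter add dev eth0 parent 10: protocol ip prio 10 handle 1: cgroup    # classify packets by net_cls classid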
I've never tried this myself (my setup uses a custom OpenVswitch bridge and VLANs for networking, so bandwidth limitation is different and somewhat easier), but I think you'll have to create and configure a different class.
Note: the --storage-driver=devicemapper and -e lxc options are for the Docker daemon, not for the Docker client you're using when running docker run ...
Newer Docker releases have --device-read-bps and --device-write-bps.
You can use:
docker run --device-read-bps=/dev/sda:10mb
More info here:
https://blog.docker.com/2016/02/docker-1-10/
If you have access to the containers you can use tc for bandwidth control within them.
E.g., in your entrypoint script you can add:
tc qdisc add dev eth0 root tbf rate 240kbit burst 300kbit latency 50ms
to have a bandwidth of 240kbps, burst 300kbps and 50 ms latency.
You also need to pass the --cap-add=NET_ADMIN to the docker run command if you are not running the containers as root.
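Put together, a minimal entrypoint wrapper could look like this (a sketch; eth0 and the /entrypoint.sh path are assumptions):
#!/bin/sh
# /entrypoint.sh (hypothetical): shape egress, then hand off to the real command
tc qdisc add dev eth0 root tbf rate 240kbit burst 300kbit latency 50ms
exec "$@"
And run it with something like:
docker run --cap-add=NET_ADMIN --entrypoint /entrypoint.sh your/image your-command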
1) Is it possible to allocate a different size (such as 20G, 30G, etc.) to each container? You said it is hardcoded in Docker so it seems impossible.
To answer this question, please refer to Resizing Docker containers with the Device Mapper plugin.