Rethinkdb container: rethinkdb process takes less RAM than the whole container - docker

I'm running my rethinkdb container in Kubernetes cluster. Below is what I notice:
Running top on the host, which is CoreOS, the rethinkdb process takes about 3 GB:
$ top
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
981 root 20 0 53.9m 34.5m 20.9m S 15.6 0.4 1153:34 hyperkube
51139 root 20 0 4109.3m 3.179g 22.5m S 15.0 41.8 217:43.56 rethinkdb
579 root 20 0 707.5m 76.1m 19.3m S 2.3 1.0 268:33.55 kubelet
But when I run docker stats to check the rethinkdb container, it shows about 7 GB!
$ docker ps | grep rethinkdb
eb9e6b83d6b8 rethinkdb:2.1.5 "rethinkdb --bind al 3 days ago Up 3 days k8s_rethinkdb-3.746aa_rethinkdb-rc-3-eiyt7_default_560121bb-82af-11e5-9c05-00155d070266_661dfae4
$ docker stats eb9e6b83d6b8
CONTAINER CPU % MEM USAGE/LIMIT MEM % NET I/O
eb9e6b83d6b8 4.96% 6.992 GB/8.169 GB 85.59% 0 B/0 B
$ free -m
total used free shared buffers cached
Mem: 7790 7709 81 0 71 3505
-/+ buffers/cache: 4132 3657
Swap: 0 0 0
Can someone explain why the container is taking a lot more memory than the rethinkdb process itself?
I'm running docker v1.7.1, CoreOS v773.1.0, kernel 4.1.5

In the top command you are looking at the process's physical (resident) memory. The stats command also includes the disk cache (page cache), so it is always bigger than the physical amount used by the process itself. When the application really needs more RAM, the cached pages are released for it to use.
Indeed, the memory usage is pulled via the cgroup file memory.usage_in_bytes; you can access it at /sys/fs/cgroup/memory/docker/<long_container_id>/memory.usage_in_bytes. And according to the Linux kernel documentation https://www.kernel.org/doc/Documentation/cgroups/memory.txt section 5.5:
5.5 usage_in_bytes
For efficiency, as other kernel components, memory cgroup uses some
optimization to avoid unnecessary cacheline false sharing.
usage_in_bytes is affected by the method and doesn't show 'exact'
value of memory (and swap) usage, it's a fuzz value for efficient
access. (Of course, when necessary, it's synchronized.) If you want to
know more exact memory usage, you should use RSS+CACHE(+SWAP) value in
memory.stat(see 5.2).
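So if you want the more exact figure the documentation refers to, you can read RSS+CACHE(+SWAP) from memory.stat for the container's cgroup directly on the host. A minimal sketch (assumes cgroup v1 and the default /sys/fs/cgroup/memory/docker/<id> layout; the container ID is the one from the question):
$ CID=$(docker inspect --format '{{.Id}}' eb9e6b83d6b8)
$ grep -E '^(cache|rss|swap) ' /sys/fs/cgroup/memory/docker/$CID/memory.stat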

Related

How do I make Docker container resources mutually exclusive?

I want multiple running containers to have mutually exclusive resources with each other. For example, when there are CPU cores from id0 to id63, if 32 CPU cores are allocated to each container, the CPU cores assigned to them are mutually exclusive. In addition, when the host has 16GB of RAM, we want to allocate 8GB to each container so that one container does not affect the memory usage of another container.
Is there a good way to do this?
I think all you need is to just limit container resources. This way you can ensure that no container uses more than X cores and/or Y RAM. To limit CPU usage to 1 core, add --cpus=1.0 to your docker run command. To limit RAM to 8 gigabytes, add -m=8g. Putting it all together:
docker run --rm --cpus=1 -m=8g debian:buster cat /dev/stdout
And if you look at docker stats you will see that the memory is limited (there is no indication for CPU, though):
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
8d9a33b00950 funny_shirley 0.00% 1MiB / 8GiB 0.10% 6.7kB / 0B 0B / 0B 1
Read more in the docs.
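For the mutual-exclusivity part of the question specifically, --cpuset-cpus pins each container to a fixed set of cores, so two containers given disjoint sets cannot compete for the same CPUs. A rough sketch for the 64-core host described above (the container names and image are just placeholders):
docker run -d --name worker-a --cpuset-cpus=0-31 -m=8g debian:buster sleep infinity
docker run -d --name worker-b --cpuset-cpus=32-63 -m=8g debian:buster sleep infinity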

Docker daemon memory consumption grows over time

Here's the scenario:
On a Debian GNU/Linux 9 (stretch) VM I have two containers running. The day before yesterday I got a warning from monitoring that memory usage was relatively high. After looking at the VM it turned out that it was not the containers but the Docker daemon itself that needed the memory (see the htop screenshot).
After a restart of the service I noticed memory demand increasing again after two days. See the RAM + swap overview graphic.
Is there a known memory leak for this version? (Docker version screenshot)
Memory development (container) after 2 days:
Container 1 is unchanged
Container 2 increased from 21.02MiB to 55MiB
Memory development (VM) after 2 days:
The MEM increased on the machine from 273M (after reboot) to 501M
dockerd:
- after restart: 1.3% MEM%
- 2 days later: 6.0% MEM%
Monitor your containers to see if their memory usage changes over time:
> docker stats
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
623104d00e43 hq 0.09% 81.16MiB / 15.55GiB 0.51% 6.05kB / 0B 25.5MB / 90.1kB 3
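To track this over time rather than watching the live view, you can snapshot the numbers periodically, for example from cron. A small sketch (the --format placeholders are standard docker stats template fields; the log file path is just an example):
$ docker stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}" >> /tmp/docker-mem.log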
We saw a similar issue and it seems to have been related to the gcplogs logging driver. We saw the problem on docker 19.03.6 and 19.03.9 (the most up-to-date that we can easily use).
Switching back to using a log forwarding container (e.g. logspout) resolved the issue for us.
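If you want to test whether the logging driver is the culprit without introducing another container, one option is to switch the daemon's default driver back to json-file and watch whether the growth stops. A hedged sketch of /etc/docker/daemon.json (the rotation values are arbitrary examples; restart dockerd afterwards):
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}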

Able to malloc more than docker-compose mem_limit

I'm trying to limit my container so that it doesn't take up all the RAM on the host. From the Docker docs I understand that --memory limits the RAM and --memory-swap limits (RAM+swap). From the docker-compose docs it looks like the terms for those are mem_limit and memswap_limit, so I've constructed the following docker-compose file:
> cat docker-compose.yml
version: "2"
services:
  stress:
    image: progrium/stress
    command: '-m 1 --vm-bytes 15G --vm-hang 0 --timeout 10s'
    mem_limit: 1g
    memswap_limit: 2g
The progrium/stress image just runs stress, which in this case spawns a single thread which requests 15GB RAM and holds on to it for 10 seconds.
I'd expect this to crash, since 15>2. (It does crash if I ask for more RAM than the host has.)
The kernel has cgroups enabled, and docker stats shows that the limit is being recognised:
> docker stats
CONTAINER CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
7624a9605c70 0.00% 1024MiB / 1GiB 99.99% 396B / 0B 172kB / 0B 2
So what's going on? How do I actually limit the container?
Update:
Watching free, it looks like the RAM usage is effectively limited (only 1GB of RAM is used) but the swap is not: the container gradually increases swap usage until it has eaten through all of the swap and stress crashes (it takes about 20 seconds to get through 5GB of swap on my machine).
Update 2:
Setting mem_swappiness: 0 causes an immediate crash when requesting more memory than mem_limit, regardless of memswap_limit.
Running docker info shows WARNING: No swap limit support
According to https://docs.docker.com/engine/installation/linux/linux-postinstall/#your-kernel-does-not-support-cgroup-swap-limit-capabilities this is disabled by default ("Memory and swap accounting incur an overhead of about 1% of the total available memory and a 10% overall performance degradation.") You can enable it by editing the /etc/default/grub file:
Add or edit the GRUB_CMDLINE_LINUX line to add the following two key-value pairs:
GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"
then update GRUB with update-grub and reboot.
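Roughly, the steps look like this (a sketch; the final grep simply checks that the warning quoted above has disappeared):
$ sudo editor /etc/default/grub     # set GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"
$ sudo update-grub
$ sudo reboot
# after the reboot:
$ docker info 2>&1 | grep -i "swap limit"   # the "No swap limit support" warning should be gone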

How to check the number of cores used by docker container?

I have been working with Docker for a while now. I installed Docker and launched a container using
docker run -it --cpuset-cpus=0 ubuntu
When I log into the docker console and run
grep processor /proc/cpuinfo | wc -l
It shows 3 which are the number of cores I have on my host machine.
Any idea on how to restrict the resources of the container and how to verify the restrictions?
This issue has already been raised in #20770. The file /sys/fs/cgroup/cpuset/cpuset.cpus reflects the correct output.
The --cpuset-cpus option is taking effect; it is just not reflected in /proc/cpuinfo.
docker inspect <container_name>
will give the details of the launched container; check for "CpusetCpus" in the output and you will find the setting there.
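For a quicker check, the value can be pulled out with a Go template, or the cgroup file can be read from inside the container. A small sketch (substitute your own container name; the cgroup path assumes cgroup v1, as in the answer above):
$ docker inspect --format '{{.HostConfig.CpusetCpus}}' <container_name>
$ docker exec <container_name> cat /sys/fs/cgroup/cpuset/cpuset.cpus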
Containers aren't complete virtual machines. Some kernel resources will still appear as they do on the host.
In this case, --cpuset-cpus=0 modifies the resources the container cgroup has access to which is available in /sys/fs/cgroup/cpuset/cpuset.cpus. Not what the VM and container have in /proc/cpuinfo.
One way to verify is to run the stress-ng tool in a container:
Pinned to 1 CPU, the load stays at 1 core (1 / 3 cores in use, 100% or 33% depending on what tool you use):
docker run --cpuset-cpus=0 deployable/stress -c 3
This will use 2 cores (2 / 3 cores, 200%/66%):
docker run --cpuset-cpus=0,2 deployable/stress -c 3
This will use 3 ( 3 / 3 cores, 300%/100%):
docker run deployable/stress -c 3
Memory limits are another area that doesn't appear in kernel stats:
$ docker run -m 64M busybox free -m
total used free shared buffers cached
Mem: 3443 2500 943 173 261 1858
-/+ buffers/cache: 379 3063
Swap: 1023 0 1023
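The cgroup files, on the other hand, do reflect the limit from inside the container even though free does not. A small sketch (assumes cgroup v1, matching the paths used above; 64 MiB is 67108864 bytes):
$ docker run -m 64M busybox cat /sys/fs/cgroup/memory/memory.limit_in_bytes
67108864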
yamaneks' answer includes the GitHub issue.
It should be in double quotes, --cpuset-cpus=""; for example, --cpuset-cpus="0" means the container makes use of cpu0.

Pagecache and dirty pages in paused container

I have a Java application running in an Ubuntu 14.04 container. The application relies on the OS page cache to speed up reads and writes. The container is issued a pause command, which according to the Docker documentation triggers the cgroup freezer https://www.kernel.org/doc/Documentation/cgroups/freezer-subsystem.txt.
What happens to the dirty pages and page cache of the paused container? Are they flushed to disk? Or is the whole notion of a container-scoped page cache wrong, and dirty pages for all containers are managed at the Docker host level?
docker host free -m:
user#0000 ~ # free -m
total used free shared buffers cached
Mem: 48295 47026 1269 0 22 45010
-/+ buffers/cache: 1993 46302
Swap: 24559 12 24547
container: docker exec f1b free -m
user#0000 ~ # docker exec f1b free -m
total used free shared buffers
Mem: 48295 47035 1259 0 22
-/+ buffers: 47013 1282
Swap: 24559 12 24547
Once a container is paused, I cannot check memory as seen by the container.
FATA[0000] Error response from daemon: Container f1 is paused, unpause the container before exec
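Even while the container is paused, its memory cgroup can still be read from the host, which at least lets you watch the cache, dirty and writeback counters for that container. A hedged sketch (assumes cgroup v1 and the default /sys/fs/cgroup/memory/docker/<id> path):
$ CID=$(docker inspect --format '{{.Id}}' f1b)
$ grep -E '^(cache|dirty|writeback) ' /sys/fs/cgroup/memory/docker/$CID/memory.stat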
