I am running Postgres with TimescaleDB on my Docker Swarm node. I set the CPU limit to 4 and the memory limit to 32G. When I check docker stats, I see this output:
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
c6ce29c7d1a4 pg-timescale.1.6c1hql1xui8ikrrcsuwbsoow5 341.33% 20.45GiB / 32GiB 63.92% 6.84GB / 5.7GB 582GB / 172GB 133
CPU% oscillates around 400%. The node has 6 CPUs and its load average had been 1-2 (1-minute average), so in my understanding, with a CPU limit of 4, the maximum load should oscillate around 6. My current load is 20 (1-minute average), and the output of top from inside the Postgres container shows 50-60%.
My service configuration limit:
deploy:
resources:
limits:
cpus: '4'
memory: 32G
I am confused: all the values differ, so what is the real CPU usage of Postgres and how do I limit it? My server load is pushed to the maximum even though the Postgres limit is set to 4. From htop inside the container I can see 6 cores and 64G of memory, so it looks like the container has all the resources of the host. In docker stats the maximum CPU is 400%, which correlates with the limit of 4 CPUs.
The load average from commands like top on Linux refers to the number of processes running or waiting to run, averaged over some time period. The CPU limits used by Docker specify the number of CPU cycles permitted for processes inside a cgroup over some timeframe. These aren't really measuring the same thing, especially once you factor in I/O waiting. A process blocked on a read from disk wants to run, and so increases your load measurements on the host, but it is not using any CPU cycles.
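As a concrete sketch of what the Docker limit actually means: --cpus translates into a CFS bandwidth quota over a scheduling period (100000 µs by default), which is why docker stats tops out at 400% for a 4-CPU limit:

```shell
# Sketch of how --cpus maps onto the kernel's CFS bandwidth controls.
# Docker sets cpu.cfs_period_us (100000 us by default) and
# cpu.cfs_quota_us = cpus * period.
period_us=100000
cpus=4
quota_us=$((cpus * period_us))
echo "quota: ${quota_us} us of CPU time per ${period_us} us period"
echo "ceiling: $((quota_us * 100 / period_us))%"   # the 400% ceiling in docker stats
```

So the container can burn at most 400 ms of CPU time per 100 ms of wall clock, regardless of how many tasks are queued behind that quota.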
When calculating how much CPU to allocate to a cgroup, not only do you need to factor in the I/O and other system needs of the process, you should also consider queuing theory as you approach saturation on the CPU. The closer you get to 100% utilization, the longer the queue of processes ready to run is likely to be, resulting in significant jumps in load measurements.
Setting these limits correctly will likely require trial and error, because not all processes are the same and not all workloads on the host are the same. A batch job that kicks off at irregular intervals and saturates the drives and network will have a very different impact on the host than a scientific computation that is heavily CPU- and memory-bound.
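A rough illustration of the queuing effect, using the textbook M/M/1 formula (a simplification, not a model of the Linux scheduler): the average number of tasks in the system is ρ/(1-ρ), which explodes as utilization approaches 100%:

```shell
# M/M/1 intuition: avg tasks queued or running = rho / (1 - rho).
# This is only an illustration of why load jumps sharply near saturation.
for rho in 0.50 0.80 0.90 0.95 0.99; do
  awk -v r="$rho" 'BEGIN { printf "utilization %.0f%% -> ~%.0f tasks in system\n", r*100, r/(1-r) }'
done
```

Going from 90% to 99% utilization multiplies the expected queue length by roughly ten, which is exactly the kind of load spike described above.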
Related
I want to know how I can limit my Docker container's CPU by frequency (gigahertz/GHz), not only by the number of cores.
--cpus=<value>
--cpu-period=<value>
--cpu-quota=<value>
--cpuset-cpus
--cpu-shares
These ones are not helpful.
Limiting container CPU usage by frequency (Hz) is not possible with Docker; the available options for limiting CPU usage are described in the documentation.
On Kubernetes, limits and requests for CPU resources are measured in CPU units. One CPU, in Kubernetes, is equivalent to 1 vCPU/core for cloud providers and 1 hyperthread on bare-metal Intel processors.
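For reference, a Kubernetes CPU limit is enforced the same way under the hood; a sketch (assuming the default 100000 µs CFS period) of how a millicpu value maps to a quota:

```shell
# Kubernetes expresses CPU in millicpu (1000m = 1 CPU). With the
# default CFS period of 100000 us, the enforced quota works out to:
millicpu=500                               # i.e. "500m", half a core
period_us=100000
quota_us=$((millicpu * period_us / 1000))
echo "${quota_us} us of CPU time per ${period_us} us period"
```

Either way, the unit is a slice of CPU time per period, never a clock frequency.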
I have a C program running in an Alpine Docker container. The image size is 10M on both OSX and Ubuntu.
On OSX, when I run this image, docker stats shows it using 1M of RAM, so in the Docker Compose file I allocate a maximum of 5M within my swarm.
However, on Ubuntu 16.04.4 LTS the image is also 10M but when running it uses about 9M RAM, and I have to increase the max allocated memory in my compose file.
Why is there such a difference in RAM usage between OSX and Ubuntu?
Even though these are different OSs, I would have thought that once you are running inside a framework, behavior would be similar across machines, and so memory usage should be comparable.
Update:
Thanks for the comments. So stats may be inaccurate, and there are differences, so it is best to baseline on Linux. As an aside, but I think an interesting one: the reason for asking this question is to understand what happens under the hood in order to tune my setup for a large number of deployed programs. Originally, when I tested, I tried to allocate the smallest possible maximum RAM on Ubuntu; this resulted in a lot of disk thrashing, something I didn't see or hear on my MacBook (no hard disks!).
Some numbers, which are specific to my setup but I think interesting:
1000 Docker containers, 1 C program each, 20M max RAM per container: server load of 98, 4K processes in total on the server [1000 C programs total].
20 Docker containers, 100 C programs each, 200M max RAM per container: server load of 5 to 50, 2.3K processes in total on the server [2000 C programs total].
This all suggests that giving your containers a generous maximum RAM, and running fewer of them, is kinder to your server.
I'm relatively new to docker. I'm trying to get the percent of cpu quota (actually) used by a container. Is there a default metric emitted by one of the endpoints or is it something that I will have to calculate with other metrics? Thanks!
docker stats --no-stream
CONTAINER ID NAME CPU % (rest of line truncated)
949e2a3724e6 practical_shannon 8.32% (truncated)
As mentioned in the comment from @asuresh4 above, docker stats appears to give the ACTUAL CPU utilization, not the configured limit. The output here is from Docker version 17.12.1-ce, build 7390fc6.
--no-stream means run stats once, not continuously as it normally does. As you might guess, you can also ask for stats on a single container (specify the container name or id).
In addition to CPU %, MEM USAGE / LIMIT, MEM %, NET I/O, and BLOCK I/O are also shown.
I'm trying to figure out how to kill a container whose CPU usage is over 100%, using the docker stats results. I created the script below, which exports the stats to a file, then looks for container IDs with CPU over 100% and kills them. The problem is that it seems to be killing containers that are at 40%. The results come back in the format 00.00%, which I think might be the problem, but I'm not sure how awk views that number when comparing it to the % values in the file.
#!/bin/bash
docker stats --no-stream > /tmp/cpu.log
sed -i 's/CONTAINER//g' /tmp/cpu.log
KILLCPU=$(awk '$2 >= 11000 {print$1}' /tmp/cpu.log)
docker stop $KILLCPU
Add a +0 to the field to get awk to properly recognize the percentage.
KILLCPU=$(awk '$2+0 >= 110 {print$1}' /tmp/cpu.log)
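The reason the original script misfires: a field like 40.00% is not a numeric string, so awk falls back to a string comparison, and the string "40.00%" sorts after "110". A quick demonstration:

```shell
# String comparison: "40.00%" >= "110" is TRUE lexically ("4" > "1"),
# so a 40% container would be matched and stopped:
printf 'abc 40.00%%\n' | awk '$2 >= 110 {print $1}'      # prints: abc
# Adding +0 coerces the field to the number 40, so no match:
printf 'abc 40.00%%\n' | awk '$2+0 >= 110 {print $1}'    # prints nothing
```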
When a container uses more than 100% CPU, it is simply running on more than one core. Killing containers because they reach a certain percentage is not the correct approach.
I suggest you use the --cpu-shares option to docker run:
See:
https://docs.docker.com/engine/reference/run/#cpu-share-constraint
CPU share constraint: By default, all containers get the same proportion of CPU cycles. This proportion can be modified by changing the container's CPU share weighting relative to the weighting of all other running containers.
To modify the proportion from the default of 1024, use the -c or
--cpu-shares flag to set the weighting to 2 or higher. If 0 is set, the system will ignore the value and use the default of 1024.
The proportion will only apply when CPU-intensive processes are
running. When tasks in one container are idle, other containers can
use the left-over CPU time. The actual amount of CPU time will vary
depending on the number of containers running on the system.
For example, consider three containers, one has a cpu-share of 1024
and two others have a cpu-share setting of 512. When processes in all
three containers attempt to use 100% of CPU, the first container would
receive 50% of the total CPU time. If you add a fourth container with
a cpu-share of 1024, the first container only gets 33% of the CPU. The
remaining containers receive 16.5%, 16.5% and 33% of the CPU.
On a multi-core system, the shares of CPU time are distributed over
all CPU cores. Even if a container is limited to less than 100% of CPU
time, it can use 100% of each individual CPU core.
For example, consider a system with more than three cores. If you
start one container {C0} with -c=512 running one process, and another
container {C1} with -c=1024 running two processes, this can result in
the following division of CPU shares:
PID container CPU CPU share
100 {C0} 0 100% of CPU0
101 {C1} 1 100% of CPU1
102 {C1} 2 100% of CPU2
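The arithmetic behind those percentages is simply each container's weight divided by the sum of all weights; a small sketch for the three-container example above (weights 1024, 512, 512):

```shell
# Under contention, each cgroup gets weight / sum(weights) of CPU time.
shares="1024 512 512"
total=0
for s in $shares; do total=$((total + s)); done
for s in $shares; do
  echo "weight ${s}: $((s * 100 / total))% of contended CPU"
done
```

This matches the documentation: 1024/2048 gives the first container 50%, and each 512-weight container gets 25%, with the split only enforced when all of them are actually busy.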
My project may run on an overcommitted system, and I have to improve reliability by specifying an appropriate memory limit per container, where the node's total memory need not be strictly divided among them. But I'm confused by the following statements in the Kubernetes v1.1 doc on Resource QoS:
Incompressible Resource Guarantees
if they exceed their memory request, they could be killed (if some other container needs memory)
Containers will be killed if they use more memory than their limit.
and the command docker stats shows a "LIMIT" for each container:
I think this means that containers cannot use more memory than the "LIMIT", yet I've sometimes seen MEM% stay at 100% for a while. So how and when are the containers killed?
Update
I think the OOM killer is enabled, with the default value of 0:
> cat /proc/sys/vm/oom_kill_allocating_task
0
The cgroup memory limit feature is used here, so I recommend reading the cgroup documentation:
Tasks that attempt to consume more memory than they are allowed are
immediately killed by the OOM killer.
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-memory.html
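The LIMIT shown by docker stats corresponds to the cgroup's memory.limit_in_bytes (memory.max on cgroup v2). A hypothetical helper sketching how a compose-style size such as 32G becomes the byte value the kernel enforces (Docker treats these suffixes as binary multiples, which is why 32G displays as 32GiB):

```shell
# Hypothetical helper: convert Docker/compose size suffixes to bytes,
# assuming binary (1024-based) multiples as Docker uses for memory.
to_bytes() {
  local value=$1
  local n=${value%[KMGkmg]}
  case "$value" in
    *[Kk]) echo $((n * 1024)) ;;
    *[Mm]) echo $((n * 1024 * 1024)) ;;
    *[Gg]) echo $((n * 1024 * 1024 * 1024)) ;;
    *)     echo "$n" ;;
  esac
}
to_bytes 32G    # 34359738368, the value the cgroup enforces
```

A container can sit at 100% of that limit as long as allocations succeed from reclaimable memory; the OOM killer only steps in when a task tries to allocate beyond it and nothing can be reclaimed.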