The docker service command in Docker v1.12 comes with four flags for setting resource limits on a service.
--limit-cpu value Limit CPUs (default 0.000)
--limit-memory value Limit Memory (default 0 B)
--reserve-cpu value Reserve CPUs (default 0.000)
--reserve-memory value Reserve Memory (default 0 B)
What is the difference between limit and reserve in this context?
What does the CPU value mean here? Is it a number of cores, a CPU share, or something else? What is the unit?
Reserve holds those resources on the host so they are always available for the container. Think dedicated resources.
Limit prevents the binary inside the container from using more than that. Think of it as controlling runaway processes in a container.
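As a rough sketch of how the flags are passed to docker service create (the image name and the values are only placeholders, not something from the original question):

docker service create --name web \
  --reserve-cpu 0.5 --reserve-memory 256m \
  --limit-cpu 1 --limit-memory 512m \
  nginx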
Based on my limited testing with stress, --limit-cpu is a percentage of a core, though if there are multiple threads, it'll spread those out across cores and seems to attempt to keep the total near what you'd expect.
In the picture below, from left to right, the runs used --limit-cpu 4, then 2.5, then 2, then 1. All of those tests had stress set to a CPU count of 4 (worker threads).
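A test along those lines can be reproduced with something like the following (the progrium/stress image, service name, and timeout are just one option, not necessarily what was used here):

docker service create --name cpu-test --limit-cpu 2 progrium/stress --cpu 4 --timeout 60s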
Folks,
With regard to Docker Compose v3's 'cpus' parameter (under 'deploy' > 'resources' > 'limits'), which limits the CPUs available to a service: is it an absolute number that specifies a count of CPUs, or is it a more useful percentage-of-available-CPUs setting?
From what I have read it appears to be an absolute number: if, say, a host has 4 CPUs and two services in the compose file are each set to 0.5, then the two services combined can only use a maximum of 1 CPU (0.5 each), leaving the remaining 3 CPUs idle.
But thinking out loud, it seems to me that it would be nicer if this were a percentage-of-available-cores setting, in which case, for the same example, each service could use up to 2 CPUs and the two combined could use all 4 when needed. That way, when I increase or decrease the available cores, the relative setting would save me from modifying this value again.
EDIT(09/10/21):
On reading this, it appears that the above can be achieved with the 'cpu-shares' setting instead of 'cpus'. Is my understanding correct?
The doc for 'cpu-shares', however, mentions the following cautionary note:
"It does not guarantee or reserve any specific CPU access."
If the above is achieved with this setting, then what does it mean (what is there to lose) to not have a guarantee or reservation?
EDIT(09/13/21):
Just to summarize,
The 'cpus' parameter setting is an absolute number that refers to the number of CPUs a service has reserved for its use at all times. Correct?
The 'cpu-shares' parameter setting is a relative weight whose value is used to compute/determine the percentage of the total available CPU that a service can use, and only when there is contention. Correct?
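For anyone experimenting with the difference, the docker run counterparts of the two settings can be compared directly (the alpine image and the values below are only illustrative):

# hard cap: the container can never use more than half a core, even if the host is idle
docker run --rm --cpus=0.5 alpine sh -c 'yes > /dev/null'

# relative weight: only matters when CPUs are contended; otherwise the container may use any idle core
docker run --rm --cpu-shares=512 alpine sh -c 'yes > /dev/null'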
I am trying to use some of the uncore hardware counters, such as skx_unc_imc0-5::UNC_M_WPQ_INSERTS, which is supposed to count the number of allocations into the Write Pending Queue. The machine has 2 Intel Xeon Gold 5218 (Cascade Lake) CPUs, with 2 memory controllers per CPU. The Linux version is 5.4.0-3-amd64. I have the following simple loop and I am reading this counter for it. Array elements are 64 bytes in size, equal to a cache line.
// each element is 64 bytes, so every store touches a separate cache line
for (int i = 0; i < 1000000; i++) {
    array[i].value = 2;
}
For this loop, when I map the memory to the DRAM NUMA node, the counter gives around 150,000 as a result, which maybe makes sense: there are 6 channels in total for the 2 memory controllers in front of this NUMA node, which use DRAM DIMMs in interleaving mode. Then for each channel there is one separate WPQ, I believe, so skx_unc_imc0 gets 1/6 of all the stores. There are skx_unc_imc0-5 counters that I got with papi_native_avail, supposedly one per channel.
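As a rough sanity check (assuming each element sits on its own cache line, every dirtied line is written back exactly once, and writes are spread evenly over the channels):

1,000,000 stores x 1 cache line each  = 1,000,000 memory writes
1,000,000 writes / 6 channels        ≈ 166,667 per channel, close to the observed ~150,000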
The unexpected result comes when, instead of mapping to the DRAM NUMA node, I map the program's memory to non-volatile memory, which is presented as a separate NUMA node on the same socket. There are 6 NVM DIMMs per socket, which form one interleaved region. So when writing to NVM, 6 different channels should similarly be used, and in front of each there is the same single WPQ, which should again get 1/6 of the write inserts.
But UNC_M_WPQ_INSERTS returns only around 1,000 as a result on NV memory. I don't understand why; I expected it to similarly give around 150,000 writes in the WPQ.
Am I interpreting/understanding something wrong? Or are there two different WPQs per channel, depending on whether the write goes to DRAM or NVM? Or what else could be the explanation?
It turns out that UNC_M_WPQ_INSERTS counts the number of allocations into the Write Pending Queue, only for writes to DRAM.
Intel has added a corresponding hardware counter for persistent memory: UNC_M_PMM_WPQ_INSERTS, which counts write requests allocated in the PMM Write Pending Queue for Intel® Optane™ DC persistent memory.
However, there is no such native event showing up in papi_native_avail, which means it can't be monitored with PAPI yet. In Linux 5.4, some of the PMM counters can be found directly in perf list uncore, such as unc_m_pmm_bandwidth.write - Intel Optane DC persistent memory bandwidth write (MB/sec), derived from unc_m_pmm_wpq_inserts, unit: uncore_imc. This implies that even though UNC_M_PMM_WPQ_INSERTS is not directly listed as an event in perf list, it should exist on the machine.
As described here, the EventCode for this counter is 0xE7, so it can be used with perf as a raw hardware event descriptor as follows: perf stat -e uncore_imc/event=0xe7/. However, it does not seem to support event modifiers to restrict counting to user space. After pinning the thread to the same socket as the NVM NUMA node, for a program that basically only runs the loop described in the question, the perf result more or less makes sense:
Performance counter stats for 'system wide':

  1,035,380      uncore_imc/event=0xe7/
So far this seems to be the best guess.
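For reference, the measurement can be reproduced with an invocation along these lines (the binary name ./store_test and the node binding are assumptions, not taken from the original post):

perf stat -a -e uncore_imc/event=0xe7/ -- numactl --cpunodebind=0 ./store_test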
Can someone explain how the CPU usage is calculated inside pods with multiple containers, for use with a Horizontal Pod Autoscaler?
Is it the mean value, and how is this calculated?
For example:
If we have 2 containers:
Container1 requests 0.5 cpu and uses 0 cpu
Container2 requests 1 cpu and uses 2 cpu
If we calculate both separately and take the mean: (0% + 200%) / 2 = 100% usage?
If we take the sums and divide usage by requests: 2 / 1.5 ≈ 133% usage?
Or is my logic way off?
As of Kubernetes 1.9, the HPA calculates pod CPU utilization as the total CPU usage of all containers in the pod divided by the total request. So in your example the calculated usage would be 133%. I don't think that's specified anywhere in the docs, but the relevant code is here: https://github.com/kubernetes/kubernetes/blob/v1.9.0/pkg/controller/podautoscaler/metrics/utilization.go#L49
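Applied to the example in the question:

total usage   = 0 + 2   = 2 cores   (2000m)
total request = 0.5 + 1 = 1.5 cores (1500m)
utilization   = 2000m / 1500m ≈ 133%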
However, I would consider this an implementation detail. As such it can easily change in future versions.
In the Horizontal Pod Autoscaling design documentation it's clearly written that the autoscaler takes the arithmetic mean of the pods' CPU utilization to compare against the target value. Here is the text:
The autoscaler is implemented as a control loop. It periodically queries pods described by Status.PodSelector of Scale subresource, and collects their CPU utilization. Then, it compares the arithmetic mean of the pods' CPU utilization with the target defined in Spec.CPUUtilization, and adjusts the replicas of the Scale if needed to match the target (preserving condition: MinReplicas <= Replicas <= MaxReplicas).
The target number of pods is calculated from the following formula:
TargetNumOfPods = ceil(sum(CurrentPodsCPUUtilization) / Target)
For further detail: https://github.com/kubernetes/community/blob/master/contributors/design-proposals/autoscaling/horizontal-pod-autoscaler.md
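As a worked instance of that formula (the target of 100% is just an assumed value here; the 133% pod utilization comes from the example above):

TargetNumOfPods = ceil(133% / 100%) = 2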
I'm using Ubuntu Server and have configured an Odoo project. The server has 8 GB of RAM, of which around 6 GB is available, so I need to increase Odoo's default memory limits. Please let me know how to increase them.
Have you tried playing with some of Odoo's Advanced and Multiprocessing options?
odoo.py --help
Advanced options:
--osv-memory-count-limit=OSV_MEMORY_COUNT_LIMIT
Force a limit on the maximum number of records kept in
the virtual osv_memory tables. The default is False,
which means no count-based limit.
--osv-memory-age-limit=OSV_MEMORY_AGE_LIMIT
Force a limit on the maximum age of records kept in
the virtual osv_memory tables. This is a decimal value
expressed in hours, and the default is 1 hour.
--max-cron-threads=MAX_CRON_THREADS
Maximum number of threads processing concurrently cron
jobs (default 2).
Multiprocessing options:
--workers=WORKERS Specify the number of workers, 0 disable prefork mode.
--limit-memory-soft=LIMIT_MEMORY_SOFT
Maximum allowed virtual memory per worker, when
reached the worker be reset after the current request
(default 671088640 aka 640MB).
--limit-memory-hard=LIMIT_MEMORY_HARD
Maximum allowed virtual memory per worker, when
reached, any memory allocation will fail (default
805306368 aka 768MB).
--limit-time-cpu=LIMIT_TIME_CPU
Maximum allowed CPU time per request (default 60).
--limit-time-real=LIMIT_TIME_REAL
Maximum allowed Real time per request (default 120).
--limit-request=LIMIT_REQUEST
Maximum number of request to be processed per worker
(default 8192).
Also if you are using WSGI or something similar to run Odoo, these may also need some tuning.
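As a sketch only (the worker count and limit values below are assumptions to adapt to your own load, not recommended settings), the memory limits can be raised on the command line like this:

# soft limit 1280 MB, hard limit 1536 MB (defaults are 640 MB / 768 MB)
odoo.py --workers=4 \
  --limit-memory-soft=1342177280 --limit-memory-hard=1610612736 \
  --limit-time-cpu=120 --limit-time-real=240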
Is there a limit to the number of processes that can be registered globally? Or is this only limited by memory / the max number of atoms?
Ubuntu 12.04 and Erlang R15B01.
Good question! I'd bet on the number of atoms, if you take into account the following. The Efficiency Guide has a section on system limits:
Processes
The maximum number of simultaneously alive Erlang processes is by default 32768. This limit can be raised up to at most 268435456 processes at startup (see documentation of the system flag +P in the erl(1) documentation). The maximum limit of 268435456 processes will at least on a 32-bit architecture be impossible to reach due to memory shortage.
Distributed nodes
Known nodes
A remote node Y has to be known to node X if there exist any pids, ports, references, or funs (Erlang data types) from Y on X, or if X and Y are connected. The maximum number of remote nodes simultaneously/ever known to a node is limited by the maximum number of atoms available for node names. All data concerning remote nodes, except for the node name atom, are garbage-collected.
Also, the erl manual section describes the flag you can use to alter the number of processes in your node:
+P Number
Sets the maximum number of concurrent processes for this system. Number must be in the range 16..134217727. Default is 32768.
Since you can alter the number of concurrent processes per node, but you can't alter the number of allowed atoms, and the process names are atoms which are replicated on every node, that should be the total allowed number of globally registered processes.
Hope it helps :)
EDIT: Actually, turns out you can change the number of allowed atoms :)
Atoms
By default, the maximum number of atoms is 1048576. This limit can be raised or lowered using the +t option.
+t size
Set the maximum number of atoms the VM can handle. Default is 1048576.
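So if you do hit either ceiling, both can be raised when the node is started; as a sketch (the numbers are arbitrary examples, within the ranges quoted above):

# raise the process and atom limits at node start-up
erl +P 1000000 +t 5000000

# check the process limit actually in effect
erl +P 1000000 -noshell -eval 'io:format("~p~n", [erlang:system_info(process_limit)]), halt().'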