I am using a Kubernetes cluster with 1 master node and 2 worker nodes, each with a 4-core CPU and 256 MB of RAM. I wanted to know how much CPU and RAM the kubelet needs.
Is there any way to set limits (CPU, memory) for the kubelet? I searched the documentation but only found worker node requirements.
I think you should first understand what the kubelet does. This is explained in the kubelet documentation.
The kubelet is the primary “node agent” that runs on each node. It can register the node with the apiserver using one of: the hostname; a flag to override the hostname; or specific logic for a cloud provider.
The kubelet works in terms of a PodSpec. A PodSpec is a YAML or JSON object that describes a pod. The kubelet takes a set of PodSpecs that are provided through various mechanisms (primarily through the apiserver) and ensures that the containers described in those PodSpecs are running and healthy. The kubelet doesn’t manage containers which were not created by Kubernetes.
Besides PodSpecs from the apiserver, there are three ways a container manifest can be provided to the kubelet.
File: Path passed as a flag on the command line. Files under this path will be monitored periodically for updates. The monitoring period is 20s by default and is configurable via a flag.
HTTP endpoint: HTTP endpoint passed as a parameter on the command line. This endpoint is checked every 20 seconds (also configurable with a flag).
HTTP server: The kubelet can also listen for HTTP and respond to a simple API (underspec’d currently) to submit a new manifest.
There are several flags that you could use with the kubelet, but most of them are DEPRECATED, and those parameters should instead be set via the configuration file specified by the kubelet's --config flag. This is explained in Set Kubelet parameters via a config file.
The flags that might be interesting for you are:
--application-metrics-count-limit int
Max number of application metrics to store (per container) (default 100) (DEPRECATED)
--cpu-cfs-quota
Enable CPU CFS quota enforcement for containers that specify CPU limits (default true) (DEPRECATED)
--event-qps int32
If > 0, limit event creations per second to this value. If 0, unlimited. (default 5) (DEPRECATED)
--event-storage-age-limit string
Max length of time for which to store events (per type). Value is a comma separated list of key values, where the keys are event types (e.g.: creation, oom) or "default" and the value is a duration. Default is applied to all non-specified event types (default "default=0") (DEPRECATED)
--event-storage-event-limit string
Max number of events to store (per type). Value is a comma separated list of key values, where the keys are event types (e.g.: creation, oom) or "default" and the value is an integer. Default is applied to all non-specified event types (default "default=0") (DEPRECATED)
--log-file-max-size uint
Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800)
--pods-per-core int32
Number of Pods per core that can run on this Kubelet. The total number of Pods on this Kubelet cannot exceed max-pods, so max-pods will be used if this calculation results in a larger number of Pods allowed on the Kubelet. A value of 0 disables this limit. (DEPRECATED)
--registry-qps int32
If > 0, limit registry pull QPS to this value. If 0, unlimited. (default 5) (DEPRECATED)
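For example, a minimal kubelet config file covering the config-file equivalents of some of these flags might look like the sketch below (field names per the KubeletConfiguration v1beta1 reference; the values, and the kubeReserved/systemReserved reservations, are illustrative, not recommendations):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
eventRecordQPS: 5        # config-file equivalent of --event-qps
cpuCFSQuota: true        # config-file equivalent of --cpu-cfs-quota
podsPerCore: 10          # config-file equivalent of --pods-per-core
registryPullQPS: 5       # config-file equivalent of --registry-qps
# Closest answer to "limits for the kubelet itself": carve resources out of
# the node's allocatable so pods cannot starve the kubelet and system daemons.
kubeReserved:
  cpu: 100m
  memory: 256Mi
systemReserved:
  cpu: 100m
  memory: 256Mi
```

You would then start the kubelet with --config pointing at this file.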
Folks,
With regard to Docker Compose v3's 'cpus' parameter (under 'deploy' > 'resources' > 'limits'), which limits the CPUs available to a service: is it an absolute number specifying a count of CPUs, or is it a more useful percentage-of-available-CPUs setting?
From what I read it appears to be an absolute number. Say a host has 4 CPUs and one were to set two services in the compose file to 0.5 each; then the two services combined can only use a max of 1 CPU (0.5 each), leaving the remaining 3 CPUs idle.
But thinking out loud, it seems it would be nicer if this were a percentage-of-available-cores setting, in which case, in the same example, each service could use up to 2 CPUs and the two combined could use all 4 when needed. That way, when I increase or decrease the available cores, the relative setting would save me from modifying this value again.
EDIT(09/10/21):
On reading this, it appears that the above can be achieved with the 'cpu-shares' setting instead of 'cpus'. Is my understanding correct?
The doc for 'cpu-shares' however mentions the below cautionary note,
"It does not guarantee or reserve any specific CPU access."
If the above is achieved with this setting, then what does it mean (what is to lose) to not have a guarantee or reservation?
EDIT(09/13/21):
Just to summarize,
The 'cpus' parameter setting is an absolute number that refers to the number of CPUs a service has reserved for it to use at all times. Correct?
The 'cpu-shares' parameter setting is a relative weight number the value of which is used to compute/determine the percentage of total available CPU that a service can use only when there is contention. Correct?
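To make the two settings concrete, here is a sketch of both in one compose file (service names and values are illustrative; 'cpus' under 'deploy' follows the v3 syntax, while 'cpu_shares' is a service-level option honored outside swarm deploys):

```yaml
services:
  api:
    deploy:
      resources:
        limits:
          cpus: '0.5'      # absolute cap: half of one core, regardless of host size
  worker:
    cpu_shares: 512        # relative weight: only matters when CPUs are contended
```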
I'm using the prometheus plugin for Jenkins in order to pass data to the prometheus server and subsequently have it displayed in grafana.
With the default setup I can see the metrics at http://:8080/prometheus
But in the list I also find duplicate entries for the same job:
default_jenkins_builds_duration_milliseconds_summary_sum{jenkins_job="spring_api/com.xxxxxx.yyy:yyy-web",repo="NA",} 217191.0
default_jenkins_builds_duration_milliseconds_summary_sum{jenkins_job="spring_api",repo="NA",} 526098.0
Both entries refer to the same Jenkins job, spring_api, but the metrics have different values. Why do I see two entries for the same metric?
Possibly one is a subset of the other.
In the Kubernetes world you have the resource consumption for each container in a pod, plus the pod's overall resource usage.
Suppose I query the metric "container_cpu_usage_seconds_total" for {pod="X"}.
Pod X has 2 containers, so I'll get back four series:
{pod="X",container="container1"}
{pod="X",container="container2"}
{pod="X",container="POD"} <- some weird "pause" image with very low usage
{pod="X"} <- sum of container1 and container2
There might also be a discrepancy where the series with no container label is greater than the sum of the per-container consumption. That might be some unaccounted-for overhead, like pod DNS lookups or something; I'm not sure.
I guess my point is that Prometheus will often use combinations of labels, and omissions of labels, to show how a metric is broken down.
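As a sketch (assuming the usual cAdvisor label conventions described above), you can exclude the aggregate and pause-container series explicitly, or recompute the pod total from the per-container series:

```promql
# Per-container CPU usage for pod X, without the aggregate or pause series
rate(container_cpu_usage_seconds_total{pod="X", container!="", container!="POD"}[5m])

# Pod-level total, rebuilt from the per-container series
sum by (pod) (rate(container_cpu_usage_seconds_total{pod="X", container!="", container!="POD"}[5m]))
```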
I have a single-node Kubernetes cluster which shows 10 Gi, 3 CPU as available (of a total 16 Gi, 4 CPU) for running pods after cluster startup. I am then trying two different scenarios:
Scenario-1.
Running 3 pods individually, with (Request, Limit) configs as:
Pod-A: (1 Gi,3.3Gi) and (1 cpu,1 cpu)
Pod-B: (1 Gi,3.3Gi) and (1 cpu,1 cpu)
Pod-C: (1 Gi,3.3Gi) and (1 cpu,1 cpu)
In this scenario, the apps come up fine in their corresponding pods and work as expected.
Scenario-2.
Running 3 pods individually, with (Request, Limit) configs as:
Pod-A: (1 Gi,10 Gi) and (1 cpu,3 cpu)
Pod-B: (1 Gi,10 Gi) and (1 cpu,3 cpu)
Pod-C: (1 Gi,10 Gi) and (1 cpu,3 cpu)
In the second scenario, the apps come up in their corresponding pods but fail randomly after some load is put on any of them, i.e. sometimes Pod-A goes down, at other times Pod-B or Pod-C. At no point in time am I able to run all three pods together.
The only event I can see for the failed pod is the following warning in the node logs: "Warning CheckLimitsForResolvConf 1m (x32 over 15m) kubelet, xxx.net Resolv.conf file '/etc/resolv.conf' contains search line consisting of more than 3 domains!".
With only this information in the logs, I am not able to figure out the actual reason for the random failure of the pods.
Can anyone help me understand if there is anything wrong with the configs or is there something else I am missing?
Thanks
When you create a Pod, the Kubernetes scheduler selects a node for the Pod to run on.
Each node has a maximum capacity for each of the resource types: the amount of CPU and memory it can provide for Pods. The scheduler ensures that, for each resource type, the sum of the resource requests of the scheduled Containers is less than the capacity of the node.
Note Although actual memory or CPU resource usage on nodes is very low, the scheduler still refuses to place a Pod on a node if the capacity check fails. This protects against a resource shortage on a node when resource usage later increases, for example, during a daily peak in request rate.
So after scheduling, if a container exceeds its memory request, it is likely that its Pod will be evicted whenever the node runs out of memory.
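To make this concrete for Scenario-2, here is a sketch of the container resources matching the numbers in the question: the scheduler only counts the 1 Gi / 1 CPU requests, so all three pods fit on the node, but their combined limits (30 Gi) far overcommit its 10 Gi, and once the pods actually grow under load the node runs short and eviction kicks in.

```yaml
# Scenario-2 shape: scheduling is done against requests, not limits
resources:
  requests:
    memory: "1Gi"
    cpu: "1"
  limits:
    memory: "10Gi"
    cpu: "3"
```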
Refer to the default hard eviction threshold values.
The kubelet has the following default hard eviction thresholds:
memory.available<100Mi
nodefs.available<10%
nodefs.inodesFree<5%
imagefs.available<15%
You should watch your node conditions while the load is running.
The kubelet maps one or more eviction signals to a corresponding node condition.
If a hard eviction threshold has been met, or a soft eviction threshold has been met independent of its associated grace period, the kubelet reports a condition reflecting that the node is under pressure, i.e. MemoryPressure or DiskPressure.
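If the defaults do not fit your node, the thresholds above can be tuned in the kubelet config file; a sketch (KubeletConfiguration v1beta1 schema, with values that simply restate the defaults):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  memory.available: "100Mi"
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"
  imagefs.available: "15%"
```

Running kubectl describe node against your node then shows the resulting conditions (MemoryPressure, DiskPressure) under Conditions.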
I'm using Ubuntu Server and have configured an Odoo project. The server has 8 GB of RAM, of which around 6 GB is available, so I need to increase Odoo's default memory limits. Please let me know how to increase them.
Have you tried playing with some of Odoo's Advanced and Multiprocessing options?
odoo.py --help
Advanced options:
--osv-memory-count-limit=OSV_MEMORY_COUNT_LIMIT
Force a limit on the maximum number of records kept in
the virtual osv_memory tables. The default is False,
which means no count-based limit.
--osv-memory-age-limit=OSV_MEMORY_AGE_LIMIT
Force a limit on the maximum age of records kept in
the virtual osv_memory tables. This is a decimal value
expressed in hours, and the default is 1 hour.
--max-cron-threads=MAX_CRON_THREADS
Maximum number of threads processing concurrently cron
jobs (default 2).
Multiprocessing options:
--workers=WORKERS Specify the number of workers, 0 disable prefork mode.
--limit-memory-soft=LIMIT_MEMORY_SOFT
Maximum allowed virtual memory per worker, when
reached the worker be reset after the current request
(default 671088640 aka 640MB).
--limit-memory-hard=LIMIT_MEMORY_HARD
Maximum allowed virtual memory per worker, when
reached, any memory allocation will fail (default
805306368 aka 768MB).
--limit-time-cpu=LIMIT_TIME_CPU
Maximum allowed CPU time per request (default 60).
--limit-time-real=LIMIT_TIME_REAL
Maximum allowed Real time per request (default 120).
--limit-request=LIMIT_REQUEST
Maximum number of request to be processed per worker
(default 8192).
Also, if you are running Odoo under WSGI or something similar, those settings may need some tuning as well.
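As a sketch, raising the per-worker limits on an 8 GB box might look like this (the values are illustrative, not recommendations; leave headroom for PostgreSQL and the OS):

```shell
# 4 workers; soft limit 1280 MB and hard limit 1536 MB of virtual memory per worker
odoo.py --workers=4 \
        --limit-memory-soft=1342177280 \
        --limit-memory-hard=1610612736
```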
Docker v1.12 service comes with four flags for setting the resource limits on a service.
--limit-cpu value Limit CPUs (default 0.000)
--limit-memory value Limit Memory (default 0 B)
--reserve-cpu value Reserve CPUs (default 0.000)
--reserve-memory value Reserve Memory (default 0 B)
What is the difference between limit and reserve in this context?
What does the CPU value mean here? Is it a number of cores? A CPU share? What is the unit?
Reserve holds those resources on the host so they are always available for the container. Think dedicated resources.
Limit prevents the binary inside the container from using more than that. Think of controlling runaway processes in container.
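For example, combining both on one service (the image and values are illustrative):

```shell
# Guarantee half a core and 256 MB to the service's tasks,
# while capping each task at one core and 512 MB
docker service create --name web \
  --reserve-cpu 0.5 --reserve-memory 256MB \
  --limit-cpu 1 --limit-memory 512MB \
  nginx
```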
Based on my limited testing with stress, --limit-cpu is a fraction of a core, though if there are multiple threads it will spread them out across cores and seems to keep the total near what you'd expect.
In the pic below, from left to right, the runs were --limit-cpu 4, then 2.5, then 2, then 1. All of those tests had stress set to 4 CPU worker threads.