I am deploying some pods in Azure Kubernetes Service. When I deploy them with a CPU request of 100m, I can see that all 5 pods are running. In this state I run some performance tests and benchmark the results.
I then redeploy the pods with a CPU request of 1 CPU and run the same tests again. The pods are created successfully in both cases and are in the Running state.
Shouldn't I see better performance results? Can someone please explain? Below is the deployment file. The CPU request is 100m for the first test and 1 for the second. If no performance difference is expected, how can I improve performance?
resources:
  limits:
    cpu: 3096m
    memory: 2Gi
  requests:
    cpu: 100m
    memory: 1Gi
CPU requests matter mainly to the kube-scheduler, which uses them to find a node with enough free capacity for the pod. If you set a CPU request of 1 for every workload, you will soon run out of capacity to schedule new pods.
Furthermore, assigning a larger CPU request to a pod does not automatically mean that the container/application will consume it.
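To see why a bigger request alone does not make an uncontended workload faster, it can help to look at how a request is commonly turned into a relative CPU weight. This is a minimal sketch, assuming cgroup v1 CPU shares and the usual mapping of 1 CPU to 1024 shares; the function name is made up for illustration.

```python
# Sketch: map a CPU request (millicores) to a relative cgroup v1 weight.
# Assumption: 1 CPU == 1024 shares, with a floor of 2 shares.

SHARES_PER_CPU = 1024
MILLI_CPU_PER_CPU = 1000
MIN_SHARES = 2


def milli_cpu_to_shares(milli_cpu: int) -> int:
    """Convert a CPU request in millicores to CPU shares."""
    shares = milli_cpu * SHARES_PER_CPU // MILLI_CPU_PER_CPU
    return max(shares, MIN_SHARES)


print(milli_cpu_to_shares(100))   # 102
print(milli_cpu_to_shares(1000))  # 1024
```

Shares are only a weight used when containers compete for CPU. On a node with idle cores, a container with 102 shares can use just as much CPU as one with 1024, up to its limit.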
CPU limits, on the other hand, can cause CPU throttling in Kubernetes because they cap the CPU time a pod can consume.
Here is a great article about it.
There are a lot of articles that advise against setting CPU limits in order to avoid kernel throttling, but in my experience a throttled pod is less harmful than a pod going crazy and consuming the whole CPU of a node. So I would recommend not overcommitting resources and setting requests = limits.
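For a rough feel of what throttling means in numbers, here is a small sketch of the CPU time budget implied by a limit. It assumes CFS bandwidth control with the default 100 ms enforcement period; the function name is hypothetical.

```python
# Sketch: CPU time budget implied by a CPU limit under CFS bandwidth control.
# Assumption: the default enforcement period of 100 ms (100_000 microseconds).

CFS_PERIOD_US = 100_000


def cfs_quota_us(limit_milli_cpu: int, period_us: int = CFS_PERIOD_US) -> int:
    """CPU time in microseconds the container may use in each period."""
    return limit_milli_cpu * period_us // 1000


print(cfs_quota_us(3096))  # 309600
print(cfs_quota_us(200))   # 20000
```

With a 200m limit the container gets 20 ms of CPU time per 100 ms period. Once that budget is used up, it has to wait for the next period, which is what shows up as throttling.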
You can also check the capacity and allocated resources of your nodes:
kubectl describe node <node>:
Capacity:
  cpu:                4
  ephemeral-storage:  203070420Ki
  memory:             16393308Ki
Allocatable:
  cpu:                3860m
  ephemeral-storage:  187149698763
  memory:             12899420Ki
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource  Requests      Limits
  --------  --------      ------
  cpu       1080m (27%)   5200m (134%)
  memory    1452Mi (11%)  6796Mi (53%)
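The percentages in that table can be reproduced by hand. The sketch below assumes they are computed against Allocatable (not Capacity) and truncated to whole numbers, which matches the values shown; the helper names are made up.

```python
# Sketch: reproduce the percentages in the "Allocated resources" table.
# Numbers are copied from the node output above.

ALLOCATABLE_CPU_MILLI = 3860
ALLOCATABLE_MEMORY_KI = 12899420


def percent_of(amount: int, allocatable: int) -> int:
    """Whole-number percentage of allocatable, truncated (not rounded)."""
    return amount * 100 // allocatable


def mi_to_ki(mebibytes: int) -> int:
    """Convert MiB to KiB so memory values share one unit."""
    return mebibytes * 1024


print(percent_of(1080, ALLOCATABLE_CPU_MILLI))            # 27
print(percent_of(5200, ALLOCATABLE_CPU_MILLI))            # 134
print(percent_of(mi_to_ki(1452), ALLOCATABLE_MEMORY_KI))  # 11
print(percent_of(mi_to_ki(6796), ALLOCATABLE_MEMORY_KI))  # 53
```

A limits value above 100% means the node is overcommitted: if every pod used its full limit at once, the node could not serve them all.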
Related
I have the resource limits below on a Kubernetes pod, while my image starts with JVM args.
We set a memory limit lower than the min/max heap. The Java process ran for some time and then the pod was killed abruptly with an out-of-memory error.
How can the pod start at all if the memory in the resource limit is three times lower than the heap size? Could someone help with this?
cmd:
java -Xmx1512m -Xms1512m $LOGARGS -jar test-ms.jar
pod resource limits:
resources:
  limits:
    cpu: 300m
    memory: 500Mi
  requests:
    cpu: 150m
    memory: 300Mi
/start.sh: line 19: 7 Killed java -Xmx1512m -Xms1512m $LOGARGS -jar test-ms.jar
At 500Mi your pod is killed. If Java requires 1500 MiB, this can't work. Raise the memory value in the limits section and experiment with it. Your container needs memory for more than just the Java heap.
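The mismatch can be checked with a few lines of arithmetic. This is only a sketch: the two parsers are hypothetical helpers that handle just the binary suffixes used in this question, not every form Kubernetes or the JVM accepts.

```python
# Sketch: compare the JVM heap setting with the container memory limit.

_K8S_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}
_JVM_SUFFIXES = {"k": 1024, "m": 1024**2, "g": 1024**3}


def k8s_memory_to_bytes(quantity: str) -> int:
    """Parse a Kubernetes memory quantity such as '500Mi' (binary suffixes only)."""
    for suffix, factor in _K8S_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: plain bytes


def jvm_size_to_bytes(size: str) -> int:
    """Parse a JVM size such as '1512m' as passed to -Xmx / -Xms."""
    suffix = size[-1].lower()
    if suffix in _JVM_SUFFIXES:
        return int(size[:-1]) * _JVM_SUFFIXES[suffix]
    return int(size)  # no suffix: plain bytes


heap = jvm_size_to_bytes("1512m")
limit = k8s_memory_to_bytes("500Mi")
print(heap > limit)            # True
print(round(heap / limit, 2))  # 3.02
```

The pod can still start because the limit is enforced on memory actually used, not on what the JVM is configured to use. The kill happens once real usage crosses the limit.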
Update to your comment:
It means that when the container reaches 500Mi (500 MiB = 524.288 MB), it is killed and the pod is restarted. The limit prevents the pod from using too much memory (e.g. 10 GB) when something unexpected happens, such as a memory leak. You limit the memory so that other pods can also run on the node. You need to find out what normal memory usage looks like inside your container. Since you are setting memory sizes for Java, you can check whether you really need them, for example if your cluster has a metrics tool installed, such as Prometheus or metrics-server:
https://github.com/kubernetes-sigs/metrics-server
You can install:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/high-availability-1.21+.yaml
Then, to analyse CPU and memory:
kubectl top node   # shows values for the nodes
kubectl top pod    # shows values for the pods
Refer to the documentation if you want to use it.
While your container is running, you can also get a shell inside it and run the normal Linux top command.
In a setup that we are using, it seems that pods are able to use more memory than specified in the limits section.
Limits:
  cpu:     1
  memory:  256Mi
Requests:
  cpu:     100m
  memory:  128Mi
The above is from the strimzi-kafka pod, but the same applies to other applications. However, the "kubectl top" command shows that this pod is consuming around 1200Mi of memory. The "docker stats" command also shows consumption of 1.2GiB.
I understand that, as per the general working principles of K8s, any pod trying to use memory beyond the specified limit will be terminated with an OOM error. We have seen this happen in many cases.
In the above scenario, however, this logic is not working as defined.
Kindly help in understanding and finding a solution to this.
Thanks in advance!
We have an OpenShift environment at our company.
We're trying to share resources as efficiently as possible between our data scientists using JupyterHub.
Is there an option for assigning more resources dynamically on demand (if there are free resources available)?
You can take a look at setting resource restrictions. (Quite counterintuitive, I agree).
In Kubernetes (and therefore in OpenShift) you can set resource requests and limits.
Resource requests are the minimum a pod is guaranteed on the node it runs on; the scheduler uses them to place the pod. Resource limits, on the other hand, allow your pod to exceed its requested resources up to the specified limit.
What is the difference between not setting resource requests and limits vs setting them?
Setting resource requests and limits:
- the scheduler is aware of the resources each pod requires, so new pods can be scheduled according to the resources they need
- the scheduler can rebalance pods across the nodes if a node hits maximum resource utilization
- pods cannot exceed the limits specified
- pods are guaranteed to get at least the resources specified in requests
NOT setting resource requests and limits:
- the scheduler is not aware of the resources pods require. New resources (e.g. pods) can only be scheduled on a best-guess basis. It is not guaranteed that those new pods get the minimum resources they need to run stably.
- the scheduler is not able to rebalance resources without at least requests
- pods can use memory / CPU without restrictions
- pods are not guaranteed any memory / CPU time
How to set up resource requests and limits
https://docs.openshift.com/container-platform/3.11/dev_guide/compute_resources.html
In the end it should look something like this
apiVersion: v1
kind: Pod
spec:
  containers:
  - image: openshift/hello-openshift
    name: hello-openshift
    resources:
      requests:
        cpu: 100m
        memory: 200Mi
        ephemeral-storage: 1Gi
      limits:
        cpu: 200m
        memory: 400Mi
        ephemeral-storage: 2Gi
Additional information can be found here
https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-resource-requests-and-limits
I have been trying to set up a Kubernetes 1.13 AKS deployment to use HPA, and I keep running into a problem:
NAME          REFERENCE                TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
dev-hpa-poc   Deployment/dev-hpa-poc   <unknown>/50%   1         4         2          65m
Describing the HPA gives me these events:
Events:
  Type     Reason                        Age                   From                       Message
  ----     ------                        ----                  ----                       -------
  Warning  FailedComputeMetricsReplicas  5m4s (x200 over 55m)  horizontal-pod-autoscaler  failed to get cpu utilization: missing request for cpu
  Warning  FailedGetResourceMetric       3s (x220 over 55m)    horizontal-pod-autoscaler  missing request for cpu
It doesn't appear to be able to actually retrieve CPU usage. I have specified CPU and memory requests and limits in the deployment YAML:
resources:
  requests:
    cpu: 250m
    memory: 128Mi
  limits:
    cpu: 800m
    memory: 1024Mi
The system:metrics-server is running and healthy, too, so that's not it. I can monitor pod health and CPU usage from the Azure portal. Any ideas as to what I'm missing? Could this potentially be a permissions issue?
For "missing request for [x]", make sure that all the containers in the pod have requests declared.
In my case the reason was that another deployment had no resource limits. You should add resources for every pod and deployment in the namespace.
Adding to @nakamume's answer, make sure to double-check sidecar containers.
In my case, I forgot to declare requests for the GCP cloud-sql-proxy sidecar, which had me pulling my hair out for a couple of hours.
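A quick way to spot the offending container is to walk the pod spec and list every container without a CPU request. The sketch below works on a plain dictionary shaped like a pod spec; the container names and the function name are hypothetical.

```python
# Sketch: list containers in a pod spec that have no CPU request.
# The dictionary mirrors the shape of a pod spec; names are hypothetical.

def containers_missing_cpu_request(pod_spec: dict) -> list:
    """Return the names of containers that declare no CPU request."""
    missing = []
    for container in pod_spec.get("containers", []):
        resources = container.get("resources") or {}
        requests = resources.get("requests") or {}
        if "cpu" not in requests:
            missing.append(container["name"])
    return missing


pod_spec = {
    "containers": [
        {"name": "app",
         "resources": {"requests": {"cpu": "250m", "memory": "128Mi"}}},
        {"name": "sidecar-proxy", "resources": {}},  # sidecar without requests
    ]
}

print(containers_missing_cpu_request(pod_spec))  # ['sidecar-proxy']
```

The HPA computes utilization as usage relative to the request, so a single container without a request is enough to leave the target at <unknown>.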
The resource limit of the pod has been set as:
resources:
  limits:
    cpu: 500m
    memory: 5Gi
and there is 10G of memory left on the node.
I created 5 pods in a short time successfully, and the node may still have some memory left, e.g. 8G.
Memory usage grows over time and reaches the limit (5G x 5 = 25G > 10G), and then the node becomes unresponsive.
To keep the node usable, is there a way to set a resource limit on the node?
Update
The core problem is that a pod's memory usage is not always equal to its limit, especially right after it starts. So an unlimited number of pods can be created in quick succession and then drive all nodes to full load. That's not good. There might be a way to reserve resources rather than only setting the limit.
Update 2
I've tested again with both limits and requests:
resources:
  limits:
    cpu: 500m
    memory: 5Gi
  requests:
    cpu: 500m
    memory: 5Gi
The total memory is 15G with 14G available, but 3 pods are scheduled and running successfully:
> free -mh
        total   used   free   shared   buff/cache   available
Mem:      15G   1.1G   8.3G     3.4M         6.2G         14G
Swap:      0B     0B     0B
> docker stats
CONTAINER      CPU %   MEM USAGE / LIMIT     MEM %    NET I/O     BLOCK I/O
44eaa3e2d68c   0.63%   1.939 GB / 5.369 GB   36.11%   0 B / 0 B   47.84 MB / 0 B
87099000037c   0.58%   2.187 GB / 5.369 GB   40.74%   0 B / 0 B   48.01 MB / 0 B
d5954ab37642   0.58%   1.936 GB / 5.369 GB   36.07%   0 B / 0 B   47.81 MB / 0 B
It seems that the node will crash soon XD
Update 3
Now I change the resources to request 5G and limit 8G:
resources:
  limits:
    cpu: 500m
    memory: 8Gi
  requests:
    cpu: 500m
    memory: 5Gi
The results are:
According to the k8s source code about the resource check:
The total memory is only 15G, and the pods together may need 24G, so any of the pods may be killed. (A single one of my containers usually uses more than 16G if not limited.)
This means you had better keep the requests exactly equal to the limits in order to avoid pods being killed or the node crashing. If the requests value is not specified, it defaults to the limit, so what exactly are requests used for? I think limits alone would be enough, or, contrary to what K8s claims, I would rather set the resource request greater than the limit in order to keep nodes usable.
Update 4
Kubernetes 1.1 schedules pods by their memory requests using the formula:
(capacity - memoryRequested) >= podRequest.memory
It seems that Kubernetes does not take actual memory usage into account, as Vishnu Kannan said. So the node will crash if other apps use a lot of memory.
Fortunately, as of commit e64fe822, the formula has been changed to:
(allocatable - memoryRequested) >= podRequest.memory
waiting for the k8s v1.2!
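The check from the formulas above can be sketched in a few lines. This is only an illustration, not the scheduler's actual code; the function name `fits` and the 10 GiB node size are assumptions chosen to match the numbers in the question.

```python
# Sketch: the request-based fit check, (allocatable - requested) >= podRequest.
# Only requests are counted; actual memory usage plays no role here.

GIB = 1024**3


def fits(allocatable: int, already_requested: int, pod_request: int) -> bool:
    """True if the pod's memory request still fits on the node."""
    return allocatable - already_requested >= pod_request


node_allocatable = 10 * GIB  # assumed node size for the example
pod_request = 5 * GIB

requested = 0
scheduled = 0
while fits(node_allocatable, requested, pod_request):
    requested += pod_request
    scheduled += 1

print(scheduled)  # 2
```

Because the check looks at requests only, pods with small requests and large limits can be packed onto a node far beyond what it can serve once they all grow toward their limits.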
Kubernetes resource specifications have two fields, requests and limits.
limits place a cap on how much of a resource a container can use. For memory, if a container goes above its limits, it will be OOM killed. For CPU, its usage may be throttled.
requests are different in that they ensure the node that the pod is put on has at least that much capacity available for it. If you want to make sure that your pods will be able to grow to a particular size without the node running out of resources, specify a request of that size. This will limit how many pods you can schedule, though -- a 10G node will only be able to fit 2 pods with a 5G memory request.
Kubernetes supports Quality of Service classes. If your Pods have limits set and their requests equal those limits (which is the default when requests are omitted), they belong to the Guaranteed class, and the likelihood of them getting killed due to system memory pressure is extremely low. Guaranteed Pods can still get killed if the Docker daemon or some other daemon you run on the node consumes a lot of memory.
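As a rough illustration of how the class follows from requests and limits, here is a simplified sketch. It assumes the three classes Guaranteed, Burstable and BestEffort, looks only at cpu and memory, and compares quantities as plain strings (so "1" and "1000m" would not be treated as equal). The function name is made up.

```python
# Sketch: derive a QoS class from per-container requests and limits.
# Simplification: quantities are compared as strings, cpu and memory only.

def qos_class(containers: list) -> str:
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        if requests or limits:
            any_set = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # A missing request defaults to the limit.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


only_limits = [{"limits": {"cpu": "500m", "memory": "5Gi"}}]
lower_requests = [{"requests": {"cpu": "100m", "memory": "128Mi"},
                   "limits": {"cpu": "1", "memory": "256Mi"}}]

print(qos_class(only_limits))     # Guaranteed
print(qos_class(lower_requests))  # Burstable
print(qos_class([{}]))            # BestEffort
```

Under memory pressure, pods without any requests or limits are the first candidates to be killed, and Guaranteed pods the last.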
The Kube scheduler does take into account memory capacity and memory allocated while scheduling. For instance, you cannot schedule more than two pods each requesting 5 GB on a 10GB node.
Actual memory usage is not currently taken into account by Kubernetes for scheduling purposes.