CPU usage graphs on Azure Container Instance - docker

When running a simple Linux container on ACI there is a huge discrepancy between the 'graphed' CPU usage in the portal compared to running 'top' in the container itself.
I can see my process running in 'top': its CPU usage stays at around 5% and the load on the machine is below 0.10, but the portal reports around 60% usage. It's a single-processor container.
Under heavier loads I have seen CPU usage of 300-400%, which feels like an issue related to the number of processors, but even this does not add up because, as previously stated, it's a single-processor container.
Any thoughts?

The ACI CPU Usage metric seems to be reported in millicores, not as a percentage. So when you see 300-400, it is in fact 0.3 to 0.4 CPU, which for a single CPU represents 30-40%.
https://learn.microsoft.com/en-us/azure/container-instances/container-instances-monitor#available-metrics
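As a rough illustration of that conversion, here is a minimal sketch. The helper name `millicores_to_percent` and its `allocated_cores` parameter are made up for this example and are not part of any Azure API. It only assumes that 1000 millicores equal one full core.

```python
def millicores_to_percent(millicores: float, allocated_cores: float = 1.0) -> float:
    """Convert a CPU reading in millicores to a percentage of the allocated cores.

    Hypothetical helper for illustration: 1000 millicores = 1 full core.
    """
    if allocated_cores <= 0:
        raise ValueError("allocated_cores must be positive")
    return millicores * 100.0 / (1000.0 * allocated_cores)


# Readings taken from the question, interpreted as millicores on a 1-CPU container.
for reading in (60, 300, 400):
    percent = millicores_to_percent(reading)
    print(f"{reading} millicores -> {percent:.0f}% of 1 core")
```

If the portal value of about 60 is read as millicores, it works out to roughly 6% of one core, which is close to the 5% that 'top' shows in the question.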
Hoping this helps.

Related

Docker container limiting CPU resources

I have some upstream Flask containers, and CPU usage hits 100% when I serve some requests.
The system shows that the containers are using 100% of the CPU.
My questions are:
If I limit the CPU usage of these containers, will they exit with an error if they hit their allocated resources? Or what are the disadvantages of limiting resources for Docker containers?
Which one is the better approach in terms of resource allocation to Docker containers (for 6 CPU cores)?
a) Two containers running with default settings (using as many resources as the kernel can provide, perhaps)
b) Four containers, each of which can only use 1 CPU (--cpus='1')
Please let me know if you want me to elaborate more.
Thanks in Advance
Containers (and other Linux processes) that try to use more CPU cycles than they have been allocated will just get throttled: the Linux kernel will schedule other processes instead. Going over your CPU limit has no adverse consequences other than your process running slower.
For example, say your program starts 4 threads and each runs some intensive computation using a full core, but you're running this in a Docker container with --cpus=2. All four threads will run, but the combined program will be limited to 200% CPU, and the overall performance will be similar to if you had only launched 2 threads.
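The arithmetic behind that example can be sketched as follows. This is a simplified model, not a measurement: the helper names `cfs_quota_us` and `throttled_usage` are hypothetical, and it assumes Docker's default CFS period of 100 ms, fully CPU-bound threads, and a host with at least as many cores as threads.

```python
from typing import Tuple

# Assumption: Docker's default CFS scheduler period of 100 ms, in microseconds.
DEFAULT_CFS_PERIOD_US = 100_000


def cfs_quota_us(cpus: float, period_us: int = DEFAULT_CFS_PERIOD_US) -> int:
    """CPU time (in microseconds) the container may use per period for a given --cpus value."""
    return int(round(cpus * period_us))


def throttled_usage(busy_threads: int, cpus_limit: float) -> Tuple[float, float]:
    """Return (total CPU percent, average fraction of a core per thread).

    Simplified model: each thread would use a full core if it were allowed to.
    """
    if busy_threads <= 0:
        return 0.0, 0.0
    total_cores = min(float(busy_threads), cpus_limit)
    return total_cores * 100.0, total_cores / busy_threads


total_percent, per_thread = throttled_usage(busy_threads=4, cpus_limit=2.0)
print(f"quota per period: {cfs_quota_us(2.0)} us")
print(f"total CPU: {total_percent:.0f}%, per thread: {per_thread:.2f} of a core")
```

Under these assumptions, the four threads together get 200% CPU, so each one makes progress at about half the speed it would have with a core to itself.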
You will usually get better overall system utilization if you don't explicitly limit CPU utilization. If you are running 4 containers, and one of them is running the 4-thread computation job described above but the other three are idle, you will fully use the available system resources if you don't have limits.
If you do have a specific computationally intensive container, you may want to limit its CPU utilization to not starve out other processes. If you only have the one worker container and three Web server containers, consider limiting the worker to 3 or 3.5 CPUs on a 4-core system to guarantee some spare cycles for HTTP traffic. This is a tuning optimization, so look into it only if you're seeing a problem.
Note that CPU and memory work differently. You can't really use "too much" CPU, since if you wait there will always be more CPU cycles, but the kernel rations out what your process is able to run. On the other hand, memory is fixed, and your process will get killed if it goes over a memory limit.

High memory utilisation in Golang application deployed on kubernetes cluster

We have an Image Service written in Golang.
It supports image operations such as resize, crop, and blur.
The RPS is around 400.
Pod Config : 16GB RAM and 8 cores
We deployed the application and observed it for a day; it showed high core utilization.
We introduced a ballast (https://blog.twitch.tv/en/2019/04/10/go-memory-ballast-how-i-learnt-to-stop-worrying-and-love-the-heap-26c2462549a2/) of 4GB and a sync pool (https://medium.com/a-journey-with-go/go-understand-the-design-of-sync-pool-2dde3024e277) to contain the core issues.
Next we started observing high memory utilization.
Hence we reduced Ballast to 1GB, but still memory utilization is high
According to this article, https://www.bwplotka.dev/2019/golang-memory-monitoring/, Golang versions 1.12+ report high RSS. According to the article, "This does not mean that they require more memory, it’s just optimization for cases where there is no other memory pressure."
To verify this, we did a small POC on a local machine, and it worked.
Local Set up - Container memory - 500MB
Memory usage would keep increasing while memory was available and would stay at around 450MB until the memory pressure increased. As soon as the pressure increased, memory usage would go down to 4MB.
But this POC failed on the Kubernetes cluster: the pods started crashing and restarting when memory reached ~16 GB RAM at a high RPS such as 400.
Can someone suggest how we can contain this memory issue, and why this POC failed on the cluster?
Let me know if more detail is required.

Kubernetes: High CPU usage

I am using Rancher. I have deployed a cluster with 1 master & 3 worker nodes.
All Machines are VPSes with 2 vCPU, 8GB RAM and 80GB SSD.
After the cluster was set up, the CPU reserved figure on the Rancher dashboard was 15%. After metrics were enabled, I could see CPU used figures too; CPU reserved had now become 44% and CPU used was 16%. I find those figures too high. Is it normal for a Kubernetes cluster to consume this much CPU by itself?
Drilling down into the metrics, I find that the networking solution that Rancher uses, Canal, consumes almost 10% of CPU resources. Is this normal?
Rancher v2.3.0
User Interface v2.3.15
Helm v2.10.0-rancher12
Machine v0.15.0-rancher12-1
This "issue" has been known for some time now and it affects smaller clusters. Kubernetes is very CPU hungry relative to small clusters, and this is currently by design. I have found multiple threads reporting this for different kinds of setups. Here is an example.
So the short answer is: yes, a Kubernetes setup consumes this amount of CPU when used with relatively small clusters.
I hope it helps.

Change CPU capacity of Docker containers

I'm doing an internship focused on Docker and I have to load-balance an application which has a client, a server and a database. My goal is to dynamically scale the number of server containers according to their CPU usage. For instance, if the CPU usage is over 60%, I add a new container on the fly to divide the CPU usage. My problem is that my simulation does not get the CPU usage higher than 20%; it is a very simple simulation where random users register and go to random pages.
Question: How can I lower the CPU capacity of my server containers using my docker-compose file in order to artificially make the CPU usage go higher? I tried to use the cpu_quota and cpu_shares instructions, but they are not well documented and I don't know how they work or affect my containers.

Monitoring CPU Core Usage on Terminal Servers

I have Windows 2003 terminal servers, multi-core. I'm looking for a way to monitor individual CPU core usage on these servers. It is possible for an end-user to have a run-away process (e.g. Internet Explorer or Outlook). The core for that process may spike to near 100%, leaving the other cores 'normal'. The overall CPU usage on the server is just the average across all the cores, so if 7 of the cores on an 8-core server are idle and the 8th is running at 100%, the server shows 1/8 = 12.5% usage.
What utility can I use to monitor multiple servers? If the CPU usage for a core is "high", what would I use to determine the offending process, and how could I then automatically kill that process if it was on the 'approved kill process' list?
A product from http://www.packettrap.com/ called PT360 would be perfect, except that it uses SNMP to get data and SNMP appears to give only total CPU usage, not broken out by individual core. Take a look at their Dashboard option with the CPU gauge 'gadget'. That's exactly what I need, if only it worked at the core level.
Any ideas?
Individual CPU usage is available through the standard Windows performance counters. You can monitor this in perfmon.
However, it won't give you the result you are looking for. Unless a thread/process has been explicitly bound to a single CPU, a run-away process will not spike one core to 100% while all the others idle. The run-away process will bounce around between all the processors. I don't know why Windows schedules threads this way, presumably because there is no gain from forcing affinity and some loss due to having to handle interrupts on particular cores.
You can see this easily enough just in task manager. Watch the individual CPU graphs when you have a single compute bound process running.
You can give Spotlight on Windows a try. You can graphically drill into all sorts of performance and load indicators. It's freeware.
perfmon from Microsoft can monitor each individual CPU. perfmon also works remotely, and you can monitor various aspects of Windows.
I'm not sure if it helps to find run-away processes, because the Windows scheduler does not always execute a process on the same CPU -> on your 8-CPU machine you will see 12.5% usage on all CPUs if one process runs away.
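The 12.5% figure discussed in this thread can be reproduced with a small sketch. The per-core numbers below are illustrative values, not measurements, and `overall_cpu_percent` is a hypothetical helper that simply averages per-core readings.

```python
def overall_cpu_percent(per_core_percent):
    """Overall CPU usage as the plain average of per-core usage percentages."""
    if not per_core_percent:
        raise ValueError("need at least one core reading")
    return sum(per_core_percent) / len(per_core_percent)


# Illustrative case 1: the run-away process is pinned to one core of eight.
pinned = [100.0] + [0.0] * 7
# Illustrative case 2: the scheduler spreads the same work evenly over all eight cores.
spread = [12.5] * 8

print(f"pinned: overall {overall_cpu_percent(pinned)}%, busiest core {max(pinned)}%")
print(f"spread: overall {overall_cpu_percent(spread)}%, busiest core {max(spread)}%")
```

Both cases give the same overall figure, which is why a total-only counter cannot tell them apart, and why a per-core view shows no spike when the process is not bound to a single CPU.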

Resources