Building a CPU usage graph in Grafana for a Docker container

I have connected cAdvisor -> Prometheus and Grafana to get graphs for my Docker containers. One of them is the CPU load, but I can only see the cumulative usage lines, not the actual value at a given moment. I'd love to see something similar to what cAdvisor shows. What is the way to do so?

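You're looking for the rate() and irate() functions. Both turn a cumulative counter into a per-second value: rate(my_metric[5m]) averages the increase over the whole 5-minute window, while irate(my_metric[5m]) uses only the last two samples in that window, so it reacts faster but is spikier.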
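To make the difference concrete, here is a minimal Python sketch of the arithmetic behind the two functions. The names simple_rate and simple_irate and the sample values are made up for illustration, and the sketch ignores the counter-reset handling and window extrapolation that Prometheus itself performs.

```python
from typing import List, Tuple

# (unix timestamp in seconds, cumulative counter value)
Sample = Tuple[float, float]


def simple_rate(samples: List[Sample]) -> float:
    """Average per-second increase between the first and last sample in the window."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


def simple_irate(samples: List[Sample]) -> float:
    """Per-second increase between the last two samples only."""
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    return (v1 - v0) / (t1 - t0)


# Hypothetical cumulative CPU seconds for one container, scraped every 15 s.
window = [(0.0, 100.0), (15.0, 103.0), (30.0, 106.0), (45.0, 118.0)]

smoothed = simple_rate(window)   # increase of 18 over 45 s
instant = simple_irate(window)   # increase of 12 over the last 15 s
```

With this data the last interval is much busier than the earlier ones, so the irate-style value comes out higher than the rate-style one.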

Related

How to collect a specific Prometheus metric from replicated containers and get their average

I have a container named downloader and scaled it to 3 containers.
I am using the Prometheus Python client, and each container has a Gauge metric called Rate.
This Rate metric is separately accessible through internal port 8000 of each container. I mapped these 3 internal ports to the external port range 8000-8002.
What I want is to get the average of these 3 Rate values and show it in a Grafana panel as a single item.
I think I should change the prometheus/prometheus.yml file, but I am not sure how. Any idea could help.
Thanks in advance.
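One way this could work, assuming all three replicas are scraped as targets of a single job: the PromQL avg() aggregation collapses the three Rate series into one, and that expression can go straight into a Grafana panel. The Python sketch below shows the same query against the Prometheus HTTP API. The job label "downloader", the base URL, and the sample response values are assumptions, not taken from the question.

```python
import json
from urllib.parse import urlencode

# Assumed PromQL: average the Rate gauge across every target in the job.
AVG_QUERY = 'avg(Rate{job="downloader"})'


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def average_of_vector(response_body: str) -> float:
    """Average the sample values of an instant-vector response client-side."""
    result = json.loads(response_body)["data"]["result"]
    values = [float(series["value"][1]) for series in result]
    return sum(values) / len(values)


url = build_query_url("http://localhost:9090", AVG_QUERY)

# Hypothetical response for the plain query `Rate`, one series per replica.
sample_body = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"instance": "host:8000"}, "value": [1700000000, "10"]},
            {"metric": {"instance": "host:8001"}, "value": [1700000000, "20"]},
            {"metric": {"instance": "host:8002"}, "value": [1700000000, "30"]},
        ],
    },
})
client_side_average = average_of_vector(sample_body)
```

In practice the server-side avg() is usually enough. The client-side helper is only there to show what the aggregation computes.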

How to calculate CPU utilization of a Windows web server in Grafana

I'm using Prometheus as a data source, and in Grafana I want to make a CPU utilization graph with accurate values and a shorter interval.
I guess this is what you are looking for: https://github.com/prometheus-community/windows_exporter
Since you are using Prometheus, the approach described for "node_exporter" is also worth reading: https://www.robustperception.io/understanding-machine-cpu-usage (note that node_exporter itself targets Unix-like systems; on Windows the equivalent is windows_exporter).

Prometheus CPU Usage Histogram Metrics

My goal is to observe metrics (CPU, memory usage, etc.) with Prometheus on a server and on its running Docker containers. Before sending an alarm, I would like to compare certain values of those metrics with, for example, a 0.95 quantile. However, after several weeks of searching the internet I still struggle to create metrics for those quantiles. Therefore I am asking here for your help and advice on how a quantile for a given metric can be created.
Background
The code base is a fork of the docprom repository. This code relies on Prometheus for monitoring. Prometheus retrieves its data from a running cAdvisor container. The provided metrics of cAdvisor for Prometheus can be seen on the following page. However, it provides only Gauge and Counter metric types. During my research I was not able to find parameters that would enable modifications/extensions of those provided metrics.
Problem
According to my current understanding, the metric type should be a Histogram or Summary in order to observe the quantiles. What is the best approach to use the histogram_quantile query on the metrics provided by cAdvisor?
My current idea is to:
1. create a custom server
2. fetch the desired data from Prometheus
3. calculate the desired data
4. provide it as a metric from the server, so that Prometheus can scrape it
5. run histogram_quantile on the custom metric
Is it the right approach in order to create a metric that can be used with quantiles?
For example, I would like to fire an alarm if a certain container's CPU usage exceeds a 0.95 quantile. An example query for the CPU usage is shown below:
sum(rate(container_cpu_usage_seconds_total{name="CONTAINER_NAME"}[10m])) / count(node_cpu_seconds_total{mode="system"}) * 100
What would be the best approach to create the desired quantiles? Am I on the right path, or am I missing something simple here? It seems far too hard just to get a simple query with a quantile.
I am thankful for all help and information.
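As a rough illustration of the "calculate the desired data" step in the idea above, here is a Python sketch that computes a quantile over a list of already-fetched CPU usage samples using linear interpolation between ranks. The function names and the sample history are hypothetical. It may also be worth checking whether PromQL's quantile_over_time(), which works on a range of gauge-like samples rather than on a Histogram, already covers this case without a custom server.

```python
import math
from typing import Sequence


def quantile(q: float, values: Sequence[float]) -> float:
    """Quantile with linear interpolation between the two nearest ranks."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    if not values:
        return math.nan
    ordered = sorted(values)
    rank = q * (len(ordered) - 1)
    lower = int(math.floor(rank))
    upper = min(lower + 1, len(ordered) - 1)
    weight = rank - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


def exceeds_quantile(current: float, history: Sequence[float], q: float = 0.95) -> bool:
    """True when the current value is above the q-quantile of its own history."""
    return current > quantile(q, history)


# Hypothetical CPU usage percentages sampled over the past weeks.
history = [10.0, 20.0, 30.0, 40.0, 50.0]
threshold = quantile(0.95, history)
should_alert = exceeds_quantile(49.0, history)
```

The alerting decision here is a plain comparison against the historical quantile. How much history to use and how often to sample it are choices the sketch leaves open.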

Is it possible to run a new docker container of same image when the old one reaches a specified memory limit

I am wondering whether it is possible to start a new Docker container by some automated means, such that whenever the old container reaches a specific memory/CPU usage limit, the old container doesn't get killed and the new one balances the load.
You mean a sort of autoscaling. At the moment I don't have a built-in solution ready to use, but I can share my idea:
You can use a metrics collector like cAdvisor (https://github.com/google/cadvisor) to grab information about your container (you can also use docker stats to do that).
You can store this data in a time series database like InfluxDB or Prometheus.
Create a continuous query to trigger a "create new container" event when some metric goes out of your limit.
I know that you are looking for something ready-made, but at the moment I don't see any tool that solves this problem.
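The trigger part of that idea could look roughly like the Python sketch below. It only covers the decision ("do we need one more container?") and leaves out both collecting the metrics and actually starting the container. ScalePolicy, desired_replicas, and the numbers are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ScalePolicy:
    memory_limit_bytes: int  # per-container usage that triggers a scale-up
    max_replicas: int        # upper bound so the loop cannot grow forever


def desired_replicas(current_replicas: int,
                     memory_usage_bytes: Sequence[int],
                     policy: ScalePolicy) -> int:
    """Add one replica when any container is over the limit; never remove any."""
    over_limit = any(usage > policy.memory_limit_bytes for usage in memory_usage_bytes)
    if over_limit and current_replicas < policy.max_replicas:
        return current_replicas + 1
    return current_replicas


policy = ScalePolicy(memory_limit_bytes=512 * 1024 * 1024, max_replicas=3)

# One container at roughly 600 MiB: the old one keeps running, one more is requested.
target = desired_replicas(1, [600 * 1024 * 1024], policy)
```

Because the function never lowers the replica count, the old container is left alone, which matches the requirement in the question.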
It sounds like you need a container orchestrator, possibly for other use cases as well. You can drive scaling decisions with metrics via almost any of them: Mesos, Kubernetes, or Swarm. Swarm is evolving a lot, with Docker investing heavily. Swarm mode is a new feature coming in 1.12 which will put a lot of this orchestration in the core product, and would probably give you a great use case.

How can we collect performance metrics from CAdvisor docker container?

Sorry, I just started to learn Docker. My question may seem stupid to some of you.
I would like to know if there is a way to collect performance metrics from the "CAdvisor" container (not from cgroup) at runtime. I mean, extract the performance values behind the curves drawn by cAdvisor, like memory usage or network traffic.
I need to record these values and save them in a database so that I can perform statistical analysis on them (like comparing memory consumption for two Docker containers at t=50s).
Thanks in advance.
As other answers mention, cAdvisor doesn't provide its own performance data API. Instead, it exposes metrics which are typically stored in a separate database if one wants to derive performance data beyond "real time". For example, cAdvisor exports Prometheus metrics natively:
http://prometheus.io/docs/instrumenting/exporters/
The Prometheus metric types:
http://prometheus.io/docs/concepts/metric_types/
Prometheus supports a fairly rich functional expression language that can be used for querying and visualization:
http://prometheus.io/docs/querying/basics/
cAdvisor does provide a REST endpoint to get any stats in real time. By default, it keeps the latest two minutes of data. You can configure it to keep more or less. It also supports a storage backend that keeps dumping stats to an InfluxDB database.
REST API:
e.g. /api/v1.3/containers
doc: https://github.com/google/cadvisor/blob/master/docs/api.md
Doc on setting up InfluxDB:
https://github.com/google/cadvisor/blob/master/docs/influxdb.md
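A small Python sketch of pulling values out of that REST endpoint follows. It assumes cAdvisor is reachable at http://localhost:8080 and that each entry under "stats" carries "timestamp" and "memory.usage" fields, which is my reading of the v1 API docs linked above, so check it against your cAdvisor version. The sample payload is invented for illustration.

```python
import json
from typing import Dict, List, Tuple
from urllib.request import urlopen


def fetch_container_info(base_url: str, container: str = "/") -> Dict:
    """GET /api/v1.3/containers<container> and decode the JSON body."""
    url = f"{base_url}/api/v1.3/containers{container}"
    with urlopen(url, timeout=5) as response:
        return json.load(response)


def memory_series(container_info: Dict) -> List[Tuple[str, int]]:
    """Extract (timestamp, memory usage in bytes) pairs from a container info dict."""
    return [
        (stat["timestamp"], stat["memory"]["usage"])
        for stat in container_info.get("stats", [])
    ]


# Invented payload in the assumed shape, so the parsing can be tried offline.
sample_info = {
    "name": "/docker/abc123",
    "stats": [
        {"timestamp": "2015-01-01T00:00:00Z", "memory": {"usage": 1048576}},
        {"timestamp": "2015-01-01T00:00:01Z", "memory": {"usage": 2097152}},
    ],
}
series = memory_series(sample_info)

# Against a live cAdvisor (assumed address), this would be:
# series = memory_series(fetch_container_info("http://localhost:8080"))
```

The resulting pairs can then be written to whatever database you plan to use for the statistical comparison.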
I think you could use https://github.com/tutumcloud/container-metrics for this. Basically what that would be doing is using influxdb http://influxdb.com/ as a time series data store.
There is some more information available here: http://blog.tutum.co/2014/08/25/panamax-docker-application-template-with-cadvisor-elasticsearch-grafana-and-influxdb/
A couple of people seemed to be looking into the ELK stack (Elasticsearch, Logstash, Kibana) for visualising some of this data here: https://github.com/google/cadvisor/issues/634
