I have connected cAdvisor -> Prometheus and Grafana to get graphs for my Docker containers. One of them is the CPU load, but i can only see the cumulative usage lines and not actually a value at the moment. Id love to see somethin similar cAdvisor is showing. Whts the way to do so?
You're looking for the irate() and rate() functions. Using irate(my_metric[5m]) will calculate the per-second value for you.
Related
I have a container named downloader and scaled it into 3 containers.
I am using prometheus python client, and in each container have a Gauge metric called Rate.
This Rate metric is separately accessible through internal port 8000 of each container.I mapped these 3 internal port to range 8000-8002 of external port.
what I want is that I want to get Average of these 3 Rate and show this average in Grafana panel as a single item.
I think I should change prometheus/prometheus.yml file, but not sure what is the case. any idea could help.
Thanks in advance.
I'm using prometheus as a data source and in grafana i want to make CPU utlization graph with accurate values and less interval
I guess this is what you are looking for: https://github.com/prometheus-community/windows_exporter
since you are using Prometheus you can use "node_exporter" , https://www.robustperception.io/understanding-machine-cpu-usage
my goal is to observe metrics (like CPU, Memory usage etc.) with Prometheus on a server and on its running docker containers. Before sending an alarm, I would like to compare the certain values of those metrics with e.g. an 0.95 quantile. However, over several weeks of search in the internet I still struggle to create metrics for the certain quantiles. Therefore I ask in this thread for your help/advice, how a quantile for certain metrics can be created.
Background
The code base is a fork of the docprom repository. This code relies on Prometheus for monitoring. Prometheus retrieves its data from a running cAdvisor container. The provided metrics of cAdvisor for Prometheus can be seen on the following page. However, it provides only Gauge and Counter metric types. During my research I was not able to find parameters that would enable modifications/extensions of those provided metrics.
Problem
According to my current understanding, the metric type should be a Histogram or Summary in order to observe the quantiles. What is the best approach to use the histogram_quantile query on the metrics provided by cAdvisor?
My current idea is to
create a custom server
fetch the desired data from Prometheus
calculate the desired data
provide it as a metric from the server, so that Prometheus can scrape it
Run histogram_quantile on the custom metric
Is it the right approach in order to create a metric that can be used with quantiles?
For example I would like to fire an alarm if a certain containers' CPU usage exceeds a 0,95 quantile. The code for the CPU usage can be seen exemplary below:
sum(rate(container_cpu_usage_seconds_total{name="CONTAINER_NAME"}[10m]))) / count(node_cpu_seconds_total{mode="system"}) * 100
What would be the best approach to create the desired quantiles? Am I on the right path or am I missing something simple here? Because it looks way too hard for me in order to get a simple query with a quantile.
I am thankful for all help and information.
I am wondering if is it possible to run a new docker container by some automated means such that whenever the old container reaches a specific memory/CPU usage limit ,the old container doesn't get killed and new one balances the load.
You mean a sort of autoscaling, at the moment I don't have a built-in solution ready to be used but I can share with you my idea:
You can use a collector for metrics like cAdvisor https://github.com/google/cadvisor you can grab information about your container (you can also use docker stats to do that)
You can store this data inside a time series database like InfluxDB or prometheus.
Create a continuous query to trigger an event "create new container" when some metrics go our of your limit.
I know that you are looking for something of ready but at the moment I don't see any tools that resolve this problem.
It sounds like you need a container orchestrator for possibly other use cases. You can drive scale choices with metrics via almost any of them. Mesos, Kubernetes, or Swarm. Swarm is evolving a lot with Docker investing heavily. Swarm mode is a new feature coming in 1.12 which will put a lot of this orchestration in the core product, and would probably give you a great use case.
Sorry I just started to learn docker. My question may seem stupid for some of you.
In fact, I would like to know if there is a way to collect performance metrics from "CAdvisor" container (not from cgroup) at runtime ? I mean, extract performance values from the curves designed by cadvisor like memory usage or network traffic.
I need to record this values and save them in a database so that, I can perform a statistic analyzes upon these generated values (like comparing memory consumption for two docker containers at t=50s).
Thanks in advance.
As other answers mention, cAdvisor doesn't provide its own performance data API, instead it exposes metrics which are typically handled in a separate database if one wants to derive performance data beyond "real time". For example, cAdvisor exports Prometheus metrics natively:
http://prometheus.io/docs/instrumenting/exporters/
The Prometheus metric types:
http://prometheus.io/docs/concepts/metric_types/
Prometheus supports a fairly rich functional expression language that can be used for querying and visualization:
http://prometheus.io/docs/querying/basics/
cAdvisor does provide a rest endpoint to get any stats in real time. By default, it keeps latest two minute of data. You can configure it to keep more or less. It also supports a storage backend to keep dumping stats to an influxdb database.
REST Api:
eg. /api/v1.3/containers
doc: https://github.com/google/cadvisor/blob/master/docs/api.md
Doc on setting up InfluxDB:
https://github.com/google/cadvisor/blob/master/docs/influxdb.md
I think you could use https://github.com/tutumcloud/container-metrics for this. Basically what that would be doing is using influxdb http://influxdb.com/ as a time series data store.
There is some more information available here: http://blog.tutum.co/2014/08/25/panamax-docker-application-template-with-cadvisor-elasticsearch-grafana-and-influxdb/
A couple of people seemed to be looking into the ELK stack (Elastic Search, Logstash, Kibana) for visualising some of this data here: https://github.com/google/cadvisor/issues/634