We are planning to use cAdvisor to collect cgroup data from a Docker host. We already use collectd and Grafana to chart other application metrics.
Does anyone know of a cAdvisor plugin for collectd? As far as I know, collectd cannot pull cgroup data for Docker containers.
cAdvisor has InfluxDB support, and Grafana can hook up to InfluxDB for metrics visualization. But since we do not have InfluxDB in our current landscape, we are exploring a quick approach to Docker container metrics monitoring.
Thanks in advance
A cAdvisor plugin for collectd would be pretty simple. Can you file an issue on github.com/google/cadvisor so we can help you get one?
Alternatively, you can always have a helper process hit the cAdvisor REST endpoint for the whole machine or a particular container and push the data into Graphite, e.g.:
/api/v1.3/containers/
In any case, file a feature request and we can help you out with the setup.
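A minimal sketch of such a helper process, under some assumptions: cAdvisor listens on localhost:8080, Graphite accepts its plaintext protocol on localhost:2003, and each record carries a `stats` list whose samples expose `cpu.usage.total` and `memory.usage`. The metric prefix and function names are made up for illustration, so check the field names against the JSON your cAdvisor version actually returns.

```python
import json
import socket
import time
import urllib.request

# Assumed addresses; adjust for your environment.
CADVISOR_URL = "http://localhost:8080/api/v1.3/containers/"
GRAPHITE_ADDR = ("localhost", 2003)  # Graphite plaintext listener


def to_graphite_lines(container_info, prefix="cadvisor"):
    """Turn the newest sample of one cAdvisor record into Graphite lines."""
    stats = container_info.get("stats") or []
    if not stats:
        return []
    latest = stats[-1]
    # Names look like "/" or "/docker/<id>"; make them safe as metric paths.
    raw_name = container_info.get("name", "/").strip("/")
    name = raw_name.replace(".", "_").replace("/", ".") or "root"
    now = int(time.time())
    metrics = {
        "cpu.usage_total_ns": latest["cpu"]["usage"]["total"],
        "memory.usage_bytes": latest["memory"]["usage"],
    }
    # Graphite plaintext format: "<path> <value> <unix timestamp>"
    return [f"{prefix}.{name}.{key} {value} {now}" for key, value in metrics.items()]


def fetch_container_info(url=CADVISOR_URL):
    """Fetch the container record from the cAdvisor REST endpoint."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def push_to_graphite(lines, addr=GRAPHITE_ADDR):
    """Send newline-terminated metric lines to Graphite over TCP."""
    payload = "".join(line + "\n" for line in lines).encode("ascii")
    with socket.create_connection(addr, timeout=5) as sock:
        sock.sendall(payload)


def main(interval_s=10):
    """Poll cAdvisor and forward the latest sample until interrupted."""
    while True:
        push_to_graphite(to_graphite_lines(fetch_container_info()))
        time.sleep(interval_s)
```

Calling `main()` starts the polling loop. The same parsing function could feed a collectd exec script instead of Graphite.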
You can use fluentd to collect data from cAdvisor using fluent-plugin-cadvisor.
It may not be the best plugin, but it is very easy to extend and add to fluentd as your own plugin.
Related
I'm trying to determine whether cAdvisor + Prometheus is the OTHER option for monitoring OpenShift containers, or whether there is another combo that I can use natively from Prometheus.
cAdvisor is essentially built into K8S deployments so it's as native as you can get really. If you want to use additional software there are other ways to collect data with agents, but cAdvisor is quite well understood and efficient for doing this type of data collection. Prometheus also scrapes other K8S APIs aside from cAdvisor.
I am planning to use cAdvisor to monitor the performance of running Docker containers on multiple VMs. Do I need to install cAdvisor on all VMs, or is there another way?
Yes, you need it on each host, since it uses the local mounts to get the data it exports.
I would like to deploy a sidecar container that measures the memory usage (and potentially also the CPU usage) of the main container in the pod and then sends this data to an endpoint.
I was looking at cAdvisor, but Google Kubernetes Engine has a hardcoded 10s measuring interval, and I need 1s granularity. Deploying another cAdvisor is an option, but I need those metrics only for a subset of pods, so it would be wasteful.
Is it possible to write a sidecar container that monitors the main container metrics? If so, what tools could the sidecar use to gather the data?
That one-second granularity will probably be the main showstopper for many monitoring tools. In theory you can script it on your own. You can use the Docker stats API and read the stats stream only for the main container. You will need to mount /var/run/docker.sock into the sidecar container. Curl example:
curl -N --unix-socket /var/run/docker.sock http://localhost/containers/<container-id>/stats
Another option is to read the metrics from cgroups, but you will need more calculations in this case. The cgroups will have to be mounted into the sidecar container. See some examples of cgroup pseudo-files at https://docs.docker.com/config/containers/runmetrics/
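To illustrate the extra calculation, here is a small Python sketch. It assumes cgroup v1 pseudo-files such as memory.usage_in_bytes and cpuacct.usage (a cumulative counter in nanoseconds). On cgroup v2 the corresponding values live in memory.current and in the usage_usec field of cpu.stat. The file paths are passed in by the caller because they depend on how the cgroups are mounted into the sidecar, and the function names are hypothetical.

```python
import time
from pathlib import Path


def read_int(path):
    """Read a pseudo-file that holds a single integer."""
    return int(Path(path).read_text().strip())


def parse_cpu_stat(text):
    """Parse cgroup v2 cpu.stat content ('key value' per line) into a dict."""
    result = {}
    for line in text.splitlines():
        if line.strip():
            key, value = line.split()
            result[key] = int(value)
    return result


def cpu_percent(prev_usage_ns, curr_usage_ns, interval_s):
    """Percent of one CPU used between two readings of a cumulative counter."""
    return (curr_usage_ns - prev_usage_ns) / (interval_s * 1e9) * 100.0


def sample(memory_file, cpu_usage_file, interval_s=1.0):
    """Return (memory bytes, CPU percent) measured over one interval."""
    before = read_int(cpu_usage_file)
    time.sleep(interval_s)
    after = read_int(cpu_usage_file)
    return read_int(memory_file), cpu_percent(before, after, interval_s)
```

The CPU figure is relative to a single core, so a container that keeps two cores busy reports roughly 200.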
This could be done by sharing the process namespace for the Pod. Then the sidecar container would be able to see the processes from the main container (e.g. via ps), and would be able to monitor the CPU / Memory usage with standard unix tools.
One tool could be node-exporter with the processes collector enabled, which can then be scraped by Prometheus.
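As a rough illustration of the shared process namespace approach, the following Python sketch reads memory figures straight from /proc. It assumes the Pod spec sets `shareProcessNamespace: true`, so the main container's processes are visible under the sidecar's /proc. The function names are made up, and VmRSS is only one of several memory figures you might want to track.

```python
from pathlib import Path


def parse_vmrss_kib(status_text):
    """Extract VmRSS (resident memory, in KiB) from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None  # kernel threads have no VmRSS line


def find_pids(command_name, proc_root="/proc"):
    """Return PIDs whose command name (/proc/<pid>/comm) matches."""
    pids = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue
        try:
            if (entry / "comm").read_text().strip() == command_name:
                pids.append(int(entry.name))
        except OSError:
            continue  # the process exited while we were scanning
    return pids


def rss_kib(pid, proc_root="/proc"):
    """Resident memory of one process in KiB, or None if unavailable."""
    status = (Path(proc_root) / str(pid) / "status").read_text()
    return parse_vmrss_kib(status)
```

For example, `sum(rss_kib(p) or 0 for p in find_pids("nginx"))` would total the resident memory of all processes named nginx in the Pod, where nginx is just a placeholder.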
See topz, a simple utility to expose top command as web interface.
You can use Prometheus and Grafana for memory and cpu usage and monitoring. These are open source tools and can be used on production environment as well.
Is there a way to run a cAdvisor container on a monitoring server and monitor Docker containers on a separate server? Is there an option I can pass when running cAdvisor?
I want to be able to monitor containers on a separate server, but I'm not sure how to achieve that.
Any suggestions or shared knowledge would be very helpful. Thank you.
To take measurements from different machines, you will have to deploy cAdvisor to every separate server.
My source is:
A monitoring solution for Docker hosts, containers and containerized services
Extending the monitoring system
Dockprom Grafana dashboards can be easily extended to cover more than one Docker host. In order to monitor more hosts, all you need to do is to deploy a node-exporter and a cAdvisor container on each host and point the Prometheus server to scrape those.
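One way to point Prometheus at several hosts is its file-based service discovery, which reads a JSON list of target groups. The Python sketch below generates such a file. The host names are placeholders, the `exporter` label is made up for illustration, and ports 8080 (cAdvisor) and 9100 (node-exporter) are the usual defaults, which your deployment may override.

```python
import json


def build_targets(hosts, cadvisor_port=8080, node_exporter_port=9100):
    """Build file_sd target groups for cAdvisor and node-exporter on each host."""
    return [
        {
            "targets": [f"{host}:{cadvisor_port}" for host in hosts],
            "labels": {"exporter": "cadvisor"},
        },
        {
            "targets": [f"{host}:{node_exporter_port}" for host in hosts],
            "labels": {"exporter": "node-exporter"},
        },
    ]


def write_targets(path, hosts):
    """Write the target groups to a JSON file for Prometheus to pick up."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(build_targets(hosts), handle, indent=2)
```

Calling `write_targets("targets.json", ["vm1", "vm2"])` produces a file that a scrape job can reference through `file_sd_configs`.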
On how to create and start a container by using the remote api, you can check this answer: How to use docker remote api to create container?
I'm looking for a monitoring solution for a web application deployed as a Swarm of Docker containers spread across 7-10 VMs. The high-level requirements are:
Configurable Web and REST interface to performance dashboard
General performance metrics on VM levels (CPU/Memory/IO)
Alerts when containers and/or VMs go offline or restart
Possibility to drill down into containers process activity when needed
Host OS are CoreOS and Ubuntu
Any recommendations/best practices here?
NOTE: external Kibana installation is being used to collect application logs from Logstash agents deployed on VMs.
Based on your requirements, it sounds like Sematext Docker Agent would be a good fit. It runs as a tiny container on each Docker host and collects all host+containers metrics, events, and logs. It can parse logs, route them, blacklist/whitelist them, has container auto-discovery, and so on. In the end logs end up in Logsene and metrics and events end up in SPM, which gives you a single pane of glass sort of view into all your Docker ops bits, with alerting, anomaly detection, correlation, and so on.
I am currently evaluating bosun with scollector + cAdvisor support. It looks OK so far.
Edit:
It should meet all the listed requirements and a little bit more. :)
Take a look at the Axibase Time-Series Database / Google cAdvisor / collectd stack.
Disclosure: I work for the company that develops ATSD.
Deploy 1 cAdvisor container per VM to collect Docker container statistics. The cAdvisor front-end allows you to view top container processes.
Deploy 1 ATSD container to ingest data from multiple cAdvisor instances.
Deploy a collectd daemon on each VM to collect host statistics, and configure the collectd daemons to stream data into ATSD using the write_atsd plugin.
Dashboards: Host, Container
API / SQL:
https://github.com/axibase/atsd/tree/master/api#api-categories
Alerts:
ATSD comes with a built-in rule engine. You can configure a rule to detect when a container stops reporting data and trigger an email or a system command.