Aggregation of Metrics by label - monitoring

Currently, I am trying to write a service that reads information from Prometheus, processes it, and then exposes the result to be scraped by Prometheus again.
I have this working, and the metrics are being scraped. However, to process the metrics I use a queue to distribute work to consumers, which causes the metrics, when queried, to be (correctly) registered as multiple different time series because of the different instance labels.
From what I can see, there are two main options, but I am unsure about one of them.
Add these metrics back to a queue and deploy a service that manages whether these metrics continue to be exposed (this can be seen working by deploying only 1 instance of the app).
I believe there may be a mechanism (Prometheus rules) to automatically consume these metrics and produce a single time series for each pod_name label, but I am unsure how to achieve this. I don't believe sum(x) by (pod_name) is correct, as I do not wish to have a sum of these values but a new series. If this is possible, my other worry is the redundant data left over once this new time series is created.
I appreciate any input.
Kind Regards.

You can use relabel_configs (or metric_relabel_configs, which is applied to the scraped samples) to modify labels as you wish.
Regarding the design, I think you need two labels: one for the instance that this metric was originally collected from, and one for the instance that it was delegated by.
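
To illustrate what collapsing the per-consumer series would mean, here is a minimal Python sketch. It is not Prometheus code: the sample data, the `collapse_instances` helper and the choice of keeping the maximum value per group are all assumptions made for illustration. The idea is the same as a non-summing PromQL aggregation such as max by (pod_name) (x), which yields one series per pod_name without adding the values together.

```python
def collapse_instances(samples, drop_label="instance"):
    """Merge samples that differ only in `drop_label`.

    `samples` is a list of (labels_dict, value) pairs. Samples whose
    remaining labels are identical are merged, keeping the maximum value
    (an assumption; any non-summing choice such as min or last would also work).
    """
    merged = {}
    for labels, value in samples:
        # Identity of the merged series: all labels except the dropped one.
        key = tuple(sorted((k, v) for k, v in labels.items() if k != drop_label))
        if key not in merged or value > merged[key]:
            merged[key] = value
    return merged


# Hypothetical samples: the same pod reported by two different consumers.
samples = [
    ({"pod_name": "api-1", "instance": "worker-a:9100"}, 0.42),
    ({"pod_name": "api-1", "instance": "worker-b:9100"}, 0.42),
    ({"pod_name": "db-1", "instance": "worker-a:9100"}, 0.10),
]
result = collapse_instances(samples)
```

Whether the original per-instance series are kept afterwards is a separate retention question; the sketch only shows the grouping step.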


Prometheus CPU Usage Histogram Metrics

My goal is to observe metrics (CPU, memory usage, etc.) with Prometheus on a server and on its running Docker containers. Before sending an alarm, I would like to compare certain values of those metrics with, e.g., a 0.95 quantile. However, after several weeks of searching the internet, I still struggle to create metrics for such quantiles. Therefore I am asking in this thread for your help/advice on how a quantile for certain metrics can be created.
The code base is a fork of the docprom repository. This code relies on Prometheus for monitoring. Prometheus retrieves its data from a running cAdvisor container. The provided metrics of cAdvisor for Prometheus can be seen on the following page. However, it provides only Gauge and Counter metric types. During my research I was not able to find parameters that would enable modifications/extensions of those provided metrics.
According to my current understanding, the metric type should be a Histogram or Summary in order to observe the quantiles. What is the best approach to use the histogram_quantile query on the metrics provided by cAdvisor?
My current idea is to:
create a custom server
fetch the desired data from Prometheus
calculate the desired data
provide it as a metric from the server, so that Prometheus can scrape it
run histogram_quantile on the custom metric
Is this the right approach to create a metric that can be used with quantiles?
For example, I would like to fire an alarm if a certain container's CPU usage exceeds a 0.95 quantile. The query for the CPU usage is shown below as an example:
sum(rate(container_cpu_usage_seconds_total{name="CONTAINER_NAME"}[10m])) / count(node_cpu_seconds_total{mode="system"}) * 100
What would be the best approach to create the desired quantiles? Am I on the right path, or am I missing something simple here? It seems far too complicated for what should be a simple query with a quantile.
I am thankful for all help and information.
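
As a rough illustration of the comparison described above, the following Python sketch computes a quantile directly from a window of gauge-style samples and checks whether the current value exceeds it. No histogram type is needed for this kind of calculation; as far as I know, PromQL's quantile_over_time function works on plain samples over a time range in a similar spirit. The `quantile` and `exceeds_quantile` helpers, the linear interpolation between neighbouring ranks, and the sample values are assumptions for illustration, not output from cAdvisor.

```python
import math


def quantile(q, values):
    """Return the q-quantile (0 <= q <= 1) of `values`.

    Uses linear interpolation between the two nearest ranks of the
    sorted samples.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    ordered = sorted(values)
    rank = q * (len(ordered) - 1)
    lower = math.floor(rank)
    upper = min(lower + 1, len(ordered) - 1)
    weight = rank - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


def exceeds_quantile(current, history, q=0.95):
    """True if `current` is above the q-quantile of the historical samples."""
    return current > quantile(q, history)


# Hypothetical CPU usage percentages collected over some window.
history = [10.0, 12.0, 11.0, 13.0, 50.0, 12.5, 11.5, 10.5, 12.0, 11.0]
threshold = quantile(0.95, history)
```

With a helper like this, an alert condition reduces to `exceeds_quantile(current_value, history)`; inside Prometheus the equivalent would be expressed as a query rather than in a custom server.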

Share storage/volume between worker nodes in Kubernetes?

Is it possible to have a centralized storage/volume that can be shared between two pods/instances of an application that exist in different worker nodes in Kubernetes?
So to explain my case:
I have a Kubernetes cluster with 2 worker nodes. Each of them runs 1 instance of app X, which means I have 2 instances of app X running in total at the same time.
Both instances subscribe on the topic topicX, that has 2 partitions, and are part of a consumer group in Apache Kafka called groupX.
As I understand it the message load will be split among the partitions, but also among the consumers in the consumer group. So far so good, right?
So to my problem:
In my whole solution I have a hierarchy division with the unique constraint by country and ID. Each combination of country and ID has a pickle model (python Machine Learning Model), which is stored in a directory accessed by the application. For each combination of a country and ID I receive one message per minute.
At the moment I have 2 countries, so to be able to scale properly I wanted to split the load between two instances of app X, each one handling its own country.
The problem is that with Kafka the messages can be balanced between the different instances, so to access the pickle files in each instance without knowing which country a message belongs to, I have to store the pickle files in both instances.
Is there a way to solve this? I would rather keep the setup as simple as possible so it is easy to scale and add a third, fourth and fifth country later.
Keep in mind that this is an overly simplified way of explaining the problem. The number of instances is much higher in reality etc.
Yes, it's possible. If you look at this table, any PV (PersistentVolume) type that supports ReadWriteMany will let you share the same data store between your Kafka workers. So in summary these:
VsphereVolume - (works when pods are collocated)
In my opinion, NFS is the easiest to implement. Note that Azurefile, Quobyte, and Portworx are paid solutions.
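
Once a ReadWriteMany volume is mounted into every pod, each instance can look up the pickle file for whatever country and ID a message carries. Below is a minimal Python sketch of that lookup. The mount path `/mnt/models`, the `country/ID.pkl` directory layout and the helper names are hypothetical and would need to match the actual setup.

```python
import pickle
from functools import lru_cache
from pathlib import Path

# Hypothetical mount point of the shared (ReadWriteMany) volume.
MODEL_ROOT = Path("/mnt/models")


def model_path(country, model_id, root=MODEL_ROOT):
    """Build the path of the pickle file for one (country, ID) combination."""
    return Path(root) / country / f"{model_id}.pkl"


@lru_cache(maxsize=128)
def load_model(country, model_id, root=MODEL_ROOT):
    """Load a model from the shared volume and cache it in memory.

    Only unpickle files from a trusted source: pickle can execute
    arbitrary code while loading.
    """
    with open(model_path(country, model_id, root), "rb") as fh:
        return pickle.load(fh)
```

A consumer would call `load_model(country, model_id)` for each incoming message. Note that the cache means a model replaced on disk is not picked up until the process restarts or `load_model.cache_clear()` is called.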

Removal of labels in Prometheus

I'm looking into our company using Prometheus to gather stats from our experiments which run on Kubernetes. There's a plan to use labels to mark the name of specific experiments in our cloud / cluster. This means we will generate a lot of labels which will hog storage over time. When the associated time series have expired, will the labels also be deleted?
tldr; From an operational perspective, Prometheus does not differentiate between time-series names and labels; by deleting your experiment data you will effectively recover the labels you created.
What follows is only relevant to Prometheus >= 2.0
Prometheus stores a time series for each unique combination of metric name, label, and label value. So my_metric{my_tag="a"}, my_metric{my_tag="b"}, and your_metric{} are all just different time series; there is nothing special about labels or label values vs. metric names.
Furthermore, Prometheus stores data in 2-hour frames on disk. So any labels you've created do not affect the operation of your database after two hours, except for on-disk storage size, and query performance if you actually access that older data. Both of these concerns are addressed after your data is purged. Experiment away!
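
The point that a series is identified by its metric name together with its full label set can be sketched in a few lines of Python. This is only a model of the idea, not how Prometheus implements it; the `series_key` helper is made up for illustration.

```python
def series_key(metric_name, labels):
    """Identity of a time series: the metric name plus the complete label set."""
    return (metric_name, frozenset(labels.items()))


# The three series from the text, plus a repeated sample of the first one.
samples = [
    ("my_metric", {"my_tag": "a"}),
    ("my_metric", {"my_tag": "b"}),
    ("your_metric", {}),
    ("my_metric", {"my_tag": "a"}),
]
distinct_series = {series_key(name, labels) for name, labels in samples}
```

The repeated sample maps to an existing key, so only three distinct series result; a new experiment label value would add a new key in the same way a new metric name would.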

bosun and telegraf metrics meta information

Hello, I really want to use bosun/tsdbrelay/opentsdb with the telegraf collector, as it gets all the metrics we want to monitor out of the box.
I already have a small setup to push metrics from 5 servers to bosun for indexing and opentsdb for storage.
I used the haproxy configs from Kyle Brandt's bosun infrastructure blog to make the tsdbs HA-ready,
but bosun is showing that it cannot use the auto-type for metrics, and the primary stats view does not show any graphs for cpu / mem etc.
What can I provide so that the graphs show up?
Kind regards.
Both of these features are mostly scollector specific. The "host" view (I've considered ripping that out, it was done in the early days, better to use something like grafana) depends on scollector specific metrics such as os.cpu.
As far as "Auto" for rate vs gauge, that is also metadata that comes from scollector and is sent to bosun. If you want to try to mimic the behavior, you would need to create at least the "rate" key for each metric you are getting from telegraf.

Fetch data subset from gmond

This is in the context of a small data-center setup where the number of servers to be monitored is only in the double digits and may grow only slowly to a few hundred (if at all). I am a ganglia newbie and have just completed setting up a small ganglia test bed (and have been reading and playing with it). A couple of things I have realised:
gmetad supports interactive queries on port 8652 using which I can get metric data subsets - say data of particular metric family in a specific cluster
gmond seems to always return the whole dump of data for all metrics from all nodes in a cluster (on doing 'netcat host 8649')
In my setup, I don't want to use gmetad or RRD. I want to fetch data directly from the multiple gmond clusters and store it in a single data store. There are a couple of reasons not to use gmetad and RRD:
I don't want multiple data stores in the whole setup. I can have one dedicated machine to fetch data from the few clusters and store it
I don't plan to use gweb as the data front end. The data from ganglia will be fed into a different monitoring tool altogether. With this setup, I want to eliminate the latency that another layer of gmetad could add. That is, if gmetad polls, say, every minute and my management tool polls gmetad every minute, that adds up to 2 minutes of delay, which I feel is unnecessary for a relatively small/medium sized setup
There are a couple of problems with this approach for which I need help:
I cannot get filtered data from gmond. Is there some plugin that can help me fetch individual metric/metric-group information from gmond (since different metrics are collected at different intervals)?
gmond output is very verbose text. Is there some other (hopefully binary) format that I can configure for export?
Is my idea of eliminating gmetad/RRD completely a very bad idea? Has anyone tried this approach before? What should I be careful of in doing so, from a data collection standpoint?
Thanks in advance.
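
For reference, the "fetch the full dump and filter on the client" approach described above can be sketched in Python as follows. This is only a sketch under assumptions: it relies on gmond answering a plain TCP connection on port 8649 with an XML document containing CLUSTER, HOST and METRIC elements with NAME and VAL attributes, which should be checked against an actual dump. The host name in the usage note and both helper names are hypothetical.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Read the complete XML dump that gmond sends on connect, as bytes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(65536)
            if not data:  # gmond closes the connection after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def filter_metrics(xml_data, wanted):
    """Return (cluster, host, metric, value) tuples for the wanted metric names."""
    wanted = set(wanted)
    root = ET.fromstring(xml_data)
    rows = []
    for cluster in root.iter("CLUSTER"):
        for host in cluster.iter("HOST"):
            for metric in host.iter("METRIC"):
                if metric.get("NAME") in wanted:
                    rows.append((cluster.get("NAME"), host.get("NAME"),
                                 metric.get("NAME"), metric.get("VAL")))
    return rows
```

A poller could then call `filter_metrics(fetch_gmond_xml("gmond.example.org"), {"load_one", "cpu_user"})` at whatever interval the downstream tool needs. The full dump still crosses the network each time, so this only moves the filtering to the client rather than reducing what gmond sends.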
