Removal of labels in Prometheus - monitoring

I'm looking into our company using Prometheus to gather stats from our experiments which run on Kubernetes. There's a plan to use labels to mark the name of specific experiments in our cloud / cluster. This means we will generate a lot of labels which will hog storage over time. When the associated time series have expired, will the labels also be deleted?

TL;DR: From an operational perspective, Prometheus does not differentiate between metric names and labels; once your experiment data is deleted, the storage used by the labels you created is reclaimed as well.
What follows is only relevant to Prometheus >= 2.0.
Prometheus stores a time series for each unique combination of metric name and label name/value pairs. So my_metric{my_tag="a"}, my_metric{my_tag="b"}, and your_metric{} are all just different time series; there is nothing special about labels or label values vs. metric names.
Furthermore, Prometheus stores data on disk in 2-hour blocks. So any labels you've created do not affect the operation of your database after two hours, except for on-disk storage size, and query performance if you actually access that older data. Both of these concerns go away once your data is purged. Experiment away!
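To make the reasoning concrete, here is a toy model in Python. It is not Prometheus's actual storage engine, only a sketch under the assumptions stated in the answer: a series is identified by its metric name plus label pairs, data lives in two-hour blocks, and purging removes whole blocks. The class and function names (ToyTSDB, series_key) are made up for illustration.

```python
from dataclasses import dataclass, field

BLOCK_SECONDS = 2 * 60 * 60  # two-hour blocks, as described in the answer


def series_key(name, labels):
    """Identity of a series: metric name plus its sorted label pairs."""
    return (name, tuple(sorted(labels.items())))


@dataclass
class ToyTSDB:
    # block start time -> {series key -> list of (timestamp, value)}
    blocks: dict = field(default_factory=dict)

    def append(self, name, labels, ts, value):
        start = ts - ts % BLOCK_SECONDS
        block = self.blocks.setdefault(start, {})
        block.setdefault(series_key(name, labels), []).append((ts, value))

    def purge(self, now, retention):
        """Drop whole blocks that lie entirely outside the retention window."""
        for start in list(self.blocks):
            if start + BLOCK_SECONDS <= now - retention:
                del self.blocks[start]

    def label_values(self, label):
        """All values of a label that still exist in any remaining block."""
        return {
            v
            for block in self.blocks.values()
            for (_name, pairs) in block
            for (k, v) in pairs
            if k == label
        }


db = ToyTSDB()
db.append("my_metric", {"my_tag": "a"}, ts=0, value=1.0)
db.append("my_metric", {"my_tag": "b"}, ts=3 * 3600, value=2.0)
db.purge(now=4 * 3600, retention=2 * 3600)
remaining = db.label_values("my_tag")  # only the label value in the surviving block
```

In this model the label value "a" disappears together with the block that held its only samples, which is the point the answer makes: nothing about a label outlives the series that carried it.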

Related

Aggregation of Metrics by label

Currently, I am trying to write a service that reads information from Prometheus, processes it, and then exposes the result to be scraped by Prometheus again.
I have this working, and the metrics are being scraped. To process the metrics, however, I am using a queue to distribute work to consumers, which causes the metrics to be (correctly) registered as multiple different time series when queried, due to the different instance labels.
From what I can see there are two main options, but I am unsure about one of them:
Add these metrics back to a queue and deploy a service to manage whether these metrics continue to be exposed (this can be seen working by deploying only 1 instance of the app).
I believe that there may be a mechanism (the Prometheus rules) to automatically consume these metrics and produce a single time series for each pod_name label, but I am unsure how to achieve this: I don't believe using sum(x) by (pod_name) is correct, as I do not wish to have a sum of these values but a new series. If this is possible, my other worry is the redundant data left over once this new time series is created.
I appreciate any input.
Kind Regards.
You can use relabel_configs (or metric_relabel_configs, which is applied to the scraped samples) to modify labels as you wish.
Regarding the design, I think you need to have 2 labels: one for the instance that this metric was originally collected from, and one for the instance that it was delegated by.
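As a rough illustration of the two-label design, the sketch below renders a sample in the Prometheus text exposition format using only the Python standard library. The metric name processed_value and the label names source_instance and processed_by are hypothetical, chosen here for illustration; they do not come from the question.

```python
def format_sample(name, labels, value):
    """Render one sample line in the Prometheus text exposition format."""

    def escape(text):
        # Label values must escape backslash, double quote and newline.
        return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

    body = ",".join(f'{k}="{escape(str(v))}"' for k, v in sorted(labels.items()))
    return f"{name}{{{body}}} {value}"


line = format_sample(
    "processed_value",  # hypothetical metric name
    {
        "source_instance": "10.0.0.5:9100",  # where the data was originally collected
        "processed_by": "worker-2",          # which consumer processed and exposed it
    },
    3.5,
)
```

With both labels present, a query can group by source_instance to follow one origin regardless of which consumer handled it, while processed_by stays available for debugging the workers.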

Is there a way to limit the performance data being recorded by AKS clusters?

I am using azure log analytics to store monitoring data from AKS clusters. 72% of the data stored is performance data. Is there a way to limit how often AKS reports performance data?
At this point we do not provide a mechanism to change performance metric collection frequency. It is set to 1 minute and cannot be changed.
We were actually thinking about adding an option to allow more frequent collection, as was requested by some customers.
Given the number of objects (pods, containers, etc.) running in the cluster, collecting even a few perf metrics may generate a noticeable amount of data... You need that data in order to figure out what is going on in case of a problem.
Curious: you say your perf data is 72% of the total. How much is it in terms of GB/day, do you know? Do you have any active applications running on the cluster generating tracing? What we see is that once you stand up a new cluster, perf data is "the king" of volume, but once you start adding active apps that trace, logs become more and more of a factor in telemetry data volume...
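A back-of-the-envelope estimate shows why a fixed 1-minute interval adds up quickly. The sketch below is plain arithmetic; the object count, metrics per object, and bytes per record are placeholder assumptions for illustration, not figures from Azure.

```python
MINUTES_PER_DAY = 24 * 60  # one sample per minute, the fixed interval stated above


def perf_records_per_day(objects, metrics_per_object):
    """Number of perf records emitted per day at one record per metric per minute."""
    return objects * metrics_per_object * MINUTES_PER_DAY


def gb_per_day(records, bytes_per_record):
    """Daily volume in GB for an assumed average record size."""
    return records * bytes_per_record / 1e9


# Assumed cluster: 500 containers, 4 perf metrics each, 500 bytes per stored record.
records = perf_records_per_day(objects=500, metrics_per_object=4)
volume = gb_per_day(records, bytes_per_record=500)
```

Under these assumed numbers the cluster produces 2,880,000 records per day; since the interval cannot be changed, the remaining levers in this estimate are the number of objects and the number of metrics collected per object.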

Naming statsd metrics for short lived streams

I am trying to model statistics to submit to statsd/graphite. However what I am monitoring is "session" centric. For example, I have a game that is played in real time. There are multiple instances of a game active on the servers. Each game has multiple (and variable number of) participants. Each instance of a game has a unique ID as does each player.
I want to track (and graph) each player's stats, but then roll the metric up for the whole instance and then for all the instances of a game. For example, there may be two instances of a game active at a given time. Let's say each has two players in the game:
GameTitle.RealTime.VoiceErrors.game_instance_a.player_id_1 10
GameTitle.RealTime.VoiceErrors.game_instance_a.player_id_2 20
GameTitle.RealTime.VoiceErrors.game_instance_b.player_id_3 50
GameTitle.RealTime.VoiceErrors.game_instance_b.player_id_4 70
where game_instances and player_ids are 128-bit numbers.
I want to be able to see that the value of all voice errors for game_instance_a is 30, while all voice errors across the system is 150.
Given this, I have three questions:
1. What guidance would you have on naming the metrics?
2. Is it kosher to have metrics that have "dynamic" identifiers as part of the name?
3. What are the scale limits on this? If I had 100K game instances with, say, as many as 1000 players in a game, is this going to kill statsd/graphite?
Thanks!
What guidance would you give on naming the metrics?
Graphite recommends that "Volatile path components should be kept as deep into the hierarchy as possible". This essentially means that if you can push the parts of the metrics that are frequently unique to the end of the "bucket" without impacting your grouping queries you should try to do so.
Here is a great post on using Graphite that includes naming recommendations. And here is another one with additional info from Jason Dixon (an excellent source for Graphite stuff in general).
Is it kosher to have metrics that have "dynamic" identifiers as part of the name?
I usually try to avoid identifiers in the metric names unless they are very low in number (<100). Because Graphite will store a .wsp file for every metric name you'll have a difficult time re-sizing or adjusting the storage settings should you decide to change your configuration. Additionally, the Graphite UI will have a "folder" for every metric name so you can easily make the UI unusable.
In your case, I'd probably graph the total number of game instances, the total number of players, the number of errors (by type), etc. Additionally, I might try to track players per instance (generally) and maybe errors per instance, again without recording the actual instance ID (e.g. GameTitle.RealTime.PerInstance.VoiceErrors), if I had that capability (i.e. state stored per instance in my application).
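One way to follow this advice is to do the roll-up inside the application and emit only fixed bucket names. The sketch below uses the numbers from the question; the ActiveInstances bucket is a hypothetical name added for illustration, and the lines use the standard statsd "bucket:value|type" format (c for counter, g for gauge).

```python
from collections import defaultdict


def roll_up(per_player_errors):
    """per_player_errors maps (instance_id, player_id) -> error count."""
    per_instance = defaultdict(int)
    for (instance_id, _player_id), count in per_player_errors.items():
        per_instance[instance_id] += count
    return dict(per_instance), sum(per_instance.values())


def statsd_lines(total_errors, instance_count):
    """Only fixed bucket names are emitted; no instance or player IDs."""
    return [
        f"GameTitle.RealTime.VoiceErrors:{total_errors}|c",
        f"GameTitle.RealTime.ActiveInstances:{instance_count}|g",  # hypothetical bucket
    ]


errors = {
    ("game_instance_a", "player_id_1"): 10,
    ("game_instance_a", "player_id_2"): 20,
    ("game_instance_b", "player_id_3"): 50,
    ("game_instance_b", "player_id_4"): 70,
}
per_instance, total = roll_up(errors)
lines = statsd_lines(total, len(per_instance))
```

The per-instance totals (30 and 120) stay available in the application for logging, while Graphite only ever sees two metric names no matter how many games are running.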
Logstash, Elasticsearch, Kibana
I'd suggest logging this error information with instance and player IDs and using Logstash to send your logs to Elasticsearch and Kibana. Then I'd watch Graphite for real-time error and health anomaly detection and use Kibana (and Elasticsearch underneath) to dig deeper.
What are the scale limits on this? If I had 100K game instances with, say, as many as 1000 players in a game, is this going to kill statsd/graphite?
Statsd should have no problem with this, as it just acts as a -mostly- dumb aggregator. While it does maintain some state internally I don't anticipate a problem.
I don't think you'll have problems with the internal Graphite Whisper Storage itself, as it is just using files and folders. But, as I mentioned above, the Graphite Web UI will be unusable and I think you'll also run the risk of other manageability issues.
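To get a feel for what one .wsp file per metric name means at the scale in the question, here is a rough estimate. The header and datapoint sizes reflect my understanding of the Whisper file layout (16-byte metadata, 12 bytes per archive header, 12 bytes per datapoint); the retention of one day at 10-second resolution is purely an assumption for illustration.

```python
METADATA_BYTES = 16      # Whisper file header
ARCHIVE_INFO_BYTES = 12  # per-archive header
POINT_BYTES = 12         # 4-byte timestamp + 8-byte double


def whisper_file_bytes(points_per_archive):
    """Size of one .wsp file; Whisper preallocates every archive in full."""
    return METADATA_BYTES + sum(
        ARCHIVE_INFO_BYTES + points * POINT_BYTES for points in points_per_archive
    )


# Numbers from the question: 100K game instances, up to 1000 players each.
metric_count = 100_000 * 1_000

# Assumed retention: a single archive, 10-second resolution kept for one day.
points_one_day = 24 * 60 * 60 // 10
file_bytes = whisper_file_bytes([points_one_day])
total_terabytes = metric_count * file_bytes / 1e12
```

Under these assumptions that is 100 million files of roughly 100 KB each, on the order of 10 TB, and the files are never reclaimed automatically when a game ends. This is one concrete form the manageability issues mentioned above can take.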
Summary
Keep the volatile (dynamic) metric buckets at the end of the name and avoid going above a couple hundred of these.

Fetch data subset from gmond

This is in the context of a small data-center setup where the number of servers to be monitored is only in the double digits and may grow only slowly to a few hundred (if at all). I am a ganglia newbie and have just completed setting up a small ganglia test bed (and have been reading and playing with it). A couple of things I have realised:
gmetad supports interactive queries on port 8652 using which I can get metric data subsets - say data of particular metric family in a specific cluster
gmond seems to always return the whole dump of data for all metrics from all nodes in a cluster (on doing 'netcat host 8649')
In my setup, I don't want to use gmetad or RRD. I want to fetch data directly from the multiple gmond clusters and store it in a single data store. There are a couple of reasons not to use gmetad and RRD:
I don't want multiple data stores in the whole setup. I can have one dedicated machine fetch data from the few clusters and store it.
I don't plan to use gweb as the data front end. The data from ganglia will be fed into a different monitoring tool altogether. With this setup, I want to eliminate the latency that another layer of gmetad could add. That is, if gmetad polls every minute and my management tool polls gmetad every minute, that adds up to 2 minutes of delay, which I feel is unnecessary for a relatively small/medium-sized setup.
There are a couple of problems with this approach for which I need help:
I cannot get filtered data from gmond. Is there some plugin that can help me fetch individual metric/metric-group information from gmond (since different metrics are collected at different intervals)?
gmond output is very verbose text. Is there some other (hopefully binary) format that I can configure for export?
Is my idea of eliminating gmetad/RRD completely a very bad idea? Has anyone tried this approach before? What should I be careful of in doing so, from a data collection standpoint?
Thanks in advance.
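For reference, one possible workaround for the filtering problem is to read the full dump and filter on the client side. The sketch below is a minimal example, assuming gmond writes its complete XML report on TCP port 8649 and then closes the connection, and that metrics appear as METRIC elements with NAME and VAL attributes nested under HOST and CLUSTER. The function names are made up for illustration, and this does not reduce what gmond sends over the wire.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Read the whole XML dump gmond writes on connect, until it closes the socket."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def filter_metrics(xml_bytes, wanted):
    """Return (cluster, host, metric, value) tuples for the wanted metric names."""
    root = ET.fromstring(xml_bytes)
    rows = []
    for cluster in root.iter("CLUSTER"):
        for host in cluster.iter("HOST"):
            for metric in host.iter("METRIC"):
                if metric.get("NAME") in wanted:
                    rows.append(
                        (
                            cluster.get("NAME"),
                            host.get("NAME"),
                            metric.get("NAME"),
                            metric.get("VAL"),
                        )
                    )
    return rows
```

A collector could call filter_metrics(fetch_gmond_xml("gmond-host"), {"load_one"}) on its own schedule per metric group, where "gmond-host" is a placeholder for one node in each cluster.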

What is Mnesia replication strategy?

1. What strategy does Mnesia use to define which nodes will store replicas of a particular table?
2. Can I force Mnesia to use a specific number of replicas for each table? Can this number be changed dynamically?
3. Are there any sources (besides the source code) with a detailed (not just overview) description of Mnesia's internal algorithms?
1. Manual. You're responsible for specifying what is replicated where.
2. Yes, as above, manually. This can be changed dynamically.
3. I'm afraid (though I may be wrong) that there are none besides the source code. In terms of documentation, the whole Erlang distribution is hardly the leader in the software world.
Mnesia does not automatically manage the number of replicas of a given table.
You are responsible for specifying each node that will store a table replica (and hence the number of replicas). On a given node, a table may then be:
stored in memory (ram_copies),
stored on disk (disc_only_copies),
stored both in memory and on disk (disc_copies),
not stored on that node; in this case the table will still be accessible, but data will be fetched on demand from some other node(s).
It's possible to reconfigure the replication strategy when the system is running, though to do it dynamically (based on a node-down event for example) you would have to come up with the solution yourself.
The Mnesia system events could be used to discover a situation when a node goes down; given you know what tables were stored on that node you could check the number of their online replicas based on the nodes which were still online and then perform a replication if needed.
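The planning step described above can be sketched independently of Mnesia. The Python below is only an illustration of the bookkeeping, with made-up names (plan_repairs, the node and table names); in a real system the trigger would come from subscribing to Mnesia system events and the repair itself would be carried out through Mnesia's own API (for example mnesia:add_table_copy/3), not through this code.

```python
def plan_repairs(table_replicas, online_nodes, desired):
    """Return {table: [nodes that should receive a new replica]}.

    table_replicas maps a table name to the nodes configured to hold a copy.
    online_nodes is the set of nodes currently reachable.
    desired is the number of online replicas each table should have.
    """
    plan = {}
    for table, replicas in table_replicas.items():
        live = [node for node in replicas if node in online_nodes]
        missing = desired - len(live)
        if missing <= 0:
            continue
        # Candidates are online nodes that do not already hold a copy.
        candidates = sorted(node for node in online_nodes if node not in replicas)
        plan[table] = candidates[:missing]
    return plan


# Hypothetical cluster state after node b@host went down.
tables = {"users": ["a@host", "b@host"], "games": ["b@host", "c@host"]}
online = {"a@host", "c@host", "d@host"}
plan = plan_repairs(tables, online, desired=2)
```

The sketch ignores the hard parts that make this an advanced endeavour: choosing the copy type per node, avoiding concurrent repairs from several nodes, and deciding what to do when the failed node comes back.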
I'm not aware of any application/library which already manages this kind of stuff, and it seems like quite an advanced (from my point of view, at least) endeavor to make one.
However, Riak is a database which manages data distribution among its nodes transparently to the user and is configurable with respect to the options you mentioned. That may be the way to go for you.
