Relabeling in Prometheus - monitoring

Setup
Prometheus node exporter is registered as a service with consul agent with various tags. Example service definition provided to consul agent:
{
  "service": {
    "id": "server-stats",
    "name": "server-stats",
    "tags": [
      "a=1_meow",
      "b=2_woof",
      "c=3_moo",
      "monkey"
    ],
    "port": 9100,
    "checks": [
      {
        "name": "Process #1",
        "script": "/path/to/healthcheck/script.sh",
        "interval": "5s"
      }
    ]
  }
}
Prometheus is set to look for this server-stats service and use the configuration (host address and port) provided by Consul to scrape stats from servers. The above tags are available as a comma separated list in __meta_consul_tags that can be used for relabeling.
Prometheus relabeling configuration:
relabel_configs:
  - source_labels: [__meta_consul_tags]
    separator: ','
    #regex: '(.+)=(.+)'
    regex: '([a-z_]+)=([a-z_]+|\d+)'
    target_label: ${1}
    replacement: ${2}
Issue
I am trying to expose tags to Prometheus so that we can get stats and graphs based on labels. Keeping the above service configuration in mind, I would like each metric to have the following labels in addition to whatever Prometheus adds internally:
a=1_meow, b=2_woof, c=3_moo and ignore monkey because it is just a string. I can remove monkey from my list of tags if there is a solution that requires = to be there. The relabel configuration written above is not leading to exposing any tag at all and seems to be getting ignored. Running Prometheus with log level set to debug is also not yielding anything.
Relevant docs
https://prometheus.io/docs/operating/configuration/#%3Crelabel_config%3E
https://www.robustperception.io/extracting-full-labels-from-consul-tags/

Incorrect understanding
I think there was a mistake in my understanding of how labeling in prometheus works. My incorrect understanding was:
Before applying regex, the string would first be split on separator (otherwise what is its purpose?).
regex is evaluated against each substring.
If match groups are declared and found, they are available as indexed values for use in the target_label and replacement fields.
If regex does not match, that substring is ignored.
Because regex is applied to each substring after the split, multiple substrings lead to multiple labels.
Correct understanding
However, from brian-brazil's post linked in his answer and from the Prometheus documentation, it seems the following is happening:
All Consul tags arrive as one long separator-separated string in __meta_consul_tags, and the values of the labels listed in source_labels are joined into a single string.
regex is applied to that string only once.
If regex matches and includes groups, they are indexed beginning from 1 and available for use in target_label and replacement.
separator has no visible effect in this config even if you specify it: it only joins the values of multiple source_labels, and only one label is listed here.
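The behaviour described above can be sketched in Python. This is only an approximation of the default replace action, not Prometheus code: the function name relabel_replace is made up for illustration, Python's re module stands in for the RE2 engine Prometheus uses, and the sample value assumes the tags arrive joined by commas with a leading and trailing comma (which is what the working regexes further down rely on).

```python
import re

def relabel_replace(labels, source_labels, regex, target_label,
                    replacement="$1", separator=";"):
    """Approximate the default `replace` relabel action (illustrative only)."""
    # Values of all source labels are joined with the separator into ONE string.
    value = separator.join(labels.get(name, "") for name in source_labels)
    # The regex must match that whole string, and it is applied exactly once.
    match = re.fullmatch(regex, value)
    if match is None:
        return dict(labels)  # no match: labels are left unchanged
    # Translate Go-style $1 / ${1} references into Python's \g<1> form.
    template = re.sub(r"\$\{?(\w+)\}?", r"\\g<\1>", replacement)
    result = dict(labels)
    result[target_label] = match.expand(template)
    return result

# Assumed shape of the tag list for the service definition in the question.
tags = {"__meta_consul_tags": ",a=1_meow,b=2_woof,c=3_moo,monkey,"}

# The regex from the question cannot match the whole string, so nothing changes.
unchanged = relabel_replace(
    tags, ["__meta_consul_tags"], r"([a-z_]+)=([a-z_]+|\d+)", "unused", "${2}"
)

# A regex that covers the whole string and captures one tag value does work.
with_a = relabel_replace(tags, ["__meta_consul_tags"], r".*,a=([^,]+),.*", "a")
```

Under these assumptions, unchanged is identical to the input, while with_a gains a label a with the value 1_meow. One rule still yields at most one label, which is why a rule per tag is needed.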
Config from corrected understanding
Based on this idea and following the example in the question, I arrived at the following config, which works:
relabel_configs:
  - source_labels: [__meta_consul_tags]
    regex: '.*,a=([a-z0-9_]+),.+'
    target_label: 'a'
    replacement: ${1}
  - source_labels: [__meta_consul_tags]
    regex: '.*,b=([a-z0-9_]+),.+'
    target_label: 'b'
    replacement: ${1}
  - source_labels: [__meta_consul_tags]
    regex: '.*,c=([a-z0-9_]+),.+'
    target_label: 'c'
    replacement: ${1}
  - source_labels: [__meta_consul_tags]
    regex: '.*,d=([a-z0-9_]+),.+'
    target_label: 'd'
    replacement: ${1}
Caveats
I believe both approaches (the approach brian-brazil wrote in his blogpost, and what I am using above) have caveats - we either need to know all the labels we want beforehand, or have a set number of them. This means if a developer wants to associate different, or more labels with his/her service, s/he would need to work with ops as general flow will not be able to handle it. I think it is a minor caveat that should be addressed.

https://www.robustperception.io/extracting-full-labels-from-consul-tags/ shows how to do this, in particular the last example.

The following relabeling rule can be used for extracting a particular tag from __meta_consul_tags and placing it to destination_label:
relabel_configs:
  - source_labels: [__meta_consul_tags]
    regex: '.*,tag=([^,]+),.*'
    target_label: destination_label
There is no need to specify the replacement option because it defaults to $1.
If multiple tags must be extracted from __meta_consul_tags into multiple labels, repeat the relabeling rule once per tag. For example, the following relabeling rules extract the a tag into the a label and the b tag into the b label:
relabel_configs:
  - source_labels: [__meta_consul_tags]
    regex: '.*,a=([^,]+),.*'
    target_label: a
  - source_labels: [__meta_consul_tags]
    regex: '.*,b=([^,]+),.*'
    target_label: b
More information about relabeling rules in Prometheus is available in this article.
Update (2022-12-19): VictoriaMetrics and vmagent (these are Prometheus-like monitoring systems and scrapers I work on) expose __meta_consul_tag_<tagname> and __meta_consul_tagpresent_<tagname> labels for discovered targets starting from v1.85.2. This allows copying all the tags defined at Consul service into the scrape target labels with a single relabeling rule:
- action: labelmap
  regex: __meta_consul_tag_(.+)
Try playing with this relabeling rule at Prometheus relabel debugger.
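To illustrate what labelmap does with such per-tag labels, here is a rough Python sketch. The function name labelmap and the sample label set are assumptions made for illustration; the real action is implemented inside the scraper, and Python's re module only approximates its regex engine.

```python
import re

def labelmap(labels, regex, replacement="$1"):
    """Copy every label whose NAME fully matches regex to a new name."""
    # Translate Go-style $1 / ${1} references into Python's \g<1> form.
    template = re.sub(r"\$\{?(\w+)\}?", r"\\g<\1>", replacement)
    result = dict(labels)
    for name, value in labels.items():
        match = re.fullmatch(regex, name)
        if match:
            # The new label name comes from the capture group; the value is copied.
            result[match.expand(template)] = value
    return result

# Hypothetical discovered target carrying per-tag meta labels.
discovered = {
    "__meta_consul_tag_a": "1_meow",
    "__meta_consul_tag_b": "2_woof",
    "job": "server-stats",
}
mapped = labelmap(discovered, r"__meta_consul_tag_(.+)")
```

In this sketch mapped contains a and b alongside the original labels, without the tag names having to be listed in advance.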

Related

Helm3: How to replace 2 string values uses regex?

I'm trying to replace hardcoded values in a Helm3 template with variables.
The name of the variable must be derived from the name value in the Values.yaml file (name: test-rrrr-blabla-file): only the last 2 blocks, with nothing between them, i.e. blablafile.
I didn't find adequate examples of how to do it in
https://helm.sh/docs/chart_best_practices/templates/
so I tried the following expression:
{{- default .Chart.Name .Values | (split "-" .Values.nameOverride)._3 }}-{{ (split "-" .Values.nameOverride)._4 }}
but it didn't work.
Also found undocumented capabilities here:
https://github.com/Masterminds/sprig/blob/master/docs/strings.md#regexfind
I'm not sure exactly; maybe I need to use regexSplit
or regex_replace, but I don't understand how to compose the expression properly. Maybe you have come across this in practice?
Any help would be appreciated.
Thank you!
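To pin down the transformation being asked for, here is a minimal Python sketch of the string operation only. It is not a Helm template; the function name short_name and its blocks parameter are made up for illustration.

```python
def short_name(name: str, blocks: int = 2) -> str:
    """Join the last `blocks` dash-separated parts of name with nothing between them."""
    return "".join(name.split("-")[-blocks:])

# The value from the question's Values.yaml.
result = short_name("test-rrrr-blabla-file")
```

Here result is blablafile. A template would need the same two steps: split on the dash, then concatenate the last two parts.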

How can I parse info with Logstash?

I have an input:
May 16 12:45:47 host-dev1 kernel: [ 162.648366] wireguard: wg0: Sending keepalive packet to peer 2 (171.12.198.123:51079)
I want to parse the info as: TIMESTAMP "Sending keepalive packet to peer 2" IP:PORT
For the middle sentence I want to parse whatever is after wg0: until the first parenthesis of the port. This sentence can change to "Sending handshake initiation to peer 10" for example.
I've done
filter {
  grok {
    match => { "message" => "%{SYSLOGBASE:timestamp} %{GREEDYDATA:action} %{IP:peerip}:%{NUMBER:port}" }
  }
}
I need to change GREEDYDATA to something that matches only the text between the boundaries mentioned above.
Give this a try:
%{SYSLOGBASE:timestamp} \[ ?%{NUMBER:TIMESTAMP} ?\]( %{WORD}:)* %{GREEDYDATA:action} \(%{IP:peerip}:%{NUMBER:port}
Here's a breakdown:
%{SYSLOGBASE:timestamp} - the syslog prefix
\[ ?%{NUMBER:TIMESTAMP} ?\] - the application timestamp
( %{WORD}:)* - any words followed by a colon, like 'wg0:', zero or more times
%{GREEDYDATA:action} - any characters ('DATA' would also work)
\(%{IP:peerip}:%{NUMBER:port} - a literal '(' followed by IP and port
The important thing in making GREEDYDATA / DATA work here is that the boundaries (%{WORD}: and \() are properly defined.
You may need to vary the boundary definitions depending on what other log messages look like (specifically, whether you can rely on the colon and parenthesis at the boundaries).
It may be helpful to use a named capture group, depending on whether existing grok patterns cover your other message formats, such as : (?<notColons>[^:]*) \( to specify "a colon, then a space, then any number of non-colon characters, then a space, then an open parenthesis".
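The boundary idea can be tried outside Logstash with a plain regular expression. The Python sketch below is a hand-written approximation, not a translation of the real grok definitions (SYSLOGBASE, IP and NUMBER are more permissive), and the group name uptime is an assumption introduced here.

```python
import re

LINE = ("May 16 12:45:47 host-dev1 kernel: [ 162.648366] wireguard: wg0: "
        "Sending keepalive packet to peer 2 (171.12.198.123:51079)")

PATTERN = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "   # syslog timestamp
    r"(?P<host>\S+) (?P<program>\w+): "                     # host and program
    r"\[\s*(?P<uptime>\d+\.\d+)\s*\]"                       # bracketed kernel timestamp
    r"(?: \w+:)*"                                           # 'wireguard:', 'wg0:' prefixes
    r" (?P<action>.*?)"                                     # free text between the boundaries
    r" \((?P<peerip>\d{1,3}(?:\.\d{1,3}){3}):(?P<port>\d+)\)$"  # '(' IP ':' port ')'
)

match = PATTERN.match(LINE)
fields = match.groupdict() if match else {}
```

Because the text before the action ends at the last "word:" prefix and the text after it starts at " (", the action group can stay a lazy wildcard and still capture only the sentence in the middle.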

MQTT: does sport/# match sport/ (with a trailing slash)?

The MQTT spec explicitly states that
“sport/+” does not match “sport” but it does match “sport/”.
“sport/#” also matches the singular “sport”, since # includes the parent level.
But does “sport/#” also match “sport/”? The spec leaves this totally ambiguous.
As an aside, who else thinks allowing trailing slashes was a really bad design decision?
The # matches zero or more further elements, so subscriptions to sport/# will match sport/.
This is easily tested with mosquitto_sub/mosquitto_pub
Publish:
$ mosquitto_pub -t "sport/" -m "foo"
Subscription:
$ mosquitto_sub -v -t "sport/#"
sport/ foo
Yes, sport/# matches sport/. The confusing aspect of the spec is that unlike file and directory names, topic levels can be empty strings.
The spec says:
4.7.1.1 Topic level separator
The forward slash (‘/’ U+002F) is used to separate each level within a topic tree and provide a hierarchical structure to the Topic Names... Adjacent Topic level separators indicate a zero length topic level.
This means that sport/ is parsed as not one, but two topic levels -- sport and the empty string -- hence the # in sport/# matches the second empty string topic level. This is the same reason that sport/+ matches sport/ -- the + is matching the empty string level.
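The level-by-level reading above can be written down as a small matcher. This Python sketch is a simplified illustration, not a broker implementation: the function name topic_matches is made up, the filter is assumed to be well-formed (# only as the last level), and the special handling of topics beginning with $ is left out.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a topic name (simplified)."""
    # Splitting on '/' keeps empty strings, so "sport/" becomes ["sport", ""].
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # '#' matches the parent level and any number of child levels.
            return True
        if i >= len(topic_levels):
            return False  # the filter needs a level the topic does not have
        if level != "+" and level != topic_levels[i]:
            return False  # literal level mismatch ('+' accepts any single level)
    return len(filter_levels) == len(topic_levels)

hash_trailing = topic_matches("sport/#", "sport/")
hash_parent = topic_matches("sport/#", "sport")
plus_trailing = topic_matches("sport/+", "sport/")
plus_parent = topic_matches("sport/+", "sport")
```

With empty levels treated like any other level, the first three results are True and only sport/+ against sport is False, which agrees with the statements quoted from the spec.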

CircleCI config filters's tags's only regex: how to use flags

I am trying to use the tags filter for a job in CircleCI.
workflows:
  foo:
    jobs:
      - bar:
          filters:
            tags:
              only: /\d+/
The only key of tags is what I am interested in. Here's a sample regex: /\d+/
It's designed to match 1+ digits
Currently it doesn't match numbers with 2+ digits, because I need to add the global flag, /g
See this question for why
The correct regex would be /\d+/g
The CircleCI docs point to java.util.regex docs
Which didn't help me figure out if CircleCI regex would support flags :S
My question(s)
Does CircleCI regex support the use of flags?
How can I use flags in a regex?
Can you provide a link to an example?
Will my above regex of /\d+/g work?
I don't think CircleCI supports flags, and they don't seem necessary here.
Look at the example at https://circleci.com/docs/2.0/workflows/#using-regular-expressions-to-filter-tags-and-branches
You should anchor your pattern with ^ and $; otherwise the match can end early.
e.g. /\d+/ will match "123", but an unanchored pattern is satisfied as soon as it finds digits anywhere, whereas /^\d+$/ must cover the whole string because it has start/end markers.
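The sketch below uses Python's re module purely as a stand-in to show general regex behaviour; it says nothing about CircleCI's own engine, which according to the docs follows java.util.regex. It illustrates that + already covers several digits in one match without any flag, and that anchors decide whether surrounding text is tolerated. The sample tag v123-rc is made up.

```python
import re

# '+' is greedy: a single match already covers every consecutive digit.
first = re.search(r"\d+", "123").group()

# Without anchors, a search succeeds on any substring, so extra text is tolerated.
unanchored = re.search(r"\d+", "v123-rc") is not None

# With ^ and $, the whole string must consist of digits.
anchored_ok = re.search(r"^\d+$", "123") is not None
anchored_reject = re.search(r"^\d+$", "v123-rc") is None
```

A global flag in engines that have one controls how many matches are returned, not how long each match is, so it is not what makes multi-digit numbers match.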

Regex for extracting second level domain from FQDN?

I can't figure this out. I need to extract the second level domain from a FQDN. For example, all of these need to return "example.com":
example.com
foo.example.com
bar.foo.example.com
example.com:8080
foo.example.com:8080
bar.foo.example.com:8080
Here's what I have so far:
Dim host = Request.Headers("Host")
Dim pattern As String = "(?<hostname>(\w+)).(?<domainname>(\w+.\w+))"
Dim theMatch = Regex.Match(host, pattern)
ViewData("Message") = "Domain is: " + theMatch.Groups("domainname").ToString
It fails for example.com:8080 and bar.foo.example.com:8080. Any ideas?
I used this Regex successfully to match "example.com" from your list of test cases.
"(?<hostname>(\w+\.)*)(?<domainname>(\w+\.\w+))"
The dot character (".") needs to be escaped as "\.". An unescaped "." in a regex pattern matches any character.
Also, the regex pattern you provided requires one or more word characters followed by a dot before the domainname match (this part of the pattern: "(?<hostname>(\w+))."; I'm assuming the . character was supposed to be escaped). This fails to match the input "example.com" because there is no word character and dot before the domainname match.
I changed the pattern so that the hostname match would have zero or more matches of "1 or more word characters followed by a dot". This will match "foo" in "foo.example.com" and "foo.bar" in "foo.bar.example.com".
This assumes you've validated the contents of the fqdn elsewhere (e.g.: dashes allowed, no underscores or other non-alphanumeric characters), and is otherwise as liberal as possible.
'(?:(?<hostname>.+)\.)?(?<domainname>[^.]+\.[^.]+?)(?:\:(?<port>[^:]+))?$'
Matches the hostname component if present (including multiple additional levels):
bar.foo.example.com:8000 would match:
hostname: bar.foo (optional)
domainname: example.com
port: 8000 (optional)
I'm not familiar with VB.NET or ASP, but on the subject of regular expressions...
First off, you'll want to anchor your expression with ^ and $.
Next, \w may match different things depending on implementation, locale, etc., so you may want to be explicit. For example, \w may not match a hyphen, a valid character in domain names.
You don't seem to be taking into account an optional port number.
I'm sure there's a more RFC-accurate expression out there, but here's a start at something that should work for you.
^([a-z0-9\-]+\.)*([a-z0-9\-]+\.[a-z0-9\-]+)(:[0-9]+)?$
Broken down:
([a-z0-9\-]+\.)*: Start with zero or more hostnames...
([a-z0-9\-]+\.[a-z0-9\-]+): followed by two hostnames...
(:[0-9]+)?: followed by an optional port declaration.
Note that if you're dealing with a domain like example.ne.jp, you will only get ne.jp. Also, note that the above example expression should be matched case-insensitively.
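The anchored expression can be checked against the test cases from the question. The sketch below uses Python rather than VB.NET, and the helper name second_level_domain is made up for illustration; the pattern itself is the one given above, compiled case-insensitively.

```python
import re

# The anchored pattern from the answer above, compiled case-insensitively.
HOST_RE = re.compile(
    r"^([a-z0-9\-]+\.)*([a-z0-9\-]+\.[a-z0-9\-]+)(:[0-9]+)?$",
    re.IGNORECASE,
)

def second_level_domain(host):
    """Return the last two labels of host, or None if it does not match."""
    match = HOST_RE.match(host)
    return match.group(2) if match else None

hosts = [
    "example.com",
    "foo.example.com",
    "bar.foo.example.com",
    "example.com:8080",
    "foo.example.com:8080",
    "bar.foo.example.com:8080",
]
results = [second_level_domain(h) for h in hosts]
```

All six inputs yield example.com, while example.ne.jp yields ne.jp. Handling such suffixes correctly would need a list of public suffixes rather than a regex alone.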
