How do you display most failed http requests in prometheus/grafana table?

I am monitoring my Node.js application using prometheus/grafana/express-prom-bundle, which exposes a counter metric called http_request_duration_seconds_count. The metric has three labels of interest: status_code, path and method.
I would like to display a table in my Grafana dashboard that lists the most frequently failed path/method combinations (status_code="500") within the dashboard date range.
Is that possible, and if so, what Prometheus query and Grafana table settings do I need to produce this list?
Thank you in advance for your help.

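Here you want the topk aggregator. The following query returns the five method/path combinations with the highest per-second rate of 500 responses over the last 5 minutes:
topk(5,
  sum by (method, path) (
    rate(http_request_duration_seconds_count{status_code="500"}[5m])
  )
)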
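The query above ranks by the rate over the last 5 minutes. To rank over a different window, such as the dashboard time range the question asks about, the same expression can be built with increase() over that window. Below is a minimal Python sketch of that idea, assuming a Prometheus server at a placeholder address; the helper names build_topk_query, query_url and fetch_top_failures are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_topk_query(k: int, window: str, status_code: str = "500") -> str:
    """Build PromQL ranking (method, path) pairs by failed-request increase over `window`."""
    return (
        f"topk({k}, sum by (method, path) ("
        f'increase(http_request_duration_seconds_count{{status_code="{status_code}"}}[{window}])'
        "))"
    )


def query_url(base_url: str, promql: str) -> str:
    """URL-encode the expression for the /api/v1/query instant-query endpoint."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": promql})


def fetch_top_failures(base_url: str, k: int = 5, window: str = "1h"):
    """Run the query and return (method, path, value) tuples.

    Assumes the server at base_url is reachable; not exercised here.
    """
    with urllib.request.urlopen(query_url(base_url, build_topk_query(k, window))) as resp:
        body = json.load(resp)
    return [
        (s["metric"].get("method"), s["metric"].get("path"), float(s["value"][1]))
        for s in body["data"]["result"]
    ]


if __name__ == "__main__":
    # Prints the PromQL expression for a one-hour window.
    print(build_topk_query(5, "1h"))
```

In a Grafana panel the same expression can be typed directly with the built-in $__range variable as the window. Running it as an instant query with the table format should give one row per method/path pair, though the exact option names depend on the Grafana version.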

Related

How can I take all targets' metrics in one page at Prometheus

I have Prometheus set up on AWS EC2. I have 11 targets configured which have 2+ endpoints. I would like to set up an endpoint, query, etc. to gather all the metrics on one page. I am pretty stuck right now and could use some help, thanks :)
my prometheus targets file

Prometheus adds a unique instance label to each scraped target, according to these docs.
Prometheus can select time series that match a given series selector. For example, the series selector {instance="1.2.3.4:56"} selects the time series carrying that label, i.e. all the time series obtained from the target with the given instance label.
Prometheus provides the /api/v1/series endpoint, which returns time series matching the provided match[] series selector.
So, if you need to obtain all the time series from a particular target my-target, you can issue the following request to /api/v1/series (-G with --data-urlencode stops curl from interpreting the brackets and braces itself and URL-encodes the selector):
curl -G 'http://prometheus:9090/api/v1/series' --data-urlencode 'match[]={instance="my-target"}'
If you need to obtain metrics from my-target at a given timestamp, issue a query with the series selector to /api/v1/query:
curl -G 'http://prometheus:9090/api/v1/query' --data-urlencode 'query={instance="my-target"}' --data-urlencode 'time=needed-timestamp'
If you need to obtain all the raw samples from my-target over the time range (end_timestamp-d ... end_timestamp], use the following query:
curl -G 'http://prometheus:9090/api/v1/query' --data-urlencode 'query={instance="my-target"}[d]' --data-urlencode 'time=end_timestamp'
See these docs for details on how to read raw samples from Prometheus.
If you need to obtain all the metrics / series from all the targets, use the following series selector: {__name__!=""}
See also /api/v1/query_range - this endpoint is used by Grafana for building graphs from Prometheus data.
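When these requests are issued from code rather than from a shell, the selector has to be URL-encoded, because it contains braces, quotes and brackets. The sketch below shows one way to build the request URLs with the Python standard library. The base address is a placeholder and the function names are hypothetical; only the /api/v1/series and /api/v1/query paths and the match[], query and time parameters come from the answer above.

```python
import urllib.parse


def series_url(base_url: str, selector: str) -> str:
    """URL for /api/v1/series with the selector passed as match[]."""
    return f"{base_url}/api/v1/series?" + urllib.parse.urlencode({"match[]": selector})


def instant_query_url(base_url: str, selector: str, time=None, window=None) -> str:
    """URL for /api/v1/query.

    With `window` set (e.g. "5m") the selector becomes a range selector,
    which returns the raw samples in that window ending at `time`.
    """
    query = f"{selector}[{window}]" if window else selector
    params = {"query": query}
    if time is not None:
        params["time"] = time
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode(params)


if __name__ == "__main__":
    base = "http://prometheus:9090"  # placeholder address
    print(series_url(base, '{instance="my-target"}'))
    print(instant_query_url(base, '{__name__!=""}', window="5m"))
```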

Create sum of multiple queries with influxdb

I have four singlestat panels which show the used space on different hosts (every host also has different type_instances):
The query for one of these singlestats is the following:
Question: Is there a way to create a fifth singlestat panel which shows the sum of the other four singlestats? (The sum of all "storj_value" where type=shared)

The InfluxDB query language does not currently support aggregations across metrics (e.g., JOINs). It is possible with Kapacitor, but that requires writing code that writes new aggregated values for all the measurements to the DB, and those values then need to be queried separately.
The only option currently is to use an API that does support cross-metric functions, for example Graphite with an InfluxDB storage back-end, InfluxGraph.
The two APIs are quite different (Influx's is query-language based, Graphite's is not), and tagged InfluxDB data needs to be configured as a Graphite metric path via templates; see the configuration examples.
After that, Graphite functions that act across series can be used, in particular for the above question, sumSeries.
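To make clear what a cross-series sum computes, here is a small Python sketch that adds several series point by point. It is only an illustration of the idea, not Graphite's implementation. It assumes the series are already aligned on the same timestamps and that missing points are represented as None and skipped; both are assumptions of this example.

```python
def sum_series(series_list):
    """Sum aligned series point by point, skipping missing (None) values.

    A position where every series is missing stays None.
    """
    summed = []
    for points in zip(*series_list):
        present = [p for p in points if p is not None]
        summed.append(sum(present) if present else None)
    return summed


if __name__ == "__main__":
    # Hypothetical used-space readings from three hosts at three timestamps.
    host_a = [1, 2, None]
    host_b = [10, None, None]
    host_c = [100, 200, None]
    print(sum_series([host_a, host_b, host_c]))
```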

InfluxDB: query to calculate average of StatsD "executionTime" values

I'm sending metrics in StatsD format to Telegraf, which forwards them to InfluxDB 0.9.
I'm measuring execution times (of some event) from multiple hosts. The measurement is called "execTime", and the tag is "host". Once Telegraf gets these numbers, it calculates mean/upper/lower/count, and stores them in separate measurements.
Sample data looks like this in influxdb:
TIME  FIELD           HOST  VALUE
t1    execTime.count  VM1   3
t1    execTime.mean   VM1   15
t1    execTime.count  VM2   6
t1    execTime.mean   VM2   22
(So at time t1, there were 3 events on VM1, with mean execution time 15ms, and on VM2 there were 6 events, and the mean execution time was 22ms)
Now I want to calculate the mean of the operation execution time across both hosts at time t1, which is (3*15 + 6*22)/(3+6) ms.
But since the count and mean values are in two different series, I can't simply use "select mean(value) from execTime.mean".
Do I need to change my schema, or can I do this with the current setup?
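The calculation being asked for is a count-weighted mean of the per-host means. As a sketch of the arithmetic only (done in Python outside InfluxDB, with a hypothetical function name), it looks like this:

```python
def combined_mean(stats):
    """Count-weighted mean over (count, mean) pairs, one pair per host.

    Returns None when there are no events at all.
    """
    stats = list(stats)
    total_count = sum(count for count, _ in stats)
    if total_count == 0:
        return None
    return sum(count * mean for count, mean in stats) / total_count


if __name__ == "__main__":
    # Values from the sample data at time t1: (count, mean in ms) per host.
    per_host = {"VM1": (3, 15.0), "VM2": (6, 22.0)}
    print(combined_mean(per_host.values()))  # (3*15 + 6*22) / (3+6), about 19.67 ms
```

A plain average of the two means would give 18.5 ms instead, because it ignores that VM2 handled twice as many events.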

What I need is essentially a new series, which is a combination of the execTime.count and execTime.mean across all hosts. Instead of calculating this on-the-fly, the best approach seems to be to actually create the series along with the others.
So now I have two timer stats being generated on each host for each event:
1. one event with actual hostname for the 'host' tag
2. second event with one tag "host=all"
I can use the first set of series to check mean execution times per host. And the second series gives me the mean time for all hosts combined.

It is possible to do mathematical operations on fields from two different series, provided both series are members of the same measurement. I suspect your schema is non-optimized for your use case.

Grafana dynamically display new hosts added by collectd

How can I get Grafana to add graphs for newly added hosts dynamically? For example, I have a Grafana chart that displays the load average for existing hosts. When I add a new host, collectd sends the new host's metrics to InfluxDB, but each time I have to add one more graph in Grafana manually, which is not what I want. Is there a way to have Grafana plot the new host's metrics automatically, without changing the dashboard?

You have to use the Grafana HTTP API and update your dashboard by adding the new graph that you want. In practice this means that you have to:
1. use the API to fetch the JSON of the dashboard
2. modify this data by adding the JSON for the new panel
3. use the API again to update the dashboard
The hierarchy is simple: a dashboard has rows, and rows have panels. You will probably have to add some JSON inside panels. Look at your dashboard's JSON and all of this will make sense.
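The middle step, adding a panel to the dashboard JSON, can be sketched in Python as follows. This is a minimal illustration that assumes the rows/panels layout described above; the field names ("rows", "panels", "id", "title", "type") and the add_panel helper are assumptions of this example, so check them against your own dashboard's JSON. Fetching and saving the dashboard through the HTTP API are left out.

```python
import copy


def add_panel(dashboard: dict, row_index: int, panel: dict) -> dict:
    """Return a copy of `dashboard` with `panel` appended to the given row.

    The new panel gets an id one higher than any existing panel id,
    so that panel ids stay unique within the dashboard.
    """
    updated = copy.deepcopy(dashboard)
    existing_ids = [
        p.get("id", 0)
        for row in updated.get("rows", [])
        for p in row.get("panels", [])
    ]
    new_panel = dict(panel, id=max(existing_ids, default=0) + 1)
    updated["rows"][row_index].setdefault("panels", []).append(new_panel)
    return updated


if __name__ == "__main__":
    # Hypothetical dashboard with one row and one existing load panel.
    dashboard = {"title": "Load", "rows": [{"panels": [{"id": 1, "title": "host-a load"}]}]}
    updated = add_panel(dashboard, 0, {"title": "host-b load", "type": "graph"})
    print([p["title"] for p in updated["rows"][0]["panels"]])
```

Working on a copy keeps the fetched JSON intact, which makes it easy to compare the two versions before sending the update.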

You can use regexp patterns in InfluxDB 0.8 (see also the 0.9 equivalent docs) to match all your newly added hosts. InfluxDB regexps use the Golang syntax.
For example, to match all series starting with stats.cpuNUMBER:
series: /^stats\.cpu\d+/
select: avg(load)
However this way you won't get one new plot for each newly added host, but a line for every host in the same plot.

You have to use a regex in the FROM clause of your query.
SELECT mean(value) FROM /logstash.*.requests.count/ WHERE $timeFilter
GROUP BY time($interval)
The query above automatically plots each series that matches the regex, for all hosts, without any change to the Grafana dashboard. For example, it matches:
logstash.ABC1.requests.count
logstash.ABC2.requests.count
logstash.ABC3.requests.count
When host ABC4 is added and ships its metrics correctly, its series is plotted automatically.
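To see which series names a pattern like this picks up, it can be tried against a list of names. The sketch below uses Python's re module as a stand-in for InfluxDB's Go-syntax regexps, which behave the same way for this simple pattern. The extra series names and the stricter, escaped pattern are assumptions added for illustration.

```python
import re

# Pattern as written in the query: the unescaped dots match any character.
LOOSE = re.compile(r"logstash.*.requests.count")
# Stricter variant (an assumption of this example) with literal dots.
STRICT = re.compile(r"logstash\..*\.requests\.count")


def matching_series(names, pattern):
    """Return the series names that the pattern matches anywhere in the name."""
    return [name for name in names if pattern.search(name)]


if __name__ == "__main__":
    names = [
        "logstash.ABC1.requests.count",
        "logstash.ABC2.requests.count",
        "logstash.ABC3.requests.count",
        "logstash.ABC4.requests.count",  # newly added host
        "logstash.ABC1.errors.count",    # hypothetical unrelated series
        "logstash-requests-count",       # matched only by the loose pattern
    ]
    print(matching_series(names, LOOSE))
    print(matching_series(names, STRICT))
```

The new host's series is matched without any change to the pattern, which is why the dashboard needs no edit. Escaping the dots avoids accidental matches on similarly named series.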

Is there an SQL-like feature in Google Apps Script?

Is it possible to connect to Google Spreadsheet in GAS using SQL-type queries? If so, any working samples?
Thanks.

Depending on what you want to do, there are two options.
1) You can have a SQL Database and connect to it using the Jdbc Service ( https://developers.google.com/apps-script/service_jdbc ).
2) The other option, which I've used once earlier, is by making use of the QUERY() function. You can set the formula on a cell to the SQL like query and then read the subsequent cells.
( https://support.google.com/docs/bin/answer.py?hl=en&answer=1388882 ).
Update after Google I/O 2012:
As you might have already noticed, Google possibly heard you and introduced ScriptDB, which is better than the two options mentioned above.

You can also use the "Structured Query" parameters available in the Spreadsheets API to make calls like &sq=filter:7, where filter is the name of your column and you want the rows in which that column's value is 7. See the List Feed section of the Spreadsheet API.
