Histogram in % in Kibana - histogram

I'd like to create a percentage histogram visualization in Kibana.
Each document has a 'DurationInSeconds' field, and I can create a histogram with a fixed bucket size (e.g. 10, giving [0,9], [10,19], [20,29], ...) that shows the count per bucket (the number of documents whose duration falls within that bucket).
However, I'd like the Y-axis of this histogram to show percentages instead of counts.
I think it might be possible using JSON Input, but I don't know how.
Is it possible to create a percentage histogram?
Thanks
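For reference, the normalization being asked for is just each bucket's count divided by the total number of documents. The following is a minimal Python sketch of that arithmetic only; it is not Kibana configuration, and the function name and sample durations are made up for illustration.

```python
from collections import Counter

def percentage_histogram(durations, bucket_size=10):
    """Return {bucket_start: percent_of_documents} for fixed-width buckets."""
    if not durations:
        return {}
    # Map each duration to the lower edge of its bucket, e.g. 17 -> 10.
    counts = Counter(int(d // bucket_size) * bucket_size for d in durations)
    total = len(durations)
    return {start: 100.0 * n / total for start, n in sorted(counts.items())}

bucket_size = 10
durations = [3, 7, 12, 15, 18, 25, 41, 44]  # made-up sample values
for start, pct in percentage_histogram(durations, bucket_size).items():
    print(f"[{start},{start + bucket_size - 1}]: {pct:.1f}%")
```

The percentages always sum to 100 across buckets, which is the property the Y-axis would need to have.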

Related

Using a Grafana Histogram with Prometheus Buckets

I have a Prometheus metric called latency, with a bunch of buckets.
I'm using an increase query to determine all the events that happened in the last 15 minutes (in all the buckets).
This query works well; switching to table view shows numbers that make sense: most latencies are below 300ms, with some above that value.
However, when I use a Grafana Histogram, it seems like the x and y axes are interchanged.
Now I could use a different diagram style, like a Bar Gauge. I tried that, but it doesn't work well: I have too many buckets, so the labels become totally illegible. Also it forces me to display all the buckets that my application collects, but it would be nice if that wasn't set in stone, and I could aggregate buckets in Grafana. It also doesn't work well once I change the bucket sizes to exponential sizes.
Any hint how to either get the Histogram working properly (with x axis: bucket (in s), y axis: count), or another visualization that would be appropriate here? My preferred outcome would be something like the plot of a function.
Answer
The Grafana panel type "Bar gauge" with format option "Heatmap" and interval $__range seems to be the best option if you have a small number of buckets. There is no proper solution for a large number of buckets yet.
The documentation states that the format option "Heatmap" should work with panel type "Heatmap" (and it does), see Introduction to histograms and heatmaps with Pre-bucketed data. The Heatmap panel has an option to produce a histogram on mouseover, so you might want to use this.
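One reason raw bucket values do not read as a histogram is that Prometheus `le` buckets are cumulative: each bucket counts every observation at or below its upper bound. The Python sketch below shows the subtraction needed to get per-bucket counts. It assumes non-negative latencies (so the first bucket starts at 0), and the sample numbers are hypothetical rather than taken from the question.

```python
import math

def per_bucket_counts(cumulative):
    """Convert cumulative {le: count} pairs into (lower, upper, count) rows."""
    rows = []
    prev_le, prev_count = 0.0, 0.0  # assumption: observations are >= 0
    for le, count in sorted(cumulative.items()):
        rows.append((prev_le, le, count - prev_count))
        prev_le, prev_count = le, count
    return rows

# Hypothetical increase() results per `le` label, in seconds.
cumulative = {0.1: 40.0, 0.3: 90.0, 1.0: 98.0, math.inf: 100.0}
for lower, upper, count in per_bucket_counts(cumulative):
    print(f"({lower}, {upper}]: {count:g}")
```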
About panel type Histogram
The Grafana panel type "Histogram" produces a value distribution, and the value of each bucket is a count. This panel type does not work well with Prometheus histograms, even if you switch from format option "Time series" to "Heatmap". I don't know if this is due to the beta status of this panel type in the Grafana version I am currently using (which is 9.2.4). There are also open bug reports claiming that the maximum value of the x axis is not computed correctly, see issue 32006 and issue 33073.
The larger the number of buckets, the better the estimation of histogram_quantile(). You could let the Histogram panel calculate a distribution of latencies by using this function. Let's start with the following query:
histogram_quantile(1, sum by (le) (rate(latency_bucket{...}[$__rate_interval])))
You could now visualize the query results with the Histogram panel and set the bucket size to a very small number such as 0.1. The resulting histogram ignores a significant number of samples, as it is only related to the maximum value of all data points within $__rate_interval.
The values on the y-axis depend on the interval. The smaller the interval, the higher the values, simply due to more data points in the query result. This is a big downside: you lose the exact number of data points which you originally had in the buckets.
I cannot really recommend this, but it might be worth a try.
Additional notes
Grafana has transform functions like "Create heatmap" and "Histogram", but these are not useful for Prometheus histogram data. Note that "Create heatmap" allows you to set one dimension to logarithmic.
There are two interesting design documents which show that the developers of Prometheus are aware of problems with the current implementation of histograms and are working on some promising features:
Sparse high-resolution histograms for Prometheus
Prometheus Sparse Histograms and PromQL
See DESIGN DOCUMENTS.
There is also this feature request: Prometheus histogram as stacked chart over time #11464.
There is an excellent overview about histograms: How to visualize Prometheus histograms in Grafana.

what's dataset type in tensorflow object-detection api?

I am trying to do my own object detection using my own dataset. I started my first machine learning program from the Google TensorFlow Object Detection API; the link is here: eager_few_shot_od_training_tf2_colab.ipynb
In the Colab tutorial, the author uses JavaScript to label the images, and the result looks like this:
import numpy as np

gt_boxes = [
    np.array([[0.436, 0.591, 0.629, 0.712]], dtype=np.float32),
    np.array([[0.539, 0.583, 0.73, 0.71]], dtype=np.float32),
    np.array([[0.464, 0.414, 0.626, 0.548]], dtype=np.float32),
    np.array([[0.313, 0.308, 0.648, 0.526]], dtype=np.float32),
    np.array([[0.256, 0.444, 0.484, 0.629]], dtype=np.float32)
]
When I run my own program, I use labelImg in place of the JavaScript tool, but the resulting dataset is not compatible.
Now I have two questions. First, what is the dataset type in the Colab tutorial: COCO, YOLO, VOC, or something else? Second, how do I convert between labelImg data and the Colab tutorial data? My goal is to label data with labelImg and then substitute it into the Colab tutorial.
The "data type" is just ratio values based on the height and width of the image. So the coordinates are ratios that say where the bounding box starts and ends. Since each image is going to be preprocessed, that is, its dimensions are changed when fed into the model (batch,height,width,channel), the bounding box coordinates must be expressed as ratios, because the image might change dimensions from its original size.
For example, the model expects images to be 640x640. So if you provide an image of 800x600, it has to be resized. Now if the model gave back the coordinates [100,100,150,150] for a 640x640 image, clearly that would not be the same box on the 800x600 image.
However, to get this data format you should use PascalVOC when using labelImg.
The typical way to do this is to create TFRecord files and decode them in your training script in order to create datasets. However, you are free to choose whatever method you like to build a TensorFlow dataset in order to train your model.
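To go from labelImg's Pascal VOC output to the ratio format described above, each pixel coordinate is divided by the image width or height. The sketch below does this with Python's standard library. The [ymin, xmin, ymax, xmax] ordering is an assumption based on the usual TensorFlow Object Detection API convention, so check it against the notebook; the function name and the sample annotation are hypothetical.

```python
import xml.etree.ElementTree as ET

def voc_to_normalized_boxes(xml_text):
    """Return [[ymin, xmin, ymax, xmax], ...] as fractions of the image size."""
    root = ET.fromstring(xml_text)
    width = float(root.findtext("size/width"))
    height = float(root.findtext("size/height"))
    boxes = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # Assumed output order: [ymin, xmin, ymax, xmax].
        boxes.append([ymin / height, xmin / width, ymax / height, xmax / width])
    return boxes

# Made-up annotation in the Pascal VOC layout that labelImg writes.
SAMPLE = """<annotation>
  <size><width>800</width><height>600</height><depth>3</depth></size>
  <object><name>duck</name>
    <bndbox><xmin>200</xmin><ymin>150</ymin><xmax>600</xmax><ymax>450</ymax></bndbox>
  </object>
</annotation>"""
print(voc_to_normalized_boxes(SAMPLE))
```

Each per-image result could then be wrapped as np.array(boxes, dtype=np.float32) to match the shape of the gt_boxes entries in the question.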
Hope this answered your questions.

MeshLab: Can I export quality histogram to external file?

Using MeshLab, I get a quality histogram of distances after applying Hausdorff distance between two meshes. I want to export the histogram to an external file so I can analyze it in an external tool like Python or MATLAB.
Can I do it? How?
Thanks
Niv
You can do it easily if you use the filter "Distance from reference mesh" instead of "Hausdorff Distance". That filter will leave the distances stored as a quality value on each vertex of the Measured mesh.
After that, you can save the mesh to plot the distances outside of meshlab. The recommended file format is PLY, and ensure "Quality" checkbox is checked and "Binary encoding" is not checked.
The output file has an 11-line header followed by one line per vertex containing 4 numbers. The first 3 numbers are the XYZ coordinates, and the last value is the quality (which is the distance you want to plot):
0 -2 0 1.902114
0 2 0 1.902113
1 -2 0 1.701302
0.9848077 -2 0.1736482 1.714225
0.9396926 -2 0.3420202 1.722303
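A minimal Python sketch for reading those quality values back is shown below. It assumes an ASCII PLY file in which the vertex element comes first after the header and quality is the last vertex property, as in the rows above; the function name and the shortened sample header are illustrative, not MeshLab's exact output.

```python
def read_vertex_quality(lines):
    """Return the last column of each vertex row in an ASCII PLY file."""
    lines = iter(lines)
    vertex_count = 0
    # Scan the header for the vertex count, then stop at end_header.
    for line in lines:
        parts = line.split()
        if parts[:2] == ["element", "vertex"]:
            vertex_count = int(parts[2])
        if parts == ["end_header"]:
            break
    # Vertex rows follow immediately; quality is assumed to be the last field.
    return [float(next(lines).split()[-1]) for _ in range(vertex_count)]

SAMPLE = """ply
format ascii 1.0
element vertex 2
property float x
property float y
property float z
property float quality
element face 0
property list uchar int vertex_indices
end_header
0 -2 0 1.902114
0 2 0 1.902113
"""
print(read_vertex_quality(SAMPLE.splitlines()))
```

An open file object can be passed in place of the list of lines, and the returned list can be handed to any histogram routine.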
This method works not only with distances, but also with any value that MeshLab can store as quality: curvature, distance to boundary, etc.

Prometheus latency graph in histogram and calculate percentile

I need to plot a latency graph from a Prometheus histogram time series, but I have been unable to display a histogram in Grafana.
What I expect is to be able to show:
Y-axis is latency, x-axis is timeseries.
Each line representing the p50,p75,p90,p100 - aggregated for a given time window.
A sample metric would be the request time of an nginx.
Suppose I have a histogram like this:
nginx_request_time_bucket{le="1"} 1
nginx_request_time_bucket{le="10"} 2
nginx_request_time_bucket{le="60"} 2
nginx_request_time_bucket{le="+Inf"} 5
An example graph of what I am looking for is at this link: https://www.instana.com/blog/how-to-measure-latency-properly-in-7-minutes/
I tried to picture the histogram with a heatmap using the query below, but it doesn't give me what I'm looking for; I want something similar to the linked graph:
histogram_quantile(0.75, sum(rate(nginx_request_time_bucket[5m])) by (le))
Any help here is highly appreciated!
You need to set up a separate query on a Grafana graph for each percentile you need. For example, if you need p50, p75, p90 and p100 latencies over the last 5 minutes, then the following four separate queries should be set up in Grafana:
histogram_quantile(0.50, sum(rate(nginx_request_time_bucket[5m])) by (le))
histogram_quantile(0.75, sum(rate(nginx_request_time_bucket[5m])) by (le))
histogram_quantile(0.90, sum(rate(nginx_request_time_bucket[5m])) by (le))
histogram_quantile(1.00, sum(rate(nginx_request_time_bucket[5m])) by (le))
P.S. It is possible to compact these queries into a single one when using the histogram_quantiles function from a Prometheus-compatible query engine such as MetricsQL:
histogram_quantiles(
"percentile",
0.50, 0.75, 0.90, 1.00,
sum(rate(nginx_request_time_bucket[5m])) by (le)
)
Note that the accuracy for percentiles calculated over Prometheus histograms highly depends on the chosen buckets. It may be hard to choose the best set of buckets for the given set of percentiles, so it may be better to use histograms with automatically generated buckets. See, for example, VictoriaMetrics histograms.
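To see why the bucket choice matters, here is a simplified Python sketch of the linear interpolation that histogram_quantile() is documented to perform, applied to the sample buckets from the question. It is an illustration under assumptions (0 < q <= 1, a non-empty histogram with a +Inf bucket, lowest bucket starting at 0), not the Prometheus source, and real queries operate on rate() results rather than raw counters.

```python
import math

def estimate_quantile(q, buckets):
    """Estimate a quantile from cumulative (upper_bound, count) buckets."""
    buckets = sorted(buckets)
    rank = q * buckets[-1][1]  # position of the quantile among all observations
    # Find the first bucket whose cumulative count reaches the rank.
    for i, (upper, count) in enumerate(buckets):
        if count >= rank:
            break
    if math.isinf(upper):
        # Quantile falls in the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]
    lower = buckets[i - 1][0] if i > 0 else 0.0
    below = buckets[i - 1][1] if i > 0 else 0.0
    # Assume observations are spread evenly inside the bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)

buckets = [(1, 1), (10, 2), (60, 2), (math.inf, 5)]  # sample from the question
for q in (0.1, 0.3, 0.5, 0.9):
    print(q, estimate_quantile(q, buckets))
```

With these buckets, three of the five observations sit in the +Inf bucket, so every quantile from p50 upwards is reported as 60, the highest finite bound. Adding buckets above 60 is the only way to resolve those percentiles.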

Plotting values onto a histogram in rails

I realize there are several histogram gems, but my question is a bit unique. I don't need a graph or image of any kind. My rails app has an algorithm that gives each user a score between 0 and 1. e.g. billybob's raw_score might be .00901 and frankiejoe's raw_score might be .00071.
Without going into why I want to do this, I'd like to plot these values on a histogram, then map the mean raw_score to 50%, the mean plus one standard deviation to about 65% (the mean minus one standard deviation to 35%), the mean plus 2 x standard deviation to 80%, etc. So 15 percentile points for each standard deviation unit.
I don't need the actual histogram chart/image; I just want the corresponding histogram values after the scores are loaded onto a histogram. I am essentially converting the numbers into a more aesthetically pleasing score, e.g. billybob's histogram_score might now be .987 and frankiejoe's might be .471. For now, it's only dozens of users or scores, but I'd like the ability to handle thousands of users/scores.
I'd like to store the converted value into my database. The numbers I have now are raw_score:decimal and I'll store them as histogram_score:decimal.
How might I do this in my rails app?
Thank you!
Figured this out. So the descriptive_statistics gem is going to accomplish this for me.
require 'descriptive_statistics'
data = [0.15, 0.25, 0.10, 0.05, 0.35, 0.10]
data.map {|score| data.percentile_rank(score)}
=> [66.66666666666666, 83.33333333333334, 50.0, 16.666666666666664, 100.0, 50.0]
In my case, I'm just looping through each user's raw_score and storing it as the percentile_score. Works great!
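The output above is consistent with percentile_rank being the percentage of values less than or equal to the given score. As a rough cross-check, that calculation can be sketched in a few lines of Python; this is an assumption about the gem's behavior inferred from the output shown, not its actual implementation.

```python
def percentile_rank(data, value):
    """Percentage of entries in data that are less than or equal to value."""
    return 100.0 * sum(1 for x in data if x <= value) / len(data)

data = [0.15, 0.25, 0.10, 0.05, 0.35, 0.10]
ranks = [percentile_rank(data, score) for score in data]
print([round(r, 2) for r in ranks])
```

Note that this is a rank-based score, so it spreads users evenly between 0 and 100 regardless of how far apart their raw scores are, which differs from the mean and standard deviation mapping described in the question.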
