How to calculate the difference between two data points in Grafana

This graph gives me the total bytes the website has accepted. I want to find the instantaneous value at a given time, and for that I have to subtract two data points. Can anyone help?
sum(
  windows_iis_received_bytes_total{instance=~"ServerName", site="WebSiteName"}
  + windows_iis_sent_bytes_total{instance=~"ServerName", site="WebSiteName"}
)
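Since these windows_iis_* metrics are monotonic counters, the usual way to get the "instant" value is not to subtract two points by hand but to let PromQL do it with increase() (counter delta over a window, reset-aware) or rate() (per-second average). A minimal sketch reusing the series above:

# total bytes transferred during each 5-minute window
sum(
  increase(windows_iis_received_bytes_total{instance=~"ServerName", site="WebSiteName"}[5m])
  + increase(windows_iis_sent_bytes_total{instance=~"ServerName", site="WebSiteName"}[5m])
)

Swapping increase() for rate() gives bytes per second instead of bytes per interval; the [5m] window is an assumption and should be tuned to your scrape interval and desired resolution.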

Related

Tableau question: How to link a reference table to a dynamic calculated field value (which is an integer)? I'm assigning p-values

Since Tableau does not have a function for p-values (correct me if I'm wrong here), I created a spreadsheet with all possible sample sizes under two different alphas/significance levels, and I need to connect the appropriate p-value to a calculated field from the main database source (an aggregate count of people). I assumed I could easily match numbers with a condition to bring back the p-value in a calculated field, yet I'm hitting a brick wall. The biggest issue seems to be that the field I want to join the p-value reference table to is an aggregated integer. Also, I do not have any extensions, and my end result needs to be an integer, not a graph.
Any secret tricks here?
It seems I cannot blend the reference table in, nor join it to an aggregate?
Thanks!
I found a workaround for calculating the critical value for a two-tailed t-test in Tableau. However, I didn't figure out how to join on an aggregated calculated field. The workaround: I used a conditional statement, copying and pasting about 100 critical values based on (sample size - 2), a.k.a. degrees of freedom, into a calculated field. To save time, use Excel to pull the conditions down to 120. Worked like a charm!
Here is the conditional logic for alpha = .2 (80%) in a two-tailed t-test (replace the ## line with about 117 more rows):
IF [degrees of freedom] = 1 THEN 3.08
ELSEIF [degrees of freedom] = 2 THEN 1.89
ELSEIF [degrees of freedom] = 3 THEN 1.64
##ELSEIF [...calculate down to 120] = ... then ...
ELSEIF [degrees of freedom] >= 121 THEN 1.28
END

How to sum the amount for each group of account numbers using the BI Publisher add-in for Word

I have a summary pay application with multiple cost codes, and I need to sum up the amount charged for each account code.
see example:
Cost code1 123-458-111
Line 1 2000
Line 2 1000
Cost code2 222-123-222
Line 3 3000
Cost code3 123-458-121
Line 4 1500
Line 5 2500
I need to print the result as follow
Cost code1 123-458-111 3000
Cost code2 222-123-222 3000
Cost code3 123-458-121 4000
I have tried this code, but it does not give me the correct result; it only looks at the first line.
*for-each BI_ITEM_1 BITEMID_1 <?sum (UGENTTLCMPTODATECA_1[.!=''] )?> end*
BI_Item_1 = account number name.
BI_ItemID_1 = account actual number.
UGENTTLCMPTODATECA_1 = total completed to date (the amount charged to the account number )
The for-each runs over the data set, and it is grouped by BI_Item_1.
Any suggestion? Your help is much appreciated.
I understand pictures make it easier to visualize, but I'm not allowed to upload pictures to the post yet, which is weird.
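For per-group totals, BI Publisher RTF templates support XSL grouping with for-each-group and current-group(). A minimal sketch using the field names above (ROW is an assumed name for the repeating row element in your XML data; replace it with the real one):

<?for-each-group:ROW;./BI_ITEM_1?>  <!-- ROW: assumed repeating element name -->
<?BI_ITEM_1?> <?BITEMID_1?> <?sum(current-group()/UGENTTLCMPTODATECA_1)?>
<?end for-each-group?>

Inside the group, sum(current-group()/...) adds up the amounts of all rows sharing the same BI_ITEM_1 instead of reading only the first row. Element names are case-sensitive, so they must match your XML exactly.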

Google Sheets Query Group By / First-N-Per-Group

I'm trying to find a simple solution for first-n-per-group.
I have a table of data: the first column holds dates and the rest hold data. I want to group by date, since multiple entries per date are allowed, and for the second column I want the FIRST record, not an aggregate.
The closest aggregate function I could use is MIN(), but that returns the lowest value, not the first.
A B
01/01/2018 10
01/01/2018 15
02/01/2018 10
02/01/2018 2
02/01/2018 100
02/01/2018 20
03/01/2018 5
03/01/2018 2
Desired output
A B
01/01/2018 10
02/01/2018 10
03/01/2018 5
Current results using MIN() (undesired)
A B
01/01/2018 10
02/01/2018 2
03/01/2018 2
It's a shame there isn't a FIRST() aggregate function in Google Sheets, which would make this a lot easier.
I saw a couple of examples using row numbers and ARRAYFORMULA with QUERY, but those don't seem to work for me. There are about 5000 rows of data, so I'm trying to keep this as efficient as possible and avoid recalculating the entire sheet on every change, with each recalculation taking a few seconds.
Currently I have this, which appends a third column with the Row Number:
=query({A1:B, arrayformula(row(A1:B))}, "select min(Col1),min(Col2) group by Col1")
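For what it's worth, one way to emulate a FIRST() aggregate without sorting is an array VLOOKUP: with exact matching (FALSE), VLOOKUP returns the value from the first row that matches, so looking up each unique date yields the first B value for that date. A sketch, assuming the data starts in row 2 under headers:

=ARRAYFORMULA({UNIQUE(FILTER(A2:A, A2:A<>"")),
               VLOOKUP(UNIQUE(FILTER(A2:A, A2:A<>"")), A2:B, 2, FALSE)})

This spills a two-column result (date, first value) and works whether column B holds numbers or text.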
Thanks
EDIT 1
A suggested solution was =SORTN(A:B,2^99,2,1,1), which is clean and simple. However, it requires a large range of free space to display the returned dataset; imagine 3000+ rows.
I was hoping for a QUERY() -based solution, as I wanted to do further operations with the results. Specifically, count the occurrences of distinct values.
For example: I wanted a returned dataset of
A B
01/01/2018 10
02/01/2018 10
03/01/2018 5
Yet I want to count the occurrences of those values (ignoring the dates). For example:
B C
10 2
5 1
Perhaps I've confused the situation by using numbers? The "data" in ColB is actually TEXT (short 3-letter codes); I used numbers only to show that I couldn't use the MIN() function, since it returns the numerically lowest value.
So in brief:
Go through all rows (3000+ rows) and group by the FIRST row of a particular date
return the FIRST value of that row
COUNT() all unique occurrences of those FIRST values, disregarding the dates: just a list of the unique values and their counts (again, counting only the first one from any particular day)
If your data is sorted as in the sample, you can easily remove duplicates with SORTN():
=SORTN(A:B, 2^99, 2, 1, 1)
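To also get the counts asked for in EDIT 1, the SORTN() output can be fed straight into QUERY(), keeping everything in one formula. A sketch (Col2 holds the per-date first values, whether numbers or text):

=QUERY(SORTN(A:B, 2^99, 2, 1, 1),
       "select Col2, count(Col2) where Col2 is not null group by Col2 label count(Col2) ''")

For the sample data this returns 5 -> 1 and 10 -> 2 (QUERY sorts ascending by the grouped column).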

InfluxDB: Starting cumulative_sum() from zero / aggregate grouping required for cumulative_sum and non_negative_difference

Using InfluxDB, I'm trying to produce an output that shows cumulative rainfall for a time period, starting from zero.
The rainfall sensor outputs a cumulative rainfall amount, but it resets to zero on power failure, restart, etc.
My first query component uses non_negative_difference() to show the increments.
SELECT
  non_negative_difference(rain) AS nnd
FROM
  weather
WHERE
  $time_query
which yields an increment per raw data point, for example:
2018-06-01T14:21:00.926Z 0
2018-06-01T14:22:02.959Z 0.30000000000000426
2018-06-01T14:23:04.992Z 0.3999999999999986
2018-06-01T14:24:07.024Z 0.10000000000000142
2018-06-01T14:25:09.059Z 0.19999999999999574
2018-06-01T14:26:11.094Z 0
2018-06-01T14:27:13.127Z 0.10000000000000142
2018-06-01T14:28:15.158Z 0.20000000000000284
2018-06-01T14:29:20.027Z 0.09999999999999432
2018-06-01T14:30:22.476Z 0.10000000000000142
2018-06-01T14:30:53.918Z 0.6000000000000014
2018-06-01T14:31:55.968Z 0.5
2018-06-01T14:32:58.007Z 0.5
2018-06-01T14:34:00.046Z 0.20000000000000284
2018-06-01T14:35:02.075Z 0.3999999999999986
2018-06-01T14:36:04.102Z 0.3999999999999986
2018-06-01T14:37:06.136Z 0.20000000000000284
2018-06-01T14:38:08.201Z 0
So far so good.
I'm now trying to stitch these readings back into a cumulative total that starts from zero for the intended period.
I can use cumulative_sum() for this, for example:
SELECT
cumulative_sum(nnd)
FROM
(SELECT
non_negative_difference(rain) as nnd
FROM
weather
WHERE
$time_query )
which yields:
2018-06-01T14:21:00.926Z 0
2018-06-01T14:22:02.959Z 0.30000000000000426
2018-06-01T14:23:04.992Z 0.7000000000000028
2018-06-01T14:24:07.024Z 0.8000000000000043
2018-06-01T14:25:09.059Z 1
2018-06-01T14:26:11.094Z 1
2018-06-01T14:27:13.127Z 1.1000000000000014
2018-06-01T14:28:15.158Z 1.3000000000000043
2018-06-01T14:29:20.027Z 1.3999999999999986
2018-06-01T14:30:22.476Z 1.5
2018-06-01T14:30:53.918Z 2.1000000000000014
2018-06-01T14:31:55.968Z 2.6000000000000014
2018-06-01T14:32:58.007Z 3.1000000000000014
2018-06-01T14:34:00.046Z 3.3000000000000043
2018-06-01T14:35:02.075Z 3.700000000000003
2018-06-01T14:36:04.102Z 4.100000000000001
2018-06-01T14:37:06.136Z 4.300000000000004
2018-06-01T14:38:08.201Z 4.300000000000004
Looking good!
Now I'd like to group it up into more distinct time buckets, for nice graphing.
Let's try....
SELECT
cumulative_sum(max(nnd))
FROM (SELECT
non_negative_difference(rain) as nnd
FROM
weather
WHERE
$time_query)
GROUP BY
time(5m)
and I get an error: ERR: aggregate function required inside the call to non_negative_difference
But I cannot find a reasonable way of adding aggregates and groupings to non_negative_difference() without affecting the accuracy of the differencing itself.
The only thing I've been able to do is a dummy SUM() aggregate over time groups smaller than the sensor period, but this isn't robust enough for my liking (and I'm still not sure it is 100% correct).
Is it correct that I must have both queries as aggregate queries?
I was trying to do this very thing for my weather station. Instead of having the weather station calculate the cumulative value, I wanted Grafana to do it. The solution that worked for me is the advanced syntax Yuri Lachin mentions in his comments.
With InfluxDB you can use CUMULATIVE_SUM(), but the basic syntax doesn't allow you to group by time (only by tag). The advanced syntax, however, allows you to have a time series by nesting an aggregate function like MEAN() or SUM().
Here's the function I am using in Grafana to get a cumulative rainfall total for a selected time period:
SELECT CUMULATIVE_SUM(MEAN("rainfall")) FROM "weather" WHERE $timeFilter GROUP BY time(1h) fill(0)
The GROUP BY is, of course, flexible. I was interested in hourly rainfall so I grouped by 1h. You can group by the time period you find most interesting.
Using this query, the rainfall will start from zero for the period you select in Grafana. In the Seattle area we had measurable rain (I know, shocker) on 8/6/2020 and 8/8/2020. If I set my date range to include both dates, the graph shows just under 0.2 mm total rainfall. If I switch my graph to 8/8 and 8/9, the total is just under 1 mm.
Note: I was also interested in seeing the individual bucket tips so included those as bars on the second Y-axis.
For more detail see: https://docs.influxdata.com/influxdb/v1.8/query_language/functions/#advanced-syntax-7
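Applied back to the original nested query, the same advanced-syntax rule means wrapping nnd in an aggregate such as SUM() and repeating the time filter on the outer query so GROUP BY time() has bounds. A sketch along those lines (untested against the asker's schema):

SELECT CUMULATIVE_SUM(SUM(nnd))
FROM (
  SELECT NON_NEGATIVE_DIFFERENCE(rain) AS nnd
  FROM weather
  WHERE $time_query
)
-- the outer WHERE repeats the time range so GROUP BY time() has bounds
WHERE $time_query
GROUP BY time(5m) fill(0)

SUM() is the natural inner aggregate here because the non-negative differences are per-interval increments; MEAN() would under-count buckets that contain several readings.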

Duplicates varying slightly in string values with additional temporal aspect

I use emergency tweets from the Netherlands for a project. There is sometimes more than one tweet regarding a single event, varying slightly in timestamp and in the string of the tweet itself. I want to delete those "duplicates".
So, in my database I have rows which are quite alike but not exactly the same, like:
"2014-01-11 10:01:17";"HV 1 METINGEN (+Inc,net: 1+) (KLEIN OGS) (slachtoffers: ) , Van Ostadestraat 332 AMSTERDAM [ ] "
"2014-01-11 09:59:06";"HV 1 METINGEN (+Inc,net: 1+) (KLEIN OGS) (slachtoffers:1) , Van Ostadestraat 332 AMSTERDAM ] "
The problem is that I have to take the temporal aspect into account and can't just rely on the string, since the same text can occur multiple times.
Ideal would be an approach where I delete all rows within a temporal buffer of 10 minutes after the first tweet whenever the text similarity is over a threshold of 0.75.
For the string comparison I tried similarity(text, text); see
http://www.postgresql.org/docs/9.1/static/pgtrgm.html
For the time aggregation I used:
(extract(minute FROM timestamp_column)::int / 10)
in addition to the regular YYYY-MM-DD-HH24 time aggregation
Any help is appreciated.
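One way to express that rule directly is a self-join DELETE that combines pg_trgm's similarity() with an interval comparison, instead of bucketing the minutes. A sketch, assuming a table tweets(ts timestamp, body text); both names are placeholders for the actual schema:

-- similarity() comes from the pg_trgm extension
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- remove any tweet that closely matches an earlier tweet
-- posted within the preceding 10 minutes
DELETE FROM tweets t
USING tweets earlier
WHERE earlier.ts < t.ts
  AND t.ts - earlier.ts <= interval '10 minutes'
  AND similarity(t.body, earlier.body) > 0.75;

One caveat: if near-duplicates arrive in chains longer than 10 minutes, which rows survive depends on which "first" tweets remain, so it may be safer to SELECT the candidate rows first and review them before deleting.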