InfluxDB query: basic percentage calculation

I want to calculate the division of the number of values different from zero in a specific table by the number of values equal to zero in the same table:
SELECT (count("value") WHERE value = 0 / count("value") WHERE value != 0) * 100 FROM "ping_rtt" WHERE time < now() - 15
Obviously this is wrong, and I was wondering what could be the correct way to structure the query.

If your field value consists of just zeros and ones, you can easily calculate the percentage as:
SELECT 100*sum(value)/count(value) from your_metric
Or simply use the mean function instead of count/sum.
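For instance, something along these lines should be equivalent (mean is just sum over count of the non-null points):
SELECT 100*mean(value) FROM your_metric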
But if value consists of arbitrary numbers, there is a tricky way to achieve this, based on the fact that the current InfluxDB implementation evaluates zero divided by zero as zero :) You can first map your field value to zeros and ones and then calculate the percentage:
SELECT 100*count(map_value)/sum(map_value) FROM (SELECT value/value as map_value FROM your_metric)
It works properly in my InfluxDB 1.6.0; suppose there is a metric called metric which contains a field val:
> select * from metric
name: metric
time                tag val
----                --- ---
1539780859073434500     15
1539780862064944400     10
1539780865272757400     7
1539780867449546100     0
1539780880145442700     -8
1539781131768616600 12  0
1539781644977103800 12  0.5
1539781649113051900 12  1.5
As you can see, there are various float values such as 0, -8, 1.5, 0.5 and so on.
We can now map our val field to zero or one:
> select val/val as normal_val from metric
name: metric
time                normal_val
----                ----------
1539780859073434500 1
1539780862064944400 1
1539780865272757400 1
1539780867449546100 0
1539780880145442700 1
1539781131768616600 0
1539781644977103800 1
1539781649113051900 1
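Applying the same mapping to the original question (zero values as a percentage of non-zero values in ping_rtt), one possible variant of the same idea, untested, would be the following; it assumes the field is called value and that the intended time filter was the last 15 minutes (the original query is missing a duration unit):
SELECT 100 * (count(normal_val) - sum(normal_val)) / sum(normal_val) AS zero_to_nonzero_pct
FROM (SELECT value/value AS normal_val FROM "ping_rtt" WHERE time > now() - 15m)
Here sum(normal_val) counts the non-zero points and count(normal_val) - sum(normal_val) the zero points, so on the metric example above this would give 100 * (8 - 6) / 6 ≈ 33%.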

Related

Using postgres percentile function with negative numbers

I have a table containing records with negative numbers:
ID  Location   Temperature
1   Paris      -1
2   London     -2
3   Berlin     -3
4   Moscow     -4
5   Rome       -5
6   Warsaw     -6
7   Madrid     -7
8   Amsterdam  -8
9   Milan      -9
10  Zurich     -10
(my actual records and values are more numerous and more complex, but this should help illustrate the issue)
I want to get the minimum, first quartile, median, third quartile, maximum of the temperature values, but in reverse.
For instance, in my example I would have:
Aggregate       Value
Minimum         -1
First quartile  -2.5
Median          -5
Third quartile  -7.5
Maximum         -10
The problem as I see it is that my numbers are negative. So when I run:
SELECT PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY "city_temperatures"."temperature") AS percentile_temperature FROM "city_temperatures"
I actually get the value of the third quartile as opposed to the first quartile.
What's the best way to handle negative numbers in a query like this?
Add DESC to ORDER BY?
SELECT percentile_cont(0.25) WITHIN GROUP (ORDER BY t.temperature DESC) AS pct_temp
FROM city_temperatures t;
You might get all of them as an array in a single call with:
SELECT percentile_cont('{0,0.25,0.5,0.75,1}'::float8[])
WITHIN GROUP (ORDER BY t.temperature DESC) AS pct_temps
FROM city_temperatures t;
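If you would rather have one row per aggregate, a possible follow-up (a sketch, not part of the original answer) unnests the array alongside a list of labels; the labels follow the reversed ordering used above:
SELECT u.label, u.value
FROM (
  SELECT percentile_cont('{0,0.25,0.5,0.75,1}'::float8[])
         WITHIN GROUP (ORDER BY t.temperature DESC) AS pct_temps
  FROM city_temperatures t
) s,
  -- unnest(labels, values) zips the two arrays row by row
  unnest(ARRAY['minimum','first quartile','median','third quartile','maximum'], s.pct_temps)
    AS u(label, value);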

How to exclude Null values from a table calculation but keep them for the aggregated total

Background:
I have a cohort analysis table that shows the month of the year on the Y axis (rows) and Difference from next month on the X axis (columns). With Customers as my measure, the columns show values from 0 to 12, representing all the conversions from that particular month; people who did not make a conversion are shown as Null.
Problem:
The table has a column for Null plus columns 0, 1, 2, ... 8 showing values, and the total of these gives my cohort size. The Null group matters because it contributes to the size of the total group, but I want the percentage of each group and the cumulative growth of the group computed without the Null group.
Summary
I want to show cumulative percentage growth that does not include my first column (the Null values), but keep that column so the totals show the correct value.
For January 2022:
You have the total value of each column.
You see the individual percentage of each column over that total.
You see the cumulative percentage growth total.
Desired result: a cumulative percentage total that does not include the Nulls.
Follow-up clarification:
The expected answer should look like this
For 0 --> 3.9%
For 1 --> 7.8% (3.9%+3.9%)
For 2 --> 9.5% (3.9%+3.9%+ 1.7%)
As you can see, each percentage is computed against the total cohort size, which gives the correct 3.9%, and the cumulative sum excludes the Null value percentage of 87.1% (effectively hiding the Null value column).
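Since the workbook itself is not shown, here is only a rough sketch with hypothetical field names ([Difference from next month] for the column dimension, [Customers] for the measure): one way to express this in Tableau is a nested table calculation that lets the Null bucket contribute 0 to the running sum while keeping it in the denominator:
// Cumulative % of cohort, with the Null column contributing 0
RUNNING_SUM(
    IF ISNULL(ATTR([Difference from next month])) THEN 0.0
    ELSE SUM([Customers]) / TOTAL(SUM([Customers]))
    END
)
Both RUNNING_SUM and the inner TOTAL would need their Compute Using set across the month-difference columns. Because the denominator still includes the Null bucket, the individual percentages keep their 3.9% / 1.7% values while the cumulative line skips the 87.1%.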

Tableau: calculating the variance (division) of two calculated (countif) fields

Tableau is giving me a hard time trying to compare two items by percentages. I need to display the percentage difference between the numbers (countif) of string items based on a condition.
Basically, I wrote two calculated fields like:
Calc field #1
IF [Outcome] = "Complete" Then 1 Else 0
Calc field #2
IF [Outcome] = "Pending" Then 1 Else 0
and a third field to get the percentage of pending sales to completed sales
Calc percentage
SUM([Calc field #1] / [Calc field #2])
But it's not working. The first two fields work fine (I validated them against the dataset), but the third calculation doesn't work and always outputs 0.
The formula for Calc percentage should be
SUM([Calc field #1]) / SUM([Calc field #2])
As both calculated fields are computed row-wise, it is important to aggregate each of them before dividing.
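Equivalently, the whole thing can be collapsed into a single calculated field (a sketch assuming the [Outcome] field from the question):
// Completed-to-pending ratio; aggregate each side before dividing
SUM(IF [Outcome] = "Complete" THEN 1 ELSE 0 END)
/ SUM(IF [Outcome] = "Pending" THEN 1 ELSE 0 END)
Multiply by 100, or set the number format to percentage, if you want it displayed as a percentage.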

InfluxDB - how to calculate the sum of differences per second at the minute level

I want to query the sum per minute from the result obtained from another query that calculates the difference between subsequent values.
select sum(ph1), sum(ph2), sum(ph3) from (select
non_negative_difference(day_chan1) as ph1,
non_negative_difference(day_chan2) as ph2,
non_negative_difference(day_chan3) as ph3
from electricity)
group by time(1m) tz('Europe/Dublin')
For example, if I get the following from the subquery:
time                 ph1 ph2 ph3
----                 --- --- ---
2017-04-02T14:40:38Z 0   0   2
2017-04-02T14:41:38Z 1   1   1
2017-04-02T14:41:39Z 0   0   2
2017-04-02T14:42:38Z 1   1   1
2017-04-02T14:42:39Z 0   1   2
I want to sum them up into
time                 ph1 ph2 ph3
----                 --- --- ---
2017-04-02T14:40:00Z 0   0   2
2017-04-02T14:41:00Z 1   1   3
2017-04-02T14:42:00Z 1   2   3
but what I get from the query is the error "aggregate function required inside the call to non_negative_difference", although if I run the subquery on its own, it returns results.
I was also looking for this for a long time and finally found the solution:
select sum(ph1), sum(ph2), sum(ph3) from (select
This is right. Now we want to add an aggregate function inside the non_negative_difference call (as the error also indicates); I assume you want to sum everything:
non_negative_difference(sum(day_chan1)) as ph1,
non_negative_difference(sum(day_chan2)) as ph2,
non_negative_difference(sum(day_chan3)) as ph3
from electricity
Now, if we don't add the following line, the GROUP BY interval of the inner query will also be 1m. We don't want this because, if a value is missing, the way Influx calculates the sum will result in a very large difference. So we group the subquery by the smallest interval you have (e.g. 1s):
group by time(1s))
Finally, group the outer query by the interval over which you would like the values to be added together:
group by time(1m) tz('Europe/Dublin')
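Assembled into a single statement (with ph3 as the third outer sum), the full query looks like this:
select sum(ph1), sum(ph2), sum(ph3) from (
    select
        non_negative_difference(sum(day_chan1)) as ph1,
        non_negative_difference(sum(day_chan2)) as ph2,
        non_negative_difference(sum(day_chan3)) as ph3
    from electricity
    group by time(1s)
)
group by time(1m) tz('Europe/Dublin')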

How to get the sum of a column up to a certain value?

I have a Google Sheet that I am using to try to calculate leveling and experience points. Column A has the level and column B has the exp needed to reach the next level, i.e. to get to level 3 you need 600 exp.
A B
1 200
2 400
3 600
...
99 19800
In cell I2 I have an integer amount of exp (e.g. 2000); in cell J2 I want to figure out what level someone would be at if they started from 0.
Put this in column J and drag down as required. ROUNDDOWN(I2,-2) rounds I2 down to the nearest 100. INDEX/MATCH finds a match in column B and returns the value in column A of the matched row.
=index(A2:A100,match(ROUNDDOWN(I2,-2),B2:B100,0))
Using a helper column (for example Z): put =sum(B$1:B1) in cell Z1 and drag down. This will compute the sums required for each level. In J2, use the formula
=vlookup(I2, {Z:Z, A:A}, 2) + 1
which looks up I2 in the cumulative column Z, returns the level in column A for the nearest match that is less than or equal to the search key, and adds 1 to find the level that would be reached, because your table has an offset: the entry against level N is about achieving level N+1.
You may want to put a 0 0 row on top of the table to correctly handle amounts under 200, or treat them with a separate IF condition.
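To illustrate with the table above (assuming the helper column sits next to the same rows): Z1 = 200, Z2 = 600, Z3 = 1200, Z4 = 2000, and so on. With I2 = 2000, the lookup finds Z4 = 2000 as the largest value not exceeding the search key, returns level 4 from column A, and the +1 gives level 5.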
Using algebra
In your specific scenario, the total point amount required to reach level N can be computed as
200*(1+2+3+...+(N-1)) = 200*(N-1)*N/2 = 100*(N-1/2)^2 - 25
So, given x amount of points, we can find N directly with algebra:
N = floor(sqrt((x+25)/100)+1/2)
which means that the formula
=floor(sqrt((I2 + 25) / 100) + 1/2)
will have the desired effect in cell J2, without the need for an extra column and vlookup.
However, this second approach only works for these specific point values.
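For example, with 2000 exp in I2: sqrt((2000 + 25) / 100) + 1/2 = sqrt(20.25) + 0.5 = 4.5 + 0.5 = 5, so the formula returns level 5, matching the lookup example above (200 + 400 + 600 + 800 = 2000 exp to reach level 5).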
