How to select data with minimum time interval between results - influxdb

I am not sure how best to ask this question. I am looking to select data, but with a minimum time interval between the results. For example:
This measurement:
time field
2015-08-18T00:00:00Z 12
2015-08-18T00:00:00Z 1
2015-08-18T00:06:00Z 11
2015-08-18T00:06:00Z 3
2015-08-18T05:54:00Z 2
2015-08-18T06:00:00Z 1
2015-08-18T06:06:00Z 8
2015-08-18T06:12:00Z 7
This query:
select sum(*) from measurement where field > 0
would return the sum of all of the rows. I would like to be able to specify a minimum interval between results and match only the first row in each set of closely timed rows. For example, an 8-minute minimum interval would match only these rows (and result in a sum of 22):
time field
2015-08-18T00:00:00Z 12
2015-08-18T05:54:00Z 2
2015-08-18T06:06:00Z 8
Is there a way to get my expected output from influxdb?
The only alternative I can think of is to just return all of the rows without the sum() aggregate function then loop through the results and do lots of time comparisons or date math in my application.

Probably not with InfluxQL.
InfluxQL has a function, elapsed(), which returns the time elapsed between consecutive data points: https://docs.influxdata.com/influxdb/v1.7/query_language/functions/#elapsed
That's possibly the only function that has something to do with time, but I can't think of a way to apply it to what you need.
You may have better luck with the window function of Flux: https://v2.docs.influxdata.com/v2.0/query-data/guides/window-aggregate/
I'm not familiar enough with Flux to say how, or whether it is possible at all.
Doing it in your application may be the way to go.
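If the filtering does end up in the application, the loop can be fairly short. The following is a minimal Python sketch, not an InfluxDB feature. It assumes the rows have already been fetched without sum() and are sorted by time; the names select_with_min_interval and parse are made up for illustration. A row is kept only when at least the minimum interval has passed since the last kept row, which reproduces the expected rows from the question.

```python
from datetime import datetime, timedelta


def select_with_min_interval(rows, min_gap):
    """Keep the first row of each run of closely timed rows.

    rows: iterable of (datetime, value) pairs, sorted by time.
    min_gap: timedelta; a row is kept only if it is at least this long
             after the previously kept row.
    """
    kept = []
    last_kept_time = None
    for ts, value in rows:
        if last_kept_time is None or ts - last_kept_time >= min_gap:
            kept.append((ts, value))
            last_kept_time = ts
    return kept


def parse(ts):
    # Timestamps in the question use the form 2015-08-18T00:00:00Z.
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")


# The measurement from the question.
rows = [
    (parse("2015-08-18T00:00:00Z"), 12),
    (parse("2015-08-18T00:00:00Z"), 1),
    (parse("2015-08-18T00:06:00Z"), 11),
    (parse("2015-08-18T00:06:00Z"), 3),
    (parse("2015-08-18T05:54:00Z"), 2),
    (parse("2015-08-18T06:00:00Z"), 1),
    (parse("2015-08-18T06:06:00Z"), 8),
    (parse("2015-08-18T06:12:00Z"), 7),
]

kept = select_with_min_interval(rows, timedelta(minutes=8))
total = sum(value for _, value in kept)
print(total)  # 22
```

Note that the gap is measured from the last kept row, not from the previous row in the input. That is why 06:06 is kept (12 minutes after 05:54) even though it is only 6 minutes after the skipped 06:00 row.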

In Google Sheets, how do I multiply a duration or interval constant? [duplicate]

I'm making calculations on production cost (in number of resources) and duration.
I have a process that takes 5 minutes. Using the Duration format, I would enter that as 00:05:00.
I want to queue up this process a certain number of times and calculate the total duration. The output should be something like 16:35:00 or 5 02:15:00, that is, a "d HH:mm:ss" format.
How, in Google Sheets, do I multiply a Duration by an integer to get a total Duration? To be clear, I am not doing a summation of a column of durations. I am taking a duration constant, such as 5 minutes or 25 minutes, and multiplying it by an integer representing the number of times the process will be run, consecutively.
All these attempts resulted in Formula Parse Error:
=(5*00:05:00)
=(112*00:05:00.000)
=(VALUE(C27)*00:05:00)
=MULTIPLY(VALUE(C27),00:05:00.000)
Well, blow me down. I came up with a workaround while I was trying different ways to fail. I assigned 00:05:00 to its own cell with the Duration format, then referenced that cell in the formula.
I.e., =C27*J7 gives me 9:20:00 when C27 evaluates to 112 (it's a summation of its own) and J7 is the cell holding 00:05:00.
Still doesn't give me days when it goes over 24 hours, and I'd rather have the duration value as a constant in the formula, but it's a step forward.
Would something like this work for you? The result is no longer a number, but if you only need to display the amount in your desired format it may be useful:
=IF(ROUNDDOWN(W2*W3),ROUNDDOWN(W2*W3)&"d "&TEXT(W2*W3-ROUNDDOWN(W2*W3),"hh:mm:ss"),TEXT(W2*W3,"hh:mm:ss"))
Change the cell references, obviously
PS: If you want to have the value as a constant in your formula, you can try replacing the cell reference with the TIME function within your formula.
In both Excel and Google Sheets, dates are stored as numbers counting days from 1899/12/30, so that:
1 is equal to 1 day
1/24 is equal to 1 hour
1/24/60 is equal to 1 minute
1/24/60/60 is equal to 1 second
So you can do things like:
=TODAY()+1 which gives you tomorrow, or...
=TODAY()+12/24 which gives you "date of today" 12:00:00
When you are done with the calculations, you can use TEXT() to format the number back into a date format, such as:
=TEXT(TODAY()+7 +13/24 +15/24/60,"yyyy-mm-dd hh:mm:ss")
which returns the date one week from today at 01:15:00 p.m.
This date/time representation doesn't require a full date to work; you can get the difference between two times like this:
=TEXT(1/24/60 - 1/24/60/60,"hh:mm:ss")
Since 1/24/60 is 1 minute and 1/24/60/60 is 1 second, this formula returns 00:00:59, telling you that there is a 59-second difference between 1 minute and 1 second.
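The same arithmetic can be checked outside the spreadsheet. Below is a small Python sketch, offered only as an illustration of the "duration times integer" calculation from the question; format_duration is a made-up helper, and the "Nd HH:MM:SS" layout is one assumed reading of the format the asker wants.

```python
from datetime import timedelta


def format_duration(d):
    """Format a timedelta as 'H:MM:SS', or 'Nd HH:MM:SS' once it reaches a day."""
    total_seconds = int(d.total_seconds())
    days, rem = divmod(total_seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    if days:
        return f"{days}d {hours:02d}:{minutes:02d}:{seconds:02d}"
    return f"{hours}:{minutes:02d}:{seconds:02d}"


step = timedelta(minutes=5)          # the 00:05:00 duration constant

print(format_duration(112 * step))   # 9:20:00
print(format_duration(300 * step))   # 1d 01:00:00

# The same duration expressed as a fraction of a day, which is how
# the spreadsheet stores it internally.
fraction_of_day = (112 * step) / timedelta(days=1)
print(fraction_of_day)
```

The first result matches the 9:20:00 that =C27*J7 produced for 112 runs; the second shows the day component appearing once the total passes 24 hours.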

InfluxQL time calculations return no records

I'd like to query InfluxDB using InfluxQL and exclude any rows from 0 to 5 minutes after the hour.
Seems pretty easy to do using the time field (the number of nanoseconds since the epoch) and a little modulus math. But the problem is that any WHERE clause with even the simplest calculation on time returns zero records.
How can I get what I need if I can't perform calculations on time? How can I exclude any rows from 0 to 5 minutes after the hour?
# Returns 10 records
SELECT * FROM "telegraf"."autogen"."processes" WHERE time > 0 LIMIT 10
# Returns 0 records
SELECT * FROM "telegraf"."autogen"."processes" WHERE (time/1) > 0 LIMIT 10
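One possible fallback, if the condition cannot be expressed in the WHERE clause, is to fetch the rows and apply the modulus math client-side. The Python sketch below is only an illustration under assumptions: timestamps are taken to be integer nanoseconds since the epoch, "after the hour" is taken to mean UTC hours, and the function name is made up. The sample timestamps are illustrative values, not query output.

```python
NS_PER_SECOND = 1_000_000_000
NS_PER_HOUR = 3600 * NS_PER_SECOND


def outside_first_minutes(ts_ns, minutes=5):
    """Return True if ts_ns is NOT within the first `minutes` of its hour.

    Epoch hours line up with UTC hours, so this is only correct for UTC
    or for time zones with a whole-hour offset. A timestamp exactly
    `minutes` past the hour is treated as outside the excluded window.
    """
    offset_ns = ts_ns % NS_PER_HOUR
    return offset_ns >= minutes * 60 * NS_PER_SECOND


# Illustrative timestamps (nanoseconds since the epoch).
timestamps = [
    1494720000 * NS_PER_SECOND,  # 2017-05-14T00:00:00Z, excluded
    1494720297 * NS_PER_SECOND,  # 2017-05-14T00:04:57Z, excluded
    1494720300 * NS_PER_SECOND,  # 2017-05-14T00:05:00Z, kept
    1494722000 * NS_PER_SECOND,  # 2017-05-14T00:33:20Z, kept
]

kept = [ts for ts in timestamps if outside_first_minutes(ts)]
print(kept)
```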

Showing hourly average (histogram) in Grafana

Given a time series of (electricity) market data with data points every hour, I want to show a bar graph with the all-time (or per-time-frame) average for every hour of the day, so that an analyst can easily compare actual prices to all-time averages (which hour of the day is most/least expensive).
We have CrateDB as the backend, which is used in Grafana just like a PostgreSQL source.
SELECT
extract(HOUR from start_timestamp) as "time",
avg(marketprice) as value
FROM doc.el_marketprices
GROUP BY 1
ORDER BY 1
So my data basically looks like this
time value
23.00 23.19
22.00 25.38
21.00 29.93
20.00 31.45
19.00 34.19
18.00 41.59
17.00 39.38
16.00 35.07
15.00 30.61
14.00 26.14
13.00 25.20
12.00 24.91
11.00 26.98
10.00 28.02
9.00 28.73
8.00 29.57
7.00 31.46
6.00 30.50
5.00 27.75
4.00 20.88
3.00 19.07
2.00 18.07
1.00 19.43
0 21.91
After hours of fiddling around with Bar Graphs, Histogram Mode, the Heatmap Panel and much more, I am just not able to draw a simple hours-of-the-day histogram with this in Grafana. I would very much appreciate any advice on how to use any panel to get this accomplished.
Your query doesn't return valid time series data for Grafana: the time field is not a valid timestamp. Don't extract only the hour; provide the full start_timestamp instead (I hope it is a timestamp data type and the value is in UTC).
Add a WHERE time condition using Grafana's macro $__timeFilter.
Use Grafana's macro $__timeGroupAlias for hourly grouping:
SELECT
$__timeGroupAlias(start_timestamp,1h,0),
avg(marketprice) as value
FROM doc.el_marketprices
WHERE $__timeFilter(start_timestamp)
GROUP BY 1
ORDER BY 1
This will give you data for a historical graph with hourly average values.
The histogram you need may be trickier, but you can try to create a metric that holds the extracted hour, e.g.:
SELECT
$__timeGroupAlias(start_timestamp,1h,0),
extract(HOUR from start_timestamp) as "metric",
avg(marketprice) as value
FROM doc.el_marketprices
WHERE $__timeFilter(start_timestamp)
GROUP BY 1, 2
ORDER BY 1
And then visualize it as a histogram. Remember that Grafana is designed for time series data, so you need a proper timestamp (not only extracted hours, although you can fake one if necessary); otherwise you will have a hard time visualizing non-time-series data in Grafana. This second query may not work properly, but it should at least give you the idea.
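For reference, the hour-of-day profile the asker is after is just an average grouped by the hour component of each timestamp. The Python sketch below shows that aggregation outside Grafana; it is not part of either answer's query. The function name hourly_profile is hypothetical and the sample prices are made up purely to demonstrate the grouping.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hourly_profile(points):
    """Average value per hour of day (0-23, UTC) over all points.

    points: iterable of (timezone-aware datetime, float) pairs.
    Returns a dict mapping hour of day to the mean value for that hour.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, price in points:
        hour = ts.astimezone(timezone.utc).hour
        sums[hour] += price
        counts[hour] += 1
    return {hour: sums[hour] / counts[hour] for hour in sorted(sums)}


# Made-up sample data: two days with a price at 00:00, one price at 01:00.
points = [
    (datetime(2021, 1, 1, 0, 0, tzinfo=timezone.utc), 20.0),
    (datetime(2021, 1, 2, 0, 0, tzinfo=timezone.utc), 24.0),
    (datetime(2021, 1, 1, 1, 0, tzinfo=timezone.utc), 18.0),
]

profile = hourly_profile(points)
print(profile)  # {0: 22.0, 1: 18.0}
```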

How do I shift Google Sheets duration value from 35:55:00 to 0:35:55?

I have pasted multiple run duration values from Garmin into a Google Sheet. The longer runs (> 1 hour) copy/paste correctly, e.g. 2:10:35. The problem is the shorter (< 1 hour) runs, e.g. 35:55. These are shown in Google Sheets as 35:55:00, i.e. Google assumes 35:55 is 35 hours and 55 minutes, not 35 minutes and 55 seconds. So for my shorter, sub-1-hour durations I need an easy way to convert 35:55:00 to 0:35:55.
As Tom Sharpe said, there is some room for interpretation in the data you have. But assuming that the duration of your runs is always between 10 minutes and 10 hours, we can disambiguate the values as follows:
=if(A1 > 10/24, A1/60, A1)
Numerically, the duration values are measured in days, so A1 > 10/24 means "more than 10 hours". In this case the value gets divided by 60.
Depending on your workout regime you may want to replace the threshold of 10 by another number; perhaps it's safer to say that the runs are always between 5 minutes and 5 hours.
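The threshold rule is easy to sanity-check outside the sheet. This is a minimal Python sketch of the same disambiguation, using the 10-hour threshold from the formula above; fix_duration is a made-up name, and the two inputs are the example values from the question.

```python
from datetime import timedelta


def fix_duration(d, threshold=timedelta(hours=10)):
    """Reinterpret an implausibly long H:MM:00 value as M:SS.

    Values above the threshold are assumed to be minutes:seconds that
    were misread as hours:minutes, so they are divided by 60.
    """
    return d / 60 if d > threshold else d


misread = timedelta(hours=35, minutes=55)             # shown as 35:55:00
correct = timedelta(hours=2, minutes=10, seconds=35)  # shown as 2:10:35

print(fix_duration(misread))  # 0:35:55
print(fix_duration(correct))  # 2:10:35
```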

How to find time intervals with no data points in InfluxDB

I have a bunch of IoT sensors that upload second by second data to InfluxDB. Since their network is unreliable, sometimes they do not report data.
I'm trying to figure out how to determine time periods in InfluxDB for which there is no data, and am encountering some wacky behavior with subqueries.
What I've tried so far:
Count the number of points each second, for example:
select count(power)
from energy
where time < '2017-05-14T00:05:10Z'
and time >= '2017-05-14T00:04:30Z'
group by time(1s);
This looks promising, as it returns a result for each second in the interval and the count of data points:
...
1494720297000000000 1
1494720298000000000 1
1494720299000000000 0
1494720300000000000 0
...
Now I want only the time periods where there are 0 points, however when I try this, only time ranges with non-zero numbers of points are reported:
select "points"
from
(select count(power) as "points"
from energy
where time < '2017-05-14T00:05:10Z'
and time >= '2017-05-14T00:04:30Z'
group by time(1s));
Returns:
...
1494720297000000000 1
1494720298000000000 1
No data after 1494720298000000000 is returned, even though the subquery does return rows.
Any help would be appreciated in crafting a query or approach to identify only the areas of time where there is no data.
Add fill(none) at the end of your query. Example:
select count(power)
from energy
where time < '2017-05-14T00:05:10Z'
and time >= '2017-05-14T00:04:30Z'
group by time(1s) fill(none)
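Another option is to compute the gaps client-side from the timestamps that do exist. The Python sketch below is a rough illustration under assumptions: timestamps have been fetched as integer epoch seconds, the sensors report on whole seconds, and find_gaps is a made-up helper. The two present points after the gap are invented so that the gap has an end.

```python
def find_gaps(timestamps, start, end, step=1):
    """Return (gap_start, gap_end) pairs of missing slots in [start, end).

    timestamps: iterable of integer epoch timestamps that have data.
    gap_end is exclusive: it is the first slot after the gap that has
    data, or `end` if the gap runs to the end of the range.
    """
    present = set(timestamps)
    gaps = []
    gap_start = None
    for t in range(start, end, step):
        if t not in present:
            if gap_start is None:
                gap_start = t          # a new gap begins here
        elif gap_start is not None:
            gaps.append((gap_start, t))  # data resumed, close the gap
            gap_start = None
    if gap_start is not None:
        gaps.append((gap_start, end))
    return gaps


# Seconds ...297 and ...298 have data and ...299 and ...300 do not,
# as in the question; the points at ...301 and ...302 are assumed.
seen = [1494720295, 1494720296, 1494720297, 1494720298,
        1494720301, 1494720302]

gaps = find_gaps(seen, 1494720295, 1494720303)
print(gaps)  # [(1494720299, 1494720301)]
```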
