Get last stored value at a given time in influxDB - influxdb

I'm storing values from my temperature sensors in an InfluxDB database and I'm looking for a particular query.
Each sensor sends data only when the temperature changes by more than a certain threshold, which means that the sensors do not all send data at the same time.
So sensor 1, namely S1, will send value 1 (S1_v1) at instant t1. Then S2 will send S2_v2 at t2, S3 will send S3_v3 at t3, etc.
I'd like to have the values of all the sensors at a given time t, so that at t2 the returned value of S1 will be S1_v1 (the last stored one).
How can I do that with InfluxDB, please? I hope my request is clear enough.
Thank you very much.

You can store all your sensor data into one measurement.
Then use a tag called name to store the sensor's name.
Example:
> select * from sensors;
name: sensors
time name value
---- ---- -----
1547100000000000000 s1 500
1547200000000000000 s2 600
1547300000000000000 s3 700
1548000000000000000 s1 900
1548000000000000000 s2 800
1548000000000000000 s3 999
To retrieve the latest stored value of every sensor within a given time range, you can do the following:
SELECT * FROM sensors
WHERE time >= 1547000000000000000 and time <= 1547300000000000000
GROUP BY "name" order by desc limit 1;
Output:
name: sensors
tags: name=s3
time value
---- -----
1547300000000000000 700
name: sensors
tags: name=s2
time value
---- -----
1547200000000000000 600
name: sensors
tags: name=s1
time value
---- -----
1547100000000000000 500
The query above essentially groups the points that pass your time filter into one bucket per sensor (one per value of the name tag). ORDER BY DESC then sorts each bucket by descending time, so that the first row is always the point with the greatest timestamp. LIMIT 1 asks the query engine to return only that top row from each bucket.
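The grouping, ordering and limiting described above can be sketched in plain Python. This is only an illustration of the query's semantics on the sample rows from this answer, not InfluxDB client code; the function name last_values_at is made up for the example.

```python
# Sample points copied from the example above: (time_ns, sensor_name, value).
POINTS = [
    (1547100000000000000, "s1", 500),
    (1547200000000000000, "s2", 600),
    (1547300000000000000, "s3", 700),
    (1548000000000000000, "s1", 900),
    (1548000000000000000, "s2", 800),
    (1548000000000000000, "s3", 999),
]


def last_values_at(points, t_start, t_end):
    """Return {name: (time_ns, value)} with each sensor's newest point in [t_start, t_end].

    Hypothetical helper that mimics:
    WHERE time >= t_start AND time <= t_end GROUP BY "name" ORDER BY DESC LIMIT 1
    """
    latest = {}
    for time_ns, name, value in points:
        if not (t_start <= time_ns <= t_end):
            continue  # the WHERE clause filters by time first
        # One bucket per tag value; keep only the point with the greatest time.
        if name not in latest or time_ns > latest[name][0]:
            latest[name] = (time_ns, value)
    return latest


result = last_values_at(POINTS, 1547000000000000000, 1547300000000000000)
for name in sorted(result):
    print(name, result[name])
```

Passing 0 as t_start and the instant of interest as t_end gives "the last stored value of every sensor at time t", which is what the question asks for; in the query that corresponds to dropping the lower time bound.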

Related

Insert from Influxdb subquery, rows missing and time set to 0

I am inserting rows into a new measurement from a subquery. The subquery returns 2 rows, but only one is actually inserted into the new measurement. In addition, the time is set to 0, which means I had to extend the duration of RETENTION POLICY "autogen" to reach back before 1.1.1970.
This is the content of StoreSales:
INSERT StoreSales,StoreNumber="1",EnteredBy="Jake",Month=201906 value=1000
INSERT StoreSales,StoreNumber="1",EnteredBy="Jill",Month=201906 value=2000
INSERT StoreSales,StoreNumber="2",EnteredBy="Jill",Month=201905 value=2000
INSERT StoreSales,EnteredBy="Ann",Month=201906 value=1000
Set duration to before Unix epoch:
ALTER RETENTION POLICY "autogen" on "DT" duration 450000h0m0s
ALTER RETENTION POLICY "autogen" on "DT" shard duration 450000h0m0s
This is the insert I am trying to use:
SELECT * INTO "StoreSalesByStoreByMonth" FROM ( SELECT Sum(value) FROM "StoreSales" WHERE StoreNumber !='' GROUP BY StoreNumber, Month)
The result is:
time written
---- -------
0 2
But StoreSalesByStoreByMonth only includes one record:
SELECT * FROM "StoreSalesByStoreByMonth"
name: StoreSalesByStoreByMonth
time Month StoreNumber sum
---- ----- ----------- ---
0 201906 "1" 3000
The record for Month=201905, StoreNumber="2" is missing.
There is one record in StoreSales without StoreNumber on purpose, to verify that the GROUP BY excludes the records without that tag.
How can I get all the records from the subquery inserted?
Can I set the time in the query somewhere, so I don't need to set the RETENTION POLICY "autogen" to before 1.1.1970?

influxdb query basic percentage calculation

I want to calculate the ratio between the number of values different from zero in a specific table and the number of values equal to zero in the same table:
SELECT (count("value") WHERE value = 0 / count("value") WHERE value != 0) * 100 FROM "ping_rtt" WHERE time < now() - 15
Obviously this is wrong, and I was wondering what the correct way to structure the query would be.
If your field value consists of just zeros and ones, you can easily calculate the percentage as:
SELECT 100*sum(value)/count(value) from your_metric
Or simply use the MEAN function instead of sum/count.
But if value holds arbitrary numbers, there is a trick (it relies on the fact that the current InfluxDB implementation evaluates zero/zero as zero). You can first map your field value to zeros and ones and then calculate the percentage:
SELECT 100*sum(map_value)/count(map_value) FROM (SELECT value/value as map_value FROM your_metric)
This works properly on my InfluxDB 1.6.0. Suppose there is a measurement called metric which contains a field val:
> select * from metric
name: metric
time                tag val
----                --- ---
1539780859073434500     15
1539780862064944400     10
1539780865272757400     7
1539780867449546100     0
1539780880145442700     -8
1539781131768616600 12  0
1539781644977103800 12  0.5
1539781649113051900 12  1.5
As you can see, val holds various float values such as 0, -8, 1.5, 0.5 and so on.
We can now map the val field to zero or one:
> select val/val as normal_val from metric
name: metric
time normal_val
---- ----------
1539780859073434500 1
1539780862064944400 1
1539780865272757400 1
1539780867449546100 0
1539780880145442700 1
1539781131768616600 0
1539781644977103800 1
1539781649113051900 1
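To make the mapping step concrete, here is a small Python sketch that applies the same idea to the val column shown above and then computes 100*sum/count, the same form as the zeros-and-ones query. The 0/0 = 0 rule is taken from the answer's own statement about InfluxDB, and the helper names are hypothetical.

```python
# The val column from the sample measurement above.
VALS = [15, 10, 7, 0, -8, 0, 0.5, 1.5]


def map_to_indicator(value):
    """Mimic val/val, assuming (as the answer states) that 0/0 evaluates to 0."""
    return 0 if value == 0 else 1


def percent_nonzero(values):
    """Percentage of non-zero values: 100 * sum(mapped) / count(mapped)."""
    mapped = [map_to_indicator(v) for v in values]
    return 100 * sum(mapped) / len(mapped)


mapped_vals = [map_to_indicator(v) for v in VALS]
print(mapped_vals)
print(percent_nonzero(VALS))
```

With six non-zero values out of eight, the percentage of non-zero values comes out at 75; the percentage of zeros is simply 100 minus that.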

Total sum is less than sum of sums by tags in InfluxDB measurement?

Or am I doing something wrong? I have a service that counts the requests it receives. Requests carry tags such as platform, the version of the client application that made them, and others. When the service is restarted (which happens rarely, on updates), the metrics are reset.
So I want to compute the percentage of requests from each platform over a recent time range, and I run:
SELECT SUM("received") as "received"
FROM (
SELECT NON_NEGATIVE_DIFFERENCE(MAX("received")) as "received"
FROM "http_metrics"
WHERE time >= now() - 4h GROUP BY time(1s)
) GROUP BY "platform";
Which returns:
...
tags: platform=ios
time received
---- --------
1970-01-01T00:00:00Z 581
tags: platform=unknown
time received
---- --------
1970-01-01T00:00:00Z 12310
tags: platform=web
time received
---- --------
1970-01-01T00:00:00Z 6196
And do the same without grouping:
SELECT SUM("received") as "received"
FROM (
SELECT NON_NEGATIVE_DIFFERENCE(MAX("received")) as "received"
FROM "http_metrics"
WHERE time >= now() - 4h GROUP BY time(1s)
);
Which returns:
time received
---- --------
1970-01-01T00:00:00Z 8274
This is obviously incorrect, because the "unknown" platform alone cannot have received more requests than all platforms combined. But I don't even know which result is incorrect: the total, the by-platform sums, or both.
How do I properly compute the total and the by-platform sums of requests?
Ok, so the problem was that my measurement had other tags, like server and app version, with a separate counter for each combination, so the counters all became interleaved. This can be seen on a graph, which should be smooth and cumulative but is instead very spiky.
But when we add GROUP BY *:
SELECT "received" FROM "http_metrics" WHERE $timeFilter GROUP BY *;
it splits into lots of separate smooth series.
Now each series is differentiable, and we can build a subquery on top of that to aggregate from.
Total:
SELECT SUM("received") as "received" FROM (
SELECT NON_NEGATIVE_DIFFERENCE(MAX("received")) as "received"
FROM "http_metrics"
WHERE time >= now() - 6h
GROUP BY time(1s), *
) WHERE time >= now() - 6h;
time received
---- --------
2018-07-17T01:07:46.184292033Z 1367
Grouped:
SELECT SUM("received") as "received" FROM (
SELECT NON_NEGATIVE_DIFFERENCE(MAX("received")) as "received"
FROM "http_metrics"
WHERE time >= now() - 6h
GROUP BY time(1s), *
) WHERE time >= now() - 6h
GROUP BY "platform";
... I'm not going to bore you with the response, but the sum of the groups matches the total.
So I guess the moral of the story is: every time you want to differentiate counters that carry tags, you need to GROUP BY *.
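Why interleaving inflates the result can be shown with a toy Python model. The samples below are hypothetical (two counters, "web" and "ios", reporting at alternating times), and the two functions only imitate "difference the merged series" versus "difference each series, then sum"; they are not InfluxDB code.

```python
from collections import defaultdict

# Hypothetical samples: (time, series_key, cumulative_counter).
SAMPLES = [
    (1, "web", 100), (2, "ios", 10),
    (3, "web", 105), (4, "ios", 12),
    (5, "web", 111), (6, "ios", 15),
]


def non_negative_difference(values):
    """Differences between consecutive values, dropping the negative ones."""
    diffs = (b - a for a, b in zip(values, values[1:]))
    return [d for d in diffs if d >= 0]


def total_without_grouping(samples):
    """Merge all series into one (MAX per time bucket), then difference it."""
    per_time = defaultdict(list)
    for t, _series, value in samples:
        per_time[t].append(value)
    merged = [max(per_time[t]) for t in sorted(per_time)]
    return sum(non_negative_difference(merged))


def total_grouped_by_series(samples):
    """Like GROUP BY *: difference each series separately, then add everything up."""
    per_series = defaultdict(list)
    for _t, series, value in sorted(samples):
        per_series[series].append(value)
    return sum(sum(non_negative_difference(v)) for v in per_series.values())


print(total_without_grouping(SAMPLES))   # jumps between counters count as "increments"
print(total_grouped_by_series(SAMPLES))  # only real increments are counted
```

In the merged series every jump from the small counter up to the large one looks like a big positive increment, which is how a single platform can appear to exceed the real total.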

InfluxDB: Starting cumulative_sum() from zero / aggregate grouping required for cumulative_sum and non_negative_difference

Using InfluxDB, I'm trying to produce an output that shows cumulative rainfall for a time period, starting from zero.
The rainfall sensor outputs a cumulative rainfall amount, but resets to zero on power-failure, restart etc.
My first query component uses non_negative_difference() to show the increments.
select
non_negative_difference(rain) as nnd
FROM
weather
WHERE
$time_query
.... yields an increment per raw data point, for example:
2018-06-01T14:21:00.926Z 0
2018-06-01T14:22:02.959Z 0.30000000000000426
2018-06-01T14:23:04.992Z 0.3999999999999986
2018-06-01T14:24:07.024Z 0.10000000000000142
2018-06-01T14:25:09.059Z 0.19999999999999574
2018-06-01T14:26:11.094Z 0
2018-06-01T14:27:13.127Z 0.10000000000000142
2018-06-01T14:28:15.158Z 0.20000000000000284
2018-06-01T14:29:20.027Z 0.09999999999999432
2018-06-01T14:30:22.476Z 0.10000000000000142
2018-06-01T14:30:53.918Z 0.6000000000000014
2018-06-01T14:31:55.968Z 0.5
2018-06-01T14:32:58.007Z 0.5
2018-06-01T14:34:00.046Z 0.20000000000000284
2018-06-01T14:35:02.075Z 0.3999999999999986
2018-06-01T14:36:04.102Z 0.3999999999999986
2018-06-01T14:37:06.136Z 0.20000000000000284
2018-06-01T14:38:08.201Z 0
So far so good.
I'm now trying to stitch these readings back to cumulative total, starting from zero for the intended period.
I can use cumulative_sum() for this, for example:
SELECT
cumulative_sum(nnd)
FROM
(SELECT
non_negative_difference(rain) as nnd
FROM
weather
WHERE
$time_query )
which yields:
2018-06-01T14:21:00.926Z 0
2018-06-01T14:22:02.959Z 0.30000000000000426
2018-06-01T14:23:04.992Z 0.7000000000000028
2018-06-01T14:24:07.024Z 0.8000000000000043
2018-06-01T14:25:09.059Z 1
2018-06-01T14:26:11.094Z 1
2018-06-01T14:27:13.127Z 1.1000000000000014
2018-06-01T14:28:15.158Z 1.3000000000000043
2018-06-01T14:29:20.027Z 1.3999999999999986
2018-06-01T14:30:22.476Z 1.5
2018-06-01T14:30:53.918Z 2.1000000000000014
2018-06-01T14:31:55.968Z 2.6000000000000014
2018-06-01T14:32:58.007Z 3.1000000000000014
2018-06-01T14:34:00.046Z 3.3000000000000043
2018-06-01T14:35:02.075Z 3.700000000000003
2018-06-01T14:36:04.102Z 4.100000000000001
2018-06-01T14:37:06.136Z 4.300000000000004
2018-06-01T14:38:08.201Z 4.300000000000004
Looking good!
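For readers who want to check the arithmetic outside InfluxDB, the two steps so far (non-negative differences, then a running total) can be sketched in Python. The readings are hypothetical integers with one sensor reset in the middle, and the function names simply echo the InfluxQL ones.

```python
from itertools import accumulate


def non_negative_difference(values):
    """Differences between consecutive readings, dropping negative ones (resets)."""
    diffs = (b - a for a, b in zip(values, values[1:]))
    return [d for d in diffs if d >= 0]


def cumulative_from_zero(readings):
    """Running total of the increments, independent of the sensor's absolute value."""
    return list(accumulate(non_negative_difference(readings)))


# Hypothetical cumulative sensor readings; the drop from 7 to 0 is a restart.
READINGS = [5, 7, 7, 0, 2, 3]

print(non_negative_difference(READINGS))
print(cumulative_from_zero(READINGS))
```

The reset contributes nothing because its negative difference is dropped, so the running total only ever grows by real increments.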
Now I'd like to group it up into more distinct time buckets, for nice graphing.
Let's try....
SELECT
cumulative_sum(max(nnd))
FROM (SELECT
non_negative_difference(rain) as nnd
FROM
weather
WHERE
$time_query)
GROUP BY
time(5m)
and I get an error: ERR: aggregate function required inside the call to non_negative_difference
But I cannot find a reasonable way of adding aggregates and groupings to non_negative_difference() that do not affect the accuracy of the differencing function itself.
The only thing I've been able to do is use a dummy aggregate SUM() over time groups that are smaller than the sensor period. But this isn't robust enough for my liking (and I'm still not sure it is 100% correct).
Is it correct that I must have both queries as aggregate queries?
I was trying to do this very thing for my weather station. Instead of having the weather station calculate the cumulative value I wanted Grafana to do it. The solution that worked for me is the advanced syntax Yuri Lachin mentions in his comments.
With InfluxDB you can use CUMULATIVE_SUM(), but the basic syntax doesn't allow you to group by time (only by tag). The "advanced syntax", however, allows you to have a time series by nesting an aggregate function like MEAN() or SUM().
Here's the query I am using in Grafana to get a cumulative rainfall total for a selected time period:
SELECT CUMULATIVE_SUM(MEAN("rainfall")) FROM "weather" WHERE $timeFilter GROUP BY time(1h) fill(0)
The GROUP BY is, of course, flexible. I was interested in hourly rainfall so I grouped by 1h. You can group by the time period you find most interesting.
Using this query, the rainfall total starts from zero for the period you select in Grafana. In the Seattle area we had measurable rain (I know, shocker) on 8/6/2020 and 8/8/2020. If I set my date range to include both dates, the graph shows just under 0.2mm total rainfall.
If I switch my graph to 8/8 and 8/9, the total is just under 1mm.
Note: I was also interested in seeing the individual bucket tips, so I included those as bars on the second Y-axis.
For more detail see: https://docs.influxdata.com/influxdb/v1.8/query_language/functions/#advanced-syntax-7
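What the nested query computes can be approximated in a few lines of Python: bucket the points by time, take the mean per bucket, fill empty buckets with 0, then keep a running total. This is a sketch under simplifying assumptions (integer timestamps in minutes, buckets aligned to the start of the range, made-up sample points), and cumulative_bucket_means is a hypothetical name.

```python
from itertools import accumulate
from statistics import mean


def cumulative_bucket_means(points, start, end, width):
    """Imitate CUMULATIVE_SUM(MEAN(x)) ... GROUP BY time(width) fill(0).

    points: iterable of (t, value); buckets cover [start, end) in steps of width.
    """
    buckets = {b: [] for b in range(start, end, width)}
    for t, value in points:
        if start <= t < end:
            buckets[start + (t - start) // width * width].append(value)
    # fill(0): an empty bucket contributes 0 instead of being skipped.
    means = [mean(vals) if vals else 0 for vals in buckets.values()]
    return list(accumulate(means))


# Hypothetical rainfall points: (minute, value), three one-hour buckets.
POINTS = [(10, 2), (50, 4), (130, 6)]

print(cumulative_bucket_means(POINTS, start=0, end=180, width=60))
```

Because the running total is built only from the buckets inside the selected range, it always starts from zero for that range. Swapping mean for sum in the sketch corresponds to nesting SUM() instead of MEAN() in the query.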

How to query difference between series with different tags in one measurement in InfluxDB?

For example, temperature data from two different locations is stored in one measurement, like:
time temperature location
---- ---- ----
1 15 A
2 20 B
3 17 A
4 18 B
How can I calculate the difference between the temperatures at A and B in a query?
i.e. I'm expecting something like "SELECT a.temperature - b.temperature FROM measurement WHERE location = 'A' as a WHERE location = 'B' as b"
Thanks
You can use subqueries for that, e.g. (here the field is named gauge; substitute your own field name, such as temperature):
SELECT last("temp_a") - last("temp_b") FROM (
SELECT "gauge" AS "temp_a" FROM "measurement" WHERE ("location"='A') AND $timeFilter
),(
SELECT "gauge" AS "temp_b" FROM "measurement" WHERE ("location"='B') AND $timeFilter
) GROUP BY time($__interval) fill(null)
Influx query language does not support functions across measurements.
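The idea behind the subquery approach, taking the last value of each location and subtracting, can be checked in Python on the sample rows from the question. This sketch treats the whole data set as a single time interval (the query above does the same per $__interval), and the helper names are hypothetical.

```python
# Rows from the question: (time, temperature, location).
ROWS = [(1, 15, "A"), (2, 20, "B"), (3, 17, "A"), (4, 18, "B")]


def last_value(rows, location):
    """Temperature of the newest row for one location, or None if there is none."""
    matching = [(t, temp) for t, temp, loc in rows if loc == location]
    if not matching:
        return None
    return max(matching)[1]  # tuples compare by time first


def last_difference(rows, loc_a, loc_b):
    """last(temp_a) - last(temp_b); None when either side has no data."""
    a = last_value(rows, loc_a)
    b = last_value(rows, loc_b)
    if a is None or b is None:
        return None
    return a - b


print(last_difference(ROWS, "A", "B"))
```

Returning None when one side is missing mirrors fill(null): an interval with data for only one location has no meaningful difference.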

Resources