Insert from InfluxDB subquery, rows missing and time set to 0

I am inserting rows into a new measurement from a subquery. The subquery returns 2 rows, but only one is actually inserted into the new measurement. In addition, the time is set to 0, which means I had to extend the duration of RETENTION POLICY "autogen" to reach back before 1.1.1970.
This is the content of StoreSales:
INSERT StoreSales,StoreNumber="1",EnteredBy="Jake",Month=201906 value=1000
INSERT StoreSales,StoreNumber="1",EnteredBy="Jill",Month=201906 value=2000
INSERT StoreSales,StoreNumber="2",EnteredBy="Jill",Month=201905 value=2000
INSERT StoreSales,EnteredBy="Ann",Month=201906 value=1000
Set duration to before Unix epoch:
ALTER RETENTION POLICY "autogen" on "DT" duration 450000h0m0s
ALTER RETENTION POLICY "autogen" on "DT" shard duration 450000h0m0s
This is the insert I am trying to use:
SELECT * INTO "StoreSalesByStoreByMonth" FROM ( SELECT Sum(value) FROM "StoreSales" WHERE StoreNumber !='' GROUP BY StoreNumber, Month)
The result is:
time written
---- -------
0 2
But StoreSalesByStoreByMonth only includes one record:
SELECT * FROM "StoreSalesByStoreByMonth"
name: StoreSalesByStoreByMonth
time Month StoreNumber sum
---- ----- ----------- ---
0 201906 "1" 3000
The record for Month=201905, StoreNumber="2" is missing.
There is one record in StoreSales without a StoreNumber on purpose, to verify that the group by excludes the records without that tag.
How can I get all the records from the subquery inserted?
Can I set the time in the query somewhere, so I don't need to set the RETENTION POLICY "autogen" to before 1.1.1970?
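One thing worth checking (a hedged suggestion, not a confirmed fix): an INTO query converts tags from the source data into fields in the destination unless the query groups by those tags. Without tags, both result rows would land in the same series with the same timestamp 0, and a point with the same series and timestamp overwrites the earlier one. A minimal sketch that repeats the GROUP BY on the outer query so that StoreNumber and Month stay tags; it has not been run against the data above:

```sql
-- The outer GROUP BY is meant to keep StoreNumber and Month as tags
-- in the destination, so each store/month pair is its own series.
SELECT "sum" INTO "StoreSalesByStoreByMonth" FROM (
  SELECT SUM(value) FROM "StoreSales"
  WHERE StoreNumber != ''
  GROUP BY StoreNumber, Month
)
GROUP BY StoreNumber, Month
```

This sketch only addresses the missing row; it does not change the timestamp that is written.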

Related

Get the value timestamp when grouping by time in InfluxDB

I am trying to get the MAX/MIN values over an interval of time, and I would like to get the corresponding timestamp of each value.
If I run: SELECT max(value) FROM data WHERE time > 1549034249000000000 and time < 1550157449000000000 GROUP BY time(10s)
I receive the timestamp of the start of each interval instead of the timestamp of the max(value) point.
What alternatives are there for getting the max(value) of an interval together with its timestamp?
In SQL it is possible to execute a query like: SELECT value, DENSE_RANK() OVER (PARTITION BY time ORDER BY variableName DESC) AS Rank FROM tableName. Is it not possible to run something like that in InfluxDB?
You cannot. When you use GROUP BY time(), you always get the beginning timestamp of each interval.
The alternative is not to use GROUP BY time().
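A minimal sketch of that alternative: a query with a single selector function and no GROUP BY time() clause returns the timestamp of the selected point itself. The bounds below are an assumption, simply the first 10 seconds of the range in the question; you would issue one such query per interval.

```sql
-- Single selector, no GROUP BY time(): the returned time is the
-- timestamp of the point holding the maximum, not the interval start.
SELECT MAX(value) FROM data
WHERE time > 1549034249000000000 AND time < 1549034259000000000
```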

Get last stored value at a given time in influxDB

I'm storing the values of my temperature sensors in an InfluxDB database and I'm looking for a particular query.
Each sensor sends its data when the temperature changes by more than a certain threshold, which means the sensors do not all send data at the same time.
So sensor 1, namely S1 will send value 1 (S1_v1) at instant t1. Then S2 will send S2_v2 at t2, S3 sends S3_v3 at t3, etc.
I'd like to have the values of all the sensors at a given time t so that at t2, the returned value of S1 will be S1_v1 (the last stored one).
How can I do that with InfluxDB, please? I hope my request is clear enough.
Thank you very much.
You can store all your sensor data in one measurement.
Then use a tag called name to store each sensor's name.
Example:
> select * from sensors;
name: sensors
time name value
---- ---- -----
1547100000000000000 s1 500
1547200000000000000 s2 600
1547300000000000000 s3 700
1548000000000000000 s1 900
1548000000000000000 s2 800
1548000000000000000 s3 999
To retrieve the latest stored value of each sensor within a given time range, you can do the following:
SELECT * FROM sensors
WHERE time >= 1547000000000000000 and time <= 1547300000000000000
GROUP BY "name" order by desc limit 1;
Output:
name: sensors
tags: name=s3
time value
---- -----
1547300000000000000 700
name: sensors
tags: name=s2
time value
---- -----
1547200000000000000 600
name: sensors
tags: name=s1
time value
---- -----
1547100000000000000 500
The query above groups the points that match your time filter into one series per sensor name. ORDER BY DESC then sorts each series in descending time order, so the first row is always the point with the latest timestamp. LIMIT 1 asks the query engine to return only that first row of each series.
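If the goal is the last value at or before a single instant t rather than within a range, a variation worth trying is to drop the lower time bound and use the LAST() selector. This is only a sketch against the sample sensors measurement above; the timestamp is the second one from the sample data.

```sql
-- Latest point per sensor at or before the given instant.
-- No lower bound, so sensors that last reported long ago are included.
SELECT LAST(value) FROM sensors
WHERE time <= 1547200000000000000
GROUP BY "name"
```

A sensor with no point at or before that instant simply returns no row.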

InfluxDB - how to calculate the sum of per-second differences at the minute level

I want to query the sum per minute from the result obtained from another query that calculates the difference between subsequent values.
select sum(ph1), sum(ph2), sum(ph3) from (select
non_negative_difference(day_chan1) as ph1,
non_negative_difference(day_chan2) as ph2,
non_negative_difference(day_chan3) as ph3
from electricity)
group by time(1m) tz('Europe/Dublin')
For example, if I get the following from the subquery
time ph1 ph2 ph3
---- --- --- ---
2017-04-02T14:40:38Z 0 0 2
2017-04-02T14:41:38Z 1 1 1
2017-04-02T14:41:39Z 0 0 2
2017-04-02T14:42:38Z 1 1 1
2017-04-02T14:42:39Z 0 1 2
I want to sum them up into
time ph1 ph2 ph3
---- --- --- ---
2017-04-02T14:40:00Z 0 0 2
2017-04-02T14:41:00Z 1 1 3
2017-04-02T14:42:00Z 1 2 3
but what I get from the query is the error "aggregate function required inside the call to non_negative_difference", even though the subquery on its own returns results.
I was also looking for this for a long time, and I finally found the solution:
select sum(ph1), sum(ph2), sum(ph3) from (select
This part is right. Now we add an aggregate function inside each non_negative_difference call (as the error message indicates). I assume you want to sum everything.
non_negative_difference(sum(day_chan1)) as ph1,
non_negative_difference(sum(day_chan2)) as ph2,
non_negative_difference(sum(day_chan3)) as ph3
from electricity
If we don't add the following line, the inner query will also be grouped by 1m. We don't want that: if a value is missing, the way Influx calculates the sum results in a very large difference. So we group the subquery by the smallest interval you have (e.g. 1s):
group by time(1s))
Finally, group the outer query by the interval over which you want the values added together:
group by time(1m) tz('Europe/Dublin')
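Assembled into one statement, the fragments above would read roughly as follows. This is just the pieces from this answer joined together, with the third outer column written as sum(ph3); the 1s inner interval is the answer's example and should match your actual sampling interval.

```sql
SELECT sum(ph1), sum(ph2), sum(ph3) FROM (
  SELECT
    non_negative_difference(sum(day_chan1)) AS ph1,
    non_negative_difference(sum(day_chan2)) AS ph2,
    non_negative_difference(sum(day_chan3)) AS ph3
  FROM electricity
  -- inner grouping: smallest interval present in the data
  GROUP BY time(1s)
)
-- outer grouping: interval over which the differences are summed
GROUP BY time(1m) tz('Europe/Dublin')
```

A time range in the WHERE clause is omitted here only because the answer does not give one.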

Influxdb: How to get count of number of results in a group by query

Is there any way I can get the total number of results / points / records in a group by query result?
> SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m)
name: h2o_feet
--------------
time count
2015-08-18T00:00:00Z 2
2015-08-18T00:12:00Z 2
2015-08-18T00:24:00Z 2
I expect a count of 3 in this case. Even though I can calculate the number of results from the time period and the interval (12m) here, I would like to know whether it is possible to get it with a query to the database.
You can nest aggregates in a single query by using a subquery.
Using your example, wrap it in a select count(*) from (<inner query>):
> SELECT COUNT(*) FROM (SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m))
name: h2o_feet
--------------
time count_count
1970-01-01T00:00:00Z 3
However, if the grouping returns empty rows, they will not be counted.
For example, counting over the table below will result in a count of 2 rather than 3:
name: h2o_feet
--------------
time count
2015-08-18T00:00:00Z 2
2015-08-18T00:12:00Z
2015-08-18T00:24:00Z 2
To include empty rows in your count, add fill(1) to the inner query like this:
> SELECT COUNT(*) FROM (SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m) fill(1))
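You will need to do some manual work. Run the query from the shell and count the output lines:
$ influx -execute "select * from measurement_name" -database="db_name" | wc -l
The line count will be 4 more than the actual number of rows, because the CLI output also contains header lines.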
Here is an example,
$ influx -execute "select * from yprices" -database="vehicles" | wc -l
5
$ influx -execute "select * from yprices" -database="vehicles"
name: yprices
time price
---- -----
1493626629063286219 2
So now you can see why you need to subtract 4 from the line count.
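If parsing CLI output feels fragile, counting on the server side may be simpler. A minimal sketch, assuming price is a field of yprices as the example output above suggests; COUNT() returns the number of non-null values of that field.

```sql
-- Number of non-null "price" values in the measurement.
SELECT COUNT("price") FROM "yprices"
```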

How to query how many metrics in a period with Influxdb?

I want to know how many events we send to InfluxDB for a given period. If I use the following query SELECT COUNT(value) FROM /./ WHERE time > now() - 1h GROUP BY time(10m), I get that grouped for each metric but I want the total for all metrics.
If I use SELECT COUNT(*) FROM /./ WHERE time > now() - 1h GROUP BY time(10m), I get an error:
Server returned error: expected field argument in count()
The COUNT function takes one and only one field key as an argument. If you have field keys that are not named value you will have to run a separate query to count them.
Alternately, you can run them together like:
SELECT COUNT(value), COUNT(otherfield), COUNT(anotherfield) FROM /./ WHERE time > now() - 1h GROUP BY time(10m)
