As far as I know there is no way to execute a nested query in a Kapacitor TICKscript, so I'm looking for some other way to achieve the same result as I get from this InfluxQL query:
select count(*) from (SELECT sum("value") FROM "measurement"."autogen"."consumption" WHERE (time > now() -5d -1h AND time <= now() - 5d) GROUP BY time(60m, 1ms), "param1", "param2", "param3")
The result of that query is a single point with one value, which contains the total number of rows returned by the nested query, for example 50.
I wrote something similar in TICKscript:
var cron = '0 59 * * * * *'

var size_1 = batch
    |query('''
        SELECT sum(value) FROM "measurement"."autogen"."consumption"
    ''')
        .period(1h)
        .offset(5d - 1m - 1ms)
        .groupBy(time(60m, 1ms), 'param1', 'param2', 'param3')
        .fill(1)
        .cron(cron)
    |count('sum')
    |log()

size_1
    |alert()
        .kafka()
        .kafkaTopic('influx')
But I don't get a single value in the output; instead I still have multiple points grouped by the three parameters from the query ('param1', 'param2', 'param3'), and the count is applied only within each set of params. A fragment of the Kapacitor log():
2021-04-21T03:51:00.220+02:00
Kapacitor Point
3 Tags
param1: 2.8.0
param2: 0015474_7
param3: SUPPLEMENTARY
1 Fields
count: 2
2021-04-21T03:51:00.221+02:00
Kapacitor Point
3 Tags
param1: 2.8.0
param2: PW0001_1
param3: SUPPLEMENTARY
1 Fields
count: 2
etc.
How can I get a single count() result in a Kapacitor TICKscript, like in the InfluxQL query?
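To make the discrepancy concrete, here is a minimal Python sketch (with entirely made-up data and tag values) contrasting what the TICKscript above computes — a count inside each (param1, param2, param3) group — with what the InfluxQL outer query computes, a count across all rows:

```python
# Illustrative only: why counting after grouping yields one count per tag
# set, while the nested InfluxQL query counts every row once.
from collections import defaultdict

# Stand-ins for the hourly sums returned by the inner query,
# as (param1, param2, param3, hourly_sum) tuples.
points = [
    ("2.8.0", "0015474_7", "SUPPLEMENTARY", 10.0),
    ("2.8.0", "0015474_7", "SUPPLEMENTARY", 12.0),
    ("2.8.0", "PW0001_1", "SUPPLEMENTARY", 7.0),
    ("2.8.0", "PW0001_1", "SUPPLEMENTARY", 9.0),
]

# What |count('sum') does after .groupBy(...): count within each group.
per_group = defaultdict(int)
for p1, p2, p3, _ in points:
    per_group[(p1, p2, p3)] += 1
print(dict(per_group))  # one small count per tag set

# What the InfluxQL "select count(*) from (...)" does: count all rows.
total = len(points)
print(total)
```

The fix therefore has to merge the groups back into one before counting, rather than counting inside each group.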
I have some telemetry data that looks like this:
> select * from connected_users LIMIT 100
name: connected_users
time event_id value
---- -------- -----
1605485019101319658 13759 2
1605485299566004832 13759 0
1605490011182967926 13760 4
1605490171409428188 13760 0
1605490207031398071 13760 7
1605490246151204709 13760 0
1605491054403726308 13761 1
1605491403050521520 13761 0
1605491433407347399 13762 2
1605491527773331816 13762 3
1605492020976088377 12884 1
1605492219827002782 13761 1
1605492613984911844 13763 1
1605492806683323942 13763 0
...
These writes only occur when something changes on the event (i.e. they are not at fixed intervals). I want to write a query that gives me a per-minute cumulative sum of the current "value" across all the event_ids. However, because I can't guarantee that a new value was written in the preceding 60 seconds, I use LAST to get whatever was last set per event_id.
So far I got to:
SELECT SUM(*) FROM
(SELECT LAST("value") FROM
"connected_users" WHERE
time <= now() AND
time >= now() - 3h
GROUP BY time(1m), "event_id" fill(previous))
GROUP BY time(1m)
But this seems to give me a much lower outer "value" than expected, as well as a lot of duplicate time entries (and thus a lot of duplicate entries in the output data).
I can see that the inner query is correct, because if I just run it in isolation I can stack the output data in Grafana and manually see the correct total value in the graph. However, I want to have a single series rather than a grouped set of series, and I can't wrap my head around how to transform the data to do that.
EDIT: To give more context:
This is the inner query (Grafana, hence $timeFilter):
SELECT last("value") FROM "connected_users" WHERE $timeFilter GROUP BY time(1m), "event_id" fill(previous)
This produces a chart which I can stack, and it is correct.
If I then wrap that inner query in a SUM and GROUP BY time(1m) again, I can isolate a single series:
SELECT SUM(*) FROM (SELECT last("value") FROM "connected_users" WHERE $timeFilter AND ("event_id" = '9970') GROUP BY time(1m), "event_id" fill(previous)) GROUP BY time(1m)
However, if I remove the AND and attempt to SUM all series values, I end up with both a jumbled mess (presumably because there are duplicate/overlapping time values?) and a lower max value than expected (expecting 18, got 8):
SELECT SUM(*) FROM (SELECT last("value") FROM "connected_users" WHERE $timeFilter GROUP BY time(1m), "event_id" fill(previous)) GROUP BY time(1m)
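The intended per-minute carry-forward-then-sum logic can be checked with a short Python sketch. The event IDs and values below are invented, and timestamps are simplified to minute indices; the point is only to show what LAST(...) GROUP BY time(1m), "event_id" fill(previous) followed by an outer SUM per minute should produce:

```python
# Emulate: inner query takes the last written value per (minute, event_id)
# and carries it forward with fill(previous); outer query sums per minute.
writes = {  # event_id -> list of (minute_index, value), hypothetical data
    "13759": [(0, 2), (3, 0)],
    "13760": [(1, 4), (2, 7)],
}

minutes = range(0, 5)
per_event_last = {}
for event, series in writes.items():
    lookup = dict(series)      # last write within a minute wins
    carried = None
    filled = {}
    for m in minutes:
        if m in lookup:
            carried = lookup[m]
        filled[m] = carried    # fill(previous): carry last value forward
    per_event_last[event] = filled

# Outer SUM(*) GROUP BY time(1m): add the carried values across events.
totals = {m: sum(v[m] for v in per_event_last.values() if v[m] is not None)
          for m in minutes}
print(totals)
```

If the real query returns lower totals than this style of reconstruction, the usual suspect is that fill(previous) has no earlier point to carry inside the queried time range, so early buckets for some event_ids stay empty.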
I have the following query:
SELECT sum("field1" * "field2") FROM "my_db"."autogen"."data" GROUP BY time(1d) FILL(null)
In short, I would like to perform the sum operation on the product of two fields, field1 and field2.
The above query returns an error: expected field argument in sum().
Is this kind of thing at all possible in InfluxDB?
Here's an idea: try a subquery.
Note: I don't have an editor right now, so it might give an error too.
SELECT SUM("product") FROM (
    SELECT "field1" * "field2" AS "product" FROM "my_db"."autogen"."data"
) GROUP BY time(1d) FILL(null)
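The logic of that subquery — multiply per row first, then sum per day — can be sketched in Python with invented rows, just to make the order of operations explicit:

```python
# Illustrative: inner query computes field1 * field2 per row; the outer
# query then sums those products per day.
from collections import defaultdict
from datetime import datetime

# Hypothetical rows: (timestamp, field1, field2)
rows = [
    (datetime(2020, 1, 1, 9), 2.0, 3.0),
    (datetime(2020, 1, 1, 15), 4.0, 0.5),
    (datetime(2020, 1, 2, 10), 1.0, 5.0),
]

daily = defaultdict(float)
for ts, f1, f2 in rows:
    daily[ts.date()] += f1 * f2   # SUM("product") GROUP BY time(1d)

print(dict(daily))
```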
I’m new to InfluxDB. I’m using it to store ntopng timeseries data.
ntopng writes a measurement called asn:traffic that stores how many bytes were sent and received for an ASN.
> show tag keys from "asn:traffic"
name: asn:traffic
tagKey
------
asn
ifid
> show field keys from "asn:traffic"
name: asn:traffic
fieldKey fieldType
-------- ---------
bytes_rcvd float
bytes_sent float
>
I can run a query to see the data rate in bps for a specific ASN:
> SELECT non_negative_derivative(mean("bytes_rcvd"), 1s) * 8 FROM "asn:traffic" WHERE "asn" = '2906' AND time >= now() - 12h GROUP BY time(30s) fill(none)
name: asn:traffic
time non_negative_derivative
---- -----------------------
1550294640000000000 30383200
1550294700000000000 35639600
...
...
...
>
However, what I would like to do is create a query that I can use to return the top N ASNs by data rate and plot that on a Grafana graph. Sort of like this example that is using ELK.
I've tried a few variants from posts here and elsewhere, but I haven't been able to get what I'm after. For example, this query I think gets me closer to where I want to be, but there are no values in asn:
> select top(bps,asn,10) from (SELECT non_negative_derivative(mean(bytes_rcvd), 1s) * 8 as bps FROM "asn:traffic" WHERE time >= now() - 12h GROUP BY time(30s) fill(none))
name: asn:traffic
time top asn
---- --- ---
1550299860000000000 853572800
1550301660000000000 1197327200
1550301720000000000 1666883866.6666667
1550310780000000000 674889600
1550329320000000000 20979431866.666668
1550332740000000000 707015600
1550335920000000000 2066646533.3333333
1550336820000000000 618554933.3333334
1550339280000000000 669084933.3333334
1550340300000000000 704147333.3333334
>
I then thought that perhaps the subquery needs to select asn as well; however, that produces an error about mixing queries:
> select top(bps,asn,10) from (SELECT asn, non_negative_derivative(mean(bytes_rcvd), 1s) * 8 as bps FROM "asn:traffic" WHERE time >= now() - 12h GROUP BY time(30s) fill(none))
ERR: mixing aggregate and non-aggregate queries is not supported
>
Anyone have any thoughts on a solution?
EDIT 1
Per the suggestion by George Shuklin, modifying the query to include asn in GROUP BY displays the ASN in the CLI output, but that doesn't translate to Grafana. I'm expecting a stacked graph where each layer is one of the top 10 asn results.
Try making ASN a tag; then you can use GROUP BY time(30s), 'asn', and that tag will be available in the outer query.
I have some data stored in InfluxDB. Here are two queries with their results:
http://influxdb.server/query?u=test&p=test&db=sensors&q=SELECT * FROM /^dwd\..*\.zehn-minuten\.niederschlag/ GROUP BY * ORDER BY DESC LIMIT 1
{"results":[{"statement_id":0,"series":[
{"name":"dwd.str.schnarrenberg.zehn-minuten.niederschlag","columns":["time","QN","RWS_10","RWS_DAU_10","RWS_IND_10"],"values":[["2017-12-10T23:50:00Z",2,0,5,1]]},
{"name":"dwd.str.flughafen.zehn-minuten.niederschlag","columns":["time","QN","RWS_10","RWS_DAU_10","RWS_IND_10"],"values":[["2017-12-10T23:50:00Z",2,0,0,0]]}
]}]}
Both results contain a segment of the following form:
"values":[[..., ..., ...]]
What is the reason for nesting the values in an array?
Also, why are the results wrapped in an array?
"results":[{
I am a newbie to InfluxDB. I just started to read the Influx documentation.
I can't seem to get the equivalent of 'select count(*) from table' to work in InfluxDB.
I have a measurement called cart:
time status cartid
1456116106077429261 0 A
1456116106090573178 0 B
1456116106095765618 0 C
1456116106101532429 0 D
but when I try to do
select count(cartid) from cart
I get the error
ERR: statement must have at least one field in select clause
I suppose cartid is a tag rather than a field? count() currently can't be used on tag or time columns. So if your status is a non-tag column (i.e. a field), do the count on that.
EDIT:
Reference
This works as long as no field or tag exists with the name count:
SELECT SUM(count) FROM (SELECT *,count::INTEGER FROM MyMeasurement GROUP BY count FILL(1))
If it does, use some other name for the count field. This works by first selecting all entries, including an unpopulated field (count); grouping by that unpopulated field does nothing, but it allows us to use the fill operator to assign 1 to each entry's count. Then we select the sum of the count fields in an outer query. The result should look like this:
name: MyMeasurement
----------------
time sum
0 47799
It's a bit hacky, but it's the only way to guarantee a count of all entries when no single field is present in every entry.
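The trick amounts to attaching a constant 1 to every row and then summing those ones, which a Python sketch (with made-up rows modeled on the cart measurement) makes obvious:

```python
# Illustrative: FILL(1) on an unpopulated "count" field gives every row
# count = 1, so SUM(count) in the outer query is simply the row count.
rows = [
    {"time": 1456116106077429261, "status": 0, "cartid": "A"},
    {"time": 1456116106090573178, "status": 0, "cartid": "B"},
    {"time": 1456116106095765618, "status": 0, "cartid": "C"},
]

for row in rows:
    row.setdefault("count", 1)   # the FILL(1) step

total = sum(row["count"] for row in rows)   # the outer SUM(count)
print(total)
```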