How to get a query variable on InfluxDB 2.0 dashboard? - influxdb

I read the documentation https://docs.influxdata.com/influxdb/v2.0/visualize-data/variables/
I thought, great, this will be a piece of cake.
I took a look at an existing query variable named bucket:
buckets()
|> filter(fn: (r) => r.name !~ /^_/)
|> rename(columns: {name: "_value"})
|> keep(columns: ["_value"])
It returns this data:
#group,false,false,false
#datatype,string,long,string
#default,_result,,
,result,table,_value
,,0,pool
,,0,test
The bucket variable works and I can refer to it as v.bucket in the cell queries of any dashboard.
Building on this example I craft the following query:
from(bucket: "pool")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r._measurement == "minerstat")
|> keep(columns: ["account"])
|> distinct(column: "account")
|> keep(columns: ["_value"])
That returns this data:
#group,false,false,false
#datatype,string,long,string
#default,_result,,
,result,table,_value
,,0,0x04ff4e0c05c0feacccf93251c52a78639e0abef4
,,0,0x201f1a58f31801dcd09dc75616fa40e07a70467f
,,0,0x80475710b08ef41f5361e07ad5a815eb3b11ed7b
,,0,0xa68a71f0529a864319082c2475cb4e495a5580fd
And I save it as a query variable with the name account.
Then I use it in a dashboard cell query like this:
from(bucket: "pool")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "minerstat")
|> filter(fn: (r) => r["account"] == v.account)
|> filter(fn: (r) => r["_field"] == "currentHashrate" or r["_field"] == "hashrate")
|> aggregateWindow(every: v.windowPeriod, fn: last, createEmpty: false)
|> yield(name: "last")
But this returns no data. And the dropdown menu for the account variable on the dashboard view is empty.
If I replace v.account above with one of the values returned by the query behind the variable:
from(bucket: "pool")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "minerstat")
|> filter(fn: (r) => r["account"] == "0x04ff4e0c05c0feacccf93251c52a78639e0abef4")
|> filter(fn: (r) => r["_field"] == "currentHashrate" or r["_field"] == "hashrate")
|> aggregateWindow(every: v.windowPeriod, fn: last, createEmpty: false)
|> yield(name: "last")
That works as intended and displays a nice graph.
What am I missing here?
SOLUTION: you cannot use variables inside the definition of a variable.
I replaced
start: v.timeRangeStart, stop: v.timeRangeStop
with
start: -1d
in the variable definition:
from(bucket: "pool")
|> range(start: -1d)
|> filter(fn: (r) => r._measurement == "minerstat")
|> keep(columns: ["account"])
|> distinct(column: "account")
|> keep(columns: ["_value"])

I don't think you can use variables within variables, so things like v.timeRangeStart that you can use in a dashboard query can't be used to define another dashboard variable.
You can use duration literals though, like -5d or -2h, in your range() call.

Related

Flux Query : iterate filter from a list

I am looking for a way to filter by iterating over a list. Is that possible?
The table tophdd contains 2 entries, but I can't filter on these two entries with a regex.
tophdd = from(bucket: v.bucket)
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r._measurement == "HDDID")
|> filter(fn: (r) => r.serial == "${Serial}")
|> filter(fn: (r) => r._field == "HDDID_IOPS")
|> highestMax(n: 2, groupColumns: ["HDDID"])
|> keep(columns: ["HDDID"])

from(bucket: v.bucket)
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r._measurement == "HDDID")
|> filter(fn: (r) => r.serial == "${Serial}")
|> filter(fn: (r) => r._field == "HDDID_IOPS")
|> filter(fn: (r) => r.HDDID = =~ /"${tophdd}"/)
|> aggregateWindow(column: "_value", every: v.windowPeriod, fn: mean)
I want to filter like this:
filter(fn: (r) => r.HDDID = =~ /"${tophdd}"/)
Is it possible to filter from a list?
Many thanks,
Looks like you just had a duplicate equal sign ( = = ) there. Try to update the query as follows:
filter(fn: (r) => r.HDDID =~ /"${tophdd}"/)
You can extract the column values into an array using findColumn, and then use the contains function in the filter, e.g.:
tophdd = from(bucket: v.bucket)
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> ...
|> keep(columns: ["HDDID" ])
|> findColumn(fn: (key) => true, column: "HDDID")
from(bucket: v.bucket)
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r._measurement == "HDDID")
|> ...
|> filter(fn: (r) => contains(set: tophdd, value: r.HDDID))
|> aggregateWindow(column: "_value", every: v.windowPeriod, fn: mean)
Please note that the performance may be suboptimal as contains() is not a pushdown op.

InfluxDB Flux Query group by substring

I am looking for a way to group by a substring/regex of a column.
I searched the documentation https://docs.influxdata.com/ without success.
The column "proc" is composed as follows:
CPU-010.CORE010-00
CPU-010.CORE010-01
CPU-010.CORE010-02
CPU-010.CORE010-03
CPU-010.CORE010-04
CPU-010.CORE010-05
...
CPU-020.CORE020-00
CPU-020.CORE020-01
CPU-020.CORE020-02
CPU-020.CORE020-03
CPU-020.CORE020-04
...
CPU-110.CORE110-00
CPU-110.CORE110-01
CPU-110.CORE110-02
CPU-110.CORE110-03
CPU-110.CORE110-04
...
CPU-120.CORE120-00
CPU-120.CORE120-01
CPU-120.CORE120-02
CPU-120.CORE120-03
etc..
The imported CSV is composed as follows:
#datatype measurement,tag,tag,double,dateTime:number
Processors,srv,proc,usage,time
Processors,srv1,CPU-010.CORE010-00,52,1671231960
Processors,srv1,CPU-010.CORE010-00,50,1671232020
Processors,srv1,CPU-010.CORE010-00,49,1671232080
Processors,srv1,CPU-010.CORE010-00,50,1671232140
Processors,srv1,CPU-010.CORE010-00,48,1671232200
Processors,srv1,CPU-010.CORE010-00,53,1671232260
...
Processors,srv1,CPU-020.CORE020-00,52,1671231960
Processors,srv1,CPU-020.CORE020-00,50,1671232020
Processors,srv1,CPU-020.CORE020-00,49,1671232080
Processors,srv1,CPU-020.CORE020-00,50,1671232140
Processors,srv1,CPU-020.CORE020-00,48,1671232200
Processors,srv1,CPU-020.CORE020-00,53,1671232260
...
I tried this query without success:
from(bucket: v.bucket)
|> range(start: v.timeRangeStart, stop:v.timeRangeStop)
|> filter(fn: (r) => r._measurement == "Processors" and proc.p =~ /(CPU-[09].*)[.].*/)
|> group(columns: ["proc"])
|> aggregateWindow(every: v.windowPeriod, fn: mean)
I want to group as follows:
CPU-010
CPU-020
CPU-110
CPU-120
Etc..
Many thanks for any help.
I found the correct syntax with your sample code:
import "strings"
from(bucket: v.bucket)
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "Processors")
|> filter(fn: (r) => r["_field"] == "usage")
|> map(fn: (r) => ({r with newProc: strings.substring(v: r.proc, start: 0, end: 7)}))
|> group(columns: ["newProc"])
|> aggregateWindow(column: "_value", every: v.windowPeriod, fn: mean)
|> yield(name: "mean")
Many thanks for your help.
Since you are trying to group by the prefix of the column names, you could try following:
create a new column based on the prefix
group by the new column
Sample code is as follows:
import "strings"

from(bucket: v.bucket)
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r._measurement == "Processors")
|> map(fn: (r) => ({r with newProc: strings.substring(v: r.proc, start: 0, end: 7)}))
|> group(columns: ["newProc"])
|> aggregateWindow(every: v.windowPeriod, fn: mean)
Note: you can specify the column explicitly, as in |> aggregateWindow(column: "_value", every: v.windowPeriod, fn: mean). Before trying this, comment out this line to see the results before the aggregation, especially the column names.
See the documentation for the map, substring, and aggregateWindow functions for more details.

InfluxDB - Get daily max value

I have data with hydrological measurements.
I want to get the daily max Water flow:
from(bucket: "API")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "hydro")
|> filter(fn: (r) => r["_field"] == "temperature")
|> filter(fn: (r) => r["loc"] == "XXX")
|> aggregateWindow(every: v.windowPeriod, fn: max, createEmpty: false)
|> yield(name: "max")
For some reason, for some days, this returns multiple measurements per day.
But not always.
How do I get only the max entry per day?
By default v.windowPeriod is derived from the dashboard's selected time range, so the windows can be shorter than a day. Set the every parameter of aggregateWindow to 1d instead:
from(bucket: "API")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "hydro")
|> filter(fn: (r) => r["_field"] == "temperature")
|> filter(fn: (r) => r["loc"] == "XXX")
|> aggregateWindow(every: 1d, fn: max, createEmpty: false)
|> yield(name: "max")
See the Flux documentation for more details.
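If the daily buckets should align with a local midnight rather than UTC, Flux also lets you set a timezone via the location option. A sketch (the timezone name here is an assumed example, adjust to your own):

```flux
import "timezone"

// Align the 1d windows with local midnight instead of UTC
// (timezone name is an assumed example).
option location = timezone.location(name: "Europe/Berlin")

from(bucket: "API")
    |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
    |> filter(fn: (r) => r["_measurement"] == "hydro")
    |> filter(fn: (r) => r["_field"] == "temperature")
    |> filter(fn: (r) => r["loc"] == "XXX")
    |> aggregateWindow(every: 1d, fn: max, createEmpty: false)
    |> yield(name: "max")
```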

Why is this Flux query faster?

The following query...
from(bucket: "mybucket")
|> range(start: 2021-10-01T21:16:00.000Z, stop: 2022-10-01T21:16:00.000Z)
|> filter(fn: (r) => r._measurement == "DATA")
|> filter(fn: (r) => r["code"] == "88820MS")
|> filter(fn: (r) => r._field == "airPressure")
|> aggregateWindow(every: 86400s, fn: mean, createEmpty: false)
|> sort(columns: ["_time"])
|> yield(name:"PUB_88820MS")
from(bucket: "mybucket")
|> range(start: 2021-10-01T21:16:00.000Z, stop: 2022-10-01T21:16:00.000Z)
|> filter(fn: (r) => r._measurement == "DATA")
|> filter(fn: (r) => r["code"] == "86900MS")
|> filter(fn: (r) => r._field == "airPressure")
|> aggregateWindow(every: 86400s, fn: mean, createEmpty: false)
|> sort(columns: ["_time"])
|> yield(name:"PUB_86900MS")
is much faster (like 100 ms vs. 3000 ms = factor 30x) than this equivalent query (on InfluxDB Cloud):
basis = from(bucket: "mybucket")
|> range(start: 2021-10-01T21:16:00.000Z, stop: 2022-10-01T21:16:00.000Z)
DATA = basis
|> filter(fn: (r) => r._measurement == "DATA")
DATA
|> filter(fn: (r) => r["code"] == "88820MS")
|> filter(fn: (r) => r._field == "airPressure")
|> aggregateWindow(every: 86400s, fn: mean, createEmpty: false)
|> sort(columns: ["_time"])
|> yield(name:"PUB_88820MS")
DATA
|> filter(fn: (r) => r["code"] == "86900MS")
|> filter(fn: (r) => r._field == "airPressure")
|> aggregateWindow(every: 86400s, fn: mean, createEmpty: false)
|> sort(columns: ["_time"])
|> yield(name:"PUB_86900MS")
What is the reason? I would have expected the second query to be at least as fast, since I naively assumed it was optimized (the common filtered data DATA is reused). Instead, this seems to confuse InfluxDB and stop pushdown processing, making the second query slower.
Why is that?
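The storage-level pushdown generally only applies to an unbroken chain of supported functions starting at from(); assigning an intermediate stream to a variable and reusing it forces the data to be materialized and processed in memory. A pattern that avoids the repetition while keeping the pushdown intact is to wrap the shared pipeline in a function, so each yield starts from its own from() chain. This is a sketch assuming the bucket, measurement, and field names from the question:

```flux
// Sketch: reuse the common filters via a function instead of a shared
// stream variable, so each invocation remains a full pushdown chain.
pressure = (code) =>
    from(bucket: "mybucket")
        |> range(start: 2021-10-01T21:16:00Z, stop: 2022-10-01T21:16:00Z)
        |> filter(fn: (r) => r._measurement == "DATA")
        |> filter(fn: (r) => r.code == code)
        |> filter(fn: (r) => r._field == "airPressure")
        |> aggregateWindow(every: 86400s, fn: mean, createEmpty: false)
        |> sort(columns: ["_time"])

pressure(code: "88820MS") |> yield(name: "PUB_88820MS")
pressure(code: "86900MS") |> yield(name: "PUB_86900MS")
```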

InfluxDB Flux - Getting last and first values as a column

I am trying to create two new columns with the first and last values using the last() and first() functions. However, it isn't working when I try to map the new columns. Here is the sample code below. Is this possible using Flux?
from(bucket: "bucket")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "price_info")
|> filter(fn: (r) => r["_field"] == "price")
|> map(fn: (r) => ({r with
open: last(float(v: r._value)),
close: first(float(v: r._value)),
})
I am not answering the question directly, but it might help.
I wanted to perform a calculation between first and last; here is my method. I have no idea whether it is the right way to do it.
The idea is to create 2 tables, one with only the first value and the other with only the last value, then to perform a union between both.
data = from(bucket: "bucket")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "plop")
l = data
|> last()
|> map(fn:(r) => ({ r with _time: time(v: "2011-01-01T01:01:01.0Z") }))
f = data
|> first()
|> map(fn:(r) => ({ r with _time: time(v: "2010-01-01T01:01:01.0Z") }))
union(tables: [f, l])
|> sort(columns: ["_time"])
|> difference()
For an unknown reason I have to set artificial dates, just to be able to sort the values and ensure that first comes before last.
Just a quick thank you. I was struggling with this as well. This is my code now:
First = from(bucket: "FirstBucket")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "mqtt_consumer")
|> filter(fn: (r) => r["topic"] == "Counters/Watermeter 1")
|> filter(fn: (r) => r["_field"] == "Counter")
|> first()
|> yield(name: "First")
Last = from(bucket: "FirstBucket")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "mqtt_consumer")
|> filter(fn: (r) => r["topic"] == "Counters/Watermeter 1")
|> filter(fn: (r) => r["_field"] == "Counter")
|> last()
|> yield(name: "Last")
union(tables: [First, Last])
|> difference()
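As a side note: for a monotonically increasing counter like a water meter, spread() (which returns max minus min) gives the same last-minus-first result in a single call. A sketch assuming the same bucket and tags as above:

```flux
// spread() = max - min, which equals last - first
// for a monotonically increasing counter.
from(bucket: "FirstBucket")
    |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
    |> filter(fn: (r) => r["_measurement"] == "mqtt_consumer")
    |> filter(fn: (r) => r["topic"] == "Counters/Watermeter 1")
    |> filter(fn: (r) => r["_field"] == "Counter")
    |> spread()
```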
The simple answer is to use join (you may also use the old join(); when using the "new" join, remember to import "join").
Example:
import "join"
balance_asset_gen = from(bucket: "telegraf")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "balance")
|> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)
balance_asset_raw = from(bucket: "telegraf")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "balance_raw")
|> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)
// In my example I merge two data sources but you may just use 1 data source
balances_merged = union(tables: [balance_asset_gen, balance_asset_raw])
|> group(columns:["_time"], mode:"by")
|> sum()
f = balances_merged |> first()
l = balances_merged |> last()
// Watch out, here we assume we work on single TABLE (we don't have groups/one group)
join.left(
left: f,
right: l,
on: (l, r) => l.my_tag == r.my_tag, // pick on what to merge e.g. l._measurement == r._measurement
as: (l, r) => ({
_time: r._time,
_start: l._time,
_stop: r._time,
_value: (r._value / l._value), // we can calculate new field
first_value: l._value,
last_value: r._value,
}),
)
|> yield()