Flux still has 2 tables after union and group function - influxdb

I am using Flux to sum 5-minute data per day for the past 7 days. I use aggregateWindow on 2 columns and then union to join them back together. I use sort and fill to bring the data for each timestamp together. This bit works fine. The issue I have is that the original table is still there.
Code
data = from(bucket: "home")
|> range(start: -7d)
|> filter(fn: (r) => r["_field"] == "BATTERY_CHARGE" or r["_field"] == "BATTERY_DISCHARGE")
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
|> map(fn: (r) => ({r with "Battery_Discharge": r.BATTERY_DISCHARGE / 12.0}))
|> map(fn: (r) => ({r with "Battery_Charge": r.BATTERY_CHARGE / 12.0}))
|> keep(columns: ["_time", "Battery_Charge", "Battery_Discharge"])
Discharge_Sum = data
|> aggregateWindow(column: "Battery_Discharge", every: 1d, offset: -8h, fn: sum)
Charge_Sum = data
|> aggregateWindow(column: "Battery_Charge", every: 1d, offset: -8h, fn: sum)
union(tables: [Discharge_Sum, Charge_Sum])
|> yield(name: "result")
|> group(columns: ["_time"], mode: "by")
|> sort(columns: ["Battery_Charge"])
|> fill(column: "Battery_Discharge", usePrevious: true)
|> tail(n: 1)
|> group()
|> tail(n: 7)
|> drop(fn: (column) => column =~ /^(_start|_stop|_measurement|location|region)/)
|> rename(columns: {Battery_Discharge: "Battery Discharge", Battery_Charge: "Battery Charge"})
The table data is as follows.
The bottom table is what I want to keep.
I have been unable to find anything that allows a table to be deleted. I have tried to filter out the first table value but after the union the filter functions only work on the output table.
Any ideas?
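One likely cause, offered as a sketch rather than a confirmed diagnosis: yield() passes its input through unchanged and also delivers it as a named result. Because yield(name: "result") sits directly after union(), the raw union output is returned as "result", and the end of the pipeline is returned again as the implicit default result. Assuming that is what produces the first table, moving yield() to the end of the pipeline should leave only the final table:

```flux
// Discharge_Sum and Charge_Sum are the streams defined in the question.
union(tables: [Discharge_Sum, Charge_Sum])
    |> group(columns: ["_time"], mode: "by")
    |> sort(columns: ["Battery_Charge"])
    |> fill(column: "Battery_Discharge", usePrevious: true)
    |> tail(n: 1)
    |> group()
    |> tail(n: 7)
    |> drop(fn: (column) => column =~ /^(_start|_stop|_measurement|location|region)/)
    |> rename(columns: {Battery_Discharge: "Battery Discharge", Battery_Charge: "Battery Charge"})
    // yield() moved to the end so that only the final table is returned.
    |> yield(name: "result")
```

Dropping the yield() call entirely should have the same effect, since a query that ends in a single stream is yielded implicitly.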

Related

influx query: how to get historical average

I am a SQL native struggling with Flux syntax (philosophy?) once again. Here is what I am trying to do: plot values of a certain measurement as a ratio of their historical average (say over the past month).
Here is as far as I have gotten:
from(bucket: "secret_bucket")
|> range(start: v.timeRangeStart, stop:v.timeRangeStop)
|> filter(fn: (r) => r._measurement == "pg_stat_statements_fw")
|> group(columns: ["query"])
|> aggregateWindow(every: v.windowPeriod, fn: sum)
|> timedMovingAverage(every: 1d, period: 30d)
I believe this produces an average over the past 30 days, for each day window. Now what I don't know how to do is divide the original data by these values in order to get the relative change, i.e. something like value(_time)/tma_value(_time).
Thanks to @Munun, I got the following code working. I made a few changes since my original post to make things work as I needed.
import "date"
t1 = from(bucket: "secret_bucket")
|> range(start: v.timeRangeStart, stop:v.timeRangeStop)
|> filter(fn: (r) => r._measurement == "pg_stat_statements_fw")
|> group(columns: ["query"])
|> aggregateWindow(every: 1h, fn: sum)
|> map(fn: (r) => ({r with window_value: float(v: r._value)}))
t2 = from(bucket: "secret_bucket")
|> range(start: date.sub(from: v.timeRangeStop, d: 45d), stop: v.timeRangeStop)
|> filter(fn: (r) => r._measurement == "pg_stat_statements_fw")
|> mean(column: "_value")
|> group()
|> map(fn: (r) => ({r with avg_value: r._value}))
join(tables: {t1: t1, t2: t2}, on: ["query"])
|> map(fn: (r) => ({r with _value: (r.window_value - r.avg_value)/ r.avg_value * 100.0 }))
|> keep(columns: ["_value", "_time", "query"])
Here are a few steps you could try:
Re-add _time after the aggregate function so that you have the same number of records as the original:
|> duplicate(column: "_stop", as: "_time")
Calculate the ratio between the two data sources via join and map.
The final Flux could be:
t1 = from(bucket: "secret_bucket")
|> range(start: v.timeRangeStart, stop:v.timeRangeStop)
|> filter(fn: (r) => r._measurement == "pg_stat_statements_fw")
|> group(columns: ["query"])
|> aggregateWindow(every: v.windowPeriod, fn: sum)
|> timedMovingAverage(every: 1d, period: 30d)
|> duplicate(column: "_stop", as: "_time")
t2 = from(bucket: "secret_bucket")
|> range(start: v.timeRangeStart, stop:v.timeRangeStop)
|> filter(fn: (r) => r._measurement == "pg_stat_statements_fw")
join(tables: {t1: t1, t2: t2}, on: ["hereIsTheTagName"])
|> map(fn: (r) => ({r with _value: r._value_t2 / r._value_t1 * 100.0}))

influxDB return 0 instead of 'no results'

We are using InfluxDB for statistics and dashboards. We love it! Blazing fast and easy to integrate. However, we are stuck when we launch new features.
We have the following Flux query against a massive database containing all "model_events", keyed by businessUUID. However, if the business has no car.created events at all, the query returns no results instead of a range of 0s. If it has at least one car.created event, even outside the queried range, it returns a range of 0s. Is it possible to always get the range, even if the _measurement has no values?
from(bucket: "_events")
|> range(start: 2022-09-01, stop: 2022-09-11)
|> filter(fn: (r) => r["_measurement"] == "car.created")
|> filter(fn: (r) => r["business_uuid"] == "055ade92-ecd9-47b1-bf85-c1381d0afd22")
|> aggregateWindow(every: 1d, fn: count, createEmpty: true)
|> yield(name: "amount")
BTW.... a bit new to InfluxDB...
Maybe you could create a dummy table and union() it like:
import "experimental/array"
// Placeholder row so that the result is never empty; adjust the columns to match your schema.
rows = [{_time: now(), _field: "someField", _value: 0}]
dummy = array.from(rows: rows)
data = from(bucket: "_events")
|> range(start: 2022-09-01, stop: 2022-09-11)
|> filter(fn: (r) => r["_measurement"] == "car.created")
|> filter(fn: (r) => r["business_uuid"] == "055ade92-ecd9-47b1-bf85-c1381d0afd22")
|> aggregateWindow(every: 1d, fn: count, createEmpty: true)
// Yield once, after the union, so that the query returns a single result.
union(tables: [dummy, data])
|> yield(name: "amount")

What does `|>` mean in TICKscript

Trying to write my first TICKscript to work out when two sensor values cross: if the outside temperature has changed from lower to higher than the inside temperature then I need to close the windows (and conversely).
Using the query builder in InfluxDB, I get this for the mean of the temperature values inside the house over the last 15 minutes:
from(bucket: "zigbee")
|> range(start: -15m, stop: now())
|> filter(fn: (r) => r["room"] == "Kitchen" or r["room"] == "DiningRoom" or r["room"] == "Bed3" or r["room"] == "Bed1")
|> filter(fn: (r) => r["_field"] == "temperature")
|> group(columns: ["_measurement"])
|> aggregateWindow(every: 15m, fn: mean, createEmpty: false)
|> yield(name:"inside")
The syntax |> appears to be undocumented -- can you provide a reference?
Replacing |> with | breaks it.
It seems that group and aggregateWindow do not commute?
Presumably because aggregateWindow is forced to choose a single representative _time value for each window?
I think the plan is to:
assign this to a stream,
copy and edit it to create a second stream shifted by 15 minutes,
create a second pair of streams for the outside temperature,
join all four streams and calculate a value indicating whether the inside and outside temperatures have crossed over.
Unless you have a better idea?
(Right now it's looking easier to import the data into SQL.)
Check InfluxDB Flux language documentation for |>:
InfluxDB Pipe-forward operator
According to your flux syntax query:
from(bucket: "zigbee")
|> range(start: -15m, stop: now())
|> filter(fn: (r) => r["room"] == "Kitchen" or r["room"] == "DiningRoom" or r["room"] == "Bed3" or r["room"] == "Bed1")
|> filter(fn: (r) => r["_field"] == "temperature")
|> group(columns: ["_measurement"])
|> aggregateWindow(every: 15m, fn: mean, createEmpty: false)
|> yield(name:"inside")
You take data from the bucket "zigbee".
The pipe-forward operator |> passes that data to the range function.
Another pipe-forward operator passes the output of range to the next filter function.
And so on.
So the data flows from one function to the next, with each function receiving the previous function's output.
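To make the operator concrete, here is a minimal sketch using in-memory rows instead of a bucket. The helper name aboveZero and the sample values are hypothetical, and the array package is assumed to be available in your Flux version. The `tables=<-` parameter is how a function declares that it accepts piped-forward input:

```flux
import "array"

// Hypothetical helper: `tables=<-` marks the parameter that receives
// whatever stands on the left-hand side of |>.
aboveZero = (tables=<-) => tables |> filter(fn: (r) => r._value > 0)

// The stream built by array.from() is piped into aboveZero(),
// which in turn pipes it into filter().
array.from(rows: [{_value: -1}, {_value: 2}])
    |> aboveZero()
```

Built-in functions such as range, filter and aggregateWindow receive their input stream the same way, which is why replacing |> with | breaks the query.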
You can group the data, but if I understand your intent correctly, the column to group by in your case is the "room" tag, so try:
|> group(columns: ["room"])
There is a difference between tag key values and measurement names. You should check the InfluxDB documentation to understand the data structure.
Flux data model documentation
It's not TICKscript; it's something to do with InfluxDB that might be called Flux.
mean = from(bucket: "zigbee")
|> range(start: -5d, stop: now())
|> filter(fn: (r) => r["room"] == "Outside")
|> filter(fn: (r) => r["_measurement"] == "temperature")
|> aggregateWindow(every: 30m, fn: mean, createEmpty: false)
shift = mean
|> timeShift(duration: -3h)
j = join(tables: {mean: mean, shift: shift}, on: ["_time"])
|> map(fn: (r) => ({ r with diff: float(v: r._value_mean) - float( v: r._value_shift) }))
// yield contains 1 table with the required columns, but the UI doesn't understand it.
// The UI requires 1 table for each series.
j |> map(fn: (r) => ({_time: r._time, _value: r._value_mean})) |> yield(name: "mean")
j |> map(fn: (r) => ({_time: r._time, _value: r._value_shift})) |> yield(name: "shift")
j |> map(fn: (r) => ({_time: r._time, _value: r.diff})) |> yield(name: "diff")
In TICKscript, |> "Declares a chaining method call which creates an instance of a new node and chains it to the node above it.", according to the official documentation.

Work with non-table values, aka "A is not subtractable"

I see many similar questions but couldn't find a good match.
If we define a query whose result ought to be a single value, is there a Flux way to store it as such? Example:
total = from(bucket: "xxx")
|> range(start: 0)
|> filter(fn: (r) => ...)
|> keep(columns: ["_value"])
|> sum()
consumed = from(bucket: "xxx")
|> range(start: 0)
|> filter(fn: (r) => ...)
|> keep(columns: ["_value"])
|> last()
total - consumed
Results in
invalid: error #18:1-18:40: [A] is not Subtractable
I can think of other ways to solve similar issues, but this example made me question whether flux actually supports easy working with single values or 1x1 relations.
Thanks
Not answering my original question but I want to provide the workaround I went with to solve this. I would still be interested in a more direct solution.
I've introduced a second column, then joined the two tables on that column:
total = from(bucket: "xxx")
|> range(start: 0)
|> filter(fn: (r) => ...)
|> keep(columns: ["_value"])
|> sum()
// Added:
|> map(fn: (r) => ({ age: "latest", _value:r._value }))
consumed = from(bucket: "xxx")
|> range(start: 0)
|> filter(fn: (r) => ...)
|> keep(columns: ["_value"])
|> last()
// Added:
|> map(fn: (r) => ({ age: "latest", _value:r._value }))
join(tables: {total: total, consumed: consumed}, on: ["age"])
|> map(fn: (r) => ({_value: r._value_total - r._value_consumed}))
In the query, total and consumed are tables, not scalar values. To extract and use scalar values, see "Extract scalar values in Flux".
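A minimal sketch of that approach, using findRecord() on in-memory sample rows. The variable names and values are made up for illustration, and the array package is assumed to be available in your Flux version; in the real query the two tables would come from from(bucket: "xxx") as in the question:

```flux
import "array"

// Hypothetical sample data standing in for the two queries in the question.
totalTable = array.from(rows: [{_value: 10.0}, {_value: 32.0}]) |> sum()
consumedTable = array.from(rows: [{_value: 5.0}, {_value: 12.0}]) |> last()

// findRecord() picks the first table whose group key satisfies fn
// and returns the row at index idx as a record.
totalRecord = totalTable |> findRecord(fn: (key) => true, idx: 0)
consumedRecord = consumedTable |> findRecord(fn: (key) => true, idx: 0)

// The _value properties are plain scalars, so they can be subtracted directly.
// Wrap the result in a table again so that the query has something to output.
array.from(rows: [{_value: totalRecord._value - consumedRecord._value}])
```

This avoids the helper column and the join, at the cost of leaving the stream-of-tables model for the intermediate values.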

How to merge (join) two tables in a specific way in Grafana using InfluxDB flux query?

Grafana: 7.1.5
InfluxDB: 1.8
I currently have three separate table panels in Grafana where the only difference between each query is the time range (Year, Month, Day). I would like to combine these three tables into one, where the measurement's value is separated into three columns (one for each time range).
More explicitly, what I have currently is:
Table1 Columns: [Tag1+Tag2, _value] where _value is the units this year
Table2 Columns: [Tag1+Tag2, _value] where _value is the units this month
Table3 Columns: [Tag1+Tag2, _value] where _value is the units this day
What I want is:
Table Columns: [Tag1+Tag2, Table1_value (Year), Table2_value (Month), Table3_value (Day)]
These are my queries:
import "date"
thisYearSoFar = date.truncate(t: now(), unit: 1y)
thisMonthSoFar = date.truncate(t: now(), unit: 1mo)
thisDaySoFar = date.truncate(t: now(), unit: 1d)
from(bucket: "consumption")
|> range(start: thisYearSoFar, stop: now())
|> filter(fn: (r) => r._measurement == "stuff" and r._field == "units" and r._value > 0)
|> group(columns: ["datacenter","tenant"])
|> sum(column: "_value")
|> map(fn: (r) => ({r with _value: r._value / 4.0}))
from(bucket: "consumption")
|> range(start: thisMonthSoFar, stop: now())
|> filter(fn: (r) => r._measurement == "stuff" and r._field == "units" and r._value > 0)
|> group(columns: ["datacenter","tenant"])
|> sum(column: "_value")
|> map(fn: (r) => ({r with _value: r._value / 4.0}))
from(bucket: "consumption")
|> range(start: thisDaySoFar, stop: now())
|> filter(fn: (r) => r._measurement == "stuff" and r._field == "units" and r._value > 0)
|> group(columns: ["datacenter","tenant"])
|> sum(column: "_value")
|> map(fn: (r) => ({r with _value: r._value / 4.0}))
I've tried joining these tables in various ways, but nothing I'm doing is working properly to get me the one table with 4 columns that I'm looking for.
Anyone have ideas on how to achieve this? Thanks!
I worked with a Flux developer who helped me come up with the solution:
import "date"
sum_over_range = (unit) =>
from(bucket: "consumption")
|> range(start: date.truncate(t: now(), unit: unit))
|> filter(fn: (r) => r._measurement == "stuff" and r._field == "units" and r._value > 0)
|> group(columns: ["datacenter", "tenant"])
|> sum()
|> map(fn: (r) => ({r with _value: r._value / 4.0, _field: string(v: unit), _time: 0}))
union(tables: [
sum_over_range(unit: 1y),
sum_over_range(unit: 1mo),
sum_over_range(unit: 1d),
])
|> group(columns: ["datacenter", "tenant"])
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
|> drop(columns: ["_time", "_start", "_stop", "result"])
|> group()
Then additionally in Grafana, I had to apply the 'Filter by name' transformation to hide the 'result' and 'table' columns that showed.
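The core of this solution is the union-then-pivot step. Here is a minimal sketch of just that step with in-memory rows; the tenant names and values are hypothetical, and the array package is assumed to be available in your Flux version (it may not be on older installations):

```flux
import "array"

// Hypothetical per-range sums; _field carries the range label, as in the solution above.
year = array.from(rows: [
    {tenant: "a", _field: "1y", _value: 120.0},
    {tenant: "b", _field: "1y", _value: 80.0},
])
month = array.from(rows: [
    {tenant: "a", _field: "1mo", _value: 10.0},
    {tenant: "b", _field: "1mo", _value: 6.0},
])

// union() stacks the rows of both streams; pivot() then turns each
// distinct _field value into its own column, one output row per tenant.
union(tables: [year, month])
    |> pivot(rowKey: ["tenant"], columnKey: ["_field"], valueColumn: "_value")
```

The result should contain one row per tenant, with a "1y" column and a "1mo" column, which is the shape the combined Grafana table needs.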
