I am writing to a InfluxDB counters per time period (the delta between each submission for the new seen messages of that type). I would like to combine the total count of messages over a time period to give messages per hour (or other time periods).
I have the below query, and using https://docs.influxdata.com/influxdb/cloud/reference/flux/stdlib/built-in/transformations/aggregates/sum/:
from(bucket: "ServiceStats")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "PolygonIoStream")
|> filter(fn: (r) => r["_field"] == "aggregatesCounter" or r["_field"] == "quotesCounter" or r["_field"] == "statusesCounter" or r["_field"] == "tradesCounter")
|> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)
|> Sum()
|> yield(name: "mean")
However i get the error runtime error #6:6-6:74: aggregateWindow: missing time column "_time"
I will be honest, this as my first query I have quickly gotten out of my depth - pointers much appreciated.
Related
I'm using InfluxDB 2 and I've got the following curve:
Instead of showing the total accumulated, I want to know the consumption per day.
from(bucket: "my-bucket")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "kWh")
|> filter(fn: (r) => r["entity_id"] == "plug_energy")
|> filter(fn: (r) => r["_field"] == "value")
|> aggregateWindow(every: 1d, fn: sum, createEmpty: false)
I thought this would work but it actually gives me the number of time the smart plug was turned on each day, not the energy consumed per day.
I have found this similar question but this looks like a very complicated solution for something that should be simpler?
I have saved my location in a bucket with latitude and longitude.
So far, I've been able to get the result to show nicely in a table with both latitude and longitude with the following code:
import "experimental/geo"
from(bucket: "home-assistant")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["entity_id"] == "maxime")
|> filter(fn: (r) => r["_field"] == "latitude" or r["_field"] == "longitude")
|> keep(columns: ["_time", "_value", "_field"])
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
Now, let say I want to know how much time I've been home per day.
I feel like it should be something doable but seeing this issue, I'm suddenly not so sure about it.
What I need I believe is a mix between the geo.filterRows so that I can have a certain radius to designate my house, and maybe elapsed? But if that's the case I haven't managed to wrap my head around it to get the expected result.
I am quite new to Flux and want to solve an issue:
I got a bucket containing measurements, which are generated by a worker-service.
Each measurement belongs to a site and has an identifier (uuid). Each measurement contains three measurement points containing a value.
What I want to archive now is the following: Create a graph/list/table of measurements for a specific site and aggregate the median value of each of the three measurement points per measurement.
TLDR;
Get all measurementpoints that belong to the specific site-uuid
As each measurement has an uuid and contains three measurement points, group by measurement and take the median for each measurement
Return a result that only contains the median value for each measurement
This does not work:
from(bucket: "test")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "lighthouse")
|> filter(fn: (r) => r["_field"] == "speedindex")
|> filter(fn: (r) => r["site"] == "1d1a13a3-bb07-3447-a3b7-d8ffcae74045")
|> group(columns: ["measurement"])
|> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)
|> yield(name: "mean")
This does not throw an error, but it of course does not take the median of the specific groups.
This is the result (simple table):
If I understand your question correctly you want a single number to be returned.
In that case you'll want to use the |> mean() function:
from(bucket: "test")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "lighthouse")
|> filter(fn: (r) => r["_field"] == "speedindex")
|> filter(fn: (r) => r["site"] == "1d1a13a3-bb07-3447-a3b7-d8ffcae74045")
|> group(columns: ["measurement"])
|> mean()
|> yield(name: "mean")
The aggregateWindow function aggregates your values over (multiple) windows of time. The script you posted computes the mean over each v.windowPeriod (in this case 20 minutes).
I am not entirely sure what v.windowPeriod represents, but I usually use time literals for all times (including start and stop), I find it easier to understand how the query relates to the result that way.
On a side note: the yield function only renames your result and allows you to have multiple returning queries, it does not compute anything.
In Influx Flux, is there a technical difference (like in execution or performance) between setting a filter operation in a single statement vs. using multiple, chained statements?
For example, the single statement:
from(bucket: "example-bucket")
|> range(start: -1h)
|> filter(fn: (r) =>
r._measurement == "example-measurement" and
r._field == "example-field" and
r.tag == "example-tag"))
... versus using multiple, chained lambda's:
from(bucket: "example-bucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "example-measurement")
|> filter(fn: (r) => r._field == "example-field")
|> filter(fn: (r) => r.tag == "example-tag"))
Perhaps both operations are executed equally. But I cannot find canon in the docs on it, although the examples seem to prefer the first example.
I understand that logical operator OR isn't ideal in the second case. Let's assume for this question it's all AND.
I want to raise an alarm when the count of a particular kind of event is less than 5 for the 3 hours leading up to the moment the check is evaluated, but I need to do this check every 15 minutes.
Since I need to check more frequently than the span of time I'm measuring, I can't do this based on my raw data (according to the docs, "[the schedule] interval matches the aggregate function interval for the check query". But I figured I could use a "task" to transform my data into a form that would work.
I was able to aggregate the data in the way that I hoped via a flux query, and I even saved the resultant rolling count to a dashboard.
from(bucket: "myBucket")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) =>
(r._measurement == "measurementA"))
|> filter(fn: (r) =>
(r._field == "booleanAttributeX"))
|> window(
every: 15m,
period: 3h,
timeColumn: "_time",
startColumn: "_start",
stopColumn: "_stop",
createEmpty: true,
)
|> count()
|> yield(name: "count")
|> to(bucket: "myBucket", org: "myOrg")
Results in the following scatterplot.
My hope was that I could just copy-paste this as a new task and get my nice new aggregated dataset. After resolving a couple of legible syntax errors, I settled on the following task definition:
option v = {timeRangeStart: -12h, timeRangeStop: now()}
option task = {name: "blech", every: 15m}
from(bucket: "myBucket")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) =>
(r._measurement == "measurementA"))
|> filter(fn: (r) =>
(r._field == "booleanAttributeX"))
|> window(
every: 15m,
period: 3h,
timeColumn: "_time",
startColumn: "_start",
stopColumn: "_stop",
createEmpty: true,
)
|> count()
|> yield(name: "count")
|> to(bucket: "myBucket", org: "myOrg")
Unfortunately, I'm stuck on an error that I can't find any mention of anywhere: could not execute task run; Err: no time column detected: no time column detected.
If you could help me debug this task run error, or sidestep it by accomplishing this task in some other manner, I'll be very grateful.
I know I'm late here, but the to function needs a _time column, but the count aggregate you are adding returns a _start and _stop column to indicate the time frame for the count, not a _time.
You can solve this by either adding |> duplicate(column: "_stop", as: "_time") just before your to function, or leveraging the aggregateWindow function which handles this for you.
|> aggregateWindow(every: 15m, fn: count)
References:
https://v2.docs.influxdata.com/v2.0/reference/flux/stdlib/built-in/transformations/aggregates/count
https://v2.docs.influxdata.com/v2.0/reference/flux/stdlib/built-in/transformations/duplicate/
https://v2.docs.influxdata.com/v2.0/reference/flux/stdlib/built-in/transformations/aggregates/aggregatewindow/