Hi, I am currently doing this in R but would like to know whether there is a way to do it in Flux instead:
I have a time series that tracks a value and only stores a point when the signal is turned on or off; the nature of the machine I am tracking only allows it to be recorded this way. This results in a data table/measurement where two rows together describe a single event (e.g. the start and the end of a fault). How do I query the data with Flux to combine those two rows (with "start" and "stop" as tags/fields)?
I currently use the elapsed() function to calculate the time difference/duration of my value:
time value field measurement equipmentNumber workplace duration
2021-01-29 07:11:17.496 1 FAULT_LASER FAULT_LASER L5211M0855 0 188
2021-01-29 07:12:03.332 0 FAULT_LASER FAULT_LASER L5211M0855 0 45835
2021-01-29 07:12:19.618 1 FAULT_LASER FAULT_LASER L5211M0855 0 16285
2021-01-29 07:12:19.618 0 FAULT_LASER FAULT_LASER L5211M0855 0 161725
I am doing this in R at the moment:
for (i in 1:nrow(df_f)) {
  if (df_f[i, "duration"] > 0) {
    df_fdur[i, "start"] <- df_f[i, "time"]
    df_fdur[i, "stop"] <- df_f[i + 1, "time"]
    df_fdur[i, "type"] <- df_f[i, "value"]
    df_fdur[i, "duration"] <- df_f[i, "duration"]
    df_fdur[i, "workplace"] <- df_f[i, "workplace"]
    df_fdur[i, "equipmentNumber"] <- df_f[i, "equipmentNumber"]
  }
}
Any ideas on how I can do that?
This does not directly solve the question but it solved the problem I was working on. Maybe it is useful for someone else. Have a great day!
// Get all the data from the bucket filtered for FAULT_LASER
data = from(bucket: "plcview_4/autogen")
|> range(start: 2021-01-29T00:00:00.000Z, stop: now()) // regular time range filter
|> filter(fn: (r) => r._measurement == "FAULT_LASER") // filter for the measurement
|> elapsed(unit: 1ms, timeColumn: "_time", columnName: "duration") // calculate time difference between rows
|> yield(name: "data")
// returns data tables for every unique set of tags (workplace and equipmentNumber)
// Filter for all "No-Fault-Values" and sum their durations
operational = data
|> filter(fn: (r) => r._value == 0) // filter for all rows where FAULT_LASER = 0 --> No Faults
|> group() // group all data tables together
|> sum(column: "duration") // sum all the durations from all data tables
|> yield(name: "operational")
// Count the number of faults
nfaults = data
|> filter(fn: (r) => r._value == 1) // filter for all rows where FAULT_LASER = 1 --> Faults
|> group() // group all data tables together
|> count() // count the number of records
|> yield(name: "nfaults")
Related
I want to calculate the working time of a component.
The speed signal is available in an InfluxDB.
So I need a query that calculates the total time span during which the speed is not zero.
With this I filter the data, but I don't know how to get the time span.
from(bucket: "${bucket}")
|> range(start: v.timeRangeStart, stop:v.timeRangeStop)
|> filter(fn: (r) =>
r._measurement == "${measurement}"
and r._field == "Speed"
and r._value > 0
)
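One possible approach, sketched here rather than a verified answer: compute the gap between consecutive points with elapsed(), keep only the intervals that end with a non-zero speed, and sum them. The accuracy depends on the sampling interval, since each gap is attributed entirely to the point at its end.
from(bucket: "${bucket}")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) =>
r._measurement == "${measurement}"
and r._field == "Speed"
)
|> elapsed(unit: 1s, timeColumn: "_time", columnName: "runtime") // seconds since the previous point
|> filter(fn: (r) => r._value > 0) // keep intervals ending with non-zero speed
|> sum(column: "runtime") // total running time in seconds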
I have an issue understanding how to use the join.inner() function.
It seems I can only get a result (and the correct one) if I apply the limit() function to the stream I want to use join.inner() with.
If I don't limit this left stream, I don't get any error, but also no result.
Is it because of how I build my left stream?
Do you have any idea what I am doing wrong here?
I am pretty new to InfluxDB and therefore to the Flux language, so it must be me.
Thank you all for your answers!
import "array"
import "join"
left =
from(bucket: "TestBucket")
|> range(start: 0)
|> filter(fn: (r) => r["_measurement"] == "TestMeasurement")
|> limit(n: 1000000000000000000)
|> group()
//|> yield(name: "LEFT")
right =
array.from(
rows: [
{arrayValue: "123", _time: 2023-02-07T12:00:00.000Z}, //This timestamp exists in the left stream
],
)
//|> yield(name: "RIGHT")
result = join.inner(
left: left,
right: right,
on: (l, r) => l._time == r._time, // I made sure that there is indeed a common time
as: (l, r) => ({l with rightValue: r.arrayValue}),
)
|> yield(name: "RESULT")
OK, the solution was to group both the stream AND the table by the _time column:
|> group(columns: ["_time"])
Using the quantile() function, I was able to get the 95th percentile value in a stream.
Now I want to filter the records which lie below the 95th percentile.
Hence, I loop over my records and filter the records which lie below the percentile.
However, at this point I get an error.
Please find the code below:
percentile = totalTimeByDoc
|> filter(fn: (r) => r["documentType"] == "PurchaseOrder")
|> group(columns:["documentType"])
// |> yield()
|> quantile(column: "processTime", q: 0.95, method: "estimate_tdigest", compression: 9999.0)
|> limit(n: 1)
|> rename(columns: {processTime: "pt"})
This gives me the following data:
0 PurchaseOrder 999
Now I try to loop over my records and filter:
percentile_filered = totalTimeByDoc
|> filter(fn: (r) => r["documentType"] == "PurchaseOrder")
|> filter(fn: (r) => r.processTime < percentile[0]["pt"])
|> yield()
Here totalTimeByDoc looks like this:
|0|PurchaseOrder|testpass22PID230207222747-1|1200|
|1|PurchaseOrder|testpass22PID230207222747-2|807|
|2|PurchaseOrder|testpass22PID230207222934-1|671|
|3|PurchaseOrder|testpass22PID230207222934-2|670|
I get the following error from the above query:
error #116:41-116:51: expected [{A with pt: B}] (array) but found stream[{A with pt: B}]
You are only missing the column extraction from the percentile stream. Have a look at Extract scalar values. In this case, you could do:
percentile = totalTimeByDoc
|> ...
|> rename(columns: {processTime: "pt"})
|> findColumn(fn: (key) => true, column: "pt")
percentile_filtered = totalTimeByDoc
|> filter(fn: (r) => r["documentType"] == "PurchaseOrder")
|> filter(fn: (r) => r.processTime < percentile[0])
|> yield()
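findColumn() returns the values of the pt column as a plain array, so percentile[0] is a scalar value that can be used inside the filter predicate, which avoids the stream-versus-array type error above.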
I'm struggling with an InfluxDB 2 query in Flux on how to join and map data from two different sets (tables) into a specific desired output.
My current Flux query is this:
data = from(bucket: "foo")
|> range(start:-1d)
|> filter(fn: (r) => r._measurement == "io")
|> filter(fn: (r) => r["device_id"] == "12345")
|> filter(fn: (r) => r._field == "status_id" )
// count the total points
totals = data
|> count(column: "_value")
|> toFloat()
|> set(key: "_field", value: "total_count")
// count the online points (i.e. status = 1)
onlines = data
|> filter(fn: (r) => r._value == 1)
|> count(column: "_value")
|> toFloat()
|> set(key: "_field", value: "online_count")
union(tables: [totals, onlines])
This returns as output:
[{'online_count': 58.0}, {'total_count': 60.0}]
I would like to have appended to this output a percentage calculated from this. Something like:
[{'online_count': 58.0}, {'total_count': 60.0}, {'availability': 0.96666667}]
I've tried combining this using .map(), but to no avail:
// It feels like map() is what I need, but I can't find the right
// combination of join()/union(), map(), set(), keep(), etc.
union(tables: [totals, onlines])
|> map(fn: (r) => ({ r with percentage_online: r.onlines.online_count / r.totals.total_count * 100 }))
How can I append the (calculated) percentage as new field 'availability' in this Flux query?
Or, alternatively, is there a different Flux query approach to achieve this outcome?
N.B. I am aware of the Calculate percentages with Flux article from the docs, which I can't get working in this specific scenario. But it's close.
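One possible way to append the calculated field, sketched here rather than a verified answer (it reuses the totals and onlines streams from above and assumes both counts are non-zero): extract the two counts as scalars with findColumn() and build an extra availability row with array.from().
import "array"
// pull the single count value out of each one-row stream
total_count = (totals |> findColumn(fn: (key) => true, column: "_value"))[0]
online_count = (onlines |> findColumn(fn: (key) => true, column: "_value"))[0]
// build a one-row table holding the calculated availability
availability = array.from(rows: [{_field: "availability", _value: online_count / total_count}])
union(tables: [totals, onlines, availability])
Depending on which columns totals and onlines still carry, a keep() or drop() may be needed so the three streams line up before the union().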
I am new to InfluxDB. I am using InfluxDB 1.8+ and com.influxdb:influxdb-client-java:1.11.0. I have the measurement below:
stocks {
(tag) symbol: String
(field) price: Double
(field) volume: Long
(time) ts: Long
}
I am trying to query the measurement with a 15 min window. I have the below query
"from(bucket: \"test/autogen\")" +
" |> range(start: -12h)" +
" |> filter(fn: (r) => (r[\"_measurement\"] == \"$measurementName\" and r[\"_field\"] == \"volume\"))" +
" |> cumulativeSum(columns: [\"_value\"])" +
" |> window(every: 15m, period: 15m)"
I believe that the above query calculates the cumulative sum over the data and returns just the volume field. However, I want the entire measurement including price, symbol, and ts along with the cumulative sum of the volume in a single flux query. I am not sure how to do this. Any help is appreciated. Thanks.
Thanks to Ethan Zhang. Flux output tables use a vertical (column-wise) data layout for fields.
Note that the price and the volume fields are stored as two separate rows.
To achieve the result you can use a function called v1.fieldsAsCols() to convert the table from a vertical layout back to the horizontal layout. Here is a link to its documentation: https://docs.influxdata.com/influxdb/v2.0/reference/flux/stdlib/influxdb-v1/fieldsascols/
Hence the query can be rewritten as follows (sample query 1):
from(bucket: \"test/autogen\")
|> range(start: -1h)
|> filter(fn: (r) => r["_measurement"] == "stocks"))
|> v1.fieldsAsCols()
|> group()
|> cumulativeSum(columns: ["volume"])
|> window(every: 15m, period: 15m)
Another approach is using pivot() (sample query 2):
from(bucket: \"test/autogen\")
|> range(start: -1h)
|> filter(fn: (r) => r["_measurement"] == "stocks")
|> pivot(rowKey:[\"_time\"], columnKey: [\"_field\"], valueColumn: \"_value\")
|> group()
|> cumulativeSum(columns: ["volume"])
|> window(every: 15m, period: 15m)
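Both variants first turn the price and volume field rows into columns (one via v1.fieldsAsCols(), the other via pivot()), so cumulativeSum() can operate on the volume column while price, symbol, and the timestamp remain available in the same row.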