Remove some series from query in influx - influxdb

I have this query:
SELECT non_negative_derivative(max("value"), 10s)
FROM "interface_rx"
WHERE "host" =~ /host.+/
AND "instance" =~ /eth.+/
AND "type" = 'if_octets'
AND $timeFilter
GROUP BY time(5m), "instance"
fill(null)
It returns every matching series, which is too many.
I want to keep only the series where non_negative_derivative(max("value"), 10s) is greater than 100.
If I do this:
SELECT non_negative_derivative(max("value"), 10s)
as irx
FROM "interface_rx"
WHERE "host" =~ /host.+/
AND "instance" =~ /eth.+/
AND "type" = 'if_octets'
AND $timeFilter
AND irx > 100
GROUP BY time(5m), "instance"
fill(null)
InfluxDB returns empty results.
How can I filter the slow series out of the result? Thanks.

Unfortunately there isn't a way to refer to irx inside the body of the query.
To achieve the result you're looking for you'll need to issue two queries:
SELECT non_negative_derivative(max("value"), 10s) AS irx
INTO tmp
FROM "interface_rx"
WHERE "host" =~ /host.+/
AND "instance" =~ /eth.+/
AND "type" = 'if_octets'
AND $timeFilter
GROUP BY time(5m), "instance"
fill(null)
and
SELECT irx FROM tmp WHERE irx > 100 GROUP BY instance
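Note that the second query filters individual points, so an instance only disappears from the result when none of its points exceed the threshold. As a rough illustration of that behaviour (not InfluxDB's implementation), here is a minimal Python sketch. The function names, the sample data, and the simplified derivative are assumptions made for the example.

```python
def non_negative_derivative(points, unit_s):
    """Rate of change per `unit_s` seconds, dropping negative rates.

    `points` is a time-sorted list of (epoch_seconds, value_or_None).
    This is a simplified stand-in for the InfluxQL function.
    """
    out = []
    prev = None
    for t, v in points:
        if v is None:  # fill(null) buckets carry no value
            continue
        if prev is not None:
            rate = (v - prev[1]) / (t - prev[0]) * unit_s
            if rate >= 0:
                out.append((t, rate))
        prev = (t, v)
    return out


def filter_series(series_by_instance, unit_s=10, threshold=100):
    """Keep only points above `threshold`; drop instances left empty."""
    result = {}
    for instance, points in series_by_instance.items():
        kept = [(t, r) for t, r in non_negative_derivative(points, unit_s)
                if r > threshold]
        if kept:
            result[instance] = kept
    return result


# Hypothetical 5-minute max("value") buckets per instance.
series = {
    "eth0": [(0, 0), (300, 6000), (600, 12000)],  # 200 per 10s
    "eth1": [(0, 0), (300, 300), (600, 600)],     # 10 per 10s
}
filtered = filter_series(series)
print(filtered)
```

With this sample data only "eth0" survives, because every "eth1" point is below the threshold.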

Related

Combining LAST and Cumulative SUM on influxdb subquery data

I have some telemetry data that looks like this:
> select * from connected_users LIMIT 100
name: connected_users
time event_id value
---- -------- -----
1605485019101319658 13759 2
1605485299566004832 13759 0
1605490011182967926 13760 4
1605490171409428188 13760 0
1605490207031398071 13760 7
1605490246151204709 13760 0
1605491054403726308 13761 1
1605491403050521520 13761 0
1605491433407347399 13762 2
1605491527773331816 13762 3
1605492020976088377 12884 1
1605492219827002782 13761 1
1605492613984911844 13763 1
1605492806683323942 13763 0
...
These writes only occur when something changes on the event (i.e. they are not at fixed intervals). I want to write a query that gives me, for each minute, the sum of the current "value" across all event_ids. Because I can't guarantee that a new value was written in the preceding 60 seconds, I use LAST to get whatever was last set per event_id.
So far I got to:
SELECT SUM(*) FROM
(SELECT LAST("value") FROM
"connected_users" WHERE
time <= now() AND
time >= now() - 3h
GROUP BY time(1m), "event_id" fill(previous))
GROUP BY time(1m)
But this gives me a much lower outer "value" than expected, and a lot of duplicate time entries (and thus a lot of duplicate entries in the output data).
I can see that the inner query is correct: if I run it in isolation, I can stack the output in Grafana and read the correct total off the graph. However, I want a single series rather than a grouped set of series, and I can't work out how to transform the data to get that.
EDIT: To give more context:
This is the inner query (Grafana, hence $timeFilter):
SELECT last("value") FROM "connected_users" WHERE $timeFilter GROUP BY time(1m), "event_id" fill(previous)
This produces a chart which I can stack and is correct:
If I then wrap that inner query in a SUM and GROUP BY time(1m) again, I can isolate a single series:
SELECT SUM(*) FROM (SELECT last("value") FROM "connected_users" WHERE $timeFilter AND ("event_id" = '9970') GROUP BY time(1m), "event_id" fill(previous)) GROUP BY time(1m)
However, if I remove the AND condition and attempt to SUM the values of all series, I end up with a jumbled mess (presumably because there are duplicate or overlapping time values?) and a lower maximum than expected (expecting 18, got 8):
SELECT SUM(*) FROM (SELECT last("value") FROM "connected_users" WHERE $timeFilter GROUP BY time(1m), "event_id" fill(previous)) GROUP BY time(1m)
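To pin down the result being asked for, here is a minimal client-side sketch in Python: carry the last known value per event_id forward and sum across all event_ids at each minute boundary. It is not an InfluxQL fix. The function name, the bucket convention (each total is labelled with the start of its minute), and the sample points are assumptions for illustration.

```python
def per_minute_totals(points, start, end, step=60):
    """Sum the last known value of every event_id for each time bucket.

    `points` is an iterable of (epoch_seconds, event_id, value).
    Returns a list of (bucket_start, total) pairs.
    """
    points = sorted(points)
    current = {}  # event_id -> last value seen so far (fill(previous))
    totals = []
    i = 0
    for bucket_end in range(start + step, end + 1, step):
        # Apply every write that happened before this bucket closed.
        while i < len(points) and points[i][0] < bucket_end:
            _, event_id, value = points[i]
            current[event_id] = value
            i += 1
        totals.append((bucket_end - step, sum(current.values())))
    return totals


# Hypothetical writes: (time in seconds, event_id, value)
sample = [(0, "a", 2), (30, "b", 4), (70, "a", 0), (200, "b", 7)]
totals = per_minute_totals(sample, start=0, end=240)
print(totals)
```

Each bucket produces exactly one row, which is the single series the question describes. A stacked Grafana chart of the inner query should show the same totals.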

Apply a SUM function on the product of two fields in InfluxDB

I have the following query:
SELECT sum("field1" * "field2") FROM "my_db"."autogen"."data" GROUP BY time(1d) FILL(null)
In short, I would like to apply sum to the product of two fields, field1 and field2.
The above query returns an error: expected field argument in sum().
Is this kind of thing at all possible in InfluxDB?
Here's an idea: try a subquery (this needs InfluxDB 1.2 or later). Compute the product per point in the inner query, then aggregate and group by time in the outer query.
Note: I haven't tested this, so it may still need adjusting.
SELECT SUM("Multiplication") FROM
(SELECT "field1" * "field2" AS "Multiplication" FROM
"my_db"."autogen"."data"
) GROUP BY time(1d) FILL(null)
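The order of operations matters here: multiply per point first, then sum per day. A small Python sketch of that computation follows; the function name, row layout, and sample values are assumptions made for the example.

```python
from collections import defaultdict


def daily_sum_of_products(rows, day_s=86400):
    """Sum field1 * field2 per day.

    `rows` is an iterable of (epoch_seconds, field1, field2); rows with a
    missing field are skipped, since there is no product to add.
    """
    totals = defaultdict(float)
    for t, f1, f2 in rows:
        if f1 is None or f2 is None:
            continue
        day_start = t - t % day_s  # align to the start of the day bucket
        totals[day_start] += f1 * f2
    return dict(totals)


# Hypothetical points: two on day 0, two on day 1 (one incomplete).
rows = [(0, 2, 3), (3600, 1, 4), (86400, 5, 5), (90000, None, 2)]
daily = daily_sum_of_products(rows)
print(daily)
```

Summing each field separately and multiplying the sums would give a different number, which is why the product has to be computed before the aggregate.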

InfluxDB Query to fetch count of distinct values for Grafana

I have a collector that reads three fields from a log file and saves them to InfluxDB in the following format:
FeildA  FeildB  FeildC
------  ------  ------
A       00      123
B       02      100
A       00      13
A       00      123
I want to plot a graph in Grafana showing the count of occurrences of each FeildA value ("A" and "B" here).
Important: FeildA can take multiple values that are not known beforehand, so writing a query with a fixed "where" clause is not an option.
If FeildA is defined only as a field in the measurement schema, you can use a regular expression in the "where" clause, and queries like these might work for you:
```
SELECT COUNT(FeildA) FROM "logdata" WHERE $timeFilter and FeildA::field =~ /^A$/
SELECT COUNT(FeildA) FROM "logdata" WHERE $timeFilter and FeildA::field =~ /^B$/
SELECT COUNT(FeildA) FROM "logdata" WHERE $timeFilter and FeildA =~ /^(A|B)$/
```
If the number of expected distinct values of FeildA (its cardinality) is reasonable, the real solution is to make FeildA a tag instead of a field. Then you can group by the tag in the query. For example, this query:
```
SELECT COUNT(FeildA) FROM "logdata" WHERE $timeFilter AND "FeildA" =~ /^(A|B|C|D)$/ GROUP BY time(1m), FeildA fill(null)
```
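will give the counts of occurrences of "A", "B", "C" and "D". This requires changes in the collector, though. Note that COUNT() works on fields, not tags, so the counted column must still exist as a field (or you can count another field such as FeildB).
FeildA can be both a tag and a field in InfluxDB, but it is better to give them different names to avoid collisions and simplify the query syntax.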
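What grouping by time and by tag buys you is a count per value without listing the values up front. A minimal Python sketch of that grouping is below; the function name and the sample rows are assumptions, and it only mimics the shape of the result, not InfluxDB itself.

```python
from collections import Counter


def count_by_value(rows, step=60):
    """Count occurrences of each FeildA value per time bucket.

    `rows` is an iterable of (epoch_seconds, feild_a). No value of
    feild_a needs to be known in advance.
    """
    counts = Counter()
    for t, feild_a in rows:
        bucket = t - t % step  # start of the 1-minute bucket
        counts[(bucket, feild_a)] += 1
    return dict(counts)


# Hypothetical log entries.
rows = [(0, "A"), (10, "B"), (20, "A"), (65, "A")]
counts = count_by_value(rows)
print(counts)
```

A new value such as "C" would simply show up as a new key, which matches the requirement that the values are not known beforehand.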

Rails Active Record Count Across Calculated Field

Using the AREL / Rails calculations I'm trying to execute the following:
SELECT to_char(timestamp, 'YYYY-MM-DD') AS segment, COUNT(*) AS counter
FROM pages
GROUP BY segment
ORDER BY segment
I can run something like:
Page.order(FIELD).count(group: FIELD)
{ a: 1, b: 4, c: 1 }
However, I can't get this working across calculated fields. Any thoughts?
I came up with this:
Page.count(:all, from: "(SELECT to_char(#{SEGMENT}, 'YYYY-MM-DD') AS segment FROM pages) AS pages", group: "segment", order: "segment")
> SELECT COUNT(*) AS count_all, segment AS segment FROM (SELECT to_char(created_at, 'YYYY-MM-DD') AS segment FROM pages) AS pages GROUP BY segment ORDER BY segment

SQL Syntax Issue

I'm joining two tables and doing a simple count, but I can't seem to rename the joined key column to something more appropriate for the two tables. I keep getting the error '"CUSTOMER_NO" is not valid in the context where it is used.' I'm sure it's just a small syntax error, but I can't see it...
SELECT owner_no AS customer_no,
CASE
WHEN customer_no BETWEEN 5000 and 5999 THEN 'RENTER'
WHEN customer_no BETWEEN 6000 and 6999 THEN 'OWNER'
END AS customer_type
FROM owner_phone AS op
INNER JOIN renter_phone AS rp ON op.owner_no = rp.renter_no
GROUP BY customer_no
HAVING COUNT(*) > 1;
Use the actual column name in your CASE and GROUP BY, not the aliased column name.
SELECT owner_no AS customer_no,
CASE
WHEN owner_no BETWEEN 5000 and 5999 THEN 'RENTER'
WHEN owner_no BETWEEN 6000 and 6999 THEN 'OWNER'
END AS customer_type
FROM owner_phone AS op
INNER JOIN renter_phone AS rp ON op.owner_no = rp.renter_no
GROUP BY owner_no
HAVING Count(*) > 1;
You have to use OWNER_NO through the rest of your query but leave the AS CUSTOMER_NO to make that the column name.
SELECT owner_no AS customer_no,
CASE
WHEN owner_no BETWEEN 5000 and 5999 THEN 'RENTER'
WHEN owner_no BETWEEN 6000 and 6999 THEN 'OWNER'
END AS customer_type
FROM owner_phone AS op
INNER JOIN renter_phone AS rp ON op.owner_no = rp.renter_no
GROUP BY owner_no
HAVING COUNT(*) > 1;
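As a quick sanity check, the corrected query can be run against an in-memory SQLite database from Python. The table layouts, the phone column, and the sample rows are assumptions made up for this sketch. SQLite is also more permissive about aliases than the database in the question, so this only confirms that the corrected form runs and returns the renamed column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and data; only owner_no/renter_no come from the question.
conn.executescript("""
CREATE TABLE owner_phone (owner_no INTEGER, phone TEXT);
CREATE TABLE renter_phone (renter_no INTEGER, phone TEXT);
INSERT INTO owner_phone VALUES (6001, '555-0101');
INSERT INTO owner_phone VALUES (6001, '555-0102');
INSERT INTO owner_phone VALUES (5001, '555-0103');
INSERT INTO renter_phone VALUES (6001, '555-0201');
INSERT INTO renter_phone VALUES (5001, '555-0202');
""")

# The alias appears only in the SELECT list; CASE and GROUP BY use owner_no.
cursor = conn.execute("""
SELECT owner_no AS customer_no,
       CASE
           WHEN owner_no BETWEEN 5000 AND 5999 THEN 'RENTER'
           WHEN owner_no BETWEEN 6000 AND 6999 THEN 'OWNER'
       END AS customer_type
FROM owner_phone AS op
INNER JOIN renter_phone AS rp ON op.owner_no = rp.renter_no
GROUP BY owner_no
HAVING COUNT(*) > 1
""")
column_names = [d[0] for d in cursor.description]
rows = cursor.fetchall()
print(column_names, rows)
```

The output column is still named customer_no, so the rename works even though the rest of the query refers to owner_no.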
