InfluxDB Query to fetch count of distinct values for Grafana - monitoring

I have a collector that reads three fields from a log file and saves them to InfluxDB in the following format:
FeildA  FeildB  FeildC
------  ------  ------
A       00      123
B       02      100
A       00      13
A       00      123
I want to plot a graph in Grafana that shows the number of occurrences of "A" and "B" (the values of FeildA).
Important: FeildA can take multiple values that are not known beforehand, so writing a query with a hard-coded "where" clause is not an option.

If FeildA is defined only as a field in the measurement schema, you can use a regular expression in the "where" clause, and these queries might work for you:
```
SELECT COUNT(FeildA) FROM "logdata" WHERE $timeFilter and FeildA::field =~ /^A$/
SELECT COUNT(FeildA) FROM "logdata" WHERE $timeFilter and FeildA::field =~ /^B$/
SELECT COUNT(FeildA) FROM "logdata" WHERE $timeFilter and FeildA =~ /^(A|B)$/
```
If the number of expected distinct values of FeildA (its cardinality) is reasonable, the real solution is to make FeildA a tag instead of a field. Then you can group by the tag in the query. For example, the query:
```
SELECT COUNT(FeildA) FROM "logdata" WHERE $timeFilter AND "FeildA" =~ /^(A|B|C|D)$/ GROUP BY time(1m), FeildA fill(null)
```
will give the counts of occurrences of "A", "B", "C" and "D". But this requires changes in the collector.
FeildA can be both a tag and a field in InfluxDB, but it is better to use different names to avoid collisions and to simplify the syntax in queries.
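To make the expected result concrete, here is a minimal Python sketch of what grouping by FeildA amounts to: one count per distinct value, with no value hard-coded. The rows are the sample values from the question; Python is only used for illustration and is not part of the InfluxDB or Grafana setup.

```python
from collections import Counter

# Sample rows from the question: (FeildA, FeildB, FeildC)
rows = [
    ("A", "00", 123),
    ("B", "02", 100),
    ("A", "00", 13),
    ("A", "00", 123),
]

# Count occurrences per distinct FeildA value; new values need no code change.
counts = Counter(field_a for field_a, _, _ in rows)

for value, count in sorted(counts.items()):
    print(value, count)
```

A value that shows up later (say "C") would simply appear as another key, which is the behaviour the tag-based GROUP BY is meant to give you.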

Related

Apply a SUM function on the product of two fields in InfluxDB

I have the following query:
SELECT sum("field1" * "field2") FROM "my_db"."autogen"."data" GROUP BY time(1d) FILL(null)
In short, I would like to apply the sum operation to the product of two fields, field1 and field2.
The above query returns an error: expected field argument in sum().
Is this kind of thing at all possible in InfluxDB?
Here's an idea: try a subquery.
Note: I don't have an editor right now, so this is untested and might give an error too.
SELECT SUM("Multiplication") FROM
(SELECT "field1" * "field2" AS "Multiplication" FROM "my_db"."autogen"."data")
GROUP BY time(1d) FILL(null)
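For clarity, this is the computation the subquery approach is aiming for, sketched in plain Python: multiply the two fields point by point first, then sum the products per one-day bucket. The timestamps and values below are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical points: (timestamp, field1, field2)
points = [
    (datetime(2024, 1, 1, 6), 2.0, 3.0),
    (datetime(2024, 1, 1, 18), 4.0, 0.5),
    (datetime(2024, 1, 2, 9), 1.5, 2.0),
]

daily_sum = defaultdict(float)
for ts, field1, field2 in points:
    # Inner step: the per-point product.
    product = field1 * field2
    # Outer step: add the product to that point's one-day bucket.
    daily_sum[ts.date().isoformat()] += product

print(dict(daily_sum))
```

The order matters: the product has to exist per point before the aggregate runs, which is why the multiplication goes in the inner query and the sum in the outer one.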

InfluxDB - Including multiple values in where clause based on tags

I'm trying to query data based on tag values. Is it possible to include multiple values in the where clause? I could not find an operator similar to SQL's IN operator.
select * from students where rollNumber='1' limit 10
students is the measurement and rollNumber is a tag. I want to include multiple values of rollNumber in the query.
Any suggestions to solve the problem?
InfluxDB does not have an IN operator; however, it supports Go regular expressions in the WHERE clause for fields and tags. Regular expressions are enclosed in / and are used with the =~ (matches) or !~ (does not match) operators:
select * from students where rollNumber =~ /1|2|3/ limit 10
This will return 10 students whose rollNumber tag contains 1, 2 or 3.
For a precise match the following should work:
select * from students where rollNumber =~ /^(1|2|3)$/ limit 10
Note: when filtering on fields, the regex only works if the field type is string.
But as noted in the comments, using the OR operator with explicit comparisons should work better, as the tag index can be used for more efficient querying.
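The difference between the regex variants is easy to check outside InfluxDB. The sketch below uses Python's `re` module as a stand-in for Go regular expressions (the two behave the same for these simple patterns) and a made-up list of roll numbers. It also includes `or_clause`, a hypothetical helper that builds the explicit OR comparison; it does no escaping, so it is only suitable for trusted values.

```python
import re

# Made-up tag values to test the patterns against.
roll_numbers = ["1", "2", "3", "13", "|", "4"]

def matching(pattern):
    """Return the values the pattern matches anywhere, like =~ does."""
    return [r for r in roll_numbers if re.search(pattern, r)]

unanchored = matching(r"1|2|3")      # "contains" match, so "13" is included
char_class = matching(r"^[1|2|3]$")  # a character class also accepts a literal "|"
exact = matching(r"^(1|2|3)$")       # anchored group: only "1", "2", "3"

def or_clause(tag, values):
    """Build an explicit OR comparison for a tag (hypothetical helper, no escaping)."""
    return " OR ".join(f"\"{tag}\" = '{v}'" for v in values)

where = or_clause("rollNumber", ["1", "2", "3"])

print(unanchored)
print(char_class)
print(exact)
print(where)
```

Inside square brackets the `|` is an ordinary character rather than alternation, so a group in parentheses is the safer way to spell "exactly one of these values".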

InfluxDb Continuous Query that excludes zeros

I'm trying to create a continuous query in InfluxDB to downsample the measurement data to hourly mean values. I can do that with the continuous query below.
CREATE CONTINUOUS QUERY "cq_test_1h" ON "db-name"
BEGIN
SELECT mean("value") AS "mean_value"
INTO "downsampled"."downsampled_measurement"
FROM "autogen"."measurement"
GROUP BY time(1h)
END
But I also want results to be excluded when the hourly mean equals zero, so that the downsampled_measurement series does not contain any zero values. I can write a (nested) query that does what I want, but I don't know how to turn it into a continuous query.
SELECT mean_value
FROM
(SELECT mean(value) AS mean_value
FROM "measurement"
WHERE time<now()
GROUP BY time(1h))
WHERE mean_value>0
The query above works, but to make it a continuous query it needs an aggregator, GROUP BY clause and a duration argument in the WHERE clause:
SELECT mean(mean_value)
FROM
(SELECT mean(value) AS mean_value FROM "db-name"."autogen"."measurement"
WHERE time<now()
GROUP BY time(1h))
WHERE mean_value>0 AND time<now()
GROUP BY time(1h)
However, this query no longer returns any values. How can I make a continuous query that excludes zeros?

ruby on rails' alphabetical order method doesn't place word boundary before "a"? [duplicate]

I use PostgreSQL 9.3.3 and I have a table with one column named title (character varying(50)).
When I executed the following query:
select * from test
order by title asc
I got the following results:
#
A
#Example
Why is "#Example" in the last position? In my opinion "#Example" should be in the second position.
Sort behaviour for text (including char and varchar as well as the text type) depends on the current collation of your locale.
See previous closely related questions:
PostgreSQL Sort
https://stackoverflow.com/q/21006868/398670
If you want to do a simplistic sort by ASCII value, rather than a properly localized sort following your local language rules, you can use the COLLATE clause:
select *
from test
order by title COLLATE "C" ASC
or change the database collation globally (requires dump and reload, or full reindex). On my Fedora 19 Linux system, I get the following results:
regress=> SHOW lc_collate;
lc_collate
-------------
en_US.UTF-8
(1 row)
regress=> WITH v(title) AS (VALUES ('#a'), ('a'), ('#'), ('a#a'), ('a#'))
SELECT title FROM v ORDER BY title ASC;
title
-------
#
a
#a
a#
a#a
(5 rows)
regress=> WITH v(title) AS (VALUES ('#a'), ('a'), ('#'), ('a#a'), ('a#'))
SELECT title FROM v ORDER BY title COLLATE "C" ASC;
title
-------
#
#a
a
a#
a#a
(5 rows)
PostgreSQL uses your operating system's collation support, so it's possible for results to vary slightly from host OS to host OS. In particular, at least some versions of Mac OS X have significantly broken Unicode collation handling.
It seems that, when sorting, Oracle as well as Postgres simply ignore non-alphanumeric characters, e.g.
select '*'
union all
select '#'
union all
select 'A'
union all
select '*E'
union all
select '*B'
union all
select '#C'
union all
select '#D'
order by 1 asc
returns (note that the DBMS pays no attention to the prefix before 'A'..'E')
*
#
A
*B
#C
#D
*E
In your case, what Postgres actually sorts is
'', 'A' and 'Example'
If you put '#' in the middle of the string, the behaviour will be the same:
select 'A#B'
union all
select 'AC'
union all
select 'A#D'
union all
select 'AE'
order by 1 asc
returns (# is ignored, so 'AB', 'AC', 'AD' and 'AE' are what is actually compared)
A#B
AC
A#D
AE
To change the comparison rules you should use a collation, e.g.
select '#' collate "POSIX"
union all
select 'A' collate "POSIX"
union all
select '#Example' collate "POSIX"
order by 1 asc
returns (as required in your case)
#
#Example
A
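The two orderings can be reproduced outside the database with a short Python sketch. Sorting by code point is comparable to COLLATE "C" (or "POSIX") for ASCII text. The `alnum_key` function is only a rough approximation of a locale-aware sort: it compares letters and digits alone and makes no attempt to model how a real collation breaks ties, so treat it as an illustration rather than a description of what PostgreSQL does internally.

```python
import re

titles = ["#", "A", "#Example"]

# Code-point order, comparable to COLLATE "C" for ASCII text.
c_order = sorted(titles)

def alnum_key(s):
    """Keep only ASCII letters and digits (crude stand-in for a locale-aware first pass)."""
    return re.sub(r"[^0-9A-Za-z]", "", s)

# Punctuation is ignored, so "#Example" is compared as "Example" and lands after "A".
locale_like = sorted(titles, key=alnum_key)

# The five-value example from the psql session, in code-point order.
c_order_five = sorted(["#a", "a", "#", "a#a", "a#"])

print(c_order)       # ['#', '#Example', 'A']
print(locale_like)   # ['#', 'A', '#Example']
print(c_order_five)  # ['#', '#a', 'a', 'a#', 'a#a']
```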

Group by Error: PG::GroupingError: ERROR: column must appear in the GROUP BY clause or be used in an aggregate function [duplicate]

I am getting this error with PostgreSQL in production mode, but it works fine with SQLite3 in development mode.
ActiveRecord::StatementInvalid in ManagementController#index
PG::Error: ERROR: column "estates.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT "estates".* FROM "estates" WHERE "estates"."Mgmt" = ...
^
: SELECT "estates".* FROM "estates" WHERE "estates"."Mgmt" = 'Mazzey' GROUP BY user_id
@myestate = Estate.where(:Mgmt => current_user.Company).group(:user_id).all
If user_id is the PRIMARY KEY then you need to upgrade PostgreSQL; newer versions will correctly handle grouping by the primary key.
If user_id is neither unique nor the primary key for the 'estates' relation in question, then this query doesn't make much sense, since PostgreSQL has no way to know which value to return for each column of estates where multiple rows share the same user_id. You must use an aggregate function that expresses what you want, like min, max, avg, string_agg, array_agg, etc or add the column(s) of interest to the GROUP BY.
Alternately you can rephrase the query to use DISTINCT ON and an ORDER BY if you really do want to pick a somewhat arbitrary row, though I really doubt it's possible to express that via ActiveRecord.
Some databases - including SQLite and MySQL - will just pick an arbitrary row. This is considered incorrect and unsafe by the PostgreSQL team, so PostgreSQL follows the SQL standard and considers such queries to be errors.
If you have:
col1 col2
fred 42
bob 9
fred 44
fred 99
and you do:
SELECT col1, col2 FROM mytable GROUP BY col1;
then it's obvious that you should get the row:
bob 9
but what about the result for fred? There is no single correct answer to pick, so the database will refuse to execute such unsafe queries. If you wanted the greatest col2 for any col1 you'd use the max aggregate:
SELECT col1, max(col2) AS max_col2 FROM mytable GROUP BY col1;
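The fred/bob example can be tried with Python's built-in `sqlite3` module, which also shows the behaviour difference mentioned earlier. This is a sketch against an in-memory SQLite database, not PostgreSQL: SQLite accepts the ambiguous query and returns some row for fred, whereas PostgreSQL rejects it with the GROUP BY error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 TEXT, col2 INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("fred", 42), ("bob", 9), ("fred", 44), ("fred", 99)],
)

# Unambiguous: exactly one aggregated value per group.
max_rows = conn.execute(
    "SELECT col1, max(col2) AS max_col2 FROM mytable GROUP BY col1 ORDER BY col1"
).fetchall()

# Ambiguous: SQLite runs this and picks one of fred's rows;
# PostgreSQL would refuse it with the GROUP BY error instead.
bare_rows = conn.execute(
    "SELECT col1, col2 FROM mytable GROUP BY col1 ORDER BY col1"
).fetchall()

print(max_rows)   # [('bob', 9), ('fred', 99)]
print(bare_rows)  # bob is always 9; which value fred gets is up to SQLite
conn.close()
```

This is why the same Rails code can pass in a SQLite development setup and fail in PostgreSQL production.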
I recently moved from MySQL to PostgreSQL and encountered the same issue. Just for reference, the best approach I've found is to use DISTINCT ON as suggested in this SO answer:
Elegant PostgreSQL Group by for Ruby on Rails / ActiveRecord
This will let you get one record for each unique value in your chosen column that matches the other query conditions:
MyModel.where(:some_col => value).select("DISTINCT ON (unique_col) *")
I prefer DISTINCT ON because I can still get all the other column values in the row. DISTINCT alone will only return the value of that specific column.
After receiving this error often myself, I realised that Rails (I am using Rails 4) automatically adds an 'order by id' at the end of your grouping query. This often results in the error above. So make sure you append your own .order(:group_by_column) at the end of your Rails query, so that you have something like this:
@problems = Problem.select('problems.username, sum(problems.weight) as weight_sum').group('problems.username').order('problems.username')
@myestate1 = Estate.where(:Mgmt => current_user.Company)
@myestate = @myestate1.select("DISTINCT(user_id)")
This is what I did.
