InfluxDB: selecting values containing quotes

I have some tag values that were unfortunately sent with quotes in them:
> SELECT count("count") FROM "railswebapp"
WHERE "auth_method" =~ /facebook/ GROUP BY "auth_method"
name: railswebapp
tags: auth_method=\"facebook\"
time count
---- -----
0 4
name: railswebapp
tags: auth_method=facebook
time count
---- -----
0 2632
>
Alas, querying for the "facebook" series is harder than I expected:
> SELECT "count" FROM "railswebapp"
WHERE "auth_method" = '\"facebook\"'
>
This work-around works but surely I can do better. Any suggestions?
> SELECT count("count") FROM "railswebapp"
WHERE "auth_method" =~ /facebook/
AND "auth_method" != 'facebook'
GROUP BY "auth_method"
name: railswebapp
tags: auth_method=\"facebook\"
time count
---- -----
0 4
> SELECT count FROM "railswebapp"
WHERE "auth_method" =~ /facebook/
AND "auth_method" != 'facebook'
GROUP BY "auth_method"
name: railswebapp
tags: auth_method=\"facebook\"
time count
---- -----
152412927875202308 1
152412927882740082 1
1524130761574200511 1
1524134859852346944 1
>
(Note: influx doesn't support line breaks in queries: they just make this question more readable.)

The InfluxDB documentation says not to double- or single-quote measurement names, tag keys, tag values, and field keys.
Here is how quoting is handled:
Do not double or single quote measurement names, tag keys, tag values, and field keys. It is valid Line Protocol but InfluxDB assumes that the quotes are part of the name.
Never single quote field values (even if they’re strings!). It’s also not valid Line Protocol.
Do not double quote field values that are floats, integers, or Booleans. InfluxDB will assume that those values are strings.
It also mentions:
For string field values use a backslash character \ to escape:
Example:
Inserting data with a tag value containing double quotes:
INSERT cpu,host="x" value=10
Querying:
select * from cpu where host = '\"x\"'
Output:
name: cpu
time host value
---- ---- -----
1530754262442056777 "x" 10
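Since the quotes become part of the stored tag value, one option is to sanitize values on the client before they are written. The Python sketch below is only an illustration: `clean_tag_value` and `tag_set` are hypothetical helper names, not part of any InfluxDB client library. It assumes the only cleanup needed is dropping one pair of wrapping quotes and backslash-escaping the commas, equals signs, and spaces that Line Protocol treats specially in tag values.

```python
def clean_tag_value(raw):
    """Strip one pair of wrapping quotes, then escape Line Protocol specials.

    Hypothetical helper for illustration; not an InfluxDB client API.
    """
    value = raw
    # Drop a single matching pair of surrounding quotes: '"facebook"' -> 'facebook'.
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        value = value[1:-1]
    # In tag values, commas, equals signs and spaces must be backslash-escaped.
    for ch in (",", "=", " "):
        value = value.replace(ch, "\\" + ch)
    return value


def tag_set(tags):
    """Render a tag set as key=value pairs, sorted by key.

    Assumes tag keys are already safe; only the values are cleaned here.
    """
    return ",".join(
        "{}={}".format(key, clean_tag_value(val)) for key, val in sorted(tags.items())
    )


print("railswebapp," + tag_set({"auth_method": '"facebook"'}) + " count=1i")
# prints: railswebapp,auth_method=facebook count=1i
```

With values cleaned this way, new points land in the unquoted series, so a plain WHERE "auth_method" = 'facebook' matches them. Points already written with quotes are not changed by this.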

Related

How to select some fields/tags from an influx DB?

I am able to make a query to an influxdb and select all the fields/tags:
select * from http_reqs where time > now() - 4d and "status" =~ /^4/
which returns a list of matching values. The first row looks like this:
time error error_code method name proto scenario status tls_version type url value
But when I try to select only a subset of these fields/tags (according to the documentation), I get no result at all:
select "time","name" from http_reqs where time > now() - 4d and "status" =~ /^4/
This happens no matter what I try to select. The documentation seems to be wrong!
How can I select only the fields/tags I want?
It seems you first have to figure out which columns are "fields" and which are "tags". So you need to run:
show field keys from http_reqs;
in my case this returns
name: http_reqs
fieldKey fieldType
-------- ---------
url string
value float
Per the documentation, the SELECT clause must include at least one field key; a query that selects only tags returns no data.
You can then select the tags you need, as long as at least one field key is included (whether or not you need that field):
select "url","error" from http_reqs where time > now() - 4d and "status" =~ /^4/
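The rule can also be enforced when queries are generated in code. The sketch below is a hypothetical helper (`build_select` is not part of any InfluxDB client library) that appends a field key whenever the requested columns are all tags. It assumes the field keys have already been obtained, for example from SHOW FIELD KEYS, and that the identifiers contain no embedded double quotes.

```python
def build_select(measurement, wanted, field_keys):
    """Return an InfluxQL SELECT that always names at least one field key.

    Hypothetical helper; `field_keys` is assumed to come from SHOW FIELD KEYS.
    """
    if not field_keys:
        raise ValueError("measurement has no field keys")
    columns = list(wanted)
    # A SELECT that names only tags returns nothing, so add a field if none is present.
    if not any(col in field_keys for col in columns):
        columns.append(field_keys[0])
    quoted = ", ".join('"{}"'.format(col) for col in columns)
    return 'SELECT {} FROM "{}"'.format(quoted, measurement)


print(build_select("http_reqs", ["name", "error"], ["url", "value"]))
# prints: SELECT "name", "error", "url" FROM "http_reqs"
```

If the requested columns already include a field, the list is left unchanged.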

InfluxDB cannot select field key

I'm new to InfluxDB. I have an existing database with a table language. When I run select * from language I get the following table:
name: language
time application_guid application_name application_type instance_index lang metric_type stream_name value
---- ---------------- ---------------- ---------------- -------------- ---- ----------- ----------- -----
2019-03-07T07:46:49.225Z 31429 counter sink 0 ar counter tweetlang 0
2019-03-07T07:46:49.225Z 31429 counter sink 0 ca counter tweetlang 0
2019-03-07T07:46:49.225Z 31429 counter sink 0 de counter tweetlang 0
2019-03-07T07:46:49.225Z 31429 counter sink 0 el counter tweetlang 0
When I run select "lang" from language I get an empty result. What is the problem here?
Found the solution here:
The SELECT clause must specify at least one field when it includes a tag.
According to the documentation, the SELECT clause can name field keys alone, or field keys and tag keys together. Tag keys are indexed; fields are the columns you defined to hold values in the measurement.
If a SELECT query names only tag keys, it won't return any result.
In your measurement named language, lang is a tag key.
If we assume your field names are as follows:
application_guid application_name application_type instance_index
and your tags are:
lang metric_type stream_name value
then the select query can be written in either of these ways:
select * from language
select application_guid,lang from language
If you want to give several fields and tags in the statement and mark which is which:
select application_guid,application_type::field, metric_type,stream_name::tag from language
In addition, InfluxDB uses Line Protocol, which informs InfluxDB of each data point's measurement, tag set, field set, and timestamp.

InfluxDB 1.7.2 - Top X over time

I’m new to InfluxDB. I’m using it to store ntopng timeseries data.
ntopng writes a measurement called asn:traffic that stores how many bytes were sent and received for an ASN.
> show tag keys from "asn:traffic"
name: asn:traffic
tagKey
------
asn
ifid
> show field keys from "asn:traffic"
name: asn:traffic
fieldKey fieldType
-------- ---------
bytes_rcvd float
bytes_sent float
>
I can run a query to see the data rate in bps for a specific ASN:
> SELECT non_negative_derivative(mean("bytes_rcvd"), 1s) * 8 FROM "asn:traffic" WHERE "asn" = '2906' AND time >= now() - 12h GROUP BY time(30s) fill(none)
name: asn:traffic
time non_negative_derivative
---- -----------------------
1550294640000000000 30383200
1550294700000000000 35639600
...
...
...
>
However, what I would like to do is create a query that I can use to return the top N ASNs by data rate and plot that on a Grafana graph. Sort of like this example that is using ELK.
I've tried a few variants from posts here and elsewhere, but I haven't been able to get what I'm after. For example, this query I think gets me closer to where I want to be, but there are no values in asn:
> select top(bps,asn,10) from (SELECT non_negative_derivative(mean(bytes_rcvd), 1s) * 8 as bps FROM "asn:traffic" WHERE time >= now() - 12h GROUP BY time(30s) fill(none))
name: asn:traffic
time top asn
---- --- ---
1550299860000000000 853572800
1550301660000000000 1197327200
1550301720000000000 1666883866.6666667
1550310780000000000 674889600
1550329320000000000 20979431866.666668
1550332740000000000 707015600
1550335920000000000 2066646533.3333333
1550336820000000000 618554933.3333334
1550339280000000000 669084933.3333334
1550340300000000000 704147333.3333334
>
I then thought that perhaps the subquery needs to select asn as well, but that produces an error about mixing aggregate and non-aggregate queries:
> select top(bps,asn,10) from (SELECT asn, non_negative_derivative(mean(bytes_rcvd), 1s) * 8 as bps FROM "asn:traffic" WHERE time >= now() - 12h GROUP BY time(30s) fill(none))
ERR: mixing aggregate and non-aggregate queries is not supported
>
Anyone have any thoughts on a solution?
EDIT 1
Per the suggestion by George Shuklin, modifying the query to include asn in GROUP BY displays ASN in the CLI output, but that doesn't translate in Grafana. I'm expecting a stacked graph with each layer of the stacked graph being one of the top 10 asn results.
Make sure asn is a tag (it is, according to the SHOW TAG KEYS output above); then you can add it to the subquery's GROUP BY, as in GROUP BY time(30s), "asn", and the tag will be available in the outer query.

Influx-DB Uniq Query

I am trying to query InfluxDB to get unique values. Below is the query I use:
select host AS Host,(value/100) AS Load from metrics where time > now() - 1h and command='Check_load_current' and value>4000;
The output for the query is,
What I actually want is the unique "Host" values. For example, I want "host-1" to appear in the output only once (with its latest value), even though the load values are different. How can I achieve this? Any help would be much appreciated.
Q: I want the latest values from each unique "Host", how do I achieve it?
Given the following database:
time host value
---- ---- -----
1529508443000000000 host01 42.72
1529508609000000000 host05 53.94
1529508856000000000 host01 40.37
1529508913000000000 host02 41.02
1529508937000000000 host01 44.49
A: Consider breaking the problem down.
First, you can group the points into one bucket per tag value using the GROUP BY clause.
Select * from number group by "host"
name: number
tags: host=host01
time value
---- -----
1529508443000000000 42.72
1529508856000000000 40.37
1529508937000000000 44.49
name: number
tags: host=host02
time value
---- -----
1529508913000000000 41.02
name: number
tags: host=host05
time value
---- -----
1529508609000000000 53.94
Next, you will want to sort the data in each bucket in descending time order and then tell InfluxDB to return only the first row of each bucket.
Hence, add "ORDER BY DESC" and "LIMIT 1" to the first query and it should yield the desired result.
> select * from number group by "host" order by desc limit 1;
name: number
tags: host=host05
time value
---- -----
1529508609000000000 53.94
name: number
tags: host=host02
time value
---- -----
1529508913000000000 41.02
name: number
tags: host=host01
time value
---- -----
1529508937000000000 44.49
Reference:
https://docs.influxdata.com/influxdb/v1.5/query_language/data_exploration/#the-group-by-clause
https://docs.influxdata.com/influxdb/v1.5/query_language/data_exploration/#order-by-time-desc
https://docs.influxdata.com/influxdb/v1.5/query_language/data_exploration/#the-limit-and-slimit-clauses
If you want to get only the latest value for each unique host tag do the following:
SELECT host AS Host, last(value)/100 AS Load
FROM metrics
GROUP BY host
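To make the "group by host, newest first, keep one row" logic concrete, here is a minimal Python sketch of the same idea applied client-side to the sample rows from the answer above. `latest_per_host` is a hypothetical helper, not an InfluxDB API; in practice the query itself should do this work, and the sketch only illustrates what the query computes.

```python
# Sample points from the answer above: (time in ns, host tag, value field).
rows = [
    (1529508443000000000, "host01", 42.72),
    (1529508609000000000, "host05", 53.94),
    (1529508856000000000, "host01", 40.37),
    (1529508913000000000, "host02", 41.02),
    (1529508937000000000, "host01", 44.49),
]


def latest_per_host(points):
    """Return {host: (time, value)} keeping only the newest point per host."""
    latest = {}
    for ts, host, value in points:
        # Keep this point if the host is new or the point is more recent.
        if host not in latest or ts > latest[host][0]:
            latest[host] = (ts, value)
    return latest


result = latest_per_host(rows)
for host in sorted(result):
    print(host, result[host][0], result[host][1])
# prints:
# host01 1529508937000000000 44.49
# host02 1529508913000000000 41.02
# host05 1529508609000000000 53.94
```

The values match the output of the GROUP BY "host" ORDER BY DESC LIMIT 1 query shown earlier, only sorted by host name instead.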

How to get the number of entries in a measurement

I am a newbie to InfluxDB. I just started to read the InfluxDB documentation.
I can't seem to get the equivalent of 'select count(*) from table' to work in InfluxDB.
I have a measurement called cart:
time status cartid
1456116106077429261 0 A
1456116106090573178 0 B
1456116106095765618 0 C
1456116106101532429 0 D
but when I try to do
select count(cartid) from cart
I get the error
ERR: statement must have at least one field in select clause
I suppose cartid is a tag rather than a field? count() currently can't be used on tag and time columns. So if your status is a non-tag column (a field), do the count on that.
EDIT:
Reference
This works as long as no field or tag exists with the name count:
SELECT SUM(count) FROM (SELECT *,count::INTEGER FROM MyMeasurement GROUP BY count FILL(1))
If one does exist, use some other name for the count field. The query first selects all entries together with an unpopulated field (count), then groups by that unpopulated field, which does nothing except allow the fill operator to assign 1 to count for each entry. The outer query then sums the count values. The result should look like this:
name: MyMeasurement
----------------
time sum
0 47799
It's a bit hacky but it's the only way to guarantee a count of all entries when no field exists that is always present in all entries.
