influxdb - selection using bit testing? - influxdb

Is there a way in InfluxDB to use bitwise operators in a search query? For example, if I want to find all points where the 2nd bit of a tag or field value is set, I'd like to be able to do something like:
SELECT * FROM measurement WHERE tag_name & (1<<1) = true
or
SELECT * FROM measurement WHERE (tag_name >> 1) & 1 = true

Q: Can I find all the points that have the 2nd bit set in a tag or field?
A: Not for a tag, because tag values are always strings.
Yes for a field value, but filtering on it can hurt performance because fields are not indexed: a query that filters on a field value has to scan every point that matches the other conditions in the query.
The bitwise operators were only introduced in InfluxDB 1.3, so the following will not work on an earlier release.
Assume you have the following dataset:
> show field keys from measurement_abcd
name: measurement_abcd
fieldKey fieldType
-------- ---------
value integer
> select * from measurement_abcd
name: measurement_abcd
time tag1 value
---- ---- -----
1434086562000000000 2
1434087562000000000 3
1434088562000000000 4
1434089562000000000 abc 5
1434089562000000000 5
You can retrieve the rows that have the 2nd bit set in the field value with:
SELECT * FROM measurement_abcd WHERE "value" | 2 = "value"
This gives the following output:
> SELECT * FROM measurement_abcd WHERE "value" | 2 = "value"
name: measurement_abcd
time tag1 value
---- ---- -----
1434086562000000000 2
1434087562000000000 3
Note that InfluxDB's bitwise operators only work on booleans and integers. They do not work on floats.
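The predicate in the query above can be sanity-checked outside InfluxDB. Below is a minimal Python sketch, assuming the same four integer values as the sample dataset; the helper name `has_bit` is made up for illustration. It suggests that the `value | 2 = value` form and the more familiar `value & 2 != 0` form select the same rows.

```python
def has_bit(value: int, bit: int) -> bool:
    """Return True if the given bit (0-based) is set in value."""
    mask = 1 << bit
    # Parentheses matter in Python: comparisons bind tighter than & and |.
    return (value & mask) != 0

values = [2, 3, 4, 5]

# The form used in the InfluxQL query: OR-ing the mask in leaves the
# value unchanged only when the bit is already set.
via_or = [v for v in values if (v | 2) == v]

# The equivalent AND form, testing bit 1 (the 2nd bit, mask 2).
via_and = [v for v in values if has_bit(v, 1)]

print(via_or)   # [2, 3]
print(via_and)  # [2, 3]
```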
Reference:
https://docs.influxdata.com/influxdb/v1.3/query_language/math_operators/#bitwise-and
Use the following data to replicate the experiment above:
curl -i -XPOST 'http://localhost:8086/write?db=stackoverflow1' --data-binary 'measurement_abcd value=2i 1434086562000000000'
curl -i -XPOST 'http://localhost:8086/write?db=stackoverflow1' --data-binary 'measurement_abcd value=3i 1434087562000000000'
curl -i -XPOST 'http://localhost:8086/write?db=stackoverflow1' --data-binary 'measurement_abcd value=4i 1434088562000000000'
curl -i -XPOST 'http://localhost:8086/write?db=stackoverflow1' --data-binary 'measurement_abcd value=5i 1434089562000000000'
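The four writes can also be sent as one request, since the write endpoint accepts several line-protocol points separated by newlines. The Python sketch below only builds the request body and does not contact a server; the function name `build_body` is illustrative and not part of any InfluxDB client.

```python
def build_body(measurement, points):
    """Build a line-protocol body from (integer value, nanosecond timestamp) pairs."""
    # The trailing 'i' marks the field as an integer, which the bitwise
    # operators require.
    return "\n".join(f"{measurement} value={v}i {ts}" for v, ts in points)

points = [
    (2, 1434086562000000000),
    (3, 1434087562000000000),
    (4, 1434088562000000000),
    (5, 1434089562000000000),
]

body = build_body("measurement_abcd", points)
print(body)
```

The printed text can be saved to a file and posted to the same /write?db=stackoverflow1 URL with curl's --data-binary @file form.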

Related

How to list the types of a influxdb measurement fields?

I have an influxdb with a measurement named http_reqs. This measurement has several fields which I can 'list' as follows (I see 11 fields):
SELECT * FROM http_reqs LIMIT 1;
Here the first line of the output is as follows:
time error error_code method name proto scenario status tls_version type url value
I assume to have 11 fields in that 'measurement' http_reqs:
time
error
error_code
method
name
proto
scenario
status
tls_version
type
url
value
I want to know the 'type' of these fields.
For example:
Is the field status a string? Or is it an integer? Or a float? Or a boolean?
I hope my question is a bit clearer now.
I found this documentation and I can run
SHOW FIELD KEYS FROM http_reqs;
but it seems to list only 2 fields! The output is:
name: http_reqs
fieldKey fieldType
-------- ---------
url string
value float
Yes that is what I want for all of the fields! I can see, that the field url has type string, and I can see that the field value is of type float.
But 9 of the above listed fields seem to be missing. I see only two fields (url and value). I do not see the type of the field status, for example. I want to know the type of the field status.
I also can do the following query:
SHOW TAG KEYS FROM http_reqs
which gives this output:
name: http_reqs
tagKey
------
error
error_code
method
name
proto
scenario
status
tls_version
type
Interestingly, this query lists all of the 9 'missing' fields (or tags, or keys, or whatever these things are). But this output does not tell me what type the element status is, for example. I want to know the type of the element status.
How can I see the types of each of the 11 elements of the 'measurement' http_reqs?
What you call "fields" is very likely a mix of InfluxDB fields and InfluxDB tags. Tag values are always strings, which is why only fields are listed with a type.
Inspect InfluxDB fields:
SHOW FIELD KEYS FROM http_reqs
Inspect InfluxDB tags:
SHOW TAG KEYS FROM http_reqs
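Since tag values are always stored as strings, the two listings can be combined into a single type table. The Python sketch below uses the outputs reported in the question, hard-coded rather than fetched from a server; the labels such as "string (tag)" are just illustrative.

```python
# Output of SHOW FIELD KEYS FROM http_reqs, as reported in the question.
field_types = {"url": "string", "value": "float"}

# Output of SHOW TAG KEYS FROM http_reqs, as reported in the question.
tag_keys = ["error", "error_code", "method", "name", "proto",
            "scenario", "status", "tls_version", "type"]

# Tags are always strings; time is the point's timestamp.
column_types = {"time": "timestamp"}
column_types.update({key: "string (tag)" for key in tag_keys})
column_types.update(field_types)

for column, kind in column_types.items():
    print(f"{column}: {kind}")
```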

Returning the value as the first field in an influxdb query?

Is it possible to return just a value, and not a time from influxdb?
I know influxdb is a Time Series Database, so it includes the time regardless, but the tool I'm using requires the first field displayed from the query to be a value, and not a time.
I've also tried switching the order of the arguments and time is still the first argument returned in the query.
select value,time FROM smart_threshold WHERE host =~ /proxmox/ AND instance = 'sda' AND type_instance='reallocated-sector-count' limit 1
Result:
name: smart_threshold
time value
---- -----
1512183206033040777 36
Would it be possible to return:
name: smart_threshold
value time
------ ----
36 1512183206033040777
or just:
name: smart_threshold
value
------
36
instead?
As of now, there seems to be no way other than a simple man-in-the-middle script that swaps the two values in the output (in case you don't own "the tool").
Which tool is that, though?
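The man-in-the-middle script mentioned above can be quite small. Here is a minimal Python sketch, assuming the input is the whitespace-separated table printed by the influx CLI as in the question; `value_first` is a hypothetical helper name, and how the tool would be pointed at such a script is left open.

```python
def value_first(cli_output: str) -> str:
    """Move the first column (time) to the end of every table row."""
    out = []
    for line in cli_output.splitlines():
        cells = line.split()
        # Leave the 'name: ...' header and blank lines untouched.
        if len(cells) < 2 or line.startswith("name:"):
            out.append(line)
        else:
            out.append(" ".join(cells[1:] + cells[:1]))
    return "\n".join(out)

sample = """name: smart_threshold
time value
---- -----
1512183206033040777 36"""

print(value_first(sample))
```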

Influxdb querying values from 2 measurements and using SUM() for the total value

select SUM(value)
from /measurment1|measurment2/
where time > now() - 60m and host = 'hostname' limit 2;
name: measurment1
time sum
---- ---
1505749307008583382 4680247
name: measurment2
time sum
---- ---
1505749307008583382 3004489
But is it possible to get the value of SUM(measurment1 + measurment2), so that I see only a single output?
This is not possible in the Influx query language. It does not support functions across measurements.
If this is something you require, you may be interested in layering another API on top of InfluxDB that does this, such as Graphite via InfluxGraph.
For the above, something like this.
/etc/graphite-api.yaml:
finders:
  - influxgraph.InfluxDBFinder
influxdb:
  db: <your database>
  templates:
    # Produces metric paths like 'measurement1.hostname.value'
    - measurement.host.field*
Start the graphite-api/influxgraph webapp.
A query /render?from=-60min&target=sum(*.hostname.value) then produces the sum of value on tag host='hostname' for all measurements.
{measurement1,measurement2}.hostname.value can be used instead to limit it to specific measurements.
NB: for InfluxDB performance, it is best to keep multiple values in the same measurement rather than using the same value field name across multiple measurements.
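If a Graphite layer is more than is needed, the per-measurement sums can also be added up on the client. This is a different technique from the InfluxGraph setup: the Python sketch below takes the two sums from the question as hard-coded input instead of querying InfluxDB.

```python
# SUM(value) per measurement, as returned by the query in the question.
sums = {
    "measurment1": 4680247,
    "measurment2": 3004489,
}

# InfluxQL returns one result series per measurement, so the final
# addition happens here, outside the database.
total = sum(sums.values())
print(total)  # 7684736
```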

How to match string with another string that is almost the same (fuzzy matching)

In Rails, I am passing in a string: 'AE18BX21'. I am querying the database to find strings that match with the input string. However the input string and the string in the database sometimes don't match up. Sometimes there is an extra letter/number, sometimes a letter/number is missing, or sometimes the letter/number is a different letter/number.
I have tried a few different regex expressions like:
Table.where("string =~ ?", 'A+E+1+8+B+X+2+1')
Table.where("string =~ ?", '(A|.)+(E|.)+(1|.)+(8|.)+(B|.)+(X|.)+(2|.)+(1|.)')
In an ideal world, I would want it to return only the strings that match up 80% or more.
After reading your question, I think you want something like Levenshtein distance, which, as you stated in your comment, is available in Postgres.
Quoting its documentation here:
https://www.postgresql.org/docs/9.1/static/fuzzystrmatch.html
test=# SELECT levenshtein('GUMBO', 'GAMBOL');
levenshtein
-------------
2
(1 row)
test=# SELECT levenshtein('GUMBO', 'GAMBOL', 2,1,1);
levenshtein
-------------
3
(1 row)
test=# SELECT levenshtein_less_equal('extensive', 'exhaustive',2);
levenshtein_less_equal
------------------------
3
(1 row)
test=# SELECT levenshtein_less_equal('extensive', 'exhaustive',4);
levenshtein_less_equal
------------------------
4
(1 row)
Then you can build your SQL query with your desired distance:
SELECT *
FROM YourTable
WHERE levenshtein(string , 'AE18BX21') <= 2
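To relate the "80% or more" requirement to a distance, one option is to normalise the edit distance by the length of the longer string. The sketch below is plain Python, not the PostgreSQL extension. The `similarity` definition is one common convention and an assumption here, not something fuzzystrmatch provides.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with insert, delete and substitute each costing 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def similarity(a: str, b: str) -> float:
    """1.0 for identical strings, lower as more edits are needed."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest

print(levenshtein("GUMBO", "GAMBOL"))      # 2
print(similarity("AE18BX21", "AE18BX2"))   # 0.875
print(similarity("AE18BX21", "AE18CX31"))  # 0.75
```

Under this definition, two 8-character strings at distance 2 are 75% similar, so a strict 80% cut-off for such strings would use a distance of at most 1 rather than 2.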

Query Influxdb based on tags?

I have started playing around with Influxdb v0.13 and I have some dummy values in my test db where id is a tag and value is a field:
> SELECT * FROM dummy
name: dummy
--------------
time id value
1468276508161069051 1234 12345
1468276539152853428 613 352
1468276543470535110 613 4899
1468276553853436191 1234 12
I get no results returned when I run this query:
> SELECT * FROM dummy WHERE id=1234
but I do get the following when querying with the field instead:
> SELECT * FROM dummy WHERE value=12
name: dummy
--------------
time id value
1468276553853436191 1234 12
Am I doing something wrong here? I thought the point of tags was to be queried (since they are indexed and fields are not), but they seem to break my queries.
InfluxDB treats every tag key and tag value we insert as a string, as stated in the official documentation.
See: https://docs.influxdata.com/influxdb/v0.13/guides/writing_data/
When writing points, you must specify an existing database in the db query parameter. See the HTTP section on the Write Syntax page for a complete list of the available query parameters. The body of the POST - we call this the Line Protocol - contains the time-series data that you wish to store. They consist of a measurement, tags, fields, and a timestamp. InfluxDB requires a measurement name. Strictly speaking, tags are optional but most series include tags to differentiate data sources and to make querying both easy and efficient. Both tag keys and tag values are strings.
Note the last sentence: both tag keys and tag values are strings.
Hence, to filter on a tag value, the value must be single-quoted in the query.
Example:
SELECT * FROM dummy WHERE id='1234'
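When queries are assembled in code, the quotes are easy to forget if the tag value looks numeric. A small Python sketch follows; `tag_filter` is a hypothetical helper, and it simply rejects values containing quotes or backslashes rather than attempting to escape them.

```python
def tag_filter(key: str, value) -> str:
    """Build an InfluxQL condition comparing a tag to a string literal."""
    text = str(value)  # tag values are strings, even when they look numeric
    if any(ch in text for ch in "'\\"):
        raise ValueError("refusing to quote a value containing ' or \\")
    # Double quotes mark the identifier, single quotes the string literal.
    return f"\"{key}\" = '{text}'"

query = "SELECT * FROM dummy WHERE " + tag_filter("id", 1234)
print(query)  # SELECT * FROM dummy WHERE "id" = '1234'
```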
