bosun with influxdb valid result - influxdb

Is there a simple test to make sure I have proper influxdb communication?
My configuration looks like this
influxHost = influxhost:8086
smtpHost = mail:25
emailFrom = user#domain.com
template cpu {
body = `Alert definition:
Name: {{.Alert.Name}}
Crit: {{.Alert.Crit}}
Tags:{{range $k, $v := .Tags}}
{{$k}}: {{$v}}{{end}}
`
subject = cpu idle at {{.Alert.Vars.q | .E}} on {{.Tags.host}}
}
notification default {
email = user#domain.com
next = default
timeout = 1h
}
In the Bosun expression evaluator I am running:
influx("db",'''SELECT mean(usage_idle) FROM "cpu" group by host''',"10m","","2m")
I keep getting
influx: did not get a valid result from InfluxDB

Make sure you have the correct Influx database and that there is data in the specified time range. I usually run the query from the InfluxDB admin site first.
Then insert the query into the influx(...) expression.
Bosun will add the time conditions to the WHERE and GROUP BY clauses as needed, so the full InfluxQL generated should be something like:
SELECT mean(usage_idle) FROM cpu WHERE time >= '2016-12-07 20:00:00' AND time <= '2016-12-07 20:10:00' GROUP BY host,time(2m)
If it still doesn't work, try SELECT * FROM cpu on the admin page to see what data is in the measurement (Telegraf has gone through a few changes). Also note that in recent versions you probably want to add cpu = 'cpu-total' to the WHERE clause to get the overall average.
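To make the expansion described above concrete, here is a minimal Python sketch of how time conditions could be appended to a base query. It is not Bosun's actual code: the function name expand_query and its parameters are made up for illustration, and it assumes a base query with no WHERE clause of its own.

```python
from datetime import datetime, timedelta


def expand_query(select_from, group_tags, end, window, interval):
    """Append a time range and a time() grouping to a base InfluxQL query.

    Hypothetical helper for illustration only; it assumes `select_from`
    has no WHERE or GROUP BY clause of its own.
    """
    start = end - window
    fmt = "%Y-%m-%d %H:%M:%S"
    where = (
        f"time >= '{start.strftime(fmt)}' "
        f"AND time <= '{end.strftime(fmt)}'"
    )
    # Tags come first, then the time bucket, as in the generated query above.
    group_by = ",".join(list(group_tags) + [f"time({interval})"])
    return f"{select_from} WHERE {where} GROUP BY {group_by}"


query = expand_query(
    "SELECT mean(usage_idle) FROM cpu",
    ["host"],
    end=datetime(2016, 12, 7, 20, 10, 0),
    window=timedelta(minutes=10),   # the "10m" start duration
    interval="2m",                  # the "2m" group-by interval
)
print(query)
```

Pasting the printed query into the admin page is a quick way to check whether the database returns anything for that window before blaming the Bosun configuration.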

Related

batch query is not allowed to request data from "derivatives"."autogen"

Good afternoon,
I have created the following TICKscript with a standard TICK stack setup,
which includes InfluxDB (latest version) and Kapacitor (latest version):
dbrp "derivatives"."default"
var data = batch
|query('select sum(value) from "derivatives"."default".derivative_test where time > now() - 10m')
.every(1m)
.period(2m)
var slope = data
|derivative('value')
.as('slope')
.unit(2m)
slope
|eval(lambda: ("slope" - "value") / "value")
.as('percentage')
|alert()
.crit(lambda: "percentage" <= -50)
.id('derivative_test_crit')
.message('{{ .Level }}: DERIVATIVE FOUND!')
.topic('derivative')
// DEBUGGING
|influxDBOut()
.database('derivatives')
.measurement('derivative_logs')
.tag('sum', 'sum')
.tag('slope', 'slope')
.tag('percentage', 'percentage')
But every time I try to define it, I get the following message:
batch query is not allowed to request data from "derivatives"."autogen"
I never had this problem before with streams, but every batch TICKscript I write returns the same message.
My Kapacitor user has full admin privileges and I am able to get the data via a curl request. Does anyone have any idea what the problem could be?
My thanks in advance.
Change this
dbrp "derivatives"."default"
var data = batch
|query('select sum(value) from "derivatives"."default".derivative_test where time > now() - 10m')
to this:
dbrp "derivatives"."autogen"
var data = batch
|query('select sum(value) from "derivatives"."autogen".derivative_test where time > now() - 10m')
It might not be obvious, but the retention policy is most likely incorrect.
If you run SHOW RETENTION POLICIES on the derivatives database you will see its retention policies (RPs). I suspect you have an RP named autogen, which is the default RP. However, an RP named "default" doesn't normally exist unless you create it; the word only signifies which RP is the default one.
The retention policy documentation and the database documentation might help clear it up.
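As a rough illustration of why the names must match, here is a toy Python model of the kind of check that could produce this message. It is not Kapacitor's code; check_batch_query is a hypothetical function, and the sketch assumes that the database and retention policy a query ends up using are compared against the pairs declared with dbrp.

```python
def check_batch_query(allowed_dbrps, database, retention_policy):
    """Raise if the (database, retention policy) pair was not declared.

    Hypothetical check for illustration; not Kapacitor's implementation.
    """
    if (database, retention_policy) not in allowed_dbrps:
        raise PermissionError(
            "batch query is not allowed to request data from "
            f'"{database}"."{retention_policy}"'
        )


# The script declares "derivatives"."default" ...
declared = {("derivatives", "default")}

# ... but (assumption) the data is read from the autogen retention policy.
try:
    check_batch_query(declared, "derivatives", "autogen")
    error = None
except PermissionError as exc:
    error = str(exc)
print(error)

# Declaring the retention policy that really exists makes the check pass.
check_batch_query({("derivatives", "autogen")}, "derivatives", "autogen")
```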

InfluxDB: How to create a continuous query to calculate delta values?

I'd like to calculate the delta values for a series of measurements stored in an InfluxDB. The values are readings from an electricity meter taken every 5 minutes. The values increase over time. Here is subset of the data to give you an idea (commands shown below are executed in the InfluxDB CLI):
> SELECT "Haushaltstromzaehler - cnt" FROM "myhome_measurements" WHERE time >= '2018-02-02T10:00:00Z' AND time < '2018-02-02T11:00:00Z'
name: myhome_measurements
time Haushaltstromzaehler - cnt
---- --------------------------
2018-02-02T10:00:12.610811904Z 11725.638
2018-02-02T10:05:11.242021888Z 11725.673
2018-02-02T10:10:10.689827072Z 11725.707
2018-02-02T10:15:12.143326976Z 11725.736
2018-02-02T10:20:10.753357056Z 11725.768
2018-02-02T10:25:11.18448512Z 11725.803
2018-02-02T10:30:12.922032896Z 11725.837
2018-02-02T10:35:10.618788096Z 11725.867
2018-02-02T10:40:11.820355072Z 11725.9
2018-02-02T10:45:11.634203904Z 11725.928
2018-02-02T10:50:11.10436096Z 11725.95
2018-02-02T10:55:10.753853952Z 11725.973
Calculating the differences in the InfluxDB CLI is pretty straightforward with the difference() function. This gives me the electricity consumed within each 5-minute interval:
> SELECT difference("Haushaltstromzaehler - cnt") FROM "myhome_measurements" WHERE time >= '2018-02-02T10:00:00Z' AND time < '2018-02-02T11:00:00Z'
name: myhome_measurements
time difference
---- ----------
2018-02-02T10:05:11.242021888Z 0.03499999999985448
2018-02-02T10:10:10.689827072Z 0.033999999999650754
2018-02-02T10:15:12.143326976Z 0.02900000000045111
2018-02-02T10:20:10.753357056Z 0.0319999999992433
2018-02-02T10:25:11.18448512Z 0.03499999999985448
2018-02-02T10:30:12.922032896Z 0.033999999999650754
2018-02-02T10:35:10.618788096Z 0.030000000000654836
2018-02-02T10:40:11.820355072Z 0.03299999999944703
2018-02-02T10:45:11.634203904Z 0.028000000000247383
2018-02-02T10:50:11.10436096Z 0.02200000000084401
2018-02-02T10:55:10.753853952Z 0.02299999999922875
Where I struggle is getting this to work in a continuous query. Here is the command I used to set up the continuous query:
CREATE CONTINUOUS QUERY cq_Haushaltstromzaehler_cnt ON myhomedb
BEGIN
SELECT difference(sum("Haushaltstromzaehler - cnt")) AS "delta" INTO "Haushaltstromzaehler_delta" FROM "myhome_measurements" GROUP BY time(1h)
END
Looking in the InfluxDB log file I see that no data is written in the new 'delta' measurement from the continuous query execution:
...finished continuous query cq_Haushaltstromzaehler_cnt, 0 points(s) written...
After much troubleshooting and experimenting I now understand why no data is generated. Setting up a continuous query requires a GROUP BY time() clause. This in turn requires an aggregate function inside the difference() function. The problem is that the aggregate function returns only one value for the time period specified by GROUP BY time(), and the difference() function obviously cannot calculate a difference from just one value. Essentially, the continuous query executes a command like this:
> SELECT difference(sum("Haushaltstromzaehler - cnt")) FROM "myhome_measurements" WHERE time >= '2018-02-02T10:00:00Z' AND time < '2018-02-02T11:00:00Z' GROUP BY time(1h)
>
I'm now somewhat clueless as to how to make this work and would appreciate any advice you might have.
Does using the last() function help? I have not tested this as a continuous query yet.
Select difference(last(T1_Consumed)) AS T1_Delta, difference(last(T2_Consumed)) AS T2_Delta
from P1Data
where time >= 1551648871000000000 group by time(1h)
DIFFERENCE() calculates the delta from the "aggregated" value taken from the previous group, not within the current group.
So feel free to use a selector function there. Since your counters seem to be cumulative, LAST() should work well.
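The behaviour described above can be sketched in plain Python: take the last reading of each hourly bucket, then subtract the previous bucket's value. The readings below are invented integers (not the meter data from the question), and hourly_deltas is a hypothetical helper, not an InfluxDB API.

```python
from datetime import datetime


def hourly_deltas(points):
    """Emulate difference(last(value)) with GROUP BY time(1h).

    `points` is an iterable of (timestamp, cumulative_value) pairs.
    """
    last_per_hour = {}
    for ts, value in sorted(points):
        bucket = ts.replace(minute=0, second=0, microsecond=0)
        last_per_hour[bucket] = value  # later points overwrite earlier ones
    buckets = sorted(last_per_hour)
    # Each delta is the current bucket's last value minus the previous one's.
    return [
        (cur, last_per_hour[cur] - last_per_hour[prev])
        for prev, cur in zip(buckets, buckets[1:])
    ]


readings = [  # made-up cumulative counter values
    (datetime(2018, 2, 2, 10, 5), 100),
    (datetime(2018, 2, 2, 10, 55), 130),
    (datetime(2018, 2, 2, 11, 5), 135),
    (datetime(2018, 2, 2, 11, 55), 170),
    (datetime(2018, 2, 2, 12, 30), 200),
]
deltas = hourly_deltas(readings)
for bucket, delta in deltas:
    print(bucket.isoformat(), delta)

# With only one bucket there is no previous value, hence no output at all.
single_bucket = hourly_deltas(readings[:2])
```

The last line mirrors the empty result in the question: a query window that covers a single GROUP BY time() bucket has no previous group to subtract from.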

Query Execution Time Varies - IBM Informix - Data Studio

I am executing an SQL statement in Informix Data Studio 12.1. It takes around 50 to 60 ms to execute (for a one-day date range).
SELECT
sum( (esrt.service_price) * (esrt.confirmed_qty + esrt.pharmacy_confirm_quantity) ) AS net_amount
FROM
episode_service_rendered_tbl esrt,
patient_details_tbl pdt,
episode_details_tbl edt,
ms_mat_service_header_sp_tbl mmshst
WHERE
esrt.patient_id = pdt.patient_id
AND edt.patient_id = pdt.patient_id
AND esrt.episode_id = edt.episode_id
AND mmshst.material_service_sp_id = esrt.material_service_sp_id
AND mmshst.bill_heads_id = 1
AND esrt.delete_flag = 1
AND esrt.customer_sp_code != '0110000006'
AND pdt.patient_category_id IN(1001,1002,1003,1004,1005,1012,1013)
AND edt.episode_type ='ipd'
AND esrt.generated_date BETWEEN '2017-06-04' AND '2017-06-04';
When I execute the same statement inside a function, it takes around 35 to 40 seconds.
Please find the code below.
CREATE FUNCTION sb_pharmacy_account_summary_report_test1(START_DATE DATE,END_DATE DATE)
RETURNING VARCHAR(100),DECIMAL(10,2);
DEFINE v_sale_credit_amt DECIMAL(10,2);
BEGIN
SELECT
sum( (esrt.service_price) * (esrt.confirmed_qty +
esrt.pharmacy_confirm_quantity) ) AS net_amount
INTO
v_sale_credit_amt
FROM
episode_service_rendered_tbl esrt,
patient_details_tbl pdt,
episode_details_tbl edt,
ms_mat_service_header_sp_tbl mmshst
WHERE
esrt.patient_id = pdt.patient_id
AND edt.patient_id = pdt.patient_id
AND esrt.episode_id = edt.episode_id
AND mmshst.material_service_sp_id = esrt.material_service_sp_id
AND mmshst.bill_heads_id = 1
AND esrt.delete_flag = 1
AND esrt.customer_sp_code != '0110000006'
AND pdt.patient_category_id IN(1001,1002,1003,1004,1005,1012,1013)
AND edt.episode_type ='ipd'
AND esrt.generated_date BETWEEN START_DATE AND END_DATE;
RETURN 'SALE CREDIT','' with resume;
RETURN 'IP SB Credit Amount',v_sale_credit_amt;
END
END FUNCTION;
Can someone tell me the reason for this difference in execution time?
In very simple terms:
When you create a function, the SQL is parsed and stored in the database together with its optimization information. When you call the function, the optimizer already knows the SQL and just executes it, so optimization is done only once, when the function is created.
When you run the SQL directly, the optimizer parses it, optimizes it, and then executes it, every time you run it.
This explains the time difference.
I would say the difference in time is due to the parameterized query.
The first SQL has hardcoded date values, while the one in the SPL has parameters. That may cause a different query plan (e.g. which index to follow) to be applied to the query in the SPL than to the one executed from Data Studio.
You can try getting the query plan (using SET EXPLAIN) from the first SQL and then use directives in the SPL to force the engine to use that same path.
Have a look at:
https://www.ibm.com/support/knowledgecenter/SSGU8G_12.1.0/com.ibm.perf.doc/ids_prf_554.htm
It explains how to use optimizer directives to speed up queries.

How to get CPU usage percentage measured by Collectd in InfluxDB

I'm collecting cpu usage measured in jiffies by Collectd 5.4.0 and then storing the results in InfluxDB 0.9.4. I use the following query to get cpu percentage from InfluxDB:
SELECT MEAN(value) FROM cpu_value WHERE time >= '' and time <= '' GROUP BY type,type_instance
But when I plot the result it makes no sense: there is no pattern in the CPU usage. Please let me know if I am doing something wrong.
Thanks
Since Collectd 5.5 you can get values in percentage instead of jiffies:
<Plugin cpu>
ReportByState = true
ReportByCpu = true
ValuesPercentage = true
</Plugin>
Then you can write query like:
SELECT mean("value") FROM "cpu_value" WHERE
"type_instance" =~ /user|system|nice|irq/
AND "type" = 'percent' AND $timeFilter
GROUP BY time($interval), "host"
If you can upgrade, that might be the easiest option. Otherwise you can:
precompute percentage on client
use other client for reporting stats (such as statsd, telegraf, etc.)
With InfluxDB 0.12 you can perform arithmetic operations between fields like:
SELECT MEAN(usage_system) + MEAN(usage_user) AS cpu_total
FROM cpu
However, to use this you would have to report user|system|nice|irq from collectd as FIELDS, not TAGS.
This is my query. I use it with the percent unit (on the Axes tab), but stack + percent (on the Display tab) makes sense as well:
SELECT non_negative_derivative(mean("value"), 1s) FROM "cpu_value" WHERE "type_instance" =~ /(idle|system|user|wait)/ AND $timeFilter GROUP BY time($interval), "type_instance" fill(null)
non_negative_derivative(..., 1s) can be replaced with derivative(..., 1s); I use the non-negative variant because I was getting some negative values when data points were missing.
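For readers unfamiliar with what a non-negative derivative does to a cumulative counter such as jiffies, here is a small Python sketch of the idea. It illustrates the concept only and is not InfluxDB's implementation; non_negative_rate and the sample values are made up.

```python
def non_negative_rate(samples):
    """Per-second rate of change between consecutive samples.

    `samples` is a time-sorted list of (seconds, counter_value) pairs.
    Negative rates (e.g. after a counter reset) are dropped.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        rate = (v1 - v0) / (t1 - t0)
        if rate >= 0:
            rates.append((t1, rate))
    return rates


# Made-up jiffies counter sampled every 10 s; it resets between t=20 and t=30.
samples = [(0, 1000), (10, 1250), (20, 1500), (30, 40), (40, 290)]
rates = non_negative_rate(samples)
print(rates)
```

A plain derivative would also report the large negative jump at t=30, which is what shows up as a spike on a graph.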

Query to postgreSQL from RoR slower than from prompt?

I am querying a postgreSQL DB from my ruby on rails application this way:
var = Map.connection.execute("
SELECT *
FROM shortest_path('SELECT * FROM japan WHERE japan.geom_way && ST_MakeEnvelope(139.68012, 35.63993, 139.71918, 35.66024)', 242945, 582735, false, false)
JOIN japan ON edge_id = id;")
The execution time shown in the rails server console is 327.8 ms.
I execute an identical query from the psql prompt:
SELECT *
FROM shortest_path('SELECT * FROM japan WHERE japan.geom_way && ST_MakeEnvelope(139.68012, 35.63993, 139.71918, 35.66024)', 242945, 582735, false, false)
JOIN japan ON edge_id = id;
The execution time is 53.108 ms.
I thought that some caching could be the reason for the different execution times, but if I execute the same query twice in a row in the Rails application, the execution time of each query doesn't change. For instance:
var = Map.connection.execute("SELECT * FROM shortest_path('SELECT * FROM japan WHERE japan.geom_way && ST_MakeEnvelope(139.68012, 35.63993, 139.71918, 35.66024)', 242945, 582735, false, false) JOIN japan ON edge_id = id;")
var = Map.connection.execute("SELECT * FROM shortest_path('SELECT * FROM japan WHERE japan.geom_way && ST_MakeEnvelope(139.68012, 35.63993, 139.71918, 35.66024)', 242945, 582735, false, false) JOIN japan ON edge_id = id;")
gives execution times of 330.7 ms and 327.8 ms.
Since the 2 queries are identical, shouldn't I expect the same execution time in RoR and in the prompt?
Thanks in advance for any idea.
Look at http://www.depesz.com/2008/05/10/prepared-statements-gotcha/ - maybe the reason is similar?
One is using Ruby and one isn't.
Also you haven't indicated if there are any network transport effects, and if there are network differences, how many rows are being returned across the wire.
You haven't said how you are timing these things. If the time includes client-side processing in the Ruby case, then in the second case, there is obviously less processing and also the prompt doesn't include transfer time or time to process the results into the display as part of the "execution time" that it is reporting to you.
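One way to see how much of a reported time is client-side work is to time the query call and the result processing separately. The sketch below does this in Python with an in-memory SQLite database as a stand-in; it is only an analogy for the Rails/PostgreSQL setup in the question, and the table and column names are invented.

```python
import sqlite3
import time

# Stand-in database with some rows to fetch (names are hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (id INTEGER PRIMARY KEY, cost REAL)")
conn.executemany(
    "INSERT INTO edges (cost) VALUES (?)",
    [(i * 0.5,) for i in range(10000)],
)

t0 = time.perf_counter()
cursor = conn.execute("SELECT id, cost FROM edges ORDER BY id")
t1 = time.perf_counter()
# Client-side work: pull every row and build objects from it.
rows = [{"id": r[0], "cost": r[1]} for r in cursor.fetchall()]
t2 = time.perf_counter()

execute_ms = (t1 - t0) * 1000
total_ms = (t2 - t0) * 1000
print(f"execute only: {execute_ms:.3f} ms, with fetch and processing: {total_ms:.3f} ms")
```

If the two numbers differ a lot, the gap between a client library's reported time and the database prompt's reported time may simply reflect what each one includes.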
