Setup monitoring alert on metrics in sumologic - sumologic

We have a metrics setup in Sumo Logic that gives us the latest row_count of the tables in our system (it polls the row_count from the DB every hour). I want to raise an alert if the row_count for any of the tables decreases. So far I have tried the query predicate:
metrics=some_metric table_name=some_table | where latest < max
But it seems that max and latest both get decremented to the new row_count and the query fails. Is there a way to implement this using metrics?

Related

How to send non aggregated metric to Influx from Springboot application?

I have a SpringBoot application that is under moderate load. I want to collect metric data for a few of the operations of my app. I am majorly interested in Counters and Timers.
I want to count the number of times a method was invoked (# of invocation over a window, for example, #invocation over last 1 day, 1 week, or 1 month)
If the method produces any unexpected result increase failure count and publish a few tags with that metric
I want to time a couple of expensive methods, i.e. I want to see how much time that method took, and I also want to publish a few tags with the metrics to get more context
I have tried StatsD-SignalFx and Micrometer-InfluxDB, but both these solutions have some issues I could not solve
StatsD aggregates the data over flush window and due to aggregation metric tags get messed up. For example, if I send 10 events in a flush window with different tag values, and the StatsD agent aggregates those events and publishes only one event with counter = 10, then I am not sure what tag values it's sending with aggregated data
The Micrometer-InfluxDB setup has its own problems, one of them being Micrometer sending 0 values for counters if no new metric is produced, and in that fake (0 value) counter it uses the same tag values from the last valid (non-zero) counter
I am not sure how, but Micrometer also does some sort of aggregation at the client-side in MeterRegistry I believe, because I was getting a few counters with a value of 0.5 in InfluxDB
Next, I am planning to explore Micrometer/StatsD + Telegraf + Influx + Grafana to see if it suits my use case.
Questions:
How to avoid metric aggregation till it reaches the data store (InfluxDB). I can do the required aggregation in Grafana
Is there any standard solution to the problem that I am trying to solve?
Any other suggestion or direction for my use case?

InfluxDB: max-series-per-database incorrect error

When trying to put some data into InfluxDB (1.7.3) I am getting an error that max-series-per-database is reached:
(“error”:“partial write: max-series-per-database limit exceeded: (1000000) dropped=2")
Meanwhile show series exact cardinality for specific database shows that there are just around 510 000 entries.
Also select count(*) from database gives same result
Any idea why I am getting the error that max series per database is reached?
upd:
I am using open source version of InfluxDB without clustering
show series cardinality shows almost the same result as exact cardinality does
Try increasing max-series-per-database configuration option and see if the error persists.
If you are using enterprise clustering, the exact cardinality may only count series on one region server, while another has the rest 490k series.
Are there other retention policies in the same database?
Also note that the error may be generated based on approximate cardinality.

Rails: select records with maximum date

In my app users can save sales reports for given dates. What I want to do now is to query the database and select only the latest sales reports (all those reports that have the maximum date in my table).
I know how to sort all reports by date and to select the one with the highest date - however I don't know how to retrieve multiple reports with the highest date.
How can I achieve that? I'm using Postgres.
Something like this?
SalesReport.where(date: SalesReport.maximum('date'))
EDIT: Just to bring visibility to @muistooshort's comment below, you can reduce the two queries to a single query (with a subselect), using the following form:
SalesReport.where(date: SalesReport.select('MAX(date)'))
If there is a lot of latency between your web host and your database host, this could halve execution times. It is almost always the preferred form.
You can get the maximum date to search for matching reports:
max_date = Report.maximum('date')
reports = Report.where(date: max_date)

Get latest record for each user with ODATA

Due to the PowerShell methods of getting mailbox statistics from Office365 taking about 2 seconds per mailbox, I am working on getting the data from Office 365 Reporting web service, which takes only a few seconds for each 2000 mailboxes.
The problem I'm running into is that the stats are updated periodically and some historical data is kept, so there are numerous records for each user. I only want to get the latest record for each user, but I haven't been able to find a way to do that. The closest I've come is to use $filter=Date ge DateTime'2016-03-10T00:00:00' where the date is concatenated to a couple of days ago. Theoretically, if I sort by Date desc I should get the latest records first, and if there is a user that has a record for 3/10 and 3/11, the 3/11 record would get pulled first, which would work for me. But regardless of how I do the sort it seems to come back with the older records first.
Ideally, I would like to be able to set criteria so that it only returns the latest record for each mailbox, but I can't seem to figure out or find how to do that. The closest I've been able to come is to just start running queries filtered on specific dates, walking the date back a day on each query.
If I can get the latest records to be returned first, I would be able to work with that because I can just discard a record if I've already received a later one.
https://reports.office365.com/ecp/reportingwebservice/reporting.svc/MailboxUsageDetail/
?DelegatedOrg=nnn.onmicrosoft.com&$select=Date,WindowsLiveID,CurrentMailboxSize
&$filter=Date ge DateTime'2016-03-08T00:00:00'&$orderby=Date desc
So the questions are:
Is there a way to specify criteria so that only the latest record for each user is returned?
Is there a way to get it to order by Date descending--what am I doing wrong with the $orderby?
Thanks!
You can use $top=1 to get the latest record by applying $orderby on date (desc). $filter and $skip may not be required in this case.
https://reports.office365.com/ecp/reportingwebservice/reporting.svc/MailboxUsageDetail/?DelegatedOrg=nnn.onmicrosoft.com&$select=Date,WindowsLiveID,CurrentMailboxSize&$orderby=Date desc&$top=1
Your query looks fine. Here is another example from the OData sample service to get the employee detail with the most recent birth date.
http://services.odata.org/V4/Northwind/Northwind.svc/Employees?$select=EmployeeID,FirstName,LastName,BirthDate&$orderby=BirthDate%20desc&$top=1
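Added example, not part of the original question or answers: the client-side approach the question describes (discard a record when a later one for the same mailbox has already been seen), written so that it does not depend on the order in which the service returns rows. The sample records are constructed; accepting both the OData JSON `/Date(ms)/` form and ISO 8601 strings, and treating WindowsLiveID as case-insensitive, are assumptions.

```python
import re
from datetime import datetime, timezone

# OData v2/v3 JSON encodes Edm.DateTime as "/Date(<ms since epoch>)/";
# Atom/XML payloads carry ISO 8601 strings. Both are accepted (assumption).
MS_DATE = re.compile(r"^/Date\((-?\d+)\)/$")


def parse_odata_date(value):
    """Return a timezone-aware UTC datetime for either date encoding."""
    m = MS_DATE.match(value)
    if m:
        return datetime.fromtimestamp(int(m.group(1)) / 1000, tz=timezone.utc)
    dt = datetime.fromisoformat(value.rstrip("Z"))
    return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)


def latest_per_mailbox(records):
    """Keep only the newest record per WindowsLiveID, whatever the arrival order."""
    latest = {}
    for rec in records:
        key = rec["WindowsLiveID"].lower()  # assumption: IDs compare case-insensitively
        when = parse_odata_date(rec["Date"])
        kept = latest.get(key)
        if kept is None or when > kept[0]:
            latest[key] = (when, rec)
    return {key: rec for key, (when, rec) in latest.items()}


# Constructed sample rows using the fields from the $select in the question,
# deliberately returned oldest-first for alice and newest-first for bob.
SAMPLE = [
    {"Date": "2016-03-10T00:00:00", "WindowsLiveID": "alice@example.com", "CurrentMailboxSize": 100},
    {"Date": "2016-03-11T00:00:00", "WindowsLiveID": "alice@example.com", "CurrentMailboxSize": 120},
    {"Date": "/Date(1457568000000)/", "WindowsLiveID": "Bob@example.com", "CurrentMailboxSize": 50},
    {"Date": "2016-03-09T00:00:00", "WindowsLiveID": "bob@example.com", "CurrentMailboxSize": 40},
]

if __name__ == "__main__":
    for mailbox, rec in sorted(latest_per_mailbox(SAMPLE).items()):
        print(mailbox, rec["Date"], rec["CurrentMailboxSize"])
```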

InfluxDB performance

For my case, I need to capture 15 performance metrics for devices and save it to InfluxDB. Each device has a unique device id.
Metrics are written into InfluxDB in the following way. Here I only show one as an example
new Serie.Builder("perfmetric1")
.columns("time", "value", "id", "type")
.values(getTime(), getPerf1(), getId(), getType())
.build()
Writing data is fast and easy. But I saw bad performance when I run a query. I'm trying to get all 15 metric values for the last one hour.
select value from perfmetric1, perfmetric2, ..., perfmetric15
where id='testdeviceid' and time > now() - 1h
For an hour, each metric has 120 data points, in total it's 1800 data points. The query takes about 5 seconds on a c4.4xlarge EC2 instance when it's idle.
I believe InfluxDB can do better. Is this a problem of my schema design, or is it something else? Would splitting the query into 15 parallel calls go faster?
As @valentin's answer says, you need to build an index for the id column for InfluxDB to perform these queries efficiently.
In 0.8 stable you can do this "indexing" using continuous fanout queries. For example, the following continuous query will expand your perfmetric1 series into multiple series of the form perfmetric1.id:
select * from perfmetric1 into perfmetric1.[id];
Later you would do:
select value from perfmetric1.testdeviceid, perfmetric2.testdeviceid, ..., perfmetric15.testdeviceid where time > now() - 1h
This query will take much less time to complete since InfluxDB won't have to perform a full scan of the timeseries to get the points for each testdeviceid.
Build an index on the id column. It seems that the engine uses a full scan of the table to retrieve data. By splitting your query into 15 threads, the engine will use 15 full scans and the performance will be much worse.
