InfluxDB + Grafana with multiple counters

I have data stored in a table which has a timestamp, counters for various things, and tags of [ receivingHostname, clientHostname, clientName, sourceType ].
I want to plot counters by various combinations of these tags, to select different groupings of the data. For instance, I want to plot queries per second, per clientName, over a 1-hour time window.
So far, this seems to be very hard to do with InfluxDB. I have tried derivative(sum(counter), 60s) or the like, but this produces garbage.
Other than massaging the data ahead of time and converting these counters to deltas, is there a way to do this?
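For what it's worth, the shape that usually works in InfluxQL is to take a per-interval aggregate of the counter, derive it, and GROUP BY the tag so that each tag value becomes its own series. A hedged sketch, assuming a field named counter and a measurement named mymetrics (both placeholders):

SELECT non_negative_derivative(max("counter"), 1s) AS "qps"
FROM "mymetrics"
WHERE time > now() - 1h
GROUP BY time(60s), "clientName"

non_negative_derivative avoids the negative spikes a plain derivative produces whenever a counter resets.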

Related

How to send non-aggregated metrics to InfluxDB from a Spring Boot application?

I have a Spring Boot application that is under moderate load. I want to collect metric data for a few of the operations of my app. I am mainly interested in Counters and Timers.
I want to count the number of times a method is invoked (number of invocations over a window, for example over the last day, week, or month)
If the method produces an unexpected result, I want to increase a failure count and publish a few tags with that metric
I want to time a couple of expensive methods, i.e. see how long each call took, again publishing a few tags with the metric for more context (sketched below)
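A minimal Micrometer sketch of those requirements, assuming Micrometer is on the classpath; the class, metric, and tag names here are made up:

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;

public class CheckoutService {

    private final Counter failures;
    private final Timer checkoutTimer;

    public CheckoutService(MeterRegistry registry) {
        // Failure counter, published with context tags
        this.failures = Counter.builder("checkout.failures")
                .tag("region", "us-east")
                .register(registry);
        // Timer for an expensive method, also tagged
        this.checkoutTimer = Timer.builder("checkout.duration")
                .tag("region", "us-east")
                .register(registry);
    }

    public void checkout() {
        // record() times the enclosed work and feeds the Timer
        checkoutTimer.record(() -> {
            if (!doCheckout()) {
                failures.increment();
            }
        });
    }

    private boolean doCheckout() {
        return true; // placeholder business logic
    }
}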
I have tried StatsD-SignalFx and Micrometer-InfluxDB, but both of these solutions have issues I could not solve:
StatsD aggregates data over its flush window, and the aggregation scrambles the metric tags. For example, if I send 10 events with different tag values within one flush window, the StatsD agent aggregates them and publishes a single event with counter = 10, and I cannot tell which tag values it sends with the aggregated data.
The Micrometer-InfluxDB setup has its own problems. One is that Micrometer sends 0 values for counters when no new metric is produced, and these fake (0-value) counter points carry the same tag values as the last valid (non-zero) counter.
I am not sure how, but Micrometer also seems to do some client-side aggregation in its MeterRegistry, because I was seeing counters with a value of 0.5 in InfluxDB.
Next, I am planning to explore Micrometer/StatsD + Telegraf + Influx + Grafana to see if it suits my use case.
Questions:
How can I avoid metric aggregation until the data reaches the data store (InfluxDB)? I can do the required aggregation in Grafana.
Is there any standard solution to the problem I am trying to solve?
Any other suggestions or directions for my use case?

What is the equivalent of a UNION statement

I need to chart physical measurements using Influx. All measurements are stored as series inside a single Influx "measurement".
Some are "current" values like temperatures, other are things like energy meter readings.
The problem is that these need different queries in order to produce visually attractive output. Charting the meter readings as current power is possible using the DIFFERENCE function.
SELECT difference(max("value")) AS "diff_value"
FROM "volkszaehler"."autogen"."data"
WHERE time > :dashboardTime:
GROUP BY time(1d), "title" FILL(linear)
For other values, like temperatures, the selection should be mean("value") without the difference.
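For example, reusing the same measurement:

SELECT mean("value") AS "mean_value"
FROM "volkszaehler"."autogen"."data"
WHERE time > :dashboardTime:
GROUP BY time(1d), "title" FILL(linear)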
Is there a way to "union" result sets in InfluxDB similar to mysql in order to display them in a single chart in Chronograf?
Sorry, this isn't possible and new functionality isn't being added to InfluxQL while Flux is being actively worked on.
https://github.com/influxdata/flux
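For reference, Flux does include a union() function, so once it is usable a sketch along these lines should cover this case (the tag values used in the filters are assumptions; the bucket and measurement come from the query above):

meters = from(bucket: "volkszaehler/autogen")
  |> range(start: -30d)
  |> filter(fn: (r) => r._measurement == "data" and r.title == "meter")
  |> aggregateWindow(every: 1d, fn: max)
  |> difference()

temps = from(bucket: "volkszaehler/autogen")
  |> range(start: -30d)
  |> filter(fn: (r) => r._measurement == "data" and r.title == "temperature")
  |> aggregateWindow(every: 1d, fn: mean)

// merge both result streams into one output, analogous to a SQL UNION
union(tables: [meters, temps])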

Is there a way to tell Google Cloud Dataflow that the data coming in is already ordered?

We have an input data source that is approximately 90 GB (it can be either a CSV or XML, it doesn't matter) that contains an already ordered list of data. For simplicity, you can think of it as having two columns: time column, and a string column. The hundreds of millions of rows in this file are already ordered by the time column in ascending order.
In our Google Cloud Dataflow pipeline, we have modeled each row as an element in our PCollection, and we apply DoFn transformations to the string field (e.g. count the number of characters that are uppercase in the string, etc.). This works fine.
However, we then need to apply functions that are calculated over a block of time (e.g. five minutes) with a one-minute overlap. So we are thinking about using a sliding windowing function (even though the data is bounded).
However, the calculation logic that needs to be applied over these five-minute windows assumes that the data is ordered (i.e. ascending) by the time field. My understanding is that even when using these windowing functions, one cannot assume that within each window the PCollection elements are ordered in any way, so one would need to manually iterate through every PCollection and reorder them, right? However, this seems like a huge waste of computational power, since the incoming data is already ordered. So is there a way to tell Google Cloud Dataflow that the input data is ordered, and to maintain that order even within the windows?
On a minor note, I had another question: my understanding is that if the data source is unbounded, there is never an "overall aggregation" function that would ever execute, as it never really makes sense (since there is no end to the incoming data); however, if one uses a windowing function for bounded data, there is a true end state corresponding to when all the data has been read from the CSV file. Therefore, is there a way to tell Google Cloud Dataflow to do a final calculation once all the data has been read in, even though we are using a windowing function to divide the data up?
SlidingWindows sounds like the right solution for your problem. The ordering of the incoming data is not preserved across a GroupByKey, so informing Dataflow of that would not be useful currently. However, the batch Dataflow runner does already sort by timestamp in order to implement windowing efficiently, so for simple windowing like SlidingWindows, your code will see the data in order.
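For the five-minute blocks with a one-minute overlap described in the question, the window setup might look like this (a sketch using the Beam-style Java SDK; rows stands for the timestamped PCollection built earlier):

import org.apache.beam.sdk.transforms.windowing.SlidingWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.PCollection;
import org.joda.time.Duration;

// 5-minute windows starting every 4 minutes, so consecutive windows
// overlap by 1 minute
PCollection<String> windowed = rows.apply(
    Window.<String>into(
        SlidingWindows.of(Duration.standardMinutes(5))
                      .every(Duration.standardMinutes(4))));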
If you want to do a final calculation after doing some windowed calculations on a bounded data set, you can re-window your data into the global window again, and do your final aggregation after that:
p.apply(Window.into(new GlobalWindows()));
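Fleshed out slightly (a sketch; Sum.longsPerKey() stands in for whatever final aggregation is actually needed):

import org.apache.beam.sdk.transforms.Sum;
import org.apache.beam.sdk.transforms.windowing.GlobalWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;

// windowedCounts: per-window results from the sliding-window stage
PCollection<KV<String, Long>> finalTotals = windowedCounts
    .apply(Window.<KV<String, Long>>into(new GlobalWindows()))
    .apply(Sum.longsPerKey());  // one aggregation over the whole bounded input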

InfluxDB performance

For my use case, I need to capture 15 performance metrics for devices and save them to InfluxDB. Each device has a unique device id.
Metrics are written into InfluxDB in the following way (showing only one as an example):
new Serie.Builder("perfmetric1")
        .columns("time", "value", "id", "type")
        .values(getTime(), getPerf1(), getId(), getType())
        .build()
Writing data is fast and easy. But I saw bad performance when running queries. I'm trying to get all 15 metric values for the last hour.
select value from perfmetric1, perfmetric2, ..., perfmetric15
where id='testdeviceid' and time > now() - 1h
For an hour, each metric has 120 data points, in total it's 1800 data points. The query takes about 5 seconds on a c4.4xlarge EC2 instance when it's idle.
I believe InfluxDB can do better. Is this a problem with my schema design, or is it something else? Would splitting the query into 15 parallel calls be faster?
As @valentin's answer says, you need to build an index on the id column for InfluxDB to perform these queries efficiently.
In 0.8 stable you can do this "indexing" using continuous fanout queries. For example, the following continuous query will expand your perfmetric1 series into multiple series of the form perfmetric1.id:
select * from perfmetric1 into perfmetric1.[id];
Later you would do:
select value from perfmetric1.testdeviceid, perfmetric2.testdeviceid, ..., perfmetric15.testdeviceid where time > now() - 1h
This query will take much less time to complete since InfluxDB won't have to perform a full scan of the timeseries to get the points for each testdeviceid.
Build an index on the id column. It seems that the engine uses a full scan of the table to retrieve the data. By splitting your query into 15 threads, the engine will run 15 full scans and performance will be much worse.

How do I handle large amounts of logfile data for display in dynamic charts?

I have a lot of logfile data that I want to display dynamic graphs from, for basically arbitrary time periods, optionally filtered or aggregated by different columns (that I could pregenerate). I'm wondering about the best way to store the data in a database and access it for displaying charts, when:
the time resolution should be variable from one second to a year
there are entries that span several 'time buckets'; e.g. a connection might have been open for a few days, and I want to count and display the user in every hour she was connected, not just in the hour 'slot' in which the connection was created or finished
Are there best practices, or tools/plugins for rails that help handle this kind and amount of data? Are there maybe database engines specifically tailored towards this, or having helpful functions (e.g. CouchDB indexes)?
EDIT: I'm looking for a scalable way to handle this data and access pattern. Things we considered: running a query for each bucket and merging in the app (probably way too slow); GROUP BY timestamp/granularity (does not count connections correctly); preprocessing the data into rows at the smallest granularity and downsampling at query time (probably the best way).
I think you can use MySQL timestamps for this.
The way I solved it in the end was to pre-process the data into per-minute buckets, so there is one row for every event and minute. That makes it easy and fast enough to select, and yields correct results. To get different granularities, you can do integer arithmetic on the timestamp column: select floor(timestamp/factor)*factor and group by floor(timestamp/factor)*factor.
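In MySQL terms, a sketch of that downsampling query (table and column names are made up; ts is unix seconds, and factor = 300 gives 5-minute buckets):

-- count distinct connected users per 5-minute bucket from per-minute rows
SELECT floor(ts / 300) * 300   AS bucket_start,
       count(DISTINCT user_id) AS connected_users
FROM minute_events
GROUP BY bucket_start;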
