Adding a version to time-series data

I wish to store time-series data with versioning. By versioning, I mean that I might have a metric energy_mwh with the tag meter_id=123 and a fieldset something like this: time=2016-01-01 10:00, mwh=20.50, read-time=2016-01-01 20:15. If I re-read the meter at a later time, I want to keep both the new and the old version of the meter reading. Later, when I query the data, I will mostly be interested in the mwh value with the highest read-time for any given time. If I query over a range of times, the read-time is going to vary.
I am thinking of using InfluxDB or some other time series database with a similar data model.
Is there a right way of doing this? I believe that I must keep read-time as a tag - not a field - or I will lose the older version of the data. I guess that is the answer, but it doesn't feel right to me to have what I see as a piece of data (read-time) sitting in an identifier - specifically, a tag. Am I on the right track?
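To make that concrete, here is roughly what a write would look like with the Python influxdb client (a sketch; the database name and timestamps are placeholders):

    from influxdb import InfluxDBClient

    client = InfluxDBClient(host="localhost", port=8086, database="energy")

    # read-time stored as a TAG: a later re-read of the same interval has a
    # different read_time tag value, so it becomes a new series and the old
    # reading is kept rather than overwritten.
    client.write_points([
        {
            "measurement": "energy_mwh",
            "tags": {"meter_id": "123", "read_time": "2016-01-01T20:15:00Z"},
            "time": "2016-01-01T10:00:00Z",
            "fields": {"mwh": 20.50},
        }
    ])

If read-time were a field instead, a second write with the same measurement, tags, and timestamp would simply overwrite the first point, which is exactly what I want to avoid.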

Related

How can I render time-series data on a geographic display in grafana?

My goal is to render time-series data from set locations on a map. Essentially, I have about 30 predefined (static) locations in Switzerland from which I will be receiving real-time data. The data itself is relatively simple, just the signal/noise ratio of the signal we're receiving, which should be updated every few seconds or every minute. I am using InfluxDB as my database. Are there any specific setups I should be using for this kind of visualization?
My first question is: is it best to use the worldmap panel or the geomap panel at this time? I seem to find more information/documentation on the worldmap panel, even though I have also read that geomap is (or at least will be) its replacement.
Second, I assume that since I'm using time-series data, I should be using the Time-Series format and not the Table format. However, I have not been able to render any data points using the Time-Series format, even by following the simplest examples in your documentation. The best I can do is use the Table format and internally remove previous points from my database at every iteration (so that multiple points aren't rendered at the same time for each location). Here are two screenshots: one where I'm able to render data on the geomap using the Table format, and one after switching to the Time-Series format where the points are no longer there (note that I have the same problem with the Worldmap panel as well).
I'm able to render data using the Table method:
...but not using time series:
Thanks for any help!
For rendering time-series data on the geomap, you must convert your lat/long fields to a single geohash field. You'll have to do that prior to inserting the lat/longs into InfluxDB.
See this answer
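For example, the encoding can be done in Python with the pygeohash package before the point is written (a sketch; the measurement and field names here are made up):

    import pygeohash as pgh

    lat, lon = 46.948, 7.447  # one of the fixed station locations, e.g. Bern

    point = {
        "measurement": "signal_quality",
        # a single geohash tag replaces the separate lat/long fields
        "tags": {"geohash": pgh.encode(lat, lon, precision=12)},
        "fields": {"snr": 23.4},
    }
    # write `point` with your InfluxDB client as usual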

Storing Date Components Instead of a Date

My app lets people log the movies they see (for example). Each logged movie usually (but not always) has a date and sometimes has a time. It's not unusual to have one but not the other. Occasionally the dates are only a year ("I watched Dumbo sometime in 1984"), but could realistically be any combination of day/month/year/time.
I am used to modeling dates as date objects in my app and my backend. But is it a viable approach to store each component separately? When I need to reference an actual date from the components (e.g. for sorting the log) this will be built client-side, or perhaps be stored as a derived property sortDate and updated whenever any of the components change.
My reservation is that the information the user is saving is truly a 'moment in time' and I will have to take care of some things myself - for example what time zone are my components stored relative to? This would be captured automatically as part of a real Date object.
The alternative seems to be assuming some sort of 'default' for missing components (e.g. year 0000 if no year, time 00:00 if no time). But those defaults have meaning and I won't be able to distinguish them from 'not provided'.
What are the limitations and/or pitfalls of this approach? Does anyone have experience modeling their dates this way?
If it's of any consequence, my app is for iOS written in Swift and uses a Parse Server backend.
I've successfully used question marks to represent ambiguous and unknown timestamp parts in legal systems. Try to keep in mind that you're really not modeling dates here ('1984' isn't a date); you're modeling facts about dates.
So, if one of your users saw a movie some time in 1984, you might record the value '1984-??-?? ??:??:??' in a text column in a database. Values like this sort sensibly.
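As a quick sketch of the idea (in Python for brevity), building such strings from optional components and sorting them lexicographically:

    def date_fact(year=None, month=None, day=None):
        # render known components; '?' placeholders for unknown ones
        y = f"{year:04d}" if year is not None else "????"
        m = f"{month:02d}" if month is not None else "??"
        d = f"{day:02d}" if day is not None else "??"
        return f"{y}-{m}-{d}"

    logs = [date_fact(2001, 3), date_fact(1984), date_fact(1984, 6, 12)]
    # lexicographic sort keeps years in order and places unknown parts
    # after known ones within the same year ('?' sorts above any digit)
    print(sorted(logs))  # ['1984-06-12', '1984-??-??', '2001-03-??']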
See also this answer on dba. Comments on that answer are also good to read.

Elasticsearch - time series - search time intersection between two indexes

I have two indexes. Each one of them has time and value.
(screenshot of the example data structure for the two indexes)
In the example above I would like to find a specific time where Index2.val - Index1.val > 70.
Note that values carry forward from the last entry: if a value is set to 20 on 1-1-14, it is still 20 on 2-1-14 if no newer entry exists.
One solution would be to fetch both series and merge them with a linear algorithm in the application, but I suppose the performance would be bad.
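For reference, the linear merge I have in mind would look roughly like this (a sketch in Python, assuming both indexes have already been fetched into sorted lists of (time, value) pairs):

    def first_time_diff_exceeds(series1, series2, threshold=70):
        # merge the two sorted series on time; the second tuple element
        # tags which index a sample came from
        merged = sorted(
            [(t, 1, v) for t, v in series1] + [(t, 2, v) for t, v in series2]
        )
        v1 = v2 = None  # last seen value of each index, carried forward
        for t, which, v in merged:
            if which == 1:
                v1 = v
            else:
                v2 = v
            if v1 is not None and v2 is not None and v2 - v1 > threshold:
                return t
        return None

    s1 = [("2014-01-01", 20), ("2014-01-03", 25)]
    s2 = [("2014-01-02", 95)]
    print(first_time_diff_exceeds(s1, s2))  # 2014-01-02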
Is there an out-of-the-box solution for this?
Thanks
David

Is it possible to keep only value changes in InfluxDB?

Is it possible to downsample older data in InfluxDB in a way that keeps only changes of value?
My example is the following:
I have a binary sensor sending data every 10 min, so naturally the consecutive values look something like this: 0,0,0,0,0,1,1,0,0,0,0...
My goal is to keep this kind of raw data over a certain period of time using retention policies, and to downsample it for longer storage. I want to delete all successive values with the same number, so that only the datapoints whose timestamps mark an actual change in value remain. The downsampled data should look like this: 0,1,0,1,0,1,0..., but with the correct timestamps of when the events actually occurred.
Currently this isn't possible with InfluxDB, though the plan is to eventually support this kind of use case.
I would encourage you to open a feature request on the InfluxDB repo asking for this.
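In the meantime, a workaround is to do the change detection client-side: query the raw points, drop consecutive duplicates, and write the survivors into the longer-retention measurement yourself. A rough sketch of the filter, assuming points come back as (timestamp, value) pairs:

    def keep_changes(points):
        # keep only the points where the value differs from the previous one
        out = []
        last = object()  # sentinel: no value seen yet
        for t, v in points:
            if v != last:
                out.append((t, v))
                last = v
        return out

    raw = [(0, 0), (10, 0), (20, 0), (30, 1), (40, 1), (50, 0)]
    print(keep_changes(raw))  # [(0, 0), (30, 1), (50, 0)]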

How do I handle large amounts of logfile data for display in dynamic charts?

I have a lot of logfile data that I want to display dynamic graphs from, for basically arbitrary time periods, optionally filtered or aggregated by different columns (that I could pregenerate). I'm wondering about the best way to store the data in a database and access it for displaying charts, when:
the time resolution should be variable from one second to a year
there are entries that span several 'time buckets', e.g. a connection might have been open for a few days and I want to count and display the user for every hour she was connected, not just in the hour 'slot' the connection was created or finished
Are there best practices, or tools/plugins for Rails, that help handle this kind and amount of data? Are there database engines specifically tailored to this, or ones with helpful features (e.g. CouchDB indexes)?
EDIT: I'm looking for a scalable way to handle this data and access pattern. Things we considered: running a query for each bucket and merging in the app (probably way too slow); GROUP BY timestamp/granularity (does not count connections correctly); preprocessing the data into rows at the smallest granularity and downsampling at query time (probably the best way).
I think you can use MySQL timestamps for this.
The way I solved it in the end was to pre-process the data into per-minute buckets, so there's one row for every event and minute. That makes it easy and fast enough to select, and it yields correct results. To get a different granularity, you can do integer arithmetic on the timestamp column: select floor(timestamp/factor)*factor and group by floor(timestamp/factor)*factor.
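Here's a runnable illustration of that bucketing arithmetic (sketched with SQLite so it's self-contained; in MySQL you'd use DIV or FLOOR for the integer division):

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE events (ts INTEGER, user TEXT)")
    con.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [(3601, "a"), (3725, "b"), (7322, "a")],
    )

    factor = 3600  # bucket size in seconds: one hour
    # integer division truncates, so (ts / factor) * factor snaps each
    # timestamp to the start of its bucket
    rows = con.execute(
        "SELECT (ts / ?) * ? AS bucket, COUNT(*) FROM events "
        "GROUP BY bucket ORDER BY bucket",
        (factor, factor),
    ).fetchall()
    print(rows)  # [(3600, 2), (7200, 1)]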
