How to copy measurements into new measurements with a different structure - InfluxDB

Say I have measurements named:
eth0
wlan0
And I want to change these measurements to be named just "traffic" and have the previous name be a tag, so something like:
traffic - tags(ifname=eth0)
traffic - tags(ifname=wlan0)
This is a better structure because it should let me retrieve either the total traffic across all interfaces or the traffic for a specific interface, which I understand the previous structure can't do.
However, I need to write a data migration in my application that converts the past data to the new structure. How can I do that?
Do I have to rewrite each point one by one, or is there a faster way?
Thanks in advance!
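For context, InfluxQL's SELECT ... INTO can copy points server-side, but attaching a new tag whose value is the source measurement name generally means rewriting the points client-side. The reshaping step can be sketched in plain Python; the field names, timestamp format, and point layout here are assumptions for illustration, and the surrounding read/write client calls are omitted:

```python
def migrate_points(measurement, points):
    """Rewrite points from a per-interface measurement (e.g. 'eth0')
    into a shared 'traffic' measurement, keeping the old measurement
    name as an 'ifname' tag, emitted as line protocol."""
    lines = []
    for p in points:
        # line protocol: measurement,tag_set field_set timestamp
        fields = ",".join(f"{k}={v}" for k, v in p["fields"].items())
        lines.append(f"traffic,ifname={measurement} {fields} {p['time']}")
    return lines

old = [{"time": 1523491200000000000, "fields": {"rx": 1000, "tx": 200}}]
print(migrate_points("eth0", old))
# → ['traffic,ifname=eth0 rx=1000,tx=200 1523491200000000000']
```

Batching the rewritten lines into a single write per chunk (rather than one write per point) keeps the migration reasonably fast.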

Related

NEO4J: What is a good practice to store a returned path with additional information (e.g. a concrete train run in a metro network)?

Say there is a metro network with n stops, each represented by a NEO4J node with the rail connection between two stops represented by a relationship.
I wish to store the fact train_run that e.g. Train 01234 ran from stop n1 to stop n4 via stops n2 and n3 at a certain time. I wish to store this information in a manner that is consistent with the existing DB information about the metro network, hence preventing the creation of any train_run along a path that doesn't exist (e.g. skipping stop n3).
What would be a good way to achieve this?
Is there a useful way to store in the Neo4J DB a path p returned from that DB jointly with the properties .train_number and time_stamp? Or should I consider a totally different approach?
Thanks for your thoughts.
You can use a structure like this to represent your data. The Train-to-source and Train-to-destination relationships are optional; they just help you find the number of trains between a source and a destination efficiently. If there are multiple trains between two stops, you need multiple relationships, one per train number.
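The consistency requirement in the question (reject a train_run that skips a stop) boils down to checking that every consecutive pair of stops in the proposed run is an existing rail connection. A minimal sketch, assuming stops are identified by name and the network is given as a set of undirected edges; in Neo4j itself the same check would be enforced in the Cypher that creates the train_run relationships:

```python
def valid_run(stops, connections):
    """Check that a proposed train run follows existing rail
    connections, so a run that jumps a stop is rejected."""
    return all((a, b) in connections or (b, a) in connections
               for a, b in zip(stops, stops[1:]))

network = {("n1", "n2"), ("n2", "n3"), ("n3", "n4")}
print(valid_run(["n1", "n2", "n3", "n4"], network))  # → True (legal run)
print(valid_run(["n1", "n2", "n4"], network))        # → False (skips n3)
```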

InfluxDB: Group rows with same timestamp

Assume a DB with the following data records:
2018-04-12T00:00:00Z value=1000 [series=distance]
2018-04-12T00:00:00Z value=10 [series=signal_quality]
2018-04-12T00:01:00Z value=1100 [series=distance]
2018-04-12T00:01:00Z value=0 [series=signal_quality]
There is one field called value. Square brackets denote the tags (further tags omitted). As you can see, the data is captured in different data records instead of using multiple fields on the same record.
Given the above structure, how can I query the time series of distances, filtered by signal quality? The goal is to only get distance data points back when the signal quality is above a fixed threshold (e.g. 5).
"Given the above structure", there's no way to do it in plain InfluxDB.
Please keep in mind: InfluxDB is not a relational DB. It's different, even though the query language looks familiar.
Again, given that structure, you can proceed with Kapacitor, as was already mentioned.
But I strongly suggest you rethink the structure, if that's possible, i.e. if you control the way the metrics are collected.
If that is not an option, here's the way: spin up a simple Kapacitor job that joins the two points into one based on time (check Kapacitor's join node for how), and then drops the result into a new measurement.
The data point would look like this, then:
DistanceQualityTogether,tag1=if,tag2=you,tag3=need,tag4=em distance=1000,signal_quality=10 2018-04-12T00:00:00Z
The rest is obvious with such a measurement.
But again, if you can configure your metrics to be sent like this - better do it.
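The join-then-filter step the answer describes can be sketched in plain Python (the tuple shape of the input records and the threshold value are assumptions for illustration):

```python
def join_and_filter(points, threshold=5):
    """Group the single-field records by timestamp, then keep only
    the distance values whose co-timed signal_quality exceeds the
    threshold -- the same join-on-time Kapacitor would perform."""
    by_time = {}
    for t, series, value in points:
        by_time.setdefault(t, {})[series] = value
    return {t: row["distance"] for t, row in sorted(by_time.items())
            if "distance" in row and row.get("signal_quality", 0) > threshold}

points = [
    ("2018-04-12T00:00:00Z", "distance", 1000),
    ("2018-04-12T00:00:00Z", "signal_quality", 10),
    ("2018-04-12T00:01:00Z", "distance", 1100),
    ("2018-04-12T00:01:00Z", "signal_quality", 0),
]
print(join_and_filter(points))  # → {'2018-04-12T00:00:00Z': 1000}
```

Only the 00:00 distance survives, because the 00:01 record's signal quality (0) is below the threshold.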

Microservices (application-level joins): do more API calls lead to more latency?

I have two microservices, one for Orders and one for Customers, exactly like the example at
http://microservices.io/patterns/data/database-per-service.html
It works without any problem: I can list Customer data and Order data for a given CustomerId.
But now there is a new requirement to develop a new screen that shows the Orders for a given date, with the CustomerName beside each Order.
When implementing it, I can fetch the list of Orders for the date, but to show the corresponding CustomerNames for that list of CustomerIds, I have to make multiple API calls to the Customer microservice, each sending one CustomerId to get one CustomerName.
That leads to more latency. I know the above solution is a bad one, so any ideas please?
The point of a microservices architecture is to split your problem domain into (technically, organizationally and semantically) independent parts. Making the "microservices" glorified (apified) tables actually creates more problems than it solves, if it solves any problem at all.
Here are a few things to do first:
List architectural constraints (i.e. the reasons for doing microservices): separate scaling ability, organizational problems, making teams independent, etc.
List business-relevant boundaries in the problem domain (i.e. parts that theoretically don't need each other to work, or don't require synchronous communication).
With that information, here are a few ways to fix the problem:
Restructure the services based on business boundaries instead of technical ones. This means not using tables or layers or other technical stuff to split functions. Services should be a complete vertical slice of the problem domain.
Or as a work-around create a third system which aggregates data and can create reports.
Or if you find there is actually no reason to keep the microservices approach, just do it in a way you are used to.
The new requirement needs data across domains. Here are the options:
1. Return the customer Id and Name in every call. The issue is latency, as there would be multiple round trips.
2. Keep a cache of all CustomerNames by Id in the Order service (I am assuming there is a finite number of customers). The issue is when to refresh or invalidate the cache; for that you may need to expose a REST call that invalidates entries. For new customers that are not yet in the cache, fetch from the DB and update the cache for future lookups.
3. Use the CQRS approach, in which all the needed data (Orders, Customers, etc.) goes into a separate read table. With that schema you can write a single composite SQL query, which removes the round trips.
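Option 2 above (a read-through cache with an invalidation hook) can be sketched as follows; the class and function names are hypothetical, and fake_fetch stands in for the real call to the Customer service:

```python
class CustomerNameCache:
    """Read-through cache of CustomerName by CustomerId for the
    Order service; 'fetch' is the remote call to the Customer
    service, made only on a cache miss."""
    def __init__(self, fetch):
        self._fetch = fetch
        self._names = {}

    def name(self, customer_id):
        if customer_id not in self._names:      # miss: one remote call
            self._names[customer_id] = self._fetch(customer_id)
        return self._names[customer_id]

    def invalidate(self, customer_id):          # hook for a REST endpoint
        self._names.pop(customer_id, None)

calls = []
def fake_fetch(cid):
    calls.append(cid)
    return f"Customer-{cid}"

cache = CustomerNameCache(fake_fetch)
names = [cache.name(7), cache.name(7), cache.name(9)]
print(names, len(calls))  # three lookups, but only two remote calls
```

An invalidation REST endpoint on the Order service would simply call cache.invalidate(customer_id) when a customer record changes.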

Integrate multiple same structure datasets in one database

I have 8 different datasets with the same structure. I am using Neo4j and need to query all of them at different points on the website I am developing. What would be the approaches at storing the datasets in one database?
One idea that comes to my mind is to give each node an additional property that would distinguish nodes of one dataset from nodes of the others. But that seems too repetitive and wrong to me. The other idea is just to create 8 databases and query them separately, but how would I do that? Running each one on its own port seems crazy.
Any suggestions would be greatly appreciated.
If your datasets are in a tree structure, you could add a different root node to each of them that you could use for reference, similar to GraphAware TimeTree. Another option (better than a property, I think) would be to differentiate each dataset by adding a specific label to its nodes (e.g. all nodes from dataset A get a :DataSetA label).
I imagine that the specific structure of your dataset may yield other options. For example, if you always begin traversals from a few set locations, you only need to determine which dataset those entry points belong to: once entered, all traversals stay within the same dataset.

Time Series Graph DB

Anybody know of any Graph DB's that support time series data?
Ideally we're looking for one that will scale well, and ideally use Cassandra or HBase as their persistent store.
Why would you want to do that? Best practice would be to store the dependency graph (in other words, the "model" of the time series data) in a graph DB, but the actual time series in something more suited to it, e.g. a KV store or a log-specific tool like Splunk.
See the KNMI (Dutch Weather Service) example for a case study: http://vimeopro.com/neo4j/graphconnect-europe-2015/video/128351859
Cheers!
Rik
One convenient way of doing that is to build a tree structure with a common root, years as children, months as children of each year, and so on down to the desired granularity.
At the end you attach the event nodes as leaves of this tree, which gives you the possibility of many types of queries: a single point in time, ranges, and also the reverse, from the event to the timestamp.
Here is an example of this concept and an implementation within Neo4j
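The tree the answer describes can be pictured by computing the chain of tree nodes an event would hang under; a minimal sketch, where the root/year/month/day node naming is an assumption (in Neo4j each entry would be a MERGEd node, and range queries walk the tree between two such paths):

```python
from datetime import datetime

def time_tree_path(ts, granularity=("year", "month", "day")):
    """Chain of time-tree nodes (root -> year -> month -> day)
    under which an event at timestamp ts would be attached."""
    t = datetime.fromisoformat(ts)
    parts = {"year": t.year, "month": t.month, "day": t.day}
    path = ["root"]
    for g in granularity:
        path.append(f"{g}:{parts[g]}")
    return path

print(time_tree_path("2018-04-12T00:01:00"))
# → ['root', 'year:2018', 'month:4', 'day:12']
```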
