How to create a custom connector for Presto and InfluxDB

I am trying to create a custom connector for Presto and InfluxDB so that Presto can run SQL queries against InfluxDB. Are there any examples of such a connector already available?
Connectors are the source of all data for queries in Presto. Even if your data source doesn’t have underlying tables backing it, as long as you adapt your data source to the API expected by Presto, you can write queries against this data.
The only documentation that I found for writing a connector is:
https://prestodb.io/docs/current/develop/example-http.html
If anyone has other examples, could you please share them?

There are multiple connectors in the Presto source tree.
When you're connecting to a data source that has a JDBC driver (probably not your case), extending the presto-base-jdbc module gives you almost everything you need. See for example https://github.com/trinodb/trino/tree/master/presto-postgresql
When you're connecting to a non-JDBC-enabled data source (or you need more than is possible with presto-base-jdbc), you need to implement all the relevant connector interfaces. There isn't much documentation for this beyond the Java interfaces and source code, but you can follow examples, e.g. https://github.com/trinodb/trino/tree/master/presto-cassandra or https://github.com/trinodb/trino/tree/master/presto-accumulo
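To give a feel for the shape of the SPI, here is a minimal, hypothetical sketch of the plugin entry point for an InfluxDB connector, modeled on the example-http tutorial. All the InfluxDb* names are placeholders, and the exact ConnectorFactory method set varies between Presto versions, so treat this as orientation rather than a working connector:

package com.example.presto.influxdb;

import com.facebook.presto.spi.ConnectorHandleResolver;
import com.facebook.presto.spi.Plugin;
import com.facebook.presto.spi.connector.Connector;
import com.facebook.presto.spi.connector.ConnectorContext;
import com.facebook.presto.spi.connector.ConnectorFactory;

import java.util.Map;

import static java.util.Collections.singletonList;

// Entry point Presto discovers via ServiceLoader (META-INF/services).
public class InfluxDbPlugin implements Plugin {
    @Override
    public Iterable<ConnectorFactory> getConnectorFactories() {
        return singletonList(new InfluxDbConnectorFactory());
    }
}

class InfluxDbConnectorFactory implements ConnectorFactory {
    @Override
    public String getName() {
        // Matches connector.name=influxdb in the catalog properties file.
        return "influxdb";
    }

    @Override
    public ConnectorHandleResolver getHandleResolver() {
        // Maps the SPI handle interfaces to your concrete handle classes.
        throw new UnsupportedOperationException("left out of this sketch");
    }

    @Override
    public Connector create(String connectorId, Map<String, String> config, ConnectorContext context) {
        // Here you would wire together your ConnectorMetadata,
        // ConnectorSplitManager and ConnectorRecordSetProvider
        // implementations, which translate Presto's metadata and
        // scan requests into InfluxDB queries.
        throw new UnsupportedOperationException("left out of this sketch");
    }
}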
Yet another option is Greg Leclercq's suggestion to implement a Thrift connector. See his answer for directions.

Another option, if you prefer to code in a language other than Java, is to implement a Thrift service and use the Thrift connector.

Related

Ask for FIWARE project recommendations: 3D plot monitoring of entity attrs

The goal of the project is to plot the x,y,z coordinates (attrs from an entity) in a 3D graph which updates as they change.
Note: it's not important how the values of x, y, z change; they can be changed by hand from the prompt, for example using curl.
At first, I thought about using QuantumLeap, CrateDB and Grafana, but when I deployed them I realised that Grafana no longer supports the CrateDB plugin (https://community.grafana.com/t/plugin-cratedb-not-available/17165), and I got errors (I tried using PostgreSQL as explained here: https://crate.io/a/pair-cratedb-with-grafana-6-x/).
At this point, I would like to ask for some recommendations: Do I need to work with time-series data? If not, how should I address the problem? If yes, can I use another database with QuantumLeap that is supported by Grafana and works with this time-series format? Or should I skip Grafana and instead access the time-series data in CrateDB directly via some frontend software that can show the 3D graph?
This is all a matter of question framing. Because the data format is well defined, you can indirectly use any tool with any NGSI Context Broker.
The problem can be broken down into the following steps:
What Graphing/Business Intelligence tools are available?
What databases do they support?
Which FIWARE Components can push data into a supported Database?
Now the simplest answer (given the user's needs), and the one proposed in the question, is to use Grafana - the PostgreSQL plugin for Grafana will read from a CrateDB database, and the QuantumLeap component can persist time-series data into CrateDB, which is compatible with the PostgreSQL wire protocol. An example of how to do this can be found in the QuantumLeap documentation.
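Because CrateDB speaks the PostgreSQL wire protocol, you can sanity-check what QuantumLeap has persisted with a plain PostgreSQL JDBC connection before pointing Grafana at it. A minimal sketch - the host, credentials and especially the table name are assumptions (QuantumLeap derives table names from the entity type, so "etdevice" here is hypothetical):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CrateCheck {
    public static void main(String[] args) throws Exception {
        // CrateDB exposes the PostgreSQL protocol, by default on port 5432.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/doc", "crate", "");
             Statement stmt = conn.createStatement();
             // "etdevice" is a hypothetical QuantumLeap table for entity type "Device".
             ResultSet rs = stmt.executeQuery(
                 "SELECT time_index, x, y, z FROM etdevice ORDER BY time_index DESC LIMIT 10")) {
            while (rs.next()) {
                System.out.printf("%s -> (%f, %f, %f)%n",
                    rs.getString(1), rs.getDouble(2), rs.getDouble(3), rs.getDouble(4));
            }
        }
    }
}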
However, you could use a component such as Draco or Cygnus to persist your data to a database (Draco is easier here, since you could write a custom NiFi step to push the data in your preferred format).
Alternatively, you could use the Cosmos Spark or Flink connectors to listen to an incoming stream of context data and persist it to a database.
Or you could write a custom microservice which listens on the NGSI notification endpoint (invoked by a subscription), interprets the payload and pushes it to the database of your choice.
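As a rough illustration of that last approach, here is a minimal sketch of such a microservice using only the JDK's built-in HTTP server. The port and path are arbitrary choices, and the actual parsing and database write are left as comments:

import com.sun.net.httpserver.HttpServer;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class NgsiNotificationListener {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(1028), 0);
        // Register this path as the "url" of your NGSI subscription.
        server.createContext("/notify", exchange -> {
            try (InputStream body = exchange.getRequestBody()) {
                String payload = new String(body.readAllBytes(), StandardCharsets.UTF_8);
                // Parse the NGSI notification JSON here (e.g. with Jackson)
                // and push the x, y, z attribute values to the database
                // of your choice.
                System.out.println("notification: " + payload);
            }
            exchange.sendResponseHeaders(204, -1); // acknowledge with no content
            exchange.close();
        });
        server.start();
    }
}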
Once you have the data in a database, there are plenty of other tools available besides Grafana - consider the Knowage engine or Apache Superset, for example.

Neo4j to grafana

I want to present the release data complexity associated with each node (epic, user story, etc.) in Grafana in the form of charts, but Grafana does not support the Neo4j database. Is there any way, directly or indirectly, to present Neo4j data in Grafana?
I'm having the same issues and found this question among others. From my research I cannot agree with this answer completely, so I felt I should point some things out here.
Just to clarify: a graph database may seem structurally different from a relational or time-series database, but it is possible to build Cypher queries that return graph data as tables with proper columns, just as with any other supported data source. Therefore this sentence of the above-mentioned answer:
So what you want to do is just not possible.
is not absolutely true, I'd say.
The actual problem is, there is no datasource plugin for Neo4j available at the moment. You would need to implement one on your own, which will be a lot of work (as far as I can see), but I suspect it to be possible. For me at least, this will be too much work to do, so I won't use any approach to read data directly from Neo4j into Grafana.
As a (possibly dirty) workaround (in my case), a service will regularly copy relevant portions of the Neo4j graph into a relational database (or a time-series database, if the data model is simple enough for that), which Grafana is aware of (see datasource plugins), so I can query it from there. This is basically the replication idea also given in the above-mentioned answer. You obviously end up with at least two different database systems and an additional service, which is not so insanely great, but at the moment it seems to be the quickest way to work around the missing datasource plugin. Maybe this is applicable in your case, too.
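For what it's worth, the copying service can be quite small. A sketch of the idea using the official Neo4j Java driver plus plain JDBC - the connection details, the Cypher query and the target table are assumptions specific to my data model:

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class Neo4jToSqlSync {
    public static void main(String[] args) throws Exception {
        try (Driver neo4j = GraphDatabase.driver("bolt://localhost:7687",
                     AuthTokens.basic("neo4j", "secret"));
             Session session = neo4j.session();
             Connection sql = DriverManager.getConnection(
                     "jdbc:postgresql://localhost:5432/metrics", "grafana", "secret")) {

            // Cypher already returns tabular rows, which is the whole point:
            Result rows = session.run(
                "MATCH (e:Epic)-[:CONTAINS]->(s:UserStory) " +
                "RETURN e.name AS epic, count(s) AS stories");

            PreparedStatement insert = sql.prepareStatement(
                "INSERT INTO release_complexity (epic, stories) VALUES (?, ?)");
            while (rows.hasNext()) {
                Record row = rows.next();
                insert.setString(1, row.get("epic").asString());
                insert.setLong(2, row.get("stories").asLong());
                insert.executeUpdate();
            }
        }
    }
}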
Using Neo4j's Graphite metrics you can actually configure data to be sent to Grafana, and from there build whichever dashboards you like.
Up until recently, Graphite/Grafana wasn't supported, but it is now (in the recent 3.4 series releases), along with Prometheus and other options.
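The relevant neo4j.conf settings look roughly like this (metrics are an Enterprise Edition feature, and exact setting names may vary by release, so check the documentation for your version):
metrics.prefix=neo4j
metrics.graphite.enabled=true
metrics.graphite.server=localhost:2003
metrics.graphite.interval=3s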
Update July 2021
There is a new plugin called Node Graph Panel (currently in beta) that can visualise graph structures in Grafana. A prerequisite for displaying your graph is to make sure that you have an API that exposes two data frames, one for nodes and one for edges, and that you set frame.meta.preferredVisualisationType = 'nodeGraph' on both data frames. See the Data API specification for more information.
So, one option would be to setup an API around your Neo4j instance that returns the nodes and edges according to the specifications above. Note that I haven't tried it myself (yet), but it seems like a viable solution to get Neo4j data into Grafana.
Grafana supports these databases, but not Neo4j: Graphite, InfluxDB, OpenTSDB, Prometheus, Elasticsearch, CloudWatch.
So what you want to do is just not possible.
You can replicate your Neo4j data inside one of those databases, but the data models are really different (time series vs. graph).
If you just want to have some charts, you can use Apache Zeppelin for that.

Neo4J end user interface

I need to share a Neo4J graph visualization with end users. They should be able to interact with the graph, and perform some very basic querying. For example:
- show me the relationships up to 3 hops away from node named 'Joe'
A first option would be to just give them the standard user interface (usually exposed at port 7474); however this is too powerful as they could perform anything in Cypher.
Is there any way of restricting this interface (so that they cannot trigger expensive queries or even graph updates)? Or maybe other open source / community alternatives?
Thanks
If you are using the Enterprise Edition of Neo4j, you will have access to extensive authentication and authorization capabilities, including the ability to assign a reader role to specific user names.
If you do want to use the standard browser interface, you can apply some settings in the neo4j.conf file that may help you out:
dbms.transaction.timeout=10s
dbms.read_only=true
dbms.transaction.timeout will terminate queries exceeding the timeout, so that can prevent expensive queries.
dbms.read_only makes the entire db instance read-only.
You may also build a custom web UI that calls the REST endpoint (you need to pass authentication in the headers)
or
create an unmanaged extension
https://neo4j.com/docs/java-reference/3.1/#server-unmanaged-extensions
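An unmanaged extension could expose exactly the one query your users need - for instance the "3 hops from 'Joe'" example from the question - so nothing else is possible. A rough sketch against the Neo4j 3.x embedded API; the mount point, path and query are illustrative, not prescriptive:

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Result;
import org.neo4j.graphdb.Transaction;

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.Context;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

import java.util.Collections;

// Mounted via e.g. dbms.unmanaged_extension_classes=com.example=/app in neo4j.conf
@Path("/neighbours")
public class NeighbourResource {
    private final GraphDatabaseService db;

    public NeighbourResource(@Context GraphDatabaseService db) {
        this.db = db;
    }

    @GET
    @Produces(MediaType.TEXT_PLAIN)
    @Path("/{name}")
    public Response neighbours(@PathParam("name") String name) {
        // Only this fixed, parameterized query is exposed to end users:
        try (Transaction tx = db.beginTx();
             Result result = db.execute(
                 "MATCH (n {name: $name})-[*..3]-(m) RETURN DISTINCT m.name AS name",
                 Collections.singletonMap("name", name))) {
            String body = result.resultAsString();
            tx.success();
            return Response.ok(body).build();
        }
    }
}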
I suggest chapter 8 of the excellent book Learning Neo4j, by Rik Van Bruggen. The book is available for download on the Neo4j website.
One of the sections of this chapter shows some open source visualization libraries and visualization solutions.
EDIT 1:
Analyzing chapter 8 of the Learning Neo4j book a bit more, I believe a promising tool for your use case is the paid solution Linkurio.us (you can run a demo on the site). This solution has native integration with Neo4j and other graph databases.
EDIT 2:
Alternatively, you can build your own visualization solution with a graph visualization library in JavaScript, for example. Here is a very useful answer from another Stack Overflow question that lists some libraries that can help you.

How to visualise Neo4j graph database created from an embedded Neo4j java application

I created an application which embeds Neo4j. In that application I created and stored some nodes with some relationships. My application has saved this database to a file. I would like to visualise that data. I know I can see graphs if I fire up the Neo4j server, but I do not know how to import my neo4j.db file into the server so that I can visualise it. Any suggestions would be greatly appreciated.
Depending on your use case you might have different solutions:
Use a web-based visualization
Use a desktop application to visualize your data
Use web-based visualization
In this case you have to build the web app that visualizes the data.
You basically have two options: JavaScript libraries or Java applets.
For the JavaScript side you have many choices: D3.js, VivaGraph, SigmaJS, KeyLines.
The first three are open source and free, while the last one has a commercial licence and is not free.
There are already a million questions about these libraries on SO, so I'll link you to some of those to understand the various differences.
Desktop Application
The main solutions I would recommend in this case, depending on the kind of data, are Gephi or Cytoscape.
In both cases I believe you have to write your own adapter to communicate with your application.
Architecture Reference
The architecture in both cases will be the following:
The controller renders a webpage with the JS visualisation framework you want to use
The controller offers a couple of JSON endpoints the client can use to query the data from the embedded Neo4j
Each query fetches the data, puts it in a model and renders the JSON to send to the client (see the sketch below)
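A compact sketch of such an endpoint, opening the store in embedded mode with the Neo4j 3.x API and serving query results over the JDK's built-in HTTP server. The store path, port and query are placeholders, and a real app would serialize to proper JSON rather than the tabular text used here:

import com.sun.net.httpserver.HttpServer;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Result;
import org.neo4j.graphdb.Transaction;
import org.neo4j.graphdb.factory.GraphDatabaseFactory;

import java.io.File;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class EmbeddedGraphEndpoint {
    public static void main(String[] args) throws Exception {
        GraphDatabaseService db = new GraphDatabaseFactory()
                .newEmbeddedDatabase(new File("data/neo4j.db")); // your store directory

        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/graph", exchange -> {
            String table;
            try (Transaction tx = db.beginTx();
                 Result result = db.execute(
                     "MATCH (n)-[r]->(m) RETURN n.name, type(r), m.name LIMIT 100")) {
                table = result.resultAsString(); // tabular text for brevity
                tx.success();
            }
            byte[] body = table.getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        Runtime.getRuntime().addShutdownHook(new Thread(db::shutdown));
    }
}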
If you're NOT using Neo4j 2.0+, then a really good way to visualize your graph is by using Neoclipse: https://github.com/neo4j-contrib/neoclipse/downloads
It's really handy and it has Cypher support too.
Or
another quick hack is to copy your db folder (which you created using the embedded database) into $NEO4J_HOME/data/
and
change the $NEO4J_HOME/conf/neo4j-server.properties file to point to that database directory (the org.neo4j.server.database.location property)
and
start your server (bin/neo4j start). You'll be able to visualize your database at localhost:7474
I hope it helps!

Can I connect directly to the output of a Mahout model with other data-related tools?

My only experience with Machine Learning / Data Mining is via SQL Server Analysis Services.
Using SSAS, I can set up models and fire direct singleton queries against them to do things like real-time market basket analysis and product suggestions. I can grab the "results" from the model as a flattened resultset and visualize same elsewhere.
Can I connect directly to the output of a Mahout model with other data-related tools in the same manner? For example, is there any way I can pull out a tabular resultset so I can render same with the visualization tool of my choice? ODBC driver, maybe?
Thanks!
The output of Mahout is generally a file on HDFS, though you could dump it out anywhere Hadoop can put data. With another job to translate it into whatever form you need, it's readable. And if you can find an ODBC driver for the data store you put it in, then yes.
So I suppose the answer is, no, there is not by design any integration with any particular consumer. But you can probably hook up whatever you imagine.
There are some bits that are designed to be real-time systems queried via API, but I don't think that's what you mean.
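As a concrete illustration of the "translate" step mentioned above, here is a rough sketch that reads a Mahout recommender output file from HDFS with the Hadoop FileSystem API and turns it into rows you could load anywhere. The path is hypothetical, and the line format (userID followed by item:score pairs) reflects the item-based RecommenderJob text output; other Mahout jobs write different formats:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class MahoutOutputReader {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical output path of a Mahout RecommenderJob run.
        Path part = new Path("/user/hadoop/recommendations/part-r-00000");
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(fs.open(part), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Lines look like: 123	[456:4.5,789:3.2]
                String[] parts = line.split("\t");
                String user = parts[0];
                String recs = parts[1].replaceAll("[\\[\\]]", "");
                for (String rec : recs.split(",")) {
                    String[] itemScore = rec.split(":");
                    // Emit one (user, item, score) row per recommendation;
                    // from here it could go into any SQL table or CSV.
                    System.out.printf("%s,%s,%s%n", user, itemScore[0], itemScore[1]);
                }
            }
        }
    }
}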
