Graph database referencing Cassandra tables - datastax-enterprise

I have a scenario where I would like to model my IoT asset in the graph database of DataStax Enterprise. This is a perfect fit for my hierarchical data structure. However, when it comes to my time series data, I already have that stored in a separate Cassandra table. Is there a way to bridge the gap between data in the graph database and data in a standard Cassandra table?
Thanks

At the moment, all data needs to reside in DSE Graph tables to be available via Gremlin traversals for OLTP or OLAP use cases. We have features coming out soon, though, that will help with the OLAP scenario. We'd love to learn more about your use case to enhance the product for this type of scenario. If you'd like, please join the DataStax Academy Graph channel and we can discuss this requirement further - https://academy.datastax.com/slack

Related

If I have multiple systems in an organization with different DBs, does Neo4j support putting them all together?

Say an organization has multiple divisions, each maintained in its own database system. If I want to create a Neo4j knowledge graph covering all of those databases, how can I do that without affecting the existing systems, while keeping the knowledge graph up and running?
Neo4j Enterprise Edition 4.0 introduced support for Fabric, which is:
a way to store and retrieve data in multiple databases, whether they are on the same Neo4j DBMS or in multiple DBMSs, using a single Cypher query.
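For example, a single Fabric query can union results from two databases. A minimal sketch - the fabric database name "orgfabric" and the graph names "hr" and "finance" are hypothetical and would be configured in neo4j.conf:

// Read from two federated databases in one Cypher query.
// Names are illustrative; the USE/UNION pattern is the point.
USE orgfabric.hr
MATCH (e:Employee) RETURN e.name AS name
UNION
USE orgfabric.finance
MATCH (v:Vendor) RETURN v.name AS name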

How to create a graph visualisation of a relational database in Neo4j?

I have some normalized Master Data in PostgreSQL.
I want a graph visualization layer in Neo4j without migrating any data to Neo4j. Kind of like a view, with lazy fetching of data at runtime.
Neo4j would not commit any changes; it is only meant for viewing.
Can Neo4j use something like a PostgreSQL JDBC connector and provide a visualization?
Thanks.
You could, with apoc.load.jdbc and virtual nodes/relationships created from the data. It would be a bit involved, though, as you need to load all the tables and then connect them.
With the Neo4j-ETL tool you can do a quick (few minutes) one-time import to visualize:
https://neo4j.com/blog/neo4j-etl-1-2-0-release-whats-new-and-demo/
Especially if you don't just visualize but also query, you will need to transfer the data anyway.
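A minimal sketch of the virtual-graph approach (the JDBC URL, SQL query, labels, and property names are all hypothetical):

// Pull rows over JDBC and turn them into virtual nodes/relationships.
// Nothing is written to the Neo4j store; the graph exists only in the result.
CALL apoc.load.jdbc(
  'jdbc:postgresql://localhost:5432/masterdata',
  'SELECT c.id AS customer_id, c.name AS customer_name, o.id AS order_id FROM customers c JOIN orders o ON o.customer_id = c.id'
) YIELD row
CALL apoc.create.vNode(['Customer'], {id: row.customer_id, name: row.customer_name}) YIELD node AS customer
CALL apoc.create.vNode(['Order'], {id: row.order_id}) YIELD node AS order
CALL apoc.create.vRelationship(customer, 'PLACED', {}, order) YIELD rel
// Note: each row creates fresh virtual nodes; de-duplication is omitted for brevity.
RETURN customer, rel, order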
You can use the ETL tool from Neo4j.
You need to ask for an activation key via email at devrel#neo4j.com

Performance of graph databases

Is there any database size limitation in Neo4j or ArangoDB? I'm using Python. Which one is more consistent?
You'll find both are suitable for a concept project. The key difference you will notice, though, is that ArangoDB is a multi-model database, so it can store normal NoSQL document collections and key/values as well as graph data. Neo4j focuses just on graph data. Typically, any application that stores/reads graph data will also need to deal with flat document collections; if you use Neo4j you'll need to bring in another technology for that, but with ArangoDB it's already there. Both are consistent, and the size limitation is only hardware. Good luck with your concept.

Persisting data to neo4j stand alone server

I'm currently doing some R&D on moving some business functionality from an Oracle RDBMS to Neo4j to reduce join complexity in the application queries. Due to the maintenance and visibility requirements for the data, I believe the stand-alone server is the best option.
My thought is that within a Java program I would pull the relevant data out of the Oracle tables, map it to a node object, and persist it to Neo4j (creating the appropriate relationships in the process).
I'm curious: with SDN over REST not being an optimal solution, what options are available for persistence? Are server plugins or unmanaged extensions the preferred method, or am I overcomplicating the issue, as tends to happen from time to time?
Thank you!
REST refers to a way to query the data over a network, not a way to store it. Typically, you're going to store the data on some machine; you then have the option of either making it accessible via RESTful services with the Neo4j server, or just using Java applications to access it.
I assume by SDN you're referring to Spring Data Neo4j. Spring is a framework used for Java applications, and SDN is a plugin, if you will, for Spring that allows Java programmers to store models in Neo4j. One could indeed use Spring Data Neo4j to read data in and then store it in Neo4j - but again, this is a method of how the data gets into Neo4j; it is not storage by itself.
The storage model is pretty much always the same. This link describes aspects of how storage actually happens.
Now -- to your larger business objective. In order to do this with Neo4j, you're going to need to take a look at your Oracle data and decide how it is best modeled as a graph. There's a big difference between an Oracle RDBMS and Neo4j in terms of how the data is represented. Once you've settled on a graph design, you can then load your data into Neo4j (there are many different options for doing that; one is sketched below).
Will all of this "reduce join complexity in the application queries"? Well, yes, in the sense that Neo4j doesn't do joins. Will it improve the speed/performance of your application? There's just no way to tell. The answer depends on what your app is, what the queries are, how you model the data as a graph, and how you express the resulting queries over that graph.
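One common loading option, as a sketch: export the relevant Oracle tables to CSV, then import with LOAD CSV (the file names, columns, and labels here are hypothetical):

// Create one node per row of an exported employees.csv
LOAD CSV WITH HEADERS FROM 'file:///employees.csv' AS row
MERGE (e:Employee {id: row.emp_id})
SET e.name = row.name;

// Build relationships from a second export, reports_to.csv
LOAD CSV WITH HEADERS FROM 'file:///reports_to.csv' AS row
MATCH (e:Employee {id: row.emp_id})
MATCH (m:Employee {id: row.manager_id})
MERGE (e)-[:REPORTS_TO]->(m);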

Bigdata vs. Neo4j

I've been looking for a triple store for my project. In this project I want to store my data according to certain ontologies (OWL).
From my research I ended up with two technologies, Neo4j and Bigdata, which seem to fit well in this case.
I want to know if either of these two is more appropriate to use with RDF, RDFS, OWL and SPARQL queries.
Neo4j can be used to store data in entity-relationship-entity form. In the case of big data, you should not upload your whole dataset into Neo4j, because it will become very heavy and processing will be very slow. You should use a complementary database for storing the actual data, and store ids and a few parameters in Neo4j for graph traversal, to perform a sort of graph analytics (a sketch of this pattern follows below). Graph analytics is where Neo4j's power lies; otherwise you have to use a graph engine, e.g. GraphX (Spark).
Thanks,
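A minimal sketch of that ID-bridging pattern (the labels, relationship type, and the extId property keying the full records in the complementary store are all hypothetical):

// Keep only lightweight reference nodes in Neo4j; the full records are
// fetched from the complementary database by extId when needed.
MERGE (a:Author {extId: 'auth-7'})
MERGE (d:Document {extId: 'doc-42'})
MERGE (a)-[:WROTE]->(d);

// Traversals then run purely on the graph side:
MATCH (a:Author {extId: 'auth-7'})-[:WROTE]->(d:Document)
RETURN d.extId;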
You might want to try out the SPARQL plugin for Neo4j; see here for an HTTP-based test, and this Berlin Dataset Test for embedded usage.
Neo4j is a specific technology, while big data is more of a generic term. I think what you're really asking about is OLAP vs. OLTP. As data gets bigger, there are differences between use cases for RDF-style graph databases, which are often used for OLAP (On-Line Analytical Processing) style analytics. In short, OLAP is designed for analytics that look across a big data set, while OLTP is more aimed at INSERTs/DELETEs (on potentially big data).
OLAP-based traversals tend to process the entire graph, while OLTP-based traversals tend to process smaller data sets by starting with one or a handful of vertices and traversing from there.
For example, let's say you wanted to calculate the average age of the friends of one particular user: a great use case for OLTP, since the query data set is small. However, if you wanted to calculate the average age of everyone in the database, OLAP is the preferred technology (both queries are sketched below).
OLAP is optimal for deep analysis of a lot of data, while OLTP is better suited for fast-running queries and a lot of INSERTs. If you're trying to achieve an SLA where the analytics must complete within a certain timeframe, consider the type of analytics and which one is better suited. Or maybe you need both.
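To make that example concrete, here is what the two queries might look like in Cypher (purely illustrative; the User label and age property are hypothetical):

// OLTP-style: anchored on a single user, touches only a small neighborhood.
MATCH (:User {name: 'alice'})-[:FRIEND]-(f:User)
RETURN avg(f.age) AS avgFriendAge;

// OLAP-style: scans every User node in the database.
MATCH (u:User)
RETURN avg(u.age) AS avgAgeOverall;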
