I may want to use erlacassa to communicate between Cassandra and Erlang. It is a CQL client. So I was wondering: what are the limitations of CQL (Cassandra Query Language) compared to Cassandra accessed via Thrift?
For example I have found over the internet that:
CQL has some current limitations and does not support operations such
as GROUP BY, ORDER BY
This partly depends on the version of Cassandra that you are using. For example, CQL did not support composite columns until CQL 3.0 (which is available in Cassandra 1.1 but not turned on by default). But for the most part, all major features are available through both the Thrift API and CQL.
As for GROUP BY, it is not supported by either CQL or the Thrift API. ORDER BY is in CQL 3.0, but it can only be used to specify a reversed ordering (the same limitation you would have through Thrift). It sounds like the article you found was comparing Cassandra to a traditional SQL database.
Aside from syntax differences, the biggest difference is that CQL pretends to be SQL, whereas the Thrift API makes no such pretension. Developers will see the SQL-like syntax and make relational assumptions that simply don't apply to Cassandra. For example, there is discussion here advocating for using ORDER BY. In the world of Cassandra, it is far better to denormalize, materializing a view for every way you wish to access the data, than to reach for richer queries.
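To make the denormalization idea concrete without requiring a Cassandra cluster, here is a minimal Python sketch in which two plain dicts stand in for two Cassandra tables: each logical write fans out to both, so each access pattern becomes a direct key lookup instead of a query-time ORDER BY or GROUP BY. The table and event names are invented for illustration.

```python
# Sketch of Cassandra-style denormalization: the same event is written to
# two "tables", each keyed by the way it will be queried. Plain dicts
# stand in for Cassandra column families here.

events_by_user = {}   # partition key: user_id
events_by_day = {}    # partition key: day

def record_event(user_id, day, event):
    # One logical write fans out to every materialized view.
    events_by_user.setdefault(user_id, []).append((day, event))
    events_by_day.setdefault(day, []).append((user_id, event))

record_event("alice", "2012-06-01", "login")
record_event("bob", "2012-06-01", "purchase")
record_event("alice", "2012-06-02", "logout")

# Both access patterns are now direct lookups, no scan or sort needed:
print(events_by_user["alice"])        # alice's events, in write order
print(events_by_day["2012-06-01"])    # everything that happened that day
```

The cost is extra writes and storage; the benefit is that reads never need the relational machinery CQL deliberately lacks.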
Don't get me wrong: I see a lot of value in replacing the Thrift interface with a DSL such as http://glennengstrand.info/nosql/cassandra/cql, but I believe that the familiarity of SQL as a way to access relational data will lead developers into using Cassandra in ways that simply will not scale.
Related
I want to present the release-data complexity associated with each node (epic, user story, etc.) in Grafana in the form of charts, but Grafana does not support the Neo4j database. Is there any way, direct or indirect, to present Neo4j data in Grafana?
I'm having the same issues and found this question among others. From my research I cannot agree with this answer completely, so I felt I should point some things out here.
Just to clarify: a graph database may seem structurally different from a relational or time-series database, but it is possible to build Cypher queries that return graph data as tables with proper columns, just as with any other supported data source. Therefore this sentence of the above-mentioned answer:
So what you want to do is just not possible.
is not absolutely true, I'd say.
The actual problem is that there is no data source plugin for Neo4j available at the moment. You would need to implement one on your own, which will be a lot of work (as far as I can see), but I suspect it is possible. For me, at least, this would be too much work, so I won't take any approach that reads data directly from Neo4j into Grafana.
As a (possibly dirty) workaround in my case, a service will regularly copy the relevant portions of the Neo4j graph into a relational database (or a time-series database, if the data model is simple enough for that), which Grafana does support (see data source plugins), so I can query it from there. This is basically the replication idea also given in the above-mentioned answer. You obviously end up with at least two different database systems and an additional service, which is not great, but at the moment it seems to be the quickest way around the missing data source plugin. Maybe this is applicable in your case, too.
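A rough sketch of such a replication service, assuming SQLite as the relational target. The Neo4j side is stubbed out: `fetch_epic_complexity`, its sample rows, and the `complexity` table name are all placeholders for illustration; a real service would run a Cypher query through a Neo4j driver and point Grafana at a database it can reach.

```python
import sqlite3

def fetch_epic_complexity():
    # Placeholder for the Neo4j side: a real service would run a Cypher
    # query (via the official driver) that returns graph data as plain
    # rows. Sample rows are hard-coded here.
    return [("EPIC-1", "epic", 42), ("US-7", "userstory", 13)]

def replicate(conn):
    # Idempotent upsert so the service can run on a schedule.
    conn.execute("""CREATE TABLE IF NOT EXISTS complexity
                    (node_id TEXT PRIMARY KEY, node_type TEXT, score INTEGER)""")
    conn.executemany("INSERT OR REPLACE INTO complexity VALUES (?, ?, ?)",
                     fetch_epic_complexity())
    conn.commit()

# In-memory DB for the sketch; a real setup would use a server Grafana can query.
conn = sqlite3.connect(":memory:")
replicate(conn)
print(conn.execute("SELECT node_id, score FROM complexity").fetchall())
```

Grafana then charts the `complexity` table like any other SQL data source, at the cost of the replication lag introduced by the service's schedule.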
Using Neo4j's Graphite metrics, you can actually configure data to be sent to Grafana, and from there build whichever dashboards you like.
Up until recently, Graphite/Grafana wasn't supported, but it is now (as of the 3.4 series releases), along with Prometheus and other options.
Update July 2021
There is a new plugin called Node Graph Panel (currently in beta) that can visualise graph structures in Grafana. A prerequisite for displaying your graph is to make sure that you have an API that exposes two data frames, one for nodes and one for edges, and that you set frame.meta.preferredVisualisationType = 'nodeGraph' on both data frames. See the Data API specification for more information.
So, one option would be to set up an API around your Neo4j instance that returns the nodes and edges according to the specification above. Note that I haven't tried it myself (yet), but it seems like a viable solution for getting Neo4j data into Grafana.
Grafana supports these databases, but not Neo4j: Graphite, InfluxDB, OpenTSDB, Prometheus, Elasticsearch, CloudWatch.
So what you want to do is just not possible.
You can replicate your Neo4j data inside one of those databases, but the data models are really different (time series vs. graph).
If you just want to have some charts, you can use Apache Zeppelin for that.
I am learning Neo4j. I am accessing Neo4j via the REST API supported by server mode, with CRUD operations implemented using Neo4jOperations. For experimentation, I have benchmarked its read operations, and I have found that the methods 'query' and 'queryForObjects' take a huge amount of execution time, even though I am querying via a field that is indexed. The traversals are not complex.
I have 500K+ nodes and 900K+ relationships.
Neo4j version: 3.0.8.
Is there any solution to improve the performance of query on neo4j in server mode?
Without looking at your actual queries and model it is hard to say why the performance would not be up to your expectations. Try to run the queries through the Neo4j browser and either EXPLAIN or PROFILE them, that may give you a hint of where the issue is.
Having said that, you really should move to version 3.2.1 and access the server over the bolt:// protocol. That by itself should already significantly improve things.
Regards,
Tom
I am evaluating if I should use Spring Data Neo4j 4 or directly use the native APIs that Neo4j has. Is it possible to get the full potential of Neo4j when using Spring Data Neo4j 4 or will it limit my future usage of Neo4j?
I do see the benefit that POJOs simplify the storage of objects in the database.
The recently updated content on https://graphaware.com/spring-data-neo4j may provide you with additional information to consider.
In my view, yes, SDN allows you to make use of the full potential of Neo4j. That said, for use cases where needed you can also sidestep SDN and make direct use of the underlying OGM and/or Cypher directly. In other words, when making use of SDN you also have the freedom and flexibility to use alternate options of what best suits your needs, so your usage does not need to be an "all SDN" or "no SDN" approach; you can mix and match as needed.
There are two "native" APIs:
the Java API, which you can access in unmanaged extensions or when using Neo4j embedded
the Neo4j Java driver (a.k.a. Bolt), which is what Neo itself promotes the most
OGM (and therefore SDN) supports both embedded and bolt, with new features of Bolt being covered shortly after being released.
There are some features of the embedded database that can't be used over Bolt (at least not directly; you may reach them through user-defined procedures/functions), e.g. the Traversal API.
You should also consider other aspects of your use case, like performance, if your domain model matches the graph model etc..
After reading through the Traversal API, I liked the concepts of BranchSelector, Expander and Uniqueness. They amount to describing how a traversal should be made; in other words, it felt like giving a declarative description of the traversal to be performed, and the fluent API is well suited for this purpose. However, it seems the Traversal API can only target a Neo4j database that is accessible through the file system, that is, we need to specify the graph.db folder path. This essentially means we can only use it in embedded mode. Can we use the Traversal API to perform a traversal on a graph running on a remote machine?
In particular, I want the convenience of the API (BranchSelector, Expander and Uniqueness) available for performing traversals.
I read that we can use Bolt to access embedded Neo4j. However, that does not seem to mean we can use embedded Neo4j from a remote machine.
So it seems that there is no way to use the Traversal API if I cannot access the physical (directory) location of the graph database. Is that so?
The Traversal API can only be used if the code is colocated with the data, otherwise it would usually be awfully slow. The problem is that it's not purely declarative, since you can provide implementations of PathExpander, Evaluator, etc. instead of only using pre-defined constants.
However, there are several ways of having this code colocated:
one is indeed by using Neo4j in embedded mode, as you have noted
another one is by extending Neo4j, using either user-defined procedures/functions or unmanaged extensions
See APOC for a collection of procedures developed by the Neo4j community, including some traversals.
I'm currently doing some R&D on moving some business functionality from an Oracle RDBMS to Neo4j in order to reduce join complexity in the application queries. Given the maintenance and visibility requirements for the data, I believe the standalone server is the best option.
My thought is that within a java program I would pull the relevant data out of the Oracle tables, map it to a node object and persist it to neo4j (creating the appropriate relationships in the process).
I'm curious: with SDN over REST not being an optimal solution, what options are available for persistence? Are server plugins or unmanaged extensions the preferred method, or am I overcomplicating the issue, as tends to happen from time to time?
Thank you!
REST refers to a way to query the data over a network, not a way to store the data. Typically, you're going to store the data on some machine; you then have the option of either making it accessible via RESTful services with the neo4j server, or just using java applications to access the data.
I assume by SDN you're referring to Spring Data Neo4j. Spring is a framework used for Java applications, and SDN is, if you will, a plugin for Spring that allows Java programmers to store models in Neo4j. One could indeed use Spring Data Neo4j to read data in and then store it in Neo4j, but again, this is a method of how the data gets into Neo4j; it is not storage by itself.
The storage model in most cases is pretty much always the same. This link describes aspects of how storage actually happens.
Now -- to your larger business objective. In order to do this with neo4j, you're going to need to take a look at your oracle data and decide how it is best modeled as a graph. There's a big difference between an oracle RDBMS and Neo4J in terms of how the data is represented. Once you've settled on a graph design, you can then load your data into neo4j (many different options for doing that).
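As an illustration of that loading step, here is a hedged Python sketch that turns relational rows (a customer/order join, with all labels, property names, and sample values invented for the example) into Cypher MERGE statements. A real loader would hand these to a Neo4j driver with proper query parameters, or use a bulk mechanism such as LOAD CSV, rather than interpolating values into strings.

```python
# Sketch: map relational rows (customer -> order) onto graph structure by
# generating Cypher statements. Labels and property names are invented.
rows = [
    {"customer_id": 1, "customer_name": "Acme", "order_id": 100},
    {"customer_id": 1, "customer_name": "Acme", "order_id": 101},
]

def row_to_cypher(row):
    # MERGE keeps the load idempotent: re-running it won't duplicate nodes.
    # A real loader would pass these values as query parameters instead of
    # interpolating them into the statement text.
    return (
        "MERGE (c:Customer {id: %(customer_id)d, name: '%(customer_name)s'}) "
        "MERGE (o:Order {id: %(order_id)d}) "
        "MERGE (c)-[:PLACED]->(o)" % row
    )

statements = [row_to_cypher(r) for r in rows]
print(statements[0])
```

The point of the sketch is the shape of the mapping: foreign keys become relationships, and the join table disappears into the graph structure itself.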
Will all of this "reduce join complexity in the application queries"? Well, yes, in the sense that Neo4j doesn't do joins. Will it improve the speed/performance of your application? There's just no way to tell. The answer to that depends on what your app is, what the queries are, how you model the data as a graph, and how you express the resulting queries over that graph.