Which graph database [closed] - neo4j

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 1 year ago.
Which graph database should I use when dealing with a couple of thousand nodes and a couple of thousand relationships? Are these big numbers for any database or not? Which graph database is the fastest at read operations (assuming all data is loaded once at the beginning)?
I had a look at neo4j and its visualization tool. Will I be able to have such a visualization tool in my application?

The questions you'll need to ask and answer for a graph database are similar to any other database. How much data? In memory or persistent? How will you interface with it? Embedded or a server process? Distributed or localized? Licensing?
A couple of thousand nodes and relationships is small for a graph database and most any graph database solution will work. For most people Neo4j is a fine choice, but there are some caveats. First, the licensing of Neo4j can be problematic in many situations. Secondly, the visualizer is part of the Neo4j server process - which means you're going to have another server process running. If you're concerned about the licensing you may want to check out OrientDB, which is under the Apache license, and thus very flexible.
From the sounds of it, you have a fairly small system and may be able to get by with TinkerGraph, an in-memory graph database from Marko Rodriguez and the TinkerPop hackers. It has the option to persist your data to a file if needed, is amazingly lightweight, and, like Neo4j and OrientDB, supports all the graph tools from the TinkerPop stack, including the JUNG Ouplementation, which can give you the visualizations you desire.
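To give a feel for how small a couple of thousand nodes and relationships is, here is a minimal sketch in plain Python. It is not TinkerGraph or any graph database API; the function names and the random data are made up for illustration. It builds a graph of that size in memory and runs a breadth-first traversal over it.

```python
from collections import defaultdict, deque
import random


def build_graph(num_nodes, num_edges, seed=0):
    """Build a random directed graph as an adjacency list (dict of sets)."""
    rng = random.Random(seed)
    adjacency = defaultdict(set)
    edges_added = 0
    while edges_added < num_edges:
        src = rng.randrange(num_nodes)
        dst = rng.randrange(num_nodes)
        # Skip self-loops and duplicate edges so the edge count is exact.
        if src != dst and dst not in adjacency[src]:
            adjacency[src].add(dst)
            edges_added += 1
    return adjacency


def reachable(adjacency, start):
    """Return the set of nodes reachable from start (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


# Illustrative size only: 2000 nodes and 2000 relationships.
graph = build_graph(num_nodes=2000, num_edges=2000)
edge_count = sum(len(targets) for targets in graph.values())
```

At this scale the whole structure fits comfortably in memory, which is why an embedded, in-memory option is worth considering before a separate server process.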


triplestore or graph database more suitable for this application? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
I am trying to build a domain-specific question answering system using Wikidata, which is already available as an RDF/JSON dump.
There are pros and cons to both a triplestore (e.g. Virtuoso) and a graph database (e.g. Neo4j). I have done a lot of reading but still can't decide between them.
People say a triplestore is good for inferencing/reasoning while a graph database is not, but I thought that with Neo4j we can write Cypher in a way that does inferencing too, such as this. So what am I missing? Is it easier to query with SPARQL than with Cypher for inferencing?
I am leaning towards a graph database, as it has a couple of advantages as explained here, but that may be biased because it was written by Neo4j. If I choose this route, does it mean I have to sanitize a lot of RDF data when importing into Neo4j with the neosemantics plugin? I am not sure how straightforward that is, or whether the import will be painful.
I would prefer to have everything stored in a single database, including the user login system and application-specific data.
So which is more suitable in my situation, a triplestore or a graph database? Thanks
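To make the inferencing part of the question concrete, here is a minimal sketch of what a reasoner does with subclass relations. It is plain Python over tuples, not SPARQL, Cypher, or any triplestore API. The predicate names `type` and `subClassOf` stand in for `rdf:type` and `rdfs:subClassOf`, and the sample data is made up.

```python
def infer(triples):
    """Forward-chain two RDFS-style rules until no new facts appear.

    Rule 1: (x type C) and (C subClassOf D)        => (x type D)
    Rule 2: (C subClassOf D) and (D subClassOf E)  => (C subClassOf E)
    """
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        subclass = {(s, o) for s, p, o in facts if p == "subClassOf"}
        new = set()
        for s, p, o in facts:
            if p in ("type", "subClassOf"):
                # Lift the object of the fact to every known superclass.
                new |= {(s, p, sup) for sub, sup in subclass if sub == o}
        if not new <= facts:
            facts |= new
            changed = True
    return facts


# Hypothetical sample data.
asserted = {
    ("Dog", "subClassOf", "Mammal"),
    ("Mammal", "subClassOf", "Animal"),
    ("rex", "type", "Dog"),
}
inferred = infer(asserted)
```

A store with built-in reasoning derives facts like `("rex", "type", "Animal")` for you. Without it, the equivalent logic has to live in the query itself (for example a variable-length path) or in application code like the above.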

What are the options when it comes to handling Data Lineage in Snowflake? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 2 years ago.
Any ideas/options about handling Data Lineage in Snowflake? We are following a microservice architecture in which we are running a set of stored procedures that contain quite a few SQL queries as soon as certain events are triggered.
Example: When Table A is populated execute SP_Populate_Table_B and the result is that Table B is populated. We have a big set of SPs as we are populating the Staging Area, DataVault and our Dimensional Model.
We are on the lookout for a good way of handling all the metadata around this microservice way of performing our ETL: basically an automated way to track dependencies between tables, visualize the orchestration, and better handle changes to the SPs when tables change.
Can you please advise on some frameworks or tools, preferably open-source, that you have tried with Snowflake? Would dbt be a solution for that?
Thank you
Pantelis
dbt is a good solution to deploying your warehouse as code, but not a great solution for using your warehouse as a db for services to write intermediary tables.
If you care about data lineage, and you're willing to rethink the SP approach, then I would recommend dbt as a tool to deploy your warehouse infrastructure as code, and easily understand the downstream dependencies of your data.
dbt is great if you are willing to approach everything as an ELT problem, and allow dbt to be the infrastructure that transforms a subset of your mass-loaded data/events, into something that is ready to be analyzed or ingested for BI.
Read this for more context:
https://discourse.getdbt.com/t/understanding-idempotent-data-transformations/518
I'm not 100% sure if it supports Snowflake just yet, but I'd highly recommend looking into Pachyderm. I believe it was built to solve just this kind of problem.
Might be worth a look or even contributing to if you really want Snowflake support.
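Independent of which tool you pick, the table-to-table dependency tracking asked about in the question can be sketched in a few lines. This is only an illustration in plain Python (3.9+ for `graphlib`), not dbt, Pachyderm, or a Snowflake API. `SP_Populate_Table_B` and tables A and B come from the question; the other procedures and tables are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical metadata: each stored procedure with the tables it reads and writes.
PROCEDURES = {
    "SP_Populate_Table_B": {"reads": ["TABLE_A"], "writes": ["TABLE_B"]},
    "SP_Populate_Table_C": {"reads": ["TABLE_B"], "writes": ["TABLE_C"]},
    "SP_Populate_Table_D": {"reads": ["TABLE_A", "TABLE_C"], "writes": ["TABLE_D"]},
}


def table_dependencies(procedures):
    """Map each written table to the set of tables it is derived from."""
    deps = {}
    for spec in procedures.values():
        for target in spec["writes"]:
            deps.setdefault(target, set()).update(spec["reads"])
    return deps


def downstream(procedures, table):
    """Return every table that directly or indirectly depends on `table`."""
    deps = table_dependencies(procedures)
    affected = set()
    frontier = [table]
    while frontier:
        current = frontier.pop()
        for target, sources in deps.items():
            if current in sources and target not in affected:
                affected.add(target)
                frontier.append(target)
    return affected


def load_order(procedures):
    """Return tables in an order where every table follows its sources."""
    return list(TopologicalSorter(table_dependencies(procedures)).static_order())
```

`downstream(PROCEDURES, "TABLE_A")` answers "what breaks if I change Table A?", and `load_order` gives a valid orchestration order. The hard part in practice is producing the `reads`/`writes` metadata, which is what tools like dbt get for free because models declare their sources.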

Neo4j community edition restrictions and limitations [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I was working with JanusGraph, but it has a lot of drawbacks, so I am looking at other graph databases such as Neo4j.
I want to avoid the problems I had with JanusGraph, and answers to these questions will help:
1. What are the restrictions of Neo4j Community Edition?
2. Does Neo4j have the concept of composite and mixed indexes?
3. Can I manage indexes easily, e.g. create and delete them?
4. Can I perform a contains operation as in an RDBMS?
5. Can the drivers provided for C#, Python, etc. perform all types of queries supported by Neo4j?
6. Does Neo4j have a problem with the following scenario:
   - creating node types and properties
   - inserting data
   - creating an index for the existing structure
   - changing the old node and relationship structure by adding new properties or types
   - creating a new index combining the old and new properties
I faced these problems with JanusGraph, so I don't want to run into them again.
Neo4j Enterprise is free to use under its open source license. You can use it in production, the US federal government does already. Neo4j.com won't help you find details on it. Many people are not aware of this.
https://GraphStack.io has more info.
I don't know what the problems with Janus were - you don't mention them - but to answer your questions:
1. The best place to look is http://neo4j.com/editions/. There are no differences in terms of capacity or the Cypher language (except property existence constraints). What you do lose is things like Clustering and High Availability.
2. Neo4j does have a composite index and composite constraints.
3. Yep.
4. I think you're asking if you can do the equivalent of LIKE in SQL? If so, yes: you have STARTS WITH, ENDS WITH and CONTAINS for strings.
5. Yes, if you can write it in Cypher, you can execute it in the drivers.
6. Neo4j is schema-less, so this scenario has no problem.
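As a rough illustration of the string predicates and composite indexes mentioned above, here is a small Python sketch that only builds Cypher text; it does not talk to a database. The helper names, the `Person` label and the `name`/`city` properties are made up. The index statement uses the `CREATE INDEX ... FOR ... ON` form from Neo4j 4.x and later; older versions use `CREATE INDEX ON :Label(prop1, prop2)`, so check the manual for your version.

```python
STRING_OPERATORS = ("STARTS WITH", "ENDS WITH", "CONTAINS")


def string_filter_query(label, prop, operator):
    """Build a parameterised Cypher query that filters nodes by a string predicate."""
    if operator not in STRING_OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    # Labels and property names cannot be query parameters, so validate them.
    if not (label.isidentifier() and prop.isidentifier()):
        raise ValueError("label and property must be plain identifiers")
    return f"MATCH (n:{label}) WHERE n.{prop} {operator} $value RETURN n"


def composite_index_statement(name, label, props):
    """Build a CREATE INDEX statement over several properties (Neo4j 4.x+ syntax)."""
    for ident in (name, label, *props):
        if not ident.isidentifier():
            raise ValueError(f"not a plain identifier: {ident}")
    columns = ", ".join(f"n.{p}" for p in props)
    return f"CREATE INDEX {name} FOR (n:{label}) ON ({columns})"


# Hypothetical label and properties, for illustration only.
contains_query = string_filter_query("Person", "name", "CONTAINS")
index_statement = composite_index_statement("person_name_city", "Person", ["name", "city"])
```

With the official Python driver you would pass the query and the value separately, along the lines of `session.run(contains_query, value="Ann")`, so the search string is never concatenated into the Cypher text.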

Which API to use for realtime document collaboration [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 1 year ago.
I am currently building a virtual classroom website and so far I have successfully added webcam & audio functionality.
The next thing that is on my list is to add realtime document collaboration.
So how this would work is:
Two people join a private session
These two people have a shared document that they can both edit and changes are displayed in realtime to the other user.
An example of this would be Google Docs, where multiple people can work on one document.
Anyway, I have seen a few APIs that do this. For example, I have looked into the Google Docs API, but it requires you to have a Google account, which is not optimal. (Registering both on my website and on Google Docs can be a hassle or too much work for some people.)
I have also looked into Zoho, but I am unsure if it can fill my needs.
Does anyone know an API that can do this? Preferably for both documents and sheets (Excel-like).
Thanks!
The Google Realtime API is especially well-suited for document collaboration, but it sounds like it's not a good fit. There are a few other options out there:
ShareDB is an open-source realtime database backend, used in the DerbyJS framework.
Mozilla's TogetherJS provides view-level collaboration features.
Convergence (disclaimer: I am a founder) is a new hosted platform providing APIs for this sort of functionality. We have identified the most common pain points when implementing realtime collaboration features, and provide high-level APIs to solve them.
Multiplayer is a concurrent-editing database. It looks like it is based on Operational Transformation, and they are planning to launch on Kickstarter. It looks like it can do exactly what you need, and they use WebSockets to send changes in real time.
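For a sense of what Operational Transformation involves, here is a minimal sketch for the simplest case: two users inserting text concurrently into the same string. It is a toy in plain Python, not the algorithm of any product listed above. The `Insert` type, the `site` tie-breaker and the function names are made up for illustration, and deletes are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int   # index in the document where the text is inserted
    text: str  # text to insert
    site: int  # id of the editing user, used only to break ties


def apply(doc, op):
    """Apply an insert operation to a document string."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op, other):
    """Rewrite `op` so it can be applied after `other` has already been applied."""
    # If the other insert landed before ours (or at the same spot and the other
    # site wins the tie), our position shifts right by the inserted length.
    if other.pos < op.pos or (other.pos == op.pos and other.site < op.site):
        return Insert(op.pos + len(other.text), op.text, op.site)
    return op


# Two users edit "abcdef" at the same position at the same time.
doc = "abcdef"
a = Insert(pos=3, text="X", site=1)
b = Insert(pos=3, text="Y", site=2)

# User 1 applies a locally, then receives b and transforms it against a.
at_site_1 = apply(apply(doc, a), transform(b, a))
# User 2 applies b locally, then receives a and transforms it against b.
at_site_2 = apply(apply(doc, b), transform(a, b))
```

Both users end up with the same document even though they applied the operations in a different order. Real systems also need deletes, operation histories and a server to order operations, which is why a ready-made backend is attractive.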

Modeling software for network serialization protocol design [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 4 years ago.
I am currently designing a low level network serialization protocol (in fact, a refinement of an existing protocol).
As the work progresses, pen and paper documents start to show their limits: I have tons of papers, new and outdated mixed together, etc. And I can't show anything to anyone since I describe the protocol using my own notation (a mix of flow charts and C structures).
I need software that would help me design a network protocol. I should be able to create structures, fields, their sizes, their layout, etc., and the software would generate some nice UML-ish diagrams.
Sorry to say, everything I've seen so far (various serial protocols for embedded devices/networks) has used Word documents, with plain old tables showing allocations of fields to the bytes in the message. Alternatively, I've seen it done in Excel documents! It works, and people can read it.
Unfortunately, that's not helpful for automatic code generation, unless you have a very strict format in e.g. an Excel doc that you can then parse with a tool to generate some code. It would be good to have a notation that can be easily machine parsed, as well as human readable.
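One way to get a notation that is both machine-parsable and human-readable is to keep the field table as data and derive everything else from it. The sketch below uses Python's standard `struct` module. The header fields, their sizes and the big-endian byte order are assumptions made up for illustration, not part of any real protocol.

```python
import struct

# Hypothetical message header: (field name, struct format code, description).
HEADER_FIELDS = [
    ("version", "B", "protocol version"),
    ("msg_type", "B", "message type"),
    ("length", "H", "payload length in bytes"),
    ("sequence", "I", "sequence number"),
]

BYTE_ORDER = ">"  # assumed: network byte order (big-endian), no padding


def header_format(fields):
    """Derive the struct format string from the field table."""
    return BYTE_ORDER + "".join(code for _, code, _ in fields)


def layout_table(fields):
    """Return a plain-text table with the offset and size of every field."""
    lines = [f"{'offset':>6}  {'size':>4}  {'field':<10}  description"]
    offset = 0
    for name, code, description in fields:
        size = struct.calcsize(BYTE_ORDER + code)
        lines.append(f"{offset:>6}  {size:>4}  {name:<10}  {description}")
        offset += size
    return "\n".join(lines)


def pack_header(fields, values):
    """Serialize a dict of field values into bytes, in table order."""
    return struct.pack(header_format(fields), *(values[name] for name, _, _ in fields))


def unpack_header(fields, data):
    """Parse bytes back into a dict keyed by field name."""
    raw = struct.unpack(header_format(fields), data)
    return {name: value for (name, _, _), value in zip(fields, raw)}
```

`print(layout_table(HEADER_FIELDS))` gives the same kind of byte-allocation table people otherwise maintain by hand in Word or Excel, while `pack_header` and `unpack_header` use the same table, so the documentation and the code cannot drift apart.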
For showing message handshaking and sequences, a UML sequence diagram is good of course. There are lots of tools readily available to help you with that part of it.
