Can I convert Neo4J Database files to XML?
I agree, GraphML is the way to go, if you don't have problems with the verbosity of XML. A simple way to do it is to open the Neo4j graph from Gremlin, where GraphML is the default import/export format, something like
peters: ./gremlin.sh
gremlin> $_g := neo4j:open('/tmp/neo4j')
==>neograph[/tmp/neo4j, vertices:2, edges:1]
gremlin> g:save('graphml-export.xml')
As described here
Does that solve your problem?
With Blueprints, simply do:
Graph graph = new Neo4jGraph("/tmp/mygraph");
GraphMLWriter.outputGraph(graph, new FileOutputStream("mygraph.xml"));
Or, with Gremlin (which does the same thing in the back):
g = new Neo4jGraph('/tmp/mygraph');
g.saveGraphML('mygraph.xml');
Finally, you can also pass a GraphDatabaseService instance to the Neo4jGraph constructor.
I don't believe anything exists out there for this, at least not as of a few months ago when I was working with it. From what I saw, there are two main roadblocks:
XML is hierarchical, so you can't readily represent graph data in this format.
Lack of explicit IDs for nodes. Even though implicit IDs exist, it'd be like using ROWID in Oracle for import/export: they are not guaranteed to stay the same.
Some people have suggested that GraphML would be the proper format for this, and I'm inclined to agree. If your data isn't really graph-structured and would be fine represented in an XML/hierarchical format, well, then that's just bad luck. Since the majority of users who would tackle this sort of enhancement are using data that wouldn't store that way, I don't see a plain XML solution coming out; it's more likely that a format supporting all uses will appear first.
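To illustrate how GraphML addresses the two roadblocks above, here is a minimal sketch in Python using only the standard library. It is not how Gremlin or Blueprints write GraphML; the function name `to_graphml`, the input shapes, and the `label` key are assumptions made for this example. The point is that nodes and edges are stored as flat lists with explicit ids, so the hierarchical nature of XML does not get in the way.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def to_graphml(nodes, edges):
    """Serialize a small graph to a GraphML string.

    nodes: {node_id: {property_name: value}}   (hypothetical input shape)
    edges: [(source_id, target_id, label)]     (hypothetical input shape)
    """
    root = ET.Element("graphml", {"xmlns": GRAPHML_NS})

    # Declare one <key> per node property, plus one for the edge label.
    prop_names = sorted({name for props in nodes.values() for name in props})
    for name in prop_names:
        ET.SubElement(root, "key", {
            "id": name, "for": "node",
            "attr.name": name, "attr.type": "string",
        })
    ET.SubElement(root, "key", {
        "id": "label", "for": "edge",
        "attr.name": "label", "attr.type": "string",
    })

    graph = ET.SubElement(root, "graph", {"id": "G", "edgedefault": "directed"})

    # Nodes are a flat list; each carries an explicit id.
    for node_id, props in nodes.items():
        node = ET.SubElement(graph, "node", {"id": str(node_id)})
        for name, value in props.items():
            ET.SubElement(node, "data", {"key": name}).text = str(value)

    # Edges refer to nodes by id rather than by nesting.
    for index, (source, target, label) in enumerate(edges):
        edge = ET.SubElement(graph, "edge", {
            "id": f"e{index}", "source": str(source), "target": str(target),
        })
        ET.SubElement(edge, "data", {"key": "label"}).text = str(label)

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_graphml(
        {"n1": {"name": "Johan"}, "n2": {"name": "Max"}},
        [("n1", "n2", "FRIEND")],
    ))
```

The ids in such a file are whatever the exporter chooses to write, so the concern about implicit database ids still applies to the export step itself.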
Take a look at NoSqlUnit
It has tools for converting GraphML to neo4j and back again.
In particular, there is com.lordofthejars.nosqlunit.graph.parser.GraphMLWriter and com.lordofthejars.nosqlunit.graph.parser.GraphMLReader which read / write XML files to / from a neo4j database.
Related
Are there any utilities to import a database from Neo4j into ArangoDB? The arangoimp utility expects edge and vertex data in a different format from what Neo4j exports.
Thanks!
Note: This is not an answer per se, but a comment wouldn't allow me to structure the information I gathered in a readable way.
Resources online seem to be scarce with respect to the transition from neo4j to arangodb.
One possible way is to combine APOC (https://github.com/neo4j-contrib/neo4j-apoc-procedures) and neo4j-shell-tools (https://github.com/jexp/neo4j-shell-tools)
1. Use apoc to create a cypher export file for the database (see https://neo4j.com/developer/kb/export-sub-graph-to-cypher-and-import/)
2. Use the neo4j-shell-tools cypher import with the -o switch -- this should generate csv-files
3. Analyse the csv-files, then either
   - massage them with csvtool, OR
   - create json-data with one of the numerous csv2json converters available (npm, ...) and massage these files with jq
4. Feed the files to arangoimp; repeat step 3 if necessary
There is also a graphml to json converter (https://github.com/uskudnik/GraphGL/blob/master/examples/graphml-to-json.py) available, so you could use the aforementioned neo4j-shell-tools to export to graphml, convert this representation to json, and massage these files into the necessary format.
I'm sorry that I can't be of more help, but maybe these thoughts get you started.
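As a rough illustration of the "massage" step, here is a minimal Python sketch that turns CSV text into one JSON document per line. The CSV column names (`id`, `start`, `end`) and the collection name are assumptions; the actual columns produced by the export will differ and need to be checked. What ArangoDB does expect is a `_key` attribute on vertex documents and `_from`/`_to` attributes of the form `collection/key` on edge documents.

```python
import csv
import io
import json


def vertices_to_jsonl(csv_text, id_column="id"):
    """Convert vertex rows to JSON lines, mapping the id column to _key.

    id_column is a hypothetical column name; adjust it to the real export.
    """
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Keep all non-empty properties except the id column itself.
        doc = {k: v for k, v in row.items() if k != id_column and v != ""}
        doc["_key"] = row[id_column]
        lines.append(json.dumps(doc, sort_keys=True))
    return "\n".join(lines)


def edges_to_jsonl(csv_text, vertex_collection,
                   start_column="start", end_column="end"):
    """Convert edge rows to JSON lines with _from/_to document handles.

    start_column and end_column are hypothetical column names.
    """
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        doc = {k: v for k, v in row.items()
               if k not in (start_column, end_column) and v != ""}
        doc["_from"] = f"{vertex_collection}/{row[start_column]}"
        doc["_to"] = f"{vertex_collection}/{row[end_column]}"
        lines.append(json.dumps(doc, sort_keys=True))
    return "\n".join(lines)


if __name__ == "__main__":
    print(vertices_to_jsonl("id,name\n1,Johan\n2,Max\n"))
    print(edges_to_jsonl("start,end,type\n1,2,FRIEND\n", "persons"))
```

The vertex file would be imported into a document collection first and the edge file into an edge collection afterwards, so that the `_from`/`_to` handles point at existing documents.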
I am using Graph() in RDFLib, and I am correctly getting results from the graph using SPARQL. Is it possible to get the results directly in HTML table format?
rdflib is a library for working with RDF in Python, not an HTML rendering engine. Usually if you work on a graph.sparql() query, you want to access the result in Python itself.
That said, there is a fork focusing on hosting RDF called rdflib-web. In it you can find an htmlresults.py which does pretty much what I think you want.
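If pulling in rdflib-web is too much, rendering the rows yourself is a few lines of Python. This is a minimal sketch and not the htmlresults.py implementation; the function name `rows_to_html_table` is made up for this example, and it only assumes that you can supply a list of column names and an iterable of row tuples.

```python
from html import escape


def rows_to_html_table(headers, rows):
    """Render column names and row tuples as a plain HTML table string."""
    header_cells = "".join(f"<th>{escape(str(h))}</th>" for h in headers)
    parts = ["<table>", f"<tr>{header_cells}</tr>"]
    for row in rows:
        # Unbound values (None) become empty cells; everything is escaped.
        cells = "".join(
            f"<td>{escape('' if cell is None else str(cell))}</td>"
            for cell in row
        )
        parts.append(f"<tr>{cells}</tr>")
    parts.append("</table>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(rows_to_html_table(["name", "age"], [("Johan", 42), ("Max", None)]))
```

As far as I know, an rdflib query result exposes its variable names as `vars` and can be iterated as rows, so something like `rows_to_html_table(result.vars, result)` should work, but treat that as an assumption to verify against your rdflib version.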
I've been looking for a way to 'productionize' R- or Python-based random forest/gradient boosting tree models, and had thought that since all the individual component decision trees are binary trees, exporting to a graph database might be a workable solution (deploying by holding the models in memory and invoking them from a lightweight RESTful library like Flask doesn't scale that well). Here's how a decision tree is normally traversed:
1.) Data gets passed to the root node.
2.) We check if the present node is a leaf node; if it is, we return a set of attributes (the predicted distribution/value).
3.) If not, the node stores a decision rule, and checks the relevant column for which node to pass the data to next (e.g., "If age>9.5, move to left node").
4.) Repeat 2-3.
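In plain Python, the traversal steps above can be sketched roughly as follows. The node layout (a dict with `column`, `threshold`, `left`, `right`, and `prediction` fields) is a hypothetical representation chosen for this example, not the format that any R or Python library exports.

```python
def predict(node, record):
    """Walk a binary decision tree until a leaf is reached.

    A leaf is any node that carries a "prediction" entry. An internal
    node stores a rule on one column: values greater than the threshold
    go to the left child, mirroring the "age>9.5, move to left" example.
    """
    while "prediction" not in node:
        if record[node["column"]] > node["threshold"]:
            node = node["left"]
        else:
            node = node["right"]
    return node["prediction"]


# Hypothetical two-level tree used only for illustration.
example_tree = {
    "column": "age",
    "threshold": 9.5,
    "left": {"prediction": 0.8},
    "right": {
        "column": "height",
        "threshold": 120,
        "left": {"prediction": 0.5},
        "right": {"prediction": 0.1},
    },
}

if __name__ == "__main__":
    print(predict(example_tree, {"age": 12, "height": 150}))
    print(predict(example_tree, {"age": 7, "height": 110}))
```

Each dict here would correspond to one node in the graph, with the rule stored as node properties and the `left`/`right` links stored as relationships.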
I'm new to neo4j and graph databases in general, and it wasn't clear to me whether it is possible to store (and subsequently traverse) decision rules in a node; all the examples I saw tended to be in the vein of
MATCH (neo:Database {name:"Neo4j"})
MATCH (johan:Person {name:"Johan"})
CREATE (johan)-[:FRIEND]->(:Person:Expert {name:"Max"})-[:WORKED_WITH]-> (neo)
where the conditional statements are prespecified in a query. Is this something which is feasible with neo4j, and if so, which areas of the documentation should I be focusing on?
Thank you for any guidance you could provide.
Interesting problem.
You need a way to export a model out of R or Python and translate that into a Neo4J graph.
The export mechanism can be PMML (if you're using R's rpart package to generate pruned trees), Google protobuf (if you're using R's gbm package to generate trees), or simply an Excel spreadsheet.
Parsing and unmarshalling to Neo4J is your issue.
I am not affiliated with Yhat in any way, but reading your question made me think of an alternative approach.
Yhat Science Ops
I don't know what that means for your team internally, but it seems like a pretty simple way to have a model easy to call via a basic API call.
I'm just starting to learn Neo4j and I've just stumbled across some issue.
It looks like Neo4j is using strong typing without on-the-fly type conversion, i.e. RETURN '17' = 17 results in false and RETURN '10' > 5 results in a syntax error.
It looks very strange to me for a NoSQL, schema-less database to implement such strict behavior. Even strongly typed, schema-based databases such as MySQL and PostgreSQL allow type conversions in statements. Is this a deliberate ideology behind Neo4j? If so, why?
Github issue.
In Neo4j 2.1, type conversion functions such as toInt and toFloat were added to take care of the conversion.
In 2.0.x you can use str(17) = str('17') in the other direction.
Neo4j itself is less strict on structural information, but more strict on values. I.e. the value you put into a property is returned exactly as stored, and you have to convert it to a different type yourself. Some of that stems from its Java history and was already loosened for Cypher.
I believe that internally Cypher / Gremlin translate statements into the corresponding Java method calls. Is there a way to trace which method calls are run?
For example, in Hibernate, we can specify "show sql" to see generated sql statement.
[Edit]
The reasons I want to do that:
1. For debugging purposes:
To find out why the Cypher / Gremlin doesn't produce the expected result.
2. For learning purposes:
To find out what's happening under the hood.
3. For optimization:
To find out where the bottleneck is.
For Cypher, adding that is planned for the coming months. And yes, the methods currently used under the hood are the Java Neo4j core API and the Traversal Framework. Would you mind adding a case that is causing you problems?
For Gremlin, do a .toString() at the end of your Gremlin expression to see which Pipes (http://pipes.tinkerpop.com) it ultimately compiles down to.