Neo4j JSON download import

I'm just getting into Neo4j and I spent a couple hours mapping out some nodes and relationships.
I downloaded the JSON and I'm trying to move the nodes to another computer. It seems like it should be a pretty simple query, but everything I'm finding about batch import is for CSVs and a bit more involved.
Is there just a simple Cypher query to import the JSON downloaded from the local Neo4j server?

Moving a full graph db to another box is most simply done by copying over the data/graph.db directory.
Alternatively you can use neo4j-shell's dump command.
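If you do want to load a JSON export through Cypher rather than copying the store, the APOC plugin (added in Neo4j versions newer than this thread) provides `apoc.load.json`. A sketch, assuming APOC is installed and the export is shaped as an object with a `nodes` array, each entry carrying `labels` and a `properties` map (the file name and shape are illustrative):

```cypher
// Sketch only: assumes the APOC plugin is installed and export.json
// looks like {"nodes": [{"labels": ["User"], "properties": {...}}, ...]}.
CALL apoc.load.json("file:///export.json") YIELD value
UNWIND value.nodes AS n
CALL apoc.create.node(n.labels, n.properties) YIELD node
RETURN count(node);
```

Relationships would need a second pass keyed on whatever identifier the export carries.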

Related

Neo4j rebuilding database

We would like to import every day or so, tens of millions of nodes with relationships.
neo4j-admin import seems to be very slow (we removed all constraints and indexes beforehand).
What is the best practice to load a huge amount of data into neo?
Every other day, when we import, we would like to load into a different database and switch between the current database and the newly built one (similar to index aliases in Elasticsearch).
How can this be done?
Thank you
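Recent Neo4j versions (4.4+, Enterprise) support database aliases natively, which gives exactly the Elasticsearch-style switch described above. A hedged sketch, run against the `system` database; the database and alias names are illustrative:

```cypher
// Run against the `system` database (Neo4j 4.4+ Enterprise).
// Build each import into a fresh database, then repoint a stable alias.
CREATE DATABASE imports20240102;
// ... bulk load into imports20240102 (e.g. with neo4j-admin import) ...
CREATE ALIAS graph IF NOT EXISTS FOR DATABASE imports20240102;
// On the next import cycle, atomically switch the alias to the new build:
ALTER ALIAS graph SET DATABASE TARGET imports20240103;
```

Applications connect to `graph` and never see the rebuild in progress; the old database can be dropped once the switch is verified.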

Neo4j: Initial and Delta data load from SQL Database

Currently, I am evaluating Neo4j for some data analytics, where data from different databases will be pushed to the Neo4j database periodically. These pushes can include additions, modifications, and deletions.
Since the way data is stored in a graph DB naturally differs from how it is stored in the origin SQL DB, we are trying to figure out how to handle the add, modify, and delete scenarios.
Are there any standard Rules/Ways?
Are there any tools for sync? (Other than the CSV or similar imports)
Thanks in advance
There is no ready-to-use solution for your case; you will have to build it yourself.
#1 - Manual imports
You can prepare data and manually execute the import (using standard tools). Then, when new data appears, prepare it and import it into the existing database.
The import tool and Cypher's LOAD CSV can be used here.
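For this option, LOAD CSV combined with MERGE handles both the initial load and later additions/modifications idempotently. A minimal sketch, assuming a `persons.csv` exported from the SQL side (the file name and properties are illustrative):

```cypher
// Assumes persons.csv with header row: id,name
// MERGE makes re-running the import safe: existing nodes are
// matched on id and updated; missing ones are created.
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM "file:///persons.csv" AS row
MERGE (p:Person {id: row.id})
SET p.name = row.name;
```

Deletes are the awkward case: they usually need a separate list of removed ids, or a full diff against the previous export.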
#2 - Unmanaged extension
You can develop an unmanaged extension capable of persisting data from your database into Neo4j. In this case some sort of sync needs to be implemented on both the client and server side.
More information can be found here.
#3 - neo4j-csv-firehose
There is an extension developed by @sarmbruster - neo4j-csv-firehose.
neo4j-csv-firehose enables Neo4j's LOAD CSV Cypher command to load data from other datasources as well. It provides a Neo4j unmanaged extension doing on-the-fly conversion of the other datasource to CSV, and can therefore act as input for LOAD CSV. Alternatively it can be run as a standalone server.
Check README for more information.
#4 - neo4j-shell-tools
This is another project, developed by @jexp - neo4j-shell-tools.
neo4j-shell-tools adds a number of commands to neo4j-shell which make it easy to import and export data into a running Neo4j database.
Check README for more information.
#5 - Liquigraph
Another interesting tool is Liquigraph.
It is a database migrations management tool, based on how Liquibase works.
With this tool you can write migrations for a Neo4j database in XML.
Also, check the other existing Neo4j tools - maybe something there works for you.
Not really sure what you're asking for.
Usually you have an import script that imports into the graph model.
This can be Cypher or Java code, driven by CSV, JSON, or whatever your datasource is (provided as a parameter).
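For example, a JSON datasource can drive a single parameterized Cypher statement, with the driver passing the parsed rows as a parameter; a sketch with illustrative label and property names:

```cypher
// $rows is a parameter supplied by the driver, e.g. a parsed JSON array
// like [{id: 1, name: "a"}, {id: 2, name: "b"}].
// MERGE keyed on id makes the statement safe to re-run on each sync.
UNWIND $rows AS row
MERGE (n:Item {id: row.id})
SET n.name = row.name;
```

The same statement then works unchanged whether the rows originally came from CSV, JSON, or a SQL query.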

neo4j broken/corrupted after ungraceful shutdown

I'm using Neo4j on Windows for testing purposes, working with a db containing ~2 million relationships and about the same number of nodes. After an ungraceful shutdown of Neo4j while writing a batch of relationships, the db got corrupted.
it seems like there are some broken nodes/relations in the db and whenever I try to read them I get this error (I'm using py2neo):
Error: NodeImpl#1292315 not found. This can be because someone else deleted this entity while we were trying to read properties from it, or because of concurrent modification of other properties on this entity. The problem should be temporary.
I tried rebooting but neo4j fails to recover from this error. I found this question:
Neo4j cannot read certain nodes. Throws NotFoundException. Corrupt database
but the answer there is no good for me because it involves going over the db and redoing the indexing, and I can't even read those broken nodes/relationships, so I can't fix their index (I tried and got the same error).
In general I've had many stability issues with Neo4j (on multiple platforms, not just Windows). If no decent solution is found for this problem I will have to switch to a different database.
thanks in advance!
I wrote a tool a while ago that allows you to copy a broken store and keeps the good records intact.
You might want to check it out. I assume you used the 2.1.x version of Neo4j.
https://github.com/jexp/store-utils/tree/21
For 2.0.x check out:
https://github.com/jexp/store-utils/tree/20
To verify if your datastore is consistent follow the steps mentioned in http://www.markhneedham.com/blog/2014/01/22/neo4j-backup-store-copy-and-consistency-check/.
Are you referring to the batch inserter API when you speak of "writing a batch of relations"?
If so, be aware that batch inserter API requires a clean shutdown, see the big fat red warning on http://docs.neo4j.org/chunked/stable/batchinsert.html.
Are the broken nodes schema indexed and are you attempting to read them via this indexed label/property? If so, it's possible you may have a broken index following the sudden shutdown.
Assuming this is the case, you could try deleting the schema subdirectory within the graph store directory while the server is not running and let the database rebuild the index on restart. While this isn't an official way to recover from a broken index, it can sometimes work. Obviously, I suggest you back up your store before trying this.

Batch-imported data not available in Neo4j until server restart

I'm using batch-import to load a very small graph (5 nodes; 3 rels) into Neo4j 1.9.1. The import reported success, yet the data is not available through webadmin or REST queries until I restart the Neo4j server. Very strange. Can somebody enlighten me?
You should not have the server running while you do the batch import, since both systems access the same files, and having the server running might corrupt your DB. First do the batch import, then start your server to see the correct numbers.

Neo4J Subgraphs or Multiple Databases

I have my Neo4J (embedded) database setup like this:
I attach several user nodes to the reference node.
To each user node can be attached one or more project nodes.
To each project node is a complex graph attached.
The complex graph is traversable with a single traverse pattern (there is a hidden tree structure in them).
What I'd like to do is the following:
Remove all the nodes below a project node.
Remove all the project nodes below a user, when there's nothing below the project nodes
Export all nodes below a specific user node to .graphML (probably using Gremlin Java API?)
Import a .graphML file back to the database below a specific user node without removing the information that is located under different user nodes.
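The two delete steps above can be sketched in Cypher (using DETACH DELETE, which arrived in Neo4j versions newer than the embedded setup described here; the label, relationship type, and `$projectId` parameter are illustrative):

```cypher
// Remove everything below a given project node (but not the project itself).
// Assumes the subtree hangs off the project via outgoing relationships.
MATCH (p:Project {id: $projectId})-[*1..]->(child)
DETACH DELETE child;

// Then remove project nodes that have nothing below them anymore.
MATCH (u:User)-[:HAS_PROJECT]->(p:Project)
WHERE NOT (p)-->()
DETACH DELETE p;
```

In the Neo4j 1.x era the same effect would need an explicit traversal deleting relationships before nodes.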
I've already worked with the Gremlin GraphML reader for importing and exporting entire Neo4J databases but I was not able to find something about importing/exporting subgraphs.
If this is in fact possible, how would Neo4j handle two users trying to import something at the same time? For instance, user 1 imports his section under the user1 node, while user 2 imports his data under the user2 node simultaneously.
The other possibility is to have a Neo4j database per user, but this is the less preferable option really, and I'm very unsure whether it's actually possible, be it embedded or server version. I've read something about running multiple server instances on different ports, but our number of users is by definition unlimited...
Any help would be greatly appreciated.
EDIT 1: I've also come across something called Geoff (org.neo4j.geoff) which deals with subgraphs. I'm absolutely clueless as to how this works but I'm looking into it right now.
You might take a lock on the user node when starting the import, so that the second import would have to wait (and would have to check).
With cypher queries you can delete the subgraphs and also export them to cypher again. There is export code for query-results in the Neo4j Console repository.
There you can also find geoff-export and import and also cypher importers.
One option may be to use something like Tinkerpop blueprints to create a generic Graph when traversing, then doing a GraphML export.
https://github.com/tinkerpop/blueprints/wiki/GraphML-Reader-and-Writer-Library would have more information, but if you are looking to export subgraphs, this is probably your best option.
