I am using Neo4j BatchInserters to insert nodes into the database, with LuceneBatchInserterIndexProvider for the indexes. I am importing data from multiple files. If the process breaks, I want to be able to restart it from the next file. But whenever I restart the process, it creates a new database in the graph folder and new indexes. My initialization code looks like this:
Map<String, String> config = new HashMap<String, String>();
config.put("neostore.nodestore.db.mapped_memory", "2G");
config.put("batch_import.keep_db", "true");
BatchInserter db = BatchInserters.inserter("ttl.db", config);
BatchInserterIndexProvider indexProvider = new LuceneBatchInserterIndexProvider(
db);
index = indexProvider.nodeIndex("ttlIndex",
MapUtil.stringMap("type", "exact"));
index.setCacheCapacity(URI_PROPERTY, indexCache + 1);
Can somebody please help here?
To provide more details: I have multiple files (around 400) that I want to import into Neo4j.
I want to divide the process into batches and restart the process after every batch.
I set the Neo4j batch inserter config batch_import.keep_db = "true". This keeps the graph from being cleared, but after a restart the index has lost its information. I use this method to check whether a node exists. I am sure the node was created before the restart.
private Long getNode(String nodeUrl)
{
IndexHits<Long> hits = index.get(URI_PROPERTY, nodeUrl);
if (hits.hasNext()) { // node exists
return hits.next();
}
return null;
}
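The "restart from the next file" part of the question can be sketched independently of Neo4j: keep a small progress file that lists every input file that has been imported completely, and skip those on the next run. The class name ImportCheckpoint and the progress-file approach are assumptions for illustration, not part of the batch inserter API, and the sketch does not address the lost index itself.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper (not part of Neo4j): remembers which input files are fully imported. */
class ImportCheckpoint {
    private final Path logFile;
    private final Set<String> done = new HashSet<>();

    ImportCheckpoint(Path logFile) {
        this.logFile = logFile;
        try {
            // Reload the names recorded by earlier runs, if any.
            if (Files.exists(logFile)) {
                done.addAll(Files.readAllLines(logFile, StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the files that still need importing, in the given order. */
    List<String> pending(List<String> allFiles) {
        List<String> result = new ArrayList<>();
        for (String file : allFiles) {
            if (!done.contains(file)) {
                result.add(file);
            }
        }
        return result;
    }

    /** Call only after a file has been imported completely. */
    void markDone(String file) {
        try {
            // Append one line per finished file so the record survives a crash.
            Files.write(logFile, Collections.singletonList(file), StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            done.add(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In the import loop, one would iterate over pending(allFiles), import each file, and call markDone(file) only once that file's data has been written.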
I am using Neo4j in embedded mode. For some operations on the database on the server, I am trying to execute a Groovy script. The script runs successfully without any error, but it does not create any new record when I check with the neo4j-community tool.
Script
/**
* Created by prabjot on 7/1/17.
*/
@Grab(group="org.neo4j", module="neo4j-kernel", version="2.3.6")
@Grab(group="org.neo4j", module="neo4j-lucene-index", version="2.3.6")
@Grab(group='org.neo4j', module='neo4j-shell', version='2.3.6')
@Grab(group='org.neo4j', module='neo4j-cypher', version='2.3.6')
import org.neo4j.graphdb.factory.GraphDatabaseFactory
import org.neo4j.graphdb.Node
import org.neo4j.graphdb.Result
import org.neo4j.graphdb.Transaction
class Neo4jEmbeddedAccess {
public static void main(String[] args) {
def map=[:]
map.put("allow_store_upgrade","true")
map.put("remote_shell_enabled","true")
def db = new GraphDatabaseFactory().newEmbeddedDatabaseBuilder("/opt/neo4j-community-3.0.4/data/databases/graph.db")
.setConfig(map)
.newGraphDatabase()
Transaction tx =db.beginTx()
Node person = db.createNode();
person.setProperty("name","prabjot")
print("id---->" + person.id);
Result result = db.execute("Match (country:Country) where id(country)=73 SET country.modified=true return country")
print(result)
tx.success();
println """starting embedded graph db
use bin/neo4j-shell from a new distribution to connect
we're keeping the graphdb open for 120 secs"""
db.shutdown()
}
Please help: what am I doing wrong here? I have checked my db location, and it is the same one I am using in the script and in the tool.
Thanks
You forgot tx.close(), which commits the transaction.
success() only marks it as successful.
It looks like the previously working approach is deprecated now:
unsupported.dbms.executiontime_limit.enabled=true
unsupported.dbms.executiontime_limit.time=1s
According to the documentation, new variables are responsible for timeout handling:
dbms.transaction.timeout
dbms.transaction_timeout
However, the new variables appear to relate to transactions rather than to queries.
They also do not seem to work. They were set in neo4j.conf as follows:
dbms.transaction_timeout=5s
dbms.transaction.timeout=5s
A slow Cypher query isn't terminated.
Then a Neo4j plugin was added to model a slow query with a transaction:
@Procedure("test.slowQuery")
public Stream<Res> slowQuery(@Name("delay") Number Delay )
{
ArrayList<Res> res = new ArrayList<>();
try ( Transaction tx = db.beginTx() ){
Thread.sleep(Delay.intValue(), 0);
tx.success();
} catch (Exception e) {
System.out.println(e);
}
return res.stream();
}
The function served by the plugin is executed with the neoism Golang package, and the timeout isn't triggered there either.
The timeout is only honored if your procedure code either invokes operations on the graph, such as reading nodes and relationships, or explicitly checks whether the current transaction is marked as terminated.
For the latter, see https://github.com/neo4j-contrib/neo4j-apoc-procedures/blob/master/src/main/java/apoc/util/Utils.java#L41-L51 as an example.
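The idea of checking for termination at intervals, instead of blocking in one long sleep, can be sketched without any Neo4j types. In this minimal sketch the BooleanSupplier named terminated is a hypothetical stand-in for whatever "is this transaction terminated?" check the procedure has available; the class and method names are assumptions as well.

```java
import java.util.function.BooleanSupplier;

/** Stdlib-only sketch; `terminated` stands in for a transaction-termination check. */
class TerminationAwareSleep {
    /**
     * Waits up to totalMillis, polling `terminated` before every slice of sliceMillis.
     * Returns true if the full delay elapsed, false if the wait stopped early.
     */
    static boolean sleep(long totalMillis, long sliceMillis, BooleanSupplier terminated) {
        if (sliceMillis <= 0) {
            throw new IllegalArgumentException("sliceMillis must be positive");
        }
        long remaining = totalMillis;
        while (remaining > 0) {
            if (terminated.getAsBoolean()) {
                return false; // the caller should abort its work here
            }
            long slice = Math.min(sliceMillis, remaining);
            try {
                Thread.sleep(slice);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
            remaining -= slice;
        }
        return true;
    }
}
```

A procedure modelled this way would stop within roughly one slice of the termination flag being set, rather than only after the whole delay.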
According to the documentation, the transaction guard is concerned with orphaned transactions only.
The server guards against orphaned transactions by using a timeout. If there are no requests for a given transaction within the timeout period, the server will roll it back. You can configure the timeout in the server configuration, by setting dbms.transaction_timeout to the number of seconds before timeout. The default timeout is 60 seconds.
I have not found a way to trigger a timeout for a query that isn't orphaned using native functionality.
@StefanArmbruster pointed in a good direction: the timeout-triggering functionality can be obtained by creating a wrapper function in a Neo4j plugin, as is done in apoc.
Is it possible to import data on Neo4J using the automatic indexing feature? I'm trying to import data using BatchInserter and BatchInserterIndex like the following example:
BatchInserter inserter = BatchInserters.inserter("/home/fmagalhaes/Neo4JDatabase");
BatchInserterIndexProvider indexProvider = new LuceneBatchInserterIndexProvider(inserter);
BatchInserterIndex nodeIndex = indexProvider.nodeIndex("node_auto_index", MapUtil.stringMap("type","exact"));
BatchInserterIndex relIndex = indexProvider.relationshipIndex("relationship_auto_index", MapUtil.stringMap("type","exact"));
...
inserter.createNode(vertexId, properties);
nodeIndex.add(vertexId, properties);
...
The problem is that when batch processing is completed, I try to open this database with the Blueprints generic API by doing the following:
Graph g = new Neo4jGraph("/home/fmagalhaes/Neo4JDatabase");
Set<String> nodeIndices = ((KeyIndexableGraph)g).getIndexedKeys(Vertex.class);
Set<String> relIndices = ((KeyIndexableGraph)g).getIndexedKeys(Edge.class);
and both nodeIndices and relIndices are empty. The auto-indexing feature is disabled when I open the graph database through the Blueprints API. Is it possible to create an automatic index during batch processing such that this index will be visible (and will continue to index data automatically as properties are added to vertices and edges) when I open the database with the Blueprints API?
You have to cleanly shut down both the batch index and the batch inserter.
You probably don't want to index all properties, just the key ones that you use to look up nodes.
You have to enable auto-indexing in the Neo4j config for the database you start afterwards, and for the same properties that you indexed during batch insertion.
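As a rough illustration of the last point, the settings could be collected in a map and handed to whatever opens the database afterwards. The setting names below are the ones I recall from Neo4j 1.x-era configuration and should be checked against the version in use; the class name AutoIndexConfig and the property keys passed in are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that assembles auto-indexing settings as plain strings. */
class AutoIndexConfig {
    /**
     * Builds settings that enable auto-indexing for the given comma-separated
     * property keys. Setting names are assumed from Neo4j 1.x; verify them
     * against your version's documentation.
     */
    static Map<String, String> forKeys(String nodeKeys, String relKeys) {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("node_auto_indexing", "true");
        config.put("node_keys_indexable", nodeKeys);
        config.put("relationship_auto_indexing", "true");
        config.put("relationship_keys_indexable", relKeys);
        return config;
    }
}
```

The keys passed in should be the same ones that were added to node_auto_index and relationship_auto_index during the batch insertion.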
I am able to create nodes and relationships through Java on a Neo4j database. When I try to access the created nodes in the next run I get this error:
Exception in thread "main" org.neo4j.graphdb.NotFoundException: Node 27 not found
In the webadmin interface the dashboard shows the number of nodes/relationships created through Java, but when I issue this query: START n=node(*) RETURN n; I get only 1 node in the output.
(FYI, I have installed Neo4j on my local Windows machine and am using embedded database Java code to create nodes.)
Java code I used to connect to db:
final String dbpath = "C:\\neo4j-community-1.9.4\\data\\graph.db";
GraphDatabaseService graphdb = new GraphDatabaseFactory().newEmbeddedDatabase(dbpath);
The settings I have used in neo4j-server.properties are:
org.neo4j.server.database.location=/C:/neo4j-community-1.9.4/data/graph.db/
org.neo4j.server.webserver.https.keystore.location=data/keystore
org.neo4j.server.webadmin.rrdb.location=data/rrd
org.neo4j.server.webadmin.data.uri=/C:/neo4j-community-1.9.4/data/graph.db/
org.neo4j.server.webadmin.management.uri=/db/manage/
When I create a node through Java, the data/keystore file does not get populated; it only gets populated when I create a node through the webadmin interface. Changing the keystore path to an absolute path did not work either.
Can anybody point out the mistake in this scenario? Thanks.
The problem was that the created nodes were not committed. To commit them, you have to call finish():
final String dbpath = "/C:/neo4j-community-1.9.4/data/graph.db/";
GraphDatabaseService graphdb = new GraphDatabaseFactory().newEmbeddedDatabase(dbpath);
Transaction tx = graphdb.beginTx();
try {
    Node n1 = graphdb.createNode();
    n1.setProperty("type", "company");
    n1.setProperty("location", "india");
    ....
    ...
    tx.success();   // mark as successful only if nothing above threw
} catch (Exception e) {
    tx.failure();   // mark the transaction for rollback
} finally {
    tx.finish();    // commits or rolls back, depending on the mark
}
Ranjith's answer was correct until recently, but tx.finish() has now been deprecated.
tx.close(); is now the correct way to commit or rollback the transaction - it will do one or the other depending on whether you've previously called tx.success().
They changed this so that the transaction is AutoCloseable and can be used in a try-with-resources block.
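The commit-or-rollback-on-close contract described here can be illustrated with a toy class. ToyTransaction is a made-up stand-in that only mimics the success()/close() behaviour; it is not Neo4j's Transaction and touches no database.

```java
/** Toy stand-in for the success()/close() contract; NOT Neo4j's Transaction. */
class ToyTransaction implements AutoCloseable {
    private boolean markedSuccessful = false;
    private String outcome = "open";

    /** Only marks the transaction; nothing is committed yet. */
    void success() {
        markedSuccessful = true;
    }

    /** Commits if success() was called earlier, otherwise rolls back. */
    @Override
    public void close() {
        outcome = markedSuccessful ? "committed" : "rolled back";
    }

    String outcome() {
        return outcome;
    }
}

class ToyTransactionDemo {
    /** Runs one unit of work and reports how the transaction ended. */
    static String run(boolean failMidway) {
        ToyTransaction seen = null;
        try (ToyTransaction tx = new ToyTransaction()) {
            seen = tx;
            if (failMidway) {
                throw new IllegalStateException("simulated failure");
            }
            tx.success(); // reached only if the work above did not throw
        } catch (IllegalStateException e) {
            // close() has already run by the time we get here
        }
        return seen.outcome();
    }
}
```

With the real API the shape is the same: open the transaction in the try header, do the work, call tx.success() as the last statement, and let the block close it.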
Have you tried:
String dbpath = "C:/neo4j-community-1.9.4/data/graph.db";
I created the application using the instructions here:
http://blog.armbruster-it.de/2009/10/example-neo4j-with-grails/
I then added to DataSource.groovy this:
grails {
neo4j {
type = "embedded"
location = "/usr/local/Cellar/neo4j/"
params = []
}
}
Where my graph.db is located at /usr/local/Cellar/neo4j/community-1.8.1-unix/libexec/data/graph.db
What should go into this location? I am adding new nodes, but when I run
start n=node(*) return n;
in the shell there is no new data. Thanks!
I believe your location should point all the way to
/usr/local/Cellar/neo4j/community-1.8.1-unix/libexec/data/graph.db
since that is where your Neo4j database is located.