Unable to delete 15k+ nodes using neo4jclient - neo4j

I am writing a Windows console application in C# that is supposed to import 15k nodes from an XML file and build a graph database in Neo4j (2.0.0) using the neo4jclient.
At the very beginning of the app, I am trying to remove all nodes and relationships from the database so that at every run of the application the database is fresh and clean:
Console.WriteLine("Deleting nodes and relationships...");
graphClient.Cypher.OptionalMatch("n-[r]-()").Delete("r").ExecuteWithoutResults();
graphClient.Cypher.Match("n").Delete("n").ExecuteWithoutResults();
Console.WriteLine("...done!");
At the moment the database has about 16k nodes (with no relationships between them), which were created by a couple of previous runs of the application itself. When the second Delete statement above runs, this exception is thrown after 30 or so seconds:
Unhandled Exception: System.AggregateException: One or more errors occurred. ---> System.Threading.Tasks.TaskCanceledException: A task was canceled.
--- End of inner exception stack trace ---
at Neo4jClient.GraphClient.SendHttpRequest(HttpRequestMessage request, String commandDescription, HttpStatusCode[] expectedStatusCodes) in c:\TeamCity\buildAgent\work\5bae2aa9bce99f44\Neo4jClient\GraphClient.cs:line 138
at Neo4jClient.GraphClient.Neo4jClient.IRawGraphClient.ExecuteCypher(CypherQuery query) in c:\TeamCity\buildAgent\work\5bae2aa9bce99f44\Neo4jClient\GraphClient.cs:line 843
at Neo4jClient.Cypher.CypherFluentQuery.ExecuteWithoutResults() in c:\TeamCity\buildAgent\work\5bae2aa9bce99f44\Neo4jClient\Cypher\CypherFluentQuery.cs:line 322
at Xml2Cypher.Program.Main(String[] args) in c:\_PrivateProjects\KanjiDoc2Neo4J\Xml2Cypher\Program.cs:line 25
I also tried batching the delete statements using a Limit clause, but I get the same exception. I think the request is timing out, but even batching doesn't seem to solve the issue. Any ideas?
graphClient.Cypher.Match("n").With("n").Limit(100).Delete("n").ExecuteWithoutResults();
I tried running the following statement from the browser:
match (n:Characters) with n limit 100 delete n
but even there I get an "unknown error".

Just increase the HTTP timeout in Neo4jClient and it should work.
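For example, something along these lines. This is a minimal sketch, assuming your Neo4jClient version exposes the GraphClient(Uri, IHttpClient) constructor and the HttpClientWrapper class; check your version before relying on those names.
// Sketch only: HttpClientWrapper and the GraphClient(Uri, IHttpClient) overload are assumed
// to exist in your Neo4jClient version. The TaskCanceledException above is what an
// HttpClient timeout looks like, so give long-running DELETE requests more time.
var httpClient = new System.Net.Http.HttpClient { Timeout = TimeSpan.FromMinutes(10) };
var graphClient = new GraphClient(new Uri("http://localhost:7474/db/data"), new HttpClientWrapper(httpClient));
graphClient.Connect();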
The error you get in the browser is wrong; it should say: "node still has relationships".
The query for deletes is:
MATCH (n)
OPTIONAL MATCH (n)-[r]->()
DELETE n,r
If you have many rels to delete, you probably want to batch it. Unfortunately, the promising PERIODIC COMMIT was limited to LOAD CSV after 2.1-M01 :(
So you're back to batching it yourself (delete a block of 5k nodes and their rels):
MATCH (n)
WITH n LIMIT 5000
OPTIONAL MATCH (n)-[r]->()
DELETE n, r
RETURN count(*)
Repeat until it returns 0.
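From Neo4jClient you could drive that loop with something like this sketch. It reuses the graphClient and fluent calls already shown in the question; the 5000 batch size and returning count(n) are just one way to do it.
// Minimal batching sketch: delete up to 5000 nodes (plus any attached relationships)
// per request and repeat until nothing is left. Needs 'using System.Linq;' for Single().
long deleted;
do
{
    deleted = graphClient.Cypher
        .Match("(n)")
        .With("n")
        .Limit(5000)
        .OptionalMatch("(n)-[r]->()")
        .Delete("n, r")
        .Return(n => n.Count())   // RETURN count(n)
        .Results
        .Single();
    Console.WriteLine("Deleted a batch ({0} rows matched)", deleted);
} while (deleted > 0);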

You could also stop your Windows service, delete the graph.db folder, and restart the service. I have used this to completely "refresh" the database as it will create a new database. Something like this should work:
// Requires: using System; using System.IO; using System.ServiceProcess;
public static void Main(string[] args)
{
    RefreshDatabase();
}

private static void RefreshDatabase()
{
    ServiceController sc = new ServiceController("Neo4j Graph Database", "computername");
    Console.WriteLine("Stopping Neo4j Graph Database service...");
    if (sc.Status != ServiceControllerStatus.Stopped)
        sc.Stop();
    sc.WaitForStatus(ServiceControllerStatus.Stopped); // wait so the store files are released
    Console.WriteLine("Neo4j Graph Database service stopped.\n");

    Console.WriteLine("Deleting graph.db files and folders...\n");
    RecursiveDelete(@"C:\neo4j-community-2.0.1\data\graph.db");
    Console.WriteLine("Finished deleting graph.db folder contents.\n");

    Console.WriteLine("Starting Neo4j Graph Database service...\n");
    sc.Start();
    sc.WaitForStatus(ServiceControllerStatus.Running);
    Console.WriteLine("Neo4j Graph Database running.\n");
}

private static void RecursiveDelete(string path)
{
    DirectoryInfo di = new DirectoryInfo(path);
    foreach (FileInfo file in di.GetFiles())
    {
        file.Delete();
    }
    foreach (DirectoryInfo directory in di.GetDirectories())
    {
        directory.Delete(true);
    }
}

Related

JS Neo4jError: Cannot run query in this transaction, because it has been rolled back either because of an error or explicit termination

I fire a few hundred instances of the query below concurrently (I also tried running them synchronously) from the JS neo4j-driver 4.4.1. A few of the queries sometimes throw the following error in Node.js, but when my retry logic retries after some time, it works.
Query
MERGE (n0:Movie {movie_id: $movie_id})
WITH n0
CALL apoc.lock.nodes([n0])
CALL {
  WITH n0
  WITH n0 WHERE n0.updated_at IS NULL OR n0.updated_at < datetime($updated_at)
  MERGE (n:Movie {movie_id: $movie_id})
  ON CREATE SET n.movie_id = $movie_id
  SET n.name = $name
  SET n.downloads = $downloads
  SET n.updated_at = datetime($updated_at)
  RETURN count(*) AS cnt
}
RETURN n0, cnt
I run this query in separate transactions like below.
const session = driver.session();
await session.writeTransaction(async tx => {
  return await tx.run(QUERY, {args});
});
Log
Neo4jError: Cannot run query in this transaction, because it has been rolled back either because of an error or explicit termination.
I couldn't find any trace related to that query in neo4j logs.
Any help with this?

Query timeout in Neo4j 3.0.6

It looks like the previously working approach is deprecated now:
unsupported.dbms.executiontime_limit.enabled=true
unsupported.dbms.executiontime_limit.time=1s
According to the documentation, the new settings responsible for timeout handling are:
dbms.transaction.timeout
dbms.transaction_timeout
At the same time, the new settings appear to be related to transactions.
The new timeout settings do not seem to work. They were set in neo4j.conf as follows:
dbms.transaction_timeout=5s
dbms.transaction.timeout=5s
A slow Cypher query isn't terminated.
Then a Neo4j plugin was added to model a slow query with a transaction:
@Procedure("test.slowQuery")
public Stream<Res> slowQuery(@Name("delay") Number delay)
{
    // db is the injected GraphDatabaseService (@Context)
    ArrayList<Res> res = new ArrayList<>();
    try (Transaction tx = db.beginTx()) {
        Thread.sleep(delay.intValue(), 0);
        tx.success();
    } catch (Exception e) {
        System.out.println(e);
    }
    return res.stream();
}
The procedure served by the plugin is executed with the neoism Golang package, and the timeout isn't triggered there either.
The timeout is only honored if your procedure code either invokes operations on the graph, like reading nodes and rels, or explicitly checks whether the current transaction is marked as terminated.
For the latter, see https://github.com/neo4j-contrib/neo4j-apoc-procedures/blob/master/src/main/java/apoc/util/Utils.java#L41-L51 as an example.
According to the documentation, the transaction guard is interested in orphaned transactions only:
The server guards against orphaned transactions by using a timeout. If there are no requests for a given transaction within the timeout period, the server will roll it back. You can configure the timeout in the server configuration, by setting dbms.transaction_timeout to the number of seconds before timeout. The default timeout is 60 seconds.
I've not found a way to trigger the timeout for a query which isn't orphaned using native functionality.
@StefanArmbruster pointed in a good direction. The timeout-triggering functionality can be obtained by creating a wrapper function in a Neo4j plugin, like it is done in APOC.

Neo4j BatchInserter initializing Db on restart

I am using Neo4j BatchInserters to insert nodes in the db, and LuceneBatchInserterIndexProvider for indexes. I import the data from multiple files. If my process breaks, I want to be able to restart it from the next file. But whenever I restart the process, it creates a new db in the graph folder and new indexes. My initialization code looks like this:
Map<String, String> config = new HashMap<String, String>();
config.put("neostore.nodestore.db.mapped_memory", "2G");
config.put("batch_import.keep_db", "true");
BatchInserter db = BatchInserters.inserter("ttl.db", config);
BatchInserterIndexProvider indexProvider = new LuceneBatchInserterIndexProvider(db);
index = indexProvider.nodeIndex("ttlIndex", MapUtil.stringMap("type", "exact"));
index.setCacheCapacity(URI_PROPERTY, indexCache + 1);
Can somebody please help here?
To provide more details: I have multiple files (around 400) which I want to import into Neo4j.
I want to divide my process into batches, and after every batch I want to restart the process.
I used the batch inserter config batch_import.keep_db = "true". This does not clear the graph, but after a restart the indexer has lost its information. I have this method to check for node existence, and I am sure the node was created before the restart:
private Long getNode(String nodeUrl)
{
    IndexHits<Long> hits = index.get(URI_PROPERTY, nodeUrl);
    if (hits.hasNext()) { // node exists
        return hits.next();
    }
    return null;
}

Cannot access nodes created using java in neo4j database, neo4j-server.properties issues

I am able to create nodes and relationships through Java on a Neo4j database. When I try to access the created nodes in the next run I get this error:
Exception in thread "main" org.neo4j.graphdb.NotFoundException: Node 27 not found
In the webadmin interface the dashboard shows the number of nodes/relationships created through Java, but when I issue the query START n=node(*) RETURN n; I get only 1 node in the output.
(FYI: I have installed Neo4j on my Windows machine (local) and am using embedded-database Java code to create the nodes.)
The Java code I used to connect to the db:
final String dbpath = "C:\\neo4j-community-1.9.4\\data\\graph.db";
GraphDatabaseService graphdb = new GraphDatabaseFactory().newEmbeddedDatabase(dbpath);
The settings I have used in neo4j-server.properties are:
org.neo4j.server.database.location=/C:/neo4j-community-1.9.4/data/graph.db/
org.neo4j.server.webserver.https.keystore.location=data/keystore
org.neo4j.server.webadmin.rrdb.location=data/rrd
org.neo4j.server.webadmin.data.uri=/C:/neo4j-community-1.9.4/data/graph.db/
org.neo4j.server.webadmin.management.uri=/db/manage/
When I create a node through Java, the data/keystore file does not get populated; it only gets populated when creating a node through the webadmin interface. Changing the keystore file path to an absolute path also did not work.
Can anybody point out the mistake in this scenario? Thanks.
The problem was that the created nodes were not committed. To commit the nodes we have to call finish():
final String dbpath = "/C:/neo4j-community-1.9.4/data/graph.db/";
GraphDatabaseService graphdb = new GraphDatabaseFactory().newEmbeddedDatabase(dbpath);

Transaction tx = graphdb.beginTx();
try {
    Node n1 = graphdb.createNode();
    n1.setProperty("type", "company");
    n1.setProperty("location", "india");
    // ...
    tx.success();
} catch (Exception e) {
    tx.failure();
} finally {
    tx.finish(); // commits (or rolls back) the transaction
}
Ranjith's answer was correct until recently, but tx.finish() has now been deprecated.
tx.close() is now the correct way to commit or roll back the transaction - it will do one or the other depending on whether you've previously called tx.success().
They changed this so the transaction is auto-closeable in a try-with-resources block.
Have you tried:
String dbpath = "C:/neo4j-community-1.9.4/data/graph.db";

Getting NotInTransactionException while querying neo4j index

I am currently using Neo4j 1.8.1. I am getting a NotInTransactionException when I query the Neo4j index to get some nodes.
The following is a simple query which I am executing on Neo4j:
if (graphDb.index().existsForNodes("NODEINDEX")) {
    IndexHits<Node> hits = graphDb.index().forNodes(NODEINDEX).query(query);
}
The following is the stack trace for the exception:
"message" : "Error fetching transaction for current thread",
"exception" : "NotInTransactionException",
"stacktrace" : [ "org.neo4j.kernel.impl.index.IndexConnectionBroker.getCurrentTransaction(IndexConnectionBroker.java:134)", "org.neo4j.kernel.impl.index.IndexConnectionBroker.acquireReadOnlyResourceConnection(IndexConnectionBroker.java:84)", "org.neo4j.index.impl.lucene.LuceneIndex.getReadOnlyConnection(LuceneIndex.java:105)", "org.neo4j.index.impl.lucene.LuceneIndex.query(LuceneIndex.java:245)", "org.neo4j.index.impl.lucene.LuceneIndex.query(LuceneIndex.java:227)", "org.neo4j.index.impl.lucene.LuceneIndex.query(LuceneIndex.java:238)", "com.uprr.netcontrol.starmap.neo4j.plugins.aggregate_node_status.NodeStatusHelper.getGraphNodes(NodeStatusHelper.java:39)",
I found the following in the Neo4j API:
private Transaction getCurrentTransaction() throws NotInTransactionException
{
    try
    {
        return transactionManager.getTransaction();
    }
    catch ( SystemException se )
    {
        throw new NotInTransactionException(
                "Error fetching transaction for current thread", se );
    }
}
Do we need to explicitly start a transaction to query the Neo4j index?
Any thoughts?
Thanks
Here's a theory: I don't know if this is only an issue with the code pasted here, but the check:
if (graphDb.index().existsForNodes("NODEINDEX"))
checks for the index named "NODEINDEX"; however, the actual query
graphDb.index().forNodes(NODEINDEX).query(query);
checks for the index named whatever is in the constant NODEINDEX. Those two are probably not the same and so it tries to create that index for you and fails due to not being in a transaction.
If there isn't an existing appropriate index, I think forNodes() will create one before returning it; this operation needs to be wrapped in a transaction.
