I am using Jena (version 3.10.0) with Fuseki (version 3.10.0) to test some CONSTRUCT queries, but it hangs after running 6 queries. My code is below. I am not sure whether this is a bug in Jena or I am doing something wrong. SELECT queries work fine. I would really appreciate any help.
@Test
public void testRun() {
    for (int i = 0; i < 10; i++) {
        System.out.println(" ..... " + i);
        String query = "CONSTRUCT {?S ?P ?O} WHERE {?S ?P ?O}";
        try (RDFConnectionFuseki connectFuseki = RDFConnectionFactory.connectFuseki("http://localhost:3030/test")) {
            System.out.println("Got connection!");
            org.apache.jena.rdf.model.Model model = connectFuseki.queryConstruct(query);
            System.out.println("Executed query!");
            model.write(System.out, "TURTLE");
        }
    }
}
Console output
..... 0
Got connection!
Executed query!
..... 1
Got connection!
Executed query!
..... 2
Got connection!
Executed query!
..... 3
Got connection!
Executed query!
..... 4
Got connection!
Executed query!
..... 5
Got connection!
In case someone else hits this issue, I am adding the answer. As explained in the comments, this is caused by a bug that has already been fixed in a commit, so the next release should resolve the problem. If you are stuck, you can use the snapshot repository while waiting for the release.
I fire a few hundred instances of the query below concurrently (I also tried synchronously) from the JS neo4j-driver 4.4.1. A few of the queries sometimes throw the following error in Node.js. When my retry logic retries after some time, it works.
Query
MERGE (n0:Movie {movie_id: $movie_id})
WITH n0
CALL apoc.lock.nodes([n0])
CALL {
WITH n0
WITH n0 WHERE n0.updated_at IS NULL OR n0.updated_at < datetime($updated_at)
MERGE (n:Movie {movie_id: $movie_id})
ON CREATE SET n.movie_id = $movie_id
SET n.name = $name
SET n.downloads = $downloads
SET n.updated_at = datetime($updated_at)
RETURN count(*) AS cnt
}
RETURN n0, cnt
I run this query in separate transactions like below.
const session = driver.session();
await session.writeTransaction(async tx => {
return await tx.run(QUERY, {args});
});
Log
Neo4jError: Cannot run query in this transaction, because it has been rolled back either because of an error or explicit termination.
I couldn't find any trace related to that query in neo4j logs.
Any help with this?
I have a Grails application with a Quartz job running in it. The job contains code similar to the following.
class MyJob {
    static triggers = {}

    def printLog(msg) {
        String threadId = Thread.currentThread().getId()
        String threadName = Thread.currentThread().getName()
        log.info(threadId + " - " + threadName + " : " + msg)
    }

    def execute(context) {
        printLog("Before Sync")
        synchronized (MyJob) {
            printLog("Inside Sync")
            try {
                printLog("Before sleep 20 minutes")
                Thread.sleep(1200000)
                printLog("After sleep")
            } catch (Exception e) {
                log.error("Error while sleeping")
            }
        }
        printLog("After Sync")
    }
}
I have scheduled it to trigger a job every minute.
I can see in the logs that one thread enters the synchronized block and then the other jobs start piling up, waiting for that thread to finish. This is working as expected.
The issue is that the jobs stop after 10 minutes, by which time 10 threads have been created. One of them is sleeping for 20 minutes and the other 9 are waiting for the first thread to release the lock. Why are no new jobs created?
I saw in some answers that I can fix the issue by modifying my triggers section like below:
static triggers = {
simple repeatInterval: 100
}
I tried the above option and it still shows only 10 jobs.
Where is it taking the default value of 10 from?
How can I modify the value so that jobs keep being created indefinitely?
I am new to Grails and Quartz, so I have no idea what is happening.
I think the Grails plugin sets the threadCount to 10 in the bundled quartz.properties file. Assuming you're using Grails 3, you can override it in application.yml like this:
quartz:
    threadPool:
        threadCount: 25
Grails 2 - application.groovy
quartz {
    props {
        threadPool.threadCount = 100
    }
}
In general, it's not a good idea to block the job thread with sleeps.
If you have a job running a long process, you should split it into several jobs so that the thread is released as soon as possible.
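To make the limit concrete, here is a minimal sketch in plain Java. It uses a java.util.concurrent fixed thread pool as a stand-in for Quartz's worker pool, which is an assumption for illustration only: Quartz has its own thread pool implementation and treats triggers that find no free thread differently from an executor queue. The class name PoolExhaustionDemo and the pool size of 3 are made up. The sketch only shows the general effect: once every worker thread is blocked, nothing new starts, no matter how much work is scheduled.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolExhaustionDemo {
    // Hypothetical pool size, standing in for the threadCount setting.
    static final int THREAD_COUNT = 3;

    /**
     * Submits `submitted` blocking tasks (assumed >= THREAD_COUNT) and
     * returns how many of them had started while all workers were blocked.
     */
    static int startedWhileBlocked(int submitted) {
        ExecutorService pool = Executors.newFixedThreadPool(THREAD_COUNT);
        AtomicInteger started = new AtomicInteger();
        CountDownLatch allWorkersBusy = new CountDownLatch(THREAD_COUNT);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < submitted; i++) {
                pool.execute(() -> {
                    started.incrementAndGet();
                    allWorkersBusy.countDown();
                    try {
                        // Stands in for waiting on the synchronized block.
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // Wait until every worker thread is occupied by a blocked task.
            allWorkersBusy.await(5, TimeUnit.SECONDS);
            int seen = started.get();
            // Unblock everything so the pool can drain and shut down.
            release.countDown();
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
            return seen;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Eight tasks are submitted, but only THREAD_COUNT of them can start.
        System.out.println("started while blocked: " + startedWhileBlocked(8));
    }
}
```

Raising the thread count only moves the ceiling. If each job holds a thread for 20 minutes and a new one fires every minute, any fixed pool fills up eventually, which is why splitting the work matters more than the setting.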
I am currently using Neo4j 1.8.1. I am getting a NotInTransactionException when I query the Neo4j index to get some nodes.
The following is a simple query that I am executing on Neo4j:
if (graphDb.index().existsForNodes("NODEINDEX")) {
IndexHits<Node> hits = graphDb.index().forNodes(NODEINDEX).query(query);
}
The following is stacktrace for the exception.
"message" : "Error fetching transaction for current thread",
"exception" : "NotInTransactionException",
"stacktrace" : [ "org.neo4j.kernel.impl.index.IndexConnectionBroker.getCurrentTransaction(IndexConnectionBroker.java:134)", "org.neo4j.kernel.impl.index.IndexConnectionBroker.acquireReadOnlyResourceConnection(IndexConnectionBroker.java:84)", "org.neo4j.index.impl.lucene.LuceneIndex.getReadOnlyConnection(LuceneIndex.java:105)", "org.neo4j.index.impl.lucene.LuceneIndex.query(LuceneIndex.java:245)", "org.neo4j.index.impl.lucene.LuceneIndex.query(LuceneIndex.java:227)", "org.neo4j.index.impl.lucene.LuceneIndex.query(LuceneIndex.java:238)", "com.uprr.netcontrol.starmap.neo4j.plugins.aggregate_node_status.NodeStatusHelper.getGraphNodes(NodeStatusHelper.java:39)",
I found the following in Neo4j api.
private Transaction getCurrentTransaction() throws NotInTransactionException
{
try
{
return transactionManager.getTransaction();
}
catch ( SystemException se )
{
throw new NotInTransactionException(
"Error fetching transaction for current thread", se );
}
}
Do we need to explicitly start a transaction for querying neo4j index?
Any thoughts?
Thanks
Here's a theory: I don't know if this is only an issue with the code pasted here but the check:
if (graphDb.index().existsForNodes("NODEINDEX"))
checks for the index named "NODEINDEX", however the actual query
graphDb.index().forNodes(NODEINDEX).query(query);
checks for the index named whatever is in the constant NODEINDEX. Those two are probably not the same and so it tries to create that index for you and fails due to not being in a transaction.
If there isn't an existing appropriate index, I think it'll create one before returning it; this operation needs to be wrapped in a transaction.
We have a Grails project that runs behind a load balancer. There are three instances of the Grails application running on the server (using separate Tomcat instances). Each instance has its own searchable index. Because the indexes are separate, the automatic update is not enough to keep the index consistent between the application instances. Because of this, we have disabled the searchable index mirroring, and updates to the index are done manually in a scheduled Quartz job. According to our understanding, no other part of the application should modify the index.
The Quartz job runs once a minute. It checks the database for rows that have been updated by the application and re-indexes those objects. The job also checks whether the same job is already running, so it doesn't do any concurrent indexing. The application runs fine for a few hours after startup, and then suddenly, when the job is starting, a LockObtainFailedException is thrown:
22.10.2012 11:20:40 [xxxx.ReindexJob] ERROR Could not update searchable index, class org.compass.core.engine.SearchEngineException:
Failed to open writer for sub index [product]; nested exception is
org.apache.lucene.store.LockObtainFailedException: Lock obtain timed
out:
SimpleFSLock#/home/xxx/tomcat/searchable-index/index/product/lucene-a7bbc72a49512284f5ac54f5d7d32849-write.lock
According to the log the last time the job was executed, re-indexing was done without any errors and the job finished successfully. Still, this time the re-index operation throws the locking exception, as if the previous operation was unfinished and the lock had not been released. The lock will not be released until the application is restarted.
We tried to solve the problem by manually opening the locked index, which causes the following error to be printed to the log:
22.10.2012 11:21:30 [manager.IndexWritersManager ] ERROR Illegal state, marking an index writer as open, while another is marked as
open for sub index [product]
After this the job seems to be working correctly and doesn’t become stuck in a locked state again. However this causes the application to constantly use 100 % of the CPU resource. Below is a shortened version of the quartz job code.
Any help would be appreciated to solve the problem, thanks in advance.
class ReindexJob {
def compass
...
static Calendar lastIndexed
static triggers = {
// Every day every minute (at xx:xx:30), start delay 2 min
// cronExpression: "s m h D M W [Y]"
cron name: "ReindexTrigger", cronExpression: "30 * * * * ?", startDelay: 120000
}
def execute() {
if (ConcurrencyHelper.isLocked(ConcurrencyHelper.Locks.LUCENE_INDEX)) {
log.error("Search index has been locked, not doing anything.")
return
}
try {
boolean acquiredLock = ConcurrencyHelper.lock(ConcurrencyHelper.Locks.LUCENE_INDEX, "ReindexJob")
if (!acquiredLock) {
log.warn("Could not lock search index, not doing anything.")
return
}
Calendar reindexDate = lastIndexed
Calendar newReindexDate = Calendar.instance
if (!reindexDate) {
reindexDate = Calendar.instance
reindexDate.add(Calendar.MINUTE, -3)
lastIndexed = reindexDate
}
log.debug("+++ Starting ReindexJob, last indexed ${TextHelper.formatDate("yyyy-MM-dd HH:mm:ss", reindexDate.time)} +++")
Long start = System.currentTimeMillis()
String reindexMessage = ""
// Retrieve the ids of products that have been modified since the job last ran
String productQuery = "select p.id from Product ..."
List<Long> productIds = Product.executeQuery(productQuery, ["lastIndexedDate": reindexDate.time, "lastIndexedCalendar": reindexDate])
if (productIds) {
reindexMessage += "Found ${productIds.size()} product(s) to reindex. "
final int BATCH_SIZE = 10
Long time = TimeHelper.timer {
for (int inserted = 0; inserted < productIds.size(); inserted += BATCH_SIZE) {
log.debug("Indexing from ${inserted + 1} to ${Math.min(inserted + BATCH_SIZE, productIds.size())}: ${productIds.subList(inserted, Math.min(inserted + BATCH_SIZE, productIds.size()))}")
Product.reindex(productIds.subList(inserted, Math.min(inserted + BATCH_SIZE, productIds.size())))
Thread.sleep(250)
}
}
reindexMessage += " (${time / 1000} s). "
} else {
reindexMessage += "No products to reindex. "
}
log.debug(reindexMessage)
// Re-index brands
Brand.reindex()
lastIndexed = newReindexDate
log.debug("+++ Finished ReindexJob (${(System.currentTimeMillis() - start) / 1000} s) +++")
} catch (Exception e) {
log.error("Could not update searchable index, ${e.class}: ${e.message}")
if (e instanceof org.apache.lucene.store.LockObtainFailedException || e instanceof org.compass.core.engine.SearchEngineException) {
log.info("This is a Lucene index locking exception.")
for (String subIndex in compass.searchEngineIndexManager.getSubIndexes()) {
if (compass.searchEngineIndexManager.isLocked(subIndex)) {
log.info("Releasing Lucene index lock for sub index ${subIndex}")
compass.searchEngineIndexManager.releaseLock(subIndex)
}
}
}
} finally {
ConcurrencyHelper.unlock(ConcurrencyHelper.Locks.LUCENE_INDEX, "ReindexJob")
}
}
}
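For comparison, here is a minimal sketch of a skip-if-busy guard in plain Java, built on java.util.concurrent.locks.ReentrantLock.tryLock(). It is not the ConcurrencyHelper used in the job above (its implementation is not shown here); SingleRunGuard and runIfIdle are hypothetical names. The point of the sketch is that the lock is released only by the run that actually acquired it. In the job above, the finally block also runs on the early return taken when acquiredLock is false, so depending on how ConcurrencyHelper.unlock is implemented, it may be worth checking that this path cannot release a lock held by another run.

```java
import java.util.concurrent.locks.ReentrantLock;

public class SingleRunGuard {
    // Note: ReentrantLock is reentrant, so the guard only excludes
    // other threads, not nested calls from the thread that holds it.
    private final ReentrantLock lock = new ReentrantLock();

    /**
     * Runs the work only if no other thread is currently running it.
     * Returns false immediately (without blocking) if the run was skipped.
     */
    public boolean runIfIdle(Runnable work) {
        if (!lock.tryLock()) {
            return false; // another run is in progress; nothing to unlock
        }
        try {
            work.run();
            return true;
        } finally {
            // Reached only when this call acquired the lock.
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        SingleRunGuard guard = new SingleRunGuard();
        boolean ran = guard.runIfIdle(() -> System.out.println("reindexing..."));
        System.out.println("ran: " + ran);
    }
}
```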
Based on JMX CPU samples, it seems that Compass is doing some scheduling behind the scenes. Comparing 1-minute CPU samples from a normal instance and a 100% CPU instance, a few things are different:
org.apache.lucene.index.IndexWriter.doWait() is using most of the CPU time.
Compass Scheduled Executor Thread is shown in the thread list, this was not seen in a normal situation.
One Compass Executor Thread is doing commitMerge, in a normal situation none of these threads was doing commitMerge.
You can try increasing the 'compass.transaction.lockTimeout' setting. The default is 10 (seconds).
Another option is to disable concurrency in Compass and make it synchronous. This is controlled with the 'compass.transaction.processor.read_committed.concurrentOperations': 'false' setting. You might also have to set 'compass.transaction.processor' to 'read_committed'
These are the compass settings we are currently using:
compassSettings = [
'compass.engine.optimizer.schedule.period': '300',
'compass.engine.mergeFactor':'1000',
'compass.engine.maxBufferedDocs':'1000',
'compass.engine.ramBufferSize': '128',
'compass.engine.useCompoundFile': 'false',
'compass.transaction.processor': 'read_committed',
'compass.transaction.processor.read_committed.concurrentOperations': 'false',
'compass.transaction.lockTimeout': '30',
'compass.transaction.lockPollInterval': '500',
'compass.transaction.readCommitted.translog.connection': 'ram://'
]
This has concurrency switched off. You can make it asynchronous by changing the 'compass.transaction.processor.read_committed.concurrentOperations' setting to 'true'. (or removing the entry).
Compass configuration reference:
http://static.compassframework.org/docs/latest/core-configuration.html
Documentation for the concurrency of read_committed processor:
http://www.compass-project.org/docs/latest/reference/html/core-searchengine.html#core-searchengine-transaction-read_committed
If you want to keep async operations, you can also control the number of threads it uses. Using compass.transaction.processor.read_committed.concurrencyLevel=1 setting would allow asynchronous operations but just use one thread (the default is 5 threads). There are also the compass.transaction.processor.read_committed.backlog and compass.transaction.processor.read_committed.addTimeout settings.
I hope this helps.
class MyController {
    def startTwoMinuteTask = {
        response.contentType = 'text/html'
        def out = response.outputStream.destination
        out.println 'Starting ...'
        out.flush()
        for (int i=0;i<10;i++) {
            out.println " <br> $i"
            out.flush()
            Thread.sleep(1000)
        }
        return null
    }
}
I'd like this to display 1 through 10 as status updates; alas, Grails is buffering the entire thing. How do I make this work? Thanks!
I know this isn't the actual answer to your question, but why aren't you using a background Thread?
Using something like the Quartz plugin will let you spin off the long-running process. You can have the browser poll for changes periodically (or using a feature like Atmosphere for push if you can).
The benefit of this is you aren't locking open a connection. Also, not all browsers will wait that long — sometimes they'll time out. HTTP isn't really intended as a long-running connection, especially if no information is being passed.
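A rough sketch of the background-task-plus-polling idea in plain Java. This is not the Quartz plugin or Atmosphere API; the class PollableTask and its method names are made up for illustration, and the step count and delay are arbitrary. The long-running work happens on a worker thread and updates a counter; a separate request (for example, an AJAX call hitting a status action) would just read the counter and return immediately.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class PollableTask {
    private final AtomicInteger progress = new AtomicInteger();
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final int steps;
    private final long stepMillis;

    public PollableTask(int steps, long stepMillis) {
        this.steps = steps;
        this.stepMillis = stepMillis;
    }

    /** Starts the work on a background thread and returns right away. */
    public Future<?> start() {
        Runnable job = () -> {
            for (int i = 0; i < steps; i++) {
                try {
                    Thread.sleep(stepMillis); // stands in for one unit of real work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                progress.incrementAndGet();
            }
        };
        Future<?> handle = worker.submit(job);
        worker.shutdown(); // no further tasks; the submitted job still runs
        return handle;
    }

    /** What a polling request would read: number of completed steps. */
    public int progress() {
        return progress.get();
    }

    public boolean isDone() {
        return progress.get() >= steps;
    }

    public static void main(String[] args) throws Exception {
        PollableTask task = new PollableTask(10, 100);
        task.start();
        while (!task.isDone()) {
            System.out.println("progress: " + task.progress());
            Thread.sleep(250); // the browser's polling interval, in effect
        }
        System.out.println("done");
    }
}
```

In a web application the task object would need to be reachable from the status request, for example by keeping it in the session or in a map keyed by a task id.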