Quartz 2.2.1: JMX jobRunTime always -1?

Is it normal that in Quartz, for the JMX Attribute CurrentlyExecutingJobs=> [item] => jobRunTime always is "-1" while it is currently running, or is there some setting in Quartz to ensure the jobRunTime is updated appropriately?
(confirmed via jconsole, Mission Control, and jmx code)
The use case is tracking/monitoring long-running jobs, and jobRunTime looked like the appropriate attribute for that. The alternative is to calculate "current time minus fireTime" myself, but I wanted to avoid the extra calculation if it was already being done somewhere.

After chasing this around: this value is not updated while the job is running unless you set it manually. Tools that monitor Quartz jobs, such as JavaMelody, have to calculate the elapsed time on every check too:
elapsedTime = System.currentTimeMillis()- quartzAdapter.getContextFireTime(jobExecutionContext).getTime();
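The same calculation can be sketched with JDK types only. The class and method names below are hypothetical, and the fire time is passed in as a plain java.util.Date rather than read from a Quartz context, so treat this as an illustration of the arithmetic and not as Quartz or JavaMelody code:

```java
import java.util.Date;

public class JobElapsed {

    /**
     * Elapsed run time in milliseconds: "now" minus the fire time.
     * Clamped at zero in case the clocks disagree slightly.
     */
    static long elapsedMillis(Date fireTime, long nowMillis) {
        return Math.max(0L, nowMillis - fireTime.getTime());
    }

    public static void main(String[] args) {
        // Pretend the job fired five seconds ago.
        Date fireTime = new Date(System.currentTimeMillis() - 5_000L);
        System.out.println("elapsed ms: " + elapsedMillis(fireTime, System.currentTimeMillis()));
    }
}
```

Note the order of the subtraction: subtracting "now" from the fire time would give a negative number.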
If you would rather update the jobRunTime value for long-running jobs and read it over JMX than calculate it externally, you have to change every job to support this. Here is a rough example that can be adapted to your needs, sourced from: https://github.com/dhartford/quartz-snippets/blob/master/update_jobruntime_timer_innerclass
/**
 * Inner class to handle scheduled updates of the Quartz jobRunTime attribute.
 * Needs java.util.Timer, java.util.TimerTask and org.quartz.impl.JobExecutionContextImpl.
 */
class UpdateJobTimer extends TimerTask {
    private JobExecutionContextImpl jec;

    /* Usage example, such as at the start of the execute method of the Job interface
     * (the context passed to execute has to be cast to JobExecutionContextImpl):
     * Timer timer = new Timer();
     * //update every 10 seconds (in milliseconds), whatever poll timing you want
     * timer.schedule(new UpdateJobTimer(jec), 0, 10000);
     * ...
     * timer.cancel(); //do cleanup in all appropriate spots
     */
    UpdateJobTimer(JobExecutionContextImpl jec) {
        this.jec = jec;
    }

    @Override
    public void run() {
        // Elapsed time is "now" minus the fire time; the other order gives a negative value.
        long runtimeinms = new java.util.Date().getTime() - jec.getFireTime().getTime();
        jec.setJobRunTime(runtimeinms);
        System.out.println("DEBUG TIMERTASK on JOB: " + jec.getJobDetail().getKey().getName() + " triggered [" + jec.getFireTime() + "] updated [" + new java.util.Date() + "]");
    }
}

Related

When does reactor execute a subscription chain?

The reactor documentation states the following:
Nothing happens until you subscribe
If that was true, why do I see a java.lang.NullPointerException when I run the following code snippet, which has a reactor chain without a subscription?
@Test
void test() {
    String a = null;
    Flux.just(a.toLowerCase())
        .doOnNext(System.out::println);
}
Deepak,
Nothing happens means the data will not be flowing through the chain of your functions to your consumers until a subscription happens.
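You're getting the NPE because Java has to evaluate a.toLowerCase() before it can pass the result to just(), and that happens when the Flux is defined, not when it is subscribed to. just() is a hot operator in this sense: it captures a value that was computed up front.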
You can also convert just() to a cold operator using defer() so you will receive NPE only after a subscription happened:
public Flux<String> test() {
String a = null;
return Flux.defer(() -> Flux.just(a.toLowerCase()))
.doOnNext(System.out::println);
}
Please read more about hot vs cold operators.
Update:
A small example of cold and hot publishers. Each time a new subscription happens, the cold publisher's body is re-evaluated. Meanwhile, just() only emits the time that was calculated once, at definition time.
Mono<Date> currentTime = Mono.just(Calendar.getInstance().getTime());
Mono<Date> realCurrentTime = Mono.defer(() -> Mono.just(Calendar.getInstance().getTime()));
// 1 sec sleep
Thread.sleep(1000);
currentTime.subscribe(time -> System.out.println("Current Time " + time.getTime()));
realCurrentTime.subscribe(time -> System.out.println("Real current Time " + time.getTime()));
Thread.sleep(2000);
currentTime.subscribe(time -> System.out.println("Current Time " + time.getTime()));
realCurrentTime.subscribe(time -> System.out.println("Real current Time " + time.getTime()));
The output is:
Current Time 1583788755759
Real current Time 1583788756826
Current Time 1583788755759
Real current Time 1583788758833
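The eager-versus-deferred evaluation described above can be reproduced with plain JDK types. This is only an analogy under stated assumptions: java.util.function.Supplier stands in for a publisher, get() stands in for subscribing, and the just/defer helpers below are hypothetical names written for this sketch, not Reactor's implementation.

```java
import java.util.function.Supplier;

public class EagerVsDeferred {

    /** Mimics just(): the caller has already evaluated the argument before this method runs. */
    static <T> Supplier<T> just(T value) {
        return () -> value;
    }

    /** Mimics defer(): the factory is only invoked when get() is called. */
    static <T> Supplier<T> defer(Supplier<T> factory) {
        return () -> factory.get();
    }

    /** True if merely defining the "just" pipeline throws, before anything calls get(). */
    static boolean justFailsAtDefinition(String input) {
        try {
            just(input.toLowerCase()); // toLowerCase() runs right here
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    /** True if merely defining the "defer" pipeline throws, before anything calls get(). */
    static boolean deferFailsAtDefinition(String input) {
        try {
            defer(() -> input.toLowerCase()); // lambda body has not run yet
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("just fails at definition:  " + justFailsAtDefinition(null));
        System.out.println("defer fails at definition: " + deferFailsAtDefinition(null));
    }
}
```

With a null input, the first helper reports a failure at definition time and the second does not; the deferred version only fails once get() is called.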

Issue with Quartz grail plugin

I have a Grails application with a Quartz job running in it. The job contains code similar to the below.
class MyJob{
static triggers = {}
def printLog(msg){
String threadId = Thread.currentThread().getId()
String threadName = Thread.currentThread().getName()
log.info(threadId+" - "+threadName+" : "+msg)
}
def execute(context)
{
printLog("Before Sync");
synchronized(MyJob){
printLog("Inside Sync");
try{
printLog("Before sleep 20 minutes")
Thread.sleep(1200000)
printLog("After sleep")
}catch (Exception e){
log.error("Error while sleeping")
}
}
printLog("After Sync")
}
}
I have scheduled it to trigger every minute.
I can see in the logs that one thread enters the synchronized block and the other jobs then pile up, waiting for that thread to finish. This is working as expected.
The issue is that new jobs stop starting after 10 minutes, by which time 10 threads have been created. One of them is sleeping for 20 minutes and the other 9 are waiting for the first thread to release the lock. Why are no new jobs created?
I saw in some answers I can fix the issue by modifying my triggers section like below
static triggers = {
simple repeatInterval: 100
}
I tried the above option and it still shows only 10 jobs.
Where does the default of 10 come from?
How can I change it so that jobs keep being created indefinitely?
I am new to grails and quartz, so I have no idea what is happening.
I think the Grails plugin sets threadCount to 10 in its bundled quartz.properties file. Assuming you're using Grails 3, you can override it in application.yml like this:
quartz:
threadPool:
threadCount: 25
Grails 2 - application.groovy
quartz {
props {
threadPool.threadCount = 100
}
}
In general, it's not a good idea to block the job thread with sleeps.
If a job runs a long process, split it into several jobs so that each thread is released as soon as possible.
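Why a fixed thread count caps the number of running jobs can be illustrated with a plain JDK thread pool. This is an analogy, not Quartz's own thread pool: the class and method names are made up for this sketch, and a latch stands in for the long sleep or lock wait.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolSaturation {

    /** Submits `jobs` blocking tasks to a pool of `threads` workers; returns how many started. */
    static int startedWhileBlocked(int threads, int jobs) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch poolBusy = new CountDownLatch(Math.min(threads, jobs));
        AtomicInteger started = new AtomicInteger();
        try {
            for (int i = 0; i < jobs; i++) {
                pool.submit(() -> {
                    started.incrementAndGet();
                    poolBusy.countDown();
                    try {
                        release.await(); // stands in for the sleep / waiting on the lock
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            try {
                poolBusy.await(5, TimeUnit.SECONDS); // wait until every worker is occupied
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return started.get(); // extra jobs are queued, not started
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("started: " + startedWhileBlocked(3, 5));
    }
}
```

With 3 workers and 5 jobs, only 3 jobs start while the workers are blocked; the rest wait for a free thread, which mirrors the 10-job ceiling seen in the question.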

Searchable index gets locked on manual update (LockObtainFailedException)

We have a Grails project that runs behind a load balancer. There are three instances of the Grails application running on the server (using separate Tomcat instances). Each instance has its own searchable index. Because the indexes are separate, the automatic update is not enough to keep the index consistent between the application instances. Because of this we have disabled the searchable index mirroring, and updates to the index are done manually in a scheduled Quartz job. According to our understanding, no other part of the application should modify the index.
The quartz job runs once a minute and it checks from the database which rows have been updated by the application, and re-indexes those objects. The job also checks if the same job is already running so it doesn’t do any concurrent indexing. The application runs fine for few hours after the startup and then suddenly when the job is starting, LockObtainFailedException is thrown:
22.10.2012 11:20:40 [xxxx.ReindexJob] ERROR Could not update searchable index, class org.compass.core.engine.SearchEngineException:
Failed to open writer for sub index [product]; nested exception is
org.apache.lucene.store.LockObtainFailedException: Lock obtain timed
out:
SimpleFSLock@/home/xxx/tomcat/searchable-index/index/product/lucene-a7bbc72a49512284f5ac54f5d7d32849-write.lock
According to the log the last time the job was executed, re-indexing was done without any errors and the job finished successfully. Still, this time the re-index operation throws the locking exception, as if the previous operation was unfinished and the lock had not been released. The lock will not be released until the application is restarted.
We tried to solve the problem by manually opening the locked index, which causes the following error to be printed to the log:
22.10.2012 11:21:30 [manager.IndexWritersManager ] ERROR Illegal state, marking an index writer as open, while another is marked as
open for sub index [product]
After this the job seems to be working correctly and doesn’t become stuck in a locked state again. However this causes the application to constantly use 100 % of the CPU resource. Below is a shortened version of the quartz job code.
Any help would be appreciated to solve the problem, thanks in advance.
class ReindexJob {
def compass
...
static Calendar lastIndexed
static triggers = {
// Every day every minute (at xx:xx:30), start delay 2 min
// cronExpression: "s m h D M W [Y]"
cron name: "ReindexTrigger", cronExpression: "30 * * * * ?", startDelay: 120000
}
def execute() {
if (ConcurrencyHelper.isLocked(ConcurrencyHelper.Locks.LUCENE_INDEX)) {
log.error("Search index has been locked, not doing anything.")
return
}
try {
boolean acquiredLock = ConcurrencyHelper.lock(ConcurrencyHelper.Locks.LUCENE_INDEX, "ReindexJob")
if (!acquiredLock) {
log.warn("Could not lock search index, not doing anything.")
return
}
Calendar reindexDate = lastIndexed
Calendar newReindexDate = Calendar.instance
if (!reindexDate) {
reindexDate = Calendar.instance
reindexDate.add(Calendar.MINUTE, -3)
lastIndexed = reindexDate
}
log.debug("+++ Starting ReindexJob, last indexed ${TextHelper.formatDate("yyyy-MM-dd HH:mm:ss", reindexDate.time)} +++")
Long start = System.currentTimeMillis()
String reindexMessage = ""
// Retrieve the ids of products that have been modified since the job last ran
String productQuery = "select p.id from Product ..."
List<Long> productIds = Product.executeQuery(productQuery, ["lastIndexedDate": reindexDate.time, "lastIndexedCalendar": reindexDate])
if (productIds) {
reindexMessage += "Found ${productIds.size()} product(s) to reindex. "
final int BATCH_SIZE = 10
Long time = TimeHelper.timer {
for (int inserted = 0; inserted < productIds.size(); inserted += BATCH_SIZE) {
log.debug("Indexing from ${inserted + 1} to ${Math.min(inserted + BATCH_SIZE, productIds.size())}: ${productIds.subList(inserted, Math.min(inserted + BATCH_SIZE, productIds.size()))}")
Product.reindex(productIds.subList(inserted, Math.min(inserted + BATCH_SIZE, productIds.size())))
Thread.sleep(250)
}
}
reindexMessage += " (${time / 1000} s). "
} else {
reindexMessage += "No products to reindex. "
}
log.debug(reindexMessage)
// Re-index brands
Brand.reindex()
lastIndexed = newReindexDate
log.debug("+++ Finished ReindexJob (${(System.currentTimeMillis() - start) / 1000} s) +++")
} catch (Exception e) {
log.error("Could not update searchable index, ${e.class}: ${e.message}")
if (e instanceof org.apache.lucene.store.LockObtainFailedException || e instanceof org.compass.core.engine.SearchEngineException) {
log.info("This is a Lucene index locking exception.")
for (String subIndex in compass.searchEngineIndexManager.getSubIndexes()) {
if (compass.searchEngineIndexManager.isLocked(subIndex)) {
log.info("Releasing Lucene index lock for sub index ${subIndex}")
compass.searchEngineIndexManager.releaseLock(subIndex)
}
}
}
} finally {
ConcurrencyHelper.unlock(ConcurrencyHelper.Locks.LUCENE_INDEX, "ReindexJob")
}
}
}
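One detail worth checking in a guard like the one above is that the lock is released only by the run that actually acquired it, since a finally block also executes on an early return. A minimal JDK-only sketch of that pattern follows; SingleRunGuard is a hypothetical class written for illustration and makes no claim about how ConcurrencyHelper behaves.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SingleRunGuard {
    private final AtomicBoolean running = new AtomicBoolean(false);

    /** Runs the task unless another run is in progress; returns whether it ran. */
    public boolean runIfIdle(Runnable task) {
        if (!running.compareAndSet(false, true)) {
            return false; // another run holds the flag; leave it untouched
        }
        try {
            task.run();
            return true;
        } finally {
            running.set(false); // reached only by the run that acquired the flag
        }
    }

    public static void main(String[] args) {
        SingleRunGuard guard = new SingleRunGuard();
        boolean ran = guard.runIfIdle(() -> System.out.println("reindexing..."));
        System.out.println("ran: " + ran);
    }
}
```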
Based on JMX CPU samples, it seems that Compass is doing some scheduling behind the scenes. Comparing 1-minute CPU samples from a normal instance and a 100% CPU instance shows a few differences:
org.apache.lucene.index.IndexWriter.doWait() is using most of the CPU time.
Compass Scheduled Executor Thread is shown in the thread list, this was not seen in a normal situation.
One Compass Executor Thread is doing commitMerge, in a normal situation none of these threads was doing commitMerge.
You can try increasing the 'compass.transaction.lockTimeout' setting. The default is 10 (seconds).
Another option is to disable concurrency in Compass and make it synchronous. This is controlled with the 'compass.transaction.processor.read_committed.concurrentOperations': 'false' setting. You might also have to set 'compass.transaction.processor' to 'read_committed'
These are the compass settings we are currently using:
compassSettings = [
'compass.engine.optimizer.schedule.period': '300',
'compass.engine.mergeFactor':'1000',
'compass.engine.maxBufferedDocs':'1000',
'compass.engine.ramBufferSize': '128',
'compass.engine.useCompoundFile': 'false',
'compass.transaction.processor': 'read_committed',
'compass.transaction.processor.read_committed.concurrentOperations': 'false',
'compass.transaction.lockTimeout': '30',
'compass.transaction.lockPollInterval': '500',
'compass.transaction.readCommitted.translog.connection': 'ram://'
]
This has concurrency switched off. You can make it asynchronous by changing the 'compass.transaction.processor.read_committed.concurrentOperations' setting to 'true'. (or removing the entry).
Compass configuration reference:
http://static.compassframework.org/docs/latest/core-configuration.html
Documentation for the concurrency of read_committed processor:
http://www.compass-project.org/docs/latest/reference/html/core-searchengine.html#core-searchengine-transaction-read_committed
If you want to keep async operations, you can also control the number of threads it uses. Using compass.transaction.processor.read_committed.concurrencyLevel=1 setting would allow asynchronous operations but just use one thread (the default is 5 threads). There are also the compass.transaction.processor.read_committed.backlog and compass.transaction.processor.read_committed.addTimeout settings.
I hope this helps.

Multiple scheduler with Grails Quartz plugin

I have an application using Grails Quartz plugin. I need to have two jobs to have multiple instances running, but have separate limitation on number of threads to be used for each job. As far as I understand, I need separate Thread Pools, which is possible by having separate schedulers. However, I cannot figure out how to create multiple schedulers with Quartz plugin.
Assuming you want to use different triggers to start the job multiple times, this works for me.
class MyJob {
static triggers = {
cron name: 'trigger1', cronExpression: "0 30 12 ? * WED"
cron name: 'trigger2', cronExpression: "0 30 12 ? * SAT"
}
def execute() {
// execute task, do your thing here
println "Job executed"
}
}
Finally, about concurrent tasks.
This is from the plug-in page:
By default Jobs are executed in concurrent fashion, so new Job
execution can start even if previous execution of the same Job is
still running.
Quartz plugin 2.0.13
According to the official documentation:
Multiple triggers per job are allowed.
For instance,
class MyJob {
static triggers = {
simple name:'simpleTrigger', startDelay:10000, repeatInterval: 30000, repeatCount: 10
cron name:'cronTrigger', startDelay:10000, cronExpression: '0/6 * 15 * * ?'
custom name:'customTrigger', triggerClass:MyTriggerClass, myParam:myValue, myAnotherParam:myAnotherValue
}

Symfony: current execution time?

I have a time-consuming script, and I want to periodically log its execution time. How do I find out the current execution time?
The symfonian way to log execution times is using the timer manager that comes with symfony.
//Get timer instance called 'myTimer'.
$timer = sfTimerManager::getTimer('myTimer');
//Start timer.
$timer->startTimer();
// Do things
...
// Stop the timer and add the elapsed time
$timer->addTime();
This timer will be saved into any logger you have configured with your symfony.
By default symfony has the sfWebDebugLogger for the DEV environment but you can create your own and configure it in the factories.yml file.
The nice thing about this logger is that it logs also the number of calls to the timer.
Why not use date('Y-m-d H:i:s') and log that... and/or calculate difference from a start time obtained with date() as well?
Think about what you are asking: each action is logged in the symfony log (root/log/app_name_[env].log), but only once the operation has ended (there is no easy way to find out, from PHP, the execution time of a thread that is running a PHP script). You could add code that logs the current execution time at certain points, something like:
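// microtime(true) returns float seconds; plain microtime() returns a "msec sec" string
// log() stands for whatever logger you use
$init = microtime(true);
log("Process started at: " . $init);
foreach ($this->getLotsOfRecords() as $index => $record)
{
    $start = microtime(true);
    log($index . " record started at " . $start);
    [do stuff]
    $end = microtime(true);
    // elapsed time is end minus start
    log($index . " record ended at " . $end . " time elapsed: " . ($end - $start));
}
$last = microtime(true);
log("Total elapsed time is: " . ($last - $init));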
This is pseudo code, but I believe you can figure out the rest. Hope this helps!
I ended up writing my own static class:
class Timer
{
protected static $start = null;
public static function getDuration($format = '%H:%I:%S')
{
$now = new DateTime();
if (!self::$start)
{
self::$start = new DateTime();
}
$period = $now->diff(self::$start);
return $period->format($format);
}
}
And logging it in partial (that is looped):
<?php $logger->log(sprintf("Duration: %s", $duration = Timer::getDuration())) ?>
