How to gracefully kill an unresponsive tcl script? - timeout

Let's say I have a Tcl script which should normally execute in less than a minute. How can I make sure that the script NEVER takes more than 'x' minutes/seconds to execute, and that it is simply stopped if it does?
For example, if the script has taken more than 100 seconds, then I should be able to automatically switch control to a clean up function which would gracefully end the script so that I have all the data from the script run so far but I also ensure that it doesn't take too long or get stuck infinitely.
I'm not sure if this can be done in tcl - any help or pointers would be welcome.

You could use interp limit with a child interpreter.
Note that this will throw an uncatchable error; if you want to do some cleanup, you need to remove the limit from the parent interp.
set interp [interp create]
# initialize the interp
interp eval $interp {
    source somestuff.tcl
}
# Add the limit. From now on you have 60 seconds before an error is thrown
interp limit $interp time -seconds [clock seconds] -milliseconds 60000
set errorcode [catch {interp eval $interp {DoExpensiveStuff}} res opts]
# remove the limit so you can clean up the mess if needed
interp limit $interp time -seconds {}
if {$errorcode} {
    # Do some cleanup here
}
# delete the interp, or reuse it?
interp delete $interp
# And what shall be done with the error? Rethrow it.
return -options $opts $res
Resource limits are the best bet with Tcl, but they are not bulletproof. Tcl cannot (and will not) abort C procedures, and there are some ways to make the Tcl core itself do long-running work.

There must be a loop that you're worried might take more than 100 seconds, yes? Save clock seconds (current time) before you enter the loop, and check the time again at the end of each iteration to see if more than 100 seconds have elapsed.
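If for some reason that's not possible, you can try devising something using after: kick off a timer whose callback sets (or unsets) some global variable that your executing code is aware of, so that on detection it can attempt to exit. Keep in mind that after callbacks only fire while the event loop is being serviced, so a busy loop has to call update (or otherwise enter the event loop) now and then.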
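The check-the-clock-each-iteration idea is not specific to Tcl. A minimal sketch of it in Python follows; the names `run_with_deadline`, `process` and `cleanup` are hypothetical stand-ins for the script's real work and cleanup function. Note that the deadline is only checked between iterations, so a single iteration that hangs will not be interrupted.

```python
import time

def run_with_deadline(work_items, process, budget_seconds, cleanup):
    """Process items until they run out or the time budget is spent.

    `process` and `cleanup` are placeholders (assumptions) for the
    script's own work and cleanup routines.
    """
    deadline = time.monotonic() + budget_seconds
    results = []
    timed_out = False
    for item in work_items:
        # Check the clock before starting each iteration.
        if time.monotonic() >= deadline:
            timed_out = True
            break
        results.append(process(item))
    # Always hand the partial results to the cleanup routine.
    cleanup(results, timed_out)
    return results, timed_out
```

Results gathered before the deadline are kept, which matches the requirement of having all the data from the run so far.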

Related

waitForCompletion(timeout) in Abaqus API does not actually kill the job after timeout passes

I'm doing a parametric sweep of some Abaqus simulations, and so I'm using the waitForCompletion() function to prevent the script from moving on prematurely. However, occasionally the combination of parameters causes the simulation to hang on one or two of the parameters in the sweep for something like half an hour to an hour, whereas most parameter combos only take ~10 minutes. I don't need all the data points, so I'd rather sacrifice one or two results to power through more simulations in that time. Thus I tried to use waitForCompletion(timeout) as documented here. But it doesn't work: it ends up functioning just like an indefinite waitForCompletion, regardless of how low I set the wait time. I am using Abaqus 2017, and I was wondering if anyone else had gotten this function to work and, if so, how?
While I could use a workaround like adding a custom timeout function and using the kill() function on the job, I would prefer to use the built-in functionality of the Abaqus API, so any help is much appreciated!
It seems like starting from a certain version the timeOut optional argument was removed from this method: compare the "Scripting Reference Manual" entry in the documentation of v6.7 and v6.14.
You have a few options:
From within the Abaqus API: check whether the job's .023 file (my_job_name.023 below) still exists during the simulation:
import os, time
timeOut = 600
total_time = 60
time.sleep(60)
# wait until the job is completed
while os.path.isfile('my_job_name.023'):
    if total_time > timeOut:
        my_job.kill()
        break  # stop polling once the job has been killed
    total_time += 60
    time.sleep(60)
From outside: launch the job using the subprocess module.
Note: don't use the interactive keyword in your command, because it blocks the execution of the script while the simulation process is active.
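import subprocess, os, time
my_cmd = 'abaqus job=my_abaqus_script analysis cpus=1'
# stdout/stderr must be file objects (or descriptors), not file names
log_file = open('my_study.log', 'w')
err_file = open('my_study.err', 'w')
proc = subprocess.Popen(
    my_cmd,
    cwd=my_working_dir,
    stdout=log_file,
    stderr=err_file,
    shell=True
)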
and check the return code of the child process using poll() (see also returncode):
timeOut = 600
total_time = 60
time.sleep(60)
# wait until the job is completed
while proc.poll() is None:
    if total_time > timeOut:
        proc.terminate()
        break  # stop polling once the process has been terminated
    total_time += 60
    time.sleep(60)
or waiting until the timeOut is reached using wait()
timeOut = 600
try:
    proc.wait(timeOut)
except subprocess.TimeoutExpired:
    print('TimeOut reached!')
    proc.terminate()
Note: terminate() and wait() should work in theory, but I haven't tried this solution myself, so there may be additional complications (such as having to find all child processes created by Abaqus using psutil.Process(proc.pid)).
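The wait-then-terminate approach can be wrapped in one helper. The following is only a sketch under stated assumptions: `run_with_timeout`, `grace_seconds` and `DEMO_CHILD` are hypothetical names, and the demo command is a sleeping Python process standing in for the Abaqus command line. The command is passed as an argument list without shell=True so that terminate() reaches the launched process directly; whether that is enough to stop every process Abaqus spawns is untested here.

```python
import subprocess
import sys

# Stand-in for a long-running solver command (assumption, not an Abaqus call).
DEMO_CHILD = [sys.executable, "-c", "import time; time.sleep(30)"]

def run_with_timeout(cmd, timeout_seconds, grace_seconds=5):
    """Run cmd (a list of arguments) and stop it if it exceeds the timeout.

    Returns (returncode, timed_out).
    """
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=timeout_seconds)
        return proc.returncode, False
    except subprocess.TimeoutExpired:
        proc.terminate()  # polite request first
        try:
            proc.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            proc.kill()   # force it if terminate() is ignored
            proc.wait()
        return proc.returncode, True
```

Calling run_with_timeout(DEMO_CHILD, 1) returns after about a second with the timed_out flag set, and a sweep loop could skip that parameter combination and move on.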

How can I stop values being outputted immediately in Forth?

Using SwiftForth, I am currently looking at methods for measuring the time it takes for a word to be executed. I am using the words 'counter' and then 'timer' in the form:
counter insert_word_here timer
This immediately outputs the time in microseconds that it takes to run the word. Is there a way I can prevent this integer from being outputted immediately, so that I can store it in the stack?
timer in SwiftForth is implemented something like
: timer \ t0 -- ;
counter swap - u.
;
Simply define a word without the u. and the elapsed time in milliseconds is left on the stack.
: timer-ms \ t0 -- t-elapsed
counter swap -
;
I don't have SwiftForth, but timer was defined as an example on the page I found. I think this should work, but I can't test it.

How to get results of tasks when they finish and not after all have finished in Dask?

I have a dask dataframe and want to compute some tasks that are independent. Some tasks are faster than others but I'm getting the result of each task after longer tasks have completed.
I created a local Client and use client.compute() to send tasks. Then I use future.result() to get the result of each task.
I'm using threads to ask for results at the same time and measure the time for each result to compute like this:
import threading
import time
import dask.dataframe as dd
from dask.distributed import Client

def get_result(future, i):
    t0 = time.time()
    print("calculating result", i)
    result = future.result()
    print("result {} took {}".format(i, time.time() - t0))

client = Client()
df = dd.read_csv(path_to_csv)
future1 = client.compute(df[df.x > 200])
future2 = client.compute(df[df.x > 500])
threading.Thread(target=get_result, args=[future1, 1]).start()
threading.Thread(target=get_result, args=[future2, 2]).start()
I expect the output of the above code to be something like:
calculating result 1
calculating result 2
result 2 took 10
result 1 took 46
Since the first task is larger.
But instead I got both at the same time
calculating result 1
calculating result 2
result 2 took 46.3046760559082
result 1 took 46.477620363235474
I assume that is because future2 actually computes in the background and finishes before future1, but it waits until future1 is completed to return.
Is there a way I can get the result of future2 at the moment it finishes?
You do not need to make threads to use futures in an asynchronous fashion: they are already inherently async, and monitor their status in the background. If you want to get results in the order they are ready, you should use as_completed.
However, for your specific situation, you may want to simply view the dashboard (or use df.visualize()) to understand the computation that is happening. Both futures depend on reading the CSV, and this one task will be required before either can run, and it probably takes the vast majority of the time. Dask does not know, without scanning all of the data, which rows have what value of x.
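To illustrate the as_completed pattern, here is a minimal sketch that uses the standard library's concurrent.futures instead of Dask, purely so that it runs without a cluster. dask.distributed provides its own as_completed that is iterated over Dask futures in the same way. The helper names `slow_square` and `collect_in_completion_order` are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def slow_square(x, delay):
    """Stand-in task: sleep for `delay` seconds, then return x squared."""
    time.sleep(delay)
    return x * x

def collect_in_completion_order(jobs):
    """jobs: list of (x, delay) pairs. Returns (x, result) in finishing order."""
    finished = []
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        # Map each future back to its input so results can be labelled.
        futures = {pool.submit(slow_square, x, delay): x for x, delay in jobs}
        # as_completed yields each future as soon as it is done,
        # regardless of submission order.
        for future in as_completed(futures):
            finished.append((futures[future], future.result()))
    return finished
```

With independent tasks the short one is reported first. In the question's case this would not help much, because both futures share the expensive CSV-reading step.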

Asterisk PBX - Infinite Loop when user disconnects while using 'Read' application from LUA

I'm configuring interactive dial plans for asterisk at the moment and because I already know some LUA I thought it'd be easier to go that route.
I have a start extension like this:
["h"] = function(c,e)
    app.verbose("Hung Up")
end;
["s"] = function(c, e)
    local d = 0
    while d == 0 do
        say:hello()
        app.read("read_result", nil, 1)
        d = channel["read_result"].value;
        if d == 1 then
            say:goodbye()
        elseif d == 2 then
            call:forward('front desk')
        end
        d = 0
    end
    say:goodbye()
end;
As you can see, I want to repeat the instructions say:hello() whenever
the user gives an invalid answer. However, if the user hangs up while
app.read waits for their answer, asterisk ends up in an infinite loop
since d will always be nil.
I WOULD check for d==nil to detect disconnection, but nil also shows
up when the user just presses the # pound sign during app.read.
So far I've taken to using for loops instead of while to limit the
maximum iterations that way, but I'd rather find out how to detect a disconnected
channel. I can't find any documentation on that though.
I also tried setting up a h extension, but the program won't go to it when the
user hangs up.
Asterisk Verbose Output:
[...]
-- Executing [s@test-call:1] read("PJSIP/2300-00000004", "read_result,,1")
-- Accepting a maximum of 1 digit.
-- User disconnected
-- Executing [s@test-call:1] read("PJSIP/2300-00000004", "read_result,,1")
-- Accepting a maximum of 1 digit.
-- User disconnected
-- Executing [s@test-call:1] read("PJSIP/2300-00000004", "read_result,,1")
-- Accepting a maximum of 1 digit.
-- User disconnected
-- Executing [s@test-call:1] read("PJSIP/2300-00000004", "read_result,,1")
[...]
Thanks for any help you might be able to offer.
First of all, you can see in the app_read docs (and in the docs of any other application) that it returns different values on incorrect execution (for example when the channel is down).
Also, this particular application offers a simpler way to determine the result:
core show application Read
-= Info about application 'Read' =-
[Synopsis]
Read a variable.
[Description]
Reads a #-terminated string of digits a certain number of times from the user
in to the given <variable>.
This application sets the following channel variable upon completion:
${READSTATUS}: This is the status of the read operation.
OK
ERROR
HANGUP
INTERRUPTED
SKIPPED
TIMEOUT
If that still does not suit you, you can ask Asterisk directly about CHANNEL(state).
P.S. You should NEVER write a dialplan or any other program with an infinite loop. Count your loop iterations and exit after 10 or so. This will save the client A LOT of money.

Set the memory high and then set_time_limit to 0

In PHP, if I set the memory limit to 100M via ini_set and then call set_time_limit(0);, does that mean that my PHP memory allocation is 100M forever (until I restart Apache)?
No, it's reset back to the original value at the end of script execution.
From the manual:
string ini_set ( string $varname , string $newvalue )
Sets the value of the given configuration option. The configuration
option will keep this new value during the script's execution, and
will be restored at the script's ending.
and set_time_limit(0); is treated the same.
Example:
// 1. Script starts
echo ini_get('memory_limit'); // 128M
// 2. We set a new limit; the script will now have 100M
ini_set('memory_limit', '100M');
echo ini_get('memory_limit'); // 100M
die;
// 3. Script ends; now it's set back to 128M
set_time_limit(0); just tells the script not to time out. If you were to call set_time_limit(0); within a loop, though, its internal counter would be reset to 0 on each iteration, over and over.
So if you were to use set_time_limit(1); within a loop, the script would still not time out as long as each iteration lasted no longer than 1 second, because set_time_limit(n); resets the internal timeout counter to 0 on each call.
Example of it not timing out after 1 second:
for ($i = 0; $i <= 10; $i++) {
    set_time_limit(1);
    usleep(999998); // 2 microseconds short of a second
    echo $i;
}
