Execute_notebook never returns - papermill

I'm using papermill in a web app and running execute_notebook inside Celery tasks. I'm logging the output, and the entire notebook finishes: I get the export I'm waiting for in GCS and everything seems perfect. But my execute_notebook call never returns, so my Celery task never finishes either.
Here's some pared down code:
def execute_nb(parameters: NotebookParameters, product_type: ProductType, notebook_name: str = None):
    try:
        nb_path = f"{settings.NOTEBOOK_DIR_PATH}/{product_type}.ipynb"
        if notebook_name:
            nb_path = f"{settings.NOTEBOOK_DIR_PATH}/{notebook_name}.ipynb"
        nb_content = NOTEBOOK_REPO.get_contents(nb_path, ref=settings.NOTEBOOK_REF).decoded_content
    except Exception as e:
        print(e)
        raise NotebookDoesNotExistException(path=nb_path)

    # Make sure these notebook folders exist locally
    if not os.path.isdir(settings.NOTEBOOK_DIR_PATH):
        os.makedirs(settings.NOTEBOOK_DIR_PATH)
    if not os.path.isdir(f"{settings.NOTEBOOK_DIR_PATH}/outputs"):
        os.makedirs(f"{settings.NOTEBOOK_DIR_PATH}/outputs")

    # Write the notebook from git to a local file (at the same relative path as git, from main.py)
    with open(nb_path, "wb") as f:
        f.write(nb_content)

    parameters_dict = json.loads(parameters)
    pm.execute_notebook(
        nb_path,
        f"{settings.NOTEBOOK_DIR_PATH}/outputs/{product_type}_output.ipynb",
        parameters=parameters_dict,
        log_output=True,
    )
    print('done!')
    return True
It never prints that 'done!' statement, so my Celery task never finishes. The logs in my container show this:
data-layer-generation-service-worker-1 | [2022-11-16 01:48:43,496: WARNING/ForkPoolWorkerExecuting: 100%|##########| 7/7 [00:39<00:00, 6.36s/cell]
So it's reaching the end. Am I supposed to do something to trigger the end of execute_notebook?
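For reference, here is a rough sketch of how the task wrapper is invoked; the app name, broker URL, and task name below are placeholders for illustration, not my real code:

from celery import Celery

app = Celery("worker", broker="redis://localhost:6379/0")  # placeholder broker

@app.task
def generate_product(parameters, product_type, notebook_name=None):
    # The task just delegates to execute_nb, so it can only finish
    # once pm.execute_notebook returns.
    return execute_nb(parameters, product_type, notebook_name)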

Related

Dask with TLS connection cannot end the program with the to_parquet method

I am using Dask to process 10 files, each about 142 MB in size. I built a function with the delayed decorator; the following is an example:
import os
import dask
import pandas as pd

@dask.delayed
def process_one_file(input_file_path, save_path):
    res = []
    for line in open(input_file_path):
        res.append(line)
    df = pd.DataFrame(res)
    df.to_parquet(save_path + os.path.basename(input_file_path))

if __name__ == '__main__':
    client = ClusterClient()  # connects to the TLS-enabled cluster
    input_dir = ""
    save_dir = ""
    print("start to process")
    csvs = [process_one_file(input_dir + filename, save_dir) for filename in os.listdir(input_dir)]
    dask.compute(csvs)
However, Dask does not always run successfully. After processing all of the files, the program often hangs.
I run the program from the command line, and it often hangs after printing "start to process". I know the processing runs correctly, since I can see all of the output files after a while, but the program never stops. If I disable TLS, the program runs successfully.
It is strange that Dask cannot stop the program when I enable the TLS connection. How can I solve it?
I also found that if I include the to_parquet call, the program cannot stop, while if I remove it, everything runs successfully.
I have found the problem. I set 10GB for each process, i.e. memory-limit=10GB. In total I set 2 workers, each with 2 processes, and each process has 2 threads.
Thus each machine has 4 processes claiming 40GB in total, but my machine only has 32GB. If I lower the memory limit, the program runs successfully!
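For illustration, a minimal sketch of keeping the limits within RAM using dask.distributed's LocalCluster; the exact cluster class and the 7GB figure are assumptions, not the setup from the question:

from dask.distributed import Client, LocalCluster

# Sketch only: keep (worker processes x memory_limit) below the machine's
# 32GB of RAM, e.g. 4 processes x 7GB = 28GB.
cluster = LocalCluster(n_workers=4, threads_per_worker=2, memory_limit="7GB")
client = Client(cluster)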

subprocess.Popen communicate with timeout, python 3

I'm trying to get the output of another process in Python 3.
Here is my code:
import subprocess

proc = subprocess.Popen(BIN, stdout=subprocess.PIPE)
try:
    outs = proc.communicate(timeout=10)[0]
except subprocess.TimeoutExpired:
    proc.kill()
    outs = proc.communicate()[0]
The problem is: BIN is an executable that never finishes, so TimeoutExpired is always raised, but I'm unable to get the output in the except block.
Thanks for reading.
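No answer is included here, but one possible workaround, sketched below under the assumption that BIN streams its output to stdout rather than buffering it until exit, is to read line by line with a deadline instead of relying on communicate():

import subprocess
import time

# Sketch only: collect output incrementally up to a deadline, then kill.
# BIN stands in for the question's never-ending executable; "yes" is just
# an example command that prints forever.
BIN = ["yes"]
proc = subprocess.Popen(BIN, stdout=subprocess.PIPE)
deadline = time.monotonic() + 10
chunks = []
try:
    for line in proc.stdout:  # blocks per line rather than until the process exits
        chunks.append(line)
        if time.monotonic() > deadline:
            break
finally:
    proc.kill()
    proc.wait()
outs = b"".join(chunks)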

ipython redirect stdout display corruption

I'm developing a system in Python, and one piece of functionality I need is the ability to have console output go to both the console and a user-specified file, replicating the Diary function in MATLAB. I have the following, which works perfectly well in both IDLE on Windows and the Python command line in Ubuntu (this all exists inside a module that gets loaded):
import sys

class diaryout(object):
    def __init__(self):
        self.terminal = sys.stdout
        self.save = None

    def __del__(self):
        try:
            self.save.flush()
            self.save.close()
        except:
            # do nothing, just catch the error; maybe self was instantiated but the file was never opened
            pass
        self.save = None

    def dclose(self):
        self.__del__()

    def write(self, message):
        self.terminal.write(message)
        self.save.write(message)

    def dopen(self, outfile):
        self.outfile = outfile
        try:
            self.save = open(self.outfile, "a")
        except Exception, e:
            # just pass the error up here so the Diary function can handle it
            raise e


def Diary(outfile=None):  # NEW TO TEST
    global this_diary
    if outfile == None:
        # None passed, so close the diary file if one is open
        if isinstance(this_diary, diaryout):
            sys.stdout = this_diary.terminal  # set stdout back to the real stdout
            this_diary.dclose()  # flush and close the file
            this_diary = None  # "delete" it
    else:
        # file passed, so let's open it and set it for the output
        this_diary = diaryout()  # instantiate
        try:
            this_diary.dopen(outfile)  # open & test that it opened
        except IOError:
            raise IOError("Can't open %s for append!" % outfile)
            this_diary = None  # must uninstantiate it, since already did that
        except TypeError:
            raise TypeError("Invalid input detected - must be string filename or None: %s" % Diary.__doc__)
            this_diary = None  # must uninstantiate it, since already did that
        sys.stdout = this_diary  # set stdout to it
Far superior to both IDLE and the plain Python command line, I'm using IPython, and herein lies my problem. I can turn on the "diary" with no error, but the console display gets garbled (the attached screenshot shows this), and the output file becomes similarly garbled. Everything goes back to normal when I undo the redirection with Diary(None). I have tried editing the code so that it never even writes to the file, with no effect. It seems almost like something is forcing an unsupported character set, or something else I don't understand.
Anyone have an idea about this?
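No answer is included here, but for comparison, a minimal Python 3 sketch of the same "tee" idea that also forwards flush(), which IPython calls on sys.stdout; whether the missing flush method is what actually causes the corruption is an assumption:

import sys

# Sketch only: a tee-style writer that forwards both write() and flush().
# The class name and log file name are made up for illustration.
class Tee(object):
    def __init__(self, path):
        self.terminal = sys.stdout
        self.logfile = open(path, "a")

    def write(self, message):
        self.terminal.write(message)
        self.logfile.write(message)

    def flush(self):
        self.terminal.flush()
        self.logfile.flush()

sys.stdout = Tee("diary.log")
print("this line goes to both the console and diary.log")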

Why is my Ruby script utilizing 90% of my CPU?

I wrote an admin script that tails a Heroku log and, every n seconds, summarizes averages and notifies me if I cross a certain threshold (yes, I know and love New Relic, but I want to do custom stuff).
Here is the entire script.
I have never been a master of IO and threads, so I wonder if I am making a silly mistake. I have a couple of daemon threads with while(true){} loops, which could be the culprit. For example:
# read new lines
f = File.open(file, "r")
f.seek(0, IO::SEEK_END)
while true do
  select([f])
  line = f.gets
  parse_heroku_line(line)
end
I use one daemon to watch for new lines of a log, and the other to periodically summarize.
Does anyone see a way to make it less processor-intensive?
This probably runs hot because you never really block while reading from the temporary file. IO::select is a thin layer over POSIX select(2). It looks like you're trying to block until the file is ready for reading, but select(2) considers EOF to be ready ("a file descriptor is also ready on end-of-file"), so you always return right away from select then call gets which returns nil at EOF.
You can get a truer EOF reading and nice blocking behavior by avoiding the thread which writes to the temp file and instead using IO::popen to fork the %x[heroku logs --ps router --tail --app pipewave-cedar] log tailer, connected to a ruby IO object on which you can loop over gets, exiting when gets returns nil (indicating the log tailer finished). gets on the pipe from the tailer will block when there's nothing to read and your script will only run as hot as it takes to do your line parsing and reporting.
EDIT: I'm not set up to actually try your code, but you should be able to replace the log tailer thread and your temp file read loop with this code to get the behavior described above:
IO.popen( %w{ heroku logs --ps router --tail --app my-heroku-app } ) do |logf|
  while line = logf.gets
    parse_heroku_line(line) if line =~ /^/
  end
end
I also notice your reporting thread does not do anything to synchronize access to @total_lines, @total_errors, etc. So you have some minor race conditions where you can get inconsistent values from the instance vars that the parse_heroku_line method updates.
select is about whether a read would block. f is just a plain old file, so when you get to the end, reads don't block; they just return nil instantly. As a result, select returns instantly rather than waiting for something to be appended to the file, as I assume you're expecting. Because of this you're sitting in a tight busy loop, so high CPU is to be expected.
If you are at EOF (you could either check f.eof? or check whether gets returns nil), then you could either start sleeping (perhaps with some sort of backoff) or use something like listen to be notified of filesystem changes.

echoprint-codegen runs indefinitely with delayed_job

I'm attempting to run echoprint-codegen in a background process for analysing audio files as they're uploaded to a web service.
The desired functionality exists with a simple system call on the tmp file that gets uploaded via Paperclip:
result = `echoprint-codegen #{path} 0 20` # works!
Unfortunately, this is not the case when the delayed workers fire off a new job; the echoprint-codegen process appears to hang indefinitely.
Per the echoprint README, I've double checked that ffmpeg is also within the path (Paperclip.options[:command_path] is pointing to the correct path).
I've also attempted to encapsulate the echoprint-codegen command line in a Paperclip.run() call, but that also results in a hanging process.
Any pointers?
I obtained the desired functionality by placing the echoprint-codegen system call in a Ruby Thread:
thread = Thread.new { Thread.current[:result] = `echoprint-codegen #{path} 0 20` }
thread.join
result = thread[:result]
