IPython crashing when I hold any key - pthreads

When I ssh into a particular remote machine and start an IPython session, it crashes whenever I hold a key down for about half a second (e.g. the backspace key).
The error output is pasted below:
File "/home/zach/local/anaconda3/bin/ipython", line 11, in <module>
sys.exit(start_ipython())
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/IPython/__init__.py", line 125, in start_ipython
return launch_new_instance(argv=argv, **kwargs)
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/traitlets/config/application.py", line 658, in launch_instance
app.start()
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/IPython/terminal/ipapp.py", line 356, in start
self.shell.mainloop()
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/IPython/terminal/interactiveshell.py", line 498, in mainloop
self.interact()
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/IPython/terminal/interactiveshell.py", line 481, in interact
code = self.prompt_for_code()
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/IPython/terminal/interactiveshell.py", line 410, in prompt_for_code
**self._extra_prompt_options())
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/prompt_toolkit/shortcuts/prompt.py", line 738, in prompt
return run_sync()
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/prompt_toolkit/shortcuts/prompt.py", line 727, in run_sync
return self.app.run(inputhook=self.inputhook, pre_run=pre_run2)
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/prompt_toolkit/application/application.py", line 709, in run
return run()
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/prompt_toolkit/application/application.py", line 682, in run
run_until_complete(f, inputhook=inputhook)
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/prompt_toolkit/eventloop/defaults.py", line 123, in run_until_complete
return get_event_loop().run_until_complete(future, inputhook=inputhook)
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/prompt_toolkit/eventloop/posix.py", line 66, in run_until_complete
self._run_once(inputhook)
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/prompt_toolkit/eventloop/posix.py", line 85, in _run_once
self._inputhook_context.call_inputhook(ready, inputhook)
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/prompt_toolkit/eventloop/inputhook.py", line 78, in call_inputhook
threading.Thread(target=thread).start()
File "/home/zach/local/anaconda3/lib/python3.7/threading.py", line 847, in start
_start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
If you suspect this is an IPython bug, please report it at:
https://github.com/ipython/ipython/issues
or send an email to the mailing list at ipython-dev@python.org
You can print a more detailed traceback right now with "%tb", or use "%debug"
to interactively debug it.
Extra-detailed tracebacks for bug-reporting purposes can be enabled via:
%config Application.verbose_crash=True
It drops me from here into a broken bash session where my keystrokes do not appear on screen, although I can still execute commands such as ls, man, pwd, ipython, etc. I can only kill the bash session by pressing Ctrl-D followed by Ctrl-C. In particular, I cannot follow the message's suggestion to run %tb and so forth.
Other programs are not competing for threads. Judging from the traceback, the event loop appears to create a thread to handle every key press, and this eventually makes it fail to allocate more threads. That seems a little far-fetched as the cause, though, since holding a key down is surely expected behavior.
This seems potentially similar to the issue https://ipython.org/faq.html#ipython-crashes-under-os-x-when-using-the-arrow-keys.
It appears not to be a Python issue per se, since the issue disappears if I use plain Python rather than IPython. I initially used the Anaconda ipython, then switched to the system ipython in /usr/bin/ipython with the same results. A clean install of Anaconda on the same machine had the same issue. A fresh install of Anaconda on a different machine with the same OS did not show the issue.
I am looking for ideas to make progress on this issue. Any ideas are appreciated, and I will post follow-up data if needed.
Python 3.7.3 (default, Mar 27 2019, 22:11:17)
IPython 7.5.0
Ubuntu 18.04.2 LTS

It is fixed now, but still somewhat mysterious to me. I followed the stack trace all the way down through CPython to the pthreads library calls. The pthreads documentation indicated that the error can essentially only arise if one is out of memory on the heap or if the max number of threads has been allocated. I used ulimit to set the virtual memory per process to unlimited (it had been ~3 GB). This resolved the issue.
So apparently the virtual memory limit interfered with the ability to allocate a thread. The obvious explanation is that more memory was needed, although it is hard to believe that more than 3 GB is needed to respond to a key press. Another possibility is that the amount allocated per thread is a function of the virtual memory limit. I remember something like that in the pthreads documentation, although it was a bit above my head.
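For anyone checking the same thing, here is a minimal sketch (not from the original post) that reads the relevant limits from inside Python. It assumes Linux with glibc, where, as far as I know, the default stack size of a new thread is taken from the stack soft limit (ulimit -s, commonly 8 MB) and every thread stack is reserved out of the process's virtual address space, so it counts against ulimit -v. The helper name max_thread_stacks is made up for illustration.

```python
import resource
import threading


def max_thread_stacks(as_limit, stack_size):
    """Rough upper bound on how many thread stacks fit under the address-space limit.

    Returns None when either value is unlimited or unknown. The real ceiling is
    lower, because the interpreter and its libraries also use address space.
    """
    infinity = resource.RLIM_INFINITY
    if as_limit == infinity or stack_size == infinity or stack_size <= 0:
        return None
    return as_limit // stack_size


# Soft limits of the current process (what `ulimit -v` and `ulimit -s` report,
# except that Python returns bytes instead of kilobytes).
as_soft, _ = resource.getrlimit(resource.RLIMIT_AS)
stack_soft, _ = resource.getrlimit(resource.RLIMIT_STACK)

print("address space soft limit:", as_soft)
print("stack soft limit:", stack_soft)
print("threading.stack_size():", threading.stack_size())  # 0 means platform default
print("stacks that fit:", max_thread_stacks(as_soft, stack_soft))
```

With a 3 GB limit and 8 MB stacks the bound works out to 3 * 1024 / 8 = 384 stacks before counting anything else the process has mapped, so a limit of that size is not as generous as it sounds once threads are created in quick succession.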

Related

Memory leaks with tkinter partial and lambda command

I need your help. My program, a tkinter GUI running on Python 3.7.3, keeps growing in memory (a memory leak, I believe). I narrowed the problem down in my code, and it seems to lie in the button's command option.
The program analyses the reply from the serial console and, if certain criteria are met, changes the color and command function of a button.
The loop I run to read the serial console is:
def list_ser():
    if ser_connect == 1:
        try:
            send_com('R_S')
        except:
            print('Serial port is close!')
    window.after(1000, list_ser)
The buttons are configured like this:
if var1.find('D02_1')>=1:
    btn02.configure(text = 'D02 IS ON', bg='YELLOW', font=("Verdana", 13, "bold"), command=lambda: send_com('D02_0'))
if var1.find('D02_0')>=1:
    btn02.configure(text = 'D02 IS OFF', bg='GREY', font=("Verdana", 13, "bold"), command=lambda: send_com('D02_1'))
if var1.find('D02_2')>=1:
    btn02.configure(text = 'D02 PWM CONTROL', bg='GREEN', font=("Verdana", 13, "bold"), command=lambda: send_com('D02_0'))
If I comment out command=lambda: send_com('D02_0'), everything works fine and RAM usage does not increase. I have tried partial instead of lambda, with no result. I have also tried the destroy() and delete() functions, with no result.
What is causing this? Please help me understand.
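One plausible cause, not confirmed in this thread: the polling loop calls configure(command=lambda: ...) every second, and as far as I know tkinter registers a new Tcl callback each time a callable is passed to configure, keeping the earlier ones until the widget is destroyed. A minimal sketch of a workaround under that assumption is to register a single callback once and to touch the button only when the reported state actually changes. The class ButtonState and the STATES table are hypothetical names; send stands for the send_com function from the question.

```python
# Hypothetical table: reply marker -> (button text, background, command to send on click)
STATES = {
    'D02_1': ('D02 IS ON', 'YELLOW', 'D02_0'),
    'D02_0': ('D02 IS OFF', 'GREY', 'D02_1'),
    'D02_2': ('D02 PWM CONTROL', 'GREEN', 'D02_0'),
}


class ButtonState:
    """Keeps one permanent click callback and reconfigures only on state changes."""

    def __init__(self, button, send):
        self.button = button
        self.send = send          # e.g. send_com from the question
        self.current = None       # marker of the state currently shown
        self.next_cmd = None      # command the next click should send
        # The command option is set exactly once, here.
        button.configure(command=self.on_click)

    def on_click(self):
        if self.next_cmd is not None:
            self.send(self.next_cmd)

    def update(self, reply):
        """Apply the state found in reply; return True if the button was changed."""
        matched = None
        for marker in STATES:
            if reply.find(marker) >= 1:   # same test as the original code
                matched = marker          # the last match wins, as with the chained ifs
        if matched is None or matched == self.current:
            return False
        text, bg, next_cmd = STATES[matched]
        self.button.configure(text=text, bg=bg, font=("Verdana", 13, "bold"))
        self.current = matched
        self.next_cmd = next_cmd
        return True
```

In the question's program this would be created once, for example state02 = ButtonState(btn02, send_com), and state02.update(var1) would replace the three if blocks.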

ZMQ crashes "randomly" in aiohttp web service

We have an aiohttp-based web service which uses ZMQ to send jobs to workers and waits for the results. We are of course using the ZMQ event loop, so we can wait for ZMQ sockets. "Sometimes" the process crashes and we get this stack trace:
...
await socket.send(z, flags=flags)
File "/usr/local/lib/python3.5/dist-packages/zmq/eventloop/future.py", line 165, in send
kwargs=dict(flags=flags, copy=copy, track=track),
File "/usr/local/lib/python3.5/dist-packages/zmq/eventloop/future.py", line 276, in _add_send_event
timeout_ms = self._shadow_sock.sndtimeo
File "/usr/local/lib/python3.5/dist-packages/zmq/sugar/attrsettr.py", line 45, in _getattr_
return self._get_attr_opt(upper_key, opt)
File "/usr/local/lib/python3.5/dist-packages/zmq/sugar/attrsettr.py", line 49, in _get_attr_opt
return self.get(opt)
File "zmq/backend/cython/socket.pyx", line 449, in zmq.backend.cython.socket.Socket.get (zmq/backend/cython/socket.c:4920)
File "zmq/backend/cython/socket.pyx", line 221, in zmq.backend.cython.socket._getsockopt (zmq/backend/cython/socket.c:2860)
"Sometimes" means that the code works fine if I just run it on my test machine. We encountered the problem in some rare cases when using Docker containers, but were never able to reproduce it in a reliable way. Since we moved our containers into a Kubernetes cluster, it occurs much more often. Does anybody know what could be the source of the above stack trace?
aiohttp is not intended to be used with vanilla pyzmq.
Use aiozmq loopless streams instead.
See also https://github.com/zeromq/pyzmq/issues/894 and https://github.com/aio-libs/aiozmq/blob/master/README.rst

How do I view the source code of a GW-BASIC .BAS file?

I have an old .BAS file that I'm trying to view and for which I'm running into some problems. Searching online seems to indicate that I should be able to just open it in NOTEPAD.EXE or similar, but doing so gives me gibberish, like this:
þ*©¿TÜ…7[/C̸yõ»€¹Ù<Ñ~Æ-$Ì™}³nFuJ,ÖYòÎg)ʇŒ~Š¯DËðïþSnhœJN
‰=É™2+df”c).vX»[šû'Û9¹8%ñx5m#8úV4ÊBº)Eª;Iú¹ó‹|àÆ„72#Ž§i§Ë #îÑ?
í‘ú™ÞMÖæÕjYе‘_¢y<…7i$°Ò.ÃÅR×ÒTÒç_yÄÐ
}+d&jQ *YòÎg)ʇŒ~Š¯DË?úŽ©Ž5\šm€S{ÔÍo—#ìôÔ”ÜÍѱ]ʵ¬0wêÂLª¡öm#Å„Ws雦 X
Ô¶æ¯÷¦É®jÛ ¼§
”n ŸëÆf¿´ó½4ÂäÌ3§Œ®
I know the file is sound, because I can load it in GW-BASIC. However, LIST does not work to view the program, and trying to save it in ASCII format from within GW-BASIC didn't work either. Both just gave me an "Illegal function call" error:
GW-BASIC 3.22
(C) Copyright Microsoft 1983,1984,1986,1987
60300 Bytes free
Ok
LOAD"Pwrharm
Ok
LIST
Illegal function call
Ok
SAVE "Pwrharm2",A
Illegal function call
Ok
RUN
[Program runs successfully]
Then again, the run command works just fine. What am I doing wrong?
You're not doing anything wrong; the file was originally saved in GWBASIC with the ,P option. There is a 'hack' to unprotect it, described at https://groups.google.com/forum/#!topic/comp.os.msdos.misc/PA9sve0eKAk - basically, you create a file (call it UNPROT.BAS) containing only the characters 0xff 0x1a, then load the protected file, then load UNPROT.BAS, and you should then be able to list and save the program.
If you can't LIST or EDIT a GW-BASIC .BAS file that you LOADed from disk, it means that the file was originally SAVEd in protected format via SAVE filespec, P.
The 1988 "Handbook of BASIC - third edition" by David I. Schneider describes it as follows:
A program that has been SAVEd in protected format can be unprotected with the following technique.
(a) Create a file called RECOVER.BAS with the following program.
10 OPEN "RECOVER.BAS" FOR OUTPUT AS #1
20 PRINT #1, CHR$(255);
30 CLOSE #1
(b) LOAD the protected program into memory.
(c) Enter LOAD "RECOVER.BAS"
The formerly protected program will now be in memory and can be LISTed or EDITed, and reSAVEd in an unprotected format. This technique appears to work with most versions of BASIC. I have used it successfully with IBM PC BASIC, Compaq BASIC, and several versions of GW-BASIC. LOADing the file RECOVER.BAS will also restore a program after a NEW command has been executed.
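If it is more convenient to prepare the helper file on a modern machine, the two-byte file described in the first answer (0xff followed by 0x1a) can also be written outside of BASIC. A minimal Python sketch, assuming the file is then copied next to the protected program; the function name write_unprotect_file is made up for illustration, and UNPROT.BAS is the file name suggested in that answer.

```python
from pathlib import Path

# The two bytes named in the answer above: 0xFF followed by 0x1A (Ctrl-Z).
UNPROTECT_BYTES = bytes([0xFF, 0x1A])


def write_unprotect_file(path="UNPROT.BAS"):
    """Write the two-byte helper file and return its path."""
    target = Path(path)
    target.write_bytes(UNPROTECT_BYTES)
    return target
```

After calling write_unprotect_file(), the steps are the same as above: LOAD the protected program, then LOAD "UNPROT.BAS", then LIST or SAVE.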

How to diagnose intermittent uwsgi errors?

First let me briefly describe our set up before I ask the question proper:
We have a web application server (a virtual machine) running a Django application: nginx at the front, uwsgi running under that, then a New Relic application wrapper followed by Django et al. The database is a separate PostgreSQL server located via SmartStack (synapse/nerve).
The issue we face is that occasionally (happened once 2 weeks ago, and twice in the last 2 days), one or two of the uwsgi worker processes will trip up and start producing "django.db.utils.InterfaceError: connection already closed" on most of their requests.
Slightly redacted stack trace (user and application_name are placeholders):
Traceback (most recent call last):
File "/home/user/webapps/application_name/local/lib/python2.7/site-packages/newrelic-2.8.0.7/newrelic/api/web_transaction.py", line 863, in __call__
File "/home/user/webapps/application_name/local/lib/python2.7/site-packages/newrelic-2.8.0.7/newrelic/api/function_trace.py", line 90, in literal_wrapper
File "/home/user/webapps/application_name/local/lib/python2.7/site-packages/newrelic-2.8.0.7/newrelic/api/web_transaction.py", line 752, in __call__
File "/home/user/webapps/application_name/local/lib/python2.7/site-packages/django/core/handlers/wsgi.py", line 194, in __call__
signals.request_started.send(sender=self.__class__)
File "/home/user/webapps/application_name/local/lib/python2.7/site-packages/django/dispatch/dispatcher.py", line 185, in send
response = receiver(signal=self, sender=sender, **named)
File "/home/user/webapps/application_name/local/lib/python2.7/site-packages/django/db/__init__.py", line 91, in close_old_connections
conn.abort()
File "/home/user/webapps/application_name/local/lib/python2.7/site-packages/django/db/backends/__init__.py", line 374, in abort
self.rollback()
File "/home/user/webapps/application_name/local/lib/python2.7/site-packages/django/db/backends/__init__.py", line 177, in rollback
self._rollback()
File "/home/user/webapps/application_name/local/lib/python2.7/site-packages/django/db/backends/__init__.py", line 141, in _rollback
return self.connection.rollback()
File "/home/user/webapps/application_name/local/lib/python2.7/site-packages/django/db/utils.py", line 99, in __exit__
six.reraise(dj_exc_type, dj_exc_value, traceback)
File "/home/user/webapps/application_name/local/lib/python2.7/site-packages/django/db/backends/__init__.py", line 141, in _rollback
return self.connection.rollback()
File "/home/user/webapps/application_name/local/lib/python2.7/site-packages/newrelic-2.8.0.7/newrelic/hooks/database_dbapi2.py", line 82, in rollback
django.db.utils.InterfaceError: connection already closed
The stack trace never gets into our application; it only touches New Relic and Django. Once a worker trips, it doesn't recover, and all further requests result in 500s in the uwsgi logs and 502s on the front side. I assume database connectivity is fine because the sibling workers continue to function normally, and restarting uwsgi instantly fixes the problem.
My question is how one would go about diagnosing this issue to pinpoint the root cause. I have checked everything I know how to check (memory, CPU, logs, database connectivity) and some things I don't fully understand but am trying to read up on (mainly file descriptors).
For now I have updated New Relic (the stack trace is from the older version), as it's the only thing I felt I could do.
I would appreciate any feedback; many Google searches have proved fruitless.
Replies may be slightly delayed, as my timezone says it's time to sleep. Also, apologies if this should be on Server Fault or something. I just figured it's closer to an application debugging issue than a server configuration issue.

Identifying file causing hang from strace

I have a GTK program running on Ubuntu 10.04 that hangs in interruptible state, and I'd like to understand the output of strace. In particular, I have this line:
read(5, 0x2ba9ac4, 4096) = -1 EAGAIN (Resource temporarily unavailable)
I suspect 5 is the file descriptor, 0x2ba9ac4 the address in this file to be read, and 4096 the amount of data to read. Can you confirm? More importantly, how can one determine which file the program is trying to read? This file descriptor does not exist in /proc/pid/fd (which is probably why the program hangs).
You can find which file uses this file descriptor by running strace -o log -eopen,read yourprogram. Then search the log file for the call to read that you are interested in. From that line (and not from the first line of the file), search upwards for the first occurrence of this file descriptor (returned by a call to open).
For example here, the file descriptor returned by open is 3:
open("/etc/ld.so.cache", O_RDONLY) = 3
The second argument to read() is simply the destination pointer (the buffer the data is copied into), not an address within the file. The call asks to read at most 4096 bytes from file descriptor 5. See the manual page for read().
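To illustrate what the EAGAIN result itself means, here is a minimal Python sketch (not from the original answers) that reproduces it with a pipe. EAGAIN is what read() reports for a valid descriptor in non-blocking mode when no data is available yet; a descriptor that did not exist would give EBADF instead. The helper name try_read is made up for illustration.

```python
import os


def try_read(fd, max_bytes=4096):
    """Non-blocking read: return the bytes read, or None if nothing is ready."""
    try:
        return os.read(fd, max_bytes)
    except BlockingIOError:
        # Python raises BlockingIOError when read() fails with EAGAIN/EWOULDBLOCK.
        return None


read_end, write_end = os.pipe()
os.set_blocking(read_end, False)    # same mode the GTK program's descriptor is in

first = try_read(read_end)          # pipe is empty, so read() reports EAGAIN
os.write(write_end, b"hello")
second = try_read(read_end)         # data is available now

os.close(read_end)
os.close(write_end)
print(first, second)
```

So a single EAGAIN line in strace usually just shows an event loop polling a descriptor that has nothing to deliver yet.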
Adding to @liberforce's answer: if the process is already running, you can get the file name using lsof.
From strace:
[pid 7529] read(102, 0x7fedc64c2fd0, 16) = -1 EAGAIN (Resource temporarily unavailable)
Now, with lsof:
lsof -p 7529 | grep 102
java 7529 luis 102u 0000 0,9 0 9178 anon_inode
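On Linux the same lookup can be done without lsof, because /proc/<pid>/fd/<fd> is a symbolic link to whatever the descriptor refers to. A minimal sketch, assuming a Linux /proc filesystem and permission to inspect the target process; the function name fd_target is made up for illustration.

```python
import os
import tempfile


def fd_target(pid, fd):
    """Return what /proc/<pid>/fd/<fd> points to, or None if it cannot be read."""
    try:
        return os.readlink(f"/proc/{pid}/fd/{fd}")
    except OSError:
        # No such process or descriptor, no permission, or no /proc at all.
        return None


# Demonstration on our own process: open a temporary file and resolve its descriptor.
with tempfile.NamedTemporaryFile() as tmp:
    demo_path = os.path.realpath(tmp.name)
    demo_target = fd_target(os.getpid(), tmp.fileno())

print(demo_target)
```

For the lsof example above this would be fd_target(7529, 102). For sockets, pipes and anonymous inodes the link target is a label such as anon_inode:[eventpoll] rather than a path.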
