Identifying file causing hang from strace - memory

I have a GTK program running on Ubuntu 10.04 that hangs in interruptible state, and I'd like to understand the output of strace. In particular, I have this line:
read(5, 0x2ba9ac4, 4096) = -1 EAGAIN (Resource temporarily unavailable)
I suspect 5 is the file descriptor, 0x2ba9ac4 the address in this file to be read, and 4096 the amount of data to read. Can you confirm? More importantly, how can one determine which file the program is trying to read? This file descriptor does not exist in /proc/pid/fd (which is probably why the program hangs).

You can find which file this descriptor refers to by running strace -o log -eopen,read yourprogram. Then search the log file for the read call of interest. From that line (and not from the first line of the file), search upward for the first occurrence of that file descriptor, i.e. where a call to open returned it.
For example here, the file descriptor returned by open is 3:
open("/etc/ld.so.cache", O_RDONLY) = 3

The second argument to read() is simply the destination buffer pointer: the signature is read(int fd, void *buf, size_t count). So this call asks to read at most 4096 bytes from file descriptor 5 into the buffer at address 0x2ba9ac4, which is in the program's memory, not an offset in the file. See the manual page for read(2).

Adding to @liberforce's answer: if the process is already running, you can get the file name using lsof.
From strace:
[pid 7529] read(102, 0x7fedc64c2fd0, 16) = -1 EAGAIN (Resource temporarily unavailable)
Now, with lsof
lsof -p 7529 | grep 102
java 7529 luis 102u 0000 0,9 0 9178 anon_inode
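If the descriptor still exists, the /proc filesystem gives the same answer directly (the output below is illustrative; for an anon_inode descriptor like the one above it names the inode type in brackets):
$ readlink /proc/7529/fd/102
anon_inode:[eventfd]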

Related

IPython crashing when I hold any key

When I ssh into a particular remote machine and start an IPython session, it crashes whenever I hold down a key for about half a second (e.g. the backspace key).
The error output is pasted below:
File "/home/zach/local/anaconda3/bin/ipython", line 11, in <module>
sys.exit(start_ipython())
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/IPython/__init__.py", line 125, in start_ipython
return launch_new_instance(argv=argv, **kwargs)
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/traitlets/config/application.py", line 658, in launch_instance
app.start()
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/IPython/terminal/ipapp.py", line 356, in start
self.shell.mainloop()
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/IPython/terminal/interactiveshell.py", line 498, in mainloop
self.interact()
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/IPython/terminal/interactiveshell.py", line 481, in interact
code = self.prompt_for_code()
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/IPython/terminal/interactiveshell.py", line 410, in prompt_for_code
**self._extra_prompt_options())
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/prompt_toolkit/shortcuts/prompt.py", line 738, in prompt
return run_sync()
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/prompt_toolkit/shortcuts/prompt.py", line 727, in run_sync
return self.app.run(inputhook=self.inputhook, pre_run=pre_run2)
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/prompt_toolkit/application/application.py", line 709, in run
return run()
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/prompt_toolkit/application/application.py", line 682, in run
run_until_complete(f, inputhook=inputhook)
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/prompt_toolkit/eventloop/defaults.py", line 123, in run_until_complete
return get_event_loop().run_until_complete(future, inputhook=inputhook)
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/prompt_toolkit/eventloop/posix.py", line 66, in run_until_complete
self._run_once(inputhook)
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/prompt_toolkit/eventloop/posix.py", line 85, in _run_once
self._inputhook_context.call_inputhook(ready, inputhook)
File "/home/zach/local/anaconda3/lib/python3.7/site-packages/prompt_toolkit/eventloop/inputhook.py", line 78, in call_inputhook
threading.Thread(target=thread).start()
File "/home/zach/local/anaconda3/lib/python3.7/threading.py", line 847, in start
_start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
If you suspect this is an IPython bug, please report it at:
https://github.com/ipython/ipython/issues
or send an email to the mailing list at ipython-dev@python.org
You can print a more detailed traceback right now with "%tb", or use "%debug"
to interactively debug it.
Extra-detailed tracebacks for bug-reporting purposes can be enabled via:
%config Application.verbose_crash=True
It drops me from there into a broken bash session where my keystrokes do not appear on screen, although I can still execute commands such as ls, man, pwd, ipython, etc. I can only kill the bash session by pressing Control-D followed by Control-C. In particular, following the message's suggestion to run %tb and so forth is not possible.
Other programs are not competing for threads. Looking through the error, it appears that an event loop may be trying to create a thread to handle every key press, which eventually causes a failure to allocate more threads. That seems a little far-fetched as the cause, though, since holding a key down is surely expected behavior.
This seems potentially similar to the issue https://ipython.org/faq.html#ipython-crashes-under-os-x-when-using-the-arrow-keys.
It appears not to be a Python issue per se: if I use plain Python rather than IPython, the issue disappears. I initially used the Anaconda ipython but switched to the system ipython in /usr/bin/ipython with the same results. I also tried a clean install of Anaconda, with the same issue. A fresh install of Anaconda on a different machine with the same OS did not exhibit the issue.
I am looking for ideas to make progress on this issue. Any ideas are appreciated, and I will post follow-up data if needed.
Python 3.7.3 (default, Mar 27 2019, 22:11:17)
IPython 7.5.0
Ubuntu 18.04.2 LTS
It is fixed now, but still somewhat mysterious to me. I followed the stack trace all the way down through CPython to the pthreads library calls. The pthreads documentation indicates that this error can essentially only arise if the heap is out of memory or if the maximum number of threads has been allocated. I used ulimit to set the virtual memory per process to unlimited (it had been ~3 GB), and this resolved the issue.
So apparently the virtual memory limit interfered with the ability to allocate a thread. The obvious interpretation is that more memory was needed, although it is hard to believe that responding to a key press requires more than 3 GB. Another possibility is that the amount of address space reserved per thread is a function of the virtual memory limit; I remember something like that in the pthreads documentation, although it was a bit above my head.
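For reference, the check and the fix from the shell (ulimit -v reports and sets the limit in kilobytes, and applies to the current shell and its children; the value shown is illustrative):
$ ulimit -v            # show the current per-process virtual memory limit
3000000                # roughly the ~3 GB limit described above
$ ulimit -v unlimited
$ ipython              # retry under the raised limit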

SQLite occasionally fails to create :memory: database

In our unit testing suite, we create and destroy a large number of SQLite databases that use the path of ":memory:". Occasionally, and only when running on the iOS simulator, the creation of those databases fails with the rather enigmatic message:
Database ":memory:": unable to open database file
99% of the time, these requests succeed. (Subsequent tests within the same test run typically succeed after this failure occurs.) But when you're using this in an automated build-acceptance test, you want 100%.
We've instrumented for memory consumption (it's within normal limits) and disk-space availability (20GB+ available).
Any ideas?
UPDATE: Captured this happening with extra logging per Richard's suggestion below. Here's the log output:
SQLITE ERROR: (28) attempt to open "/Users/xxx/Library/Developer/CoreSimulator/Devices/CF762060-7D23-4C79-A466-7F20AB6233E7/data/Containers/Data/Application/582E1ED0-81E0-4CC7-A6F6-DBEBC101BBE8/tmp/etilqs_1ghbf1MSTa8ilSj" as
SQLITE ERROR: (14) cannot open file at line 30595 of [f66f7a17b7]
SQLITE ERROR: (14) os_unix.c:30595: (17) open(/Users/xxx/Library/Developer/CoreSimulator/Devices/CF762060-7D23-4C79-A466-7F20AB6233E7/data/Containers/Data/Application/582E1ED0-81E0-4CC7-A6F6-DBEBC101BBE8/tmp/etilqs_1ghbf1MST
I've noticed that even a :memory: database will create files on disk if you create a temporary table. The temporary file names on unix systems are generated by a PRNG, so there is a non-zero chance of a name collision if lots and lots of temporary files are created simultaneously. Or, if the disk is full, the create could fail. Or the unix temp directory may not be accessible, either because it has been deleted or because its permissions are invalid.
For example, I turned on several loggers in the sqlite3 command-line shell by passing these compile-time flags to llvm-gcc: -DSQLITE_DEBUG_OS_TRACE=1 -DSQLITE_TEST=1 -DSQLITE_DEBUG=1. I then observed a temp file being created from the command line using this SQL:
$ ./sqlite3
SQLite version 3.8.8.2 2015-01-30 14:30:45
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create temporary table t( x );
OPENX 3 /var/folders/nf/l1cw8sn1707b73zy5nqycrpw0000gn/T//etilqs_fvwR6KbMm518S4w 01002
OPEN 3
WRITE 3 512 0 0
OPENX 4 /var/folders/nf/l1cw8sn1707b73zy5nqycrpw0000gn/T//etilqs_OJJJ1lrTtQIFnUO 05402
OPEN 4
WRITE 4 1024 0 0
WRITE 4 1024 1024 0
WRITE 3 28 0 0
sqlite>
No ideas, but perhaps turning on the error and warning log will give some clues.
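For instance, in the sqlite3 command-line shell the .log dot-command routes error-log messages to stderr or a file (in application code, the equivalent is registering a callback via SQLITE_CONFIG_LOG); a quick sketch:
$ sqlite3
sqlite> .log stderr
sqlite> create temporary table t( x );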

Is there a good way to tell if HandBrakeCLI actually encoded anything?

I'm working on a system to convert a bunch of .mov files to H.264 (using HandBrakeCLI) and webm (using ffmpeg) as the .mov files are created. In general, things are going very well. I'm hung up on a bit of error detection. I want to know if one of the encodings failed so that we can investigate, try again, etc.
To test encoding failure, I copied a text file into a file with a .mov extension and set both programs about trying to encode it. Naturally, they both fail to encode the file (I'm not sure what "success" would mean in this context...). However, while ffmpeg reports this failure by setting its exit code to 1, HandBrakeCLI sets its exit code to 0, because it exited cleanly. This is consistent with the HandBrakeCLI documentation, but it leaves me wondering how I can tell whether HandBrakeCLI knows it failed to encode anything. That same documentation page suggests "If you want to monitor HandBrake's process, you should monitor the standard pipes", so I'm now getting the effect I want by doing something like this:
HandBrakeCLI --preset 'Normal' --input bad.mov --output out.mv4 2>&1 | grep 'Encode done'
grep then sets its exit code to 0 if it found a match, and 1 if it didn't. But, this seems rather barbaric: for instance, the text "Encode done!" could change in a future release of HandBrake.
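In full, the wrapper looks something like this (bash-specific: PIPESTATUS captures each pipeline stage's exit code, and the 'Encode done' string is assumed stable for now):
HandBrakeCLI --preset 'Normal' --input "$1" --output "$2" 2>&1 | grep -q 'Encode done'
status=("${PIPESTATUS[@]}")
# status[0] is HandBrakeCLI's own exit code (currently always 0), status[1] is grep's
if [ "${status[1]}" -ne 0 ]; then
    echo "encode appears to have failed" >&2
    exit 1
fi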
So, anyone have a better way to tell if HandBrake encoded something or not?
Some edited shell output is included below for reference...
$ ffmpeg -i 'develqueuedir/B_BH_120409.mov' 'develqueuedir/B_BH_120409.webm'
FFmpeg version 0.6.4-4:0.6.4-0ubuntu0.11.04.1, Copyright (c) 2000-2010 the Libav Developers
[snip]
develqueuedir/B_BH_120409.mov: Invalid data found when processing input
$ echo $?
1
$ HandBrakeCLI --preset 'Normal' --maxWidth 720 --optimize --input 'develqueuedir/B_BH_120409.mov' --output 'develqueuedir/B_BH_120409.mv4'
Output format couldn't be guessed from file name, using default.
[11:45:45] hb_init: starting libhb thread
HandBrake 0.9.6 (2012022900) - Linux x86_64 - http://handbrake.fr
Opening develqueuedir/B_BH_120409.mov...
[snip]
[11:45:45] libhb: scan thread found 0 valid title(s)
No title found.
HandBrake has exited.
$ echo $?
0
The short answer is no. You can find a detailed explanation at the HandBrake forum: https://forum.handbrake.fr/viewtopic.php?f=12&t=18559&p=85529&hilit=return+code#p85529
Addition:
I think there is a patch from user fonkprop that was rejected by the developers; if you really need it, contact that guy.
Good news! It appears this feature is about to be implemented in HandBrakeCLI 0.10. As you can read on the roadmap for the 0.10 milestone:
Basic support for return codes from the CLI. (0 = No Error, 1 = Cancelled, 2 = Invalid Input, 3 = Initialization error, 4 = Unknown Error)
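Once that lands, a wrapper can branch on the exit code directly instead of scraping output (a sketch assuming the codes listed above):
HandBrakeCLI --preset 'Normal' --input in.mov --output out.m4v
case $? in
    0) echo "no error" ;;
    1) echo "cancelled" ;;
    2) echo "invalid input" ;;
    3) echo "initialization error" ;;
    *) echo "unknown error" ;;
esac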

Uploading a file larger than 2GB using PHP

I'm trying to upload a file larger than 2GB to a local PHP 5.3.4 server. I've set the following server variables:
memory_limit = -1
post_max_size = 9G
upload_max_filesize = 5G
However, in the error_log I found:
PHP Warning: POST Content-Length of 2120909412 bytes exceeds the limit of 1073741824 bytes in Unknown on line 0
Can anyone tell me why this keeps failing please?
I had a similar problem, but my config was:
post_max_size = 1.8G
upload_max_filesize = 1.8G
and yet I could not upload a 1.2GB file. The error was the very same:
PHP Warning: POST Content-Length of 1347484420 bytes exceeds the limit of 1073741824 bytes in Unknown on line 0
I spent a day wondering where the heck this "limit of 1073741824" was coming from!
Solution:
Actually, the error was in the php.ini parser: it only understands integer numbers, so it was parsing 1.8G as 1G!
Changing the value to e.g. 1800M fixed it.
Please ensure you restart the Apache server afterwards with: service apache2 restart
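Put together, the fix looks like this (the php.ini path varies by distribution; /etc/php5/apache2/php.ini is just an example):
$ grep -E '^(post_max_size|upload_max_filesize)' /etc/php5/apache2/php.ini
post_max_size = 1800M
upload_max_filesize = 1800M
$ sudo service apache2 restart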
I don't know about 5.3.x, but in 5.2.x there are some int/long issues in the PHP code. Even if you're on a 64-bit system with a 64-bit build of PHP, there are several problems.
First, the code that converts post_max_size and the other settings from ASCII to integer stores the value in an int, so converting "9G" and putting the result into that int mangles the value, because 9G is a larger number than a 32-bit variable can hold.
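To see where the mysterious 1073741824 comes from: 9G is 9663676416 bytes, and truncating that to 32 bits leaves exactly 2^30 (a quick check in the shell):
$ echo $(( 9 * 1024**3 ))                 # 9G as a 64-bit value
9663676416
$ echo $(( (9 * 1024**3) & 0xFFFFFFFF ))  # the same value truncated to 32 bits
1073741824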
But there are also several other areas of PHP code that are used with the Apache module, CGI, etc. that need to be changed from int to long.
So, for this to work, you need to edit the PHP code and compile it by hand (make sure you compile it as 64-bit). Here's a link to a list of diffs:
http://www.archive.org/~tracey/downloads/patches/karmic-64bit-post-large-files.patch
Referenced from this php bug post: http://bugs.php.net/bug.php?id=44522
The file above is a diff against the 5.2.10 code, but I just made the changes by hand to the 5.2.17 code and uploaded a 3.4GB single file through Apache/PHP (which hadn't worked before the change).
Hope that helps.
I figured out how to use HTTP and PHP to upload a 10GB file.
php.ini:
post_max_size = 0
upload_max_filesize = 0
It works in PHP 5.3.10.
As long as you do not load the whole file into memory, memory_limit is not relevant.
Maybe this comes from Apache's limitation on POST body size:
http://httpd.apache.org/docs/current/mod/core.html#limitrequestbody
It seems this 2GB limitation can be higher on 64-bit installations. I'm also not sure that setting this directive to 0 avoids the compiled-in limit; see for example this thread:
http://ubuntuforums.org/archive/index.php/t-1385890.html
Then do not forget to raise max_input_time in PHP as well.
But you are reaching high limits :-) maybe you could try a rich client (Flash? JS?) on the browser side, transferring the file in chunks or via some kind of FTP mechanism, with progress indicators for the user.
As phliKtid mentioned, this is a limitation in the PHP framework. Short of editing the source code as described in the bug report phliKtid linked, there is a workaround: set upload_max_filesize to 0 in the php.ini file.
; Maximum allowed size for uploaded files.
; http://php.net/upload-max-filesize
upload_max_filesize = 0
By doing this, PHP will not crash when trying to convert "5G" into a 32-bit integer, and you will be able to upload files as big as you allow with the post_max_size variable.
We've had the same problem: uploads stopped at 2GB.
Under SLES (SUSE Linux Enterprise Server) 11 SP 2, php53 was the problem.
Then we added a new repository that has php54:
http://download.opensuse.org/repositories/server:/php/SLE_11_SP2/
and after upgrading to that, we can now upload 5GB :-)
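For reference, the upgrade amounted to something like the following (the repository alias and package name here are hypothetical; check the repository for the exact ones):
$ sudo zypper addrepo http://download.opensuse.org/repositories/server:/php/SLE_11_SP2/ server-php
$ sudo zypper refresh
$ sudo zypper install php54   # hypothetical package name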

Tailing a binary file in Erlang adds mysterious bit-string

I want to run tail on a named pipe to facilitate some binary logfile processing. The problem is that mysterious data is added to the beginning of the stream. I run my tests by starting the Erlang process with the opened port (open_port) and then using another shell to cat the binary into the named pipe.
Here is a simple function for getting data from the port:
bin_from_tail() ->
    open_port({spawn,"/usr/bin/tail -F named_pipe"},
              [binary,in,eof]),
    receive
        {_,{data,<<Data/binary>>}} -> Data
    end.
So here are two ways for me to grab the same data...
Create the named pipe:
mkfifo named_pipe
This call blocks until you run "cat log.bin > named_pipe" from another shell:
TailBin = bin_from_tail().
Read the entire file into memory using the Erlang file library:
{ok,FileBin} = file:read_file("log.bin").
But TailBin and FileBin are not the same! TailBin has a mysterious 120-byte string at the beginning:
<<40,6,161,69,172,216,56,14,100,0,80,6,0,0,0>>
Thanks for the idea about the endlessly-looping cat / restarting a dead port. It appears that named pipes buffer just a little, so if the port reopens fast enough, the writer process (another program) won't crash! Definitely risky stuff, but as far as hacks go... it works.
Because all the mailing list posts just said "do this, do that" without examples, I'm going to post how mine works! If anyone wants to offer improvements, please feel free to do so. My solution:
read() ->
    Port = open_port({spawn,"/bin/cat /path/to/pipe"},
                     [binary,in,eof]),
    do_read(Port).

do_read(Port) ->
    receive
        {Port,{data,<<Data/binary>>}} ->
            case do_something:with(Data) of
                ok ->
                    io:format("G"); % Good
                _Other ->
                    io:format("B")  % Bad
            end;
        {Port,eof} ->
            read();
        Any ->
            io:format("No match fifo_client:do_read/1, ~p~n",[Any])
    end,
    do_read(Port).
I found the same thing happens outside Erlang. The problem is that tail is trying to show you the end of the file, not the whole file. On a normal file, anything written after tail starts is new and gets picked up by -f, but in this case it looks like tail waits until the end of the file (the eof that comes through the pipe) and then shows the last 10 lines (treating the binary as text).
tail -F -c 9999999
(assuming your log is 9999999 bytes or less) would probably work.
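With GNU tail you can also start from the first byte explicitly, which avoids guessing a size:
tail -c +1 -F named_pipe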
Maybe try using cat instead of tail -F; that seemed to work for me. Then you just need to work around the fact that cat exits upon eof, which I assume you were trying to avoid by using tail.
So a shell script which loops cat endlessly, maybe?
Or get Erlang to close and recreate the port when it dies, since you're getting the eof signal anyway. Or use the exit_status flag to open_port to be signalled when the process exits, in case you need to distinguish eof from process exit. (If you use both exit_status and eof, the eof never comes; a brief test with cat < /dev/null suggests as much.)
