cuda-gdb attaching: A program is being debugged already. Kill it? - cuda-gdb

I tried to attach to a running process (python3) after I found GPU-related messages in dmesg (e.g. Xid 13, 31, 45).
But I couldn't get any clue because of the messages below.
Does anyone know what these messages mean?
Couldn't write extended state status: Bad address.
A program is being debugged already. Kill it?
$ cuda-gdb python3 1243
NVIDIA (R) CUDA Debugger
10.2 release
Portions Copyright (C) 2007-2019 NVIDIA Corporation
GNU gdb (GDB) 7.12
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
(omit)
Reading symbols from /usr/lib64/libcuda.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/local/lib64/python3.6/site-packages/PIL/_imaging.cpython-36m-x86_64-linux-gnu.so...(no debugging symbols found)...done.
Reading symbols from /usr/local/lib64/python3.6/site-packages/PIL/../Pillow.libs/libjpeg-ba7bf5af.so.9.4.0...(no debugging symbols found)...done.
Reading symbols from /usr/local/lib64/python3.6/site-packages/PIL/../Pillow.libs/libopenjp2-b3d7668a.so.2.3.1...(no debugging symbols found)...done.
Reading symbols from /usr/local/lib64/python3.6/site-packages/PIL/../Pillow.libs/libtiff-41910f6d.so.5.5.0...(no debugging symbols found)...done.
Reading symbols from /usr/local/lib64/python3.6/site-packages/PIL/../Pillow.libs/./liblzma-99449165.so.5.2.5...(no debugging symbols found)...done.
0x00007ffcd4ed4980 in clock_gettime ()
Couldn't write extended state status: Bad address.
A program is being debugged already. Kill it? (y or n) y
/home1/irteam/apps/pytorch-app/src/1243: No such file or directory.
(cuda-gdb) bt
No stack.
(cuda-gdb) exit

Related

Problem compiling bitcoin source code(https://github.com/bitcoin/bitcoin) on linux

Issue: Problem compiling bitcoin source code from https://github.com/bitcoin/bitcoin
Building the bitcoin code requires Berkeley DB 4.8 (https://github.com/tinybike/get-bdb-4.8).
No problem with that.
My system is running on Ubuntu 20.04.
$ cpp --version
cpp (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ gcc --version
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ g++ --version
g++ (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
When compiling the bitcoin code, after running 'configure' and 'make', an error occurs indicating that it could not find iostream.h:
...
CXX libbitcoin_server_a-txrequest.o
CXX libbitcoin_server_a-txmempool.o
CXX libbitcoin_server_a-validation.o
CXX libbitcoin_server_a-validationinterface.o
CXX libbitcoin_server_a-versionbits.o
CXX wallet/libbitcoin_server_a-init.o
In file included from ./wallet/bdb.h:27,
from wallet/init.cpp:19:
/bitcoin/src/bdb/build_unix/build/include/db_cxx.h:59:10: fatal error: iostream.h: No such file or directory
59 | #include <iostream.h>
| ^~~~~~~~~~~~
compilation terminated.
make[2]: *** [Makefile:8933: wallet/libbitcoin_server_a-init.o] Error 1
make[2]: Leaving directory '/bitcoin/src'
make[1]: *** [Makefile:15214: all-recursive] Error 1
make[1]: Leaving directory '/bitcoin/src'
make: *** [Makefile:809: all-recursive] Error 1
On examining the header file location /usr/include/c++/9, I could not locate iostream.h.
Is this a compiler package issue, or is bitcoin not using the C++ iostream header file?
I would guess you tried to build the "depends" BDB before installing the required system packages, and that produced an invalid/unusable build.
Try removing your current "depends" builds and doing them over.
Alternatively, you could just use my db48 PPA for Ubuntu: https://launchpad.net/~luke-jr/+archive/ubuntu/db48
I got the same error, but it came up when I built zero-ice with Berkeley DB. I found some usages of libdb, and most of them add #define HAVE_CXX_STDHEADERS at the beginning of the code, so I tried adding this definition in ICEDIR/cpp/include/IceUtil/Config.h. It works. I hope it works for you.
Something went completely wrong during the db4.8 compilation.
As a temporary fix, you can add this line to include/db_cxx.h:
#define HAVE_CXX_STDHEADERS 1
This may help, but it depends.
IMHO, the most correct way to build db-4.8 for bitcoin is:
wget http://download.oracle.com/berkeley-db/db-4.8.30.NC.tar.gz
tar zxvf db-4.8.30.NC.tar.gz
cd db-4.8.30.NC
cd build_unix/
../dist/configure --prefix=/usr/local/db48 --enable-cxx --with-pic --disable-replication --disable-shared
make install
cd ../../bitcoin-x.x
export BDB_PREFIX=/usr/local/db48
export BDB_LIBS="-L/usr/local/db48/lib -ldb_cxx-4.8"
export BDB_CFLAGS="-I/usr/local/db48/include"
./configure
and so on.
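The workarounds above all depend on whether HAVE_CXX_STDHEADERS ends up defined in the header that the build actually includes. A minimal sketch for checking that before rebuilding, assuming the header only falls back to <iostream.h> when the macro is not defined (this is what the workaround implies, not something confirmed here). The function names are hypothetical, and the check is a plain text scan: it ignores /* ... */ comments and macros passed on the compiler command line.

```python
import re

# Matches an active "#define HAVE_CXX_STDHEADERS" line, with or without a value.
# Lines starting with "//" do not match because "#" must be the first
# non-blank character on the line.
DEFINE_RE = re.compile(
    r"^[ \t]*#[ \t]*define[ \t]+HAVE_CXX_STDHEADERS\b", re.MULTILINE
)


def has_cxx_stdheaders(header_text):
    """Return True if the header text defines HAVE_CXX_STDHEADERS itself."""
    return DEFINE_RE.search(header_text) is not None


def check_header(path):
    """Read a header file (hypothetical helper) and run the same check on it."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return has_cxx_stdheaders(handle.read())
```

For example, check_header("/usr/local/db48/include/db_cxx.h") could be run against the prefix used in the steps above. A False result would suggest the BDB configure step did not detect a working C++ compiler, so redoing the BDB build is likely a better fix than patching the header.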

Error: you did not specify -i=mi on GDB's command line! in Docker

I need to run gdb inside Docker, and I have a strong preference for the interface provided by emacs.
When doing M-x gdb, I enter "docker-compose -f ~/docker-services/dev/docker-compose.yml exec dev_rhel7 bash -c "gdb -i=mi"", and then it shows me the following message.
Current directory is /home/drcoeurjoly/docker-services/dev/
Error: you did not specify -i=mi on GDB's command line!
WARNING: The MY_UID variable is not set. Defaulting to a blank string.
1-inferior-tty-set /dev/pts/3
2-gdb-set height 0
3-gdb-set non-stop 1
4-enable-pretty-printing
5-file-list-exec-source-files
6-file-list-exec-source-file
7-gdb-show prompt
8-stack-info-frame
9-thread-info
10-break-list
11-thread-info
12-break-list
=thread-group-added,id="i1"
~"GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-114.el7\n"
~"Copyright (C) 2013 Free Software Foundation, Inc.\n"
~"License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html\nThis is free software: you are free to change and redistribute it.\nThere is NO WARRANTY, to the extent permitted by law. Type \"show copying\"\nand \"show warranty\" for details.\n"
~"This GDB was configured as \"x86_64-redhat-linux-gnu\".\nFor bug reporting instructions, please see:\n"
~"http://www.gnu.org/software/gdb/bugs/.\n"
=cmd-param-changed,param="history save",value="on"
=cmd-param-changed,param="history filename",value="/home/drcoeurjoly/dotfiles/gdb/.gdb_history"
=cmd-param-changed,param="print pretty",value="on"
=cmd-param-changed,param="print object",value="on"
=cmd-param-changed,param="print vtbl",value="on"
=cmd-param-changed,param="demangle-style",value="gnu-v3"
=cmd-param-changed,param="follow-fork-mode",value="child"
=cmd-param-changed,param="detach-on-fork",value="off"
(gdb)
1^done
(gdb)
2^done
(gdb)
3^done
(gdb)
4^done
(gdb)
5^done,files=[]
(gdb)
6^error,msg="No symbol table is loaded. Use the \"file\" command."
(gdb)
7^done,value="(gdb) "
(gdb)
8^error,msg="No registers."
(gdb)
9^done,threads=[]
(gdb)
10^done,BreakpointTable={nr_rows="0",nr_cols="6",hdr=[{width="7",alignment="-1",col_name="number",colhdr="Num"},{width="14",alignment="-1",col_name="type",colhdr="Type"},{width="4",alignment="-1",col_name="disp",colhdr="Disp"},{width="3",alignment="-1",col_name="enabled",colhdr="Enb"},{width="10",alignment="-1",col_name="addr",colhdr="Address"},{width="40",alignment="2",col_name="what",colhdr="What"}],body=[]}
(gdb)
11^done,threads=[]
(gdb)
12^done,BreakpointTable={nr_rows="0",nr_cols="6",hdr=[{width="7",alignment="-1",col_name="number",colhdr="Num"},{width="14",alignment="-1",col_name="type",colhdr="Type"},{width="4",alignment="-1",col_name="disp",colhdr="Disp"},{width="3",alignment="-1",col_name="enabled",colhdr="Enb"},{width="10",alignment="-1",col_name="addr",colhdr="Address"},{width="40",alignment="2",col_name="what",colhdr="What"}],body=[]}
(gdb)
Note that I successfully enter gdb in Docker, since GDB says:
"GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-114.el7\n"
In my host operating system (Debian):
gdb --version
outputs:
GNU gdb (Debian 8.2.1-2+b3) 8.2.1
In Debian, I execute the Docker command:
docker-compose -f ~/docker-services/dev/docker-compose.yml exec dev_rhel7 bash -c "gdb -i=mi"
and I get the machine-oriented text interface.
From the previous test I deduce that this is an issue with emacs, not Docker.
When entering the path of a binary instead of the option -i=mi, it reads the symbols just fine:
M-x gdb RETURN docker-compose -f ~/docker-services/dev/docker-compose.yml exec dev_rhel7 bash -c "gdb ~/babel_sandbox/build/foo"
which outputs:
Current directory is /home/drcoeurjoly/docker-services/dev/
Error: you did not specify -i=mi on GDB's command line!
WARNING: The MY_UID variable is not set. Defaulting to a blank string.
1-inferior-tty-set /dev/pts/3
2-gdb-set height 0
3-gdb-set non-stop 1
4-enable-pretty-printing
5-file-list-exec-source-files
6-file-list-exec-source-file
7-gdb-show prompt
8-stack-info-frame
9-thread-info
10-break-list
11-thread-info
12-break-list
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-114.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /home/drcoeurjoly/babel_sandbox/build/foo...done.
(gdb) 1-inferior-tty-set /dev/pts/3
Undefined command: "1-inferior-tty-set". Try "help".
(gdb) 2-gdb-set height 0
Undefined command: "2-gdb-set". Try "help".
(gdb) 3-gdb-set non-stop 1
Undefined command: "3-gdb-set". Try "help".
(gdb) 4-enable-pretty-printing
Undefined command: "4-enable-pretty-printing". Try "help".
(gdb) 5-file-list-exec-source-files
Undefined command: "5-file-list-exec-source-files". Try "help".
(gdb) 6-file-list-exec-source-file
Undefined command: "6-file-list-exec-source-file". Try "help".
(gdb) 7-gdb-show prompt
Undefined command: "7-gdb-show". Try "help".
(gdb) 8-stack-info-frame
Undefined command: "8-stack-info-frame". Try "help".
(gdb) 9-thread-info
Undefined command: "9-thread-info". Try "help".
(gdb) 10-break-list
Undefined command: "10-break-list". Try "help".
(gdb) 11-thread-info
Undefined command: "11-thread-info". Try "help".
(gdb) 12-break-list
Undefined command: "12-break-list". Try "help".
(gdb)
I also tried putting gdb -i=mi inside a script and calling that from emacs, to no avail. Calling it directly from bash worked, but calling it from emacs did not.
Relevant information:
My spacemacs config
foo program used for testing argument passing to gdb in emacs.
I don't know if the dockerfile and docker-compose yml are relevant. If so, I will create a repo.
Versions:
Host:
uname -a
Linux des26 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u2 (2019-11-11) x86_64 GNU/Linux
Emacs in host:
emacs --version
GNU Emacs 26.3
GDB in host:
gdb --version
GNU gdb (Debian 8.2.1-2+b3) 8.2.1
Docker in host:
docker --version
Docker version 18.09.1, build 4c52b90
Docker-compose in host:
docker-compose --version
docker-compose version 1.21.0, build unknown
Docker container:
cat /etc/redhat-release
CentOS Linux release 7.5.1804 (Core)
GDB in docker container:
gdb --version
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-114.el7
This issue was reported by Robert Mecklenburg here and here as early as August 2017.

Octave memory allocation error after changing graphicsmagicks to quantum depth 16

I tried to increase the quantum depth of GraphicsMagick to 16 bits. I downloaded the GraphicsMagick package from here. Here are the version details and quantum depth reported by gm version:
karthikeyan@karthikeyan:~$ gm version
GraphicsMagick 1.3.25 2016-09-05 Q16 http://www.GraphicsMagick.org/
Copyright (C) 2002-2016 GraphicsMagick Group.
Additional copyrights and licenses apply to this software.
See http://www.GraphicsMagick.org/www/Copyright.html for details.
..
...
....
Configured using the command:
./configure '--with-quantum-depth=16' '--enable-shared' '--disable-static' '--with-magick-plus-plus=yes'
So I went ahead and processed images, assuming my GraphicsMagick can handle 16 bits, but I am getting this error. Please help me resolve it:
karthikeyan@karthikeyan:~$ octave
octave:1> i = imread("/home/karthikeyan/Pictures/Wallpapers/f1376677896.jpg");
warning: your version of GraphicsMagick limits images to 8 bits per pixel
*** Error in `/usr/bin/octave-cli': malloc(): memory corruption: 0x0000000002300dd0 ***
panic: Aborted -- stopping myself...
^C^CPress Control-C again to abort.
^Cpanic: attempted clean up apparently failed -- aborting...
Aborted (core dumped)

can't install this file (mercury6_2.for) with gfortran

I tried this:
Alan@Alan ~/mercury
$ gfortran -o mercury6_2.for
gfortran.exe: fatal error: no input files; unwilling to write output files
compilation terminated
and:
Alan@Alan ~/mercury
$ gfortran -o mercury mercury6_2.for
gfortran.exe: error: CreateProcess: No such file or directory
My file exists:
Alan@Alan ~/mercury
$ ls
big.in element.in mercury.inc mercury6_2.for README.txt
close.in element6.for mercury6.man message.in small.in
close6.for files.in mercury6.tar param.in swift.inc
gfortran seems to be running in Cygwin:
Alan@Alan ~/mercury
$ gfortran --version
GNU Fortran (GCC) 4.8.0 20130302 (experimental) [trunk revision 196403]
Copyright (C) 2013 Free Software Foundation, Inc.
GNU Fortran comes with NO WARRANTY, to the extent permitted by law.
You may redistribute copies of GNU Fortran
under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING
So I don't know what is wrong.
Is there a way that I could do this differently?

Nsclient: How can i display Windows HDD health on Nagios

I want to monitor the hard drive health of my Windows server, and for this I have installed smartmontools (smartmontools-6.1-2.win32-setup.exe).
My question is: how can I display the commands' output on the Nagios server via NRPE or some other way?
Some info: Nagios-Core-3.5, smartmontools-6.1-2.
Command output on the Windows machine:
c:> smartctl.exe /dev/sda -l selftest
smartctl 6.1 2013-03-16 r3800 [i686-w64-mingw32-xp-sp2] (sf-6.1-2)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     17592         -
# 2  Extended offline    Completed without error       00%     17393         -
# 3  Short offline       Completed without error       00%     17392         -
c:> smartctl.exe /dev/sda -H
smartctl 6.1 2013-03-16 r3800 [i686-w64-mingw32-xp-sp2] (sf-6.1-2)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
C:>smartctl -d ata /dev/sda -i
smartctl 6.1 2013-03-16 r3800 [i686-w64-mingw32-xp-sp2] (sf-6.1-2)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.9
Device Model: ST3802110A
Serial Number: 5LR7M728
Firmware Version: 3.AAJ
User Capacity: 80,026,361,856 bytes [80.0 GB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA/ATAPI-7 (minor revision not indicated)
Local Time is: Fri Jun 07 19:02:13 2013 IST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Any help would be greatly appreciated.
You have two issues:
1. You need to be able to get Nagios to run a check remotely on your Windows server, and
2. You need to be able to get the data into a Nagios-compatible format.
For the first, you can probably install an agent such as NC_Net or NSClient++. This can be queried using either check_nt or check_nrpe. I would recommend using NC_Net.
For the second, you will likely have to write your own script to run the command and output in Nagios plugin format (one line of text, and an exit status of 0/1/2/3 for OK/Warn/Crit/Unknown). This script can be remotely called via check_nrpe.
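As a rough illustration of such a script, here is a minimal sketch that turns the `smartctl -H` output shown in the question into one line of text plus a 0/1/2/3 status. The function names, the default device `/dev/sda`, and the assumption that `smartctl.exe` is on the PATH are all hypothetical choices for this example. Any result other than PASSED is treated as critical, which is a simplification. How the script is registered with the agent depends on the agent and is not shown.

```python
import re
import subprocess

# Nagios plugin exit codes: 0/1/2/3 for OK/Warn/Crit/Unknown.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Matches the summary line printed by "smartctl -H", as in the question's output.
HEALTH_RE = re.compile(r"overall-health self-assessment test result:\s*(\S+)")


def evaluate_health(smartctl_output, device):
    """Map "smartctl -H" text to an (exit_code, one_line_message) pair."""
    match = HEALTH_RE.search(smartctl_output)
    if match is None:
        return UNKNOWN, f"SMART UNKNOWN - no health result found for {device}"
    result = match.group(1)
    if result == "PASSED":
        return OK, f"SMART OK - {device} health test result: {result}"
    # Simplification: anything other than PASSED is reported as critical.
    return CRITICAL, f"SMART CRITICAL - {device} health test result: {result}"


def main(device="/dev/sda", smartctl="smartctl.exe"):
    """Run smartctl, print one status line, and return the exit code."""
    try:
        proc = subprocess.run(
            [smartctl, device, "-H"],
            capture_output=True, text=True, timeout=30,
        )
        code, message = evaluate_health(proc.stdout, device)
    except (OSError, subprocess.TimeoutExpired) as exc:
        code, message = UNKNOWN, f"SMART UNKNOWN - could not run {smartctl}: {exc}"
    print(message)
    return code

# To use this as a plugin, end the script with: raise SystemExit(main())
```

Keeping the parsing in `evaluate_health` separate from the subprocess call means the mapping can be checked against saved smartctl output without access to the Windows machine.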
However, if your goal is simply to monitor disk space, you can do that using the standard check functions built into NC_Net or NSClient++.
You may find pre-written scripts at monitoringexchange.org, such as this.
