I have a test program here:
program test
implicit none
integer(4) :: indp
integer(4) :: t1(80)
indp = -3
t1(indp) = 1
write(*,*) t1(indp)
end program test
Line 8 is wrong because indp is a negative number, but when I compile the program with 'ifort' or 'gfortran', neither of them reports this error.
Running the program under valgrind does not find the error either.
Do you have any idea how to find this kind of problem?
Fortran compilers aren't required to warn you about things like this. In general, t1(-3) = 1 could be a perfectly reasonable statement if the lower bound of your Fortran array is -3 or less, e.g.
integer(kind=4), dimension(-5:74) :: t1
would certainly allow setting and reading t1(-3).
If you want these sorts of errors checked at runtime, compile with -fcheck=bounds in gfortran (older releases spell it -fbounds-check):
$ gfortran -o foo foo.f90 -fcheck=bounds
$ ./foo
At line 8 of file foo.f90
Fortran runtime error: Array reference out of bounds for array 't1', lower bound of dimension 1 exceeded (-3 < 1)
or -check bounds in ifort:
$ ifort -o foo foo.f90 -check bounds
$ ./foo
forrtl: severe (408): fort: (3): Subscript #1 of the array T1 has value -3 which is less than the lower bound of 1
Image PC Routine Line Source
foo 000000000046A8DA Unknown Unknown Unknown
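The reason valgrind doesn't catch this is a little subtle: Memcheck only tracks the bounds of heap blocks and does not bounds-check statically allocated or stack arrays like t1, so an access a few elements outside the array most likely still hits memory it considers addressable. It would catch the error if the array were allocated: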
program test
implicit none
integer(kind=4) :: indp
integer(kind=4), allocatable :: t1(:)
indp = -3
allocate(t1(80))
t1(indp) = 1
write(*,*) t1(indp)
deallocate(t1)
end program test
$ gfortran -o foo foo.f90 -g
$ valgrind ./foo
==18904== Memcheck, a memory error detector
==18904== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==18904== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==18904== Command: ./foo
==18904==
==18904== Invalid write of size 4
==18904== at 0x400931: MAIN__ (foo.f90:9)
==18904== by 0x400A52: main (foo.f90:13)
==18904== Address 0x5bb3420 is 16 bytes before a block of size 320 alloc'd
==18904== at 0x4C264B2: malloc (vg_replace_malloc.c:236)
==18904== by 0x400904: MAIN__ (foo.f90:8)
==18904== by 0x400A52: main (foo.f90:13)
==18904==
==18904== Invalid read of size 4
==18904== at 0x4F07368: extract_int (write.c:450)
==18904== by 0x4F08171: write_integer (write.c:1260)
==18904== by 0x4F0BBAE: _gfortrani_list_formatted_write (write.c:1553)
==18904== by 0x40099F: MAIN__ (foo.f90:10)
==18904== by 0x400A52: main (foo.f90:13)
==18904== Address 0x5bb3420 is 16 bytes before a block of size 320 alloc'd
==18904== at 0x4C264B2: malloc (vg_replace_malloc.c:236)
==18904== by 0x400904: MAIN__ (foo.f90:8)
==18904== by 0x400A52: main (foo.f90:13)
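The figures in the Memcheck report can be checked by hand. The short Python sketch below is an illustration only: the constant and function names are made up for this example, and it assumes 4-byte integers and the default lower bound of 1. It reproduces the "16 bytes before a block of size 320" arithmetic for t1(-3).

```python
# Sketch: reproduce the offsets Memcheck reports for t1(-3) in a 1-based
# array of 80 four-byte integers. All names here are illustrative.

ELEMENT_SIZE = 4   # assumed bytes per integer(kind=4)
LOWER_BOUND = 1    # default Fortran lower bound
N_ELEMENTS = 80    # allocate(t1(80))


def byte_offset(index, lower_bound=LOWER_BOUND, element_size=ELEMENT_SIZE):
    """Byte offset of element `index` relative to the start of the array."""
    return (index - lower_bound) * element_size


block_size = N_ELEMENTS * ELEMENT_SIZE
offset = byte_offset(-3)

print(f"block size: {block_size} bytes")
print(f"offset of t1(-3): {offset} bytes")
```

A negative offset means the element lies before the start of the allocated block, which is what the "bytes before a block" wording describes.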
There is no error. You declared indp as an integer of a certain range and precision (of a certain KIND; look that term up in the help), which can be either positive or negative.
After that you assigned the value 1 to t1(indp) and wrote it out.
How can I test the memory footprints of programs written for a RISC and a CISC processor?
Which one would require more memory and why?
So, the way I would do this is via experimentation. I would compile binaries for both types of architectures and then use the GNU binutils tools to see what the memory footprints are. For the following examples, I will compare the x86_64 and RISC-V architectures. The first method I would use is the size tool, which breaks down the various portions of an ELF file and reports their sizes.
# riscv64-unknown-elf-size Test.elf
Which will output something like this
text data bss dec hex filename
XXXXXX XXX XXXXXXX XXXXXXX XXXXXX Test.elf
Then compare that to the x86 version:
# size Test.exe
Which will output something like this
text data bss dec hex filename
XXXXXX XXX XXXXXXX XXXXXXX XXXXXX Test.exe
The other method is to convert your ELF file to a straight binary that will be bit for bit what is put into your memory (this may not be true for more complex memory architectures, but we'll assume a simple case where it is all stored in and executed from RAM). The tool for that is objcopy.
# riscv64-unknown-elf-objcopy -O binary Test.elf Test.elf.bin
# objcopy -O binary Test.exe Test.exe.bin
Then check the sizes of the two resulting bin files.
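As a rough sketch of how the comparison could be automated, the Python below parses the Berkeley-format table that size prints and derives two commonly used figures: text + data as the stored image size, and data + bss as the static RAM use. The numbers in the sample strings are invented placeholders, not measurements, and the helper names are hypothetical.

```python
def parse_size_output(text):
    """Parse Berkeley-format `size` output into {filename: {section: bytes}}."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    result = {}
    for row in rows:
        fields = dict(zip(header, row))
        result[fields["filename"]] = {
            key: int(fields[key]) for key in ("text", "data", "bss")
        }
    return result


def footprint(sections):
    """Derive stored-image and static-RAM sizes from the section sizes."""
    return {
        "image": sections["text"] + sections["data"],
        "static_ram": sections["data"] + sections["bss"],
    }


# Placeholder values only; substitute the real output of the size commands.
RISCV_SAMPLE = """\
   text    data     bss     dec     hex filename
   1000     100     200    1300     514 Test.elf
"""
X86_SAMPLE = """\
   text    data     bss     dec     hex filename
   1500     100     200    1800     708 Test.exe
"""

riscv = footprint(parse_size_output(RISCV_SAMPLE)["Test.elf"])
x86 = footprint(parse_size_output(X86_SAMPLE)["Test.exe"])
print("RISC-V:", riscv)
print("x86_64:", x86)
```

This only covers the static footprint. Heap and stack use at run time would need a separate measurement.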
I was running some tests using OpenMP and Fortran and realized that a binary compiled with ifort 15 (15.0.0 20140723) has 690MB of virtual memory overhead.
My sample program is:
program sharedmemtest
use omp_lib
implicit none
integer :: nroot1
integer, parameter :: dp = selected_real_kind(14,200)
real(dp),allocatable :: matrix_elementsy(:,:,:,:)
!$OMP PARALLEL NUM_THREADS(10) SHARED(matrix_elementsy)
nroot1=2
if (OMP_GET_THREAD_NUM() == 0) then
allocate(matrix_elementsy(nroot1,nroot1,nroot1,nroot1))
print *, "after allocation"
read(*,*)
end if
!$OMP BARRIER
!$OMP END PARALLEL
end program
running
ifort -openmp test_openmp_minimal.f90 && ./a.out
shows a memory usage of
50694 user 20 0 694m 8516 1340 S 0.0 0.0 0:03.58 a.out
in top. Running
gfortran -fopenmp test_openmp_minimal.f90 && ./a.out
shows a memory usage of
50802 user 20 0 36616 956 740 S 0.0 0.0 0:00.98 a.out
Where is the 690MB of overhead coming from when compiling with ifort? Am I doing something wrong? Or is this a bug in ifort?
For completeness: This is a minimal example taken from a much larger program. I am using gfortran 4.4 (4.4.7 20120313).
I appreciate all comments and ideas.
I don't believe top is reliable here. I do not see any evidence that the binary created from your test allocates anywhere near that much memory.
Below I have shown the result of generating the binary normally, with the Intel libraries linked statically and with everything linked statically. The static binary is in the ballpark of 2-3 megabytes.
It is possible that OpenMP thread stacks, which I believe are allocated from the heap, could be the source of the additional virtual memory here. Can you try this test with OMP_STACKSIZE=4K? I think the default is a few megabytes.
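One way to separate reserved address space from memory actually in use, without relying on top, is to read the VmSize and VmRSS fields from /proc/<pid>/status on Linux. The Python sketch below is only an illustration: the function name is made up, and the sample text is a hypothetical excerpt whose figures were chosen to mirror the top line in the question (694m virtual, 8516 KiB resident).

```python
def parse_status_kib(status_text, fields=("VmSize", "VmRSS")):
    """Return {field: KiB} for the requested fields of /proc/<pid>/status."""
    values = {}
    for line in status_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in fields:
            # The value is followed by the unit, e.g. "710656 kB".
            values[key] = int(rest.split()[0])
    return values


# Hypothetical excerpt, not captured from a real run.
SAMPLE_STATUS = """Name:\ta.out
VmSize:\t  710656 kB
VmRSS:\t    8516 kB
Threads:\t10
"""

usage = parse_status_kib(SAMPLE_STATUS)
print(usage)
```

For a live process the text would come from reading /proc/<pid>/status while the program waits at its read statement. A large VmSize next to a small VmRSS suggests address space that is reserved but not touched.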
Dynamic Executable
jhammond#cori11:/tmp> ifort -O3 -qopenmp smt.f90 -o smt
jhammond#cori11:/tmp> size smt
text data bss dec hex filename
748065 13984 296024 1058073 102519 smt
jhammond#cori11:/tmp> ldd smt
linux-vdso.so.1 => (0x00002aaaaaaab000)
libm.so.6 => /lib64/libm.so.6 (0x00002aaaaab0c000)
libiomp5.so => /opt/intel/parallel_studio_xe_2016.0.047/compilers_and_libraries_2016.0.109/linux/compiler/lib/intel64/libiomp5.so (0x00002aaaaad86000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00002aaaab0c7000)
libc.so.6 => /lib64/libc.so.6 (0x00002aaaab2e4000)
libgcc_s.so.1 => /opt/gcc/5.1.0/snos/lib64/libgcc_s.so.1 (0x00002aaaab661000)
libdl.so.2 => /lib64/libdl.so.2 (0x00002aaaab878000)
/lib64/ld-linux-x86-64.so.2 (0x0000555555554000)
Dynamic Executable with Static Intel
jhammond#cori11:/tmp> ifort -O3 -qopenmp smt.f90 -static-intel -o smt
jhammond#cori11:/tmp> size smt
text data bss dec hex filename
1608953 41420 457016 2107389 2027fd smt
jhammond#cori11:/tmp> ls -l smt
-rwxr-x--- 1 jhammond jhammond 1872489 Jan 12 05:51 smt
Static Executable
jhammond#cori11:/tmp> ifort -O3 -qopenmp smt.f90 -static -o smt
jhammond#cori11:/tmp> size smt
text data bss dec hex filename
2262019 43120 487320 2792459 2a9c0b smt
jhammond#cori11:/tmp> ldd smt
not a dynamic executable
Here is my assembly code
section .data
msg: db "hello"
section .text
global _start
_start:
nop
mov rax,23
nop
Can I access the data located at 'msg' with gdb?
The command x/5cb &msg should dump five bytes at the correct address, in both decimal and character notation.
Alternatively, you should be able to use printf "%5.5s\n", &msg as well, substituting in whatever format string you need for other data (a null terminated string, for example, would need only "%s").
This was all tested under CygWin with the following program:
section .data
msg: db "hello"
section .text
global _start
_start: mov eax, 42
ret
When you compile and run that, you get the expected 42 as a return code:
pax> nasm -f elf -o prog.o prog.asm
pax> ld -o prog.exe prog.o
pax> ./prog.exe ; echo $?
42
Starting it in the debugger, you can see the commands needed to get at msg:
pax> gdb prog.exe
GNU gdb (GDB) 7.8
Copyright (C) 2014 Free Software Foundation, Inc.
<blah blah blah>
Reading symbols from prog.exe...(no debugging symbols found)...done.
(gdb) b start
Breakpoint 1 at 0x401000
(gdb) r
Starting program: /cygdrive/c/pax/prog.exe
[New Thread 7416.0x20c0]
Breakpoint 1, 0x00401000 in start ()
(gdb) x/5cb &msg
0x402000 <msg>: 104 'h' 101 'e' 108 'l' 108 'l' 111 'o'
(gdb) printf "%5.5s\n", &msg
hello
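If the x output needs to be post-processed outside gdb, the decimal byte values can be turned back into text. This is a minimal Python sketch, assuming the output line has exactly the shape shown above; the regular expression is tailored to that one line and the variable names are illustrative.

```python
import re

# The line printed by `x/5cb &msg` in the session above.
gdb_line = "0x402000 <msg>: 104 'h' 101 'e' 108 'l' 108 'l' 111 'o'"

# Each byte appears as a decimal value followed by its quoted character.
codes = [int(value) for value in re.findall(r"(\d+) '", gdb_line)]
decoded = bytes(codes).decode("ascii")

print(codes)
print(decoded)
```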
I have a couple processes running a tool I've written that are joined by pipes, and I would like to measure their collected memory usage with valgrind. So far, I have tried something like:
$ valgrind --tool=massif --trace-children=yes --peak-inaccuracy=0.5 --pages-as-heap=yes --massif-out-file=myProcesses".%p" myProcesses.script
Where myProcesses.script runs the equivalent of my tool foo twice, e.g.:
foo | foo > /dev/null
Valgrind doesn't seem to capture the collected memory usage of this the way I expect. If I use top to track this, I get (for sake of argument) 10% memory usage on the first foo, and then another 10% collects on the second foo before the myProcesses.script completes. This is the sort of thing I want to measure: the usage of both processes. Valgrind instead returns the following error:
Massif: ms_main.c:1891 (ms_new_mem_brk): Assertion 'VG_IS_PAGE_ALIGNED(len)' failed.
Is there a way to collect memory usage data for commands I'm using in a piped fashion (using valgrind)? Or a similar tool that I can use to accurately automate these measurements?
The numbers that top returns while polling seem hand-wavy, to me, and I am seeking accurate and repeatable measurements. If you have suggestions for alternative tools, I would welcome those, as well.
EDIT - Fixed typo with valgrind option.
EDIT 2 - For some reason, it appears that the option --pages-as-heap is giving us troubles with the binaries we're testing. Your examples run fine. A new page is created every time we enter a non-inlined function (stack overflows - heh). We wanted to count those, but they're relatively minor in the scale of memory usage we're testing. (Perhaps there aren't function calls in ls or less?) Removing --pages-as-heap helped get testing working again. Thanks to MrGomez for the great help.
With the correct valgrind version given in the errata, this seems to just work for me in Valgrind 3.6.1. My invocation:
<me>#harley:/tmp/test$ /usr/local/bin/valgrind --tool=massif \
--trace-children=yes --peak-inaccuracy=0.5 --pages-as-heap=yes \
--massif-out-file=myProcesses".%p" ./testscript.sh
==21067== Massif, a heap profiler
==21067== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
==21067== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==21067== Command: ./testscript.sh
==21067==
==21068== Massif, a heap profiler
==21068== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
==21068== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==21068== Command: /bin/ls
==21068==
==21070== Massif, a heap profiler
==21070== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
==21070== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==21069== Massif, a heap profiler
==21069== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
==21069== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==21069== Command: /bin/sleep 5
==21069==
==21070== Command: /usr/bin/less
==21070==
==21068==
(END) ==21069==
==21070==
==21067==
The contents of my test script, testscript.sh:
ls | sleep 5 | less
Sparse contents from one of the files generated by --massif-out-file=myProcesses".%p" (myProcesses.21055):
desc: --peak-inaccuracy=0.5 --pages-as-heap=yes --massif-out-file=myProcesses.%p
cmd: ./testscript.sh
time_unit: i
#-----------
snapshot=0
#-----------
time=0
mem_heap_B=110592
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=1
#-----------
time=0
mem_heap_B=118784
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
...
#-----------
snapshot=18
#-----------
time=108269
mem_heap_B=1708032
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=peak
n2: 1708032 (page allocation syscalls) mmap/mremap/brk, --alloc-fns, etc.
n3: 1474560 0x4015E42: mmap (mmap.S:62)
n1: 1425408 0x4005CAC: _dl_map_object_from_fd (dl-load.c:1209)
n2: 1425408 0x4007109: _dl_map_object (dl-load.c:2250)
n1: 1413120 0x400CEEA: openaux (dl-deps.c:65)
n1: 1413120 0x400D834: _dl_catch_error (dl-error.c:178)
n1: 1413120 0x400C1E0: _dl_map_object_deps (dl-deps.c:247)
n1: 1413120 0x4002B59: dl_main (rtld.c:1780)
n1: 1413120 0x40140C5: _dl_sysdep_start (dl-sysdep.c:243)
n1: 1413120 0x4000C6B: _dl_start (rtld.c:333)
n0: 1413120 0x4000855: ??? (in /lib/ld-2.11.1.so)
n0: 12288 in 1 place, below massif's threshold (01.00%)
n0: 28672 in 3 places, all below massif's threshold (01.00%)
n1: 20480 0x4005E0C: _dl_map_object_from_fd (dl-load.c:1260)
n1: 20480 0x4007109: _dl_map_object (dl-load.c:2250)
n0: 20480 in 2 places, all below massif's threshold (01.00%)
n0: 233472 0xFFFFFFFF: ???
#-----------
snapshot=19
#-----------
time=108269
mem_heap_B=1703936
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=20
#-----------
time=200236
mem_heap_B=1839104
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
Massif continues to complain about heap allocations in the remainder of my files. Note this is very similar to your error.
I theorize that your version of valgrind was built in debug mode, causing the asserts to fire. A rebuild from source (I used this with the defaults hanging off ./configure) will fix the issue.
Either way, this seems to be expected with Massif.
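Since --trace-children=yes writes one output file per process, the per-process files still have to be combined to get a figure for the whole pipeline. The Python sketch below shows one possible way, under stated assumptions: the function names are hypothetical, the parser only looks at the snapshot and mem_* lines shown above, and the sample text reuses two snapshots from the excerpt.

```python
FIELDS = ("mem_heap_B", "mem_heap_extra_B", "mem_stacks_B")


def snapshot_totals(massif_text):
    """Return the total bytes (heap + extra + stacks) of each snapshot."""
    totals = []
    for raw in massif_text.splitlines():
        line = raw.strip()
        if line.startswith("snapshot="):
            totals.append(0)          # start a new snapshot
        elif totals and "=" in line:
            key, _, value = line.partition("=")
            if key in FIELDS:
                totals[-1] += int(value)
    return totals


def peak_bytes(massif_text):
    """Largest snapshot total found in one massif output file."""
    return max(snapshot_totals(massif_text), default=0)


def combined_peak(massif_texts):
    """Sum of per-process peaks: an upper bound on the combined peak."""
    return sum(peak_bytes(text) for text in massif_texts)


SAMPLE = """\
snapshot=0
time=0
mem_heap_B=110592
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
snapshot=20
time=200236
mem_heap_B=1839104
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
"""

print(peak_bytes(SAMPLE))
print(combined_peak([SAMPLE, SAMPLE]))
```

Adding the peaks overestimates the true combined peak whenever the processes do not peak at the same moment. With time_unit: i each file counts its own instructions, so the snapshots of different processes cannot be lined up in time directly.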
Some programs allow you to preload the libmemusage.so library and get a report of the memory allocations that were recorded:
$ LD_PRELOAD=libmemusage.so less /etc/passwd
Memory usage summary: heap total: 36212, heap peak: 35011, stack peak: 15008
total calls total memory failed calls
malloc| 39 5985 0
realloc| 3 64 0 (nomove:2, dec:0, free:0)
calloc| 238 30163 0
free| 51 11546
Histogram for block sizes:
0-15 128 45% ==================================================
16-31 13 4% =====
32-47 105 37% =========================================
48-63 2 <1%
64-79 4 1% =
80-95 5 1% =
96-111 3 1% =
112-127 3 1% =
160-175 1 <1%
192-207 1 <1%
208-223 2 <1%
256-271 1 <1%
432-447 1 <1%
560-575 1 <1%
656-671 1 <1%
768-783 1 <1%
944-959 1 <1%
1024-1039 2 <1%
1328-1343 1 <1%
2128-2143 1 <1%
3312-3327 1 <1%
7952-7967 1 <1%
8240-8255 1 <1%
Though I must admit that it doesn't always work -- LD_PRELOAD=libmemusage.so ls never reports anything, for example -- and I wish I knew the conditions that allow it to work or not work.
I failed to analyze a crash dump file using WinDbg.
Any help would be greatly appreciated.
Here are my WinDbg settings:
Symbol Path: C:\symbols;srv*c:\mss*http://msdl.microsoft.com/download/symbols
(C:\symbols contains my own exe and dll symbols, map,pdb etc etc)
Image Path: C:\symbols
Source Path: W:\
Loading the crash dump (second chance) shows:
WARNING: Unable to verify checksum for nbsm.dll
GetPageUrlData failed, server returned HTTP status 404
URL requested: http://watson.microsoft.com/StageOne/nbsm_sm_exe/8_0_0_0/4e5649f3/KERNELBASE_dll/6_1_7600_16385/4a5bdbdf/e06d7363/0000b727.htm?Retriage=1
FAULTING_IP:
+3a22faf00cadf58 00000000 ?? ???
EXCEPTION_RECORD: fffffffffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 000000007507b727 (KERNELBASE!RaiseException+0x0000000000000058)
ExceptionCode: e06d7363 (C++ EH exception)
ExceptionFlags: 00000009
NumberParameters: 3
Parameter[0]: 0000000019930520
Parameter[1]: 0000000001aafb10
Parameter[2]: 000000000040c958
DEFAULT_BUCKET_ID: STACKIMMUNE
PROCESS_NAME: nbsm_sm.exe
ERROR_CODE: (NTSTATUS) 0xe06d7363 -
EXCEPTION_CODE: (NTSTATUS) 0xe06d7363 -
EXCEPTION_PARAMETER1: 0000000019930520
EXCEPTION_PARAMETER2: 0000000001aafb10
EXCEPTION_PARAMETER3: 000000000040c958
MOD_LIST:
NTGLOBALFLAG: 0
APPLICATION_VERIFIER_FLAGS: 0
ADDITIONAL_DEBUG_TEXT: Followup set based on attribute [Is_ChosenCrashFollowupThread] from Frame:[0] on thread:[PSEUDO_THREAD]
LAST_CONTROL_TRANSFER: from 000000007324dbf9 to 000000007507b727
FAULTING_THREAD: ffffffffffffffff
PRIMARY_PROBLEM_CLASS: STACKIMMUNE
BUGCHECK_STR: APPLICATION_FAULT_STACKIMMUNE_ZEROED_STACK
STACK_TEXT: 0000000000000000 0000000000000000 nbsm_sm.exe+0x0
STACK_COMMAND: .cxr 01AAF6E8 ; kb ; ** Pseudo Context ** ; kb
SYMBOL_NAME: nbsm_sm.exe
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: nbsm_sm
IMAGE_NAME: nbsm_sm.exe
DEBUG_FLR_IMAGE_TIMESTAMP: 4e5649f3
FAILURE_BUCKET_ID: STACKIMMUNE_e06d7363_nbsm_sm.exe!Unknown
BUCKET_ID: X64_APPLICATION_FAULT_STACKIMMUNE_ZEROED_STACK_nbsm_sm.exe
FOLLOWUP_IP: nbsm_sm!__ImageBase+0
00400000 4d dec ebp
WATSON_STAGEONE_URL: http://watson.microsoft.com/StageOne/nbsm_sm_exe/8_0_0_0/4e5649f3/KERNELBASE_dll/6_1_7600_16385/4a5bdbdf/e06d7363/0000b727.htm?Retriage=1
========================
Any ideas?
Thanks in advance!
Sandeep
If this crash dump has come from a user and it is either reproducible on their system or happens relatively often then you could ask them to download procdump and run a command such as this:
procdump -e 1 -w nbsm_sm.exe c:\dumpfiles
This will create a dumpfile on the first chance exception which may give you more useful information than you have at the moment. Sometimes the dump from a second chance exception is just produced too late to be useful.
You can try to run 'kb' in WinDbg to see the actual stack trace. If you don't see any valuable information, assuming you are developing a native/managed C++ application, you can turn on stack checks (/GS on the cl command line) and re-run the program.