Help needed: Analyze the dump file in WinDbg - crash-dumps

I failed to analyze the dump file using WinDbg.
Any help would be greatly appreciated.
Here are my WinDbg settings:
Symbol Path: C:\symbols;srv*c:\mss*http://msdl.microsoft.com/download/symbols
(C:\symbols contains my own EXE and DLL symbols: .map, .pdb, etc.)
Image Path: C:\symbols
Source Path: W:\
Loading the crash dump (second chance) shows:
WARNING: Unable to verify checksum for nbsm.dll
GetPageUrlData failed, server returned HTTP status 404
URL requested: http://watson.microsoft.com/StageOne/nbsm_sm_exe/8_0_0_0/4e5649f3/KERNELBASE_dll/6_1_7600_16385/4a5bdbdf/e06d7363/0000b727.htm?Retriage=1
FAULTING_IP:
+3a22faf00cadf58 00000000 ?? ???
EXCEPTION_RECORD: fffffffffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 000000007507b727 (KERNELBASE!RaiseException+0x0000000000000058)
ExceptionCode: e06d7363 (C++ EH exception)
ExceptionFlags: 00000009
NumberParameters: 3
Parameter[0]: 0000000019930520
Parameter[1]: 0000000001aafb10
Parameter[2]: 000000000040c958
DEFAULT_BUCKET_ID: STACKIMMUNE
PROCESS_NAME: nbsm_sm.exe
ERROR_CODE: (NTSTATUS) 0xe06d7363 -
EXCEPTION_CODE: (NTSTATUS) 0xe06d7363 -
EXCEPTION_PARAMETER1: 0000000019930520
EXCEPTION_PARAMETER2: 0000000001aafb10
EXCEPTION_PARAMETER3: 000000000040c958
MOD_LIST:
NTGLOBALFLAG: 0
APPLICATION_VERIFIER_FLAGS: 0
ADDITIONAL_DEBUG_TEXT: Followup set based on attribute [Is_ChosenCrashFollowupThread] from Frame:[0] on thread:[PSEUDO_THREAD]
LAST_CONTROL_TRANSFER: from 000000007324dbf9 to 000000007507b727
FAULTING_THREAD: ffffffffffffffff
PRIMARY_PROBLEM_CLASS: STACKIMMUNE
BUGCHECK_STR: APPLICATION_FAULT_STACKIMMUNE_ZEROED_STACK
STACK_TEXT: 0000000000000000 0000000000000000 nbsm_sm.exe+0x0
STACK_COMMAND: .cxr 01AAF6E8 ; kb ; ** Pseudo Context ** ; kb
SYMBOL_NAME: nbsm_sm.exe
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: nbsm_sm
IMAGE_NAME: nbsm_sm.exe
DEBUG_FLR_IMAGE_TIMESTAMP: 4e5649f3
FAILURE_BUCKET_ID: STACKIMMUNE_e06d7363_nbsm_sm.exe!Unknown
BUCKET_ID: X64_APPLICATION_FAULT_STACKIMMUNE_ZEROED_STACK_nbsm_sm.exe
FOLLOWUP_IP: nbsm_sm!__ImageBase+0
00400000 4d dec ebp
WATSON_STAGEONE_URL: http://watson.microsoft.com/StageOne/nbsm_sm_exe/8_0_0_0/4e5649f3/KERNELBASE_dll/6_1_7600_16385/4a5bdbdf/e06d7363/0000b727.htm?Retriage=1
========================
Any ideas?
Thanks in advance!
Sandeep

If this crash dump has come from a user and it is either reproducible on their system or happens relatively often then you could ask them to download procdump and run a command such as this:
procdump -e 1 -w nbsm_sm.exe c:\dumpfiles
This will create a dumpfile on the first chance exception which may give you more useful information than you have at the moment. Sometimes the dump from a second chance exception is just produced too late to be useful.

You can try to run 'kb' in WinDbg to see the actual stack trace. If you don't see any valuable information, and assuming you are developing a native/managed C++ application, you can turn on buffer security checks (/GS on the cl command line) and re-run the program.
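As background on the ExceptionCode in the dump: e06d7363 is the code the Microsoft C++ runtime passes to RaiseException when a C++ exception is thrown, and its low three bytes are the ASCII letters "msc". A minimal Python sketch that decodes it (the helper name is made up for illustration):

```python
def decode_msvc_exception_code(code: int) -> str:
    """Return the ASCII text stored in the low three bytes of an exception code."""
    low_bytes = (code & 0x00FFFFFF).to_bytes(3, "big")
    return low_bytes.decode("ascii")


print(decode_msvc_exception_code(0xE06D7363))  # prints "msc"
```

So the dump records an unhandled C++ throw rather than, say, an access violation, which is why a first-chance dump taken at the throw site tends to be more informative.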

Related

uboot fails to execute load cmd from uboot.env

I am working with
U-boot v2021.10
BeagleBone Black rev C
I've created a uboot.env image with the mkenvimage tool from this file:
loadfromsd=load mmc 0:1 0x82000000 /zImage; load mmc 0:1 0x88000000 /am335x-boneblack.dtb
set_bootargs=setenv bootargs console=ttyS0,115200n8 root=/dev/mmcblk0p2 rw rootfstype=ext4 rootwait
uenvcmd=setenv autoload no; run set_bootargs; run loadfromsd; printenv bootargs; bootz 0x82000000 - 0x88000000
The problem is loading the files into memory with the load commands in the first line.
The full output from startup is:
U-Boot SPL 2021.10 (Oct 14 2021 - 20:41:20 -0700)
Trying to boot from MMC1
U-Boot 2021.10 (Oct 14 2021 - 20:41:20 -0700)
CPU : AM335X-GP rev 2.1
Model: TI AM335x BeagleBone Black
DRAM: 512 MiB
ti_sysc target-module@9000: failed to get fck clock
WDT: Started with servicing (60s timeout)
NAND: nand_base: timeout while waiting for chip to become ready
nand_base: No NAND device found
0 MiB
MMC: ti_sysc target-module@7000: failed to get fck clock
OMAP SD/MMC: 0, OMAP SD/MMC: 1
Loading Environment from FAT... OK
<ethaddr> not set. Validating first E-fuse MAC
Net: eth2: ethernet@4a100000, eth3: usb_ether
=> run uenvcmd
4295456 bytes read in 282 ms (14.5 MiB/s)
'ailed to load '/am335x-boneblack.dtb
bootargs=console=ttyS0,115200n8 root=/dev/mmcblk0p2 rw rootfstype=ext4 rootwait
Kernel image @ 0x82000000 [ 0x000000 - 0x418b20 ]
ERROR: Did not find a cmdline Flattened Device Tree
Could not find a valid device tree
The actual error is:
=> run uenvcmd
4295456 bytes read in 282 ms (14.5 MiB/s)
'ailed to load '/am335x-boneblack.dtb
P.S. My U-Boot fails to handle ${} substitutions properly, and using
console=ttyS0,115200n8
bootpartition=mmcblk0p2
set_bootargs=setenv bootargs console=${console} root=/dev/${bootpartition} rw rootfstype=ext4 rootwait
caused an error:
syntax error:
rootfstype=ext4 rootwait0n8
This 0n8 was appended after rootwait and shouldn't be there, so I've written this "straight" file without variables.
Thanks to sawdust for pointing out that a carriage return character matters: it overwrites the first letter of the error message. That gave me the idea that it also affects the file path in the load command, and it does.
If I use a space followed by \r, NOT just \r, everything works fine.
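Stray \r characters like these usually come from a text file saved with Windows (CRLF) line endings. A minimal Python sketch, assuming the environment source is a plain text file that is cleaned before being passed to mkenvimage; both function names are hypothetical. The second helper only emulates how a terminal renders a carriage return, to show why the first letter of the message gets overwritten:

```python
def strip_carriage_returns(text: str) -> str:
    """Remove every carriage return so each line ends with a bare newline."""
    return text.replace("\r", "")


def visible_on_terminal(message: str) -> str:
    """Emulate a terminal: a carriage return moves the cursor to column 0,
    and the characters that follow overwrite what was already printed."""
    line = []
    col = 0
    for ch in message:
        if ch == "\r":
            col = 0
            continue
        if col < len(line):
            line[col] = ch
        else:
            line.append(ch)
        col += 1
    return "".join(line)


# The file name picked up a trailing carriage return from the env file.
garbled = visible_on_terminal("Failed to load '/am335x-boneblack.dtb\r'")
print(garbled)  # prints: 'ailed to load '/am335x-boneblack.dtb

# Cleaning the source text before running mkenvimage avoids the problem.
print(repr(strip_carriage_returns("a=1\r\nb=2\r\n")))  # prints: 'a=1\nb=2\n'
```

With the carriage returns removed, the load command sees /am335x-boneblack.dtb without a trailing \r, so the space workaround should no longer be needed.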

Is it possible to boot FreeRTOS over the network?

Some words about my system.
I'm working on the Xilinx ZC706 development board.
The basic FreeRTOS example is running.
Now the question is: how can I boot the application over the network?
A FreeRTOS application is a bare-metal approach.
Typically a loader like U-Boot is used, but the examples I found were only for the Linux use case.
Addition:
With the XMD console it's possible to load U-Boot into memory:
XMD% source ps7_init.tcl
XMD% ps7_init
XMD% dow u-boot
Processor started. Type "stop" to stop processor
Processor Stop Condition Unknown
Processor Reset .... DONE
Downloading Program -- u-boot
section, .text: 0x04000000-0x040524d7
section, efi_runtime_text: 0x040524d8-0x040524fb
section, .rodata: 0x04052500-0x040650d1
section, .hash: 0x040650d4-0x040650ff
section, .dtb.init.rodata: 0x04065100-0x0406866f
section, .data: 0x04068670-0x0406b31b
section, .got.plt: 0x0406b31c-0x0406b327
section, efi_runtime_data: 0x0406b328-0x0406b3ff
section, .u_boot_list: 0x0406b400-0x0406c71f
section, .rel.dyn: 0x0406c720-0x04077d5f
section, .bss: 0x0406c720-0x040ad29f
Download Progress..10.20.30.40.50.60.70.80.90.Done
Setting PC with Program Start Address 0x04000000
XMD% run
RUNNING> 0
XMD%
The result is seen on a COM port:
U-Boot 2017.01-00012-g374a838 (May 29 2017 - 17:55:04 +0200)
Model: Zynq ZC706 Development Board
Board: Xilinx Zynq
I2C: ready
DRAM: ECC disabled 1 GiB
MMC: sdhci@e0100000: 0 (SD)
SF: Detected s25fl128s_64k with page size 512 Bytes, erase size 128 KiB, total 32 MiB
*** Warning - bad CRC, using default environment
In: serial@e0001000
Out: serial@e0001000
Err: serial@e0001000
Model: Zynq ZC706 Development Board
Board: Xilinx Zynq
Net: ZYNQ GEM: e000b000, phyaddr 7, interface rgmii-id
eth0: ethernet@e000b000
Hit any key to stop autoboot: 0
Device: sdhci@e0100000
Manufacturer ID: 27
OEM: 5048
Name: SD16G
Tran Speed: 50000000
Rd Block Len: 512
SD version 3.0
High Capacity: Yes
Capacity: 14.5 GiB
Bus Width: 4-bit
Erase Group Size: 512 Bytes
reading uEnv.txt
** Unable to read file uEnv.txt **
Copying Linux from SD to RAM...
reading uImage
** Unable to read file uImage **
Zynq>
Addition:
I have built the FSBL with the flag FSBL_DEBUG:
(Project -> Properties -> C/C++ Build -> Settings -> ARM gcc compiler -> Symbols)
Then I built the bin file with only the boot loader partition and put it on the SD card:
Xilinx Tools->Create Boot Image
Addition:
The problem is that the SDK needs a file named u-boot.elf. The extension was missing after the U-Boot build.
So now I have a TFTP server running on my host and U-Boot finds the uEnv.txt file, but the command in this file doesn't run.
How can I set up U-Boot and give it the right load address to load the FreeRTOS ELF file?
The tftpboot command seems to be:
tftpboot [loadAddress] [bootfilename]
e.g.
tftpboot 0x80400000 vlm-boards/14726/uImage
What is the load address of the zc706 board?
Addition:
The connection to the TFTP server and the download seem to work,
but after starting with the "go" command a reset occurs:
Zynq> setenv ipaddr 192.168.150.101
Zynq> setenv netmask 255.255.255.0
Zynq> setenv gatewayip 192.168.150.1
Zynq> serverip=192.168.150.100
Zynq> ping 192.168.150.100
Using ethernet@e000b000 device
host 192.168.150.100 is alive
Zynq> tftpboot 0x8000 FreeRTOS_HelloWorld.elf
Using ethernet@e000b000 device
TFTP from server 192.168.150.100; our IP address is 192.168.150.101
Filename 'FreeRTOS_HelloWorld.elf'.
Load address: 0x8000
Loading: ###############
2.8 MiB/s
done
Bytes transferred = 205675 (3236b hex)
Zynq> go 0x8000
## Starting application at 0x00008000 ...
undefined instruction
pc : [<0000fa60>] lr : [<3ff443c4>]
reloc pc : [<c40cda60>] lr : [<040023c4>]
sp : 3eb20cf4 ip : 0000001c fp : 3ff4437c
r10: 3eb1f9b0 r9 : 3eb21ee8 r8 : 3ffaef30
r7 : 00000000 r6 : 00008000 r5 : 00000002 r4 : 3eb2f9b4
r3 : 00008000 r2 : 3eb2f9b4 r1 : 3eb2f9b4 r0 : 00001084
Flags: nZcv IRQs off FIQs off Mode SVC_32
Resetting CPU ...
resetting ...
Thx in advance
The solution is:
The Xilinx SDK produces an ELF file as output, which U-Boot understands:
tftpboot 0x000000 FreeRTOS_ZC706_HelloWorld.elf
bootelf 0x0
Thanks
tftpboot 0x0 hello.elf; bootelf 0x0;
works with U-Boot version 2019.2 and FreeRTOS.elf.
For the other core, you need to convert the image to binary format using arm-none-eabi-objcopy -O binary hello.elf hello.bin, tftpboot it to the correct memory position, and start it from the CPU0 code.
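A likely reason the earlier "go 0x8000" attempt ended in an undefined instruction is that go jumps straight to the given address, while an ELF file begins with a header rather than code; bootelf parses that header first. A small Python sketch for checking on the host which kind of image a file is before choosing the command. The helper names are hypothetical, and the returned strings are only suggestions, not something U-Boot itself produces:

```python
ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def is_elf_image(data: bytes) -> bool:
    """Return True if the image starts with the ELF magic number."""
    return data[:4] == ELF_MAGIC


def suggest_boot_command(data: bytes, load_addr: int) -> str:
    """Suggest bootelf for ELF images and go for raw binaries."""
    if is_elf_image(data):
        return f"bootelf {load_addr:#x}"
    # A raw binary must be loaded at the address it was linked for.
    return f"go {load_addr:#x}"


print(suggest_boot_command(b"\x7fELF\x01\x01\x01", 0x0))      # prints "bootelf 0x0"
print(suggest_boot_command(b"\x00\x00\xa0\xe1", 0x8000))      # prints "go 0x8000"
```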

Dump shared memory

Is it possible to dump and investigate shared memory content from Linux? I spotted some strange shared memory segments in an "ipcs -m" output and want to see what is in there.
Also, is it possible to determine the creator of such a segment? "nattch" seems to be always zero.
Have a look at the tool Shmcat.
It's a good tool for your purpose.
What do you mean by creator? Do you mean the PID of the process? In that case you can use
ipcs -mp
You'll get this output:
------ Shared Memory Creator/Last-op --------
shmid owner cpid lpid
3211265 root 1857 1866
where
CPID
The process ID of the job that created the shared memory segment.
and
LPID
The process ID of the last job to attach or detach from the shared memory segment or change the semaphore value.
Edit:
I don't think it is possible to log that information with standard tools.
I think we can do it this way.
Suppose we execute the command:
ipcs -m
and get these results
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 3211265 root 644 80 2
Then, with the command grep 3211265 /proc/*/maps, we obtain:
/proc/1862/maps:bla bla bla rw-s 00000000 00:09 3211265 /SYSV00000000 (deleted)
/proc/1863/maps:bla bla bla rw-s 00000000 00:09 3211265 /SYSV00000000 (deleted)
In this way we get the processes that are attached to the segment.
By scanning the entries in /proc/*/maps, you can discover the PIDs that are currently attached to a given segment.
You can use a bash script to log this information.
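The same scan can be sketched in Python instead of bash. This is a minimal sketch, assuming the maps line layout shown above (address, perms, offset, dev, inode, pathname) and that the inode column carries the shmid for /SYSV mappings; the function name and the proc_root parameter are made up for illustration:

```python
import glob
import os
from typing import List


def pids_attached_to_segment(shmid: int, proc_root: str = "/proc") -> List[int]:
    """Return the PIDs whose maps file lists a /SYSV mapping with inode == shmid."""
    pids = []
    for maps_path in glob.glob(os.path.join(proc_root, "[0-9]*", "maps")):
        pid = int(os.path.basename(os.path.dirname(maps_path)))
        try:
            with open(maps_path) as maps_file:
                lines = maps_file.readlines()
        except OSError:
            # The process exited, or we lack permission to read its maps file.
            continue
        for line in lines:
            fields = line.split()
            # Fields: address perms offset dev inode [pathname ...]
            if (len(fields) >= 6 and fields[4] == str(shmid)
                    and fields[5].startswith("/SYSV")):
                pids.append(pid)
                break
    return sorted(pids)


if __name__ == "__main__":
    # 3211265 is the shmid from the ipcs output above.
    print(pids_attached_to_segment(3211265))
```

Reading the maps files of other users' processes generally requires root, so an unprivileged run may report fewer PIDs than are really attached.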

Tracking memory usage of piped commands with valgrind

I have a couple processes running a tool I've written that are joined by pipes, and I would like to measure their collected memory usage with valgrind. So far, I have tried something like:
$ valgrind --tool=massif --trace-children=yes --peak-inaccuracy=0.5 --pages-as-heap=yes --massif-out-file=myProcesses".%p" myProcesses.script
Where myProcesses.script runs the equivalent of my tool foo twice, e.g.:
foo | foo > /dev/null
Valgrind doesn't seem to capture the collected memory usage of this the way I expect. If I use top to track this, I get (for sake of argument) 10% memory usage on the first foo, and then another 10% collects on the second foo before the myProcesses.script completes. This is the sort of thing I want to measure: the usage of both processes. Valgrind instead returns the following error:
Massif: ms_main.c:1891 (ms_new_mem_brk): Assertion 'VG_IS_PAGE_ALIGNED(len)' failed.
Is there a way to collect memory usage data for commands I'm using in a piped fashion (using valgrind)? Or a similar tool that I can use to accurately automate these measurements?
The numbers that top returns while polling seem hand-wavy, to me, and I am seeking accurate and repeatable measurements. If you have suggestions for alternative tools, I would welcome those, as well.
EDIT - Fixed typo with valgrind option.
EDIT 2 - For some reason, it appears that the option --pages-as-heap is giving us troubles with the binaries we're testing. Your examples run fine. A new page is created every time we enter a non-inlined function (stack overflows - heh). We wanted to count those, but they're relatively minor in the scale of memory usage we're testing. (Perhaps there aren't function calls in ls or less?) Removing --pages-as-heap helped get testing working again. Thanks to MrGomez for the great help.
With the correct valgrind version given in the errata, this seems to just work for me in Valgrind 3.6.1. My invocation:
<me>@harley:/tmp/test$ /usr/local/bin/valgrind --tool=massif \
--trace-children=yes --peak-inaccuracy=0.5 --pages-as-heap=yes \
--massif-out-file=myProcesses".%p" ./testscript.sh
==21067== Massif, a heap profiler
==21067== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
==21067== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==21067== Command: ./testscript.sh
==21067==
==21068== Massif, a heap profiler
==21068== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
==21068== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==21068== Command: /bin/ls
==21068==
==21070== Massif, a heap profiler
==21070== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
==21070== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==21069== Massif, a heap profiler
==21069== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
==21069== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==21069== Command: /bin/sleep 5
==21069==
==21070== Command: /usr/bin/less
==21070==
==21068==
(END) ==21069==
==21070==
==21067==
The contents of my test script, testscript.sh:
ls | sleep 5 | less
Sparse contents from one of the files generated by --massif-out-file=myProcesses".%p" (myProcesses.21055):
desc: --peak-inaccuracy=0.5 --pages-as-heap=yes --massif-out-file=myProcesses.%p
cmd: ./testscript.sh
time_unit: i
#-----------
snapshot=0
#-----------
time=0
mem_heap_B=110592
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=1
#-----------
time=0
mem_heap_B=118784
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
...
#-----------
snapshot=18
#-----------
time=108269
mem_heap_B=1708032
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=peak
n2: 1708032 (page allocation syscalls) mmap/mremap/brk, --alloc-fns, etc.
 n3: 1474560 0x4015E42: mmap (mmap.S:62)
  n1: 1425408 0x4005CAC: _dl_map_object_from_fd (dl-load.c:1209)
   n2: 1425408 0x4007109: _dl_map_object (dl-load.c:2250)
    n1: 1413120 0x400CEEA: openaux (dl-deps.c:65)
     n1: 1413120 0x400D834: _dl_catch_error (dl-error.c:178)
      n1: 1413120 0x400C1E0: _dl_map_object_deps (dl-deps.c:247)
       n1: 1413120 0x4002B59: dl_main (rtld.c:1780)
        n1: 1413120 0x40140C5: _dl_sysdep_start (dl-sysdep.c:243)
         n1: 1413120 0x4000C6B: _dl_start (rtld.c:333)
          n0: 1413120 0x4000855: ??? (in /lib/ld-2.11.1.so)
    n0: 12288 in 1 place, below massif's threshold (01.00%)
  n0: 28672 in 3 places, all below massif's threshold (01.00%)
  n1: 20480 0x4005E0C: _dl_map_object_from_fd (dl-load.c:1260)
   n1: 20480 0x4007109: _dl_map_object (dl-load.c:2250)
    n0: 20480 in 2 places, all below massif's threshold (01.00%)
 n0: 233472 0xFFFFFFFF: ???
#-----------
snapshot=19
#-----------
time=108269
mem_heap_B=1703936
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=20
#-----------
time=200236
mem_heap_B=1839104
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
Massif continues to complain about heap allocations in the remainder of my files. Note this is very similar to your error.
I theorize that your version of valgrind was built in debug mode, causing the asserts to fire. A rebuild from source (I built mine with the ./configure defaults) should fix the issue.
Either way, this seems to be expected with Massif.
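Because --massif-out-file=myProcesses.%p writes one file per process, the per-process peaks can be read back from those files and added up. A minimal Python sketch, assuming the key=value snapshot layout shown above; the function name is made up for illustration. Note that summing the peaks gives an upper bound on the combined usage, since the piped processes need not peak at the same moment:

```python
import re


def peak_heap_bytes(massif_text: str) -> int:
    """Return the largest mem_heap_B + mem_heap_extra_B over all snapshots."""
    heap = [int(v) for v in re.findall(r"^mem_heap_B=(\d+)$", massif_text, re.M)]
    extra = [int(v) for v in re.findall(r"^mem_heap_extra_B=(\d+)$", massif_text, re.M)]
    return max((h + e for h, e in zip(heap, extra)), default=0)


# Three snapshots taken from the excerpt above.
sample = """snapshot=0
mem_heap_B=110592
mem_heap_extra_B=0
snapshot=18
mem_heap_B=1708032
mem_heap_extra_B=0
snapshot=20
mem_heap_B=1839104
mem_heap_extra_B=0
"""
print(peak_heap_bytes(sample))  # prints 1839104
```

To cover a whole pipeline, one would call this on the contents of each myProcesses.<pid> file and sum the results.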
Some programs allow you to preload the libmemusage.so library and get a report of the memory allocations that were recorded:
$ LD_PRELOAD=libmemusage.so less /etc/passwd
Memory usage summary: heap total: 36212, heap peak: 35011, stack peak: 15008
total calls total memory failed calls
malloc| 39 5985 0
realloc| 3 64 0 (nomove:2, dec:0, free:0)
calloc| 238 30163 0
free| 51 11546
Histogram for block sizes:
0-15 128 45% ==================================================
16-31 13 4% =====
32-47 105 37% =========================================
48-63 2 <1%
64-79 4 1% =
80-95 5 1% =
96-111 3 1% =
112-127 3 1% =
160-175 1 <1%
192-207 1 <1%
208-223 2 <1%
256-271 1 <1%
432-447 1 <1%
560-575 1 <1%
656-671 1 <1%
768-783 1 <1%
944-959 1 <1%
1024-1039 2 <1%
1328-1343 1 <1%
2128-2143 1 <1%
3312-3327 1 <1%
7952-7967 1 <1%
8240-8255 1 <1%
Though I must admit that it doesn't always work -- LD_PRELOAD=libmemusage.so ls never reports anything, for example -- and I wish I knew the conditions that allow it to work or not work.

Debug error in Fortran for a negative array index

I have a test program here:
program test
implicit none
integer(4) :: indp
integer(4) :: t1(80)
indp = -3
t1(indp) = 1
write(*,*) t1(indp)
end program test
Line 8 is wrong because indp is a negative number, but when I compile it with 'ifort' or 'gfortran', neither of them finds this error.
Even when I use valgrind to debug this program, it also cannot find this error.
Do you have any idea how to find this kind of problem?
Fortran compilers aren't required to give you warnings about things like this; and in general, t1(-3) = 1 could be a perfectly reasonable statement if you set the lower bound of your Fortran array to something equal to or less than -3, e.g.
integer(kind=4), dimension(-5:74) :: t1
would certainly allow setting and reading t1(-3).
If you want to make sure these sorts of errors are checked at runtime, you can compile with -fcheck=bounds (the older, deprecated spelling is -fbounds-check) with gfortran:
$ gfortran -o foo foo.f90 -fcheck=bounds
$ ./foo
At line 8 of file foo.f90
Fortran runtime error: Array reference out of bounds for array 't1', lower bound of dimension 1 exceeded (-3 < 1)
or -check bounds in ifort:
$ ifort -o foo foo.f90 -check bounds
$ ./foo
forrtl: severe (408): fort: (3): Subscript #1 of the array T1 has value -3 which is less than the lower bound of 1
Image PC Routine Line Source
foo 000000000046A8DA Unknown Unknown Unknown
The reason valgrind doesn't catch this is a little subtle, but note that it would if the array were allocated:
program test
implicit none
integer(kind=4) :: indp
integer(kind=4), allocatable :: t1(:)
indp = -3
allocate(t1(80))
t1(indp) = 1
write(*,*) t1(indp)
deallocate(t1)
end program test
$ gfortran -o foo foo.f90 -g
$ valgrind ./foo
==18904== Memcheck, a memory error detector
==18904== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==18904== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==18904== Command: ./foo
==18904==
==18904== Invalid write of size 4
==18904== at 0x400931: MAIN__ (foo.f90:9)
==18904== by 0x400A52: main (foo.f90:13)
==18904== Address 0x5bb3420 is 16 bytes before a block of size 320 alloc'd
==18904== at 0x4C264B2: malloc (vg_replace_malloc.c:236)
==18904== by 0x400904: MAIN__ (foo.f90:8)
==18904== by 0x400A52: main (foo.f90:13)
==18904==
==18904== Invalid read of size 4
==18904== at 0x4F07368: extract_int (write.c:450)
==18904== by 0x4F08171: write_integer (write.c:1260)
==18904== by 0x4F0BBAE: _gfortrani_list_formatted_write (write.c:1553)
==18904== by 0x40099F: MAIN__ (foo.f90:10)
==18904== by 0x400A52: main (foo.f90:13)
==18904== Address 0x5bb3420 is 16 bytes before a block of size 320 alloc'd
==18904== at 0x4C264B2: malloc (vg_replace_malloc.c:236)
==18904== by 0x400904: MAIN__ (foo.f90:8)
==18904== by 0x400A52: main (foo.f90:13)
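The "16 bytes before a block of size 320" in the valgrind report follows directly from the index arithmetic. A short Python sketch of that calculation, assuming 4-byte integers and the default lower bound of 1; the function name is made up for illustration:

```python
def byte_offset_from_block_start(index: int, lower_bound: int, element_size: int) -> int:
    """Byte offset of element `index` relative to the first element of the array."""
    return (index - lower_bound) * element_size


# t1(-3) in an array declared t1(80), i.e. bounds 1..80, with 4-byte integers
print(byte_offset_from_block_start(-3, 1, 4))  # prints -16
# size of the allocated block: 80 elements of 4 bytes each
print(80 * 4)                                  # prints 320
```

A negative offset means the access lands before the start of the malloc'd block, which is exactly what memcheck can detect for heap allocations.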
There is no error. You declared indp as an integer of a certain range and precision (of a certain KIND; look that term up in the help), which can be either positive or negative.
After that you assigned the value 1 to t1(indp) and wrote it out.
