Crash in free() - ios

I have several crash reports from an iOS app that stem from a SIGABRT in a free() call.
The call stack is consistent:
0 libsystem_kernel.dylib 0x3863c1f0 __pthread_kill + 8
1 libsystem_c.dylib 0x385ecfdd abort + 77
2 libsystem_malloc.dylib 0x38664d67 free + 383
I'm trying to get more diagnostics, but in the meantime, did anyone encounter the same? What kind of a wrong argument would crash a free() call? I can see several options:
a null pointer (actually legitimate)
a data area pointer (e.g. a string literal)
a stack pointer
a garbage pointer (e.g. an uninitialized one)
a heap pointer that was already freed
Any ideas, please? These crashes are pretty rare; the last one was in Sep '14. But I've got over 10 in total, so there is probably a bug there.

If I read the stack dump correctly, the code triggered an assertion in free and called abort. Look at the source code for libsystem_malloc on http://opensource.apple.com and try to figure out which assertion failed.
You have a stray pointer; guessing where it is hiding from a single non-reproducible crash is next to impossible. Running your application in the simulator with valgrind (if that's possible) may help you track memory misuse.
If the stack dump is longer than 3 lines, you should have an indication of which call to free caused the problem. It may help you track the bug, but it may also be a late side-effect of some earlier pointer misuse.
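Of the candidates listed in the question, only the null pointer is harmless: free(NULL) is defined to do nothing. One common defensive habit, sketched below in C, is to clear a pointer variable right after freeing it, so that an accidental second free through the same variable becomes a no-op. FREE_AND_CLEAR and release_twice_safely are hypothetical names for illustration, not part of libsystem_malloc, and the macro only protects the one variable it is applied to; other copies of the pointer can still be freed twice.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: free the block, then null the variable that held it. */
#define FREE_AND_CLEAR(p) do { free(p); (p) = NULL; } while (0)

/* Returns 1 if the pointer is NULL after being released twice. */
static int release_twice_safely(void) {
    char *buf = malloc(16);
    if (buf == NULL) {
        return 0;
    }
    buf[0] = 'x';

    FREE_AND_CLEAR(buf);   /* real free; buf becomes NULL */
    assert(buf == NULL);
    FREE_AND_CLEAR(buf);   /* free(NULL): defined to be a no-op */

    return buf == NULL;
}
```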

Related

DJI SDK iOS crash

DJI SDK iOS community
I have been connecting the M300 and this crash has been happening randomly. Any idea how to mitigate this issue?
EXC_BAD_ACCESS KERN_INVALID_ADDRESS 0x0000000bbf6e3070
Crashed: Thread
0 libobjc.A.dylib 0x1cf4 objc_msgSend + 20
1 DJISDK 0xa3e920 GetIsFCConnectedHandle(unsigned long long) + 51100
2 DJISDK 0x204288 mop_link_layer_recv + 128
3 DJISDK 0x204f1c mop_link_layer_node_init + 2908
4 libsystem_pthread.dylib 0x3348 _pthread_start + 116
5 libsystem_pthread.dylib 0x1948 thread_start + 8
Your problem is most likely caused by incorrect initialisation in conjunction with a network related event, say an inbound data packet.
Inbound data handlers often have the word "recv" in their names.
Your third-party code is trying to message an object at an address that is too low (but not around zero). This usually means that some base pointer is nil, but some other value has been added to it, and the result is the address being used.
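The nil-plus-offset pattern can be illustrated with a small C sketch. The struct layout and names here (link_node, handler, handler_field_address) are hypothetical and chosen only for illustration; they are not the DJI SDK's real data structures. The point is that a field lookup through a null base does not touch address zero but the field's offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only. */
struct link_node {
    char  header[0x40];   /* 64 bytes of other fields */
    void *handler;        /* object that would later be messaged */
};

/* Address where `handler` would be read, given the node's base address. */
static uintptr_t handler_field_address(uintptr_t node_base) {
    /* Guard against wraparound for very large base values. */
    assert(node_base <= UINTPTR_MAX - offsetof(struct link_node, handler));
    return node_base + offsetof(struct link_node, handler);
}
```

With a nil base, handler_field_address(0) is 0x40 rather than 0, which is why such crashes report a small but non-zero address.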
Notice that the crashing call is made from GetIsFCConnectedHandle, which takes what is likely a pointer-like argument. This is often a telltale sign of code that stores pointers to handler objects in data structures.
The first thing is to check configuration and initialisation data structures for your SDK and its associated hardware. The next thing to do is to enable the different sanitisers (address sanitiser for example) or use the memory allocations instrument to see the history of the system up until the crash. Then you'll have context as to what is going on.
The official resource for sanitisers is
https://developer.apple.com/documentation/xcode/diagnosing-memory-thread-and-crash-issues-early
I also have a book which goes over similar ground and gives some tips. Good luck!

ios and opencv: how to correctly call cvtColor without wasting memory?

I am using this function here in my ios app:
cv::cvtColor(image, image, cv::COLOR_BGR2RGB);
But when I call this in my - (void)processImage:(cv::Mat&)image delegate method,
images get lost in memory. So after a few seconds my app crashes with memory problems.
Terminated due to Memory Pressure
Don't I just copy the converted image over the previous image?
And what can I do to prevent this behavior?
- (void)processImage:(cv::Mat&)image
{
cv::cvtColor(image, image, cv::COLOR_BGR2RGB);
}
Some output showing how the data looks in the inspector (these vm_allocate lines appear a lot):
0 0x20961000 VM: CoreAnimation 00:22.762.010 • 7,91 MB QuartzCore CA::Render::aligned_malloc(unsigned long, void**)
1 0x20178000 VM: CoreAnimation 00:22.415.490 • 7,91 MB QuartzCore CA::Render::aligned_malloc(unsigned long, void**)
2 0x2114a000 CGSImageData 00:22.762.165 • 5,95 MB CoreGraphics CGSImageDataHandleCreate
3 0x1f3a0000 VM: Foundation 00:22.752.743 • 5,93 MB libsystem_kernel.dylib vm_allocate
4 0x1fb89000 VM: Foundation 00:22.408.091 • 5,93 MB libsystem_kernel.dylib vm_allocate
I usually convert over the original image with no problem. You can create another destination Mat if you wish to preserve the original one, so it depends on the case.
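To see why converting over the original needs no second image, here is a minimal C sketch of an in-place BGR to RGB swap on a raw pixel buffer. This is not OpenCV's implementation; bgr_to_rgb_in_place is a hypothetical function, and it assumes a tightly packed buffer with three 8-bit channels per pixel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap the first and third byte of every 3-byte pixel, in place.
   Assumes tightly packed 8-bit, 3-channel data (no row padding). */
static void bgr_to_rgb_in_place(uint8_t *pixels, size_t pixel_count) {
    assert(pixels != NULL || pixel_count == 0);
    for (size_t i = 0; i < pixel_count; ++i) {
        uint8_t *px = pixels + 3 * i;
        uint8_t blue = px[0];
        px[0] = px[2];   /* red moves to the front */
        px[2] = blue;    /* blue moves to the back */
    }
}
```

The only extra storage is one byte per swap, so a conversion of this kind cannot by itself account for megabytes of growth per frame.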
Would rather comment than reply, but my answer would be too long.
Try these methods:
1) There is a high chance that it is due to you not declaring the channel, for instance CV_8UC3, etc.
2) If step one doesn't work, the other likely possibility: try using CV_BGR2RGB instead of cv::COLOR_BGR2RGB (version compatibility problem).
3) Have you tried removing the reference (&)?
If the three methods still don't work, please comment on this answer with the exact error message you are receiving. I will try to help you out. Cheers.
EDIT(To answer your comments):
When I was talking about channels, I meant CV_8UC3, CV_8U, etc. You don't have to try it anymore though, because the error is due to iOS's aggressive kernel thread, which is set to kill processes when memory runs low. This means that background processes are the most likely to be killed to free memory for the current foreground process.
More information about that kernel: http://newosxbook.com/articles/MemoryPressure.html
I am not an expert when it comes to iOS, but here are some things I think you can try:
1) Use a global variable; for instance, make the Mat image global rather than local.
2) A slightly bad programming convention to some: skip the function and just move its code into the main program or whatever was calling the function. This ensures that iOS doesn't need to switch process, and hence kill either of them.
3) Define the app profile (UIBackgroundMode) in the kernel, thereby taming Jetsam, the aggressive kernel killer, a little.
4) Release images from RAM (remove the references to them) when they are no longer needed.

Does 'malloc_error_break' occur on the same thread as the underlying memory corruption?

I am trying to debug an occasional crash in our iOS application.
We get 'malloc_error_break' with the usual 'object was modified after being freed'.
The crash occurs in the same C library, but at different malloc places.
The top of the backtrace looks like this:
* thread #29: tid = 0x3a03, 0x32c8cfa8 libsystem_c.dylib`malloc_error_break, stop reason = breakpoint 1.1
frame #0: 0x32c8cfa8 libsystem_c.dylib`malloc_error_break
frame #1: 0x32c71ed0 libsystem_c.dylib`szone_error + 220
frame #2: 0x32c71f1c libsystem_c.dylib`free_list_checksum_botch + 28
frame #3: 0x32c1d3bc libsystem_c.dylib`tiny_malloc_from_free_list + 348
frame #4: 0x32c1c44a libsystem_c.dylib`szone_malloc_should_clear + 1274
frame #5: 0x32c1bf1e libsystem_c.dylib`malloc_zone_malloc + 66
Question:
Does this guarantee that the underlying memory corruption (e.g. a double free) happens on the same thread as the 'malloc_zone_malloc'? Or at least that the memory malloc_error_break is referring to was allocated on the same thread?
Knowing this for sure would help me isolate the crash from the influence of other libraries, NSURLConnection requests, etc. The app is quite big and difficult to debug as it is.
Edit:
I guess what I wanted to know first was something simpler.
Do different threads have separate heaps / malloc lists in iOS?
malloc_error_break() is invoked as soon as memory corruption is discovered, no matter what thread happens to discover it. There are absolutely no guarantees as to which thread this will be.
Do different threads have separate heaps / malloc lists in iOS?
No. There is a single shared heap used by all threads in your process.
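A practical consequence of the shared heap is that a block may be allocated on one thread and freed on another, which is also why corruption caused on one thread can be detected by any other. The sketch below shows this with POSIX threads; allocate_on_worker and cross_thread_alloc_free are hypothetical names, and the example says nothing about how Apple's allocator organises the heap internally.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Worker: allocates a buffer on its own thread and hands it back. */
static void *allocate_on_worker(void *unused) {
    (void)unused;
    char *msg = malloc(6);
    if (msg != NULL) {
        memcpy(msg, "hello", 6);   /* 5 characters plus the terminator */
    }
    return msg;
}

/* Returns 1 if a block allocated on the worker thread could be read
   and freed on the calling thread. */
static int cross_thread_alloc_free(void) {
    pthread_t worker;
    void *result = NULL;

    if (pthread_create(&worker, NULL, allocate_on_worker, NULL) != 0) {
        return 0;
    }
    if (pthread_join(worker, &result) != 0 || result == NULL) {
        return 0;
    }

    int ok = (strcmp((const char *)result, "hello") == 0);
    free(result);   /* freed by a different thread than the allocator */
    return ok;
}
```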

Delphi 6 Compiler Options (Pentium-safe FDIV)

I received a crash report from madExcept from a user. The exception was "Invalid floating point operation".
The odd part though is that the callstack dies at #FSafeDivide.
I did a google and found out that this was a check for certain pentium chips which didn't do division correctly. If the test failed all the divisions would be done in software rather than hardware. I have the Pentium-Safe FDIV option turned on in my compiler settings.
Could this have caused the error? I also read somewhere else that the EInvalidOp which was the exception class can be a stack overflow or something.
Here's a snippet of the madExcept message if you want to read it.
exception class : EInvalidOp
exception message : Invalid floating point operation.
thread $1014 (TMyBossThread):
00403509 M5b3.exe System #FSafeDivide
008300c9 M5b3.exe MMyWorkerThread 317 TMyBossThread.Search
0073e87a M5b3.exe MMyManagerThread 186 TMyWorkerThread.Execute
008e8c17 M5b3.exe madExcept HookedTThreadExecute
0042c150 M5b3.exe Classes ThreadProc
00405354 M5b3.exe System ThreadWrapper
008e8af9 M5b3.exe madExcept CallThreadProcSafe
008e8b63 M5b3.exe madExcept ThreadExceptFrame
created by main thread ($864) at:
0073e828 M5b3.exe MMyManagerThread 171 TMyManagerThread.Create
First, unless you actually have people still running on early Pentium I chips, you should probably turn that compiler option off. It's to address a glitch in a few specific CPUs, and any chip sold since 1995 has not had the problem.
Having said that, if you've got an invalid floating point operation in a division, the problem's most likely in your code somewhere, especially since FSafeDivide is the routine that's supposed to produce the right results. Take a look at TMyBossThread.Search, line 317, and see what it's dividing there. Also look at line 316, since stack traces can sometimes point you to the line after the one you care about.
A few comments, before searching in the haystack:
"If it's not reproducible, it's not a bug but an anomaly". Don't waste time on What or Why but on How you can recreate it.
As Mason said, it's probably time to remove this compiler option. (D6 is almost 10 years old)
Do you know if it happens on a specific Windows version? For instance, Text-To-Speech working well on XP gives a "Floating point division by zero error" on Vista and up.
Supposing your code seems fine, what is called that would involve some floating point operations?
The 2 last ones refer to problems with the FPU registers being messed up:
See here for interoping with .Net and in the Help on Set8087CW for OpenGL
This (German) article describes a case where switching on the Pentium(tm)-safe divide ($U+) fixed a Data Execution Prevention error on a Windows 2003 Server system, which has DEP enabled:
http://entwickler-forum.de/archive/index.php/t-41207.html
Delphi 2009 still has this compiler flag with default $U- (no Pentium(tm)-safe divide).
So even if we can forget about the hardware related part (broken CPUs), it still could make a difference depending on Operating System 'features' like DEP

Address Error in Assembly (ColdFire MCF5307)

Taking my first course in assembly language, I am frustrated with cryptic error messages during debugging... I acknowledge that the following information will not be enough to find the cause of the problem (given my limited understanding of the assembly language, ColdFire(MCF5307, M68K family)), but I will gladly take any advice.
...
jsr out_string
Address Error (format 0x04 vector 0x03 fault status 0x1 status reg 0x2700)
I found a similar question at http://forums.freescale.com/freescale/board/message?board.id=CFCOMM&thread.id=271, regarding ADDRESS ERROR in general.
The answer to the question states that the address error is because the code is "incorrectly" trying to execute on a non-aligned boundary (or accessing non-aligned memory).
So my questions will be:
What does it mean to "incorrectly" try to execute on a non-aligned boundary or access non-aligned memory? An example would help a lot.
What is a non-aligned boundary/memory?
How would you approach fixing this problem, assuming you have few debugging techniques (e.g. using breakpoints and trace)?
First of all, it is possible that this isn't the instruction causing the error. Be sure to check whether the previous or next instruction could have caused it. However, assuming that exception handlers and debuggers have improved:
An alignment exception is what occurs when, say, 32-bit (4-byte) data is retrieved from an address which is not a multiple of 4 bytes. For example, suppose variable x is 32 bits wide and sits at address 2:
const1: dc.w someconstant
x: dc.l someotherconstant
Then the instruction
move.l x, %d0
would cause a data alignment fault on a 68000 (and 68010, IIRC). The 68020 eliminated this restriction and performs the unaligned access, but at the cost of decreased performance. I'm not aware of the jsr (jump to subroutine) instruction requiring alignment, but it's not unreasonable and it's easy to arrange: before each function, insert the assembler's alignment directive:
.align long
func: ...
It has been a long time since I've used a 68K family processor, but I can give you some hints.
Trying to execute on an unaligned boundary means executing code at an odd address, for example if out_string were at an address with the low bit set.
The same holds true for a data access to memory of 2 or 4 byte data. I'm not sure if the Coldfire supports byte access to odd memory addresses, but the other 68K family members did.
The address error occurs on the instruction that causes the error in all cases.
Find out what instruction is there. If the pc matches (or is close) then it is an unaligned execution. If it is a memory access, e.g. move.w d0,(a0), then check to see what address is being read/written, in this case the one pointed at by a0.
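The address check described above boils down to a remainder test. Here is a minimal C sketch, assuming the rule stated in this answer that instruction fetches and 2- or 4-byte accesses must not start at an odd address; is_aligned is a hypothetical helper, and the required alignment for a given access should be taken from the processor manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if `address` is a multiple of `alignment` (in bytes). */
static bool is_aligned(uintptr_t address, uintptr_t alignment) {
    assert(alignment != 0);
    return address % alignment == 0;
}
```

For instance, with the value of a0 or the pc taken from the exception report, is_aligned(0x1001, 2) is false (odd address, so an address error is expected), while is_aligned(0x1000, 2) is true.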
I just wanted to add that this is very good stuff to figure out. I program high end medical imaging devices in my day job, but occasionally I need to get down to this level. I have found and fixed more than one COTS OS problem by being able to track down just this sort of problem.

Resources