How to analyze stack trace info of a thread? - ios

I'm trying to monitor the performance of my app.
When CPU usage spikes, I dump the stack traces of the suspicious thread and the main thread as strings, using two libraries:
https://github.com/bestswifter/BSBacktraceLogger
https://github.com/plausiblelabs/plcrashreporter
Below is the stack trace I recorded for one thread, but it doesn't help me locate the performance issue.
Am I doing something wrong? How should I analyze a thread's stack trace?

OK, I think I understand your problem. What kind of app is it? A game, or something else? With only this information, I can offer a few suggestions.
Study the code thoroughly and release every resource that is no longer in use.
Check how many static properties and global variables you are using, and ask yourself whether each one is really required.
Also monitor your app with Instruments and watch closely when memory use rises and falls: for example, which view controller drives it up when opened, and which brings it back down when closed. Does your app depend on GPS? Large applications such as Uber reportedly do not push location updates straight from didUpdateLocations; they use other mechanisms such as singletons, timers, or heartbeats.
If you want to avoid all this manual work, try New Relic.
A small tutorial for that: link
Post again if you need more; happy to help. =)
Here are some links. By combining them with Firebase you'll be able to see events and logs as well:
First: watchdog
Second: Prints the filename, function name, line number and textual..etc..
Now combine either of those with Firebase and it will send the logs to you directly.

Related

iOS: Background capability not working (Terminated due to signal 9.)

I am working on a project which includes following functionalities.
Location fetch and send to server in every 60 seconds.
Audio/Video Calls.
The background modes which are set for the project are mentioned as under
iOS: 14.1
Xcode: 12.1
Swift: 4
Problem:
Whenever I put the app in the background, it keeps fetching location or continues the call for a while, and then I get the error below in the logs. Likewise, if I background the app during an audio call, the audio works for a few seconds and then the same error appears.
Message from debugger: Terminated due to signal 9.
However, everything works fine when the application is in the foreground: the app fetches location and calls work.
Please suggest what else I need to do, or what I am doing wrong.
The comment thread on your question suggested that the termination is due to excessive background CPU usage.
Based on your last comment, it sounds like you don't know where to start with Instruments (I've been there), as another commenter recommended, so I'll give some basic info on how to get started with CPU profiling in Instruments. You can then seek out more detailed tutorials online; this WWDC video from Apple is as good a place to start as any: https://developer.apple.com/videos/play/wwdc2019/411/
The following assumes using Xcode 12.1 and its corresponding Instruments version 12.1, but most recent versions should be fairly similar (maybe a button is slightly differently placed, etc. in older versions):
Open your app project in Xcode, and run it on a real device (the simulator reports your Mac's CPU usage, which will be very different from a real device's).
Go to the Debug navigator in the left sidebar (Cmd+7), select CPU in the sidebar, then click the Profile in Instruments button on the top right.
Select 'Profile Current Session' if asked.
Instruments should launch and start recording automatically.
Reproduce the issue on your device.
Now to understand what's being shown in Instruments:
The top pane (the moving chart) shows your CPU usage over time.
The bottom pane shows the call tree of the code that has run.
There's a lot of info there, so look at the Filter and configuration bar at the very bottom of the window, and to begin with select all the options in the Call Tree menu.
Here's a short explanation of each of those options:
Separate by Thread: Shows the processes by thread to help diagnose overworked threads
Invert Call Tree: Reverses the stack to show the bottom portion first which is likely more useful for troubleshooting
Hide System Libraries: Removes system library processes so there's less noise & you can focus only on your app's code
Flatten Recursion: Combines recursive calls into one single entry
Top Functions: Combines the time the called function used, plus the time spent by functions called from that function. This can help you find your most expensive methods in terms of CPU usage.
Now you've got a filtered list of expensive CPU methods for your app only, and selecting a row gives you more information in the Extended Details pane to the right of the call tree view. This can show you exactly which method in which file of your code was running and even take you to it in Xcode (with a few button clicks).
Hopefully that is enough to get you started and to help you recognise potential problem areas in your code that may be causing your app to be terminated.

Is there a way to take action, thus execute code, when an iOS application crashes? Is this possible?

Is there a way to take action, thus execute code, when an iOS application crashes? Specifically, I would like to save the core data storage. Is this possible? I would say that this is possible since, for example, Firebase has to send information online for making crashlytics work. How can this be achieved? Thanks
Yes, but it is very difficult, and "save core data storage" would be far too much (and very dangerous, to boot).
Most crashes result from a signal (often SIGSEGV, but also SIGABRT, SIGILL or others), and you can install a signal handler to run code in that case. However, that code must be very, very carefully written because you will be in a special execution state. There are a small number of C functions you are permitted to use (see the man page for sigaction for the list). Most notably, you can't allocate memory. Allocating memory in a signal catching function can deadlock the program in a tight spinlock (done that myself when I tried to write my own crash handler in my more naive days; it's really bad).
The way that crash handlers like Crashlytics do it is that they do as little as possible during the signal handler, mostly just writing the stack trace to storage (using pre-allocated buffers). When you restart, they see that there's an unhandled stack trace from a previous run, and then they do all the complicated stuff like uploading it to a server, or displaying UI, or whatever.
But even if you could write to Core Data in the middle of a signal handler, you would never want to do that. During a signal handler, the system is in an undefined state. Various invariants may not currently hold (such as whether the object graph is consistent). The fact that you're crashing this way indicates that something illegal has happened. The last thing you should do in that state is take data that is highly untrustworthy and overwrite the good data on disk.
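The restart half of the crash-handler scheme described above can be sketched in C. This is only a minimal sketch, not how Crashlytics actually does it: the function name take_pending_report and the idea of a single report file at a fixed path are assumptions made for illustration. It runs at launch, outside any signal handler, so ordinary APIs are fine here.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: called early at launch, outside any signal handler.
 * Reads a crash report left by a previous run into `out`, then deletes the
 * file so it is handled only once. Returns the number of bytes read, or -1
 * if no report was pending. */
ssize_t take_pending_report(const char *path, char *out, size_t cap)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;              /* no report: the last run ended cleanly */

    ssize_t total = 0;
    while ((size_t)total < cap) {
        ssize_t n = read(fd, out + total, cap - (size_t)total);
        if (n <= 0)
            break;              /* end of file or read error */
        total += n;
    }
    close(fd);
    unlink(path);               /* consume the report */
    return total;
}
```

Uploading the report or showing UI would then happen in normal application code, once the process is in a known-good state.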

Intercepting crashes on iOS

Description
I would like to catch all exceptions that are occurring in iOS app and log them to file and eventually send them to back-end server used by the app.
I've been reading about this topic and found usage of signals sent by device and handling them, but I'm not sure if it's gonna break App Store Review guidelines or it may introduce additional issues.
I've added following to AppDelegate:
NSSetUncaughtExceptionHandler { (exception) in
    log.error(exception)
}
signal(SIGABRT) { s in
    log.error(Thread.callStackSymbols.prettified())
    exit(s)
}
signal(SIGILL) { s in
    log.error(Thread.callStackSymbols.prettified())
    exit(s)
}
signal(SIGSEGV) { s in
    log.error(Thread.callStackSymbols.prettified())
    exit(s)
}
Questions
Is this a good approach? Is there another way?
Will it break App Store Review guidelines because of the use of exit()?
Is it better to use kill(getpid(), SIGKILL) instead of exit()?
Resources
https://github.com/zixun/CrashEye/blob/master/CrashEye/Classes/CrashEye.swift
https://www.plcrashreporter.org/
https://chaosinmotion.blog/2009/12/02/a-useful-little-chunk-of-iphone-code/
Former Crashlytics iOS SDK maintainer here.
The code you've written above does have a number of technical issues.
The first is there are actually very few functions that are defined as safe to invoke inside a signal handler. man sigaction lists them. The code you've written is not signal-safe and will deadlock from time to time. It all will depend on what the crashed thread is doing at the time.
The second is you are attempting to just exit the program after your handler. You have to keep in mind that signals/exception handlers are process-wide resources, and you might not be the only one using them. You have to save pre-existing handlers and then restore them after handling. Otherwise, you can negatively affect other systems the app might be using. As you've currently written this, even Apple's own crash reporter will not be invoked. But, perhaps you want this behavior.
Third, you aren't capturing all threads stacks. This is critical information for a crash report, but adds a lot of complexity.
Fourth, signals actually aren't the lowest-level error system. Mach exceptions (not to be confused with runtime exceptions, i.e. NSException) are the underlying mechanism used to implement signals on iOS. They are a much more robust system, but are also far more complex. Signals have a bunch of pitfalls and limitations that Mach exceptions get around.
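The second point above, saving pre-existing handlers and restoring them after handling, could be sketched in C roughly as follows. This is an illustration under stated assumptions, not the Crashlytics implementation: the names install_chaining_handler, chaining_handler, previous_actions and the TRACKED_SIGNALS limit are all made up for the example, and the handler records nothing but the signal number.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

#define TRACKED_SIGNALS 64      /* assumed upper bound, for illustration */

/* Previous dispositions, indexed by signal number, so they can be put back. */
static struct sigaction previous_actions[TRACKED_SIGNALS];

/* Set by the handler; sig_atomic_t is the type the C standard guarantees
 * can be written safely from a signal handler. */
static volatile sig_atomic_t last_signal_seen = 0;

static void chaining_handler(int signo)
{
    last_signal_seen = signo;   /* a real reporter would record more here */

    /* Hand the signal back to whoever owned it before us. For a genuine
     * fault, returning re-executes the faulting instruction, so the
     * restored handler (or the default action) sees the same crash. */
    sigaction(signo, &previous_actions[signo], NULL);
}

/* Installs the handler and remembers what was there before.
 * Returns 0 on success, -1 on failure. */
int install_chaining_handler(int signo)
{
    struct sigaction sa;
    if (signo <= 0 || signo >= TRACKED_SIGNALS)
        return -1;
    sa.sa_handler = chaining_handler;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, &previous_actions[signo]);
}
```

The design choice here is to put the previous disposition back and return rather than call exit(), so that any other crash reporter registered earlier still gets its turn.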
These are just the issues that come to me off the top of my head. Crash reporting is tricky business. But, I don't want you to think it's magic, of course it's not. You can build a system that works.
One thing I do want to point out, is that crash reporters give you no feedback on failure. So, you might build something that works 25% of the time, and because you are only seeing valid reports, you think "hey, this works great!". Crashlytics had to put in effort over many years to identify the causes of failure and try to mitigate them. If this is all interesting to you, you can check out a talk I did about the Crashlytics system.
Update:
So, what would happen if you ship this code? Well, sometimes you'll get useful reports. Sometimes, your crash handling code will itself crash, which will cause an infinite loop. Sometimes your code will deadlock, and effectively hang your app.
Apple has made exit public API (for better or worse), so you are absolutely within the rules to use it.
I would recommend continuing down this road for learning purposes only. If you have a real app that you care about, I think it would be more responsible to integrate an existing open-source reporting system and point it to a backend server that you control. No 3rd parties, but also no need to worry about doing more harm than good.
Conclusion
It is possible to create a custom crash reporter, but it is definitely not recommended: there is a lot going on in the background that is easy to overlook and that can introduce a lot of undefined behavior. Even third-party frameworks can be troublesome, but using one is generally the better way to go.
Thanks to everyone for providing information regarding this topic.
Answers to questions
Is this good approach, any other way?
The approach I mentioned in the original question interferes with Apple's own crash reporter, and it introduces undefined behavior because of bad signal handling. UNIX signals do not cover every error, and a signal handler may only call async signal safe functions. Mach exception handling, which is used by Apple's crash reporter, is the better option, but it is more complex.
Will usage of exit() break Apple App Store review?
No. Usage of exit() is more related to the normal operation of app. If app is crashing anyway, calling exit() isn't problem.
Is it better to use kill(getpid(), SIGKILL) instead of exit()?
Quote from Eskimo:
You must not call exit. There’s two problems with doing that:
exit is not async signal safe. In fact, exit can run arbitrary code
via handlers registered with atexit. If you want to exit the process,
call _exit.
Exiting the process is a bad idea anyway, because it will either
prevent the Apple crash reporter from running or cause it to log
incorrect state (the state of your signal handler rather than the
state of the crashed thread).
A better solution is to unregister your signal handler (set it to
SIG_DFL) and then return
Additional details (full context)
Since I cross-posted this question to Apple's official support forum and got a really long and descriptive answer from the well-known Eskimo, I would like to share it with anyone who decides to go down the same path as I did and starts researching this approach.
Quote from Eskimo
Before we start I’d like you to take look at my shiny new Implementing
Your Own Crash Reporter post. I’ve been meaning to write this up for
a while, and your question has given me a good excuse to allocate the
time.
You wrote:
I've got a requirement to catch all exceptions that are occurring in
iOS app and log them to file and eventually send them to back-end
server used by the app.
I strongly recommend against doing this. My Implementing Your Own
Crash Reporter post explains why this is so hard. It also has some
suggestions for how to avoid problems, but ultimately there’s no way
to implement a third-party crash reporter that’s reliable, binary
compatible, and sufficient to debug complex problems.
With that out of the way, let’s look at your specific questions:
Is this good approach at all?
No. The issue is that your minimalist crash reporter will disrupt the
behaviour of the Apple crash reporter. The above-mentioned post
discusses this problem in gory detail.
Will it break App Store Review guidelines because of usage of exit()?
No. iOS’s prohibition against calling exit is all about the normal
operation of your app. If your app is crashing anyway, calling exit
isn’t a problem.
However, calling exit will exacerbate the problem I covered in the
previous point.
Is it better to use kill(getpid(), SIGKILL) instead?
That won’t improve things substantially.
callStackSymbols are not symbolicated, is there a way to symbolicate
callStackSymbols?
No. On-device symbolication is extremely tricky and should be
avoided. Again, I go into this in detail in the post referenced
above.
Share and Enjoy
Since links can break, I will also quote the post.
Implementing Your Own Crash Reporter
I often get questions about third-party crash reporting. These
usually show up in one of two contexts:
Folks are trying to implement their own crash reporter.
Folks have implemented their own crash reporter and are trying to debug a problem based on the report it generated.
This is a complex issue and this post is my attempt to untangle some
of that complexity.
If you have a follow-up question about anything I've raised here,
please start a new thread in .
IMPORTANT All of the following is my own direct experience. None of it should be considered official DTS policy. If you have questions
that need an official answer (perhaps you’re trying to convince your
boss that implementing your own crash reporter is a very bad idea :-),
you should open a DTS tech support incident and we can
discuss things there.
Share and Enjoy — Quinn “The Eskimo!” Apple Developer Relations,
Developer Technical Support, Core OS/Hardware let myEmail = "eskimo"
+ "1" + "#apple.com"
Scope
First, I can only speak to the technical side of this issue. There
are other aspects that are beyond my remit:
I don’t work for App Review, and only they can give definitive answers about what will or won’t be allowed on the store.
Doing your own crash reporter has significant privacy implications.
IMPORTANT If you implement your own crash reporter, discuss the privacy impact with a lawyer.
This post assumes that you are implementing your own crash reporter.
A lot of folks use a crash reporter from another third party. From my
perspective these are the same thing. If you use a custom crash
reporter, you are responsible for its behaviour, both good and bad,
regardless of where the actual code came from.
Note If you use a crash reporter from another third party, run the tests outlined in Preserve the Apple Crash Report to verify that
it’s working well.
General Advice
I strongly advise against implementing your own crash reporter. It’s very easy to implement a basic crash reporter that works well
enough to debug simple problems. It’s impossible to create a good
crash reporter, one that’s reliable, binary compatible, and sufficient
to debug complex problems.
“Impossible?”, I hear you ask, “That’s a very strong word for Quinn to
use. He’s usually a lot more circumspect.” And yes, that’s true, I
usually am more circumspect, but in this case I’m extremely
confident of this conclusion.
There are two fundamental problems with implementing your own crash
reporter:
On iOS (and the other iOS-based platforms, watchOS and tvOS) your crash reporter must run inside the crashed process. That means it can
never be 100% reliable. If the process is crashing then, by
definition, it’s in an undefined state. Attempting to do real work in
that state is just asking for problems [1].
To get good results your crash reporter must be intimately tied to system implementation details. These can change from release to
release, which invalidates the assumptions made by your crash
reporter. This isn’t a problem for the Apple crash reporter because
it ships with the system. However, a crash reporter that’s built in
to your product is always going to be brittle.
I’m speaking from hard-won experience here. I worked for DTS during
the PowerPC-to-Intel transition, and saw a lot of folks with custom
crash reporters struggle through that process.
Still, this post exists because lots of folks ignore my general
advice, so the subsequent sections contain advice about specific
technical issues.
WARNING Do not interpret any of the following as encouragement to implement your own crash reporter. I strongly advise against that.
However, if you ignore my advice then you should at least try to
minimise the risk, which is what the rest of this document is about.
[1] On macOS it’s possible for your crash reporter to run out of
process, just like the Apple crash reporter. However, that presents
its own problems: When running out of process you can’t access various
bits of critical state for the crashed process without being tightly
bound to implementation details that are not considered API.
Preserve the Apple Crash Report
You must ensure that your crash reporter doesn’t disrupt the Apple
crash reporter. Some fraction of your crashes will not be caused by
your code but by problems in framework code, and a poorly written
crash reporter will disrupt the Apple crash reporter and make it
harder to diagnose those issues.
Additionally, when dealing with really hard-to-debug problems, you
really need the more obscure info that’s shown in the Apple crash
report. If you disrupt that info, you end up making the hard problems
harder.
To avoid these issues I recommend that you test your crash reporter’s
impact on the Apple crash reporter. The basic idea is:
Create a program that generates a set of specific crashes.
Run through each crash.
Verify that your crash reporter produces sensible results.
Verify that the Apple crash reporter also produces sensible results.
With regards step 1, your test suite should include:
An un-handled language exception thrown by your code
An un-handled language exception thrown by the OS (accessing an NSArray out of bounds is an easy way to get this)
A memory access exception
An illegal instruction exception
A breakpoint exception
Make sure to test all of these cases on both the main thread and a
secondary thread.
With regards step 4, check that the resulting Apple crash report
includes correct values for:
The exception info
The crashed thread
That thread’s state
Any application-specific info, and especially the last exception backtrace
Signals
Many third-party crash reporters use UNIX signals to catch the crash.
This is a shame because using Mach exception handling, the mechanism
used by the Apple crash reporter, is generally a better option.
However, there are two reasons to favour UNIX signals over Mach
exception handling:
On iOS-based platforms your crash reporter must run in-process, and doing in-process Mach exception handling is not feasible.
Folks are a lot more familiar with UNIX signals. Mach exception handling, and Mach messaging in general, is pretty darned obscure.
If you use UNIX signals for your crash reporter, be aware that this
API has some gaping pitfalls. First and foremost, your signal handler
can only use async signal safe functions [1]. You can find a list
of these functions in the sigaction man page [2].
WARNING This list does not include malloc. This means that a crash reporter’s signal handler cannot use Objective-C or Swift, as
there’s no way to constrain how those language runtimes allocate
memory. That means you’re stuck with C or C++, but even there you
have to be careful to comply with this constraint.
The Operative: It’s worse than you know.
Many crash reporters use functions like backtrace (see its man page)
to get a backtrace from their signal handler. There’s two problems
with this:
backtrace is not an async signal safe function.
backtrace uses a naïve algorithm that doesn’t deal well with cross signal handler stack frames [3].
The latter example is particularly worrying, because it hides the
identity of the stack frame that triggered the signal.
If you’re going to backtrace out of a signal, you must use the crashed
thread’s state (accessible via the handler’s uap parameter) to start
your backtrace.
Apropos that, if your crash reporter wants to log the state of the
crashed thread, that’s the place to get it.
Finally, there’s the question of how to exit from your signal handler.
You must not call exit. There’s two problems with doing that:
exit is not async signal safe. In fact, exit can run arbitrary code via handlers registered with atexit. If you want to exit the
process, call _exit.
Exiting the process is a bad idea anyway, because it will either prevent the Apple crash reporter from running or cause it to log
incorrect state (the state of your signal handler rather than the
state of the crashed thread).
A better solution is to unregister your signal handler (set it to
SIG_DFL) and then return. This will cause the crashed process to
continue execution, crash again, and generate a crash report via the
Apple crash reporter.
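The "unregister and return" exit path described above, together with a handler that receives the uap parameter, might look roughly like this in C. It is a sketch under assumptions: the names crash_handler, install_crash_handler and the three crash_* variables are invented for the example, and decoding registers out of uap is left out because it is platform-specific.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Filled in by the handler; preallocated, no locks, no allocation. */
static volatile sig_atomic_t crash_signo = 0;
static volatile sig_atomic_t crash_si_code = 0;
static volatile sig_atomic_t crash_had_thread_state = 0;

/* Three-argument handler form, selected with SA_SIGINFO. `uap` points at
 * the interrupted thread's saved state and is only valid inside the
 * handler, so a real reporter would read it here, not keep the pointer. */
static void crash_handler(int signo, siginfo_t *info, void *uap)
{
    crash_signo = signo;
    crash_si_code = (info != NULL) ? info->si_code : 0;
    crash_had_thread_state = (uap != NULL);

    /* Unregister ourselves and return rather than calling exit(). */
    struct sigaction dfl;
    dfl.sa_handler = SIG_DFL;
    dfl.sa_flags = 0;
    sigemptyset(&dfl.sa_mask);
    sigaction(signo, &dfl, NULL);
}

/* Returns 0 on success, -1 on failure. */
int install_crash_handler(int signo)
{
    struct sigaction sa;
    sa.sa_sigaction = crash_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

After the handler returns with the disposition back at SIG_DFL, a real fault fires again and is handled by the system default.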
[1] While the common signals caught by a crash reporter are not
technically async signals (except SIGABRT), you still have to treat
them as async signals because they can occur on any thread at any
time.
[2] It’s reasonable to extend this list to other routines that are
implemented as thin shims on a system call. For example, I have no
qualms about calling vm_read (see below) from a signal handler.
[3] Cross signal handler stack frames are pushed on to the stack by
the kernel when it runs a signal handler on a thread. As there’s no
API to learn about the structure of these frames, there’s no way to
backtrace across one of these frames in isolation. I’m happy to go
into details but it’s really not relevant to this discussion. If
you’re interested, start a new thread in and we can chat there.
Reading Memory
A signal handler must be very careful about the memory it touches,
because the contents of that memory might have been corrupted by the
crash that triggered the signal. My general rule here is that the
signal handler can safely access:
Its code
Its stack
Its arguments
Immutable global state
In the last point, I’m using immutable to mean immutable after
startup. I think it’s reasonable to set up some global state when
the process starts, before installing your signal handler, and then
rely on it in your signal handler.
Changing any global state after the signal handler is installed is
dangerous, and if you need to do that you must be careful to ensure
that your signal handler sees a consistent state, even though a crash
might occur halfway through your change.
Note that you can’t protect this global state with a mutex because
mutexes are not async signal safe (and even if they were you’d
deadlock if the mutex was held by the thread that crashed). You
should be able to use atomic operations for this, but atomic
operations are notoriously hard to use correctly (if I had a dollar
for every time I’ve pointed out to a developer they’re using atomic
operations incorrectly, I’d be very badly paid (-: but that’s still a
lot of developers!).
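One possible shape for the atomic approach mentioned above is a two-slot scheme, sketched here in C11. Everything in it is an assumption for illustration (the names set_crash_context and crash_context_for_handler, the 64-byte slot size, and a single writer thread), and it relies on atomic_int being lock-free on the target platform. It is not a complete answer to the problem: two updates in quick succession could still overwrite the slot a handler on another thread is reading.

```c
#include <stdatomic.h>
#include <stddef.h>

#define CONTEXT_CAP 64

/* Two preallocated slots; the handler only ever reads the published one. */
static char context_slots[2][CONTEXT_CAP];
static atomic_int published_slot = 0;

/* Normal-thread side: fill the unpublished slot, then publish it with a
 * single atomic store. A crash halfway through the copy leaves the
 * previously published slot untouched. Assumes a single writer thread. */
void set_crash_context(const char *text)
{
    int next = 1 - atomic_load_explicit(&published_slot, memory_order_relaxed);
    size_t i = 0;
    while (text[i] != '\0' && i < CONTEXT_CAP - 1) {
        context_slots[next][i] = text[i];
        i++;
    }
    context_slots[next][i] = '\0';
    atomic_store_explicit(&published_slot, next, memory_order_release);
}

/* Handler side: one atomic load, no locks, no allocation. */
const char *crash_context_for_handler(void)
{
    int slot = atomic_load_explicit(&published_slot, memory_order_acquire);
    return context_slots[slot];
}
```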
If your signal handler reads other memory, it must take care to avoid
crashing while doing that read. There’s no BSD-level API for this
[1], so I recommend that you use vm_read.
[1] The traditional UNIX approach for doing this is to install a
signal handler to catch any memory exceptions triggered by the read,
but now we’re talking signal handling within a signal handler and
that’s just silly.
Writing Files
If you want to write a crash report from your signal handler, you
must use low-level UNIX APIs (open, write, close) because only
those low-level APIs are documented to be async signal safe. You must
also set up the path in advance because the standard APIs for
determining where to write the file (NSFileManager, for example) are
not async signal safe.
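The constraints in this section can be sketched in C as follows. This is a minimal illustration, not a complete reporter: the function names (prepare_report_file, format_hex, write_report_line) and the one-line report format are assumptions. The file is opened at startup, and the handler-side code uses only a stack buffer and write().

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Opened at startup, long before any crash; -1 means "not prepared". */
static int report_fd = -1;

/* Startup side: resolve the path and open the file while it is still safe
 * to use ordinary APIs. `path` is whatever location the app chose. */
int prepare_report_file(const char *path)
{
    report_fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    return report_fd < 0 ? -1 : 0;
}

/* Formats `value` as "0x..." into `buf` without snprintf, which is not on
 * the async signal safe list. `buf` needs room for 2 + 16 characters; no
 * terminator is written. Returns the number of characters produced. */
size_t format_hex(uint64_t value, char *buf)
{
    static const char digits[] = "0123456789abcdef";
    char tmp[16];
    size_t n = 0, len = 0;
    do {
        tmp[n++] = digits[value & 0xf];
        value >>= 4;
    } while (value != 0);
    buf[len++] = '0';
    buf[len++] = 'x';
    while (n > 0)
        buf[len++] = tmp[--n];
    return len;
}

/* Handler side: writes "<signal> <address>\n" using only write(). */
void write_report_line(int signo, uint64_t address)
{
    char line[48];
    size_t len = 0;
    if (report_fd < 0)
        return;
    len += format_hex((uint64_t)signo, line + len);
    line[len++] = ' ';
    len += format_hex(address, line + len);
    line[len++] = '\n';
    ssize_t ignored = write(report_fd, line, len);
    (void)ignored;              /* nothing useful to do on failure here */
}
```

For example, write_report_line(11, 0xdeadbeef) appends the line "0xb 0xdeadbeef" to the prepared file.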
Offline Symbolication
Do not attempt to do symbolication from your signal handler. Rather,
write enough information to your crash report to support offline
symbolication. Specifically:
The addresses to symbolicate
For each Mach-O image in the process:
The image path
The image UUID
The image load address
You can get most of the Mach-O image information using the APIs in
<mach-o/dyld.h> [1]. Be aware, however, that these APIs are not
async signal safe. You’ll need to get this information in advance and
cache it for your signal handler to record.
This is complicated by the fact that the list of Mach-O images can
change as your process loads and unloads code. This requires you to
share mutable state with your signal handler, which is exactly what I
recommend against in Reading Memory.
Note You can learn about images loading and unloading using _dyld_register_func_for_add_image
and _dyld_register_func_for_remove_image respectively.
[1] I believe you’ll need to parse the Mach-O load commands to get the
image UUID.

Is it safe to use GCD to open a socket connection on iOS, even with a complex stack?

I have an iOS app with which I was previously using performSelectorInBackground to open a persistent socket connection. However, according to this answer, one should never use performSelectorInBackground.
The alternative, GCD and NSOperationQueue, both appear to persist the full stack frame, including 4! separate context switches, when I pause my app in the debugger. Therefore, if I open a socket connection inside a particular deep stack, that stack will persist for the lifetime of my app.
Are there any downsides to this? Here is what my stack looks like when I call my own internal beginSocketConnection:
Before, my call to beginSocketConnection was always (basically) at the top of a very simple stack because it had been invoked by performSelectorInBackground. That looked nicer/cleaner/better to me, so I'm not sure if this new approach is a bad thing.
First off, I agree with the other answer: Don't use -performSelectorInBackground:withObject:. Ever.
Next... you said:
The alternative, GCD and NSOperationQueue, both appear to persist the
full stack frame, including 4! separate context switches, when I pause
my app in the debugger. Therefore, if I open a socket connection
inside a particular deep stack, that stack will persist for the
lifetime of my app.
Are there any downsides to this?
Short answer? No. There are no downsides to this.
Long answer:
You appear to be mistaking debugging features for reality. In your screen shot there, those dotted lines aren't "context switches" (for any definition of 'context switch' that I'm aware of.) Xcode is eliding frames from the stack that it believes are not of interest to you (i.e. aren't in your code, or crossing into/out of your code.) You can toggle this behavior with the leftmost control in the bottom bar.
The "Enqueued from" stack traces are historical information aimed at helping you understand what caused the current (i.e. top) stack trace to happen. They're not execution states that still exist. They're just debugging information. You can safely assume that there's no "cost" to you associated with this.
The notion that opening a socket somehow causes a stack to "persist for the lifetime of [your] app" also doesn't make any sense. A socket is just a data structure in memory with some associated behavior provided by the OS. You're thinking about this too much.
Lastly, do yourself a favor, and don't choose APIs based on what makes your debugger stack traces look "nicer/cleaner/better."

How do I debug my program when it hangs?

I have an application which takes measurements every second (I am running it in demo mode & generating random data, so the problem is not to do with reading from devices attached to the serial port).
After 5 or 6 minutes it hangs.
I have added
try
  // entire body of procedure/function goes here
except
  on E: Exception do
  begin
    MessageDlg('Internal coding error in <function name>()',
      mtError, [mbOK], 0);
  end;
end;
to every single function (and Application.Run() in the project file), but I don't see any message dialogs. Any idea how I can test this?
Update: I would guess it to be a resource problem, either RAM or the MySQL database, but other programs are running OK, and only 5 floats and a timestamp get saved with each measurement, so both seem unlikely after such a short time.
Solution: there were many great answers (thanks and +1 all round), but I finally got it (as suggested) by running in the IDE and using Run/Pause to see that it was an ever increasing loop.
Thanks again, everyone.
I'd try the following:
Attach the debugger, click Pause, and see where execution is, which threads are running, and which are waiting (if all of them are waiting, you have a deadlock).
Refactor your main method into a bunch of small ones (you may have already done that), then replace the small ones with dummy or hardcoded values. This may help, but will not necessarily identify the faulty block.
Watch system resource consumption (handles, threads, etc.) with PerfMon or a similar tool. See whether you are running out of memory and starting to swap to disk.
If you are using sockets, check whether your read timeout is set to infinity. If so, change it to a finite value and watch for timeouts.
In .NET it's possible to break on all exceptions, which means the IDE stops at the point where the exception is thrown, before the code handles it (for example in a catch statement). Enable the equivalent in Delphi if possible and see whether you are getting any.
Use the debugger to find out what your app is doing when it appears to hang.
Run your program under the debugger. Let it run until it hangs. When it hangs, switch to the debugger and select "Debug: Break All" (or equivalent) to make the debugger freeze all threads and take control of the process.
Now open the Threads view in the debugger and examine the stack traces of each thread in your program. Start with the main thread, obviously. Be sure to look back through the call stack for several calls to see if you recognize any of your code. If you find a stack trace that is in the middle of a loop, examine the local variables to see if your loop control variable has somehow slipped past the exit condition and put you into an infinite loop.
If your stack traces indicate that every thread is blocked waiting for some external event, you may have a thread deadlock - thread A taking lock A, then trying to take lock B, while thread B is holding lock B and trying to take lock A. If you're not using threads in your program, this is less likely but still keep an eye out for it.
If nothing odd jumps out at you after reviewing the stack traces of the threads, let the program run a few seconds more, then break into it again with the debugger and have another look around. See what's different in the stack traces.
This should help you narrow down at least which bodies of code are involved in your hang.
If you don't see an exception before adding the exception handlers, you won't produce any MessageDlgs to see.
If the program is hanging (rather than blowing up with an exception) you could have a looping problem, or you could have some blocking call that's never completing. Write log messages to a window or file (you can use OutputDebugString) to isolate the problem to one section of your code, at least to the body of one function. You may see the problem right away. If not, you can use OutputDebugString, breakpoints, and trace to learn what's happening in that section of code on a line-by-line basis.
First: try to debug it in Delphi IDE.
Second: if you can't do this (on customer PC), try the "Process Stack Viewer" of my (open source) sampling profiler:
http://code.google.com/p/asmprofiler/wiki/ProcessStackViewer
(you need some debug info: a .map or .jdbg file)
Then look at the stack of your threads (probably main/first thread). You can post the stack trace here then (if you can't find the problem).
If you want help with this sort of thing in an app that you cannot use the IDE for, then something like madExcept can help a lot. It has an automatic freeze checker for the main thread, and you can have it give you a stack dump to show what it was doing when it was frozen. The user can choose to kill or continue, and the app can tell madExcept it is busy and not to alert if appropriate (for doing long analysis or printing or something).

Resources