Delphi obtain stack trace after exception

I'm trying to figure out how to obtain a stack trace after an exception is thrown in Delphi. However, when I try to read the stack in the Application.OnException event using the function below, the stack already seems to be flushed and replaced by the throwing procedures.
function GetStackReport: AnsiString;
var
  retaddr, walker: ^pointer;
begin
  // ...
  // History of stack, ignore esp frame
  asm
    mov walker, ebp
  end;
  // assume return address is present above ebp
  while Cardinal(walker^) <> 0 do begin
    retaddr := walker;
    Inc(retaddr);
    result := result + AddressInfo(Cardinal(retaddr^));
    walker := walker^;
  end;
end;
Here's what kind of results I'm getting:
001A63E3: TApplication.HandleException (Forms)
00129072: StdWndProc (Classes)
001A60B0: TApplication.ProcessMessage (Forms)
That's obviously not what I'm looking for, although it's correct. I'd like to retrieve the stack as it was just before the exception was thrown, or in other words the contents before (after would do too) the OnException call.
Is there any way to do that?
I am aware that I'm reinventing the wheel, because the folks over at madExcept/Eurekalog/jclDebug have already done this, but I'd like to know how it's done.

It is not possible to manually obtain a viable stack trace from inside the OnException event. As you have already noticed, the stack at the time of the error is already gone by the time that event is triggered. What you are looking for requires obtaining the stack trace at the time the exception is raised. Third-party exception loggers, like MadExcept, EurekaLog, etc., handle those details for you by hooking into key functions and core exception handlers inside the RTL itself.
In recent Delphi versions, the SysUtils.Exception class does have public StackTrace and StackInfo properties now, which would be useful in the OnException event except for the fact that Embarcadero has chosen NOT to implement those properties natively for unknown reasons. It requires third-party exception loggers to assign handlers to various callbacks exposed by the Exception class to generate stack trace data for the properties. But if you have JclDebug installed, for instance, then you could provide your own callback handlers in your own code that use JCL's stack tracing functions to generate the stack data for the properties.

I'd like to retrieve the stack as it was just before the
exception was thrown, or in other words the contents before
(after would do too) the OnException call.
Actually, you don't want the stack before the OnException call. That's what you've already got. You want the stack at the point at which the exception was raised. And that requires the stack tracing to happen ASAP after the raise. It's too late in the OnException call because the exception has propagated all the way to the top-level handler.
madExcept works by hooking all the RTL functions that handle exceptions, and it hooks the lowest-level functions. This takes some serious effort to bring about. With these routines hooked, the code can capture stack traces and so on. Note that the hooking is version specific and requires reverse engineering of the RTL.
What's more, the stack walking is very much more advanced than your basic code. I don't mean that in a derogatory way; it's just that stack walking on x86 is a tricky business and the madExcept code is very well honed.
That's the basic idea. If you want to learn more then you can obtain the source code of JclDebug for free. Or buy madExcept and get its source.

Related

How does Delphi's try...except work for sub-procedures? Does exception handling work for sub-procedures?

I do not quite understand this, and I could not find the answer to the question that is bothering me. Can a try..except block catch an exception raised in a sub-procedure, and is that exception passed up to it?
Let's say that I have this code:
try
  ProcedureA;
except
  on E : Exception do
    ...
end;
and code for ProcedureA
procedure ProcedureA;
begin
  SubProcedureA;
  SubProcedureB;
  SubProcedureC;
  ...
end;
If SubProcedureB raises an exception, will the exception be handled at the main ProcedureA level? Will SubProcedureC be performed? Will the exception be forwarded to procedure A unchanged? Or maybe there is a restriction on sub-procedures, for example, a sub-sub-sub-procedure will no longer pass an exception to the higher-level procedure?
Thank you for the information and I apologize if this is a beginner question (which I am). :)
If SubProcedureB raises an exception, will the exception be handled at the main ProcedureA level?
Yes. When an exception is raised, it propagates up the call stack until a matching handler catches it. If no handler catches it, then the process will usually terminate.
Will SubProcedureC be performed?
Usually no. However, on Windows at least, it is possible (but not with Delphi's except syntax) for an exception handler to instruct the system to return to the original site that raised the exception. This is useful in rare cases where an exception handler can actually fix the condition that caused the exception to be raised in the first place, allowing execution to continue from where it left off. But again, this is very rare.
Will the exception be forwarded to procedure A unchanged?
Usually yes. There is only 1 Exception object in memory, and it is passed to each exception handler on the call stack until a matching handler is found. That being said, it is possible for an exception handler to catch an exception, modify it (it is just an object in memory, after all), and then re-raise it to continue the search up the call stack for another handler. That is not the case in your example, but it is allowed.
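The catch, modify, and re-raise case mentioned above can be sketched roughly like this. It is only an illustration: LoadSettings, OuterMessage and the message text are made-up names, not anything from the question.

```pascal
program ReRaiseSketch;
{$APPTYPE CONSOLE}
uses
  SysUtils;

// Hypothetical inner routine: catches the exception, adds context to the
// same object, and re-raises it with a bare "raise".
procedure LoadSettings;
begin
  try
    raise Exception.Create('file not found');
  except
    on E: Exception do
    begin
      E.Message := 'LoadSettings: ' + E.Message; // modify the one object
      raise;                                     // continue the search upward
    end;
  end;
end;

// Hypothetical outer handler: receives the same, now modified, object.
function OuterMessage: string;
begin
  Result := '';
  try
    LoadSettings;
  except
    on E: Exception do
      Result := E.Message;
  end;
end;

begin
  Writeln(OuterMessage); // LoadSettings: file not found
end.
```

A bare raise inside the handler re-raises the current exception object rather than creating a new one, which is why the outer handler sees the edited message.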
Or maybe there is a restriction on sub-procedures, for example, Sub-sub-sub-procedure will no longer pass an exception to the higher-level procedure?
There is no such restriction.
A try..except block catches exceptions raised at any call depth beneath it. The exception propagates upward until something handles it.
In a VCL application, the top-level handler is the Application.OnException event.
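A small console program along the lines of the question shows the behaviour described in the answers above. The Log variable and RunDemo function are additions for the demo; the procedure names follow the question.

```pascal
program PropagationSketch;
{$APPTYPE CONSOLE}
uses
  SysUtils;

var
  Log: string; // records which routines actually ran

procedure SubProcedureA;
begin
  Log := Log + 'A;';
end;

procedure SubProcedureB;
begin
  Log := Log + 'B;';
  raise Exception.Create('raised in SubProcedureB');
end;

procedure SubProcedureC;
begin
  Log := Log + 'C;';
end;

procedure ProcedureA;
begin
  SubProcedureA;
  SubProcedureB;
  SubProcedureC; // skipped: the exception leaves ProcedureA before this line
end;

function RunDemo: string;
begin
  Log := '';
  try
    ProcedureA;
  except
    on E: Exception do
      Log := Log + 'caught: ' + E.Message;
  end;
  Result := Log;
end;

begin
  Writeln(RunDemo); // A;B;caught: raised in SubProcedureB
end.
```

SubProcedureC never appears in the log, and the handler around ProcedureA receives the exception raised two levels down with its message intact.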

Creating NonContinuable exception in delphi

I have an exception whose raise statement causes a stack overflow. I read this article to find out what I should do: http://www.debuggingexperts.com/modeling-exception-handling
As I understand it, exception 0xc0000025 (EXCEPTION_NONCONTINUABLE_EXCEPTION) means an attempt to catch an exception that is forbidden to be caught. Am I right?
If so, I would like to know what causes an exception to be defined as non-continuable. The exception is defined in Pascal and derives from the Exception class.
In addition, I could not find where this exception is handled, so I added a try-catch block myself. The exception was caught successfully. Why?
EDIT
I want to explain the specific situation I need help with:
There is C++ code that calls Pascal code. The Pascal code contains the exception definition, and the raise happens there.
Before I put the try-catch block in the C++ code, the raise in Pascal caused EXCEPTION_NONCONTINUABLE_EXCEPTION 1000 times until the stack overflowed.
After I added the try-catch block in the C++ code, the raise in the Pascal code returned to the catch block in the C++ code.
Now I have 2 questions:
Why didn't the process stop on the first NONCONTINUABLE exception?
Why didn't the catch block in the C++ code cause this exception?
You are correct that EXCEPTION_NONCONTINUABLE_EXCEPTION means the program attempted to continue from an exception that isn't continuable. However, it's not possible to define such an exception in Delphi, so the source of your problem is elsewhere.
Consider debugging the creation, raising, catching, and destruction of your custom exception type. If there are external libraries involved in your program, particularly any written in something other than Delphi, make sure they either know what to do with external exceptions, or are shielded entirely from exceptions.
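One way to shield non-Delphi callers entirely is to stop every exception at the boundary and report failure through a return code instead. The sketch below only illustrates that idea: EMyError, DoWork, DoWorkShielded and the result constants are hypothetical names, and a real DLL would export the shielded function rather than call it from a console program.

```pascal
program BoundaryShieldSketch;
{$APPTYPE CONSOLE}
uses
  SysUtils;

type
  EMyError = class(Exception); // stand-in for the custom exception type

const
  RESULT_OK = 0;
  RESULT_FAILED = 1;

// Ordinary Delphi code that may raise.
procedure DoWork(Value: Integer);
begin
  if Value < 0 then
    raise EMyError.Create('negative value');
end;

// The entry point a C++ caller would use: plain types, stdcall, and a
// catch-all handler so that no Delphi exception crosses the boundary.
function DoWorkShielded(Value: Integer): Integer; stdcall;
begin
  try
    DoWork(Value);
    Result := RESULT_OK;
  except
    Result := RESULT_FAILED; // translate the exception into an error code
  end;
end;

begin
  Writeln(DoWorkShielded(1));  // 0
  Writeln(DoWorkShielded(-1)); // 1
end.
```

With this shape the C++ side only ever sees an integer and never has to unwind through Pascal frames.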

How to debug a (possible) RTL problem?

I'm asking this because I'm out of good ideas...hoping for someone else's fresh perspective.
I have a user running our 32-bit Delphi application (compiled with BDS 2006) on a Windows 7 64-bit system. Our software was "working fine" until a couple weeks ago. Now suddenly it isn't: it throws an Access Violation while initializing (instancing objects).
We've had him reinstall all our software--starting all over from scratch. Same AV error. We disabled his anti-virus software; same error.
Our stack tracing code (madExcept) for some reason wasn't able to provide a stack trace to the line of the error, so we've sent a couple error logging versions for the user to install and run, to isolate the line which generates the error...
Turns out, it's a line which instances a simple TStringList descendant (there's no overridden Create constructor, etc.--basically the Create is just instancing a TStringList which has a few custom methods associated with the descendant class.)
I'm tempted to send the user yet another test .EXE; one which just instances a plain-vanilla TStringList, to see what happens. But at this point I feel like I'm flailing at windmills, and risk wearing out the user's patience if I send too many more "things to try".
Any fresh ideas on a better approach to debugging this user's problem? (I don't like bailing out on a user's problems...those tend to be the ones which, if ignored, suddenly become an epidemic that 5 other users suddenly "find".)
EDIT, as Lasse requested:
procedure T_fmMain.AfterConstruction;
begin
  inherited;
  //Logging shows that we return from the Inherited call above,
  //then AV in the following line...
  FActionList := TAActionList.Create;
  ...other code here...
end;
And here's the definition of the object being created...
type
  TAActionList = class(TStringList)
  private
    FShadowList: TStringList; //UPPERCASE shadow list
    FIsDataLoaded : boolean;
  public
    procedure AfterConstruction; override;
    procedure BeforeDestruction; override;
    procedure DataLoaded;
    function Add(const S: string): Integer; override;
    procedure Delete(Index : integer); override;
    function IndexOf(const S : string) : Integer; override;
  end;
implementation
procedure TAActionList.AfterConstruction;
begin
  Sorted := False; //until we're done loading
  FShadowList := TStringList.Create;
end;
I hate this kind of problem, but I reckon you should focus on what happens shortly BEFORE the object gets constructed.
The symptoms you describe sound like typical heap corruption, so maybe you have something like...
An array being written to outside bounds? (turn bounds checking on, if you have it off)
Code trying to access an object which has been deleted?
Since my answer above, you've posted code snippets. This does raise a couple of possible issues that I can see.
a: AfterConstruction vs. modified constructor:
As others have mentioned, using AfterConstruction in this way is at best not idiomatic. I don't think it's truly "wrong", but it's a possible smell. There's a good intro to these methods on Dr. Bob's site here.
b: overridden methods Add, Delete, IndexOf
I'm guessing these methods use the FShadowList item in some way. Is it remotely possible that these methods are being invoked (and thus using FShadowList) before FShadowList is created? This seems possible because you're using the AfterConstruction methods above, by which time virtual methods should 'work'. Hopefully this is easy to check with a debugger by setting some breakpoints and seeing the order they get hit in.
You should never override AfterConstruction and BeforeDestruction methods in your programs. They are not meant for what you're doing with them, but for low-level VCL hacking (like reference adding, custom memory handling or such).
You should override the Create constructor and Destroy destructor instead and put your initialization code there, like so:
constructor TAActionList.Create;
begin
  inherited;
  // Sorted := False; // not necessary IMHO
  FShadowList := TStringList.Create;
end;
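For completeness, the matching class declaration and destructor could look roughly like the sketch below. It assumes TStringList's parameterless constructor is not virtual (so no override directive on Create), and ShadowCount and CreateAndProbe are hypothetical helpers added only so the demo has something to print.

```pascal
program CtorDtorSketch;
{$APPTYPE CONSOLE}
uses
  Classes;

type
  TAActionList = class(TStringList)
  private
    FShadowList: TStringList;
  public
    constructor Create;            // assumed non-virtual in the ancestor
    destructor Destroy; override;  // TObject.Destroy is virtual
    function ShadowCount: Integer; // hypothetical helper for the demo
  end;

constructor TAActionList.Create;
begin
  inherited Create;
  FShadowList := TStringList.Create;
end;

destructor TAActionList.Destroy;
begin
  FShadowList.Free; // Free is safe even if the constructor failed early
  inherited Destroy;
end;

function TAActionList.ShadowCount: Integer;
begin
  Result := FShadowList.Count;
end;

// Creates an instance, reads from the shadow list, and frees it again.
function CreateAndProbe: Integer;
var
  List: TAActionList;
begin
  List := TAActionList.Create;
  try
    Result := List.ShadowCount;
  finally
    List.Free;
  end;
end;

begin
  Writeln(CreateAndProbe); // 0
end.
```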
Take a look at the VCL code, and all serious published Delphi code, and you'll see that AfterConstruction and BeforeDestruction methods are never used. I guess this is the root cause of your problem, and your code should be modified accordingly. It could be even worse in future versions of Delphi.
Clearly there is nothing suspicious about what TAActionList is doing at time of construction. Even taking the ancestor constructors and the possible side effects of setting Sorted := False into account, there shouldn't be a problem. I'm more interested in what's happening inside T_fmMain.
Basically something is happening that causes FActionList := TAActionList.Create; to fail, even though there is nothing wrong in the implementation of TAActionList.Create (a possibility is that the form may have been unexpectedly destroyed).
I suggest you try changing T_fmMain.AfterConstruction as follows:
procedure T_fmMain.AfterConstruction;
begin
  //This is safe because the object created has no form dependencies
  //that might otherwise need to be initialised first.
  FActionList := TAActionList.Create;
  //Now, if the ancestor's AfterConstruction is causing the problem,
  //the above line will work fine, and...
  inherited AfterConstruction;
  //... your error will have shifted to one of these lines here.
  //other code here
end;
If an environment issue with a component used by your form is causing it to destroy the form during AfterConstruction, then it's the assignment of the new TAActionList.Create instance to FActionList that's actually causing the AV. Another way to test would be to first create the object into a local variable, then assign it to the class field: FActionList := LActionList.
Environment problems can be subtle. E.g. We use a reporting component which we discovered requires that a printer driver is installed, otherwise it prevents our application from starting up.
You can confirm the destruction theory by setting a global variable in the form's destructor. Also you may be able to output a stack trace from the destructor to confirm the exact sequence leading to the destruction of the form.
Our software was "working fine" until a couple weeks ago... suddenly become an epidemic that 5 other users suddenly "find".
Sounds like you need to do some forensic analysis, not debugging: You need to discover what changed in that user's environment to trigger the error. All the more so if you have other users with the same deployment that don't have the problem (sounds like that's your situation). Sending a user 'things to try' is one of the best ways to erode user confidence very quickly! (If there is IT support at the user site, get them involved, not the user).
For starters, explore these options:
*) If possible, I'd check the Windows Event Log for events that may have occurred on that machine around the time the problem arose.
*) Is there some kind of IT support person on the user's side that you can talk to about possible changes/problems in that user's environment?
*) Was there some kind of support issue/incident with that user around the time the error surfaced that may be connected to it, and/or caused some kind of data or file corruption particular to them?
(As for the code itself, I agree with @Warran P about decoupling etc.)
Things to do when MadExcept is NOT Enough (which is rare, I must say):
Try Jedi JCL's JCLDEBUG instead. You might get a stack traceback with it, if you change out MadExcept for JCLDEBUG, and write directly the stack trace to the disk without ANY UI interaction.
Run a debug-viewer like MS/SysInternals debugview, and trace output things like the Self pointers of the objects where the problems are happening. I suspect that somehow an INVALID instance pointer is ending up in there.
Decouple things and refactor things, and write unit tests, until you find the really ugly thing that's trashing you. (Someone suggested heap corruption. I often find heap corruption goes hand in hand with unsafe ugly untested code, and deeply bound UI+model cascading failures.)

Guard page exceptions in Delphi?

There is a post by Raymond Chen, where he tells how bad IsBadXxxPtr function is by eating guard page exception.
I don't quite understand how it is applied to Delphi. Who and how should normally (i.e. without call to IsBadXxxPtr) process this exception?
I do know that Delphi inserts a code, which (for example) access a memory for large static arrays - exactly for this reason: to expand stack.
But if guard page exception is raised: who will handle it in a Delphi application? Can't I accidentally mess with it by using try/except in inappropriate way? Will Delphi's debugger notify me about these exceptions?
Windows structured exception handling (SEH) has a two-phase structure. When an exception occurs, Windows first looks for a handler for the exception by following the registered exception handler chain (the head of which is stored in fs:[0] on x86, i.e. the first dword in the segment pointed to by the FS segment register - all that ugly 16-bit segment-offset logic didn't go away in 32-bit, it just became less relevant).
The search is done by calling a function with a particular flag, a pointer to which is stored in each exception frame on the stack. fs:[0] points to the topmost frame. Each frame points to the previous frame. Ultimately, the last frame on the list is one that has been provided by the OS (this handler will pop up an app-crash dialog if an unhandled exception reaches it).
These functions normally check the type of the exception, and return a code to indicate what to do. One of the codes that can be returned is basically, "ignore this exception and continue". If Windows sees this, it will reset the instruction pointer to the point of the exception and resume execution. Another code indicates that this exception frame should handle the given exception. A third code is "I'm not going to catch this exception, keep searching". Windows keeps on calling these exception filter functions until it finds one that handles the exception one way or the other.
If Windows finds one that handles the exception by catching it, then it will proceed to unwind the stack back to that handler, which consists of calling all the functions again, only passing in a different flag. It's at this point that the functions execute the finally logic, up until the handler which executes the except logic.
However, with the stack page guard exception, the process is different. None of the language's exception handlers will elect to handle this exception, because otherwise the stack growth mechanism would break. Instead, the filter search filters all the way through to the base exception handler provided by the OS, which grows the stack allocation by committing the appropriate memory, and then returns the appropriate return code to indicate that the OS should continue where it left off, rather than unwind the stack.
The tools and debugging infrastructure are designed to let these particular exceptions play out correctly, so you don't need to worry about handling them.
You can read more about SEH in Matt Pietrek's excellent article in MSJ from over a decade ago.
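The second phase described above, where finally logic runs during the unwind and the except logic runs last, can be seen from plain Delphi code. This is only a small demonstration; UnwindOrder is a made-up name.

```pascal
program UnwindOrderSketch;
{$APPTYPE CONSOLE}
uses
  SysUtils;

// Returns the order in which the handlers ran for one raised exception.
function UnwindOrder: string;
begin
  Result := '';
  try
    try
      raise Exception.Create('demo');
    finally
      Result := Result + 'finally;'; // runs while the stack unwinds
    end;
  except
    on E: Exception do
      Result := Result + 'except;';  // runs once the unwind reaches it
  end;
end;

begin
  Writeln(UnwindOrder); // finally;except;
end.
```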
From looking at the comments, it looks to me like the "guard page exception" mess takes place entirely within the kernel, and is not something that you need to be worrying about from user space.
You've gotta remember that this article was written for C++, which is nowhere near as advanced as Delphi on the memory management front. The uninitialized pointers issue is a lot less of a mess in Delphi than in C/C++ for two reasons:
Delphi checks for uninitialized variables at compile time, which (for whatever reason) a lot of C compilers tend to have trouble with.
Delphi zero-initializes the fields of every newly created object instance (raw blocks from GetMem are not cleared), so object fields don't hold random heap garbage that might look like a good pointer when it's really not. This means that most bad pointers give you access violations, which are easy to debug, instead of silently failing and corrupting memory.

Hooking a Stacktrace in Delphi 2009

The Exception class in Delphi 2009 received a number of new features. A number of them are related to getting a stacktrace:
property StackTrace: string read GetStackTrace;
property StackInfo: Pointer read FStackInfo;
class var GetExceptionStackInfoProc: function (P: PExceptionRecord): Pointer;
class var GetStackInfoStringProc: function (Info: Pointer): string;
class var CleanUpStackInfoProc: procedure (Info: Pointer);
Has anyone used these to obtain a stack trace yet? Yeah, I know there are other ways to get a stack trace, but if it is supported natively in the Exception class I would rather leverage that.
Update: There is an interesting blog post about this. It covers the topic in a lot of depth.
To me this looks like a framework where you can plug in your own stack tracing. I guess this might be used internally in the IDE with JCLDebug. Perhaps it's intended for users to be able to supply madExcept or another implementation.
No, I haven't used them yet (currently using madExcept for that, and also did some experiments with JclDebug) - but thanks for the tip!
TOndrej is correct. The new features added to Exception class are for third-parties to hook their own stack tracing code into the RTL. The default implementation of the Exception class does not produce its own stack traces.
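As a rough idea of what hooking your own stack tracing code in can look like, the sketch below assigns the three class vars listed in the question. Several things here are assumptions rather than anything from the thread: the snapshot record, the handler names, and the use of the Windows RtlCaptureStackBackTrace function (exported by kernel32.dll) to collect raw return addresses. It prints addresses only. Turning them into unit and procedure names needs debug information, which is what JclDebug or madExcept add. On 32-bit builds the frame walk may also come back short if the code has no stack frames.

```pascal
program StackInfoSketch;
{$APPTYPE CONSOLE}
uses
  Windows, SysUtils;

const
  MaxFrames = 32;

type
  // Hypothetical container for the captured return addresses.
  PStackSnapshot = ^TStackSnapshot;
  TStackSnapshot = record
    Count: Word;
    Frames: array[0..MaxFrames - 1] of Pointer;
  end;

function RtlCaptureStackBackTrace(FramesToSkip, FramesToCapture: Cardinal;
  BackTrace: Pointer; BackTraceHash: PCardinal): Word; stdcall;
  external 'kernel32.dll' name 'RtlCaptureStackBackTrace';

// Called by the RTL while the exception is being raised.
function CaptureStackInfo(P: PExceptionRecord): Pointer;
var
  Snapshot: PStackSnapshot;
begin
  New(Snapshot);
  Snapshot^.Count := RtlCaptureStackBackTrace(0, MaxFrames,
    @Snapshot^.Frames[0], nil);
  Result := Snapshot;
end;

// Called when someone reads Exception.StackTrace.
function StackInfoToString(Info: Pointer): string;
var
  Snapshot: PStackSnapshot;
  I: Integer;
begin
  Result := '';
  Snapshot := Info;
  if Snapshot = nil then
    Exit;
  for I := 0 to Snapshot^.Count - 1 do
    Result := Result + Format('%p', [Snapshot^.Frames[I]]) + sLineBreak;
end;

// Called when the exception object is destroyed.
procedure FreeStackInfo(Info: Pointer);
begin
  if Info <> nil then
    Dispose(PStackSnapshot(Info));
end;

procedure InstallStackHooks;
begin
  Exception.GetExceptionStackInfoProc := CaptureStackInfo;
  Exception.GetStackInfoStringProc := StackInfoToString;
  Exception.CleanUpStackInfoProc := FreeStackInfo;
end;

begin
  InstallStackHooks;
  try
    raise Exception.Create('demo');
  except
    on E: Exception do
    begin
      Writeln(E.Message);
      Writeln(E.StackTrace); // raw addresses captured at raise time
    end;
  end;
end.
```

Because the capture callback runs before the stack is unwound, the addresses should still include the frames of the raising code, which is exactly what is lost by the time Application.OnException fires.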
You would be well advised to look at MadExcept. Not only does it provide excellent handling of any unhandled exceptions (screen grab, email etc) but it has a nice set of callable routines to hand you back a stack trace that you can use almost anywhere.
Bri
