How to debug the CPU tab in Delphi?

I'm using Delphi 11. Sometimes an exception happens, and instead of Delphi going to the unit that threw the exception, it goes to this CPU screen, and I don't know what to do with it.
What can I do in such situations?

This kind of screen typically means there is no code located at the memory address where the CPU tried to execute code from. Most likely, a class method was called on an object via an invalid but non-nil pointer, or a function was called via a similarly invalid function pointer.
You won't be able to debug the location where the exception was raised from, since it is an invalid memory address to begin with. But you should still have access to the Call Stack trace of function calls that led up to the exception. So, just walk back through the Call Stack until you reenter valid code, and then debug from there to find the invalid pointer.

There might be better answers but I think you might look at EurekaLog or madExcept. Either works great. Never quite understood why this is not built-in to Delphi.

Related

Getting Access Violation 00000000 after a call to an OS function failed

We have been trying to solve this problem which is causing our program to crash. However, we haven't been able to reproduce the crash in house.
The call stack that is coming from the client's machine is on the link here:
It doesn't seem to have any reference to any of the files in our project, so we're a bit lost as to where to look for a solution.
Could this be an environmental issue? The clients that are getting this problem are using Windows 7 SP1 and Windows Server 2003. Sometimes, just prior to the crash, customers have reported getting 'A call to an OS function failed' error messages. Can this be related? Based on the call stack, can anyone make sense of what it is trying to do?
[Update] The call stack came from EurekaLog. Also I attach below the call stack from the 'A call to an OS function failed' error, that the customer is experiencing as well. This seems to be related to the AV error that the customer is getting but we are not sure. http://postimage.org/image/jku5dlnuf/
Based on the portion of the stack trace in your image, it's impossible to tell. The stack trace is mostly showing Windows API internal functions from the kernel DLLs.
An exception with an address of all zeros indicates a nil pointer (for example, an object being used before it's created), but there's no way to tell where it's happening from that stack trace.
You should look at adding an exception handling product like MadExcept or EurekaLog to your application, which would give you a usable stack trace and more error information. Both are relatively inexpensive, especially when compared to the time spent trying to track down this type of error without them. (My own experience is with MadExcept, but I'm not affiliated with either of them.)

Is it possible to use JclDebug to get the parameter values from the method that raised the exception?

I use the function 'JclGetExceptStackList' to log the call stack of the raised exceptions.
I would also like, if possible, to log the parameter values of the method that raised the exception.
I don't know if it is possible to do that with JclDebug, or whether there is any other way to do it.
Can anyone help me?
Thanks!
While it's not possible to do it with JclDebug, and while doing it by hand might be a lot of work, you might find that any logging tool, including Log4D, CodeSite, or even OutputDebugString, could do it with less work. In effect, you insert code that logs your parameter values wherever you choose to add such logging. You can also log a special marker when an exception is raised, allowing you to reconstruct the exact scenario that led to your crash.
But since CodeSite doesn't do it only when an exception is going to happen (how would it?), this isn't exactly what you wanted. I do find, however, that a reasonable trace log combined with the JCL stack traceback (or madExcept or EurekaLog) is more than enough, and that if I really need to know all the parameter values involved in a call, I should go back and add more trace messages.
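As a rough illustration of that kind of hand-written trace logging, here is a minimal sketch. The routine ProcessOrder and the helper Trace are hypothetical names invented for this example; they are not part of JclDebug, CodeSite, or Log4D. The idea is simply to log the parameters on entry and a marker when an exception passes through, so the two can be matched up in the log afterwards.

```pascal
program TraceParamsDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  {$IFDEF MSWINDOWS}Windows,{$ENDIF} // Winapi.Windows if unit scope names are not configured
  SysUtils;

// Hypothetical helper: sends one trace line to the debugger output,
// where a viewer such as DebugView can capture it.
procedure Trace(const Msg: string);
begin
  {$IFDEF MSWINDOWS}
  OutputDebugString(PChar(Msg));
  {$ELSE}
  WriteLn(ErrOutput, Msg); // fallback for non-Windows targets
  {$ENDIF}
end;

// Hypothetical routine: logs its parameters on entry and a marker on failure.
function ProcessOrder(OrderId: Integer; const Customer: string): Boolean;
begin
  Trace(Format('ProcessOrder enter: OrderId=%d Customer="%s"', [OrderId, Customer]));
  try
    if OrderId <= 0 then
      raise Exception.Create('OrderId must be positive');
    Result := True;
  except
    on E: Exception do
    begin
      // Marker that ties the parameters logged above to the failure.
      Trace(Format('ProcessOrder EXCEPTION %s: %s', [E.ClassName, E.Message]));
      raise; // re-raise so the normal handlers and stack tracer still see it
    end;
  end;
end;

begin
  ProcessOrder(42, 'ACME');
  try
    ProcessOrder(0, 'ACME');
  except
    on E: Exception do
      WriteLn('Caught: ', E.Message);
  end;
end.
```

The re-raise keeps the exception visible to whatever stack tracer is installed, so the trace log only supplements the call stack.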
That's not possible in general because parameters are typically passed in registers, which are overwritten each time a procedure call is made.

How do I debug an Access violation in the field?

An application in the field is getting this message intermittently:
I am not able to reproduce this on my machine. I have also traced what I believe is the relevant code and can't find any access to uninitialized objects.
I've never had to deal with this kind of problem.
I did a build with madExcept and unfortunately the program does not crash once it is bundled.
Any opinions on madExcept vs EurekaLog for finding this kind of thing? I've never used FastMM. Would it be useful in this situation? (Delphi 2010) Any suggested flags to set in FastMM? Any other recommendations?
Note the very low address you are attempting to read. This sort of error almost certainly means you attempted to dereference a nil pointer even if you can't find one.
Given your description of the behavior I would suspect you've got a memory stomp going on--something is blasting a zero on top of the pointer to an object. When you change things you move things around and the stomp moves to someplace harmless.
Turn on both range checking and overflow checking.
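For reference, a small sketch of what those two checks do, using the {$R+} and {$Q+} compiler directives (the same options can be set in the project's compiler settings). The function names StoreAt and TryAdd are made up for this example. With the checks on, an out-of-bounds write or an integer wrap-around raises an exception at the faulty line instead of quietly corrupting memory.

```pascal
program ChecksDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
{$R+} // range checking: array and subrange accesses are validated at run time
{$Q+} // overflow checking: integer arithmetic is validated at run time

uses
  SysUtils;

// Returns True if Index is a valid slot. With {$R+}, an out-of-range Index
// raises ERangeError instead of silently overwriting neighbouring memory.
function StoreAt(Index: Integer): Boolean;
var
  Buf: array[0..3] of Byte;
begin
  FillChar(Buf, SizeOf(Buf), 0);
  try
    Buf[Index] := 1;
    Result := Buf[Index] = 1;
  except
    on ERangeError do
      Result := False;
  end;
end;

// Returns True if A + B fits in an Integer. With the checks above, a
// wrap-around raises an EIntError descendant such as EIntOverflow.
function TryAdd(A, B: Integer; out Sum: Integer): Boolean;
begin
  try
    Sum := A + B;
    Result := True;
  except
    on EIntError do
    begin
      Sum := 0;
      Result := False;
    end;
  end;
end;

var
  Sum: Integer;
begin
  WriteLn('StoreAt(2): ', StoreAt(2));
  WriteLn('StoreAt(4): ', StoreAt(4));
  WriteLn('TryAdd(MaxInt, 1): ', TryAdd(MaxInt, 1, Sum));
end.
```

In a real hunt you would not swallow these exceptions as the sketch does; you would let them surface so the debugger or exception logger points at the offending line.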
Note the offending object must be at least 3C0 bytes in size--this should help narrow it down, most objects will be smaller than this.
What I have done in the past with such errors that only show in the field is put logging checkpoints in--a bunch of lines that display something in an out-of-the-way place--a simple sequence of numbers is fine. Find out what number is showing when it crashes and you know which of your checkpoints was the last to execute. If that doesn't narrow it down enough, you can repeat the process within the region you've narrowed it down to.
With a full map file you can identify the exact point in the code where this occurs. I hope you have a full map file for this image! Subtract $00401000 from the address at which the exception is raised ($007ADE8B in your case) and that corresponds to the values in the map file.
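The arithmetic, spelled out as a tiny sketch. It assumes the usual layout of a 32-bit Delphi executable (image base $00400000 plus $1000 for the start of the code section), which is where the $00401000 above comes from; if your project uses a different image base, adjust the constant. MapFileOffset is a name made up for this example.

```pascal
program MapOffsetDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  // Assumed default: image base $00400000 + $1000 code section offset.
  CodeStart = $00401000;

// Converts a run-time exception address into the offset used in the map file.
function MapFileOffset(ExceptAddr: Cardinal): Cardinal;
begin
  Result := ExceptAddr - CodeStart;
end;

begin
  // Address taken from the error message in the question.
  WriteLn(IntToHex(MapFileOffset($007ADE8B), 8)); // prints 003ACE8B
end.
```

You then look for the last entry in the map file's address list that is at or below that offset.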
Having done that you know which object is nil and from there it is usually not too hard to work out what is going on.
One of the most common ways for this to occur is when a constructor raises an exception. When this occurs the destructor runs. If you access, in a destructor, a field that has not been initialised, and do anything other than call Free on it, then you will get an exception like this.
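A minimal sketch of that scenario, with a hypothetical class TReport invented for the example. The constructor fails before FFooter is created, the destructor then runs automatically, and any use of FFooter other than Free has to be guarded.

```pascal
program PartialCtorDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  // Hypothetical class used only for illustration.
  TReport = class
  private
    FLines: TStringList;
    FFooter: TStringList;
  public
    constructor Create(FailEarly: Boolean);
    destructor Destroy; override;
  end;

constructor TReport.Create(FailEarly: Boolean);
begin
  inherited Create;
  FLines := TStringList.Create;
  if FailEarly then
    raise Exception.Create('simulated failure'); // FFooter is still nil here
  FFooter := TStringList.Create;
end;

destructor TReport.Destroy;
begin
  // An unguarded FFooter.Clear here would dereference nil when the
  // constructor raised early, giving an access violation at a low address.
  if Assigned(FFooter) then
    FFooter.Clear;
  FFooter.Free; // Free is safe to call on a nil reference
  FLines.Free;
  inherited;
end;

begin
  try
    TReport.Create(True).Free;
  except
    on E: Exception do
      WriteLn('Constructor failed cleanly: ', E.Message);
  end;
end.
```

Because object fields start out zeroed, the Assigned check is enough to tell a created field from one the constructor never reached.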
Looks like a memory overwrite where changing memory layout (your machine vs field machine or adding madExcept) makes the overwrite change something harmless.
FastMM is great at making this kind of problem happen more consistently (and at finding its source). Download the full version of FastMM, add it as the first unit of your project, and turn on FullDebugMode in its settings.
It might cause the problem to be reproducible on your machine right away. If not, don't forget to deploy FastMM_FullDebugMode.dll with your application for testing. Keep madExcept on and let it embed the .map file for call stacks.

What is TExternalThread? "Cannot terminate externally created thread" when terminating a thread-based timer

This happens half of the time when closing my application, in which I have placed a TLMDHiTimer on my form at design time, with Enabled set to true.
In my OnFormClose event, I call MyLMDHiTimer.Enabled := false. When this is called, I sometimes (about half of the time) get this exception.
I debugged and stepped into the call and found that it is line 246 in LMDTimer.pas that gives this error.
FThread.Terminate;
I am using the latest version of LMDTools. I did a complete reinstall of LMD tools before the weekend and have removed and re-added the component to the form properly as well.
From what I've found, this has something to do with TExternalThread, but there's no documentation on it from Embarcadero and I haven't found anything referencing it within the LMDTools source code.
Using fully updated RAD Studio 2010, Delphi 2010.
What really upsets me here is that there's no documentation whatsoever. Google yields one result that actually talks about it, in which someone says that the error is caused by trying to terminate a TExternalThread.
But looking at the source-code for this LMDHiTimer, not once does it aim to do anything but create a regular TThread.
The one Google result I could find, Thread: Cannot terminate an externally created thread? on Embarcadero, mentions using GetCurrentThread() and GetCurrentThreadId() to get the data necessary to hook on to an existing thread, but the TLMDHiTimer does no such thing. It just creates its own TThread descendant with its own Create() constructor (overridden of course, and calling inherited at the start of the constructor).
So... What the heck is this TExternalThread? Has anyone else run into this kind of exception? And perhaps figured out a solution or workaround?
I've asked almost the exact same question to LMDTools' own support, but it can't hurt to ask in multiple places.
Thank you in advance for any assistance.
TExternalThread wraps a thread that the Delphi RTL didn't create. It might represent a thread belonging to the OS thread pool, or maybe a thread created by another DLL in your program. Since the thread is executing code that doesn't belong to the associated TExternalThread class, the Terminate method has no way to notify the thread that you want it to stop.
A Delphi TThread object would set its Terminated property to True, and the Execute method that got overridden would be expected to check that property periodically, but since this thread is non-Delphi code, there is no Execute method, and any Terminated property only came into existence after the thread code was already written someplace else (not by overriding Execute).
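To make the cooperative part concrete, here is a minimal sketch of a regular Delphi thread. TWorker and RunWorker are names invented for the example and have nothing to do with the LMD timer's internals. Terminate only sets the Terminated flag; the thread stops because Execute polls that flag and returns.

```pascal
program TerminateDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  {$IFDEF FPC}{$IFDEF UNIX}cthreads,{$ENDIF}{$ENDIF} // only needed by Free Pascal on Unix
  SysUtils, Classes;

type
  // Hypothetical worker thread used only for illustration.
  TWorker = class(TThread)
  protected
    procedure Execute; override;
  public
    Iterations: Integer;
  end;

procedure TWorker.Execute;
begin
  // The loop ends only because it checks Terminated on every pass.
  while not Terminated do
  begin
    Inc(Iterations);
    Sleep(10);
  end;
end;

// Starts a worker, asks it to stop, and returns how many passes it made.
function RunWorker(RunMs: Cardinal): Integer;
var
  Worker: TWorker;
begin
  Worker := TWorker.Create(False);
  try
    Sleep(RunMs);
    Worker.Terminate; // sets Terminated to True, nothing more
    Worker.WaitFor;   // blocks until Execute has returned
    Result := Worker.Iterations;
  finally
    Worker.Free;
  end;
end;

begin
  WriteLn('Worker stopped after ', RunWorker(100), ' iterations');
end.
```

A thread that was not started through a TThread descendant has no such loop watching the flag, which is why Terminate refuses to pretend it can stop one.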
The newsgroup thread suggests what's probably happening in your case:
... you have corrupted memory that causes the TThread.FExternalThread member to become a non-zero value.
It might be due to a bug in the component library, or it could be due to a bug in your own code. You can use the debugger's data breakpoints to try to find out. Set a breakpoint in the timer's thread's constructor. When your program pauses there, use the "Add Breakpoint" command on the Run menu to add a data breakpoint using the address of the new object's FExternalThread field. Thereafter, if that field's value changes, the debugger will pause and show you what changed it. (The data breakpoint will get reset each time you run the program because the IDE assumes the object won't get allocated at the same address each time.)
Is there any chance the code might be trying to Terminate an already Destroyed TThread? This could easily happen if you have FreeOnTerminate set.
I noticed your post while diagnosing a similar (opposite?) error, "Cannot call Start on a running or suspended thread" in the constructor of a component placed on the main form. When I removed the Start(), that error was replaced by more telling errors, e.g. "invalid pointer operation" and "access violation", in the corresponding destructor. The component was trying to manipulate its TThread object after the TThread was Freed, thus leaving things up to Murphy's law. When I fixed that, I was able to replace the Start() call without the "Cannot call Start" error returning.
By analogy, could your problem be that the address of your FExternalThread had been recycled and clobbered before the destructor/Terminate call? In our case, we had a buggy implementation of the Singleton Instance Pattern; but again, FreeOnTerminate also seems like a likely suspect.
[FYI: I'm using C++ under RAD Studio XE]

Delphi 7 exception not caught

I have some really complicated legacy code I've been working on that crashes when collecting big chunks of data. I've been unable to find the exact reason for the crashes and am trying different ways to solve it or at least recover nicely. The last thing I did was enclose the crashing code in a
try
...
except
cleanup();
end;
just to make it behave. But the cleanup never gets done. Under what circumstances does an exception not get caught? This might be due to some memory overflow or something since the app is collecting quite a bit of data.
Oh and the exception I got before adding the try was "Access violation" (what else?) and the CPU window points to very low addresses. Any ideas or pointers would be much appreciated!
"Very low address" probably means that somebody tried to call a virtual method on an object that was not really there (i.e. was 'nil'). For example:
TStringList(nil).Clear;
The first part is very mysterious, though. I have no idea how that can happen.
I think you should try to catch that exception with madExcept. It has never failed me yet. (Disclaimer: I am not using D7.)
A trashed stack or a stack overflow can both cause irreparable harm to the structures on the stack that structured exception handling (SEH) in Windows uses to find the actual exception handlers.
If you have a buffer overflow in a buffer on the stack (e.g. a static array as a local variable but written beyond its end), and overwrite an exception record, then you can overwrite the "next" pointer, which points at the next exception record on the stack. If the pointer gets clobbered, there's nothing the OS can do to find the next exception handler and eventually reach your catch-all one.
Stack overflows are different: they can prevent calling functions at all, since every function call requires at least one dword of stack space for the return address.
you have a number of good answers. the wildest problems i've had to chase come from stack corruption issues like barry mentioned. i've seen stuff happen with the project's "Memory sizes" section on the linker page. i might be superstitious but it seemed like larger wasn't necessarily better. you might consider using the enhanced memory manager FastMM4--it's free & very helpful.
http://sourceforge.net/projects/fastmm/
i've used it with d7 and found some access to stale pointers and other evil things.
you may also wish to create a way to track valid objects and or instrument the code in other ways to have the code checking itself as it works.
when i'm seeing access to addresses like 0x00001000 or less, i think of access to a nil pointer. myStringList:=nil; myStringList.Clear;
when i'm seeing access to other addresses with much larger numbers, i think of stale pointers.
when things are strangely unstable & stack traces are proving to be nonsense and/or wildly varying, i know i have stack issues. one time it's in Controls.pas; next time it's in mmsys.pas, etc.
using the wrong calling convention to a DLL can really mess up your stack as well. this is because of the parameter passing/releasing when calling/returning from the DLL.
MadExcept will be helpful in finding the source of this, even if it shows nonsense...you'll win either way because you'll know where the problem is occurring or you'll know you have a stack issue.
is there any testing framework you can put on it to exercise it? i've found that to be very powerful because it makes it entirely repeatable.
i've fixed some pretty ugly problems this way.
I'll leave the reasons why the except might not work to Barry...
But I strongly suggest a simple strategy to narrow down the area where it happens.
Cut the big chunk in smaller parts surrounded by
try
OutputDebugString('entering part abc');
... // part abc code here
except
OutputDebugString('horror in part abc');
raise;
end;
...
try
OutputDebugString('entering part xyz');
... // part xyz code here
except
OutputDebugString('horror in part xyz');
raise;
end;
and run your code with DebugView on the side... (works for apps without GUI as well like services).
You'll see which part is executed and if the exceptions are caught there.
I used to get this strange behaviour when calling some COM object that used a safecall calling convention. This object/method may raise an EOleException, not trapped by the usual try/except in the client code.
You should trap an EOleException and then handle it properly.
try
...
except
on E: EOleException do
...
end;
I don't know if it is the problem you are facing. But if it is, I recommend you take a look at Implement error handling correctly, a very clarifying post about exception handling in Delphi.
You can also enable your IDE debug options to stop on Delphi exceptions and monitor the stack trace.
Is this perhaps a DLL or a COM object? If so, it is possible that the FPU exception mask is being set by the host application to something different from what Delphi is used to. By default, a floating-point overflow in Delphi raises an exception, but the FPU exception mask can be set so that it doesn't, and the result is silently set to a special value such as INF or NAN instead. See the Math unit (math.pas) for more information on the FPU exception mask, in particular TFPUExceptionMask and SetExceptionMask.
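A sketch of how code running inside a foreign host could protect itself, using SetExceptionMask from the Math unit. The function SafeDivide and the simulated host mask are assumptions made for this example: the routine saves whatever mask the host installed, switches to the mask Delphi applications normally run with, and restores the host's mask on the way out.

```pascal
program FpuMaskDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Math;

// Hypothetical routine: divides with zero-divide, overflow and invalid-op
// exceptions unmasked, whatever mask the caller had in place.
function SafeDivide(A, B: Double): Double;
var
  Saved: TFPUExceptionMask;
begin
  // SetExceptionMask returns the previous mask, so it can be restored later.
  Saved := SetExceptionMask([exDenormalized, exUnderflow, exPrecision]);
  try
    Result := A / B;
  finally
    SetExceptionMask(Saved);
  end;
end;

begin
  // Simulate a host application that masks every floating-point exception.
  SetExceptionMask([exInvalidOp, exDenormalized, exZeroDivide,
                    exOverflow, exUnderflow, exPrecision]);
  try
    WriteLn(SafeDivide(1.0, 0.0));
  except
    on E: EMathError do
      WriteLn('Raised ', E.ClassName, ' despite the host mask');
  end;
end.
```

Without the mask change inside SafeDivide, the same division under the simulated host mask would produce an infinity and no exception at all, so a surrounding try/except would never fire.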
I've gotten exceptions in the initialization and finalization blocks of my code which madExcept doesn't even seem to catch. This might occur if you're referencing external DLLs inside of that try block. I'm not certain of the reason.
Actually (and thanks to #Gung for informing me of the worthlessness of my ancient answer), I read this recently in the ancient O'Reilly Delphi Tome. You should put SysUtils as the first (or second after your non-standard memory manager unit) in your main form's DPR so that it's resident in memory with all its Exception Catching goodness. Otherwise, if it's loaded from some other unit, it will be unloaded with that unit too and you can kiss built-in exception handling goodbye.

Resources