Wrong code when combining anonymous and nested procedures - delphi

I've got some unexpected access violations for Delphi code that I think is correct, but seems to be miscompiled. I can reduce it to
procedure Run(Proc: TProc);
begin
Proc;
end;
procedure Test;
begin
Run(
procedure
var
S: PChar;
procedure Nested;
begin
Run(
procedure
begin
end);
S := 'Hello, world!';
end;
begin
Run(
procedure
begin
S := 'Hello';
end);
Nested;
ShowMessage(S);
end);
end;
What happens for me is that S := 'Hello, world!' is storing in the wrong location. Because of that, either an access violation is raised, or ShowMessage(S) shows "Hello" (and sometimes, an access violation is raised when freeing the objects used to implement anonymous procedures).
I'm using Delphi XE, all updates installed.
How can I know where this is going to cause problems? I know how to rewrite my code to avoid anonymous procedures, but I have trouble figuring out in precisely which situations they lead to wrong code, so I don't know where to avoid them.
It would be interesting to me to know if this is fixed in later versions of Delphi, but nothing more than interesting, upgrading is not an option at this point.
On QC, the most recent report I can find the similar #91876, but that is resolved in Delphi XE.
Update:
Based on AlexSC's comments, with a slight modification:
...
procedure Nested;
begin
Run(
procedure
begin
S := S;
end);
S := 'Hello, world!';
end;
...
does work.
The generated machine code for
S := 'Hello, world!';
in the failing program is
ScratchForm.pas.44: S := 'Hello, world!';
004BD971 B89CD94B00 mov eax,$004bd99c
004BD976 894524 mov [ebp+$24],eax
whereas the correct version is
ScratchForm.pas.45: S := 'Hello, world!';
004BD981 B8B0D94B00 mov eax,$004bd9b0
004BD986 8B5508 mov edx,[ebp+$08]
004BD989 8B52FC mov edx,[edx-$04]
004BD98C 89420C mov [edx+$0c],eax
The generated code in the failing program is not seeing that S has been moved to a compiler-generated class, [ebp+$24] is how outer local variables of nested methods are accessed how local variables are accessed.

Without seeing the whole Assembler Code for the whole (procedure Test) and only assuming on the Snippet you posted, it's probably that on the failing Snippet only a Pointer has been moved where on the correct version there is some Data moved too.
So it seems that S:=S or S:='' causes the Compiler to create a reference by it's own and could even allocate some Memory, which would explain why it works then.
I also assume that's why a Access Violation occurs without S:=S or S:='', because if there is no Memory allocated for the String (remember you only declared S: PChar) then a Access Violation is raised because non-allocated Memory was accessed.
If you simply declare S: String instead, this probably won't happen.
Additions after extended Commenting:
A PChar is only a Pointer to Data Structure, that must exist. Also another common Issue with PChar is to declare local Variables and then passing a PChar to that Variable to other Procs, because what happens is that the local Variable is freed once the routine ends, but the PChar will still point to it, which then raise Access Violations once accessed.
The only possibility that exists per Documentation is declaring something like that const S: PChar = 'Hello, world!' this works because the Compiler can resolve a relative Adresse to it. But this only works for Constants and not for Variables like on the Example above. Doing it like in the Example above needs Storage to be allocated for the string literal to which the PChar then points to like S:String; P:PChar; S:='Hello, world!'; P:=PChar(S); or similar.
If it still fails with declaring String or Integer then perhaps the Variable disappears somewhere along or suddenly isn't visible anymore in a proc, but that would be another Issue that has nothing to do with the existing PChar Issue explained already.
Final Conclusion:
It's possible to do S:PChar; S:='Hello, world!' but the Compiler then simply allocates it as a local or global Constant like const S: PChar = 'Hello, world!' does which is saved into Executable, a second S := 'Hello' then creates another one which is also saved into Executable and so on - but S then only points to the last one allocated, all others are still in the Executable but not accessible any more without knowing the exact Location, because S only points to the last one allocated.
So depending which one was the last S points either to Hello, world! or Hello. On the Example above i can only guess which one was the last and who knows perhaps the Compiler can only guess too and depending on optimizations and other unpredictable Factors S could suddenly point to unallocated Mem instead of the last one by the Time Showmessage(S) is executed which then raises a Access Violation.

How can I know where this is going to cause problems?
It's hard to tell at this point in time.
If we knew the nature of the fix in Delphi XE2 we'd be in a better position.
All you can do is refrain from using anonymous functions.
Delphi has had procedural variables, so the need for anonymous functions ready is not that dire.
See http://www.deltics.co.nz/blog/posts/48.
It would be interesting to me to know if this is fixed in later versions of Delphi
According to #Sertac Akyuz this has been fixed in XE2.
Personally I dislike anonymous methods and have had to ban people from using them in my Java projects because a sizable proportion of our code base was going anonymous (event handlers).
Used in extreme moderation I can see the use case.
But in Delphi where we have procedural variables and nested procedures... Not so much.

Related

What is difference between FastMM and BorlndMM in Delphi 2009

I am having a memory problem that cannot explain root cause, please help me!
The problem is as follows:
My program throws exception when i exit.
I investigated and thought that the cause of the error was the use Copymemory function to copy the record containing the variable have string data type.
Below is my demo program by delphi 2009:
In *.dpr file, I added ShareMem unit for use BorlndMM.dll instead of using FastMM memory management.
I define a struct containing a String variable.
Allocate 1 array of 2048 PByte elements.
Use Memory Copy function to copy 2 struct.
finally, Free array 2048.
when i exit program, my program threw exception.
type
TMyStructure = record
F1: TMyStructure;
str1: string;
unit 1
procedure TForm9.FormCreate(Sender: TObject);
var
i: Integer;
begin
F1.str1 := 'This is a string to demo for copying the String data type
using the Copymemory method';
end;
unit 2
procedure TForm8.btn1Click(Sender: TObject);
var
Form9: TForm9;
i: Integer;
s1: array[0..2048-1] of PByte;
begin
// Alloc 1st 2048 segments, each segment gain 1KB
for i := 0 to Length(s1) - 1 do
begin
s1[i] := AllocMem(1024);
end;
// Create Form9
Form9 := TForm9.Create(Owner);
// -> ERROR
CopyMemory(#Self.F1, #Form9.F1, SizeOf(TMyStructure));
// -> OK
// Self.F1 := Form9.F1;
// Free memory s1 array
for i := 0 to Length(s1) - 1 do
begin
FreeMem(s1[i]);
end;
end;
Note:
I have known management in memory of string is reference counting.
I've been investigating because of the use of Memorycopy, so reference count of the string is not incremented although there are 2 references to it.
I investigated that when the string was free for the first time, its reference count dropped to zero, but that memory was not returned to the OS.However, it seems that when another free variable has a nearby address, that memory loses the reference, so when accessing the second free will cause an exception.
Since no source code can be found as well as documentation describing the operation mechanism of BorlndMM.dll should read the spec as well as GetMem.inc source code file to understand the mechanism of operation FastMM and speculate the root cause of the bug. I'm guessing when a free variable, BorlndMM find ahead and after that free space and the combination leads to the memory area that can not be referenced anymore.Of course, using Memorycopy to copy two string variables is a action wrong and must fix. However, I would like to understand how the Memory Management mechanism works to explain the root cause of the above phenomenon.
Expect for help!
If possible please explain the root cause of the above phenomenon for me. And, If there is a document / source explaining how the operation of Memory Management 2005-BorlndMM , please send it to me. Thanks you so much!
The problem with your code is that it bypasses string reference counting. By copying the structure using CopyMemory you end up with two variables, both referring to the same string object. But that object still has a reference count of one.
What happens after that is unpredictable. You may encounter runtime errors. Or you may not. The behaviour is not well defined.
From your question you appear to be trying to understand why different memory managers lead to different behaviours. The thing about undefined behaviour is that it is, well, not defined. There's no point trying to reason about the behaviour. It's not possible to reason about it.
You must fix your code by replacing the memory copy with simple assignment:
F1 := Form9.F1;

Access Violation - how do I track down the cause?

I'm getting an access violation when I close a form in my application. It seems to happen only after I have access a database a couple of times, but that doesn't seem to make sense.
I have traced through and put outputdebugstring messages in all the related OnDestroy() methods, but the AV appears to be outside of my code.
This is the text of the message:
Access violation at address 00405F7C in module
'MySoopaApplication.exe'. Read of address 00000008.
How do I find where in the application 00405F7C is?
What tools are available in Delphi 10.1 Berlin to help me with this?
Edit: added a bit more info ... when clicking "Break" the IDE always takes me to this piece of code in GETMEM.INC:
#SmallPoolWasFull:
{Insert this as the first partially free pool for the block size}
mov ecx, TSmallBlockType[ebx].NextPartiallyFreePool
Further edit: well, I found the culprit, though I can't honestly say that the debug tools got me there - they just seemed to indicate it wasn't in my code.
I had used code from the net that I used to find the Windows logged in user - this is it:
function GetThisComputerName: string;
var
CompName: PChar;
maxlen: cardinal;
begin
maxlen := MAX_COMPUTERNAME_LENGTH +1;
GetMem(CompName, maxlen);
try
GetComputerName(CompName, maxlen);
Result := CompName;
finally
FreeMem(CompName);
end;
end;
once I had replaced the code with a simple result := '12345' the AVs stopped. I have no changed it to this code:
function GetThisComputerName: string;
var
nSize: DWord;
CompName: PChar;
begin
nSize := 1024;
GetMem(CompName, nSize);
try
GetComputerName(CompName, nSize);
Result := CompName;
finally
FreeMem(CompName);
end;
end;
which seems to work and, as a bonus, doesn't cause AVs.
Thanks for your help, much appreciated.
Under Tools|Options in the IDE go to Embarcadero Debuggers | Language Exceptions and make sure Notify on Language Exceptions is checked. Also under Project Options | Compiling, make sure Debugging | Use debug DCUs is checked.
Allow the exception to happen, then go to View | Debug Windows | Call stack and you should be able to see exactly where it occurred. The fact that it occurs after db access is probably because that causes some object to be created which generates the AV when it is destroyed. Possibly because it is being Free()ed twice.
If that doesn't solve it, you may may need an exception-logging tool like madExcept mentioned by DavidH.
Read of address 00000008.
The fact that this address is a low number is suggestive of it being the address of a member of an object (because they are typically at low offsets from the base address of the object).
How do I find where in the application 00405F7C is?
With your app running, in the IDE go to Search | Go to Address. This should find it if the exception is in your application and not in some related module like a .DLL it is using. The menu item is enabled once the application is running in the IDE and stopped at a breakpoint. Also there is a compiler command line switch to find an error by address.
Others have explained how to diagnose an AV.
Regarding the code itself, there are issues with it:
Most importantly, you are not allocating enough memory for the buffer. GetMem() operates on bytes but GetComputetName() operates on characters, and in this case SizeOf (Char) is 2 bytes. So you are actually allocating half the number of bytes that you are reporting to GetComputerName(), so if it writes more than you allocate then it will corrupt heap memory. The corruption went away when you over-allocated the buffer. So take SizeOf(Char) into account when allocating:
function GetThisComputerName: string;
var
CompName: PChar;
maxlen: cardinal;
begin
maxlen := MAX_COMPUTERNAME_LENGTH +1;
GetMem(CompName, maxlen * SizeOf(Char)); // <-- here
try
GetComputerName(CompName, maxlen);
Result := CompName;
finally
FreeMem(CompName);
end;
end;
In addition to that:
you are ignoring errors from GetComputerName(), so you are not guaranteeing that CompName is even valid to pass to Result in the first place.
You should use SetString(Result, CompName, nSize) instead of Result := CompName, since GetComputerName() outputs the actual CompName length. There is no need to waste processing time having the RTL calculate the length to copy when you already know the length. And since you don't check for errors, you can't rely on CompName being null terminated anyway if GetComputerName() fails.
You should get rid of GetMem() altogether and just use a static array on the stack instead:
function GetThisComputerName: string;
var
CompName: array[0..MAX_COMPUTERNAME_LENGTH] of Char;
nSize: DWORD;
begin
nSize := Length(CompName);
if GetComputerName(CompName, nSize) then
SetString(Result, CompName, nSize)
else
Result := '';
end;

How do I stop this Variant memory leak?

I'm using an old script engine that's no longer supported by its creators, and having some trouble with memory leaks. It uses a function written in ASM to call from scripts into Delphi functions, and returns the result as an integer then passes that integer as an untyped parameter to another procedure that translates it into the correct type.
This works fine for most things, but when the return type of the Delphi function was Variant, it leaks memory because the variant is never getting disposed of. Does anyone know how I can take an untyped parameter containing a variant and ensure that it will be disposed of properly? This will probably involve some inline assembly.
procedure ConvertVariant(var input; var output: variant);
begin
output := variant(input);
asm
//what do I put here? Input is still held in EAX at this point.
end;
end;
EDIT: Responding to Rob Kennedy's question in comments:
AnsiString conversion works like this:
procedure VarFromString2(var s : AnsiString; var v : Variant);
begin
v := s;
s := '';
end;
procedure StringToVar(var p; var v : Variant);
begin
asm
call VarFromString2
end;
end;
That works fine and doesn't produce memory leaks. When I try to do the same thing with a variant as the input parameter, and assign the original Null on the second procedure, the memory leaks still happen.
The variants mostly contain strings--the script in question is used to generate XML--and they got there by assigning a Delphi string to a variant in the Delphi function that this script is calling. (Changing the return type of the function wouldn't work in this case.)
Have you tried the same trick as with the string, except that with a Variant, you should put UnAssigned instead of Null to free it, like you did s := ''; for the string.
And by the way, one of the only reasons I can think of that requires to explicitly free the strings, Variants, etc... is when using some ThreadVar.

Elegant way for handling this string issue. (Unicode-PAnsiString issue)

Consider the following scenario:
type
PStructureForSomeCDLL = ^TStructureForSomeCDLL;
TStructureForSomeCDLL = record
pName: PAnsiChar;
end
function FillStructureForDLL: PStructureForSomeDLL;
begin
New(Result);
// Result.pName := PAnsiChar(SomeObject.SomeString); // Old D7 code working all right
Result.pName := Utf8ToAnsi(UTF8Encode(SomeObject.SomeString)); // New problematic unicode version
end;
...code to pass FillStructureForDLL to DLL...
The problem in unicode version is that the string conversion involved now returns a new string on stack and that's reclaimed at the end of the FillStructureForDLL call, leaving the DLL with corrupted data. In old D7 code, there were no intermediate conversion funcs and thus no problem.
My current solution is a converter function like below, which is IMO too much of an hack. Is there a more elegant way of achieving the same result?
var gKeepStrings: array of AnsiString;
{ Convert the given Unicode value S to ANSI and increase the ref. count
of it so that returned pointer stays valid }
function ConvertToPAnsiChar(const S: string): PAnsiChar;
var temp: AnsiString;
begin
SetLength(gKeepStrings, Length(gKeepStrings) + 1);
temp := Utf8ToAnsi(UTF8Encode(S));
gKeepStrings[High(gKeepStrings)] := temp; // keeps the resulting pointer valid
// by incresing the ref. count of temp.
Result := PAnsiChar(temp);
end;
One way might be to tackle the problem before it becomes a problem, by which I mean adapt the class of SomeObject to maintain an ANSI Encoded version of SomeString (ANSISomeString?) for you alongside the original SomeString, keeping the two in step in a "setter" for the SomeString property (using the same UTF8 > ANSI conversion you are already doing).
In non-Unicode versions of the compiler make ANSISomeString be simply a "copy" of SomeString string, which will of course not be a copy, merely an additional ref count on SomeString. In the Unicode version it references a separate ANSI encoding with the same "lifetime" as the original SomeString.
procedure TSomeObjectClass.SetSomeString(const aValue: String);
begin
fSomeString := aValue;
{$ifdef UNICODE}
fANSISomeString := Utf8ToAnsi(UTF8Encode(aValue));
{$else}
fANSISomeString := fSomeString;
{$endif}
end;
In your FillStructure... function, simply change your code to refer to the ANSISomeString property - this then is entirely independent of whether compiling for Unicode or not.
function FillStructureForDLL: PStructureForSomeDLL;
begin
New(Result);
result.pName := PANSIChar(SomeObject.ANSISomeString);
end;
There are at least three ways to do this.
You could change SomeObject's class
definition to use an AnsiString
instead of a string.
You could
use a conversion system to hold
references, like in your example.
You could initialize result.pname
with GetMem and copy the result of the
conversion to result.pname^ with
Move. Just remember to FreeMem it
when you're done.
Unfortunately, none of them is a perfect solution. So take a look at the options and decide which one works best for you.
Hopefully you already have code in your application to properly dispose off of all the dynamically allocated records that you New() in FillStructureForDLL(). I consider this code highly dubious, but let's assume this is reduced code to demonstrate the problem only. Anyway, the DLL you pass the record instance to does not care how big the chunk of memory is, it will only get a pointer to it anyway. So you are free to increase the size of the record to make place for the Pascal string that is now a temporary instance on the stack in the Unicode version:
type
PStructureForSomeCDLL = ^TStructureForSomeCDLL;
TStructureForSomeCDLL = record
pName: PAnsiChar;
// ... other parts of the record
pNameBuffer: string;
end;
And the function:
function FillStructureForDLL: PStructureForSomeDLL;
begin
New(Result);
// there may be a bug here, can't test on the Mac... idea should be clear
Result.pNameBuffer := Utf8ToAnsi(UTF8Encode(SomeObject.SomeString));
Result.pName := Result.pNameBuffer;
end;
BTW: You wouldn't even have that problem if the record passed to the DLL was a stack variable in the procedure or function that calls the DLL function. In that case the temporary string buffers will only be necessary in the Unicode version if more than one PAnsiChar has to be passed (the conversion calls would otherwise reuse the temporary string). Consider changing the code accordingly.
Edit:
You write in a comment:
This would be best solution if modifying the DLL structures were an option.
Are you sure you can't use this solution? The point is that from the POV of the DLL the structure isn't modified at all. Maybe I didn't make myself clear, but the DLL will not care whether a structure passed to it is exactly what it is declared to be. It will be passed a pointer to the structure, and this pointer needs to point to a block of memory that is at least as large as the structure, and needs to have the same memory layout. However, it can be a block of memory that is larger than the original structure, and contain additional data.
This is actually used in quite a lot of places in the Windows API. Did you ever wonder why there are structures in the Windows API that contain as the first thing an ordinal value giving the size of the structure? It's the key to API evolution while preserving backwards compatibility. Whenever new information is needed for the API function to work it is simply appended to the existing structure, and a new version of the structure is declared. Note that the memory layout of older versions of the structure is preserved. Old clients of the DLL can still call the new function, which will use the size member of the structure to determine which API version is called.
In your case no different versions of the structure exist as far as the DLL is concerned. However, you are free to declare it larger for your application than it really is, provided the memory layout of the real structure is preserved, and additional data is only appended. The only case where this wouldn't work is when the last part of the structure were a record with varying size, kind of like the Windows BITMAP structure - a fixed header and dynamic data. However, your record looks like it has a fixed length.
Wouldn't PChar(AnsiString(SomeObject.SomeString)) work?

Why is Self assignable in Delphi?

This code in a GUI application compiles and runs:
procedure TForm1.Button1Click(Sender: TObject);
begin
Self := TForm1.Create(Owner);
end;
(tested with Delphi 6 and 2009)
why is Self writable and not read-only?
in which situations could this be useful?
Edit:
is this also possible in Delphi Prism? (I think yes it is, see here)
Update:
Delphi applications/libraries which make use of Self assignment:
python4delphi
That's not as bad as it could be. I just tested it in Delphi 2009, and it would seem that, while the Self parameter doesn't use const semantics, which you seem to be implying it should, it also doesn't use var semantics, so you can change it all you want within your method without actually losing the reference the caller holds to your object. That would be a very bad thing.
As for the reason why, one of two answers. Either a simple oversight, or what Marco suggested: to allow you to pass Self to a var parameter.
Maybe to allow passing to const or var parameters?
It could be an artefact, since system doesn't have self anywhere on the left of := sign.
Assigning to Self is so illogical and useless that this 'feature' is probably an oversight. And as with assignable constants, it's not always easy to correct such problems.
The simple advice here is: don't do it.
In reality, "Self" is just a name reference to a place on the stack that store address pointing to object in the heap. Forcing read-only on this variable is possible, apparently the designer decided not to. I believe the decision is arbitrary.
Can't see any case where this is useful, that'd merely change a value in stack. Also, changing this value can be dangerous as there is no guarantee that the behavior of the code that reference instance's member will be consistence across compiler versions.
Updated: In response to PatrickvL comment
The 'variable' "Self" is not on the
stack (to my knowledge, it never is);
Instead it's value is put in a
register (EAX to be exact) just before
a call to any object method is made. –
Nope, Self has actual address on the memory. Try this code to see for yourself.
procedure TForm1.Button1Click(Sender: TObject);
begin
ShowMessage(IntToStr(Integer(#Self)));
end;
procedure TForm1.Button2Click(Sender: TObject);
var
newform: TForm;
p: ^Integer;
begin
Self.Caption := 'TheOriginal';
newform := TForm.Create(nil);
try
newform.Caption := 'TheNewOne';
// The following two lines is, technically, the same as
// Self := newform;
p := Pointer(#Self);
p^ := Integer(newform);
ShowMessage(Self.Caption); // This will show 'TheNewOne' instead of 'TheOriginal'
finally
Self.Free; // Relax, this will free TheNewOne rather than TheOriginal
end;
end;
Sometimes, when you want to optimize a method for as far as you can take it (without resorting to assembly), 'Self' can be (ab)used as a 'free' variable - it could just mean the difference between using stack and using registers.
Sure, the contents of the stack are most probably already present in the CPU cache, so it should be fast to access, but registers are even faster still.
As a sidenote : I'm still missing the days when I was programming on the Amiga's Motorola 68000 and had the luxury of 16 data and 16 address registers.... I can't believe the world chose to go with the limited 4 registers of the 80x86 line of processors!
And as a final note, I choose to use Self sometimes, as the Delphi's optimizer is, well, not optimizing that well, actually. (At least, it pales compared to what trickery one can find in the various LLVM optimizers for example.) IMHO, and YMMV of course.

Resources