How to leak a string in Delphi

I was talking to a co-worker the other day about how you can leak a string in Delphi if you really mess things up. By default strings are reference counted and automatically allocated, so they typically just work without any thought - no need for manual allocation, size calculations, or memory management.
But I remember reading once that there is a way to leak a string directly (without including it in an object that gets leaked). It seems like it had something to do with passing a string by reference and then accessing it from a larger scope from within the routine it was passed to. Yeah, I know that is vague, which is why I am asking the question here.

I don't know about the issue in your second paragraph, but I was bitten once by leaked strings in a record.
If you call FillChar() on a record that contains strings, you overwrite the pointer to the dynamically allocated string data with zeroes, so the string's reference count is never decremented. Unless the string is empty this will leak the memory. The way around this is to call Finalize() on the record before clearing the memory it occupies.
Unfortunately calling Finalize() when there are no record members that need finalizing causes a compiler hint. It happened to me that I commented out the Finalize() call to silence the hint, but later when I added a string member to the record I missed uncommenting the call, so a leak was introduced. Luckily I'm generally using the FastMM memory manager in the most verbose and paranoid setting in debug mode, so the leak didn't go unnoticed.
The compiler hint is probably not such a good thing, silently omitting the Finalize() call if it's not needed would be much better IMHO.
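A minimal sketch of the Finalize-then-FillChar order described above. TEntry and ClearEntry are hypothetical names invented for this illustration; they are not from the original code.

```pascal
program FinalizeThenFill;
{$APPTYPE CONSOLE}
uses
  SysUtils;

type
  TEntry = record            // hypothetical record, for illustration only
    Id: Integer;
    Name: string;
  end;

procedure ClearEntry(var E: TEntry);
begin
  Finalize(E);                 // releases the string held in Name
  FillChar(E, SizeOf(E), 0);   // now safe: no managed field is left to leak
end;

var
  Entry: TEntry;
begin
  Entry.Id := 1;
  Entry.Name := 'entry ' + IntToStr(Entry.Id);  // heap-allocated string
  ClearEntry(Entry);
  Writeln(Entry.Id, ' "', Entry.Name, '"');     // prints: 0 ""
end.
```

Wrapping both calls in one routine means the Finalize call cannot be commented out independently of the FillChar call.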

No, I don't think such a thing can happen. It's possible for a string variable to obtain a value that you didn't expect, but it won't leak memory. Consider this:
var
Global: string;
procedure One(const Arg: string);
begin
Global := '';
// Oops. This is an invalid reference now. Arg points to
// what Global used to refer to, which isn't there anymore.
writeln(Arg);
end;
procedure Two;
begin
Global := 'foo';
UniqueString(Global);
One(Global);
Assert(Global = 'foo', 'Uh-oh. The argument isn''t really const?');
end;
Here One's argument is declared const, so supposedly, it won't change. But then One circumvents that by changing the actual parameter instead of the formal parameter. Procedure Two "knows" that One's argument is const, so it expects the actual parameter to retain its original value. The assertion fails.
The string hasn't leaked, but this code does demonstrate how you can get a dangling reference for a string. Arg is a local alias of Global. Although we've changed Global, Arg's value remains untouched, and because it was declared const, the string's reference count was not incremented upon entry to the function. Reassigning Global dropped the reference count to zero, and the string was destroyed. Declaring Arg as var would have the same problem; passing it by value would fix this problem. (The call to UniqueString is just to ensure the string is reference-counted. Otherwise, it may be a non-reference-counted string literal.) All compiler-managed types are susceptible to this problem; simple types are immune.
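As a small sketch of the by-value fix mentioned above: the only change from the example is that Arg is no longer declared const, so the callee holds its own reference to the string for the duration of the call.

```pascal
program ByValueArg;
{$APPTYPE CONSOLE}

var
  Global: string;

// Arg is passed by value, so the string's reference count is
// incremented on entry and the data outlives the change to Global.
procedure One(Arg: string);
begin
  Global := '';
  Writeln(Arg);  // still 'foo'
end;

begin
  Global := 'foo';
  UniqueString(Global);  // make sure the string is reference-counted
  One(Global);
end.
```

The cost is one extra reference-count increment and decrement per call, which is what const normally avoids.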
The only way to leak a string is to treat it as something other than a string, or to use non-type-aware memory-management functions. Mghie's answer describes how to treat a string as something other than a string by using FillChar to clobber a string variable. Non-type-aware memory functions include GetMem and FreeMem. For example:
type
PRec = ^TRec;
TRec = record
field: string;
end;
var
Rec: PRec;
begin
GetMem(Rec, SizeOf(Rec^));
// Oops. Rec^ is uninitialized. This assignment isn't safe.
Rec^.field := IntToStr(4);
// Even if the assignment were OK, FreeMem would leak the string.
FreeMem(Rec);
end;
There are two ways to fix it. One is to call Initialize and Finalize:
GetMem(Rec, SizeOf(Rec^));
Initialize(Rec^);
Rec^.field := IntToStr(4);
Finalize(Rec^);
FreeMem(Rec);
The other is to use type-aware functions:
New(Rec);
Rec^.field := IntToStr(4);
Dispose(Rec);

Actually, passing a string as const or non-const is the same in terms of reference counting in Delphi 2007 and 2009. There was a case that caused an access violation when a string was passed as const. Here is the problematic code:
type
TFoo = class
S: string;
procedure Foo(const S1: string);
end;
procedure TFoo.Foo(const S1: string);
begin
S:= S1; //access violation
end;
var
F: TFoo;
begin
F:= TFoo.create;
try
F.S := 'S';
F.Foo(F.S);
finally
F.Free;
end;
end.

Another way to leak a string is to declare it as a threadvar variable. See my question for details. And for the solution, see the solution on how to tidy it.
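A hedged sketch of the usual tidy-up: the compiler does not free managed types held in threadvar variables when a thread ends, so each thread clears its own copy before terminating. TWorker and Scratch are hypothetical names used only for this illustration.

```pascal
program ThreadVarTidy;
{$APPTYPE CONSOLE}
uses
  SysUtils, Classes;

threadvar
  Scratch: string;  // hypothetical per-thread string

type
  TWorker = class(TThread)
  protected
    procedure Execute; override;
  end;

procedure TWorker.Execute;
begin
  Scratch := 'worker ' + IntToStr(Random(1000));  // heap-allocated per thread
  try
    // ... use Scratch here ...
  finally
    Scratch := '';  // release this thread's copy before the thread ends
  end;
end;

var
  W: TWorker;
begin
  W := TWorker.Create(False);
  try
    W.WaitFor;
  finally
    W.Free;
  end;
end.
```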

I think this might have been similar to what I was thinking of. It is the reverse of a string leak, a string that gets collected early:
var
p : ^String;
procedure InitString;
var
s, x : String;
begin
s := 'A cool string!';
x := s + '. Append something to make a copy in' +
'memory and generate a new string.';
p := @x;
end;
begin
{ Call a function that will generate a string }
InitString();
{ Write the value of the string (pointed to by p) }
WriteLn(p^); // Runtime error 105!
{ Wait for a key press }
ReadLn;
end.

Related

How to properly free records that contain various types in Delphi at once?

type
TSomeRecord = Record
field1: integer;
field2: string;
field3: boolean;
End;
var
SomeRecord: TSomeRecord;
SomeRecAr: array of TSomeRecord;
This is the most basic example of what I have and since I want to reuse SomeRecord (with certain fields remaining empty, without freeing everything some fields would be carried over when I'm reusing SomeRecord, which is obviously undesired) I am looking for a way to free all of the fields at once. I've started out with string[255] and used ZeroMemory(), which was fine until it started leaking memory, that was because I switched to string. I still lack the knowledge to get why, but it appears to be related to it being dynamic. I am using dynamic arrays as well, so I assume that trying ZeroMemory() on anything dynamic would result in leaks. One day wasted figuring that out. I think I solved this by using Finalize() on SomeRecord or SomeRecAr before ZeroMemory(), but I'm not sure if this is the proper approach or just me being stupid.
So the question is: how to free everything at once? does some single procedure exist at all for this that I'm not aware of?
On a different note, I would be open to suggestions on how to implement these records differently to begin with, so I don't need to make complicated attempts at freeing stuff. I've looked into creating records with New() and then getting rid of them with Dispose(), but I have no idea what it means when a variable is undefined, instead of nil, after a call to Dispose(). In addition, I don't know what the difference is between a variable of a certain type (SomeRecord: TSomeRecord) and a variable pointing to a type (SomeRecord: ^TSomeRecord). I'm looking into the above issues at the moment; unless someone can explain it quickly, it might take some time.
Assuming you have a Delphi version that supports implementing methods on a record, you could clear a record like this:
type
TSomeRecord = record
field1: integer;
field2: string;
field3: boolean;
procedure Clear;
end;
procedure TSomeRecord.Clear;
begin
Self := Default(TSomeRecord);
end;
If your compiler doesn't support Default then you can do the same quite simply like this:
procedure TSomeRecord.Clear;
const
Default: TSomeRecord=();
begin
Self := Default;
end;
You might prefer to avoid mutating a value type in a method. In which case create a function that returns an empty record value, and use it with the assignment operator:
type
TSomeRecord = record
// fields go here
class function Empty: TSomeRecord; static;
end;
class function TSomeRecord.Empty: TSomeRecord;
begin
Result := Default(TSomeRecord);
end;
....
Value := TSomeRecord.Empty;
As an aside, I cannot find any documentation reference for Default(TypeIdentifier). Does anyone know where it can be found?
As for the second part of your question, I see no reason not to continue using records, and allocating them using dynamic arrays. Attempting to manage the lifetime yourself is much more error prone.
Don't make things overcomplicated!
Assigning a "default" record is just a waste of CPU power and memory.
When a record is declared as a field of a class, it is filled with zeroes, and thus initialized. When it is allocated on the stack, only reference-counted variables are initialized: other kinds of variables (like integers, doubles, booleans or enumerations) are in a random state (probably non-zero). When it is allocated on the heap, GetMem will not initialize anything, AllocMem will fill all content with zeroes, and New will initialize only reference-counted members (as with stack allocation). In all cases, you should use either Dispose or Finalize + FreeMem to release a heap-allocated record.
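A minimal sketch of the AllocMem case from the paragraph above, assuming a hypothetical TRec record. Because the block is zero-filled, the string field starts out as a valid empty string and no Initialize call is needed, but Finalize is still required before FreeMem.

```pascal
program AllocMemRecord;
{$APPTYPE CONSOLE}
uses
  SysUtils;

type
  PRec = ^TRec;
  TRec = record        // hypothetical record, for illustration only
    Count: Integer;
    Name: string;
  end;

var
  Rec: PRec;
begin
  // AllocMem zero-fills the block, so Name is a valid empty string
  Rec := AllocMem(SizeOf(TRec));
  try
    Rec^.Name := IntToStr(4);
    Writeln(Rec^.Name);
  finally
    Finalize(Rec^);  // release the string before freeing the raw block
    FreeMem(Rec);
  end;
end.
```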
So about your exact question, your own assumption was right: to reset a record content after use, never use "fillchar" (or "zeromemory") without a previous "finalize". Here is the correct and fastest way:
Finalize(aRecord);
FillChar(aRecord,sizeof(aRecord),0);
Once again, it will be faster than assigning a default record. And in any case, if you use Finalize, even multiple times, it won't leak any memory - 100% money back warranty!
Edit: After looking at the code generated by aRecord := default(TRecordType), the code is well optimized: it is in fact a Finalize + a bunch of stosd to emulate FillChar. So even if the syntax is a copy / assignment (:=), it is not implemented as a copy / assignment. My mistake here.
But I still do not like the fact that := has to be used, where Embarcadero would have done better to use a record method like aRecord.Clear as syntax, just like DelphiWebScript's dynamic arrays. In fact, this := syntax is exactly the same as the one used by C#. It sounds as if Embarcadero just mimics the C# syntax everywhere, without noticing that this is weird. What is the point if Delphi is just a follower, and does not implement things "its way"? People will always prefer the original C# to its ancestor (Delphi has the same father).
The simplest solution I can think of is:
const
EmptySomeRecord: TSomeRecord = ();
begin
SomeRecord := EmptySomeRecord;
But to address all the remaining parts of your question, take these definitions:
type
PSomeRecord = ^TSomeRecord;
TSomeRecord = record
Field1: Integer;
Field2: String;
Field3: Boolean;
end;
TSomeRecords = array of TSomeRecord;
PSomeRecordList = ^TSomeRecordList;
TSomeRecordList = array[0..MaxListSize] of TSomeRecord;
const
EmptySomeRecord: TSomeRecord = ();
Count = 10;
var
SomeRecord: TSomeRecord;
SomeRecords: TSomeRecords;
I: Integer;
P: PSomeRecord;
List: PSomeRecordList;
procedure ClearSomeRecord(var ASomeRecord: TSomeRecord);
begin
ASomeRecord.Field1 := 0;
ASomeRecord.Field2 := '';
ASomeRecord.Field3 := False;
end;
function NewSomeRecord: PSomeRecord;
begin
New(Result);
Result^.Field1 := 0;
Result^.Field2 := '';
Result^.Field3 := False;
end;
And here are several examples of how to operate on them:
begin
// Clearing a typed variable (1):
SomeRecord := EmptySomeRecord;
// Clearing a typed variable (2):
ClearSomeRecord(SomeRecord);
// Initializing and clearing a typed array variable:
SetLength(SomeRecords, Count);
// Creating a pointer variable:
New(P);
// Clearing a pointer variable:
P^.Field1 := 0;
P^.Field2 := '';
P^.Field3 := False;
// Creating and clearing a pointer variable:
P := NewSomeRecord;
// Releasing a pointer variable:
Dispose(P);
// Creating a pointer array variable:
ReallocMem(List, Count * SizeOf(TSomeRecord));
// Clearing a pointer array variable:
for I := 0 to Count - 1 do
begin
Pointer(List^[I].Field2) := nil;
List^[I].Field1 := 0;
List^[I].Field2 := '';
List^[I].Field3 := False;
end;
// Releasing a pointer array variable:
Finalize(List^[0], Count);
FreeMem(List);
Choose and/or combine as you like.
With SomeRecord: TSomeRecord, SomeRecord will be an instance/variable of type TSomeRecord. With SomeRecord: ^TSomeRecord, SomeRecord will be a pointer to an instance or variable of type TSomeRecord. In the latter case, SomeRecord will be a typed pointer. If your application transfers a lot of data between routines or interacts with an external API, typed pointers are recommended.
New() and Dispose() are only used with typed pointers. With typed pointers the compiler has no control over or knowledge of the memory your application is using through this kind of variable. It's up to you to free the memory used by typed pointers.
On the other hand, when you use normal variables, depending on their use and declaration, the compiler will free the memory used by them when it considers them no longer necessary. For example:
procedure SomeStuff;
var
  NativeVariable: TSomeRecord;
  TypedPointer: ^TSomeRecord;
begin
  NativeVariable.Field2 := 'Hello World';
  // With typed pointers, we need to manually
  // create the variable before we can use it.
  New(TypedPointer);
  TypedPointer^.Field2 := 'Hello World';
  // Do your stuff here ...
  // ... at the end, we need to manually "free"
  // the typed pointer variable. Field2 within
  // TSomeRecord is also released.
  Dispose(TypedPointer);
  // You don't need to do the above for NativeVariable,
  // as the compiler will free it after this procedure
  // ends. This also applies to native arrays of TSomeRecord.
end;
In the above example, the variable NativeVariable is only used within the SomeStuff procedure, so the compiler automatically frees it when the procedure ends. This applies to most native variables, including arrays and record fields. Objects are treated differently, but that's for another post.

How can I modify and return a variable of type PChar in a function call

I need to store a variant value (which always returns a string) in a PChar variable. Right now I'm using this code:
procedure VariantToPChar(v:variant; p : PChar);
Var
s : String;
begin
s:=v;
GetMem(p,Length(s)*Sizeof(Char));
StrCopy(p, PChar(s));
end;
But I'm wondering if there is a better way.
Do you really, really have to create a PChar? For as long as possible I would use strings, and only if an external library (like the Windows API) requires a PChar would I cast it.
uses
Variants;
var
vText: Variant;
sText: String;
begin
vText := 'Hello world';
// VarToStr() can handle also null values
sText := VarToStr(vText);
// If absolutely necessary, cast it to PChar()
CallToExternalFunction(PChar(sText));
Doing it like this you can avoid problems with memory (de)allocation, null values, and Ansi/Unicode chars. If the external function wants to write into the string, you can use SetLength() before casting. Maybe the article Working with PChar could give you some ideas.
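A rough sketch of the SetLength-before-casting pattern mentioned above. FillBuffer is a hypothetical stand-in for an external routine that writes into a caller-supplied buffer; a real API will have its own signature and length conventions.

```pascal
program WriteIntoString;
{$APPTYPE CONSOLE}
uses
  SysUtils;

// Hypothetical stand-in for an external routine. It fills a buffer of
// BufLen characters (terminator included) and returns the number of
// characters written.
function FillBuffer(Buf: PChar; BufLen: Integer): Integer;
begin
  StrLCopy(Buf, 'Hello world', BufLen - 1);
  Result := Integer(StrLen(Buf));
end;

var
  sText: string;
  Len: Integer;
begin
  SetLength(sText, 64);                            // reserve room for 64 characters
  Len := FillBuffer(PChar(sText), Length(sText));  // callee writes into the string
  SetLength(sText, Len);                           // trim to what was actually written
  Writeln(sText);
end.
```

The string stays owned by the caller throughout, so there is nothing to free by hand.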
Update: You really shouldn't do this or use this code as you're likely to encourage people to write code that leaks. People will call this and fail to free the memory since they don't know that this function allocates memory.
If you want to store something in a PChar size buffer, and have that value still be associated with p (the pointer p is modified and is different when you return from the procedure), then you need to make the parameter a var (by-reference instead of by-value) parameter like this:
procedure AllocPCharBufFromVariant(v:variant; var p : PChar);
Var
s : String;
begin
try
s:=v;
GetMem(p,(Length(s)+1)*Sizeof(Char)); // fixed to add 1 for the nul
StrCopy(p, PChar(s));
except
on E:EVariantError do
begin
p := nil;
end;
end;
end;
I have also shown above handling EVariantError, which I have chosen to handle by returning nil in the p parameter, but you should think about how you want it to work, and then deal with it somehow.
The above code also leaks memory which is awful, so I renamed it AllocPChar. It seems like your original code has so many problems that I can't recommend a good way to do what looks like a giant pile of bad things and the name you chose is among the most awful choices.
At least the name Alloc gives me a hint so I'm thinking "I better free this when I'm done with it".
I suspect just a
PChar(string(v))
expression will do the trick.
And the memory used to store the converted string content will be available in the scope of this code (i.e. as long as the string(v) will be referenced - so you may want to use an explicit string variable to ensure that your PChar memory is still allocated).

Could omission of "^" when accessing a record pointer's members cause an access violation?

In VirtualTreeview, I am storing my data in the PVirtualNodes. I have experienced several Access Violations (typically with "Read of address 00000000") in my App, and they mostly (I'd actually dare to say always) occur when I am doing something with my Node Data.
However, the thing is, I declare my stuff & use it like this:
// DUMMY CODE - Not written or tested in IDE
var
MyNode : PVirtualNode;
MyData : PMyNodeData;
Begin
MyNode := VST.GetFirstSelected;
if Assigned(MyNode) then
Begin
MyData := VST.GetNodeData(MyNode);
if Assigned(MyData) then
Begin
MyData.DummyProperty := 'Test';
End;
End;
End;
As you probably noticed, I do not "dereference" (correct?) my "MyData" by doing MyData^! The reason I don't is that I have been told it was not necessary to add the caret to the pointer name, however I have a feeling that it has something to do with it. If I knew, I wouldn't be posting on here. ;)
So my question is: Is it in the end necessary to add the little ^ to MyData? And is it possible that by not doing that, I may provoke an Access Violation?
When you have a pointer to a record, then you can omit the ^. The following are equivalent:
MyData.DummyProperty
MyData^.DummyProperty
This is also the case for the deprecated Turbo Pascal object. I would expect it to be so for Delphi classes, although I have never tried with them since they are already reference types.
Sadly, this is not the explanation for your AV.
Using ^ to dereference records is optional, as it is assumed implicitly by the compiler. When not using any hard typecast, any situation that would require the "^" would not compile. But only one level of dereferencing is implicit.
type
TMyRecord = record
MyField : Integer;
end;
PMyRecord = ^TMyRecord;
PPMyRecord = ^PMyRecord;
procedure DoSomething;
var vMyField : PPMyRecord;
begin
vMyField.MyField := 1; // Won't compile
vMyField^.MyField := 1; // Will compile
end;
As for your access violation, here's my best guess based on what you wrote. Assuming your example is representative (that is, the crash occurs when assigning a string), and assuming PMyNodeData points to a record, I'd guess that PMyNodeData's memory was reserved with GetMem instead of New, leaving the string field of the record uninitialized.
There is an exception where Data.xx and Data^.xx are not the same: when the field pointed at is of the same pointer type or the generic pointer type:
var
x: PPointer;
y: Pointer;
begin
x := GetPPointer();
y := x;
y := x^;
end;
I consider it best practice to always add the operator ^ when the pointed value is used to avoid ambiguous situations like above.
Given your example: The problem is possibly memory corruption. Did you set NodeDataSize correctly?

How do I stop this Variant memory leak?

I'm using an old script engine that's no longer supported by its creators, and having some trouble with memory leaks. It uses a function written in ASM to call from scripts into Delphi functions, and returns the result as an integer then passes that integer as an untyped parameter to another procedure that translates it into the correct type.
This works fine for most things, but when the return type of the Delphi function was Variant, it leaks memory because the variant is never getting disposed of. Does anyone know how I can take an untyped parameter containing a variant and ensure that it will be disposed of properly? This will probably involve some inline assembly.
procedure ConvertVariant(var input; var output: variant);
begin
output := variant(input);
asm
//what do I put here? Input is still held in EAX at this point.
end;
end;
EDIT: Responding to Rob Kennedy's question in comments:
AnsiString conversion works like this:
procedure VarFromString2(var s : AnsiString; var v : Variant);
begin
v := s;
s := '';
end;
procedure StringToVar(var p; var v : Variant);
begin
asm
call VarFromString2
end;
end;
That works fine and doesn't produce memory leaks. When I try to do the same thing with a variant as the input parameter, and assign the original Null on the second procedure, the memory leaks still happen.
The variants mostly contain strings--the script in question is used to generate XML--and they got there by assigning a Delphi string to a variant in the Delphi function that this script is calling. (Changing the return type of the function wouldn't work in this case.)
Have you tried the same trick as with the string? With a Variant, you should assign Unassigned instead of Null to free it, just as you did s := ''; for the string.
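A sketch of what that suggestion might look like, modeled on VarFromString2 from the question. VarFromVariant2 is a hypothetical name, and whether this cures the leak in the script engine is untested here; it only shows the source variant being released after the copy.

```pascal
program ClearVariantSource;
{$APPTYPE CONSOLE}
uses
  Variants;

// Hypothetical counterpart of VarFromString2 for a Variant source.
procedure VarFromVariant2(var Source: Variant; var Dest: Variant);
begin
  Dest := Source;
  Source := Unassigned;  // releases whatever Source was holding
end;

var
  Src, Dst: Variant;
begin
  Src := 'some XML text';
  VarFromVariant2(Src, Dst);
  Writeln(VarIsEmpty(Src));  // Src no longer holds the string
  Writeln(VarToStr(Dst));
end.
```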
And by the way, one of the only reasons I can think of that requires to explicitly free the strings, Variants, etc... is when using some ThreadVar.

Initialise string function result?

I've just been debugging a problem with a function that returns a string that has got me worried. I've always assumed that the implicit Result variable for functions that return a string would be empty at the start of the function call, but the following (simplified) code produced an unexpected result:
function TMyObject.GenerateInfo: string;
procedure AppendInfo(const AppendStr: string);
begin
if(Result > '') then
Result := Result + #13;
Result := Result + AppendStr;
end;
begin
if(ACondition) then
AppendInfo('Some Text');
end;
Calling this function multiple times resulted in:
"Some Text"
the first time,
"Some Text"
"Some Text"
the second time,
"Some Text"
"Some Text"
"Some Text"
the third time, etc.
To fix it I had to initialise the Result:
begin
Result := '';
if(ACondition) then
AppendInfo('Some Text');
end;
Is it necessary to initialise a string function result? Why (technically)? Why does the compiler not emit a warning "W1035 Return value of function 'xxx' might be undefined" for string functions? Do I need to go through all my code to make sure a value is set as it is not reliable to expect an empty string from a function if the result is not explicitly set?
I've tested this in a new test application and the result is the same.
procedure TForm1.Button1Click(Sender: TObject);
var
i: integer;
S: string;
begin
for i := 1 to 5 do
S := GenerateInfo;
ShowMessage(S); // 5 lines!
end;
This is not a bug, but a "feature":
For a string, dynamic array, method pointer, or variant result, the effects are the same as if the function result were declared as an additional var parameter following the declared parameters. In other words, the caller passes an additional 32-bit pointer that points to a variable in which to return the function result.
I.e. your
function TMyObject.GenerateInfo: string;
Is really this:
procedure TMyObject.GenerateInfo(var Result: string);
Note "var" prefix (not "out" as you may expect!).
This is SO unintuitive that it leads to all kinds of problems in code. The code in question is just one example of the results of this feature.
See and vote for this request.
We've run into this before, I think maybe as far back as Delphi 6 or 7. Yes, even though the compiler doesn't bother to give you a warning, you do need to initialize your string Result variables, for precisely the reason you ran into. The string variable is getting initialized -- it doesn't start as a garbage reference -- but it doesn't seem to get reinitialized when you expect it to.
As for why it happens... not sure. It's a bug, so it doesn't necessarily need a reason. We only saw it happen when we called the function repeatedly in a loop; if we called it outside a loop, it worked as expected. It looked like the caller was allocating space for the Result variable (and reusing it when it called the same function repeatedly, thus causing the bug), rather than the function allocating its own string (and allocating a new one on each call).
If you were using short strings, then the caller does allocate the buffer -- that's long-standing behavior for large value types. But that doesn't make sense for AnsiString. Maybe the compiler team just forgot to change the semantics when they first implemented long strings in Delphi 2.
This is not a bug. By definition, no variable inside a function is initialized, including Result.
So your Result is undefined on the first call and can hold anything. How it is implemented in the compiler is irrelevant, and you can get different results in different compilers.
It seems like your function should be simplified like this:
function TMyObject.GenerateInfo: string;
begin
if(ACondition) then
Result := 'Some Text'
else
Result := '';
end;
You typically don't want to use Result on the right side of an assignment in a function.
Anyway, strictly for illustrative purposes, you could also do this, though not recommended:
procedure TForm1.Button1Click(Sender: TObject);
var
i: integer;
S: string;
begin
for i := 1 to 5 do
begin
S := ''; // Clear before you call
S := GenerateInfo;
end;
ShowMessage(S); // 1 line instead of 5
end;
This looks like a bug in D2007. I just tested it in Delphi 2010 and got the expected behavior. (1 line instead of 5.)
If you think that automatic management of strings exists only to make your life easier, you're only partly right. All such things are also done to make string logic consistent and free of side effects.
In plenty of places strings are passed by reference or by value, and all of that code expects VALID strings whose memory-management counter holds a valid value, not garbage. To keep strings valid, the only thing guaranteed is that they are initialized when they are first introduced. For any local string variable this is a necessity, since that is the place where the string is introduced. All other string usage, including function(): string (which is actually procedure(var Result: string), as Alexander correctly pointed out), just expects a valid string on the stack, not a freshly initialized one. Validity here comes from the fact that the (var Result: string) construction says "I'm waiting for a valid variable that was definitely introduced before". UPDATE: Because of that, the actual contents of Result are unpredictable, but by the same logic, if this is the only call to the function that has a given local variable on the left, the string is guaranteed to be empty in that case.
Alex's answer is nearly always right and it answers why I was seeing the strange behaviour that I was, but it isn't the whole story.
The following, compiled without optimisation, produces the expected result of sTemp being an empty string. If you swap the function out for the procedure call you get a different result.
There seems to be a different rule for the actual program unit.
Admittedly this is a corner case.
program Project1;
{$APPTYPE CONSOLE}
uses System.SysUtils;
function PointlessFunction: string;
begin
end;
procedure PointlessProcedure(var AString: string);
begin
end;
var
sTemp: string;
begin
sTemp := '1234';
sTemp := PointlessFunction;
//PointlessProcedure(sTemp);
WriteLn('Result:' + sTemp);
ReadLn;
end.

Resources