Initialise string function result? - delphi

I've just been debugging a problem with a function that returns a string that has got me worried. I've always assumed that the implicit Result variable for functions that return a string would be empty at the start of the function call, but the following (simplified) code produced an unexpected result:
function TMyObject.GenerateInfo: string;
  procedure AppendInfo(const AppendStr: string);
  begin
    if(Result > '') then
      Result := Result + #13;
    Result := Result + AppendStr;
  end;
begin
  if(ACondition) then
    AppendInfo('Some Text');
end;
Calling this function multiple times resulted in:
"Some Text"
the first time,
"Some Text"
"Some Text"
the second time,
"Some Text"
"Some Text"
"Some Text"
the third time, etc.
To fix it I had to initialise the Result:
begin
  Result := '';
  if(ACondition) then
    AppendInfo('Some Text');
end;
Is it necessary to initialise a string function result? Why (technically)? Why does the compiler not emit a warning "W1035 Return value of function 'xxx' might be undefined" for string functions? Do I need to go through all my code to make sure a value is set as it is not reliable to expect an empty string from a function if the result is not explicitly set?
I've tested this in a new test application and the result is the same.
procedure TForm1.Button1Click(Sender: TObject);
var
  i: integer;
  S: string;
begin
  for i := 1 to 5 do
    S := GenerateInfo;
  ShowMessage(S); // 5 lines!
end;

This is not a bug, but a "feature":
For a string, dynamic array, method pointer, or variant result, the effects are the same as if the function result were declared as an additional var parameter following the declared parameters. In other words, the caller passes an additional 32-bit pointer that points to a variable in which to return the function result.
I.e. your
function TMyObject.GenerateInfo: string;
Is really this:
procedure TMyObject.GenerateInfo(var Result: string);
Note the "var" prefix (not "out", as you might expect!).
This is very unintuitive, and it leads to all kinds of problems in code. The code in question is just one example of the consequences of this feature.
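A minimal sketch of that equivalence, assuming a console project; AppendMark and AppendMarkProc are hypothetical names used only for illustration. The procedure form always accumulates, while what the function form prints depends on the compiler version and on whether a hidden temporary is used for the result.

```pascal
program ResultAsVarDemo;
{$APPTYPE CONSOLE}

// Hypothetical function that reads Result before giving it a fresh value,
// so whatever the hidden var parameter already held shows through.
function AppendMark: string;
begin
  Result := Result + '*';
end;

// The shape the quoted documentation says the function above effectively has.
procedure AppendMarkProc(var AResult: string);
begin
  AResult := AResult + '*';
end;

var
  S: string;
  I: Integer;
begin
  S := '';
  for I := 1 to 3 do
    AppendMarkProc(S);
  WriteLn(S); // explicit var parameter: three marks accumulate

  S := '';
  for I := 1 to 3 do
    S := AppendMark;
  WriteLn(S); // one mark or three, depending on how the compiler passes Result
end.
```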

We've run into this before, I think maybe as far back as Delphi 6 or 7. Yes, even though the compiler doesn't bother to give you a warning, you do need to initialize your string Result variables, for precisely the reason you ran into. The string variable is getting initialized -- it doesn't start as a garbage reference -- but it doesn't seem to get reinitialized when you expect it to.
As for why it happens... not sure. It's a bug, so it doesn't necessarily need a reason. We only saw it happen when we called the function repeatedly in a loop; if we called it outside a loop, it worked as expected. It looked like the caller was allocating space for the Result variable (and reusing it when it called the same function repeatedly, thus causing the bug), rather than the function allocating its own string (and allocating a new one on each call).
If you were using short strings, then the caller does allocate the buffer -- that's long-standing behavior for large value types. But that doesn't make sense for AnsiString. Maybe the compiler team just forgot to change the semantics when they first implemented long strings in Delphi 2.

This is not a bug. By definition, no variable inside a function is initialized, including Result.
So your Result is undefined on the first call and can hold anything. How the compiler implements it is irrelevant, and you can get different results with different compilers.

It seems like your function should be simplified like this:
function TMyObject.GenerateInfo: string;
begin
  if(ACondition) then
    Result := 'Some Text'
  else
    Result := '';
end;
You typically don't want to use Result on the right side of an assignment in a function.
Anyway, strictly for illustrative purposes, you could also do this, though not recommended:
procedure TForm1.Button1Click(Sender: TObject);
var
  i: integer;
  S: string;
begin
  for i := 1 to 5 do
  begin
    S := ''; // Clear before you call
    S := GenerateInfo;
  end;
  ShowMessage(S); // 5 lines!
end;
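When the function really does need to append, one way to follow the advice about keeping Result off the right-hand side is to build the text in a local variable and assign Result once at the end. This is only a sketch: the two Boolean parameters are hypothetical stand-ins for ACondition.

```pascal
program BuildInLocal;
{$APPTYPE CONSOLE}

// Hypothetical variant of GenerateInfo: the two flags stand in for ACondition.
function GenerateInfo(HasFirst, HasSecond: Boolean): string;
var
  Info: string;
begin
  Info := '';        // explicit, although managed locals start out empty
  if HasFirst then
    Info := 'First';
  if HasSecond then
  begin
    if Info <> '' then
      Info := Info + #13;
    Info := Info + 'Second';
  end;
  Result := Info;    // Result is written exactly once and never read
end;

var
  I: Integer;
  S: string;
begin
  for I := 1 to 5 do
    S := GenerateInfo(True, False);
  WriteLn(S);
end.
```

Because the function never reads Result, the value left in the caller's variable by a previous call cannot leak into the next one, however many times the loop runs.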

This looks like a bug in D2007. I just tested it in Delphi 2010 and got the expected behavior. (1 line instead of 5.)

If you think that automatic string management exists only to make your life easier, you're only partly right. It is also there to keep string logic consistent and free of side effects.
Strings are passed by reference in plenty of places and by value in others, but all of that code expects VALID strings, whose memory-management counter holds a valid value rather than garbage. To keep strings valid, the one thing that must be guaranteed is that they are initialized when they are first introduced. For a local string variable this is a necessity, since that is where the string is introduced. Every other use of a string, including function(): string (which is actually procedure(var Result: string), as Alexander correctly pointed out), just expects a valid string on the stack, not a freshly initialized one. The validity comes from the fact that the (var Result: string) construction says "I'm expecting a valid variable that was definitely introduced earlier". UPDATE: Because of that, the actual contents of Result are unpredictable, but by the same logic, if this is the only call to the function that has a local variable on the left-hand side, the string is guaranteed to be empty in that case.

Alex's answer is nearly always right and it answers why I was seeing the strange behaviour that I was, but it isn't the whole story.
The following, compiled without optimisation, produces the expected result of sTemp being an empty string. If you swap the function out for the procedure call you get a different result.
There seems to be a different rule for the actual program unit.
Admittedly this is a corner case.
program Project1;
{$APPTYPE CONSOLE}
uses System.SysUtils;
function PointlessFunction: string;
begin
end;
procedure PointlessProcedure(var AString: string);
begin
end;
var
  sTemp: string;
begin
  sTemp := '1234';
  sTemp := PointlessFunction;
  //PointlessProcedure(sTemp);
  WriteLn('Result:' + sTemp);
  ReadLn;
end.

Related

Why does this one PAnsiChar get chopped when converted to AnsiString?

Please consider the following program:
program SO41175184;
{$APPTYPE CONSOLE}
uses
  SysUtils;
function Int9999: PAnsiChar;
begin
  Result := PAnsiChar(AnsiString(IntToStr(9999)));
end;
function Int99999: PAnsiChar;
begin
  Result := PAnsiChar(AnsiString(IntToStr(99999)));
end;
function Int999999: PAnsiChar;
begin
  Result := PAnsiChar(AnsiString(IntToStr(999999)));
end;
function Str9999: PAnsiChar;
begin
  Result := PAnsiChar(AnsiString('9999'));
end;
function Str99999: PAnsiChar;
begin
  Result := PAnsiChar(AnsiString('99999'));
end;
function Str999999: PAnsiChar;
begin
  Result := PAnsiChar(AnsiString('999999'));
end;
begin
  WriteLn(Int9999); // '9999'
  WriteLn(Int99999); // '99999'
  WriteLn(Int999999); // '999999'
  WriteLn(string(AnsiString(Str9999))); // '9999'
  WriteLn(string(AnsiString(Str99999))); // '99999'
  WriteLn(string(AnsiString(Str999999))); // '999999'
  WriteLn(string(AnsiString(PAnsiChar(AnsiString(IntToStr(9999)))))); // '9999'
  WriteLn(string(AnsiString(PAnsiChar(AnsiString(IntToStr(99999)))))); // '99999'
  WriteLn(string(AnsiString(PAnsiChar(AnsiString(IntToStr(999999)))))); // '999999'
  WriteLn(string(AnsiString(Int9999))); // '9999'
  WriteLn(string(AnsiString(Int99999))); // '9999' <----- ?!
  WriteLn(string(AnsiString(Int999999))); // '999999'
  ReadLn;
end.
Only in one of these cases does the string lose a single character, in Delphi 2010 and Delphi XE3 both. With FPC the same program works correctly. Switching to PChar also makes the problem disappear.
I suppose it has something to do with memory management, but I don't have enough of a clue where to look to do a meaningful investigation. Could anyone clarify?
Dynamically created strings are reference counted and deallocated when no references remain.
Result := PAnsiChar(AnsiString(IntToStr(99999)));
causes a temporary AnsiString to be created, its address taken via the cast to PAnsiChar, and then the temporary string deallocated†. The resulting pointer points to now-unclaimed memory that may be overwritten for pretty much any reason, including during the allocations of more strings.
Neither Delphi nor FPC clears memory by default during deallocations, so if the memory hasn't been re-used yet, you may get lucky when reading what used to be there. Or, as you saw, you may not.
When returning PAnsiChar like this, you need an agreement between caller and callee on memory management. You need to make sure you do not free the memory early, and you need to make sure your callers know how to free the memory afterwards.
† Remy Lebeau points out that this deallocation happens when the procedure or function returns. If there is another statement after the assignment to Result, the string will still be available. This is normally correct, but there are also cases where the temporary string gets deallocated before the return, for example when you create temporary strings in a loop. I would not recommend using temporary objects after the statement that creates them concludes, even in cases where it is valid, because it makes it too hard to verify whether the code is correct. For those cases, just use an explicit variable.
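One way to apply the "explicit variable" advice is to return the AnsiString itself and take the pointer only at the call site, where a named variable keeps the buffer alive. This is a sketch under that assumption; Keep and P are hypothetical names, and Int99999 is redefined here with a different return type from the one in the question.

```pascal
program KeepAnsiStringAlive;
{$APPTYPE CONSOLE}
uses
  SysUtils;

// Return the managed string itself rather than a pointer into a temporary.
function Int99999: AnsiString;
begin
  Result := AnsiString(IntToStr(99999));
end;

var
  Keep: AnsiString;
  P: PAnsiChar;
begin
  Keep := Int99999;     // Keep holds a reference, so the buffer stays allocated
  P := PAnsiChar(Keep); // valid for as long as Keep is in scope and unmodified
  WriteLn(P);
  WriteLn(string(AnsiString(P)));
  ReadLn;
end.
```

If the pointer has to outlive the caller's variable (for example across a DLL boundary), the caller and callee still need the explicit ownership agreement described above.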

Is Result variable defined from first line in a function?

I need a clarification of this case.
According to my tests, the Result variable is initialised to:
Boolean=False, Integer=0, String='', Object=nil etc. from the first line.
But I have never seen an official reference for this.
It would also make sense, since this gives the hint:
[DCC Warning] Unit1.pas(35): H2077 Value assigned to 'TForm1.Test' never used
function TForm1.Test: Boolean;
begin
  Result := False;
  // Some arbitrary code here
  Result := True;
end;
But what happens if I comment out the first line and there is an exception somewhere before the last line? Is Result = False?
If Result is undefined, this means that I always have to start every function by defining Result in case of an exception later. And this makes no sense to me.
As stated by the official Delphi documentation, the result is either:
CPU register(s) (AL / AX / EAX / RAX / EAX:EDX) for ordinal values and elements that fit in a register;
FPU register (st(0) / XMM0);
An additional variable passed as the last parameter.
The general rule is that no result value is defined by default. You'll have to set it. The compiler will warn you about any result that is not set.
For a string, dynamic array, method pointer, or variant result, the effects are the same as if the function result were declared as an additional var parameter following the declared parameters. In other words, the caller passes an additional 32-bit pointer that points to a variable in which to return the function result.
To be accurate, the var parameter is used not only for managed types, but also for record or object results, which are allocated on the stack before the call and so are subject to the same behavior.
That is, if your result is a string, for instance, it will be passed as an additional var parameter. So by default it will contain the value it had before the call. It will be '' at first; then, if you call the function several times, it will contain the previous value.
function GetString: string;
// is compiled as procedure GetString(var result: string);
begin
  if result='' then
    result := 'test' else
    writeln('result=',result);
end;
function GetRaise: string;
// is compiled as procedure GetRaise(var result: string);
begin
  result := 'toto';
  raise Exception.Create('Problem');
end;
var s: string;
begin
  // here s=''
  s := GetString; // called as GetString(s);
  // here s='test'
  s := GetString; // called as GetString(s);
  // will write 'result=test' on the console
  try
    s := GetRaise; // called as GetRaise(s);
  finally
    // here s='toto'
  end;
end;
So my advice is:
Fix all compiler warnings about unset results;
Do not assume that a result string is initialized to '' (it may be on the first call, but not on the second): it is passed as a var parameter, not as an out parameter;
Any exception will be processed as usual, that is, execution will jump to the next finally or except block; but if the result is passed as a var parameter and something has already been assigned to it, the caller's variable will keep that value;
Just because an unset ordinal result (e.g. a boolean) is 0 in most cases (because EAX=0 in the asm code just before the return) does not mean it will be 0 next time (I've seen random issues on the customer side because of such unset result variables: it works most of the time, then sometimes the code fails...);
You can use the exit() syntax to return a value in newer versions of Delphi.
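As a small illustration of the last two points, the sketch below sets the return value on every path and uses the Exit(value) form, which to my knowledge is available from Delphi 2009 onwards. IndexOfChar is a hypothetical helper, and the loop assumes the 1-based string indexing of the desktop compilers.

```pascal
program ExitWithValue;
{$APPTYPE CONSOLE}

// Hypothetical helper: every path sets the return value explicitly.
function IndexOfChar(const S: string; C: Char): Integer;
var
  I: Integer;
begin
  for I := 1 to Length(S) do
    if S[I] = C then
      Exit(I); // sets the function result and leaves in one step
  Result := 0; // reached only when C was not found
end;

begin
  WriteLn(IndexOfChar('delphi', 'p'));
  WriteLn(IndexOfChar('delphi', 'z'));
end.
```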
You state:
If Result is undefined this means that I always have to start every function by defining Result in case of exception later.
You are concerned that the return value of a function is undefined if the function raises an exception. But that should not matter. Consider the following code:
x := fn();
If the body of the function fn raises an exception then, back at the call site, x should not be assigned to. Logically the one-liner above can be thought of as a two-liner:
call fn()
assign return value to x
If an exception is raised in line 1 then line 2 never happens and x should never be assigned to.
So, if an exception is raised before you have assigned to Result then that is simply not a problem because a function's return value should never be used if the function raises an exception.
What you should in fact be concerned about is a related issue. What if you assign to Result and then an exception is raised? Is it possible for the value you assigned to Result to propagate outside of the function? Sadly the answer is yes.
For many result types (e.g. Integer, Boolean etc.) the value you assign to Result does not propagate outside the function if that function raises an exception. So far, so good.
But for some result types (strings, dynamic arrays, interface references, variants etc.) there is an implementation detail that complicates matters. The return value is passed to the function as a var parameter. And it turns out that you can initialise the return value from outside the function. Like this:
s := 'my string';
s := fn();
When the body of fn begins execution, Result has the value 'my string'. It is as if fn is declared like this:
procedure fn(var Result: string);
And this means that you can assign to the Result variable and see modifications at the call site, even if your function subsequently raises an exception. There is no clean way to work around it. The best you can do is to assign to a local variable in the function and only assign Result as the final act of the function.
function fn: string;
var
  s: string;
begin
  s := ...
  ... blah blah, maybe raise exception
  Result := s;
end;
The lack of a C style return statement is felt strongly here.
It is surprisingly hard to state accurately which types of result variable will be susceptible to the problem described above. Initially I thought that the problem just affected managed types. But Arnaud states in a comment that records and objects are affected too. Well, that is true if the record or object is stack allocated. If it is a global variable, or heap allocated (e.g. member of a class) then the compiler treats it differently. For heap allocated records, an implicit stack allocated variable is used to return the function result. Only when the function returns is this copied to the heap allocated variable. So the variable to which you assign the function result at the call site affects the semantics of the function itself!
In my opinion this is all a very clear illustration of why it was a dreadful mistake, in the language design, for function return values to have var semantics as opposed to having out semantics.
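The difference between writing to Result early and assigning it as the final act can be sketched as follows. Leaky, Guarded and Run are hypothetical names, and what the first WriteLn(S) shows depends on whether the compiler passes S itself or a hidden temporary as the result variable.

```pascal
program ResultOnException;
{$APPTYPE CONSOLE}
uses
  SysUtils;

// Hypothetical function that writes to Result before failing.
function Leaky: string;
begin
  Result := 'partial';
  raise Exception.Create('failed');
end;

// Hypothetical function that builds its value in a local and assigns Result last.
function Guarded(Fail: Boolean): string;
var
  Tmp: string;
begin
  Tmp := 'partial';
  if Fail then
    raise Exception.Create('failed');
  Result := Tmp;
end;

procedure Run;
var
  S: string;
begin
  S := 'before';
  try
    S := Leaky;
  except
    on E: Exception do
      WriteLn('caught: ', E.Message);
  end;
  WriteLn(S); // 'before' or 'partial', depending on how Result was passed

  S := 'before';
  try
    S := Guarded(True);
  except
    on E: Exception do
      WriteLn('caught: ', E.Message);
  end;
  WriteLn(S); // 'before': Guarded never touched Result on the failing path
end;

begin
  Run;
end.
```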
No, Result has no (guaranteed) default value. It is undefined unless you give it a value. This is implied by the documentation, which states
If the function exits without assigning a value to Result or the function name, then the function's return value is undefined.
I just tried
function test: integer;
begin
  ShowMessage(IntToStr(result));
end;
and got a message with the text 35531136.

How can I modify and return a variable of type PChar in a function call

I need to store a variant value (which always contains a string) in a PChar variable. At the moment I'm using this code:
procedure VariantToPChar(v:variant; p : PChar);
Var
  s : String;
begin
  s:=v;
  GetMem(p,Length(s)*Sizeof(Char));
  StrCopy(p, PChar(s));
end;
But I'm wondering whether there is a better way.
Do you really, really have to create a PChar? I would use Strings for as long as possible, and cast to PChar only where an external library (like the Windows API) requires one.
uses
  Variants;
var
  vText: Variant;
  sText: String;
begin
  vText := 'Hello world';
  // VarToStr() can also handle null values
  sText := VarToStr(vText);
  // If absolutely necessary, cast it to PChar()
  CallToExternalFunction(PChar(sText));
Doing it like this you can avoid problems with memory (de)allocation, null values, and Ansi/Unicode chars. If the external function wants to write into the string, you can use SetLength() before casting. Maybe the article Working with PChar could give you some ideas.
Update: You really shouldn't do this or use this code as you're likely to encourage people to write code that leaks. People will call this and fail to free the memory since they don't know that this function allocates memory.
If you want to store something in a PChar size buffer, and have that value still be associated with p (the pointer p is modified and is different when you return from the procedure), then you need to make the parameter a var (by-reference instead of by-value) parameter like this:
procedure AllocPCharBufFromVariant(v:variant; var p : PChar);
Var
  s : String;
begin
  try
    s:=v;
    GetMem(p,(Length(s)+1)*Sizeof(Char)); // fixed to add 1 for the nul
    StrCopy(p, PChar(s));
  except
    on E:EVariantError do
    begin
      p := nil;
    end;
  end;
end;
I have also shown above handling EVariantError, which I have chosen to handle by returning nil in the p parameter, but you should think about how you want it to work, and then deal with it somehow.
The above code also leaks memory which is awful, so I renamed it AllocPChar. It seems like your original code has so many problems that I can't recommend a good way to do what looks like a giant pile of bad things and the name you chose is among the most awful choices.
At least the name Alloc gives me a hint so I'm thinking "I better free this when I'm done with it".
I suspect just a
PChar(string(v))
expression will do the trick.
The memory used to store the converted string content remains available within the scope of this code, i.e. for as long as the temporary string(v) is still referenced, so you may want to use an explicit string variable to ensure that the memory your PChar points to is still allocated.

How do I stop this Variant memory leak?

I'm using an old script engine that's no longer supported by its creators, and having some trouble with memory leaks. It uses a function written in ASM to call from scripts into Delphi functions, and returns the result as an integer then passes that integer as an untyped parameter to another procedure that translates it into the correct type.
This works fine for most things, but when the return type of the Delphi function was Variant, it leaks memory because the variant is never getting disposed of. Does anyone know how I can take an untyped parameter containing a variant and ensure that it will be disposed of properly? This will probably involve some inline assembly.
procedure ConvertVariant(var input; var output: variant);
begin
  output := variant(input);
  asm
    //what do I put here? Input is still held in EAX at this point.
  end;
end;
EDIT: Responding to Rob Kennedy's question in comments:
AnsiString conversion works like this:
procedure VarFromString2(var s : AnsiString; var v : Variant);
begin
  v := s;
  s := '';
end;
procedure StringToVar(var p; var v : Variant);
begin
  asm
    call VarFromString2
  end;
end;
That works fine and doesn't produce memory leaks. When I try to do the same thing with a variant as the input parameter, and assign the original Null on the second procedure, the memory leaks still happen.
The variants mostly contain strings--the script in question is used to generate XML--and they got there by assigning a Delphi string to a variant in the Delphi function that this script is calling. (Changing the return type of the function wouldn't work in this case.)
Have you tried the same trick as with the string, except that for a Variant you assign Unassigned instead of Null to free it, just as you did s := ''; for the string?
And by the way, one of the only reasons I can think of that requires to explicitly free the strings, Variants, etc... is when using some ThreadVar.

How to leak a string in Delphi

I was talking to a co-worker the other day about how you can leak a string in Delphi if you really mess things up. By default strings are reference counted and automatically allocated, so they typically just work without any thought - no need for manual allocation, size calculations, or memory management.
But I remember reading once that there is a way to leak a string directly (without including it in an object that gets leaked). It seems like it had something to do with passing a string by reference and then accessing it from a larger scope from within the routine it was passed to. Yeah, I know that is vague, which is why I am asking the question here.
I don't know about the issue in your second paragraph, but I was bitten once by leaked strings in a record.
If you call FillChar() on a record that contains strings you overwrite the ref count and the address of the dynamically allocated memory with zeroes. Unless the string is empty this will leak the memory. The way around this is to call Finalize() on the record before clearing the memory it occupies.
Unfortunately calling Finalize() when there are no record members that need finalizing causes a compiler hint. It happened to me that I commented out the Finalize() call to silence the hint, but later when I added a string member to the record I missed uncommenting the call, so a leak was introduced. Luckily I'm generally using the FastMM memory manager in the most verbose and paranoid setting in debug mode, so the leak didn't go unnoticed.
The compiler hint is probably not such a good thing, silently omitting the Finalize() call if it's not needed would be much better IMHO.
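A minimal sketch of the Finalize-then-FillChar order described above; TInfo and ClearInfo are hypothetical names chosen for illustration.

```pascal
program ClearRecordSafely;
{$APPTYPE CONSOLE}

type
  // Hypothetical record with one managed field.
  TInfo = record
    Name: string;
    Count: Integer;
  end;

// Release managed fields first, then zero the raw bytes.
procedure ClearInfo(var Info: TInfo);
begin
  Finalize(Info);                  // drops the reference held by Info.Name
  FillChar(Info, SizeOf(Info), 0); // nothing managed is left to overwrite
end;

var
  Info: TInfo;
begin
  Info.Name := StringOfChar('x', 10); // heap-allocated, reference-counted string
  Info.Count := 2;
  ClearInfo(Info);
  WriteLn(Length(Info.Name), ' ', Info.Count);
end.
```

Calling FillChar without the preceding Finalize would zero the pointer in Info.Name while the ten-character string was still allocated, which is exactly the leak described above.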
No, I don't think such a thing can happen. It's possible for a string variable to obtain a value that you didn't expect, but it won't leak memory. Consider this:
var
  Global: string;
procedure One(const Arg: string);
begin
  Global := '';
  // Oops. This is an invalid reference now. Arg points to
  // what Global used to refer to, which isn't there anymore.
  writeln(Arg);
end;
procedure Two;
begin
  Global := 'foo';
  UniqueString(Global);
  One(Global);
  Assert(Global = 'foo', 'Uh-oh. The argument isn''t really const?');
end;
Here One's argument is declared const, so supposedly, it won't change. But then One circumvents that by changing the actual parameter instead of the formal parameter. Procedure Two "knows" that One's argument is const, so it expects the actual parameter to retain its original value. The assertion fails.
The string hasn't leaked, but this code does demonstrate how you can get a dangling reference for a string. Arg is a local alias of Global. Although we've changed Global, Arg's value remains untouched, and because it was declared const, the string's reference count was not incremented upon entry to the function. Reassigning Global dropped the reference count to zero, and the string was destroyed. Declaring Arg as var would have the same problem; passing it by value would fix this problem. (The call to UniqueString is just to ensure the string is reference-counted. Otherwise, it may be a non-reference-counted string literal.) All compiler-managed types are susceptible to this problem; simple types are immune.
The only way to leak a string is to treat it as something other than a string, or to use non-type-aware memory-management functions. Mghie's answer describes how to treat a string as something other than a string by using FillChar to clobber a string variable. Non-type-aware memory functions include GetMem and FreeMem. For example:
type
  PRec = ^TRec;
  TRec = record
    field: string;
  end;
var
  Rec: PRec;
begin
  GetMem(Rec, SizeOf(Rec^));
  // Oops. Rec^ is uninitialized. This assignment isn't safe.
  Rec^.field := IntToStr(4);
  // Even if the assignment were OK, FreeMem would leak the string.
  FreeMem(Rec);
end;
There are two ways to fix it. One is to call Initialize and Finalize:
GetMem(Rec, SizeOf(Rec^));
Initialize(Rec^);
Rec^.field := IntToStr(4);
Finalize(Rec^);
FreeMem(Rec);
The other is to use type-aware functions:
New(Rec);
Rec^.field := IntToStr(4);
Dispose(Rec);
Actually, passing a string as const or non-const is the same in terms of reference counting in Delphi 2007 and 2009. There was a case that caused an access violation when a string was passed as const. Here is the problematic code:
type
  TFoo = class
    S: string;
    procedure Foo(const S1: string);
  end;
procedure TFoo.Foo(const S1: string);
begin
  S:= S1; //access violation
end;
var
  F: TFoo;
begin
  F:= TFoo.create;
  try
    F.S := 'S';
    F.Foo(F.S);
  finally
    F.Free;
  end;
end.
Another way to leak a string is to declare it as a threadvar variable. See my question for details. And for the solution, see the solution on how to tidy it.
I think this might have been similar to what I was thinking of. It is the reverse of a string leak, a string that gets collected early:
var
  p : ^String;
procedure InitString;
var
  s, x : String;
begin
  s := 'A cool string!';
  x := s + '. Append something to make a copy in ' +
    'memory and generate a new string.';
  p := @x;
end;
begin
  { Call a function that will generate a string }
  InitString();
  { Write the value of the string (pointed to by p) }
  WriteLn(p^); // Runtime error 105!
  { Wait for a key press }
  ReadLn;
end.
