What is the default value of 'Result' in Delphi? - delphi

Is there any guaranteed default value for the Result variable of a function, like 0, '' or nil? Or should Result always be initialised before use?
I have a function returning a string like this:
function Foo(): String
begin
while {...} do
Result := Result + 'boingbumtschak';
end;
It worked fine, but now I get some strings containing contents from a previous call to the function. When I add a Result := '' at the beginning, it is OK. When should I initialize the Result variable and when don't I have to? (strings, primitives, Class instances (nil))

A function return value of type string is actually treated by the compiler as an implicit var parameter. When the function begins execution, the Result variable contains whatever is in the local variable to which the return value will subsequently be assigned.
Accordingly, you should always initialize function return values. This advice holds not only for strings, but for all data types.
This issue was discussed just yesterday here on Stack Overflow:
Do I need to setLength a dynamic array on initialization?

If the function exits without
assigning a value to Result or the
function name, then the function's
return value is undefined.
see Delphi Reference > Procedures and Functions > Function Declarations

I don't know what it's like in Delphi, but I always initialize variables to a sane value before I perform operations on them (even if that sane value is null, which it might very well be in some situations). Many times it's unneeded, but in those instances, I count on the compiler or JITter to optimize the assignment out if it wants to, rather than relying on potentially undocumented language semantics or compiler implementation details. Maybe it's my background in C, which in and of itself essentially guarantees nothing about initial values, but it feels worthwhile to spend the extra one line of code (at most) in order to get clearer code. By explicitly assigning a value before you start working on the variable, you establish a clear contract with whoever is reading the code; they can trust that their idea of what the starting value of the variable is, actually holds.
As for this particular question, though; isn't Result supposed to be function-local in scope? I would be very surprised if such a variable, even though special, kept values from previous invocations of the function.

Related

Please, explain what does the following line of "with.. do" represent

In the code: pbOut is a TPaintBox.
with pbOut.Canvas, Font do
begin
....
end;
The question is, i want to understand what does the comma in with.. do structure do? Is it the same as writing with pbOut.Canvas.Font do or not?
The comma is simply a separator for a list of things to 'scope to' using with.
As to your question, strictly speaking: it depends...
You might make the assumption that pbOut.Canvas always has a Font property (because you know TCanvas has a Font property). But nothing forces a Canvas variable to be of the TCanvas type, so you actually have to consider 2 possibilities.
In your case the first possibility explained below most likely applies, but it's important to be aware of the other.
As to your specific question:
Is it the same as writing with pbOut.Canvas.Font do or not?
It's almost the same, but not quite. That doesn't expose pbOut.Canvas, which the original code does.
Possibility 1
If Font is a member of pbOut.Canvas, then the statement is equivalent to:
with pbOut.Canvas do
begin
with pbOut.Canvas.Font do
begin
{ See notes underneath }
end;
end;
Here you are able to reference members of either pbOut.Canvas or pbOut.Canvas.Font directly; without fully qualifying the reference.
Any members in both pbOut.Canvas and pbOut.Canvas.Font that have have the same identifier would clash. The compiler will favour the inner with item. This means you would still have to fully qualify the pbOut.Canvas member to access it.
Possibility 2
On the other hand, if Font is not an accessible member of pbOut.Canvas, then the statement is equivalent to:
with pbOut.Canvas do
begin
with Font do
begin
{ See notes underneath }
end;
end;
Similar to the previous construct, you can access members of either pbOut.Canvas or Font directly; without fully qualifying the reference.
I must point out that the with statement is not really a useful construct.
It leads to confusing, difficult to read code because you always have to consider what an unqualified identifier really refers to.
There also a risk that adding/removing/renaming members in an existing object or record type might unexpectedly break existing code. Any code using with on instance of the modified type might unexpectedly scope differently.
The with list form in your question is even worse than single with because of the added trickery covered in my answer.
The with statement saves a small amount of code to more explicitly qualify certain identifiers; i.e. minimal "typing" savings, but at a great cost. So it is advised to not use with.
Look into documentation:
A with statement is a shorthand for referencing the fields of a record or the fields, properties, and methods of an object. The syntax of a with statement is:
with obj do statement
or:
with obj1, ..., objn do statement
where obj is an expression yielding a reference to a record, object instance, class instance, interface or class type (metaclass) instance, and statement is any simple or structured statement. Within the statement, you can refer to fields, properties, and methods of obj using their identifiers alone, that is, without qualifiers.
When multiple objects or records appear after with, the entire statement is treated like a series of nested with statements. Thus:
with obj1, obj2, ..., objn do statement
is equivalent to:
with obj1 do
with obj2 do
...
with objn do
// statement
In this case, each variable reference or method name in statement is interpreted, if possible, as a member of objn; otherwise it is interpreted, if possible, as a member of objn1; and so forth. The same rule applies to interpreting the objs themselves, so that, for instance, if objn is a member of both obj1 and obj2, it is interpreted as obj2.objn.
Since a with statement requires a variable or a field to operate upon, using it with properties can be tricky at times. A with statement expects variables it operates on to be available by reference.
Generally using with is a recipe for disaster, since by the look of it, it is difficult to know if a statement refers to one of the with blocks (and which one) or an outer scope.
To answer the question,
Is it the same as writing with pbOut.Canvas.Font do or not?
If you only want to access the Font properties it is the same, but not if you want to access other members of pbOut.Canvas.
For yours and others sake, do not use with statements at all.

Passing open array into an anonymous function

What is the least wasteful way (i.e. avoiding copying if at all possible) to pass the content of an open string array into an anonymous function and from there into another function that expects an open array?
The problem is that open arrays cannot be captured in anonymous functions in Delphi XE2.
This illustrates the problem:
procedure TMyClass.DoSomething(const aStrings: array of string);
begin
EnumItems(
function (aItem: string) : Boolean
begin
Result := IndexText(aItem, aStrings) >= 0;
end);
end;
The compiler complains: "Cannot capture symbol 'aStrings'".
An obvious solution is to make a copy of aStrings in a dynamic array and capture that. But I don't want to make a copy. (While my specific problem involves a string array and making a copy would only copy the pointers not the string data itself due to reference counting, it would also be instructive to learn how to solve the problem for an arbitrarily large array of a non-reference counted type.)
So I tried capturing instead a PString pointer to the first string in aStrings and an Integer value of the length. But then I couldn't figure out a way to pass these to InsertText.
One other constraint: I want it to be possible to call DoSomething([a, b, c]).
Is there a way to do this without making a copy of the array, and without writing my own version of IndexText, and without being hideously ugly? If so, how?
For the sake of this question I've used IndexText, but it would be instructive to find a solution for a function that could not be trivially rewritten to accept a pointer and length parameter instead of an open array.
An acceptable answer to this question would be: No, you can't do that (at least not without making a copy or rewriting IndexText) though if so I'd also like to know the fundamental reason why not.
If you don't want to copy the array then you should change the signature of DoSomething to take a TArray<string> instead. You of course have to change the caller side if you are passing the elements directly (only since XE7 you can pass dynamic arrays in the same way) - like DoSomething([a, b, c]) i mean.
My advice is not to mess around with some internal pointers and stuff, especially not for an open array.
There's no way to do this without making a copy. Open arrays cannot be captured as you have found, and you cannot get the information into the anonymous method without capture. You must capture, in general, because you need to extend the life of the variable.
So, you cannot do this with an open array and avoid a copy. You could instead:
Switch from open array to a dynamic array, TArray<string>.
Make a copy of the array. You would not be copying the string data, just the array of references to the strings.

When do FORTRAN subprograms save data, and when not?

Have a pretty simple function for taking the name of a month, "Jan", "Feb", etc. and converting to the number of the month:
function month_num(month_str)
character*(*) :: month_str
character*3 :: month_names(12)
integer :: ipos(1),location(12)
data month_names/'Jan','Feb','Mar','Apr','May','Jun','Jul','Aug', &
'Sep','Oct','Nov','Dec'/
where (month_names==month_str) location=1
ipos = maxloc(location)
month_num = ipos(1)
end function
And OK, yes, I know it's dangerous to not define "location" before using it.
Problem: During execution of the function, if input is OK, some value of "location" will be set to 1. And, to my surprise, when the function is called again, that value will still be equal to 1. And this, of course, really messes things up. So I figured I would fix it with a new line
data location/12*0/
And I got the same problem.
Finally, I put in
location = 0
just before the "where" statement, and that fixed everything.
So, I thought FORTRAN subprograms would not save data unless the variables were declared with the "SAVE" attribute. Also, with many compilers, you can invoke some sort of "static" option that will keep everything saved. I did neither of these here, but the "location" array was saved just the same. Can someone enlighten me on the rules of when FORTRAN saves data and when not? Thanks.
The value of a variable local to a procedure is preserved across (ie it is SAVEd) in one of two ways:
The programmer specifies the SAVE attribute when declaring the variable, for example:
REAL, SAVE :: var1
The programmer initialises the variable upon declaration, for example
REAL :: var1 = 3.1415
This second, implicit, behaviour is one features of Fortran which seem designed to catch out the programmer, and not just beginners. Note that the value the variable has upon re-invocation is not, in the 2nd example 3.1415, but whatever value it had when the last invocation exited.
It is common for compilers to behave as if a variable is SAVEd when the programmer has not exercised either of these options, perhaps the memory locations used by one invocation of a procedure are not overwritten before the next invocation. But this behaviour is not to be relied on.
The situation is slightly different for variables declared in modules. Again any variable with the SAVE attribute is saved but any other variable only retains its value while the module is use-associated with a program unit which has started executing but not finished. Again some compilers, and some executions of some programs, may behave as if the value of a module variable is preserved despite the module going out of scope but this is non-standard behaviour and not to be relied on.
This behaviour is scheduled to change in Fortran 2008 when variables defined in modules will acquire the SAVE attribute implicitly.
Personally I like to explicitly SAVE variables even when I am sure that they would get the attribute implicitly, it makes the code just a bit easier to understand next time round.

Are there any optimisations for retrieving of return value in Delphi?

I'm trying to find an elegant way to access the fields of some objects in some other part of my program through the use of a record that stores a byte and accesses fields of another record through the use of functions with the same name as the record's fields.
TAilmentP = Record // actually a number but acts like a pointer
private
Ordinal: Byte;
public
function Name: String; inline;
function Description: String; inline;
class operator Implicit (const Number: Byte): TAilmentP; inline;
End;
TSkill = Class
Name: String;
Power: Word;
Ailment: TAilmentP;
End;
class operator TAilmentP.Implicit (const Number: Byte): TAilmentP;
begin
Result.Ordinal := Number;
ShowMessage (IntToStr (Integer (#Result))); // for release builds
end;
function StrToAilment (const S: String): TAilmentP; // inside same unit
var i: Byte;
begin
for i := 0 to Length (Ailments) - 1 do
if Ailments [i].Name = S then
begin
ShowMessage (IntToStr (Integer (#Result))); // for release builds
Result := i; // uses the Implicit operator
Exit;
end;
raise Exception.Create ('"' + S + '" is not a valid Ailment"');
end;
Now, I was trying to make my life easier by overloading the conversion operator so that when I try to assign a byte to a TAilmentP object, it assigns that to the Ordinal field.
However, as I've checked, it seems that this attempt is actually costly in terms of performance since any call to the implicit "operator" will create a new TAilmentP object for the return value, do its business, and then return the value and make a byte-wise copy back into the object that called it, as the addresses differ.
My code calls this method quite a lot, to be honest, and it seems like this is slower than just assigning my value directly to the Ordinal field of my object.
Is there any way to make my program actually assign the value directly to my field through the use of ANY method/function? Even inlining doesn't seem to work. Is there a way to return a reference to a (record) variable, rather than an object itself?
Finally (and sorry for being off topic a bit), why is operator overloading done through static functions? Wouldn't making them instance methods make it faster since you can access object fields without dereferencing them? This would really come in handy here and other parts of my code.
[EDIT] This is the assembler code for the Implicit operator with all optimizations on and no debugging features (not even "Debug Information" for breakpoints).
add al, [eax] /* function entry */
push ecx
mov [esp], al /* copies Byte parameter to memory */
mov eax, [esp] /* copies stored Byte back to register; function exit */
pop edx
ret
What's even funnier is that the next function has a mov eax, eax instruction at start-up. Now that looks really useful. :P Oh yeah, and my Implicit operator didn't get inlined either.
I'm pretty much convinced [esp] is the Result variable, as it has a different address than what I'm assigning to. With optimizations off, [esp] is replaced with [ebp-$01] (what I assigning to) and [ebp-$02] (the Byte parameter), one more instruction is added to move [ebp-$02] into AL (which then puts it in [ebp-$01]) and the redundant mov instruction is still there with [epb-$02].
Am I doing something wrong, or does Delphi not have return-value optimizations?
Return types — even records — that will fit in a register are returned via a register. It's only larger types that are internally transformed into "out" parameters that get passed to the function by reference.
The size of your record is 1. Making a copy of your record is just as fast as making a copy of an ordinary Byte.
The code you've added for observing the addresses of your Result variables is actually hurting the optimizer. If you don't ask for the address of the variable, then the compiler is not required to allocate any memory for it. The variable could exist only in a register. When you ask for the address, the compiler needs to allocate stack memory so that it has an address to give you.
Get rid of your "release mode" code and instead observe the compiler's work in the CPU window. You should be able to observe how your record exists primarily in registers. The Implicit operator might even compile down to a no-op since the input and output registers should both be EAX.
Whether operators are instance methods or static doesn't make much difference, especially not in terms of performance. Instance methods still receive a reference to the instance they're called on. It's just a matter of whether the reference has a name you choose or whether it's called Self and passed implicitly. Although you wouldn't have to write "Self." in front of your field accesses, the Self variable still needs to get dereferenced just like the parameters of a static operator method.
All I'll say about optimizations in other languages is that you should look up the term named return-value optimization, or its abbreviation NRVO. It's been covered on Stack Overflow before. It has nothing to do with inlining.
Delphi is supposed to optimize return assignment by using pointers. This is also true for C++ and other OOP compiled languages. I stopped writing Pascal before operator overloading was introduced, so my knowledge is a bit dated. What follows is what I would try:
What I'm thinking is this... can you create an object on the heap (use New) and pass a pointer back from your "Implicit" method? This should avoid unnecessary overhead, but will cause you to deal with the return value as a pointer. Overload your methods to deal with pointer types?
I'm not sure if you can do it this with the built-in operator overloading. Like I mentioned, overloading is something I wanted in Pascal for nearly a decade and never got to play with. I think it's worth a shot. You might need to accept that you'll must kill your dreams of elegant type casting.
There are some caveats with inlining. You probably already know that the hint is disabled (by default) for debug builds. You need to be in release mode to profile / benchmark or modify your build settings. If you haven't gone into release mode (or altered build settings) yet, it's likely that your inline hints are being ignored.
Be sure to use const to hint the compiler to optimize further. Even if it doesn't work for your case, it's a great practice to get into. Marking what should not change will prevent all kinds of disasters... and additionally give the compiler the opportunity to aggressively optimize.
Man, I wish I know if Delphi allowed cross-unit inlining by now, but I simply don't. Many C++ compilers only inline within the same source code file, unless you put the code in the header (headers have no correlate in Pascal). It's worth a search or two. Try to make inlined functions / methods local to their callers, if you can. It'll at least help compile time, if not more.
All out of ideas. Hopefully, this meandering helps.
Now that I think about it, maybe it's absolutely necessary to have the return value in a different memory space and copied back into the one being assigned to.
I'm thinking of the cases where the return value may need to be de-allocated, like for example calling a function that accepts a TAilmentP parameter with a Byte value... I don't think you can directly assign to the function's parameters since it hasn't been created yet, and fixing that would break the normal and established way of generating function calls in assembler (i.e.: trying to access a parameter's fields before it's created is a no-no, so you have to create that parameter before that, then assign to it what you have to assign OUTSIDE a constructor and then call the function in assembler).
This is especially true and obvious for the other operators (with which you could evaluate expressions and thus need to create temporary objects), just not so much this one since you'd think it's like the assignment operator in other languages (like in C++, which can be an instance member), but it's actually much more than that - it's a constructor as well.
For example
procedure ShowAilmentName (Ailment: TAilmentP);
begin
ShowMessage (Ailment.Name);
end;
[...]
begin
ShowAilmentName (5);
end.
Yes, the implicit operator can do that too, which is quite cool. :D
In this case, I'm thinking that 5, like any other Byte, would be converted into a TAilmentP (as in creating a new TAilmentP object based on that Byte) given the implicit operator, the object then being copied byte-wise into the Ailment parameter, then the function body is entered, does it's job and on return the temporary TAilmentP object obtained from conversion is destroyed.
This is even more obvious if Ailment would be const, since it would have to be a reference, and constant one too (no modifying after the function was called).
In C++, the assignment operator would have no business with function calls. Instead, one could've used a constructor for TAilmentP which accepts a Byte parameter. The same can be done in Delphi, and I suspect it would take precedence over the implicit operator, however what C++ doesn't support but Delphi does is to have down-conversion to primitive types (Byte, Integer, etc.) since the operators are overloaded using class operators. Thus, a procedure like "procedure ShowAilmentName (Number: Byte);" would never be able to accept a call like "ShowAilmentName (SomeAilment)" in C++, but in Delphi it can.
So, I guess this is a side-effect of the Implicit operator also being like a constructor, and this is necessary since records can not have prototypes (thus you could not convert both one way and the other between two records by just using constructors).
Anyone else think this might be the cause?

Delphi; performance of passing const strings versus passing var strings

Quick one; am I right in thinking that passing a string to a method 'as a CONST' involves more overhead than passing a string as a 'VAR'? The compiler will get Delphi to make a copy of the string and then pass the copy, if the string parameter is declared as a CONST, right?
The reason for the question is a bit tedious; we have a legacy Delphi 5 utility whose days are truly numbered (the replacement is under development). It does a large amount of string processing, frequently passing 1-2Kb strings between various functions and procedures. Throughout the code, the 'correct' observation of using CONST or VAR to pass parameters (depending on the job in hand) has been adhered to. We're just looking for a few 'quick wins' that might shave a few microseconds off the execution time, to tide us over until the new version is ready. We thought of changing the memory manager from the default Delphi 5 one to FastMM, and we also wondered if it was worth altering the way the strings are passed around - because the code is working fine with the strings passed as const, we don't see a problem if we changed those declarations to var - the code within that method isn't going to change the string.
But would it really make any difference in real terms? (The program really just does a large amount of processing on these 1kb+ish strings; several hundred strings a minute at peak times). In the re-write these strings are being held in objects/class variables, so they're not really being copied/passed around in the same way at all, but in the legacy code it's very much 'old school' pascal.
Naturally we'll profile an overall run of the program to see what difference we've made but there's no point in actually trying this if we're categorically wrong about how the string-passing works in the first instance!
No, there shouldn't be any performance difference between using const or var in your case. In both cases a pointer to the string is passed as the parameter. If the parameter is const the compiler simply disallows any modifications to it. Note that this does not preclude modifications to the string if you get tricky:
procedure TForm1.Button1Click(Sender: TObject);
var
s: string;
begin
s := 'foo bar baz';
UniqueString(s);
SetConstCaption(s);
Caption := s;
end;
procedure TForm1.SetConstCaption(const AValue: string);
var
P: PChar;
begin
P := PChar(AValue);
P[3] := '?';
Caption := AValue;
end;
This will actually change the local string variable in the calling method, proof that only a pointer to it is passed.
But definitely use FastMM4, it should have a much bigger performance impact.
const for parameters in Delphi essentially means "I'm not going to mutate this, and I also don't care if this is passed by value or by reference - whichever is most efficient is fine by me". The bolded part is important, because it is actually observable. Consider this code:
type TFoo =
record
x: integer;
//dummy: array[1..10] of integer;
end;
procedure Foo(var x1: TFoo; const x2: TFoo);
begin
WriteLn(x1.x);
WriteLn(x2.x);
Inc(x1.x);
WriteLn;
WriteLn(x1.x);
WriteLn(x2.x);
end;
var
x: TFoo;
begin
Foo(x, x);
ReadLn;
end.
The trick here is that we pass the same variable both as var and as const, so that our function can mutate via one argument, and see if this affects the other. If you try it with code above, you'll see that incrementing x1.x inside Foo doesn't change x2.x, so x2 was passed by value. But try uncommenting the array declaration in TFoo, so that its size becomes larger, and running it again - and you'll see how x2.x now aliases x1.x, so we have pass-by-reference for x2 now!
To sum it up, const is always the most efficient way to pass parameter of any type, but you should not make any assumptions about whether you have a copy of the value that was passed by the caller, or a reference to some (potentially mutated by other code that you may call) location.
This is really a comment, but a long one so bear with me.
About 'so called' string passing by value
Delphi always passes string and ansistring (WideStrings and ShortStrings excluded) by reference, as a pointer.
So strings are never passed by value.
This can be easily tested by passing 100MB strings around.
As long as you don't change them inside the body of the called routine string passing takes O(1) time (and with a small constant at that)
However when passing a string without var or const clause, Delphi does three things.
Increase the reference count of the string.
put an implicit try-finally block around the procedure, so the reference count of the string parameter gets decreased again when the method exits.
When the string gets changed (and only then) Delphi makes a copy of the string, decreases the reference count of the passed string and uses the copy in the rest of the routine.
It fakes a pass by value in doing this.
About passing by reference (pointer)
When the string is passed as a const or var, Delphi also passes a reference (pointer), however:
The reference count of the string does not increase. (tiny, tiny speed increase)
No implicit try/finally is put around the routine, because it is not needed. This is part 1 why const/var string parameters execute faster.
When the string is changed inside the routine, no copy is make the actual string is changed. For const parameters the compiler prohibits string alternations. This is part 2 of why var/const string parameters work faster.
If however you need to create a local var to assign the string to; Delphi copies the string :-) and places an implicit try/finally block eliminating 99%+ of the speed gain of a const string parameter.
Hope this sheds some light on the issue.
Disclaimer: Most of this info comes from here, here and here
The compiler won't make a copy of the string when using const afaik. Using const saves you the overhead of incrementing/decrementing the refcounter for the string that you use.
You will get a bigger performanceboost by upgrading the memorymanager to FastMM, and, because you do a lot with strings, consider using the FastCode library.
Const is already the most efficient way of passing parameters to a function. It avoids creating a copy (default, by value) or even passing a pointer (var, by reference).
It is particularly true for strings and was indeed the way to go when computing power was limited and not to be wasted (hence the "old school" label).
IMO, const should have been the default convention, being up to the programmer to change it when really needed to by value or by var. That would have been more in line with the overall safety of Pascal (as in limiting the opportunity of shooting oneself in the foot).
My 2¢...

Resources