Delphi "for ... to" statement runs from end value to start value - delphi

I'm writing a simple app in Embarcadero Delphi 2010. A simple code with two cycles:
procedure TForm1.Button1Click(Sender: TObject);
var
a:array [0..255] of integer;
i:integer;
k,q:integer;
begin
k:=0;
for I := 0 to 255 do
begin
a[i]:=i;
end;
for I := 0 to 255 do
begin
q:= a[i];
k:=k+q;
end;
Label1.Caption:=inttostr(k);
end;
According to Watch List, in second cycle variable "i" starts from value 256 and going to 0 (256, 255, 254, ..., 0), but array's elements is correct (0, 1, 2, 3, ...). Variable "i" declared only locally, no global variables.
Why does this happens? Is it normal behaviour?

The short answer is because of compiler optimization. The long answer is:
In your Pascal code, the integer I has two (actually three) purposes. First, it is the loops control variable (or loop counter), that is, it controls how many times the loop is run. Secondly, it acts as index to the array a. And in the first loop it also acts as the value assigned to the array elements. When compiled to machine code, these roles are handled by different registers.
If optimization is set in compiler settings, the compiler creates code that decrements the control variable from a start value down towards zero, if it can do so, without changing the end result. This it does, because a comparison against a non-zero value can be avoided, thus being faster.
In following disassembly of the first loop, you can see that the roles of variable I are handled as:
Register eax acts as loop control variable and value to be
assigned to array elements
Register edx is pointer to array element (incremented with 4
(bytes) per turn)
disassembly:
Unit25.pas.34: for I := 0 to 255 do
005DB695 33C0 xor eax,eax // init
005DB697 8D9500FCFFFF lea edx,[ebp-$00000400]
Unit25.pas.36: a[i]:=i;
005DB69D 8902 mov [edx],eax // value assignment
Unit25.pas.37: end;
005DB69F 40 inc eax // prepare for next turn
005DB6A0 83C204 add edx,$04 // same
Unit25.pas.34: for I := 0 to 255 do
005DB6A3 3D00010000 cmp eax,$00000100 // comparison with end of loop
005DB6A8 75F3 jnz $005db69d // if not, run next turn
Since eax has two roles, it must count upward. Note that it requires three commands for each loop to manage the loop counting: inc eax, cmp eax, $00000100 and jnz $005db69d.
In the disassembly of the second loop, the roles of variable I are handled similarily as in the first loop, except I is not assigned to the elements. Therefore the loop control only acts as a loop counter and can be run downward.
Register eax is loop control variable
Register edx is pointer to array element (incremented with 4
(bytes) per turn)
disassembly:
Unit25.pas.39: for I := 0 to 255 do
005DB6AA B800010000 mov eax,$00000100 // init loop counter
005DB6AF 8D9500FCFFFF lea edx,[ebp-$00000400]
Unit25.pas.41: q:= a[i];
005DB6B5 8B0A mov ecx,[edx]
Unit25.pas.42: k:=k+q;
005DB6B7 03D9 add ebx,ecx
Unit25.pas.43: end;
005DB6B9 83C204 add edx,$04 // prepare for next turn
Unit25.pas.39: for I := 0 to 255 do
005DB6BC 48 dec eax // decrement loop counter, includes intrinsic comparison with 0
005DB6BD 75F6 jnz $005db6b5 // jnz = jump if not zero
Note that in this case only two commands are needed to manage loop counting: dec eax and jnz $005db6b5.
In Delphi XE7, in the Watches window, variable I is shown during the first loop as incrementing values but during the second loop as E2171 Variable 'i' inaccessible here due to optimization. In earlier versions I recall it was showing decrementing values which I believe you see.

I have copied your exact code and when I ran it the variable "i" counts normally in both for cycles. Have you ran the second cycle step by step? The "i" is really 256 at the start of the second cycle because of the first one but as soon as the second cycle starts "i" becomes 0 and it counts normally to 255.
I don't see how or why would it count from 256 to 0?
UPDATE:
I haven't even thought of this, but here's your explanation I belive: http://www.delphigroups.info/2/45/418603.html
" It's a compiler optimization - you're not using "I" inside your loop
hence compiler thought of a better way to count . Your loop count
will still be accurate..."

Related

Is the use of "for i := x to y" well defined behaviour, if x > y?

The intuitive answer would be that the loop is never entered. And this seems to be case in all tests I could come up with. I'm still anxious and always test it before entering the loop.
Is this necessary?
No, it is not necessary.
The documentation clearly states :
for counter := initialValue to finalValue do statement
or:
for counter := initialValue downto finalValue do statement
...
If initialValue is equal to finalValue, statement is executed exactly once. If initialValue is greater than finalValue in a for...to statement, or less than finalValue in a for...downto statement, then statement is never executed.
There is no need for anxiety.
If we want to examine further what happens, let's make a few examples. Consider first :
program Project1;
{$APPTYPE CONSOLE}
var
i : integer;
begin
for i := 2 to 1 do WriteLn(i);
end.
This produces a compiler hint:
[dcc32 Hint] Project1.dpr(6): H2135 FOR or WHILE loop executes zero times - deleted
So the compiler will simply throw away a loop with constants that produce no loop iterations. It does this even with optimizations turned off - no code is produced for the loop at all.
Now let's be a bit more clever :
program Project1;
{$APPTYPE CONSOLE}
var
i, j, k : integer;
begin
j := 2;
k := 1;
for i := j to k do WriteLn(i);
end.
This actually compiles the loop. The output is as below:
Project1.dpr.8: for i := j to k do WriteLn(i);
004060E8 A1A4AB4000 mov eax,[$0040aba4] {$0040aba4 -> j = 2}
004060ED 8B15A8AB4000 mov edx,[$0040aba8] {$0040aba8 -> k = 1}
004060F3 2BD0 sub edx,eax {edx = k - j = -1}
004060F5 7C2E jl $00406125 {was k-j < 0? if yes, jmp to end.}
004060F7 42 inc edx {set up loop}
004060F8 8955EC mov [ebp-$14],edx
004060FB A3A0AB4000 mov [$0040aba0],eax
00406100 A118784000 mov eax,[$00407818] {actual looped section}
00406105 8B15A0AB4000 mov edx,[$0040aba0]
0040610B E8E8D6FFFF call #Write0Long
00406110 E8C3D9FFFF call #WriteLn
00406115 E8EECCFFFF call #_IOTest
0040611A FF05A0AB4000 inc dword ptr [$0040aba0] {update loop var}
00406120 FF4DEC dec dword ptr [ebp-$14]
00406123 75DB jnz $00406100 {loop ^ if not complete}
Project1.dpr.9: end.
00406125 E88EE1FFFF call #Halt0
So, the very first thing a loop does is to check whether it needs to execute at all. If the initial is greater than the final (for a for..to loop) then it skips straight past it entirely. It doesn't even waste the cycles to initialize the loop counter.
There are some edge-cases in which you may be surprised to discover that the code does unexpectedly enter the loop. And still other cases where you may be tempted to pre-check whether to call the loop. But before I get into those details, I want to try impress on you the importance of not pre-checking your loop with an if condition.
Every line of code, no matter how easy to understand draws attention. It's more to read and more to confirm is correct. So if it's not important, or if it's technically redundant: it's best left out.
A for loop is conceptually translated as follows:
Initialise loop index to starting value.
If iteration constraint is valid (e.g. Index <= EndValue in case of forward loop):
Perform iteration (code within loop block/statement)
Perform loop control operations (increment loop index)
Repeat 2
Otherwise continue at first instruction after loop.
The way in which Step 2 is checked, makes an extra if condition before the loop completely redundant.
So if you (or another developer) is later maintaining code with a redundant if condition, they're left to wonder:
Is the line correct?
It seems redundant; is there a special condition it's trying to handle?
If it currently serves no purpose, perhaps it was intended to prevent calling the loop on a different condition?
In simple case, redundant lines of code can create some confusion. In more complex cases, they can result in whole new sections of irrelevant code being developed; that tries to cater for irrelevant scenarios implied by legacy redundant code.
Recommendation: Stamp out redundant code as much as possible. Including redundant pre-checks for "should the loop execute at all".
The most important benefit of stamping out redundant code is that: it correctly draws attention to peculiar cases whenever special handling actually is required.
There are 2 potential pitfalls, and the first is the more dangerous one as it deals with implicit type conversion. So it may not always be easy to detect. The following code was tested on rextester using fpc, but I have verified the same issue on Delphi 2007 / 2009 in the past.
//fpc 2.6.2
program UnexpectedForIteration;
{$MODE DELPHI}
{ Ensure range-checking is off. If it's on, a run-time error
prevents potentially bad side-effects of invalid iterations.}
{$R-,H+,W+}
var
IntStart, IntEnd, IntIndex: Integer;
UIntStart, UIntEnd, UIntIndex: Cardinal;
IterCount: Integer;
begin
Writeln('Case 1');
IntStart := High(Integer) - 1;
IntEnd := -IntStart;
UIntStart := Cardinal(IntStart);
UIntEnd := Cardinal(IntEnd);
{This gives a clue why the problem occurs.}
Writeln('From: ', IntStart, ' To: ', IntEnd);
Writeln('From: ', UIntStart, ' To: ', UIntEnd, ' (unsigned)');
Writeln('Loop 1');
IterCount := 0;
for IntIndex := IntStart to IntEnd do Inc(IterCount);
Writeln(IterCount);
Writeln('Loop 2');
IterCount := 0;
{ The loop index variable is a different type to the initial &
final values. So implicit conversion takes place and:
IntEnd **as** unsigned is unexpectedly bigger than IntStart }
for UIntIndex := IntStart to IntEnd do Inc(IterCount);
Writeln(IterCount, ' {Houston we have a problem}');
Writeln();
Writeln('Case 2');
UIntStart := High(Cardinal) - 2;
UIntEnd := 2;
IntStart := Integer(UIntStart);
IntEnd := Integer(UIntEnd);
{This gives a clue why the problem occurs.}
Writeln('From: ', UIntStart, ' To: ', UIntEnd);
Writeln('From: ', IntStart, ' To: ', IntEnd, ' (signed)');
Writeln('Loop 3');
IterCount := 0;
for UIntIndex := UIntStart to UIntEnd do Inc(IterCount);
Writeln(IterCount);
Writeln('Loop 4');
IterCount := 0;
{ The loop index variable is a different type to the initial &
final values. So implicit conversion takes place and:
UIntStart **as** signed is unexpectedly less than UIntEnd }
for IntIndex := UIntStart to UIntEnd do Inc(IterCount);
Writeln(IterCount, ' {Houston we have a problem}');
end.
The output is as follows:
Case 1
From: 2147483646 To: -2147483646
From: 2147483646 To: 2147483650 (unsigned)
Loop 1
0
Loop 2
5 {Houston we have a problem}
Case 2
From: 4294967293 To: 2
From: -3 To: 2 (signed)
Loop 3
0
Loop 4
6 {Houston we have a problem}
In many cases the problem is resolved by ensuring the same types are used for loopIndex, initialValue and finalValue. As this means there won't be an implicit type conversion, and the loop will reliably iterate as the initialValue and finalValue would suggest.
It would be easier if the compiler emits appropriate warnings for implicit type conversion in for loops. Unfortunately fpc didn't; I don't recall whether Delphi 2007/2009 does; and have no idea whether any recent versions do.
However, the preferred approach would be to favour container iteration syntax (pushing responsibility for 'correct' iteration on the enumerators). E.g.: for <element> in <container> do ...;. This should not iterate empty containers provided the enumerator's methods are implemented correctly.
The only time I'd say a pre-check is worth considering is:
when for in is not feasible for some reason
and the loop index needs to be zero-based
and support large unsigned integers (High(Integer) < index < High(Cardinal))
because this leaves no space for a reliable sentinel less than all possible initial values.
Even in this case, consider using an Int64 loop index instead of if (initialValue <= finalValue) then for ....
The second pitfall involves what I would in any case consider to be a design flaw. So the problem can be avoided entirely by rather being aware of this design consideration. It is demonstrated in code that looks as follows:
if Assigned(AnObject) then
for LIndex := 0 to AnObject.Count - 1 do ...;
In this case the if condition may in fact be necessary as a result of dubious design. Certainly, if AnObject hasn't been created, you do not want to access its Count property/method. But the dubious aspect of the design is the fact that you're uncertain whether AnObject exists. Yes, you may have employed a lazy-initialisation pattern. But it doesn't change the fact that in the above code, there's no way to differentiate between: "zero iterations" because AnObject doesn't exist or because AnObject.Count = 0.
I'd like to point out that when code has many redundant if Assigned(AnObject) then (or similar) lines, it leads to one of the problems I described in section 1. Local code caters for 2 possibilities. And by extension, client code also caters for 2 possibilities. And by induction, this problem eventually leaks throughout the code-base.
The solution is to first and foremost limit the cases where existence of AnObject is uncertain.
It's much easier to ensure an empty object with Count = 0 is guaranteed to be created (typically only affecting a small number of places in code).
It's far more work to deal with the ripple effects of a large number of places where the object might not exist yet; yielding 2 possible states and code paths.
If lazy-initialisation is required, try to ensure the code surface where existence is optional is kept as small as possible.

Inline asm (32) emulation of move (copy memory) command

I have two two-dimensional arrays with dynamic sizes (guess that's the proper wording). I copy the content of first one into the other using:
dest:=copy(src,0,4*x*y);
// src,dest:array of array of longint; x,y:longint;
// setlength(both arrays,x,y); //x and y are max 15 bit positive!
It works. However I'm unable to reproduce this in asm. I tried the following variations to no avail... Could someone enlighten me...
MOV ESI,src; MOV EDI,dest; MOV EBX,y; MOV EAX,x; MUL EBX;
PUSH DS; POP ES; MOV ECX,EAX; CLD; REP MOVSD;
Also tried with LEA (didn't expect that to work since it should fetch the pointer address not the array address), no workie, and tried with:
p1:=#src[0,0]; p2:=#dest[0,0]; //being no-type pointers
MOV ESI,p1; MOV EDI,p2... (the same asm)
Hints pls? Btw it's delphi 6. The error is, of course, access violation.
This is really a two-fold three-fold question.
What's the structure of a dynamic array.
Which instructions in asm will copy the array.
I'm throwing random assembler at the CPU, why doesn't it work?
Structure of a dynamic array
See: http://docwiki.embarcadero.com/RADStudio/Seattle/en/Internal_Data_Formats
To quote:
Dynamic Array Types
On the 32-bit platform, a dynamic-array variable occupies 4 bytes of memory (and 8 bytes on 64-bit) that contain a pointer to the dynamically allocated array. When the variable is empty (uninitialized) or holds a zero-length array, the pointer is nil and no dynamic memory is associated with the variable. For a nonempty array, the variable points to a dynamically allocated block of memory that contains the array in addition to a 32-bit (64-bit on Win64) length indicator and a 32-bit reference count. The table below shows the layout of a dynamic-array memory block.
Dynamic array memory layout (32-bit and 64-bit)
Offset 32-bit -8 -4 0
Offset 64-bit -12 -8 0
contents refcount count start of data
So the dynamic array variable is a pointer to the middle of the above structure.
How do I access this in asm
Let's assume the array holds records of type TMyRec
you'll need to run this code for every inner array in the outer array to do the deep copy. I leave this as an exercise for the reader. (you can do the other part in pascal).
type
TDynArr: array of TMyRec;
procedure SlowButBasicMove(const Source: TDynArr; var dest);
asm
//insert register pushes, see below.
mov esi,Source //esi = pointer to source data
mov edi,Dest //edi = pointer to dest
sub esi,8
mov ebx,[esi] //ebx = refcount (just in case)
mov ecx,[esi+4] //ecx = element count
mov edx,SizeOf(TMyRec) //anywhere from 1 to zillions
mul ecx,edx //==ecx=number of bytes in array.
//// now we can start moving
xor ebx,ebx //ebx =0
add eax,8 //eax = #data
#loop:
mov eax,[esi+ebx] //Get data from source
mov [edi+ebx],esi //copy it to dest
add ebx,4 //4 bytes at a time
cmp ebx,ecx //is ebx> number of bytes?
jle loop
//Done copying.
//insert register pops, see below
end;
That's the copy done, however in order for the system not to crash, you need to save and restore the non volatile registers (all but EAX, ECX, EDX), see: http://docwiki.embarcadero.com/RADStudio/Seattle/en/Program_Control
push ebx
push esi
push edi
--- insert code shown above
//restore non-volatile registers
pop edi
pop esi
pop ebx //note the restoring must happen in the reverse order of the push.
See the Jeff Dunteman's book assembly step by step if you're completely new to asm.
You will get access violations if:
you try to read from a wrong address.
you try to write to a wrong adress.
you read past the end of the array.
you write to memory you haven't claimed before using GetMem or whatever means.
if you write past the end of your buffer.
if you do not restore all non-volatile registers
Remember you're directly dealing with the CPU. Delphi will not assist you in any way.
Really fast code will use some form of SSE to move 16bytes per instruction in an unrolled loop, see the above mentioned fastcode for examples of optimized assembler.
Random assembler
In assembler you need to know exactly what you're what to do, how and what the CPU does.
Set a breakpoint and run your code. Press ctrl + alt + C and behold the CPU-debug window.
This will allow you to see the code Delphi generates.
You can single step through the code to see what the CPU does.
see: http://www.plantation-productions.com/Webster/index.html
For more reading.
Dynamic Arrays differ from Static Arrays, especially when it comes to multi-dimensionality.
Refer to this reference for internal formats.
The point is that an Array Of Array Of LongInt of dimensions X and Y (in this order!) is a pointer to an array of X pointers that point to an array of Y LongInts.
Since it seems, from your comments, that you have already allocated the space for all elements in Dest, I assume you want to do a Deep Copy.
Here a sample program, where the assembly as been made as simple as possible for the sake of clarity.
Program Test;
Var Src, Dest: Array Of Array Of LongInt;
X, Y, I, J: Integer;
Begin
X := 4;
Y := 2;
setLength(Src, X, Y);
setLength(Dest, X, Y);
For I := 0 To X-1 Do
For J := 0 To Y-1 Do
Src[I,J] := I*Y + J;
{$ASMMODE intel}
Asm
cld ;Be sure movsd increments the registers
mov esi, DWORD PTR [Src] ;Src pointer
mov edi, DWORD PTR [Dest] ;Dest pointer
mov ecx, DWORD PTR [X] ;Repeat for X times
;The number of elements in Src
#_copy:
push esi ;Save these for later
push edi
push ecx
mov ecx, DWORD PTR [Y] ;Repeat for Y times
;The number of element in a Src[i] array
mov edi, DWORD PTR [edi] ;Get the pointer to the Dest[i] array
mov esi, DWORD PTR [esi] ;Get the pointer to the Src[i] array
rep movsd ;Copy sub array
pop ecx ;Restore
pop edi
pop esi
add esi, 04h ;Go from Src[i] to Src[i+1]
add edi, 04h ;Go from Dest[i] to Dest[i+1]
loop #_copy
End;
For I := 0 To X-1 Do
Begin
WriteLn();
For J := 0 To Y-1 Do
Begin
Write(' ');
Write(Dest[I,J]);
End;
End;
End.
Note 1 This source code is intended to be compile with freepascal.
Donation of Spare Time(TM) for downloading and installing Delphi are welcome!
Note 2 This source code is for illustration purpose only, it is pretty obvious, it has already been stated above, but somehow not everybody got it.
If the OP wanted a fast way to copy the array, they should have stated so.
Note 3 I don't save the clobbered registers, this is bad practice, my bad; I forgot, as there are no subroutines, no optimizations and no reason for the compiler to pass data in the registers between the two fors.
This is left as an exercise to the reader.

Invalid floating point operation calling Trunc()

I'm getting a (repeatable) floating point exception when i try to Trunc() a Real value.
e.g.:
Trunc(1470724508.0318);
In reality the actual code is more complex:
ns: Real;
v: Int64;
ns := ((HighPerformanceTickCount*1.0)/g_HighResolutionTimerFrequency) * 1000000000;
v := Trunc(ns);
But in the end it still boils down to:
Trunc(ARealValue);
Now, i cannot repeat it anywhere else - just at this one spot. Where it fails every time.
It's not voodoo
Fortunately computers are not magic. The Intel CPU performs very specific observable actions. So i should be able to figure out why the floating point operation fails.
Going into the CPU window
v := Trunc(ns)
fld qword ptr [ebp-$10]
This loads the 8-byte floating point value at ebp-$10 into floating point register ST0.
The bytes at memory address [ebp-$10] are:
0018E9D0: 6702098C 41D5EA5E (as DWords)
0018E9D0: 41D5EA5E6702098C (as QWords)
0018E9D0: 1470724508.0318 (as Doubles)
The call succeeds, and the floating point register the contains the appropriate value:
Next is the actual call to the RTL Trunc function:
call #TRUNC
Next is the guts of Delphi RTL's Trunc function:
#TRUNC:
sub esp,$0c
wait
fstcw word ptr [esp] //Store Floating-Point Control Word on the stack
wait
fldcw word ptr [cwChop] //Load Floating-Point Control Word
fistp qword ptr [esp+$04] //Converts value in ST0 to signed integer
//stores the result in the destination operand
//and pops the stack (increments the stack pointer)
wait
fldcw word ptr [esp] //Load Floating-Point Control Word
pop ecx
pop eax
pop edx
ret
Or i suppose i could have just pasted it from the rtl, rather than transcribing it from the CPU window:
const cwChop : Word = $1F32;
procedure _TRUNC;
asm
{ -> FST(0) Extended argument }
{ <- EDX:EAX Result }
SUB ESP,12
FSTCW [ESP] //Store foating-control word in ESP
FWAIT
FLDCW cwChop //Load new control word $1F32
FISTP qword ptr [ESP+4] //Convert ST0 to int, store in ESP+4, and pop the stack
FWAIT
FLDCW [ESP] //restore the FPCW
POP ECX
POP EAX
POP EDX
end;
The exception happens during the actual fistp operation.
fistp qword ptr [esp+$04]
At the moment of this call, the ST0 register will contains the same floating point value:
Note: The careful observer will note the value in the above screenshot doesn't match the first screenshot. That's because i took it on a different run. I'd rather not have to carefully redo all the constants in the question just to make them consistent - but trust me: it's the same when i reach the fistp instruction as it was after the fld instruction.
Leading up to it:
sub esp,$0c: I watch it push the the stack down by 12 bytes
fstcw word ptr [esp]: i watch it push $027F into the the current stack pointer
fldcw word ptr [cwChop]: i watch the floating point control flags change
fistp qword ptr [esp+$04]: and it's about to write the Int64 into the room it made on the stack
and then it crashes.
What can actually be going on here?
It happens with other values as well, it's not like there's something wrong with this particular floating point value. But i even tried to setup the test-case elsewhere.
Knowing that the 8-byte hex value of the float is: $41D5EA5E6702098C, i tried to contrive the setup:
var
ns: Real;
nsOverlay: Int64 absolute ns;
v: Int64;
begin
nsOverlay := $41d62866a2f270dc;
v := Trunc(ns);
end;
Which gives:
nsOverlay := $41d62866a2f270dc;
mov [ebp-$08],$a2f270dc
mov [ebp-$04],$41d62866
v := Trunc(ns)
fld qword ptr [ebp-$08]
call #TRUNC
And at the point of the call to #trunc, the floating point register ST0 contains a value:
But the call does not fail. It only fails, every time in this one section of my code.
What could be possibly happening that is causing the CPU to throw an invalid floating point exception?
What is the value of cwChop before it loads the control word?
The value of cwChop looks to be correct before the load control word, $1F32. But after the load, the actual control word is wrong:
Bonus Chatter
The actual function that is failing is something to convert high-performance tick counts into nanoseconds:
function PerformanceTicksToNs(const HighPerformanceTickCount: Int64): Int64;
//Convert high-performance ticks into nanoseconds
var
ns: Real;
v: Int64;
begin
Result := 0;
if HighPerformanceTickCount = 0 then
Exit;
if g_HighResolutionTimerFrequency = 0 then
Exit;
ns := ((HighPerformanceTickCount*1.0)/g_HighResolutionTimerFrequency) * 1000000000;
v := Trunc(ns);
Result := v;
end;
I created all the intermeidate temporary variables to try to track down where the failure is.
I even tried to use that as a template to try to reproduce it:
var
i1, i2: Int64;
ns: Real;
v: Int64;
vOver: Int64 absolute ns;
begin
i1 := 5060170;
i2 := 3429541;
ns := ((i1*1.0)/i2) * 1000000000;
//vOver := $41d62866a2f270dc;
v := Trunc(ns);
But it works fine. There's something about when it's called during a DUnit unit test.
Floating Point control word flags
Delphi's standard control word: $1332:
$1332 = 0001 00 11 00 110010
0 ;Don't allow invalid numbers
1 ;Allow denormals (very small numbers)
0 ;Don't allow divide by zero
0 ;Don't allow overflow
1 ;Allow underflow
1 ;Allow inexact precision
0 ;reserved exception mask
0 ;reserved
11 ;Precision Control - 11B (Double Extended Precision - 64 bits)
00 ;Rounding control -
0 ;Infinity control - 0 (not used)
The Windows API required value: $027F
$027F = 0000 00 10 01 111111
1 ;Allow invalid numbers
1 ;Allow denormals (very small numbers)
1 ;Allow divide by zero
1 ;Allow overflow
1 ;Allow underflow
1 ;Allow inexact precision
1 ;reserved exception mask
0 ;reserved
10 ;Precision Control - 10B (double precision)
00 ;Rounding control
0 ;Infinity control - 0 (not used)
The crChop control word: $1F32
$1F32 = 0001 11 11 00 110010
0 ;Don't allow invalid numbers
1 ;Allow denormals (very small numbers)
0 ;Don't allow divide by zero
0 ;Don't allow overflow
1 ;Allow underflow
1 ;Allow inexact precision
0 ;reserved exception mask
0 ;unused
11 ;Precision Control - 11B (Double Extended Precision - 64 bits)
11 ;Rounding Control
1 ;Infinity control - 1 (not used)
000 ;unused
The CTRL flags after loading $1F32: $1F72
$1F72 = 0001 11 11 01 110010
0 ;Don't allow invalid numbers
1 ;Allow denormals (very small numbers)
0 ;Don't allow divide by zero
0 ;Don't allow overflow
1 ;Allow underflow
1 ;Allow inexact precision
1 ;reserved exception mask
0 ;unused
11 ;Precision Control - 11B (Double Extended Precision - 64 bits)
11 ;Rounding control
1 ;Infinity control - 1 (not used)
00011 ;unused
All the CPU is doing is turning on a reserved, unused, mask bit.
RaiseLastFloatingPointError()
If you're going to develop programs for Windows, you really need to accept the fact that floating point exceptions should be masked by the CPU, meaning you have to watch for them yourself. Like Win32Check or RaiseLastWin32Error, we'd like a RaiseLastFPError. The best i can come up with is:
procedure RaiseLastFPError();
var
statWord: Word;
const
ERROR_InvalidOperation = $01;
// ERROR_Denormalized = $02;
ERROR_ZeroDivide = $04;
ERROR_Overflow = $08;
// ERROR_Underflow = $10;
// ERROR_InexactResult = $20;
begin
{
Excellent reference of all the floating point instructions.
(Intel's architecture manuals have no organization whatsoever)
http://www.plantation-productions.com/Webster/www.artofasm.com/Linux/HTML/RealArithmetica2.html
Bits 0:5 are exception flags (Mask = $2F)
0: Invalid Operation
1: Denormalized - CPU handles correctly without a problem. Do not throw
2: Zero Divide
3: Overflow
4: Underflow - CPU handles as you'd expect. Do not throw.
5: Precision - Extraordinarily common. CPU does what you'd want. Do not throw
}
asm
fwait //Wait for pending operations
FSTSW statWord //Store floating point flags in AX.
//Waits for pending operations. (Use FNSTSW AX to not wait.)
fclex //clear all exception bits the stack fault bit,
//and the busy flag in the FPU status register
end;
if (statWord and $0D) <> 0 then
begin
//if (statWord and ERROR_InexactResult) <> 0 then raise EInexactResult.Create(SInexactResult)
//else if (statWord and ERROR_Underflow) <> 0 then raise EUnderflow.Create(SUnderflow)}
if (statWord and ERROR_Overflow) <> 0 then raise EOverflow.Create(SOverflow)
else if (statWord and ERROR_ZeroDivide) <> 0 then raise EZeroDivide.Create(SZeroDivide)
//else if (statWord and ERROR_Denormalized) <> 0 then raise EUnderflow.Create(SUnderflow)
else if (statWord and ERROR_InvalidOperation) <> 0 then raise EInvalidOp.Create(SInvalidOp);
end;
end;
A reproducible case!
I found a case, when Delphi's default floating point control word, that was the cause of an invalid floating point exception (although I never saw it before now because it was masked). Now that i'm seeing it, why is it happening! And it's reproducible:
procedure TForm1.Button1Click(Sender: TObject);
var
d: Real;
dover: Int64 absolute d;
begin
d := 1.35715152325557E020;
// dOver := $441d6db44ff62b68; //1.35715152325557E020
d := Round(d); //<--floating point exception
Self.Caption := FloatToStr(d);
end;
You can see that the ST0 register contains a valid floating point value. The floating point control word is $1372. There floating point exception flag are all clear:
And then, as soon as it executes, it's an invalid operation:
IE (Invalid operation) flag is set
ES (Exception) flag is set
I was tempted to ask this as another question, but it would be the exact same question - except this time calling Round().
The problem occurs elsewhere. When your code enters Trunc the control word is set to $027F which is, IIRC, the default Windows control word. This has all exceptions masked. That's a problem because Delphi's RTL expects exceptions to be unmasked.
And look at the FPU window, sure enough there are errors. Both IE and PE flags are set. It's IE that counts. That's means that earlier in the code sequence there was a masked invalid operation.
Then you call Trunc which modifies the control word to unmask the exceptions. Look at your second FPU window screenshot. IE is 1 but IM is 0. So boom, the earlier exception is raised and you are led to think that it was the fault of Trunc. It was not.
You'll need to trace back up the call stack to find out why the control word is not what it ought to be in a Delphi program. It ought to be $1332. Most likely you are calling into some third party library which modifies the control word and does not restore it. You'll have to locate the culprit and take charge whenever any calls to that function return.
Once you get the control word back under control you'll find the real cause of this exception. Clearly there is an illegal FP operation. Once the control word unmasks the exceptions, the error will be raised at the right point.
Note that there's nothing to worry about the discrepancy between $1372 and $1332, or $1F72 and $1F32. That's just an oddity with the CTRL control word that some of the bytes are reserved and ignore you exhortations to clear them.
Your latest update essentially asks a different question. It asks about the exception raised by this code:
procedure foo;
var
d: Real;
i: Int64;
begin
d := 1.35715152325557E020;
i := Round(d);
end;
This code fails because the job of Round() is to round d to the nearest Int64 value. But your value of d is greater than the largest possible value that can be stored in an Int64 and hence the floating point unit traps.

Why Delphi "for" acting like that?

Delphi XE2, simple code:
function FastSwap(Value: uint16): uint16; register; overload;
asm
bswap eax
shr eax, 16
end;
...
type
PPicEleHdr = ^TPicEleHdr;
TPicEleHdr = packed record
zero, size, count: word;
end;
var
count: integer;
buf: TBytes;
begin
...
peh := #buf[offs];
count := integer(FastSwap(peh.count));
for i := 0 to count - 1 do begin
and here at for I see in CPU window
UnitExtract.pas.279: for i := 0 to count - 1 do begin
0051E459 8B45DC mov eax,[ebp-$24]
0051E45C 48 dec eax
0051E45D 85C0 test eax,eax
0051E45F 0F82CD000000 jb $0051e532
0051E465 40 inc eax
0051E466 8945AC mov [ebp-$54],eax
0051E469 C745F400000000 mov [ebp-$0c],$00000000
so when count is 0 nothing works properly, test eax, eax (eax = $FFFFFFFF after dec eax) not affecting Carry flag while jb acting by Carry flag.
Is there something I don't understand about using for?
By a process of reverse engineering, I infer that i is an unsigned 32 bit integer, Cardinal. So the compiler performs the for loop arithmetic in an unsigned context. This means that Count-1 is interpreted as unsigned, and so your loop runs from 0 to high(i).
To flesh this out, this is what happens step by step:
Count is $00000000.
Count-1 is evaluated and has value $FFFFFFFF.
Interpreted as an unsigned integer $FFFFFFFF is 232-1.
Your loop body executes for all values 0 <= i < 232.
The solution is to make your loop variable be a signed integer, for example Integer.
When you switch i to be of type Integer, the following happens:
Count is $00000000.
Count-1 is evaluated and has value $FFFFFFFF.
Interpreted as a signed integer $FFFFFFFF is -1.
The loop body does not execute.
As written, this won't compile, since you don't have a declaration for i.
But my psychic debugging senses say that i is declared somewhere as a cardinal (unsigned integer), and thus when it tries to evaluate 0 - 1, it gets MAXINT instead of -1, because unsigned integers can't represent negative values.
You should never use unsigned integers as either the index variable or the bounding variables of a for loop if there's any chance at all that they can go negative. Otherwise, you get errors like this. In fact, you should probably just not use unsigned integers in general. They're not as useful as they look (if you need a value higher than the maximum signed value for a size, it's likely that you'll end up needing a value higher than twice that at some point, so what you really need is the next larger integer size) and they tend to cause strange bugs like this one.

Is it necessary to assign a default value to a variant returned from a Delphi function?

Gradually I've been using more variants - they can be very useful in certain places for carrying data types that are not known at compile time. One useful value is UnAssigned ('I've not got a value for you'). I think I discovered a long time ago that the function:
function DoSomething : variant;
begin
If SomeBoolean then
Result := 4.5
end;
appeared to be equivalent to:
function DoSomething : variant;
begin
If SomeBoolean then
Result := 4.5
else
Result := Unassigned; // <<<<
end;
I presumed this reasoning that a variant has to be created dynamically and if SomeBoolean was FALSE, the compiler had created it but it was 'Unassigned' (<> nil?). To further encourage this thinking, the compiler reports no warning if you omit assigning Result.
Just now I've spotted nasty bug where my first example (where 'Result' is not explicity defaulted to 'nil') actually returned an 'old' value from somewhere else.
Should I ALWAYS assign Result (as I do when using prefefined types) when returing a variant?
Yes, you always need to initialize the Result of a function, even if it's a managed type (like string and Variant). The compiler does generate some code to initialize the future return value of a Variant function for you (at least the Delphi 2010 compiler I used for testing purposes does) but the compiler doesn't guarantee your Result is initialized; This only makes testing more difficult, because you might run into a case where your Result was initialized, base your decisions on that, only to later discover your code is buggy because under certain circumstances the Result wasn't initialized.
From my investigation, I've noticed:
If your result is assigned to a global variable, your function is called with an initialized hidden temporary variable, creating the illusion that the Result is magically initialized.
If you make two assignments to the same global variable, you'll get two distinct hidden temporary variables, re-enforcing the illusion that Result's are initialized.
If you make two assignments to the same global variable but don't use the global variable between calls, the compiler only uses 1 hidden temporary, braking the previous rule!
If your variable is local to the calling procedure, no intermediary hidden local variable is used at all, so the Result isn't initialized.
Demonstration:
First, here's the proof that a function returning a Variant receives a var Result:
Variant hidden parameter. The following two compile to the exact same assembler, shown below:
procedure RetVarProc(var V:Variant);
begin
V := 1;
end;
function RetVarFunc: Variant;
begin
Result := 1;
end;
// Generated assembler:
push ebx // needs to be saved
mov ebx, eax // EAX contains the address of the return Variant, copies that to EBX
mov eax, ebx // ... not a very smart compiler
mov edx, $00000001
mov cl, $01
call #VarFromInt
pop ebx
ret
Next, it's interesting to see how the call for the two is set up by the complier. Here's what happens for a call to a procedure that has a var X:Variant parameter:
procedure Test;
var X: Variant;
begin
ProcThatTakesOneVarParameter(X);
end;
// compiles to:
lea eax, [ebp - $10]; // EAX gets the address of the local variable X
call ProcThatTakesOneVarParameter
If we make that "X" a global variable, and we call the function returning a Variant, we get this code:
var X: Variant;
procedure Test;
begin
X := FuncReturningVar;
end;
// compiles to:
lea eax, [ebp-$10] // EAX gets the address of a HIDDEN local variable.
call FuncReturningVar // Calls our function with the local variable as parameter
lea edx, [ebp-$10] // EDX gets the address of the same HIDDEN local variable.
mov eax, $00123445 // EAX is loaded with the address of the global variable X
call #VarCopy // This moves the result of FuncReturningVar into the global variable X
If you look at the prologue of this function you'll notice the local variable that's used as a temporary parameter for the call to FuncReturningVar is initialized to ZERO. If the function doesn't contain any Result := statements, X would be "Uninitialized". If we call the function again, a DIFFERENT temporary and hidden variable is used! Here's a bit of sample code to see that:
var X: Variant; // global variable
procedure Test;
begin
X := FuncReturningVar;
WriteLn(X); // Make sure we use "X"
X := FuncReturningVar;
WriteLn(X); // Again, make sure we use "X"
end;
// compiles to:
lea eax, [ebp-$10] // first local temporary
call FuncReturningVar
lea edx, [ebp-$10]
mov eax, $00123456
call #VarCopy
// [call to WriteLn using the actual address of X removed]
lea eax, [ebp-$20] // a DIFFERENT local temporary, again, initialized to Unassigned
call FuncReturningVar
// [ same as before, removed for brevity ]
When looking at that code, you'd think the "Result" of a function returning Variant is allways initialized to Unassigned by the calling party. Not true. If in the previous test we make the "X" variable a LOCAL variable (not global), the compiler no longer uses the two separate local temporary variables. So we've got two separate cases where the compiler generates different code. In other words, don't make any assumptions, always assign Result.
My guess about the different behavior: If the Variant variable can be accessed outside the current scope, as a global variable (or class field for that matter) would, the compiler generates code that uses the thread-safe #VarCopy function. If the variable is local to the function there are no multi-threading issues so the compiler can take the liberty to make direct assignments (no-longer calling #VarCopy).
Should I ALWAYS assign Result (as I do
when using prefefined types) when
returing a variant?
Yes.
Test this:
function DoSomething(SomeBoolean: Boolean) : variant;
begin
if SomeBoolean then
Result := 1
end;
Use the function like this:
var
xx: Variant;
begin
xx := DoSomething(True);
if xx <> Unassigned then
ShowMessage('Assigned');
xx := DoSomething(False);
if xx <> Unassigned then
ShowMessage('Assigned');
end;
xx will still be assigned after the second call to DoSomething.
Change the function to this:
function DoSomething(SomeBoolean: Boolean) : variant;
begin
Result := Unassigned;
if SomeBoolean then
Result := 1
end;
And xx is not assigned after the second call to DoSomething.

Resources