Delphi fastest FileSize for sizes > 10gb

I wanted to check with you experts whether there are any drawbacks to this function. Will it work properly on the various Windows versions? I am using Delphi Seattle (32- and 64-bit executables). I am using this instead of FindFirst for its speed.
function GetFileDetailsFromAttr(pFileName:WideString):int64;
var
wfad: TWin32FileAttributeData;
wSize:LARGE_INTEGER ;
begin
Result:=0 ;
if not GetFileAttributesEx(pwidechar(pFileName), GetFileExInfoStandard,@wfad) then
exit;
wSize.HighPart:=wfad.nFileSizeHigh ;
wSize.LowPart:=wfad.nFileSizeLow ;
result:=wsize.QuadPart ;
end;
The typical samples found via Google for this call do not work for file sizes > 9GB
function GetFileAttributesEx():Int64 using
begin
...
result:=((&wfad.nFileSizeHigh) or (&wfad.nFileSizeLow))

Code with variant record is correct.
But this code
result:=((&wfad.nFileSizeHigh) or (&wfad.nFileSizeLow))
is simply wrong: the result cannot exceed the 32-bit range.
Code from link in comment
result := Int64(info.nFileSizeLow) or Int64(info.nFileSizeHigh shl 32);
is wrong because it does not take into account how the compiler handles 32-bit and 64-bit values. The next example shows how to treat this situation properly (values d and e):
var
a, b: DWord;
c, d, e: Int64;
wSize:LARGE_INTEGER ;
begin
a := 1;
b := 1;
c := Int64(a) or Int64(b shl 32);
d := Int64(a) or Int64(b) shl 32;
wSize.LowPart := a;
wSize.HighPart := b;
e := wsize.QuadPart;
Caption := Format('$%x $%x $%x', [c, d, e]);
Note that in the expression for c the shift is applied to the 32-bit value b, so its set bit never reaches the upper half; only that already-wrong result is then widened to 64 bits.

Regardless of how you get the file size: it would be even faster to use Int64Rec, a type that has existed for ~25 years, and assign the file size directly to the function's result instead of going through an intermediate variable:
Int64Rec(result).Hi:= wfad.nFileSizeHigh;
Int64Rec(result).Lo:= wfad.nFileSizeLow;
end;
In case this isn't obvious to anyone, here is how the two variants compile:
With the intermediate variable w: LARGE_INTEGER, the two 32-bit parts are first assigned to w, which is then assigned to the function's result. Cost: 10 instructions.
With the record Int64Rec used to cast the function's result, both 32-bit parts are assigned directly, without the need for any other variable. Cost: 6 instructions.
Environment used: Delphi 7.0 (Build 8.1), compiler version 15.0, Win32 executable, code optimization: on.
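Putting the pieces above together, a complete version of the function might look like the sketch below. It assumes a Unicode Delphi with the Winapi.Windows unit (plain Windows in older versions). The name FileSize64 is made up for illustration, and returning -1 on failure (rather than 0 as in the question) is also an assumption, chosen so that an empty file can be told apart from an error.

```pascal
program FileSizeDemo;
{$APPTYPE CONSOLE}
uses
  Winapi.Windows, System.SysUtils;

// Hypothetical helper: returns the file size in bytes, or -1 on failure.
function FileSize64(const FileName: string): Int64;
var
  wfad: TWin32FileAttributeData;
begin
  Result := -1;
  if not GetFileAttributesEx(PChar(FileName), GetFileExInfoStandard, @wfad) then
    Exit;
  // Write both 32-bit halves straight into the 64-bit result.
  Int64Rec(Result).Hi := wfad.nFileSizeHigh;
  Int64Rec(Result).Lo := wfad.nFileSizeLow;
end;

begin
  // Prints the size of the running executable itself.
  WriteLn(FileSize64(ParamStr(0)));
end.
```

Because both halves of Result are overwritten on success, the initial -1 only survives when GetFileAttributesEx fails.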

Related

Different optimizations in Math.Sum in Win32/64

I have the following code
const
NumIterations = 10000000;
var
i, j : Integer;
x : array[1..100] of Double;
Start : Cardinal;
S : Double;
begin
for i := Low(x) to High(x) do x[i] := i;
Start := GetTickCount;
for i := 1 to NumIterations do S := System.Math.Sum(x);
ShowMessage('Math.Sum: ' + IntToStr(GetTickCount - Start));
Start := GetTickCount;
for i := 1 to NumIterations do begin
S := 0;
for j := Low(x) to High(x) do S := S + x[j];
end;
ShowMessage('Simple Sum: ' + IntToStr(GetTickCount - Start));
end;
When compiled for Win32 Math.Sum is considerably faster than the simple loop, as Math.Sum is written in Assembler and uses four-fold loop unrolling.
But when compiled for Win64, Math.Sum is considerably slower than the simple loop, because in 64-bit Math.Sum uses Kahan summation. This is an optimization for accuracy minimizing pile-up of errors during the summation process, but is considerably slower than even the simple loop.
I.e. when compiling for Win32 I get code optimized for speed, when compiling the same code for Win64 I get code optimized for accuracy. This is not exactly what I naively would expect.
Is there any sensible reason for this difference between Win32/64? Double is always 8 byte, so the accuracy should be identical in Win32/64.
Is Math.Sum still implemented identically (Assembler and loop unrolling in Win32, Kahan summation in Win64) in current versions of Delphi? I use Delphi-XE5.
"Is Math.Sum still implemented identically (Assembler and loop unrolling in Win32, Kahan summation in Win64) in current versions of Delphi? I use Delphi-XE5."
Yes (Delphi 10.3.2).
"Is there any sensible reason for this difference between Win32/64? Double is always 8 byte, so the accuracy should be identical in Win32/64."
The 32-bit Delphi compiler for Win32 uses the old x87 FPU, while the 64-bit compiler uses SSE instructions. When the 64-bit compiler was introduced in XE2, many of the old assembly routines were not ported to 64-bit. Instead, some routines were ported with functionality similar to that of other modern compilers.
You can enhance the 64-bit implementation a bit by introducing a Kahan summation function:
program TestKahanSum;
{$APPTYPE CONSOLE}
uses
System.SysUtils,Math,Diagnostics;
function KahanSum(const input : TArray<Double>): Double;
var
sum,c,y,t : Double;
i : Integer;
begin
sum := 0.0;
c := 0.0;
for i := Low(input) to High(input) do begin
y := input[i] - c;
t := sum + y;
c := (t - sum) - y;
sum := t;
end;
Result := sum;
end;
var
dArr : TArray<Double>;
res : Double;
i : Integer;
sw : TStopWatch;
begin
SetLength(dArr,100000000);
for i := 0 to High(dArr) do dArr[i] := Pi;
sw := TStopWatch.StartNew;
res := Math.Sum(dArr);
WriteLn('Math.Sum:',res,' [ms]:',sw.ElapsedMilliseconds);
sw := TStopWatch.StartNew;
res := KahanSum(dArr);
WriteLn('KahanSum:',res,' [ms]:',sw.ElapsedMilliseconds);
sw := TStopWatch.StartNew;
res := 0;
for i := 0 to High(dArr) do res := res + dArr[i];
WriteLn('NaiveSum:',res,' [ms]:',sw.ElapsedMilliseconds);
ReadLn;
end.
64-bit:
Math.Sum: 3.14159265358979E+0008 [ms]:492
KahanSum: 3.14159265358979E+0008 [ms]:359
NaiveSum: 3.14159265624272E+0008 [ms]:246
32-bit:
Math.Sum: 3.14159265358957E+0008 [ms]:67
KahanSum: 3.14159265358979E+0008 [ms]:958
NaiveSum: 3.14159265624272E+0008 [ms]:277
Pi with 15 digits is 3.14159265358979
The 32-bit math assembly routine is accurate to 13 digits in this example, while the 64-bit math routine is accurate to 15 digits.
Conclusion:
The 64 bit implementation is slower (by a factor of two compared to a naive summation), but more accurate than the 32-bit math routine.
Introducing an enhanced Kahan summation routine improves performance by 35%.
Having the very same RTL function not behave the same when switching a compilation target is an awful bug. It should not change the behavior. Even worse, Win64/pascal Sum() over single or double does not behave the same! sum(single) is naive summing, whereas sum(double) uses Kahan... :(
You had better either use the plain + operator or write your own Kahan sum function.
I can confirm that the bug is still there in Delphi 10.3.

Delphi Tokyo 64-bit flushes denormal numbers to zero?

During a short look at the source code of system.math, I discovered that
the 64-bit version of Delphi Tokyo 10.2.3 flushes denormal IEEE doubles to zero, as can be seen from the following program:
{$apptype console}
uses
system.sysutils, system.math;
var
x: double;
const
twopm1030 : UInt64 = $0000100000000000; {2^(-1030)}
begin
x := PDouble(@twopm1030)^;
writeln(x);
x := ldexp(1,-515);
writeln(x*x);
x := ldexp(1,-1030);
writeln(x);
end.
For 32-bit the output is as expected
8.69169475979376E-0311
8.69169475979376E-0311
8.69169475979376E-0311
but with 64-bit I get
8.69169475979375E-0311
0.00000000000000E+0000
0.00000000000000E+0000
So basically Tokyo can handle denormal numbers in 64-bit mode, the constant is written correctly, but from arithmetic operations or even with ldexp a denormal result is flushed to zero.
Can this observation be confirmed on other systems? If yes, where is it documented? (The only information I could find about zero-flushing is that denormals become zero when stored in a Real48.)
Update: I know that for both 32- and 64-bit the single overload is used. For 32-bit the x87 FPU is used and the ASM code is virtually identical for all precisions (single, double, extended). The FPU always returns an 80-bit extended which is stored in a double without premature truncation. The 64-bit code does precision adjustment before storing.
Meanwhile I filed an issue report (https://quality.embarcadero.com/browse/RSP-20925), with the focus on the inconsistent results for 32- or 64-bit.
Update:
There is only a difference in how the compiler treats the overloaded selection.
As @Graymatter found out, the LdExp overload called is the Single one for both the 32-bit and the 64-bit compiler. The only difference is the code base: the 32-bit compiler uses asm code, while the 64-bit compiler has a purepascal implementation.
To make the code use the correct overload, explicitly specify the type of the first LdExp() argument; like this it works (64-bit):
program Project116;
{$APPTYPE CONSOLE}
uses
system.sysutils, system.math;
var
x: double;
const
twopm1030 : UInt64 = $0000100000000000; {2^(-1030)}
begin
x := PDouble(@twopm1030)^;
writeln(x);
x := ldexp(Double(1),-515);
writeln(x*x);
x := ldexp(Double(1),-1030);
writeln(x);
ReadLn;
end.
Outputs:
8.69169475979375E-0311
8.69169475979375E-0311
8.69169475979375E-0311
I would say that this behaviour should be reported as an RTL bug, since the overloaded function selected in your case is the Single one. The resulting type is a Double and the compiler should definitely adapt accordingly, since the 32-bit and the 64-bit compiler should produce the same result.
Note that the Double(1) typecast for floating point types was introduced in Delphi 10.2 Tokyo. For solutions in previous versions, see "What is first version of Delphi which allows typecasts like double(10)".
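For compilers without the Double(1) typecast, one way to steer overload resolution is to pass a variable that is declared as Double. The sketch below is only an illustration of that idea: the helper name TwoToThe is hypothetical, and the unit names assume a Delphi version with unit scope names.

```pascal
program LdexpDoubleDemo;
{$APPTYPE CONSOLE}
uses
  System.SysUtils, System.Math;

// Hypothetical helper: computes 2^P through the Double overload of Ldexp.
function TwoToThe(P: Integer): Double;
var
  One: Double;
begin
  One := 1.0;               // declared as Double, so Ldexp(Double, Integer) is selected
  Result := Ldexp(One, P);
end;

var
  x: Double;
begin
  x := TwoToThe(-515);
  WriteLn(x * x);           // the product is 2^(-1030), computed in Double
  WriteLn(TwoToThe(-1030)); // the same value, obtained directly
end.
```

The argument's declared type, not the type of the variable receiving the result, is what decides which overload is called.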
The problem here is that Ldexp(single) is returning different results depending on whether the ASM code is being called or whether the pascal code is called. In both cases, the compiler is calling the Single version of the overload because the type isn't specified in the call.
Your pascal code which is executed in the Win64 scenario tries to deal with the exponent less than -126 but the method is still not able to correctly calculate the result because single numbers are limited to an 8 bit exponent. The assembler seems to get around this but I didn't look into it in much detail as to why that's the case.
function Ldexp(const X: Single; const P: Integer): Single;
{ Result := X * (2^P) }
{$IFNDEF X86ASM}
var
T: Single;
I: Integer;
const
MaxExp = 127;
MinExp = -126;
FractionOfOne = $00800000;
begin
T := X;
Result := X;
case T.SpecialType of
fsDenormal,
fsNDenormal,
fsPositive,
fsNegative:
begin
FClearExcept;
I := P;
if I > MaxExp then
begin
T.BuildUp(False, FractionOfOne, MaxExp);
Result := Result * T;
I := I - MaxExp;
if I > MaxExp then I := MaxExp;
end
else if I < MinExp then
begin
T.BuildUp(False, FractionOfOne, MinExp);
Result := Result * T;
I := I - MinExp;
if I < MinExp then I := MinExp;
end;
if I <> 0 then
begin
T.BuildUp(False, FractionOfOne, I);
Result := Result * T;
end;
FCheckExcept;
end;
// fsZero,
// fsNZero,
// fsInf,
// fsNInf,
// fsNaN:
else
;
end;
end;
{$ELSE X86ASM}
{$IF defined(CPUX86) and defined(IOS)} // iOS/Simulator
...
{$ELSE}
asm // StackAlignSafe
PUSH EAX
FILD dword ptr [ESP]
FLD X
FSCALE
POP EAX
FSTP ST(1)
FWAIT
end;
{$ENDIF}
{$ENDIF X86ASM}
As LU RD suggested, you can get around the problem by forcing the methods to call the Double overload. There is a bug but that bug is that the ASM code doesn't match the pascal code in Ldexp(const X: Single; const P: Integer), not that a different overload is being called.

SHR on int64 does not return expected result

I'm porting some C# code to Delphi (XE5). The C# code has code like this:
long t = ...
...
t = (t >> 25) + ...
I translated this to
t: int64;
...
t := (t shr 25) + ...
Now I see that Delphi (sometimes) calculates wrong values for shifting negative t's, e.g.:
-170358640930559629 shr 25
Windows Calculator: -5077083139
C# code: -5077083139
Delphi:
-170358640930559629 shr 25 = 544678730749 (wrong)
For this example, -1*((-t shr 25)+1) gives the correct value in Delphi.
For other negative values of t a simple typecast to integer seems to give the correct result:
integer(t shr 25)
I am at my limit regarding binary operations and representations, so I would appreciate any help with simply getting the same results in Delphi like in C# and Windows calculator.
Based on the article linked in Filipe's answer (which states the reason to be Delphi carrying out a shr as opposed to others doing a sar), here's my take on this:
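function CalculatorRsh(Value: Int64; ShiftBits: Integer): Int64;
begin
// Logical shift first: shr fills the vacated high bits with zeros.
Result := Value shr ShiftBits;
// For negative input, set the vacated high bits to mimic an arithmetic shift (sar).
// The ShiftBits > 0 guard avoids a shift by 64, whose result is not reliable.
if (Value < 0) and (ShiftBits > 0) then
Result := Result or (Int64(-1) shl (64 - ShiftBits));
end;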
As you can read here, the way C and Delphi treat Shr is different. Not meaning to point fingers, but C's >> isn't really a shr, it's actually a sar.
Anyways, the only workaround that I've found is doing your math manually. Here's an example:
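function SAR(a, b : int64): int64;
begin
// Caveat: Round rounds to nearest and the division is done in floating point,
// whereas a true sar rounds toward negative infinity, so results can differ.
result := round(a / (1 shl b));
end;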
Hope it helps!
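An arithmetic shift can also be done with integer operations only, which avoids the rounding and precision issues of a floating-point division. The sketch below uses the common trick of inverting a negative value, shifting it logically, and inverting it back. The name Sar64 is hypothetical, and the shift count is assumed to be in the range 0..63.

```pascal
program SarDemo;
{$APPTYPE CONSOLE}
uses
  System.SysUtils;

// Hypothetical helper: arithmetic shift right for Int64, assuming 0 <= Bits <= 63.
function Sar64(Value: Int64; Bits: Integer): Int64;
begin
  if Value < 0 then
    // not Value is non-negative, so shr behaves like sar on it;
    // inverting again restores the sign and rounds toward negative infinity.
    Result := not ((not Value) shr Bits)
  else
    Result := Value shr Bits;
end;

begin
  WriteLn(Sar64(-170358640930559629, 25)); // -5077083139, the value from the question
  WriteLn(Sar64(-1, 1));                   // -1
  WriteLn(Sar64(1024, 3));                 // 128
end.
```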

Algol60 passing integer element of array as parameter - error bad type

I have the following problem.
When I try to run the code with arun file.obj (I have compiled with algol.exe file)
BEGIN
INTEGER PROCEDURE fun(tab,index,lower,upper);
INTEGER tab,index,lower,upper;
BEGIN
INTEGER t;
text (1, "Start");
t := 0;
FOR index := lower STEP 1 UNTIL upper DO
t := t + tab;
fun := t;
END;
INTEGER ARRAY t[1:10];
INTEGER i,result,lower,upper;
lower := 1;
upper := 10;
FOR i := 1 STEP 1 UNTIL 10 DO
t[i] := i;
i := 1;
result := fun(t[i],i,lower,upper);
END FINISH;
I am still getting error:
ERROR 3
ADD PBASE PROC LOC
07D4 0886 1 13
083A 0842 0 115
The compiler I use is "The Rogalgol Algol60" product of RHA (Minisystems) Ltd.
Error 3 means "3 Procedure called where the actual and the formal parameter types do not match."
But I do not understand why. The cause of the error is t[i] (if I change it to i, it is OK).
Does anyone know what I am doing wrong?
I compile the code in DOSBox (Linux).
The problem is that the index of the integer array that you're passing to your procedure isn't the same as the integer that it's expecting. I can't remember what an integer array is full of, but I guess it isn't integers... Have to admit I never use them, but can't remember why. Possibly because of limitations like this. I stick to Real arrays and EBCDIC ones.
You can almost certainly fix it by defining a new integer, j; inserting "j := t[i];" before your invocation of 'fun'; then invoking 'fun' with 'j' rather than t[i].
BTW you may want to make the array (and the 'for' loop) zero-relative. ALGOL is mostly zero-relative and I think it may save memory if you go with the flow.
Let me know if this helps....

There's a UIntToStr in Delphi to let you display UINT64 values, but where is StrToUInt to allow the user to input 64-bit unsigned values?

I want to convert a large 64-bit value from a decimal or hex string to the 64-bit UINT64 data type. There is a UIntToStr to help convert a UINT64 to a string, but no way to convert a string to a 64-bit unsigned value. That means integer values of 2**63 or greater cannot be read from decimal or hex input using the RTL. This is normally not a big deal, but it can happen that a user needs to input a value as an unsigned integer, which must be stored in the registry as a 64-bit unsigned integer value.
procedure HandleLargeHexValue;
var
x:UINT64;
begin
x := $FFFFFFFFFFFFFFFE;
try
x := StrToInt('$FFFFFFFFFFFFFFFF'); // range error.
except
x := $FFFFFFFFFFFFFFFD;
end;
Caption := UintToStr(x);
end;
Update: Val now works fine in Delphi XE4 and up. In XE3 and below Val('$FFFFFFFFFFFFFFFF') works but not Val('9223372036854775899'). As Roeland points out below, according to Quality Central 108740, System.Val had problems with big UInt64 values in decimal until Delphi XE4.
UPDATE: In XE4 and later the RTL bug was fixed. This hack is only useful in Delphi XE3 or older.
Well, if it ain't there, I guess I could always write it.
(I wrote a pretty good unit test for this too, but it's too big to post here.)
unit UIntUtils;
{ A missing RTL function written by Warren Postma. }
interface
function TryStrToUINT64(StrValue:String; var uValue:UInt64 ):Boolean;
function StrToUINT64(Value:String):UInt64;
implementation
uses SysUtils,Character;
{$R-}
function TryStrToUINT64(StrValue:String; var uValue:UInt64 ):Boolean;
var
Start,Base,Digit:Integer;
n:Integer;
Nextvalue:UInt64;
begin
result := false;
Base := 10;
Start := 1;
StrValue := Trim(UpperCase(StrValue));
if StrValue='' then
exit;
if StrValue[1]='-' then
exit;
if StrValue[1]='$' then
begin
Base := 16;
Start := 2;
if Length(StrValue)>17 then // $+16 hex digits = max hex length.
exit;
end;
uValue := 0;
for n := Start to Length(StrValue) do
begin
if Character.IsDigit(StrValue[n]) then
Digit := Ord(StrValue[n])-Ord('0')
else if (Base=16) and (StrValue[n] >= 'A') and (StrValue[n] <= 'F') then
Digit := (Ord(StrValue[n])-Ord('A'))+10
else
exit;// invalid digit.
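// Reject the digit if uValue*Base+Digit would not fit into 64 bits.
// (Comparing the wrapped product with the old value does not catch every overflow.)
if uValue > (High(UInt64) - UInt64(Digit)) div UInt64(Base) then
exit;
Nextvalue := (uValue*UInt64(Base))+UInt64(Digit);
uValue := Nextvalue;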
end;
result := true; // success.
end;
function StrToUINT64(Value:String):UInt64;
begin
if not TryStrToUINT64(Value,result) then
raise EConvertError.Create('Invalid uint64 value');
end;
end.
I must disagree that Val solves this issue.
Val works only for big UInt64 values when they are written in Hex. When they are written in decimal, the last character is removed from the string and the resulting value is wrong.
See Quality Central 108740: System.Val has problems with big UInt64 values
EDIT: It seems that this issue should be solved in XE4. Can't test this.
With Value a UINT64, the code snippet below gives the expected answer on Delphi 2010 but only if the input values are in hexadecimal
stringValue := '$FFFFFFFFFFFFFFFF';
val( stringValue, value, code );
ShowMessage( UIntToStr( value ));
I'd simply wrap val in a convenience function and you're done.
Now feel free to burn me. Am I missing a digit in my tests? :D
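A convenience wrapper around Val, as suggested above, could look roughly like this. It is only a sketch: the name TryValUInt64 is hypothetical, and it relies on Val handling UInt64 correctly, which according to the notes in this thread holds for hex input in older versions and for decimal input only from XE4 on.

```pascal
program ValUInt64Demo;
{$APPTYPE CONSOLE}
uses
  System.SysUtils;

// Hypothetical wrapper around System.Val for UInt64 targets.
function TryValUInt64(const S: string; out Value: UInt64): Boolean;
var
  Code: Integer;
begin
  Val(S, Value, Code);
  Result := Code = 0;   // Code is 0 when the whole string was converted
end;

var
  V: UInt64;
begin
  if TryValUInt64('$FFFFFFFFFFFFFFFF', V) then
    WriteLn(UIntToStr(V))
  else
    WriteLn('not a valid UInt64');
end.
```

A Boolean-returning wrapper keeps the caller free of exception handling; a raising StrToUInt64-style variant can be layered on top of it in the same way as in the unit above.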
