As the topic indicates above, I'm wondering if there's a good example of a clean and efficient way to handle pointers as passed in function parms when processing the data sequentially. What I have is something like:
function myfunc(inptr: pointer; inptrsize: longint): boolean;
var
inproc: pointer;
i: integer;
begin
inproc := inptr;
for i := 1 to inptrsize do
begin
// do stuff against byte data here.
inc(longint(inproc), 1);
end;
end;
The idea is that instead of finite pieces of data, I want it to be able to process whatever is pushed its way, no matter the size.
Now when it comes to processing the data, I've figured out a couple of ways to do it successfully.
Assign the parm pointers to identical temporary pointers, then use those to access each piece of data, incrementing them to move on. This method is quickest, but not very clean looking with all the pointer increments spread all over the code. (this is what I'm talking about above)
Assign the parm pointers to a pointer representing a big array value and then incremently process that using standard table logic. Much cleaner, but about 500 ms slower than #1.
Is there another way to efficiently handle processing pointers in this way, or is there some method I'm missing that will both be clean and not time inefficient?
Your code here is basically fine. I would always choose to increment a pointer than cast to a fake array.
But you should not cast to an integer. That is semantically wrong and you'll pay the penalty anytime you compile on a platform that has pointer size different from your integer size. Always use a pointer to an element of the right size. In this case a pointer to byte.
function MyFunc(Data: PByte; Length: Integer): Boolean;
var
i: Integer;
begin
for i := 1 to Length do
begin
// do stuff against byte data here.
inc(Data);
end;
end;
Unless the compiler is having a really bad day, you won't find it easy to get better performing code than this. What's more, I think this style is actually rather clear and easy to understand. Most of the clarity gain comes in avoiding the need to cast. Always strive to remove casts from your code.
If you want to allow any pointer type to be passed then you can write it like this:
function MyFunc(P: Pointer; Length: Integer): Boolean;
var
i: Integer;
Data: PByte;
begin
Data := P;
for i := 1 to Length do
begin
// do stuff against byte data here.
inc(Data);
end;
end;
Or if you want to avoid pointers in the interface, then use an untyped const parameter.
function MyFunc(const Buffer; Length: Integer): Boolean;
var
i: Integer;
Data: PByte;
begin
Data := PByte(#Buffer);
for i := 1 to Length do
begin
// do stuff against byte data here.
inc(Data);
end;
end;
Use a var parameter if you need to modify the buffer.
I have a different opinion: For sake of readability I would use an array. Pascal was not designed to be able to access memory directly. Original pascal did not even have pointer arithmetic.
This is how I would use an array:
function MyFunc(P: Pointer; Length: Integer): Boolean;
var
ArrayPtr : PByteArray Absolute P;
I : Integer;
begin
For I := 0 to Length-1 do
// do stuff against ArrayPtr^[I]
end;
But if performance matters, I would write it like this
function MyFunc(P: Pointer; Length: Integer): Boolean;
var
EndOfMemoryBlock: PByte;
begin
EndOfMemoryBlock := PByte(Int_Ptr(Data)+Length);
While P<EndOfMemoryBlock Do begin
// do stuff against byte data here.
inc(P);
end;
end;
Related
Given a buffer and its size in bytes, is there a way to convert this to TBytes without copying it?
Example:
procedure HandleBuffer(_Buffer: PByte; _BufSize: integer);
var
Arr: TBytes;
i: Integer;
begin
// some clever code here to get contents of the buffer into the Array
for i := 0 to Length(Arr)-1 do begin
HandleByte(Arr[i]);
end;
end;
I could of course copy the data:
procedure HandleBuffer(_Buffer: PByte; _BufSize: integer);
var
Arr: TBytes;
i: Integer;
begin
// this works but is very inefficient
SetLength(Arr, _BufSize);
Move(PByte(_Buffer)^, Arr[0], _BufSize);
//
for i := 0 to Length(Arr)-1 do begin
HandleByte(Arr[i]);
end;
end;
But for a large buffer (about a hundred megabytes) this would mean I have double the memory requirement and also spend a lot of time unnecessarily copying data.
I am aware that I could simply use a PByte to process each byte in the buffer, I'm only interested in a solution to use a TBytes instead.
I think it's not possible, but I have been wrong before.
No, this is not possible (without unreasonable hacks).
The problem is that TBytes = TArray<Byte> = array of Byte is a dynamic array and the heap object for a non-empty dynamic array has a header containing the array's reference count and length.
A function that accepts a TBytes parameter, when given a plain pointer to an array of bytes, might (rightfully) attempt to read the (non-existing) header, and then you are in serious trouble.
Also, dynamic arrays are managed types (as indicated by the reference count I mentioned), so you might have problems with that as well.
However, in your particular example code, you don't actually use the dynamic array nature of the data at all, so you can work directly with the buffer:
procedure HandleBuffer(_Buffer: PByte; _BufSize: integer);
var
i: Integer;
begin
for i := 0 to _BufSize - 1 do
HandleByte(_Buffer[i]);
end;
Following on from this question (Dynamic arrays and memory management in Delphi), if I create a dynamic array in Delphi, how do I access the reference count?
SetLength(a1, 100);
a2 := a1;
// The reference count for the array pointed to by both
// a1 and a2 should be 2. How do I retrieve this?
Additionally, if the reference count can be accessed, can it also be modified manually? This latter question is mainly theoretical rather than for use practically (unlike the first question above).
You can see how the reference count is managed by inspecting the code in the System unit. Here are the pertinent parts from the XE3 source:
type
PDynArrayRec = ^TDynArrayRec;
TDynArrayRec = packed record
{$IFDEF CPUX64}
_Padding: LongInt; // Make 16 byte align for payload..
{$ENDIF}
RefCnt: LongInt;
Length: NativeInt;
end;
....
procedure _DynArrayAddRef(P: Pointer);
begin
if P <> nil then
AtomicIncrement(PDynArrayRec(PByte(P) - SizeOf(TDynArrayRec))^.RefCnt);
end;
function _DynArrayRelease(P: Pointer): LongInt;
begin
Result := AtomicDecrement(PDynArrayRec(PByte(P) - SizeOf(TDynArrayRec))^.RefCnt);
end;
A dynamic array variable holds a pointer. If the array is empty, then the pointer is nil. Otherwise the pointer contains the address of the first element of the array. Immediately before the first element of the array is stored the metadata for the array. The TDynArrayRec type describes that metadata.
So, if you wish to read the reference count you can use the exact same technique as does the RTL. For instance:
function DynArrayRefCount(P: Pointer): LongInt;
begin
if P <> nil then
Result := PDynArrayRec(PByte(P) - SizeOf(TDynArrayRec))^.RefCnt
else
Result := 0;
end;
If you want to modify the reference count then you can do so by exposing the functions in System:
procedure DynArrayAddRef(P: Pointer);
asm
JMP System.#DynArrayAddRef
end;
function DynArrayRelease(P: Pointer): LongInt;
asm
JMP System.#DynArrayRelease
end;
Note that the name DynArrayRelease as chosen by the RTL designers is a little mis-leading because it merely reduces the reference count. It does not release memory when the count reaches zero.
I'm not sure why you would want to do this mind you. Bear in mind that once you start modifying the reference count, you have to take full responsibility for getting it right. For instance, this program leaks:
{$APPTYPE CONSOLE}
var
a, b: array of Integer;
type
PDynArrayRec = ^TDynArrayRec;
TDynArrayRec = packed record
{$IFDEF CPUX64}
_Padding: LongInt; // Make 16 byte align for payload..
{$ENDIF}
RefCnt: LongInt;
Length: NativeInt;
end;
function DynArrayRefCount(P: Pointer): LongInt;
begin
if P <> nil then
Result := PDynArrayRec(PByte(P) - SizeOf(TDynArrayRec))^.RefCnt
else
Result := 0;
end;
procedure DynArrayAddRef(P: Pointer);
asm
JMP System.#DynArrayAddRef
end;
function DynArrayRelease(P: Pointer): LongInt;
asm
JMP System.#DynArrayRelease
end;
begin
ReportMemoryLeaksOnShutdown := True;
SetLength(a, 1);
Writeln(DynArrayRefCount(a));
b := a;
Writeln(DynArrayRefCount(a));
DynArrayAddRef(a);
Writeln(DynArrayRefCount(a));
a := nil;
Writeln(DynArrayRefCount(b));
b := nil;
Writeln(DynArrayRefCount(b));
end.
And if you make a call to DynArrayRelease that takes the reference count to zero then you would also need to dispose of the array, for reasons discussed above. I've never encountered a problem that would require manipulation of the reference count, and strongly suggest that you avoid doing so.
One final point. The RTL does not offer this functionality through its public interface. Which means that all of the above is private implementation detail. And so is subject to change in a future release. If you do attempt to read or modify the reference count then you must recognise that doing so relies on such implementation detail.
After some googling, I found an excellent article by Rudy Velthuis. I highly recommend to read it. Quoting dynamic arrays part from http://rvelthuis.de/articles/articles-pointers.html#dynarrays
At the memory location below the address to which the pointer points, there are two more fields, the number of elements allocated, and the reference count.
If, as in the diagram above, N is the address in the dynamic array variable, then the reference count is at address N-8, and the number of allocated elements (the length indicator) at N-4. The first element is at address N.
How to access these:
SetLength(a1, 100);
a2 := a1;
// Reference Count = 2
refCount := PInteger(NativeUInt(#a1[0]) - SizeOf(NativeInt) - SizeOf(Integer))^;
// Array Length = 100
arrLength := PNativeInt(NativeUInt(#a1[0]) - SizeOf(NativeInt))^;
The trick in computing proper offsets is to account for differences between 32bit and 64bit platforms code. Fields size in bytes is as follows:
32bit 64bit
RefCount 4 4
Length 4 8
I'm going maintain and port to Delphi XE2 a bunch of very old Delphi code that is full of VarArrayCreate constructs to fake dynamic arrays having a lower bound that is not zero.
Drawbacks of using Variant types are:
quite a bit slower than native arrays (the code does a lot of complex financial calculations, so speed is important)
not type safe (especially when by accident a wrong var... constant is used, and the Variant system starts to do unwanted conversions or rounding)
Both could become moot if I could use dynamic arrays.
Good thing about variant arrays is that they can have non-zero lower bounds.
What I recollect is that dynamic arrays used to always start at a lower bound of zero.
Is this still true? In other words: Is it possible to have dynamic arrays start at a different bound than zero?
As an illustration a before/after example for a specific case (single dimensional, but the code is full of multi-dimensional arrays, and besides varDouble, the code also uses various other varXXX data types that TVarData allows to use):
function CalculateVector(aSV: TStrings): Variant;
var
I: Integer;
begin
Result := VarArrayCreate([1,aSV.Count-1],varDouble);
for I := 1 to aSV.Count-1 do
Result[I] := CalculateItem(aSV, I);
end;
The CalculateItem function returns Double. Bounds are from 1 to aSV.Count-1.
Current replacement is like this, trading the space zeroth element of Result for improved compile time checking:
type
TVector = array of Double;
function CalculateVector(aSV: TStrings): TVector;
var
I: Integer;
begin
SetLength(Result, aSV.Count); // lower bound is zero, we start at 1 so we ignore the zeroth element
for I := 1 to aSV.Count-1 do
Result[I] := CalculateItem(aSV, I);
end;
Dynamic arrays always have a lower bound of 0. So, low(A) equals 0 for all dynamic arrays. This is even true for empty dynamic arrays, i.e. nil.
From the documentation:
Dynamic arrays are always integer-indexed, always starting from 0.
Having answered your direct question already, I also offer you the beginnings of a generic class that you can use in your porting.
type
TSpecifiedBoundsArray<T> = class
private
FValues: TArray<T>;
FLow: Integer;
function GetHigh: Integer;
procedure SetHigh(Value: Integer);
function GetLength: Integer;
procedure SetLength(Value: Integer);
function GetItem(Index: Integer): T;
procedure SetItem(Index: Integer; const Value: T);
public
property Low: Integer read FLow write FLow;
property High: Integer read GetHigh write SetHigh;
property Length: Integer read GetLength write SetLength;
property Items[Index: Integer]: T read GetItem write SetItem; default;
end;
{ TSpecifiedBoundsArray<T> }
function TSpecifiedBoundsArray<T>.GetHigh: Integer;
begin
Result := FLow+System.High(FValues);
end;
procedure TSpecifiedBoundsArray<T>.SetHigh(Value: Integer);
begin
SetLength(FValues, 1+Value-FLow);
end;
function TSpecifiedBoundsArray<T>.GetLength: Integer;
begin
Result := System.Length(FValues);
end;
procedure TSpecifiedBoundsArray<T>.SetLength(Value: Integer);
begin
System.SetLength(FValues, Value);
end;
function TSpecifiedBoundsArray<T>.GetItem(Index: Integer): T;
begin
Result := FValues[Index-FLow];
end;
function TSpecifiedBoundsArray<T>.SetItem(Index: Integer; const Value: T);
begin
FValues[Index-FLow] := Value;
end;
I think it's pretty obvious how this works. I contemplated using a record but I consider that to be unworkable. That's down to the mix between value type semantics for FLow and reference type semantics for FValues. So, I think a class is best here.
It also behaves rather weirdly when you modify Low.
No doubt you'd want to extend this. You'd add a SetBounds, a copy to, a copy from and so on. But I think you may find it useful. It certainly shows how you can make an object that looks very much like an array with non-zero lower bound.
In Delphi you can speed up your code by passing parameters as const, e.g.
function A(const AStr: string): integer;
//or
function B(AStr: string): integer;
Suppose both functions have the same code inside, the speed difference between them is negligible and I doubt it can even be measured with a cycle-counter like:
function RDTSC: comp;
var
TimeStamp: record case byte of
1: (Whole: comp);
2: (Lo, Hi: Longint);
end;
begin
asm
db $0F; db $31;
mov [TimeStamp.Lo], eax
mov [TimeStamp.Hi], edx
end;
Result := TimeStamp.Whole;
end;
The reason for this is that all the const does in function A is to prevent the reference count of AStr to be incremented.
But the increment only takes one cycle of one core of my multicore CPU, so...
Why should I bother with const?
If there is no other reason for the function to contain an implicit try/finally, and the function itself is not doing much work, the use of const can result in a significant speedup (I once got one function that was using >10% of total runtime in a profiling run down to <2% just by adding a const in the right place).
Also, the reference counting takes much much more than one cycle because it has to be performed with the lock prefix for threadsafety reasons, so we are talking more like 50-100 cycles. More if something in the same cache line has been modified by another core in between.
As for not being able to measure it:
program Project;
{$APPTYPE CONSOLE}
uses
Windows,
SysUtils,
Math;
function GetThreadTime: Int64;
var
CreationTime, ExitTime, KernelTime, UserTime: TFileTime;
begin
GetThreadTimes(GetCurrentThread, CreationTime, ExitTime, KernelTime, UserTime);
Result := PInt64(#UserTime)^;
end;
function ConstLength(const s: string): Integer;
begin
Result := Length(s);
end;
function NoConstLength(s: string): Integer;
begin
Result := Length(s);
end;
var
s : string;
i : Integer;
j : Integer;
ConstTime, NoConstTime: Int64;
begin
try
// make sure we got an heap allocated string;
s := 'abc';
s := s + '123';
//make sure we minimize thread context switches during the timing
SetThreadPriority(GetCurrentThread, THREAD_PRIORITY_TIME_CRITICAL);
j := 0;
ConstTime := GetThreadTime;
for i := 0 to 100000000 do
Inc(j, ConstLength(s));
ConstTime := GetThreadTime - ConstTime;
j := 0;
NoConstTime := GetThreadTime;
for i := 0 to 100000000 do
Inc(j, NoConstLength(s));
NoConstTime := GetThreadTime - NoConstTime;
SetThreadPriority(GetCurrentThread, THREAD_PRIORITY_NORMAL);
WriteLn('Const: ', ConstTime);
WriteLn('NoConst: ', NoConstTime);
WriteLn('Const is ', (NoConstTime/ConstTime):2:2, ' times faster.');
except
on E: Exception do
Writeln(E.ClassName, ': ', E.Message);
end;
if DebugHook <> 0 then
ReadLn;
end.
Produces this output on my system:
Const: 6084039
NoConst: 36192232
Const is 5.95 times faster.
EDIT: it gets a bit more interesting if we add some thread contention:
program Project;
{$APPTYPE CONSOLE}
uses
Windows,
SysUtils,
Classes,
Math;
function GetThreadTime: Int64;
var
CreationTime, ExitTime, KernelTime, UserTime: TFileTime;
begin
GetThreadTimes(GetCurrentThread, CreationTime, ExitTime, KernelTime, UserTime);
Result := PInt64(#UserTime)^;
end;
function ConstLength(const s: string): Integer;
begin
Result := Length(s);
end;
function NoConstLength(s: string): Integer;
begin
Result := Length(s);
end;
function LockedAdd(var Target: Integer; Value: Integer): Integer; register;
asm
mov ecx, eax
mov eax, edx
lock xadd [ecx], eax
add eax, edx
end;
var
x : Integer;
s : string;
ConstTime, NoConstTime: Integer;
StartEvent: THandle;
ActiveCount: Integer;
begin
try
// make sure we got an heap allocated string;
s := 'abc';
s := s + '123';
ConstTime := 0;
NoConstTime := 0;
StartEvent := CreateEvent(nil, True, False, '');
ActiveCount := 0;
for x := 0 to 2 do
TThread.CreateAnonymousThread(procedure
var
i : Integer;
j : Integer;
ThreadConstTime: Int64;
begin
//make sure we minimize thread context switches during the timing
SetThreadPriority(GetCurrentThread, THREAD_PRIORITY_HIGHEST);
InterlockedIncrement(ActiveCount);
WaitForSingleObject(StartEvent, INFINITE);
j := 0;
ThreadConstTime := GetThreadTime;
for i := 0 to 100000000 do
Inc(j, ConstLength(s));
ThreadConstTime := GetThreadTime - ThreadConstTime;
SetThreadPriority(GetCurrentThread, THREAD_PRIORITY_NORMAL);
LockedAdd(ConstTime, ThreadConstTime);
InterlockedDecrement(ActiveCount);
end).Start;
while ActiveCount < 3 do
Sleep(100);
SetEvent(StartEvent);
while ActiveCount > 0 do
Sleep(100);
WriteLn('Const: ', ConstTime);
ResetEvent(StartEvent);
for x := 0 to 2 do
TThread.CreateAnonymousThread(procedure
var
i : Integer;
j : Integer;
ThreadNoConstTime: Int64;
begin
//make sure we minimize thread context switches during the timing
SetThreadPriority(GetCurrentThread, THREAD_PRIORITY_HIGHEST);
InterlockedIncrement(ActiveCount);
WaitForSingleObject(StartEvent, INFINITE);
j := 0;
ThreadNoConstTime := GetThreadTime;
for i := 0 to 100000000 do
Inc(j, NoConstLength(s));
ThreadNoConstTime := GetThreadTime - ThreadNoConstTime;
SetThreadPriority(GetCurrentThread, THREAD_PRIORITY_NORMAL);
LockedAdd(NoConstTime, ThreadNoConstTime);
InterlockedDecrement(ActiveCount);
end).Start;
while ActiveCount < 3 do
Sleep(100);
SetEvent(StartEvent);
while ActiveCount > 0 do
Sleep(100);
WriteLn('NoConst: ', NoConstTime);
WriteLn('Const is ', (NoConstTime/ConstTime):2:2, ' times faster.');
except
on E: Exception do
Writeln(E.ClassName, ': ', E.Message);
end;
if DebugHook <> 0 then
ReadLn;
end.
On a 6 core machine, this gives me:
Const: 19968128
NoConst: 1313528420
Const is 65.78 times faster.
EDIT2: replacing the call to Length with a call to Pos (I picked the worst case, search for something not contained in the string):
function ConstLength(const s: string): Integer;
begin
Result := Pos('x', s);
end;
function NoConstLength(s: string): Integer;
begin
Result := Pos('x', s);
end;
results in:
Const: 51792332
NoConst: 1377644831
Const is 26.60 times faster.
for the threaded case, and:
Const: 15912102
NoConst: 44616286
Const is 2.80 times faster.
for the non-threaded case.
Don't forget that const isn't only there to provide those tiny performance improvements.
Using const explains to anybody reading or maintaining the code that the value shouldn't be updated, and allows the compiler to catch any accidental attempts to do so.
So making your code more readable and maintainable can also make it marginally faster. What good reasons are there for not using const?
Using const prevents an implicit try/finally block which on x86 is rather more expensive than reference counting. That's really a separate issue to the semantic meaning of const. It's a shame that performance and semantics are mixed up in this way.
The type String is a special case, because it is managed by Delphi (copy on demand), and therefore not ideal to answer your question.
If you test your function with other types that are bigger than a pointer, records or arrays for example, you should see a bigger time difference, because with const only a pointer is passed, without const the record would be copied before passing to the function.
Using the keyword const, you can leave the decision of optimization to the compiler.
The documentation says:
Using const allows the compiler to optimize code for structured- and string-type parameters.
So, it is better, thus rational, to use const for string parameters, simply because the manual says so. ;)
Now, this may be well enough an answer for the questioner, but it is even more interesting to look at the general question whether to use const parameters or not.
Again, the documentation says at just one click away from the Delphi Language Guide Index:
Value and constant (const) parameters are passed by value or by reference, depending on the type and size of the parameter:
Note the apparent equality of value and constant parameters in this sentence. This concludes that using const for parameters, being not string- or structured-typed, makes no difference in performance nor code-size. (A short test, derived from Thorsten Engler's test code, indeed shows an average indifference between with and without const for parameters of ordinal and real types.)
So it turns out that whether or not using const only makes a difference to the programmer, not the executable.
As follow-up, and as LukeH already asked: What good reasons are there for not using const?
To follow Delphi's own syntax:
function FindDragTarget(const Pos: TPoint; AllowDisabled: Boolean): TControl;
function UpperCase(const S: string): string;
function UpCase(Ch: Char): Char;
function EncodeDate(Year, Month, Day: Word): TDateTime;
To produce more compact are therefore possibly slightly more readable code. For instance: using constant parameters in property setters really is superfluous, which surprisingly often leads to single line declarations instead of double, if you like to honour a line length limit.
To comfortably provide variables to virtual methods and event handlers. Note that none of the VCL event handler types use const parameters (for other than string- or record-typed members). It is just nice service for the users of your code or your components.
Of course, there also may be fine reasons for using const:
As LukeH already answered, if there is really no need at all to change the value of the parameter.
For (personal) protection, like the documentation says:
Using const also provides a safeguard against unintentionally passing a parameter by reference to another routine.
Partial origin of this answer: http://www.nldelphi.com.
Generally, I would avoid any optimizations (in any language) that don't solve real problems that you can measure. Profile your code, and fix the problems that you can actually see. Optimizing for theoretical issues is just a waste of your time.
If you suspect that something is wrong, and this somehow fixes it/speeds it up, then great, but implementing these kinds of micro optimizations by default are rarely worth the time.
One of the most important fact that people omitted. Interlock ... instruction is very costly in Multicore CPUs of x86 instruction. Read Intel manual. The cost is when refcounter var is taken placed and it is not in cpu cache, ALL other CPUs must be stopped for instruction to carried out.
Cheers
Normally, in Delphi one would declare a function with a variable number of arguments using the 'array of const' method. However, for compatibility with code written in C, there's an much-unknown 'varargs' directive that can be added to a function declaration (I learned this while reading Rudy's excellent 'Pitfalls of convering' document).
As an example, one could have a function in C, declared like this :
void printf(const char *fmt, ...)
In Delphi, this would become :
procedure printf(const fmt: PChar); varargs;
My question is : How can I get to the contents of the stack when implementing a method which is defined with the 'varargs' directive?
I would expect that some tooling for this exists, like Dephi translations of the va_start(), va_arg() and va_end() functions, but I can't find this anywhere.
Please help!
PS: Please don't drift off in discussions about the 'why' or the 'array of const' alternative - I need this to write C-like patches for functions inside Xbox games (see the Delphi Xbox emulator project 'Dxbx' on sourceforge for details).
OK, I see the clarification in your question to mean that you need to implement a C import in Delphi. In that case, you need to implement varargs yourself.
The basic knowledge needed is the C calling convention on the x86: the stack grows downwards, and C pushes arguments from right to left. Thus, a pointer to the last declared argument, after it is incremented by the size of the last declared argument, will point to the tail argument list. From then, it's simply a matter of reading the argument out and incrementing the pointer by an appropriate size to move deeper into the stack. The x86 stack in 32-bit mode is 4-byte aligned generally, and this also means that bytes and words are passed as 32-bit integers.
Anyhow, here's a helper record in a demo program that shows how to read out data. Note that Delphi seems to be passing Extended types in a very odd way; however, you likely won't have to worry about that, as 10-byte floats aren't generally widely used in C, and aren't even implemented in the latest MS C, IIRC.
{$apptype console}
type
TArgPtr = record
private
FArgPtr: PByte;
class function Align(Ptr: Pointer; Align: Integer): Pointer; static;
public
constructor Create(LastArg: Pointer; Size: Integer);
// Read bytes, signed words etc. using Int32
// Make an unsigned version if necessary.
function ReadInt32: Integer;
// Exact floating-point semantics depend on C compiler.
// Delphi compiler passes Extended as 10-byte float; most C
// compilers pass all floating-point values as 8-byte floats.
function ReadDouble: Double;
function ReadExtended: Extended;
function ReadPChar: PChar;
procedure ReadArg(var Arg; Size: Integer);
end;
constructor TArgPtr.Create(LastArg: Pointer; Size: Integer);
begin
FArgPtr := LastArg;
// 32-bit x86 stack is generally 4-byte aligned
FArgPtr := Align(FArgPtr + Size, 4);
end;
class function TArgPtr.Align(Ptr: Pointer; Align: Integer): Pointer;
begin
Integer(Result) := (Integer(Ptr) + Align - 1) and not (Align - 1);
end;
function TArgPtr.ReadInt32: Integer;
begin
ReadArg(Result, SizeOf(Integer));
end;
function TArgPtr.ReadDouble: Double;
begin
ReadArg(Result, SizeOf(Double));
end;
function TArgPtr.ReadExtended: Extended;
begin
ReadArg(Result, SizeOf(Extended));
end;
function TArgPtr.ReadPChar: PChar;
begin
ReadArg(Result, SizeOf(PChar));
end;
procedure TArgPtr.ReadArg(var Arg; Size: Integer);
begin
Move(FArgPtr^, Arg, Size);
FArgPtr := Align(FArgPtr + Size, 4);
end;
procedure Dump(const types: string); cdecl;
var
ap: TArgPtr;
cp: PChar;
begin
cp := PChar(types);
ap := TArgPtr.Create(#types, SizeOf(string));
while True do
begin
case cp^ of
#0:
begin
Writeln;
Exit;
end;
'i': Write(ap.ReadInt32, ' ');
'd': Write(ap.ReadDouble, ' ');
'e': Write(ap.ReadExtended, ' ');
's': Write(ap.ReadPChar, ' ');
else
Writeln('Unknown format');
Exit;
end;
Inc(cp);
end;
end;
type
PDump = procedure(const types: string) cdecl varargs;
var
MyDump: PDump;
function AsDouble(e: Extended): Double;
begin
Result := e;
end;
function AsSingle(e: Extended): Single;
begin
Result := e;
end;
procedure Go;
begin
MyDump := #Dump;
MyDump('iii', 10, 20, 30);
MyDump('sss', 'foo', 'bar', 'baz');
// Looks like Delphi passes Extended in byte-aligned
// stack offset, very strange; thus this doesn't work.
MyDump('e', 2.0);
// These two are more reliable.
MyDump('d', AsDouble(2));
// Singles passed as 8-byte floats.
MyDump('d', AsSingle(2));
end;
begin
Go;
end.
I found this (from a guy we know :))
To write this stuff properly you'll need to use BASM, Delphi's built in
assembler, and code the call sequence in asm. Hopefully you've got a good
idea of what you need to do. Perhaps a post in the .basm group will help if
you get stuck.
Delphi doesn't let you implement a varargs routine. It only works for importing external cdecl functions that use this.
Since varargs is based on the cdecl calling convention, you basically need to reimplement it yourself in Delphi, using assembly and/or various kinds of pointer manipulation.