Address of Delphi label - delphi

I'm working on a simple PIC18 MCPU mnemonic simulation in Delphi pascal. And yes, I intend to use Delphi IDE.
I'm able to simulate any asm instruction, but it stops at labels.
In some cases I need to know the address of Delphi label.
Is there any possibility to cast label in to pointer variable?
As in my example?
procedure addlw(const n:byte); //emulation of mcpu addlw instruction
begin
Carry := (wreg + n) >= 256;
wreg := wreg + n;
Zero := wreg = 0;
inc(CpuCycles);
end;
procedure bnc(p: pointer ); //emulation of mcpu bnc instruction
asm
inc CpuCycles
cmp byte ptr Carry, 0
jnz #exit
pop eax //restore return addres from stack
jmp p
#exit:
end;
//EMULATION OF MCPU ASM CODE
procedure Test;
label
Top;
var
p: pointer;
begin
//
Top:
addlw(5); //emulated mcpu addlw instruction
bnc(Top); //emulated mcpu bnc branch if not carry instruction
//
end;

No, you can't interact with labels that way. Since you're emulating everything else, you may as well emulate assembler labels, too, instead of trying to force Delphi labels to do something they're not designed for.
Suppose you could use code like this instead of the "assembler" code you wrote (without worrying for now exactly how to implement it):
procedure Test;
var
Top: TAsmLabel;
begin
//
DefineLabel(Top);
addlw(5); //emulated mcpu addlw instruction
bnc(Top); //emulated mcpu bnc branch if not carry instruction
//
end;
The syntax looks similar enough, I think. Upon running that code, you'll want Top to refer to the next instruction, which is the one that calls addlw.
Inside the hypothetical function DefineLabel, that address corresponds to the return address, so write DefineLabel to store its return address in the given parameter:
type
TAsmLabel = Pointer;
procedure DefineLabel(out Result: TAsmLabel);
asm
mov ecx, [esp] // copy return address
mov [eax], ecx // store result
end;
Beware that this code corrupts the stack. Your bcn function leaves its return address on the stack, so when the carry flag eventually gets set, you've left a trail of previous return addresses on the stack. If you don't get a stack overflow first, you'll hit strange results when you get to the end of the containing function. It will try to return, but instead of going to the caller, it will find bnc's return address instead, and jump back into the middle of your code. And that's all assuming there aren't any other stack-relative references in the code. If there are, then even calling bnc(Top) might give problems because the relative position of Top will have changed, and you'll end up reading the wrong value off the stack.

Related

Why is there two sequential move to EAX under optimization build?

I looked at the ASM code of a release build with all optimizations turned on, and here is one of the inlined function I came across:
0061F854 mov eax,[$00630bec]
0061F859 mov eax,[$00630e3c]
0061F85E mov edx,$00000001
0061F863 mov eax,[eax+edx*4]
0061F866 cmp byte ptr [eax],$01
0061F869 jnz $0061fa83
The code is pretty easy to understand, it builds an offset (1) into a table, compares the byte value from it to 1 and do a jump if NZ. I know the pointer to my table is stored in $00630e3c, but I have no idea where $00630bec is coming from.
Why is there two move to eax one after the other? Isn't the first one overwritten by the second one? Can this be a cache optimization thing or am I missing something unbelievably obvious/obscure?
The Delphi code for the above ASM is as follow:
if( TGameSignals.IsSet( EmitParticleSignal ) = True ) then [...]
IsSet() is an inlined class function and calls the inlined IsSet() function of TSignalManager:
class function TGameSignals.IsSet(Signal: PBucketSignal): Boolean;
begin
Result := FSignalManagerInstance.IsSet( Signal );
end;
The final IsSet of the signal manager is as such:
function TSignalManagerInstance.IsSet( Signal: PBucketSignal ): Boolean;
begin
Result := Signal.Pending;
end;
My best guess would be that $00630bec is a reference to the class TGameSignals. You can check it by doing
ShowMessage(IntToHex(NativeInt(TGameSignals), 8))
The pre-optimisation code was probably something like this
0061F854 mov eax,[$00630bec] //Move reference to class TGameSignals in EAX
0061F859 mov eax,[eax + $250] //Move Reference to FSignalManagerInstance at offset $250 in class TGameSignals in EAX
the compiler optimised [eax + $250] to [$00630e3c], but didn't realize the previous MOV wasn't required anymore.
I'm not an expert in codegen, so take it with a grain of salt...
On a side note, in delphi, we usually write
if TGameSignals.IsSet( EmitParticleSignal ) then
As it's possible for the following IF to be true
var vBool : Boolean
[...]
vBool := Boolean(10);
if vBool and (vBool <> True) then
Granted, this is not good practice, but no point in comparing to TRUE either.
EDIT: As pointed out by Ped7g, I was wrong. The instruction is
0061F854 mov eax,[$00630bec]
and not
0061F854 mov eax,$00630bec
So what I wrote didn't really make sense...
The first MOV instruction serve to pass the "self" reference for the call to TGameSignals.IsSet. Now, if the function wasn't inline, it would look like this :
mov eax,[$00630bec]
call TGameSignals.IsSet
and then
*TGameSignals.IsSet
mov eax,[$00630e3c]
[...]
The first mov is still pointless, since "Self" isn't used in TGameSignals.IsSet but it is still required to pass "self" to the function. When the routine get inlined, it looks a lot more silly, indeed.
Like mentioned by Arnaud Bouchez, making TGameSignals.IsSet static remove the implicit Self parameter and thus, remove the first MOV operation.

Asm equivalent to a delphi procedure

I have a simple delphi function called SetCompare that compares two singles and if they are not equal then one value is set to the other.
procedure SetCompare( A : single; B : single );
begin
if( A <> B ) then
A := B;
end;
I am trying to convert this into asm as such:
procedure SetCompare( A : Single; B : Single ); register;
begin
asm
mov EAX,A
mov ECX,B
cmp EAX,ECX
jne SetValue
#SetValue:
mov EAX,ECX
end;
end;
Will this work?
Will this work?
No this will not work, because floating point comparison is not the same as binary comparison. For instance 0 and -0 have different bit patterns, but compare as equal. Similarly, NaN compares unequal to all values, including a NaN with the same bit pattern.
The simplest way to work out how to write your code is to get the compiler to compile the Pascal code, and inspect the generated assembly code.
Some asides:
Your function is pointless anyway, because it returns no value and has no side effects.
If performance matters enough to write assembler, then you should write pure assembler functions, rather than inline asm blocks in a Pascal function. Which in any case is not supported by the x64 compiler.
Your arguments are already in registers, so it makes little sense to copy them around to other registers. For x86 code, A arrives in EAX, and B arrives in EDX. Given that EAX already contains A, why would you copy it into EAX? It is already there. And B is already in EDX, why copy it to ECX? For x64 code, the two arguments are passed in floating point registers, and can be compared there directly. As soon as your start writing assembler you need to understand the register use of the calling convention.
Your jne is pointless. If execution does not take the jump, then it moves to the next line of code. Which is where you jumped to.

Why does dcc64 say this value is never used?

The code below is from DDetours.pas. When it compiles for 32 bit, there are no warnings emitted. When it compiles for 64-bit, it emits this warning: (Delphi Berlin Update 2)
[dcc64 Hint] DDetours.pas(1019): H2077 Value assigned to 'Prf' never used
Here is the function in question
function GetPrefixesCount(Prefixes: WORD): Byte;
var
Prf: WORD;
i: Byte;
begin
{ Get prefixes count used by the instruction. }
Result := 0;
if Prefixes = 0 then
Exit;
Prf := 0;
i := 0;
Prefixes := Prefixes and not Prf_VEX;
while Prf < $8000 do
begin
Prf := (1 shl i);
if (Prf and Prefixes = Prf) then
Inc(Result);
Inc(i);
end;
end;
It sure looks to me like the very first time Prf is compared against $8000 that initial value is used.
It's a compiler bug. There are a few of this nature. Quite frustrating. Sometimes the 32 bit compiler will complain in an irrational manner and then when you workaround that the 64 bit compiler in turn complains in an irrational manner about your workaround.
I don't think that Embarcadero habitually compiler with hints and warnings enabled, because their library code is full of hints and warnings.
Anyway in this case the compiler sees the two writes to the variable but for some reason does not recognise the intervening read of the variable.
There's not a whole lot that you can do. You could submit a bug report. I expect that you don't want to change the code because it is third party code. If you don't change it then you'll have to put up with the bogus hint.
Notifying the author of the library might allow them to workaround the issue. Perhaps by suppressing hints for that function.
I suspect the 64bit compiler has a small amount of smarts built-in to allow it to recognize that
the variable is initialized and not touched again until the loop is entered.
the loop condition is comparing the variable to a literal.
the initial value of 0 satisfies the loop condition, thus ensuring the loop will always run at least once at runtime, effectively making it act like a repeat..until loop.
Since the compiler still has to generate code for the initialization, but knows the initial value is not needed at runtime to enter the loop, it can issue a warning that the initial value will be unused.
If you don't initialize the variable, the compiler doesn't know at compile-time whether the loop will be entered or not since the behavior is undefined, so the compiler issues a different warning about the variable being uninitialized.
Viewing the disassembly, you can see that the loop is turned into a repeat..until loop and the Prf := 0; assignment is removed by optimization:
Project87.dpr.17: Result := 0;
00000000004261AA 4833D2 xor rdx,rdx
Project87.dpr.18: if Prefixes = 0 then
00000000004261AD 6685C0 test ax,ax
00000000004261B0 7460 jz GetPrefixesCount + $72
Project87.dpr.22: i := 0;
00000000004261B2 4D33C0 xor r8,r8
Project87.dpr.23: Prefixes := Prefixes and not Prf_VEX;
00000000004261B5 0FB7C0 movzx eax,ax
00000000004261B8 81E02DFBFFFF and eax,$fffffb2d
00000000004261BE 81F8FFFF0000 cmp eax,$0000ffff
00000000004261C4 7605 jbe GetPrefixesCount + $2B
00000000004261C6 E8050FFEFF call #BoundErr
Project87.dpr.26: Prf := (1 shl i);
00000000004261CB 41C7C101000000 mov r9d,$00000001
00000000004261D2 418BC8 mov ecx,r8d
00000000004261D5 41D3E1 shl r9d,r9b
00000000004261D8 4489C9 mov ecx,r9d
00000000004261DB 81F9FFFF0000 cmp ecx,$0000ffff
00000000004261E1 7605 jbe GetPrefixesCount + $48
00000000004261E3 E8E80EFEFF call #BoundErr
Project87.dpr.27: if (Prf and Prefixes = Prf) then
00000000004261E8 448BC9 mov r9d,ecx
00000000004261EB 664423C8 and r9w,ax
00000000004261EF 66443BC9 cmp r9w,cx
00000000004261F3 750A jnz GetPrefixesCount + $5F
Project87.dpr.28: Inc(Result);
00000000004261F5 80C201 add dl,$01
00000000004261F8 7305 jnb GetPrefixesCount + $5F
00000000004261FA E8F10EFEFF call #IntOver
Project87.dpr.29: Inc(i);
00000000004261FF 4180C001 add r8b,$01
0000000000426203 7305 jnb GetPrefixesCount + $6A
0000000000426205 E8E60EFEFF call #IntOver
Project87.dpr.24: while Prf < $8000 do
000000000042620A 6681F90080 cmp cx,$8000
000000000042620F 72BA jb GetPrefixesCount + $2B
well, I guess the answer is that dcc64 is a buggy compiler when it comes to messages. Because if you comment out the offending line, "value never used" becomes "might not have been initialized." Same compiler.
[dcc64 Warning] DDetours.pas(1022): W1036 Variable 'Prf' might not have been initialized

Accessing Delphi Class Fields in 64 bit inline assembler

I am trying to convert the Delphi TBits.GetBit to inline assembler for the 64 bit version. The VCL source looks like this:
function TBits.GetBit(Index: Integer): Boolean;
{$IFNDEF X86ASM}
var
LRelInt: PInteger;
LMask: Integer;
begin
if (Index >= FSize) or (Index < 0) then
Error;
{ Calculate the address of the related integer }
LRelInt := FBits;
Inc(LRelInt, Index div BitsPerInt);
{ Generate the mask }
LMask := (1 shl (Index mod BitsPerInt));
Result := (LRelInt^ and LMask) <> 0;
end;
{$ELSE X86ASM}
asm
CMP Index,[EAX].FSize
JAE TBits.Error
MOV EAX,[EAX].FBits
BT [EAX],Index
SBB EAX,EAX
AND EAX,1
end;
{$ENDIF X86ASM}
I started converting the 32 bit ASM code to 64 bit. After some searching, I found out that I need to change the EAX references to RAX for the 64 bit compiler. I ended up with this for the first line:
CMP Index,[RAX].FSize
This compiles but gives an access violation when it runs. I tried a few combinations (e.g. MOV ECX,[RAX].FSize) and get the same access violation when trying to access [RAX].FSize. When I look at the assembler that is generated by the Delphi compiler, it looks like my [RAX].FSize should be correct.
Unit72.pas.143: MOV ECX,[RAX].FSize
00000000006963C0 8B8868060000 mov ecx,[rax+$00000668]
And the Delphi generated code:
Unit72.pas.131: if (Index >= FSize) or (Index < 0) then
00000000006963CF 488B4550 mov rax,[rbp+$50]
00000000006963D3 8B4D58 mov ecx,[rbp+$58]
00000000006963D6 3B8868060000 cmp ecx,[rax+$00000668]
00000000006963DC 7D06 jnl TForm72.GetBit + $24
00000000006963DE 837D5800 cmp dword ptr [rbp+$58],$00
00000000006963E2 7D09 jnl TForm72.GetBit + $2D
In both cases, the resulting assembler uses [rax+$00000668] for FSize. What is the correct way to access a class field in Delphi 64bit Assembler?
This may sound like a strange thing to optimize but the assembler for the 64bit pascal version doesn't appear to be very efficient. We call this routine a large number of times and it takes anything up to 5 times as long to execute depending on various factors.
The basic problem is that you are using the wrong register. Self is passed as an implicit parameter, before all others. In the x64 calling convention, that means it is passed in RCX and not RAX.
So Self is passed in RCX and Index is passed in RDX. Frankly, I think it's a mistake to use parameter names in inline assembler because they hide the fact that the parameter was passed in a register. If you happen to overwrite either RDX, then that changes the apparent value of Index.
So the if statement might be coded as
CMP EDX,[RCX].FSize
JNL TBits.Error
CMP EDX,0
JL TBits.Error
FWIW, this is a really simple function to implement and I don't believe that you will need to use any stack space. You have enough registers in x64 to be able to do this entirely using volatile registers.

Is it necessary to assign a default value to a variant returned from a Delphi function?

Gradually I've been using more variants - they can be very useful in certain places for carrying data types that are not known at compile time. One useful value is UnAssigned ('I've not got a value for you'). I think I discovered a long time ago that the function:
function DoSomething : variant;
begin
If SomeBoolean then
Result := 4.5
end;
appeared to be equivalent to:
function DoSomething : variant;
begin
If SomeBoolean then
Result := 4.5
else
Result := Unassigned; // <<<<
end;
I presumed this reasoning that a variant has to be created dynamically and if SomeBoolean was FALSE, the compiler had created it but it was 'Unassigned' (<> nil?). To further encourage this thinking, the compiler reports no warning if you omit assigning Result.
Just now I've spotted nasty bug where my first example (where 'Result' is not explicity defaulted to 'nil') actually returned an 'old' value from somewhere else.
Should I ALWAYS assign Result (as I do when using prefefined types) when returing a variant?
Yes, you always need to initialize the Result of a function, even if it's a managed type (like string and Variant). The compiler does generate some code to initialize the future return value of a Variant function for you (at least the Delphi 2010 compiler I used for testing purposes does) but the compiler doesn't guarantee your Result is initialized; This only makes testing more difficult, because you might run into a case where your Result was initialized, base your decisions on that, only to later discover your code is buggy because under certain circumstances the Result wasn't initialized.
From my investigation, I've noticed:
If your result is assigned to a global variable, your function is called with an initialized hidden temporary variable, creating the illusion that the Result is magically initialized.
If you make two assignments to the same global variable, you'll get two distinct hidden temporary variables, re-enforcing the illusion that Result's are initialized.
If you make two assignments to the same global variable but don't use the global variable between calls, the compiler only uses 1 hidden temporary, braking the previous rule!
If your variable is local to the calling procedure, no intermediary hidden local variable is used at all, so the Result isn't initialized.
Demonstration:
First, here's the proof that a function returning a Variant receives a var Result:
Variant hidden parameter. The following two compile to the exact same assembler, shown below:
procedure RetVarProc(var V:Variant);
begin
V := 1;
end;
function RetVarFunc: Variant;
begin
Result := 1;
end;
// Generated assembler:
push ebx // needs to be saved
mov ebx, eax // EAX contains the address of the return Variant, copies that to EBX
mov eax, ebx // ... not a very smart compiler
mov edx, $00000001
mov cl, $01
call #VarFromInt
pop ebx
ret
Next, it's interesting to see how the call for the two is set up by the complier. Here's what happens for a call to a procedure that has a var X:Variant parameter:
procedure Test;
var X: Variant;
begin
ProcThatTakesOneVarParameter(X);
end;
// compiles to:
lea eax, [ebp - $10]; // EAX gets the address of the local variable X
call ProcThatTakesOneVarParameter
If we make that "X" a global variable, and we call the function returning a Variant, we get this code:
var X: Variant;
procedure Test;
begin
X := FuncReturningVar;
end;
// compiles to:
lea eax, [ebp-$10] // EAX gets the address of a HIDDEN local variable.
call FuncReturningVar // Calls our function with the local variable as parameter
lea edx, [ebp-$10] // EDX gets the address of the same HIDDEN local variable.
mov eax, $00123445 // EAX is loaded with the address of the global variable X
call #VarCopy // This moves the result of FuncReturningVar into the global variable X
If you look at the prologue of this function you'll notice the local variable that's used as a temporary parameter for the call to FuncReturningVar is initialized to ZERO. If the function doesn't contain any Result := statements, X would be "Uninitialized". If we call the function again, a DIFFERENT temporary and hidden variable is used! Here's a bit of sample code to see that:
var X: Variant; // global variable
procedure Test;
begin
X := FuncReturningVar;
WriteLn(X); // Make sure we use "X"
X := FuncReturningVar;
WriteLn(X); // Again, make sure we use "X"
end;
// compiles to:
lea eax, [ebp-$10] // first local temporary
call FuncReturningVar
lea edx, [ebp-$10]
mov eax, $00123456
call #VarCopy
// [call to WriteLn using the actual address of X removed]
lea eax, [ebp-$20] // a DIFFERENT local temporary, again, initialized to Unassigned
call FuncReturningVar
// [ same as before, removed for brevity ]
When looking at that code, you'd think the "Result" of a function returning Variant is allways initialized to Unassigned by the calling party. Not true. If in the previous test we make the "X" variable a LOCAL variable (not global), the compiler no longer uses the two separate local temporary variables. So we've got two separate cases where the compiler generates different code. In other words, don't make any assumptions, always assign Result.
My guess about the different behavior: If the Variant variable can be accessed outside the current scope, as a global variable (or class field for that matter) would, the compiler generates code that uses the thread-safe #VarCopy function. If the variable is local to the function there are no multi-threading issues so the compiler can take the liberty to make direct assignments (no-longer calling #VarCopy).
Should I ALWAYS assign Result (as I do
when using prefefined types) when
returing a variant?
Yes.
Test this:
function DoSomething(SomeBoolean: Boolean) : variant;
begin
if SomeBoolean then
Result := 1
end;
Use the function like this:
var
xx: Variant;
begin
xx := DoSomething(True);
if xx <> Unassigned then
ShowMessage('Assigned');
xx := DoSomething(False);
if xx <> Unassigned then
ShowMessage('Assigned');
end;
xx will still be assigned after the second call to DoSomething.
Change the function to this:
function DoSomething(SomeBoolean: Boolean) : variant;
begin
Result := Unassigned;
if SomeBoolean then
Result := 1
end;
And xx is not assigned after the second call to DoSomething.

Resources