I am running into some weird behaviour with Delphi's inline assembly, as demonstrated in this very short and simple program:
program test;
{$APPTYPE CONSOLE}
uses
  SysUtils;
type
  TAsdf = class
  public
    int: Integer;
  end;
  TBlah = class
  public
    asdf: TAsdf;
    constructor Create(a: TAsdf);
    procedure Test;
  end;
constructor TBlah.Create(a: TAsdf);
begin
  asdf := a;
end;
procedure TBlah.Test;
begin
  asm
    mov eax, [asdf]
  end;
end;
var
  asdf: TAsdf;
  blah: TBlah;
begin
  asdf := TAsdf.Create;
  blah := TBlah.Create(asdf);
  blah.Test;
  readln;
end.
It's just for the sake of example (moving [asdf] into eax doesn't do much, but it works for the example). If you look at the assembly for this program, you'll see that
mov eax, [asdf]
has been turned into
mov eax, ds:[4]
(as represented by OllyDbg) which obviously crashes. However, if you do this:
var
  temp: TAsdf;
begin
  temp := asdf;
  asm
    int 3;
    mov eax, [temp];
  end;
It changes to
mov eax, [ebp-4]
which works. Why is this? I usually work with C++ and I'm used to using instance variables like that, so it may be that I'm using instance variables wrong.
EDIT: Yep, that was it. Changing mov eax, [asdf] to mov eax, [Self.asdf] fixes the problem. Sorry about that.
In the first case, mov eax,[asdf], the assembler looks up asdf and discovers that it is a field at offset 4 in the instance. Because you used an indirect addressing mode without a base register, it encodes only the offset (to the assembler it looks like 0 + asdf). Had you written mov eax, [eax].asdf, it would have been encoded as mov eax, [eax+4] (here eax contains Self as passed in from the caller).
In the second case, the assembler will look up Temp and see that it is a local variable indexed by EBP. Because it knows the base address register to use, it can decide to encode it as [EBP-4].
A method receives the Self pointer in the EAX register. You have to use that value as the base value for accessing the object. So your code would be something like:
mov ebx, TBlah[eax].asdf
See http://www.delphi3000.com/articles/article_3770.asp for an example.
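As a minimal sketch of the two answers above, assuming 32-bit Delphi (where Self arrives in EAX under the default register calling convention), a pure asm method can read the field through an explicit base register. The GetAsdf accessor is a hypothetical name added for illustration, and the CPU386 guard with its Pascal fallback is an assumption so that the snippet still compiles on other targets.

```pascal
type
  TAsdf = class(TObject)
  end;

  TBlah = class
  public
    asdf: TAsdf;
    function GetAsdf: TAsdf;  // hypothetical accessor, not in the question
  end;

function TBlah.GetAsdf: TAsdf;
{$IFDEF CPU386}
asm
  // EAX = Self on entry; a pointer-sized result is returned in EAX.
  // Naming the base register gives the assembler "base + field offset"
  // instead of the bare field offset.
  mov eax, [eax].TBlah.asdf
end;
{$ELSE}
begin
  // Fallback where the 32-bit register layout does not apply.
  Result := asdf;
end;
{$ENDIF}
```

Note that a routine which writes to EBX, ESI, EDI or EBP is expected to save and restore them, which is why this sketch only touches EAX.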
I want to access the local variables of a Delphi procedure from its nested assembly procedure. Although the compiler allows references to the local variables, it compiles wrong offsets, which only work if the EBP/RBP values are hacked. In the x86 environment I found a fairly elegant hack, but in x64 I haven't yet found a decent solution.
In the x86 environment the workaround below seems to work fine:
procedure Main;
var ABC: integer;
procedure Sub;
asm
mov ebp, [esp]
mov eax, ABC
end;
...
In the above code, the compiler treats the variable ABC as if it were in the body of Main, so hacking the value of EBP in the first assembly line solves the problem. However, the same trick won't work in the x64 environment:
procedure Main;
var ABC: int64;
procedure Sub;
asm
mov rbp, [rsp]
mov rax, ABC
end;
...
In the above code, the compiler adds an offset when it references the variable ABC which is correct neither with the original (Main) value of RBP nor with its new (Sub) value. Moreover, changing RBP in 64-bit code isn't recommended, so I found the workaround below:
procedure Main;
var ABC: int64;
procedure Sub;
asm
add rcx, $30
mov rax, [rcx + OFFSET ABC]
end;
...
As the compiler passes the initial value of RBP in RCX, and the reference to the variable ABC can be hacked to be RCX-based rather than RBP-based, the above code does work. However, the correction of $30 depends on the number of variables in Main, so this workaround is a last resort, and I'd very much like to find something more elegant.
Does anyone have a suggestion on how to do this in a more elegant way?
Note that:
Of course, in my real code there are a large number of local variables to be accessed from the ASM code, so solutions like passing the variables as parameters are ruled out.
I'm adding x64 compatibility to x86 code, and there are dozens of routines like this, so I need a workaround that transforms the code with small formal changes only (accessing the local variables in a fundamentally different way would become an inexhaustible source of bugs).
UPDATE:
Found a safe but relatively complicated solution: I added a local variable called Sync to find out the offset between the RBP values of Main and Sub, then I do the correction on the RBP:
procedure Main;
var Sync: int64; ABC: int64;
procedure Sub(var SubSync: int64);
asm
push rbp
lea rax, Sync
sub rdx, rax
add rbp, rdx
mov rax, ABC
pop rbp
end;
begin
ABC := 66;
Sub(Sync);
end;
So far nobody has come up with a better solution, so I consider the Sync-based code from the update above to be the best known solution.
BTW, as this looks very much like a Delphi bug, I have reported it to Embarcadero.
I have this function (RDRand, written by David Heffernan) that seems to work fine in 32-bit but fails in 64-bit:
function TryRdRand(out Value: Cardinal): Boolean;
{$IF defined(CPU64BITS)}
asm .noframe
{$else}
asm
{$ifend}
  db $0f
  db $c7
  db $f1
  jc @success
  xor eax,eax
  ret
@success:
  mov [eax],ecx
  mov eax,1
end;
The documentation for the instruction is here: https://software.intel.com/en-us/articles/intel-digital-random-number-generator-drng-software-implementation-guide
In particular, it says:
Essentially, developers invoke this instruction with a single operand:
the destination register where the random value will be stored. Note
that this register must be a general purpose register, and the size of
the register (16, 32, or 64 bits) will determine the size of the
random value returned.
After invoking the RDRAND instruction, the caller must examine the
carry flag (CF) to determine whether a random value was available at
the time the RDRAND instruction was executed. As Table 3 shows, a
value of 1 indicates that a random value was available and placed in
the destination register provided in the invocation. A value of 0
indicates that a random value was not available. In current
architectures the destination register will also be zeroed as a side
effect of this condition.
My knowledge of ASM is quite limited. What did I miss?
Also, I do not quite understand these instructions:
...
xor eax,eax
ret
...
What exactly do they do?
If you want a function that performs exactly the same then I think that looks like this:
function TryRdRand(out Value: Cardinal): Boolean;
asm
{$if defined(WIN64)}
  .noframe
  // rdrand eax
  db $0f
  db $c7
  db $f0
  jnc @fail
  mov [rcx],eax
{$elseif defined(WIN32)}
  // rdrand ecx
  db $0f
  db $c7
  db $f1
  jnc @fail
  mov [eax],ecx
{$else}
{$Message Fatal 'TryRdRand not implemented for this platform'}
{$endif}
  mov eax,1
  ret
@fail:
  xor eax,eax
end;
The suggestion made by Peter Cordes of implementing a retry loop in the asm looks sensible to me. I will not attempt to implement that here, since I think it is somewhat outside the scope of your question.
Also, Peter points out that in x64 you can read a 64 bit random value with the REX.W=1 prefix. That would look like this:
function TryRdRand(out Value: NativeUInt): Boolean;
asm
{$if defined(WIN64)}
  .noframe
  // rdrand rax
  db $48 // REX.W = 1
  db $0f
  db $c7
  db $f0
  jnc @fail
  mov [rcx],rax
{$elseif defined(WIN32)}
  // rdrand ecx
  db $0f
  db $c7
  db $f1
  jnc @fail
  mov [eax],ecx
{$else}
{$Message Fatal 'TryRdRand not implemented for this platform'}
{$endif}
  mov eax,1
  ret
@fail:
  xor eax,eax
end;
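The retry loop mentioned above is left unimplemented in the answer. As a rough sketch, it could also be written in plain Pascal around the asm function rather than inside it (a swapped-in approach, not the asm loop Peter Cordes suggested). TTryRandom, RandomWithRetry, FakeTryRdRand and FakeCalls are hypothetical names; the fake source exists only so the loop can be exercised on a machine without RDRAND.

```pascal
type
  // Same shape as the 32-bit TryRdRand above, so the real function can be passed in.
  TTryRandom = function(out Value: Cardinal): Boolean;

var
  FakeCalls: Integer = 0;  // hypothetical counter used only by the stand-in

// Hypothetical stand-in for TryRdRand: fails twice, then succeeds.
function FakeTryRdRand(out Value: Cardinal): Boolean;
begin
  Inc(FakeCalls);
  Result := FakeCalls >= 3;
  if Result then
    Value := 12345
  else
    Value := 0;
end;

// Calls Source up to MaxTries times and stops at the first success.
function RandomWithRetry(Source: TTryRandom; MaxTries: Integer;
  out Value: Cardinal): Boolean;
var
  I: Integer;
begin
  Value := 0;
  for I := 1 to MaxTries do
    if Source(Value) then
    begin
      Result := True;
      Exit;
    end;
  Result := False;
end;
```

With the real function, a call might look like RandomWithRetry(TryRdRand, 10, Value); the limit of 10 is only an illustrative choice.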
I looked at the ASM code of a release build with all optimizations turned on, and here is one of the inlined functions I came across:
0061F854 mov eax,[$00630bec]
0061F859 mov eax,[$00630e3c]
0061F85E mov edx,$00000001
0061F863 mov eax,[eax+edx*4]
0061F866 cmp byte ptr [eax],$01
0061F869 jnz $0061fa83
The code is pretty easy to understand: it builds an offset (1) into a table, compares the byte value found there to 1, and jumps if NZ. I know the pointer to my table is stored in $00630e3c, but I have no idea where $00630bec is coming from.
Why are there two moves to eax one after the other? Isn't the first one overwritten by the second one? Can this be a cache optimization thing, or am I missing something unbelievably obvious/obscure?
The Delphi code for the above ASM is as follows:
if( TGameSignals.IsSet( EmitParticleSignal ) = True ) then [...]
IsSet() is an inlined class function and calls the inlined IsSet() function of TSignalManager:
class function TGameSignals.IsSet(Signal: PBucketSignal): Boolean;
begin
Result := FSignalManagerInstance.IsSet( Signal );
end;
The final IsSet of the signal manager looks like this:
function TSignalManagerInstance.IsSet( Signal: PBucketSignal ): Boolean;
begin
Result := Signal.Pending;
end;
My best guess would be that $00630bec is a reference to the class TGameSignals. You can check it by doing
ShowMessage(IntToHex(NativeInt(TGameSignals), 8))
The pre-optimisation code was probably something like this
0061F854 mov eax,[$00630bec] //Move reference to class TGameSignals in EAX
0061F859 mov eax,[eax + $250] //Move Reference to FSignalManagerInstance at offset $250 in class TGameSignals in EAX
The compiler optimised [eax + $250] to [$00630e3c], but didn't realize that the previous MOV was no longer required.
I'm not an expert in codegen, so take it with a grain of salt...
On a side note, in Delphi we usually write
if TGameSignals.IsSet( EmitParticleSignal ) then
As it's possible for the following IF to be true
var vBool : Boolean;
[...]
vBool := Boolean(10);
if vBool and (vBool <> True) then
Granted, this is not good practice, but no point in comparing to TRUE either.
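To make the two styles concrete, here is a small sketch with two hypothetical helper functions (IsSetDirect and IsSetCompared are illustrative names, not part of the question). The first tests the Boolean directly; the second compares it against True, which is the form the side note advises against because, as described above, a Boolean holding a value such as Boolean(10) can pass the first test while still comparing unequal to True.

```pascal
// Preferred: test the Boolean directly.
function IsSetDirect(Flag: Boolean): Boolean;
begin
  if Flag then
    Result := True
  else
    Result := False;
end;

// Discouraged: compares the stored value against the constant True.
function IsSetCompared(Flag: Boolean): Boolean;
begin
  if Flag = True then
    Result := True
  else
    Result := False;
end;
```

For the well-formed values True and False both helpers agree; they can only differ for out-of-range values produced by casts or uninitialised memory.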
EDIT: As pointed out by Ped7g, I was wrong. The instruction is
0061F854 mov eax,[$00630bec]
and not
0061F854 mov eax,$00630bec
So what I wrote didn't really make sense...
The first MOV instruction serves to pass the "Self" reference for the call to TGameSignals.IsSet. If the function weren't inlined, it would look like this:
mov eax,[$00630bec]
call TGameSignals.IsSet
and then
*TGameSignals.IsSet
mov eax,[$00630e3c]
[...]
The first mov is still pointless, since "Self" isn't used in TGameSignals.IsSet, but it is still required to pass "Self" to the function. When the routine gets inlined, it looks a lot sillier, indeed.
As mentioned by Arnaud Bouchez, making TGameSignals.IsSet static removes the implicit Self parameter and thus removes the first MOV operation.
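A minimal sketch of that static declaration follows. The TBucketSignal record layout is an assumption (only its Pending field appears in the question), and the signal manager indirection is left out to keep the example short.

```pascal
type
  PBucketSignal = ^TBucketSignal;
  TBucketSignal = record  // hypothetical layout; only Pending is known
    Pending: Boolean;
  end;

  TGameSignals = class
  public
    // "static" drops the hidden Self parameter, so no class reference
    // needs to be loaded before the (inlined) body runs.
    class function IsSet(Signal: PBucketSignal): Boolean; static; inline;
  end;

class function TGameSignals.IsSet(Signal: PBucketSignal): Boolean;
begin
  Result := Signal^.Pending;
end;
```

The call site stays the same, for example: if TGameSignals.IsSet(EmitParticleSignal) then ...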
I am trying to learn inline assembly programming in Delphi, and to this end I have found this article highly helpful.
Now I wish to write an assembly function returning a long string, specifically an AnsiString (for simplicity). I have written
function myfunc: AnsiString;
asm
  // eax = @result
  mov edx, 3
  mov ecx, 1252
  call System.@LStrSetLength
  mov [eax + 0], ord('A')
  mov [eax + 1], ord('B')
  mov [eax + 2], ord('C')
end;
Explanation:
A function returning a string has an invisible var result: AnsiString (in this case) parameter, so, at the beginning of the function, eax should hold the address of the resulting string. I then set edx and ecx to 3 and 1252, respectively, and then call System._LStrSetLength. In effect, I do
_LStrSetLength(@result, 3, 1252)
where 3 is the new length of the string (in characters = bytes) and 1252 is the standard windows-1252 codepage.
Then, knowing that eax is the address of the first character of the string, I simply set the string to "ABC". But it does not work - it gives me nonsense data or EAccessViolation. What is the problem?
Update
Now we have two seemingly working implementations of myfunc, one employing NewAnsiString and one employing LStrSetLength. I cannot help but wonder if both of them are correct, in the sense that they do not mess up Delphi's internal handling of strings (reference counting, automatic freeing, etc.).
You have to use something like this:
function myfunc: AnsiString;
asm
  push eax // save @result
  call system.@LStrClr
  mov eax,3 {Length}
{$ifdef UNICODE}
  mov edx,1252 // code page for Delphi 2009/2010
{$endif}
  call system.@NewAnsiString
  pop edx
  mov [edx],eax
  mov [eax],$303132
end;
It will return a '210' string...
It's also always a good idea to add a {$ifdef UNICODE} block so that your code stays compatible with versions of Delphi prior to 2009.
With the excellent aid of A.Bouchez, I managed to correct my own code, employing LStrSetLength:
function myfunc: AnsiString;
asm
  push eax
  // eax = @result
  mov edx, 3
  mov ecx, 1252
  call System.@LStrSetLength
  pop eax
  mov ecx, [eax]
  mov [ecx], 'A'
  mov [ecx] + 1, 'B'
  mov [ecx] + 2, 'C'
end;
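For comparison, this is a plain Pascal sketch of what the asm versions above are trying to achieve. MyFuncPascal is a hypothetical name. Here the compiler emits the RTL calls itself, so reference counting and allocation are handled for you, which makes it a handy baseline to check an asm implementation against.

```pascal
function MyFuncPascal: AnsiString;
begin
  // Allocate a 3-character string, then fill it in place.
  SetLength(Result, 3);
  Result[1] := 'A';
  Result[2] := 'B';
  Result[3] := 'C';
end;
```

Comparing myfunc against MyFuncPascal, and assigning the result to several variables in a loop, is one simple way to smoke-test the asm version for leaks or access violations.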
I am trying to convert the Delphi TBits.GetBit to inline assembler for the 64 bit version. The VCL source looks like this:
function TBits.GetBit(Index: Integer): Boolean;
{$IFNDEF X86ASM}
var
LRelInt: PInteger;
LMask: Integer;
begin
if (Index >= FSize) or (Index < 0) then
Error;
{ Calculate the address of the related integer }
LRelInt := FBits;
Inc(LRelInt, Index div BitsPerInt);
{ Generate the mask }
LMask := (1 shl (Index mod BitsPerInt));
Result := (LRelInt^ and LMask) <> 0;
end;
{$ELSE X86ASM}
asm
CMP Index,[EAX].FSize
JAE TBits.Error
MOV EAX,[EAX].FBits
BT [EAX],Index
SBB EAX,EAX
AND EAX,1
end;
{$ENDIF X86ASM}
I started converting the 32 bit ASM code to 64 bit. After some searching, I found out that I need to change the EAX references to RAX for the 64 bit compiler. I ended up with this for the first line:
CMP Index,[RAX].FSize
This compiles but gives an access violation when it runs. I tried a few combinations (e.g. MOV ECX,[RAX].FSize) and get the same access violation when trying to access [RAX].FSize. When I look at the assembler that is generated by the Delphi compiler, it looks like my [RAX].FSize should be correct.
Unit72.pas.143: MOV ECX,[RAX].FSize
00000000006963C0 8B8868060000 mov ecx,[rax+$00000668]
And the Delphi generated code:
Unit72.pas.131: if (Index >= FSize) or (Index < 0) then
00000000006963CF 488B4550 mov rax,[rbp+$50]
00000000006963D3 8B4D58 mov ecx,[rbp+$58]
00000000006963D6 3B8868060000 cmp ecx,[rax+$00000668]
00000000006963DC 7D06 jnl TForm72.GetBit + $24
00000000006963DE 837D5800 cmp dword ptr [rbp+$58],$00
00000000006963E2 7D09 jnl TForm72.GetBit + $2D
In both cases, the resulting assembler uses [rax+$00000668] for FSize. What is the correct way to access a class field in Delphi 64bit Assembler?
This may sound like a strange thing to optimize but the assembler for the 64bit pascal version doesn't appear to be very efficient. We call this routine a large number of times and it takes anything up to 5 times as long to execute depending on various factors.
The basic problem is that you are using the wrong register. Self is passed as an implicit parameter, before all others. In the x64 calling convention, that means it is passed in RCX and not RAX.
So Self is passed in RCX and Index is passed in RDX. Frankly, I think it's a mistake to use parameter names in inline assembler because they hide the fact that the parameter was passed in a register. If you happen to overwrite RDX, then that changes the apparent value of Index.
So the if statement might be coded as
CMP EDX,[RCX].FSize
JNL TBits.Error
CMP EDX,0
JL TBits.Error
FWIW, this is a really simple function to implement and I don't believe that you will need to use any stack space. You have enough registers in x64 to be able to do this entirely using volatile registers.
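Putting the register assignments above together, a complete routine might look roughly like the sketch below. It is written against a hypothetical TMyBits class with public FSize and FBits fields, since the real TBits fields are private to the VCL. As a simplification it returns False for an out-of-range index instead of calling Error, and it uses the unsigned JAE test from the original 32-bit code, which rejects negative indexes as well. The WIN64 guard and the Pascal fallback (modelled on the VCL source quoted in the question) are assumptions for targets with a different calling convention.

```pascal
type
  TMyBits = class  // hypothetical stand-in for TBits
  public
    FSize: Integer;
    FBits: Pointer;
    function GetBit(Index: Integer): Boolean;
  end;

function TMyBits.GetBit(Index: Integer): Boolean;
{$IFDEF WIN64}
asm
  .noframe
  // RCX = Self, EDX = Index (Windows x64 calling convention)
  cmp edx, [rcx].TMyBits.FSize
  jae @outOfRange        // unsigned compare also catches Index < 0
  mov rax, [rcx].TMyBits.FBits
  bt  [rax], edx         // CF := selected bit
  sbb eax, eax           // EAX := -CF
  and eax, 1             // EAX := 0 or 1
  ret
@outOfRange:
  xor eax, eax           // simplified: report False instead of raising
end;
{$ELSE}
var
  LRelInt: PInteger;
begin
  Result := False;
  if (Index < 0) or (Index >= FSize) then
    Exit;
  LRelInt := FBits;
  Inc(LRelInt, Index div 32);
  Result := (LRelInt^ and (1 shl (Index mod 32))) <> 0;
end;
{$ENDIF}
```

Only volatile registers (RAX, RCX, RDX) are used, so nothing has to be saved on the stack.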