Asm equivalent to a Delphi procedure

I have a simple Delphi procedure called SetCompare that compares two Singles and, if they are not equal, sets one value to the other.
procedure SetCompare( A : single; B : single );
begin
if( A <> B ) then
A := B;
end;
I am trying to convert this into asm as such:
procedure SetCompare( A : Single; B : Single ); register;
begin
asm
mov EAX,A
mov ECX,B
cmp EAX,ECX
jne @SetValue
@SetValue:
mov EAX,ECX
end;
end;
Will this work?

No, this will not work, because floating point comparison is not the same as binary comparison. For instance, 0 and -0 have different bit patterns but compare as equal. Similarly, NaN compares unequal to all values, including a NaN with the same bit pattern.
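As a minimal sketch of the signed-zero point, the console program below builds +0 and -0 from their IEEE 754 bit patterns and compares them both ways. The program name and the TSingleBits record are made up for illustration, and only the zero case is covered.

```pascal
program ZeroBitsDemo;
{$APPTYPE CONSOLE}

type
  // Hypothetical helper type: view the same 4 bytes as a Single or as raw bits.
  TSingleBits = packed record
    case Boolean of
      False: (AsSingle: Single);
      True:  (AsBits: Cardinal);
  end;

var
  PosZero, NegZero: TSingleBits;
begin
  PosZero.AsBits := $00000000;  // +0.0
  NegZero.AsBits := $80000000;  // -0.0 (only the sign bit is set)

  // Integer comparison of the raw bits, which is what "cmp EAX,ECX" does.
  Writeln('bits equal:   ', PosZero.AsBits = NegZero.AsBits);      // FALSE
  // Floating-point comparison, which is what "A <> B" does on Singles.
  Writeln('values equal: ', PosZero.AsSingle = NegZero.AsSingle);  // TRUE
end.
```

The two lines disagree, which is exactly why an integer cmp on the raw bits is not a drop-in replacement for the Pascal comparison.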
The simplest way to work out how to write your code is to get the compiler to compile the Pascal code, and inspect the generated assembly code.
Some asides:
Your function is pointless anyway, because it returns no value and has no side effects.
If performance matters enough to write assembler, then you should write pure assembler functions, rather than inline asm blocks in a Pascal function, which in any case are not supported by the x64 compiler.
Your arguments are already in registers, so it makes little sense to copy them around to other registers. For x86 code, A arrives in EAX, and B arrives in EDX. Given that EAX already contains A, why would you copy it into EAX? It is already there. And B is already in EDX, why copy it to ECX? For x64 code, the two arguments are passed in floating point registers, and can be compared there directly. As soon as you start writing assembler you need to understand the register use of the calling convention.
Your jne is pointless. If execution does not take the jump, then it moves to the next line of code, which is also where the jump goes.
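For comparison, a plain Pascal version with an observable effect would have to pass A by reference. The following is only a sketch: declaring A as a var parameter is an assumption about what the procedure was meant to do, and the program name is hypothetical.

```pascal
program SetCompareDemo;
{$APPTYPE CONSOLE}

// Assumption: the caller is meant to see the update, so A is a var parameter.
procedure SetCompare(var A: Single; B: Single);
begin
  if A <> B then   // floating-point comparison, not a bitwise one
    A := B;
end;

var
  X: Single;
begin
  X := 1.5;
  SetCompare(X, 2.5);
  Writeln(X:0:1);  // X now holds B's value
end.
```

Compiling this and inspecting the CPU view is a practical way to follow the advice above about reading the compiler's own output.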

Related

Why are there two sequential moves to EAX in an optimized build?

I looked at the ASM code of a release build with all optimizations turned on, and here is one of the inlined functions I came across:
0061F854 mov eax,[$00630bec]
0061F859 mov eax,[$00630e3c]
0061F85E mov edx,$00000001
0061F863 mov eax,[eax+edx*4]
0061F866 cmp byte ptr [eax],$01
0061F869 jnz $0061fa83
The code is pretty easy to understand: it builds an offset (1) into a table, compares the byte value from it to 1, and jumps if they are not equal (JNZ). I know the pointer to my table is stored in $00630e3c, but I have no idea where $00630bec is coming from.
Why are there two moves to eax one after the other? Isn't the first one overwritten by the second one? Can this be a cache optimization thing or am I missing something unbelievably obvious/obscure?
The Delphi code for the above ASM is as follows:
if( TGameSignals.IsSet( EmitParticleSignal ) = True ) then [...]
IsSet() is an inlined class function and calls the inlined IsSet() function of TSignalManager:
class function TGameSignals.IsSet(Signal: PBucketSignal): Boolean;
begin
Result := FSignalManagerInstance.IsSet( Signal );
end;
The final IsSet of the signal manager is as such:
function TSignalManagerInstance.IsSet( Signal: PBucketSignal ): Boolean;
begin
Result := Signal.Pending;
end;
My best guess would be that $00630bec is a reference to the class TGameSignals. You can check it by doing
ShowMessage(IntToHex(NativeInt(TGameSignals), 8))
The pre-optimisation code was probably something like this
0061F854 mov eax,[$00630bec] //Move reference to class TGameSignals in EAX
0061F859 mov eax,[eax + $250] //Move Reference to FSignalManagerInstance at offset $250 in class TGameSignals in EAX
The compiler optimised [eax + $250] to [$00630e3c], but didn't realize the previous MOV wasn't required anymore.
I'm not an expert in codegen, so take it with a grain of salt...
On a side note, in Delphi, we usually write
if TGameSignals.IsSet( EmitParticleSignal ) then
As it's possible for the following IF to be true
var vBool : Boolean;
[...]
vBool := Boolean(10);
if vBool and (vBool <> True) then
Granted, this is not good practice, but no point in comparing to TRUE either.
EDIT: As pointed out by Ped7g, I was wrong. The instruction is
0061F854 mov eax,[$00630bec]
and not
0061F854 mov eax,$00630bec
So what I wrote didn't really make sense...
The first MOV instruction serves to pass the "self" reference for the call to TGameSignals.IsSet. Now, if the function wasn't inlined, it would look like this:
mov eax,[$00630bec]
call TGameSignals.IsSet
and then
*TGameSignals.IsSet
mov eax,[$00630e3c]
[...]
The first mov is still pointless, since "Self" isn't used in TGameSignals.IsSet, but it is still required to pass "self" to the function. When the routine gets inlined, it looks a lot more silly, indeed.
As mentioned by Arnaud Bouchez, making TGameSignals.IsSet static removes the implicit Self parameter and thus removes the first MOV operation.
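A minimal sketch of what the static declaration could look like follows. The record layout, the class var, and the program scaffolding are assumptions made for illustration; only the type and method names come from the question.

```pascal
program StaticClassMethodDemo;
{$APPTYPE CONSOLE}

type
  // Hypothetical layout; only the Pending field appears in the original code.
  TBucketSignal = record
    Pending: Boolean;
  end;
  PBucketSignal = ^TBucketSignal;

  TSignalManagerInstance = class
  public
    function IsSet(Signal: PBucketSignal): Boolean; inline;
  end;

  TGameSignals = class
  private
    class var FSignalManagerInstance: TSignalManagerInstance;
  public
    // "static" drops the hidden Self (class reference) parameter.
    class function IsSet(Signal: PBucketSignal): Boolean; static; inline;
  end;

function TSignalManagerInstance.IsSet(Signal: PBucketSignal): Boolean;
begin
  Result := Signal^.Pending;
end;

class function TGameSignals.IsSet(Signal: PBucketSignal): Boolean;
begin
  Result := FSignalManagerInstance.IsSet(Signal);
end;

var
  EmitParticleSignal: TBucketSignal;
begin
  TGameSignals.FSignalManagerInstance := TSignalManagerInstance.Create;
  try
    EmitParticleSignal.Pending := True;
    if TGameSignals.IsSet(@EmitParticleSignal) then
      Writeln('signal is pending');
  finally
    TGameSignals.FSignalManagerInstance.Free;
  end;
end.
```

Because a static class method has no Self, there is nothing for the caller to load before the call, inlined or not.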

Accessing Delphi Class Fields in 64 bit inline assembler

I am trying to convert the Delphi TBits.GetBit to inline assembler for the 64 bit version. The VCL source looks like this:
function TBits.GetBit(Index: Integer): Boolean;
{$IFNDEF X86ASM}
var
LRelInt: PInteger;
LMask: Integer;
begin
if (Index >= FSize) or (Index < 0) then
Error;
{ Calculate the address of the related integer }
LRelInt := FBits;
Inc(LRelInt, Index div BitsPerInt);
{ Generate the mask }
LMask := (1 shl (Index mod BitsPerInt));
Result := (LRelInt^ and LMask) <> 0;
end;
{$ELSE X86ASM}
asm
CMP Index,[EAX].FSize
JAE TBits.Error
MOV EAX,[EAX].FBits
BT [EAX],Index
SBB EAX,EAX
AND EAX,1
end;
{$ENDIF X86ASM}
I started converting the 32 bit ASM code to 64 bit. After some searching, I found out that I need to change the EAX references to RAX for the 64 bit compiler. I ended up with this for the first line:
CMP Index,[RAX].FSize
This compiles but gives an access violation when it runs. I tried a few combinations (e.g. MOV ECX,[RAX].FSize) and get the same access violation when trying to access [RAX].FSize. When I look at the assembler that is generated by the Delphi compiler, it looks like my [RAX].FSize should be correct.
Unit72.pas.143: MOV ECX,[RAX].FSize
00000000006963C0 8B8868060000 mov ecx,[rax+$00000668]
And the Delphi generated code:
Unit72.pas.131: if (Index >= FSize) or (Index < 0) then
00000000006963CF 488B4550 mov rax,[rbp+$50]
00000000006963D3 8B4D58 mov ecx,[rbp+$58]
00000000006963D6 3B8868060000 cmp ecx,[rax+$00000668]
00000000006963DC 7D06 jnl TForm72.GetBit + $24
00000000006963DE 837D5800 cmp dword ptr [rbp+$58],$00
00000000006963E2 7D09 jnl TForm72.GetBit + $2D
In both cases, the resulting assembler uses [rax+$00000668] for FSize. What is the correct way to access a class field in Delphi 64bit Assembler?
This may sound like a strange thing to optimize but the assembler for the 64bit pascal version doesn't appear to be very efficient. We call this routine a large number of times and it takes anything up to 5 times as long to execute depending on various factors.
The basic problem is that you are using the wrong register. Self is passed as an implicit parameter, before all others. In the x64 calling convention, that means it is passed in RCX and not RAX.
So Self is passed in RCX and Index is passed in RDX. Frankly, I think it's a mistake to use parameter names in inline assembler because they hide the fact that the parameter was passed in a register. If you happen to overwrite RDX, then that changes the apparent value of Index.
So the if statement might be coded as
CMP EDX,[RCX].FSize
JNL TBits.Error
CMP EDX,0
JL TBits.Error
FWIW, this is a really simple function to implement and I don't believe that you will need to use any stack space. You have enough registers in x64 to be able to do this entirely using volatile registers.
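The 32-bit assembler in the question needs only one comparison because JAE is an unsigned test: a negative Index reinterpreted as unsigned is larger than any valid size. A plain Pascal sketch of the same idea is shown below as a free-standing function. The name TestBit, its parameters, and the exception it raises are hypothetical and not part of TBits.

```pascal
program BitTestDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  BitsPerInt = SizeOf(Integer) * 8;

// Hypothetical stand-in for TBits.GetBit that takes the buffer and size explicitly.
function TestBit(Bits: PInteger; Size, Index: Integer): Boolean;
begin
  // One unsigned comparison rejects both Index < 0 and Index >= Size.
  if Cardinal(Index) >= Cardinal(Size) then
    raise ERangeError.CreateFmt('Bit index %d out of range', [Index]);
  Inc(Bits, Index div BitsPerInt);                     // step to the containing integer
  Result := (Bits^ and (1 shl (Index mod BitsPerInt))) <> 0;
end;

var
  Data: array[0..1] of Integer;
begin
  Data[0] := 0;
  Data[1] := 4;  // bit 2 of the second integer, i.e. overall bit 34
  Writeln(TestBit(@Data[0], 64, 34));  // TRUE
  Writeln(TestBit(@Data[0], 64, 33));  // FALSE
end.
```

Whether this is fast enough to make hand-written x64 assembler unnecessary would have to be measured.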

FLD instruction x64 bit

I have a little problem with the FLD instruction in x64 ...
I want to load a Double value onto the FPU stack, into the st0 register, but it seems to be impossible.
In 32-bit Delphi, I can use this code:
function DoSomething(X:Double):Double;
asm
FLD X
// Do Something ..
FST Result
end;
Unfortunately, in x64, the same code does not work.
Delphi follows the Microsoft x64 calling convention.
So if the arguments of a function/procedure are float/double, they are passed in the XMM0L, XMM1L, XMM2L, and XMM3L registers.
But you can use var before the parameter as a workaround, like:
function DoSomething(var X:Double):Double;
asm
FLD qword ptr [X]
// Do Something ..
FST Result
end;
In x64 mode floating point parameters are passed in xmm-registers. So when Delphi tries to compile FLD X, it becomes FLD xmm0 but there is no such instruction. You first need to move it to memory.
The same goes with the result, it should be passed back in xmm0.
Try this (not tested):
function DoSomething(X:Double):Double;
var
Temp : double;
asm
MOVQ qword ptr Temp,X
FLD Temp
//do something
FST Temp
MOVQ xmm0,qword ptr Temp
end;
You don't need to use legacy x87 stack registers in x86-64 code, because SSE2 is baseline, a required part of the x86-64 ISA. You can and should do your scalar FP math using addsd, mulsd, sqrtsd and so on, on XMM registers. (Or addss for float)
The Windows x64 calling convention passes float/double FP args in XMM0..3, if they're one of the first four args to the function. (i.e. the 3rd total arg goes in xmm2 if it's FP, rather than the 3rd FP arg going in xmm2.) It returns FP values in XMM0.
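To make the positional rule concrete, here is a small sketch in which a Double is the second argument and is therefore picked up from XMM1. The function name SecondArg is hypothetical, the Pascal fallback exists only so the program also compiles for non-x64 targets, and the code has not been run against any particular compiler version.

```pascal
program XmmArgDemo;
{$APPTYPE CONSOLE}

// Hypothetical function: returns its second argument.
function SecondArg(A: Integer; B: Double): Double;
{$IFDEF CPUX64}
asm
  // A is argument 1 and arrives in ECX. B is argument 2, so it arrives
  // in XMM1 even though it is the first floating-point argument.
  MOVAPD XMM0, XMM1   // a Double result is returned in XMM0
end;
{$ELSE}
begin
  Result := B;        // fallback for non-x64 targets
end;
{$ENDIF}

begin
  Writeln(SecondArg(1, 2.5):0:1);
end.
```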
Only use x87 if you actually need 80-bit precision inside your function. (Instructions like fsin and fyl2x are not fast, and can usually be done just as well by normal math libraries using SSE/SSE2 instructions.)
function times2(X:Double):Double;
asm
addsd xmm0, xmm0 // upper 8 bytes of XMM0 are ignored
ret
end;
Storing to memory and reloading into an x87 register costs you about 10 cycles of latency for no benefit. SSE/SSE2 scalar instructions are just as fast as, or faster than, their x87 equivalents, and easier to program for and optimize because you never need fxch; it's a flat register design instead of stack-based (see https://agner.org/optimize/). Also, you have 16 XMM registers.
Of course, you usually don't need inline asm at all. It could be useful for manually-vectorizing if the compiler doesn't do that for you.

Address of Delphi label

I'm working on a simple PIC18 MCPU mnemonic simulation in Delphi pascal. And yes, I intend to use Delphi IDE.
I'm able to simulate any asm instruction, but it stops at labels.
In some cases I need to know the address of Delphi label.
Is there any possibility to cast a label to a pointer variable?
As in my example?
procedure addlw(const n:byte); //emulation of mcpu addlw instruction
begin
Carry := (wreg + n) >= 256;
wreg := wreg + n;
Zero := wreg = 0;
inc(CpuCycles);
end;
procedure bnc(p: pointer ); //emulation of mcpu bnc instruction
asm
inc CpuCycles
cmp byte ptr Carry, 0
jnz @exit
pop eax //restore return address from stack
jmp p
@exit:
end;
//EMULATION OF MCPU ASM CODE
procedure Test;
label
Top;
var
p: pointer;
begin
//
Top:
addlw(5); //emulated mcpu addlw instruction
bnc(Top); //emulated mcpu bnc branch if not carry instruction
//
end;
No, you can't interact with labels that way. Since you're emulating everything else, you may as well emulate assembler labels, too, instead of trying to force Delphi labels to do something they're not designed for.
Suppose you could use code like this instead of the "assembler" code you wrote (without worrying for now exactly how to implement it):
procedure Test;
var
Top: TAsmLabel;
begin
//
DefineLabel(Top);
addlw(5); //emulated mcpu addlw instruction
bnc(Top); //emulated mcpu bnc branch if not carry instruction
//
end;
The syntax looks similar enough, I think. Upon running that code, you'll want Top to refer to the next instruction, which is the one that calls addlw.
Inside the hypothetical function DefineLabel, that address corresponds to the return address, so write DefineLabel to store its return address in the given parameter:
type
TAsmLabel = Pointer;
procedure DefineLabel(out Result: TAsmLabel);
asm
mov ecx, [esp] // copy return address
mov [eax], ecx // store result
end;
Beware that this code corrupts the stack. Your bnc function leaves its return address on the stack, so when the carry flag eventually gets set, you've left a trail of previous return addresses on the stack. If you don't get a stack overflow first, you'll hit strange results when you get to the end of the containing function. It will try to return, but instead of going to the caller, it will find bnc's return address instead, and jump back into the middle of your code. And that's all assuming there aren't any other stack-relative references in the code. If there are, then even calling bnc(Top) might give problems because the relative position of Top will have changed, and you'll end up reading the wrong value off the stack.
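One way to sidestep the stack problem altogether, offered here as an alternative design rather than as what the answer proposes, is to let bnc only report whether the branch should be taken and leave the jump itself to Delphi's own goto. In this sketch the global variables mirror the names used in the question, their types and initial values are assumptions, and turning bnc into a function is a change to the original interface.

```pascal
program BncSketch;
{$APPTYPE CONSOLE}

var
  // Emulated CPU state; types and initial values are assumptions.
  wreg: Byte = 0;
  Carry: Boolean = False;
  Zero: Boolean = False;
  CpuCycles: Integer = 0;

procedure addlw(const n: Byte);  // emulation of mcpu addlw instruction
begin
  Carry := (wreg + n) >= 256;
  wreg := (wreg + n) and $FF;    // keep the low 8 bits explicitly
  Zero := wreg = 0;
  Inc(CpuCycles);
end;

function bnc: Boolean;           // True when the emulated branch is taken
begin
  Inc(CpuCycles);
  Result := not Carry;
end;

procedure Test;
label
  Top;
begin
Top:
  addlw(5);
  if bnc then goto Top;          // the jump is an ordinary Delphi goto
end;

begin
  Test;
  Writeln('wreg=', wreg, ' cycles=', CpuCycles);
end.
```

Nothing touches the return address here, so the stack stays balanced however many times the loop runs.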

Char and Chr in Delphi

The difference between Chr and Char when used in converting types is that one is a function and the other is a cast.
So: Char(66) = Chr(66)
I don't think there is any performance difference (at least I've never noticed any, one probably calls the other).... I'm fairly sure someone will correct me on this!
EDIT Thanks to Ulrich for the test proving they are in fact identical.
EDIT 2 Can anyone think of a case where they might not be identical, e.g. you are pushed towards using one over the other due to the context?
Which do you use in your code and why?
I did a small test in D2007:
program CharChr;
{$APPTYPE CONSOLE}
uses
Windows;
function GetSomeByte: Byte;
begin
Result := Random(26) + 65;
end;
procedure DoTests;
var
b: Byte;
c: Char;
begin
b := GetSomeByte;
IsCharAlpha(Chr(b));
b := GetSomeByte;
IsCharAlpha(Char(b));
b := GetSomeByte;
c := Chr(b);
b := GetSomeByte;
c := Char(b);
end;
begin
Randomize;
DoTests;
end.
Both calls produce the same assembly code:
CharChr.dpr.19: IsCharAlpha(Chr(b));
00403AE0 8A45FF mov al,[ebp-$01]
00403AE3 50 push eax
00403AE4 E86FFFFFFF call IsCharAlpha
CharChr.dpr.21: IsCharAlpha(Char(b));
00403AF1 8A45FF mov al,[ebp-$01]
00403AF4 50 push eax
00403AF5 E85EFFFFFF call IsCharAlpha
CharChr.dpr.24: c := Chr(b);
00403B02 8A45FF mov al,[ebp-$01]
00403B05 8845FE mov [ebp-$02],al
CharChr.dpr.26: c := Char(b);
00403B10 8A45FF mov al,[ebp-$01]
00403B13 8845FE mov [ebp-$02],al
Edit: Modified sample to mitigate Nick's concerns.
Edit 2: Nick's wish is my command. ;-)
The help says: Chr returns the character with the ordinal value (ASCII value) of the byte-type expression, X. *
So, how is a character represented in a computer's memory? Guess what, as a byte*. Actually the Chr and Ord functions are only there because Pascal is a strictly typed language that prohibits the use of bytes* where characters are requested. For the computer the resulting char is still represented as a byte* - to what shall it convert then? Actually there is no code emitted for this function call, just as there is no code emitted for a type cast. Ergo: no difference.
You may prefer chr just to avoid a type cast.
Note: type casts shall not be confused with explicit type conversions! In Delphi 2010 writing something like Char(a) while a is an AnsiChar, will actually do something.
* For Unicode, please replace byte with integer.
Edit:
Just an example to make it clear (assuming non-Unicode):
var
a: Byte;
c: char;
b: Byte;
begin
a := 60;
c := Chr(60);
c := Chr(a);
b := a;
end;
produces similar code
ftest.pas.46: a := 60;
0045836D C645FB3C mov byte ptr [ebp-$05],$3c
ftest.pas.47: c := Chr(60);
00458371 C645FA3C mov byte ptr [ebp-$06],$3c
ftest.pas.48: c := Chr(a);
00458375 8A45FB mov al,[ebp-$05]
00458378 8845FA mov [ebp-$06],al
ftest.pas.49: b := a;
0045837B 8A45FB mov al,[ebp-$05]
0045837E 8845F9 mov [ebp-$07],al
Assigning byte to byte is actually the same as assigning byte to char via CHR().
chr is a function, thus it returns a new value of type char.
char(x) is a cast, that means the actual x object is used but as a different type.
Many system functions, like inc, dec, chr, ord, are inlined.
Both char and chr are fast. Use the one that is most appropriate each time,
and reflects better what you want to do.
Chr is a function call, so it is a tiny bit more expensive than a type cast. But I think Chr is inlined by the compiler.
They are identical, but they don't have to be identical. There's no requirement that the internal representation of characters map 1-to-1 with their ordinal values. Nothing says that a Char variable holding the value 'A' must hold the numeric value 65. The requirement is that when you call Ord on that variable, the result must be 65 because that's the code point designated for the letter A in your program's character encoding.
Of course, the easiest implementation of that requirement is for the variable to hold the numeric value 65 as well. Because of this, the function calls and the type-casts are always identical.
If the implementation were different, then when you called Chr(65), the compiler would go look up what character is at code point 65 and use it as the result. When you write Char(65), the compiler wouldn't worry about what character it really represents, as long as the numeric result stored in memory was 65.
Is this splitting hairs? Yes, absolutely, because in all current implementations, they're identical. I liken this to the issue of whether the null pointer is necessarily zero. It's not, but under all implementations, it ends up that way anyway.
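A quick way to see the equivalence on a given compiler is a check like the following. It is a minimal sketch with made-up variable names, and it only demonstrates the behaviour for one value.

```pascal
program ChrVersusCast;
{$APPTYPE CONSOLE}

var
  b: Byte;
  viaFunction, viaCast: Char;
begin
  b := 66;
  viaFunction := Chr(b);   // standard function
  viaCast := Char(b);      // type cast
  Writeln(viaFunction = viaCast);  // TRUE
  Writeln(Ord(viaFunction));       // 66
end.
```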
chr is typesafe, char isn't: Try to code chr(256) and you'll get a compiler error. Try to code char(256) and you will either get the character with the ordinal value 0 or 1, depending on your computer's internal representation of integers.
I'll qualify the above by saying that it applies to pre-Unicode Delphi. I don't know if chr and char have been updated to take Unicode into account.

Resources