Just hitting various bricks walls with years worth of code updating, but the current one i cant seem to convert is this
Function Pack (Var Source, Dest; Count : Word) : Word; Assembler;
Asm
Push DS
Mov BX, Count { BX = Count }
Mov AX, Word Ptr Dest
Mov DI, AX
Mov AX, Word Ptr Dest+2 `1`
Mov ES, AX { ES:DI - Dest }
Mov AX, Word Ptr Source
Mov SI, AX
Mov AX, Word Ptr Source+2
Mov DS, AX { DS:SI - Source }
Xor DX, DX { Packed size }
CLD
#Cycle:
Or BX, BX
JZ #End { Done }
LODSB
Mov CX, BX
Cmp CX, 100H
JC #1
Mov CX, 0FFH
#1:
Mov AH, CL
Push ES
Push DI { Save ES:DI before scan }
Push SI
Pop DI
Push DS
Pop ES { ES:DI = DS:SI for scan }
RepE ScaSB
Dec DI
Push DI
Pop SI
Push ES
Pop DS { DS:SI = ES:DI for next }
Pop DI
Pop ES { Restore ES:DI after scan }
Sub AH, CL
Mov CL, AH { CX = repeat count }
Cmp AH, 3
JNC #3 { Repeat count >= 3 }
Cmp AL, RP
JNE #2 { Not a RepeatPrefix byte }
STOSW { Save RP, repeat count < 3 }
Sub BX, CX { Actually count in source }
Add DX, 2 { Actually packed size }
Jmp #Cycle
#2:
Sub BX, CX { Actually count in source }
Add DX, CX { Actually packed size }
Rep STOSB { Save bytes }
Jmp #Cycle
#3:
Sub BX, CX { Actually count in source }
Add DX, 3 { Actually packed size }
Mov CL, AL
Mov AL, RP
STOSW { Save RP, repeat count < 3 }
Mov AL, CL
STOSB { Save repeating byte }
Jmp #Cycle
#End:
Pop DS
Mov AX, DX { Return packed size }
End;
Function UnPack (Var Source, Dest; Count : Word) : Word; Assembler;
Asm
Push DS
Mov BX, Count { BX = Count }
Mov AX, Word Ptr Dest
Mov DI, AX
Mov AX, Word Ptr Dest+2
Mov ES, AX { ES:DI - Dest }
Mov AX, Word Ptr Source
Mov SI, AX
Mov AX, Word Ptr Source+2
Mov DS, AX { DS:SI - Source }
Xor DX, DX { Packed size }
Xor AH, AH
CLD
#Cycle:
Or BX, BX
JZ #End { Done }
LODSB
Dec BX
Cmp AL, RP
JE #1
STOSB
Inc DX
Jmp #Cycle
#1:
LODSB
Mov CX, AX
Add DX, CX
Dec BX
Cmp AL, 3
JNC #2
Mov AL, RP
Rep STOSB
Jmp #Cycle
#2:
LODSB
Dec BX
Rep STOSB
Jmp #Cycle
#End:
Pop DS
Mov AX, DX
End;
[DCC Error] Packer.pas(20): E2107 Operand size mismatch " Mov AX, Word Ptr Dest"
[DCC Error] Packer.pas(22): E2105 Inline assembler syntax error" Mov AX, Word Ptr Dest+2 `1`"
[DCC Error] Packer.pas(24): E2107 Operand size mismatch "Mov AX, Word Ptr Source"
[DCC Error] Packer.pas(87): E2107 Operand size mismatch" Mov AX, Word Ptr Dest"
[DCC Error] Packer.pas(91): E2107 Operand size mismatch " Mov AX, Word Ptr Source"
I spent far to many years on d6, what am i doing wrong?
As a general rule, inline assembly code that compiles in Delphi 6 will compile in later versions too. So no porting should be needed. Indeed, when you attempt to compile this code in Delphi 6, it fails with exactly the same errors as you report from D2007.
You also mention in the comments that you have not been compiling the code in Delphi 6, but have rather been using a compiled .dcu file. Which makes more sense, given that the code in the question does not compile in Delphi 6.
The reason the code does not compile in Delphi 6 is that it is not 32 bit code. It seems to me to be 16 bit code.
Reading between the lines I suspect that the code you have dates from long ago, in the 16 bit age. When the code was moved to 32 bit someone ported the code but left you a .dcu file rather than the source.
You are thus in a pickle. Without knowing what's in this .dcu file what are you to do? Are you even sure that the .dcu file does the same as this 16 bit assembly code? Do you have a functional specification for these functions?
In an ideal world you would know what these functions do and would be able to port them to Pascal. Then you would no longer be tied to assembler code.
If you don't know what these functions do you should find out. I would not trust that your 16 bit assembler matches what your .dcu file does. I would disassemble the .dcu file and port that to Pascal.
If even that's too hard, then the expedient approach is to use Delphi 6 to compile the .dcu file into a DLL. Then you can call these functions from your ported D20007 program. This will work but it leaves you no closer to knowing what your code does.
Yes it is 16 bit register.
Just an idea, maybe declare in your variables example AX as word and AH, AL as byte but you will need to rename them.
example...
Function Example (Var Source, Dest; Count : Word) : Word; Assembler;
var
iAX, iBX, iCX, iDX, iBP, iSI, iDI, iSP: Word;
iAH, iAL, iBH, iBL, iCH, iCL, iDH, iDL : Byte
Asm
Push DS
Mov iBX, Count
Mov iAX, Word Ptr Dest
Mov iDI, iAX
Mov iAX, Word Ptr Dest+2
Mov iES, iAX
.......
P.s
You might have to change "Push DS" and "POP DS" to "PUSH ESI" and "POP ESI"
Related
Is there a way to find the implementation of the intrinsic function System.val(..)?
This function works only with unicodeString and I would like to make it working with AnsiString also ... is there any replacement of this function that works with ansiString?
If only decimal digits, and only positive numbers, and no check for overflow (because you know it to never happen), then you can use the following function:
FUNCTION TryAnsiStrToInt(CONST S : AnsiString {EAX} ; OUT Value : Cardinal {EDX} ) : BOOLEAN; ASSEMBLER;
ASM
PUSH ESI
MOV ESI,EAX // ESI = String
XOR ECX,ECX // ECX = Value
CLD
XOR EAX,EAX // Clear upper 24 bits of EAX
#LOOP: LODSB // Load character
OR AL,AL // End-of-String?
JZ #OK
SUB AL,'0' // Convert ASCII to Binary
JB #ERR // Out of range
CMP AL,9
JA #ERR // Out of range
IMUL ECX,10 // ECX:=ECX*10
ADD ECX,EAX // ECX:=ECX+EAX (ie. Value:=Value*10+Digit)
JMP #LOOP // Next character
#ERR: XOR AL,AL // Error: Return FALSE
XOR ECX,ECX // and Value:=0
JMP #OUT
#OK: MOV AL,1 // Success return TRUE
#OUT: MOV [EDX],ECX // and Value
POP ESI
END;
If you have additional requirements (negative numbers, hex numbers, overflow check etc.) you'll need to adapt the function to care for these...
I want to have universal way of detecting specific CPU features. For this task I've created this function which takes EAX leaf number,Register name and bit number and returns true or false. It works fine for MMX/SSEx/AVX (EAX=1) but it does not detect AVX2 (EAX=7).
CPU: i5-4670k
OS: Windows 7
DetectCPUFeature('1','EDX',23) //DETECTS MMX CORRECTLY
DetectCPUFeature('1','EDX',25) //DETECTS SSE CORRECTLY
DetectCPUFeature('1','EDX',26) //DETECTS SSE2 CORRECTLY
DetectCPUFeature('1','ECX',0) //DETECTS SSE3 CORRECTLY
DetectCPUFeature('1','ECX',9) //DETECTS SSSE3 CORRECTLY
DetectCPUFeature('1','ECX',19) //DETECTS SSE4.1 CORRECTLY
DetectCPUFeature('1','ECX',20) //DETECTS SSE4.2 CORRECTLY
DetectCPUFeature('1','ECX',28) //DETECTS AVX CORRECTLY
DetectCPUFeature('7','EBX',5) //DOES NOT DETECT AVX2!
.
function DetectCPUFeature(EAX_Leaf_HEX,Register_Name:string;Bit:byte):boolean;
var _eax,_ebx,_ecx,_edx,EAX_Leaf,_Result: Longword;
x:integer;
Binary_mask:string;
Decimal_mask:int64;
begin
EAX_Leaf:=HexToInt(EAX_Leaf_HEX);
Binary_mask:='1';
for x:=1 to Bit do Binary_mask:=Binary_mask+'0';
Decimal_mask:=BinToInt(Binary_mask);
if AnsiUpperCase(Register_Name)='EDX' then
begin
asm
mov eax,EAX_Leaf // https://en.wikipedia.org/wiki/CPUID
db $0F,$A2 // db $0F,$A2 = CPUID instruction
mov _Result,edx
end;
end;
if AnsiUpperCase(Register_Name)='ECX' then
begin
asm
mov eax,EAX_Leaf
db $0F,$A2
mov _Result,ecx
end;
end;
if AnsiUpperCase(Register_Name)='EBX' then
begin
asm
mov eax,EAX_Leaf
db $0F,$A2
mov _Result,ebx
end;
end;
if (_Result and Decimal_mask) = Decimal_mask then DetectCPUFeature:=true
else DetectCPUFeature:=false;
end;
This sort of code is very dubious, mixing asm with Pascal code. Your code, in the asm blocks modifies registers and fails to restore them. That could easily be conflicting with the compiler's register usage. My strong advice to you is that you should never mix asm and Pascal in this way. Always use pure Pascal or pure asm.
What you need is a function that will perform the CPUID instruction and return you all the registers in a structure. You can then pick out what you want from that using Pascal code.
In addition, as #J... points out, you need to specify the sub-leaf value in the ECX register before invoking the CPUID instruction. That is a requirement for a number of the more recently added CPUID arguments.
This is the function you need:
type
TCPUID = record
EAX: Cardinal;
EBX: Cardinal;
ECX: Cardinal;
EDX: Cardinal;
end;
function GetCPUID(Leaf, Subleaf: Cardinal): TCPUID;
asm
push ebx
push edi
mov edi, ecx
mov ecx, edx
cpuid
mov [edi+$0], eax
mov [edi+$4], ebx
mov [edi+$8], ecx
mov [edi+$c], edx
pop edi
pop ebx
end;
I've written this for 32 bit code, but if you need to support 64 bit code also that support is easy enough to add.
function GetCPUID(Leaf, Subleaf: Integer): TCPUID;
asm
{$IF Defined(CPUX86)}
push ebx
push edi
mov edi, ecx
mov ecx, edx
cpuid
mov [edi+$0], eax
mov [edi+$4], ebx
mov [edi+$8], ecx
mov [edi+$c], edx
pop edi
pop ebx
{$ELSEIF Defined(CPUX64)}
mov r9,rcx
mov ecx,r8d
mov r8,rbx
mov eax,edx
cpuid
mov [r9+$0], eax
mov [r9+$4], ebx
mov [r9+$8], ecx
mov [r9+$c], edx
mov rbx, r8
{$ELSE}
{$Message Fatal 'GetCPUID has not been implemented for this architecture.'}
{$IFEND}
end;
With this at hand you can call CPUID passing any value as input, and retrieve all 4 registers of output, with which you can then do whatever you please.
Your code to create a bitmask is extremely inefficient and very far from idiomatic. Use 1 shl N to create a value with a single bit set, in position N.
Code like this:
if (_Result and Decimal_mask) = Decimal_mask then DetectCPUFeature:=true
else DetectCPUFeature:=false;
is also some way from idiomatic. That would normally be written like this:
DetectCPUFeature := value and mask <> 0;
You might end up with a wrapper function that looks like this:
type
TCPUIDRegister = (regEAX, regEBX, regECX, regEDX);
function GetCPUIDRegister(CPUID: TCPUID; Reg: TCPUIDRegister): Cardinal;
begin
case Reg of
regEAX:
Result := CPUID.EAX;
regEBX:
Result := CPUID.EBX;
regECX:
Result := CPUID.ECX;
regEDX:
Result := CPUID.EDX;
end;
end;
function CPUFeatureEnabled(Leaf, Subleaf: Cardinal; Reg: TCPUIDRegister; Bit: Integer): Boolean;
var
value: Cardinal;
begin
value := GetCPUIDRegister(GetCPUID(Leaf, Subleaf), Reg);
Result := value and (1 shl Bit) <> 0;
end;
While David's answer is excellent, the reason the function fails is that the ECX register is not set to zero (required for fetching extended info in the CPUID call).
See : How to detect New Instruction support in the 4th generation Intel® Core™ processor family
where AVX2 is found by (emphasis mine)
CPUID.(EAX=07H, ECX=0H):EBX.AVX2[bit 5]==1
The following correctly returns the extended information and identifies AVX2 support.
if AnsiUpperCase(Register_Name)='EBX' then
begin
asm
push ecx { push ecx to stack}
mov ecx, 0 { set ecx to zero}
mov eax,EAX_Leaf
db $0F,$A2
mov _Result,ebx
pop ecx { restore ecx}
end;
The other asm functions have the same error as ECX is required to be zero for those calls also.
taken from Synopse Informatique:
type
/// the potential features, retrieved from an Intel CPU
// - see https://en.wikipedia.org/wiki/CPUID#EAX.3D1:_Processor_Info_and_Feature_Bits
TALIntelCpuFeature =
( { in EDX }
cfFPU, cfVME, cfDE, cfPSE, cfTSC, cfMSR, cfPAE, cfMCE,
cfCX8, cfAPIC, cf_d10, cfSEP, cfMTRR, cfPGE, cfMCA, cfCMOV,
cfPAT, cfPSE36, cfPSN, cfCLFSH, cf_d20, cfDS, cfACPI, cfMMX,
cfFXSR, cfSSE, cfSSE2, cfSS, cfHTT, cfTM, cfIA64, cfPBE,
{ in ECX }
cfSSE3, cfCLMUL, cfDS64, cfMON, cfDSCPL, cfVMX, cfSMX, cfEST,
cfTM2, cfSSSE3, cfCID, cfSDBG, cfFMA, cfCX16, cfXTPR, cfPDCM,
cf_c16, cfPCID, cfDCA, cfSSE41, cfSSE42, cfX2A, cfMOVBE, cfPOPCNT,
cfTSC2, cfAESNI, cfXS, cfOSXS, cfAVX, cfF16C, cfRAND, cfHYP,
{ extended features in EBX, ECX }
cfFSGS, cf_b01, cfSGX, cfBMI1, cfHLE, cfAVX2, cf_b06, cfSMEP, cfBMI2,
cfERMS, cfINVPCID, cfRTM, cfPQM, cf_b13, cfMPX, cfPQE, cfAVX512F,
cfAVX512DQ, cfRDSEED, cfADX, cfSMAP, cfAVX512IFMA, cfPCOMMIT,
cfCLFLUSH, cfCLWB, cfIPT, cfAVX512PF, cfAVX512ER, cfAVX512CD,
cfSHA, cfAVX512BW, cfAVX512VL, cfPREFW1, cfAVX512VBMI);
/// all features, as retrieved from an Intel CPU
TALIntelCpuFeatures = set of TALIntelCpuFeature;
var
/// the available CPU features, as recognized at program startup
ALCpuFeatures: TALIntelCpuFeatures;
{**}
type
_TRegisters = record
eax,ebx,ecx,edx: cardinal;
end;
{***************************************************************}
procedure _GetCPUID(Param: Cardinal; var Registers: _TRegisters);
{$IF defined(CPU64BITS)}
asm // ecx=param, rdx=Registers (Linux: edi,rsi)
.NOFRAME
mov eax, ecx
mov r9, rdx
mov r10, rbx // preserve rbx
xor ebx, ebx
xor ecx, ecx
xor edx, edx
cpuid
mov _TRegisters(r9).&eax, eax
mov _TRegisters(r9).&ebx, ebx
mov _TRegisters(r9).&ecx, ecx
mov _TRegisters(r9).&edx, edx
mov rbx, r10
end;
{$else}
asm
push esi
push edi
mov esi, edx
mov edi, eax
pushfd
pop eax
mov edx, eax
xor eax, $200000
push eax
popfd
pushfd
pop eax
xor eax, edx
jz #nocpuid
push ebx
mov eax, edi
xor ecx, ecx
cpuid
mov _TRegisters(esi).&eax, eax
mov _TRegisters(esi).&ebx, ebx
mov _TRegisters(esi).&ecx, ecx
mov _TRegisters(esi).&edx, edx
pop ebx
#nocpuid:
pop edi
pop esi
end;
{$ifend}
{******************************}
procedure _TestIntelCpuFeatures;
var regs: _TRegisters;
begin
regs.edx := 0;
regs.ecx := 0;
_GetCPUID(1,regs);
PIntegerArray(#ALCpuFeatures)^[0] := regs.edx;
PIntegerArray(#ALCpuFeatures)^[1] := regs.ecx;
_GetCPUID(7,regs);
PIntegerArray(#ALCpuFeatures)^[2] := regs.ebx;
PByteArray(#ALCpuFeatures)^[12] := regs.ecx;
end;
initialization
_TestIntelCpuFeatures;
I need to process a file that is coming from the old Mac era (old Motorola CPU). The bytes are big endian so I have a function that swaps and Int64 to Intel little endian. The function is ASM and works on 32 bit CPU but not on 64. For 64 bit I have a different function that is not ASM. I want to combine the functions using IFDEF. Can I do this? Will it be a problem?
interface
function SwapInt64(Value: Int64): Int64; assembler;
implementation
{$IFDEF CPUx86}
function SwapInt64(Value: Int64): Int64; assembler; { Does not work on 64 bit } {
asm
MOV EDX,[DWORD PTR EBP + 12]
MOV EAX,[DWORD PTR EBP + 8]
BSWAP EAX
XCHG EAX,EDX
BSWAP EAX
end;
{$else}
function SwapInt64 (Value: Int64): Int64;
var P: PInteger;
begin
Result: = (Value shl 32) or (Value shr 32);
P: = #Result;
P ^: = (Swap (P ^) shl 16) or (Swap (P ^ shr 16));
Inc (P);
P ^: = (Swap (P ^) shl 16) or (Swap (P ^ shr 16));
end;
{$ENDIF}
I think the compiler will correctly compile/call the appropriate function no matter if one is ASM and the other is Pascal.
What you are proposing is perfectly fine. It is a quite reasonable approach.
If you want a 64 bit swap in asm, for x64, that's quite simple:
function SwapInt64(Value: Int64): Int64;
asm
MOV RAX,RCX
BSWAP RAX
end;
Combine this with the 32 bit version using conditional, as you have done in the question.
function SwapInt64(Value: Int64): Int64;
{$IF Defined(CPUX86)}
asm
MOV EDX,[DWORD PTR EBP + 12]
MOV EAX,[DWORD PTR EBP + 8]
BSWAP EAX
XCHG EAX,EDX
BSWAP EAX
end;
{$ELSEIF Defined(CPUX64)}
asm
MOV RAX,RCX
BSWAP RAX
end;
{$ELSE}
{$Message Fatal 'Unsupported architecture'}
{$ENDIF}
Or include a Pascal implementation in the {$ELSE} block.
The approach of swapping the bytes in a separate routine that cannot be inlined is a bit silly if performance is what you're after.
A better way to a assume you've got a block of data and all dword/qwords in it need to have their endianness changed.
This would look something like this.
For dwords
function SwapDWords(var Data; size: cardinal): boolean;
{ifdef CPUX64}
asm
//Data in RCX, Size in EDX
xor EAX,EAX //failure
test EDX,3
jz #MultipleOf4
#error:
ret
#MultipleOf4
neg EDX //Count up instead of down
jz #done
ADD RCX,RDX
#loop
mov R8d, [RCX+RDX]
bswap R8d
mov [RCX+RDX],R8d
add RDX,4 //add is faster than inc on modern processors
jnz #loop
#done:
inc EAX //success
ret
end;
For qwords
function SwapQWords(var Data; size: cardinal): boolean;
{ifdef CPUX64}
asm
//Data in RCX, Size in EDX
xor EAX,EAX //failure
test EDX,7
jz #MultipleOf8
#error:
ret
#MultipleOf8
neg EDX //Count up instead of down
jz #done
ADD RCX,RDX
#loop
mov R8, [RCX+RDX]
bswap R8
mov [RCX+RDX],R8
add RDX,8 //add is faster than inc on modern processors
jnz #loop
#done:
inc EAX //success
ret
end;
If you're already on 64 bit, then you have SSE2, and can use the 128-bit SSE registers.
Now you can process 4 dwords at a time, effectively unrolling the loop 4 times.
See: http://www.asmcommunity.net/forums/topic/?id=29743
movntpd xmm5,[RCX+RDX] //non-temporal move to avoid polluting the cache
movdqu xmm0, xmm5
movdqu xmm1, xmm5
pxor xmm5, xmm5
punpckhbw xmm0, xmm5 ; interleave '0' with bytes of original
punpcklbw xmm1, xmm5 ; so they become words
pshuflw xmm0, xmm0, 27 ; swap the words by shuffling
pshufhw xmm0, xmm0, 27 ;//27 = B00_01_10_11
pshuflw xmm1, xmm1, 27
pshufhw xmm1, xmm1, 27
packuswb xmm1, xmm0 ; make the words back into bytes.
movntpd [RCX+RDX], xmm1 //non-temporal move to keep the cache clean.
Simply use either LEToN() or BEtoN()
Use the LE variant if the data is little endian (e.g. 32 or 64-bits x86 mac, modern arm), use the BE if the source data (e.g. file from disk) is in big endian format.
Depending on the used architecture a swap or "nothing" will be inlined, usually a fairly optimal one for single conversions. For block oriented solutions see the posted SSE code (or Agner Fog's)
In Delphi how can I detect overflow errors for Int64?
For Integers we could do:
type
MyInt = Integer; //Int64
function TryMaxTimes10(out Res: MyInt): boolean;
var
a, b: MyInt;
begin
{$Q+}
try
a := High(MyInt);
b := 10;
Res := a * b; //REF1
Result := True;
except
Result := False;
end;
{$Q-}
end;
For MyInt = Integer, line REF1 gives an exception and so TryMaxTimes10 returns false.
But if we change MyInt to MyInt = Int64, then REF1 does not give an exception and TryMaxTimes10 returns true!
I understand that the help for {$Q+} does not specifically mention Int64: ... {$Q+} state, certain integer arithmetic operations ... are checked for overflow.
QUESTION: So my question is, how can we detect overflow errors for Int64?
(I'm using Delphi 7. Does the same thing happen in newer versions of Delphi?)
This is a known issue. See http://qc.embarcadero.com/wc/qcmain.aspx?d=10185, and the comments Andy wrote at the bottom.
My suggestion would be to create a function (I did not compile nor test this - just an example):
function Foo(A, B : Int64) : Int64;
var bNeg : boolean;
begin
// Do we expect a negative result?
bNeg := ((a < 0) xor (b < 0));
// Get the real result
Result := a * b;
// If the result is wrong, raise an error
if ((Result < 0) xor bNeg) then begin
// Raise EOverFlow
end;
end;
This bug has been fixed in RAD Studio 10.2 Tokyo.
The issue can be found here (but one have to log in with embarcadero account to see it).
Here is correct version of __llmulo by John O'Harrow (licensed under MPL 1.1) shipped with Delphi versions 10.2 and above:
// Param 1(edx:eax), Param 2([esp+8]:[esp+4])
// Result is stored in edx:eax
// O-flag set on exit => result is invalid
// O-flag clear on exit => result is valid
procedure __llmulo();
asm
test edx, edx {Param1-Hi = 0?}
jne ##Large {No, More than one multiply may be needed}
cmp edx, [esp+8] {Param2-Hi = 0?}
jne ##Large {No, More than one multiply may be needed}
mul dword ptr [esp+4] {Only one multiply needed, Set Result}
and eax, eax {Clear Overflow Flag}
ret 8
##Large:
sub esp, 28 {allocate local storage}
mov [esp], ebx {save used registers}
mov [esp+4], esi
mov [esp+8], edi
mov [esp+12], ebp
mov ebx, [esp+32] {Param2-Lo}
mov ecx, [esp+36] {Param2-Hi}
mov esi, edx
mov edi, ecx
sar esi, 31
sar edi, 31
xor eax, esi
xor edx, esi
sub eax, esi
sbb edx, esi {edx:eax (a1:a0) = abs(Param1)}
xor ebx, edi
xor ecx, edi
sub ebx, edi
sbb ecx, edi {ecx:ebx (b1:b0) = abs(Param2)}
xor esi, edi {Sign Flag, 0 if Params have same sign else -1}
mov [esp+16], eax {a0}
mov [esp+20], edx {a1}
mov [esp+24], ecx {b1}
mul ebx {edx:eax (c1:c0) = a0*b0}
xchg ebx, edx {ebx = c1, edx = b0}
mov edi, eax {abs(Result-Lo) = c0}
xor ecx, ecx {Upper 32 bits of 128 bit result}
xor ebp, ebp {Second 32 bits of 128 bit result}
mov eax, [esp+20] {a1}
mul edx {edx:eax (d1:d0) = a1*b0}
add ebx, eax {c1 + d0}
adc ebp, edx {d1 + carry}
adc ecx, 0 {Possible carry into Upper 32 bits}
mov eax, [esp+16] {a0}
mov edx, [esp+24] {b1}
mul edx {edx:eax (e1:e0) = a0*b1}
add ebx, eax {abs(Result-Hi) = c1 + d0 + e0}
adc ebp, edx {d1 + e1 + carry}
adc ecx, 0 {Possible carry into Upper 32 bits}
mov eax, [esp+20] {a1}
mov edx, [esp+24] {b1}
mul edx {edx:eax (f1:f0) = a1*b1}
add ebp, eax {d1 + e1 + f0 + carry}
adc ecx, edx {f1 + carry}
or ecx, ebp {Overflow if ecx <> 0 or ebp <> 0}
jnz ##Overflow
mov edx, ebx {Set abs(Result-Hi)}
mov eax, edi {Set abs(Result-Lo)}
cmp edx, $80000000
jae ##CheckRange {Possible Overflow if edx>=$80000000}
##SetSign:
xor eax, esi {Correct Sign of Result}
xor edx, esi
sub eax, esi
sbb edx, esi
mov ebx, [esp] {restore used registers}
mov esi, [esp+4]
mov edi, [esp+8]
mov ebp, [esp+12]
add esp, 28 {Clears Overflow flag}
ret 8
##CheckRange:
jne ##Overflow {Overflow if edx>$80000000}
test esi, esi {edx=$80000000, Is Sign Flag=0?}
jnz ##SetSign {No, Result is Ok (-MaxInt64)}
##Overflow:
mov ebx, [esp] {restore used registers}
mov esi, [esp+4]
mov edi, [esp+8]
mov ebp, [esp+12]
add esp, 28
mov ecx, $80000000
dec ecx {Set Overflow Flag}
ret 8
end;
I have the following function that works in Delphi 2006, but under Delphi XE2 it gives either an access violation error or a privileged instruction error when processing RET.
function Q_TrimChar(const S: string; Ch: Char): string;
asm
PUSH ESI
MOV ESI,ECX
TEST EAX,EAX
JE ##qt
MOV ECX,[EAX-4]
TEST ECX,ECX
JE ##qt
PUSH EBX
PUSH EDI
MOV EBX,EAX
MOV EDI,EDX
XOR EDX,EDX
MOV EAX,ESI
CALL System.#LStrFromPCharLen
MOV EDX,EDI
MOV ECX,[EBX-4]
##lp1: CMP DL,BYTE PTR [EBX]
JNE ##ex1
INC EBX
DEC ECX
JNE ##lp1
MOV EDX,[ESI]
JMP ##wq
##ex1: DEC ECX
##lp2: CMP DL,BYTE PTR [EBX+ECX]
JNE ##ex2
DEC ECX
JMP ##lp2
##ex2: MOV EDI,[ESI]
LEA EDX,[EDI+ECX+1]
##lp3: MOV AL,BYTE PTR [EBX+ECX]
MOV BYTE PTR [EDI+ECX],AL
DEC ECX
JNS ##lp3
##wq: MOV EAX,[ESI]
MOV BYTE PTR [EDX],0
SUB EDX,EAX
MOV [EAX-4],EDX
POP EDI
POP EBX
POP ESI
RET
##qt: MOV EAX,ESI
CALL System.#LStrClr
POP ESI
end;
I don't know assembly very well. What is the problem?
I completely agree with David's suggestion to simply code this in Pascal and have upvoted that answer. Unless profiling has indicated that this is a true bottleneck then there's really no need for the ASM. Here are two versions. The first is easier to read but the second is more efficient:
function Q_TrimChar(const S: string; Ch: Char): string;
begin
result := S;
while (result <> '') and (result[1] = Ch) do Delete(Result, 1, 1);
while (result <> '') and (result[Length(Result)] = Ch) do Delete(Result, Length(Result), 1);
end;
function Q_TrimChar(const S: string; Ch: Char): string;
var
First, Last : integer;
begin
First := 1;
Last := Length(S);
while (First < Last) and (S[First] = Ch) do inc(First);
while (Last >= First) and (S[Last] = Ch) do Dec(Last);
Result := copy(S, First, Last-First+1);
end;
Delphi 2006 uses single byte ANSI characters and so string is AnsiString, Char is AnsiChar. On Delphi 2009 and later, two byte Unicode characters are used. This function cannot possibly work on both compilers.
Even the standard hack of using AnsiString and AnsiChar does not work. Most likely the assumptions that this function makes about the RTL implementation are no longer valid in modern Delphi.
I would re-write this function in Pascal and let the compiler do the work. Not only will that be the quickest way to solve your current problem, it will also get you over the hurdle of 64-bit compilation should you ever choose to tackle that.