I followed the subtraction procedure, but my subtraction does not produce the expected result.
payment BYTE ' Please enter the amount you paid : RM ',0
pay DWORD 0
billing BYTE ' Total Bill: RM ',0
bill DWORD 0
total BYTE 'Your return is : RM ',0
final DWORD 0
mov edx,OFFSET billing
call WriteString
mov eax,bill
call WriteDec
call Crlf
mov edx,OFFSET payment
call WriteString
call ReadDec
call Crlf
mov eax,pay
sub eax,bill
mov eax,final
mov edx,OFFSET total
call WriteString
mov eax,final
call WriteDec
call Crlf
call WaitMsg
How can I get the subtraction to work so that the correct change is displayed?
In Delphi, string <> '' seems to generate less code than Length(string) > 0.
Comparing for empty string, defined in TMyClass.UpdateString(const strMyString : String):
MyClassU.pas.31: begin
005CE6A0 55 push ebp
005CE6A1 8BEC mov ebp,esp
005CE6A3 83C4F8 add esp,-$08
005CE6A6 8955F8 mov [ebp-$08],edx
005CE6A9 8945FC mov [ebp-$04],eax
MyClassU.pas.32: if (strMyString <> '') then
005CE6AC 837DF800 cmp dword ptr [ebp-$08],$00
005CE6B0 740E jz $005ce6c0
As I understand it, this is comparing the address of the dynamically allocated string ([ebp-$08]) to zero. Makes sense, since empty strings point to nil.
Comparing for length, defined in TMyClass.UpdateString2(const strMyString : String):
MyClassU.pas.25: begin
005CE664 55 push ebp
005CE665 8BEC mov ebp,esp
005CE667 83C4F4 add esp,-$0c
005CE66A 8955F8 mov [ebp-$08],edx
005CE66D 8945FC mov [ebp-$04],eax
005CE670 8B45F8 mov eax,[ebp-$08]
MyClassU.pas.26: if (Length(strMyString) > 0) then
005CE673 8945F4 mov [ebp-$0c],eax
005CE676 837DF400 cmp dword ptr [ebp-$0c],$00
005CE67A 740B jz $005ce687
005CE67C 8B45F4 mov eax,[ebp-$0c]
005CE67F 83E804 sub eax,$04
005CE682 8B00 mov eax,[eax]
005CE684 8945F4 mov [ebp-$0c],eax
005CE687 837DF400 cmp dword ptr [ebp-$0c],$00
005CE68B 7E0E jle $005ce69b
What? Shouldn't it just be cmp dword ptr [ebp-$04],$00, as the string length is stored at offset -$04 within the string?
My guess is it's because optimizations were off and the compiler did not optimize Length (which boils down to PInteger(PByte(S) - 4)^), but I don't understand why there are two comparisons. In fact both comparisons are present even with optimizations turned on:
MyClassU.pas.27: if (Length(strMyString) > 0) then
005CE6B1 8BC6 mov eax,esi
005CE6B3 85C0 test eax,eax
005CE6B5 7405 jz $005ce6bc
005CE6B7 83E804 sub eax,$04
005CE6BA 8B00 mov eax,[eax]
005CE6BC 85C0 test eax,eax
005CE6BE 7E0A jle $005ce6ca
vs
MyClassU.pas.33: if (strMyString <> '') then
005CE6D9 85F6 test esi,esi
005CE6DB 740A jz $005ce6e7
The second block of code does more work, and not surprisingly that takes more code.
In the first block of code you simply compare against the empty string. The compiler knows that is equivalent to comparing the pointer against nil and generates that code.
The second block of code first obtains the length of the string. That involves checking whether the pointer is nil. If it is, then the length is zero. Otherwise the length is read from the string meta data record.
The compiler simply does not know that every time the pointer is not nil, the length must be positive and so is not able to optimise.
As for why Length doesn't read from the string record directly, that should be obvious now. An empty string is implemented as the nil pointer and so has no string record. In order to find the length you need to deal with two different cases:
String is empty, length is 0.
String is not empty, length is read from the string record.
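The two cases can be modelled in C. This is only a sketch, not Delphi's actual RTL code: the `StrRec` layout and the names `make_string`, `str_is_nonempty`, `str_length` and `free_string` are assumptions made up for illustration, based solely on the description above (length stored 4 bytes in front of the character data, empty string represented by nil).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header placed directly in front of the character data.
   Real Delphi headers may carry more fields; only the length at
   offset -4 matters for this sketch. */
typedef struct {
    int32_t ref_count;
    int32_t length;     /* ends up at (data - 4) */
} StrRec;

/* Allocate header + characters and return a pointer to the characters.
   An empty string is represented by NULL, as described in the answer. */
static char *make_string(const char *text) {
    size_t n = strlen(text);
    if (n == 0) return NULL;
    StrRec *rec = malloc(sizeof(StrRec) + n + 1);
    assert(rec != NULL);
    rec->ref_count = 1;
    rec->length = (int32_t)n;
    char *data = (char *)(rec + 1);
    memcpy(data, text, n + 1);
    return data;
}

/* Mirrors `s <> ''`: a single pointer test. */
static int str_is_nonempty(const char *s) {
    return s != NULL;
}

/* Mirrors Length(s): nil check first, then a load from data - 4. */
static int32_t str_length(const char *s) {
    if (s == NULL) return 0;
    int32_t len;
    memcpy(&len, s - sizeof(int32_t), sizeof len);
    return len;
}

/* Release a string created by make_string (NULL is allowed). */
static void free_string(char *s) {
    if (s != NULL) free((StrRec *)s - 1);
}
```

Note that `str_length` needs the branch and the load, while `str_is_nonempty` needs only the pointer test. That is the same difference visible in the two disassemblies.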
I am trying to learn inline assembly programming in Delphi, and to this end I have found this article highly helpful.
Now I wish to write an assembly function returning a long string, specifically an AnsiString (for simplicity). I have written
function myfunc: AnsiString;
asm
// eax = @result
mov edx, 3
mov ecx, 1252
call System.@LStrSetLength
mov [eax + 0], ord('A')
mov [eax + 1], ord('B')
mov [eax + 2], ord('C')
end;
Explanation:
A function returning a string has an invisible var result: AnsiString (in this case) parameter, so, at the beginning of the function, eax should hold the address of the resulting string. I then set edx and ecx to 3 and 1252, respectively, and then call System._LStrSetLength. In effect, I do
_LStrSetLength(@result, 3, 1252)
where 3 is the new length of the string (in characters = bytes) and 1252 is the standard windows-1252 codepage.
Then, knowing that eax is the address of the first character of the string, I simply set the string to "ABC". But it does not work - it gives me nonsense data or EAccessViolation. What is the problem?
Update
Now we have two seemingly working implementations of myfunc, one employing NewAnsiString and one employing LStrSetLength. I cannot help but wonder if both of them are correct, in the sense that they do not mess up Delphi's internal handling of strings (reference counting, automatic freeing, etc.).
You have to use something like:
function myfunc: AnsiString;
asm
push eax // save @result
call system.@LStrClr
mov eax,3 {Length}
{$ifdef UNICODE}
mov edx,1252 // code page for Delphi 2009/2010
{$endif}
call system.@NewAnsiString
pop edx
mov [edx],eax
mov [eax],$303132
end;
It will return a '210' string...
It's also always a good idea to add a {$ifdef UNICODE} block so that your code stays compatible with versions of Delphi prior to 2009.
With the excellent aid of A.Bouchez, I managed to correct my own code, employing LStrSetLength:
function myfunc: AnsiString;
asm
push eax
// eax = @result
mov edx, 3
mov ecx, 1252
call System.@LStrSetLength
pop eax
mov ecx, [eax]
mov byte ptr [ecx], 'A'
mov byte ptr [ecx + 1], 'B'
mov byte ptr [ecx + 2], 'C'
end;
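The correction comes down to one extra level of indirection: the hidden result parameter is the address of the string variable, not the address of the characters. Below is a rough C model of that idea. The names `set_length` and `myfunc` are hypothetical stand-ins, and the sketch deliberately ignores everything else the Delphi RTL does (string header, reference counting, code page).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the RTL routine: (re)allocates the buffer
   and stores the new pointer through `str`. No header, reference
   count or code page is modelled here. */
static void set_length(char **str, size_t len) {
    char *p = realloc(*str, len + 1);
    assert(p != NULL);
    p[len] = '\0';
    *str = p;
}

/* `result` plays the role of the hidden var parameter (eax on entry). */
static void myfunc(char **result) {
    set_length(result, 3);   /* the buffer may be allocated or moved here */
    char *chars = *result;   /* like `mov ecx, [eax]`: fetch the character pointer */
    chars[0] = 'A';
    chars[1] = 'B';
    chars[2] = 'C';
}
```

Calling `myfunc(&s)` with `char *s = NULL;` leaves `s` pointing at "ABC". Writing through `result` itself instead of `*result` would overwrite the pointer variable rather than the characters, which corresponds to the mistake in the first attempt.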
I am trying to see the IR of a very simple loop
for (int i = 0; i < 15; i++){
a[b[i]]++;
}
When I compile with -O0 and look at the .ll file, I can see the instructions written out step by step in the define i32 @main() function. When I compile with -O2, however, the define i32 @main() function contains only ret i32 0. In addition, some call instructions in the .ll file produced with -O0 have become tail call in the .ll file produced with -O2.
Can anyone give a rather detailed explanation on how llvm does the -O2 compilation? Thanks.
We can use the Compiler Explorer at godbolt.org to look at your example. We'll use the following testbench code:
int test() {
int a[15] = {0};
int b[15] = {0};
for (int i = 0; i < 15; i++){
a[b[i]]++;
}
return 0;
}
Godbolt shows the x86 assembly by default, not the LLVM IR, but I've summarized it a bit to show what's going on. Here it is at -O0 -m32:
test():
# set up stack
.LBB0_1:
cmp dword ptr [ebp - 128], 15 # i < 15?
jge .LBB0_4 # no? then jump out of loop
mov eax, dword ptr [ebp - 128] # load i
mov eax, dword ptr [ebp + 4*eax - 124] # load b[i]
mov ecx, dword ptr [ebp + 4*eax - 64] # load a[b[i]]
add ecx, 1 # increment it
mov dword ptr [ebp + 4*eax - 64], ecx # store it back
mov eax, dword ptr [ebp - 128]
add eax, 1 # increment i
mov dword ptr [ebp - 128], eax
jmp .LBB0_1 # repeat
.LBB0_4:
# tear down stack
ret
This looks like we'd expect: the loop is clearly visible and it does all the steps we listed. If we compile at -O1 -m32 -march=i386, we see the loop is still there but it's much simpler:
test(): # @test()
# set up stack
.LBB0_1:
mov ecx, dword ptr [esp + 4*eax] # load b[i]
inc dword ptr [esp + 4*ecx + 60] # increment a[b[i]]
inc eax # increment i
cmp eax, 15 # compare == 15
jne .LBB0_1 # no? then loop
# tear down stack
ret
Clang now uses the inc instruction (useful), has noticed that it can keep the loop counter i in the eax register (neat), and has moved the condition check to the bottom of the loop (probably better). We can still recognize our original code, though. Now let's try with -O2 -m32 -march=i386:
test():
xor eax, eax # return value = 0, nothing else is left
ret
That's it? Yes.
Clang has detected that the a array can never be used outside of the function. This means doing the incrementing will never affect any other part of the program, and also that nobody will miss it when it's gone.
Removing the increment leaves a for loop with an empty body and no side effects, which can also be removed. In turn, removing the loop leaves a function that is, for all intents and purposes, empty.
This empty function is likely what you were seeing in the LLVM IR (ret i32 0).
This is not a very scientific description, and the steps clang takes might be different, but I hope the example clears it up a bit. If you want, you can read up on the as-if rule. I also recommend playing around on https://godbolt.org/ for a bit: see what happens when you move a and b outside the function, for example.
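To try the suggestion about moving the arrays, one can put a local-array version next to a global-array version and compare the output at -O2. The names `test_local`, `test_global`, `g_a` and `g_b` are made up for this sketch. In the first function the stores cannot be observed, so the optimizer is free to drop the loop. In the second, `g_a` is visible to the rest of the program, so the effect of the increments has to be preserved.

```c
/* Arrays with external linkage: other translation units may read them. */
int g_a[15];
int g_b[15];

/* Same loop as in the question, on locals. Nothing can observe `a`
   after the return, so under the as-if rule the loop may be removed. */
int test_local(void) {
    int a[15] = {0};
    int b[15] = {0};
    for (int i = 0; i < 15; i++) {
        a[b[i]]++;
    }
    return 0;
}

/* Same loop on globals. The stores are observable, so the compiler
   must keep their effect (it may still simplify how it gets there). */
int test_global(void) {
    for (int i = 0; i < 15; i++) {
        g_a[g_b[i]]++;
    }
    return 0;
}
```

Both functions return 0, but only `test_global` leaves a visible trace: since `g_b` is all zeros, `g_a[0]` ends up as 15 after one call.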
I'm somewhat new to assembly language and wanted to understand how it works on an older system. I understand that the large memory model uses far pointers while the small memory model uses near pointers, and that the return address in the large model is 4 bytes instead of two, so the first parameter changes from [bp+4] to [bp+6]. However, in the process of adapting a graphics library from a small to a large model, there are other subtle things that I don't seem to understand. Running this code with a large memory model from C is supposed to clear the screen, but instead it hangs the system (it was assembled with TASM):
; void gr256cls( int color , int page );
COLOR equ [bp+6]
GPAGE equ [bp+8]
.MODEL LARGE,C
.186
public C gr256cls
.code
gr256cls PROC
push bp
mov bp,sp
push di
pushf
jmp skip_1
.386
mov ax,0A800h
mov es,ax
mov ax,0E000h
mov fs,ax
CLD
mov al,es:[bp+6]
mov ah,al
mov bx,ax
shl eax,16
mov ax,bx
cmp word ptr GPAGE,0
je short cls0
cmp word ptr GPAGE,2
je short cls0
jmp short skip_0
cls0:
mov bh,0
mov bl,1
call grph_cls256
skip_0:
cmp word ptr GPAGE,1
je short cls1
cmp word ptr GPAGE,2
je short cls1
jmp short skip_1
cls1:
mov bh,8
mov bl,9
call grph_cls256
skip_1:
.186
popf
pop di
pop bp
ret
.386
grph_cls256:
mov fs:[0004h],bh
mov fs:[0006h],bl
mov cx,16384
mov di,0
rep stosd
add word ptr fs:[0004h],2
add word ptr fs:[0006h],2
mov cx,16384
mov di,0
rep stosd
add word ptr fs:[0004h],2
add word ptr fs:[0006h],2
mov cx,16384
mov di,0
rep stosd
add word ptr fs:[0004h],2
add word ptr fs:[0006h],2
mov cx,14848 ;=8192+6656
mov di,0
rep stosd
;; Freezes here.
ret
gr256cls ENDP
end
It hangs at the ret at the end of grph_cls256. In fact, even if I immediately ret from the beginning of the function it still hangs right after. Is there a comprehensive list of differences when coding assembly in the two modes, so I can more easily understand what's happening?
EDIT: To clarify, this is the original source. This is not generated output; it's intended to be assembled and linked into a library.
I changed grph_cls256 to a procedure with PROC FAR and it now works without issue:
grph_cls256 PROC FAR
...
grph_cls256 ENDP
The issue had to do with how C expects functions to be called depending on the memory model. In the large memory model, all function calls are far. I hadn't declared the grph_cls256 subroutine as far before calling it, so the code that was assembled did not push and pop the right values on the stack.
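One way to picture a call/return mismatch of this kind is to model the stack as 16-bit words. The sketch below is a toy model, not an emulator: the `Cpu` struct and the helper names are invented for illustration, and it assumes the failing build paired a near call with a far return. A near call pushes only the return offset, while a far return pops an offset and then a segment.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a 16-bit stack. For simplicity the stack grows upward
   here and `sp` is the index of the next free slot. */
typedef struct {
    uint16_t stack[16];
    int sp;
    uint16_t cs, ip;
} Cpu;

static void push16(Cpu *c, uint16_t v) {
    assert(c->sp < 16);
    c->stack[c->sp++] = v;
}

static uint16_t pop16(Cpu *c) {
    assert(c->sp > 0);
    return c->stack[--c->sp];
}

/* Near call: pushes the return offset only. */
static void call_near(Cpu *c, uint16_t target) {
    push16(c, c->ip);
    c->ip = target;
}

/* Far call: pushes the return segment, then the return offset. */
static void call_far(Cpu *c, uint16_t seg, uint16_t target) {
    push16(c, c->cs);
    push16(c, c->ip);
    c->cs = seg;
    c->ip = target;
}

/* Far return: pops the offset, then the segment. */
static void ret_far(Cpu *c) {
    c->ip = pop16(c);
    c->cs = pop16(c);
}
```

In this model, `call_near` followed by `ret_far` restores the offset but loads CS with whatever word was pushed before the call (in the routine above that would be the flags saved by pushf), and the stack ends up one word short. `call_far` followed by `ret_far` restores both CS and IP and leaves the earlier word in place.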
I'm using the NASM assembler.
The value loaded into the eax register is supposed to be a character, but when I print its integer representation I get a value that looks like a memory address. I was expecting the decimal value of the letter: for example, if the character 'a' is moved to eax, I should see 97 printed. But this is not the case.
section .data
int_format db "%d", 0
;-----------------------------------------------------------
mov eax, dword[ebx + edx]
push eax
push dword int_format
call _printf ;prints a strange number
add esp, 8
xor eax, eax
mov eax, dword[ebx + edx]
push eax
call _putchar ;prints the correct character!
add esp, 4
So what gives here? Ultimately I want to compare the character, so it is important that eax holds the correct decimal value of the character.
mov eax, dword[ebx + edx]
You are loading a dword (32 bits) from the address ebx+edx. If you want a single character, you need to load a byte. For that, you can use the movzx instruction:
movzx eax, byte[ebx + edx]
This will load a single byte to the low byte of eax (i.e. al) and zero out the rest of the register.
Another option would be to mask out the extra bytes after loading the dword, e.g.:
and eax, 0ffh
or
movzx eax, al
As for putchar, it works because it interprets the passed value as char, i.e. it ignores the high three bytes present in the register and considers only the low byte.
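The difference between the three loads can be reproduced in C. This is a small sketch with made-up helper names (`load_dword`, `load_byte_zx`, `mask_low_byte`). The dword is assembled from bytes by hand so that the result matches little-endian x86 regardless of the machine the C code runs on.

```c
#include <stdint.h>

/* Emulates `mov eax, dword [p]` on little-endian x86: four bytes are
   combined, with the lowest address in the lowest bits. */
static uint32_t load_dword(const unsigned char *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Emulates `movzx eax, byte [p]`: one byte, upper 24 bits cleared. */
static uint32_t load_byte_zx(const unsigned char *p) {
    return (uint32_t)p[0];
}

/* Emulates `and eax, 0ffh` applied after a dword load. */
static uint32_t mask_low_byte(uint32_t eax) {
    return eax & 0xFFu;
}
```

For the bytes 'a', 'b', 'c', 'd', `load_dword` yields 0x64636261 (1684234849 in decimal), the kind of large number that can be mistaken for an address, while `load_byte_zx` and `mask_low_byte(load_dword(...))` both yield 97.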