I'm trying to reverse engineer some code that was used to read a hex file and a football roster file for Xbox 360.
Can someone who is familiar with Delphi help me understand what the below code is doing exactly? I believe it's grabbing offsets from a hex file and then creating pointers of some sort to pull the first and last name.
I've pasted the entire procedure in here, but I'm primarily focusing on the FirstName and LastName sections.
Thanks in advance for any help!
Javo
procedure TEditPlayerMain._PROC_00697780(Sender : TObject);
begin
(*
00697780 53 push ebx
00697781 8BD8 mov ebx, eax
00697783 33D2 xor edx, edx
* Reference to control TEditPlayerMain.FirstNameTxt : TcxTextEdit
|
00697785 8B83AC030000 mov eax, [ebx+$03AC]
0069778B 8B08 mov ecx, [eax]
* Possible reference to virtual method TcxTextEdit.OFFS_64
|
0069778D FF5164 call dword ptr [ecx+$64]
* Possible String Reference to: 'Multi'
|
00697790 BA40786900 mov edx, $00697840
* Reference to control TEditPlayerMain.FirstNameTxt : TcxTextEdit
|
00697795 8B83AC030000 mov eax, [ebx+$03AC]
* Reference to: Controls.TControl.SetText(TControl;TCaption);
|
0069779B E874CEDDFF call 00474614
006977A0 33D2 xor edx, edx
* Reference to control TEditPlayerMain.LastNameTxt : TcxTextEdit
|
006977A2 8B83A8030000 mov eax, [ebx+$03A8]
006977A8 8B08 mov ecx, [eax]
* Possible reference to virtual method TcxTextEdit.OFFS_64
|
006977AA FF5164 call dword ptr [ecx+$64]
* Possible String Reference to: 'Player'
|
006977AD BA50786900 mov edx, $00697850
* Reference to control TEditPlayerMain.LastNameTxt : TcxTextEdit
|
006977B2 8B83A8030000 mov eax, [ebx+$03A8]
* Reference to: Controls.TControl.SetText(TControl;TCaption);
|
006977B8 E857CEDDFF call 00474614
* Possible String Reference to: 'Multi Player'
|
006977BD BA60786900 mov edx, $00697860
* Reference to control TEditPlayerMain.lblPlayerName : TLabel
|
006977C2 8B8300030000 mov eax, [ebx+$0300]
* Reference to: Controls.TControl.SetText(TControl;TCaption);
|
006977C8 E847CEDDFF call 00474614
006977CD 33D2 xor edx, edx
* Reference to control TEditPlayerMain.cbJersey : TcxComboBox
|
006977CF 8B83A4030000 mov eax, [ebx+$03A4]
006977D5 8B08 mov ecx, [eax]
* Possible reference to virtual method TcxComboBox.OFFS_64
|
006977D7 FF5164 call dword ptr [ecx+$64]
006977DA 33D2 xor edx, edx
* Reference to control TEditPlayerMain.numLabel : TLabel
|
006977DC 8B8388030000 mov eax, [ebx+$0388]
006977E2 8B08 mov ecx, [eax]
* Reference to method TLabel.SetEnabled(Boolean)
|
006977E4 FF5164 call dword ptr [ecx+$64]
006977E7 33D2 xor edx, edx
* Reference to control TEditPlayerMain.lblJersey : TLabel
|
006977E9 8B8328030000 mov eax, [ebx+$0328]
* Reference to: Controls.TControl.SetText(TControl;TCaption);
|
006977EF E820CEDDFF call 00474614
006977F4 33D2 xor edx, edx
* Reference to control TEditPlayerMain.lblPosition : TLabel
|
006977F6 8B8324030000 mov eax, [ebx+$0324]
* Reference to: Controls.TControl.SetText(TControl;TCaption);
|
006977FC E813CEDDFF call 00474614
00697801 33D2 xor edx, edx
* Reference to control TEditPlayerMain.lblWeight : TLabel
|
00697803 8B8350030000 mov eax, [ebx+$0350]
* Reference to: Controls.TControl.SetText(TControl;TCaption);
|
00697809 E806CEDDFF call 00474614
0069780E 33D2 xor edx, edx
* Reference to control TEditPlayerMain.lblHeight : TLabel
|
00697810 8B834C030000 mov eax, [ebx+$034C]
* Reference to: Controls.TControl.SetText(TControl;TCaption);
|
00697816 E8F9CDDDFF call 00474614
0069781B 33D2 xor edx, edx
* Reference to control TEditPlayerMain.tsAttributes : TcxTabSheet
|
0069781D 8B8360030000 mov eax, [ebx+$0360]
* Reference to: ComCtrls.TCustomHeaderControl.SetHotTrack(TCustomHeaderControl;Boolean);
|
00697823 E83C48F0FF call 0059C064
00697828 33D2 xor edx, edx
* Reference to control TEditPlayerMain.tsAbilities : TcxTabSheet
|
0069782A 8B8368030000 mov eax, [ebx+$0368]
* Reference to: ComCtrls.TCustomHeaderControl.SetHotTrack(TCustomHeaderControl;Boolean);
|
00697830 E82F48F0FF call 0059C064
00697835 5B pop ebx
00697836 C3 ret
*)
end;
Disclaimer: I'm not particularly "familiar with Delphi". But I understand quite well what's happening, so do read on.
I believe it's (1) grabbing offsets from a hex file and then (2) creating pointers of some sort to (3) pull the first and last name.
That is entirely, completely, and utterly the wrong way to look at it. None of your 3 assumptions appear to be correct. Sorry.
There is only one occurrence of mov ebx, eax, and it's right at the top. Its purpose is to safeguard the initial value of eax, which is important in Delphi-assembly, as it holds a pointer to an instance of the class that calls this code.
Since eax gets used in other ways as well (typically, it will hold a return value from calling another function), the compiler stores this right away into another register -- here ebx, although the compiler is free to pick any free one, or even store it temporarily on a local stack.
Your class is called TEditPlayerMain, and this decompiler¹ recognizes ebx gets used as a Class in a couple of lines -- the ones with comments "Reference to control TEditPlayerMain.xxx". Those are class member variables that are referenced by name, and stored in the positions pointed to by ebx: [ebx+$034C], for instance, points to TEditPlayerMain.lblHeight.
Another typical use, which does not appear in this fragment, is that the initial value allows access to the Virtual Method Table of the class, the full list of functions that were defined for it. This table is pointed to by the first value, [ebx], and so code to 'call' the second method in the VMT would be something like
mov eax, [ebx]
call dword ptr [eax+4]
You can see variants of this code in your lines
006977A0 33D2 1. xor edx, edx
006977A2 8B83A8030000 2. mov eax, [ebx+$03A8]
006977A8 8B08 3. mov ecx, [eax]
006977AA FF5164 4. call dword ptr [ecx+$64]
1. Zero out edx -- this is the argument of the following function call.
2. Load eax with a pointer to a class object inside your class.
3. Load ecx with the VMT pointer for that object.
4. Call function #$64 of that object.
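As a rough illustration of that four-instruction pattern, here is a minimal C sketch of a call through a method table. The Control type, the two-slot table and the set_enabled method are all made up for the example. The listing does not say what sits at offset $64 for TcxTextEdit; the decompiler annotates the same offset as SetEnabled for TLabel, which is only a hint.

```c
#include <stddef.h>

/* Hypothetical control type. The first field points at the method table,
   which is why "mov ecx, [eax]" yields the table pointer in the listing. */
typedef struct Control Control;
typedef void (*Method)(Control *self, int arg);

struct Control {
    const Method *vmt;   /* fetched by: mov ecx, [eax] */
    int enabled;
};

static void no_op(Control *self, int arg) { (void)self; (void)arg; }
static void set_enabled(Control *self, int arg) { self->enabled = arg; }

/* Slot 1 stands in for the real byte offset $64; the numbering is invented. */
static const Method control_vmt[] = { no_op, set_enabled };

/* Builds a control that starts out enabled. */
Control make_control(void) {
    Control c = { control_vmt, 1 };
    return c;
}

/* Rough equivalent of:
     xor  edx, edx             ; argument = 0
     mov  ecx, [eax]           ; load the table pointer from the object
     call dword ptr [ecx+slot] ; indirect call, object still in eax */
void call_slot_with_zero(Control *obj, size_t slot) {
    const Method *table = obj->vmt;
    table[slot](obj, 0);
}
```

Calling call_slot_with_zero(&c, 1) on a fresh control flips its enabled field from 1 to 0, which is the kind of effect function_X(0) could have if it really is SetEnabled(False).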
Now where do you see references to "first and last names"? Only as class member variables:
Reference to control TEditPlayerMain.FirstNameTxt : TcxTextEdit
and
Reference to control TEditPlayerMain.LastNameTxt : TcxTextEdit
These are controls inside the window or dialog that happen to bear the names 'FirstNameTxt' and 'LastNameTxt'. Both are (apparently) TcxTextEdit fields, and are called first with some function_X(0), then
TControl.SetText(FirstNameTxt, 'Multi')
...
TControl.SetText(LastNameTxt,'Player')
I don't know what this dialog looks like, but I think it's two text edit fields with a header, and the headers are set to 'Multi' and 'Player'. The function_X(0) might very well be to clear the contents of the text field.
So this code does not 'grab offsets from a hex file' (all of that "hex" stuff is actually something defined inside this single class), it does not create pointers (it merely fetches some, to class member objects inside this single class), and it doesn't handle "first and last name", other than there happen to be some member variables named as such.
¹ But it's nicely commented output anyway. This doesn't seem native asm code created by Delphi itself, so which decompiler did you use?
In Delphi, string <> '' seems to generate less code than Length(string) > 0.
Comparing for empty string, defined in TMyClass.UpdateString(const strMyString : String):
MyClassU.pas.31: begin
005CE6A0 55 push ebp
005CE6A1 8BEC mov ebp,esp
005CE6A3 83C4F8 add esp,-$08
005CE6A6 8955F8 mov [ebp-$08],edx
005CE6A9 8945FC mov [ebp-$04],eax
MyClassU.pas.32: if (strMyString <> '') then
005CE6AC 837DF800 cmp dword ptr [ebp-$08],$00
005CE6B0 740E jz $005ce6c0
As I understand it, this is comparing the address of the dynamically allocated string ([ebp-$08]) to zero. Makes sense, since empty strings point to nil.
Comparing for length, defined in TMyClass.UpdateString2(const strMyString : String):
MyClassU.pas.25: begin
005CE664 55 push ebp
005CE665 8BEC mov ebp,esp
005CE667 83C4F4 add esp,-$0c
005CE66A 8955F8 mov [ebp-$08],edx
005CE66D 8945FC mov [ebp-$04],eax
005CE670 8B45F8 mov eax,[ebp-$08]
MyClassU.pas.26: if (Length(strMyString) > 0) then
005CE673 8945F4 mov [ebp-$0c],eax
005CE676 837DF400 cmp dword ptr [ebp-$0c],$00
005CE67A 740B jz $005ce687
005CE67C 8B45F4 mov eax,[ebp-$0c]
005CE67F 83E804 sub eax,$04
005CE682 8B00 mov eax,[eax]
005CE684 8945F4 mov [ebp-$0c],eax
005CE687 837DF400 cmp dword ptr [ebp-$0c],$00
005CE68B 7E0E jle $005ce69b
What? Shouldn't it just be cmp dword ptr [ebp-$04],$00, as the string length is stored at offset -$04 within the string?
My guess is it's because optimizations were off and the compiler did not optimize Length (which boils down to PInteger(PByte(S) - 4)^), but I don't understand why there are two comparisons. In fact both comparisons are present even with optimizations turned on:
MyClassU.pas.27: if (Length(strMyString) > 0) then
005CE6B1 8BC6 mov eax,esi
005CE6B3 85C0 test eax,eax
005CE6B5 7405 jz $005ce6bc
005CE6B7 83E804 sub eax,$04
005CE6BA 8B00 mov eax,[eax]
005CE6BC 85C0 test eax,eax
005CE6BE 7E0A jle $005ce6ca
vs
MyClassU.pas.33: if (strMyString <> '') then
005CE6D9 85F6 test esi,esi
005CE6DB 740A jz $005ce6e7
The second block of code does more work, and not surprisingly that takes more code.
In the first block of code you simply compare against the empty string. The compiler knows that is equivalent to comparing the pointer against nil and generates that code.
The second block of code first obtains the length of the string. That involves checking whether the pointer is nil. If it is, then the length is zero. Otherwise the length is read from the string meta data record.
The compiler simply does not know that every time the pointer is not nil, the length must be positive and so is not able to optimise.
As for why Length doesn't read from the string record directly, that should be obvious now. An empty string is implemented as the nil pointer and so has no string record. In order to find the length you need to deal with two different cases:
String is empty, length is 0.
String is not empty, length is read from the string record.
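The two cases can be sketched in C. This is a simplified model, not Delphi's actual string record: the StrRec layout (just a 32-bit length in front of the characters) and all names are assumptions made for the example; the real record carries further fields such as the reference count.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified, hypothetical string record: length stored right before the data. */
typedef struct {
    int32_t length;
    char chars[16];
} StrRec;

/* Returns the pointer a string variable would hold: the first character. */
const char *str_data(const StrRec *rec) {
    return (const char *)rec + offsetof(StrRec, chars);
}

/* Mirrors: test eax,eax / jz done / sub eax,4 / mov eax,[eax] */
int32_t str_length(const char *s) {
    int32_t len;
    if (s == NULL) {
        return 0;                     /* empty string: no record to read */
    }
    memcpy(&len, s - 4, sizeof len);  /* length lives 4 bytes before the data */
    return len;
}

/* Mirrors: test esi,esi / jz ... (a single pointer comparison) */
int str_is_empty(const char *s) {
    return s == NULL;
}
```

str_length needs the NULL branch before it can touch memory, while str_is_empty is a single comparison, which matches the difference in the generated code above.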
I am trying to learn inline assembly programming in Delphi, and to this end I have found this article highly helpful.
Now I wish to write an assembly function returning a long string, specifically an AnsiString (for simplicity). I have written
function myfunc: AnsiString;
asm
// eax = @result
mov edx, 3
mov ecx, 1252
call System.@LStrSetLength
mov [eax + 0], ord('A')
mov [eax + 1], ord('B')
mov [eax + 2], ord('C')
end;
Explanation:
A function returning a string has an invisible var result: AnsiString (in this case) parameter, so, at the beginning of the function, eax should hold the address of the resulting string. I then set edx and ecx to 3 and 1252, respectively, and then call System._LStrSetLength. In effect, I do
_LStrSetLength(@result, 3, 1252)
where 3 is the new length of the string (in characters = bytes) and 1252 is the standard windows-1252 codepage.
Then, knowing that eax is the address of the first character of the string, I simply set the string to "ABC". But it does not work - it gives me nonsense data or EAccessViolation. What is the problem?
Update
Now we have two seemingly working implementations of myfunc, one employing NewAnsiString and one employing LStrSetLength. I cannot help but wonder if both of them are correct, in the sense that they do not mess up Delphi's internal handling of strings (reference counting, automatic freeing, etc.).
You have to use some kind of:
function myfunc: AnsiString;
asm
push eax // save @result
call system.@LStrClr
mov eax,3 {Length}
{$ifdef UNICODE}
mov edx,1252 // code page for Delphi 2009/2010
{$endif}
call system.@NewAnsiString
pop edx
mov [edx],eax
mov [eax],$303132
end;
It will return a '210' string...
And it's always a good idea to add a {$ifdef UNICODE} block to keep your code compatible with versions of Delphi prior to 2009.
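Why the constant $303132 comes out as '210' and not '012' can be shown with a small C sketch of a little-endian dword store. The helper name store_le32 is invented for the example; it only models what the x86 mov does with the four bytes.

```c
#include <stdint.h>

/* Writes value as four bytes, least significant byte first,
   the way x86 lays out a 32-bit store in memory. */
void store_le32(unsigned char *dst, uint32_t value) {
    for (int i = 0; i < 4; i++) {
        dst[i] = (unsigned char)(value >> (8 * i));
    }
}
```

Storing 0x00303132 this way puts $32 ('2') at the lowest address, then $31 ('1'), then $30 ('0'), and finally a zero byte that doubles as the string terminator.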
With the excellent aid of A.Bouchez, I managed to correct my own code, employing LStrSetLength:
function myfunc: AnsiString;
asm
push eax
// eax = @result
mov edx, 3
mov ecx, 1252
call System.@LStrSetLength
pop eax
mov ecx, [eax]
mov [ecx], 'A'
mov [ecx] + 1, 'B'
mov [ecx] + 2, 'C'
end;
I'm somewhat new to assembly language and wanted to understand how it works on an older system. I understand that the large memory model uses far pointers while the small memory model uses near pointers, and that the return address in the large model is 4 bytes instead of two, so the first parameter changes from [bp+4] to [bp+6]. However, in the process of adapting a graphics library from a small to a large model, there are other subtle things that I don't seem to understand. Running this code with a large memory model from C is supposed to clear the screen, but instead it hangs the system (it was assembled with TASM):
; void gr256cls( int color , int page );
COLOR equ [bp+6]
GPAGE equ [bp+8]
.MODEL LARGE,C
.186
public C gr256cls
.code
gr256cls PROC
push bp
mov bp,sp
push di
pushf
jmp skip_1
.386
mov ax,0A800h
mov es,ax
mov ax,0E000h
mov fs,ax
CLD
mov al,es:[bp+6]
mov ah,al
mov bx,ax
shl eax,16
mov ax,bx
cmp word ptr GPAGE,0
je short cls0
cmp word ptr GPAGE,2
je short cls0
jmp short skip_0
cls0:
mov bh,0
mov bl,1
call grph_cls256
skip_0:
cmp word ptr GPAGE,1
je short cls1
cmp word ptr GPAGE,2
je short cls1
jmp short skip_1
cls1:
mov bh,8
mov bl,9
call grph_cls256
skip_1:
.186
popf
pop di
pop bp
ret
.386
grph_cls256:
mov fs:[0004h],bh
mov fs:[0006h],bl
mov cx,16384
mov di,0
rep stosd
add word ptr fs:[0004h],2
add word ptr fs:[0006h],2
mov cx,16384
mov di,0
rep stosd
add word ptr fs:[0004h],2
add word ptr fs:[0006h],2
mov cx,16384
mov di,0
rep stosd
add word ptr fs:[0004h],2
add word ptr fs:[0006h],2
mov cx,14848 ;=8192+6656
mov di,0
rep stosd
;; Freezes here.
ret
gr256cls ENDP
end
It hangs at the ret at the end of grph_cls256. In fact, even if I immediately ret from the beginning of the function it still hangs right after. Is there a comprehensive list of differences when coding assembly in the two modes, so I can more easily understand what's happening?
EDIT: To clarify, this is the original source. This is not generated output; it's intended to be assembled and linked into a library.
I changed grph_cls256 to a procedure with PROC FAR and it now works without issue:
grph_cls256 PROC FAR
...
grph_cls256 ENDP
The issue had to do with how C expects functions to be called depending on the memory model. In the large memory model, all function calls are far. I hadn't declared grph_cls256 as far when calling it, so the call and the ret were assembled with mismatched distances: most likely a near call, which pushes only the return offset, paired with a far ret, which pops both an offset and a segment.
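The effect of such a mismatch can be simulated with a toy 16-bit stack in C. This is only a model under the assumption that the broken version paired a near call with a far ret; the Stack16 type and all function names are invented for the illustration.

```c
#include <stdint.h>

/* Toy 16-bit stack: sp is a byte offset and the stack grows downward. */
typedef struct {
    uint16_t words[16];
    int sp;
} Stack16;

typedef struct { uint16_t cs, ip; } FarAddr;

void stack_init(Stack16 *s) { s->sp = (int)sizeof s->words; }

void push16(Stack16 *s, uint16_t v) {
    s->sp -= 2;
    s->words[s->sp / 2] = v;
}

uint16_t pop16(Stack16 *s) {
    uint16_t v = s->words[s->sp / 2];
    s->sp += 2;
    return v;
}

/* A near call pushes only the return offset (2 bytes). */
void call_near(Stack16 *s, uint16_t ip) { push16(s, ip); }

/* A far call pushes the segment, then the offset (4 bytes). */
void call_far(Stack16 *s, uint16_t cs, uint16_t ip) {
    push16(s, cs);
    push16(s, ip);
}

/* A far ret pops the offset, then the segment (4 bytes). */
FarAddr ret_far(Stack16 *s) {
    FarAddr a;
    a.ip = pop16(s);
    a.cs = pop16(s);
    return a;
}
```

With call_far followed by ret_far the stack pointer returns to where it started. With call_near followed by ret_far, the return pops one word too many: whatever the caller pushed last (in gr256cls that would be the flags saved by pushf) is taken as the code segment, so execution continues at a bogus address.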
According to "Using Assembler in Delphi", eax will contain Self. However, the content of eax is 0, as shown. I wonder what is wrong?
procedure TForm1.FormCreate(Sender: TObject);
var
X, Y: Pointer;
begin
asm
mov X, eax
mov Y, edx
end;
ShowMessage(IntToStr(NativeInt(X)) + ' ; ' + IntToStr(NativeInt(Y)));
end;
The code generated when I compile this, under debug settings, is like so:
begin
005A9414 55 push ebp
005A9415 8BEC mov ebp,esp
005A9417 83C4E4 add esp,-$1c
005A941A 33C9 xor ecx,ecx
005A941C 894DEC mov [ebp-$14],ecx
005A941F 894DE8 mov [ebp-$18],ecx
005A9422 894DE4 mov [ebp-$1c],ecx
005A9425 8955F0 mov [ebp-$10],edx
005A9428 8945F4 mov [ebp-$0c],eax
005A942B 33C0 xor eax,eax
005A942D 55 push ebp
005A942E 6890945A00 push $005a9490
005A9433 64FF30 push dword ptr fs:[eax]
005A9436 648920 mov fs:[eax],esp
mov X, eax
005A9439 8945FC mov [ebp-$04],eax
mov Y, edx
005A943C 8955F8 mov [ebp-$08],edx
When the code starts executing, eax is indeed the self pointer. But the compiler has chosen to save it away to ebp-$0c and then zeroise eax. That's really up to the compiler.
The code under release settings is quite similar. The compiler still chooses to zeroise eax. Of course, you cannot rely on the compiler doing that.
begin
005A82A4 55 push ebp
005A82A5 8BEC mov ebp,esp
005A82A7 33C9 xor ecx,ecx
005A82A9 51 push ecx
005A82AA 51 push ecx
005A82AB 51 push ecx
005A82AC 51 push ecx
005A82AD 51 push ecx
005A82AE 33C0 xor eax,eax
005A82B0 55 push ebp
005A82B1 6813835A00 push $005a8313
005A82B6 64FF30 push dword ptr fs:[eax]
005A82B9 648920 mov fs:[eax],esp
mov X, eax
005A82BC 8945FC mov [ebp-$04],eax
mov Y, edx
005A82BF 8955F8 mov [ebp-$08],edx
Remember that parameter passing defines the state of registers and stack when the function starts executing. What happens next, how the function decodes the parameters is down to the compiler. It is under no obligation to leave untouched the registers and stack that were used for parameter passing.
If you inject asm into the middle of a function, you cannot expect the volatile registers like eax to have particular values. They will hold whatever the compiler happened to put in them most recently.
If you want to examine the registers at the very beginning of the execution of the function, you need to use a pure asm function to be sure to avoid having the compiler modify the registers that were used for parameter passing:
procedure TForm1.FormCreate(Sender: TObject);
var
X, Y: Pointer;
asm
mov X, eax
mov Y, edx
// .... do something with X and Y
end;
The compiler will make its choices very much dependent on the code in the rest of the function. For your code, the complexity of assembling the string to pass to ShowMessage causes quite a large preamble. Consider this code instead:
type
TForm1 = class(TForm)
procedure FormCreate(Sender: TObject);
private
i: Integer;
function Sum(j: Integer): Integer;
end;
....
procedure TForm1.FormCreate(Sender: TObject);
begin
i := 624;
Caption := IntToStr(Sum(42));
end;
function TForm1.Sum(j: Integer): Integer;
var
X: Pointer;
begin
asm
mov X, eax
end;
Result := TForm1(X).i + j;
end;
In this case the code is simple enough for the compiler to leave eax alone. The optimised release build code for Sum is:
begin
005A8298 55 push ebp
005A8299 8BEC mov ebp,esp
005A829B 51 push ecx
mov X, eax
005A829C 8945FC mov [ebp-$04],eax
Result := TForm1(X).i + j;
005A829F 8B45FC mov eax,[ebp-$04]
005A82A2 8B80A0030000 mov eax,[eax+$000003a0]
005A82A8 03C2 add eax,edx
end;
005A82AA 59 pop ecx
005A82AB 5D pop ebp
005A82AC C3 ret
And when you run the code, the form's caption is changed to the expected value.
To be perfectly honest, inline assembly, placed as an asm block inside a Pascal function, is not very useful. The thing about writing assembly is that you need to fully understand the state of the registers and the stack. That state is well defined at the beginning and end of a function, where it is fixed by the ABI.
But in the middle of a function, that state depends entirely on the decisions made by the compiler. Injecting asm blocks in there requires you to know the decisions the compiler made. It also means that the compiler cannot understand the decisions that you made. This is usually impractical. Indeed for the x64 compiler Embarcadero banned such inline asm blocks. I personally have never used an inline asm block in my code. If ever I write asm I always write pure asm functions.
Just use push/pop to get the Self pointer, and then use properties freely, like this:
asm
push Self
pop edx //Now edx holds the pointer to Self
mov ecx, [edx].FItems //ecx = FItems
mov eax, [edx].FCount //eax = FCount
dec eax //test zero count!
js @Exit //if count was 0 then exit as -1
@Loop: //and so on...
......
This is my first time playing with ds, si and the string-related instructions in assembly. I am trying to read the command line arguments char by char, and this is what my code looks like now:
GetCommandLine:
push ebp
mov ebp, esp
push edi
push esi
call GetCommandLineW
mov edi, eax
mov esi, ebp
Parse:
lodsw
cmp ax, 0dh ; until return is found
jne Parse
pop esi
pop edi
pop ebp
ret
So, the GetCommandLineW function returns a correct pointer to the string. The problem is that the Parse section loops forever and I can't see the AX register being loaded with the correct next byte from the string. I think EDI:ESI is not correctly loaded or something.
esi and edi are different pointers. ebp is used for saving the old stack pointer, and for addressing local variables. GetCommandLineW returns the pointer in eax, which you should then put into esi. Since you're only using lodsw (and not stos*), you don't need to touch edi.
Why do you think that 0x0d is used in the command line? A normal zero-terminated string is returned, so you should look for the terminating 0. Because GetCommandLineW returns wide characters and lodsw loads 16 bits at a time, that means comparing ax against 0.
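The corrected loop can be sketched in C as a scan over 16-bit units that stops at the zero terminator. The function name count_units is made up, and a plain uint16_t array stands in for the pointer GetCommandLineW would return, so the sketch does not depend on Windows.

```c
#include <stddef.h>
#include <stdint.h>

/* Counts 16-bit units up to, but not including, the terminating zero word.
   Each iteration corresponds to: lodsw / test ax, ax / jnz Parse */
size_t count_units(const uint16_t *p) {
    size_t n = 0;
    while (*p++ != 0) {
        n++;
    }
    return n;
}
```

The source pointer plays the role of esi here: it starts at the value returned in eax and advances by one 16-bit unit per load.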