I'm currently working on code for the 8051 processor, and I'm trying to figure out better ways to store data in internal RAM without using so many bytes of ROM space. The data is mostly random, but sometimes some of the bytes are zero.
For example, I have a device connected to my chip that expects 7 bytes of data for processing. Let's say I want to send to that device the following data:
12h 34h 56h 41h 33h 77h 00h
The quick way for me to do this is something like the following:
DATABLOCK equ 30h
MOV DATABLOCK,#12h
MOV DATABLOCK+1,#34h
MOV DATABLOCK+2,#56h
MOV DATABLOCK+3,#41h
MOV DATABLOCK+4,#33h
MOV DATABLOCK+5,#77h
MOV DATABLOCK+6,#00h
I refer to the following website for instructions:
http://www.keil.com/support/man/docs/is51/is51_mov.htm
Based on my code, I need 21 bytes to store that in ROM: one byte for the MOV opcode, one byte for the destination (DATABLOCK+x) and one byte for the value (xxh), times 7 values = 21 bytes.
The problem is, I use this kind of sequence frequently because it's fast, but I have limited free space on the chip.
I thought of the following but it doesn't really help. Heck, I think it takes an EXTRA two bytes:
DATABLOCK equ 30h
MOV R0,#DATABLOCK
MOV @R0,#12h
INC R0
MOV @R0,#34h
INC R0
MOV @R0,#56h
INC R0
MOV @R0,#41h
INC R0
MOV @R0,#33h
INC R0
MOV @R0,#77h
INC R0
MOV @R0,#00h
Then I thought of this, which gets a bit crazy and, I think, takes even MORE bytes:
DATABLOCK equ 30h
mov R1,SP
MOV SP,#DATABLOCK-1
MOV A,#12h
PUSH ACC
MOV A,#34h
PUSH ACC
MOV A,#56h
PUSH ACC
MOV A,#41h
PUSH ACC
MOV A,#33h
PUSH ACC
MOV A,#77h
PUSH ACC
MOV A,#00h ;could use CLR A but what if value isn't 0h?
PUSH ACC
MOV SP,R1
Now the problem is I don't have that much ram either so I can't afford to keep the same values in ram for the entire program. If I could, I would implement code like this:
FIND12H equ 70h
FIND34H equ 60h
FIND56H equ 50h
FIND41H equ 55h
FIND33H equ 66h
FIND77H equ 22h
FIND00H equ 2Ah
DATABLOCK equ 30h
mov R1,SP
MOV SP,#DATABLOCK-1
PUSH FIND12H
PUSH FIND34H
PUSH FIND56H
PUSH FIND41H
PUSH FIND33H
PUSH FIND77H
PUSH FIND00H
MOV SP,R1
Now that code would only cost me maybe 12 bytes. That's a savings of about 9 bytes (over 1/3), but the problem is that my data consists of hard-coded values, not memory locations. If the following worked on the 8051, my question would be answered:
mov A,SP
MOV SP,#DATABLOCK-1
PUSH #12h
PUSH #34h
PUSH #56h
PUSH #41h
PUSH #33h
PUSH #77h
PUSH #0h
MOV SP,A
But for the PUSH instruction, the operand cannot be a hard-coded value.
So given all that I have presented, how can I use fewer ROM bytes to store data into internal memory?
"For example, I have a device connected to my chip that expects 7 bytes of data for processing." Why do you need the data in RAM? You can copy the data straight from ROM. For example, if you want to send the data via UART (untested):
sendData:
MOV DPTR,#constData ;pointer to data in ROM
MOV R0,#7 ;number of bytes to send
sendMore:
CLR A ;zero offset, because MOVC adds A to DPTR
MOVC A,@A+DPTR ;read byte from ROM into accumulator
MOV SBUF,A ;send byte via UART
JNB TI,$ ;wait for TI flag
CLR TI ;clear TI flag
INC DPTR ;increment pointer
DJNZ R0,sendMore ;decrement counter and jump if not zero
RET ;assumes sendData is called as a subroutine; keeps execution out of the table
constData: DB 12h,34h,56h,41h,33h,77h,00h ;constants stored in ROM
If you really want the data in RAM, you can use a memory copy routine which looks something like this:
ARRAY_START equ 30h
ARRAY_LEN equ 7h
MOV R0,#ARRAY_START ;set pointer to start of array
MOV DPTR,#SrcTable ;pointer to data in ROM
copyMore:
CLR A ;zero offset, because MOVC adds A to DPTR
MOVC A,@A+DPTR ;read byte from ROM into accumulator
MOV @R0,A ;store byte at RAM destination
INC DPTR ;increment source pointer
INC R0 ;increment destination pointer
CJNE R0,#ARRAY_START+ARRAY_LEN,copyMore ;loop until R0 passes the end of the array
RET ;assumes this is called as a subroutine; keeps execution out of the table
SrcTable: DB 12h,34h,56h,41h,33h,77h,00h ;constants stored in ROM
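To see where a table-driven copy starts to pay off, the ROM cost of each approach can be tallied from the standard 8051 instruction lengths. The C sketch below is only a back-of-the-envelope calculator: the function names are hypothetical, and the loop figure assumes the copy loop sits inline with the table stored elsewhere (no RET is counted).

```c
#include <stddef.h>

/* n x "MOV direct,#imm", 3 bytes each. */
size_t inline_mov_bytes(size_t n)
{
    return 3 * n;
}

/* "MOV R0,#imm" (2 bytes) + n x "MOV @R0,#imm" (2 bytes each)
   + (n - 1) x "INC R0" (1 byte each). */
size_t indirect_mov_bytes(size_t n)
{
    return n == 0 ? 2 : 2 + 2 * n + (n - 1);
}

/* Loop overhead: MOV R0,#imm (2) + MOV DPTR,#imm16 (3) + CLR A (1)
   + MOVC A,@A+DPTR (1) + MOV @R0,A (1) + INC DPTR (1) + INC R0 (1)
   + CJNE R0,#imm,rel (3) = 13 bytes, plus one table byte per value. */
size_t table_copy_bytes(size_t n)
{
    return 13 + n;
}
```

For the 7-byte block in the question this gives 21, 22 and 20 bytes respectively, so the table copy only just breaks even at this size; the advantage grows with longer blocks.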
I'm somewhat new to assembly language and wanted to understand how it works on an older system. I understand that the large memory model uses far pointers while the small memory model uses near pointers, and that the return address in the large model is 4 bytes instead of two, so the first parameter changes from [bp+4] to [bp+6]. However, in the process of adapting a graphics library from a small to a large model, there are other subtle things that I don't seem to understand. Running this code with a large memory model from C is supposed to clear the screen, but instead it hangs the system (it was assembled with TASM):
; void gr256cls( int color , int page );
COLOR equ [bp+6]
GPAGE equ [bp+8]
.MODEL LARGE,C
.186
public C gr256cls
.code
gr256cls PROC
push bp
mov bp,sp
push di
pushf
jmp skip_1
.386
mov ax,0A800h
mov es,ax
mov ax,0E000h
mov fs,ax
CLD
mov al,es:[bp+6]
mov ah,al
mov bx,ax
shl eax,16
mov ax,bx
cmp word ptr GPAGE,0
je short cls0
cmp word ptr GPAGE,2
je short cls0
jmp short skip_0
cls0:
mov bh,0
mov bl,1
call grph_cls256
skip_0:
cmp word ptr GPAGE,1
je short cls1
cmp word ptr GPAGE,2
je short cls1
jmp short skip_1
cls1:
mov bh,8
mov bl,9
call grph_cls256
skip_1:
.186
popf
pop di
pop bp
ret
.386
grph_cls256:
mov fs:[0004h],bh
mov fs:[0006h],bl
mov cx,16384
mov di,0
rep stosd
add word ptr fs:[0004h],2
add word ptr fs:[0006h],2
mov cx,16384
mov di,0
rep stosd
add word ptr fs:[0004h],2
add word ptr fs:[0006h],2
mov cx,16384
mov di,0
rep stosd
add word ptr fs:[0004h],2
add word ptr fs:[0006h],2
mov cx,14848 ;=8192+6656
mov di,0
rep stosd
;; Freezes here.
ret
gr256cls ENDP
end
It hangs at the ret at the end of grph_cls256. In fact, even if I immediately ret from the beginning of the subroutine, it still hangs right after. Is there a comprehensive list of differences when coding assembly in the two models, so I can more easily understand what's happening?
EDIT: To clarify, this is the original source. This is not generated output; it's intended to be assembled and linked into a library.
I changed grph_cls256 to a procedure declared with PROC FAR and it now works without issue:
grph_cls256 PROC FAR
...
grph_cls256 ENDP
The issue had to do with how calls and returns are assembled depending on the memory model. In the large memory model, all function calls are far, and a ret inside a far procedure is assembled as a far return. I hadn't stated this on the grph_cls256 subroutine, which was only a plain label, so the call to it and the ret at its end did not push and pop the same number of bytes on the stack.
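The stack mismatch can be modelled with a few lines of C. This is only a sketch of the bookkeeping, assuming 16-bit real mode where a near call/return moves 2 bytes (IP) and a far call/return moves 4 bytes (CS:IP); the function name is hypothetical.

```c
/* Bytes pushed by CALL and popped by RET in 16-bit real mode. */
enum { NEAR_BYTES = 2, FAR_BYTES = 4 };

/* Net number of bytes by which SP is off after one call/return pair.
   0 means the stack is balanced; anything else means the return
   consumed the wrong amount of data. */
int sp_imbalance(int call_is_far, int ret_is_far)
{
    int pushed = call_is_far ? FAR_BYTES : NEAR_BYTES;
    int popped = ret_is_far ? FAR_BYTES : NEAR_BYTES;
    return popped - pushed;
}
```

A near call paired with a far return gives an imbalance of 2: the return pops two extra bytes and loads them into CS, so execution continues at an unintended address, which would be consistent with the hang described.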
How do you perform operations like change the value at an absolute word address?
Say you have some value at 5DAh and you want to count the number of zeros on that address, or move a value from one absolute address to another. How can one do that?
Short Answer: You Can't
You might have a trick question in front of you (no clue, just my guess).
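The 8086 has no MOV instruction that copies directly from one memory location to another; the value has to pass through a register (or be copied with the string instruction MOVS).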
As for your two specific questions...
"...you want to count the number of zeros on that address..."
That's somewhat ambiguous, in fact so vague that I can't understand it.
"...move a value from one absolute address to another..."
Good question. We'll do this first in 16-bit code, then in 32-bit code.
16 bit example
Push Si ;source index register
Push Di ;destination index register
Push Ax ;We'll use this for the transfer
Lea Si, Where_The_Number_Is_Now ;You'll define this, somehow
Lea Di, Where_We_Want_It_To_Go ;You'll define this also, same thing
Mov Ax, Ds:[Si] ;The "Ds:" may or may not be needed, be safe
Mov Ds:[Di], Ax ;Probably do need "Ds:" for this instruction
Pop Ax ;Do pay attention to the reverse order
Pop Di ;...of popping the registers in exact
Pop Si ;...opposite of how they were pushed
; And you are done
32 bit example
Push Esi ;source index register
Push Edi ;destination index register
Push Eax ;We'll use this for the transfer
Lea Esi, Where_The_Number_Is_Now ;You'll define this, somehow
Lea Edi, Where_We_Want_It_To_Go ;You'll define this also, same thing
Mov Eax, Ds:[Esi] ;The "Ds:" may or may not be needed, be safe
Mov Ds:[Edi], Eax ;Probably do need "Ds:" for this instruction
Pop Eax ;Do pay attention to the reverse order
Pop Edi ;...of popping the registers in exact
Pop Esi ;...opposite of how they were pushed
; And you are done
To change a value at an absolute word address:
mov byte ptr [5dah], 0
...or...
mov word ptr [5dah], 0
To move a value from one absolute word address to another:
mov al, byte ptr [5dah]
mov byte ptr [1234h], al
...or...
mov ax, word ptr [5dah]
mov word ptr [1234h], ax
As for the other question, the one that asked how to count the number of zeros at that address, you were a little too vague.
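If "count the number of zeros" is read as counting the zero bits of the word stored at that address (an assumption, since the question does not say), the logic could look like the following C sketch. The value is passed in as a parameter rather than read from 5DAh, because dereferencing an absolute address is not portable in hosted C; the function name is hypothetical.

```c
#include <stdint.h>

/* Returns how many of the 16 bits in value are 0. */
unsigned count_zero_bits16(uint16_t value)
{
    unsigned zeros = 0;
    for (int bit = 0; bit < 16; ++bit) {
        /* Test each bit position in turn. */
        if (((value >> bit) & 1u) == 0)
            ++zeros;
    }
    return zeros;
}
```

In assembly the same loop is typically a shift through the carry flag with a conditional increment, repeated 16 times.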
This is my first time working with ds, si and the string instructions in assembly. I am trying to read the command-line arguments character by character, and this is what my code looks like now:
GetCommandLine:
push ebp
mov ebp, esp
push edi
push esi
call GetCommandLineW
mov edi, eax
mov esi, ebp
Parse:
lodsw
cmp ax, 0dh ; until return is found
jne Parse
pop esi
pop edi
pop ebp
ret
So, the GetCommandLineW function returns a correct pointer to the string. The problem is that the Parse section loops forever, and I never see the AX register being loaded with the next character from the string. I think EDI/ESI are not loaded correctly or something.
esi and edi are different pointers. ebp is used for saving the old stack pointer, and for saving/loading local variables. GetCommandLineW will return the pointer in eax, which you should then put into esi. Since you're only using lodsw (and not stos*), you don't need to touch edi.
Why do you think that 0x0d is used in the command line? GetCommandLineW returns a null-terminated wide string, so you should look for a zero word, i.e. compare ax with 0 after lodsw.
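The intended scan can be sketched in C as a loop over 16-bit units that stops at the zero terminator, which is what a corrected lodsw loop would do. The function name is hypothetical, and uint16_t stands in for the wide characters returned by GetCommandLineW.

```c
#include <stddef.h>
#include <stdint.h>

/* Counts 16-bit code units up to, but not including, the zero terminator. */
size_t wide_length(const uint16_t *s)
{
    size_t n = 0;
    while (s[n] != 0)   /* same test as "cmp ax, 0" after lodsw */
        ++n;
    return n;
}
```

Each iteration corresponds to one lodsw: load a word from the source pointer, advance by two bytes, and compare against zero.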
I have been trying to create a simple chunk of shell code that allows me to modify a string by doing something simple like changing a letter, then print it out.
_start:
jmp short ender
starter:
xor ecx, ecx ;clear out registers
xor eax, eax
xor ebx, ebx
xor edx, edx
pop esi ;pop address of string into esi register
mov byte [esi+1], 41 ;try and put an ASCII 'A' into the second letter of the string
;at the address ESI+1
mov ecx, esi ;move our string into the ecx register for the write syscall
mov al, 4 ;write syscall number
mov bl, 1 ;write to STDOUT
mov dl, 11 ;string length
int 0x80 ;interrupt call
ender:
call starter
db 'hello world'
It's supposed to print out "hallo world". The problem occurs (segfault) when I try and modify a byte of memory with a mov command like so.
mov byte [esi+1], 41
I ran the program through GDB and the pop esi instruction works correctly: esi is loaded with the address of the string and everything is valid. I can't understand why I can't modify a byte value at the valid address, though. I am testing this "shellcode" by just running the executable generated by NASM and ld; I am not putting it in a C program or anything yet, so the bug exists in the assembly.
Extra information
I am using x64 Linux with the following build commands:
nasm -f elf64 shellcode.asm -o shellcode.o
ld -o shellcode shellcode.o
./shellcode
I have pasted the full code here.
If I had to guess, I'd say db 'hello world' is being assembled as code rather than data, and as such has read and execute permissions but not write permission. So you're actually falling foul of page protection.
To change this, you need to place the string in a .data section and use NASM's label syntax to refer to it. Alternatively, to modify the data where it is, you need to make an mprotect call to make the pages writable before writing to them. Note: these changes won't persist back to the executable file; the file is mapped with mmap()'s MAP_PRIVATE, which ensures that.
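The effect of page permissions can be tried out from C before doing it in shellcode. The sketch below is Linux-specific and uses an anonymous mapping instead of the program's own code pages; the function name is hypothetical. It places the string in a page, makes the page read-only, then restores write access with mprotect before patching the byte.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Writes the patched string into out (at least 12 bytes).
   Returns 0 on success, -1 on failure. */
int patch_protected_string(char *out)
{
    static const char text[] = "hello world";
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    memcpy(p, text, sizeof text);

    int rc = -1;
    /* Drop write permission: a store to the page would now fault. */
    if (mprotect(p, (size_t)page, PROT_READ) == 0 &&
        /* Grant write permission again before modifying the byte. */
        mprotect(p, (size_t)page, PROT_READ | PROT_WRITE) == 0) {
        p[1] = 'a';
        memcpy(out, p, sizeof text);
        rc = 0;
    }
    munmap(p, (size_t)page);
    return rc;
}
```

In the shellcode itself the same step would be an mprotect system call on the page containing the string, with the address rounded down to a page boundary.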
I'm using the NASM assembler.
The value loaded into the eax register is supposed to be a character, but when I attempt to print its integer representation I get a value that looks like a memory address. I was expecting the decimal representation of the letter. For example, if the character 'a' was moved to eax, I should see 97 being printed (the decimal representation of 'a'). But this is not the case.
section .data
int_format db "%d", 0
;-----------------------------------------------------------
mov eax, dword[ebx + edx]
push eax
push dword int_format
call _printf ;prints a strange number
add esp, 8
xor eax, eax
mov eax, dword[ebx + edx]
push eax
call _putchar ;prints the correct character!
add esp, 4
So what gives here? Ultimately I want to compare the character, so it is important that eax gets the correct decimal representation of the character.
mov eax, dword[ebx + edx]
You are loading a dword (32 bits) from the address pointed to by ebx+edx. If you want a single character, you need to load a byte. For that, you can use the movzx instruction:
movzx eax, byte[ebx + edx]
This will load a single byte to the low byte of eax (i.e. al) and zero out the rest of the register.
Another option would be to mask out the extra bytes after loading the dword, e.g.:
and eax, 0ffh
or
movzx eax, al
As for putchar, it works because it interprets the passed value as char, i.e. it ignores the high three bytes present in the register and considers only the low byte.
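The difference between the two loads can be reproduced in C. This is a sketch with hypothetical function names: the first function mimics what mov eax, dword [addr] produces on little-endian x86, the second mimics movzx eax, byte [addr].

```c
#include <stdint.h>

/* Combines four consecutive bytes the way a little-endian dword load does. */
uint32_t load_dword_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Loads one byte and zero-extends it to 32 bits. */
uint32_t load_byte_zx(const unsigned char *p)
{
    return (uint32_t)p[0];
}
```

For the bytes of "abcd", the dword load yields 1684234849 (0x64636261), while the byte load yields 97. Masking the dword with 0xFF also gives 97, which matches the and eax, 0ffh alternative.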