Reading and parsing the width/height property of a bitmap

I'm trying to write a bitmap (.bmp) parser/reader by reading raw bytes from the file and simply checking their values, and I've come across something I simply cannot wrap my mind around.
The image I'm trying to read is 512x512 pixels, and when I look at the width property (at 0x12 and 4 bytes onward) it says 00 02 00 00 (when viewed in a hex editor). I assume this is the same as the binary value 00000000 00000010 00000000 00000000. This somehow represents 512; I just cannot figure out the steps to get there.
So what I really need to know is: how are integers represented in binary, and how do I parse them correctly? Any help is much appreciated. :)

What you are seeing in your hex editor is actually right. Just remember that bytes are in little endian order, so the value is actually 00 00 02 00 = 0x0200 = 512.
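To make that concrete, here is a minimal C sketch of the decoding, assuming the usual BITMAPINFOHEADER layout (width at offset 0x12, height at 0x16, each a 4-byte little-endian signed integer); the file name is just a placeholder:

#include <stdio.h>
#include <stdint.h>

/* Assemble a 4-byte little-endian value: least significant byte first. */
static int32_t read_le32(const unsigned char *p)
{
    return (int32_t)((uint32_t)p[0] | ((uint32_t)p[1] << 8)
                   | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
}

int main(void)
{
    /* "image.bmp" is a placeholder path -- substitute your own file. */
    FILE *f = fopen("image.bmp", "rb");
    if (!f) { perror("fopen"); return 1; }

    unsigned char header[26];
    if (fread(header, 1, sizeof header, f) != sizeof header) {
        fprintf(stderr, "short read\n");
        fclose(f);
        return 1;
    }
    fclose(f);

    /* For the bytes 00 02 00 00 at offset 0x12 this gives 0x00000200 = 512. */
    int32_t width  = read_le32(header + 0x12);
    int32_t height = read_le32(header + 0x16);

    printf("width = %d, height = %d\n", width, height);
    return 0;
}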

Actually 0x200 in hex equals 512 in decimal. You may have the position of the width/height properties wrong.

Related

Little Endian vs. Big Endian architectures

I have a question about endianness that comes from a disagreement with a university professor. I could not find a way to settle it on my own, so I am asking the Stack Overflow community for the right answer.
Let's say we have the number 0x11FF1 defined as an integer, for example in C++: int num = 0x11FF1. I say that the number will be laid out in memory on a little-endian machine as:
addr[0] is f1 addr[1] is 1f addr[2] is 01 addr[3] is 00
in binary : 1111 0001 0001 1111 0000 0001 0000 0000
because the compiler treats 0x11ff1 as 0x00011ff1, whose bytes from most significant to least significant are 00, 01, 1f, f1. For big endian I believe it will look like:
addr[0] is 00 addr[1] is 01 addr[2] is 1f addr[3] is f1
in binary : 0000 0000 0000 0001 0001 1111 1111 0001
But he has another opinion. His answer is given as two images labelled "Little Endian" and "Big Endian" (not reproduced here), and I don't see anything logical in his representation, so I hope the developers here can resolve this disagreement. Thanks in advance.
Your hex and binary numbers are correct.
Your (professor's?) French image for little-endian makes no sense at all: none of the 3 representations is consistent with either of the other 2.
73713 is 0x11ff1 in hex, so there aren't any 0xFF bytes (binary 11111111).
In 32-bit little-endian, the bytes are F1 1F 01 00 in order of increasing memory address.
You can get that by taking pairs of hex digits (bytes / octets) from the low end of the full hex value, then fill with zeros once you've consumed the value.
It looks like they maybe padded the wrong side of the hex value with 0s to zero-extend to 32 bits as 0x11ff1000, not 0x00011ff1. Note these are full hex values of the whole number, not an attempt to break it down into separate hex bytes in any order.
But the hex and binary don't match each other; their binary ends with an all-ones byte, so it has FF as the high byte, not the 3rd byte. I didn't check if that matches their hex in PDP (mixed) endian.
They broke up their hex column into 4 byte-sized groups, which would seem to indicate that it's showing bytes in memory order. But that column is the same between their big- and little-endian images, so apparently that's not what they're doing, and they really did just extend it to 32 bits by left shifting (padding with low instead of high zero).
Also, the binary fields in the big- and little-endian images aren't byte-reversals of each other. To flip from big to little endian, you reverse the order of the bytes within the integer, keeping each byte's value the same (like x86 bswap). Their 11111111 (FF) byte is 2nd in their big-endian version, but last in little-endian.
TL:DR: unfortunately, nothing about those images makes any sense that I can see.
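If you want to check the asker's layout on a real machine, a small C sketch (assuming a 32-bit int, as in the question) prints the bytes of num in order of increasing address; on a little-endian machine such as x86 it prints f1 1f 01 00:

#include <stdio.h>

int main(void)
{
    int num = 0x11FF1;   /* zero-extended to 0x00011FF1 in a 32-bit int */
    unsigned char *p = (unsigned char *)&num;

    /* Walk the bytes in order of increasing memory address. */
    for (int i = 0; i < (int)sizeof num; i++)
        printf("addr[%d] = %02x\n", i, p[i]);

    return 0;
}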

Hexadecimal Convention in Memory

a super stupid question:
I have an integer in my code, which occupies 4 bytes (of course). In memory it is represented as a pack of four two-digit hexadecimal bytes, for example
int x = 1000
in memory is represented as
e8 03 00 00
where the first byte is the "lowest" (least significant) and the last is the "highest".
What is this representation called? Are there other representations? I just need the name; I'm struggling to find this information online :(
Thanks
The word you are looking for is Endianness.
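If it helps to see it in code, here is a minimal C sketch (assuming a 4-byte integer, as in the question) that prints the bytes of 1000 in memory order and reports which convention the machine uses:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t x = 1000;                      /* 0x000003E8 */
    unsigned char *bytes = (unsigned char *)&x;

    /* On a little-endian machine the least significant byte (0xE8)
       comes first in memory; on a big-endian machine 0x00 comes first. */
    printf("bytes in memory order: %02x %02x %02x %02x\n",
           bytes[0], bytes[1], bytes[2], bytes[3]);
    printf("this machine is %s-endian\n",
           bytes[0] == 0xE8 ? "little" : "big");
    return 0;
}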

What's the representation of a set of char in PASCAL?

They ask me to represent a set of char as a "memory map". Which chars are in the set? The teacher told us to use ASCII codes, in a set of 32 bytes.
I have this example, the set {'A', 'B', 'C'}
(The 7 comes from 0111)
= {00 00 00 00 00 00 00 00 70 00
00 00 00 00 00 00 00 00 00 00
00}
Sets in pascal can be represented in memory with one bit for every element; if the bit is 1, the element is present in the set.
A "set of char" is the set of ascii char, where each element has an ordinal value from 0 to 255 (it should be 127 for ascii, but often this set is extended up to a byte, so there are 256 different characters).
Hence a "set of char" is represented in memory as a block of 32 bytes which contain a total of 256 bits. The character "A" (upper case A) has an ordinal value of 65. The integer division of 65 by 8 (the number of bits a byte can hold) gives 8. So the bit representing "A" in the set resides in the byte number 8. 65 mod 8 gives 1, which is the second bit in that byte.
The byte number 8 will have the second bit ON for the character A (and the third bit for B, and the fourth for C). All the three characters together give the binary representation of 0000.1110 ($0E in hex).
To demonstrate this, I tried the following program with turbo pascal:
var
  ms : set of char;
  p  : array[0..31] of byte absolute ms;
  i  : integer;
begin
  ms := ['A'..'C'];
  for i := 0 to 31 do begin
    if i mod 8 = 0 then writeln;
    write(i, '=', p[i], ' ');
  end;
  writeln;
end.
The program prints the value of all 32 bytes in the set, thanks to the "absolute" keyword. Other versions of pascal can do it using different methods. Running the program gives this result:
0=0 1=0 2=0 3=0 4=0 5=0 6=0 7=0
8=14 9=0 10=0 11=0 12=0 13=0 14=0 15=0
16=0 17=0 18=0 19=0 20=0 21=0 22=0 23=0
24=0 25=0 26=0 27=0 28=0 29=0 30=0 31=0
where you see that the only byte different than 0 is the byte number 8, and it contains 14 ($0E in hex, 0000.1110). So, your guess (70) is wrong.
That said, I must add that nobody can state this is always true, because a set in pascal is implementation dependent; so your answer could also be right. The representation used by turbo pascal (on dos/windows) is the most logical one, but this does not exclude other possible representations.
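For comparison, here is the same byte/bit arithmetic the answer describes, written out in C purely as an illustration (this is just the bookkeeping, not a claim about how any particular Pascal compiler lays the set out):

#include <stdio.h>
#include <string.h>

int main(void)
{
    unsigned char set[32];                    /* 32 bytes = 256 bits, one per char */
    memset(set, 0, sizeof set);

    const char *members = "ABC";
    for (const char *p = members; *p; p++) {
        unsigned v = (unsigned char)*p;       /* 'A' = 65, 'B' = 66, 'C' = 67 */
        set[v / 8] |= (unsigned char)(1u << (v % 8));
    }

    /* Prints: byte 8 = 0e  (binary 0000.1110, as in the answer) */
    for (int i = 0; i < 32; i++)
        if (set[i] != 0)
            printf("byte %d = %02x\n", i, set[i]);
    return 0;
}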

Addressing memory data in 32 bit protected mode with nasm

So my book says I can define a table of words like so:
table: dw "13,37,99,99"
and that I can grab values from the table by adding an offset to the table's address, like so:
mov ax, [table+2] ; should give me 37
but instead it places 0x2c33 in ax rather than 0x3337
Is this because of a difference in system architecture? Maybe because the book is for a 386 and I'm running a 686?
0x2C is a comma (',') and 0x33 is the character '3', and they appear at positions 2 and 3 in your string, as expected. (I'm a little confused as to what you were expecting, since you first say "should give me 37" and later say "rather than 0x3337".)
You have defined a string constant when I suspect that you didn't mean to. The following:
dw "13,37,99,99"
Will produce the following output:
Offset:  00 01 02 03 04 05 06 07 08 09 0A 0B
Value:   31 33 2C 33 37 2C 39 39 2C 39 39 00
Why? Because:
31 is the ASCII code for '1'
33 is the ASCII code for '3'
2C is the ASCII code for ','
...
39 is the ASCII code for '9'
NASM also pads the string with a trailing 0 byte here, so that the data fills a whole number of the word-sized elements you declared with dw.
Take into account that ax holds two bytes, and it should be fairly clear why ax picks up the bytes 2C 33.
I suspect what you wanted was more along the lines of this (no quotes and we use db to indicate we are declaring byte-sized data instead of dw that declares word-sized data):
db 13,37,99,99
This would still give you 0x6363 in ax (the two bytes at offsets 2 and 3 are 99 and 99, i.e. 0x63 0x63). Not sure where you got 0x3337 from.
I recommend that you install a hex editor and experiment with inspecting the output from NASM.
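In the meantime, here is a small C sketch that simply mirrors the two layouts described above in ordinary arrays (the buffers are hand-filled stand-ins for what NASM would emit), showing which two bytes a word-sized load at [table+2] would pick up in each case:

#include <stdio.h>

int main(void)
{
    /* What dw "13,37,99,99" emits: the characters of the string plus a pad byte */
    unsigned char as_string[] = { 0x31, 0x33, 0x2C, 0x33, 0x37, 0x2C,
                                  0x39, 0x39, 0x2C, 0x39, 0x39, 0x00 };
    /* What db 13,37,99,99 emits: four numeric bytes */
    unsigned char as_bytes[]  = { 13, 37, 99, 99 };

    /* mov ax, [table+2] reads the two bytes at offsets 2 and 3 */
    printf("string layout, offsets 2-3: %02X %02X\n",
           as_string[2], as_string[3]);       /* 2C 33: ',' and '3' */
    printf("byte layout,   offsets 2-3: %02X %02X\n",
           as_bytes[2], as_bytes[3]);         /* 63 63: 99 and 99 */
    return 0;
}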

Convert DeDe constant to valid declaration or other interface extraction tool?

I am using DeDe to create an API (Interface) I can compile to. (Strictly legit: while we wait for the vendor to deliver a D2010 version in two months, we can at least get our app compiling...)
We'll stub out all methods.
Dede emits constant declarations like these:
LTIMGLISTCLASS =
00: ÿÿÿÿ....LEADIMGL|FF FF FF FF 0D 00 00 00 4C 45 41 44 49 4D 47 4C|
10: IST32. |49 53 54 33 32 00|;
DS_PREFIX =
0: ÿÿÿÿ....DICM.|FF FF FF FF 04 00 00 00 44 49 43 4D 00|;
How would I convert these into a compilable statement?
In theory, I don't care about the actual values, since I doubt they're used anywhere, but I'd like to get their size correct. Are these integers, LongInts or ???
Any other hints on using DeDe would be welcome.
Those are strings. The first four bytes are the reference count, which for string literals is always -1 ($ffffffff). The next four bytes are the character count. Then come the characters and a null terminator.
const
  LTIMGLISTCLASS = 'LEADIMGLIST32'; // 13 = $0D characters
  DS_PREFIX      = 'DICM';          // 4 = $04 characters
You don't have to "doubt" whether those constants are used anywhere. You can confirm it empirically. Compile your project without those constants. If it compiles, then they're not used.
If your project doesn't compile, then those constants must be used somewhere in your code. Based on the context, you can provide your own declarations. If the constant is used like a string, then declare a string; if it's used like an integer, then declare an integer.
Another option is to load your project in a version of Delphi that's compatible with the DCUs you have. Use code completion to make the IDE display the constant and its type.
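If you want to sanity-check one of those hex dumps yourself, here is a small C sketch (not Delphi code, just a way to decode the bytes DeDe prints, assuming a little-endian machine) that pulls the reference count, length and text out of the LTIMGLISTCLASS dump:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void)
{
    /* The raw bytes DeDe shows for LTIMGLISTCLASS */
    unsigned char dump[] = {
        0xFF, 0xFF, 0xFF, 0xFF,              /* reference count: -1 for literals */
        0x0D, 0x00, 0x00, 0x00,              /* character count: 13 */
        'L','E','A','D','I','M','G','L','I','S','T','3','2',
        0x00                                 /* null terminator */
    };

    int32_t refcount, length;
    memcpy(&refcount, dump, 4);
    memcpy(&length, dump + 4, 4);

    /* Prints: refcount = -1, length = 13, text = "LEADIMGLIST32" */
    printf("refcount = %d, length = %d, text = \"%.*s\"\n",
           refcount, length, (int)length, (const char *)dump + 8);
    return 0;
}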

Resources