Little Endian vs. Big Endian architectures - memory

I have a question that comes from a disagreement with a university professor about endianness. I could not find any way to settle it other than asking and opening a discussion in the Stack Overflow community.
Say we have the number 0x11FF1 defined as an integer, for example in C++: int num = 0x11FF1. I say that on a little-endian machine the number will be laid out in memory as:
addr[0] is f1 addr[1] is 1f addr[2] is 01 addr[3] is 00
in binary : 1111 0001 0001 1111 0000 0001 0000 0000
because the compiler treats 0x11ff1 as 0x00011ff1, with 00 as the 1st byte, 01 as the 2nd byte, and so on. For big endian I believe it will look like:
addr[0] is 00 addr[1] is 01 addr[2] is 1f addr[3] is f1
in binary : 0000 0000 0000 0001 0001 1111 1111 0001
but he has a different opinion. He says:
Little Endian: [image not included]
Big Endian: [image not included]
Actually, I don't see any logic in his representation, so I hope developers here can resolve this disagreement. Thanks in advance.

Your hex and binary numbers are correct.
Your (professor's?) French image for little-endian makes no sense at all; none of the 3 representations is consistent with either of the other 2.
73713 is 0x11ff1 in hex, so there aren't any 0xFF bytes (binary 11111111).
In 32-bit little-endian, the bytes are F1 1F 01 00 in order of increasing memory address.
You can get that by taking pairs of hex digits (bytes / octets) from the low end of the full hex value, then fill with zeros once you've consumed the value.
It looks like they maybe padded the wrong side of the hex value with 0s to zero-extend to 32 bits as 0x11ff1000, not 0x00011ff1. Note these are full hex values of the whole number, not an attempt to break it down into separate hex bytes in any order.
But the hex and binary don't match each other; their binary ends with an all-ones byte, so it has FF as the high byte, not the 3rd byte. I didn't check if that matches their hex in PDP (mixed) endian.
They broke up their hex column into 4 byte-sized groups, which would seem to indicate that it's showing bytes in memory order. But that column is the same in their big- and little-endian images, so apparently that's not what they're doing, and they really did just extend it to 32 bits by left shifting (padding with low zeros instead of high zeros).
Also, the binary fields in the big- and little-endian images aren't the byte-reverse of each other. To flip from big to little endian, you reverse the order of the bytes within the integer, keeping each byte value the same (like x86 bswap). Their 11111111 (FF) byte is 2nd in their big-endian version, but last in little-endian.
TL;DR: unfortunately, nothing about those images makes any sense that I can see.

Related

Hexadecimal Convention in Memory

A super stupid question:
I have an integer in my code, which occupies 4 bytes (of course). In memory it is represented as a group of four two-digit hexadecimal values; for example
int x = 1000
in memory is represented as
e8 03 00 00
where the first hex value represents the "lowest" byte and the last is the "highest".
What is this representation called? Are there other representations? I just need the name. I'm struggling to find this information online :(
Thanks
The word you are looking for is Endianness.

What's the representation of a set of char in PASCAL?

They ask me to represent a set of char as a "memory map". What chars are in the set? The teacher told us to use ASCII codes, in a set of 32 bytes.
I have this example, the set {'A', 'B', 'C'}
(The 7 comes from 0111)
= {00 00 00 00 00 00 00 00 70 00
00 00 00 00 00 00 00 00 00 00
00}
Sets in Pascal can be represented in memory with one bit for every element; if the bit is 1, the element is present in the set.
A "set of char" is a set of ASCII characters, where each element has an ordinal value from 0 to 255 (strictly, ASCII stops at 127, but the set is often extended to a full byte, so there are 256 different characters).
Hence a "set of char" is represented in memory as a block of 32 bytes which contain a total of 256 bits. The character "A" (upper case A) has an ordinal value of 65. The integer division of 65 by 8 (the number of bits a byte can hold) gives 8. So the bit representing "A" in the set resides in the byte number 8. 65 mod 8 gives 1, which is the second bit in that byte.
The byte number 8 will have the second bit ON for the character A (and the third bit for B, and the fourth for C). All the three characters together give the binary representation of 0000.1110 ($0E in hex).
To demonstrate this, I tried the following program with Turbo Pascal:
var
  ms : set of char;
  p  : array[0..31] of byte absolute ms;  { overlays the 32 bytes of ms }
  i  : integer;
begin
  ms := ['A'..'C'];
  for i := 0 to 31 do begin
    if i mod 8 = 0 then writeln;          { start a new row every 8 bytes }
    write(i, '=', p[i], ' ');
  end;
  writeln;
end.
The program prints the value of all 32 bytes in the set, thanks to the "absolute" keyword. Other versions of pascal can do it using different methods. Running the program gives this result:
0=0 1=0 2=0 3=0 4=0 5=0 6=0 7=0
8=14 9=0 10=0 11=0 12=0 13=0 14=0 15=0
16=0 17=0 18=0 19=0 20=0 21=0 22=0 23=0
24=0 25=0 26=0 27=0 28=0 29=0 30=0 31=0
where you see that the only byte different from 0 is byte number 8, and it contains 14 ($0E in hex, 0000.1110). So, your guess (70) is wrong.
That said, I must add that nobody can state this is always true, because the representation of a set in Pascal is implementation dependent; so your answer could also be right. The representation used by Turbo Pascal (on DOS/Windows) is the most logical one, but this does not exclude other possible representations.

ELF32 binary, little endian or not?

I know that printf("%08x") shows 4 octets of the stack (410484e4 for instance). Let's say that this value corresponds to the beginning of a char array (called tab); what would the value of tab[0] be: 41 ('A' in ASCII) or e4 (ä)?
Thank you
p.s: the executable I'm working on is an ELF 32 binary

How does this code that tests for a system's Endianness work?

So I'm trying to find out which endianness a system is using by using code. I looked on the net, and found someone with the same question, and one of the answers on Stack Exchange had the following code:
int num = 1;
if (*(char *)&num == 1)
{
    printf("\nLittle-Endian\n");
}
else
{
    printf("Big-Endian\n");
}
But the person did not explain why this works, and I could not ask. What is the reasoning behind the following code?
(*(char *)&num == 1)
I assume you are using C/C++
&num takes the address in memory of the integer num.
It interprets that address as a pointer to a char by the cast
(char *)
Next, the first asterisk in *(char *)&num dereferences this pointer to a char, and the resulting value is compared to 1.
Now assume int is 4 bytes (typical, though not guaranteed). The value 1 would be stored as 00 00 00 01 on a big-endian system and 01 00 00 00 on a little-endian system. A char is only one byte, so dereferencing the char pointer reads only the first byte of the memory occupied by num. On a big-endian system this is **00** 00 00 01 and on a little-endian system it is **01** 00 00 00.
So the if statement checks whether the first byte of the int, read as a char, matches the byte order used on a little-endian system.
On a 32-bit x86 system this could compile to the following assembly:
mov [esp+20h+var_4], 1 ; Moves the value of 1 to a memory address
lea eax, [esp+20h+var_4] ; Loads that memory address to eax register
mov al, [eax] ; Takes the first byte of the value pointed to by the eax register and move that to register al (al is 1 byte)
cmp al, 1 ; compares that one byte of register al to the value of 1
(*(char *)&num == 1)
Roughly translates as: take the address of variable num, treat it as a pointer to a character, read the character at that address, and compare its value with 1.
If the character at the first address of your integer is 1 then it's the low-order byte, and your numbers are Little-Endian.
Big-Endian numbers will have the high-order byte (0 if the integer value is 1) at the lowest address, so the comparison will fail.
We all know that type "int" typically occupies 4 bytes, while "char" is only 1 byte.
That line just reads one byte of the integer as a char. It is equivalent to: char c = (char)"the byte at the lowest address of num".
See the concept of endianness: http://en.wikipedia.org/wiki/Endianness
So if the host machine is big-endian, the value of c will be 0, and 1 otherwise.
An example:
Suppose num is 0x12345678; on big-endian machines c will be 0x12, whereas on little-endian machines c is 0x78.
Here goes. You assign the value 1 to a 4 byte int so
01 00 00 00 on little endian x86
00 00 00 01 on big endian
(*(char *)&num == 1)
&num gives the address of the int, but the cast to char* restricts the read (the dereference) to 1 byte (the size of char).
If the first byte is 1, then the least significant byte went first and it's little endian.

Reading and parsing the width/height property of a bitmap

I'm trying to write a bitmap (.bmp) parser/reader by reading raw bytes from the file and simply checking their values, and I've come across something I simply cannot wrap my mind around.
The image I'm trying to read is 512x512 pixels, and when I look at the width property (at 0x12 and 4 bytes onward) it says 00 02 00 00 (when viewed in a hex editor). I assume this is the same as the binary value 00000000 00000010 00000000 00000000. This somehow represents 512, I just cannot figure out the steps to get there.
So what I really need to know is how integers are represented in binary, and how to parse them correctly. Any help is much appreciated. :)
What you are seeing in your hex editor is actually right. Just remember that the bytes are stored in little-endian order, so written most significant byte first the value is 00 00 02 00 = 0x0200 = 512.
Actually 0x200 in hex equals 512 in decimal. You may have the position of the width/height properties wrong.

Resources