I've run into a problem reading an arbitrary number of bits from a TBytes array. First I followed the advice from this question (the answer about using the ubitstream.pas unit): Handling arbitrary bit length data in Delphi?
My example:
binaryfeed (TBytes) = (255, 0, 0, 0, 0, 0, 6, 132, 1, 112, 128, 128, 130, 81);
I read 8 bits and get 255 - ok. Position is 8.
I read another 24 bits and get 0 - ok. Position is 32.
I read another 24 bits and get 393216 instead of 6. Position is 56.
393216 is 0000 0000 0000 0110 0000 0000 0000 0000
I can understand why it happened, but I can't figure out how to truncate these extra zero bits. Any ideas?
The issue is endianness.
00000110 00000000 00000000
is 393216. Reverse the three bytes and you have
00000000 00000000 00000110
which is 6.
The code that you are using is little endian, but you are hoping for big endian behaviour. You will need to change your code to account for this mismatch.
From what we have seen in the question though, you are always reading entire bytes and so there is no need for the complexity of the code you are using. You can operate at the byte level. Obviously you will still need to account for endianness correctly but that is not very challenging.
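For example, here is a minimal sketch of such a byte-level, big-endian reader. ReadBigEndian is my own hypothetical helper, not part of ubitstream.pas:

uses SysUtils;  { for TBytes }

{ Reads Count bytes (1..4) starting at Pos as a big-endian unsigned
  value, and advances Pos past them. }
function ReadBigEndian(const Buf: TBytes; var Pos: Integer;
  Count: Integer): Cardinal;
var
  i: Integer;
begin
  Result := 0;
  for i := 1 to Count do
  begin
    Result := (Result shl 8) or Buf[Pos];  { most significant byte first }
    Inc(Pos);
  end;
end;

With your feed, reading 1 byte yields 255, the next 3 bytes yield 0, and the following 3 bytes (0, 0, 6) yield 6, as you expected.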
First, think of your container. If you read bytes into a 32-bit integer, you should use shr/shl to move the bits into place.
It's not uncommon to shl the left-most bits you don't want out of the 32-bit range, then shr the rest back by the same number of positions to get the byte values (or word, or whatever size you need).
var
  value: integer;
begin
  { ReadInt32FromBuffer stands in for however you fetch the next
    32 bits from your buffer }
  value := (ReadInt32FromBuffer shl 8) shr 8;
  writeln(value.ToString);
end;
In the above code, we don't want the first 8 bits to be included, so we shift the whole thing 8 bits to the left, then 8 bits back. This zeroes out the bits as they move "off the edge" of the container.
Use SHL and SHR to move bits/bytes around inside your variable(s). But remember the size of the container. Word (16 bits), Integer (32 bits) etc.
You can also use AND and OR masks to achieve some of the same, although SHL and SHR are typically faster for variables with the same size as a CPU register (32 bit or 64 bit).
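For example, the double shift shown above is equivalent to a single AND mask that keeps only the low 24 bits (ReadInt32FromBuffer being the same placeholder as before):

value := ReadInt32FromBuffer and $00FFFFFF;  { zero the top 8 bits, keep the low 24 }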
I have a question that comes from a disagreement with a university professor about endianness. I couldn't find any way to resolve it and get the right answer other than asking and opening a discussion with the Stack Overflow community.
Let's say we have the number 0x11FF1 defined as an integer; for example, in C++ it would be int num = 0x11FF1. I say that the number will be laid out in memory on a little-endian machine as:
addr[0] is f1, addr[1] is 1f, addr[2] is 01, addr[3] is 00
in binary: 1111 0001 0001 1111 0000 0001 0000 0000
since the compiler considers 0x11ff1 as 0x00011ff1, where 00 is the first (most significant) byte and 01 is the second, and so on. For big endian, I believe it will look like:
addr[0] is 00, addr[1] is 01, addr[2] is 1f, addr[3] is f1
in binary: 0000 0000 0000 0001 0001 1111 1111 0001
But he has another opinion. He says:
Little Endian: [professor's image, not reproduced]
Big Endian: [professor's image, not reproduced]
Actually I don't see anything logical in his representation, so I hope the developer community can resolve this disagreement. Thanks in advance.
Your hex and binary numbers are correct.
Your (professor's?) French image for little-endian makes no sense at all: none of the three representations is consistent with either of the other two.
73713 is 0x11ff1 in hex, so there aren't any 0xFF bytes (binary 11111111).
In 32-bit little-endian, the bytes are F1 1F 01 00 in order of increasing memory address.
You can get that by taking pairs of hex digits (bytes / octets) from the low end of the full hex value, then filling with zeros once you've consumed the value.
It looks like they maybe padded the wrong side of the hex value with 0s to zero-extend to 32 bits as 0x11ff1000, not 0x00011ff1. Note these are full hex values of the whole number, not an attempt to break it down into separate hex bytes in any order.
But the hex and binary don't match each other; their binary ends with an all-ones byte, so it has FF as the high byte, not the 3rd byte. I didn't check if that matches their hex in PDP (mixed) endian.
They broke up their hex column into 4 byte-sized groups, which would seem to indicate that it's showing bytes in memory order. But that column is the same between their big- and little-endian images, so apparently that's not what they're doing, and they really did just extend it to 32 bits by left shifting (padding with zeros on the low side instead of the high side).
Also, the binary fields in the big- vs. little-endian images aren't the reverse of each other. To flip from big to little endian, you reverse the order of the bytes within the integer, keeping each byte value the same (like x86 bswap). Their 11111111 (FF) byte is 2nd in their big-endian version, but last in little-endian.
TL:DR: unfortunately, nothing about those images makes any sense that I can see.
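If you want to check this on your own machine, here is a minimal sketch (mine, not from the question; Delphi/Free Pascal rather than C++) that prints the bytes of 0x11FF1 in increasing address order:

program CheckEndian;
uses SysUtils;
var
  num: LongInt;
  bytes: array[0..3] of Byte absolute num;  { overlays num's storage }
  i: Integer;
begin
  num := $11FF1;
  for i := 0 to 3 do
    Write(IntToHex(bytes[i], 2), ' ');
  WriteLn;  { on a little-endian x86 machine this prints: F1 1F 01 00 }
end.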
They asked me to represent a set of char as a kind of "memory map". Which chars are in the set? The teacher told us to use ASCII codes, in a set of 32 bytes.
I have this example, the set {'A', 'B', 'C'}:
(The 7 comes from 0111)
= {00 00 00 00 00 00 00 00 70 00
00 00 00 00 00 00 00 00 00 00
00}
Sets in Pascal can be represented in memory with one bit for every element; if the bit is 1, the element is present in the set.
A "set of char" is the set of ASCII chars, where each element has an ordinal value from 0 to 255 (it should be 127 for ASCII, but this set is often extended up to a full byte, so there are 256 different characters).
Hence a "set of char" is represented in memory as a block of 32 bytes which contains a total of 256 bits. The character "A" (upper case A) has an ordinal value of 65. The integer division of 65 by 8 (the number of bits a byte can hold) gives 8, so the bit representing "A" in the set resides in byte number 8. 65 mod 8 gives 1, which is the second bit in that byte.
Byte number 8 will have the second bit ON for the character A (and the third bit for B, and the fourth for C). All three characters together give the binary representation 0000.1110 ($0E in hex).
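In general, for any character c the byte index is Ord(c) div 8 and the bit position within that byte is Ord(c) mod 8. A small sketch of that rule (the procedure is mine, and it assumes the LSB-first layout described above):

{ Tells which byte and which bit of the 32-byte bitmap hold character c. }
procedure SetBitPos(c: Char; var ByteIndex, BitIndex: Integer);
begin
  ByteIndex := Ord(c) div 8;  { 'A' (65) -> byte 8 }
  BitIndex  := Ord(c) mod 8;  { 'A' (65) -> bit 1, i.e. weight 2 }
end;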
To demonstrate this, I tried the following program with turbo pascal:
var
  ms : set of char;
  p  : array[0..31] of byte absolute ms;  { overlays the set's storage }
  i  : integer;
begin
  ms := ['A'..'C'];
  for i := 0 to 31 do begin
    if i mod 8 = 0 then writeln;
    write(i, '=', p[i], ' ');
  end;
  writeln;
end.
The program prints the value of all 32 bytes in the set, thanks to the "absolute" keyword. Other versions of Pascal can do it using different methods. Running the program gives this result:
0=0 1=0 2=0 3=0 4=0 5=0 6=0 7=0
8=14 9=0 10=0 11=0 12=0 13=0 14=0 15=0
16=0 17=0 18=0 19=0 20=0 21=0 22=0 23=0
24=0 25=0 26=0 27=0 28=0 29=0 30=0 31=0
where you see that the only byte different from 0 is byte number 8, and it contains 14 ($0E in hex, 0000.1110). So your guess (70) is wrong.
That said, I must add that nobody can state this is always true, because a set in Pascal is implementation dependent, so your answer could also be right. The representation used by Turbo Pascal (on DOS/Windows) is the most logical one, but this does not exclude other possible representations.
Given this data,
.data
Alpha WORD 0022h, 45h
Beta BYTE 56h
Gamma DWORD 4567h
Delta BYTE 23h
Assuming that the data segment begins at 0x00404000, can anyone verify how correct this table is?
Address Variable Data
00404000 Alpha 22
00404001 Alpha + 1 00
00404002 Alpha + 2 45
00404003 Beta 56
00404004 Gamma 67
00404005 Gamma+1 45
00404006 Delta 23
Impossible to answer without knowing the addressing of the processor in question (and how the assembler views the addressing). Nonetheless, you'd need a pretty unusual system for it to be correct.
Alpha is defined as having the type "word". You're showing the first word as allocating two bytes (fairly reasonable), but the second only one byte. This is much less reasonable: a word might be one byte or it might be two, but its size is normally going to at least be consistent.
For the moment, let's assume a word is two bytes, and a dword is four bytes. In that case, I'd expect something more like:
Alpha     22h
Alpha+1   00h
Alpha+2   45h
Alpha+3   00h
Beta      56h
Gamma     67h
Gamma+1   45h
Gamma+2   00h
Gamma+3   00h
Delta     23h
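As a cross-check, a packed record in Delphi/Free Pascal with the same fields produces that same byte layout (little-endian, and no padding because of packed). This sketch is mine, not from the question:

program DataLayout;
uses SysUtils;
type
  TData = packed record
    Alpha: array[0..1] of Word;  { WORD 0022h, 45h }
    Beta: Byte;                  { BYTE 56h }
    Gamma: LongWord;             { DWORD 4567h }
    Delta: Byte;                 { BYTE 23h }
  end;
var
  d: TData;
  p: array[0..SizeOf(TData)-1] of Byte absolute d;
  i: Integer;
begin
  d.Alpha[0] := $0022; d.Alpha[1] := $45;
  d.Beta := $56; d.Gamma := $4567; d.Delta := $23;
  for i := 0 to High(p) do
    Write(IntToHex(p[i], 2), ' ');
  WriteLn;  { prints: 22 00 45 00 56 67 45 00 00 23 }
end.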
The decimal number 128 is 10000000 in binary. Isn't this 8 bits? How come a byte's highest value is 127, then? Thank you!
In two's complement representation, you have to allow for negative numbers as well.
Eight bits will give you 256 distinct values, -128 thru 127 inclusive.
00000000 - 01111111 0 to 127
10000000 - 11111111 -128 to -1 (or 128 to 255 for unsigned).
Note that there are other encoding schemes, such as ones' complement or sign/magnitude, which have slightly different properties. Both of those have a positive and a negative zero, so the range is -127..127.
Counting is zero based - it starts at 0.
Hence 0 to 127 is 128 items and the maximum value is 127.
Note that this assumes that you are talking about signed 8 bit bytes/integers.
For unsigned 8 bit bytes/integers the maximum value that can be represented is 255 (0-255 is 256 items).
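A short sketch shows both interpretations of the same bit pattern, using Pascal's Byte (unsigned 8-bit) and ShortInt (signed 8-bit):

program ByteRange;
var
  u: Byte;      { unsigned 8-bit: 0..255 }
  s: ShortInt;  { signed 8-bit: -128..127 }
begin
  u := 128;          { bit pattern 10000000 }
  s := ShortInt(u);  { reinterpret the same bits as signed }
  WriteLn(u);        { prints 128 }
  WriteLn(s);        { prints -128: in two's complement, 10000000 = -128 }
end.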
I'm having a bit of an issue understanding what is going on here, and can't seem to wrap my head about it.
Notes: course notes about the topic.
Example:
Memory location 0x1f6
What is the binary format of this address? 1 1111 0110
What are tag, block index, and block offset? 3, 7, 6
My own work:
Memory location 0x033
What is the binary format of this address? 0 0011 0011
What are tag, block index, and block offset? 0, 6, 3
Memory location 0x009
What is the binary format of this address? 0 0000 1001
What are tag, block index, and block offset? 0, 1, 1
Memory location 0x652
What is the binary format of this address? 0110 0101 0010
What are tag, block index, and block offset? 12, 10, 2
These are my attempts, but I have not a clue if I'm doing it right, and I have a feeling that I am not, as least for the last one, which I believe is wrong. Can anyone point me in the right direction?
I ended up figuring it out. The block offset is determined by the block size, in this case 16 bytes, so it requires 4 binary digits. Next, the block index is determined by the number of blocks, in this case 8 (0-7), which requires 3 binary digits. Finally, the tag is made up of the remaining binary digits after you convert the hex memory location to binary.
Example
Memory location 0x652
What is the binary format of this address? 0110 0101 0010
What is the binary representation of tag, block index, and block offset? 01100 101 0010
What are tag, block index, and block offset? 12, 5, 2
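A small sketch that performs this decomposition (in Pascal, my own code), assuming the geometry above: 16-byte blocks give 4 offset bits, and 8 blocks give 3 index bits:

program CacheFields;
uses SysUtils;
const
  OffsetBits = 4;  { 16-byte blocks -> 4 offset bits }
  IndexBits  = 3;  { 8 blocks      -> 3 index bits }

procedure Decompose(Addr: Integer);
var
  Offset, Index, Tag: Integer;
begin
  Offset := Addr and ((1 shl OffsetBits) - 1);
  Index  := (Addr shr OffsetBits) and ((1 shl IndexBits) - 1);
  Tag    := Addr shr (OffsetBits + IndexBits);
  WriteLn(Format('$%x: tag=%d index=%d offset=%d', [Addr, Tag, Index, Offset]));
end;

begin
  Decompose($1F6);  { tag=3, index=7, offset=6 (the course example) }
  Decompose($652);  { tag=12, index=5, offset=2 }
end.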