What's the representation of a set of char in PASCAL? - memory

I've been asked to represent a set of char as a "memory map", showing which chars are in the set. The teacher told us to use ASCII codes, with the set stored in 32 bytes.
I have this example, the set {'A', 'B', 'C'}:
(The 7 comes from 0111)
= {00 00 00 00 00 00 00 00 70 00
00 00 00 00 00 00 00 00 00 00
00}

Sets in Pascal can be represented in memory with one bit for every element; if the bit is 1, the element is present in the set.
A "set of char" is the set of ASCII chars, where each element has an ordinal value from 0 to 255 (strictly it would be 0 to 127 for ASCII, but the type is usually extended to a full byte, giving 256 different characters).
Hence a "set of char" is represented in memory as a block of 32 bytes, which holds a total of 256 bits. The character "A" (upper case A) has an ordinal value of 65. The integer division of 65 by 8 (the number of bits a byte can hold) gives 8, so the bit representing "A" in the set resides in byte number 8. 65 mod 8 gives 1, which is the second bit in that byte.
Byte number 8 will therefore have the second bit ON for the character A (the third bit for B, and the fourth for C). The three characters together give the binary pattern 0000.1110 ($0E in hex).
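The same div/mod arithmetic can be sketched in C (my own illustration, not part of the original answer; it simply mirrors the Turbo Pascal layout described above):
#include <stdio.h>

int main(void) {
    unsigned char set[32] = {0};            /* 32 bytes = 256 bits */
    const char *members = "ABC";

    for (const char *p = members; *p; ++p) {
        int ord = (unsigned char)*p;        /* 'A' = 65, 'B' = 66, 'C' = 67 */
        set[ord / 8] |= 1u << (ord % 8);    /* byte 8, bits 1..3 */
    }

    /* Dump the 32 bytes; only byte 8 is non-zero, with value 14 ($0E). */
    for (int i = 0; i < 32; ++i)
        printf("%d=%d%c", i, set[i], (i % 8 == 7) ? '\n' : ' ');
    return 0;
}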
To demonstrate this, I tried the following program with Turbo Pascal:
var
  ms : set of char;
  p  : array[0..31] of byte absolute ms;
  i  : integer;
begin
  ms := ['A'..'C'];
  for i := 0 to 31 do begin
    if i mod 8 = 0 then writeln;
    write(i, '=', p[i], ' ');
  end;
  writeln;
end.
The program prints the values of all 32 bytes in the set, thanks to the "absolute" keyword. Other versions of Pascal can do it using different methods. Running the program gives this result:
0=0 1=0 2=0 3=0 4=0 5=0 6=0 7=0
8=14 9=0 10=0 11=0 12=0 13=0 14=0 15=0
16=0 17=0 18=0 19=0 20=0 21=0 22=0 23=0
24=0 25=0 26=0 27=0 28=0 29=0 30=0 31=0
where you see that the only byte different from 0 is byte number 8, and it contains 14 ($0E in hex, 0000.1110). So your guess (70) is wrong.
That said, I must add that nobody can state this is always true, because the representation of a set in Pascal is implementation dependent; so your answer could also be right. The representation used by Turbo Pascal (on DOS/Windows) is the most logical one, but that does not exclude other possible representations.

Related

Indy 10 UdpClient and Open Sound Control

To control a Behringer X32 audio mixer, I have to send an OSC message like /ch/01/mix/fader ,f .3 to move a fader to 30%. The mixer, per the OSC protocol, is expecting the .3 to come in as a 4 character string - in hex it's 3E 99 99 9A. So special characters are involved.
TIdUDPClient is given the characters for 3E 99 99 9A, but it sends out 3E 3F 3F 3F. Likewise .4 wants to be 3E CC CC CD but 3E 3F 3F 3F is sent.
When you get up to .5 and greater, things work again as the characters are below 3F. For example, .6 should be 3F 19 99 9A and goes out as 3F 19 3F 3F.
Evidently the Behringer is only looking at the first two characters there.
I am using Delphi Rio with the Indy 10 version distributed with it. I can create a module in Lazarus with Lnet that works fine. But my main application is in Delphi where I need this ability. As you can see, I've tried several different ways with the same non-working result.
How do I send the proper characters?
procedure TCPForm1.OSCSendMsg;
var
  OutValueStr: String;
  I: Integer;
  J: TBytes;
  B1: TIdBytes;
begin
  if Length(CommandStr) > 0 then begin
    OscCommandStr := PadStr(CommandStr); // convert CommandStr to OSC string
    if TypeStr = '' then OscCommandStr := OscCommandStr + ',' + #0 + #0 + #0;
    if Length(TypeStr) = 1 then begin
      if TypeStr = 'i' then begin // parameter is an integer
        I := swapendian(IValue); // change to big endian
        OscCommandStr := OscCommandStr + ',' + TypeStr + #0 + #0 + IntToCharStr(I);
        OutValueStr := IntToStr(IValue);
      end;
      if TypeStr = 'f' then begin // parameter is a float (real)
        I := swapendian(PInteger(@FValue)^); // typecast & change to big endian
        //I := htonl(PInteger(@FValue)^); // typecast & change to big endian
        //J := MakeOSCFloat(FValue);
        OscCommandStr := OscCommandStr + ',' + TypeStr + #0 + #0 + IntToCharStr(I);
        //OscCommandStr := OscCommandStr + ',' + TypeStr + #0 + #0 + char(J[0]) + char(J[1]) + char(J[2]) + char(J[3]);
        OutValueStr := FloatToStr(FValue);
      end;
    end;
    //IdUDPClient2.Send(OSCCommandStr, IndyTextEncoding_UTF8);
    //IdUDPClient2.Send(OSCCommandStr);
    B1 := ToBytes(OSCCommandStr);
    IdUDPClient2.SendBuffer(B1);
    if loglevel > 0 then logwrite('OSC= ' + hexstr(OSCCommandStr));
    Wait(UDPtime);
    //if loglevel > 0 then logwrite('OSC ' + OSCCommandStr);
  end;
end;
function TCPForm1.IntToCharStr(I: Integer): String;
var
  CharStr: String;
  MyArray: array [0..3] of Byte;
  J: Integer;
begin
  for J := 0 to 3 do MyArray[J] := 0;
  Move(I, MyArray, 4); // copy the integer's 4 bytes into the byte array
  CharStr := '';
  for J := 0 to 3 do // convert array of byte to string
    CharStr := CharStr + char(MyArray[J]);
  IntToCharStr := CharStr;
end;
UPDATE:
The system would not let me add this as an answer, so...
Thank you, Remy. At least as far as the X32 software simulator is concerned, adding the 8 bit text encoding gives the correct response. I'll have to wait until tomorrow to test on the actual mixer in the theater. A byte array might be better if we had control of both ends of the communication. As it is, I can't change the X32 and it wants to get a padded string (in Hex: 2F 63 68 2F 30 31 2F 6D 69 78 2F 66 61 64 65 72 00 00 00 00 2C 66 00 00 3E CC CC CD) for the text string "/ch/01/mix/fader ,f .4". The documentation of messages the X32 responds to is a long table of similar messages with different parameters. e.g. "/ch/01/mix/mute on", "/bus/1/dyn/ratio ,i 2", etc. This is all in accordance with the Open Sound Control protocol.
As always, you are the definitive source of Indy wisdom, so, thank you. I'll edit this note after my results with the actual device.
UPDATE:
Confirmed that the addition of 8 bit text encoding to the Send command works with the X32. Cheers! A couple questions as a result of this:
Is one send construct preferred over the other?
Where should I have read/learned more about these details of Indy?
3F is the ASCII '?' character. You are seeing that character being sent when a Unicode character is encoded to a byte encoding that doesn't support that Unicode character. For example, Indy's default text encoding is US-ASCII unless you specify otherwise (via the GIdDefaultTextEncoding variable in the IdGlobal.pas unit, or via various class properties or method parameters), and US-ASCII does not support Unicode characters > U+007F.
It seems like you are dealing with a binary protocol, not a text protocol, so why are you using strings to create its messages? I would think byte arrays would make more sense.
At the very least, try using Indy's 8-bit text encoding (via the IndyTextEncoding_8Bit() function in the IdGlobal.pas unit) to convert Unicode characters U+0000..U+00FF to bytes 0x00..0xFF without data loss, eg:
B1 := ToBytes(OSCCommandStr, IndyTextEncoding_8Bit); // not ASCII or UTF8!
IdUDPClient2.SendBuffer(B1);
Or, in a single call:
IdUDPClient2.Send(OSCCommandStr, IndyTextEncoding_8Bit); // not ASCII or UTF8!
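For completeness, the question notes that .3 must arrive on the wire as 3E 99 99 9A. Here is a small C sketch (my own addition, not part of the answer above, and not Delphi) of how an OSC float argument maps onto those four big-endian bytes:
#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void) {
    float value = 0.3f;
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);   /* reinterpret the float's bits: 0x3E99999A */

    /* OSC sends the 4 bytes most-significant first, regardless of host byte order. */
    unsigned char wire[4] = {
        (unsigned char)(bits >> 24),
        (unsigned char)(bits >> 16),
        (unsigned char)(bits >> 8),
        (unsigned char)(bits)
    };

    printf("%02X %02X %02X %02X\n", wire[0], wire[1], wire[2], wire[3]);  /* 3E 99 99 9A */
    return 0;
}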

How does this code that tests for a system's endianness work?

So I'm trying to find out which endianness a system uses, in code. I looked on the net, found someone with the same question, and one of the answers on Stack Exchange had the following code:
int num = 1;
if (*(char *)&num == 1)
{
    printf("\nLittle-Endian\n");
}
else
{
    printf("Big-Endian\n");
}
But the person did not explain why this works, and I could not ask. What is the reasoning behind the following expression?
(*(char *)&num == 1)
I assume you are using C or C++.
&num takes the address in memory of the integer num.
The cast (char *) reinterprets that address as a pointer to a char.
Next, the leading asterisk in *(char *)&num dereferences that pointer, reading the single char it points to, and that value is compared to 1.
Now, an int here is 4 bytes. With the value 1 it is stored as 00 00 00 01 on a big-endian system and 01 00 00 00 on a little-endian system. A char is only one byte, so the cast reads just the first (lowest-addressed) byte of the memory occupied by num: 00 on the big-endian system and 01 on the little-endian one.
So the if statement simply checks whether the int, read through a char pointer, yields the byte that a little-endian system stores first.
On an x86 32-bit system this could compile to the following assembly:
mov [esp+20h+var_4], 1   ; store the value 1 at a memory address (our int num)
lea eax, [esp+20h+var_4] ; load that memory address into the eax register
mov al, [eax]            ; read the first byte at that address into al (al is 1 byte)
cmp al, 1                ; compare that single byte with the value 1
(*(char *)&num == 1)
Roughly translates as: take the address of the variable num, treat it as a pointer to char, and compare the byte it points to with 1.
If the character at the first (lowest) address of your integer is 1, then that is the low-order byte and your numbers are little-endian.
Big-endian numbers have the high-order byte (0 when the integer value is 1) at the lowest address, so the comparison fails.
We all know that type int occupies 4 bytes (on typical platforms) while char occupies only 1 byte.
That line just reads the integer through a char pointer; it is equivalent to: char c = "the lowest-addressed byte of num".
See the concept of endianness: http://en.wikipedia.org/wiki/Endianness
If the host machine is big-endian, the value of c will be 0, and 1 otherwise.
An example:
Suppose num is 0x12345678. On a big-endian machine c will be 0x12, whereas on a little-endian machine c will be 0x78.
Here goes. You assign the value 1 to a 4-byte int, so it is stored as
01 00 00 00 on little-endian x86
00 00 00 01 on big-endian
(*(char *)&num == 1)
&num gives the address of the int, but the cast to char * restricts the read (the dereference) to 1 byte (the size of a char).
If that first byte is 1, then the least significant byte came first and the machine is little-endian.
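A minimal sketch (my addition, not from any of the answers above) that makes the same check while dumping the raw bytes, so you can see which byte sits at the lowest address:
#include <stdio.h>
#include <string.h>

int main(void) {
    int num = 1;
    unsigned char bytes[sizeof num];
    memcpy(bytes, &num, sizeof num);      /* copy out the int's raw bytes */

    for (size_t i = 0; i < sizeof num; ++i)
        printf("%02X ", bytes[i]);        /* 01 00 00 00 on little-endian */
    printf("\n");

    printf("%s-Endian\n", bytes[0] == 1 ? "Little" : "Big");
    return 0;
}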

Assembly Language: Memory Bytes and Offsets

I am confused as to how memory is stored when declaring variables in assembly language. I have this block of sample code:
val1 db 1,2
val2 dw 1,2
val3 db '12'
From my study guide, it says that the total number of bytes required in memory to store the data declared by these three data definitions is 8 bytes (in decimal). How do I go about calculating this?
It also says that the offset into the data segment of val3 is 6 bytes and the hex byte at offset 5 is 00. I'm lost as to how to calculate these bytes and offsets.
Also, reading val1 into memory will produce 0102 but reading val3 into memory produces 3132. Are apostrophes represented by the 3 or where does it come from? How would val2 be read into memory?
You have two bytes, 0x01 and 0x02. That's two bytes so far.
Then you have two words, 0x0001 and 0x0002. That's another four bytes, making six to date.
Then you have two more bytes making up the characters of the string '12', which are 0x31 and 0x32 in ASCII (a). That's another two bytes, bringing the grand total to eight.
In little-endian format (which is what you're looking at here based on the memory values your question states), they're stored as:
offset value
------ -----
0 0x01
1 0x02
2 0x01
3 0x00
4 0x02
5 0x00
6 0x31
7 0x32
(a) The character set you're using in this case is the ASCII one (you can follow that link for a table describing all the characters in that set).
The byte values 0x30 thru 0x39 are the digits 0 thru 9, just as the bytes 0x41 thru 0x5A represent the upper-case alpha characters. The pseudo-op:
db '12'
is saying to insert the bytes for the characters '1' and '2'.
Similarly:
db 'Pax is a really cool guy',0
would give you the hex-dump representation:
addr +0 +1 +2 +3 +4 +5 +6 +7 +8 +9 +A +B +C +D +E +F +0123456789ABCDEF
0000 50 61 78 20 69 73 20 61 20 72 65 61 6C 6C 79 20 Pax is a really
0010 63 6F 6F 6C 20 67 75 79 00 cool guy.
val1 is two consecutive bytes, 1 and 2 (db means "define byte"). val2 is two consecutive words, i.e. 4 bytes, again 1 and 2; in memory they will be 1, 0, 2, 0, since x86 is a little-endian machine. val3 is a two-byte string: 0x31 and 0x32 (49 and 50 in decimal) are the ASCII codes for the characters "1" and "2".
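To cross-check that layout, here is a quick sketch in C (my own illustration, not from the answers) that declares the same data packed back to back and dumps the bytes:
#include <stdio.h>
#include <stdint.h>

/* Mimics:
     val1 db 1,2
     val2 dw 1,2
     val3 db '12'
   packed so the fields sit back to back, as the assembler lays them out. */
#pragma pack(push, 1)
struct data {
    uint8_t  val1[2];
    uint16_t val2[2];
    char     val3[2];
};
#pragma pack(pop)

int main(void) {
    struct data d = { {1, 2}, {1, 2}, {'1', '2'} };
    const unsigned char *p = (const unsigned char *)&d;

    printf("total size: %zu bytes\n", sizeof d);       /* 8 */
    for (size_t i = 0; i < sizeof d; ++i)
        printf("offset %zu: 0x%02X\n", i, p[i]);       /* 01 02 01 00 02 00 31 32 on x86 */
    return 0;
}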

Addressing memory data in 32 bit protected mode with nasm

So my book says I can define a table of words like so:
table: dw "13,37,99,99"
and that I can snatch values from the table by adding an offset to the table's address, like so:
mov ax, [table+2] ; should give me 37
but instead it places 0x2c33 in ax rather than 0x3337.
Is this because of a difference in system architecture? Maybe because the book is for the 386 and I'm running a 686?
0x2C is a comma , and 0x33 is the character 3, and they appear at positions 2 and 3 in your string, as expected. (I'm a little confused as to what you were expecting, since you first say "should give me 37" and later say "rather than 0x3337".)
You have defined a string constant when I suspect that you didn't mean to. The following:
dw "13,37,99,99"
Will produce the following output:
Offset 00 01 02 03 04 05 06 07 08 09 0A 0B
31 33 2C 33 37 2C 39 39 2C 39 39 00
Why? Because:
31 is the ASCII code for '1'
33 is the ASCII code for '3'
2C is the ASCII code for ','
...
39 is the ASCII code for '9'
NASM also appends the trailing 00 byte here because dw emits whole words, so the 11-character string is zero-padded to 12 bytes. (The padding comes from dw, not from the choice of quotes; single and double quotes behave identically in NASM.)
Take into account that ax holds two bytes and it should be fairly clear why ax contains 0x2C33.
I suspect what you wanted was more along the lines of this (no quotes and we use db to indicate we are declaring byte-sized data instead of dw that declares word-sized data):
db 13,37,99,99
This would still give you 0x6363, since ax loads the two bytes at offsets 2 and 3, and both of those 99s are 0x63 in hex. Not sure where you got 0x3337 from.
I recommend that you install yourself a hex editor and have an experiment inspecting the output from NASM.
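If the book actually meant a table of numeric words (no quotes at all), the behaviour the asker expected looks like this C sketch of the same memory (my own illustration, not part of the answer):
#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void) {
    /* Roughly what "table: dw 13,37,99,99" (no quotes) would declare:
       four 16-bit words, stored little-endian on x86. */
    uint16_t table[] = {13, 37, 99, 99};
    const unsigned char *bytes = (const unsigned char *)table;

    /* [table+2] in NASM is a byte offset, so it reads offsets 2..3,
       i.e. the second word, which here is 37 = 0x0025. */
    uint16_t ax;
    memcpy(&ax, bytes + 2, sizeof ax);
    printf("0x%04X\n", ax);   /* prints 0x0025 on x86 */
    return 0;
}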

Convert DeDe constant to valid declaration or other interface extraction tool?

I am using DeDe to create an API (Interface) I can compile to. (Strictly legit: while we wait for the vendor to deliver a D2010 version in two months, we can at least get our app compiling...)
We'll stub out all methods.
Dede emits constant declarations like these:
LTIMGLISTCLASS =
00: ÿÿÿÿ....LEADIMGL|FF FF FF FF 0D 00 00 00 4C 45 41 44 49 4D 47 4C|
10: IST32. |49 53 54 33 32 00|;
DS_PREFIX =
0: ÿÿÿÿ....DICM.|FF FF FF FF 04 00 00 00 44 49 43 4D 00|;
How would I convert these into a compilable statement?
In theory, I don't care about the actual values, since I doubt they're use anywhere, but I'd like to get their size correct. Are these integers, LongInts or ???
Any other hints on using DeDe would be welcome.
Those are strings. The first four bytes are the reference count, which for string literals is always -1 ($FFFFFFFF). The next four bytes are the character count. Then come the characters and a null terminator.
const
  LTIMGLISTCLASS = 'LEADIMGLIST32'; // 13 = $0D characters
  DS_PREFIX      = 'DICM';          // 4 = $04 characters
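To make the dump easier to read, here is a sketch (in C, with names of my own choosing) that parses the DS_PREFIX bytes exactly as that layout describes:
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* The string literals DeDe is dumping (pre-Unicode Delphi AnsiString):
   a -1 reference count, a length, then the characters and a trailing #0.
   Variable names here are illustrative, not Delphi's own. */
int main(void) {
    const unsigned char dump[] = {
        0xFF, 0xFF, 0xFF, 0xFF,      /* reference count: -1 for literals   */
        0x04, 0x00, 0x00, 0x00,      /* character count: 4 (little-endian) */
        'D', 'I', 'C', 'M', 0x00     /* the characters plus the terminator */
    };

    int32_t refcount, length;
    memcpy(&refcount, dump, 4);
    memcpy(&length, dump + 4, 4);

    printf("refcount=%d length=%d text=%s\n",
           refcount, length, (const char *)(dump + 8));   /* DICM */
    return 0;
}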
You don't have to "doubt" whether those constants are used anywhere. You can confirm it empirically. Compile your project without those constants. If it compiles, then they're not used.
If your project doesn't compile, then those constants must be used somewhere in your code. Based on the context, you can provide your own declarations. If the constant is used like a string, then declare a string; if it's used like an integer, then declare an integer.
Another option is to load your project in a version of Delphi that's compatible with the DCUs you have. Use code completion to make the IDE display the constant and its type.
