Memory layout of a set - delphi

How is a set organized in memory in Delphi?
What I try to do is casting a simple type to a set type like
var
MyNumber : Word;
ShiftState : TShiftState;
begin
MyNumber:=42;
ShiftState:=TShiftState(MyNumber);
end;
Delphi (2009) won't allow this and I don't understand why. It would make my life a lot easier in cases where I get a number where single bits encode different enum values and I just could cast it like this. Can this be done?
One approach I was going to go for is:
var
ShiftState : TShiftState;
MyNumber : Word absolute ShiftState;
begin
MyNumber:=42;
end;
But before doing this I thought I'd ask for the memory layout. It's more a feeling than knowlege what I am having right now about this.

A Delphi set is a bit field who's bits correspond to the associated values of the elements in your set. For a set of normal enumerated types the bit layout is straight-forward:
bit 0 corresponds to set element with ordinal value 0
bit 1 corresponds to set element with ordinal value 1
and so on.
Things get a bit interesting when you're dealing with a non-contiguous set or with a set that doesn't start at 0. You can do that using Delphi's subrange type (example: set of 3..7) or using enumerated types that specify the actual ordinal value for the elements:
type enum=(seven=7, eight=8, eleven=11);
EnumSet = set of enum;
In such cases Delphi will allocate the minimum required amount of bytes that will include all required bits, but will not "shift" bit values to use less space. In the EnumSet example Delphi will use two bytes:
The first byte will have it's 7th bit associated with seven
The second byte will have bit 0 associated with eight
The second byte will have bit 3 associated with eleven
You can see some tests I did over here: Delphi 2009 - Bug? Adding supposedly invalid values to a set
Tests were done using Delphi 2010, didn't repeat them for Delphi XE.

Unfortunately I just now stumbled upon the following question: Delphi 2009 - Bug? Adding supposedly invalid values to a set
The accepted answer of Cosmin contains a very detailed description of what is going on with sets in Delphi. And why I better not used my approach with absolute. Apparently a set variable can take from 1 to 32 byte of memory, depending on the enum values.

You have to choose a correctly sized ordinal type. For me (D2007) your code compiles with MyNumber: Byte:
procedure Test;
var
MyNumber: Byte;
ShiftState: TShiftState;
begin
MyNumber := 42;
ShiftState := TShiftState(MyNumber);
end;
I have used this technique in some situations and didn't encounter problems.
UPDATE
The TShiftState type has been extended since Delphi 2010 to include two new states, ssTouch and ssPen, based on the corresponding doc page (current doc page). The Delphi 2009 doc still has TShiftState defined as a set of 7 states.
So, your attempt to convert Word to TShiftState would work in Delphi 2010+, but Byte is the right size for Delphi 2009.

I use this:
For <= 8 elements, PByte(#MyNumber)^, for <= 16 elements, PWord(#MyNumber)^, etc.
If the enum takes up more space (through the min enum size compiler option) this will still work.

Related

Delphi 64 bit internal data format / dynamic arrays

I am trying to figure out if there could be situations where using "packed array of [type]" could be advantageous if trying to save memory.
You can read that on Win32 bit, it is not the case:
http://docwiki.embarcadero.com/RADStudio/XE4/en/Internal_Data_Formats#Dynamic_Array_Types
Generally speaking, it would seem to me runtime code for dynamic arrays would be fasterif never using padding between items. But since I believe default alignment in Delphi/64bit is 8byte, it would seem to be there would be a a chance that you some day could not count on dynamic arrays containg e.g. booleans being "packed" by default
First of all I think you need to be careful how you read the documentation. It has not been updated consistently and the fact that some parts appear specific to Win32 can in no way be used to infer that Win64 differs. In fact, for the issues raised in your question, there are no substantive differences between 32 and 64 bit.
For Delphi, using packed on an array does not change the padding between successive elements. The distance between two adjacent elements of an array is equal to SizeOf(T) where T is the element type. This holds true irrespective of the use of the packed modifier. And using packed on an array does not even affect the alignment of the array.
So, to be completely clear and explicit, packed has no impact at all when used on arrays. The compiler treats a packed array in exactly the same way as it treats a non-packed array.
This statement seems to be at odds with the documentation. The documentation states:
Alignment of Structured Types
By default, the values in a structured type are aligned on word- or
double-word boundaries for faster access.
You can, however, specify byte alignment by including the reserved
word packed when you declare a structured type. The packed word
specifies compressed data storage. Here is an example declaration:
type TNumbers = packed array [1..100] of Real;
But consider this program:
{$APPTYPE CONSOLE}
type
TNumbers = packed array [1..100] of Real;
type
TRec = record
a: Byte;
b: TNumbers;
end;
begin
Writeln(Integer(#TRec(nil^).a));
Writeln(Integer(#TRec(nil^).b));
Readln;
end.
The output of which, using Delphi XE2, is:
0
8
So, using packed with arrays, in contradiction of the documentation, does not modify alignment of the type.
Note that the above program, when compiled by Delphi 6, has output
0
1
So it would appear that the compiler has changed, but the documentation has not caught up.
A related question can be found here: Are there any difference between array and packed array in Delphi? But note that the answer of Barry Kelly (an Embarcadero compiler engineer at the time he wrote the answer) does not mention the change in compiler behaviour.
Generally speaking, it would seem to me runtime code would be faster if never using padding between items.
As discussed above, for arrays there never is padding between array elements.
I believe default alignment in Delphi/64bit is 8 byte.
The alignment is determined by the data type and is not related to the target architecture. So a Boolean has alignment of 1. A Word has alignment of 2. An Integer has alignment of 4. A Double has alignment of 8. These alignment values are the same on 32 and 64 bit.
Alignment has an enormous impact on performance. If you care about performance you should, as a general rule, never pack data structures. If you wish to minimise the size of a structure (class or record), put the larger elements first to minimise padding. This could improve performance, but it could also worsen it.
But there is no single hard and fast rule. I can imagine that packing could improve performance if the thing that was packed was hardly ever accessed. And I can, more readily in fact, imagine that changing the order of elements in a structure to group related elements could improve performance.

Displaying the result of mmioRead

After locating the data chunk using mmioDescend, then how i suppose to read and display the sample data into for example into a memo in delphi 7?
I have follow the step like open the file, locating the riff, locating the fmt, locating data chunk.
if (mmioDescend(HMMIO, #ckiData, #ckiRIFF, MMIO_FINDCHUNK) = MMSYSERR_NOERROR) then
SetLength(buf, ckiData.cksize);
mmioRead(HMMIO, PAnsiChar(buf), ckiData.cksize);
I use mmioRead too but i don't know how to display the data.Can anyone help give an example how to use the mmioRead and then display the result?
Well, I'd probably read into a buffer that was declared using a more appropriate type.
For example, suppose your data are 16 bit integers, Smallint in Delphi. Then declare a dynamic array of Smallint.
var
buf: array of Smallint;
Then allocate enough space for the data:
Assert(ckiData.cksize mod SizeOf(buf[0])=0);
SetLength(buf, ckiData.cksize div SizeOf(buf[0]));
And then read the buffer:
mmioRead(HMMIO, PAnsiChar(buf), ckiData.cksize);
Now you can access the elements as Smallint values.
If you have different element types, then you can adjust your array declaration. If you don't know until runtime what the element type is you may be better off with array of Byte and then using pointer arithmetic and casting to access the actual content.
I'd say that the design of the interface to mmioRead is a little weak. The buffer isn't really a string. It's probably best considered as a byte array. But perhaps because C does not have separate byte and character types, the function is declared as taking a pointer to char array. Really the Delphi translation would be better exposing a pointer to byte or even better in my view, a plain untyped Pointer type.
I assumed that you were struggling with interpreting the output of mmioRead since that was the code that you included in the question. But, according to now deleted comments, your question is a GUI question.
You want to add content to a memo. Do it like this:
Memo1.Clear;
for i := low(buf) to high(buf) do
Memo1.Items.Add(IntToStr(buf[i]));
If you want to convert to floating point then, still assuming 16 bit signed data, do this:
Memo1.Clear;
for i := low(buf) to high(buf) do
Memo1.Items.Add(FormatFloat('0.00000', buf[i]/32768.0));//show 5dp

Delphi compile-time integer conversion warnings?

In Delphi XE or 2006, is there any way to detect at compile time that implicit conversions between integer types may lose data? I realize it's possible to detect this with runtime checking. I would want it to flag the following example even if the "big" value were 1. (We're considering changing int to bigint for certain database keys and want to determine the impact on a large legacy codebase.)
program Project1;
{$APPTYPE CONSOLE}
uses
SysUtils;
var
small: Integer;
big: Int64;
begin
big := 3000000000000;
small := big; // Detect me!
Writeln(small);
end.
You won't get any warnings or hints at compile time. The Delphi compiler does not do any program flow analysis which tells it that big contains a too large value when it is assigned to small. It silently truncates the value to make it fit in the smaller type. I tried with Shortint, a signed byte-sized type and even that did not give a warning or hint.
And there is no way to make Delphi warn you. It would be nice. Perhaps you can suggest it in QC (if it wasn't already suggested)?
In Delphi, even in Delphi 10.3 -- no.
But take a look into software calling 'Pascal Analyzer', mfg by www.peganza.com
They have a lot of options and one of them is (taken from software Help):
Source code for test, take a look into line #32:
Analyze result show possible bad assignment in line #32:

Does Delphi 5 have a sorted dictionary container?

I'm new to Delphi 5 and looking for a container (ideally a built-in one) that will do the same job as map does in C++ (i.e. a sorted dictionary). I've done a preliminary Google search but nothing obvious seems to be suggesting itself. Please can anyone point me in the right direction?
Take a look at our TDynArray wrapper.
It's a wrapper around any existing dynamic array (including dynamic array of records), which will add TList-like methods to the dynamic array. With even more features, like search, sorting, hashing, and binary serialization. You can set the array capacity with an external Count variable, so that insertion will be much faster.
type
TGroup: array of integer;
var
Group: TGroup;
GroupA: TDynArray;
i, v: integer;
Test: RawByteString;
begin
GroupA.Init(TypeInfo(TGroup),Group); // associate GroupA with Group
for i := 0 to 1000 do begin
v := i+1000; // need argument passed as a const variable
GroupA.Add(v);
end;
v := 1500;
if GroupA.IndexOf(v)<0 then // search by content
ShowMessage('Error: 1500 not found!');
for i := GroupA.Count-1 downto 0 do
if i and 3=0 then
GroupA.Delete(i); // delete integer at index i
Test := GroupA.SaveTo; // serialization into Test
GroupA.Clear;
GroupA.LoadFrom(Test);
GroupA.Compare := SortDynArrayInteger;
GroupA.Sort; // will sort the dynamic array according to the Compare function
for i := 1 to GroupA.Count-1 do
if Group[i]<Group[i-1] then
ShowMessage('Error: unsorted!');
v := 1500;
if GroupA.Find(v)<0 then // fast binary search
ShowMessage('Error: 1500 not found!');
This example uses an array of integer, but you may use it with an array of records, even containing strings, nested dynamic arrays, or other records.
Works from Delphi 5 up to XE.
I find it even easier to use than generics (e.g. for the auto-serialization feature). ;)
See http://blog.synopse.info/post/2011/03/12/TDynArray-and-Record-compare/load/save-using-fast-RTTI
Well if D5 doesn't contain THashedStringList then maybe this one (from OmniThreadLibrary) does the job:
https://github.com/gabr42/OmniThreadLibrary/blob/master/src/GpStringHash.pas
I cite:
Tested with Delphi 2007. Should work with older versions, too.
Well, your version is definitely older :)
This can be accomplished 'out of the box' in D5 using TList.Sort or TObjectList.Sort - in your case you'd probably want to implement a class along the lines of:
TMyClass
public
Index:integer;
Value:TWhatever;
...
end;
To store in your list, then implement TList.Sort or TObjectList.Sort using your index value for sorting. It will take a bit of work but not terrible. See the D5 help on these classes for implementation details.
Unfortunately there are no generics in D5 so you'll probably have to to a lot of type casting or develop redundant container classes for different types.
This brings back memories. There is another interesting technique that's suitable for ancient versions of Delphi like yours. Read on!
From your description you sound like you want a fairly generic container - ie, one you can use with a variety of types. This cries out for generics (yes, use a new Delphi!) But back in the old days, there used to be a slightly hacky way to implement templates / generics with Delphi pre-2009, using a series of defines and includes. It took a bit of googling, but here's an article on these 'generics' that's much like what I remember. It's from 2001; in those days Delphi 5 was still recent-ish.
The rough idea is this: write a class to do what you want (here, a map from key to value) for a type, and get it working. Then change that file to use a specific name for the type (TMyType, anything really) and strip the file so it's no longer a valid unit, but contains code only. (I think two partial files, actually: one for the interface section, one for implementation.) Include the contents of the file using {$include ...}, so your whole Pascal file is compiled using your definitions and then the contents of the other partial included files which uses those definitions. Neat, hacky, ugly? I don't know, but it works :)
The example article creates a typed object list (ie, a list not of TObject but TMemo, TButton, etc.) You end up with a file that looks like this (copied from the linked article):
unit u_MemoList;
interface
uses
Sysutils,
Classes,
Contnrs,
StdCtrls;
{$define TYPED_OBJECT_LIST_TEMPLATE}
type
_TYPED_OBJECT_LIST_ITEM_ = TMemo;
{$INCLUDE 't_TypedObjectList.tpl'}
type
TMemoList = class(_TYPED_OBJECT_LIST_)
end;
implementation
{$INCLUDE 't_TypedObjectList.tpl'}
end.
You will need to write your own map-like class, although you should be able to base it on this class. I remember that there used to be a set of 'generic' containers floating around the web which may have used this technique, and I would guess map is among them. I'm afraid I don't know where it is or who it was by, and Googling for this kind of thing shows lots of results for modern versions of Delphi. Writing your own might be best.
Edit: I found the same article (same text & content) but better formatted on the Embarcadero Developer Network site.
Good luck!

Delphi Unicode String Type Stored Directly at its Address (or "Unicode ShortString")

I want a string type that is Unicode and that stores the string directly at the adress of the variable, as is the case of the (Ansi-only) ShortString type.
I mean, if I declare a S: ShortString and let S := 'My String', then, at #S, I will find the length of the string (as one byte, so the string cannot contain more than 255 characters) followed by the ANSI-encoded string itself.
What I would like is a Unicode variant of this. That is, I want a string type such that, at #S, I will find a unsigned 32-bit integer (or a single byte would be enough, actually) containing the length of the string in bytes (or in characters, which is half the number of bytes) followed by the Unicode representation of the string. I have tried WideString, UnicodeString, and RawByteString, but they all appear only to store an adress at #S, and the actual string somewhere else (I guess this has do do with reference counting and such). Update: The most important reason for this is probably that it would be very problematic if sizeof(string) were variable.
I suspect that there is no built-in type to use, and that I have to come up with my own way of storing text the way I want (which actually is fun). Am I right?
Update
I will, among other things, need to use these strings in packed records. I also need manually to read/write these strings to files/the heap. I could live with fixed-size strings, such as <= 128 characters, and I could redesign the problem so it will work with null-terminated strings. But PChar will not work, for sizeof(PChar) = 1 - it's merely an address.
The approach I eventually settled for was to use a static array of bytes. I will post my implementation as a solution later today.
You're right. There is no exact analogue to ShortString that holds Unicode characters. There are lots of things that come close, including WideString, UnicodeString, and arrays of WideChar, but if you're not willing to revisit the way you intend to use the data type (make byte-for-byte copies in memory and in files while still being using them in all the contexts a string could be allowed), then none of Delphi's built-in types will work for you.
WideString fails because you insist that the string's length must exist at the address of the string variable, but WideString is a reference type; the only thing at its address is another address. Its length happens to be at the address held by the variable, minus four. That's subject to change, though, because all operations on that type are supposed to go through the API.
UnicodeString fails for that same reason, as well as because it's a reference-counted type; making a byte-for-byte copy of one breaks the reference counting, so you'll get memory leaks, invalid-pointer-operation exceptions, or more subtle heap corruption.
An array of WideChar can be copied without problems, but it doesn't keep track of its effective length, and it also doesn't act like a string very often. You can assign string literals to it and it will act like you called StrLCopy, but you can't assign string variables to it.
You could define a record that has a field for the length and another field for a character array. That would resolve the length issue, but it would still have all the rest of the shortcomings of an undecorated array.
If I were you, I'd simply use a built-in string type. Then I'd write functions to help transfer it between files, blocks of memory, and native variables. It's not that hard; probably much easier than trying to get operator overloading to work just right with a custom record type. Consider how much code you will write to load and store your data versus how much code you're going to write that uses your data structure like an ordinary string. You're going to write the data-persistence code once, but for the rest of the project's lifetime, you're going to be using those strings, and you're going to want them to look and act just like real strings. So use real strings. "Suffer" the inconvenience of manually producing the on-disk format you want, and gain the advantage of being able to use all the existing string library functions.
PChar should work like this, right? AFAIK, it's an array of chars stored right where you put it. Zero terminated, not sure how that works with Unicode Chars.
You actually have this in some way with the new unicode strings.
s as a pointer points to s[1] and the 4 bytes on the left contains the length.
But why not simply use Length(s)?
And for direct reading of the length from memory:
procedure TForm9.Button1Click(Sender: TObject);
var
s: string;
begin
s := 'hlkk ljhk jhto';
{$POINTERMATH ON}
Assert(Length(s) = (PInteger(s)-1)^);
//if you don't want POINTERMATH, replace by PInteger(Cardinal(s)-SizeOf(Integer))^
showmessage(IntToStr(length(s)));
end;
There's no Unicode version of ShortString. If you want to store unicode data inline inside an object instead of as a reference type, you can allocate a buffer:
var
buffer = array[0..255] of WideChar;
This has two disadvantages. 1, the size is fixed, and 2, the compiler doesn't recognize it as a string type.
The main problem here is #1: The fixed size. If you're going to declare an array inside of a larger object or record, the compiler needs to know how large it is in order to calculate the size of the object or record itself. For ShortString this wasn't a big problem, since they could only go up to 256 bytes (1/4 of a K) total, which isn't all that much. But if you want to use long strings that are addressed by a 32-bit integer, that makes the max size 4 GB. You can't put that inside of an object!
This, not the reference counting, is why long strings are implemented as reference types, whose inline size is always a constant sizeof(pointer). Then the compiler can put the string data inside a dynamic array and resize it to fit the current needs.
Why do you need to put something like this into a packed array? If I were to guess, I'd say this probably has something to do with serialization. If so, you're better off using a TStream and a normal Unicode string, and writing an integer (size) to the stream, and then the contents of the string. That turns out to be a lot more flexible than trying to stuff everything into a packed array.
The solution I eventually settled for is this (real-world sample - the string is, of course, the third member called "Ident"):
TASStructMemHeader = packed record
TotalSize: cardinal;
MemType: TASStructMemType;
Ident: packed array[0..63] of WideChar;
DataSize: cardinal;
procedure SetIdent(const AIdent: string);
function ReadIdent: string;
end;
where
function TASStructMemHeader.ReadIdent: string;
begin
result := WideCharLenToString(PWideChar(#(Ident[0])), length(Ident));
end;
procedure TASStructMemHeader.SetIdent(const AIdent: string);
var
i: Integer;
begin
if length(AIdent) > 63 then
raise Exception.Create('Too long structure identifier.');
FillChar(Ident[0], length(Ident) * sizeof(WideChar), 0);
Move(AIdent[1], Ident[0], length(AIdent) * sizeof(WideChar));
end;
But then I realized that the compiler really can interpret array[0..63] of WideChar as a string, so I could simply write
var
MyStr: string;
Ident := 'This is a sample string.';
MyStr := Ident;
Hence, after all, the answer given by Mason Wheeler above is actually the answer.

Resources