TStringList vs. TList<string> - delphi

what is the difference in using a standard
type
sl: TStringList
compared to using a generic TList
type
sl: TList<string>
?
As far as I can see, both behave exactly the same.
Is it just another way of doing the same thing?
Are there situations where one would be better than the other?
Thanks!

TStringList is a descendant of TStrings.
TStringList knows how to sort itself alphabetically.
TStringList has an Objects property.
TStringList doesn't make your code incompatible with all previous versions of Delphi.
TStringList can be used as a published property. (A bug prevents generic classes from being published, for now.)

TStringList has been around a long time in Delphi before generics were around. Therefore, it has built up a handful of useful features that a generic list of strings would not have.
The generics version is just creating a new type that is identical to TList that works on the type of String. (.Add(), .Insert(), .Remove(), .Clear(), etc.)
TStringList has the basic TList type methods and other methods custom to working with strings, such as .SaveToFile() and .LoadFromFile()
If you want backwards compatibility, then TStringList is definitely the way to go.
If you want enhanced functionality for working with a list of Strings, then TStringList is the way to go.
If you have some basic coding fundamentals that you want to work with a list of any type, then perhaps you need to look away from TStringList.

As TStringList is a descendant of TStrings it is compatible with the Lines property of TMemo, Items of TListbox and TComboBox and other VCL components.
So can use
cbList.Items := StringList; // internally calls TStrings.Assign

I'd probably say if you want backwards compatibility use TStringList, and if you want forward compatibility (perhaps the option to change that list of strings to say list of Int64s in the future) then go for TList.

From memory point of view TStringList memory usage is increased with the size of TObject pointer added to each item. TList memory usage is increased with the size of pointer added to each item. If is needed just an array of strings without searching, replacing, sorting or associative operations, a dynamic array (array of string) should be just enough. This lacks of a good memory management of TStringList or TList, but in theory should use less memory.

The TStringlist is one very versatile class of Delphi. I used (and abused ;-) ) its Objects property many times. It's very interesting to quickly translate a delimited string to a control like a TMemo and similar ones (TListBox, TComboBox, just to list a few).
I just don't like much TList, as TStringList satisfied my needs without needing of treating pointers (as Tlist is a list of Pointer values).
EDIT: I confused the TList(list of pointers) with TList (generic list of strings). Sorry for that. My point stands: TStringList is just a lot more than a simple list of strings.

For most purposes that TStringList has been abused in the past, TObjectDictionary is better - it's faster and doesn't need sorting.
If you need a TStrings object (generally for UI stuff, since the VCL doesn't use generics much even for XE5) use TStringList - the required casting from TObject is annoying but not a showstopper.

TStringList has been used for far too long and has many advantages, all mentioned by Rob Kennedy.
The only real disadvantage of using it as a pair of a string and an object is the necessity of casting object to the actual type expected and stored in this list (when reading) and as far as I know Embarcadero did not provide Delphi 2009 and up VCL libraries with generic version of TStringList.
To overcome this limitation I implemented such list for internal use and for almost 3 years it serves it's purpose so I decided to share it today: https://github.com/t00/deltoo#tgenericstringlist
One important note - it changes the default property from Strings to Objects as in most cases when object is stored in a list it is also the mostly accessed property of it.

Related

Delphi object without Create? [duplicate]

In Delphi sane people use a class to define objects.
In Turbo Pascal for Windows we used object and today you can still use object to create an object.
The difference is that a object lives on the stack and a class lives on the heap.
And of course the object is depreciated.
Putting all that aside:
is there a benefit to be had, speed wise by using object instead of class?
I know that object is broken in Delphi 2009, but I've got a special use case1) where speed matters and I'm trying to find if using object will make my thing faster without making it buggy
This code base is in Delphi 7, but I may port it to Delphi 2007, haven't decided yet.
1) Conway's game of life
Long comment
Thanks all for pointing me in the right direction.
Let me explain a bit more. I'm trying to do a faster implementation of hashlife, see also here or here for simple sourcecode
The current record holder is golly, but golly uses a straight translation of Bill Gospher original lisp code (which is brilliant as an algorithm, but not optimized at the micro level at all). Hashlife enables you to calculate a generation in O(log(n)) time.
It does this by using a space/time trade off. And for this reason hashlife needs a lot of memory, gigabytes are not unheard of. In return you can calculate generation 2^128 (340282366920938463463374607431770000000) using generation 2^127 (170141183460469231731687303715880000000) in o(1) time.
Because hashlife needs to compute hashes for all sub-patterns that occur in a larger pattern, allocation of objects needs to be fast.
Here's the solution I've settled upon:
Allocation optimization
I allocate one big block of physical memory (user settable) lets say 512MB. Inside this blob I allocate what I call cheese stacks. This is a normal stack where I push and pop, but a pop can also be from the middle of the stack. If that happens I mark it on the free list (this is a normal stack). When pushing I check the free list first if nothing is free I push as normal. I'll be using records as advised it looks like the solution with the least amount of overhead.
Because of the way hashlife works, very little popping takes place and a lot of pushes. I keep separate stacks for structures of different sizes, making sure to keep memory access aligned on 4/8/16 byte boundaries.
Other optimizations
recursion removal
cache optimization
use of inline
precalculation of hashes (akin to rainbow tables)
detection of pathological cases and use of fall-back algorithm
use of GPU
For using normal OOP programming, you should always use the class kind. You'll have the most powerful object model in Delphi, including interface and generics (in later Delphi versions).
1. Records, pointers and objects
Records can be evil (slow hidden copy if you forgot to declare a parameter as const, record hidden slow cleanup code, a fillchar would make any string in record become a memory leak...), but they are sometimes very convenient to access a binary structure (e.g. some "smallish value"), via a pointer.
A dynamic array of tiny records (e.g. with one integer and one double field) will be much faster than a TList of small classes; with our TDynArray wrapper, you will have high-level access to the records, with serialization, sorting, hashing and such.
If using pointers, you must know what you are doing. It's definitively more preferable to stick with classes, and TPersistent if you want to use the magical "VCL component ownership model".
Inheritance is not allowed for records. You'll need either to use a "variant record" (using the case keyword in its type definition), either use nested records. When using C-like API, you'll sometimes have to use object-oriented structures. Using nested records or variant records is IMHO much less clear than the good old "object" inheritance model.
2. When to use object
But there are some places where objects are a good way of accessing already existing data.
Even the object model is better than the new record model, because it handles simple inheritance.
In a Blog entry last summer, I posted some possibilities to still use objects:
A memory mapped file, which I want to parse very quickly: a pointer to such an object is just great, and you still have methods at hand; I use this for TFileHeader or TFileInfo which map the .zip header, in SynZip.pas;
A Win32 structure, as defined by a API call, in which I put handy methods for easy access to the data (for that you may use record but if there is some object orientation in the struct - which is very common - you'll have to nest records, which is not the very handy);
A temporary structure defined on the stack, just used during a procedure: I use this for TZStream in SynZip.pas, or for our RTTI related classes, which map the Delphi generated RTTI in an Object-Oriented way not as the TypeInfo which is function/procedure oriented. By mapping the RTTI memory content directly, our code is faster than using the new RTTI classes created on the heap. We don't instanciate any memory, which, for an ORM framework like ours, is good for its speed. We need a lot of RTTI info, but we need it quick, we need it directly.
3. How object implementation is broken in modern Delphi
The fact that object is broken in modern Delphi is a shame, IMHO.
Normally, if you define a record on the stack, containing some reference-counted variables (like a string), it will be initialized by some compiler magic code, at the begin level of the method/function:
type TObj = object Int: integer; Str: string; end;
procedure Test;
var O: TObj
begin // here, an _InitializeRecord(#O,TypeInfo(TObj)) call is made
O.Str := 'test';
(...)
end; // here, a _FinalizeRecord(#O,TypeInfo(TObj)) call is made
Those _InitializeRecord and _FinalizeRecord will "prepare" then "release" the O.Str variable.
With Delphi 2010, I found out that sometimes, this _InitializeRecord() was not always made.
If the record has only some no public fields, the hidden calls are sometimes not generated by the compiler.
Just build the source again, and there will be...
The only solution I found out was using the record keyword instead of object.
So here is how the resulting code looks like:
/// used to store and retrieve Words in a sorted array
// - is defined either as an object either as a record, due to a bug
// in Delphi 2010 compiler (at least): this structure is not initialized
// if defined as a record on the stack, but will be as an object
TSortedWordArray = {$ifdef UNICODE}record{$else}object{$endif}
public
Values: TWordDynArray;
Count: integer;
/// add a value into the sorted array
// - return the index of the new inserted value into the Values[] array
// - return -(foundindex+1) if this value is already in the Values[] array
function Add(aValue: Word): PtrInt;
/// return the index if the supplied value in the Values[] array
// - return -1 if not found
function IndexOf(aValue: Word): PtrInt; {$ifdef HASINLINE}inline;{$endif}
end;
The {$ifdef UNICODE}record{$else}object{$endif} is awful... but the code generation error didn't occur since..
The resulting modifications in the source code are not huge, but a bit disappointing. I found out that older version of the IDE (e.g. Delphi 6/7) are not able to parse such declaration, so the class hierarchy will be broken in the editor... :(
Backward compatibility should include regression tests. A lot of Delphi users stay to this product because of the existing code. Breaking features are very problematic for the Delphi future, IMHO: if you have to rewrite a lot of code, which shouldn't you just switch the project to C# or Java?
Object was not the Delphi 1 method of setting up objects; it was the short-lived Turbo Pascal method of setting up objects, which was replaced by the Delphi TObject model in Delphi 1. It was kept around for backwards compatibility, but it should be avoided for a few reasons:
As you noted, it's broken in more recent versions. And AFAIK there are no plans to fix it.
It's a conceptualy wrong object model. The entire point of Object Oriented Programming, the one thing that really distinguishes it from procedural programming, is Liskov substitution (inheritance and polymorphism), and inheritance and value types don't mix.
You lose support for a lot of features that require TObject descendants.
If you really need value types that don't need to be dynamically allocated and initialized, you can use records instead. You can't inherit from them, but you can't do that very well with object either so you're not losing anything here.
As for the rest of the question, there aren't all that many speed benefits. The TObject model is plenty fast, especially if you're using the FastMM memory manager to speed up creation and destruction of objects, and if your objects contain lots of fields they can even be faster than records in a lot of cases, because they're passed by reference and don't have to be copied around for each function call.
When given a choice between "fast and possibly broken" and "fast and correct," always choose the latter.
Old-style objects offer no speed incentive over plain old records, so wherever you might be tempted to use old-style objects, you can use records instead without the risk of having uninitialized compiler-managed types or broken virtual methods. If your version of Delphi doesn't support records with methods, then just use standalone procedures instead.
Way back in older versions of Delphi which did not support records with methods then using object was the way to get your objects allocated on the stack. Very occasionally that would yield worthwhile performance benefits. Nowadays record is better. The only feature missing from record is the ability to inherit from another record.
You give up a lot when you change from class to record so only consider it if the performance benefits are overwhelming.

Delphi non visual TTree implementation

I'm looking for a non visual persistent tree (TStringTree) implementation. If someone known any good implentation of it, please let me know.
Thanks.
You'll find a flexible, non-visual tree structure in the DI Containers library (commercial). However, as others have noted above, it's really quite easy to roll your own, adding only the functionality that you need.
You can do with just two base objects: TNode and a TNodeList (e.g. a TObjectList descendant). At minimum, TNode needs just three members: your string data, a reference to its parent node (nil if the node is root), and a TNodeList, which is a list of its child nodes. What remains is the (somewhat tedious) implementation of the various attendant methods such as Add(), Delete(), IndexOf(), MoveTo(), GetFirstChild(), GetNext() etc. The basic tree should be less than a one-nighter.
What kind of tree? B-tree? Splay tree? Red-black tree? These are all common types of tree algorithms.
You might want to look at Julian Bucknall's book, Tomes of Delphi: Data Structures and Algorithms. It has all sorts of tree implementations with full Delphi source; you could easily adapt any of them to work with strings.
And of course there is still the funky DECAL (previously Rosetta), an attempt at creating a kind of STL using interfaces and variants.
http://sourceforge.net/projects/decal/
More for the people that flexibility over speed though. The base tree structure is Red-Black iirc.
Why not simply use an XML DOM document?
It may be overkill for a truly trivial string-tree, but would not be too burdensome to use for that purpose and has the benefit of being capable of accommodating just about any extension to a string-tree (for storing additional data with each string in the tree - as attributes etc) should the need arise.
In my experience often what starts out as a trivial need can quickly grow beyond the initial or anticipated requirement. :)
If you are concerned about the "overhead" of the COM based XML implementation and the VCL wrapper around it, you might look into TNativeXML
You could just use a tStringList and add objects which are other tStringLists... crude, but it works if your data can be represented as only string data.
Child := tStringlist.create;
ParentList.AddObject('Child',Child);
Of course a better solution would be to create your own objects which contain a tobjectlist of objects.

What datatype/structure to store file list info?

I have an application that searches files on the computer (configurable path, type etc). Currently it adds information to a database as soon as a matching file is found. Rather than that I want to hold the information in memory for further manipulation before inserting to database. The list may contain a lot of items. I consider performance as important factor. I may need iterating thru the items, so a structure that can be coded easily is another key issue. and how can I achieve php style associative arrays for this job?
If you're using Delphi 2009, you can use a TDictionary. It takes two generic parameters. The first should be a string, for the filename, and the second would be whatever data type you're associating with. It also has three built-in enumerators, one for key-value pairs, one for keys only and one for values only, which makes iterating easy.
Another solution would be to use just a standard TStringList.
As long as it's sorted and has some duplicate setting other than dupAccept, you can use indexof or indexofname to find items in the list quickly.
It also has the Objects addition which allows you to store object information attached to the name. Starting with D2009, TStringList has the OwnsObject property which allows you to delegate object cleanup to the TStringList. Prior to D2009 you have to handle that yourself.
Much of this will depend on how you are going to use the list and to what scale. If you are going to use it as a stack, or queue, then a TList would work fine. If your needing to search through the list for a specific item then you will need something that allows faster retrieval. TDictionary (2009) or TStringList (pre 2009) would be the most likely choice.
Dynamic arrays are also a possiblity, but if you use them you will want to minimize the use of SetLength as each time it is called it will re-allocate memory. TList manages this for you, which is why I suggested using a TList. if you KNOW how many you will deal with in advance, then use a dynamic array, and set its length on the onset.
If you have more items than will fit in memory then your choices also change. At that point I would either use a database table, or a tFileStream to store the records to be processed, then seek to the beginning of the table/stream for processing.
Try using the AVL-Tree by http://sourceforge.net/projects/alcinoe/ as your associative Array. It has an iterate-method for fast iteration. You may need to derive from his baseclass and implement your own comparator, but it's easy to use.
Examples are included.

Pointer to generic type

In the process of transforming a given efficient pointer-based hash map implementation into a generic hash map implementation, I stumbled across the following problem:
I have a class representing a hash node (the hash map implementation uses a binary tree)
THashNode <KEY_TYPE, VALUE_TYPE> = class
public
Key : KEY_TYPE;
Value : VALUE_TYPE;
Left : THashNode <KEY_TYPE, VALUE_TYPE>;
Right : THashNode <KEY_TYPE, VALUE_TYPE>;
end;
In addition to that there is a function that should return a pointer to a hash node. I wanted to write
PHashNode = ^THashNode <KEY_TYPE, VALUE_TYPE>
but that doesn't compile (';' expected but '<' found).
How can I have a pointer to a generic type?
And adressed to Barry Kelly: if you read this: yes, this is based on your hash map implementation. You haven't written such a generic version of your implementation yourself, have you? That would save me some time :)
Sorry, Smasher. Pointers to open generic types are not supported because generic pointer types are not supported, although it is possible (compiler bug) to create them in certain circumstances (particularly pointers to nested types inside a generic type); this "feature" can't be removed in an update in case we break someone's code. The limitation on generic pointer types ought to be removed in the future, but I can't make promises when.
If the type in question is the one in JclStrHashMap I wrote (or the ancient HashList unit), well, the easiest way to reproduce it would be to change the node type to be a class and pass around any double-pointers as Pointer with appropriate casting. However, if I were writing that unit again today, I would not implement buckets as binary trees. I got the opportunity to write the dictionary in the Generics.Collections unit, though with all the other Delphi compiler work time was too tight before shipping for solid QA, and generic feature support itself was in flux until fairly late.
I would prefer to implement the hash map buckets as one of double-hashing, per-bucket dynamic arrays or linked lists of cells from a contiguous array, whichever came out best from tests using representative data. The logic is that cache miss cost of following links in tree/list ought to dominate any difference in bucket search between tree and list with a good hash function. The current dictionary is implemented as straight linear probing primarily because it was relatively easy to implement and worked with the available set of primitive generic operations.
That said, the binary tree buckets should have been an effective hedge against poor hash functions; if they were balanced binary trees (=> even more modification cost), they would be O(1) on average and O(log n) worst case performance.
To actually answer your question, you can't make a pointer to a generic type, because "generic types" don't exist. You have to make a pointer to a specific type, with the type parameters filled in.
Unfortunately, the compiler doesn't like finding angle brackets after a ^. But it will accept the following:
TGeneric<T> = record
value: T;
end;
TSpecific = TGeneric<string>;
PGeneric = ^TSpecific;
But "PGeneric = ^TGeneric<string>;" gives a compiler error. Sounds like a glitch to me. I'd report that over at QC if I was you.
Why are you trying to make a pointer to an object, anyway? Delphi objects are a reference type, so they're pointers already. You can just cast your object reference to Pointer and you're good.
If Delphi supported generic pointer types at all, it would have to look like this:
type
PHashNode<K, V> = ^THashNode<K, V>;
That is, mention the generic parameters on the left side where you declare the name of the type, and then use those parameters in constructing the type on the right.
However, Delphi does not support that. See QC 66584.
On the other hand, I'd also question the necessity of having a pointer to a class type at all. Generic or not. they are needed only very rarely.
There's a generic hash map called TDictionary in the Generics.Collections unit. Unfortunately, it's badly broken at the moment, but it's apparently going to be fixed in update #3, which is due out within a matter of days, according to Nick Hodges.

In Delphi 2009 do I need to free variant arrays?

If I have a variant array which holds nothing but simple types, and possible further variant arrays of simple types, do I need to do anything explicit to free memory, or is it all taken care of for me. I've always thought there is nothing to do, but I just had a slight doubt!
Variants are managed types. They're owned by the compiler's reference-counting system and don't need to be freed manually.
If you do something convoluted like typecasting an object to an integer and storing that in the variant, and then making that the only reference to your object, then you'll want to clean that up before the variant goes out of scope, but the variant itself (including variant arrays) is safe.

Resources