understanding TDictionary on IntegerList - delphi

can I create a TDictionary directly on a TList class ? It looks a bit double work if I create my TDictionary Class with key and values are always the same data like BlackList.Add(1, 1);
var
BlackList: TDictionary<Integer, Integer>;
ResultList: TDictionary<Integer, Integer>;
TestListA: TLIst<Integer>;
TestListB: TLIst<Integer>;
i: Integer;
begin
BlackList.Add(1, 1);
BlackList.Add(2, 2);
for i := 0 to TestListA.Count - 1 do
begin
if BlackList.ContainsValue(TestListA[i]) then
begin
// no action ...
end
else
begin
ResultList.Add(i, TestListA[i]);
end;
end;
for i := 0 to TestListB.Count - 1 do
begin
if BlackList.ContainsValue(TestListB[i]) then
begin
// no action ...
end
else
begin
if not(ResultList.ContainsValue(TestListB[i])) then
ResultList.Add(i, TestListB[i]);
end;
end;
end;
The purpose of this algorithm is to compare 2 Integer list's , find all doubles but exclude numbers from a blacklist. The first q is here Find common elements in two Integer List.

The whole purpose, in this particular code, of using TDictionary is to take advantage of O(1) lookup. With TList, which is an array, you do not have that property. Lookup is O(n) in general, O(log n) if sorted. So you cannot extract the O(1) lookup performance from a TList by any means, hence the use of TDictionary.
So it is the performance motivation that is driving the use of TDictionary. As I said in your previous question, the overhead of setting up dictionaries is only worthwhile if the lists are large. You would need to do some benchmarking to quantify what large means in this context.
As for your dictionary, it does not matter what values you use since only the keys are significant. So use zero always. Ideally what you want is a hash set type based on the same algorithm as the dictionary. But I don't believe there is such a thing in the stock RTL. I expect that libraries like Spring4D offer that functionality.

Related

What is the canonical way to write a hasher function for TEqualityComparer.Construct?

Consider the following record:
TMyRecord = record
b: Boolean;
// 3 bytes of padding in here with default record alignment settings
i: Integer;
end;
I wish to implement IEqualityComparer<TMyRecord>. In order to do so I want to call TEqualityComparer<TMyRecord>.Construct. This needs to be supplied with a TEqualityComparison<TMyRecord> which presents no problems to me.
However, Construct also requires a THasher<TMyRecord> and I would like to know the canonical method for implementing that. The function needs to have the following form:
function MyRecordHasher(const Value: TMyRecord): Integer;
begin
Result := ???
end;
I expect that I need to call BobJenkinsHash on both fields of the record value and then combine them some how. Is this the right approach, and how should I combine them?
The reason I don't use TEqualityComparison<TMyRecord>.Default is that it uses CompareMem and so will be incorrect due to the record's padding.
The Effective Java (by Joshua Bloch) section about overriding hashCode could be useful. It shows how the individual parts of the object (or record) can be combined to efficiently construct a hashCode.
A good hash function tends to produce unequal hash codes for unequal
objects. This is exactly what is meant by the third provision of the
hashCode contract. Ideally, a hash function should distribute any
reasonable collection of unequal instances uniformly across all
possible hash values. Achieving this ideal can be extremely difficult.
Luckily it is not too difficult to achieve a fair approximation. Here
is a simple recipe:
Store some constant nonzero value, say 17, in an int variable called result.
For each significant field f in your object (each field taken into account by the equals method, that is), do the following:
a. Compute an int hash code c for the field: ..... details omitted ....
b. Combine the hash code c computed in step a into
result as follows: result = 37*result + c;
Return result.
When you are done writing the hashCode method, ask yourself whether equal instances have equal hash codes. If not, figure out why
and fix the problem.
This can be translated into Delphi code as follows:
{$IFOPT Q+}
{$DEFINE OverflowChecksEnabled}
{$Q-}
{$ENDIF}
function CombinedHash(const Values: array of Integer): Integer;
var
Value: Integer;
begin
Result := 17;
for Value in Values do begin
Result := Result*37 + Value;
end;
end;
{$IFDEF OverflowChecksEnabled}
{$Q+}
{$ENDIF}
This then allows the implementation of MyRecordHasher:
function MyRecordHasher(const Value: TMyRecord): Integer;
begin
Result := CombinedHash([IfThen(Value.b, 0, 1), Value.i]);
end;

RTTI Delphi Create as TValue an n-dimensional matrix

Good day,
TValue is a Delphi-2010 and up RTTI feature.
Following on from my earlier question, I had tried to make recurrent function to return a TValue as a n-dimensional. matrix(2D, 3D, 4D...)
for example, this procedure will show a n-dimensional matrix(it will list all elements from a n-dimensional matrix as TValue variable):
Procedure Show(X:TValue);
var i:integer;
begin
if x.IsArray then
begin
for i:=0 to x.GetArrayLength-1 do
show(x.GetArrayElement(i));
writeln;
end else
write(x.ToString,' ');
end;
I don't understand how to create a function to create from a TValue an n-dimensional matrix. For example i need a Function CreateDynArray(Dimensions:array of integer; Kind:TTypeKind):TValue; and the function will return a TValue which is a dynamic array how contain the dimenssions for example:
Return=CreateDynArray([2,3],tkInteger); will return a TValue as tkDynArray
and if i will show(Return) will list
0 0 0
0 0 0
Is not terminated.
From a TValue i try to create a DynArray with n-dimensions
Procedure CreateArray(var Value:TValue; NewDimmension:integer; NewValue2Kind:TTypeKind; NewValue2:TValue; IsLast:Boolean);
var i:integer;
NewValue:TValue;
len:Longint;
begin
If Value.IsArray then// we have components in this dimension
begin
for i:=0 to Value.GetArrayLength-1 do// list all
begin
NewValue:=Value.GetArrayElement[i];
CreateArray(newValue,NewDimension,NewValue2Kind,NewValue2,IsLast);
Value.SetArrayElement(i,NewValue);
end;
end;
end else
begin
if isLast then
begin
len:=NewDimension;
DynArraySetLength(PPointer(Value.GetRefereneToRawData)^,Value.TypeInfo,1,#len); //set length to NewDimension
for i:=0 to NewDimension-1 do //Fill all with 0
Value.SetArrayElement(i,NewValue2);
end else
begin
len:=NewDimension;
DynArraySetLength(PPointer(Value.GetRefereneToRawData)^,Value.TypeInfo,1,#len);//I will create len TValues
end;
end;
var Index:array of integer;
Value:TValue;
ValuKind:TTypeKind;
......
......
....
Case token of
tokInt:
begin
ValueKind:=tkInteger;
Value:=0;
end;
.....
end;
Index:=GetIndexFromSintacticTree;//for example if i have int[20][30] index=[20,30]
for i:=0 to high(index) do
begin
if i = high(index) then CreateArray(Variable.Value,Index[i],ValueKind,Value,True)
else CreateArray(Variable.Value,Index[i],ValueKind,Value,False)
//Variable.Value is TValue
end;
//first TValue have 1 element, after that it will have 20 elements, and after that will have 20*30 elements
Thank you very much, and have a nice day!
To create a dynamic array dynamically, you need a reference to its type info structure (PTypeInfo) to pass to DynArraySetLength; calling DynArraySetLength and passing a reference to a nil pointer is how you can create a new dynamic array. If the specific shape of dynamic array does not already exist in your Delphi program, there will be no specific PTypeInfo pointer that the compiler will generate for you. In this case, you would have to generate the corresponding PTypeInfo data structure yourself. This is possible, though tedious.
Frankly, I'd recommend you use a different structure than built-in Delphi dynamic arrays to represent arrays in your scripting language-like problem. In the long run it will probably be a lot less work than trying to dynamically generate the low-level RTTI data, which is more likely to change from version to version now that it has a much higher level abstraction in the Rtti unit.

Deep copy of a record with R1:=R2, or Is there good way to implement NxM matrix with record?

I'm implementing a N x M matrix (class) with a record and an internal dynamic array like below.
TMat = record
public
// contents
_Elem: array of array of Double;
//
procedure SetSize(Row, Col: Integer);
procedure Add(const M: TMat);
procedure Subtract(const M: TMat);
function Multiply(const M: TMat): TMat;
//..
class operator Add(A, B: TMat): TMat;
class operator Subtract(A, B: TMat): TMat;
//..
class operator Implicit(A: TMat): TMat; // call assign inside proc.
// <--Self Implicit(which isn't be used in D2007, got compilation error in DelphiXE)
procedure Assign(const M: TMat); // copy _Elem inside proc.
// <-- I don't want to use it explicitly.
end;
I choose a record, because I don't want to Create/Free/Assign to use it.
But with dynamic array, values can't be (deep-)copied with M1 := M2, instead of M1.Assign(M2).
I tried to declare self-Implicit conversion method, but it can't be used for M1:=M2.
(Implicit(const pA: PMat): TMat and M1:=#M2 works, but it's pretty ugly and unreadable..)
Is there any way to hook assignment of record ?
Or is there any suggestion to implement N x M matrix with records ?
Thanks in advance.
Edit:
I implemented like below with Barry's method and confirmed working properly.
type
TDDArray = array of array of Double;
TMat = record
private
procedure CopyElementsIfOthersRefer;
public
_Elem: TDDArray;
_FRefCounter: IInterface;
..
end;
procedure TMat.SetSize(const RowSize, ColSize: Integer);
begin
SetLength(_Elem, RowSize, ColSize);
if not Assigned(_FRefCounter) then
_FRefCounter := TInterfacedObject.Create;
end;
procedure TMat.Assign(const Source: TMat);
var
I: Integer;
SrcElem: TDDArray;
begin
SrcElem := Source._Elem; // Allows self assign
SetLength(Self._Elem, 0, 0);
SetLength(Self._Elem, Length(SrcElem));
for I := 0 to Length(SrcElem) - 1 do
begin
SetLength(Self._Elem[I], Length(SrcElem[I]));
Self._Elem[I] := Copy(SrcElem[I]);
end;
end;
procedure TMat.CopyElementsIfOthersRefer;
begin
if (_FRefCounter as TInterfacedObject).RefCount > 1 then
begin
Self.Assign(Self); // Self Copy
end;
end;
I agree it's not efficient. Just using Assign with pure record is absolutely faster.
But it's pretty handy and more readable.(and interesting. :-)
I think it's useful for light calculation or pre-production prototyping. Isn't it ?
Edit2:
kibab gives function getting reference count of dynamic array itself.
Barry's solution is more independent from internal impl, and perhaps works on upcoming 64bit compilers without any modification, but in this case, I prefer kibab's for it's simplicity & efficiency. Thanks.
TMat = record
private
procedure CopyElementsIfOthersRefer;
public
_Elem: TDDArray;
..
end;
procedure TMat.SetSize(const RowSize, ColSize: Integer);
begin
SetLength(_Elem, RowSize, ColSize);
end;
function GetDynArrayRefCnt(const ADynArray): Longword;
begin
if Pointer(ADynArray) = nil then
Result := 1 {or 0, depending what you need}
else
Result := PLongword(Longword(ADynArray) - 8)^;
end;
procedure TMat.CopyElementsIfOthersRefer;
begin
if GetDynArrayRefCnt(_Elem) > 1 then
Self.Assign(Self);
end;
You can use an interface field reference inside your record to figure out if your array is shared by more than one record: simply check the reference count on the object behind the interface, and you'll know that the data in the arrays is shared. That way, you can lazily copy on modification, but still use data sharing when the matrices aren't being modified.
You can't override record assignment by Implicit or Explicit operators.
The best you can do IMO is not to use direct assignment, using M.Assign method instead:
procedure TMat.Assign(const M: TMat);
begin
// "Copy" currently only copies the first dimension,
// bug report is open - see comment by kibab
// _Elem:= Copy(M._Elem);
..
end;
ex
M1.Assign(M2);
instead of
M1:= M2;
I've just realised a reason why this may not be such a great idea. It's true that the calling code becomes much simpler with operator overloading. But you may have performance problems.
Consider, for example, the simple code A := A+B; and suppose you use the idea in Barry's accepted answer. Using operator overloading this simple operation will result in a new dynamic array being allocated. In reality you would want to perform this operation in place.
Such in-place operations are very common in linear algebra matrix algorithms for the simple reason that you don't want to hit the heap if you can avoid it - it's expensive.
For small value types (e.g. complex numbers, 3x3 matrices etc.) then operator overloading inside records is efficient, but I think, if performance matters, then for large matrices operator overloading is not the best solution.

How can I search faster for name/value pairs in a Delphi TStringList?

I implemented language translation in an application by putting all strings at runtime in a TStringList with:
procedure PopulateStringList;
begin
EnglishStringList.Append('CAN_T_FIND_FILE=It is not possible to find the file');
EnglishStringList.Append('DUMMY=Just a dummy record');
// total of 2000 record appended in the same way
EnglishStringList.Sorted := True; // Updated comment: this is USELESS!
end;
Then I get the translation using:
function GetTranslation(ResStr:String):String;
var
iIndex : Integer;
begin
iIndex := -1;
iIndex := EnglishStringList.IndexOfName(ResStr);
if iIndex >= 0 then
Result := EnglishStringList.ValueFromIndex[iIndex] else
Result := ResStr + ' (Translation N/A)';
end;
Anyway with this approach it takes about 30 microseconds to locate a record, is there a better way to achieve the same result?
UPDATE: For future reference I write here the new implementation that uses TDictionary as suggested (works with Delphi 2009 and newer):
procedure PopulateStringList;
begin
EnglishDictionary := TDictionary<String, String>.Create;
EnglishDictionary.Add('CAN_T_FIND_FILE','It is not possible to find the file');
EnglishDictionary.Add('DUMMY','Just a dummy record');
// total of 2000 record appended in the same way
end;
function GetTranslation(ResStr:String):String;
var
ValueFound: Boolean;
begin
ValueFound:= EnglishDictionary.TryGetValue(ResStr, Result);
if not ValueFound then Result := Result + '(Trans N/A)';
end;
The new GetTranslation function performs 1000 times faster (on my 2000 sample records) then the first version.
THashedStringList should be better, I think.
In Delphi 2009 or later I would use TDictionary< string,string > in Generics.Collections.
Also note that there are free tools such as http://dxgettext.po.dk/ for translating applications.
If THashedStringList works for you, that's great. Its biggest weakness is that every time you change the contents of the list, the Hash table is rebuilt. So it will work for you as long as your list remains small or doesn't change very often.
For more info on this, see: THashedStringList weakness, which gives a few alternatives.
If you have a big list that may be updated, you might want to try GpStringHash by gabr, that doesn't have to recompute the whole table at every change.
I think that you don't use the EnglishStringList(TStringList) correctly. This is a sorted list, you add elements (strings), you sort it, but when you search, you do this by a partial string (only the name, with IndexOfName).
If you use IndexOfName in a sorted list, the TStringList can't use Dicotomic search. It use sequential search.
(this is the implementation of IndexOfName)
for Result := 0 to GetCount - 1 do
begin
S := Get(Result);
P := AnsiPos('=', S);
if (P <> 0) and (CompareStrings(Copy(S, 1, P - 1), Name) = 0) then Exit;
end;
I think that this is the reason of poor performance.
The alternative is use 2 TStringList:
* The first (sorted) only containts the "Name" and a pointer to the second list that contain the value; You can implement this pointer to the second list using the "pointer" of Object property.
* The second (not sorted) list containt the values.
When you search, you do it at first list; In this case you can use the Find method. when you find the name, the pointer (implemented with Object property) give you the position on second list with the value.
In this case, Find method on Sorted List is more efficient that HashList (that must execute a funcion to get the position of a value).
Regards.
Pd:Excuse-me for mistakes with english.
You can also use a CLASS HELPER to re-program the "IndexOfName" function:
TYPE
TStringsHelper = CLASS HELPER FOR TStrings
FUNCTION IndexOfName(CONST Name : STRING) : INTEGER;
END;
FUNCTION TStringsHelper.IndexOfName(CONST Name : STRING) : INTEGER;
VAR
SL : TStringList ABSOLUTE Self;
S,T : STRING;
I : INTEGER;
BEGIN
IF (Self IS TStringList) AND SL.Sorted THEN BEGIN
S:=Name+NameValueSeparator;
IF SL.Find(S,I) THEN
Result:=I
ELSE IF (I<0) OR (I>=Count) THEN
Result:=-1
ELSE BEGIN
T:=SL[I];
IF CompareStrings(COPY(T,1,LENGTH(S)),S)=0 THEN Result:=I ELSE Result:=-1
END;
EXIT
END;
Result:=INHERITED IndexOfName(Name)
END;
(or implement it in a descendant TStrings class if you dislike CLASS HELPERs or don't have them in your Delphi version).
This will use a binary search on a sorted TStringList and a sequential search on other TStrings classes.

Best way to sort an array

Say I have an array of records which I want to sort based on one of the fields in the record. What's the best way to achieve this?
TExample = record
SortOrder : integer;
SomethingElse : string;
end;
var SomeVar : array of TExample;
You can add pointers to the elements of the array to a TList, then call TList.Sort with a comparison function, and finally create a new array and copy the values out of the TList in the desired order.
However, if you're using the next version, D2009, there is a new collections library which can sort arrays. It takes an optional IComparer<TExample> implementation for custom sorting orders. Here it is in action for your specific case:
TArray.Sort<TExample>(SomeVar , TDelegatedComparer<TExample>.Construct(
function(const Left, Right: TExample): Integer
begin
Result := TComparer<Integer>.Default.Compare(Left.SortOrder, Right.SortOrder);
end));
(I know this is a year later, but still useful stuff.)
Skamradt's suggestion to pad integer values assumes you are going to sort using a string compare. This would be slow. Calling format() for each insert, slower still. Instead, you want to do an integer compare.
You start with a record type:
TExample = record
SortOrder : integer;
SomethingElse : string;
end;
You didn't state how the records were stored, or how you wanted to access them once sorted. So let's assume you put them in a Dynamic Array:
var MyDA: Array of TExample;
...
SetLength(MyDA,NewSize); //allocate memory for the dynamic array
for i:=0 to NewSize-1 do begin //fill the array with records
MyDA[i].SortOrder := SomeInteger;
MyDA[i].SomethingElse := SomeString;
end;
Now you want to sort this array by the integer value SortOrder. If what you want out is a TStringList (so you can use the ts.Find method) then you should add each string to the list and add the SortOrder as a pointer. Then sort on the pointer:
var tsExamples: TStringList; //declare it somewhere (global or local)
...
tsExamples := tStringList.create; //allocate it somewhere (and free it later!)
...
tsExamples.Clear; //now let's use it
tsExamples.sorted := False; //don't want to sort after every add
tsExamples.Capacity := High(MyDA)+1; //don't want to increase size with every add
//an empty dynamic array has High() = -1
for i:=0 to High(MyDA) do begin
tsExamples.AddObject(MyDA[i].SomethingElse,TObject(MyDA[i].SortOrder));
end;
Note the trick of casting the Integer SortOrder into a TObject pointer, which is stored in the TStringList.Object property. (This depends upon the fact that Integer and Pointer are the same size.) Somewhere we must define a function to compare the TObject pointers:
function CompareObjects(ts:tStringList; Item1,Item2: integer): Integer;
begin
Result := CompareValue(Integer(ts.Objects[Item1]), Integer(ts.Objects[Item2]))
end;
Now, we can sort the tsList on .Object by calling .CustomSort instead of .Sort (which would sort on the string value.)
tsExamples.CustomSort(#CompareObjects); //Sort the list
The TStringList is now sorted, so you can iterate over it from 0 to .Count-1 and read the strings in sorted order.
But suppose you didn't want a TStringList, just an array in sorted order. Or the records contain more data than just the one string in this example, and your sort order is more complex. You can skip the step of adding every string, and just add the array index as Items in a TList. Do everything above the same way, except use a TList instead of TStringList:
var Mlist: TList; //a list of Pointers
...
for i:=0 to High(MyDA) do
Mlist.add(Pointer(i)); //cast the array index as a Pointer
Mlist.Sort(#CompareRecords); //using the compare function below
function CompareRecords(Item1, Item2: Integer): Integer;
var i,j: integer;
begin
i := integer(item1); //recover the index into MyDA
j := integer(item2); // and use it to access any field
Result := SomeFunctionOf(MyDA[i].SomeField) - SomeFunctionOf(MyDA[j].SomeField);
end;
Now that Mlist is sorted, use it as a lookup table to access the array in sorted order:
for i:=0 to Mlist.Count-1 do begin
Something := MyDA[integer(Mlist[i])].SomeField;
end;
As i iterates over the TList, we get back the array indexes in sorted order. We just need to cast them back to integers, since the TList thinks they're pointers.
I like doing it this way, but you could also put real pointers to array elements in the TList by adding the Address of the array element instead of it's index. Then to use them you would cast them as pointers to TExample records. This is what Barry Kelly and CoolMagic said to do in their answers.
If your need sorted by string then use sorted TStringList and
add record by TString.AddObject(string, Pointer(int_val)).
But If need sort by integer field and string - use TObjectList and after adding all records call TObjectList.Sort with necessary sorted functions as parameter.
This all depends on the number of records you are sorting. If you are only sorting less than a few hundred then the other sort methods work fine, if you are going to be sorting more, then take a good look at the old trusty Turbo Power SysTools project. There is a very good sort algorithm included in the source. One that does a very good job sorting millions of records in a efficient manner.
If you are going to use the tStringList method of sorting a list of records, make sure that your integer is padded to the right before inserting it into the list. You can use the format('%.10d',[rec.sortorder]) to right align to 10 digits for example.
The quicksort algorithm is often used when fast sorting is required. Delphi is (Or was) using it for List.Sort for example.
Delphi List can be used to sort anything, but it is an heavyweight container, which is supposed to look like an array of pointers on structures. It is heavyweight even if we use tricks like Guy Gordon in this thread (Putting index or anything in place of pointers, or putting directly values if they are smaller than 32 bits): we need to construct a list and so on...
Consequently, an alternative to easily and fastly sort an array of struct might be to use qsort C runtime function from msvcrt.dll.
Here is a declaration that might be good (Warning: code portable on windows only).
type TComparatorFunction = function(lpItem1: Pointer; lpItem2: Pointer): Integer; cdecl;
procedure qsort(base: Pointer; num: Cardinal; size: Cardinal; lpComparatorFunction: TComparatorFunction) cdecl; external 'msvcrt.dll';
Full example here.
Notice that directly sorting the array of records can be slow if the records are big. In that case, sorting an array of pointer to the records can be faster (Somehow like List approach).
With an array, I'd use either quicksort or possibly heapsort, and just change the comparison to use TExample.SortOrder, the swap part is still going to just act on the array and swap pointers. If the array is very large then you may want a linked list structure if there's a lot of insertion and deletion.
C based routines, there are several here
http://www.yendor.com/programming/sort/
Another site, but has pascal source
http://www.dcc.uchile.cl/~rbaeza/handbook/sort_a.html
Use one of the sort alorithms propose by Wikipedia. The Swap function should swap array elements using a temporary variable of the same type as the array elements. Use a stable sort if you want entries with the same SortOrder integer value to stay in the order they were in the first place.
TStringList have efficient Sort Method.
If you want Sort use a TStringList object with Sorted property to True.
NOTE: For more speed, add objects in a not Sorted TStringList and at the end change the property to True.
NOTE: For sort by integer Field, convert to String.
NOTE: If there are duplicate values, this method not is Valid.
Regards.
If you have Delphi XE2 or newer, you can try:
var
someVar: array of TExample;
list: TList<TExample>;
sortedVar: array of TExample;
begin
list := TList<TExample>.Create(someVar);
try
list.Sort;
sortedVar := list.ToArray;
finally
list.Free;
end;
end;
I created a very simple example that works correctly if the sort field is a string.
Type
THuman = Class
Public
Name: String;
Age: Byte;
Constructor Create(Name: String; Age: Integer);
End;
Constructor THuman.Create(Name: String; Age: Integer);
Begin
Self.Name:= Name;
Self.Age:= Age;
End;
Procedure Test();
Var
Human: THuman;
Humans: Array Of THuman;
List: TStringList;
Begin
SetLength(Humans, 3);
Humans[0]:= THuman.Create('David', 41);
Humans[1]:= THuman.Create('Brian', 50);
Humans[2]:= THuman.Create('Alex', 20);
List:= TStringList.Create;
List.AddObject(Humans[0].Name, TObject(Humans[0]));
List.AddObject(Humans[1].Name, TObject(Humans[1]));
List.AddObject(Humans[2].Name, TObject(Humans[2]));
List.Sort;
Human:= THuman(List.Objects[0]);
Showmessage('The first person on the list is the human ' + Human.name + '!');
List.Free;
End;

Resources