TStringList enable Binary search without resorting? - delphi

I am building a stringlist from an ADO query, in the query it is much faster to return sorted results and then add them in order. this gives me an already sorted list and then calling either Sort or setting sorted true costs me time as the Quicksort algorithm does not preform well on an already sorted list. Is there some way to set the TStringList to use the Binary search without running the sort?
before you ask I don't have access to the CustomSort attribute.

I am not sure I understand what you are worried about, assuming the desired sort order of the StringList is the same as the ORDER BY of the AdoQuery.
Surely the thing to do is to set Sorted on your StringList to True while it is still empty and then insert the rows from the AdoQuery. That way, when the StringList is about to Add an entry, it will search for it using IndexOf, which will in turn use Find, which does a binary search, to check for duplicates. But using Add in this way does not involve a quicksort because the StringList is already treating itself as sorted.
In view of your comments and your own answer I ran the program below through the Line Timer profiler in NexusDB's Quality Suite. The result is that although there are detectable differences in execution speed using a binary search versus TStringList.IndexOf, they are nothing to do with the use (or not) of TStringList's QuickSort. Further, the difference is explicable by a subtle difference between how the binary search I used and the one in TStringList.Find work - see Update #2 below.
The program generates 200k 100-character strings and then inserts them into a StringList. The StringList is generated in two ways, first with Sorted set to True before any strings are added and then with Sorted set to True only after the strings have been added. StringList.IndexOf and your BinSearch is then used to look up each of the strings which has been added. The results are as follows:
Line Total Time Source
80 procedure Test;
119 0.000549 begin
120 2922.105618 StringList := GetList(True);
121 2877.101652 TestIndexOf;
122 1062.461975 TestBinSearch;
123 29.299069 StringList.Free;
124
125 2970.756283 StringList := GetList(False);
126 2943.510851 TestIndexOf;
127 1044.146265 TestBinSearch;
128 31.440766 StringList.Free;
129 end;
130
131 begin
132 Test;
133 end.
The problem I encountered is that your BinSearch never actually returns 1 and the number of failures is equal to the number of strings searched for. If you can fix this, I'll be happy to re-do the test.
program SortedStringList2;
[...]
const
Rows = 200000;
StrLen = 100;
function ZeroPad(Number : Integer; Len : Integer) : String;
begin
Result := IntToStr(Number);
if Length(Result) < Len then
Result := StringOfChar('0', Len - Length(Result)) + Result;
end;
function GetList(SortWhenEmpty : Boolean) : TStringList;
var
i : Integer;
begin
Result := TStringList.Create;
if SortWhenEmpty then
Result.Sorted := True;
for i := 1 to Rows do
Result.Add(ZeroPad(i, StrLen));
if not SortWhenEmpty then
Result.Sorted := True;
end;
Function BinSearch(slList: TStringList; sToFind : String) : integer;
var
i, j, k : integer;
begin
try
i := slList.Count div 2;
k := i;
if i = 0 then
begin
Result := -1;
// SpendLog('BinSearch List Empty, Exiting...');
exit;
end;
while slList.Strings[i] <> sToFind do
begin
if CompareText(slList.Strings[i], sToFind) < 0 then
begin
j := i;
k := k div 2;
i := i + k;
if j=i then
break;
end else
if CompareText(slList.Strings[i], sToFind) > 0 then
begin
j := i;
k := k div 2;
i := i - k;
if j=i then
break;
end else
break;
end;
if slList.Strings[i] = sToFind then
result := i
else
Result := -1;
except
//SpendLog('<BinSearch> Exception: ' + ExceptionMessage + ' At Line: ' + Analysis.LastSourcePos);
end;
end;
procedure Test;
var
i : Integer;
StringList : TStringList;
procedure TestIndexOf;
var
i : Integer;
Index : Integer;
Failures : Integer;
S : String;
begin
Failures := 0;
for i := 1 to Rows do begin
S := ZeroPad(i, StrLen);
Index := StringList.IndexOf(S);
if Index < 0 then
Inc(Failures);
end;
Assert(Failures = 0);
end;
procedure TestBinSearch;
var
i : Integer;
Index : Integer;
Failures : Integer;
S : String;
begin
Failures := 0;
for i := 1 to Rows do begin
S := ZeroPad(i, StrLen);
Index := BinSearch(StringList, S);
if Index < 0 then
Inc(Failures);
end;
//Assert(Failures = 0);
end;
begin
StringList := GetList(True);
TestIndexOf;
TestBinSearch;
StringList.Free;
StringList := GetList(False);
TestIndexOf;
TestBinSearch;
StringList.Free;
end;
begin
Test;
end.
Update I wrote my own implementation of the search algorithm in the Wikipedia article https://en.wikipedia.org/wiki/Binary_search_algorithm as follows:
function BinSearch(slList: TStringList; sToFind : String) : integer;
var
L, R, m : integer;
begin
L := 0;
R := slList.Count - 1;
if R < L then begin
Result := -1;
exit;
end;
m := (L + R) div 2;
while slList.Strings[m] <> sToFind do begin
m := (L + R) div 2;
if CompareText(slList.Strings[m], sToFind) < 0 then
L := m + 1
else
if CompareText(slList.Strings[m], sToFind) > 0 then
R := m - 1;
if L > R then
break;
end;
if slList.Strings[m] = sToFind then
Result := m
else
Result := -1;
end;
This seems to work correctly, and re-profiling the test app using this gave these results:
Line Total Time Source
113 procedure Test;
153 0.000490 begin
154 3020.588894 StringList := GetList(True);
155 2892.860499 TestIndexOf;
156 1143.722379 TestBinSearch;
157 29.612898 StringList.Free;
158
159 2991.241659 StringList := GetList(False);
160 2934.778847 TestIndexOf;
161 1113.911083 TestBinSearch;
162 30.069241 StringList.Free;
On that showing, a binary search clearly outperforms TStringList.IndexOf and contrary to my expectations it makes no real difference whether TStringList.Sorted is set to True before or after the strings are added.
Update#2 it turns out that the reason BinSearch is faster than TStringList.IndexOf is purely because BinSearch uses CompareText whereas TStringList.IndexOf uses AnsiCompareText (via .Find). If I change BinSearch to use AnsiCompareText, it becomes 1.6 times slower than TStringList.IndexOf!

I was about to suggest using an interposer class to directly change the FSorted field without calling its setter method which as a side effect calls the Sort method. But looking at the implementation of TStringList in Delphi 2007, I found that Find will always do a binary search without checking the Sorted property. This will, of course fail, if the list items aren't sorted, but in your case they are. So, as long as you remember to call Find rather than IndexOf, you don't need to do anything.

in the end I just hacked up a binary search to check the stringlist like an array:
Function BinSearch(slList: TStringList; sToFind : String) : integer;
var
i, j, k : integer;
begin
try
try
i := slList.Count div 2;
k := i;
if i = 0 then
begin
Result := -1;
SpendLog('BinSearch List Empty, Exiting...');
exit;
end;
while slList.Strings[i] <> sToFind do
begin
if CompareText(slList.Strings[i], sToFind) < 0 then
begin
j := i;
k := k div 2;
i := i + k;
if j=i then
break;
end else
if CompareText(slList.Strings[i], sToFind) > 0 then
begin
j := i;
k := k div 2;
i := i - k;
if j=i then
break;
end else
break;
end;
if slList.Strings[i] = sToFind then
result := i
else
Result := -1;
except
SpendLog('<BinSearch> Exception: ' + ExceptionMessage + ' At Line: ' + Analysis.LastSourcePos);
end;
finally
end;
end;
I'll clean this up later if needed.

Related

Delphi How To Convert String To Binary Using Only Pascal [duplicate]

This question already has answers here:
Converting decimal/integer to binary - how and why it works the way it does?
(6 answers)
Closed 4 years ago.
I have done some Example to convert a string to binary but i couldn't find a way to walk on each character in the string and complete the whole calculations process and then step to the next character in the string, Here is my code:
var i,j, rest, results :integer;
restResult : string;
begin
results := 1;
for i := 1 to length(stringValue) do
begin
while (results > 0) do
begin
results := ord(stringValue[i]) div 2;
rest := ord(stringValue[i]) mod 2;
restResult := restResult + inttostr(rest);
end;
end;
// Get The Rests Backwards
for i := length(restResult) downto 1 do
begin
result := result + restResult[i];
end;
The application always get into infinite loop, any suggestions?
Your results := ord(stringValue[i]) div 2; remains the same, because stringValue[i] does not change, so while loop is infinite.
To solve this mistake:
for i := 1 to length(stringValue) do
begin
t := ord(stringValue[i]);
repeat
restResult := restResult + inttostr(t mod 2);
t := t div 2;
until t = 0;
end;
But note that you cannot divide resulting string into pieces for distinct chars, because length of binary representation will vary depending on char itself.
This is example of code with fixed length for representation of char (here AnsiChar):
function AnsiStringToBinaryString(const s: AnsiString): String;
const
SBits: array[0..1] of string = ('0', '1');
var
i, k, t: Integer;
schar: string;
begin
Result := '';
for i := 1 to Length(s) do begin
t := Ord(s[i]);
schar := '';
for k := 1 to 8 * SizeOf(AnsiChar) do begin
schar := SBits[t mod 2] + schar;
t := t div 2
end;
Result := Result + schar;
end;
end;
'#A z': (division bars are mine)
01000000|01000001|00100000|01111010
# A space z

Byte array to Signed integer in Delphi

source array(4 bytes)
[$80,$80,$80,$80] =integer 0
[$80,$80,$80,$81] = 1
[$80,$80,$80,$FF] = 127
[$80,$80,$81,$01] = 128
need to convert this to integer.
below is my code and its working at the moment.
function convert(b: array of Byte): Integer;
var
i, st, p: Integer;
Negative: Boolean;
begin
result := 0;
st := -1;
for i := 0 to High(b) do
begin
if b[i] = $80 then Continue // skip leading 80
else
begin
st := i;
Negative := b[i] < $80;
b[i] := abs(b[i] - $80);
Break;
end;
end;
if st = -1 then exit;
for i := st to High(b) do
begin
p := round(Power(254, High(b) - i));
result := result + b[i] * p;
result := result - (p div 2);
end;
if Negative then result := -1 * result
end;
i'm looking for a better function?
Update:
file link
https://drive.google.com/file/d/0ByBA4QF-YOggZUdzcXpmOS1aam8/view?usp=sharing
in uploaded file ID field offset is from 5 to 9
NEW:
Now i got into new problem which is decoding date field
Date field hex [$80,$8F,$21,$C1] -> possible date 1995-12-15
* in uploaded file date field offset is from 199 to 203
Just an example of some improvements as outlined by David.
The array is passed by reference as a const.
The array is fixed in size.
The use of floating point calculations are converted directly into a constant array.
Const
MaxRange = 3;
Type
TMySpecial = array[0..MaxRange] of Byte;
function Convert(const b: TMySpecial): Integer;
var
i, j: Integer;
Negative: Boolean;
Const
// Pwr[i] = Round(Power(254,MaxRange-i));
Pwr: array[0..MaxRange] of Cardinal = (16387064,64516,254,1);
begin
for i := 0 to MaxRange do begin
if (b[i] <> $80) then begin
Negative := b[i] < $80;
Result := Abs(b[i] - $80)*Pwr[i] - (Pwr[i] shr 1);
for j := i+1 to MaxRange do
Result := Result + b[j]*Pwr[j] - (Pwr[j] shr 1);
if Negative then
Result := -Result;
Exit;
end;
end;
Result := 0;
end;
Note that less code lines is not always a sign of good performance.
Always measure performance before optimizing the code in order to find real bottlenecks.
Often code readability is better than optimizing over the top.
And for future references, please tell us what the algorithm is supposed to do.
Code for testing:
const
X : array[0..3] of TMySpecial =
(($80,$80,$80,$80), // =integer 0
($80,$80,$80,$81), // = 1
($80,$80,$80,$FF), // = 127
($80,$80,$81,$01)); // = 128
var
i,j: Integer;
sw: TStopWatch;
begin
sw := TStopWatch.StartNew;
for i := 1 to 100000000 do
for j := 0 to 3 do
Convert(X[j]);
WriteLn(sw.ElapsedMilliseconds);
ReadLn;
end.

SIGSEV in custom QuickSort implementation

I slept over the answer to question Quicksort drama and wanted to recode it from scratch, implementing your tip with the call-by-reference var. And again: I cannot find any failure I made again. I compare the code to your program one by one and I cannot find the problem. The following code produces an Exception (External:SIGSEV at address 11602) during compilation/run
program quicksort;
var
iArray : array[0..8] of integer;
procedure fillArray(var iArray : array of integer);
begin;
iArray[0] := 3;
iArray[1] := 1;
iArray[2] := 8;
iArray[3] := 4;
iArray[4] := 9;
iArray[5] := 0;
iArray[6] := 8;
iArray[7] := 2;
iArray[8] := 5;
end;
procedure writeArray(iArray : array of integer);
var i:integer;
begin
for i:=low(iArray) to high(iArray) do begin
write(iArray[i]);
end;
writeln('');
end;
procedure quickSort(var iArray : array of integer; links : integer; rechts:integer);
var
l,r,pivot, temp: integer;
begin
if (rechts > links) then begin
l := links;
r := rechts;
pivot := iArray[(rechts+links) div 2];
while (l<r) do begin
while (iArray[l] < pivot) do l:=l+1;
while (iArray[r] > pivot) do r:=r-1;
if (l<=r) then begin
temp := iArray[l];
iArray[l] := iArray[r];
iArray[r] := temp;
end;
end;
if (links < r) then quickSort(iArray, links, r);
if (l < rechts) then quickSort(iArray, l, rechts);
end;
end;
begin
fillArray(iArray);
quickSort(iArray,low(iArray),high(iArray));
writeArray(iArray);
end.
The block of code that swaps, also needs to increment l and decrement r once the swap is complete:
if (l <= r) then
begin
temp := iArray[l];
iArray[l] := iArray[r];
iArray[r] := temp;
inc(l); // <-- this was missing
dec(r); // <-- as was this
end;
The complete program, with some other minor tidy ups:
program quicksort24340509;
var
iArray: array [0 .. 8] of integer;
Procedure fillArray(var iArray: array of integer);
begin;
iArray[0] := 3;
iArray[1] := 1;
iArray[2] := 8;
iArray[3] := 4;
iArray[4] := 9;
iArray[5] := 0;
iArray[6] := 8;
iArray[7] := 2;
iArray[8] := 5;
end;
Procedure writeArray(const iArray: array of integer);
var
i: integer;
begin
for i := low(iArray) to high(iArray) do
begin
write(iArray[i], ' ');
end;
writeln;
end;
Procedure quickSort(var iArray: array of integer; links, rechts: integer);
var
l, r, pivot, temp: integer;
begin
if (rechts > links) then
begin
l := links;
r := rechts;
pivot := iArray[(rechts + links) div 2];
while l < r do
begin
while iArray[l] < pivot do inc(l);
while iArray[r] > pivot do dec(r);
if l <= r then
begin
temp := iArray[l];
iArray[l] := iArray[r];
iArray[r] := temp;
inc(l);
dec(r);
end;
end;
if links < r then
quickSort(iArray, links, r);
if l < rechts then
quickSort(iArray, l, rechts);
end;
end;
begin
fillArray(iArray);
quickSort(iArray, low(iArray), high(iArray));
writeArray(iArray);
readln;
end.
Output
0 1 2 3 4 5 8 8 9
The reason that your version fails, without the missing lines, is that the recursive calls to quickSort operate on the wrong ranges.
For example, Given your input of
3 1 8 4 9 0 8 2 5
the partitioning step pivots on 9 and results in
3 1 8 4 5 0 8 2 9
Now, the recursive step should be to sort all the values to the left of the pivot, and all the values to the right. And we leave the pivot alone because partitioning ensured that it is in its final position.
There are no values to the right of the pivot so we should be making a recursive call for the range 0 to 7. But if you inspect what happens with your code you will find that it does not. Instead it makes a recursive call for the range 0 to 8. That in itself is a little benign, but once the ranges become small, at the stopping condition, it's different. Try asking your program to sort these values:
1 2
The code pivots on 1. At the end of partitioning we have:
links = 0
rechts = 1
l = 0
r = 0
So we recursively call quickSort passing l and rechts as the ranges. But that's exactly the same call as we initially made. And that therefore leads to a stack overflow.
So the point is that we must make sure that when we partition on a pivot, we exclude that pivot from all future recursive calls to quickSort. If we don't do that we don't sub-divide the problem, and the recursion does not terminate.

Loops and increasing letter values for cells in string grid

So this could be hard to explain but i want to do a for ... := 1 to 10 do statement but i want it to be for A to N do. The main purpose of this excersise is to load data into a string grid. So lets have it load the cells 0,1 0,2 0,3 0,4 0,5 0,6 0,7 with the Letter A, B, C, D, E all the way up to 14. If anyone knows how to do this i would be extremely thankful!
Here you got it, but I'm not sure if it's a good way how to learn programming (I mean asking question as requests so that someone else write code for you):
procedure TForm1.Button1Click(Sender: TObject);
var
I: Integer;
begin
StringGrid1.FixedCols := 1;
StringGrid1.ColCount := 15;
for I := 1 to 14 do
StringGrid1.Cells[I, 1] := Chr(Ord('A') + I - 1);
end;
If you want to fill the StringGrid control one row at a time, you can do
procedure TForm1.Button1Click(Sender: TObject);
var
i: Integer;
begin
StringGrid1.FixedCols := 1;
StringGrid1.FixedRows := 1;
for i := 0 to Min(25, (StringGrid1.ColCount-1) * (StringGrid1.RowCount-1)) do
StringGrid1.Cells[i mod (StringGrid1.ColCount - 1) + 1,
i div (StringGrid1.ColCount - 1) + 1] := Chr(Ord('A') + i);
end;
which works no matter how many rows and cols there are.
Want to fuse TLama's answer with that "want to do a for ... := 1 to 10 do statement but i want it to be for A to N do"
Don't know if it will be pun, or enlightening.
var c: char; i: integer;
s: string;
...
i := 0; s:= EmptyStr;
for c := 'A' to 'N' do begin
s := s + c + ',';
Inc(i);
end;
SetLength(s, Length(s) - 1); // we do not need last comma there
StringGrid1.ColCount := i;
StringGrid1.Rows[0].CommaText := s;
Or the same using TStringBuilder - which would be faster than re-arranging Heap on each new string modification.
uses SysUtils;
...
var c: char; i: integer;
s: string;
...
i := 0;
with TStringBuilder.Create do try
for c := 'A' to 'N' do begin
Append(c + ',');
Inc(i);
end;
s := ToString;
finally
Free;
end;
SetLength(s, Length(s) - 1); // we do not need last comma there
StringGrid1.ColCount := i;
StringGrid1.Rows[0].CommaText := s;

Convert Int64 to Base30 and Back

For a registration code I want to convert an Int64 to base30 (30 so that only uppercase characters and excluding 0,O,I,1,etc.) and back.
This is not too difficult using functions like:
const
Base = 30;
Base30CharSet = '23456789ABCDEFGHJKLMNPRSTVWXYZ';
function ConvertIntToBase30(ANumber: Int64): string;
begin
if(ANumber = 0) then
Result := Copy(Base30CharSet, 1, 1)
else begin
Result := '';
while(ANumber <> 0) do begin
Result := Copy(Base30CharSet, (ANumber mod Base)+1, 1) + Result;
ANumber := ANumber div Base;
end;
end;
end;
function ConvertBase30ToInt(ANumber: string): Int64;
var
i: integer;
begin
Result := 0;
for i := 1 to Length(ANumber) do begin
Result := Result + (Pos(ANumber[i], Base30CharSet)-1);
if(i < Length(ANumber)) then
Result := Result * Base;
end;
end;
The snag is that I am interested in the Int64's bits, so I could be dealing with a number like $FFFFFFFFFFFFFFFF = -1.
To work around this I thought I would store and remove the sign (abs()) and include the sign as an extra character appended to the base30 result. The problem the occurs at the lower limit of Int64 as calling abs(-9223372036854775808) results in an overflow.
Does anyone have a solution or better algorithm to solve this problem?
The way to deal with it is having a character to indicate it is a negative number so that you can decode back. For negative number, just flip the bit from 1 to 0 and remove the sign bit before encoding and when decode, do a flip back and add the sign bit. Below is working codes
function InvertIntOff(const ANumberL, ANumberH: Integer): Int64;
asm
XOR EAX,$FFFFFFFF
XOR EDX,$FFFFFFFF
end;
function InvertIntOn(const ANumberL, ANumberH: Integer): Int64;
asm
XOR EAX,$FFFFFFFF
XOR EDX,$FFFFFFFF
OR EDX,$80000000
end;
function ConvertIntToBase(ANumber: Int64): string;
const
CBaseMap: array[0..31] of Char = (
'2','3','4','5','6','7','8','9', //0-7
'A','B','C','D','E','F','G','H', //8-15
'J','K','L','M','N', //16-20
'P','Q','R','S','T','U','V','X','W','Y','Z'); //21-31
var
I: Integer;
begin
SetLength(Result, 15);
I := 0;
if ANumber < 0 then
begin
Inc(I);
Result[I] := '1';
ANumber := InvertIntOff(ANumber and $FFFFFFFF, (ANumber and $FFFFFFFF00000000) shr 32);
end;
while ANumber <> 0 do
begin
Inc(I);
Result[I] := CBaseMap[ANumber and $1F];
ANumber := ANumber shr 5;
end;
SetLength(Result, I);
end;
function ConvertBaseToInt(const ABase: string): Int64;
var
I, Index: Integer;
N: Int64;
begin
Result := 0;
if Length(ABase) > 0 then
begin
if ABase[1] = '1' then
Index := 2
else
Index := 1;
for I := Index to Length(ABase) do
begin
case ABase[I] of
'2'..'9':
N := Ord(ABase[I]) - Ord('2');
'A'..'H':
N := Ord(ABase[I]) - Ord('A') + 8;
'J'..'N':
N := Ord(ABase[I]) - Ord('J') + 16;
'P'..'Z':
N := Ord(ABase[I]) - Ord('P') + 21;
else
raise Exception.Create('error');
end;
if I > Index then
Result := Result or (N shl ((I - Index) * 5))
else
Result := N;
end;
if ABase[1] = '1' then
Result := InvertIntOn(Result and $FFFFFFFF, (Result and $FFFFFFFF00000000) shr 32);
end;
end;
procedure TestBase32;
var
S: string;
begin
S := ConvertIntToBase(-1);
ShowMessage(S + ' / ' + IntToStr(ConvertBaseToInt(S)) + ' ? -1');
S := ConvertIntToBase(-31);
ShowMessage(S + ' / ' + IntToStr(ConvertBaseToInt(S)) + ' ? -31');
S := ConvertIntToBase(1);
ShowMessage(S + ' / ' + IntToStr(ConvertBaseToInt(S)) + ' ? 1');
S := ConvertIntToBase(123456789);
ShowMessage(S + ' / ' + IntToStr(ConvertBaseToInt(S)) + ' ? 123456789');
S := ConvertIntToBase(-123456789);
ShowMessage(S + ' / ' + IntToStr(ConvertBaseToInt(S)) + ' ? -123456789');
end;
I think you are almost there by considering abs()...
But rather than using abs() why not simply ignore the sign for processing the value of the Int64 itself ? As far as I can tell, you are in fact already doing this so only one minor addition is needed to the encoding routine:
if aNumber < 0 then
// negative
else
// positive;
The only problem then is the LOSS of sign information in the resulting Base30 string. So treat that as a separate problem to be solved using the new information gained from the aNumber < 0 test...
I see you have excluded all chars that could be confused for 0 or 1 but have also excluded 0 and 1 themselves. You could therefore use 0 and 1 to indicate positive or negative (or vice versa).
Depending on the purpose of these routines, the placement of the 0/1 in the result could be entirely arbitrary (if you wished to obfuscate things and make the placement of the 0/1 random rather than a consistent lead/trail character).
When encoding simply drop a sign indicator into the result string at random, and when decoding handle the 0/1 character whenever as the sign marker it is encountered, but skipped for the purposes of decoding the value.
Of course, if obfuscation is not an issue then simply consistently pre or post fix the sign indicator.
You could even simply choose to use '1' to indicate negative and the LACK of a '1' to indicate/assume positive (this would simplify the zero value case a little I think)
The easy answer is to turn range checking off, even just for the method that you're calling abs in.
If you don't care about an extra char or two you could split the int64 into words or dwords and string those together. I would be more tempted to go to base32 and use bit shifts for speed and ease of use. Then your encoding becomes
Base32CharSet[(ANumber shr 5) % 32]
and a similar pos() based approach for the decode.

Resources