How to add up the integer values in a label in delphi - delphi

I am currently having problems with screating a scoreboard in Delphi.
I have a series of forms which are individual questions.
If the questions are answered correctly, then the score is 1. Otherwise the score is -1.
On my scoreboard at the moment, I have 12 labels and 11 of them contain the score for each of the forms.
What I would like to do is add up the numbers in each of the labels and output the final score into the 12th label.
Is there a way of doing this?
Any help will be greatly appreciated.

You should use the UI purely for displaying your values.
For working with your data you should use appropriate data structures: array, lists, etc.
Example using an Array:
var
Scores[0..10]: Integer;
Sum: Integer;
procedure CollectData;
var
i: Integer;
begin
Scores[0] := ...;
//...
Scores[10] := ...;
Sum := 0;
for i := Low(Scores) to High(Scores) do
Sum := Sum + Scores[i];
end;
procedure DisplayData;
begin
Label1.Caption := IntToStr(Scores[0]);
//...
Label11.Caption := IntToStr(Scores[10]);
Label12.Caption := IntToStr(Sum);
end;

Neat solution: Keep scores as integers in Integer fields
Not so neat solution:
SumLabel.Caption := IntToStr( StrToIntDef( Label1.Caption, 0 ) + StrToIntDef( Label2.Caption, 0 ) + ... );

Although I think #DR's answer is spot-on, and #Ritsaert's is helpful, here's another option.
Your label components will have a 'TAG' property - you can use this for your own purposes and in your case I'd just set the TAG property at the same time as you set the Caption.
The advantage behind this is that you can format your caption to contain more than a simple number (if you wish), and also you are just summing up tags (which are already integers and don't need you to do the extra work behind a StrToIntDef call). Really, you're following #DR's point about keeping values out of the GUI (in a sense), you're using a storage field in each label instead.
eg;
when setting a score;-
Label1.Caption:=Format('%d point',[FScore]);
Label1.Tag:=FScore;
and when summing them;-
FSum:=Label1.Tag + Label2.Tag + Label3.Tag (etc)

Related

Find duplicates in a stringlist very fast

what is the fastest way to find duplicates in a Tstringlist. I get the data I need to search for duplicates in a Stringlist. My current idea goes like this :
var TestStringList, DataStringList : TstringList;
for i := 0 to DataStringList.Items-1 do
begin
if TestStringList.Indexof(DataStringList[i])< 0 < 0 then
begin
TestStringList.Add(DataStringList[i])
end
else
begin
memo1.ines.add('duplicate item found');
end;
end;
....
Just for completeness, (and because your code doesn't actually use the duplicate, but just indicates one has been found): Delphi's TStringList has the built-in ability to deal with duplicate entries, in it's Duplicates property. Setting it to dupIgnore will simply discard any duplicates you attempt to add. Note that the destination list has to be sorted, or Duplicates has no effect.
TestStringList.Sorted := True;
TestStringList.Duplicates := dupIgnore;
for i := 0 to DataStringList.Items-1 do
TestStringList.Add(DataStringList[i]);
Memo1.Lines.Add(Format('%d duplicates discarded',
[DataStringList.Count - TestStringList.Count]));
A quick test shows that the entire loop can be removed if you use Sorted and Duplicates:
TestStringList.Sorted := True;
TestStringList.Duplicates := dupIgnore;
TestStringList.AddStrings(DataStringList);
Memo1.Lines.Add(Format('%d duplicates discarded',
[DataStringList.Count - TestStringList.Count]));
See the TStringList.Duplicates documentation for more info.
I think that you are looking for duplicates. If so then you do the following:
Case 1: The string list is ordered
In this scenario, duplicates must appear at adjacent indices. In which case you simply loop from 1 to Count-1 and check whether or not the elements of index i is the same as that at index i-1.
Case 2: The string list is not ordered
In this scenario we need a double for loop. It looks like this:
for i := 0 to List.Count-1 do
for j := i+1 to List.Count-1 do
if List[i]=List[j] then
// duplicate found
There are performance considerations. If the list is ordered the search is O(N). If the list is not ordered the search is O(N2). Clearly the former is preferable. Since a list can be sorted with complexity O(N log N), if performance becomes a factor then it will be advantageous to sort the list before searching for duplicates.
Judging by the use of IndexOf you use an unsorted list. The scaling factor of your algorithm then is n^2. That is slow. You can optimize it as David shown by limiting search area in the internal search and then the average factor would be n^2/2 - but that still scales badly.
Note: scaling factor here makes sense for limited workloads, say dozen or hundreds of strings per list. For larger sets of data asymptotic analysis O(...) measure would suit better. However finding O-measures for QuickSort and for hash-lists is a trivial task.
Option 1: Sort the list. Using quick-sort it would have scaling factor n + n*log(n) or O(n*log(n)) for large loads.
Set Duplicates to accept
Set Sorted to True
Iterate the sorted list and check if the next string exists and is the same
http://docwiki.embarcadero.com/Libraries/XE3/en/System.Classes.TStringList.Duplicates
http://docwiki.embarcadero.com/Libraries/XE3/en/System.Classes.TStringList.Sorted
Option 2: use hashed list helper. In modern Delphi that would be TDictionary<String,Boolean>, in older Delphi there is a class used by TMemIniFile
You iterate your stringlist and then check if the string was already added into the helper collection.
The scaling factor would be a constant for small data chunks and O(1) for large ones - see http://docwiki.embarcadero.com/Libraries/XE2/en/System.Generics.Collections.TDictionary.ContainsKey
If it was not - you add it with "false" value.
If it was - you switch the value to "true"
For older Delphi you can use THashedStringList in a similar pattern (thanks #FreeConsulting)
http://docs.embarcadero.com/products/rad_studio/delphiAndcpp2009/HelpUpdate2/EN/html/delphivclwin32/IniFiles_THashedStringList_IndexOf.html
Unfortunately it is unclear what you want to do with the duplicates. Your else clause suggests you just want to know whether there is one (or more) duplicate(s). Although that could be the end goal, I assume you want more.
Extracting duplicates
The previously given answers delete or count the duplicate items. Here an answer for keeping them.
procedure ExtractDuplicates1(List1, List2: TStringList; Dupes: TStrings);
var
Both: TStringList;
I: Integer;
begin
Both := TStringList.Create;
try
Both.Sorted := True;
Both.Duplicates := dupAccept;
Both.AddStrings(List1);
Both.AddStrings(List2);
for I := 0 to Both.Count - 2 do
if (Both[I] = Both[I + 1]) then
if (Dupes.Count = 0) or (Dupes[Dupes.Count - 1] <> Both[I]) then
Dupes.Add(Both[I]);
finally
Both.Free;
end;
end;
Performance
The following alternatives are tried in order to compare performance of the above routine.
procedure ExtractDuplicates2(List1, List2: TStringList; Dupes: TStrings);
var
Both: TStringList;
I: Integer;
begin
Both := TStringList.Create;
try
Both.AddStrings(List1);
Both.AddStrings(List2);
Both.Sort;
for I := 0 to Both.Count - 2 do
if (Both[I] = Both[I + 1]) then
if (Dupes.Count = 0) or (Dupes[Dupes.Count - 1] <> Both[I]) then
Dupes.Add(Both[I]);
finally
Both.Free;
end;
end;
procedure ExtractDuplicates3(List1, List2, Dupes: TStringList);
var
I: Integer;
begin
Dupes.Sorted := True;
Dupes.Duplicates := dupAccept;
Dupes.AddStrings(List1);
Dupes.AddStrings(List2);
for I := Dupes.Count - 1 downto 1 do
if (Dupes[I] <> Dupes[I - 1]) or (I > 1) and (Dupes[I] = Dupes[I - 2]) then
Dupes.Delete(I);
if (Dupes.Count > 1) and (Dupes[0] <> Dupes[1]) then
Dupes.Delete(0);
while (Dupes.Count > 1) and (Dupes[0] = Dupes[1]) do
Dupes.Delete(0);
end;
Although ExtractDuplicates3 marginally performs better, I prefer ExtractDuplicates1 because it reeds better and the TStrings parameter provides more usability. ExtractDuplicates2 performs noticeable worst, which demonstrates that sorting all items afterwards in a single run takes more time then continuously sorting every single item added.
Note
This answer is part of this recent answer for which I was about to ask the same question: "how to keep duplicates?". I didn't, but if anyone knows or finds a better solution, please comment, add or update this answer.
This is an old thread but I thought this solution may be useful.
An option is to pump the values from one stringlist to another one with the setting of TestStringList.Duplicates := dupError; and then trap the exception.
var TestStringList, DataStringList : TstringList;
TestStringList.Sorted := True;
TestStringList.Duplicates := dupError;
for i := 0 to DataStringList.Items-1 do
begin
try
TestStringList.Add(DataStringList[i])
except
on E : EStringListError do begin
memo1.Lines.Add('duplicate item found');
end;
end;
end;
....
Just note that the trapping of the exception also masks the following errors:
There is not enough memory to expand the list, the list tried to grow beyond its maximal capacity, a non-existent element of the list was referenced. (i.e. the list index was out of bounds).
function TestDuplicates(const dataStrList: TStringList): integer;
begin
with TStringlist.create do begin
{Duplicates:= dupIgnore;}
for it:= 0 to DataStrList.count-1 do begin
if IndexOf(DataStrList[it])< 0 then
Add(DataStrList[it])
else
inc(result)
end;
Free;
end;
end;

How do I split alternating lines of a text file into two arrays?

How would I read data from a text file into two arrays? One being string and the other integer?
The text file has a layout like this:
Hello
1
Test
2
Bye
3
Each number corresponds to the text above it. Can anyone perhaps help me? Would greatly appreciate it
var
Items: TStringList;
Strings: array of string;
Integers: array of Integer;
i, Count: Integer;
begin
Items := TStringList.Create;
try
Items.LoadFromFile('c:\YourFileName.txt');
// Production code should check that Items.Count is even at this point.
// Actual arrays here. Set their size once, because we know already.
// growing your arrays inside the iteration will cause many reallocations
// and memory fragmentation.
Count := Items.Count div 2;
SetLength(Strings, Count);
SetLength(Integers, Count);
for i := 0 to Count - 1 do
begin
Strings[i] := Items[i*2];
Integers[i] := StrToInt(Items[i*2+1]);
end;
finally
Items.Free;
end;
end
I would read the file into a string list and then process it item by item. The even ones are put into the list of strings, and the odd ones go into the numbers.
var
file, strings, numbers: TStringList;
...
//create the lists
file.LoadFromFile(filename);
Assert(file.Count mod 2=0);
for i := 0 to file.Count-1 do
if i mod 2=0 then
strings.Add(file[i])
else
numbers.Add(file[i]);
I'd probably use some helper functions called odd and even in my own code.
If you wanted the numbers in a list of integers, rather than a string list, then you would use TList<Integer> and add StrToInt(file[i]) on the odd iterations.
I've used lists rather than dynamic arrays for the ease of writing this code, but GolezTrol shows you how to do it with dynamic arrays if that's what you prefer.
That said, since your state that the number is associated with the string, you may actually be better off with something like this:
type
TNameAndID = record
Name: string;
ID: Integer;
end;
var
List: TList<TNameAndID>;
Item: TNameAndID;
...
List := TList<TNameAndID>.Create;
file.LoadFromFile(filename);
Assert(file.Count mod 2=0);
for i := 0 to file.Count-1 do begin
if i mod 2=0 then begin
Item.Name := file[i];
end else begin
Item.ID := StrToInt(file[i]);
List.Add(Item);
end;
end;
end;
The advantage of this approach is that you now have assurance that the association between name and ID will be maintained. Should you ever wish to sort, insert or remove items then you will find the above structure much more convenient than two parallel arrays.

Delphi label values sorting

Im trying to sort Label's values. I have lots of labels with an integer value. Labels are called like Label1, Label2, [...], which Im accessing through FindComponent. I have no problem in sorting the integer values Ive stored in an array, but the problem is, after sorting, I have no idea which label had what value. My goal is to like, sort those labels by their value, so I'd get like an array with Labels sorted by their value. Im stuck at this point :(
Eg:
Label1.Caption := 10;
Label2.Caption := 4;
Label3.Caption := 7;
for i := 1 to 3
do some_array[i] := StrToInt(TLabel(FindComponent('Label' + IntToStr(i))).Caption);
sortarray(some_array);
Now, I have sorted array, but Im lacking some sort procedure that would also store label number in the corresponding place. Can someone point me out?
Instead of creating an array of integers, create an array of TLabel controls. This one you can sort the same way as the array of integers. Indeed, given a MyLabel: TLabel, you can easily get the associated integer as StrToInt(MyLabel.Caption).
In addition, the FindComponent approach is not very efficient. I'd do
const
ALLOC_BY = 100;
MAGIC_TAG = 871226;
var
i: Integer;
ActualLength: integer;
FLabels: array of TLabel;
begin
SetLength(FLabels, ALLOC_BY);
ActualLength := 0;
for i := 0 to ControlCount - 1 do
if Controls[i] is TLabel then
with TLabel(Controls[i]) do
if Tag = MAGIC_TAG then
begin
if ActualLength = length(FLabels) then
SetLength(FLabels, length(FLabels) + ALLOC_BY);
FLabels[ActualLength] := Controls[i];
inc(ActualLength);
end;
SetLength(FLabels, ActualLength);
SortArray(FLabels) // with respect to the StrToInt(CurLabel.Caption) of each
// CurLabel: TLabel.
Of course, you can skip the chunk allocating if you know the number of labels in advance.
Make sure that each of the labels that are to be included in the array have the Tag set to MAGIC_TAG.
Another option would be to create an array
FLabelDataArray: array of TLabelData;
of
type
TLabelData = record
Control: TLabel;
Value: integer;
end;
where
FLabelDataArray[i].Value := StrToInt(FLabelDataArray[i].Control.Caption);
is computed only once.
A quick-n-dirty solution that also works in old Delphi versions, is to use TStringList, which has a Sort method and an Objects property that allow you to associate one object to each entry in the list.
Note that the list is sorted in lexicographic order, so the integers must be left padded with zeroes when converted to strings.
var
list: TStringList;
i: integer;
lab: TLabel;
begin
Label1.Caption := '10';
Label2.Caption := '4';
Label3.Caption := '7';
list := TStringList.Create;
try
for i := 1 to 3 do begin
lab := TLabel(FindComponent('Label' + IntToStr(i)));
list.AddObject(Format('%10.10d', [StrToInt(lab.Caption)]), lab);
end;
list.Sort;
for i := 0 to list.Count-1 do
Memo1.Lines.Add(list[i] + #9 + TLabel(list.Objects[i]).Name);
finally
list.Free;
end;
end;
The output would be:
0000000004 Label2
0000000007 Label3
0000000010 Label1
Also, if instead of list.Sort you use list.Sorted := true, you get binary search on list as a bonus (using list.IndexOf or list.Find).
Edit: As Rudy says, visual components such as TLabel should only be used for displaying data, and not for storing and manipulating it. It is recommended to use appropiate data structures for this and to separate the logic of the program from its user interface.
As Andreas says, put the labels into a dynamic array, rather than sorting the values. Once you have them in such an array, sort them like this:
uses
Generics.Defaults, Generics.Collections;
procedure SortLabels(var Labels: array of TLabel);
var
Comparison: TComparison<TLabel>;
Comparer: IComparer<TLabel>;
begin
Comparison := function(const Left, Right: TLabel): Integer
begin
Result := StrToInt(Left.Caption)-StrToInt(Right.Caption);
end;
Comparer := TDelegatedComparer<TLabel>.Create(Comparison);
TArray.Sort<TLabel>(Labels, Comparer);
end;
As others have said, I don't think you're taking the right approach for this task. But, based on your question, a simple hack would be to use the tag property on each label to store its caption's value :
Label1.Caption := '10';
Label1.Tag:=10;
Label2.Caption := '4';
Label2.Tag:=4;
Label3.Caption := '7';
Label3.Tag := 7;
Then you can find the appropriate label by matching the 'array of integer' entry with the label's tag property.
Again: As Rudy and others have commented, your approach to this task is far from desirable, and my solution only conforms to your approach. The tag property itself is a hack built into Delphi, an artifact from ancient times, like 'label' and 'goTo' and really should not be used except out of sheer desperation - like when trying to re-work and retro-fit ancient, poorly written code or if you've got something in prod and you must get in a quick fix for an emergency situation.

Enumerate possible set values in Delphi

I have a calculation algorithm in Delphi with a number of different options, and I need to try every combination of options to find an optimal solution.
TMyOption = (option1, option2, option3, option4);
TMyOptions = set of TMyOption;
I wondered about using an Integer loop to enumerate them:
for EnumerationInteger := 0 to 15 do begin
Options := TMyOptions(EnumerationInteger);
end;
This does not compile. What I was wondering was if there was any fairly simple method to convert from Integer to Set (most questions on the Web try to go the other way, from Set to Integer), and if so what is it?
Another possibility is to just use the Integer as a bit-field:
C_Option1 = 1;
C_Option2 = 2;
C_Option3 = 4;
C_Option4 = 8;
and then test membership with a bitwise and:
if (Options and C_Option2) > 0 then begin
...
end;
I've tried this, and it works, but it feels like working with sets would be more natural and use the type system better (even though I'm going outside the said type system to enumerate the sets).
Is there a better/safer way to enumerate all possible set combinations than enumerating the underlying integer representation?
Notes:
I know that the integer values of a set are not guaranteed in theory (though I suspect they are in practice if you don't play with the enumeration numbering).
There could be more than four options (yes, I know that it grows exponentially and if there are too many options the algorithm could take forever).
I know this question is quite old, but this is my preference since it's simple and natural to me :
function NumericToMyOptions(n: integer): TMyOptions;
var
Op: TMyOption;
begin
Result:= [];
for Op:= Low(TMyOption) to High(TMyOption) do
if n and (1 shl ord(Op)) > 0 then Include(Result, Op);
end;
Try
var EnumerationByte: Byte;
...
for EnumerationByte := 0 to 15 do begin
Options := TMyOptions(EnumerationByte);
end;
Your code does not compile because your enumeration (TMyOption) have less than 8 values, and Delphi utilize the minimum possible size (in bytes) for sets. Thus, a byte variable will work for you.
If you have a set with more than 8 but less than 16 possible elements, a Word will work (and not an integer).
For more than 16 but less than 32 a DWord variable and typecast.
For more than 32 possible elements, I think a better approach is to use an array of bytes or something like that.
500 - Internal Server Error's answer is probably the most simple.
Another approach that would less likely to break with changes to the number of options would be to declare an array of boolean, and switch them on/off. This is slower than working with pure integers though. The main advantage, you won't need to change the integer type you use, and you can use it if you have more than 32 options.
procedure DoSomething
var BoolFlags : Array[TOption] of Boolean;
I: TOption;
function GetNextFlagSet(var Bools : Array of Boolean) : Boolean;
var idx, I : Integer;
begin
idx := 0;
while Bools[idx] and (idx <= High(Bools)) do Inc(idx);
Result := idx <= High(Bools);
if Result then
for I := 0 to idx do
Bools[I] := not Bools[I];
end;
begin
for I := Low(BoolFlags) to High(BoolFlags) do BoolFlags[i] := False;
repeat
if BoolFlags[Option1] then
[...]
until not GetNextFlagSet(BoolFlags);
end;
Casting from an Integer to a Set is not possible, but Tondrej once wrote a blog article on SetToString and StringToSet that exposes what you want in the SetOrdValue method:
uses
TypInfo;
procedure SetOrdValue(Info: PTypeInfo; var SetParam; Value: Integer);
begin
case GetTypeData(Info)^.OrdType of
otSByte, otUByte:
Byte(SetParam) := Value;
otSWord, otUWord:
Word(SetParam) := Value;
otSLong, otULong:
Integer(SetParam) := Value;
end;
end;
Your code then would become this:
for EnumerationInteger := 0 to 15 do begin
SetOrdValue(TypeInfo(TMyOptions), Options, EnumerationInteger);
end;
--jeroen
The problem is that you are trying to cast to the set type instead of the enumerated type. You can cast between integer and enumerated because both are ordinal types, but you can't cast to a set because they use bitfiels as you already noted. If you use:
for EnumerationInteger := 0 to 15 do begin
Option := TMyOption(EnumerationInteger);
end;
it would work, although is not what you want.
I had this same problem a few months ago and came to the conclusion that you can't enumerate the contents of a set in Delphi (at least in Delphi 7) because the language doesn't define such operation on a set.
Edit: It seems that you can even in D7, see coments to this answer.

Really fast function to compare the name (full path) of two files

I have to check if I have duplicate paths in a FileListBox (FileListBox has the role of some kind of job list or play list).
Using Delphi's SameText, CompareStr, CompareText, takes 6 seconds. So I came with my own compare function which is (just) a bit faster but not fast enough. Any ideas how to improve it?
function SameFile(CONST Path1, Path2: string): Boolean;
VAR i: Integer;
begin
Result:= Length(Path1)= Length(Path2); { if they have different lenghts then obviously are not the same file }
if Result then
for i:= Length(Path1) downto 1 DO { start from the end because it is more likely to find the difference there }
if Path1[i]<> Path2[i] then
begin
Result:= FALSE;
Break;
end;
end;
I use it like this:
for x:= JList.Count-1 downto 1 DO
begin
sMaster:= JList.Items[x];
for y:= x-1 downto 0 DO
if SameFile(sMaster, JList.Items[y]) then
begin
JList.Items.Delete (x); { REMOVE DUPLICATES }
Break;
end;
end;
Note: The chance of having duplicates is small so Delete is not called often. Also the list cannot be sorted because the items are added by user and sometimes the order may be important.
Update:
The thing is that I lose the asvantage of my code because it is Pascal.
It would be nice if the comparison loop ( Path1[i]<> Path2[i] ) would be optimized to use Borland's ASM code.
Delphi 7, Win XP 32 bit, Tests were done with 577 items in the list. Deleting the items from list IS NOT A PROBLEM because it happens rarely.
CONCLUSION
As Svein Bringsli pointed, my code is slow not because of the comparing algorithm but because of TListBox. The BEST solution was provided by Marcelo Cantos. Thanks a lot Marcelo.
I accepted Svein's answer because it answers directly my question "how to make my comparison function faster" with "there is no point to make it faster".
For the moment I implemented the dirty and quick to implement solution: when I have under 200 files, I use my slow code to check the duplicates. If there are more than 200 files I use dwrbudr's solution (which is damn fast) considering that if the user has so many files, the order is irrelevant anyway (human brain cannot track so many items).
I want to thank you all for ideas and especially Svein for revealing the truth: (Borland's) visual controls are damn slow!
Don't waste time optimising the assembler. You can go from O(n2) to O(n log(n)) — bringing the time down to milliseconds — by sorting the list and then doing a linear scan for duplicates.
While you're at it, forget the SameFile function. The algorithmic improvement will dwarf anything you can achieve there.
Edit: Based on feedback in the comments...
You can perform an order-preserving O(n log(n)) de-duplication as follows:
Sort a copy of the list.
Identify and copy duplicated entries to a third list along with their duplication count minus one.
Walk the original list backwards as per your original version.
In the inner (for y := ...) loop, traverse the duplication list instead. If an outer item matches, delete it, decrement the duplication count, and delete the duplication entry if the count reaches zero.
This is obviously more complicated but it will still be orders of magnitude faster, even if you do horrible dirty things like storing duplication counts as strings, C:\path1\file1=2, and using code like:
y := dupes.IndexOfName(sMaster);
if y <> -1 then
begin
JList.Items.Delete(x);
c := StrToInt(dupes.ValueFromIndex(y));
if c > 1 then
dupes.Values[sMaster] = IntToStr(c - 1);
else
dupes.Delete(y);
end;
Side note: A binary chop would be more efficient than the for y := ... loop, but given that duplicates are rare, the difference ought to be negligible.
Using your code as a starting point, I modified it to take a copy of the list before searching for duplicates. The time went from 5,5 seconds to about 0,5 seconds.
vSL := TStringList.Create;
try
vSL.Assign(jList.Items);
vSL.Sorted := true;
for x:= vSL.Count-1 downto 1 DO
begin
sMaster:= vSL[x];
for y:= x-1 downto 0 DO
if SameFile(sMaster, vSL[y]) then
begin
vSL.Delete (x); { REMOVE DUPLICATES }
jList.Items.Delete (x);
Break;
end;
end;
finally
vSL.Free;
end;
Obviously, this is not a good way to do it, but it demonstrates that TFileListBox is in itself quite slow. I don't believe you can gain much by optimizing your compare-function.
To demonstrate this, I replaced your SameFile function with the following, but kept the rest of your code:
function SameFile(CONST Path1, Path2: string): Boolean;
VAR i: Integer;
begin
Result := false; //Pretty darn fast code!!!
end;
The time went from 5,6 seconds to 5,5 seconds. I don't think there's much more to gain there :-)
Create another sorted list with sortedList.Duplicates := dupIgnore and add your strings to that list, then back.
vSL := TStringList.Create;
try
vSL.Sorted := true;
vSL.Duplicates := dupIgnore;
for x:= 0 to jList.Count - 1 do
vSL.Add(jList[x]);
jList.Clear;
for x:= 0 to vSL.Count - 1 do
jList.Add(vSL[x]);
finally
vSL.Free;
end;
The absolute fastest way, bar none (as alluded to before) is to use a routine that generates a unique 64/128/256 bit hash code for a string (I use the SHA256Managed class in C#). Run down the list of strings, generate the hash code for the strings, check for it in the sorted hash code list, and if found then the string is a duplicate. Otherwise add the hash code to the sorted hash code list.
This will work for strings, file names, images (you can get the unique hash code for an image), etc, and I guarantee that this will be as fast or faster than any other impementation.
PS You can use a string list for the hash codes by representing the hash codes as strings. I've used a hex representation in the past (256 bits -> 64 characters) but in theory you can do it any way you like.
4 seconds for how many calls? Great performance if you call it a billion times...
Anyway, does Length(Path1) get evaluated every time through the loop? If so, store that in an Integer variable prior to looping.
Pointers may yield some speed over the strings.
Try in-lining the function with:
function SameFile(blah blah): Boolean; Inline;
That will save some time, if this is being called thousands of times per second. I would start with that and see if it saves anything.
EDIT: I didn't realize that your list wasn't sorted. Obviously, you should do that first! Then you don't have to compare against every other item in the list - just the prior or next one.
I use a modified Ternary Search Tree (TST) to dedupe lists. You simply load the items into the tree, using the whole string as the key, and on each item you can get back an indication if the key is already there (and delete your visible entry). Then you throw away the tree. Our TST load function can typically load 100000 80-byte items in well under a second. And it could not take any more than this to repaint your list, with proper use of begin- and end-update. The TST is memory-hungry, but not so that you would notice it at all if you only have of the order of 500 items. And much simpler than sorting, comparisons and assembler (if you have a suitable TST implementation, of course).
No need to use a hash table, a single sorted list gives me a result of 10 milliseconds, that's 0.01 seconds, which is about 500 times faster! Here is my test code using a TListBox:
procedure TForm1.Button1Click(Sender: TObject);
var
lIndex1: Integer;
lString: string;
lIndex2: Integer;
lStrings: TStringList;
lCount: Integer;
lItems: TStrings;
begin
ListBox1.Clear;
for lIndex1 := 1 to 577 do begin
lString := '';
for lIndex2 := 1 to 100 do
if (lIndex2 mod 6) = 0 then
lString := lString + Chr(Ord('a') + Random(2))
else
lString := lString + 'a';
ListBox1.Items.Add(lString);
end;
CsiGlobals.AddLogMsg('Start', 'Test', llBrief);
lStrings := TStringList.Create;
try
lStrings.Sorted := True;
lCount := 0;
lItems := ListBox1.Items;
with lItems do begin
BeginUpdate;
try
for lIndex1 := Count - 1 downto 0 do begin
lStrings.Add(Strings[lIndex1]);
if lStrings.Count = lCount then
Delete(lIndex1)
else
Inc(lCount);
end;
finally
EndUpdate;
end;
end;
finally
lStrings.Free;
end;
CsiGlobals.AddLogMsg('Stop', 'Test', llBrief);
end;
I'd also like to point out that your solution would take an extreme amount of time if applied to a huge list (like containing 100,000,000 items or more). Even constructing a hashtable or sorted list would take too much time.
In cases like that you could try another approach : Hash each member, but instead of populating a full-blown hashtable, create a bitset (large enough to contain a close factor to as many slots as there are input items) and just set each bit at the offset indicated by the hashfunction. If the bit was 0, change it to 1. If it was already 1, take note of the offending string-index in a separate list and continue. This results in a list of string-indexes that had a collision in the hash, so you'll have to run it a second time to find the first cause of those collisions. After that, you should sort & de-dupe the string-indexes in this list (as all indexes apart from the first one will be present twice). Once that's done you should sort the list again, but this time sort it on the string-contents in order to easily spot duplicates in a following single scan.
Granted it could be a bit extreme to go this all this length, but at least it's a workable solution for very large volumes! (Oh, and this still won't work if the number of duplicates is very high, when the hash-function has a bad spread or when the number of slots in the 'hashtable' bitset is chosen too small - which would give many collisions which aren't really duplicates.)

Resources