When I need to find the first node in a TTreeView, I call TTreeNodes.GetFirstNode. However, I sometimes need to locate the last node in a tree and there is no corresponding TTreeNodes.GetLastNode function.
I don't want to use Items[Count-1] since that results in the entire tree being walked with Result := Result.GetNext. Naturally this only matters if the tree views have a lot of nodes. I fully appreciate the virtues of virtual container controls but I am not going to switch to Virtual TreeView just yet.
So far I have come up with the following:
function TTreeNodes.GetLastNode: TTreeNode;
var
  Node: TTreeNode;
begin
  Result := GetFirstNode;
  if not Assigned(Result) then begin
    exit;
  end;
  while True do begin
    Node := Result.GetNextSibling;
    if not Assigned(Node) then begin
      Node := Result.GetFirstChild;
      if not Assigned(Node) then begin
        exit;
      end;
    end;
    Result := Node;
  end;
end;
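For what it's worth, since TTreeNodes itself cannot be modified, one way to hang a method like this onto it is a class helper; a minimal declaration sketch, assuming Delphi 2006 or later:

type
  // Declaration only - hypothetical helper so GetLastNode can be called as if it
  // were a method of TTreeNodes, without touching the VCL source.
  TTreeNodesHelper = class helper for TTreeNodes
  public
    function GetLastNode: TTreeNode;
  end;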
Can anyone:
Find a flaw in my logic?
Suggest improvements?
Edit 1
I'm reluctant to keep my own cache of the nodes. I have been doing just that until recently but have discovered some hard-to-track, very intermittent AVs which I believe must be due to my cache getting out of sync. Clearly one solution would be to get my cache synchronisation code to work correctly, but I have an aversion to caches because of the hard-to-track bugs that arise when you get it wrong.
Although I am not a non-Exit purist, I think that when it is doable without Exit while keeping readability intact, one might prefer that option.
So here is essentially the same code - I don't think there is any faster way to get to the end node - but without Exit and slightly more compact:
function TTreeNodes.GetLastNode: TTreeNode;
var
  Node: TTreeNode;
begin
  Node := GetFirstNode;
  Result := Node;
  if Result <> nil then
    repeat
      Result := Node;
      if Node <> nil then
        Node := Result.GetNextSibling;
      if Node = nil then
        Node := Result.GetFirstChild;
    until Node = nil;
end;
The method I've used before is to maintain a TList, using List.Add in the OnAddition event and List.Remove in the OnDeletion event. You then have access to List[List.Count-1] (or whatever you need) pretty much instantly.
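A minimal sketch of that approach (FNodeList is a hypothetical TList field on the form; the tree view's OnAddition and OnDeletion events are assumed to be wired to these handlers):

procedure TForm1.TreeView1Addition(Sender: TObject; Node: TTreeNode);
begin
  FNodeList.Add(Node); // cache every node as it is created
end;

procedure TForm1.TreeView1Deletion(Sender: TObject; Node: TTreeNode);
begin
  FNodeList.Remove(Node); // keep the cache in sync when nodes are destroyed
end;

TForm1, TreeView1 and FNodeList are of course placeholders for your own form, tree view and list.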
Post edit - I have to say that although this worked fine, I then grew up and moved to Virtual Tree View :-)
If I were to implement it, this would probably be my first draft.
function TTreeNodes.GetLastNode: TTreeNode;
var
  Node: TTreeNode;

  function GetLastSibling(aNode: TTreeNode): TTreeNode;
  begin
    if not Assigned(aNode) then
      EXIT(nil);
    repeat
      Result := aNode;
      aNode := Result.GetNextSibling;
    until not Assigned(aNode);
  end;

begin
  Node := GetFirstNode;
  if not Assigned(Node) then begin
    exit;
  end;
  repeat
    Result := GetLastSibling(Node);
    Node := Result.GetFirstChild;
  until not Assigned(Node);
end;
I find this slightly more readable. It might be slightly slower though.
I'm unsure whether this approach would be faster than Items[Count-1]; in some cases it could be slower, as TTreeNodes actually caches the last node accessed through the Items property.
Related
I want the selected list data to populate the tree. Here was my attempt:
if ListView1.Items[Count].Selected then
begin
  Root := ListView1.Items[Count].Caption;
  for Itr := TreeView1.Items.Count-1 downto 0 do Begin
    if TreeView1.items[itr].Parent.Text = Root then begin
      TreeNode := TreeView1.Items[itr].getFirstChild;
      Treeview1.Items.AddChild(Treenode,ListView1.Items[Count].SubItems[0]);
      break;
    end;
  end;
end;
But it's creating a new node and there is an index error. Kindly help.
You haven't provided any details about what your ListView data looks like, or what the TreeView is supposed to look like in relation to that data. But I suspect your code should probably look more like this instead:
if ListView1.Items[Count].Selected then
begin
  Root := ListView1.Items[Count].Caption;
  for Itr := TreeView1.Items.Count-1 downto 0 do begin
    TreeNode := TreeView1.Items[itr];
    if TreeNode.Text = Root then begin
      TreeView1.Items.AddChild(TreeNode, ListView1.Items[Count].SubItems[0]);
      break;
    end;
  end;
end;
Note that iterating through a TreeView backwards is VERY inefficient. Trees are not inherently indexable, so every time you access a node by index via the TreeView1.Items[] property, it has to start with the first node in the tree and iterate forwards counting nodes until it reaches the specified index. You are repeating that same forward scan for every node you access while going backwards. That is a lot of wasted overhead.
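If the loop really must look at every node, a node-based traversal avoids those repeated index lookups entirely. A rough sketch, reusing the Root and TreeNode variables from the snippet above (note that it finds the first match from the top, whereas the downto loop above searches from the bottom):

// Walk the tree once, front to back, instead of indexing Items[] in reverse.
TreeNode := TreeView1.Items.GetFirstNode;
while TreeNode <> nil do
begin
  if TreeNode.Text = Root then
  begin
    TreeView1.Items.AddChild(TreeNode, ListView1.Items[Count].SubItems[0]);
    Break;
  end;
  TreeNode := TreeNode.GetNext;
end;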
I want to read the entire table from an MS Access file and I'm trying to do it as fast as possible. When testing a big sample I found that the loop counter increases faster when it's reading the top records compared to the last records of the table. Here's sample code that demonstrates this:
procedure TForm1.Button1Click(Sender: TObject);
const
  MaxRecords = 40000;
  Step = 5000;
var
  I, J: Integer;
  Table: TADOTable;
  T: Cardinal;
  Ts: TCardinalDynArray;
begin
  Table := TADOTable.Create(nil);
  Table.ConnectionString :=
    'Provider=Microsoft.ACE.OLEDB.12.0;'+
    'Data Source=BigMDB.accdb;'+
    'Mode=Read|Share Deny Read|Share Deny Write;'+
    'Persist Security Info=False';
  Table.TableName := 'Table1';
  Table.Open;
  J := 0;
  SetLength(Ts, MaxRecords div Step);
  T := GetTickCount;
  for I := 1 to MaxRecords do
  begin
    Table.Next;
    if ((I mod Step) = 0) then
    begin
      T := GetTickCount - T;
      Ts[J] := T;
      Inc(J);
      T := GetTickCount;
    end;
  end;
  Table.Free;
//  Chart1.SeriesList[0].Clear;
//  for I := 0 to Length(Ts) - 1 do
//  begin
//    Chart1.SeriesList[0].Add(Ts[I]/1000, Format(
//      'Records: %s %d-%d %s Duration:%f s',
//      [#13, I * Step, (I + 1)*Step, #13, Ts[I]/1000]));
//  end;
end;
And the result on my PC (the timing chart is omitted here) shows each successive chunk of records taking longer to read than the previous one.
The table has two string fields, one double and one integer. It has no primary key or index field. Why does this happen and how can I prevent it?
I can reproduce your results using an AdoQuery with an MS SQL Server dataset of similar size to yours.
However, after doing a bit of line-profiling, I think I've found the answer to this, and it's slightly counter-intuitive. I'm sure everyone who does DB programming in Delphi is used to the idea that looping through a dataset tends to be much quicker if you surround the loop by calls to Disable/EnableControls. But who would bother to do that if there are no db-aware controls attached to the dataset?
Well, it turns out that in your situation, even though there are no DB-aware controls, the speed increases hugely if you use Disable/EnableControls regardless.
The reason is that TCustomADODataSet.InternalGetRecord in ADODB.pas contains this:
if ControlsDisabled then
  RecordNumber := -2
else
  RecordNumber := Recordset.AbsolutePosition;
and according to my line profiler, the while not AdoQuery1.Eof do AdoQuery1.Next loop spends 98.8% of its time executing the assignment
RecordNumber := Recordset.AbsolutePosition;
That is where virtually all the time goes. The calculation of Recordset.AbsolutePosition is hidden, of course, on the "wrong side" of the Recordset interface, but the fact that the time to call it apparently increases the further you go into the recordset makes it reasonable, IMO, to speculate that it's calculated by counting from the start of the recordset's data.
Of course, ControlsDisabled returns true if DisableControls has been called and not undone by a call to EnableControls. So, retest with the loop surrounded by Disable/EnableControls and hopefully you'll get a similar result to mine. It looks like you were right that the slowdown isn't related to memory allocations.
Using the following code:
procedure TForm1.btnLoopClick(Sender: TObject);
var
  I: Integer;
  T: Integer;
  Step: Integer;
begin
  Memo1.Lines.BeginUpdate;
  I := 0;
  Step := 4000;
  if cbDisableControls.Checked then
    AdoQuery1.DisableControls;
  T := GetTickCount;
{.$define UseRecordSet}
{$ifdef UseRecordSet}
  while not AdoQuery1.Recordset.Eof do begin
    AdoQuery1.Recordset.MoveNext;
    Inc(I);
    if I mod Step = 0 then begin
      T := GetTickCount - T;
      Memo1.Lines.Add(IntToStr(I) + ':' + IntToStr(T));
      T := GetTickCount;
    end;
  end;
{$else}
  while not AdoQuery1.Eof do begin
    AdoQuery1.Next;
    Inc(I);
    if I mod Step = 0 then begin
      T := GetTickCount - T;
      Memo1.Lines.Add(IntToStr(I) + ':' + IntToStr(T));
      T := GetTickCount;
    end;
  end;
{$endif}
  if cbDisableControls.Checked then
    AdoQuery1.EnableControls;
  Memo1.Lines.EndUpdate;
end;
I get the following results (with DisableControls not called except where noted):
Using CursorLocation = clUseClient
AdoQuery.Next      AdoQuery.RecordSet.MoveNext      AdoQuery.Next + DisableControls
4000:157 4000:16 4000:15
8000:453 8000:16 8000:15
12000:687 12000:0 12000:32
16000:969 16000:15 16000:31
20000:1250 20000:16 20000:31
24000:1500 24000:0 24000:16
28000:1703 28000:15 28000:31
32000:1891 32000:16 32000:31
36000:2187 36000:16 36000:16
40000:2438 40000:0 40000:15
44000:2703 44000:15 44000:31
48000:3203 48000:16 48000:32
=======================================
Using CursorLocation = clUseServer
AdoQuery.Next      AdoQuery.RecordSet.MoveNext      AdoQuery.Next + DisableControls
4000:1031 4000:454 4000:563
8000:1016 8000:468 8000:562
12000:1047 12000:469 12000:500
16000:1234 16000:484 16000:532
20000:1047 20000:454 20000:546
24000:1063 24000:484 24000:547
28000:984 28000:531 28000:563
32000:906 32000:485 32000:500
36000:1016 36000:531 36000:578
40000:1000 40000:547 40000:500
44000:968 44000:406 44000:562
48000:1016 48000:375 48000:547
Calling AdoQuery1.Recordset.MoveNext calls directly into the MDAC/ADO layer, of course, whereas AdoQuery1.Next involves all the overhead of the standard TDataSet model. As Serge Kraikov said, changing the CursorLocation certainly makes a difference and doesn't exhibit the slowdown we noticed, though obviously it's significantly slower than using clUseClient and calling DisableControls. I suppose it depends on exactly what you're trying to do whether you can take advantage of the extra speed of using clUseClient with RecordSet.MoveNext.
When you open a table, the ADO dataset internally creates special data structures to navigate the dataset forward/backward - a "dataset CURSOR". During navigation, ADO stores the list of already-visited records to provide bidirectional navigation.
It seems the ADO cursor code uses a quadratic-time, O(n²), algorithm to store this list.
But there is a workaround - use a server-side cursor:
Table.CursorLocation := clUseServer;
I tested your code using this fix and got linear fetch time - fetching each next chunk of records takes the same time as the previous one.
PS: Some other data access libraries provide special "unidirectional" datasets - these datasets can only traverse forward and don't even store already-traversed records - so you get constant memory consumption and linear fetch time.
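For example, if FireDAC happens to be available (an assumption - the question uses plain ADO), a unidirectional, forward-only fetch looks roughly like this:

// Hypothetical sketch: FDQuery1 is a TFDQuery. With no bidirectional record
// cache, memory stays flat and fetch time stays linear.
FDQuery1.FetchOptions.Unidirectional := True;
FDQuery1.FetchOptions.CursorKind := ckForwardOnly;
FDQuery1.SQL.Text := 'SELECT * FROM Table1';
FDQuery1.Open;
while not FDQuery1.Eof do
begin
  // process the current record here
  FDQuery1.Next;
end;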
DAO is native to Access and (IMHO) is typically faster.
Whether or not you switch, use the GetRows method. Both DAO and ADO support it.
There is no looping. You can dump the entire recordset into an array with a couple of lines of code. Air code:
yourrecordset.MoveLast
yourrecordset.MoveFirst
yourarray = yourrecordset.GetRows(yourrecordset.RecordCount)
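In Delphi terms, the same idea through TADOTable's underlying Recordset interface might look roughly like this (untested sketch, using the Table variable from the question; GetRows returns a two-dimensional OleVariant array indexed field-by-record):

var
  Data: OleVariant;
begin
  Table.Recordset.MoveLast;  // force the provider to fetch everything
  Table.Recordset.MoveFirst;
  Data := Table.Recordset.GetRows(Table.Recordset.RecordCount, EmptyParam, EmptyParam);
  // Data[FieldIndex, RecordIndex] now holds the values - no Next loop required
end;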
Ex.:
- link 1
-- link 1.1
--- link 1.1.1(only)
---- link 1.1.1.1 (not expand)
-- link 1.2
--- link 1.2.1 (only)
---- link 1.2.1.1 (not expand)
How can I expand only link 1.1, link 1.2, and so on?
There is no built-in functionality for expanding multiple items or items on a specific level, so there is no other way than traversing through the items. Call the Expand method on all second-level items, as well as on all first-level items, otherwise the second-level items will not be shown. Its Recurse parameter should be False in order not to expand a possibly third or deeper level.
There are two ways to traverse through a TreeView's items: by Item index and by Node. Normally, operations on a TreeView's items are preferably done by Node, because the Items property's getter traverses through all Items to find a single Item with a specific index. However, TTreeNodes caches the last retrieved item, so by incrementing the loop index by 1, the harm is minimized.
The easy solution then becomes:
procedure ExpandTreeNodes(Nodes: TTreeNodes; Level: Integer);
var
  I: Integer;
begin
  Nodes.BeginUpdate;
  try
    for I := 0 to Nodes.Count - 1 do
      if Nodes[I].Level < Level then
        Nodes[I].Expand(False);
  finally
    Nodes.EndUpdate;
  end;
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  ExpandTreeNodes(TreeView1.Items, 2);
end;
Note that the Items property's getter is still called twice per loop iteration. Despite the caching mechanism, I think this should still be avoided:
procedure ExpandTreeNodes(Nodes: TTreeNodes; Level: Integer);
var
  I: Integer;
  Node: TTreeNode;
begin
  Nodes.BeginUpdate;
  try
    for I := 0 to Nodes.Count - 1 do
    begin
      Node := Nodes[I];
      if Node.Level < Level then
        Node.Expand(False);
    end;
  finally
    Nodes.EndUpdate;
  end;
end;
But then you might as well use:
procedure ExpandTreeNodes(Nodes: TTreeNodes; Level: Integer);
var
  Node: TTreeNode;
begin
  Nodes.BeginUpdate;
  try
    Node := Nodes.GetFirstNode;
    while Node <> nil do
    begin
      if Node.Level < Level then
        Node.Expand(False);
      Node := Node.GetNext;
    end;
  finally
    Nodes.EndUpdate;
  end;
end;
This still traverses all Items - OK, just once. But if you have a really big and/or deep tree, and you want maximum efficiency, and you do not care about readability, or you just want to experiment with a TreeView's Nodes for fun, then use:
procedure ExpandTreeNodes(Nodes: TTreeNodes; Level: Integer);
var
  Node: TTreeNode;
  Next: TTreeNode;
begin
  if Level < 1 then
    Exit;
  Nodes.BeginUpdate;
  try
    Node := Nodes.GetFirstNode;
    while Node <> nil do
    begin
      Node.Expand(False);
      if (Node.Level < Level - 1) and Node.HasChildren then
        Node := Node.GetFirstChild
      else
      begin
        Next := Node.GetNextSibling;
        if Next <> nil then
          Node := Next
        else
          if Node.Level > 0 then
            Node := Node.Parent.GetNextSibling
          else
            Node := Node.GetNextSibling;
      end;
    end;
  finally
    Nodes.EndUpdate;
  end;
end;
I'm working on a TClientDataSet that the user can filter at any time based on some criteria. My problem is, we'd like the dataset's cursor to remain positioned "mostly" at the same place after filtering. ("Mostly" in quotes since, of course, it can't stay at the same place if the current record is filtered out.)
After doing some research, the best I could come up with is the following:
procedure RefreshFilter;
var
  I: Integer;
  sFilter: string;
  vIndexValue: array of TVarRec;
  vIndexValueAsVar: array of Variant;
begin
  sFilter := GenerateNewFilterExpression;
  if sFilter <> MyDataset.Filter then
  begin
    if MyDataset.IndexFieldCount > 0 then
    begin
      SetLength(vIndexValueAsVar, MyDataset.IndexFieldCount);
      SetLength(vIndexValue, MyDataset.IndexFieldCount);
      for I := 0 to MyDataset.IndexFieldCount - 1 do
      begin
        vIndexValueAsVar[I] := MyDataset.IndexFields[I].AsVariant;
        vIndexValue[I].VType := vtVariant;
        vIndexValue[I].VVariant := @vIndexValueAsVar[I]; // point at the saved key value
      end;
    end;
    MyDataset.Filtered := sFilter <> '';
    MyDataset.Filter := sFilter;
    if MyDataset.IndexFieldCount > 0 then
    begin
      MyDataset.FindNearest(vIndexValue);
    end;
  end;
end;
Even though it works pretty well, I find the solution a bit "bulky". I was wondering if there was some built-in function or a different approach that might be more elegant and less "heavy".
And please, don't mention bookmarks... Bookmarks don't work properly after changing the active filter, and not at all if your record gets filtered out.
I have developed an application that scans basically everywhere for a file or list of files.
When I scan small folders, say 10 000 files and subfolders, there is no problem. But when I scan, for instance, my entire Users folder with more than 100 000 items, it is very heavy on my processor: it takes about 40% of my processor's power.
Is there a way to optimize this code so that it uses less CPU?
procedure GetAllSubFolders(sPath: String);
var
  Path: String;
  Rec: TSearchRec;
begin
  try
    Path := IncludeTrailingBackslash(sPath);
    if FindFirst(Path + '*.*', faAnyFile, Rec) = 0 then
      try
        repeat
          Application.ProcessMessages;
          if (Rec.Name <> '.') and (Rec.Name <> '..') then
          begin
            if (ExtractFileExt(Path + Rec.Name) <> '') And
               (ExtractFileExt(Path + Rec.Name).ToLower <> '.lnk') And
               (Directoryexists(Path + Rec.Name + '\') = False) then
            begin
              if (Pos(Path + Rec.Name, main.Memo1.Lines.Text) = 0) then
              begin
                main.ListBox1.Items.Add(Path + Rec.Name);
                main.Memo1.Lines.Add(Path + Rec.Name)
              end;
            end;
            GetAllSubFolders(Path + Rec.Name);
          end;
        until FindNext(Rec) <> 0;
      finally
        FindClose(Rec);
      end;
  except
    on e: Exception do
      ShowMessage(e.Message);
  end;
end;
My app searches for all the files in a selected folder and its sub-folders, zips them and copies them to another location you specify.
The Application.ProcessMessages call is there to make sure the application doesn't look like it is hanging, so that the user doesn't close it; finding 100 000 files, for instance, can take an hour or so...
I am concerned about the processor usage; the memory is not really affected.
Note: The memo is to make sure the same files are not selected twice.
I see the following performance problems:
The call to Application.ProcessMessages is somewhat expensive. You are polling for messages rather than using a blocking wait, i.e. GetMessage. As well as the performance issue, the use of Application.ProcessMessages is generally an indication of poor design for various reasons and one should, in general, avoid the need to call it.
A non-virtual list box performs badly with a lot of files.
Using a memo control (a GUI control) to store a list of strings is exceptionally expensive.
Every time you add to the GUI controls they update and refresh which is very expensive.
The evaluation of Memo1.Lines.Text is extraordinarily expensive.
The use of Pos is likewise massively expensive.
The use of DirectoryExists is expensive and spurious. The attributes returned in the search record contain that information.
I would make the following changes:
Move the search code into a thread to avoid the need for ProcessMessages. You'll need to devise some way to transport the information back to the main thread for display in the GUI.
Use a virtual list view to display the files.
Store the list of files that you wish to search for duplicates in a dictionary which gives you O(1) lookup. Take care with case-insensitivity of file names, an issue that you have perhaps neglected so far. This replaces the memo.
Check whether an item is a directory by using Rec.Attr. That is, check that (Rec.Attr and faDirectory) <> 0. A rough sketch of these last two points follows below.
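(FSeen is a hypothetical TDictionary<string, Boolean> from System.Generics.Collections, created by the caller, and FoundFiles is the hypothetical results list handed back to the GUI later; only the relevant part of the loop body is shown.)

// Inside the FindFirst/FindNext loop, after skipping '.' and '..':
if (Rec.Attr and faDirectory) <> 0 then
  GetAllSubFolders(Path + Rec.Name)             // recurse only into directories
else if not FSeen.ContainsKey(LowerCase(Path + Rec.Name)) then
begin
  FSeen.Add(LowerCase(Path + Rec.Name), True);  // O(1), case-insensitive duplicate check
  FoundFiles.Add(Path + Rec.Name);
end;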
I agree with the answer which says you would do best to do what you're doing in a background thread and I don't want to encourage you to persist in doing it in your main thread.
However, if you go to a command prompt and do this:
dir c:\*.* /s > dump.txt & notepad dump.txt
you may be surprised quite how quickly Notepad pops into view.
So there are a few things you could do to speed up your GetAllSubFolders, even if you keep it in your main thread, e.g. bracket the code with calls to main.Memo1.Lines.BeginUpdate and main.Memo1.Lines.EndUpdate, and likewise main.Listbox1.Items.BeginUpdate and EndUpdate. This will stop these controls being updated while it executes (which is actually what your code is spending most of its time doing - that, and the "if Pos( ...)" business I've commented on below). And, if you haven't gathered already, Application.ProcessMessages is evil (mostly).
I did some timings on my D: drive, which is a 500 GB SSD with 263562 files in 35949 directories.
The code in your q: 6777 secs
Doing a dir to Notepad as per the above: 15 secs
The code below, in main thread: 9.7 secs
The reason I've included the code below in this answer is that you'll find it much easier to execute in a thread because it gathers the results into a TStringList, whose contents you can then assign to your memo and listbox once the thread has completed.
A few comments on the code in your q, which I imagine you might have got from somewhere.
It pointlessly recurses even when the current entry in Rec is a plain file. The code below only recurses if the current Rec entry is a directory.
It apparently tries to avoid duplicates by the "if Pos( ...)" business, which shouldn't be necessary (except maybe if there's a symbolic link (e.g. created with the MkLink command) somewhere that points elsewhere on the drive), and does it in a highly inefficient manner, i.e. by searching for the filename in the memo contents, which gets longer and longer as it finds more files. In the code below, the stringlist is set up to discard duplicates and has its Sorted property set to True, which makes its checking for duplicates much quicker, because it can then do a binary search through its contents rather than a serial one.
It calculates Path + Rec.Name six times for each thing it finds, which is avoidably inefficient at run-time and inflates the source code. This is only a minor point, though, compared to the first two.
Code:
function GetAllSubFolders(sPath: String): TStringList;

  procedure GetAllSubFoldersInner(sPath: String);
  var
    Path,
    AFileName,
    Ext: String;
    Rec: TSearchRec;
    Done: Boolean;
  begin
    Path := IncludeTrailingBackslash(sPath);
    if FindFirst(Path + '*.*', faAnyFile, Rec) = 0 then begin
      Done := False;
      while not Done do begin
        if (Rec.Name <> '.') and (Rec.Name <> '..') then begin
          AFileName := Path + Rec.Name;
          Ext := ExtractFileExt(AFileName).ToLower;
          if not ((Rec.Attr and faDirectory) = faDirectory) then begin
            Result.Add(AFileName)
          end
          else begin
            GetAllSubFoldersInner(AFileName);
          end;
        end;
        Done := FindNext(Rec) <> 0;
      end;
      FindClose(Rec);
    end;
  end;

begin
  Result := TStringList.Create;
  Result.BeginUpdate;
  Result.Sorted := True;
  Result.Duplicates := dupIgnore; // don't add duplicate filenames to the list
  GetAllSubFoldersInner(sPath);
  Result.EndUpdate;
end;
procedure TMain.Button1Click(Sender: TObject);
var
  T1,
  T2: Integer;
  TL: TStringList;
begin
  T1 := GetTickCount;
  TL := GetAllSubfolders('D:\');
  try
    Memo1.Lines.BeginUpdate;
    try
      Memo1.Lines.Text := TL.Text;
    finally
      Memo1.Lines.EndUpdate;
    end;
    T2 := GetTickCount;
    Caption := Format('GetAll: %d, Load: %d, Files: %d', [T2 - T1, GetTickCount - T2, TL.Count]);
  finally
    TL.Free;
  end;
end;
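And if you do move the scan off the main thread, a minimal sketch of how the call above could be wrapped (assuming Delphi XE or later for anonymous threads; Button2Click is just an illustrative name):

procedure TMain.Button2Click(Sender: TObject);
begin
  TThread.CreateAnonymousThread(
    procedure
    var
      TL: TStringList;
    begin
      TL := GetAllSubfolders('D:\'); // runs in the background, GUI stays responsive
      try
        TThread.Synchronize(nil,
          procedure
          begin
            Memo1.Lines.Text := TL.Text; // touch VCL controls only in the main thread
          end);
      finally
        TL.Free;
      end;
    end).Start;
end;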