Well, I'm using VirtualStringTree to create a kind of process manager...
I've run into trouble because I update the tree with a timer set to 1000 ms; CPU usage is too high, since my application retrieves a lot of data (filling about 20 columns).
So I wonder how one would build a kind of cache system so that I update the tree only when something has changed, which I guess is the key to cutting my application's CPU usage considerably?
Snip:
type
  TProcessNodeType = (ntParent, ntDummy);

  PProcessData = ^TProcessData;
  TProcessData = record
    pProcessName: String;
    pProcessID,
    pPrivMemory,
    pWorkingSet,
    pPeakWorkingSet,
    pVirtualSize,
    pPeakVirtualSize,
    pPageFileUsage,
    pPeakPageFileUsage,
    pPageFaults: Cardinal;
    pCpuUsageStr: string;
    pIOTotal: Cardinal;
    ...
  end;
When my application starts I fill the tree with all running processes.
Remember this is called only once; later, while the application runs, I get notified of new and terminated processes via WMI, so I don't need to call the following procedure from the timer to update the tree...
procedure FillTree;
var
  NodeData: PProcessData;
  Node: PVirtualNode;
  ParentNode: PVirtualNode;
  ChildNode: PVirtualNode;
  Process: TProcessItem;
  I: Integer;
begin
  ProcessTree.BeginUpdate;
  for I := 0 to FRunningProcesses.Count - 1 do
  begin
    Process := FRunningProcesses[I];
    Node := ProcessTree.AddChild(nil);          // create the node for this process
    NodeData := ProcessTree.GetNodeData(Node);  // and fetch its data record
    NodeData^.pProcessID := Process.ProcessID;
    NodeData^.pProcessName := Process.ProcessName;
    ...
I have a class which retrieves all the data I want and stores it for the tree, like:
var
FRunningProcesses: TProcessRunningProcesses;
So if I want to enumerate all running processes I just call:
// clears all data inside the class and refills everything with the new data...
FRunningProcesses.UpdateProcesses;
The problem starts here: I enumerate everything, not just the data that has changed, which is quite CPU intensive:
procedure TMainForm.UpdateTimerTimer(Sender: TObject);
var
  NodeData: PProcessData;
  Node: PVirtualNode;
  Process: TProcessItem;
  I: Integer;
begin
  for I := 0 to FRunningProcesses.Count - 1 do
  begin
    Application.ProcessMessages;
    Process := FRunningProcesses[I];
    // returns PVirtualNode if the node is found inside the tree
    Node := FindNodeByPID(Process.ProcessID);
    if not Assigned(Node) then
      Continue; // skip this process rather than aborting the whole update
    NodeData := ProcessVst.GetNodeData(Node);
    if not Assigned(NodeData) then
      Continue;
    // now start updating the tree
    // NodeData^.pWorkingSet := Process.WorkingSet;
....
Basically the timer is only needed for CPU usage plus all the memory information I can retrieve from a process, like:
Private Memory
Working Set
Peak Working Set
Virtual Size
PageFile Usage
Peak PageFile Usage
Page Faults
Cpu Usage
Thread Count
Handle Count
GDI Handle Count
User Handle Count
Total Cpu Time
User Cpu Time
Kernel Cpu Time
So I think the above data must be cached and compared somehow to detect whether it changed; I just wonder how, and what would be most efficient?
You only need to update the data in nodes which are currently visible.
You can use vst.GetFirstVisible and vst.GetNextVisible to iterate through these nodes.
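A minimal sketch of that loop, assuming the tree is called ProcessVst and an UpdateNodeData helper (both names are placeholders) that re-queries a single process:
var
  Node: PVirtualNode;
  NodeData: PProcessData;
begin
  Node := ProcessVst.GetFirstVisible;
  while Assigned(Node) do
  begin
    NodeData := ProcessVst.GetNodeData(Node);
    if Assigned(NodeData) then
      UpdateNodeData(NodeData); // hypothetical helper: refresh just this row's counters
    Node := ProcessVst.GetNextVisible(Node);
  end;
  ProcessVst.Invalidate; // repaint with the fresh values
end;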
The second way is also easy:
use objects instead of the record (a sketch of the getter idea follows below);
use getters for the different values.
Those getters query the processes for the values.
Maybe you need a limit here: refresh the data only every second.
Then you only need to put the VST into an invalidated state every second:
vst.Invalidate
This forces the VST to repaint the visible area.
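A minimal sketch of such a getter, assuming hypothetical TProcessNode / QueryWorkingSet names and a once-per-second refresh limit:
uses
  System.DateUtils;

type
  TProcessNode = class
  private
    FProcessID: Cardinal;
    FWorkingSet: Cardinal;
    FLastRefresh: TDateTime;
    function GetWorkingSet: Cardinal;
  public
    property WorkingSet: Cardinal read GetWorkingSet;
  end;

function TProcessNode.GetWorkingSet: Cardinal;
begin
  // re-query the process at most once per second; otherwise serve the cached value
  if MilliSecondsBetween(Now, FLastRefresh) >= 1000 then
  begin
    FWorkingSet := QueryWorkingSet(FProcessID); // hypothetical API wrapper
    FLastRefresh := Now;
  end;
  Result := FWorkingSet;
end;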
But all this works only if your data is not sorted by any changing values.
If such sorting is necessary, you need to update all records, and that is your bottleneck, I think.
Remember that COM and WMI are much slower than the pure API.
Avoid (slow) loops and use a profiler to find the slow parts.
I'd recommend having your VT's node data point directly to TProcessItem.
Pros:
Get rid of FindNodeByPID. Just update all the items in
FRunningProcesses and then call VT.Refresh. When a process is
terminated, delete the corresponding item from FRunningProcesses.
Currently you have a quite expensive search in FindNodeByPID, where
you loop through all VT nodes, retrieve their data and check the
PID.
Get rid of Process := FRunningProcesses[I], where you make an
unnecessary copy of the whole TProcessData record (btw, that
should be fixed anyway; use pointers instead).
Get rid of the whole // now starting updating the tree block.
In general, this change removes excess entities, which is very good for application maintenance and debugging.
Cons:
You'll have to keep VT and FRunningProcesses in sync. But that's quite trivial (see the sketch below).
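A minimal sketch of the idea, assuming TProcessItem is a class and the tree is named ProcessVst; each node's data slot holds just the object reference:
type
  PProcessItemRef = ^TProcessItem;

procedure TMainForm.FillTree;
var
  I: Integer;
  Node: PVirtualNode;
begin
  ProcessVst.NodeDataSize := SizeOf(TProcessItem); // size of one object reference
  ProcessVst.BeginUpdate;
  try
    ProcessVst.Clear;
    for I := 0 to FRunningProcesses.Count - 1 do
    begin
      Node := ProcessVst.AddChild(nil);
      // store the reference itself instead of copying the data
      PProcessItemRef(ProcessVst.GetNodeData(Node))^ := FRunningProcesses[I];
    end;
  finally
    ProcessVst.EndUpdate;
  end;
end;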
Related
In Delphi 10.4, I have a record that I use like this in a TList (System.Generics.Collections):
uses
System.Generics.Collections;
type
TSomething = record
Name: String;
Foo: String;
Bar: String;
Group: String;
Flag1: Boolean;
Flag2: Boolean;
Flag3: Boolean;
Flag4: Boolean;
Flag5: Boolean;
end;
PTSomething = ^TSomething;
//Simplified code for readability...
var
  Something: TSomething;
  MyList: TList<TSomething>;
  lRecP: PTSomething;
  i: Integer;
begin
  MyList := TList<TSomething>.Create;
  while ACondition do
  begin
    Something.Name := 'Something';
    //Fill the rest of the record
    MyList.Add(Something); //Add is done in a while loop which results in around 1000 items in MyList
  end;
  //Later...
  for i := 0 to MyList.Count - 1 do
  begin
    if ACondition then
    begin
      lRecP := @MyList.List[i]; //address of the record stored in the list, not a copy
      lRecP.Name := 'Modified'; //Items can be modified but never deleted
    end;
  end;
  //Later...
  MyList.Free;
Is my code prone to memory fragmentation? I have about 1000 records in my list that I will iterate through and maybe modify a string off the record once per record.
Would there be a better way to do what I want to do?
The records lie in an intrinsic dynamic array of TSomething. This array is reallocated when you add new records and expansion is required. The memory manager takes care of allocation and deallocation, and it tries to minimise fragmentation. For a list of 1000 items, fragmentation should be negligible.
The dynamic array's capacity changes rarely, to avoid expensive reallocations and to diminish fragmentation (more info in SilverWarior's comment).
Your records contain strings. Strings are really pointers, and the string bodies live elsewhere in memory. Again, the memory manager takes care of string allocation/deallocation, and it does this work well (applications that constantly create, process and free millions of strings run 24/7 for years).
So frequent changing of strings does not affect the list body (the intrinsic array) at all (until you add new records or delete existing ones), and is unlikely to cause memory fragmentation.
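A small illustration of the "strings are really pointers" point; assigning a string shares one body until the copy is modified:
var
  A, B: string;
begin
  A := Copy('Hello world', 1, 5);  // heap-allocated string body
  B := A;                          // no copy: B references the same body
  Assert(Pointer(A) = Pointer(B)); // both point at the same memory
  B := B + '!';                    // modifying B gives it its own body
  Assert(Pointer(A) <> Pointer(B));
end;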
This is not an answer to your code; I don't know whether fragmentation happens there. In my experience it depends on other things happening in parallel or over time. If your application has issues like "EOutOfMemory" after running for several days, then it's time to look at it.
I would suggest having a look at FastMM. Its FastMMUsageTracker gives you a memory fragmentation map over FastMM.
For me it was a big help. I had problems in a service, but I can't remember where I read about memory exhaustion - in FastMM or madExcept? Sorry, I can't remember. It was an article explaining why fragmentation happens over time.
I need to add many items (more than 10k) to a TComboBox (I know that a TComboBox is not supposed to hold many items, but it's not up to me to change this) without adding duplicates.
So I need to search the full list before adding. I want to avoid TComboBox.Items.IndexOf, as I need a binary search, but binary find is not available in TStrings.
So I created a temporary TStringList, set Sorted to True and used Find. But now assigning the temporary TStringList back to TComboBox.Items
(myCB.Items.AddStrings(myList))
is really slow, as it copies the whole list. Is there any way to move the list instead of copying it? Or any other way to populate my TComboBox efficiently?
There is no way to "move" the list into the combo box because the combo box's storage belongs to the internal Windows control implementation. It doesn't know any way to directly consume your Delphi TStringList object. All it offers is a command to add one item to the list, which TComboBox then uses to copy each item from the string list into the system control, one by one. The only way to avoid copying the many thousands of items into the combo box is to avoid the issue entirely, such as by using a different kind of control or by reducing the number of items you need to add.
A list view has a "virtual" mode where you only tell it how many items it should have, and then it calls back to your program when it needs to know details about what's visible on the screen. Items that aren't visible don't occupy any space in the list view's implementation, so you avoid the copying. However, system combo boxes don't have a "virtual" mode. You might be able to find some third-party control that offers that ability.
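For reference, a minimal sketch of the VCL list view's virtual mode (FItems is a placeholder for your own storage):
procedure TForm1.FormCreate(Sender: TObject);
begin
  ListView1.OwnerData := True;            // virtual mode: the control stores no items
  ListView1.Items.Count := FItems.Count;  // just tell it how many rows exist
end;

// the OnData event is called back for each row that needs to be shown
procedure TForm1.ListView1Data(Sender: TObject; Item: TListItem);
begin
  Item.Caption := FItems[Item.Index];
end;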
Reducing the number of items you need to put in the combo box is your next best option, but only you and your colleagues have the domain knowledge necessary to figure out the best way to do that.
As Rudy Velthuis already mentioned in the comments, and assuming you are using the VCL, the CB_INITSTORAGE message could be an option:
SendMessage(myCB.Handle, CB_INITSTORAGE, myList.Count, 20 * myList.Count * SizeOf(Integer));
where 20 is your average string length.
Results (on an i5-7200U, with 20K items of random length between 1 and 50 chars):
without CB_INITSTORAGE: ~265 ms
with CB_INITSTORAGE: ~215 ms
So while you can speed things up a little by preallocating the memory, the bigger issue seems to be the bad user experience: how can a user find the right element in a combo box with that many items?
Notwithstanding that 10k items is crazy to keep in a TComboBox, an efficient strategy here would be to keep a cache in a separate object. For example, declare :
{ use a TDictionary just for storing a hashmap }
FComboStringsDict : TDictionary<string, integer>;
where
// assumes System.Generics.Collections, System.Diagnostics (TStopwatch) and System.Math (Floor) in the uses clause
procedure TForm1.FormCreate(Sender: TObject);
var
i : integer;
spw : TStopwatch;
begin
FComboStringsDict := TDictionary<string, integer>.Create;
spw := TStopwatch.StartNew;
{ add 10k random items }
for i := 1 to 10000 do begin
AddComboStringIfNotDuplicate(IntToStr(Floor(20000*Random)));
end;
spw.Stop;
ListBox1.Items.Add(IntToStr(spw.ElapsedMilliseconds));
end;
function TForm1.AddComboStringIfNotDuplicate(AEntry: string) : boolean;
begin
result := false;
if not FComboStringsDict.ContainsKey(AEntry) then begin
FComboStringsDict.Add(AEntry, 0);
ComboBox1.Items.Add(AEntry);
result := true;
end;
end;
Adding 10k items initially takes about 0.5s this way.
{ test adding new items }
procedure TForm1.Button1Click(Sender: TObject);
var
spw : TStopwatch;
begin
spw := TStopwatch.StartNew;
if not AddComboStringIfNotDuplicate(IntToStr(Floor(20000*Random))) then
ListBox1.Items.Add('Did not add duplicate');
spw.Stop;
ListBox1.Items.Add(IntToStr(spw.ElapsedMilliseconds));
end;
But adding each subsequent item is very fast, <1 ms. This is a clumsy implementation, but you could easily wrap this behaviour in a custom class. The idea is to keep your data model as separate from the visual component as possible: keep them in sync when adding or removing items, but do your heavy searches on the dictionary, where lookup is fast. Removing items would still rely on IndexOf.
I have a ClientDataSet in RAM (no database) that maintains a list of active nodes in a network.
Nodes continuously report back confirming they are alive, thus keeping the dataset updated.
The dataset is displayed in a DBGrid.
When a node stops reporting its status, it is deleted from the dataset after a few seconds of inactivity.
I do this by updating a timeout field whenever a record is updated.
Every second I iterate through the dataset deleting outdated records.
This works, but the grid sometimes flickers when OnDrawColumnCell refreshes a single grid line to customise the column colours. I call DisableControls/EnableControls, but there seems to be a small delay until OnDrawColumnCell redraws the grid, causing the flicker.
If I disable the iteration that deletes the outdated records, the flicker stops.
Is there a better way to do this?
A way to minimise the flicker in your grid is to use a 'trick' which makes use of a special feature of ClientDataSets, namely that you can copy data between them by assigning their Data properties, as in
cdsDestination.Data := cdsSource.Data;
So what you can do is to have two CDSs, one which you use for display purposes only, and the other which processes your network nodes. This means that changes to the copy CDS are kept to the absolute minimum, and you can do pretty much whatever you like with your source CDS, and take as long as you like about it (as long, of course, as you can get it done before the next destination CDS update). Something like this:
const
NodeCount = 1000;
procedure TForm1.DoDataUpdate;
begin
// do something to CDS1's data here
cdsCopy.Data := CDS1.Data;
end;
procedure TForm1.FormCreate(Sender: TObject);
var
i : Integer;
begin
CDS1.CreateDataSet;
for i := 1 to NodeCount do
CDS1.InsertRecord([i, Now]);
CDS1.First;
DBGrid1.DataSource := DataSource1;
DataSource1.DataSet := cdsCopy;
end;
procedure TForm1.Timer1Timer(Sender: TObject);
begin
DoDataUpdate;
end;
Take for example the following code:
for i := (myStringList.Count - 1) DownTo 0 do begin
dataList := SplitString(myStringList[i], #9);
x := StrToFloat(dataList[0]);
y := StrToFloat(dataList[1]);
z := StrToFloat(dataList[2]);
//Do something with these variables
myOutputRecordArray[i] := {SomeFunctionOf}(x, y, z);
//Free Used List Item
myStringList.Delete(i);
end;
//Free Memory
myStringList.Free;
How would you parallelise this using, for example, the OmniThreadLibrary? Is it possible? Or does it need to be restructured?
I'm calling myStringList.Delete(i); at each iteration as the StringList is large and freeing items after use at each iteration is important to minimise memory usage.
Simple answer: You wouldn't.
More involved answer: The last thing you want to do in a parallelized operation is modify shared state, such as this delete call. Since it's not guaranteed that each individual task will finish "in order"--and in fact it's highly likely that they won't at least once, with that probability approaching 100% very quickly the more tasks you add to the total workload--trying to do something like that is playing with fire.
You can either destroy the items as you go and do it serialized, or do it in parallel, finish faster, and destroy the whole list. But I don't think there's any way to have it both ways.
You can cheat. Setting the string value to an empty string will free most of the memory and will be thread safe. At the end of the processing you can then clear the list.
Parallel.ForEach(0, myStringList.Count - 1).Execute(
procedure (const index: integer)
var
dataList: TStringDynArray;
x, y, z: Single;
begin
dataList := SplitString(myStringList[index], #9);
x := StrToFloat(dataList[0]);
y := StrToFloat(dataList[1]);
z := StrToFloat(dataList[2]);
//Do something with these variables
myOutputRecordArray[index] := {SomeFunctionOf}(x,y,z);
//Free Used List Item
myStringList[index] := '';
end);
myStringList.Clear;
This code is safe because we are never writing to a shared object from multiple threads. You need to make sure that all of the variables you use that would normally be local are declared in the threaded block.
I'm not going to attempt to show how to do what you originally asked because it is a bad idea that will not lead to improved performance. Not even assuming that you deal with the many and various data races in your proposed parallel implementation.
The bottleneck here is the disk I/O. Reading the entire file into memory, and then processing the contents is the design choice that is leading to your memory problems. The correct way to solve this problem is to use a pipeline.
Step 1 of the pipeline takes as input the file on disk. The code here reads chunks of the file and then breaks those chunks into lines. These lines are the output of this step. The entire file is never in memory at one time. You'll have to tune the size of the chunks that you read.
Step 2 takes as input the strings the step 1 produced. Step 2 consumes those strings and produces vectors. Those vectors are added to your vector list.
Step 2 will be faster than step 1 because I/O is so expensive. Therefore there's nothing to be gained by trying to optimise either of the steps with parallel algorithms. Even on a uniprocessor machine this pipelined implementation could be faster than a non-pipelined one.
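A rough sketch of that two-stage pipeline using OTL's Parallel.Pipeline; FileName, ParseVector and AddToVectorList are hypothetical, and AddToVectorList must be safe to call from the stage's worker thread:
uses
  System.Classes, System.SysUtils, OtlCommon, OtlCollections, OtlParallel;

procedure ProcessFilePipelined(const FileName: string);
var
  Pipeline: IOmniPipeline;
begin
  Pipeline := Parallel.Pipeline
    .Stage(
      procedure (const input, output: IOmniBlockingCollection)
      var
        Reader: TStreamReader;
      begin
        // stage 1: stream the file line by line; the whole file is
        // never held in memory at once
        Reader := TStreamReader.Create(FileName);
        try
          while not Reader.EndOfStream do
            output.Add(Reader.ReadLine);
        finally
          Reader.Free;
        end;
      end)
    .Stage(
      procedure (const input, output: IOmniBlockingCollection)
      var
        Value: TOmniValue;
      begin
        // stage 2: consume lines as they arrive and turn them into vectors
        for Value in input do
          AddToVectorList(ParseVector(Value.AsString));
      end)
    .Run;
  Pipeline.WaitFor(INFINITE);
end;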
My application builds many objects in memory based on filenames (among other string-based information). I was hoping to optimise memory usage by storing the path and filename separately, and then sharing the path between objects in the same path. I wasn't trying to build a string pool or anything; basically my objects are sorted, so if I have 10 objects with the same path I want objects 2-10 to have their path "pointed" at object 1's path (e.g. object[2].Path := object[1].Path).
I have a problem though: I don't believe that my objects are in fact sharing a reference to the same string after I think I have told them to (by the object[2].Path := object[1].Path assignment).
When I do an experiment with a string list and set all the values to point to the first value in the list, I can see the "memory conservation" in action; but when I use objects I see absolutely no change at all. Admittedly I am only using Task Manager (private working set) to watch for memory use changes.
Here's a contrived example, I hope this makes sense.
I have an object:
TfileObject = class(TObject)
  FpathPart: string;
  FfilePart: string;
end;
Now I create 1,000,000 instances of the object, using a new string for each one:
var
  x: integer;
  MyFilePath: string;
  fo: TfileObject;
begin
  for x := 1 to 1000000 do
  begin
    // create a new string for every iteration of the loop
    MyFilePath := ExtractFilePath(Application.ExeName);
    fo := TfileObject.Create;
    fo.FpathPart := MyFilePath;
    FobjectList.Add(fo);
  end;
end;
Run this and Task Manager says I am using 68 MB of memory or so. (Note that if I assigned MyFilePath outside of the loop then I would save memory, because there would be one instance of the string; but this is a contrived example and not actually how it would happen in the app.)
Now I want to "optimise" my memory usage by making all objects share the same instance of the path string, since it's the same value:
var
  x: integer;
begin
  for x := 1 to FobjectList.Count - 1 do
  begin
    TfileObject(FobjectList[x]).FpathPart := TfileObject(FobjectList[0]).FpathPart;
  end;
end;
Task Manager shows absolutely no change.
However if I do something similar with a TStringList:
var
  x: integer;
begin
  for x := 1 to 1000000 do
  begin
    FstringList.Add(ExtractFilePath(Application.ExeName));
  end;
end;
Task Manager says 60MB memory use.
Now optimise with:
var
  x: integer;
begin
  for x := 1 to FstringList.Count - 1 do
    FstringList[x] := FstringList[0];
end;
Task Manager shows the drop in memory usage that I would expect, now 10MB.
So I seem to be able to share strings in a string list, but not in objects. I am obviously missing something conceptually, in code or both!
I hope this makes sense, I can really see the ability to conserve memory using this technique as I have a lot of objects all with lots of string information, that data is sorted in many different ways and I would like to be able to iterate over this data once it is loaded into memory and free some of that memory back up again by sharing strings in this way.
Thanks in advance for any assistance you can offer.
PS: I am using Delphi 2007 but I have just tested on Delphi 2010 and the results are the same, except that Delphi 2010 uses twice as much memory due to unicode strings...
When your Delphi program allocates and deallocates memory it does this not by using Windows API functions directly, but it goes through the memory manager. What you are observing here is the fact that the memory manager does not release all allocated memory back to the OS when it's no longer needed in your program. It will keep some or all of it allocated for later, to speed up later memory requests in the application. So if you use the system tools the memory will be listed as allocated by the program, but it is not in active use, it is marked as available internally and is stored in lists of usable memory blocks which the MM will use for any further memory allocations in your program, before it goes to the OS and requests more memory.
If you want to really check how any changes to your programs affect the memory consumption you should not rely on external tools, but should use the diagnostics the memory manager provides. Download the full FastMM4 version and use it in your program by putting it as the first unit in the DPR file. You can get detailed information by using the GetMemoryManagerState() function, which will tell you how much small, medium and large memory blocks are used and how much memory is allocated for each block size. For a quick check however (which will be completely sufficient here) you can simply call the GetMemoryManagerUsageSummary() function. It will tell you the total allocated memory, and if you call it you will see that your reassignment of FPathPart does indeed free several MB of memory.
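A minimal sketch of such a check, assuming the full FastMM4 unit is in the project (the record and routine below are FastMM4's):
uses
  Winapi.Windows, System.SysUtils, FastMM4;

procedure ReportMemoryUse(const Stage: string);
var
  Summary: TMemoryManagerUsageSummary;
begin
  GetMemoryManagerUsageSummary(Summary);
  OutputDebugString(PChar(Format('%s: %d bytes allocated',
    [Stage, Int64(Summary.AllocatedBytes)])));
end;

// e.g. call ReportMemoryUse('before') and ReportMemoryUse('after')
// around the FpathPart reassignment loop to see the difference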
You will observe different behaviour when a TStringList is used, and all strings are added sequentially. Memory for these strings will be allocated from larger blocks, and those blocks will contain nothing else, so they can be released again when the string list elements are freed. If OTOH you create your objects, then the strings will be allocated alternating with other data elements, so freeing them will create empty memory regions in the larger blocks, but the blocks won't be released as they contain still valid memory for other things. You have basically increased memory fragmentation, which could be a problem in itself.
As noted by another answer, memory that is not being used is not always immediately released to the system by the Delphi Memory Manager.
Your code guarantees a large quantity of such memory by dynamically growing the object list.
A TObjectList (in common with a TList and a TStringList) uses an incremental memory allocator. A new instance of one of these containers starts with memory allocated for 4 items (the Capacity). When the number of items added exceeds the Capacity additional memory is allocated, initially by doubling the capacity and then once a certain number of items has been reached, by increasing the capacity by 25%.
Each time the Count exceeds the Capacity, additional memory is allocated, the current memory copied to the new memory and the previously used memory released (it is this memory which is not immediately returned to the system).
When you know how many items are to be loaded into one of these types of list you can avoid this memory re-allocation behaviour (and achieve a significant performance improvement) by pre-allocating the Capacity of the list accordingly.
You do not necessarily have to set the precise capacity needed - a best guess (one that is likely to be near, or higher than, the actual figure required) is still going to be better than the initial, default capacity of 4 if the number of items is significantly > 64.
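A minimal sketch of that pre-allocation, using the contrived example's numbers:
FobjectList := TObjectList.Create;
FobjectList.Capacity := 1000000; // best guess at the final count, set once up front
for x := 1 to 1000000 do
begin
  fo := TfileObject.Create;
  fo.FpathPart := MyFilePath;
  FobjectList.Add(fo); // no grow-copy-release cycles on the way up
end;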
Because task manager does not tell you the whole truth. Compare with this code:
var
  x: integer;
  MyFilePath: string;
  fo: TfileObject;
begin
  MyFilePath := ExtractFilePath(Application.ExeName);
  for x := 1 to 1000000 do
  begin
    fo := TfileObject.Create;
    fo.FpathPart := MyFilePath;
    FobjectList.Add(fo);
  end;
end;
To share a reference, strings need to be assigned directly and be of the same type (Obviously, you can't share a reference between UnicodeString and AnsiString).
The best way I can think of to achieve what you want is as follow:
var
  StrReference: TStringList; // Sorted

function GetStrReference(const S: string): string;
var
  idx: Integer;
begin
  if not StrReference.Find(S, idx) then
    idx := StrReference.Add(S);
  Result := StrReference[idx];
end;
procedure YourProc;
var
  x: integer;
  MyFilePath: string;
  fo: TfileObject;
begin
  for x := 1 to 1000000 do
  begin
    // create a new string for every iteration of the loop
    MyFilePath := GetStrReference(ExtractFilePath(Application.ExeName));
    fo := TfileObject.Create;
    fo.FpathPart := MyFilePath;
    FobjectList.Add(fo);
  end;
end;
To make sure it has worked correctly, you can call the StringRefCount function (from the System unit). I don't know which version of Delphi it was introduced in, so here's the current implementation:
function StringRefCount(const S: UnicodeString): Longint;
begin
  Result := Longint(S);
  if Result <> 0 then
    Result := PLongint(Result - 8)^;
end;
Let me know if it worked as you wanted.
EDIT: If you are afraid of the string list growing too big, you can safely scan it periodically and delete from the list any string that is no longer referenced elsewhere (see the sketch below).
The list could be wiped clean too, but that would make the function allocate a new copy of any new string passed to it.
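A sketch of that periodic clean-up. Note that the list getter hands back a temporary copy which itself holds one reference while StringRefCount runs, so a string held only by the pool shows a count of 2 here:
procedure PurgeUnusedReferences;
var
  i: Integer;
begin
  for i := StrReference.Count - 1 downto 0 do
    // pool's copy + the getter's temporary = 2 => no object uses it any more
    if StringRefCount(StrReference[i]) <= 2 then
      StrReference.Delete(i);
end;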