I would like to process (save) the virtual memory blocks allocated to the current process. Here is the code I am using:
program Project38;
{$APPTYPE CONSOLE}
{$R *.res}
uses
System.SysUtils,
Winapi.Windows;
procedure DoProcess(aStart: Pointer; aSize: Cardinal);
begin
// process it
end;
procedure ProcessVirtualMemory;
var
addr: Pointer;
i: Integer;
p: Pointer;
systemInfo: SYSTEM_INFO;
startAddress, stopAddress: Pointer;
size: size_t;
memInfo: MEMORY_BASIC_INFORMATION;
begin
GetSystemInfo(systemInfo);
startAddress := systemInfo.lpMinimumApplicationAddress;
stopAddress := systemInfo.lpMaximumApplicationAddress;
addr := startAddress;
while NativeUInt(addr) < NativeUInt(stopAddress) do begin
size:= VirtualQuery(Pointer(addr), memInfo, SizeOf(MEMORY_BASIC_INFORMATION));
if (size = SizeOf(MEMORY_BASIC_INFORMATION)) and
(memInfo.State = MEM_COMMIT) and
(memInfo.Type_9 = MEM_PRIVATE) and
(memInfo.RegionSize > 0) and
(memInfo.Protect = PAGE_READWRITE) then
begin
DoProcess(memInfo.BaseAddress, memInfo.RegionSize);
addr := Pointer(NativeUInt(addr) + memInfo.RegionSize);
end
else
addr := Pointer(NativeUInt(addr) + systemInfo.dwPageSize);
end;
end;
begin
ProcessVirtualMemory;
end.
This code runs as part of a huge application, and just collecting this information, without any processing, takes 10-12 seconds. Is there a faster way of getting the addresses of the virtual memory blocks?
Your program does contain a mistake. In case the memory block does not match your search criteria, you only increment by the page size rather than the region size. Your loop should really look like this:
while NativeUInt(addr) < NativeUInt(stopAddress) do begin
size := VirtualQuery(Pointer(addr), memInfo, SizeOf(MEMORY_BASIC_INFORMATION));
if size = 0 then begin
// handle error
break
end;
if (size = SizeOf(MEMORY_BASIC_INFORMATION)) and
(memInfo.State = MEM_COMMIT) and
(memInfo.Type_9 = MEM_PRIVATE) and
(memInfo.RegionSize > 0) and
(memInfo.Protect = PAGE_READWRITE) then begin
DoProcess(memInfo.BaseAddress, memInfo.RegionSize);
end;
addr := Pointer(NativeUInt(addr) + memInfo.RegionSize);
end;
The problem with your version is that when there are gaps in the virtual address space, your code walks over them one page at a time, rather than skipping the entire region.
Even without that change, I don't believe that enumeration was the bottleneck. I made the following alteration to your original program, to report the time taken:
var
Stopwatch: TStopwatch;
begin
Stopwatch := TStopwatch.StartNew;
ProcessVirtualMemory;
Writeln(Stopwatch.ElapsedMilliseconds);
Readln;
end.
On my machine, for a 32 bit release build, this reported around 500ms.
Then I allocated some memory:
var
i: Integer;
Stopwatch: TStopwatch;
begin
for i := 1 to 100000 do begin
HeapAlloc(GetProcessHeap, 0, Random(100000));
end;
Stopwatch := TStopwatch.StartNew;
ProcessVirtualMemory;
Writeln(Stopwatch.ElapsedMilliseconds);
Readln;
end.
No matter what I did, playing around with the constants, the program still reports around 500ms. If you fix the non-matching increment code as described at the start of this answer then the times come down to around 100ms.
Of course, for a 64 bit process, it's a little different. The defect in your code means that the program effectively gets stuck walking the huge 64 bit address space one page at a time, calling VirtualQuery for each page in the address space. I never even waited for that process to finish.
My conclusion therefore is that the main bottleneck in your program is not the code that you present, the code that finds the virtual memory blocks. Rather it is the code inside DoProcess, the code that you have and we don't. So even when you fix the defect I described at the start, you will still spend significant time in that function. You should expect the virtual memory space enumeration itself to take in the region of 100ms.
Related
I use Delphi 10.1 Berlin in Windows 10.
I have two records of different sizes. I wrote code to loop through two TList<T> of these records to test elapsed times. Looping through the list of the larger record runs much slower.
Can anyone explain the reason, and provide a solution to make the loop run faster?
type
tTestRecord1 = record
Field1: array[0..4] of Integer;
Field2: array[0..4] of Extended;
Field3: string;
end;
tTestRecord2 = record
Field1: array[0..4999] of Integer;
Field2: array[0..4999] of Extended;
Field3: string;
end;
procedure TForm1.Button1Click(Sender: TObject);
var
_List: TList<tTestRecord1>;
_Record: tTestRecord1;
_Time: TTime;
i: Integer;
begin
_List := TList<tTestRecord1>.Create;
for i := 0 to 4999 do
begin
_List.Add(_Record);
end;
_Time := Time;
for i := 0 to 4999 do
begin
if _List[i].Field3 = 'abcde' then
begin
Break;
end;
end;
Button1.Caption := FormatDateTime('s.zzz', Time - _Time); // 0.000
_List.Free;
end;
procedure TForm1.Button2Click(Sender: TObject);
var
_List: TList<tTestRecord2>;
_Record: tTestRecord2;
_Time: TTime;
i: Integer;
begin
_List := TList<tTestRecord2>.Create;
for i := 0 to 4999 do
begin
_List.Add(_Record);
end;
_Time := Time;
for i := 0 to 4999 do
begin
if _List[i].Field3 = 'abcde' then
begin
Break;
end;
end;
Button2.Caption := FormatDateTime('s.zzz', Time - _Time); // 0.045
_List.Free;
end;
First of all, I want to consider the entire code, even the code that populates the list, which I do realise you have not timed. Because the second record is larger, more memory needs to be copied when you make an assignment of that record type. Further, when you read from the list the larger record is less cache friendly than the smaller record, which impacts performance. This latter effect is likely less significant than the former.
Related to this is that as you add items the list's internal array of records has to be resized. Sometimes the resizing leads to a reallocation that cannot be performed in-place. When that happens a new block of memory is allocated and the previous content is copied to this new block. That copy is clearly more expensive for the larger record. You can mitigate this by allocating the array once up front if you know its length. The list Capacity property is the mechanism to use. Of course, you will not always know the length ahead of time.
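For illustration, a minimal sketch of that mitigation, using the 5000-item count from your test (the only change is the Capacity assignment):

_List := TList<tTestRecord2>.Create;
_List.Capacity := 5000; // one up-front allocation; Add never needs to reallocate and copy
for i := 0 to 4999 do
  _List.Add(_Record);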
Your program does very little beyond memory allocation and memory access. Hence the performance of these memory operations dominates.
Now, your timing is only of the code that reads from the lists. So the memory copy performance difference on population is not part of the benchmarking that you performed. Your timing differences are mainly down to excessive memory copy when reading, as I will explain below.
Consider this code:
if _List[i].Field3 = 'abcde' then
Because _List[i] is a record, a value type, the entire record is copied to an implicit hidden local variable. The code is actually equivalent to:
var
tmp: tTestRecord2;
...
tmp := _List[i]; // copy of entire record
if tmp.Field3 = 'abcde' then
There are a few ways to avoid this copy:
Change the underlying type to be a reference type. This changes the memory management requirements. And you may have good reason to want to use a value type.
Use a container class that can return the address of an item rather than a copy of an item.
Switch from TList<T> to dynamic array TArray<T>. That simple change will allow the compiler to access individual fields directly without copying entire records.
Use the TList<T>.List to obtain access to the list object's underlying array holding the data. That would have the same effect as the previous item.
Item 4 above is the simplest change you could make to see a large difference. You would replace
if _List[i].Field3 = 'abcde' then
with
if _List.List[i].Field3 = 'abcde' then
and that should yield a very significant change in performance.
Consider this program:
{$APPTYPE CONSOLE}
uses
System.Diagnostics,
System.Generics.Collections;
type
tTestRecord2 = record
Field1: array[0..4999] of Integer;
Field2: array[0..4999] of Extended;
Field3: string;
end;
procedure Main;
const
N = 100000;
var
i: Integer;
Stopwatch: TStopwatch;
List: TList<tTestRecord2>;
Rec: tTestRecord2;
begin
List := TList<tTestRecord2>.Create;
List.Capacity := N;
for i := 0 to N-1 do
begin
List.Add(Rec);
end;
Stopwatch := TStopwatch.StartNew;
for i := 0 to N-1 do
begin
if List[i].Field3 = 'abcde' then
begin
Break;
end;
end;
Writeln(Stopwatch.ElapsedMilliseconds);
end;
begin
Main;
Readln;
end.
I had to compile it for 64 bit to avoid an out of memory condition. The output on my machine is around 700. Change List[i].Field3 to List.List[i].Field3 and the output is in single figures. The timing is rather crude, but I think this demonstrates the point.
The issue of the large record not being cache friendly remains. That is more complicated to deal with and would require a detailed analysis of how the real world code operated on its data.
As an aside, if you care about performance then you won't use Extended. Because its size is 10 bytes, not a power of two, memory access is frequently mis-aligned. Use Double, or Real, which is an alias for Double.
I want to convert a string value to a global memory handle and vice versa, using the following functions I've just written.
But StrToGlobalHandle() causes my test program to hang, so GlobalHandleToStr() cannot be tested yet, and I'm also wondering whether my code is logical or not.
function StrToGlobalHandle(const aText: string): HGLOBAL;
var
ptr: PChar;
begin
Result := 0;
if aText <> '' then
begin
Result := GlobalAlloc(GMEM_MOVEABLE or GMEM_ZEROINIT, length(aText) + 1);
if Result <> 0 then
begin
ptr := GlobalLock(Result);
if Assigned(ptr) then
begin
StrCopy(ptr, PChar(aText));
GlobalUnlock(Result);
end
end;
end;
end;
function GlobalHandleToStr(const aHandle: HGLOBAL): string;
var
ptrSrc: PChar;
begin
ptrSrc := GlobalLock(aHandle);
if Assigned(ptrSrc) then
begin
SetLength(Result, Length(ptrSrc));
StrCopy(PChar(Result), ptrSrc);
GlobalUnlock(aHandle);
end
end;
Testing code:
procedure TForm3.Button1Click(Sender: TObject);
var
h: HGLOBAL;
s: string;
s2: string;
begin
s := 'this is a test string';
h := StrToGlobalHandle(s);
s2 := GlobalHandleToStr(h);
ShowMessage(s2);
GlobalFree(h);
end;
BTW, I want to use these two functions as helpers to send string values between programs - send a global handle from process A to process B, and have process B get the string using GlobalHandleToStr().
BTW 2, I know WM_COPYDATA and other IPC methods; those are not suitable in my case.
The strings in Delphi 2010 are unicode, so you are not allocating the proper buffer size.
replace this line
Result := GlobalAlloc(GMEM_MOVEABLE or GMEM_ZEROINIT, length(aText) + 1);
with this
Result := GlobalAlloc(GMEM_MOVEABLE or GMEM_ZEROINIT, (length(aText) + 1)* SizeOf(Char));
If your program hangs when you call GlobalAlloc, then you probably have heap corruption from earlier in your program. That leads to undefined behavior; the function might detect the problem and return an error, it might crash your program, it might silently corrupt yet more of your memory, it might hang, or it might do any number of other things.
That heap corruption might come from a previous call to StrToGlobalHandle because your StrCopy call writes beyond the end of the allocated memory. You're allocating bytes, but the Length function returns the number of characters in the string. That's only valid when characters are one byte wide, which isn't the case as of Delphi 2009. Multiply by SizeOf(Char) to get a byte count:
Result := GlobalAlloc(GMEM_MOVEABLE or GMEM_ZEROINIT, SizeOf(Char) * (Length(aText) + 1));
You can't send data between programs using GlobalAlloc - that only worked in 16-bit Windows. Use a memory-mapped file instead.
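As a rough illustration of that suggestion, here is a minimal sketch of passing a string through a named, page-file-backed memory-mapped file. The mapping name 'Local\DemoSharedText' and the helper names are made up for this example; the writer must keep its mapping handle open until the reader has finished:

uses
  Winapi.Windows, System.SysUtils;

// Writer side: copies aText (including its #0 terminator) into a named mapping.
function StrToMapping(const aName, aText: string): THandle;
var
  view: Pointer;
  sizeBytes: Cardinal;
begin
  sizeBytes := (Length(aText) + 1) * SizeOf(Char);
  Result := CreateFileMapping(INVALID_HANDLE_VALUE, nil, PAGE_READWRITE,
    0, sizeBytes, PChar(aName));
  if Result = 0 then
    RaiseLastOSError;
  view := MapViewOfFile(Result, FILE_MAP_WRITE, 0, 0, sizeBytes);
  if view = nil then
    RaiseLastOSError;
  try
    Move(PChar(aText)^, PChar(view)^, sizeBytes);
  finally
    UnmapViewOfFile(view);
  end;
end;

// Reader side (typically in the other process): opens the mapping by name.
function MappingToStr(const aName: string): string;
var
  mapping: THandle;
  view: Pointer;
begin
  mapping := OpenFileMapping(FILE_MAP_READ, False, PChar(aName));
  if mapping = 0 then
    RaiseLastOSError;
  try
    view := MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0);
    if view = nil then
      RaiseLastOSError;
    try
      Result := PChar(view); // the writer stored a #0-terminated buffer
    finally
      UnmapViewOfFile(view);
    end;
  finally
    CloseHandle(mapping);
  end;
end;

Process A would call StrToMapping('Local\DemoSharedText', s) and keep the returned handle alive; process B calls MappingToStr('Local\DemoSharedText'), after which both sides close their handles.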
I want to destroy (wipe) a file that the user selected from my program. I wrote this sample code:
var
aFile: TFileStream;
Const
FileAddr: String = 'H:\Akon.mp3';
Buf: Byte = 0;
begin
if FileExists(FileAddr) then
begin
// Open given file in file stream & rewrite it
aFile:= TFileStream.Create(FileAddr, fmOpenReadWrite);
try
aFile.Seek(0, soFromBeginning);
while aFile.Position <> aFile.Size do
aFile.Write(Buf, 1);
finally
aFile.Free;
ShowMessage('Finish');
end;
end;
end;
As you can see, I overwrite the given file with 0 (null) values. This code works correctly, but it is very slow on large files. I would like to do this in multithreaded code, but I tried some test code and could not get it to work. For example, I tried creating 4 threads that each do part of this work, to speed up the process.
Is there any way to speed up this process?
I don't know if it could help you, but I think you could do better (than multithreading) by writing a larger buffer to the file.
For example, you could initialize a buffer 16 KB wide and write it directly to the FileStream; you only have to handle the last part of the file specially, for which you write only part of the full buffer.
Believe me, it will be really faster...
OK, I'll bite:
const
FileAddr: String = 'H:\Akon.mp3';
var
aFile: TFileStream;
Buf: array[0..1023] of Byte;
Remaining, NumBytes: Integer;
begin
if FileExists(FileAddr) then
begin
// Open given file in file stream & rewrite it
aFile:= TFileStream.Create(FileAddr, fmOpenReadWrite);
try
FillChar(Buf, SizeOf(Buf), 0);
Remaining := aFile.Size;
while Remaining > 0 do begin
NumBytes := SizeOf(Buf);
if NumBytes > Remaining then
NumBytes := Remaining; // clamp the final chunk to what is left
aFile.WriteBuffer(Buf, NumBytes);
Dec(Remaining, NumBytes);
end;
finally
aFile.Free;
ShowMessage('Finish');
end;
end;
end;
Multiple threads won't help you here. Your constraint is disk access, primarily because you're writing only 1 byte at a time.
Declare Buf as an array of bytes, and initialise it with FillChar or ZeroMemory. Then change your while loop as follows:
while ((aFile.Position + SizeOf(Buf)) < aFile.Size) do
begin
aFile.Write(Buf, SizeOf(Buf));
end;
if (aFile.Position < aFile.Size) then
begin
aFile.Write(Buf, aFile.Size - aFile.Position);
end;
You should learn from the slowness of the above code that:
Writing one byte at a time is the slowest and worst way to do it; you incur a huge overhead and reduce your overall performance.
Even writing 1 KB (1024 bytes) at a time would be a vast improvement, but writing a larger amount will of course be even faster, until you reach a point of diminishing returns, which I would guess is going to be somewhere between a 200 KB and 500 KB write buffer. The only way to find out when it stops mattering for your application is to test, test, and test.
Checking Position against Size on every iteration is completely superfluous. If you read the size once and track your progress in a local variable, writing the correct number of bytes each time, you save yourself further overhead and improve performance; i.e. use Inc(LPosition, BufSize) to advance a local LPosition: Integer variable by the buffer size BufSize, as in the sketch below.
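Here is a minimal sketch combining those three points (the 256 KB buffer size and the helper name are arbitrary assumptions; tune the size by measuring):

// uses System.Classes, System.SysUtils
procedure ZeroFillFile(const FileName: string); // hypothetical helper name
var
  F: TFileStream;
  Buf: array[0..262143] of Byte; // 256 KB of zeroes
  Remaining: Int64;
  Chunk: Integer;
begin
  FillChar(Buf, SizeOf(Buf), 0);
  F := TFileStream.Create(FileName, fmOpenReadWrite);
  try
    Remaining := F.Size; // read the size once
    F.Position := 0;
    while Remaining > 0 do
    begin
      if Remaining >= SizeOf(Buf) then
        Chunk := SizeOf(Buf)
      else
        Chunk := Integer(Remaining); // last, partial chunk
      F.WriteBuffer(Buf, Chunk);
      Dec(Remaining, Chunk); // track progress locally; no Position/Size queries per iteration
    end;
  finally
    F.Free;
  end;
end;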
Not sure if this meets your requirements, but it works and it's fast.
var
aFile: TFileStream;
const
FileAddr: String = 'H:\Akon.mp3';
begin
if FileExists(FileAddr) then
begin
aFile:= TFileStream.Create(FileAddr, fmOpenReadWrite);
try
afile.Size := 0;
finally
aFile.Free;
ShowMessage('Finish');
end;
end;
end;
So will something along these lines (declaring b outside the function will improve your performance in the loop, especially when dealing with a large file). I assume that in the real app the filename would be a variable:
const
b: byte=0;
procedure MyProc;
var
aFile: TFileStream;
Buf: array of byte;
len: integer;
FileAddr: String;
begin
FileAddr := 'C:\testURL.txt';
if FileExists(FileAddr) then
begin
aFile := TFileStream.Create(FileAddr, fmOpenReadWrite); // fmCreate would truncate the file to zero bytes before its size is read
try
len := afile.Size;
setlength(buf, len);
fillchar(buf[0], len, b);
aFile.Position := 0;
aFile.Write(buf[0], len); // pass the first element, not the dynamic-array reference itself
finally
aFile.Free;
ShowMessage('Finish');
end;
end;
end;
In Delphi you can speed up your code by passing parameters as const, e.g.
function A(const AStr: string): integer;
//or
function B(AStr: string): integer;
Supposing both functions have the same code inside, the speed difference between them is negligible, and I doubt it can even be measured with a cycle-counter like this:
function RDTSC: comp;
var
TimeStamp: record case byte of
1: (Whole: comp);
2: (Lo, Hi: Longint);
end;
begin
asm
db $0F; db $31;
mov [TimeStamp.Lo], eax
mov [TimeStamp.Hi], edx
end;
Result := TimeStamp.Whole;
end;
The reason for this is that all const does in function A is prevent the reference count of AStr from being incremented.
But the increment only takes one cycle of one core of my multicore CPU, so...
Why should I bother with const?
If there is no other reason for the function to contain an implicit try/finally, and the function itself is not doing much work, the use of const can result in a significant speedup (I once got one function that was using >10% of total runtime in a profiling run down to <2% just by adding a const in the right place).
Also, the reference counting takes much, much more than one cycle because it has to be performed with the lock prefix for thread-safety reasons, so we are talking more like 50-100 cycles; more if something in the same cache line has been modified by another core in between.
As for not being able to measure it:
program Project;
{$APPTYPE CONSOLE}
uses
Windows,
SysUtils,
Math;
function GetThreadTime: Int64;
var
CreationTime, ExitTime, KernelTime, UserTime: TFileTime;
begin
GetThreadTimes(GetCurrentThread, CreationTime, ExitTime, KernelTime, UserTime);
Result := PInt64(@UserTime)^;
end;
function ConstLength(const s: string): Integer;
begin
Result := Length(s);
end;
function NoConstLength(s: string): Integer;
begin
Result := Length(s);
end;
var
s : string;
i : Integer;
j : Integer;
ConstTime, NoConstTime: Int64;
begin
try
// make sure we got an heap allocated string;
s := 'abc';
s := s + '123';
//make sure we minimize thread context switches during the timing
SetThreadPriority(GetCurrentThread, THREAD_PRIORITY_TIME_CRITICAL);
j := 0;
ConstTime := GetThreadTime;
for i := 0 to 100000000 do
Inc(j, ConstLength(s));
ConstTime := GetThreadTime - ConstTime;
j := 0;
NoConstTime := GetThreadTime;
for i := 0 to 100000000 do
Inc(j, NoConstLength(s));
NoConstTime := GetThreadTime - NoConstTime;
SetThreadPriority(GetCurrentThread, THREAD_PRIORITY_NORMAL);
WriteLn('Const: ', ConstTime);
WriteLn('NoConst: ', NoConstTime);
WriteLn('Const is ', (NoConstTime/ConstTime):2:2, ' times faster.');
except
on E: Exception do
Writeln(E.ClassName, ': ', E.Message);
end;
if DebugHook <> 0 then
ReadLn;
end.
Produces this output on my system:
Const: 6084039
NoConst: 36192232
Const is 5.95 times faster.
EDIT: it gets a bit more interesting if we add some thread contention:
program Project;
{$APPTYPE CONSOLE}
uses
Windows,
SysUtils,
Classes,
Math;
function GetThreadTime: Int64;
var
CreationTime, ExitTime, KernelTime, UserTime: TFileTime;
begin
GetThreadTimes(GetCurrentThread, CreationTime, ExitTime, KernelTime, UserTime);
Result := PInt64(@UserTime)^;
end;
function ConstLength(const s: string): Integer;
begin
Result := Length(s);
end;
function NoConstLength(s: string): Integer;
begin
Result := Length(s);
end;
function LockedAdd(var Target: Integer; Value: Integer): Integer; register;
asm
mov ecx, eax
mov eax, edx
lock xadd [ecx], eax
add eax, edx
end;
var
x : Integer;
s : string;
ConstTime, NoConstTime: Integer;
StartEvent: THandle;
ActiveCount: Integer;
begin
try
// make sure we got an heap allocated string;
s := 'abc';
s := s + '123';
ConstTime := 0;
NoConstTime := 0;
StartEvent := CreateEvent(nil, True, False, '');
ActiveCount := 0;
for x := 0 to 2 do
TThread.CreateAnonymousThread(procedure
var
i : Integer;
j : Integer;
ThreadConstTime: Int64;
begin
//make sure we minimize thread context switches during the timing
SetThreadPriority(GetCurrentThread, THREAD_PRIORITY_HIGHEST);
InterlockedIncrement(ActiveCount);
WaitForSingleObject(StartEvent, INFINITE);
j := 0;
ThreadConstTime := GetThreadTime;
for i := 0 to 100000000 do
Inc(j, ConstLength(s));
ThreadConstTime := GetThreadTime - ThreadConstTime;
SetThreadPriority(GetCurrentThread, THREAD_PRIORITY_NORMAL);
LockedAdd(ConstTime, ThreadConstTime);
InterlockedDecrement(ActiveCount);
end).Start;
while ActiveCount < 3 do
Sleep(100);
SetEvent(StartEvent);
while ActiveCount > 0 do
Sleep(100);
WriteLn('Const: ', ConstTime);
ResetEvent(StartEvent);
for x := 0 to 2 do
TThread.CreateAnonymousThread(procedure
var
i : Integer;
j : Integer;
ThreadNoConstTime: Int64;
begin
//make sure we minimize thread context switches during the timing
SetThreadPriority(GetCurrentThread, THREAD_PRIORITY_HIGHEST);
InterlockedIncrement(ActiveCount);
WaitForSingleObject(StartEvent, INFINITE);
j := 0;
ThreadNoConstTime := GetThreadTime;
for i := 0 to 100000000 do
Inc(j, NoConstLength(s));
ThreadNoConstTime := GetThreadTime - ThreadNoConstTime;
SetThreadPriority(GetCurrentThread, THREAD_PRIORITY_NORMAL);
LockedAdd(NoConstTime, ThreadNoConstTime);
InterlockedDecrement(ActiveCount);
end).Start;
while ActiveCount < 3 do
Sleep(100);
SetEvent(StartEvent);
while ActiveCount > 0 do
Sleep(100);
WriteLn('NoConst: ', NoConstTime);
WriteLn('Const is ', (NoConstTime/ConstTime):2:2, ' times faster.');
except
on E: Exception do
Writeln(E.ClassName, ': ', E.Message);
end;
if DebugHook <> 0 then
ReadLn;
end.
On a 6 core machine, this gives me:
Const: 19968128
NoConst: 1313528420
Const is 65.78 times faster.
EDIT2: replacing the call to Length with a call to Pos (I picked the worst case, search for something not contained in the string):
function ConstLength(const s: string): Integer;
begin
Result := Pos('x', s);
end;
function NoConstLength(s: string): Integer;
begin
Result := Pos('x', s);
end;
results in:
Const: 51792332
NoConst: 1377644831
Const is 26.60 times faster.
for the threaded case, and:
Const: 15912102
NoConst: 44616286
Const is 2.80 times faster.
for the non-threaded case.
Don't forget that const isn't only there to provide those tiny performance improvements.
Using const explains to anybody reading or maintaining the code that the value shouldn't be updated, and allows the compiler to catch any accidental attempts to do so.
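For example, with a const parameter the compiler rejects any accidental write (a minimal sketch; the exact error text depends on the Delphi version):

function A(const AStr: string): Integer;
begin
  // AStr := AStr + '!'; // does not compile: "Left side cannot be assigned to"
  Result := Length(AStr);
end;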
So making your code more readable and maintainable can also make it marginally faster. What good reasons are there for not using const?
Using const prevents an implicit try/finally block which on x86 is rather more expensive than reference counting. That's really a separate issue to the semantic meaning of const. It's a shame that performance and semantics are mixed up in this way.
The type String is a special case because it is managed by Delphi (copy on demand), and is therefore not ideal for answering your question.
If you test your function with other types that are bigger than a pointer, records or static arrays for example, you should see a bigger time difference: with const only a reference is passed, while without const the whole record is copied before being passed to the function.
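A minimal sketch of what that means in practice; the record type and function names are made up for illustration:

type
  TBigRec = record
    Data: array[0..1023] of Integer; // a 4 KB record
  end;

// Without const: the caller's 4 KB record is copied for every call.
function SumByValue(R: TBigRec): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := Low(R.Data) to High(R.Data) do
    Inc(Result, R.Data[i]);
end;

// With const: only a reference is passed; no copy is made.
function SumByConst(const R: TBigRec): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := Low(R.Data) to High(R.Data) do
    Inc(Result, R.Data[i]);
end;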
Using the keyword const, you can leave the decision of optimization to the compiler.
The documentation says:
Using const allows the compiler to optimize code for structured- and string-type parameters.
So, it is better, thus rational, to use const for string parameters, simply because the manual says so. ;)
Now, this may be well enough an answer for the questioner, but it is even more interesting to look at the general question whether to use const parameters or not.
Again, the documentation says at just one click away from the Delphi Language Guide Index:
Value and constant (const) parameters are passed by value or by reference, depending on the type and size of the parameter:
Note the apparent equality of value and constant parameters in this sentence. From this it follows that using const for parameters that are not string- or structured-typed makes no difference in performance or code size. (A short test, derived from Thorsten Engler's test code, indeed shows on average no difference between with and without const for parameters of ordinal and real types.)
So it turns out that, for such parameters, using const or not only makes a difference to the programmer, not to the executable.
As follow-up, and as LukeH already asked: What good reasons are there for not using const?
To follow Delphi's own syntax:
function FindDragTarget(const Pos: TPoint; AllowDisabled: Boolean): TControl;
function UpperCase(const S: string): string;
function UpCase(Ch: Char): Char;
function EncodeDate(Year, Month, Day: Word): TDateTime;
To produce more compact and therefore possibly slightly more readable code. For instance: const on property-setter parameters really is superfluous, and dropping it surprisingly often turns a two-line declaration into a single line, if you like to honour a line length limit.
To comfortably provide variables to virtual methods and event handlers. Note that none of the VCL event handler types use const parameters (for other than string- or record-typed members). It is just a nice service for the users of your code or your components.
Of course, there also may be fine reasons for using const:
As LukeH already answered, if there is really no need at all to change the value of the parameter.
For (personal) protection, like the documentation says:
Using const also provides a safeguard against unintentionally passing a parameter by reference to another routine.
Partial origin of this answer: http://www.nldelphi.com.
Generally, I would avoid any optimizations (in any language) that don't solve real problems that you can measure. Profile your code, and fix the problems that you can actually see. Optimizing for theoretical issues is just a waste of your time.
If you suspect that something is wrong, and this somehow fixes it/speeds it up, then great, but implementing these kinds of micro optimizations by default are rarely worth the time.
One of the most important facts has been omitted: interlocked instructions are very costly on multicore x86 CPUs (read the Intel manuals). The cost comes when the reference-counter variable is not already in the CPU cache; the other cores must be stalled while the locked instruction is carried out.
Cheers
Is there a faster way? I basically need to add AA-ZZ to thousands of records at a time.
Just a list of 35 items takes quite a while to complete, much less a list of a thousand.
procedure Tmainform.btnSeederClick(Sender: TObject);
var
ch,ch2:char;
i:integer;
slist1, slist2:TStrings;
begin
slist1:= TStringList.Create;
slist2:= TStringList.Create;
slist1.Text :=queuebox.Items.Text;
for ch := 'a' to 'z' do
begin
for ch2 := 'a' to 'z' do
begin
//
for I := 0 to slist1.Count - 1 do
begin
application.ProcessMessages; // so it doesn't freeze the application in long loops. Not 100% sure where this should be placed, if at all.
sleep(1); //Without this it doesn't process the cancel button.
if cancel then Break;
slist2.Add(slist1.Strings[i]+ch+ch2);
end;
end;
end;
insertsingle(slist2,queuebox);
freeandnil(slist1);
freeandnil(slist2);
end;
Thanks for any help
There are a couple obvious problems with your code.
First off, you're wasting a lot of CPU cycles computing the same values over and over again. The AA..ZZ values aren't going to change, so there's no need to build them over and over. Try something like this: Create a third TStringList. Go through and fill it with all possible AA..ZZ permutations with your double loop. Once that's over with, loop through and merge this list of precomputed strings with the values in slist1. You should see a pretty big boost from that.
(Or, if time is absolutely at a premium, write a minor little program that will compute the permutation list and save it to a textfile, then compile that into your app as a string resource which you can load at runtime.)
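A minimal sketch of the first suggestion, written as a helper so it stands alone (the name AppendSuffixes is made up; it would be called as AppendSuffixes(slist1, slist2) in place of the triple loop):

procedure AppendSuffixes(source, target: TStrings);
var
  ch, ch2: Char;
  i, j: Integer;
  suffixes: TStringList;
begin
  suffixes := TStringList.Create;
  try
    // build the 676 aa..zz suffixes once, outside the per-item loop
    for ch := 'a' to 'z' do
      for ch2 := 'a' to 'z' do
        suffixes.Add(ch + ch2);
    target.BeginUpdate;
    try
      target.Capacity := source.Count * suffixes.Count;
      for i := 0 to source.Count - 1 do
        for j := 0 to suffixes.Count - 1 do
          target.Add(source[i] + suffixes[j]);
    finally
      target.EndUpdate;
    end;
  finally
    suffixes.Free;
  end;
end;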
Second, and this is probably what's killing you, you shouldn't have the ProcessMessages and the Sleep calls in the innermost loop. Sleep(1); sounds like it means "sleep for 1 millisecond", but Windows doesn't offer that sort of precision. What you end up getting is "sleep for at least 1 millisecond". It releases the CPU until Windows gets back around to it, which is usually somewhere on the order of 16 milliseconds. So you're adding a delay of 16 msec (plus however long ProcessMessages takes) into a very tight loop that probably takes only a few microseconds to execute the rest of its code.
If you need something like that to keep the UI responsive, it should be in the outermost loop, not an inner one, and you probably don't even need to run it every iteration. Try something like if ch mod 100 = 0 then //sleep and process messages here. Craig's suggestion to move this task to a worker thread would also help, but only if you know enough about threads to get it right. They can be tricky.
You should surround your code with slist2.BeginUpdate() and slist2.EndUpdate(), to stop TStringList from doing extra processing.
From my experience, you would get a very large improvement by using fewer ProcessMessages(); Sleep(1); statements, as suggested in other answers.
Try moving it to just below the first for loop, and see what improvement you get.
An example of how you might use a secondary thread to do the heavy work.
Note that for the 35 items you mention, it is really not worth it to start another thread. For a few thousand items the game changes. Processing 10,000 items takes 10 seconds on my desktop computer.
Some benefits of multithreading:
the main thread stays responsive.
the calculation can be stopped at will.
and of course some pitfalls:
care must be taken (in its current implementation) to not mess with the passed stringlists while the seeding is running.
multithreading adds complexity and are source for hard to find bugs.
Paste the code below into your favorite editor and you should be good to go.
procedure TForm1.btnStartClick(Sender: TObject);
var
I: Integer;
begin
//***** Fill the sourcelist
FSource := TStringList.Create;
FDestination := TStringList.Create;
for I := 0 to 9999 do
FSource.Add(Format('Test%0:d', [I]));
//***** Create and fire Thread
FSeeder := TSeeder.Create(FSource, FDestination);
FSeeder.OnTerminate := DoSeederDone;
FSeeder.Resume;
end;
procedure TForm1.btnStopClick(Sender: TObject);
begin
if Assigned(FSeeder) then
FSeeder.Terminate;
end;
procedure TForm1.DoSeederDone(Sender: TObject);
var
I, step: Integer;
begin
I := 0;
step := 0;
while I < FDestination.Count do
begin
//***** Don't show every item. OutputDebugString is pretty slow.
OutputDebugString(PChar(FDestination[I]));
Inc(step);
Inc(I, step);
end;
FSource.Free;
FDestination.Free;
end;
{ TSeeder }
constructor TSeeder.Create(const source, destination: TStringList);
begin
//***** Create a suspended, automatically freed Thread object.
Assert(Assigned(source));
Assert(Assigned(destination));
Assert(destination.Count = 0);
inherited Create(True);
FreeOnTerminate := True; //***** Triggers the OnTerminate event
FSource := source;
FDestination := destination;
end;
procedure TSeeder.Execute;
var
I, J: Integer;
AString: string;
begin
FDestination.BeginUpdate;
try
FDestination.Capacity := FSource.Count * 26 * 26;
for I := 0 to Pred(FSource.Count) do
begin
AString := FSource[I];
for J := 0 to Pred(26 * 26) do
begin
FDestination.Add(AString + Char(J div 26 + $41) + Char(J mod 26 + $41));
if Terminated then Exit;
end;
end;
finally
FDestination.EndUpdate;
end;
end;
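For reference, a minimal sketch of the declarations the snippet above assumes (form fields, buttons and the TSeeder class); the names are guessed from the code, and the unit would need Classes, Forms and StdCtrls in its uses clause:

type
  TSeeder = class(TThread)
  private
    FSource: TStringList;      // input, owned by the form
    FDestination: TStringList; // output, owned by the form
  protected
    procedure Execute; override;
  public
    constructor Create(const source, destination: TStringList);
  end;

  TForm1 = class(TForm)
    btnStart: TButton;
    btnStop: TButton;
    procedure btnStartClick(Sender: TObject);
    procedure btnStopClick(Sender: TObject);
  private
    FSource: TStringList;
    FDestination: TStringList;
    FSeeder: TSeeder;
    procedure DoSeederDone(Sender: TObject);
  end;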
OK. I have tried to optimize your code. For final tests, some test-data is needed.
What I have done (it include most of the ideas from Mason):
commented out the code about "cancel" and "insertsingle" (so the example compiles on its own)
gave types and variables more meaningful names
used the casing that Delphi uses ("Application" instead of "application", etc.) to make it readable
moved some logic into "KeepUIGoing"
moved the calculation of the suffixes out of the main loop into an initialization loop
made it optionally use a TStringBuilder (which can be way faster than a TStringList, and is available since Delphi 2009)
Below is the modified code, let me know if it works for you.
procedure TForm2.Button1Click(Sender: TObject);
{$define UseStringBuilder}
procedure KeepUIGoing(SourceListIndex: Integer);
begin
if SourceListIndex mod 100 = 0 then
begin
Application.ProcessMessages;
// so it doesn't freeze the application in long loops. Not 100% sure where this should be placed, if at all.
Sleep(1);
end;
end;
const
First = 'a';
Last = 'z';
type
TRange = First .. Last;
TSuffixes = array [TRange, TRange] of string;
var
OuterIndex, InnerIndex: Char;
SourceListIndex: Integer;
SourceList, TargetList: TStrings;
Suffixes: TSuffixes;
NewLine: string;
{$ifdef UseStringBuilder}
TargetStringBuilder: TStringBuilder; // could be way faster than TStringList
{$endif UseStringBuilder}
begin
for OuterIndex := First to Last do
for InnerIndex := First to Last do
Suffixes[OuterIndex, InnerIndex] := OuterIndex + InnerIndex;
SourceList := TStringList.Create;
TargetList := TStringList.Create;
{$ifdef UseStringBuilder}
TargetStringBuilder := TStringBuilder.Create();
{$endif UseStringBuilder}
try
SourceList.Text := queuebox.Items.Text;
for OuterIndex := First to Last do
begin
for InnerIndex := First to Last do
begin
for SourceListIndex := 0 to SourceList.Count - 1 do
begin
KeepUIGoing(SourceListIndex);
// if cancel then
// Break;
NewLine := SourceList.Strings[SourceListIndex] + Suffixes[OuterIndex, InnerIndex];
{$ifdef UseStringBuilder}
TargetStringBuilder.AppendLine(NewLine);
{$else}
TargetList.Add(NewLine);
{$endif UseStringBuilder}
end;
end;
end;
{$ifdef UseStringBuilder}
TargetList.Text := TargetStringBuilder.ToString();
{$endif UseStringBuilder}
// insertsingle(TargetList, queuebox);
finally
{$ifdef UseStringBuilder}
FreeAndNil(TargetStringBuilder);
{$endif UseStringBuilder}
FreeAndNil(SourceList);
FreeAndNil(TargetList);
end;
end;
--jeroen
I would see if you can do it in one loop as per comment. Also try doing it in a thread so you can eliminate the Application.ProcessMessages and Sleep calls without blocking the UI.
I know this doesn't specifically answer your question, but if you are interested in Delphi algorithms, Julian Bucknall (CTO of DevExpress) wrote the definitive Delphi algorithms book:
Tomes of Delphi: Algorithms and Data Structures:
Chapter 1: What is an algorithm?
Chapter 2: Arrays
Chapter 3: Linked Lists, Stacks, and Queues
Chapter 4: Searching
Chapter 5: Sorting
Chapter 6: Randomized Algorithms
Chapter 7: Hashing and Hash Tables
Chapter 8: Binary Trees
Chapter 9: Priority Queues and Heapsort
Chapter 10: State Machines and Regular Expressions
Chapter 11: Data Compression
Chapter 12: Advanced Topics
You can also get his EZDSL (Easy Data Structures Library) for Delphi 2009 and Delphi 6-2007.
try this sample code - hope this will help a little (Delphi 5 Ent./WinXP)
procedure TForm1.Button1Click(Sender: TObject);
var
i,k: Integer;
sourceList, destList: TStringList;
ch1, ch2: char;
begin
destList := TStringList.Create;
sourceList := TStringList.Create;
//some sample data but I guess your list will have 1000+
//entries?
sourceList.Add('Element1');
sourceList.Add('Element2');
sourceList.Add('Element3');
try
i := 0;
while i < (26*26) do
begin
if (i mod 100) = 0 then
Application.ProcessMessages;
ch1 := char(65 + (i div 26));
ch2 := char(65 + (i mod 26));
for k := 0 to sourceList.Count -1 do
destList.Add(Format('%s-%s%s', [sourceList.Strings[k], ch1, ch2]));
Inc(i);
end;
Memo1.Lines.AddStrings(destList);
finally
FreeAndNil(destList);
FreeAndNil(sourceList);
end;
end;
--Reinhard
If you want events to be processed during your loop, such as the Cancel button being clicked, calling Application.ProcessMessages is sufficient. If you call that regularly but not too regularly, e.g. 50 times per second, then your application will remain responsive to the Cancel button without slowing down too much. Application.ProcessMessages returns pretty quickly if there aren't any messages to be processed.
This technique is appropriate for relatively short computations (a few seconds) that you would expect the user to wait on. For long computations a background thread is more appropriate. Then your application can remain fully responsive, particularly if the user has a multi-core CPU.
Calling Sleep in the main thread does not allow your application to process events. It allows other applications to do something. Calling Sleep really puts your application (the calling thread, actually) to sleep for the requested amount of time or the remainder of the thread's time slice, whichever is larger.
Using the Delphi BackgroundWorker component for this purpose can be easier than a raw thread: it is event based. Features of BackgroundWorker (which uses a thread internally):
Event-based code; no need to create a thread class
Built-in progress reporting
Sample Code:
procedure TForm2.FormCreate(Sender: TObject);
var
I: Integer;
begin
FSource := TStringList.Create;
FDestination := TStringList.Create;
end;
procedure TForm2.Button1Click(Sender: TObject);
var
I: Integer;
begin
try
FSource.BeginUpdate;
FSource.Clear;
for I := 0 to 9999 do
FSource.Add(Format('Test%0:d', [I]));
BackgroundWorker1.Execute;
finally
FSource.EndUpdate;
end;
end;
procedure TForm2.StopButtonClick(Sender: TObject);
begin
BackgroundWorker1.Cancel;
end;
procedure TForm2.FormDestroy(Sender: TObject);
begin
FreeAndNil(FSource);
FreeAndNil(FDestination);
end;
procedure TForm2.BackgroundWorker1Work(Worker: TBackgroundWorker);
var
I, J: Integer;
AString: string;
begin
FDestination.BeginUpdate;
try
FDestination.Capacity := FSource.Count * 26 * 26;
for I := 0 to Pred(FSource.Count) do
begin
AString := FSource[I];
for J := 0 to Pred(26 * 26) do
begin
FDestination.Add(AString + Char(J div 26 + $41) + Char(J mod 26 + $41));
if Worker.CancellationPending then
Exit;
end;
if I mod 10 = 0 then
TThread.Sleep(1);
Worker.ReportProgress((I * 100) div FSource.Count);
end;
Worker.ReportProgress(100); // completed
finally
FDestination.EndUpdate;
end;
end;
procedure TForm2.BackgroundWorker1WorkProgress(Worker: TBackgroundWorker;
PercentDone: Integer);
begin
ProgressBar1.Position := PercentDone;
end;
If you are looking for pure speed, just unroll the code into a single loop and write each line as a separate assignment. You could write a program to generate the lines for you automatically, then copy and paste them into your code. This would essentially be about the fastest method possible. Also turn off all updates as mentioned above.
procedure Tmainform.btnSeederClick(Sender: TObject);
var
ch,ch2:char;
i:integer;
slist1, slist2:TStrings;
begin
slist1:= TStringList.Create;
slist2:= TStringList.Create;
slist1.Text :=queuebox.Items.Text;
slist2.BeginUpdate;
for I := 0 to slist1.Count - 1 do
begin
application.ProcessMessages; // so it doesn't freeze the application in long loops. Not 100% sure where this should be placed, if at all.
if cancel then Break;
slist2.Add(slist1.Strings[i]+'AA');
slist2.Add(slist1.Strings[i]+'AB');
slist2.Add(slist1.Strings[i]+'AC');
...
slist2.Add(slist1.Strings[i]+'ZZ');
end;
slist2.EndUpdate;
insertsingle(slist2,queuebox);
freeandnil(slist1);
freeandnil(slist2);
end;