Microsoft already says, in the documentation for GetTickCount, that you could never compare tick counts to check if an interval has passed. e.g.:
Incorrect (pseudo-code):
DWORD endTime = GetTickCount + 10000; //10 s from now
...
if (GetTickCount > endTime)
break;
The above code is bad because it is suceptable to rollover of the tick counter. For example, assume that the clock is near the end of it's range:
endTime = 0xfffffe00 + 10000
= 0x00002510; //9,488 decimal
Then you perform your check:
if (GetTickCount > endTime)
Which is satisfied immediatly, since GetTickCount is larger than endTime:
if (0xfffffe01 > 0x00002510)
The solution
Instead you should always subtract the two time intervals:
DWORD startTime = GetTickCount;
...
if (GetTickCount - startTime) > 10000 //if it's been 10 seconds
break;
Looking at the same math:
if (GetTickCount - startTime) > 10000
if (0xfffffe01 - 0xfffffe00) > 10000
if (1 > 10000)
Which is all well and good in C/C++, where the compiler behaves a certain way.
But what about Delphi?
But when i perform the same math in Delphi, with overflow checking on ({Q+}, {$OVERFLOWCHECKS ON}), the subtraction of the two tick counts generates an EIntOverflow exception when the TickCount rolls over:
if (0x00000100 - 0xffffff00) > 10000
0x00000100 - 0xffffff00 = 0x00000200
What is the intended solution for this problem?
Edit: i've tried to temporarily turn off OVERFLOWCHECKS:
{$OVERFLOWCHECKS OFF}]
delta = GetTickCount - startTime;
{$OVERFLOWCHECKS ON}
But the subtraction still throws an EIntOverflow exception.
Is there a better solution, involving casts and larger intermediate variable types?
Update
Another SO question i asked explained why {$OVERFLOWCHECKS} doesn't work. It apparently only works at the function level, not the line level. So while the following doesn't work:
{$OVERFLOWCHECKS OFF}]
delta = GetTickCount - startTime;
{$OVERFLOWCHECKS ON}
the following does work:
delta := Subtract(GetTickCount, startTime);
{$OVERFLOWCHECKS OFF}]
function Subtract(const B, A: DWORD): DWORD;
begin
Result := (B - A);
end;
{$OVERFLOWCHECKS ON}
How about a simple function like this one?
function GetElapsedTime(LastTick : Cardinal) : Cardinal;
var CurrentTick : Cardinal;
begin
CurrentTick := GetTickCount;
if CurrentTick >= LastTick then
Result := CurrentTick - LastTick
else
Result := (High(Cardinal) - LastTick) + CurrentTick;
end;
So you have
StartTime := GetTickCount
...
if GetElapsedTime(StartTime) > 10000 then
...
It will work as long as the time between StartTime and the current GetTickCount is less than the infamous 49.7 days range of GetTickCount.
I have stopped doing these calculations everywhere after writing a few helper functions that are called instead.
To use the new GetTickCount64() function on Vista and later there is the following new type:
type
TSystemTicks = type int64;
which is used for all such calculations. GetTickCount() is never called directly, the helper function GetSystemTicks() is used instead:
type
TGetTickCount64 = function: int64; stdcall;
var
pGetTickCount64: TGetTickCount64;
procedure LoadGetTickCount64;
var
DllHandle: HMODULE;
begin
DllHandle := LoadLibrary('kernel32.dll');
if DllHandle <> 0 then
pGetTickCount64 := GetProcAddress(DllHandle, 'GetTickCount64');
end;
function GetSystemTicks: TSystemTicks;
begin
if Assigned(pGetTickCount64) then
Result := pGetTickCount64
else
Result := GetTickCount;
end;
// ...
initialization
LoadGetTickCount64;
end.
You could even manually track the wrap-around of the GetTickCount() return value and return a true 64 bit system tick count on earlier systems too, which should work fairly well if you call the GetSystemTicks() function at least every few days. [I seem to remember an implementation of that somewhere, but don't remember where it was. gabr posted a link and the implementation.]
Now it's trivial to implement functions like
function GetTicksRemaining(...): TSystemTicks;
function GetElapsedTicks(...): TSystemTicks;
function IsTimeRunning(...): boolean;
that will hide the details. Calling these functions instead of calculating durations in-place serves also as documentation of the code intent, so less comments are necessary.
Edit:
You write in a comment:
But like you said, the fallback on Windows 2000 and XP to GetTickCount still leaves the original problem.
You can fix this easily. First you don't need to fall back to GetTickCount() - you can use the code gabr provided to calculate a 64 bit tick count on older systems as well. (You can replace timeGetTime() with GetTickCount) if you want.)
But if you don't want to do that you can just as well disable range and overflow checks in the helper functions, or check whether the minuend is smaller than the subtrahend and correct for that by adding $100000000 (2^32) to simulate a 64 bit tick count. Or implement the functions in assembler, in which case the code doesn't have the checks (not that I would advise this, but it's a possibility).
You can also use DSiTimeGetTime64 from the DSiWin32:
threadvar
GLastTimeGetTime: DWORD;
GTimeGetTimeBase: int64;
function DSiTimeGetTime64: int64;
begin
Result := timeGetTime;
if Result < GLastTimeGetTime then
GTimeGetTimeBase := GTimeGetTimeBase + $100000000;
GLastTimeGetTime := Result;
Result := Result + GTimeGetTimeBase;
end; { DSiTimeGetTime64 }
You can use the Int64 datatype to avoid overflow:
var
Start, Delta : Int64;
begin
Start := GetTickCount;
...
Delta := GetTickCount - start;
if (Delta > 10000) then
...
Related
I would like to know how to calculate the time consumed for a function in Delphi.
Then I wanted to show the used time and compare it with another function or component so as to know the faster function.
You can use TStopwatch from the System.Diagnostics unit to measure elapsed time using the system's high-resolution performance counter.
var
Stopwatch: TStopwatch;
Elapsed: TTimeSpan;
....
Stopwatch := TStopwatch.StartNew;
DoSomething;
Elapsed := Stopwatch.Elapsed;
To read a time value in seconds, say, from a time span, do this:
var
Seconds: Double;
....
Seconds := Elapsed.TotalSeconds;
You can use the QueryPerformanceCounter and QueryPerformanceFrequency functions:
var
c1, c2, f: Int64;
begin
QueryPerformanceFrequency(f);
QueryPerformanceCounter(c1);
DoSomething;
QueryPerformanceCounter(c2);
// Now (c2-c1)/f is the duration in secs of DoSomething
For the sake of having more possibilities for tackling the question, you could also use System.Classes.TThread.GetTickCount to get a current time in milliseconds to start your timer before your method, and then again after your method. The difference between these two is obviously the elapsed time in milliseconds, which you could transform into hours, seconds, etc.
Having said that, David Heffernan's proposal with TStopwatch is more elegant (and more precise?).
VAR iFrequency, iTimerStart, iTimerEnd: Int64;
procedure TimerStart;
begin
if NOT QueryPerformanceFrequency(iFrequency)
then MesajWarning('High resolution timer not availalbe!');
WinApi.Windows.QueryPerformanceCounter(iTimerStart);
end;
function TimerElapsed: Double; { In miliseconds }
begin
QueryPerformanceCounter(iTimerEnd);
Result:= 1000 * ((iTimerEnd - iTimerStart) / ifrequency);
end;
function TimerElapsedS: string; { In seconds/miliseconds }
begin
if TimerElapsed < 1000
then Result:= Real2Str(TimerElapsed, 2)+ ' ms'
else Result:= Real2Str(TimerElapsed / 1000, 2)+ ' s';
end;
I am looking for a timer in milliseconds or nanoseconds in Delphi7. I have to check the speeds of three ISAM files with sequential search. The first ind file contains 50 strings like "record_0" to "record_50". The second - "record_0" to "record_500" and the third - "record_0" to "record_5000". I've implemented everything but I don't know how to make the timer. I am comparing a string with the last item in each ISAM file. Here is my code for the first ind file:
procedure TForm1.Button1Click(Sender: TObject);
var i:integer;
var content : String[20];
var indexCounter:integer;
var keyword:string;
begin
//First ISAM file
AssignFile(indF1, 'index1.ind');
ReWrite(indF1);
Reset(indF1);
for i:=0 to 49 do begin
content := 'record_';
content := content + IntToStr(i+1);
index1.index1 := content;
index1.position1 := FileSize(indF1);
Seek(indF1, FileSize(indF1));
write(indF1, index1);
end;
CloseFile(indF1);
Label12.Caption := FileSizeStr('index1.ind');
//Sequential search in first ind file
Reset(indF1);
keyword := 'record_50';
indexCounter := 0;
//start timer
while not Eof(indF1) do begin
Seek(indF1, indexCounter);
Read(indF1, Index1);
if (keyword = Index1.index1) then begin
//stop timer;
//Label20 := milliseconds/nanoseconds;
//return/break while loop (result := -1; exit;) ???
end;
indexCounter := indexCounter + 1;
end;
I need a procedure/function so that when I call it it should start counting in milliseconds or nanoseconds and stop when the string is found (it's the last string in each ind file) and show the elapsed time for traversing through all the file. Also I don't know how to break the while loop. Thanks in advance.
The TStopWatch class described here "delphi-high-performance-timer-tstopwatch" has all functions needed (for Delphi-7).
It's implemented in later Delphi versions (Delphi-2010) as an advanced record in unit diagnostics.
Example:
var
sw : TStopWatch;
elapsedMilliseconds : cardinal;
begin
...
sw := TStopWatch.Create() ;
try
sw.Start;
while not Eof(indF1) do begin
Seek(indF1, indexCounter);
Read(indF1, Index1);
if (keyword = Index1.index1) then begin
sw.Stop;
Label20.Caption := IntToStr(sw.ElapsedMilliseconds);
break; // break while loop
end;
indexCounter := indexCounter + 1;
end;
...
finally
sw.Free;
end;
end;
To break the while loop, just do break; inside your conditional test.
Use QueryPerformanceFrequency and QueryPerformanceCounter. The first function returns a number of units per second, and the second function returns a value.
lFreq: Int64;
InitialF, FinalF: Int64;
if QueryPerformanceFrequency(lFreq) then
// hi-res timer is supported
else
// hi-res timer is not supported
QueryPerformanceCounter(InitialF);
// do something you want to time
QueryPerformanceCounter(FinalF);
// duration of the time of something is FinalF - InitialF in "units"
// divide by lFreq to get the amount of time in seconds,
// this will be an Extended type.
found this simple code:
var
StartTime : Cardinal;
begin
StartTime := GetTickCount;
//code to do
ShowMessage(Format('Elapsed time %d ms', [GetTickCount - StartTime]));
Use JclCounter from Jedi JCL. Or if you don't want to go with Jedi, use Win Api QueryPerformanceCounter.
Can anyone help me how to format an int variable in delphi into a minute:seconds??
sample:
myVar := 19;
my label caption should display 00:19
any idea anyone? thanks
This will avoid any errors for seconds values that overflow into hours.
var
secs: integer;
str: string;
begin
secs := 236;
// SecsPerDay comes from the SysUtils unit.
str := FormatDateTime('nn:ss', secs / SecsPerDay));
// If you need hours, too, just add "hh:" to the formatting string
secs := 32236;
str := FormatDateTime('hh:nn:ss', secs / SecsPerDay));
end;
Assuming the myVar contains number of seconds:
label1.Caption := Format('%.2d:%.2d', [myVar div 60, myVar mod 60]);
You should use FormatDateTime method like this:
procedure TForm1.FormCreate(Sender: TObject);
const MyConst: Integer = 19;
begin
Caption:=FormatDateTime('nn:ss', EncodeTime(0, MyConst div 60, MyConst mod 60, 0));
end;
Expanding onto Brad's answer, I've wrapped this into a function which detects if the time is over an hour, and automatically shows hours if so. Otherwise, if it's less than an hour, it doesn't show the hours. It also has an optional parameter to define whether to show a leading zero on the hours and minutes, depending on your preference (i.e. 03:06:32 vs 3:6:32). This makes it a little more human-readable.
function SecsToTimeStr(const Secs: Integer; const LeadingZero: Boolean = False): String;
begin
if Secs >= SecsPerHour then begin
if LeadingZero then
Result := FormatDateTime('hh:nn:ss', Secs / SecsPerDay)
else
Result := FormatDateTime('h:n:ss', Secs / SecsPerDay)
end else begin
if LeadingZero then
Result := FormatDateTime('nn:ss', Secs / SecsPerDay)
else
Result := FormatDateTime('n:ss', Secs / SecsPerDay)
end;
end;
However, there are many different possible preferences with displaying a time period, which is up to you to decide. I won't cover all those possible ways here.
If you are sure you only want minutes and seconds - a quick solution could be:
Format('%d:%d',[(myVar div 60), (myVar mod 60)]);
Same solution as already proposed ... :-)
Hi I am using QueryperformanceCounter to time a block of code in Delphi. For some reason, the
Millisecond number I got by using QueryPerformanceCounter is quite different from my wall clock time by using a stopwatch. For example The stopwatch give me about 33 seconds, which seems right if not accuracy, but using QueryPerofomanceCounter will give me a number like 500 Milliseconds.
When step though my code, I can see that QueryPerformanceFrequencygives me correct CPU frequency for my CPU, 2.4G for Core2 E6600. So if the tick number is correct, (tick number / Freq) * 1000 should give me correct execution time for the code I am timing, but why not?
I know that for the code I am trying to timing, QeuryPerformanceCounter is probably over-killing as it took seconds rather than MillionSeconds, but I am more interested in understanding the reason for the time difference between wall clock and QueryPerormanceCounter.
My Hardware is E6600 Core2 and OS is Windows 7 X64 if it is relevant.
unit PerformanceTimer;
interface
uses Windows, SysUtils, DateUtils;
type TPerformanceTimer = class
private
fFrequency : TLargeInteger;
fIsRunning: boolean;
fIsHighResolution: boolean;
fStartCount, FstopCount : TLargeInteger;
procedure SetTickStamp(var lInt : TLargeInteger) ;
function GetElapsedTicks: TLargeInteger;
function GetElapsedMiliseconds: TLargeInteger;
public
constructor Create(const startOnCreate : boolean = false) ;
procedure Start;
procedure Stop;
property IsHighResolution : boolean read fIsHighResolution;
property ElapsedTicks : TLargeInteger read GetElapsedTicks;
property ElapsedMiliseconds : TLargeInteger read GetElapsedMiliseconds;
property IsRunning : boolean read fIsRunning;
end;
implementation
constructor TPerformanceTimer.Create(const startOnCreate : boolean = false) ;
begin
inherited Create;
fIsRunning := false;
fIsHighResolution := QueryPerformanceFrequency(fFrequency) ;
if NOT fIsHighResolution then
fFrequency := MSecsPerSec;
if startOnCreate then
Start;
end;
function TPerformanceTimer.GetElapsedTicks: TLargeInteger;
begin
result := fStopCount - fStartCount;
end;
procedure TPerformanceTimer.SetTickStamp(var lInt : TLargeInteger) ;
begin
if fIsHighResolution then
QueryPerformanceCounter(lInt)
else
lInt := MilliSecondOf(Now) ;
end;
function TPerformanceTimer.GetElapsedMiliseconds: TLargeInteger;
begin
result := (MSecsPerSec * (fStopCount - fStartCount)) div fFrequency;
end;
procedure TPerformanceTimer.Start;
begin
SetTickStamp(fStartCount) ;
fIsRunning := true;
end;
procedure TPerformanceTimer.Stop;
begin
SetTickStamp(fStopCount) ;
fIsRunning := false;
end;
end.
This code just works for me, maybe you can try it:
var
ifrequency, icount1, icount2: Int64;
fmsec: Double;
begin
QueryPerformanceFrequency(ifrequency);
QueryPerformanceCounter(icount1);
Sleep(500);
QueryPerformanceCounter(icount2);
fmsec := 1000 * ((icount2 - icount1) / ifrequency);
end;
fmsec is about 499.6 or something like that.
Note: Don't rely on Now or TickCount for small numbers: they have an interval of about 10ms (depending on Windows version)! So duration of "sleep(10)" can give 0ms if you use Now and DateUtils.MillisecondsBetween
Note 2: Don't rely on QueryPerformanceCounter for long durations, because it's time can slowly go away during a day (about 1ms diff per minute)
If your hardware supports dynamic frequency scaling, it implies that QueryPerformanceFrequency cannot return a static value continuously describing a dynamically changing one. Whenever something computationally aggressive starts, the adapting CPU speed will prevent exact measurements.
At least, it was experienced with my notebook - as it changed to the higher clock rate, QueryPerformanceCounter based measurements were messed up.
So, regardless of the higher accuracy offered, I still use GetTickCount most of the time for such purposes (but DateTime based measurements are also OK, as mentioned before, except if time zone switches may occur), with some "warm-up" code piece that starts eating up the CPU power so the CPU speed is at its (constant) maximum as the relevant code piece starts executing.
You should post a code snippet demonstrating the problem...but I would assume an error on your part:
Milliseconds := 1000 * ((StopCount - StartCount) / Frequency);
If you are comparing to a stop watch, you can likely take the easier route and just capture the TDateTime before and after (by using Now()) and then use the DateUtils MilliSecondSpan() method to calculate difference:
var
MyStartDate:TDateTime;
MyStopDate:TDateTime;
MyTiming:Double;
begin
MyStartDate := Now();
DoSomethingYouWantTimed();
MyStopDate := Now();
MyTiming := MilliSecondSpan(MyStopDate, MyStartDate);
DoSomethingWithTiming(MyTiming);
end;
I use an NTP server to sync the PC clock periodically, the PC clock over a large amount of time to adjust the QueryPerformanceCounter "tick" time, and the calibrated QueryPerformanceCounter time for precise time measurements. On a good server where the clock drift is low it means that I have accuracy over time periods to much less than a millisecond and the clock times of all my machines synchronised to within a millsecond or two. Some of the relevant code is attached below:
function NowInternal: TDateTime;
const
// maximum time in seconds between synchronising the high-resolution clock
MAX_SYNC_TIME = 10;
var
lPerformanceCount: Int64;
lResult: TDateTime;
lDateTimeSynchronised: Boolean;
begin
// check that the the high-resolution performance counter frequency has been
// initialised
fDateTimeCritSect.Enter;
try
if (fPerformanceFrequency < 0) and
not QueryPerformanceFrequency(fPerformanceFrequency) then
fPerformanceFrequency := 0;
if fPerformanceFrequency > 0 then begin
// get the return value from the the high-resolution performance counter
if (fWindowsStartTime <> CSI_NULL_DATE_TIME) and
QueryPerformanceCounter(lPerformanceCount) then
lResult := fWindowsStartTime +
lPerformanceCount / fPerformanceFrequency / SecsPerDay
else
lResult := CSI_NULL_DATE_TIME;
if (MilliSecondsBetween(lResult, Now) >= MAX_CLOCK_DIFF) or
(SecondsBetween(Now, fLastSyncTime) >= MAX_SYNC_TIME) then begin
// resynchronise the high-resolution clock due to clock differences or
// at least every 10 seconds
lDateTimeSynchronised := SyncDateTime;
// get the return value from the the high-resolution performance counter
if (fWindowsStartTime <> CSI_NULL_DATE_TIME) and
QueryPerformanceCounter(lPerformanceCount) then
lResult := fWindowsStartTime +
lPerformanceCount / fPerformanceFrequency / SecsPerDay;
end else
lDateTimeSynchronised := False;
if MilliSecondsBetween(lResult, Now) >= (MAX_CLOCK_DIFF * 2) then
// default the return value to the standard low-resolution value if
// anything has gone wrong
Result := Now
else
Result := lResult;
end else begin
lDateTimeSynchronised := False;
// default the return value to the standard low-resolution value because
// we cannot use the high-resolution clock
Result := Now;
end;
finally
fDateTimeCritSect.Leave;
end;
if lDateTimeSynchronised then
CsiGlobals.AddLogMsg('High-resolution clock synchronised', CSI_LC_CLOCK);
end;
function SyncDateTime: Boolean;
var
lPriorityClass: Cardinal;
lThreadPriority: Integer;
lInitTime: TDateTime;
lNextTime: TDateTime;
lPerformanceCount: Int64;
lHighResCurrentTime: TDateTime;
lLowResCurrentTime: TDateTime;
begin
// synchronise the high-resolution date/time structure (boost the thread
// priority as high as possible during synchronisation)
lPriorityClass := CsiGetProcessPriorityClass;
lThreadPriority := CsiGetCurrentThreadPriority;
try
CsiSetProcessPriorityClass(REALTIME_PRIORITY_CLASS);
CsiSetCurrentThreadPriority(THREAD_PRIORITY_TIME_CRITICAL);
// loop until the low-resolution date/time value changes (this will load the
// CPU, but only for a maximum of around 15 milliseconds)
lInitTime := Now;
lNextTime := Now;
while lNextTime = lInitTime do
lNextTime := Now;
// adjust the high-resolution performance counter frequency for clock drift
if (fWindowsStartTime <> CSI_NULL_DATE_TIME) and
QueryPerformanceCounter(lPerformanceCount) then begin
lHighResCurrentTime := fWindowsStartTime +
lPerformanceCount / fPerformanceFrequency /
SecsPerDay;
lLowResCurrentTime := Now;
if MilliSecondsBetween(lHighResCurrentTime, lLowResCurrentTime) <
(MAX_CLOCK_DIFF * 2) then
fPerformanceFrequency := Round((1 +
(lHighResCurrentTime -
lLowResCurrentTime) /
(lLowResCurrentTime - fLastSyncTime)) *
fPerformanceFrequency);
end;
// save the Windows start time by extrapolating the high-resolution
// performance counter value back to zero
if QueryPerformanceCounter(lPerformanceCount) then begin
fWindowsStartTime := lNextTime -
lPerformanceCount / fPerformanceFrequency /
SecsPerDay;
fLastSyncTime := Now;
Result := True;
end else
Result := False;
finally
CsiSetCurrentThreadPriority(lThreadPriority);
CsiSetProcessPriorityClass(lPriorityClass);
end;
end;
Is there any comparable function like Pos that is not case-sensitive in D2010 (unicode)?
I know I can use Pos(AnsiUpperCase(FindString), AnsiUpperCase(SourceString)) but that adds a lot of processing time by converting the strings to uppercase every time the function is called.
For example, on a 1000000 loop, Pos takes 78ms while converting to uppercase takes 764ms.
str1 := 'dfkfkL%&/s"#<.676505';
for i := 0 to 1000000 do
PosEx('#<.', str1, 1); // Takes 78ms
for i := 0 to 1000000 do
PosEx(AnsiUpperCase('#<.'), AnsiUpperCase(str1), 1); // Takes 764ms
I know that to improve the performance of this specific example I can convert the strings to uppercase first before the loop, but the reason why I'm looking to have a Pos-like function that is not case-sensitive is to replace one from FastStrings. All the strings I'll be using Pos for will be different so I will need to convert each and every one to uppercase.
Is there any other function that might be faster than Pos + convert the strings to uppercase?
The built-in Delphi function to do that is in both the AnsiStrings.ContainsText for AnsiStrings and StrUtils.ContainsText for Unicode strings.
In the background however, they use logic very similar to your logic.
No matter in which library, functions like that will always be slow: especially to be as compatible with Unicode as possible, they need to have quite a lot of overhead. And since they are inside the loop, that costs a lot.
The only way to circumvent that overhead, is to do those conversions outside the loop as much as possible.
So: follow your own suggestion, and you have a really good solution.
--jeroen
This version of my previous answer works in both D2007 and D2010.
In Delphi 2007 the CharUpCaseTable is 256 bytes
In Delphi 2010 it is 128 KB (65535*2).
The reason is Char size. In the older version of Delphi my original code only supported the current locale character set at initialization. My InsensPosEx is about 4 times faster than your code. Certainly it is possible to go even faster, but we would lose simplicity.
type
TCharUpCaseTable = array [Char] of Char;
var
CharUpCaseTable: TCharUpCaseTable;
procedure InitCharUpCaseTable(var Table: TCharUpCaseTable);
var
n: cardinal;
begin
for n := 0 to Length(Table) - 1 do
Table[Char(n)] := Char(n);
CharUpperBuff(#Table, Length(Table));
end;
function InsensPosEx(const SubStr, S: string; Offset: Integer = 1): Integer;
var
n: Integer;
SubStrLength: Integer;
SLength: Integer;
label
Fail;
begin
Result := 0;
if S = '' then Exit;
if Offset <= 0 then Exit;
SubStrLength := Length(SubStr);
SLength := Length(s);
if SubStrLength > SLength then Exit;
Result := Offset;
while SubStrLength <= (SLength-Result+1) do
begin
for n := 1 to SubStrLength do
if CharUpCaseTable[SubStr[n]] <> CharUpCaseTable[s[Result+n-1]] then
goto Fail;
Exit;
Fail:
Inc(Result);
end;
Result := 0;
end;
//...
initialization
InitCharUpCaseTable({var}CharUpCaseTable);
I have also faced the problem of converting FastStrings, which used a Boyer-Moore (BM) search to gain some speed, for D2009 and D2010. Since many of my searches are looking for a single character only, and most of these are looking for non-alphabetic characters, my D2010 version of SmartPos has an overload version with a widechar as the first argument, and does a simple loop through the string to find these. I use uppercasing of both arguments to handle the few non-case-sensitive case. For my applications, I believe the speed of this solution is comparable to FastStrings.
For the 'string find' case, my first pass was to use SearchBuf and do the uppercasing and accept the penalty, but I have recently been looking into the possibility of using a Unicode BM implementation. As you may be aware, BM does not scale well or easily to charsets of Unicode proportions, but there is a Unicode BM implementation at Soft Gems. This pre-dates D2009 and D2010, but looks as if it would convert fairly easily. The author, Mike Lischke, solves the uppercasing issue by including a 67kb Unicode uppercasing table, and this may be a step too far for my modest requirements. Since my search strings are usually short (though not as short as your single three-character example) the overhead for Unicode BM may also be a price not worth paying: the BM advantage increases with the length of the string being searched for.
This is definitely a situation where benchmarking with some real-world application-specific examples will be needed before incorporating that Unicode BM into my own applications.
Edit: some basic benchmarking shows that I was right to be wary of the "Unicode Tuned Boyer-Moore" solution. In my environment, UTBM results in bigger code, longer time. I might consider using it if I needed some of the extras this implementation provides (handling surrogates and whole-words only searches).
Here's one that I wrote and have been using for years:
function XPos( const cSubStr, cString :string ) :integer;
var
nLen0, nLen1, nCnt, nCnt2 :integer;
cFirst :Char;
begin
nLen0 := Length(cSubStr);
nLen1 := Length(cString);
if nLen0 > nLen1 then
begin
// the substr is longer than the cString
result := 0;
end
else if nLen0 = 0 then
begin
// null substr not allowed
result := 0;
end
else
begin
// the outer loop finds the first matching character....
cFirst := UpCase( cSubStr[1] );
result := 0;
for nCnt := 1 to nLen1 - nLen0 + 1 do
begin
if UpCase( cString[nCnt] ) = cFirst then
begin
// this might be the start of the substring...at least the first
// character matches....
result := nCnt;
for nCnt2 := 2 to nLen0 do
begin
if UpCase( cString[nCnt + nCnt2 - 1] ) <> UpCase( cSubStr[nCnt2] ) then
begin
// failed
result := 0;
break;
end;
end;
end;
if result > 0 then
break;
end;
end;
end;
Why not just convert the both the substring and the source string to lower or upper case within the regular Pos statement. The result will effectively be case-insensitive because both arguments are all in one case. Simple and lite.
The Jedi Code Library has StrIPos and thousands of other useful functions to complement Delphi's RTL. When I still worked a lot in Delphi, JCL and its visual brother JVCL were among the first things I added to a freshly installed Delphi.
Instead 'AnsiUpperCase' you can use Table it is much faster.
I have reshape my old code. It is very simple and also very fast.
Check it:
type
TAnsiUpCaseTable = array [AnsiChar] of AnsiChar;
var
AnsiTable: TAnsiUpCaseTable;
procedure InitAnsiUpCaseTable(var Table: TAnsiUpCaseTable);
var
n: cardinal;
begin
for n := 0 to SizeOf(TAnsiUpCaseTable) -1 do
begin
AnsiTable[AnsiChar(n)] := AnsiChar(n);
CharUpperBuff(#AnsiTable[AnsiChar(n)], 1);
end;
end;
function UpCasePosEx(const SubStr, S: string; Offset: Integer = 1): Integer;
var
n :integer;
SubStrLength :integer;
SLength :integer;
label
Fail;
begin
SLength := length(s);
if (SLength > 0) and (Offset > 0) then begin
SubStrLength := length(SubStr);
result := Offset;
while SubStrLength <= SLength - result + 1 do begin
for n := 1 to SubStrLength do
if AnsiTable[SubStr[n]] <> AnsiTable[s[result + n -1]] then
goto Fail;
exit;
Fail:
inc(result);
end;
end;
result := 0;
end;
initialization
InitAnsiUpCaseTable(AnsiTable);
end.
I think, converting to upper or lower case before Pos is the best way, but you should try to call AnsiUpperCase/AnsiLowerCase functions as less as possible.
On this occasion I couldn't find any approach that was even as good as, let alone better than Pos() + some form of string normalisation (upper/lowercase conversion).
This is not entirely surprising as when benchmarked the Unicode string handling in Delphi 2009 I found that the Pos() RTL routine has improved significantly since Delphi 7, explained in part by the fact that aspects of the FastCode libraries have been incorporated into the RTL for some time now.
The FastStrings library on the other hand has not - iirc - been significantly updated for a long time now. In tests I found that many FastStrings routines have in fact been overtaken by the equivalent RTL functions (with a couple of exceptions, explained by the unavoidable overhead incurred by the additional complications of Unicode).
The "Char-Wise" processing of the solution presented by Steve is the best so far imho.
Any approach that involves normalising the entire strings (both string and sub-string) risks introducing errors in any character-based position in the results due to the fact that with Unicode strings a case conversion may result in a change in the length of the string (some characters convert to more/fewer characters in a case conversion).
These may be rare cases but Steve's routine avoids them and is only about 10% slower than the already quite fast Pos + Uppercase (your benchmarking results don't tally with mine on that score).
Often the simple solution is the one you'd want to use:
if AnsiPos(AnsiupperCase('needle'), AnsiupperCase('The Needle in the haystack')) <> 0 then
DoSomething;
Reference:
http://www.delphibasics.co.uk/RTL.asp?Name=ansipos
http://www.delphibasics.co.uk/RTL.asp?Name=UpCase
Any program on Windows can call a shell-API function, which keeps your code-size down. As usual, read the program from the bottom up. This has been tested with Ascii-strings only, not wide strings.
program PrgDmoPosIns; {$AppType Console} // demo case-insensitive Pos function for Windows
// Free Pascal 3.2.2 [2022/01/02], Win32 for i386
// FPC.EXE -vq -CoOr -Twin32 -oPrgStrPosDmo.EXE PrgStrPosDmo.LPR
// -vq Verbose: Show message numbers
// -C Code generation:
// o Check overflow of integer operations
// O Check for possible overflow of integer operations - Integer Overflow checking turns on Warning 4048
// r Range checking
// -Twin32 Target 32 bit Windows operating systems
// 29600 bytes code, 1316 bytes data, 35,840 bytes file
function StrStrIA( pszHaystack, pszNeedle : PChar ) : PChar; stdcall; external 'shlwapi.dll'; // dynamic link to Windows API's case-INsensitive search
// https://learn.microsoft.com/en-us/windows/win32/api/shlwapi/nf-shlwapi-strstria
// "FPC\3.2.2\Source\Packages\winunits-base\src\shlwapi.pp" line 557
function StrPos( strNeedle, strHaystk : string ) : SizeInt; // return the position of Needle within Haystack, or zero if not found
var
intRtn : SizeInt; // function result
ptrHayStk , // pointers to
ptrNeedle , // search strings
strMchFnd : PChar ; // pointer to match-found string, or null-pointer/empty-string when not found
bolFnd : boolean; // whether Needle was found within Haystack
intLenHaystk , // length of haystack
intLenMchFnd : SizeInt; // length of needle
begin
strHayStk := strHayStk + #0 ; // strings passed to API must be
strNeedle := strNeedle + #0 ; // null-terminated
ptrHayStk := Addr( strHayStk[ 1 ] ) ; // set pointers to point at first characters of
ptrNeedle := Addr( strNeedle[ 1 ] ) ; // null-terminated strings, so API gets C-style strings
strMchFnd := StrStrIA( ptrHayStk, ptrNeedle ); // call Windows to perform search; match-found-string now points inside the Haystack
bolFnd := ( strMchFnd <> '' ) ; // variable is True when match-found-string is not null/empty
if bolFnd then begin ; // when Needle was yes found in Haystack
intLenMchFnd := Length( strMchFnd ) ; // get length of needle
intLenHaystk := Length( strHayStk ) ; // get length of haystack
intRtn := intLenHaystk - intLenMchFnd; // set function result to the position of needle within haystack, which is the difference in lengths
end else // when Needle was not found in Haystack
intRtn := 0 ; // set function result to tell caller needle does not appear within haystack
StrPos := intRtn ; // pass function result back to caller
end; // StrPos
procedure TstOne( const strNeedle, strHayStk : string ); // run one test with this Needle
var
intPos : SizeInt; // found-match location of Needle within Haystack, or zero if none
begin
write ( 'Searching for : [', strNeedle, ']' ); // bgn output row for this test
intPos := StrPos( strNeedle, strHaystk ); // get Needle position
writeln(' StrPos is ' , intPos ); // end output row for this test
end; // TstOne
procedure TstAll( ); // run all tests with various Needles
const
strHayStk = 'Needle in a Haystack'; // all tests will search in this string
begin
writeln( 'Searching in : [', strHayStk, ']' ); // emit header row
TstOne ( 'Noodle' , strHayStk ); // test not-found
TstOne ( 'Needle' , strHayStk ); // test found at yes-first character
TstOne ( 'Haystack' , strHayStk ); // test found at not-first character
end; // TstAll
begin // ***** MAIN *****
TstAll( ); // run all tests
end.
function TextPos(const ASubText, AText: UnicodeString): Integer;
var
res: Integer;
begin
{
Locates a substring in a given text string without case sensitivity.
Returns the index of the first occurence of ATextin AText,
or zero if the text was not found
}
res := FindNLSString(LOCALE_USER_DEFAULT, FIND_FROMSTART or LINGUISTIC_IGNORECASE, PWideChar(AText), Length(AText), PWideChar(ASubText), Length(ASubText), nil);
Result := (res+1); //convert zero-based to one-based index, and -1 not found to zero.
end;
And in case you don't have the definitions:
function FindNLSString(Locale: LCID; dwFindNLSStringFlags: DWORD; lpStringSource: PWideChar; cchSource: Integer; lpStringValue: PWideChar; cchValue: Integer; cchFound: PInteger): Integer; stdcall; external 'Kernel32.dll';
const
FIND_FROMSTART = $00400000; // look for value in source, starting at the
LINGUISTIC_IGNORECASE = $00000010; // linguistically appropriate 'ignore