Issue with `bitpacked` records on a little-endian machine - delphi

I'm trying to use FreePascal on a little-endian machine to read and interpret data from an integrated circuit. The data essentially consists of tightly bit-packed, (mostly) big-endian integers, many of which are not aligned to a byte boundary. So I tried to use FPC's bitpacked records for that and found myself in deep trouble.
The first structure I'm trying to read has the following format:
{$BITPACKING ON}
type
THeader = bitpacked record
Magic: Byte; // format id, 8 bits
_Type: $000..$FFF; // type specifier, 12 bits
Version: Word; // data revision, 16 bits
Flags: $0..$F // attributes, 4 bits
end;
And here is the reading code:
procedure TForm1.FormCreate(Sender: TObject);
var
F: File;
Header: THeader;
begin
Writeln(SizeOf(Header), #9, BitSizeOf(Header)); // reports correctly
Writeln('SizeOf(Header._Type) = ', SizeOf(Header._Type)); // correctly reports 2 bytes
Writeln('BitSizeOf(Header._Type) = ', BitSizeOf(Header._Type)); // correctly reports 12 bits
AssignFile(F, 'D:\3fd8.dat');
FileMode := fmOpenRead;
Reset(F, SizeOf(Byte));
BlockRead(F, Header, SizeOf(Header));
{ data is incorrect beyond this point already }
//Header._Type := BEtoN(Header._Type);
Writeln(IntToHex(Header.Magic, SizeOf(Header.Magic) * 2));
Writeln(IntToHex(BEtoN(Header._Type), SizeOf(Header._Type) * 2));
Writeln(BEtoN(Header.Version));
end;
But the code is printing totally wrong data.
Here is the data and the interpretation done manually:
0000000000: F1 55 BE 3F 0A ...
Magic = F1
_Type = 55B
Version = E3F0
Flags = A
But FPC sees the data in a very different, incorrect way. It looks like the nibbles (and bits) belonging to a field are not contiguous because of the little-endianness of the host machine (e.g. nibble B should belong to the _Type field and nibble E to Version). Here is the Watches window from Lazarus:
Please advise what I should do about this behaviour. Is this non-contiguous bitfield issue a bug in FPC? Are any workarounds possible?

The bytes
F1 55 BE 3F 0A
have the following consecutive nibbles (lower nibble before higher nibble):
1 F 5 5 E B F 3 A 0
If you group these into 2, 3, 4 and 1 nibbles respectively, you get:
1 F --> $F1
5 5 E --> $E55 // highest nibble last, so E is highest.
B F 3 A --> $A3FB // same again: A is highest nibble
0 --> $0
which corresponds to the result you see in the Watch window, and not what you decoded manually.
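The nibble grouping above can be reproduced with a small sketch. It assumes the five bytes are read as one little-endian 40-bit number and that fields are allocated starting from the least significant bit, which is consistent with the values derived above. FieldLE and Raw are hypothetical names used only for this illustration.

```pascal
program LittleEndianView;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

const
  // The sample bytes from the question
  Raw: array[0..4] of Byte = ($F1, $55, $BE, $3F, $0A);

// Hypothetical helper: treat the 5 bytes as one little-endian 40-bit number,
// then cut out Count bits starting at bit offset Start (bit 0 = least significant).
function FieldLE(Start, Count: Integer): Integer;
var
  V: Int64;
  i: Integer;
begin
  V := 0;
  // The last byte is the most significant one in little-endian order
  for i := High(Raw) downto 0 do
    V := (V shl 8) or Raw[i];
  Result := Integer((V shr Start) and ((Int64(1) shl Count) - 1));
end;

begin
  Writeln('Magic   = ', IntToHex(FieldLE(0, 8), 2));    // F1
  Writeln('_Type   = ', IntToHex(FieldLE(8, 12), 3));   // E55
  Writeln('Version = ', IntToHex(FieldLE(20, 16), 4));  // A3FB
  Writeln('Flags   = ', IntToHex(FieldLE(36, 4), 1));   // 0
end.
```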
Now, if the data is big-endian, then you'll have to decode manually using shifts and masking:
X.Magic := bytes[0];
X._Type := (bytes[1] shl 4) or (bytes[2] shr 4);
X.Version := ((bytes[2] and $0F) shl 12) or
(bytes[3] shl 4) or
(bytes[4] shr 4);
X.Flags := bytes[4] and $0F;
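As a self-contained sketch, the same shifts and masks can be wrapped in a function and checked against the sample bytes from the question. THeaderFields, THeaderBytes and DecodeHeaderBytes are hypothetical names introduced here; the record is deliberately a plain record, not a bitpacked one, because the fields are filled in by hand.

```pascal
program DecodeHeader;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

type
  // Plain (not bitpacked) record: the fields are filled in manually
  THeaderFields = record
    Magic: Byte;    // 8 bits
    _Type: Word;    // 12 bits
    Version: Word;  // 16 bits
    Flags: Byte;    // 4 bits
  end;

  THeaderBytes = array[0..4] of Byte;

// Hypothetical helper: unpack the 40-bit big-endian header from 5 raw bytes
function DecodeHeaderBytes(const B: THeaderBytes): THeaderFields;
begin
  Result.Magic   := B[0];
  Result._Type   := (Word(B[1]) shl 4) or (B[2] shr 4);
  Result.Version := Word((Word(B[2] and $0F) shl 12) or
                         (Word(B[3]) shl 4) or
                         (B[4] shr 4));
  Result.Flags   := B[4] and $0F;
end;

const
  Sample: THeaderBytes = ($F1, $55, $BE, $3F, $0A);
var
  H: THeaderFields;
begin
  H := DecodeHeaderBytes(Sample);
  Writeln('Magic   = ', IntToHex(H.Magic, 2));    // F1
  Writeln('_Type   = ', IntToHex(H._Type, 3));    // 55B
  Writeln('Version = ', IntToHex(H.Version, 4));  // E3F0
  Writeln('Flags   = ', IntToHex(H.Flags, 1));    // A
end.
```

In the reading code you would BlockRead the five bytes into a THeaderBytes buffer instead of into the bitpacked record, and then call the decoder.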

I use this function to convert from IEEE format to FPC single:
Type MyReal = Array [1..4] of Byte;
function IeeeToSingle (src:MyReal):Single;
var x:MyReal;
s:single absolute x;
man:word;
exp:word;
ca:single;
cb:cardinal absolute ca; // 4 byte unsigned long int
begin
x:=src;
ca:=s;
if cb>0 then begin // not zero
man := cb shr 16;
exp := (man and $ff00) - $0200;
if ((exp and $8000) <> (man and $8000)) then
begin
IeeeToSingle := -1; // exponent overflow: return -1 as an error marker (MsToIeee is not declared here)
Exit;
end;
man := (man and $7f) or ((man shl 8) and $8000);// move sign
man := man or (exp shr 1);
cb := (cb and $ffff) or (Cardinal(man) shl 16);
end;
IeeeToSingle := ca;
end;

Related

How to print code128C?

I am trying to print a code128C barcode (numbers only), but I believe the way I send the data is incorrect: when the code is read back, the conversion does not give the data originally entered.
In code128A I submit an ASCII code, the printer converts it to hex and prints it, and the reader converts it back to ASCII.
In code128C, if I submit ASCII, the reader converts it to decimal when reading, which does not give the initial value.
EX:
128A Input: '1' Printer: 31 Reading: 1
128C Input: '1' Printer: 31 Reading: 49
I imagine that I should submit the input already as integers, but since the command also contains other information, I do not know how to send it as integers.
This is the code of code128A:
ComandoAnsiString := tp.cod128A('12'); //Data entry
function TTP650.cod128A(cod: AnsiString): AnsiString;
begin
// Fill out the CODE 128 printing protocol
Result := #29+#107+#73 + chr(length(cod)+2) + #123+#65 + cod;
end;
WritePrinter( HandleImp, PAnsiChar(ComandoAnsiString), Length(ComandoAnsiString),
CaracteresImpressos); //send to printer
This is the code I've been trying with code128C:
ComandoAnsiString := tp.cod128C('12');
function TTP650.cod128C(cod: AnsiString): AnsiString;
begin
Result := #29+#107+#73 + chr(length(cod)+2) + #123+#67 + cod;
end;
WritePrinter( HandleImp, PAnsiChar(ComandoAnsiString), Length(ComandoAnsiString),
CaracteresImpressos);
I'm working with a thermal printer and a simple, standard barcode reader.
The sending function (WritePrinter) comes from the WinSpool library; the rest of the code was written by me.
Important code information is on pages 47 to 50 of the guide.
Guide
I assume users enter the wanted barcode as a string of digits, which may be stored somewhere as a string and is passed to the printing function as a human-readable string at print time.
The printing function then converts it to an array of bytes, packing the digits according to CODE C (each pair of decimal digits forms a value 00..99, stored in one byte). In other words, if the entered string of digits is e.g. '123456', it is represented by three bytes with the values 12, 34, 56.
// needs SysUtils (TBytes, StrToInt) and StrUtils (MidStr) in the uses clause
function cod128C(const cod: string): TBytes;
const
GS = 29; // GS - Print bar code
k = 107; // k - -"-
m = 73; // m - CODE128
CS = 123; // { - select code set //}
CC = 67; // C - CODE C
var
i, len, n, x: integer;
s: string;
begin
Result := nil; // make sure an empty input returns an empty array
len := Length(cod);
if len = 0 then exit;
// raise for odd number of digits in cod, ...
// if Odd(len) then
// raise Exception.Create('cod must have even number of digits');
s := cod;
// ... alternatively assume a preceding zero digit before the first digit
// in cod
if Odd(len) then
begin
s := '0'+s;
inc(len);
end;
len := len div 2; // we pack 2 digits into one byte
SetLength(result, 6 + len);
result[0] := GS;
result[1] := k;
result[2] := m;
result[3] := 2 + len; // length of cod, + 2 for following code set selector
result[4] := CS;
result[5] := CC;
n := length(s);
i := 1; // index to S
x := 6; // index to result
while i < n do
begin
result[x] := StrToInt(MidStr(s, i, 2));
inc(i, 2);
inc(x, 1);
end;
end;
And with a form with a button, edit and memo you can test the function and send it to your printer with the following.
procedure TForm1.Button1Click(Sender: TObject);
var
cmnd: TBytes;
i: integer;
s: string;
begin
cmnd := cod128C(Edit1.Text);
for i := 0 to Length(cmnd)-1 do
s := s+IntToStr(cmnd[i])+', ';
Memo1.Lines.Add(s);
WritePrinter( HandleImp, @cmnd[0], Length(cmnd), CaracteresImpressos);
end;
You may want to add a check for only decimal digits in the input string, but I leave that to you.
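For completeness, here is a minimal sketch of such a check. IsAllDigits is a hypothetical helper name, not part of any library; it could be called before cod128C, which would then raise or return early when the check fails.

```pascal
program DigitCheck;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

// Hypothetical helper: True only if S is non-empty and every
// character is a decimal digit '0'..'9'.
function IsAllDigits(const S: string): Boolean;
var
  i: Integer;
begin
  Result := S <> '';
  for i := 1 to Length(S) do
    if (S[i] < '0') or (S[i] > '9') then
    begin
      Result := False;
      Exit;
    end;
end;

begin
  Writeln(IsAllDigits('123456')); // digits only: accepted
  Writeln(IsAllDigits('12a4'));   // contains a letter: rejected
  Writeln(IsAllDigits(''));       // empty input: rejected
end.
```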

Program keeps telling that number I wrote isn't integer

I made a program and it constantly tells me that the number I input isn't an integer.
I'm entering 100010110101 and it pops up with this error:
code:
procedure TForm1.Button1Click(Sender: TObject);
var
m,lo,cshl,cdhl,cjhl,csl,cdl,cjl:integer;
begin
m := StrToInt(Edit1.Text);
cshl := m div 100000000000;
cdhl := m div 10000000000 mod 10;
cjhl := m div 10000000000 mod 100;
csl := m div 1000000000 mod 1000;
cdl := m div 100000000 mod 10000;
cjl := m div 10000000 mod 100000;
lo := cjl + cdl * 10 + csl * 100 + cjhl * 1000 + cdhl * 10000 + cshl *100000;
ShowMessage(IntToStr(lo));
end;
Consider how Delphi (and most languages) handle 32-bit integers (see Wikipedia).
In this context, Integer is a 32-bit integer, and any value less than -2,147,483,648 or greater than 2,147,483,647 is not a valid 32-bit integer.
Common sense would suggest that integers range from -∞ to +∞, but that is not the case in computer architecture.
Use Int64 if you want to cover more values.
In your case, the code should look like this:
var
m,lo,cshl,cdhl,cjhl,csl,cdl,cjl:Int64;
begin
m := StrToInt64(Edit1.Text);
...
end;
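If you would rather test the input than catch an exception, a small sketch with TryStrToInt and TryStrToInt64 from SysUtils shows the difference for the value from the question. The constant name Entered is just for illustration.

```pascal
program RangeCheck;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

const
  Entered = '100010110101'; // the value from the question
var
  Small: Integer;
  Big: Int64;
begin
  // High(Integer) is 2147483647, so a 12-digit number cannot fit
  Writeln('High(Integer) = ', High(Integer));

  if TryStrToInt(Entered, Small) then
    Writeln('fits in Integer: ', Small)
  else
    Writeln('does not fit in Integer');

  if TryStrToInt64(Entered, Big) then
    Writeln('fits in Int64: ', Big)
  else
    Writeln('does not fit in Int64');
end.
```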
Cheers

Lift UInt64 limits with strings in Delphi

I'm reaching the limit of UInt64 and was wondering whether there are functions that do simple operations such as +/- etc. on strings, since (theoretically) a string can hold as many digits as you have RAM.
For example I would like to calculate
24758800785707605497982484480 + 363463464326426 and get the result as a string.
I roughly know how to solve this with strings over the digits 0123456789, working digit by digit and carrying the overflow into the next position. That would cost a lot more computing power, but I wouldn't mind.
I would like to have this ability to do such calculations until my RAM just blows up (which would be the real limit...)
Are there such functions which already do that?
Arbitrarily large integers are not supported at the language level in Delphi, but a bit of Googling turns up http://www.delphiforfun.org/programs/Library/big_integers.htm, which supports them as a library.
On supercomputers, it's called BCD math (Binary Coded Decimal), and each half-byte of RAM represents a decimal digit [0..9]. That is not an efficient use of RAM, but huge computations take minimal time (e.g. about 3 ms to multiply 2-million-digit numbers). A BCD emulator on a fast PC takes 5 or 6 minutes.
I never need to add big numbers, but I do multiply. Actually, I call this routine iteratively to compute, for example, 1000000 factorial (a 5,565,709-digit answer). Str6Product refers to how it chops up a pair of string numbers into 6-digit pieces. s1 and s2 have a practical length limit of about 2^31. The function is limited by what a string can hold. Whatever that limit is, I've never gotten there.
//==============================================================================
function Str6Product(s1: string; s2: string): string; // 6-13 5:15 PM
var
so,snxt6 : string;
z1,z3, i, j, k : Cardinal; // Cardinal is 32-bit unsigned
x1,x3,xm : Cardinal;
countr : Cardinal;
a1, a2, a3 : array of Int64;
inum, icarry : uInt64; // uInt64 is 64-bit unsigned
icMax, inMax : uInt64; // largest icarry and inum seen so far (diagnostics only)
begin
s1 := '00000'+s1;
s2 := '00000'+s2;
z1 := length(s1); // set size of Cardinal arrays
z3 := z1 div 6;
x1 := length(s2); // set size of Cardinal arrays
x3 := x1 div 6;
xm := max(x3,z3);
SetLength(a1,xm+1);
SetLength(a2,xm+1);
// try to keep s1 and s2 about the
// same length for best performance
for i := 1 to xm do begin // from rt 2 lft - fill arrays
// with 4-byte integers
if i <= z3 then a1[i] := StrToInt(copy (s1, z1-i*6+1, 6));
if i <= x3 then a2[i] := StrToInt(copy (s2, x1-i*6+1, 6));
if i > z3 then a1[i] := 0;
if i > x3 then a2[i] := 0;
end;
k := max(xm-x3, xm-z3); // k prevents leading zeroes
SetLength(a3,xm+xm+1);
icarry := 0; countr := 0;
icMax := 0; inMax := 0;
for i := 1 to xm do begin // begin 33 lines of "string mult" engine
inum := 0;
for j := 1 to i do
inum := inum + (a1[i-j+1] * a2[j]);
icarry := icarry + inum;
if icMax < icarry then icMax := icarry;
if inMax < inum then inMax := inum;
inum := icarry mod 1000000;
icarry := icarry div 1000000;
countr := countr + 1;
a3[countr] := inum;
end;
if xm > 1 then begin
for i := xm downto k+1 do begin // k or 2
inum := 0;
for j := 2 to i do
inum := inum + (a1[xm+j-i] * a2[xm-j+2]);
icarry := icarry + inum;
if icMax < icarry then icMax := icarry;
if inMax < inum then inMax := inum;
inum := icarry mod 1000000;
icarry := icarry div 1000000;
countr := countr + 1;
a3[countr] := inum;
end;
end;
if icarry >= 1 then begin
countr := countr + 1;
a3[countr] := icarry;
end;
so := IntToStr(a3[countr]);
for i := countr-1 downto 1 do begin
snxt6 := IntToStr(a3[i]+1000000);
so := so+ snxt6[2]+ snxt6[3]+ snxt6[4]+ snxt6[5]+ snxt6[6]+ snxt6[7];
end;
while so[1] = '0' do // leading zeroes may exist
so := copy(so,2,length(so));
result := so;
end;
//==============================================================================
Test call:
StrText := Str6Product ('742136061320987817587158718975871','623450632948509826743508972875');
I should have added that you can add large numbers using the same methodology: working from right to left, split the strings into 16-digit chunks and convert those chunks to uInt64 variables. Add the least significant chunks first; if the sum produces a 17th digit, carry it over to the next chunk, add those two chunks plus the carry, and so on. When done, convert each 16-digit chunk back to a string (padded with leading zeroes to 16 digits) and concatenate them in order.
Converting between integers and strings is a pain, but necessary for big number arithmetic.
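As a rough illustration of string addition, here is a minimal sketch that works one decimal digit at a time (the approach described in the question) rather than in 16-digit chunks, so it is simpler but slower. StrAdd is a hypothetical name; the sketch assumes both inputs are non-empty, non-negative and contain only the characters '0'..'9'.

```pascal
program StringAdd;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

// Hypothetical helper: adds two non-negative decimal numbers given as
// digit strings. Assumes the inputs contain only '0'..'9'.
function StrAdd(const A, B: string): string;
var
  i, j, carry, sum: Integer;
begin
  Result := '';
  i := Length(A);
  j := Length(B);
  carry := 0;
  // Walk both strings from the rightmost (least significant) digit
  while (i > 0) or (j > 0) or (carry > 0) do
  begin
    sum := carry;
    if i > 0 then
    begin
      sum := sum + Ord(A[i]) - Ord('0');
      Dec(i);
    end;
    if j > 0 then
    begin
      sum := sum + Ord(B[j]) - Ord('0');
      Dec(j);
    end;
    // Prepending is quadratic for very long inputs, but keeps the sketch short
    Result := Chr(Ord('0') + sum mod 10) + Result;
    carry := sum div 10;
  end;
end;

begin
  // The example from the question
  Writeln(StrAdd('24758800785707605497982484480', '363463464326426'));
  // prints 24758800785707968961446810906
end.
```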

String to BCD (embarcadero delphi)

Edit:
I have (in an ASCII test file) the following record: "000000000.00"
I need to output it as ISO after parsing its counterpart in BCD (the other test file, in BCD/EBCDIC). I believe it takes 6 chars in BCD and 11 in ASCII.
So what I need is something that can convert it back and forth.
At first I thought of taking each char, feeding it to a convert function and converting it back, hence my messed-up question.
I hope this is clearer.
Yain
Dr. Peter Below (of Team B) donated these in the old Borland Delphi newsgroups a few years ago:
// NO NEGATIVE NUMBERS either direction.
// BCD to Integer
function BCDToInteger(Value: Integer): Integer;
begin
Result := (Value and $F);
Result := Result + (((Value shr 4) and $F) * 10);
Result := Result + (((Value shr 8) and $F) * 100);
Result := Result + (((Value shr 12) and $F) * 1000); // the thousands digit is the 4th nibble: shift by 12, not 16
end;
// Integer to BCD
function IntegerToBCD(Value: Integer): Integer;
begin
Result := Value div 1000 mod 10;
Result := (Result shl 4) or Value div 100 mod 10;
Result := (Result shl 4) or Value div 10 mod 10;
Result := (Result shl 4) or Value mod 10;
end;
As you may know, the ASCII codes of the numerals 0 through 9 are 48 through 57. Thus, if you convert each character in turn to its ASCII equivalent and subtract 48, you get its numerical value. Then you multiply by ten, and add the next number. In pseudo code (sorry, not a delphi guy):
def bcdToInt( string ):
val = 0
for each ch in string:
val = 10 * val + ascii(ch) - 48;
return val;
If your "string" in fact contains "true BCD values" (that is, numbers from 0 to 9, rather than their ASCII equivalent 48 to 57), then don't subtract the 48 in the above code. Finally, if two BCD values are tucked into a single byte, you would access successive members with a bitwise AND with 0x0F (15). But in that case, Ken White's solution is clearly more helpful. I hope this is enough to get you going.
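A rough Pascal rendering of the pseudo code above might look like the following sketch. DigitStrToInt and PackedBCDByteToInt are hypothetical names; the first assumes the string contains only ASCII digits, the second assumes one byte holding two BCD digits (high nibble first).

```pascal
program DigitsToInt;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

// Hypothetical helper: converts a string of ASCII digits to an integer.
// Assumes S contains only '0'..'9' and the value fits in Int64.
function DigitStrToInt(const S: string): Int64;
var
  i: Integer;
begin
  Result := 0;
  for i := 1 to Length(S) do
    Result := 10 * Result + (Ord(S[i]) - 48); // 48 is the ASCII code of '0'
end;

// Hypothetical helper: one byte holding two packed BCD digits
function PackedBCDByteToInt(B: Byte): Integer;
begin
  Result := (B shr 4) * 10 + (B and $0F);
end;

begin
  Writeln(DigitStrToInt('1234'));    // 1234
  Writeln(PackedBCDByteToInt($42));  // 42
end.
```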
The functions below work for 8-digit hexadecimal and BCD values.
function BCDToInteger(Value: DWORD): Integer;
const Multipliers:array[1..8] of Integer=(1, 10, 100, 1000, 10000, 100000, 1000000, 10000000);
var j:Integer;
begin
Result:=0;
for j:=1 to 8 do //8 digits
Result:=Result+(((Value shr ((j-1)*4)) and $0F) * Multipliers[j]);
end;//BCDToInteger
function IntegerToBCD(Value: DWORD): Integer;
const Dividers:array[1..8] of Integer=(1, 10, 100, 1000, 10000, 100000, 1000000, 10000000);
var j:Integer;
begin
Result:=0;
for j:=8 downto 1 do //8 digits
Result:=(Result shl 4) or ((Value div Dividers[j]) mod 10);
end;//IntegerToBCD

how to improve the code (Delphi) for loading and searching in a dictionary?

I'm a Delphi programmer.
I have made a program that uses dictionaries of words and expressions (loaded into the program as "array of string").
It uses a search algorithm based on their "checksum" (I hope this is the correct word).
A string is transformed into an integer like this:
var
FHashSize: Integer; //stores the value of GetHashSize
HashTable, HashTableNoCase: array[Byte] of Longword;
HashTableInit: Boolean = False;
const
AnsiLowCaseLookup: array[AnsiChar] of AnsiChar = (
#$00, #$01, #$02, #$03, #$04, #$05, #$06, #$07,
#$08, #$09, #$0A, #$0B, #$0C, #$0D, #$0E, #$0F,
#$10, #$11, #$12, #$13, #$14, #$15, #$16, #$17,
#$18, #$19, #$1A, #$1B, #$1C, #$1D, #$1E, #$1F,
#$20, #$21, #$22, #$23, #$24, #$25, #$26, #$27,
#$28, #$29, #$2A, #$2B, #$2C, #$2D, #$2E, #$2F,
#$30, #$31, #$32, #$33, #$34, #$35, #$36, #$37,
#$38, #$39, #$3A, #$3B, #$3C, #$3D, #$3E, #$3F,
#$40, #$61, #$62, #$63, #$64, #$65, #$66, #$67,
#$68, #$69, #$6A, #$6B, #$6C, #$6D, #$6E, #$6F,
#$70, #$71, #$72, #$73, #$74, #$75, #$76, #$77,
#$78, #$79, #$7A, #$5B, #$5C, #$5D, #$5E, #$5F,
#$60, #$61, #$62, #$63, #$64, #$65, #$66, #$67,
#$68, #$69, #$6A, #$6B, #$6C, #$6D, #$6E, #$6F,
#$70, #$71, #$72, #$73, #$74, #$75, #$76, #$77,
#$78, #$79, #$7A, #$7B, #$7C, #$7D, #$7E, #$7F,
#$80, #$81, #$82, #$83, #$84, #$85, #$86, #$87,
#$88, #$89, #$8A, #$8B, #$8C, #$8D, #$8E, #$8F,
#$90, #$91, #$92, #$93, #$94, #$95, #$96, #$97,
#$98, #$99, #$9A, #$9B, #$9C, #$9D, #$9E, #$9F,
#$A0, #$A1, #$A2, #$A3, #$A4, #$A5, #$A6, #$A7,
#$A8, #$A9, #$AA, #$AB, #$AC, #$AD, #$AE, #$AF,
#$B0, #$B1, #$B2, #$B3, #$B4, #$B5, #$B6, #$B7,
#$B8, #$B9, #$BA, #$BB, #$BC, #$BD, #$BE, #$BF,
#$C0, #$C1, #$C2, #$C3, #$C4, #$C5, #$C6, #$C7,
#$C8, #$C9, #$CA, #$CB, #$CC, #$CD, #$CE, #$CF,
#$D0, #$D1, #$D2, #$D3, #$D4, #$D5, #$D6, #$D7,
#$D8, #$D9, #$DA, #$DB, #$DC, #$DD, #$DE, #$DF,
#$E0, #$E1, #$E2, #$E3, #$E4, #$E5, #$E6, #$E7,
#$E8, #$E9, #$EA, #$EB, #$EC, #$ED, #$EE, #$EF,
#$F0, #$F1, #$F2, #$F3, #$F4, #$F5, #$F6, #$F7,
#$F8, #$F9, #$FA, #$FB, #$FC, #$FD, #$FE, #$FF);
implementation
function GetHashSize(const Count: Integer): Integer;
begin
if Count < 65 then
Result := 256
else
Result := Round(IntPower(16, Ceil(Log10(Count div 4) / Log10(16))));
end;
function Hash(const Hash: LongWord; const Buf; const BufSize: Integer): LongWord;
var P: PByte;
I: Integer;
begin
P := @Buf;
Result := Hash;
for I := 1 to BufSize do
begin
Result := HashTable[Byte(Result) xor P^] xor (Result shr 8);
Inc(P);
end;
end;
function HashStrBuf(const StrBuf: Pointer; const StrLength: Integer; const Slots: LongWord): LongWord;
var P: PChar;
I, J: Integer;
begin
if not HashTableInit then
InitHashTable;
P := StrBuf;
if StrLength <= 48 then // Hash all characters for short strings
Result := Hash($FFFFFFFF, P^, StrLength)
else
begin
// Hash first 16 bytes
Result := Hash($FFFFFFFF, P^, 16);
// Hash last 16 bytes
Inc(P, StrLength - 16);
Result := Hash(Result, P^, 16);
// Hash 16 bytes sampled from rest of string
I := (StrLength - 48) div 16;
P := StrBuf;
Inc(P, 16);
for J := 1 to 16 do
begin
Result := HashTable[Byte(Result) xor Byte(P^)] xor (Result shr 8);
Inc(P, I + 1);
end;
end;
// Mod into slots
if Slots <> 0 then
Result := Result mod Slots;
end;
procedure InitHashTable;
var I, J: Byte;
R: LongWord;
begin
for I := $00 to $FF do
begin
R := I;
for J := 8 downto 1 do
if R and 1 <> 0 then
R := (R shr 1) xor $EDB88320
else
R := R shr 1;
HashTable[I] := R;
end;
Move(HashTable, HashTableNoCase, Sizeof(HashTable));
for I := Ord('A') to Ord('Z') do
HashTableNoCase[I] := HashTableNoCase[I or 32];
HashTableInit := True;
end;
The result of HashStrBuf is combined with "and (FHashSize - 1)" and used as an index into an "array of array of Integer" (of size FHashSize), which stores the indexes of the strings in that "array of string".
This way, when the program searches for a string, the string is transformed into its "checksum", and the code then searches the "branch" with this index, comparing the string with the dictionary strings that have the same "checksum".
Ideally each string in the dictionary would have a unique checksum. But in the "real world" about 2/3 share their "checksum" with other words. Because of that, the search is not that fast.
In these dictionaries strings are composed of these characters: ['a'..'z',#224..#246,#248..#254,#154,#156..#159,#179,#186,#191,#190,#185,'0'..'9', '''']
Is there any way to improve the "hashing" so the strings would have more unique "checksums"?
Oh, one way is to increase the size of that "array of array of Integer" (FHashSize), but it cannot be increased too much because it takes a lot of RAM.
Another thing: these dictionaries are stored on the HDD only as words/expressions (without the "checksums"). The "checksums" are generated at program startup, but that takes many seconds.
Is there any way to speed up the startup of the program? Maybe by improving the "hashing" function, maybe by storing the "checksums" on HDD and loading them from there...
Any input would be appreciated...
PS: here is the code to search:
function TDictionary.LocateKey(const Key: AnsiString): Integer;
var i, j, l, H: Integer;
P, Q: PChar;
begin
Result := -1;
l := Length(Key);
H := HashStrBuf(@Key[1], l, 0) and (FHashSize - 1);
P := @Key[1];
for i := 0 to High(FHash[H]) do //FHash is that "array of array of integer"
begin
if l <> FKeys.ItemSize[FHash[H][i]] then //FKeys.ItemSize is an byte array with the lengths of strings from dictionary
Continue;
Q := FKeys.Pointer(FHash[H][i]); //pointer to string in dictionary
for j := 0 to l - 1 do
if (P + j)^ <> (Q + j)^ then
Break;
if j = l then
begin
Result := FHash[H][i];
Exit;
end;
end;
end;
Don't reinvent the wheel!
IMHO your hashing is far from efficient, and your collision algorithm can be improved.
Take a look for instance at the IniFiles unit, and the THashedStringList.
It's a bit old, but a good start for a string list using hashes.
There are a lot of good Delphi implementations of this, like in SuperObject and a lot of other code...
Take a look at our SynBigTable unit, which can handle arrays of data in memory or in file very fast, with full indexed searches. Or our latest TDynArray wrapper around any dynamic array of data, to implement TList-like methods to it, including fast binary search. I'm quite sure it could be faster than your hand-tuned code using hashing, if you use an ordered index then fast binary search.
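To illustrate the ordered-index idea in isolation, here is a minimal binary search sketch over a sorted "array of string". It is not the SynBigTable or TDynArray API, just a plain loop; FindWord and Dict are hypothetical names, and the array is assumed to be sorted ascending according to CompareStr.

```pascal
program SortedLookup;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

// Hypothetical helper: returns the index of Key in Words, or -1 if absent.
// Assumes Words is sorted ascending according to CompareStr.
function FindWord(const Words: array of string; const Key: string): Integer;
var
  L, R, M, C: Integer;
begin
  Result := -1;
  L := 0;
  R := High(Words);
  while L <= R do
  begin
    M := L + (R - L) div 2;
    C := CompareStr(Words[M], Key);
    if C = 0 then
    begin
      Result := M;
      Exit;
    end
    else if C < 0 then
      L := M + 1   // Key sorts after Words[M]: search the upper half
    else
      R := M - 1;  // Key sorts before Words[M]: search the lower half
  end;
end;

const
  Dict: array[0..4] of string = ('apple', 'bread', 'cheese', 'milk', 'water');
begin
  Writeln(FindWord(Dict, 'cheese'));  // 2
  Writeln(FindWord(Dict, 'tea'));     // -1
end.
```

A lookup like this needs about log2(N) string comparisons and no extra hash table in memory, at the cost of keeping the dictionary sorted.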
Post-Scriptum:
About pure hashing speed of a string content, take a look at this function - rename RawByteString into AnsiString, PPtrInt into PPointer, and PtrInt into Integer for Delphi 7:
function Hash32(const Text: RawByteString): cardinal;
function SubHash(P: PCardinalArray): cardinal;
{$ifdef HASINLINE}inline;{$endif}
var s1,s2: cardinal;
i, L: PtrInt;
const Mask: array[0..3] of cardinal = (0,$ff,$ffff,$ffffff);
begin
if P<>nil then begin
L := PPtrInt(PtrInt(P)-4)^; // fast length(Text)
s1 := 0;
s2 := 0;
for i := 1 to L shr 4 do begin // 16 bytes (4 DWORD) by loop - aligned read
inc(s1,P^[0]);
inc(s2,s1);
inc(s1,P^[1]);
inc(s2,s1);
inc(s1,P^[2]);
inc(s2,s1);
inc(s1,P^[3]);
inc(s2,s1);
inc(PtrUInt(P),16);
end;
for i := 1 to (L shr 2)and 3 do begin // 4 bytes (DWORD) by loop
inc(s1,P^[0]);
inc(s2,s1);
inc(PtrUInt(P),4);
end;
inc(s1,P^[0] and Mask[L and 3]); // remaining 0..3 bytes
inc(s2,s1);
result := s1 xor (s2 shl 16);
end else
result := 0;
end;
begin // use a sub function for better code generation under Delphi
result := SubHash(pointer(Text));
end;
There is even a pure asm version, even faster, in our SynCommons.pas unit. I don't know of any faster hashing function around (it's faster than crc32/adler32/IniFiles.hash...). It's based on adler32, but uses DWORD-aligned reading and summing for even better speed. This could be improved with SSE asm, of course, but here is a fast pure Delphi hash function.
Then don't forget to use "multiplication"/"binary and operation" for hash resolution, just like in IniFiles. It will reduce the number of iterations over your list of hashes.
But since you didn't provide the search source code, we are not able to know what could be improved here.
If you are using Delphi 7, consider using Julian Bucknall's lovely Delphi data types code, EzDsl (Easy Data Structures Library).
Now you don't have to reinvent the wheel as another wise person has also said.
You can download ezdsl, a version that I have made work with both Delphi 7, and recent unicode delphi versions, here.
In particular the unit name EHash contains a hash table implementation, which has various hashing algorithms plug-inable, or you can write your own plugin function that just does the hashing function of your choice.
As a word to the wise, if you are using a Unicode Delphi version; I would be careful about hashing your unicode strings with a code library like this, without checking how its hashing algorithms perform on your system. The OP here is using Delphi 7, so Unicode is not a factor for the original question.
I think you'll find a database (without checksums) a lot quicker. Maybe try sqlite which will give you a single file database. There are many Delphi Libraries available.
