Most Efficient Unicode Hash Function for Delphi 2009

I am in need of the fastest hash function possible in Delphi 2009 that will create hashed values from a Unicode string that will distribute fairly randomly into buckets.
I originally started with Gabr's HashOf function from GpStringHash:
function HashOf(const key: string): cardinal;
asm
xor edx,edx { result := 0 }
and eax,eax { test if 0 }
jz @End { skip if nil }
mov ecx,[eax-4] { ecx := string length }
jecxz @End { skip if length = 0 }
@loop: { repeat }
rol edx,2 { edx := (edx shl 2) or (edx shr 30)... }
xor dl,[eax] { ... xor Ord(key[eax]) }
inc eax { inc(eax) }
loop @loop { until ecx = 0 }
@End:
mov eax,edx { result := edx }
end; { HashOf }
But I found that this did not produce good numbers from Unicode strings. I noted that Gabr's routines have not been updated to Delphi 2009.
Then I discovered HashNameMBCS in SysUtils of Delphi 2009 and translated it to this simple function (where "string" is a Delphi 2009 Unicode string):
function HashOf(const key: string): cardinal;
var
I: integer;
begin
Result := 0;
for I := 1 to length(key) do
begin
Result := (Result shl 5) or (Result shr 27);
Result := Result xor Cardinal(key[I]);
end;
end; { HashOf }
I thought this was pretty good until I looked at the CPU window and saw the assembler code it generated:
Process.pas.1649: Result := 0;
0048DEA8 33DB xor ebx,ebx
Process.pas.1650: for I := 1 to length(key) do begin
0048DEAA 8BC6 mov eax,esi
0048DEAC E89734F7FF call $00401348
0048DEB1 85C0 test eax,eax
0048DEB3 7E1C jle $0048ded1
0048DEB5 BA01000000 mov edx,$00000001
Process.pas.1651: Result := (Result shl 5) or (Result shr 27);
0048DEBA 8BCB mov ecx,ebx
0048DEBC C1E105 shl ecx,$05
0048DEBF C1EB1B shr ebx,$1b
0048DEC2 0BCB or ecx,ebx
0048DEC4 8BD9 mov ebx,ecx
Process.pas.1652: Result := Result xor Cardinal(key[I]);
0048DEC6 0FB74C56FE movzx ecx,[esi+edx*2-$02]
0048DECB 33D9 xor ebx,ecx
Process.pas.1653: end;
0048DECD 42 inc edx
Process.pas.1650: for I := 1 to length(key) do begin
0048DECE 48 dec eax
0048DECF 75E9 jnz $0048deba
Process.pas.1654: end; { HashOf }
0048DED1 8BC3 mov eax,ebx
This seems to contain quite a bit more assembler code than Gabr's code.
Speed is of the essence. Is there anything I can do to improve either the pascal code I wrote or the assembler that my code generated?
Followup.
I finally went with the HashOf function based on SysUtils.HashNameMBCS. It seems to give a good hash distribution for Unicode strings, and appears to be quite fast.
Yes, there is a lot of assembler code generated, but the Delphi code that generates it is so simple and uses only bit-shift operations, so it's hard to believe it wouldn't be fast.

ASM output is not a good indication of algorithm speed. Also, from what I can see, the two pieces of code are doing almost identical work. The biggest differences seem to be the memory access strategy and that the first uses a rotate-left instruction (rol) instead of the equivalent pair of instructions (shl | shr; most higher-level programming languages leave out the rotate operators). The latter may pipeline better than the former.
ASM optimization is black magic and sometimes more instructions execute faster than fewer.
To be sure, benchmark both and pick the winner. If you like the output of the second but the first is faster, plug the second's values into the first.
rol edx,5 { edx := (edx shl 5) or (edx shr 27)... }
Note that different machines will run the code in different ways, so if speed is REALLY of the essence then benchmark it on the hardware that you plan to run the final application on. I'm willing to bet that over megabytes of data the difference will be a matter of milliseconds -- which is far less than the operating system is taking away from you.
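A rough sketch of such a benchmark, in plain Pascal so it does not depend on any particular assembler: it times the Pascal HashOf from the question against a variant that rotates by 2 instead of 5. The names HashRot5, HashRot2 and TimeHash, the test key and the round count are all made up for illustration, and Now has coarse resolution, so treat the numbers as indicative only.

```pascal
program HashBench;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$Q-,R-} // the hash relies on wrap-around arithmetic

uses
  SysUtils, DateUtils;

type
  THashFunc = function(const key: string): Cardinal;

// The Pascal HashOf from the question (rotate left by 5).
function HashRot5(const key: string): Cardinal;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(key) do
  begin
    Result := (Result shl 5) or (Result shr 27);
    Result := Result xor Cardinal(Ord(key[I]));
  end;
end;

// Hypothetical variant for comparison: rotate left by 2.
function HashRot2(const key: string): Cardinal;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(key) do
  begin
    Result := (Result shl 2) or (Result shr 30);
    Result := Result xor Cardinal(Ord(key[I]));
  end;
end;

// Calls Hash on Key Rounds times and returns the elapsed milliseconds.
// Sink accumulates the results so the calls have an observable effect.
function TimeHash(Hash: THashFunc; const Key: string; Rounds: Integer;
  out Sink: Cardinal): Int64;
var
  Start: TDateTime;
  I: Integer;
begin
  Sink := 0;
  Start := Now;
  for I := 1 to Rounds do
    Sink := Sink xor Hash(Key);
  Result := MilliSecondsBetween(Now, Start);
end;

const
  Rounds = 5000000; // arbitrary; raise it until the timings are stable
var
  Key: string;
  Sink: Cardinal;
  Ms: Int64;
begin
  Key := 'The quick brown fox jumps over the lazy dog';
  Ms := TimeHash(HashRot5, Key, Rounds, Sink);
  WriteLn('rot5: ', Ms, ' ms (sink ', Sink, ')');
  Ms := TimeHash(HashRot2, Key, Rounds, Sink);
  WriteLn('rot2: ', Ms, ' ms (sink ', Sink, ')');
end.
```

Run it several times on the target hardware and with keys that resemble the real data; a single run on a development machine says little.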
PS. I'm not convinced this algorithm creates even distribution, something you explicitly called out (have you run the histograms?). You may look at porting this hash function to Delphi. It may not be as fast as the above algorithm but it appears to be quite fast and also gives good distribution. Again, we're probably talking on the order of milliseconds of difference over megabytes of data.
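One way to run such a histogram is sketched below. It feeds synthetic keys through the Pascal HashOf from the question and reports the least and most loaded bucket. The bucket count, key count and key pattern ('key0', 'key1', ...) are assumptions chosen for illustration; real keys should be used for any real decision.

```pascal
program HashHistogram;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$Q-,R-} // the hash relies on wrap-around arithmetic

uses
  SysUtils;

const
  BucketCount = 1024;   // hypothetical table size
  KeyCount    = 100000; // hypothetical number of test keys

// The Pascal HashOf from the question.
function HashOf(const key: string): Cardinal;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(key) do
  begin
    Result := (Result shl 5) or (Result shr 27);
    Result := Result xor Cardinal(Ord(key[I]));
  end;
end;

var
  Buckets: array[0..BucketCount - 1] of Integer;
  I, MinLoad, MaxLoad: Integer;
begin
  FillChar(Buckets, SizeOf(Buckets), 0);

  // Count how many keys land in each bucket.
  for I := 0 to KeyCount - 1 do
    Inc(Buckets[HashOf('key' + IntToStr(I)) mod BucketCount]);

  // Find the emptiest and the fullest bucket.
  MinLoad := Buckets[0];
  MaxLoad := Buckets[0];
  for I := 1 to BucketCount - 1 do
  begin
    if Buckets[I] < MinLoad then MinLoad := Buckets[I];
    if Buckets[I] > MaxLoad then MaxLoad := Buckets[I];
  end;

  WriteLn('ideal load per bucket: ', KeyCount div BucketCount);
  WriteLn('min load: ', MinLoad);
  WriteLn('max load: ', MaxLoad);
end.
```

A minimum and maximum far from the ideal load suggest the hash clusters for that kind of key.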

We held a nice little contest a while back, improving on a hash called "MurmurHash". Quoting Wikipedia:
"It is noted for being exceptionally fast, often two to four times faster than comparable algorithms such as FNV, Jenkins' lookup3 and Hsieh's SuperFastHash, with excellent distribution, avalanche behavior and overall collision resistance."
You can download the submissions for that contest here.
One thing we learned was that sometimes optimizations don't improve results on every CPU. My contribution was tweaked to run well on AMD, but performed not so well on Intel. The other way around happened too (Intel optimizations running sub-optimally on AMD).
So, as Talljoe said : measure your optimizations, as they might actually be detrimental to your performance!
As a side-note: I don't agree with Lee; Delphi is a nice compiler and all, but sometimes I see it generating code that just isn't optimal (even when compiling with all optimizations turned on). For example, I regularly see it clearing registers that had already been cleared just two or three statements before. Or EAX is put into EBX, only to have it shifted and put back into EAX. That sort of thing. I'm just guessing here, but hand-optimizing that sort of code will surely help in tight spots.
Above all though; First analyze your bottleneck, then see if a better algorithm or datastructure can be used, then try to optimize the pascal code (like: reduce memory-allocations, avoid reference counting, finalization, try/finally, try/except blocks, etc), and then, only as a final resort, optimize the assembly code.

I've written two assembly-"optimized" functions in Delphi, or rather implemented known fast hash algorithms in both fine-tuned Pascal and Borland Assembler. The first was an implementation of SuperFastHash, and the second was a MurmurHash2 implementation triggered by a request from Tommi Prami on my blog to translate my C# version to Pascal. This spawned a discussion, continued on the Embarcadero Discussion BASM Forums, that in the end resulted in about 20 implementations (check the latest benchmark suite) and ultimately showed that it would be difficult to select the best implementation due to the big differences in cycle times per instruction between Intel and AMD.
So, try one of those, but remember, getting the fastest every time would probably mean changing the algorithm to a simpler one, which would hurt your distribution. Fine-tuning an implementation takes lots of time, so you had better create a good validation and benchmarking suite to check your implementations.

There has been a bit of discussion in the Delphi/BASM forum that may be of interest to you. Have a look at the following:
http://forums.embarcadero.com/thread.jspa?threadID=13902&tstart=0

Related

Using SSE to round in Delphi

I wrote this function to round singles to integers:
function Round(const Val: Single): Integer;
begin
asm
cvtss2si eax,Val
mov Result,eax
end;
end;
It works, but I need to change the rounding mode. Apparently, per this, I need to set the MXCSR register.
How do I do this in Delphi?
The reason I am doing this in the first place is I need "away-from-zero" rounding (like in C#), which is not possible even via SetRoundingMode.

On modern Delphi, to set MXCSR you can call SetMXCSR from the System unit. To read the current value use GetMXCSR.
Do beware that SetMXCSR, just like Set8087CW is not thread-safe. Despite my efforts to persuade Embarcadero to change this, it seems that this particular design flaw will remain with us forever.
On older versions of Delphi you use the LDMXCSR and STMXCSR opcodes. You might write your own versions like this:
function GetMXCSR: LongWord;
asm
PUSH EAX
STMXCSR [ESP].DWord
POP EAX
end;
procedure SetMXCSR(NewMXCSR: LongWord);
//thread-safe version that does not abuse the global variable DefaultMXCSR
var
MXCSR: LongWord;
asm
AND EAX, $FFC0 // Remove flag bits
MOV MXCSR, EAX
LDMXCSR MXCSR
end;
These versions are thread-safe and I hope will compile and work on older Delphi versions.
Do note that using the name Round for your function is likely to cause a lot of confusion. I would advise that you do not do that.
Finally, I checked in the Intel documentation and both of the Intel floating point units (x87, SSE) offer just the rounding modes specified by the IEEE754 standard. They are:
Round to nearest (even)
Round down (toward −∞)
Round up (toward +∞)
Round toward zero (Truncate)
So, your desired rounding mode is not available.
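Since no hardware rounding mode gives this behaviour, one alternative is to do it in plain Pascal. The sketch below is not the SSE approach from the question; RoundAwayFromZero is a made-up name, and it assumes the input fits in the Integer range.

```pascal
program RoundAwayDemo;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

// Rounds halves away from zero (2.5 -> 3, -2.5 -> -3) without touching
// the x87 or SSE rounding mode. Widening to Double first avoids precision
// loss that adding 0.5 in Single arithmetic could cause.
// Assumes Val lies within the Integer range.
function RoundAwayFromZero(const Val: Single): Integer;
var
  D: Double;
begin
  D := Val;
  if D >= 0 then
    Result := Trunc(D + 0.5)
  else
    Result := Trunc(D - 0.5);
end;

begin
  WriteLn(RoundAwayFromZero(2.5));   // 3
  WriteLn(RoundAwayFromZero(-2.5));  // -3
  WriteLn(RoundAwayFromZero(0.5));   // 1
  WriteLn(RoundAwayFromZero(2.4));   // 2
end.
```

Trunc does not depend on the current rounding mode, so this also sidesteps the thread-safety concern around changing MXCSR.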

Declaring block level variables for branches in delphi

In Delphi Prism we can declare variables that are only needed on special occasions.
eg: In prism
If acondition then
begin
var a :Integer;
end;
a := 3; //this line will produce error. because a will be created only when the condition is true
Here 'a' cannot be assigned with 3 because it is nested inside a branch.
How can we declare a variable which can be used only inside a branch in Delphi Win32, so that I can reduce memory usage, as it is only created if a certain condition is true?
If reduced memory usage is not a concern, what drawbacks do we have (or not have)?

The premise of your question is faulty. You're assuming that in languages where block-level variables are allowed, the program allocates and releases memory for those variables when control enters or leaves those variables' scopes. So, for example, you think that when acondition is true, the program adjusts the stack to make room for the a variable as it enters that block. But you're wrong.
Compilers calculate the maximum space required for all declared variables and temporary variables, and then they reserve that much space upon entry to the function. Allocating that space is as simple as adjusting the stack pointer; the time required usually has nothing to do with the amount of space being reserved. The bottom line is that your idea won't actually save any space.
The real advantage to having block-level variables is that their scopes are limited.
If you really need certain variables to be valid in only one branch of code, then factor that branch out to a separate function and put your variables there.

The concept of Local Variable Declaration Statements like in Java is not supported in Delphi, but you could declare a sub-procedure:
procedure foo(const acondition: boolean);
procedure subFoo;
var
a: integer;
begin
a := 3;
end;
begin
If acondition then
begin
subFoo;
end;
end;

There is no way in Delphi to limit the scope of a variable to less than the entire routine. And in the case of a single integer variable it doesn't make sense to worry about it... But in the case of a large data structure you should allocate it dynamically, not statically, i.e. instead of
var integers: array[1..10000]of Integer;
use
type TIntArray = array of Integer;
var integers: TIntArray;
If acondition then
begin
SetLength(integers, 10000);
...
end;

Beware that it could only be "syntactic sugar". The compiler may ensure you don't use the variable outside the inner scope, but that doesn't mean it could save memory. The variable may be allocated on the stack in the procedure entry code anyway, regardless of whether it is actually used or not. AFAIK most ABIs initialize the stack on entry and clean it up on exit. Manipulating the stack in a much more complex way while the function is executing, including taking care of different execution paths, may be even less performant: instead of a single instruction to reserve stack space you need several instructions scattered along the code, and you must ensure the stack is restored correctly, which adds more. Stack unwinding due to an exception, especially, may become far more complex.
If the aim is to write "better" code because of better scope handling to ensure the wrong variable is not used in the wrong place it could be useful, but if you need it as a way to save memory it could not be the right way.

You can emulate block-level variables with the (dreaded) with statement plus a function returning a record. Here's a bit of sample code, written in the browser:
type TIntegerA = record
A: Integer;
end;
function varAInteger: TIntegerA;
begin
Result.A := 0;
end;
// Code using this pseudo-local-variable
if Condition then
with varAInteger do
begin
A := 7; // Works.
end
else
begin
A := 3; // Error, the compiler doesn't know who A is
end;
Edit to clarify this proposition
Please note this kind of wizardry is no actual replacement for true block-level variables: even though they're likely allocated on the stack, just like most other local variables, the compiler is not geared to treat them as such. It's not going to do the same optimizations: a returned record will always be stored in an actual memory location, while a true local variable might be associated with a CPU register. The compiler will also not let you use such variables for "for" statements, and that's a big problem.

Having commented all that - there is a party trick that Delphi has that has far more uses than a simple local variable and may achieve your aim:
function Something: Integer;
begin
// don't want any too long local variables...
If acondition then
asm
// now I have lots of 'local' variables available in the registers
mov EAX, @AnotherVariable //you can use pascal local variables too!
// do something with the number 3
Add EAX, 3
mov @Result, EAX
jmp @next
@AnotherVariable: dd 10
@next:
end;
end;
:)) bit of a pointless example...

Efficient conversion of an array of singles to an array of doubles in Delphi 2010

I need to implement a wrapper layer between a high level application and a low level sub-system using slightly different typing:
The application produces an array of single vectors:
unit unApplication
type
TVector = record
x, y, z : single;
end;
TvectorArray = array of Tvector;
function someFunc: TvectorArray;
[...]
while the subsystem expects an array of double vectors. I also implemented typecasting from tvector to Tvectord:
unit unSubSystem
type
TVectorD = record
x, y, z : double;
class operator Implicit(value : TVector): TVectorD; inline;
end;
TvectorDArray = array of TvectorD;
procedure otherFunc(points: tvectorDArray);
implementation
class operator TVectorD.Implicit(value : TVector): TVectorD;
begin
result.x := value.x;
result.y := value.y;
result.z := value.z;
end;
What I am currently doing is like this:
uses unApplication, unsubsystem,...
procedure ConvertValues;
var
singleVecArr : TvectorArray;
doubleVecArr : TvectorDArray;
i : integer;
begin
singleVecArr := someFunc;
SetLength(doubleVecArr, Length(singleVecArr));
for i := 0 to Length(singleVecArr) - 1 do
doubleVecArr[i] := singleVecArr[i]; // converted via the Implicit operator
end;
Is there a more efficient way to perform these kinds of conversion?

First of all I would say that you should not attempt any optimisation without first timing. In this case I don't mean timing alternative algorithms, I mean timing the code in question and assessing what proportion of the total time is spent there.
My instincts tell me that the code you show will run for a tiny proportion of the overall time and so optimising it will yield no discernible benefits. I think if you do anything meaningful with each element of this array then that must be true since the cost of converting from single to double will be small compared to floating point operations.
Finally, if perchance this code is a bottleneck, you should consider not converting it at all. My assumption is that you are using standard Delphi floating point operations which map to the 8087 FPU. All such floating point operations happen inside the 8087 floating point stack. Values are converted on entry to either 64 or more normally 80 bit precision. I don't think it would be any slower to load a single than to load a double – in fact it may even be faster due to memory read performance.

Assuming that the conversion indeed is the bottleneck, then one way of speeding up the conversion may be to use SSE# instead of the FPU, provided the necessary instruction sets can be assumed to be present on the computers on which this code will run.
For instance, the following would convert one single Vector into one double Vector:
procedure SingleToDoubleVector (var S: TVector; var D: TVectorD);
// @S in EAX
// @D in EDX
asm
movups xmm0, [eax] ;// Load S in xmm0 (reads 16 bytes, i.e. 4 bytes past the 12-byte record)
movhlps xmm1, xmm0 ;// Copy High 2 singles of xmm0 into xmm1
cvtps2pd xmm2, xmm0 ;// Convert Low two singles of xmm0 into doubles in xmm2
cvtss2sd xmm3, xmm1 ;// Convert Lowest single in xmm1 into double in xmm3
movupd [edx], xmm2 ;// Move two doubles in xmm2 into D (.X and .Y)
movsd [edx+16],xmm3 ;// Move one double from xmm3 into D.Z
end;
I am not saying that this bit of code is the most efficient way to do it and there are many caveats with using assembly code in general and this code in particular. Note that this code makes assumptions about the alignment of the fields in your records. (It does not make assumptions regarding the alignment of the record as a whole.)
Also, for best results, you would control the alignment of your array/record elements in memory and write the entire conversion loop in assembly, to reduce overheads. Whether this is what you want/can do is another question.

If modifying the source to produce doubles rather than singles is not possible you can try threading out the process. Try dividing the TArray into two or four equal sized chunks (depending on processor count) and have each thread do the conversion. Doing this will realize almost double or quadruple speed.
Also, is the 'length' call calculated each loop? Maybe place that into a variable to avoid the calculation.

How to ensure 16byte code alignment of Delphi routines?

Background:
I have a unit of optimised Delphi/BASM routines, mostly for heavy computations. Some of these routines contain inner loops for which I can achieve a significant speed-up if the loop start is aligned to a DQWORD (16-byte) boundary. I can ensure that the loops in question are aligned as desired IF I know the alignment at the routine entry point.
As far as I can see, the Delphi compiler aligns procedures/functions to DWORD boundaries, and e.g. adding functions to the unit may change the alignment of subsequent ones. However, as long as I pad the end of routines to multiples of 16, I can ensure that subsequent routines are likewise aligned -- or misaligned, depending on the alignment of the first routine. I therefore tried to place the critical routines at the beginning of the unit's implementation section, and put a bit of padding code before them so that the first procedure would be DQWORD aligned.
This looks something like below:
interface
procedure FirstProcInUnit;
implementation
procedure __PadFirstProcTo16;
asm
// variable number of NOP instructions here to get the desired code length
end;
procedure FirstProcInUnit;
asm //should start at DQWORD boundary
//do something
//padding to align the following label to DQWORD boundary
@Some16BAlignedLabel:
//code, looping back to @Some16BAlignedLabel
//do something else
ret #params
//padding to get code length to multiple of 16
end;
initialization
__PadFirstProcTo16; //call this here so that it isn't optimised out
ASSERT ((NativeUInt(Pointer(@FirstProcInUnit)) AND $0F) = 0, 'FirstProcInUnit not DQWORD aligned');
end.
This is a bit of a pain in the neck, but I can get this sort of thing to work when necessary. The problem is that when I use such a unit in different projects, or make some changes to other units in the same project, this may still break the alignment of __PadFirstProcTo16 itself. Likewise, recompiling the same project with different compiler versions (e.g. D2009 vs. D2010) typically also breaks the alignment. So, the only way of doing this sort of thing I found was by hand as the pretty much last thing to be done when all the rest of the project is in its final form.
Question 1:
Is there any other way to achieve the desired effect of ensuring that (at least some specific) routines are DQWORD-aligned?
Question 2:
Which are the exact factors that affect the compiler's alignment of code and (how) could I use such specific knowledge to overcome the problem outlined here?
Assume that for the sake of this question "don't worry about code alignment/the associated presumably small speed benefits" is not a permissible answer.

As of Delphi XE, the problem of code alignment is now easily solved using the $CODEALIGN compiler directive (see this Delphi documentation page):
{$CODEALIGN 16}
procedure MyAlignedProc;
begin
..
end;

One thing that you could do, is to add a 'magic' signature at the end of each routine, after an explicit ret instruction:
asm
...
ret
db <magic signature bytes>
end;
Now you could create an array containing pointers to each routine, scan the routines at run-time once for the magic signature to find the end of each routine and therefore its length. Then, you can copy them to a new block of memory that you allocate with VirtualAlloc using PAGE_EXECUTE_READWRITE, ensuring this time that each routine starts on a 16-byte boundary.

Why should I not use "with" in Delphi?

I've heard many programmers, particularly Delphi programmers scorn the use of 'with'.
I thought it made programs run faster (only one reference to parent object) and that it was easier to read the code if used sensibly (less than a dozen lines of code and no nesting).
Here's an example:
procedure TBitmap32.FillRectS(const ARect: TRect; Value: TColor32);
begin
with ARect do FillRectS(Left, Top, Right, Bottom, Value);
end;
I like using with. What's wrong with me?

One annoyance with using with is that the debugger can't handle it. So it makes debugging more difficult.
A bigger problem is that it is less easy to read the code. Especially if the with statement is a bit longer.
procedure TMyForm.ButtonClick(...)
begin
with OtherForm do begin
Left := 10;
Top := 20;
CallThisFunction;
end;
end;
Which Form's CallThisFunction will be called? Self (TMyForm) or OtherForm? You can't know without checking if OtherForm has a CallThisFunction method.
And the biggest problem is that you can easily introduce bugs without even knowing it. What if both TMyForm and OtherForm have a CallThisFunction, but OtherForm's is private? You might expect/want OtherForm.CallThisFunction to be called, but it really is not. The compiler would have warned you if you hadn't used the with, but now it doesn't.
Using multiple objects in the with multiplies the problems. See http://blog.marcocantu.com/blog/with_harmful.html

I prefer the VB syntax in this case because here, you need to prefix the members inside the with block with a . to avoid ambiguities:
With obj
.Left = 10
.Submit()
End With
But really, there's nothing wrong with with in general.

It would be great if the with statement were extended in the following way:
with x := ARect do
begin
x.Left := 0;
x.Right := 0;
...
end;
You wouldn't need to declare a variable 'x'. It would be created by the compiler. It's quick to write, and there is no confusion about which function is used.

It is not likely that "with" would make the code run faster, it is more likely that the compiler would compile it to the same executable code.
The main reason people don't like "with" is that it can introduce confusion about namespace scope and precedence.
There are cases when this is a real issue, and cases when this is a non-issue (non-issue cases would be as described in the question as "used sensibly").
Because of the possible confusion, some developers choose to refrain from using "with" completely, even in cases where there may not be such confusion. This may seem dogmatic, however it can be argued that as code changes and grows, the use of "with" may remain even after code has been modified to an extent that would make the "with" confusing, and thus it is best not to introduce its use in the first place.

In fact:
procedure TBitmap32.FillRectS(const ARect: TRect; Value: TColor32);
begin
with ARect do FillRectS(Left, Top, Right, Bottom, Value);
end;
and
procedure TBitmap32.FillRectS(const ARect: TRect; Value: TColor32);
begin
FillRectS(ARect.Left, ARect.Top, ARect.Right, ARect.Bottom, Value);
end;
Will generate exactly the same assembler code.
The performance penalty can exist if the value of the with clause is a function or a method. In this case, if you want to have good maintenance AND good speed, just do what the compiler does behind the scene, i.e. create a temporary variable.
In fact:
with MyRect do
begin
Left := 0;
Right := 0;
end;
is encoded in pseudo-code as such by the compiler:
var aRect: ^TRect;
aRect := @MyRect;
aRect^.Left := 0;
aRect^.Right := 0;
Then aRect can be just a CPU register, but can also be a true temporary variable on stack. Of course, I use pointers here since TRect is a record. It is more direct for objects, since they already are pointers.
Personally, I sometimes use with in my code, but I check the generated asm almost every time to ensure that it does what it should. Not everyone is able or has the time to do it, so IMHO a local variable is a good alternative to with.
I really do not like such code:
for i := 0 to ObjList.Count-1 do
for j := 0 to ObjList[i].NestedList.Count-1 do
begin
ObjList[i].NestedList[j].Member := 'Toto';
ObjList[i].NestedList[j].Count := 10;
end;
It is still pretty readable with with:
for i := 0 to ObjList.Count-1 do
for j := 0 to ObjList[i].NestedList.Count-1 do
with ObjList[i].NestedList[j] do
begin
Member := 'Toto';
Count := 10;
end;
or even
for i := 0 to ObjList.Count-1 do
with ObjList[i] do
for j := 0 to NestedList.Count-1 do
with NestedList[j] do
begin
Member := 'Toto';
Count := 10;
end;
but if the inner loop is huge, a local variable does make sense:
for i := 0 to ObjList.Count-1 do
begin
Obj := ObjList[i];
for j := 0 to Obj.NestedList.Count-1 do
begin
Nested := Obj.NestedList[j];
Nested.Member := 'Toto';
Nested.Count := 10;
end;
end;
This code won't be slower than with: compiler does it in fact behind the scene!
By the way, it will allow easier debugging: you can put a breakpoint, then point your mouse on Obj or Nested directly to get the internal values.

What you save in typing, you lose in readability.
Many debuggers won't have a clue what you're referring to either so debugging is more difficult.
It doesn't make programs run faster.
Consider making the code within your with statement a method of the object that you're referring to.

It's primarily a maintenance issue.
The idea of WITH makes reasonable sense from a language point of view, and the argument that it keeps code, when used sensibly, smaller and clearer has some validity. However the problem is that most commercial code will be maintained by several different people over its lifetime, and what starts out as a small, easily parsed construct when written can easily mutate over time into unwieldy large structures where the scope of the WITH is not easily parsed by the maintainer. This naturally tends to produce bugs, and difficult to find ones at that.
For example, say we have a small function foo which contains three or four lines of code which have been wrapped inside a WITH block; then there is indeed no issue. However a few years later this function may have expanded, under several programmers, into 40 or 50 lines of code still wrapped inside a WITH. This is now brittle, and is ripe for bugs to be introduced, particularly so if the maintainer starts introducing additional embedded WITH blocks.
WITH has no other benefits - code should be parsed exactly the same and run at the same speed (I did some experiments with this in D6 inside tight loops used for 3D rendering and I could find no difference). The inability of the debugger to handle it is also an issue - but one that should have been fixed a while back and would be worth ignoring if there were any benefit. Unfortunately there isn't.

This debate happens in Javascript a lot too.
Basically, that With syntax makes it very hard to tell at a glance which Left/Top/etc property/method you're calling on. You could have a local variable called Left, and a property (it's been a while since I've done Delphi, sorry if the name is wrong) called Left, perhaps even a function called Left. Anyone reading the code who isn't super familiar with the ARect structure could be very very lost.
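A small sketch of that ambiguity in Delphi, using a made-up TBox record and a variable that happens to share the name Left. Inside the with block the record's field wins, so the outer variable is silently left alone.

```pascal
program WithShadowDemo;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

type
  // Hypothetical record, used only for this illustration.
  TBox = record
    Left, Top: Integer;
  end;

var
  Box: TBox;
  Left: Integer; // same name as the field TBox.Left

begin
  Left := 1;
  Box.Left := 100;
  Box.Top := 0;

  with Box do
    Left := Left + 5; // both occurrences resolve to Box.Left

  WriteLn(Left);      // 1   (the variable is untouched)
  WriteLn(Box.Left);  // 105
end.
```

Nothing in the with block itself tells the reader which Left is meant; that only becomes clear by looking up the declaration of TBox.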

I do not like it because it makes debugging a hassle. You cannot read the value of a variable or the like by just hovering over it with the mouse.

There's nothing wrong with it as long as you keep it simple and avoid ambiguities.
As far as I'm aware, it doesn't speed anything up though - it's purely syntactic sugar.

At work we give points for removing Withs from an existing Win32 code base because of the extra effort needed to maintain code that uses them. I found several bugs in a previous job where a local variable called BusinessComponent was masked by being within a With begin block for an object that had a published property BusinessComponent of the same type. The compiler chose to use the published property, and the code that meant to use the local variable crashed.
I have seen code like
With a,b,c,d do {except they are much longer names, just shortened here}
begin
i := xyz;
end;
It can be a real pain trying to locate where xyz comes from. If it was c, I'd much sooner write it as
i := c.xyz;
You think it's pretty trivial to understand this but not in a function that was 800 lines long that used a with right at the start!

You can combine with statements, so you end up with
with Object1, Object2, Object3 do
begin
//... Confusing statements here
end
And if you think that the debugger is confused by one with, I don't see how anyone can determine what is going on in the with block

It permits incompetent or evil programmers to write unreadable code. Therefore, only use this feature if you are neither incompetent nor evil.

... run faster ...
Not necessarily - your compiler/interpreter is generally better at optimizing code than you are.
I think it makes me say "yuck!" because it's lazy - when I'm reading code (particularly someone else's) I like to see explicit code. So I'd even write "this.field" instead of "field" in Java.

We've recently banned it in our Delphi coding standards.
The cons were frequently outweighing the pros.
That is, bugs were being introduced because of its misuse. These didn't justify the savings in time to write or execute the code.
Yes, using with can lead to (mildly) faster code execution.
In the following, foo is only evaluated once:
with foo do
begin
bar := 1;
bin := x;
box := 'abc';
end
But, here it is evaluated three times:
foo.bar := 1;
foo.bin := x;
foo.box := 'abc';
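The difference only shows up when foo is something that has to be evaluated, such as a function call; for a plain variable there is nothing to save. A minimal sketch, assuming foo is a function returning an object (TThing, Foo and the Calls counter are invented for the demonstration):

```pascal
program WithEvalDemo;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

type
  // Hypothetical class with the three members used in the answer.
  TThing = class
  public
    Bar: Integer;
    Bin: Integer;
    Box: string;
  end;

var
  Thing: TThing;
  Calls: Integer;

// Hypothetical accessor standing in for "foo"; counts how often it runs.
function Foo: TThing;
begin
  Inc(Calls);
  Result := Thing;
end;

begin
  Thing := TThing.Create;
  try
    Calls := 0;
    with Foo do
    begin
      Bar := 1;
      Bin := 2;
      Box := 'abc';
    end;
    WriteLn('with:      ', Calls, ' call(s)');  // 1

    Calls := 0;
    Foo.Bar := 1;
    Foo.Bin := 2;
    Foo.Box := 'abc';
    WriteLn('qualified: ', Calls, ' call(s)');  // 3
  finally
    Thing.Free;
  end;
end.
```

Storing the result of Foo in a local variable gives the same single evaluation without the scoping ambiguity.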

Delphi 2005 has a serious bug in the with-do statement: the evaluated pointer is lost and replaced with the pointer one level up. There you have to use a local variable instead of using the object expression directly.
