Inconsistent Results with exAllArithmeticExceptions in Win32 and Win64

Inconsistent Results with exAllArithmeticExceptions in Win32 and Win64 - delphi

A colleague of mine picked up a discrepancy between Win32 and Win64 code compiled by Delphi in how it handles NaN's. Take the following code as an example. When compiled in 32 bit we get no messages but when compiled with 64 bit we get both comparisons returning true.
program TestNaNs;
{$APPTYPE CONSOLE}
{$R *.res}
uses
System.SysUtils,
System.Math;
var
nanDouble: Double;
zereDouble: Double;
nanSingle: Single;
zeroSingle: Single;
begin
SetExceptionMask(exAllArithmeticExceptions);
nanSingle := NaN;
zeroSingle := 0.0;
if nanSingle <> zeroSingle then
WriteLn('nanSingle <> zeroSingle');
nanDouble := NaN;
zereDouble := 0.0;
if nanDouble <> zereDouble then
WriteLn('nanDouble <> zeroDouble');
ReadLn;
end.
My understanding of the IEEE standard is that <> should return true but all other operations should return false. So in this case, it looks like the 64 bit version is correct and the 32 bit version is incorrect. The code generated by both is very different with the 64 bit version generating SSE code.
For 32 bit:
TestNaNs.dpr.21: if nanSingle <> zeroSingle then
0041A552 D905E01E4200 fld dword ptr [$00421ee0]
0041A558 D81DE41E4200 fcomp dword ptr [$00421ee4]
0041A55E 9B wait
0041A55F DFE0 fstsw ax
0041A561 9E sahf
0041A562 7419 jz $0041a57d
and for 64 bit:
TestNaNs.dpr.21: if nanSingle <> zeroSingle then
000000000042764E F3480F5A05C9ED0000 cvtss2sd xmm0,qword ptr [rel $0000edc9]
0000000000427657 F3480F5A0DC4ED0000 cvtss2sd xmm1,qword ptr [rel $0000edc4]
0000000000427660 660F2EC1 ucomisd xmm0,xmm1
0000000000427664 7A02 jp Project63 + $68
0000000000427666 7420 jz Project63 + $88
My question is this. Is this an issue with the Delphi compiler or a caveat with the Intel CPU's?

The IEEE 754 standard defines arithmetic formats, operations, rounding rules, exceptions etc. for floating point computation. The Delphi compiler implements floating point arithmetic on top of the available hardware units. For the 32 bit Windows compiler this is the x87 unit, and for the 64 bit Windows compiler this is the SSE unit. Both of these hardware units conform to the IEEE 754 standard.
The difference that you are observing arises at the language implementation level. Let us look at the two versions in more detail.
32 bit Windows compiler
The comparison statement is compiled to this:
TestNaNs.dpr.19: if nanDouble <> zeroDouble then
0041C4C8 DD05C03E4200 fld qword ptr [$00423ec0]
0041C4CE DC1DC83E4200 fcomp qword ptr [$00423ec8]
0041C4D4 9B wait
0041C4D5 DFE0 fstsw ax
0041C4D7 9E sahf
0041C4D8 7419 jz $0041c4f3
The Intel software developers manual says that an unordered comparison is indicated by the flags C3, C2 and C0 being set to 1. The full table is here:
Condition C3 C2 C0
ST(0) > Source 0 0 0
ST(0) < Source 0 0 1
ST(0) = Source 1 0 0
Unordered 1 1 1
When you inspect the FPU under the debugger, you can see that this us the case.
0041C4D5 DFE0 fstsw ax
0041C4D7 9E sahf
0041C4D8 7419 jz $0041c4f3
This transfers various bits from of the FPU status register into the CPU flags, see the manual for precise details of which flags go where. The the branch is made if ZF is set. The value of ZF comes from the C3 FPU flag, which, reading from the table above, is set for the unordered case.
In fact, the entire branching code can be expressed in pseudo code as:
jump if C3 = 1
So, looking at the table above, it is clear that if one of the operands is a NaN then any floating point equality comparison evaluates as equals.
64 bit Windows compiler
The comparison statement is compiled to this:
TestNaNs.dpr.19: if nanDouble <> zeroDouble then
0000000000428EB8 F20F100548E50000 movsd xmm0,qword ptr [rel $0000e548]
0000000000428EC0 660F2E0548E50000 ucomisd xmm0,qword ptr [rel $0000e548]
0000000000428EC8 7A02 jp TestNaNs + $5C
0000000000428ECA 7420 jz TestNaNs + $7C
The comparison is performed by the ucomisd instruction. The manual gives this psuedo code:
RESULT ← UnorderedCompare(SRC1[63:0] <> SRC2[63:0]) {
(* Set EFLAGS *)
CASE (RESULT) OF
GREATER_THAN: ZF, PF, CF ← 000;
LESS_THAN: ZF, PF, CF ← 001;
EQUAL: ZF, PF, CF ← 100;
UNORDERED: ZF, PF, CF ← 111;
ESAC;
OF, AF, SF ← 0;
Notice that in this instruction, the ZF, PF and CF flags are exactly analagous to the C3, C2 and C0 flags on the x87 unit.
The branching is handled by this code:
0000000000428EC8 7A02 jp TestNaNs + $5C
0000000000428ECA 7420 jz TestNaNs + $7C
Notice that there is first a test of the parity flag PF (the jp instruction), and then the zero flag ZF (the jz instruction). The compiler has therefore emitted code to handle the unordered case (i.e. one of the operands is NaN). This is handled first with the jp. Once that is handled, the compiler then checks the zero flag ZF which (because NaNs have been dealt with) is set if and only if the two operands are equal.
Conclusion
The different behaviour is down to the different compilers taking different choices in how to implement the comparison operators. In both situations the hardware is IEEE 754 compliant, and perfectly capable of comparing NaNs as specified by the standard.
My best guess would be that the decisions for the 32 bit compiler were taken a very long time ago. Some of these decisions are questionable. In my view an equality comparison with a NaN operand should evaluate not equals irrespective of the other operand. The weight of history, felt through a desire to maintain backwards compatibility, means that these questionable decisions have never been addressed.
When the 64 bit compiler was created, more recently, the Embarcadero engineers decided to right some of these mistakes. They presumably felt that the break to a new architecture allowed them the freedom to do so.
In an ideal world, the 32 bit compiler could be configured to behave the same way as the 64 bit compiler, by setting a compiler switch.

Related

Asm equivalent to a delphi procedure

I have a simple delphi function called SetCompare that compares two singles and if they are not equal then one value is set to the other.
procedure SetCompare( A : single; B : single );
begin
if( A <> B ) then
A := B;
end;
I am trying to convert this into asm as such:
procedure SetCompare( A : Single; B : Single ); register;
begin
asm
mov EAX,A
mov ECX,B
cmp EAX,ECX
jne SetValue
#SetValue:
mov EAX,ECX
end;
end;
Will this work?

Will this work?
No this will not work, because floating point comparison is not the same as binary comparison. For instance 0 and -0 have different bit patterns, but compare as equal. Similarly, NaN compares unequal to all values, including a NaN with the same bit pattern.
The simplest way to work out how to write your code is to get the compiler to compile the Pascal code, and inspect the generated assembly code.
Some asides:
Your function is pointless anyway, because it returns no value and has no side effects.
If performance matters enough to write assembler, then you should write pure assembler functions, rather than inline asm blocks in a Pascal function. Which in any case is not supported by the x64 compiler.
Your arguments are already in registers, so it makes little sense to copy them around to other registers. For x86 code, A arrives in EAX, and B arrives in EDX. Given that EAX already contains A, why would you copy it into EAX? It is already there. And B is already in EDX, why copy it to ECX? For x64 code, the two arguments are passed in floating point registers, and can be compared there directly. As soon as your start writing assembler you need to understand the register use of the calling convention.
Your jne is pointless. If execution does not take the jump, then it moves to the next line of code. Which is where you jumped to.

Accessing Delphi Class Fields in 64 bit inline assembler

I am trying to convert the Delphi TBits.GetBit to inline assembler for the 64 bit version. The VCL source looks like this:
function TBits.GetBit(Index: Integer): Boolean;
{$IFNDEF X86ASM}
var
LRelInt: PInteger;
LMask: Integer;
begin
if (Index >= FSize) or (Index < 0) then
Error;
{ Calculate the address of the related integer }
LRelInt := FBits;
Inc(LRelInt, Index div BitsPerInt);
{ Generate the mask }
LMask := (1 shl (Index mod BitsPerInt));
Result := (LRelInt^ and LMask) <> 0;
end;
{$ELSE X86ASM}
asm
CMP Index,[EAX].FSize
JAE TBits.Error
MOV EAX,[EAX].FBits
BT [EAX],Index
SBB EAX,EAX
AND EAX,1
end;
{$ENDIF X86ASM}
I started converting the 32 bit ASM code to 64 bit. After some searching, I found out that I need to change the EAX references to RAX for the 64 bit compiler. I ended up with this for the first line:
CMP Index,[RAX].FSize
This compiles but gives an access violation when it runs. I tried a few combinations (e.g. MOV ECX,[RAX].FSize) and get the same access violation when trying to access [RAX].FSize. When I look at the assembler that is generated by the Delphi compiler, it looks like my [RAX].FSize should be correct.
Unit72.pas.143: MOV ECX,[RAX].FSize
00000000006963C0 8B8868060000 mov ecx,[rax+$00000668]
And the Delphi generated code:
Unit72.pas.131: if (Index >= FSize) or (Index < 0) then
00000000006963CF 488B4550 mov rax,[rbp+$50]
00000000006963D3 8B4D58 mov ecx,[rbp+$58]
00000000006963D6 3B8868060000 cmp ecx,[rax+$00000668]
00000000006963DC 7D06 jnl TForm72.GetBit + $24
00000000006963DE 837D5800 cmp dword ptr [rbp+$58],$00
00000000006963E2 7D09 jnl TForm72.GetBit + $2D
In both cases, the resulting assembler uses [rax+$00000668] for FSize. What is the correct way to access a class field in Delphi 64bit Assembler?
This may sound like a strange thing to optimize but the assembler for the 64bit pascal version doesn't appear to be very efficient. We call this routine a large number of times and it takes anything up to 5 times as long to execute depending on various factors.

The basic problem is that you are using the wrong register. Self is passed as an implicit parameter, before all others. In the x64 calling convention, that means it is passed in RCX and not RAX.
So Self is passed in RCX and Index is passed in RDX. Frankly, I think it's a mistake to use parameter names in inline assembler because they hide the fact that the parameter was passed in a register. If you happen to overwrite either RDX, then that changes the apparent value of Index.
So the if statement might be coded as
CMP EDX,[RCX].FSize
JNL TBits.Error
CMP EDX,0
JL TBits.Error
FWIW, this is a really simple function to implement and I don't believe that you will need to use any stack space. You have enough registers in x64 to be able to do this entirely using volatile registers.

Invalid floating point operation calling Trunc()

I'm getting a (repeatable) floating point exception when i try to Trunc() a Real value.
e.g.:
Trunc(1470724508.0318);
In reality the actual code is more complex:
ns: Real;
v: Int64;
ns := ((HighPerformanceTickCount*1.0)/g_HighResolutionTimerFrequency) * 1000000000;
v := Trunc(ns);
But in the end it still boils down to:
Trunc(ARealValue);
Now, i cannot repeat it anywhere else - just at this one spot. Where it fails every time.
It's not voodoo
Fortunately computers are not magic. The Intel CPU performs very specific observable actions. So i should be able to figure out why the floating point operation fails.
Going into the CPU window
v := Trunc(ns)
fld qword ptr [ebp-$10]
This loads the 8-byte floating point value at ebp-$10 into floating point register ST0.
The bytes at memory address [ebp-$10] are:
0018E9D0: 6702098C 41D5EA5E (as DWords)
0018E9D0: 41D5EA5E6702098C (as QWords)
0018E9D0: 1470724508.0318 (as Doubles)
The call succeeds, and the floating point register the contains the appropriate value:
Next is the actual call to the RTL Trunc function:
call #TRUNC
Next is the guts of Delphi RTL's Trunc function:
#TRUNC:
sub esp,$0c
wait
fstcw word ptr [esp] //Store Floating-Point Control Word on the stack
wait
fldcw word ptr [cwChop] //Load Floating-Point Control Word
fistp qword ptr [esp+$04] //Converts value in ST0 to signed integer
//stores the result in the destination operand
//and pops the stack (increments the stack pointer)
wait
fldcw word ptr [esp] //Load Floating-Point Control Word
pop ecx
pop eax
pop edx
ret
Or i suppose i could have just pasted it from the rtl, rather than transcribing it from the CPU window:
const cwChop : Word = $1F32;
procedure _TRUNC;
asm
{ -> FST(0) Extended argument }
{ <- EDX:EAX Result }
SUB ESP,12
FSTCW [ESP] //Store foating-control word in ESP
FWAIT
FLDCW cwChop //Load new control word $1F32
FISTP qword ptr [ESP+4] //Convert ST0 to int, store in ESP+4, and pop the stack
FWAIT
FLDCW [ESP] //restore the FPCW
POP ECX
POP EAX
POP EDX
end;
The exception happens during the actual fistp operation.
fistp qword ptr [esp+$04]
At the moment of this call, the ST0 register will contains the same floating point value:
Note: The careful observer will note the value in the above screenshot doesn't match the first screenshot. That's because i took it on a different run. I'd rather not have to carefully redo all the constants in the question just to make them consistent - but trust me: it's the same when i reach the fistp instruction as it was after the fld instruction.
Leading up to it:
sub esp,$0c: I watch it push the the stack down by 12 bytes
fstcw word ptr [esp]: i watch it push $027F into the the current stack pointer
fldcw word ptr [cwChop]: i watch the floating point control flags change
fistp qword ptr [esp+$04]: and it's about to write the Int64 into the room it made on the stack
and then it crashes.
What can actually be going on here?
It happens with other values as well, it's not like there's something wrong with this particular floating point value. But i even tried to setup the test-case elsewhere.
Knowing that the 8-byte hex value of the float is: $41D5EA5E6702098C, i tried to contrive the setup:
var
ns: Real;
nsOverlay: Int64 absolute ns;
v: Int64;
begin
nsOverlay := $41d62866a2f270dc;
v := Trunc(ns);
end;
Which gives:
nsOverlay := $41d62866a2f270dc;
mov [ebp-$08],$a2f270dc
mov [ebp-$04],$41d62866
v := Trunc(ns)
fld qword ptr [ebp-$08]
call #TRUNC
And at the point of the call to #trunc, the floating point register ST0 contains a value:
But the call does not fail. It only fails, every time in this one section of my code.
What could be possibly happening that is causing the CPU to throw an invalid floating point exception?
What is the value of cwChop before it loads the control word?
The value of cwChop looks to be correct before the load control word, $1F32. But after the load, the actual control word is wrong:
Bonus Chatter
The actual function that is failing is something to convert high-performance tick counts into nanoseconds:
function PerformanceTicksToNs(const HighPerformanceTickCount: Int64): Int64;
//Convert high-performance ticks into nanoseconds
var
ns: Real;
v: Int64;
begin
Result := 0;
if HighPerformanceTickCount = 0 then
Exit;
if g_HighResolutionTimerFrequency = 0 then
Exit;
ns := ((HighPerformanceTickCount*1.0)/g_HighResolutionTimerFrequency) * 1000000000;
v := Trunc(ns);
Result := v;
end;
I created all the intermeidate temporary variables to try to track down where the failure is.
I even tried to use that as a template to try to reproduce it:
var
i1, i2: Int64;
ns: Real;
v: Int64;
vOver: Int64 absolute ns;
begin
i1 := 5060170;
i2 := 3429541;
ns := ((i1*1.0)/i2) * 1000000000;
//vOver := $41d62866a2f270dc;
v := Trunc(ns);
But it works fine. There's something about when it's called during a DUnit unit test.
Floating Point control word flags
Delphi's standard control word: $1332:
$1332 = 0001 00 11 00 110010
0 ;Don't allow invalid numbers
1 ;Allow denormals (very small numbers)
0 ;Don't allow divide by zero
0 ;Don't allow overflow
1 ;Allow underflow
1 ;Allow inexact precision
0 ;reserved exception mask
0 ;reserved
11 ;Precision Control - 11B (Double Extended Precision - 64 bits)
00 ;Rounding control -
0 ;Infinity control - 0 (not used)
The Windows API required value: $027F
$027F = 0000 00 10 01 111111
1 ;Allow invalid numbers
1 ;Allow denormals (very small numbers)
1 ;Allow divide by zero
1 ;Allow overflow
1 ;Allow underflow
1 ;Allow inexact precision
1 ;reserved exception mask
0 ;reserved
10 ;Precision Control - 10B (double precision)
00 ;Rounding control
0 ;Infinity control - 0 (not used)
The crChop control word: $1F32
$1F32 = 0001 11 11 00 110010
0 ;Don't allow invalid numbers
1 ;Allow denormals (very small numbers)
0 ;Don't allow divide by zero
0 ;Don't allow overflow
1 ;Allow underflow
1 ;Allow inexact precision
0 ;reserved exception mask
0 ;unused
11 ;Precision Control - 11B (Double Extended Precision - 64 bits)
11 ;Rounding Control
1 ;Infinity control - 1 (not used)
000 ;unused
The CTRL flags after loading $1F32: $1F72
$1F72 = 0001 11 11 01 110010
0 ;Don't allow invalid numbers
1 ;Allow denormals (very small numbers)
0 ;Don't allow divide by zero
0 ;Don't allow overflow
1 ;Allow underflow
1 ;Allow inexact precision
1 ;reserved exception mask
0 ;unused
11 ;Precision Control - 11B (Double Extended Precision - 64 bits)
11 ;Rounding control
1 ;Infinity control - 1 (not used)
00011 ;unused
All the CPU is doing is turning on a reserved, unused, mask bit.
RaiseLastFloatingPointError()
If you're going to develop programs for Windows, you really need to accept the fact that floating point exceptions should be masked by the CPU, meaning you have to watch for them yourself. Like Win32Check or RaiseLastWin32Error, we'd like a RaiseLastFPError. The best i can come up with is:
procedure RaiseLastFPError();
var
statWord: Word;
const
ERROR_InvalidOperation = $01;
// ERROR_Denormalized = $02;
ERROR_ZeroDivide = $04;
ERROR_Overflow = $08;
// ERROR_Underflow = $10;
// ERROR_InexactResult = $20;
begin
{
Excellent reference of all the floating point instructions.
(Intel's architecture manuals have no organization whatsoever)
http://www.plantation-productions.com/Webster/www.artofasm.com/Linux/HTML/RealArithmetica2.html
Bits 0:5 are exception flags (Mask = $2F)
0: Invalid Operation
1: Denormalized - CPU handles correctly without a problem. Do not throw
2: Zero Divide
3: Overflow
4: Underflow - CPU handles as you'd expect. Do not throw.
5: Precision - Extraordinarily common. CPU does what you'd want. Do not throw
}
asm
fwait //Wait for pending operations
FSTSW statWord //Store floating point flags in AX.
//Waits for pending operations. (Use FNSTSW AX to not wait.)
fclex //clear all exception bits the stack fault bit,
//and the busy flag in the FPU status register
end;
if (statWord and $0D) <> 0 then
begin
//if (statWord and ERROR_InexactResult) <> 0 then raise EInexactResult.Create(SInexactResult)
//else if (statWord and ERROR_Underflow) <> 0 then raise EUnderflow.Create(SUnderflow)}
if (statWord and ERROR_Overflow) <> 0 then raise EOverflow.Create(SOverflow)
else if (statWord and ERROR_ZeroDivide) <> 0 then raise EZeroDivide.Create(SZeroDivide)
//else if (statWord and ERROR_Denormalized) <> 0 then raise EUnderflow.Create(SUnderflow)
else if (statWord and ERROR_InvalidOperation) <> 0 then raise EInvalidOp.Create(SInvalidOp);
end;
end;
A reproducible case!
I found a case, when Delphi's default floating point control word, that was the cause of an invalid floating point exception (although I never saw it before now because it was masked). Now that i'm seeing it, why is it happening! And it's reproducible:
procedure TForm1.Button1Click(Sender: TObject);
var
d: Real;
dover: Int64 absolute d;
begin
d := 1.35715152325557E020;
// dOver := $441d6db44ff62b68; //1.35715152325557E020
d := Round(d); //<--floating point exception
Self.Caption := FloatToStr(d);
end;
You can see that the ST0 register contains a valid floating point value. The floating point control word is $1372. There floating point exception flag are all clear:
And then, as soon as it executes, it's an invalid operation:
IE (Invalid operation) flag is set
ES (Exception) flag is set
I was tempted to ask this as another question, but it would be the exact same question - except this time calling Round().

The problem occurs elsewhere. When your code enters Trunc the control word is set to $027F which is, IIRC, the default Windows control word. This has all exceptions masked. That's a problem because Delphi's RTL expects exceptions to be unmasked.
And look at the FPU window, sure enough there are errors. Both IE and PE flags are set. It's IE that counts. That's means that earlier in the code sequence there was a masked invalid operation.
Then you call Trunc which modifies the control word to unmask the exceptions. Look at your second FPU window screenshot. IE is 1 but IM is 0. So boom, the earlier exception is raised and you are led to think that it was the fault of Trunc. It was not.
You'll need to trace back up the call stack to find out why the control word is not what it ought to be in a Delphi program. It ought to be $1332. Most likely you are calling into some third party library which modifies the control word and does not restore it. You'll have to locate the culprit and take charge whenever any calls to that function return.
Once you get the control word back under control you'll find the real cause of this exception. Clearly there is an illegal FP operation. Once the control word unmasks the exceptions, the error will be raised at the right point.
Note that there's nothing to worry about the discrepancy between $1372 and $1332, or $1F72 and $1F32. That's just an oddity with the CTRL control word that some of the bytes are reserved and ignore you exhortations to clear them.

Your latest update essentially asks a different question. It asks about the exception raised by this code:
procedure foo;
var
d: Real;
i: Int64;
begin
d := 1.35715152325557E020;
i := Round(d);
end;
This code fails because the job of Round() is to round d to the nearest Int64 value. But your value of d is greater than the largest possible value that can be stored in an Int64 and hence the floating point unit traps.

FLD instruction x64 bit

I have a little problem with FLD instruction in x64 bit ...
want to load Double value to the stack pointer FPU in st0 register, but it seem to be impossible.
In Delphi x32, I can use this code :
function DoSomething(X:Double):Double;
asm
FLD X
// Do Something ..
FST Result
end;
Unfortunately, in x64, the same code does not work.

Delphi inherite Microsoft x64 Calling Convention.
So if arguments of function/procedure are float/double, they are passed in XMM0L, XMM1L, XMM2L, and XMM3L registers.
But you can use var before parameter as workaround like:
function DoSomething(var X:Double):Double;
asm
FLD qword ptr [X]
// Do Something ..
FST Result
end;

In x64 mode floating point parameters are passed in xmm-registers. So when Delphi tries to compile FLD X, it becomes FLD xmm0 but there is no such instruction. You first need to move it to memory.
The same goes with the result, it should be passed back in xmm0.
Try this (not tested):
function DoSomething(X:Double):Double;
var
Temp : double;
asm
MOVQ qword ptr Temp,X
FLD Temp
//do something
FST Temp
MOVQ xmm0,qword ptr Temp
end;

You don't need to use legacy x87 stack registers in x86-64 code, because SSE2 is baseline, a required part of the x86-64 ISA. You can and should do your scalar FP math using addsd, mulsd, sqrtsd and so on, on XMM registers. (Or addss for float)
The Windows x64 calling convention passes float/double FP args in XMM0..3, if they're one of the first four args to the function. (i.e. the 3rd total arg goes in xmm2 if it's FP, rather than the 3rd FP arg going in xmm2.) It returns FP values in XMM0.
Only use x87 if you actually need 80-bit precision inside your function. (Instructions like fsin and fyl2x are not fast, and can usually be done just as well by normal math libraries using SSE/SSE2 instructions.
function times2(X:Double):Double;
asm
addsd xmm0, xmm0 // upper 8 bytes of XMM0 are ignored
ret
end
Storing to memory and reloading into an x87 register costs you about 10 cycles of latency for no benefit. SSE/SSE2 scalar instructions are just as fast, or faster, than their x87 equivalents, and easier to program for and optimize because you never need fxch; it's a flat register design instead of stack-based. (https://agner.org/optimize/). Also, you have 15 XMM registers.
Of course, you usually don't need inline asm at all. It could be useful for manually-vectorizing if the compiler doesn't do that for you.

Char and Chr in Delphi

The difference between Chr and Char when used in converting types is that one is a function and the other is cast
So: Char(66) = Chr(66)
I don't think there is any performance difference (at least I've never noticed any, one probably calls the other).... I'm fairly sure someone will correct me on this!
EDIT Thanks to Ulrich for the test proving they are in fact identical.
EDIT 2 Can anyone think of a case where they might not be identical, e.g. you are pushed towards using one over the other due to the context?
Which do you use in your code and why?

I did a small test in D2007:
program CharChr;
{$APPTYPE CONSOLE}
uses
Windows;
function GetSomeByte: Byte;
begin
Result := Random(26) + 65;
end;
procedure DoTests;
var
b: Byte;
c: Char;
begin
b := GetSomeByte;
IsCharAlpha(Chr(b));
b := GetSomeByte;
IsCharAlpha(Char(b));
b := GetSomeByte;
c := Chr(b);
b := GetSomeByte;
c := Char(b);
end;
begin
Randomize;
DoTests;
end.
Both calls produce the same assembly code:
CharChr.dpr.19: IsCharAlpha(Chr(b));
00403AE0 8A45FF mov al,[ebp-$01]
00403AE3 50 push eax
00403AE4 E86FFFFFFF call IsCharAlpha
CharChr.dpr.21: IsCharAlpha(Char(b));
00403AF1 8A45FF mov al,[ebp-$01]
00403AF4 50 push eax
00403AF5 E85EFFFFFF call IsCharAlpha
CharChr.dpr.24: c := Chr(b);
00403B02 8A45FF mov al,[ebp-$01]
00403B05 8845FE mov [ebp-$02],al
CharChr.dpr.26: c := Char(b);
00403B10 8A45FF mov al,[ebp-$01]
00403B13 8845FE mov [ebp-$02],al
Edit: Modified sample to mitigate Nick's concerns.
Edit 2: Nick's wish is my command. ;-)

The help says: Chr returns the character with the ordinal value (ASCII value) of the byte-type expression, X. *
So, how is a character represented in a computer's memory? Guess what, as a byte*. Actually the Chr and Ord functions are only there for Pascal being a strictly typed language prohibiting the use of bytes* where characters are requested. For the computer the resulting char is still represented as byte* - to what shall it convert then? Actually there is no code emitted for this function call, just as there is no code omitted for a type cast. Ergo: no difference.
You may prefer chr just to avoid a type cast.
Note: type casts shall not be confused with explicit type conversions! In Delphi 2010 writing something like Char(a) while a is an AnsiChar, will actually do something.
**For Unicode please replace byte with integer*
Edit:
Just an example to make it clear (assuming non-Unicode):
var
a: Byte;
c: char;
b: Byte;
begin
a := 60;
c := Chr(60);
c := Chr(a);
b := a;
end;
produces similar code
ftest.pas.46: a := 60;
0045836D C645FB3C mov byte ptr [ebp-$05],$3c
ftest.pas.47: c := Chr(60);
00458371 C645FA3C mov byte ptr [ebp-$06],$3c
ftest.pas.48: c := Chr(a);
00458375 8A45FB mov al,[ebp-$05]
00458378 8845FA mov [ebp-$06],al
ftest.pas.49: b := a;
0045837B 8A45FB mov al,[ebp-$05]
0045837E 8845F9 mov [ebp-$07],al
Assigning byte to byte is actually the same as assigning byte to char via CHR().

chr is a function, thus it returns a new value of type char.
char(x) is a cast, that means the actual x object is used but as a different type.
Many system functions, like inc, dec, chr, ord, are inlined.
Both char and chr are fast. Use the one that is most appropriate each time,
and reflects better what you want to do.

Chr is function call, it is a bit (tiny-tiny) more expensive then type cast. But i think Chr is inlined by compiler.

They are identical, but they don't have to be identical. There's no requirement that the internal representation of characters map 1-to-1 with their ordinal values. Nothing says that a Char variable holding the value 'A' must hold the numeric value 65. The requirement is that when you call Ord on that variable, the result must be 65 because that's the code point designated for the letter A in your program's character encoding.
Of course, the easiest implementation of that requirement is for the variable to hold the numeric value 65 as well. Because of this, the function calls and the type-casts are always identical.
If the implementation were different, then when you called Chr(65), the compiler would go look up what character is at code point 65 and use it as the result. When you write Char(65), the compiler wouldn't worry about what character it really represents, as long as the numeric result stored in memory was 65.
Is this splitting hairs? Yes, absolutely, because in all current implementations, they're identical. I liken this to the issue of whether the null pointer is necessarily zero. It's not, but under all implementations, it ends up that way anyway.

chr is typesafe, char isn't: Try to code chr(256) and you'll get a compiler error. Try to code char(256) and you will either get the character with the ordinal value 0 or 1, depending on your computers internal representation of integers.
I'll suffix the above by saying that that applies to pre-unicode Delphi. I don't know if chr and char have been updated to take unicode into account.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart