I'm working on adapting a large Delphi code base to 64-bits. In many cases there are lines where pointers are casted to/from 32-bit values similar to this:
var
p1,p2 : pointer;
begin
inc(Integer(p1),10);
p2 := Pointer(Integer(p1) + 42);
Where I can find these casts I have replaced them with NativeInt-casts instead to make them correct in 64-bit mode.
However I'm not sure I have found them all. Sometimes the casts are more subtle so just text-searching for the string "integer(" is not sufficient either.
Since the "integer(" casts will fail in 64-bit if the pointer value is above the range of integer type I have an idea: what if I could force the memory manager to allocate memory above 4gb (so the pointer values are using more than 32-bits)? Then I would get runtime errors and can more easily find the casts that are wrong. Is this possible? Or can anyone recommend some other technique?
There's no magic trick to finding these casts beyond the sort of text search that you are using. It would be really nice if the compiler warned of such a cast. I find it very disappointing that it doesn't.
When you do find such a problem, don't change to NativeInt. Change the pointers to be typed pointers, and use pointer arithmetic.
var
p1, p2: PByte;
....
inc(p1, 10);
p2 := p2;
inc(p2, 42);
Then your code will be safe forever.
There are still some situations where you need to cast to integers. For example when passing addresses to SendMessage. But cast these to either WPARAM or LPARAM as appropriate.
Your idea of forcing runtime errors is sound and, thankfully for you, not original! You should use the full version of FastMM and define AlwaysAllocateTopDown. This forces the calls that FastMM makes to VirtualAlloc to pass the MEM_TOP_DOWN flag. This will flush out most of your erroneous casts as runtime pointer truncation errors.
However, that will only force top down allocation for memory allocated by your memory manager. Other modules in your process will use the default policy of bottom up. You can set a machine wide setting to change that default policy. Set HKLM\System\CurrentControlSet\Control\Session Manager\Memory Management\AllocationPreference to REG_DWORD with value 0x100000 and reboot.
Note that this might cause your machine to have stability problems. Many applications cannot cope with this. In particular there are very few anti-virus products that can cope with this setting. MSE is the one that I found works with machine wide top down allocation. What's more the 64 bit debugger does not run under top down allocation! So you have to do this kind of testing without the debugger. My QC report is still open and this problem has not been addressed, even in XE3.
Related
I'm trying to understand using the FPU for 64-bit integer arithmetic. I write this (ATT syntax):
fildq A
fildq B
faddp
fistpq C
The result in C is A + B + 1. If I start with an "finit" instruction, it gives me the correct value A + B. I thought that the unwanted +1 was maybe because it was adding in a carry bit, but using gdb I see no difference at all in the FPU control registers when I use finit from when I don't -- in both cases the control register starts off as 0x27F, the tag register is 0xFFFF (= stack empty), and all the others (including the status register, where all the condition bits are located) are zero.
Using finit seems a bit of a blunt instrument here, and I'm also wondering where the extra +1 is coming from if I don't use it, given that all the FPU registers seem to have the same values in both cases. Can anyone shed any light on this for me?
[…] I see no difference at all in the FPU control registers when I use finit from when I don't -- in both cases the control register starts off as 0x27F […]
Are you sure?
finit is supposed to load 0x37F, one additional bit set in comparison to 0x27F.
The difference is in the precision control field.
The default value uses 80‑bits whilst your observed value is using 64‑bits.
The result in C is A + B + 1. […]
Using finit seems a bit of a blunt instrument here, and I'm also wondering where the extra +1 is coming from if I don't use it, […]
With sufficiently large A and B you’re likely seeing a loss in precision from fadd.
Unmasking the precision exception will confirm this.
I think you were using the inline assembly capabilities of your favorite compiler.
This is certainly convenient if you don’t wanna bother about menial tasks, yet apparently your compiler’s run-time system loads 0x27F at startup for compatibility considerations.
Study its manual (and possibly source code) for details.
I'm using Delphi XE6 to perform a complicated floating point calculation. I realize the limitations of floating point numbers so understand the inaccuracies inherent in FP numbers. However this particular case, I always get 1 of 2 different values at the end of the calculation.
The first value and after a while (I haven't figured out why and when), it flips to the second value, and then I can't get the first value again unless I restart my application. I can't really be more specific as the calculation is very complicated. I could almost understand if the value was somewhat random, but just 2 different states is a little confusing. This only happens in the 32-bit compiler, the 64 bit compiler gives one single answer no matter how many times I try it. This number is different from the 2 from the 32-bit calculation, but I understand why that is happening and I'm fine with it. I need consistency, not total accuracy.
My one suspicion is that perhaps the FPU is being left in a state after some calculation that affects subsequent calculations, hence my question about clearing all registers and FPU stack to level out the playing field. I'd call this CLEARFPU before I start of the calculation.
After some more investigation I realized I was looking in the wrong place. What you see is not what you get with floating point numbers. I was looking at the string representation of the numbers and thinking here are 4 numbers going into a calculation ALL EQUAL and the result is different. Turns out the numbers only seemed to be the same. I started logging the hex equivalent of the numbers, worked my way back and found an external dll used for matrix multiplication the cause of the error. I replaced the matrix multiplication with a routine written in Delphi and all is well.
Floating point calculations are deterministic. The inputs are the input data and the floating point control word. With the same input, the same calculation will yield repeatable output.
If you have unpredictable results, then there will be a reason for it. Either the input data or the floating point control word is varying. You have to diagnose what that reason for that is. Until you understand the problem fully, you should not be looking for a problem. Do not attempt to apply a sticking plaster without understanding the disease.
So the next step is to isolate and reproduce the problem in a simple piece of code. Once you can reproduce the issue you can solve the problem.
Possible explanations include using uninitialized data, or external code modifying the floating point control word. But there could be other reasons.
Uninitialized data is plausible. Perhaps more likely is that some external code is modifying the floating point control word. Instrument your code to log the floating point control word at various stages of execution, to see if it ever changes unexpectedly.
You've probably been bitten by combination of optimization and excess x87 FPU precision resulting in the same bit of floating-point code in your source code being duplicated with different assembly code implementations with different rounding behaviour.
The problem with x87 FPU math
The basic problem is that while x87 FPU the supports 32-bit, 64-bit and 80-bit floating-point value, it only has 80-bit registers and the precision of operations is determined by the state of the bits in the floating point control word, not the instruction used. Changing the rounding bits is expensive, so most compilers don't, and so all floating point operations end being be performed at the same precision regardless of the data types involved.
So if the compiler sets the FPU to use 80-bit rounding and you add three 64-bit floating point variables, the code generated will often add the first two variables keeping the unrounded result in a 80-bit FPU register. It would then add the third 64-bit variable to 80-bit value in the register resulting in another unrounded 80-bit value in a FPU register. This can result in a different value being calculated than if the result was rounded to 64-bit precision after each step.
If that resulting value is then stored in a 64-bit floating-point variable then the compiler might write it to memory, rounding it to 64 bits at this point. But if the value is used in later floating point calculations then the compiler might keep it in a register instead. This means what rounding occurs at this point depends on the optimizations the compiler performs. The more its able to keep values in a 80-bit FPU register for speed, the more the result will differ from what you'd get if all floating point operation were rounded according to the size of actual floating point types used in the code.
Why SSE floating-point math is better
With 64-bit code the x87 FPU isn't normally used, instead equivalent scalar SSE instructions are used. With these instructions the precision of the operation used is determined by the instruction used. So with the adding three numbers example, the compiler would emit instructions that added the numbers using 64-bit precision. It doesn't matter if the result gets stored in memory or stays in register, the value remains the same, so optimization doesn't affect the result.
How optimization can turn deterministic FP code into non-deterministic FP code
So far this would explain why you'd get a different result with 32-bit code and 64-bit code, but it doesn't explain why you can get a different result with the same 32-bit code. The problem here is that optimizations can change the your code in surprising ways. One thing the compiler can do is duplicate code for various reasons, and this can cause the same floating point code being executed in different code paths with different optimizations applied.
Since optimization can affect floating point results this can mean the different code paths can give different results even though there's only one code path in the source code. If the code path chosen at run time is non-deterministic then this can cause non-deterministic results even when the in the source code the result isn't dependent on any non-deterministic factor.
An example
So for example, consider this loop. It performs a long running calculation, so every few seconds it prints a message letting the user know how many iterations have been completed so far. At the end of the loop there's simple summation performed using floating-point arithmetic. While there's non-deterministic factor in the loop, the floating-point operation isn't dependent on it. It's always performed regardless of whether progress updated is printed or not.
while ... do
begin
...
if TimerProgress() then
begin
PrintProgress(count);
count := 0
end
else
count := count + 1;
sum := sum + value
end
As optimization the compiler might move the last summing statement into the end of both blocks of the if statement. This lets both blocks finish by jumping back to the start of the loop, saving a jump instruction. Otherwise one of the blocks has to end with a jump to the summing statement.
This transforms the code into this:
while ... do
begin
...
if TimerProgress() then
begin
PrintProgress(count);
count := 0;
sum := sum + value
end
else
begin
count := count + 1;
sum := sum + value
end
end
This can result in the two summations being optimized differently. It may be in one code path the variable sum can be kept in a register, but in the other path its forced out in to memory. If x87 floating point instructions are used here this can cause sum to be rounded differently depending on a non-deterministic factor: whether or not its time to print the progress update.
Possible solutions
Whatever the source of your problem, clearing the FPU state isn't going to solve it. The fact that the 64-bit version works, provides an possible solution, using SSE math instead x87 math. I don't know if Delphi supports this, but it's common feature of C compilers. It's very hard and expensive to make x87 based floating-point math conforming to the C standard, so many C compilers support using SSE math instead.
Unfortunately, a quick search of the Internet suggests the Delphi compiler doesn't have option for using SSE floating-point math in 32-bit code. In that case your options would be more limited. You can try disabling optimization, that should prevent the compiler from creating differently optimized versions of the same code. You could also try to changing the rounding precision in the x87 floating-point control word. By default it uses 80-bit precision, but all your floating point variables are 64-bit then changing the FPU to use 64-bit precision should significantly reduce the effect optimization has on rounding.
To do the later you can probably use the Set8087CW procedure MBo mentioned, or maybe System.Math.SetPrecisionMode.
How can one switch off range checking for a part of a file. Switching off is easy, but how do I revert to the project setting later on? The pseudo-code below should explain it:
Unit1;
//here's range checking on or off as per the project setting
code here...
{$R-}
//range checking is off here because the code causes range check errors
code here...
//now I want to revert to the project setting. How do I do that?
code here...
end.
See: IFOPT directive.
{$IFOPT R+}
{$DEFINE RANGEON}
{$R-}
{$ELSE}
{$UNDEF RANGEON}
{$ENDIF}
//range checking is off here because the code causes range check errors
//code here...
{$IFDEF RANGEON}
{$R+}
{$UNDEF RANGEON}
{$ENDIF}
Wrap your code in $R directives:
{$R-} // disable range checking
// do non-range-checked operations here
{$R+} // turn range checking back on
Note that the directive applies at the statement level. You cannot wrap just part of an expression with that.
Why would you want to turn it off for release builds? – dan-gph
I have seen too many Delphi programmers writing quite large programs without ever activating Range, Overflow and Assertion checking. Of course, you can do that if you want, but your code will be more buggy.
I hope to convince more programmers to enable these 3 checking right now, to make the program more reliable.
However, note that there is a price to pay for that: your program will be slower. I show below some actual time comparison of code running with and without range checking.
the point is you can turn it off locally where you need to with {$R-}.But you can leave it on globally in the project settings –
dan-gph
Personally, next to Debug and Release, I have a 3rd option called PreRelease. This is actually a Debug version with one single setting different: the "Optimizations" is on. It is fast enough while it still does the checking (range, overflow, assertions, etc).
I release this kind of version to a limited number of customers. If all seems good after one week, I replace it with the true Release version, where the checkings are off.
Overflow checking
This will check certain integer arithmetic operations (+, -, *, Abs, Sqr, Succ, Pred, Inc, and Dec) for overflow. For example, after a + (addition) operation the compiler will insert additional binary code that verifies that the result of the operation is within the supported range.
An "integer overflow" occurs when an operation on an integer variable produces a result that is outside the range of that variable. For example, if an integer variable is declared as a 16-bit signed integer, its value can range from -32768 to 32767. If an operation on this variable produces a result greater than 32767 or less than -32768, an integer overflow has occurred.
When an integer overflow occurs, the result of the operation is undefined and can lead to undefined behavior in the program:
• Wrap-around
The result might result in a wrapped-around value. This means that the number 32768 will be actually stored as 1 since it is 1 unit higher than the highest value we can store (32767).
• Truncation
The result may be truncated or otherwise modified to fit within the range of the integer type. For example, the number 32768 will be actually stored as 32767 since that is the highest value we can store.
Undefined program behavior is one of the worst kind of errors, because it is not an error easy to reproduce. Therefore, it is difficult to track and repair.
There is a small price to pay if you activate this: the speed of the program will decrease slightly.
IO checking
Checks the result of an I/O operation. If an I/O operation fails, an exception is raised. If this switch is off, we must check for I/O errors manually.
There is a minor price to pay if you activate this: the speed of the program will decrease, but insignificantly because the few microseconds introduced by this check is nothing compared with the millisecond-range time required by the I/O operation itself (the hard drives are slow).
Range Checking
The Delphi Geek calls this “The most important Delphi setting” and I totally agree. It checks if all array and string indexing expressions are within the defined bounds. It also checks that all assignments to scalar and subrange variables are within range.
Here is an example of code that would ruin our life if Range Checking would not be available:
Type
Pasword= array [1..10] of byte; // we define an array of 10 elements
…
x:= Pasword[20]; // Range Checking will prevent the program from accessing element 20 (ERangecheckError exception is raised). Security breach avoided. Error log automatically sent to the programmer. Bruce Willis saves everyone.
Enabling Runtime Error Checking
To activate the Runtime Error Checking, go to Project Options and check these three boxes:
Enabling the Runtime Error Checking in ‘Project Options’
Assertions
A good programmer MUST use assertions in its code to increase the quality and stability of the program. Seriously man! You really need to use them.
Assertions are used to check for conditions that should always be true at a certain point in the program, and to raise an exception if the condition is not met. The Assert procedure, which is defined in the SysUtils unit, is typically used to perform assertions.
You can think of assertions as poor man’s unit testing. I strongly advise you to look deeper into assertions. They are very useful and do not require as much work as unit testing.
Typical example:
SysUtils.Assert(Input <> Nil, ‘The input should not be nil!’);
But for the program to check our assertions, we need to active this feature in the Project Settings -> Compiler Options, otherwise they will simply be ignored, like they are not there in our code. Make sure that you understand the implications of what I just said! For example, we screw up badly if we call routines that have side effects in the Assert. In the example below, during Debugging when the assertions are on, the Test() function will be executed and 'This was executed' will appear in the Memo. However, during Release more, that text will not appear in the memo because now Assert is simply ignored. Congratulations we just made the program to behave differently in debug/release mode ☹.
function TMainForm.Test: Boolean;
begin
Result:= FALSE;
mmo.Lines.Add('This was executed');
end;
procedure TMainForm.Start;
VAR x: Integer;
begin
x:= 0;
if x= 0
then Assert(Test(), 'nope');
end;
Here are a few examples of how it can be used:
1 To check if an input parameter is within the 0..100 range:
procedure DoSomething(value: Integer);
begin
Assert((value >= 0) and (value <= 100), 'Value out of range');
…
end;
2 To check if a pointer is not nil before using it:
Var p: Pointer;
Begin
p := GetPointer;
Assert(Assigned(p), 'Pointer is nil');
…
End;
3 To check if a variable has a certain value before proceeding:
var i: Integer;
begin
i := GetValue;
Assert(i = 42, 'Incorrect response to “What is the answer to life”!');
…
end;
Assertions can also be disabled by defining the NDEBUG symbol in the project options or by using the {$D-} compiler directives.
We can also use the Assert as a more elegant way of handling errors and exceptions in some cases, as it can be more readable and it also includes a custom message, that would help the developer understand what went wrong.
Personally, I use it a lot at the top of my routines to check if the parameters are valid.
Activating this feature will (naturally) make your program slower because… well, there is one extra line of code to execute.
Nothing comes for free
Everything nice comes with a price (fortunately a small price in our case): enabling Runtime error checking and Assertions slows down our program and makes it somewhat larger.
Computers today have lots of RAM so the slight increase in size is irrelevant, so, let’s put that aside. But let’s look at the speed, because that is not something we can easily ignore:
Type Disabled Enabled
Range checking 73ms 120ms
Overflow checking 580ms 680ms
I/O checking Not tested Not tested
As we can see the program's speed is strongly impacted by these runtime checking. If we have a program where speed is critical, we better activate “Runtime error checking” during debugging only. What I do, I also leave it active in the first release and wait a few weeks. If no bugs are reported, then I release an update in which “Runtime error checking” is off.
Personally, I leave the “IO checking” always active. The performance hit because of this check is microscopic.
Big warning:
If you have an existing project that was not so nicely written, and you activate any of the Runtime Error checking below, your program may will crash more often than usual.
No, the Runtime Error checking routines did not break your program. It was always broken – you just didn’t know. The Runtime Checking routines are now finding all those places where the code is fishy and shitty and smelly. The sole purpose of Runtime Checking is to find bugs in your program.
Probably a stupid question, but it's an idle curiosity for me.
I've got a bit of Delphi code that looks like this;
const
KeyRepeatBit = 30;
...
// if bit 30 of lParam is set, mark this message as handled
if (Msg.lParam and (1 shl KeyRepeatBit) > 0) then
Handled:=true;
...
(the purpose of the code isn't really important)
Does the compiler see "(1 shl KeyRepeatBit)" as something that can be computed at compile time, and thus it becomes a constant? If not, would there be anything to gain by working it out as a number and replacing the expression with a number?
Yes, the compiler evaluates the expression at compile time and uses the result value as a constant. There's no gain in declaring another constant with the result value yourself.
EDIT: The_Fox is correct. Assignable typed constants (see {$J+} compiler directive) are not treated as constants and the expression is evaluated at runtime in that case.
You can make sure iike this, for readability alone:
const
KeyRepeatBit = 30;
KeyRepeatMask = 1 shl KeyRepeatBit ;
It converts it to a constant at compile time.
However, even if it didn't, this would have no noticeable impact on your application's performance.
You might handle a few thousand messages per second if your app is busy. Your old Pentium I can do gazillions of shifts and ands per second.
Keep your code readable, and profile it to find bottlenecks that you then optimize - usually by looking at the algorithm, and not such a low level as whether you're shifting or not.
I doubt that using a number (would be 1073741824, by the way) here would really improve performance. You seem to be in some Windows message context here and this will possible add more delay than a single and that is lightning fast even if the number is not optimized at compiled time (anyway, I think it is optimized).
The only exception I could imagine would be the case that this particular piece of code is run really often, but as I said I think this gets optimized at compile time and so even in this case it won't make a difference at all.
Maybe it's offtopic to your question but I use a case record for these kind of things, example:
TlParamRecord = record
case Integer of
0: (
RepeatCount: Word;
ScanCode: Byte;
Flags: Set of (lpfExtended, lpfReserved=4, lpfContextCode,
lpfPreviousKeyState, lpfTransitionState);
);
1: (lParam: LPARAM);
end;
see article on my blog for more details
I want to get the size of any "record" type in following function. But seems it doesn't work:
function GetDataSize(P : Pointer) : Integer;
begin
Result := SizeOf(P^); // **How to write the code?**
end;
For example, the size of following record is 8 bytes
SampleRecord = record
Age1 : Integer;
Age2 : Integer;
end;
But GetDataSize(#a) always returns 1 (a is a variable of SampleRecord type of course). What should I do?
I noticed that Delphi has a procedure procedure New(var P: Pointer) which can allocate the memory block corresponds to the size of the type that P points to. How can it gets the size?
The reason New knows how much memory to allocate is that New is compiler magic. It's a language built-in, so when the compiler sees you call it, it rewrites it to something like this:
// New(foo);
foo := System._New(SizeOf(foo^), TypeInfo(TypeOf(foo^)));
TypeOf here is a made-up Delphi function for expository purposes. The compiler knows the declared type of foo because it knows where all your variable declarations are. You can look at the implementation of _New in System.pas. Similar rewriting occurs for Dispose so it knows what kind of finalization to do before freeing the memory.
The ideas of variables and declarations are compile-time concepts. At run time, they cease to exist. At run time, a pointer is just an address. The type of what it points to was determined at compile time. Types are what determine something's size.
If you need to write a function that accepts pointers to multiple things with different sizes, then you'll just have to provide a second parameter that describes what the first one points to.
Check out another question here, "How to know what type is a var." The asker wondered how to determine more information about a variable given only its address.
You cannot find the size of data structure using variable of type Pointer, because compiler cannot, make a guess and check it, since pointer can points to whatever data type you can think of. You can read some information here.
There's no safe way to determine the size of a record that a pointer points to. However, if you allocated the memory that the pointer points to, you can ask the size of that memory block. But then again, since you allocated that block, you should already know the size of that block!
The Delphi memory manager keeps track of every block of memory that gets allocated. With information from the memory manager it is possible to find this information, if your pointer points to the beginning of a memory block. However, if you allocated a large block of memory, loaded some data in it and your pointer points to some data inside this block, this method would be quite unreliable.
Also, if you use referenced types (dynamic arrays, strings, classes, etc.) in your record, the size it returns will still be unusable since you get the size of the reference (4 bytes) instead of the size of the data that is referenced to.
The NEW() command just uses the type information of the datatype that you pass to it to get it's size. To know how it does this exactly, you could just check the Delphi sourcecode. Open \source\Win32\rtl\sys\System.pas and search for "_New". (With the underscore in front of it. Using this sourcecode might help you to understand how Delphi handles memory allocations, although the sourcecode can be really complex.
Delphi has a built-in memory manager. I believe new has access to the heap object and uses HeapSize() (or similar routines) to get the size of a block, for some pointer.