
Are floating point operations in Delphi deterministic?
That is, will I get the same result from an identical floating point mathematical operation on the same source compiled with the Delphi Win32 compiler as I would with the Win64 compiler, or the OS X compiler, or the iOS compiler, or the Android compiler?
This is a crucial question as I'm implementing multiplayer support in my game engine, and I'm concerned that the predictive results on the client side could very often differ from the definitive (and authoritative) determination of the server.
The consequence of this would be the appearance of "lag" or "jerkiness" on the client side when the authoritative game state data overrules the predictive state on the client side.
Since I don't realistically have the ability to test dozens of different devices on different platforms compiled with the different compilers in anything resembling a "controlled condition", I figure it's best to put this question to the Delphi developers out there to see if anyone has an internal understanding of floating point determinism at a low-level on the compilers.

I think there is no simple answer. A similar question was discussed here.
In general, there are two standards for the representation of floating point numbers:
IEEE 754-1985 and IEEE 754-2008.
All modern (and even quite old) CPUs follow these standards, which guarantees some things:
The binary representation of the same standard floating-point type will be identical.
The result of some operations (not all, only the basic operations!) is guaranteed to be identical, but only if the compiler uses the same kind of instruction at the same precision; I am not sure this always holds in practice.
But if you use more complex operations, such as square root, the result may vary even between different models of desktop CPUs. You can read this article for some details:
http://randomascii.wordpress.com/2013/07/16/floating-point-determinism/
P.S. As tmyklebu mentioned, square root is also defined by IEEE 754, so it is possible to guarantee the same result for the same input for Add, Subtract, Multiply, Divide and Square Root. A few other operations are also defined by IEEE 754, but for the full details it is better to read the standard itself.
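The first guarantee is easy to check: the IEEE 754 binary64 encoding is fully specified, so serializing the same Double value yields the same bytes on every conforming platform. A quick illustration in Python (used here only because its float is an IEEE 754 double; nothing below is Delphi-specific):

```python
import struct

# The IEEE 754 binary64 encoding is fully specified, so the byte pattern
# of the same double value is identical on every conforming platform.
bits = struct.pack('<d', 0.1)
print(bits.hex())  # 9a9999999999b93f
```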

Putting aside the standards for floating point calculations for a moment, consider that the 32-bit and 64-bit compilers compile to use the old x87 FPU vs the newer SSE instructions. I would find it difficult to trust that every calculation will always come out exactly the same on different hardware implementations. Better to go the safe route and assume that if two values are within a small delta of each other, you evaluate them as equal.
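A minimal sketch of that delta comparison (in Python, since the logic is language-independent; the function name and the tolerance values are my own choices, not from any Delphi library):

```python
def nearly_equal(a: float, b: float, rel_tol: float = 1e-9, abs_tol: float = 1e-12) -> bool:
    """Treat two floats as equal if they differ by at most a small delta,
    scaled relative to their magnitude, with an absolute floor near zero."""
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

print(nearly_equal(0.1 + 0.2, 0.3))  # True, even though 0.1 + 0.2 != 0.3 exactly
```

The relative term handles large magnitudes; the absolute floor handles comparisons against values at or near zero, where a purely relative test breaks down.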

From experience I can tell that the results are different: 32-bit works with Extended precision by default, while 64-bit works with double precision by default.
Consider the statement
  x, y, z: Double;
  x := y * z;
In Win32 this executes as x := Double(Extended(y) * Extended(z));
In Win64 this executes as x := Double(Double(y) * Double(z));
You can put a lot of effort into ensuring that you use the same precision and mode, but whenever you call third-party libraries, you need to consider that they may internally change these flags.
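The effect of different intermediate precisions can be demonstrated in any IEEE 754 environment. The sketch below (Python, whose float is an IEEE double) uses single precision as the narrow intermediate type and double as the wide one; the mechanism is the same as the Extended-vs-Double case above, just one level down the precision ladder:

```python
import struct

def to_single(x: float) -> float:
    """Round an IEEE double to single precision and back,
    simulating an intermediate held in a narrower register."""
    return struct.unpack('<f', struct.pack('<f', x))[0]

a, b = 0.1, 0.2
wide = a + b                                      # intermediate kept at double precision
narrow = to_single(to_single(a) + to_single(b))   # every intermediate rounded to single
print(wide == narrow)  # False: same expression, different intermediate precision
```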

Related

Is "Perform all 32 bit float operations as 64 bit float" MORE or LESS compatible with .net?

In Xamarin.iOS project properties, under "iOS Build" there's an option for: "Perform all 32 bit float operations as 64 bit float".
Microsoft seems to say that using 32-bit "affects precision and, possibly, compatibility" in a bad way, so it's better to use 64-bit precision.
But the popup on the text in visual studio (when hovering with the cursor over "Perform all 32 bit float operations as 64 bit float") says "using 64...is slightly incompatible with .net code."
So which one is it?
You have misread the statement in your first point. Microsoft doesn't say that using 32-bit is bad, so you need to use 64-bit. Just the opposite.
Basically, it is always preferable to use 64-bit float operations. They are enabled by default, and according to the Floating Point Operations section of the Xamarin.iOS docs:
While this higher precision is closer to what developers expect from floating point operations in C# on the desktop, on mobile, the performance impact can be significant.
Let's see what the Code Analysis tool is:
Xamarin.iOS analysis is a set of rules that check your project settings to help you determine if better/more optimized settings are available.
So, even though it is generally preferable to use 64-bit floats, this isn't always the best choice. When you run the Code Analysis tool, it will scan your project to see if there is a configuration better suited to your solution (it depends on the project's needs).
Occasionally, 64-bit floats may do you more harm than good. In that case, the linter will warn you with XIA0005: Float32Rule, which suggests that you uncheck the option, just as Microsoft's message says.

Avoiding Denormals in Haxe

I am doing DSP in Haxe. Some of my DSP includes recursive algorithms that may generate denormal (aka subnormal) numbers. Some platforms perform poorly when encountering such numbers, making real-time processing impossible (and even offline processing, in some cases, dramatically more difficult). Obviously, only algorithms that produce very small numbers (e.g., via recursive multiplication) are affected, but I am working with such algorithms.
One very common procedure for dealing with the problem is simply this:
if r is a denormal
r <- 0
This works fine when denormals are too small to have any effect on the given algorithm, which is (pretty much) always the case.
I am looking to build for a number of platforms and would like to avoid these headaches before they happen to the greatest extent possible. So the question is, how do I identify/eliminate denormals in Haxe quickly and efficiently?
This might break down to other questions like: does Haxe have a language-specific method of handling denormals, or is it up to the platform? (I see nothing in the docs -- not even an isDenormal function) If it's up to the platform, is there a flag or something? How do I know which platforms need special handling, and which do not?
Many thanks!
Haxe doesn't support these operations. The problem is that most of the native platforms it targets have no support for this either; I am talking mainly about JavaScript, Flash, PHP and Neko here.
You can certainly build your own library and try to optimize things where possible using inlines.
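Since there is no built-in isDenormal, the flush-to-zero procedure from the question has to be written by hand, and the logic is just a magnitude comparison against the smallest normal value. A sketch in Python (the function name is mine; for IEEE doubles the smallest normal value is 2**-1022, exposed as sys.float_info.min):

```python
import sys

SMALLEST_NORMAL = sys.float_info.min  # 2**-1022 for an IEEE 754 double

def flush_denormal(x: float) -> float:
    """Return 0.0 for subnormal inputs, leaving zero and normal values untouched."""
    if x != 0.0 and abs(x) < SMALLEST_NORMAL:
        return 0.0
    return x

print(flush_denormal(1e-320))  # 0.0: 1e-320 is subnormal, so it is flushed
print(flush_denormal(1e-300))  # 1e-300: normal, kept as-is
```

On 32-bit float targets the threshold would be 2**-126 instead; the comparison itself is cheap, but in a tight DSP loop you would typically only apply it once per block of samples.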

Good Resources for using Assembly in Delphi?

Question
Are there any resources for learning how to use assembly in Delphi?
Background Information
I've found and read some general assembly and instruction set references (x86, MMX, SSE etc). But I'm finding it difficult to apply that information in Delphi. General things like how to get the value of a class property etc.
I would like to have the option to use assembly when optimising code.
I understand:
It will be difficult to beat the compiler.
High-level optimisation techniques (such as choosing different algorithms, caching, etc.) are much more likely to increase performance by several orders of magnitude than low-level assembly optimisations.
Profiling is vital. I'm using Sampling Profiler for real-world performance analysis and cpu cycle counts for low-level details.
I am interested in learning how to use assembly in Delphi because:
It can't hurt to have another tool in the toolbox.
It will help with understanding the compiler generated assembly output.
Understanding what the compiler is doing may help with writing better performing pascal code.
I'm curious.
Here is a resource that could be helpful...
www.guidogybels.eu/docs/Using%20Assembler%20in%20Delphi.pdf
(I wanted to add a comment to @Glenn with this info, but am forced to use the answer mechanism as I am new to this forum and don't have enough rep.)
Most optimization involves creating better algorithms: that is usually where the 'order of magnitude' speed improvements can be obtained.
The x64 assembly world is a big change from the x86 assembly world, which means that with the introduction of x64 in Delphi in XE2 (very soon now), you will have to write all your assembly code twice.
Getting yourself a better algorithm in Delphi relieves you of writing that assembly code at all.
The major area where assembly can help (but often smartly crafted Delphi code helps a lot too) is low level bit/byte twiddling, for instance when doing encryption. On the other hand FastMM (the fast memory manager for Delphi) has almost all code written in Delphi.
As Marco already wrote: starting with the disassembled code is often a good start. But assembly optimizations can go very far.
An example you can use as a starting point is for instance the SynCrypto unit which has an option for using either Delphi or assembly code.
The way I read your post, you aren't looking so much for assembler resources as resources to explain how Delphi declarations are structured within memory so you can access them via assembler. This is indeed a difficult thing to find, but not impossible.
Here is a good resource I've found to begin to understand how Delphi structures its declarations. Since assembler only involves itself with discrete data addresses to CPU defined data types, you'll be fine with any Delphi structure as long as you understand it and access it properly.
The only other concern is how to interact with the Delphi procedure and function headers to get the data you want (assuming you want to do your assembler using Delphi's inline facility), but that just involves understanding the standard calling conventions. This and this will be useful to that end.
Now using actual assembler (linked OBJ files) as opposed to the inline assembler is another topic, which will vary depending on the assembler chosen. You can find information on that as well, but if you have an interest you can always ask that question, too.
HTH.
To use BASM efficiently, you need to have a knowledge both of (1) how Delphi does things at a low level and (2) of assembly. Most of the times, you will not find both of these things described in one place.
However, Dennis Christensen's BASM for beginner and this Delphi3000 article go in that direction. For more specific questions, besides Stackoverflow, also Embarcadero's BASM forum is quite useful.
The simplest solution is always coding it in pascal, and look at the generated assembler.
Speed-wise, assembler is usually only a plus in tight loops; in general code there is hardly any improvement, if at all. I have only one piece of assembler in my code, and the benefit comes from recoding a floating point vector operation in fixed point SSE. The saturation provided by SIMD instruction sets is an additional bonus.
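The fixed point idea mentioned above can be sketched briefly. This Python toy shows only the arithmetic (Q16.16 format: scale each value by 2**16 and work in integers); the actual speed win described in the answer comes from doing the same thing with SSE integer instructions, which this sketch does not attempt:

```python
SCALE = 1 << 16  # Q16.16 fixed point: 16 integer bits, 16 fractional bits

def to_fixed(x: float) -> int:
    """Convert a float to Q16.16 fixed point."""
    return round(x * SCALE)

def fixed_mul(a: int, b: int) -> int:
    # The product of two Q16.16 numbers carries 32 fractional bits;
    # shift right by 16 to return to Q16.16.
    return (a * b) >> 16

x, y = to_fixed(1.5), to_fixed(2.25)
print(fixed_mul(x, y) / SCALE)  # 3.375
```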
Even worse, much of the ill-advised assembler code floating around the web is actually slower than the Pascal equivalents on modern processors, because the trade-offs of processors have changed over time.
Update:
Then simply load the class property into a local var in the prologue of your procedure before you enter the assembler loop, or move the assembler to a separate procedure. Choose your battles.
Studying RTL/VCL source might also yield ideas how to access certain constructs.
Btw, not all low level optimization is done using assembler. On the Pascal level, with some pointer knowledge, a lot can be done too, and stuff like cache optimization can sometimes also be done at the Pascal level (see e.g. Cache optimization of rotating bitmaps).

Floating point support in 64-bit compiler

What should we expect from the floating point support in 64-bit Delphi compiler?
Will 64-bit compiler use SSE to
implement floating point arithmetic?
Will 64-bit compiler support the
current 80-bit floating type
(Extended)?
These questions are closely related, so I ask them as a single question.
I made two posts on the subject (here and there). To summarize: yes, the 64-bit compiler uses SSE2 (double precision), but it doesn't use SSE (single precision). Everything is converted to double precision floats and computed using SSE2 (edit: however, there is an option to control that).
This means, for instance, that while maths on double precision floats is fast, maths on single precision is slow (lots of redundant conversions between single and double precision are thrown in), "Extended" is aliased to "Double", and the precision of intermediate computations is limited to double precision.
Edit: There was an undocumented (at the time) directive that controls SSE code generation: {$EXCESSPRECISION OFF} activates SSE code generation, which brings performance back within expectations.
According to Marco van de Voort in his answer to: How should I prepare my 32-bit Delphi programs for an eventual 64-bit compiler:
x87 FPU is deprecated on x64, and in general SSE2 will be used for floating point. So floating point and its exception handling might work slightly differently, and Extended might not be 80-bit (but 64-bit or, less likely, 128-bit). This also relates to the usual rounding (coprocessor control word) changes when interfacing with C code that expects a different FPU control word.
PhiS commented on that answer with:
I wouldn't say that the x87 FPU is deprecated, but it is certainly the case that Microsoft have decided to do their best to make it that way (and they really don't seem to like 80-bit FP values), although it is clearly technically possible to use the FPU/80-bit floats on Win64.
I just posted an answer to your other question, but I guess it actually should go here:
Obviously, nobody except for Embarcadero can answer this for sure before the product is released.
It is very likely that any decent x64 compiler will use the SSE2 instruction set as a baseline and therefore attempt to do as much floating point computation using SSE features as possible, minimising the use of the x87 FPU. However, it should also be said that there is no technical reason that would prevent the use of the x87 FPU in x64 application code (despite rumours to the contrary which have been around for some time; if you want more info on that point, please have a look at Agner Fog's Calling Convention Manual, specifically chapter 6.1 "Can floating point registers be used in 64-bit Windows?").
Edit 1: Delphi XE2 Win64 indeed does not support 80-bit floating-point calculations out of the box (see e.g. the discussion here), although it allows one to read/write such values. One can bring such capabilities back to Delphi Win64 using a record + class operators, as is done in this TExtendedX87 type (although caveats apply).
For the double=extended bit:
Read Allen Bauer's Twitter account Kylix_rd:
http://twitter.com/kylix_rd
In hindsight logical, because while SSE2 regs are 128 bit, they are used as two 64-bit doubles.
We won't know for sure how the 64-bit Delphi compiler will implement floating point arithmetic until Embarcadero actually ships it. Anything prior to that is just speculation. But once we know for sure it'll be too late to do anything about it.
Allen Bauer's tweets do seem to indicate that they'll use SSE2 and that the Extended type may be reduced to 64 bits instead of 80 bits. I think that would be a bad idea, for a variety of reasons. I've written up my thoughts in a QualityCentral report Extended should remain an 80-bit type on 64-bit platforms
If you don't want your code to drop from 80-bit precision to 64-bit precision when you move to 64-bit Delphi, click on the QualityCentral link and vote for my report. The more votes, the more likely Embarcadero will listen. If they do use SSE2 for 64-bit floating point, which makes sense, then adding 80-bit floating point using the FPU will be extra work for Embarcadero. I doubt they'll do that work unless a lot of developers ask for it.
If you really need it, then you can use the TExtendedX87 unit by Philipp M. Schlüter (PhiS on SO) as mentioned in this Embarcadero forum thread.
@PhiS: when you update your answer with the info from mine, I'll remove mine.

Large numbers in Pascal (Delphi)

Can I work with large numbers (more than 10^400) with built-in method in Delphi?
Not built-in, but you might want to check out MPArith for arbitrary precision maths.
There is also a Delphi BigInt library on SourceForge. I haven't tried it, however, but I include it for completeness.
You could implement your own large number routines using Delphi's operator overloading.
For example: add, subtract, multiply and divide.
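A sketch of that operator-overloading approach, in Python for brevity (in Delphi this would be a record with class operator Add and friends). The BigNum type here is my own toy illustration: it stores base-10 digits and implements only addition, nothing near a production bignum:

```python
class BigNum:
    """Toy arbitrary-precision integer: base-10 digits, least significant first."""

    def __init__(self, s: str):
        self.digits = [int(c) for c in reversed(s)]

    def __add__(self, other: "BigNum") -> "BigNum":
        # Schoolbook addition with carry, one digit at a time.
        out, carry = [], 0
        for i in range(max(len(self.digits), len(other.digits))):
            a = self.digits[i] if i < len(self.digits) else 0
            b = other.digits[i] if i < len(other.digits) else 0
            carry, d = divmod(a + b + carry, 10)
            out.append(d)
        if carry:
            out.append(carry)
        result = BigNum("0")
        result.digits = out
        return result

    def __str__(self) -> str:
        return "".join(str(d) for d in reversed(self.digits)).lstrip("0") or "0"

print(BigNum("999") + BigNum("1"))  # 1000
```

The same digit-array-plus-carry structure scales to numbers of any length, which is exactly what is needed for values beyond 10^400.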
Intel has also added new instructions for multiplication (and possibly also for division) in their latest chip design, to come out in the near future.
One of these instructions is called mulx.
Intel mentions multiple carry streams, which would allow multiplication to be accelerated as well.
x86 already had subtract with borrow and add with carry, so these new instructions do more or less the same for long multiplication and division; there are two methods of doing the multiplication, and by using both, this apparently becomes possible.
In the future Delphi will probably support these new instructions as well, which could make programming something like this extra interesting.
For now these four basic operations might take you somewhere... or perhaps nowhere.
It depends a bit on what you want to do. What kind of maths? Just basic operations like add/sub/mul/div, or more complex maths like cosine, sine, tangent, and all kinds of other functionality?
As far as I know, operator overloading is available for records. I vaguely remember that it might have been added to classes as well, but take that with a grain of salt for now.
Operator overloading used to have a bug when converting between types, but it has been fixed in later Delphi versions, so it should be good to go.
