single, double and precision - delphi

I know that storing single value (or double) can not be very precise. so storing for example 125.12 can result in 125.1200074788. now in delphi their is some usefull function like samevalue or comparevalue that take an epsilon as param and say that 125.1200074788 or for exemple 125.1200087952 is equal.
but i often see in code stuff like : if aSingleVar = 0 then ... and this in fact as i see always work. why ? why storing for exemple 0 in a single var keep the exact value ?

Only values that are in form m*2^e, where m and e are integers can be stored in a floating point variable (not all of them though, it depends on precision). 0 has this form, and 125.12 does not, as it equals 3128/25, and 1/25 is not an integer power of 2.
Comparing 125.12 to a single (or double) precision variable will most probably return always False, because a literal 125.12 will be treated as an extended precision number, and no single (or double) precision number would have such a value.

Looks like a good use for the BigDecimals unit by Rudy Velthuis. Millions of decimal places of accuracy and precision.

Related

Why multiply two double in dart result in very strange number

Can anyone explain why the result is 252.99999999999997 and not 253? What should be used instead to get 253?
double x = 2.11;
double y = 0.42;
print(((x + y) * 100)); // print 252.99999999999997
I am basically trying to convert a currency value with 2 decimal (ie £2.11) into pence/cent (ie 211p)
Thanks
In short: Because many fractional double values are not precise, and adding imprecise values can give even more imprecise results. That's an inherent property of IEEE-754 floating point numbers, which is what Dart (and most other languages and the CPUs running them) are working with.
Neither of the rational numbers 2.11 and 0.42 are precisely representable as a double value. When you write 2.11 as source code, the meaning of that is the actual double values that is closest to the mathematical number 2.11.
The value of 2.11 is precisely 2.109999999999999875655021241982467472553253173828125.
The value of 0.42 is precisely 0.419999999999999984456877655247808434069156646728515625.
As you can see, both are slightly smaller than the value you intended.
Then you add those two values, which gives the precise double result 2.529999999999999804600747665972448885440826416015625. This loses a few of the last digits of the 0.42 to rounding, and since both were already smaller than 2.11 and 0.42, the result is now even more smaller than 2.53.
Finally you multiply that by 100, which gives the precise result 252.999999999999971578290569595992565155029296875.
This is different from the double value 253.0.
The double.toString method doesn't return a string of the exact value, but it does return different strings for different values, and since the value is different from 253.0, it must return a different string. It then returns a string of the shortest number which is still closer to the result than to the next adjacent double value, and that is the string you see.

how to use AtomiccmpExchange with double?

I have a double value that I need to access to inside a backgroundThread. I would like to use somethink like AtomiccmpExchange but seam to not work with double. is their any other equivalent that I can use with double ? I would like to avoid to use Tmonitor.enter / Tmonitor.exit as I need something the most fast as possible. I m under android/ios so under firemonkey
You could type cast the double values into UInt64 values:
PUInt64(#dOld)^ := AtomicCmpExchange(PUInt64(#d)^,PUInt64(#dNew)^,PUInt64(#dCom‌​p)^);
Note that you need to align the variables properly, according to platforms specifications.
As #David pointed out, comparing doublevalues is not the same as comparing UInt64 values. There are some specific double values that will behave out of the ordinary:
A NaN is normally (as specified in IEEE-754) detected by comparing a value by itself.
IsNaN := d <> d;
footnote: Delphi default exception handler is triggered in the event of comparing a NaN, but other compilers may behave differently. In Delphi there is an IsNaN() function to use instead.
Likewise the value zero could be both positive and negative, for a special meaning. Comparing double 0 with double -0 will return true, but comparing the memory footprint will return false.
Maybe use of System.SyncObjs.TInterlocked class methods will be better?

What is the correct constant to use when comparing with the Minimal Single Number in Delphi?

In a loop like this:
cur := -999999; // represent a minimal possible value hold by a Single type
while ... do
begin
if some_value > cur then
cur := some_value;
end;
There is MaxSingle/NegInfinitydefined in System.Math
MaxSingle = 340282346638528859811704183484516925440.0;
NegInfinity = -1.0 / 0.0;
So should I use -MaxSingle or NegInfinity in this case?
I assume you are trying to find the largest value in a list.
If your values are in an array, just use the library function MaxValue(). (If you look at the implementation of MaxValue, you'll see that it takes the first value in the array as the starting point.)
If you must implement it yourself, use -MaxSingle as the starting value, which is approximately -3.40e38. This is the most negative value that can be represented in a Single.
Special values like Infinity and NaN have special rules in comparisons, so I would avoid these unless you are sure about what those rules are. (See also How do arbitrary floating point values compare to infinity?. In fact, it seems NegInfinity would work OK.)
It might help to understand the range of values that can be represented by a Single. In order, most negative to most positive, they are:
NegInfinity
-MaxSingle .. -MinSingle
0
MinSingle .. MaxSingle
Infinity

Objective C ceil returns wrong value

NSLog(#"CEIL %f",ceil(2/3));
should return 1. However, it shows:
CEIL 0.000000
Why and how to fix that problem? I use ceil([myNSArray count]/3) and it returns 0 when array count is 2.
The same rules as C apply: 2 and 3 are ints, so 2/3 is an integer divide. Integer division truncates so 2/3 produces the integer 0. That integer 0 will then be cast to a double precision float for the call to ceil, but ceil(0) is 0.
Changing the code to:
NSLog(#"CEIL %f",ceil(2.0/3.0));
Will display the result you're expecting. Adding the decimal point causes the constants to be recognised as double precision floating point numbers (and 2.0f is how you'd type a single precision floating point number).
Maudicus' solution works because (float)2/3 casts the integer 2 to a float and C's promotion rules mean that it'll promote the denominator to floating point in order to divide a floating point number by an integer, giving a floating point result.
So, your current statement ceil([myNSArray count]/3) should be changed to either:
([myNSArray count] + 2)/3 // no floating point involved
Or:
ceil((float)[myNSArray count]/3) // arguably more explicit
2/3 evaluates to 0 unless you cast it to a float.
So, you have to be careful with your values being turned to int's before you want.
float decValue = (float) 2/3;
NSLog(#"CEIL %f",ceil(decValue));
==>
CEIL 1.000000
For you array example
float decValue = (float) [myNSArray count]/3;
NSLog(#"CEIL %f",ceil(decValue));
It probably evaluates 2 and 3 as integers (as they are, obviously), evaluates the result (which is 0), and then converts it to float or double (which is also 0.00000). The easiest way to fix it is to type either 2.0f/3, 2/3.0f, or 2.0f/3.0f, (or without "f" if you wish, whatever you like more ;) ).
Hope it helps

Small numbers in Objective C 2.0

I created a calculator class that does basic +,-, %, * and sin, cos, tan, sqrt and other math functions.
I have all the variables of type double, everything is working fine for big numbers, so I can calculate numbers like 1.35E122, but the problem is with extremely small numbers. For example if I do calculation 1/98556321 I get 0 where I would like to get something 1.01464E-8.
Should I rewrite my code so that I only manipulate NSDecimalNumber's and if so, what do I do with sin and cos math functions that accept only double and long double values.
1/98556321
This division gives you 0 because integer division is performed here - the result is an integer part of division. The following line should give you floating point result:
1/(double)98556321
integer/integer is always an integer
So either you convert the upper or the lower number to decimal
(double)1/98556321
or
1/(double)98556321
Which explicitely convert the number to double.
Happy coding....

Resources