Why does a calculation using real give a different result from one using int?

I have this code, for example:
(a) writeln ('real => ', exp(3*Ln(3)):0:0); // returns 27
(b) writeln ('int => ', int(exp(3*Ln(3))):0:0); // returns 26
Is this a bug?
The code calculates 3^3 (exponentiation using the ln and exp functions), but the conversion from real to int fails: case (a) returns 27 and case (b) returns 26, when both should be 27.
How can I solve it?
Thanks very much for your help.
PS: Assigning the result to an integer variable using trunc does not change the result either.

No, it is not a bug. Computers simply don't have infinite precision, so the result is not exactly 27, but perhaps 26.999999999 or something. And so, when you int or trunc it, it ends up as 26. Use Round instead.
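As a rough illustration of truncating versus rounding, here is a minimal sketch in Python rather than Delphi (Python floats are IEEE 754 doubles, so the effect is the same in kind). Whether exp(3*ln(3)) lands just below or just above 27 depends on the platform's math library and float width, so the sketch also uses 0.29 * 100, which is stored just below 29 as a double.

```python
import math

# 3^3 computed the same way as in the question: exp(3 * ln 3).
x = math.exp(3 * math.log(3))

truncated = math.trunc(x)  # drops the fraction; 26 if x landed just below 27
rounded = round(x)         # nearest integer; 27 either way

# A product that is known to land just below the integer as a double.
y = 0.29 * 100             # stored as 28.999999999999996
y_truncated = math.trunc(y)  # 28
y_rounded = round(y)         # 29

print(x, truncated, rounded)
print(y, y_truncated, y_rounded)
```

Rounding absorbs the tiny error in both cases, while truncation exposes it whenever the stored value falls on the low side.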

The expression you're printing evaluates to something slightly less than 27 due to the usual floating-point errors. The computer cannot exactly represent the natural logarithm of 3, so any further calculations based on it will have errors, too.
In comments, you claim exp(3*ln(3)) = 27.000, but you've shown no programmatic evidence for that assertion. Your code says exp(3*ln(3)) = 27, which is less precise. It prints that because you explicitly told WriteLn to use less precision. The :0:0 part isn't just decoration. It means that you want to print the result with zero decimal places. When you tell WriteLn to do that, it rounds to that many decimal places. In this case, it rounds up. But when you introduce the call to Int, you truncate the almost-27 value to exactly 26, and then WriteLn trivially rounds that to 26 before printing it.
If you tell WriteLn to display more decimal places, you should see different results. Consult the documentation for Write for details on what the numbers after the colons mean.
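To see the effect of the requested precision, here is a small Python sketch (an analogue of the :0:0 format, not Delphi's WriteLn itself). It formats the same value once with zero decimal places and once with 17 significant digits, which is enough to identify a double uniquely.

```python
import math

x = math.exp(3 * math.log(3))

zero_places = f"{x:.0f}"   # zero decimal places: rounds to a whole number
many_places = f"{x:.17g}"  # 17 significant digits: shows any stored error

print(zero_places)
print(many_places)
```

The zero-place output hides whatever error is stored in the value; the 17-digit output reveals it if there is one.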

Working with floating point doesn't always give a 100% exact result. The reason is that a binary floating-point variable can't always represent values exactly. The same is true of decimal numbers. Take 1/3: with 6 digits of decimal precision it would be 0.333333. Then 0.333333 * 3 = 0.999999, and Int(0.999999) = 0.
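The 6-digit decimal example can be reproduced with Python's decimal module, used here only as a convenient way to get a limited-precision decimal type:

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 6                     # six significant decimal digits
    third = Decimal(1) / Decimal(3)  # 0.333333
    product = third * 3              # 0.999999, not 1

truncated = int(product)             # int() truncates toward zero: 0

print(third, product, truncated)
```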
Here is some literature about it...
What Every Computer Scientist Should Know About Floating-Point Arithmetic

You should also take a look at Rudy Velthuis' article:
http://rvelthuis.de/articles/articles-floats.html

Not a bug. It is just yet another example of how floating arithmetic works on a computer. Floating point arithmetic is but an approximation of how the real numbers work in mathematics. There is no guarantee, and there can be no such guarantee, that floating point results will be infinitely accurate. In fact, you should expect them to almost always be imprecise to some degree.

Related

Delphi Roundto and FormatFloat Inconsistency

I'm getting a rounding oddity in Delphi 2010, where some numbers are rounding down in roundto, but up in formatfloat.
I'm entirely aware of binary representation of decimal numbers sometimes giving misleading results, but in that case I would expect formatfloat and roundto to give the same result.
I've also seen advice that this is the sort of thing "Currency" should be used for, but as you can see below, Currency and Double give the same results.
program testrounding;
{$APPTYPE CONSOLE}
{$R *.res}
uses
  System.SysUtils, Math;
var
  d: Double;
  c: Currency;
begin
  d := 534.50;
  c := 534.50;
  writeln('Format: ' + formatfloat('0', d));
  writeln('Roundto: ' + formatfloat('0', roundto(d, 0)));
  writeln('C Format: ' + formatfloat('0', c));
  writeln('C Roundto: ' + formatfloat('0', roundto(c, 0)));
  readln;
end.
The results are as follows:
Format: 535
Roundto: 534
C Format: 535
C Roundto: 534
I've looked at Why is the result of RoundTo(87.285, -2) => 87.28 and the suggested remedies do not seem to apply.

First of all, we can remove Currency from the question, because the two functions that you use don't have Currency overloads. The value is converted to an IEEE754 floating point value and then follows the same path as your Double code.
Let's look at RoundTo first of all. It is quick to check, using the debugger, or an additional Writeln that RoundTo(d,0) = 534. Why is that?
Well, the documentation for RoundTo says:
Rounds a floating-point value to a specified digit or power of ten using "Banker's rounding".
Indeed in the implementation of RoundTo we see that the rounding mode is temporarily switched to TRoundingMode.rmNearest before being restored to its original value. The rounding mode only applies when the value is exactly half way between two integers. Which is precisely the case we have here.
So Banker's rounding applies. Which means that when the value is exactly half way between two integers, the rounding algorithm chooses the adjacent even integer.
So it makes sense that RoundTo(534.5,0) = 534, and equally you can check that RoundTo(535.5,0) = 536.
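Banker's rounding is easy to try out in Python, whose built-in round also rounds halfway cases to the even neighbour. This is an analogue of the policy, not of Delphi's RoundTo implementation. All the inputs below are exactly representable in binary, so each one really is a halfway case.

```python
# Each value sits exactly halfway between two integers.
halfway = [0.5, 1.5, 2.5, 534.5, 535.5]

# round() picks the even neighbour for every one of them.
results = [round(v) for v in halfway]

print(results)  # [0, 2, 2, 534, 536]
```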
Understanding FormatFloat is quite a different matter. Quite frankly its behaviour is somewhat opaque. It performs an ad hoc rounding in code that differs for different platforms. For instance it is assembler on 32 bit Windows, but Pascal on 64 bit Windows. The overall approach appears to be to take the mantissa of the floating point value, convert it to an integer, convert that to text digits, and then perform the rounding based on those text digits. No respect is paid to the current rounding mode when the rounding is performed, and the algorithm appears to implement the round half away from zero policy. However, even that is not implemented robustly for all possible floating point values. It works correctly for your value, but for values with more digits in the mantissa the algorithm breaks down.
In fact it is fairly well known that the Delphi RTL routines for converting between floating point values and text are fundamentally broken by design. There are no routines in the Delphi RTL that can correctly convert from text to float, or from float to text. In fact, I have recently implemented my own conversion routines, that do this correctly, based on existing open source code used by other language runtimes. One of these days I will get around to publishing this code for use by others.
I'm not sure what your exact needs are, but if you are wishing to exert some control over rounding, then you can do so if you take charge of the rounding. Whilst RoundTo always uses Banker's rounding, you can instead use Round which uses the current rounding mode. This will allow you to perform the round using the rounding algorithm of your choice (by calling SetRoundMode), and then you can convert the rounded value to text. That's the key. Keep the value in an arithmetic type, perform the rounding, and only convert to text at the very last moment, after the correct rounding has been applied.
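The "round first, convert to text last" idea can be sketched in Python. The decimal module's rounding constants stand in for SetRoundMode here, and round_then_format is a hypothetical helper name, so treat this as an illustration of the approach rather than a Delphi recipe.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_then_format(value: float, mode: str) -> str:
    """Round to a whole number with an explicit policy, then make text."""
    # Decimal(value) converts the double exactly, without going through text.
    rounded = Decimal(value).quantize(Decimal("1"), rounding=mode)
    # Only now, after the rounding decision is made, convert to a string.
    return str(rounded)


bankers = round_then_format(534.5, ROUND_HALF_EVEN)  # '534'
half_up = round_then_format(534.5, ROUND_HALF_UP)    # '535'

print(bankers, half_up)
```

In the decimal module ROUND_HALF_UP rounds halves away from zero, which is the policy FormatFloat appears to follow for this value.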

In this case, the value 534.5 is exactly representable in Double precision.
Looking into the source code reveals that the FormatFloat function rounds upwards if the last pending digit is 5 or more.
RoundTo uses the Banker's rounding, and rounds to nearest even number (534) in this case.
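Whether a literal is exactly representable as a double can be checked directly. The following Python sketch compares the stored double with the decimal text, using 534.5 from this question and 87.285 from the linked one:

```python
from decimal import Decimal

# Decimal(float) shows the exact stored value; Decimal(str) is the literal.
exact_534_5 = Decimal(534.5) == Decimal("534.5")     # True: 1069 / 2
ratio = (534.5).as_integer_ratio()                   # (1069, 2)

exact_87_285 = Decimal(87.285) == Decimal("87.285")  # False: not a binary fraction

print(exact_534_5, ratio, exact_87_285)
```

Because 534.5 is stored exactly, the different outputs come purely from the two rounding policies, not from representation error.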

GForth: Convert floating point number to String

A simple question that turned out to be quite complex:
How do I turn a float to a String in GForth? The desired behavior would look something like this:
1.2345e fToString \ takes 1.2345e from the float stack and pushes (addr n) onto the data stack

After a lot of digging, one of my colleagues found it:
f>str-rdp ( rf +nr +nd +np -- c-addr nr )
https://www.complang.tuwien.ac.at/forth/gforth/Docs-html-history/0.6.2/Formatted-numeric-output.html
Convert rf into a string at c-addr nr. The conversion rules and the
meanings of nr +nd np are the same as for f.rdp.
And from f.rdp:
f.rdp ( rf +nr +nd +np – )
https://www.complang.tuwien.ac.at/forth/gforth/Docs-html/Simple-numeric-output.html
Print float rf formatted. The total width of the output is nr. For
fixed-point notation, the number of digits after the decimal point is
+nd and the minimum number of significant digits is np. Set-precision has no effect on f.rdp. Fixed-point notation is used if the number of
significant digits would be at least np and if the number of digits
before the decimal point would fit. If fixed-point notation is not
used, exponential notation is used, and if that does not fit,
asterisks are printed. We recommend using nr>=7 to avoid the risk of
numbers not fitting at all. We recommend nr>=np+5 to avoid cases where
f.rdp switches to exponential notation because fixed-point notation
would have too few significant digits, yet exponential notation offers
fewer significant digits. We recommend nr>=nd+2, if you want to have
fixed-point notation for some numbers. We recommend np>nr, if you want
to have exponential notation for all numbers.
In humanly readable terms, these functions require a number on the float-stack and three numbers on the data stack.
The first number parameter tells it how long the string should be, the second how many decimals you would like, and the third the minimum number of significant digits (which roughly translates to precision). A lot of implicit math is performed to determine the final string format that is produced, so some tinkering is almost always required to make it behave the way you want.
Testing it out (we don't want to rebuild f., but to produce a format that will be accepted as floating-point number by forth to EVALUATE it again, so the 1.2345E0 notation is on purpose):
PI 18 17 17 f>str-rdp type \ 3.14159265358979E0 ok
PI 18 17 17 f.rdp \ 3.14159265358979E0 ok
PI f. \ 3.14159265358979 ok

I couldn't find the exact word for this, so I looked into Gforth sources.
Apparently, you could go with represent word that prints the most significant numbers into supplied buffer, but that's not exactly the final output. represent returns validity and sign flags, as well as the position of decimal point. That word then is used in all variants of floating point printing words (f., fp. fe.).
Probably the easiest way would be to substitute emit with your own word (emit is a deferred word) that saves the data where you need it, use one of the available floating-point printing words, and then restore emit to its original value.
I'd like to hear the preferred solution too...

swift 2.0 NSDecimalNumber possible discrepancy converting to long

I can't make heads or tails of this. I am using NSDecimalNumber to truncate the fractional portion from a string. This works in most cases, but apparently not in the case of infinite decimals (or just too many). Here is an example:
print(NSDecimalNumber(string: "49.81666666666666666").longLongValue)
print(NSDecimalNumber(string: "49.816666666666666666").longLongValue)
print(NSDecimalNumber(string: "49.8166666666666666666").longLongValue)
The first line prints 49, the second -5, and the last one 0. I know I can use the rounding function to do the same thing, and that is what I will probably use instead, but doesn't this seem odd? I know it isn't just converting the float bit pattern into a long or else the results would be completely different.

Delphi - Comparing float values

I have a function that returns a float value like this:
1.31584870815277
I need a function that returns TRUE comparing the value and the two numbers after the dot.
Example:
if 1.31584870815277 = 1.31 then ShowMessage('same');
Sorry for my english.
Can someone help me? Thanks

Your problem specification is a little vague. For instance, you state that you want to compare the values after the decimal point. In which case that would imply that you wish 1.31 to be considered equal to 2.31.
On top of this, you will need to specify how many decimal places to consider. A number like 1.31 is not representable exactly in binary floating point. Depending on the type you use, the closest representable value could be less than or greater than 1.31.
My guess is that what you wish to do is to use round to nearest, to a specific number of decimal places. You can use the SameValue function from the Math unit for this purpose. In your case you would write:
SameValue(x, y, 0.01)
to test for equality up to a tolerance of 0.01.
This may not be precisely what you are looking for, but then it's clear from your question that you don't yet know exactly what you are looking for. If your needs are specifically related to decimal representation of the values then consider using a decimal type rather than a binary type. In Delphi that would be Currency.
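To make the difference between "equal within a tolerance" and "same first two decimals" concrete, here is a hedged Python sketch. math.isclose with an absolute tolerance plays the role of SameValue(x, y, 0.01), and first_two_decimals is a hypothetical helper that truncates the decimal text of ordinary-sized values.

```python
import math
from decimal import Decimal, ROUND_DOWN

x = 1.31584870815277

# Tolerance comparison, analogous to SameValue(x, y, 0.01):
# true when |x - y| <= 0.01.
within_tolerance = math.isclose(x, 1.31, rel_tol=0.0, abs_tol=0.01)

# A tolerance is not the same as "same first two decimals":
# 1.319 and 1.321 are within 0.01 of each other but start 1.31 and 1.32.
close_but_different_digits = math.isclose(1.319, 1.321, rel_tol=0.0, abs_tol=0.01)


def first_two_decimals(value: float) -> Decimal:
    # str() gives the shortest decimal text that round-trips the double;
    # ROUND_DOWN then cuts it off after two decimal places.
    return Decimal(str(value)).quantize(Decimal("0.01"), rounding=ROUND_DOWN)


same_digits = first_two_decimals(x) == Decimal("1.31")

print(within_tolerance, close_but_different_digits, same_digits)
```

Which of the two tests is right depends on what "the same" is supposed to mean, which is exactly the part of the question that is underspecified.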

If speed isn't the highest priority, you can use string conversion:
if Copy(1.31584870815277.ToString, 1, 4) = '1.31' then ShowMessage('same');

Erlang floats and trunc

R15b on Windows gives:
>trunc(1.9999999999999999999).
2
For that matter, just typing the float returns:
> 1.9999999999999999999.
2.0
AFAIK, the truncate function should just drop the fractional portion (at least that's what I need, anyway). A floor function might also do the trick AFAIK, but the floor implementations I've seen posted online use... you guessed it... trunc.
I'm not nitpicking this, I actually need this to be correct for a program I'm developing.
Any ideas on this?
Thanks.

Your problem is that decimal numbers are stored in an IEEE-compliant binary representation (32, 64 or 128 bit), so 1.9999999999999999999 is read as the nearest representable value, 2.0, before trunc ever runs.
If you really need precision, you should use other numerical data structures, such as Binary Coded Decimal or fixed-point arithmetic.
Hope this helps!
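The same effect shows up in any language that uses 64-bit doubles. This Python sketch (not Erlang) parses the literal once as a double and once as an arbitrary-precision decimal, then truncates both:

```python
import math
from decimal import Decimal

literal = "1.9999999999999999999"

as_double = float(literal)               # nearest double is exactly 2.0
double_truncated = math.trunc(as_double)  # 2: nothing is left to drop

as_decimal = Decimal(literal)            # keeps every digit of the literal
decimal_truncated = int(as_decimal)      # 1: truncation toward zero

print(as_double, double_truncated, decimal_truncated)
```

So trunc is doing its job; the information is already lost when the literal is converted to a double.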

If you want to truncate a float, maybe this can help:
select substring (convert(varchar(14), CAST (20160303013458 as varchar(14))) , 1 , 8)
