I see that, unlike Double, Int in Swift does not have infinity. The only things we have are Int.max and Int.min, which are actual finite numbers, and (Int.max - 1) is not the same as Int.max. I need to perform operations such as:
//maximumDuration is Integer...width, widthPerSecond, currentWidth are CGFloat, all positive
width = max(CGFloat(maximumDuration) * widthPerSecond, currentWidth)
So if maximumDuration is Int.max, CGFloat(maximumDuration) * widthPerSecond may not be Int.max. In fact, comparisons may not be reliable due to overflow.
What's the way to get a true infinity when using the Int datatype? One way would be to use Double instead of Int, but that would require many type casts everywhere else in the code.
All the integer types are simple scalars. All the bits hold value (plus a sign bit for the signed variants). There are no spare bits for marking things like NaN (not a number), infinity, or normalized/denormalized values.
There is simply no way to represent infinity with binary integer types. This is not unique to Swift; it is true of just about all languages and platforms.
Floating point types use an IEEE format that reserves some bits for special cases like infinity.
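For example, a quick check in Swift (the special values exist only on the floating point side):
// Double reserves bit patterns for the IEEE special values; Int has only finite bounds.
print(Double.infinity > Double.greatestFiniteMagnitude)  // true
print(Double.nan == Double.nan)                          // false: NaN never compares equal to anything
print(Int.max, Int.min)                                  // just very large (but finite) numbers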
You could create an enum with associated values that had cases for negative and positive infinity, NaN, and the like, but you'd have the same casting/code-rework problems that you're trying to avoid with floats.
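A rough sketch of that enum idea in Swift (ExtendedInt is a hypothetical type, shown only to illustrate the shape of the rework involved):
enum ExtendedInt: Equatable {
    case finite(Int)
    case positiveInfinity
    case negativeInfinity
}

// Every operation then needs explicit handling of the special cases, e.g. an ordering:
func isLess(_ lhs: ExtendedInt, _ rhs: ExtendedInt) -> Bool {
    switch (lhs, rhs) {
    case let (.finite(a), .finite(b)):
        return a < b
    case (.negativeInfinity, .finite), (.negativeInfinity, .positiveInfinity), (.finite, .positiveInfinity):
        return true
    default:
        return false   // +infinity on the left, -infinity on the right, or equal infinities
    }
}
Every comparison, arithmetic operator and conversion would need this kind of treatment, which is exactly the rework mentioned above.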
Edit:
Interestingly, in Binary Coded Decimal (BCD) there are spare bits. I wonder if there is a standard for indicating special values like infinity in BCD?
Related
I wanted to implement a simple parser for double values (just for fun). However, I noticed that when handling the decimal shift, I get rounding errors when multiplying the value by powers of 10.
I'm wondering how double.Parse ensures that the result value is as close to the string value as possible?
Just an example:
When parsing 0.0124 (=124*0.0001), I get 0.012400000000000001. However, double.Parse displays 0.0124 as expected.
The solution seems quite simple:
Just don't multiply by values smaller than 1 (e.g. 0.0001 in the above example); instead, divide by the reciprocal (10000 in the above example).
I think the reason is simply that integer values have an exact representation (up to 2^53), so both operands of the division are exact and the quotient is rounded only once. With the multiplication, the factor smaller than 1 was already rounded before the product was computed, so the error compounds.
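The effect is easy to reproduce; here is a small sketch (in Swift rather than C#, but this is plain IEEE-754 double behaviour, so the language does not matter):
let digits = 124.0
let viaMultiply = digits * 0.0001    // two roundings: 0.0001 is itself inexact, then the product is rounded again
let viaDivide   = digits / 10000.0   // one rounding: 124 and 10000 are exact, only the quotient is rounded
print(viaMultiply == viaDivide)      // typically false; the divided result is the correctly rounded 0.0124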
I'm having a difficult time understanding the size differences of the data types.
I have an attribute called displayOrder with type of Integer 16. I use this attribute to maintain a display order of tableViewCells, added by a user in a tableView. I set the value with plain numbers, "1, 2, 3", and it's working fine.
But there are also a lot of other options, like Integer 32, Integer 64, Decimal, Float, and Double. I did my own research and found that a Float can have a decimal point, and a Double is double the size of a Float (I'm not sure of the difference between Decimal and Float).
My question is: if the difference between these is just the size, does that mean I have to worry about displayOrder going up to, for example, 1000 and exceeding the bits of Integer 16 (does it ever exceed the size?), and that I should therefore use Integer 32 instead? And if I set it to Integer 64 and displayOrder is just 1, do I have to worry about slow performance?
I've seen the NSAttributeType docs but I'm not sure what the numbers stand for.
Thanks
I think #choppin meant that speed-wise it won't make much of a difference. Size-wise it very much does: an int16 is half the size of an int32, and having a ton of int32s when you only need int16s will give you a larger memory footprint. The number here represents the number of bits the variable takes up in memory.
If you will only have a couple then don't worry about it, but if you will have a large data set, then it becomes an issue.
Also, if the number you will store could be very large, then you need the bigger option: an unsigned 32-bit integer can store values up to 4,294,967,295 (2^32 - 1), or about half that (2,147,483,647) if the integer is signed, which it is by default. If you go over the maximum of a signed integer, the value overflows and wraps around to the negative end of the range (Swift's plain arithmetic actually traps on overflow unless you use the wrapping operators).
Since memory is a concern on a mobile device, which option you choose warrants some thought, though it warrants less than it did a few years ago.
It shouldn't make a big difference in performance which one you use, but I would stick with Integer 32. That gives you 2 to the power of 32 values (which should be more than enough for a display order).
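To make the ranges concrete, here is a small Swift sketch (Swift's fixed-width integer types mirror Core Data's Integer 16/32/64); note that plain Swift arithmetic traps on overflow, so you only get wrap-around with the &-operators:
print(Int16.max)        // 32767
print(Int32.max)        // 2147483647
print(Int64.max)        // 9223372036854775807

var order = Int16.max
order = order &+ 1      // wrapping addition: order is now Int16.min (-32768)
// order = order + 1    // plain addition would trap at runtime on overflow
print(order)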
This question already has answers here: Is floating point math broken? (31 answers). Closed 8 years ago.
I am reading from a txt file and populating a core data entity.
At some point I read the value from the TXT file and the value is @"0.9".
Now I assign it to a CGFloat.
CGFloat value = (CGFloat)[stringValue floatValue];
debugger shows value as 0.89999997615814208 !!!!!!?????
Why? A bug? Even if it thinks [stringValue floatValue] is a double, casting it to CGFloat should not produce that abnormality.
The binary floating point representation used for float can't store the value exactly. So it uses the closest representable value.
It's similar to decimal numbers: It's impossible to represent one third in decimal (instead we use an approximate representation like 0.3333333).
Because to store a float in binary you can only approximate it by summing up fractions like 1/2, 1/4, 1/8, etc. For 0.9 (and many other values) there is no exact representation that can be constructed by summing fractions like this. Whereas if the value were, say, 0.25, you could represent that exactly as 1/4.
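You can see the stored approximation directly; a small Swift sketch (the stored value is the same for any IEEE-754 single-precision float, whatever the language):
import Foundation

let f: Float = 0.9
print(String(format: "%.17f", f))      // prints the nearest representable Float, which is slightly below 0.9
print(String(format: "%.17f", 0.25))   // 0.25 = 1/4 is an exact sum of binary fractions, so it prints cleanly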
Floating point imprecision, check out http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
Basically it has to do with how floating point numbers work: they don't store your decimal value directly, they store a significand and an exponent. The actual value is reconstructed as significand * 2^exponent, and that reconstruction doesn't always land exactly on the number you assigned.
Good morning all,
I'm having some issues with floating point math, and have gotten totally lost in ".to_f"'s, "*100"'s and ".0"'s!
I was hoping someone could help me with my specific problem, and also explain exactly why their solution works so that I understand this for next time.
My program needs to do two things:
Sum a list of decimals, determine if they sum to exactly 1.0
Determine the difference between 1.0 and a sum of numbers, and set the value of a variable to the exact difference so that the sum equals 1.0.
For example:
[0.28, 0.55, 0.17] -> should sum to 1.0, however I keep getting 1.xxxxxx. I am implementing the sum in the following fashion:
sum = array.inject(0.0) { |sum, x| sum + (x * 100) } / 100
The reason I need this functionality is that I'm reading in a set of decimals that come from Excel. They are not 100% precise (they are missing some decimal places), so the sum usually comes out to 0.999999xxxxx or 1.000xxxxx. For example, I will get values like the following:
0.568887955,0.070564759,0.360547286
To fix this, I am ok taking the sum of the first n-1 numbers, and then changing the final number slightly so that all of the numbers together sum to 1.0 (must meet validation using the equation above, or whatever I end up with). I'm currently implementing this as follows:
sum = 0.0
array[0...-1].each do |item|   # sum the first n-1 values
  sum += item * 100.0
end
array[-1] = (100 - sum.round) / 100.0   # the last value takes up whatever difference remains
I know I could do this with inject, but was trying to play with it to see what works. I think this is generally working (from inspecting the output), but it doesn't always meet the validation sum above. So if need be I can adjust this one as well. Note that I only need two decimal precision in these numbers - i.e. 0.56 not 0.5623225. I can either round them down at time of presentation, or during this calculation... It doesn't matter to me.
Thank you VERY MUCH for your help!
If accuracy is important to you, you should not be using floating point values, which are inherently approximate. Ruby has exact numeric types for doing arithmetic where accuracy is important. They are, off the top of my head, BigDecimal, Rational and Complex, depending on what you actually need to calculate.
It seems that in your case what you're looking for is BigDecimal, which stores a decimal number exactly, digit for digit, with as many digits after the decimal point as it needs (in contrast to a binary floating point value, which can only approximate most decimal fractions).
When you read from Excel and deliberately convert those strings like "0.9987" to floating point, you immediately lose the exact value contained in the string.
require "bigdecimal"
BigDecimal("0.9987")
That value is exact. It is 0.9987: not 0.998732109, nor merely something close to it, but exactly 0.9987. You may use all the usual arithmetic operations on it. Provided you don't mix floating point values into those operations, the results will remain exact.
If your array contains the raw strings you got from Excel (i.e. you haven't #to_f'd them), then this will give you a BigDecimal that is the difference between the sum of them and 1.
1 - array.map{|v| BigDecimal(v)}.reduce(:+)
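(This thread is Ruby; for comparison, the same idea in Swift would use Foundation's Decimal, a base-10 type that plays roughly the role of BigDecimal here:)
import Foundation

// Parse the raw strings directly into Decimal; never route them through Float/Double.
let values = ["0.28", "0.55", "0.17"].compactMap { Decimal(string: $0) }
let total = values.reduce(Decimal(0), +)
print(total == 1)   // true: 0.28 + 0.55 + 0.17 is exact in decimal arithmetic
print(1 - total)    // 0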
Either:
continue using floats and round(2) your totals: 12.341.round(2) # => 12.34
use integers (i.e. cents instead of dollars)
use BigDecimal; you won't need to round after summing them, as long as you start with BigDecimals that have only two decimal places.
I think that algorithms have a great deal more to do with accuracy and precision than a choice of IEEE floating point over another representation.
People used to do some fine calculations while still dealing with accuracy and precision issues. They'd do it by managing the algorithms they'd use and understanding how to represent functions more deeply. I think that you might be making a mistake by throwing aside that better understanding and assuming that another representation is the solution.
For example, no polynomial representation of a function will deal with an asymptote or singularity properly.
Don't discard floating point so quickly. It could be that being smarter about the way you use it will do just fine.
AFAIK, the Currency type in Delphi Win32 depends on the processor's floating point precision. Because of this I'm having rounding problems when comparing two Currency values, getting different results depending on the machine.
For now I'm using the SameValue function passing a Epsilon parameter = 0.009, because I only need 2 decimal digits precision.
Is there any better way to avoid this problem?
The Currency type in Delphi is a 64-bit integer scaled by 1/10,000; in other words, its smallest increment is equivalent to 0.0001. It is not susceptible to precision issues in the same way that floating point code is.
However, if you are multiplying your Currency numbers by floating-point types, or dividing your Currency values, the rounding does need to be worked out one way or the other. The FPU controls this mechanism (it's called the "control word"). The Math unit contains some procedures which control this mechanism: SetRoundMode in particular. You can see the effects in this program:
{$APPTYPE CONSOLE}
uses Math;
var
  x: Currency;
  y: Currency;
begin
  SetRoundMode(rmTruncate);
  x := 1;
  x := x / 6;
  SetRoundMode(rmNearest);
  y := 1;
  y := y / 6;
  Writeln(x = y); // false
  Writeln(x - y); // 0.0001; i.e. 0.1666 vs 0.1667
end.
It is possible that a third-party library you are using is setting the control word to a different value. You may want to set the control word (i.e. rounding mode) explicitly at the starting point of your important calculations.
Also, if your calculations ever transfer into plain floating point and then back into Currency, all bets are off - too hard to audit. Make sure all your calculations are in Currency.
No, Currency is not a floating point type. It is a fixed-precision decimal, implemented with integer storage. It can be compared exactly, and does not have the rounding issues of, say, Double. Therefore, if you are seeing inexact values in your Currency variables, the problem is not the Currency type itself, but what you are putting into it. Most likely, you have a floating-point calculation somewhere else in your code. Since you do not show that code, it's hard to be of more help on this question. But the solution, generally speaking, will be to round your floating point numbers to the correct precision before storing in the Currency variable, rather than doing an inexact comparison on the Currency variables.
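As an illustration of that advice (round once, explicitly, before storing into an exact fixed-point value), here is a minimal sketch in Swift; FixedCurrency is a hypothetical type, not Delphi's Currency, but it uses the same scaled-by-10,000 Int64 idea:
struct FixedCurrency: Equatable {
    var raw: Int64                                    // the amount in 1/10,000ths

    init(rounding value: Double) {
        raw = Int64((value * 10_000).rounded())       // one explicit, well-defined rounding step
    }
}

let a = FixedCurrency(rounding: 0.1 + 0.2)            // the Double sum is not exactly 0.3
let b = FixedCurrency(rounding: 0.3)
print(a == b)                                         // true: both round to exactly 3000 ten-thousandths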
A faster and safer way of comparing two Currency values is certainly to map the variables to their internal Int64 representation:
function CompCurrency(var A, B: currency): Int64;
var
  A64: Int64 absolute A;
  B64: Int64 absolute B;
begin
  result := A64 - B64;
end;
This will avoid any rounding error during the comparison (it works directly with the *10000 integer values), and it will be faster than the default FPU-based implementation (especially under the 64-bit XE2 compiler).
See this article for additional information.
If your situation is like mine, you might find this approach helpful. I work mostly in payroll. If a business has say 3 departments and wants to charge the cost of an employee evenly among those three departments, there are a lot of times when there will be rounding issues.
What I have been doing is to loop through the departments, charging each one a third of the total cost and adding the cost charged to a subtotal (Currency) variable. But when the loop variable equals the limit, rather than multiplying by the fraction, I subtract the subtotal variable from the total cost and put that in the last department. Since the journal entries that result from this process always have to balance, I believe that it has always worked.
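A sketch of that allocation pattern (in Swift, with hypothetical names; amounts are integer cents, and the last bucket absorbs whatever rounding remainder is left so the pieces always balance):
func splitEvenly(totalCents: Int, parts: Int) -> [Int] {
    precondition(parts > 0)
    let share = totalCents / parts                           // integer division, rounds toward zero
    var result = Array(repeating: share, count: parts)
    result[parts - 1] = totalCents - share * (parts - 1)     // the last part takes the remainder
    return result
}

print(splitEvenly(totalCents: 10_000, parts: 3))             // [3333, 3333, 3334], sums back to 10000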
See thread:
D7 / DUnit: all CheckEquals(Currency, Currency) tests suddenly fail ...
https://forums.codegear.com/thread.jspa?threadID=16288
It looks like a change on our development workstations caused Currency comparisons to fail. We have not found the root cause, but on two computers running Windows 2000 SP4, and independent of the version of gds32.dll (InterBase 7.5.1 or 2007) and Delphi (7 and 2009), this line
TIBDataBase.Create(nil);
changes the value of the 8087 control word from $1372 to $1272.
And all Currency comparisons in unit tests will fail with funny messages like
Expected: <12.34> - Found: <12.34>
The gds32.dll has not been modified, so I guess that there is a dependency in this library on a third-party DLL which modifies the control word.
To avoid possible issues with currency rounding in Delphi, use 4 decimal places.
This will ensure that you never have rounding issues when doing calculations with very small amounts.
"Been there. Done That. Written the unit tests."