Project Euler -Prob. #20 (Lua) - lua

http://projecteuler.net/problem=20
I've written code to figure out this problem, however, it seems to be accurate in some cases, and inaccurate in others. When I try solving the problem to 10 (answer is given in question, 27) I get 27, the correct answer. However, when I try solving the question given (100) I get 64, the incorrect answer, as the answer is something else.
Here's my code:
function factorial(num)
if num>=1 then
return num*factorial(num-1)
else
return 1
end
end
function getSumDigits(str)
str=string.format("%18.0f",str):gsub(" ","")
local sum=0
for i=1,#str do
sum=sum+tonumber(str:sub(i,i))
end
return sum
end
print(getSumDigits(tostring(factorial(100))))
64
Since Lua converts large numbers into scientific notation, I had to convert it back to standard notation. I don't think this is a problem, though it might be.
Is there any explanation to this?

Unfortunately, the correct solution is more difficult. The main problem here is that Lua uses 64bit floating point variables, which means this applies.
Long story told short: The number of significant digits in a 64bit float is much too small to store a number like 100!. Lua's floats can store a maximum of 52 mantissa bits, so any number greater than 2^52 will inevitably suffer from rounding errors, which gives you a little over 15 decimal digits. To store 100!, you'll need at least 158 decimal digits.
The number calculated by your factorial() function is reasonably close to the real value of 100! (i.e. the relative error is small), but you need the exact value to get the right solution.
What you need to do is implement your own algorithms for dealing with large numbers. I actually solved that problem in Lua by storing each number as a table, where each entry stores one digit of a decimal number. The complete solution takes a little more than 50 lines of code, so it's not too difficult and a nice exercise.

Related

Neo4j floating point sum different results

I am using neo4j to calculate some statistics on a data set. For that I am often using sum on a floating point value. I am getting different results depending on the circumstances. For example, a query that does this:
...
WITH foo
ORDER BY foo.fooId
RETURN SUM(foo.Weight)
Returns different result than the query that simply does the sum:
...
RETURN SUM(foo.Weight)
The differences are miniscule (293.07724195098984 vs 293.07724195099007). But it is enough to make simple equality checks fail. Another example would be a different instance of the database, loaded with the same data using the same loading process can produce the same issue (the dbs might not be 1:1, the load order of some relations might be different). I took the raw values that neo4j sums (by simply removing the SUM()) and verified that they are the same in all cases (different dbs and ordered/not ordered).
What are my options here? I don't mind losing some precision (I already tried to cut down the precision from 15 to 12 decimal places but that did not seem to work), but I need the results to match up.
Because of rounding errors, floats are not associative. (a+b)+c!=a+(b+c).
The result of every operation is rounded to fit the floats coding constraints and (a+b)+c is implemented as round(round(a+b) +c) while a+(b+c) as round(a+round(b+c)).
As an obvious illustration, consider the operation (2^-100 + 1 -1). If interpreted as a (2^-100 + 1)-1, it will return 0, as 1+2^-100 would require a precision too large for floats or double coding in IEEE754 and can only be coded as 1.0. While (2^-100 +(1-1)) correctly returns 2^-100 that can be coded by either floats or doubles.
This is a trivial example, but these rounding errors may exist after every operation and explain why floating point operations are not associative.
Databases generally do not return data in a garanteed order and depending on the actual order, operations will be done differently and that explains the behaviour that you have.
In general, for this reason, it not a good idea to do equality comparison on floats. Generally, it is advised to replace a==b by abs(a-b) is "sufficiently" small.
"sufficiently" may depend on your algorithm. float are equivalent to ~6-7 decimals and doubles to 15-16 decimals (and I think that it is what is used on your DB). Depending on the number of computations, you may have the last 1--3 decimals affected.
The best is probably to use
abs(a-b)<relative-error*max(abs(a),abs(b))
where relative-error must be adjusted to your problem. Probably something around 10^-13 can be correct, but you must experiment, as rounding errors depends on the number of computations, on the dispersion of the values and on what you may consider as "equal" for you problem.
Look at this site for a discussion on comparison methods. And read What Every Computer Scientist Should Know About Floating-Point Arithmetic by David Goldberg that discusses, among others, these problems.

Largest amount of entries in lua table

I am trying to build a Sieve of Eratosthenes in Lua and i tried several things but i see myself confronted with the following problem:
The tables of Lua are to small for this scenario. If I just want to create a table with all numbers (see example below), the table is too "small" even with only 1/8 (...) of the number (the number is pretty big I admit)...
max = 600851475143
numbers = {}
for i=1, max do
table.insert(numbers, i)
end
If I execute this script on my Windows machine there is an error message saying: C:\Program Files (x86)\Lua\5.1\lua.exe: not enough memory. With Lua 5.3 running on my Linux machine I tried that too, error was just killed. So it is pretty obvious that lua can´t handle the amount of entries.
I don´t really know whether it is just impossible to store that amount of entries in a lua table or there is a simple solution for this (tried it by using a long string aswell...)? And what exactly is the largest amount of entries in a Lua table?
Update: And would it be possible to manually allocate somehow more memory for the table?
Update 2 (Solution for second question): The second question is an easy one, I just tested it by running every number until the program breaks: 33.554.432 (2^25) entries fit in one one-dimensional table on my 12 GB RAM system. Why 2^25? Because 64 Bit per number * 2^25 = 2147483648 Bits which are exactly 2 GB. This seems to be the standard memory allocation size for the Lua for Windows 32 Bit compiler.
P.S. You may have noticed that this number is from the Euler Project Problem 3. Yes I am trying to accomplish that. Please don´t give specific hints (..). Thank you :)
The Sieve of Eratosthenes only requires one bit per number, representing whether the number has been marked non-prime or not.
One way to reduce memory usage would be to use bitwise math to represent multiple bits in each table entry. Current Lua implementations have intrinsic support for bitwise-or, -and etc. Depending on the underlying implementation, you should be able to represent 32 or 64 bits (number flags) per table entry.
Another option would be to use one or more very long strings instead of a table. You only need a linear array, which is really what a string is. Just have a long string with "t" or "f", or "0" or "1", at every position.
Caveat: String manipulation in Lua always involves duplication, which rapidly turns into n² or worse complexity in terms of performance. You wouldn't want one continuous string for the whole massive sequence, but you could probably break it up into blocks of a thousand, or of some power of 2. That would reduce your memory usage to 1 byte per number while minimizing the overhead.
Edit: After noticing a point made elsewhere, I realized your maximum number is so large that, even with a bit per number, your memory requirements would optimally be about 73 gigabytes, which is extremely impractical. I would recommend following the advice Piglet gave in their answer, to look at Jon Sorenson's version of the sieve, which works on segments of the space instead of the whole thing.
I'll leave my suggestion, as it still might be useful for Sorenson's sieve, but yeah, you have a bigger problem than you realize.
Lua uses double precision floats to represent numbers. That's 64bits per number.
600851475143 numbers result in almost 4.5 Terabytes of memory.
So it's not Lua's or its tables' fault. The error message even says
not enough memory
You just don't have enough RAM to allocate that much.
If you would have read the linked Wikipedia article carefully you would have found the following section:
As Sorenson notes, the problem with the sieve of Eratosthenes is not
the number of operations it performs but rather its memory
requirements.[8] For large n, the range of primes may not fit in
memory; worse, even for moderate n, its cache use is highly
suboptimal. The algorithm walks through the entire array A, exhibiting
almost no locality of reference.
A solution to these problems is offered by segmented sieves, where
only portions of the range are sieved at a time.[9] These have been
known since the 1970s, and work as follows
...

Ruby Floating Point Math - Issue with Precision in Sum Calc

Good morning all,
I'm having some issues with floating point math, and have gotten totally lost in ".to_f"'s, "*100"'s and ".0"'s!
I was hoping someone could help me with my specific problem, and also explain exactly why their solution works so that I understand this for next time.
My program needs to do two things:
Sum a list of decimals, determine if they sum to exactly 1.0
Determine a difference between 1.0 and a sum of numbers - set the value of a variable to the exact difference to make the sum equal 1.0.
For example:
[0.28, 0.55, 0.17] -> should sum to 1.0, however I keep getting 1.xxxxxx. I am implementing the sum in the following fashion:
sum = array.inject(0.0){|sum,x| sum+ (x*100)} / 100
The reason I need this functionality is that I'm reading in a set of decimals that come from excel. They are not 100% precise (they are lacking some decimal points) so the sum usually comes out of 0.999999xxxxx or 1.000xxxxx. For example, I will get values like the following:
0.568887955,0.070564759,0.360547286
To fix this, I am ok taking the sum of the first n-1 numbers, and then changing the final number slightly so that all of the numbers together sum to 1.0 (must meet validation using the equation above, or whatever I end up with). I'm currently implementing this as follows:
sum = 0.0
array.each do |item|
sum += item * 100.0
end
array[i] = (100 - sum.round)/100.0
I know I could do this with inject, but was trying to play with it to see what works. I think this is generally working (from inspecting the output), but it doesn't always meet the validation sum above. So if need be I can adjust this one as well. Note that I only need two decimal precision in these numbers - i.e. 0.56 not 0.5623225. I can either round them down at time of presentation, or during this calculation... It doesn't matter to me.
Thank you VERY MUCH for your help!
If accuracy is important to you, you should not be using floating point values, which, by definition, are not accurate. Ruby has some precision data types for doing arithmetic where accuracy is important. They are, off the top of my head, BigDecimal, Rational and Complex, depending on what you actually need to calculate.
It seems that in your case, what you're looking for is BigDecimal, which is basically a number with a fixed number of digits, of which there are a fixed number of digits after the decimal point (in contrast to a floating point, which has an arbitrary number of digits after the decimal point).
When you read from Excel and deliberately cast those strings like "0.9987" to floating points, you're immediately losing the accurate value that is contained in the string.
require "bigdecimal"
BigDecimal("0.9987")
That value is precise. It is 0.9987. Not 0.998732109, or anything close to it, but 0.9987. You may use all the usual arithmetic operations on it. Provided you don't mix floating points into the arithmetic operations, the return values will remain precise.
If your array contains the raw strings you got from Excel (i.e. you haven't #to_f'd them), then this will give you a BigDecimal that is the difference between the sum of them and 1.
1 - array.map{|v| BigDecimal(v)}.reduce(:+)
Either:
continue using floats and round(2) your totals: 12.341.round(2) # => 12.34
use integers (i.e. cents instead of dollars)
use BigDecimal and you won't need to round after summing them, as long as you start with BigDecimal with only two decimals.
I think that algorithms have a great deal more to do with accuracy and precision than a choice of IEEE floating point over another representation.
People used to do some fine calculations while still dealing with accuracy and precision issues. They'd do it by managing the algorithms they'd use and understanding how to represent functions more deeply. I think that you might be making a mistake by throwing aside that better understanding and assuming that another representation is the solution.
For example, no polynomial representation of a function will deal with an asymptote or singularity properly.
Don't discard floating point so quickly. I could be that being smarter about the way you use them will do just fine.

Why does this code causes the machine to crash?

I am trying to run this code but it keeps crashing:
log10(x):=log(x)/log(10);
char(x):=floor(log10(x))+1;
mantissa(x):=x/10**char(x);
chop(x,d):=(10**char(x))*(floor(mantissa(x)*(10**d))/(10**d));
rnd(x,d):=chop(x+5*10**(char(x)-d-1),d);
d:5;
a:10;
Ibwd:[[30,rnd(integrate((x**60)/(1+10*x^2),x,0,1),d)]];
for n from 30 thru 1 step -1 do Ibwd:append([[n-1,rnd(1/(2*n-1)-a*last(first(Ibwd)),d)]],Ibwd);
Maxima crashes when it evaluates the last line. Any ideas why it may happen?
Thank you so much.
The problem is that the difference becomes negative and your rounding function dies horribly with a negative argument. To find this out, I changed your loop to:
for n from 30 thru 1 step -1 do
block([],
print (1/(2*n-1)-a*last(first(Ibwd))),
print (a*last(first(Ibwd))),
Ibwd: append([[n-1,rnd(1/(2*n-1)-a*last(first(Ibwd)),d)]],Ibwd),
print (Ibwd));
The last difference printed before everything fails miserably is -316539/6125000. So now try
rnd(-1,3)
and see the same problem. This all stems from the fact that you're taking the log of a negative number, which Maxima interprets as a complex number by analytic continuation. Maxima doesn't evaluate this until it absolutely has to and, somewhere in the evaluation code, something's dying horribly.
I don't know the "fix" for your specific example, since I'm not exactly sure what you're trying to do, but hopefully this gives you enough info to find it yourself.
If you want to deconstruct a floating point number, let's first make sure that it is a bigfloat.
say z: 34.1
You can access the parts of a bigfloat by using lisp, and you can also access the mantissa length in bits by ?fpprec.
Thus ?second(z)*2^(?third(z)-?fpprec) gives you :
4799148352916685/140737488355328
and bfloat(%) gives you :
3.41b1.
If you want the mantissa of z as an integer, look at ?second(z)
Now I am not sure what it is that you are trying to accomplish in base 10, but Maxima
does not do internal arithmetic in base 10.
If you want more bits or fewer, you can set fpprec,
which is linked to ?fpprec. fpprec is the "approximate base 10" precision.
Thus fpprec is initially 16
?fpprec is correspondingly 56.
You can easily change them both, e.g. fpprec:100
corresponds to ?fpprec of 335.
If you are diddling around with float representations, you might benefit from knowing
that you can look at any of the lisp by typing, for example,
?print(z)
which prints the internal form using the Lisp print function.
You can also trace any function, your own or system function, by trace.
For example you could consider doing this:
trace(append,rnd,integrate);
If you want to use machine floats, I suggest you use, for the last line,
for n from 30 thru 1 step -1 do :
Ibwd:append([[n-1,rnd(1/(2.0*n- 1.0)-a*last(first(Ibwd)),d)]],Ibwd);
Note the decimal points. But even that is not quite enough, because integration
inserts exact structures like atan(10). Trying to round these things, or compute log
of them is probably not what you want to do. I suspect that Maxima is unhappy because log is given some messy expression that turns out to be negative, even though it initially thought otherwise. It hands the number to the lisp log program which is perfectly happy to return an appropriate common-lisp complex number object. Unfortunately, most of Maxima was written BEFORE LISP HAD COMPLEX NUMBERS.
Thus the result (log -0.5)= #C(-0.6931472 3.1415927) is entirely unexpected to the rest of Maxima. Maxima has its own form for complex numbers, e.g. 3+4*%i.
In particular, the Maxima display program predates the common lisp complex number format and does not know what to do with it.
The error (stack overflow !!!) is from the display program trying to display a common lisp complex number.
How to fix all this? Well, you could try changing your program so it computes what you really want, in which case it probably won't trigger this error. Maxima's display program should be fixed, too. Also, I suspect there is something unfortunate in simplification of logs of numbers that are negative but not obviously so.
This is probably waaay too much information for the original poster, but maybe the paragraph above will help out and also possibly improve Maxima in one or more places.
It appears that your program triggers an error in Maxima's simplification (algebraic identities) code. We are investigating and I hope we have a bug fix soon.
In the meantime, here is an idea. Looks like the bug is triggered by rnd(x, d) when x < 0. I guess rnd is supposed to round x to d digits. To handle x < 0, try this:
rnd(x, d) := if x < 0 then -rnd1(-x, d) else rnd1(x, d);
rnd1(x, d) := (... put the present definition of rnd here ...);
When I do that, the loop runs to completion and Ibwd is a list of values, but I don't know what values to expect.

(La)TeX Base 10 fixed point arithmetic

I'm trying to implement decimal arithmetic in (La)TeX. I'm trying to use dimens to store the values. I want the arithmetic to be exact to some (fixed) number of decimal places. If I use 1pt as my base unit, then this fails, because \divide rounds down, so 1pt / 10 gives 0.09999pt. If I use something like 1000sp as my base unit, then I get working fixed point arithmetic with 3 decimal places, but I can't figure out an easy way to format the numbers. If I try to convert them to pt, so I can use TeX's display mechanism, I have the same problem with \divide.
How do I fix this problem, or work around it?
The fp package provides fixed point arithmetic for LaTeX. The LaTeX3 Project are currently implementing something similar as part of the expl3 bundle. The code is currently not on CTAN, but can be grabbed from the SVN (or will appear when the next update from the SVN to CTAN takes place).
I would represent all the values as integers and scale them appropriately. For example, when you need three decimal digits, 0.124 would be represented as 124. This is nice because addition and subtraction are trivial. When multiplying two numbers a and b, you would have to divide the result by 1000 to get the proper representation. Dividing works by multiplying the result with 1000.
You still have to get the rounding issues correct, but this isn't very difficult. At least if you don't get near the maximum representable integer (I don't remember if it's 2^31-1 or 2^30-1).
Here is some code:
\def\fixadd#1#2#3{%
#1=#2\relax
\advance #1 by #3\relax
}
\def\fixsub#1#2#3{%
#1=#2\relax
#1=-#1\relax
\advance #1 by #3\relax
#1=-#1\relax
}
\def\fixmul#1#2#3{%
#1=#2\relax
\multiply #1 by #3\relax
\divide #1 by 1000\relax
}
\def\fixdiv#1#2#3{%
#1=#2\relax
\divide #1 by #3\relax
\multiply #1 by 1000\relax
}
\newcount\numa
\newcount\numb
\newcount\numc
\numa=1414
\numb=2828
\fixmul\numc\numa\numb
\the\numc
\bye
The operations are modeled after a three register machine, where the first is the destination and the other two are the operands. The rounding after the multiplication and division, including corner cases for very large or very small numbers are left as an exercise to you.

Resources