Math and physics in programming [duplicate] - delphi

This question already has answers here:
Is floating point math broken?
(31 answers)
Closed 6 years ago.
I have one simple question.
I have this code:
program wtf;
var
  i: integer;
begin
  for i := 1 to 20 do
    if sqrt(i) * sqrt(i) <> i then
      writeln(i);
  readln
end.
...
It goes through the loop 20 times, and for each number from 1 to 20 it checks whether the square root multiplied by the square root of that same number is equal to the number.
By mathematical rules this program should never print anything, but...
I get this:
2
3
5
6
7
8
10
12
13
15
18
19
20
Can somebody explain what is going on?

This is because of precision. The square root of a number that is not a perfect square will give an irrational number, which cannot be represented in memory properly using floating-point arithmetic.
Instead, the number gets truncated after some digits (in binary, but the concept is the same as in decimal).
The number is now very close to, but not quite, the square root of the original number. Squaring it will produce a number that is very close to the original, but not quite.
Imagine it like this: the square root of 2 is 1.4142135623... (and so on), but it gets cut off to 1.414213 for memory reasons. 1.414213 * 1.414213 = 1.999998409369, not 2.
However, the square root of 4 is 2, and this can be stored in memory exactly, without being cut off after a few decimal places. When you then do 2 * 2 you get exactly 4.
Sometimes the number does get cut off, but the rounded value is still close enough that squaring it produces the original value exactly. This is why some non-square numbers, such as 11, do not show up in the output.

sqrt generates floating-point numbers. When using floats on a computer, you cannot compare values and expect exact equality; you must use a threshold difference comparison. Floats are not used to count things, they are used to measure things (to count things, use integers). No two measured values, even in the real world, are ever exactly the same; they are only "close enough".
On a computer it is impossible to represent every possible real number, so every calculated value gets represented by the closest number in the set of numbers that can be represented for that data type. This means it is very slightly off, and therefore after a few calculations it will not pass an exact equality comparison.
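For illustration, here is a minimal sketch in C (not Delphi) that runs the same loop twice: once with the exact comparison from the question, and once with a relative tolerance. The epsilon value of 1e-9 is an arbitrary choice for this sketch, not a universal constant.

#include <math.h>
#include <stdio.h>

int main(void) {
    const double eps = 1e-9;    /* relative tolerance, chosen arbitrarily */
    for (int i = 1; i <= 20; i++) {
        double back = sqrt((double)i) * sqrt((double)i);
        if (back != (double)i)
            printf("exact comparison flags %d (difference %g)\n", i, back - i);
        if (fabs(back - i) > eps * i)
            printf("tolerance comparison flags %d\n", i);   /* never fires here */
    }
    return 0;
}

The exact test typically flags many of the non-square values (the precise set depends on the floating-point type and the sqrt implementation), while the tolerance test accepts all of them, because the error left over by sqrt is many orders of magnitude smaller than the tolerance.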

Related

SPSS percentile issue

I am working with SPSS 18.
I am using FREQUENCIES to calculate the 95th percentile of a variable.
FREQUENCIES SdrelPromSldDeu_Acr_5_0
/FORMAT=NOTABLE
/PERCENTILES 1,5,95,99.
The result is given in a table:

Statistics: SdrelPromSldDeu_Acr_5_0
N            Valid     8881
             Missing   0
Percentiles  1         -1,001060644014
             5         -1,000541440102
             95        6619,140632636228
             99        9223372,036854776000
But if I double-click the 9223372,036854776 to copy it, another number appears: 1.0757943411193715E7.
If I use MEANS to get the maximum value, the result is 2.4329524990388575E8, so the number that appears on the double-click seems possible.
I have seen 9223372,03 in other cases as well, as if it were some kind of upper limit SPSS is able to display.
Can anybody tell me if the 9223372,03 represents anything useful? Should I trust the bigger number?
Thanks!
It appears to be a bug in the display of SPSS.
The number you have shown is eerily similar to
9223372036854775807
which is the highest value possible if a variable is declared as a long integer.
see also:
https://en.wikipedia.org/wiki/9223372036854775807
Since your actual number is about 11 orders of magnitude smaller, it should not reach this limit. Hence the conclusion that it must be a bug in the display software.
Do not trust it.
(the number behind may or may not be right, but the 9223372,03 is surely wrong)
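One way to see the resemblance: the displayed value is that 64-bit limit with the decimal separator shifted twelve places to the left. A quick C sketch (purely an illustration, SPSS itself is not involved):

#include <stdio.h>
#include <limits.h>

int main(void) {
    long long limit = LLONG_MAX;    /* 9223372036854775807, i.e. 2^63 - 1 */
    printf("%lld\n", limit);
    /* split the constant twelve digits from the right: the two halves give
       the 9223372 and, up to display rounding, the ,036854776 SPSS showed */
    printf("%lld.%012lld\n", limit / 1000000000000LL, limit % 1000000000000LL);
    return 0;
}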

32 bit multiplication on 24 bit ALU

I want to port a 32 by 32 bit unsigned multiplication to a 24-bit DSP (it is a Linear Congruential Generator, so I am not allowed to truncate, and I don't want to replace the current LCG with a 24-bit one yet). The available data types are 24- and 48-bit ints.
Only the 32 least significant bits of the product are needed. Do you know any hacks to implement this in fewer multiplies, masks and shifts than the usual way?
The line looks like this:
//val is an int(32 bit)
val = (1664525 * val) + 1013904223;
An outline would be (in my current compiler style):
static uint48_t val = SEED;
...
val = 0xFFFFFFFFUL & ((1664525UL * val) + 1013904223UL);
and hopefully the compiler will recognise:
it can use a multiply-and-accumulate instruction
it only needs a reduced multiply algorithm, due to the "high word" of the constant being zero
the AND could be effected by resetting the upper bits or by multiplying by a constant and restoring
...other stuff depends on your {mystery dsp} target
Note: if you scale the coefficients up by 2^16, you can get the truncation for free, but given the lack of information you will have to explore and decide whether that is better overall.
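As a sanity check of the masked update, here is a small C sketch (uint64_t standing in for the 48-bit int type) confirming that masking to 32 bits after each step produces exactly the same sequence as an ordinary 32-bit LCG; the seed and iteration count are arbitrary.

#include <stdio.h>
#include <stdint.h>
#include <assert.h>

int main(void) {
    uint64_t wide = 12345;      /* plays the role of the 48-bit val */
    uint32_t narrow = 12345;    /* plain 32-bit LCG for comparison */

    for (int n = 0; n < 1000; n++) {
        wide = 0xFFFFFFFFULL & ((1664525ULL * wide) + 1013904223ULL);
        narrow = (1664525u * narrow) + 1013904223u;   /* wraps mod 2^32 */
        assert(wide == (uint64_t)narrow);
    }
    printf("masked 48-bit update matches the 32-bit LCG\n");
    return 0;
}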
(This is more an elaboration why two multiplications 24×24→n, 31<n are enough for 32×32→min(n, 40).)
The question discloses amazingly little about the capabilities available for building a
32×32→32 method in fewer [24×24] multiplies, masks and shifts than the usual way:
just 24- and 48-bit ints & a DSP (which I read as a high-throughput, not-high-latency 24×24→48).
As long as there indeed is a 24×24→48 multiply (or even a 24×24+56→56 MAC) and one factor is less than 24 bits, the question is pointless, a second multiply being the compelling solution.
The usual composition of a (24<n<48) × (24<m<48) → (p>24) multiply from 24×24→48 uses three of the latter; a compiler should know as well as a coder that "the fourth multiply" would yield bits whose significance exceeds the combined lengths of the lower parts of the factors.
So, is it possible to generate "the long product" using just a second 24×24→48?
Let the bytes of the factors be w_xyz and W_XYZ, respectively, the underscore marking that w and W sit as the low-significance bytes of the higher-significance 24-bit words (i.e. they are bits 24-31 of the 32-bit factors). The first 24×24→48 multiply, xyz × XYZ, gives the sum of the partial products

      xZ yZ zZ
   xY yY zY
xX yX zX

(columns of equal weight lined up). What is still needed for the low 32 bits of the full product is

   wZ + zW

entering at bits 24-31.
This can be computed using one combined multiplication of
((w<<16)|(z & 0xff)) × ((W<<16)|(Z & 0xff)). (Never mind the 17th bit of wZ+zW "running" into wW.)
(In the first revision of this answer, I foolishly produced wZ and zW separately - their sum is wanted in the end, anyway.)
(Annoyingly, this is about all you can do for 24×24→24 as a base operation too - beyond this "combining multiplication", you need four instead of one.)
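To make the byte bookkeeping concrete, here is a C sketch (not DSP code) that simulates the 24×24→48 multiply with uint64_t and checks the two-multiply scheme against a direct 32×32 product; the helper names mul24 and mul32_low are invented for this sketch, and the variable names follow the w_xyz / W_XYZ notation above.

#include <stdio.h>
#include <stdint.h>
#include <assert.h>

/* stand-in for the DSP's 24x24 -> 48 multiply */
static uint64_t mul24(uint32_t a, uint32_t b) {
    assert(a < (1u << 24) && b < (1u << 24));
    return (uint64_t)a * b;
}

/* low 32 bits of a*b using only two 24x24 multiplies */
static uint32_t mul32_low(uint32_t a, uint32_t b) {
    uint32_t xyz = a & 0xFFFFFF, w = a >> 24, z = a & 0xFF;
    uint32_t XYZ = b & 0xFFFFFF, W = b >> 24, Z = b & 0xFF;

    uint64_t low  = mul24(xyz, XYZ);                      /* first multiply  */
    uint64_t comb = mul24((w << 16) | z, (W << 16) | Z);  /* second multiply */

    /* bits 16..23 of comb hold (wZ + zW) mod 256, which is all that can
       still reach bits 24..31 of the 32-bit result */
    uint32_t cross = (uint32_t)((comb >> 16) & 0xFF);
    return (uint32_t)(low + ((uint64_t)cross << 24));
}

int main(void) {
    uint32_t vals[] = { 0, 1, 0xFFFFFFFFu, 1664525u, 1013904223u,
                        0x12345678u, 0xDEADBEEFu };
    int n = (int)(sizeof vals / sizeof vals[0]);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(mul32_low(vals[i], vals[j]) == vals[i] * vals[j]);
    printf("two-multiply scheme matches direct 32x32 multiplication\n");
    return 0;
}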
Another angle to explore is choosing a different PRNG.
It may have to be >24 bits (tell!).
On a 24 bit machine, XorShift* (or even XorShift+) 48/32 seems worth a look.

Working on 16 bit unsigned integer (uint16_t)

I want to generate a 16-bit unsigned integer (uint16_t) which could represent the following:
The first 2 digits representing some version like 1, 2, 3, etc.
The next 3 digits representing another number, maybe 123, 345, 071, etc.
And the last 11 digits representing a number like T234, T566, etc.
How can we do this using Objective-C? I would like to parse this data later on to get these components back. Please advise.
I think you are misunderstanding just what uint16_t means. It does not mean a 16-digit decimal number (which would be any number between 0 and 9,999,999,999,999,999). It means an unsigned number that can be expressed using 16 bits; the range of such a value is 0 to 65535 in decimal. If you really wanted to store the numbers you are talking about, you would need about 54 bits. You would also be making things very difficult for yourself, since you would not easily be able to extract the first two decimal digits from that bit sequence: you would have to treat it as a decimal value and divide and take remainders by powers of ten; you couldn't just say "it's bits 1 to 8".
A bit-field packing scheme could help you instead. You would take a 64-bit value (uint64_t) and say that within this value bits 1-7 are the version (a value up to 127), bits 8-17 are the second number (a value up to 1023), and bits 18-63 are your third number (those 46 bits can store a number up to 70,368,744,177,663).
All this is technically possible, but you are really going to be making things hard for yourself. It looks like you are storing a version, minor version and build number, and most people do that using strings, not packed integers.
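A minimal C sketch of that bit-field packing (plain C, which also compiles as Objective-C); the field widths 7/10/46 follow the layout described above, and the helper name pack is made up for this sketch.

#include <stdio.h>
#include <stdint.h>

/* pack version (7 bits), second number (10 bits), third number (46 bits) */
static uint64_t pack(uint64_t version, uint64_t second, uint64_t third) {
    return (version & 0x7F)
         | ((second & 0x3FF) << 7)
         | ((third & 0x3FFFFFFFFFFFULL) << 17);
}

int main(void) {
    uint64_t packed = pack(3, 123, 1234567890123ULL);

    /* unpack by shifting and masking with the same field widths */
    uint64_t version = packed & 0x7F;
    uint64_t second  = (packed >> 7) & 0x3FF;
    uint64_t third   = (packed >> 17) & 0x3FFFFFFFFFFFULL;

    printf("%llu %llu %llu\n",
           (unsigned long long)version,
           (unsigned long long)second,
           (unsigned long long)third);   /* prints: 3 123 1234567890123 */
    return 0;
}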

Project Euler -Prob. #20 (Lua)

http://projecteuler.net/problem=20
I've written code to solve this problem; however, it seems to be accurate in some cases and inaccurate in others. When I solve it for 10 (the answer, 27, is given in the question) I get 27, the correct answer. However, when I solve it for the number actually asked about (100) I get 64, which is not the correct answer.
Here's my code:
function factorial(num)
  if num >= 1 then
    return num * factorial(num - 1)
  else
    return 1
  end
end

function getSumDigits(str)
  str = string.format("%18.0f", str):gsub(" ", "")
  local sum = 0
  for i = 1, #str do
    sum = sum + tonumber(str:sub(i, i))
  end
  return sum
end

print(getSumDigits(tostring(factorial(100))))
64
Since Lua converts large numbers into scientific notation, I had to convert it back to standard notation. I don't think this is a problem, though it might be.
Is there any explanation for this?
Unfortunately, the correct solution is more difficult. The main problem here is that Lua uses 64-bit floating-point numbers, which means the usual floating-point precision limits apply.
Long story short: the number of significant digits in a 64-bit float is much too small to store a number like 100!. A double carries 53 significant mantissa bits, so integers beyond 2^53 can no longer all be represented exactly, which gives you a little under 16 reliable decimal digits. To store 100! exactly, you need at least 158 decimal digits.
The number calculated by your factorial() function is reasonably close to the real value of 100! (i.e. the relative error is small), but you need the exact value to get the right solution.
What you need to do is implement your own algorithms for dealing with large numbers. I actually solved that problem in Lua by storing each number as a table, where each entry stores one digit of a decimal number. The complete solution takes a little more than 50 lines of code, so it's not too difficult and a nice exercise.
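For comparison, here is a sketch of the same digit-array idea in C (the Lua table solution mentioned above is not reproduced here): the number is stored as decimal digits, least significant first, and multiplied in place by each factor from 2 to 100 with manual carry propagation.

#include <stdio.h>

#define MAX_DIGITS 200   /* 100! has 158 digits, so 200 is plenty */

int main(void) {
    int digits[MAX_DIGITS] = { 1 };   /* the number 1 */
    int len = 1;

    for (int k = 2; k <= 100; k++) {
        int carry = 0;
        for (int i = 0; i < len; i++) {
            int prod = digits[i] * k + carry;
            digits[i] = prod % 10;
            carry = prod / 10;
        }
        while (carry > 0) {           /* the number grew longer */
            digits[len++] = carry % 10;
            carry /= 10;
        }
    }

    int sum = 0;
    for (int i = 0; i < len; i++)
        sum += digits[i];
    printf("100! has %d digits, digit sum %d\n", len, sum);
    return 0;
}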

(La)TeX Base 10 fixed point arithmetic

I'm trying to implement decimal arithmetic in (La)TeX. I'm trying to use dimens to store the values. I want the arithmetic to be exact to some (fixed) number of decimal places. If I use 1pt as my base unit, then this fails, because \divide rounds down, so 1pt / 10 gives 0.09999pt. If I use something like 1000sp as my base unit, then I get working fixed point arithmetic with 3 decimal places, but I can't figure out an easy way to format the numbers. If I try to convert them to pt, so I can use TeX's display mechanism, I have the same problem with \divide.
How do I fix this problem, or work around it?
The fp package provides fixed point arithmetic for LaTeX. The LaTeX3 Project are currently implementing something similar as part of the expl3 bundle. The code is currently not on CTAN, but can be grabbed from the SVN (or will appear when the next update from the SVN to CTAN takes place).
I would represent all the values as integers and scale them appropriately. For example, when you need three decimal places, 0.124 would be represented as 124. This is nice because addition and subtraction are trivial. When multiplying two numbers a and b, you have to divide the result by 1000 to get the proper representation: 0.124 × 0.250 becomes 124 × 250 = 31000, and dividing by 1000 gives 31, i.e. 0.031. Division works the other way around: multiply the dividend by 1000 before dividing by the divisor (doing it after the division would throw away all the fractional digits).
You still have to get the rounding issues right, but that isn't very difficult, at least as long as you stay away from the maximum representable integer (2^31-1 for TeX's count registers).
Here is some code:
% \fixadd{dest}{a}{b}: dest = a + b (scaled values add directly)
\def\fixadd#1#2#3{%
  #1=#2\relax
  \advance #1 by #3\relax
}
% \fixsub{dest}{a}{b}: dest = a - b
\def\fixsub#1#2#3{%
  #1=#2\relax
  \advance #1 by -#3\relax
}
% \fixmul{dest}{a}{b}: dest = a*b/1000 (two scaled operands carry the
% scale factor twice, so divide it out once)
\def\fixmul#1#2#3{%
  #1=#2\relax
  \multiply #1 by #3\relax
  \divide #1 by 1000\relax
}
% \fixdiv{dest}{a}{b}: dest = a*1000/b (scale the dividend up before
% dividing, otherwise every fractional digit is lost; watch for overflow)
\def\fixdiv#1#2#3{%
  #1=#2\relax
  \multiply #1 by 1000\relax
  \divide #1 by #3\relax
}
\newcount\numa
\newcount\numb
\newcount\numc
\numa=1414 % 1.414
\numb=2828 % 2.828
\fixmul\numc\numa\numb
\the\numc  % prints 3998, i.e. 3.998
\bye
The operations are modeled after a three-register machine: the first argument is the destination and the other two are the operands. With the operands above, \numa and \numb hold 1.414 and 2.828, and \fixmul leaves 3998 in \numc, i.e. the truncated product 3.998; \fixdiv\numc\numa\numb would leave 500, i.e. 0.500. Proper rounding after multiplication and division, including the corner cases for very large or very small numbers, is left as an exercise for you.
