Logarithm of Gaussian converting from zero - erlang

I have a vector V of 256 positions. For each position in the vector I want to calculate the density from a Gaussian distribution:
V[0] -> p(V[0])
V[1] -> p(V[1])
...
V[255] -> p(V[255])
After that I want to multiply all the probabilities together:
p(V[0]) * p(V[1]) * ... * p(V[255])
The problem with this method in my implementation is that each density is large (around 400), so the running product overflows and I cannot multiply everything together.
A workaround for that would be taking the log of each Gaussian and then adding all together:
V[0] -> log(p(V[0]))
V[1] -> log(p(V[1]))
...
V[255] -> log(p(V[255]))
log(p(V[0])) + log(p(V[1])) + ... + log(p(V[255]))
But when I try to do that I get an error whenever the Gaussian evaluates to zero.
With this in mind, is there any workaround for the log(0) problem? Would replacing log(0) with 0 be an accurate representation?
EDIT:
So, for the record, if I try the normal method (multiplying), this is the error I get:
iex(6)> Naive.CLI.main(["~/data/usps.csv", "~/indices/17.csv"])
** (ArithmeticError) bad argument in arithmetic expression
:erlang.*(417.62100246853674, 6.504406716503509e307)
From what I understand, the numbers are too large for the multiplication to succeed.
Here is what I did (and I think should be the right idea):
def gaussian(vector, mean, variance) do
  vector
  |> Enum.zip(mean)
  |> Enum.zip(variance)
  |> Enum.map(fn {{e, m}, v} -> {e, m, v} end)
  |> Enum.map(fn {e, m, v} -> calculate(e, m, v) end)
  |> Enum.map(fn e ->
    if e == 0.0 do
      0.0
    else
      :math.log10(e)
    end
  end)
end
defp calculate(elem, mean, variance) do
  (1/:math.sqrt(2*:math.pi*variance)) *
    :math.exp(-0.5*(((elem - mean)*(elem - mean)) / variance))
end
Basically if the Gaussian results in zero, I return zero as well.
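One way to sidestep log(0) is to never form the density at all. The log of the Gaussian density has a closed form, log p(x) = -0.5*log(2*pi*variance) - (x - mean)^2 / (2*variance), which stays finite even when p(x) itself underflows to zero. Note also that substituting 0 for log(0) amounts to treating that density as 1, so it does not represent a vanishing density. Below is a minimal sketch in Python rather than Elixir. The function names are made up for illustration, and it uses the natural log instead of log10 (any base works as long as it is used consistently).

```python
import math

def log_gaussian(x, mean, variance):
    """Natural log of the Gaussian density, computed without calling exp()."""
    return -0.5 * math.log(2.0 * math.pi * variance) - (x - mean) ** 2 / (2.0 * variance)

def log_likelihood(vector, means, variances):
    """Sum of the per-position log densities, i.e. the log of their product."""
    return sum(log_gaussian(x, m, v) for x, m, v in zip(vector, means, variances))

# A point far out in the tail: the density itself underflows to 0.0,
# but its logarithm is still an ordinary finite number.
far = log_gaussian(100.0, 0.0, 1.0)
```

The same formula should translate directly to :math.log in the calculate function, which would make the special case for zero unnecessary.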

Related

Concatenate to each timestamp in Keras

I have a keras layer which outputs N timestamps of size M (thus NxM size). I want to append the same vector of size 1xK to all time stamps, so the output should have N timestamps of size M+K. If I use the Concatenate layer like this:
x = Concatenate()([x, v])
It gives an error since the dimensions do not match. And if I use a TimeDistributed wrapper like this:
x = TimeDistributed(Concatenate())([x, v])
It gives an error since vector v does not have time stamps.
Which is the easiest way of doing this?
Thanks!!
First, duplicate your vector N times using RepeatVector:
v = RepeatVector(N)(v) # shape == (N, K)
Then, as their shapes are matching now ((N, M) and (N, K)), you can concatenate them:
x = Concatenate()([x, v]) # shape == (N, M+K)
If N is unknown you can do this manually using the corresponding backend functions in a lambda layer:
from keras import backend as K
from keras.layers import Lambda
def func(xv):
    x, v = xv
    n = x.shape[1]
    v = K.repeat(v, n)
    return K.concatenate((x, v))
x = Lambda(lambda xv: func(xv))([x, v])
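To see what the repeat-then-concatenate step does to the shapes, here is a small sketch on plain Python lists. It is not Keras code: the helper names are hypothetical and the batch dimension is left out for brevity.

```python
def repeat_vector(v, n):
    """Copy a length-K vector n times: shape (K,) -> (n, K)."""
    return [list(v) for _ in range(n)]

def concat_last_axis(x, v_rep):
    """Join two sequences row by row: (N, M) + (N, K) -> (N, M+K)."""
    return [row_x + row_v for row_x, row_v in zip(x, v_rep)]

x = [[1, 2], [3, 4], [5, 6]]   # N=3 timestamps of size M=2
v = [9, 9, 9]                  # a single vector of size K=3
out = concat_last_axis(x, repeat_vector(v, len(x)))
```

Every row of out is a row of x followed by a copy of v, which is the (N, M+K) result the question asks for.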

Taylor series via F#

I'm trying to write Taylor series in F#.
Have a look at my code
let rec iter a b f i =
    if a > b then i;
    else f a (iter (a+1) b f i)
let sum a b = iter a b (+) 0 // from 0
// e^x = 1 + x + (x^2)/2 + ... (x^n)/n! + ...
let fact n = iter 1 n (*) 1 // factorial
let pow x n = iter 1 n (fun n acc -> acc * x) 1
let exp x =
    iter 0 x
        (fun n acc ->
            acc + (pow x n) / float (fact n)) 0
In the last line I am trying to cast the int fact n to float, but it seems I'm doing it wrong, because this code doesn't compile :(
Am I doing the right algorithm?
Can I call my code functional-first?
The code doesn't compile, because:
You're trying to divide an integer, pow x n, by a float. Both operands of a division must have the same type.
The terminal case has the wrong type. The literal 0 is an integer. If you want a float zero, write 0.0 or the abbreviated form 0. (with a trailing dot).
Try this:
let exp x =
    iter 0 x
        (fun n acc ->
            acc + float (pow x n) / float (fact n)) 0.
P.S. In the future, please provide the exact error messages and/or unexpected results that you're getting. Simply saying "doesn't work" is not a good description of a problem.
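For comparison, here is a rough Python sketch of the same series, e^x = 1 + x + x^2/2! + ... It is not a translation of the F# above: it takes a float x and a separate terms parameter (my own assumption, since the code in the question uses x both as the argument and as the number of terms), and it updates each term from the previous one instead of recomputing pow and fact.

```python
import math

def exp_taylor(x, terms=20):
    """Approximate e**x with the first `terms` terms of its Taylor series."""
    total = 0.0
    term = 1.0                  # x**0 / 0!
    for n in range(terms):
        total += term
        term *= x / (n + 1)     # turn x**n / n! into x**(n+1) / (n+1)!
    return total
```

Updating the term incrementally also avoids the very large intermediate factorials that an integer fact n would produce.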

Currying and multiple integrals

I am interested in learning an elegant way to use currying in a functional programming language to numerically evaluate multiple integrals. My language of choice is F#.
If I want to integrate f(x,y,z)=8xyz on the region [0,1]x[0,1]x[0,1] I start by writing down a triple integral of the differential form 8xyz dx dy dz. In some sense, this is a function of three ordered arguments: a (float -> float -> float -> float).
I take the first integral and the problem reduces to the double integral of 4xy dx dy on [0,1]x[0,1]. Conceptually, we have curried the function to become a (float -> float -> float).
After the second integral I am left to take the integral of 2x dx, a (float -> float), on the unit interval.
After three integrals I am left with the result, the number 1.0.
Ignoring optimizations of the numeric integration, how could I succinctly execute this? I would like to write something like:
let diffForm = (fun x y z -> 8.0 * x * y * z)
let result =
    diffForm
    |> Integrate 0.0 1.0
    |> Integrate 0.0 1.0
    |> Integrate 0.0 1.0
Is this doable, if perhaps impractical? I like the idea of how closely this would capture what is going on mathematically.
"I like the idea of how closely this would capture what is going on mathematically."
I'm afraid your premise is false: The pipe operator threads a value through a chain of functions and is closely related to function composition. Integrating over an n-dimensional domain however is analogous to n nested loops, i.e. in your case something like
for x in x_grid_nodes do
    for y in y_grid_nodes do
        for z in z_grid_nodes do
            integral <- integral + ... // details depend on integration scheme
You cannot easily map that to a chain of three independent calls to some Integrate function, and thus the composition integrate x1 x2 >> integrate y1 y2 >> integrate z1 z2 is actually not what you do when you integrate f. That is why Tomas' solution, if I understood it correctly (and I am not sure about that...), essentially evaluates your function on an implicitly defined 3D grid and passes that to the integration function. I suspect that is as close as you can get to your original question.
You did not ask for it, but if you do want to evaluate an n-dimensional integral in practice, look into Monte Carlo integration, which avoids another problem commonly known as the "curse of dimensionality", i.e. the fact that the number of required sample points grows exponentially with n with classic integration schemes.
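As a rough illustration of the Monte Carlo idea, here is a minimal Python sketch (not F#) that averages the integrand at uniformly random points in a box and scales by the box volume. The function name, the sample count and the fixed seed are arbitrary choices of mine.

```python
import random

def monte_carlo_integral(f, bounds, samples=100_000, seed=0):
    """Estimate the integral of f over a box given as [(lo, hi), ...]."""
    rng = random.Random(seed)
    volume = 1.0
    for lo, hi in bounds:
        volume *= hi - lo
    total = 0.0
    for _ in range(samples):
        # One uniformly distributed point inside the box.
        point = [rng.uniform(lo, hi) for lo, hi in bounds]
        total += f(*point)
    return volume * total / samples

estimate = monte_carlo_integral(lambda x, y, z: 8.0 * x * y * z, [(0.0, 1.0)] * 3)
```

The estimate should land close to the exact value 1.0, with an error that shrinks like one over the square root of the number of samples regardless of the dimension.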
Update
You can implement iterated integration, but not with a single integrate function, because the type of the function to be integrated is different for each step of the integration (i.e. each step turns an n-ary function to an (n - 1)-ary one):
let f = fun x y z -> 8.0 * x * y * z
// numerically integrate f on [x1, x2]
let trapRule f x1 x2 = (x2 - x1) * (f x1 + f x2) / 2.0
// uniform step size for simplicity
let h = 0.1
// integrate a unary function f on a given discrete grid
let integrate grid f =
    let mutable integral = 0.0
    for x1, x2 in Seq.zip grid (Seq.skip 1 grid) do
        integral <- integral + trapRule f x1 x2
    integral
// integrate a 3-ary function f with respect to its last argument
let integrate3 lower upper f =
    let grid = seq { lower .. h .. upper }
    fun x y -> integrate grid (f x y)
// integrate a 2-ary function f with respect to its last argument
let integrate2 lower upper f =
    let grid = seq { lower .. h .. upper }
    fun x -> integrate grid (f x)
// integrate a unary function f on [lower, upper]
let integrate1 lower upper f =
    integrate (seq { lower .. h .. upper }) f
With your example function f
f |> integrate3 0.0 1.0 |> integrate2 0.0 1.0 |> integrate1 0.0 1.0
yields 1.0.
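The same iterated scheme can be sketched in Python, where a single helper can cover all three steps if it is told how many arguments the function takes. This is only an illustration under that assumption: integrate_last and its arity parameter are names I made up, and since Python has no pipe operator the steps are applied one after another.

```python
def trap_rule(f, x1, x2):
    """Trapezoid rule for f on the single interval [x1, x2]."""
    return (x2 - x1) * (f(x1) + f(x2)) / 2.0

def integrate(f, lower, upper, steps=10):
    """Composite trapezoid rule for a one-argument function."""
    h = (upper - lower) / steps
    return sum(trap_rule(f, lower + i * h, lower + (i + 1) * h) for i in range(steps))

def integrate_last(lower, upper, arity):
    """Return a transformer that integrates out the last of `arity` arguments."""
    def transform(f):
        if arity == 1:
            return integrate(f, lower, upper)
        # Fix the leading arguments and integrate over the remaining one.
        return lambda *rest: integrate(lambda t: f(*rest, t), lower, upper)
    return transform

f = lambda x, y, z: 8.0 * x * y * z
g2 = integrate_last(0.0, 1.0, 3)(f)       # function of (x, y)
g1 = integrate_last(0.0, 1.0, 2)(g2)      # function of x
result = integrate_last(0.0, 1.0, 1)(g1)  # a plain number
```

Because the integrand is linear in each variable, the trapezoid rule is exact here up to floating-point rounding.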
I'm not entirely sure how you would implement this in a normal way, so this might not fully solve the problem, but here are some ideas.
To do the numerical integration, you'll (I think?) need to call the original function diffForm at various points as specified by the Integrate calls in the pipeline, but you actually need to call it at a product of the ranges. So if I wanted to call it only at the borders, I would still need to call it 2x2x2 times to cover all possible combinations (diffForm 0 0 0, diffForm 0 0 1, diffForm 0 1 0 etc.) and then do some calculation on the 8 results.
The following sample (at least) shows how to write similar code that calls the specified function with all combinations of the argument values that you specify.
The idea is to use continuations which can be called multiple times (and so when we get a function, we can call it repeatedly at multiple different points).
// Our original function
let diffForm x y z = 8.0 * x * y * z
// At the first step, we just pass the function to a continuation 'k' (once)
let diffFormK k = k diffForm
// This function takes a function that returns function via a continuation
// (like diffFormK) and it fixes the first argument of the function
// to 'lo' and 'hi' and calls its own continuation with both options
let range lo hi func k =
    // When called for the first time, 'f' will be your 'diffForm'
    // and here we call it twice with 'lo' and 'hi' and pass the
    // two results (float -> float -> float) to the next in the pipeline
    func (fun f -> k (f lo))
    func (fun f -> k (f hi))
// At the end, we end up with a function that takes a continuation
// and it calls the continuation with all combinations of results
// (This is where you need to do something tricky to aggregate the results :-))
let integrate result =
    result (printfn "%f")
// Now, we pass our function to 'range' for every argument and
// then pass the result to 'integrate' which just prints all results
let result =
    diffFormK
    |> range 0.0 1.0
    |> range 0.0 1.0
    |> range 0.0 1.0
    |> integrate
This might be pretty confusing (because continuations take a lot of time to get used to), but perhaps you (or someone else here?) can find a way to turn this first attempt into a real numerical integration :-)
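One possible way to aggregate the corner values, sketched in Python rather than with F# continuations: a trapezoid rule over a single cell is the box volume times the mean of the function values at the corners. The helper name is hypothetical, and itertools.product stands in for the chained range calls.

```python
from itertools import product

def corner_trapezoid(f, ranges):
    """One-cell trapezoid rule: box volume times the mean of the corner values."""
    volume = 1.0
    for lo, hi in ranges:
        volume *= hi - lo
    corners = list(product(*ranges))   # every lo/hi combination, 2**d points
    return volume * sum(f(*c) for c in corners) / len(corners)

approx = corner_trapezoid(lambda x, y, z: 8.0 * x * y * z, [(0.0, 1.0)] * 3)
```

For 8xyz on the unit cube only the corner (1, 1, 1) contributes, so the mean of the eight values is 1. A finer grid would be needed for integrands that are not linear in each variable.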

Fibonacci Matrix

For calculating a Fibonacci sequence in O(log n) we use matrix exponentiation, since the recurrence
fn = fn-1 + fn-2 is linear. But what matrix is required if we want to find the nth term of
fn = fn-1 + fn-2 + a0 + a1*n + a2*n^2 + ... an*n^n
which also depends on a polynomial in n?
Here a0, a1, ... an are constants.
Look here for an implementation in Erlang which uses the formula
[[1, 1], [1, 0]]^n = [[F(n+1), F(n)], [F(n), F(n-1)]].
It shows nicely linear behavior in the results, because in the O(M(n) log n) bound the M(n) part is exponential for big numbers. It calculates fib of one million in 2 s, where the result has 208988 digits. The trick is that you can compute the exponentiation in O(log n) multiplications using a (tail) recursive formula (tail-recursive meaning O(1) space with a proper compiler, or when rewritten as a loop):
% compute X^N
power(X, N) when is_integer(N), N >= 0 ->
    power(N, X, 1).
power(0, _, Acc) ->
    Acc;
power(N, X, Acc) ->
    if N rem 2 =:= 1 ->
            power(N - 1, X, Acc * X);
        true ->
            power(N div 2, X * X, Acc)
    end.
where you substitute matrices for X and Acc. X is initialized with [[1, 1], [1, 0]] and Acc with the identity matrix I = [[1, 0], [0, 1]].
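A sketch of how that substitution might look, in Python rather than Erlang. It assumes the standard Fibonacci matrix ((1, 1), (1, 0)) for X and the 2x2 identity for Acc. The helper names are my own, and this covers plain Fibonacci only, not the polynomial variant from the question.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices stored as ((p, q), (r, s))."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def mat_power(x, n):
    """Compute x**n by repeated squaring, mirroring the power/3 function."""
    acc = ((1, 0), (0, 1))  # identity matrix
    while n > 0:
        if n % 2 == 1:
            acc = mat_mul(acc, x)
            n -= 1
        else:
            x = mat_mul(x, x)
            n //= 2
    return acc

def fib(n):
    """n-th Fibonacci number with fib(0) = 0 and fib(1) = 1."""
    return mat_power(((1, 1), (1, 0)), n)[0][1]
```

The loop is the "rewritten as a loop" form of the tail recursion, so it needs only O(log n) matrix multiplications and constant extra space.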

Project Euler Problem 27 in F#

I've been trying to work my way through Problem 27 of Project Euler, but this one seems to be stumping me. Firstly, the code is taking far too long to run (a couple of minutes maybe, on my machine), but more importantly, it's returning the wrong answer, though I really can't spot anything wrong with the algorithm after looking through it for a while.
Here is my current code for the solution.
/// Checks number for primality.
let is_prime n =
    [|1 .. 2 .. sqrt_int n|] |> Array.for_all (fun x -> n % x <> 0)
/// Memoizes a function.
let memoize f =
    let cache = Dictionary<_, _>()
    fun x ->
        let found, res = cache.TryGetValue(x)
        if found then
            res
        else
            let res = f x
            cache.[x] <- res
            res
/// Problem 27
/// Find a quadratic formula that produces the maximum number of primes for consecutive values of n.
let problem27 n =
    let is_prime_mem = memoize is_prime
    let range = [|-(n - 1) .. n - 1|]
    let natural_nums = Seq.init_infinite (fun i -> i)
    range |> Array.map (fun a -> (range |> Array.map (fun b ->
        let formula n = n * n + a * n + b
        let num_conseq_primes = natural_nums |> Seq.map (fun n -> (n, formula n))
                                |> Seq.find (fun (n, f) -> not (is_prime_mem f)) |> fst
        (a * b, num_conseq_primes)) |> Array.max_by snd)) |> Array.max_by snd |> fst
printn_any (problem27 1000)
Any tips on how to a) get this algorithm actually returning the right answer (I think I'm at least taking a workable approach) and b) improve the performance, as it clearly exceeds the "one minute rule" set out in the Project Euler FAQ. I'm a bit of a newbie to functional programming, so any advice on how I might consider the problem with a more functional solution in mind would also be appreciated.
Two remarks:
You may take advantage of the fact that b must be prime. This follows from the fact that the problem asks for the longest sequence of primes for n = 0, 1, 2, ...
So formula(0) must be prime to begin with, but formula(0) = b, therefore b must be prime.
I am not an F# programmer, but it seems to me that the code does not try n = 0 at all. This, of course, does not meet the problem's requirement that n must start from 0, so there is only a negligible chance that a correct answer could be produced.
Right, after a lot of checking that all the helper functions were doing what they should, I've finally reached a working (and reasonably efficient) solution.
Firstly, the is_prime function was completely wrong (thanks to Dimitre Novatchev for making me look at that). I'm not sure quite how I arrived at the function I posted in the original question, but I had assumed it was working since I'd used it in previous problems. (Most likely, I had just tweaked it and broken it since.) Anyway, the working version of this function (which crucially returns false for all integers less than 2) is this:
/// Checks number for primality.
let is_prime n =
    if n < 2 then false
    else [|2 .. sqrt_int n|] |> Array.for_all (fun x -> n % x <> 0)
The main function was changed to the following:
/// Problem 27
/// Find a quadratic formula that produces the maximum number of primes for consecutive values of n.
let problem27 n =
    let is_prime_mem = memoize is_prime
    let set_b = primes (int64 (n - 1)) |> List.to_array |> Array.map int
    let set_a = [|-(n - 1) .. n - 1|]
    let set_n = Seq.init_infinite (fun i -> i)
    set_b |> Array.map (fun b -> (set_a |> Array.map (fun a ->
        let formula n = n * n + a * n + b
        let num_conseq_primes = set_n |> Seq.find (fun n -> not (is_prime_mem (formula n)))
        (a * b, num_conseq_primes))
        |> Array.max_by snd)) |> Array.max_by snd |> fst
The key here to increase speed was to only generate the set of primes between 1 and 1000 for the values of b (using the primes function, my implementation of the Sieve of Eratosthenes method). I also managed to make this code slightly more concise by eliminating the unnecessary Seq.map.
So, I'm pretty happy with the solution I have now (it takes just under a second), though of course any further suggestions would still be welcome...
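For readers who don't use F#, the approach described above (trial-division primality that rejects everything below 2, and b restricted to primes) can be sketched in Python roughly as follows. This is an illustration with names of my own choosing, not a port of the exact code, and it uses plain trial division instead of a sieve or memoization.

```python
def is_prime(n):
    """Trial division; False for everything below 2."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def count_consecutive_primes(a, b):
    """Number of consecutive n = 0, 1, 2, ... for which n*n + a*n + b is prime."""
    n = 0
    while is_prime(n * n + a * n + b):
        n += 1
    return n

def best_product(limit):
    """Product a*b of the best quadratic with |a| < limit and prime b < limit."""
    best_count, best = 0, 0
    for b in (p for p in range(2, limit) if is_prime(p)):
        for a in range(-(limit - 1), limit):
            count = count_consecutive_primes(a, b)
            if count > best_count:
                best_count, best = count, a * b
    return best
```

As a sanity check, the two examples quoted in the problem statement hold: count_consecutive_primes(1, 41) gives 40 and count_consecutive_primes(-79, 1601) gives 80.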
You could speed up your "is_prime" function by using a probabilistic algorithm. One of the easiest quick algorithms for this is the Miller-Rabin algorithm.
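A minimal sketch of Miller-Rabin in Python, for illustration only. It uses the fixed bases 2, 3, 5 and 7 instead of random ones. With those bases the test is known to be exact for n below 3,215,031,751, which comfortably covers the values that arise in this problem.

```python
def is_prime_mr(n):
    """Miller-Rabin test with the fixed bases 2, 3, 5 and 7."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in (2, 3, 5, 7):
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # base a witnesses that n is composite
    return True
```

For numbers as small as the ones here, trial division or a sieve may well be just as fast. The payoff of Miller-Rabin grows with the size of n.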
To get rid of half your computations, you could also make the array of possible a's contain only odd numbers.
my superfast python solution :P
flag = [0]*204
primes = []
def ifc(n): return flag[n>>6]&(1<<((n>>1)&31))
def isc(n): flag[n>>6]|=(1<<((n>>1)&31))
def sieve():
    for i in xrange(3, 114, 2):
        if ifc(i) == 0:
            for j in xrange(i*i, 12996, i<<1): isc(j)
def store():
    primes.append(2)
    for i in xrange(3, 1000, 2):
        if ifc(i) == 0: primes.append(i)
def isprime(n):
    if n < 2: return 0
    if n == 2: return 1
    if n & 1 == 0: return 0
    if ifc(n) == 0: return 1
    return 0
def main():
    sieve()
    store()
    mmax, ret = 0, 0
    for b in primes:
        for a in xrange(-999, 1000, 2):
            n = 1
            while isprime(n*n + a*n + b): n += 1
            if n > mmax: mmax, ret = n, a * b
    print ret
main()
