I tried to get the cube root in F#. But here is my problem.
let x5 = ((float 64) ** (1.0/3.0));;
val x5 : float = 4.0
int x5;; //expected 4
val it : int = 3
The result should be 4, not 3.
What's wrong?
Nothing is wrong, the thing is that the value of your x5 is a bit less, than 4.0.
You may explicitly see how much less using fsi:
let x5 = ((float 64) ** (1.0/3.0))
let err = 4.0 - x5;;
val x5 : float = 4.0
val err : float = 4.440892099e-16
It looks like you may be looking at the wrong variable.
I checked it myself and an example is here:
http://ideone.com/kn9jd
(ideone is a free online compilation/execution service.)
Related
I'm attempting some Newton Raphson updates. Here is a piece of code that compiles and runs (warning: infinite loop).
let thetam = [|beta; sigSq|] |> DenseVector
let mutable gm = grad yt xt betah sigSqh // returns DenseVector
let hm = hess yt xt betah sigSqh // return Matrix<float>
while gm*gm > 0.0001 do
gm <- grad yt xt betah sigSqh
thetam - (hess yt xt betah sigSqh).Inverse() * gm // unassigned compiles
However, as soon as I assign the last value to the mutable variable thetam as follows...
while gm*gm > 0.0001 do
gm <- grad yt xt betah sigSqh
thetam <- thetam - (hess yt xt betah sigSqh).Inverse() * gm // gm here has problems
a squigly red line under gm appears and the compiler complains The type 'Vector<float>' is not compatible with the type 'DenseVector'
However, the function grad is explicitly told to return a DenseVector and ordinarily works as expected.
let grad (yt : Vector<float>) (xt : Vector<float>) (beta : float) (sigSq : float) =
let T = (float yt.Count)
let gradBeta = (yt - beta * xt)*xt / sigSq
let gradSigSq = -0.5*T/sigSq + 0.5/sigSq**2.*(yt - beta * xt)*(yt - beta * xt)
[|gradBeta; gradSigSq|] |> DenseVector
Why is the assignment to thetam causing problems? Is there a magic way to perform updates without mutability?
Here is the complete script:
open System
open System.IO
open System.Windows.Forms
open System.Windows.Forms.DataVisualization
open FSharp.Data
open FSharp.Charting
open FSharp.Core.Operators
open MathNet.Numerics
open MathNet.Numerics.LinearAlgebra
open MathNet.Numerics.LinearAlgebra.Double
open MathNet.Numerics.Random
open MathNet.Numerics.Distributions
open MathNet.Numerics.Statistics
let beta, sigSq = 3., 9.
let xt = DenseVector [|23.; 78.; 43.; 32.; 90.; 66.; 89.; 34.; 72.; 99.|]
let T = xt.Count
let genProc () =
beta * xt + DenseVector [|for i in 1 .. T do yield Normal.Sample(0., Math.Sqrt(sigSq))|]
let llNormal (yt : Vector<float>) (xt : Vector<float>) (beta : float) (sigSq : float) =
let T = (float yt.Count)
let z = (yt - beta * xt) / Math.Sqrt(sigSq)
-0.5 * log (2. * Math.PI) - 0.5 * log (sigSq) - z*z/2./T/sigSq
let grad (yt : Vector<float>) (xt : Vector<float>) (beta : float) (sigSq : float) =
let T = (float yt.Count)
let gradBeta = (yt - beta * xt)*xt / sigSq
let gradSigSq = -0.5*T/sigSq + 0.5/sigSq**2.*(yt - beta * xt)*(yt - beta * xt)
[|gradBeta; gradSigSq|] |> DenseVector
let hess (yt : Vector<float>) (xt : Vector<float>) (beta : float) (sigSq : float) =
let T = (float yt.Count)
let z = yt - beta * xt
let h11 = -xt*xt/sigSq
let h22 = T*0.5/sigSq/sigSq - z*z/sigSq/sigSq/sigSq
let h12 = -1./sigSq**2.*((yt - beta * xt)*xt)
array2D [[h11;h12];[h12;h22]] |> DenseMatrix.ofArray2
let yt = genProc()
// until convergence
let mutable thetam = [|beta; sigSq|] |> DenseVector
let mutable gm = grad yt xt beta sigSq
while gm*gm > 0.0001 do
gm <- grad yt xt beta sigSq
// 'gm' here is complaining upon equation being assigned to thetam
thetam <- thetam - (hess yt xt beta sigSq).Inverse() * gm
You should change at least let mutable thetam = [|beta; sigSq|] |> DenseVector to
let mutable thetam = [|beta; sigSq|] |> DenseVector.ofArray (and possibly other DenseVector references). Mathnet for performance reasons does in-place changes so it might trip you if you use mutable references:
DenseVector(Double[] storage)
Create a new dense vector directly binding to a raw array. The array
is used directly without copying. Very efficient, but changes to the
array and the vector will affect each other.
Versus:
DenseVector OfArray(Double[] array)
Create a new dense vector as a copy of the given array. This new
vector will be independent from the array. A new memory block will be
allocated for storing the vector.
In fact we've seen this behavior in your previous question when Exponential.Samples behaved in a similar fashion.
The API docs (while not super user-friendly) are here.
type Point<'t> =
val X : 't
val Y : 't
new(x : 't,y : 't) = { X = x; Y = y }
let clampedSubtract (p1:Point<_>) (p2:Point<_>) =
Point( max (p2.X - p1.X) 0, max (p2.Y - p1.Y) 0 )
If you look at the code above, you will notice, that the function is not implemented as generic as it should be.
First, using the 0 in the max expressions clamps the type to int. But it should be the type of whatever type Point<'t> has and not Point<int>.
But even more important, this function can only work as expected, if signed types are used for `t.
This raises a few questions of mine:
Is there a way to obtain the neutral element (zero) from a generic (number) type?
How can I express a restriction such as "only signed number"?
Is there a way to extend type constraint system in F#?
Thanks, in advance.
The solution to the first question as already answered is to use an inline function together with GenericZero and that's all.
Regarding the signed restriction, actually there's an easy way to restrict it to signed types. Use somewhere the generic unary negation which is defined only for signed types:
let inline clampedSubtract (p1:Point<_>) (p2:Point<_>) =
let zero = LanguagePrimitives.GenericZero
Point( max (p2.X + -p1.X) zero, max (p2.Y + -p1.Y) zero )
let result1 = clampedSubtract (Point(4 , 5 )) (Point(4 , 5 ))
let result2 = clampedSubtract (Point(4y , 5y )) (Point(4y , 5y ))
let result3 = clampedSubtract (Point(4uy, 5uy)) (Point(4uy, 5uy)) // doesn't compile
In general, if you want to restrict any generic function to signed types you can define this function:
let inline whenSigned x = ignore (-x)
let inline clampedSubtract (p1:Point<_>) (p2:Point<_>) =
whenSigned p1.X
let zero = LanguagePrimitives.GenericZero
Point( max (p2.X - p1.X) zero, max (p2.Y - p1.Y) zero )
Finally regarding your third question it's not very clear to me what do you mean with extending the type system. You can create static constraints by yourself, in that sense the system is already extensible.
I did a project sometime ago to emulate some Haskell types, part of the code of that project is still in a module in FsControl there you can have an idea to what level you can play with those constraints.
This makes it generic:
let inline clampedSubtract (p1:Point<_>) (p2:Point<_>) =
let zero = LanguagePrimitives.GenericZero
Point( max (p2.X - p1.X) zero, max (p2.Y - p1.Y) zero )
But there's no way to constrain it to signed primitive types.
I am very green when it comes to F#, and I have run across a small issue dealing with recursive functions that I was hoping could help me understand.
I have a function that is supposed to spit out the next even number:
let rec nextEven(x) =
let y = x + 1
if y % 2 = 0 then y
else nextEven y
// This never returns..
nextEven 3;;
I use the 'rec' keyword so that it will be recursive, although when I use it, it will just run in an endless loop for some reason. If I rewrite the function like this:
let nextEven(x) =
let y = x + 1
if y % 2 = 0 then y
else nextEven y
Then everything works fine (no rec keyword). For some reason I though I needed 'rec' since the function is recursive (so why don't I?) and why does the first version of the function run forever ?
EDIT
Turns out this was a total noob mistake. I had created multiple definitions of the function along the way, as is explained in the comments + answers.
I suspect you have multiple definitions of nextEven. That's the only explanation for your second example compiling. Repro:
module A =
let rec nextEven(x) =
let y = x + 1
if y % 2 = 0 then y
else nextEven y
open A //the function below will not compile without this
let nextEven(x) =
let y = x + 1
if y % 2 = 0 then y
else nextEven y //calling A.nextEven
Try resetting your FSI session.
Hello everyone
I have converted a project in C# to F# that paints the Mandelbrot set.
Unfortunately does it take around one minute to render a full screen so I am try to find some ways to speed it up.
It is one call that take almost all of the time:
Array.map (fun x -> this.colorArray.[CalcZ x]) xyArray
xyArray (double * double) [] => (array of tuple of double)
colorArray is an array of int32 length = 255
CalcZ is defined as:
let CalcZ (coord:double * double) =
let maxIterations = 255
let rec CalcZHelper (xCoord:double) (yCoord:double) // line break inserted
(x:double) (y:double) iters =
let newx = x * x + xCoord - y * y
let newy = 2.0 * x * y + yCoord
match newx, newy, iters with
| _ when Math.Abs newx > 2.0 -> iters
| _ when Math.Abs newy > 2.0 -> iters
| _ when iters = maxIterations -> iters
| _ -> CalcZHelper xCoord yCoord newx newy (iters + 1)
CalcZHelper (fst coord) (snd coord) (fst coord) (snd coord) 0
As I only use around half of the processor capacity is an idea to use more threads and specifically Array.Parallel.map, translates to system.threading.tasks.parallel
Now my question
A naive solution, would be:
Array.Parallel.map (fun x -> this.colorArray.[CalcZ x]) xyArray
but that took twice the time, how can I rewrite this to take less time, or can I take some other way to utilize the processor better?
Thanks in advance
Gorgen
---edit---
the function that is calling CalcZ looks like this:
let GetMatrix =
let halfX = double bitmap.PixelWidth * scale / 2.0
let halfY = double bitmap.PixelHeight * scale / 2.0
let rect:Mandelbrot.Rectangle =
{xMax = centerX + halfX; xMin = centerX - halfX;
yMax = centerY + halfY; yMin = centerY - halfY;}
let size:Mandelbrot.Size =
{x = bitmap.PixelWidth; y = bitmap.PixelHeight}
let xyList = GenerateXYTuple rect size
let xyArray = Array.ofList xyList
Array.map (fun x -> this.colorArray.[CalcZ x]) xyArray
let region:Int32Rect = new Int32Rect(0,0,bitmap.PixelWidth,bitmap.PixelHeight)
bitmap.WritePixels(region, GetMatrix, bitmap.PixelWidth * 4, region.X, region.Y);
GenerateXYTuple:
let GenerateXYTuple (rect:Rectangle) (pixels:Size) =
let xStep = (rect.xMax - rect.xMin)/double pixels.x
let yStep = (rect.yMax - rect.yMin)/double pixels.y
[for column in 0..pixels.y - 1 do
for row in 0..pixels.x - 1 do
yield (rect.xMin + xStep * double row,
rect.yMax - yStep * double column)]
---edit---
Following a suggestion from kvb (thanks a lot!) in a comment to my question, I built the program in Release mode. Building in the Relase mode generally speeded up things.
Just building in Release took me from 50s to around 30s, moving in all transforms on the array so it all happens in one pass made it around 10 seconds faster. At last using the Array.Parallel.init brought me to just over 11 seconds.
What I learnt from this is.... Use the release mode when timing things and using parallel constructs...
One more time, thanks for the help I have recieved.
--edit--
by using SSE assember from a native dll I have been able to slash the time from around 12 seconds to 1.2 seconds for a full screen of the most computational intensive points. Unfortunately I don't have a graphics processor...
Gorgen
Per the comment on the original post, here is the code I wrote to test the function. The fast version only takes a few seconds on my average workstation. It is fully sequential, and has no parallel code.
It's moderately long, so I posted it on another site: http://pastebin.com/Rjj8EzCA
I'm suspecting that the slowdown you are seeing is in the rendering code.
I don't think that the Array.Parallel.map function (which uses Parallel.For from .NET 4.0 under the cover) should have trouble parallelizing the operation if it runs a simple function ~1 million times. However, I encountered some weird performance behavior in a similar case when F# didn't optimize the call to the lambda function (in some way).
I'd try taking a copy of the Parallel.map function from the F# sources and adding inline. Try adding the following map function to your code and use it instead of the one from F# libraries:
let inline map (f: 'T -> 'U) (array : 'T[]) : 'U[]=
let inputLength = array.Length
let result = Array.zeroCreate inputLength
Parallel.For(0, inputLength, fun i ->
result.[i] <- f array.[i]) |> ignore
result
As an aside, it looks like you're generating an array of coordinates and then mapping it to an array of results. You don't need to create the coordinate array if you use the init function instead of map: Array.Parallel.init 1000 (fun y -> Array.init 1000 (fun x -> this.colorArray.[CalcZ (x, y)]))
EDIT: The following may be inaccurate:
Your problem could be that you call a tiny function a million times, causing the scheduling overhead to overwhelm that actual work you're doing. You should partition the array into much larger chunks so that each individual task takes a millisecond or so. You can use an array of arrays so that you would call Array.Parallel.map on the outer arrays and Array.map on the inner arrays. That way each parallel operation will operate on a whole row of pixels instead of just a single pixel.
I've been struggling with the following code. It's an F# implementation of the Forward-Euler algorithm used for modelling stars moving in a gravitational field.
let force (b1:Body) (b2:Body) =
let r = (b2.Position - b1.Position)
let rm = (float32)r.MagnitudeSquared + softeningLengthSquared
if (b1 = b2) then
VectorFloat.Zero
else
r * (b1.Mass * b2.Mass) / (Math.Sqrt((float)rm) * (float)rm)
member this.Integrate(dT, (bodies:Body[])) =
for i = 0 to bodies.Length - 1 do
for j = (i + 1) to bodies.Length - 1 do
let f = force bodies.[i] bodies.[j]
bodies.[i].Acceleration <- bodies.[i].Acceleration + (f / bodies.[i].Mass)
bodies.[j].Acceleration <- bodies.[j].Acceleration - (f / bodies.[j].Mass)
bodies.[i].Position <- bodies.[i].Position + bodies.[i].Velocity * dT
bodies.[i].Velocity <- bodies.[i].Velocity + bodies.[i].Acceleration * dT
While this works it isn't exactly "functional". It also suffers from horrible performance, it's 2.5 times slower than the equivalent c# code. bodies is an array of structs of type Body.
The thing I'm struggling with is that force() is an expensive function so usually you calculate it once for each pair and rely on the fact that Fij = -Fji. But this really messes up any loop unfolding etc.
Suggestions gratefully received! No this isn't homework...
Thanks,
Ade
UPDATED: To clarify Body and VectorFloat are defined as C# structs. This is because the program interops between F#/C# and C++/CLI. Eventually I'm going to get the code up on BitBucket but it's a work in progress I have some issues to sort out before I can put it up.
[StructLayout(LayoutKind.Sequential)]
public struct Body
{
public VectorFloat Position;
public float Size;
public uint Color;
public VectorFloat Velocity;
public VectorFloat Acceleration;
'''
}
[StructLayout(LayoutKind.Sequential)]
public partial struct VectorFloat
{
public System.Single X { get; set; }
public System.Single Y { get; set; }
public System.Single Z { get; set; }
}
The vector defines the sort of operators you'd expect for a standard Vector class. You could probably use the Vector3D class from the .NET framework for this case (I'm actually investigating cutting over to it).
UPDATE 2: Improved code based on the first two replies below:
for i = 0 to bodies.Length - 1 do
for j = (i + 1) to bodies.Length - 1 do
let r = ( bodies.[j].Position - bodies.[i].Position)
let rm = (float32)r.MagnitudeSquared + softeningLengthSquared
let f = r / (Math.Sqrt((float)rm) * (float)rm)
bodies.[i].Acceleration <- bodies.[i].Acceleration + (f * bodies.[j].Mass)
bodies.[j].Acceleration <- bodies.[j].Acceleration - (f * bodies.[i].Mass)
bodies.[i].Position <- bodies.[i].Position + bodies.[i].Velocity * dT
bodies.[i].Velocity <- bodies.[i].Velocity + bodies.[i].Acceleration * dT
The branch in the force function to cover the b1 == b2 case is the worst offender. You do't need this if softeningLength is always non-zero, even if it's very small (Epsilon). This optimization was in the C# code but not the F# version (doh!).
Math.Pow(x, -1.5) seems to be a lot slower than 1/ (Math.Sqrt(x) * x). Essentially this algorithm is slightly odd in that it's perfromance is dictated by the cost of this one step.
Moving the force calculation inline and getting rid of some divides also gives some improvement, but the performance was really being killed by the branching and is dominated by the cost of Sqrt.
WRT using classes over structs: There are cases (CUDA and native C++ implementations of this code and a DX9 renderer) where I need to get the array of bodies into unmanaged code or onto a GPU. In these scenarios being able to memcpy a contiguous block of memory seems like the way to go. Not something I'd get from an array of class Body.
I'm not sure if it's wise to rewrite this code in a functional style. I've seen some attempts to write pair interaction calculations in a functional manner and each one of them was harder to follow than two nested loops.
Before looking at structs vs. classes (I'm sure someone else has something smart to say about this), maybe you can try optimizing the calculation itself?
You're calculating two acceleration deltas, let's call them dAi and dAj:
dAi = r*m1*m2/(rm*sqrt(rm)) / m1
dAj = r*m1*m2/(rm*sqrt(rm)) / m2
[note: m1 = bodies.[i].mass, m2=bodies.[j].mass]]
The division by mass cancels out like this:
dAi = rm2 / (rmsqrt(rm))
dAj = rm1 / (rmsqrt(rm))
Now you only have to calculate r/(rmsqrt(rm)) for each pair (i,j).
This can be optimized further, because 1/(rmsqrt(rm)) = 1/(rm^1.5) = rm^-1.5, so if you let r' = r * (rm ** -1.5), then Edit: no it can't, that's premature optimization talking right there (see comment). Calculating r' = 1.0 / (r * sqrt r) is fastest.
dAi = m2 * r'
dAj = m1 * r'
Your code would then become something like
member this.Integrate(dT, (bodies:Body[])) =
for i = 0 to bodies.Length - 1 do
for j = (i + 1) to bodies.Length - 1 do
let r = (b2.Position - b1.Position)
let rm = (float32)r.MagnitudeSquared + softeningLengthSquared
let r' = r * (rm ** -1.5)
bodies.[i].Acceleration <- bodies.[i].Acceleration + r' * bodies.[j].Mass
bodies.[j].Acceleration <- bodies.[j].Acceleration - r' * bodies.[i].Mass
bodies.[i].Position <- bodies.[i].Position + bodies.[i].Velocity * dT
bodies.[i].Velocity <- bodies.[i].Velocity + bodies.[i].Acceleration * dT
Look, ma, no more divisions!
Warning: untested code. Try at your own risk.
I'd like to play arround with your code, but it's difficult since the definition of Body and FloatVector is missing and they also seem to be missing from the orginal blog post you point to.
I'd hazard a guess that you could improve your performance and rewrite in a more functional style using F#'s lazy computations:
http://msdn.microsoft.com/en-us/library/dd233247(VS.100).aspx
The idea is fairly simple you wrap any expensive computation that could be repeatedly calculated in a lazy ( ... ) expression then you can force the computation as many times as you like and it will only ever be calculated once.