Why is this F# sequence function not tail recursive?

Disclosure: this came up in FsCheck, an F# random testing framework I maintain. I have a solution, but I do not like it. Moreover, I do not understand the problem - it was merely circumvented.
A fairly standard implementation of (monadic, if we're going to use big words) sequence is:
let sequence l =
    let k m m' = gen { let! x = m
                       let! xs = m'
                       return (x::xs) }
    List.foldBack k l (gen { return [] })
Where gen can be replaced by a computation builder of choice. Unfortunately, that implementation consumes stack space, and so eventually overflows the stack if the list is long enough. The question is: why? I know in principle foldBack is not tail recursive, but the clever bunnies of the F# team have circumvented that in the foldBack implementation. Is there a problem in the computation builder implementation?
If I change the implementation to the below, everything is fine:
let sequence l =
    let rec go gs acc size r0 =
        match gs with
        | [] -> List.rev acc
        | (Gen g)::gs' ->
            let r1,r2 = split r0
            let y = g size r1
            go gs' (y::acc) size r2
    Gen(fun n r -> go l [] n r)
For completeness, the Gen type and computation builder can be found in the FsCheck source

Building on Tomas's answer, let's define two modules:
module Kurt =
    type Gen<'a> = Gen of (int -> 'a)
    let unit x = Gen (fun _ -> x)
    let bind k (Gen m) =
        Gen (fun n ->
            let (Gen m') = k (m n)
            m' n)
    type GenBuilder() =
        member x.Return(v) = unit v
        member x.Bind(v,f) = bind f v
    let gen = GenBuilder()
module Tomas =
    type Gen<'a> = Gen of (int -> ('a -> unit) -> unit)
    let unit x = Gen (fun _ f -> f x)
    let bind k (Gen m) =
        Gen (fun n f ->
            m n (fun r ->
                let (Gen m') = k r
                m' n f))
    type GenBuilder() =
        member x.Return v = unit v
        member x.Bind(v,f) = bind f v
    let gen = GenBuilder()
To simplify things a bit, let's rewrite your original sequence function as
let rec sequence = function
    | [] -> gen { return [] }
    | m::ms ->
        gen {
            let! x = m
            let! xs = sequence ms
            return x::xs }
Now, sequence [for i in 1 .. 100000 -> unit i] will run to completion regardless of whether sequence is defined in terms of Kurt.gen or Tomas.gen. The issue is not that sequence causes a stack overflow when using your definitions, it's that the function returned from the call to sequence causes a stack overflow when it is called.
To see why this is so, let's expand the definition of sequence in terms of the underlying monadic operations:
let rec sequence = function
    | [] -> unit []
    | m::ms ->
        bind (fun x -> bind (fun xs -> unit (x::xs)) (sequence ms)) m
Inlining the Kurt.unit and Kurt.bind values and simplifying like crazy, we get
let rec sequence = function
    | [] -> Kurt.Gen(fun _ -> [])
    | (Kurt.Gen m)::ms ->
        Kurt.Gen(fun n ->
            let (Kurt.Gen ms') = sequence ms
            (m n)::(ms' n))
Now it's hopefully clear why calling let (Kurt.Gen f) = sequence [for i in 1 .. 1000000 -> unit i] in f 0 overflows the stack: f requires a non-tail-recursive call to sequence and evaluation of the resulting function, so there will be one stack frame for each recursive call.
Inlining Tomas.unit and Tomas.bind into the definition of sequence instead, we get the following simplified version:
let rec sequence = function
    | [] -> Tomas.Gen (fun _ f -> f [])
    | (Tomas.Gen m)::ms ->
        Tomas.Gen(fun n f ->
            m n (fun r ->
                let (Tomas.Gen ms') = sequence ms
                ms' n (fun rs -> f (r::rs))))
Reasoning about this variant is tricky. You can empirically verify that it won't blow the stack for some arbitrarily large inputs (as Tomas shows in his answer), and you can step through the evaluation to convince yourself of this fact. However, the stack consumption depends on the Gen instances in the list that's passed in, and it is possible to blow the stack for inputs that aren't themselves tail recursive:
// ok
let (Tomas.Gen f) = sequence [for i in 1 .. 1000000 -> unit i]
f 0 (fun list -> printfn "%i" list.Length)
// not ok...
let (Tomas.Gen f) = sequence [for i in 1 .. 1000000 -> Gen(fun _ f -> f i; printfn "%i" i)]
f 0 (fun list -> printfn "%i" list.Length)
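For the simplified Gen of (int -> 'a) type, the stack can also be avoided without continuations by running the generators in a loop with an accumulator, in the spirit of the workaround in the question. The following is only a sketch under that simplification: sequenceLoop is a hypothetical name, and the real FsCheck Gen also threads a random state, which is left out here.

```fsharp
// Simplified generator: a function from a size to a value (no random state).
type Gen<'a> = Gen of (int -> 'a)

let unit x = Gen (fun _ -> x)

// Hypothetical stack-safe variant: run each generator in turn inside a
// tail-recursive loop, collecting results in reverse, then reverse once.
let sequenceLoop (gens: Gen<'a> list) : Gen<'a list> =
    Gen (fun n ->
        let rec go acc remaining =
            match remaining with
            | [] -> List.rev acc
            | Gen g :: rest -> go (g n :: acc) rest
        go [] gens)

let (Gen run) = sequenceLoop [ for i in 1 .. 1000000 -> unit i ]
printfn "%d" (run 0).Length
```

The recursive call to go is in tail position, so the stack depth does not depend on the length of the list; the price is that the monadic structure is bypassed rather than fixed.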

You're correct - the reason why you're getting a stack overflow is that the bind operation of the monad needs to be tail-recursive (because it is used to aggregate values during folding).
The monad used in FsCheck is essentially a state monad (it keeps the current generator and some number). I simplified it a bit and got something like:
type Gen<'a> = Gen of (int -> 'a)
let unit x = Gen (fun n -> x)
let bind k (Gen m) =
    Gen (fun n ->
        let (Gen m') = k (m n)
        m' n)
Here, the bind function is not tail-recursive because it calls k and then does some more work. You can change the monad to be a continuation monad. It is implemented as a function that takes the state and a continuation - a function that is called with the result as an argument. For this monad, you can make bind tail recursive:
type Gen<'a> = Gen of (int -> ('a -> unit) -> unit)
let unit x = Gen (fun n f -> f x)
let bind k (Gen m) =
    Gen (fun n f ->
        m n (fun r ->
            let (Gen m') = k r
            m' n f))
The following example will not stack overflow (and it did with the original implementation):
let sequence l =
    let k m m' =
        m |> bind (fun x ->
            m' |> bind (fun xs ->
                unit (x::xs)))
    List.foldBack k l (unit [])
let (Gen f) = sequence [ for i in 1 .. 100000 -> unit i ]
f 0 (fun list -> printfn "%d" list.Length)

Related

F#, implement fold3, fold4, fold_n

I am interested to implement fold3, fold4 etc., similar to List.fold and List.fold2. e.g.
// TESTCASE
let polynomial (x:double) a b c = a*x + b*x*x + c*x*x*x
let A = [2.0; 3.0; 4.0; 5.0]
let B = [1.5; 1.0; 0.5; 0.2]
let C = [0.8; 0.01; 0.001; 0.0001]
let result = fold3 polynomial 0.7 A B C
// 2.0 * (0.7 ) + 1.5 * (0.7 )^2 + 0.8 * (0.7 )^3 -> 2.4094
// 3.0 * (2.4094) + 1.0 * (2.4094)^2 + 0.01 * (2.4094)^3 -> 13.173
// 4.0 * (13.173) + 0.5 * (13.173)^2 + 0.001 * (13.173)^3 -> 141.75
// 5.0 * (141.75) + 0.2 * (141.75)^2 + 0.0001 * (141.75)^3 -> 5011.964
//
// Output: result = 5011.964
My first method is to group the three lists A, B, C into a list of tuples, and then apply List.fold:
let fold3 f x A B C =
    List.map3 (fun a b c -> (a,b,c)) A B C
    |> List.fold (fun acc (a,b,c) -> f acc a b c) x
// e.g. creates [(2.0,1.5,0.8); (3.0,1.0,0.01); ......]
My second method is to declare a mutable variable and use List.map3:
let mutable result = 0.7
List.map3 (fun a b c ->
    result <- polynomial result a b c // Change mutable data
    // Output intermediate data
    result) A B C
// Output from List.map3: [2.4094; 13.17327905; 141.7467853; 5011.963942]
// result mutable: 5011.963942
I would like to know if there are other ways to solve this problem. Thank you.
For fold3, you could just do zip3 and then fold:
let polynomial (x:double) (a, b, c) = a*x + b*x*x + c*x*x*x
List.zip3 A B C |> List.fold polynomial 0.7
But if you want this for the general case, then you need what we call "applicative functors".
First, imagine you have a list of functions and a list of values. Let's assume for now they're of the same size:
let fs = [ (fun x -> x+1); (fun x -> x+2); (fun x -> x+3) ]
let xs = [3;5;7]
And what you'd like to do (only natural) is to apply each function to each value. This is easily done with List.map2:
let apply fs xs = List.map2 (fun f x -> f x) fs xs
apply fs xs // Result = [4;7;10]
This operation "apply" is why these are called "applicative functors". Not just any ol' functors, but applicative ones. (the reason for why they're "functors" is a tad more complicated)
So far so good. But wait! What if each function in my list of functions returned another function?
let f1s = [ (fun x -> fun y -> x+y); (fun x -> fun y -> x-y); (fun x -> fun y -> x*y) ]
Or, if I remember that fun x -> fun y -> ... can be written in the short form of fun x y -> ...
let f1s = [ (fun x y -> x+y); (fun x y -> x-y); (fun x y -> x*y) ]
What if I apply such a list of functions to my values? Well, naturally, I'll get another list of functions:
let f2s = apply f1s xs
// f2s = [ (fun y -> 3+y); (fun y -> 5-y); (fun y -> 7*y) ]
Hey, here's an idea! Since f2s is also a list of functions, can I apply it again? Well of course I can!
let ys = [1;2;3]
apply f2s ys // Result: [4;3;21]
Wait, what? What just happened?
I first applied the first list of functions to xs, and got another list of functions as a result. And then I applied that result to ys, and got a list of numbers.
We could rewrite that without the intermediate variable f2s:
let f1s = [ (fun x y -> x+y); (fun x y -> x-y); (fun x y -> x*y) ]
let xs = [3;5;7]
let ys = [1;2;3]
apply (apply f1s xs) ys // Result: [4;3;21]
For extra convenience, this operation apply is usually expressed as an operator:
let (<*>) = apply
f1s <*> xs <*> ys
See what I did there? With this operator, it now looks very similar to just calling the function with two arguments. Neat.
But wait. What about our original task? In the original requirements we don't have a list of functions, we only have one single function.
Well, that can be easily fixed with another operation, let's call it "apply first". This operation will take a single function (not a list) plus a list of values, and apply this function to each value in the list:
let applyFirst f xs = List.map f xs
Oh, wait. That's just map. Silly me :-)
For extra convenience, this operation is usually also given an operator name:
let (<|>) = List.map
And now, I can do things like this:
let f x y = x + y
let xs = [3;5;7]
let ys = [1;2;3]
f <|> xs <*> ys // Result: [4;7;10]
Or this:
let f x y z = (x + y)*z
let xs = [3;5;7]
let ys = [1;2;3]
let zs = [1;-1;100]
f <|> xs <*> ys <*> zs // Result: [4;-7;1000]
Neat! I made it so I can apply arbitrary functions to lists of arguments at once!
Now, finally, you can apply this to your original problem:
let polynomial a b c (x:double) = a*x + b*x*x + c*x*x*x
let A = [2.0; 3.0; 4.0; 5.0]
let B = [1.5; 1.0; 0.5; 0.2]
let C = [0.8; 0.01; 0.001; 0.0001]
let ps = polynomial <|> A <*> B <*> C
let result = ps |> List.fold (fun x f -> f x) 0.7
The list ps consists of polynomial instances that are partially applied to corresponding elements of A, B, and C, and still expecting the final argument x. And on the next line, I simply fold over this list of functions, applying each of them to the result of the previous.
You could check the implementation for ideas:
https://github.com/fsharp/fsharp/blob/master/src/fsharp/FSharp.Core/array.fs
let fold<'T,'State> (f : 'State -> 'T -> 'State) (acc: 'State) (array:'T[]) =
    checkNonNull "array" array
    let f = OptimizedClosures.FSharpFunc<_,_,_>.Adapt(f)
    let mutable state = acc
    for i = 0 to array.Length-1 do
        state <- f.Invoke(state,array.[i])
    state
Here are a few implementations for you:
let fold2<'a,'b,'State> (f : 'State -> 'a -> 'b -> 'State) (acc: 'State) (a:'a array) (b:'b array) =
    let mutable state = acc
    Array.iter2 (fun x y -> state <- f state x y) a b
    state
let iter3 f (a: 'a[]) (b: 'b[]) (c: 'c[]) =
    let f = OptimizedClosures.FSharpFunc<_,_,_,_>.Adapt(f)
    if a.Length <> b.Length || a.Length <> c.Length then failwithf "length"
    for i = 0 to a.Length-1 do
        f.Invoke(a.[i], b.[i], c.[i])
let altIter3 f (a: 'a[]) (b: 'b[]) (c: 'c[]) =
    if a.Length <> b.Length || a.Length <> c.Length then failwithf "length"
    for i = 0 to a.Length-1 do
        f (a.[i]) (b.[i]) (c.[i])
let fold3<'a,'b,'c,'State> (f : 'State -> 'a -> 'b -> 'c -> 'State) (acc: 'State) (a:'a array) (b:'b array) (c:'c array) =
    let mutable state = acc
    iter3 (fun x y z -> state <- f state x y z) a b c
    state
NB. there is no built-in iter3, so we implement one. OptimizedClosures.FSharpFunc only allows up to 5 (or is it 7?) params. There are a finite number of type slots available. It makes sense. You can go higher than this, of course, without using the OptimizedClosures stuff.
... anyway, generally, you don't want to be iterating too many lists / arrays / sequences at once. So I'd caution against going too high.
... the better way forward in such cases may be to construct a record or tuple from said lists / arrays, first. Then, you can just use map and iter, which are already baked in. This is what zip / zip3 are all about (see: "(array1.[i],array2.[i],array3.[i])")
let zip3 (array1: _[]) (array2: _[]) (array3: _[]) =
    checkNonNull "array1" array1
    checkNonNull "array2" array2
    checkNonNull "array3" array3
    let len1 = array1.Length
    if len1 <> array2.Length || len1 <> array3.Length then invalidArg3ArraysDifferent "array1" "array2" "array3" len1 array2.Length array3.Length
    let res = Microsoft.FSharp.Primitives.Basics.Array.zeroCreateUnchecked len1
    for i = 0 to res.Length-1 do
        res.[i] <- (array1.[i],array2.[i],array3.[i])
    res
I'm working with arrays at the moment, so my solution pertained to those. Sorry about that. Here's a recursive version for lists.
let fold3 f acc a b c =
    let mutable state = acc
    let rec fold3 f a b c =
        match a,b,c with
        | [],[],[] -> ()
        | [],_,_
        | _,[],_
        | _,_,[] -> failwith "length"
        | ahead::atail, bhead::btail, chead::ctail ->
            state <- f state ahead bhead chead
            fold3 f atail btail ctail
    fold3 f a b c
    state
i.e. we define a recursive function within a function which acts upon/mutates/changes the outer-scoped mutable state variable (a closure in functional speak). Finally, state gets returned.
It's pretty cool how much type information gets inferred about these functions. In the array examples above, mostly I was explicit with 'a 'b 'c. This time, we let type inference kick in. It knows we're dealing with lists from the :: operator. That's kind of neat.
NB. the compiler will probably unwind this tail-recursive approach so that it is just a loop behind-the-scenes. Generally, get a correct answer before optimising. Just mentioning this, though, as food for later thought.
I think the existing answers provide great options if you want to generalize folding, which was your original question. However, if I simply wanted to call the polynomial function on inputs specified in A, B and C, then I would probably not want to introduce fairly complex constructs like applicative functors with fancy operators into my code base.
The problem becomes a lot easier if you transpose the input data, so that rather than having a list [A; B; C] with lists for individual variables, you have a transposed list with inputs for calculating each polynomial. To do this, we'll need the transpose function:
let rec transpose = function
    | (_::_)::_ as M -> List.map List.head M :: transpose (List.map List.tail M)
    | _ -> []
Now you can create a list with inputs, transpose it and calculate all polynomials simply using List.map:
transpose [A; B; C]
|> List.map (function
    | [a; b; c] -> polynomial 0.7 a b c
    | _ -> failwith "wrong number of arguments")
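The snippet above maps polynomial over the transposed rows with x fixed at 0.7. If the goal is the running fold from the question's test case, the same transposed list can be fed to List.fold instead. A minimal sketch, assuming the polynomial, A, B and C definitions from the question (repeated here so the block stands alone):

```fsharp
let polynomial (x: double) a b c = a*x + b*x*x + c*x*x*x

// Turn [A; B; C] (one list per variable) into one [a; b; c] row per step.
let rec transpose = function
    | (_ :: _) :: _ as rows -> List.map List.head rows :: transpose (List.map List.tail rows)
    | _ -> []

let A = [2.0; 3.0; 4.0; 5.0]
let B = [1.5; 1.0; 0.5; 0.2]
let C = [0.8; 0.01; 0.001; 0.0001]

// Thread the accumulator through the rows instead of mapping each row independently.
let result =
    transpose [A; B; C]
    |> List.fold (fun acc row ->
        match row with
        | [a; b; c] -> polynomial acc a b c
        | _ -> failwith "wrong number of arguments") 0.7

printfn "%f" result // approximately 5011.964, as in the question's test case
```

Note that this only works when all inputs have the same element type, since they must fit into a single list of lists.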
There are many ways to solve this problem. A few have been mentioned, such as zipping all three lists with zip3 and then running over the result. Using applicative functors, as Fyodor Soikin describes, means you can turn any function with any number of arguments into a function that expects lists instead of single arguments. This is a good general solution that works with any number of lists.
While this is a good idea in general, I'm sometimes shocked that so few people use more low-level tools. In this case it is a good idea to use recursion, and to learn more about recursion.
Recursion is the right tool here because we have immutable data types. But you could consider how you would implement it with mutable lists and looping first, if that helps. The steps would be:
You loop over an index from 0 to the number of elements in the lists.
You check whether every list has an element at that index.
If every list has an element, you pass these elements to your "folder" function.
If at least one list doesn't have an element, you abort the loop.
The recursive version works exactly the same way, except that you don't use an index to access the elements. Instead, you chop off the first element from every list and then recurse on the remaining lists.
List.isEmpty is the function that checks whether a list is empty. You get the first element with List.head, and the remaining list with the first element removed with List.tail. This way you can just write:
let rec fold3 f acc l1 l2 l3 =
    let h = List.head
    let t = List.tail
    let empty = List.isEmpty
    if (empty l1) || (empty l2) || (empty l3)
    then acc
    else fold3 f (f acc (h l1) (h l2) (h l3)) (t l1) (t l2) (t l3)
The if line checks whether every list has at least one element. If that is true, it executes f acc (h l1) (h l2) (h l3): it calls f and passes it the first element of every list as an argument. The result is the new accumulator for the next fold3 call.
Now that you have worked on the first element of every list, you must chop off the first element of every list and continue with the remaining lists. You achieve that with List.tail, or in the above example (t l1) (t l2) (t l3). Those are the remaining lists for the next fold3 call.
Creating a fold4, fold5, fold6 and so on isn't really hard, and I think it is self-explanatory. My general advice is to learn a little bit more about recursion and try to write recursive List functions without pattern matching. Pattern matching is not always easier.
Some code examples:
fold3 (fun acc x y z -> x + y + z :: acc) [] [1;2;3] [10;20;30] [100;200;300] // [333;222;111]
fold3 (fun acc x y z -> x :: y :: z :: acc) [] [1;2;3] [10;20;30] [100;200;300] // [3; 30; 300; 2; 20; 200; 1; 10; 100]
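Following the same pattern, a fold4 might look like the sketch below. It is an illustration of the head/tail style described above rather than a library function: like the fold3 above, it stops silently as soon as any of the lists runs out instead of raising an error.

```fsharp
// fold4 in the same head/tail style: stop as soon as any list is empty,
// otherwise feed the four heads to f and recurse on the four tails.
let rec fold4 f acc l1 l2 l3 l4 =
    if List.isEmpty l1 || List.isEmpty l2 || List.isEmpty l3 || List.isEmpty l4
    then acc
    else
        fold4 f
            (f acc (List.head l1) (List.head l2) (List.head l3) (List.head l4))
            (List.tail l1) (List.tail l2) (List.tail l3) (List.tail l4)

let sums = fold4 (fun acc a b c d -> a + b + c + d :: acc) [] [1;2] [10;20] [100;200] [1000;2000]
printfn "%A" sums // [2222; 1111]
```

The recursive call is the last thing the function does, so it is a tail call and the stack does not grow with the length of the lists.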

Function to get the power sets of a set in F#

I'm trying to write a function in F# to get the powersets of a set. So far I have written :
let rec powerset = function
    | [] -> [[]]
    | [x] -> [[x]; []]
    | x::xs -> [x] :: (List.map (fun n -> [x; n]) xs) @ powerset xs;;
but this isn't returning the cases that have 3 or more elements, only the pairs, the single elements, and the empty set.
You are on the right track, here is a working solution:
let rec powerset =
    function
    | [] -> [[]]
    | (x::xs) ->
        let xss = powerset xs
        List.map (fun xs' -> x::xs') xss @ xss
You only need this one trick:
for each element x, half of the elements of the powerset include x and half do not,
so you recursively generate the powerset xss of the remaining elements and concatenate the two parts (List.map (fun xs' -> x::xs') xss prepends x to each of those).
But please note that this is not tail recursive and will blow the stack for bigger lists. You can take this idea and try to implement it with seq, or make a tail-recursive version if you like.
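One way to remove the explicit recursion is to express the same idea as a fold. This is a sketch, with powersetFold as a hypothetical name; it assumes that List.foldBack and List.map in FSharp.Core do not consume stack proportional to the list length. In practice the 2^n size of the result becomes the limit long before the recursion depth of n does.

```fsharp
// Same idea as the recursive version: for each element x (taken from the
// right), the new powerset is "every subset so far with x" plus "every
// subset so far without x".
let powersetFold xs =
    List.foldBack (fun x acc -> List.map (fun s -> x :: s) acc @ acc) xs [[]]

printfn "%A" (powersetFold [1; 2; 3])
// [[1; 2; 3]; [1; 2]; [1; 3]; [1]; [2; 3]; [2]; [3]; []]
```

Because foldBack processes the list from the right, the subsets come out in the same order as in the recursive version.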
Using seq
Here is a version that uses seq and the bijection between the binary representation of natural numbers (a subset of those) and the subsets of a set (you map the elements to digits and set 1 if the corresponding element is in the subset and 0 if not):
let powerset (xs : 'a seq) : 'a seq seq =
    let digits (n : bigint) : bool seq =
        Seq.unfold (fun n ->
                        if n <= 0I
                        then None
                        else Some (n &&& 1I = 1I, n >>> 1))
                   n
    let subsetBy (i : bigint) : 'a seq =
        Seq.zip xs (digits i)
        |> Seq.choose (fun (x,b) -> if b then Some x else None)
    seq { 0I .. 2I**(Seq.length xs)-1I }
    |> Seq.map subsetBy
This will work for things like powerset [1..100], but it might take a long time to enumerate them all ;) (it should not take too much memory, though...)

More volatile sequence than "classical"

For the Cartesian product there is a good enough function, sequence, which is defined like this:
let rec sequence = function
    | [] -> Seq.singleton []
    | (l::ls) -> seq { for x in l do for xs in sequence ls do yield (x::xs) }
But look at its result:
sequence [[1..2];[1..10000]] |> Seq.skip 1000 ;;
val it : seq<int list> = seq [[1; 1001]; [1; 1002]; [1; 1003]; [1; 1004]; ...]
As we can see, the first "coordinate" of the product changes very slowly: it only changes value once the second list has been exhausted.
I wrote my own sequence as follows (comments below):
/// Sum of all produced indices = n
let rec hyper'plane'indices indexsum maxlengths =
    match maxlengths with
    | [x] -> if indexsum < x then [[indexsum]] else []
    | (i::is) -> [for x in [0 .. min indexsum (i-1)] do for xs in hyper'plane'indices (indexsum-x) is do yield (x::xs)]
    | [] -> [[]]
let finite'sequence = function
    | [] -> Seq.singleton []
    | ns ->
        let ars = [ for n in ns -> Seq.toArray n ]
        let length'list = List.map Array.length ars
        let nmax = List.max length'list
        seq {
            for n in [0 .. nmax] do
                for ixs in hyper'plane'indices n length'list do
                    yield (List.map2 (fun (a:'a[]) i -> a.[i]) ars ixs)
        }
The key idea is to view the (two) lists as (two) orthogonal dimensions, where every element is marked by its index in the list. We can then enumerate all elements of the Cartesian product by enumerating the elements in each section of the product by a hyperplane (in the 2D case this is a line). In other words, imagine an Excel sheet where the first column contains the values from [1;1] to [1;10000] and the second those from [2;1] to [2;10000]. The "hyperplane" number 1 is then the line that connects cell A2 and cell B1. For our example:
hyper'plane'indices 0 [2;10000];; val it : int list list = [[0; 0]]
hyper'plane'indices 1 [2;10000];; val it : int list list = [[0; 1]; [1; 0]]
hyper'plane'indices 2 [2;10000];; val it : int list list = [[0; 2]; [1; 1]]
hyper'plane'indices 3 [2;10000];; val it : int list list = [[0; 3]; [1; 2]]
hyper'plane'indices 4 [2;10000];; val it : int list list = [[0; 4]; [1; 3]]
Now that we have the indices, and the arrays produced from the given lists, we can define the sequence as {all elements in plane 0; then all elements in plane 1; ... and so on} and get a more volatile function than the original sequence.
But finite'sequence turned out to be a very gluttonous function. And now the question: how can I improve it?
With best wishes, Alexander. (and sorry for poor English)
Can you explain what exactly is the problem - time or space complexity or performance? Do you have a specific benchmark in mind? I am not sure how to improve on the time complexity here, but I edited your code a bit to remove the intermediate lists, which might help a bit with memory allocation behavior.
Do not do this:
for n in [0 .. nmax] do
Do this instead:
for n in 0 .. nmax do
Here is the code:
let rec hyper'plane'indices indexsum maxlengths =
    match maxlengths with
    | [] -> Seq.singleton []
    | [x] -> if indexsum < x then Seq.singleton [indexsum] else Seq.empty
    | i :: is ->
        seq {
            for x in 0 .. min indexsum (i - 1) do
                for xs in hyper'plane'indices (indexsum - x) is do
                    yield x :: xs
        }
let finite'sequence xs =
    match xs with
    | [] -> Seq.singleton []
    | ns ->
        let ars = [ for n in ns -> Seq.toArray n ]
        let length'list = List.map Array.length ars
        let nmax = List.max length'list
        seq {
            for n in 0 .. nmax do
                for ixs in hyper'plane'indices n length'list do
                    yield List.map2 Array.get ars ixs
        }
Does this fare any better? Beautiful problem by the way.
UPDATE: Perhaps you are more interested in mixing the sequences fairly than in maintaining the exact formula in your algorithm. Here is some Haskell code that mixes a finite number of possibly infinite sequences fairly, where fairness means that for every input element there is a finite prefix of the output sequence that contains it. You mention in the comment that you have a 2D incremental solution that is hard to generalize to N dimensions, and the Haskell code does exactly that:
merge :: [a] -> [a] -> [a]
merge [] y = y
merge x [] = x
merge (x:xs) (y:ys) = x : y : merge xs ys
prod :: (a -> b -> c) -> [a] -> [b] -> [c]
prod _ [] _ = []
prod _ _ [] = []
prod f (x:xs) (y:ys) = f x y : a `merge` b `merge` prod f xs ys where
  a = [f x y | x <- xs]
  b = [f x y | y <- ys]
prodN :: [[a]] -> [[a]]
prodN [] = [[]]
prodN (x:xs) = prod (:) x (prodN xs)
I have not ported this to F# yet. It requires some thought, as F# sequences do not map onto head/tail decomposition very well.
UPDATE 2:
A fairly mechanical translation to F# follows.
type Node<'T> =
    | Nil
    | Cons of 'T * Stream<'T>
and Stream<'T> = Lazy<Node<'T>>
let ( !! ) (x: Lazy<'T>) = x.Value
let ( !^ ) x = Lazy.CreateFromValue(x)
let rec merge (xs: Stream<'T>) (ys: Stream<'T>) : Stream<'T> =
    lazy
        match !!xs, !!ys with
        | Nil, r | r, Nil -> r
        | Cons (x, xs), Cons (y, ys) -> Cons (x, !^ (Cons (y, merge xs ys)))
let rec map (f: 'T1 -> 'T2) (xs: Stream<'T1>) : Stream<'T2> =
    lazy
        match !!xs with
        | Nil -> Nil
        | Cons (x, xs) -> Cons (f x, map f xs)
let ( ++ ) = merge
let rec prod f xs ys =
    lazy
        match !!xs, !!ys with
        | Nil, _ | _, Nil -> Nil
        | Cons (x, xs), Cons (y, ys) ->
            let a = map (fun x -> f x y) xs
            let b = map (fun y -> f x y) ys
            Cons (f x y, a ++ b ++ prod f xs ys)
let ofSeq (s: seq<'T>) =
    lazy
        let e = s.GetEnumerator()
        let rec loop () =
            lazy
                if e.MoveNext()
                then Cons (e.Current, loop ())
                else
                    e.Dispose()
                    Nil
        !! (loop ())
let toSeq stream =
    stream
    |> Seq.unfold (fun stream ->
        match !!stream with
        | Nil -> None
        | Cons (x, xs) -> Some (x, xs))
let empty<'T> : Stream<'T> = !^ Nil
let cons x xs = !^ (Cons (x, xs))
let singleton x = cons x empty
let rec prodN (xs: Stream<Stream<'T>>) : Stream<Stream<'T>> =
    match !!xs with
    | Nil -> singleton empty
    | Cons (x, xs) -> prod cons x (prodN xs)
let test () =
    ofSeq [
        ofSeq [1; 2; 3]
        ofSeq [4; 5; 6]
        ofSeq [7; 8; 9]
    ]
    |> prodN
    |> toSeq
    |> Seq.iter (fun xs ->
        toSeq xs
        |> Seq.map string
        |> String.concat ", "
        |> stdout.WriteLine)

Combine memoization and tail-recursion

Is it possible to combine memoization and tail-recursion somehow? I'm learning F# at the moment and understand both concepts but can't seem to combine them.
Suppose I have the following memoize function (from Real-World Functional Programming):
let memoize f = let cache = new Dictionary<_, _>()
                (fun x -> match cache.TryGetValue(x) with
                          | true, y -> y
                          | _ -> let v = f(x)
                                 cache.Add(x, v)
                                 v)
and the following factorial function:
let rec factorial(x) = if (x = 0) then 1 else x * factorial(x - 1)
Memoizing factorial isn't too difficult and making it tail-recursive isn't either:
let rec memoizedFactorial =
    memoize (fun x -> if (x = 0) then 1 else x * memoizedFactorial(x - 1))
let tailRecursiveFactorial(x) =
    let rec factorialUtil(x, res) = if (x = 0)
                                    then res
                                    else let newRes = x * res
                                         factorialUtil(x - 1, newRes)
    factorialUtil(x, 1)
But can you combine memoization and tail-recursion? I made some attempts but can't seem to get it working. Or is this simply not possible?
As always, continuations yield an elegant tailcall solution:
open System.Collections.Generic
let cache = Dictionary<_,_>() // TODO move inside
let memoizedTRFactorial =
    let rec fac n k = // must make tailcalls to k
        match cache.TryGetValue(n) with
        | true, r -> k r
        | _ ->
            if n=0 then
                k 1
            else
                fac (n-1) (fun r1 ->
                    printfn "multiplying by %d" n //***
                    let r = r1 * n
                    cache.Add(n,r)
                    k r)
    fun n -> fac n id
printfn "---"
let r = memoizedTRFactorial 4
printfn "%d" r
for KeyValue(k,v) in cache do
    printfn "%d: %d" k v
printfn "---"
let r2 = memoizedTRFactorial 5
printfn "%d" r2
printfn "---"
// comment out *** line, then run this
//let r3 = memoizedTRFactorial 100000
//printfn "%d" r3
There are two kinds of tests. First, this demos that calling F(4) caches F(4), F(3), F(2), F(1) as you would like.
Then, comment out the *** printf and uncomment the final test (and compile in Release mode) to show that it does not StackOverflow (it uses tailcalls correctly).
Perhaps I'll generalize out 'memoize' and demonstrate it on 'fib' next...
EDIT
Ok, here's the next step, I think, decoupling memoization from factorial:
open System.Collections.Generic
let cache = Dictionary<_,_>() // TODO move inside
let memoize fGuts n =
    let rec newFunc n k = // must make tailcalls to k
        match cache.TryGetValue(n) with
        | true, r -> k r
        | _ ->
            fGuts n (fun r ->
                        cache.Add(n,r)
                        k r) newFunc
    newFunc n id
let TRFactorialGuts n k memoGuts =
    if n=0 then
        k 1
    else
        memoGuts (n-1) (fun r1 ->
            printfn "multiplying by %d" n //***
            let r = r1 * n
            k r)
let memoizedTRFactorial = memoize TRFactorialGuts
printfn "---"
let r = memoizedTRFactorial 4
printfn "%d" r
for KeyValue(k,v) in cache do
    printfn "%d: %d" k v
printfn "---"
let r2 = memoizedTRFactorial 5
printfn "%d" r2
printfn "---"
// comment out *** line, then run this
//let r3 = memoizedTRFactorial 100000
//printfn "%d" r3
EDIT
Ok, here's a fully generalized version that seems to work.
open System.Collections.Generic
let memoize fGuts =
    let cache = Dictionary<_,_>()
    let rec newFunc n k = // must make tailcalls to k
        match cache.TryGetValue(n) with
        | true, r -> k r
        | _ ->
            fGuts n (fun r ->
                        cache.Add(n,r)
                        k r) newFunc
    cache, (fun n -> newFunc n id)
let TRFactorialGuts n k memoGuts =
    if n=0 then
        k 1
    else
        memoGuts (n-1) (fun r1 ->
            printfn "multiplying by %d" n //***
            let r = r1 * n
            k r)
let facCache,memoizedTRFactorial = memoize TRFactorialGuts
printfn "---"
let r = memoizedTRFactorial 4
printfn "%d" r
for KeyValue(k,v) in facCache do
    printfn "%d: %d" k v
printfn "---"
let r2 = memoizedTRFactorial 5
printfn "%d" r2
printfn "---"
// comment out *** line, then run this
//let r3 = memoizedTRFactorial 100000
//printfn "%d" r3
let TRFibGuts n k memoGuts =
    if n=0 || n=1 then
        k 1
    else
        memoGuts (n-1) (fun r1 ->
            memoGuts (n-2) (fun r2 ->
                printfn "adding %d+%d" r1 r2 //%%%
                let r = r1+r2
                k r))
let fibCache, memoizedTRFib = memoize TRFibGuts
printfn "---"
let r5 = memoizedTRFib 4
printfn "%d" r5
for KeyValue(k,v) in fibCache do
    printfn "%d: %d" k v
printfn "---"
let r6 = memoizedTRFib 5
printfn "%d" r6
printfn "---"
// comment out %%% line, then run this
//let r7 = memoizedTRFib 100000
//printfn "%d" r7
The predicament of memoizing tail-recursive functions is, of course, that when a tail-recursive function
let f x =
    ......
    f x1
calls itself, it is not allowed to do anything with the result of the recursive call, including putting it into the cache. Tricky; so what can we do?
The critical insight here is that since the recursive function is not allowed to do anything with the result of the recursive call, the result for all arguments to recursive calls will be the same! Therefore, if the recursive call trace is this
f x0 -> f x1 -> f x2 -> f x3 -> ... -> f xN -> res
then for all x in x0,x1,...,xN the result of f x will be the same, namely res. So the last invocation of the recursive function, the non-recursive call, knows the results for all the previous values, and it is in a position to cache them. The only thing you need to do is to pass a list of visited values to it. Here is what it might look like for factorial:
let cache = Dictionary<_,_>()
let rec fact0 l ((n,res) as arg) =
    let commitToCache r =
        l |> List.iter (fun a -> cache.Add(a,r))
    match cache.TryGetValue(arg) with
    | true, cachedResult -> commitToCache cachedResult; cachedResult
    | false, _ ->
        if n = 1 then
            commitToCache res
            cache.Add(arg, res)
            res
        else
            fact0 (arg::l) (n-1, n*res)
let fact n = fact0 [] (n,1)
But wait! Look - l parameter of fact0 contains all the arguments to recursive calls to fact0 - just like the stack would in a non-tail-recursive version! That is exactly right. Any non-tail recursive algorithm can be converted to a tail-recursive one by moving the "list of stack frames" from stack to heap and converting the "postprocessing" of recursive call result into a walk over that data structure.
Pragmatic note: The factorial example above illustrates a general technique. It is quite useless as is - for factorial function it is quite enough to cache the top-level fact n result, because calculation of fact n for a particular n only hits a unique series of (n,res) pairs of arguments to fact0 - if (n,1) is not cached yet, then none of the pairs fact0 is going to be called on are.
Note that in this example, when we went from the non-tail-recursive factorial to the tail-recursive factorial, we exploited the fact that multiplication is associative and commutative: the tail-recursive factorial executes a different set of multiplications than the non-tail-recursive one.
In fact, a general technique exists for going from a non-tail-recursive to a tail-recursive algorithm, which yields an algorithm equivalent to a tee. This technique is called the "continuation-passing transformation". Going that route, you can take a non-tail-recursive memoizing factorial and get a tail-recursive memoizing factorial by pretty much a mechanical transformation. See Brian's answer for an exposition of this method.
I'm not sure if there's a simpler way to do this, but one approach would be to create a memoizing y-combinator:
let memoY f =
    let cache = Dictionary<_,_>()
    let rec fn x =
        match cache.TryGetValue(x) with
        | true,y -> y
        | _ -> let v = f fn x
               cache.Add(x,v)
               v
    fn
Then, you can use this combinator in lieu of "let rec", with the first argument representing the function to call recursively:
let tailRecFact =
    let factHelper fact (x, res) =
        printfn "%i,%i" x res
        if x = 0 then res
        else fact (x-1, x*res)
    let memoized = memoY factHelper
    fun x -> memoized (x,1)
EDIT
As Mitya pointed out, memoY doesn't preserve the tail recursive properties of the memoee. Here's a revised combinator which uses exceptions and mutable state to memoize any recursive function without overflowing the stack (even if the original function is not itself tail recursive!):
let memoY f =
    let cache = Dictionary<_,_>()
    fun x ->
        let l = ResizeArray([x])
        while l.Count <> 0 do
            let v = l.[l.Count - 1]
            if cache.ContainsKey(v) then l.RemoveAt(l.Count - 1)
            else
                try
                    cache.[v] <- f (fun x ->
                                        if cache.ContainsKey(x) then cache.[x]
                                        else
                                            l.Add(x)
                                            failwith "Need to recurse") v
                with _ -> ()
        cache.[x]
Unfortunately, the machinery which is inserted into each recursive call is somewhat heavy, so performance on un-memoized inputs requiring deep recursion can be a bit slow. However, compared to some other solutions, this has the benefit that it requires fairly minimal changes to the natural expression of recursive functions:
let fib = memoY (fun fib n ->
    printfn "%i" n;
    if n <= 1 then n
    else (fib (n-1)) + (fib (n-2)))
let _ = fib 5000
EDIT
I'll expand a bit on how this compares to other solutions. This technique takes advantage of the fact that exceptions provide a side channel: a function of type 'a -> 'b doesn't actually need to return a value of type 'b, but can instead exit via an exception. We wouldn't need to use exceptions if the return type explicitly contained an additional value indicating failure. Of course, we could use the 'b option as the return type of the function for this purpose. This would lead to the following memoizing combinator:
let memoO f =
    let cache = Dictionary<_,_>()
    fun x ->
        let l = ResizeArray([x])
        while l.Count <> 0 do
            let v = l.[l.Count - 1]
            if cache.ContainsKey v then l.RemoveAt(l.Count - 1)
            else
                // Return the cached value, or schedule x and signal "not ready yet"
                let recurse x =
                    if cache.ContainsKey x then Some(cache.[x])
                    else
                        l.Add(x)
                        None
                match f recurse v with
                | Some(r) -> cache.[v] <- r
                | None -> ()
        cache.[x]
Previously, our memoization process looked like:
fun fib n ->
    printfn "%i" n;
    if n <= 1 then n
    else (fib (n-1)) + (fib (n-2))
|> memoY
Now, we need to incorporate the fact that fib should return an int option instead of an int. Given a suitable workflow for option types, this could be written as follows:
fun fib n -> option {
    printfn "%i" n
    if n <= 1 then return n
    else
        let! x = fib (n-1)
        let! y = fib (n-2)
        return x + y
} |> memoO
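The option workflow itself is not shown here. As a rough sketch, and assuming nothing more than Return and Bind is needed for the snippet above, a builder along these lines should do. OptionBuilder and addOpt are names I made up for the illustration.

```fsharp
// Minimal option workflow (an assumption; the answer does not give its definition).
type OptionBuilder() =
    member this.Return(x) = Some x
    // Bind short-circuits: if m is None, the rest of the block is skipped.
    member this.Bind(m, f) = Option.bind f m

let option = OptionBuilder()

// Small usage example: the whole block is None as soon as one let! sees None.
let addOpt (a: int option) (b: int option) = option {
    let! x = a
    let! y = b
    return x + y }
```

This short-circuiting is what lets memoO abandon a call to f as soon as an uncached recursive result is requested, playing the same role as the exception in memoY.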
However, if we're willing to change the return type of the first parameter (from int to int option in this case), we may as well go all the way and just use continuations in the return type instead, as in Brian's solution. Here's a variation on his definitions:
let memoC f =
    let cache = Dictionary<_,_>()
    let rec fn n k =
        match cache.TryGetValue(n) with
        | true, r -> k r
        | _ ->
            f fn n (fun r ->
                cache.Add(n,r)
                k r)
    fun n -> fn n id
And again, if we have a suitable computation expression for building CPS functions, we can define our recursive function like this:
fun fib n -> cps {
    printfn "%i" n
    if n <= 1 then return n
    else
        let! x = fib (n-1)
        let! y = fib (n-2)
        return x + y
} |> memoC
This is exactly the same as what Brian has done, but I find the syntax here is easier to follow. To make this work, all we need are the following two definitions:
type CpsBuilder() =
    member this.Return x k = k x
    member this.Bind(m,f) k = m (fun a -> f a k)
let cps = CpsBuilder()
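Putting the pieces together, here is a self-contained sketch of how the builder and memoC fit. It repeats the definitions so it can be pasted into a script on its own. One deliberate difference, which is my choice and not part of the answer: the builder members return explicit lambdas instead of being curried, and the printfn is dropped to keep the output quiet.

```fsharp
open System.Collections.Generic

// Same idea as the builder above, written with explicit lambdas (my variation).
type CpsBuilder() =
    member this.Return(x) = fun k -> k x
    member this.Bind(m, f) = fun k -> m (fun a -> f a k)

let cps = CpsBuilder()

// memoC as above: results are cached just before being handed to the continuation.
let memoC f =
    let cache = Dictionary<_,_>()
    let rec fn n k =
        match cache.TryGetValue(n) with
        | true, r -> k r
        | _ ->
            f fn n (fun r ->
                cache.Add(n, r)
                k r)
    fun n -> fn n id

// The recursive function receives its own memoized, CPS-style self as the first argument.
let fib =
    memoC (fun fib n -> cps {
        if n <= 1 then return n
        else
            let! x = fib (n - 1)
            let! y = fib (n - 2)
            return x + y })
```

Each let! turns "the rest of the block" into a continuation, which is the same rewriting one would otherwise do by hand.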
I wrote a test to visualize the memoization. Each dot is a recursive call.
......720 // factorial 6
......720 // factorial 6
.....120 // factorial 5
......720 // memoizedFactorial 6
720 // memoizedFactorial 6
120 // memoizedFactorial 5
......720 // tailRecFact 6
720 // tailRecFact 6
.....120 // tailRecFact 5
......720 // tailRecursiveMemoizedFactorial 6
720 // tailRecursiveMemoizedFactorial 6
.....120 // tailRecursiveMemoizedFactorial 5
kvb's solution returns the same results as straight memoization like this function.
let tailRecursiveMemoizedFactorial =
    memoize
        (fun x ->
            let rec factorialUtil x res =
                if x = 0 then
                    res
                else
                    printf "."
                    let newRes = x * res
                    factorialUtil (x - 1) newRes
            factorialUtil x 1
        )
Test source code.
open System.Collections.Generic

let memoize f =
    let cache = new Dictionary<_, _>()
    (fun x ->
        match cache.TryGetValue(x) with
        | true, y -> y
        | _ ->
            let v = f(x)
            cache.Add(x, v)
            v)

let rec factorial(x) =
    if (x = 0) then
        1
    else
        printf "."
        x * factorial(x - 1)

let rec memoizedFactorial =
    memoize (
        fun x ->
            if (x = 0) then
                1
            else
                printf "."
                x * memoizedFactorial(x - 1))

let memoY f =
    let cache = Dictionary<_,_>()
    let rec fn x =
        match cache.TryGetValue(x) with
        | true,y -> y
        | _ -> let v = f fn x
               cache.Add(x,v)
               v
    fn

let tailRecFact =
    let factHelper fact (x, res) =
        if x = 0 then
            res
        else
            printf "."
            fact (x-1, x*res)
    let memoized = memoY factHelper
    fun x -> memoized (x,1)

let tailRecursiveMemoizedFactorial =
    memoize
        (fun x ->
            let rec factorialUtil x res =
                if x = 0 then
                    res
                else
                    printf "."
                    let newRes = x * res
                    factorialUtil (x - 1) newRes
            factorialUtil x 1
        )

factorial 6 |> printfn "%A"
factorial 6 |> printfn "%A"
factorial 5 |> printfn "%A\n"
memoizedFactorial 6 |> printfn "%A"
memoizedFactorial 6 |> printfn "%A"
memoizedFactorial 5 |> printfn "%A\n"
tailRecFact 6 |> printfn "%A"
tailRecFact 6 |> printfn "%A"
tailRecFact 5 |> printfn "%A\n"
tailRecursiveMemoizedFactorial 6 |> printfn "%A"
tailRecursiveMemoizedFactorial 6 |> printfn "%A"
tailRecursiveMemoizedFactorial 5 |> printfn "%A\n"
System.Console.ReadLine() |> ignore
That should work if mutual tail recursion through y is not creating stack frames:
let rec y f x = f (y f) x
let memoize (d:System.Collections.Generic.Dictionary<_,_>) f n =
    if d.ContainsKey n then d.[n]
    else
        d.Add(n, f n)
        d.[n]
let rec factorialucps factorial' n cont =
    if n = 0I then cont(1I) else factorial' (n-1I) (fun k -> cont (n*k))
let factorialdpcps =
    let d = System.Collections.Generic.Dictionary<_, _>()
    fun n -> y (factorialucps >> fun f n -> memoize d f n ) n id
factorialdpcps 15I //1307674368000

insertAt in F# simpler and/or better

I would like to start some questions about simplifying different expressions in F#.
Does anyone have ideas for a better and/or simpler implementation of insertAt (parameters could be reordered, too)? Lists or sequences could be used.
Here is a starting implementation:
let insertAt x xs n = Seq.concat [Seq.take n xs; seq [x]; Seq.skip n xs]
The implementation dannyasher posted is a non-tail-recursive one. In order to make the function more efficient, we'll have to introduce an explicit accumulator parameter which makes the function tail-recursive and allows the compiler to optimize the recursion overhead away:
let insertAt =
    let rec insertAtRec acc n e list =
        match n, list with
        | 0, _ -> (List.rev acc) @ [e] @ list
        | _, x::xs -> insertAtRec (x::acc) (n - 1) e xs
        | _ -> failwith "Index out of range"
    insertAtRec []
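A quick check of the accumulator approach, as a sketch only. I have written insertAt with explicit parameters here so that it stays generic, which is a small change from the point-free definition above. For comparison it also calls List.insertAt, assuming FSharp.Core 6.0 or later, where that function is part of the List module; viaAccumulator and viaLibrary are illustrative names.

```fsharp
// Accumulator-based version, repeated with explicit parameters so the snippet stands alone.
let insertAt n e list =
    let rec insertAtRec acc n e list =
        match n, list with
        | 0, _ -> List.rev acc @ [e] @ list
        | _, x::xs -> insertAtRec (x::acc) (n - 1) e xs
        | _ -> failwith "Index out of range"
    insertAtRec [] n e list

// Insert 0 at zero-based index 2.
let viaAccumulator = insertAt 2 0 [1 .. 5]

// Library function (assumes FSharp.Core 6.0 or later): index, value, source list.
let viaLibrary = List.insertAt 2 0 [1 .. 5]
```

Both calls give [1; 2; 0; 3; 4; 5]. Note that this version is zero-based, while the Haskell-derived version further down is one-based.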
Tail-recursive using Seqs:
let rec insertAt = function
    | 0, x, xs -> seq { yield x; yield! xs }
    | n, x, xs -> seq { yield Seq.head xs; yield! insertAt (n-1, x, Seq.skip 1 xs) }
Here's an F# implementation of the Haskell list insertion:
let rec insertAt x ys n =
    match n, ys with
    | 1, _
    | _, [] -> x::ys
    | _, y::ys -> y::insertAt x ys (n-1)
let a = [1 .. 5]
let b = insertAt 0 a 3
let c = insertAt 0 [] 3
>
val a : int list = [1; 2; 3; 4; 5]
val b : int list = [1; 2; 0; 3; 4; 5]
val c : int list = [0]
My Haskell isn't good enough to know whether the case of passing an empty list is correctly taken care of in the Haskell function. In F# we explicitly take care of the empty list in the second match case.
Danny
In case you really want to work with a sequence:
let insertAt x ys n =
    let i = ref n
    seq {
        for y in ys do
            decr i
            if !i = 0 then yield x
            yield y
    }
For all other cases dannyasher's answer is definitely nicer and faster.
From the Haskell Wiki - http://www.haskell.org/haskellwiki/99_questions/21_to_28
insertAt :: a -> [a] -> Int -> [a]
insertAt x ys 1 = x:ys
insertAt x (y:ys) n = y:insertAt x ys (n-1)
I'm not an F# programmer so I don't know the equivalent syntax for F#, but this is a nice recursive definition for insertAt.

Resources