Taking two streams and combining them in OCaml - stream

I want to take two streams of integers in increasing order and combine them into one stream that contains no duplicates and should be in increasing order. I have defined the functionality for streams in the following manner:
type 'a susp = Susp of (unit -> 'a)
let force (Susp f) = f()
type 'a str = {hd : 'a ; tl : ('a str) susp }
let merge s1 s2 = (* must implement *)
The first function suspends computation by wrapping a computation within a function, and the second function evaluates the function and provides me with the result of the computation.
I want to emulate the logic of how you go about combining lists, i.e. match on both lists and check which elements are greater, lesser, or equal and then append (cons) the integers such that the resulting list is sorted.
However, I know I cannot just do this with streams of course as I cannot traverse it like a list, so I think I would need to go integer by integer, compare, and then suspend the computation and keep doing this to build the resulting stream.
I am at a bit of a loss how to implement such logic however, assuming it is how I should be going about this, so if somebody could point me in the right direction that would be great.
Thank you!

If the input sequences are sorted, there is not much difference between merging lists and merging sequences. Consider the following merge function on lists:
let rec merge s t =
match s, t with
| x :: s , [] | [], x :: s -> x :: s
| [], [] -> s
| x :: s', y :: t' ->
if x < y then
x :: (merge s' t)
else if x = y then
x :: (merge s' t')
else
y :: (merge s t')
This function is only using two properties of lists:
the ability to split the potential first element from the rest of the list
the ability to add an element to the front of the list
This suggests that we could rewrite this function as a functor over the signature
module type seq = sig
type 'a t
(* if the seq is non-empty we split the seq into head and tail *)
val next: 'a t -> ('a * 'a t) option
(* add back to the front *)
val cons: 'a -> 'a t -> 'a t
end
Then if we replace the pattern matching on the list with a call to next, and the cons operation with a call to cons, the previous function is transformed into:
module Merge(Any_seq: seq ) = struct
open Any_seq
let rec merge s t =
match next s, next t with
| Some(x,s), None | None, Some (x,s) ->
cons x s
| None, None -> s
| Some (x,s'), Some (y,t') ->
if x < y then
cons x (merge s' t)
else if x = y then
cons x (merge s' t')
else
cons y (merge s t')
end
Then, with list, our implementation was:
module List_core = struct
type 'a t = 'a list
let cons = List.cons
let next = function
| [] -> None
| a :: q -> Some(a,q)
end
module List_implem = Merge(List_core)
which can be tested with
let test = List_implem.merge [1;5;6] [2;4;9]
Implementing the same function for your stream type is then just a matter of writing a similar Stream_core module for stream.
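One caveat worth spelling out: OCaml evaluates arguments eagerly, so in the functor above the recursive call merge s' t runs before cons is applied. On the str type from the question, which has no empty case, that recursion would never stop unless it is suspended. The following is a minimal sketch of a direct merge for that type, assuming both inputs are strictly increasing; from_step and take are hypothetical helpers added only to try it out.

```ocaml
(* Stream type from the question: an infinite, lazily evaluated sequence. *)
type 'a susp = Susp of (unit -> 'a)
let force (Susp f) = f ()
type 'a str = { hd : 'a; tl : 'a str susp }

(* Merge two strictly increasing streams, emitting equal heads only once.
   Every recursive call sits inside a Susp, so it only runs when the
   tail of the result is forced. *)
let rec merge s1 s2 =
  if s1.hd < s2.hd then
    { hd = s1.hd; tl = Susp (fun () -> merge (force s1.tl) s2) }
  else if s1.hd > s2.hd then
    { hd = s2.hd; tl = Susp (fun () -> merge s1 (force s2.tl)) }
  else
    { hd = s1.hd; tl = Susp (fun () -> merge (force s1.tl) (force s2.tl)) }

(* Hypothetical helper: the stream n, n+k, n+2k, ... *)
let rec from_step n k = { hd = n; tl = Susp (fun () -> from_step (n + k) k) }

(* Hypothetical helper: the first n elements of a stream, as a list. *)
let rec take n s = if n <= 0 then [] else s.hd :: take (n - 1) (force s.tl)

(* Multiples of 2 merged with multiples of 3. *)
let first_eight = take 8 (merge (from_step 0 2) (from_step 0 3))
(* first_eight = [0; 2; 3; 4; 6; 8; 9; 10] *)
```

The shape is the same as the list version: compare the two heads, keep the smaller one, and advance whichever stream supplied it (both when the heads are equal).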

Related

How to avoid stack overflow during CPS conversion?

I'm writing a transformation from a Scheme subset to a CPS language. It is implemented in F#. On big input programs, the conversion fails with a stack overflow.
I'm using something like the algorithm described in the paper Compiling with Continuations.
I've tried increasing the maximum stack size of the working thread to 50 MB, and then it works.
Is there some way to modify the algorithm so that I won't need to tune the stack size?
For example, the algorithm transforms
(foo (bar 1) (bar 2))
to
(let ((c1 (cont (r1)
(let ((c2 (cont (r2)
(foo halt r1 r2))))
(bar c2 2)))))
(bar c1 1))
where halt is a final continuation which finishes the program.
Maybe your actual problem has a simple solution that avoids heavy stack consumption, so please don't mind adding details. However, without more knowledge about your particular code, here is a general approach to reducing stack consumption in recursive programs, based on trampolines and continuations.
Walker
Here is a typical recursive function that is not trivially tail-recursive, written in Common Lisp because I don't know F#:
(defun walk (form transform join)
(typecase form
(cons (funcall join
(walk (car form) transform join)
(walk (cdr form) transform join)))
(t (funcall transform form))))
The code is however quite simple, hopefully, and walks a tree made of cons cells:
if the form is a cons-cell, recursively walk on the car (resp. cdr) and join the results
Otherwise, apply a transform on the value
For example:
(walk '(a (b c d) 3 2 (a 2 1) 0)
(lambda (u) (and (numberp u) u))
(lambda (a b) (if a (cons a b) (or a b))))
=> (3 2 (2 1) 0)
The code walks the form and retains only numbers, but preserves (non-empty) nesting.
Calling trace on walk with the above example shows a maximal depth of 8 nested calls.
Continuations and trampoline
Here is an adapted version, called walk/then, that walks a form as before and, when a result is available, calls then on it. Here then is a continuation.
The function also returns a thunk, i.e. a parameterless closure. When we return the closure, the stack is unwound, and when we apply the thunk it starts from a fresh stack, but having advanced in the computation (I usually picture someone walking up an escalator that goes down). Returning a thunk to reduce the number of stack frames is the trampoline part.
The then function takes a value, namely the result that the current walk will eventually return. The result is thus passed down the stack, and what is returned at each step is a thunk.
Nesting continuations lets us capture the complex behaviour of transform/join, by pushing the remaining parts of the computation into nested continuations.
(defun walk/then (form transform join then)
(typecase form
(cons (lambda ()
(walk/then (car form) transform join
(lambda (v)
(walk/then (cdr form) transform join
(lambda (w)
(funcall then (funcall join v w))))))))
(t (funcall then (funcall transform form)))))
For example, (walk/then (car form) transform join (lambda (v) ...)) reads as follows: walk the car of form with arguments transform and join, and eventually call (lambda (v) ...) on the result; namely, walk down the cdr, and then join both results; eventually, call the input then on the joined result.
What is missing is a way to keep calling the returned thunk until exhaustion; here it is with a loop, but this could just as easily be a tail-recursive function:
(loop for res =
(walk/then '(a (b c d) 3 2 (a 2 1) 0)
(lambda (u) (and (numberp u) u))
(lambda (a b) (if a (cons a b) (or a b)))
#'identity)
then (typecase res (function (funcall res)) (t res))
while (functionp res)
finally (return res))
The above returns (3 2 (2 1) 0), and the depth of the trace never goes over 2 when tracing walk/then.
See Eli Bendersky's article for another take at this, in Python.
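For readers more at home in an ML-family language, the same idea can be sketched in OCaml. This is only an illustration under assumptions: the bounce type, run, the tree type and sum_then are hypothetical names, and summing integer leaves stands in for the transform/join pair. OCaml generally compiles tail calls as jumps already, so here the trampoline mainly shows the shape of the technique.

```ocaml
(* One step of a trampolined computation: either a final value
   or a thunk that performs the next step when called. *)
type 'a bounce = Done of 'a | More of (unit -> 'a bounce)

(* The trampoline: keep calling thunks until a value appears. *)
let rec run = function
  | Done v -> v
  | More thunk -> run (thunk ())

(* Hypothetical tree type standing in for cons cells. *)
type tree = Leaf of int | Node of tree * tree

(* Sum the leaves; k is the continuation (the "then" argument).
   On a Node we return a thunk instead of recursing right away. *)
let rec sum_then t k =
  match t with
  | Leaf n -> k n
  | Node (l, r) ->
    More (fun () ->
      sum_then l (fun a ->
        sum_then r (fun b -> k (a + b))))

(* A left-nested tree one million nodes deep. *)
let deep =
  let t = ref (Leaf 0) in
  for _i = 1 to 1_000_000 do t := Node (!t, Leaf 1) done;
  !t

let total = run (sum_then deep (fun v -> Done v))
(* total = 1_000_000 *)
```

The pending work lives in heap-allocated closures rather than in stack frames, which is the trade the trampoline makes.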
I've converted the algorithm to trampoline form. It looks like an FSM.
There is a loop that looks at the current state, does some manipulation, and moves to another state. It also uses two stacks for the different kinds of continuations.
Here is the input language (a subset of the language I used originally):
// Input language consists of only variables and function applications
type Expr =
| Var of string
| App of Expr * Expr list
Here is the target language:
// CPS form - each function gets a continuation,
// added continuation definitions and continuation applications
type Norm =
| LetCont of name : string * args : string list * body : Norm * inner : Norm
| FuncCall of func : string * cont : string * args : string list
| ContCall of cont : string * args : string list
Here is the original algorithm:
// Usual way to make CPS conversion.
let rec transform expr cont =
match expr with
| App(func, args) ->
transformMany (func :: args) (fun vars ->
let func' = List.head vars
let args' = List.tail vars
let c = fresh()
let r = fresh()
LetCont(c, [r], cont r, FuncCall(func', c, args')))
| Var(v) -> cont v
and transformMany exprs cont =
match exprs with
| e :: rest ->
transform e (fun e' ->
transformMany rest (fun rest' ->
cont (e' :: rest')))
| _ -> cont []
let transformTop expr =
transform expr (fun var -> ContCall("halt", [var]))
Here is the modified version:
type Action =
| ContinuationVar of Expr * (string -> Action)
| ContinuationExpr of string * (Norm -> Action)
| TransformMany of string list * Expr list * (string list -> Action)
| Result of Norm
| Variable of string
// Make one action at time and return to top loop
let rec transform2 expr =
match expr with
| App(func, args) ->
TransformMany([], func :: args, (fun vars ->
let func' = List.head vars
let args' = List.tail vars
let c = fresh()
let r = fresh()
ContinuationExpr(r, fun expr ->
Result(LetCont(c, [r], expr, FuncCall(func', c, args'))))))
| Var(v) -> Variable(v)
// We have two stacks here:
// contsVar for continuations accepting variables
// contsExpr for continuations accepting expressions
let transformTop2 expr =
let rec loop contsVar contsExpr action =
match action with
| ContinuationVar(expr, cont) ->
loop (cont :: contsVar) contsExpr (transform2 expr)
| ContinuationExpr(var, contExpr) ->
let contVar = List.head contsVar
let contsVar' = List.tail contsVar
loop contsVar' (contExpr :: contsExpr) (contVar var)
| TransformMany(vars, e :: exprs, cont) ->
loop contsVar contsExpr (ContinuationVar(e, fun var ->
TransformMany(var :: vars, exprs, cont)))
| TransformMany(vars, [], cont) ->
loop contsVar contsExpr (cont (List.rev vars))
| Result(r) ->
match contsExpr with
| cont :: rest -> loop contsVar rest (cont r)
| _ -> r
| Variable(v) ->
match contsVar with
| cont :: rest -> loop rest contsExpr (cont v)
| _ -> failwith "must not be empty"
let initial = ContinuationVar(expr, fun var -> Result(ContCall("halt", [var])))
loop [] [] initial

Tail Recursive map f#

I want to write a tail-recursive function to multiply all the values in a list by 2 in F#. I know there are a bunch of ways to do this, but I want to know if this is even a viable method. This is purely for educational purposes. I realize that there is a built-in function to do this for me.
let multiply m =
let rec innerfunct ax = function
| [] -> printfn "%A" m
| (car::cdr) -> (car <- car*2 innerfunct cdr);
innerfunct m;;
let mutable a = 1::3::4::[]
multiply a
I get two errors with this, though I doubt they are the only problems.
This value is not mutable on my second matching condition
and
This expression is a function value, i.e. is missing arguments. Its type is 'a list -> unit. for when I call length a.
I am fairly new to F# and realize I'm probably not calling the function properly, but I can't figure out why. This is mostly a learning experience for me, so the explanation is more important than just fixing the code. The syntax is clearly off, but can I map *2 over a list just by doing the equivalent of
car = car*2 and then calling the inner function on the cdr of the list?
There are a number of issues that I can't easily explain without showing intermediate code, so I'll try to walk through a commented refactoring:
First, we'll go down the mutable path:
As F# lists are immutable and so are primitive ints, we need a way to mutate that thing inside the list:
let mutable a = [ref 1; ref 3; ref 4]
Getting rid of the superfluous ax and arranging the cases a bit, we can make use of these reference cells:
let multiply m =
let rec innerfunct = function
| [] -> printfn "%A" m
| car :: cdr ->
car := !car*2
innerfunct cdr
innerfunct m
We see that multiply only calls its inner function, so we end up with the first solution:
let rec multiply m =
match m with
| [] -> printfn "%A" m
| car :: cdr ->
car := !car*2
multiply cdr
This is really only for its own sake. If you want mutability, use arrays and traditional for-loops.
Then, we go up the immutable path:
As we learnt in the mutable world, the first error is due to car not being mutable. It is just a primitive int out of an immutable list. Living in an immutable world means we can only create something new out of our input. What we want is to construct a new list, having car*2 as head and then the result of the recursive call to innerfunct as tail. As usual, all branches of a function need to return something of the same type:
let multiply m =
let rec innerfunct = function
| [] ->
printfn "%A" m
[]
| car :: cdr ->
car*2 :: innerfunct cdr
innerfunct m
Knowing m is immutable, we can get rid of the printfn. If needed, we can put it outside of the function, anywhere we have access to the list. It will always print the same.
We finish by also making the reference to the list immutable and obtain a second (intermediate) solution:
let multiply m =
let rec innerfunct = function
| [] -> []
| car :: cdr -> car*2 :: innerfunct cdr
innerfunct m
let a = [1; 3; 4]
printfn "%A" a
let multiplied = multiply a
printfn "%A" multiplied
It might be nice to also multiply by different values (the function is called multiply after all and not double). Also, now that innerfunct is so small, we can make the names match the small scope (the smaller the scope, the shorter the names):
let multiply m xs =
let rec inner = function
| [] -> []
| x :: tail -> x*m :: inner tail
inner xs
Note that I put the factor first and the list last. This is similar to other List functions and allows creating pre-customized functions by using partial application:
let double = multiply 2
let doubled = double a
All that's left now is to make multiply tail-recursive:
let multiply m xs =
let rec inner acc = function
| [] -> acc
| x :: tail -> inner (x*m :: acc) tail
inner [] xs |> List.rev
So we end up having (for educational purposes) a hard-coded version of let multiply' m = List.map ((*) m)
F# is a 'single-pass' compiler, so you can expect any compilation error to have a cascading effect beneath the error. When you have a compilation error, focus on that single error. While you may have more errors in your code (you do), it may also be that subsequent errors are only consequences of the first error.
As the compiler says, car isn't mutable, so you can't assign a value to it.
In Functional Programming, a map can easily be implemented as a recursive function:
// ('a -> 'b) -> 'a list -> 'b list
let rec map f = function
| [] -> []
| h::t -> f h :: map f t
This version, however, isn't tail-recursive, since it recursively calls map before it conses the head onto the tail.
You can normally refactor to a tail-recursive implementation by introducing an 'inner' implementation function that uses an accumulator for the result. Here's one way to do that:
// ('a -> 'b) -> 'a list -> 'b list
let map' f xs =
let rec mapImp f acc = function
| [] -> acc
| h::t -> mapImp f (acc @ [f h]) t
mapImp f [] xs
Here, mapImp is the last operation to be invoked in the h::t case.
This implementation is a bit inefficient because it concatenates two lists (acc @ [f h]) in each iteration. Depending on the size of the lists to map, it may be more efficient to cons onto the accumulator and then do a single reverse at the end:
// ('a -> 'b) -> 'a list -> 'b list
let map'' f xs =
let rec mapImp f acc = function
| [] -> acc
| h::t -> mapImp f (f h :: acc) t
mapImp f [] xs |> List.rev
In any case, however, the only reason to do all of this is for the exercise, because this function is already built-in.
In all cases, you can use map functions to multiply all elements in a list by two:
> let mdouble = List.map ((*) 2);;
val mdouble : (int list -> int list)
> mdouble [1..10];;
val it : int list = [2; 4; 6; 8; 10; 12; 14; 16; 18; 20]
Normally, though, I wouldn't even care to define such function explicitly. Instead, you use it inline:
> List.map ((*) 2) [1..10];;
val it : int list = [2; 4; 6; 8; 10; 12; 14; 16; 18; 20]
You can use all of the above map functions in the same way.
Symbols that you are creating in a match statement are not mutable, so when you are matching with (car::cdr) you cannot change their values.
Standard functional way would be to produce a new list with the computed values. For that you can write something like this:
let multiplyBy2 = List.map (fun x -> x * 2)
multiplyBy2 [1;2;3;4;5]
This is not tail recursive by itself (but List.map is).
If you really want to change values of the list, you could use an array instead. Then your function will not produce any new objects, just iterate through the array:
let multiplyArrayBy2 arr =
arr
|> Array.iteri (fun index value -> arr.[index] <- value * 2)
let someArray = [| 1; 2; 3; 4; 5 |]
multiplyArrayBy2 someArray

What is wrong with this F# curried function?

I'm writing this curried f# function called inner that should take 2 lists as parameters and multiply both according to position and then return the sum:
let rec inner xs =
let aux ys = function
| ([], ys) -> 0
| (xs, []) -> 0
| (x::xs, y::ys) -> (x*y) + inner xs ys
aux ys;;
inner [1;2;3] [4;5;6];;
In this case the answer is 32 because 1*4 + 2*5 + 3*6 = 32. And it works but there's this message:
error FS0001: Type mismatch. Expecting a
'a list -> 'd
but given a
'b list * 'a list -> int
The type 'a list does not match the type 'b list * 'a list
I honestly don't know what to put next to aux when calling it to make it work.
I'm not sure what exactly the misunderstanding is. I'm noticing three strange points:
let aux ys = function ([], ys) -> declares and immediately re-declares, and thereby shadows, the identifier ys. Note that aux is a function with two curried arguments, of which the second is a 2-tuple. I doubt this is what you intended.
The aux function is indented in a rather unusual way; normally, it should get another four-space indentation. The compiler may not complain about this, and just exit the scope after the pattern match, but it adds to the confusion about what the failing line is supposed to do.
ys is undefined where it is last used. (Could this be related to the confusing indentation?)
Here are two ways to write it:
Not tail recursive
let rec inner xs ys =
match xs, ys with
| x::xt, y::yt -> x * y + inner xt yt
| _ -> 0
In this version, the curried arguments are turned into a tuple and used in a match expression. (Due to the final additions, this may cause a stack overflow for large input lists.)
Tail recursive, with auxiliary function
let rec private aux acc = function
| x::xs, y::ys -> aux (acc + x * y) (xs, ys)
| _ -> acc
let inner xs ys = aux 0 (xs, ys)
Here, the auxiliary function has two curried arguments, of which the first is an accumulator and the second a tuple holding the two lists. inner becomes a wrapper function that both strips the accumulator (by initializing it with zero) and turns the tupled arguments into curried arguments, as was the requirement. (Since the recursive call's value is the function's return value, tail-recursive compilation is supported for this function.)
You need to call your aux function after defining it. Currently your inner function just defines it but doesn't do anything with it.
In this case, I'm not sure you actually need to define aux at all, if you define your inner function to take both lists as a single tuple parameter:
let rec inner (tuple : int list * int list) =
match tuple with
| ([], ys) -> 0
| (xs, []) -> 0
| (x :: xtail, y :: ytail) -> x * y + inner (xtail, ytail)
inner ([1;2;3], [4;5;6]) // 32
If you want to retain the curried form then the following should work. You need to include xs in the matching, and then just return aux (which will incorporate the first list and expect a second list):
let rec inner xs =
let aux ys =
match xs, ys with
| ([], ys) -> 0
| (xs, []) -> 0
| (x::xs, y::ys) -> (x*y) + inner xs ys
aux
inner [1;2;3] [4;5;6];; // 32
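For comparison, here is a rough OCaml rendering of the accumulator idea, followed by the same computation written with a standard-library fold. It is a sketch rather than a translation of any one answer above; inner_fold is a hypothetical name, and the two functions deliberately differ on lists of unequal length.

```ocaml
(* Tail-recursive inner product with an explicit accumulator.
   It is curried, and it stops at the end of the shorter list. *)
let inner xs ys =
  let rec aux acc xs ys =
    match xs, ys with
    | x :: xt, y :: yt -> aux (acc + x * y) xt yt
    | _ -> acc
  in
  aux 0 xs ys

(* The same computation with List.fold_left2. Unlike inner, this
   raises Invalid_argument when the lists have different lengths. *)
let inner_fold xs ys =
  List.fold_left2 (fun acc x y -> acc + x * y) 0 xs ys

let result = inner [1; 2; 3] [4; 5; 6]
(* result = 32 *)
```

Whether silently truncating or raising on a length mismatch is preferable depends on the caller; the question's own code truncates.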

What is wrong with 100000 factorial using ContinuationMonad?

Recursion is a powerful technique because of its expressiveness. Tail recursion is more powerful than ordinary recursion because it turns recursion into iteration. Continuation-passing style (CPS) can turn a lot of looping code into tail recursion. The continuation monad provides recursive syntax, but in essence it is tail recursion, which is iteration. So it should be reasonable to use the continuation monad for 100000 factorial. Here is the code.
type ContinuationBuilder() =
member b.Bind(x, f) = fun k -> x (fun x -> f x k)
member b.Return x = fun k -> k x
member b.ReturnFrom x = x
(*
type ContinuationBuilder =
class
new : unit -> ContinuationBuilder
member Bind : x:(('d -> 'e) -> 'f) * f:('d -> 'g -> 'e) -> ('g -> 'f)
member Return : x:'b -> (('b -> 'c) -> 'c)
member ReturnFrom : x:'a -> 'a
end
*)
let cont = ContinuationBuilder()
//val cont : ContinuationBuilder
let fac n =
let rec loop n =
cont {
match n with
| n when n = 0I -> return 1I
| _ -> let! x = fun f -> f n
let! y = loop (n - 1I)
return x * y
}
loop n (fun x -> x)
let x2 = fac 100000I
I get this error message: "Process is terminated due to StackOverflowException."
What is wrong with 100000 factorial using ContinuationMonad?
You need to compile the project in Release mode or check the "Generate tail calls" option in project properties (or use --tailcalls+ if you're running the compiler via command line).
By default, tail call optimization is not enabled in Debug mode. The reason is that, if tail calls are enabled, stack traces contain less useful information. So, disabling them by default gives you a more pleasant debugging experience (even in Debug mode, the compiler optimizes tail-recursive functions that call themselves, which handles most situations).
You probably need to add this member to your monad builder:
member this.Delay(mk) = fun c -> mk () c

F# Tail Recursive Function Example

I am new to F# and was reading about tail recursive functions and was hoping someone could give me two different implementations of a function foo - one that is tail recursive and one that isn't so that I can better understand the principle.
Start with a simple task, like mapping items from 'a to 'b in a list. We want to write a function which has the signature
val map: ('a -> 'b) -> 'a list -> 'b list
Where
map (fun x -> x * 2) [1;2;3;4;5] == [2;4;6;8;10]
Start with a non-tail-recursive version:
let rec map f = function
| [] -> []
| x::xs -> f x::map f xs
This isn't tail recursive because the function still has work to do after making the recursive call: f x::map f xs is syntactic sugar for List.Cons(f x, map f xs), so the cons only happens once map f xs has returned.
The function's non-tail-recursive nature might be a little more obvious if I rewrote the last line as | x::xs -> let temp = map f xs; f x::temp -- obviously it's doing work after the recursive call.
Use an accumulator variable to make it tail recursive:
let map f l =
let rec loop acc = function
| [] -> List.rev acc
| x::xs -> loop (f x::acc) xs
loop [] l
Here we're building up a new list in a variable acc. Since the list gets built up in reverse, we need to reverse the output list before giving it back to the user.
If you're in for a little mind warp, you can use continuation passing to write the code more succinctly:
let map f l =
let rec loop cont = function
| [] -> cont []
| x::xs -> loop ( fun acc -> cont (f x::acc) ) xs
loop id l
Since the calls to loop and cont are the last things each function does, with no additional work afterwards, they are tail calls.
This works because the continuation cont is captured by a new continuation, which in turn is captured by another, resulting in a sort of tree-like data structure as follows:
(fun acc -> (f 1)::acc)
((fun acc -> (f 2)::acc)
((fun acc -> (f 3)::acc)
((fun acc -> (f 4)::acc)
((fun acc -> (f 5)::acc)
(id [])))))
which builds up a list in-order without requiring you to reverse it.
For what it's worth, start by writing functions in a non-tail-recursive way; they're easier to read and work with.
If you have a big list to go through, use an accumulator variable.
If you can't find a way to use an accumulator in a convenient way and you don't have any other options at your disposal, use continuations. I personally consider non-trivial, heavy use of continuations hard to read.
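As a cross-check, the continuation-passing map can be sketched in OCaml, where it has the same shape. The name map_cps is hypothetical and chosen only for this illustration. One detail that is easy to miss: the continuations run innermost first, so f is applied to the last element first, which matters if f has side effects.

```ocaml
(* Continuation-passing map: every call to loop and to k is a tail
   call, and the pending work is stored in heap-allocated closures. *)
let map_cps f l =
  let rec loop k = function
    | [] -> k []
    | x :: xs -> loop (fun acc -> k (f x :: acc)) xs
  in
  loop (fun acc -> acc) l

let doubled = map_cps (fun x -> x * 2) [1; 2; 3; 4; 5]
(* doubled = [2; 4; 6; 8; 10] *)

(* Record the order in which f is called. Each call pushes its
   argument onto the front, so the final list reads [1; 2; 3]:
   3 was visited first, then 2, then 1. *)
let visited = ref []
let _ = map_cps (fun x -> visited := x :: !visited; x) [1; 2; 3]
```

The accumulator-plus-reverse version applies f front to back instead, which is one practical reason to prefer it when ordering matters.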
An attempt at a shorter explanation than in the other examples:
let rec foo n =
match n with
| 0 -> 0
| _ -> 2 + foo (n-1)
let rec bar acc n =
match n with
| 0 -> acc
| _ -> bar (acc+2) (n-1)
Here, foo is not tail-recursive, because foo has to call foo recursively in order to evaluate 2+foo(n-1) and return it.
However, bar is tail-recursive, because bar doesn't have to use the return value of the recursive call in order to return a value. It can just let the recursively called bar return its value immediately (without returning all the way up through the call stack). The compiler sees this and optimizes it by rewriting the recursion into a loop.
Changing the last line in bar into something like | _ -> 2 + (bar (acc+2) (n-1)) would again destroy the function being tail-recursive, since 2 + leads to an action that needs to be done after the recursive call is finished.
Here is a more obvious example, compare it to what you would normally do for a factorial.
let factorial n =
let rec fact n acc =
match n with
| 0 -> acc
| _ -> fact (n-1) (acc*n)
fact n 1
This one is a bit complex, but the idea is that you have an accumulator that keeps a running tally, rather than modifying the return value.
Additionally, this style of wrapping is usually a good idea, that way your caller doesn't need to worry about seeding the accumulator (note that fact is local to the function)
I'm learning F# too.
The following are non-tail-recursive and tail-recursive functions to calculate the Fibonacci numbers.
Non-tail recursive version
let rec fib = function
| n when n < 2 -> n // fib 0 = 0 and fib 1 = 1, matching the tail-recursive version below
| n -> fib(n-1) + fib(n-2);;
Tail recursive version
let fib n =
let rec tfib n1 n2 = function
| 0 -> n1
| n -> tfib n2 (n2 + n1) (n - 1)
tfib 0 1 n;;
Note: since Fibonacci numbers grow really fast, you could change the last line from tfib 0 1 n to tfib 0I 1I n to take advantage of the Numerics.BigInteger structure in F#.
Also, when testing, don't forget that indirect tail recursion (tailcall) is turned off by default when compiling in Debug mode. This can cause tailcall recursion to overflow the stack in Debug mode but not in Release mode.
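To make the contrast concrete, here is a small OCaml sketch of both Fibonacci styles side by side. It assumes the convention fib 0 = 0 and fib 1 = 1; fib_naive and loop are hypothetical names used only here.

```ocaml
(* Naive version: two recursive calls, and the addition happens
   after both return, so neither call is in tail position. *)
let rec fib_naive n =
  if n < 2 then n else fib_naive (n - 1) + fib_naive (n - 2)

(* Accumulator version: a and b hold two consecutive Fibonacci
   numbers, and the recursive call is the last thing loop does. *)
let fib n =
  let rec loop a b n = if n = 0 then a else loop b (a + b) (n - 1) in
  loop 0 1 n

let tenth = fib 10
(* tenth = 55 *)
```

Besides running in constant stack space, the accumulator version takes a number of steps proportional to n, while the naive one recomputes the same subproblems many times over.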

Resources