I am trying to draw lessons from the following behaviour, using an example I simplified:
let groupedEnum (input: 'a seq) =
    using (input.GetEnumerator()) (fun en ->
        Seq.unfold (fun _ ->
            if en.MoveNext() then
                Some(en.Current, ())
            else None) ()
    )
//WORKS
let c = groupedEnum ("11111122334569999" |> List.ofSeq ) |> List.ofSeq
//BOOM !! System.NullReferenceException
let c = groupedEnum ("11111122334569999" ) |> List.ofSeq
Is the enumerator "en" disposed of independently of it being captured? (I guess it is, but is there anything to say, or any material to read, on this behaviour besides this MSDN doc on resources?)
Why is it working if the sequence is transformed to a list first?
Edit: this is just a toy example to illustrate a behaviour, not to be followed.
There are very few good reasons to manipulate enumerators directly.
The using function disposes the enumerator as soon as the lambda function returns. However, the lambda function creates a lazy sequence using Seq.unfold and the lazy sequence accesses the enumerator after the sequence is returned from groupedEnum.
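The timing can be made visible with a small sketch. `Tracker` is a hypothetical helper type introduced only for this illustration; it records whether `Dispose` has been called, and the lazy sequence reports that flag for each element it produces.

```fsharp
open System

// Hypothetical helper type: a disposable that records whether Dispose has run.
type Tracker() =
    member val Disposed = false with get, set
    interface IDisposable with
        member this.Dispose() = this.Disposed <- true

let tracker = new Tracker()

// The lambda only builds a lazy sequence; no element is produced yet.
let lazySeq =
    using tracker (fun t ->
        Seq.unfold (fun n ->
            if n < 3 then Some((n, t.Disposed), n + 1) else None) 0)

// `using` has already called Dispose, although nothing has been enumerated.
let disposedBeforeEnumeration = tracker.Disposed

// Every element is generated after disposal, so each one sees the flag set.
let observed = lazySeq |> List.ofSeq
```

Here the captured object merely carries a flag, so nothing fails; with a real enumerator the same ordering means `MoveNext` is called on an already disposed object.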
You could either fully evaluate the whole sequence inside using (by adding List.ofSeq there) or you need to call Dispose when the end of the generated sequence is reached:
let groupedEnum (input: 'a seq) =
    let en = input.GetEnumerator()
    Seq.unfold (fun _ ->
        if en.MoveNext() then
            Some(en.Current, ())
        else
            // End of input: release the enumerator ourselves
            en.Dispose()
            None) ()
Exception handling becomes quite difficult in this case, but I guess that one way to do it would be to wrap the body in try .. with and call Dispose if an exception happens (and then return None).
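A minimal sketch of that idea follows. The name `groupedEnumSafe` is made up for this illustration, and swallowing the exception is a deliberate simplification that mirrors the "return None" suggestion; real code might prefer to dispose and re-raise.

```fsharp
let groupedEnumSafe (input: 'a seq) =
    let en = input.GetEnumerator()
    Seq.unfold (fun _ ->
        try
            if en.MoveNext() then
                Some(en.Current, ())
            else
                // Normal end of input
                en.Dispose()
                None
        with _ ->
            // The source threw: dispose, then end the sequence
            en.Dispose()
            None) ()

// A source that fails after two elements
let failing =
    seq {
        yield 1
        yield 2
        failwith "boom"
    }

let fromString = groupedEnumSafe "1123" |> List.ofSeq
let fromFailing = groupedEnumSafe failing |> List.ofSeq
```

Two limitations remain in this sketch: the enumerator is created when `groupedEnumSafe` is called, so the result can be enumerated only once, and a consumer that stops early never triggers `Dispose`.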
If you use sequence expressions instead, then the meaning of use changes and it automatically disposes the enumerator after the end of sequence is reached (not when the lazy sequence is returned). So using sequence expressions might be a better choice because the hard work is done for you:
let groupedEnum (input: 'a seq) = seq {
    use en = input.GetEnumerator()
    let rec loop () = seq {
        if en.MoveNext() then
            yield en.Current
            yield! loop () }
    yield! loop () }
EDIT And why does it work in your first example? The enumerator returned by F# list type simply ignores Dispose and continues working, while if you call Dispose on an enumerator returned by a string, the enumerator cannot be used again. (This is arguably a bit odd behaviour of the F# list type.)
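The list half of this explanation can be checked with a short sketch. It assumes, as described above, that the F# list enumerator treats `Dispose` as a no-op. The string case is left out on purpose, because what a disposed string enumerator does may differ between .NET versions.

```fsharp
// Get an enumerator over an F# list and dispose it before using it.
let en = ([ 1; 2; 3 ] :> seq<int>).GetEnumerator()
en.Dispose()

// The enumerator is used only after Dispose has been called.
let stillAdvances = en.MoveNext()
let firstItem = en.Current
```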
Related
Anyone have a decent example, preferably practical/useful, they could post demonstrating the concept?
I came across this term somewhere that I'm unable to find; it probably has to do with a function returning a function while closing over some mutable variable, so there is no visible mutation.
The Haskell community probably originated the idea, where mutation happens in another area not visible to the scope. I may be vague here, so I am seeking help to understand more.
It's a good idea to hide mutation so that consumers of the API won't inadvertently change something. This just means that you have to encapsulate your mutable data/state. This can be done via objects (yes, objects), but what you are referring to in your question can be done with a closure. The canonical example is a counter:
let countUp =
    // A closure cannot capture a `let mutable` local, so the state lives in a ref cell
    let count = ref 0
    fun () ->
        count.Value <- count.Value + 1
        count.Value
countUp() // 1
countUp() // 2
countUp() // 3
You cannot access the mutable count variable directly.
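The object-based alternative mentioned above could look like the following sketch. `Counter` is a hypothetical type name chosen for illustration. The mutable field is private to the type, and each instance keeps its own count.

```fsharp
type Counter() =
    // Private mutable field: not visible outside the type
    let mutable count = 0

    member this.Next() =
        count <- count + 1
        count

let a = Counter()
let b = Counter()

let a1 = a.Next()
let a2 = a.Next()
let b1 = b.Next() // b has its own state, independent of a
```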
Another example would be using mutable state within a function so that you cannot observe it, and the function is, for all intents and purposes, referentially transparent. Take for example the following function that reverses a string not character-wise, but rather by taking individual text elements (which, depending on language, can be more than one character):
let reverseStringU s =
    if System.String.IsNullOrEmpty s then s else
    // Collect the text elements in reverse order
    let rec iter acc (ee : System.Globalization.TextElementEnumerator) =
        if not <| ee.MoveNext () then acc else
        let e = ee.GetTextElement ()
        iter (e :: acc) ee
    let inline append x s = (^s : (member Append : ^x -> ^s) (s, x))
    // The StringBuilder is mutated, but never escapes this function
    let sb = System.Text.StringBuilder s.Length
    System.Globalization.StringInfo.GetTextElementEnumerator s
    |> iter []
    |> List.fold (fun a e -> append e a) sb
    |> string
It uses a StringBuilder internally but you cannot observe this externally.
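The same idea can be shown more compactly. The sketch below is a simplified variant, not the function above: `reverseTextElements` is a made-up name, and it uses a local `let mutable` accumulator, which is allowed here because no closure captures it.

```fsharp
open System.Globalization

let reverseTextElements (s: string) =
    if System.String.IsNullOrEmpty s then s else
    let ee = StringInfo.GetTextElementEnumerator s
    // Local mutation only: callers never see `acc`
    let mutable acc = []
    while ee.MoveNext() do
        acc <- ee.GetTextElement() :: acc
    String.concat "" acc

let plain = reverseTextElements "abc"
// "e" followed by a combining acute accent forms one text element
let combined = reverseTextElements "e\u0301x"
```

From the outside both calls behave like a pure function; the combining accent stays attached to its base character because whole text elements are moved.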
I have two similar functions, nested and nestedCurried, that traverse an XML tree with XLinq. They don't do anything useful; it's just a cut-down excerpt from somewhat more complicated code.
I'd expect these two functions to behave in the same way, as to me they look identical. The only difference is that nestedCurried does not explicitly declare the e: XElement argument; it is curried through the use of the elements function and function composition (>>).
Meanwhile, the nestedCurried function throws StackOverflowException when called on any XElement
Evaluated in FSI:
#r "System.Xml.Linq"
open System.Xml.Linq
let inline elements (e: XElement) = e.Elements() |> Seq.toList
let rec nested () e = elements e |> List.collect (nested ())
let rec nestedCurried () = elements >> List.collect (nestedCurried ())
let x = XDocument.Parse """<a></a>"""
let ok : XElement list = nested () (x.Root)
// Stack Overflow below
let boom : XElement list = nestedCurried () (x.Root)
Why does the StackOverflowException occur, what's the technical difference between those two functions, and how can I declare nested function without specifying the XElement argument explicitly?
Look: every time you call nestedCurried, you call nestedCurried again, right away, unconditionally.
To make things a bit clearer, consider that the expression List.collect f is equivalent to let x = f; List.collect x. This means that your definition of nestedCurried is equivalent to this:
let rec nestedCurried () =
    let x = nestedCurried()
    elements >> List.collect x
Is it clearer now why this would cause infinite recursion?
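One way to keep the composition while avoiding the eager recursive call is to put the recursion inside a lambda, so it runs only when a node is actually supplied. The sketch below uses a hypothetical `Tree` type as a stand-in for `XElement` so that it is self-contained, and it collects descendants rather than returning an empty list.

```fsharp
// Hypothetical stand-in for XElement: a node with child nodes.
type Tree = Node of Tree list

let children (Node cs) = cs

// The recursive call sits inside lambdas, so `descendants ()` returns
// immediately and recursion happens once per child node.
let rec descendants () : Tree -> Tree list =
    fun t -> t |> (children >> List.collect (fun c -> c :: descendants () c))

let tree = Node [ Node [ Node [] ]; Node [] ]
let count = descendants () tree |> List.length
```

The recursion now terminates at leaves, because `List.collect` never invokes the inner lambda for an empty child list.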
Your () parameter is not needed and is confusing things. Because nestedCurried () is an ordinary function call, building the composed function evaluates nestedCurried () again immediately, before any XElement has been supplied, and that call does the same, over and over, hence the stack overflow. To declare nested without specifying the argument explicitly you can do:
let nested =
    let rec inner e = elements e |> List.collect (inner)
    inner
If you declared nestedCurried as
let rec nestedCurried = elements >> List.collect (nestedCurried)
You would have got a compiler error that "nestedCurried is evaluated as part of its own definition".
let makeIdGenerator startvalue =
    let index : uint64 ref = ref startvalue
    fun () ->
        let result = !index
        index := !index + 1UL
        result
What I need is a generator for a function which has type unit -> uint64 as shown above.
The code above works but uses a reference variable to memoize the state of the generator.
Trying to use an infinite sequence as in Seq.initInfinite (fun i -> i) does not work, as the sequence inherently uses a 32-bit int for its index.
Does anyone here know a way to do this even without a reference variable? Maybe by means of recursion and yield or so?
Thanks in advance.
The standard functional programming approach to avoiding mutable state in a loop is to pass it in a parameter instead.
If you want an infinite sequence you can use a sequence expression with yield for the "first" result and yield! for the recursive call:
let genUint64() =
    let rec genFrom n =
        seq {
            yield n
            yield! genFrom (n+1UL)
        }
    genFrom 0UL
You can use Seq.unfold:
let makeIdGenerator (startvalue : uint64) =
Seq.unfold (fun i -> Some((i, i+1UL))) startvalue
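Note that this version returns a sequence rather than a function of type unit -> uint64. If the function shape is required, one possible adaptation, sketched here under that assumption, is to pull values from the sequence's enumerator. The state then lives inside the enumerator, so the mutation is hidden rather than eliminated.

```fsharp
let makeIdGenerator (startvalue: uint64) : unit -> uint64 =
    let ids = Seq.unfold (fun i -> Some(i, i + 1UL)) startvalue
    // The enumerator carries the current position; this sketch never disposes it.
    let en = ids.GetEnumerator()
    fun () ->
        if en.MoveNext() then en.Current
        else failwith "unreachable: the sequence is infinite"

let nextId = makeIdGenerator 5UL
let first = nextId ()
let second = nextId ()
```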
Here is what I have so far:
type Maybe<'a> = option<'a>
let succeed x = Some(x)
let fail = None
let bind rest p =
    match p with
    | None -> fail
    | Some r -> rest r
let rec whileLoop cond body =
    if cond() then
        match body() with
        | Some() ->
            whileLoop cond body
        | None ->
            fail
    else
        succeed()
let forLoop (xs : 'T seq) f =
    using (xs.GetEnumerator()) (fun it ->
        whileLoop
            (fun () -> it.MoveNext())
            (fun () -> it.Current |> f)
    )
whileLoop works fine to support for loops, but I don't see how to get while loops supported. Part of the problem is that the translation of while loops uses delay, which I could not figure out in this case. The obvious implementation below is probably wrong, as it does not delay the computation, but runs it instead!
let delay f = f()
Not having delay also hinders try...with and try...finally.
There are actually two different ways of implementing computation builders in F#. One is to represent delayed computations using the monadic type, if it supports some way of representing delayed computations, like Async<'T> or the unit -> option<'T> type as shown by kkm.
However, you can also use the flexibility of F# computation expressions and use a different type as a return value of Delay. Then you need to modify the Combine operation accordingly and also implement Run member, but it all works out quite nicely:
type OptionBuilder() =
    member x.Bind(v, f) = Option.bind f v
    member x.Return(v) = Some v
    member x.Zero() = Some ()
    member x.Combine(v, f:unit -> _) = Option.bind f v
    member x.Delay(f : unit -> 'T) = f
    member x.Run(f) = f()
    member x.While(cond, f) =
        if cond() then x.Bind(f(), fun _ -> x.While(cond, f))
        else x.Zero()
let maybe = OptionBuilder()
The trick is that F# compiler uses Delay when you have a computation that needs to be delayed - that is: 1) to wrap the whole computation, 2) when you sequentially compose computations, e.g. using if inside the computation and 3) to delay bodies of while or for.
In the above definition, the Delay member returns unit -> M<'a> instead of M<'a>, but that's perfectly fine because Combine and While take unit -> M<'a> as their second argument. Moreover, by adding Run that evaluates the function, the result of maybe { .. } block (a delayed function) is evaluated, because the whole block is passed to Run:
// As usual, the type of 'res' is 'Option<int>'
let res = maybe {
    // The whole body is passed to `Delay` and then to `Run`
    let! a = Some 3
    let b = ref 0
    while !b < 10 do
        let! n = Some () // This body will be delayed & passed to While
        incr b
    if a = 3 then printfn "got 3"
    else printfn "got something else"
    // Code following `if` is delayed and passed to Combine
    return a }
This is a way to define computation builder for non-delayed types that is most likely more efficient than wrapping type inside a function (as in kkm's solution) and it does not require defining a special delayed version of the type.
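To see the loop short-circuit on failure, the builder can be exercised with a body that sometimes yields None. The sketch repeats the builder definition so that it is self-contained; `countTo` and its `failAt` parameter are hypothetical names used only for this illustration.

```fsharp
type OptionBuilder() =
    member x.Bind(v, f) = Option.bind f v
    member x.Return(v) = Some v
    member x.Zero() = Some ()
    member x.Combine(v, f: unit -> _) = Option.bind f v
    member x.Delay(f: unit -> 'T) = f
    member x.Run(f) = f ()
    member x.While(cond, f) =
        if cond () then x.Bind(f (), fun _ -> x.While(cond, f))
        else x.Zero()

let maybe = OptionBuilder()

// Counts up to `limit`, but fails as soon as the counter equals `failAt`.
let countTo limit failAt =
    maybe {
        let i = ref 0
        while i.Value < limit do
            // A None here stops the loop and fails the whole computation
            do! (if i.Value = failAt then None else Some ())
            i.Value <- i.Value + 1
        return i.Value
    }

let completed = countTo 3 (-1)
let aborted = countTo 3 1
```

Because the loop body is passed to While as a delayed function, it is re-evaluated on every iteration, and a None from any iteration propagates through Bind and Combine.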
Note that this problem does not happen in e.g. Haskell, because that is a lazy language, so it does not need to delay computations explicitly. I think that the F# translation is quite elegant as it allows dealing with both types that are delayed (using Delay that returns M<'a>) and types that represent just an immediate result (using Delay that returns a function & Run).
According to monadic identities, your delay should always be equivalent to
let delay f = bind (return ()) f
Since
val bind : M<'T> -> ('T -> M<'R>) -> M<'R>
val return : 'T -> M<'T>
the delay has the signature of
val delay : (unit -> M<'R>) -> M<'R>
'T being type-bound to unit. Note that your bind function has its arguments reversed from the customary order bind p rest. This is technically the same but does complicate reading the code.
Since you are defining the monadic type as type Maybe<'a> = option<'a>, there is no delaying a computation, as the type does not wrap any computation at all, only a value. So your definition of delay as let delay f = f() is theoretically correct. But it is not adequate for a while loop: the "body" of the loop will be computed before its "test condition", indeed before the bind is bound. To avoid this, you redefine your monad with an extra layer of delay: instead of wrapping a value, you wrap a computation that takes a unit and computes the value.
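type Maybe<'a> = unit -> option<'a>
// `return` is a reserved keyword in F#, so the unit operation is named `succeed`, as in the question
let succeed x = fun () -> Some(x)
let fail = fun() -> None
let bind p rest =
    match p() with
    | None -> fail
    | Some r -> rest r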
Note that the wrapped computation is not run until inside the bind function, i.e. not until after the arguments to bind have themselves been bound.
With the above expression, delay is correctly simplified to
let delay f = fun () -> f()
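The effect of the extra layer can be observed with a sketch that records when each step runs. This is a variant written for illustration, not the exact code above: here `bind` defers even the call to `p` until the resulting computation is invoked, `delay` unwraps both layers, and `trace` and `step` are hypothetical helpers.

```fsharp
type Maybe<'a> = unit -> option<'a>

let succeed (x: 'a) : Maybe<'a> = fun () -> Some x

// Variant of bind that defers running p until the result itself is invoked.
let bind (p: Maybe<'a>) (rest: 'a -> Maybe<'b>) : Maybe<'b> =
    fun () ->
        match p () with
        | None -> None
        | Some r -> rest r ()

let delay (f: unit -> Maybe<'a>) : Maybe<'a> = fun () -> f () ()

// Records which steps have actually run.
let trace = System.Collections.Generic.List<string>()

let step (name: string) (value: int) : Maybe<int> =
    fun () ->
        trace.Add(name)
        Some value

let comp =
    delay (fun () ->
        bind (step "a" 1) (fun a ->
            bind (step "b" 2) (fun b -> succeed (a + b))))

let stepsRunBeforeInvoke = trace.Count
let result = comp ()
let stepsRunAfterInvoke = trace.Count
```

Building `comp` runs nothing; the steps execute only when the computation is applied to unit, which is the property a while loop needs in order to re-run its body.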
I would like to execute a list of functions over a list of corresponding values:
let f1 x = x*2;;
let f2 x = x+70;;
let conslist = [f1;f2];;
let pmap2 list1 list2 =
    seq { for i in 0..1 do yield async { return list1.[i] list2.[i] } }
    |> Async.Parallel
    |> Async.RunSynchronously;;
Result:
seq { for i in 0..1 do yield async { return list1.[i] list2.[i] } }
----------------------------------------------^^^^^^^^^
stdin(213,49): error FS0752: The operator 'expr.[idx]' has been used on an object of indeterminate type based on information prior to this program point. Consider adding further type constraints
I would like to execute: pmap2 conslist [5;8];; (in parallel)
If you want to use random access then you should use arrays. Random access to elements of list will work, but it is inefficient (it needs to iterate over the list from the start). A version using arrays would look like this:
// Needs to be declared as array
let conslist = [|f1; f2|];;
// Add type annotations to specify that arguments are arrays
let pmap2 (arr1:_[]) (arr2:_[]) =
    seq { for i in 0 .. 1 do
            yield async { return arr1.[i] arr2.[i] } }
    |> Async.Parallel |> Async.RunSynchronously
However, you can also rewrite the example to work with any sequences (including arrays and lists) using the Seq.zip function. I think this solution is more elegant and it doesn't force you to use imperative arrays (and it doesn't have the performance disadvantage):
// Works with any sequence type (array, list, etc.)
let pmap2 functions arguments =
    seq { for f, arg in Seq.zip functions arguments do
            yield async { return f arg } }
    |> Async.Parallel |> Async.RunSynchronously
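As a usage sketch with the functions from the question, the same pairing can also be written with Seq.map2, a slightly more compact variant of the Seq.zip version. Like Seq.zip, it stops at the end of the shorter input.

```fsharp
let f1 x = x * 2
let f2 x = x + 70

// Pair each function with its argument and wrap every call in an async
let pmap2 functions arguments =
    Seq.map2 (fun f arg -> async { return f arg }) functions arguments
    |> Async.Parallel
    |> Async.RunSynchronously

let results = pmap2 [ f1; f2 ] [ 5; 8 ]
```

Async.Parallel returns the results as an array in the same order as the input computations.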
As the error message suggests, you need to add type annotations to list1 and list2. Once you do that, it works fine (though I would recommend using arrays instead of list since you're random-accessing them).
let pmap2 (list1:_ list) (list2:_ list)