F# XLinq traversal - curried version of function throws StackOverflowException - f#

I have two similar functions, nested and nestedCurried, that traverse an XML tree with XLinq. They don't do anything useful; they are a cut-down excerpt from somewhat more complicated code.
I'd expect these two functions to behave the same way, since they look identical to me. The only difference is that nestedCurried does not explicitly declare the e: XElement argument: it is left implicit through the use of the elements function and function composition (>>).
However, nestedCurried throws a StackOverflowException when called on any XElement.
Evaluated in FSI:
#r "System.Xml.Linq"
open System.Xml.Linq
let inline elements (e: XElement) = e.Elements() |> Seq.toList
let rec nested () e = elements e |> List.collect (nested ())
let rec nestedCurried () = elements >> List.collect (nestedCurried ())
let x = XDocument.Parse """<a></a>"""
let ok : XElement list = nested () (x.Root)
// Stack Overflow below
let boom : XElement list = nestedCurried () (x.Root)
Why does the StackOverflowException occur, what's the technical difference between those two functions, and how can I declare nested function without specifying the XElement argument explicitly?

Look: every time you call nestedCurried, you call nestedCurried again, right away, unconditionally.
To make things a bit clearer, consider that the expression List.collect f is equivalent to let x = f; List.collect x. This means that your definition of nestedCurried is equivalent to this:
let rec nestedCurried () =
    let x = nestedCurried ()
    elements >> List.collect x
Is it clearer now why this would cause infinite recursion?
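As a rough illustration of the difference, the sketch below swaps XElement for a small hypothetical Tree type (an assumption, made only to keep the example self-contained) and collects descendants instead of always returning an empty list. Wrapping the recursive call in a lambda keeps the definition mostly point-free while deferring the call until a node is actually supplied.

```fsharp
// Hypothetical stand-in for XElement so the sketch needs no XML references.
type Tree = Node of string * Tree list

let children (Node (_, kids)) = kids
let label (Node (name, _)) = name

// The recursive call sits inside a lambda, so it only runs when List.collect
// hands it a child. Calling `descendants ()` merely builds a composed function.
let rec descendants () =
    children >> List.collect (fun t -> t :: descendants () t)

let tree = Node ("a", [ Node ("b", [ Node ("c", []) ]); Node ("d", []) ])
let names = descendants () tree |> List.map label
printfn "%A" names
```

Replacing the lambda with a bare `descendants ()` would bring back the unconditional self-call described above.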

Your () parameter is not needed and is confusing things. Each call to nestedCurried () has to evaluate nestedCurried () again to build the argument for List.collect, before x.Root is ever looked at, so the calls repeat over and over, hence the stack overflow. To declare nested without specifying the argument explicitly, you can do:
let nested =
    let rec inner e = elements e |> List.collect (inner)
    inner
If you declared nestedCurried as
let rec nestedCurried = elements >> List.collect (nestedCurried)
you would have got a compiler error that "nestedCurried is evaluated as part of its own definition".

Related

How to efficiently create a list in reversed order in F#

Is there any way to construct a list in reverse order without having to reverse it?
Here is an example in which I read all lines from stdin:
#!/usr/bin/env dotnet fsi
open System
let rec readLines1 () =
    let rec helper acc =
        match Console.ReadLine() with
        | null -> acc
        | line ->
            helper (line :: acc)
    helper [] |> List.rev
readLines1 () |> List.iter (printfn "%s")
Before returning from readLines1 I have to call List.rev so that the result is in the right order. Since the result is a singly linked list, List.rev has to walk through all of it and create a reversed copy. Is there any way of creating the list in the right order?
You can use a sequence instead of accumulating the lines in a list:
open System
let readLines1 () =
    let rec helper () =
        seq {
            match Console.ReadLine() with
            | null -> ()
            | line ->
                yield line
                yield! helper ()
        }
    helper () |> Seq.toList
readLines1 () |> List.iter (printfn "%s")
You cannot build the list directly in the right order, because that would require mutation. If you read inputs one by one and want to turn them into a list immediately, the only thing you can do is create a new list that links to the previous one.
In practice, reversing the list is perfectly fine, and that's probably the best way of solving this.
Out of curiosity, you could try defining a mutable list that has the same structure as the immutable F# list:
open System
type MutableList<'T> =
    { mutable List : MutableListBody<'T> }
and MutableListBody<'T> =
    | Empty
    | Cons of 'T * MutableList<'T>
Now you can implement your function by mutating the list:
let rec readLines () =
    let res = { List = Empty }
    let rec helper acc =
        match Console.ReadLine() with
        | null -> res
        | line ->
            let next = { List = Empty }
            acc.List <- Cons(line, next)
            helper next
    helper res
This may be educational, but it's not very useful and, if you really wanted mutation in F#, you should probably use ResizeArray.
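The ResizeArray suggestion could look roughly like the sketch below. The function name `readLinesFrom` and its TextReader parameter are hypothetical, chosen so the example can run against a StringReader; passing Console.In would read from stdin instead.

```fsharp
open System.IO

// Sketch: append lines to a mutable ResizeArray in arrival order,
// then copy the buffer into an immutable F# list once at the end.
let readLinesFrom (reader: TextReader) : string list =
    let buffer = ResizeArray<string>()
    let mutable line = reader.ReadLine()
    while not (isNull line) do
        buffer.Add line
        line <- reader.ReadLine()
    List.ofSeq buffer

let sample = readLinesFrom (new StringReader("first\nsecond\nthird"))
printfn "%A" sample
```

Note that List.ofSeq still copies every element once, so this is not obviously cheaper than accumulating with :: and calling List.rev.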
Yet another trick is to work with functions that take the tail of the list:
let rec readLines () =
    let rec helper acc =
        match Console.ReadLine() with
        | null -> acc []
        | line -> helper (fun tail -> acc (line :: tail))
    helper id
In the line case, this returns a function that takes a tail, adds line in front of that tail, and then calls whatever function was constructed before to add more things to the front.
This actually creates the list in the right order, but it's probably less efficient than creating a list and reversing it. It may look nice, but you are allocating a new function for each iteration, which is not better than allocating an extra copy of the list. (But it is a nice trick, nevertheless!)
Alternative solution without implementing recursive functions:
let lines =
    Seq.initInfinite (fun _ -> Console.ReadLine())
    |> Seq.takeWhile (not << isNull)
    |> Seq.toList

implementing an equivalent of the C#'s null test in F#

I use this piece of code quite a lot:
let inline (||>) (a: 'a option) (b: 'a -> unit) = if a.IsSome then b a.Value
so I can do things like
myData ||> DoSomethingWithIt
without having to test whether myData is Some or None, since many functions don't generally need to handle an option. This avoids putting the test in the function itself.
I would like to extend this to methods of a type where I could do like C#'s:
myData?.DoSomethingWithIt
essentially replacing:
if myData.IsSome then myData.Value.DoSomethingWithIt
with some syntactic sugar.
But I have no idea how to write the operator so that it gives access to the type's methods in the expression. Is that possible in F#?
I'm also open to learning why this could be a bad idea, if it is :)
Depending on the return type of DoSomethingWithIt, the F# core library offers a few standard functions for working with Options that can be turned into operators.
let x = Some 1
let aPrinter a = printfn "%i" a
let add1 a = a + 1
let (|?>) opt f = Option.iter f opt
let (|??>) opt f = Option.map f opt
x |?> aPrinter
let y = x |??> add1
You can also consider redefining your DoSomethingWithIt to work with an option by partial application.
let DoSomethingWithIt' = Option.iter DoSomethingWithIt
let something' = Option.iter (fun (b:B) -> b.DoSomethingWithIt()) //For instance methods
That may end up being a lot of work depending how many functions you are dealing with.
Ultimately you shouldn't try to hide the fact that you are working with Options. By making something an Option you are telling the compiler that you aren't sure whether it exists or not, and the compiler is trying to help you by forcing you to deal with the None case. If there are lots of places in your code where you know your Option is Some, then there is probably a larger architectural issue, and you should try to lift all your Option<'T> values to plain 'T before doing work with them, e.g.:
let lift xs =
    [
        for x in xs do
            match x with
            | Some x -> yield x
            | None -> ()
    ]
Have a look at Option.iter. It does what your operator does, with the arguments in the opposite order (function first, option second).
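For instance methods and properties, a rough analogue of C#'s ?. can be sketched with Option.iter and Option.map plus a lambda. The Greeter type below is hypothetical and exists only to give the example a method and a property to call.

```fsharp
// Hypothetical type used only for illustration.
type Greeter(name: string) =
    member this.Name = name
    member this.Greet() = printfn "Hello, %s" name

let present : Greeter option = Some (Greeter "Ada")
let missing : Greeter option = None

// Roughly `present?.Greet()`: the lambda runs only for Some.
present |> Option.iter (fun g -> g.Greet())
missing |> Option.iter (fun g -> g.Greet())

// Roughly `present?.Name`: the result stays wrapped in an option.
let presentName = present |> Option.map (fun g -> g.Name)
let missingName = missing |> Option.map (fun g -> g.Name)
printfn "%A %A" presentName missingName
```

The None case stays visible in the result type, which is the point the answers above are making.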
There is no analogous syntax for such constructions, but F# has alternatives.
The easiest way is to use the FSharpx.Extras library and its FSharpx.Option.maybe computation expression, which lets you chain Option-related operations.
open FSharpx.Option
let a = Some 1
let b = maybe {
    let! v = a
    return v + 3
} // b is (Some 4)
let c : int option = None
let d = maybe {
    let! v = c
    return v + 3 // this line won't be reached
} // d is None
I believe that the ?. operator in C# is syntactic sugar that hides the if statement checking for null before invoking a member of the type. Even if you could make it work the way you plan, I feel that it would go against FP principles and could cause more problems down the line.
The Option module probably contains most of what you need already. The iter function lets you call a function on the value of the Option if that value is present (Some).
If your input parameters can be null but are not options, you can use the Option.ofObj function, which converts the parameter to an Option: Some if the parameter is not null, otherwise None.
So, assuming that your function DoSomethingWithIt accepts a string and returns unit:
let DoSomethingWithIt = //(string -> unit)
    printf "%s; "
You can use this more verbose syntax to (for example) iterate over nullable values in your list:
let lst = [ "data"; "data 2"; null; "data3" ]
lst
|> List.iter (fun v -> v |> Option.ofObj |> Option.iter DoSomethingWithIt)
Alternatively, you can compose the Option.ofObj and Option.iter DoSomethingWithIt functions and do something like
let SafeDoSomethingWithIt = //(string -> unit)
    Option.ofObj >> Option.iter DoSomethingWithIt
This gives you safe invocation:
let lst2 = [ "data"; "data 2"; null; "data3" ]
lst2
|> List.iter SafeDoSomethingWithIt
You can generalize the combination for functions returning unit (but not only those):
let makeSafe fn =
    Option.ofObj >> Option.iter fn
Then you can create a series of safe functions:
let SafeDoSomethingWithIt = makeSafe DoSomethingWithIt
let safePrint = makeSafe (printf "%s; ")
//...etc
Then this still works:
lst2
|> List.iter SafeDoSomethingWithIt
lst2
|> List.iter safePrint
You can also write a wrapper for functions that return values, using the Option.bind function.
let makeSafeReturn fn = //(string -> string option)
    Option.ofObj >> Option.bind fn
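As a usage sketch for makeSafeReturn, the helper `tryParseInt` below is a hypothetical example of a function returning an option; it is not part of the original answer.

```fsharp
// Same shape as the wrapper above: null becomes None, otherwise fn decides.
let makeSafeReturn fn =
    Option.ofObj >> Option.bind fn

// Hypothetical option-returning function used for illustration.
let tryParseInt (s: string) =
    match System.Int32.TryParse s with
    | true, n -> Some n
    | false, _ -> None

let safeParse = makeSafeReturn tryParseInt

let parsed = [ "42"; null; "oops" ] |> List.map safeParse
printfn "%A" parsed
```

Both the null input and the unparsable string end up as None, so callers handle a single "no value" case.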

Is there already or can I declare a more pipe friendly upcast?

I want to be able to just
let upcast'<'T,'TResult when 'T :> 'TResult> (y:'T) = y |> upcast
However, that then constrains 'T to be 'TResult instead of it being something that can be cast to 'TResult
I know I can
|> fun x -> x :> 'TResult
|> fun x -> upcast x
|> fun x -> x :> _
but then if I'm doing anything else on that line I have to go back and put () around the fun x -> upcast x or it thinks what I'm doing is part of the fun x function.
Can I define, or does there already exist, a way to write something like this?
|> upcast |> // doesn't work
|> ( ( :> ) 'TResult) // doesn't work and is messy
edit
In response to Tomas Petricek - minimal failing auto-upcast sample:
module Test =
    let inline f'<'t>():IReadOnlyCollection<'t> =
        List.empty
        |> ResizeArray
        |> System.Collections.ObjectModel.ReadOnlyCollection
        |> fun x -> x :> IReadOnlyCollection<_>
    let inline f<'t> () :IReadOnlyCollection<'t> =
        List.empty
        |> ResizeArray
        |> System.Collections.ObjectModel.ReadOnlyCollection
As far as I know, specifying the kind of constraint between 'T and 'TResult is not possible. There is a related question about this with links to more information and a feature request.
That said, I wonder why you need this. The F# compiler is able to insert upcasts automatically, even when using pipes, so if you want to do this as part of a longer pipe, it should not be needed. Here is a simple illustration:
type Animal = interface end
type Dog = inherit Animal
let makeDog () = { new Dog }
let consumeAnimal (a:Animal) = 0
makeDog () |> consumeAnimal
I guess you might need pipe-able upcast if you wanted to have it at the end of the pipeline, but then I'd just do the upcast on a separate line. Or is your question motivated by some more complicated cases where the implicit upcast does not work?
EDIT 1: Here is a minimal example using ReadOnlyCollection and IReadOnlyList which works:
let foo () : System.Collections.ObjectModel.ReadOnlyCollection<int> = failwith "!"
let bar (x:System.Collections.Generic.IReadOnlyList<int>) = 0
foo() |> bar
EDIT 2: To comment on the update - the problem here is that automatic upcasts are only inserted when passing arguments to functions, but in the second example, the type mismatch is between the result of the pipe and the return type of the function. You can get that to work by adding an identity function of type IReadOnlyCollection<'T> -> IReadOnlyCollection<'T> to the end of the pipe:
let inline f<'t> () :IReadOnlyCollection<'t> =
    List.empty
    |> ResizeArray
    |> System.Collections.ObjectModel.ReadOnlyCollection
    |> id<IReadOnlyCollection<_>>
This works, because now the upcast is inserted automatically when passing the argument to the id function - and this then returns a type that matches with the return type of the function.
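The same idea can be packaged as a small typed identity helper, which reads a little like a pipe-friendly upcast for one specific target type. The name `asReadOnlyCollection` is hypothetical; this is only a sketch of the technique, relying on the compiler inserting the upcast at the function argument.

```fsharp
open System.Collections.Generic
open System.Collections.ObjectModel

// Typed identity: the argument position is where the automatic upcast happens.
let asReadOnlyCollection (xs: IReadOnlyCollection<'T>) = xs

let numbers () : IReadOnlyCollection<int> =
    [ 1; 2; 3 ]
    |> ResizeArray
    |> ReadOnlyCollection
    |> asReadOnlyCollection

printfn "%d" (numbers().Count)
```

One such helper is needed per target type, so this does not give a fully generic `|> upcast`.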
Much simpler, and unexpected:
let inline f2<'t>() : IReadOnlyCollection<'t> =
    List.empty
    |> ResizeArray
    |> System.Collections.ObjectModel.ReadOnlyCollection
    :> _

Trying to understand F# active patterns, why can I do this:

I have a Dictionary over which I initially iterated thusly:
myDictionary |> Seq.iter (fun kvp -> doSomething kvp.Key kvp.Value)
Later, I discovered that I could make use of the KeyValue active pattern, and do this:
myDictionary |> Seq.iter (fun (KeyValue (k, v)) -> doSomething k v)
Knowing that active patterns aren't some form of preprocessor directive, how am I able to substitute the kvp argument in the lambda for a function that decomposes it?
Function arguments can always be destructured using pattern matching. For instance:
let getSingleton = fun [x] -> x
let getFirst = fun (a,b) -> a
let failIfNotOne = fun 1 -> ()
let failIfNeitherOne = fun (x,1 | 1,x) -> ()
Semantically, fun <pat> -> <body> is roughly equivalent to
fun x ->
    match x with
    | <pat> -> <body>
    | _ -> raise (MatchFailureException(...))
I think the answer from @kvb covers in enough detail why you can use patterns in the arguments of fun. This is not an ad-hoc feature: in F#, you can use patterns anywhere you can bind a variable. To show some of @kvb's examples in other contexts:
// When declaring normal functions
let foo [it] = it // Return the value from a singleton list
let fst (a, b) = a // Return first element of a pair
// When assigning value to a pattern using let
let [it] = list
let (a, b) = pair
Similarly, you can use patterns when writing fun. The match construct is a bit more powerful, because you can specify multiple clauses.
Now, active patterns are not really that magical. They are just normal functions with special names. The compiler searches for active patterns in scope when it finds a named pattern. For example, the pattern you're using is just a function:
val (|KeyValue|) : KeyValuePair<'a,'b> -> 'a * 'b
The pattern turns a KeyValuePair object into a normal F# tuple that is then matched by the nested pattern (k, v), which assigns the first element to k and the second to v. The compiler essentially translates your code to:
myDictionary |> Seq.iter (fun _arg0 ->
    let _arg1 = (|KeyValue|) _arg0
    let (k, v) = _arg1
    doSomething k v )
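To make the "just a function with a special name" point concrete, here is a small sketch. The Length pattern is a hypothetical single-case active pattern defined only for this example; KeyValue is the built-in one discussed above.

```fsharp
open System.Collections.Generic

// Hypothetical single-case active pattern: an ordinary function named (|Length|).
let (|Length|) (s: string) = s.Length

// Used as a pattern in a lambda argument.
let lengths = [ "a"; "bcd" ] |> List.map (fun (Length n) -> n)

// Called directly, like any other function.
let viaFunction = (|Length|) "hello"

// The built-in KeyValue pattern used the same way on dictionary entries.
let d = Dictionary<string, int>()
d.["one"] <- 1
d.["two"] <- 2
let total = d |> Seq.sumBy (fun (KeyValue (_, v)) -> v)

printfn "%A %d %d" lengths viaFunction total
```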

Enumerator and disposing in F#

I am trying to draw lessons about the following behaviour from a simplified example:
let groupedEnum (input: 'a seq) =
    using (input.GetEnumerator()) (fun en ->
        Seq.unfold(fun _ ->
            if en.MoveNext() then
                Some(en.Current, ())
            else None) ()
    )
//WORKS
let c = groupedEnum ("11111122334569999" |> List.ofSeq ) |> List.ofSeq
//BOOM !! System.NullReferenceException
let c = groupedEnum ("11111122334569999" ) |> List.ofSeq
Is the enumerator "en" disposed of independently of it being captured? (I guess it is, but is there anything to say or any material to read on this behaviour besides this MSDN doc on resources?)
Why does it work if the sequence is transformed to a list first?
Edit: this is just a toy example to illustrate a behaviour, not one to be followed.
There are very few good reasons to manipulate enumerators directly.
The using function disposes the enumerator as soon as the lambda function returns. However, the lambda function creates a lazy sequence using Seq.unfold and the lazy sequence accesses the enumerator after the sequence is returned from groupedEnum.
You could either fully evaluate the whole sequence inside using (by adding List.ofSeq there) or you need to call Dispose when the end of the generated sequence is reached:
let groupedEnum (input: 'a seq) =
    let en = input.GetEnumerator()
    Seq.unfold (fun _ ->
        if en.MoveNext() then
            Some(en.Current, ())
        else
            en.Dispose()
            None) ()
Exception handling becomes quite difficult in this case, but I guess that one way to do it would be to wrap the body in try .. with and call Dispose if an exception happens (and then return None).
If you use sequence expressions instead, then the meaning of use changes and it automatically disposes the enumerator after the end of sequence is reached (not when the lazy sequence is returned). So using sequence expressions might be a better choice because the hard work is done for you:
let groupedEnum (input: 'a seq) = seq {
    use en = input.GetEnumerator()
    let rec loop () = seq {
        if en.MoveNext() then
            yield en.Current
            yield! loop () }
    yield! loop () }
EDIT: And why does it work in your first example? The enumerator returned by the F# list type simply ignores Dispose and continues working, whereas once you call Dispose on the enumerator returned by a string, that enumerator cannot be used again. (This is arguably slightly odd behaviour of the F# list type.)
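The timing difference between the using function and use inside a sequence expression can be observed with a small sketch. The Tracker type is hypothetical: a disposable that only records whether Dispose has been called.

```fsharp
open System

// Hypothetical disposable that remembers whether it has been disposed.
type Tracker() =
    member val Disposed = false with get, set
    interface IDisposable with
        member this.Dispose() = this.Disposed <- true

// `using` disposes as soon as the lambda returns, before the lazy body runs.
let eager (t: Tracker) =
    using t (fun r -> Seq.delay (fun () -> Seq.singleton r.Disposed))

// Inside seq { }, `use` disposes only once enumeration has finished.
let deferred (t: Tracker) =
    seq {
        use r = t
        yield r.Disposed
    }

let seenByEager = eager (new Tracker()) |> Seq.toList
let tracked = new Tracker()
let seenByDeferred = deferred tracked |> Seq.toList
printfn "%A %A %b" seenByEager seenByDeferred tracked.Disposed
```

In the first case the sequence body observes an already disposed resource; in the second it sees a live one, and disposal happens after the last element is produced.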

Resources