The language suggestion states that the advantages are detailed in the linked paper. I had a quick skim through and I can't see it spelled out explicitly.
Is the advantage just that each statement gets executed in parallel so I might get a speed boost?
Or is there some kind of logic it caters for that is not convenient with the usual monadic let!?
I understand that, this being applicative, it comes with the limitation that I can't use the results of previous expressions to determine the logic of subsequent expressions. So does that mean the trade-off is flexibility for efficiency?
The way I understood Don Syme's description when I read it a while ago was that every step in the let! ... and! ... chain will be executed, unlike when you use let! ... let! .... Say, for instance, that you use the option CE. Then if you write
    option {
        let! a = parseInt("a")
        let! b = parseInt("b")
        return a + b
    }
only the first let! will be executed, as the CE short-circuits as soon as it meets a None. Writing let! a = ... and! b = ... instead will try to parse both strings; though not necessarily in parallel, as I understand it.
I see a huge benefit in this, when parsing data from external sources. Consider for instance running
    parse {
        let! name = tryParse<Name>("John Doe")
        and! email = tryParse<EMail>("johndoe#")
        and! age = tryParse<uint>("abc")
        return (name, email, age)
    }
and getting an Error(["'johndoe#' isn't a valid e-mail"; "'abc' isn't a valid uint"]) in return, instead of only the first of the errors. That's pretty neat to me.
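In F# 5, and! is driven by builder methods such as MergeSources and BindReturn. A minimal sketch of a validation builder that collects errors in this way (the names validation and tryParseUInt are made up for illustration):

```fsharp
// Hypothetical validation builder: Error carries a list so that failures
// from several and! branches can be concatenated.
type Validation<'T> = Result<'T, string list>

type ValidationBuilder() =
    // MergeSources powers and!: evaluate both sides and combine their errors.
    member _.MergeSources(a: Validation<'A>, b: Validation<'B>) : Validation<'A * 'B> =
        match a, b with
        | Ok x, Ok y -> Ok (x, y)
        | Error e, Ok _ | Ok _, Error e -> Error e
        | Error e1, Error e2 -> Error (e1 @ e2)
    // BindReturn powers the final let!/return of an applicative block.
    member _.BindReturn(x: Validation<'A>, f: 'A -> 'B) : Validation<'B> =
        Result.map f x

let validation = ValidationBuilder()

let tryParseUInt (s: string) : Validation<uint32> =
    match System.UInt32.TryParse s with
    | true, v -> Ok v
    | _ -> Error [ sprintf "'%s' isn't a valid uint" s ]

let result =
    validation {
        let! a = tryParseUInt "42"
        and! b = tryParseUInt "abc"
        and! c = tryParseUInt "xyz"
        return (a, b, c)
    }
// result = Error ["'abc' isn't a valid uint"; "'xyz' isn't a valid uint"]
```

All three parses run, and both error messages are reported, as in the parse example above.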
I am excited about this for two reasons: potentially better performance, and the fact that some things that can't fit into the Monad pattern can fit into the Applicative Functor pattern.
In order to support the already existing let!, one needs to implement Bind and Return, i.e. the Monad pattern.
let! combined with and! requires Apply and Pure (in F# the corresponding builder methods are named differently, e.g. MergeSources and BindReturn), i.e. the Applicative Functor pattern.
Applicative Functors are less powerful than Monads but still very useful. For example, in parsers.
As mentioned by Phillip Carter, Applicative Functors can be made more efficient than Monads because it's possible to cache them.
The reason is the signature of Bind: M<'T> -> ('T -> M<'U>) -> M<'U>.
One typical way to implement Bind is to "execute" the first argument to produce a value of 'T and pass it to the second argument to get the next step M<'U>, which you execute as well.
As the second argument is a function that can return any M<'U>, caching is not possible.
With Apply the signature is: M<'T -> 'U> -> M<'T> -> M<'U>. As neither of the two inputs are functions you can potentially cache them or do precomputation.
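The difference in shape can be sketched for Option (a sketch only; these are standalone functions, not builder methods):

```fsharp
// With bind, the next computation only exists once f runs, so the overall
// structure cannot be known (or cached) up front.
let bind (m: 'T option) (f: 'T -> 'U option) : 'U option =
    match m with
    | Some x -> f x      // user code chooses the next M<'U> at run time
    | None -> None

// With apply, both inputs are plain values of the type; no user code gets to
// pick a new structure, which is what makes caching/precomputation possible.
let apply (mf: ('T -> 'U) option) (mx: 'T option) : 'U option =
    match mf, mx with
    | Some f, Some x -> Some (f x)
    | _ -> None
```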
This is useful for parsers: in FParsec, for example, using Bind is not recommended, for performance reasons.
In addition, a "drawback" of Bind is that if the first input doesn't produce a value of 'T, there is no way to get to the second computation M<'U>. If you are building some kind of validator monad, this means that as soon as a valid value can't be produced because the input is invalid, validation stops and you won't get a report on the remaining input.
With Applicative Functor you can execute all validators to produce a validation report for the entire document.
So I think it's pretty cool!
Related
Taking the example of writing a parser-unparser pair for a DU.
The unparser function of course uses a match expression, where we can count on the static exhaustiveness check if new cases are added later.
Now, for the parsing side, the obvious route (with e.g. FParsec) is to use a choice parser over a list of the DU case parsers. This works, but does not have exhaustiveness checking.
My best solution for this is to create a DU_Case -> parser function that uses a match expression, get all DU cases via runtime reflection, call the function with each of them, and pass the resulting list to the choice combinator. This works, and does have static exhaustiveness checking, BUT it gets pretty clunky fast, especially if the cases have varying fields.
In other words, is there a good pattern other than using reflection for ensuring exhaustiveness when the discriminated union is the output and not the input?
Edit: example on the reflection based solution
By getting clunky, I mean that for every case that has fields, I also have to create the field values dynamically, instead of using the simple Array.zeroCreate 0, which is of course always valid. But what goes in the else branch? There needs to be code to dynamically create the correct number of field values, with their correct (possibly complex) constructors, and so on. Not impossible with reflection, but clunky.
    FSharpType.GetUnionCases duType
    |> Seq.map (fun duCase ->
        if duCase.GetFields().Length = 0 then
            FSharpValue.MakeUnion(duCase, Array.zeroCreate 0)
        else
            (*...*))
I could not find a choice object in the standard libraries that allows me to write
    let safeDiv (numer : Choice<Exception, int>) (denom : Choice<Exception, int>) =
        choice {
            let! n = numer
            let! d = denom
            return! if d = 0
                    then Choice1Of2 (DivideByZeroException() :> Exception)
                    else Choice2Of2 (n / d)
        }
like in Haskell. Did I miss anything? Is there a third-party library for writing this kind of thing, or do I have to re-invent this wheel?
There is no built-in computation expression for the Choice<'a,'b> type. In general, F# does not have a built-in computation expression for the commonly used Monads, but it does offer a fairly simple way to create them yourself: Computation Builders. This series is a good tutorial on how to implement them yourself. The F# library does often have a bind function defined that can be used as the basis of a Computation Builder, but it doesn't have one for the Choice type (I suspect because there are many variations of Choice).
Based on the example you provided, I suspect the F# Result<'a, 'error> type would actually be a better fit for your scenario. There's a code-review from a few months ago where a user posted an Either Computation Builder, and the accepted answer has a fairly complete implementation if you'd like to leverage it.
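If you do want to stay with Choice, a builder along the lines described above can be sketched as follows (a minimal sketch; Choice1Of2 is treated as the error case, matching the question's safeDiv):

```fsharp
// Minimal, hypothetical builder for Choice<'error, 'a>:
// it short-circuits on the first Choice1Of2 it meets.
type ChoiceBuilder() =
    member _.Bind(m: Choice<'e, 'a>, f: 'a -> Choice<'e, 'b>) : Choice<'e, 'b> =
        match m with
        | Choice1Of2 e -> Choice1Of2 e
        | Choice2Of2 x -> f x
    member _.Return(x: 'a) : Choice<'e, 'a> = Choice2Of2 x
    member _.ReturnFrom(m: Choice<'e, 'a>) : Choice<'e, 'a> = m

let choice = ChoiceBuilder()
```

With this in scope, safeDiv as written in the question type-checks, provided the DivideByZeroException is upcast to Exception.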
It is worth noting that, unlike in Haskell, using exceptions is a perfectly acceptable way to handle exceptional situations in F#. The language and the runtime both have first-class support for exceptions, and there is nothing wrong with using them.
I understand that your safeDiv function is for illustration rather than a real-world problem, so there is little point in showing how to write it using exceptions.
In more realistic scenarios:
If the exception happens only when something actually goes wrong (network failure, etc.) then I would just let the system throw an exception and handle that using try ... with at the point where you need to restart the work or notify the user.
If the exception represents something expected (e.g. invalid user input) then you'll probably get more readable code if you define a custom data type to represent the wrong states (rather than using Choice<'a, exn> which has no semantic meaning).
It is also worth noting that computation expressions are only useful if you need to mix your special behaviour (exception propagation) with ordinary computation. I think it's often desirable to avoid that as much as possible (because it interleaves effects with pure computations).
For example, if you were doing input validation, you could define something like:
let result = validateAll [ condition1; condition2; condition3 ]
I would prefer that over a computation expression:
    let result = validate {
        do! condition1
        do! condition2
        do! condition3 }
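For completeness, the combinator-style validateAll could be sketched like this (a hypothetical implementation, assuming each condition is a Result<unit, string>):

```fsharp
// Collect every failed condition instead of stopping at the first one.
let validateAll (conditions: Result<unit, string> list) : Result<unit, string list> =
    match conditions |> List.choose (function Error e -> Some e | Ok () -> None) with
    | [] -> Ok ()
    | errors -> Error errors
```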
That said, if you are absolutely certain that custom computation builder for error propagation is what you need, then Aaron's answer has all the information you need.
I'm trying to work out how to use a computation builder to represent a deferred, nested set of steps.
I've got the following so far:
    type Entry =
        | Leaf of string * (unit -> unit)
        | Node of string * Entry list * (unit -> unit)

    type StepBuilder(desc:string) =
        member this.Zero() = Leaf(desc, id)
        member this.Bind(v:string, f:unit->string) =
            Node(f(), [Leaf(v, id)], id)
        member this.Bind(v:Entry, f:unit->Entry) =
            match f() with
            | Node(label, children, a) -> Node(label, v :: children, a)
            | Leaf(label, a) -> Node(label, [v], a)

    let step desc = StepBuilder(desc)

    let a = step "a" {
        do! step "b" {
            do! step "c" {
                do! step "c.1" {
                    // todo: this still evals as it goes; need to find a way to defer
                    // the inner contents...
                    printfn "TEST"
                }
            }
        }
        do! step "d" {
            printfn "d"
        }
    }
This produces the desired structure:
A(B(C(c.1)), D)
My issue is that in building the structure up, the printfn calls are made.
Ideally what I want is to be able to retrieve the tree structure, but be able to call some returned function/s that will then execute the inner blocks.
I realise this means that if you have two nested steps with some "normal" code between them, it would need to be able to read the step declarations, and then invoke over it (if that makes sense?).
I know that Delay and Run are things that are used in deferred execution for computation expressions, but I'm not sure if they help me here, as they unfortunately evaluate for everything.
I'm most likely missing something glaringly obvious and very "functional-y" but I just can't seem to make it do what I want.
Clarification
I'm using id for demonstration; those (unit -> unit) functions are part of the puzzle, and are how I imagine I might surface the "invokable" parts of my expression.
As mentioned in the other answer, free monads provide a useful theoretical framework for thinking about this kind of problem; however, I think you do not necessarily need them to answer the specific question you are asking here.
First, I had to add Return to your computation builder to make your code compile. As you never return anything, I just added an overload taking unit which is equivalent to Zero:
member this.Return( () ) = this.Zero()
Now, to answer your question - I think you need to modify your discriminated union to allow delaying of computations that produce Entry - you do have functions unit -> unit in the domain model, but that's not quite enough to delay a computation that will produce a new entry. So, I think you need to extend the type:
    type Entry =
        | Leaf of string * (unit -> unit)
        | Node of string * Entry list * (unit -> unit)
        | Delayed of (unit -> Entry)
When you are evaluating Entry, you will now need to handle the Delayed case, which contains a function that might perform side effects such as printing "TEST".
Now you can add Delay to your computation builder and also implement the missing case for Delayed in Bind like this:
    member this.Delay(f) = Delayed(f)

    member this.Bind(v:Entry, f:unit->Entry) = Delayed(fun () ->
        let rec loop = function
            | Delayed f -> loop (f())
            | Node(label, children, a) -> Node(label, v :: children, a)
            | Leaf(label, a) -> Node(label, [v], a)
        loop (f()) )
Essentially, Bind creates a new delayed computation that, when called, evaluates the rest of the computation (the result of f()) until it finds a node or a leaf (collapsing any intermediate delayed nodes), and then does the same thing your code did before.
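To actually run things later, you need a function that forces the structure. A sketch (assuming the extended Entry type above):

```fsharp
// Walk the tree, running each Delayed thunk (and with it any side effects,
// such as the printfn calls) to recover a plain Node/Leaf tree.
let rec force (entry: Entry) : Entry =
    match entry with
    | Delayed f -> force (f ())
    | Node (label, children, a) -> Node (label, List.map force children, a)
    | Leaf _ -> entry
```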
I think this answers your question - but I'd be a bit careful here. I think computation expressions are useful as a syntactic sugar, but they are very harmful if you think about them more than you think about the domain of the problem that you are actually solving - in the question, you did not say much about your actual problem. If you did, the answer might be very different.
You wrote:
Ideally what I want is to be able to retrieve the tree structure, but be able to call some returned function/s that will then execute the inner blocks.
This is an almost perfect description of the "free monad", which is basically the functional-programming equivalent of the OOP "interpreter pattern". The basic idea behind the free monad is that you convert imperative-style code into a two-step process. The first step builds up an AST, and the second step executes the AST. That way you can do things in between step 1 and step 2, like analyze the tree structure without executing the code. Then when you're ready, you can run your "execute" function, which takes the AST as input and actually does the steps it represents.
I'm not experienced enough with free monads to be able to write a complete tutorial on them, nor to directly answer your question with a step-by-step specific free-monad solution. But I can point you to a few resources that may help you understand the concepts behind them. First, the required Scott Wlaschin link:
https://fsharpforfunandprofit.com/posts/13-ways-of-looking-at-a-turtle-2/#way13
This is the last part of his "13 ways of looking at a turtle" series, where he builds a small LOGO-like turtle-graphics app using many different design styles. In #13, he uses the free-monad style, building it from the ground up so you can see the design decisions that go into that style.
Second, a set of links to Mark Seemann's blog. For the past month or two, Mark Seemann has been writing posts about the free-monad style, though I didn't realize that that's what he was writing about until he was several articles in. There's a terminology difference that may confuse you at first: Scott Wlaschin uses the terms "Stop" and "KeepGoing" for the two possible AST cases ("this is the end of the command list" vs. "there are more commands after this one"). But the traditional names for those two free-monad cases are "Pure" and "Free". IMHO, the names "Pure" and "Free" are too abstract, and I like Scott Wlaschin's "Stop" and "KeepGoing" names better. But I mention this so that when you see "Pure" and "Free" in Mark Seemann's posts, you'll know that it's the same concept as Scott Wlaschin's turtle example.
Okay, with that explanation finished, here are the links to Mark Seemann's posts:
http://blog.ploeh.dk/2017/06/27/pure-times/
http://blog.ploeh.dk/2017/06/28/pure-times-in-haskell/
http://blog.ploeh.dk/2017/07/04/pure-times-in-f/
http://blog.ploeh.dk/2017/07/10/pure-interactions/
http://blog.ploeh.dk/2017/07/11/hello-pure-command-line-interaction/
http://blog.ploeh.dk/2017/07/17/a-pure-command-line-wizard/
http://blog.ploeh.dk/2017/07/24/combining-free-monads-in-haskell/
http://blog.ploeh.dk/2017/07/31/combining-free-monads-in-f/
http://blog.ploeh.dk/2017/08/07/f-free-monad-recipe/
Mark intersperses Haskell examples with F# examples, as you can tell from the URLs. If you are completely unfamiliar with Haskell, you can probably skip those posts as they might confuse you more than they help. But if you've got a passing familiarity with Haskell syntax, seeing the same ideas expressed in both Haskell and F# may help you grasp the concepts better, so I've included the Haskell posts as well as the F# posts.
As I said, I'm not quite familiar enough with free monads to be able to give you a specific answer to your question. But hopefully these links will give you some background knowledge that can help you implement what you're looking for.
I was reading this and now wonder: what is the evaluation order in F#?
Obviously ; makes effects happen in a sequential fashion. But what about things like function calls or applications, order of evaluation for operators, and the like.
I've glanced at the F# spec, but there is no mention of that. Thanks for any insight!
I found some emails where we fixed the implementation to have a rigid application order. The code
    open System

    let f a =
        Console.WriteLine "app1";
        fun b ->
            Console.WriteLine "app2";
            ()

    (Console.WriteLine "f"; f) (Console.WriteLine "arg1") (Console.WriteLine "arg2")
will print "f", "arg1", "arg2", "app1", "app2". However this didn't make it into the spec. I'll file a spec bug.
(Some other portions of the spec are already more explicit, e.g.
6.9.6 Evaluating Method Applications
For elaborated applications of methods, the elaborated form of the expression will be either expr.M(args) or M(args).
The (optional) expr and args are evaluated in left-to-right order and the body of the member is evaluated in an environment with formal parameters that are mapped to corresponding argument values.
If expr evaluates to null then NullReferenceException is raised.
If the method is a virtual dispatch slot (that is, a method that is declared abstract) then the body of the member is chosen according to the dispatch maps of the value of expr.
That said, some experts believe that you will live a longer, happier life if you do not rely on evaluation order. :) )
(Possibly see also
http://blogs.msdn.com/ericlippert/archive/2009/11/19/always-write-a-spec-part-one.aspx
http://blogs.msdn.com/ericlippert/archive/2009/11/23/always-write-a-spec-part-two.aspx
for more on how easy it is to screw things up with evaluation order.)
Why is it that functions in F# and OCaml (and possibly other languages) are not by default recursive?
In other words, why did the language designers decide it was a good idea to explicitly make you type rec in a declaration like:
let rec foo ... = ...
and not give the function recursive capability by default? Why the need for an explicit rec construct?
The French and British descendants of the original ML made different choices and their choices have been inherited through the decades to the modern variants. So this is just legacy but it does affect idioms in these languages.
Functions are not recursive by default in the French CAML family of languages (including OCaml). This choice makes it easy to supersede function (and variable) definitions using let in those languages, because you can refer to the previous definition inside the body of a new definition. F# inherited this syntax from OCaml.
For example, superseding the function p when computing the Shannon entropy of a sequence in OCaml:
    let shannon fold p =
        let p x = p x *. log(p x) /. log 2.0 in
        let p t x = t +. p x in
        -. fold p 0.0
Note how the argument p to the higher-order shannon function is superseded by another p in the first line of the body, and then by yet another p in the second line of the body.
Conversely, the British SML branch of the ML family took the other choice: SML's fun-bound functions are recursive by default. When most function definitions do not need access to previous bindings of their function name, this results in simpler code. However, superseded functions are forced to use different names (f1, f2 etc.), which pollutes the scope and makes it possible to accidentally invoke the wrong "version" of a function. And there is now a discrepancy between implicitly recursive fun-bound functions and non-recursive val-bound functions.
Haskell makes it possible to infer the dependencies between definitions by restricting them to be pure. This makes toy samples look simpler but comes at a grave cost elsewhere.
Note that the answers given by Ganesh and Eddie are red herrings. They explained why groups of functions cannot be placed inside a giant let rec ... and ... because it affects when type variables get generalized. This has nothing to do with rec being default in SML but not OCaml.
One crucial reason for the explicit use of rec is to do with Hindley-Milner type inference, which underlies all statically typed functional programming languages (albeit changed and extended in various ways).
If you have a definition let f x = x, you'd expect it to have type 'a -> 'a and to be applicable at different 'a types at different points. But equally, if you write let g x = (x + 1) + ..., you'd expect x to be treated as an int in the rest of the body of g.
The way that Hindley-Milner inference deals with this distinction is through an explicit generalisation step. At certain points when processing your program, the type system stops and says "ok, the types of these definitions will be generalised at this point, so that when someone uses them, any free type variables in their type will be freshly instantiated, and thus won't interfere with any other uses of this definition."
It turns out that the sensible place to do this generalisation is after checking a mutually recursive set of functions. Any earlier, and you'll generalise too much, leading to situations where types could actually collide. Any later, and you'll generalise too little, making definitions that can't be used with multiple type instantiations.
So, given that the type checker needs to know about which sets of definitions are mutually recursive, what can it do? One possibility is to simply do a dependency analysis on all the definitions in a scope, and reorder them into the smallest possible groups. Haskell actually does this, but in languages like F# (and OCaml and SML) which have unrestricted side-effects, this is a bad idea because it might reorder the side-effects too. So instead it asks the user to explicitly mark which definitions are mutually recursive, and thus by extension where generalisation should occur.
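The effect of the generalisation point can be sketched in F# (a toy illustration; the names are made up):

```fsharp
// After its definition, id1 is generalised to 'a -> 'a, so it can be used
// at several different types:
let id1 x = x
let uses = (id1 1, id1 "s")   // fine: two different instantiations

// Inside a let rec group, recursive occurrences of the name are *not* yet
// generalised, so a single monomorphic use pins down the whole function:
let rec f x =
    if false then ignore (f "pinned")   // this use forces f : string -> string
    x
```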
There are two key reasons this is a good idea:
First, if you enable recursive definitions then you can't refer to a previous binding of a value of the same name. This is often a useful idiom when you are doing something like extending an existing module.
Second, recursive values, and especially sets of mutually recursive values, are much harder to reason about than definitions that proceed in order, each new definition building on top of what has already been defined. It is nice when reading such code to have the guarantee that, except for definitions explicitly marked as recursive, new definitions can only refer to previous definitions.
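The first point, superseding a previous binding, can be sketched like this (hypothetical names):

```fsharp
let example () =
    let log (msg: string) = printfn "%s" msg
    // Without rec, this log refers to the *previous* log, wrapping it
    // rather than calling itself:
    let log msg = log (sprintf "[app] %s" msg)
    log "hello"   // prints "[app] hello"
```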
Some guesses:
let is not only used to bind functions, but also other regular values. Most forms of values are not allowed to be recursive. Certain forms of recursive values are allowed (e.g. functions, lazy expressions, etc.), so it needs an explicit syntax to indicate this.
It might be easier to optimize non-recursive functions
The closure created when you create a recursive function needs to include an entry that points to the function itself (so the function can recursively call itself), which makes recursive closures more complicated than non-recursive closures. So it might be nice to be able to create simpler non-recursive closures when you don't need recursion
It allows you to define a function in terms of a previously-defined function or value of the same name; although I think this is bad practice
Extra safety? Makes sure that you are doing what you intended. e.g. If you don't intend it to be recursive but you accidentally used a name inside the function with the same name as the function itself, it will most likely complain (unless the name has been defined before)
The let construct is similar to the let construct in Lisp and Scheme, which is non-recursive. There is a separate letrec construct in Scheme for recursive lets.
Given this:
let f x = ... and g y = ...;;
Compare:
let f a = f (g a)
With this:
let rec f a = f (g a)
The former redefines f to apply the previously defined f to the result of applying g to a. The latter redefines f to loop forever applying g to a, which is usually not what you want in ML variants.
That said, it's a language designer style thing. Just go with it.
A big part of it is that it gives the programmer more control over the complexity of their local scopes. The spectrum of let, let* and let rec offer an increasing level of both power and cost. let* and let rec are in essence nested versions of the simple let, so using either one is more expensive. This grading allows you to micromanage the optimization of your program as you can choose which level of let you need for the task at hand. If you don't need recursion or the ability to refer to previous bindings, then you can fall back on a simple let to save a bit of performance.
It's similar to the graded equality predicates in Scheme. (i.e. eq?, eqv? and equal?)