Run method in computation expressions - f#

What's the status of the Run() method in a computation method? I've seen it in several examples (here, here, here), and I've seen in it F#'s compiler source, yet it's not in the spec or the MSDN documentation. I filed an issue in MS Connect about this and it was closed as "by design" without further explanations.
So is it deprecated/undocumented/unsupported? Should I avoid it?
UPDATE: MS Connect issue status was promptly changed and the MSDN page updated to include Run()

6.3.10 Computation Expressions
More specifically, computation
expressions are of the form
builder-expr { cexpr } where cexpr is,
syntactically, the grammar of
expressions with the additional
constructs defined in comp-expr.
Computation expressions are used for
sequences and other non-standard
interpretations of the F# expression
syntax. The expression builder-expr {
cexpr } translates to
let b = builder-expr in b.Run (b.Delay(fun () -> {| cexpr |}C))
for a fresh variable b. If no method Run exists on the inferred type of b when this
expression is checked, then that call is omitted. Likewise, if no method Delay exists on > the type of b when this expression is checked, then that call is omitted

I think that the Run method was added quite late in the development process, so that's probably a reason why it is missing in the documentation. As desco explains, the method is used to "run" a computation expression. This means that whenever you write expr { ... }, the translated code will be wrapped in a call to Run.
The method is a bit problematic, because it breaks compositionality. For example, it is sensible to require that for any computation expression, the following two examples represents the same thing:
expr { let! a = foo() expr { let! c = expr {
let! b = bar(a) let! a = foo()
let! c = woo(b) let! b = bar(a)
return! zoo(c) } return! woo(b) }
return! zoo(c) }
However, the Run method will be called only on the overall result in the left example and two times on the right (for the overall computation expression and for the nested one). A usual type signature of the method is M<T> -> T, which means that the right code will not even compile.
For this reason, it is a good idea to avoid it when creating monads (as they are usually defined and used e.g. in Haskell), because the Run method breaks some nice aspects of monads. However, if you know what you are doing, then it can be useful...
For example, in my break code, the computation builder executes its body immediately (in the declaration), so adding Run to unwrap the result doesn't break compositionality - composing means just running another code. However, defining Run for async and other delayed computations is not a good idea at all.

Related

This expression was expected to have type 'unit' but here has type 'string'

I was attempting to convert this to F# but I can't figure out what I'm doing wrong as the error message (in title) is too broad of an error to search for, so I found no resolutions.
Here is the code:
let getIP : string =
let host = Dns.GetHostEntry(Dns.GetHostName())
for ip in host.AddressList do
if ip.AddressFamily = AddressFamily.InterNetwork then
ip.ToString() // big fat red underline here
"?"
A for loop in F# is for running imperative-style code, where the code inside the for loop does not produce a result but instead runs some kind of side-effect. Therefore, the expression block in an F# for loop is expected to produce the type unit, which is what side-effect functions should return. (E.g., printfn "Something" returns the unit type). Also, there's no way to exit a for loop early in F#; this is by design, and is another reason why a for loop isn't the best approach to do what you're trying to do.
What you're trying to do is go through a list one item at a time, find the first item that matches some condition, and return that item (and, if the item is not found, return some default value). F# has a specialized function for that: Seq.find (or List.find if host.AddressList is an F# list, or Array.find if host.AddressList is an array. Those three functions take different input types but all work the same way conceptually, so from now on I'll focus on Seq.find, which takes any IEnumerable as input so is most likely to be what you need here).
If you look at the Seq.find function's type signature in the F# docs, you'll see that it is:
('T -> bool) -> seq<'T> -> 'T
This means that the function takes two parameters, a 'T -> bool and seq<'T> and returns a 'T. The 'T syntax means "this is a generic type called T": in F#, the apostrophe means that what follows is the name of a generic type. The type 'T -> bool means a function that takes a 'T and returns a Boolean; i.e., a predicate that says "Yes, this matches what I'm looking for" or "No, keep looking". The second argument to Seq.find is a seq<'T>; seq is F#'s shorter name for an IEnumerable, so you can read this as IEnumerable<'T>. And the result is an item of type 'T.
Just from that function signature and name alone, you can guess what this does: it goes through the sequence of items and calls the predicate for each one; the first item for which the predicate returns true will be returned as the result of Seq.find.
But wait! What if the item you're looking for isn't in the sequence at all? Then Seq.find will throw an exception, which may not be the behavior you're looking for. Which is why the Seq.tryFind function exists: its function signature looks just like Seq.find, except for the return value: it returns 'T option rather than 'T. That means that you'll either get a result like Some "ip address" or None. In your case, you intend to return "?" if the item isn't found. So you want to convert a value that's either Some "ip address or None to either "ip address" (without the Some) or "?". That is what the defaultArg function is for: it takes a 'T option, and a 'T representing the default value to return if your value is None, and it returns a plain 'T.
So to sum up:
Seq.tryFind takes a predicate function and a sequence, and returns a 'T option. In your case, this will be a string option
defaultArg takes a 'T option and a default value, and returns a normal 'T (in your case, a string).
With these two pieces, plus a predicate function you can write yourself, we can do what you're looking for.
One more note before I show you the code: you wrote let getIP : string = (code). It seems like you intended for getIP to be a function, but you didn't give it any parameters. Writing let something = (code block) will create a value by running the code block immediately (and just once) and then assigning its result to the name something. Whereas writing let something() = (code block) will create a function. It will not run the code block immediately, but it will instead run the code block every time the function is called. So I think you should have written let getIP() : string = (code).
Okay, so having explained all that, let's put this together to give you a getIP function that actually works:
let getIP() = // No need to declare the return type, since F# can infer it
let isInternet ip = // This is the predicate function
// Note that this function isn't accessible outside of getIP()
ip.AddressFamily = AddressFamily.InterNetwork
let host = Dns.GetHostEntry(Dns.GetHostName())
let maybeIP = Seq.tryFind isInternet host.AddressList
defaultArg maybeIP "?"
I hope that's clear enough; if there's anything you don't understand, let me know and I'll try to explain further.
Edit: The above has one possible flaw: the fact that F# may not be able to infer the type of the ip argument in isInternet without an explicit type declaration. It's clear from the code that it needs to be some class with an .AddressFamily property, but the F# compiler can't know (at this point in the code) which class you intend to pass to this predicate function. That's because the F# compiler is a single-pass compiler, that works its way through the code in a top-down, left-to-right order. To be able to infer the type of the ip parameter, you might need to rearrange the code a little, as follows:
let getIP() = // No need to declare the return type, since F# can infer it
let host = Dns.GetHostEntry(Dns.GetHostName())
let maybeIP = host.AddressList |> Seq.tryFind (fun ip -> ip.AddressFamily = AddressFamily.InterNetwork)
defaultArg maybeIP "?"
This is actually more idiomatic F# anyway. When you have a predicate function being passed to Seq.tryFind or other similar functions, the most common style in F# is to declare that predicate as an anonymous function using the fun keyword; this works just like lambdas in C# (in C# that predicate would be ip => ip.AddressFamily == AddressFamily.InterNetwork). And the other thing that's common is to use the |> operator with things like Seq.tryFind and others that take predicates. The |> operator basically* takes the value that's before the |> operator and passes it as the last parameter of the function that's after the operator. So foo |> Seq.tryFind (fun x -> xyz) is just like writing Seq.tryFind (fun x -> xyz) foo, except that foo is the first thing you read in that line. And since foo is the sequence that you're looking in, and fun x -> xyz is how you're looking, that feels more natural: in English, you'd say "Please look in my closet for a green shirt", so the concept "closet" comes up before "green shirt". And in idiomatic F#, you'd write closet |> Seq.find (fun shirt -> shirt.Color = "green"): again, the concept "closet" comes up before "green shirt".
With this version of the function, F# will encounter host.AddressList before it encounters fun ip -> ..., so it will know that the name ip refers to one item in host.AddressList. And since it knows the type of host.AddressList, it will be able to infer the type of ip.
* There's a lot more going on behind the scenes with the |> operator, involving currying and partial application. But at a beginner level, just think of it as "puts a value at the end of a function's parameter list" and you'll have the right idea.
In F# any if/else/then-statement must evaluate to the same type of value for all branches. Since you've omitted the else-branch of the expression, the compiler will infer it to return a value of type unit, effectively turning your if-expression into this:
if ip.AddressFamily = AddressFamily.InterNetwork then
ip.ToString() // value of type string
else
() // value of type unit
Scott Wlaschin explains this better than me on the excellent F# for fun and profit.
This should fix the current error, but still won't compile. You can solve this either by translating the C#-code more directly (using a mutable variable for the localIP value, and doing localIP <- ip.ToString() in your if-clause, or you could look into a more idiomatic approach using something like Seq.tryFind.

Why does f# dot operator have such a low precedence

The precedence of F#'s member selection dot (.) operator as used in
System.Console.WriteLine("test")
has a lower precedence than [space] such that the following
ignore System.Console.WriteLine("test")
must be written explicitly as
ignore (System.Console.WriteLine("test"))
though this would be the intuition from the notion of juxtaposed symbols. Having used CoffeeScript, I can appreciate how intuitive precedence can serve to de-clutter code.
Are there any efforts being made to rationalize this kerfuffle, perhaps something along the lines that incorporated the "lightweight" syntax of the early years?
==============
Upon review, the culprit is not the "." operator but the invocation operator "()", as in "f()". So, given:
type C() = class end
then the following intuitive syntax fails:
printfn "%A" C() <-- syntax error FS0597
and must be written thus (as prescribed by the documentation):
printfn "%A" (C()) <-- OK
It seems intuitive that a string of symbols unbroken by white space should implicitly represents a block. In fact, the utility of juxtaposing is to create such a block.
a b.c is parsed as a (b.c), not (a b).c. So there are no efforts to rationalize this - it simply is not true.
Thanks to all those who responded.
My particular perplexity stemmed from treating () as an invocation operator. As an eager evaluation language, F# does not have or need such a thing. In stead, this is an expression boundary, as in, (expression). In particular, () bounds the nothing expression which is the only value of the type, unit. Consequently, () is the stipulation of a value and not a direction to resolved the associated function (though that is the practical consequence when parameters are provided to functions due to F#'s eager evaluation.)
As a result, the following expression
ignore System.Console.WriteLine("test")
actually surfaces three distinct values,
ignore System.Console.WriteLine ("test")
which are interpreted according to the left-to-right precedence evaluation order or F# (which then permits partial function application and perhaps other things)
( ignore System.Console.WriteLine ) ("test")
...but the result of (ignore expr) will be unit, which does not expect a parameter. Hence, syntax error (strong typing, yea!). So, an expression boundary is required. In particular,
ignore ( System.Console.WriteLine ("test") )
or
ignore (System.Console.WriteLine "test")
or
ignore <| System.Console.WriteLine "test"
or
System.Console.WriteLine "test" |> ignore

F#: generated IL code for seq{} vs other computational workflows

When I compare IL code that F# generates for seq{} expressions vs that for user-defined computational workflows, it's quite obvious that seq{} is implemented very differently: it generates a state machine similar to the once C# uses for its' iterator methods. User-defined workflows, on the other hand, use the corresponding builder object as you'd expect.
So I am wondering - why the difference?
Is this for historical reasons, e.g. "seq was there before workflows"?
Or, is there significant performance to be gained?
Some other reason?
This is an optimization performed by the F# compiler. As far as I know, it has actually been implemented later - F# compiler first had list comprehensions, then a general-purpose version of computation expressions (also used for seq { ... }) but that was less efficient, so the optimization was added in some later version.
The main reason is that this removes many allocations and indirections. Let's say you have something like:
seq { for i in input do
yield i
yield i * 10 }
When using computation expressions, this gets translated to something like:
seq.Delay(fun () -> seq.For(input, fun i ->
seq.Combine(seq.Yield(i), seq.Delay(fun () -> seq.Yield(i * 10)))))
There is a couple of function allocations and the For loop always needs to invoke the lambda function. The optimization turns this into a state machine (similar to the C# state machine), so the MoveNext() operation on the generated enumerator just mutates some state of the class and then returns...
You can easily compare the performance by defining a custom computation builder for sequences:
type MSeqBuilder() =
member x.For(en, f) = Seq.collect f en
member x.Yield(v) = Seq.singleton v
member x.Delay(f) = Seq.delay f
member x.Combine(a, b) = Seq.concat [a; b]
let mseq = MSeqBuilder()
let input = [| 1 .. 100 |]
Now we can test this (using #time in F# interactive):
for i in 0 .. 10000 do
mseq { for x in input do
yield x
yield x * 10 }
|> Seq.length |> ignore
On my computer, this takes 2.644sec when using the custom mseq builder but only 0.065sec when using the built-in optimized seq expression. So the optimization makes sequence expressions significantly more efficient.
Historically, computations expressions ("workflows") were a generalization of sequence expressions: http://blogs.msdn.com/b/dsyme/archive/2007/09/22/some-details-on-f-computation-expressions-aka-monadic-or-workflow-syntax.aspx.
But, the answer is certainly that there is significant performance to be gained. I can't turn up any solid links (though there is a mention of "optimizations related to 'when' filters in sequence expressions" in http://blogs.msdn.com/b/dsyme/archive/2007/11/30/full-release-notes-for-f-1-9-3-7.aspx), but I do recall that this was an optimization that made its way in at some point in time. I'd like to say that the benefit is self-evident: sequence expressions are a "core" language feature and deserving of any optimizations that can be made.
Similarly, you'll see that certain tail-recursive functions will be optimized in to loops, rather than tail calls.

Why was "->" deprecated in F# comprehensions?

Sometimes in books I see this syntax for list and sequence comprehensions in F#:
seq { for i = 0 to System.Int32.MaxValue -> i }
This is from Programming F# by Chris Smith, page 80. In the F# which comes with VS2010, this doesn't compile. I believe -> has been deprecated. (See Alternative List Comprehension Syntax). However, -> can still be used in comprehensions which involve ranges:
seq { for c in 'A' .. 'Z' -> c }
According to Expert F# 2.0, page 58, this is because -> is shorthand for Seq.map over a range.
Why was the first usage of -> above deprecated?
The current use of -> seems inconsistent. Can anyone reconcile this for me?
The -> construct is supported only in the "simple" sequence expression syntax where you're doing a projection (Seq.map) over some data source using the following structure:
seq { for <binding> in <input> -> <projection> }
The first example you mentioned is using for .. to .. which is a different syntactical construct than for .. in, but you can rewrite it using the second one (In fact, I almost always use for .. in when writing sequence expressions):
seq { for i in 0 .. System.Int32.MaxValue -> i }
In all other forms of sequence expressions you'll have to use yield. In earlier versions of F#, the -> syntax was equivalent to yield (and there was also ->> which was equivalent to yield!). So for example, you was able to write:
seq { -> 10 // You need 'yield' here
->> [ 1; 2; 3 ] } // You need 'yield!' here
This syntax looks quite odd, so I think that the main reason for making these two deprecated is to keep the language consistent. The same computation expression syntax is used in sequence expressions (where -> makes some sense), but also for other computation types (and you can define your own), where yield feels more appropriate (and it also corresponds to return in asynchronous workflows or other computation expressions).
The "simple" sequence-specific syntax is still useful, because it saves you some typing (you replace do yield with just ->), but in more complicated cases, you don't save that many characters and I think that the syntax using -> & ->> can look a bit cryptic.

How do I know if a function is tail recursive in F#

I wrote the follwing function:
let str2lst str =
let rec f s acc =
match s with
| "" -> acc
| _ -> f (s.Substring 1) (s.[0]::acc)
f str []
How can I know if the F# compiler turned it into a loop? Is there a way to find out without using Reflector (I have no experience with Reflector and I Don't know C#)?
Edit: Also, is it possible to write a tail recursive function without using an inner function, or is it necessary for the loop to reside in?
Also, Is there a function in F# std lib to run a given function a number of times, each time giving it the last output as input? Lets say I have a string, I want to run a function over the string then run it again over the resultant string and so on...
Unfortunately there is no trivial way.
It is not too hard to read the source code and use the types and determine whether something is a tail call by inspection (is it 'the last thing', and not in a 'try' block), but people second-guess themselves and make mistakes. There's no simple automated way (other than e.g. inspecting the generated code).
Of course, you can just try your function on a large piece of test data and see if it blows up or not.
The F# compiler will generate .tail IL instructions for all tail calls (unless the compiler flags to turn them off is used - used for when you want to keep stack frames for debugging), with the exception that directly tail-recursive functions will be optimized into loops. (EDIT: I think nowadays the F# compiler also fails to emit .tail in cases where it can prove there are no recursive loops through this call site; this is an optimization given that the .tail opcode is a little slower on many platforms.)
'tailcall' is a reserved keyword, with the idea that a future version of F# may allow you to write e.g.
tailcall func args
and then get a warning/error if it's not a tail call.
Only functions that are not naturally tail-recursive (and thus need an extra accumulator parameter) will 'force' you into the 'inner function' idiom.
Here's a code sample of what you asked:
let rec nTimes n f x =
if n = 0 then
x
else
nTimes (n-1) f (f x)
let r = nTimes 3 (fun s -> s ^ " is a rose") "A rose"
printfn "%s" r
I like the rule of thumb Paul Graham formulates in On Lisp: if there is work left to do, e.g. manipulating the recursive call output, then the call is not tail recursive.

Resources