η-expansion in a pure functional language - f#

In OCaml, it is legal to have in .mli:
val f : 'a -> 'a
val g : 'a -> 'a
and .ml:
let f x = x
let g = f
Yet in F#, this is rejected:
eta_expand.ml(2,5): error FS0034: Module 'Eta_expand' contains
val g : ('a -> 'a)
but its signature specifies
val g : 'a -> 'a
The arities in the signature and implementation differ. The signature specifies that 'g' is function definition or lambda expression accepting at least 1 argument(s), but the implementation is a computed function value. To declare that a computed function value is a permitted implementation simply parenthesize its type in the signature, e.g.
val g: int -> (int -> int)
instead of
val g: int -> int -> int.
One workaround is to η-expand the definition of g:
let g x = f x
If my code is purely functional (no exceptions, no side effects, etc.), this should be equivalent. It might even be better with respect to polymorphism, depending on how the language generalizes types: in OCaml, partial applications do not produce polymorphic functions, but their η-expansions do.
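To make the polymorphism remark concrete, here is a minimal F# sketch. The names gEta and mapId are hypothetical and only serve the illustration; the behaviour described in the comments is what I would expect from the value restriction, not something taken from reference documentation.

```fsharp
let f x = x

// A plain alias: a computed function value, which F# reports as ('a -> 'a).
let g = f

// η-expanded version: a syntactic function taking one argument, 'a -> 'a.
let gEta x = f x

// A partial application such as `let mapId = List.map f` is typically rejected
// by the value restriction (or pinned to a single type by later uses).
// The η-expanded form below stays polymorphic.
let mapId xs = List.map f xs

printfn "%d %s %A %A" (g 1) (gEta "a") (mapId [1; 2]) (mapId ["x"])
```

Here mapId is used at both int list and string list, which only works because the η-expanded definition is generalized.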
Is there any drawback to systematic η-expansion?
Two answers sidestep the question on η-expansion :-) and instead suggest that I add parentheses around my function type. This is because, apparently, F# distinguishes at the typing level between "true" function definitions (λ-expressions) and computed definitions (as in partial applications); presumably this is because λ-expressions map directly to CLR functions while computed definitions map to delegate objects. (I'm not sure of this interpretation and would appreciate it if somebody very familiar with F# could point to reference documents describing this.)
A solution would be to systematically add parentheses to all the function types in the .mli, but I fear this could lead to inefficiencies. Another would be to detect the computed functions and parenthesize the corresponding types in the .mli. A third solution would be to η-expand the obvious cases and parenthesize the others.
I'm not familiar enough with F# / CLR internals to measure which ones incur significant performance or interfacing penalties.

In theory, the F# function type 'a -> 'b -> 'c is the same type as 'a -> ('b -> 'c). That is, multiple argument functions are represented using the curried form in F#. You can use one where the other is expected in most cases e.g. when calling a higher-order function.
However, for practical reasons, the F# compiler actually distinguishes between the two types. The motivation is that they are represented differently in the compiled .NET code. This affects performance and interoperability with C#, so it is useful to make that distinction.
A function Foo : int -> int -> int is going to be compiled as a member int Foo(int, int) - the compiler does not use the curried form by default, because this is more efficient when calling Foo with both arguments (more common case) and it is better for interop. A function Bar : int -> (int -> int) will be compiled as FSharpFunc<int, int> Bar(int) - actually using the curried form (and so it is more efficient to call it with just a single parameter and it will be hard to use from C#).
This is also why F# does not treat the types as equal when it comes to signatures: a signature specifies the type, but here it also specifies how the function is going to be compiled. The implementation file has to provide a function of the right type and, in this case, also of the right compiled form.
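As a small sketch of the two forms (foo and bar are illustrative names mirroring Foo and Bar above; the types in the comments are what I would expect F# Interactive to report):

```fsharp
// Syntactic two-argument function: int -> int -> int.
let foo (x: int) (y: int) = x + y

// The body is a computed function value (a partial application),
// so the inferred type should be int -> (int -> int).
let bar (x: int) = (+) x

// Both can be applied in the same way from F# code.
printfn "%d %d" (foo 1 2) (bar 1 2)
printfn "%A" (List.map (foo 10) [1; 2; 3])
```

From F# code the two are used identically; the difference only shows up in signature files and in the compiled .NET representation.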

Interestingly my fsi gives a more helpful error message:
/test.fs(2,5): error FS0034: Module 'Test' contains
val g : ('a -> 'a) but its signature specifies
val g : 'a -> 'a The arities in the signature and implementation differ.
The signature specifies that 'g' is function definition or lambda expression
accepting at least 1 argument(s), but the implementation is a computed
function value. To declare that a computed function value is a permitted
implementation simply parenthesize its type in the signature, e.g.
val g: int -> (int -> int) instead of
val g: int -> int -> int.
If you add brackets to get g : ('a -> 'a), all is fine.

Related

Understanding F# documentation function signatures

I am currently learning F# with a free online resource. Since I am curious and try to apply what I have learned in some small exercises, I find myself consulting the MSDN F# documentation quite often.
But the documentation seems really cryptic to me. Take the documentation page for the pown function, for example. The usage is pretty straightforward, but I don't understand the function's signature:
// Signature:
pown : ^T -> int -> ^T (requires ^T with static member One and ^T with static member op_Multiply and ^T with static member (/))
Can someone explain to me, what the following things are about?
1. What does the ^ (circumflex) before the T do?
2. What does "T" mean? Is it a generic type?
3. What does the double -> do?
4. What do the requires statements do?
I hope this isn't too much to cover in one answer.
1. This indicates that T is a statically resolved type parameter as opposed to a normal generic type parameter (see also 4 below).
2. Yes.
3. -> is the type constructor for functions and is right associative, so this part is equivalent to ^T -> (int -> ^T). In other words, if you pass an argument of type ^T to this function, you'll get back a function from int to ^T. So pown 2 is the function 2^x where the power hasn't been passed yet. And pown 2 8 is the same as (pown 2) 8: it's 2^8.
4. At the point of invocation, whatever concrete type is substituted for ^T must be statically known to meet these requirements. So you can call pown 2 8 (because int supports these operations), but not pown "test" 8 (because string doesn't).
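A short sketch of points 3 and 4, using only the built-in pown function (the binding names are hypothetical):

```fsharp
// Partial application: supplying only the base yields a function int -> int.
let twoTo = pown 2

let a = pown 2 8      // 2^8 = 256
let b = twoTo 8       // same as (pown 2) 8 = 256

// ^T is resolved statically per call site, so the base can also be a float.
let c = pown 2.0 3    // 8.0

printfn "%d %d %g" a b c
```

A call such as pown "test" 8 would be rejected at compile time, because string does not provide the required static members.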
There are a few things going on there, so for starters, here's how I would suggest you approach signatures in F#. First of all, ignore the circumflex - mentally substitute a tick there. Then you can ignore the "requires" part - long story short, it's there because of the circumflex.
So after that you have a signature like this:
// Signature:
pown : 'T -> int -> 'T
'T is a generic type - uppercase 'T is a .NET standard, F# usually uses lowercase 'a, 'b etc. What this signature describes is a function that takes a 'T and an int, and returns a 'T. The type after the last -> is the "return type" of the function - at least that's a useful way to think about it at the beginning.
In reality, there's a bit more to that - in F# functions are curried (and partially applicable by default), so what you really have is a function that takes a 'T and returns a function of signature int -> 'T - at which point it's clear why you have double ->.
And the circumflex thing is a statically resolved type parameter - I see @kvb gave more details on that already. It's good to be aware that it exists, but it's something that's rarely used in practice (you'll see it on core numeric functions and operators though).

Why does the argument type of arithmetic operators default to int?

I am new to F#, and I was surprised to find that the type of f x y = x + y is actually int -> int -> int. Apparently, this is due to some performance trade-off.
But why is this actually necessary? Why not just infer the type to be 'a -> 'a -> 'a or something similar? It seems to work for comparison: the type of g x y = x < y is x:'a -> y:'a -> bool when 'a : comparison. Why not for arithmetic operators as well?
Couldn't the compiler statically infer the specific primitive types from the call sites and specialize the generic function from there, falling back to some dynamic dispatch if this fails?
This might very well be obvious, but I could not find any good resources on this. What is the reasoning behind this behavior?
Yes, for those operators int is the default type inferred unless you specify a different one or one is inferred from usage. If you want to define them for all types, you have to make the function inline:
let inline f x y = x + y
But notice that the signature is:
x: ^a -> y: ^b -> ^c
when ( ^a or ^b) : (static member ( + ) : ^a * ^b -> ^c)
This is because .NET itself has no member constraints, so F# resolves them at compile time. That's why you see those 'hat types' and the constraint that those types should have a static member (+) defined.
Also notice that the type variables are not a -> a -> a as you suggest. That's because in the .NET framework not all addition operations respect that signature. Things are different in other environments like Haskell, where addition is strictly a -> a -> a, but in .NET you can, for instance, add a TimeSpan to a DateTime:
System.DateTime(2000,1,1) + new System.TimeSpan(1, 2, 0, 30, 0)
and the result is a DateTime; here the signature is: a -> b -> a
Comparison is a different story, since that constraint actually exists at the .NET level, so it can be compiled and encoded in the IL, whereas member constraints need to be resolved at compile time. That's why the function has to be marked as inline.
I think you misinterpreted the explanation in the linked question: this is not due to a performance trade-off; the real reason is a .NET type system limitation. The fact that an inline function executes faster in most cases (since it's inlined by the compiler) is a secondary effect.
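A minimal sketch putting these pieces together (addInt and add are hypothetical names; the DateTime and TimeSpan values are arbitrary):

```fsharp
open System

// Without inline, and with no other usage to guide inference,
// the operands default to int: int -> int -> int.
let addInt x y = x + y

// With inline, the member constraint is resolved at each call site.
let inline add x y = x + y

let i = add 1 2                                            // int + int
let fl = add 1.5 2.5                                       // float + float
let d = add (DateTime(2000, 1, 1)) (TimeSpan(1, 0, 0, 0))  // DateTime + TimeSpan -> DateTime

printfn "%d %d %g %O" (addInt 1 2) i fl d
```

The last call shows why the inferred signature uses three distinct type variables: the two operands and the result do not have to share a type.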

Delegate/Func conversion and misleading compiler error message

I thought that conversions between F# functions and System.Func had to be done manually, but there appears to be a case where the compiler (sometimes) does it for you. And when it goes wrong the error message isn't accurate:
module Foo =
    let dict = new System.Collections.Generic.Dictionary<string, System.Func<obj,obj>>()
    let f (x:obj) = x
    do
        // Question 1: why does this compile without explicit type conversion?
        dict.["foo"] <- fun (x:obj) -> x
        // Question 2: given that the above line compiles, why does this fail?
        dict.["bar"] <- f
The last line fails to compile, and the error is:
This expression was expected to have type
System.Func<obj,obj>
but here has type
'a -> obj
Clearly the function f doesn't have a signature of 'a -> obj. If the F# 3.1 compiler is happy with the first dictionary assignment, then why not the second?
The part of the spec that should explain this is 8.13.7 Type Directed Conversions at Member Invocations. In short, when invoking a member, an automatic conversion from an F# function to a delegate will be applied. Unfortunately, the spec is a bit unclear; from the wording it seems that this conversion might apply to any function expression, but in practice it only appears to apply to anonymous function expressions.
The spec is also a bit out of date; in F# 3.0 type directed conversions also enable a conversion to a System.Linq.Expressions.Expression<SomeDelegateType>.
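For reference, a minimal sketch of the two routes, assuming the same dictionary as in the question: the lambda relies on the type-directed conversion, while the named function is wrapped in the delegate explicitly by calling the delegate constructor.

```fsharp
open System
open System.Collections.Generic

let dict = Dictionary<string, Func<obj, obj>>()
let f (x: obj) = x

// Anonymous function: converted to the delegate type at the member invocation.
dict.["foo"] <- fun (x: obj) -> x

// Named function: construct the delegate explicitly.
dict.["bar"] <- Func<obj, obj>(f)

printfn "%O %O" (dict.["foo"].Invoke(box "a")) (dict.["bar"].Invoke(box 1))
```

The explicit form does not depend on how the compiler decides whether the conversion applies, so it is the safer choice when the argument is not a lambda.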
EDIT
In looking at some past correspondence with the F# team, I think I've tracked down how a conversion could get applied to a non-syntactic function expression. I'll include it here for completeness, but it's a bit of a strange corner case, so for most purposes you should probably consider the rule to be that only syntactic functions will have the type directed conversion applied.
The exception is that overload resolution can result in converting an arbitrary expression of function type; this is partly explained by section 14.4 Method Application Resolution, although it's pretty dense and still not entirely clear. Basically, the argument expressions are only elaborated when there are multiple overloads; when there's just a single candidate method, the argument types are asserted against the unelaborated arguments (note: it's not obvious that this should actually matter in terms of whether the conversion is applicable, but it does matter empirically). Here's an example demonstrating this exception:
type T =
    static member M(i:int) = "first overload"
    static member M(f:System.Func<int,int>) = "second overload"
let f i = i + 1
T.M f |> printfn "%s"
EDIT: This answer explains only the mysterious promotion to 'a -> obj. @kvb points out that replacing obj with int in the OP's example still doesn't work, so that promotion is in itself an insufficient explanation for the observed behaviour.
To increase flexibility, the F# type elaborator may under certain conditions promote a named function from f : SomeType -> OtherType to f<'a where 'a :> SomeType> : 'a -> OtherType. This is to reduce the need for upcasts. (See spec. 14.4.2.)
Question 2 first:
dict["bar"] <- f (* Why does this fail? *)
Because f is a "named function", its type is promoted from f : obj -> obj following sec. 14.4.2 to the seemingly less restrictive f<'a where 'a :> obj> : 'a -> obj. But this type is incompatible with System.Func<obj, obj>.
Question 1:
dict["foo"] <- fun (x:obj) -> x (* Why doesn't this, then? *)
This is fine because the anonymous function is not named, and so sec. 14.4.2 does not apply. The type is never promoted from obj -> obj and so fits.
We can observe the interpreter exhibit behaviour following 14.4.2:
> let f = id : obj -> obj
val f : (obj -> obj) (* Ok, f has type obj -> obj *)
> f
val it : ('a -> obj) = <fun:it@135-31> (* f promoted when used. *)
(The interpreter doesn't output constraints to obj.)

Parameterized Discriminated Union in F# [duplicate]

I am trying to write a typed abstract syntax tree datatype that can represent
function application.
So far I have
type Expr<'a> =
    | Constant of 'a
    | Application of Expr<'b -> 'a> * Expr<'b> // error: The type parameter 'b' is not defined
I don't think there is a way in F# to write something like 'for all b' on that last line - am I approaching this problem wrongly?
In general, the F# type system is not expressive enough to (directly) define a typed abstract syntax tree as the one in your example. This can be done using generalized algebraic data types (GADTs) which are not supported in F# (although they are available in Haskell and OCaml). It would be nice to have this in F#, but I think it makes the language a bit more complex.
Technically speaking, the compiler is complaining because the type variable 'b is not defined. But of course, if you define it, then you get type Expr<'a, 'b> which has a different meaning.
If you wanted to express this in F#, you'd have to use a workaround based on interfaces (an interface can have a generic method, which gives you a way to express a constraint like exists 'b, which you need here). This will probably get very ugly very soon, so I do not think it is a good approach, but it would look something like this:
// Represents an application that returns 'a but consists
// of an argument 'b and a function 'b -> 'a
type IApplication<'a> =
    abstract Appl<'b> : Expr<'b -> 'a> * Expr<'b> -> unit
and Expr<'a> =
    // Constant just stores a value...
    | Constant of 'a
    // An application is something that we can call with an
    // implementation (handler). The function then calls the
    // 'Appl' method of the handler we provide. As this method
    // is generic, it will be called with an appropriate type
    // argument 'b that represents the type of the argument.
    | Application of (IApplication<'a> -> unit)
To represent an expression tree of (fun (n:int) -> string n) 42, you could write something like:
let expr =
    Application(fun appl ->
        appl.Appl(Constant(fun (n:int) -> string n),
                  Constant(42)))
A function to evaluate the expression can be written like this:
let rec eval<'T> : Expr<'T> -> 'T = function
    | Constant(v) -> v // Just return the constant
    | Application(f) ->
        // We use a bit of dirty mutable state (to keep types simpler for now)
        let res = ref None
        // Call the function with a 'handler' that evaluates function application
        f { new IApplication<'T> with
                member x.Appl<'A>(efunc : Expr<'A -> 'T>, earg : Expr<'A>) =
                    // Here we get function 'efunc' and argument 'earg'
                    // The type 'A is the type of the argument (which can be
                    // anything, depending on the created AST)
                    let f = eval<'A -> 'T> efunc
                    let a = eval<'A> earg
                    res := Some <| (f a) }
        res.Value.Value
As I said, this is a rather extreme workaround, so I do not think it is a good idea to actually use it. I suppose the F# way of doing this would be to use an untyped Expr type. Can you write a bit more about the overall goal of your project (perhaps there is another good approach)?
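For comparison, here is a rough sketch of what an untyped representation could look like. The Value type and its cases are assumptions made purely for illustration; type errors in the tree are then only detected at evaluation time.

```fsharp
// Hypothetical universe of runtime values (not from the question).
type Value =
    | Int of int
    | Str of string
    | Func of (Value -> Value)

// Untyped expression tree: no type parameter, so no 'b to quantify over.
type Expr =
    | Constant of Value
    | Application of Expr * Expr

let rec eval expr =
    match expr with
    | Constant v -> v
    | Application (efunc, earg) ->
        match eval efunc with
        | Func f -> f (eval earg)
        | _ -> failwith "cannot apply a non-function value"

// Counterpart of (fun (n:int) -> string n) 42 from above.
let toStr = Func (fun v -> match v with Int n -> Str (string n) | other -> other)
let expr = Application (Constant toStr, Constant (Int 42))

match eval expr with
| Str s -> printfn "%s" s
| _ -> printfn "unexpected result"
```

The trade-off is the usual one: the definition is short and needs no interface tricks, but the compiler no longer checks that functions are applied to arguments of the right type.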

Why F# fails to infer types as C# would do

According to the book by Tomas Petricek, the following code doesn't work because the compiler is unable to infer the type of the dt parameter:
> Option.map (fun dt -> dt.Year) (Some(DateTime.Now));;
error FS0072: Lookup on object of indeterminate type.
And if we specify type explicitly everything works fine:
> Option.map (fun (dt:DateTime) -> dt.Year) (Some(DateTime.Now));;
val it : int option = Some(2008)
Or we can use pipelining operator to "help" compiler to infer type:
> Some(DateTime.Now) |> Option.map (fun dt -> dt.Year);;
val it : int option = Some(2008)
The question is: why can't the F# compiler infer the type of the dt parameter? In this particular case it looks quite easy to infer dt's type.
The logic behind it can be the following:
the signature of Option.map is map : ('T -> 'U) -> 'T option -> 'U option
the type of the last parameter is DateTime option
so our map usage looks like map : ('T -> 'U) -> DateTime option -> 'U option
the compiler then can try to substitute DateTime for 'T to see if it would be correct, so we have (DateTime -> 'U) -> DateTime option -> 'U option
then it can infer the 'U type by looking at the body of the lambda function, so 'U becomes int
and we finally have (DateTime -> int) -> DateTime option -> int option
So why can't F# do this inference? Tomas mentions in his book that F# infers types by going from the first to the last argument, and that's why the order of arguments matters. And that's why F# can't infer the types in the first example. But why can't F# behave like C#, i.e. try to infer types incrementally starting with what is known?
In most cases F# is much more powerful when it comes to type inference... that's why I'm a bit confused.
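The effect of argument order can be sketched with a hypothetical helper, mapOpt, which is simply Option.map with its arguments flipped (a fixed date is used instead of DateTime.Now so the result is reproducible):

```fsharp
open System

// Hypothetical helper: the option comes first, the function second.
let mapOpt (opt: 'T option) (f: 'T -> 'U) : 'U option = Option.map f opt

let date = DateTime(2008, 1, 1)

// The option argument is checked first, so 'T is already known to be
// DateTime by the time the lambda is checked.
let y1 = mapOpt (Some date) (fun dt -> dt.Year)

// The pipeline operator has the same effect with the standard Option.map.
let y2 = Some date |> Option.map (fun dt -> dt.Year)

printfn "%A %A" y1 y2
```

Neither call needs a type annotation on dt, which matches the explanation that inference proceeds from the first argument to the last.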
So why can't F# do this inference?
F# could do that, as OCaml does. The disadvantage of this kind of more sophisticated inference is the obfuscation of error messages. OCaml taught us that the result generated such incomprehensible errors that, in practice, you always resort to annotating types in order to prevent the compiler from being led down a type inference garden path. Consequently, there was little motivation to implement this in F# because OCaml had already shown that it is not very pragmatic.
For example, if you do that in OCaml but mis-spell the method name then you will get a huge error message at some later point in the code where two inferred class types mismatch and you will have to hunt through it to find the discrepancy and then search back through your code to find the actual location of the error.
IMO, Haskell's type classes also suffer from an equivalent practical problem.
F# can do everything C#'s type inference can do...and much, much more. AFAIK, the extent of C#'s type inference is auto-typing a variable based on the right-hand side of an assignment.
var x = new Dictionary<string, int>();
The equivalent F# would be:
let x = Dictionary()
or
let x = Dictionary<_,_>()
or
let x = Dictionary<string,_>()
or
let x = Dictionary<string,int>()
You can provide as much or as little type information as you want, but you would almost never declare the type of x. So, even in this simple case, F#'s type inference is obviously much more powerful. Hindley-Milner type inference types entire programs, unifying all the expressions involved. As far as I can tell, C# type inference is limited to a single expression, assignment at that.
