I am currently learning F# with a free online resource. Since I am curious and try to apply what I learn in small exercises, I find myself consulting the MSDN F# documentation quite often.
But the documentation seems really cryptic to me. Take the documentation page for the pown function, for example. The usage is pretty straightforward, but I don't understand the function's signature:
// Signature:
pown : ^T -> int -> ^T (requires ^T with static member One and ^T with static member op_Multiply and ^T with static member (/))
Can someone explain to me what the following things are about?
1. What does the ^ (circumflex) before the T do?
2. What does T mean? Is it a generic type?
3. What does the double -> do?
4. What do the requires statements do?
I hope this isn't too much to cover in one answer.
1. This indicates that ^T is a statically resolved type parameter, as opposed to a normal generic type parameter (see also 4 below).
2. Yes.
3. -> is the type constructor for functions and is right associative, so this part is equivalent to ^T -> (int -> ^T). In other words, if you pass an argument of type ^T to this function, you'll get back a function from int to ^T. So pown 2 is the function 2^x where the power hasn't been passed yet. And pown 2 8 is the same as (pown 2) 8: it's 2^8.
4. At the point of invocation, whatever concrete type is substituted for ^T must be statically known to meet these requirements. So you can call pown 2 8 (because int supports these operations), but not pown "test" 8 (because string doesn't).
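For example, the constraints are re-checked at each call site, so the same pown works for any type that has the required members (a quick sketch):

```fsharp
// pown works for any type with One, (*) and (/), resolved statically per call site
let a = pown 2 8      // int:    256
let b = pown 2.0 8    // float:  256.0
let c = pown 2I 8     // bigint: 256
// pown "test" 8      // compile error: string lacks the required members
printfn "%d %f %A" a b c
```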
There are a few things going on there, so for starters, here's how I would suggest you approach signatures in F#. First of all, ignore the circumflex - mentally substitute a tick there. Then you can ignore the "requires" part - long story short, it's there because of the circumflex.
So after that you have a signature like this:
// Signature:
pown : 'T -> int -> 'T
'T is a generic type - uppercase 'T is a .NET standard, F# usually uses lowercase 'a, 'b etc. What this signature describes is a function that takes a 'T and an int, and returns a 'T. The type after the last -> is the "return type" of the function - at least that's a useful way to think about it at the beginning.
In reality, there's a bit more to that - in F# functions are curried (and partially applicable by default), so what you really have is a function that takes a 'T and returns a function of signature int -> 'T - at which point it's clear why you have double ->.
And the circumflex thing is a statically resolved type parameter - I see @kvb gave more details on that already. It's good to be aware that it exists, but it's something that's rarely used in practice (you'll see it on core numeric functions and operators, though).
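To make the currying concrete, here is a small sketch:

```fsharp
// pown : 'T -> int -> 'T really means 'T -> (int -> 'T):
// applying only the first argument yields a function awaiting the exponent
let powersOfTwo : int -> int = pown 2

let r1 = powersOfTwo 8   // 256
let r2 = pown 2 8        // same thing, read as (pown 2) 8
printfn "%d %d" r1 r2
```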
Related
I am new to F#, and I was surprised to find that the type of f x y = x + y is actually int -> int -> int. Apparently, this is due to some performance trade-off.
But why is this actually necessary? Why not just infer the type to be 'a -> 'a -> 'a or something similar? It seems to work for comparison: the type of g x y = x < y is x:'a -> y:'a -> bool when 'a : comparison. Why not for arithmetic operators as well?
Couldn't the compiler statically infer the specific primitive types from the call sites and specialize the generic function from there, falling back to some dynamic dispatch if this fails?
This might very well be obvious, but I could not find any good resources on this. What is the reasoning behind this behavior?
Yes, for those operators int is the default type, used unless you specify a different one or a different one is inferred from usage. If you want to define them for all types, you have to make the function inline:
let inline f x y = x + y
But notice that the signature is:
x: ^a -> y: ^b -> ^c
when ( ^a or ^b) : (static member ( + ) : ^a * ^b -> ^c)
This is because .NET has no support for member constraints, so F# resolves them at compile time. That's why you see those 'hat types' and the constraint that those types must have a static member (+) defined.
Also notice the type variables are not a -> a -> a as you suggest; that's because in the .NET framework not all addition operations respect that signature. Things are different in other environments like Haskell, where addition is strictly a -> a -> a, but in .NET you can add, for instance, a TimeSpan to a DateTime:
System.DateTime(2000,1,1) + new System.TimeSpan(1, 2, 0, 30, 0)
and the result is a DateTime; here the signature is a -> b -> a.
Comparison is a different story, since that constraint actually exists at the .NET level, so it can be compiled and encoded in the IL, whereas member constraints need to be resolved at compile time - that's why the function has to be marked as inline.
I think you misinterpreted the explanation in the linked question: this is not due to a performance trade-off; the real reason is a .NET type system limitation. The fact that an inline function usually executes faster (since it's inlined by the compiler) is a secondary effect.
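To see the member constraint in action, here is a small sketch: the same inline function works at several instantiations, including the DateTime/TimeSpan case with its a -> b -> c shape.

```fsharp
open System

// inline: the (+) member constraint is resolved anew at each call site
let inline add x y = x + y

let i  = add 1 2          // int -> int -> int
let s  = add "foo" "bar"  // string concatenation
let dt = add (DateTime(2000, 1, 1)) (TimeSpan(1, 2, 0, 30, 0))  // DateTime + TimeSpan -> DateTime
printfn "%d %s %O" i s dt
```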
I thought that conversions between F# functions and System.Func had to be done manually, but there appears to be a case where the compiler (sometimes) does it for you. And when it goes wrong the error message isn't accurate:
module Foo =
    let dict = new System.Collections.Generic.Dictionary<string, System.Func<obj,obj>>()
    let f (x:obj) = x
    do
        // Question 1: why does this compile without explicit type conversion?
        dict.["foo"] <- fun (x:obj) -> x
        // Question 2: given that the above line compiles, why does this fail?
        dict.["bar"] <- f
The last line fails to compile, and the error is:
This expression was expected to have type
System.Func<obj,obj>
but here has type
'a -> obj
Clearly the function f doesn't have a signature of 'a -> obj. If the F# 3.1 compiler is happy with the first dictionary assignment, then why not the second?
The part of the spec that should explain this is 8.13.7 Type Directed Conversions at Member Invocations. In short, when invoking a member, an automatic conversion from an F# function to a delegate will be applied. Unfortunately, the spec is a bit unclear; from the wording it seems that this conversion might apply to any function expression, but in practice it only appears to apply to anonymous function expressions.
The spec is also a bit out of date; in F# 3.0 type directed conversions also enable a conversion to a System.Linq.Expressions.Expression<SomeDelegateType>.
EDIT
In looking at some past correspondence with the F# team, I think I've tracked down how a conversion could get applied to a non-syntactic function expression. I'll include it here for completeness, but it's a bit of a strange corner case, so for most purposes you should probably consider the rule to be that only syntactic functions will have the type directed conversion applied.
The exception is that overload resolution can result in converting an arbitrary expression of function type; this is partly explained by section 14.4 Method Application Resolution, although it's pretty dense and still not entirely clear. Basically, the argument expressions are only elaborated when there are multiple overloads; when there's just a single candidate method, the argument types are asserted against the unelaborated arguments (note: it's not obvious that this should actually matter in terms of whether the conversion is applicable, but it does matter empirically). Here's an example demonstrating this exception:
type T =
    static member M(i:int) = "first overload"
    static member M(f:System.Func<int,int>) = "second overload"

let f i = i + 1
T.M f |> printfn "%s"
EDIT: This answer explains only the mysterious promotion to 'a -> obj. @kvb points out that replacing obj with int in the OP's example still doesn't work, so that promotion is in itself an insufficient explanation for the observed behaviour.
To increase flexibility, the F# type elaborator may under certain conditions promote a named function from f : SomeType -> OtherType to f<'a where 'a :> SomeType> : 'a -> OtherType. This is to reduce the need for upcasts. (See spec. 14.4.2.)
Question 2 first:
dict.["bar"] <- f (* Why does this fail? *)
Because f is a "named function", its type is promoted from f : obj -> obj following sec. 14.4.2 to the seemingly less restrictive f<'a where 'a :> obj> : 'a -> obj. But this type is incompatible with System.Func<obj, obj>.
Question 1:
dict.["foo"] <- fun (x:obj) -> x (* Why doesn't this, then? *)
This is fine because the anonymous function is not named, and so sec. 14.4.2 does not apply. The type is never promoted from obj -> obj and so fits.
We can observe the interpreter exhibit behaviour following 14.4.2:
> let f = id : obj -> obj
val f : (obj -> obj) (* Ok, f has type obj -> obj *)
> f
val it : ('a -> obj) = <fun:it#135-31> (* f promoted when used. *)
(The interpreter doesn't output constraints to obj.)
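Given the above, two workarounds suggest themselves (a sketch based on the answers here; both avoid converting a named function directly):

```fsharp
let dict = System.Collections.Generic.Dictionary<string, System.Func<obj, obj>>()
let f (x: obj) = x

// Wrap the named function in a syntactic lambda, which does get the type-directed conversion:
dict.["bar"] <- (fun x -> f x)

// Or construct the delegate explicitly from the named function:
dict.["baz"] <- System.Func<obj, obj>(f)
```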
In OCaml, it is legal to have in .mli:
val f : 'a -> 'a
val g : 'a -> 'a
and .ml:
let f x = x
let g = f
Yet in F#, this is rejected:
eta_expand.ml(2,5): error FS0034: Module 'Eta_expand' contains
val g : ('a -> 'a)
but its signature specifies
val g : 'a -> 'a
The arities in the signature and implementation differ. The signature specifies that 'g' is function definition or lambda expression accepting at least 1 argument(s), but the implementation is a computed function value. To declare that a computed function value is a permitted implementation simply parenthesize its type in the signature, e.g.
val g: int -> (int -> int)
instead of
val g: int -> int -> int.
One workaround is to η-expand the definition of g:
let g x = f x
If my code is purely functional (no exceptions, no side effects, etc.) this should be equivalent (actually, it might be even better with respect to polymorphism, depending on how the language generalizes types : in OCaml partial applications do not produce polymorphic functions, but their η-expansion does).
Is there any drawback to systematic η-expansion?
Two answers evade the question on η-expansion :-) and instead suggest that I add parentheses around my function type. This is because, apparently, F# distinguishes at the typing level between "true" definitions of functions (as λ-expressions) and computed definitions (as in partial applications); presumably this is because λ-expressions map directly to CLR functions while computed definitions map to delegate objects. (I'm not sure of this interpretation and would appreciate it if somebody very familiar with F# could point me to reference documents describing this.)
A solution would be to systematically add parentheses to all the function types in the .mli, but I fear this could lead to inefficiencies. Another would be to detect the computed functions and parenthesize the corresponding types in the .mli. A third solution would be to η-expand the obvious cases and parenthesize the others.
I'm not familiar enough with F# / CLR internals to measure which ones incur significant performance or interfacing penalties.
In theory, the F# function type 'a -> 'b -> 'c is the same type as 'a -> ('b -> 'c). That is, multiple argument functions are represented using the curried form in F#. You can use one where the other is expected in most cases e.g. when calling a higher-order function.
However, for practical reasons, the F# compiler actually distinguishes between the types - the motivation is that they are represented differently in the compiled .NET code. This has an impact on performance and also on interoperability with C#, so it is useful to make the distinction.
A function Foo : int -> int -> int is going to be compiled as a member int Foo(int, int) - the compiler does not use the curried form by default, because this is more efficient when calling Foo with both arguments (more common case) and it is better for interop. A function Bar : int -> (int -> int) will be compiled as FSharpFunc<int, int> Bar(int) - actually using the curried form (and so it is more efficient to call it with just a single parameter and it will be hard to use from C#).
This is also why F# does not treat the types as equal when it comes to signatures - a signature specifies the type, but here it also specifies how the function is going to be compiled. The implementation file has to provide a function of the right type, but - in this case - also of the right compiled form.
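Concretely, the distinction shows up in signature files. A minimal sketch (the module and file names are made up, and this is a two-file fragment, not a single runnable script):

```fsharp
// Eta.fsi - the signature file
module Eta

val f : 'a -> 'a       // arity 1: must be a syntactic function definition
val g : ('a -> 'a)     // parenthesized: a computed value of function type is allowed

// Eta.fs - the implementation file
module Eta

let f x = x            // syntactic definition, compiled as a method taking one argument
let g = f              // computed function value, compiled as a value of function type
```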
Interestingly my fsi gives a more helpful error message:
/test.fs(2,5): error FS0034: Module 'Test' contains
    val g : ('a -> 'a)
but its signature specifies
    val g : 'a -> 'a
The arities in the signature and implementation differ. The signature specifies that 'g' is function definition or lambda expression accepting at least 1 argument(s), but the implementation is a computed function value. To declare that a computed function value is a permitted implementation simply parenthesize its type in the signature, e.g.
    val g: int -> (int -> int)
instead of
    val g: int -> int -> int.
If you add parentheses to get g : ('a -> 'a), all is fine.
In the book by Tomas Petricek, the following code doesn't work, as the compiler is unable to infer the type of the dt parameter:
> Option.map (fun dt -> dt.Year) (Some(DateTime.Now));;
error FS0072: Lookup on object of indeterminate type.
And if we specify type explicitly everything works fine:
> Option.map (fun (dt:DateTime) -> dt.Year) (Some(DateTime.Now));;
val it : int option = Some(2008)
Or we can use the pipeline operator to "help" the compiler infer the type:
> Some(DateTime.Now) |> Option.map (fun dt -> dt.Year);;
val it : int option = Some(2008)
The question is: why can't the F# compiler infer the type of the dt parameter? In this particular case it looks quite easy to infer.
The logic behind it could be the following:
- the signature of Option.map is map : ('T -> 'U) -> 'T option -> 'U option
- the type of the last parameter is DateTime option
- so our map usage looks like map : ('T -> 'U) -> DateTime option -> 'U option
- the compiler can then substitute DateTime for 'T to see if it would be correct, giving (DateTime -> 'U) -> DateTime option -> 'U option
- then it can infer the 'U type by looking at the body of the lambda function, so 'U becomes int
- and we finally have (DateTime -> int) -> DateTime option -> int option
So why can't F# do this inference? Tomas mentions in his book that F# infers types by going from the first argument to the last, which is why the order of arguments matters - and why F# can't infer the types in the first example. But why can't F# behave like C#, i.e. try to infer types incrementally, starting with what is known?
In most cases F# is much more powerful when it comes to type inference... that's why I'm a bit confused.
So why F# can't do this inference?
F# could do that, as OCaml does. The disadvantage of this kind of more sophisticated inference is the obfuscation of error messages. OCaml taught us that the resulting errors were so incomprehensible that, in practice, you always resort to annotating types in order to prevent the compiler from being led down a type-inference garden path. Consequently, there was little motivation to implement this in F#, because OCaml had already shown that it is not very pragmatic.
For example, if you do that in OCaml but mis-spell the method name then you will get a huge error message at some later point in the code where two inferred class types mismatch and you will have to hunt through it to find the discrepancy and then search back through your code to find the actual location of the error.
IMO, Haskell's type classes also suffer from an equivalent practical problem.
F# can do everything C#'s type inference can do...and much, much more. AFAIK, the extent of C#'s type inference is auto-typing a variable based on the right-hand side of an assignment.
var x = new Dictionary<string, int>();
The equivalent F# would be:
let x = Dictionary()
or
let x = Dictionary<_,_>()
or
let x = Dictionary<string,_>()
or
let x = Dictionary<string,int>()
You can provide as much or as little type information as you want, but you would almost never declare the type of x. So, even in this simple case, F#'s type inference is obviously much more powerful. Hindley-Milner type inference types entire programs, unifying all the expressions involved. As far as I can tell, C# type inference is limited to a single expression, assignment at that.
I have never seen a language whose exponent or power operator takes only floating-point numbers.
For example:
2 ** 2 throws the error: The type 'int' does not support any operators named 'Pow'
Are there valid reasons for this design decision?
(**) and pown are two different things. When you see (**), you can think of the mathematical formula using logarithms. When you see pown, it's just a series of multiplications. I understand it can be surprising/confusing at first, because most other languages don't make such a distinction (mainly because integers are often implicitly converted to floating-point values). Even in maths, there is a small difference: see the Wikipedia entry; the first definition works only for positive integer exponents.
As they are two different (but related) things, they have different signatures. Here is (**):
^a -> ( ^b -> ^a) when ^a : (static member Pow : ^a * ^b -> ^a)
And here is pown:
^a -> (int -> ^a)
when ^a : (static member get_One : -> ^a) and
^a : (static member ( * ) : ^a * ^a -> ^a) and
^a : (static member ( / ) : ^a * ^a -> ^a)
If you create your own type, you only need to give it One, (*), and (/) to make it work with pown. The library will do the loop for you (it's optimized; it's not the naive O(n) series of multiplications).
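For instance, a tiny made-up type only needs those three members for pown to accept it (a sketch; Frac is hypothetical and does no simplification):

```fsharp
// A minimal fraction type: One, (*) and (/) are all pown needs
type Frac =
    { Num: int; Den: int }
    static member One = { Num = 1; Den = 1 }
    static member ( * ) (a: Frac, b: Frac) = { Num = a.Num * b.Num; Den = a.Den * b.Den }
    static member ( / ) (a: Frac, b: Frac) = { Num = a.Num * b.Den; Den = a.Den * b.Num }

let half = { Num = 1; Den = 2 }
let r = pown half 3   // { Num = 1; Den = 8 }
printfn "%A" r
```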
If you want to use the (**) operator on your type for non-integral exponents, you'll have to write the full logic yourself (and it won't be the same algorithm as in pown).
I think it was a good design decision to separate the two concepts.
For integral powers, F# provides another operator: pown. Also, as a side note, both (**) and pown are overloaded, so it's perfectly possible to use them with other types which provide appropriate members (a static Pow method in the case of (**); (*) and (/) operators and a static One property in the case of pown).
I can't speak to why the F# team opted not to simulate a Pow member on int, but perhaps they didn't feel it was urgent, since the pown function can be used instead (and since it probably makes more sense to convert to float first in the case of big operands).
The short answer is that it isn't very useful for integer types, even int64. 2^64 only gives you ~1.84467441E19, so if you had two values X and Y both greater than, say, 19, then the power operator would overflow.
I agree it is useful for small values, but it isn't generally useful for integer types.
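The overflow is easy to demonstrate; note that F#'s standard integer arithmetic is unchecked, so the result wraps around rather than raising an exception (a sketch):

```fsharp
let fits = pown 2L 62      // 4611686018427387904L, still within Int64.MaxValue
let wrapped = pown 2L 63   // 2^63 exceeds Int64.MaxValue, so the result wraps
printfn "%d %d" fits wrapped
```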
The language that F# is based on, OCaml, does not do operator overloading or automatic coercion of data (it prefers explicit coercions). Thus even adding two doubles requires a different operator (+.). I'm not certain this is where F# gets its strictness from, but I'm guessing it is.
In dynamic languages like Python or Scheme, you get automatic coercion to a bigger data representation if the number is too big. For example, raising an integer to an integer power can give a big integer as the result.
OCaml and F# have a spirit of extreme type safety.