I am trying to port a small compiler from C# to F# to take advantage of features like pattern matching and discriminated unions. Currently, I am modeling the AST using a pattern based on System.Linq.Expressions: A an abstract base "Expression" class, derived classes for each expression type, and a NodeType enum allowing for switching on expressions without lots of casting. I had hoped to greatly reduce this using an F# discriminated union, but I've run into several seeming limitations:
Forced public default constructor (I'd like to do type-checking and argument validation on expression construction, as System.Linq.Expressions does with it's static factory methods)
Lack of named properties (seems like this is fixed in F# 3.1)
Inability to refer to a case type directly. For example, it seems like I can't declare a function that takes in only one type from the union (e. g. let f (x : TYPE) = x compiles for Expression (the union type) but not for Add or Expression.Add. This seems to sacrifice some type-safety over my C# approach.
Are there good workarounds for these or design patterns which make them less frustrating?
I think, you are stuck a little too much with the idea that a DU is a class hierarchy. It is more helpful to think of it as data, really. As such:
Forced public default constructor (I'd like to do type-checking and argument validation on expression construction, as
System.Linq.Expressions does with it's static factory methods)
A DU is just data, pretty much like say a string or a number, not functionality. Why don't you make a function that returns you an Expression option to express, that your data might be invalid.
Lack of named properties (seems like this is fixed in F# 3.1)
If you feel like you need named properties, you probably have an inappropriate type like say string * string * string * int * float as the data for your Expression. Better make a record instead, something like AddInfo and make your case of the DU use that instead, like say | Add of AddInfo. This way you have properties in pattern matches, intellisense, etc.
Inability to refer to a case type directly. For example, it seems like I can't declare a function that takes in only one type from the
union (e. g. let f (x : TYPE) = x compiles for Expression (the union
type) but not for Add or Expression.Add. This seems to sacrifice some
type-safety over my C# approach.
You cannot request something to be the Add case, but you definitely do can write a function, that takes an AddInfo. Plus you can always do it in a monadic way and have functions that take any Expression and only return an option. In that case, you can pattern match, that your input is of the appropriate type and return None if it is not. At the call site, you then can "use" the value in the good case, using functions like Option.bind.
Basically try not to think of a DU as a set of classes, but really just cases of data. Kind of like an enum.
You can make the implementation private. This allows you the full power of DUs in your implementation but presents a limited view to consumers of your API. See this answer to a related question about records (although it also applies to DUs).
EDIT
I can't find the syntax on MSDN, but here it is:
type T =
private
| A
| B
private here means "private to the module."
Related
What's the advantage of having a type represent a function?
For example, I have observed the following snippet:
type Soldier = Soldier of PieceProperties
type King = King of PieceProperties
type Crown = Soldier -> King
Is it just to support Partial Application when additional args have yet to be satisfied?
As Fyodor Soikin says in the comments
Same reason you give names to everything else - values, functions,
modules, etc.
In other words, think about programming in assembly which typically does not use types, (yes I am aware of typed assembly) and all of the problems that one can have and then how many of those problems are solved or reduced by adding types.
So before you programmed with a language that supported functions but that used static typing, you typed everything. Now that you are using F# which has static typing and functions, just extend what you have been using typing for but now add the ability to type the functions.
To quote Benjamin C. Pierce from "Types and Programming Languages"
A type system is a tractable syntactic method for proving the absence
of certain program behaviors by classifying phrases according to the
kinds of values they compute.
As noted in "Types and Programming Languages" Section 1.2
What Type Systems Are Good For
Detecting Errors
Abstraction
Documentation
Language Safety
Efficiency
TL;DR
One of the places that I find named type function definitions invaluable is when I am building parser combinators. During the construction of the functions I fully type the functions so that I know what the types are as opposed to what type inferencing will infer they are which might be different than what I want. Since the function types typically have several parameters it is easier to just give the function type a name, and then use that name everywhere it is needed. This also saves time because the function definition is consistent and avoid having to debug an improperly declared function definition; yes I have made mistakes by doing each function type by hand and learned my lesson. Once all of the functions work, I then remove the type definitions from the functions, but leave the type definition as comments so that it makes the code easier to understand.
A side benefit of using the named type definitions is that when creating test cases, the typing rules in the named function will ensure that the data used for the test is of the correct type. This also makes understanding the data for the test much easier to understand when you come back to it after many months.
Another advantage is that using function names makes the code easier to understand because when a person new to the code looks at if for the first time they can spot the consistency of the names. Also if the names are meaningful then it makes understanding the code much easier.
You have to remember that functions are also values in F#. And you can do pretty much the same stuff with them as other types. For example you can have a function that returns other functions. Or you can have a list that stores functions. In these cases it will help if you are explicit about the function signature. The function type definition will help you to constrain on the parameters and return types. Also, you might have a complicated type signature, a type definition will make it more readable. This maybe a bit contrived but you can do fun(ky) stuff like this:
type FuncX = int -> int
type FuncZ = float -> float -> float
let addxy (x:int) :FuncX = (+) x
let subxy :FuncX = (-) x
let addz (x:float) :FuncZ =
fun (x:float) -> (fun y -> x + y)
let listofFunc = [addxy 10;addxy 20; subxy 10]
If you check the type of listofFunc you will see it's FuncX list. Also the :FuncX refers to the return type of the function. But we could you use it as an input type as well:
let compFunc (x:FuncX) (z:FuncX) =
[(x 10);(z 10)]
compFunc (addxy 10) (addxy 20)
In F#, is there a special name for a type defined in the following manner:
type Foo = int -> string
I ask because such a type seems to have special significance. It is quite abstract compared to other types and seems to only be usable as a sort of function interface. It is not discussed much in F# literature, but it is used quite a bit in the source of many F# OSS projects.
Is it a special sort of type that has some broader significance in functional programming? Does it have other applications other than functioning as a sort of interface for functions? Why is it something that doesn't really make sense to instantiate, even though you can kind of do that by defining a function with a matching type signature? Or is it really just as simple as saying that it is an alias for a function definition which can then be used as a short form in other type signatures/definitions and that accounts for all it's properties?
The name used in the F# documentation for this kind of definition is type abbreviation and I think many people often refer to it as type alias.
The definition defines an alias or a shorter name for a longer type. It says that whenever you type Foo, the compiler will see it as int -> string.
Type abbreviations do not define a new type, which means that Foo and int -> string are equivalent (a value of one type is also a value of the other type). The important points are:
Type inference will generally infer the original type, so when you write a function that matches the type, compiler will infer it as int -> string unless you give an explicit type annotation
When compiled, the abbreviations are erased (because .NET does not have such concept) and so the compiled code will see int -> string.
Abbreviations are useful just for readability reasons - it lets you use more descriptive name for a type. However, they do not have many other implications.
Let me state that I am very green in F# (but 4 years experience in C#). I wanted to start learning F# and I was following the TryFSharp.org tutorials. I came to the point of the computation expressions but things weren't exactly clear. So I started to google it. I came across another tutorial / article which explained it a lot better in the first example (the logging example). But then I read on and came to the second example; I cannot follow the flow of the code or how it is supposed to work, perhaps because I don't understand the definition of the State type:
type State<'a, 's> = State of ('s -> 'a * 's)
I have worked with a few simple types in F# already, I have seen struct, class, record but I have no clue how to read this type or what it is supposed to do. I also can't figure out what the of keyword is doing in there.
So my question is: what does this type definition do / what does the of keyword in it do?
The code defines a discriminated union type named State whose only constructor is also named State and takes an argument of type 's -> 'a * 's. The of keyword separated the constructor name from its argument type.
So basically it says that a State is a function of type 's -> 'a * 's, but you need to use the State constructor to create a State and thus have to write let myState = State someFunction rather than let myState = someFunction.
As already stated, State is a single case discriminated union type. A two-case union type would look like:
type Multi =
| First of name:string
| Second of number:int
One way to think of this is Multi as a base class and First and Second as subclasses where First requires a string in the constructor and Second requires an int. This is a very powerful construct not available in C#. It is powerful because you can pattern match of values of this type and the compiler will force you to handle every case.
A single case union is helpful as a wrapper of another type. In your example, the State type wraps a function from type 's to a pair (C# tuple) 'a * 's. It turns out that this is a very interesting type because it forms a monad and as a result you get all sorts of functions around it. For example, this gist shows how the State monad can be used to implement functional random value generators.
I'm started to learn F#, and I noticed that one of the major differences in syntax from C# is that type inference is used much more than in C#. This is usually presented as one of the benefits of F#. Why is type inference presented as benefit?
Imagine, you have a class hierarchy and code that uses different classes from it. Strong typing allows you quickly detect which classes are used in any method.
With type inference it will not be so obvious and you have to use hints to understand, which class is used. Are there any techniques that exist to make F# code more readable with type inference?
This question assumes that you are using object-oriented programming (e.g. complex class hierarchies) in F#. While you can certainly do that, using OO concepts is mainly useful for interoperability or for wrapping some F# functionality in a .NET library.
Understanding code. Type inference becomes much more useful when you write code in the functional style. It makes code shorter, but also helps you understand what is going on. For example, if you write map function over list (the Select method in LINQ):
let map f list =
seq { for el in list -> f el }
The type inference tells you that the function type is:
val map : f:('a -> 'b) -> list:seq<'a> -> seq<'b>
This matches our expectations about what we wanted to write - the argument f is a function turning values of type 'a into values of type 'b and the map function takes a list of 'a values and produces a list of 'b values. So you can use the type inference to easily check that your code does what you would expect.
Generalization. Automatic generalization (mentioned in the comments) means that the above code is automatically as reusable as possible. In C#, you might write:
IEnumerable<int> Select(IEnumerable<int> list, Func<int, int> f) {
foreach(int el in list)
yield return f(el);
}
This method is not generic - it is Select that works only on collections of int values. But there is no reason why it should be restricted to int - the same code would work for any types. The type inference mechanism helps you discover such generalizations.
More checking. Finally, thanks to the inference, the F# language can more easily check more things than you could if you had to write all types explicitly. This applies to many aspects of the language, but it is best demonstrated using units of measure:
let l = 1000.0<meter>
let s = 60.0<second>
let speed = l/s
The F# compiler infers that speed has a type float<meter/second> - it understands how units of measure work and infers the type including unit information. This feature is really useful, but it would be hard to use if you had to write all units by hand (because the types get long). In general, you can use more precise types, because you do not have to (always) type them.
I've been watching an interesting video in which type classes in Haskell are used to solve the so-called "expression problem". About 15 minutes in, it shows how type classes can be used to "open up" a datatype based on a discriminated union for extension -- additional discriminators can be added separately without modifying / rebuilding the original definition.
I know type classes aren't available in F#, but is there a way using other language features to achieve this kind of extensibility? If not, how close can we come to solving the expression problem in F#?
Clarification: I'm assuming the problem is defined as described in the previous video
in the series -- extensibility of the datatype and operations on the datatype with the features of code-level modularization and separate compilation (extensions can be deployed as separate modules without needing to modify or recompile the original code) as well as static type safety.
As Jörg pointed out in a comment, it depends on what you mean by solve. If you mean solve including some form of type-checking that the you're not missing an implementation of some function for some case, then F# doesn't give you any elegant way (and I'm not sure if the Haskell solution is elegant). You may be able to encode it using the SML solution mentioned by kvb or maybe using one of the OO based solutions.
In reality, if I was developing a real-world system that needs to solve the problem, I would choose a solution that doesn't give you full checking, but is much easier to use.
A sketch would be to use obj as the representation of a type and use reflection to locate functions that provide implementation for individual cases. I would probably mark all parts using some attribute to make checking easier. A module adding application to an expression might look like this:
[<Extends("Expr")>] // Specifies that this type should be treated as a case of 'Expr'
type App = App of obj * obj
module AppModule =
[<Implements("format")>] // Specifies that this extends function 'format'
let format (App(e1, e2)) =
// We don't make recursive calls directly, but instead use `invoke` function
// and some representation of the function named `formatFunc`. Alternatively
// you could support 'e1?format' using dynamic invoke.
sprintfn "(%s %s)" (invoke formatFunc e1) (invoke formatFunc e2)
This does not give you any type-checking, but it gives you a fairly elegant solution that is easy to use and not that difficult to implement (using reflection). Checking that you're not missing a case is not done at compile-time, but you can easily write unit tests for that.
See Vesa Karvonen's comment here for one SML solution (albeit cumbersome), which can easily be translated to F#.
I know type classes aren't available in F#, but is there a way using other language features to achieve this kind of extensibility?
I do not believe so, no.
If not, how close can we come to solving the expression problem in F#?
The expression problem is about allowing the user to augment your library code with both new functions and new types without having to recompile your library. In F#, union types make it easy to add new functions (but impossible to add new union cases to an existing union type) and class types make it easy to derive new class types (but impossible to add new methods to an existing class hierarchy). These are the two forms of extensibility required in practice. The ability to extend in both directions simultaneously without sacrificing static type safety is just an academic curiosity, IME.
Incidentally, the most elegant way to provide this kind of extensibility that I have seen is to sacrifice type safety and use so-called "rule-based programming". Mathematica does this. For example, a function to compute the symbolic derivative of an expression that is an integer literal, variable or addition may be written in Mathematica like this:
D[_Integer, _] := 0
D[x_Symbol, x_] := 1
D[_Symbol, _] := 0
D[f_ + g_, x_] := D[f, x] + D[g, x]
We can retrofit support for multiplication like this:
D[f_ g_, x_] := f D[g, x] + g D[f, x]
and we can add a new function to evaluate an expression like this:
E[n_Integer] := n
E[f_ + g_] = E[f] + E[g]
To me, this is far more elegant than any of the solutions written in languages like OCaml, Haskell and Scala but, of course, it is not type safe.