I've been watching an interesting video in which type classes in Haskell are used to solve the so-called "expression problem". About 15 minutes in, it shows how type classes can be used to "open up" a datatype based on a discriminated union for extension -- additional discriminators can be added separately without modifying / rebuilding the original definition.
I know type classes aren't available in F#, but is there a way using other language features to achieve this kind of extensibility? If not, how close can we come to solving the expression problem in F#?
Clarification: I'm assuming the problem is defined as described in the previous video
in the series -- extensibility of the datatype and operations on the datatype with the features of code-level modularization and separate compilation (extensions can be deployed as separate modules without needing to modify or recompile the original code) as well as static type safety.
As Jörg pointed out in a comment, it depends on what you mean by solve. If you mean solve including some form of type-checking that the you're not missing an implementation of some function for some case, then F# doesn't give you any elegant way (and I'm not sure if the Haskell solution is elegant). You may be able to encode it using the SML solution mentioned by kvb or maybe using one of the OO based solutions.
In reality, if I was developing a real-world system that needs to solve the problem, I would choose a solution that doesn't give you full checking, but is much easier to use.
A sketch would be to use obj as the representation of a type and use reflection to locate functions that provide implementation for individual cases. I would probably mark all parts using some attribute to make checking easier. A module adding application to an expression might look like this:
[<Extends("Expr")>] // Specifies that this type should be treated as a case of 'Expr'
type App = App of obj * obj
module AppModule =
[<Implements("format")>] // Specifies that this extends function 'format'
let format (App(e1, e2)) =
// We don't make recursive calls directly, but instead use `invoke` function
// and some representation of the function named `formatFunc`. Alternatively
// you could support 'e1?format' using dynamic invoke.
sprintfn "(%s %s)" (invoke formatFunc e1) (invoke formatFunc e2)
This does not give you any type-checking, but it gives you a fairly elegant solution that is easy to use and not that difficult to implement (using reflection). Checking that you're not missing a case is not done at compile-time, but you can easily write unit tests for that.
See Vesa Karvonen's comment here for one SML solution (albeit cumbersome), which can easily be translated to F#.
I know type classes aren't available in F#, but is there a way using other language features to achieve this kind of extensibility?
I do not believe so, no.
If not, how close can we come to solving the expression problem in F#?
The expression problem is about allowing the user to augment your library code with both new functions and new types without having to recompile your library. In F#, union types make it easy to add new functions (but impossible to add new union cases to an existing union type) and class types make it easy to derive new class types (but impossible to add new methods to an existing class hierarchy). These are the two forms of extensibility required in practice. The ability to extend in both directions simultaneously without sacrificing static type safety is just an academic curiosity, IME.
Incidentally, the most elegant way to provide this kind of extensibility that I have seen is to sacrifice type safety and use so-called "rule-based programming". Mathematica does this. For example, a function to compute the symbolic derivative of an expression that is an integer literal, variable or addition may be written in Mathematica like this:
D[_Integer, _] := 0
D[x_Symbol, x_] := 1
D[_Symbol, _] := 0
D[f_ + g_, x_] := D[f, x] + D[g, x]
We can retrofit support for multiplication like this:
D[f_ g_, x_] := f D[g, x] + g D[f, x]
and we can add a new function to evaluate an expression like this:
E[n_Integer] := n
E[f_ + g_] = E[f] + E[g]
To me, this is far more elegant than any of the solutions written in languages like OCaml, Haskell and Scala but, of course, it is not type safe.
Related
What's the advantage of having a type represent a function?
For example, I have observed the following snippet:
type Soldier = Soldier of PieceProperties
type King = King of PieceProperties
type Crown = Soldier -> King
Is it just to support Partial Application when additional args have yet to be satisfied?
As Fyodor Soikin says in the comments
Same reason you give names to everything else - values, functions,
modules, etc.
In other words, think about programming in assembly which typically does not use types, (yes I am aware of typed assembly) and all of the problems that one can have and then how many of those problems are solved or reduced by adding types.
So before you programmed with a language that supported functions but that used static typing, you typed everything. Now that you are using F# which has static typing and functions, just extend what you have been using typing for but now add the ability to type the functions.
To quote Benjamin C. Pierce from "Types and Programming Languages"
A type system is a tractable syntactic method for proving the absence
of certain program behaviors by classifying phrases according to the
kinds of values they compute.
As noted in "Types and Programming Languages" Section 1.2
What Type Systems Are Good For
Detecting Errors
Abstraction
Documentation
Language Safety
Efficiency
TL;DR
One of the places that I find named type function definitions invaluable is when I am building parser combinators. During the construction of the functions I fully type the functions so that I know what the types are as opposed to what type inferencing will infer they are which might be different than what I want. Since the function types typically have several parameters it is easier to just give the function type a name, and then use that name everywhere it is needed. This also saves time because the function definition is consistent and avoid having to debug an improperly declared function definition; yes I have made mistakes by doing each function type by hand and learned my lesson. Once all of the functions work, I then remove the type definitions from the functions, but leave the type definition as comments so that it makes the code easier to understand.
A side benefit of using the named type definitions is that when creating test cases, the typing rules in the named function will ensure that the data used for the test is of the correct type. This also makes understanding the data for the test much easier to understand when you come back to it after many months.
Another advantage is that using function names makes the code easier to understand because when a person new to the code looks at if for the first time they can spot the consistency of the names. Also if the names are meaningful then it makes understanding the code much easier.
You have to remember that functions are also values in F#. And you can do pretty much the same stuff with them as other types. For example you can have a function that returns other functions. Or you can have a list that stores functions. In these cases it will help if you are explicit about the function signature. The function type definition will help you to constrain on the parameters and return types. Also, you might have a complicated type signature, a type definition will make it more readable. This maybe a bit contrived but you can do fun(ky) stuff like this:
type FuncX = int -> int
type FuncZ = float -> float -> float
let addxy (x:int) :FuncX = (+) x
let subxy :FuncX = (-) x
let addz (x:float) :FuncZ =
fun (x:float) -> (fun y -> x + y)
let listofFunc = [addxy 10;addxy 20; subxy 10]
If you check the type of listofFunc you will see it's FuncX list. Also the :FuncX refers to the return type of the function. But we could you use it as an input type as well:
let compFunc (x:FuncX) (z:FuncX) =
[(x 10);(z 10)]
compFunc (addxy 10) (addxy 20)
I am trying to port a small compiler from C# to F# to take advantage of features like pattern matching and discriminated unions. Currently, I am modeling the AST using a pattern based on System.Linq.Expressions: A an abstract base "Expression" class, derived classes for each expression type, and a NodeType enum allowing for switching on expressions without lots of casting. I had hoped to greatly reduce this using an F# discriminated union, but I've run into several seeming limitations:
Forced public default constructor (I'd like to do type-checking and argument validation on expression construction, as System.Linq.Expressions does with it's static factory methods)
Lack of named properties (seems like this is fixed in F# 3.1)
Inability to refer to a case type directly. For example, it seems like I can't declare a function that takes in only one type from the union (e. g. let f (x : TYPE) = x compiles for Expression (the union type) but not for Add or Expression.Add. This seems to sacrifice some type-safety over my C# approach.
Are there good workarounds for these or design patterns which make them less frustrating?
I think, you are stuck a little too much with the idea that a DU is a class hierarchy. It is more helpful to think of it as data, really. As such:
Forced public default constructor (I'd like to do type-checking and argument validation on expression construction, as
System.Linq.Expressions does with it's static factory methods)
A DU is just data, pretty much like say a string or a number, not functionality. Why don't you make a function that returns you an Expression option to express, that your data might be invalid.
Lack of named properties (seems like this is fixed in F# 3.1)
If you feel like you need named properties, you probably have an inappropriate type like say string * string * string * int * float as the data for your Expression. Better make a record instead, something like AddInfo and make your case of the DU use that instead, like say | Add of AddInfo. This way you have properties in pattern matches, intellisense, etc.
Inability to refer to a case type directly. For example, it seems like I can't declare a function that takes in only one type from the
union (e. g. let f (x : TYPE) = x compiles for Expression (the union
type) but not for Add or Expression.Add. This seems to sacrifice some
type-safety over my C# approach.
You cannot request something to be the Add case, but you definitely do can write a function, that takes an AddInfo. Plus you can always do it in a monadic way and have functions that take any Expression and only return an option. In that case, you can pattern match, that your input is of the appropriate type and return None if it is not. At the call site, you then can "use" the value in the good case, using functions like Option.bind.
Basically try not to think of a DU as a set of classes, but really just cases of data. Kind of like an enum.
You can make the implementation private. This allows you the full power of DUs in your implementation but presents a limited view to consumers of your API. See this answer to a related question about records (although it also applies to DUs).
EDIT
I can't find the syntax on MSDN, but here it is:
type T =
private
| A
| B
private here means "private to the module."
In C, I could generate an executable, do an extensive rename only refactor, then compare executables again to confirm that the executable did not change. This was very handy to ensure that the refactor did not break anything.
Has anyone done anything similar with Ruby, particularly a Rails app? Strategies and methods would be appreciated. Ideally, I could run a script that output a single file of some sort that was purely bytecode and was not changed by naming changes. I'm guessing JRuby or Rubinus would be helpful here.
I don't think this strategy will work for Ruby. Unlike C, where the compiler throws away the names, most of the things you name in Ruby carry that name with them. That includes classes, modules, constants, and instance variables.
Automated unit and integration tests are the way to go to support Ruby refactoring.
Interesting question -- I like the definitive "yes" answer you can get from this regression strategy, at least for the specific case of rename refactoring.
I'm not expert enough to tell whether you can compile ruby (or at least a subset, without things like eval) but there seem to be some hints at:
http://www.hokstad.com/the-problem-with-compiling-ruby.html
http://rubini.us/2011/03/17/running-ruby-with-no-ruby/
Supposing that a complete compilation isn't possible, what about an abstract interpretation approach? You could parse the ruby into an AST, emit some kind of C code from the AST, and then compile the C code. The C code would not need to fully capture the behavior of the ruby code. It would only need to be compilable and to be distinct whenever the ruby was distinct. (Actually running it could result in gibberish, or perhaps an immediate memory violation error.)
As a simple example, suppose that ruby supported multiplication and C didn't. Then you could include a static mult function in your C code and translate from:
a = b + c*d
to
a = b + mult(c,d)
and the resulting compiled code would be invariant under name refactoring but would show discrepancies under other sorts of change. The mult function need not actually implement multiplication, you could have one of these instead:
static int mult( int a, int b ) { return a + b; } // pretty close
static int mult( int a, int b ) { return *0; } // not close at all, but still sufficient
and you'd still get the invariance you need as long as the C compiler isn't going to inline the definition. The same sort of translation, from an uncompilable ruby construct to a less functional but distinct C construct, should work for object manipulation and so forth, mapping class operations into C structure references. The key point is just that you want to keep the naming relationships intact while sacrificing actual behavior.
(I wonder whether you could do something with a single C struct that has members (all pointers to the same struct type) named after all the class and property names in the ruby code. Class and object operations would then correspond to nested dereference operations using this single structure. Just a notion.)
Even if you cannot formulate a precise mapping, an imprecise mapping that misses some minor distinctions might still be enough to increase confidence in the original name refactoring.
The quickest way to implement such a scheme might be to map from byte code to C (rather from the ruby AST to C). That would save a lot of parsing, but the mapping would be harder to understand and verify.
I have been trying to explain the difference between switch statements and pattern matching(F#) to a couple of people but I haven't really been able to explain it well..most of the time they just look at me and say "so why don't you just use if..then..else".
How would you explain it to them?
EDIT! Thanks everyone for the great answers, I really wish I could mark multiple right answers.
Having formerly been one of "those people", I don't know that there's a succinct way to sum up why pattern-matching is such tasty goodness. It's experiential.
Back when I had just glanced at pattern-matching and thought it was a glorified switch statement, I think that I didn't have experience programming with algebraic data types (tuples and discriminated unions) and didn't quite see that pattern matching was both a control construct and a binding construct. Now that I've been programming with F#, I finally "get it". Pattern-matching's coolness is due to a confluence of features found in functional programming languages, and so it's non-trivial for the outsider-looking-in to appreciate.
I tried to sum up one aspect of why pattern-matching is useful in the second of a short two-part blog series on language and API design; check out part one and part two.
Patterns give you a small language to describe the structure of the values you want to match. The structure can be arbitrarily deep and you can bind variables to parts of the structured value.
This allows you to write things extremely succinctly. You can illustrate this with a small example, such as a derivative function for a simple type of mathematical expressions:
type expr =
| Int of int
| Var of string
| Add of expr * expr
| Mul of expr * expr;;
let rec d(f, x) =
match f with
| Var y when x=y -> Int 1
| Int _ | Var _ -> Int 0
| Add(f, g) -> Add(d(f, x), d(g, x))
| Mul(f, g) -> Add(Mul(f, d(g, x)), Mul(g, d(f, x)));;
Additionally, because pattern matching is a static construct for static types, the compiler can (i) verify that you covered all cases (ii) detect redundant branches that can never match any value (iii) provide a very efficient implementation (with jumps etc.).
Excerpt from this blog article:
Pattern matching has several advantages over switch statements and method dispatch:
Pattern matches can act upon ints,
floats, strings and other types as
well as objects.
Pattern matches can act upon several
different values simultaneously:
parallel pattern matching. Method
dispatch and switch are limited to a single
value, e.g. "this".
Patterns can be nested, allowing
dispatch over trees of arbitrary
depth. Method dispatch and switch are limited
to the non-nested case.
Or-patterns allow subpatterns to be
shared. Method dispatch only allows
sharing when methods are from
classes that happen to share a base
class. Otherwise you must manually
factor out the commonality into a
separate function (giving it a
name) and then manually insert calls
from all appropriate places to your
unnecessary function.
Pattern matching provides redundancy
checking which catches errors.
Nested and/or parallel pattern
matches are optimized for you by the
F# compiler. The OO equivalent must
be written by hand and constantly
reoptimized by hand during
development, which is prohibitively
tedious and error prone so
production-quality OO code tends to
be extremely slow in comparison.
Active patterns allow you to inject
custom dispatch semantics.
Off the top of my head:
The compiler can tell if you haven't covered all possibilities in your matches
You can use a match as an assignment
If you have a discriminated union, each match can have a different 'type'
Tuples have "," and Variants have Ctor args .. these are constructors, they create things.
Patterns are destructors, they rip them apart.
They're dual concepts.
To put this more forcefully: the notion of a tuple or variant cannot be described merely by its constructor: the destructor is required or the value you made is useless. It is these dual descriptions which define a value.
Generally we think of constructors as data, and destructors as control flow. Variant destructors are alternate branches (one of many), tuple destructors are parallel threads (all of many).
The parallelism is evident in operations like
(f * g) . (h * k) = (f . h * g . k)
if you think of control flowing through a function, tuples provide a way to split up a calculation into parallel threads of control.
Looked at this way, expressions are ways to compose tuples and variants to make complicated data structures (think of an AST).
And pattern matches are ways to compose the destructors (again, think of an AST).
Switch is the two front wheels.
Pattern-matching is the entire car.
Pattern matches in OCaml, in addition to being more expressive as mentioned in several ways that have been described above, also give some very important static guarantees. The compiler will prove for you that the case-analysis embodied by your pattern-match statement is:
exhaustive (no cases are missed)
non-redundant (no cases that can never be hit because they are pre-empted by a previous case)
sound (no patterns that are impossible given the datatype in question)
This is a really big deal. It's helpful when you're writing the program for the first time, and enormously useful when your program is evolving. Used properly, match-statements make it easier to change the types in your code reliably, because the type system points you at the broken match statements, which are a decent indicator of where you have code that needs to be fixed.
If-Else (or switch) statements are about choosing different ways to process a value (input) depending on properties of the value at hand.
Pattern matching is about defining how to process a value given its structure, (also note that single case pattern matches make sense).
Thus pattern matching is more about deconstructing values than making choices, this makes them a very convenient mechanism for defining (recursive) functions on inductive structures (recursive union types), which explains why they are so abundantly used in languages like Ocaml etc.
PS: You might know the pattern-match and If-Else "patterns" from their ad-hoc use in math;
"if x has property A then y else z" (If-Else)
"some term in p1..pn where .... is the prime decomposition of x.." ((single case) pattern match)
Perhaps you could draw an analogy with strings and regular expressions? You describe what you are looking for, and let the compiler figure out how for itself. It makes your code much simpler and clearer.
As an aside: I find that the most useful thing about pattern matching is that it encourages good habits. I deal with the corner cases first, and it's easy to check that I've covered every case.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
This is the unabashed attempt of a similar C# question.
So what are your favorite F# hidden (or not) features?
Most of the features I've used so far aren't exactly hidden but have been quite refreshing. Like how trivial it is to overload operators compared to say C# or VB.NET.
And Async<T> has helped me shave off some real ugly code.
I'm quite new to the language still so it'd be great to learn what other features are being used in the wild.
User defined numeric literals can be defined by providing a module whose name starts with NumericLiteral and which defines certain methods (FromZero, FromOne, etc.).
In particular, you can use this to provide a much more readable syntax for calling LanguagePrimitives.GenericZero and LanguagePrimitives.GenericOne:
module NumericLiteralG = begin
let inline FromZero() = LanguagePrimitives.GenericZero
let inline FromOne() = LanguagePrimitives.GenericOne
end
let inline genericFactorial n =
let rec fact n = if (n = 0G) then 1G else n * (fact (n - 1G))
fact n
let flt = genericFactorial 30.
let bigI = genericFactorial 30I
F# has a little-used feature called "signature files". You can have a big implementation file full of public types/methods/modules/functions, but then you can hide and selectively expose that functionality to the sequel of the program via a signature file. That is, a signature file acts as a kind of screen/filter that enables you to make entities "public to this file" but "private to the rest of the program".
I feel like this is a pretty killer feature on the .Net platform, because the only other/prior tool you have for this kind of encapsulation is assemblies. If you have a small component with a few related types that want to be able to see each other's internal details, but don't want those types to have all those bits public to everyone, what can you do? Well, you can do two things:
You can put that component in a separate assembly, and make the members that those types share be "internal", and make the narrow part you want everyone else to see be "public", or
You just mark the internal stuff "internal" but you leave those types in your gigantic assembly and just hope that all the other code in the assembly chooses not to call those members that were only marked 'internal' because one other type needed to see it.
In my experience, on large software projects, everyone always does #2, because #1 is a non-starter for various reasons (people don't want 50 small assemblies, they want 1 or 2 or 3 large assemblies, for other maybe-good reasons unrelated to the encapsulation point I am raising (aside: everyone mentions ILMerge but no one uses it)).
So you chose option #2. Then a year later, you finally decide to refactor out that component, and you discover that over the past year, 17 other places now call into that 'internal' method that was really only meant for that one other type to call, making it really hard to factor out that bit because now everyone depends on those implementation details. Bummer.
The point is, there is no good way to create a moderate-size intra-assembly encapsulation scope/boundary in .Net. Often times "internal" is too big and "private" is too small.
... until F#. With F# signature files, you can create an encapsulation scope of "this source code file" by marking a bunch of stuff as public within the implementation file, so all the other code in the file can see it and party on it, but then use a signature file to hide all of the details expect the narrow public interface that component exposes to the rest of the world. This is happy. Define three highly related types in one file, let them see each others implementation details, but only expose the truly public stuff to everyone else. Win!
Signature files are perhaps not the ideal feature for intra-assembly encapsulation boundaries, but they are the only such feature I know, and so I cling to them like a life raft in the ocean.
TL;DR
Complexity is the enemy. Encapsulation boundaries are a weapon against this enemy. "private" is a great weapon but sometimes too small to be applicable, and "internal" is often too weak because so much code (entire assembly and all InternalsVisibleTo's) can see internal stuff. F# offers a scope bigger than "private to a type" but smaller than "the whole assembly", and that is very useful.
I wonder what happens if you add
<appSettings>
<add key="fsharp-navigationbar-enabled" value="true" />
</appSettings>
to your devenv.exe.config file? (Use at your own risk.)
Passing --warnon:1182 to the compiler turns on warnings about unused variables; variable names that begin with underscore are immune.
Automatically-generated comparison functions for algebraic data types (based on lexicographical ordering) is a nice feature that is relatively unknown; see
http://lorgonblog.spaces.live.com/blog/cns!701679AD17B6D310!548.entry
for an example.
Yes, F# doesn't have any 'hidden' features, but it sure does have a lot of power packed into the simple language. A less-known feature of the language, is where you can basically enable duck typing despite the fact F# is staticaly typed.
See this question
F# operator "?"
for info on the question-mark operator and how it provides the basic language mechanism to build a feature akin to 'dynamic' in C#.
Not really hidden, but as a non-ML person this escaped me for quite a while:
Pattern matching can decompose arbitrarily deep into data structures.
Here's a [incredibly arbitrary] nested tuple example; this works on lists or unions or any combinations of nested values:
let listEven =
"Manipulating strings can be intriguing using F#".Split ' '
|> List.ofArray
|> List.map (fun x -> (x.Length % 2 = 0, x.Contains "i"), x)
|> List.choose
( function (true, true), s -> Some s
| _, "F#" -> Some "language"
| _ -> None )
Use of F# as a utility scripting language may be under appreciated. F# enthusiasts tend to be quants. Sometimes you want something to back up your MP3s (or dozens of database servers) that's a little more robust than batch. I've been hunting for a modern replacement for jscript / vbscript. Lately, I've used IronPython, but F# may be more complete and the .NET interaction is less cumbersome.
I like curried functions for entertainment value. Show a curried function to a pure procedural / OOP program for at least three WTFs. Starting with this is a bad way to get F# converts, though :)
Inlined operators on generic types can have different generic constraints:
type 'a Wrapper = Wrapper of 'a with
static member inline (+)(Wrapper(a),Wrapper(b)) = Wrapper(a + b)
static member inline Exp(Wrapper(a)) = Wrapper(exp a)
let objWrapper = Wrapper(obj())
let intWrapper = (Wrapper 1) + (Wrapper 2)
let fltWrapper = exp (Wrapper 1.0)
(* won''t compile *)
let _ = exp (Wrapper 1)
There are no hidden features, because F# is in design mode. All what we have is a Technical Preview, which changes every two month.
see http://research.microsoft.com/fsharp/