F# functional way to transform DU-based ASTs

In OOP land, take for example Roslyn and its syntax rewriters, which use the visitor pattern.
This is very nice, as there is already a base rewriter class that defines all the visit methods to change nothing, and I just have to override the methods that I care about.
What would be a comparable solution for DU-based ASTs?
E.g. if I would like to write a function that visits every node of an AST parsed with the following snippet (not made by me),
I can write transformer functions like so:
// strip all class type modifiers because of reasons
let typeTransformer (input:CSharpType) : CSharpType =
    match input with
    | Class (access, modifier, name, implements, members) ->
        Class (access, None, name, implements, members)
    | _ -> input

let rec nameSpaceTransformer typeTransformer (input:NamespaceScope) : NamespaceScope =
    match input with
    | Namespace (imports, names, nestedNamespaces) ->
        Namespace (imports, names, List.map (nameSpaceTransformer typeTransformer) nestedNamespaces)
    | Types (names, types) ->
        Types (names, List.map typeTransformer types)
This is already pretty cumbersome, and it gets worse the deeper one goes into the tree.
Does this representation just not lend itself to these kinds of transformations?
Edit: what I am actually looking for is a way to define just the specific transform functions, which are then automatically applied to the correct nodes while everything else remains unchanged.
Here is my best try so far on a simplified example (Fable REPL).
Note the last two lets after the comment: in later usage, one should only need to write those two and then call transform replaceAllAsWithBsTransformer someAstRoot with an actual AST instance.
Of course this solution does not work correctly, because it would require recursive records (e.g. the transformMiddleNode function should really take the transformer record itself and ask for its transformLeaf member).
This is the part I have trouble with, and which I would say is nicely solved by the OOP visitor pattern, but I can't figure out how to mirror it successfully here.
Edit 2:
At the end of the day, I went with just implementing an actual visitor class in the form of
type Transformer() =
    abstract member TransformLeaf : Leaf -> Leaf
    default this.TransformLeaf leaf = id leaf
    abstract member TransformMiddleNode : MiddleNode -> MiddleNode
    default this.TransformMiddleNode node =
        match node with
        | MoreNodes nodeList ->
            List.map this.TransformMiddleNode nodeList
            |> MoreNodes
        | Leaf leaf -> this.TransformLeaf leaf |> Leaf
    abstract member TransformUpperNode : UpperNode -> UpperNode
    default this.TransformUpperNode node =
        match node with
        | MoreUpperNodes nodeList ->
            List.map this.TransformUpperNode nodeList |> MoreUpperNodes
        | MiddleNodes nodeList ->
            List.map this.TransformMiddleNode nodeList |> MiddleNodes
    ...
and then I can define specific transformations like:
type LeafTransformer() =
    inherit Transformer()
    override this.TransformLeaf leaf = someLeafTransformation leaf
where someLeafTransformation: Leaf -> Leaf
This is not any worse than the OOP solution (it is essentially the same, except that the "bottom level" visitor interfaces are replaced by pattern matching).
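For comparison, the record-based attempt from the first edit can probably be made to work too, because an F# record type may refer to itself in its field types. The following is only a sketch under assumptions: the Leaf cases A and B and the names Transformer, defaultTransformer, transform and replaceAllAsWithBs are hypothetical stand-ins for the simplified example, not the code behind the REPL link. Each field receives the whole record, which plays the role of `this`.

```fsharp
// Hypothetical simplified AST; the Leaf cases A and B are assumed.
type Leaf = A | B

type MiddleNode =
    | MoreNodes of MiddleNode list
    | Leaf of Leaf

type UpperNode =
    | MoreUpperNodes of UpperNode list
    | MiddleNodes of MiddleNode list

// Every field takes the transformer record itself as its first argument.
type Transformer =
    { TransformLeaf: Transformer -> Leaf -> Leaf
      TransformMiddleNode: Transformer -> MiddleNode -> MiddleNode
      TransformUpperNode: Transformer -> UpperNode -> UpperNode }

// Default behaviour: walk the whole tree and rebuild it unchanged.
let defaultTransformer =
    { TransformLeaf = fun _ leaf -> leaf
      TransformMiddleNode =
        fun (t: Transformer) node ->
            match node with
            | MoreNodes nodes -> MoreNodes (List.map (t.TransformMiddleNode t) nodes)
            | Leaf leaf -> Leaf (t.TransformLeaf t leaf)
      TransformUpperNode =
        fun (t: Transformer) node ->
            match node with
            | MoreUpperNodes nodes -> MoreUpperNodes (List.map (t.TransformUpperNode t) nodes)
            | MiddleNodes nodes -> MiddleNodes (List.map (t.TransformMiddleNode t) nodes) }

let transform (t: Transformer) root = t.TransformUpperNode t root

// A specific transformation only overrides the field it cares about.
let asToBs leaf =
    match leaf with
    | A -> B
    | other -> other

let replaceAllAsWithBs =
    { defaultTransformer with TransformLeaf = fun _ leaf -> asToBs leaf }

let input = MoreUpperNodes [ MiddleNodes [ Leaf A; MoreNodes [ Leaf B; Leaf A ] ] ]
let result = transform replaceAllAsWithBs input
```

Copy-and-update (`{ defaultTransformer with ... }`) stands in for overriding a single visit method, so this ends up structurally close to the class-based version above.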

Certainly the code you posted is doing it the "functional way". It's not clear to me exactly how this is "cumbersome" or "gets worse the deeper one gets into the tree". I think the key here is just to write your functions as concisely as possible (but not so concisely that they become unreadable!) and then figure out the right mix of helper functions and higher-level functions that rely on those, plus good comments where needed.
Your first function could just be this:
let transformModifier input =
    match input with
    | Class (a, _, c, d, e) -> Class (a, None, c, d, e)
    | _ -> input
This is less verbose, but still readable. In fact, it's probably more readable as it's obvious now that the only thing this does is change the class modifier.
Perhaps you will want to create other functions that modify classes, and compose these using >>, then call them from a larger function that walks the whole tree.
The ultimate readability of the code is going to be mostly up to you (IMO).
There are good discussions of AST transformations in the books Expert F# and F# Deep Dives.
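As a rough sketch of the composition idea mentioned above (small class-level rewrites combined with >> and handed to one function that walks the tree): the AST types below are heavily simplified, hypothetical stand-ins, since the real definitions are not shown in the question, and stripModifier, makePublic and mapTypes are names made up for the illustration.

```fsharp
// Hypothetical, heavily simplified stand-ins for the AST types in the question.
type Access = Public | Internal
type Modifier = Static | Partial

type CSharpType =
    | Class of Access * Modifier option * string * string list * string list
    | Interface of string

type NamespaceScope =
    | Namespace of string list * string list * NamespaceScope list
    | Types of string list * CSharpType list

// Small, single-purpose rewrites on one node type.
let stripModifier t =
    match t with
    | Class (a, _, name, impls, members) -> Class (a, None, name, impls, members)
    | other -> other

let makePublic t =
    match t with
    | Class (_, m, name, impls, members) -> Class (Public, m, name, impls, members)
    | other -> other

// One walker: applies f to every CSharpType anywhere in the tree.
let rec mapTypes f scope =
    match scope with
    | Namespace (imports, names, nested) ->
        Namespace (imports, names, List.map (mapTypes f) nested)
    | Types (names, types) -> Types (names, List.map f types)

let inner = Types ([ "Inner" ], [ Class (Internal, Some Partial, "Foo", [], []); Interface "IBar" ])
let ast = Namespace ([], [ "Outer" ], [ inner ])

// Compose the rewrites once, then run a single pass over the tree.
let cleaned = mapTypes (stripModifier >> makePublic) ast
```

The walker is written once per tree shape; each new transformation is then a few lines that never mention namespaces at all.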

what I am actually looking for, is a way where I can define just the specific transform functions that will then be automatically applied to the correct nodes, while everything else remains unchanged.
I wrote an AST transformation library called FSharp.Text.Experimental.Transform that does exactly this. Coincidentally, I've already written a C# grammar definition, so I was able to use it to try out your "strip class modifiers" problem.
Your solution, implemented using this library, starts by feeding the C# grammar definition into the GrammarProvider type provider. The type provider will provide methods for parsing the input text, and provides a type for each non-terminal in the grammar.
open FSharp.Text.Experimental.Transform
type CSharp = GrammarProvider<"csharp.grm">
Next, you define your transformation function that just operates on the nodes you care about. Since you care about transforming the modifiers for a class, you'll target the ClassModifier* part of the ClassDefinition grammar production:
// This function replaces any list of class modifiers with the empty list
let stripClassModifiers (_: CSharp.ClassModifier list) = []
Finally, you parse the input text, apply your transformation function, then unparse to a string:
CSharp.ParseFile("path/to/program.cs").ApplyOnePass(stripClassModifiers).ToString()
The ApplyOnePass() method will perform a single pass over the AST, applying your stripClassModifiers transformation wherever it finds a list of class modifiers and leaving all the other nodes untouched.
The library contains more powerful methods for more complex transformations, but I hope the example above suffices to illustrate the idea. See the library documentation for tutorials, examples, API reference, and more details on what it can do.

Related

What is the best way to pass generic function that resolves to multiple types

For the background: It's a variation on functional DI. Following Scott's post I wrote an interpreter. The twist is that my interpreter is generic and parametrized based on what you feed to it.
For testing purposes I'd like to pass another interpreter in, and therein lies the rub - how can I? Here's the simplified outline of the problem:
let y f =
    let a = f 1
    let b = f 2L
    (a,b)
f is my generic interpreter, but here it is obviously constrained by the first use to int -> 'a.
In this simplified scenario I could just pass the interpreter twice, but in my actual implementation the type space is rather large (base type x3 output types).
Is there some F# mechanism that would let me do that, w/o too much overhead?
You can't do this in F# with functions. Functions lose genericity when passed as values.
However, F# does have a mechanism for doing it anyway, albeit a bit awkwardly: interfaces. Interface methods can be generic, so you can use them to wrap your generic functions:
type Wrapper =
    abstract member f<'a> : 'a -> 'a

let y (w: Wrapper) =
    let a = w.f 1
    let b = w.f 2L
    (a, b)

let genericFn x = x

// Calling y:
y { new Wrapper with member __.f x = genericFn x }
The downside is, you can't go back to higher-order functions, lest you lose genericity. You have to have interfaces all the way down to the turtles. For example, you can't simplify the instance creation by abstracting it as a function:
let mkWrapper f =
    // no can do: `f` will be constrained to a non-generic type at this point
    { new Wrapper with member __.f x = f x }
But you can provide some convenience on the other side. At least get rid of type annotations:
type Wrapper = abstract member f<'a> : 'a -> 'a

let callF (w: Wrapper) x = w.f x

let y w =
    let a = callF w 1
    let b = callF w 2L
    (a,b)
(NOTE: there may be minor syntactic mistakes in the above code, as I'm writing on my phone)
Not sure if you're still interested, since you already accepted an answer, but as @Fyodorsoikin requested it, here's the 'static' way. It all happens at compile time, so there is no runtime overhead:
let inline y f =
    let a = f $ 1
    let b = f $ 2L
    (a, b)

type Double = Double with static member inline ($) (Double, x) = x + x
type Triple = Triple with static member inline ($) (Triple, x) = x + x + x
type ToList = ToList with static member ($) (ToList, x) = [x]

let res1 = y Double
let res2 = y Triple
let res3 = y ToList
I use this technique when I need a generic function over arbitrary structures. I usually call such single-method types 'Invokable'.
UPDATE
To add parameters to the function you add it to the DU, like this:
type Print<'a> = Print of 'a with
    static member inline ($) (Print printer, x) = printer (string x)

let stdout (x:string) = System.Console.WriteLine x
let stderr (x:string) = System.Console.Error.WriteLine x

let res4 = y (Print stdout)
let res5 = y (Print stderr)
This is just a quick and simple sample code but this approach can be refined: you can use a method name instead of an operator, you can avoid having to repeat the DU in the declaration, and you can compose Invokables. If you are interested in these enhancements, let me know. I used a refinement of this approach before in production code and never had any issue.
Please look at Crates.
Here is a quick snippet describing the crux of what you want to accomplish. I believe this snippet is valuable in that it helps teach us how to reason formally about F# and other ML type systems using mathematical language. In other words, it not only shows you how, it teaches you the deeper principle of why it works.
The issue here is that we have reached a fundamental limitation of what is directly expressible in F#. It follows that the trick to simulating universal quantification is, therefore, to avoid ever passing the function around directly, instead hiding the type parameter away such that it cannot be fixed to one particular value by the caller, but how might one do that?
Recall that F# provides access to the .NET object system. What if we made our own class (in the object-oriented sense) and put a generic method on that? We could create instances of that which we could pass around, and hence carry our function with it (in the form of said method)?
// Encoding the function signature...
// val id<'a> : 'a -> 'a
// ...in terms of an interface with a single generic method
type UniversalId = abstract member Eval<'a> : 'a -> 'a
Now we can create an implementation which we can pass around without the type parameter being fixed:
// Here's the boilerplate I warned you about.
// We're implementing the "UniversalId" interface
// by providing the only reasonable implementation.
// Note how 'a isn't visible in the type of id -
// now it can't be locked down against our will!
let id : UniversalId =
    { new UniversalId with
        member __.Eval<'a> (x : 'a) : 'a = x
    }
Now we have a way to simulate a universally quantified function. We can pass id around as a value, and at any point we pick a type 'a to pass to it just as we would any value-level argument.
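To make the "pick a type at any point" part concrete, here is a minimal sketch in which the same value is used at two different types inside one function. The names identity and applyToBoth are made up for the illustration (identity is used instead of id only to avoid shadowing the built-in function).

```fsharp
type UniversalId = abstract member Eval<'a> : 'a -> 'a

let identity : UniversalId =
    { new UniversalId with
        member __.Eval<'a> (x: 'a) : 'a = x }

// The caller picks a different 'a at each call site,
// which a plain function argument would not allow.
let applyToBoth (f: UniversalId) = (f.Eval 1, f.Eval "one")

let pair = applyToBoth identity
```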
EXISTENTIAL QUANTIFICATION
There exists a type x, such that…
An existential is a value whose type is unknown statically, either because we have intentionally hidden something that was known, or because the type really is chosen at runtime, e.g. due to reflection. At runtime we can, however, inspect the existential to find the type and value inside.
If we don’t know the concrete type inside an existentially quantified type, how can we safely perform operations on it? Well, we can apply any function which itself can handle a value of any type – i.e. we can apply a universally quantified function!
In other words, existentials can be described in terms of the universals which can be used to operate upon them.
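A minimal sketch of that idea, assuming we want to hide the element type of a list: the names ListEvaluator, ListCrate, makeCrate and lengthOf are hypothetical and chosen only for this illustration. The crate exposes nothing but a method that accepts a universally quantified consumer.

```fsharp
// A universally quantified consumer: it must cope with a list of any element type.
type ListEvaluator<'r> =
    abstract member Eval<'a> : 'a list -> 'r

// The existential: "a list of some element type", reachable only through Apply.
type ListCrate =
    abstract member Apply<'r> : ListEvaluator<'r> -> 'r

// Packing: the element type 'a disappears from the resulting type.
let makeCrate (xs: 'a list) : ListCrate =
    { new ListCrate with
        member __.Apply<'r> (e: ListEvaluator<'r>) : 'r = e.Eval xs }

// Unpacking: supply a consumer that works for every 'a.
let lengthEvaluator =
    { new ListEvaluator<int> with
        member __.Eval<'a> (xs: 'a list) : int = List.length xs }

let lengthOf (crate: ListCrate) : int = crate.Apply lengthEvaluator

// Crates over different element types can live in the same list.
let lengths = [ makeCrate [ 1; 2; 3 ]; makeCrate [ "a" ] ] |> List.map lengthOf
```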
This technique is so useful that it is used in the datatype-generic programming library TypeShape, which allows you to scrap your boilerplate .NET reflection code, as well as in MBrace and FsPickler for "packing existential data types". See Eirik Tsarpalis' slides on TypeShape for more on "encoding safe existential unpacking in .NET" and encoding rank-2 types in .NET.
A reflection helper library like TypeShape also intuitively should cover most if not all your use cases: dependency injection needs to implement service location under the hood, and so TypeShape can be thought of as the "primitive combinator library" for building dependencies to inject. See the slides starting with Arbitrary Type Shapes: In particular, note the Code Lens data type:
type Lens<'T,'F> =
    {
        Get : 'T -> 'F
        Set : 'T -> 'F -> 'T
    }
Finally, for more ideas, you may care to read Don Stewart's PhD dissertation, Dynamic Extension of Typed Functional Languages.
We present a solution to the problem of dynamic extension in statically
typed functional languages with type erasure. The presented solution retains
the benefits of static checking, including type safety, aggressive optimizations, and native code compilation of components, while allowing
extensibility of programs at runtime.
Our approach is based on a framework for dynamic extension in a statically
typed setting, combining dynamic linking, runtime type checking,
first class modules and code hot swapping. We show that this framework
is sufficient to allow a broad class of dynamic extension capabilities in any
statically typed functional language with type erasure semantics.
Uniquely, we employ the full compile-time type system to perform runtime
type checking of dynamic components, and emphasize the use of native
code extension to ensure that the performance benefits of static typing
are retained in a dynamic environment. We also develop the concept of
fully dynamic software architectures, where the static core is minimal and
all code is hot swappable. Benefits of the approach include hot swappable
code and sophisticated application extension via embedded domain specific
languages.
Here are some coarse-grained design patterns Don lays out for future engineers to follow:
Section 3.6.3: Specializing Simulators Approach.
Demonstrates how to apply program specialization techniques to Monte-Carlo simulation of polymer chemistry. This approach demonstrates how you can "inject" specialized code to address so-called "peephole optimizations".
and a general chart to help frame the "tower of extensibility":
You could do that with a full fledged type:
type Function() =
    member x.DoF<'a> (v:'a) = v

let y (f: Function) =
    let a = f.DoF 1
    let b = f.DoF 2L
    (a,b)

y (Function())
I don't know a way to make it work with first-class functions in F#.

How to implement data structure using functional approach? (Linked list, tree etc)

I am new to functional programming and am learning F#; sorry if the question is stupid.
I want to get to grips with the syntax and implement some simple data structures, but I don't know how to do it.
What should an implementation of a linked list look like?
I tried to create a type, put a mutable property in it and define a set of methods to work with the type, but it looks like an object-oriented linked list...
The basic list type in F# is already somewhat a linked list.
Though you can easily recreate a linked list with a simple union type:
type LinkedList<'t> = Node of 't * LinkedList<'t> | End
A node either holds a value and a pointer to the next node, or it is the end.
You can simply make a new list by hand:
Node(1, Node(2, Node(3, End))) //LinkedList<int> = Node (1,Node (2,Node (3,End)))
Or make a new linked list by feeding it an F# list:
let rec toLinkedList = function
    | [] -> End
    | x::xs -> Node (x, (toLinkedList xs))
Walking through it:
let rec walk = function
    | End -> printfn "%s" "End"
    | Node(value, list) -> printfn "%A" value; walk list
The same concepts would apply for a tree structure as well.
A tree would look something like
type Tree<'leaf,'node> =
    | Leaf of 'leaf
    | Node of 'node * Tree<'leaf,'node> list
The F# Wikibook has a good article on data structures in F#.
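Since the question mentions ending up with mutable properties, here is a small sketch of how operations on the union-based list can stay immutable: every function returns a new list and leaves its input alone. The helper names prepend, map and fold are made up for this example and are not part of any library.

```fsharp
type LinkedList<'t> = Node of 't * LinkedList<'t> | End

// "Adding" an element builds a new list that reuses the old one as its tail.
let prepend value list = Node (value, list)

// Rebuild the list with f applied to every value.
let rec map f list =
    match list with
    | End -> End
    | Node (value, rest) -> Node (f value, map f rest)

// Combine all values into one result, front to back.
let rec fold f state list =
    match list with
    | End -> state
    | Node (value, rest) -> fold f (f state value) rest

let original = Node (1, Node (2, Node (3, End)))
let extended = prepend 0 original
let doubled = map (fun x -> x * 2) original
let total = fold (+) 0 extended
```

The same recursive shape (one match case per union case) carries over to the tree type.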

Flattening a tuple in Erlang

I am trying to turn a tuple of the form:
{{A,B,{C,A,{neg,A}}},{A,B,{neg,A}}}
Into
{{A,B,C,A,{neg,A}},{A,B,{neg,A}}}
I'm quite new to Erlang so I would appreciate any hints. It makes no difference if the final structure is a list or a tuple, as long as any letter preceded by neg stays as a tuple/list.
A simple solution:
convert({{A,B,{C,D,E}},F}) -> {{A,B,C,D,E},F}.
If why this works is puzzling, consider:
1> YourTuple = {{a, b, {c, a, {neg, a}}}, {a, b, {neg, a}}}.
{{a,b,{c,a,{neg,a}}},{a,b,{neg,a}}}
2> Convert = fun({{A,B,{C,D,E}},F}) -> {{A,B,C,D,E},F} end.
#Fun<erl_eval.6.54118792>
3> Convert(YourTuple).
{{a,b,c,a,{neg,a}},{a,b,{neg,a}}}
The reason this works is that we are matching over entire values based on the shape of the data. That's the whole point of matching, and also why it's super useful in so many cases (and also why we want to use tuples in more specific circumstances in a language with matching vs. a language where "everything is an iterable"). We can substitute the details with anything and they will be matched and returned accordingly:
4> MyTuple = {{"foo", bar, {<<"baz">>, balls, {ugh, "HURR!"}}}, {"Fee", "fi", "fo", "fum"}}.
{{"foo",bar,{<<"baz">>,balls,{ugh,"HURR!"}}},
{"Fee","fi","fo","fum"}}
5> Convert(MyTuple).
{{"foo",bar,<<"baz">>,balls,{ugh,"HURR!"}},
{"Fee","fi","fo","fum"}}
Why did this work when the last element of the top-level pair was so different in shape than the first one? Because everything about that second element was bound to the symbol F in the function represented by Convert (note that in the shell I named an anonymous function for convenience, this would be exactly the same as using convert/1 that I wrote at the top of this answer). We don't care what that second element was -- in fact we don't want to have to care about the details of that. The freedom to selectively not care about the shape of a given element of data is one of the key abstractions we use in Erlang.
"But those were just atoms 'a', 'b', 'c' etc. I have different things in there!"
Just to make it look superficially like your example above (and reinforce what I was saying about not caring about exactly what we bound to a given variable):
6> A = 1.
1
7> B = 2.
2
8> C = 3.
3
9> AnotherTuple = {{A, B, {C, A, {neg, A}}}, {A, B, {neg, A}}}.
{{1,2,{3,1,{neg,1}}},{1,2,{neg,1}}}
10> Convert(AnotherTuple).
{{1,2,3,1,{neg,1}},{1,2,{neg,1}}}
Needing to do this is not usually optimal, though. Generally speaking the other parts of the program that are producing that data in the first place should be returning useful data types for you. If not you can certainly hide them behind a conversion function such as the one above (especially when you're dealing with APIs that are out of your control), but generally speaking the need for this is a code smell.
And moving on
The more general case of "needing to flatten a tuple" is a bit different.
Tuples are tuples because each location within it has a meaning. So you don't usually hear of people needing to "flatten a tuple" because that fundamentally changes the meaning of the data you are dealing with. If you have this problem, you should not be using tuples to begin with.
That said, we can convert a tuple to a list, and we can check the shape of a data element. With these two operations in hand we could write a procedure that moves through a tuplish structure, building a list out of whatever it finds inside as it goes. A naive implementation might look like this:
-module(tuplish).
-export([flatten/1]).

-spec flatten(list() | tuple()) -> list().
flatten(Thing) ->
    lists:flatten(flatten(Thing, [])).

flatten(Thing, A) when is_tuple(Thing) ->
    flatten(tuple_to_list(Thing), A);
flatten([], A) ->
    lists:reverse(A);
flatten([H | T], A) when is_tuple(H) ->
    flatten(T, [flatten(H) | A]);
flatten([H | T], A) when is_list(H) ->
    flatten(T, [flatten(H) | A]);
flatten([H | T], A) ->
    flatten(T, [H | A]).
Keep in mind that after several years of writing Erlang code I have never needed to actually do this. Remember: tuples mean something different than lists.
All that said, the problem you are facing is almost certainly handled better by using records.

Accessing specific case from F# DU

Suppose I have the following DU:
type Something =
| A of int
| B of string * int
Now I use it in a function like this:
let UseSomething = function
    | A(i) -> DoSomethingWithA i
    | B(s, i) -> DoSomethingWithB s i
That works, but I've had to deconstruct the DU in order to pass it to the DoSomethingWith* functions. It feels natural to me to try to define DoSomethingWithA as:
let DoSomethingWithA (a: Something.A) = ....
but the compiler complains that the type A is not defined.
It seems entirely in keeping with the philosophy of F# to want to restrict the argument to being a Something.A, not just any old int, so am I just going about it the wrong way?
The important thing to note is that A and B are constructors of the same type Something. So you will get an incomplete pattern match warning if you try to handle the A and B cases separately.
IMO, deconstructing all cases of a DU is a good idea since it is type-safe and forces you to think about handling those cases even if you don't want to. The problem may arise if you have to deconstruct DUs repetitively in the same way. In that case, defining map and fold functions on DUs might be a good idea:
let mapSomething fa fb = function
    | A(i) -> fa i
    | B(s, i) -> fb s i
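A short usage sketch: once mapSomething exists, each consumer only supplies one function per case and never repeats the match. The describe and payloadSize functions are hypothetical examples, not something from the question.

```fsharp
type Something =
    | A of int
    | B of string * int

let mapSomething fa fb = function
    | A i -> fa i
    | B (s, i) -> fb s i

// Two different consumers built from the same deconstruction helper.
let describe = mapSomething (sprintf "A %d") (sprintf "B %s %d")
let payloadSize = mapSomething (fun _ -> 1) (fun _ _ -> 2)

let descriptions = [ A 1; B ("x", 2) ] |> List.map describe
let sizes = [ A 1; B ("x", 2) ] |> List.map payloadSize
```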
Please refer to the excellent Catamorphism series by @Brian to learn about folds on DUs.
That said, your example is fine. What you really process are string and int values after deconstruction.
You can use Active Patterns to consume two cases separately:
let (|ACase|) = function A i -> i | B _ -> failwith "Unexpected pattern B _"
let (|BCase|) = function B(s, i) -> (s, i) | A _ -> failwith "Unexpected pattern A _"
let doSomethingWithA (ACase i) = ....
but the inferred type of doSomethingWithA is still the same, and you get an exception when passing B _ to the function. So it's the wrong thing to do IMO.
The other answers are accurate: in F# A and B are constructors, not types, and this is the traditional approach taken by strongly typed functional languages like Haskell or the other languages in the ML family. However, there are other approaches - I believe that in Scala, for example, A and B would actually be subclasses of Something, so you could use those more specific types where it makes sense to do so. I'm not completely sure what tradeoffs are involved in the design decision, but generally speaking inheritance makes type inference harder/impossible (and true to the stereotype type inference in Scala is much worse than in Haskell or the ML languages).
A is not a type, it is just a constructor for Something. There's no way you can avoid pattern matching, which is not necessarily a bad thing.
That said, F# does offer a thing called active patterns, for instance
let (|AA|) = function
    | A i -> i
    | B _ -> invalidArg "B" "B's not allowed!"
which you can then use like this:
let DoSomethingWithA (AA i) = i + 1
But there's no real reason why you would want to do that! You still do the same old pattern matching under the hood, plus you risk the chance of a runtime error.
In any case, your implementation of UseSomething is perfectly natural for F#.
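If the goal is mainly to stop the helper from accepting "any old int", one option that stays within plain pattern matching is to give each case its own payload type. This is only a sketch of that idea: AData, BData and the field names are made up for the illustration, and it changes the shape of the original DU.

```fsharp
// Hypothetical payload types, one per case.
type AData = { Value: int }
type BData = { Name: string; Count: int }

type Something =
    | A of AData
    | B of BData

// The helpers now require the case-specific payload, not a bare int or string.
let doSomethingWithA (a: AData) = a.Value + 1
let doSomethingWithB (b: BData) = sprintf "%s:%d" b.Name b.Count

let useSomething something =
    match something with
    | A a -> string (doSomethingWithA a)
    | B b -> doSomethingWithB b

let fromA = useSomething (A { Value = 41 })
let fromB = useSomething (B { Name = "x"; Count = 2 })
```

The match in useSomething is still exhaustive, so there is no runtime failure path as there is with the active pattern variant.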

Using a variable in pattern matching in Ocaml or F#

I have a function of the form
'a -> ('a * int) list -> int
let rec getValue identifier bindings =
    match bindings with
    | (identifier, value)::tail -> value
    | (_, _)::tail -> getValue identifier tail
    | [] -> -1
I can tell that identifier is not being bound the way I would like it to and is acting as a new variable within the match expression. How do I get identifier to be what is passed into the function?
Ok! I fixed it with a pattern guard, i.e. | (i, value)::tail when i = identifier -> value
but I find this ugly compared to the way I originally wanted to do it (I'm only using these languages because they are pretty...). Any thoughts?
You can use F# active patterns to create a pattern that will do exactly what you need. F# supports parameterized active patterns that take the value that you're matching, but also take an additional parameter.
Here is a pretty stupid example that fails when the value is zero and otherwise succeeds and returns the addition of the value and the specified parameter:
let (|Test|_|) arg value =
    if value = 0 then None else Some(value + arg)
You can specify the parameter in pattern matching like this:
match 1 with
| Test 100 res -> res // 'res' will be 101
Now, we can easily define an active pattern that will compare the matched value with the input argument of the active pattern. The active pattern returns unit option, which means that it doesn't bind any new value (in the example above, it returned some value that we assigned to a symbol res):
let (|Equals|_|) arg x =
    if (arg = x) then Some() else None

let foo x y =
    match x with
    | Equals y -> "equal"
    | _ -> "not equal"
You can use this as a nested pattern, so you should be able to rewrite your example using the Equals active pattern.
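For instance, the function from the question might be rewritten along these lines. This is a sketch that reuses the Equals pattern defined above, nested inside the tuple and list patterns.

```fsharp
let (|Equals|_|) arg x =
    if arg = x then Some () else None

let rec getValue identifier bindings =
    match bindings with
    // Equals identifier compares the key with the function argument
    // instead of binding a fresh variable.
    | (Equals identifier, value) :: _ -> value
    | _ :: tail -> getValue identifier tail
    | [] -> -1

let found = getValue "b" [ ("a", 1); ("b", 2) ]
let missing = getValue "z" [ ("a", 1); ("b", 2) ]
```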
One of the beauties of functional languages is higher order functions. Using those functions we take the recursion out and just focus on what you really want to do. Which is to get the value of the first tuple that matches your identifier otherwise return -1:
let getValue identifier list =
    match List.tryFind (fun (x,y) -> x = identifier) list with
    | None -> -1
    | Some(x,y) -> y
//val getValue : 'a -> (('a * int) list -> int) when 'a : equality
This paper by Graham Hutton is a great introduction to what you can do with higher order functions.
This is not directly an answer to the question: how to pattern-match the value of a variable. But it's not completely unrelated either.
If you want to see how powerful pattern-matching could be in a ML-like language similar to F# or OCaml, take a look at Moca.
You can also take a look at the code generated by Moca :) (not that there's anything wrong with the compiler doing a lot of things for you behind your back. In some cases it's even desirable, but many programmers like to feel they know what the operations they are writing will cost).
What you're trying to do is called an equality pattern, and it's not provided by Objective Caml. Objective Caml's patterns are static and purely structural. That is, whether a value matches the pattern depends solely on the value's structure, and in a way that is determined at compile time. For example, (_, _)::tail is a pattern that matches any non-empty list whose head is a pair. (identifier, value)::tail matches exactly the same values; the only difference is that the latter binds two more names identifier and value.
Although some languages have equality patterns, there are non-trivial practical considerations that make them troublesome. Which equality? Physical equality (== in Ocaml), structural equality (= in Ocaml), or some type-dependent custom equality? Furthermore, in Ocaml, there is a clear syntactic indication of which names are binders and which names are reference to previously bound values: any lowercase identifier in a pattern is a binder. These two reasons explain why Ocaml does not have equality patterns baked in. The idiomatic way to express an equality pattern in Ocaml is in a guard. That way, it's immediately clear that the matching is not structural, that identifier is not bound by this pattern matching, and which equality is in use. As for ugly, that's in the eye of the beholder — as a habitual Ocaml programmer, I find equality patterns ugly (for the reasons above).
match bindings with
| (id, value)::tail when id = identifier -> value
| (_, _)::tail -> getValue identifier tail
| [] -> -1
In F#, you have another possibility: active patterns, which let you pre-define guards that concern a single site in a pattern.
This is a common complaint, but I don't think that there's a good workaround in general; a pattern guard is usually the best compromise. In certain specific cases there are alternatives, though, such as marking literals with the [<Literal>] attribute in F# so that they can be matched against.
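A small sketch of the [<Literal>] alternative: it only applies when the value to compare against is a compile-time constant, so it would not help with a function parameter like identifier. The names Admin and describe are made up for the example.

```fsharp
// A literal must be a compile-time constant; an uppercase name lets it be used as a pattern.
[<Literal>]
let Admin = "admin"

let describe user =
    match user with
    | Admin -> "administrator" // compares against the constant, binds nothing
    | _ -> "regular user"

let first = describe "admin"
let second = describe "guest"
```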
