In OOP land, take for example Roslyn and its syntax rewriters, which use the visitor pattern.
This is very nice, as there is already a base rewriter class that defines every visit method as a no-op, so I only have to override the methods I care about.
What would be a comparable solution for ASTs built from discriminated unions (DUs)?
E.g., suppose I would like to write a function that visits every node of an AST parsed with the following snippet (not made by me).
I can write transformer functions like so:
// strip all class type modifiers because of reasons
let typeTransformer (input:CSharpType) : CSharpType =
    match input with
    | Class (access, modifier, name, implements, members) ->
        Class (access, None, name, implements, members)
    | _ -> input

let rec nameSpaceTransformer typeTransformer (input:NamespaceScope) : NamespaceScope =
    match input with
    | Namespace (imports, names, nestedNamespaces) ->
        Namespace (imports, names, List.map (nameSpaceTransformer typeTransformer) nestedNamespaces)
    | Types (names, types) ->
        Types (names, List.map typeTransformer types)
This is already pretty cumbersome, but it gets worse and worse, the deeper one gets into the tree.
Does this representation just not lend itself to these kinds of transformations?
Edit: what I am actually looking for, is a way where I can define just the specific transform functions that will then be automatically applied to the correct nodes, while everything else remains unchanged.
Here is my best try so far on a simplified example (Fable REPL)
Note the last two lets after the comment: in later usage, one should only need to write those two and then call transform replaceAllAsWithBsTransformer someAstRoot with an actual AST instance.
Of course this solution does not work correctly, because it would require recursive records (e.g., the transformMiddleNode function should really take the transformer record itself and call its transformLeaf member).
This is the part I have trouble with, and which I would say is nicely solved by the OOP visitor pattern, but I can't figure out how to mirror it successfully here.
Edit 2:
At the end of the day, I went with just implementing an actual visitor class in the form of
type Transformer() =
    abstract member TransformLeaf : Leaf -> Leaf
    default this.TransformLeaf leaf = id leaf

    abstract member TransformMiddleNode : MiddleNode -> MiddleNode
    default this.TransformMiddleNode node =
        match node with
        | MoreNodes nodeList ->
            List.map this.TransformMiddleNode nodeList
            |> MoreNodes
        | Leaf leaf -> this.TransformLeaf leaf |> Leaf

    abstract member TransformUpperNode : UpperNode -> UpperNode
    default this.TransformUpperNode node =
        match node with
        | MoreUpperNodes nodeList ->
            List.map this.TransformUpperNode nodeList |> MoreUpperNodes
        | MiddleNodes nodeList ->
            List.map this.TransformMiddleNode nodeList |> MiddleNodes
...
and then I can define specific transformations like:
type LeafTransformer() =
    inherit Transformer()
    override this.TransformLeaf leaf = someLeafTransformation leaf
where someLeafTransformation: Leaf -> Leaf
This is not any worse than the OOP solution (it is essentially the same, except that the "bottom level" visitor interfaces are replaced by pattern matching).
Certainly the code you posted is doing it the "functional way". It's not clear to me exactly how this is "cumbersome" or "gets worse the deeper one gets into the tree". I think the key is to write your functions as concisely as possible (but not so concisely that they become unreadable!) and then figure out the right mix of helper functions and higher-level functions that rely on those, plus good comments where needed.
Your first function could just be this:
let transformModifier input =
    match input with
    | Class (a, _, c, d, e) -> Class (a, None, c, d, e)
    | _ -> input
This is less verbose, but still readable. In fact, it's probably more readable as it's obvious now that the only thing this does is change the class modifier.
Perhaps you will want to create other functions that modify classes, and compose these using >>, then call them from a larger function that walks the whole tree.
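As a rough sketch of that idea, the snippet below composes two small class-level functions with >> and hands the result to one generic walker. The question's real CSharpType and NamespaceScope definitions are not shown, so the types here are simplified, hypothetical stand-ins, and makePublic and mapTypes are names invented for illustration.

```fsharp
// Hypothetical, simplified stand-ins for the AST types in the question.
type Modifier = Static | Sealed | Abstract

type CSharpType =
    | Class of access: string * modifier: Modifier option * name: string
    | Interface of access: string * name: string

type NamespaceScope =
    | Namespace of name: string * nested: NamespaceScope list
    | Types of name: string * types: CSharpType list

// Each small function changes exactly one thing.
let stripModifier input =
    match input with
    | Class (access, _, name) -> Class (access, None, name)
    | _ -> input

let makePublic input =
    match input with
    | Class (_, modifier, name) -> Class ("public", modifier, name)
    | Interface (_, name) -> Interface ("public", name)

// Compose the per-node transformations once...
let transformType = stripModifier >> makePublic

// ...and let a single walker apply them to every type in the tree.
let rec mapTypes f scope =
    match scope with
    | Namespace (name, nested) -> Namespace (name, List.map (mapTypes f) nested)
    | Types (name, types) -> Types (name, List.map f types)
```

With this shape, adding another class-level change means writing one more small function and appending it to the >> chain; the walker itself stays untouched.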
The ultimate readability of the code is going to be mostly up to you (IMO).
There are good discussions of AST transformations in the books Expert F# and F# Deep Dives.
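The "override only what you care about" behaviour asked for in the question's edit can also be approximated without classes. The sketch below assumes simplified Leaf, MiddleNode and UpperNode shapes (the original definitions are not shown) and uses a record of functions in which every function receives the record itself, so the default traversal always calls back into whatever overrides were supplied. The default* helper names are made up for this example.

```fsharp
// Assumed node shapes, modelled on the names used in the question.
type Leaf = A | B

type MiddleNode =
    | MoreNodes of MiddleNode list
    | Leaf of Leaf

type UpperNode =
    | MoreUpperNodes of UpperNode list
    | MiddleNodes of MiddleNode list

// Every field takes the whole transformer as its first argument,
// which plays the role of "this" in the visitor class.
type Transformer =
    { TransformLeaf: Transformer -> Leaf -> Leaf
      TransformMiddleNode: Transformer -> MiddleNode -> MiddleNode
      TransformUpperNode: Transformer -> UpperNode -> UpperNode }

let defaultLeaf (_: Transformer) (leaf: Leaf) = leaf

let defaultMiddleNode (t: Transformer) node =
    match node with
    | MoreNodes nodes -> MoreNodes (List.map (t.TransformMiddleNode t) nodes)
    | Leaf leaf -> Leaf (t.TransformLeaf t leaf)

let defaultUpperNode (t: Transformer) node =
    match node with
    | MoreUpperNodes nodes -> MoreUpperNodes (List.map (t.TransformUpperNode t) nodes)
    | MiddleNodes nodes -> MiddleNodes (List.map (t.TransformMiddleNode t) nodes)

// Rebuilds the tree unchanged unless a field is replaced.
let defaultTransformer =
    { TransformLeaf = defaultLeaf
      TransformMiddleNode = defaultMiddleNode
      TransformUpperNode = defaultUpperNode }

let transform (t: Transformer) (root: UpperNode) = t.TransformUpperNode t root

// Only the leaf behaviour is overridden; the traversal comes from the defaults.
let replaceAWithB leaf =
    match leaf with
    | A -> B
    | other -> other

let replaceAllAsWithBsTransformer =
    { defaultTransformer with TransformLeaf = fun _ leaf -> replaceAWithB leaf }
```

Calling transform replaceAllAsWithBsTransformer someAstRoot then walks the whole tree and touches only the leaves. The trade-off against the class-based version is mostly stylistic: the record keeps everything as plain values, at the cost of threading the transformer through each call by hand.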
what I am actually looking for, is a way where I can define just the specific transform functions that will then be automatically applied to the correct nodes, while everything else remains unchanged.
I wrote an AST transformation library called FSharp.Text.Experimental.Transform that does exactly this. Coincidentally, I had already written a C# grammar definition, so I was able to use it to try out your "strip class modifiers" problem.
Your solution, implemented using this library, starts by feeding the C# grammar definition into the GrammarProvider type provider. The type provider will provide methods for parsing the input text, and provides a type for each non-terminal in the grammar.
open FSharp.Text.Experimental.Transform
type CSharp = GrammarProvider<"csharp.grm">
Next, you define your transformation function that just operates on the nodes you care about. Since you care about transforming the modifiers for a class, you'll target the ClassModifier* part of the ClassDefinition grammar production:
// This function replaces any list of class modifiers with the empty list
let stripClassModifiers (_: CSharp.ClassModifier list) = []
Finally, you parse the input text, apply your transformation function, then unparse to a string:
CSharp.ParseFile("path/to/program.cs").ApplyOnePass(stripClassModifiers).ToString()
The ApplyOnePass() method will perform a single pass over the AST, applying your stripClassModifiers transformation wherever it finds a list of class modifiers and leaving all the other nodes untouched.
The library contains more powerful methods for more complex transformations, but I hope the example above suffices to illustrate the idea. See the library documentation for tutorials, examples, API reference, and more details on what it can do.
I have the following purescript code:
class Node a where
  parentNode :: forall b. (Node b) => a -> b
but when compiling this I get the following error:
A cycle appears in the definition of type synonym Node
Cycles are disallowed because they can lead to loops in the type checker.
Consider using a 'newtype' instead.
I am trying to write a function parentNode that returns the parent node of a node. The only guarantee for the parent node is that it is also a Node b.
I do not care what the actual type for b is.
I am basically trying to say, parentNode should be a function that returns a value that also implements the Node typeclass. Is something like this possible with type classes or is there some other idiomatic way to do this type of thing?
The type of your parentNode function says that the caller can choose the type b of the parent, but I think this is incorrect. The type should be determined by what's in the DOM, so you need an existential type.
The technical issue here is that type classes cannot currently refer to themselves.
However, in this case, I think there is a simpler solution which doesn't use classes. Why not something like this?
class IsNode a where
  toNode :: a -> Node

foreign import data Node :: *
foreign import parentNode :: Node -> Node
Is there a (simple) way, within a parsing expression grammar (PEG), to express an "unordered sequence"? A rule such as
Rule <- A B C
requires A, B and C to match in order. A rule such as
Rule <- (A B C) / (B C A) / (C A B) / (A C B) / (C B A) / (B A C)
allows them to match in any order (which is what we want) but it is cumbersome and inapplicable in practice with more terms in the sequence.
Is the only solution to use a syntactically looser rule such as
Rule <- (A / B / C){3}
and semantically check that each rule matches only once?
The fact that, e.g., Relax NG Compact Syntax has an "unordered list" operator for parsing XML makes me suspect that there is no obvious solution.
Last question: do you think the addition of such an operator would bring ambiguity to PEG?
Grammar rules express precisely the sequence of forms that you want, regardless of parsing engine (e.g., PEG, LALR, LL(k), ...) that you choose.
The only way to express, using BNF rules alone, that you want every possible ordering of exactly one of each item is the big ugly rule you proposed.
The standard solution is to simply define:
rule <- (A | B | C)*
(or whatever syntax your parser generator accepts for lists) and semantically count that only 3 forms are provided and they are unique.
Often people building parser generators add special "extended BNF" notations to let them describe special circumstances. You gave an example using {3} as special syntax meaning that you want exactly "3 of", under the assumption that the parser generator accepts this notation and does the appropriate enforcement. One can imagine an extension notation {unique} to let you describe your situation. I've never seen a parser generator that implemented that idea.
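The semantic check itself is small. As a minimal sketch, assume the parser's actions report which alternative each repetition of the loose rule matched, using a hypothetical Term tag; the check then only has to confirm that the matched tags are a permutation of the required ones. Both Term and isUnorderedSequence are names made up for this example, not part of any parser generator.

```fsharp
// Hypothetical tags recording which alternative of (A | B | C) matched.
type Term = A | B | C

// True only when the matched terms are exactly the required terms in some order.
// Sorting both lists makes the comparison independent of input order;
// the length test rejects missing or repeated terms early.
let isUnorderedSequence (required: Term list) (matched: Term list) =
    List.length matched = List.length required
    && List.sort matched = List.sort required
```

For instance, isUnorderedSequence [A; B; C] [C; A; B] holds, while [A; A; B] is rejected because A repeats and C is missing.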
Until now, I had a class like this one:
type C<'a when 'a :> A> (...)
But now I created a new type B:
type B (...) =
inherit A()
But I don't want C to support B, and this doesn't compile:
type C<'a when 'a :> A and not 'a :> B> (...)
How can I do that?
You can't and shouldn't. If B is an A, then C should handle it. If it's reasonable for C not to be able to handle B, then B shouldn't derive from A. Otherwise you're effectively breaking Liskov's Substitution Principle (or at least a variant of the same).
When you declare that B inherits from A, you're saying that it can be used as an A. If that's not the case, you shouldn't be using inheritance.
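One way to act on that advice, sketched here with made-up members since the real A, B and C are elided in the question, is for B to hold an A instead of inheriting from it. B then no longer satisfies the existing constraint, so C rejects it without needing any "not" constraint.

```fsharp
type A() =
    member this.Describe() = "A"

// B reuses A's behaviour through composition rather than inheritance,
// so B is not a subtype of A.
type B() =
    let inner = A()
    member this.Describe() = inner.Describe() + " (wrapped by B)"

// The original constraint is kept as-is.
type C<'a when 'a :> A>(item: 'a) =
    member this.Item = item

let ok = C<A>(A())
// let rejected = C<B>(B())   // would not compile: B does not derive from A
```

Whether this fits depends on whether other code relies on passing a B where an A is expected; if it does, the substitution argument above applies and C should accept B.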
If I have class A which has a dependency on class B, then class B could be passed in to the ctor of class A.
What if class B has a dependency on class C? Does that mean class A should receive all required dependencies upon construction?
In general terms, Dependency Injection suggests that a class should have all of its dependencies passed in through the constructor.
However, in your example, A depends on B and B depends on C. A therefore only needs to be passed B in its constructor, because B will already have been constructed with the C instance. If we wrote the code without a DI framework:
C c = new C();
B b = new B(c);
A a = new A(b);