I am creating a simple DSL with multiple discriminated unions(DU) . There is building block DU and higher DUs are build on top of lower ones.
Now I want to create UI where user can build a text which matches my DSL. to UI I don't want to express my full grammar, but show only possible actions that can be done. So I need a way to figure out from my hierarchical DU, what are other possible states that user can do.
sample input text (1 + (2 * 3))
type Expression =
| Constant of int
| Add of Expression * Expression
| Mul of expression * Expression
so when user starts, I have to return a list saying only Constant can be used.
when user passes (constant ) as his current state, I have to tell you can add/Mul (which is expression) and so on.
I want to represent a structure which says, current state and possible states to go from in a type safe way. Is there a way to solve this kind of problem in f#
I think you're looking for FSharpType.GetUnionCases. You can use this as follows:
open System
open Microsoft.FSharp.Collections
open Microsoft.FSharp.Reflection
type Expression =
| Constant of int
| Add of Expression * Expression
| Mul of Expression * Expression
typeof<Expression> |> FSharpType.GetUnionCases|> Dump |> ignore
LinqPad for the above here
Related
I have those rules to build a simple calculator :
statement -> assignment | calculation
assignment -> variable '=' sum end
calculation -> sum end
sum -> product (('+' product)|('-' product))*
product -> factor (('*' factor)|('/' factor))*
factor -> term
term -> variable | number
My problem is how to model the rules for postfix and prefix increment and decrement. How can represent it in this grammar above so that, for example, if I have the assignment :
x=1
j=x++ +2
the result will be j=3 and x=2. How do I do post-increment after assignment?
The simplest grammar change would be to add the new operators to term:
term -> variable
| '++' variable | '--' variable
| variable '++' | variable '--'
| number
The new rules could have been added to factor instead, particularly since factor currently has no point at all and could be removed. However, if you ever add more complicated lvalues than a single variable (array subscripts for example) then that will have to be adjusted. Also, adding the operators to factor would make nonsense like ++2 syntactically possible, or (a+b)++ once you implement parentheses. So, although putting them in some non-terminal other than term is more common and probably more appropriate, it's not necessarily the best solution in this particular case.
The questions about the AST and the evaluation of the AST can't be answered without knowing a lot more about how you structure your ASTs. You're free to build ASTs in any way you feel appropriate, but it's probably worth noting that the AST must be able to distinguish between post- and pre-increment. Either you need to use a different operator symbol for the two cases, or you need some hack (such as the C++ hack of adding a fake operand to one of the two cases).
I looked at how to make a tree from a given data with F# and https://citizen428.net/blog/learning-fsharp-binary-search-tree/
Basically what I am attempting to do is to implementing a function for building an extremely simple AST using discriminated unions (DU) to represent the tree.
I want to use tokens/symbols to build the tree. I think these could also be represented by DU. I am struggling to implement the insert function.
Let's just say we use the following to represent the tree. The basic idea is that for addition and subtraction of integers I'll only need binary tree. The Expression could either be an operator or a constant. This might be the wrong way of implementing the tree, but I'm not sure.
type Tree =
| Node of Tree * Expression * Tree
| Empty
and Expression =
| Operator //could be a token or another type
| Constant of int
And let's use the following for representing tokens. There's probably a smarter way of doing this. This is just an example.
type Token =
| Integer
| Add
| Subtract
How should I implement the insert function? I've written the function below and tried different ways of inserting elements.
let rec insert tree element =
match element, tree with
//use Empty to initalize
| x, Empty -> Node(Empty, x, Empty)
| x, Node(Empty,y,Empty) when (*x is something here*) -> Node((*something*))
| _, _ -> failwith "Missing case"
If you got any advice or maybe a link then I would appreciate it.
I think that thinking about the problem in terms of tree insertion is not very helpful, because what you really want to do is to parse a sequence of tokens. So, a plain tree insertion is not very useful. You instead need to construct the tree (expression) in a more specific way.
For example, say I have:
let input = [Integer 1; Add; Integer 2; Subtract; Integer 1;]
Say I want to parse this sequence of tokens to get a representation of 1 + (2 - 1) (which has parentheses in the wrong way, but it makes it easier to explain the idea).
My approach would be to define a recursive Expression type rather than using a general tree:
type Token =
| Integer of int
| Add
| Subtract
type Operator =
| AddOp | SubtractOp
type Expression =
| Binary of Operator * Expression * Expression
| Constant of int
To parse a sequence of tokens, you can write something like:
let rec parse input =
match input with
| Integer i::Add::rest ->
Binary(AddOp, Constant i, parse rest)
| Integer i::Subtract::rest ->
Binary(SubtractOp, Constant i, parse rest)
| Integer i::[] ->
Constant i
| _ -> failwith "Unexpected token"
This looks for lists starting with Integer i; Add; ... or similar with subtract and constructs a tree recursively. Using the above input, you get:
> parse input;;
val it : Expression =
Binary (AddOp, Constant 1,
Binary (SubtractOp, Constant 2, Constant 1))
I'm trying to use Menhir's incremental parsing API and introspection APIs in a generated parser. I want to, say, determine the semantic value associated with a particular LR(1) stack entry; i.e. a token that's been previously consumed by the parser.
Given an abstract parsing checkpoint, encapsulated in Menhir's type 'a env, I can extract a “stack element” from the LR automaton; it looks like this:
type element =
| Element: 'a lr1state * 'a * position * position -> element
The type element describes one entry in the stack of the LR(1) automaton. In a stack element of the form Element (s, v, startp, endp), s is a (non-initial) state and v is a semantic value. The value v is associated with the incoming symbol A of the state s. In other words, the value v was pushed onto the stack just before the state s was entered. Thus, for some type 'a, the state s has type 'a lr1state and the value v has type 'a ...
In order to do anything useful with the value v, one must gain information about the type 'a, by inspection of the state s. So far, the type 'a lr1state is abstract, so there is no way of inspecting s. The inspection API (§9.3) offers further tools for this purpose.
Okay, cool! So I go and dive into the inspection API:
The type 'a terminal is a generalized algebraic data type (GADT). A value of type 'a terminal represents a terminal symbol (without a semantic value). The index 'a is the type of the semantic values associated with this symbol ...
type _ terminal =
| T_A : unit terminal
| T_B : int terminal
The type 'a nonterminal is also a GADT. A value of type 'a nonterminal represents a nonterminal symbol (without a semantic value). The index 'a is the type of the semantic values associated with this symbol ...
type _ nonterminal =
| N_main : thing nonterminal
Piecing these together, I get something like the following (where "command" is one of my grammar's nonterminals, and thus N_command is a string nonterminal):
let current_command (env : 'a env) =
let rec f i =
match Interpreter.get i env with
| None -> None
| Some Interpreter.Element (lr1state, v, _startp, _endp) ->
match Interpreter.incoming_symbol lr1state with
| Interpreter.N Interpreter.N_command -> Some v
| _ -> f (i + 1)
in
f 0
Unfortunately, this is puking up very confusing type-errors for me:
File "src/incremental.ml", line 110, characters 52-53:
Error: This expression has type string but an expression was expected of type
string
This instance of string is ambiguous:
it would escape the scope of its equation
This is a bit above my level! I'm pretty sure I understand why I can't do what I tried to do above; but I don't understand what my alternatives are. In fact, the Menhir manual specifically mentions this complexity:
This function can be used to gain access to the semantic value v in a stack element Element (s, v, _, _). Indeed, by case analysis on the symbol incoming_symbol s, one gains information about the type 'a, hence one obtains the ability to do something useful with the value v.
Okay, but that's what I thought I did, above: case-analysis by match'ing on incoming_symbol s, pulling out the case where v is of a single, specific type: string.
tl;dr: how do I extract the string payload from this GADT, and do something useful with it?
If your error sounds like
This instance of string is ambiguous:
it would escape the scope of its equation
it means that the type checker is not really sure if outside of the pattern matching branch the type of v should be a string, or another type that is equal to string but only inside the branch. You just need to add a type annotation when leaving the branch to remove this ambiguity:
| Interpreter.(N N_command) -> Some (v:string)
This function:
let convert (v: float<_>) =
match v with
| :? float<m> -> v / 0.1<m>
| :? float<m/s> -> v / 0.2<m/s>
| _ -> failwith "unknown"
produces an error
The type 'float<'u>' does not have any proper subtypes and cannot be used as the source of a type test or runtime coercion.
Is there any way how to pattern match units of measure?
As #kvb explains in detail, the problem is that units of measure are a part of the type. This means that float<m> is different type than float<m/s> (and unfortunately, this information isn't stored as part of the value at runtime).
So, you're actually trying to write a function that would work with two different types of input. The clean functional solution is to declare a discriminated union that can hold values of either the first type or the second type:
type SomeValue =
| M of float<m>
| MPS of float<m/s>
Then you can write the function using ordinary pattern matching:
let convert v =
match v with
| M v -> v / 0.1<m>
| MPS v -> v / 0.2<m/s>
You'll need to explicitly wrap the values into the discriminated union value, but it's probably the only way to do this directly (without making some larger changes in the program structure).
For normal types like int and float, you could also use overloaded members (declared in some F# type), but that doesn't work for units of measure, because the signature will be the same after the F# compiler erases the unit information.
There are two problems with your approach. First of all, when you use an underscore in the definition of your function, that's the same as using a fresh type variable, so your definition is equivalent to the following:
let convert (v: float<'u>) = //'
match v with
| :? float<m> -> v / 0.1<m>
| :? float<m/s> -> v / 0.2<m/s>
| _ -> failwith "unknown"
What the error message is telling you is that the compiler know that v is of type float<'u>, and float<'u> has no proper subtypes, so there's no point in doing a type test to determine if it's a float<m> or any other type.
You might try to get around this by first boxing v into an object and then doing a type test. This would work, for instance, if you had a list<'a> and wanted to see if it were a list<int>, because full type information about generic objects is tracked at runtime including generic type parameters (notably, this is different from how some other runtimes like Java's work). Unfortunately, F# units of measure are erased at runtime, so this won't work here - there is no way for the system to infer the correct measure type given a boxed representation, since at runtime the value is just a plain float - F#'s system for units of measure is actually quite similar in this respect to how Java handles generic types.
As an aside, what you're trying to do seems quite suspect - functions which are generic in the unit of measure shouldn't do different things depending on what the measure type is; they should be properly parametric. What exactly are you trying to achieve? It certainly doesn't look like an operation which corresponds to physical reality, which is the basis for F#'s measure types.
See the Units at Runtime Section at http://msdn.microsoft.com/en-us/library/dd233243.aspx.
I agree with #kvb, I think the best way around this is to pass an object.
What I would like to do, using your code structure:
let convert (v: float<_>) =
match v with
| :? float<m> -> v<m>
| :? float<inches> -> v * 2.54 / 100.0<m>
I have a function of the form
'a -> ('a * int) list -> int
let rec getValue identifier bindings =
match bindings with
| (identifier, value)::tail -> value
| (_, _)::tail -> getValue identifier tail
| [] -> -1
I can tell that identifier is not being bound the way I would like it to and is acting as a new variable within the match expression. How to I get identifier to be what is passed into the function?
Ok! I fixed it with a pattern guard, i.e. | (i, value)::tail when i = indentifier -> value
but I find this ugly compared to the way I originally wanted to do it (I'm only using these languages because they are pretty...). Any thoughts?
You can use F# active patterns to create a pattern that will do exactly what you need. F# supports parameterized active patterns that take the value that you're matching, but also take an additional parameter.
Here is a pretty stupid example that fails when the value is zero and otherwise succeeds and returns the addition of the value and the specified parameter:
let (|Test|_|) arg value =
if value = 0 then None else Some(value + arg)
You can specify the parameter in pattern matching like this:
match 1 with
| Test 100 res -> res // 'res' will be 101
Now, we can easily define an active pattern that will compare the matched value with the input argument of the active pattern. The active pattern returns unit option, which means that it doesn't bind any new value (in the example above, it returned some value that we assigned to a symbol res):
let (|Equals|_|) arg x =
if (arg = x) then Some() else None
let foo x y =
match x with
| Equals y -> "equal"
| _ -> "not equal"
You can use this as a nested pattern, so you should be able to rewrite your example using the Equals active pattern.
One of the beauties of functional languages is higher order functions. Using those functions we take the recursion out and just focus on what you really want to do. Which is to get the value of the first tuple that matches your identifier otherwise return -1:
let getValue identifier list =
match List.tryFind (fun (x,y) -> x = identifier) list with
| None -> -1
| Some(x,y) -> y
//val getValue : 'a -> (('a * int) list -> int) when 'a : equality
This paper by Graham Hutton is a great introduction to what you can do with higher order functions.
This is not directly an answer to the question: how to pattern-match the value of a variable. But it's not completely unrelated either.
If you want to see how powerful pattern-matching could be in a ML-like language similar to F# or OCaml, take a look at Moca.
You can also take a look at the code generated by Moca :) (not that there's anything wrong with the compiler doing a lot of things for you in your back. In some cases, it's desirable, even, but many programmers like to feel they know what the operations they are writing will cost).
What you're trying to do is called an equality pattern, and it's not provided by Objective Caml. Objective Caml's patterns are static and purely structural. That is, whether a value matches the pattern depends solely on the value's structure, and in a way that is determined at compile time. For example, (_, _)::tail is a pattern that matches any non-empty list whose head is a pair. (identifier, value)::tail matches exactly the same values; the only difference is that the latter binds two more names identifier and value.
Although some languages have equality patterns, there are non-trivial practical considerations that make them troublesome. Which equality? Physical equality (== in Ocaml), structural equality (= in Ocaml), or some type-dependent custom equality? Furthermore, in Ocaml, there is a clear syntactic indication of which names are binders and which names are reference to previously bound values: any lowercase identifier in a pattern is a binder. These two reasons explain why Ocaml does not have equality patterns baked in. The idiomatic way to express an equality pattern in Ocaml is in a guard. That way, it's immediately clear that the matching is not structural, that identifier is not bound by this pattern matching, and which equality is in use. As for ugly, that's in the eye of the beholder — as a habitual Ocaml programmer, I find equality patterns ugly (for the reasons above).
match bindings with
| (id, value)::tail when id = identifier -> value
| (_, _)::tail -> getValue identifier tail
| [] -> -1
In F#, you have another possibility: active patterns, which let you pre-define guards that concern a single site in a pattern.
This is a common complaint, but I don't think that there's a good workaround in general; a pattern guard is usually the best compromise. In certain specific cases there are alternatives, though, such as marking literals with the [<Literal>] attribute in F# so that they can be matched against.