Matching a character to a discriminated union - f#

I created a discriminated union which has three possible options:
type tool =
| Hammer
| Screwdriver
| Nail
I would like to match a single character to one tool option. I wrote this function:
let getTool (letter: char) =
match letter with
| H -> Tool.Hammer
| S -> Tool.Screwdriver
| N -> Tool.Nail
Visual Studio Code throws me now the warning that only the first character will be matched and that the other rules never will be.
Can somebody please explain this behaviour and maybe provide an alternative?

That's not how characters are denoted in F#. What you wrote are variable names, not characters.
To denote a character, use single quotes:
let getTool (letter: char) =
match letter with
| 'H' -> Tool.Hammer
| 'S' -> Tool.Screwdriver
| 'N' -> Tool.Nail

Apart from the character syntax (inside single quotes - see Fyodor's response), you should handle the case when the letter is not H, S or N, either using the option type or throwing an exception (less functional but enough for an exercise):
type Tool =
| Hammer
| Screwdriver
| Nail
module Tool =
let ofLetter (letter: char) =
match letter with
| 'H' -> Hammer
| 'S' -> Screwdriver
| 'N' -> Nail
| _ -> invalidArg (nameof letter) $"Unsupported letter '{letter}'"
Usage:
> Tool.ofLetter 'S';;
val it : Tool = Screwdriver
> Tool.ofLetter 'C';;
System.ArgumentException: Unsupported letter 'C' (Parameter 'letter')

Related

What is the main point of Active Patterns?

When we pattern match on a value using Active Patterns, there is a "convert" function implicitly called. So instead of writing:
match value with
| Tag1 -> ...
| Tag2 -> ...
I can explicitly write:
match convert value with
| Tag1 -> ...
| Tag2 -> ...
This way, I don't have to use Active Patterns here. Of course, I have to explicitly call the convert function, and have to explicitly declare a union type. But those are minor things to me.
So what is the main point of Active Patterns?
The primary power of pattern matching is not the funny syntax. The primary power of patterns is that they can be nested.
Take a look at this:
match value with
| Foo (Bar, Baz [First 42; Second "hello!"]) -> "It's a thing"
| Qux [42; 42; 42] -> "Triple fourty-two"
| _ -> "No idea"
Assuming all capitalized words are active patterns, let's try to rewrite the first pattern in terms of calling convert explicitly:
match convertFoo value with
| Foo (x, y) ->
match convertBar x, convertBaz y with
| (Bar, Baz [z1; z2]) ->
match convertFirst z1, convertSecond z2 with
| First 42, Second "hello!" -> "It's a thing"
Too long and convoluted? But wait, we didn't even get to write all the non-matching branches!
match convertFoo value with
| Foo (x, y) ->
match convertBar x, convertBaz y with
| (Bar, Baz [z1; z2]) ->
match convertFirst z1, convertSecond z2 with
| First 42, Second "hello!" -> "It's a thing"
| _ -> "No idea"
| _ -> "No idea"
| Qux [42; 42; 42] -> "Triple fourty-two"
| _ -> "No idea"
See how the "No idea" branch is triplicated? Isn't copy&paste wonderful? :-)
Incidentally, this is why C#'s feeble attempt at what they have the audacity to call "pattern matching" isn't really pattern matching: it can't be nested, and therefore, as you very astutely observe, it is no better than just calling classifier functions.

What is the | symbol for in f #?

I'm pretty new to functional programming and I've started looking at the documentation for match statements and in the example I came across here gitpages and cut and pasted to my question below:
let rec fib n =
match n with
| 0 -> 0
| 1 -> 1
| _ -> fib (n - 1) + fib (n - 2)
I understand that let is for static binding in this case for a recursive function called fib which takes a parameter n. It tries to match n with 3 cases. If it's 0, 1 or anything else.
What I don't understand is what the | symbol is called in this context or why it is used? Anything I search for pertaining to f-sharp pipe takes me to this |> which is the piping character in f sharp.
What is this | used for in this case? Is it required or optional? And when should be and shouldn't I be using |?
The | symbol is used for several things in F#, but in this case, it serves as a separator of cases of the match construct.
The match construct lets you pattern match on some input and handle different values in different ways - in your example, you have one case for 0, one for 1 and one for all other values.
Generally, the syntax of match looks like this:
match <input> with <case_1> | ... | <case_n>
Where each <case> has the following structure:
<case> = <pattern> -> <expression>
Here, the | symbol simply separates multiple cases of the pattern matching expression. Each case then has a pattern and an expression that is evaluated when the input matches the pattern.
To expand on Tomas's excellent answer, here are some more of the various uses of | in F#:
Match expressions
In match expressions, | separates the various patterns, as Tomas has pointed. While you can write the entire match expression on a single line, it's conventional to write each pattern on a separate line, lining up the | characters, so that they form a visual indicator of the scope of the match statement:
match n with
| 0 -> "zero"
| 1 -> "one"
| 2 -> "two"
| 3 -> "three"
| _ -> "something else"
Discriminated Unions
Discriminated Unions (or DUs, since that's a lot shorter to type) are very similar to match expressions in style: defining them means listing the possibilities, and | is used to separate the possibilities. As with match expressions, you can (if you want to) write DUs on a single line:
type Option<'T> = None | Some of 'T
but unless your DU has just two possibilities, it's usually better to write it on multiple lines:
type ContactInfo =
| Email of string
| PhoneNumber of areaCode : string * number : string
| Facebook of string
| Twitter of string
Here, too, the | ends up forming a vertical line that draws the eye to the possibilities of the DU, and makes it very clear where the DU definition ends.
Active patterns
Active patterns also use | to separate the possibilities, but they also are wrapped inside an opening-and-closing pair of | characters:
let (Even|Odd) n = if n % 2 = 0 then Even else Odd // <-- Wrong!
let (|Even|Odd|) n = if n % 2 = 0 then Even else Odd // <-- Right!
Active patterns are usually written in the way I just showed, with the | coming immediately inside the parentheses, which is why some people talk about "banana clips" (because the (| and |) pairs look like bananas if you use your imagination). But in fact, it's not necessary to write the (| and |) characters together: it's perfectly valid to have spaces separating the parentheses from the | characters:
let (|Even|Odd|) n = if n % 2 = 0 then Even else Odd // <-- Right!
let ( |Even|Odd| ) n = if n % 2 = 0 then Even else Odd // <-- ALSO right!
Unrelated things
The pipe operator |> and the Boolean-OR operator || are not at all the same thing as uses of the | operator. F# allows operators to be any combination of symbols, and they can have very different meanings from an operator that looks almost the same. For example, >= is a standard operator that means "greater than". And many F# programs will define a custom operator >>=. But although >>= is not defined in the F# core library, it has a standard meaning, and that standard meaning is NOT "a lot greater than". Rather, >>= is the standard way to write an operator for the bind function. I won't get into what bind does right now, as that's a concept that could take a whole answer all on its own to go through. But if you're curious about how bind works, you can read Scott Wlaschin's series on computation expressions, which explains it all very well.

Confused at transferring an ambiguous grammar to an unambiguous one

An ambiguous grammar is given and I am asked to rewrite the grammar to make it unambiguous. In fact, I don't know why the given grammar is ambiguous, let alone rewriting it to an unambiguous one.
The given grammar is S -> SS | a | b , and I have four choices:
A: S -> Sa | Sb | epsilon
B: S -> SS’
S’-> a | b
C: S -> S | S’
S’-> a | b
D: S -> Sa | Sb.
For each choice, I have already know that D is incorrect because it generates no strings at all,C is incorrect because it only matches the strings 'a' and 'b'.
However, I think the answer is A while the correct answer is B.I think B is wrong because it just generates S over and over again, and B can't deal with empty strings.
Why is the given grammar ambiguous?
Why is A incorrect while B is correct?
The original grammar is ambiguous because multiple right-most (or left-most) derivations are possible for any string of at least three letters. For example:
S -> SS -> SSS -> SSa -> Saa -> aaa
S -> SS -> Sa -> SSa -> Saa -> aaa
The first one corresponds, roughly speaking, to the parse a(aa) while the second to the parse (aa)a.
None of the alternatives is correct. A incorrectly matches ε while B does not match anything (like D). If B were, for example,
S -> SS' | S'
S' -> a | b
it would be correct. (This grammar is left-associative.)

Location in syntax trees

When writing a parser, I want to remember the location of lexemes found, so that I can report useful error messages to the programmer, as in “if-less else on line 23” or ”unexpected character on line 45, character 6” or “variable not defined” or something similar. But once I have built the syntax tree, I will transform it in several ways, optimizing or expanding some kind of macros. The transformations produce or rearrange lexemes which do not have a meaningful location.
Therefore it seems that the type representing the syntax tree should come in two flavor, a flavor with locations decorating lexemes and a flavor without lexemes. Ideally we would like to work with a purely abstract syntax tree, as defined in the OCaml book:
# type unr_op = UMINUS | NOT ;;
# type bin_op = PLUS | MINUS | MULT | DIV | MOD
| EQUAL | LESS | LESSEQ | GREAT | GREATEQ | DIFF
| AND | OR ;;
# type expression =
ExpInt of int
| ExpVar of string
| ExpStr of string
| ExpUnr of unr_op * expression
| ExpBin of expression * bin_op * expression ;;
# type command =
Rem of string
| Goto of int
| Print of expression
| Input of string
| If of expression * int
| Let of string * expression ;;
# type line = { num : int ; cmd : command } ;;
# type program = line list ;;
We should be allowed to totally forget about locations when working on that tree and have special functions to map an expression back to its location (for instance), that we could use in case of emergency.
What is the best way to define such a type in OCaml or to handle lexeme positions?
The best way is to work always with AST nodes fully annotated with the locations. For example:
type expression = {
expr_desc : expr_desc;
expr_loc : Lexing.position * Lexing.position; (* start and end *)
}
and expr_desc =
ExpInt of int
| ExpVar of string
| ExpStr of string
| ExpUnr of unr_op * expression
| ExpBin of expression * bin_op * expression
Your idea, keeping the AST free of locations and writing a function to retrieve the missing locations is not a good idea, I believe. Such a function should require searching by pointer equivalence of AST nodes or something similar, which does not really scale.
I strongly recommend to look though OCaml compiler's parser.mly which is a full scale example of AST with locations.

F# pattern matching: how to match a set of possible types that share the same parameters?

I'm new to F# and not quite familiar with the whole pattern matching idea.
I tried to search for a better solution to my problem but I fear I can't even express the problem properly – I hope the question title is at least somewhat accurate.
What I want to do is extract 2 "parameters" from listMethod.
listMethod is of one of several types that have a string and an Expression "parameter" (I suspect parameter is the wrong term):
let (varDecl, listExpr) =
match listMethod with
| Select (var, expr) -> (var, expr)
| Where (var, expr) -> (var, expr)
| Sum (var, expr) -> (var, expr)
| Concat (var, expr) -> (var, expr)
Then I continue to work with varDecl and at the end have a similar match expression with the actual listMethod code that makes use of several temporary variables I created based on varDecl.
My question now is: How can I make the above code more compact?
I want to match all those types that have 2 parameters (of type string and Expression) without listing them all myself, which is kinda ugly and hard to maintain.
The ListMethod type is declared as follows (the whole thing is a FsLex/FsYacc project):
type ListMethod =
| Select of string * Expr
| Where of string * Expr
| Sum of string * Expr
| Concat of string * Expr
| ...
| somethingElse of Expr
(as of now I only have types of the form string * Expr, but that will change).
I reckon that this is a fairly dumb question for anyone with some experience, but as I've said I'm new to F# and couldn't find a solution myself.
Thanks in advance!
Edit: I'd really like to avoid listing all possible types of listMethod twice. If there's no way I can use wildcards or placeholders in the match expressions, perhaps I can modify the listMethod type to make things cleaner.
One option that comes to mind would be creating only 1 type of listMethod and to create a third parameter for the concrete type (Select, Where, Sum).
Or is there a better approach?
This is probably the standard way:
let (varDecl, listExpr) =
match listMethod with
| Select (var, expr)
| Where (var, expr)
| Sum (var, expr)
| Concat (var, expr) -> (var, expr)
The | sign means or, so if one of these match, the result will be returned. Just make sure that every case has exactly the same names (and types).
As Chuck commented, this is an even better solution:
let (Select (varDecl, expr)
| Where (varDecl, expr)
| Sum (varDecl, expr)
| Concat (varDecl, expr)) = listMethod
I reckon that this is a fairly dumb question for anyone with some experience, but as I've said I'm new to F# and couldn't find a solution myself.
On the contrary, this is a very good question and actually relatively untrodden ground because F# differs from other languages in this regard (e.g. you might solve this problem using polymorphic variants in OCaml).
As Ankur wrote, the best solution is always to change your data structure to make it easier to do what you need to do if that is possible. KVB's solution of using active patterns is not only valuable but also novel because that language feature is uncommon in other languages. Ramon's suggestion to combine your match cases using or-patterns is also good but you don't want to write incomplete pattern matches.
Perhaps the most common example of this problem arising in practice is in operators:
type expr =
| Add of expr * expr
| Sub of expr * expr
| Mul of expr * expr
| Div of expr * expr
| Pow of expr * expr
| ...
where you might restructure your type as follows:
type binOp = Add | Sub | Mul | Div | Pow
type expr =
| BinOp of binOp * expr * expr
| ...
Then tasks like extracting subexpressions:
let subExprs = function
| Add(f, g)
| Sub(f, g)
| Mul(f, g)
| Div(f, g)
| Pow(f, g) -> [f; g]
| ...
can be performed more easily:
let subExprs = function
| BinOp(_, f, g) -> [f; g]
| ...
Finally, don't forget that you can augment F# types (such as union types) with OOP constructs such as implementing shared interfaces. This can also be used to express commonality, e.g. if you have two overlapping requirements on two types then you might make them both implement the same interface in order to expose this commonality.
In case you are ok to do adjustments to your data structure then below is something that will ease out the pattern matching.
type ListOperations =
Select | Where | Sum | Concat
type ListMethod =
| ListOp of ListOperations * string * Expr
| SomethingElse of int
let test t =
match t with
| ListOp (a,b,c) -> (b,c)
| _ -> ....
A data structure should be designed by keeping in mind the operation you want to perform on it.
If there are times when you will want to treat all of your cases the same and other times where you will want to treat them differently based on whether you are processing a Select, Where, Sum, etc., then one solution would be to use an active pattern:
let (|OperatorExpression|_|) = function
| Select(var, expr) -> Some(Select, var, expr)
| Where (var, expr) -> Some(Where, var, expr)
| Sum (var, expr) -> Some(Sum, var, expr)
| Concat (var, expr) -> Some(Concat, var, expr)
| _ -> None
Now you can still match normally if you need to treat the cases individually, but you can also match using the active pattern:
let varDecl, listExp =
match listMethod with
| OperatorExpression(_, v, e) -> v, e
| _ -> // whatever you do for other cases...

Resources