What is the main point of Active Patterns? - f#

When we pattern match on a value using Active Patterns, there is a "convert" function implicitly called. So instead of writing:
match value with
| Tag1 -> ...
| Tag2 -> ...
I can explicitly write:
match convert value with
| Tag1 -> ...
| Tag2 -> ...
This way, I don't have to use Active Patterns here. Of course, I have to explicitly call the convert function, and have to explicitly declare a union type. But those are minor things to me.
So what is the main point of Active Patterns?

The primary power of pattern matching is not the funny syntax. The primary power of patterns is that they can be nested.
Take a look at this:
match value with
| Foo (Bar, Baz [First 42; Second "hello!"]) -> "It's a thing"
| Qux [42; 42; 42] -> "Triple fourty-two"
| _ -> "No idea"
Assuming all capitalized words are active patterns, let's try to rewrite the first pattern in terms of calling convert explicitly:
match convertFoo value with
| Foo (x, y) ->
match convertBar x, convertBaz y with
| (Bar, Baz [z1; z2]) ->
match convertFirst z1, convertSecond z2 with
| First 42, Second "hello!" -> "It's a thing"
Too long and convoluted? But wait, we didn't even get to write all the non-matching branches!
match convertFoo value with
| Foo (x, y) ->
match convertBar x, convertBaz y with
| (Bar, Baz [z1; z2]) ->
match convertFirst z1, convertSecond z2 with
| First 42, Second "hello!" -> "It's a thing"
| _ -> "No idea"
| _ -> "No idea"
| Qux [42; 42; 42] -> "Triple fourty-two"
| _ -> "No idea"
See how the "No idea" branch is triplicated? Isn't copy&paste wonderful? :-)
Incidentally, this is why C#'s feeble attempt at what they have the audacity to call "pattern matching" isn't really pattern matching: it can't be nested, and therefore, as you very astutely observe, it is no better than just calling classifier functions.

Related

Matching a character to a discriminated union

I created a discriminated union which has three possible options:
type tool =
| Hammer
| Screwdriver
| Nail
I would like to match a single character to one tool option. I wrote this function:
let getTool (letter: char) =
match letter with
| H -> Tool.Hammer
| S -> Tool.Screwdriver
| N -> Tool.Nail
Visual Studio Code throws me now the warning that only the first character will be matched and that the other rules never will be.
Can somebody please explain this behaviour and maybe provide an alternative?
That's not how characters are denoted in F#. What you wrote are variable names, not characters.
To denote a character, use single quotes:
let getTool (letter: char) =
match letter with
| 'H' -> Tool.Hammer
| 'S' -> Tool.Screwdriver
| 'N' -> Tool.Nail
Apart from the character syntax (inside single quotes - see Fyodor's response), you should handle the case when the letter is not H, S or N, either using the option type or throwing an exception (less functional but enough for an exercise):
type Tool =
| Hammer
| Screwdriver
| Nail
module Tool =
let ofLetter (letter: char) =
match letter with
| 'H' -> Hammer
| 'S' -> Screwdriver
| 'N' -> Nail
| _ -> invalidArg (nameof letter) $"Unsupported letter '{letter}'"
Usage:
> Tool.ofLetter 'S';;
val it : Tool = Screwdriver
> Tool.ofLetter 'C';;
System.ArgumentException: Unsupported letter 'C' (Parameter 'letter')

How can I simplify a recursive-descent parser?

I have the following simple LL(1) grammar, which describes a language with only three valid sentences: "", "x y" and "z x y":
S -> A x y | ε .
A -> z | ε .
I have constructed the following parsing table, and from it a "naive" recursive-descent parser:
| x | y | z | $
S | S -> A x y | | S -> A x y | S -> ε
A | A -> ε | | A -> z |
func S():
if next() in ['x', 'z']:
A()
expect('x')
expect('y')
expect('$')
elif next() == '$':
pass
else:
error()
func A():
if next() == 'x':
pass
elif next() == 'z':
expect('z')
else:
error()
However, the function A seems to be more complicated than necessary. All of my tests still pass if it's simplified to:
func A():
if next() == 'z':
expect('z')
Is this a valid simplification of A? If so, are there any general rules regarding when it's valid to make simplifications like this one?
That simplification is certainly valid (and quite common).
The main difference is that there is no code associated with the production A→ε. If there are some semantics to implement, you will need to test for the condition. If you only need to ignore the nullable production, you can certainly just return.
Coalescing errors and epsilon productions has one other difference: the error (for example, in the input y) is detected later, after A() returns. Sometimes that makes it harder to produce good error messages (and sometimes it doesn't).

F# matching expression and 'as' pattern

I'm studying some F# and trying to implement a parser combinator like in this tutorial; after some copy and paste of the proposed solution, I tried to customize it by myself.
Surely there's something I missed but the compiler gives me a weird error message.
type T<'a> =
| A of string
| B of 'a
let foo a b =
match a with
| A s as x -> x
| B i ->
match b with
| A s as x -> x
| B j ->
B (i, j)
The code above is a generalization of the issue I found: the error is notified in the last result (the B branch of the innermost matching expression):
error FS0001: Type mismatch. Expecting a
'a
but given a
'a * 'b
The resulting type would be infinite when unifying ''a' and ''a * 'b'
But if I don't use the as pattern:
let foo a b =
match a with
| A s -> A s // it can also be '| A s as x -> A s'
| B i ->
match b with
| A s -> A s
| B j ->
B (i, j)
then the compiler is happy again.
I don't get why I have to recreate the same logical result if it is already there.
In
A s as x -> x
x has type T<'a> while the required return type for foo is T<('a * 'a)>. Even though the A case does not contain any values of 'a it does not have a polymorphic type like forall 'a . T<'a>. In your second example you are creating a new instance of A which can have the required type.

Why are multiple branches of a match expressions evaluated?

I am using a match expression and getting a much different result than I expected. In my case the value of wi.State = "Done" so I expected the wi.Close() call to be executed and that's it. However it runs that and then runs the statements that come after the _ -> catch-all match as well. Clearly I misunderstand how these expressions are supposed so work so I will appreciate any pointers.
workItemCollection.Cast<WorkItem>()
|> Seq.iter(fun wi ->
wi.Open()
match wi.State with
| "Done"
| "Removed" ->
wi.Close()
| _ ->
printfn "Closing %s -> %i" iccmCallNumber wi.Id
wi.State <- "Done"
wi.History <- wi.History + "Closed by Iccm to Tfs integration"
wi.Save())

F#/OCaml: How to avoid duplicate pattern match?

Have a look at this F#/OCaml code:
type AllPossible =
| A of int
| B of int*int
| ...
| Z of ...
let foo x =
....
match x with
| A(value) | B(value,_) -> (* LINE 1 *)
(* do something with the first (or only, in the case of A) value *)
...
(* now do something that is different in the case of B *)
let possibleData =
match x with
| A(a) -> bar1(a)
| B(a,b) -> bar2(a+b)
| _ -> raise Exception (* the problem - read below *)
(* work with possibleData *)
...
| Z -> ...
So what is the problem?
In function foo, we pattern match against a big list of types.
Some of the types share functionality - e.g. they have common
work to do, so we use "|A | B ->" in LINE 1, above.
We read the only integer (in the case of A), or the first integer
(in the case of B) and do something with it.
Next, we want to do something that is completely different, depending
on whether we work on A or B (i.e. call bar1 or bar2).
We now have to pattern match again, and here's the problem: In this
nested pattern match, unless we add a 'catchAll' rule (i.e. '_'),
the compiler complains that we are missing cases - i.e. it doesn't
take into account that only A and B can happen here.
But if we add the catchAll rule, then we have a far worse problem:
if at some point we add more types in the list of LINE1
(i.e. in the line '|A | B ->' ... then the compiler will NOT help
us in the nested match - the '_' will catch them, and a bug will
be detected at RUNTIME. One of the most important powers of
pattern matching - i.e. detecting such errors at compile-time - is lost.
Is there a better way to write this kind of code, without having
to repeat whatever work is shared amongst A and B in two separate
rules for A and B? (or putting the A-and-B common work in a function
solely created for the purpose of "local code sharing" between A and B?)
EDIT: Note that one could argue that the F# compiler's behaviour is buggy in this case -
it should be able to detect that there's no need for matching beyond A and B
in the nested match.
If the datatype is set in stone - I would also prefer local function.
Otherwise, in OCaml you could also enjoy open (aka polymorphic) variants :
type t = [`A | `B | `C]
let f = function
| (`A | `B as x) ->
let s = match x with `A -> "a" | `B -> "b" in
print_endline s
| `C -> print_endline "ugh"
I would just put the common logic in a local function, should be both faster and more readable. Matches nested that way is pretty hard to follow, and putting the common logic in a local function allows you to ditch the extra matching in favour of something that'll get inlined anyway.
Hmm looks like you need to design the data type a bit differently such as:
type AorB =
| A of int
| B of int * int
type AllPossible =
| AB of AorB
| C of int
.... other values
let foo x =
match x with
| AB(v) ->
match v with
| A(i) -> () //Do whatever need for A
| B(i,v) -> () // Do whatever need for B
| _ -> ()
Perhaps the better solution is that rather than
type All =
|A of int
|B of int*int
you have
type All =
|AorB of int * (int Option)
If you bind the data in different ways later on you might be better off using an active pattern rather than a type, but the result would be basically the same
I don't really agree that this should be seen as a bug - although it would definitely be convenient if the case was handled by the compiler.
The C# compiler doesn't complain to the following and you wouldn't expect it to:
var b = true;
if (b)
if (!b)
Console.WriteLine("Can never be reached");

Resources