F#/OCaml: How to avoid duplicate pattern match? - f#

Have a look at this F#/OCaml code:
type AllPossible =
| A of int
| B of int*int
| ...
| Z of ...
let foo x =
....
match x with
| A(value) | B(value,_) -> (* LINE 1 *)
(* do something with the first (or only, in the case of A) value *)
...
(* now do something that is different in the case of B *)
let possibleData =
match x with
| A(a) -> bar1(a)
| B(a,b) -> bar2(a+b)
| _ -> raise Exception (* the problem - read below *)
(* work with possibleData *)
...
| Z -> ...
So what is the problem?
In function foo, we pattern match against a big list of types.
Some of the types share functionality - e.g. they have common
work to do, so we use "|A | B ->" in LINE 1, above.
We read the only integer (in the case of A), or the first integer
(in the case of B) and do something with it.
Next, we want to do something that is completely different, depending
on whether we work on A or B (i.e. call bar1 or bar2).
We now have to pattern match again, and here's the problem: In this
nested pattern match, unless we add a 'catchAll' rule (i.e. '_'),
the compiler complains that we are missing cases - i.e. it doesn't
take into account that only A and B can happen here.
But if we add the catchAll rule, then we have a far worse problem:
if at some point we add more types in the list of LINE1
(i.e. in the line '|A | B ->' ... then the compiler will NOT help
us in the nested match - the '_' will catch them, and a bug will
be detected at RUNTIME. One of the most important powers of
pattern matching - i.e. detecting such errors at compile-time - is lost.
Is there a better way to write this kind of code, without having
to repeat whatever work is shared amongst A and B in two separate
rules for A and B? (or putting the A-and-B common work in a function
solely created for the purpose of "local code sharing" between A and B?)
EDIT: Note that one could argue that the F# compiler's behaviour is buggy in this case -
it should be able to detect that there's no need for matching beyond A and B
in the nested match.

If the datatype is set in stone - I would also prefer local function.
Otherwise, in OCaml you could also enjoy open (aka polymorphic) variants :
type t = [`A | `B | `C]
let f = function
| (`A | `B as x) ->
let s = match x with `A -> "a" | `B -> "b" in
print_endline s
| `C -> print_endline "ugh"

I would just put the common logic in a local function, should be both faster and more readable. Matches nested that way is pretty hard to follow, and putting the common logic in a local function allows you to ditch the extra matching in favour of something that'll get inlined anyway.

Hmm looks like you need to design the data type a bit differently such as:
type AorB =
| A of int
| B of int * int
type AllPossible =
| AB of AorB
| C of int
.... other values
let foo x =
match x with
| AB(v) ->
match v with
| A(i) -> () //Do whatever need for A
| B(i,v) -> () // Do whatever need for B
| _ -> ()

Perhaps the better solution is that rather than
type All =
|A of int
|B of int*int
you have
type All =
|AorB of int * (int Option)
If you bind the data in different ways later on you might be better off using an active pattern rather than a type, but the result would be basically the same

I don't really agree that this should be seen as a bug - although it would definitely be convenient if the case was handled by the compiler.
The C# compiler doesn't complain to the following and you wouldn't expect it to:
var b = true;
if (b)
if (!b)
Console.WriteLine("Can never be reached");

Related

F# pattern matching: how "as" is interpreted when used incorrectly with constructor?

I have a discriminated union with a choice that has another du as its type as follows:
type DunionSubset =
| X
| Y
type Dunion =
| A
| B
| C of DunionSubset
I want to produce a mapping to a list of strings for Dunion type, which naturally extends to C and therefore DunionSubset
When I incorrectly use as to assign an alias to constructor as follows:
let MappingsOfC = function
| X -> ["x"]
| Y -> ["y"]
let StringMappings = function
| A -> ["a";"A"]
| B -> []
| C as c -> (MappingsOfC c)
the compiler gives me:
[FS0019] This constructor is applied to 0 argument(s) but expects 1
How exactly is my incorrect use of as above leading to this compiler error? Interestingly, the location of the compiler error is that of C, not my use of c in MappingsOfC c though Rider ide underlines c and provides a different error.
C as c matches the function argument against the pattern C and also binds it to c. So it's the same as:
let StringMappings c = match c with
| A -> ["a";"A"]
| B -> []
| C -> (MappingsOfC c)
except that this will make c visible in all branches whereas the version with as only makes c visible in the branch that has the as pattern (naturally).
You get an error because C takes an argument, but you're trying to use it without one. If you wrote | A as a ->, | B as b -> and/or | C _ as c, that would work fine.
Actually the left part of -> (before the as and when keywords, if provided) in a matching branch is a pattern. And pattern must be valid - depending on the type of the value provided right after the match keyword.
Here compiler will understand that you are wanting to match a Dunion value because it sees the patterns A and B (which are all valid) first.
Now you can see that C is an invalid pattern for Dunion. The valid pattern here must be: C something
If you don’t care about something (don’t need to use it), still, you must provide a wildcard _ to make the pattern valid: C _
But in your example, I’m pretty sure that you do care, so the code should be like this:
| C c -> MappingsOfC c

F# is it possible to "upcast" discriminated union value to a "superset" union?

Let's say there are two unions where one is a strict subset of another.
type Superset =
| A of int
| B of string
| C of decimal
type Subset =
| A of int
| B of string
Is it possible to automatically upcast a Subset value to Superset value without resorting to explicit pattern matching? Like this:
let x : Subset = A 1
let y : Superset = x // this won't compile :(
Also it's ideal if Subset type was altered so it's no longer a subset then compiler should complain:
type Subset =
| A of int
| B of string
| D of bool // - no longer a subset of Superset!
I believe it's not possible to do but still worth asking (at least to understand why it's impossible)
WHY I NEED IT
I use this style of set/subset typing extensively in my domain to restrict valid parameters in different states of entities / make invalid states non-representable and find the approach very beneficial, the only downside is very tedious upcasting between subsets.
Sorry, no
Sorry, but this is not possible. Take a look at https://fsharpforfunandprofit.com/posts/fsharp-decompiled/#unions — you'll see that F# compiles discriminated unions to .NET classes, each one separate from each other with no common ancestors (apart from Object, of course). The compiler makes no effort to try to identify subsets or supersets between different DUs. If it did work the way you suggested, it would be a breaking change, because the only way to do this would be to make the subset DU a base class, and the superset class its derived class with an extra property. And that would make the following code change behavior:
type PhoneNumber =
| Valid of string
| Invalid
type EmailAddress =
| Valid of string
| ValidButOutdated of string
| Invalid
let identifyContactInfo (info : obj) =
// This came from external code we don't control, but it should be contact info
match (unbox obj) with
| :? PhoneNumber as phone -> // Do something
| :? EmailAddress as email -> // Do something
Yes, this is bad code and should be written differently, but it illustrates the point. Under current compiler behavior, if identifyContactInfo gets passed a EmailAddress object, the :? PhoneNumber test will fail and so it will enter the second branch of the match, and treat that object (correctly) as an email address. If the compiler were to guess supersets/subsets based on DU names as you're suggesting here, then PhoneNumber would be considered a subset of EmailAddress and so would become its base class. And then when this function received an EmailAddress object, the :? PhoneNumber test would succeed (because an instance of a derived class can always be cast to the type of its base class). And then the code would enter the first branch of the match expression, and your code might then try to send a text message to an email address.
But wait...
What you're trying to do might be achievable by pulling out the subsets into their own DU category:
type AorB =
| A of int
| B of string
type ABC =
| AorB of AorB
| C of decimal
type ABD =
| AorB of AorB
| D of bool
Then your match expressions for an ABC might look like:
match foo with
| AorB (A num) -> printfn "%d" num
| AorB (B s) -> printfn "%s" s
| C num -> printfn "%M" num
And if you need to pass data between an ABC and an ABD:
let (bar : ABD option) =
match foo with
| AorB data -> Some (AorB data)
| C _ -> None
That's not a huge savings if your subset has only two common cases. But if your subset is a dozen cases or so, being able to pass those dozen around as a unit makes this design attractive.

Is there a name for this pattern "type 'a foldedSequence = Empty | Value of 'a * (unit->'a foldedSequence )"

I have been working with some f# parsers and some streaming software and I find myself using this pattern more and more. I find it to be a natural alternative to sequences and it has some natural advantages.
here are some example functions using the type.
type foldedSequence<'a> =
| Empty
| Value of ' a * (unit -> 'a foldedSequence)
let rec createFoldedSequence fn state =
match fn state with
| None -> Empty
| Some(value, nextState) ->
Value(value, (fun () -> unfold fn nextState))
let rec filter predicate =
function
| Empty -> Empty
| Value(value, nextValue) ->
let next() = filter predicate(nextValue())
if predicate value then Value(value, next)
else next()
let toSeq<'t> =
Seq.unfold<'t foldedSequence, 't>(function
| Empty -> None
| Value(value, nextValue) -> Some(value, nextValue()))
It has been very helpful I would like to know if it has a name so I can research some tips and tricks for it
To add to the existing answers, I think Haskellers might call a generalised version of this this a list monad transformer. The idea is that your type definition looks almost like ordinary F# list except that there is some additional aspect to it. You can imagine writing this as:
type ListTransformer<'T> =
| Empty
| Value of 'T * M<ListTransformer<'T>>
By supplying specific M, you can define a number of things:
M<'T> = 'T gives you the ordinary F# list type
M<'T> = unit -> 'T gives you your sequence that can be evaluated lazily
M<'T> = Lazy<'T> gives you LazyList (which caches already evaluated elements)
M<'T> = Async<'T> gives you asynchronous sequences
It is also worth noting that in this definition LazyTransformer<'T> is not itself a delayed/lazy/async value. This can cause problems in some cases - e.g. when you need to perform some async operation to decide whether the stream is empty - and so a better definition is:
type ListTransformer<'T> = M<ListTransformerInner<'T>>
and ListTransformerInner<'T> =
| Empty
| Value of 'T * ListTransformer<'T>
This sounds like LazyList which used to be in the "powerpack" and I think now lives here:
http://fsprojects.github.io/FSharpx.Collections/reference/fsharpx-collections-lazylist-1.html
https://github.com/fsprojects/FSharpx.Collections/blob/master/src/FSharpx.Collections/LazyList.fs
Your type is close to how an iteratee would be defined, and since you already mention streaming, this might be the concept you're looking for.
Iteratee IO is an approach to lazy IO outlined by Oleg Kiselyov. Apart from Haskell, implementations exist for major functional languages, including F# (as part of FSharpx.Extras).
This is how FSharpx defines an Iteratee:
type Iteratee<'Chunk,'T> =
| Done of 'T * Stream<'Chunk>
| Error of exn
| Continue of (Stream<'Chunk> -> Iteratee<'Chunk,'T>)
See also this blog post: Iteratee in F# - part 1. Note that there doesn't seem to be a part 2.

Free Monad in F# with generic output type

I am trying to apply the free monad pattern as described in F# for fun and profit to implement data access (for Microsoft Azure Table Storage)
Example
Let's assume we have three database tables and three dao's Foo, Bar, Baz:
Foo Bar Baz
key | col key | col key | col
--------- --------- ---------
foo | 1 bar | 2 |
I want to select Foo with key="foo" and Bar with key="bar" to insert a Baz with key="baz" and col=3
Select<Foo> ("foo", fun foo -> Done foo)
>>= (fun foo -> Select<Bar> ("bar", fun bar -> Done bar)
>>= (fun bar -> Insert<Baz> ((Baz ("baz", foo.col + bar.col), fun () -> Done ()))))
Within the interpreter function
Select results in a function call that takes a key : string and returns an obj
Insert results in a function call that takes an obj and returns unit
Problem
I defined two operations Select and Insert in addition to Done to terminate the computation:
type StoreOp<'T> =
| Select of string * ('T -> StoreOp<'T>)
| Insert of 'T * (unit -> StoreOp<'T>)
| Done of 'T
In order to chain StoreOp's I am trying to implement the correct bind function:
let rec bindOp (f : 'T1 -> StoreOp<'T2>) (op : StoreOp<'T1>) : StoreOp<'T2> =
match op with
| Select (k, next) ->
Select (k, fun v -> bindOp f (next v))
| Insert (v, next) ->
Insert (v, fun () -> bindOp f (next ()))
| Done t ->
f t
let (>>=) = bindOp
However, the f# compiler correctly warns me that:
The type variable 'T1 has been constrained to be type 'T2
For this implementation of bindOp the type is fixed throughout the computation, so instead of:
Foo > Bar > unit
all I can express is:
Foo > Foo > Foo
How should I modify the definition of StoreOp and/or bindOp to work with different types throughout the computation?
As Fyodor mentioned in the comments, the problem is with the type declaration. If you wanted to make it compile at the price of sacrificing type safety, you could use obj in two places - this at least shows where the problem is:
type StoreOp<'T> =
| Select of string * (obj -> StoreOp<'T>)
| Insert of obj * (unit -> StoreOp<'T>)
| Done of 'T
I'm not entirely sure what the two operations are supposed to model - but I guess Select means you are reading something (with string key?) and Insert means that you are storing some value (and then continue with unit). So, here, the data you are storing/reading would be obj.
There are ways of making this type safe, but I think you'd get better answer if you explained what are you trying to achieve by using the monadic structure.
Without knowing more, I think using free monads will only make your code very messy and difficult to understand. F# is a functional-first language, which means that you can write data transformations in a nice functional style using immutable data types and use imperative programming to load your data and store your results. If you are working with table storage, why not just write the normal imperative code to read data from table storage, pass the results to a pure functional transformation and then store the results?

Performing Calculations on F# option types

I'm trying to write some function that handle errors by returning double options instead of doubles. Many of these functions call eachother, and so take double options as inputs to output other double options. The problem is, I can't do with double options what I can do with doubles--something simple like add them using '+'.
For example, a function that divides two doubles, and returns a double option with none for divide by zero error. Then another function calls the first function and adds another double option to it.
Please tell me if there is a way to do this, or if I have completely misunderstood the meaning of F# option types.
This is called lifting - you can write function to lift another function over two options:
let liftOpt f o1 o2 =
match (o1, o2) with
| (Some(v1), Some(v2)) -> Some(f v1 v2)
| _ -> None
then you can supply the function to apply e.g.:
let inline addOpt o1 o2 = liftOpt (+) o1 o2
liftA2 as mentioned above will provide a general way to 'lift' any function that works on the double arguments to a function that can work on the double option arguments.
However, in your case, you may have to write special functions yourself to handle the edge cases you mention
let (<+>) a b =
match (a, b) with
| (Some x, Some y) -> Some (x + y)
| (Some x, None) -> Some (x)
| (None, Some x) -> Some (x)
| (None, None) -> None
Note that liftA2 will not put the cases where you want to add None to Some(x) in automatically.
The liftA2 method for divide also needs some special handling, but its structure is generally what we would write ourselves
let (</>) a b =
match (a, b) with
| (Some x, Some y) when y <> 0.0d -> Some (x/y)
| _ -> None
You can use these functions like
Some(2.0) <+> Some(3.0) // will give Some(5.0)
Some(1.0) </> Some(0.0) // will give None
Also, strictly speaking, lift is defined as a "higher order function" - something that takes a function and returns another function.
So it would look something like this:
let liftOpt2 f =
(function a b ->
match (a, b) with
| (Some (a), Some (b)) -> f a b |> Some
| _ -> None)
In the end, I realized what I was really looking for was the Option.get function, which simply takes a 'a option and returns an 'a. That way, I can pattern match, and return the values I want.
In this case you might want to consider Nullables over Options, for two reasons:
Nullables are value types, while Options are reference types. If you have large collections of these doubles, using Nullables will keep the numbers on the stack instead of putting them on the heap, potentially improving your performance.
Microsoft provides a bunch of built-in Nullable Operators that do let you directly perform math on nullables, exactly as you're trying to do with options.

Resources