Inconsistent behavior in pattern matching - f#

type Bar = A | B
type Foo = C of Bar | D of Bar
let case = Unchecked.defaultof<Foo>;;
match case with
| C A -> ""
| C B -> ""
| _ -> "Matches";;
match case with
| C A -> ""
| D B -> ""
| _ -> "Throws"
Skimming the F# language spec, nothing about null tests (which I can't perform anyway) seemed related, and both types appear to be reference types (AFAIK).
I would assume the behavior in the first case is the correct one.

I think all bets are off once you use Unchecked.defaultof<_> (thus the "Unchecked" ;-)). Null isn't considered a valid value for type Foo from an F# perspective (although it is from the .NET perspective), so I don't think that the pattern matching semantics are defined.
What is it that you're trying to do?

So reading the spec, the first hint is here
b) Types with null as an abnormal value. These are types that do
not admit the null literal, but do have null as an abnormal value.
Types in this category are:
o All F# list, record, tuple, function, class and interface types.
o All F# union types apart from those with null as a normal value
(as discussed in the next paragraph).
For these types, the use of the null literal is not directly
permitted. However it is, strictly speaking, possible to generate a
null value for these types using certain functions such as
Unchecked.defaultof. For these types, null is considered an
abnormal value. The behavior of operations with respect to null values
is defined in §6.9.
This does seem to suggest that passing in a null value for your union could be some sort of undefined behaviour, as the value is "abnormal". Section 6.9 isn't particularly helpful.
Looking at the definition for _, it seems like you are right that this is a bug - it states
7.1.7 Wildcard Patterns
The pattern _ is a wildcard pattern and matches any input. For
example:
let categorize x =
    match x with
    | 1 -> 0
    | 0 -> 1
    | _ -> 0
I think the most relevant hints, though, are later on, where the compiled methods for DUs are listed
8.5.3 Compiled Form of Union Types for Use from Other CLI Languages
A compiled union type U will have:
· One CLI static getter property U.C for each nullary union
case C. This will get a singleton object representing that case.
· One CLI nested type U.C for each non-nullary union case C.
This type will have instance properties Item1, Item2.... for each
field of the union case, or a single instance property Item if there
is only one field. A compiled union type with only one case does not
have a nested type. Instead, the union type itself plays the role of
the case type.
· One CLI static method U.NewC for each non-nullary union case
C. This will construct an object for that case.
· One CLI instance property u.IsC for each case C that returns
true or false for the case.
· One CLI instance property u.Tag for each case C that fetches
or computes an integer tag corresponding to the case.
From this, you can see that all of the members used for checking are instance members, which require a non-null receiver. Since null is "abnormal", the generated code doesn't bother checking, so it throws.
I think you could argue that this is in fact a bug, based on the definition of _. However, fixing it would require inserting null checks before every DU pattern-matching check, which would slow the code down significantly, so I doubt it will be fixed.

A pattern match against a DU with two cases compiles to an if/else with type tests. If you translate your examples, the behavior is apparent.
match case with
| C A -> ""
| C B -> ""
| _ -> "Matches"
translates to
if (case is C)
...
else
"Matches"
and
match case with
| C A -> ""
| D B -> ""
| _ -> "Throws"
translates to
if (case is D)
...
else //must be C
//check tag BANG! NRE
And kvb's example: match case with | C _ -> "C" | D _ -> "D"
if (case is D)
//...
else //must be C
"C"
I suppose you could view this as a reasonable optimization (vs. if (case is D) {...} else if (case is C) {...} else { MatchFailureException }) given that null behavior is undefined.
Add a third case to Foo and the problem goes away.

The description of the method Unchecked.defaultof ends with the sentence "This function is unsafe in the sense that some F# values don't have proper null values", which is exactly the case here.
Try this:
let isNull = (case = null) ;;
let isNull = (case = null) ;;
---------------------^^^^
stdin(3,22): error FS0043: The type 'Foo' does not have 'null' as a proper value
>
DUs are intended to be immutable and don't have proper null values.
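If you genuinely need to guard against such a value (for example, one produced by Unchecked.defaultof or handed over from other .NET code), a minimal sketch is to box it first, since obj does admit null. The describe function is a hypothetical name, and isNull assumes F# 4.0 or later:

```fsharp
type Bar = A | B
type Foo = C of Bar | D of Bar

// Hypothetical helper: boxing sidesteps error FS0043, because obj admits null.
let describe (case : Foo) =
    if isNull (box case) then
        "null"
    else
        match case with
        | C _ -> "C"
        | D _ -> "D"

// Unchecked.defaultof<Foo> is null at runtime, so the guard branch is taken.
let result = describe (Unchecked.defaultof<Foo>)
```

The null check runs before any instance member of the union is touched, so the match itself only ever sees normal values.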

I'm not able to confirm this at the moment, but wouldn't adding the AllowNullLiteral attribute to the types allow the match to succeed?

Related

F# is it possible to "upcast" discriminated union value to a "superset" union?

Let's say there are two unions where one is a strict subset of another.
type Superset =
| A of int
| B of string
| C of decimal
type Subset =
| A of int
| B of string
Is it possible to automatically upcast a Subset value to Superset value without resorting to explicit pattern matching? Like this:
let x : Subset = A 1
let y : Superset = x // this won't compile :(
Ideally, if the Subset type were later altered so that it is no longer a subset, the compiler should complain:
type Subset =
| A of int
| B of string
| D of bool // - no longer a subset of Superset!
I believe it's not possible to do but still worth asking (at least to understand why it's impossible)
WHY I NEED IT
I use this style of set/subset typing extensively in my domain to restrict valid parameters in different states of entities and make invalid states non-representable. I find the approach very beneficial; the only downside is the very tedious upcasting between subsets.
Sorry, no
Sorry, but this is not possible. Take a look at https://fsharpforfunandprofit.com/posts/fsharp-decompiled/#unions — you'll see that F# compiles discriminated unions to .NET classes, each one separate from each other with no common ancestors (apart from Object, of course). The compiler makes no effort to try to identify subsets or supersets between different DUs. If it did work the way you suggested, it would be a breaking change, because the only way to do this would be to make the subset DU a base class, and the superset class its derived class with an extra property. And that would make the following code change behavior:
type PhoneNumber =
| Valid of string
| Invalid
type EmailAddress =
| Valid of string
| ValidButOutdated of string
| Invalid
let identifyContactInfo (info : obj) =
    // This came from external code we don't control, but it should be contact info
    match info with
    | :? PhoneNumber as phone -> () // Do something with phone
    | :? EmailAddress as email -> () // Do something with email
    | _ -> () // Not contact info at all
Yes, this is bad code and should be written differently, but it illustrates the point. Under current compiler behavior, if identifyContactInfo gets passed an EmailAddress object, the :? PhoneNumber test will fail and so it will enter the second branch of the match, and treat that object (correctly) as an email address. If the compiler were to guess supersets/subsets based on DU case names as you're suggesting here, then PhoneNumber would be considered a subset of EmailAddress and so would become its base class. And then when this function received an EmailAddress object, the :? PhoneNumber test would succeed (because an instance of a derived class can always be cast to the type of its base class). And then the code would enter the first branch of the match expression, and your code might then try to send a text message to an email address.
But wait...
What you're trying to do might be achievable by pulling out the subsets into their own DU category:
type AorB =
| A of int
| B of string
type ABC =
| AorB of AorB
| C of decimal
type ABD =
| AorB of AorB
| D of bool
Then your match expressions for an ABC might look like:
match foo with
| AorB (A num) -> printfn "%d" num
| AorB (B s) -> printfn "%s" s
| C num -> printfn "%M" num
And if you need to pass data between an ABC and an ABD:
let (bar : ABD option) =
    match foo with
    | AorB data -> Some (AorB data)
    | C _ -> None
That's not a huge savings if your subset has only two common cases. But if your subset is a dozen cases or so, being able to pass those dozen around as a unit makes this design attractive.
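If you keep the original two-type layout instead, the explicit match can at least be written once and reused. The sketch below is only an illustration: upcastSubset is a hypothetical name, and the case names are qualified because both types declare A and B.

```fsharp
type Superset =
    | A of int
    | B of string
    | C of decimal

type Subset =
    | A of int
    | B of string

// Hypothetical helper: the one place where the explicit pattern match lives.
let upcastSubset (x : Subset) : Superset =
    match x with
    | Subset.A n -> Superset.A n
    | Subset.B s -> Superset.B s

let x : Subset = Subset.A 1
let y : Superset = upcastSubset x
```

If a case such as D of bool is later added to Subset, this match becomes incomplete and the compiler reports it (as warning FS0025 by default, not as an error), which gives part of the check asked for in the question.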

F# how to handle nullable types

I'm trying to draw some charts in F#. As input I have a CSV file in which some values are nullable (e.g. nullable int). I try to show a chart with the following code:
[for row in data.Rows -> row.A.Value, row.B.Value] |> Chart.Point
where both A and B are nullable integers. I received the following error:
System.InvalidOperationException: Nullable object must have a value.
How should I handle nullable types? Should I write some Option type to handle it, or is there some other good way to solve it?
If you are using F# 4.0, then there is a built-in function Option.ofNullable. If not, then you can use the implementation in the other answer.
You can also use the same code to define an active pattern:
let (|Present|_|) (n:System.Nullable<_>) =
    if n.HasValue then Some(n.Value)
    else None
... this can be used inside a match construct and so you can write:
[ for row in data.Rows do
    match row.A, row.B with
    | Present a, Present b -> yield a,b
    | _ -> () ] |> Chart.Point
Where you are going wrong is: you are calling the Value property on something that might be null.
When you call Value you are effectively saying "It's okay, I have rigorously checked this value and it's definitely not null so it's perfectly safe to treat it as if it were a non-nullable value." Of course, in this case, that condition isn't met, hence the runtime exception.
In F#, you don't want to be working with Nullable<'T> types, you want to be working with Option<'T>, this is much safer and the compiler can check more effectively that you're not making a mistake.
You can convert from Nullable<'T> to Option<'T> for the list using
[for row in data.Rows -> Option.ofNullable (row.A), Option.ofNullable(row.B)]
Of course then you have to decide how you want to handle the None cases but it's much easier to do that once you've made your design explicitly tell you that you've got a value that may or may not be something.
I don't know what behaviour you want but, as an example, perhaps you want to only chart the cases where both values are valid?
You could zip two option values:
module Option =
    let zip a b =
        match (a,b) with
        |Some sa, Some sb -> Some(sa, sb)
        |_ -> None
You can then map back to plottable numbers, dropping the None cases using List.choose.
[for row in data.Rows -> Option.ofNullable (row.A), Option.ofNullable (row.B)]
|> List.choose (fun (a,b) -> Option.zip a b)
|> Chart.Point
Map the Nullable type to the Option type and either filter out the missing values (with .filter or .choose) or transform the Nones to a special value for missing data (e.g. 0, -1, NaN), depending on your data, to make them work in the charting tool.
module Option =
    let fromNullable (n: _ Nullable) =
        if n.HasValue
        then Some n.Value
        else None
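The two strategies mentioned above (dropping incomplete rows versus substituting a placeholder) can be compared side by side. This is a minimal sketch on made-up data: rows stands in for data.Rows, the placeholder -1 is an arbitrary choice, and Option.ofNullable assumes F# 4.0 or later.

```fsharp
open System

// Hypothetical stand-in for the CSV rows: pairs of nullable integers.
let rows : (Nullable<int> * Nullable<int>) list =
    [ (Nullable 1, Nullable 10)
      (Nullable(), Nullable 20)
      (Nullable 3, Nullable()) ]

// Strategy 1: keep only the rows where both values are present.
let complete =
    rows
    |> List.choose (fun (a, b) ->
        match Option.ofNullable a, Option.ofNullable b with
        | Some x, Some y -> Some (x, y)
        | _ -> None)

// Strategy 2: keep every row, replacing missing values with a placeholder.
let orPlaceholder (n : Nullable<int>) =
    match Option.ofNullable n with
    | Some v -> v
    | None -> -1

let filled = rows |> List.map (fun (a, b) -> orPlaceholder a, orPlaceholder b)
```

Here complete contains only (1, 10), while filled keeps all three rows with -1 in the gaps. Which one is appropriate depends on whether the chart should show the missing points at all.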

Is F# aware of its discriminated unions' compiled forms?

A discriminated union in F# is compiled to an abstract class and its options become nested concrete classes.
type DU = A | B
DU is abstract while DU.A and DU.B are concrete.
With ServiceStack, the serialization of types to JSON strings and back can be customized with functions. With respect to the DU type, here's how I could do it in C#.
using ServiceStack.Text;
JsConfig<DU.A>.SerializeFn = v => "A"; // Func<DU.A, String>
JsConfig<DU.B>.SerializeFn = v => "B"; // Func<DU.B, String>
JsConfig<DU>.DeserializeFn = s =>
s == "A" ? DU.A : DU.B; // Func<String, DU>
Is F# aware of its discriminated unions' compiled forms? How would I get the type of DU.A in F# at compile time?
typeof<DU> // compiles
typeof<DU.A> // error FS0039: The type 'A' is not defined
typeof<A> // error FS0039: The type 'A' is not defined
I can easily enough register a function for deserialization in F#.
open System
open ServiceStack.Text
JsConfig<DU>.RawDeserializeFn <-
Func<_, _>(fun s -> printfn "Hooked"; if s = "A" then A else B)
Is it possible to register serialize functions wholly in F# for the concrete types DU.A and DU.B?
Whilst all of this behaviour (the abstract classes etc.) is not just an implementation detail but is actually defined by the spec, these things are not accessible from F#. This is a quote from the spec:
A compiled union type U has:
· One CLI static getter property U.C for each null union case
C. This property gets a singleton object that represents each such
case.
· One CLI nested type U.C for each non-null union case C. This
type has instance properties Item1, Item2.... for each field of the
union case, or a single instance property Item if there is only one
field. However, a compiled union type that has only one case does not
have a nested type. Instead, the union type itself plays the role of
the case type.
· One CLI static method U.NewC for each non-null union case C.
This method constructs an object for that case.
· One CLI instance property U.IsC for each case C. This
property returns true or false for the case.
· One CLI instance property U.Tag for each case C. This
property fetches or computes an integer tag corresponding to the case.
· If U has more than one case, it has one CLI nested type
U.Tags. The U.Tags type contains one integer literal for each case, in
increasing order starting from zero.
· A compiled union type has the methods that are required to
implement its auto-generated interfaces, in addition to any
user-defined properties or methods.
These methods and properties may not be used directly from F#.
However, these types have user-facing List.Empty, List.Cons,
Option.None, and Option.Some properties and/or methods.
Importantly, "these methods and properties may not be used from F#".
Daniel is correct, you can do this by registering serialization functions for the base type DU. Here is a fuller example
open System
open ServiceStack.Text
type DU = A | B
let serialize = function
| A -> "A"
| B -> "B"
let deserialize = function
| "A" -> A
| "B" -> B
| _ -> failwith "Can't deserialize"
JsConfig<DU>.SerializeFn <- Func<_,_>(serialize)
JsConfig<DU>.DeSerializeFn <- Func<_,_>(deserialize)
let value = [| A; B |]
let text = JsonSerializer.SerializeToString(value)
let newValue = JsonSerializer.DeserializeFromString<DU[]>(text)
Result:
val value : DU [] = [|A; B|]
val text : string = "["A","B"]"
val newValue : DU [] = [|A; B|]
The fact that a DU in F# is a single type is key to its usefulness. The F# approach would be to use pattern matching:
JsConfig<DU>.SerializeFn <- function
| A -> "A"
| B -> "B"
This should work because the union cases are not only nested types in C#, but subtypes as well. Of course if ServiceStack doesn't consider base type serializers then this won't work.
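If the aim is only to avoid spelling out the case names by hand, one option that stays entirely within F# is the reflection API in Microsoft.FSharp.Reflection. This is a sketch rather than something the answers above propose: caseName and fromName are hypothetical helper names, and the code only handles cases without fields.

```fsharp
open Microsoft.FSharp.Reflection

type DU = A | B

// Hypothetical helper: look up the name of the case a value belongs to.
let caseName (value : DU) : string =
    let case, _ = FSharpValue.GetUnionFields(box value, typeof<DU>)
    case.Name

// Hypothetical helper: rebuild a field-less case from its name.
let fromName (name : string) : DU =
    match FSharpType.GetUnionCases(typeof<DU>) |> Array.tryFind (fun c -> c.Name = name) with
    | Some c -> FSharpValue.MakeUnion(c, [||]) :?> DU
    | None -> failwithf "Can't deserialize %s" name
```

Functions like these could be wrapped in Func<_,_> and registered on JsConfig<DU> in the same way as serialize and deserialize above; reflection is slower than a direct pattern match, so the hand-written version remains the simpler choice for small unions.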

Using a variable in pattern matching in Ocaml or F#

I have a function of the form
'a -> ('a * int) list -> int
let rec getValue identifier bindings =
    match bindings with
    | (identifier, value)::tail -> value
    | (_, _)::tail -> getValue identifier tail
    | [] -> -1
I can tell that identifier is not being bound the way I would like it to and is acting as a new variable within the match expression. How do I get identifier to be what is passed into the function?
Ok! I fixed it with a pattern guard, i.e. | (i, value)::tail when i = identifier -> value
but I find this ugly compared to the way I originally wanted to do it (I'm only using these languages because they are pretty...). Any thoughts?
You can use F# active patterns to create a pattern that will do exactly what you need. F# supports parameterized active patterns that take the value that you're matching, but also take an additional parameter.
Here is a pretty stupid example that fails when the value is zero and otherwise succeeds and returns the addition of the value and the specified parameter:
let (|Test|_|) arg value =
    if value = 0 then None else Some(value + arg)
You can specify the parameter in pattern matching like this:
match 1 with
| Test 100 res -> res // 'res' will be 101
Now, we can easily define an active pattern that will compare the matched value with the input argument of the active pattern. The active pattern returns unit option, which means that it doesn't bind any new value (in the example above, it returned some value that we assigned to a symbol res):
let (|Equals|_|) arg x =
    if (arg = x) then Some() else None
let foo x y =
    match x with
    | Equals y -> "equal"
    | _ -> "not equal"
You can use this as a nested pattern, so you should be able to rewrite your example using the Equals active pattern.
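For illustration, the function from the question rewritten with Equals as a nested pattern might look like the sketch below. The sample bindings are made up for the example.

```fsharp
// Parameterized partial active pattern: succeeds when the matched value equals arg.
let (|Equals|_|) arg x =
    if arg = x then Some () else None

let rec getValue identifier bindings =
    match bindings with
    // Here identifier is an argument to Equals, not a new binding.
    | (Equals identifier, value) :: _ -> value
    | _ :: tail -> getValue identifier tail
    | [] -> -1

let found = getValue "b" [ ("a", 1); ("b", 2) ]
let missing = getValue "z" [ ("a", 1); ("b", 2) ]
```

With these inputs found is 2 and missing is -1.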
One of the beauties of functional languages is higher order functions. Using those functions we take the recursion out and just focus on what you really want to do. Which is to get the value of the first tuple that matches your identifier otherwise return -1:
let getValue identifier list =
    match List.tryFind (fun (x,y) -> x = identifier) list with
    | None -> -1
    | Some(x,y) -> y
//val getValue : 'a -> (('a * int) list -> int) when 'a : equality
This paper by Graham Hutton is a great introduction to what you can do with higher order functions.
This is not directly an answer to the question: how to pattern-match the value of a variable. But it's not completely unrelated either.
If you want to see how powerful pattern-matching could be in a ML-like language similar to F# or OCaml, take a look at Moca.
You can also take a look at the code generated by Moca :) (not that there's anything wrong with the compiler doing a lot of things for you behind your back. In some cases, it's desirable, even, but many programmers like to feel they know what the operations they are writing will cost).
What you're trying to do is called an equality pattern, and it's not provided by Objective Caml. Objective Caml's patterns are static and purely structural. That is, whether a value matches the pattern depends solely on the value's structure, and in a way that is determined at compile time. For example, (_, _)::tail is a pattern that matches any non-empty list whose head is a pair. (identifier, value)::tail matches exactly the same values; the only difference is that the latter binds two more names identifier and value.
Although some languages have equality patterns, there are non-trivial practical considerations that make them troublesome. Which equality? Physical equality (== in Ocaml), structural equality (= in Ocaml), or some type-dependent custom equality? Furthermore, in Ocaml, there is a clear syntactic indication of which names are binders and which names are references to previously bound values: any lowercase identifier in a pattern is a binder. These two reasons explain why Ocaml does not have equality patterns baked in. The idiomatic way to express an equality pattern in Ocaml is in a guard. That way, it's immediately clear that the matching is not structural, that identifier is not bound by this pattern matching, and which equality is in use. As for ugly, that's in the eye of the beholder: as a habitual Ocaml programmer, I find equality patterns ugly (for the reasons above).
match bindings with
| (id, value)::tail when id = identifier -> value
| (_, _)::tail -> getValue identifier tail
| [] -> -1
In F#, you have another possibility: active patterns, which let you pre-define guards that concern a single site in a pattern.
This is a common complaint, but I don't think that there's a good workaround in general; a pattern guard is usually the best compromise. In certain specific cases there are alternatives, though, such as marking literals with the [<Literal>] attribute in F# so that they can be matched against.
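A small sketch of the [<Literal>] alternative follows. The names Admin and describe are made up for the example; the important detail is that the literal starts with an uppercase letter, so the pattern is read as a constant rather than as a new binding.

```fsharp
// A compile-time constant that can appear in patterns.
[<Literal>]
let Admin = "admin"

let describe (name : string) =
    match name with
    | Admin -> "administrator"   // compares against the constant "admin"
    | _ -> "regular user"

let a = describe "admin"
let b = describe "bob"
```

This only works for values known at compile time, so it does not help with a function argument such as identifier in the question.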

F# keyword 'Some'

F# keyword 'Some' - what does it mean?
Some is not a keyword. There is an option type however, which is a discriminated union containing two things:
Some which holds a value of some type.
None which represents lack of value.
It's defined as:
type 'a option =
| None
| Some of 'a
It acts kind of like a nullable type, where you want to have an object which can hold a value of some type or have no value at all.
let stringRepresentationOfSomeObject (x : 'a option) =
    match x with
    | None -> "NONE!"
    | Some(t) -> t.ToString()
You can check out Discriminated Unions in F# for more info on DUs in general and the option type (Some, None) in particular. As a previous answer says, Some is just a union case of the option<'a> type, which is a particularly common/useful example of an algebraic data type.
Some is used to construct a value of an option type, or in other words, a value that may or may not exist.
F# is different from most languages in that control flow is mostly done through pattern matching as opposed to traditional if/else logic.
In traditional if/else logic, you may see something like this:
if (isNull(x)) {
do ...
} else { //x exists
do ...
}
With pattern matching, we need a similar way to execute certain code if a value is null or, in F# terms, None.
Thus the equivalent code would be
match x with
| None -> do ...
| Some x -> do ...
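As a concrete, runnable version of that shape, here is a minimal sketch. tryParseInt and describe are hypothetical names chosen for the example; Int32.TryParse is the standard .NET method.

```fsharp
// Returns Some n when the text is an integer, None otherwise.
let tryParseInt (s : string) : int option =
    match System.Int32.TryParse(s) with
    | true, n -> Some n
    | false, _ -> None

// The caller has to handle both cases before it can use the number.
let describe (s : string) =
    match tryParseInt s with
    | None -> "not a number"
    | Some n -> sprintf "number %d" n

let ok = describe "42"
let bad = describe "forty-two"
```

Here ok is "number 42" and bad is "not a number".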

Resources