Are there use cases for single case variants in OCaml?

I've been reading F# articles and they use single case variants to create distinct incompatible types. However, in OCaml I can use private module types or abstract types to create distinct types. Is it common in OCaml to use single case variants like in F# or Haskell?

Another specialized use case for a single-constructor variant is to erase some type information with a GADT (and an existential quantification).
For instance, in
type showable = Show: 'a * ('a -> string) -> showable
let show (Show (x,f)) = f x
let showables = [ Show (0,string_of_int); Show("string", Fun.id) ]
The constructor Show pairs an element of a given type with a printing function, then forgets the concrete type of the element. This makes it possible to have a list of showable elements, even if each element has a different concrete type.
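As a small usage sketch (repeating the definitions so it stands alone, and adding a hypothetical third element and a `rendered` binding that are not in the original answer): because the concrete type is hidden, the only thing a consumer can do with the payload is hand it to the function it was packed with.

```ocaml
(* Existential wrapper: a value paired with a printer for its own type. *)
type showable = Show : 'a * ('a -> string) -> showable

let show (Show (x, f)) = f x

(* Elements of three different concrete types share one list type. *)
let showables =
  [ Show (0, string_of_int); Show ("string", Fun.id); Show (true, string_of_bool) ]

(* Each element is rendered with its own printer. *)
let rendered = List.map show showables
```

Note that `Fun.id` needs OCaml 4.08 or later; `fun s -> s` works on older compilers.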

For what it's worth it seems to me this wasn't particularly common in OCaml in the past.
I've been reluctant to do this myself because it has always cost something: the representation of type t = T of int was always bigger than just the representation of an int.
However, recently (for a few years now) it has been possible to declare types as unboxed, which removes this obstacle:
type t = T of int [@@unboxed]
As a result I've personally been using single-constructor types much more frequently recently. There are many advantages. For me the main one is that I can have a distinct type that's independent of whether its representation happens to be the same as another type.
You can of course use modules to get this effect, as you say. But that is a fairly heavy solution.
(All of this is just my opinion naturally.)
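To make the comparison concrete, here is a minimal sketch of both approaches side by side. All names (`user_id`, `order_id`, `Order_key`, `next_user`) are hypothetical and chosen only for illustration.

```ocaml
(* Single-constructor wrappers: distinct types, and with [@@unboxed]
   the same runtime representation as a plain int. *)
type user_id = User_id of int [@@unboxed]
type order_id = Order_id of int [@@unboxed]

let next_user (User_id n) = User_id (n + 1)

(* The module-based alternative: an abstract type hidden behind a signature. *)
module Order_key : sig
  type t
  val of_int : int -> t
  val to_int : t -> int
end = struct
  type t = int
  let of_int n = n
  let to_int n = n
end

let u = next_user (User_id 41)
(* next_user (Order_id 41) would be rejected by the type checker. *)

let k = Order_key.(to_int (of_int 7))
```

The variant version needs one line per type and keeps pattern matching available; the module version hides the representation entirely but requires a signature and explicit conversion functions.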

Yet another case for single-constructor types (although it does not quite match your initial question of creating distinct types): fancy records. (By contrast with other answers, this is more a syntactic convenience than a fundamental feature.)
Indeed, using a relatively recent feature (introduced with OCaml 4.03, in 2016) which allows writing constructor arguments with a record syntax (including mutable fields!), you can prefix regular records with a constructor name, Coq-style.
type t = MakeT of {
mutable x : int ;
mutable y : string ;
}
let some_t = MakeT { x = 4 ; y = "tea" }
(* val some_t : t = MakeT {x = 4; y = "tea"} *)
It does not change anything at runtime (just like Constr (a,b) has the same representation as (a,b), provided Constr is the only constructor of its type). The constructor makes the code a bit more explicit to the human eye, and it also provides the type information required to disambiguate field names, thus avoiding the need for type annotations. It is similar in function to the usual module trick, but more systematic.
Patterns work just the same:
let (MakeT { x ; y }) = some_t
(* val x : int = 4 *)
(* val y : string = "tea" *)
You can also access the “contained” record (at no runtime cost), read and modify its fields. This contained record however is not a first-class value: you cannot store it, pass it to a function nor return it.
let (MakeT fields) = some_t in fields.x (* returns 4 *)
let (MakeT fields) = some_t in fields.x <- 42
(* some_t is now MakeT {x = 42; y = "tea"} *)
let (MakeT fields) = some_t in fields
(* ^^^^^^
Error: This form is not allowed as the type of the inlined record could escape. *)
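The point about field-name disambiguation can be sketched as follows. The types `point` and `label` and the functions below are hypothetical; the sketch assumes OCaml 4.03 or later. Both types reuse the field names x and y, yet no annotation is needed because the constructor determines which record is meant.

```ocaml
(* Two inline records that deliberately share field names. *)
type point = Point of { mutable x : int; mutable y : int }
type label = Label of { x : string; y : string }

(* The constructor in the pattern tells the type checker which x and y are meant. *)
let sum (Point { x; y }) = x + y
let describe (Label { x; y }) = x ^ "/" ^ y

(* Mutation goes through the bound inline record, which never escapes. *)
let shift (Point p) =
  p.x <- p.x + 1;
  p.y <- p.y + 1

let p = Point { x = 1; y = 2 }
let () = shift p
```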

Another use case of single-constructor (polymorphic) variants is documenting something to the caller of a function. For instance, perhaps there's a caveat with the value that your function returns:
val create : unit -> [ `Must_call_close of t ]
Using a variant forces the caller of your function to pattern-match on this variant in their code:
let (`Must_call_close t) = create () in (* ... *)
This makes it more likely that they'll pay attention to the message in the variant, as opposed to documentation in an .mli file that could get missed.
For this use case, polymorphic variants are a bit easier to work with as you don't need to define an intermediate type for the variant.
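A self-contained sketch of this pattern, assuming a hypothetical resource type t with a closed flag and a close function (neither is specified in the answer above):

```ocaml
(* Hypothetical resource, used only for illustration. *)
type t = { mutable closed : bool }

(* The return type carries the reminder; no separate variant type is declared. *)
let create () : [ `Must_call_close of t ] = `Must_call_close { closed = false }

let close r = r.closed <- true

let run () =
  (* The caller has to write the tag to get at the value. *)
  let (`Must_call_close r) = create () in
  (* ... use r ... *)
  close r;
  r.closed
```

Because the variant type is closed and has a single tag, the let pattern is exhaustive and compiles without a warning.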

Related

Is it possible to have an F# operator overload, on an internal type, available to all files in the assembly?

I have some types that enforce numeric value ranges, and I use them in many files in a single project. They look something like this:
[<Struct>]
type NonNegativeMoney =
new(x) =
if x < 0m then invalidArg "x" "Ruh-roh..."
{ Value = x }
val Value : decimal
static member (+) (x: NonNegativeMoney, y: NonNegativeMoney) = NonNegativeMoney(x.Value + y.Value)
I now want to make these types internal to the assembly and leave only my OO type model public. However, when I flip these types to internal, I got the following compiler error:
The member or object constructor 'op_Addition' is not public. Private
members may only be accessed from within the declaring type. Protected
members may only be accessed from an extending type and cannot be
accessed from inner lambda expressions.
The reason for this has been addressed in the question "Why does the F# compiler fail with this infix operator?". The solution proposed in the answer is to use F# signature files to make the types internal. This works for the OP's scenario in that question, where usage of the operator is limited to a type in the same file. However, I can't seem to find a way to make it work so that the operator is accessible from all files in my project. If I use a signature file, it works intra-file, but not inter-file.
Is there any way to make this work so the types are internal to the assembly, but visible across the files in my project? I'd like to keep the operators, as I'm using library functions like Seq.sum that require them on the types being summed.
You can define the overload in a module instead of inside the type:
[<Struct>]
type internal NonNegativeMoney =
new(x) =
if x < 0m then invalidArg "x" "Ruh-roh..."
{ Value = x }
val Value : decimal
let internal (+) (x: NonNegativeMoney) (y: NonNegativeMoney) = NonNegativeMoney(x.Value + y.Value)
...but this will override the normal (+) operator, which is probably why you ruled it out. Using a custom operator (like ++) may be a reasonable compromise.
You can make the operator available to the whole project by marking the module [<AutoOpen>].

F# / Simplest way to validate array length at COMPILE time

I have a scientific project. There are vectors / square matrices of various lengths there. Obviously (for example) a vector of length 2 cannot be added to a vector of length 3 (and so on and so forth). There are several .NET libraries which deal with vectors / matrices. All of them either have generic vectors / matrices OR have some very specific vectors / matrices, which do not suit my needs.
Most, if not all, of these libraries can create a vector from a list or array. Unfortunately, if I mistakenly give an input array of the wrong length, then I will get a vector of the wrong length and then everything will blow up at run time!
I wonder if it is possible to check array length at compile time so as to get a compile error if, let's say, I try to pass a 5-element array to a vector of length 2 "constructor". After all, printfn does almost that!
F# type providers come to mind, but I am not sure how to apply them here.
Thanks a lot!
Thanks to the OP for an interesting question. My answer frequency has dropped not because of unwillingness to help but rather because there are few questions that tickle my interest.
We don't have dependent types in F#, and F# doesn't support generics with numerical type arguments (as C++ does).
However, we could create distinct types for different dimensions like Dim1, Dim2 and so on and provide them as type arguments.
This would allow us to have a type signature for apply, which applies a matrix to a vector, like this:
let apply (m : Matrix<'R, 'C>) (v : Vector<'C>) : Vector<'R> = …
The code won't compile unless the number of columns of the matrix matches the length of the vector. In addition, the length of the resulting vector equals the number of rows of the matrix.
One way to do this is to define an interface IDimension and some concrete implementations representing the different dimensions.
type IDimension =
interface
abstract Size : int
end
type Dim1 () = class interface IDimension with member x.Size = 1 end end
type Dim2 () = class interface IDimension with member x.Size = 2 end end
The vector and the matrix can then be implemented like this
type Vector<'Dim when 'Dim :> IDimension
and 'Dim : (new : unit -> 'Dim)
> () =
class
let dim = new 'Dim()
let vs = Array.zeroCreate<float> dim.Size
member x.Dim = dim
member x.Values = vs
end
type Matrix<'RowDim, 'ColumnDim when 'RowDim :> IDimension
and 'RowDim : (new : unit -> 'RowDim)
and 'ColumnDim :> IDimension
and 'ColumnDim : (new : unit -> 'ColumnDim)
> () =
class
let rowDim = new 'RowDim()
let columnDim = new 'ColumnDim()
let vs = Array.zeroCreate<float> (rowDim.Size*columnDim.Size)
member x.RowDim = rowDim
member x.ColumnDim = columnDim
member x.Values = vs
end
Finally this allows us to write code like this:
let m76 = Matrix<Dim7, Dim6> ()
let v6 = Vector<Dim6> ()
let v7 = apply m76 v6 // Vector<Dim7>
// Doesn't compile because v7 has the wrong dimension
let vv = apply m76 v7
If you need a wide range of dimensions (because you have an algebra that increments/decrements the dimensions of vectors/matrices) you could support that using some smart variant of Church numerals.
Whether this is usable or not is entirely up to the reader, I think.
PS.
Perhaps units of measure could have been used for this as well if they applied to more types than floats.
The general term for what you're looking for is dependent types, but F# does not support them.
I've seen an experiment in using type providers to mimic one particular flavor of dependent types (constraining the domain of a primitive type), but I wouldn't expect it to be possible to achieve what you want using type providers in their current form. They seem to be too whimsical for that.
Print format strings appear to be doing that (and in fact printers are a "Hello World" application for dependent types), but actually they work because they get special treatment by the compiler, and the mechanism for that is not extensible.
You're doomed to ensure correct lengths at runtime.
My best bet would be to use structs to encode actual vectors and ensure correctness on the API level that way, map them to arrays at the point where you're interacting with those matrix algebra libraries, then map the results back to structs with ample assertions when done.
The comment from @Justanothermetaprogrammer qualifies as an answer. Here is how it works in a real example. The matrix implementation in the example is based on MathNet.Numerics.LinearAlgebra:
open MathNet.Numerics.LinearAlgebra
type RealMatrix2x2 =
| RealMatrix2x2 of Matrix<double>
static member private createInternal (a : #seq<#seq<double>>) =
matrix a |> RealMatrix2x2
static member create
(
(a11, a12),
(a21, a22)
) =
RealMatrix2x2.createInternal
[|
[| a11; a12|]
[| a21; a22|]
|]
let m2 =
(
(1., 2.),
(3., 4.)
)
|> RealMatrix2x2.create
The tuple signatures and "re-mapping" into #seq<#seq<double>> can be easily code-generated using, for example, Excel or any other convenient tool for as many dimensions as necessary. In fact, the whole class along with any other necessary operator overrides (like multiplication of RealMatrix2x2 by RealMatrix2x2, ...) can be code generated for all necessary dimensions.

Understanding F# StringConstant

I am trying to understand the following code, particularly StringConstant:
type StringConstant = StringConstant of string * string
[<EntryPoint>]
let main argv =
let x = StringConstant("little", "shack")
printfn "%A" x
0 // return an integer exit code
(By way of context, StringConstant is used in the FParsec tutorial, but this example does not use FParsec.)
What I would like to know is:
what exactly is the type statement doing?
once I instantiate x, how would I access the individual "parts"
("little" or "shack")
As others already noted, technically, StringConstant is a discriminated union with just a single case and you can extract the value using pattern matching.
When talking about domain modelling in F#, I like to use another useful analogy. Often, you can start just by saying that some data type is a tuple:
type Person = string * int
This is a really easy way to represent data, but the problem is that when you write "Tomas", 42, the compiler does not know that you mean Person, but instead understands it as a string * int tuple. One-case discriminated unions are a really nice way to name your tuple:
type Person = Person of string * int
It might be a bit confusing that this is using the name Person twice - first as a type name and second as a name of the case. This has no special meaning - it simply means that the type will have the same name as the case.
Now you can write Person("Tomas", 42) to create a value and it will have a type Person. You can decompose it using match or let, but you can also easily write functions that take Person. For example, to return name, you can write:
let getName (Person(name, _)) =
name
I think single-case discriminated unions are often used mainly because they are really easy to define and really easy to work with. However, I would not use them in code that is exposed as a public API because they are a bit unusual and may be confusing.
PS: Also note that you need to use parentheses when extracting the values:
// Correct. Defines symbols 'name' and 'age'
let (Person(name, age)) = tomas
// Incorrect! Defines a function `Person` that takes a tuple
// (and hides the `Person` case of the discriminated union)
let Person(name, age) = tomas
StringConstant is a discriminated union type, with just a single case (also named StringConstant). You extract the parts via pattern matching, using match/function or even just let, since there is just a single case:
let (StringConstant(firstPart, secondPart)) = x
type StringConstant = StringConstant of string * string
results in a discriminated union with one case. F# Interactive echoes it as
type StringConstant = | StringConstant of string * string
You can see the msdn documentation on that here.
You can get the value out like this:
let printValue opt =
match opt with
| StringConstant( x, y) -> printfn "%A%A" x y
The others already mentioned how to extract the data from a discriminated union, but to elaborate a little more: discriminated unions are sort of like enums on steroids. They are implemented behind the scenes as a type hierarchy where the type is the base class and the cases are subclasses of that base class, with whatever parameters they have exposed as read-only public members.
In Scala a similar data structure is called a case class, which might help convince you of this implementation method.
One nice property of discriminated unions is that they can refer to themselves and are therefore perfect for defining recursive structures like a tree. Below is a definition of a Huffman coding tree in just three lines of code. Doing that in C# would probably take somewhere between 5 and 10 times as many lines of code.
type CodeTree =
| Branch of CodeTree * CodeTree * list<char> * int
| Leaf of char * int
For information about discriminated unions, see the MSDN documentation.
For an example of using discriminated unions as a tree structure, see this gist, which is an implementation of a Huffman decoder in roughly 60 lines of F#.

F# Type Annotation For Lists

In F# what is the type annotation for a typed list (e.g. a list of int)? With a simple function I can do annotations as follows:
let square(x:int) = ...
I've annotated x as an int type. But what if I want to do a type annotation for an int list? For example, let's say I have a max function that expects a list - how would I do a type annotation for it?
let max(numbers:??) = ...
There are two options:
let max (numbers:int list) = ...
let max (numbers:list<int>) = ...
The first version uses syntax that is inherited from OCaml (and is frequently used for primitive F# types such as lists). The second version uses .NET syntax (and is more frequently used for .NET types or when writing object-oriented code in F#). However, both of them mean exactly the same thing.
In any case, the form of type annotation is always (<something> : <type>) where <something> is either a pattern (as in parameter list) or an expression. This means that int list and list<int> are just names of types. F# Interactive prints the type if you enter some value, so you can use this to learn more about how type names are written:
> [1;2;3]
val it : int list = [ 1; 2; 3 ]

How to create a recursive data structure value in (functional) F#?

How can a value of type:
type Tree =
| Node of int * Tree list
have a value that references itself generated in a functional way?
The resulting value should be equal to x in the following Python code, for a suitable definition of Tree:
x = Tree()
x.tlist = [x]
Edit: Obviously more explanation is necessary. I am trying to learn F# and functional programming, so I chose to implement the cover tree which I have programmed before in other languages. The relevant thing here is that the points of each level are a subset of those of the level below. The structure conceptually goes to level -infinity.
In imperative languages a node has a list of children which includes itself. I know that this can be done imperatively in F#. And no, it doesn't create an infinite loop given the cover tree algorithm.
Tomas's answer suggests two possible ways to create recursive data structures in F#. A third possibility is to take advantage of the fact that record fields support direct recursion (when used in the same assembly that the record is defined in). For instance, the following code works without any problem:
type 'a lst = Nil | NonEmpty of 'a nelst
and 'a nelst = { head : 'a; tail : 'a lst }
let rec infList = NonEmpty { head = 1; tail = infList }
Using this list type instead of the built-in one, we can make your code work:
type Tree = Node of int * Tree lst
let rec x = Node(1, NonEmpty { head = x; tail = Nil })
You cannot do this directly if the recursive reference is not delayed (e.g. wrapped in a function or lazy value). I think the motivation is that there is no way to create the value with immediate references "at once", so this would be awkward from the theoretical point of view.
However, F# supports recursive values - you can use those if the recursive reference is delayed (the F# compiler will then generate some code that initializes the data structure and fills in the recursive references). The easiest way is to wrap the reference inside a lazy value (a function would work too):
type Tree =
| Node of int * Lazy<Tree list>
// Note you need 'let rec' here!
let rec t = Node(0, lazy [t; t;])
Another option is to write this using mutation. Then you also need to make your data structure mutable. You can for example store ref<Tree> instead of Tree:
type Tree =
| Node of int * ref<Tree> list
// empty node that is used only for initialization
let empty = Node(0, [])
// create two references that will be mutated after creation
let a, b = ref empty, ref empty
// create a new node
let t = Node(0, [a; b])
// replace empty node with recursive reference
a := t; b := t
As James mentioned, if you're not allowed to do this, you can have some nice properties, such as the guarantee that any program that walks the data structure will terminate (because the data structure is finite and cannot be recursive). So, you'll need to be a bit more careful with recursive values :-)

Resources