Understanding F# StringConstant - f#

I am trying to understand the following code, particularly StringConstant:
type StringConstant = StringConstant of string * string
[<EntryPoint>]
let main argv =
let x = StringConstant("little", "shack")
printfn "%A" x
0 // return an integer exit code
(By way of context, StringConstant is used in the FParsec tutorial, but this example does not use FParsec.)
What I would like to know is:
what exactly is the type statement doing?
once I instantiate x, how would I access the individual "parts"
("little" or "house")

As others already noted, technically, StringConstant is a discriminated union with just a single case and you can extract the value using pattern matching.
When talking about domain modelling in F#, I like to use another useful analogy. Often, you can start just by saying that some data type is a tuple:
type Person = string * int
This is really easy way to represent data, but the problem is that when you write "Tomas", 42, the compiler does not know that you mean Person, but instead understands it as string * int tuple. One-case discriminated unions are a really nice way to name your tuple:
type Person = Person of string * int
It might be a bit confusing that this is using the name Person twice - first as a type name and second as a name of the case. This has no special meaning - it simply means that the type will have the same name as the case.
Now you can write Person("Tomas", 42) to create a value and it will have a type Person. You can decompose it using match or let, but you can also easily write functions that take Person. For example, to return name, you can write:
let getName (Person(name, _)) =
name
I think single-case discriminated unions are often used mainly because they are really easy to define and really easy to work with. However, I would not use them in code that is exposed as a public API because they are a bit unusual and may be confusing.
PS: Also note that you need to use parentheses when extracting the values:
// Correct. Defines symbols 'name' and 'age'
let (Person(name, age)) = tomas
// Incorrect! Defines a function `Person` that takes a tuple
// (and hides the `Person` case of the discriminated union)
let Person(name, age) = tomas

StringConstant is a discriminated union type, with just a single case (also named StringConstant). You extract the parts via pattern matching, using match/function or even just let, since there is just a single case:
let (StringConstant(firstPart, secondPart)) = x

type StringConstant = StringConstant of string * string
results in a discriminated union with one type.
type StringConstant = | StringConstant of string * string if you execute it in F# interactive.
You can see the msdn documentation on that here.
You can get the value out like this:
let printValue opt =
match opt with
| StringConstant( x, y) -> printfn "%A%A" x y

The other guys already mentioned how you extract the data from a discriminated union, but to elaborate a little more on Discriminated unions one could say that they are sorta like enums on steroids. They are implemented behind the scenes as a type hierarchy where the type is the base class and the cases are subclases of that baseclass with whatever parameter they might have as readonly public variables.
In Scala a similar data-structure is called case classes which might help you convince yourself of this implementationmethod.
One nice property of discriminated unions are that they are self-referenceable and therefor are perfect for defining recursive structures like a tree. Below is a definition of a Hoffman coding tree in just three lines of code. Doing that in C# would probably take somewhere between 5 and 10 times as many lines of code.
type CodeTree =
| Branch of CodeTree * CodeTree * list<char> * int
| Leaf of char * int
For information about Discriminated Unions see the msdn documentation
For an example of using Discriminated Unions as a tree-structure see this gist which is an implementation of a huffman decoder in roughly 60 lines of F#)

Related

Are there use cases for single case variants in Ocaml?

I've been reading F# articles and they use single case variants to create distinct incompatible types. However in Ocaml I can use private module types or abstract types to create distinct types. Is it common in Ocaml to use single case variants like in F# or Haskell?
Another specialized use case fo a single constructor variant is to erase some type information with a GADT (and an existential quantification).
For instance, in
type showable = Show: 'a * ('a -> string) -> showable
let show (Show (x,f)) = f x
let showables = [ Show (0,string_of_int); Show("string", Fun.id) ]
The constructor Show pairs an element of a given type with a printing function, then forget the concrete type of the element. This makes it possible to have a list of showable elements, even if each elements had a different concrete types.
For what it's worth it seems to me this wasn't particularly common in OCaml in the past.
I've been reluctant to do this myself because it has always cost something: the representation of type t = T of int was always bigger than just the representation of an int.
However recently (probably a few years) it's possible to declare types as unboxed, which removes this obstacle:
type [#unboxed] t = T of int
As a result I've personally been using single-constructor types much more frequently recently. There are many advantages. For me the main one is that I can have a distinct type that's independent of whether it's representation happens to be the same as another type.
You can of course use modules to get this effect, as you say. But that is a fairly heavy solution.
(All of this is just my opinion naturally.)
Yet another case for single-constructor types (although it does not quite match your initial question of creating distinct types): fancy records. (By contrast with other answers, this is more a syntactic convenience than a fundamental feature.)
Indeed, using a relatively recent feature (introduced with OCaml 4.03, in 2016) which allows writing constructor arguments with a record syntax (including mutable fields!), you can prefix regular records with a constructor name, Coq-style.
type t = MakeT of {
mutable x : int ;
mutable y : string ;
}
let some_t = MakeT { x = 4 ; y = "tea" }
(* val some_t : t = MakeT {x = 4; y = "tea"} *)
It does not change anything at runtime (just like Constr (a,b) has the same representation as (a,b), provided Constr is the only constructor of its type). The constructor makes the code a bit more explicit to the human eye, and it also provides the type information required to disambiguate field names, thus avoiding the need for type annotations. It is similar in function to the usual module trick, but more systematic.
Patterns work just the same:
let (MakeT { x ; y }) = some_t
(* val x : int = 4 *)
(* val y : string = "tea" *)
You can also access the “contained” record (at no runtime cost), read and modify its fields. This contained record however is not a first-class value: you cannot store it, pass it to a function nor return it.
let (MakeT fields) = some_t in fields.x (* returns 4 *)
let (MakeT fields) = some_t in fields.x <- 42
(* some_t is now MakeT {x = 42; y = "tea"} *)
let (MakeT fields) = some_t in fields
(* ^^^^^^
Error: This form is not allowed as the type of the inlined record could escape. *)
Another use case of single-constructor (polymorphic) variants is documenting something to the caller of a function. For instance, perhaps there's a caveat with the value that your function returns:
val create : unit -> [ `Must_call_close of t ]
Using a variant forces the caller of your function to pattern-match on this variant in their code:
let (`Must_call_close t) = create () in (* ... *)
This makes it more likely that they'll pay attention to the message in the variant, as opposed to documentation in an .mli file that could get missed.
For this use case, polymorphic variants are a bit easier to work with as you don't need to define an intermediate type for the variant.

F# optional arguments and overloading alternatives

In my F# application I often need to perform a case-insensitive search of a string within a string, so I created a function with the appropriate comparison:
let indexOf (str:string) (value:string) startIndex =
match str.IndexOf(value, startIndex, StringComparison.OrdinalIgnoreCase) with
| index when index >= 0 -> Some index
| _ -> None
I do not like the fact, that when I want to search from the beginning, I have to pass the redundant 0 as the start index.
I am relatively new to both F# and the functional programming, so I would like to know what is the preferred (cleanest) solution from the functional point of view?
Create two versions:
let indexOfFrom (str:string) (value:string) startIndex = (...)
let indexOf str value = indexOfFrom str value 0
Use Option type:
let foundIndex = indexOf "bar foobar" "bar" (Some 4)
Create a dedicated discriminated union:
type Position =
| Beginning
| StartIndex of index : int
let foundIndex = indexOf "bar foobar" "bar" (Index 4)
Place the 'indexOf' function inside a type and use the 'classic' overloading.
Place the 'indexOf' function inside a type and use F# optional arguments.
If you are defining the functionality as F# functions, then I think that using two separate functions (with reasonably descriptive names) is probably the best option you have. So I'd go with your first option (I definitely prefer this option over defining a discriminated union just for this single purpose):
let indexOfFrom (str:string) (value:string) startIndex = (...)
let indexOf str value = indexOfFrom str value 0
The alternative is to define the functionality as members of a type - then you can use both overloading and F# optional arguments, but you'd have to access them using full name String.IndexOf. You could write something like:
type String =
static member IndexOf(str:string, value:string, startIndex) = (...)
static member IndexOf(str, value) = String.IndexOf(str, value, 0)
Or, using optional parameters:
type String =
static member IndexOf(str:string, value:string, ?startIndex) = (...)
Which of the options is the best one?
If you're designing functional API (e.g. domain-specific language), then your option with two separate functions is probably the best choice.
If you're aiming to design a nice F# API, then I think your option (multiple functions) or optional parameters are quite reasonable. Functions are used quite heavily in Deedle and F# Charting relies on optional arguments.
The benefit of using overloading is that the library will be also nicely usable from C#. So, if you're thinking of calling the library from C#, this is pretty much the only option.
I think option 1 (with curried function) would be simplest. Curried functions are pretty common in functional programming.
In options 2 or 3 you'll still have to pass additional parameter to the function for the search from the beginning
Options 4 or 5 require additional overhead to create type. It's kind of 'overkill' for this simple task

A very basic Type check fails in F# ... why?

I wrote this code
type Test =
| Age of int
| Name of string;;
let x = Age(10);;
if (x.GetType() = typeof<Test>) then printfn "true" else printfn "false";;
The code prints false. But that puzzles me because isn't Age of type Test?
Also, is there a better way to compare types in F# the .GetType() = typeof<> is very long. I tried :? but I think that's for typecasting rather than comparing types.
The plain answer is, do so:
if (x :> obj) :? Test then printfn "true" else printfn "false"
This issue comes because of the implementation of DUs (using internal classes and tags) and the limitation of F#'s type system (which does not acknowledge the implementation).
As you saw, the type of x is FSI_0001+Test+Age, and F# does not recognize that as a sub-type of Test.
Quoting from the spec
A compiled union type U has:
· One CLI static getter property U.C for each null union case C. This
property gets a singleton object that represents each such case.
· One CLI nested type U.C for each non-null union case C. This type
has instance properties Item1, Item2.... for each field of the union
case, or a single instance property Item if there is only one field.
However, a compiled union type that has only one case does not have a
nested type. Instead, the union type itself plays the role of the case
type.
We see that Age is implemented as a nested type of the parent DU. As a result you could use Type.GetNestedTypes to get all the subtypes of the DU and then test each one to see if the type matches.
But that puzzles me because isn't Test of type Color?
There is no type Color and Test is a type itself so it does not have a type. So your question is nonsensical.
Perhaps you meant to ask why the answer is what it is? If so, it is a consequence of the way F# currently chooses to represent such values internally.

F# Method overload resolution not as smart as C#?

Say, I have
member this.Test (x: 'a) = printfn "generic"
1
member this.Test (x: Object) = printfn "non generic"
2
If I call it in C#
var a = test.Test(3); // calls generic version
var b = test.Test((object)3); // calls non generic version
var c = test.Test<object>(3); // calls generic version
However, in F#
let d = test.Test(3); // calls non generic version
let e = test.Test<int>(3); // calls generic version
So I have to add type annotation so as to get the correct overloaded method. Is this true? If so, then why F# doesn't automatically resolve correctly given that the argument type is already inferred? (what is the order of F#'s overload resolution anyway? always favor Object than its inherited classes?)
It is a bit dangerous if a method has both overloads, one of them takes argument as Object type and the other one is generic and both return the same type. (like in this example, or Assert.AreEqual in unit testing), as then it is very much likely we get the wrong overloading without even notice (won't be any compiler error). Wouldn't it be a problem?
Update:
Could someone explain
Why F# resolves Assert.AreEqual(3, 5) as Assert.AreEqual(Object a, Object b) but not Assert.AreEqual<T>(T a, T b)
But F# resolves Array.BinarySearch([|2;3|], 2) as BinarySearch<T>(T[]array, T value) but not BinarySearch(Array array, Object value)
F# Method overload resolution not as smart as C#?
I don't think it's true. Method overloading makes type inference much more difficult. F# has reasonable trade-offs to make method overloading usable and type inference as powerful as it should be.
When you pass a value to a function/method, F# compiler automatically upcasts it to an appropriate type. This is handy in many situations but also confusing sometimes.
In your example, 3 is upcasted to obj type. Both methods are applicable but the simpler (non-generic) method is chosen.
Section 14.4 Method Application Resolution in the spec specifies overloading rules quite clearly:
1) Prefer candidates whose use does not constrain the use of a
user-introduced generic type annotation to be equal to another type.
2) Prefer candidates that do not use ParamArray conversion. If two
candidates both use ParamArray conversion with types pty1 and pty2,
and pty1 feasibly subsumes pty2, prefer the second; that is, use the
candidate that has the more precise type.
3) Prefer candidates that do not have
ImplicitlyReturnedFormalArgs.
4) Prefer candidates that do not have
ImplicitlySuppliedFormalArgs.
5) If two candidates have unnamed actual argument types ty11...ty1n and ty21...ty2n, and each ty1i either
a. feasibly subsumes ty2i, or
b. ty2i is a System.Func type and ty1i is some other delegate
type, then prefer the second candidate. That is, prefer any candidate that has the more specific actual argument types, and
consider any System.Func type to be more specific than any other
delegate type.
6) Prefer candidates that are not extension members over
candidates that are.
7) To choose between two extension members, prefer the one that
results from the most recent use of open.
8) Prefer candidates that are not generic over candidates that are
generic—that is, prefer candidates that have empty ActualArgTypes.
I think it's users' responsibility to create unambiguous overloaded methods. You can always look at inferred types to see whether you're doing them correctly. For example, a modified version of yours without ambiguity:
type T() =
member this.Test (x: 'a) = printfn "generic"; 1
member this.Test (x: System.ValueType) = printfn "non-generic"; 2
let t = T()
let d = t.Test(3) // calls non-generic version
let e = t.Test(test) // call generic version
UPDATE:
It comes down a core concept, covariance. F# doesn't support covariance on arrays, lists, functions, etc. It's generally a good thing to ensure type safety (see this example).
So it's easy to explain why Array.BinarySearch([|2;3|], 2) is resolved to BinarySearch<T>(T[] array, T value). Here is another example on function arguments where
T.Test((fun () -> 2), 2)
is resolved to
T.Test(f: unit -> 'a, v: 'a)
but not to
T.Test(f: unit -> obj, v: obj)

F#: Can't hide a type abbreviation in a signature? Why not?

In F#, I'd like to have what I see as a fairly standard Abstract Datatype:
// in ADT.fsi
module ADT
type my_Type
// in ADT.fs
module ADT
type my_Type = int
In other words, code inside the module knows that my_Type is an int, but code outside does not. However, F# seems to have a restriction where type abbreviations specifically cannot be hidden by a signature. This code gives a compiler error, and the restriction is described here.
If my_Type were instead a discriminated union, then there is no compiler error. My question is, why the restriction? I seem to remember being able to do this in SML and Ocaml, and furthermore, isn't this a pretty standard thing to do when creating an abstract datatype?
Thanks
As Ganesh points out, this is a technical limitation of the F# compiler (and .NET runtime), because the type abbreviation is simply replaced by the actual type during the compilation. As a result, if you write a function:
let foo (a:MyType) : MyType = a + 1
The compiler will compile it as a .NET method with the following signature:
int foo(int a);
If the actual type of the abbreviation was hidden from the users of the library, then they wouldn't be able to recognize that the foo function is actually working with MyType (this information is probably stored in some F#-specific meta-data, but that is not accessible to other .NET languages...).
Perhaps the best workaround for this limiation is to define the type as a single-case discriminated union:
type MyType = MT of int
let foo (MT a) = MT(a + 1)
Working with this kind of type is quite convenient. It adds some overhead (there are new objects created when constructing a value of the type), but that shouldn't be a big issue in most of the situations.
Type abbreviations in F# are compiled away (i.e. the compiled code will use int, not MyType), so you can't make them properly abstract. In theory the compiler could enforce the abstraction within the F# world, but this wouldn't be very helpful as it would still leak in other languages.
Note that you can define a type abbreviation as private within a module:
// File1.fs
module File1
type private MyType = int
let e : MyType = 42
let f (x:MyType) = x+1
// Program.fs
module Program
do printfn "%A" (File1.f File1.e)
I am unclear why you can't hide it with a signature; I logged a bug to consider it.
From what I understand F# does not allow an abbreviation to be hidden by a signature.
I found this link where the blogger commented on this but I am not sure on the specifics of why this is the case.
My assumption is that this is a restraint set to allow more effective interop with other languages on the CLR.

Resources