Can F# units of measure be implemented in OCaml? - f#

F# has a units of measure capability (there's more detail in this research paper).
[<Measure>] type unit-name [ = measure ]
This allows units to be defined such as:
type [<Measure>] USD
type [<Measure>] EUR
And code to be written as:
let dollars = 25.0<USD>
let euros = 25.0<EUR>
// Results in an error as the units differ
if dollars > euros then printfn "Greater!"
It also handles conversions (I'm guessing that means Measure has some functions defined that let Measures be multiplied, divided and exponentiated):
// Mass, grams.
[<Measure>] type g
// Mass, kilograms.
[<Measure>] type kg
let gramsPerKilogram : float<g kg^-1> = 1000.0<g/kg>
let convertGramsToKilograms (x : float<g>) = x / gramsPerKilogram
Could this capability be implemented in OCaml? Someone suggested I look at phantom types but they don't appear to compose in the same way as units.
(Disclosure: I asked this question about Haskell a few months ago, got an interesting discussion but no definitive answer beyond 'probably not').

Quick answer: No, that's beyond the capabilities of current OCaml type inference.
To explain it a bit more: type inference in most functional languages is based on a concept called unification, which is really just a specific way of solving equations. For example, inferring the type of an expression like
let f l i j =
(i, j) = List.nth l (i + j)
involves first creating a set of equations (where the types of l, i and j are 'a, 'b and 'c respectively, and List.nth : 'd list -> int -> 'd, (=) : 'e -> 'e -> bool, (+) : int -> int -> int):
'e ~ 'b * 'c
'a ~ 'd list
'b ~ int
'c ~ int
'd ~ 'e
and then solving these equations, which yields 'a ~ (int * int) list and f : (int * int) list -> int -> int -> bool. As you can see, these equations are not very hard to solve; in fact, the only theory underlying unification is syntactic equality, i.e. two things are equal if and only if they are written in the same way (with special consideration for unbound variables).
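To make the equation-solving step concrete, here is a minimal sketch of syntactic unification in OCaml, run on the five equations above. It is only an illustration of the idea, not how any real compiler is implemented; the names ty, apply, occurs, unify and the t_* helpers are made up for this example.

```ocaml
(* Types are either variables ('a, 'b, ...) or constructors applied to arguments. *)
type ty =
  | Var of string
  | Con of string * ty list

let t_int = Con ("int", [])
let t_list t = Con ("list", [ t ])
let t_pair a b = Con ("*", [ a; b ])

(* A substitution maps variable names to types; apply resolves it fully. *)
let rec apply subst = function
  | Var v ->
    (match List.assoc_opt v subst with
     | Some t -> apply subst t
     | None -> Var v)
  | Con (c, args) -> Con (c, List.map (apply subst) args)

(* Does variable v appear anywhere inside the given type? *)
let rec occurs v = function
  | Var w -> v = w
  | Con (_, args) -> List.exists (occurs v) args

(* Solve one equation a ~ b by syntactic equality, extending the substitution. *)
let rec unify subst (a, b) =
  match apply subst a, apply subst b with
  | Var v, Var w when v = w -> subst
  | Var v, t | t, Var v ->
    if occurs v t then failwith "occurs check: infinite type"
    else (v, t) :: subst
  | Con (c, xs), Con (d, ys) ->
    if c <> d || List.length xs <> List.length ys then failwith "type mismatch"
    else List.fold_left unify subst (List.combine xs ys)

(* The equations generated for f, in the order listed above. *)
let equations =
  [ (Var "e", t_pair (Var "b") (Var "c"));
    (Var "a", t_list (Var "d"));
    (Var "b", t_int);
    (Var "c", t_int);
    (Var "d", Var "e") ]

let solution = List.fold_left unify [] equations

(* The type of l: (int * int) list *)
let type_of_l = apply solution (Var "a")
```

The occurs check in unify is the same kind of check that produces "infinite type" errors when a variable would have to contain itself.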
The problem with units of measure is that the equations that are generated cannot be solved in a unique way using syntactic equality; the correct theory to use is the theory of Abelian groups (inverses, identity element, commutative operation). For example, the units of measure m * s * s⁻¹ should be equivalent to m. There is a further complication when it comes to principal types and let-generalization. For example, the following doesn't type-check in F#:
fun x -> let y z = x / z in (y mass, y time)
because y is inferred to have type float<'_a> -> float<'b * '_a⁻¹>, instead of the more general type float<'a> -> float<'b * 'a⁻¹>
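The Abelian-group equality mentioned above (m * s * s⁻¹ being the same unit as m) can be illustrated at the value level. The sketch below is only that: it normalizes concrete units into a map of exponents and does nothing about unit variables or inference, which is where the real difficulty lies. The names units, base, mul, inv and equal are invented for the example.

```ocaml
module SMap = Map.Make (String)

(* A unit is a map from base-unit names to non-zero integer exponents. *)
type units = int SMap.t

let dimensionless : units = SMap.empty
let base name : units = SMap.singleton name 1

(* Multiplication adds exponents and drops any that cancel to zero. *)
let mul (u : units) (v : units) : units =
  SMap.union (fun _ a b -> if a + b = 0 then None else Some (a + b)) u v

(* Inversion negates every exponent. *)
let inv (u : units) : units = SMap.map (fun e -> -e) u

(* Two units are equal when their normal forms coincide. *)
let equal (u : units) (v : units) = SMap.equal ( = ) u v

let m = base "m"
let s = base "s"

(* m * s * s⁻¹ normalizes to m, and multiplication is commutative. *)
let cancels = equal (mul (mul m s) (inv s)) m
let commutes = equal (mul m s) (mul s m)
```

Syntactic equality would treat m * s * s⁻¹ and m as different terms; comparing normal forms is what makes them equal here.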
Anyhow, for more information, I recommend reading chapter 3 of the following PhD thesis:
http://adam.gundry.co.uk/pub/thesis/thesis-2013-12-03.pdf

It cannot be directly expressed in the syntax of the type system, but some encodings are possible. One was proposed, for instance, in this message on the caml-list: https://sympa.inria.fr/sympa/arc/caml-list/2014-06/msg00069.html . Here is the formatted content of that message. By the way, I can't see any reason why this wouldn't be applicable to Haskell.
module Unit : sig
type +'a suc
type (+'a, +'b) quantity
val of_float : float -> ('a, 'a) quantity
val metre : ('a, 'a suc) quantity
val mul : ('a, 'b) quantity -> ('b, 'c) quantity -> ('a, 'c) quantity
val add : ('a, 'b) quantity -> ('a, 'b) quantity -> ('a, 'b) quantity
val neg : ('a, 'b) quantity -> ('a, 'b) quantity
val inv : ('a, 'b) quantity -> ('b, 'a) quantity
end = struct
type 'a suc = unit
type ('a, 'b) quantity = float
let of_float x = x
let metre = 1.
let mul x y = x *. y
let add x y = x +. y
let neg x = 0. -. x
let inv x = 1. /. x
end
This successfully tracks the dimension of quantities:
# open Unit;;
# let m10 = mul (of_float 10.) metre;;
val m10 : ('a, 'a Unit.suc) Unit.quantity = <abstr>
# let sum = add m10 m10;;
val sum : ('a, 'a Unit.suc) Unit.quantity = <abstr>
# let sq = mul m10 m10;;
val sq : ('a, 'a Unit.suc Unit.suc) Unit.quantity = <abstr>
# let cube = mul m10 (mul m10 m10);;
val cube : ('a, 'a Unit.suc Unit.suc Unit.suc) Unit.quantity = <abstr>
# let _ = add (mul sq (inv cube)) (inv m10);;
- : ('a Unit.suc, 'a) Unit.quantity = <abstr>
and it will give errors if they are used incorrectly:
# let _ = add sq cube;;
Characters 15-19:
let _ = add sq cube;;
^^^^
Error: This expression has type
('a, 'a Unit.suc Unit.suc Unit.suc) Unit.quantity
but an expression was expected of type
('a, 'a Unit.suc Unit.suc) Unit.quantity
The type variable 'a occurs inside 'a Unit.suc
# let _ = add m10 (mul m10 m10);;
Characters 16-29:
let _ = add m10 (mul m10 m10);;
^^^^^^^^^^^^^
Error: This expression has type ('a, 'a Unit.suc Unit.suc) Unit.quantity
but an expression was expected of type ('a, 'a Unit.suc)
Unit.quantity
The type variable 'a occurs inside 'a Unit.suc
However, it will infer too restrictive types for some things:
# let sq x = mul x x;;
val sq : ('a, 'a) Unit.quantity -> ('a, 'a) Unit.quantity = <fun>

Related

Is there a way to make this continuation passing with codata example work in F#?

type Interpreter<'a> =
| RegularInterpreter of (int -> 'a)
| StringInterpreter of (string -> 'a)
let add<'a> (x: 'a) (y: 'a) (in_: Interpreter<'a>): 'a =
match in_ with
| RegularInterpreter r ->
x+y |> r
| StringInterpreter r ->
sprintf "(%s + %s)" x y |> r
The error message of it not being able to resolve 'a at compile time is pretty clear to me. I am guessing that the answer to the question of whether it is possible to make the above work is no, short of adding functions directly into the datatype. But then I might as well use an interface, or get rid of generic parameters entirely.
Edit: Mark's reply does in fact do what I asked, but let me extend the question as I did not explain it adequately. What I am trying to do with the technique above is imitate what was done in this post. The motivation for this is to avoid inlined functions as they have poor composability - they can't be passed as lambdas without having their generic arguments specialized.
I was hoping that I might be able to work around it by passing a union type with a generic argument into a closure, but...
type Interpreter<'a> =
| RegularInterpreter of (int -> 'a)
| StringInterpreter of (string -> 'a)
let val_ x in_ =
match in_ with
| RegularInterpreter r -> r x
| StringInterpreter r -> r (string x)
let inline add x y in_ =
match in_ with
| RegularInterpreter r ->
x in_ + y in_ |> r
| StringInterpreter r ->
sprintf "(%A + %A)" (x in_) (y in_) |> r
let inline mult x y in_ =
match in_ with
| RegularInterpreter r ->
x in_ * y in_ |> r
| StringInterpreter r ->
sprintf "(%A * %A)" (x in_) (y in_) |> r
let inline r2 in_ = add (val_ 1) (val_ 3) in_
r2 (RegularInterpreter id)
r2 (StringInterpreter id) // Type error.
This last line gives a type error. Is there a way around this? Though I'd prefer the functions to not be inlined due to the limits they place on composability.
Remove the type annotations:
let inline add x y in_ =
match in_ with
| RegularInterpreter r ->
x + y |> r
| StringInterpreter r ->
sprintf "(%A + %A)" x y |> r
You'll also need to make a few other changes, which I've also incorporated above:
Change the format specifiers used with sprintf to something more generic. When you use %s, you're saying that the argument for that placeholder must be a string, so the compiler would infer x and y to be string values.
Add the inline keyword.
With these changes, the inferred type of add is now:
x: ^a -> y: ^b -> in_:Interpreter<'c> -> 'c
when ( ^a or ^b) : (static member ( + ) : ^a * ^b -> int)
You'll notice that it works for any type where + is defined as turning the input arguments into int. In practice, that's probably going to mean only int itself, unless you define a custom operator.
FSI smoke tests:
> add 3 2 (RegularInterpreter id);;
val it : int = 5
> add 2 3 (StringInterpreter (fun _ -> 42));;
val it : int = 42
The compiler ends up defaulting to int, and the kind of polymorphism you want is difficult to achieve in F#. This article articulates the point.
Perhaps, you could work the dark arts using FSharp.Interop.Dynamic but you lose compile time checking which sort of defeats the point.
I've come to the conclusion that what I am trying to do is impossible. I had a hunch that it was already, but the proof is in the following:
let vale (x,_,_) = x
let adde (_,x,_) = x
let multe (_,_,x) = x
let val_ x d =
let f = vale d
f x
let add x y d =
let f = adde d
f (x d) (y d)
let mult x y d =
let f = multe d
f (x d) (y d)
let in_1 =
let val_ (x: int) = x
let add x y = x+y
let mult x y = x*y
val_,add,mult
let in_2 =
let val_ (x: int) = string x
let add x y = sprintf "(%s + %s)" x y
let mult x y = sprintf "(%s * %s)" x y
val_,add,mult
let r2 d = add (val_ 1) (val_ 3) d
//let test x = x in_1, x in_2 // Type error.
let a2 = r2 in_1 // Works
let b2 = r2 in_2 // Works
The reasoning goes that if it cannot be done with plain functions passed as arguments, then it definitely won't be possible with interfaces, records, discriminated unions or any other scheme. The standard functions are more generic than any of the above, and if they cannot do it then this is a fundamental limitation of the language.
It is not the lack of HKTs that makes the code ungeneric, but something as simple as this. In fact, going by the Finally Tagless paper linked to in the Reddit post, Haskell has the same problem with needing to duplicate interpreters without the impredicative types extension - though I've looked around and it seems that impredicative types will be removed in the future as the extension is difficult to maintain.
Nevertheless, I do hope this is only a current limitation of F#. If the language was dynamic, the code segment above would in fact run correctly.
Unfortunately, it's not completely clear to me what you're trying to do. However, it seems likely that it's possible by creating an interface with a generic method. For example, here's how you could get the code from your answer to work:
type I = abstract Apply : ((int -> 'a) * ('a -> 'a -> 'a) * ('a -> 'a -> 'a)) -> 'a
//let test x = x in_1, x in_2 // Type error.
let test (i:I) = i.Apply in_1, i.Apply in_2
let r2' = { new I with member __.Apply d = add (val_ 1) (val_ 3) d }
test r2' // no problem
If you want to use a value (e.g. a function input) generically, then in most cases the cleanest way is to create an interface with a generic method whose signature expresses the required polymorphism.
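For comparison, the same idea can be sketched in OCaml, where a record field with an explicitly polymorphic type plays the role of the interface with a generic method. This is an illustration under that assumption, not a translation endorsed by either poster; val_, add, mult, in_1, in_2, r2 and test mirror the names used above, while interp and term are invented here.

```ocaml
(* An interpreter is a triple (val, add, mult) over some result type 'a. *)
type 'a interp = (int -> 'a) * ('a -> 'a -> 'a) * ('a -> 'a -> 'a)

(* A term must work with every interpreter: the field is universally quantified. *)
type term = { apply : 'a. 'a interp -> 'a }

let val_ x : term = { apply = (fun (v, _, _) -> v x) }

let add x y : term =
  { apply = (fun ((_, a, _) as d) -> a (x.apply d) (y.apply d)) }

let mult x y : term =
  { apply = (fun ((_, _, m) as d) -> m (x.apply d) (y.apply d)) }

(* Evaluate to an int. *)
let in_1 : int interp = ((fun x -> x), ( + ), ( * ))

(* Pretty-print to a string. *)
let in_2 : string interp =
  ( string_of_int,
    (fun x y -> Printf.sprintf "(%s + %s)" x y),
    (fun x y -> Printf.sprintf "(%s * %s)" x y) )

let r2 = add (val_ 1) (val_ 3)

(* The same term is used at two different result types. *)
let test (t : term) = (t.apply in_1, t.apply in_2)
let result = test r2
```

Because the quantifier sits inside the record, test can apply its argument to both interpreters, which is exactly what the commented-out test function could not do with a plain function argument.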

F# Create Factorial function without recursion, library functions or loops

In this video about functional programming at 35:14 Jim Weirich writes a function to compute factorial without using recursion, library functions or loops:
The code in Ruby
fx = ->(improver) {
improver.(improver)
}.(
->(improver) {
->(n) { n.zero? ? 1 : n * improver.(improver).(n-1) }
}
)
I'm trying to express this approach in F#
let fx =
(fun improver -> improver(improver))(
fun improver ->
fun n ->
if n = 0 then 1
else n * improver(improver(n - 1)))
I'm currently stuck at
Type mismatch. Expecting a 'a but given a 'a -> 'b
The resulting type would be infinite when unifying ''a' and ''a -> 'b'
I can't seem to find the right type annotation or other way of expressing the function
Edit:
*without the rec keyword
Languages with ML-style type inference won't be able to infer a type for the term fun improver -> improver improver. They start by assuming the type 'a -> 'b for a lambda definition (for some undetermined types 'a and 'b), so the argument improver has type 'a; but it is then applied to itself to give the result (of type 'b), so improver must simultaneously have type 'a -> 'b. In the F# type system there's no way to unify these types (and in the simply-typed lambda calculus there's no way to give this term a type at all). My answer to the question that you linked to in your comment covers some workarounds. @desco has given one of those already. Another is:
let fx = (fun (improver:obj->_) -> improver improver)
(fun improver n ->
if n = 0 then 1
else n * (improver :?> _) improver (n-1))
This is cheating, but you can use types
type Self<'T> = delegate of Self<'T> -> 'T
let fx1 = (fun (x: Self<_>) -> x.Invoke(x))(Self(fun x -> fun n -> if n = 0 then 1 else x.Invoke(x)(n - 1) * n))
type Rec<'T> = Rec of (Rec<'T> -> 'T)
let fx2 = (fun (Rec(f ) as r) -> f r)(Rec(fun ((Rec f) as r) -> fun n -> if n = 0 then 1 else f(r)(n - 1) * n))
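The wrapper-type trick carries over to OCaml. The sketch below is an assumed translation of the Rec version, with the invented names self and fact; it uses a recursive type definition but no let rec, loops, or library functions.

```ocaml
(* Wrapping the function in a constructor gives self-application a finite type. *)
type 'a self = Self of ('a self -> 'a)

let fact : int -> int =
  (fun (Self f as s) -> f s)
    (Self
       (fun (Self f as s) ->
          fun n -> if n = 0 then 1 else n * f s (n - 1)))

let fact_5 = fact 5
```

Without the Self constructor, the argument would need the infinite type 'a -> 'b where 'a is itself 'a -> 'b, which is the unification failure reported in the question.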

f# interface with type constraint infinite when unifying

I'm having problems getting an interface with type constraints to work generically.
Here's the type
type LeftistHeap<'a when 'a : comparison> =
...
interface IHeap<LeftistHeap<'a>, 'a> with
...
member this.Insert (x : 'a) = LeftistHeap.insert x this
and the interface
type IHeap<'a when 'a : comparison> =
inherit System.Collections.IEnumerable
inherit System.Collections.Generic.IEnumerable<'a>
...
type IHeap<'c, 'a when 'c :> IHeap<'c, 'a> and 'a : comparison> =
inherit IHeap<'a>
...
abstract member Insert : 'a -> 'c
this code works no problem
let insertThruList l h =
List.fold (fun (h' : LeftistHeap<'a>) x -> h'.Insert x ) h l
but if I try to generalize the code for the interface
let insertThruList l h =
List.fold (fun (h' : IHeap<_,'a>) x -> h'.Insert x ) h l
I get this error at h'.Insert
Type mismatch. Expecting a
'b
but given a
IHeap<'b,'a>
The resulting type would be infinite when unifying ''b' and 'IHeap<'b,'a>'
The compiler's right: you're trying to use a 'c where you need an IHeap<'c,_>. Since 'c :> IHeap<'c,_>, one solution is just to insert an upcast:
let insertThruList l h =
List.fold (fun (h' : IHeap<_,_>) x -> h'.Insert x :> _) h l
Alternatively, you can indicate that you don't want the input to be (exactly) an IHeap<_,_>, but instead some particular subtype:
let insertThruList l h =
List.fold (fun (h' : #IHeap<_,_>) x -> h'.Insert x) h l
This is probably what you really want (the type is more specific). This is equivalent to the more verbose definition:
let insertThruList<'c,'a when 'a : comparison and 'c :> IHeap<'c,'a>> l h =
List.fold (fun (h' : 'c) x -> h'.Insert x) h l
Will this work for your case?
let insertThruList l (h : 'T when 'T :> IHeap<'T, 'a> ) =
List.fold (fun (h' : 'T) x -> h'.Insert x ) h l

".NET" and "OCaml" formatting of signatures

F# allows ".NET" and "OCaml" formatting of signatures. This can be confusing when you fall into the habit of using one style, and then find a situation where you cannot properly format the signature you need. Consider this code, which requires a flexible type as the output of the function input to foo:
let foo n (bar: int -> #seq<'a>) =
(fun () -> Vector.ofSeq (bar n))
let foobar n = Array.ofSeq([1..n])
let x = foo 10 foobar
I could not figure out how to express #seq<'a> in OCaml format. Is it possible?
The following compiles just fine:
type A<'a>(x) =
member __.Get : 'a = x
abstract PairWith : 'b -> ('a * 'b * int)
default __.PairWith y = x, y, 1
type B<'a>(x) =
inherit A<'a>(x)
override __.PairWith y = x, y, 2
let pairAB (x : #A<'a>) y =
x, x.PairWith y
type 'a X (x) =
member __.Get : 'a = x
abstract PairWith : 'b -> ('a * 'b * int)
default __.PairWith y = x, y, 1
type 'a Y (x) =
inherit X<'a>(x)
override __.PairWith y = x, y, 2
let pairXY (x : #('a X)) y =
x, x.PairWith y
So you can guess (and then confirm with F# Interactive) that you are looking for #('a seq).
I'm not exactly sure what you mean, but I assume that you want to put the type variable in front of the type name, e.g. 'a #seq.
According to the language specification (§5.1.5) it's not possible since:
A type of the form #type is an anonymous type with a subtype constraint and is equivalent to 'a when 'a :> type, where 'a is a fresh type inference variable.
So you could write your type like: 'a when 'a :> seq<'b>.
EDIT: You could actually use #('a seq), but it looks awkward and I doubt it's what you want.
EDIT2: Didn't see Ramon Snir's answer :).

Packrat parsing (memoization via laziness) in OCaml

I'm implementing a packrat parser in OCaml, as per the Master's thesis by B. Ford. My parser should receive a data structure that represents the grammar of a language and parse given sequences of symbols.
I'm stuck with the memoization part. The original thesis uses Haskell's lazy evaluation to accomplish linear time complexity. I want to do this (memoization via laziness) in OCaml, but don't know how to do it.
So, how do you memoize functions by lazy evaluations in OCaml?
EDIT: I know what lazy evaluation is and how to exploit it in OCaml. The question is how to use it to memoize functions.
EDIT: The data structure I wrote that represents grammars is:
type ('a, 'b, 'c) expr =
| Empty of 'c
| Term of 'a * ('a -> 'c)
| NTerm of 'b
| Juxta of ('a, 'b, 'c) expr * ('a, 'b, 'c) expr * ('c -> 'c -> 'c)
| Alter of ('a, 'b, 'c) expr * ('a, 'b, 'c) expr
| Pred of ('a, 'b, 'c) expr * 'c
| NPred of ('a, 'b, 'c) expr * 'c
type ('a, 'b, 'c) grammar = ('a * ('a, 'b, 'c) expr) list
The (non-memoized) function that parses a list of symbols is:
let rec parse g v xs = parse' g (List.assoc v g) xs
and parse' g e xs =
match e with
| Empty y -> Parsed (y, xs)
| Term (x, f) ->
begin
match xs with
| x' :: xs when x = x' -> Parsed (f x, xs)
| _ -> NoParse
end
| NTerm v' -> parse g v' xs
| Juxta (e1, e2, f) ->
begin
match parse' g e1 xs with
| Parsed (y, xs) ->
begin
match parse' g e2 xs with
| Parsed (y', xs) -> Parsed (f y y', xs)
| p -> p
end
| p -> p
end
( and so on )
where the type of the return value of parse is defined by
type ('a, 'c) result = Parsed of 'c * ('a list) | NoParse
For example, the grammar of basic arithmetic expressions can be specified as g, in:
type nt = Add | Mult | Prim | Dec | Expr
let zero _ = 0
let g =
[(Expr, Juxta (NTerm Add, Term ('$', zero), fun x _ -> x));
(Add, Alter (Juxta (NTerm Mult, Juxta (Term ('+', zero), NTerm Add, fun _ x -> x), (+)), NTerm Mult));
(Mult, Alter (Juxta (NTerm Prim, Juxta (Term ('*', zero), NTerm Mult, fun _ x -> x), ( * )), NTerm Prim));
(Prim, Alter (Juxta (Term ('<', zero), Juxta (NTerm Dec, Term ('>', zero), fun x _ -> x), fun _ x -> x), NTerm Dec));
(Dec, List.fold_left (fun acc d -> Alter (Term (d, (fun c -> int_of_char c - 48)), acc)) (Term ('0', zero)) ['1';'2';'3';])]
The idea of using laziness for memoization is to memoize not functions but data structures. Laziness means that when you write let x = foo in some_expr, foo will not be evaluated immediately, but only as far as some_expr needs it, and that different occurrences of x in some_expr will share the same thunk: as soon as one of them forces the computation, the result is available to all of them.
This does not work for functions: if you write let f x = foo in some_expr, and call f several times in some_expr, well, each call will be evaluated independently, there is not a shared thunk to store the results.
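In OCaml, where laziness is explicit, the difference between a shared thunk and a function can be seen with the standard Lazy module. This is a small sketch; expensive and the evaluations counter are placeholders invented for the demonstration.

```ocaml
let evaluations = ref 0

(* Stand-in for an expensive computation; counts how often it really runs. *)
let expensive () =
  incr evaluations;
  6 * 7

(* A suspended value: forcing it twice runs the computation once. *)
let shared = lazy (expensive ())
let from_thunk = Lazy.force shared + Lazy.force shared
let after_thunk = !evaluations

(* A function: calling it twice runs the computation twice. *)
let f () = expensive ()
let from_function = f () + f ()
let after_function = !evaluations
```

Both sums are the same, but the thunk costs one evaluation while the two function calls cost two more.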
So you can get memoization by using a data structure instead of a function. Typically, this is done using an associative data structure: instead of computing an a -> b function, you compute a Table a b, where Table is some map from the arguments to the results. One example is this Haskell presentation of fibonacci:
fib n = fibTable !! n
fibTable = [0,1] ++ map (\n -> fib (n - 1) + fib (n - 2)) [2..]
(You can also write that with tail and zip, but this doesn't make the point clearer.)
See that you do not memoize a function, but a list: it is the list fibTable that does the memoization. You can write this in OCaml as well, for example using the LazyList module of the Batteries library:
open Batteries
module LL = LazyList
let from_2 = LL.seq 2 ((+) 1) (fun _ -> true)
let rec fib n = LL.at fib_table (n - 1) + LL.at fib_table (n - 2)
and fib_table = lazy (LL.Cons (0, LL.cons 1 <| LL.map fib from_2))
However, there is little interest in doing so: as you have seen in the example above, OCaml does not particularly favor call-by-need evaluation -- it's reasonable to use, but not as convenient as in Haskell, where it is the default. It is actually equally simple to directly write the cache structure by direct mutation:
open Batteries
let fib =
let fib_table = DynArray.of_list [0; 1] in
let get_fib n = DynArray.get fib_table n in
fun n ->
for i = DynArray.length fib_table to n do
DynArray.add fib_table (get_fib (i - 1) + get_fib (i - 2))
done;
get_fib n
This example may be ill-chosen, because you need a dynamic structure to store the cache. In the packrat parser case, you're tabulating parsing on a known input text, so you can use plain arrays (indexed by the grammar rules): you would have an array of ('a, 'c) result option for each rule, of the size of the input length and initialized to None. E.g., juxta.(n) represents the result of trying the rule Juxta from input position n, or None if this has not yet been tried.
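A minimal sketch of such a per-rule array follows, for a single hypothetical rule digits (one or more decimal digits). To keep it short it works on a string and stores the next input position rather than the remaining list used in the question's result type; digits_memo and the computed counter are names invented for the example.

```ocaml
(* Result of trying a rule at a position: the value parsed and the next position. *)
type result = Parsed of int * int | NoParse

let input = "12345+"
let n = String.length input

(* One memo array for the rule: None means "not tried yet at this position". *)
let digits_memo : result option array = Array.make (n + 1) None

(* Counts how many positions were actually computed rather than looked up. *)
let computed = ref 0

let is_digit c = c >= '0' && c <= '9'

(* digits <- digit digits / digit, returning how many digits were consumed. *)
let rec digits pos =
  match digits_memo.(pos) with
  | Some r -> r
  | None ->
    incr computed;
    let r =
      if pos >= n || not (is_digit input.[pos]) then NoParse
      else
        (match digits (pos + 1) with
         | Parsed (count, next) -> Parsed (count + 1, next)
         | NoParse -> Parsed (1, pos + 1))
    in
    digits_memo.(pos) <- Some r;
    r

let first = digits 0
let computed_after_first = !computed

(* Asking again, even from a later position, only reads the table. *)
let again = digits 2
let computed_after_again = !computed
```

Each position is computed at most once per rule, which is what gives packrat parsing its linear time bound.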
Laziness is a nice way to present this kind of memoization, but is not always expressive enough: if you need, say, to partially free some part of your result cache to lower memory usage, you will have difficulties if you started from a lazy presentation. See this blog post for a remark on this.
Why do you want to memoize functions? What you want to memoize is, I believe, the parsing result for a given (parsing) expression and a given position in the input stream. You could for instance use OCaml's Hashtbl module for that.
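One way to read this suggestion is a table keyed by the pair (rule, position). The sketch below assumes a toy two-rule grammar and a helper memoize_rec, both invented for illustration; only Hashtbl.create, Hashtbl.find_opt and Hashtbl.add come from the standard library.

```ocaml
(* Memoize a function written in open-recursive style: f receives the
   memoized version of itself as its first argument. *)
let memoize_rec (f : ('k -> 'v) -> 'k -> 'v) : 'k -> 'v =
  let table = Hashtbl.create 64 in
  let rec go key =
    match Hashtbl.find_opt table key with
    | Some v -> v
    | None ->
      let v = f go key in
      Hashtbl.add table key v;
      v
  in
  go

type rule = Digit | Number

let input = "2024x"
let evaluations = ref 0

(* parse (rule, pos) returns the position after a successful match, if any. *)
let parse =
  memoize_rec (fun self (rule, pos) ->
    incr evaluations;
    match rule with
    | Digit ->
      if pos < String.length input && input.[pos] >= '0' && input.[pos] <= '9'
      then Some (pos + 1)
      else None
    | Number ->
      (* Number <- Digit Number / Digit *)
      (match self (Digit, pos) with
       | None -> None
       | Some next ->
         (match self (Number, next) with
          | Some stop -> Some stop
          | None -> Some next)))

let first = parse (Number, 0)
let evaluations_after_first = !evaluations
let second = parse (Number, 0)
let evaluations_after_second = !evaluations
```

A hash table avoids fixing the table size up front, at the cost of hashing each key; the array layout described in the other answer is the tighter choice when the input length is known.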
The lazy keyword.
Here you can find some great examples.
If it fits your use case, you can also use OCaml streams instead of manually generating thunks.
