I just started exploring the possibilities of data types à la carte in combination with indexed types. My current experiment is a bit too large to include here, but can be found here. The example mixes expressions together from different ingredients (arithmetic, functions, ...). The goal is to allow only well-typed expressions, which is why an index (the Sort type) is added to the expressions.
I can build expressions like:
-- define expressions over variables and arithmetic (+, *, numeric constants)
type Lia = IFix (VarF :+: ArithmeticF)
-- expression of integer type/sort
t :: Lia IntegralSort
t = var "c" .+. cnst 1
This is all good as long as I construct only fixed (static) expressions.
Is there a way to read an expression from string/other representation (that obviously has to encode the sort) and produce a dynamic value that gets represented by these functors?
For example, I would like to read ((c : Int) + (1 : Int)) and represent it somehow with VarF and ArithmeticF. Here I realize I cannot obtain a value of static type Lia IntegralSort. But suppose I have in addition:
data EqualityF a where
  Equals :: forall s. a s -> a s -> EqualityF a BoolSort
I would expect there to be a function that can read a String into Maybe (IFix (EqualityF :+: VarF :+: ...)). Such a function would attempt to build representations for the LHS and RHS, and if their sorts matched it could produce a result of the statically known type IFix (EqualityF :+: ...) BoolSort. The problem is that the representation of the LHS (and RHS) has no fixed static sort. Is what I am trying to do impossible with the representation I chose?
(.=.) :: EqualityF :<: f => IFix f s -> IFix f s -> IFix f BoolSort
(.=.) a b = inject (Equals a b)
You can use a GADT to hide the sort, allowing you to return values of sorts depending on the input. Pattern matching then allows you to recover the sort.
data Expr (f :: (Sort -> *) -> (Sort -> *)) where
  BoolExpr :: IFix f BoolSort -> Expr f
  IntExpr :: IFix f IntegralSort -> Expr f
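For example, a consumer can pattern match on the wrapper to get back a value whose sort is statically known (describe is a made-up name, just a sketch):

-- Each branch sees an IFix value at a known sort.
describe :: Expr f -> String
describe (BoolExpr _) = "an expression of sort BoolSort"
describe (IntExpr _)  = "an expression of sort IntegralSort"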
Here is a simplistic parser of postfix expressions involving + and =.
-- needs Data.Char (isDigit, digitToInt)
parse :: (EqualityF :<: f, ArithmeticF :<: f) => String -> [Expr f] -> Maybe (Expr f)
parse (c : s) stack | isDigit c =
  parse s (IntExpr (cnst (digitToInt c)) : stack)
parse ('+' : s) (IntExpr e1 : IntExpr e2 : stack) =
  parse s (IntExpr (e1 .+. e2) : stack)
parse ('=' : s) (IntExpr e1 : IntExpr e2 : stack) =
  parse s (BoolExpr (e1 .=. e2) : stack)
parse ('=' : s) (BoolExpr e1 : BoolExpr e2 : stack) =
  parse s (BoolExpr (e1 .=. e2) : stack)
parse [] [e] = Just e
parse _ _ = Nothing
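For instance, assuming the :<: machinery from the question and a concrete signature functor such as EqualityF :+: ArithmeticF (that exact combination is an assumption), a run of the parser could look like this:

-- parses the postfix form of (1 + 2) = 3
example :: Maybe (Expr (EqualityF :+: ArithmeticF))
example = parse "12+3=" []   -- Just (BoolExpr ...)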
You might not like the duplicate cases for =. A more general framework is Typeable, allowing you to just test for the type equalities you need.
data SomeExpr (f :: (Sort -> *) -> Sort -> *) where
  SomeExpr :: Typeable s => IFix f s -> SomeExpr f
-- needs ScopedTypeVariables and Data.Typeable (Typeable, gcast, eqT, (:~:)(..))
parseSome :: forall f. (EqualityF :<: f, ArithmeticF :<: f) => String -> [SomeExpr f] -> Maybe (SomeExpr f)
parseSome (c : s) stack | isDigit c =
  parseSome s (SomeExpr (cnst (digitToInt c)) : stack)
parseSome ('+' : s) (SomeExpr e1 : SomeExpr e2 : stack) = do
  e1 <- gcast e1
  e2 <- gcast e2
  parseSome s (SomeExpr (e1 .+. e2) : stack)
parseSome ('=' : s) (SomeExpr (e1 :: IFix f s1) : SomeExpr (e2 :: IFix f s2) : stack) = do
  Refl <- eqT :: Maybe (s1 :~: s2)
  parseSome s (SomeExpr (e1 .=. e2) : stack)
parseSome [] [e] = Just e
parseSome _ _ = Nothing
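If you need to get back from the existential to the Expr wrapper above, one option is to try each sort with gcast. This is only a sketch; it assumes ScopedTypeVariables, Control.Applicative's <|>, and Typeable instances for the promoted Sort constructors (derived automatically under DataKinds):

-- Hypothetical helper, not part of the parser above.
toExpr :: forall f. SomeExpr f -> Maybe (Expr f)
toExpr (SomeExpr e) =
      (IntExpr  <$> (gcast e :: Maybe (IFix f IntegralSort)))
  <|> (BoolExpr <$> (gcast e :: Maybe (IFix f BoolSort)))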
Edit
To parse sorts, you want to track them at the type level. Again, use an existential type.
data SomeSort where
  SomeSort :: Typeable (s :: Sort) => proxy s -> SomeSort
You can construct the sort of arrays this way:
-- \i e -> array i e
arraySort :: SomeSort -> SomeSort -> SomeSort
arraySort (SomeSort (Proxy :: Proxy i)) (SomeSort (Proxy :: Proxy e)) =
  SomeSort (Proxy :: Proxy (ArraySort i e))
A potential problem with Typeable here is that it only allows you to test equality of types, when you may want only to check the head constructor: you can't ask "is this type an ArraySort?", but only "is this type equal to ArraySort IntSort BoolSort?" or some other full type.
In that case you need a GADT that reflects the structure of a sort.
-- "Singleton type"
data SSort (s :: Sort) where
  SIntSort :: SSort IntSort
  SBoolSort :: SSort BoolSort
  SArraySort :: SSort i -> SSort e -> SSort (ArraySort i e)

data SomeSort where
  SomeSort :: SSort s -> SomeSort
array :: SomeSort -> SomeSort -> SomeSort
array (SomeSort i) (SomeSort e) = SomeSort (SArraySort i e)
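Unlike the Typeable version, the singleton lets you check just the head constructor with an ordinary pattern match (isArraySort is a made-up name):

-- "Is this an ArraySort with *some* index and element sorts?"
isArraySort :: SomeSort -> Bool
isArraySort (SomeSort (SArraySort _ _)) = True
isArraySort _ = False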
The singletons package provides various facilities for defining and working with these singleton types, though it may be overkill for your use case.
let rec merge = function
    | ([], ys) -> ys
    | (xs, []) -> xs
    | (x::xs, y::ys) ->
        if x < y then x :: merge (xs, y::ys)
        else y :: merge (x::xs, ys)

let rec split = function
    | [] -> ([], [])
    | [a] -> ([a], [])
    | a::b::cs ->
        let (M, N) = split cs
        (a::M, b::N)

let rec mergesort = function
    | [] -> []
    | L ->
        let (M, N) = split L
        merge (mergesort M, mergesort N)
mergesort [5;3;2;1] // Will throw an error.
I took this code from this StackOverflow question, but when I run mergesort with a list I get an error:
stdin(192,1): error FS0030: Value restriction. The value 'it' has been inferred to have generic type
val it : '_a list when '_a : comparison
How would I fix this problem? What is the problem? The more information, the better (so I can learn :) )
Your mergesort function is missing a case, which causes the compiler to infer its signature as 'a list -> 'b list instead of 'a list -> 'a list. It should be 'a list -> 'a list because you're not changing the type of the list in mergesort. With the result type left as an unrelated 'b, the expression mergesort [5;3;2;1] is still generic, and binding such a generic value at the top level is exactly what the value restriction error complains about.
Try changing your mergesort function to this, that should fix the problem:
let rec mergesort = function
    | [] -> []
    | [a] -> [a]
    | L ->
        let (M, N) = split L
        merge (mergesort M, mergesort N)
Another problem with your code, however, is that neither merge nor split is tail recursive, so you will get stack overflow exceptions on large lists (try calling the corrected mergesort like this: mergesort [for i in 1000000..-1..1 -> i]).
You can make your split and merge functions tail recursive by using the accumulator pattern:
let split list =
    let rec aux l acc1 acc2 =
        match l with
        | [] -> (acc1, acc2)
        | [x] -> (x::acc1, acc2)
        | x::y::tail ->
            aux tail (x::acc1) (y::acc2)
    aux list [] []

let merge l1 l2 =
    let rec aux l1 l2 result =
        match l1, l2 with
        | [], [] -> result
        | [], h :: t | h :: t, [] -> aux [] t (h :: result)
        | h1 :: t1, h2 :: t2 ->
            if h1 < h2 then aux t1 l2 (h1 :: result)
            else aux l1 t2 (h2 :: result)
    List.rev (aux l1 l2 [])
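Wiring the corrected mergesort to these helpers gives a version that copes with the large input from above; note that this merge is curried, unlike the tupled one in the question, so the call site changes slightly (a sketch):

let rec mergesort = function
    | [] -> []
    | [a] -> [a]
    | l ->
        let (m, n) = split l
        merge (mergesort m) (mergesort n)

// mergesort [for i in 1000000..-1..1 -> i] should now complete without a stack overflow.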
You can read more about the accumulator pattern here; the examples are in Lisp, but it's a general pattern that works in any language that provides tail call optimization.
So I have another "simple" Agda question. I wanted to have a proof that uses arbitrary evaluations as premises and results, but I don't think I know the type system well enough to do that.
As a simple example, take
f : {S : Set} -> (a : S)
    -> (R : S -> Set)
    -> (R a)
f aa rr = rr aa
which gives a compilation error:
Set !=< rr aa of type Set1
when checking that the expression rr aa has type rr aa
of course
f : {S : Set} -> (a : S)
    -> (R : S -> Set)
    -> Set
f aa rr = rr aa
compiles fine
as does
f : {S : Set} -> (a : S)
    -> (R : S -> Set)
    -> (R a)
    -> (R a)
f _ _ ra = ra
What does (R a) mean in this context? Can it be constructed, and if so, how?
In your first example the expression rr aa has type Set, because it is the result of applying the function rr of type S -> Set to aa of type S.
The type signature of your function, however, demands the result type R a, which with your parameter names reads rr aa. The type checker then tries to unify the expected type (rr aa) with the actual type (Set) and fails.
In fact a function of the type given above would be inconsistent with the type theory:
no-f : (f : {S : Set} → (a : S) → (R : S → Set) → R a) → ⊥
no-f f = f tt (λ _ → ⊥)
In other words, if there were a function of the type above, an element of the empty type (⊥) could be produced: no-f instantiates a with tt and R with λ _ → ⊥, so the result type R a reduces to ⊥. So in general you cannot construct an element of type R a without additional requirements.
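What does work is adding a premise that supplies the required evidence, for example a proof that R holds at every element of S (f-ok is a made-up name):

f-ok : {S : Set} → (a : S) → (R : S → Set) → (∀ s → R s) → R a
f-ok a R r = r a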
Imports used above:
open import Data.Empty
open import Data.Unit
I'm implementing a packrat parser in OCaml, following the master's thesis by B. Ford. My parser should take a data structure that represents the grammar of a language and parse given sequences of symbols.
I'm stuck with the memoization part. The original thesis uses Haskell's lazy evaluation to accomplish linear time complexity. I want to do this (memoization via laziness) in OCaml, but don't know how to do it.
So, how do you memoize functions by lazy evaluations in OCaml?
EDIT: I know what lazy evaluation is and how to exploit it in OCaml. The question is how to use it to memoize functions.
EDIT: The data structure I wrote that represents grammars is:
type ('a, 'b, 'c) expr =
| Empty of 'c
| Term of 'a * ('a -> 'c)
| NTerm of 'b
| Juxta of ('a, 'b, 'c) expr * ('a, 'b, 'c) expr * ('c -> 'c -> 'c)
| Alter of ('a, 'b, 'c) expr * ('a, 'b, 'c) expr
| Pred of ('a, 'b, 'c) expr * 'c
| NPred of ('a, 'b, 'c) expr * 'c
type ('a, 'b, 'c) grammar = ('a * ('a, 'b, 'c) expr) list
The (non-memoized) function that parses a list of symbols is:
let rec parse g v xs = parse' g (List.assoc v g) xs

and parse' g e xs =
  match e with
  | Empty y -> Parsed (y, xs)
  | Term (x, f) ->
      begin
        match xs with
        | x' :: xs when x = x' -> Parsed (f x, xs)
        | _ -> NoParse
      end
  | NTerm v' -> parse g v' xs
  | Juxta (e1, e2, f) ->
      begin
        match parse' g e1 xs with
        | Parsed (y, xs) ->
            begin
              match parse' g e2 xs with
              | Parsed (y', xs) -> Parsed (f y y', xs)
              | p -> p
            end
        | p -> p
      end
  (* and so on *)
where the type of the return value of parse is defined by
type ('a, 'c) result = Parsed of 'c * ('a list) | NoParse
For example, the grammar of basic arithmetic expressions can be specified as g, in:
type nt = Add | Mult | Prim | Dec | Expr
let zero _ = 0
let g =
[(Expr, Juxta (NTerm Add, Term ('$', zero), fun x _ -> x));
(Add, Alter (Juxta (NTerm Mult, Juxta (Term ('+', zero), NTerm Add, fun _ x -> x), (+)), NTerm Mult));
(Mult, Alter (Juxta (NTerm Prim, Juxta (Term ('*', zero), NTerm Mult, fun _ x -> x), ( * )), NTerm Prim));
(Prim, Alter (Juxta (Term ('<', zero), Juxta (NTerm Dec, Term ('>', zero), fun x _ -> x), fun _ x -> x), NTerm Dec));
(Dec, List.fold_left (fun acc d -> Alter (Term (d, (fun c -> int_of_char c - 48)), acc)) (Term ('0', zero)) ['1';'2';'3';])]
The idea behind using laziness for memoization is to memoize not functions but data structures. Laziness means that when you write let x = foo in some_expr, foo is not evaluated immediately, but only as far as some_expr needs it; different occurrences of x in some_expr, however, share the same thunk: as soon as one of them forces the computation, the result is available to all of them.
This does not work for functions: if you write let f x = foo in some_expr and call f several times in some_expr, each call is evaluated independently; there is no shared thunk to store the results.
So you can get memoization by using a data structure instead of a function. Typically, this is done using an associative data structure: instead of computing an a -> b function, you compute a Table a b, where Table is some map from arguments to results. One example is this Haskell presentation of the Fibonacci numbers:
fib n = fibTable !! n
fibTable = [0,1] ++ map (\n -> fib (n - 1) + fib (n - 2)) [2..]
(You can also write that with tail and zip, but this doesn't make the point clearer.)
Note that you do not memoize a function but a list: it is the list fibTable that does the memoization. You can write this in OCaml as well, for example using the LazyList module of the Batteries library:
open Batteries
module LL = LazyList
let from_2 = LL.seq 2 ((+) 1) (fun _ -> true)
let rec fib n = LL.at fib_table (n - 1) + LL.at fib_table (n - 2)
and fib_table = lazy (LL.Cons (0, LL.cons 1 <| LL.map fib from_2))
However, there is little interest in doing so: as you can see in the example above, OCaml does not particularly favor call-by-need evaluation; it is reasonable to use, but not nearly as convenient as in Haskell, where it is the default. It is actually just as simple to write the cache structure directly, using mutation:
open Batteries

let fib =
  let fib_table = DynArray.of_list [0; 1] in
  let get_fib n = DynArray.get fib_table n in
  fun n ->
    for i = DynArray.length fib_table to n do
      DynArray.add fib_table (get_fib (i - 1) + get_fib (i - 2))
    done;
    get_fib n
This example may be ill-chosen, because you need a dynamic structure to store the cache. In the packrat parser case, you are tabulating parses of a known input text, so you can use plain arrays (one per grammar rule, indexed by input position): for each rule you would have an array of ('a, 'c) result option, of the length of the input, initialized to None. E.g. juxta.(n) represents the result of trying the rule Juxta from input position n, or None if this has not yet been tried.
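A minimal sketch of that array-based scheme, assuming a hypothetical parse_at function that works on an input position rather than on the remaining list of symbols (memo_rule is a made-up name, and a real packrat parser would allocate one such table per rule):

(* Memoize one rule: one cache slot per input position, initialized to None. *)
let memo_rule input_length parse_at =
  let table = Array.make (input_length + 1) None in
  fun pos ->
    match table.(pos) with
    | Some r -> r                        (* already parsed from this position *)
    | None ->
        let r = parse_at pos in          (* parse once, then cache it *)
        table.(pos) <- Some r;
        r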
Laziness is a nice way to present this kind of memoization, but it is not always expressive enough: if you need, say, to partially free some part of your result cache to lower memory usage, you will have difficulties if you started from a lazy presentation. See this blog post for a remark on this.
Why do you want to memoize functions? What you want to memoize is, I believe, the parsing result for a given (parsing) expression and a given position in the input stream. You could, for instance, use OCaml's Hashtbl for that.
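A sketch of that Hashtbl approach, keying the cache on a (non-terminal, position) pair; parse_at is a made-up name, and assuming a position-based parser, since the question's parse consumes a list rather than an index:

(* One cache shared by all rules; Hashtbl.find_opt needs OCaml >= 4.05. *)
let parse_memo parse_at =
  let table = Hashtbl.create 97 in
  fun v pos ->
    match Hashtbl.find_opt table (v, pos) with
    | Some r -> r
    | None ->
        let r = parse_at v pos in
        Hashtbl.add table (v, pos) r;
        r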
The lazy keyword.
Here you can find some great examples.
If it fits your use case, you can also use OCaml streams instead of manually generating thunks.
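For reference, a minimal illustration of the lazy keyword: the suspended computation is run at most once by Lazy.force, and its result is cached afterwards.

let expensive = lazy (print_endline "computing"; 42)

let () =
  Printf.printf "%d\n" (Lazy.force expensive);  (* prints "computing", then 42 *)
  Printf.printf "%d\n" (Lazy.force expensive)   (* cached: prints only 42 *)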