How to prove a theorem about a type class outside the original type class definition?

I was trying alternative ways to write the proof below, which comes from this question and Isabelle 2020's Rings.thy. (In particular, I added the note div_mult_mod_eq[of a b] line to test the use of the note command.)
lemma mod_div_decomp:
  fixes a b
  obtains q r where "q = a div b" and "r = a mod b"
    and "a = q * b + r"
proof -
  from div_mult_mod_eq have "a = a div b * b + a mod b" by simp
  note div_mult_mod_eq[of a b]
  moreover have "a div b = a div b" ..
  moreover have "a mod b = a mod b" ..
  note that
  ultimately show thesis by blast
qed
However, if I write it in a separate .thy file, there is an error about type unification at the note line:
Type unification failed: Variable 'a::{plus,times} not of sort semiring_modulo
Failed to meet type constraint:
Term: a :: 'a
Type: ??'a
The problem goes away if I enclose the whole proof in a copy of the class definition (class ... begin ... end) as follows:
theory "test"
imports Main
HOL.Rings
begin
...
class semiring_modulo = comm_semiring_1_cancel + divide + modulo +
assumes div_mult_mod_eq: "a div b * b + a mod b = a"
begin
(* ... inserted proof here *)
end
...
end
My questions are:
Is this the correct way to prove a theorem about a type class, i.e. writing a separate class definition in a different file?
Is it always necessary to duplicate type class definitions as I did?
If not, what is the proper way to prove a theorem about a type class outside of its original place of definition?

There are two ways to prove things in type classes (basically sort = typeclass for Isabelle/HOL):
Proving in the context of the typeclass
context semiring_modulo
begin
...
end
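For instance, a sketch of the first approach (the lemma is given a fresh name here to avoid clashing with the existing mod_div_decomp; the exact proof method may need adjusting):

context semiring_modulo
begin

lemma mod_div_decomp':
  obtains q r where "q = a div b" and "r = a mod b"
    and "a = q * b + r"
  using div_mult_mod_eq [symmetric] by blast

end

Inside the context, the free variables a and b automatically receive the class's type, so no sort annotation is needed.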
(slightly less clean) Add the sort constraints to the type:
lemma mod_div_decomp:
  fixes a b :: "'a :: {semiring_modulo}"
  obtains q r where "q = a div b" and "r = a mod b"
    and "a = q * b + r"
semiring_modulo subsumes plus and times, but you can also type {semiring_modulo,plus,times} to really have all of them.
The documentation of classes contains more examples.

The issue you ran into is related to how Isabelle implements polymorphism. Sorts represent subsets of all types, and we characterize them by sets of intersected classes. By attaching a sort to a variable, we restrict the space of terms with which that variable can be instantiated. One way of looking at this is as an assumption that the variable belongs to a certain sort. In your case, type inference of (+), (*), div, and mod apparently gives you {plus,times}, which is insufficient for div_mult_mod_eq. To restrict the variable further, you can add an explicit type annotation as Mathias explained.
Note that the simp in the line above should run into the same problem.


Are there use cases for single case variants in OCaml?

I've been reading F# articles, and they use single case variants to create distinct incompatible types. However, in OCaml I can use private module types or abstract types to create distinct types. Is it common in OCaml to use single case variants like in F# or Haskell?
Another specialized use case of a single-constructor variant is to erase some type information with a GADT (and an existential quantification).
For instance, consider:
type showable = Show: 'a * ('a -> string) -> showable
let show (Show (x,f)) = f x
let showables = [ Show (0,string_of_int); Show("string", Fun.id) ]
Here the constructor Show pairs an element of a given type with a printing function, then forgets the concrete type of the element. This makes it possible to have a list of showable elements, even if each element has a different concrete type.
For what it's worth, it seems to me this wasn't particularly common in OCaml in the past.
I've been reluctant to do this myself because it has always cost something: the representation of type t = T of int was always bigger than the representation of a bare int.
However, in the last few years it has become possible to declare types as unboxed, which removes this obstacle:
type t = T of int [@@unboxed]
As a result I've personally been using single-constructor types much more frequently recently. There are many advantages. For me the main one is that I can have a distinct type that's independent of whether its representation happens to be the same as another type's.
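For instance, a minimal sketch (the type and function names here are hypothetical) of two int-backed identifiers that stay incompatible at no runtime cost:

(* two ids that share a runtime representation but not a type *)
type user_id = UserId of int [@@unboxed]
type order_id = OrderId of int [@@unboxed]

let greet (UserId id) = Printf.sprintf "hello, user %d" id

let () = print_endline (greet (UserId 42))
(* greet (OrderId 42) would be rejected by the type checker *)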
You can of course use modules to get this effect, as you say. But that is a fairly heavy solution.
(All of this is just my opinion naturally.)
Yet another case for single-constructor types (although it does not quite match your initial question of creating distinct types): fancy records. (By contrast with other answers, this is more a syntactic convenience than a fundamental feature.)
Indeed, using a relatively recent feature (introduced with OCaml 4.03, in 2016) which allows writing constructor arguments with a record syntax (including mutable fields!), you can prefix regular records with a constructor name, Coq-style.
type t = MakeT of {
  mutable x : int ;
  mutable y : string ;
}
let some_t = MakeT { x = 4 ; y = "tea" }
(* val some_t : t = MakeT {x = 4; y = "tea"} *)
It does not change anything at runtime (just like Constr (a,b) has the same representation as (a,b), provided Constr is the only constructor of its type). The constructor makes the code a bit more explicit to the human eye, and it also provides the type information required to disambiguate field names, thus avoiding the need for type annotations. It is similar in function to the usual module trick, but more systematic.
Patterns work just the same:
let (MakeT { x ; y }) = some_t
(* val x : int = 4 *)
(* val y : string = "tea" *)
You can also access the “contained” record (at no runtime cost), read and modify its fields. This contained record however is not a first-class value: you cannot store it, pass it to a function nor return it.
let (MakeT fields) = some_t in fields.x (* returns 4 *)
let (MakeT fields) = some_t in fields.x <- 42
(* some_t is now MakeT {x = 42; y = "tea"} *)
let (MakeT fields) = some_t in fields
(* ^^^^^^
Error: This form is not allowed as the type of the inlined record could escape. *)
Another use case of single-constructor (polymorphic) variants is documenting something to the caller of a function. For instance, perhaps there's a caveat with the value that your function returns:
val create : unit -> [ `Must_call_close of t ]
Using a variant forces the caller of your function to pattern-match on this variant in their code:
let (`Must_call_close t) = create () in (* ... *)
This makes it more likely that they'll pay attention to the message in the variant, as opposed to documentation in an .mli file that could get missed.
For this use case, polymorphic variants are a bit easier to work with as you don't need to define an intermediate type for the variant.
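A minimal end-to-end sketch, with hypothetical module and function names:

module Conn : sig
  type t
  val create : unit -> [ `Must_call_close of t ]
  val close : t -> unit
end = struct
  type t = { mutable open_ : bool }
  let create () = `Must_call_close { open_ = true }
  let close t = t.open_ <- false
end

let () =
  (* the polymorphic variant forces us to acknowledge the caveat *)
  let (`Must_call_close conn) = Conn.create () in
  (* ... use conn ... *)
  Conn.close conn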

Parsing procedure calls for a toy language

I have a certain toy language that defines, amongst others, procedures and procedure calls, using EBNF syntax:
program = procedure, {procedure} ;
procedure = "procedure", NAME, bracedBlock ;
bracedBlock = "{" , statementlist , "}" ;
statementlist = statement, { statement } ;
statement = define | if | while | call ; // others omitted for brevity
define = NAME, "=", expression, ";" ;
if = "if", conditionalblock, "then", bracedBlock, "else", bracedBlock ;
call = "call" , NAME, ";" ;
// other definitions omitted for brevity
A tokeniser for a program in this language has been implemented, and returns a vector of tokens.
Now, parsing said program without the procedure calls is fairly straightforward: one can define a recursive descent parser using the above grammar directly and simply parse through the tokens. Some further notes:
Each procedure may call any other procedure except itself, directly or indirectly (i.e. no recursion), and these need not necessarily be in the order of appearance in the source code (i.e. B may be defined after A, and A may call B, or vice versa).
Procedure names need to be unique, and 'reserved keywords' may be used as variable/procedure names.
Whitespace does not matter, at least amongst tokens of different type: similar to C/C++.
There is no scoping rule: all variables are global.
The concept of a 'line number' is important: each statement has one or more line numbers associated with it: define statements have only 1 line number each, for instance, whereas an if statement, which is itself a parent of two statement lists, has multiple line numbers. For instance:
LN    CODE
      procedure A {
1.      a = 5;
2.      b = 7;
3.      c = 3;
4. 5.   if (b < c) then { call C; } else {
6.        call B;
        }
      }
      procedure B {
7.      d = 5;
8.      while (d > 2) {
9.        d = d + 1; }
      }
      procedure C {
10.     e = 10;
11.     f = 8;
12.     call B;
      }
Line numbers are continuous throughout the program; only procedure definitions and the else keyword aren't assigned line numbers. The line numbers are defined by grammar, rather than their position in source code: for instance, consider 'lines' 4 and 5.
There are some relationships that need to be set in a database given each statement and its line number, variables used, variables set, and child containers. This is a key consideration.
My question is therefore this: how can I parse these function calls, maintain the integrity of the line numbers, and set the relationships?
I have considered the 'OS' way of doing things: upon encounter of a procedure call, look ahead for a procedure that matches said called procedure, parse the callee, and unroll the call stack back to the caller. However, this ruins the line number ordering: if the above program were to be parsed this way, C would have line numbers 6 to 8 inclusive, rather than 10 to 12 inclusive.
Another solution is to parse the entire program once in order, maintain a toposort of procedure calls, and then parse a second time by following said toposort. This is problematic because of implementation details.
Is there a possibly better way to do this?
It's always tempting to try to completely process a program text in a single on-line pass. Unfortunately, it is practically never the simplest solution. Trying to do everything at once in a linear progression results in a kind of spaghetti of intertwined computations, and making it all work almost always involves unnecessary restrictions on the language which will later prove to be unfortunate.
So I'd encourage you to reconsider some of your design decisions. If you use the parser just to build up some kind of structural representation of the program -- whether it's an abstract syntax tree or a vector of three-address code, or some other alternative -- and then do further processing in a series of single-purpose passes over that structural representation, you'll likely find that the code is:
much simpler, because computations don't have to be intermingled;
more general, because each pass can be done in the most convenient order rather than restricting inputs to fit a linear ordering;
more readable and more maintainable.
Persisting data structures over multiple passes might increase storage requirements slightly. But the structures are unlikely to occupy enough storage that this will be noticeable. And it probably will not increase the computation time; indeed, it might even reduce the time because the individual passes are simpler and easier to optimise.
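For example, here is a rough sketch (C++17, with hypothetical type and function names) of the kind of structure and passes this suggests: parse everything into an AST first, then number statements in grammar order, then resolve calls by name:

#include <map>
#include <memory>
#include <string>
#include <vector>

// One node per statement; If/While own nested statement lists.
struct Stmt {
    enum class Kind { Define, If, While, Call } kind;
    int line = 0;                                // assigned by the numbering pass
    std::string callee;                          // only used when kind == Kind::Call
    std::vector<std::vector<std::unique_ptr<Stmt>>> blocks;  // child statement lists
};

struct Procedure {
    std::string name;
    std::vector<std::unique_ptr<Stmt>> body;
};

// Pass 1: number statements in grammar order, recursing into child blocks,
// so an if gets its number before the statements of its branches.
int numberStmts(std::vector<std::unique_ptr<Stmt>>& stmts, int next) {
    for (auto& s : stmts) {
        s->line = next++;
        for (auto& block : s->blocks) next = numberStmts(block, next);
    }
    return next;
}

// Pass 2: resolve calls by name; the order of definition no longer matters.
void resolveCalls(const std::vector<Procedure>& procs) {
    std::map<std::string, const Procedure*> byName;
    for (auto& p : procs) byName[p.name] = &p;
    // walk each body, look up s->callee in byName, record the edge / report errors
}

Because numbering and call resolution are separate passes over the same AST, neither constrains the order in which procedures appear in the source.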

Erlang flatten function time complexity

I need help with the following:
flatten([]) -> [];
flatten([H|T]) -> H ++ flatten(T).
The input list contains other lists of different lengths.
For example:
flatten([[1,2,3],[4,7],[9,9,9,9,9,9]]).
What is the time complexity of this function?
And why?
I got it to O(n), where n is the number of elements in the input list.
For example:
flatten([[1,2,3],[4,7],[9,9,9,9,9,9]]) n=3
flatten([[1,2,3],[4,7],[9,9,9,9,9,9],[3,2,4],[1,4,6]]) n=5
Thanks for help.
First of all, I'm not sure your code will work, at least not in the way the standard library does. You could compare your function with lists:flatten/1 and maybe improve on your implementation. Try lists such as [a, [b, c]] and [[a], [b, [c]], [d]] as input and verify that you get what you expected.
Regarding complexity, it is a little tricky because of the ++ operator and the functional (immutable) nature of the language. All lists in Erlang are linked lists (not arrays as in C++), and you cannot just attach something to the end of one without rebuilding it: A ++ B has to copy the whole of A so that its last cell can point at B, which makes the cost of A ++ B proportional to length(A).
In your function the appends nest to the right, H1 ++ (H2 ++ (H3 ++ ...)), so each sublist is copied exactly once. The total cost is length(H1) + length(H2) + ... + length(Hn), i.e. O(N) where N is the total number of elements, not just the number n of sublists. The quadratic behaviour people warn about with ++ appears when you instead grow an accumulator on its left, as in Acc ++ H: each step then copies the ever-growing Acc, for a total of roughly k + 2k + ... + (n-1)k = n(n-1)/2 * k with n sublists of average length k, i.e. O(n^2 * k).
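For comparison, here is a sketch (with hypothetical function names) of an accumulator-based one-level flatten that conses onto the accumulator and reverses once at the end, staying linear without using ++ at all:

%% flatten one level by consing onto an accumulator
flatten_acc(Lists) -> flatten_acc(Lists, []).

flatten_acc([], Acc) -> lists:reverse(Acc);
flatten_acc([H|T], Acc) -> flatten_acc(T, prepend(H, Acc)).

%% push the elements of one sublist onto the accumulator (reversed)
prepend([], Acc) -> Acc;
prepend([X|Xs], Acc) -> prepend(Xs, [X|Acc]).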
If you are new to this it might seem a little odd, but with some practice you can get the hang of it. I would recommend going through a few resources:
Good explanation of lists and how they are created
Documentation on list handling with DO and DO NOT parts
Short description of ++ operator myths and best practices
Chapter about recursion and tail-recursion with examples using ++ operator

Function recursion, what happens in the SAS?

I have this scenario: a recursive procedure (or function) is called like
{DoSomething Data C}
and C is the variable that should store the final result, the function prototype is
proc {DoSomething Data N}
   %..
   %..
   {DoSomething Data M}
   N = 1 + M
end
and N is the variable that should store the final result too, but in the local scope of the procedure.
Now, I was told that at first, when the procedure is called, the SAS contains an equivalence set linking C and N (both unbound for the moment). Then, after all recursions have completed, both C and N are bound to a value (6). After exiting the procedure, the SAS is left with only C bound to 6, because the local variable N is destroyed. And that's fine.
My question is: what happens during the recursive calls? Does the C variable link to a partial value structure 1 + M? And then next time does M link to 1 + M2?
No, there are no partial structures in Oz, as long as we talk about simple integer arithmetic.
This statement:
N = 1 + M
will block until M is fully determined, i.e. is bound to an integer.
To really understand what is going on, I would have to see the full code.
But I assume that there is a base case that returns a concrete value. Once the base case is reached, the recursion will "bubble up", adding 1 to the result of the inner call.
In other words, the binding of C will only change at the end of the outermost procedure call, where M is 5 and C is therefore bound to 6.
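For instance, a hedged sketch of what such a procedure might look like if it counts the elements of a list (assuming nil as the base case):

proc {DoSomething Data N}
   case Data
   of nil then N = 0
   [] _|Rest then M in
      {DoSomething Rest M}
      N = 1 + M
   end
end

Each recursive call creates a fresh unbound M; the additions then complete one by one as the inner Ms become determined, starting from the base case.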

How to create a recursive data structure value in (functional) F#?

How can a value of type:
type Tree =
    | Node of int * Tree list
reference itself, generated in a functional way?
The resulting value should be equal to x in the following Python code, for a suitable definition of Tree:
x = Tree()
x.tlist = [x]
Edit: Obviously more explanation is necessary. I am trying to learn F# and functional programming, so I chose to implement the cover tree which I have programmed before in other languages. The relevant thing here is that the points of each level are a subset of those of the level below. The structure conceptually goes to level -infinity.
In imperative languages a node has a list of children which includes itself. I know that this can be done imperatively in F#. And no, it doesn't create an infinite loop given the cover tree algorithm.
Tomas's answer suggests two possible ways to create recursive data structures in F#. A third possibility is to take advantage of the fact that record fields support direct recursion (when used in the same assembly that the record is defined in). For instance, the following code works without any problem:
type 'a lst = Nil | NonEmpty of 'a nelst
and 'a nelst = { head : 'a; tail : 'a lst }
let rec infList = NonEmpty { head = 1; tail = infList }
Using this list type instead of the built-in one, we can make your code work:
type Tree = Node of int * Tree lst
let rec x = Node(1, NonEmpty { head = x; tail = Nil })
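A small usage sketch (the names firstTwo, h1, h2 are ours): pattern matching can observe the cycle by peeling off two levels, and this terminates because the match only walks the two cells it inspects:

// peel off the first two heads of the cyclic list
let firstTwo =
    match infList with
    | NonEmpty { head = h1; tail = NonEmpty { head = h2; tail = _ } } -> Some (h1, h2)
    | _ -> None
// firstTwo = Some (1, 1)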
You cannot do this directly if the recursive reference is not delayed (e.g. wrapped in a function or lazy value). I think the motivation is that there is no way to create the value with immediate references "at once", so this would be awkward from the theoretical point of view.
However, F# supports recursive values - you can use those if the recursive reference is delayed (the F# compiler will then generate some code that initializes the data structure and fills in the recursive references). The easiest way is to wrap the reference inside a lazy value (a function would work too):
type Tree =
    | Node of int * Lazy<Tree list>

// Note you need 'let rec' here!
let rec t = Node(0, lazy [t; t])
Another option is to write this using mutation. Then you also need to make your data structure mutable. You can for example store ref<Tree> instead of Tree:
type Tree =
    | Node of int * ref<Tree> list

// an empty node that is used only for initialization
let empty = Node(0, [])
// create two references that will be mutated after creation
let a, b = ref empty, ref empty
// create a new node
let t = Node(0, [a; b])
// replace the empty nodes with recursive references
a := t; b := t
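And a quick check that the cycle is in place (Tree has a single case, so these patterns are complete):

// follow the first stored ref: it points back at t itself
let (Node(_, children)) = t
let (Node(n, _)) = !(List.head children)   // n = 0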
As James mentioned, if you're not allowed to do this, you get some nice properties, such as the guarantee that any program that walks the data structure will terminate (because the data structure is finite and cannot be cyclic). So you'll need to be a bit more careful with recursive values :-)
