What is positivity checking? [duplicate] - agda

Apparently there is a feature in Agda called positivity checking, which can supposedly keep the system sound even if type-in-type is enabled.
I am curious to know what this is about, but the Agda Manual fails to answer the question, and only explains how to turn it off.
At a lunch table I overheard that this is about polarity in type theory, but that is about all I know. I am failing to find anything online which explains this concept and why it is useful in maintaining soundness. Any intelligible explanation would be appreciated.

First, I have to clear up a misconception: positivity checking does not guarantee soundness when type-in-type is enabled. Data types must satisfy both the positivity check and the universe check to preserve soundness.
Now, to explain positivity checking, let's first look at what goes wrong when there is no positivity check:
-- the empty type
data ⊥ : Set where

-- a non-positive type
data Bad : Set where
  bad : (Bad → ⊥) → Bad
If this datatype were allowed, you could easily prove ⊥:
bad-is-false : Bad → ⊥
bad-is-false (bad f) = f (bad f)
bad-is-true : Bad
bad-is-true = bad bad-is-false
boom : ⊥
boom = bad-is-false bad-is-true
Under the Curry-Howard correspondence, the definition of Bad says: Bad is true if and only if Bad is false. So it is not surprising that it leads to inconsistencies.
Positivity checking rules out datatypes such as Bad. In general, the (strict) positivity criterion says that each constructor c of a datatype D should have a type of the form
c : (x1 : A1)(x2 : A2) ... (xn : An) → D xs
where the type Ai of each argument is either non-recursive (i.e. it doesn't refer to D) or of the form (y1 : B1)(y2 : B2) ... (ym : Bm) → D ys where each Bj doesn't refer to D.
Bad doesn't satisfy this criterion because the argument of the constructor bad has type Bad → ⊥, which is neither of the two allowed forms.
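For contrast, here is a small sketch of a datatype that does fit the criterion above. The names ℕ, Tree and Neg are made up for illustration and are not taken from any library. The recursive occurrence in node is the result of the function argument, which matches the second allowed form; the commented-out Neg has the same shape as Bad and would be rejected.

```agda
-- A minimal natural-number type, defined here only to keep the
-- example self-contained.
data ℕ : Set where
  zero : ℕ
  suc  : ℕ → ℕ

-- Accepted: the argument of node has the form (y : ℕ) → Tree,
-- so Tree occurs only to the right of the arrow.
data Tree : Set where
  leaf : Tree
  node : (ℕ → Tree) → Tree

-- Rejected if uncommented: Neg occurs to the left of an arrow in
-- the argument of its own constructor, just like Bad.
-- data Neg : Set where
--   neg : (Neg → ℕ) → Neg
```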
The name 'positivity checking' comes (as many things in type theory do) from category theory, specifically the notion of a positive endofunctor. Each definition of a datatype that satisfies the positivity criterion is such a positive endofunctor on the category of types. This means we can construct the initial algebra of that endofunctor, which can be used to model the datatype when constructing a model of type theory (which is used to prove soundness).


Defining the head of a list with a proof object

I would like to define a head function for lists.
In order to avoid trying to compute the head of an empty list one
can either work with vectors of length greater than one (i.e., Vec (suc n))
or work with Lists, but pass a proof that the list is non-empty to head.
(This is what "Dependent Types at Work" calls internal vs external
programming logic, I believe.)
I am interested in the latter approach.
Note that there is a SO answer which addresses this, but I wanted
a minimal approach. (For example, I would prefer not to use Instance Arguments
unless they are required.)
Below is my attempt, but I don't fully understand what is going on.
For example:
It's not clear why I was able to skip the head [] case. Obviously it's related to the "proof" I pass in but I would have expected I would need some kind of case with () in it.
When I type check (C-c C-l) I seem to get two goals as output.
I would have liked to have seen tmp2 fail to type check.
Any insight would be very welcome.
In particular, what is the "right" way(s) to do what I am trying to do?
data List (A : Set) : Set where
  [] : List A
  _::_ : A → List A → List A
{-
head1 : {A : Set} → (as : List A) → A
-- As expected, complains about missing case `head1 []`.
head1 (a :: aa) = a
-}
data ⊤ : Set where
  tt : ⊤
data ⊥ : Set where
isNonEmpty : {A : Set} → List A → Set
isNonEmpty [] = ⊥
isNonEmpty (_ :: _) = ⊤
head : {A : Set} → (as : List A) → {isNonEmpty as} → A
head (a :: _) = a
-- Define just enough to do some examples
data Nat : Set where
  zero : Nat
  suc : Nat → Nat
{-# BUILTIN NATURAL Nat #-}
len1 : List Nat
len1 = 17 :: []
tmp : Nat
tmp = head len1
tmp1 : Nat
tmp1 = head len1 { tt }
len0 : List Nat
len0 = []
tmp2 : Nat
tmp2 = head len0
The user manual on coverage checking in Agda explains that in certain situations absurd clauses can be left out completely:
In many common cases, absurd clauses may be omitted as long as the remaining clauses reveal sufficient information to indicate what arguments to case split on. [...] Absurd clauses may be omitted if removing the corresponding internal nodes from the case tree does not result in other internal nodes becoming childless.
Note that you can still write the absurd clause manually if you want and Agda will accept it.
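As a sketch of what the explicit version could look like, here are the definitions from the question again with the absurd clause written out. The only new line is the head [] clause, which matches the implicit proof argument against the absurd pattern ().

```agda
data List (A : Set) : Set where
  []   : List A
  _::_ : A → List A → List A

data ⊤ : Set where
  tt : ⊤

data ⊥ : Set where

isNonEmpty : {A : Set} → List A → Set
isNonEmpty []       = ⊥
isNonEmpty (_ :: _) = ⊤

head : {A : Set} → (as : List A) → {isNonEmpty as} → A
-- For the empty list the implicit argument has type ⊥, so the
-- absurd pattern () closes the case and no right-hand side is given.
head []       {()}
head (a :: _) = a
```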
What you are getting are not two unsolved holes but two unsolved metavariables. These metavariables were created by Agda to fill in the implicit argument to head in the definitions of tmp and tmp2 respectively, and Agda's constraint solver wasn't able to solve them. For the metavariable in tmp, this is because you defined ⊤ as a datatype instead of a record type, and Agda only applies eta-equality for record types. For the metavariable in tmp2, the type is ⊥ so there is no hope that Agda would be able to find a solution here.
When using Agda, you should see unsolved metavariables as a specific case of "failing to typecheck". They are not a hard type error because that would prevent you from continuing to use the interactive editing of Agda, and in many cases further filling in holes will actually solve the metavariables. However, they indicate "this program is not finished" just as much as an actual type error would.
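One possible way to make tmp check without passing the proof by hand is to follow the remark about eta-equality and declare ⊤ as a record. The sketch below repeats the question's definitions with that single change; it uses zero instead of the literal 17 only to avoid the BUILTIN pragma.

```agda
data List (A : Set) : Set where
  []   : List A
  _::_ : A → List A → List A

-- A record instead of a datatype: records have eta-equality, so
-- Agda can fill in the unique inhabitant of ⊤ on its own.
record ⊤ : Set where
  constructor tt

data ⊥ : Set where

isNonEmpty : {A : Set} → List A → Set
isNonEmpty []       = ⊥
isNonEmpty (_ :: _) = ⊤

head : {A : Set} → (as : List A) → {isNonEmpty as} → A
head (a :: _) = a

data Nat : Set where
  zero : Nat
  suc  : Nat → Nat

len1 : List Nat
len1 = zero :: []

-- The implicit argument has type isNonEmpty len1, which reduces
-- to ⊤ and is then solved by eta-expansion.
tmp : Nat
tmp = head len1
```

With this change, an application such as head [] should still leave an unsolved metavariable of type ⊥, which is the intended signal that the call is invalid.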

Why do we need FOLLOW set in LL(1) grammar parser?

In a generated parsing function we use an algorithm which peeks at the next token in the token list and chooses a rule (alternative) based on the FIRST set of the current non-terminal. If that set contains epsilon (the rule is nullable), the FOLLOW set is checked as well.
Consider the following grammar [not LL(1)]:
B : A term
A : N1 | N2
N1 :
N2 :
During calculation of the FOLLOW sets, the terminal term is propagated from A to both N1 and N2, so the FOLLOW set won't help us decide.
On the other hand, if there is exactly one nullable alternative, we know for sure how to continue execution, even when the current token doesn't match anything in the FIRST set (by choosing the epsilon production).
If the above statements are true, the FOLLOW set is redundant. Is it needed only for error handling?
Yes, it is not necessary.
I was asked precisely this question at a colloquium, and my answer, that the FOLLOW set is used
to check that the grammar is LL(1),
to fail immediately when an error occurs, instead of dragging the ill-formed token to some later production, where the generated error message may be unclear,
and for nothing else,
was accepted.
While you can certainly find grammars for which FOLLOW is unnecessary (i.e., it doesn't play a role in the calculation of the parsing table), in general it is necessary.
For example, consider the grammar
S : A | C
A : B a
B : b | epsilon
C : D c
D : d | epsilon
You need to know that
Follow(B) = {a}
Follow(D) = {c}
to calculate
First(A) = {b, a}
First(C) = {d, c}
in order to make the correct choice at S.

What is `where .force`?

I've been playing around with the idea of writing programs that run on Streams and proving properties about them, but I feel that I am stuck even with the simplest of things. When I look at the definition of repeat in Codata/Streams in the standard library, I find a construction that I haven't seen anywhere else in Agda: λ where .force →.
Here, an excerpt of a Stream defined with this weird feature:
repeat : ∀ {i} → A → Stream A i
repeat a = a ∷ λ where .force → repeat a
Why does where appear in the middle of the lambda function definition, and what is the purpose of .force if it is never used?
I might be asking something that is in the documentation, but I can't figure out how to search for it.
Also, is there a place where I can find documentation to use "Codata" and proofs with it? Thanks!
Why does where appear in the middle of the lambda function definition?
Quoting the docs:
Anonymous pattern matching functions can be defined using one of the
two following syntaxes:
\ { p11 .. p1n -> e1 ; … ; pm1 .. pmn -> em }
\ where p11 .. p1n -> e1 … pm1 .. pmn -> em
So λ where is an anonymous pattern matching function. force is the field of Thunk and .force is a copattern in postfix notation (originally I said nonsense here, but thanks to @Cactus it's now fixed, see his answer).
Also, is there a place where I can find documentation to use "Codata" and proofs with it? Thanks!
Check out these papers
Normalization by Evaluation in the Delay Monad
A Case Study for Coinduction via Copatterns and Sized Types
Equational Reasoning about Formal Languages in Coalgebraic Style
Guarded Recursion in Agda via Sized Types
As one can see in the definition of Thunk, force is the field of the Thunk record type:
record Thunk {ℓ} (F : Size → Set ℓ) (i : Size) : Set ℓ where
  coinductive
  field force : {j : Size< i} → F j
So in the pattern-matching lambda, .force is not a dot pattern (why would it be? there is nothing prescribing the value of the parameter), but instead is simply syntax for the record field selector force. So the above code is equivalent to making a record with a single field called force with the given value, using copatterns:
repeat a = a ∷ as
  where
    force as = repeat a
or, which is actually where the .force syntax comes from, using postfix projection syntax:
repeat a = a ∷ as
  where
    as .force = repeat a
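To see the pieces together, here is a self-contained sketch. The Thunk and Stream definitions are simplified copies written for this example (fixed to Set, no universe levels), not the standard library's actual code, and repeat₂ is a made-up name for the where-block variant. The file assumes the --sized-types option is available.

```agda
{-# OPTIONS --sized-types #-}

open import Agda.Builtin.Size

-- Simplified Thunk: a coinductive record with a single field.
record Thunk (F : Size → Set) (i : Size) : Set where
  coinductive
  field force : {j : Size< i} → F j
open Thunk

-- Simplified sized stream: a head plus a delayed tail.
data Stream (A : Set) (i : Size) : Set where
  _∷_ : A → Thunk (Stream A) i → Stream A i

-- Pattern-matching lambda with a postfix copattern: the tail is the
-- Thunk whose force field is defined to be repeat a.
repeat : ∀ {A : Set} {i} → A → Stream A i
repeat a = a ∷ λ where .force → repeat a

-- The same definition with the Thunk named in a where block. The
-- local type signature is needed for the copattern clause.
repeat₂ : ∀ {A : Set} {i} → A → Stream A i
repeat₂ {A} {i} a = a ∷ as
  where
    as : Thunk (Stream A) i
    as .force = repeat₂ a
```

In both versions .force is being defined rather than used: the clause says what the result of projecting force from the new Thunk is.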

How do you represent terms of the CoC in Agda?

Representing, for example, the STLC in Agda can be done as:
data Type : Set where
  * : Type
  _⇒_ : (S T : Type) → Type
data Context : Set where
  ε : Context
  _,_ : (Γ : Context) (S : Type) → Context
data _∋_ : Context → Type → Set where
  here : ∀ {Γ S} → (Γ , S) ∋ S
  there : ∀ {Γ S T} (i : Γ ∋ S) → (Γ , T) ∋ S
data Term : Context → Type → Set where
  var : ∀ {Γ S} (v : Γ ∋ S) → Term Γ S
  lam : ∀ {Γ S T} (t : Term (Γ , S) T) → Term Γ (S ⇒ T)
  app : ∀ {Γ S T} (f : Term Γ (S ⇒ T)) (x : Term Γ S) → Term Γ T
(From here.) Trying to adapt this to the Calculus of Constructions, though, is problematic, because Type and Term are a single type. This means not only Context/Term must be mutually recursive, but also that Term must be indexed on itself. Here is an initial attempt:
data Γ : Set
data Term : Γ → Term → Set
data Γ where
  ε : Γ
  _,_ : (ty : Term) (ctx : Γ) → Γ
infixr 5 _,_
data Term where
  -- ...
Agda, though, complains that Term isn't in scope on its initial declaration. Is it possible to represent it that way, or do we really need to have different types for Term and Type? I'd highly like to see a minimal/reference implementation of CoC in Agda.
This is known to be a very hard problem. As far as I'm aware there is no "minimal" way to encode CoC in Agda. You have to either prove a lot of stuff or use shallow encoding or use heavy (but perfectly sensible) techniques like quotient induction or define untyped terms first and then reify them into typed ones. Here is some related literature:
Functional Program Correctness Through Types, Nils Anders Danielsson -- the last chapter of this thesis is a formalization of a dependently typed language. This is a ton-of-lemmas-style formalization and also contains some untyped terms.
Type checking and normalisation, James Chapman -- the fifth chapter of this thesis is a formalization of a dependently typed language. It is also a ton-of-lemmas-style formalization, except many lemmas are just constructors of the corresponding data types. For example, you have explicit substitutions as constructors rather than as computing functions (the previous thesis didn't have those for types, only for terms, while this thesis has explicit substitutions even for types).
Outrageous but Meaningful Coincidences. Dependent type-safe syntax and evaluation, Conor McBride -- this paper presents a deep encoding of a dependent type theory that reifies a shallow encoding of the theory. This means that instead of defining substitution and proving properties about it the author just uses the Agda's evaluation model, but also gives a full syntax for the target language.
Typed Syntactic Meta-programming, Dominique Devriese, Frank Piessens -- untyped terms reified into typed ones. IIRC there were a lot of postulates in the code when I looked into it, as this is a framework for meta-programming rather than a formalization.
Type theory eating itself?, Chuangjie Xu & Martin Escardo -- a single file formalization. As always, several data types defined mutually. Explicit substitutions with explicit transports that "mimic" the behavior of the substitution operations.
EatEval.agda -- we get this by combining the ideas from the previous two formalizations. In this file, instead of defining multiple explicit transports, we have just a single transport which allows changing the type of a term to a denotationally equal one. I.e. instead of explicitly specifying the behavior of substitution via constructors, we have a single constructor that says "if evaluating two types in Agda gives the same results, then you can convert a term of one type to the other one via a constructor".
Type Theory in Type Theory using Quotient Inductive Types, Thorsten Altenkirch, Ambrus Kaposi -- this is the most promising approach I'd say. It "legalizes" computation at the type level via the quotient types device. But we do not yet have quotient types in Agda, they are essentially postulated in the paper. People work a lot on quotient types (there is an entire thesis: Quotient inductive-inductive definitions -- Dijkstra, Gabe), though, so we'll probably have them at some point.
Decidability of Conversion for Type Theory in Type Theory, Andreas Abel, Joakim Öhman, Andrea Vezzosi -- untyped terms reified as typed ones. Lots of properties. Also has a lot of metatheoretical proofs and a particularly interesting device that allows one to prove soundness and completeness using the same logical relation. The formalization is huge and well-commented.
A setoid model of extensional Martin-Löf type theory in Agda (zip file with the development), Erik Palmgren -- abstract:
Abstract. We present details of an Agda formalization of a setoid
model of Martin-Löf type theory with Pi, Sigma, extensional identity
types, natural numbers and an infinite hierarchy of universe à la
Russell. A crucial ingredient is the use of Aczel's type V of
iterative sets as an extensional universe of setoids, which allows for
a well-behaved interpretation of type equality.
Coq in Coq, Bruno Barras and Benjamin Werner -- a formalization of CC in Coq (the code). Untyped terms reified as typed ones + lots of lemmas + metatheoretical proofs.
Thanks to András Kovács and James Chapman for suggestions.
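To give a feel for the "several data types defined mutually" style used by some of the formalizations above, here is a rough sketch of only the signature part. The names Con, Ty, Tm, U, El and Π are chosen for this illustration and are not taken from any of the cited developments. Variables, application, substitution and conversion are deliberately left out, and those are exactly the parts that make the problem hard.

```agda
-- Forward declarations: types are indexed by contexts, and terms
-- are indexed by a context and a type in that context.
data Con : Set
data Ty  : Con → Set
data Tm  : (Γ : Con) → Ty Γ → Set

data Con where
  ε   : Con
  _,_ : (Γ : Con) → Ty Γ → Con

data Ty where
  -- a universe, and the type decoded from a term of the universe
  U  : ∀ {Γ} → Ty Γ
  El : ∀ {Γ} → Tm Γ U → Ty Γ
  -- dependent function type: the codomain lives in the extended context
  Π  : ∀ {Γ} (A : Ty Γ) → Ty (Γ , A) → Ty Γ

data Tm where
  lam : ∀ {Γ A B} → Tm (Γ , A) B → Tm Γ (Π A B)
```

Note that this keeps Ty and Tm as separate families rather than merging them as in the question's attempt. Even stating the type of an application constructor needs substitution of a term into a type, which is where the techniques from the literature come in.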

agda - type of expression with mutual Lists not found

In Agda 2.5.1.1 on Windows, after the code below is loaded (it corresponds to the tutorial https://github.com/k0001/tut-agda/blob/master/SetsParametric.agda), the C-c C-d type-checking does find the type List₁ _A_2 _B_3 for the [] expression, but no reasonable type for any more structured expression like true ∷ [] , just underscore and number is returned, like _5 . Any ideas what the reason could be, please?
The previous exercises of the tutorial work well.
module Sets.Parametric where
open import Sets.Enumerated using (Bool; true; false; ⊤; tt)
data List₁ (A B : Set) : Set
data List₂ (A B : Set) : Set
data List₁ (A B : Set) where
  [] : List₁ A B
  _∷_ : A → List₂ A B → List₁ A B
data List₂ (A B : Set) where
  _∷_ : B → List₁ A B → List₂ A B
Non-overloaded constructors are inferable, which is why the type of [] is inferred. Overloaded constructors are only checkable, so Agda can't infer the type of true ∷ []; it can only check the expression against a given type such as List₂ A Bool.
Otherwise type-directed resolution for overloaded constructors would be too complicated. E.g. the type of the second argument of _∷_ could depend on its first argument, then figuring out whether _∷_ belongs to List₁ or List₂ would require solving two possibly non-trivial unification problems (one for List₁ and one for List₂) which likely will be postponed and sit in memory until it's clear which _∷_ the user means. Agda already generates lots of metavariables and I don't see any reason to increase this number and complicate type checking to incorporate this not super useful feature.
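As a sketch of the checking direction, giving the expression a type signature lets Agda pick the right _∷_. The example below is self-contained: Bool is defined locally instead of being imported from the tutorial's Sets.Enumerated module, and the names ex₁ and ex₂ are made up.

```agda
data Bool : Set where
  true false : Bool

data List₁ (A B : Set) : Set
data List₂ (A B : Set) : Set

data List₁ A B where
  []  : List₁ A B
  _∷_ : A → List₂ A B → List₁ A B

data List₂ A B where
  _∷_ : B → List₁ A B → List₂ A B

-- Checked against List₂ A Bool: _∷_ resolves to the List₂
-- constructor, so true : Bool and [] : List₁ A Bool.
ex₁ : {A : Set} → List₂ A Bool
ex₁ = true ∷ []

-- Checked against List₁ Bool Bool: the outer _∷_ is the List₁
-- constructor and the inner one is the List₂ constructor.
ex₂ : List₁ Bool Bool
ex₂ = true ∷ (true ∷ [])
```

The parentheses in ex₂ are there because no fixity is declared for _∷_ in this sketch.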
