Let G be a grammar such that:
S -> aBa
B -> bB | ε
where ε represents the empty string.
After computing FIRST and FOLLOW, is there a way to tell if G is LL(1) without resorting to the parsing table?
After computing the FIRST and FOLLOW sets for the variables of G, you can compute the length 1 lookahead sets LA(1) for the variables and rules of G. Then G is strong LL(1) iff the following condition holds:
The sets LA(1)(A -> wi), taken over all rules A -> wi with left-hand side A, partition LA(1)(A) for each variable A; that is, they are pairwise disjoint and their union is LA(1)(A).
Alternatively, you can prove that G is strong LL(1) directly from the definition of a strong LL(k) grammar, without computing the FIRST and FOLLOW sets. For small grammars like G this is often easier and less tedious.
I don't have a book handy, so there might be an error in some of these definitions or computations. But this is how I would approach the problem. Computing the FIRST and FOLLOW sets gives:
FIRST(1)(S) = trunc(1)({x : S =>* x AND x IN Σ*})
= trunc(1)({ab^na : n >= 0})
= {a}
FIRST(1)(B) = trunc(1)({x : B =>* x AND x IN Σ*})
= trunc(1)({b^n : n >= 0})
= {ε,b}
FOLLOW(1)(S) = trunc(1)({x : S =>* uSv AND x IN FIRST(1)(v)})
= trunc(1)({x : x IN FIRST(1)(ε)})
= trunc(1)(FIRST(1)(ε))
= {ε}
FOLLOW(1)(B) = trunc(1)({x : S =>* uBv AND x IN FIRST(1)(v)})
= trunc(1)({x : x IN FIRST(1)(a)})
= trunc(1)(FIRST(1)(a))
= {a}
Computing the length 1 lookahead sets for the variables and rules gives:
LA(1)(S) = trunc(1)(FIRST(1)(S)FOLLOW(1)(S))
= trunc(1)({a}{ε})
= trunc(1)({a})
= {a}
LA(1)(B) = trunc(1)(FIRST(1)(B)FOLLOW(1)(B))
= trunc(1)({ε,b}{a})
= trunc(1)({a,ba})
= {a,b}
LA(1)(S -> aBa) = trunc(1)(FIRST(1)(a)FIRST(1)(B)FIRST(1)(a)FOLLOW(1)(S))
= trunc(1)({a}{ε,b}{a}{ε})
= trunc(1)({aa,aba})
= {a}
LA(1)(B -> bB) = trunc(1)(FIRST(1)(b)FIRST(1)(B)FOLLOW(1)(B))
= trunc(1)({b}{ε,b}{a})
= trunc(1)({ba,bba})
= {b}
LA(1)(B -> ε) = trunc(1)(FIRST(1)(ε)FOLLOW(1)(B))
= trunc(1)({ε}{a})
= {a}
Since LA(1)(B -> ε) and LA(1)(B -> bB) partition LA(1)(B) and LA(1)(S -> aBa) trivially partitions LA(1)(S), G is strong LL(1).
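The partition check above is mechanical; here is a minimal Python sketch (the set names and encoding are mine, with the lookahead sets hard-coded from the computation above rather than derived):

```python
# Strong LL(1) check: for each variable A, the per-rule lookahead sets
# must be pairwise disjoint and together cover LA(1)(A).

LA = {"S": {"a"}, "B": {"a", "b"}}      # LA(1) per variable, from above
LA_RULE = {
    "S": [{"a"}],                       # LA(1)(S -> aBa)
    "B": [{"b"}, {"a"}],                # LA(1)(B -> bB), LA(1)(B -> ε)
}

def is_strong_ll1(LA, LA_RULE):
    for A, rule_sets in LA_RULE.items():
        union = set()
        for s in rule_sets:
            if union & s:               # two rules share a lookahead: conflict
                return False
            union |= s
        if union != LA[A]:              # the rule sets must exhaust LA(1)(A)
            return False
    return True

print(is_strong_ll1(LA, LA_RULE))       # True: G is strong LL(1)
```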
I would like to test some definitions in System F using Agda as my typechecker and evaluator.
My first attempt to introduce Church natural numbers was by writing
Num = forall {x} -> (x -> x) -> (x -> x)
Which would be used just like a regular type alias:
zero : Num
zero f x = x
However, the definition of Num does not typecheck (or rather kind-check). What is the proper way to make it work while staying as close as possible to the System F notation?
The following would typecheck
Num : Set₁
Num = forall {x : Set} -> (x -> x) -> (x -> x)
zero : Num
zero f x = x
but as you can see, Num : Set₁. If this becomes a problem, you'll need --type-in-type.
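As an aside, the Church-numeral encoding itself can be exercised in an untyped setting, independent of the universe issue. A minimal Python sketch (zero, succ, and to_int are hypothetical names of mine, not from any library):

```python
# Untyped Church numerals: the numeral n is "apply f n times to x".
zero = lambda f: lambda x: x                      # mirrors: zero f x = x
succ = lambda n: lambda f: lambda x: f(n(f)(x))   # one extra application of f

def to_int(n):
    """Decode a Church numeral by counting applications of f."""
    return n(lambda k: k + 1)(0)

print(to_int(succ(succ(zero))))                   # prints 2
```

The typed Agda definitions behave the same way; only the polymorphic type Num forces the jump to Set₁.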
I know how to define typeclasses, and after reading the definition of RawMonad, RawMonadZero, RawApplicative and more, I implemented some simple type instances:
data Parser (A : Set) : Set where
  MkParser : (String → List (A × String)) → Parser A

ParserMonad : RawMonad Parser
ParserMonad = record
  { return = λ a → MkParser λ s → < a , s > ∷ []
  ; _>>=_  = λ p f → MkParser $ concatMap (tuple $ parse ∘ f) ∘ parse p
  }
But when I tried to use ParserMonad.return in the implementation of ParserApplicative, it failed to compile:
ParserApplicative : RawApplicative Parser
ParserApplicative = record
  { pure = ParserMonad.return -- compile error
  ; _⊛_ = ...
  }
My question is: how to use ParserMonad.return to implement ParserApplicative.pure? How can I do that or what doc should I read?
Here you're not using instance arguments, you are using records. Instance arguments are an independent mechanism which, combined with records, can be used to simulate something like type classes.
Coming back to records, to use the field f of a record r of type R, you can do various things:
Use the projection R.f applied to r:
a = R.f r
Define a module M corresponding to the record r viewed as an R, and use the definition f from it:
module M = R r
a = M.f
Open that module and use f directly:
open module M = R r
a = f
Using the first alternative, in your case it'd give us:
ParserApplicative : RawApplicative Parser
ParserApplicative = record
  { pure = RawMonad.return ParserMonad
  (...)
  }
I'm writing an SLR(1) parser from the following grammar:
1) S -> aSb
2) S -> cB
3) B -> cB
4) B -> ε
First of all, I started with finding the associated LR(0) automaton with the augmented grammar adding the S' -> S production and starting to compute the various states. The states that I found are:
I0 = {S'->•S, S->•aSb, S->•cB}
𝛿(I0, S) = I1; 𝛿(I0, a) = I2; 𝛿(I0, c) = I3;
I1 = {S'->S•}
I2 = {S->a•Sb, S->•aSb, S->•cB}
𝛿(I2, S) = I4; 𝛿(I2, a) = I2; 𝛿(I2, c) = I3
I3 = {S->c•B, B->•cB, B->•ε}
𝛿(I3, B) = I5; 𝛿(I3, ε) = I6; 𝛿(I3, c) = I7;
I4 = {S->aS•b}
𝛿(I4, b) = I8;
I5 = {S->cB•}
I6 = {B->ε•}
I7 = {B->c•B, B->•cB, B->•ε}
𝛿(I7, B) = I9; 𝛿(I7, ε) = I6; 𝛿(I7, c) = I7;
I8 = {S->aSb•}
I9 = {B->cB•}
And here is the LR(0) automaton:
[automaton picture]
After that I built the parsing table (though I don't think it is needed in order to answer my question). My doubt is this: is the epsilon transition handled in the right way? I treated ε as a normal character, since we have to reduce by rule 4 at some point. If I'm wrong, how should I treat that transition? Thanks in advance; hoping this is helpful for other people as well.
No, there is no need to create the state I6.
The confusion may arise with Y -> ε. When you place a dot in an item, for example S -> A•B, it means that A has been completed and B is yet to be completed (completion here meaning progress in parsing). Similarly, Y -> •ε would mean that ε is yet to be parsed; but ε is the null string, i.e. nothing, so Y -> •ε is interpreted as Y -> •, a complete item.
You can use the JFLAP software and see its documentation about SLR(1).
The ε-transitions are NOT REQUIRED.
I faced the same problem while removing left recursion for the following grammar:
E -> E+T | E-T | T
The transformed rules look like:
E -> TX
X -> +TX | -TX | ε
That doesn't matter, because X -> •ε and X -> ε• have no distinct significance: moving the dot before or after ε is meaningless, so both are simply written X -> •.
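To make the point concrete, here is a minimal LR(0) closure sketch in Python (the grammar encoding and helper names are mine, not from JFLAP or any textbook): an ε-production contributes an item with an empty right-hand side, which is complete from the start, so no ε-edge is ever generated.

```python
# Grammar from the question; an ε-production is an empty tuple.
GRAMMAR = {
    "S": [("a", "S", "b"), ("c", "B")],
    "B": [("c", "B"), ()],          # B -> cB | ε
}

def closure(items):
    """Expand a set of LR(0) items (lhs, rhs, dot) with all items
    obtained by predicting the nonterminal after the dot."""
    items = set(items)
    changed = True
    while changed:
        changed = False
        for lhs, rhs, dot in list(items):
            if dot < len(rhs) and rhs[dot] in GRAMMAR:  # dot before a nonterminal
                for prod in GRAMMAR[rhs[dot]]:
                    item = (rhs[dot], prod, 0)          # B -> ε gives (B, (), 0): complete
                    if item not in items:
                        items.add(item)
                        changed = True
    return items

# State I3 of the question: closure of {S -> c•B}.
I3 = closure({("S", ("c", "B"), 1)})
```

The resulting state contains S -> c•B, B -> •cB, and B -> • (the complete ε-item), with outgoing transitions only on c and B.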
In my current compilers course, I've understood how to find the FIRST and FOLLOW sets of a grammar, and so far all of the grammars I have dealt with have contained epsilon. Now I am being asked to find the FIRST and FOLLOW sets of a grammar without epsilon, and to determine whether it is LR(0) and SLR. Not having epsilon has thrown me off, so I don't know if I've done it correctly. I would appreciate any comments on whether I am on the right track with the FIRST and FOLLOW sets, and on how to begin determining whether the grammar is LR(0).
Consider the following grammar describing Lisp arithmetic:
S -> E // S is start symbol, E is expression
E -> (FL) // F is math function, L is a list
L -> LI | I // I is an item in a list
I -> n | E // an item is a number n or an expression E
F -> + | - | *
FIRST:
FIRST(S)= FIRST(E) = {(}
FIRST(L)= FIRST(I) = {n,(}
FIRST(F) = {+, -, *}
FOLLOW:
FOLLOW(S) = {$}
FOLLOW(E) = FOLLOW(L) = {), n, $}
FOLLOW(I) = {),$}
FOLLOW(F) = {),$}
The FIRST sets are right, but the FOLLOW sets are incorrect.
The FOLLOW(S) = {$} is right, though technically this is for the augmented grammar S' -> S$ .
E appears on the right side of S -> E and I -> E, both of which mean that the FOLLOW set of the left-hand side is contained in FOLLOW(E), so: FOLLOW(E) = FOLLOW(S) ∪ FOLLOW(I).
L appears on the right hand side of L -> LI, which gives FOLLOW(L) ⊇ FIRST(I) , and E -> (FL), which gives FOLLOW(L) ⊇ {)} .
I appears on the right side of L -> LI | I , which gives FOLLOW(I) = FOLLOW(L) .
F appears on the right side in E -> (FL) , which gives FOLLOW(F) = FIRST(L)
Solving for these gives:
FOLLOW(F) = {n, (}
FOLLOW(L) = FIRST(I) ∪ {)} = {n, (, )}
FOLLOW(I) = {n, (, )}
FOLLOW(E) = {$} ∪ {n, (, )} = {n, (, ), $}
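These sets can be double-checked with a small fixed-point computation. A Python sketch (the encoding is mine; note that this grammar has no ε-productions, which simplifies the FIRST rule to looking at the first right-hand-side symbol only):

```python
# FIRST/FOLLOW by fixed-point iteration for the Lisp-arithmetic grammar.
PRODUCTIONS = [
    ("S", ["E"]),
    ("E", ["(", "F", "L", ")"]),
    ("L", ["L", "I"]), ("L", ["I"]),
    ("I", ["n"]), ("I", ["E"]),
    ("F", ["+"]), ("F", ["-"]), ("F", ["*"]),
]
NONTERMS = {"S", "E", "L", "I", "F"}

FIRST = {a: set() for a in NONTERMS}
FOLLOW = {a: set() for a in NONTERMS}
FOLLOW["S"].add("$")                      # end-of-input marker for the start symbol

changed = True
while changed:                            # iterate until both maps stabilize
    changed = False
    for lhs, rhs in PRODUCTIONS:
        # FIRST: no symbol derives ε here, so only the first symbol matters.
        head = FIRST[rhs[0]] if rhs[0] in NONTERMS else {rhs[0]}
        if not head <= FIRST[lhs]:
            FIRST[lhs] |= head; changed = True
        # FOLLOW: each nonterminal X gets FIRST(next symbol), or FOLLOW(lhs) at the end.
        for i, x in enumerate(rhs):
            if x not in NONTERMS:
                continue
            if i + 1 < len(rhs):
                nxt = rhs[i + 1]
                add = FIRST[nxt] if nxt in NONTERMS else {nxt}
            else:
                add = FOLLOW[lhs]
            if not add <= FOLLOW[x]:
                FOLLOW[x] |= add; changed = True
```

Running this reproduces the sets derived above, e.g. FOLLOW(E) = {n, (, ), $} and FOLLOW(F) = {n, (}.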
I am writing a basic monadic parser in Idris, to get used to the syntax and differences from Haskell. I have the basics of that working just fine, but I am stuck on trying to create VerifiedSemigroup and VerifiedMonoid instances for the parser.
Without further ado, here's the parser type, Semigroup, and Monoid instances, and the start of a VerifiedSemigroup instance.
data ParserM a = Parser (String -> List (a, String))

parse : ParserM a -> String -> List (a, String)
parse (Parser p) = p

instance Semigroup (ParserM a) where
  p <+> q = Parser (\s => parse p s ++ parse q s)

instance Monoid (ParserM a) where
  neutral = Parser (const [])

instance VerifiedSemigroup (ParserM a) where
  semigroupOpIsAssociative (Parser p) (Parser q) (Parser r) = ?whatGoesHere
I'm basically stuck after intros, with the following prover state:
-Parser.whatGoesHere> intros
---------- Other goals: ----------
{hole3},{hole2},{hole1},{hole0}
---------- Assumptions: ----------
a : Type
p : String -> List (a, String)
q : String -> List (a, String)
r : String -> List (a, String)
---------- Goal: ----------
{hole4} : Parser (\s => p s ++ q s ++ r s) =
Parser (\s => (p s ++ q s) ++ r s)
-Parser.whatGoesHere>
It looks like I should be able to use rewrite together with appendAssociative somehow,
but I don't know how to "get inside" the lambda \s.
Anyway, I'm stuck on the theorem-proving part of the exercise - and I can't seem to find much Idris-centric theorem proving documentation. I guess maybe I need to start looking at Agda tutorials (though Idris is the dependently-typed language I'm convinced I want to learn!).
The simple answer is that you can't. Reasoning about functions is fairly awkward in intensional type theories. For example, Martin-Löf's type theory is unable to prove:
0 + y = y
S x + y = S (x + y)

x +′ 0 = x
x +′ S y = S (x +′ y)

_+_ ≡ _+′_ -- ???
(as far as I know, this is an actual theorem and not just "proof by lack of imagination"; however, I couldn't find the source where I read it). This also means that there is no proof for the more general:
ext : ∀ {A : Set} {B : A → Set}
      {f g : (x : A) → B x} →
      (∀ x → f x ≡ g x) → f ≡ g
This is called function extensionality: if you can prove that the results are equal for all arguments (that is, the functions are equal extensionally), then the functions are equal as well.
This would work perfectly for the problem you have:
<+>-assoc : {A : Set} (p q r : ParserM A) →
            (p <+> q) <+> r ≡ p <+> (q <+> r)
<+>-assoc (Parser p) (Parser q) (Parser r) =
  cong Parser (ext λ s → ++-assoc (p s) (q s) (r s))
where ++-assoc is your proof of the associativity of _++_. I'm not sure how it would look in tactics, but it's going to be fairly similar: apply congruence for Parser and the goal should be:
(\s => p s ++ q s ++ r s) = (\s => (p s ++ q s) ++ r s)
You can then apply extensionality to get assumption s : String and a goal:
p s ++ q s ++ r s = (p s ++ q s) ++ r s
However, as I said before, we don't have function extensionality (note that this is not true for type theories in general: extensional type theories, homotopy type theory and others are able to prove this statement). The easy option is to assume it as an axiom. As with any other axiom, you risk:
Losing consistency (i.e. being able to prove falsehood; though I think function extensionality is OK)
Breaking reduction (what does a function that does case analysis only for refl do when given this axiom?)
I'm not sure how Idris handles axioms, so I won't go into details. Just beware that axioms can mess up some stuff if you are not careful.
The hard option is to work with setoids. A setoid is basically a type equipped with a custom equality. The idea is that instead of having a Monoid (or VerifiedSemigroup in your case) that works with the built-in equality (= in Idris, ≡ in Agda), you have a special monoid (or semigroup) with a different underlying equality. This is usually done by packing the monoid (semigroup) operations together with the equality and a bunch of proofs, namely (in pseudocode):
= : A → A → Set -- equality
_*_ : A → A → A -- associative binary operation
1 : A -- neutral element
=-refl : x = x
=-trans : x = y → y = z → x = z
=-sym : x = y → y = x
*-cong : x = y → u = v → x * u = y * v -- the operation respects
-- our equality
*-assoc : x * (y * z) = (x * y) * z
1-left : 1 * x = x
1-right : x * 1 = x
The choice of equality for parsers is clear: two parsers are equal if their outputs agree for all possible inputs.
-- Parser equality
_≡p_ : {A : Set} (p q : ParserM A) → Set
Parser p ≡p Parser q = ∀ x → p x ≡ q x
This solution comes with different tradeoffs, namely that the new equality cannot fully substitute the built-in one (this tends to show up when you need to rewrite some terms). But it's great if you just want to show that your code does what it's supposed to do (up to some custom equality).