Removing Ambiguity Caused By Dangling Else For LL(1) Grammars - ambiguous

In the case of the dangling else problem for compiler design, is there a reason to left factor it before removing ambiguity?
We are transforming a CFG into an LL(1) grammar so my professor is asking us to first eliminate recursion, then left factor, then remove ambiguity from our grammar. But, from what I've read, ambiguity is usually eliminated first. I'm not sure how to remove ambiguity after left factoring.
This is what I got after left factoring it:
S -> i E t S S' | other
S' -> e S | epsilon
However, as I understand it, removing ambiguity requires rewriting the grammar, so the result will always look similar to this, right?
S -> U | M
M -> i E t M e M | other
U -> i E t U'
U' -> M e U | S
Or is there another way to do it? As far as I can see, this is the only way to remove ambiguity from the dangling else.

As it turns out, a good way to deal with the ambiguity caused by a dangling else in an LL(1) grammar is to handle it in the parser. Rewriting the grammar is another way to handle it, as is adding 'begin' and 'end' to the grammar like so:
S -> i E t a S z S' | other
S' -> e S | epsilon
This may be obvious to some, but for other beginners, this is what the symbols mean:
S: Statement
i: if
E: Expression
t: then
a: begin
z: end
S': Statement'
e: else
other: any other productions
Note: lower case letters represent terminals; Uppercase letters represent variables.
If anything is wrong, please let me know and I'll correct it.
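As a rough illustration of "handling it in the parser", here is a minimal recursive-descent sketch for the left-factored grammar S -> i E t S S' | other, S' -> e S | epsilon. It assumes the input is already a list of tokens, and that the token "E" stands in for a whole expression; the function names parse_statement and expect are made up for this example. The only point is the rule applied in S': whenever the next token is e, the parser takes e S, which attaches each else to the nearest unmatched if.

```python
def expect(tokens, pos, wanted):
    """Raise SyntaxError unless tokens[pos] is the wanted token."""
    if pos >= len(tokens) or tokens[pos] != wanted:
        raise SyntaxError(f"expected {wanted!r} at position {pos}")


def parse_statement(tokens, pos=0):
    """Parse S -> i E t S S' | other and return (tree, next_position)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] == "other":
        return "other", pos + 1
    expect(tokens, pos, "i")
    expect(tokens, pos + 1, "E")  # "E" is a placeholder for a full expression
    expect(tokens, pos + 2, "t")
    then_branch, pos = parse_statement(tokens, pos + 3)
    # S' -> e S | epsilon: both alternatives are possible when the next
    # token is 'e'. Always choosing 'e S' resolves the conflict and binds
    # the else to the innermost if.
    if pos < len(tokens) and tokens[pos] == "e":
        else_branch, pos = parse_statement(tokens, pos + 1)
        return ("if", then_branch, else_branch), pos
    return ("if", then_branch, None), pos


tree, end = parse_statement("i E t i E t other e other".split())
print(tree)
```

In the sample input the else ends up inside the inner if, and the outer if has no else branch.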

I think this can be a possible answer:
[After left factoring and making it unambiguous]
Let other = a
S -> iEtT | a
T -> S | aeS
I am generating all the ifs first and associating each else with the most recent unassociated if.
To get an else, I have to eliminate the possibility of a new if appearing between the current unassociated if and its corresponding else.
However, I am allowing the possibility of an if after the corresponding else has been generated.
Point out if there are any errors.
Thank you.

Related

Is this grammar LR(2) and how can I determine it?

To determine whether my parser is working correctly, I need to find an LR(2+) grammar. After some quick research I found this grammar, and I believe that it is LR(2). However, I am not sure how to determine this.
Terminals: b, e, o, r, s
NonTerminals: A, B, E, P, SL
Start: P
Productions:
P -> A
A -> E B SL E | b e
B -> b | o r
E -> e | Ɛ
SL -> s SL | s
I would be glad if someone could confirm or deny that this grammar is LR(2) and, ideally, give me a brief explanation of how to determine it myself.
Thank you very much!
I'm pretty sure it's LR(2), but I don't have an LR(2) parser generator handy to test it, which would be the definitive way to do the test. Of course, you could generate the parser tables by hand. It's not that complicated a grammar, so it shouldn't take you too long.
It's certainly not LR(1), as can be seen from the pair of inputs:
b e
b s e
The left-most derivations are:
P->A->b e
P->A->E B SL E->B SL E->b SL E->b s E->b s e
So at the beginning of the parse, the parser can either shift a b in order to follow the first derivation chain or reduce an empty sequence to E in order to proceed with the second derivation chain. The second token is needed to choose between these two options, hence a lookahead of at least 2 is required.
As a side note, it should be pretty simple to mine StackOverflow for LR(2) grammars; they come up from time to time in questions. Here's a few I found by searching for LALR(2): (I used a Google search with site:stackoverflow.com because SO's own search engine doesn't do well with search patterns which aren't words. Not that Google does it well, but it does do it better.)
Solving bison conflict over 2nd lookahead
Solving small shift reduce conflict
Persistent Shift - Reduce Conflict in Goldparser
How to reduce parser stack or 'unshift' the current token depending on what follows?
I didn't verify the claims in those questions and answers, and there are other questions which didn't seem to have as clear a result.
The most classic LALR(2) grammar is the grammar for Yacc itself, which is pretty ironic. Here's a simplified version:
grammar: %empty | grammar production
production: ID ':' symbols
symbols: %empty | symbols symbol
symbol: ID | QUOTED_LITERAL
That simple grammar leaves out actions and the optional semicolon. But it captures the essence of the LALR(2)-ness of the grammar, which is precisely the result of the semicolon being optional. That's not a complaint; the grammar is unambiguous so the semicolon really is redundant and no-one should be forced to type a redundant token :-)

How to eliminate this Left Recursion for LL Parser

How do you eliminate left recursion of the following type? I can't seem to apply the general rule to this particular one.
A -> A | a | b
By using the elimination rule you get:
A -> aA' | bA'
A' -> A' | epsilon
Which still has left recursion.
Does this say anything about the grammar being/not being LL(1)?
Thank you.
Notice that the rule
A → A
is, in a sense, entirely useless. It doesn't do anything to a derivation to apply this rule. As a result, we can safely remove it from the grammar without changing what the grammar produces. This leaves
A → a | b
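which is LL(1). (With A → A included, the same string has infinitely many derivations, so the original grammar is ambiguous and cannot be LL(1).)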
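The removal step can be sketched in a couple of lines. This assumes a grammar stored as a dictionary from each nonterminal to a list of right-hand sides (each a list of symbols); both that representation and the name drop_self_productions are only for illustration.

```python
def drop_self_productions(grammar):
    """Remove every production of the form X -> X; the language is unchanged."""
    return {
        lhs: [rhs for rhs in alternatives if rhs != [lhs]]
        for lhs, alternatives in grammar.items()
    }


grammar = {"A": [["A"], ["a"], ["b"]]}
print(drop_self_productions(grammar))
```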

Conflict in CLR parsing

This is the grammar:
S' -> S
S-> aBc|bCc|aCd|bBd
B ->e
C ->e
I built the CLR parsing table and a reduce/reduce conflict arose. What should I do next? I have attached my worked solution below.
Anybody please tell me what to do next
Err... fix the conflict?
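The conflict is visible from the last two productions alone. If the states reached after a e and after b e are merged (which is what LALR(1) does), the parser ends up in this state when it meets either c or d after e:
B -> e . {c, d}
C -> e . {c, d}
In that merged state, a single lookahead token is not enough to determine whether to reduce to B or to C. Note that a canonical LR(1) construction keeps the two states apart (B -> e . {c} and C -> e . {d} after a e; B -> e . {d} and C -> e . {c} after b e), so if your CLR table shows this conflict, recheck whether states with different lookaheads were merged.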
Parser generators usually resolve this by taking the production that appears first in the grammar, but that is not always a good choice. In the grammar above, if this resolution is used, the parser won't be able to parse bec and aed, because e is always reduced to B.
I suggest changing the grammar so that no conflict occurs. You know the whole grammar can only produce aec, bec, aed and bed. Look for a part of those sequences that could be made into a separate production that reduces uniquely.
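To see which information is needed to pick the right reduction, the four alternatives of S can be tabulated directly. The sketch below is not an LR table construction; it only groups the alternatives by lookahead token, and then by the pair (token before e, lookahead). The variable names are made up for this example.

```python
from collections import defaultdict

# Each alternative of S reads as: token before e, nonterminal that e
# must reduce to, lookahead token after e.
S_ALTERNATIVES = ["aBc", "bCc", "aCd", "bBd"]

by_lookahead = defaultdict(set)
by_context_and_lookahead = defaultdict(set)
for context, nonterminal, lookahead in S_ALTERNATIVES:
    by_lookahead[lookahead].add(nonterminal)
    by_context_and_lookahead[(context, lookahead)].add(nonterminal)

# Lookaheads for which both B and C are possible reductions.
conflicting_lookaheads = {la for la, nts in by_lookahead.items() if len(nts) > 1}
# True when (context, lookahead) always identifies a single nonterminal.
context_disambiguates = all(
    len(nts) == 1 for nts in by_context_and_lookahead.values()
)

print(sorted(conflicting_lookaheads), context_disambiguates)
```

The lookahead on its own never decides between B and C, while the lookahead combined with the token seen before e always does.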

fixing a grammar to LR(0)

Question:
Given the following grammar, fix it to an LR(0) grammar:
S -> S' $
S'-> aS'b | T
T -> cT | c
Thoughts
I've been trying this for quite some time, using automatic tools to check my fixed grammars, with no success. Our professor likes asking these kinds of questions on tests without giving us a methodology for approaching them (other than repeated trial and error). Is there any method that can be applied to answer these kinds of questions? Can anyone show how such a method can be applied to this example?
I don't know of an automatic procedure, but the basic idea is to defer decisions. That is, if at a particular state in the parse, both shift and reduce actions are possible, find a way to defer the reduction.
In the LR(0) parser, you can make a decision based on the token you just shifted, but not on the token you (might be) about to shift. So you need to move decisions to the end of productions, in a manner of speaking.
For example, your language consists of all sentences { a^n c^m b^n $ | n ≥ 0, m > 0 }. If we restrict that to n > 0, then an LR(0) grammar can be constructed by deferring the reduction decision to the point following a b:
S -> S' $.
S' -> U | a S' b.
U -> a c T.
T -> b | c T.
That grammar is LR(0). In the original grammar, at the itemset including T -> c . and T -> c . T, both shift and reduce are possible: shift c and reduce before b. By moving the b into the production for T, we defer the decision until after the shift: after shifting b, a reduction is required; after c, the reduction is impossible.
But that forces every sentence to have at least one b. It omits sentences for which n = 0 (that is, the regular language c+$, one or more c's followed by $). That subset has an LR(0) grammar:
S -> S' $.
S' -> c | S' c.
We can construct the union of these two languages in a straight-forward manner, renaming one of the S's:
S -> S1' $ | S2' $.
S1' -> U | a S1' b.
U -> a c T.
T -> b | c T.
S2' -> c | S2' c.
This grammar is LR(0), but the form in which the end-of-input sentinel $ has been included seems to be cheating. At least, it violates the rule for augmented grammars, because an augmented grammar's base rule is always S -> S' $ where S' and $ are symbols not used in the original grammar.
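One way to gain some confidence that the union grammar generates the same sentences as the original is to enumerate both languages up to a length bound and compare. The sketch below does that with a breadth-first search over leftmost derivations. It does not check the LR(0) property, and it relies on the assumption (true for both grammars here) that no nonterminal derives the empty string, so sentential forms longer than the bound can be discarded. The function name language and the dictionary representation are illustrative only.

```python
from collections import deque


def language(grammar, start, max_len):
    """Return all terminal strings of length <= max_len derivable from start.

    Assumes no nonterminal derives the empty string, so any sentential
    form longer than max_len can be pruned safely.
    """
    results = set()
    seen = {(start,)}
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        # Position of the leftmost nonterminal, if any.
        idx = next((i for i, sym in enumerate(form) if sym in grammar), None)
        if idx is None:
            results.add("".join(form))
            continue
        for rhs in grammar[form[idx]]:
            new_form = form[:idx] + tuple(rhs) + form[idx + 1:]
            if len(new_form) <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return results


original = {
    "S": [["S'", "$"]],
    "S'": [["a", "S'", "b"], ["T"]],
    "T": [["c", "T"], ["c"]],
}
union = {
    "S": [["S1'", "$"], ["S2'", "$"]],
    "S1'": [["U"], ["a", "S1'", "b"]],
    "U": [["a", "c", "T"]],
    "T": [["b"], ["c", "T"]],
    "S2'": [["c"], ["S2'", "c"]],
}

print(language(original, "S", 8) == language(union, "S", 8))
```

Agreement up to a bound is only evidence, not a proof, but a mismatch would immediately show a sentence that one grammar accepts and the other does not.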
It might seem that we could avoid that technicality by right-factoring:
S -> S' $
S' -> S1' | S2'
Unfortunately, while that grammar is still deterministic, and does recognise exactly the original language, it is not LR(0).
(Many thanks to #templatetypedef for checking the original answer, and identifying a flaw, and also to #Dennis, who observed that c* was omitted.)

confusion in finding first and follow in left recursive grammar

Recently I ran into a problem finding FIRST and FOLLOW for this grammar:
S->cAd
A->Ab|a
Here I am confused about FIRST of A.
Which one is correct, {a} or {empty, a}, given that there is left recursion in A's production?
I am not sure whether to include the empty string in FIRST of A or not.
Any help would be appreciated.
-------------edited---------------
What will be the FIRST and FOLLOW of this? This is the most confusing grammar I have ever seen.
S->SA|A
A->a
I need to prove that this grammar is not LL(1) using a parsing table, but I am unable to because I did not get two entries in a single cell.
Firstly, you'll need to remove left recursion, leading to
S -> cAd
A -> aA'
A' -> bA' | epsilon
Then, you can calculate
FIRST(A) = {a} // as a is the only terminal that can begin a string derived from A.
EDIT :-
For your second question,
S -> AS'
S' -> AS' | epsilon
A -> a
FIRST(A) = {a}
FIRST(S) = {a}
FIRST(S') = {a,epsilon}.
The idea of removing left-recursion before calculating FIRST() and FOLLOW() can be learnt here.
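The sets above can be checked mechanically. What follows is a minimal sketch of the usual fixed-point computation of FIRST and FOLLOW, assuming a grammar stored as a dictionary from nonterminal to a list of right-hand sides, with the empty list standing for epsilon. The names EPS, first_of_sequence, first_sets and follow_sets are made up for this example.

```python
EPS = "epsilon"


def first_of_sequence(seq, first, grammar):
    """FIRST of a sequence of symbols, given FIRST of each nonterminal."""
    out = set()
    for sym in seq:
        sym_first = first[sym] if sym in grammar else {sym}
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)  # every symbol can vanish (or the sequence is empty)
    return out


def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:  # repeat until no set grows any more
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_sequence(rhs, first, grammar)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def follow_sets(grammar, start, first):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue  # FOLLOW is only defined for nonterminals
                    tail = first_of_sequence(rhs[i + 1:], first, grammar)
                    new = tail - {EPS}
                    if EPS in tail:
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


g1 = {"S": [["c", "A", "d"]], "A": [["a", "A'"]], "A'": [["b", "A'"], []]}
g2 = {"S": [["A", "S'"]], "S'": [["A", "S'"], []], "A": [["a"]]}
first1, first2 = first_sets(g1), first_sets(g2)
follow1 = follow_sets(g1, "S", first1)
print(first1, follow1, first2)
```

Here g1 and g2 are the two grammars after left-recursion removal; epsilon only shows up in FIRST of the primed nonterminals, never in FIRST(A).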

Resources