I am new to the subject of compilation and have just started an exercise for Bottom -up parsing.
I have stuck on the following problem.
build a LR(0) parsing table for following grammar:
1) E –> E + T
2) E –> T
3) T –> (E)
4) T –> id
I0 :
E' –> .E
E –> .E + T
E –> .T
T –> .(E)
T –> .id
on E the next state in the DFA would be :
I1:
E' -> E.
E -> E. + T
from what I have learned so far isn't this a S-R conflict?
because the parser would not know whether to reduce or shift as it
has no look-ahead variable?
so this should not be LR(0) grammar?
but the PDF which I am reading have built the LR(0) table.
So is there a mistake in the PDF or have I gone wrong some where understanding the concept?
You augmented the grammar with E' -> E. Normally, you'd augment with a production like E' -> E $, where $ is a (terminal) symbol that doesn't otherwise occur in the grammar, and denotes end-of-input.
So I1 would actually be
E' -> E. $
E -> E. + T
and there isn't a conflict. (And I believe the grammar is LR(0).)
This is indeed a shift/reduce conflict. This grammar isn't LR(0). You can also see this because it's not prefix-free; the grammar contains multiple strings that are prefixes of one another, so it can't be LR(0).
That said, you can still construct all the LR(0) configurating sets and make the LR(0) automaton. It just won't be deterministic because of the shift/reduce conflict. It's possible, therefore, that you're right and the handout is right.
Hope this helps!
Related
How do you identify whether a grammar is LL(1), LR(0), or SLR(1)?
Can anyone please explain it using this example, or any other example?
X → Yz | a
Y → bZ | ε
Z → ε
To check if a grammar is LL(1), one option is to construct the LL(1) parsing table and check for any conflicts. These conflicts can be
FIRST/FIRST conflicts, where two different productions would have to be predicted for a nonterminal/terminal pair.
FIRST/FOLLOW conflicts, where two different productions are predicted, one representing that some production should be taken and expands out to a nonzero number of symbols, and one representing that a production should be used indicating that some nonterminal should be ultimately expanded out to the empty string.
FOLLOW/FOLLOW conflicts, where two productions indicating that a nonterminal should ultimately be expanded to the empty string conflict with one another.
Let's try this on your grammar by building the FIRST and FOLLOW sets for each of the nonterminals. Here, we get that
FIRST(X) = {a, b, z}
FIRST(Y) = {b, epsilon}
FIRST(Z) = {epsilon}
We also have that the FOLLOW sets are
FOLLOW(X) = {$}
FOLLOW(Y) = {z}
FOLLOW(Z) = {z}
From this, we can build the following LL(1) parsing table:
a b z $
X a Yz Yz
Y bZ eps
Z eps
Since we can build this parsing table with no conflicts, the grammar is LL(1).
To check if a grammar is LR(0) or SLR(1), we begin by building up all of the LR(0) configurating sets for the grammar. In this case, assuming that X is your start symbol, we get the following:
(1)
X' -> .X
X -> .Yz
X -> .a
Y -> .
Y -> .bZ
(2)
X' -> X.
(3)
X -> Y.z
(4)
X -> Yz.
(5)
X -> a.
(6)
Y -> b.Z
Z -> .
(7)
Y -> bZ.
From this, we can see that the grammar is not LR(0) because there is a shift/reduce conflicts in state (1). Specifically, because we have the shift item X → .a and Y → ., we can't tell whether to shift the a or reduce the empty string. More generally, no grammar with ε-productions is LR(0).
However, this grammar might be SLR(1). To see this, we augment each reduction with the lookahead set for the particular nonterminals. This gives back this set of SLR(1) configurating sets:
(1)
X' -> .X
X -> .Yz [$]
X -> .a [$]
Y -> . [z]
Y -> .bZ [z]
(2)
X' -> X.
(3)
X -> Y.z [$]
(4)
X -> Yz. [$]
(5)
X -> a. [$]
(6)
Y -> b.Z [z]
Z -> . [z]
(7)
Y -> bZ. [z]
The shift/reduce conflict in state (1) has been eliminated because we only reduce when the lookahead is z, which doesn't conflict with any of the other items.
If you have no FIRST/FIRST conflicts and no FIRST/FOLLOW conflicts, your grammar is LL(1).
An example of a FIRST/FIRST conflict:
S -> Xb | Yc
X -> a
Y -> a
By seeing only the first input symbol "a", you cannot know whether to apply the production S -> Xb or S -> Yc, because "a" is in the FIRST set of both X and Y.
An example of a FIRST/FOLLOW conflict:
S -> AB
A -> fe | ε
B -> fg
By seeing only the first input symbol "f", you cannot decide whether to apply the production A -> fe or A -> ε, because "f" is in both the FIRST set of A and the FOLLOW set of A (A can be parsed as ε/empty and B as f).
Notice that if you have no epsilon-productions you cannot have a FIRST/FOLLOW conflict.
Simple answer:A grammar is said to be an LL(1),if the associated LL(1) parsing table has atmost one production in each table entry.
Take the simple grammar A -->Aa|b.[A is non-terminal & a,b are terminals]
then find the First and follow sets A.
First{A}={b}.
Follow{A}={$,a}.
Parsing table for Our grammar.Terminals as columns and Nonterminal S as a row element.
a b $
--------------------------------------------
S | A-->a |
| A-->Aa. |
--------------------------------------------
As [S,b] contains two Productions there is a confusion as to which rule to choose.So it is not LL(1).
Some simple checks to see whether a grammar is LL(1) or not.
Check 1: The Grammar should not be left Recursive.
Example: E --> E+T. is not LL(1) because it is Left recursive.
Check 2: The Grammar should be Left Factored.
Left factoring is required when two or more grammar rule choices share a common prefix string.
Example: S-->A+int|A.
Check 3:The Grammar should not be ambiguous.
These are some simple checks.
LL(1) grammar is Context free unambiguous grammar which can be parsed by LL(1) parsers.
In LL(1)
First L stands for scanning input from Left to Right. Second L stands
for Left Most Derivation. 1 stands for using one input symbol at each
step.
For Checking grammar is LL(1) you can draw predictive parsing table. And if you find any multiple entries in table then you can say grammar is not LL(1).
Their is also short cut to check if the grammar is LL(1) or not .
Shortcut Technique
With these two steps we can check if it LL(1) or not.
Both of them have to be satisfied.
1.If we have the production:A->a1|a2|a3|a4|.....|an.
Then,First(a(i)) intersection First(a(j)) must be phi(empty set)[a(i)-a subscript i.]
2.For every non terminal 'A',if First(A) contains epsilon
Then First(A) intersection Follow(A) must be phi(empty set).
Consider the following grammar
S -> aPbSQ | a
Q -> tS | ε
P -> r
While constructing the DFA we can see there shall be a state which contains Items
Q -> .tS
Q -> . (epsilon as a blank string)
since t is in follow(Q) there appears to be a shift - reduce conflict.
Can we conclude the nature of the grammar isn't SLR(1) ?
(Please ignore my incorrect previous answer.)
Yes, the fact that you have a shift/reduce conflict in this configuring set is sufficient to show that this grammar isn't SLR(1).
Given a grammar G defined by
A -> Ca
B -> Cb
C -> e|f
Is this grammar LL(1)?
I realize that we could compress this down into a single line, but that's not the point of this question.
Mainly, can an LL(1) grammar have multiple rules that begin with the same non-terminal?
As a follow up question, how do I construct a parse table for the above grammar?
I've worked out the following:
FIRST(A) = {e,f}
FIRST(B) = {e,f}
FIRST(C) = {a,b}
FOLLOW(A) = {}
FOLLOW(B) = {}
FOLLOW(C) = {a,b}
I looked at this post, but didn't understand how they went from the FIRSTs and FOLLOWs to a table.
The grammar you've given is not LL(1) because there is a FIRST/FIRST conflict between the two productions A → Ca and A → Cb.
In general, grammars with multiple productions for the same nonterminal that begin with the same nonterminal will not be LL(1) because you will get a FIRST/FIRST conflict. There are grammars with this property that are LL(1), though they're essentially degenerate cases. Consider, for example, the following grammar:
A → Ea
A → Eb
E → ε
Here, the nonterminal E only expands out to the empty string ε and therefore, in effect, isn't really there. Therefore, the above grammar is LL(1) because there are no FIRST/FIRST conflicts between the two productions. To see this, here's the parse table:
a b $
A Ea Eb -
E ε ε -
Hope this helps!
I am solve your question in 2 case:
First case if the terminal is {a,b,e,f}
Second case if the terminal is {a,b,f} and e is epsilon
so no multipe entry this is LL(1).
For more information:
look at this explanation and example below:
Best regards
I am badly stuck on a question i am attempting from a sample final exam of compilers. I will really appreciate if someone can help me out with an explanation. Thanks
Consider the grammar G listed below
S = E $
E = E + T | T
T = T * F | F
F = ident | ( E )
Where + * ident ( ) are terminal symbols and $ is end of file.
a) is this grammar LR( 0 )? Justify your answer.
b) is the grammar SLR( 1 ) ? Justify your answer.
c) is this grammar LALR( 1 )? Justify your answer.
If you can show that the grammar is LR(0) then of course it is SLR(1) and LALR(1) because LR(0) is more restrictive.
Unfortunately, the grammar isn't LR(0).
For instance suppose you have just recognized E:
S -> E . $
You must not reduce this E to S if what follows is a + or * symbol, because E can be followed by + or * which continue to build a larger expression:
S -> E . $
E -> E . + T
T -> T . * F
This requires us to look ahead one token to know what to do in that state: to shift (+ or *) or reduce ($).
SLR(1) adds lookahead, and makes use of the follow-set information to make reductions (better than nothing, but the follow-set information globally obtained from the grammar is not context sensitive, like the state-specific lookahead sets in LALR(1)).
Under SLR(1), the above conflict goes away, because the S -> E reduction is considered only when the lookahead symbol is in the follow set of S, and the only thing in the follow set of S is the EOF symbol $. If the input symbol is not $, like +, then the reduction is not considered; a shift takes place which doesn't conflict with the reduction.
So the grammar does not fail to be SLR(1) on account of that conflict. It might, however, have some other conflict. Glancing through it, I can't see one; but to "justify that answer" properly, you have to generate all of the LR(0) state items, and go through the routine of verifying that the SLR(1) constraints are not violated. (You use the simple LR(0) items for SLR(1) because SLR(1) doesn't augment these items in any new way. Remember, it just uses the follow-set information cribbed from the grammar to eliminate conflicts.)
If it is SLR(1) then LALR(1) falls by subset relationship.
Update
The Red Dragon Book (Compilers: Principles, Techniques and Tools, Aho, Sethi, Ullman, 1988) uses exactly the same grammar in a set of examples that show the derivation of the canonical LR(0) item sets and the associated DFA, and some of the steps of filling in the parsing tables. This is in section 4.7, starting with example 4.34.
How do you identify whether a grammar is LL(1), LR(0), or SLR(1)?
Can anyone please explain it using this example, or any other example?
X → Yz | a
Y → bZ | ε
Z → ε
To check if a grammar is LL(1), one option is to construct the LL(1) parsing table and check for any conflicts. These conflicts can be
FIRST/FIRST conflicts, where two different productions would have to be predicted for a nonterminal/terminal pair.
FIRST/FOLLOW conflicts, where two different productions are predicted, one representing that some production should be taken and expands out to a nonzero number of symbols, and one representing that a production should be used indicating that some nonterminal should be ultimately expanded out to the empty string.
FOLLOW/FOLLOW conflicts, where two productions indicating that a nonterminal should ultimately be expanded to the empty string conflict with one another.
Let's try this on your grammar by building the FIRST and FOLLOW sets for each of the nonterminals. Here, we get that
FIRST(X) = {a, b, z}
FIRST(Y) = {b, epsilon}
FIRST(Z) = {epsilon}
We also have that the FOLLOW sets are
FOLLOW(X) = {$}
FOLLOW(Y) = {z}
FOLLOW(Z) = {z}
From this, we can build the following LL(1) parsing table:
a b z $
X a Yz Yz
Y bZ eps
Z eps
Since we can build this parsing table with no conflicts, the grammar is LL(1).
To check if a grammar is LR(0) or SLR(1), we begin by building up all of the LR(0) configurating sets for the grammar. In this case, assuming that X is your start symbol, we get the following:
(1)
X' -> .X
X -> .Yz
X -> .a
Y -> .
Y -> .bZ
(2)
X' -> X.
(3)
X -> Y.z
(4)
X -> Yz.
(5)
X -> a.
(6)
Y -> b.Z
Z -> .
(7)
Y -> bZ.
From this, we can see that the grammar is not LR(0) because there is a shift/reduce conflicts in state (1). Specifically, because we have the shift item X → .a and Y → ., we can't tell whether to shift the a or reduce the empty string. More generally, no grammar with ε-productions is LR(0).
However, this grammar might be SLR(1). To see this, we augment each reduction with the lookahead set for the particular nonterminals. This gives back this set of SLR(1) configurating sets:
(1)
X' -> .X
X -> .Yz [$]
X -> .a [$]
Y -> . [z]
Y -> .bZ [z]
(2)
X' -> X.
(3)
X -> Y.z [$]
(4)
X -> Yz. [$]
(5)
X -> a. [$]
(6)
Y -> b.Z [z]
Z -> . [z]
(7)
Y -> bZ. [z]
The shift/reduce conflict in state (1) has been eliminated because we only reduce when the lookahead is z, which doesn't conflict with any of the other items.
If you have no FIRST/FIRST conflicts and no FIRST/FOLLOW conflicts, your grammar is LL(1).
An example of a FIRST/FIRST conflict:
S -> Xb | Yc
X -> a
Y -> a
By seeing only the first input symbol "a", you cannot know whether to apply the production S -> Xb or S -> Yc, because "a" is in the FIRST set of both X and Y.
An example of a FIRST/FOLLOW conflict:
S -> AB
A -> fe | ε
B -> fg
By seeing only the first input symbol "f", you cannot decide whether to apply the production A -> fe or A -> ε, because "f" is in both the FIRST set of A and the FOLLOW set of A (A can be parsed as ε/empty and B as f).
Notice that if you have no epsilon-productions you cannot have a FIRST/FOLLOW conflict.
Simple answer:A grammar is said to be an LL(1),if the associated LL(1) parsing table has atmost one production in each table entry.
Take the simple grammar A -->Aa|b.[A is non-terminal & a,b are terminals]
then find the First and follow sets A.
First{A}={b}.
Follow{A}={$,a}.
Parsing table for Our grammar.Terminals as columns and Nonterminal S as a row element.
a b $
--------------------------------------------
S | A-->a |
| A-->Aa. |
--------------------------------------------
As [S,b] contains two Productions there is a confusion as to which rule to choose.So it is not LL(1).
Some simple checks to see whether a grammar is LL(1) or not.
Check 1: The Grammar should not be left Recursive.
Example: E --> E+T. is not LL(1) because it is Left recursive.
Check 2: The Grammar should be Left Factored.
Left factoring is required when two or more grammar rule choices share a common prefix string.
Example: S-->A+int|A.
Check 3:The Grammar should not be ambiguous.
These are some simple checks.
LL(1) grammar is Context free unambiguous grammar which can be parsed by LL(1) parsers.
In LL(1)
First L stands for scanning input from Left to Right. Second L stands
for Left Most Derivation. 1 stands for using one input symbol at each
step.
For Checking grammar is LL(1) you can draw predictive parsing table. And if you find any multiple entries in table then you can say grammar is not LL(1).
Their is also short cut to check if the grammar is LL(1) or not .
Shortcut Technique
With these two steps we can check if it LL(1) or not.
Both of them have to be satisfied.
1.If we have the production:A->a1|a2|a3|a4|.....|an.
Then,First(a(i)) intersection First(a(j)) must be phi(empty set)[a(i)-a subscript i.]
2.For every non terminal 'A',if First(A) contains epsilon
Then First(A) intersection Follow(A) must be phi(empty set).