I get a hypothesis from our teacher and he want from us to search and validate it. We have SLR(1) and LALR(1) parser. The hypothesis is:
Suppose we have a language structure called X. If We couldn't provide a LALR(1) grammar for this structure, we couldn't provide a SLR(1) too and maybe a LR(1) grammar could solve problem. but If we could provide a LALR(1) grammar for this structure, we could provide a SLR(1) too.
If you search in internet, you find a lot of sites which say this grammar is not SLR(1) but it is LALR(1):
S -> R
S -> L = R
L -> * R
L -> id
R -> L
("id", "*" and "=" are terminals and others are non-terminals)
If we try to find SLR(1) items, we will see shift/reduce conflict. it is true, but my hypothesis say something else. In our hypothesis, we talk about language described by grammar not grammar itself! We can remove "R" and convert grammar to LL(1) and It is also SLR(1) and LALR(1):
S -> LM
M -> epsilon
M -> = L
L -> * L
L -> id
You can try this grammar and you can see that this grammar describe same language as last grammar and has SLR(1) and LALR(1) grammar!
so my problem is not finding a grammar which is LALR(1) but not SLR(1). There are a lot of them in internet. I want to know is there any language which has LALR(1) grammar but not SLR(1) grammar? and if our hypothesis is true, then there is no need to LALR(1) and SLR(1) could do everything for us, however LALR(1) is easier to use and maybbe in future, a language reject this hypothesis.
I'm sorry for bad English.
Thanks.
Every LR(k) language has an SLR(1) grammar.
There is a proof in this paper from 1976, which provides an algorithm for constructing the SLR(1) grammar, if you have an LR(k) grammar and know the value of k. Unfortunately, there is no algorithm which can tell you definitely whether a CFG is LR(k), much less provide the value of k. (If you somehow know that the grammar is LR(k), you can try successive values of k until you find one which works. But that procedure will never terminate if the grammar is not LR(k).)
The above comes from this reference question on the Computing Science StackExchange site, which is a better place for this kind of question.
LR(1) > LALR(1) > SLR(1)
LR(1) is the most powerful, LALR(1) in less powerful and SLR(1) is least powerful.
This is fact, because of the way the lookahead sets are computed. (1) means lookahead of one token. Here is a grammar that is LR(1), but not LALR(1) and definitely not SLR(1):
G : S... <eof>
;
S : c A1 t ';'
| c A2 n ';'
| r A2 t ';'
| r A1 n ';'
;
A1 : a
;
A2 : a
;
This grammar cannot be made LALR(1) or SLR(1). Or sure you can remove A1 and A2 and
replace them with a, but then you have a different grammar. The problem is that an
action may be attached to the rule A1 : a and a different action my be attached to A2 : a. For example:
A1 : a => X()
;
A2 : a => Y()
;
An SLR(1) parser generator will report conflicts in your grammar that are not real conflicts. I'm talking about the real world using large grammars (e.g. C11.grm).
The SLR(1) lookahead computation is simplistic, getting the lookaheads from the grammar, instead of the LR(0) state machine created by an LALR(1) parser generator.
That is why Frank DeRemer's paper, in 1969, on LALR(1) is so important.
By looking at the grammar, A1 can be followed by either t or n, so this is a conflict
reported by SLR(1), but there is an LR(1) state machine in which there is no conflict on which follows A1.
Related
Anyone would give me an processed example of LR(1) grammar which is not LR(0) Grammar? I was just trying to find out why LR(1) parser is more efficient and powerful, and tried an example of grammar and found it non LR(0) ,there was conflict in parsing table,then tried LR(1) also no use...
A very simple example of grammar ,(augmented)
S->A
A->aBed | aEef
B->m
E->m
Needed details analysis.
Anyone would explain with examples? Getting confused here.
For example:
S -> Aa | Bb
A->c
B->c
In order to decide if a c is an A or a B, you need to know the following symbol.
In real life, you most commonly need LR(1) for epsilon productions:
OPTIONAL_A -> ε | A
MULTI_A -> ε | MULTI_A A
... where ε matches only the empty string. In order to reduce an epsilon production, you always need to see past it.
Given a grammar G defined by
A -> Ca
B -> Cb
C -> e|f
Is this grammar LL(1)?
I realize that we could compress this down into a single line, but that's not the point of this question.
Mainly, can an LL(1) grammar have multiple rules that begin with the same non-terminal?
As a follow up question, how do I construct a parse table for the above grammar?
I've worked out the following:
FIRST(A) = {e,f}
FIRST(B) = {e,f}
FIRST(C) = {a,b}
FOLLOW(A) = {}
FOLLOW(B) = {}
FOLLOW(C) = {a,b}
I looked at this post, but didn't understand how they went from the FIRSTs and FOLLOWs to a table.
The grammar you've given is not LL(1) because there is a FIRST/FIRST conflict between the two productions A → Ca and A → Cb.
In general, grammars with multiple productions for the same nonterminal that begin with the same nonterminal will not be LL(1) because you will get a FIRST/FIRST conflict. There are grammars with this property that are LL(1), though they're essentially degenerate cases. Consider, for example, the following grammar:
A → Ea
A → Eb
E → ε
Here, the nonterminal E only expands out to the empty string ε and therefore, in effect, isn't really there. Therefore, the above grammar is LL(1) because there are no FIRST/FIRST conflicts between the two productions. To see this, here's the parse table:
a b $
A Ea Eb -
E ε ε -
Hope this helps!
I am solve your question in 2 case:
First case if the terminal is {a,b,e,f}
Second case if the terminal is {a,b,f} and e is epsilon
so no multipe entry this is LL(1).
For more information:
look at this explanation and example below:
Best regards
I am badly stuck on a question i am attempting from a sample final exam of compilers. I will really appreciate if someone can help me out with an explanation. Thanks
Consider the grammar G listed below
S = E $
E = E + T | T
T = T * F | F
F = ident | ( E )
Where + * ident ( ) are terminal symbols and $ is end of file.
a) is this grammar LR( 0 )? Justify your answer.
b) is the grammar SLR( 1 ) ? Justify your answer.
c) is this grammar LALR( 1 )? Justify your answer.
If you can show that the grammar is LR(0) then of course it is SLR(1) and LALR(1) because LR(0) is more restrictive.
Unfortunately, the grammar isn't LR(0).
For instance suppose you have just recognized E:
S -> E . $
You must not reduce this E to S if what follows is a + or * symbol, because E can be followed by + or * which continue to build a larger expression:
S -> E . $
E -> E . + T
T -> T . * F
This requires us to look ahead one token to know what to do in that state: to shift (+ or *) or reduce ($).
SLR(1) adds lookahead, and makes use of the follow-set information to make reductions (better than nothing, but the follow-set information globally obtained from the grammar is not context sensitive, like the state-specific lookahead sets in LALR(1)).
Under SLR(1), the above conflict goes away, because the S -> E reduction is considered only when the lookahead symbol is in the follow set of S, and the only thing in the follow set of S is the EOF symbol $. If the input symbol is not $, like +, then the reduction is not considered; a shift takes place which doesn't conflict with the reduction.
So the grammar does not fail to be SLR(1) on account of that conflict. It might, however, have some other conflict. Glancing through it, I can't see one; but to "justify that answer" properly, you have to generate all of the LR(0) state items, and go through the routine of verifying that the SLR(1) constraints are not violated. (You use the simple LR(0) items for SLR(1) because SLR(1) doesn't augment these items in any new way. Remember, it just uses the follow-set information cribbed from the grammar to eliminate conflicts.)
If it is SLR(1) then LALR(1) falls by subset relationship.
Update
The Red Dragon Book (Compilers: Principles, Techniques and Tools, Aho, Sethi, Ullman, 1988) uses exactly the same grammar in a set of examples that show the derivation of the canonical LR(0) item sets and the associated DFA, and some of the steps of filling in the parsing tables. This is in section 4.7, starting with example 4.34.
All LL grammars are LR grammars, but not the other way around, but I still struggle to deal with the distinction. I'm curious about small examples, if any exist, of LR grammars which do not have an equivalent LL representation.
Well, as far as grammars are concerned, its easy -- any simple left-recursive grammar is LR (probably LR(1)) and not LL. So a list grammar like:
list ::= list ',' element | element
is LR(1) (assuming the production for element is) but not LL. Such grammars can be fairly easily converted into LL grammars by left-factoring and such, so this is not too interesting however.
Of more interest is LANGUAGES that are LR but not LL -- that is a language for which there exists an LR(1) grammar but no LL(k) grammar for any k. An example is things that need optional trailing matches. For example, the language of any number of a symbols followed by the same number or fewer b symbols, but not more bs -- { a^i b^j | i >= j }. There's a trivial LR(1) grammar:
S ::= a S | P
P ::= a P b | \epsilon
but no LL(k) grammar. The reason is that an LL grammar needs to decide whether to match an a+b pair or an odd a when looking at an a, while the LR grammar can defer that decision until after it sees the b or the end of the input.
This post on cs.stackechange.com has lots of references about this
I am learning now about parsers on my Theory Of Compilation course.
I need to find an example for grammar which is in LL(1) but not in LALR.
I know it should be exist. please help me think of the most simple example to this problem.
Some googling brings up this example for a non-LALR(1) grammar, which is LL(1):
S ::= '(' X
| E ']'
| F ')'
X ::= E ')'
| F ']'
E ::= A
F ::= A
A ::= ε
The LALR(1) construction fails, because there is a reduce-reduce conflict between E and F. In the set of LR(0) states, there is a state made up of
E ::= A . ;
F ::= A . ;
which is needed for both S and X contexts. The LALR(1) lookahead sets for these items thus mix up tokens originating from the S and X productions. This is different for LR(1), where there are different states for these cases.
With LL(1), decisions are made by looking at FIRST sets of the alternatives, where ')' and ']' always occur in different alternatives.
From the Dragon book (Second Edition, p. 242):
The class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can be parsed with predictive or LL methods. For a grammar to be LR(k), we must be able to recognize the occurrence of the right side of a production in a right-sentential form, with k input symbols of lookahead. This requirement is far less stringent than that for LL(k) grammars where we must be able to recognize the use of a production seeing only the first k symbols of what the right side derives. Thus, it should not be surprising that LR grammars can describe more languages than LL grammars.