I am trying to find out how an LL(1) parser handles a right-associative grammar. For example, for a left-associative grammar like E->+TE', first() and follow() work smoothly and the parsing table is generated easily. But for a right-recursive grammar, for example exponentiation as in E->T^E/T, the parsing table isn't generated properly. I have searched for resources, but every example I found avoids right-associative operators such as powers.
LL algorithms handle right recursion with no problem whatsoever. In fact, the transformation you mention turns a left-associative grammar into a right-associative one, and left-associativity needs to be restored by transforming the syntax tree in a semantic rule. So if the production is really right-associative, you can use the same grammar without the need for post-processing the tree.
The problem with E -> T ^ E | T is not that it is right-recursive. The problem is that the two right-hand sides start with the same non-terminal, making prediction impossible. The solution is left-factoring, which produces the two rules E -> T E' and E' -> ε | ^ T E'.
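As a minimal sketch of how the left-factored grammar can be parsed, the recursive-descent parser below decides with one token of lookahead and builds a right-nested tree directly, with no post-processing. It uses the equivalent form E' -> ε | ^ E (valid because E = T E'). Single-digit operands, the tuple tree shape, and the names parse_power and evaluate are assumptions made for illustration.

```python
def parse_power(text):
    """Parse E -> T E', E' -> ε | '^' E, where T is a single digit."""
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else None

    def parse_t():
        nonlocal pos
        ch = peek()
        if ch is None or not ch.isdigit():
            raise SyntaxError(f"expected digit at position {pos}")
        pos += 1
        return int(ch)

    def parse_e():
        nonlocal pos
        left = parse_t()
        if peek() == "^":          # lookahead '^' predicts E' -> '^' E
            pos += 1
            return ("^", left, parse_e())
        return left                # anything else predicts E' -> ε

    tree = parse_e()
    if pos != len(text):
        raise SyntaxError(f"unexpected input at position {pos}")
    return tree


def evaluate(tree):
    """Evaluate a tree produced by parse_power."""
    if isinstance(tree, int):
        return tree
    _, base, exponent = tree
    return evaluate(base) ** evaluate(exponent)
```

With this sketch, parse_power("2^3^2") yields ("^", 2, ("^", 3, 2)), which evaluates to 2 ** (3 ** 2) = 512, the right-associative reading.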
Is there an easy way to tell whether a simple grammar is suitable for recursive descent? Is eliminating left recursion and left-factoring the grammar enough to achieve this?
Not necessarily.
To build a recursive descent parser (without backtracking), you need to eliminate or resolve all predict conflicts. So one definitive test is to see if the grammar is LL(1); LL(1) grammars have no predict conflicts, by definition. Left-factoring and left-recursion elimination are necessary for this task, but they might not be sufficient, since a predict conflict might be hiding behind two competing non-terminals:
list  ::= item list'
list' ::= ε
        | ';' item list'
item  ::= expr1
        | expr2
expr1 ::= ID '+' ID
expr2 ::= ID '(' list ')'
The problem with the above (or, at least, one problem) is that when the parser expects an item and sees an ID, it can't know which of expr1 and expr2 to try. (That's a predict conflict: Both non-terminals could be predicted.) In this particular case, it's pretty easy to see how to eliminate that conflict, but it's not really left-factoring since it starts by combining two non-terminals. (And in the full grammar this might be excerpted from, combining the two non-terminals might be much more difficult.)
In the general case, there is no algorithm that can turn an arbitrary grammar into an LL(1) grammar, or even decide whether the language recognised by that grammar has an LL(1) grammar at all. (However, it's easy to tell whether the grammar itself is LL(1).) So there's always going to be some art and/or experimentation involved.
I think it's worth adding that you don't really need to eliminate left-recursion in a practical recursive descent parser, since you can usually turn it into a while-loop instead of recursion. For example, leaving aside the question of the two expr types above, the original grammar in an extended BNF with repetition operators might be something like
list ::= item (';' item)*
Which translates into something like:
def parse_list():
    parse_item()
    while peek(';'):
        match(';')
        parse_item()
(Error checking and AST building omitted.)
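For a version that actually runs, here is a minimal sketch of the same loop over a list of token strings. The token representation, the treatment of every non-';' token as an item, and the returned list of items are assumptions for illustration; peek and match from the snippet above are inlined as index checks.

```python
def parse_list(tokens):
    """Parse list ::= item (';' item)* over a list of token strings."""
    pos = 0
    items = []

    def parse_item():
        nonlocal pos
        # Assumption: any token other than ';' counts as an item.
        if pos >= len(tokens) or tokens[pos] == ";":
            raise SyntaxError(f"expected item at token {pos}")
        items.append(tokens[pos])
        pos += 1

    parse_item()
    # The loop replaces both the left-recursive and the list' formulations.
    while pos < len(tokens) and tokens[pos] == ";":
        pos += 1               # match(';')
        parse_item()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at {pos}")
    return items
```

For example, parse_list(["a", ";", "b", ";", "c"]) returns ["a", "b", "c"], while a trailing ';' raises SyntaxError.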
For example:
R → R bar R|RR|R star|(R)|a|b
construct an equivalent unambiguous grammar:
R → S|R bar S
S → T|ST
T → U|T star
U → a|b|(R)
How about eliminating left recursion for R → R bar R|RR|R star|(R)|a|b?
What's the difference between eliminating left recursion and constructing an equivalent unambiguous grammar?
An unambiguous grammar is one where for each string in the language, there is exactly one way to derive it from the grammar. In the context of compiler construction, the problem with an ambiguous grammar is that it's not obvious from the grammar what the parse tree for a given input string should be. Some tools solve this using their own rules for resolving ambiguities, while others simply require the grammar to be unambiguous.
A left-recursive grammar is one where the derivation for a given non-terminal can produce that same non-terminal again as its leftmost symbol, without first producing a terminal. This leads to infinite loops in recursive-descent-style parsers, but is no problem for shift-reduce parsers.
Note that an unambiguous grammar can still be left-recursive and a grammar without left recursion can still be ambiguous. Also note that depending on your tools, you may need to only remove ambiguity, but not left-recursion, or you may need to remove left-recursion, but not ambiguity (though an unambiguous grammar is generally preferable).
So the difference is that eliminating left recursion and eliminating ambiguity solve different problems and are necessary in different situations.
I want to work through this grammar:
S->SS+
S->SS*
S->a
I want to construct the SLR sets of items and the parsing table with action and goto.
Can this grammar be parsed without eliminating left recursion?
Is this grammar SLR?
No, this grammar is not SLR. It is ambiguous.
Left recursion is not a problem for LR parsers. Left recursion elimination is only necessary for LL parsers.
I am not entirely sure about this, but I think this grammar is actually SLR(1). I constructed the SLR(1) table by hand and obtained one with no conflicts (having added the augmenting production S' -> S, where S' is the new start symbol).
Can somebody provide a sentence that can be derived in two different ways from this grammar? I was able to get a parser for it in Bison without any warning. Are you sure it is ambiguous?
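One way to probe the question raised in these comments is to count parse trees by brute force. The sketch below counts the distinct parse trees of a string under S -> SS+ | SS* | a and takes the maximum over all short strings; the function names count_trees and max_trees are assumptions for illustration. A maximum of 1 over a bounded length is evidence that no short ambiguous sentence exists, not a proof of unambiguity in general.

```python
from functools import lru_cache
from itertools import product


@lru_cache(maxsize=None)
def count_trees(s):
    """Number of distinct parse trees for s under S -> S S + | S S * | a."""
    if s == "a":
        return 1
    if len(s) < 3 or s[-1] not in "+*":
        return 0
    body = s[:-1]
    # Try every split of the body into two non-empty candidate S-substrings.
    return sum(count_trees(body[:i]) * count_trees(body[i:])
               for i in range(1, len(body)))


def max_trees(max_len):
    """Largest parse-tree count over all strings of length 1..max_len."""
    return max(count_trees("".join(chars))
               for n in range(1, max_len + 1)
               for chars in product("a+*", repeat=n))
```

For example, count_trees("aa+a*") is 1, and max_trees(7) is 1: no string of up to seven symbols has more than one parse tree.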
I wanted to know why top-down parsers cannot handle left recursion, and why we need to eliminate left recursion because of this, as mentioned in the Dragon Book.
Think of what it's doing. Suppose we have a left-recursive production rule A -> Aa | b, and right now we try to match that rule. So we're checking whether we can match an A here, but in order to do that, we must first check whether we can match an A here. That sounds impossible, and it mostly is. Using a recursive-descent parser, that obviously represents an infinite recursion.
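The infinite recursion is easy to reproduce. The sketch below transcribes A -> Aa | b naively into a recursive-descent function; the token representation and the names parse_a and demo are assumptions for illustration. Because the first alternative calls parse_a again at the same input position before consuming anything, Python eventually raises RecursionError.

```python
def parse_a(tokens, pos=0):
    """Naive recursive descent for A -> A 'a' | 'b'; returns the new position."""
    # Alternative 1: A -> A 'a'. The first step is to parse an A at the
    # same position, before any token has been consumed.
    end = parse_a(tokens, pos)
    if end < len(tokens) and tokens[end] == "a":
        return end + 1
    # Alternative 2: A -> 'b' (never reached, the call above never returns).
    if pos < len(tokens) and tokens[pos] == "b":
        return pos + 1
    raise SyntaxError("no alternative matched")


def demo():
    """Run the naive parser on 'baa' and report what happens."""
    try:
        parse_a(list("baa"))
    except RecursionError:
        return "infinite recursion"
    return "parsed"
```

Calling demo() returns "infinite recursion", even though "baa" is a valid sentence of the grammar.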
It is possible using more advanced techniques that are still top-down, for example see [1] or [2].
[1]: Richard A. Frost and Rahmatullah Hafiz. A new top-down parsing algorithm to accommodate ambiguity and left recursion in polynomial time. SIGPLAN Notices, 41(5):46–54, 2006.
[2]: R. Frost, R. Hafiz, and P. Callaghan. Modular and efficient top-down parsing for ambiguous left-recursive grammars. ACL-IWPT, pp. 109–120, 2007.
Top-down parsers cannot handle left recursion
A top-down parser cannot handle left-recursive productions. To understand why not, let's take a very simple left-recursive grammar.
S → a
S → S a
There is only one token, a, and only one nonterminal, S. So the parsing table has just one entry. Both productions must go into that one table entry.
The problem is that, on lookahead a, the parser cannot know if another a comes after the lookahead. But the decision of which production to use depends on that information.
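The collision in that single table entry can be shown mechanically. The sketch below builds a predictive parsing table from FIRST sets and lists the cells holding more than one production. It assumes a grammar with no ε-productions (so FOLLOW sets are not needed), and the dictionary encoding and the names first_sets and ll1_table are assumptions for illustration.

```python
def first_sets(grammar):
    """FIRST sets for a grammar with no ε-productions (an assumption here)."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                      # iterate to a fixed point
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                head = rhs[0]
                new = first[head] if head in grammar else {head}
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def ll1_table(grammar):
    """Map (nonterminal, lookahead) to the list of predicted productions."""
    first = first_sets(grammar)
    table = {}
    for nt, alternatives in grammar.items():
        for rhs in alternatives:
            head = rhs[0]
            lookaheads = first[head] if head in grammar else {head}
            for token in lookaheads:
                table.setdefault((nt, token), []).append(rhs)
    return table


grammar = {"S": [["a"], ["S", "a"]]}
table = ll1_table(grammar)
# A cell with more than one production is a predict conflict.
conflicts = {cell: prods for cell, prods in table.items() if len(prods) > 1}
```

Here the table has the single cell ("S", "a"), and it contains both S → a and S → S a, so the grammar is not LL(1).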
I have derived the following grammar:
S -> a | aT
T -> b | bR
R -> cb | cbR
I understand that in order for a grammar to be LL(1) it has to be non-ambiguous and right-recursive. The problem is that I do not fully understand the concept of left-recursive and right-recursive grammars. I do not know whether or not the following grammar is right recursive. I would really appreciate a simple explanation of the concept of left-recursive and right-recursive grammars, and if my grammar is LL(1).
Many thanks.
This grammar is not LL(1). In an LL(1) parser, it should always be possible to determine which production to use next based on the current nonterminal symbol and the next token of the input.
Let's look at this production, for example:
S → a | aT
Now, suppose that I told you that the current nonterminal symbol is S and the next symbol of input is an a. Could you determine which production to use? Unfortunately, without more context, you couldn't: perhaps you're supposed to use S → a, and perhaps you're supposed to use S → aT. Using similar reasoning, you can see that all the other productions have similar problems.
This doesn't have anything to do with left or right recursion, but rather the fact that no two productions for the same nonterminal in an LL(1) grammar can have a nonempty common prefix. In fact, a simple heuristic for checking if a grammar is not LL(1) is to see if you can find two production rules like this.
Hope this helps!
The grammar has only a single recursive rule: the last one where R is the symbol on the left, and also appears on the right. It is right-recursive because in the grammar rule, R is the rightmost symbol. The rule refers to R, and that reference is rightmost.
The language is LL(1). We know this because we can easily construct a recursive descent parser for it that uses no backtracking and at most one token of lookahead.
But such a parser would be based on a slightly modified version of the grammar.
For instance the two productions: S -> a and S -> a T could be merged into a single one that can be expressed by the EBNF S -> a [ T ]. (S derives a, followed by optional T). This rule can be handled by a single parsing function for recognizing S.
The function matches a and then looks for the optional T, which would be indicated by the next input symbol being b.
We can write an LL(1) grammar for this, along these lines:
S -> a T_opt
T_opt -> b R_opt
T_opt -> <empty>
... et cetera
The optionality of T is handled explicitly, by making T (which we rename to T_opt) capable of deriving the empty string, and then condensing to a single rule for S, so that we don't have two phrases that both start with a.
So in summary, the language is LL(1), but the given grammar for it isn't. Since the language is LL(1) it is possible to find another grammar which is LL(1), and that grammar is not far off from the given one.
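To make this concrete, here is a minimal sketch of a recursive descent recognizer for the rewritten grammar, using one character of lookahead and no backtracking. The rule R_opt -> c b R_opt | <empty> is an assumed completion of the "et cetera" above, derived from the original R -> cb | cbR, and the function name accepts is hypothetical.

```python
def accepts(text):
    """Recognize a, ab, abcb, abcbcb, ... with one character of lookahead."""
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else None

    def match(expected):
        nonlocal pos
        if peek() != expected:
            raise SyntaxError(f"expected {expected!r} at position {pos}")
        pos += 1

    def parse_s():            # S -> a T_opt
        match("a")
        parse_t_opt()

    def parse_t_opt():        # T_opt -> b R_opt | <empty>
        if peek() == "b":
            match("b")
            parse_r_opt()

    def parse_r_opt():        # R_opt -> c b R_opt | <empty> (assumed completion)
        # The right recursion is written as a loop; each pass is one 'c b'.
        while peek() == "c":
            match("c")
            match("b")

    try:
        parse_s()
    except SyntaxError:
        return False
    return pos == len(text)
```

Every decision is made by looking only at the next character: b selects the non-empty alternative of T_opt, c selects the non-empty alternative of R_opt, and anything else selects the empty one. So accepts("abcbcb") is True, while accepts("abc") and accepts("aa") are False.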