Parsing in compiler design

As far as I know, left recursion is not a problem for LR parsers. I also know that an ambiguous grammar can't be parsed by any kind of parser. So if I have an ambiguous grammar such as the following, how can I remove the ambiguity so that I can check whether the grammar is SLR(1)?
E -> E+E | E-E | (E) | id
One more question: is left factoring needed before checking whether a grammar is LL(1) or SLR(1)?
Any help will be appreciated.

Any parser generator you are likely to encounter will be able to handle the ambiguities in your grammar simply.
Your grammar produces shift/reduce conflicts. These are not necessarily a problem (unlike reduce/reduce conflicts, which usually are). In yacc and Bison, the default action on a shift/reduce conflict is to shift, so a parser is still generated. Note, though, that always shifting makes + and - group to the right. Bison's %expect declaration lets you state the number of shift/reduce conflicts you expect, so that the warning is suppressed.
You can remove the conflicts in your grammar by setting up multiple levels of expressions so that you force the precedence of the operators.
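As a rough illustration of "multiple levels": since + and - share one precedence level, a single extra non-terminal is enough here, giving E -> E+T | E-T | T and T -> (E) | id. This stratified grammar is an assumption, not something given in the answer. The Python sketch below is a small hand-written parser for it (not an SLR table); the function names are made up for the example. It only shows that the rewritten grammar yields exactly one, left-associative tree.

```python
# Assumed stratified grammar (not from the original post):
#   E -> E + T | E - T | T
#   T -> ( E ) | id
# The left-recursive E rule is realised as a loop that folds to the left.

def parse_expr(tokens, pos=0):
    """Parse E starting at tokens[pos]; return (tree, next_pos)."""
    tree, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        right, pos = parse_term(tokens, pos + 1)
        tree = (op, tree, right)  # (a - b) - c, never a - (b - c)
    return tree, pos

def parse_term(tokens, pos):
    """Parse T: a parenthesised E or the terminal id."""
    if pos < len(tokens) and tokens[pos] == "(":
        tree, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return tree, pos + 1
    if pos < len(tokens) and tokens[pos] == "id":
        return "id", pos + 1
    raise SyntaxError(f"unexpected token at position {pos}")

def parse(tokens):
    """Parse a whole token list and reject trailing input."""
    tree, pos = parse_expr(tokens)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return tree
```

For example, parse(["id", "-", "id", "-", "id"]) groups the first subtraction innermost, which is the usual meaning of a - b - c.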

Related

Epsilon(ε) productions and LR(0) grammars and LL(1) grammars

At many places (for example in this answer here), I have seen it is written that an LR(0) grammar cannot contain ε productions.
Also in Wikipedia I have seen statements like: An ε free LL(1) grammar is also SLR(1).
Now the problem which I am facing is that I cannot reason out the logic behind these statements.
Well, I know that LR(0) grammars accept the languages accepted by a DPDA by empty stack, i.e. the language they accept must have the prefix property. [This prefix property can, however, be dealt with if we assume end markers, and as such, given any language, the prefix property shall always be satisfied. Many texts like Theory of Computation by Sipser assume this end marker to simplify their argument.] That being said, we can say (informally?) that a grammar is LR(0) if there is no state in the canonical collection of LR(0) items that has a shift-reduce conflict or a reduce-reduce conflict.
With this background, I tried to consider the following grammar:
S -> Aa
A -> ε
[Figure: canonical collection of LR(0) items]
In the above DFA, I find that there is no state which has a shift-reduce conflict or reduce-reduce conflict.
So this grammar should be LR(0) as per my analysis. But it also has ε production.
Isn't this example contradicting the statement:
"no grammar with ε productions can be LR(0)"
I guess if I know the logic behind the above quoted statement then I can understand the concept better.
Actually my main problem arose with the statement :
An ε free LL(1) grammar is also SLR(1).
When I asked one of my friends, he gave the argument that as the LL(1) grammar is ε free hence it is LR(0) and hence it is SLR(1).
But I could not understand his logic either. When I asked him about his reasoning, he started sharing posts regarding "grammar with ε productions can never be LR(0)".
Personally, I could not think of any logic as to how an "ε free LL(1) grammar is SLR(1)". Is it really related to the above property that a "grammar with ε productions cannot be LR(0)"? If so, please help me out. If not, should I consider asking a separate question for the second confusion?
I have got my concepts of compiler design from the dragon book by Ullman only, and my knowledge of TOC from Ullman and from a few other texts like Sipser and Linz.

A notable feature of your grammar is that A could just be eliminated. It serves absolutely no purpose. (By "eliminated", I mean simply removing all references to it; leaving productions otherwise intact.)
It is true that its existence doesn't preclude the grammar from being LR(0). Similarly, a grammar with an unreachable non-terminal and an ε-production for that non-terminal could also be LR(0).
So it would be more accurate to say that a grammar cannot be LR(0) if it has a productive non-terminal with both an ε-production and some other productive production. But since we usually only consider reduced grammars without pointless non-terminals, I'm not sure that this additional pedantry serves much purpose.
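Both claims can be checked mechanically on small grammars. The Python sketch below builds the canonical collection of LR(0) items and reports states that contain a conflict. It is a minimal illustration, not a full parser generator: the function names are invented for the example, the grammar is assumed to be already augmented with S' -> S as production 0, and the completed S' -> S item is treated as "accept" rather than as a reduction. The second grammar, which adds A -> b, is an assumption chosen to match the "ε-production plus another productive production" case.

```python
def closure(grammar, items):
    """Add item (i, 0) for every production i of a nonterminal right after a dot."""
    nonterminals = {lhs for lhs, _ in grammar}
    items, work = set(items), list(items)
    while work:
        prod, dot = work.pop()
        rhs = grammar[prod][1]
        if dot < len(rhs) and rhs[dot] in nonterminals:
            for i, (lhs, _) in enumerate(grammar):
                if lhs == rhs[dot] and (i, 0) not in items:
                    items.add((i, 0))
                    work.append((i, 0))
    return frozenset(items)

def canonical_collection(grammar):
    """Return the list of LR(0) item sets; index 0 is the start state."""
    start = closure(grammar, {(0, 0)})
    states, work = [start], [start]
    while work:
        state = work.pop()
        symbols = {grammar[p][1][d] for p, d in state if d < len(grammar[p][1])}
        for sym in sorted(symbols):
            moved = {(p, d + 1) for p, d in state
                     if d < len(grammar[p][1]) and grammar[p][1][d] == sym}
            nxt = closure(grammar, moved)
            if nxt not in states:
                states.append(nxt)
                work.append(nxt)
    return states

def lr0_conflicts(grammar):
    """List (state_index, kind) for every shift/reduce or reduce/reduce conflict."""
    nonterminals = {lhs for lhs, _ in grammar}
    found = []
    for i, state in enumerate(canonical_collection(grammar)):
        # Production 0 (S' -> S) completing means "accept", not a reduction.
        reduces = [p for p, d in state if d == len(grammar[p][1]) and p != 0]
        shifts = [p for p, d in state
                  if d < len(grammar[p][1]) and grammar[p][1][d] not in nonterminals]
        if reduces and shifts:
            found.append((i, "shift/reduce"))
        if len(reduces) > 1:
            found.append((i, "reduce/reduce"))
    return found

# Grammar from the question: S -> A a, A -> ε (the empty tuple stands for ε).
QUESTION = [("S'", ("S",)), ("S", ("A", "a")), ("A", ())]
# Assumed variant: A also has the productive alternative A -> b.
VARIANT = QUESTION + [("A", ("b",))]
```

Under these assumptions, lr0_conflicts(QUESTION) finds nothing, while lr0_conflicts(VARIANT) reports a shift/reduce conflict in the start state, where A -> . (reduce) sits next to A -> . b (shift).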
As for your question about ε-free LL(1) grammars, here's a rough outline:
If an ε-free grammar is not LR(0), then there is some state with both a shift and a reduce action. Since the grammar is ε-free, that state was reached by way of a shift or a goto. The previous state must then have had two different productions with the same FIRST set, contradicting the LL(1) condition.

Difference between: 'Eliminate left-recursion' and 'construct an equivalent unambiguous grammar'

For example:
R → R bar R | R R | R star | (R) | a | b
Constructing an equivalent unambiguous grammar gives:
R → S | R bar S
S → T | S T
T → U | T star
U → a | b | (R)
How would one eliminate left recursion for R → R bar R | R R | R star | (R) | a | b?
What's the difference between eliminating left recursion and constructing an equivalent unambiguous grammar?

An unambiguous grammar is one where for each string in the language, there is exactly one way to derive it from the grammar. In the context of compiler construction, the problem with an ambiguous grammar is that it's not obvious from the grammar what the parse tree for a given input string should be. Some tools solve this using their own rules for resolving ambiguities, while others simply require the grammar to be unambiguous.
A left-recursive grammar is one where the derivation for a given non-terminal can produce that same non-terminal again without first producing a terminal. This leads to infinite loops in recursive-descent-style parsers, but is not a problem for shift-reduce parsers.
Note that an unambiguous grammar can still be left-recursive and a grammar without left recursion can still be ambiguous. Also note that depending on your tools, you may need to only remove ambiguity, but not left recursion, or you may need to remove left recursion, but not ambiguity (though an unambiguous grammar is generally preferable).
So the difference is that eliminating left recursion and eliminating ambiguity solve different problems and are necessary in different situations.
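To make the left-recursion half concrete, here is a minimal Python sketch of the textbook rewrite for immediate left recursion, A → A α | β becoming A → β A' and A' → α A' | ε, applied to the grammar from the question. The function name and the primed name R' are assumptions made for the example, and indirect left recursion is not handled.

```python
def eliminate_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A alpha | beta  as  A -> beta A',  A' -> alpha A' | epsilon.

    `alternatives` is a list of right-hand sides, each a tuple of symbols.
    Returns a dict from nonterminal name to its new list of alternatives.
    Assumes no alternative is just the nonterminal itself.
    """
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive:
        return {nonterminal: list(alternatives)}
    fresh = nonterminal + "'"  # assumed not to clash with an existing name
    return {
        nonterminal: [beta + (fresh,) for beta in others],
        fresh: [alpha + (fresh,) for alpha in recursive] + [()],  # () is epsilon
    }

# R -> R bar R | R R | R star | (R) | a | b
RULES = [("R", "bar", "R"), ("R", "R"), ("R", "star"),
         ("(", "R", ")"), ("a",), ("b",)]
RESULT = eliminate_immediate_left_recursion("R", RULES)
```

The result is R → (R) R' | a R' | b R' and R' → bar R R' | R R' | star R' | ε. No rule starts with its own left-hand side any more, but nothing here addresses ambiguity, which illustrates that the two transformations are separate.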

How to solve this Grammar through SLR?

I want to solve this grammar.
S->SS+
S->SS*
S->a
I want to construct the SLR sets of items and the parsing table with action and goto.
Can this grammar be parsed without eliminating left recursion?
Is this grammar SLR?

No, this grammar is not SLR. It is ambiguous.
Left recursion is not a problem for LR parsers. Left recursion elimination is only necessary for LL parsers.

I am not entirely sure about this, but I think this grammar is actually SLR(1). I constructed the SLR(1) table by hand and obtained one with no conflicts (having added the augmenting production S' -> S, where S' is the new start symbol).
Can somebody provide a sentence that can be derived in two different ways from this grammar? I was able to get a parser for it in Bison without any warning. Are you sure it is ambiguous?
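One way to look for such a sentence is brute force. The Python sketch below counts the parse trees of a string under S -> SS+ | SS* | a and then searches every string over {a, +, *} up to a given length for one with more than one tree. The function names are invented for the example, and a bounded search like this can only fail to find a counterexample; it is not a proof of unambiguity.

```python
from functools import lru_cache
from itertools import product

def count_parse_trees(s):
    """Count the parse trees of s under S -> S S + | S S * | a."""
    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways S can derive the slice s[i:j].
        if j - i == 1:
            return 1 if s[i] == "a" else 0
        if j - i < 3 or s[j - 1] not in "+*":
            return 0
        # The last symbol is the operator; split the rest into two S parts.
        return sum(count(i, k) * count(k, j - 1) for k in range(i + 1, j - 1))
    return count(0, len(s)) if s else 0

def max_tree_count(max_len):
    """Largest parse-tree count over all strings on {a, +, *} up to max_len."""
    best = 0
    for n in range(1, max_len + 1):
        for chars in product("a+*", repeat=n):
            best = max(best, count_parse_trees("".join(chars)))
    return best
```

If max_tree_count(n) stays at 1 as n grows, no string of that length or shorter has two parse trees, which is consistent with Bison reporting no conflicts.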

Postfix and right-associative operators in LR(0) parsers

Is it possible to construct an LR(0) parser that could parse a language with both infix and postfix operators? For example, if I had a grammar with the + (addition) and ! (factorial) operators with the usual precedence, then 1+3! should be 1 + 3! = 1 + 6 = 7, but surely if the parser were LR(0), then when it had 1+3 on the stack it would reduce rather than shift?
Also, do right-associative operators pose a problem? For example, 2^3^4 should be 2^(3^4), but again, when the parser has 2^3 on the stack, how would it know whether to reduce or shift?
If this isn't possible, is there still a way to use an LR(0) parser, possibly by altering the grammar to add brackets in the appropriate places?

LR(0) parsers have a weakness in that they can only parse prefix-free languages, languages where no string in the language is a prefix of any other. This generally makes it a bit tricky to parse expressions like these, since something like 5 is a prefix of 5!. This also explains why it's hard to get right-associative operators - given a production like
S → F | F ^ S
the parser will have a shift/reduce conflict after seeing an F because it can't tell whether to extend it or to reduce again. This is related to the prefix-free property mentioned earlier.
This weakness of LR(0) is one of the reasons why people don't use it much in practice. SLR(1) and LALR(1) parsers can usually parse these grammars because they have one token of lookahead that lets them decide whether to shift or reduce. In the above case, the parser wouldn't encounter a shift/reduce conflict: when deciding whether to reduce the F to an S or shift a ^, it can see that it must shift the ^, because there is no valid string in which a ^ appears directly after an S.
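The lookahead argument amounts to saying that ^ is not in FOLLOW(S), which is the set an SLR(1) parser consults before reducing to S. A small Python sketch of that computation follows. It assumes the grammar is ε-free and adds a rule F -> id so that the grammar is complete; the answer itself leaves F undefined, and the function names are made up for the example.

```python
# Assumption: F -> id is added for illustration; "$" marks end of input.
GRAMMAR = {
    "S": [("F",), ("F", "^", "S")],
    "F": [("id",)],
}
END = "$"

def first_sets(grammar):
    """FIRST sets for an epsilon-free grammar: only the first symbol matters."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alts in grammar.items():
            for alt in alts:
                head = alt[0]
                new = first[head] if head in grammar else {head}
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def follow_sets(grammar, start):
    """FOLLOW sets for an epsilon-free grammar, iterated to a fixed point."""
    first = first_sets(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for nt, alts in grammar.items():
            for alt in alts:
                for i, sym in enumerate(alt):
                    if sym not in grammar:
                        continue  # terminals have no FOLLOW set
                    if i + 1 < len(alt):
                        nxt = alt[i + 1]
                        new = first[nxt] if nxt in grammar else {nxt}
                    else:
                        new = follow[nt]  # sym ends the rule: inherit FOLLOW(nt)
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

FOLLOW = follow_sets(GRAMMAR, "S")
```

With these assumptions FOLLOW(S) contains only the end marker, so in the state holding both S -> F . and S -> F . ^ S an SLR(1) parser reduces only at end of input and shifts on ^, which is the right-associative reading.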

How can I tell Bison I also expect reduce-reduce conflicts?

My C#-ish toy grammar now has its first reduce-reduce conflicts! I'm so proud of me.
It seems all right to me, however (I switched over to a GLR parser for the occasion). The problem is, while I know the %expect directive can shut up Bison about shift/reduce conflicts, I can't find the equivalent for reduce/reduce conflicts. So what should I use to make it silent about my 3 shift/reduces and my 2 reduce/reduces?

From the GNU Bison documentation:
For normal LALR(1) parsers, reduce/reduce conflicts are more serious, and should be eliminated entirely. Bison will always report reduce/reduce conflicts for these parsers. With GLR parsers, however, both kinds of conflicts are routine; otherwise, there would be no need to use GLR parsing. Therefore, it is also possible to specify an expected number of reduce/reduce conflicts in GLR parsers, using the declaration:
%expect-rr n
