confusion in finding first and follow in left recursive grammar - parsing

Recently I ran into a problem finding the FIRST and FOLLOW sets of this grammar:
S->cAd
A->Ab|a
I am confused about FIRST(A).
Which one is correct, {a} or {empty, a}, given that there is left recursion in A's production?
I am not sure whether to include the empty string in FIRST(A) or not.
Any help would be appreciated.
-------------edited---------------
What will be the FIRST and FOLLOW sets of this one? It is the most confusing grammar I have ever seen:
S->SA|A
A->a
I need to prove that this grammar is not LL(1) using the parsing table, but I am unable to because I did not get two entries in a single cell.

First, remove the left recursion, which gives
S -> cAd
A -> aA'
A' -> bA' | epsilon
Then, you can calculate
FIRST(A) = {a} // as a is the only terminal that can appear first in a string derived from A.
EDIT :-
For your second question,
S -> AS'
S' -> AS' | epsilon
A -> a
FIRST(A) = {a}
FIRST(S) = {a}
FIRST(S') = {a, epsilon}
The idea of removing left-recursion before calculating FIRST() and FOLLOW() can be learnt here.
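For the second question, here is a minimal sketch of how one might check the LL(1) condition mechanically on the original (still left-recursive) grammar. It computes FIRST and FOLLOW by the usual fixed-point iteration and then lists the parsing-table cells that receive more than one production. The grammar encoding (a dict of tuples), the EPS marker and all function names are assumptions made for this illustration, not part of any library.

```python
EPS = "epsilon"  # hypothetical marker for the empty string

def first_of_seq(seq, first, grammar):
    """FIRST of a sequence of symbols, given the FIRST sets computed so far."""
    out = set()
    for sym in seq:
        if sym not in grammar:          # terminal: it starts the string
            out.add(sym)
            return out
        out |= first[sym] - {EPS}
        if EPS not in first[sym]:       # sym cannot vanish, stop here
            return out
    out.add(EPS)                        # every symbol can derive epsilon
    return out

def first_sets(grammar):
    """grammar: nonterminal -> list of alternatives (tuples); () means epsilon."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alts in grammar.items():
            for alt in alts:
                new = first_of_seq(alt, first, grammar)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def follow_sets(grammar, start, first):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for nt, alts in grammar.items():
            for alt in alts:
                for i, sym in enumerate(alt):
                    if sym not in grammar:
                        continue
                    rest = first_of_seq(alt[i + 1:], first, grammar)
                    new = rest - {EPS}
                    if EPS in rest:     # nothing left after sym, inherit FOLLOW(nt)
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

def ll1_conflicts(grammar, start):
    """Return the table cells (nonterminal, lookahead) holding 2+ productions."""
    first = first_sets(grammar)
    follow = follow_sets(grammar, start, first)
    table = {}
    for nt, alts in grammar.items():
        for alt in alts:
            f = first_of_seq(alt, first, grammar)
            lookaheads = (f - {EPS}) | (follow[nt] if EPS in f else set())
            for t in lookaheads:
                table.setdefault((nt, t), []).append(alt)
    return {cell: prods for cell, prods in table.items() if len(prods) > 1}

g2 = {"S": [("S", "A"), ("A",)], "A": [("a",)]}
print(ll1_conflicts(g2, "S"))  # {('S', 'a'): [('S', 'A'), ('A',)]}
```

Both alternatives of S have FIRST = {a}, so the cell M[S, a] gets two entries, which is the conflict needed to show that S -> SA | A is not LL(1). Running first_sets on the first grammar (S -> cAd, A -> Ab | a) gives FIRST(A) = {a}, with no empty string.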

Related

Is it possible that FIRST SET contains same terminal more than one time

I am confused about whether a FIRST set can contain the same terminal twice.
For example, I have the grammar
E->T+E|T FIRST(E)={a,a}
T->a FIRST(T)={a}
..
Is this correct, or should I write
FIRST(E)={a}
By definition, sets cannot contain the same element multiple times. This applies to FIRST sets as much as to any other set, so {a} is the proper way to write it.
I guess you're trying to compute the FIRST and FOLLOW sets in order to construct the final predictive table. Generally, though, you need to resolve all the conflicts first, which are:
ε-derivation
Direct Left Recursion
Indirect Left Recursion
Ambiguous prefixes
In your example (or the part of it you posted, I guess), you need to factor out the common prefix, the T:
E -> T E'
E' -> + E | ε
T -> a
Formally, for any non-terminal with derivation rules of the form A → αβ | αγ
1- Remove these 2 derivation rules
2- Create a rule A′ → β | γ
3- Create a rule A → α A′
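The three steps above can be sketched in a few lines of Python. This is only an illustration under assumptions: alternatives are encoded as tuples of symbols, () stands for ε, and the names common_prefix and left_factor_once are made up for this example. It performs a single factoring pass, so the new nonterminal may need another pass if its alternatives again share a prefix.

```python
from collections import defaultdict

def common_prefix(seqs):
    """Longest sequence of symbols shared at the start of every alternative."""
    prefix = []
    for symbols in zip(*seqs):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return tuple(prefix)

def left_factor_once(nt, alts):
    """One pass of left factoring for the alternatives of nonterminal nt."""
    groups = defaultdict(list)
    for alt in alts:
        groups[alt[:1]].append(alt)          # group by first symbol
    result = {nt: []}
    count = 0
    for head, group in groups.items():
        if len(group) == 1 or head == ():    # nothing to factor
            result[nt].extend(group)
            continue
        count += 1
        new_nt = nt + "'" * count            # A', A'', ... (hypothetical naming)
        prefix = common_prefix(group)        # this is alpha
        result[nt].append(prefix + (new_nt,))                    # A  -> alpha A'
        result[new_nt] = [alt[len(prefix):] for alt in group]    # A' -> beta | gamma
    return result

print(left_factor_once("E", [("T", "+", "E"), ("T",)]))
# {'E': [('T', "E'")], "E'": [('+', 'E'), ()]}
```

The printed result corresponds to E -> T E' and E' -> + E | ε, matching the factored grammar shown earlier in this answer.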
Check out this Paper about Conflicts, it was very helpful for me, and you might as well check this slide and this, if you have any problem with top-down parsing.

Removing Ambiguity Caused By Dangling Else For LL(1) Grammars

In the case of the dangling else problem for compiler design, is there a reason to left factor it before removing ambiguity?
We are transforming a CFG into an LL(1) grammar so my professor is asking us to first eliminate recursion, then left factor, then remove ambiguity from our grammar. But, from what I've read, ambiguity is usually eliminated first. I'm not sure how to remove ambiguity after left factoring.
This is what I got after left factoring it:
S -> i E t S S' | other
S' -> e S | epsilon
However, as I understand it, removing ambiguity requires rewriting the grammar, so the result will always look similar to this, right?
S -> U | M
M -> i E t M e M | other
U -> i E t U'
U' -> M e U | S
Or is there another way to do it? As far as I can see, this is the only way to remove ambiguity from the dangling else.
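As it turns out, a good way to deal with the ambiguity caused by a dangling else in an LL(1) parser is to handle it in the parser itself, for example by always choosing S' -> e S when the next token is e, so that each else binds to the nearest unmatched if. Rewriting the grammar is another way to handle it, as is adding 'begin' and 'end' to the grammar like so: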
S -> i E t a S z S' | other
S' -> e S | epsilon
Although it might be intuitive for some, for other beginners, this is what the symbols mean:
S: Statement
i: if
E: Expression
t: then
a: begin
z: end
S': Statement'
e: else
other: any other productions
Note: lower case letters represent terminals; Uppercase letters represent variables.
If anything is wrong, please let me know and I'll correct it.
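To illustrate what handling it in the parser can look like, here is a rough recursive-descent sketch for the left-factored grammar S -> i E t S S' | other, S' -> e S | epsilon. It is written under assumptions: the input is already a list of tokens, the expression E is reduced to a single placeholder token "E", and the function names and the tuple-based tree are invented for this example. The only special step is in statement_tail, which always takes the else branch when it sees e.

```python
def parse(tokens):
    """Parse a token list and return a nested tuple describing the statement."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else "$"

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r} at position {pos}, found {peek()!r}")
        pos += 1

    def statement():                      # S -> i E t S S' | other
        if peek() == "other":
            expect("other")
            return "other"
        expect("i")
        expect("E")                       # placeholder for a real expression
        expect("t")
        then_branch = statement()
        else_branch = statement_tail()
        return ("if", then_branch, else_branch)

    def statement_tail():                 # S' -> e S | epsilon
        if peek() == "e":                 # table conflict on e: prefer S' -> e S
            expect("e")
            return statement()
        return None                       # S' -> epsilon

    tree = statement()
    expect("$")                           # make sure all input was consumed
    return tree

print(parse(["i", "E", "t", "i", "E", "t", "other", "e", "other"]))
# ('if', ('if', 'other', 'other'), None)
```

In the output the else is attached to the inner if and the outer if has no else branch, which is the usual "nearest unmatched if" resolution.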
I think this can be a possible answer:
[After left factoring and making it unambiguous]
Let other = a
S -> iEtT | a
T -> S | aeS
I generate all the ifs first and associate each else with the most recent unassociated if.
To get an else, I rule out the possibility of a new if appearing between the current unassociated if and its corresponding else.
However, I still allow an if to appear after the corresponding else has been generated.
Point out if there are any errors.
Thank you.

How to eliminate this Left Recursion for LL Parser

How do you eliminate left recursion of the following type? I can't seem to apply the general rule to this particular one.
A -> A | a | b
By using the elimination rule you get:
A -> aA' | bA'
A' -> A' | epsilon
Which still has left recursion.
Does this say anything about the grammar being/not being LL(1)?
Thank you.
Notice that the rule
A → A
is, in a sense, entirely useless. It doesn't do anything to a derivation to apply this rule. As a result, we can safely remove it from the grammar without changing what the grammar produces. This leaves
A → a | b
which is LL(1).

Top down parsing - Compute FIRST and FOLLOW

Given the following grammar:
S -> S + S | S S | (S) | S* | a
S -> S S + | S S * | a
For the life of me I can't seem to figure out how to compute the FIRST and FOLLOW for the above grammar. The recursive non-terminal of S confuses me. Does that mean I have to factor out the grammar first before computing the FIRST and FOLLOW?
The general rule for computing FIRST sets in CFGs without ε productions is the following:
Initialize FIRST(A) as follows: for each production A → tω, where t is a terminal, add t to FIRST(A).
Repeatedly apply the following until nothing changes: for each production of the form A → Bω, where B is a nonterminal, set FIRST(A) = FIRST(A) ∪ FIRST(B).
We could follow the above rules as written, but there's something interesting here we can notice. Your grammar only has a single nonterminal, so that second rule - which imports elements into the FIRST set of one nonterminal from FIRST sets from another nonterminal - won't actually do anything. In other words, we can compute the FIRST set just by applying that initial rule. And that's not too bad here - we just look at all the productions that start with a terminal and get FIRST(S) = { a, ( }.
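As a rough sketch, the two-step rule above could be coded like this in Python. The grammar encoding (a dict mapping each nonterminal to a list of tuples) and the function name are assumptions for this example, and the code only applies to grammars without ε productions, as in the rule it follows.

```python
def first_sets_no_epsilon(grammar):
    """FIRST sets for a grammar with no epsilon productions.

    grammar: nonterminal -> list of non-empty alternatives (tuples of symbols).
    Any symbol that is not a key of the dict is treated as a terminal.
    """
    # Step 1: productions A -> t w contribute the terminal t directly.
    first = {
        nt: {alt[0] for alt in alts if alt[0] not in grammar}
        for nt, alts in grammar.items()
    }
    # Step 2: productions A -> B w import FIRST(B), repeated until stable.
    changed = True
    while changed:
        changed = False
        for nt, alts in grammar.items():
            for alt in alts:
                head = alt[0]
                if head in grammar and not first[head] <= first[nt]:
                    first[nt] |= first[head]
                    changed = True
    return first

grammar = {
    "S": [("S", "+", "S"), ("S", "S"), ("(", "S", ")"), ("S", "*"), ("a",)],
}
print(first_sets_no_epsilon(grammar)["S"] == {"a", "("})  # True
```

With a single nonterminal the loop in step 2 changes nothing, so the result comes entirely from step 1, as described above.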

Difference between Left Factoring and Left Recursion

What is the difference between Left Factoring and Left Recursion ? I understand that Left factoring is a predictive top down parsing technique. But I get confused when I hear these two terms.
Left factoring is removing the common left factor that appears in two productions of the same non-terminal. It is done to avoid backtracking by the parser. Suppose the parser has one symbol of lookahead, and consider this example:
A -> qB | qC
where A, B and C are non-terminals and q is a string of grammar symbols.
In this case, the parser cannot tell which of the two productions to choose, and it might have to backtrack. After left factoring, the grammar is converted to:
A -> qD
D -> B | C
In this case, the parser no longer has to guess between two productions for A. It postpones the decision until it reaches D, where one symbol of lookahead is enough to choose the right production as long as B and C start with different terminals.
Left recursion occurs when the leftmost symbol in a production of a non-terminal is that non-terminal itself (direct left recursion), or is another non-terminal that, through its own productions, rewrites to a string starting with the original non-terminal again (indirect left recursion).
Consider these examples:
(1) A -> Aq (direct)
(2) A -> Bq
B -> Ar (indirect)
Left recursion has to be removed if the parser performs top-down parsing.
Left Factoring is a grammar transformation technique. It consists in "factoring out" prefixes which are common to two or more productions.
For example, going from:
A → α β | α γ
to:
A → α A'
A' → β | γ
Left Recursion is a property a grammar has whenever you can derive from a given variable (non terminal) a rhs that begins with the same variable, in one or more steps.
For example:
A → A α
or
A → B α
B → A γ
There is a grammar transformation technique called Elimination of left recursion, which provides a method to generate, given a left recursive grammar, another grammar that is equivalent and is not left recursive.
The relationship/confusion between both terms probably derives from the fact that both transformation techniques may need to be applied to a grammar before being able to derive a predictive top down parser for it.
This is the way I've seen the two terms used:
Left recursion: when one or more productions can be reached from themselves with no tokens consumed in-between.
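Left factoring: a process of transformation that pulls a prefix shared by several alternatives of a non-terminal into a single production, so that the parser can defer the choice between them. (Turning a grammar from a left-recursive form into an equivalent non-left-recursive form is a separate transformation, elimination of left recursion.)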
Left factoring:
Consider the grammar:
A-->ab1 | ab2 | ab3
1) Every alternative starts with the common prefix a, so whichever production we choose, there is no guarantee that we will not need to backtrack.
2) The grammar is non-deterministic, because we cannot pick a production and be sure that it will lead to the desired string with the correct parse tree.
If we rewrite the grammar so that it is deterministic and still generates exactly the same strings, without any need for backtracking, we get:
A --> aA'
A' --> b1 | b2 | b3
Now if we are asked to build the parse tree for the string ab2, we don't need backtracking, because we can always choose the correct production once we reach A', and so we generate the correct parse tree.
Left recursion:
A --> Aa | b
Here the left child of A will always be A if we choose the first production. This is left recursion, because A keeps calling itself over and over again.
The strings generated by this grammar are:
ba*
Since a top-down parser cannot handle this form, we eliminate the left recursion by writing:
A --> bA'
A' --> epsilon | aA'
Now there is no left recursion, and the grammar still generates ba*.
Left Recursion:
A grammar is left recursive if it has a nonterminal A with productions of the form A -> Aα | β, where α and β are sequences of terminals and nonterminals and β does not start with A.
When designing a top-down parser, if left recursion exists in the grammar, the parser falls into an infinite loop, because to match A it first tries to match A again without consuming any input.
We can eliminate the above left recursion by rewriting the offending production as:
A -> βA'
A' -> αA' | epsilon
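A small Python sketch of this rewrite for direct left recursion, under assumptions: alternatives are tuples of symbols, () stands for epsilon, and the function name is made up for this example. It also drops a useless alternative of the form A -> A (the case where α is empty), since keeping it would only reintroduce the recursion.

```python
def eliminate_direct_left_recursion(nt, alts):
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | epsilon."""
    # The alphas: what follows the leading A in each left-recursive alternative.
    alphas = [alt[1:] for alt in alts if alt[:1] == (nt,)]
    # The betas: alternatives that do not start with A.
    betas = [alt for alt in alts if alt[:1] != (nt,)]
    # A -> A adds nothing to the language, so an empty alpha is discarded.
    alphas = [alpha for alpha in alphas if alpha]
    if not alphas:
        return {nt: betas}                 # no left recursion left to remove
    new_nt = nt + "'"                      # hypothetical name for the new nonterminal
    return {
        nt: [beta + (new_nt,) for beta in betas],
        new_nt: [alpha + (new_nt,) for alpha in alphas] + [()],
    }

print(eliminate_direct_left_recursion("A", [("A", "b"), ("a",)]))
# {'A': [('a', "A'")], "A'": [('b', "A'"), ()]}
print(eliminate_direct_left_recursion("A", [("A",), ("a",), ("b",)]))
# {'A': [('a',), ('b',)]}
```

This handles only direct left recursion in a single nonterminal. Indirect left recursion needs the productions of the other nonterminals substituted in first.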
Left Factoring: Left factoring is required to eliminate non-determinism in a grammar. Consider the grammar S -> abS | aSb
Here, both alternatives for S begin with the same terminal a, which makes the choice non-deterministic. We can rewrite the productions to defer the decision as:
S -> aS'
S' -> bS | Sb
Thus, S' can be replaced by bS or Sb.
Here is a simple way to differentiate between both terms:
Left Recursion:
When the leftmost element of a production's right-hand side is the non-terminal being defined.
e.g. A -> Aα / Aβ
Left Factoring:
When two or more alternatives of the same non-terminal begin with the same leftmost element(s), which are then factored out.
e.g. A -> αB / αC
Furthermore,
If a grammar is left recursive, a top-down parser may fall into an infinite loop, so we need to eliminate the left recursion.
If a grammar has alternatives with a common prefix, it confuses the parser, so we need to left factor it as well.
Left recursion: when the non-terminal on the left-hand side is the same as the leftmost non-terminal on the right-hand side.
Example:
A->A&|B where & is alpha.
We can remove the left recursion by rewriting this production like this:
A->BA'
A'->&A'|ε
Left factoring means the productions should not be non-deterministic.
Example:
A->&A|&B|&C

Resources