First & Follow set for Arithmetic Expressions - parsing

I want to know whether the FIRST and FOLLOW sets I computed for this grammar are correct:
S -> TS'
S' -> +TS' | -TS' | epsilon
T -> UT'
T' -> *UT' | /UT' | epsilon
U -> VX
X -> ^U | epsilon
V -> (W) | -W | W | epsilon
W -> S | number
FIRST(S) = FIRST(T) = FIRST(U) = FIRST(V) = FIRST(W) = { ( , - , + , number , epsilon }
FIRST(T') = { *, / , epsilon}
FIRST(S') = { + , - , epsilon}
FIRST(X) = { ^ , epsilon}
FOLLOW(S) = FOLLOW(S') = FOLLOW(V) = {$}
FOLLOW(T) = {+ , - , $ }
FOLLOW(T')= {+, - , $ }
FOLLOW(U) = FOLLOW(X) = { * , / , + , - ,$ }
FOLLOW(W) = { ) , $ }

Just a remark:
You said:
FIRST(U) = FIRST(V)
That is not the whole set: V can be epsilon, which means FIRST(U) = FIRST(V) + FIRST(X), so ^ is in FIRST(U) as well.
And X can be epsilon too.
Those epsilons can be quite frustrating sometimes.
There is a little more to say.
Just a few rules:
- Capitals are nonterminals
- lowercase are terminals
- epsilon is used for an empty rule
- $ is used to note the end of the input.
First(a) = {a}
First(A,B) = First(A), if epsilon is not in First(A)
First(A,B) = (First(A) - {epsilon}) + First(B), if epsilon in First(A)
First(A|B) = First(A) + First(B)
Follow(T) includes $ if T is the start symbol
Follow(T) includes First(A) - {epsilon} if there is a rule with ..TA..
Follow(T) includes Follow(A) if there is a rule A -> ..T
Follow(T) includes Follow(A) if there is a rule A -> ..TB and B can be epsilon
Follow(T) never includes epsilon
Example:
E = TE'
E' = +TE'|epsilon
T = FT'
T' = *FT' | epsilon
F = (E) | id
First(E) = First(T) = First(F) = {(, id}
First(E') = {+, epsilon}
First(T) = First(F) = {(, id}
First(T') = {*, epsilon}
First(F) = {(, id}
Follow(E) = {$, )}
Follow(E') = Follow(E) = {$, )}
Follow(T) = First(E') + Follow(E') = {$, ), +}
Follow(T') = Follow(T) = {$, ), +}
Follow(F) = First(T') + Follow(T') + Follow(T) = {*, $, ), +}
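The rules above can be sketched as a fixed-point computation. This is a minimal Python sketch, not a reference implementation: the names EPS, END, GRAMMAR, first_of_sequence, compute_first and compute_follow are made up for illustration, and the grammar is the E / E' / T / T' / F example, with an empty list standing for the epsilon rule.

```python
EPS = "epsilon"   # marker for the empty rule (assumed spelling)
END = "$"         # end-of-input marker

# Nonterminal -> list of alternatives; [] is the epsilon alternative.
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}

def first_of_sequence(symbols, first):
    """FIRST of a symbol string; contains EPS only if every symbol can vanish."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in first else {sym}  # terminal: First(a) = {a}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                      # repeat until no set grows any more
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                new = first_of_sequence(alt, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def compute_follow(grammar, first, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)              # Follow(start) includes $
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for alt in alternatives:
                for i, sym in enumerate(alt):
                    if sym not in grammar:
                        continue        # only nonterminals have FOLLOW sets
                    rest = first_of_sequence(alt[i + 1:], first)
                    new = rest - {EPS}  # Follow never includes epsilon
                    if EPS in rest:     # everything after sym can vanish
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

FIRST = compute_first(GRAMMAR)
FOLLOW = compute_follow(GRAMMAR, FIRST, "E")
for nt in GRAMMAR:
    print(nt, sorted(FIRST[nt]), sorted(FOLLOW[nt]))
```

Swapping GRAMMAR for your own productions is a quick way to check hand-computed sets.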
Your grammar is much more complex and a bit weird (are you sure there are no mistakes in the grammar?) but you can follow the rules.

Related

Calculating first and follow set of grammar

Below is the grammar that I am using for a calculator language, together with my attempt at finding its first and follow sets.
I would love help in figuring out what I am doing wrong, because I feel like I am not computing these sets correctly at all (at least the follow sets).
Grammar
program → stmt_list $$$
stmt_list → stmt stmt_list | ε
stmt → id = expr | input id | print expr
expr → term term_tail
term_tail → add_op term term_tail | ε
term → factor fact_tail
fact_tail → mult_op fact fact_tail | ε
factor → ( expr ) | number | id
add_op → + | -
mult_op → * | / | // | %
First set
first(p) = {id, input, print}
first(stmt_list) = {id, input, print, e}
first(s) = {id, input, print}
first(expr) = {(, id, number}
first(term_tail) = {+, -, e}
first(term) = {(, id, number}
first(fact_tail) = {, /, //, %, e}
first(factor) = {(, id, number}
first(add_op) = {+, -}
first(mult_op) = {, /, //, %}
Follow Set
follow(p) = {$}
follow(stmt_list) = {$}
follow(stmt) = {id, input, print}
follow(expr) = {(, id, number, ), input, print, , /, //, %}
follow(term_tail) = {), (, id, number, print, input}
follow(term) = {+, -}
follow(factor) = {, /, //, %}
follow(add_op) = {}
follow(mult_op) = {}
follow(fact_tail) = {*, /, //, %, +, -}
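You have some mistakes in the first sets as well.
first(p) = {id, input, print, $$$}
Because stmt_list can be e, a program can start with the end marker itself. first(p) does not contain e, since every program ends with that marker.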
* is missing in the next two:
first(fact_tail) = {*, /, //, %, e}
first(mult_op) = {*, /, //, %}
fact_tail → mult_op fact fact_tail | ε
I am assuming here that you actually mean
fact_tail → mult_op factor fact_tail | ε
Follow
follow(stmt) = {id, input, print,$}
if you refer to
stmt_list → stmt stmt_list | ε
then stmt is followed by the first of stmt_list, which includes e. When stmt_list is e, stmt is followed by whatever follows stmt_list, hence stmt is also followed by $
follow(expr) = {(, id, number, ), input, print, , /, //, %}
I don't know how you got this. The follow of expr is the follow of stmt plus )
follow(expr) = {id, ), input, print,$}
follow(term_tail) is equal to follow(expr)
follow(term) = {+,-,),id,input,print,$}
follow(fact_tail) is equal to follow(term)
follow(factor) = first(fact_tail) without e, plus follow(term) because fact_tail can be e
follow(add_op) = first(term)
follow(mult_op) = first(factor)
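To check follow sets like these mechanically, here is a small Python sketch. It rests on some assumptions: fact is read as factor, add op as add_op, and the end marker is modelled by putting $ into follow(program) instead of keeping it as a terminal inside the program rule. The helper names (parse, first_of, first_sets, follow_sets) are made up for this example.

```python
EPS, END = "e", "$"

RULES = """
program -> stmt_list
stmt_list -> stmt stmt_list | e
stmt -> id = expr | input id | print expr
expr -> term term_tail
term_tail -> add_op term term_tail | e
term -> factor fact_tail
fact_tail -> mult_op factor fact_tail | e
factor -> ( expr ) | number | id
add_op -> + | -
mult_op -> * | / | // | %
"""

def parse(text):
    """Turn 'head -> a b | c' lines into {head: [[a, b], [c]]}; e becomes []."""
    grammar = {}
    for line in text.strip().splitlines():
        head, body = line.split("->", 1)
        grammar[head.strip()] = [
            [sym for sym in alt.split() if sym != EPS] for alt in body.split("|")
        ]
    return grammar

def first_of(symbols, first):
    out = set()
    for sym in symbols:
        f = first.get(sym, {sym})       # a terminal is its own first set
        out |= f - {EPS}
        if EPS not in f:
            return out
    return out | {EPS}                  # every symbol was nullable (or none left)

def size(sets):
    return sum(len(s) for s in sets.values())

def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    while True:
        before = size(first)
        for nt, alts in grammar.items():
            for alt in alts:
                first[nt] |= first_of(alt, first)
        if size(first) == before:       # nothing grew: fixed point reached
            return first

def follow_sets(grammar, first, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)
    while True:
        before = size(follow)
        for head, alts in grammar.items():
            for alt in alts:
                for i, sym in enumerate(alt):
                    if sym in grammar:
                        rest = first_of(alt[i + 1:], first)
                        follow[sym] |= rest - {EPS}
                        if EPS in rest:
                            follow[sym] |= follow[head]
        if size(follow) == before:
            return follow

grammar = parse(RULES)
first = first_sets(grammar)
follow = follow_sets(grammar, first, "program")
for nt in grammar:
    print(f"follow({nt}) = {sorted(follow[nt])}")
```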

Follow Set of grammar

I am working on trying to compute the FOLLOW set of the following grammar:
E -> TX
T -> int Y | ( E )
X -> + E | ε
Y -> * T | ε
I have calculated the following FOLLOW set so far:
follow (E) = {$} U {)}
follow (Y) = follow (T)
follow (T) = follow (Y)
follow (X) = follow (E) = {$, )}
follow (E) = first ()) = {)}
I know that the follow (T) / follow (Y) contains {+,$,)} but I am struggling to get to that point.
Any assistance in explaining the method here would be greatly helpful.
Note: I have followed these rules
1) If A is the start symbol put $ in Follow (A)
2) If there is a production B -> αAb, then Follow (A) = First (b)
3) If there is a production B -> aA or B -> αAb where First (b) is ε, add Follow (A) = Follow (B)
I have figured it out (and spent the better part of the afternoon)!
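So the rules I'm following, for anyone who finds this, are:
follow(E) includes $, because E is the start symbol
follow(E) includes first ()) = {)}
follow(X) = follow(E)
follow(Y) = follow(T)
follow(T) includes first(X) without ε, which is {+} //the important one!
follow(T) includes follow(E), because X can be ε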
Following these rules you can build the sets:
follow(E) = {$, )}
follow(T) = {$, ), +}
follow(X) = {$, )}
follow(Y) = {$, ), +}
Which concludes the follow sets for the grammar!
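The same reasoning can be written as "seed sets plus subset constraints" and solved by repeating the constraints until nothing changes. This is only a sketch for this one grammar: the seeds and constraints are read off the productions by hand, and the variable names are made up for illustration.

```python
# Seed sets: $ for the start symbol, plus terminals that directly follow a nonterminal.
follow = {
    "E": {"$", ")"},   # E is the start symbol; T -> ( E )
    "T": {"+"},        # E -> T X and first(X) without ε is {+}
    "X": set(),
    "Y": set(),
}

# (source, target): everything in follow(source) also belongs to follow(target).
subset_constraints = [
    ("E", "T"),  # E -> T X, and X can be ε
    ("E", "X"),  # E -> T X, X is last
    ("X", "E"),  # X -> + E, E is last
    ("T", "Y"),  # T -> int Y, Y is last
    ("Y", "T"),  # Y -> * T, T is last
]

changed = True
while changed:
    changed = False
    for source, target in subset_constraints:
        missing = follow[source] - follow[target]
        if missing:
            follow[target] |= missing
            changed = True

for nonterminal, symbols in follow.items():
    print(f"follow({nonterminal}) = {sorted(symbols)}")
```

The circular pairs (T, Y) and (Y, T) are why follow(T) and follow(Y) end up equal: each set has to contain the other.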

How Follow function works? (compiler)

Grammar:
E -> TE'
E' -> +TE' | ε
T -> FT'
T' -> *FT' | ε
F -> (E) | id
Functions:
1. FIRST(F) = FIRST(T) = FIRST(E) = {(, id}
2. FIRST(E') = {+, ε}
3. FIRST(T') = {*, ε}
4. FOLLOW(E) = FOLLOW(E') = {), $}
5. FOLLOW(T) = FOLLOW(T') = {+, ), $}
6. FOLLOW(F) = {*, +, ), $}
Here are the grammar and the functions from my lectures. Can someone explain to me how FOLLOW works? I understood how FIRST works, but FOLLOW is very difficult to understand.
Have a look at Wikipedia's FIRST_and_FOLLOW_sets.
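FOLLOW(E):
Look for every occurrence of E on the right-hand side of a rule, and take the union of the terminals that follow it and the FIRST sets (without ε) of the nonterminals that follow it.
Here the only occurrence is (E), so the only following terminal is ).
E is also the start symbol, so $ is added: FOLLOW(E) = {), $}.
FOLLOW(F):
F occurs in FT' and *FT'. In both cases it is followed by T', so FOLLOW(F) contains FIRST(T') = {*, ε} without the ε.
Because T' can be ε, F can also be the last symbol of T and of T', so FOLLOW(T) and FOLLOW(T') = {+, ), $} are added.
Finally, FOLLOW(F) = {*, +, ), $}.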
Here FOLLOW(F) is found this way:
T --> FT' means FOLLOW(T) is a subset of FOLLOW(F), because T' can be ε.
T' --> *FT' means FIRST(T') is added to FOLLOW(F); FIRST(T') contains ε, so leave ε out and add the other values to the set.
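One way to see what FOLLOW means is to check it against its definition: a terminal is in FOLLOW(F) if it appears directly after F in some sentential form derived from E, and $ is in FOLLOW(F) if F can be the last symbol. The Python sketch below enumerates sentential forms by brute force. It is only an illustration, not how parser generators compute FOLLOW: the function name followers and the length limit MAX_LEN are assumptions, and the limit happens to be large enough for this small grammar but is not a general bound.

```python
from collections import deque

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],     # [] is the ε alternative
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}
MAX_LEN = 7  # assumed cut-off; longer sentential forms are not explored

def followers(target, start="E"):
    """Terminals (or $) seen directly after `target` in explored sentential forms."""
    seen = {(start,)}
    queue = deque(seen)
    found = set()
    while queue:
        form = queue.popleft()
        for i, sym in enumerate(form):
            if sym == target:
                nxt = form[i + 1] if i + 1 < len(form) else "$"
                if nxt not in GRAMMAR:          # record terminals and $ only
                    found.add(nxt)
            if sym in GRAMMAR:                  # expand this nonterminal in place
                for alt in GRAMMAR[sym]:
                    new_form = form[:i] + tuple(alt) + form[i + 1:]
                    if len(new_form) <= MAX_LEN and new_form not in seen:
                        seen.add(new_form)
                        queue.append(new_form)
    return found

print("FOLLOW(E) =", sorted(followers("E")))
print("FOLLOW(F) =", sorted(followers("F")))
```

For example, the form F + T E' (reached by replacing T' with ε) shows + directly after F, which is exactly the case the "T' can be ε" rule covers.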

Top Down Parsing - First and Follow

I have the following grammar on which I am trying to learn how to do first and follow. I think I have the FIRST correct. However, the FOLLOW is confusing due to the nonterminal C.
Here is the grammar:
S --> ABC
A --> a | Cb |ε
B --> C | dA | ε
C --> e | f
For the FIRST:
First(S) = First(A)-{ε} + First(C) = { a,f, e, ε}
First(B) = First(C) = {d,e,f,ε}
For the FOLLOW:
Follow(S) = {ε}
Follow(A) = First(B)-{ε} + First(C) = {a,e,f}
Follow(B) = Follow(C) = Follow(S) = { $}
Follow(C) = Follow(B) = Follow(S) = {b, $}
I’m having issues since there are two C one in production A and B?
Am I close to having this?
I think the First sets are already wrong.
As A and B are optional, and C has a non-empty first:
First(S) = First(A) + First(B) + First(C) - {ε}
First(A) = {a} + First(C) + {ε}
First(B) = First(C) + {d, ε}
First(C) = {e, f}
=>
First(A) = {a, e, f, ε}
First(B) = {d, e, f, ε}
=>
First(S) = {a, d, e, f}
Use First when the occurrence is followed by another symbol,
and the Follow of the rule's left-hand side when the occurrence is at the end of the rule.
Follow(S) = {$}
Follow(A) = First(B) - {ε} + First(C) + Follow(B) = {d, e, f}
Follow(B) = First(C) = {e, f}
Follow(C) = Follow(S) + Follow(B) + {b} = {b, e, f, $}
(I hope I got this right.)
In detail:
Follow(S) = {$} [Start rule]
Follow(A) = First(B) - {ε} [S --> A.BC]
+ First(C) [S --> AB.C as B or First(B) contains ε]
+ Follow(B) [B. --> C | dA. | ε]
Follow(B) = First(C) [S --> AB.C as C does not contain ε stops]
Follow(C) = Follow(S) [S. --> ABC.]
+ {b} [A --> a | C.b |ε]
+ Follow(B) [B. --> C. | dA | ε]
The dot on the LHS like X. --> ... results in Follow(X);
The dot on the RHS like ... --> ... .X ... results in First(X) - {ε}
While there is an ε-production for X, continue: ... --> ... X. ...
Mind that these are "my" rules; your book might use a slightly different algebra.
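The dot rules above can be sketched in Python as two steps: walk right from each occurrence to collect terminals and "inherits Follow of the LHS" links, then propagate. This is a rough sketch under stated assumptions: the First sets are copied from this answer and not recomputed, and the names walk_dot and solve are made up for illustration.

```python
EPS = "ε"

GRAMMAR = {
    "S": [["A", "B", "C"]],
    "A": [["a"], ["C", "b"], []],     # [] is the ε alternative
    "B": [["C"], ["d", "A"], []],
    "C": [["e"], ["f"]],
}

# First sets as given in the answer above.
FIRST = {
    "S": {"a", "d", "e", "f"},
    "A": {"a", "e", "f", EPS},
    "B": {"d", "e", "f", EPS},
    "C": {"e", "f"},
}

def walk_dot(grammar, first):
    """For each occurrence, move the dot right while the next symbol can be ε."""
    direct = {nt: set() for nt in grammar}      # terminals found to the right
    inherits = {nt: set() for nt in grammar}    # LHS symbols whose Follow is inherited
    for head, alternatives in grammar.items():
        for alt in alternatives:
            for i, sym in enumerate(alt):
                if sym not in grammar:
                    continue
                for nxt in alt[i + 1:]:
                    nxt_first = first.get(nxt, {nxt})   # a terminal is its own First
                    direct[sym] |= nxt_first - {EPS}
                    if EPS not in nxt_first:
                        break                            # the dot stops here
                else:
                    inherits[sym].add(head)              # the dot reached the end
    return direct, inherits

def solve(direct, inherits, start):
    follow = {nt: set(symbols) for nt, symbols in direct.items()}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for nt, heads in inherits.items():
            for head in heads:
                if not follow[head] <= follow[nt]:
                    follow[nt] |= follow[head]
                    changed = True
    return follow

direct, inherits = walk_dot(GRAMMAR, FIRST)
follow = solve(direct, inherits, "S")
for nt in GRAMMAR:
    print(f"Follow({nt}) = {sorted(follow[nt])}")
```

Note that in S --> ABC the dot never gets past C, because C has no ε-production, so only C inherits Follow(S).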
I'm answering only the follow sets for the grammar, as the firsts are already answered correctly.
follow(S) = {$};
follow(A) = {d,e,f};
follow(B) = {e,f};
follow(C) = {$,b,e,f};
'$' would be in the follow of A and B only if C were nullable: if there is a production S ---> ABC in the grammar where C is nullable, then everything in Follow(S) is also in Follow(B). Here C is not nullable, so '$' only reaches C, the last symbol of S ---> ABC.

Hindley Milner Type Inference in F#

Can somebody explain, step by step, the type inference in the following F# program:
let rec sumList lst =
    match lst with
    | [] -> 0
    | hd :: tl -> hd + sumList tl
I specifically want to see, step by step, how the process of unification in Hindley Milner works.
Fun stuff!
First we invent a generic type for sumList:
x -> y
And get the simple equations:
t(lst) = x;
t(match ...) = y
Now you add the equation:
t(lst) = [a] because of (match lst with [] ...)
Then the equation:
b = t(0) = Int; y = b
Since 0 is a possible result of the match:
c = t(match lst with ...) = b
From the second pattern:
t(lst) = [d];
t(hd) = e;
t(tl) = f;
f = [e];
t(lst) = t(tl);
t(lst) = [t(hd)]
Guess a type (a generic type) for hd:
g = t(hd); e = g
Then we need a type for sumList, so we'll just get a meaningless function type for now:
h -> i = t(sumList)
So now we know:
h = f;
t(sumList tl) = i
Then from the addition we get:
Addable g;
Addable i;
g = i;
t(hd + sumList tl) = g
Now we can start unification:
t(lst) = t(tl) => [a] = f = [e] => a = e
t(lst) = x = [a] = f = [e]; h = t(tl) = x
t(hd) = g = i /\ i = y => y = t(hd)
x = t(lst) = [t(hd)] /\ t(hd) = y => x = [y]
y = b = Int /\ x = [y] => x = [Int] => t(sumList) = [Int] -> Int
I skipped some trivial steps, but I think you can get how it works.
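The unification step itself can be sketched in a few lines of Python. This is a simplified illustration, not how the F# compiler is implemented: type variables are plain strings, constructed types are tuples such as ("List", t) or ("Fun", a, b), the Addable constraints are left out, and the equation list is a hand-condensed version of the equations above (all of these encodings and names are assumptions made for this sketch).

```python
def resolve(t, subst):
    """Follow variable bindings until an unbound variable or a constructor is reached."""
    while isinstance(t, str) and t in subst:
        t = subst[t]
    return t

def occurs(var, t, subst):
    """True if type variable `var` appears inside type `t` (prevents infinite types)."""
    t = resolve(t, subst)
    if isinstance(t, str):
        return t == var
    return any(occurs(var, arg, subst) for arg in t[1:])

def unify(a, b, subst):
    """Extend `subst` in place so that `a` and `b` become equal, or raise TypeError."""
    a, b = resolve(a, subst), resolve(b, subst)
    if a == b:
        return
    if isinstance(a, str):                      # a is an unbound type variable
        if occurs(a, b, subst):
            raise TypeError(f"infinite type: {a} occurs in {b}")
        subst[a] = b
    elif isinstance(b, str):
        unify(b, a, subst)
    elif a[0] == b[0] and len(a) == len(b):     # same constructor: unify arguments
        for x, y in zip(a[1:], b[1:]):
            unify(x, y, subst)
    else:
        raise TypeError(f"cannot unify {a} with {b}")

def apply_subst(t, subst):
    """Replace every bound variable in `t` by what it stands for."""
    t = resolve(t, subst)
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(apply_subst(arg, subst) for arg in t[1:])

INT = ("Int",)
equations = [
    ("x", ("List", "a")),                       # lst is matched against []
    ("y", INT),                                 # the first branch returns 0
    ("x", ("List", "e")),                       # hd :: tl with t(hd) = e, t(tl) = [e]
    (("Fun", "x", "y"), ("Fun", "h", "i")),     # recursive use of sumList : h -> i
    ("h", ("List", "e")),                       # sumList is applied to tl
    ("e", "i"),                                 # both operands of + have the same type
    ("y", "i"),                                 # the second branch returns the sum
]

subst = {}
for left, right in equations:
    unify(left, right, subst)

print(apply_subst(("Fun", "x", "y"), subst))
```

The final substitution maps x to a list of Int and y to Int, which matches the [Int] -> Int result derived by hand.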

Resources