I'm writing an SLR(1) parser for the following grammar:
1) S -> aSb
2) S -> cB
3) B -> cB
4) B -> ε
First of all, I augmented the grammar with the production S' -> S and started computing the states of the associated LR(0) automaton. The states that I found are:
I0 = {S'->•S, S->•aSb, S->•cB}
𝛿(I0, S) = I1; 𝛿(I0, a) = I2; 𝛿(I0, c) = I3;
I1 = {S'->S•}
I2 = {S->a•Sb, S->•aSb, S->•cB}
𝛿(I2, S) = I4; 𝛿(I2, a) = I2; 𝛿(I2, c) = I3
I3 = {S->c•B, B->•cB, B->•ε}
𝛿(I3, B) = I5; 𝛿(I3, ε) = I6; 𝛿(I3, c) = I7;
I4 = {S->aS•b}
𝛿(I4, b) = I8;
I5 = {S->cB•}
I6 = {B->ε•}
I7 = {B->c•B, B->•cB, B->•ε}
𝛿(I7, B) = I9; 𝛿(I7, ε) = I6; 𝛿(I7, c) = I7;
I8 = {S->aSb•}
I9 = {B->cB•}
And here is the LR(0) automaton:
[Automaton picture]
After that, I built the parsing table (but I don't think it is needed in order to answer my question). So I have a doubt:
is the ε transition handled in the right way? I mean, I treated ε as a normal symbol, since at some point we have to reduce by rule 4. If I'm wrong, how should I treat that transition? Thanks in advance, hoping this is helpful for other people as well.
No, there is no need to create the state I6.
The confusion may have arisen with Y -> ε. When you place a dot in an item, for example S -> A•B, it means that A has already been completed and B is yet to be completed (completion here means progress in parsing). Similarly, if you write Y -> •ε, it means ε is yet to be parsed; but we also know that ε is the null string, i.e. nothing, therefore Y -> •ε is interpreted as Y -> •, an item that is already complete.
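To make this concrete, here is a small sketch (my own Python encoding, not from the question) of LR(0) closure in which an ε-production contributes an already-complete item, so no extra state and no ε transition are ever created:

```python
# An item is (lhs, rhs, dot). An epsilon production has an empty rhs,
# so "B -> . eps" is simply "B -> ." -- an already-complete item.

GRAMMAR = {
    "S": [("a", "S", "b"), ("c", "B")],
    "B": [("c", "B"), ()],          # () is the epsilon production B -> eps
}

def closure(items):
    items = set(items)
    changed = True
    while changed:
        changed = False
        for (lhs, rhs, dot) in list(items):
            if dot < len(rhs) and rhs[dot] in GRAMMAR:   # dot before a nonterminal
                for prod in GRAMMAR[rhs[dot]]:
                    item = (rhs[dot], prod, 0)           # for B -> () this is B -> . (complete)
                    if item not in items:
                        items.add(item)
                        changed = True
    return items

# State I3 from the question: closure of S -> c . B
I3 = closure({("S", ("c", "B"), 1)})
```

Here I3 ends up containing B -> •cB and the complete item B -> •, which simply means "reduce by B -> ε" once the lookahead permits it; there is nothing to shift.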
You can use the JFLAP software and see its documentation about SLR(1).
Not required.
I faced the same problem while removing left recursion from the following grammar:
E -> E+T | E-T | T
The transformed rules look like:
E -> T X
X -> +TX | -TX | ε
It doesn't matter: X -> •ε and X -> ε• have no significance. Since ε is the empty string, moving the dot before or after it changes nothing.
I'm having a bit of difficulty with the following question:
Given a DFA A = (E = {a,b,c}, Q, q0, F, l) (where l is the transition function), build a new DFA B such that L(B) = L(A) - {a}.
Now I understand that EB = EA, but how can I define B's transition function, or its accepting states, without knowing the accepting states of A?
Thank you.
First, let us construct a DFA called C which accepts exactly the word w. This DFA has |w| + 2 states: a chain of |w| + 1 states leading from the initial state to the single accepting state, plus one dead state.
Second, let us construct a DFA called D which accepts everything except w: take C and simply change all non-accepting states to accepting, and vice versa.
Third, let us construct B using the Cartesian product machine construction on the input DFAs A and D, with intersection as the operator. This DFA will have |Q| × (|w| + 2) states in total.
The language of B is everything accepted by A and D simultaneously. D accepts anything that isn't w; so, B accepts anything in L(A) that isn't w, as required.
EDIT: Some more detail about what B ends up looking like.
Let the states of A be QA and the states of D be QD. Let the accepting states of A be FA and the accepting states of D be FD. Let dA be the transition function for A and dD be the transition function for D. From our construction above, we have:
Q = {(x, y) : x in QA, y in QD}
E = the same alphabet as A and D, assumed to be the same. If w contains only symbols in A's alphabet then just use A's alphabet. If w contains symbols not in A's alphabet then w is not in L(A) and we can just let B = A.
q0 = (q0, q0')
F = {(x, y) in Q | x in FA and y in FD}
d((x, y), s) = (dA(x, s), dD(y, s))
As for what D looks like:
QD = {q0', q1', …, q(|w|+1)'}
E = same as A's
q0' is the initial state
FD = QD \ {q(|w|)'}
d(qi', s) = q(i+1)' if i < |w| and s is the (i+1)th symbol of w; otherwise d(qi', s) = q(|w|+1)', the dead state, which loops back to itself on every symbol
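The whole construction can be sketched in a few lines of Python (my own encoding of the recipe above; a DFA here is a tuple of states, a transition dict, a start state, and an accepting set):

```python
def word_dfa(w, alphabet):
    """DFA D that accepts everything EXCEPT the single word w."""
    n = len(w)
    states = list(range(n + 2))              # 0..n track progress through w; n+1 is dead
    delta = {}
    for i in range(n):
        for s in alphabet:
            delta[(i, s)] = i + 1 if s == w[i] else n + 1
    for s in alphabet:
        delta[(n, s)] = n + 1                # any symbol after w leads to the dead state
        delta[(n + 1, s)] = n + 1            # the dead state loops to itself
    return states, delta, 0, set(states) - {n}   # complemented: all states but the w-state accept

def product(A, D, alphabet):
    """Cartesian product machine of A and D, intersection as the operator."""
    (Qa, da, qa0, Fa), (Qd, dd, qd0, Fd) = A, D
    Q = [(x, y) for x in Qa for y in Qd]
    delta = {((x, y), s): (da[(x, s)], dd[(y, s)]) for (x, y) in Q for s in alphabet}
    F = {(x, y) for (x, y) in Q if x in Fa and y in Fd}
    return Q, delta, (qa0, qd0), F

def accepts(dfa, word):
    _, delta, q, F = dfa
    for s in word:
        q = delta[(q, s)]
    return q in F

# Hypothetical usage: A accepts every word over {a, b}; B then accepts everything but "a".
alphabet = "ab"
A = ([0], {(0, s): 0 for s in alphabet}, 0, {0})
B = product(A, word_dfa("a", alphabet), alphabet)
```

The choice of A here is just for demonstration; any DFA over the same alphabet works.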
I have a language L1 = {w in {0,1}* | w contains the same number of 1's and 0's}, and I have a TM M that decides L1.
I want to prove that L2 = {w in {0,1}*| w contains more 1's than 0's} is Turing-decidable.
I have used the "closed under complement" approach and proven that M' decides the complement of L1 (~L1).
My question is: can I assume that ~L1 = (L2 or ~L2) and conclude that, since M' decides ~L1, L2 and ~L2 are both decidable languages?
Thank you for any advice
(Sorry, haven't figured out how to use LaTeX here yet...)
I just want to flesh out Wellbog's answer. Here is L1 (read n1(w) as "the number of 1's in w"):
L1 = {w ∈ {0,1}* : n1(w) = n0(w)}
And here is L2:
L2 = {w ∈ {0,1}* : n1(w) > n0(w)}
From the other side, L1-bar is:
L1-bar = {w ∈ {0,1}* : n1(w) > n0(w) OR n1(w) < n0(w)}
And clearly, L1-bar and L2 are different.
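So the complement argument alone does not give you L2. Still, L2 is decidable: a decider only has to compare the two counts. Here is a sketch in Python rather than as a Turing machine, purely for illustration (a TM doing the same counting is straightforwardly total):

```python
def decide_L2(w):
    """Accept iff w over {0,1} contains strictly more 1's than 0's."""
    return w.count("1") > w.count("0")
```

Every input leads to a halting accept/reject answer, which is exactly what decidability requires.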
I am trying to figure out some details involving parsing expression grammars, and am stuck on the following question:
For the given grammar:
a = b Z
b = Z Z | Z
(where lower-case letters indicate productions, and uppercase letters indicate terminals).
Is the production "a" supposed to match against the string "Z Z"?
Here is the pseudo-code that I've seen the above grammar get translated to, where each production is mapped to a function that outputs two values. The first indicates whether the parse succeeded. And the second indicates the resulting position in the stream after the parse.
defn parse-a (i:Int) -> [True|False, Int] :
  val [r1, i1] = parse-b(i)
  if r1 : eat("Z", i1)
  else : [false, i]

defn parse-b1 (i:Int) -> [True|False, Int] :
  val [r1, i1] = eat("Z", i)
  if r1 : eat("Z", i1)
  else : [false, i]

defn parse-b2 (i:Int) -> [True|False, Int] :
  eat("Z", i)

defn parse-b (i:Int) -> [True|False, Int] :
  val [r1, i1] = parse-b1(i)
  if r1 : [r1, i1]
  else : parse-b2(i)
The above code will fail when trying to parse the production "a" on the input "Z Z". This is because the parsing function for "b" is incorrect. It will greedily consume both Z's in the input and succeed, and then leave nothing left for a to parse. Is this what a parsing expression grammar is supposed to do? The pseudocode in Ford's thesis seems to indicate this.
Thanks very much.
-Patrick
In PEGs, disjunctions (alternatives) are indeed ordered. In Ford's thesis, the operator is written / and called "ordered choice", which distinguishes it from the | disjunction operator.
That makes PEGs fundamentally different from CFGs. In particular, given PEG rules a -> b Z and b -> Z Z / Z, a will not match Z Z.
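To illustrate (my own Python transliteration of the pseudo-code in the question, not from Ford's thesis), ordered choice commits to the first alternative that succeeds and never revisits that decision:

```python
# Each parser takes a token list and a position and returns (success, next_position).

def eat(tok, toks, i):
    if i < len(toks) and toks[i] == tok:
        return True, i + 1
    return False, i

def parse_b(toks, i):
    # Ordered choice "Z Z / Z": try the first alternative; only if it
    # fails do we fall through to the second. No backtracking afterwards.
    ok, j = eat("Z", toks, i)
    if ok:
        ok2, k = eat("Z", toks, j)
        if ok2:
            return True, k
    return eat("Z", toks, i)

def parse_a(toks, i):
    ok, j = parse_b(toks, i)
    if ok:
        return eat("Z", toks, j)   # on "Z Z", b has already consumed both tokens
    return False, i
```

On input Z Z, parse_b greedily consumes both tokens, so the trailing Z of rule a finds nothing and the parse fails; on Z Z Z it succeeds.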
Thanks for your reply Rici.
I re-read Ford's thesis much more closely, and it reaffirms what you said: the PEG / operator is both ordered and greedy. So the rule presented above is indeed supposed to fail.
-Patrick
I've been trying to get my head round various bits of F# (I'm coming from more of a C# background), and parsers interest me, so I jumped at this blog post about F# parser combinators:
http://santialbo.com/blog/2013/03/24/introduction-to-parser-combinators
One of the samples here was this:
/// If the stream starts with c, returns Success, otherwise returns Failure
let CharParser (c: char) : Parser<char> =
    let p stream =
        match stream with
        | x::xs when x = c -> Success(x, xs)
        | _ -> Failure
    in p //what does this mean?
However, one of the things that confused me about this code was the in p statement. I looked up the in keyword in the MSDN docs:
http://msdn.microsoft.com/en-us/library/dd233249.aspx
I also spotted this earlier question:
Meaning of keyword "in" in F#
Neither of those seemed to be the same usage. The only thing that seems to fit is that this is a pipelining construct.
The let x = ... in expr construct allows you to declare a binding for some variable x which can then be used in expr.
In this case p is a function which takes an argument stream and then returns either Success or Failure depending on the result of the match, and this function is returned by the CharParser function.
The F# light syntax automatically nests let .. in bindings, so for example
let x = 1
let y = x + 2
y * z
is the same as
let x = 1 in
let y = x + 2 in
y * z
Therefore, the in is not needed here and the function could have been written simply as
let CharParser (c: char) : Parser<char> =
    let p stream =
        match stream with
        | x::xs when x = c -> Success(x, xs)
        | _ -> Failure
    p
The answer from Lee explains the problem. In F#, the in keyword is a heritage from the earlier functional languages that inspired F# and required it, namely ML and OCaml.
It might be worth adding that there is just one situation in F# where you still need in - that is, when you want to write let followed by an expression on a single line. For example:
let a = 10
if (let x = a * a in x = 100) then printfn "Ok"
This is a somewhat funky coding style and I would not normally use it, but you do need in if you want to write it like this. You can always split it over multiple lines though:
let a = 10
if ( let x = a * a
     x = 100 ) then printfn "Ok"
What is the most elegant way to implement dynamic programming algorithms that solve problems with overlapping subproblems? In imperative programming one would usually create an array indexed (at least in one dimension) by the size of the problem, and then the algorithm would start from the simplest problems and work towards more complicated ones, using the results already computed.
The simplest example I can think of is computing the Nth Fibonacci number:
int Fibonacci(int N)
{
    var F = new int[N+1];
    F[0] = 1;
    F[1] = 1;
    for (int i = 2; i <= N; i++)
    {
        F[i] = F[i-1] + F[i-2];
    }
    return F[N];
}
I know you can implement the same thing in F#, but I am looking for a nice functional solution (which is O(N) as well obviously).
One technique that is quite useful for dynamic programming is called memoization. For more details see, for example, the blog post by Don Syme or the introduction by Matthew Podwysocki.
The idea is that you write a (naive) recursive function and then add a cache that stores previous results. This lets you write the function in the usual functional style, but get the performance of an algorithm implemented using dynamic programming.
For example, a naive (inefficient) function for calculating Fibonacci number looks like this:
let rec fibs n =
    if n < 2 then 1
    else fibs (n - 1) + fibs (n - 2)
This is inefficient, because when you call fibs 4, it will call fibs 1 three times (and many more repeated calls occur if you call, for example, fibs 6). The idea behind memoization is that we keep a cache storing the results of fibs 1, fibs 2, and so on, so that repeated calls just pick the pre-calculated value from the cache.
A generic function that does the memoization can be written like this:
open System.Collections.Generic

let memoize f =
    // Create a (mutable) cache that stores the results for
    // function arguments that were already calculated.
    let cache = new Dictionary<_, _>()
    (fun x ->
        // The returned function first performs a cache lookup
        let succ, v = cache.TryGetValue(x)
        if succ then v else
        // If the value was not found, calculate it and add it to the cache
        let v = f x
        cache.Add(x, v)
        v)
To write more efficient Fibonacci function, we can now call memoize and give it the function that performs the calculation as an argument:
let rec fibs = memoize (fun n ->
    if n < 2 then 1
    else fibs (n - 1) + fibs (n - 2))
Note that this is a recursive value - the body of the function calls the memoized fibs function.
Tomas's answer is a good general approach. In more specific circumstances, there may be other techniques that work well - for example, in your Fibonacci case you really only need a finite amount of state (the previous 2 numbers), not all of the previously calculated values. Therefore you can do something like this:
let fibs = Seq.unfold (fun (i,j) -> Some(i,(j,i+j))) (1,1)
let fib n = Seq.nth n fibs
You could also do this more directly (without using Seq.unfold):
let fib =
    let rec loop i j = function
        | 0 -> i
        | n -> loop j (i+j) (n-1)
    loop 1 1
let fibs =
    (1I, 1I)
    |> Seq.unfold (fun (n0, n1) -> Some (n0, (n1, n0 + n1)))
    |> Seq.cache
Taking inspiration from Tomas' answer here, and in an attempt to resolve the warning in my comment on said answer, I propose the following updated solution.
open System.Collections.Generic
let fib n =
    let cache = new Dictionary<_, _>()
    let memoize f c =
        let succ, v = cache.TryGetValue c
        if succ then v else
        let v = f c
        cache.Add(c, v)
        v
    let rec inner n =
        match n with
        | 1
        | 2 -> bigint n
        | n -> memoize inner (n - 1) + memoize inner (n - 2)
    inner n
This solution internalizes the memoization, and while doing so, allows the definitions of fib and inner to be functions, instead of fib being a recursive object, which allows the compiler to (I think) properly reason about the viability of the function calls.
I also return a bigint instead of an int, as int quickly overflows for even a modest value of n.
Edit: I should mention, however, that this solution still runs into stack overflow exceptions with sufficiently large values of n.