I have to convert a given PDA into Turing machine form, and am struggling to find any resources on how to do this. If someone could explain or point me to some resource that could explain how to do that it would be much appreciated.
As Welbog suggests in comments, just use the tape like a stack:
when the PDA would have pushed onto the stack:
leave the current tape symbol alone
move the tape head to the right
write the symbol to the tape and stay in the same position
when the PDA would have popped from the stack:
replace the current symbol with blank
move the tape head to the left
Notice that this construction means that every "push" transition in your PDA will need an extra intermediate state in the TM to handle the two-step move-then-write operation. You could also structure this differently; for example, you could put the extra step on the pop operation instead.
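To make the construction concrete, here is a minimal sketch in Python of the tape-as-stack idea. The `Tape` class and its method names are illustrative only; in a real TM each two-step `push` would be expressed with an extra intermediate state.

```python
# Simulating a PDA's stack on a one-way-infinite TM tape (sketch).
# The class and method names are illustrative, not a standard construction.

BLANK = "_"

class Tape:
    def __init__(self):
        self.cells = [BLANK]   # cell 0 marks the bottom of the "stack"
        self.head = 0

    def push(self, symbol):
        # Two TM steps: move right first (this is where the extra
        # intermediate state is needed), then write and stay put.
        self.head += 1
        if self.head == len(self.cells):
            self.cells.append(BLANK)
        self.cells[self.head] = symbol

    def pop(self):
        # Two TM steps: replace the current symbol with blank, then move left.
        symbol = self.cells[self.head]
        self.cells[self.head] = BLANK
        self.head -= 1
        return symbol

tape = Tape()
tape.push("A")
tape.push("B")
print(tape.pop(), tape.pop())   # B A -- last in, first out, like a stack
```

The tape head always sits on the top of the simulated stack, so the TM can read the "stack top" exactly the way the PDA would.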
Left recursion will make the parser go into an infinite loop. So why does the same not happen with right recursion?
In a recursive descent parser, a grammar rule like A -> B C | D is implemented by trying to parse B at the current position and then, if that succeeds, trying to parse C at the position where B ended. If either fails, we try to parse D at the current position¹.
If C is equal to A (right recursion) that's okay. That simply means that if B succeeds, we try to parse A at the position after B, which means that we try to first parse B there and then either try A again at a new position or try D. This will continue until finally B fails and we try D.
If B is equal to A (left recursion), however, that is very much a problem. Because now to parse A, we first try to parse A at the current position, which tries to parse A at the current position ... ad infinitum. We never advance our position and never try anything except A (which just keeps trying itself), so we never get to a point where we might terminate.
¹ Assuming full back tracking. Otherwise A might fail without trying D if B and/or C consumed any tokens (or more tokens than we've got lookahead), but none of this matters for this discussion.
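A minimal sketch of this behaviour, assuming a backtracking recursive-descent parser for the toy grammars A -> 'b' A | 'd' (right-recursive) and A -> A 'b' | 'd' (left-recursive). The function names and the depth guard are mine; the guard only exists so the demonstration terminates instead of overflowing the call stack.

```python
# Right-recursive: A -> 'b' A | 'd'. Returns the position after a
# successful parse of A starting at i, or None on failure.
def parse_right(s, i=0):
    if i < len(s) and s[i] == "b":          # alternative: 'b' A
        r = parse_right(s, i + 1)           # recurse at a NEW position
        if r is not None:
            return r
    if i < len(s) and s[i] == "d":          # alternative: 'd'
        return i + 1
    return None

# Left-recursive: A -> A 'b' | 'd'. The first alternative recurses at
# the SAME position, so the 'd' alternative is never even reached.
def parse_left(s, i=0, depth=0):
    if depth > 50:                          # guard so the demo terminates
        raise RecursionError("left recursion never advances the input")
    j = parse_left(s, i, depth + 1)         # alternative: A 'b'
    if j is not None and j < len(s) and s[j] == "b":
        return j + 1
    if i < len(s) and s[i] == "d":          # alternative: 'd' (unreachable)
        return i + 1
    return None

print(parse_right("bbd"))  # 3: the whole input was consumed
```

Calling `parse_left("d")` hits the depth guard immediately: the position never advances, exactly as described above.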
If you're puzzled by the lack of symmetry, another way of looking at this is that left recursion causes problems for recursive descent parsers because we typically parse languages from left to right. That means that if a rule is left-recursive, then the recursive symbol is the first one that's tried, and it's tried in the same state as the parent rule, guaranteeing that the recursion will continue infinitely.
If you really think about it, there's no fundamental reason why we parse languages from left to right; it's just a convention! (There are reasons, such as that it's faster to read files from disk that way; but those are consequences of the convention.) If you wrote a right-to-left recursive descent parser, which started at the end of the file and consumed characters from the end first, working backwards to the beginning of the file, then right recursion is what would cause problems, and you would need to rewrite your right-recursive grammars to be left-recursive before you could parse them. That's because if you're handling the right symbol first, then it's the one that's parsed in the same state as the parent.
So there you are; symmetry is preserved. Just as left-to-right recursive descent parsers struggle with left recursion, similarly right-to-left recursive descent parsers struggle with right recursion.
I was going through the text Compilers: Principles, Techniques and Tools by Ullman et al., where I came across an excerpt in which the authors try to justify why a stack is the best data structure for shift-reduce parsing. They said that this is because
"The handle will always eventually appear on top of the stack, never inside."
The Excerpt
This fact becomes obvious when we consider the possible forms of two successive steps in any rightmost derivation. These two steps can be of the form

(1) S ⇒* αAz ⇒ αβByz ⇒ αβγyz
(2) S ⇒* αBxAz ⇒ αBxyz ⇒ αγxyz

In case (1), A is replaced by βBy, and then the rightmost nonterminal B in that right side is replaced by γ. In case (2), A is again replaced first, but this time the right side is a string y of terminals only. The next rightmost nonterminal B will be somewhere to the left of y.
Let us consider case (1) in reverse, where a shift-reduce parser has just reached the configuration

(stack αβγ, input yz$)

The parser now reduces the handle γ to B to reach the configuration

(stack αβB, input yz$)

The parser can then shift y onto the stack, reaching the configuration

(stack αβBy, input z$)

in which βBy is the handle, and it gets reduced to A.
In case (2), in configuration

(stack αγ, input xyz$)

the handle γ is on top of the stack. After reducing the handle γ to B, the parser can shift the string xy to get the next handle y on top of the stack:

(stack αBxy, input z$)

Now the parser reduces y to A.
In both cases, after making a reduction the parser had to shift zero or more symbols to get the next handle onto the stack. It never had to go into the stack to find the handle. It is this aspect of handle pruning that makes a stack a particularly convenient data structure for implementing a shift-reduce parser.
My reasoning and doubts
Intuitively, this is how I feel the statement in the excerpt can be justified:
If there is a handle on top of the stack, then the algorithm will first reduce it before pushing the next input symbol onto the stack. Since any possible handle is reduced before the push, there is no chance of a handle sitting on top of the stack, a new input symbol being pushed, and the handle thereby ending up inside the stack.
Moreover, I could not understand the logic the authors give in the highlighted portion of the excerpt to justify that the handle cannot occur inside the stack, based on what they say about B and the other facts related to it.
Can anyone please help me understand the concept?
The key to the logic expressed by the authors is in the statement at the beginning (emphasis added):
This fact becomes obvious when we consider the possible forms of two successive steps in any rightmost derivation.
It's also important to remember that a bottom-up parser traces out a right-most derivation backwards. Each reduction performed by the parser is a step in the derivation; since the derivation is rightmost the non-terminal being replaced in the derivation step must be the last non-terminal in the sentential form. So if we write down the sequence of reduction actions used by the parser and then read the list backwards, we get the derivation. Alternatively, if we write down the list of productions used in the rightmost derivation and then read it backwards, we get the sequence of parser reductions.
Either way, the point is to prove that the successive handles in the derivation steps correspond to monotonically non-retreating prefixes in the original input. The authors' proof takes two derivation steps (any two derivation steps) and shows that the end of the handle of the second derivation step is not before the end of the handle of the first step (although the ends of the two handles may be at the same point in the input).
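To see this with a concrete grammar, here is a small greedy shift-reduce trace for E -> E + T | T, T -> id. It is a sketch, not the authors' construction: greedy reduce-first happens to work for this toy grammar, whereas a real parser would consult an LR table. What it illustrates is that every reduction matches a suffix of the stack, i.e. the handle is always on top.

```python
# Greedy shift-reduce sketch for:  E -> E + T | T ;  T -> id
# Every reduction matches a SUFFIX of the stack -- the handle is on top,
# and the parser never digs inside the stack to find it.

RULES = [("E", ["E", "+", "T"]),
         ("T", ["id"]),
         ("E", ["T"])]

def parse(tokens):
    tokens = list(tokens)
    stack, trace = [], []
    while True:
        for lhs, rhs in RULES:
            if stack[-len(rhs):] == rhs:          # handle at top of stack?
                trace.append(f"reduce {lhs} -> {' '.join(rhs)}  stack={stack}")
                del stack[len(stack) - len(rhs):]
                stack.append(lhs)
                break
        else:
            if not tokens:
                break
            stack.append(tokens.pop(0))
            trace.append(f"shift {stack[-1]!r}  stack={stack}")
    return stack, trace

stack, trace = parse(["id", "+", "id"])
print(stack)   # ['E'] -- reduced all the way to the start symbol
```

Reading the reduce steps of the trace backwards gives the rightmost derivation E ⇒ E + T ⇒ E + id ⇒ T + id ⇒ id + id, with each handle's right end never retreating in the input.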
I'm having trouble understanding the difference between an NPDA and a DPDA.
I think it goes like this:
NPDA- from a state, multiple choices can be taken to get to the next state
DPDA- from a state, only 1 path can be taken to the next state
But there are 2 rules concerning DPDAs that I can't get a black-and-white understanding of. Per Wikipedia, a PDA is deterministic if:

1. δ(q, a, x) has at most one element, for every state q, input symbol (or ε) a, and stack symbol x
2. whenever δ(q, ε, x) is nonempty, δ(q, a, x) is empty for every input symbol a

For the first rule: q is a state, a is an alphabet symbol, x is a stack symbol. What does "has at most one element" mean?
I have no idea what the second rule means.
Could someone translate this into plain English, please? I'd be grateful.
For the first rule, "has at most one element" means there is at most one delta transition for a particular state, input symbol, and stack symbol (delta transitions are formally treated as a set for PDAs). In other words, if there is something on top of the stack and input coming in, there are never multiple states to go to.
As you stated, "DPDA- from a state, only 1 path can be taken to the next state." This rule is the formal way of denoting that only one path may be taken.
A violation of rule 1 might specify two delta transitions at the same state with the same input symbol and stack symbol. For example, there could be two transitions from a state q that each require a b on the stack and an a as input, but go to different states. This would not be a DPDA.
The second rule states that if there are delta transitions for the empty string under a stack symbol, then there are no delta transitions for any letter of the alphabet under that stack symbol.
This means that if you allow a particular stack symbol to be popped at a state without any input, you cannot allow it to also be popped at that same state with input.
A violation of rule 2 might allow popping a's from the stack without any input between 2 states, but also allow popping a's from the stack with a b as input.
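Both rules can be checked mechanically. A sketch, assuming a PDA encoded as a dict `delta` mapping (state, input symbol or `None` for the empty string, stack symbol) to a set of moves; the encoding and names are mine, not from any textbook or library:

```python
# Checking the two DPDA conditions on a transition function (sketch).

def is_deterministic(delta):
    for (q, a, x), moves in delta.items():
        # Rule 1: at most one move per (state, input, stack symbol).
        if len(moves) > 1:
            return False
        # Rule 2: an empty-string move on (q, x) forbids any move on
        # (q, x) that reads a real input symbol.
        if a is None and any(
            k[0] == q and k[2] == x and k[1] is not None for k in delta
        ):
            return False
    return True

# At most one move per situation: deterministic.
ok = {("q0", "a", "Z"): {("q1", "AZ")}}
# An epsilon move and an input-reading move on the same (state, stack
# symbol): violates rule 2.
bad = {("q0", None, "Z"): {("q1", "Z")},
       ("q0", "b", "Z"): {("q2", "Z")}}
print(is_deterministic(ok), is_deterministic(bad))   # True False
```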
I use the Bison output file to analyze the state-machine transitions of the parser. I find that when the parser reduces by a rule, it goes back to a previous state, but sometimes it goes one state back, and sometimes two or three states back. Can anyone tell me what rule determines which state the machine goes back to after finishing a reduction?
Thanks in advance.
When an LR(k) machine performs a reduction, it pops the right-hand side of the production off the parser stack, revealing the state in which parsing of the production started. It then looks up the reduced non-terminal in the GOTO table for that state.
So the number of entries popped off the parser stack will be the number of symbols on the right-hand side of the reduced production. (In theory, an LR parser could optimize by not pushing all symbols onto the stack, which would allow it to pop fewer symbols off the stack. But as far as I know, bison doesn't do this particular optimization, because it would dramatically complicate the interface.)
As far as I understand, the FOLLOW set is there to tell me at the first possible moment if there is an error in the input stream. Is that right?
Because otherwise I'm wondering what you actually need it for. Suppose your parser has a non-terminal on top of the stack (in our class we used a stack as an abstraction for LL parsers),
i.e.
[TOP] X...[BOTTOM]
The X - let it be a non-terminal - is to be replaced in the next step since it is at the top of the stack. Therefore the parser asks the parsing table what derivation to use for X. Consider the input is
+ b
Where + and b are both terminals.
Suppose X has "", i.e. the empty string, in its FIRST set, and it has NO + in its FIRST set.
As far as I see it, in this situation the parser could simply check that there is no + in the FIRST set of X and then use the derivation which lets X dissolve into "", i.e. the empty string, since that is the only way the parser could possibly continue parsing the input without throwing an error. If the input stream is invalid, the parser will recognize that at some later moment anyway. I understand that the FOLLOW set can help here to identify right away whether parsing can continue without an error or not.
My question is - is that really the only role that the FOLLOW set plays?
I hope my question belongs here - I'm sorry if not. Also feel free to request clarification in case something is unclear.
Thank you in advance
You are right about that. The parser could just continue parsing and would eventually discover the error in another way.
Besides that, the FOLLOW set can be very convenient for reasoning about a grammar: not by the parser, but by the people who construct the grammar. For instance, if you discover that a FIRST/FIRST or FIRST/FOLLOW conflict exists, the grammar is not LL(1), and you may want to revise it.
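A small sketch that computes FIRST and FOLLOW for the kind of situation in the question, using the made-up grammar S -> X '+' 'b', X -> 'b' | ε (all names and the encoding are mine). With + in FOLLOW(X), a table-driven LL(1) parser can choose X -> ε on input + directly, and the check at the end is the FIRST/FOLLOW conflict test for the nullable nonterminal X.

```python
# FIRST/FOLLOW computation for a tiny LL(1) grammar (illustrative).
EPS = ""           # stands for the empty string
END = "$"          # end-of-input marker

GRAMMAR = {        # S -> X '+' 'b' ;  X -> 'b' | epsilon
    "S": [["X", "+", "b"]],
    "X": [["b"], [EPS]],
}
NONTERMS = set(GRAMMAR)

def first_seq(seq, firsts):
    """FIRST of a sequence of grammar symbols."""
    out = set()
    for sym in seq:
        f = firsts[sym] if sym in NONTERMS else {sym}
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)   # every symbol in the sequence can vanish
    return out

def compute_first():
    firsts = {nt: set() for nt in NONTERMS}
    changed = True
    while changed:
        changed = False
        for nt, prods in GRAMMAR.items():
            for prod in prods:
                new = first_seq([s for s in prod if s != EPS], firsts)
                if not new <= firsts[nt]:
                    firsts[nt] |= new
                    changed = True
    return firsts

def compute_follow(firsts, start="S"):
    follows = {nt: set() for nt in NONTERMS}
    follows[start].add(END)
    changed = True
    while changed:
        changed = False
        for nt, prods in GRAMMAR.items():
            for prod in prods:
                syms = [s for s in prod if s != EPS]
                for i, sym in enumerate(syms):
                    if sym not in NONTERMS:
                        continue
                    tail = first_seq(syms[i + 1:], firsts)
                    add = tail - {EPS}
                    if EPS in tail:            # sym can end the production
                        add |= follows[nt]
                    if not add <= follows[sym]:
                        follows[sym] |= add
                        changed = True
    return follows

firsts = compute_first()
follows = compute_follow(firsts)
print(follows["X"])                            # {'+'}
# LL(1) condition for nullable X: FIRST(X) and FOLLOW(X) must not overlap.
assert not (firsts["X"] - {EPS}) & follows["X"]
```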