So this is the DFA in the question that needs to be minimized:
The answer given to this question is this, and as you can see the DFA is now minimized.
My question is: as you can see, the minimized DFA has a state q7 which is unreachable from the start (initial) state. So why are they showing state q7 in the final answer? Shouldn't the unreachable state be removed to make this DFA even more minimal?
If you look carefully, none of the states q4, q5, q6, q7 is reachable from the initial state q0, not just q7, so all four of these states should be removed. My solution would start from q0, q1, q2, q3 and then follow the reduction procedure.
This is what I think the answer should be:
Let's be practical for a moment. Definitions and constructions aside, a minimal DFA corresponding to the given DFA should be a DFA which accepts the same language and has as few states as possible. Any other definition of DFA minimization is not as useful as this one. Given this, the answer to your question is unambiguously that q7 MUST NOT be in the minimized DFA, since a DFA without q7 accepts the same language and has fewer states. We can argue ad infinitum about whether a particular minimization procedure would remove it, but that state must go. Another reason it must go: the Myhill-Nerode theorem tells us that a minimal DFA for this language has exactly as many states as there are equivalence classes of the indistinguishability relation over strings. Because no string leads to q7, there is no equivalence class corresponding to it, and it certainly cannot add a new one.
TL;DR - q7 is not a state in a minimal DFA corresponding to the given DFA. Make of that what you will.
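For what it's worth, the unreachable-state pruning described above can be sketched as a breadth-first search from the start state. The transition table below is hypothetical, standing in for the DFA in the question (its only relevant property is that no transition out of q0..q3 ever targets q4..q7):

```python
from collections import deque

def reachable_states(start, transitions):
    """BFS from the start state; anything not visited is unreachable."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for symbol, nxt in transitions.get(state, {}).items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical transition table: q4..q7 only reach each other,
# so they are invisible from q0.
delta = {
    "q0": {"0": "q1", "1": "q2"},
    "q1": {"0": "q3", "1": "q2"},
    "q2": {"0": "q1", "1": "q3"},
    "q3": {"0": "q3", "1": "q3"},
    "q4": {"0": "q5", "1": "q6"},
    "q5": {"0": "q7", "1": "q4"},
    "q6": {"0": "q4", "1": "q7"},
    "q7": {"0": "q7", "1": "q7"},
}

print(sorted(reachable_states("q0", delta)))  # ['q0', 'q1', 'q2', 'q3']
```

Any state not in the returned set (here q4..q7) can be deleted before, or after, running the usual state-merging procedure; the accepted language is unchanged.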
Related
It is not possible to have multiple start states in a DFA, but how can I achieve the complement operation on a DFA having multiple final states?
I tried complementing the language of a given DFA, but how can multiple final states be converted to multiple starting states?
As pointed out in the comments, finding a DFA for the complement of the language of another DFA is quite straightforward: you simply make all accepting states non-accepting, and vice versa. The resulting DFA may not be minimal for the language it accepts, but it is still a DFA.
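As a sketch (the example DFA is made up: it accepts strings over {0, 1} that end in 1), complementing really is just flipping the accepting set:

```python
# A DFA represented as (states, alphabet, delta, start, accepting).
def complement(dfa):
    states, alphabet, delta, start, accepting = dfa
    # Flip accepting and non-accepting states; everything else is unchanged.
    return (states, alphabet, delta, start, states - accepting)

def accepts(dfa, word):
    _, _, delta, state, accepting = dfa
    for ch in word:
        state = delta[(state, ch)]
    return state in accepting

# Hypothetical DFA accepting strings over {0, 1} that end in 1.
delta = {("e", "0"): "e", ("e", "1"): "f",
         ("f", "0"): "e", ("f", "1"): "f"}
d = ({"e", "f"}, {"0", "1"}, delta, "e", {"f"})

print(accepts(d, "101"))              # True
print(accepts(complement(d), "101"))  # False
```

Note this only works as-is because a DFA's transition function is total; if the DFA had implicit dead states, they would need to be made explicit first, since they become accepting in the complement.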
You might be thinking of how to find a DFA for the reversal of the language of another DFA. This is a little more involved and runs into the issue you describe: by reversing the direction of all transitions, making the initial state accepting, and making accepting states initial, you get an automaton that works, but it has multiple initial states. This is not allowed for DFAs. Happily, we can easily make an NFA-lambda by adding a new state q0' with empty/lambda/epsilon transitions from it to the otherwise-initial states. Now we have an NFA-lambda for the reversal of the language, and we can obtain a DFA by determinizing it using a standard algorithm.
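A sketch of the whole reversal construction, under the simplification that q0' and its epsilon transitions are modelled by starting the subset construction directly from the set of old accepting states (equivalent to taking the epsilon closure of q0'). The example DFA, which accepts strings over {a, b} ending in "ab", is hypothetical:

```python
from itertools import chain

def reverse_and_determinize(states, alphabet, delta, start, accepting):
    """Reverse a DFA's transitions and determinize by subset construction."""
    # Reversed transitions: if delta[(p, a)] == q, the NFA has q --a--> p.
    rev = {}
    for (p, a), q in delta.items():
        rev.setdefault((q, a), set()).add(p)
    # Old accepting states become initial: start from all of them at once.
    start_set = frozenset(accepting)
    dfa_delta, todo, seen = {}, [start_set], {start_set}
    while todo:
        subset = todo.pop()
        for a in alphabet:
            target = frozenset(chain.from_iterable(
                rev.get((q, a), ()) for q in subset))
            dfa_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # A subset is accepting iff it contains the old start state.
    dfa_accepting = {s for s in seen if start in s}
    return dfa_delta, start_set, dfa_accepting

# Hypothetical DFA accepting strings over {a, b} that end in "ab".
delta = {("0", "a"): "1", ("0", "b"): "0",
         ("1", "a"): "1", ("1", "b"): "2",
         ("2", "a"): "1", ("2", "b"): "0"}
d, s0, acc = reverse_and_determinize(
    {"0", "1", "2"}, {"a", "b"}, delta, "0", {"2"})

def accepts(word):
    state = s0
    for ch in word:
        state = d[(state, ch)]
    return state in acc

print(accepts("ba"))   # True: reversal of "ab"
print(accepts("ab"))   # False: not in the reversed language
```

The reversed language here is "strings starting with ba", exactly the reversals of the strings the original DFA accepts.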
I am stuck on a problem from the famous Dragon Book of compiler design. How do I find all the viable prefixes of the following grammar:
S -> 0S1 | 01
The grammar is actually the language of the regex 0^n1^n.
I presume the set of all viable prefixes might come out as a regex too. I came up with the following solution:
0+
0+S
0+S1
0+1
S
(By plus, I mean the number of zeroes is 1..inf)
after reducing the string 000111 with the following steps:
stack    input
         000111
0        00111
00       0111
000      111
0001     11
00S      11
00S1     1
0S       1
0S1      $
S        $
Is my solution correct, or am I missing something?
0^n1^n is not a regular language; regexen don't have variables like n and they cannot enforce an equal number of repetitions of two distinct subsequences. Nonetheless, for any context-free grammar, the set of viable prefixes is a regular language. (A proof of this fact, in some form, appears at the beginning of Part II of Donald Knuth's seminal 1965 paper, On the Translation of Languages from Left to Right, which demonstrated both a test for the LR(k) property and an algorithm for parsing LR(k) grammars in linear time.)
OK, to the actual question. A viable prefix for a grammar is (by definition) the prefix of a sentential form which can appear on the stack during a parse using that grammar. It's called "viable" (which means "still alive" or "could continue growing") precisely because it must be the prefix of some right sentential form whose suffix contains no non-terminal symbol. In other words, there exists a sequence of terminals which can be appended to the viable prefix in order to produce a right-sentential form; the viable prefix can grow.
Knuth shows how to create a DFA which produces all viable prefixes, but it's easier to see this DFA if we already have the LR(k) parser produced by an LR(k) algorithm. That parser is a finite-state machine whose alphabet is the set of terminal and non-terminal symbols of a grammar. To get the viable-prefix grammar, we use exactly the same state machine, but we remove the stack (so that it becomes just a state machine) and the reduce actions, leaving only the shift and goto actions as transitions. All states in the viable-prefix machine are accepting states, since any prefix of a viable prefix is itself a viable prefix.
A key feature of this new automaton is that it cannot extend a prefix with a reduce action (since we removed all the reduce actions). A prefix with a reduce action is a prefix which ends in a handle -- recall that a handle is the right-hand side of some production -- so another definition of a viable prefix is that it is a right-sentential form (that is, a possible step in a derivation) which does not extend beyond the right-most handle.
The grammar you are working with has only two productions, so there are only two handles, 01 and 0S1. Note that 10 and 1S cannot be subsequences of any right-sentential form, nor can a right-sentential form contain more than one S. Any right-sentential form must be either a sentence 0^n1^n or a sentential form 0^nS1^n where n > 0. But every handle ends at the first 1 of a sentential form, and so a viable prefix must end at or before the first 1. This produces precisely the five possibilities you list, which we can condense to the regular expression S|0+(S1?|1)?.
Chopping off the suffix removed the second n from the formula, so there is no longer a requirement of concordance and the language is regular.
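As a quick sanity check, here is a regular expression written to cover the prefixes listed in the question (S, 0+, 0+S, 0+S1, 0+1; the empty prefix is trivially viable too and is omitted here), tested against every stack content from the reduction of 000111 above:

```python
import re

# Covers S, 0+, 0+1, 0+S, 0+S1 -- the listed viable prefixes.
VIABLE = re.compile(r"S|0+(S1?|1)?")

# Every non-empty stack content from the reduction of 000111
# in the question should be a viable prefix.
stacks = ["0", "00", "000", "0001", "00S", "00S1", "0S", "0S1", "S"]
for s in stacks:
    assert VIABLE.fullmatch(s), s

# Anything extending beyond the right-most handle is not viable:
assert not VIABLE.fullmatch("0S11")
assert not VIABLE.fullmatch("0011")
print("all stack contents are viable prefixes")
```

This is only a check of the characterization, of course, not a derivation of it; the derivation is the argument about handles above.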
Note:
Questions like this and their answers are begging to be rendered using MathJax. StackOverflow, unfortunately, does not provide this extension, which is apparently considered unnecessary for programming. However, there is a site in the StackExchange constellation dedicated to computing science questions, http://cs.stackexchange.com, and another one dedicated to mathematical questions, http://math.stackexchange.com. Formal language theory is part of both computing science and mathematics. Both of those sites permit MathJax, and questions on those sites will not be closed because they are not programming questions. I suggest you take this information into account for questions like this one.
Can we merge or combine all of the final states of a DFA which has more than one final state? Will it produce another equivalent DFA?
So far, I have only figured out that in some cases, merging all the final states of a DFA produces an NFA which may be equivalent to the original DFA.
THANK YOU
You may only merge/combine states which are equivalent. States are equivalent if the language recognized from them is identical. The language recognized from a state is the set of strings which lead from that state to a stop state.
Consider the regular expression a+|b. After the sequence aa the DFA is in a stop state; call it s1. It must have an outgoing transition on a to another stop state, which may be itself.
On the other hand, on the input b, the DFA is also in a stop state; call it s2. This cannot have any outgoing transition which will ever end up in a stop state, because otherwise some string starting with b would be recognized, which is not permitted by a+|b.
Consequently s1 and s2 are not equivalent and cannot be merged.
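This can be checked concretely with a hypothetical transition table for a DFA of a+|b (the state names s1 and s2 follow the discussion above; "dead" is an assumed trap state):

```python
# Hypothetical DFA for a+|b over the alphabet {a, b}.
delta = {("q0", "a"): "s1", ("q0", "b"): "s2",
         ("s1", "a"): "s1", ("s1", "b"): "dead",
         ("s2", "a"): "dead", ("s2", "b"): "dead",
         ("dead", "a"): "dead", ("dead", "b"): "dead"}
accepting = {"s1", "s2"}

def accepts_from(state, word):
    """Does `word` lead from `state` to a stop state?"""
    for ch in word:
        state = delta[(state, ch)]
    return state in accepting

# s1 and s2 are both stop states, but their residual languages differ:
print(accepts_from("s1", "a"))  # True:  "aa" is in a+|b
print(accepts_from("s2", "a"))  # False: "ba" is not
```

Since there is a string ("a") accepted from s1 but not from s2, the two states recognize different languages and merging them would change the accepted language.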
You noticed correctly, however, that we can always add an epsilon transition from all stop states to a unique, new stop state. But the result is an NFA, not a DFA anymore.
I am confused about the implementation of a language by an automaton. Does the automaton go directly to the next state if there is a ɛ-transition? Suppose I have an automaton consisting of three states a, b, and c (where a is initial state and c the accepting state) with alphabet {0,1}. How does the following work?
a ----ɛ---> b,   b ----0----> a,   b ----1----> c
Is the string "1" accepted? What if we had
a---1--->b----ɛ--->c
? Would the string "1" be accepted?
Does the automaton go directly to the next state if there is an ɛ-transition?
Roughly speaking, yes. An ɛ-transition (in a non-deterministic finite automaton, or NFA, for short) is a transition that is not associated with the consumption of any symbol (0 or 1, in this case). Once you understand that, it's easy (in this case) to derive deterministic finite automata (or DFA, for short) that are equivalent to your NFAs and identify the languages that the latter describe.
Suppose I have an automaton [...] Is the string "1" accepted?
Yes. Here is a nicer diagram (courtesy of LaTeX and TikZ) of your first NFA:
An equivalent DFA would be:
Once you have that, it's easy to see that the language accepted by that NFA is the set of strings that
start with zero or more 0's,
end with exactly one 1.
The string "1", because it starts with zero 0's and ends with one 1, is indeed accepted.
What if we had [...]? Would the string "1" be accepted?
Yes. Here is a nicer diagram of your second NFA:
An equivalent DFA would be:
In fact, it's easy to see that "1" is the only accepted string, here.
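Both answers can be checked with a small ε-NFA simulator (a sketch; the state names and transitions are taken from the two automata in the question):

```python
def eps_closure(states, eps):
    """All states reachable from `states` via ε-transitions alone."""
    closure, todo = set(states), list(states)
    while todo:
        for nxt in eps.get(todo.pop(), ()):
            if nxt not in closure:
                closure.add(nxt)
                todo.append(nxt)
    return closure

def nfa_accepts(word, start, accept, delta, eps):
    current = eps_closure({start}, eps)
    for ch in word:
        moved = set()
        for state in current:
            moved |= set(delta.get((state, ch), ()))
        current = eps_closure(moved, eps)
    return accept in current

# First NFA: a --ε--> b, b --0--> a, b --1--> c (language 0*1)
nfa1 = ({("b", "0"): ["a"], ("b", "1"): ["c"]}, {"a": ["b"]})
# Second NFA: a --1--> b, b --ε--> c (language is just "1")
nfa2 = ({("a", "1"): ["b"]}, {"b": ["c"]})

print(nfa_accepts("1", "a", "c", *nfa1))   # True
print(nfa_accepts("01", "a", "c", *nfa1))  # True: 0*1
print(nfa_accepts("1", "a", "c", *nfa2))   # True
print(nfa_accepts("01", "a", "c", *nfa2))  # False
```

The key step is taking the ε-closure both of the start state and after every symbol consumed, which is exactly the "go directly to the next state" behaviour the question asks about.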
Is a theoretical LR parser with infinite lookahead capable of parsing (unambiguous) languages which can be described by a context-free grammar?
Normally LR(k) parsers are restricted to deterministic context-free languages. I think this means that there always has to be exactly one grammar rule that can be applied currently; that is, within the current lookahead context, no more than one possible way of parsing is allowed to occur. The book "Language Implementation Patterns" states that a "... parser is nondeterministic - it cannot determine which alternative to choose." if the lookahead sets overlap. In contrast, a non-deterministic parser simply chooses one alternative when several are possible; if it later becomes impossible to continue with that choice, it goes back to the decision point and tries the next alternative.
Wherever I read definitions of LR(k) parsers (like on Wikipedia or in the Dragon Book) I always read something like: "k is the number of lookahead tokens" or cases when "k > 1" but never if k can be infinite. Wouldn't an infinite lookahead be the same as trying all alternatives until one succeeds?
Could it be that k is assumed to be finite in order to (implicitly) distinguish LR(k) parsers from non-deterministic parsers?
You are raising several issues here that are difficult to answer briefly. Nevertheless, I will try.
First of all, what is "infinite lookahead"? No book describes such a parser. If you have a clear idea of what it is, you need to describe it first; only then can we discuss the topic. For now, parsing theory discusses only LR(k) grammars, where k is finite.
Normally LR(k) parsers are restricted to deterministic context-free languages. I think this means that there always has to be exactly one grammar rule that can be applied currently.
This is wrong. LR(k) grammars may have "grammar conflicts". The Dragon Book briefly mentions them without going into any detail. "Grammars may have conflicts" means that some grammars do not have conflicts, while others do. When a grammar does not have conflicts, there is ALWAYS only one applicable rule and the situation is relatively simple.
When a grammar has conflicts, this means that in certain cases more than one rule is applicable. Classic parsers cannot help here. What makes matters worse is that some input statement may have a set of correct parses, not just one. From the standpoint of grammar theory, all these parses have the same value and importance.
The book "Language Implementation Patterns" states that a "... parser is nondeterministic - it cannot determine which alternative to choose."
I have the impression that there is no definitive agreement on what "nondeterministic parser" means. I would tend to say that a nondeterministic parser just picks one of the alternatives randomly and goes on.
In practice, only two strategies for resolving conflicts are used. The first is conflict resolution in a callback handler. The callback handler is ordinary code: the programmer who writes it checks whatever he wants in any way he wants, and the code simply returns the result - which action to take. For the parser on top, this callback handler is a black box. There is no theory here.
The second approach is called "backtracking". The idea behind it is very simple: we do not know where to go, so let's try all possible alternatives. In this case all variants are tried; there is nothing nondeterministic here. There are several different flavors of backtracking.
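As an illustration of the backtracking idea, here is a sketch of a backtracking recursive-descent recognizer for the grammar S -> 0S1 | 01 from the earlier question: each alternative is tried in turn, and a failed alternative simply rewinds to the saved input position.

```python
def parse_S(s, i):
    """Try to match S starting at position i; return the position
    after the match, or None if neither alternative works."""
    # Alternative 1: 0 S 1
    if i < len(s) and s[i] == "0":
        j = parse_S(s, i + 1)
        if j is not None and j < len(s) and s[j] == "1":
            return j + 1
    # Alternative 2, tried from the same saved position i: 0 1
    if i + 1 < len(s) and s[i] == "0" and s[i + 1] == "1":
        return i + 2
    return None  # both alternatives failed

def recognizes(s):
    return parse_S(s, 0) == len(s)

print(recognizes("000111"))  # True
print(recognizes("0011"))    # True
print(recognizes("0101"))    # False
```

Because the recursion saves nothing but the input position, "rewinding" is free here; real backtracking parsers also have to undo any semantic actions performed along the abandoned path.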
If this is not enough I can write a little bit more.
Nondeterminism means that in order to produce the correct result(s!), a finite state machine reads a token and then has N > 1 next states. You can recognize a nondeterministic FSM by a node having more than one outgoing edge with the same label. Note that not every branch has to be valid, but the FSM can't pick just one. In practice you could fork here, resulting in N state machines, or you could try one branch completely and then come back and try the next until every outgoing state transition has been tested.
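The fork-into-N-machines idea can equivalently be simulated by tracking the set of live states after each token. The tiny NFA below is hypothetical: q0 has two outgoing a-edges (the nondeterministic choice), only one of which leads anywhere useful.

```python
# q0 --a--> q1 and q0 --a--> q2 (same label, two branches);
# q1 --b--> q3, with q3 the accepting state.
edges = {("q0", "a"): {"q1", "q2"}, ("q1", "b"): {"q3"}}

def run(word, start="q0", accepting=frozenset({"q3"})):
    live = {start}
    for ch in word:
        # Follow every matching edge from every live state at once.
        live = set().union(*(edges.get((s, ch), set()) for s in live))
    return bool(live & accepting)

print(run("ab"))  # True:  the q0 -> q1 -> q3 branch survives
print(run("a"))   # False: neither branch has reached q3 yet
```

Tracking the set of live states is exactly what the subset construction bakes into a DFA ahead of time; simulating it directly just does the same work at run time.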