I'm trying to build a deterministic finite automaton (DFA) for this formal language:
L = {w | w ∈ Σ*0100} ∩ {w | w ∉ Σ*11Σ*}
Any help building the automaton would be appreciated.
Your language consists of all strings that end in 0100 and do not contain 11. The following automaton accepts that language.
Explanation:
State e is the null (dead) state. If the automaton encounters two consecutive 1s, it moves to the null state and stays in that non-accepting state no matter what comes next.
Otherwise it searches for 0100, and when it has just read 0100 it is in the accepting state d.
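The same idea can be checked in code. This is only a sketch, not the automaton from the diagram: the state encoding (how much of 0100 has been matched, plus whether the last symbol was a 1) and the names `PATTERN`, `DEAD`, `step` and `accepts` are my own assumptions for illustration.

```python
PATTERN = "0100"
DEAD = None  # hypothetical name for the null state: "11" has been seen


def next_prefix_len(k, ch):
    """Length of the longest suffix of PATTERN[:k] + ch that is a prefix of PATTERN."""
    s = PATTERN[:k] + ch
    while s and not PATTERN.startswith(s):
        s = s[1:]
    return len(s)


def step(state, ch):
    """One DFA transition; a live state is (matched_length, last_symbol_was_1)."""
    if state is DEAD:
        return DEAD  # the null state is never left
    k, last_was_one = state
    if ch == "1" and last_was_one:
        return DEAD  # two consecutive 1s
    return (next_prefix_len(k, ch), ch == "1")


def accepts(w):
    """True if w ends in 0100 and does not contain 11."""
    state = (0, False)
    for ch in w:
        state = step(state, ch)
    return state is not DEAD and state[0] == len(PATTERN)
```

For example, `accepts("10100")` is true, while `accepts("110100")` is false because the run gets stuck in the null state after the first two symbols.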
This is the DFA from the question, which needs to be minimized:
This is the given answer, and as you can see the DFA is now minimized:
My question is: the minimized DFA has a state q7 that is unreachable from the initial state. Why is q7 shown in the final answer? Shouldn't unreachable states be removed to make this DFA even smaller?
If you look carefully, none of the states q4, q5, q6, q7 is reachable from the initial state q0, not just q7, so all four of these states should be removed. My solution would start from q0, q1, q2, q3 only, and then follow the reduction procedure.
This is what I think the answer should be:
Let's be practical for a moment. Definitions and constructions aside, a minimal DFA corresponding to the given DFA should be a DFA that accepts the same language and has as few states as possible. Any other definition of DFA minimization is not as useful as this one. Given this, the answer to your question is unambiguous: q7 MUST NOT be in the minimized DFA, since a DFA without q7 accepts the same language and has fewer states. We can argue ad infinitum about whether a particular minimization procedure would remove it, but really that state must go. Another reason it must go is the Myhill-Nerode theorem, which tells us that a minimal DFA for this language has exactly as many states as the indistinguishability relation has equivalence classes. Because no string leads to q7, q7 corresponds to no equivalence class at all, and it certainly cannot add a new one.
TL;DR - q7 is not a state in a minimal DFA corresponding to the given DFA. Make of that what you will.
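Removing unreachable states is easy to do mechanically before (or after) merging equivalent states. Here is a minimal sketch; the transition table `DELTA` is a made-up example and NOT the DFA from the question (the image is not reproduced here), and the function names are my own.

```python
from collections import deque


def reachable_states(delta, start):
    """Breadth-first search over the transition table from the start state."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for (src, _symbol), dst in delta.items():
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return seen


def prune(delta, start, finals):
    """Drop every state (and its outgoing transitions) that no input can reach."""
    keep = reachable_states(delta, start)
    pruned = {(s, a): t for (s, a), t in delta.items() if s in keep}
    return pruned, finals & keep


# Hypothetical DFA: q7 has outgoing transitions but nothing leads into it.
DELTA = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q0",
    ("q7", "0"): "q0", ("q7", "1"): "q1",
}
```

Calling `prune(DELTA, "q0", {"q1", "q7"})` keeps only q0 and q1. The accepted language is unchanged, because no run on any input ever visits q7.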
Can we merge or combine all of the final states of a DFA that has more than one final state? Will it produce another equivalent DFA?
So far, I have only figured out that in some cases, merging all the final states of a DFA produces an NFA which may be equivalent to the original DFA.
THANK YOU
You may only merge/combine states which are equivalent. Two states are equivalent if the languages they recognize are identical. The language recognized by a state is the set of strings which lead from that state to a stop state.
Consider the regular expression a+|b. After the sequence aa the DFA is in a stop state, call it s1. It must have an outgoing transition on a to another stop state, which may be itself.
On the other hand, on the input b, the DFA is also in a stop state, call it s2. This state cannot have any outgoing transition which will ever end up in a stop state, because otherwise some longer string starting with b would be recognized, which is not permitted by a+|b.
Consequently s1 and s2 are not equivalent and cannot be merged.
You noticed correctly, however, that we can always add an epsilon transition from all stop states to a unique, new stop state. But the result is an NFA, not a DFA anymore.
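To make the a+|b example concrete, here is a small sketch that merges s1 and s2 by brute force and shows that the language changes. The state names `q0`, `dead` and `f` and the helper names are assumptions of mine; only s1 and s2 come from the text above.

```python
# Complete DFA for a+|b: s1 is reached after a+, s2 after b.
DFA = {
    ("q0", "a"): "s1", ("q0", "b"): "s2",
    ("s1", "a"): "s1", ("s1", "b"): "dead",
    ("s2", "a"): "dead", ("s2", "b"): "dead",
    ("dead", "a"): "dead", ("dead", "b"): "dead",
}
FINALS = {"s1", "s2"}


def dfa_accepts(w):
    state = "q0"
    for ch in w:
        state = DFA[(state, ch)]
    return state in FINALS


def merged_accepts(w):
    """Fuse s1 and s2 into one state f; the result is an NFA, simulated with state sets."""
    def rename(s):
        return "f" if s in FINALS else s

    nfa = {}
    for (src, ch), dst in DFA.items():
        nfa.setdefault((rename(src), ch), set()).add(rename(dst))
    current = {"q0"}
    for ch in w:
        current = set().union(*(nfa.get((s, ch), set()) for s in current))
    return "f" in current
```

The original DFA rejects "ba", but the merged automaton accepts it: after b it is in f, and f inherited the a-loop from s1. So merging non-equivalent final states really does change the language here.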
I am confused about the implementation of a language by an automaton. Does the automaton go directly to the next state if there is an ɛ-transition? Suppose I have an automaton consisting of three states a, b, and c (where a is the initial state and c the accepting state) with alphabet {0,1}. How does the following work?
a ----ɛ----> b
b ----0----> a
b ----1----> c
Is the string "1" accepted? What if we had
a---1--->b----ɛ--->c
? Would the string "1" be accepted?
Does the automaton go directly to the next state if there is an ɛ-transition?
Roughly speaking, yes. An ɛ-transition (in a non-deterministic finite automaton, or NFA, for short) is a transition that is not associated with the consumption of any symbol (0 or 1, in this case). Once you understand that, it's easy (in this case) to derive deterministic finite automata (or DFA, for short) that are equivalent to your NFAs and identify the languages that the latter describe.
Suppose I have an automaton [...] Is the string "1" accepted?
Yes. Here is a nicer diagram (courtesy of LaTeX and tikz) of your first NFA:
An equivalent DFA would be:
Once you have that, it's easy to see that the language accepted by that NFA is the set of strings that
start with zero or more 0's,
and then end with exactly one 1 (in other words, the language 0*1).
The string "1", because it starts with zero 0's and ends with one 1, is indeed accepted.
What if we had [...]? Would the string "1" be accepted?
Yes. Here is a nicer diagram of your second NFA:
An equivalent DFA would be:
In fact, it's easy to see that "1" is the only accepted string, here.
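If you want to check answers like these yourself, you can simulate an ɛ-NFA directly by tracking the set of states it could be in. Below is a minimal sketch; the dictionary encoding, the use of the empty string for ɛ, and the function names are my own choices, not anything standard.

```python
EPS = ""  # assumption: the empty string stands for an epsilon move


def eps_closure(states, delta):
    """All states reachable from `states` using only epsilon-transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in delta.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure


def nfa_accepts(w, delta, start, finals):
    """Subset simulation: consume one symbol, then follow epsilon moves for free."""
    current = eps_closure({start}, delta)
    for ch in w:
        moved = {t for s in current for t in delta.get((s, ch), ())}
        current = eps_closure(moved, delta)
    return bool(current & finals)


# First NFA from the question: a -ɛ-> b, b -0-> a, b -1-> c
NFA1 = {("a", EPS): {"b"}, ("b", "0"): {"a"}, ("b", "1"): {"c"}}
# Second NFA from the question: a -1-> b, b -ɛ-> c
NFA2 = {("a", "1"): {"b"}, ("b", EPS): {"c"}}
```

With start state a and accepting state c, `nfa_accepts("1", NFA1, "a", {"c"})` and `nfa_accepts("1", NFA2, "a", {"c"})` are both true, while "11" is rejected by both.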
I'm implementing the automatic construction of an LALR parse table for no reason at all. There are two flavors of this parser, LALR(0) and LALR(1), where the number signifies the amount of look-ahead.
I have gotten myself confused on what look-ahead means.
If my input stream is 'abc' and I have the following production, would I need 0 look-ahead, or 1?
P :== a E
Same question, but I can't choose the correct P production in advance by only looking at the 'a' in the input.
P :== a b E
| a b F
I have additional confusion in that I don't think the latter P-productions really come up when building an LALR parser generator. The reason is that the grammar is effectively left-factored automatically as we compute the closures.
I was working through this page and was ok until I got to the first/follow section. My issue here is that I don't know why we are calculating these things, so I am having trouble abstracting this in my head.
I almost get the idea that the look-ahead is not related to shifting input, but instead in deciding when to reduce.
I've been reading the Dragon book, but it is about as linear as a Tarantino script. It seems like a great reference for people who already know how to do this.
The first thing you need to do when learning about bottom-up parsing (such as LALR) is to remember that it is completely different from top-down parsing. Top-down parsing starts with a nonterminal, the left-hand-side (LHS) of a production, and guesses which right-hand-side (RHS) to use. Bottom-up parsing, on the other hand, starts by identifying the RHS and then figures out which LHS to select.
To be more specific, a bottom-up parser accumulates incoming tokens into a queue until a right-hand side is at the right-hand end of the queue. Then it reduces that RHS by replacing it with the corresponding LHS, and checks to see whether an appropriate RHS is at the right-hand edge of the modified accumulated input. It keeps on doing that until it decides that no more reductions will take place at that point in the input, and then reads a new token (or, in other words, takes the next input token and shifts it onto the end of the queue.)
This continues until the last token is read and all possible reductions are performed, at which point if what remains is the single non-terminal which is the "start symbol", it accepts the parse.
It is not obligatory for the parser to reduce a RHS just because it appears at the end of the current queue, but it cannot reduce a RHS which is not at the end of the queue. That means that it has to decide whether to reduce or not before it shifts any other token. Since the decision is not always obvious, it may examine one or more tokens which it has not yet read ("lookahead tokens", because it is looking ahead into the input) in order to decide. But it can only look at the next k tokens for some value of k, typically 1.
Here's a very simple example; a comma separated list:
1. Start -> List
2. List -> ELEMENT
3. List -> List ',' ELEMENT
Let's suppose the input is:
ELEMENT , ELEMENT , ELEMENT
At the beginning, the input queue is empty, and since no RHS is empty the only alternative is to shift:
queue                   remaining input              action
----------------------  ---------------------------  -----
(empty)                 ELEMENT , ELEMENT , ELEMENT  SHIFT
At the next step, the parser decides to reduce using production 2:
ELEMENT                 , ELEMENT , ELEMENT          REDUCE 2
Now there is a List at the end of the queue, so the parser could reduce using production 1, but it decides not to based on the fact that it sees a , in the incoming input. This goes on for a while:
List                    , ELEMENT , ELEMENT          SHIFT
List ,                  ELEMENT , ELEMENT            SHIFT
List , ELEMENT          , ELEMENT                    REDUCE 3
List                    , ELEMENT                    SHIFT
List ,                  ELEMENT                      SHIFT
List , ELEMENT          --                           REDUCE 3
Now the lookahead token is the "end of input" pseudo-token. This time, it does decide to reduce:
List                    --                           REDUCE 1
Start                   --                           ACCEPT
and the parse is successful.
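The trace above can be reproduced with a few lines of code. This is a hand-written sketch for this one grammar only, not a table-driven LALR parser; the function name `parse`, the `$` end marker and the order of the checks are my own assumptions.

```python
END = "$"  # assumed marker for the end-of-input pseudo-token


def parse(tokens):
    """Shift-reduce the comma-separated list grammar and return the list of actions."""
    queue, rest, trace = [], list(tokens) + [END], []
    while True:
        look = rest[0]  # the single lookahead token
        if queue == ["Start"] and look == END:
            trace.append("ACCEPT")
            return trace
        if queue[-3:] == ["List", ",", "ELEMENT"]:
            queue[-3:] = ["List"]           # production 3: List -> List ',' ELEMENT
            trace.append("REDUCE 3")
        elif queue == ["ELEMENT"]:
            queue[:] = ["List"]             # production 2: List -> ELEMENT
            trace.append("REDUCE 2")
        elif queue == ["List"] and look == END:
            queue[:] = ["Start"]            # production 1, only at end of input
            trace.append("REDUCE 1")
        elif look != END:
            queue.append(rest.pop(0))       # nothing to reduce: shift the next token
            trace.append("SHIFT")
        else:
            raise SyntaxError(f"cannot continue with {queue} at end of input")
```

Note where the lookahead is used: the only place `look` influences a reduction is production 1, which is exactly the decision discussed above (do not reduce List to Start while a `,` is coming).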
That still leaves a few questions. To start with, how do we use the FIRST and FOLLOW sets?
As a simple answer, the FOLLOW set of a non-terminal cannot be computed without knowing the FIRST sets for the non-terminals which might follow that non-terminal. And one way we can decide whether or not a reduction should be performed is to see whether the lookahead is in the FOLLOW set for the target non-terminal of the reduction; if not, the reduction certainly should not be performed. That algorithm is sufficient for the simple grammar above, for example: the reduction of Start -> List is not possible with a lookahead of ,, because , is not in FOLLOW(Start). Grammars whose only conflicts can be resolved in this way are SLR grammars (where S stands for "Simple", which it certainly is).
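For the list grammar, the FIRST and FOLLOW sets can be computed with a simple fixed-point loop. The sketch below is simplified on purpose: it assumes there are no empty productions (true for this grammar), and the names `GRAMMAR`, `first_sets`, `follow_sets` and the `$` end marker are mine.

```python
GRAMMAR = [
    ("Start", ["List"]),
    ("List", ["ELEMENT"]),
    ("List", ["List", ",", "ELEMENT"]),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}
END = "$"  # assumed end-of-input marker


def first_sets():
    """FIRST for each non-terminal; valid only because no right-hand side is empty."""
    first = {nt: set() for nt in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            head = rhs[0]
            new = first[head] if head in NONTERMINALS else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def follow_sets(first):
    """FOLLOW for each non-terminal, iterated until nothing changes."""
    follow = {nt: set() for nt in NONTERMINALS}
    follow["Start"].add(END)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                if i + 1 < len(rhs):   # something follows sym inside this production
                    nxt = rhs[i + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:                  # sym ends the production: inherit FOLLOW(lhs)
                    new = follow[lhs]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow
```

Running `follow_sets(first_sets())` gives FOLLOW(Start) = {$} and FOLLOW(List) = {$, ","}, which is why Start -> List is only reduced when the lookahead is the end of input.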
For most grammars, that is not sufficient, and more analysis has to be performed. It is possible that a symbol might be in the FOLLOW set of a non-terminal, but not in the context which led to the current stack configuration. In order to determine that, we need to know more about how we got to the current configuration; the various possible analyses lead to LALR, IELR and canonical LR parsing, amongst other possibilities.
Can you check on this: https://dl.dropbox.com/u/25439537/finite%20automata.png
This is a checked homework, so don't worry. I just want to clarify whether my answer is correct or not, because it is marked by my teacher as incorrect.
My answer is ((a+b)(a+b))*a
The first (a+b) signifies the upper arrows. The second (a+b) signifies the lower arrows. The last 'a' tells us that it should always end in 'a'.
I just want to collect evidence from several experts so that I can show it to my teacher.
I believe your answer is correct.
Let's consider the whole process as two parts: (1) start with start, and go back to start; and (2) go from start to end and accept. Obviously, the (1) part is a loop.
For (1), starting with start, either accept b or a. For b, it's b(a+b) to go back. For a, it's a(a+b) to go back. So (1) is b(a+b) + a(a+b) which is (a+b)(a+b).
For (2), it's a.
So, the final result is (loop in (1))* (2) i.e. ( (a+b)(a+b) )* a.
Following the description above, you can also come up with a proof of the equivalence between the two. Part (a): every string accepted by the automaton is in the set ((a+b)(a+b))*a. Part (b): every string in the set ((a+b)(a+b))*a is accepted by the automaton.
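Short of a proof, you can at least test the claim by brute force. The sketch below is built on an assumption: since the picture is not reproduced here, I modelled the automaton as this answer describes it (a or b from start to a middle state, a or b back to start, and a from start to end). The state names and function names are hypothetical, and `+` in the answer is written as `|` in Python's regex syntax.

```python
import re
from itertools import product

# Assumed automaton, reconstructed from the description in this answer.
NFA = {
    ("start", "a"): {"mid", "end"},
    ("start", "b"): {"mid"},
    ("mid", "a"): {"start"},
    ("mid", "b"): {"start"},
}
REGEX = re.compile(r"((a|b)(a|b))*a")


def nfa_accepts(w):
    """Subset simulation; 'end' is the only accepting state."""
    current = {"start"}
    for ch in w:
        current = {t for s in current for t in NFA.get((s, ch), ())}
    return "end" in current


def first_disagreement(max_len):
    """Return the first string (shortest first) on which NFA and regex differ, else None."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if nfa_accepts(w) != bool(REGEX.fullmatch(w)):
                return w
    return None
```

Under that assumption, `first_disagreement(8)` finds no counterexample. That is evidence rather than a proof, and it says nothing if the real diagram differs from my reconstruction.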
Your answer is wrong, because it doesn't provide for strings starting with b.
The path (start) -> b -> a+b -> a -> (end) is accepted by your finite automaton, but not by your regex. The simplest counterexample to your answer being correct is the regex's rejection of the string "baba".
By the way, if the teacher gave you that automaton without the "end" state having two concentric circles (to indicate being an accept state), it was probably a trick question. Having no accept state means your automaton rejects everything. The best way to describe that would be to just write down {} (the empty set).