I'm currently studying compilers and am on the topic of "Chomsky Hierarchy and the 4 languages." But it beats me what the practical purpose of all this is.
It'd be great if I could see real-life examples of where the 4 grammars (unrestricted, CSG, CFG, regular grammar) come into play.
I found online that the Chomsky hierarchy, along with the 4 grammars, is used to evaluate proposals within cognitive science, but this goes way over my head. It'd be great if someone could break it down for me, thanks a lot!
There is no practical value. That's the whole point.
Let me try to break that down a bit. It's useful to remember that Chomsky is a linguist --someone who studies human languages-- and that he was writing in the late 1950s when computational theory was not as well-developed as it is today. (To put it mildly.) His goal was to find a mathematical model which could provide some insights into the mechanisms by which human beings generate and understand sentences, and he took as his starting point a particular simple model of sentence generation.
In this model, a grammar is a function F, which transforms an arbitrary sequence of symbols from some alphabet into another sequence of symbols from the same alphabet. F is defined by a finite set of pairs (called productions) α → β. We then say that F(ω) = ζ if the definition of F contains some pair α → β such that α is a substring of ω and ζ is the result of substituting a single instance of α in ω with β.
That's not very interesting in and of itself; we make that into a full language by starting with some designated starting sequence, normally represented as the single symbol S, and repeatedly apply F as many times as is necessary. (In all interesting grammars, the set so constructed is infinite, so it cannot actually be constructed. But we can imagine proceeding from the starting point until we find the sentence we wanted to generate.)
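If it helps to see that rewriting model in action, here is a minimal Python sketch. The grammar used (balanced parentheses, which happens to be context-free) and all the names in the code are illustrative assumptions, not anything taken from Chomsky; the point is simply to show productions being applied repeatedly starting from S.

```python
def derive_once(sentence, productions):
    """All strings obtainable from `sentence` by rewriting one occurrence of a left-hand side."""
    results = set()
    for alpha, beta in productions:
        start = sentence.find(alpha)
        while start != -1:
            results.add(sentence[:start] + beta + sentence[start + len(alpha):])
            start = sentence.find(alpha, start + 1)
    return results

# Toy grammar for balanced parentheses: S -> (S) | SS | "" (the empty string).
productions = [("S", "(S)"), ("S", "SS"), ("S", "")]

# Enumerate everything reachable from the start symbol S in a few rewriting steps.
reachable = {"S"}
for _ in range(4):
    reachable |= {t for s in reachable for t in derive_once(s, productions)}

print(sorted(s for s in reachable if "S" not in s))  # terminal strings generated so far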
The problem with this model is that it can be used to describe an arbitrary Turing Machine. Or, if you like, an arbitrary computer program, although the equivalence is easier to see with a Turing Machine. In other words, it is at least theoretically possible to construct a finite grammar which will recognise strings consisting of the description of a Turing Machine (i.e., a program written in some programming language) followed by an input and an output if and only if the Turing Machine applied to the input would produce the output. In other words, there exists (in the mathematical sense) a grammar of this form which is computationally equivalent to a general purpose computer.
Unfortunately, that's not actually very useful if our goal is to understand sentences, because there is actually no algorithm for running computers backwards. The best we can do as a general solution is to enumerate all possible inputs and run the program on each of them until we find the output we hoped for. And that doesn't actually work because there is no limit to the amount of time the program might take to produce an output and no way to even know if the program will eventually come to an end. (This is called the "halting problem".) So we might get stuck on some possible input, and we'll never know if some other input might have produced the desired output.
As a result, we cannot tell whether the provided input was "grammatical", that is, whether it conformed to the grammar provided. And that's not just the case with the particular grammar we built to emulate Turing Machines. It means that we have no confidence that we can recognise sentences from any arbitrary grammar, and even if we stumble upon an answer, we have no way to limit how much time it might take to get there.
Clearly, this is not how human beings understand each other. So if it is to serve a practical purpose, we must restrict the possible grammars in some way to make them computationally feasible.
On the other end of the spectrum, a lot was known about finite-state machines. A finite-state machine is a Turing Machine without a tape; that is to say, it is simply a finite collection of states. In each state, the machine reads a single input symbol and uses it to decide what the next state will be. It turns out that finite-state machines can be modelled using a grammar (as above) restricted to very simple productions, each of which is either of the form A → a B or A → a, where a is a symbol from the "terminal alphabet" (that is, a word) and A and B are single grammatical symbols. These grammars are called "regular grammars" and they are computationally equivalent to what mathematicians call "regular expressions" (which are a small subset of what is recognised by "regex" libraries, but that's a whole other discussion).
Regular grammars are easy to parse. All that is needed is to trace through the state machine, so it can be done without backtracking in time proportional to the length of the input. But regular grammars are far too weak to be able to represent human language, or even most computer languages. As a simple example, algebraic expressions with parentheses cannot be recognised with a regular grammar (or with a finite-state machine) because there is no way to count the parenthesis depth; the finite-state machine has no memory at all (other than knowing which state it is in, and there are only a finite number of states).
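To see how cheap that tracing is, here is a minimal sketch of running a deterministic finite-state machine: one pass over the input, one table lookup per symbol, no backtracking. The particular machine (binary strings whose value is divisible by 3) is an assumption chosen for illustration.

```python
# The state is simply the remainder of the bits read so far, modulo 3.
TRANSITIONS = {
    (0, "0"): 0, (0, "1"): 1,
    (1, "0"): 2, (1, "1"): 0,
    (2, "0"): 1, (2, "1"): 2,
}
START, ACCEPTING = 0, {0}

def accepts(word):
    state = START
    for symbol in word:
        state = TRANSITIONS[(state, symbol)]   # exactly one move per input symbol
    return state in ACCEPTING

print(accepts("110"))   # 6 in binary -> True
print(accepts("111"))   # 7 in binary -> False
```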
So unrestricted grammars are too powerful to parse and regular grammars are too weak to be useful. (Useful for complex parsing problems, that is. There are certain applications for regular expressions, but parsing complete computer programs is not one of them.)
The next step, then, was to try to find a restriction on grammars which was still powerful enough to represent human language without being so powerful that parsing became impossible.
That, finally, is the origin of Chomsky's hierarchy. Between the two extremes described above (type 0 and type 3 grammars), Chomsky proposed two possible intermediate restrictions -- type 1 and type 2 grammars -- and proved a number of important properties about each of them.
While this work turned out to be fundamental in the development of formal language theory, it cannot really be said to have answered the question Chomsky started with. Type 2 grammars -- context-free grammars -- are indeed computationally tractable; they can be parsed with simple algorithms in polynomial time, and can represent a large number of useful languages. But they are still too weak to represent human language. In particular, context-free grammars cannot represent a language as simple as "all strings made up of two copies of the same substring" (the copy language {ww}). Type 1 grammars -- context-sensitive grammars -- can probably represent any useful language, and are not quite as unruly as unrestricted grammars, but they are still too powerful to parse. (Since the derivation steps in a context-sensitive grammar never get shorter, it is possible to enumerate all possible derivations from a starting point in order by length, which means that you can decide whether a sentence is generated by the grammar without running into the halting problem. But that's as good as it gets; that procedure takes exponential time and is not remotely feasible for non-trivial inputs.)
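For contrast with the finite-state example above, here is a minimal recursive-descent sketch for the parenthesised algebraic expressions mentioned earlier, a context-free language that no finite-state machine can handle. The grammar is an illustrative assumption (expr/term/factor with +, -, *, / and integer literals); the call stack supplies the memory that counts the parenthesis depth.

```python
class Parser:
    # Grammar (an assumption for illustration):
    #   expr   -> term  (('+' | '-') term)*
    #   term   -> factor (('*' | '/') factor)*
    #   factor -> DIGIT+ | '(' expr ')'
    def __init__(self, text):
        self.text = text.replace(" ", "")
        self.pos = 0

    def peek(self):
        return self.text[self.pos] if self.pos < len(self.text) else None

    def expect(self, ch):
        if self.peek() != ch:
            raise SyntaxError(f"expected {ch!r} at position {self.pos}")
        self.pos += 1

    def parse_expr(self):
        self.parse_term()
        while self.peek() in ("+", "-"):
            self.pos += 1
            self.parse_term()

    def parse_term(self):
        self.parse_factor()
        while self.peek() in ("*", "/"):
            self.pos += 1
            self.parse_factor()

    def parse_factor(self):
        if self.peek() == "(":
            self.pos += 1
            self.parse_expr()      # recursion handles arbitrary nesting depth
            self.expect(")")
        elif self.peek() is not None and self.peek().isdigit():
            while self.peek() is not None and self.peek().isdigit():
                self.pos += 1
        else:
            raise SyntaxError(f"unexpected input at position {self.pos}")

def is_valid(text):
    p = Parser(text)
    try:
        p.parse_expr()
        return p.pos == len(p.text)
    except SyntaxError:
        return False

print(is_valid("(1+2)*(3+(4*5))"))  # True
print(is_valid("(1+2))"))           # False: unbalanced parentheses
```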
In the six decades since Chomsky published his seminal papers, a lot of work has been done to try to find useful intermediate restrictions between type 1 and type 2 grammars. And there has been a lot of useful study into algorithms for parsing context-free languages, which is of enormous utility in building compilers. All of this builds on the crucial work done by Chomsky and the other computational theorists whose work he built on -- Markov, Turing, Church and Kleene, just to name a few worthy of study. But Chomsky's original project remains unsolved.
So if your goal is to build a simple parser for a programming language, the Chomsky hierarchy is probably just an interesting footnote. But if you are interested in the academic study of formal language theory, there are still lots of interesting unsolved problems to work on.
How can you show that a) deterministic pushdown automata (DPDA) are more powerful than finite automata and less powerful than non-deterministic pushdown automata?
(1) First, show that any language that can be accepted by a finite automaton can also be accepted by a deterministic pushdown automaton. Recall that any language accepted by a finite automaton is accepted by a deterministic finite automaton, and a deterministic pushdown automaton can simulate a deterministic finite automaton simply by doing nothing interesting with its stack. Next, show there is a non-regular language accepted by a DPDA. 0^n 1^n is a good candidate. Prove this language is non-regular using the pumping lemma or the Myhill-Nerode theorem, then show that the DPDA that pushes on 0s and then switches to popping on 1s works (see the first sketch after this list).
(2) First, note that NPDAs can accept any language accepted by DPDAs, since all DPDAs are also NPDAs that happen not to make use of nondeterminism. Next, find a language that has an NPDA but no DPDA. Even-length palindromes over the alphabet {0, 1} might work. There is a simple NPDA for this that nondeterministically guesses when the first half of the input has been read and switches from pushing to popping (see the second sketch after this list). To show there is no DPDA is more challenging. Perhaps you could argue as follows: suppose there were a DPDA. Then, in any configuration of the DPDA, only one transition would be possible. If string w leads to an accepting state in the DPDA and empties the stack, then the machine's behaviour on any further input is completely determined and cannot depend on w; so it must treat w00 the same way for every such w, even though w00 may or may not be an even-length palindrome (it is for w = 00, but not for w = 0110). This is a contradiction, though, so our DPDA does not exist. The same argument fails for NPDAs, by the way, because there can be multiple paths through, so one failed choice means nothing.
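A minimal sketch of the DPDA idea for { 0^n 1^n : n >= 1 }: push on 0s, pop on 1s, accept when the stack is back to empty exactly at the end of the input. The encoding (a Python list as the stack) is an assumption for illustration, not a formal 7-tuple definition.

```python
def accepts_0n1n(word):
    stack = []
    seen_one = False
    for symbol in word:
        if symbol == "0":
            if seen_one:            # a 0 after a 1 can never be repaired
                return False
            stack.append("0")       # push phase
        elif symbol == "1":
            seen_one = True
            if not stack:           # more 1s than 0s
                return False
            stack.pop()             # pop phase
        else:
            return False
    return seen_one and not stack   # n >= 1 and counts matched

print(accepts_0n1n("000111"))  # True
print(accepts_0n1n("00111"))   # False: counts differ
print(accepts_0n1n("0101"))    # False: a 0 appears after a 1
```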
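And a minimal sketch of the NPDA for even-length palindromes over {0, 1}: the machine pushes the first half and matches the second half against the stack, nondeterministically guessing where the midpoint is. Here the guess is simulated by simply trying every possible split point; this is an illustrative simulation of the nondeterministic branches, not a formal PDA construction.

```python
def accepts_even_palindrome(word):
    for mid in range(0, len(word) + 1):          # the nondeterministic "guess"
        stack = list(word[:mid])                 # push phase
        ok = True
        for symbol in word[mid:]:                # pop-and-match phase
            if not stack or stack.pop() != symbol:
                ok = False
                break
        if ok and not stack:
            return True                          # some branch accepts
    return False

print(accepts_even_palindrome("0110"))    # True
print(accepts_even_palindrome("011110"))  # True
print(accepts_even_palindrome("010"))     # False (odd length)
print(accepts_even_palindrome("0101"))    # False
```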
An NPDA is more powerful than a DPDA because nondeterminism allows several transitions from the same configuration: for a given state, input symbol, and stack top, the NPDA may have more than one possible move. Every DPDA is also an NPDA, so for every language accepted by a DPDA there exists an NPDA, but there are some languages that are accepted by an NPDA and not by any DPDA. One automaton model is said to be more powerful than another when it accepts a strictly larger set of languages.
In fact, the NPDA is also more powerful than DFAs (deterministic finite automata) and NFAs (non-deterministic finite automata). Note that DFAs and NFAs are equivalent in power: for every language accepted by a DFA there exists an NFA and vice versa, and there is no language for which we can construct an NFA but not a DFA. Hence we can always convert an NFA to an equivalent DFA, whereas we cannot always convert an NPDA to a DPDA.
What are the major issues essential to consider when constructing a finite state machine representing a given language? I know that finite state machines take strings as inputs, and that as each element of the string is read, the machine state changes until the EOF is reached. If, once the string has been read completely, the machine is in one of the final states, the string is accepted. What I don't understand is what considerations need to be made when constructing the FSA (other than the strings it should accept and the definition of each transition function).
One thing you'll want to consider is the number of states. There are many equivalent ways to define the machine, but one with fewer states is typically preferred because the same result is achieved with less complexity and space.
The representation of a computational automaton requires space proportional to the number of states, so optimizing the spatial complexity through state reduction is desirable or necessary.
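If you want to see what state reduction looks like concretely, here is a minimal sketch of DFA minimization by partition refinement (Moore's algorithm). The example DFA is an illustrative assumption with two accepting states that behave identically and therefore get merged into one block.

```python
def minimize(states, alphabet, delta, finals):
    """Return the partition of `states` into blocks of equivalent states."""
    partition = [set(finals), set(states) - set(finals)]
    partition = [block for block in partition if block]
    changed = True
    while changed:
        changed = False
        new_partition = []
        for block in partition:
            # Group states by which block each symbol sends them to.
            groups = {}
            for state in block:
                key = tuple(
                    next(i for i, b in enumerate(partition) if delta[(state, symbol)] in b)
                    for symbol in alphabet
                )
                groups.setdefault(key, set()).add(state)
            new_partition.extend(groups.values())
            if len(groups) > 1:
                changed = True
        partition = new_partition
    return partition

# DFA over {a}: accepts any string with at least one 'a'; states 1 and 2 are equivalent.
states = {0, 1, 2}
delta = {(0, "a"): 1, (1, "a"): 2, (2, "a"): 1}
print(minimize(states, ["a"], delta, finals={1, 2}))   # e.g. [{1, 2}, {0}]
```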
I thought that finite automata can have only one initial (input) state but several final (output) states. Recently, I've come across an example of removing ^-moves (epsilon-moves) that has many initial states.
Please help!
Deterministic finite automata are equivalent to nondeterministic finite automata with epsilon/lambda transitions. Having multiple input/start/initial states is equivalent to having a single input/start/initial state with epsilon/lambda transitions to the desired input/start/initial states. It may not be "standard" or "usual" to speak of finite automata with multiple input/start/initial states but it adds no expressive or computational power.
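As a concrete sketch of that equivalence: add one fresh start state with epsilon-moves to each of the old start states. The NFA encoding below (a dict from (state, symbol) to a set of states, with "eps" standing for an epsilon-move) and the state names are illustrative assumptions.

```python
def single_start(starts, transitions, finals, fresh_state="q_new"):
    """Return (start, transitions, finals) of an equivalent NFA with one start state."""
    assert fresh_state not in {s for (s, _) in transitions} | starts | finals
    new_transitions = dict(transitions)
    new_transitions[(fresh_state, "eps")] = set(starts)   # eps-moves to every old start
    return fresh_state, new_transitions, finals

# Example: two start states q0 and q1.
starts = {"q0", "q1"}
transitions = {("q0", "a"): {"q2"}, ("q1", "b"): {"q2"}}
finals = {"q2"}
print(single_start(starts, transitions, finals))
```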
What kind of finite automaton supports a set of conditions for the transition from state A to state B? For example, if (a && b && c) then A -> B.
To make it even harder, let's say that we receive one condition at a time.
Of course it could be done with a DFA, but that will lead to state explosion.
It doesn't matter whether you use an NFA or a DFA: you can always apply the so-called powerset (subset) construction to turn the NFA into an equivalent DFA, and then minimize the result. DFA minimization yields the smallest possible deterministic automaton for your defined language.
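For reference, here is a minimal sketch of the subset (powerset) construction, where each DFA state is a frozenset of NFA states. The example NFA (strings over {a, b} whose second-to-last symbol is 'a') and its encoding are illustrative assumptions; a separate minimization pass, as in the sketch further up, would then give the smallest DFA.

```python
from collections import deque

def nfa_to_dfa(nfa_start, nfa_transitions, nfa_finals, alphabet):
    # Assumes an NFA without epsilon-moves; transitions map (state, symbol) -> set of states.
    start = frozenset({nfa_start})
    dfa_transitions, dfa_finals = {}, set()
    queue, seen = deque([start]), {start}
    while queue:
        current = queue.popleft()
        if current & nfa_finals:
            dfa_finals.add(current)
        for symbol in alphabet:
            target = frozenset(s for q in current
                               for s in nfa_transitions.get((q, symbol), set()))
            dfa_transitions[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return start, dfa_transitions, dfa_finals

# NFA for strings over {a, b} whose second-to-last symbol is 'a'.
nfa_transitions = {
    ("q0", "a"): {"q0", "q1"}, ("q0", "b"): {"q0"},
    ("q1", "a"): {"q2"},       ("q1", "b"): {"q2"},
}
start, trans, finals = nfa_to_dfa("q0", nfa_transitions, {"q2"}, ["a", "b"])
print(len({s for s, _ in trans}), "DFA states")   # 4 reachable subset states
```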