How to parse a context-sensitive grammar? - parsing

A CSG is similar to a CFG, except that the left-hand side of a production can consist of multiple symbols.
So can I just use a CFG parser to parse a CSG, by allowing a reduction to replace the matched right-hand side with multiple terminals or non-terminals?
Like
1. S → a bc
2. S → a S B c
3. c B → W B
4. W B → W X
5. W X → B X
6. B X → B c
7. b B → b b
When we meet W X, can we just reduce W X to W B?
When we meet W B, can we just reduce W B to c B?
So if a CSG parser is based on a CFG parser, it shouldn't be hard to write, is that right?
But when I checked Wikipedia, it said that to parse a CSG we should use a linear bounded automaton.
What is a linear bounded automaton?

Context-sensitive grammars are non-deterministic, so you cannot assume that a reduction will take place just because the RHS happens to be visible at some point in a derivation.
LBAs (linear-bounded automata) are Turing machines whose working tape is restricted to the cells occupied by the input (or a constant multiple of that). The LBAs which accept context-sensitive languages are also non-deterministic, so they are not really a practical algorithm. (You can simulate one with backtracking, but there is no convenient bound on the amount of time it might take to perform a parse.) The fact that they are acceptors for CSGs is interesting for parsing theory but not really for parsing practice.
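To make that concrete, here is a brute-force recognizer in Python, a minimal sketch rather than a practical parser (the encoding, one character per symbol, and the function name are my own). It relies on the defining property of CSGs: productions are non-contracting, so every sentential form in a derivation of a string w has length at most |w|. That bound is what makes the search finite, and it is essentially the linear bound the LBA exploits.

from collections import deque

def csg_recognize(productions, start, target):
    # Breadth-first search over sentential forms. It terminates because
    # context-sensitive productions are non-contracting, so every form
    # in a derivation of `target` has length <= len(target).
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form == target:
            return True
        for lhs, rhs in productions:
            i = form.find(lhs)
            while i != -1:
                new = form[:i] + rhs + form[i + len(lhs):]
                if len(new) <= len(target) and new not in seen:
                    seen.add(new)
                    queue.append(new)
                i = form.find(lhs, i + 1)
    return False

# The grammar from the question (it generates a^n b^n c^n):
productions = [("S", "abc"), ("S", "aSBc"), ("cB", "WB"), ("WB", "WX"),
               ("WX", "BX"), ("BX", "Bc"), ("bB", "bb")]
print(csg_recognize(productions, "S", "aabbcc"))  # True
print(csg_recognize(productions, "S", "aabbc"))   # False

Even on this toy grammar the search space grows quickly; that exponential blow-up is the practical face of the non-determinism mentioned above.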
Just as with CFGs, there are different classes of CSGs. Some restricted subclasses of CSGs are easier to parse (CFGs are one subclass, for example), but I don't believe there has been much investigation into practical uses; in practice, CSGs are hard to write, and there is no obvious analog of a parse tree which can be constructed from a derivation.
For more reading, you could start with the wikipedia entry on LBAs and continue by following its references. Good luck.

Related

Method to calculate predict set of a grammar production in recursive descent parser

I understand FIRST and FOLLOW but I am totally lost on the predict sets. Can someone explain to me how to go about finding the predict set of a production in a grammar using the FIRST and FOLLOW sets? I have not provided a grammar because this is for a homework assignment, and I want to know how to do it, not how to do it for this specific grammar.
Intuitively, the predict set for a production A → α [Note 1] is the set of terminal symbols which might be the next symbol to be read if that production is to be predicted. (That implies that the production's non-terminal (A) has already been predicted, and the parser must now decide which of the non-terminal's productions to predict.)
Obviously, that includes all the terminal symbols which might be the first symbol of the right-hand side. But what if the right-hand side might derive ε, the empty string? In that case, the next symbol in the input will be the first symbol which comes after the predicted non-terminal, A; in other words, it will be a member of FOLLOW(A). So the predict set contains the terminals which might start the right-hand side α, plus all the symbols in FOLLOW(A) if α could derive the empty string. [Note 2]
More formally, PREDICT(A → α) is:
FIRST(α) if ε ∉ FIRST(α)
(FIRST(α) ∪ FOLLOW(A)) - {ε} if ε ∈ FIRST(α)
Remember that we compute FIRST on a sentential form by "looking through" epsilons:
FIRST(aβ) is
FIRST(a) if ε ∉ FIRST(a)
(FIRST(a) - {ε}) ∪ FIRST(β) if ε ∈ FIRST(a)
Consequently, FIRST of a right-hand side only includes ε if every symbol in the right-hand side is nullable.
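As a concrete illustration, here is a small Python sketch of those two formulas. It assumes FIRST and FOLLOW have already been computed (as dicts mapping non-terminals to sets of terminals); the function names and the representation are just illustrative conventions, not part of any standard library.

EPS = "ε"

def first_of(alpha, FIRST):
    # FIRST of a sentential form, "looking through" nullable symbols.
    result = set()
    for sym in alpha:
        f = FIRST.get(sym, {sym})   # a terminal's FIRST is just itself
        result |= f - {EPS}
        if EPS not in f:
            return result           # sym is not nullable; stop here
    result.add(EPS)                 # every symbol was nullable
    return result

def predict(A, alpha, FIRST, FOLLOW):
    # PREDICT(A -> alpha), exactly as in the formulas above.
    f = first_of(alpha, FIRST)
    if EPS not in f:
        return f
    return (f | FOLLOW[A]) - {EPS}

# Example with a hypothetical grammar A -> a A | ε:
FIRST = {"A": {"a", EPS}}
FOLLOW = {"A": {"$"}}
print(predict("A", ["a", "A"], FIRST, FOLLOW))  # {'a'}
print(predict("A", [], FIRST, FOLLOW))          # {'$'}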
Notes:
I use the common convention that capital letters (A...) refer to non-terminals, lower-case letters (a...) refer to grammar symbols (terminals or non-terminals) and Greek letters (α...) refer to possibly empty sequences of grammar symbols.
Aside from the first step when the start symbol is predicted, the current prediction always contains more than one symbol. So if A is the next non-terminal to expand and we see that it is nullable (i.e., it could derive nothing), we don't really need to look up FOLLOW(A), because we could just look at the predict stack and see what we've predicted will follow A. In some cases, this might allow us to avoid a conflict with one of the other alternatives for A.
However, it is normal to use FOLLOW(A) regardless. Always using FOLLOW(A) is usually referred to as the "Strong LL" (SLL) algorithm. Although it seems like computing the FIRST set of the known prediction stack is more powerful than using a precomputed FOLLOW set, it does not actually improve the power of LL parsing at all; every LL grammar can be converted to an equivalent SLL grammar.

Finding an equivalent LR grammar for the same number of "a" and "b" grammar?

I can't seem to find an equivalent LR grammar for:
S → aSbS | bSaS | ε
which I think generates the strings with the same number of 'a's as 'b's.
What would be a workaround for this? Is it possible to find an LR grammar for this?
Thanks in advance!
EDIT:
I have found what I think is an equivalent grammar but I haven't been able to prove it.
I think I need to prove that the original grammar generates the language above, and then prove that the same language is generated by the following equivalent grammar. But I am not sure how to do it. How should I proceed?
S → aBS | bAS | ε
B → b | aBB
A → a | bAA
Thanks in advance...
PS: I have already proven that this new grammar is LL(1), SLR(1), LR(1) and LALR(1).
Unless a grammar is directly related to another grammar -- for example, through standard transformations such as normalization, null-production elimination, and so on -- proving that two grammars derive the same language is very difficult without knowing what the language is. It is usually easier to prove (independently) that each grammar derives the language.
The first grammar you provide:
S → aSbS | bSaS | ε
does in fact derive the language of all strings over the alphabet {a, b} in which the number of as is the same as the number of bs. We can prove that in two parts: first, that every sentence derived by the grammar has that property, and second, that every sentence which has that property can be derived by that grammar. Both proofs proceed by induction.
For the forward proof, we proceed by induction on the length of the derivation. Suppose we have some derivation S → α → β → … → ω, where all the Greek letters represent sequences of non-terminals and terminals.
If the length of the derivation is exactly zero, so that it starts and ends with S, then the derived sentential form contains no terminals at all, so it trivially has the same number of as and bs. (Base step)
Now for the induction step. Suppose that every derivation of length i is known to end with a sentential form which has the same number of as and bs. We want to prove from that premise that every derivation of length i+1 ends with a sentential form which has the same number of as and bs. But that is also clear: each of the three possible production steps preserves the balance, since S → aSbS and S → bSaS each add exactly one a and one b, while S → ε adds neither.
Now, let's look at the opposite direction: every sentence with the same number of as and bs can be derived from that grammar. We'll do this by induction on the length of the string. Our induction premise will be: if, for every j ≤ i, every sentence with exactly j as and j bs has a derivation from S, then every sentence with exactly i+1 as and i+1 bs also has a derivation from S. (Here we are only considering sentences consisting entirely of terminals.)
Consider such a sentence. It either starts with an a or a b. Suppose that it starts with an a: then there is at least one b in the sentence such that the prefix ending with that b has the same number of each terminal. (Think of the string as a walk along a square grid: every a moves diagonally up and right one unit, and every b moves diagonally down and right. Since the endpoint is at exactly the same height as the starting point and there are no wormholes in the grid, once we ascend we must sooner or later descend back to the starting height, which gives a prefix ending in b.) So the interior of that prefix (everything except the a at the beginning and the b at the end) is balanced, as is the remainder of the string. Both of those are shorter, so by the induction hypothesis they can be derived from S. Making those substitutions, we get aSbS, which can be derived from S. An identical argument applies to strings starting with b. Again, the base step is trivial.
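The grid-walk step is easy to mirror in code. Here is a tiny Python sketch (the name is mine) that finds the first return to the starting height and produces the decomposition used above:

def split_balanced(w):
    # Find the first balanced nonempty prefix of w (the first return to
    # the starting height in the grid walk) and return the interior of
    # that prefix together with the remainder of the string. Both parts
    # are themselves balanced and strictly shorter than w.
    height = 0
    for i, ch in enumerate(w):
        height += 1 if ch == "a" else -1
        if height == 0:
            return w[1:i], w[i+1:]

print(split_balanced("abaabb"))  # ('', 'aabb'): "abaabb" = a + '' + b + "aabb"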
So that's basically the proof procedure you'll need to adapt for your grammar.
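By the way, before writing the proof out, you can gain some confidence with a brute-force check that the two grammars generate exactly the same strings up to some small length. This is evidence, not a proof; the encoding (one character per symbol, upper case for non-terminals) is just an illustrative convention of mine:

from collections import deque
from itertools import product

def generate(productions, start, maxlen):
    # All terminal strings of length <= maxlen derivable from `start`.
    # Brute-force BFS over sentential forms; fine for tiny examples only.
    seen, out = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form == "" or form.islower():
            out.add(form)
            continue
        for lhs, rhs in productions:
            i = form.find(lhs)
            while i != -1:
                new = form[:i] + rhs + form[i + 1:]
                # Terminals never disappear, so forms with too many
                # terminals can never derive a short enough string.
                if sum(c.islower() for c in new) <= maxlen and new not in seen:
                    seen.add(new)
                    queue.append(new)
                i = form.find(lhs, i + 1)
    return out

g1 = [("S", "aSbS"), ("S", "bSaS"), ("S", "")]
g2 = [("S", "aBS"), ("S", "bAS"), ("S", ""),
      ("B", "b"), ("B", "aBB"), ("A", "a"), ("A", "bAA")]
balanced = {w for n in range(7) for w in map("".join, product("ab", repeat=n))
            if w.count("a") == w.count("b")}
assert generate(g1, "S", 6) == generate(g2, "S", 6) == balanced
print("both grammars generate exactly the balanced strings up to length 6")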
Good luck.
By the way, this sort of question can also be posed on cs.stackexchange.com or math.stackexchange.com, where MathJax is available. MathJax makes writing out mathematical proofs much less tedious, so you may well find that you'll get more readable answers there.

How to determine the k value of an LL(k) grammar

Suppose I'm given the grammar
Z -> X
X -> Y
  -> b Y a
Y -> c
  -> c a
The grammar is LL(k). What is the value of k?
All I know is that it's not LL(1), since there is a predict set conflict on Y, and the predict sets of an LL(1) grammar must be disjoint.
Ok, so luckily this question was not on my exam.
As I mentioned, the predict set conflict means it's not LL(1). Next, you have to find the minimum amount of lookahead needed to choose a production.
In this case it is three, not two: after the b in X -> b Y a has been matched, both Y-productions leave c a as the next two tokens (the sentences are bca and bcaa), so two tokens of lookahead cannot separate them; the third token (end-of-input versus a) settles it.
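A quick way to convince yourself is to tabulate the lookahead at the decision point. A toy Python check (using $ as an end marker; purely illustrative):

# The only sentences are c, ca, bca, bcaa. The hard decision is the
# choice between Y -> c and Y -> c a after the b has been matched.
for w, choice in [("bca", "Y -> c"), ("bcaa", "Y -> c a")]:
    k2 = (w[1:] + "$$$")[:2]   # two tokens of lookahead after the b
    k3 = (w[1:] + "$$$")[:3]   # three tokens of lookahead
    print(w, choice, "k=2 sees", repr(k2), "k=3 sees", repr(k3))
# k=2 sees 'ca' for both sentences (conflict); k=3 sees 'ca$' vs 'caa'.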

Non binary decision tree to binary decision tree (Machine learning)

This is a homework question, so I just need a hint; maybe a yes/no and a few comments will be appreciated!
Prove: an arbitrary (non-binary) decision tree can be converted to an equivalent binary decision tree.
My answer:
Every decision can be expressed using only binary decisions, hence so can the decision tree.
I don't know the formal proof. I could argue with entropy (gain, actually): for a binary node, the gain will be E(S) - E(L) - E(R), and before the conversion it may be E(S) - E(Y|X=t1) - E(Y|X=t2) - and so on.
But I don't know how to state it.
You can give a constructive proof of something like this, demonstrating how to convert an arbitrary decision tree into a binary decision tree.
Imagine that you are sitting at node A, and you have a choice of traversing to B, C, and D based on whether or not your example satisfies requirements B, C or D. If this is a proper decision tree, B, C and D are mutually exclusive and cover all cases.
A -> B
-> C
-> D
Since they're mutually exclusive, you could imagine splitting your tree into a binary decision: B or not B; on the not B branch, we know that either C or D has to be true, since B, C, and D were mutually exclusive and cover all cases. In other words:
A -> B
-> ~B
---> C
---> D
Then you can copy whatever was going to go after B onto the branch that follows B, performing the same simplification. Same for C and D.
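Here is a sketch of that construction in Python. The tree representation (a dict mapping mutually exclusive, exhaustive tests to subtrees, with anything else being a leaf) is just an illustrative convention:

def to_binary(tree):
    # A leaf is any non-dict value; an internal node is a dict mapping
    # mutually exclusive, exhaustive tests to subtrees.
    if not isinstance(tree, dict):
        return tree
    tests = list(tree)
    test, rest = tests[0], tests[1:]
    if not rest:                 # a single remaining test must hold
        return to_binary(tree[test])
    return {test: to_binary(tree[test]),
            "not " + test: to_binary({t: tree[t] for t in rest})}

nary = {"B": "leaf1", "C": "leaf2", "D": "leaf3"}
print(to_binary(nary))
# {'B': 'leaf1', 'not B': {'C': 'leaf2', 'not C': 'leaf3'}}

Each recursive step peels off one test into a B / not-B split, exactly as in the argument above, so the result is a chain of binary decisions equivalent to the original node.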

Differences between Agda and Idris

I'm starting to dive into dependently-typed programming and have found that the Agda and Idris languages are the closest to Haskell, so I started there.
My question is: what are the main differences between them? Are the type systems equally expressive in both of them? It would be great to have a comprehensive comparison and a discussion of the benefits.
I've been able to spot some:
Idris has type classes à la Haskell, whereas Agda goes with instance arguments
Idris includes monadic and applicative notation
Both of them seem to have some sort of rebindable syntax, although I'm not really sure whether they are the same.
Edit: there are some more answers in the Reddit page of this question: http://www.reddit.com/r/dependent_types/comments/q8n2q/agda_vs_idris/
I may not be the best person to answer this, as having implemented Idris I'm probably a bit biased! The FAQ - http://docs.idris-lang.org/en/latest/faq/faq.html - has something to say on it, but to expand on that a bit:
Idris has been designed from the ground up to support general purpose programming ahead of theorem proving, and as such has high level features such as type classes, do notation, idiom brackets, list comprehensions, overloading and so on. Idris puts high level programming ahead of interactive proof, although because Idris is built on a tactic-based elaborator, there is an interface to a tactic based interactive theorem prover (a bit like Coq, but not as advanced, at least not yet).
Another thing Idris aims to support well is Embedded DSL implementation. With Haskell you can get a long way with do notation, and you can with Idris too, but you can also rebind other constructs such as application and variable binding if you need to. You can find more details on this in the tutorial, or full details in this paper: http://eb.host.cs.st-andrews.ac.uk/drafts/dsl-idris.pdf
Another difference is in compilation. Agda goes primarily via Haskell, Idris via C. There is an experimental back end for Agda which uses the same back end as Idris, via C. I don't know how well maintained it is. A primary goal of Idris will always be to generate efficient code - we can do a lot better than we currently do, but we're working on it.
The type systems in Agda and Idris are pretty similar in many important respects. I think the main difference is in the handling of universes. Agda has universe polymorphism, Idris has cumulativity (and you can have Set : Set in both if you find this too restrictive and don't mind that your proofs might be unsound).
One other difference between Idris and Agda is that Idris's propositional equality is heterogeneous, while Agda's is homogeneous.
In other words, the putative definition of equality in Idris would be:
data (=) : {a, b : Type} -> a -> b -> Type where
  refl : x = x
while in Agda, it is
data _≡_ {l} {A : Set l} (x : A) : A → Set l where
  refl : x ≡ x
The l in the Agda definition can be ignored, as it has to do with the universe polymorphism that Edwin mentions in his answer.
The important difference is that the equality type in Agda takes two elements of A as arguments, while in Idris it can take two values with potentially different types.
In other words, in Idris one can claim that two things with different types are equal (even if it ends up being an unprovable claim), while in Agda, the very statement is nonsense.
This has important and wide-reaching consequences for the type theory, especially regarding the feasibility of working with homotopy type theory. For this, heterogeneous equality just won't work because it requires an axiom that is inconsistent with HoTT. On the other hand, it is possible to state useful theorems with heterogeneous equality that can't be straightforwardly stated with homogeneous equality.
Perhaps the easiest example is associativity of vector concatenation. Given length-indexed lists called vectors defined thusly:
data Vect : Nat -> Type -> Type where
  Nil : Vect 0 a
  (::) : a -> Vect n a -> Vect (S n) a
and concatenation with the following type:
(++) : Vect n a -> Vect m a -> Vect (n + m) a
we might want to prove that:
concatAssoc : (xs : Vect n a) -> (ys : Vect m a) -> (zs : Vect o a) ->
              xs ++ (ys ++ zs) = (xs ++ ys) ++ zs
This statement is nonsense under homogeneous equality, because the left side of the equality has type Vect (n + (m + o)) a and the right side has type Vect ((n + m) + o) a. It's a perfectly sensible statement with heterogeneous equality.