I'm trying to solve DFA - automata

I have to construct DFAs for the union L1 ∪ L2 and the intersection L1 ∩ L2

You can run through the formal Cartesian product machine construction to algorithmically derive automata for the intersection and union of L1 and L2. However, since these languages are so easy, it might be simpler to give the languages and just write down a DFA for each one.
L1 is the language of all strings of as and bs with at least one a. L2 is the language of all strings of as and bs with at least two bs.
To accept the intersection of L1 and L2, we need to see at least one a and at least two bs. Below, we have six states:
q0, the initial state, where we need one a and two bs
q1, where we still need two bs
q2, where we still need one b
q3, where we need no more (accepting state)
q4, where we still need one a and one b
q5, where we still need one a
--->q0 -a-> q1 -b-> q2 -b-> q3
q0 -b-> q4 -a-> q2
q4 -b-> q5 -a-> q3
(where transitions are missing, they are self loops)
Note that there are six states: this is the same as if we had done the Cartesian product machine construction on the original DFAs of two and three states, respectively.
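For union, we can use the exact same DFA and change the set of accepting states to q1, q2, q3, q5. This captures the fact that we now accept when either condition is true: q1 and q2 are reached once the string contains an a, q5 is reached once it contains two bs, and q3 is where both conditions are satisfied. For example, the string ab contains an a, so it belongs to the union, and it ends in q2.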
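The Cartesian product construction mentioned above can be sketched in a few lines of Python. This is only an illustration: the function names and the integer encoding of the component states are assumptions of the sketch, not part of the original answer. Each product state is a pair (s1, s2), and the only difference between intersection and union is the acceptance test on that pair.

```python
def step_l1(state, ch):
    # DFA for L1 (at least one a): state 0 = no a seen yet, state 1 = a seen.
    return 1 if (state == 1 or ch == "a") else 0


def step_l2(state, ch):
    # DFA for L2 (at least two bs): the state counts bs seen so far, capped at 2.
    return min(2, state + 1) if ch == "b" else state


def run_product(word):
    """Run both DFAs in lockstep; the pair (s1, s2) is the product state.

    Correspondence with the six states named in the answer:
    (0,0)=q0, (1,0)=q1, (1,1)=q2, (1,2)=q3, (0,1)=q4, (0,2)=q5.
    """
    s1, s2 = 0, 0
    for ch in word:
        if ch not in ("a", "b"):
            raise ValueError(f"symbol {ch!r} is not in the alphabet {{a, b}}")
        s1, s2 = step_l1(s1, ch), step_l2(s2, ch)
    return s1, s2


def accepts_intersection(word):
    s1, s2 = run_product(word)
    return s1 == 1 and s2 == 2  # both component DFAs accept


def accepts_union(word):
    s1, s2 = run_product(word)
    return s1 == 1 or s2 == 2  # at least one component DFA accepts


if __name__ == "__main__":
    for w in ["", "b", "ab", "bb", "abb", "bab"]:
        print(repr(w), accepts_intersection(w), accepts_union(w))
```

Running both machines in lockstep avoids building the transition table explicitly, but it visits exactly the same six pairs that the table would contain.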

Related

complexity between FCBF & Greedy Forward Selection

I am reading about feature selection methods and comparing the FCBF method with the greedy forward selection method.
The complexity of FCBF is O(M N log N), where M is the number of dataset instances and N is the number of dataset features, as per the book "Understanding and Using Rough Set Based Feature Selection Concepts, Techniques and Applications", page 45,
and the complexity of the greedy forward selection method is O(m^2 M N log N),
where M is the total number of features, m is the number of features in the subset, and N is the number of data points, as per the book "OMEGA: ON-LINE MEMORY-BASED GENERAL PURPOSE SYSTEM CLASSIFIER", page 122.
Although both complexities look alike,
M and N have a different meaning in each one:
N is the number of features in the first and M is the number of features in the second.
My question is:
if we unify the symbols, will the second complexity be something like O(k^2 K M log M)?
Or are they actually the same variables, I mean N in the first is the N in the second and M in the first is the M in the second, and the problem is that I did not understand the equations correctly?

Could you explain this question? I am new to ML and I faced this problem, but its solution is not clear to me

The problem is in the picture
Question's image:
Question 2
Many substances that can burn (such as gasoline and alcohol) have a chemical structure based on carbon atoms; for this reason they are called hydrocarbons. A chemist wants to understand how the number of carbon atoms in a molecule affects how much energy is released when that molecule combusts (meaning that it is burned). The chemist obtains the dataset below. In the column on the right, kJ/mole is the unit measuring the amount of energy released.
You would like to use linear regression (h_a(x) = a0 + a1 x) to estimate the amount of energy released (y) as a function of the number of carbon atoms (x). Which of the following do you think will be the values you obtain for a0 and a1? You should be able to select the right answer without actually implementing linear regression.
A) a0=−1780.0, a1=−530.9 B) a0=−569.6, a1=−530.9
C) a0=−1780.0, a1=530.9 D) a0=−569.6, a1=530.9
Since all the a0 values are negative but two of the a1 values are positive, let's figure out a1 first.
As you can see, as the number of carbon atoms increases the energy becomes more and more negative, so the relation cannot be positively correlated, which rules out options C and D.
Then, for the intercept, the value that produces the least error is the correct one. For x = 1 and x = 10 (easier to calculate), the outputs are about -2300 and -7100 for A, and about -1100 and -5900 for B, so one would prefer B over A.
PS: You might be thinking there should be obvious values for a0 and a1 from the data, but there are not. The intention of the question is to give you a general understanding of the best fit. Also, this way of solving is kinda machine learning as well.
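The arithmetic in the answer above can be checked with a few lines of Python. Only the coefficients of options A and B come from the question; the helper name predict is an assumption of this sketch, and the dataset itself (which was in the image) is not reproduced here.

```python
# Candidate (a0, a1) pairs taken from options A and B of the question.
OPTIONS = {
    "A": (-1780.0, -530.9),
    "B": (-569.6, -530.9),
}


def predict(a0, a1, x):
    """Evaluate the linear hypothesis h_a(x) = a0 + a1 * x."""
    return a0 + a1 * x


if __name__ == "__main__":
    for name, (a0, a1) in OPTIONS.items():
        values = [round(predict(a0, a1, x), 1) for x in (1, 10)]
        print(name, values)
```

Comparing these two pairs of predictions with the first and last rows of the dataset is enough to choose between the intercepts.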

Method to calculate predict set of a grammar production in recursive descent parser

I understand first and follow but I am totally lost on the predict sets. can someone explain to me how to go about finding a predict set of a production in a grammar using the first and follow sets? I have not provided a grammar because this is for a homework assignment and I want to know how to do it not how to do it for this specific grammar.
Intuitively, the predict set for a production A → α [Note 1] is the set of terminal symbols which might be the next symbol to be read if that production is to be predicted. (That implies that the production's non-terminal (A) has already been predicted, and the parser must now decide which of the non-terminal's productions to predict.)
Obviously, that includes all the terminal symbols which might be the first symbol of the right-hand side. But what if the right-hand side might derive ε, the empty string? In that case, the next symbol in the input will be the first symbol which comes after the predicted non-terminal, A; in other words, it will be a member of FOLLOW(A). So the predict set contains the terminals which might start the right-hand side α, plus all the symbols in FOLLOW(A) if α could derive the empty string. [Note 2]
More formally, PREDICT(A → α) is:
FIRST(α) if ε ∉ FIRST(α)
(FIRST(α) ∪ FOLLOW(A)) - {ε} if ε ∈ FIRST(α)
Remember that we compute FIRST on a sentential form by "looking through" epsilons:
FIRST(aβ) is
FIRST(a) if ε ∉ FIRST(a)
(FIRST(a) - {ε}) ∪ FIRST(β) if ε ∈ FIRST(a)
Consequently, FIRST of a right-hand side only includes ε if every symbol in the right-hand side is nullable.
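The definitions above can be turned into a small fixed-point computation. The following Python sketch is an illustration only: the example grammar (E → T E', E' → + T E' | ε, T → id), the end-of-input marker $ and all function names are assumptions made for the sketch, not something taken from the question.

```python
EPS = "ε"
END = "$"  # assumed end-of-input marker, placed in FOLLOW of the start symbol

# Hypothetical example grammar; an empty list stands for an ε right-hand side.
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["id"]],
}
START = "E"


def first_of_sequence(seq, first):
    """FIRST of a sequence of symbols, looking through nullable symbols."""
    result = set()
    for sym in seq:
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol was nullable, or the sequence is empty
    return result


def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:  # iterate until no FIRST set grows any more
        changed = False
        for nt, productions in GRAMMAR.items():
            for rhs in productions:
                new = first_of_sequence(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(first):
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add(END)
    changed = True
    while changed:
        changed = False
        for nt, productions in GRAMMAR.items():
            for rhs in productions:
                for i, sym in enumerate(rhs):
                    if sym not in GRAMMAR:
                        continue  # FOLLOW is only defined for non-terminals
                    tail = first_of_sequence(rhs[i + 1:], first)
                    new = tail - {EPS}
                    if EPS in tail:
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


def predict(nt, rhs, first, follow):
    """PREDICT(nt -> rhs), following the two cases given in the answer."""
    rhs_first = first_of_sequence(rhs, first)
    if EPS in rhs_first:
        return (rhs_first | follow[nt]) - {EPS}
    return rhs_first


if __name__ == "__main__":
    first = compute_first()
    follow = compute_follow(first)
    for nt, productions in GRAMMAR.items():
        for rhs in productions:
            print(nt, "->", " ".join(rhs) or EPS, sorted(predict(nt, rhs, first, follow)))
```

For a grammar to be usable by a one-token-lookahead recursive descent parser, the predict sets of the alternatives of each non-terminal must be pairwise disjoint; in this small grammar the two alternatives of E' get disjoint sets.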
Notes:
1. I use the common convention that capital letters (A...) refer to non-terminals, lower-case letters (a...) refer to grammar symbols (terminals or non-terminals) and Greek letters (α...) refer to possibly empty sequences of grammar symbols.
2. Aside from the first step when the start symbol is predicted, the current prediction always contains more than one symbol. So if A is the next non-terminal to expand and we see that it is nullable (i.e., it could derive nothing), we don't really need to look up FOLLOW(A) because we could just look at the predict stack and see what we've predicted will follow A. In some cases, this might allow us to avoid a conflict with one of the other alternatives for A.
However, it is normal to use FOLLOW(A), regardless. Always using FOLLOW(A) is usually referred to as the "Strong LL" (SLL) algorithm. Although it seems like computing the FIRST set of the known prediction stack is more powerful than using a precomputed FOLLOW set, it does not actually improve the power of LL parsing at all; every LL(k) grammar can be converted to an equivalent SLL(k) grammar.

Finding an equivalent LR grammar for the same number of "a" and "b" grammar?

I can't seem to find an equivalent LR grammar for:
S → aSbS | bSaS | ε
which I think generates the strings with the same number of 'a's as 'b's.
What would be a workaround for this? Is it possible to find an LR grammar for this?
Thanks in advance!
EDIT:
I have found what I think is an equivalent grammar but I haven't been able to prove it.
I think I need to prove that the original grammar generates the language above, and then prove that the same language is generated by the following equivalent grammar. But I am not sure how to do it. How should I do it?
S → aBS | bAS | ε
B → b | aBB
A → a | bAA
Thanks in advance...
PS: I have already proven that this new grammar is LL(1), SLR(1), LR(1) and LALR(1).
Unless a grammar is directly related to another grammar -- for example through standard transformations such as normalization, null-production elimination, and so on -- proving that two grammars derive the same language is very difficult without knowing what the language is. It is usually easier to prove (independently) that each grammar derives the language.
The first grammar you provide:
S → aSbS | bSaS | ε
does in fact derive the language of all strings over the alphabet {a, b} where the number of as is the same as the number of bs. We can prove that in two parts: first, that every sentence derived by the grammar has that property, and second that every sentence which has that property can be derived by that grammar. Both proofs proceed by induction.
For the forward proof, we proceed by induction on the length of the derivation. Suppose we have some derivation S → α → β → … → ω where all the Greek letters represent sequences of non-terminals and terminals.
If the length of the derivation is exactly zero, so that it starts and ends with S, then the derived sentential form contains no terminals at all, so it's clear that it has the same number of as and bs (namely zero). (Base step)
Now for the induction step. Suppose that every derivation of length i is known to end with a sentential form which has the same number of as and bs. We want to prove from that premise that every derivation of length i+1 ends with a sentential form which has the same number of as and bs. But that is also clear: each of the three productions replaces an S with something that contains equally many as and bs (one of each, or none at all), so the two counts stay equal.
Now, let's look at the opposite direction: every sentence with the same number of as and bs can be derived from that grammar. We'll do this by induction on the length of the string. Our induction step will be: if, for every j ≤ i, every sentence with exactly j as and j bs has a derivation from S, then so does every sentence with exactly i+1 as and i+1 bs. (Here we are only considering sentences consisting only of terminals.)
Consider such a sentence. It either starts with an a or a b. Suppose that it starts with an a: then there is at least one b in the sentence such that the prefix ending with that b has the same number of each terminal. (Think of the string as a walk along a square grid: every a moves diagonally up and right one unit, and every b moves diagonally down and right. Since the endpoint is at exactly the same height as the beginning point and there are no wormholes in the graph, once we ascend we must sooner or later descend back to the starting height, and the step which gets us there is a b, which gives us a prefix ending in b.) So the interior of that prefix (everything except the a at the beginning and the b at the end) is balanced, as is the remainder of the string. Both of those are shorter, so by the induction hypothesis they can be derived from S. Making those substitutions, we get aSbS, which can be derived from S. An identical argument applies to strings starting with b. Again, the base step is trivial.
So that's basically the proof procedure you'll need to adapt for your grammar.
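The second half of the argument is constructive, so it can be written down as a recursive procedure. The following Python sketch is only an illustration of that construction for S → aSbS | bSaS | ε; the function names and the tuple representation of a derivation tree are assumptions of the sketch.

```python
from itertools import product


def derive(s):
    """Build a derivation tree for s, following the inductive argument.

    () stands for S -> ε; (x, left, y, right) stands for S -> x S y S,
    where x and y are the two different terminals.
    Raises ValueError if s does not have equally many as and bs.
    """
    if set(s) - {"a", "b"}:
        raise ValueError("string must consist of as and bs only")
    if s == "":
        return ()
    first = s[0]
    height = 0
    for i, ch in enumerate(s):
        # The first letter moves up, the other letter moves down.
        height += 1 if ch == first else -1
        if height == 0:
            # s[:i + 1] is the shortest balanced prefix; it ends with the
            # other letter, and both its interior and the rest are shorter.
            return (first, derive(s[1:i]), s[i], derive(s[i + 1:]))
    raise ValueError("unequal number of as and bs")


def yield_of(tree):
    """Read the terminal string back off a derivation tree."""
    if tree == ():
        return ""
    x, left, y, right = tree
    return x + yield_of(left) + y + yield_of(right)


def check_up_to(max_len):
    """Exhaustively compare derive() with the counting definition."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            s = "".join(letters)
            balanced = s.count("a") == s.count("b")
            try:
                ok = yield_of(derive(s)) == s
            except ValueError:
                ok = False
            if ok != balanced:
                return False
    return True


if __name__ == "__main__":
    print(derive("abba"))
    print(check_up_to(8))
```

An exhaustive check up to some length is of course not a proof, but it is a cheap way to catch a mistake in a candidate grammar or decomposition before attempting the induction.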
Good luck.
By the way, this sort of question can also be posed on cs.stackexchange.com or math.stackexchange.com, where MathJax is available. MathJax makes writing out mathematical proofs much less tedious, so you may well find that you'll get more readable answers there.

how to parse Context-sensitive grammar?

A CSG is similar to a CFG, but the left-hand side of a production can consist of multiple symbols.
So, can I just use a CFG parser to parse a CSG, by letting a reduction produce multiple terminals or non-terminals?
Like
1. S → a b c
2. S → a S B c
3. c B → W B
4. W B → W X
5. W X → B X
6. B X → B c
7. b B → b b
When we meet W X, can we just reduce W X to W B?
When we meet W B, can we just reduce W B to c B?
So if a CSG parser is based on a CFG parser, it's not hard to write. Is that true?
But when I checked Wikipedia, it said that to parse a CSG we should use a linear bounded automaton.
What is a linear bounded automaton?
Context-sensitive grammars are non-deterministic. So you cannot assume that a reduction should take place just because the RHS happens to be visible at some point in a derivation.
LBAs (linear-bounded automata) are also non-deterministic, so they are not really a practical algorithm. (You can simulate one with backtracking, but there is no convenient bound on the amount of time it might take to perform a parse.) The fact that they are acceptors for CSGs is interesting for parsing theory but not really for parsing practice.
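To make the cost of non-determinism concrete, here is a brute-force recognizer for the seven-rule grammar in the question. It is a sketch under stated assumptions, not an LBA simulation and not a practical parser: it relies on the fact that none of the rules shrinks the sentential form, so a breadth-first search over all forms no longer than the input must terminate. The names RULES and derivable are made up for the sketch.

```python
from collections import deque

# The rules from the question, as (left-hand side, right-hand side) strings.
# Upper-case letters are non-terminals, lower-case letters are terminals.
RULES = [
    ("S", "abc"),
    ("S", "aSBc"),
    ("cB", "WB"),
    ("WB", "WX"),
    ("WX", "BX"),
    ("BX", "Bc"),
    ("bB", "bb"),
]


def derivable(target):
    """Search all sentential forms of length <= len(target), starting from S."""
    if not target or any(ch not in "abc" for ch in target):
        return False
    seen = {"S"}
    queue = deque(["S"])
    while queue:
        form = queue.popleft()
        if form == target:
            return True
        for lhs, rhs in RULES:
            start = form.find(lhs)
            while start != -1:  # try the rule at every position where it fits
                new = form[:start] + rhs + form[start + len(lhs):]
                if len(new) <= len(target) and new not in seen:
                    seen.add(new)
                    queue.append(new)
                start = form.find(lhs, start + 1)
    return False


if __name__ == "__main__":
    for w in ["abc", "aabbcc", "aabbc", "abcabc"]:
        print(w, derivable(w))
```

The set of visited forms can grow exponentially with the input length for a general grammar, which is the practical problem the answer alludes to.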
Just as with CFGs, there are different classes of CSGs. Some restricted subclasses of CSGs are easier to parse (CFGs are one subclass, for example), but I don't believe there has been much investigation into practical uses; in practice, CSGs are hard to write, and there is no obvious analog of a parse tree which can be constructed from a derivation.
For more reading, you could start with the Wikipedia entry on LBAs and continue by following its references. Good luck.
