Show that L and its complement cannot both be finite - automata

I'm trying to complete these exercises for my automata theory class. The book I have explains this stuff really badly, and I'm kind of lost on how to start, as I'm not sure what I should be looking at.
Let L be any language on a non-empty alphabet. Show that L and the complement of L cannot both be finite.
I know the complement of L (I'll use L# for the complement of L) is L# = Σ* - L, but I don't know where to go from there.

Let a be a letter of your alphabet. Assume, for the sake of contradiction, that both L and its complement L# are finite. Then their union L + L# = Σ* is also finite. But Σ* contains all the words a^n for every natural n, i.e. infinitely many, a contradiction.
This is as much about infinite sets as it is about automata and languages: you cannot split an infinite set into a finite number of finite sets.
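If it helps, the whole argument fits on one line (my notation, in LaTeX; a is any fixed letter of Σ):

|L| < \infty \;\wedge\; |L^{\#}| < \infty \;\Longrightarrow\; |L \cup L^{\#}| < \infty,
\qquad \text{but}\quad L \cup L^{\#} = \Sigma^{*} \supseteq \{\, a^{n} : n \in \mathbb{N} \,\}, \text{ which is infinite.}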


LCS (Longest Common Subsequence) - get best K solutions

The LCS problem takes two strings and returns their longest common subsequence.
For example:
LCS on the strings elephant and eat is 3, as the whole string eat is a subsequence of elephant (indices 0,5,7 or 2,5,7)
Another example:
LCS on the strings elephant and olives is 2, as their longest common subsequence is le
The question is whether there is an algorithm that not only returns the optimal solution, but can return the K best solutions.
There is an algorithm to return all the optimal solutions (I think this is what you asked).
As in Wikipedia:
Using the dynamic programming algorithm for two strings, the table is constructed, then backtracked from the end to the beginning recursively, with the added computation that if either of (i, j-1) or (i-1, j) could be the point preceding the current one, then both paths are explored. This leads to exponential computation in the worst case.
There can be an exponential number of these optimal sequences in the worst case!
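One way to sketch that (my own Python, not from the Wikipedia article; the standard table plus the tie-exploring backtrack described above):

def all_lcs(x, y):
    # dp[i][j] = length of the LCS of x[:i] and y[:j]
    m, n = len(x), len(y)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # Backtrack from (m, n); when (i-1, j) and (i, j-1) tie, explore
    # both paths -- this is the exponential worst case mentioned above.
    def backtrack(i, j):
        if i == 0 or j == 0:
            return {""}
        if x[i - 1] == y[j - 1]:
            return {s + x[i - 1] for s in backtrack(i - 1, j - 1)}
        out = set()
        if dp[i - 1][j] >= dp[i][j - 1]:
            out |= backtrack(i - 1, j)
        if dp[i][j - 1] >= dp[i - 1][j]:
            out |= backtrack(i, j - 1)
        return out

    return backtrack(m, n)

print(all_lcs("elephant", "eat"))  # {'eat'}

For the K best (not only optimal) solutions, you would generalize the recursion to keep the K longest distinct subsequences per cell instead of a single length.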

Finding an equivalent LR grammar for the same number of "a" and "b" grammar?

I can't seem to find an equivalent LR grammar for:
S → aSbS | bSaS | ε
which I think recognizes strings with the same number of 'a's as 'b's.
What would be a workaround for this? Is it possible to find an LR grammar for this?
Thanks in advance!
EDIT:
I have found what I think is an equivalent grammar but I haven't been able to prove it.
I think I need to prove that the original grammar generates the language above, and then prove that the same language is generated by the following equivalent grammar. But I am not sure how to do it. How should I do it?
S → aBS | bAS | ε
B → b | aBB
A → a | bAA
Thanks in advance...
PS: I have already proven that this new grammar is LL(1), SLR(1), LR(1) and LALR(1).
Unless a grammar is directly related to another grammar -- for example, through standard transformations such as normalization, null-production elimination, and so on -- proving that two grammars derive the same language is very difficult without knowing what the language is. It is usually easier to prove (independently) that each grammar derives the language.
The first grammar you provide:
S → aSbS | bSaS | ε
does in fact derive the language of all strings over the alphabet {a, b} in which the number of as is the same as the number of bs. We can prove that in two parts: first, that every sentence derived by the grammar has that property, and second, that every sentence which has that property can be derived by that grammar. Both proofs proceed by induction.
For the forward proof, we proceed by induction on the length of the derivation. Suppose we have some derivation S → α → β → … → ω, where all the Greek letters represent sequences of non-terminals and terminals.
If the length of the derivation is exactly zero, so that it starts and ends with S, then there are no terminals in the derived sentence, so it's clear that it has the same number of as and bs. (Base step.)
Now for the induction step. Suppose that every derivation of length i is known to end with a derived sentence which has the same number of as and bs. We want to prove from that premise that every derivation of length i+1 ends with a sentence which has the same number of as and bs. But that is also clear: each of the three productions adds either one a and one b (aSbS, bSaS) or nothing at all (ε), so every step preserves the balance.
Now, let's look at the opposite direction: every sentence with the same number of as and bs can be derived from that grammar. We'll do this by strong induction on the length of the string. Our induction premise will be: if for every j ≤ i, every sentence with exactly j as and j bs has a derivation from S, then every sentence with exactly i+1 as and i+1 bs also has a derivation from S. (Here we are only considering sentences consisting only of terminals.)
Consider such a sentence. It either starts with an a or a b. Suppose that it starts with an a: then there is at least one b in the sentence such that the prefix ending with that b has the same number of each terminal. (Think of the string as a walk along a square grid: every a moves diagonally up and right one unit, and every b moves diagonally down and right. Since the endpoint is at exactly the same height as the starting point and there are no wormholes in the graph, once we ascend we must sooner or later descend back to the starting height, and that descent gives a prefix ending in b.) So the interior of that prefix (everything except the a at the beginning and the b at the end) is balanced, as is the remainder of the string. Both of those are shorter, so by the induction hypothesis they can be derived from S. Making those substitutions, we get aSbS, which can be derived from S. An identical argument applies to strings starting with b. Again, the base step is trivial.
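If you want to sanity-check that decomposition mechanically before writing the proof up, here is a small Python sketch of mine that decides membership by exactly that split:

def derivable(w):
    # Mirrors the inductive argument: scan for the first prefix where the
    # counts balance, then recurse on the interior and the remainder.
    if w == "":
        return True              # S -> epsilon
    height = 0
    for k, ch in enumerate(w):
        height += 1 if ch == w[0] else -1
        if height == 0:
            # w = w[0] + interior + mate + rest, where the mate is the
            # opposite letter of w[0]; matches S -> aSbS / S -> bSaS.
            return derivable(w[1:k]) and derivable(w[k + 1:])
    return False                 # counts never balance: not in the language

print(derivable("abba"), derivable("aab"))  # True False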
So that's basically the proof procedure you'll need to adapt for your grammar.
Good luck.
By the way, this sort of question can also be posed on cs.stackexchange.com or math.stackexchange.com, where MathJax is available. MathJax makes writing out mathematical proofs much less tedious, so you may well find that you'll get more readable answers there.

When does the hypothesis space contain the target concept

What does it mean when it is written that the hypothesis space contains the target concept?
If possible, please explain with an example.
TLDR: It means you can learn with zero error.
Here is an example of what it means. Suppose the concept is f(a,b,c,d) = a & b & (!c | !d), where the inputs are Boolean.
In an ML task this concept is usually represented by data, so you are given a dataset:
a | b | c | d | f
--+---+---+---+---
T | T | T | T | F
T | T | T | F | T
T | T | F | T | T
... etc ...
And your hypothesis space is decision trees. In this case your hypothesis space contains the target concept, since a decision tree can represent it, for example as shown below (there are more possibilities):
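For instance, one such tree written as nested Python conditionals (my sketch; each if is an internal node testing one attribute, each return a leaf):

def tree(a, b, c, d):
    # One decision tree computing f = a & b & (!c | !d)
    if not a:
        return False   # leaf
    if not b:
        return False   # leaf
    if not c:
        return True    # leaf: a & b & !c is already enough
    return not d       # leaf: with a, b, c all true, the output is !d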
It can be proven that any binary (Boolean) formula, i.e. any such concept, can be represented as a decision tree. Thus general binary formulas are a subset of decision trees. That means that when you know the concept is a binary formula (which you may not even know), you will be able to learn it with a decision tree (given enough examples) with zero error.
On the other hand, if you want to learn the example concept with monotone conjunctions, you can't do it, because binary formulas are not a subset of monotone conjunctions.
(By subsets, I mean subsets in terms of the sets of representable concepts. From the subset relation you can make statements about whether the hypothesis space contains the target concept.)
A monotone conjunction is a conjunction in which the variables are not negated. If you have several of them, and the output is true whenever any one of the conjunctions is true, you get monotone DNF: a subset of DNF where you cannot use negations.
Some concepts can be learned by monotone conjunctions, but you cannot learn a general binary formula concept that way. That means you will not be able to learn it with zero error: general binary formulas are not a subset of monotone conjunctions.
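You can verify that for the example concept by brute force (a little sketch of mine; a monotone conjunction is determined by which of the four variables it includes, so there are only 16 candidates):

from itertools import product, combinations

def f(a, b, c, d):
    return a and b and ((not c) or (not d))

inputs = list(product([False, True], repeat=4))
matches = [subset
           for r in range(5)
           for subset in combinations("abcd", r)
           # the conjunction of the chosen (unnegated) variables must
           # agree with f on every input
           if all(all(x["abcd".index(v)] for v in subset) == f(*x)
                  for x in inputs)]
print(matches)  # [] -- no monotone conjunction computes f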
Here is a nice PDF from Princeton on basics of ML: http://www.cs.princeton.edu/courses/archive/spr06/cos511/scribe_notes/0209.pdf

Sum of all the bits in a Bit Vector of Z3

Given a bit vector in Z3, I am wondering how I can sum up the individual bits of this vector.
E.g.,
a = BitVecVal(3, 2)
sum_all_bit(a) = 2
Are there any pre-implemented APIs/functions that support this? Thank you!
It isn't part of the bit-vector operations.
You can create an expression as follows:
from functools import reduce
from z3 import BitVecVal, Extract, Concat

def sub(b):
    n = b.size()
    # pull out each bit as a 1-bit vector
    bits = [Extract(i, i, b) for i in range(n)]
    # zero-extend every bit back to n bits so they can be added (n > 1)
    bvs = [Concat(BitVecVal(0, n - 1), bit) for bit in bits]
    # add them all up
    return reduce(lambda a, b: a + b, bvs)

print(sub(BitVecVal(4, 7)))
Of course, log(n) bits for the result will suffice if you prefer.
The page:
https://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetNaive
has various algorithms for counting the bits; which can be translated to Z3/Python with relative ease, I suppose.
My favorite is: https://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetKernighan
which has the nice property that it loops as many times as there are set bits in the input. (But you shouldn't extrapolate from that to any meaningful complexity metric, as you do arithmetic in each loop, which might be costly. The same is true for all these algorithms.)
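For concrete (non-symbolic) values, that trick is a two-liner in plain Python (my transcription of the linked bit hack):

def popcount_kernighan(v):
    # v &= v - 1 clears the lowest set bit, so the loop body runs
    # exactly once per set bit of v.
    c = 0
    while v:
        v &= v - 1
        c += 1
    return c

print(popcount_kernighan(0b101101))  # 4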
Having said that, if your input is fully symbolic, you can't really beat the simple iterative algorithm, as you can't short-cut the iteration count. The above methods might work faster if the input has concrete bits.
So you're computing the Hamming weight of a bit vector. One of the developers gave this answer to a previous question of mine; based on that original answer, this is how I do it today:
from math import ceil, log2
from z3 import Sum, ZeroExt, Extract

def HW(bvec):
    # zero-extend each extracted bit just enough to hold the full count
    return Sum([ZeroExt(int(ceil(log2(bvec.size()))), Extract(i, i, bvec))
                for i in range(bvec.size())])
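Assuming the HW above is in scope, a typical use is to constrain a fully symbolic vector by its weight, e.g.:

from z3 import BitVec, Solver, sat

x = BitVec('x', 8)
s = Solver()
s.add(HW(x) == 3)          # ask for an 8-bit value with exactly 3 set bits
if s.check() == sat:
    print(s.model()[x])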

Why a good choice of mod is "a prime not too close to an exact power of 2"

To generate a hash function, map a key k into one of m slots by taking the remainder of k divided by m. That is, the hash function is
h(k) = k mod m.
I have read in several places that a good choice of m will be
A prime - I understand that we want to remove common factors, hence a prime number is chosen
not too close to an exact power of 2 - why is that?
From Introduction to Algorithms:
When using the division method, we avoid certain values of m. For example, m should not be a power of 2, since if m = 2^p, then h(k) is just the p lowest-order bits of k. Unless we know that all low-order p-bit patterns are equally likely, it is better to make the hash function depend on all the bits of the key.
For example, if you choose m = 2^3 = 8 (so p = 3), the hashed keys depend only on the lowest 3 bits of the key, which is bad: when you hash, you want the result to depend on as much of the key as possible to get a good distribution.
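You can see this with a couple of keys that share their low 3 bits (a quick sketch):

M = 8  # 2**3, so p = 3
keys = [0b101, 0b110101, 0b111111101]  # identical low 3 bits, different high bits
print([k % M for k in keys])  # [5, 5, 5] -- the high bits never matter
print([k % 7 for k in keys])  # [5, 4, 5] -- a prime modulus uses the high bits too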
