Can Countable string is Countable always? - automata

Why are some sets countable and others not? Say regular sets are countable, but how can (0+1)* be countable? It is an infinite string, so how could it be a countable set?
How is the set of all non-decreasing functions from N to N countable?
How is the set of all finite partitions of N uncountable?

First let's clear up a common misconception: (0+1)* is not an infinite string. It's a language consisting of an infinite number of finite strings. The distinction is important: any given string from the language is finite, but there are an infinite number of them.
L=(0+1)* means L={'', '0', '1', '00', '01', '10', '11', ...}. This also shows how a language can be considered countable: list all of the words of length 0, then 1, then 2, and so on. Every word in the language has a position in this list and can therefore be mapped to a natural number.
All regular languages are countable sets of words. Finite languages are trivially countable. Infinite regular languages are countable because the corresponding DFA can be walked over, enumerating the entire language in an ordered manner (for example by increasing word length), allowing all strings to be mapped to the natural numbers.
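The listing order described above (length 0, then 1, then 2, ...) can be sketched in Python. The function names `shortlex` and `index_of` are made up for this illustration; the sketch only shows that every finite string over {0, 1} gets exactly one natural number.

```python
from itertools import count, islice, product

def shortlex(alphabet="01"):
    """Yield every finite string over `alphabet`, shortest first."""
    for length in count(0):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)

def index_of(word, alphabet="01"):
    """Return the 0-based position of `word` in the shortlex listing."""
    k = len(alphabet)
    # Number of strictly shorter words: k^0 + k^1 + ... + k^(len-1).
    shorter = sum(k**i for i in range(len(word)))
    # Rank of `word` among words of the same length (base-k number).
    rank = 0
    for ch in word:
        rank = rank * k + alphabet.index(ch)
    return shorter + rank

first_seven = list(islice(shortlex(), 7))
```

Here `first_seven` holds the seven words listed in the answer, and `index_of` is the inverse of the enumeration, which is what makes the mapping to the natural numbers explicit.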
The other two questions are more general mathematics than computer science. This should help: https://math.stackexchange.com/questions/1396896/number-of-non-decreasing-functions. For the last question, this might help: https://en.wikipedia.org/wiki/Cantor%27s_theorem

Related

Why are tables in Lua designed this way?

I am confused about the syntax below in Lua:
w = {x=0, y=0}
w[1] = "another"
In my opinion, the first statement describes w as a dict-like structure and the second as an array. Is the w in the first statement the same as the w in the second? If so, why? Why can two different kinds of things be stored in one w?
I'm a rookie in Lua and also in English, pardon.
I want to know some of the thinking behind Lua's design and an explanation for my question.
An array is conceptually just a series of key/value pairs. It's just that the "keys" are all integers and are a sequence of integers starting from (in Lua's case) 1.
Lua recognizes that a "dictionary" and "array" are really the same thing. It bundles these two concepts together into a single type: Lua's "table".
In a Lua table, keys can be (almost) anything. Including integers. Including integers starting from 1 and increasing. As such, a Lua table is said to have an "array portion", which consists of all of the integer keys from 1 up to the highest integer whose value is not nil (provided there are no nil gaps in between). This is what it means to take the "length" of a table.
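As a rough model only (in Python rather than Lua, with a plain `dict` standing in for a table), the idea of one container holding both kinds of key can be sketched like this. The helper `array_length` is hypothetical and simply counts keys 1, 2, 3, ... until the first missing value; Lua's real `#` operator is only well defined when the integer keys have no gaps.

```python
def array_length(table):
    """Count consecutive integer keys 1, 2, 3, ... that hold a value."""
    n = 0
    while table.get(n + 1) is not None:
        n += 1
    return n

w = {"x": 0, "y": 0}   # models  w = {x=0, y=0}
w[1] = "another"       # models  w[1] = "another": same container, new key

length_of_w = array_length(w)
```

The string keys and the integer key live side by side in the same object; only the integer keys counted from 1 contribute to the "length".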

What does the compare function in F# do

I was reading through the F# documentation and came across the compare function. The examples in the docs do not really make it clear what the function does. I also tried it with a few inputs, but couldn't find a clear pattern.
When comparing lists the values are either -1, 0 or 1.
> compare [1;2;4] [8;1;4;9]
-1
> compare [1;2;4] [1;2;3]
1
> compare [1;2;4] [1;2;4]
0
But when comparing strings the numbers can get larger than 1.
> compare "abf" "abc"
3
What does compare compare?
The F# Language specification provides a formal description of language elements. For the compare function see p. 173 "8.15.6 Behavior of Hash, =, and Compare", where the behavior is described in pseudocode to achieve the following objectives:
Ordinal comparison for strings
Structural comparison for arrays
Natural ordering for native integers (which do not support System.IComparable)
Structural comparison, an important concept in functional programming, applies to tuples, lists, options, arrays, and user-defined record, union, and struct types whose constituent field types permit structural equality, hashing, and comparison.
For strings, the comparison relies on System.String.CompareOrdinal, whose return values are described under the System.String.Compare method:
Less than zero: strA precedes strB in the sort order.
Zero: strA occurs in the same position as strB in the sort order.
Greater than zero: strA follows strB in the sort order.
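To see where a value like 3 can come from, here is a small Python sketch of an ordinal comparison. It is an assumption about the magnitude, not documented .NET behavior: the documentation quoted above only specifies the sign of the result. The sketch returns the difference between the first pair of differing UTF-16 code units, which is consistent with the 3 observed for "abf" and "abc" ('f' is 102, 'c' is 99).

```python
def utf16_units(s):
    """Return the UTF-16 code units of `s` as a list of integers."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]

def compare_ordinal(a, b):
    """Sign follows ordinal order; magnitude is an illustrative choice."""
    ua, ub = utf16_units(a), utf16_units(b)
    for x, y in zip(ua, ub):
        if x != y:
            return x - y          # first differing code unit decides
    return len(ua) - len(ub)      # a proper prefix sorts first

result = compare_ordinal("abf", "abc")
```

Code relying on `compare` should test only whether the result is negative, zero, or positive.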

Swift 3: most performant way to check many strings with many regular expressions

I have a list of several hundred strings and an array of 10k regular expressions.
I now have to iterate over all strings and check which of the 10k regular expressions match. What's the most performant way to do this?
Currently I'm doing this:
myRegularExpression.firstMatch(in: myString, options: myMatchingOption, range: NSMakeRange(0, myString.characters.count)) == nil
where myRegularExpression is an NSRegularExpression stored for reuse and myMatchingOption is NSRegularExpression.MatchingOptions(rawValue: 0)
Is there a faster, more performant way to check if a string matches one of those 10k regular expressions?
EDIT:
I need to know not only IF one of my 10k regular expressions fits but also which one. So currently I have a for loop inside a for loop: the outer one iterates over my several hundred strings, and for each of these strings I iterate over my 10k rules and see if one rule fits (of course if one fits I can stop for that string), so roughly:
for string in stringsToCheck {
    for rule in myRules {
        if string.matches(rule) {
            // continue with next string of stringsToCheck
        }
    }
}
Depending on what platform you're running this on, splitting the work across multiple threads may provide some response time improvements, but I believe that really dramatic optimization for this would require some insight into the nature of the regular expressions.
For example, if the expressions don't have a specific precedence order, you could rearrange (reorder) them to make the most likely "matches" come first in the list. This could be evaluated pre-emptively either by the supplier of the expressions or using some function to estimate their complexity (e.g. length of the expression, presence of optional or combinatory symbols).
Or it could be evaluated statistically by collecting (and persisting) hit/miss counts for each expression. But of course such an optimization assumes that every string will match at least one expression and that the 80/20 rule applies (i.e 20% of the expressions match 80% of the strings).
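The statistical reordering idea can be sketched as follows, in Python with the standard `re` module rather than Swift and `NSRegularExpression`. The class name `RuleList` and its methods are invented for this illustration, and the sketch assumes, as stated above, that the rules have no required precedence order, since reordering changes which rule is reported when several match.

```python
import re

class RuleList:
    """Keeps compiled rules together with a hit counter for each."""

    def __init__(self, patterns):
        # Each entry is [compiled pattern, number of hits so far].
        self.rules = [[re.compile(p), 0] for p in patterns]

    def first_match(self, text):
        """Return the pattern text of the first rule that matches, or None."""
        for entry in self.rules:
            if entry[0].search(text):
                entry[1] += 1
                return entry[0].pattern
        return None

    def reorder(self):
        """Move the most frequently hit rules to the front (stable sort)."""
        self.rules.sort(key=lambda entry: entry[1], reverse=True)
```

Calling `reorder()` periodically (or persisting the counters between runs) lets frequent matches terminate the inner loop early; whether it helps depends on how skewed the hit distribution really is.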
If the expressions are very simple and only make use of letter patterns, then you would get better performance with more "manual" implementations of the matching functions (instead of regex). In the best case scenario, simple letter patterns can be converted into a character tree and yield orders of magnitude in performance improvements.
Note that these solutions are not mutually exclusive. For example, if a large proportion of the expressions are simple patterns and only a few have complex patterns then you don't have to throw away the baby with the bath water: you can apply the simple pattern optimization to a subset of the rules and use the "brute force" nested loop to the remaining complex ones.
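A minimal sketch of the "character tree" idea, in Python and under a strong assumption that is not in the question: that the simple rules are plain literal prefixes. The names `build_trie` and `matching_rules` are hypothetical. One walk over the string then finds every matching simple rule, instead of testing each rule separately.

```python
def build_trie(prefixes):
    """Build a nested-dict trie; rule ids are stored under the key None."""
    root = {}
    for rule_id, prefix in enumerate(prefixes):
        node = root
        for ch in prefix:
            node = node.setdefault(ch, {})
        node.setdefault(None, []).append(rule_id)
    return root

def matching_rules(trie, text):
    """Return ids of all literal prefixes that `text` starts with."""
    found = list(trie.get(None, []))   # an empty prefix matches everything
    node = trie
    for ch in text:
        node = node.get(ch)
        if node is None:
            break                      # no rule continues with this character
        found.extend(node.get(None, []))
    return found
```

The cost per string is bounded by the string's length, regardless of how many simple rules are loaded; rules that need real regex features would still go through the nested loop.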
I had a similar problem in the past where thousands of rules would need to be applied on hundreds of thousands of records for processing insurance claims. The traditional "expert system" approach was to create a list of rules and run every record through it. Obviously that would take a ridiculous amount of time (like 2 months execution time to process one month of claims). Looking at it with a less than "purist" mindset, I was able to convince my customer that rules should be defined hierarchically. So we split them into a set of eligibility rules and a set of decision rules. Then we further refined the structure by creating eligibility groups and decision groups. What we ended up with was a coarse tree structure where rules would allow the system to narrow down the number of rules that should be applied to a given record. With this, the 6 week processing time for 250,000 records was cut down to 7 hours (this was in 1988 mind you).
All this to say that taking a step back into the nature of the problem to solve may provide some optimization opportunities that are not visible when looking merely at the mechanics of one process option.

How do ɛ-transitions work in nondeterministic finite automata?

I am confused about the implementation of a language by an automaton. Does the automaton go directly to the next state if there is a ɛ-transition? Suppose I have an automaton consisting of three states a, b, and c (where a is initial state and c the accepting state) with alphabet {0,1}. How does the following work?
a----ɛ--->(b----0---->a)
(b----1---->c)
Is the string "1" accepted? What if we had
a---1--->b----ɛ--->c
? Would the string "1" be accepted?
Does the automaton go directly to the next state if there is an ɛ-transition?
Roughly speaking, yes. An ɛ-transition (in a non-deterministic finite automaton, or NFA, for short) is a transition that is not associated with the consumption of any symbol (0 or 1, in this case). Once you understand that, it's easy (in this case) to derive deterministic finite automata (or DFA, for short) that are equivalent to your NFAs and identify the languages that the latter describe.
Suppose I have an automaton [...] Is the string "1" accepted?
Yes. Here is a nicer diagram (courtesy of LaTeX and tikz) of your first NFA:
An equivalent DFA would be:
Once you have that, it's easy to see that the language accepted by that NFA is the set of strings that
start with zero or more 0's,
and then end with exactly one 1 (that is, 0*1).
The string "1", because it starts with zero 0's and ends with one 1, is indeed accepted.
What if we had [...]? Would the string "1" be accepted?
Yes. Here is a nicer diagram of your second NFA:
An equivalent DFA would be:
In fact, it's easy to see that "1" is the only accepted string, here.
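Both answers can be checked with a small NFA simulator. This is a sketch in Python; the transition-table encoding, the empty string used as the label for ɛ, and the names `epsilon_closure` and `accepts` are choices made for this illustration. The key step is that after every move (and before reading anything) the set of current states is extended with everything reachable through ɛ-transitions alone.

```python
EPS = ""  # label used in this sketch for ɛ-transitions

def epsilon_closure(states, delta):
    """All states reachable from `states` using only ɛ-transitions."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in delta.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

def accepts(word, delta, start, accepting):
    """Simulate the NFA on `word`, tracking the set of possible states."""
    current = epsilon_closure({start}, delta)
    for symbol in word:
        moved = {t for s in current for t in delta.get((s, symbol), ())}
        current = epsilon_closure(moved, delta)
    return bool(current & accepting)

# First NFA from the question: a --ɛ--> b, b --0--> a, b --1--> c
first_nfa = {("a", EPS): {"b"}, ("b", "0"): {"a"}, ("b", "1"): {"c"}}
# Second NFA from the question: a --1--> b --ɛ--> c
second_nfa = {("a", "1"): {"b"}, ("b", EPS): {"c"}}

first_accepts_1 = accepts("1", first_nfa, "a", {"c"})
second_accepts_1 = accepts("1", second_nfa, "a", {"c"})
```

Both results are `True`, matching the two "Yes" answers above.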

Given a language a^n b^m such that n and m have some relation between them, does that imply the language cannot be regular? Am I correct? [duplicate]

I know a^n b^n for n > 0 is not regular by the pumping lemma, but I would imagine a*b* to be regular since the runs of a's and b's don't have to be the same length. Is there a proof for it being regular or not?
Answer to your question:
imagine a*b* to be regular, Is there a proof for it being regular or not?
No need to imagine: the expression a*b* is called a regular expression (RE), and regular expressions exist only for regular languages. If a language is not regular, then no regular expression exists for it, and if a language is regular, then we can always represent it by some regular expression.
Yes, a*b* represents a regular language.
Language description: any number of a's followed by any number of b's (by any number I mean zero or more times, so the null string ^ is included). Some example strings are:
{^, a, b, aab, abbb, aabbb, ...}
DFA for RE a*b* will be as follows:
      a-            b-
      ||            ||
      ▼|            ▼|
---►((Q0))---b---►((Q1))
In figure: `(())` means final state, so both `{Q0, Q1}` are final states.
You need to understand the following basic concept:
What basically is a regular language? And why is an infinite language like `a*b*` regular whereas languages like `{ a^n b^n | n > 0 }` are not regular?
A language (a set) is called a regular language if only a bounded (finite) amount of information needs to be stored at any instant while processing strings of the language.
So, what is 'bounded' information?
For example: Consider a fan 'on'/'off' switch. By viewing the fan switch we can say whether the fan is in the on or off state (this is bounded or limited information). But we cannot tell 'how many times' the fan has been switched on or off in the past! (To memorize this, we require a mechanism to store an 'unbounded' amount of information to count 'how many times', e.g. the meter used in our cars/bikes.)
The language { a^n b^n | n > 0 } is not a regular language because here n is unbounded (it can be arbitrarily large). To verify strings in the language a^n b^n, we need to memorize how many a symbols there are, and counting them requires unbounded memory because the number of a symbols in a string can be arbitrarily large!
That means an automaton can only process strings of the language a^n b^n if it has unbounded memory, e.g. a PDA with its stack.
Whereas a*b* is of course regular by its nature, because there is only the bounded restriction that b may come after some a (and a can't come after b). That is why every string of this language can easily be processed (or recognized) by an automaton with finite memory, and finite automata are the class of automata where memory is finite. Yes, in finite automata, we have a finite amount of memory in the form of states.
(Memory in finite automata is present in the form of the states Q, and by the automata principle any automaton can have only finitely many states. Hence finite automata have finite memory; this is the reason the class of automata for regular languages is called finite automata. You can think of a finite automaton as a CPU without memory that has finite registers to remember its internal state.)
Finite states ⇒ finite memory ⇒ only a language for which a finite amount of memory needs to be stored at any instant while processing the string can be processed ⇒ such a language is called a regular language.
The absence of external memory is the limitation of finite automata ⇒ or, we can say, the limitation of finite automata defines the class of languages called regular languages.
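To make the memory argument concrete, here is a small Python sketch of a recognizer for { a^n b^n | n > 0 }. The function name `is_anbn` is made up for this illustration. Note the integer `count`: it has to be able to grow with the input, which is exactly the unbounded information that a fixed, finite set of states cannot hold.

```python
def is_anbn(word):
    """Return True if `word` is a^n b^n for some n > 0."""
    count = 0          # unbounded counter: grows with the number of a's
    i = 0
    while i < len(word) and word[i] == "a":
        count += 1
        i += 1
    if count == 0:
        return False   # n > 0 requires at least one a
    while i < len(word) and word[i] == "b":
        count -= 1
        i += 1
    # Accept only if the whole word was consumed and the counts matched.
    return i == len(word) and count == 0
```

A recognizer for a*b*, by contrast, only has to remember whether a b has been seen yet, which is a bounded amount of information.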
You should read the other answer "finiteness of regular language" to learn the scope of regular languages.
Side note:
The language { a^n b^n | n > 0 } is a subset of a*b*.
Also, a language { a^n b^n | 10^100 ≥ n > 0 } is regular: a large set, but regular because n is bounded (the language is finite), hence a finite automaton and a regular expression are possible for this language.
You should also read: How to prove a language is regular?
The proof is: ((a*)(b*)) is a well-formed regular expression, hence it matches a regular language. a*b* is a syntactic shorthand for the same expression.
Another proof: Regular languages are closed under concatenation. a* is a regular language. b* is a regular language, therefore their concatenation, a*b*, is also a regular language.
You can build an automaton for it:
0 ->(a) 1
0 ->(b) 2
1 ->(a) 1
1 ->(b) 2
2 ->(b) 2
2 ->(a) 3
3 ->(a,b) 3
where only 3 is not an accepting state, and prove that the language is a*b*.
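The transition list above can be written as a table and run directly. This is a sketch in Python; the names `TRANSITIONS`, `ACCEPTING`, `dfa_accepts` and `agrees_with_regex` are introduced here for illustration. The brute-force comparison against the regular expression a*b* over all short words is evidence that the automaton is right, not a proof.

```python
import re
from itertools import product

# The automaton from the answer: state 0 is the start state.
TRANSITIONS = {
    (0, "a"): 1, (0, "b"): 2,
    (1, "a"): 1, (1, "b"): 2,
    (2, "b"): 2, (2, "a"): 3,
    (3, "a"): 3, (3, "b"): 3,   # state 3 is a trap state
}
ACCEPTING = {0, 1, 2}           # only state 3 is not accepting

def dfa_accepts(word):
    """Run the DFA on `word` and report whether it ends in an accepting state."""
    state = 0
    for symbol in word:
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING

def agrees_with_regex(max_len=8):
    """Compare the DFA with a*b* on every word over {a, b} up to max_len."""
    pattern = re.compile(r"a*b*")
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            word = "".join(letters)
            if dfa_accepts(word) != bool(pattern.fullmatch(word)):
                return False
    return True
```

States 0 and 1 could be merged, which gives the two-state DFA with a single trap state; the four-state version is kept here because it is the one listed in the answer.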
To prove that a language is regular, it is sufficient to show either:
1) There exists some DFA that recognizes it. In this case, the DFA is trivial.
2) The language can be expressed as a regular expression, as mentioned in another answer. a*b* is a regular expression to recognize this language.
A regular language is a language that can be expressed with a regular expression, or recognized by a deterministic or non-deterministic finite automaton (state machine).
A language is a set of strings made up of characters from a specified alphabet, or set of symbols. Every regular language is a subset of the set of all strings over its alphabet.
A closure property is a statement that a certain operation on languages, when applied to languages in a class (e.g., the regular languages), produces a result that is also in that class.
This RE describes a language that accepts any number of a's (possibly none), all of them before any b's,
which means a language whose strings never contain the substring ba.
A regular language need not be a subset of a given context-free language (even though every regular language is itself context-free). For example, a*b* is regular, comprising all the strings made of a run of a's followed by a run of b's. It is not a subset of a^n b^n, but a superset of it.
