What exactly is the 'pumping length' in the Pumping lemma? - automata

I'm trying to understand what this 'magical' number 'n' is that is used in every application of the Pumping lemma. After hours of research on the subject, I came across the following website: http://elvis.rowan.edu/~nlt/TheoryNotes/PumpingLemma.pdf
It states:
"n is the longest a string can be without having a loop. The biggest n can be is s, though it might be smaller for some particular language."
From what I understand, if there is a language L, then the pumping length of L is the number of states in the finite state automaton that recognizes L. Is this true?
If it is, then what exactly does the last line above mean: "though it might be smaller for some particular language"? It's a complete mess in my head. Could somebody clear it up, please?

A state doesn't recognise a language. A DFA recognises a language by accepting exactly the set of words in the language and no others. A DFA has many states.
Every regular language L satisfies the Pumping Lemma, and so it has a pumping length n.
For a DFA with s states, in order for it to accept L, s must be >= n (taking n to be the smallest pumping length that works for L).
The last line merely states that there are some languages for which s is greater than n, rather than equal.
This is probably more suited for https://cstheory.stackexchange.com/ or https://cs.stackexchange.com/ (not quite sure of the value of both myself).
Note: Trivially, not all DFAs with sufficient states will accept the language. Also, the fact that a language passes the pumping lemma doesn't mean it's regular (but failing it means it definitely isn't).
Note also that the terminology switches between FA and DFA. This is a bit lax, but because NDFAs have the same power as DFAs, and DFAs are easier to write and understand, DFAs are used for the proof.
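Edit: I'll give an example of a regular language, so you can get an idea of u, v, w, z and n. Then we'll discuss s. (Here z = uvw stands for the whole word, not the letter z.)
L = x y^k z, k >= 2 (i.e. xyyz, xyyyz, xyyyyz)
Take the word xyyyz and split it as:
u = xy
v = y
w = yz
n = 5
Hence:
i = 0: xyyz (length 4)
i = 1: xyyyz (length 5)
i = 2: xyyyyz (length 6)
etc
All of these are in L, and every word in L of length >= 5 can be split the same way. The shortest word, xyyz, cannot be pumped (dropping any part of it leaves a word that is not in L), but its length is 4 < n, so the lemma does not apply to it. That is why n is 5 and not 4.
Hence, L is modelled by the Pumping Lemma. A DFA would need at least 5 states. So let's think of one.
Start state:
-> State 1: x
State 1:
-> State 2: y
State 2:
-> State 3: y
State 3:
-> State 3: y
-> State 4: z
State 4:
Accepting state
Terminating state
5 states counting the start state, as expected. It is not possible to do it in fewer, as expected from n = 5. A complete DFA, with an extra dead state to catch every missing transition, has 6 states, which is an example of s being greater than n.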
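The transitions listed above can be simulated directly. This is only a minimal sketch: the name "start" for the unnamed initial state, and the rule that a missing transition rejects the word, are assumptions made for the illustration.

```python
# Transition table taken from the state list above.
# "start" is a hypothetical name for the unnamed initial state.
DELTA = {
    ("start", "x"): 1,
    (1, "y"): 2,
    (2, "y"): 3,
    (3, "y"): 3,  # the loop that pumping exploits
    (3, "z"): 4,
}
ACCEPTING = {4}


def accepts(word):
    """Run the automaton on word; a missing transition means rejection."""
    state = "start"
    for ch in word:
        state = DELTA.get((state, ch))
        if state is None:
            return False
    return state in ACCEPTING


if __name__ == "__main__":
    for w in ["xyz", "xyyz", "xyyyz", "xyyyyz"]:
        print(w, accepts(w))
```

The loop on State 3 is the part of the machine that corresponds to the repeated section v = y.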

I think one possibility is when you have an FA with an unreachable state. The FA has s states, but one is unreachable, so every string in L is recognized using at most (s-1)=n states, so n<s.

Related

Are those languages regular or not

Hi, I need some help. I have the following languages and need to determine whether they are regular or not.
I think that Y is not regular, and I applied the Pumping Lemma to determine that.
For X I am not sure whether it is a regular language or not. I was thinking that X is the set of strings with an odd number of a's, which can easily be represented with an NFA.
Can anyone help me with that?
The first one (X) is regular, because you can construct a finite automaton for it:
(start) --- a --> (final) -- a --> (state)
                     ^                |
                     \------ a -------/
The second one (Y) is not regular, because you cannot construct a finite automaton for it. It would require memory to store the number of a's in order to later match exactly one more b than that. That language is context-free, with a grammar:
S = T b
T = a T b
T = ε
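Both halves of this answer can be tried out in a few lines. The sketch below only follows the diagram and the grammar given in the answer (the definitions of X and Y are not shown in the post); the function names `accepts_x` and `derive_y` are made up for the illustration.

```python
def accepts_x(word):
    """Simulate the three-state automaton from the diagram above."""
    delta = {
        ("start", "a"): "final",
        ("final", "a"): "state",
        ("state", "a"): "final",
    }
    state = "start"
    for ch in word:
        state = delta.get((state, ch))
        if state is None:  # no transition for this symbol
            return False
    return state == "final"


def derive_y(n):
    """Derive a word with S = T b, using T = a T b n times, then T = ε."""
    t = ""
    for _ in range(n):
        t = "a" + t + "b"
    return t + "b"


if __name__ == "__main__":
    print([w for w in ["", "a", "aa", "aaa"] if accepts_x(w)])
    print([derive_y(n) for n in range(4)])
```

The automaton needs only a fixed number of states, while every word derived from the grammar has exactly one more b than a, a count that grows without bound.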

How many equivalence classes in the RL relation for {w in {a, b}* | (#a(w) mod m) = ((#b(w)+1) mod m)}

How many equivalence classes in the RL relation for
{w in {a, b}* | (#a(w) mod m) = ((#b(w)+1) mod m)}
I am looking at a past test question which gives me the options
m(m+1)
2m
m^2
m^2+1
infinite
However, I claim that it's m, and I came up with an automaton that I believe accepts this language; it contains 3 states (for m=3).
Am I right?
Actually you're right. To see this, observe that the difference of #a(w) and #b(w), #a(w) - #b(w) modulo m, is all that matters; and there are only m possible values of this difference modulo m. So, m states are always sufficient to accept a language of this form: simply make the state corresponding to the appropriate difference the accepting state.
In your DFA, a2 corresponds to a difference of zero, a1 to a difference of one and a3 to a difference of two.
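The argument can be checked by brute force for small m. This is a minimal sketch, assuming the state is simply the integer (#a - #b) mod m; the function names are made up for the illustration.

```python
from itertools import product


def accepts(word, m):
    """DFA with m states: the state is (#a - #b) mod m."""
    state = 0
    for ch in word:
        if ch == "a":
            state = (state + 1) % m
        elif ch == "b":
            state = (state - 1) % m
        else:
            return False
    # Accept when the difference is 1 mod m (written 1 % m to cover m = 1).
    return state == 1 % m


def in_language(word, m):
    """Direct check of the definition of the language."""
    return word.count("a") % m == (word.count("b") + 1) % m


def agree_up_to(max_len, m):
    """Compare the DFA with the definition on all words up to max_len."""
    for length in range(max_len + 1):
        for letters in product("ab", repeat=length):
            word = "".join(letters)
            if accepts(word, m) != in_language(word, m):
                return False
    return True


if __name__ == "__main__":
    print(all(agree_up_to(6, m) for m in range(1, 5)))
```

This only confirms that m states are sufficient on the words tested; it is not a proof that fewer states cannot work.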

Pumping Lemma Assistance

I recently had an assignment where I was asked to use pumping lemma to show that a language was not regular, and unfortunately got the wrong answer.
The language to prove non-regular is as follows:
L = {a^i b^j c^k : i = j or j = k}
The definition of the pumping lemma that I was given is as follows:
1. The opponent picks m.
2. I want to pick w to contradict the pumping lemma. Use m to pick a string w ∈ L where |w| ≥ m.
3. The opponent picks a decomposition of w subject to constraints.
4. I try to pick an i so that the pumped string w_i ∉ L. If found, L is not regular.
This subject has proven to be very difficult for me to understand and I feel like a complete airhead because of it, so a detailed explanation as to how I would properly apply a pumping lemma would be appreciated.
Intuitively, the pumping lemma says that long enough words (the length depends only on the language L) in a regular language L must contain a section (of length > 0) which can be repeated as often as desired. Repeating that section ('pumping' the original word) any number of times results in longer words which are also in the language L.
The minimal length for the word is the m in the first step of the above definition; the number of times the section is repeated is the i in the 4th step of the above definition.
The pumping lemma is usually used to show that a language L is not regular. It is a proof by contradiction: assume that L is regular and thus the pumping lemma for regular languages is true for L. Then pick a word w which is in L of sufficient length* and show that regardless of how it is decomposed* some pumped word is not in the language. This contradicts the pumping lemma - which we know to be true. Thus our assumption that the language is regular was wrong and thus the language is not regular. The parts marked with an * cannot be chosen to make things easy - that's why in steps 1 and 3 the 'opponent' selects them.
The word w is rewritten as w = x y z, where |y| > 0 and |x y| <= m. Both x and z may be of length 0.
The usual approach is to pick w so that its first m letters are all the same letter. Then xy consists of that letter only, and having more of that same letter (after the pumping) leads to a word not in L.
No restrictions are specified for the i or the k in the language L in the post, so assuming that i = 0 is allowed, a suitable word might be b^m c^m (that is, m bs followed by m cs). Now whatever decomposition the opponent might select, the y will always consist of some bs. Repeating those bs leads to a word with more bs than cs and no as, and thus i != j and j != k and the word is not in the language.
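The claim that every decomposition fails can be checked mechanically for small m. The following is a sketch only, assuming i = 0 is allowed as discussed above and pumping y twice; the helper names are made up for the illustration.

```python
def in_language(word):
    """Membership in {a^i b^j c^k : i = j or j = k}."""
    i = len(word) - len(word.lstrip("a"))
    rest = word[i:]
    j = len(rest) - len(rest.lstrip("b"))
    rest = rest[j:]
    k = len(rest) - len(rest.lstrip("c"))
    if rest[k:]:  # leftover symbols: not of the form a*b*c*
        return False
    return i == j or j == k


def every_split_fails(m, times=2):
    """Try every split w = xyz with |xy| <= m and |y| > 0 on w = b^m c^m."""
    w = "b" * m + "c" * m
    assert in_language(w)
    for xy_len in range(1, m + 1):
        for x_len in range(xy_len):
            x, y, z = w[:x_len], w[x_len:xy_len], w[xy_len:]
            if in_language(x + y * times + z):
                return False  # the opponent found a split that survives
    return True


if __name__ == "__main__":
    print(all(every_split_fails(m) for m in range(1, 8)))
```

Note that pumping down (times = 0) would not always work here: removing all the bs leaves c^m, which is in L because then i = j = 0.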

How a^n b^n where n>=1 is not regular?

This is the simple finite automaton I tried. What am I doing wrong?
This matches (ab)^n, not a^n b^n. Aka ababab, not aaabbb.
The b transition leads to a final state which halts the machine. Your machine will only halt if given a sequence of 'ab' of length 1 or more.
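To make the difference between strings like ababab and aaabbb concrete, here is a small sketch. The regular expression `(ab)+` is an assumption about what the machine in the question accepts, based on the description in this answer; the function names are made up for the illustration.

```python
import re


def matches_ab_repeated(word):
    """One or more repetitions of 'ab': a regular language."""
    return re.fullmatch(r"(ab)+", word) is not None


def is_a_n_b_n(word):
    """n a's followed by exactly n b's, n >= 1: needs counting."""
    n = len(word) // 2
    return n >= 1 and word == "a" * n + "b" * n


if __name__ == "__main__":
    for w in ["ab", "ababab", "aaabbb"]:
        print(w, matches_ab_repeated(w), is_a_n_b_n(w))
```

The two languages share only the word ab.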
The language a^n b^n where n>=1 is not regular, and this can be proved using the pumping lemma. Assume there is a finite state automaton that accepts the language. This automaton has a finite number of states k, and there is a string x = a^n b^n in the language such that n > k. According to the pumping lemma, x can be decomposed as x = uvw with |uv| <= k and v non-empty, such that any finite automaton that accepts x must also accept uv*w. Since n > k, uv lies within the leading a's, so v consists only of a's. If the finite automaton accepts x = uvw, it must also accept uvvw, which has more a's than b's and is not of the form a^n b^n. This is a contradiction, so a^n b^n cannot be a regular language.
This matches (ab)^n, not a^nb^n.
What you're looking for is Pumping lemma for regular languages.
Examples:
Let L = {a^mb^m | m ≥ 1}.
Then L is not regular.
Proof: Let n be as in Pumping Lemma.
Let w = a^nb^n.
Let w = xyz be as in Pumping Lemma.
Since |xy| ≤ n, y consists only of a's. By the Pumping Lemma, xy^2z ∈ L; however, xy^2z contains more a's than b's, which is a contradiction.

Testing intersection of two regular languages

I want to test whether two languages have a string in common. Both of these languages are from a subset of regular languages described below and I only need to know whether there exists a string in both languages, not produce an example string.
The language is specified by a glob-like string like
/foo/**/bar/*.baz
where ** matches 0 or more characters, and * matches zero or more characters that are not /, and all other characters are literal.
Any ideas?
thanks,
mike
EDIT:
I implemented something that seems to perform well, but have yet to try a correctness proof. You can see the source and unit tests
Build FAs A and B for both languages, and construct the "intersection FA" AnB. If AnB has at least one accepting state accessible from the start state, then there is a word that is in both languages.
Constructing AnB could be tricky, but I'm sure there are FA textbooks that cover it. The approach I would take is:
The states of AnB are the cartesian product of the states of A and B respectively. A state in AnB is written (a, b) where a is a state in A and b is a state in B.
A transition (a, b) ->r (c, d) (meaning, there is a transition from (a, b) to (c, d) on symbol r) exists iff a ->r c is a transition in A, and b ->r d is a transition in B.
(a, b) is a start state in AnB iff a and b are start states in A and B respectively.
(a, b) is an accepting state in AnB iff each is an accepting state in its respective FA.
This is all off the top of my head, and hence completely unproven!
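The construction above can be sketched as a search over pairs of states, building only the reachable part of AnB. This is a minimal sketch under some assumptions: both automata are deterministic, each is given as a dict with hypothetical keys "start", "accept" and "delta", and a missing transition means the word is rejected. Turning the glob patterns into such automata is not covered here.

```python
from collections import deque


def intersection_nonempty(a, b):
    """Return True if some word is accepted by both automata."""
    start = (a["start"], b["start"])
    seen = {start}
    queue = deque([start])
    while queue:
        p, q = queue.popleft()
        # (p, q) is accepting iff p and q are accepting in A and B.
        if p in a["accept"] and q in b["accept"]:
            return True
        for (state, symbol), p_next in a["delta"].items():
            if state != p:
                continue
            q_next = b["delta"].get((q, symbol))
            if q_next is None:  # B has no move on this symbol
                continue
            pair = (p_next, q_next)
            if pair not in seen:
                seen.add(pair)
                queue.append(pair)
    return False


# Example automata over {a, b}: the state counts the parity of a's.
parity = {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}
even_a = {"start": 0, "accept": {0}, "delta": parity}
odd_a = {"start": 0, "accept": {1}, "delta": parity}
ends_b = {
    "start": 0,
    "accept": {1},
    "delta": {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1},
}

if __name__ == "__main__":
    print(intersection_nonempty(even_a, odd_a))
    print(intersection_nonempty(even_a, ends_b))
```

Because only an existence answer is needed, the search can stop at the first reachable pair in which both components are accepting.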
I just did a quick search and this problem is decidable (aka can be done), but I don't know of any good algorithms to do it. One solution is:
Convert both regular expressions to NFAs A and B.
Create an NFA, C, that represents the intersection of A and B.
Now try every string of length 0 up to the number of states in C and see if C accepts it (since if the string is longer it must repeat states at some point).
I know this might be a little hard to follow, but this is the only way I know how.

Resources