F# KMP algorithm gets stuck in the first while loop if I use a pattern with the same characters at the first two indices

I am playing around with the KMP algorithm in F#. While it works for patterns like "ATAT" (the result is [|0; 0; 1; 2|]), the first while loop enters a deadlock when the first two characters of the pattern are the same and the third is different, for example "AAT".
I understand why: first, i gets incremented to 1. Now the first condition of the while loop is true, and the second is also true, because "A" <> "T". Then i is set to prefixTable.[!i], which is 1 again, and here we go.
Can you guys give me a hint on how to solve this?
let kMPrefix (pattern : string) =
    let (m : int) = pattern.Length - 1
    let prefixTable = Array.create pattern.Length 0
    // i : longest proper prefix that is also a suffix
    let i = ref 0
    // j: the index of the pattern for which the prefix value will be calculated
    // starts with 1 because the first prefix value is always 0
    for j in 1 .. m do
        while !i > 0 && pattern.[!i] <> pattern.[j] do
            i := prefixTable.[!i]
        if pattern.[!i] = pattern.[j] then
            i := !i+1
        Array.set prefixTable j !i
    prefixTable

I'm not sure how to repair the code with a small modification, since it doesn't match the KMP algorithm's lookup table contents (at least the ones I've found on Wikipedia), which are:
-1 for index 0
Otherwise, the count of consecutive elements before the current position that match the beginning (excluding the beginning itself)
Therefore, I'd expect output for "ATAT" to be [|-1; 0; 0; 1|], not [|0; 0; 1; 2;|].
This type of problem might be better to reason about in functional style. To create the KMP table, you could use a recursive function that fills the table one by one, keeping track of how many recent characters match the beginning, and start running it at the second character's index.
A possible implementation:
let buildKmpPrefixTable (pattern : string) =
    let prefixTable = Array.zeroCreate pattern.Length
    let rec run startIndex matchCount =
        let writeIndex = startIndex + matchCount
        if writeIndex < pattern.Length then
            if pattern.[writeIndex] = pattern.[matchCount] then
                prefixTable.[writeIndex] <- matchCount
                run startIndex (matchCount + 1)
            else
                prefixTable.[writeIndex] <- matchCount
                run (writeIndex + 1) 0
    run 1 0
    if pattern.Length > 0 then prefixTable.[0] <- -1
    prefixTable
This approach isn't in danger of any endless loops/recursion, because all code paths of run either increase writeIndex in the next iteration or finish iterating.
Note on terminology: the error you are describing in the question is an endless loop or, more generally, non-terminating iteration. Deadlock refers specifically to a situation in which a thread waits for a lock that will never be released because the thread holding it is itself waiting for a lock that will never be released for the same reason.
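For reference, the loop from the question can also be made to terminate with a one-index change: the usual prefix-function formulation falls back to the table entry for the previous position (i - 1) rather than position i. The sketch below is in Python rather than F#, keeps the question's convention (first entry 0, so "ATAT" gives 0, 0, 1, 2), and the name kmp_prefix_table is only illustrative.

```python
def kmp_prefix_table(pattern: str) -> list[int]:
    """Prefix function: table[j] is the length of the longest proper
    prefix of pattern[:j + 1] that is also a suffix of it."""
    table = [0] * len(pattern)
    i = 0  # length of the border matched so far
    for j in range(1, len(pattern)):
        # On a mismatch, fall back to the border of the matched part.
        # Using table[i - 1] (not table[i]) guarantees that i shrinks.
        while i > 0 and pattern[i] != pattern[j]:
            i = table[i - 1]
        if pattern[i] == pattern[j]:
            i += 1
        table[j] = i
    return table


print(kmp_prefix_table("ATAT"))  # [0, 0, 1, 2]
print(kmp_prefix_table("AAT"))   # [0, 1, 0]
```

In the F# loop this would correspond to i := prefixTable.[!i - 1]. The fallback step also matters for patterns such as "AABAAAB", where the last entry comes out as 3 (the border "AAB") instead of being reset to 0.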


Splitting a String into two variables? LUA

In a Lua driver I am writing, I constantly receive RS232 strings, e.g.
ZAA1, ZO64, D1, etc. I am after a solution for finding where the letters end and the integer starts, and for putting the two parts into two different variables.
I am currently using a while loop with a string.match method inside. Is there a better way? Current shortened code below:
s = "ZO29"
j = 1
while j <= 64 do
    if (s == string.format("ZO%d", j)) then
        print("Within ZO message")
        inputBuffer = ""
        sendACK()
        break
    elseif (s == string.format("ZC%d", j)) then
        inputBuffer = ""
        sendACK()
        break
    end
    j = j + 1
end
Try this:
a,b=s:match("(.-)(%d+)$")
This captures the digits at the end of the string into b and the preceding text into a.
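To illustrate how that pattern splits the sample strings, here is the same idea written with Python's re module. This is a cross-language sketch, not Lua, and split_command is a hypothetical name. The lazy first group takes as little as possible, while the digit group is anchored to the end of the string.

```python
import re


def split_command(s: str):
    """Split 'ZO29' into ('ZO', 29); return None if s has no trailing digits."""
    m = re.match(r"(.*?)(\d+)$", s)
    if m is None:
        return None
    # group(1) is the text before the trailing digits, group(2) the digits.
    return m.group(1), int(m.group(2))


for sample in ["ZAA1", "ZO64", "D1"]:
    print(split_command(sample))
```

With the prefix and number separated, the 64-iteration loop can be replaced by a single comparison on the prefix and a range check on the number.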

Express Running time in Big Theta Notation ?

For this pseudocode, how would I express the running time in the Θ notation in terms of n?
s = 0
for i = 0 to n:
    for j = 0 to i:
        s = (s + i)*j
print s
The assignment s = (s + i)*j takes constant time, Θ(1). For each i, the inner loop runs exactly i + 1 times (j goes from 0 to i inclusive), and i itself goes from 0 to n. So the body of your loop (i.e. the assignment) is executed
1 + 2 + 3 + ... + (n+1) = (n+1)(n+2)/2 = Θ(n^2)
times. Since the body of the loop is Θ(1), you get Θ(n^2) for the whole program; the first and last lines are just Θ(1), so you can ignore them.
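The count can be checked empirically. The following Python sketch assumes both loop bounds are inclusive, as the sum above does, and simply counts how often the loop body runs; count_inner_iterations is an illustrative name.

```python
def count_inner_iterations(n: int) -> int:
    """Count executions of the innermost statement of the pseudocode."""
    count = 0
    for i in range(n + 1):      # i = 0 .. n inclusive
        for j in range(i + 1):  # j = 0 .. i inclusive
            count += 1
    return count


for n in (0, 1, 5, 10):
    # The two printed numbers agree for every n.
    print(n, count_inner_iterations(n), (n + 1) * (n + 2) // 2)
```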

Bound checks for array and string slices

I can't seem to get my head around the rules that govern these two cases:
1. The end index may be one less than the start index, producing an empty array/string.
2. It's apparently legal to position the start index just behind the last element, if the end index is one less, as before.
[|0..2|].[3..2];; // [||]
"bar".[3..2];; // ""
A naive implementation of bound checks with consideration of case 1 wouldn't allow case 2:
let trySlice i j (a : string) =
    let lastIdx = a.Length - 1
    if i < 0 || i > lastIdx || j < i - 1 || j > lastIdx then None
    else Some a.[i..j]
trySlice 3 2 "bar" // None
What's the rationale behind this behavior? How to proceed?
Edit
This is what I have now thanks to Tarmil's input
let trySlice i j (a : string) =
    if i >= 0 && j >= i - 1 && j < a.Length then Some a.[i..j]
    else None
which should be equivalent to
let trySlice' i j (s : string) =
    try s.Substring(i, j - i + 1) |> Some
    with _ -> None
I suppose the rationale is that a.[i..j] is a sequence of length (j - i + 1), so in a way it makes sense to allow i = j + 1 as a way to extract an empty sequence.
As for "how to proceed", if you want your trySlice to accept all cases that the built-in slicing accepts, then just remove the i > lastIdx clause. When i = lastIdx + 1, the only way for the other conditions to pass is if j = lastIdx, and when i > lastIdx + 1, there is no way for j to pass both its constraints.
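The argument above can be checked by brute force. The sketch below (Python, with hypothetical helper names) models three acceptance rules as plain predicates on i, j and the string length n: the naive check from the question, the same check with the i > lastIdx clause removed, and the argument checks of Substring(i, j - i + 1) as I understand them (start >= 0, length >= 0, start + length <= n). It does not call F# slicing itself.

```python
def naive_accepts(i: int, j: int, n: int) -> bool:
    """The question's original check, including the i > lastIdx clause."""
    last = n - 1
    return not (i < 0 or i > last or j < i - 1 or j > last)


def relaxed_accepts(i: int, j: int, n: int) -> bool:
    """Same check with the i > lastIdx clause removed."""
    last = n - 1
    return not (i < 0 or j < i - 1 or j > last)


def substring_accepts(i: int, j: int, n: int) -> bool:
    """Assumed model of the argument checks for Substring(i, j - i + 1)."""
    length = j - i + 1
    return i >= 0 and length >= 0 and i + length <= n


def disagreements(n, accepts_a, accepts_b):
    """All (i, j) near the valid range where the two rules differ."""
    return [(i, j)
            for i in range(-2, n + 3)
            for j in range(-3, n + 3)
            if accepts_a(i, j, n) != accepts_b(i, j, n)]


print(disagreements(3, naive_accepts, relaxed_accepts))      # [(3, 2)]
print(disagreements(3, relaxed_accepts, substring_accepts))  # []
```

For a string of length 3, the only pair the naive check rejects but the relaxed one accepts is (3, 2), which is exactly the "start just behind the last element" case.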
As a side-note, the way you write:
if (failCondition || failCondition || ...) then None else Some x
feels counter-intuitive to me for some reason; I would have written it as:
if (successCondition && successCondition && ...) then Some x else None

F# recursive function in strange endless loop

I am very green when it comes to F#, and I have run across a small issue dealing with recursive functions that I was hoping someone could help me understand.
I have a function that is supposed to spit out the next even number:
let rec nextEven(x) =
    let y = x + 1
    if y % 2 = 0 then y
    else nextEven y
// This never returns..
nextEven 3;;
I use the 'rec' keyword so that it will be recursive, although when I use it, it will just run in an endless loop for some reason. If I rewrite the function like this:
let nextEven(x) =
    let y = x + 1
    if y % 2 = 0 then y
    else nextEven y
Then everything works fine (no rec keyword). For some reason I thought I needed 'rec' since the function is recursive (so why don't I?), and why does the first version of the function run forever?
EDIT
Turns out this was a total noob mistake. I had created multiple definitions of the function along the way, as is explained in the comments + answers.
I suspect you have multiple definitions of nextEven. That's the only explanation for your second example compiling. Repro:
module A =
    let rec nextEven(x) =
        let y = x + 1
        if y % 2 = 0 then y
        else nextEven y
open A //the function below will not compile without this
let nextEven(x) =
    let y = x + 1
    if y % 2 = 0 then y
    else nextEven y //calling A.nextEven
Try resetting your FSI session.

Adding Overloaded Constructor That Requires Initialization Code to Implicit F# Type

I currently have the following code:
type Matrix(sourceMatrix:double[,]) =
    let rows = sourceMatrix.GetUpperBound(0) + 1
    let cols = sourceMatrix.GetUpperBound(1) + 1
    let matrix = Array2D.zeroCreate<double> rows cols
    do
        for i in 0 .. rows - 1 do
            for j in 0 .. cols - 1 do
                matrix.[i,j] <- sourceMatrix.[i,j]
    new (rows, cols) = Matrix( Array2D.zeroCreate<double> rows cols)
    new (boolSourceMatrix:bool[,]) = Matrix(Array2D.zeroCreate<double> rows cols)
        for i in 0 .. rows - 1 do
            for j in 0 .. cols - 1 do
                if(boolSourceMatrix.[i,j]) then matrix.[i,j] <- 1.0
                else matrix.[i,j] <- -1.0
My problem lies in the last constructor that takes a bool[,] parameter. The compiler isn't letting me get away with the two for loops I'm trying to use for initialization in this constructor. How can I make this work?
The easiest solution would be to just do this instead:
new (boolSourceMatrix) = Matrix(Array2D.map (fun b -> if b then 1.0 else -1.0) boolSourceMatrix)
The specific issue that you were running into is that the let-bound fields from the primary constructor aren't available in alternate constructors. To work around this, you could use an explicitly defined field, if you wanted. However, in this case it's better to take advantage of the additional functionality in the Array2D module.
