How smart is pattern match? - f#

My program spends most of its time on array pattern matching, and I am wondering whether I should rewrite the function by hand instead of relying on the compiler's pattern matching.
E.g. a very simple case:
let categorize array =
    match array with
    | [|(1|2);(1|2);(1|2)|] -> 3
    | [|(1|2);(1|2);_|] -> 2
    | [|(1|2);_;_|] -> 1
    | _ -> 0
categorize [|2;1;3|]
Would the compiler use the fewest possible comparisons in this case, for example by recognizing that the first case is the same as the second except for the third element?
My actual patterns are more complicated, so unoptimized pattern matching could cost far more time than fully optimized pattern matching.

Straight from Reflector:
public static int categorize(int[] array)
{
if ((array != null) && (array.Length == 3))
{
switch (array[0])
{
case 1:
switch (array[1])
{
case 1:
switch (array[2])
{
case 1:
case 2:
goto Label_005C;
}
goto Label_005A;
case 2:
switch (array[2])
{
case 1:
case 2:
goto Label_005C;
}
goto Label_005A;
}
goto Label_0042;
case 2:
switch (array[1])
{
case 1:
switch (array[2])
{
case 1:
case 2:
goto Label_005C;
}
goto Label_005A;
case 2:
switch (array[2])
{
case 1:
case 2:
goto Label_005C;
}
goto Label_005A;
}
goto Label_0042;
}
}
return 0;
Label_0042:
return 1;
Label_005A:
return 2;
Label_005C:
return 3;
}
I don't see anything inefficient.

What is really missing from your question is the actual subject area. In other words, your question is quite generic (which is generally good for SO), while coding against your actual problem might solve the entire issue in an elegant manner.
If I extrapolate your question as it currently stands, you just need the index of the first element which is neither 1 nor 2, and the implementation is trivial:
let categorize arr =
    try
        Array.findIndex (fun x -> not(x = 1 || x = 2)) arr
    with
    | :? System.Collections.Generic.KeyNotFoundException -> Array.length arr
// Usage
let x1 = categorize [|2;1;3|] // returns 2
let x2 = categorize [|4;2;1;3|] // returns 0
let x3 = categorize [|1;2;1|] // returns 3
As free side benefits, the code works for arrays of any length and is easy to read.
Is this what you need?
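If catching KeyNotFoundException feels heavy, the same idea can probably be written with Array.tryFindIndex. This is only a sketch under the same assumption as above (you want the index of the first element that is neither 1 nor 2, or the array length if there is none):

```fsharp
// Sketch: same result as the findIndex version, but without exception handling.
// Array.tryFindIndex returns None when every element is 1 or 2.
let categorize (arr: int[]) =
    arr
    |> Array.tryFindIndex (fun x -> x <> 1 && x <> 2)
    |> Option.defaultValue arr.Length

printfn "%d" (categorize [|2;1;3|])
```

Like the version above, this is not restricted to three-element arrays, so it only matches the original match expression for inputs of length 3.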

You could write:
let f (xs: _ []) =
    if xs.Length=3 then
        let p n = n=1 || n=2
        if p xs.[0] then
            if p xs.[1] then
                if p xs.[2] then 3
                else 2
            else 1
        else 0
    else 0

Test 1
F#
let test1 x =
    match x with
    | [| 1; 2; 3 |] -> A
    | [| 1; 2; _ |] -> A
    | [| 1; _; _ |] -> A
Decompiled C#
if (x != null && x.Length == 3)
{
switch (x[0])
{
case 1:
switch (x[1])
{
case 2:
switch (x[2])
{
case 3:
return Program.MyType.A;
default:
return Program.MyType.A;
}
break;
default:
return Program.MyType.A;
}
break;
}
}
throw new MatchFailureException(...);
Decompiled IL
Code size 107
Conclusion
Pattern matching doesn't optimize based on the values after ->.
Subject to conclusion 1, pattern matching does find an optimized approach for array decomposition.
Incomplete pattern matches always throw exceptions, so there is no harm in adding a wildcard case that catches the missing patterns and throws explicitly.
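As a rough illustration of the last point, here is test1 with an explicit wildcard. MyType is an assumption on my part (the tests use A, B, C, D without showing the type), and test1Explicit is just an illustrative name. Since all three original cases return A, they collapse into one pattern:

```fsharp
// Assumed definition: the tests refer to A..D but never show the type.
type MyType = A | B | C | D

// Hypothetical rewrite of test1: one pattern covers the three original cases,
// and the wildcard throws explicitly instead of relying on MatchFailureException.
let test1Explicit (x: int[]) =
    match x with
    | [| 1; _; _ |] -> A
    | _ -> invalidArg "x" "expected a three-element array starting with 1"

printfn "%A" (test1Explicit [| 1; 2; 3 |])
```

The only behavioural difference from test1 should be the exception type: ArgumentException here instead of MatchFailureException.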
Test 2
F#
let test2 x =
    match x with
    | [| 1; 2; 3 |] -> A
    | [| _; 2; 3 |] -> B
    | [| _; _; 3 |] -> C
Decompiled C#
if (x != null && x.Length == 3)
{
switch (x[0])
{
case 1:
switch (x[1])
{
case 2:
switch (x[2])
{
case 3:
return Program.MyType.A;
default:
goto IL_49;
}
break;
default:
switch (x[2])
{
case 3:
break;
default:
goto IL_49;
}
break;
}
break;
default:
switch (x[1])
{
case 2:
switch (x[2])
{
case 3:
return Program.MyType.B;
default:
goto IL_49;
}
break;
default:
switch (x[2])
{
case 3:
goto IL_58;
}
goto IL_49;
}
break;
}
IL_58:
return Program.MyType.C;
}
IL_49:
throw new MatchFailureException(...);
Decompiled IL
Code size 185
Conclusion
Pattern matching checks values from the beginning of an array to the end, so it fails to find the optimal approach here.
The code size is about 2x that of an optimal version.
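For comparison, a hand-written version can test the last element first, since every rule in test2 requires it to be 3. This is only a sketch: MyType is assumed, test2Manual is an illustrative name, failwith stands in for MatchFailureException, and a null array is not handled.

```fsharp
// Assumed definition: the tests refer to A, B, C but never show the type.
type MyType = A | B | C

// Hypothetical hand-written equivalent of test2 that checks x.[2] first.
let test2Manual (x: int[]) =
    if x.Length <> 3 || x.[2] <> 3 then failwith "no match"
    elif x.[1] <> 2 then C      // only [| _; _; 3 |] can match
    elif x.[0] <> 1 then B      // [| _; 2; 3 |] is the first rule that matches
    else A                      // exactly [| 1; 2; 3 |]

printfn "%A" (test2Manual [| 1; 2; 3 |])
```

Each element is inspected at most once, and a wrong last element is rejected after a single comparison.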
Test 3
F#
let test3 x =
    match x with
    | [| 1; 2; 3 |] -> A
    | [| 1; 2; a |] when a <> 3 -> B
    | [| 1; 2; _ |] -> C
Decompiled C#
if (x != null && x.Length == 3)
{
switch (x[0])
{
case 1:
switch (x[1])
{
case 2:
switch (x[2])
{
case 3:
return Program.MyType.A;
default:
if (x[2] != 3)
{
int a = x[2];
return Program.MyType.B;
}
break;
}
break;
}
break;
}
}
if (x != null && x.Length == 3)
{
switch (x[0])
{
case 1:
switch (x[1])
{
case 2:
return Program.MyType.C;
}
break;
}
}
throw new MatchFailureException(...);
Conclusion
The compiler isn't smart enough to see through a guard when checking completeness or redundancy.
A guard makes pattern matching produce odd, unoptimized code.
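In this particular test the guard is redundant: once the first rule has failed, the third element cannot be 3, so the guard is always true and rule C is never reached. A sketch without the guard (MyType is assumed, test3NoGuard is an illustrative name, and failwith stands in for MatchFailureException):

```fsharp
// Assumed definition: the tests refer to A, B, C but never show the type.
type MyType = A | B | C

// Hypothetical rewrite of test3 without the guard. Rule order already
// guarantees that the second rule only sees a third element other than 3.
let test3NoGuard (x: int[]) =
    match x with
    | [| 1; 2; 3 |] -> A
    | [| 1; 2; _ |] -> B
    | _ -> failwith "no match"

printfn "%A" (test3NoGuard [| 1; 2; 7 |])
```

Without the guard the compiler can go back to building a single decision tree, as in Test 1.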
Test 4
F#
let (| Is3 | IsNot3 |) x =
    if x = 3 then Is3 else IsNot3
let test4 x =
    match x with
    | [| 1; 2; 3 |] -> A
    | [| 1; 2; Is3 |] -> B
    | [| 1; 2; IsNot3 |] -> C
    | [| 1; 2; _ |] -> D // This rule will never be matched.
Decompiled C#
if (x != null && x.Length == 3)
{
switch (x[0])
{
case 1:
switch (x[1])
{
case 2:
switch (x[2])
{
case 3:
return Program.MyType.A;
default:
{
FSharpChoice<Unit, Unit> fSharpChoice = Program.|Is3|IsNot3|(x[2]);
if (fSharpChoice is FSharpChoice<Unit, Unit>.Choice2Of2)
{
return Program.MyType.C;
}
return Program.MyType.B;
}
}
break;
}
break;
}
}
throw new MatchFailureException(...);
Conclusion
Multi-case active patterns compile to FSharpChoice.
The compiler can check completeness and redundancy among active patterns, but it cannot compare them with ordinary patterns.
Unreachable patterns are not compiled.
Test 5
F#
let (| Equal3 |) x =
    if x = 3 then Equal3 1 else Equal3 0 // Equivalent to "then 1 else 0"
let test5 x =
    match x with
    | [| 1; 2; 3 |] -> A
    | [| 1; 2; Equal3 0 |] -> B
    | [| 1; 2; Equal3 1 |] -> C
    | [| 1; 2; _ |] -> D
Decompiled C#
if (x != null && x.Length == 3)
{
switch (x[0])
{
case 1:
switch (x[1])
{
case 2:
switch (x[2])
{
case 3:
return Program.MyType.A;
default:
{
int num = x[2];
switch ((num != 3) ? 0 : 1)
{
case 0:
return Program.MyType.B;
case 1:
return Program.MyType.C;
default:
return Program.MyType.D;
}
break;
}
}
break;
}
break;
}
}
throw new MatchFailureException(...);
Conclusion
Single-case active patterns compile to their return type.
The compiler sometimes inlines the function automatically.
Test 6
F#
let (| Partial3 | _ |) x =
    if x = 3 then Some (Partial3 true) else None // Equivalent to "then Some true"
let test6 x =
    match x with
    | [| 1; 2; 3 |] -> A
    | [| 1; 2; Partial3 true |] -> B
    | [| 1; 2; Partial3 true |] -> C
Decompiled C#
if (x != null && x.Length == 3)
{
switch (x[0])
{
case 1:
switch (x[1])
{
case 2:
switch (x[2])
{
case 3:
return Program.MyType.A;
default:
{
FSharpOption<bool> fSharpOption = Program.|Partial3|_|(x[2]);
if (fSharpOption != null && fSharpOption.Value)
{
return Program.MyType.B;
}
break;
}
}
break;
}
break;
}
}
if (x != null && x.Length == 3)
{
switch (x[0])
{
case 1:
switch (x[1])
{
case 2:
{
FSharpOption<bool> fSharpOption = Program.|Partial3|_|(x[2]);
if (fSharpOption != null && fSharpOption.Value)
{
return Program.MyType.C;
}
break;
}
}
break;
}
}
throw new MatchFailureException(...);
Conclusion
Partial active patterns compile to FSharpOption.
The compiler is unable to check completeness or redundancy of partial active patterns.
Test 7
F#
type MyOne =
    | AA
    | BB of int
    | CC
type MyAnother =
    | AAA
    | BBB of int
    | CCC
    | DDD
let test7a x =
    match x with
    | AA -> 2
let test7b x =
    match x with
    | AAA -> 2
Decompiled C#
public static int test7a(Program.MyOne x)
{
if (x is Program.MyOne._AA)
{
return 2;
}
throw new MatchFailureException(...);
}
public static int test7b(Program.MyAnother x)
{
if (x.Tag == 0)
{
return 2;
}
throw new MatchFailureException(...);
}
Conclusion
If the union has more than 3 cases, pattern matching uses the Tag property instead of is. (This also applies to multi-case active patterns.)
A pattern match often compiles to multiple is checks, which can degrade performance greatly.

Related

F# "This value is not a function and cannot be applied" when trying to add (+)

The m.Count + m.StepSize in the F# code, 4th line from the bottom, returns the error This value is not a function and cannot be applied.
I can't see why + isn't being interpreted as an infix function instead of m.Count.
Why is this line a problem?
type Model =
    { Count: int
      StepSize: int }
type Msg =
    | Increment
    | Decrement
    | SetStepSize of int
    | Reset
let init =
    { Count = 0
      StepSize = 1 }
let canReset = (<>) init
let update msg m =
    match msg with
    | Increment -> { m with Count = m.Count + m.StepSize }
    | Decrement -> { m with Count = m.Count - m.StepSize }
    | SetStepSize x -> { m with StepSize = x }
    | Reset -> init

How do I reduce code duplication with nested 'if' statements?

Let's consider this code:
let getBuildDate (assembly: Assembly) : DateTime option =
    let buildVersionMetadataPrefix = "+build"
    let attribute = assembly.GetCustomAttribute<AssemblyInformationalVersionAttribute>()
    if attribute <> null && attribute.InformationalVersion <> null then
        let value = attribute.InformationalVersion
        let index = value.IndexOf(buildVersionMetadataPrefix)
        if index > 0 then
            let value = value.Substring(index + buildVersionMetadataPrefix.Length)
            let success, timestamp = DateTime.TryParseExact(value, "yyyyMMddHHmmss", CultureInfo.InvariantCulture, DateTimeStyles.None)
            if success then
                Some timestamp
            else
                None
        else
            None
    else
        None
Is there a way to get rid of all the 'else None' statements to have only one?
On one side, I can imagine that for some people the code is more clear with all the None statements spelled out, but on the other side, coming from the C world, I see it as clutter that reduces readability.
There are many cases where you need a series of conditions to be met and all the failed cases go to one place.
If I have a list of conditions that depend on each other's success, how can I make a concise short exit without duplication?
Another approach might be to use the Option functions - each of these steps will effectively short circuit if the input from the previous step is None.
let getBuildDate (assembly: Assembly) : DateTime option =
    let tryDate value =
        match DateTime.TryParseExact(value, "yyyyMMddHHmmss", CultureInfo.InvariantCulture, DateTimeStyles.None) with
        | true, date -> Some date
        | false, _ -> None
    let buildVersionMetadataPrefix = "+build"
    let attribute = assembly.GetCustomAttribute<AssemblyInformationalVersionAttribute>()
    Option.ofObj attribute
    |> Option.bind (fun attr -> Option.ofObj attr.InformationalVersion)
    |> Option.map (fun infVer -> infVer, infVer.IndexOf buildVersionMetadataPrefix)
    |> Option.filter (fun (_, index) -> index > 0)
    |> Option.map (fun (infVer, index) -> infVer.Substring(index + buildVersionMetadataPrefix.Length))
    |> Option.bind tryDate
Whether this is 'better' is arguable - and definitely a matter of opinion!
The other answers show how to do this using more sophisticated functional programming methods, like using computation expressions or option values. Those are definitely useful and make sense if this is something that you are doing in many places throughout your system.
However, if you just want a simple way to change the code so that the control flow is more clear (without making it more clever), I would negate the conditions. Previously, you had:
if something then
    moreStuff()
    Some result
else
    None
You can rewrite this by returning None if not something. I think the F# coding convention in this case also allows you to remove the indentation, so it looks more like imperative early return:
if not something then None else
moreStuff()
Some result
With this, you can write your original function as follows - without any extra clever tricks:
let getBuildDate (assembly: Assembly) : DateTime option =
    let buildVersionMetadataPrefix = "+build"
    let attribute = assembly.GetCustomAttribute<AssemblyInformationalVersionAttribute>()
    if attribute = null || attribute.InformationalVersion = null then None else
    let value = attribute.InformationalVersion
    let index = value.IndexOf(buildVersionMetadataPrefix)
    if index <= 0 then None else
    let value = value.Substring(index + buildVersionMetadataPrefix.Length)
    let success, timestamp = DateTime.TryParseExact(value, "yyyyMMddHHmmss", CultureInfo.InvariantCulture, DateTimeStyles.None)
    if not success then None else
    Some timestamp
A readable approach might be to use a computation expression builder for Option.
type OptionBuilder() =
    member _.Return v = Some v
    member _.Zero () = None
    member _.Bind(v, f) = Option.bind f v
    member _.ReturnFrom o = o
let opt = OptionBuilder()
You can simulate an imperative style of if-then-return.
let condition num = num % 2 = 0
let result = opt {
    if condition 2 then
        if condition 4 then
            if condition 6 then
                return 10
}
Rewriting your example:
let getBuildDate (assembly: Assembly) : DateTime option = opt {
    let buildVersionMetadataPrefix = "+build"
    let attribute = assembly.GetCustomAttribute<AssemblyInformationalVersionAttribute>()
    if attribute <> null && attribute.InformationalVersion <> null then
        let value = attribute.InformationalVersion
        let index = value.IndexOf(buildVersionMetadataPrefix)
        if index > 0 then
            let value = value.Substring(index + buildVersionMetadataPrefix.Length)
            let success, timestamp = DateTime.TryParseExact(value, "yyyyMMddHHmmss", CultureInfo.InvariantCulture, DateTimeStyles.None)
            if success then
                return timestamp
}
No more None.
open System
open System.Reflection
open System.Globalization
let inline guard cond next = if cond then next () else None
let getBuildDate (assembly: Assembly) : DateTime option =
    let buildVersionMetadataPrefix = "+build"
    let attribute = assembly.GetCustomAttribute<AssemblyInformationalVersionAttribute>()
    guard (attribute <> null && attribute.InformationalVersion <> null) <| fun _ ->
    let value = attribute.InformationalVersion
    let index = value.IndexOf(buildVersionMetadataPrefix)
    guard (index > 0) <| fun _ ->
    let value = value.Substring(index + buildVersionMetadataPrefix.Length)
    let success, timestamp = DateTime.TryParseExact(value, "yyyyMMddHHmmss", CultureInfo.InvariantCulture, DateTimeStyles.None)
    guard success <| fun _ ->
    Some timestamp
If you can stomach the inelegance of having to write <| fun _ -> on every guard, this is an option worth considering.
Have you considered using Result<TSuccess, TError>? It imposes a lot of structure, making the code rigid and flat, and makes it possible to provide detailed error information for the step that fails. It's a little more code, but IMO more readable and maintainable:
let getBuildDate (assembly: Assembly) : Result<DateTime, string> =
    let buildVersionMetadataPrefix = "+build"
    let extractAttribute (assem: Assembly) =
        match assem.GetCustomAttribute<AssemblyInformationalVersionAttribute>() with
        | attrib when attrib <> null -> Ok attrib
        | _ -> Error "No attribute found"
    let extractDateString (attrib: AssemblyInformationalVersionAttribute) =
        match attrib.InformationalVersion.IndexOf (buildVersionMetadataPrefix) with
        | x when x > 0 -> Ok (attrib.InformationalVersion.Substring (x + buildVersionMetadataPrefix.Length))
        | _ -> Error "Metadata prefix not found"
    let toDateTime dateString =
        match DateTime.TryParseExact(dateString, "yyyyMMddHHmmss", CultureInfo.InvariantCulture, DateTimeStyles.None) with
        | true, timeStamp -> Ok timeStamp
        | false, _ -> Error "Invalid date time format"
    extractAttribute assembly
    |> Result.bind extractDateString
    |> Result.bind toDateTime
Usage
let optBuildDate = getBuildDate (Assembly.GetExecutingAssembly())
match optBuildDate with
| Ok date -> printfn "%A" date
| Error msg -> printfn "ERROR: %s" msg
There is an approach that I really love for certain scenarios: a lookup table keyed by the conditions themselves.
Example:
Instead of using something like:
if (grade >= 90) {
scale = "A";
} else if (grade >= 80) {
scale = "B";
} else if (grade >= 70) {
scale = "C";
} else if (grade >= 60) {
scale = "D";
} else {
scale = "F";
}
Use a lookup object like:
function calculate(scores) {
var grade, scale;
let sum = 0;
for (let i = 0; i < scores.length; i++) {
sum += scores[i];
}
grade = sum / scores.length;
scale = {
[90 <= grade && grade <= 100]: "O",
[80 <= grade && grade < 90]: "E",
[70 <= grade && grade < 80]: "A",
[55 <= grade && grade < 70]: "P",
[40 <= grade && grade < 55]: "D",
[grade < 40]: "T"
};
console.log(scale.true);
}
In Python it could look like:
def calculate(scores: list) -> str:
    grade = sum(scores) / len(scores)
    print(grade)
    scale = {90 <= grade <= 100: "O", 80 <= grade < 90: "E",
             70 <= grade < 80: "A", 55 <= grade < 70: "P",
             40 <= grade < 55: "D", grade < 40: "T"}
    return scale.get(True)

Writing AST matcher to find all case statements having no break statement

I want to find all the case statements that have no break statement. I am using clang-query to build my matcher, but it fails on some test cases.
I wrote a simple matcher:
match caseStmt(unless(has(breakStmt())))
It works with the following test case:
#include<stdlib.h>
int main(){
int x;
switch(x){
case 1:
break;
case 2:
default:
x++;
}
return 0;
}
and
int main()
{
int x = 1, y = 2;
// Outer Switch
switch (x) {
// If x == 1
case 1:
// Nested Switch
switch (y) {
// If y == 2
case 2:
//break;
// If y == 3
case 3:
break;
}
break;
// If x == 4
case 4:
break;
// If x == 5
case 5:
break;
default:
break;
}
return 0;
}
but it does not work well with the following:
#include <iostream>
using namespace std;
int main()
{
int x = 1, y = 2;
// Outer Switch
switch (x) {
// If x == 1
case 1:
// Nested Switch
switch (y) {
// If y == 2
case 2:
cout << "Choice is 2";
//break;
// If y == 3
case 3:
cout << "Choice is 3";
break;
}
//break;
// If x == 4
case 4:
cout << "Choice is 4";
break;
// If x == 5
case 5:
cout << "Choice is 5";
break;
default:
cout << "Choice is other than 1, 2 3, 4, or 5";
break;
}
return 0;
}
In the above case it reports case statements that have a break statement along with case statements that have none.
What am I doing wrong? Please help :) I am following this tutorial:
http://releases.llvm.org/8.0.0/tools/clang/docs/LibASTMatchersTutorial.html
Unfortunately this is not going to work :-(
case is technically a label, and a label has only one statement as its child. If you print out the AST you'll see that the case and break statements are at the same level:
| |-CaseStmt 0x5618732e1e30 <line:29:3, line:30:9>
| | |-IntegerLiteral 0x5618732e1e10 <line:29:8> 'int' 4
| | |-<<<NULL>>>
| | `-CallExpr 0x5618732e1f00 <line:30:5, col:9> 'void'
| |   `-ImplicitCastExpr 0x5618732e1ee8 <col:5> 'void (*)()' <FunctionToPointerDecay>
| |     `-DeclRefExpr 0x5618732e1ec0 <col:5> 'void ()' lvalue Function 0x5618732e16d0 'foo' 'void ()'
| |-BreakStmt 0x5618732e1f28 <line:31:5>
| |-CaseStmt 0x5618732e1f50 <line:34:3, line:35:9>
| | |-IntegerLiteral 0x5618732e1f30 <line:34:8> 'int' 5
| | |-<<<NULL>>>
| | `-CallExpr 0x5618732e2020 <line:35:5, col:9> 'void'
| |   `-ImplicitCastExpr 0x5618732e2008 <col:5> 'void (*)()' <FunctionToPointerDecay>
| |     `-DeclRefExpr 0x5618732e1fe0 <col:5> 'void ()' lvalue Function 0x5618732e16d0 'foo' 'void ()'
| |-BreakStmt 0x5618732e2048 <line:36:5>
Here you can see that CallExpr is a child of CaseStmt while BreakStmt is not.
NOTE: to make the example a bit easier, I replaced std::cout << "..." with foo().
You'll have to write a much more complex matcher that looks for cases with no break statement between them and the following case.
I hope this is still helpful.

F# - Expected to be option but has a type

I was trying to build a binary tree in F# but when I tried to test my code, I met the problem above.
Here is my code:
type TreeNode<'a> = { Key: int; Val: 'a }
type Tree<'a> = { LT: Tree<'a> option; TreeNode: TreeNode<'a>; RT: Tree<'a> option; }
//insert a node according to Binary Tree operation
let rec insert (node: TreeNode<'a>) (tree: Tree<'a> option) =
    match tree with
    | None -> {LT = None; RT = None; TreeNode = node }
    | Some t when node.Key < t.TreeNode.Key -> insert node t.LT
    | Some t when node.Key > t.TreeNode.Key -> insert node t.RT
let t = seq { for i in 1 .. 10 -> { Key = i; Val = i } }|> Seq.fold (fun a i -> insert i a) None
Your insert function takes option<Tree<'T>> but returns Tree<'T>. When performing the fold, the state has to keep the same type throughout, so if you want to use None to represent the empty tree, the state needs to be an option type.
The way to fix this is to wrap the result of insert in Some:
let tree =
    seq { for i in 1 .. 10 -> { Key = i; Val = i } }
    |> Seq.fold (fun a i -> Some(insert i a)) None
I worked it out now... It should be like below:
type TreeNode<'a> = { Key: int; Val: 'a }
type Tree<'a> = { TreeNode: TreeNode<'a>; RT: Tree<'a> option; LT: Tree<'a> option; }
//insert a node according to Binary Tree operation
let rec insert (node: TreeNode<'a>) (tree: Tree<'a> option) =
    match tree with
    | None -> {LT = None; RT = None; TreeNode = node }
    | Some t when node.Key < t.TreeNode.Key -> {TreeNode = t.TreeNode; LT = Some(insert node t.LT); RT = t.RT}
    | Some t when node.Key > t.TreeNode.Key -> {TreeNode = t.TreeNode; RT = Some(insert node t.RT); LT = t.LT}
let t = seq { for i in 1 .. 10-> { Key = i; Val = i } }|> Seq.fold (fun a i -> Some(insert i a)) None
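One thing the match above leaves open is a key that is already in the tree: neither guard applies, so the compiler warns about an incomplete match and a duplicate key would throw at run time. A possible sketch that covers this case follows. Overwriting the stored node on an equal key is my assumption (keeping the old one would be just as valid), and size is a helper added only to check the result:

```fsharp
type TreeNode<'a> = { Key: int; Val: 'a }
type Tree<'a> = { LT: Tree<'a> option; TreeNode: TreeNode<'a>; RT: Tree<'a> option }

// Insert a node, rebuilding the path from the root down to the insertion point.
let rec insert (node: TreeNode<'a>) (tree: Tree<'a> option) : Tree<'a> =
    match tree with
    | None -> { LT = None; TreeNode = node; RT = None }
    | Some t when node.Key < t.TreeNode.Key -> { t with LT = Some (insert node t.LT) }
    | Some t when node.Key > t.TreeNode.Key -> { t with RT = Some (insert node t.RT) }
    | Some t -> { t with TreeNode = node }   // assumption: an equal key replaces the node

// Helper (not in the original): count the nodes in a tree.
let rec size (tree: Tree<'a> option) =
    match tree with
    | None -> 0
    | Some t -> 1 + size t.LT + size t.RT

let tree =
    seq { for i in 1 .. 10 -> { Key = i; Val = i } }
    |> Seq.fold (fun acc n -> Some (insert n acc)) None

printfn "%d nodes" (size tree)
```

Because the keys arrive in increasing order here, the resulting tree is a right-leaning chain; the insert itself does no balancing.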

iterative binary search implementation in f#

I am trying to write a binary search in f#, but stumbled at a problem:
let find(words:string[]) (value:string) =
    let mutable mid = 0
    let mutable fpos = 0
    let mutable lpos = words.Length - 1
    while fpos < lpos do
        mid <- (fpos + lpos) / 2
        if value < words.[mid] then
            lpos <- mid
        else if value > words.[mid] then
            fpos <- mid
        else if value = words.[mid] then
            true
    false
It gives an error at the line containing true, saying it expected an expression of type unit but got bool. What is the correct way to write this function?
Edit:
For now I have taken to writing it as follows:
let find(words:string[]) (value:string) =
    let mutable mid = 0
    let mutable fpos = 0
    let mutable lpos = words.Length - 1
    let mutable ret = false
    while fpos < lpos && ret = false do
        mid <- (fpos + lpos) / 2
        if value < words.[mid] then
            lpos <- mid
        else if value > words.[mid] then
            fpos <- mid
        else if value = words.[mid] then
            ret <- true
    ret
But execution-wise I think I am doing more operations here than intended...
Use a recursive function:
let find(words:string[]) (value:string) =
    let rec findRec fpos lpos =
        if fpos > lpos then
            false
        else
            let mid = (fpos + lpos) / 2
            if value < words.[mid] then
                findRec fpos (mid-1)
            else if value > words.[mid] then
                findRec (mid+1) lpos
            else
                true
    findRec 0 (words.Length-1)
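A quick way to convince yourself that this terminates at the boundaries is to probe it with values below, inside and above the range of the array, plus an empty array. The function is restated so the snippet runs on its own; the sample words and probe values are mine:

```fsharp
// The recursive binary search from above, restated so the snippet is self-contained.
let find (words: string[]) (value: string) =
    let rec findRec fpos lpos =
        if fpos > lpos then false
        else
            let mid = (fpos + lpos) / 2
            if value < words.[mid] then findRec fpos (mid - 1)
            elif value > words.[mid] then findRec (mid + 1) lpos
            else true
    findRec 0 (words.Length - 1)

// Sample data (assumed): a sorted array and a few probe values.
let words = [| "a"; "b"; "c"; "d" |]
for probe in [ "a"; "c"; "d"; "e"; "" ] do
    printfn "%s -> %b" probe (find words probe)
```

Each recursive call shrinks the range by at least one element because mid itself is excluded, so the search cannot loop forever.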
Non-recursive version (adapted from Gene's answer):
let find (words: string[]) (value:string) =
    let mutable mid = 0
    let mutable fpos = 0
    let mutable lpos = words.Length - 1
    let mutable cont = true
    while fpos <= lpos && cont do
        mid <- (fpos + lpos) / 2
        match sign(value.CompareTo(words.[mid])) with
        | -1 -> lpos <- mid-1
        | 1 -> fpos <- mid+1
        | _ -> cont <- false
    not cont
But I think that the recursive version is preferable: it is more idiomatic, and as efficient as the iterative one because it uses tail calls.
To begin with, your algorithm would not terminate for a value greater than the rightmost element of words (an easy test case is find [|"a";"b";"c";"d"|] "e").
With this corrected and a few minor optimizations thrown in, the final iterative implementation is unlikely to be shorter than the one below:
let find (words: string[]) (value:string) =
    let mutable lpos = words.Length - 1
    if value.CompareTo(words.[lpos]) > 0 then
        false
    else
        let mutable mid = 0
        let mutable fpos = 0
        let mutable cont = true
        while fpos < lpos && cont do
            mid <- (fpos + lpos) / 2
            match sign(value.CompareTo(words.[mid])) with
            | -1 -> lpos <- mid
            | 1 -> fpos <- mid
            | _ -> cont <- false
        not cont
UPDATE: That's what happens when you post an answer in a rush and without a computer around :(. The struck-through content above is not something to be proud of. As MiMo has already taken care of all the problems in the snippet above, I'll try something different to vindicate myself: demonstrating how MiMo's recursive implementation, after tail-call elimination, turns almost literally into his non-recursive one.
We'll do this in two steps: first use pseudo-code with labels and gotos to illustrate what the compiler does to eliminate this form of tail recursion, and then convert the pseudo-code back into F# to get an imperative version.
// Step 1 - pseudo-code with tail recursion substituted by goto
let find(words:string[]) (value:string) =
    let mutable fpos = 0
    let mutable lpos = words.Length - 1
    findRec:
    match fpos - lpos > 0 with
    | true -> return false
    | _ -> let mid = (fpos + lpos) / 2
           match sign(value.CompareTo(words.[mid])) with
           | -1 -> lpos <- mid - 1
                   goto findRec
           | 1 -> fpos <- mid + 1
                  goto findRec
           | _ -> return true
Now, in the absence of goto, we need an equivalent construction that stays within legitimate F#. The easiest approach is a while...do loop combined with a mutable state variable that both signals when the loop should stop and carries the return value. A tuple of two Booleans is sufficient for this purpose:
// Step 2 - conversion of pseudo-code back to F#
let find(words:string[]) (value:string) =
    let mutable fpos = 0
    let mutable lpos = words.Length - 1
    let mutable state = (true,false)
    while (fst state) do
        match fpos - lpos > 0 with
        | true -> state <- (false,false)
        | _ -> let mid = (fpos + lpos) / 2
               match sign(value.CompareTo(words.[mid])) with
               | -1 -> lpos <- mid - 1
               | 1 -> fpos <- mid + 1
               | _ -> state <- (false,true)
    snd state
Summing up, the difference between the "a-la compiler optimized" recursive version and the hand-written imperative one is insignificant. In my opinion, this makes it evident that a correctly arranged recursive version is equivalent to the imperative version performance-wise, while the conversion performed by the compiler leaves no room for the blunders of stateful coding.
I would suggest a recursive solution like this:
let find (xs: _ []) x =
    let rec loop i0 i2 =
        match i2-i0 with
        | 0 -> false
        | 1 -> xs.[i0]=x
        | di ->
            let i1 = i0 + di/2
            let c = compare x xs.[i1]
            if c<0 then loop i0 i1
            else c=0 || loop i1 i2
    loop 0 xs.Length
F# converts the tail calls into gotos, of course:
internal static bool loop#4<a>(a[] xs, a x, int i0, int i2)
{
a a;
while (true)
{
int num = i2 - i0;
switch (num)
{
case 0:
return false;
case 1:
goto IL_50;
default:
{
int i3 = i0 + num / 2;
a = xs[i3];
int c = LanguagePrimitives.HashCompare.GenericComparisonIntrinsic<a>(x, a);
if (c < 0)
{
a[] arg_37_0 = xs;
a arg_35_0 = x;
int arg_33_0 = i0;
i2 = i3;
i0 = arg_33_0;
x = arg_35_0;
xs = arg_37_0;
}
else
{
if (c == 0)
{
return true;
}
a[] arg_4A_0 = xs;
a arg_48_0 = x;
int arg_46_0 = i3;
i2 = i2;
i0 = arg_46_0;
x = arg_48_0;
xs = arg_4A_0;
}
break;
}
}
}
return true;
IL_50:
a = xs[i0];
return LanguagePrimitives.HashCompare.GenericEqualityIntrinsic<a>(a, x);
}
public static bool find<a>(a[] xs, a x)
{
return File1.loop#4<a>(xs, x, 0, xs.Length);
}
