F# how to search and count occurrences up until a point? - f#

I made a data structure to represent a house maze. The way it works is that a path can either lead to a dead end where there is a cake, ice cream or an oven, or it can lead to an intersection that is either Left, Right, Double (left and right paths) or Triple (left, middle and right paths).
Here are the types and the layout of the house
type path =
|End of string
|Double of path * path
|Triple of path * path * path
|Left of path
|Right of path
let GingerbreadHouse = Triple(
Double( Double(End("*"), End("X")), Left(Right(End("X"))) ),
Left( Double(End("*") , Left( Double(End("X") , Right(End("O")))))) ,
Left( Triple(Right(End("X")) , Double( Double(End("*") , End("X")) , End("X")) , Double(End("X") , Right(End("*"))) ))
)
Now what I am trying to do is count the number of cakes reached before reaching the oven while following the rightmost path first.
I first tried doing it with a simple helper function that keeps track of the count and, when it reaches the oven, just returns the count. However, I hit a wall, as I can't exactly return from the dead ends in this way.
let YummyKids house =
let rec helper house count =
match t with
|Left(p) -> helper p count
|Right(p) -> helper p
|Double(lp,rp) -> helper count + helper count
|Triple(lp, fp, rp) -> helper rp count + helper mp count + helper lp count
|End(treat) when treat = "*" -> helper ??? count
|End(treat) when treat = "X" -> helper ??? (count+1)
|End(treat) when treat = "O" -> count
helper house 0
For my second attempt I thought I would stick with the normal backwards recursion method. However, I hit another wall, as I have no clue how to actually end the whole thing when it reaches the oven.
let YummyKids house =
match t with
|Left(p) -> YummyKids p
|Right(p) -> YummyKids p
|Double(lp,rp) -> YummyKids rp + YummyKids lp
|Triple(lp, fp, rp) -> YummyKids rp + YummyKids mp + YummyKids lp
|End(treat) when treat = "*" -> 0
|End(treat) when treat = "X" -> 1
|End(treat) when treat = "O" -> //???
This would work if we were looking to count the number of cakes in the entire maze, but I only want to count up until a certain point, namely when it reaches the oven. How can I do this?

The result of your recursive function needs to indicate how many cakes you found so far, but also whether the process has already reached an oven and should therefore terminate.
Then you can implement the branching so that it continues to the other branches if an oven has not been found yet (adding the number of cakes), but returns immediately when an oven is found in one branch - before looking into the other branches.
In the following, the return type is int * bool where the int represents a number of cakes and bool is true when we hit an oven. The interesting case is the handling of Double:
let rec YummyKids path =
    match path with
    | Left(p) -> YummyKids p
    | Right(p) -> YummyKids p
    | Double(lp,rp) ->
        let cakes, finished = YummyKids rp
        if finished then cakes, finished else
        let moreCakes, finished = YummyKids lp
        cakes + moreCakes, finished
    | End(treat) when treat = "*" -> 0, false
    | End(treat) when treat = "X" -> 1, false
    | End(treat) when treat = "O" -> 0, true
In Double, we first look into the right branch and, if we found an oven, we return the number of cakes so far. If finished = false, we look into the left branch and add the cakes.
I left the Triple case unimplemented, but you should be able to complete that fairly easily following the same pattern as in the Double case.
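As a rough sketch of how the Triple case could follow the same pattern, the version below visits the branches in right-to-left order and stops as soon as one of them reports an oven. The helper name visitInOrder is hypothetical (it is not part of the original answer), and the literal patterns replace the when guards purely for brevity.

```fsharp
type path =
    | End of string
    | Double of path * path
    | Triple of path * path * path
    | Left of path
    | Right of path

let GingerbreadHouse =
    Triple(
        Double(Double(End "*", End "X"), Left(Right(End "X"))),
        Left(Double(End "*", Left(Double(End "X", Right(End "O"))))),
        Left(Triple(Right(End "X"),
                    Double(Double(End "*", End "X"), End "X"),
                    Double(End "X", Right(End "*")))))

// Returns (cakes found so far, whether an oven was reached).
let rec YummyKids path =
    match path with
    | Left p | Right p -> YummyKids p
    | Double(lp, rp) -> visitInOrder [ rp; lp ]          // rightmost first
    | Triple(lp, mp, rp) -> visitInOrder [ rp; mp; lp ]  // rightmost first
    | End "X" -> 1, false   // cake
    | End "O" -> 0, true    // oven: stop counting
    | End _ -> 0, false     // anything else (e.g. ice cream)

// Hypothetical helper: walk the branches in the given order and
// stop as soon as one of them has reached the oven.
and visitInOrder branches =
    match branches with
    | [] -> 0, false
    | branch :: rest ->
        let cakes, finished = YummyKids branch
        if finished then
            cakes, true
        else
            let moreCakes, finishedLater = visitInOrder rest
            cakes + moreCakes, finishedLater

printfn "%A" (YummyKids GingerbreadHouse) // (4, true)
```

For the house in the question this gives four cakes: all four in the rightmost top-level branch are collected, and the oven is then reached in the middle branch before any further cake.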

Tomas has answered your question. I'd just like to point out a couple of possible optimizations:
A single argument function expression immediately followed by matches on the argument let func arg = match arg with... can be replaced by a pattern matching function, let func = function...
You can match against literals, e.g. "X", "O", true, so that the guarded pattern matching rule with the keyword when may be avoided
The Triple case can be constructed from two Double cases
Multiple cases binding to the same pattern can be combined into an or pattern that specifies the right hand expression only once
To make the pattern match complete and avoid the warning, specify a wildcard pattern _
A single argument case can be given without parentheses
Thus we may also arrive at this:
let rec YummyKids = function
    | Double(lp,rp) ->
        match YummyKids rp with
        | _, true as result -> result
        | cakes, _ ->
            let moreCakes, finished = YummyKids lp
            cakes + moreCakes, finished
    // Double visits its second component first, so rp must be the outer right branch
    | Triple(lp,mp,rp) -> YummyKids (Double(Double(lp,mp),rp))
    | Left p | Right p -> YummyKids p
    | End "X" -> 1, false
    | End "O" -> 0, true
    | End _ -> 0, false

Related

Non decreasing list of lists

I've been trying to implement a function that takes a list of integers and returns a list of lists of integers, each of which is non-decreasing.
i.e
let ls = [ 1;2;3;5;6;3;2;5;6;2]
I should get [[1;2;3;5;6];[3];[2;5;6];[2]]
How should I approach this? I'm a total noob at functional programming.
I can think of the steps needed:
1. Start a new sublist, compare each element with the one next to it. if it is greater then add to list. if not, start a new list and so on.
From what I've learned so far from the book Functional Programming with F# (which I just started a few days ago), I could possibly use pattern matching and a recursive function to go through the list comparing two elements,
something like this :
let rec nonDecreasing list =
match list with
| (x,y) :: xs when x <= y ->
How would I go about creating the sublists using pattern matching?
Or have I approached the question wrongly?
Since there's already a solution using fold, here's another answer using foldBack, so you don't have to reverse the result. From it you can work back to a purely recursive solution.
let splitByInc x lls = // x is an item from the list, lls is a list of lists
    match lls with
    | y::xs -> // split the list of lists into head and tail
        match y with
        | h::_ when x <= h -> (x::y)::xs // take the head, and compare it with x, then cons it together with the rest
        | _ -> [x]::lls // in the other case cons the single item with the rest of the list of lists
    | _ -> [[x]] // nothing else to do, return the whole thing
let ls = [ 1;2;3;5;6;3;2;5;6;3]
List.foldBack splitByInc ls [] //foldBack needs a folder function, a list and a starting state
Edit:
Here's a really simplified example, you could write a recursive sum and compare it with the fold version:
let sumList x y =
    x + y
List.foldBack sumList ls 0 //36
To better understand what splitByInc does, try it out with these examples:
splitByInc 4 [[5;6;7]] // matches (x::y)::xs
splitByInc 4 [] // matches [[x]]
splitByInc 4 [[1;2;3]] // matches [x]::lls
That's basically the same answer as the one given by @s952163, but maybe more readable by removing the nested match, and also more general by adding a comparison function to do the "packing".
let packWhile predicate list =
    let folder item = function
        | [] -> [[ item ]]
        | (subHead :: _ as subList) :: accTail
            when predicate item subHead -> (item :: subList) :: accTail
        | accList -> [ item ] :: accList
    List.foldBack folder list []
// usage (you can replace (<=) by (fun x y -> x <= y) if it's clearer for you)
packWhile (<=) [1;2;3;5;6;3;2;5;6;3]
// you can also define a function to bake-in the comparison
let packIncreasing list = packWhile (<=) list
packIncreasing [1;2;3;5;6;3;2;5;6;3]
I'd use a fold, where your 'State is a tuple containing the previous value, the list of lists, and the current non-decreasing list you're working on.
open System // for Int32.MinValue
let ls = [ 1;2;3;5;6;3;2;5;6;3]
let _, listOfLists, currList =
    ((Int32.MinValue, [], []), ls) ||>
        List.fold(fun (prev, listOfLists, currList) t ->
            if t < prev then //decreasing, so store your currList and start a new one
                t, currList::listOfLists, [t]
            else //just add t to your currList
                t, listOfLists, t::currList)
let listOfLists = currList::listOfLists //cleanup: append final sublist
let final = List.rev(List.map List.rev listOfLists) //cleanup: reverse everything
printfn "%A" final
Note you'll have to clean up, adding the final list to the list-of-lists, and then reversing the full list-of-lists and each sublist once you've done the fold.
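For comparison with the fold-based answers, here is a minimal sketch of a purely recursive version using pattern matching, which is what the question originally asked about. The inner function name loop and its accumulator arguments are my own choices, not taken from any of the answers.

```fsharp
// Splits a list into maximal non-decreasing runs.
let nonDecreasing list =
    // current: the run being built, in reverse order
    // acc: the finished runs, newest first
    let rec loop current acc remaining =
        match remaining, current with
        | [], [] -> List.rev acc
        | [], _ -> List.rev (List.rev current :: acc)
        | x :: rest, prev :: _ when x < prev ->
            // x is smaller than the previous item: close the run, start a new one
            loop [ x ] (List.rev current :: acc) rest
        | x :: rest, _ ->
            // x continues the current run (or starts the very first one)
            loop (x :: current) acc rest
    loop [] [] list

printfn "%A" (nonDecreasing [ 1;2;3;5;6;3;2;5;6;2 ])
// [[1; 2; 3; 5; 6]; [3]; [2; 5; 6]; [2]]
```

Both accumulators are built in reverse and flipped once at the end, so the function stays tail-recursive and avoids appending to the end of a list.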

How to "compress" similar branches in F# pattern matching

I have the following piece of code in hand:
match intersection with
| None ->
printfn "Please provide an empty intersection for ring placement"
gameState
| Some x ->
match x.Status with
| Empty ->
let piece = { Color = gameState.Active.Color; Type = Ring }
putPieceOnIntersection gameState.Board pos piece
printfn "%s ring placed at %A" (colorStr gameState.Active.Color) pos
// Decide if we ended this phase
let updatedPhase = if ringsPlaced = 10 then Main else Start(ringsPlaced + 1)
let newActivePlayer = gameState.Players |> Array.find (fun p -> p.Color = invertColor gameState.Active.Color)
let updatedGameState = { gameState with Active = newActivePlayer; CurrentPhase = updatedPhase }
updatedGameState
| _ ->
printfn "Please provide an empty intersection for ring placement"
gameState
As you can see, if the variable intersection is either None or its Status is anything other than Empty, I want to run exactly the same branch, printing some text and returning. However, I don't know how to write that kind of conditional expression in F# so that both cases share the same branch. In imperative programming I would do this easily, but how can I do it in F#?
Thank you
If Status is a record field then you can do:
match intersection with
| Some { Status = Empty } ->
    // Code for empty...
| _ ->
    printfn "Please provide an empty intersection for ring placement"
    gameState
Otherwise, you can use a guard:
match intersection with
| Some x when x.Status = Empty ->
    // Code for empty...
| _ ->
    printfn "Please provide an empty intersection for ring placement"
    gameState
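To try the two variants in isolation, here is a small self-contained sketch. The Status and Intersection types and the Label field are made up for illustration; the real game types are not shown in the question.

```fsharp
// Hypothetical stand-ins for the game's types.
type Status = Empty | Occupied
type Intersection = { Status: Status; Label: string }

// Variant 1: nested record pattern inside Some.
let describeWithRecordPattern intersection =
    match intersection with
    | Some { Status = Empty; Label = label } -> sprintf "ring placed at %s" label
    | _ -> "Please provide an empty intersection for ring placement"

// Variant 2: guard on the bound value.
let describeWithGuard intersection =
    match intersection with
    | Some x when x.Status = Empty -> sprintf "ring placed at %s" x.Label
    | _ -> "Please provide an empty intersection for ring placement"

printfn "%s" (describeWithRecordPattern (Some { Status = Empty; Label = "A1" }))
printfn "%s" (describeWithGuard None)
```

In both variants None and Some with a non-empty status fall through to the single wildcard branch, which is the sharing the question asks for.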

For loop in list

People often use
for i in [0 .. 10] do something
but afaik that creates a list which is then iterated through, it appears to me it would make more sense to use
for i = 0 to 10 do something
without creating that unnecessary list but having the same behaviour.
Am I missing something? (I guess that's the case)
You are correct, writing for i in [0 .. 10] do something generates a list and it does have a significant overhead. Though you can also omit the square brackets, in which case it just builds a lazy sequence (and, it turns out that the compiler even optimizes that case). I generally prefer writing in 0 .. 100 do because it looks the same as code that iterates over a sequence.
Using the #time feature of F# interactive to do a simple analysis:
let mutable last = 0
for i in [ 0 .. 10000000 ] do // 3194ms (yikes!)
    last <- i
for i in 0 .. 10000000 do // 3ms
    last <- i
for i = 0 to 10000000 do // 3ms
    last <- i
for i in seq { 0 .. 10000000 } do // 709ms (smaller yikes!)
    last <- i
So, it turns out that the compiler actually optimizes the in 0 .. 10000000 do into the same thing as the 0 to 10000000 do loop. You can force it to create the lazy sequence explicitly (last case) which is faster than a list, but still very slow.
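As a quick sanity check that the loop forms differ only in cost and not in behaviour, the sketch below sums the same range with three of them. The variable names are arbitrary, and no timings are shown because they depend on the machine and compiler version.

```fsharp
let n = 1000

// Iterates over a materialised list.
let mutable viaList = 0
for i in [ 0 .. n ] do
    viaList <- viaList + i

// Range expression without brackets.
let mutable viaRange = 0
for i in 0 .. n do
    viaRange <- viaRange + i

// Classic counting loop.
let mutable viaTo = 0
for i = 0 to n do
    viaTo <- viaTo + i

printfn "%d %d %d" viaList viaRange viaTo // 500500 500500 500500
```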
This is a somewhat different kind of answer, but hopefully interesting to some.
You are correct in that the F# compiler fails to apply the fast-for-loop optimization in this case. Good news: the F# compiler is open source, and it's possible for us to improve upon its behavior.
So here's a freebie from me:
The fast-for-loop optimization happens in tastops.fs. It's rather primitive at the moment, which is a great opportunity for us to improve upon it.
// Detect the compiled or optimized form of a 'for <elemVar> in <startExpr> .. <finishExpr> do <bodyExpr>' expression over integers
// Detect the compiled or optimized form of a 'for <elemVar> in <startExpr> .. <step> .. <finishExpr> do <bodyExpr>' expression over integers when step is positive
let (|CompiledInt32ForEachExprWithKnownStep|_|) g expr =
    match expr with
    | Let (_enumerableVar, RangeInt32Step g (startExpr, step, finishExpr), _,
           Let (_enumeratorVar, _getEnumExpr, spBind,
                TryFinally (WhileLoopForCompiledForEachExpr (_guardExpr, Let (elemVar,_currentExpr,_,bodyExpr), m), _cleanupExpr))) ->
        let spForLoop = match spBind with SequencePointAtBinding(spStart) -> SequencePointAtForLoop(spStart) | _ -> NoSequencePointAtForLoop
        Some(spForLoop,elemVar,startExpr,step,finishExpr,bodyExpr,m)
    | _ ->
        None
let DetectFastIntegerForLoops g expr =
    match expr with
    | CompiledInt32ForEachExprWithKnownStep g (spForLoop,elemVar,startExpr,step,finishExpr,bodyExpr,m)
        // fast for loops only allow steps 1 and -1 at the moment
        when step = 1 || step = -1 ->
            mkFastForLoop g (spForLoop,m,elemVar,startExpr,(step = 1),finishExpr,bodyExpr)
    | _ -> expr
The problem here is that RangeInt32Step only detects patterns like 0..10 and 0..1..10. It misses, for instance, [0..10].
Let's introduce another active pattern, SeqRangeInt32Step, that matches this kind of expression:
let (|SeqRangeInt32Step|_|) g expr =
    match expr with
    // detect '[n .. m]'
    | Expr.App(Expr.Val(toList,_,_),_,[TType_var _],
               [Expr.App(Expr.Val(seq,_,_),_,[TType_var _],
                         [Expr.Op(TOp.Coerce, [TType_app (seqT, [TType_var _]); TType_var _],
                                  [RangeInt32Step g (startExpr, step, finishExpr)], _)],_)],_)
        when
            valRefEq g toList (ValRefForIntrinsic g.seq_to_list_info) &&
            valRefEq g seq g.seq_vref &&
            tyconRefEq g seqT g.seq_tcr ->
        Some(startExpr, step, finishExpr)
    | _ -> None
How do you figure out that this is what you need to pattern match for? The approach I often take is to write a simple F# program with the right properties and put a breakpoint in the compiler during compilation to inspect the expression. From that I create the pattern to match for.
Let's put the two patterns together:
let (|ExtractInt32Range|_|) g expr =
    match expr with
    | RangeInt32Step g range -> Some range
    | SeqRangeInt32Step g range -> Some range
    | _ -> None
CompiledInt32ForEachExprWithKnownStep is then updated to use ExtractInt32Range instead of RangeInt32Step.
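The way these partial active patterns compose can be tried outside the compiler with a toy model. Everything in the sketch below (the Expr type and the pattern names PlainRange, ListRange and AnyRange) is invented for illustration and is much simpler than the real compiler tree.

```fsharp
// Toy expression tree, NOT the compiler's real types.
type Expr =
    | Range of int * int * int   // start, step, finish
    | ToList of Expr             // models wrapping something in '[ ... ]'
    | Other

// Plays the role of RangeInt32Step: matches a bare range.
let (|PlainRange|_|) expr =
    match expr with
    | Range (start, step, finish) -> Some (start, step, finish)
    | _ -> None

// Plays the role of SeqRangeInt32Step: matches a range wrapped in a list.
let (|ListRange|_|) expr =
    match expr with
    | ToList (PlainRange range) -> Some range
    | _ -> None

// Plays the role of ExtractInt32Range: accepts either form.
let (|AnyRange|_|) expr =
    match expr with
    | PlainRange range -> Some range
    | ListRange range -> Some range
    | _ -> None

let describe expr =
    match expr with
    | AnyRange (start, step, finish) when step = 1 || step = -1 ->
        sprintf "fast loop from %d to %d" start finish
    | _ -> "generic enumeration"

printfn "%s" (describe (ToList (Range (0, 1, 10)))) // fast loop from 0 to 10
```

The point of the extra pattern is that the code consuming the range does not need to know which syntactic form produced it.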
The complete solution would be something like this:
let (|SeqRangeInt32Step|_|) g expr =
match expr with
// detect '[n .. m]'
| Expr.App(Expr.Val(toList,_,_),_,[TType_var _],
[Expr.App(Expr.Val(seq,_,_),_,[TType_var _],
[Expr.Op(TOp.Coerce, [TType_app (seqT, [TType_var _]); TType_var _],
[RangeInt32Step g (startExpr, step, finishExpr)], _)],_)],_)
when
valRefEq g toList (ValRefForIntrinsic g.seq_to_list_info) &&
valRefEq g seq g.seq_vref &&
tyconRefEq g seqT g.seq_tcr ->
Some(startExpr, step, finishExpr)
| _ -> None
let (|ExtractInt32Range|_|) g expr =
match expr with
| RangeInt32Step g range -> Some range
| SeqRangeInt32Step g range -> Some range
| _ -> None
// Detect the compiled or optimized form of a 'for <elemVar> in <startExpr> .. <finishExpr> do <bodyExpr>' expression over integers
// Detect the compiled or optimized form of a 'for <elemVar> in <startExpr> .. <step> .. <finishExpr> do <bodyExpr>' expression over integers when step is positive
let (|CompiledInt32ForEachExprWithKnownStep|_|) g expr =
match expr with
| Let (_enumerableVar, ExtractInt32Range g (startExpr, step, finishExpr), _,
Let (_enumeratorVar, _getEnumExpr, spBind,
TryFinally (WhileLoopForCompiledForEachExpr (_guardExpr, Let (elemVar,_currentExpr,_,bodyExpr), m), _cleanupExpr))) ->
let spForLoop = match spBind with SequencePointAtBinding(spStart) -> SequencePointAtForLoop(spStart) | _ -> NoSequencePointAtForLoop
Some(spForLoop,elemVar,startExpr,step,finishExpr,bodyExpr,m)
| _ ->
None
Using a simple test program
let print v =
    printfn "%A" v
[<EntryPoint>]
let main argv =
    for x in [0..10] do
        print x
    0
Before the optimization the corresponding C# code would look something like this (IL code is better to inspect but can be a bit hard to understand if one is unused to it):
// Test
[EntryPoint]
public static int main(string[] argv)
{
FSharpList<int> fSharpList = SeqModule.ToList<int>(Operators.CreateSequence<int>(Operators.OperatorIntrinsics.RangeInt32(0, 1, 10)));
IEnumerator<int> enumerator = ((IEnumerable<int>)fSharpList).GetEnumerator();
try
{
while (enumerator.MoveNext())
{
Test.print<int>(enumerator.Current);
}
}
finally
{
IDisposable disposable = enumerator as IDisposable;
if (disposable != null)
{
disposable.Dispose();
}
}
return 0;
}
F# creates a list and then uses the enumerator to iterate over it. No wonder it's rather slow compared to a classic for-loop.
After the optimization is applied we get this code:
// Test
[EntryPoint]
public static int main(string[] argv)
{
for (int i = 0; i < 11; i++)
{
Test.print<int>(i);
}
return 0;
}
A significant improvement.
So steal this code, post a PR to https://github.com/Microsoft/visualfsharp/ and bask in glory. Of course you need to add unit tests and emitted IL code tests, for which it can be somewhat tricky to find the right level; check this commit for inspiration.
PS. It should probably support [|0..10|] as well as seq {0..10}.
PS. In addition, for v in 0L..10L do print v as well as for v in 0..2..10 do print v are also inefficiently implemented in F#.
The former form requires a special construct in the language (for var from ... to ... by); it is the approach taken by older programming languages:
'do' loop in Fortran
for var:= expr to expr in Pascal
etc.
The latter form (for var in something) is more general. It works on plain lists, but also with generators (like in Python) etc. The full list may not need to be constructed before iterating over it. This makes it possible to write loops over potentially infinite lists.
Anyway, a decent compiler/interpreter should recognize the rather frequent special case [expr1..expr2] and avoid the computation and storage of the intermediate list.

Pattern matching with guards vs if/else construct in F#

In ML-family languages, people tend to prefer pattern matching to if/else construct. In F#, using guards within pattern matching could easily replace if/else in many cases.
For example, a simple delete1 function could be rewritten without using if/else (see delete2):
let rec delete1 (a, xs) =
    match xs with
    | [] -> []
    | x::xs' -> if x = a then xs' else x::delete1(a, xs')
let rec delete2 (a, xs) =
    match xs with
    | [] -> []
    | x::xs' when x = a -> xs'
    | x::xs' -> x::delete2(a, xs')
Another example is solving quadratic equations:
type Solution =
    | NoRoot
    | OneRoot of float
    | TwoRoots of float * float
let solve1 (a,b,c) =
    let delta = b*b-4.0*a*c
    if delta < 0.0 || a = 0.0 then NoRoot
    elif delta = 0.0 then OneRoot (-b/(2.0*a))
    else
        TwoRoots ((-b + sqrt(delta))/(2.0*a), (-b - sqrt(delta))/(2.0*a))
let solve2 (a,b,c) =
    match a, b*b-4.0*a*c with
    | 0.0, _ -> NoRoot
    | _, delta when delta < 0.0 -> NoRoot
    | _, 0.0 -> OneRoot (-b/(2.0*a))
    | _, delta -> TwoRoots((-b + sqrt(delta))/(2.0*a),(-b - sqrt(delta))/(2.0*a))
Should we use pattern matching with guards to avoid the ugly if/else construct?
Is there any performance implication of using pattern matching with guards? My impression is that it could be slow because the patterns have to be checked at runtime.
The right answer is probably it depends, but I surmise, in most cases, the compiled representation is the same. As an example
let f b =
    match b with
    | true -> 1
    | false -> 0
and
let f b =
    if b then 1
    else 0
both translate to
public static int f(bool b)
{
if (!b)
{
return 0;
}
return 1;
}
Given that, it's mostly a matter of style. Personally I prefer pattern matching because the cases are always aligned, making it more readable. Also, they're (arguably) easier to expand later to handle more cases. I consider pattern matching an evolution of if/then/else.
There is also no additional run-time cost for pattern matching, with or without guards.
Both have their place. People are more used to the if/else construct for checking a value, whereas pattern matching is like if/else on steroids. Pattern matching lets you compare against the decomposed structure of the data, and guards let you specify additional conditions on the parts of the decomposed data or on some other value (especially in the case of recursive data structures, or so-called discriminated unions in F#).
I personally prefer to use if/else for simple value comparisons (true/false, ints etc.), but if you have a recursive data structure or something that you need to compare against its decomposed value, then there is nothing better than pattern matching.
First make it work, and make it elegant and simple; then, if you see a performance problem, check for performance issues (which will mostly be due to some other logic and not due to pattern matching).
Agree with @Daniel that pattern matching is usually more flexible.
Check this implementation:
type Solution = | Identity | Roots of float list
let quadraticEquation x =
    let rec removeZeros list =
        match list with
        | 0.0::rest -> removeZeros rest
        | _ -> list
    let x = removeZeros x
    match x with
    | [] -> Identity // zero constant
    | [_] -> Roots [] // non-zero constant
    | [a;b] -> Roots [ -b/a ] // linear equation
    | [a;b;c] ->
        let delta = b*b - 4.0*a*c
        match delta with
        | delta when delta < 0.0 ->
            Roots [] // no real roots
        | _ ->
            let d = sqrt delta
            let x1 = (-b-d) / (2.0*a)
            let x2 = (-b+d) / (2.0*a)
            Roots [x1; x2]
    | _ -> failwithf "equation is bigger than quadratic: %A" x
Also notice that https://fsharpforfunandprofit.com/learning-fsharp/ discourages the use of if-else. It is considered a bit less functional.
I did some testing on a self-written prime number generator, and as far as I can tell, if/then/else was significantly slower than pattern matching. I can't explain why, but in my tests the imperative parts of F# ran slower than the recursive functional style for otherwise optimal algorithms.

Is this correct use of pattern matching and active patterns?

I'm new to F# and functional programming and am working on some HTML parsing code. I want to remove elements that match some criteria from an HTML document. Here I have a sequence of objects (HtmlNodes) and want to remove them from the document.
Is this an idiomatic way of using pattern matching? Also, as HtmlNode.Remove() has a side effect on the original HtmlDocument object, is there any particular way of structuring the code to make the side effect obvious, or how should this be handled? You can be as pedantic as you like with the code.
open HtmlAgilityPack
let removeNodes (node : HtmlNode) =
    let (|HiddenNodeCount|) (n : HtmlNode) =
        match n.SelectNodes("*[@style[contains(.,'visibility:hidden')]]") with
        | null -> 0
        | _ as x -> Seq.length x
    match node with
    | x when x.Name.ToLower() = "script" -> node.Remove()
    | x when x.NodeType = HtmlNodeType.Comment -> node.Remove()
    | HiddenNodeCount x when x > 0 -> node.Remove()
    | _ -> ()
let html = "some long messy html code would be here"
let dom = new HtmlDocument(OptionAutoCloseOnEnd=true)
dom.LoadHtml(html)
let nodes = dom.DocumentNode.DescendantNodes()
nodes |> Seq.toArray |> Array.iter removeNodes
Personally, I prefer if elif else over pattern matching when you don't have a data structure to decompose (it's just less typing, and may also serve to differentiate between when a structure is being decomposed versus simpler case testing).
There are some odd things in your code. The Active Pattern isn't very helpful here for two reasons: first, its scope is limited to removeNodes, so it is only used once. I'll address the second issue later, but first I will show how I would write this by eliminating the Active Pattern and, for me at least, making the side effects more obvious (by separating the code which tests whether a node should be removed from the code that does the removing):
let shouldRemoveNode (node : HtmlNode) =
    if node.Name.ToLower() = "script" then true
    elif node.NodeType = HtmlNodeType.Comment then true
    else
        match node.SelectNodes("*[@style[contains(.,'visibility:hidden')]]") with
        | null -> false
        | x -> Seq.length x > 0
let removeNode (node: HtmlNode) =
    if shouldRemoveNode(node) then node.Remove() else ()
Notice I do use a pattern match in the visibility hidden query since I do get to match against null and bind to x otherwise (rather than binding to x, and then testing x with if else).
The second odd thing with your Active Pattern is that you are using it for converting a node to an int, but the length you obtain isn't immediately useful (you still need to perform a test against it). The more powerful use of an Active Pattern here would be to carve up nodes into different kinds (assuming this isn't ad hoc, which was my first point). So you could have:
//expand to encompass several other kinds of nodes
let (|Script|Comment|Hidden|Other|) (node : HtmlNode) =
    if node.Name.ToLower() = "script" then Script
    elif node.NodeType = HtmlNodeType.Comment then Comment
    else
        match node.SelectNodes("*[@style[contains(.,'visibility:hidden')]]") with
        | null -> Other
        | x -> if Seq.length x > 0 then Hidden
               else Other
let removeNode (node: HtmlNode) =
    match node with
    | Script | Comment | Hidden -> node.Remove()
    | Other -> ()
Edit:
@Pascal made the observation in the comments that shouldRemoveNode can be further condensed into one big boolean expression:
let shouldRemoveNode (node : HtmlNode) =
    node.Name.ToLower() = "script" ||
    node.NodeType = HtmlNodeType.Comment ||
    match node.SelectNodes("*[@style[contains(.,'visibility:hidden')]]") with
    | null -> false
    | x -> Seq.length x > 0
It isn't clear to me that this is any better than using functions and if-then-else, e.g.
let HiddenNodeCount (n : HtmlNode) =
    match n.SelectNodes("*[@style[contains(.,'visibility:hidden')]]") with
    | null -> 0
    | x -> Seq.length x
if node.Name.ToLower() = "script" then
    node.Remove()
elif node.NodeType = HtmlNodeType.Comment then
    node.Remove()
elif HiddenNodeCount node > 0 then
    node.Remove()
