How does recursion work for active patterns in F#? - parsing

I have the following function for parsing an integer:
let rec digits = function
| head::tail when System.Char.IsDigit(head) ->
let result = digits tail
(head::(fst result), snd result)
| rest -> ([], rest)
If I change this function to be an active recognizer, it no longer compiles.
let rec (|Digits|) = function
| head::tail when System.Char.IsDigit(head) ->
let result = Digits tail
(head::(fst result), snd result)
// ^^^^^^ ^^^^^^ see error*
| rest -> ([], rest)
*error FS0001: This expression was expected to have type char list * 'a but here has type
char list

let rec (|Digits|) = function
| head::(Digits (a, b)) when System.Char.IsDigit(head) -> (head::a, b)
| rest -> ([], rest)
NOTE:
if you want to use active pattern as a function you still can do it:
let rec (|Digits|) = function
| head::tail when System.Char.IsDigit(head) ->
let a, b = (|Digits|) tail
(head::a, b)
| rest -> ([], rest)

Related

F#: pattern matching on base types

I have a series of validation functions I want to put into an array to execute:
type result = {D: int; E: int; F: int; G: int}
type InvalidReason =
| AAA
| BBB
| CCC
| DDD
| EEE
type Validation =
| Valid
| Invalid of InvalidReason
let validators = [|AAA; BBB; CCC; DDD; EEE|]
let validateStuff result =
validators
|> Array.map(fun v -> v result)
|> Array.contains(Validation.Invalid _)
The problem is that last line of code. I am getting an "Unexpected value _ in the expression." The following does work
|> Array.contains(Validation.Valid)
|> Array.contains(Validation.Invalid InvalidReason.AAA)
But I don't want to spell out each of the sub types for InvalidReasons. Is there some syntax I am overlooking?
The function Array.contains takes a value and checks if that value is in the array. What you're trying to do is to give it a whole bunch of values to check. Well, this won't work: the function only takes one. And it doesn't help that there is no syntax like that in F# :-)
You might use another function that takes multiple values, but a better way to accomplish what you want is to use a function that takes a predicate - Array.exists. Make yourself a predicate to check if a value is "invalid":
let isInvalid x = match x with
| Valid -> false
| Invalid _ -> true
And pass it to Array.exists:
let validateStuff result =
validators
|> Array.map(fun v -> v result)
|> Array.exists isInvalid
Or you could even put that function inline:
let validateStuff result =
validators
|> Array.map(fun v -> v result)
|> Array.exists ( fun x -> match x with
| Valid -> false
| Invalid _ -> true )
Or even shorter, using the function keyword:
let validateStuff result =
validators
|> Array.map(fun v -> v result)
|> Array.exists ( function | Valid -> false | Invalid _ -> true )
Or even shorter, getting rid of as much noise as possible:
let validateStuff result =
validators
|> Array.map(fun v -> v result)
|> Array.exists ( function Invalid _ -> true | _ -> false )

using a sequence on partial application

I have a sequence of value that I would like to apply to a function partially :
let f a b c d e= a+b+c+d+e
let items = [1,2,3,4,5]
let result = applyPartially f items
Assert.Equal(15, result)
I am looking after the applyPartially function. I have tried writing recursive functions like this :
let rec applyPartially f items =
| [] -> f
| [x] -> f x
| head :: tail -> applyPartially (f head) tail
The problem I have encountered is that the f type is at the beginning of my iteration 'a->'b->'c->'d->'e, and for every loop it should consume an order.
'a->'b->'c->'d->'e
'b->'c->'d->'e
'c->'d->'e
'd->'e
That means that the lower interface I can think of would be 'd->'e. How could I hide the complexity of my function so that only 'd->'e is shown in the recursive function?
The F# type system does not have a nice way of working with ordinary functions in a way you are suggesting - to do this, you'd need to make sure that the length of the list matches the number of arguments of the function, which is not possible with ordinary lists and functions.
However, you can model this nicely using a discriminated union. You can define a partial function, which has either completed, or needs one more input:
type PartialFunction<'T, 'R> =
| Completed of 'R
| NeedsMore of ('T -> PartialFunction<'T, 'R>)
Your function f can now be written (with a slightly ugly syntax) as a PartialFunction<int, int> that keeps taking 5 inputs and then returns the result:
let f =
NeedsMore(fun a -> NeedsMore(fun b ->
NeedsMore(fun c -> NeedsMore(fun d ->
NeedsMore(fun e -> Completed(a+b+c+d+e))))))
Now you can implement applyPartially by deconstructing the list of arguments and applying them one by one to the partial function until you get the result:
let rec applyPartially f items =
match f, items with
| Completed r, _ -> r
| NeedsMore f, head::tail -> applyPartially (f head) tail
| NeedsMore _, _ -> failwith "Insufficient number of arguments"
The following now returns 15 as expected:
applyPartially f [1;2;3;4;5]
Disclaimer: Please don't use this. This is just plain evil.
let apply f v =
let args = v |> Seq.toArray
f.GetType().GetMethods()
|> Array.tryFind (fun m -> m.Name = "Invoke" && Array.length (m.GetParameters()) = Array.length args)
|> function None -> failwith "Not enough args" | Some(m) -> m.Invoke(f, args)
Just like you would expect:
let f a b c d e= a+b+c+d+e
apply f [1; 2; 3; 4; 5] //15

F# merge sort error value restriction [duplicate]

let rec merge = function
| ([], ys) -> ys
| (xs, []) -> xs
| (x::xs, y::ys) -> if x < y then x :: merge (xs, y::ys)
else y :: merge (x::xs, ys)
let rec split = function
| [] -> ([], [])
| [a] -> ([a], [])
| a::b::cs -> let (M,N) = split cs
(a::M, b::N)
let rec mergesort = function
| [] -> []
| L -> let (M, N) = split L
merge (mergesort M, mergesort N)
mergesort [5;3;2;1] // Will throw an error.
I took this code from here StackOverflow Question but when I run the mergesort with a list I get an error:
stdin(192,1): error FS0030: Value restriction. The value 'it' has been inferred to have generic type
val it : '_a list when '_a : comparison
How would I fix this problem? What is the problem? The more information, the better (so I can learn :) )
Your mergesort function is missing a case causing the signature to be inferred by the compiler to be 'a list -> 'b list instead of 'a list -> 'a list which it should be. The reason it should be 'a list -> 'a list is that you're not looking to changing the type of the list in mergesort.
Try changing your mergesort function to this, that should fix the problem:
let rec mergesort = function
| [] -> []
| [a] -> [a]
| L -> let (M, N) = split L
merge (mergesort M, mergesort N)
Another problem with your code however is that neither merge nor split is tail recursive and you will therefore get stack overflow exceptions on large lists (try to call the corrected mergesort like this mergesort [for i in 1000000..-1..1 -> i]).
You can make your split and merge functions tail recursive by using the accumulator pattern
let split list =
let rec aux l acc1 acc2 =
match l with
| [] -> (acc1,acc2)
| [x] -> (x::acc1,acc2)
| x::y::tail ->
aux tail (x::acc1) (y::acc2)
aux list [] []
let merge l1 l2 =
let rec aux l1 l2 result =
match l1, l2 with
| [], [] -> result
| [], h :: t | h :: t, [] -> aux [] t (h :: result)
| h1 :: t1, h2 :: t2 ->
if h1 < h2 then aux t1 l2 (h1 :: result)
else aux l1 t2 (h2 :: result)
List.rev (aux l1 l2 [])
You can read more about the accumulator pattern here; the examples are in lisp but it's a general pattern that works in any language that provides tail call optimization.

Mergesort Getting an Error in F#

let rec merge = function
| ([], ys) -> ys
| (xs, []) -> xs
| (x::xs, y::ys) -> if x < y then x :: merge (xs, y::ys)
else y :: merge (x::xs, ys)
let rec split = function
| [] -> ([], [])
| [a] -> ([a], [])
| a::b::cs -> let (M,N) = split cs
(a::M, b::N)
let rec mergesort = function
| [] -> []
| L -> let (M, N) = split L
merge (mergesort M, mergesort N)
mergesort [5;3;2;1] // Will throw an error.
I took this code from here StackOverflow Question but when I run the mergesort with a list I get an error:
stdin(192,1): error FS0030: Value restriction. The value 'it' has been inferred to have generic type
val it : '_a list when '_a : comparison
How would I fix this problem? What is the problem? The more information, the better (so I can learn :) )
Your mergesort function is missing a case causing the signature to be inferred by the compiler to be 'a list -> 'b list instead of 'a list -> 'a list which it should be. The reason it should be 'a list -> 'a list is that you're not looking to changing the type of the list in mergesort.
Try changing your mergesort function to this, that should fix the problem:
let rec mergesort = function
| [] -> []
| [a] -> [a]
| L -> let (M, N) = split L
merge (mergesort M, mergesort N)
Another problem with your code however is that neither merge nor split is tail recursive and you will therefore get stack overflow exceptions on large lists (try to call the corrected mergesort like this mergesort [for i in 1000000..-1..1 -> i]).
You can make your split and merge functions tail recursive by using the accumulator pattern
let split list =
let rec aux l acc1 acc2 =
match l with
| [] -> (acc1,acc2)
| [x] -> (x::acc1,acc2)
| x::y::tail ->
aux tail (x::acc1) (y::acc2)
aux list [] []
let merge l1 l2 =
let rec aux l1 l2 result =
match l1, l2 with
| [], [] -> result
| [], h :: t | h :: t, [] -> aux [] t (h :: result)
| h1 :: t1, h2 :: t2 ->
if h1 < h2 then aux t1 l2 (h1 :: result)
else aux l1 t2 (h2 :: result)
List.rev (aux l1 l2 [])
You can read more about the accumulator pattern here; the examples are in lisp but it's a general pattern that works in any language that provides tail call optimization.

Is this a reasonable foundation for a parser combinator library?

I've been working with FParsec lately and I found that the lack of generic parsers is a major stopping point for me. My goal for this little library is simplicity as well as support for generic input. Can you think of any additions that would improve this or is anything particularly bad?
open LazyList
type State<'a, 'b> (input:LazyList<'a>, data:'b) =
member this.Input = input
member this.Data = data
type Result<'a, 'b, 'c> =
| Success of 'c * State<'a, 'b>
| Failure of string * State<'a, 'b>
type Parser<'a,'b, 'c> = State<'a, 'b> -> Result<'a, 'b, 'c>
let (>>=) left right state =
match left state with
| Success (result, state) -> (right result) state
| Failure (message, _) -> Result<'a, 'b, 'd>.Failure (message, state)
let (<|>) left right state =
match left state with
| Success (_, _) as result -> result
| Failure (_, _) -> right state
let (|>>) parser transform state =
match parser state with
| Success (result, state) -> Success (transform result, state)
| Failure (message, _) -> Failure (message, state)
let (<?>) parser errorMessage state =
match parser state with
| Success (_, _) as result -> result
| Failure (_, _) -> Failure (errorMessage, state)
type ParseMonad() =
member this.Bind (f, g) = f >>= g
member this.Return x s = Success(x, s)
member this.Zero () s = Failure("", s)
member this.Delay (f:unit -> Parser<_,_,_>) = f()
let parse = ParseMonad()
Backtracking
Surprisingly it didn't take too much code to implement what you describe. It is a bit sloppy but seems to work quite well.
let (>>=) left right state =
seq {
for res in left state do
match res with
| Success(v, s) ->
let v =
right v s
|> List.tryFind (
fun res ->
match res with
| Success (_, _) -> true
| _ -> false
)
match v with
| Some v -> yield v
| None -> ()
} |> Seq.toList
let (<|>) left right state =
left state # right state
Backtracking Part 2
Switched around the code to use lazy lists and tail-call optimized recursion.
let (>>=) left right state =
let rec readRight lst =
match lst with
| Cons (x, xs) ->
match x with
| Success (r, s) as q -> LazyList.ofList [q]
| Failure (m, s) -> readRight xs
| Nil -> LazyList.empty<Result<'a, 'b, 'd>>
let rec readLeft lst =
match lst with
| Cons (x, xs) ->
match x with
| Success (r, s) ->
match readRight (right r s) with
| Cons (x, xs) ->
match x with
| Success (r, s) as q -> LazyList.ofList [q]
| Failure (m, s) -> readRight xs
| Nil -> readLeft xs
| Failure (m, s) -> readLeft xs
| Nil -> LazyList.empty<Result<'a, 'b, 'd>>
readLeft (left state)
let (<|>) (left:Parser<'a, 'b, 'c>) (right:Parser<'a, 'b, 'c>) state =
LazyList.delayed (fun () -> left state)
|> LazyList.append
<| LazyList.delayed (fun () -> right state)
I think that one important design decision that you'll need to make is whether you want to support backtracking in your parsers or not (I don't remember much about parsing theory, but this probably specifies the types of languages that your parser can handle).
Backtracking. In your implementation, a parser can either fail (the Failure case) or produce exactly one result (the Success case). An alternative option is to generate zero or more results (for example, represent results as seq<'c>). Sorry if this is something you already considered :-), but anyway...
The difference is that your parser always follows the first possible option. For example, if you write something like the following:
let! s1 = (str "ab" <|> str "a")
let! s2 = str "bcd"
Using your implementation, this will fail for input "abcd". It will choose the first branch of the <|> operator, which will then process first two characters and the next parser in the sequence will fail. An implementation based on sequences would be able to backtrack and follow the second path in <|> and parse the input.
Combine. Another idea that occurs to me is that you could also add Combine member to your parser computation builder. This is a bit subtle (because you need to understand how computation expressions are translated), but it can be sometimes useful. If you add:
member x.Combine(a, b) = a <|> b
member x.ReturnFrom(p) = p
You can then write recursive parsers nicely:
let rec many p acc =
parser { let! r = p // Parse 'p' at least once
return! many p (r::acc) // Try parsing 'p' multiple times
return r::acc |> List.rev } // If fails, return the result

Resources