F#: Develop a function counting the number of branches containing only "leaves"

I have the expression (cos(9**5)-cos(8*5))*(sin(3+1)**exp(6*6)).
I represent this expression with the following type:
type common =
    | Exp of common*common
    | Sin of common
    | Cos of common
    | Bin of common*string*common
    | Digit of float
    | Exponent of common
let expr = Bin(Bin(Cos(Exp(Digit(9.0),Digit(5.0))),"-",Cos(Bin(Digit(8.0),"*",Digit(5.0)))),"*",Exp(Sin(Bin(Digit(3.0),"+",Digit(1.0))),Exponent(Bin(Digit(6.0),"*",Digit(6.0)))));
I have a function that evaluates the expression:
let rec evalf com =
    match com with
    | Digit(x) -> x
    | Exp(d1,d2) ->
        let dig1 = evalf(d1)
        let dig2 = evalf(d2)
        System.Math.Pow(dig1,dig2)
    | Sin(d) ->
        let dig = evalf(d)
        System.Math.Sin(dig)
    | Cos(d) ->
        let dig = evalf(d)
        System.Math.Cos(dig)
    | Exponent(d) ->
        let dig = evalf(d)
        System.Math.Exp(dig)
    | Bin(d1,op,d2) ->
        let dig1 = evalf(d1)
        let dig2 = evalf(d2)
        match op with
        | "*" -> dig1*dig2
        | "+" -> dig1+dig2
        | "-" -> dig1-dig2
I need to develop a function that counts the number of branches containing only "leaves". Please help.

If you define "leaves" as digits, then to count the number of branches containing only "leaves" you need to count the number of expressions that only reference digits.
This can be achieved with a recursive function similar to evalf that returns 1 for branches with only "leaves"/digits and recurses for the non-digit cases, e.g.:
let rec count expr =
    match expr with
    | Exp(Digit(_),Digit(_)) -> 1
    | Exp(d1,d2) -> count d1 + count d2
    | Sin(Digit(_)) -> 1
    | Sin(d) -> count d
    // ... for all cases
A similar technique can be used to simplify an expression tree, for example a binary operation (Bin) on 2 numbers could be matched and simplified to a single number. This might be used for example as a compiler optimization step.
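The counting approach described above can be written out for every case. This is a minimal sketch, not the original answer's full code: it assumes that a "leaf" is a Digit node, that a branch counts when all of its direct children are digits, and that a bare Digit on its own counts as 0.

```fsharp
type common =
    | Exp of common * common
    | Sin of common
    | Cos of common
    | Bin of common * string * common
    | Digit of float
    | Exponent of common

// Counts the branches whose direct children are all Digit nodes.
let rec count expr =
    match expr with
    // Assumption: a lone digit is a leaf, not a branch.
    | Digit _ -> 0
    // Branches containing only leaves.
    | Exp (Digit _, Digit _) | Bin (Digit _, _, Digit _) -> 1
    | Sin (Digit _) | Cos (Digit _) | Exponent (Digit _) -> 1
    // Everything else: recurse into the children.
    | Exp (d1, d2) | Bin (d1, _, d2) -> count d1 + count d2
    | Sin d | Cos d | Exponent d -> count d

let expr =
    Bin(Bin(Cos(Exp(Digit 9.0, Digit 5.0)), "-", Cos(Bin(Digit 8.0, "*", Digit 5.0))),
        "*",
        Exp(Sin(Bin(Digit 3.0, "+", Digit 1.0)), Exponent(Bin(Digit 6.0, "*", Digit 6.0))))

let leafBranches = count expr
```

For the expression from the question, leafBranches is 4: the branches 9**5, 8*5, 3+1 and 6*6.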

Related

Continuous active pattern match with list for F#

I have this pattern for a compiler, and in the expr parser I want to do this:
let (|InfixOperator|_|) tokens = ...
let (|UnaryValue|_|) tokens = ...
let infixExpr tokens =
    match tokens with
    | Value v::Operator op::tokens -> ...
    | Value v::tokens -> ...
The problem is that either the tokens in the match expression end up with the signature Token list list, or the Value pattern (which needs to receive a list, since it parses unary operations) does not give back the remaining tokens. The same happens with the Operator pattern.
The ugly way around it would be something like this:
let (|InfixOperator|_|) tokens = ...
let (|UnaryValue|_|) tokens = ...
let infixExpr tokens =
    match tokens with
    | Value (v, Operator (op, tokens)) -> ...
    | Value (v, tokens) -> ...
Does anyone know any cleaner way of doing this with pattern matching?
I wrote a parser for a subset of BASIC that uses pattern matching (source code is available on GitHub). This is not heavily using active patterns, but it works in two steps - first it tokenizes the input (turning list<char> into list<Token>) and then it parses the list of tokens (turning list<Token> into list<Statement>).
This way, you mostly avoid the issue with nesting, because most interesting things become just a single token. For example, for operators, the tokenization looks like:
let rec tokenize toks = function
    (* First character is number, collect all remaining number characters *)
    | c::cs when isNumber c -> number toks [c] cs
    (* First character is operator, collect all remaining operator characters *)
    | c::cs when isOp c -> operator toks [c] cs
    (* more cases omitted *)
    | [] -> List.rev toks
and number toks acc = function
    | c::cs when isNumber c -> number toks (c::acc) cs
    | input -> tokenize (Number(float (str acc))::toks) input
and operator toks acc = function
    | c::cs when isOp c -> operator toks (c::acc) cs
    | input -> tokenize (Operator(List.rev acc)::toks) input
If you now have a list of tokens, then parsing a binary expression becomes:
let rec parseBinary left = function
    | (Operator o)::toks ->
        let right, toks = parseExpr toks
        Binary(o, left, right), toks
    | toks -> left, toks
and parseExpr = function
    | (Number n)::toks -> parseBinary (Const(NumberValue n)) toks
    (* more cases omitted *)
There are some places where I used active patterns, for example to parse a BASIC range expression, which can be N1-N2 or -N1 or N1-:
let (|Range|_|) = function
    | (Number lo)::(Operator ['-'])::(Number hi)::[] ->
        Some(Some (int lo), Some (int hi))
    | (Operator ['-'])::(Number hi)::[] -> Some(None, Some(int hi))
    | (Number lo)::(Operator ['-'])::[] -> Some(Some(int lo), None)
    | [] -> Some(None, None)
    | _ -> None
So, I guess the answer is that if you want to write a simple parser using pattern matching (without getting into more sophisticated and more powerful monadic parser libraries), then it is quite doable, but it is a good idea to separate tokenization from parsing.
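To make the two-step idea concrete, here is a self-contained sketch that fills in the pieces omitted above. The helpers isNumber, isOp and str, the simplified Token and Expr types, and the rule that unknown characters are skipped are all assumptions made for illustration; they are not taken from the BASIC parser's source.

```fsharp
type Token =
    | Number of float
    | Operator of char list

type Expr =
    | Const of float
    | Binary of char list * Expr * Expr

// Hypothetical helpers: digits only, and a fixed operator set.
let isNumber (c: char) = System.Char.IsDigit c
let isOp (c: char) = List.contains c ['+'; '-'; '*'; '/']
// acc is built in reverse, so reverse it before joining.
let str (acc: char list) = acc |> List.rev |> List.map string |> String.concat ""

// Step 1: char list -> Token list
let rec tokenize toks = function
    | c :: cs when isNumber c -> number toks [c] cs
    | c :: cs when isOp c -> operator toks [c] cs
    | _ :: cs -> tokenize toks cs   // assumption: skip anything else
    | [] -> List.rev toks
and number toks acc = function
    | c :: cs when isNumber c -> number toks (c :: acc) cs
    | input -> tokenize (Number (float (str acc)) :: toks) input
and operator toks acc = function
    | c :: cs when isOp c -> operator toks (c :: acc) cs
    | input -> tokenize (Operator (List.rev acc) :: toks) input

// Step 2: Token list -> Expr plus the remaining tokens
let rec parseBinary left = function
    | Operator o :: toks ->
        let right, rest = parseExpr toks
        Binary (o, left, right), rest
    | toks -> left, toks
and parseExpr = function
    | Number n :: toks -> parseBinary (Const n) toks
    | toks -> failwithf "unexpected tokens: %A" toks

let tokens = tokenize [] (List.ofSeq "12 + 3")
let ast, remaining = parseExpr tokens
```

Each parsing function returns the unconsumed tokens alongside its result, which is what lets ordinary list patterns replace nested active patterns.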

Can one set default values for Discriminated Union types?

I implemented a Discriminated Union type that would be used to select a function:
type BooleanCombinator =
    | All
    | Some
    | None
    | AtLeast of int
    | MoreThan of int
    | NotMoreThan of int
    | LessThan of int
    | ExactlyOne
    | ExactlyTwo
    | AllButOne
    | AllButTwo
let boolToInt (b: bool) : int = if b then 1 else 0
let combineBooleans (combinator : BooleanCombinator)
                    (bools : bool list)
                    : bool =
    let n = List.sumBy boolToInt bools
    match combinator with
    | BooleanCombinator.All -> List.forall id bools
    | BooleanCombinator.Some -> bools |> List.exists id
    | BooleanCombinator.None -> bools |> List.exists id |> not
    | BooleanCombinator.AtLeast i -> n >= i
    | BooleanCombinator.MoreThan i -> n > i
    | BooleanCombinator.NotMoreThan i -> n <= i
    | BooleanCombinator.LessThan i -> n < i
    | BooleanCombinator.ExactlyOne -> n = 1
    | BooleanCombinator.ExactlyTwo -> n = 2
    | BooleanCombinator.AllButOne -> n = bools.Length - 1
    | BooleanCombinator.AllButTwo -> n = bools.Length - 2
This looked OK to me, but the compiler started to treat all instances of Some and None as belonging to this DU, instead of the Option DU.
I do not want to go through all of my code replacing Some with Option.Some and None with Option.None.
Is there a way to tell the compiler that unqualified Some and None are actually Option.Some and Option.None?
Or should I just give different names to these DU cases, like AtLeastOne and ExactlyZero?
The general rule for resolving name collisions in F# is "last declaration wins". Because your custom DU is declared after Option, its constructors Some and None win over those of Option.
But this rule offers a way to fix the problem: you just need to "reassert" the declarations after your custom DU:
type Bogus = Some of int | None
let g = function Some _ -> 42 | None -> 5
let x = Some 42
let inline Some a = Option.Some a
let inline None<'a> = Option.None : 'a option
let (|Some|None|) = function | Option.Some a -> Some a | Option.None -> None
let f = function Some _ -> 42 | None -> 5
let y = Some 42
If you inspect the types of g, x, f, and y in the above code:
> g
g : Bogus -> int
> f
f : 'a option -> int
> x
Bogus
> y
int option
The function g and value x were inferred to have type Bogus -> int and Bogus respectively, because Some and None in their bodies refer to Bogus.Some and Bogus.None.
The function f and value y were inferred to have Option-related types, because Some and None in their bodies refer to the Some function and the (|Some|None|) active pattern that I defined just above.
Of course, this is a rather hacky way to restore the status quo. This will convince the compiler, but humans will still have a hard time reading your code. I suggest you rename the cases of your DU instead.
You can mark your DU with the [<RequireQualifiedAccess>] attribute.
This means that you will be required to qualify the case name with the type whenever you use it in the code - which is something you do now anyway in your match expression.
That way an unqualified Some would still be resolved to mean Option.Some, despite the fact that you reuse the name.
It's a useful technique to know when you want to use a snappy name for a DU case - like None, Yes, Failure etc. - that by itself would be ambiguous or confusing to the reader (or the compiler, for that matter).
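A small sketch of how this looks in practice. The DU is cut down to four cases, and the combine and describe functions are hypothetical names added only for illustration.

```fsharp
[<RequireQualifiedAccess>]
type BooleanCombinator =
    | All
    | Some
    | None
    | AtLeast of int

// Cases of the attributed DU must always be written with the type name.
let combine (c: BooleanCombinator) (bools: bool list) : bool =
    let n = bools |> List.filter id |> List.length
    match c with
    | BooleanCombinator.All -> List.forall id bools
    | BooleanCombinator.Some -> List.exists id bools
    | BooleanCombinator.None -> not (List.exists id bools)
    | BooleanCombinator.AtLeast i -> n >= i

// Unqualified Some and None still resolve to the Option cases.
let describe (c: BooleanCombinator option) : string =
    match c with
    | Some BooleanCombinator.None -> "none may be true"
    | Some _ -> "some other combinator"
    | None -> "no combinator given"
```

Because the attribute keeps the DU's case names out of the unqualified scope, existing code that uses Option values does not need to change.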

How to construct a match expression

I am allowing a command-line parameter like this --10GB, where -- and GB are constant, but a number like 1, 10, or 100 could be substituted in between the constant values, like --5GB.
I could easily parse the start and end of the string with substr or write a command-line parser, but I wanted to use match instead. I am just not sure how to structure the match expression.
let GB1 = cvt_bytes_to_gb(int64(DiskFreeLevels.GB1))
let arg = argv.[0]
let match_head = "--"
let match_tail = "GB"
let parse_min_gb_arg arg =
    match arg with
    | match_head & match_tail -> cvt_gb_arg_to_int arg
    | _ -> volLib.GB1
I get a warning on the _ case saying "This rule will never be matched". How should this AND expression be constructed?
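You can't match on strings, except matching on the whole value, e.g. match s with | "1" -> 1 | "2" -> 2 ... In your code, match_head and match_tail inside the pattern are not compared with the values defined above; they are fresh variable patterns that match anything, which is why the _ rule can never be reached.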
Parsing beginning and end would be the most efficient way to do this, there is no need to get clever (this, by the way, is a universally true statement).
But if you really want to use pattern matching, it is definitely possible to do, but you'll have to make yourself some custom matchers (also known as "active patterns").
First, make a custom matcher that would parse out the "middle" part of the string surrounded by prefix and suffix:
let (|StrBetween|_|) (starts: string) (ends: string) (str: string) =
    if str.StartsWith starts && str.EndsWith ends then
        Some (str.Substring(starts.Length, str.Length - ends.Length - starts.Length))
    else
        None
Usage:
let x =
    match "abcd" with
    | StrBetween "a" "d" s -> s
    | _ -> "nope"
// x = "bc"
Then make a custom matcher that would parse out an integer:
let (|Int|_|) (s: string) =
    match System.Int32.TryParse s with
    | true, i -> Some i
    | _ -> None
Usage:
let x =
    match "15" with
    | Int i -> i
    | _ -> 0
// x = 15
Now, combine the two:
let x =
    match "--10GB" with
    | StrBetween "--" "GB" (Int i) -> i
    | _ -> volLib.GB1
// x = 10
This ability of patterns to combine and nest is their primary power: you get to build a complicated pattern out of small, easily understandable pieces, and have the compiler match it to the input. That's basically why it's called "pattern matching". :-)
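Putting the pieces together, a self-contained version might look like the sketch below. The names defaultGb and parseMinGbArg are hypothetical (defaultGb stands in for volLib.GB1, whose definition is not shown), and the extra length check is an addition that guards against inputs shorter than the prefix and suffix combined.

```fsharp
// Matches strings of the form <starts>middle<ends> and yields the middle.
let (|StrBetween|_|) (starts: string) (ends: string) (str: string) =
    if str.Length >= starts.Length + ends.Length
       && str.StartsWith(starts)
       && str.EndsWith(ends) then
        Some (str.Substring(starts.Length, str.Length - ends.Length - starts.Length))
    else
        None

// Matches strings that parse as a 32-bit integer.
let (|Int|_|) (s: string) =
    match System.Int32.TryParse s with
    | true, i -> Some i
    | _ -> None

// Assumption: stand-in for volLib.GB1 from the question.
let defaultGb = 1

let parseMinGbArg (arg: string) : int =
    match arg with
    | StrBetween "--" "GB" (Int i) -> i
    | _ -> defaultGb

let fromValid = parseMinGbArg "--10GB"
let fromInvalid = parseMinGbArg "--tenGB"
```

Here fromValid is 10, while fromInvalid falls through to defaultGb because the middle part is not an integer.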
The best I can come up with is using a partial active pattern:
let (|GbFormat|_|) (x:string) =
    let prefix = "--"
    let suffix = "GB"
    if x.StartsWith(prefix) && x.EndsWith(suffix) then
        let len = x.Length - prefix.Length - suffix.Length
        Some(x.Substring(prefix.Length, len))
    else
        None
let parse_min_gb_arg arg =
    match arg with
    | GbFormat gb -> gb
    | _ -> volLib.GB1
parse_min_gb_arg "--42GB"

Pattern matching with guards vs if/else construct in F#

In ML-family languages, people tend to prefer pattern matching to the if/else construct. In F#, using guards within pattern matching could easily replace if/else in many cases.
For example, a simple delete1 function could be rewritten without using if/else (see delete2):
let rec delete1 (a, xs) =
    match xs with
    | [] -> []
    | x::xs' -> if x = a then xs' else x::delete1(a, xs')
let rec delete2 (a, xs) =
    match xs with
    | [] -> []
    | x::xs' when x = a -> xs'
    | x::xs' -> x::delete2(a, xs')
Another example is solving quadratic equations:
type Solution =
    | NoRoot
    | OneRoot of float
    | TwoRoots of float * float
let solve1 (a,b,c) =
    let delta = b*b-4.0*a*c
    if delta < 0.0 || a = 0.0 then NoRoot
    elif delta = 0.0 then OneRoot (-b/(2.0*a))
    else
        TwoRoots ((-b + sqrt(delta))/(2.0*a), (-b - sqrt(delta))/(2.0*a))
let solve2 (a,b,c) =
    match a, b*b-4.0*a*c with
    | 0.0, _ -> NoRoot
    | _, delta when delta < 0.0 -> NoRoot
    | _, 0.0 -> OneRoot (-b/(2.0*a))
    | _, delta -> TwoRoots((-b + sqrt(delta))/(2.0*a),(-b - sqrt(delta))/(2.0*a))
Should we use pattern matching with guards to avoid the ugly if/else construct?
Is there any performance penalty for using pattern matching with guards? My impression is that it might be slow because the patterns have to be checked at runtime.
The right answer is probably "it depends", but I surmise that, in most cases, the compiled representation is the same. As an example,
let f b =
    match b with
    | true -> 1
    | false -> 0
and
let f b =
    if b then 1
    else 0
both translate to
public static int f(bool b)
{
    if (!b)
    {
        return 0;
    }
    return 1;
}
Given that, it's mostly a matter of style. Personally I prefer pattern matching because the cases are always aligned, making it more readable. Also, they're (arguably) easier to expand later to handle more cases. I consider pattern matching an evolution of if/then/else.
There is also no additional run-time cost for pattern matching, with or without guards.
Both have their own place. People are more used to the if/else construct for checking a value, whereas pattern matching is like if/else on steroids. Pattern matching lets you compare against the decomposed structure of the data, and use guards to specify additional conditions on the parts of the decomposed data or on some other value (especially in the case of recursive data structures or discriminated unions in F#).
I personally prefer to use if/else for simple value comparisons (true/false, ints, etc.), but when you have a recursive data structure, or something you need to compare against its decomposed value, then there is nothing better than pattern matching.
First make it work, and make it elegant and simple; then, if you see a performance problem, check for performance issues (which will mostly be due to some other logic and not to pattern matching).
Agree with @Daniel that pattern matching is usually more flexible.
Check this implementation:
type Solution = | Identity | Roots of float list
let quadraticEquation x =
    let rec removeZeros list =
        match list with
        | 0.0::rest -> removeZeros rest
        | _ -> list
    let x = removeZeros x
    match x with
    | [] -> Identity // zero constant
    | [_] -> Roots [] // non-zero constant
    | [a;b] -> Roots [ -b/a ] // linear equation
    | [a;b;c] ->
        let delta = b*b - 4.0*a*c
        match delta with
        | delta when delta < 0.0 ->
            Roots [] // no real roots
        | _ ->
            let d = sqrt delta
            let x1 = (-b-d) / (2.0*a)
            let x2 = (-b+d) / (2.0*a)
            Roots [x1; x2]
    | _ -> failwithf "equation is bigger than quadratic: %A" x
Also notice that https://fsharpforfunandprofit.com/learning-fsharp/ discourages using if-else. It is considered a bit less functional.
I did some testing on a self-written prime number generator, and as far as I can tell, "if then else" was significantly slower than pattern matching. I can't explain why, but as far as I have tested, the imperative part of F# has a slower run time than the recursive functional style when it comes to optimal algorithms.

Parsing grammars using OCaml

I have a task to write a (toy) parser for a (toy) grammar using OCaml, and I am not sure how to start (and proceed with) this problem.
Here's a sample Awk grammar:
type ('nonterm, 'term) symbol = N of 'nonterm | T of 'term;;
type awksub_nonterminals = Expr | Term | Lvalue | Incrop | Binop | Num;;
let awksub_grammar =
  (Expr,
   function
     | Expr ->
         [[N Term; N Binop; N Expr];
          [N Term]]
     | Term ->
         [[N Num];
          [N Lvalue];
          [N Incrop; N Lvalue];
          [N Lvalue; N Incrop];
          [T"("; N Expr; T")"]]
     | Lvalue ->
         [[T"$"; N Expr]]
     | Incrop ->
         [[T"++"];
          [T"--"]]
     | Binop ->
         [[T"+"];
          [T"-"]]
     | Num ->
         [[T"0"]; [T"1"]; [T"2"]; [T"3"]; [T"4"];
          [T"5"]; [T"6"]; [T"7"]; [T"8"]; [T"9"]]);;
And here are some fragments to parse:
let frag1 = ["4"; "+"; "3"];;
let frag2 = ["9"; "+"; "$"; "1"; "+"];;
What I'm looking for is a rulelist that is the result of parsing a fragment, such as this one for frag1 ["4"; "+"; "3"]:
[(Expr, [N Term; N Binop; N Expr]);
(Term, [N Num]);
(Num, [T "3"]);
(Binop, [T "+"]);
(Expr, [N Term]);
(Term, [N Num]);
(Num, [T "4"])]
The restriction is to not use any OCaml libraries other than List... :/
Here is a rough sketch: straightforwardly descend into the grammar and try each branch in order. Possible optimization: tail recursion for a single non-terminal in a branch.
exception Backtrack
let parse l =
  let rules = snd awksub_grammar in
  let rec descend gram l =
    let rec loop = function
      | [] -> raise Backtrack
      | x::xs -> try attempt x l with Backtrack -> loop xs
    in
    loop (rules gram)
  and attempt branch (path,tokens) =
    match branch, tokens with
    | T x :: branch' , h::tokens' when h = x ->
      attempt branch' ((T x :: path),tokens')
    | N n :: branch' , _ ->
      let (path',tokens) = descend n ((N n :: path),tokens) in
      attempt branch' (path', tokens)
    | [], _ -> path,tokens
    | _, _ -> raise Backtrack
  in
  let (path,tail) = descend (fst awksub_grammar) ([],l) in
  tail, List.rev path
OK, so the first thing you should do is write a lexical analyser. That's the
function that takes the ‘raw’ input, like ["3"; "-"; "("; "4"; "+"; "2"; ")"],
and splits it into a list of tokens (that is, representations of terminal symbols).
You can define a token to be
type token =
  | TokInt of int (* an integer *)
  | TokBinOp of binop (* a binary operator *)
  | TokOParen (* an opening parenthesis *)
  | TokCParen (* a closing parenthesis *)
and binop = Plus | Minus
The type of the lexer function would be string list -> token list and the output of
lexer ["3"; "-"; "("; "4"; "+"; "2"; ")"]
would be something like
[ TokInt 3; TokBinOp Minus; TokOParen; TokInt 4;
  TokBinOp Plus; TokInt 2; TokCParen ]
This will make the job of writing the parser easier, because you won't have to
worry about recognising what is an integer, what is an operator, etc.
This is a first, not too difficult step because the tokens are already separated.
All the lexer has to do is identify them.
When this is done, you can write a more realistic lexical analyser, of type string -> token list, that takes an actual raw input, such as "3-(4+2)", and turns it into a token list.
I'm not sure if you specifically require the derivation tree, or if this is just a first step in parsing. I'm assuming the latter.
You could start by defining the structure of the resulting abstract syntax tree by defining types. It could be something like this:
type expr =
  | Operation of term * binop * term
  | Term of term
and term =
  | Num of num
  | Lvalue of expr
  | Incrop of incrop * expr
and incrop = Incr | Decr
and binop = Plus | Minus
and num = int
Then I'd implement a recursive descent parser. Of course it would be much nicer if you could use streams combined with the preprocessor camlp4of...
By the way, there's a small example about arithmetic expressions in the OCaml documentation here.

Resources