Splitting out blocks of code in F# pattern matching for readability - f#

// Standard pattern matching.
let Foo x =
match x with
| 1 ->
// ... lots of code, only evaluated if x == 1
| 2 ->
// ... lots of code, only evaluated if x == 2
// Standard pattern matching separated out, causing exception.
let Bar x =
let valueOne = //... lots of code, evaluated always. Exception if value <> 1.
let valueTwo = //... lots of code, evaluated always. Exception if value <> 2.
match x with
| 1 -> valueOne
| 2 -> valueTwo
In a pattern matching using "match", the code for each pattern can be large, see Foo above, making me want to split out the blocks as separate calls for improved readability.
The issue with this can be that the call will be evaluated even though the pattern is not matched, as in Bar above.
Option 1: lazy eval.
Option 2: forward the argument/parameters.
Option 3: forward the argument/parameters and use Active pattern.
What is the preferred method for improving the readability where code under each pattern can be large. Or are there any other obvious solution to the problem?
// ===== OPTION 1 =====
// Pattern matching separated out, lazy eval.
let Foo x =
let valueOne = lazy //... lots of code, evaluated on Force().
let valueTwo = lazy //... lots of code, evaluated on Force().
match x with
| 1 -> valueOne.Force()
| 2 -> valueTwo.Force()
// ===== OPTION 2 =====
// Pattern matching separated out, with arguments.
let Foo x =
let valueOne a = //... lots of code.
let valueTwo a = //... lots of code.
match x with
| 1 -> valueOne x
| 2 -> valueTwo x
// ===== OPTION 3 =====
// Active Pattern matching separated out, with arguments.
let Foo x =
let (|ValueOne|_|) inp =
if inp = 1 then Some(...) else None
let (|ValueTwo|_|) inp =
if inp = 2 then Some(...) else None
match x with
| ValueOne a -> a
| ValueTwo b -> b

I would probably just extract the two bodies of the pattern matching into functions that take unit:
let caseOne () =
// Lots of code when x=1
let caseTwo () =
// Lots of code when x=2
let Foo x =
match x with
| 1 -> caseOne()
| 2 -> caseTwo()
This is similar to your solution using lazy, but as we are never re-using the result of the lazy value, there is really no reason for using lazy values - a function is simpler and it also delays the evaluation of the body.
If you then find that there is some commonality between caseOne and caseTwo, you can again extract this into another function that both of them can call.

Generally speaking, I try to have the semantics of my code match the logic of what I'm trying to accomplish. In your case, I would describe your problem as:
There are two different pieces of simple data I may receive. Based on which
one I receive, run a specific piece of code.
This maps exactly to option 2. I may or may not nest the functions like you have depending on the context. The other two options create a mismatch between your goal and your code.
Option 1:
I would describe the logic of this one as:
I have information (or context) now from which I can build two different calculations, one of which may need to be run later. Construct both now, and then evaluate the one which is needed later.
This really doesn't match the logic of what you want since there is no change in context or available data. You're still in the same context either way, so the additional complexity of making it lazy simply obfuscates the logic of the function.
Option 3:
I would describe the logic of this one as:
I have some data which may fit into one of two cases. Determining which case holds requires some complex logic, and that logic may also have some overlap with the logic needed to determine the needed return value of the overall function. Move the logic to determine the case into a separate function (an active pattern in this case), and then use the return value of the active pattern to determine the resulting value of the function.
This one conflates the control flow (determining what code should I execute next) with the logic executed in each case. Its much more clear to separate control flow from the following logic if there is no overlap between the two.
So match the structure of your code to the structure of your problem, or you'll end up with people wondering what the additional code complexity accomplishes.

Related

FsCheck: Override generator for a type, but only in the context of a single parent generator

I seem to often run into cases where I want to generate some complex structure, but a special variation with a member type generated differently.
For example, consider this tree
type Tree<'LeafData,'INodeData> =
| LeafNode of 'LeafData
| InternalNode of 'INodeData * Tree<'LeafData,'INodeData> list
I want to generate cases like
No internal node is childless
There are no leaf-type nodes
Only a limited subset of leaf types are used
These are simple to do if I override all generation of a corresponding child type.
The problem is that it seems register is inherently a thread-level action, and there is no gen-local alternative.
For example, what I want could look like
let limitedLeafs =
gen {
let leafGen = Arb.generate<LeafType> |> Gen.filter isAllowedLeaf
do! registerContextualArb (leafGen |> Arb.fromGen)
return! Arb.generate<Tree<NodeType, LeafType>>
}
This Tree example specifically can work around with some creative type shuffling, but that's not always possible.
It's also possible to use some sort of recursive map that enforces assumptions, but that seems relatively complex if the above is possible. I might be misunderstanding the nature of FsCheck generators though.
Does anyone know how to accomplish this kind of gen-local override?
There's a few options here - I'm assuming you're on FsCheck 2.x but keep scrolling for an option in FsCheck 3.
The first is the most natural one but is more work, which is to break down the generator explicitly to the level you need, and then put humpty dumpty together again. I.e don't rely on the type-based generator derivation so much - if I understand your example correctly that would mean implementing a recursive generator - relying on Arb.generate<LeafType> for the generic types.
Second option - Config has an Arbitrary field which you can use to override Arbitrary instances. These overrides will take effect even if the overridden types are part of the automatically generated ones. So as a sketch you could try:
Check.One ({Config.Quick with Arbitrary = [| typeof<MyLeafArbitrary>) |]) (fun safeTree -> ...)
More extensive example which uses FsCheck.Xunit's PropertyAttribute but the principle is the same, set on the Config instead.
Final option! :) In FsCheck 3 (prerelease) you can configure this via a new (as of yet undocumented) concept ArbMap which makes the map from type to Arbitrary instance explicit, instead of this static global nonsense in 2.x (my bad of course. seemed like a good idea at the time.) The implementation is here which may not tell you all that much - the idea is that you put an ArbMap instance together which contains your "safe" generators for the subparts, then you ArbMap.mergeWith that safe map with ArbMap.defaults (thus overriding the default generators with your safe ones, in the resulting ArbMap) and then you use ArbMap.arbitrary or ArbMap.generate with the resulting map.
Sorry for the long winded explanation - but all in all that should give you the best of both worlds - you can reuse the generic union type generator in FsCheck, while surgically overriding certain types in that context.
FsCheck guidance on this is:
To define a generator that generates a subset of the normal range of values for an existing type, say all the even ints, it makes properties more readable if you define a single-case union case, and register a generator for the new type:
As an example, they suggest you could define arbitrary even integers like this:
type EvenInt = EvenInt of int with
static member op_Explicit(EvenInt i) = i
type ArbitraryModifiers =
static member EvenInt() =
Arb.from<int>
|> Arb.filter (fun i -> i % 2 = 0)
|> Arb.convert EvenInt int
Arb.register<ArbitraryModifiers>() |> ignore
You could then generate and test trees whose leaves are even integers like this:
let ``leaves are even`` (tree : Tree<EvenInt, string>) =
let rec leaves = function
| LeafNode leaf -> [leaf]
| InternalNode (_, children) ->
children |> List.collect leaves
leaves tree
|> Seq.forall (fun (EvenInt leaf) ->
leaf % 2 = 0)
Check.Quick ``leaves are even`` // output: Ok, passed 100 tests.
To be honest, I like your idea of a "gen-local override" better, but I don't think FsCheck supports it.

Creating an 'add' computation expression

I'd like the example computation expression and values below to return 6. For some the numbers aren't yielding like I'd expect. What's the step I'm missing to get my result? Thanks!
type AddBuilder() =
let mutable x = 0
member _.Yield i = x <- x + i
member _.Zero() = 0
member _.Return() = x
let add = AddBuilder()
(* Compiler tells me that each of the numbers in add don't do anything
and suggests putting '|> ignore' in front of each *)
let result = add { 1; 2; 3 }
(* Currently the result is 0 *)
printfn "%i should be 6" result
Note: This is just for creating my own computation expression to expand my learning. Seq.sum would be a better approach. I'm open to the idea that this example completely misses the value of computation expressions and is no good for learning.
There is a lot wrong here.
First, let's start with mere mechanics.
In order for the Yield method to be called, the code inside the curly braces must use the yield keyword:
let result = add { yield 1; yield 2; yield 3 }
But now the compiler will complain that you also need a Combine method. See, the semantics of yield is that each of them produces a finished computation, a resulting value. And therefore, if you want to have more than one, you need some way to "glue" them together. This is what the Combine method does.
Since your computation builder doesn't actually produce any results, but instead mutates its internal variable, the ultimate result of the computation should be the value of that internal variable. So that's what Combine needs to return:
member _.Combine(a, b) = x
But now the compiler complains again: you need a Delay method. Delay is not strictly necessary, but it's required in order to mitigate performance pitfalls. When the computation consists of many "parts" (like in the case of multiple yields), it's often the case that some of them should be discarded. In these situation, it would be inefficient to evaluate all of them and then discard some. So the compiler inserts a call to Delay: it receives a function, which, when called, would evaluate a "part" of the computation, and Delay has the opportunity to put this function in some sort of deferred container, so that later Combine can decide which of those containers to discard and which to evaluate.
In your case, however, since the result of the computation doesn't matter (remember: you're not returning any results, you're just mutating the internal variable), Delay can just execute the function it receives to have it produce the side effects (which are - mutating the variable):
member _.Delay(f) = f ()
And now the computation finally compiles, and behold: its result is 6. This result comes from whatever Combine is returning. Try modifying it like this:
member _.Combine(a, b) = "foo"
Now suddenly the result of your computation becomes "foo".
And now, let's move on to semantics.
The above modifications will let your program compile and even produce expected result. However, I think you misunderstood the whole idea of the computation expressions in the first place.
The builder isn't supposed to have any internal state. Instead, its methods are supposed to manipulate complex values of some sort, some methods creating new values, some modifying existing ones. For example, the seq builder1 manipulates sequences. That's the type of values it handles. Different methods create new sequences (Yield) or transform them in some way (e.g. Combine), and the ultimate result is also a sequence.
In your case, it looks like the values that your builder needs to manipulate are numbers. And the ultimate result would also be a number.
So let's look at the methods' semantics.
The Yield method is supposed to create one of those values that you're manipulating. Since your values are numbers, that's what Yield should return:
member _.Yield x = x
The Combine method, as explained above, is supposed to combine two of such values that got created by different parts of the expression. In your case, since you want the ultimate result to be a sum, that's what Combine should do:
member _.Combine(a, b) = a + b
Finally, the Delay method should just execute the provided function. In your case, since your values are numbers, it doesn't make sense to discard any of them:
member _.Delay(f) = f()
And that's it! With these three methods, you can add numbers:
type AddBuilder() =
member _.Yield x = x
member _.Combine(a, b) = a + b
member _.Delay(f) = f ()
let add = AddBuilder()
let result = add { yield 1; yield 2; yield 3 }
I think numbers are not a very good example for learning about computation expressions, because numbers lack the inner structure that computation expressions are supposed to handle. Try instead creating a maybe builder to manipulate Option<'a> values.
Added bonus - there are already implementations you can find online and use for reference.
1 seq is not actually a computation expression. It predates computation expressions and is treated in a special way by the compiler. But good enough for examples and comparisons.

Design alternatives to extending object with interface

While working through Expert F# again, I decided to implement the application for manipulating algebraic expressions. This went well and now I've decided as a next exercise to expand on that by building a more advanced application.
My first idea was to have a setup that allows for a more extendible way of creating functions without having to recompile. To that end I have something like:
type IFunction =
member x.Name : string with get
/// additional members omitted
type Expr =
| Num of decimal
| Var of string
///... omitting some types here that don't matter
| FunctionApplication of IFunction * Expr list
So that say a Sin(x) could be represented a:
let sin = { new IFunction() with member x.Name = "SIN" }
let sinExpr = FunctionApplication(sin,Var("x"))
So far all good, but the next idea that I would like to implement is having additional interfaces to represent function of properties. E.g.
type IDifferentiable =
member Derivative : int -> IFunction // Get the derivative w.r.t a variable index
One of the ideas the things I'm trying to achieve here is that I implement some functions and all the logic for them and then move on to the next part of the logic I would like to implement. However, as it currently stands, that means that with every interface I add, I have to revisit all the IFunctions that I've implemented. Instead, I'd rather have a function:
let makeDifferentiable (f : IFunction) (deriv : int -> IFunction) =
{ f with
interface IDifferentiable with
member x.Derivative = deriv }
but as discussed in this question, that is not possible. The alternative that is possible, doesn't meet my extensibility requirement. My question is what alternatives would work well?
[EDIT] I was asked to expand on the "doesn't meet my extenibility requirement" comment. The way this function would work is by doing something like:
let makeDifferentiable (deriv : int -> IFunction) (f : IFunction)=
{ new IFunction with
member x.Name = f.Name
interface IDifferentiable with
member x.Derivative = deriv }
However, ideally I would keep on adding additional interfaces to an object as I add them. So if I now wanted to add an interface that tell whether on function is even:
type IsEven =
abstract member IsEven : bool with get
then I would like to be able to (but not obliged, as in, if I don't make this change everything should still compile) to change my definition of a sine from
let sin = { new IFunction with ... } >> (makeDifferentiable ...)
to
let sin = { new IFunction with ... } >> (makeDifferentiable ...) >> (makeEven false)
The result of which would be that I could create an object that implements the IFunction interface as well as potentially, but not necessarily a lot of different other interfaces as well; the operations I'd then define on them, would potentially be able to optimize what they are doing based on whether or not a certain function implements an interface. This will also allow me to add additional features/interfaces/operations first without having to change the functions I've defined (though they wouldn't take advantage of the additional features, things wouldn't be broken either.[/EDIT]
The only thing I can think of right now is to create a dictionary for each feature that I'd like to implement, with function names as keys and the details to build an interface on the fly, e.g. along the lines:
let derivative (f : IFunction) =
match derivativeDictionary.TryGetValue(f.Name) with
| false, _ -> None
| true, d -> d.Derivative
This would require me to create one such function per feature that I add in addition to one dictionary per feature. Especially if implemented asynchronously with agents, this might be not that slow, but it still feels a little clunky.
I think the problem that you're trying to solve here is what is called The Expression Problem. You're essentially trying to write code that would be extensible in two directions. Discriminated unions and object-oriented model give you one or the other:
Discriminated union makes it easy to add new operations (just write a function with pattern matching), but it is hard to add a new kind of expression (you have to extend the DU and modify all code
that uses it).
Interfaces make it easy to add new kinds of expressions (just implement the interface), but it is hard to add new operations (you have to modify the interface and change all code that creates it.
In general, I don't think it is all that useful to try to come up with solutions that let you do both (they end up being terribly complicated), so my advice is to pick the one that you'll need more often.
Going back to your problem, I'd probably represent the function just as a function name together with the parameters:
type Expr =
| Num of decimal
| Var of string
| Application of string * Expr list
Really - an expression is just this. The fact that you can take derivatives is another part of the problem you're solving. Now, to make the derivative extensible, you can just keep a dictionary of the derivatives:
let derrivatives =
dict [ "sin", (fun [arg] -> Application("cos", [arg]))
... ]
This way, you have an Expr type that really models just what an expression is and you can write differentiation function that will look for the derivatives in the dictionary.

Function composition - when?

Given the following two approaches, what would the cons and pros of both, when it comes to function composition?
Approach 1
let isNameTaken source name =
source |> Query.Exists(fun z -> z.Name = name)
let usage : Customer = isNameTaken source "Test"
Approach 2
let isNameTaken f name =
f(fun z -> z.Name = name)
let usage : Customer = isNameTaken (source |> Query.Exists) "Test"
Is it just silly to pass (source |> Query.Exists) in Approach 2 - is it too extreme?
It depends on the wider context. I would generally prefer the first approach, unless you have some really good reason for using the second style (e.g. there is a number of functions similar to Query.Exists that you need to apply in a similar style).
Aside - I think your second example has a couple of issues (e.g. the piping in source |> Query.Exists would have to be replaced with (fun pred -> source |> Query.Exists pred) which makes it uglier.
Even then, the second approach does not really give you much benefit - your isNameTaken is simply a function that tests whether a customer name equals a given name and then it passes that as an argument to some f - you could just define a function that tests name equality and write something like this:
let nameEquals name (customer:Customer) =
customer.Name = name
let usage = source |> Query.Exists (nameEquals "Test")
More generally, I think it is always preferable to write code so that the caller can compose the pieces that are available to them (like Query.Exists, nameEquals etc.) rather than In a way that requires the caller to fill some holes of a particular required shape (e.g. implement a function with specified signature).
I think the answer to your question has to do with two main criteria. Which is more important, the readability of the code or the decoupling of the query from isNameTaken. In this particular case, I'm not sure that you get much at all from decoupling the query and it also seems like your decoupling is partial.
The thing I don't like about this is that in both cases, you've got z.Name tightly coupled into isNameTaken, which means that isNameTaken needs to know about the type of z. If that's OK with you then fine.

Writing F# code to parse "2 + 2" into code

Extremely just-started-yesterday new to F#.
What I want: To write code that parses the string "2 + 2" into (using as an example code from the tutorial project) Expr.Add(Expr.Num 2, Expr.Num 2) for evaluation. Some help to at least point me in the right direction or tell me it's too complex for my first F# project. (This is how I learn things: By bashing my head against stuff that's hard)
What I have: My best guess at code to extract the numbers. Probably horribly off base. Also, a lack of clue.
let script = "2 + 2";
let rec scriptParse xs =
match xs with
| [] -> (double)0
| y::ys -> (double)y
let split = (script.Split([|' '|]))
let f x = (split[x]) // "This code is not a function and cannot be applied."
let list = [ for x in 0..script.Length -> f x ]
let result = scriptParse
Thanks.
The immediate issue that you're running into is that split is an array of strings. To access an element of this array, the syntax is split.[x], not split[x] (which would apply split to the singleton list [x], assuming it were a function).
Here are a few other issues:
Your definition of list is probably wrong: x ranges up to the length of script, not the length of the array split. If you want to convert an array or other sequence to a list you can just use List.ofSeq or Seq.toList instead of an explicit list comprehension [...].
Your "casts" to double are a bit odd - that's not the right syntax for performing conversions in F#, although it will work in this case. double is a function, so the parentheses are unnecessary and what you are doing is really calling double 0 and double y. You should just use 0.0 for the first case, and in the second case, it's unclear what you are converting from.
In general, it would probably be better to do a bit more design up front to decide what your overall strategy will be, since it's not clear to me that you'll be able to piece together a working parser based on your current approach. There are several well known techniques for writing a parser - are you trying to use a particular approach?

Resources