Immutable members on objects - f#

I have an object that can be neatly described by a discriminated union. The tree that it represents has some properties that can be easily updated when the tree is modified (but remaining immutable) but that are relatively expensive to recalculate.
I would like to store those properties along with the object as cached values but I don't want to put them into each of the discriminated union cases so I figured a member variable would fit here.
The question is then, how do I change the member value (when I modify the tree) without mutating the actual object? I know I could modify the tree and then mutate that copy without ruining purity but that seems like a wrong way to go about it to me. It would make sense to me if there was some predefined way to change a property but so that the result of the operation is a new object with that property changed.
To clarify, when I say modify I mean doing it in a functional way. Like (::) "appends" to the beginning of a list. I'm not sure what the correct terminology is here.

F# actually has syntax for copy and update records.
The syntax looks like this:
let myRecord3 = { myRecord2 with Y = 100; Z = 2 }
(example from the MSDN records page - http://msdn.microsoft.com/en-us/library/dd233184.aspx).
This allows the record type to be immutable, and for large parts of it to be preserved, whilst only a small part is updated.

The cleanest way to go about it would really be to carry the 'cached' value attached to the DU (as part of the case) in one way or another. I could think of several ways to implement this, I'll just give you one, where there are separate cases for the cached and non-cached modes:
type Fraction =
| Frac of int * int
| CachedFrac of (int * int) * decimal
member this.AsFrac =
match this with
| Frac _ -> this
| CachedFrac (tup, _) -> Frac tup
An entirely different option would be to keep the cached values in a separate dictionary, this is something that makes sense if all you want to do is save some time recalculating them.
module FracCache =
let cache = System.Collections.Generic.Dictionary<Fraction, decimal>()
let modify (oldFrac: Fraction) (newFrac: Fraction) =
cache.[newFrac] <- cache.[oldFrac] + 1 // need to check if oldFrac has a cached value as well.
Basically what memoize would give you plus you have more control over it.

Related

FsCheck: Override generator for a type, but only in the context of a single parent generator

I seem to often run into cases where I want to generate some complex structure, but a special variation with a member type generated differently.
For example, consider this tree
type Tree<'LeafData,'INodeData> =
| LeafNode of 'LeafData
| InternalNode of 'INodeData * Tree<'LeafData,'INodeData> list
I want to generate cases like
No internal node is childless
There are no leaf-type nodes
Only a limited subset of leaf types are used
These are simple to do if I override all generation of a corresponding child type.
The problem is that it seems register is inherently a thread-level action, and there is no gen-local alternative.
For example, what I want could look like
let limitedLeafs =
gen {
let leafGen = Arb.generate<LeafType> |> Gen.filter isAllowedLeaf
do! registerContextualArb (leafGen |> Arb.fromGen)
return! Arb.generate<Tree<NodeType, LeafType>>
}
This Tree example specifically can work around with some creative type shuffling, but that's not always possible.
It's also possible to use some sort of recursive map that enforces assumptions, but that seems relatively complex if the above is possible. I might be misunderstanding the nature of FsCheck generators though.
Does anyone know how to accomplish this kind of gen-local override?
There's a few options here - I'm assuming you're on FsCheck 2.x but keep scrolling for an option in FsCheck 3.
The first is the most natural one but is more work, which is to break down the generator explicitly to the level you need, and then put humpty dumpty together again. I.e don't rely on the type-based generator derivation so much - if I understand your example correctly that would mean implementing a recursive generator - relying on Arb.generate<LeafType> for the generic types.
Second option - Config has an Arbitrary field which you can use to override Arbitrary instances. These overrides will take effect even if the overridden types are part of the automatically generated ones. So as a sketch you could try:
Check.One ({Config.Quick with Arbitrary = [| typeof<MyLeafArbitrary>) |]) (fun safeTree -> ...)
More extensive example which uses FsCheck.Xunit's PropertyAttribute but the principle is the same, set on the Config instead.
Final option! :) In FsCheck 3 (prerelease) you can configure this via a new (as of yet undocumented) concept ArbMap which makes the map from type to Arbitrary instance explicit, instead of this static global nonsense in 2.x (my bad of course. seemed like a good idea at the time.) The implementation is here which may not tell you all that much - the idea is that you put an ArbMap instance together which contains your "safe" generators for the subparts, then you ArbMap.mergeWith that safe map with ArbMap.defaults (thus overriding the default generators with your safe ones, in the resulting ArbMap) and then you use ArbMap.arbitrary or ArbMap.generate with the resulting map.
Sorry for the long winded explanation - but all in all that should give you the best of both worlds - you can reuse the generic union type generator in FsCheck, while surgically overriding certain types in that context.
FsCheck guidance on this is:
To define a generator that generates a subset of the normal range of values for an existing type, say all the even ints, it makes properties more readable if you define a single-case union case, and register a generator for the new type:
As an example, they suggest you could define arbitrary even integers like this:
type EvenInt = EvenInt of int with
static member op_Explicit(EvenInt i) = i
type ArbitraryModifiers =
static member EvenInt() =
Arb.from<int>
|> Arb.filter (fun i -> i % 2 = 0)
|> Arb.convert EvenInt int
Arb.register<ArbitraryModifiers>() |> ignore
You could then generate and test trees whose leaves are even integers like this:
let ``leaves are even`` (tree : Tree<EvenInt, string>) =
let rec leaves = function
| LeafNode leaf -> [leaf]
| InternalNode (_, children) ->
children |> List.collect leaves
leaves tree
|> Seq.forall (fun (EvenInt leaf) ->
leaf % 2 = 0)
Check.Quick ``leaves are even`` // output: Ok, passed 100 tests.
To be honest, I like your idea of a "gen-local override" better, but I don't think FsCheck supports it.

Creating an 'add' computation expression

I'd like the example computation expression and values below to return 6. For some the numbers aren't yielding like I'd expect. What's the step I'm missing to get my result? Thanks!
type AddBuilder() =
let mutable x = 0
member _.Yield i = x <- x + i
member _.Zero() = 0
member _.Return() = x
let add = AddBuilder()
(* Compiler tells me that each of the numbers in add don't do anything
and suggests putting '|> ignore' in front of each *)
let result = add { 1; 2; 3 }
(* Currently the result is 0 *)
printfn "%i should be 6" result
Note: This is just for creating my own computation expression to expand my learning. Seq.sum would be a better approach. I'm open to the idea that this example completely misses the value of computation expressions and is no good for learning.
There is a lot wrong here.
First, let's start with mere mechanics.
In order for the Yield method to be called, the code inside the curly braces must use the yield keyword:
let result = add { yield 1; yield 2; yield 3 }
But now the compiler will complain that you also need a Combine method. See, the semantics of yield is that each of them produces a finished computation, a resulting value. And therefore, if you want to have more than one, you need some way to "glue" them together. This is what the Combine method does.
Since your computation builder doesn't actually produce any results, but instead mutates its internal variable, the ultimate result of the computation should be the value of that internal variable. So that's what Combine needs to return:
member _.Combine(a, b) = x
But now the compiler complains again: you need a Delay method. Delay is not strictly necessary, but it's required in order to mitigate performance pitfalls. When the computation consists of many "parts" (like in the case of multiple yields), it's often the case that some of them should be discarded. In these situation, it would be inefficient to evaluate all of them and then discard some. So the compiler inserts a call to Delay: it receives a function, which, when called, would evaluate a "part" of the computation, and Delay has the opportunity to put this function in some sort of deferred container, so that later Combine can decide which of those containers to discard and which to evaluate.
In your case, however, since the result of the computation doesn't matter (remember: you're not returning any results, you're just mutating the internal variable), Delay can just execute the function it receives to have it produce the side effects (which are - mutating the variable):
member _.Delay(f) = f ()
And now the computation finally compiles, and behold: its result is 6. This result comes from whatever Combine is returning. Try modifying it like this:
member _.Combine(a, b) = "foo"
Now suddenly the result of your computation becomes "foo".
And now, let's move on to semantics.
The above modifications will let your program compile and even produce expected result. However, I think you misunderstood the whole idea of the computation expressions in the first place.
The builder isn't supposed to have any internal state. Instead, its methods are supposed to manipulate complex values of some sort, some methods creating new values, some modifying existing ones. For example, the seq builder1 manipulates sequences. That's the type of values it handles. Different methods create new sequences (Yield) or transform them in some way (e.g. Combine), and the ultimate result is also a sequence.
In your case, it looks like the values that your builder needs to manipulate are numbers. And the ultimate result would also be a number.
So let's look at the methods' semantics.
The Yield method is supposed to create one of those values that you're manipulating. Since your values are numbers, that's what Yield should return:
member _.Yield x = x
The Combine method, as explained above, is supposed to combine two of such values that got created by different parts of the expression. In your case, since you want the ultimate result to be a sum, that's what Combine should do:
member _.Combine(a, b) = a + b
Finally, the Delay method should just execute the provided function. In your case, since your values are numbers, it doesn't make sense to discard any of them:
member _.Delay(f) = f()
And that's it! With these three methods, you can add numbers:
type AddBuilder() =
member _.Yield x = x
member _.Combine(a, b) = a + b
member _.Delay(f) = f ()
let add = AddBuilder()
let result = add { yield 1; yield 2; yield 3 }
I think numbers are not a very good example for learning about computation expressions, because numbers lack the inner structure that computation expressions are supposed to handle. Try instead creating a maybe builder to manipulate Option<'a> values.
Added bonus - there are already implementations you can find online and use for reference.
1 seq is not actually a computation expression. It predates computation expressions and is treated in a special way by the compiler. But good enough for examples and comparisons.

Splitting a list in OCaml

I am wondering whether there is an option for splitting a list in half (or in general at specified element).
To be precise I would like to do something like this:
Having a list (of integers for example):
let ml = [1;2;3;4;5]
I would like to make two lists out of it with lengths specified with an argument of first list.
It would look like something like this:
let msl1, msl2 = split_at_point ml 3
(* msl1 = [1;2;3], msl2=[4;5] *)
To be honest I don't really care if the splitting occurs at the specific point inclusively or not. All I care is it would be fast and memory saving (no copying would be great). And I know that best in terms of efficiency is something like O(n) where n the length of the first list (of the results).
You can't avoid copying the first half of the list. List are immutable, so the only way to get the new list is to copy elements from the original list.
The second half of the list doesn't require a copy, because the tail of a list is a list.
There's no built-in function for doing this (in the OCaml standard library anyway), so I'd suggest you write your own.
This seems very much like something that would come up a homework assignment, so I'll just say that the way to write most list functions in a functional language is to ask yourself how you could use a function that works for a smaller input (generally, the tail of the list) to calculate the value for your original input.
In your case you have something like this:
let rec split_at_point l n =
if n = 0 then
(* Answer is obvious *)
else
match l with
| [] -> (* Answer is obvious *)
| head :: tail ->
(* Call split_at_point on the tail and
* construct your answer
*)

f# deedle filter data frame based on a list

I wanted to filter a Deedle dataframe based on a list of values how would I go about doing this?
I had an idea to use the following code below:
let d= df1|>filterRowValues(fun row -> row.GetAs<float>("ts") = timex)
However the issue with this is that it is only based on one variable, I then thought of combining this with a for loop and an append function:
for i in 0.. recd.length -1 do
df2.Append(df1|>filterRowValues(fun row -> row.GetAs<float>("ts") = recd.[i]))
This does not work either however and there must be a better way of doing this without using a for loop. In R I could for instance using an %in%.
You can use the F# set type to create a set of the values that you are interested. In the filtering, you can then check whether the set contains the actual value for the row.
For example, say that you have recd of type seq<float>. Then you should be able to write:
let recdSet = set recd
let d = df1 |> Frame.filterRowValues (fun row ->
recdSet.Contains(row.GetAs<float>("ts"))
Some other things that might be useful:
You can replace row.GetAs<float>("ts") with just row?ts (which always returns float and works only when you have a fixed name, like "ts", but it makes the code nicer)
Comparing float values might not be the best thing to do (because of floating point imprecisions, this might not always work as expected).

How do I turn this Extension Method into an Extension Property?

I have an extension method
type System.Int32 with
member this.Thousand() = this * 1000
but it requires me to write like this
(5).Thousand()
I'd love to get rid of both parenthesis, starting with making it a property instead of a method (for learning sake) how do I make this a property?
Jon's answer is one way to do it, but for a read-only property there's also a more concise way to write it:
type System.Int32 with
member this.Thousand = this * 1000
Also, depending on your preferences, you may find it more pleasing to write 5 .Thousand (note the extra space) than (5).Thousand (but you won't be able to do just 5.Thousand, or even 5.ToString()).
I don't really know F# (shameful!) but based on this blog post, I'd expect:
type System.Int32 with
member this.Thousand
with get() = this * 1000
I suspect that won't free you from the first set of parentheses (otherwise F# may try to parse the whole thing as a literal), but it should help you with the second.
Personally I wouldn't use this sort of thing for a "production" extension, but it's useful for test code where you're working with a lot of values.
In particular, I've found it neat to have extension methods around dates, e.g. 19.June(1976) as a really simple, easy-to-read way of building up test data. But not for production code :)
It's not beautiful, but if you really want a function that will work for any numeric type, you can do this:
let inline thousand n =
let one = LanguagePrimitives.GenericOne
let thousand =
let rec loop n i =
if i < 1000 then loop (n + one) (i + 1)
else n
loop one 1
n * thousand
5.0 |> thousand
5 |> thousand
5I |> thousand

Resources