Working with missing values in Deedle Time Series in F# (1)

Here is a small example where I want to deal with missing values when applying custom functions to a series.
Suppose that I have obtained a series
series4;;
 val it : Series<int,int opt> =
 1 -> 1
 2 -> 2
 3 -> 3
 4 -> <missing>
for example, this way:
let series1 = Series.ofObservations [(1,1);(2,2);(3,3)]
let series2 = Series.ofObservations [(1,2);(2,2);(3,1);(4,4)]
let series3 = series1.Zip(series2,JoinKind.Outer);;
let series4 = series3 |> Series.mapValues fst
Then if I do,
Series.mapAll (fun v -> match v with
                        | Some a -> (a>1)
                        | _-> false) series4
that fails with
 System.Exception: Operation could not be completed due to earlier
error  The type 'int option' does not match the type 'int opt'. See
also input.fsx(4,42)-(4,49). at 4,42
while I would like the result to be
val it : Series<int,bool opt> =
 1 -> false
 2 -> true
 3 -> true
 4 -> false
Even better would be to be able to get a result like
val it : Series<int,bool opt> =
 1 -> false
 2 -> true
 3 -> true
 4 -> <missing>
What would be the right syntax there? Ideally, if there is a <missing> value, I would like a <missing> value for the same key in the new series.
Basically, I need to do pattern matching on the int opt type.
Bonus question: is there a vectorized operator in Deedle for usual operators such as ">"?
(series1 > series2), when both series have the same key type, would return a new series of boolean (option?) type.
Thanks

You can do it this way:
let series5 =
    series4
    |> Series.mapValues (OptionalValue.map (fun x -> x > 1))
You can read about the OptionalValue module in the documentation.

Related

F#, Deedle and OptionalValue: Object must implement IConvertible error

I'm facing trouble when I try to create missing values in a Frame and later perform operations with them. Here is a "working" sample:
open Deedle
open System.Text.RegularExpressions
do fsi.AddPrinter(fun (printer:Deedle.Internal.IFsiFormattable) -> "\n" + (printer.Format()))
module Frame =
    let mapAddCol col f frame = frame |> Frame.addCol col (Frame.mapRowValues f frame)
[ {|Desc = "A - 1.50ml"; ``Price ($)`` = 23.|}
  {|Desc = "B - 2ml"; ``Price ($)`` = 18.5|}
  {|Desc = "C"; ``Price ($)`` = 25.|} ]
|> Frame.ofRecords
(*
     Desc        Price ($)
0 -> A - 1.50ml  23
1 -> B - 2ml     18.5
2 -> C           25
*)
|> Frame.mapAddCol "Volume (ml)" (fun row ->
    match Regex.Match(row.GetAs<string>("Desc"),"[\d\.]+").Value with
    | "" -> OptionalValue.Missing
    | n -> n |> float |> OptionalValue)
(*
     Desc        Price ($)  Volume (ml)
0 -> A - 1.50ml  23         1.5
1 -> B - 2ml     18.5       2
2 -> C           25         <missing>
*)
|> fun df -> df?``Price ($/ml)`` <- df?``Price ($)`` / df?``Volume (ml)``
//error message: System.InvalidCastException: Object must implement IConvertible.
What is wrong with this approach?
Deedle internally stores a flag whether a value is present or missing. This is typically exposed via the OptionalValue type, but the internal representation is not actually using this type.
When you use a function such as mapRowValues to generate new data, Deedle needs to recognize which data is missing. This happens only in somewhat limited cases. When you return OptionalValue<float>, Deedle actually produces a series where the type of values is OptionalValue<float> rather than float (the type system does not let it do anything else).
For float values, the solution is just to return nan as your missing value:
|> Frame.mapAddCol "Volume (ml)" (fun row ->
    match Regex.Match(row.GetAs<string>("Desc"),"[\d\.]+").Value with
    | "" -> nan
    | n -> n |> float )
This will create a new series of float values, which you can then access using the ? operator.

F# : How to test the equality of sequence/list elements?

I would like to test whether all elements of a list/sequence equal something.
For example, take a sequence of integers.
I would like to test whether ALL elements of the sequence are equal to the same number.
My solution so far looks like an imperative programming solution.
let test seq =
    if Seq.forall(fun num -> num =1) then 1
    elif Seq.forall(fun num-> num = 2) then 2
    else None
Your solution is fine! Checking that all elements of a sequence have some value is not something you can nicely express using pattern matching - you have to use when clause, but that's doing exactly the same thing as your code (but with longer syntax). In cases like this, there is absolutely nothing wrong with using if.
You can extend pattern matching by defining custom active patterns, which gives you a nice option here. This is fairly advanced F#, but you can define a custom pattern ForAll n that succeeds when the input is a sequence containing just n values:
let (|ForAll|_|) n seq =
    if Seq.forall (fun num -> num = n) seq then Some() else None
Note that success is represented as Some and failure as None. Now, you can solve your problem very nicely using pattern matching:
let test = function
    | ForAll 1 -> Some 1
    | ForAll 2 -> Some 2
    | _ -> None
This looks quite nice, but it's relying on more advanced features - I would do this if this is something that you need in more than one place. If I needed this just in one place, I'd go with ordinary if.
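For reference, the active pattern and the function that uses it fit together into one short script. This is only a sketch: the sample lists are made up for illustration, and the parameter is named items here rather than seq. One detail worth knowing is that Seq.forall is true for an empty sequence, so an empty list matches the first case.

```fsharp
// Partial active pattern: matches when every element of the sequence equals n.
let (|ForAll|_|) n items =
    if Seq.forall (fun x -> x = n) items then Some () else None

// Returns Some n when all elements equal 1 or 2, otherwise None.
let test = function
    | ForAll 1 -> Some 1
    | ForAll 2 -> Some 2
    | _ -> None

let allOnes = test [1; 1; 1]          // Some 1
let allTwos = test [2; 2]             // Some 2
let mixed   = test [1; 2; 1]          // None
let empty   = test List.empty<int>    // Some 1, because Seq.forall is vacuously true
```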
You can rewrite it using pattern matching with a guard clause:
let testList = [2;2;2]
let isOne x = x = 1
let isTwo x = x = 2
let forAll = function
    | list when list |> List.forall isOne -> Some 1
    | list when list |> List.forall isTwo -> Some 2
    | _ -> None
let res = forAll testList //Some 2
Instead of the function you could use partial application on the equals operator.
> let yes = [1;1;1];;
val yes : int list = [1; 1; 1]
> let no = [1;2;3];;
val no : int list = [1; 2; 3]
> yes |> List.forall ((=) 1);;
val it : bool = true
> no |> List.forall ((=) 1);;
val it : bool = false
Maybe this looks more functional? And I think you should return Some 1 in your code, otherwise you'd get type errors since Option and int are not the same type...
If you want to check if all elements are equal (not just if they equal some constant), you could do this:
> [1;2] |> List.pairwise |> List.forall (fun (a,b) -> a = b)
;;
val it : bool = false
> [1;1;1] |> List.pairwise |> List.forall (fun (a,b) -> a = b)
;;
val it : bool = true
Here you split the list into tuples of neighbouring elements and check whether the two elements of each tuple are equal. By transitivity, this means that all elements are equal.
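The pairwise check can be wrapped in a small helper. The name allEqual is hypothetical, chosen for this sketch. Note that a list with zero or one element produces no pairs, so the helper reports such lists as "all equal".

```fsharp
// True when every element equals its neighbour, hence all elements are equal.
// Lists with zero or one element yield no pairs, so List.forall returns true.
let allEqual items =
    items |> List.pairwise |> List.forall (fun (a, b) -> a = b)

let r1 = allEqual [1; 1; 1]   // true
let r2 = allEqual [1; 2]      // false
let r3 = allEqual ["a"]       // true (no pairs to compare)
```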

Match on discriminated union case not contents

Is it possible in F# to match a Discriminated Union based on its case rather than by case contents? For example, if I wanted to filter a list by elements that are of the case Flag, is it possible to filter as such? Currently, I am forced to have three separate functions to filter the way I desire. This is the approach I have so far:
type Option =
    { Id : string
      Arg : string }
type Argument =
    | Flag of string
    | Option of Option
    | Unannotated of string
//This is what I'm going for, but it does not work as the "other" match case will never be matched
let LocateByCase (case:Argument) (args : Argument List) =
    args
    |> List.filter (fun x -> match x with
                             | case -> true
                             | _ -> false)
let LocateAllFlags args =
    args
    |> List.filter (fun x -> match x with
                             | Flag y -> true
                             | _ -> false)
let LocateAllOptions args =
    args
    |> List.filter (fun x -> match x with
                             | Option y -> true
                             | _ -> false)
let LocateAllUnannotated args =
    args
    |> List.filter (fun x -> match x with
                             | Unannotated y -> true
                             | _ -> false)
Am I missing some facet of the F# language that would make this much easier to deal with?
There is no built-in way to find out the case of a DU value. The usual approach, when faced with such requirement, is to provide appropriate functions for each case:
type Argument =
    | Flag of string
    | Option of Option
    | Unannotated of string
    with
        static member isFlag = function Flag _ -> true | _ -> false
        static member isOption = function Option _ -> true | _ -> false
        static member isUnannotated = function Unannotated _ -> true | _ -> false
let LocateByCase case args = List.filter case args
let LocateAllFlags args = LocateByCase Argument.isFlag args
(needless to say, the LocateByCase function is actually redundant, but I decided to keep it in to make the answer clearer)
WARNING: DIRTY HACK BELOW
Alternatively, you could provide the case as a quotation, and make yourself a function that will analyze that quotation, fish the case name out of it, and compare it to the given value:
open FSharp.Quotations
let isCase (case: Expr<'t -> Argument>) (value: Argument) =
    match case with
    | Patterns.Lambda (_, Patterns.NewUnionCase(case, _)) -> case.Name = value.GetType().Name
    | _ -> false
// Usage:
isCase <@ Flag @> (Unannotated "") // returns false
isCase <@ Flag @> (Flag "") // returns true
Then use this function to filter:
let LocateByCase case args = List.filter (isCase case) args
let LocateAllFlags args = LocateByCase <@ Flag @> args
HOWEVER, this is essentially a dirty hack. Its dirtiness and hackiness comes from the fact that, because you can't require a certain quotation shape at compile time, it will allow nonsensical programs. For example:
isCase <@ fun() -> Flag "abc" @> (Flag "xyz") // Returns true!
isCase <@ fun() -> let x = "abc" in Flag x @> (Flag "xyz") // Returns false. WTF?
// And so on...
Another gotcha may happen if a future version of the compiler decides to generate quotations slightly differently, and your code won't recognize them and report false negatives all the time.
I would recommend avoiding messing with quotations if at all possible. It may look easy on the surface, but it's really a case of easy over simple.
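A third route, not taken from the answers above, is to ask the reflection API for the case name via FSharpValue.GetUnionFields. The sketch below makes several assumptions: the Named case is a hypothetical stand-in for the question's Option case (to avoid shadowing the built-in Option type), caseName and locateByCaseName are made-up helper names, and the case is selected by a plain string that the compiler cannot check. Reflection also carries a runtime cost, so the per-case predicate functions remain the safer default.

```fsharp
open FSharp.Reflection

type Argument =
    | Flag of string
    | Named of string * string   // hypothetical stand-in for the Option case
    | Unannotated of string

// Name of the union case a value was built with, looked up via reflection.
let caseName (value: Argument) =
    let case, _ = FSharpValue.GetUnionFields(value, typeof<Argument>)
    case.Name

// Keep only the arguments whose case name matches the given string.
// The string is not checked at compile time, so a typo silently yields [].
let locateByCaseName name args =
    args |> List.filter (fun a -> caseName a = name)

let args = [ Flag "-v"; Named ("-o", "out.txt"); Unannotated "input.txt"; Flag "-q" ]
let flags = locateByCaseName "Flag" args   // [Flag "-v"; Flag "-q"]
```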

F# pipeline placeholder?

I have googled a bit, and I haven't found what I was looking for. As expected. My question is: is it possible to define an F# pipeline placeholder? What I want is something like _ in the following:
let func a b c = 2*a + 3*b + c
2 |> func 5 _ 6
Which would evaluate to 22 (2*5 + 3*2 + 6).
For comparison, check out the magrittr R package: https://github.com/smbache/magrittr
This is (unfortunately!) not supported in the F# language - while you can come up with various fancy functions and operators to emulate the behavior, I think it is usually just easier to refactor your code so that the call is outside of the pipeline. Then you can write:
let input = 2
let result = func 5 input 6
The strength of a pipeline shows when you have one "main" data structure that is processed through a sequence of steps (like a list processed through a sequence of List.xyz functions). In that case, a pipeline makes the code nicer and more readable.
However, if you have a function that takes multiple inputs and no "main" input (a last argument that would work with pipelines), then it is actually more readable to use a temporary variable and ordinary function calls.
I don't think that's possible, but you could simply use a lambda expression, like
2 |> (fun b -> func 5 b 6)
Here's a point-free approach:
let func a b c = 2*a + 3*b + c
let func' = func 5 >> (|>) 6
let result = 2 |> func'
// result = 22
I have explained it in detail here.
Be aware, however, that someone who works with your code will not quickly grasp your intent. You may use it to learn the deeper aspects of the language, but in real-world projects you will probably find a straightforward approach better suited:
let func' b = func 5 b 6
You could use a new function like that:
let func a b c = 2*a + 3*b + c
let func2 b = func 5 b 6
2 |> func2
@Dominic Kexel's right on the money. If the goal isn't really to put a placeholder into the chain of arguments, which could be achieved with a lambda function, but to change the order of the arguments, then it's more a case of flip than pipe.
From the simple two-argument case
let flip f b a = f a b
// val flip : f:('a -> 'b -> 'c) -> b:'b -> a:'a -> 'c
we need to derive a function
let flip23of3 f a c b = f a b c
// val flip23of3 : f:('a -> 'b -> 'c -> 'd) -> a:'a -> c:'c -> b:'b -> 'd
in order to flip the second and third argument. This could have also been written
let flip23of3' f = f >> flip
let func a b c = 2*a + 3*b + c
2 |> flip23of3 func 5 6
// val it : int = 22
I have given it a try myself. The result is not perfect, but it is as close as I have gotten:
let (|.|) (x: 'a -> 'b -> 'c) (y: 'b) = fun (a: 'a) -> x a y
let func (a:string) b (c:int) = 2.*(float a) + b + 5.*(float c)
let foo = func "4" 9. 5
printfn "First: %f" foo
let bar =
    "4"
    |> ((func |.| 9.) |.| 5)
printfn "Second: %f" bar
let baz =
    9.
    |> (func "4" |.| 5)
printfn "Third: %f" baz
The output is, as expected
First: 42.000000
Second: 42.000000
Third: 42.000000

Built in equality of lists using a custom comparison function?

Is there a built-in function which does the following?
let rec listsEqual xl yl f =
    match xl, yl with
    | [], [] -> true
    | [], _ | _, [] -> false
    | xh::xt, yh::yt -> if f xh yh then listsEqual xt yt f else false
Updated, further elaboration: and in general is there any way to tap in to structural comparison but using a custom comparison function?
List.forall2 : (('a -> 'b -> bool) -> 'a list -> 'b list -> bool)
But it takes f before the lists. You can create your function like this:
let listsEqual x y f =
    if List.length x = List.length y then
        List.forall2 f x y
    else
        false
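Remember that List.forall2 expects lists of the same length and is documented to raise an ArgumentException otherwise, which is why the length check comes first.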
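As a quick illustration of the List.forall2 approach, here is a sketch that compares lists of two different element types. The sample data and the int-to-string comparison are made up for the example.

```fsharp
// Element-wise equality of two lists under a custom comparison f.
// The length check runs first, so List.forall2 only sees lists of equal length.
let listsEqual xs ys f =
    List.length xs = List.length ys && List.forall2 f xs ys

// Custom comparison between an int and its string form.
let sameDigits = listsEqual [1; 2; 3] ["1"; "2"; "3"] (fun i s -> string i = s)  // true
let different  = listsEqual [1; 2; 3] ["1"; "2"] (fun i s -> string i = s)       // false
```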
Concerning Seq.compareWith, you wrote:
not quite, two problems 1) expects the
two sequences be of the same type, 2)
doesn't short circuit
2) is wrong: the function really does short-circuit.
1) is true. Take Seq.compareWith from F# library, modify (or remove) the type annotation and it will work for sequences of different types.
[<CompiledName("CompareWith")>]
let compareWith (f:'T1 -> 'T2 -> int) (source1 : seq<'T1>) (source2: seq<'T2>) =
    //checkNonNull "source1" source1
    //checkNonNull "source2" source2
    use e1 = source1.GetEnumerator()
    use e2 = source2.GetEnumerator()
    let rec go () =
        let e1ok = e1.MoveNext()
        let e2ok = e2.MoveNext()
        let c = (if e1ok = e2ok then 0 else if e1ok then 1 else -1)
        if c <> 0 then c else
        if not e1ok || not e2ok then 0
        else
            let c = f e1.Current e2.Current
            if c <> 0 then c else
            go ()
    go()
Now, you can send an email to fsbugs (@ microsoft.com) and ask them to remove the type constraint in the next F# release.
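The short-circuit behaviour of the built-in Seq.compareWith can be checked with a small experiment. The sketch below uses two made-up infinite sequences that differ at the first element and counts the comparer calls with a ref cell. If the function did not stop at the first non-zero result, the call would never return.

```fsharp
// Two infinite sequences that differ at the very first element.
let naturals = Seq.initInfinite id                // 0, 1, 2, ...
let shifted  = Seq.initInfinite (fun i -> i + 1)  // 1, 2, 3, ...

// Counts how many times the comparer is invoked.
let calls = ref 0
let countingCompare a b =
    calls.Value <- calls.Value + 1
    compare a b

// Returns -1 after a single comparer call (compare 0 1).
let result = Seq.compareWith countingCompare naturals shifted
```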

Resources