Broadcasting operations in MathNet - f#

Suppose I have the following data:
var1,var2,var3
0.942856823,0.568425866,0.325885379
1.227681099,1.335672206,0.925331054
1.952671045,1.829479996,1.512280854
2.45428731,1.990174152,1.534456808
2.987783477,2.78975186,1.725095748
3.651682331,2.966399127,1.972274564
3.768010479,3.211381506,1.993080807
4.509429614,3.642983433,2.541071547
4.81498729,3.888415006,3.218031802
Here is the code:
open System.IO
open MathNet.Numerics.LinearAlgebra
let rows = [|for line in File.ReadAllLines("Z:\\mypath.csv")
               |> Seq.skip 1 do yield line.Split(',') |> Array.map float|]
let data = DenseMatrix.ofRowArrays rows
let data_logdiff =
    DenseMatrix.init (data.RowCount-1) (data.ColumnCount)
        (fun j i -> if j = 0 then 0. else data.At(j, i) / data.At(j-1, i) |> log)
let alpha = vector [for i in data_logdiff.EnumerateColumns() -> i |> Statistics.Mean]
let sigsq (values:Vector<float>) (avg: float) =
    let sqr x = x * x
    let result = values |> (fun i -> sqr (i - avg))
    result
sigsq (data_logdiff.Column(i), alpha.[0]) |> printfn "%A"
Error: The type ''a * 'b' is not compatible with the type 'Vector<float>'
This is all for a broadcast operation between a matrix and a vector. All these acrobatics to do a simple mean((y-alpha).^2) in MATLAB.

You have a mistake in your code, and the F# compiler complains about it, albeit in a somewhat obscure way. You define your function:
let sigsq (values:Vector<float>) (avg: float) =
This is a function that takes two arguments. (Actually it's a function taking one argument, returning another function taking one argument.) But you call it like this:
sigsq (data_logdiff.Column(i), alpha.[0]) |> printfn "%A"
You pass the arguments as a tuple, and for F# functions (a, b) is a single argument whose type is a tuple. That is why the compiler reports 'a * 'b where it expected Vector<float>. You should call your function like this:
sigsq (data_logdiff.Column(0)) (alpha.[0])
or
sigsq <| data_logdiff.Column(0) <| alpha.[0]
and my favorite one:
data_logdiff.Column(0) |> sigsq <| alpha.[0]
I replaced the (i) with 0 because i is not defined at that point in your code. You can map through the columns if you want to loop:
data_logdiff.EnumerateColumnsIndexed() |> Seq.map (fun (i,col) -> sigsq col alpha.[i])
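To make the curried-versus-tupled distinction concrete, here is a minimal sketch that uses plain float arrays instead of MathNet vectors, so it runs without any package. The names sigsqCurried, sigsqTupled, and meanSq are made up for illustration and are not part of the original code.

```fsharp
// Curried form: two arguments, applied one after the other.
let sigsqCurried (values: float[]) (avg: float) =
    values |> Array.map (fun v -> (v - avg) * (v - avg))

// Tupled form: a single argument that happens to be a pair.
let sigsqTupled (values: float[], avg: float) =
    values |> Array.map (fun v -> (v - avg) * (v - avg))

let xs = [| 1.0; 2.0; 3.0 |]

let viaCurried = sigsqCurried xs 2.0     // arguments separated by spaces
let viaTupled = sigsqTupled (xs, 2.0)    // one tuple argument

// Roughly what MATLAB's mean((y - alpha).^2) computes for one column.
let meanSq = viaCurried |> Array.average
```

Calling sigsqCurried (xs, 2.0) would reproduce the type error from the question, because the tuple is passed where the first (array) argument is expected.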

Related

Trying to create a function, and then filtering a sequence by "not that function" in F#

My data is a SEQUENCE of:
[(40,"TX");(48,"MO");(15,"TX");(78,"TN");(41,"VT")]
My code is as follows:
type Csvfile = CsvProvider<somefile>
let data = Csvfile.GetSample().Rows
let nullid row =
    row.Id = 15
let otherid row =
    row.Id = 40
let iddata =
    data
    |> Seq.filter (not nullid)
    |> Seq.filter (not otherid)
I create the functions.
Then I want to call the "not" of those functions to filter them out of a sequence.
But the issue is that I am getting errors for "row.Id" in the first two functions, because you can only do that with a type.
How do I solve this problem so that the filtering works?
My result should be a SEQUENCE of:
[(48,"MO");(78,"TN");(41,"VT")]
You can use the >> operator to compose the two functions:
let iddata =
    data
    |> Seq.filter (nullid >> not)
    |> Seq.filter (otherid >> not)
See Function Composition and Pipelining.
Or you can make it more explicit:
let iddata =
    data
    |> Seq.filter (fun x -> not (nullid x))
    |> Seq.filter (fun x -> not (otherid x))
You can see that in action:
let input = [|1;2;3;4;5;6;7;8;9;10|];;
let is3 value =
    value = 3;;
input |> Seq.filter (fun x -> not (is3 x));;
input |> Seq.filter (is3 >> not);;
They both print val it : seq<int> = seq [1; 2; 4; 5; ...]
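The order of composition matters here, so a small self-contained sketch may help. The names notIs3Forward and notIs3Backward are illustrative only: with >> the left function runs first, with << the right function runs first.

```fsharp
let is3 value = value = 3

// Forward composition: run is3 first, then negate its result.
let notIs3Forward = is3 >> not

// Backward composition: the same function, written right to left.
let notIs3Backward = not << is3

let keptForward = [1 .. 5] |> List.filter notIs3Forward
let keptBackward = [1 .. 5] |> List.filter notIs3Backward
```

Writing not >> is3 instead would not type-check, since it would feed the Boolean result of not into is3, which compares against an integer.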
Please see below what an MCVE might look like in your case. For an .fsx file you can reference the FSharp.Data DLL with #r; for a compiled project, just reference the DLL and open the namespace.
#if INTERACTIVE
#r @"..\..\SO2018\packages\FSharp.Data\lib\net45\FSharp.Data.dll"
#endif
open FSharp.Data
[<Literal>]
let datafile = @"C:\tmp\data.csv"
type CsvFile = CsvProvider<datafile>
let data = CsvFile.GetSample().Rows
In the end this is what you want to achieve:
data
|> Seq.filter (fun x -> x.Id <> 15)
|> Seq.filter (fun x -> x.Id <> 40)
//val it : seq<CsvProvider<...>.Row> = seq [(48, "MO"); (78, "TN"); (41, "VT")]
One way to do this is with SRTP (statically resolved type parameters), which allow a form of structural typing, where a type is accepted based on its shape, in this case having an Id property. If you want, you can define a helper function for the two numbers 15 and 40 and use it in your filter, just like in the second example. However, SRTP syntax is a bit strange, and it is designed for cases where you need to apply a function to different types that share some members (much like interfaces).
let inline getId row =
    (^T : (member Id : int) row)
data
|> Seq.filter (fun x -> (getId x <> 15 ))
|> Seq.filter (fun x -> (getId x <> 40))
//val it : seq<CsvProvider<...>.Row> = seq [(48, "MO"); (78, "TN"); (41, "VT")]
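As a sketch of why SRTP is aimed at unrelated types that share a member, the following example applies one inline function to two hypothetical classes, Customer and Order, which are invented here and have nothing in common except an Id property. The explicit (x: ^T) annotation is a stylistic choice in this sketch.

```fsharp
// Two unrelated types that both expose an Id property (hypothetical examples).
type Customer(id: int) =
    member this.Id = id

type Order(id: int, total: float) =
    member this.Id = id
    member this.Total = total

// Resolved at compile time for any type with a member Id : int.
let inline getId (x: ^T) : int = (^T : (member Id : int) x)

let customerId = getId (Customer 15)
let orderId = getId (Order(40, 9.5))
```

The function must be inline, because the member lookup is resolved separately at each call site.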
Now back to your original post. As you correctly point out, your function shows an error: without a type annotation the row parameter has no known type, so the compiler cannot resolve row.Id. This is easy to fix: add a type annotation to the row parameter. In this case your type is CsvFile.Row, and since CsvFile.Row has the Id property we can access it in the function. This function returns a Boolean. You could make it return the actual row as well.
let nullid (row: CsvFile.Row) =
    row.Id = 15
let otherid (row: CsvFile.Row) =
    row.Id = 40
Then what is left is applying this inside a Seq.filter and negating it:
let iddata =
    data
    |> Seq.filter (not << nullid)
    |> Seq.filter (not << otherid)
    |> Seq.toList
//val iddata : CsvProvider<...>.Row list = [(48, "MO"); (78, "TN"); (41, "VT")]
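For readers without the CSV file at hand, the same fix can be tried with an ordinary record. This is only a sketch: the Row record and the rows list below are stand-ins invented for illustration, in place of the type-provider row type and data.

```fsharp
// Hypothetical stand-in for CsvFile.Row.
type Row = { Id: int; State: string }

let rows =
    [ { Id = 40; State = "TX" }
      { Id = 48; State = "MO" }
      { Id = 15; State = "TX" }
      { Id = 78; State = "TN" }
      { Id = 41; State = "VT" } ]

// The annotation tells the compiler that row has an Id property.
let nullid (row: Row) = row.Id = 15
let otherid (row: Row) = row.Id = 40

let iddata =
    rows
    |> List.filter (not << nullid)
    |> List.filter (not << otherid)

let remainingIds = iddata |> List.map (fun r -> r.Id)
```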

How to create a dependency between observables?

I want a tool for testing Rx components that would work like this:
Given an order of events specified as a 'v seq and a key selector function (keySelector :: 'v -> 'k), I want to create a Map<'k, IObservable<'v>> where the guarantee is that the grouped observables yield the values in the global order defined by the above enumerable.
For example:
makeObservables isEven [1;2;3;4;5;6]
...should produce
{ true : -2-4-6|,
false: 1-3-5| }
My attempt looks like this:
open System
open System.Reactive.Linq
open FSharp.Control.Reactive
let subscribeAfter (o1: IObservable<'a>) (o2 : IObservable<'b>) : IObservable<'b> =
    fun (observer : IObserver<'b>) ->
        let tempObserver = { new IObserver<'a> with
                                 member this.OnNext x = ()
                                 member this.OnError e = observer.OnError e
                                 member this.OnCompleted () = o2 |> Observable.subscribeObserver observer |> ignore
                           }
        o1.Subscribe tempObserver
    |> Observable.Create
let makeObservables (keySelector : 'a -> 'k) (xs : 'a seq) : Map<'k, IObservable<'a>> =
    let makeDependencies : ('k * IObservable<'a>) seq -> ('k * IObservable<'a>) seq =
        let makeDep ((_, o1), (k2, o2)) = (k2, subscribeAfter o1 o2)
        Seq.pairwise
        >> Seq.map makeDep
    let makeObservable x = (keySelector x, Observable.single x)
    let firstItem =
        Seq.head xs
        |> makeObservable
        |> Seq.singleton
    let dependentObservables =
        xs
        |> Seq.map makeObservable
        |> makeDependencies
    dependentObservables
    |> Seq.append firstItem
    |> Seq.groupBy fst
    |> Seq.map (fun (k, obs) -> (k, obs |> Seq.map snd |> Observable.concatSeq))
    |> Map.ofSeq
[<EntryPoint>]
let main argv =
    let isEven x = (x % 2 = 0)
    let splits : Map<bool, IObservable<int>> =
        [1;2;3;4;5]
        |> makeObservables isEven
    use subscription =
        splits
        |> Map.toSeq
        |> Seq.map snd
        |> Observable.mergeSeq
        |> Observable.subscribe (printfn "%A")
    Console.ReadKey() |> ignore
    0 // return an integer exit code
...but the results are not as expected and the observed values are not in the global order.
Apparently the items in each group are yielded correctly, but when the groups are merged it behaves more like a concat than a merge.
The expected output is: 1 2 3 4 5
...but the actual output is 1 3 5 2 4
What am I doing wrong?
Thanks!
You describe wanting this:
{ true : -2-4-6|,
false: 1-3-5| }
But you're really creating this:
{ true : 246|,
false: 135| }
Since there are no time gaps between the items in the observables, the merge is effectively a constant race condition. Rx guarantees that element 1 of a given sequence fires before element 2 of that same sequence, but Merge offers no ordering guarantees across sequences in cases like this.
You need to introduce time gaps into your observables if you want Merge to be able to re-sequence in the original order.

Pass function as a parameter and overload it

I want to time my functions, some of them use up to three parameters. Right now I'm using the same code below with some variations for the three.
let GetTime f (args : string) =
    let sw = Stopwatch.StartNew()
    f (args)
    printfn "%s : %A" args sw.Elapsed
I want to replace the three functions with this one.
let GetTime f ( args : 'T[]) =
    let sW = Stopwatch.StartNew()
    match args.Length with
    | 1 -> f args.[0]
    | 2 -> f (args.[0] args.[1])
    printfn "%A" sW.Elapsed
    ()
But I'm getting a type mismatch error, whereas the three separate functions work. Is it possible to pass the function as a parameter and use it like this?
Why not just do something like this?
let getTime f =
    let sw = Stopwatch.StartNew()
    let result = f ()
    printfn "%A" sw.Elapsed
    result
Assuming that f1, f2, and f3 are three functions that take respectively 1, 2, and 3 arguments, you can use the getTime function like this:
getTime (fun () -> f1 "foo")
getTime (fun () -> f2 "foo" "bar")
getTime (fun () -> f3 "foo" "bar" "baz")
However, if you just need to time some functions in FSI, this feature is already built-in: just type
> #time;;
and timing will be turned on.
It isn't possible for the compiler to know how many arguments will be passed at runtime, so the function f must satisfy both 'T -> unit and 'T -> 'T -> unit. This form also requires all arguments to be of the same type.
The following approach delays the function execution and may be suitable for your needs.
let printTime f =
    let sw = Stopwatch.StartNew()
    f() |> ignore
    printfn "%A" sw.Elapsed
let f1 s = String.length s
let f2 s c = String.concat c s
printTime (fun () -> f1 "Test")
printTime (fun () -> f2 [| "Test1"; "Test2" |] ",")
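If the elapsed time is needed as a value rather than only printed, the same thunk-based idea can return it alongside the result. The following is a minimal sketch; timeIt, add2, and add3 are hypothetical names used only for this illustration.

```fsharp
open System
open System.Diagnostics

// Runs a delayed computation and returns its result together with the elapsed time.
let timeIt (f: unit -> 'a) : 'a * TimeSpan =
    let sw = Stopwatch.StartNew()
    let result = f ()
    sw.Stop()
    result, sw.Elapsed

let add2 a b = a + b
let add3 a b c = a + b + c

// The lambda hides how many arguments the timed function takes.
let r2, t2 = timeIt (fun () -> add2 1 2)
let r3, t3 = timeIt (fun () -> add3 1 2 3)

printfn "add2 = %d in %A, add3 = %d in %A" r2 t2 r3 t3
```

Because the caller wraps the call in a fun () -> ... lambda, the timing function never needs to know the arity or argument types of what it measures.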
You're probably thinking of passing a method group as an argument to GetTime, and then having the compiler decide which overload of the method group to call. That's not possible with any .NET compiler. Method groups are used for code analysis by compilers and tools such as ReSharper, but they are not something that actually exists at runtime.
If your functions take their arguments in tupled form, like these:
let f1 (s: string, b: bool) =
    System.Threading.Thread.Sleep 1000
    s
let f2 (n: int, s:string, dt: System.DateTime) =
    System.Threading.Thread.Sleep 1000
    n+1
then the implementation becomes trivial:
let Timed f args =
    let sw = System.Diagnostics.Stopwatch.StartNew()
    let ret = f args
    printfn "Called with arguments %A, elapsed %A" args sw.Elapsed
    ret
Usage:
f1
|> Timed // note, at this time we haven't yet applied any arguments
<| ("foo", true)
|> printfn "f1 done, returned %A"
f2
|> Timed
<| (42, "bar", DateTime.Now)
|> printfn "f2 done, returned %A"
However, if the functions take their arguments in curried form, like this:
let f1Curried (s: string) (b: bool) =
    System.Threading.Thread.Sleep 1000
    s
let f2Curried (n: int) (s:string) (dt: System.DateTime) =
    System.Threading.Thread.Sleep 1000
    n+1
it becomes a bit trickier. The idea is to use the standard operators (<|), (<||), and (<|||), which apply a curried function to a single value, a pair, or a triple respectively.
let Timed2 op f args =
    let sw = System.Diagnostics.Stopwatch.StartNew()
    let ret = op f args
    printfn "Called with arguments %A, elapsed %A" args sw.Elapsed
    ret
f1Curried
|> Timed2 (<||) // again, no arguments are passed yet
<| ("foo", true)
|> printfn "f1Curried done, returned %A"
f2Curried
|> Timed2 (<|||)
<| (42, "bar", DateTime.Now)
|> printfn "f2Curried done, returned %A"
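To see what these operators do in isolation, here is a small sketch without any timing. The functions add2, add3, and applyWith are hypothetical names for this example; (<||) and (<|||) are the standard F# operators that spread a pair or a triple over a curried function.

```fsharp
let add2 a b = a + b
let add3 a b c = a + b + c

// (<||) feeds the two elements of a pair to a two-argument curried function.
let viaPair = add2 <|| (1, 2)

// (<|||) does the same with a triple and a three-argument function.
let viaTriple = add3 <||| (1, 2, 3)

// Operators are ordinary functions, so they can be passed as arguments,
// which is the mechanism Timed2 relies on.
let applyWith op f args = op f args
let viaOp = applyWith (<||) add2 (10, 20)
```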

Combining functions with the same, but partially unknown signature

Suppose we have a number of filter functions that accept the same parameters and return a boolean result.
let filter1 _ _ = true
let filter2 _ _ = false
These can be combined into a single filter.
let combine2 f1 f2 = fun a b -> f1 a b && f2 a b
combine2 filter1 filter2
Our implementation requires some knowledge of the parameters of f1 and f2. More generally, we may find functions combine1 ... combineN useful, where N is the number of parameters to the filter functions. Can a generic combine function be written that is independent of N?
I am interested in the capabilities of F# and being able to apply this concept in other situations.
Update: My understanding of the problem is that functions succeed in ignoring any remaining parameters when they don't care whether the result is a simple type or a partially applied function. In the example above, we only reach a boolean type after applying all parameters, so they need to be specified.
Use a higher-order function and pass the function application as an argument:
let combineN invoke filters = filters |> List.map invoke |> List.reduce (&&)
and use it like this:
[filter1; filter2] |> combineN (fun f -> f 1 2) |> printfn "%b"
demo: https://dotnetfiddle.net/EHC5di
You can also pass the List.reduce parameter as an argument, like combineN (&&) (fun f -> f 1 2), but it is usually easier to write List.map |> List.reduce directly.
You can also use it with more arguments:
let filter3 _ _ _ = true
let filter4 _ _ _ = true
[filter3; filter4] |> List.map (fun f -> f 1 2 3) |> List.reduce (&&) |> printfn "%b"
[filter3; filter4] |> combineN (fun f -> f 1 2 3) |> printfn "%b"
The compiler checks the types, including the number of arguments:
// calling a list of two-argument functions with more arguments doesn't compile
[filter1; filter2] |> combineN (fun f -> f 1 2 3) |> printfn "%b"
// mixing functions with different numbers of arguments doesn't compile either
[filter1; filter3] |> combineN (fun f -> f 1 2 3) |> printfn "%b"
See the demo.
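One caveat with the List.map and List.reduce combination is that List.reduce throws on an empty list and every filter is evaluated before the results are combined. A possible variation, sketched here with a hypothetical combineAll, uses List.forall, which returns true for an empty list and stops at the first filter that returns false.

```fsharp
// Hypothetical variant of combineN built on List.forall.
let combineAll invoke filters = filters |> List.forall invoke

let filter1 _ _ = true
let filter2 _ _ = false

let bothPass = [filter1; filter2] |> combineAll (fun f -> f 1 2)
let onlyFirst = [filter1] |> combineAll (fun f -> f 1 2)

// With no filters at all, the combined filter accepts everything.
let noFilters = [] |> combineAll (fun (f: int -> int -> bool) -> f 1 2)
```

The invoke lambda still carries the knowledge of how many arguments the filters take, so the compile-time arity check is unchanged.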

How do I do in F# what would be called compression in APL?

In APL one can use a bit vector to select out elements of another vector; this is called compression. For example 1 0 1/3 5 7 would yield 3 7.
Is there an accepted term for this in functional programming in general and F# in particular?
Here is my F# program:
let list1 = [|"Bob"; "Mary"; "Sue"|]
let list2 = [|1; 0; 1|]
[<EntryPoint>]
let main argv =
    0 // return an integer exit code
What I would like to do is compute a new string[] which would be [|"Bob"; "Sue"|]
How would one do this in F#?
Array.zip list1 list2 // [|("Bob",1); ("Mary",0); ("Sue",1)|]
|> Array.filter (fun (_,x) -> x = 1) // [|("Bob", 1); ("Sue", 1)|]
|> Array.map fst // [|"Bob"; "Sue"|]
The pipe operator |> does function application syntactically reversed, i.e., x |> f is equivalent to f x. As mentioned in another answer, replace Array with Seq to avoid the construction of intermediate arrays.
I expect you'll find many APL primitives missing from F#. For lists and sequences, many can be constructed by stringing together primitives from the Seq, Array, or List modules, like the above. For reference, here is an overview of the Seq module.
I think the easiest is to use an array sequence expression, something like this:
// Type annotations are needed so that .Length and indexing resolve.
let compress (bits: int[]) (values: 'a[]) =
    [|
        for i = 0 to bits.Length - 1 do
            if bits.[i] = 1 then
                yield values.[i]
    |]
If you only want to use combinators, this is what I would do:
Seq.zip bits values
|> Seq.choose (fun (bit, value) ->
    if bit = 1 then Some value else None)
|> Array.ofSeq
I use Seq functions instead of Array functions in order to avoid building intermediate arrays, but the Array versions would be correct too.
One might say this is more idiomatic:
Seq.map2 (fun l1 l2 -> if l2 = 1 then Some(l1) else None) list1 list2
|> Seq.choose id
|> Seq.toArray
EDIT (for the pipe lovers)
(list1, list2)
||> Seq.map2 (fun l1 l2 -> if l2 = 1 then Some(l1) else None)
|> Seq.choose id
|> Seq.toArray
Søren Debois' solution is good but, as he pointed out, we can do better. Let's define a function based on Søren's code:
let compressArray vals idx =
    Array.zip vals idx
    |> Array.filter (fun (_, x) -> x = 1)
    |> Array.map fst
compressArray ends up creating a new array in each of the 3 lines. This can take some time, if the input arrays are long (1.4 seconds for 10M values in my quick test).
We can save some time by working on sequences and creating an array at the end only:
let compressSeq vals idx =
    Seq.zip vals idx
    |> Seq.filter (fun (_, x) -> x = 1)
    |> Seq.map fst
This function is generic and will work on arrays, lists, etc. To generate an array as output:
compressSeq sq idx |> Seq.toArray
The latter saves about 40% of computation time (0.8s in my test).
As ildjarn commented, the function argument to filter can be rewritten to snd >> (=) 1, although that causes a slight performance drop (< 10%), probably because of the extra function call that is generated.
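Putting the pieces together, a reusable version might look like the following sketch. The name compress and the choice of a Boolean mask are assumptions made for illustration; an integer bit vector such as list2 can be converted with Seq.map ((=) 1).

```fsharp
// Keeps the values whose corresponding mask entry is true.
let compress (mask: seq<bool>) (values: seq<'a>) : 'a[] =
    Seq.zip mask values
    |> Seq.choose (fun (keep, value) -> if keep then Some value else None)
    |> Seq.toArray

let list1 = [| "Bob"; "Mary"; "Sue" |]
let list2 = [| 1; 0; 1 |]

// Convert the 0/1 vector into a Boolean mask before compressing.
let selected = compress (list2 |> Seq.map ((=) 1)) list1
```

One design point to be aware of: Seq.zip stops at the end of the shorter input, whereas Array.zip raises an exception when the lengths differ, so the sequence-based version silently ignores surplus elements.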

Resources