How to skip graph chunks with XPlot/Plotly in F#?

When I draw a graph with XPlot, I provide lists of values like this:
Scatter(x = [1;2;3;4], y = [10;20;30;40])
But what if I want to skip some parts of the graph?
How can I do the equivalent of:
Scatter(x = [1;2;3;4], y = [10;20;nan;40])
so that the line has a gap where the nan is?

I think it works exactly the same way in XPlot. Try this:
Scatter(x = [1.0;2.0;3.0;4.0], y = [10.0;20.0;nan;40.0])
|> Chart.Plot
|> Chart.Show
We can also use Martin's idea to make this work for any numeric data type:
let toFloat = function
    | Some value -> float value
    | None -> nan
let scatter x y =
    let toFloats = Seq.map toFloat
    Scatter(x = toFloats x, y = toFloats y)
scatter
    [Some 1m; Some 2m; Some 3m; Some 4m]
    [Some 10m; Some 20m; None; Some 40m]
|> Chart.Plot
|> Chart.Show
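One caveat, as far as I can tell: without `inline`, a helper like toFloat gets pinned to a single numeric type by whatever call the compiler sees. Here is a minimal sketch of a generic variant. It leaves XPlot out entirely and only shows the conversion; the names fromInts, fromDecimals and gapCount are made up for the illustration.

```fsharp
open System

// `inline` lets the compiler resolve `float` separately at each call site,
// so one definition accepts int, decimal, and other numeric types.
let inline toFloat value =
    match value with
    | Some v -> float v
    | None -> nan

// Hypothetical sample data: missing points are None and become nan.
let fromInts = [Some 1; None; Some 3] |> List.map toFloat
let fromDecimals = [Some 10m; Some 20m; None] |> List.map toFloat

// Count the gaps that a plotting library would see as nan.
let gapCount = fromInts |> List.filter (fun v -> Double.IsNaN v) |> List.length

printfn "%A" fromInts
printfn "%A" fromDecimals
```

The resulting float lists can then be passed as x and y in the same way as in the answer above.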

Related

Skip null values in seq<Nullable<int>> in FSharp.Charting

I have seq<Nullable<int>> and need to create nice plot, but without null values.
Here is my code:
open System
#r """..\packages\FSharp.Charting.0.90.14\lib\net40\FSharp.Charting.dll"""
#load """..\packages\FSharp.Charting.0.90.14\FSharp.Charting.fsx"""
open FSharp.Charting
//in a real world replaced by .csv with empty values
let seqWithNullInt = seq[Nullable 10 ; Nullable 20 ; Nullable (); Nullable 40; Nullable 50]
//let seqWithNullInt = seq[ 10 ; 20 ; 30; 40; 50] //works fine
let bothSeq = seqWithNullInt |> Seq.zip {1..5}
Chart.Line bothSeq // Error because of nullable int
And here is my vision:
How can I skip null values? I don't want to replace them with the nearest value or anything like that; I need to leave them out of the chart. Is there any solution?
Something like this might work (note that I've used Option values instead of nullables, since that's more idiomatic in F#):
let neitherPairHasNoneInValue (pair1, pair2) =
    pair1 |> snd |> Option.isSome && pair2 |> snd |> Option.isSome
let seqWithNone = Seq.ofList [Some 10; Some 20; None; Some 40; Some 50]
let pairsWithoutNone =
    seqWithNone
    |> Seq.zip {1..5}
    |> Seq.pairwise
    |> Seq.filter neitherPairHasNoneInValue
printfn "%A" pairsWithoutNone
This keeps only the pairs of neighbouring points where both values are present, so printing the sequence shows ((1, Some 10), (2, Some 20)) and ((4, Some 40), (5, Some 50)). I don't know the FSharp.Charting API offhand so I can't tell you which function will take those X,Y pairs and draw the graph you want, but it should be relatively straightforward to get from there to your graph.
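If it is easier to work with whole line segments than with pairs of neighbours, here is a rough sketch that splits the points into runs of consecutive present values. The function name splitIntoSegments is made up for this example, and it does no charting; I believe each run could then be drawn as its own line series (FSharp.Charting has Chart.Combine for putting several charts together), but I have not tested that part.

```fsharp
// Split (x, y option) points into runs of consecutive points that have a value.
let splitIntoSegments points =
    // Push the run collected so far (stored in reverse) onto the result, if non-empty.
    let flush current segments =
        match current with
        | [] -> segments
        | _ -> List.rev current :: segments
    // A missing value closes the current run; a present value extends it.
    let step (current, segments) (x, y) =
        match y with
        | Some value -> ((x, value) :: current, segments)
        | None -> ([], flush current segments)
    let current, segments = points |> Seq.fold step ([], [])
    flush current segments |> List.rev

let values = [Some 10; Some 20; None; Some 40; Some 50]
let segments = values |> Seq.zip (seq { 1 .. 5 }) |> splitIntoSegments

printfn "%A" segments
```

With the sample data this gives two runs, one before the missing value and one after it.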

MathNumerics.LinearAlgebra Matrix.mapRows dimensionality issues

So I have verified that the basic version of what I'm trying to do works, but for some reason it breaks down when I put it into the Matrix.mapRows higher-order function.
Here is the failing function:
let SumSquares (theta:Vector<float>) (y:Vector<float>) (trainingData:Matrix<float>) =
    let m = trainingData.RowCount
    let theta' = theta.ToRowMatrix()
    trainingData
    |> Matrix.mapRows(fun a r -> (theta' * r) - y.[a] )
Here are some sample tests
Set up:
let tData = matrix [[1.0; 2.0]
                    [1.0; 3.0]
                    [1.0; 3.0]
                    [1.0; 4.0]]
let yVals = vector [5.0; 6.0; 7.0; 11.0]
let theta = vector [1.0; 0.2]
Test raw functionality of basic operation (theta transpose * vector - actual)
let theta' = theta.ToRowMatrix()
(theta.ToRowMatrix() * tData.[0, 0 .. 1]) - yVals.[0]
Testing in actual function:
tData |> SumSquares theta yVals
Here is a copy/paste of the actual error. It reads as though it's having trouble with me mapping a larger vector to a smaller vector.
Parameter name: target
at MathNet.Numerics.LinearAlgebra.Storage.VectorStorage`1.CopyToRow(MatrixStorage`1 target, Int32 rowIndex, ExistingData existingData)
at FSI_0061.SumSquares(Vector`1 theta, Vector`1 y, Matrix`1 trainingData) in C:\projects\deleteme\ASPNet5Test\ConsoleApplication1\ConsoleApplication1\MachineLearning.fsx:line 23
at .$FSI_0084.main@() in C:\projects\deleteme\ASPNet5Test\ConsoleApplication1\ConsoleApplication1\MachineLearning.fsx:line 39
Stopped due to error
I found an even better, easier way to do this. I have to credit s952163 for starting me down a good path, but this approach is even more optimized:
let square (x:Vector<float>) = x * x
let subtract (x:Vector<float>) (y:Vector<float>) = y - x
let divideBy (x:float) (y:float) = y / x
let SumSquares (theta:Vector<float>) (y:Vector<float>) (trainingData:Matrix<float>) =
    let m = trainingData.RowCount |> float
    (trainingData * theta)
    |> subtract y
    |> square
    |> divideBy m
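To sanity-check the arithmetic by hand, here is a minimal sketch of the same calculation on plain arrays, without Math.NET. The name meanSquaredError and the array layout are assumptions for the illustration; the numbers are the sample data from the question.

```fsharp
// Plain-array version of the calculation: the sum of squared residuals divided by the row count.
let meanSquaredError (theta: float[]) (y: float[]) (rows: float[][]) =
    let m = float rows.Length
    let squaredResiduals =
        rows
        |> Array.mapi (fun i row ->
            // Dot product of theta with one row of training data.
            let predicted = Array.fold2 (fun acc t x -> acc + t * x) 0.0 theta row
            let residual = predicted - y.[i]
            residual * residual)
    Array.sum squaredResiduals / m

let rows = [| [| 1.0; 2.0 |]; [| 1.0; 3.0 |]; [| 1.0; 3.0 |]; [| 1.0; 4.0 |] |]
let result = meanSquaredError [| 1.0; 0.2 |] [| 5.0; 6.0; 7.0; 11.0 |] rows

printfn "%f" result
```

The squared residuals come out as roughly 12.96, 19.36, 29.16 and 84.64, so the result should be close to 36.53.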
Since you know the number of rows you can just map to that. Arguably this is not pretty:
let SumSquares (theta:Vector<float>) (y:Vector<float>) (trainingData:Matrix<float>) =
    let m = trainingData.RowCount
    let theta' = theta.ToRowMatrix()
    [0..m-1] |> List.map (fun i -> (((theta' * trainingData.[i,0..1]) |> Seq.exactlyOne) - y.[i] ))
Edit:
My guess is that mapRows wants everything to be in the same shape, and your output vector is different. So if you want to stick to the Vector type, this will just enumerate the indexed rows:
tData.EnumerateRowsIndexed() |> Seq.map (fun (i,r) -> (theta' * r) - yVals.[i])
and you can also use Matrix.toRowSeqi if you prefer to pipe it through, and get back a Matrix:
tData
|> Matrix.toRowSeqi
|> Seq.map (fun (i,x) -> (theta' * x) - yVals.[i])
|> DenseMatrix.ofRowSeq

How to cumulate (scan) Deedle data frame values

I'm loading a sequence of records into a deedle data frame (from a database table). Is it possible to accumulate (for example sum cumulatively) the values, and get back a data frame? For example there is Series.scanValues but there is no Frame.scanValues. There is Frame.map, but it didn't do what I expected, it left all values as they were.
#if INTERACTIVE
#r @"Fsharp.Charting"
#load @"..\..\Deedle.fsx"
#endif
open System
open FSharp.Charting
open FSharp.Charting.ChartTypes
open Deedle
type SeriesX = {
    DataDate:DateTime
    Series1:float
    Series2:float
    Series3:float
    }
let rnd = new System.Random()
rnd.NextDouble() - 0.5
let data =
    [for i in [100 .. -1 .. 1] ->
        {SeriesX.DataDate = DateTime.Now.AddDays(float -i)
         SeriesX.Series1 = rnd.NextDouble() - 0.5
         SeriesX.Series2 = rnd.NextDouble() - 0.5
         SeriesX.Series3 = rnd.NextDouble() - 0.5
        }
    ]
// now comes the deedle frame:
let df = data |> Frame.ofRecords
let df = df.IndexRows<DateTime>("DataDate")
df.["Series1"] |> Chart.Line
df.["Series1"].ScanValues((fun acc x -> acc + x),0.0) |> Chart.Line
let df' = df |> Frame.mapValues (Seq.scan (fun acc x -> acc + x) 0.0)
df'.["Series1"] |> Chart.Line
The last two lines just give me back the original values, while I would like to have the accumulated values, like in df.["Series1"].ScanValues, for Series1, Series2, and Series3.
For filtering and projection, series provides Where and Select methods
and corresponding Series.map and Series.filter functions (there is
also Series.mapValues and Series.mapKeys if you only want to transform
one aspect).
So you just apply your function to each Series:
let allSum =
    df.Columns
    |> Series.mapValues(Series.scanValues(fun acc v -> acc + (v :?> float)) 0.0)
    |> Frame.ofColumns
and then use Frame.ofColumns to convert the result back to a Frame.
Edit:
If you need to select only the numeric columns, you can use Frame.getNumericCols:
let allSum =
    df
    |> Frame.getNumericCols
    |> Series.mapValues(Series.scanValues (+) 0.0)
    |> Frame.ofColumns
Without the explicit type cast the code is also cleaner :)
There is a Series.scanValues function. You can obtain a series from every column in your data frame like this: frame?column, which gets you a Series.
If you need all the columns at once to do the scan, you could first map each row into a single value (a tuple, for example) and then apply Series.scanValues to that new column.
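To make the "scan each column separately, then put the columns back together" idea concrete without needing Deedle, here is a small sketch that uses a plain Map of lists as a stand-in for the frame. This is not Deedle's Frame type, and the names columns, cumulativeSum and cumulative are made up for the example.

```fsharp
// Stand-in for a frame: column name -> list of values (not Deedle's Frame type).
let columns =
    Map.ofList [ ("Series1", [ 1.0; 2.0; 3.0 ])
                 ("Series2", [ 0.5; -0.5; 1.0 ]) ]

// List.scan also emits the initial state, so drop it to keep one value per row.
let cumulativeSum values = values |> List.scan (+) 0.0 |> List.tail

// Apply the scan to each column on its own and rebuild the same shape.
let cumulative = columns |> Map.map (fun _ values -> cumulativeSum values)

printfn "%A" cumulative
```

The point is only that the scan runs down each column independently, which is what mapping Series.scanValues over the columns does in the answers above.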

Sampling in F# : is Set adequate?

I have an array of items, from which I'd like to sample.
I was under the impression that a Set would be a good structure to sample from, in a fold where I'd give back the original or a modified set with the retrieved element missing, depending on whether I want replacement or not.
However, there seems to be no method to retrieve an element directly from a Set.
Is there something I am missing? Or should I use a Set of indices, along with a surrogate function that starts at some random position < Set.count and goes up until it finds a member?
That is, something along this line
module Seq =
    let modulo (n:int) start =
        let rec next i = seq { yield (i + 1)%n ; yield! next (i+1)}
        next start
module Array =
    let Sample (withReplacement:bool) seed (entries:'T array) =
        let prng, indexes = new Random(seed), Set(Seq.init (entries |> Array.length) id)
        Seq.unfold (fun set ->
            let N = set |> Set.count
            let next = Seq.modulo N (prng.Next(N)) |> Seq.truncate N |> Seq.tryFind(fun i -> set |> Set.exists ((=) i))
            if next.IsSome then
                Some(entries.[next.Value], if withReplacement then set else Set.remove next.Value set)
            else
                None)
Edit : Tracking positively what I gave, instead of tracking what I still can give would make it simpler and more efficient.
For sampling without replacement, you could just permute the source seq and take however many elements you want to sample:
// shared random number generator used by both functions
let rnd = System.Random()
let sampleWithoutReplacement n s =
    let a = Array.ofSeq s
    seq { for i = a.Length downto 1 do
            let j = rnd.Next i
            yield a.[j]
            a.[j] <- a.[i - 1] }
    |> Seq.take n
To sample with replacement, just pick a random element n times from the source seq:
let sampleWithReplacement n s =
    let a = Array.ofSeq s
    Seq.init n (fun _ -> a.[rnd.Next(a.Length)])
These may not be the most efficient methods with huge data sets, however.
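For repeatable runs, here is a sketch of the same partial-shuffle idea with the generator passed in as a parameter. That parameter, the seed 42 and the names sample and allDistinct are my own additions for the example, not part of the answer above. The result is turned into a list because the seq mutates its working array and would produce a different draw each time it is enumerated.

```fsharp
open System

// Same partial-shuffle idea, but the generator is an explicit argument
// (a hypothetical variation) so that a fixed seed makes runs repeatable.
let sampleWithoutReplacement (rnd: Random) n source =
    let a = Array.ofSeq source
    seq { for i = a.Length downto 1 do
            // Pick one of the first i slots, then overwrite it with the last live slot.
            let j = rnd.Next i
            yield a.[j]
            a.[j] <- a.[i - 1] }
    |> Seq.take n
    |> Seq.toList

let sample = sampleWithoutReplacement (Random 42) 5 [ 1 .. 20 ]

// Without replacement, no element may appear twice.
let allDistinct = (List.distinct sample).Length = sample.Length

printfn "%A (distinct: %b)" sample allDistinct
```

Each picked slot is overwritten with an element that has not been picked yet, which is why no element can come out twice.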
Continuing our comments...if you want to randomly sample a sequence without slurping the entire thing into memory you could generate a set of random indices the size of your desired sample (not too different from what you already have):
let rand count max =
    System.Random()
    |> Seq.unfold (fun r -> Some(r.Next(max), r))
    |> Seq.distinct
    |> Seq.take count
    |> set
let takeSample sampleSize inputSize input =
    let indices = rand sampleSize inputSize
    input
    |> Seq.mapi (fun idx x ->
        if Set.contains idx indices then Some x else None)
    |> Seq.choose id
let inputSize = 100000
let input = Seq.init inputSize id
let sample = takeSample 50 inputSize input
printfn "%A" (Seq.toList sample)

Lexicographic sorting in F#

I am playing with a toy problem (Convex hull identification) and needed lexicographic sorting twice already. One of the cases was given a list of type Point = { X: float; Y: float }, I would like to sort by X coordinate, and in case of equality, by Y coordinate.
I ended up writing the following:
let rec lexiCompare comparers a b =
    match comparers with
    | [ ] -> 0
    | head :: tail ->
        if not (head a b = 0) then head a b else
        lexiCompare tail a b
let xComparer p1 p2 =
    if p1.X > p2.X then 1 else
    if p1.X < p2.X then -1 else
    0
let yComparer p1 p2 =
    if p1.Y > p2.Y then 1 else
    if p1.Y < p2.Y then -1 else
    0
let coordCompare =
    lexiCompare [ yComparer; xComparer ]
Which allows me to do
let lowest (points: Point list) =
    List.sortWith coordCompare points
    |> List.head
So far, so good. However, this feels a bit heavy-handed. I have to create specific comparers returning -1, 0 or 1, and so far I can't see a straightforward way to use this in cases like List.minBy. Ideally, I would like to do something along the lines of providing a list of functions that can be compared (like [(fun p -> p.X); (fun p -> p.Y)]) and do something like lexicographic min of a list of items supporting that list of functions.
Is there a way to achieve this in F#? Or am I thinking about this incorrectly?
F# does this for you automatically when you define a record type like yours:
> type Point = { X: float; Y: float };;
type Point =
  {X: float;
   Y: float;}
You can immediately start comparing values. For example, defining a 3-element list of points and sorting it into lexicographic order using the built-in List.sort:
> [ { X = 2.0; Y = 3.0 }
    { X = 2.0; Y = 2.0 }
    { X = 1.0; Y = 3.0 } ]
  |> List.sort;;
val it : Point list = [{X = 1.0;
                        Y = 3.0;}; {X = 2.0;
                                    Y = 2.0;}; {X = 2.0;
                                                Y = 3.0;}]
Note that the results were sorted first by X and then by Y.
You can compare two values of any comparable type using the built-in compare function.
If you want to use a custom ordering then you have two options. If you want to do all of your operations using your custom total order then it belongs in the type definition as an implementation of IComparable and friends. If you want to use a custom ordering for a few operations then you can use higher-order functions like List.sortBy and List.sortWith. For example, List.sortBy (fun p -> p.Y, p.X) will sort by Y and then X because F# generates the lexicographic comparison over 2-tuples for you (!).
This is one of the big advantages of F#.
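Since the question asked about List.minBy specifically, here is a small sketch of both approaches side by side. The sample points and the names leftmost and lowest are made up for the illustration.

```fsharp
type Point = { X: float; Y: float }

let points =
    [ { X = 2.0; Y = 3.0 }
      { X = 2.0; Y = 2.0 }
      { X = 1.0; Y = 3.0 } ]

// Record comparison follows field declaration order: X first, then Y.
let leftmost = List.min points

// A tuple key gives a different lexicographic order: Y first, then X.
let lowest = points |> List.minBy (fun p -> p.Y, p.X)

printfn "%A %A" leftmost lowest
```

The tuple key means there is no need to write a comparer returning -1, 0 or 1 at all; the projection decides the order of the fields.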
Well, to start with, you can rely on F#'s built-in compare function:
let xComparer p1 p2 = compare p1.X p2.X
let yComparer p1 p2 = compare p1.Y p2.Y
Alternatively, you can clearly abstract this a bit if desired:
let compareWith f a b = compare (f a) (f b)
let xComparer = compareWith (fun p -> p.X)
let yComparer = compareWith (fun p -> p.Y)
Or, as you note, you could build this approach directly into the list handling function:
let rec lexiCompareWith l a b =
    match l with
    | [] -> 0
    | f::fs ->
        match compare (f a) (f b) with
        | 0 -> lexiCompareWith fs a b
        | n -> n
One important limitation here is that since you're putting them into a list, the functions must all have identical return types. This isn't a problem in your Point example (since both functions have type Point -> float), but it would prevent you from sorting two Person objects by name and then age (since the first projection would have type Person -> string but the second would have type Person -> int).
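One way around that limitation, as a sketch: put comparers in the list instead of projections. After compareWith is applied, every element has the same type no matter what the key type was. The Person type, the sample people and the name byNameThenAge are made up for this example.

```fsharp
type Person = { Name: string; Age: int }

let compareWith f a b = compare (f a) (f b)

// Walk a list of comparers and return the first non-zero result.
let rec lexiCompare comparers a b =
    match comparers with
    | [] -> 0
    | head :: tail ->
        match head a b with
        | 0 -> lexiCompare tail a b
        | n -> n

// Each element has type Person -> Person -> int, whatever the key type was.
let byNameThenAge : (Person -> Person -> int) list =
    [ compareWith (fun p -> p.Name)
      compareWith (fun p -> p.Age) ]

let people =
    [ { Name = "Bob"; Age = 40 }
      { Name = "Alice"; Age = 35 }
      { Name = "Alice"; Age = 30 } ]

let sorted = people |> List.sortWith (lexiCompare byNameThenAge)

printfn "%A" sorted
```

For a fixed set of keys the tuple approach, List.sortBy (fun p -> p.Name, p.Age), is shorter; the list of comparers is mainly useful when the set of keys is only known at run time.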
I don't think I understand your question correctly, but doesn't the following code work fine?
let lowest (points : Point list) = List.sort points |> List.head
It seems that F# performs implicit comparison on record data types. And my little experiment indicates that the comparison happens to be lexicographic. But I could not find any evidence to support that result.
So I'm not yet sure F# compares records lexicographically. I can still write in the following manner using tuple instead:
let lowest (points : Point list) =
    let tuple = List.map (fun pt -> (pt.X, pt.Y)) points |> List.sort |> List.head
    { X = fst tuple; Y = snd tuple }
I hope this post could help.
