Deedle IndexRows type annotations - f#

I was trying to implement a Deedle solution for the little challenge from #migueldeicaza to achieve in F# what was done in http://t.co/4YFXk8PQaU with python and R. The csv source data is available from the link.
The start is simple but now, while trying to order based upon a column series of float values I'm struggling to understand the syntax for the IndexRows type annotation.
#I "../packages/FSharp.Charting.0.90.5"
#I "../packages/Deedle.0.9.12"
#load "FSharp.Charting.fsx"
#load "Deedle.fsx"
open System
open Deedle
open FSharp.Charting
let bodyCountData = Frame.ReadCsv(__SOURCE_DIRECTORY__ + "/film_death_counts.csv")
bodyCountData?DeathsPerMinute <- bodyCountData?Body_Count / bodyCountData?Length_Minutes
// select top 3 rows based upon default ordinal indexer
bodyCountData.Rows.[0..3]
// create a new frame indexed and ordered by descending number of screen deaths per minute
let bodyCountDataOrdered =
bodyCountData
|> Frame.indexRows <float>"DeathsPerMinute" // uh oh error here - I'm confused
And because I can't figure that syntax out... various messages like:
Error 1 The type '('a -> Frame<'c,Frame<int,string>>)' does not support the 'comparison' constraint. For example, it does not support the 'System.IComparable' interface. See also c:\wd\RPythonFSharpDFChallenge\RPythonFSharpDFChallenge\EvilMovieQuery.fsx(18,4)-(19,22). c:\wd\RPythonFSharpDFChallenge\RPythonFSharpDFChallenge\EvilMovieQuery.fsx 19 8 RPythonFSharpDFChallenge
Error 2 Type mismatch. Expecting a
'a -> Frame<'c,Frame<int,string>>
but given a
'a -> float
The type 'Frame<'a,Frame<int,string>>' does not match the type 'float' c:\wd\RPythonFSharpDFChallenge\RPythonFSharpDFChallenge\EvilMovieQuery.fsx 19 25 RPythonFSharpDFChallenge
Error 3 This expression was expected to have type
bool
but here has type
string c:\wd\RPythonFSharpDFChallenge\RPythonFSharpDFChallenge\EvilMovieQuery.fsx 19 31 RPythonFSharpDFChallenge
Edit: Just thinking about this... indexing on a measured float is a silly thing to do anyway - duplicates and missing values in real world data. So, I wonder what a more sensible approach to this would be. I still need to find the 25 max values... Maybe I can work this out for myself...

With Deedle 1.0, you can sort on an arbitrary column.
See: http://bluemountaincapital.github.io/Deedle/reference/deedle-framemodule.html#section7

Related

Excel DNA UDF obtain unprocessed values as inputs

I have written several helper functions in F# that enable me to deal with the dynamic nature of Excel over the COM/PIA interface. However when I go to use these functions in an Excel-DNA UDF they do not work as expected as Excel-DNA is pre-processing the values in the array from excel.
e.g. null is turned into ExcelDna.Integration.ExcelEmpty
This interferes with my own validation code that was anticipating a null. I am able to work around this by adding an additional case to my pattern matching:
let (|XlEmpty|_|) (x: obj) =
match x with
| null -> Some XlEmpty
| :? ExcelDna.Integration.ExcelEmpty -> Some XlEmpty
| _ -> None
However it feels like a waste to convert and then convert again. Is there a way to tell Excel-DNA not to do additional processing of the range values in a UDF and supply them equivalent to the COM/PIA interface? i.e. Range.Value XlRangeValueDataType.xlRangeValueDefault
EDIT:
I declare my arguments as obj like this:
[<ExcelFunction(Description = "Validates a Test Table Row")>]
let isTestRow (headings: obj) (row: obj) =
let validator = TestTable.validator
let headingsList = TestTable.testHeadings
validateRow validator headingsList headings row
I have done some more digging and #Jim Foye's suggested question also confirms this. For UDF's, Excel-DNA works over the C API rather than COM and therefore has to do its own marshaling. The possible values are shown in this file:
https://github.com/Excel-DNA/ExcelDna/blob/2aa1bd9afaf76084c1d59e2330584edddb888eb1/Distribution/Reference.txt
The reason to use ExcelEmpty (the user supplied an empty cell) is that for a UDF, the argument can also be ExcelMissing (the user supplied no argument) which might both be reasonably null and there is a need to disambiguate.
I will adjust my pattern matching to be compatible with both the COM marshaling and the ExcelDNA marshaling.

Other ways to call/eval dynamic strings in Lua?

I am working with a third party device which has some implementation of Lua, and communicates in BACnet. The documentation is pretty janky, not providing any sort of help for any more advanced programming ideas. It's simply, "This is how you set variables...". So, I am trying to just figure it out, and hoping you all can help.
I need to set a long list of variables to certain values. I have a userdata 'ME', with a bunch of variables named MVXX (e.g. - MV21, MV98, MV56, etc).
(This is all kind of background for BACnet.) Variables in BACnet all have 17 'priorities', i.e., every BACnet variable is actually a sort of list of 17 values, with priority 16 being the default. So, typically, if I were to say ME.MV12 = 23, that would set MV12's priority-16 to the desired value of 23.
However, I need to set priority 17. I can do this in the provided Lua implementation, by saying ME.MV12_PV[17] = 23. I can set any of the priorities I want by indexing that PV. (Corollaries - what is PV? What is the underscore? How do I get to these objects? Or are they just interpreted from Lua to some function in C on the backend?)
All this being said, I need to make that variable name dynamic, so that i can set whichever value I need to set, based on some other code. I have made several attempts.
This tells me the object(MV12_PV[17]) does not exist:
x = 12
ME["MV" .. x .. "_PV[17]"] = 23
But this works fine, setting priority 16 to 23:
x = 12
ME["MV" .. x] = 23
I was trying to attempt some sort of what I think is called an evaluation, or eval. But, this just prints out function followed by some random 8 digit number:
x = 12
test = assert(loadstring("MV" .. x .. "_PV[17] = 23"))
print(test)
Any help? Apologies if I am unclear - tbh, I am so far behind the 8-ball I am pretty much grabbing at straws.
Underscores can be part of Lua identifiers (variable and function names). They are just part of the variable name (like letters are) and aren't a special Lua operator like [ and ] are.
In the expression ME.MV12_PV[17] we have ME being an object with a bunch of fields, ME.MV12_PV being an array stored in the "MV12_PV" field of that object and ME.MV12_PV[17] is the 17th slot in that array.
If you want to access fields dynamically, the thing to know is that accessing a field with dot notation in Lua is equivalent to using bracket notation and passing in the field name as a string:
-- The following are all equivalent:
x.foo
x["foo"]
local fieldname = "foo"
x[fieldname]
So in your case you might want to try doing something like this:
local n = 12
ME["MV"..n.."_PV"][17] = 23
BACnet "Commmandable" Objects (e.g. Binary Output, Analog Output, and o[tionally Binary Value, Analog Value and a handful of others) actually have 16 priorities (1-16). The "17th" you are referring to may be the "Relinquish Default", a value that is used if all 16 priorities are set to NULL or "Relinquished".
Perhaps your system will allow you to write to a BACnet Property called "Relinquish Default".

Format built-in types for pretty printing in Deedle

I understand that in order to pretty print things like discriminated unions in Deedle, you have to override ToString(). But what about built in types, like float?
Specifically, I want floats in one column to be displayed as percentages, or at the very least, to not have a million digits past the decimal.
Is there a way to do this?
There is no built-in support for doing this - it sounds like a useful addition, so if you want to contribute this to Deedle, please open an issue to discuss this! We'd be happy to accepta pull request that adds this feature.
As a workaround, I think your best chance is to transform the data in the frame before printing. Something like this should do the trick:
let df = frame [ "A" => series [ 1 => 0.001 ] ]
df |> Frame.map (fun r c (v:float) ->
if c = "A" then box (sprintf "%f%%" (v*100.0)) else box v)
This creates a new frame where all float values of a column named A are transformed using the formatting function sprintf "%f%%" (v*100.0) and the rest is left unchanged.

f# deedle filter data frame based on a list

I wanted to filter a Deedle dataframe based on a list of values how would I go about doing this?
I had an idea to use the following code below:
let d= df1|>filterRowValues(fun row -> row.GetAs<float>("ts") = timex)
However the issue with this is that it is only based on one variable, I then thought of combining this with a for loop and an append function:
for i in 0.. recd.length -1 do
df2.Append(df1|>filterRowValues(fun row -> row.GetAs<float>("ts") = recd.[i]))
This does not work either however and there must be a better way of doing this without using a for loop. In R I could for instance using an %in%.
You can use the F# set type to create a set of the values that you are interested. In the filtering, you can then check whether the set contains the actual value for the row.
For example, say that you have recd of type seq<float>. Then you should be able to write:
let recdSet = set recd
let d = df1 |> Frame.filterRowValues (fun row ->
recdSet.Contains(row.GetAs<float>("ts"))
Some other things that might be useful:
You can replace row.GetAs<float>("ts") with just row?ts (which always returns float and works only when you have a fixed name, like "ts", but it makes the code nicer)
Comparing float values might not be the best thing to do (because of floating point imprecisions, this might not always work as expected).

How do I turn this Extension Method into an Extension Property?

I have an extension method
type System.Int32 with
member this.Thousand() = this * 1000
but it requires me to write like this
(5).Thousand()
I'd love to get rid of both parenthesis, starting with making it a property instead of a method (for learning sake) how do I make this a property?
Jon's answer is one way to do it, but for a read-only property there's also a more concise way to write it:
type System.Int32 with
member this.Thousand = this * 1000
Also, depending on your preferences, you may find it more pleasing to write 5 .Thousand (note the extra space) than (5).Thousand (but you won't be able to do just 5.Thousand, or even 5.ToString()).
I don't really know F# (shameful!) but based on this blog post, I'd expect:
type System.Int32 with
member this.Thousand
with get() = this * 1000
I suspect that won't free you from the first set of parentheses (otherwise F# may try to parse the whole thing as a literal), but it should help you with the second.
Personally I wouldn't use this sort of thing for a "production" extension, but it's useful for test code where you're working with a lot of values.
In particular, I've found it neat to have extension methods around dates, e.g. 19.June(1976) as a really simple, easy-to-read way of building up test data. But not for production code :)
It's not beautiful, but if you really want a function that will work for any numeric type, you can do this:
let inline thousand n =
let one = LanguagePrimitives.GenericOne
let thousand =
let rec loop n i =
if i < 1000 then loop (n + one) (i + 1)
else n
loop one 1
n * thousand
5.0 |> thousand
5 |> thousand
5I |> thousand

Resources