Deedle: apply a function to Frame values [duplicate] - f#

I'm following a tutorial:
http://dkowalski.com/blog/archive/2014/01/11/f-deedle-and-computational-investing.aspx and when I try to apply normalization to all the columns of "stocks" Frame, using Frame.mapColValues, I obtain the following error message:
System.InvalidOperationException: OptionalValue.Value: Value is not available
in Deedle.OptionalValue``1.get_Value() in c:\Tomas\Public\Deedle\src\Deedle\Common\Common.fs:riga 35
in FSI_0046.normalized#52-8.Invoke(ObjectSeries``1 os) in C:\Users\Bruno-Astarita\Documents\Visual Studio 2013\Projects\TestDeedle02\TestDeedle02\TestTutorial01.fsx:riga 52
in <StartupCode$Deedle>.$Series.f#257-13[K,V,R](Func``3 f, Int32 i, K key, V v) in c:\Tomas\Public\Deedle\src\Deedle\Series.fs:riga 358
in <StartupCode$Deedle>.$Series.newVector#354-1.Invoke(Int32 i, Tuple``2 tupledArg) in c:\Tomas\Public\Deedle\src\Deedle\Series.fs:riga 355
in Microsoft.FSharp.Collections.IEnumerator.mapi#129.DoMoveNext(b& )
in Microsoft.FSharp.Collections.IEnumerator.MapEnumerator``1.System-Collections-IEnumerator-MoveNext()
in Microsoft.FSharp.Collections.SeqModule.ToArray[T](IEnumerable``1 source)
in Microsoft.FSharp.Collections.ArrayModule.OfSeq[T](IEnumerable``1 source)
in Deedle.Series``2.Select[R](Func``3 f) in c:\Tomas\Public\Deedle\src\Deedle\Series.fs:riga 352
in Deedle.Series``2.Select[R](Func``2 f) in c:\Tomas\Public\Deedle\src\Deedle\Series.fs:riga 365
in Deedle.SeriesModule.MapValues[T,R,K](FSharpFunc``2 f, Series``2 series) in c:\Tomas\Public\Deedle\src\Deedle\SeriesModule.fs:riga 451
in <StartupCode$FSI_0046>.$FSI_0046.main#() in C:\Users\Bruno-Astarita\Documents\Visual Studio 2013\Projects\TestDeedle02\TestDeedle02\TestTutorial01.fsx:riga 49
Stopped due to error
I tried to find the problem making a step-by-step procedure based on source code in FrameModule.fs, and the error raises at the instruction Series.mapValues f.
Where I'm wrong?
Many thanks.
Update
After many tests, the problem seems be arised by the row let firstItem = osAsFloat.GetAt(0) from the tutorial. If I try to substitute it with let firstItem = 2.0 everything is ok. Still is not clear for me this behaviour.

Finally I found the reason of problem. In some Series in the Frame, there are missing values in correspondence to the first element. If I do not use those Series, everything is ok.

Related

Creating a simple Rcpp package with dependency with other Rcpp package

I am trying to improve my loop computation speed by using foreach, but there is a simple Rcpp function I defined inside of this loop. I saved the Rcpp function as mproduct.cpp, and I call out the function simply using
sourceCpp("mproduct.cpp")
and the Rcpp function is a simple one, which is to perform matrix product in C++:
// [[Rcpp::depends(RcppArmadillo, RcppEigen)]]
#include <RcppArmadillo.h>
#include <RcppEigen.h>
// [[Rcpp::export]]
SEXP MP(const Eigen::Map<Eigen::MatrixXd> A, Eigen::Map<Eigen::MatrixXd> B){
Eigen::MatrixXd C = A * B;
return Rcpp::wrap(C);
}
So, the function in the Rcpp file is MP, referring to matrix product. I need to perform the following foreach loop (I have simplified the code for illustration):
foreach(j=1:n, .package='Rcpp',.noexport= c("mproduct.cpp"),.combine=rbind)%dopar%{
n=1000000
A<-matrix(rnorm(n,1000,1000))
B<-matrix(rnorm(n,1000,1000))
S<-MP(A,B)
return(S)
}
Since the size of matrix A and B are large, it is why I want to use foreach to alleviate the computational cost.
However, the above code does not work, since it provides me error message:
task 1 failed - "NULL value passed as symbol address"
The reason I added .noexport= c("mproduct.cpp") is to follow some suggestions from people who solved similar issues (Can't run Rcpp function in foreach - "NULL value passed as symbol address"). But somehow this does not solve my issue.
So I tried to install my Rcpp function as a library. I used the following code:
Rcpp.package.skeleton('mp',cpp_files = "<my working directory>")
but it returns me a warning message:
The following packages are referenced using Rcpp::depends attributes however are not listed in the Depends, Imports or LinkingTo fields of the package DESCRIPTION file: RcppArmadillo, RcppEigen
so when I tried to install my package using
install.packages("<my working directory>",repos = NULL,type='source')
I got the warning message:
Error in untar2(tarfile, files, list, exdir, restore_times) :
incomplete block on file
In R CMD INSTALL
Warning in install.packages :
installation of package ‘C:/Users/Lenovo/Documents/mproduct.cpp’ had non-zero exit status
So can someone help me out how to solve 1) using foreach with Rcpp function MP, or 2) install the Rcpp file as a package?
Thank you all very much.
The first step would be making sure that you are optimizing the right thing. For me, this would not be the case as this simple benchmark shows:
set.seed(42)
n <- 1000
A<-matrix(rnorm(n*n), n, n)
B<-matrix(rnorm(n*n), n, n)
MP <- Rcpp::cppFunction("SEXP MP(const Eigen::Map<Eigen::MatrixXd> A, Eigen::Map<Eigen::MatrixXd> B){
Eigen::MatrixXd C = A * B;
return Rcpp::wrap(C);
}", depends = "RcppEigen")
bench::mark(MP(A, B), A %*% B)[1:5]
#> # A tibble: 2 x 5
#> expression min median `itr/sec` mem_alloc
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt>
#> 1 MP(A, B) 277.8ms 278ms 3.60 7.63MB
#> 2 A %*% B 37.4ms 39ms 22.8 7.63MB
So for me the matrix product via %*% is several times faster than the one via RcppEigen. However, I am using Linux with OpenBLAS for matrix operations while you are on Windows, which often means reference BLAS for matrix operations. It might be that RcppEigen is faster on your system. I am not sure how difficult it is for Windows user to get a faster BLAS implementation (https://csgillespie.github.io/efficientR/set-up.html#blas-and-alternative-r-interpreters might contain some pointers), but I would suggest spending some time on investigating this.
Now if you come to the conclusion that you do need RcppEigen or RcppArmadillo in your code and want to put that code into a package, you can do the following. Instead of Rcpp::Rcpp.package.skeleton() use RcppEigen::RcppEigen.package.skeleton() or RcppArmadillo::RcppArmadillo.package.skeleton() to create a starting point for a package based on RcppEigen or RcppArmadillo, respectively.

Does GNU FORTH have an editor?

Chapter 3 of Starting FORTH says,
Now that you've made a block "current", you can list it by simply typing the word L. Unlike LIST, L does not want to be proceeded by a block number; instead it lists the current block.
When I run 180 LIST, I get
Screen 180 not modified
0
...
15
ok
But when I run L, I get an error
:30: Undefined word
>>>L<<<
Backtrace:
$7F0876E99A68 throw
$7F0876EAFDE0 no.extensions
$7F0876E99D28 interpreter-notfound1
What am I doing wrong?
Yes, gForth supports an internal (BLOCK) editor. Start gforth
type: use blocked.fb (a demo page)
type: 1 load
type editor
words will show the editor words,
s b n bx nx qx dl il f y r d i t 'par 'line 'rest c a m ok
type 0 l to list screen 0 which describes the editor,
Screen 0 not modified
0 \\ some comments on this simple editor 29aug95py
1 m marks current position a goes to marked position
2 c moves cursor by n chars t goes to line n and inserts
3 i inserts d deletes marked area
4 r replaces marked area f search and mark
5 il insert a line dl delete a line
6 qx gives a quick index nx gives next index
7 bx gives previous index
8 n goes to next screen b goes to previous screen
9 l goes to screen n v goes to current screen
10 s searches until screen n y yank deleted string
11
12 Syntax and implementation style a la PolyFORTH
13 If you don't like it, write a block editor mode for Emacs!
14
15
ok
Creating your own block file
To create your own new block file myblocks.fb
type: use blocked.fb
type: 1 load
type editor
Then
type use myblocks.fb
1 load will show BLOCK #1 (lines 0 till 15. 16 Lines of 64 characters each)
1 t will highlight line 1
Type i this is text to [i]nsert into line 1
After the current BLOCK is edited type flush in order to write BLOCK #1 to the file myblocks.fb
For more information see, gForth Blocks
It turns out these are "Editor Commands" the book says,
For Those Whose EDITOR Doesn't Follow These Rules
The FORTH-79 Standard does not specify editor commands. Your system may use a different editor; if so, check your systems documentation
I don't believe gforth supports an internal editor at all. So L, T, I, P, F, E, D, R are all presumably unsupported.
gforth is well integrated with emacs. In my xemacs here, by default any file called *.fs is considered FORTH source. "C-h m", as usual, gives the available commands.
No, GNU Forth doesn't have an internal editor; I use Vim :)

lua static analysis: detecting uninitialized table field

I'm using luacheck (within the Atom editor), but open to other static analysis tools.
Is there a way to check that I'm using an uninitialized table field? I read the docs (http://luacheck.readthedocs.io/en/stable/index.html) but maybe I missed how to do this?
In all three cases in the code below I'm trying to detect that I'm (erroneously) using field 'y1'. None of them do. (At run-time it is detected, but I'm trying to catch it before run-time).
local a = {}
a.x = 10
a.y = 20
print(a.x + a.y1) -- no warning about uninitialized field y1 !?
-- luacheck: globals b
b = {}
b.x = 10
b.y = 20
print(b.x + b.y1) -- no warning about uninitialized field y1 !?
-- No inline option for luacheck re: 'c', so plenty of complaints
-- about "non-standard global variable 'c'."
c = {} -- warning about setting
c.x = 10 -- warning about mutating
c.y = 20 -- " " "
print(c.x + c.y1) -- more warnings (but NOT about field y1)
The point is this: as projects grow (files grow, and the number & size of modules grow), it would be nice to prevent simple errors like this from creeping in.
Thanks.
lua-inspect should be able to detect and report these instances. I have it integrated into ZeroBrane Studio IDE and when running with the deep analysis it reports the following on this fragment:
unknown-field.lua:4: first use of unknown field 'y1' in 'a'
unknown-field.lua:7: first assignment to global variable 'b'
unknown-field.lua:10: first use of unknown field 'y1' in 'b'
unknown-field.lua:14: first assignment to global variable 'c'
unknown-field.lua:17: first use of unknown field 'y1' in 'c'
(Note that the integration code only reports first instances of these errors to minimize the number of instances reported; I also fixed an issue that only reported first unknown instance of a field, so you may want to use the latest code from the repository.)
People who look into questions related to "Lua static analysis" may also be interested in the various dialects of typed Lua, for example:
Typed Lua
Titan
Pallene
Ravi
But you may not have heard of "Teal". (early in its life it was called "tl"); .
I'm taking the liberty to answer my original question using Teal, since I find it intriguing.
-- 'record' (like a 'struct')
local Point = record
x : number
y : number
end
local a : Point = {}
a.x = 10
a.y = 20
print(a.x + a.y1) -- will trigger an error
-- (in VS Code using teal extension & at command line)
From command line:
> tl check myfile.tl
========================================
1 error:
myfile.tl:44:13: invalid key 'y1' in record 'a'
By the way...
> tl gen myfile.tl'
creates a pure Lua file: 'myfile.lua' that has no type information in it. Note: running this Lua file will trigger the 'nil' error... lua: myfile.lua:42: attempt to index a nil value (local 'a').
So, Teal gives you a chance to catch 'type' errors, but it doesn't require you to fix them before generating Lua files.

F# Compiler Services incorrectly parses program

UPDATE:
I now realize that the question was stupid, I should have just filed the issue. In hindsight, I don't see why I even asked this question.
The issue is here: https://github.com/fsharp/FSharp.Compiler.Service/issues/544
Original question:
I'm using FSharp Compiler Services for parsing some F# code.
The particular piece of code that I'm facing right now is this:
let f x y = x+y
let g = f 1
let h = (g 2) + 3
This program yields a TAST without the (+) call on the last line. That is, the compiler service returns TAST as if the last line was just let h = g 2.
The question is: is this is a legitimate bug that I ought to report or am I missing something?
Some notes
Here is a repo containing minimal repro (I didn't want to include it in this question, because Compiler Services require quite a bit of dancing around).
Adding more statements after the let h line does not change the outcome.
When compiled to IL (as opposed to parsed with Compiler Services), it seems to work as expected (e.g. see fiddle)
If I make g a value, the program parses correctly.
If I make g a normal function (rather than partially applied one), the program parses correctly.
I have no priori experience with FSharp.Compiler.Services but nevertheless I did a small investigation using Visual Studio's debugger. I analyzed abstract syntax tree of following string:
"""
module X
let f x y = x+y
let g = f 1
let h = (g 2) + 3
"""
I've found out that there's following object inside it:
App (Val (op_Addition,NormalValUse,D:\file.fs (6,32--6,33) IsSynthetic=false),TType_forall ([T1; T2; T3],TType_fun (TType_var T1,TType_fun (...,...))),...,...,...)
As you can see, there's an addition in 6th line between characters 32 and 33.
The most likely explanation why F# Interactive doesn't display it properly is a bug in a library (maybe AST is in an inconsistent state or pretty-printing is broken). I think that you should file a bug in project's issue tracker.
UPDATE:
Aforementioned object can be obtained in a debbuger in a following way:
error.[0]
(option of Microsoft.FSharp.Compiler.SourceCodeServices.FSharpImplementationFileDeclaration.Entity)
.Item2
.[2]
(option of Microsoft.FSharp.Compiler.SourceCodeServices.FSharpImplementationFileDeclaration.MemberOrFunctionOrValue)
.Item3
.f (private member)
.Value
(option of Microsoft.FSharp.Compiler.SourceCodeServices.FSharpExprConvert.ConvExprOnDemand#903)
.expr

Deedle IndexRows type annotations

I was trying to implement a Deedle solution for the little challenge from #migueldeicaza to achieve in F# what was done in http://t.co/4YFXk8PQaU with python and R. The csv source data is available from the link.
The start is simple but now, while trying to order based upon a column series of float values I'm struggling to understand the syntax for the IndexRows type annotation.
#I "../packages/FSharp.Charting.0.90.5"
#I "../packages/Deedle.0.9.12"
#load "FSharp.Charting.fsx"
#load "Deedle.fsx"
open System
open Deedle
open FSharp.Charting
let bodyCountData = Frame.ReadCsv(__SOURCE_DIRECTORY__ + "/film_death_counts.csv")
bodyCountData?DeathsPerMinute <- bodyCountData?Body_Count / bodyCountData?Length_Minutes
// select top 3 rows based upon default ordinal indexer
bodyCountData.Rows.[0..3]
// create a new frame indexed and ordered by descending number of screen deaths per minute
let bodyCountDataOrdered =
bodyCountData
|> Frame.indexRows <float>"DeathsPerMinute" // uh oh error here - I'm confused
And because I can't figure that syntax out... various messages like:
Error 1 The type '('a -> Frame<'c,Frame<int,string>>)' does not support the 'comparison' constraint. For example, it does not support the 'System.IComparable' interface. See also c:\wd\RPythonFSharpDFChallenge\RPythonFSharpDFChallenge\EvilMovieQuery.fsx(18,4)-(19,22). c:\wd\RPythonFSharpDFChallenge\RPythonFSharpDFChallenge\EvilMovieQuery.fsx 19 8 RPythonFSharpDFChallenge
Error 2 Type mismatch. Expecting a
'a -> Frame<'c,Frame<int,string>>
but given a
'a -> float
The type 'Frame<'a,Frame<int,string>>' does not match the type 'float' c:\wd\RPythonFSharpDFChallenge\RPythonFSharpDFChallenge\EvilMovieQuery.fsx 19 25 RPythonFSharpDFChallenge
Error 3 This expression was expected to have type
bool
but here has type
string c:\wd\RPythonFSharpDFChallenge\RPythonFSharpDFChallenge\EvilMovieQuery.fsx 19 31 RPythonFSharpDFChallenge
Edit: Just thinking about this... indexing on a measured float is a silly thing to do anyway - duplicates and missing values in real world data. So, I wonder what a more sensible approach to this would be. I still need to find the 25 max values... Maybe I can work this out for myself...
With Deedle 1.0, you can sort on an arbitrary column.
See: http://bluemountaincapital.github.io/Deedle/reference/deedle-framemodule.html#section7

Resources