Reference another F# Jupyter Notebook - f#

Is there a way to reference functions from another F# Jupyter Notebook?
I see number of answers for doing the same in Python Notebooks, but can't seem to find a way to do the same in F# Notebooks.

You can do a hack. You put the code you want to reference in a .fsx file inside a module:
//myScript.fsx
module Script1
let myFunction x y = x + 2 * y
Then you can simply reference that a .fsx file in your current jupyter notebook:
#load "myScript.fsx"
open Script1
printfn "%i" (myFunction 8 9)

Related

What does local _, x = ... do in lua?

I'm looking at some lua code on github which consists of many folders and files. Next to external libraries, each file starts with:
local _,x = ...
Now my question is, what is the purpose of this, namely the 3 dots? is it a way to 'import' the global values of x? In what way is it best used?
... is the variable arguments to the current function.
E.g.:
function test(x, y, ...)
print("x is",x)
print("y is",y)
print("... is", ...)
local a,b,c,d = ...
print("b is",b)
print("c is",c)
end
test(1,2,"oat","meal")
prints:
x is 1
y is 2
... is oat meal
b is meal
c is nil
Files are also treated as functions. In Lua, when you, or someone else, loads a file (with load or loadfile or whatever), it returns a function, and then to run the code, you call the function. And when you call the function, you can pass arguments. And none of the arguments have names, but the file can read them with ...
They are arguments from the command line.
Read lua's reference manual, in the chapter Lua Standalone, it says:
...If there is a script, the script is called with arguments arg[1], ···, arg[#arg]. Like all chunks in Lua, the script is compiled as a vararg function.
For example if your lua script is run with the command line:
lua my_script.lua 10 20
In my_script.lua you have:
local _, x = ...
Then _ = "10" and x = "20"
Update when a library script is required by another script, the meaning of the 3 dots changes, they are arguments passed from the require function to the searcher:
Once a loader is found, require calls the loader with two arguments: modname and an extra value, a loader data, also returned by the searcher.
And under package.searchers:
All searchers except the first one (preload) return as the extra value the file name where the module was found
For example if you have a lua file that requires my_script.lua.
require('my_script')
At this time _ = "my_script" and x = "/full/path/to/my_script.lua"
Note that in lua 5.1, require passes only 1 argument to the loader, so x is nil.

Creating a simple Rcpp package with dependency with other Rcpp package

I am trying to improve my loop computation speed by using foreach, but there is a simple Rcpp function I defined inside of this loop. I saved the Rcpp function as mproduct.cpp, and I call out the function simply using
sourceCpp("mproduct.cpp")
and the Rcpp function is a simple one, which is to perform matrix product in C++:
// [[Rcpp::depends(RcppArmadillo, RcppEigen)]]
#include <RcppArmadillo.h>
#include <RcppEigen.h>
// [[Rcpp::export]]
SEXP MP(const Eigen::Map<Eigen::MatrixXd> A, Eigen::Map<Eigen::MatrixXd> B){
Eigen::MatrixXd C = A * B;
return Rcpp::wrap(C);
}
So, the function in the Rcpp file is MP, referring to matrix product. I need to perform the following foreach loop (I have simplified the code for illustration):
foreach(j=1:n, .package='Rcpp',.noexport= c("mproduct.cpp"),.combine=rbind)%dopar%{
n=1000000
A<-matrix(rnorm(n,1000,1000))
B<-matrix(rnorm(n,1000,1000))
S<-MP(A,B)
return(S)
}
Since the size of matrix A and B are large, it is why I want to use foreach to alleviate the computational cost.
However, the above code does not work, since it provides me error message:
task 1 failed - "NULL value passed as symbol address"
The reason I added .noexport= c("mproduct.cpp") is to follow some suggestions from people who solved similar issues (Can't run Rcpp function in foreach - "NULL value passed as symbol address"). But somehow this does not solve my issue.
So I tried to install my Rcpp function as a library. I used the following code:
Rcpp.package.skeleton('mp',cpp_files = "<my working directory>")
but it returns me a warning message:
The following packages are referenced using Rcpp::depends attributes however are not listed in the Depends, Imports or LinkingTo fields of the package DESCRIPTION file: RcppArmadillo, RcppEigen
so when I tried to install my package using
install.packages("<my working directory>",repos = NULL,type='source')
I got the warning message:
Error in untar2(tarfile, files, list, exdir, restore_times) :
incomplete block on file
In R CMD INSTALL
Warning in install.packages :
installation of package ‘C:/Users/Lenovo/Documents/mproduct.cpp’ had non-zero exit status
So can someone help me out how to solve 1) using foreach with Rcpp function MP, or 2) install the Rcpp file as a package?
Thank you all very much.
The first step would be making sure that you are optimizing the right thing. For me, this would not be the case as this simple benchmark shows:
set.seed(42)
n <- 1000
A<-matrix(rnorm(n*n), n, n)
B<-matrix(rnorm(n*n), n, n)
MP <- Rcpp::cppFunction("SEXP MP(const Eigen::Map<Eigen::MatrixXd> A, Eigen::Map<Eigen::MatrixXd> B){
Eigen::MatrixXd C = A * B;
return Rcpp::wrap(C);
}", depends = "RcppEigen")
bench::mark(MP(A, B), A %*% B)[1:5]
#> # A tibble: 2 x 5
#> expression min median `itr/sec` mem_alloc
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt>
#> 1 MP(A, B) 277.8ms 278ms 3.60 7.63MB
#> 2 A %*% B 37.4ms 39ms 22.8 7.63MB
So for me the matrix product via %*% is several times faster than the one via RcppEigen. However, I am using Linux with OpenBLAS for matrix operations while you are on Windows, which often means reference BLAS for matrix operations. It might be that RcppEigen is faster on your system. I am not sure how difficult it is for Windows user to get a faster BLAS implementation (https://csgillespie.github.io/efficientR/set-up.html#blas-and-alternative-r-interpreters might contain some pointers), but I would suggest spending some time on investigating this.
Now if you come to the conclusion that you do need RcppEigen or RcppArmadillo in your code and want to put that code into a package, you can do the following. Instead of Rcpp::Rcpp.package.skeleton() use RcppEigen::RcppEigen.package.skeleton() or RcppArmadillo::RcppArmadillo.package.skeleton() to create a starting point for a package based on RcppEigen or RcppArmadillo, respectively.

F#: Path error when trying to run simple F# program on VS Code

I just installed Visual Studio Code on a different laptop and then proceeded to get the Ionide-fsharp extension. I then tried to run this simple piece of code:
open System
let rec factorial n =
if n = 0
then 1
else n * factorial (n - 1)
factorial 5
But then I get this error and I can't seem to fix it:
open: the term 'open' is not recognized as the name of a cmdlet,
function, script file, or operable program. Check the spelling on the name,
of if a path was included, verify that the path is correct and try again.
It then proceeds to throw a bunch more nonsense at me and repeats the error above for every single line of code I have written afterwards. Please help.

F# Compiler Services incorrectly parses program

UPDATE:
I now realize that the question was stupid, I should have just filed the issue. In hindsight, I don't see why I even asked this question.
The issue is here: https://github.com/fsharp/FSharp.Compiler.Service/issues/544
Original question:
I'm using FSharp Compiler Services for parsing some F# code.
The particular piece of code that I'm facing right now is this:
let f x y = x+y
let g = f 1
let h = (g 2) + 3
This program yields a TAST without the (+) call on the last line. That is, the compiler service returns TAST as if the last line was just let h = g 2.
The question is: is this is a legitimate bug that I ought to report or am I missing something?
Some notes
Here is a repo containing minimal repro (I didn't want to include it in this question, because Compiler Services require quite a bit of dancing around).
Adding more statements after the let h line does not change the outcome.
When compiled to IL (as opposed to parsed with Compiler Services), it seems to work as expected (e.g. see fiddle)
If I make g a value, the program parses correctly.
If I make g a normal function (rather than partially applied one), the program parses correctly.
I have no priori experience with FSharp.Compiler.Services but nevertheless I did a small investigation using Visual Studio's debugger. I analyzed abstract syntax tree of following string:
"""
module X
let f x y = x+y
let g = f 1
let h = (g 2) + 3
"""
I've found out that there's following object inside it:
App (Val (op_Addition,NormalValUse,D:\file.fs (6,32--6,33) IsSynthetic=false),TType_forall ([T1; T2; T3],TType_fun (TType_var T1,TType_fun (...,...))),...,...,...)
As you can see, there's an addition in 6th line between characters 32 and 33.
The most likely explanation why F# Interactive doesn't display it properly is a bug in a library (maybe AST is in an inconsistent state or pretty-printing is broken). I think that you should file a bug in project's issue tracker.
UPDATE:
Aforementioned object can be obtained in a debbuger in a following way:
error.[0]
(option of Microsoft.FSharp.Compiler.SourceCodeServices.FSharpImplementationFileDeclaration.Entity)
.Item2
.[2]
(option of Microsoft.FSharp.Compiler.SourceCodeServices.FSharpImplementationFileDeclaration.MemberOrFunctionOrValue)
.Item3
.f (private member)
.Value
(option of Microsoft.FSharp.Compiler.SourceCodeServices.FSharpExprConvert.ConvExprOnDemand#903)
.expr

local variable cannot be seen in a closure across the file?

Suppose I have the following two Lua files:
In a.lua:
local x = 5
f = dofile'b.lua'
f()
In b.lua:
local fun = function()
print(x)
end
return fun
Then if I run luajit a.lua in shell it prints nil since x cannot be seen in the function defined in b.lua. The expected printing should be 5. However if I put everything in a single file then it's exactly what I want:
In aa.lua:
local x = 5
local f = function()
print(x)
end
f()
Run luajit aa.lua it prints 5.
So why x cannot be seen in the first case?
As their name suggests, local variables are local to the chunk.
dofile() loads the chunk from another file. Since it's another chunk, it makes sense that the local variable x in the first chunk isn't seen by it.
I agree that it is somewhat unintuitive that this doesn't work.
You'd like to say, at any point in the code there is a clear set of variables that are 'visible' -- some may be local, some may be global, but there is some map that the interpreter can use to resolve names of either kind.
When you load a chunk using dofile, then it can see whatever global variables currently exist, but apparently it can't see any local variables. We know that 'dofile' is not like C/C++ inclusion macros, which would give exactly the behavior you describe for local variables, but still you might reasonably expect that this part of it would work the same.
Ultimately there's no answer but "that's just not how they specified the language". The only satisfying answer is probably along the lines 'because otherwise it would cause non-obvious problem X' or 'because then use-case Y would go slower'.
I think the best answer is that, if all names were dynamically rebound according to the scope in which they are loaded when you use loadfile / dofile, that would inhibit a lot of optimization and such when compiling chunks into bytecode. In the lua system, name resolution works like 'either it is local in this scope, and then it binds to that (known) object, or, it is a lookup in the (unique) global table.' This system is pretty simple, there are only a few options and not a lot of room for complexity.
I don't think that running byte code even keeps track of the names of local variables, it discards them after the chunk is compiled. They would have to undo that optimization if they wanted to allow dynamic name resolution at chunk loading time like you suggest.
If your question is not really why but how can I make it work anyways, then one way you can do it is, in the host script, put any local variables that you want to be visible in the environment of the script that is called. When you do this you need to split dofile into a few calls. It's slightly different in lua 5.1 vs lua 5.2.
In lua 5.1:
In a.lua:
local shared = { x = 5 }
temp = loadfile('b.lua')
setfenv(temp, shared)
f = temp()
f()
In lua 5.2:
In a.lua:
local shared = { x = 5 }
temp = loadfile('b.lua', 't', shared)
f = temp()
f()
The x variable defined in module a.lua cannot be seen from b.lua because it was declared as local. The scope of a local variable is its own module.
If you want x to be visible from b.lua, just need to declare it global. A variable is either local or global. To declare a variable as global, just simply do not declare it as local.
a.lua
x = 5
f = dofile'b.lua'
f()
b.lua
local fun = function()
print(x)
end
return fun
This will work.
Global variables live within the global namespace, which can be accessed at any given time via the _G table. When Lua cannot solve a variable, because it's not defined inside the module where is being used, Lua searches that variable in the global namespace. In conclusion, it's also possible to write b.lua as:
local fun = function()
print(_G["x"])
end
return fun

Resources