Possible to create Graal native function callable from C without isolate? - graalvm

I'd like to create a library, written in Java, callable from C, with simple method signatures:
int addThree(int in) {
return in + 3;
}
I know it's possible to do this with GraalVM if you do a little dance and create an Isolate in your C program and pass it in as the first parameter in every function call. There is good sample code here.
The problem is that the system I'm writing for, Postgres, can load C libraries and call functions in them, but I would have to create a wrapper function in C that would wrap every function I wanted to expose. This really limits the value of being able to slap something together in Java and use it in Postgres directly. I'd have to do something like this:
int myPublicAddThreeFunction(int in) {
graal_isolatethread_t *thread = NULL;
if (graal_create_isolate(NULL, NULL, &thread) != 0) {
fprintf(stderr, "error on isolate creation or attach\n");
return 1;
}
return SomeClassName_addThree_big_random_string_here(thread, in);
}
Is there a way, in Java alone, to expose a simple C function? I'm thinking I could create the isolate in a static method that gets loaded once on startup, somehow set it as the current isolate, and have the Java method just use it. Haven't been able to figure it out, though.
Also, it would be real nice not to have to append a big random string to every function name.

Related

How can I get a custom python type and avoid importing a python module every time a C function is called

I am writing some functions for a C extension module for python and need to import a module I wrote directly in python for access to a custom python type. I use PyImport_ImportModule() in the body of my C function, then PyObject_GetAttrString() on the module to get the custom python type. This executes every time the C function is called and seems like it's not very efficient and may not be best practice. I'm looking for a way to have access to the python custom type as a PyObject* or PyTypeObject* in my source code for efficiency and I may need the type in more than one C function also.
Right now the function looks something like
static PyObject* foo(PyObject* self, PyObject* args)
{
PyObject* myPythonModule = PyImport_ImportModule("my.python.module");
if (!myPythonModule)
return NULL;
PyObject* myPythonType = PyObject_GetAttrString(myPythonModule, "MyPythonType");
if (!myPythonType) {
Py_DECREF(myPythonModule);
return NULL;
}
/* more code to create and return a MyPythonType instance */
}
To avoid retrieving myPythonType every function call I tried adding a global variable to hold the object at the top of my C file
static PyObject* myPythonType;
and initialized it in the module init function similar to the old function body
PyMODINIT_FUNC
PyInit_mymodule(void)
{
/* more initializing here */
PyObject* myPythonModule = PyImport_ImportModule("my.python.module");
if (!myPythonModule) {
/* clean-up code here */
return NULL;
}
// set the static global variable here
myPythonType = PyObject_GetAttrString(myPythonModule, "MyPythonType");
Py_DECREF(myPythonModule);
if (!myPythonType) {
/* clean-up code here */
return NULL;
/* finish initializing module */
}
which worked, however I am unsure how to Py_DECREF the global variable whenever the module is finished being used. Is there a way to do that or even a better way to solve this whole problem I am overlooking?
First, just calling import each time probably isn't as bad as you think - Python does internally keep a list of imported modules, so the second time you call it on the same module the cost is much lower. So this might be an acceptable solution.
Second, the global variable approach should work, but you're right that it doesn't get cleaned up. This is rarely a problem because modules are rarely unloaded (and most extension modules don't really support it), but it isn't great. It also won't work with isolated sub-interpreters (which isn't much of a concern now, but may become more more popular in future).
The most robust way to do it needs multi-phase initialization of your module. To quickly summarise what you should do:
You should define a module state struct containing this type of information,
Your module spec should contain the size of the module state struct,
You need to initialize this struct within the Py_mod_exec slot.
You need to create an m_free function (and ideally the other GC functions) to correctly decref your state during de-initialization.
Within a global module function, self will be your module object, and so you can get the state with PyModule_GetState(self)

LLVM, Get first usage of a global variable

I'm new to LLVM and I'm stuck on something that might seem basic.
I'm writing a LLVM pass to apply some transformations to global variables before they are use.
I would like to detect somehow when is the first usage of a global variable to only apply the transformation there, and not in all places where the global variable is used. But it must be the first time it is used otherwise the program crashes.
I have been reading about the AnalysisManager, and I would say that I want something similar to DominatorTree which is used for basic blocks in a function.
So the idea is to get the DominatorTree of a GlobalVariable to get the first time it is used in the code and apply there my transformation.
Given the following example
int MyGlobal = 30;
void foo()
{
printf("%s\n", MyGlobal);
}
int main()
{
printf("%s\n", MyGlobal);
foo();
}
In the example above, I only want to apply the transformation just before the first printf in the main function
Given the following example
int MyGlobal = 30;
void foo()
{
printf("%s\n", MyGlobal);
}
int main()
{
foo();
printf("%s\n", MyGlobal);
}
For the example above I would like to apply the transformation inside the foo function.
I want to avoid to create a stub function at the beginning of the program to process all globals before start running (This is what actually Im doing)
Does LLVM provide something that can help me doing this? or what should be the best approach to implement it?

Caching streams in Functional Reactive Programming

I have an application which is written entirely using the FRP paradigm and I think I am having performance issues due to the way that I am creating the streams. It is written in Haxe but the problem is not language specific.
For example, I have this function which returns a stream that resolves every time a config file is updated for that specific section like the following:
function getConfigSection(section:String) : Stream<Map<String, String>> {
return configFileUpdated()
.then(filterForSectionChanged(section))
.then(readFile)
.then(parseYaml);
}
In the reactive programming library I am using called promhx each step of the chain should remember its last resolved value but I think every time I call this function I am recreating the stream and reprocessing each step. This is a problem with the way I am using it rather than the library.
Since this function is called everywhere parsing the YAML every time it is needed is killing the performance and is taking up over 50% of the CPU time according to profiling.
As a fix I have done something like the following using a Map stored as an instance variable that caches the streams:
function getConfigSection(section:String) : Stream<Map<String, String>> {
var cachedStream = this._streamCache.get(section);
if (cachedStream != null) {
return cachedStream;
}
var stream = configFileUpdated()
.filter(sectionFilter(section))
.then(readFile)
.then(parseYaml);
this._streamCache.set(section, stream);
return stream;
}
This might be a good solution to the problem but it doesn't feel right to me. I am wondering if anyone can think of a cleaner solution that maybe uses a more functional approach (closures etc.) or even an extension I can add to the stream like a cache function.
Another way I could do it is to create the streams before hand and store them in fields that can be accessed by consumers. I don't like this approach because I don't want to make a field for every config section, I like being able to call a function with a specific section and get a stream back.
I'd love any ideas that could give me a fresh perspective!
Well, I think one answer is to just abstract away the caching like so:
class Test {
static function main() {
var sideeffects = 0;
var cached = memoize(function (x) return x + sideeffects++);
cached(1);
trace(sideeffects);//1
cached(1);
trace(sideeffects);//1
cached(3);
trace(sideeffects);//2
cached(3);
trace(sideeffects);//2
}
#:generic static function memoize<In, Out>(f:In->Out):In->Out {
var m = new Map<In, Out>();
return
function (input:In)
return switch m[input] {
case null: m[input] = f(input);
case output: output;
}
}
}
You may be able to find a more "functional" implementation for memoize down the road. But the important thing is that it is a separate thing now and you can use it at will.
You may choose to memoize(parseYaml) so that toggling two states in the file actually becomes very cheap after both have been parsed once. You can also tweak memoize to manage the cache size according to whatever strategy proves the most valuable.

Calling Lua from C

I'm trying to call a user-defined Lua function from C. I've seen some discussion on this, and the solution seems clear. I need to grab the index of the function with luaL_ref(), and save the returned index for use later.
In my case, I've saved the value with luaL_ref, and I'm at a point where my C code needs to invoke the Lua function saved with luaL_ref. For that, I'm using lua_rawgeti as follows:
lua_rawgeti(l, LUA_REGISTRYINDEX, fIndex);
This causes a crash in lua_rawgeti.
The fIndex I'm using is the value I received from luaL_ref, so I'm not sure what's going on here.
EDIT:
I'm running a Lua script as follows:
function errorFunc()
print("Error")
end
function savedFunc()
print("Saved")
end
mylib.save(savedFunc, errorFunc)
I've defined my own Lua library 'mylib', with a C function:
static int save(lua_State *L)
{
int cIdx = myCIndex = luaL_ref(L, LUA_REGISTRYINDEX);
int eIdx = luaL_ref(L, LUA_REGISTRYINDEX);
I save cIdx and eIdx away until a later point in time when I receive some external event at which point I would like to invoke one of the functions set as parameters in my Lua script. Here, (on the same thread, using the same lua_State*), I call:
lua_rawgeti(L, LUA_REGISTRYINDEX, myCIndex);
Which is causing the crash.
My first suggestion is to get it working without storing the function in C at all. Just assign your function to a global in Lua, then in C use the Lua state (L) to get the global, push the args, call the function, and use the results. Once that's working, you've got the basics and know your function is working, you can change the way you get at the function to use the registry. Good luck!
As #Schollii mentioned, I was making this call after doing a lua_close(L).

How can I load an unnamed function in Lua?

I want users of my C++ application to be able to provide anonymous functions to perform small chunks of work.
Small fragments like this would be ideal.
function(arg) return arg*5 end
Now I'd like to be able to write something as simple as this for my C code,
// Push the function onto the lua stack
lua_xxx(L, "function(arg) return arg*5 end" )
// Store it away for later
int reg_index = luaL_ref(L, LUA_REGISTRY_INDEX);
However I dont think lua_loadstring will do "the right thing".
Am I left with what feels to me like a horrible hack?
void push_lua_function_from_string( lua_State * L, std::string code )
{
// Wrap our string so that we can get something useful for luaL_loadstring
std::string wrapped_code = "return "+code;
luaL_loadstring(L, wrapped_code.c_str());
lua_pcall( L, 0, 1, 0 );
}
push_lua_function_from_string(L, "function(arg) return arg*5 end" );
int reg_index = luaL_ref(L, LUA_REGISTRY_INDEX);
Is there a better solution?
If you need access to parameters, the way you have written is correct. lua_loadstring returns a function that represents the chunk/code you are compiling. If you want to actually get a function back from the code, you have to return it. I also do this (in Lua) for little "expression evaluators", and I don't consider it a "horrible hack" :)
If you only need some callbacks, without any parameters, you can directly write the code and use the function returned by lua_tostring. You can even pass parameters to this chunk, it will be accessible as the ... expression. Then you can get the parameters as:
local arg1, arg2 = ...
-- rest of code
You decide what is better for you - "ugly code" inside your library codebase, or "ugly code" in your Lua functions.
Have a look at my ae. It caches functions from expressions so you can simply say ae_eval("a*x^2+b*x+c") and it'll only compile it once.

Resources