How do you make a language binding?

Although I do more or less understand what a language binding is, I am struggling to understand how they work.
Could anyone explain how you make a Java binding for WinAPI, for example?

You'll find much better results if you search for Foreign Function Interface or FFI. The FFI is what allows you to call functions that were written in a different language, i.e., foreign ones. Different languages and runtimes have vastly different FFIs and you'll have to learn each one individually. Learning an FFI also forces you to know a little more about the internals of your language and its runtime than you are ordinarily used to. Some FFIs have you write the glue code in the language you are calling from, like Haskell (where FFI code is written in Haskell), and others have you write it in the language you are calling into, like Python's extension API (where the glue code is written in C).
Certain languages don't use the term FFI (though it would be nice if they did). For Java, it's called the Java Native Interface, or JNI.

Languages (usually) have a defined syntax for calling "native" code. So if you have a library that exports a function foo(), making a binding means that you create, in your example, a Java class with a method foo(). That way you can call MyBinding.foo() from the rest of the code, and it makes no difference whether it is a pure Java method or compiled C code.
Again, for Java you probably want to look at the JNI documentation. Other languages have similar mechanisms. There are also tools like SIP that take a description of a C or C++ API (modelled on its header files) and produce Python bindings for it. I guess other languages have similar tools as well.
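As a rough illustration of the "class with a method foo()" idea, here is a minimal sketch in Python rather than Java, using the standard ctypes module instead of JNI. It assumes a POSIX system where the C standard library can be loaded into the process; the class name LibC is made up for the example.

```python
import ctypes
import ctypes.util

# Assumption: POSIX. find_library("c") may return None, in which case
# CDLL(None) exposes the symbols already loaded into the process (libc included).
_libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signatures so ctypes converts arguments and results correctly.
_libc.strlen.argtypes = [ctypes.c_char_p]
_libc.strlen.restype = ctypes.c_size_t
_libc.abs.argtypes = [ctypes.c_int]
_libc.abs.restype = ctypes.c_int


class LibC:
    """Hypothetical binding class: each method forwards to compiled C code."""

    @staticmethod
    def strlen(data: bytes) -> int:
        return _libc.strlen(data)

    @staticmethod
    def abs(value: int) -> int:
        return _libc.abs(value)


if __name__ == "__main__":
    print(LibC.strlen(b"binding"))
    print(LibC.abs(-42))
```

Callers use LibC.strlen(...) like any other Python method and never see that the work is done by compiled C code.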

Related

Can you explicitly ask the Lua compiler to inline? What about the LuaJIT?

Is there a keyword or some other functionality in the standard Lua compiler that allows you to explicitly inline? What about the LuaJIT?
There is no function inlining in the vanilla Lua interpreter. Some tools exist to inline code at the source level, but that's not what you're asking.
LuaJIT does some inlining while generating native code, but that can't be controlled from outside; there's no explicit 'inline' keyword. And there are limits on what can be inlined. For example, a call to native code through the FFI library will be inlined, but calls to functions registered via the classic Lua/C interface can't be.
No.
It could only apply to functions that aren't closures, though. (It doesn't seem worth it to have some other way to implement closures; because where would the time savings be?)
In some cases, a tail-call would give more of an advantage, particularly if your concern is stack space.

Parser for OCaml

Can anyone recommend me an open-source full OCaml parser?
Essentially, I would like to implement my own type-checker for OCaml. Ideally, the parser is written in OCaml. I would just use it to get the AST of the input program. (it is probably too much to ask for the initial typing environment pre-filled with standard library function signatures)
Use compiler-libs, which is distributed with OCaml under the QPL license. It has everything needed to create your own compiler (and even has some documentation). compiler-libs is essentially the compiler shipped as a library.
Otherwise, you can use camlp4 to get the parsetree, but then you need to reimplement everything else from scratch. In that case, though, you're not restricted by the QPL.
"it is probably too much to ask for the initial typing environment pre-filled with standard library function signatures"
It is not! See the files typing/predef.ml(i).
As for the stdlib, just use the same one as the compiler. Except for Pervasives, which uses values from Predef, the rest is normal OCaml code without any special cases (except bootstrapping, obviously).

Why do we need an embeddable programming language like Lua?

What are the typical use cases of using an embeddable programming language? Do I understand it correctly that such language should be embedded into some program environment and should be able to be executed from there?
Since you tagged the question as "Lua", I'll give you an answer in the context of this language.
Introduction
Lua is written in C (almost completely compatible with the C89 standard; the incompatible features can be easily disabled, if needed, using compile-time switches) and has been designed to be easily integrated with C code. In the context of Lua, "integrated" means two different, but related, things:
1. You can easily write C code that can be used as a library by Lua code. The integration is achieved by statically or dynamically linking your C code to the Lua engine's code. The linked library can then be referred to in your Lua code using the Lua require function.
2. The Lua engine can be easily embedded in a C application, i.e. linked (again either statically or dynamically) to the C application code. The C application can then interact with the Lua code using Lua's C application programming interface (Lua C-API).
Note: this can be done, with a little more effort, also with a C++ application.
Advantages of embedding a Lua engine
If your C application embeds Lua, many, if not most, operations can be delegated to the Lua engine, i.e. either to code written using the C-API functions or, better yet, to Lua code. Lua code can be embedded as C strings inside your C code or stored as external Lua scripts.
Having part of your code logic implemented using Lua code has several advantages:
Lua is simpler (less tricky) to learn and use than C, and it is much more high-level. It supports powerful abstractions, such as function closures and object orientation (in a peculiar way, using Lua tables and metamethods).
Lua is a dynamic language: it requires no "off-line" compilation. You can modify the text of your Lua script and that's all you need to modify your application behavior (no additional compilation+linking steps needed). This simplifies application development and debugging.
Lua is a safer language than C: it is really difficult to write Lua code that exhibits undefined behavior, as intended in the context of C/C++. If a Lua script fails, it fails "loudly". Moreover Lua supports an exception mechanism (although with a different syntax than C++) which can be employed to implement error management in a much easier way compared to C.
Lua, as most dynamic languages, is garbage collected. This means that the programmer is spared the pain of manually managing dynamic memory, which is a major cause of bugs, leaks, instability and security loopholes in languages that lack garbage collection.
Lua can "eat its own dog food", i.e. you can build a string at runtime (even in Lua itself) and if it is valid Lua code, your program can execute it on the fly. This is something not frequently seen even in other dynamic languages (still it is not LISP, but it gets closer, and with much more readable syntax). This enables Lua scripts to:
employ powerful text-based metaprogramming techniques, where Lua code can generate other Lua code and execute it on the fly;
implement domain specific languages (DSLs) in an easy way; Lua code can load at runtime other Lua code that is crafted so as to reflect the specific problem domain in which it is used (Lua syntax is simple, yet flexible enough to allow such things);
be used as a configuration language with ease: your application (written in a mix of C and Lua) can use Lua files as configuration files without the need to craft an ad-hoc parser for a specific configuration file format. Therefore you don't need to parse *.properties, *.csv, *.ini, or whichever other format you would choose if you didn't have the option of using Lua files for that purpose.
The Lua engine has a very small memory footprint (a few hundred kB) while packing powerful capabilities. With very few lines of C code and a bunch of Lua files you can create a complete application that would otherwise require thousands of lines of C code. The standard Lua standalone interpreter can be seen as just an example of embedding Lua in a C application!
Lua has a very liberal open-source license, which enables its use even in commercial applications without much hassle. This also allows the modification of its source code to adapt it to special needs.
Its small memory footprint and easily tweakable C sources make Lua a perfect candidate for porting to embedded systems or small microcomputer systems (microcontrollers, etc.). Many parts of the standard Lua distribution can be stripped off, reducing the core Lua engine to the ~100 kB range. As an example, take the eLua project, a modified distribution of Lua designed for embedded devices.
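The "script as configuration file" point in the list above is not tied to Lua syntax. Here is a minimal sketch of the pattern in Python (used only as a stand-in for the scripting language; the setting names and the load_config helper are made up for illustration). The host runs the script and reads back whatever names it defined, so no dedicated parser is needed.

```python
# Hypothetical configuration script. In a real application this text would be
# read from a file shipped next to the program.
CONFIG_SOURCE = """
window = {"width": 800, "height": 600}
title = "Demo " + str(window["width"]) + "x" + str(window["height"])
fullscreen = False
"""


def load_config(source: str) -> dict:
    """Run a configuration script and return the names it defined."""
    # Expose only a few harmless builtins. This narrows what the script sees,
    # but it is not a security sandbox.
    namespace = {"__builtins__": {"str": str, "int": int, "len": len}}
    exec(source, namespace)
    namespace.pop("__builtins__")
    return namespace


if __name__ == "__main__":
    config = load_config(CONFIG_SOURCE)
    print(config["title"], config["fullscreen"])
```

Because the configuration is code, values can be computed from other values (as title is here), which a flat *.ini file cannot express.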
Lua, and other scripting languages, provide various benefits that are dependent on your needs.
Provide rapid iteration of development.
Allow run-time code changes, such as reloading your UI in World of Warcraft, which reloads all scripts without stopping the game engine itself or logging you out.
Provide a distinct API for users to extend your application without exposing critical parts of your system to the public. Text editors, for example, provide a macro language that lets you integrate custom behaviour without giving you unfettered access to the internals of the editor itself.
The uses are really quite extensive and depend on the developer.
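To make the last two points concrete, here is a small sketch of a host that hands scripts a deliberately narrow API and can reload them while it keeps running. It is written in Python purely for illustration; the Host class, register_command, and both scripts are hypothetical, and Python's exec gives no real isolation, so treat it as a picture of the structure rather than of a sandbox.

```python
class Host:
    """Hypothetical host application that exposes a small scripting API."""

    def __init__(self):
        self._internal_state = "not reachable through the scripting API"
        self.commands = {}

    def _register(self, name, func):
        self.commands[name] = func

    def load_script(self, source: str) -> None:
        """(Re)load a user script; the host itself keeps running."""
        self.commands = {}  # drop commands from the previous version
        api = {"register_command": self._register}  # the only host name exposed
        exec(source, api)

    def run(self, name, *args):
        return self.commands[name](*args)


SCRIPT_V1 = """
def greet(who):
    return "hello " + who
register_command("greet", greet)
"""

SCRIPT_V2 = """
def greet(who):
    return "HELLO " + who.upper() + "!"
register_command("greet", greet)
"""

if __name__ == "__main__":
    host = Host()
    host.load_script(SCRIPT_V1)
    print(host.run("greet", "world"))
    host.load_script(SCRIPT_V2)  # behaviour changes without restarting the host
    print(host.run("greet", "world"))
```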

Bindings and introspection for OCaml library

I want to write an OCaml library which will be used from other programming languages like C or even Python.
I'm not sure it's even feasible, and I guess I need to drop some type safety and add runtime checks to the interface for dynamically typed languages.
Is it doable? Are there tools to auto-generate such bindings? I think things like CORBA do not fit well with the OCaml ABI, but I may be wrong.
EDIT: by dropping the runtime requirement and using only languages that have an LLVM frontend, I could use LLVM as a common ABI, I guess, but it seems tricky.
OCaml has an FFI to interact with C code. The code for the binding has to be written in C, not in OCaml (which has no direct representation of C values, while C has representations of OCaml values). My advice would be:
1. On the C side, decide what would be the best interface to export, one that C programmers would like (or Python programmers writing Python bindings starting from your C interface)
2. Define a "low-level layer" on the OCaml side that gets your OCaml values as close as possible to the C representation
3. Write some C wrappers to convert from this low-level OCaml representation to your optimal C representation
The reason for step (2) is to keep step (3) as small as possible. Manipulating OCaml values from the C side is a bit painful; in particular you risk getting the interaction with the garbage collector wrong, which means segfaults -- plus you don't get any type safety. So the less work you have to do on the C side, the better.
There are some projects that do some of the wrapping work for you. CamlIDL, for example, and I think SWIG has some support for OCaml. I have never used those, though, so I can't comment.
If you know which high-level language you wish to expose your interface to, there may be specialized bridges that don't need a C step. For example, there are libraries to interact directly with Python representations (search for Pycaml; not sure how battle-tested they are) or with the Java runtime (the OCamlJava project). A C interface is still a safe bet that will allow other people to create bridges to their own languages.
It is feasible, but you need to understand involved topics, like how the GC works.
Have a look at this: http://caml.inria.fr/pub/docs/manual-ocaml-4.00/manual033.html#toc148
You need to be careful about types in the stub code, but otherwise you can keep type safety.
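On the question's point about runtime checks for dynamically typed callers: once a C interface exists, the caller can get some of that checking by declaring the C signature on its side. A sketch in Python with ctypes follows, assuming a POSIX system and using libc's labs as a stand-in for a function exported by your C wrapper layer (the real library and function names would be your own).

```python
import ctypes
import ctypes.util

# Stand-in for the shared library built from the C wrappers (assumption: POSIX).
_lib = ctypes.CDLL(ctypes.util.find_library("c"))

# Declaring the signature lets ctypes reject wrong argument types at call time.
_lib.labs.argtypes = [ctypes.c_long]
_lib.labs.restype = ctypes.c_long


def checked_labs(value):
    """Call the C function, turning ctypes' error into an ordinary TypeError."""
    try:
        return _lib.labs(value)
    except ctypes.ArgumentError as exc:
        raise TypeError(
            f"checked_labs expects an int, got {type(value).__name__}"
        ) from exc


if __name__ == "__main__":
    print(checked_labs(-3))
    try:
        checked_labs("3")
    except TypeError as err:
        print("rejected:", err)
```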

Binding generator (like SWIG) that handles C-style callbacks?

I recently wrote a binding for a C library using SWIG. While a good deal of it was straightforward and used only basic SWIG functionality, I ran into trouble when I needed to support one function which took a C callback as an argument, which is not supported by SWIG. I solved this by writing Python-specific code to provide a custom callback in which I called the Python 'eval' function to evaluate a supplied Callable.
While this worked nicely, it was unfortunate for me. I had been hoping to use SWIG to take advantage of its support for tens of languages, but now I'm stuck having to figure out callbacks in every single language I wish to support. This makes my binding work orders of magnitude less useful, as I now have to solve the same problem many times, manually--the opposite of the point of using SWIG.
Is there any tool like SWIG that also handles C callbacks?
It's a bit roundabout, but if you recompile the C project as C++ or create a C++ extension, then you can take advantage of virtual function overriding.
Most SWIG language modules have support for directors, which allow a class in the target language to derive from a class in the C++ library. This way any overridden virtual function acts as a callback.
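For the Python-only case described in the question, the standard ctypes module can wrap a Python callable as a C function pointer directly. This is not SWIG and does not solve the multi-language problem; it is only a sketch of what a C callback binding looks like, using libc's qsort as the C function and assuming a POSIX system. The names sort_ints and COMPARE are made up for the example.

```python
import ctypes
import ctypes.util

_libc = ctypes.CDLL(ctypes.util.find_library("c"))  # assumption: POSIX

# C type of the callback: int (*)(const int *, const int *)
COMPARE = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.POINTER(ctypes.c_int), ctypes.POINTER(ctypes.c_int)
)

_libc.qsort.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_size_t, COMPARE]
_libc.qsort.restype = None


def sort_ints(values, compare):
    """Sort a list of ints with C's qsort, calling back into Python to compare."""
    array = (ctypes.c_int * len(values))(*values)

    def trampoline(a, b):
        # a and b point at array elements; hand plain ints to the Python callable.
        return compare(a[0], b[0])

    callback = COMPARE(trampoline)  # keep a reference while C may call it
    _libc.qsort(array, len(array), ctypes.sizeof(ctypes.c_int), callback)
    return list(array)


if __name__ == "__main__":
    ascending = lambda x, y: (x > y) - (x < y)
    print(sort_ints([5, 1, 7, 33, 99], ascending))
```

The trampoline plays the same role as an overridden virtual function in a SWIG director: C calls a fixed entry point, which forwards to code written in the target language.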
