Concise description of the Lua vm? - lua

I've skimmed Programming in Lua, I've looked at the Lua Reference.
However, they both tells me this function does this, but not how.
When reading SICP, I got this feeling of: "ah, here's the computational model underlying scheme"; I'm trying to get the same sense concerning Lua -- i.e. a concise description of it's vm, a "how" rather than a "what".
Does anyone know of a good document (besides the C source) describing this?

You might want to read the No-Frills Intro to Lua 5(.1) VM Instructions (pick a link, click on the Docs tab, choose English -> Go).
I don't remember exactly where I've seen it, but I remember reading that Lua's authors specifically discourage end-users from getting into too much detail on the VM; I think they want it to be as much of an implementation detail as possible.

Besides already mentioned A No-Frills Introduction to Lua 5.1 VM Instructions, you may be interested in this excellent post by Mike Pall on how to read Lua source.
Also see related Lua-Users Wiki page.

See http://www.lua.org/source/5.1/lopcodes.h.html . The list starts at OP_MOVE.

The computational model underlying Lua is pretty much the same as the computational model underlying Scheme, except that the central data structure is not the cons cell; it's the mutable hash table. (At least until you get into metaprogramming with metatables.) Otherwise all the familiar stuff is there: nested first-class functions with mutable local variables (let-bound variables in Scheme), and so on.
It's not clear to me that you'd get much from a study of the VM. I did some hacking on the VM a while back and it's a lot like any other register-oriented VM, although maybe a bit cleaner. Only a handful of instructions are Lua-specific.
If you're curious about the metatables, the semantics is described clearly, if somewhat verbosely, in Section 2.8 of the reference manual for Lua 5.1. If you look at the VM code in src/lvm.c you'll see almost exactly that logic implemented in C (e.g., the internal Arith function). The VM instructions are specialized for the common cases, but it's all terribly straightforward; nothing clever is involved.
For years I've been wanting a more formal specification of Lua's computational model, but my tastes run more toward formal semantics...

I've found The Implementation of Lua 5.1 very useful for understanding what Lua is actually doing.
It explains the hashing techniques, garbage collection and some other bits and pieces.

Another great paper is The Implmentation of Lua 5.0, which describes design and motivations of various key systems in the VM. I found that reading it was a great way to parse and understand what I was seeing in the C code.

I am surprised you refer to the C source for the VM as this is protected by lua.org and the tecgraf/puc rio in Brazil specially as the language is used for real business and commercial applications in a number of countries. The paper about The Implementation of lua contains details about the VM in the most detail it is permitted to include but the structure of the VM is proprietary. It is worth noting that versions 5.0 and 5' were commissioned by IBM in Europe for use on customer mainframes and their register-based version have a VM which accepts the IBM defined format of intermediate instructions.

Related

What makes libadalang special?

I have been reading about libadalang 1 2 and I am very impressed by it. However, I was wondering if this technique has already been used and another language supports a library for syntactically and semantically analyzing its code. Is this a unique approach?
C and C++: libclang "The C Interface to Clang provides a relatively small API that exposes facilities for parsing source code into an abstract syntax tree (AST), loading already-parsed ASTs, traversing the AST, associating physical source locations with elements within the AST, and other facilities that support Clang-based development tools." (See libtooling for a C++ API)
Python: See the ast module in the Python Language Services section of the Python Library manual. (The other modules can be useful, as well.)
Javascript: The ongoing ESTree effort is attempting to standardize parsing services over different Javascript engines.
C# and Visual Basic: See the .NET Compiler Platform ("Roslyn").
I'm sure there are lots more; those ones just came off the top of my head.
For a practical and theoretical grounding, you should definitely (re)visit the classical textbook Structure and Interpretation of Computer Programs by Abelson & Sussman (1st edition 1985, 2nd edition 1996), which helped popularise the idea of Metacircular Interpretation -- that is, interpreting a computer program as a formal datastructure which can be interpreted (or otherwise analysed) programmatically.
You can see "libadalang" as ASIS Mark II. AdaCore seems to be attempting to rethink ASIS in a way that will support both what ASIS already can do, and more lightweight operations, where you don't require the source to compile, to provide an analysis of it.
Hopefully the final API will be nicer than that of ASIS.
So no, it is not a unique approach. It has already been done for Ada. (But I'm not aware of similar libraries for other languages.)

What front-end can I use with RPython to implement a language?

I've looked high and low for examples of implementing a language using the RPython toolchain, but the only one I've been able to find so far is this one in which the author writes a simple BF interpreter. Because the grammar is so simple, he doesn't need to use a parser/lexer generator. Is there a front-end out there that supports developing a language in RPython?
Thanks!
I'm not aware of any general lexer or parser generator targeting RPython specifically. Some with Python output may work, but I wouldn't bet on it. However, there's a set of parsing tools in rlib.parsing. It seems quite usable. OTOH, there's a warning in the documentation: It's reportedly still in development, experimental, and only used for the Prolog interpreter so far.
Alternatively, you can write the frontend by hand. Lexers can be annoying and unnatural, granted (you may be able to rip out the utility modules for DFAs used by the Python implementation). But parsers are a piece of cake if you know the right algorithms. I'm a huge fan of "Top Down Operator Precedence parsers" a.k.a. "Pratt parsers", which are reasonably simple (recursive descent) but make all expression parsing issues (nesting, precedence, associativity, etc.) a breeze. There's depressingly little information on them, but the few blog posts were sufficient for me:
One by Crockford (wouldn't recommend it though, it throws a whole lot of unrelated stuff into the parser and thus obscures it),
another one at effbot.org (uses Python),
and a third by a sadly even-less-famous guy who's developing a language himself, Robert Nystrom.
Alex Gaynor has ported David Beazley's excellent PLY to RPython. Its documentation is quite good, and he even gave a talk about using it to implement an interpreter at PyCon US 2013.

Custom programming language: how?

Hopefully this question won't be too convoluted or vague. I know what I want in my head, so fingers crossed I can get this across in text.
I'm looking for a language with a syntax of my own specification, so I assume I will need to create one myself. I've spent the last few days reading about compilers, lexers, parsers, assembly language, virtual machines, etc, and I'm struggling to sort everything out in terms of what I need to accomplish my goals (file attached at the bottom with some specifications). Essentially, I'm deathly confused as to what tools specifically I will need to use to go forward.
A little background: the language made would hopefully be used to implement a multiplayer, text-based MUD server. Therefore, it needs easy inbuilt functionality for creating/maintaining client TCP/IP connections, non-blocking IO, database access via SQL or similar. I'm also interested in security insofar as I don't want code that is written for this language to be able to be stolen and used by the general public without specialist software. This probably means that it should compile to object code
So, what are my best options to create a language that fits these specifications
My conclusions are below. This is just my best educated guess, so please contest me if you think I'm heading in the wrong direction. I'm mostly only including this to see how very confused I am when the experts come to make comments.
For code security, I should want a language that compiles and is run in a virtual machine. If I do this, I'll have a hell of a lot of work to do, won't I? Write a virtual machine, assembler language on the lower-level, and then on the higher-level, code libraries to deal with IO, sockets, etc myself, rather than using existing modules?
I'm just plain confused.
I'm not sure if I'm making sense.
If anyone could settle my brain even a little bit, I'd sincerely appreciate it! Alternatively, if I'm way off course and there's a much easier way to do this, please let me know!
Designing a custom domain-specific programming language is the right approach to a problem. Actually, almost all the problems are better approached with DSLs. Terms you'd probably like to google are: domain specific languages and language-oriented programming.
Some would say that designing and implementing a compiler is a complicated task. It is not true at all. Implementing compilers is a trivial thing. There are hordes of high-quality compilers available, and all you need to do is to define a simple transform from your very own language into another, or into a combination of the other languages. You'd need a parser - it is not a big deal nowdays, with Antlr and tons of homebrew PEG-based parser generators around. You'd need something to define semantics of your language - modern functional programming langauges shines in this area, all you need is something with a support for ADTs and pattern matching. You'd need a target platform. There is a lot of possibilities: JVM and .NET, C, C++, LLVM, Common Lisp, Scheme, Python, and whatever else is made of text strings.
There are ready to use frameworks for building your own languages. Literally, any Common Lisp or Scheme implementation can be used as such a framework. LLVM has all the stuff you'd need too. .NET toolbox is ok - there is a lot of code generation options available. There are specialised frameworks like this one for building languages with complex semantics.
Choose any way you like. It is easy. Much easier than you can imagine.
Writing your own language and tool chain to solve what seems to be a standard problem sounds like the wrong way to go. You'll end up developing yet another language, not writing your MUD.
Many game developers take an approach of using scripting languages to describe their own game world, for example see: http://www.gamasutra.com/view/feature/1570/reflections_on_building_three_.php
Also see: https://stackoverflow.com/questions/356160/which-game-scripting-language-is-better-to-use-lua-or-python for using existing languages (Pythong and LUA) in this case for in-game scripting.
Since you don't know a lot about compilers and creating computer languages: Don't. There are about five people in the world who are good at it.
If you still want to try: Creating a good general purpose language takes at least 3 years. Full time. It's a huge undertaking.
So instead, you should try one of the existing languages which solves almost all of your problems already except maybe the "custom" part. But maybe the language does things better than you ever imagined and you don't need the "custom" part at all.
Here are two options:
Python, a beautiful scripting language. The VM will compile the language into byte code for you, no need to waste time with a compiler. The syntax is very flexible but since there is a good reason for everything in Python, it's not too flexible.
Java. With the new Xtext framework, you can create your own languages in a couple of minutes. That doesn't mean you can create a good language in a few minutes. Just a language.
Python comes with a lot of libraries but if you need anything else, the air gets thin, quickly. On a positive side, you can write a lot of good and solid code in a short time. One line of python is usually equal to 10 lines of Java.
Java doesn't come with a lot of frills but there a literally millions of frameworks out there which do everything you can image ... and a lot of things you can't.
That said: Why limit yourself to one language? With Jython, you can run Python source in the Java VM. So you can write the core (web server, SQL, etc) in Java and the flexible UI parts, the adventures and stuff, in Python.
If you really want to create your own little language, a simpler and often quicker solution is to look at tools like lex and yacc and similar systems (ANTLR is a popular alternative), and then you can generate code either to an existing virtual machine or make a simple one yourself.
Making it all yourself is a great learning-experience, and will help you understand what goes on behind the scenes in other virtual machines.
An excellent source for understanding programming language design and implementation concepts is Structure and Interpretation of Computer Programs from MIT Press. It's a great read for anyone wanting to design and implement a language, or anyone looking to generally become a better programmer.
From what I can understand from this, you want to know how to develop your own programming language.
If so, you can accomplish this by different methods. I just finished up my own a few minutes ago and I used HTML and Javascript (And DOM) to develop my very own. I used a lot of x.split and x.indexOf("code here")!=-1 to do so... I don't have much time to give an example, but if you use W3schools and search "indexOf" and "split" I am sure that you will find what you might need.
I would really like to show you what I did and past the code below, but I can't due to possible theft and claim of my work.
I am pretty much just here to say that you can make your own programming language using HTML and Javascript, so that you and other might not get their hopes too low.
I hope this helps with most things....

Lua certified for use on an airframe or road vehicle?

Does anyone know if Lua has been certified to run on an airframe or road vehicle? Certification processes such as DO178B (RTCA) or standardization such as ISO 26262 (Road vehicles).
Certification is like case law and I would feel more confident evaluating the language knowing that another company has successfully made it through a process.
I'm betting no because of GC and dynamic features, but I thought I'd throw the question to the crowd anyway. Cheers.
DO178 Level D would be doubtful and higher would be impossible. The Lua VM uses lots of dynamic memory allocation. For Level A you need to show source to object code tracability. I don't see you doing that in Lua.
Also there is no ready made tools for everything you need. Doing everything yourself is not really an option once you realise all the work required on level C or higher. Using recognized tools with ready certification packs makes it a lot easier. Is there any statement and branch coverage tools for Lua? Is this tool qualified?
As you said certification is like case law and authorities know C and is not going to question anything if you use C. As soon as you use anything else you are opening yourself up for all kinds of questions about interpretation and implementation.
I would love to use Ruby on a aircraft but I know it is not going to happen.
Not exactly what you asked for, but this can give you an idea of what to expect:
Esterel Technologies justified the use of OCaml for the latest version of Scade, which is a code generator used in certified environments.
Note that it was not about having a language with dynamic allocation run inside the vehicle! OCaml had to be qualified as the code generator for the code generator!
If I had to summarize the article in one sentence, it would be "it was a lot of work".

Game Engine Scripting Languages

I am trying to build out a useful 3d game engine out of the Ogre3d rendering engine for mocking up some of the ideas i have come up with and have come to a bit of a crossroads. There are a number of scripting languages that are available and i was wondering if there were one or two that were vetted and had a proper following.
LUA and Squirrel seem to be the more vetted, but im open to any and all.
Optimally it would be best if there were a compiled form for the language for distribution and ease of loading.
One interesting option is stackless-python. This was used in the Eve-Online game.
The syntax is a matter of taste, Lua is like Javascript but with curly braces replaced with Pascal-like keywords. It has the nice syntactic feature that semicolons are never required but whitespace is still not significant, so you can even remove all line breaks and have it still work. As someone who started with C I'd say Python is the one with esoteric syntax compared to all the other languages.
LuaJIT is also around 10 times as fast as Python and the Lua interpreter is much much smaller (150kb or around 15k lines of C which you can actually read through and understand). You can let the user script your game without having to embed a massive language. On the other hand if you rip the parser part out of Lua it becomes even smaller.
The Python/C API manual is longer than the whole Lua manual (including the Lua/C API).
Another reason for Lua is the built-in support for coroutines (co-operative multitasking within the one OS thread). It allows one to have like 1000's of seemingly individual scripts running very fast alongside each other. Like one script per monster/weapon or so.
( Why do people write Lua in upper case so much on SO? It's "Lua" (see here). )
One more vote for Lua. Small, fast, easy to integrate, what's important for modern consoles - you can easily control its memory operations.
I'd go with Lua since writing bindings is extremely easy, the license is very friendly (MIT) and existing libraries also tend to be under said license. Scheme is also nice and easy to bind which is why it was chosen for the Gimp image editor for example. But Lua is simply great. World of Warcraft uses it, as a very high profile example. LuaJIT gives you native-compiled performance. It's less than an order of magnitude from pure C.
I wouldn't recommend LUA, it has a peculiar syntax so takes some time to get used to. Depending on who will be doing the scripting, this may not be a problem, but I would try to use something fairly accessible.
I would probably choose python. It normally compiles to bytecode, so you would need to embed the interpreter. However, if you must you can use PyPy to for example translate the code to C, and then compile it.
Embedding the interpreter is no issue. I am more interested in features and performance at this point in time. LUA and Squirrel are both interpreted, which is nice because one of the games i am building out is to include modifiable code, which has an editor in game.
I would love to hear about python, as i have seen its use within the battlefield series i believe.
python is also nice because it has actual OGRE bindings, just in case you need to modify something lower-level on the fly. I don't know of any equivalent bindings for Lua.
Since it's a C++ library, I would suggest either JavaScript or Squirrel, the latter being my personal favorite of the two for being even closer to C++, in particular to how it handles tables/structs and classes. It would be the easiest to get used to for a C++ coder because of all the similarities.
However, if you go with JavaScript and find an HTML5 version of Ogre3D, you should be able to port your game code directly into the web version with minimal (if any) changes necessary.
Both of these are a good choice, and both have their pros and cons, but both would definitely be the easiest to learn since you're likely already working in C++. If you're working with Java, the same may hold true, and if it's Game Maker, you wouldn't need either one unless you're trying to make an executable that people wouldn't need Game Maker itself to run, in which case, good luck finding an extension to run either of these.

Resources