How to decode obfuscated Lua [duplicate]

I have some Lua code that I suspect is obfuscated. How do I go about de-obfuscating it?
I believe the code is obfuscated because it looks very different from normal Lua code, but I know it is valid Lua code because the Lua interpreter will still compile and run the code.
I have a legitimate interest in de-obfuscating the code and do not intend to distribute it against the author's will or modify it to circumvent any DRM mechanism.

There are generally two ways to obfuscate Lua source code:
Obfuscate the code directly, mostly by renaming variables, introducing distractions and restructuring the code to be harder to follow.
Encode the source code and embed it as a string in a Lua file that only decodes, loads and runs the encoded real program.
In reality, a combination of both is often used: Programs are obfuscated, then encoded and wrapped in a string. Finally, the code that loads and runs the string is often obfuscated again.
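As a rough illustration of the second approach, here is a minimal sketch of a loader wrapper and one way to unwrap it. The byte table, the `decode` helper and the `load_chunk` alias are made up for this example; real obfuscators use their own encodings. The usual trick is to let the wrapper do the decoding, then print or save the result instead of executing it.

```lua
-- Illustrative only: `encoded`, `decode` and `load_chunk` are hypothetical names.
-- loadstring exists in Lua 5.1/LuaJIT; load accepts strings in Lua 5.2+.
local load_chunk = loadstring or load

-- The "real program", stored as byte values: return 1 + 2
local encoded = {114, 101, 116, 117, 114, 110, 32, 49, 32, 43, 32, 50}

-- Turn the byte values back into source text.
local function decode(bytes)
  local chars = {}
  for i, b in ipairs(bytes) do
    chars[i] = string.char(b)
  end
  return table.concat(chars)
end

local source = decode(encoded)

-- A wrapper would run the payload right away: load_chunk(source)()
-- To study it, look at the decoded text instead of executing it.
print(source)

-- Only run the payload once you have read it and trust it.
local result = load_chunk(source)()
```

If the decoded text is itself obfuscated or wrapped again, the same step can be repeated on it.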
Typical mechanisms used for making Lua code harder to follow include:
Renaming standard functions such as string.gsub, table.concat, etc.
Renaming variables to nonsense
Replacing dot- and colon-notation for table-indices with bracket-notation
Using hexadecimal escape notation for literal strings (often in combination with the bracket-notation above)
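To make these mechanisms concrete, here is a small made-up snippet next to its readable equivalent. The variable names `a`, `b` and `c` are invented for the example. Decimal escapes are used so that it also runs on Lua 5.1; hexadecimal escapes such as `\x67` need Lua 5.2 or later (or LuaJIT).

```lua
-- Obfuscated form: a standard function hidden behind a nonsense name,
-- bracket-notation instead of dot-notation, and escaped string literals.
local a = string["\103\115\117\98"]               -- "\103\115\117\98" is "gsub"
local b = a("\104\101\108\108\111", "l", "L")      -- first argument is "hello"

-- Readable equivalent of the two lines above.
local c = string.gsub("hello", "l", "L")

-- Both forms produce the same string.
assert(b == c)
```

Replacing the escapes with plain text and the bracket-notation with dot-notation is usually the first mechanical clean-up step.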
Generally speaking, the steps to de-obfuscate such code by hand are often very similar: reformat the code to make it easier to follow the control flow, then figure out what each variable represents and rename it. For this it is often necessary to have a good understanding of the language, as one needs to be aware of all the rules that the obfuscation takes advantage of to make the code harder to understand. A few such rules to be aware of:
Local variable shadowing: two different variables can have the same name in different scopes (or even in the same scope).
Syntactic sugar such as dot- and colon-notation
Function environments and getfenv and setfenv (Lua 5.1; replaced by _ENV in Lua 5.2 and later)
Metatables, and the fact that all strings share one metatable with __index set to string
Whitespace is often insignificant in Lua and only necessary to separate statements in some cases, which can also be done with ;.
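The rules above can be seen in a few lines of Lua. This is only a sketch with invented names (`x`, `t`, `f`, `p`, `q`, `r`); the `setfenv` part is guarded because that function only exists in Lua 5.1 and LuaJIT.

```lua
-- Shadowing: the second `local x` is a new variable, even in the same scope.
local x = 1
local x = x + 1
do
  local x = "inner"   -- a third, unrelated variable
end

-- Colon-notation is sugar for passing the table as the first argument,
-- and dot-notation is sugar for indexing with a string key.
local t = { name = "t" }
function t:get() return self.name end
assert(t:get() == t.get(t))
assert(t["get"](t) == "t")

-- All strings share one metatable whose __index is the string table.
assert(getmetatable("").__index == string)
assert(("abc"):upper() == string.upper("abc"))

-- Function environments (Lua 5.1 / LuaJIT only): globals can be redirected.
if setfenv then
  local function f() return y end
  setfenv(f, { y = 42 })
  assert(f() == 42)
end

-- Statements need neither line breaks nor semicolons between them.
local p = 1 local q = 2; local r = p + q
```

An obfuscator may lean on any of these at once, for example by calling string methods through the shared metatable on a shadowed variable.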
For more detailed help with de-obfuscating a specific snippet of Lua code, you could ask in the following other online communities:
The Lua subreddit
The Lua Scripters Discord Server
The Lua Forum
But remember: Don't ask to ask, just ask
Note that these are not official communities. For more options, see the Community page on the official Lua website.

Related

Parser for OCaml

Can anyone recommend me an open-source full OCaml parser?
Essentially, I would like to implement my own type-checker for OCaml. Ideally, the parser is written in OCaml. I would just use it to get the AST of the input program. (it is probably too much to ask for the initial typing environment pre-filled with standard library function signatures)
Use compiler-libs, which is distributed with OCaml under the QPL license. It has everything needed to create your own compiler (and even has some documentation). compiler-libs is essentially the compiler shipped as a library.
Otherwise, you can use camlp4 to get the parsetree, but then you need to reimplement everything else from scratch. In that case, though, you're not restricted by the QPL.
"it is probably too much to ask for the initial typing environment pre-filled with standard library function signatures"
It is not! See the files typing/predef.ml(i).
As for the stdlib, just use the same one as the compiler. Except for Pervasives, which uses values from Predef, the rest is normal OCaml code without any special cases (except bootstrapping, obviously).

Why do we need an embeddable programming language like Lua?

What are the typical use cases of using an embeddable programming language? Do I understand it correctly that such language should be embedded into some program environment and should be able to be executed from there?
Since you tagged the question as "Lua", I'll give you an answer in the context of this language.
Introduction
Lua is written in C (almost completely compatible with the C89 standard; the incompatible features can be easily disabled, if needed, using compile-time switches) and has been designed to be easily integrated with C code. In the context of Lua, "integrated" means two different, but related, things:
You can easily write C code that can be used as a library by Lua code. The integration is achieved by either statically or dynamically linking your C code to the Lua engine's code. The linked library can then be referred to in your Lua code using the Lua require function.
Lua engine can be easily embedded in a C application, i.e. linked (again either statically or dynamically) to the C application code. Then the C application can interact with the Lua code using Lua's C application programming interface (Lua C-API).
Note: this can be done, with a little more effort, also with a C++ application.
Advantages of embedding a Lua engine
If your C application embeds Lua, many, if not most, operations can be delegated to the Lua engine, i.e. either to code written using the C-API functions or, better yet, to Lua code. Lua code can be embedded as C strings inside your C code or stored as external Lua scripts.
Having part of your code logic implemented using Lua code has several advantages:
Lua is simpler (less tricky) to learn and use than C, and it is much more high-level. It supports powerful abstractions, such as function closures and object orientation (in a peculiar way, using Lua tables and metamethods).
Lua is a dynamic language: it requires no "off-line" compilation. You can modify the text of your Lua script and that's all you need to modify your application behavior (no additional compilation+linking steps needed). This simplifies application development and debugging.
Lua is a safer language than C: it is really difficult to write Lua code that exhibits undefined behavior, as intended in the context of C/C++. If a Lua script fails, it fails "loudly". Moreover Lua supports an exception mechanism (although with a different syntax than C++) which can be employed to implement error management in a much easier way compared to C.
Lua, as most dynamic languages, is garbage collected. This means that the programmer is spared the pain of manually managing dynamic memory, which is a major cause of bugs, leaks, instability and security loopholes in languages that lack garbage collection.
Lua can "eat its own dog food", i.e. you can build a string at runtime (even in Lua itself) and if it is valid Lua code, your program can execute it on the fly. This is something not frequently seen even in other dynamic languages (still it is not LISP, but it gets closer, and with much more readable syntax). This enables Lua scripts to:
employ powerful text-based metaprogramming techniques, where Lua code can generate other Lua code and execute it on the fly;
implement domain specific languages (DSLs) in an easy way; Lua code can load at runtime other Lua code that is crafted so as to reflect the specific problem domain in which it is used (Lua syntax is simple, yet flexible enough to allow such things);
be used as a configuration language with ease: your application (written in a mix of C and Lua) can use some lua files as configuration files without the need to craft an ad-hoc parser for a specific configuration file format. Therefore you don't need to parse *.properties, *.csv, *.ini, or whichever other format you would choose if you hadn't the option of using Lua files for that purpose.
The Lua engine has a very small memory footprint (a few hundred kB) while packing powerful capabilities. With very few lines of C code and a bunch of Lua files you could create a complete application that would otherwise require thousands of lines of C code. The standard Lua standalone interpreter can be seen as just an example of embedding Lua in a C application!
Lua has a very liberal open-source license, which enables its use even in commercial applications without much hassle. This also allows the modification of its source code to adapt it to special needs.
The small memory footprint and easily tweakable C sources make Lua a perfect candidate for porting to embedded systems or small microcomputer systems (microcontrollers, etc.). Many parts of the standard Lua distribution can be stripped off, reducing the core Lua engine to the ~100kB range. As an example, take the eLua project, a modified distribution of Lua designed for embedded devices.
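Two of the points above, running Lua code built at runtime (for example a configuration file) and failing "loudly" with catchable errors, can be sketched together as follows. The names `read_config`, `config_text` and `load_chunk` are made up for this example, and the configuration is a plain string here rather than a file. A real application would probably also restrict what the loaded chunk is allowed to access.

```lua
-- loadstring exists in Lua 5.1/LuaJIT; load accepts strings in Lua 5.2+.
local load_chunk = loadstring or load

-- Hypothetical configuration text; in practice this would be read from a file.
local config_text = "return { width = 800, height = 600, title = 'demo' }"

-- Compile and run the text, returning either the resulting table
-- or nil plus an error message.
local function read_config(text)
  local chunk, compile_err = load_chunk(text)
  if not chunk then
    return nil, compile_err      -- syntax error in the configuration
  end
  local ok, result = pcall(chunk) -- catch runtime errors instead of crashing
  if not ok then
    return nil, result
  end
  return result
end

local config = assert(read_config(config_text))
print(config.width, config.height, config.title)

-- A malformed configuration is reported, not silently ignored.
local broken, err = read_config("return {")
print(broken, err)
```

No ad-hoc parser is needed: the Lua compiler itself does the parsing, and `pcall` plays the role that try/catch plays in other languages.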
Lua and other scripting languages provide various benefits, depending on your needs.
Provide rapid iteration of development.
Allow run-time code changes, such as reloading your UI in World of Warcraft which re-loads all scripts without stopping the game engine itself or logging you out.
Provide a distinct API for your application for users to extend, without exposing critical parts of your system to the public. Such as text editors providing a macro language to allow you to integrate custom behaviour without giving you unfettered access to the internals of the editor itself.
The uses are really quite extensive and depend on the developer.

Game Engine Scripting Languages

I am trying to build a useful 3D game engine on top of the Ogre3D rendering engine for mocking up some ideas I have come up with, and I have come to a bit of a crossroads. There are a number of scripting languages available, and I was wondering whether one or two of them are well vetted and have a proper following.
LUA and Squirrel seem to be the more vetted, but I'm open to any and all.
Optimally it would be best if there were a compiled form for the language for distribution and ease of loading.
One interesting option is stackless-python. This was used in the Eve-Online game.
The syntax is a matter of taste, Lua is like Javascript but with curly braces replaced with Pascal-like keywords. It has the nice syntactic feature that semicolons are never required but whitespace is still not significant, so you can even remove all line breaks and have it still work. As someone who started with C I'd say Python is the one with esoteric syntax compared to all the other languages.
LuaJIT is also around 10 times as fast as Python and the Lua interpreter is much much smaller (150kb or around 15k lines of C which you can actually read through and understand). You can let the user script your game without having to embed a massive language. On the other hand if you rip the parser part out of Lua it becomes even smaller.
The Python/C API manual is longer than the whole Lua manual (including the Lua/C API).
Another reason for Lua is the built-in support for coroutines (co-operative multitasking within the one OS thread). It allows one to have like 1000's of seemingly individual scripts running very fast alongside each other. Like one script per monster/weapon or so.
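To give an idea of what "one script per monster" can look like, here is a minimal sketch. The `monster` helper, the `scripts` table and the round-robin loop are invented for this example; a real engine would resume each coroutine once per frame or per game tick.

```lua
-- Each monster "script" is a coroutine that yields after every step.
local function monster(name, steps)
  return coroutine.create(function()
    for i = 1, steps do
      coroutine.yield(name .. " step " .. i)
    end
  end)
end

local scripts = { monster("orc", 2), monster("bat", 3) }
local log = {}

-- Simple round-robin scheduler: resume every live script once per pass.
local alive = #scripts
while alive > 0 do
  alive = 0
  for _, co in ipairs(scripts) do
    if coroutine.status(co) ~= "dead" then
      local ok, msg = coroutine.resume(co)
      if ok and msg then
        log[#log + 1] = msg
      end
      if coroutine.status(co) ~= "dead" then
        alive = alive + 1
      end
    end
  end
end

for _, line in ipairs(log) do
  print(line)
end
```

All of this runs in a single OS thread; each script simply gives control back with `coroutine.yield` when it has done its work for the current pass.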
( Why do people write Lua in upper case so much on SO? It's "Lua" (see here). )
One more vote for Lua. Small, fast, easy to integrate, what's important for modern consoles - you can easily control its memory operations.
I'd go with Lua since writing bindings is extremely easy, the license is very friendly (MIT) and existing libraries also tend to be under said license. Scheme is also nice and easy to bind which is why it was chosen for the Gimp image editor for example. But Lua is simply great. World of Warcraft uses it, as a very high profile example. LuaJIT gives you native-compiled performance. It's less than an order of magnitude from pure C.
I wouldn't recommend LUA, it has a peculiar syntax so takes some time to get used to. Depending on who will be doing the scripting, this may not be a problem, but I would try to use something fairly accessible.
I would probably choose Python. It normally compiles to bytecode, so you would need to embed the interpreter. However, if you must, you can use PyPy to, for example, translate the code to C and then compile it.
Embedding the interpreter is no issue. I am more interested in features and performance at this point in time. LUA and Squirrel are both interpreted, which is nice because one of the games I am building is to include modifiable code, with an in-game editor.
I would love to hear about Python, as I have seen it used in the Battlefield series, I believe.
Python is also nice because it has actual OGRE bindings, just in case you need to modify something lower-level on the fly. I don't know of any equivalent bindings for Lua.
Since it's a C++ library, I would suggest either JavaScript or Squirrel, the latter being my personal favorite of the two for being even closer to C++, in particular to how it handles tables/structs and classes. It would be the easiest to get used to for a C++ coder because of all the similarities.
However, if you go with JavaScript and find an HTML5 version of Ogre3D, you should be able to port your game code directly into the web version with minimal (if any) changes necessary.
Both of these are a good choice, and both have their pros and cons, but both would definitely be the easiest to learn since you're likely already working in C++. If you're working with Java, the same may hold true, and if it's Game Maker, you wouldn't need either one unless you're trying to make an executable that people wouldn't need Game Maker itself to run, in which case, good luck finding an extension to run either of these.

Resources