Lua/LuaJIT decompilation challenge

I have stumbled upon a Lua script which I am trying to decompile. So far I have tried several versions of the standard Lua decompilers, such as unluac and luadec. I get either a "not a precompiled chunk" or a "bad header in precompiled chunk" error.
I have also tried different Lua versions and 32-bit and 64-bit architectures for the decompilers.
I have looked at the header and it reads something like this - 1b 4c 4a 01 02 52 20 20 in hex. Looks almost correct but it seems to me like the signature starts with one extra byte and the 6th and 7th byte are wrong. Also, there is 52 in there which I assume is the Lua's version.
As the signature is one byte too long, the normal decompilers don't work. I have a suspicion that this might be LuaJIT bytecode: if you interpret those bytes as ANSI text, you will notice the ESC, L, J sequence, which is the header signature of LuaJIT bytecode.
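The signature comparison described above can be sketched in a few lines. This is only an illustration: `identify_chunk` is a hypothetical helper name, and it looks at nothing beyond the leading signature and the version byte that follows it.

```python
def identify_chunk(header: bytes) -> str:
    """Classify a precompiled chunk by its leading signature bytes (sketch only)."""
    if header.startswith(b"\x1bLua") and len(header) > 4:
        # Standard Lua: the byte after ESC "Lua" encodes the version, e.g. 0x51 for 5.1.
        version = header[4]
        return f"PUC-Rio Lua {version >> 4}.{version & 0x0F} bytecode"
    if header.startswith(b"\x1bLJ") and len(header) > 3:
        # LuaJIT: the byte after ESC "LJ" is the bytecode format version.
        return f"LuaJIT bytecode (format version {header[3]})"
    return "not a recognized precompiled chunk"


# The header bytes quoted in the question.
header = bytes.fromhex("1b 4c 4a 01 02 52 20 20")
print(identify_chunk(header))
```

A check like this only tells the two families apart; it does not validate the rest of the file.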
In case this is a LuaJIT binary, where do I start with decompiling it? I have tried some decompilers, but they all seem to be extremely out of date or fail at some point during execution (I am talking about LJD and its derivatives).
Any suggestions on how to analyze, view opcodes or decompile it would be greatly appreciated.
Here is the file I am talking about, in case someone wants to have a go (apologies, it is too long to post here):
https://pastebin.com/eeLHsiXk
https://filebin.net/r0hszoeh8zscp8dh

Related

x86 32bit Assembly Parser | logical problem

I'm currently working on an Obfuscator for assembled x86 assembly (working with the raw bytes).
To do that I first need to build a simple parser, to "understand" the bytes.
I'm using a database that I built myself, mostly with the help of this website: https://defuse.ca/online-x86-assembler.htm
Now my question:
Some bytes can be interpreted in two ways, for example (intel syntax):
1. f3 00 00 repz add BYTE PTR [eax],al
2. f3 repz
My idea was to loop through the bytes and treat every instruction individually,
but when I reach the byte '0xf3' I have 2 ways of interpreting it.
I know there are working x86 disassemblers out there. How do they know which case applies?

Prefixes, including the repz prefix, are not meaningful without a subsequent instruction. The subsequent instruction may incorporate the prefix (repz nop is pause), change its meaning (repz is xrelease if used before some interlocked instructions), or the prefix may simply be invalid.
The decoding is always unambiguous, otherwise the CPU could not execute instructions. It may be ambiguous only if you don't know exact byte offset where to begin decoding (as x86 uses variable instruction length).
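A toy decoder can make this concrete. The sketch below covers only a tiny hand-picked subset of 32-bit x86 (two prefixes and three opcodes), and the names `decode_one` and `decode_all` are hypothetical; a real disassembler needs full opcode and ModRM/SIB tables.

```python
# Toy decoder for a tiny hand-picked subset of 32-bit x86; illustration only.
PREFIXES = {0xF2: "repnz", 0xF3: "repz"}
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_one(code: bytes, offset: int):
    """Return (text, length) for the single instruction starting at offset."""
    pos = offset
    prefixes = []
    # A prefix is never an instruction by itself: consume prefixes until an opcode appears.
    while pos < len(code) and code[pos] in PREFIXES:
        prefixes.append(PREFIXES[code[pos]])
        pos += 1
    if pos >= len(code):
        raise ValueError("prefix bytes without a following opcode")
    opcode = code[pos]
    pos += 1
    if opcode == 0x90:
        # F3 90 is its own instruction (pause), not "repz nop".
        if "repz" in prefixes:
            return "pause", pos - offset
        text = "nop"
    elif opcode == 0xC3:
        text = "ret"
    elif opcode == 0x00:
        # ADD r/m8, r8 needs a ModRM byte; only the plain [reg32] form is handled here.
        if pos >= len(code):
            raise ValueError("truncated instruction")
        modrm = code[pos]
        pos += 1
        mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
        if mod != 0 or rm in (4, 5):
            raise NotImplementedError("addressing form outside this sketch")
        text = f"add BYTE PTR [{REG32[rm]}],{REG8[reg]}"
    else:
        raise NotImplementedError(f"opcode {opcode:#04x} outside this sketch")
    return " ".join(prefixes + [text]), pos - offset


def decode_all(code: bytes):
    """Decode instructions back to back, starting at offset 0."""
    out, offset = [], 0
    while offset < len(code):
        text, length = decode_one(code, offset)
        out.append((text, length))
        offset += length
    return out


print(decode_all(bytes.fromhex("f30000f390c3")))
# Starting one byte later gives a different, but still unambiguous, decoding.
print(decode_one(bytes.fromhex("f30000"), 1))
```

The second call shows the only source of ambiguity mentioned above: the same bytes decode differently when the starting offset is different.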

Add to zero...What is it for?

Why such code is used in some applications instead of a MOVE?
add 16 to ZERO giving SOME-RESULT
I spotted this in professionally written code at several spots.
Source is on this page

Why such code is used in some applications instead of a MOVE?
add 16 to ZERO giving SOME-RESULT
Without seeing more of the code, it appears that it could be a translation of IBM Assembler to COBOL. In particular, the ZAP (Zero and Add Packed) instruction may be literally translated to the above instruction, particularly if SOME-RESULT is COMP-3. Thus, someone checking the translation could see that the ZAP instruction was faithfully translated.
Or, it could be an assembler programmer's idea of a joke.
Having seen the code, I also note the use of
subtract some-data-item from some-data-item
which is used instead of
move zero to some-data-item
This is consistent with operations used with packed decimal fields in IBM Assembly, where there are no other instructions to accomplish "flexible" moves. By flexible, I mean that the packed decimal instructions contain a length field so that specific size MVC instructions need not be used.
This particular style, being unusual, may be related to catching copyright violations.

From my experience, I'm pretty sure I know the reason why the programmer would have done this. It has something to do with the binary representation of the number.
I bet SOME-RESULT is a packed-decimal (or COMP-3) format number. Let's assume the field is defined like this
05 SOME-RESULT PIC S9(5) COMP-3.
This results in a 3-byte field with a hex representation like this
x'00016C'
The decimal number is encoded as binary-coded decimal (BCD, one decimal digit per half-byte), and the last half-byte holds the sign.
Let's take a look at how the sign is defined:
if it is one of x'C', x'A', x'F', x'E' (café), then the number is positive
if it is one of x'B', x'D', then the number is negative
any of x'0'..x'9' are not valid signs, so we can distinguish signed packed-decimals from unsigned.
However, a zoned number (PIC 9(5) DISPLAY) - as in the source code - looks like this:
x'F0F0F0F1F6'
As you can see, each decimal digit is an EBCDIC character with the 'zone' part (the first half-byte) always being x'F'.
Now we get closer to your question!
What happens when we use
MOVE 16 TO SOME-RESULT
If you just MOVE a number to such a field, the statement is compiled into a PACK instruction at the machine-code level.
PACK SOME-RESULT,=C'16'
A PACK instruction takes a zoned number and packs it by picking only the second half-byte of each byte and storing it in the half-bytes of the packed number - with one exception! When it comes to the last byte, it simply flips the two half-bytes and stores them in the last byte of the packed decimal.
This means that the zone of the last byte of the zoned decimal becomes the sign in the packed decimal:
x'00016F'
So now we have an x'F' as the sign – which is a valid positive sign.
However, what happens if we use this COBOL instruction instead
ADD 16 TO ZERO GIVING SOME-RESULT
This compiles into multiple machine level instructions
PACK SOME_RESULT,=C'0'
PACK TEMP,=C'16'
AP SOME_RESULT,TEMP
(or similar - the key point is that it needs an AP somewhere)
This makes a slight difference in the result, because the AP (add packed) instruction always sets the resulting sign to either x'C' for a positive or x'D' for a negative result.
So the difference lies in the sign
x'00016C'
Finally, the question is why would one make this difference? After all, both x'F' and x'C' are valid positive signs. So why care?
There is one situation when this slight difference can cause big problems: When the packed decimal is part of an index key, then we would not get a match, even though the numbers are semantically identical!
Because this situation occurred quite often in older databases like VSAM and DL/I (later: IMS/DB), it became good practice to "normalize" packed decimals if they were part of an index key.
However, some programmers adopted the practice without knowing why, so you may come across code that uses this "normalization" even though the data are not used for index keys.
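The byte-level difference can be reproduced with a small model. This is a simplified sketch, not the machine's full behaviour: `pack` imitates only the nibble shuffling of PACK described above, and `preferred_sign` imitates only the sign effect attributed to AP. Both function names are hypothetical.

```python
POSITIVE_SIGNS = {0xA, 0xC, 0xE, 0xF}


def pack(zoned: bytes, length: int) -> bytes:
    """Simplified PACK: zoned decimal -> packed decimal of `length` bytes."""
    # Digit half-bytes of all but the last byte, then the last byte with its
    # two half-bytes flipped (digit first, zone last, so the zone becomes the sign).
    nibbles = [b & 0x0F for b in zoned[:-1]] + [zoned[-1] & 0x0F, zoned[-1] >> 4]
    # Pad with zero digits on the left (or truncate on the left) to fit the field.
    nibbles = ([0] * (2 * length) + nibbles)[-2 * length:]
    return bytes((nibbles[i] << 4) | nibbles[i + 1] for i in range(0, len(nibbles), 2))


def preferred_sign(packed: bytes) -> bytes:
    """Rewrite the sign half-byte to x'C' or x'D', as the result of an AP would have it."""
    sign = packed[-1] & 0x0F
    new_sign = 0xC if sign in POSITIVE_SIGNS else 0xD
    return packed[:-1] + bytes([(packed[-1] & 0xF0) | new_sign])


moved = pack(bytes.fromhex("F1F6"), 3)   # MOVE 16: zoned EBCDIC '16' packed as-is
added = preferred_sign(moved)            # ADD 16 TO ZERO: sign rewritten by the add
print(moved.hex(), added.hex())
print(moved == added)                    # a bytewise key comparison does not match
```

Both values mean +16, yet a bytewise comparison, which is what an index key lookup amounts to, treats them as different.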
You might also wonder why a compiler does not optimize out the ADD 16 TO ZERO. I'm pretty sure it once did, but that broke a lot of applications, so this specific optimization was removed again or at least made a non-default option with warnings.
Additional useful info
Note that at least the Enterprise COBOL for z/OS compiler allows you to see exactly the machine code that is produced from your source code if you use the LIST compile option (see this example output). I recommend always compiling with the options LIST, MAP, OFFSET, XREF, because these options enable you to find the exact problem in your COBOL source even when you only have a program dump from an abend.
Anyway, good programming practice is not to care about the compiler or the machine code, but about the other programmers who will have to maintain, and thus read and understand the code. Good practice would be to always prefer simple and readable instructions, and to document the reasons (right in the code) when deviating from this rule.

Some programmers like to do things "just because they can". I have a feeling that is what you are seeing here. It makes about as much sense as doing
a := 0 + b
would in Go.

Random Letters When Decrypted

Today I decrypted some Lua 5.3 Bytecode, but I'm wondering why there are random letters coming up.
AH���U��H�D#�#�A�H��E#F������#�\��H�A ����U�G�E# F����I���E# F���I���E# F#���I����*p#�� 640#0stringformat%X�o#��#0#��# �####�#8�#��#��#��#��#<�#��#��##�#p�#n�#p�#��#n�#��#>�#��#`�##�#getRainbow0x00HacksHacksBackgroundColor CEPanel2Hacks 84itle��A�A�#A��disableMenuHacksdragNow���#ƀ��� ���PictureHacksHacksClose_On��E#F��F� #��PictureHacksHacksClose_Off���#ƀ��� ���Pictu 114eHacks
HacksLock_On��E#F��F� #��PictureHacksHacksLock_Off���#ƀ��� ���PictureHacksMenu_On��E#F��F� #��PictureHacks Menu_Off��EF#�F��#��#� #A 130��EF#���##��#��AI���Hacks CEPanel1Heighti#Enabled$#��EF#�F������EF�I�#�EF#�F������EF#�I�#��HacksPage2CheckedPage3Page1��EF#�F������EF�I�#�EF#�F������EF#�I 192#��HacksPage2CheckedPage1Page3��EF#�I�#�EF�I#A�EF��I#A�EF��I#B�EF��IÅEF#�I�Å�HacksHacksPage1visibleHacksPage2HacksPage3
HacksPageCaptionPage 1 of 3HacksPageDownColor
I got the bytecode from my friend. I decrypted it fast but I'd like to know if there's any way to fix those random letters or decrypt them. Any ideas? Thanks.

The � symbol is used when Unicode runs into "an unknown, unrecognizable or unrepresentable character." So either your decryption isn't working properly and is getting some corrupted characters, your friend's encryption isn't working properly and when he encrypted it, the data was corrupted, or somewhere in-between the bytecode was corrupted or altered in some way that is affecting the decryption.
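How the � symbol appears can be shown with a few bytes. The sketch below assumes the bytes are decoded as UTF-8 with replacement of invalid sequences; the sample bytes are illustrative (they resemble the start of a Lua 5.3 chunk) and are not taken from the poster's file.

```python
# Illustrative bytes only, not the poster's actual file.
raw = bytes([0x1B, 0x4C, 0x75, 0x61, 0x53, 0x00, 0x19, 0x93, 0x0D, 0x0A])

# Viewing binary data as text: every byte sequence that is not valid UTF-8
# is replaced by U+FFFD, which is rendered as the � symbol.
as_text = raw.decode("utf-8", errors="replace")
replaced = as_text.count("\ufffd")
print(repr(as_text), replaced)

# A hex view shows the same data without any loss.
print(raw.hex(" "))
```

Once a byte has been replaced by U+FFFD, its original value cannot be recovered from the text, so any further inspection should start from the raw bytes.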
I haven't done much messing around in Lua with bytecode, but I would suggest having a look at the Lua Unicode library page, or look into this module which provides support for UTF-8 for Lua and LuaJIT. Finally, this Stack Overflow question has a good explanation of how Lua's support for Unicode works.
Go back to your friend and double-triple check that the bytecode you decrypted is exactly identical to the bytecode after he encrypted it, and make sure that both your method of decryption and his method of encryption are working correctly.

Lua String char encoding

I can't see what encoding Lua uses for its strings.
I'm using
string.byte (s [, i [, j]])
which is documented as follows:
Returns the internal numerical codes of the characters s[i], s[i+1],
···, s[j]. The default value for i is 1; the default value for j is i.
Note that numerical codes are not necessarily portable across
platforms.
Reading around, people suggest it uses ASCII, which is fine for me, but I don't get the note about codes changing across platforms. I thought the very nature of using a single encoding (like ASCII) is that this wouldn't happen. Or is it just saying this because ASCII does not define anything above 127, and therefore different countries / OEMs / OSs etc. may be using custom ASCII extensions from decades ago for the upper range?
It's important for me to know that [a-zA-Z] will have the same char values on all platforms I'm running on.
The Lua doc could be a bit more specific here!
Any light anyone can shed on this would be great, thanks.

I'm fairly sure you can safely assume an ASCII-derived encoding. So the minuscule set of characters you're interested in stays the same.
The note about the code changing between platforms likely means that Lua doesn't know anything about the character encoding at all and thus just uses whatever bytes the OS hands out. On Linux this is likely UTF-8, which means you'd have to deal with individual code units when stepping outside ASCII. On Windows I could imagine it being the system's legacy codepage, which means sort-of Latin 1 (CP 1252) in much of the Western world.
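The claim that [a-zA-Z] stays the same across ASCII-derived encodings can be checked directly. The sketch below uses Python's codec names as stand-ins for the platform encodings mentioned above; `letter_codes` is a hypothetical helper, and the EBCDIC code page cp037 is added purely as a contrast case that is not ASCII-derived.

```python
import string

ASCII_DERIVED = ["ascii", "utf-8", "cp1252", "latin-1"]


def letter_codes(encoding: str) -> dict:
    """Map each of a-z and A-Z to its single-byte code in the given encoding."""
    return {ch: ch.encode(encoding)[0] for ch in string.ascii_letters}


reference = letter_codes("ascii")
stable = all(letter_codes(enc) == reference for enc in ASCII_DERIVED)
print(stable)

# Contrast: EBCDIC assigns different codes to the same letters.
print(hex(reference["A"]), hex(letter_codes("cp037")["A"]))
```

The contrast case is the kind of platform the "not necessarily portable" note in the manual leaves room for.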

How to decrypt Lua bytecode?

Good morning, I'm trying to decode some Lua bytecode, but I can't manage it at all. Could anyone help me with this?
I have this, for example:
code = '\27\76\117\97\81\0\1\4\4\4\8\0\'
How do I decrypt this bytecode to text?
I have already searched here: http://www.asciitable.com/
But I found no result, because some of the codes do not exist in the table.
Please help me with this...
I have been trying for several days with no success.

This seems to be bytecode for Lua 5.1. It is not cryptographically encrypted and can easily be read with luac -l -p (not in source form but as VM instructions, which are probably enough to reconstruct the source). If you want to reconstruct the source, try LuaDec for Lua 5.1.

have this, example:
code = '\27\76\117\97\81\0\1\4\4\4\8\0\'
How I decrypt this bytecode to text?
The sequence above is what Lua bytecode looks like. The first character, '\27' (ESC), tells Lua that the file is bytecode rather than text. The sequence is \27 followed by "Lua" ('\76\117\97'), followed by \81 (0x51), which indicates Lua 5.1 bytecode, and so on. For details, have a look at this link: http://howto.oz-apps.com/2012/04/delve-deeper-into-lua-and-compilation.html
A very good resource can be found at http://chunkspy.luaforge.net/, along with a wonderfully detailed PDF from Kein-Hong Man.
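The twelve bytes from the question can be decoded field by field. The sketch below assumes the standard Lua 5.1 header layout (4 signature bytes followed by 8 single-byte fields); `parse_lua51_header` and the field names are hypothetical labels chosen for readability.

```python
FIELDS = [
    "version",             # 0x51 for Lua 5.1
    "format",              # 0 = official format
    "little_endian",       # 1 = little endian, 0 = big endian
    "sizeof_int",
    "sizeof_size_t",
    "sizeof_instruction",
    "sizeof_lua_number",
    "integral_flag",       # 0 = floating-point lua_Number
]


def parse_lua51_header(chunk: bytes) -> dict:
    """Split the 12-byte Lua 5.1 chunk header into named fields (sketch only)."""
    if len(chunk) < 12 or chunk[:4] != b"\x1bLua":
        raise ValueError("not a Lua 5.1-style precompiled chunk")
    return dict(zip(FIELDS, chunk[4:12]))


# The bytes from the question: '\27\76\117\97\81\0\1\4\4\4\8\0'
code = bytes([27, 76, 117, 97, 81, 0, 1, 4, 4, 4, 8, 0])
header = parse_lua51_header(code)
print(hex(header["version"]), header)
```

Everything after these twelve bytes is function data (instructions, constants, debug information), which is what tools such as luac -l or ChunkSpy list.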

Like others have said, you can use LuaDec for Lua 5.1.
What I would do for bytecode like this, though, is use this number extractor to get only the numbers, then paste them into this numbers-to-UTF-8 converter.
The output may look like random words at first. You need to put the numbers on one single line, with a space wherever there was a line break, like this:
27 76 117 97 81 0 1 4 4 4 8 0
and NOT like this.
27
76
117
97
81
etc. You should then get the answer, but this might not work on other bytecode; it is pretty much just the manual way. It's better to use LuaDec for Lua 5.1. If you still want to try this, you will not get the exact source string, and if you do, good job, but that would be very bad encryption.
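The manual approach above can also be scripted. This is a minimal sketch that assumes the input is a string of Lua decimal escapes like the one in the question; `escapes_to_bytes` and `printable_runs` are hypothetical helpers, and the result is only the readable fragments, not the source code.

```python
import re

PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{3,}")


def escapes_to_bytes(text: str) -> bytes:
    """Turn Lua decimal escapes such as \\27\\76 into the bytes they stand for."""
    return bytes(int(number) for number in re.findall(r"\\(\d{1,3})", text))


def printable_runs(data: bytes) -> list:
    """Collect runs of at least three printable ASCII characters."""
    return [run.decode("ascii") for run in PRINTABLE_RUN.findall(data)]


escaped = r"\27\76\117\97\81\0\1\4\4\4\8\0"
data = escapes_to_bytes(escaped)
print(data)
print(printable_runs(data))
```

On real bytecode this recovers string constants and names only; the instructions between them remain binary.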

You can't turn bytecode into text. It's not text but instructions to the Lua interpreter.
