Decompressing LZW in Lua [duplicate] - lua

Here is the Pseudocode for Lempel-Ziv-Welch Compression.
pattern = get input character
while ( not end-of-file ) {
K = get input character
if ( <<pattern, K>> is NOT in
the string table ){
output the code for pattern
add <<pattern, K>> to the string table
pattern = K
}
else { pattern = <<pattern, K>> }
}
output the code for pattern
output EOF_CODE
I am trying to code this in Lua, but it is not really working. Here is the code I modeled after an LZW function in Python, but I am getting an "attempt to call a string value" error on line 8.
function compress(uncompressed)
local dict_size = 256
local dictionary = {}
w = ""
result = {}
for c in uncompressed do
-- while c is in the function compress
local wc = w + c
if dictionary[wc] == true then
w = wc
else
dictionary[w] = ""
-- Add wc to the dictionary.
dictionary[wc] = dict_size
dict_size = dict_size + 1
w = c
end
-- Output the code for w.
if w then
dictionary[w] = ""
end
end
return dictionary
end
compressed = compress('TOBEORNOTTOBEORTOBEORNOT')
print (compressed)
I would really like some help either getting my code to run, or helping me code the LZW compression in Lua. Thank you so much!

Assuming uncompressed is a string, you'll need to use something like this to iterate over it:
for i = 1, #uncompressed do
local c = string.sub(uncompressed, i, i)
-- etc
end
There's another issue on line 10; .. is used for string concatenation in Lua, so this line should be local wc = w .. c.
You may also want to read up on the performance of string concatenation in Lua. Long story short, it's often more efficient to collect the pieces in a table and join them once at the end with table.concat().
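Putting both fixes together, a corrected version of the function might look like the sketch below. It assumes the input is a plain byte string, seeds the dictionary with all 256 single-byte strings, and returns the list of output codes rather than the dictionary. Unlike the pseudocode, it does not emit an EOF_CODE.

```lua
-- A minimal LZW compression sketch: returns a table of integer codes.
local function compress(uncompressed)
  -- Seed the dictionary with every single-byte string (codes 0..255).
  local dictionary = {}
  for i = 0, 255 do
    dictionary[string.char(i)] = i
  end
  local dict_size = 256

  local w = ""
  local result = {}
  for i = 1, #uncompressed do
    local c = uncompressed:sub(i, i)
    local wc = w .. c                  -- ".." concatenates strings in Lua
    if dictionary[wc] then
      w = wc                           -- keep extending the current pattern
    else
      result[#result + 1] = dictionary[w]  -- output the code for w
      dictionary[wc] = dict_size           -- add wc to the dictionary
      dict_size = dict_size + 1
      w = c
    end
  end
  -- Output the code for whatever pattern is left over.
  if w ~= "" then
    result[#result + 1] = dictionary[w]
  end
  return result
end

local codes = compress("TOBEORNOTTOBEORTOBEORNOT")
print(table.concat(codes, ","))
-- 84,79,66,69,79,82,78,79,84,256,258,260,265,259,261,263
```

Note that print(compressed) on a table only prints its address, which is why the example joins the codes with table.concat before printing.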

It is also worth looking at the source of an existing high-performance LZW compression implementation in Lua.

Related

How do I use more than one pattern for gmatch

Hello, I am trying to get some data from a text file and put it into a table.
I'm not sure how to use more than one pattern while also doing what I want. I know that the pattern %a+ on its own finds letters and %b{} finds balanced brackets, but I am not sure how to combine them so that the letters become a key and the bracketed text becomes a value in a table that I can use.
text file :
left = {{0,63},{16,63},{32,63},{48,63}}
right = {{0,21},{16,21},{32,21},{48,21}}
up = {{0,42},{16,42},{32,42},{48,42}}
down = {{0,0},{16,0},{32,0},{48,0}}
code:
local function get_animations(file_path)
local animation_table = {}
local file = io.open(file_path,"r")
local contents = file:read("*a")
for k, v in string.gmatch(contents, ("(%a+)=(%b{})")) do -- A gets words and %b{} finds brackets
animation_table[k] = v
print("key : " .. k.. " Value : ".. v)
end
file:close()
end
get_animations("Sprites/Player/MainPlayer.txt")
This is valid Lua code, why not simply execute it?
left = {{0,63},{16,63},{32,63},{48,63}}
right = {{0,21},{16,21},{32,21},{48,21}}
up = {{0,42},{16,42},{32,42},{48,42}}
down = {{0,0},{16,0},{32,0},{48,0}}
If you don't want the data in globals, use the string library to turn it into
return {
left = {{0,63},{16,63},{32,63},{48,63}},
right = {{0,21},{16,21},{32,21},{48,21}},
up = {{0,42},{16,42},{32,42},{48,42}},
down = {{0,0},{16,0},{32,0},{48,0}},
}
before you execute it.
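One way to sketch that wrapping step is shown below. It assumes Lua 5.2 or later (for the four-argument form of load) and assumes every non-empty line of the file is a single name = value assignment. The function name load_as_table is made up for this illustration.

```lua
-- Turn "name = value" lines into one table without touching globals.
-- Hypothetical helper; assumes Lua 5.2+ and one assignment per line.
local function load_as_table(contents)
  -- Append a comma to every non-empty line so the lines become table fields.
  local body = contents:gsub("([^\r\n]+)", "%1,")
  -- Compile as text only ("t") with an empty environment for safety.
  local chunk, err = load("return {\n" .. body .. "\n}", "animations", "t", {})
  if not chunk then
    return nil, err
  end
  return chunk()
end

local animations = load_as_table(
  "left = {{0,63},{16,63}}\n" ..
  "up = {{0,42},{16,42}}\n")
print(animations.left[2][1], animations.left[2][2]) -- 16  63
```

Passing an empty table as the environment keeps the loaded chunk from reading or writing globals, which matters if the file is not fully trusted.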
If you insist on parsing that file, you can use something like this for each line:
local line = "left = {{0,63},{16,63},{32,63},{48,63}}"
print(line:match("^%w+")) -- the key, e.g. "left"
for num1, num2 in line:gmatch("(%d+),(%d+)") do -- each pair of numbers
print(num1, num2)
end
This should be enough to get you started. Of course you wouldn't print those values but put them into a table.
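As a rough sketch of that last step, the per-line matching can be wrapped in a function that fills a table instead of printing. The name parse_animations and the shape of the result (name -> list of {x, y} pairs) are assumptions made for this example.

```lua
-- Parse lines like "left = {{0,63},{16,63}}" into
-- { left = { {0,63}, {16,63} } }. Hypothetical helper for illustration.
local function parse_animations(text)
  local animations = {}
  for line in text:gmatch("[^\r\n]+") do
    -- The key is the leading word before the "=" sign.
    local name = line:match("^%s*(%a+)%s*=")
    if name then
      local frames = {}
      -- Each "number,number" pair becomes one {x, y} entry.
      for x, y in line:gmatch("(%d+),(%d+)") do
        frames[#frames + 1] = { tonumber(x), tonumber(y) }
      end
      animations[name] = frames
    end
  end
  return animations
end

local anims = parse_animations(
  "left = {{0,63},{16,63},{32,63},{48,63}}\n" ..
  "down = {{0,0},{16,0},{32,0},{48,0}}\n")
print(#anims.left, anims.left[2][1], anims.left[2][2]) -- 4  16  63
```

The pattern allows optional spaces around the "=" sign, which the "(%a+)=(%b{})" pattern in the question does not.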

Lua - too many captures. How to fix it?

I'm having problems with this one. If I try to convert Cyrillic words, or text with too many symbols, I get an error.
function to_string(t)
local o = {};
for _, v in ipairs(t) do
local b = v < 0 and (0xff + v + 1) or v;
table.insert(o, string.char(b));
end
return table.concat(o);
end
function to_bytes(s)
local c = { s:match( (s:gsub(".", "(.)")) ) };
local o = {};
for _, v in pairs(c) do
table.insert(o, v:byte());
end
return o;
end
local t = to_bytes("If this have to many words или русские");
local out = "\\"
local chars = #t;
for i = 1, chars do
out = out..tostring(t[i]);
if i < chars then
out = out.."\\"
end
end
out = out..""
I think the error is self-explanatory: you have too many captures in your pattern (the groups wrapped in parentheses), because to_bytes builds one capture per byte of the input. The default limit is 32. You have a couple of options: (1) recompile Lua with a larger limit (you'll have to modify the LUA_MAXCAPTURES value), but keep in mind that this limit is there for a reason, or (2) change your pattern to avoid this many captures (possibly by splitting it into smaller fragments/patterns). You may also consider using a more powerful parser, like LPEG.
You don't need patterns to convert a string to an array of bytes:
function to_bytes(s)
return {s:byte(1, -1)}
end
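For very long strings, s:byte(1, -1) returns one value per byte and may hit Lua's stack limit, so a loop is a safer variant. The sketch below also rebuilds the backslash-escaped output from the question with table.concat. The name to_escaped is invented for this example.

```lua
-- Convert a string to an array of byte values, one byte at a time.
local function to_bytes(s)
  local bytes = {}
  for i = 1, #s do
    bytes[i] = s:byte(i)
  end
  return bytes
end

-- Build "\72\105..." style output (hypothetical helper name).
local function to_escaped(s)
  local parts = {}
  for i, b in ipairs(to_bytes(s)) do
    parts[i] = tostring(b)
  end
  return "\\" .. table.concat(parts, "\\")
end

print(to_escaped("AB")) -- \65\66
```

Because this works on bytes rather than captures, the length of the input and the presence of multi-byte Cyrillic characters no longer matter.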

How to make encode and decode functions as I want?

--encode
function strToBytes(str)
local bytes = { str:byte(1, -1) }
for i = 1, #bytes do
bytes[i] = bytes[i] + 100
end
return table.concat(bytes, ',')
end
--decode
function bytesToStr(str)
local function gsub(c)return string.char(c - 100) end
return str:gsub('(%d+),?', gsub) end
implemented :
str = "hello world"
strbyte = strToBytes(str)
bytestr = bytesToStr(strbyte)
print(strbyte)
Output :
204,201,208,208,211,132,219,211,214,208,200
print(bytestr)
Output :
hello world
Hi, I need to improve the code above. The encode and decode functions work fine, but I need a small change.
I want the encode function to work like the code above, but the result should be a table like this:
{204,201,208,208,211,132,219,211,214,208,200}
Then, as with my first decode function, all the bytes in the table should be turned back into "hello world".
I hope my purpose and explanation are easy to understand. Thanks in advance for any help and suggestions.
Update explanation :
It is a little complicated to explain my purpose, but I will try to explain as well as I can.
I am trying to make a script encoder. The encode function lives in the encoder script, and the decode function lives in the encoded script, so I must write the decode call in front of the encoded string.
To clarify, the encoder script loads the source code that has not been encoded yet.
file = io.open(path, "r")
local data = file:read("*l")
The problem is that a table can't be concatenated with a string.
local data = encode(str)--the result is byte array
local data = "decode("..data..")"
file:write(data)
file:close()
local data = string.dump(load(data),true,true)
My first purpose is to hide some important strings, because the string.dump result does not hide all strings.
My second purpose is to make obfuscated code using a byte array.
Any solution or suggestion?
SOLVED
function strToBytes(str)
local byteArray= { str:byte(1, -1) }
for i = 1, #byteArray do
byteArray[i] = byteArray[i] + 100
encoded = '{' ..table.concat(byteArray, ',') .. '}'
end
return "load(string.dump(load(bytesToStr("..encoded.."))))()\n"
end
Thank you so much... 👍
Your code was very close to what you were looking for.
--encode
function strToBytes(str)
local byteArray= { str:byte(1, -1) }
for i = 1, #byteArray do
byteArray[i] = byteArray[i] + 100
end
return '{' .. table.concat(byteArray, ',') .. '}'
end
For the encode, I wrapped the table.concat result in braces, so the function returns a string that is a valid Lua table constructor.
--decode
function bytesToStr(byteArray)
local output = "" --initialize output variable
for _,b in ipairs(byteArray) do --use ipairs to preserve order
output = output .. string.char(b - 100) --convert each byte to a char and add to output
end
return output
end
For the decode I use a for loop with ipairs to iterate over each byte and concatenate the values into an output variable.
-- test
str = "hello world!"
strbyte = strToBytes(str)
bytestr = 'return bytesToStr(' .. strbyte .. ')'
strBack = string.dump(load(bytestr),true,true)
print(strbyte)
print(bytestr)
print(load(strBack)())
Test output:
{204,201,208,208,211,132,219,211,214,208,200,133}
return bytesToStr({204,201,208,208,211,132,219,211,214,208,200,133})
hello world!
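If an actual Lua table is wanted from the encoder, as the question first asked, a variant along these lines might work. It is only a sketch: the names strToByteTable and byteTableToStr and the OFFSET constant are made up here, and the decoder collects characters in a table and joins them once instead of concatenating in a loop.

```lua
local OFFSET = 100 -- same shift as in the question

-- Encode: returns a real table of shifted byte values.
local function strToByteTable(str)
  local bytes = { str:byte(1, -1) }
  for i = 1, #bytes do
    bytes[i] = bytes[i] + OFFSET
  end
  return bytes
end

-- Decode: undo the shift and join the characters with table.concat.
local function byteTableToStr(bytes)
  local chars = {}
  for i, b in ipairs(bytes) do
    chars[i] = string.char(b - OFFSET)
  end
  return table.concat(chars)
end

local encoded = strToByteTable("hello world")
print(table.concat(encoded, ","))
-- 204,201,208,208,211,132,219,211,214,208,200
print(byteTableToStr(encoded)) -- hello world
```

When the table has to be embedded in generated source code, it can still be serialized at that point with '{' .. table.concat(encoded, ',') .. '}'.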

Iterate Chinese string in Lua / Torch

I have a lua string in Chinese, such as
str = '这是一个中文字符串' -- in English: 'this is a Chinese string'
Now I would like to iterate the string above, to get the following result:
str[1] = '这'
str[2] = '是'
str[3] = '一'
str[4] = '个'
str[5] = '中'
str[6] = '文'
str[7] = '字'
str[8] = '符'
str[9] = '串'
and also output 9 for the length of the string.
Any ideas?
Something like this should work if you are using utf8 module from Lua 5.3 or luautf8, which works with LuaJIT:
local str = '这是一个中文字符串'
local tbl = {}
for p, c in utf8.codes(str) do
table.insert(tbl, utf8.char(c))
end
print(#tbl) -- prints 9
I haven't used non-English characters in Lua before and my emulator just puts them in as '?', but something along the lines of this might work:
convert = function ( str )
local temp = {}
for c in str:gmatch('.') do
table.insert(temp, c)
end
return temp
end
This is a simple function that utilizes string.gmatch() to separate the string into individual characters and save them into a table. It would be used like this:
t = convert('abcd')
Which would make 't' a table containing a, b, c and d.
t[1] = a
t[2] = b
...
I am not sure if this will work for the Chinese characters but it is worth a shot.
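One caveat: the pattern '.' matches a single byte, and each of these Chinese characters takes three bytes in UTF-8, so the function above would split every character into pieces. A byte-pattern sketch that keeps multi-byte sequences together, without needing the utf8 module, could look like this. It assumes the string is valid UTF-8 and does not contain NUL bytes. The name utf8_chars is made up for the example.

```lua
-- Split a UTF-8 encoded string into a table of whole characters.
local function utf8_chars(str)
  local chars = {}
  -- A lead byte (ASCII, or 194..244) followed by any number of
  -- continuation bytes (128..191) forms one character.
  for ch in str:gmatch("[\1-\127\194-\244][\128-\191]*") do
    chars[#chars + 1] = ch
  end
  return chars
end

local t = utf8_chars('这是一个中文字符串')
print(#t)   -- 9 when this file is saved as UTF-8
print(t[1]) -- 这
```

This is essentially the same pattern that Lua 5.3 exposes as utf8.charpattern, written with decimal escapes so that it also loads on older Lua versions.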

Simple LZW Compression doesnt work

I wrote a simple class to compress data. Here it is:
LZWCompressor = {}
function LZWCompressor.new()
local self = {}
self.mDictionary = {}
self.mDictionaryLen = 0
-- ...
self.Encode = function(sInput)
self:InitDictionary(true)
local s = ""
local ch = ""
local len = string.len(sInput)
local result = {}
local dic = self.mDictionary
local temp = 0
for i = 1, len do
ch = string.sub(sInput, i, i)
temp = s..ch
if dic[temp] then
s = temp
else
result[#result + 1] = dic[s]
self.mDictionaryLen = self.mDictionaryLen + 1
dic[temp] = self.mDictionaryLen
s = ch
end
end
result[#result + 1] = dic[s]
return result
end
-- ...
return self
end
And I run it with:
local compressor = LZWCompression.new()
local encodedData = compressor:Encode("I like LZW, but it doesnt want to compress this text.")
print("Input length:",string.len(originalString))
print("Output length:",#encodedData)
local decodedString = compressor:Decode(encodedData)
print(decodedString)
print(originalString == decodedString)
But when I finally run it with Lua, it shows that the interpreter expected a string, not a table. That was strange, because I pass an argument of type string. To check, I wrote at the beginning of the function:
print(typeof(sInput))
I got the output "Table" and Lua's error. So how do I fix it? Why does Lua report that the string I passed is a table? I use Lua 5.3.
The issue is in the definition of the Encode() method, and most likely Decode() has the same problem.
You create the Encode() method using dot syntax: self.Encode = function(sInput),
but then you call it with colon syntax: compressor:Encode(data)
When you call Encode() with colon syntax, its first implicit argument is compressor itself (the table from your error), not the data.
To fix it, declare the Encode() method with colon syntax: function self:Encode(sInput), or add 'self' explicitly as the first parameter: self.Encode = function(self, sInput)
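The difference is easy to see in a small stand-alone sketch. The Demo table and its method names are invented for this illustration and are not part of the code in the question.

```lua
local Demo = {}
Demo.__index = Demo

function Demo.new()
  return setmetatable({}, Demo)
end

-- Defined with a dot: there is no implicit self parameter.
function Demo.dot_defined(arg)
  return type(arg)
end

-- Defined with a colon: self is the hidden first parameter.
function Demo:colon_defined(arg)
  return type(arg)
end

local d = Demo.new()
print(d:dot_defined("text"))   -- table  (arg received d, the string was dropped)
print(d:colon_defined("text")) -- string (self received d, arg received "text")
print(d.dot_defined("text"))   -- string (dot call passes only "text")
```

The first call reproduces the situation from the question: a dot-defined function called with a colon receives the object where it expected the string.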
The code you provided should not run at all.
You define function LZWCompressor.new() but call LZWCompression.new()
Inside Encode you call self:InitDictionary(true), which has not been defined.
Maybe you did not paste all the relevant code here.
The reason for the error you get, though, is that you call compressor:Encode(sInput), which is syntactic sugar for compressor.Encode(compressor, sInput). As function arguments are passed by position, not by name, sInput inside Encode is now compressor, not your string.
Your first argument (which happens to be the compressor table) is then passed to string.len, which expects a string.
So you actually call string.len(compressor), which of course results in an error.
Please make sure you know how to call and define functions and how to use self properly!
