Weird matching behaviour and argument interpretation in Lua - lua

Somehow there seems to be a difference, when passing a string in via a variable vs. passing a string via an expression as an argument.
I am so confused right now, about how Lua evaluates expressions.
In short: I am trying to detect a word, case insensitive and I am reformatting the pattern, so it is not case sensitive. If I pass the argument directly to <string>:match (sidenote: issue persists with directly calling string.match), it doesn't give the expected behaviour, while it does, when passing it via a local variable.
I have distilled the code into a reproducible script (Windows: Lua 5.4.3 and Lua JIT 2.1.0-beta3, WSL: Lua 5.3.3, Linux: Lua 5.1):
-- Replaces each char with a charset pattern in uppercase and lowercase
local function makeCaseInsensitive(name)
  return name:gsub("%a", function (c)
    return string.format("[%s%s]", c:lower(), c:upper())
  end)
end
local suite = "Retained widgets"
local pattern = "retained"
if suite:match(makeCaseInsensitive(pattern)) then
  print("In expression ok")
else
  print("In expression not ok")
end
local insensitive = makeCaseInsensitive(pattern)
if suite:match(insensitive) then
  print("In variable ok")
else
  print("In variable not ok")
end
The expected output would be:
In expression ok
In variable ok
instead:
In expression not ok
In variable ok
WTF is going on?
Could someone please explain to me, what is going on?
Any feedback is appreciated

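As @MikeV. pointed out in the comments, makeCaseInsensitive(pattern) returns two values. This is because string.gsub returns both the resulting string and the number of substitutions made, here 8. Since the call is the last argument to match, both values are passed along, and match interprets the 8 as its init argument, i.e. the position at which to start searching.
The solution is to discard the extra return value from gsub, either explicitly: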
-- Replaces each char with a charset pattern in uppercase and lowercase
local function makeCaseInsensitive(name)
  local caseInsensitivePattern, count = name:gsub("%a", function (c)
    return string.format("[%s%s]", c:lower(), c:upper())
  end)
  return caseInsensitivePattern
end
or implicitly by adding extra parentheses:
-- Replaces each char with a charset pattern in uppercase and lowercase
local function makeCaseInsensitive(name)
  return (name:gsub("%a", function (c)
    return string.format("[%s%s]", c:lower(), c:upper())
  end))
end
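To see the effect in isolation, here is a minimal sketch of how the number of values changes depending on where the call appears. The helper name twoValues is hypothetical and only stands in for any function that forwards gsub's results unchanged.

```lua
-- Hypothetical helper: returns a pattern plus a second value, like gsub does.
local function twoValues()
  return ("retained"):gsub("%a", function (c)
    return "[" .. c:lower() .. c:upper() .. "]"
  end)
end

local suite = "Retained widgets"

-- As the last argument, the call expands to both values, so this is
-- suite:match(pattern, 8) and matching starts at position 8.
local inExpression = suite:match(twoValues())

-- Extra parentheses truncate the call to its first value.
local truncated = suite:match((twoValues()))

-- Assigning to a single variable also keeps only the first value.
local pattern = twoValues()
local inVariable = suite:match(pattern)

-- Number of values produced without and with the extra parentheses.
local plainCount = select("#", twoValues())
local wrappedCount = select("#", (twoValues()))
print(plainCount, wrappedCount)
print(inExpression, truncated, inVariable)
```

The same truncation happens whenever a call is not the last item in an argument list or expression list, which is why the bug only shows up in some call sites.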

Related

What's the Lua equivalent of Python's endswith()?

I want to convert this Python code to Lua.
for i in range(1000,9999):
    if str(i).endswith('9'):
        print(i)
I've come this far:
for var=1000,9000 then
if tostring(var).endswith('9') then
print (var)
end
end
but I don't know what the Lua equivalent of endswith() is. I'm writing an nmap script and working with Lua for the first time, so please let me know if there are any errors in my current code.
The Python code is not great; you can get the last digit by using the modulo operator %:
# python code using modulo
for i in range(1000,9999):
    if i % 10 == 9:
        print(i)
This also works in Lua. However, Lua includes the last number in the loop, unlike Python.
-- lua code to do this
for i=1000, 9998 do
  if i % 10 == 9 then
    print(i)
  end
end
However, in both languages you could iterate by 10 each time:
for i in range(1009, 9999, 10):
    print(i)
for i=1009, 9998, 10 do
  print(i)
end
for var = 1000, 9000 do
  if string.sub(var, -1) == "9" then
    -- do your stuff
  end
end
XY-Problem
The X problem of how to best port your code to Lua has been answered by quantumpro already, who optimized it & cleaned it up.
I'll focus on your Y problem:
What's the Lua equivalent of Python endswith?
Calling string functions, OOP-style
In Lua, strings have a metatable that indexes the global string library table. String functions are called using str:func(...) in Lua rather than str.func(...) to pass the string str as first "self" argument (see "Difference between . and : in Lua").
Furthermore, if the argument to the call is a single string, you can omit the parentheses, turning str:func("...") into str:func"...".
Constant suffix: Pattern Matching
Lua provides a more powerful pattern matching function that can be used to check whether a string ends with a suffix: string.match. str.endswith("9") in Python is equivalent to str:match"9$" in Lua: $ anchors the pattern at the end of the string and 9 matches the literal character 9.
Be careful though: this approach doesn't work with arbitrary, possibly variable suffixes, since certain characters - such as $ - are magic characters in Lua patterns and thus have a special meaning. Consider str.endswith("."); this is not equivalent to str:match".$" in Lua, since . matches any character.
I'd say that this is the Lua-esque way of checking whether a string ends with a constant suffix. Note that it does not return a boolean, but rather a match (the suffix, a truthy value) if successful or nil (a falsy value) if unsuccessful; it can thus safely be used in ifs. To convert the result into a boolean, you could use not not str:match"9$".
Variable suffix: Rolling your own
Lua's standard library is very minimalistic; as such, you often need to roll your own functions even for basic things. There are two possible implementations for endswith, one using pattern matching and another one using substrings; the latter approach is preferable because it's shorter, possibly faster (Lua uses a naive pattern matching engine) and doesn't have to take care of pattern escaping:
function string:endswith(suffix)
  return self:sub(-#suffix) == suffix
end
Explanation: self:sub(-#suffix) returns the last #suffix characters of self, the first argument (or all of self if it is shorter than that). This is compared against the suffix.
You can then call this function using the colon (:) syntax:
str = "prefixsuffix"
assert(str:endswith"suffix")
assert(not str:endswith"prefix")
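The points above can be checked with a small sketch. The standalone endswith function below is a hypothetical variant that leaves the string table untouched; it also treats the empty suffix explicitly, on the assumption that you want Python's behaviour where every string ends with "".

```lua
-- Hypothetical standalone variant of endswith that does not modify the string table.
local function endswith(s, suffix)
  -- s:sub(-0) is s:sub(0), i.e. the whole string, so handle "" explicitly.
  return suffix == "" or s:sub(-#suffix) == suffix
end

-- Pattern-based check with a constant suffix: returns the match or nil.
local matched = ("1009"):match("9$")
local asBoolean = not not ("1000"):match("9$")

-- "." is magic in patterns, so ".$" matches any final character.
local anyChar = ("ab"):match(".$")
local literalDot = ("ab"):match("%.$")

-- The substring comparison needs no escaping at all.
local dotSuffix = endswith("a.b.", ".")
local noDotSuffix = endswith("ab", ".")
local emptySuffix = endswith("abc", "")
```

Without the explicit check, the substring comparison would report that "abc" does not end with the empty string, which may or may not matter for your use case.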

How would I convert a string into a table?

I have been trying to convert a string into a table for example:
local stringtable = "{{"user123","Banned for cheating"},{"user124","Banned for making alt accounts"}}"
Code:
local table = "{{"user123","Banned for cheating"},{"user124","Banned for making alt accounts"}}"
print(table[1])
Output result:
Line 3: nil
Is there any sort of method of converting a string into a table? If so, let me know.
First of all, your Lua code will not work. You cannot have unescaped double quotes in a string delimited by double quotes. Use single quotes (') within a "..." string, double quotes within a '...' string, or use long-bracket syntax ([[...]]) to be able to use both types of quotes, as I do in the example below.
Secondly, your task cannot be solved with a regular expression, unless your table structure is very rigid; and even then Lua patterns will not be enough: you will need to use Perl-compatible regular expressions from the lrexlib library.
Thirdly, fortunately, Lua has a Lua interpreter available at runtime: the function loadstring (load in Lua 5.2 and later). It compiles the Lua code in its argument string and returns it as a function. You just need to prepend return to your table code and call the returned function.
The code:
local stringtable = [===[
{{"user123","Banned for cheating"},{"user124","Banned for making alt accounts"}}
]===]
-- loadstring exists in Lua 5.1 and LuaJIT; Lua 5.2 and later use load instead.
local tbl_func = (loadstring or load) ('return ' .. stringtable)
-- If stringtable is not valid Lua code, tbl_func will be nil:
local tbl = tbl_func and tbl_func() or nil
-- Test:
if tbl then
  for _, user in ipairs (tbl) do
    print (user[1] .. ': ' .. user[2])
  end
else
  print 'Could not compile stringtable'
end
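One caveat with this approach is that the string is executed as code, so it should not come from an untrusted source as-is. The following is a rough sketch, not a complete sandbox, of running the chunk with an empty environment so that it cannot reach globals such as os or io. The function name parse_table and the chunk name "stringtable" are made up for illustration; it still does not guard against things like infinite loops.

```lua
-- Hypothetical helper: evaluates a table literal with no access to globals.
local function parse_table(src)
  local chunk, err
  if setfenv then
    -- Lua 5.1 / LuaJIT: compile, then swap the environment for an empty table.
    chunk, err = loadstring("return " .. src)
    if chunk then setfenv(chunk, {}) end
  else
    -- Lua 5.2 and later: text-only chunk ("t") with an empty environment.
    chunk, err = load("return " .. src, "stringtable", "t", {})
  end
  if not chunk then
    return nil, err            -- the string was not valid Lua
  end
  local ok, result = pcall(chunk)
  if not ok then
    return nil, result         -- the chunk raised an error while running
  end
  return result
end

local bans = parse_table('{{"user123","Banned for cheating"},{"user124","Banned for making alt accounts"}}')
print(bans[1][1], bans[2][2])
```

A call such as parse_table('os.exit()') returns nil plus an error message instead of terminating the program, because os is not visible inside the chunk.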

Lua - why is string after function call allowed?

I'm trying to implement a simple C++ function which checks the syntax of a Lua script. For that I'm using Lua's compiler function luaL_loadbufferx() and checking its return value afterwards.
Recently, I ran into a problem: code that I thought should be marked invalid was not detected, and instead the script failed later at runtime (e.g. in lua_pcall()).
Example Lua code (can be tested on the official Lua demo):
function myfunc()
  return "everyone"
end
-- Examples of unexpected behaviour:
-- The following lines pass the compile time check without errors.
print("Hello " .. myfunc() "!") -- Runtime error: attempt to call a string value
print("Hello " .. myfunc() {1,2,3}) -- Runtime error: attempt to call a string value
-- Other examples:
-- The following lines contain examples of invalid syntax, which IS detected by compiler.
print("Hello " myfunc() .. "!") -- Compile error: ')' expected near 'myfunc'
print("Hello " .. myfunc() 5) -- Compile error: ')' expected near '5'
print("Hello " .. myfunc() .. ) -- Compile error: unexpected symbol near ')'
The goal is obviously to catch all syntax errors at compile time. So my questions are:
What exactly is meant by calling a string value?
Why is this syntax allowed in the first place? Is it some Lua feature I'm unaware of, or is luaL_loadbufferx() faulty in this particular example?
Is it possible to detect such errors by any other method without running the code? Unfortunately, my function doesn't have access to global variables at compile time, so I can't just run the code directly via lua_pcall().
Note: I'm using Lua version 5.3.4 (manual here).
Thank you very much for your help.
Both myfunc() "!" and myfunc(){1,2,3} are valid Lua expressions.
Lua allows calls of the form exp string. See functioncall and prefixexp in the Syntax of Lua.
So myfunc() "!" is a valid function call that takes whatever myfunc returns and calls it with the string "!".
The same thing happens for a call of the form exp table-literal.
Another approach is to change string's metatable making a call to a string valid.
local mt = getmetatable ""
mt.__call = function (self, args) return self .. args end
print(("x") "y") -- outputs `xy`
Now those valid syntax calls to a string will result in string concatenation instead of runtime errors.
I'm writing an answer to my own question in case anyone else stumbles upon a similar problem in the future and looks for a solution.
Manual
The Lua manual (in section 3.4.10 – Function Calls) states that there are three different ways of providing arguments to a Lua function.
Arguments have the following syntax: args ::= ‘(’ [explist] ‘)’
args ::= tableconstructor
args ::= LiteralString
All argument expressions are evaluated before the call. A call of the form f{fields} is syntactic sugar for f({fields}); that is, the argument list is a single new table. A call of the form f'string' (or f"string" or f[[string]]) is syntactic sugar for f('string'); that is, the argument list is a single literal string.
Explanation
As lhf pointed out in his answer, both myfunc()"!" and myfunc(){1,2,3} are valid Lua expressions. It means the Lua compiler is doing nothing wrong, considering it doesn't know the function's return value at compile time.
The original example code given in the question:
print("Hello " .. myfunc() "!")
Could be then rewritten as:
print("Hello " .. (myfunc()) ("!"))
Which (when executed) translates to:
print("Hello " .. ("everyone") ("!"))
And thus resulting in the runtime error message attempt to call a string value (which could be rewritten as: the string everyone is not a function, so you can't call it).
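For illustration, here is a small sketch of the three argument forms and of the failing case. The function describe and its return strings are hypothetical; only the call syntax is the point.

```lua
-- Hypothetical function used only to show the three call forms.
local function describe(arg)
  if type(arg) == "table" then
    return "table with " .. #arg .. " items"
  end
  return "string " .. arg
end

local viaParens = describe("hi")     -- args ::= '(' [explist] ')'
local viaString = describe"hi"       -- args ::= LiteralString
local viaTable  = describe{1, 2, 3}  -- args ::= tableconstructor

-- A call result can itself be called with a string literal.
local function getDescribe() return describe end
local chained = getDescribe()"hi"

-- The same syntax applied to a string result compiles but fails when run.
local function myfunc() return "everyone" end
local ok, err = pcall(function() return myfunc() "!" end)
print(ok, err)
```

The last call is syntactically identical to the chained one, which is why the compiler cannot reject it: whether it works depends only on what the function returns at runtime.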
Solution
As far as I understand, these two alternative ways of supplying arguments have no real benefit over the standard func(arg) syntax. That's why I ended up modifying the Lua parser files. The disadvantage of keeping this alternative syntax was too big. Here is what I've done (relevant for v5.3.4):
In the file lparser.c I searched for the function:
static void suffixedexp (LexState *ls, expdesc *v)
Inside this function I changed the case statement:
case '(': case TK_STRING: case '{': to case '(':
Warning! By doing this I have modified the Lua language, so as lhf stated in his comment, it can no longer be called pure Lua. If you are unsure whether it is exactly what you want, I can't recommend this approach.
With this slight modification the compiler detects the two above-mentioned alternative syntaxes as errors. Of course, I can no longer use them inside Lua scripts, but for my specific application it's just fine.
All I need to do is to note this change somewhere so I can find it again when upgrading Lua to a newer version.

Lua string.find() error

So I'm writing a Lua script and I tested it but I got an error that I don't know how to fix:
.\search.lua:10: malformed pattern (missing ']')
Below is my code. If you know what I did wrong, it would be very helpful if you could tell me.
weird = "--[[".."\n"
function readAll(file)
  local c = io.open(file, "rb")
  local j = c:read("*all")
  c:close()
  return(j)
end
function blockActive()
  local fc = readAll("functions.lua")
  if string.find(fc,weird) ~= nil then
    require("blockDeactivated")
    return("false")
  else
    return("true")
  end
end
print(blockActive())
Edit: the first comment had the answer. I changed weird = "--[[".."\n" to weird = "%-%-%[%[".."\r". The \n to \r change was because it was actually supposed to be that way in the first place.
This errors because string.find uses Lua patterns.
Most non-alphanumeric characters, such as "[", ".", "-" etc., convey special meaning.
string.find(fc,weird), or better, fc:find(weird), tries to interpret these special characters as pattern syntax and raises an error.
You can, however, use a pattern to escape the special characters in your search string.
weird = ("--[["):gsub("%W","%%%0") .. "\r?\n"
This is a little daunting, but it will hopefully make sense.
("--[[") is the original first part of your weird string.
:gsub() is a function that replaces every match of a pattern with a replacement. Once again, see Patterns.
"%W" is a pattern that matches any single character that isn't a letter or a digit.
"%%%0" replaces each match with itself (%0 stands for the whole match), preceded by a literal %, which is written as %% in the replacement string.
So this means that --[[ will be turned into %-%-%[%[, which is how find and similar functions 'escape' special characters.
The reason \n is now \r?\n also comes down to patterns. It still matches a line ending in \n, like before. However, if this is running on Windows, a newline might look like \r\n. A ? following a character, \r in this case, means that the character is optional. So this matches both --[[\n and --[[\r\n, supporting both Windows and Linux.
Now, when you run fc:find(weird), it's running fc:find("%-%-%[%[\r?\n"), which should be exactly what you want.
Hope this has helped!
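As an alternative sketch, string.find also accepts a fourth argument, plain, which turns off pattern matching entirely so that nothing needs escaping. The trade-off is that \r? can no longer be expressed, so both line endings are checked separately. The function name has_block_opener is made up for this example.

```lua
-- Hypothetical helper: plain-text search for "--[[" followed by a line break.
local function has_block_opener(text)
  -- The fourth argument (true) makes find treat the needle as plain text.
  return text:find("--[[\n", 1, true) ~= nil
      or text:find("--[[\r\n", 1, true) ~= nil
end

print(has_block_opener("local x = 1\n--[[\nfoo()\n]]"))
print(has_block_opener("--[[\r\nfoo()\r\n]]"))
print(has_block_opener("--[[ inline comment ]]"))
```

Note that the init argument (1 here) has to be given explicitly in order to pass plain.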
Finished code if you're a bit lazy
weird = ("--[["):gsub("%W","%%%0") .. "\r?\n" -- Escape "--[[", add a newline. Used in our find.
-- readAll(file)
-- Takes a string as input representing a filename, returns the entire contents as a string.
function readAll(file)
  local c = io.open(file, "rb") -- Open the file specified by the argument. Read-only, binary (doesn't autoformat things like \r\n).
  local j = c:read("*all") -- Dump the contents of the file into a string.
  c:close() -- Close the file, free up memory.
  return j -- Return the contents of the file.
end
-- blockActive()
-- Returns false and executes 'blockDeactivated.lua' if the weird string was matched in 'functions.lua'; returns true otherwise.
function blockActive()
  local fc = readAll("functions.lua") -- Dump the contents of 'functions.lua' into a string.
  if fc:find(weird) then -- If functions.lua has the block opener.
    require("blockDeactivated") -- Require (thus execute; consider loadfile instead) 'blockDeactivated.lua'.
    return false -- Return false.
  else
    return true -- Return true.
  end
end
print(blockActive()) -- Test the blockActive code.

Escaping strings for gsub

I read a file:
local logfile = io.open("log.txt", "r")
data = logfile:read("*a")
print(data)
output:
...
"(\.)\n(\w)", r"\1 \2"
"\n[^\t]", "", x, re.S
...
Yes, the logfile looks awful, as it's full of various commands.
How can I call gsub to remove e.g. the "(\.)\n(\w)", r"\1 \2" line from the data variable?
The snippet below does not work:
s='"(\.)\n(\w)", r"\1 \2"'
data=data:gsub(s, '')
I guess some escaping needs to be done. Any easy solution?
Update:
local data = [["(\.)\n(\w)", r"\1 \2"
"\n[^\t]", "", x, re.S]]
local s = [["(\.)\n(\w)", r"\1 \2"]]
local function esc(x)
  return (x:gsub('%%', '%%%%')
           :gsub('^%^', '%%^')
           :gsub('%$$', '%%$')
           :gsub('%(', '%%(')
           :gsub('%)', '%%)')
           :gsub('%.', '%%.')
           :gsub('%[', '%%[')
           :gsub('%]', '%%]')
           :gsub('%*', '%%*')
           :gsub('%+', '%%+')
           :gsub('%-', '%%-')
           :gsub('%?', '%%?'))
end
print(data:gsub(esc(s), ''))
This seems to work fine, except that I also need to escape the escape character % itself, as it won't work if % is in the matched string. I tried :gsub('%%', '%%%%') or :gsub('\%', '\%\%') but it doesn't work.
Update 2:
OK, % can be escaped this way if it is placed first in the chain above, which I have just corrected.
:terrible experience:
Update 3:
Escaping of ^ and $
As stated in the Lua manual (5.1, 5.2, 5.3):
A caret ^ at the beginning of a pattern anchors the match at the beginning of the subject string. A $ at the end of a pattern anchors the match at the end of the subject string. At other positions, ^ and $ have no special meaning and represent themselves.
So a better idea would be to escape ^ and $ only when they are found (respectively) at the beginning or the end of the string.
Lua 5.1 - 5.2+ incompatibilities
string.gsub now raises an error if the replacement string contains a % followed by a character other than the permitted % or digit.
There is no need to double every % in the replacement string. See lua-users.
According to Programming in Lua:
The character `%´ works as an escape for those magic characters. So, '%.' matches a dot; '%%' matches the character `%´ itself. You can use the escape `%´ not only for the magic characters, but also for all other non-alphanumeric characters. When in doubt, play safe and put an escape.
Doesn't this mean that you can simply put % in front of every non-alphanumeric character and be fine? This would also be future-proof (in case new special characters are introduced). Like this:
function escape_pattern(text)
  -- Parentheses keep only the escaped string and drop gsub's match count.
  return (text:gsub("([^%w])", "%%%1"))
end
It worked for me on Lua 5.3.2 (only rudimentary testing was performed). Not sure if it will work with older versions.
Why not:
local quotepattern = '(['..("%^$().[]*+-?"):gsub("(.)", "%%%1")..'])'
string.quote = function(str)
  return str:gsub(quotepattern, "%%%1")
end
to escape and then gsub it away?
try
line = '"(\.)\n(\w)", r"\1 \2"'
rx = '\"%(%\.%)%\n%(%\w%)\", r\"%\1 %\2\"'
print(string.gsub(line, rx, ""))
escape special characters with %, and quotes with \
Try s=[["(\.)\n(\w)", r"\1 \2"]].
Use stringx.replace() from Penlight Lua Libraries instead.
Reference: https://stevedonovan.github.io/Penlight/api/libraries/pl.stringx.html#replace
Implementation (v1.12.0): https://github.com/lunarmodules/Penlight/blob/1.12.0/lua/pl/stringx.lua#L288
Based on their implementation:
function escape(s)
  return (s:gsub('[%-%.%+%[%]%(%)%$%^%%%?%*]','%%%1'))
end
function replace(s,old,new,n)
  -- Parentheses around the inner gsub keep only its first result.
  return (s:gsub(escape(old), (new:gsub('%%','%%%%')), n))
end
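To tie this back to the question, here is a minimal sketch that escapes both the search string and the replacement string and applies them to the sample data. The helper names escape_pattern and escape_replacement are illustrative; the escaping rule used is the "put % in front of every non-alphanumeric character" idea from the earlier answer.

```lua
-- Hypothetical helpers; the names are illustrative only.
local function escape_pattern(text)
  -- Prefix every non-alphanumeric character with %, keep only the string.
  return (text:gsub("%W", "%%%0"))
end

local function escape_replacement(text)
  -- In a replacement string only % is special, so double it.
  return (text:gsub("%%", "%%%%"))
end

-- Long brackets keep the backslashes literal, as in the question's update.
local data = [["(\.)\n(\w)", r"\1 \2"
"\n[^\t]", "", x, re.S]]
local s = [["(\.)\n(\w)", r"\1 \2"]]

-- Remove the unwanted line; count is the number of replacements made.
local cleaned, count = data:gsub(escape_pattern(s), "")

-- A literal % in the replacement has to be escaped as well.
local replaced = ("discount: X"):gsub("X", escape_replacement("50%"))

print(count)
print(cleaned)
print(replaced)
```

After the replacement, cleaned still starts with the newline that followed the removed line; strip it separately if it is not wanted.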

Resources