Case-insensitive Lua pattern-matching - lua

I'm writing a grep utility in Lua for our mobile devices running Windows CE 6/7, but I've run into some issues implementing case-insensitive match patterns. The obvious solution of converting everything to uppercase (or lower) does not work so simply due to the character classes.
The only other thing I can think of is converting the literals in the pattern itself to uppercase.
Here's what I have so far:
function toUpperPattern(instr)
-- Check first character
if string.find(instr, "^%l") then
instr = string.upper(string.sub(instr, 1, 1)) .. string.sub(instr, 2)
end
-- Check the rest of the pattern
while 1 do
local a, b, str = string.find(instr, "[^%%](%l+)")
if not a then break end
if str then
instr = string.sub(instr, 1, a) .. string.upper(string.sub(instr, a+1, b)) .. string.sub(instr, b + 1)
end
end
return instr
end
I hate to admit how long it took to get even that far, and I can still see right away there are going to be problems with things like escaped percent signs '%%'
I figured this must be a fairly common issue, but I can't seem to find much on the topic.
Are there any easier (or at least complete) ways to do this? I'm starting to go crazy here...
Hoping you Lua gurus out there can enlighten me!

Try something like this:
function case_insensitive_pattern(pattern)
-- find an optional '%' (group 1) followed by any character (group 2)
local p = pattern:gsub("(%%?)(.)", function(percent, letter)
if percent ~= "" or not letter:match("%a") then
-- if the '%' matched, or `letter` is not a letter, return "as is"
return percent .. letter
else
-- else, return a case-insensitive character class of the matched letter
return string.format("[%s%s]", letter:lower(), letter:upper())
end
end)
return p
end
print(case_insensitive_pattern("xyz = %d+ or %% end"))
which prints:
[xX][yY][zZ] = %d+ [oO][rR] %% [eE][nN][dD]

Lua 5.1, LPeg v0.12
do
local p = re.compile([[
pattern <- ( {b} / {escaped} / brackets / other)+
b <- "%b" . .
escaped <- "%" .
brackets <- { "[" ([^]%]+ / escaped)* "]" }
other <- [^[%]+ -> cases
]], {
cases = function(str) return (str:gsub('%a',function(a) return '['..a:lower()..a:upper()..']' end)) end
})
local pb = re.compile([[
pattern <- ( {b} / {escaped} / brackets / other)+
b <- "%b" . .
escaped <- "%" .
brackets <- {: {"["} ({escaped} / bcases)* {"]"} :}
bcases <- [^]%]+ -> bcases
other <- [^[%]+ -> cases
]], {
cases = function(str) return (str:gsub('%a',function(a) return '['..a:lower()..a:upper()..']' end)) end
, bcases = function(str) return (str:gsub('%a',function(a) return a:lower()..a:upper() end)) end
})
function iPattern(pattern,brackets)
('sanity check'):find(pattern)
return table.concat({re.match(pattern, brackets and pb or p)})
end
end
local test = '[ab%c%]d%%]+ o%%r %bnm'
print(iPattern(test)) -- [ab%c%]d%%]+ [oO]%%[rR] %bnm
print(iPattern(test,true)) -- [aAbB%c%]dD%%]+ [oO]%%[rR] %bnm
print(('qwe [%D]% O%r n---m asd'):match(iPattern(test, true))) -- %D]% O%r n---m
Pure Lua version:
It is necessary to analyze all the characters in the string to convert it into a correct pattern because Lua patterns do not have alternations like in regexps (abc|something).
function iPattern(pattern, brackets)
('sanity check'):find(pattern)
local tmp = {}
local i=1
while i <= #pattern do -- 'for' don't let change counter
local char = pattern:sub(i,i) -- current char
if char == '%' then
tmp[#tmp+1] = char -- add to tmp table
i=i+1 -- next char position
char = pattern:sub(i,i)
tmp[#tmp+1] = char
if char == 'b' then -- '%bxy' - add next 2 chars
tmp[#tmp+1] = pattern:sub(i+1,i+2)
i=i+2
end
elseif char=='[' then -- brackets
tmp[#tmp+1] = char
i = i+1
while i <= #pattern do
char = pattern:sub(i,i)
if char == '%' then -- no '%bxy' inside brackets
tmp[#tmp+1] = char
tmp[#tmp+1] = pattern:sub(i+1,i+1)
i = i+1
elseif char:match("%a") then -- letter
tmp[#tmp+1] = not brackets and char or char:lower()..char:upper()
else -- something else
tmp[#tmp+1] = char
end
if char==']' then break end -- close bracket
i = i+1
end
elseif char:match("%a") then -- letter
tmp[#tmp+1] = '['..char:lower()..char:upper()..']'
else
tmp[#tmp+1] = char -- something else
end
i=i+1
end
return table.concat(tmp)
end
local test = '[ab%c%]d%%]+ o%%r %bnm'
print(iPattern(test)) -- [ab%c%]d%%]+ [oO]%%[rR] %bnm
print(iPattern(test,true)) -- [aAbB%c%]dD%%]+ [oO]%%[rR] %bnm
print(('qwe [%D]% O%r n---m asd'):match(iPattern(test, true))) -- %D]% O%r n---m

Related

Generating star pattern in LUA

I am new to programming in LUA. And I am not able to solve this question below.
Given a number N, generate a star pattern such that on the first line there are N stars and on the subsequent lines the number of stars decreases by 1.
The pattern generated should have N rows. In every row, every fifth star (*) is replaced with a hash (#). Every row should have the required number of stars (*) and hash (#) symbols.
Sample input and output, where the first line is the number of test cases
This is what I tried.. And I am not able to move further
function generatePattern()
n = tonumber(io.read())
i = n
while(i >= 1)
do
j = 1
while(j<=i)
do
if(j<=i)
then
if(j%5 == 0)
then
print("#");
else
print("*");
end
print(" ");
end
j = j+1;
end
print("\n");
i = i-1;
end
end
tc = tonumber(io.read())
for i=1,tc
do
generatePattern()
end
First, just the stars without hashes. This part is easy:
local function pattern(n)
for i=n,1,-1 do
print(string.rep("*", i))
end
end
To replace each 5th asterisk with a hash, you can extend the expression with the following substitution:
local function pattern(n)
for i=n,1,-1 do
print((string.rep("*", i):gsub("(%*%*%*%*)%*", "%1#")))
end
end
The asterisks in the pattern need to be escaped with a %, since * holds special meaning within Lua patterns.
Note that string.gsub returns 2 values, but they can be truncated to one value by adding an extra set of parentheses, leading to the somewhat awkward-looking form print((..)).
Depending on Lua version the metamethod __index holding rep for repeats...
--- Lua 5.3
n=10
asterisk='*'
print(asterisk:rep(n))
-- puts out: **********
#! /usr/bin/env lua
for n = arg[1], 1, -1 do
local char = ''
while #char < n do
if #char %5 == 4 then char = char ..'#'
else char = char ..'*'
end -- mod 5
end -- #char
print( char )
end -- arg[1]
chmod +x asterisk.lua
./asterisk.lua 15
Please do not follow this answer since it is bad coding style! I would delete it but SO won't let me. See comment and other answers for better solutions.
My Lua print adds newlines to each printout, therefore I concatenate each character in a string and print the concatenated string out afterwards.
function generatePattern()
n = tonumber(io.read())
i = n
while(i >= 1)
do
ouput = ""
j = 1
while(j<=i)
do
if(j%5 == 0)
then
ouput=ouput .. "#";
else
ouput=ouput .. "*";
end
j = j+1;
end
print(ouput);
i = i-1;
end
end
Also this code is just yours minimal transformed to give the correct output. There are plenty of different ways to solve the task, some are faster or more intuitive than others.

"translating" One character to another in lua

I want to make a lua script that takes the input of a table, then outputs the strings in that table in their full width counterparts, eg
input = {"Hello", " ", "World"}
print(full(table.concat(input)))
and it will print "Hello World"
I tried it using this:
local encoding = [[ 0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!゛#$%&()*+、ー。/:;〈=〉?@[\\]^_‘{|}~]]
function char(i)
return encoding:sub(i:len(),i:len())
end
function decode(t)
for i=1,#t do t[i]=char(t[i]) end
return table.concat(t)
end
function returns(word, word_eol)
print(char(word_eol[2]))
end
but that did not work
note: it is a plugin for hexchat that's why I have it as print(char(word_eol[2])))
Because when you hook a command in hexchat it spits out a table that is the command name, then what was entered after
If (string) = [[ 0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!゛#$%&()*+、ー。/:;〈=〉?@[\]^_‘{|}~]], you're finding the n th character of (string), with n being the length of the character, which will always be one. If I understand correctly, this will do the job, by having a separate alphabet and matching the characters.
local encoding = [[ 0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!゛#$%&()*+、ー。/:;〈=〉?@[]^_‘{|}~]]
local decoding = [[ 0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&()*+,-./:;{=}?#[]^_'{|}~]]
function char(i)
local l = decoding:find(i,1,true)
return encoding:sub(l,l)
end
function decode(t)
for i=1,#t do t[i]=char(t[i]) end
return table.concat(t)
end
function returns(word, word_eol)
print(char(word_eol[2]))
end
function full(s)
return (s:gsub('.', function(c)
c = c:byte()
if c == 0x20 then
return string.char(0xE3, 0x80, 0x80)
elseif c >= 0x21 and c <= 0x5F then
return string.char(0xEF, 0xBC, c+0x60)
elseif c >= 0x60 and c <= 0x7E then
return string.char(0xEF, 0xBD, c+0x20)
end
end))
end

How to make LPeg.match return nil

I'm currently getting familiar with the LPeg parser module. For this I want to match a version string (e.g. 11.4) against a list.
Such a list is a string with a tight syntax that can also contain ranges. Here is an EBNF-like, but in any case quite simple grammar (I write it down because LPeg code below can be a bit difficult to read):
S = R, { ',', R }
R = N, [ '-', N ]
N = digit+, [ '.', digit+ ]
An example string would be 1-9,10.1-11,12. Here is my enormous code:
local L = require "lpeg"
local LV, LP, LC, LR, floor = L.V, L.P, L.C, L.R, math.floor
local version = "7.25"
local function check(a, op, b)
if op and a+0 <= version and version <= b+0 then
return a..op..b -- range
elseif not op and floor(version) == floor(a+0) then
return a -- single item
end
end
local grammar = LP({ "S",
S = LV"R" * (LP"," * LV"R")^0,
R = LV"V" * (LC(LP"-") * LV"V")^-1 / check,
V = LC(LV"D" * (LP"." * LV"D")^-1),
D = (LR("09")^1),
})
function checkversion(str)
return grammar:match(str)
end
So you would call it like checkversion("1-7,8.1,8.3,9") and if the current version is not matched by the list you should get nil.
Now, the trouble is, if all calls to check return nothing (meaning, if the versions do not match), grammar:match(...) will actually have no captures and so return the current position of the string. But this is exactly what I do not want, I want checkversion to return nil or false if there is no match and something that evaluates to true otherwise, actually just like string:match would do.
If I on the other hand return false or nil from check in case of a non-match, I end up with return values from match like nil, "1", nil, nil which is basically impossible to handle.
Any ideas?
I think you can or + it with a constant capture of nil:
grammar = grammar + lpeg.Cc(nil)
This is the pattern I eventually used:
nil_capturing_pattern * lpeg.Cc(nil)
I incorporated it into the grammar in the S rule (Note that this also includes changed grammar to "correctly" determine version order, since in version numbering "4.7" < "4.11" is true, but not in calculus)
local Minor_mag = log10(Minor);
local function check(a, am, op, b, bm)
if op then
local mag = floor(max(log10(am), log10(bm), Minor_mag, 1))+1;
local a, b, v = a*10^mag+am, b*10^mag+bm, Major*10^mag+Minor;
if a <= v and v <= b then
return a..op..b;
end
elseif a == Major and (am == "0" or am == Minor) then
return a.."."..am;
end
end
local R, V, C, Cc = lpeg.R, lpeg.V, lpeg.C, lpeg.Cc
local g = lpeg.P({ "S",
S = V("R") * ("," * V("R"))^0 * Cc(nil),
R = (V("Vm") + V("VM")) * (C("-") * (V("Vm") + V("VM")))^-1 / check,
VM = V("D") * Cc("0"),
Vm = V("D") * "." * V("D"),
D = C(R("09")^1),
});
Multiple returns from match are not impossible to handle, if you catch them in a way that makes handling them easier. I added a function matched that does that, and added the fallback return of false to your check.
do
local L = require "lpeg"
local LV, LP, LC, LR, floor = L.V, L.P, L.C, L.R, math.floor
local version = 6.25
local function check(a, op, b)
if op and a+0 <= version and version <= b+0 then
return a..op..b -- range
elseif not op and floor(version) == floor(a+0) then
return a -- single item
end
return false
end
local grammar = LP({ "S",
S = LV"R" * (LP"," * LV"R")^0,
R = LV"V" * (LC(LP"-") * LV"V")^-1 / check,
V = LC(LV"D" * (LP"." * LV"D")^-1),
D = (LR("09")^1),
})
local function matched(...)
local n = select('#',...)
if n == 0 then return false end
for i=1,n do
if select(i,...) then return true end
end
return false
end
function checkversion(ver,str)
version = ver
return matched(grammar:match(str))
end
end
I enclosed the whole thing in do ... end so that the local version which is used here as an upvalue to check would have constrained scope, and added a parameter to checversion() to make it clearer to run through few test cases. For example:
cases = { 1, 6.25, 7.25, 8, 8.5, 10 }
for _,v in ipairs(cases) do
print(v, checkversion(v, "1-7,8.1,8.3,9"))
end
When run, I get:
C:\Users\Ross\Documents\tmp\SOQuestions>q18793493.lua
1 true
6.25 true
7.25 false
8 true
8.5 true
10 false
C:\Users\Ross\Documents\tmp\SOQuestions>
Note that either nil or false would work equally well in this case. It just feels saner to have collected a list that can be handled as a normal Lua array-like table without concern for the holes.

Lua table.concat

Is there a way to use the arg 2 value of table.concat to represent the current table index?
eg:
t = {}
t[1] = "a"
t[2] = "b"
t[3] = "c"
X = table.concat(t,"\n")
desired output of table concat (X):
"1 a\n2 b\n3 c\n"
Simple answer : no.
table.concat is something really basic, and really fast.
So you should do it in a loop anyhow.
If you want to avoid excessive string concatenation you can do:
function concatIndexed(tab,template)
template = template or '%d %s\n'
local tt = {}
for k,v in ipairs(tab) do
tt[#tt+1]=template:format(k,v)
end
return table.concat(tt)
end
X = concatIndexed(t) -- and optionally specify a certain per item format
Y = concatIndexed(t,'custom format %3d %s\n')
I don't think so: how would you tell it that the separator between keys and values is supposed to be a space, for example?
You can write a general mapping function to do what you'd like:
function map2(t, func)
local out = {}
for k, v in pairs(t) do
out[k] = func(k, v)
end
return out
end
function joinbyspace(k, v)
return k .. ' ' .. v
end
X = table.concat(map2(t, joinbyspace), "\n")
No. But there is a work around:
local n = 0
local function next_line_no()
n = n + 1
return n..' '
end
X = table.concat(t,'\0'):gsub('%f[%Z]',next_line_no):gsub('%z','\n')
function Util_Concat(tab, seperator)
if seperator == nil then return table.concat(tab) end
local buffer = {}
for i, v in ipairs(tab) do
buffer[#buffer + 1] = v
if i < #tab then
buffer[#buffer + 1] = seperator
end
end
return table.concat(buffer)
end
usage tab is where the table input is and seperator be both nil or string (if it nil it act like ordinary table.concat)
print(Util_Concat({"Hello", "World"}, "_"))
--Prints
--Hello_world

Lua Operator Overloading

I've found some places on the web saying that operators in Lua are overloadable but I can't seem to find any example.
Can someone provide an example of, say, overloading the + operator to work like the .. operator works for string concatenation?
EDIT 1: to Alexander Gladysh and RBerteig:
If operator overloading only works when both operands are the same type and changing this behavior wouldn't be easy, then how come the following code works? (I don't mean any offense, I just started learning this language):
printf = function(fmt, ...)
io.write(string.format(fmt, ...))
end
Set = {}
Set.mt = {} -- metatable for sets
function Set.new (t)
local set = {}
setmetatable(set, Set.mt)
for _, l in ipairs(t) do set[l] = true end
return set
end
function Set.union (a,b)
-- THIS IS THE PART THAT MANAGES OPERATOR OVERLOADING WITH OPERANDS OF DIFFERENT TYPES
-- if user built new set using: new_set = some_set + some_number
if type(a) == "table" and type(b) == "number" then
print("building set...")
local mixedset = Set.new{}
for k in pairs(a) do mixedset[k] = true end
mixedset[b] = true
return mixedset
-- elseif user built new set using: new_set = some_number + some_set
elseif type(b) == "table" and type(a) == "number" then
print("building set...")
local mixedset = Set.new{}
for k in pairs(b) do mixedset[k] = true end
mixedset[a] = true
return mixedset
end
if getmetatable(a) ~= Set.mt or
getmetatable(b) ~= Set.mt then
error("attempt to 'add' a set with a non-set value that is also not a number", 2)
end
local res = Set.new{}
for k in pairs(a) do res[k] = true end
for k in pairs(b) do res[k] = true end
return res
end
function Set.tostring (set)
local s = "{"
local sep = ""
for e in pairs(set) do
s = s .. sep .. e
sep = ", "
end
return s .. "}"
end
function Set.print (s)
print(Set.tostring(s))
end
s1 = Set.new{10, 20, 30, 50}
s2 = Set.new{30, 1}
Set.mt.__add = Set.union
-- now try to make a new set by unioning a set plus a number:
s3 = s1 + 8
Set.print(s3) --> {1, 10, 20, 30, 50}
The metatable function only works on tables, but you can use debug.metatable to set the strings metatable...
> mt = {}
> debug.setmetatable("",mt)
> mt.__add = function (op1, op2) return op1 .. op2 end
> ="foo"+"bar"
foobar
>
Another approach is to use debug.getmetatable to augment the built-in string metatable (answering the question in the comment below):
~ e$ lua
Lua 5.1.4 Copyright (C) 1994-2008 Lua.org, PUC-Rio
> debug.getmetatable("").__add = function (op1, op2) return op1 .. op2 end
> ="foo"+"bar"
foobar
>
See the Metatables section of Lua Programming Manual and Metatables and Metamethods chapter of the Programming in Lua 2nd edition.
Note that for comparison operators operator overloading works only when both operand types are the same.

Resources