Escaping strings for gsub - lua

I read a file:
local logfile = io.open("log.txt", "r")
data = logfile:read("*a")
print(data)
output:
...
"(\.)\n(\w)", r"\1 \2"
"\n[^\t]", "", x, re.S
...
Yes, logfile looks awful as it's full of various commands
How can I call gsub and remove i.e. "(\.)\n(\w)", r"\1 \2" line from data variable?
Below snippet, does not work:
s='"(\.)\n(\w)", r"\1 \2"'
data=data:gsub(s, '')
I guess some escaping needs to be done. Any easy solution?
Update:
local data = [["(\.)\n(\w)", r"\1 \2"
"\n[^\t]", "", x, re.S]]
local s = [["(\.)\n(\w)", r"\1 \2"]]
local function esc(x)
return (x:gsub('%%', '%%%%')
:gsub('^%^', '%%^')
:gsub('%$$', '%%$')
:gsub('%(', '%%(')
:gsub('%)', '%%)')
:gsub('%.', '%%.')
:gsub('%[', '%%[')
:gsub('%]', '%%]')
:gsub('%*', '%%*')
:gsub('%+', '%%+')
:gsub('%-', '%%-')
:gsub('%?', '%%?'))
end
print(data:gsub(esc(s), ''))
This seems to works fine, only that I need to escape, escape character %, as it wont work if % is in matched string. I tried :gsub('%%', '%%%%') or :gsub('\%', '\%\%') but it doesn't work.
Update 2:
OK, % can be escaped this way if set first in above "table" which I just corrected
:terrible experience:
Update 3:
Escaping of ^ and $
As stated in Lua manual (5.1, 5.2, 5.3)
A caret ^ at the beginning of a pattern anchors the match at the beginning of the subject string. A $ at the end of a pattern anchors the match at the end of the subject string. At other positions, ^ and $ have no special meaning and represent themselves.
So a better idea would be to escape ^ and $ only when they are found (respectively) and the beginning or the end of the string.
Lua 5.1 - 5.2+ incompatibilities
string.gsub now raises an error if the replacement string contains a % followed by a character other than the permitted % or digit.
There is no need to double every % in the replacement string. See lua-users.

According to Programming in Lua:
The character `%´ works as an escape for those magic characters. So, '%.' matches a dot; '%%' matches the character `%´ itself. You can use the escape `%´ not only for the magic characters, but also for all other non-alphanumeric characters. When in doubt, play safe and put an escape.
Doesn't this mean that you can simply put % in front of every non alphanumeric character and be fine. This would also be future proof (in the case that new special characters are introduced). Like this:
function escape_pattern(text)
return text:gsub("([^%w])", "%%%1")
end
It worked for me on Lua 5.3.2 (only rudimentary testing was performed). Not sure if it will work with older versions.

Why not:
local quotepattern = '(['..("%^$().[]*+-?"):gsub("(.)", "%%%1")..'])'
string.quote = function(str)
return str:gsub(quotepattern, "%%%1")
end
to escape and then gsub it away?

try
line = '"(\.)\n(\w)", r"\1 \2"'
rx = '\"%(%\.%)%\n%(%\w%)\", r\"%\1 %\2\"'
print(string.gsub(line, rx, ""))
escape special characters with %, and quotes with \

Try s=[["(\.)\n(\w)", r"\1 \2"]].

Use stringx.replace() from Penlight Lua Libraries instead.
Reference: https://stevedonovan.github.io/Penlight/api/libraries/pl.stringx.html#replace
Implementation (v1.12.0): https://github.com/lunarmodules/Penlight/blob/1.12.0/lua/pl/stringx.lua#L288
Based on their implementation:
function escape(s)
return (s:gsub('[%-%.%+%[%]%(%)%$%^%%%?%*]','%%%1'))
end
function replace(s,old,new,n)
return (gsub(s,escape(old),new:gsub('%%','%%%%'),n))
end

Related

What's the , Lua equivalent of pythons endswith()?

I want to convert this python code to lua .
for i in range(1000,9999):
if str(i).endswith('9'):
print(i)
I've come this far ,,
for var=1000,9000 then
if tostring(var).endswith('9') then
print (var)
end
end
but I don't know what's the lua equivalent of endswith() is ,,, im writing an nmap script,,
working 1st time with lua so pls let me know if there are any errors ,, on my current code .
The python code is not great, you can get the last digit by using modulo %
# python code using modulo
for i in range(1000,9999):
if i % 10 == 9:
print(i)
This also works in Lua. However Lua includes the last number in the loop, unlike python.
-- lua code to do this
for i=1000, 9998 do
if i % 10 == 9 then
print(i)
end
end
However in both languages you could iterate by 10 each time
for i in range(1009, 9999, 10):
print(i)
for i=9, 9998, 10 do
print(i)
for var = 1000, 9000 do
if string.sub(var, -1) == "9" then
-- do your stuff
end
end
XY-Problem
The X problem of how to best port your code to Lua has been answered by quantumpro already, who optimized it & cleaned it up.
I'll focus on your Y problem:
What's the Lua equivalent of Python endswith?
Calling string functions, OOP-style
In Lua, strings have a metatable that indexes the global string library table. String functions are called using str:func(...) in Lua rather than str.func(...) to pass the string str as first "self" argument (see "Difference between . and : in Lua").
Furthermore, if the argument to the call is a single string, you can omit the parentheses, turning str:func("...") into str:func"...".
Constant suffix: Pattern Matching
Lua provides a more powerful pattern matching function that can be used to check whether a string ends with a suffix: string.match. str.endswith("9") in Python is equivalent to str:match"9$" in Lua: $ anchors the pattern at the end of the string and 9 matches the literal character 9.
Be careful though: This approach doesn't work with arbitrary, possibly variable suffices since certain characters - such as $ - are magic characters in Lua patterns and thus have a special meaning. Consider str.endswith("."); this is not equivalent to string:match".$" in Lua, since . matches any character.
I'd say that this is the lua-esque way of checking whether a string ends with a constant suffix. Note that it does not return a boolean, but rather a match (the suffix, a truthy value) if successful or nil (a falsey value) if unsuccessful; it can thus safely be used in ifs. To convert the result into a boolean, you could use not not string:match"9$".
Variable suffix: Rolling your own
Lua's standard library is very minimalistic; as such, you often need to roll your own functions even for basic things. There are two possible implementations for endswith, one using pattern matching and another one using substrings; the latter approach is preferable because it's shorter, possibly faster (Lua uses a naive pattern matching engine) and doesn't have to take care of pattern escaping:
function string:endswith(suffix)
return self:sub(-#suffix) == suffix
end
Explanation: self:sub(-#suffix) returns the last suffix length characters of self, the first argument. This is compared against the suffix.
You can then call this function using the colon (:) syntax:
str = "prefixsuffix"
assert(str:endswith"suffix")
assert(not str:endswith"prefix")

Rails strip all except numbers commas and decimal points

Hi I've been struggling with this for the last hour and am no closer. How exactly do I strip everything except numbers, commas and decimal points from a rails string? The closest I have so far is:-
rate = rate.gsub!(/[^0-9]/i, '')
This strips everything but the numbers. When I try add commas to the expression, everything is getting stripped. I got the aboves from somewhere else and as far as I can gather:
^ = not
Everything to the left of the comma gets replaced by what's in the '' on the right
No idea what the /i does
I'm very new to gsub. Does anyone know of a good tutorial on building expressions?
Thanks
Try:
rate = rate.gsub(/[^0-9,\.]/, '')
Basically, you know the ^ means not when inside the character class brackets [] which you are using, and then you can just add the comma to the list. The decimal needs to be escaped with a backslash because in regular expressions they are a special character that means "match anything".
Also, be aware of whether you are using gsub or gsub!
gsub! has the bang, so it edits the instance of the string you're passing in, rather than returning another one.
So if using gsub! it would be:
rate.gsub!(/[^0-9,\.]/, '')
And rate would be altered.
If you do not want to alter the original variable, then you can use the version without the bang (and assign it to a different var):
cleaned_rate = rate.gsub!(/[^0-9,\.]/, '')
I'd just google for tutorials. I haven't used one. Regexes are a LOT of time and trial and error (and table-flipping).
This is a cool tool to use with a mini cheat-sheet on it for ruby that allows you to quickly edit and test your expression:
http://rubular.com/
You can just add the comma and period in the square-bracketed expression:
rate.gsub(/[^0-9,.]/, '')
You don't need the i for case-insensitivity for numbers and symbols.
There's lots of info on regular expressions, regex, etc. Maybe search for those instead of gsub.
You can use this:
rate = rate.gsub!(/[^0-9\.\,]/g,'')
Also check this out to learn more about regular expressions:
http://www.regexr.com/

How to replace brackets in Lua using string.gsub?

I have a function which is used to replace some words with a few characters or numbers. I am using string.gsub() function in this way:
string.gsub(line, "[0-9%a%s/,-]+", "\t")
This works very good with strings with numbers, letters, spaces, ,, and /. I also would like to replace brackets like ( and ). But simply inserting () to my pattern doesn't work. I have also tried with %( and %) but it wasn't successful. How can I replace brackets in Lua using pattern in string.gsub() method?
The only characters that need to be escaped inside [] are []%-, all of which are escaped with %. As such, escaping - as follows works:
string.gsub(line, "[0-9%a%s/,%-()]+", "\t")
It's also probably worth mentioning that [0-9%a] is equivalent to [%d%a], which is equivalent to %w.

Removing lines that begin with > in a rails string

I'm trying to remove any lines that begin with the character '>' in a long string (i.e. replies to an email).
In PHP I'd iterate over each line with an if statement, in linux I'd try and use sed or awk.
What's the most elegant rails approach?
You can try this:
your_string.gsub(/^\>.+\n/,'')
Your question is implying that the input is one string, containing multiple lines.
Do you want the output to be just one string with multiple lines as well? I'm assuming yes.
either using String and Array operations:
str.lines.reject{|x| x =~ /^>/}.join # this will return a new string, without those ">" lines
or using Regular Expressions:
str.gsub(/^>.+\n*/. '')
Better Solution:
You will need to use non-greedy multi-line matching mode for your Regular Expression:
str.gsub(/^>.*?$\n*/m, '') # by using gsub!() you can modify the string in place
^> matches your ">" character at the start of a line
.*?$ matches any characters after the start character until the end of the line (non-greedy)
\n* matches the newline character itself if any (you want to remove that as well)
the "m" at the end of the regular expressions indicates multi-line matching , which will apply the RegExp for each line in the string.
It should work as you expect:
your_string.lines.to_a.reject{|line| line[0] == '>'}.join

Error in string.gsub with backslash

local a = "te\st"
local b = string.gsub(a,'\','\\\\')
assert(false,b)
What am I doing wrong?
When I do assert, I want that to the screen the string te\st will be printed... but it's not working
I have a JSON file, that I want to decode it into Lua table. I don't need to print out nothing, I did the assert just to test a local problem.
So what I need is to keep all data in the JSON file that has '\'.
Use [[]] instead of "" or '' if you don't want backslash to have special meaning.
Read about literal strings in the manual.
Have you tried escaping it with the % character instead of \
I don't know if this will help, but I was having a HELL of a time making Lua's gsub match my string with special characters in it that I wanted treated literally... it turned out that instead of using \ as an escape character, or doubling the character, that I needed to prefix the special character with % to make it be treated literally.
Your question wasn't too clear so I'm not 100% sure what you mean. Do you mean that you want the assert to fire when b is equal to the string "te\st"? If so you can do a simple:
assert(b ~= "te\st")
Or I suppse...
assert(b ~= a)
You don't need the gsub. But here it is anyways.
local a = "te\\st"
local b = string.gsub(a,'\\','\\')
assert(false,b)

Resources