Lua String Manipulation (Find Words Before & After) - lua

I'm fairly new to this forum. I am having trouble with manipulating the correct string to achieve this.
Basically, what I'm trying to do is receive an input string like this example:
str = "Say hello to=Stack overflow, Say goodbye to=other resources"
for question, answer in pairs(string.gmatch(s, "(%w+)=(%w+)"))
print(question, answer)
end
I want it to return question = "Say hello to" and answer = "Stack overflow", then question = "Say goodbye to" and answer = "other resources", and so on. Instead, it picks up only the word just before the equal sign and the word just after it. I've even tried the * quantifier, and it does exactly the same thing.
I've also tried this pattern
[%w%s]*=[%w%s]
I just want to be able to sort this string into a key-value table where the key is all words before each = and the value is all words after that equal but before the comma.
Does anyone have a suggestion?

You can use something like this:
local str = "Say hello to=Stack overflow, Say goodbye to=other resources"
for question, answer in string.gmatch(str..",", "([^=]+)=([^,]+),%s*") do
print(question, answer)
end
"([^=]+)=([^,]+),%s*" means the following: anything except = ([^=]) repeated 1 or more times (+) followed by = and then anything except ',', followed by comma and optional whitespaces (to avoid including them in the next question). I also added comma to the string, so it parses the last pair as well.
To elaborate a bit further per request in the comments: in the expression [^=]+, [=] designates a set with one allowed character (=) and [^=] negates that, so it's a set with any character allowed except = and + allows the set to be repeated 1 or more times.
As @lhf suggested, you can use a simpler expression: (.-)=(.-),%s*, which means: take all characters until the first = (- makes matching non-greedy) and then take all characters until the first ,.
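Since the goal was a key-value table, here is a minimal sketch that stores the pairs produced by the simpler pattern. The table name lookup is only illustrative (it is not from the original post), and the sketch assumes that no key or value contains = or a comma.

```lua
local str = "Say hello to=Stack overflow, Say goodbye to=other resources"

-- `lookup` is a hypothetical name for the resulting key-value table.
local lookup = {}

-- Append a comma so the last pair is terminated like all the others.
for question, answer in string.gmatch(str .. ",", "(.-)=(.-),%s*") do
  lookup[question] = answer
end

print(lookup["Say hello to"])    --> Stack overflow
print(lookup["Say goodbye to"])  --> other resources
```

The trailing %s* consumes the space after each comma, so the second key comes out without a leading space.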

Related

What Lua pattern behaves like a regex negative lookahead?

my problem is I need to write a Lua code to interpret a text file and match lines with a pattern like
if line_str:match(myPattern) then do myAction(arg) end
Let's say I want a pattern to match lines containing "hello" in any context except one containing "hello world". I found that in regex, what I want is called negative lookahead, and you would write it like
.*hello (?!world).*
but I'm struggling to find the Lua version of this.
Let's say I want a pattern to match lines containing "hello" in any context except one containing "hello world".
As Wiktor has correctly pointed out, the simplest way to write this would be line:find"hello" and not line:find"hello world" (you can use both find and match here, but find is probably more performant; you can also turn off pattern matching for find).
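As a rough sketch of that suggestion, with pattern matching turned off through the fourth argument of find (the helper name wants_line is made up for illustration):

```lua
-- Hypothetical helper: true if the line contains "hello" but not "hello world".
local function wants_line(line)
  -- Passing `true` as the fourth argument makes find do a plain substring search.
  local has_hello = line:find("hello", 1, true) ~= nil
  local has_hello_world = line:find("hello world", 1, true) ~= nil
  return has_hello and not has_hello_world
end

print(wants_line("well, hello there"))  --> true
print(wants_line("hello world"))        --> false
print(wants_line("hello hello world"))  --> false
```

Note that this rejects any line containing "hello world" anywhere, which is what the question describes in words, not what the quoted regex does.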
I found that in regex, what I want is called negative lookahead, and you would write it like .*hello (?!world).*
That's incorrect. If you checked against the existence of such a match, all it would tell you would be that there exists a "hello" which is not followed by a "world". The string hello hello world would match this, despite containing "hello world".
Negative lookahead is a questionable feature anyway, as it isn't trivially provided by truly regular expressions and thus may not be implemented in linear time.
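If you really need it, look into LPeg; there, -patt is a negative lookahead (it succeeds, without consuming input, only if patt does not match), and pattern1 - pattern2 matches pattern1 only where pattern2 does not match.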
Finally, the RegEx may be translated to "just Lua" simply by searching for (1) the pattern without the negative part (2) the pattern with the negative part and checking whether there is a match in (1) that is not in (2) simply by counting:
local hello_count = 0; for _ in line:gmatch"hello" do hello_count = hello_count + 1 end
local helloworld_count = 0; for _ in line:gmatch"hello world" do helloworld_count = helloworld_count + 1 end
if hello_count > helloworld_count then
-- there is a "hello" not followed by a "world"
end
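The same idea can also be written as a single scan that inspects what follows each "hello". This is only a sketch under the assumption that "not followed by world" means the next six characters are not " world" (with the space, as in the regex); the function name is hypothetical.

```lua
-- Hypothetical helper: true if some "hello" is not immediately followed by " world".
local function has_hello_without_world(line)
  local init = 1
  while true do
    -- Plain (non-pattern) search for the next "hello" starting at `init`.
    local s, e = line:find("hello", init, true)
    if not s then
      return false -- no further occurrences
    end
    if line:sub(e + 1, e + 6) ~= " world" then
      return true -- this occurrence is not followed by " world"
    end
    init = e + 1
  end
end

print(has_hello_without_world("hello hello world"))  --> true
print(has_hello_without_world("hello world"))        --> false
print(has_hello_without_world("say hello"))          --> true
```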

How to specify a range in Ruby

I've been looking for a good way to see if a string of items are all numbers, and thought there might be a way of specifying a range from 0 to 9 and seeing if they're included in the string, but all that I've looked up online has really confused me.
def validate_pin(pin)
(pin.length == 4 || pin.length == 6) && pin.count("0-9") == pin.length
end
The code above is someone else's work and I've been trying to identify how it works. It's a pin checker - takes in a set of characters and ensures the string is either 4 or 6 digits and all numbers - but how does the range work?
When I did this problem I tried to use to_a? Integer and a bunch of other things, including ranges such as (0..9), ("0..9") and ("0".."9"), to validate that a character is an integer. When I saw ("0-9") it confused the heck out of me, and half an hour of googling and YouTube has only left me with regex tutorials (which I'm interested in, but I'm currently just trying to get the basics down).
So to sum this up, my goal is to understand a more semantic/concise way to identify if a character is an integer. Whatever is the simplest way. All and any feedback is welcome. I am a new rubyist and trying to get down my fundamentals. Thank You.
Regex really is the right way to do this. It's specifically for testing patterns in strings. This is how you'd test "do all characters in this string fall in the range of characters 0-9?":
pin.match(/\A[0-9]+\z/)
This regex says "Does this string start and end with at least one of the characters 0-9, with nothing else in between?" - the \A and \z are start-of-string and end-of-string matchers, and the [0-9]+ matches any one or more of any character in that range.
You could even do your entire check in one line of regex:
pin.match(/\A([0-9]{4}|[0-9]{6})\z/)
Which says "Does this string consist of the characters 0-9 repeated exactly 4 times, or the characters 0-9, repeated exactly 6 times?"
Ruby's String#count method does something similar to this, though it just counts the number of occurrences of the characters passed, and it uses something similar to regex ranges to allow you to specify character ranges.
The sequence c1-c2 means all characters between c1 and c2.
Thus, it expands the parameter "0-9" into the list of characters "0123456789", and then it tests how many of the characters in the string match that list of characters.
This will work to verify that a certain number of numbers exist in the string, and the length checks let you implicitly test that no other characters exist in the string. However, regexes let you assert that directly, by ensuring that the whole string matches a given pattern, including length constraints.
Count everything non-digit in pin and check if this count is zero:
pin.count("^0-9").zero?
Since you seem to be looking for answers outside regex and since Chris already spelled out how the count method was being implemented in the example above, I'll try to add one more idea for testing whether a string is an Integer or not:
pin.to_i.to_s == pin
What we're doing is converting the string to an integer, converting that result back to a string, and then testing to see if anything changed during the process. If the result is =>true, then you know nothing changed during the conversion to an integer and therefore the string is only an Integer.
EDIT:
The example above only works if the entire string is an Integer and won’t properly deal with leading zeros. If you want to check to make sure each and every character is an Integer then do something like this instead:
("1" + pin).to_i.to_s[1..-1] == pin
Part of the question seems to be exactly HOW the following portion of code is doing its job:
pin.count("0-9")
This piece of the code is simply returning a count of how many instances of the numbers 0 through 9 exist in the string. That's only one piece of the relevant section of code though. You need to look at the rest of the line to make sense of it:
pin.count("0-9") == pin.length
The first part counts how many instances then the second part compares that to the length of the string. If they are equal (==) then that means every character in the string is an Integer.
Sometimes negation can be used to advantage:
!pin.match?(/\D/) && [4,6].include?(pin.length)
pin.match?(/\D/) returns true if the string contains a character other than a digit (matching /\D/), in which case it would be negated to false.
One advantage of using negation here is that if the string contains a character other than a digit pin.match?(/\D/) would return true as soon as a non-digit is found, as opposed to methods that examine all the characters in the string.

Lua Pattern Matching, get character before match

Currently I have code that looks like this:
somestring = "param=valueZ&456"
local stringToPrint = (somestring):gsub("(param=)[^&]+", "%1hello", 1)
stringToPrint will look like this:
param=hello&456
I have replaced all of the characters before the & with the string "hello". This is where my question becomes a little strange and specific.
I want my string to appear as: param=helloZ&456. In other words, I want to preserve the character right before the & when replacing the string valueZ with hello to make it helloZ instead. How can this be done?
I suggest:
somestring:gsub("param=[^&]*([^&])", "param=hello%1", 1)
Here, the pattern matches:
param= - literal substring param=
[^&]* - 0 or more chars other than & as many as possible
([^&]) - Group 1 capturing a symbol other than & (here, backtracking will occur, as the previous pattern grabs all such chars other than & and then the engine will take a step back and place the last char from that chunk into Group 1).
There are probably other ways to do this, but here is one:
somestring = "param=valueZ&456"
local stringToPrint = (somestring):gsub("(param=).-([^&]&)", "%1hello%2", 1)
print(stringToPrint)
The thing here is that I match the shortest string that ends with a character that is not & and a character that is &. Then I add the two ending characters to the replaced part.
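To compare the two suggestions side by side, here is a small sketch. The variable names are illustrative, and the second input (a value with no trailing &) is an assumed edge case that the question does not mention.

```lua
local somestring = "param=valueZ&456"

-- Assigning to a single local keeps only gsub's first result (the new string).
local greedy = somestring:gsub("param=[^&]*([^&])", "param=hello%1", 1)
local lazy = somestring:gsub("(param=).-([^&]&)", "%1hello%2", 1)

print(greedy)  --> param=helloZ&456
print(lazy)    --> param=helloZ&456

-- Assumed edge case: the value is the last thing in the string.
local tail = "param=valueZ"
local greedy_tail = tail:gsub("param=[^&]*([^&])", "param=hello%1", 1)
local lazy_tail = tail:gsub("(param=).-([^&]&)", "%1hello%2", 1)

print(greedy_tail)  --> param=helloZ
print(lazy_tail)    --> param=valueZ
```

The second pattern requires a literal & after the value, so it leaves the string untouched when there is none; the first pattern does not have that requirement.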

Lua pattern help (Double parentheses)

I have been coding a program in Lua that automatically formats IRC logs from a roleplay. In the roleplay logs there is a specific guideline for "Out of character" conversation, which we use double parentheses for. For example: ((<Things unrelated to roleplay go here>)). I have been trying to have my program remove text between double brackets (and including both brackets). The code is:
ofile = io.open("Output.txt", "w")
rfile = io.open("Input.txt", "r")
p = rfile:read("*all")
w = string.gsub(p, "%(%(.*?%)%)", "")
ofile:write(w)
The pattern here is "%(%(.*?%)%)". I've tried multiple variations of the pattern, all of them fruitless:
1. %(%(.*?%)%) --Wouldn't do anything.
2. %(%(.*%)%) --Would remove *everything* after the first OOC message.
Then, my friend told me that prepending the brackets with percentages wouldn't work, and that I had to use backslashes to 'escape' the parentheses.
3. \(\(.*\)\) --resulted in the output file being completely empty.
4. (\(\(.*\)\)) --Same result as above.
5. (\(\(.*?\)\) --would for some reason, remove large parts of the text for no apparent reason.
6. \(\(.*?\)\) --would just remove all the text except for the last line.
The short, absolute question:
What pattern would I need to use to remove all text between double parentheses, and remove the double parentheses themselves too?
Your friend is thinking of regular expressions. Lua patterns are similar, but different. % is the correct escape character.
Your pattern should be %(%(.-%)%). The - is similar to * in that it matches any number of repetitions of the preceding item, but while * tries to match as many characters as it can (it's greedy), - matches as few characters as possible (it's non-greedy). It won't go overboard and match extra double-close-parentheses.
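A short sketch of the difference, using a made-up log line in place of the contents of Input.txt. As far as I can tell, the ? in .*? is not a laziness modifier in Lua patterns; after the * it is simply a literal question mark, which would explain why that variant did nothing.

```lua
-- Made-up sample text standing in for the contents of Input.txt.
local text = "Alice waves. ((brb, phone)) Bob nods. ((ok)) They leave."

-- Non-greedy: each (( ... )) block is removed on its own.
local lazy = text:gsub("%(%(.-%)%)", "")
print(lazy)    --> Alice waves.  Bob nods.  They leave.

-- Greedy: everything from the first (( to the last )) is removed.
local greedy = text:gsub("%(%(.*%)%)", "")
print(greedy)  --> Alice waves.  They leave.

-- Regex-style .*? needs a literal "?" before "))", so nothing matches here.
local regex_style = text:gsub("%(%(.*?%)%)", "")
print(regex_style == text)  --> true
```

Because . matches any character in Lua patterns, newlines included, the non-greedy version also removes out-of-character blocks that span several lines. The spaces on both sides of a removed block remain, so a follow-up gsub may be wanted if the doubled spaces matter.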

Split lua string into characters

I only found this related to what I am looking for: Split string by count of characters but it is not useful for what I mean.
I have a string variable holding 3 digits (it can be anything from 000 to 999). I need to separate the digits (characters) and put them into a table.
I am programming for a game mod which uses lua, and it has some extra functions. If you could help me to make it using: http://wiki.multitheftauto.com/wiki/Split would be amazing, but any other way is ok too.
Thanks in advance
Corrected to what the OP wanted to ask:
To just split a 3-digit number in 3 numbers, that's even easier:
s='429'
c1,c2,c3=s:match('(%d)(%d)(%d)')
t={tonumber(c1),tonumber(c2),tonumber(c3)}
The answer to "How do I split a long string composed of 3 digit numbers":
This is trivial. You might take a look at the gmatch function in the reference manual:
s="123456789"
res={}
for num in s:gmatch('%d%d%d') do
res[#res+1]=tonumber(num)
end
or if you don't like looping:
res={}
s:gsub('%d%d%d',function(n)res[#res+1]=tonumber(n)end)
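For the more general case of splitting a string of any length into single characters, gmatch with the pattern "." is probably the simplest route. A minimal sketch (the function name split_chars is hypothetical):

```lua
-- Hypothetical helper: returns a table with one entry per character of s.
local function split_chars(s)
  local chars = {}
  for c in s:gmatch(".") do
    chars[#chars + 1] = c
  end
  return chars
end

local digits = split_chars("042")
print(#digits)                    --> 3
print(table.concat(digits, " "))  --> 0 4 2
```

The entries stay strings here, so a leading "0" is preserved; wrap c in tonumber if numbers are needed.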
I was looking for something like this, but avoiding looping - and hopefully having it as one-liner. Eventually, I found this example from lua-users wiki: Split Join:
fields = {str:match((str:gsub("[^"..sep.."]*"..sep, "([^"..sep.."]*)"..sep)))}
... which is exactly the kind of syntax I'd like - one liner, returns a table - except, I don't really understand what is going on :/ Still, after some poking about, I managed to find the right syntax to split into characters with this idiom, which apparently is:
fields = { str:match( (str:gsub(".", "(.)")) ) }
I guess, what happens is that gsub basically puts parenthesis '(.)' around each character '.' - so that match would consider those as a separate match unit, and "extract" them as separate units as well... But I still don't get why is there extra pair of parenthesis around the str:gsub(".", "(.)") piece.
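On the extra pair of parentheses: gsub returns two values (the new string and the number of substitutions), and wrapping a call in parentheses keeps only the first one. Without them, the count would be passed to match as its init argument. A small sketch to illustrate, using a three-character string chosen for the example:

```lua
local str = "abc"

-- gsub returns the rewritten string and the number of substitutions made.
local pattern, count = str:gsub(".", "(.)")
print(pattern)  --> (.)(.)(.)
print(count)    --> 3

-- With the extra parentheses only the pattern string reaches match.
local with_parens = { str:match((str:gsub(".", "(.)"))) }
print(#with_parens)     --> 3

-- Without them, match receives init = 3, starts at "c" and cannot fit 3 captures.
local without_parens = { str:match(str:gsub(".", "(.)")) }
print(#without_parens)  --> 0
```

One caveat, assuming the stock interpreter: patterns are limited to 32 captures, so this one-liner is only suitable for short strings.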
I tested this with Lua5.1:
str = "a - b - c"
fields = { str:match( (str:gsub(".", "(.)")) ) }
print(table_print(fields))
... where table_print is from lua-users wiki: Table Serialization; and this code prints:
"a"
" "
"-"
" "
"b"
" "
"-"
" "
"c"
