extract data from string in lua - SubStrings and Numbers - lua

I'm trying to parse a string for a hobby project. I'm self-taught from code snippets on this site and am having a hard time working out this problem. I hope you guys can help.
I have a large string, containing many lines, and each line has a certain format.
I can get each line in the string using this code...
for line in string.gmatch(deckData,'[^\r\n]+') do
print(line) end
Each line looks something like this...
3x Rivendell Minstrel (The Hunt for Gollum)
What I am trying to do is make a table that looks something like this for the above line.
table = {}
table['The Hunt for Gollum'].card = 'Rivendell Minstrel'
table['The Hunt for Gollum'].count = 3
So my thinking was to extract everything inside the parentheses, then extract the numeric value, then delete the first 3 chars in the line, as they will always be '1x ', '2x ' or '3x '
I have tried a bunch of things.. like this...
word=str:match("%((%a+)%)")
but it errors if there are spaces...
my test code looks like this at the moment...
line = '3x Rivendell Minstrel (The Hunt for Gollum)'
num = line:gsub('%D+', '')
print(num) -- Prints "3"
card2Fetch = string.sub(line, 4) -- skip the 3-character prefix such as '3x '
print(card2Fetch) -- Prints "Rivendell Minstrel (The Hunt for Gollum)"
key = string.gsub(card2Fetch, "%s+", "") -- Remove all Spaces
key=key:match("%((%a+)%)") -- Fetch between ()s
print(key) -- Prints "TheHuntforGollum"
Any ideas how to get the "The Hunt for Gollum" text out of there including the spaces?

Try a single pattern capturing all fields:
x,y,z=line:match("(%d+)x%s+(.-)%s+%((.*)%)")
t = {}
t[z] = {}
t[z].card = y
t[z].count = tonumber(x) -- match returns strings; convert the count to a number
The pattern reads: capture a run of digits before x, skip whitespace, capture everything until whitespace followed by an open parenthesis, and finally capture everything up to the last close parenthesis (.* is greedy).
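To apply the same pattern to the whole deck string, here is a minimal sketch. The name parseDeck and the second sample card are made up for illustration, and storing a list of cards per set is my own assumption (the snippet above keeps only one card per set).

```lua
-- Sketch: parse a multi-line deck list into a table keyed by set name.
-- parseDeck is a hypothetical helper, not part of the original answer.
local function parseDeck(deckData)
  local sets = {}
  for line in deckData:gmatch("[^\r\n]+") do
    -- anchored version of the pattern above: count, card name, set name
    local count, card, set = line:match("^(%d+)x%s+(.-)%s+%((.*)%)%s*$")
    if count then
      -- several cards may belong to one set, so keep a list per set
      sets[set] = sets[set] or {}
      table.insert(sets[set], { card = card, count = tonumber(count) })
    end
  end
  return sets
end

local deck = parseDeck("3x Rivendell Minstrel (The Hunt for Gollum)\n1x Gandalf (Core Set)")
print(deck["The Hunt for Gollum"][1].card, deck["The Hunt for Gollum"][1].count)
```

Lines that do not fit the format are skipped, because match returns nil for them.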

Related

Trying to get some kind of key:value data from a string in Lua

I'm (again) stuck because patterns... so let's see if with a little of help... The case is I have e. g. a string returned by a function that contains the following:
📄 My Script
ScriptID:RL_SimpleTest
Version:0.0.1
ScriptType:MenuScript
AnotherKey:AnotherValue
And, maybe, some more text...
And I'd want to parse it line by line and, should the line contain a ":", get the left side of the line in one variable (k) and the right side in another one (v), so e. g. I'd have k containing "ScriptID" and v containing "RL_SimpleTest" for the second line (the first one should just be ignored) and so on...
Well, I've started with something like this:
function RL_Test:StringToKeyValue(str, sep1, sep2)
sep1 = sep1 or "\n"
sep2 = sep2 or ":"
local t = {}
for line in string.gmatch(str, "([^" .. sep1 .. "]+)") do
print(line)
for k in string.gmatch(line, "([^" .. sep2 .. "]+)") do --Here is where I'm lost trying to get the key/value pair separately and at the same time...
--t[k] = v
print(k)
end
end
return t
end
I hoped that once I had isolated the line containing the data in key:value form, I'd be able to do some kind of for k, v in string.gmatch(line, "([^" .. sep2 .. "]+)") and get the two pieces of data that way, but of course it doesn't work. Even though I have a feeling it's trivial, I don't even know where to start, as always for lack of understanding patterns...
Well, I hope I at least explained it clearly... Thanks in advance for any help.
local t = {}
for line in (s..'\n'):gmatch("(.-)\r?\n") do
for a, b in line:gmatch("([^:]+):([^:\n\r]+)") do
t[a] = b
end
end
The pattern is quite simple. Match anything that is not a colon that is followed by a colon that is followed by anything that is not a colon or a line break. Put what you want in captures and you're done.
I assume every line is of the format k:v, containing exactly one colon, or containing no colon (no k/v pair).
Then you can simply first match nonempty lines using [^\n]+ (assuming UNIX LF line endings), then match each line using ^([^:]+):([^:]+)$. Breakdown of the second pattern:
^ and $ are anchors. They force the pattern to match the entire line.
([^:]+) matches & captures one or more non-colon characters.
This leaves you with:
function RL_Test:StringToKeyValue(str)
local t = {}
for line in str:gmatch"[^\n]+" do
local k, v = line:match"^([^:]+):([^:]+)$"
if k then -- line is k:v pair?
t[k] = v
end
end
return t
end
If you want to support Windows CRLF line endings, use for line in (s..'\n'):gmatch'(.-)\r?\n' do as in Piglet's answer for matching the lines instead.
This answer differs from Piglet's answer in that it uses match instead of gmatch for matching the k/v pairs, allowing exactly one k/v pair with exactly one colon per line, whereas Piglet's code may extract multiple k/v pairs per line.
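Combining the two answers, a rough sketch that tolerates both LF and CRLF line endings and still allows only one k/v pair per line could look like this. parseKeyValues is a hypothetical name, and the sample text is shortened from the question.

```lua
-- Sketch: one key:value pair per line, LF or CRLF line endings.
-- parseKeyValues is a hypothetical helper name.
local function parseKeyValues(str)
  local t = {}
  -- append "\n" so the last line is matched even without a trailing newline
  for line in (str .. "\n"):gmatch("(.-)\r?\n") do
    -- anchors force exactly one colon per line
    local k, v = line:match("^([^:]+):([^:]+)$")
    if k then
      t[k] = v
    end
  end
  return t
end

local data = parseKeyValues("My Script\r\nScriptID:RL_SimpleTest\r\nVersion:0.0.1\nNo colon here")
print(data.ScriptID, data.Version)
```

Lines without a colon, or with more than one, are simply ignored.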

Split a string on new lines, but include empty lines

Let's say I have a string with the contents
local my_str = [[
line1
line2
line4
]]
I'd like to get the following table:
{"line1","line2","","line4"}
In other words, I'd like the blank line 3 to be included in my result. I've tried the following:
local result = {};
for line in string.gmatch(my_str, "[^\n]+") do
table.insert(result, line);
end
However, this produces a result which will not include the blank line 3.
How can I make sure the blank line is included? Am I just using the wrong regex?
Try this instead:
local result = {};
for line in string.gmatch(my_str .. "\n", "(.-)\n") do
table.insert(result, line);
end
If you don't want the empty fifth element that gives you, then get rid of the blank line at the end of my_str, like this:
local my_str = [[
line1
line2
line4]]
(Note that a newline at the beginning of a long literal is ignored, but a newline at the end is not.)
You can replace the + with *, but that won't behave the same in all Lua versions; LuaJIT will add extra empty strings to your result (which isn't even technically wrong, since * also matches an empty string).
If your string always includes a newline character at the end of the last line like in your example, you can just do something like "([^\n]*)\n" to prevent those extra empty strings and the last empty string.
In Lua 5.2+ you can also just use a frontier pattern to check for either a newline or the end of the string: [^\n]*%f[\n\0], but that won't work in LuaJIT either.
If you need to support LuaJIT and don't have the trailing newline in your actual string, then you could just add it manually:
string.gmatch(my_str .. "\n", "([^\n]*)\n")
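Wrapped into a function, the idea might look like the sketch below. splitLines is a hypothetical name, and the check for an existing trailing newline is my own addition so that no extra empty element appears at the end.

```lua
-- Sketch: split on "\n" and keep empty lines.
-- splitLines is a hypothetical helper name.
local function splitLines(str)
  local lines = {}
  -- add a terminator only if the string does not already end with one
  if str:sub(-1) ~= "\n" then
    str = str .. "\n"
  end
  -- every match consumes a "\n", so no stray empty matches can occur
  for line in str:gmatch("([^\n]*)\n") do
    lines[#lines + 1] = line
  end
  return lines
end

local result = splitLines("line1\nline2\n\nline4\n")
print(#result, result[3] == "")
```

Because each match has to consume a newline, this should not depend on how a given Lua version treats empty matches.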

Lua string.find() error

So I'm writing a Lua script and I tested it but I got an error that I don't know how to fix:
.\search.lua:10: malformed pattern (missing ']')
Below is my code. If you know what I did wrong, it would be very helpful if you could tell me.
weird = "--[[".."\n"
function readAll(file)
local c = io.open(file, "rb")
local j = c:read("*all")
c:close()
return(j)
end
function blockActive()
local fc = readAll("functions.lua")
if string.find(fc,weird) ~= nil then
require("blockDeactivated")
return("false")
else
return("true")
end
end
print(blockActive())
Edit: first comment had the answer. I changed
weird = "--[[".."\n" to weird = "%-%-%[%[".."\r" The \n to \r change was because it was actually supposed to be that way in the first place.
This errors because string.find uses Lua Patterns.
Most non-alpha-numeric characters, such as "[", ".", "-" etc. convey special meaning.
string.find(fc,weird), or better, fc:find(weird) is trying to parse these special characters, and erroring.
You can, however, use a pattern to escape the special characters in your search string.
weird = ("--[["):gsub("%W","%%%0") .. "\r?\n"
This is a little daunting, but it will hopefully make sense.
The ("--[[") is the original first part of your weird string, working as expected.
:gsub() is a function that replaces every match of a pattern with a replacement. Once again, see Patterns.
"%W" is a pattern that matches any single character that isn't a letter or a digit.
"%%%0" is the replacement: %% is an escaped (literal) %, and %0 stands for the whole match, so every matched character is replaced with itself preceded by a %.
So this means that [[ will be turned into %[%[, which is how find and similar functions 'escape' special characters.
The reason \n is now \r?\n refers back to these patterns. This matches if it ends with a \n, like it did before. However, if this is running on Windows, a newline might look like \r\n. A ? following a character, \r in this case, means it can optionally match it. So this matches both --[[\n and --[[\r\n, supporting both Windows and Linux.
Now, when you run your fc:find(weird), it's running fc:find("%-%-%[%[\r?\n"), which should be exactly what you want.
Hope this has helped!
Finished code if you're a bit lazy
weird = ("--[["):gsub("%W","%%%0") .. "\r?\n" -- Escape "--[[", add an optional \r and a newline. Used in our find.
-- readAll(file)
-- Takes a string as input representing a filename, returns the entire contents as a string.
function readAll(file)
  local c = io.open(file, "rb") -- Open the file specified by the argument. Read-only, binary (doesn't translate things like \r\n)
  local j = c:read("*all") -- Dump the contents of the file into a string.
  c:close() -- Close the file, free up memory.
  return j -- Return the contents of the file.
end
-- blockActive()
-- Returns false if the weird string was matched in 'functions.lua' (and executes 'blockDeactivated.lua'), true otherwise.
function blockActive()
  local fc = readAll("functions.lua") -- Dump the contents of 'functions.lua' into a string.
  if fc:find(weird) then -- If functions.lua has the blocker.
    require("blockDeactivated") -- Require (thus execute; consider loadfile instead) 'blockDeactivated.lua'
    return false -- Return false.
  else
    return true -- Return true.
  end
end
print(blockActive()) -- Test the blockActive code.
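As a side note, if you only need to find literal text, string.find also accepts a fourth argument (plain) that turns pattern matching off. The sketch below shows both routes side by side. escapePattern is a hypothetical helper name and the source string is made-up sample data.

```lua
-- Sketch: two ways to search for the literal text "--[[".
-- escapePattern is a hypothetical helper name.
local function escapePattern(text)
  -- put a % in front of every non-alphanumeric character;
  -- the outer parentheses drop gsub's second return value (the match count)
  return (text:gsub("%W", "%%%0"))
end

local source = "local x = 1\n--[[\r\nblock\n]]"

-- Option 1: plain find (4th argument true), no pattern matching at all
local startPlain = source:find("--[[", 1, true)

-- Option 2: escaped literal combined with a real pattern for the line ending
local startEscaped = source:find(escapePattern("--[[") .. "\r?\n")

print(startPlain, startEscaped)
```

Plain find is simpler when nothing else needs to be a pattern. The escaped version is needed once you want to mix literal text with something like \r?\n.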

Lua Pattern Matching, get character before match

Currently I have code that looks like this:
somestring = "param=valueZ&456"
local stringToPrint = (somestring):gsub("(param=)[^&]+", "%1hello", 1)
stringToPrint will look like this:
param=hello&456
I have replaced all of the characters before the & with the string "hello". This is where my question becomes a little strange and specific.
I want my string to appear as: param=helloZ&456. In other words, I want to preserve the character right before the & when replacing the string valueZ with hello to make it helloZ instead. How can this be done?
I suggest:
somestring:gsub("param=[^&]*([^&])", "param=hello%1", 1)
Here, the pattern matches:
param= - literal substring param=
[^&]* - 0 or more chars other than & as many as possible
([^&]) - Group 1 capturing a symbol other than & (here, backtracking will occur, as the previous pattern grabs all such chars other than & and then the engine will take a step back and place the last char from that chunk into Group 1).
There are probably other ways to do this, but here is one:
somestring = "param=valueZ&456"
local stringToPrint = (somestring):gsub("(param=).-([^&]&)", "%1hello%2", 1)
print(stringToPrint)
The thing here is that I match the shortest string that ends with a character that is not & and a character that is &. Then I add the two ending characters to the replaced part.
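For comparison, here is a small sketch that runs both suggestions on the same input, plus one input without a trailing & (that second input is my own test case, not from the question).

```lua
-- Sketch: compare the greedy and the lazy replacement.
local input = "param=valueZ&456"

-- greedy: [^&]* takes "valueZ", then gives back one char so ([^&]) captures "Z"
local a = input:gsub("param=[^&]*([^&])", "param=hello%1", 1)

-- lazy: .- grows until it reaches a non-& character followed by &
local b = input:gsub("(param=).-([^&]&)", "%1hello%2", 1)

-- without a trailing & only the greedy version still matches
local noAmp = "param=valueZ"
local c = noAmp:gsub("param=[^&]*([^&])", "param=hello%1", 1)
local d = noAmp:gsub("(param=).-([^&]&)", "%1hello%2", 1)

print(a, b, c, d)
```

Both give param=helloZ&456 for the original string. They differ only when the value is the last thing in the string, because the second pattern requires a literal & after the preserved character.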

Split lua string into characters

I only found this related to what I am looking for: Split string by count of characters but it is not useful for what I mean.
I have a string variable which consists of 3 digits (it can be from 000 to 999). I need to separate each of the digits (characters) and get them into a table.
I am programming for a game mod which uses lua, and it has some extra functions. If you could help me to make it using: http://wiki.multitheftauto.com/wiki/Split would be amazing, but any other way is ok too.
Thanks in advance
Corrected to what the OP wanted to ask:
To just split a 3-digit number in 3 numbers, that's even easier:
s='429'
c1,c2,c3=s:match('(%d)(%d)(%d)')
t={tonumber(c1),tonumber(c2),tonumber(c3)}
The answer to "How do I split a long string composed of 3 digit numbers":
This is trivial. You might take a look at the gmatch function in the reference manual:
s="123456789"
res={}
for num in s:gmatch('%d%d%d') do
res[#res+1]=tonumber(num)
end
or if you don't like looping:
res={}
s:gsub('%d%d%d',function(n)res[#res+1]=tonumber(n)end)
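If the number of digits is not fixed at three, the same gmatch idea works one digit at a time. This is only a sketch, and digits is a hypothetical function name.

```lua
-- Sketch: split a string of digits into a table of single-digit numbers.
-- digits is a hypothetical helper name.
local function digits(s)
  local out = {}
  for d in s:gmatch("%d") do
    out[#out + 1] = tonumber(d)
  end
  return out
end

local t = digits("042")
print(t[1], t[2], t[3])
```

Leading zeros survive as separate entries because each character is converted on its own.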
I was looking for something like this, but avoiding looping - and hopefully having it as one-liner. Eventually, I found this example from lua-users wiki: Split Join:
fields = {str:match((str:gsub("[^"..sep.."]*"..sep, "([^"..sep.."]*)"..sep)))}
... which is exactly the kind of syntax I'd like - one liner, returns a table - except, I don't really understand what is going on :/ Still, after some poking about, I managed to find the right syntax to split into characters with this idiom, which apparently is:
fields = { str:match( (str:gsub(".", "(.)")) ) }
I guess, what happens is that gsub basically puts parenthesis '(.)' around each character '.' - so that match would consider those as a separate match unit, and "extract" them as separate units as well... But I still don't get why is there extra pair of parenthesis around the str:gsub(".", "(.)") piece.
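About that extra pair of parentheses: as far as I can tell, it is there because gsub returns two values (the new string and the number of substitutions), and wrapping a call in parentheses keeps only the first one. A small sketch with a made-up three-letter string:

```lua
-- Sketch: why the call to gsub is wrapped in parentheses.
local str = "abc"

-- gsub returns two values: the rewritten string and the substitution count
local pattern, count = str:gsub(".", "(.)")
print(pattern, count)

-- Without the parentheses, match would receive count as its init argument
-- (the position to start searching from), which is not what we want here.
local fields = { str:match((str:gsub(".", "(.)"))) }
print(#fields, fields[1], fields[2], fields[3])
```

One caveat: a standard Lua build allows at most 32 captures per pattern, so this idiom should only be expected to work for short strings.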
I tested this with Lua5.1:
str = "a - b - c"
fields = { str:match( (str:gsub(".", "(.)")) ) }
print(table_print(fields))
... where table_print is from lua-users wiki: Table Serialization; and this code prints:
"a"
" "
"-"
" "
"b"
" "
"-"
" "
"c"
