How to read a specific part of a string? - lua

Essentially, what I need is to read a certain part of a string.
Example:
I have a string that contains "12 31".
However, I need to put these numbers into separate variables: 12 into, let's say, variable A, and 31 into variable B.
How should I go about this?

You can use Lua Patterns:
> ExampleString = "12 31"
> ExampleString:match("(%d+)%s+(%d+)")
12 31
> SubString1, SubString2= ExampleString:match("(%d+)%s+(%d+)")
> Number1 = tonumber(SubString1)
> Number2 = tonumber(SubString2)
The pattern looks complex but is actually quite simple. The parts between ( and ) are called captures and are returned if they match. Here we want two results, so we have two pairs of parentheses. %d+ matches a run of at least one (+) digit.
The two numbers are separated by whitespace, %s+, again at least one character (+).
In summary, we want to extract (Number1)whitespace(Number2).
The function string.match is used to match against the given pattern and returns the found strings. The last step is to use the function tonumber to convert the found sub-strings into Lua numbers.
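As a minimal sketch of the steps above, the match and the conversion can be wrapped in one helper that also copes with input that does not fit the pattern (string.match returns nil in that case). The name parse_pair and the ^ / $ anchors are assumptions added for illustration; they are not part of the answer above.

```lua
-- parse_pair is a hypothetical helper name used only for this sketch.
local function parse_pair(s)
  -- ^ and $ anchor the pattern, so the whole string must be
  -- "digits, whitespace, digits" (optionally padded with whitespace).
  local a, b = s:match("^%s*(%d+)%s+(%d+)%s*$")
  if not a then
    return nil  -- string.match found nothing
  end
  return tonumber(a), tonumber(b)
end

local A, B = parse_pair("12 31")
print(A, B)                  --> 12	31
print(parse_pair("12 abc"))  --> nil
```

Checking for nil before calling tonumber keeps the failure case explicit instead of silently producing nil variables later.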

Related

how to find the index of a repeated character in lua string

suppose you have a path like this
/home/user/dev/project
I want to get the index of any particular /,
for example the one before dev or the one before user.
I don't understand Lua string patterns; if there is good documentation for them, please link it.
There are several ways to do this. Perhaps the simplest is using the () pattern element which yields a match position combined with string.gmatch:
for index in ("/home/user/dev/project"):gmatch"()/" do
  print(index)
end
which prints
1
6
11
15
as expected. Another way to go (which requires some more code) would be repeatedly invoking string.find, always passing a start index.
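The string.find approach mentioned above could look roughly like this. It is only a sketch: find_all is a hypothetical helper name, and passing true as the fourth argument (plain search, no pattern matching) is a choice made here so that the search string is taken literally.

```lua
-- find_all is a hypothetical helper name used only for this sketch.
local function find_all(s, needle)
  assert(needle ~= "", "needle must not be empty")
  local positions = {}
  local start = 1
  while true do
    -- The fourth argument (true) turns off pattern matching,
    -- so needle is searched for as a literal substring.
    local first, last = s:find(needle, start, true)
    if not first then break end
    positions[#positions + 1] = first
    start = last + 1  -- continue searching after this occurrence
  end
  return positions
end

local slashes = find_all("/home/user/dev/project", "/")
print(table.concat(slashes, " "))  --> 1 6 11 15
print(slashes[3])                  --> 11 (the slash before "dev")
```

Collecting the positions in a table makes it easy to pick "the n-th slash" by index afterwards.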
If what you actually want is to split the string by slashes, that is about as simple with string.gmatch:
for substr in ("/home/user/dev/project"):gmatch"[^/]+" do
  print(substr)
end
(the pattern finds all substrings of nonzero, maximal length that don't contain a slash)
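If the pieces are needed later rather than just printed, the same loop can collect them into a table. This is a small sketch; the variable name parts is only illustrative.

```lua
-- Collect every slash-free run of characters into a table.
local parts = {}
for substr in ("/home/user/dev/project"):gmatch("[^/]+") do
  parts[#parts + 1] = substr
end

print(#parts)                    --> 4
print(table.concat(parts, ","))  --> home,user,dev,project
```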
Documentation for patterns is here. You might want to have a look at the subsection "Captures".
There are many ways to do so.
It is also good to know that Lua attaches all string functions to the string datatype as methods.
That is what @LMD demonstrates by using : directly on a string.
My favorite place for experimenting with tricky things like patterns and their captures is the Lua standalone console, built with: make linux-readline
So let's play with the pattern '[%/\\][%u%l%s]+'
> _VERSION
Lua 5.4
> -- Let's set up a path
> path='/home/dev/project/folder with spaces mixed with one OR MORE Capitals in should not be ignored'
> -- I am curious whether /home exists, so let's have a look inside
> os.execute('/bin/ls -Ah ' .. ('"%s"'):format(path:match('[%/\\][%u%l%s]+')));
knoppix koyaanisqatsi
> -- OK, now let's see if I can capture the last folder with the $ anchor
> io.stdout:write(('"%s"\n'):format(path:match('[%/\\][%u%l%s]+$'))):flush();
"/folder with spaces mixed with one OR MORE Capitals in should not be ignored"
> -- Works too, so now I want to know what the depth is
> do local str, count = path:gsub('[%/\\][%u%l%s%_%-]+','"%1"\n') print(str) return count end
"/home"
"/dev"
"/project"
"/folder with spaces mixed with one OR MORE Capitals in should not be ignored"
4
> -- OK, seems useful; let's check a Windows path with it
> path='C:\\tmp\\Some Folder'
> do local str, count = path:gsub('[%/\\][%u%l%s]+','<%1>') print(str) return count end
C:<\tmp><\Some Folder>
2
> -- And that is what I mean by "many"
> -- But be aware that only lowercase, uppercase and space characters are handled
> -- So _ - and other characters have to be included in the pattern
> -- Like: '[%/\\][%u%l%s%_%-]+'
> path='C:\\tmp\\Some_Folder'
> do local str, count = path:gsub('[%/\\][%u%l%s%_%-]+','<%1>') print(str) return count end
C:<\tmp><\Some_Folder>
2
> path='C:\\tmp\\Some-Folder'
> do local str, count = path:gsub('[%/\\][%u%l%s%_%-]+','<%1>') print(str) return count end
C:<\tmp><\Some-Folder>
2
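If the goal is only to count or list the path components, one option is to swap the explicit character classes for a negated set, so that any character other than a separator is accepted. Note that this is a different pattern from the one used in the session above, offered only as a sketch; the helper name components is hypothetical.

```lua
-- components is a hypothetical helper name used only for this sketch.
local function components(path)
  local list = {}
  -- The first set matches one separator (/ or \); the captured part is
  -- everything up to the next separator, whatever characters it contains.
  for name in path:gmatch("[/\\]([^/\\]+)") do
    list[#list + 1] = name
  end
  return list
end

print(#components("/home/dev/project/folder with spaces"))        --> 4
print(table.concat(components("C:\\tmp\\Some-Folder_2"), ", "))   --> tmp, Some-Folder_2
```

The trade-off is that this accepts every character in a folder name, which may or may not be what you want when validating paths.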

Finding strings between two strings in lua

I have been trying to find all possible strings in between 2 strings
This is my input: "print/// to be able to put any amount of strings here endprint///"
The goal is to print every string in between print/// and endprint///
You can use Lua's string patterns to achieve that.
local text = "print/// to be able to put any amount of strings here endprint///"
print(text:match("print///(.*)endprint///"))
The pattern "print///(.*)endprint///" captures any character that is between "print///" and "endprint///"
Lua string patterns here
For this kind of problem you should not use the greedy quantifiers * or +; use the lazy quantifier - instead. This is because * matches up to the last occurrence of the sub-pattern that follows it, while - matches up to the first occurrence of the sub-pattern that follows it. So, you should use this pattern:
print///(.-)endprint///
And to match it in Lua, you do this:
local text = "print/// to be able to put any amount of strings here endprint///"
local match = text:match("print///(.-)endprint///")
-- `match` should now be the text in-between.
print(match) -- " to be able to put any amount of strings here " (the surrounding spaces are part of the match)
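The difference between the two quantifiers only shows up when the delimiters occur more than once. Here is a small sketch with a made-up input containing two blocks; it also uses string.gmatch to collect every block, which is an addition on top of the answers above.

```lua
-- Hypothetical input with two print///...endprint/// blocks.
local text = "print/// first endprint/// print/// second endprint///"

-- Greedy: .* runs to the LAST endprint///, swallowing the middle delimiters.
local greedy = text:match("print///(.*)endprint///")
print("[" .. greedy .. "]")  --> [ first endprint/// print/// second ]

-- Lazy: .- stops at the FIRST endprint///; gmatch then repeats the search.
local found = {}
for body in text:gmatch("print///(.-)endprint///") do
  found[#found + 1] = body
end
print(#found)                          --> 2
print("[" .. found[1] .. "]")          --> [ first ]
print("[" .. found[2] .. "]")          --> [ second ]
```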

Capture group in Lua pattern matches literal digit character instead of capture group

I want to extract the VALUE of lines containing key="VALUE", and I am trying to use a simple Lua pattern to solve this.
It works for all lines except those that contain a literal 1 in the VALUE. It seems the pattern parser is confusing my capture group for an escape sequence.
> return string.find('... key = "PHONE2" ...', 'key%s*=%s*(["\'])([^%1]-)%1')
5 18 " PHONE2
> return string.find('... key = "PHONE1" ...', 'key%s*=%s*(["\'])([^%1]-)%1')
nil
>
You do not need to use the [^%1] at all. Just use .- as it, by definition, matches the smallest possible string.
Also, you can use the long-bracket string syntax ([[ ... ]]) so that you do not have to escape the quotes in your pattern:
> s=[[... key = "PHONE1" ...]]
> return s:find [[key%s*=%s*(["'])(.-)%1]]
5 18 " PHONE1
Inside a character set, %1 is not a back-reference. The set [^%1] simply means "any character other than a literal 1", which is why values containing a 1 fail to match.
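A short sketch that contrasts the two patterns may make this clearer. The helper name value_of and the second sample line are made up for illustration.

```lua
-- Outside a set, %1 is a back-reference to the first capture (the opening quote).
local pattern = [[key%s*=%s*(["'])(.-)%1]]

-- value_of is a hypothetical helper name used only for this sketch.
local function value_of(line)
  local quote, value = line:match(pattern)
  return value
end

print(value_of('... key = "PHONE1" ...'))    --> PHONE1
print(value_of([[... key = 'it"s' ...]]))    --> it"s

-- Inside a set, %1 is just the character 1, so [^%1] rejects any 1.
print(string.find('... key = "PHONE1" ...', 'key%s*=%s*(["\'])([^%1]-)%1'))  --> nil
print(("a%1b"):match("[^%1]+"))              --> a%
```

The last line shows that % itself is still accepted by [^%1]; only the digit 1 is excluded.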

Lua Pattern Matching, get character before match

Currently I have code that looks like this:
somestring = "param=valueZ&456"
local stringToPrint = (somestring):gsub("(param=)[^&]+", "%1hello", 1)
stringToPrint will look like this:
param=hello&456
I have replaced everything between param= and the & with the string "hello". This is where my question becomes a little strange and specific.
I want my string to appear as: param=helloZ&456. In other words, I want to preserve the character right before the & when replacing the string valueZ with hello to make it helloZ instead. How can this be done?
I suggest:
somestring:gsub("param=[^&]*([^&])", "param=hello%1", 1)
See the Lua demo
Here, the pattern matches:
param= - literal substring param=
[^&]* - 0 or more chars other than & as many as possible
([^&]) - Group 1 capturing a symbol other than & (here, backtracking will occur, as the previous pattern grabs all such chars other than & and then the engine will take a step back and place the last char from that chunk into Group 1).
There are probably other ways to do this, but here is one:
somestring = "param=valueZ&456"
local stringToPrint = (somestring):gsub("(param=).-([^&]&)", "%1hello%2", 1)
print(stringToPrint)
The idea here is to match the shortest string that ends with a character that is not & followed by an &. Both of those ending characters are captured and appended to the replacement.
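Both suggestions give the same result on the sample string, but they differ when there is no & after the value. The following sketch compares them; the second input (without_amp) is a made-up case, not something from the question.

```lua
local with_amp = "param=valueZ&456"
local without_amp = "param=valueZ"  -- hypothetical input with no trailing &

-- The extra parentheses keep only gsub's first result (the string, not the count).
local r1 = (with_amp:gsub("param=[^&]*([^&])", "param=hello%1", 1))
local r2 = (with_amp:gsub("(param=).-([^&]&)", "%1hello%2", 1))
local r3 = (without_amp:gsub("param=[^&]*([^&])", "param=hello%1", 1))
local r4 = (without_amp:gsub("(param=).-([^&]&)", "%1hello%2", 1))

print(r1)  --> param=helloZ&456
print(r2)  --> param=helloZ&456
print(r3)  --> param=helloZ
print(r4)  --> param=valueZ  (no & to anchor on, so nothing is replaced)
```

Which behaviour is preferable depends on whether param can be the last field in your strings.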

String and CharStream<'a> in FParsec

In F#, I would like to parse a long sentence that can contain names.
I assume that a name has the form first name + last name.
In the absence of a first name list (can't find, will do later), I say that a first name is a string of 4 or more letters, and the same for the last name.
When I try my very smart parser
let firstorlastname x = (parray 4 letter) x
firstorlastname "JEAN"
firstorlastname "CHRISTOPHE"
So, it works for both, but the problem is that it consumes only 4 characters, which is not the desired behaviour for Christophe. I would like the whole word to be consumed.
How can I instruct FParsec to consume the entire word, but fail if the word is less than 4 characters?
Haven't tested it, but I think this should do it:
let firstOrLastName = manyMinMaxSatisfy 4 Int32.MaxValue isLetter

Resources