Parsing key-value pairs with Lua

I am trying to parse key=value pairs with Lua. An example string looks like:
str="a=b b=c name=george jetson name2=paul davidson company=radioshack"
for name, value in string.gfind(str, "([^&=]+)=([^&=]+)") do
print(name)
print(value)
end
result:
a
b b
c name
george jetson name2
paul davidson company
radioshack
Unfortunately it's grabbing the next key and adding it to the value, which I don't want. What am I missing?

You need to treat spaces in values and spaces before keys differently.
The code below is one way of doing it.
str="a=b b=c name=george jetson name2=paul davidson company=radioshack"
str=" "..str.."\n"
str=str:gsub("%s(%S-)=","\n%1=")
for name, value in string.gmatch(str, "(%S-)=(.-)\n") do
print(name,"'"..value.."'")
end
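The two-step approach above can also be wrapped in a function that returns a table instead of printing. This is only a sketch: the name parse_pairs is made up for illustration, and it assumes that keys contain no whitespace and that values contain no "=".

```lua
-- Hypothetical helper: parse "k=v k2=value with spaces" into a table.
local function parse_pairs(s)
  local result = {}
  -- Step 1: move every "key=" onto its own line. Whitespace followed by
  -- non-space characters and "=" marks the start of a new pair.
  local marked = (" " .. s .. "\n"):gsub("%s(%S-)=", "\n%1=")
  -- Step 2: each line now has the form key=value.
  for name, value in marked:gmatch("(%S-)=(.-)\n") do
    result[name] = value
  end
  return result
end

local pairs_found = parse_pairs("a=b b=c name=george jetson name2=paul davidson company=radioshack")
print(pairs_found.name, pairs_found.company)
```

Spaces inside a value survive because only whitespace that is directly followed by a key and "=" is turned into a line break.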

Related

Trying to get some kind of key:value data from a string in Lua

I'm (again) stuck because of patterns, so let's see if a little help gets me going. I have, e.g., a string returned by a function that contains the following:
📄 My Script
ScriptID:RL_SimpleTest
Version:0.0.1
ScriptType:MenuScript
AnotherKey:AnotherValue
And, maybe, some more text...
I'd like to parse it line by line and, if a line contains a ":", store the text to the left of it in one variable (k) and the text to the right in another (v). For the second line, for example, k would contain "ScriptID" and v would contain "RL_SimpleTest" (the first line should just be ignored), and so on.
Well, I've started with something like this:
function RL_Test:StringToKeyValue(str, sep1, sep2)
sep1 = sep1 or "\n"
sep2 = sep2 or ":"
local t = {}
for line in string.gmatch(str, "([^" .. sep1 .. "]+)") do
print(line)
for k in string.gmatch(line, "([^" .. sep2 .. "]+)") do --Here is where I'm lost trying to get the key/value pair separately and at the same time...
--t[k] = v
print(k)
end
end
return t
end
My hope was that, once I had isolated a line containing data in key:value form, I could do something like for k, v in string.gmatch(line, "([^" .. sep2 .. "]+)") and get both pieces of data at once. Of course it doesn't work, and although I have a feeling it's trivial, I don't even know where to start, as always for lack of understanding of patterns.
Well, I hope at least I exposed it right... Thanks in advance for any help.
local t = {}
for line in (s..'\n'):gmatch("(.-)\r?\n") do
for a, b in line:gmatch("([^:]+):([^:\n\r]+)") do
t[a] = b
end
end
The pattern is quite simple. Match one or more characters that are not a colon, then a colon, then one or more characters that are neither a colon nor a line break. Put what you want in captures and you're done.
I assume every line is of the format k:v, containing exactly one colon, or containing no colon (no k/v pair).
Then you can simply first match nonempty lines using [^\n]+ (assuming UNIX LF line endings), then match each line using ^([^:]+):([^:]+)$. Breakdown of the second pattern:
^ and $ are anchors. They force the pattern to match the entire line.
([^:]+) matches & captures one or more non-colon characters.
This leaves you with:
function RL_Test:StringToKeyValue(str)
local t = {}
for line in str:gmatch"[^\n]+" do
local k, v = line:match"^([^:]+):([^:]+)$"
if k then -- line is k:v pair?
t[k] = v
end
end
return t
end
If you want to support Windows CRLF line endings, use for line in (s..'\n'):gmatch'(.-)\r?\n' do as in Piglet's answer for matching the lines instead.
This answer differs from Piglet's answer in that it uses match instead of gmatch for matching the k/v pairs, allowing exactly one k/v pair with exactly one colon per line, whereas Piglet's code may extract multiple k/v pairs per line.
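To see how the two ideas combine, here is a minimal sketch that uses the CRLF-tolerant line iteration together with the anchored match. The function is written as a plain local function rather than a method of RL_Test, its name is invented for this example, and the sample string is a shortened version of the one in the question with one CRLF line ending added on purpose.

```lua
-- Hypothetical standalone variant; no RL_Test object is needed.
local function string_to_key_value(str)
  local t = {}
  -- Append "\n" so the last line is matched too; "\r?" tolerates CRLF.
  for line in (str .. "\n"):gmatch("(.-)\r?\n") do
    -- Anchored match: the whole line must be key:value with one colon.
    local k, v = line:match("^([^:]+):([^:]+)$")
    if k then
      t[k] = v
    end
  end
  return t
end

local sample = "My Script\nScriptID:RL_SimpleTest\r\nVersion:0.0.1\n"
  .. "ScriptType:MenuScript\nAnd, maybe, some more text..."
local parsed = string_to_key_value(sample)
print(parsed.ScriptID, parsed.Version, parsed.ScriptType)
```

Lines without a colon, such as the first and the last one, simply produce no entry.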

lua match everything after a tag in a string

The string is like this:
TEMPLATES="!$TEMPLATE templatename manufacturer model mode\n$TEMPLATE MacQuantum Wash Basic\n$$MANUFACTURER Martin\n$$MODELNAME Mac Quantum Wash\n$$MODENAME Basic\n"
My way to get strings without tags is:
local sentence=""
for word in string.gmatch(line,"%S+") do
if word ~= tag then
sentence=sentence .. word.." "
end
end
table.insert(tagValues, sentence)
E(tag .." --> "..sentence)
And I get output:
$$MANUFACTURER --> Martin
$$MODELNAME --> Mac Quantum Wash
...
...
But this is not the way I would like to do it.
I would like to first find the block starting with the $TEMPLATE tag, to check whether it is the right block. There are many such blocks in the file, which I read line by line. Then I have to get all tags marked with a double $: $$MODELNAME etc.
I have tried many approaches, but none satisfied me. Perhaps someone has an idea how to solve it?
We are going to use Lua patterns (like regex, but different) with string.gmatch, which returns an iterator that can drive a for loop.
Explanation:
for match in string.gmatch(string, pattern) do print(match) end iterates over every occurrence of pattern in string. The pattern I will use is %$+%w+%s[^\n]+
%$+ - At least 1 literal $ ($ is a special character so it needs the % to escape), + means 1 or more. You could match for just one ("%$") if you only need the data of the tag but we want information on how many $ there are so we'll leave that in.
%w+ - match any alphanumeric character, as many as appear in a row.
%s - match a single whitespace character
[^\n]+ - match anything that isn't '\n' (^ means invert), as many as appear in a row.
Once the function hits a \n, it executes the loop on the match and repeats the process.
That leaves us with strings like "$TEMPLATE templatename manufacturer"
We want to extract the $TEMPLATE to its own variable to verify it, so we use string.match(string, pattern) to just return the value found by the pattern in string.
EDIT: Here's a comprehensive example that should provide everything you're looking for.
templates = "!$TEMPLATE templatename manufacturer model mode\n$TEMPLATE MacQuantum Wash Basic\n$$MANUFACTURER Martin\n$$MODELNAME Mac Quantum Wash\n$$MODENAME Basic\n"
local data = {}
for match in string.gmatch(templates, "%$+%w+%s[^\n]+") do --finds the pattern given in the variable 'templates'
--this function assigns certain data to tags inside table t, which goes inside data.
local t = {}
t.tag = string.match(match, '%w+') --the tag (stuff that comes between a $ and a space)
t.info = string.gsub(match, '%$+%w+%s', "") --value of the tag (what comes after the `$TEMPLATE `). Pattern: %$+ one or more dollar signs, %w+ one or more alphanumeric characters, %s one whitespace character. Replacing with "" erases that prefix.
_, t.ds = string.gsub(match, '%$', "") --gsub returns two values: the modified string, which we discard into the throwaway variable _, and the number of substitutions, which is the number of $ characters in the string.
table.insert(data, t)
end
for _,tag in pairs(data) do --iterate over every table of data in data.
for key, value in pairs(tag) do
print("Key:", key, "Value:", value) --this will show you data examples (see output)
end
print("-------------")
end
print('--just print the stuff with two dollar signs')
for _, entry in pairs(data) do
if entry.ds == 2 then --each entry is one subtable of 'data'; check how many dollar signs it had.
print(entry.tag)
end
end
print("--just print the MODELNAME tag's value:")
for _, entry in pairs(data) do
if entry.tag == "MODELNAME" then --evaluate the tag name.
print(entry.info)
end
end
Output:
Key: info Value: templatename manufacturer model mode
Key: ds Value: 1
Key: tag Value: TEMPLATE
-------------
Key: info Value: MacQuantum Wash Basic
Key: ds Value: 1
Key: tag Value: TEMPLATE
-------------
Key: info Value: Martin
Key: ds Value: 2
Key: tag Value: MANUFACTURER
-------------
Key: info Value: Mac Quantum Wash
Key: ds Value: 2
Key: tag Value: MODELNAME
-------------
Key: info Value: Basic
Key: ds Value: 2
Key: tag Value: MODENAME
-------------
--just print the stuff with two dollar signs
MANUFACTURER
MODELNAME
MODENAME
--just print the MODELNAME tag's value:
Mac Quantum Wash
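Since the question asks to find a $TEMPLATE block first and then collect its $$ tags, the flat list can also be built as one table per block. The following is a rough sketch under two assumptions that are not confirmed by the question: a line starting with "!" is a header and can be skipped, and every $$ line belongs to the most recent $TEMPLATE line. The names parse_templates, name and tags are invented for the example.

```lua
-- Hypothetical helper: group $$TAG lines under the preceding $TEMPLATE line.
local function parse_templates(text)
  local blocks, current = {}, nil
  for line in text:gmatch("[^\n]+") do
    -- Capture the dollar signs, the tag name and the rest of the line.
    -- Lines that do not start with "$" (e.g. "!$TEMPLATE ...") yield nil.
    local dollars, tag, value = line:match("^(%$+)(%w+)%s+(.+)$")
    if dollars == "$" and tag == "TEMPLATE" then
      current = { name = value, tags = {} }
      blocks[#blocks + 1] = current
    elseif dollars == "$$" and current then
      current.tags[tag] = value
    end
  end
  return blocks
end

local templates = "!$TEMPLATE templatename manufacturer model mode\n"
  .. "$TEMPLATE MacQuantum Wash Basic\n$$MANUFACTURER Martin\n"
  .. "$$MODELNAME Mac Quantum Wash\n$$MODENAME Basic\n"
local blocks = parse_templates(templates)
print(blocks[1].name, blocks[1].tags.MODELNAME)
```

With this layout, checking whether a block is the right one is a comparison on blocks[i].name, and its tags are available by name without a second scan.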

navigate table without using pairs in Lua

Hello, I'm a newbie in Lua. I just want to know if there is a way to get the keys and values of a table without using pairs, ipairs, next or other iterators. Thanks in advance!
I don't believe this is possible, as you've phrased your question in such a way that implies that the key is unknown. The only way to check for a certain value and its corresponding key would be to iterate through the whole table.
However, maybe I misunderstood and you want to get a certain value from a key without iterating through the whole table.
Say you have a table named morse as follows:
morse = { a = ".-"; b = "-..."; } -- And so on
If you wanted to convert a single character to morse you could do as follows:
morse["a"] --Which will return the string ".-"
You can do the opposite and define a table that maps the morse values to their corresponding letters, as below. Note that keys which are not valid identifiers, such as ".-", must be written as quoted strings inside square brackets.
morse = { [".-"] = "a"; ["-..."] = "b" }
morse[".-"] -- This will return "a"
Based on your comment, I think you are looking for string substitution using a mapping table, which string.gsub can do. If your teacher still insists that gsub is an iterator, you could politely say that you don't know the method they have in mind and would be glad to learn it.
local str = "sos sos sos"
local morse = {s = "...", o = "---"}
print( str:gsub("%a", morse) )
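When the replacement argument of gsub is a table, each match is looked up as a key, and a match without an entry is left unchanged. A function can be passed in the same way when the output needs extra formatting, such as a space after each code. A small sketch (the name to_morse and the two-letter table are only for illustration):

```lua
local morse = { s = "...", o = "---" }

-- Hypothetical helper: replace each known letter by its code plus a space.
local function to_morse(text)
  local out = text:gsub("%a", function(ch)
    local code = morse[ch:lower()]
    -- Returning nil tells gsub to keep the original character.
    return code and (code .. " ")
  end)
  return out
end

print(to_morse("sos"))
```

Assigning the result to a single local drops the second value returned by gsub, the substitution count.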

How can i parse the standard input with the erlang api?

I'm developing a game in Erlang, and now I need to read the standard input. I tried the following calls:
io:fread()
io:read()
The problem is that I can't read a whole string when it contains white space. So I have the following questions:
How can I read the string typed by the user when they press the enter key? (Remember that the string contains white space.)
How can I convert a string like "56" into the number 56?
Read line
You can use io:get_line/1 to read a string terminated by a line feed from the console.
3> io:get_line("Prompt> ").
Prompt> hello world how are you?
"hello world how are you?\n"
io:read will get you an Erlang term, so you can't read a plain string with it unless you want to make your users wrap the string in quotes.
The patterns in io:fread do not seem to let you read an arbitrary-length string containing spaces.
Parse integer
You can convert "56" to 56 using erlang:list_to_integer/1.
5> erlang:list_to_integer("56").
56
or using string:to_integer/1, which also returns the rest of the string:
10> string:to_integer("56hello").
{56,"hello"}
11> string:to_integer("56").
{56,[]}
The erlang documentation about io:fread/2 should help you out.
You can use field lengths in order to read an arbitrary length of characters (including whitespace):
io:fread("Prompt> ","~20c").
Prompt> This is a sentence!!
{ok,["This is a sentence!!"]}
As for converting a string (a list of characters) to an integer, erlang:list_to_integer/1 does the job:
7> erlang:list_to_integer("645").
645
Edit: try experimenting with io:fread/2, the format sequence can ease the parsing of data by applying some form of pattern matching:
9> io:fread("Prompt> ","~s ~s").
Prompt> John Doe
{ok,["John","Doe"]}
The console is not really a good place to do this, because you need to know the format of the answer in advance. Since you allow spaces, you need to know how many words will be entered before reading the answer. Given that, you can have the user enter a string and then parse it later:
1> io:read("Enter a text > ").
Enter a text > "hello guy, this is my answer :o)".
{ok,"hello guy, this is my answer :o)"}
2>
The bad news is that the user must enter the quotes and a final dot, which is not user friendly.

Break strings into substrings based on delimiters, with empty substrings

I am using Lua to create a table within a table, and am running into an issue. I also need to populate the nil values that appear, but cannot seem to get it right.
String being manipulated:
PatID = '07-26-27~L73F11341687Per^^^SCI^SP~N7N558300000Acc^'
PatIDTable = {}
for word in PatID:gmatch("[^~]+") do table.insert(PatIDTable,word) end
local _, PatIDCount = string.gsub(PatID,"~","")
PatIDTableB = {}
for i=1, PatIDCount+1 do
PatIDTableB[i] = {}
end
for j=1, #PatIDTable do
for word in PatIDTable[j]:gmatch("[^\^]+") do
table.insert(PatIDTableB[j], word)
end
end
This currently produces this output:
table
[1]=table
[1]='07-26-27'
[2]=table
[1]='L73F11341687Per'
[2]='SCI'
[3]='SP'
[3]=table
[1]='N7N558300000Acc'
But I need it to produce:
table
[1]=table
[1]='07-26-27'
[2]=table
[1]='L73F11341687Per'
[2]=''
[3]=''
[4]='SCI'
[5]='SP'
[3]=table
[1]='N7N558300000Acc'
[2]=''
EDIT:
I think I may have done a bad job explaining what it is I am looking for. It is not necessarily that I want the carets to be considered "NIL" or "empty", but rather that they signify that a new string is to be started.
They are, I guess for lack of a better explanation, position identifiers.
So, for example:
L73F11341687Per^^^SCI^SP
actually translates to:
1. L73F11341687Per
2.
3.
4. SCI
5. SP
If I were to have
L73F11341687Per^12ABC^^SCI^SP
Then the positions are:
1. L73F11341687Per
2. 12ABC
3.
4. SCI
5. SP
And in turn, the table would be:
table
[1]=table
[1]='07-26-27'
[2]=table
[1]='L73F11341687Per'
[2]='12ABC'
[3]=''
[4]='SCI'
[5]='SP'
[3]=table
[1]='N7N558300000Acc'
[2]=''
Hopefully this sheds a little more light on what I'm trying to do.
Now that we've cleared up what the question is about, here's the issue.
Your gmatch pattern will return all of the matching substrings in the given string. However, your gmatch pattern uses "+". That means "one or more", which therefore cannot match an empty string. If it encounters a ^ character, it just skips it.
But, if you just tried :gmatch("[^\^]*"), which allows empty matches, the problem is that it would effectively turn every ^ character into an empty match. Which is not what you want.
What you want is to eat the ^ at the end of a substring. But, if you try :gmatch("([^\^]*)\^"), you'll find that it won't return the last string. That's because the last string doesn't end with ^, so it isn't a valid match.
The closest you can get with gmatch is this pattern: "([^\^]*)\^?". This has the downside of putting an empty string at the end. However, you can just remove that easily enough, since one will always be placed there.
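local s0 = '07-26-27~L73F11341687Per^^^SCI^SP~N7N558300000Acc^'
local tt = {}
-- Append the separator so the last field is terminated like all the others.
for s1 in (s0..'~'):gmatch'(.-)~' do
local t = {}
-- %^ matches a literal caret; (.-) may be empty, so empty fields are kept.
for s2 in (s1..'^'):gmatch'(.-)%^' do
table.insert(t, s2)
end
table.insert(tt, t)
end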
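The append-the-separator trick can be factored into a small helper so that both levels of splitting share one function. This is a sketch, not a general-purpose splitter: the name split_keep_empty is made up, and it assumes the separator is a single punctuation character (escaping it with % would turn a letter into a character class).

```lua
-- Hypothetical helper: split s on a one-character punctuation separator,
-- keeping empty fields.
local function split_keep_empty(s, sep)
  local fields = {}
  local escaped = "%" .. sep  -- "%^" or "%~" match the literal character
  -- Appending sep terminates the last field like all the others.
  for field in (s .. sep):gmatch("(.-)" .. escaped) do
    fields[#fields + 1] = field
  end
  return fields
end

local PatID = '07-26-27~L73F11341687Per^^^SCI^SP~N7N558300000Acc^'
local result = {}
for i, part in ipairs(split_keep_empty(PatID, "~")) do
  result[i] = split_keep_empty(part, "^")
end
print(#result, #result[2], #result[3])
```

Because every match consumes one separator, a trailing separator such as the final ^ in the sample yields one trailing empty field, which matches the table layout requested in the question.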
