LPeg Increment for Each Match - lua

I'm making a serialization library for Lua, and I'm using LPeg to parse the string. I've got K/V pairs working (with the key explicitly named), but now I'm going to add auto-indexing.
It'll work like so:
#"value"
#"value2"
Will evaluate to
{
[1] = "value"
[2] = "value2"
}
I've already got the value matching working (strings, tables, numbers, and Booleans all work perfectly), so I don't need help with that; what I'm looking for is the indexing. For each match of #[value pattern], it should capture the number of #[value pattern]'s found - in other words, I can match a sequence of values ("#"value1" #"value2") but I don't know how to assign them indexes according to the number of matches. If that's not clear enough, just comment and I'll attempt to explain it better.
Here's something of what my current pattern looks like (using compressed notation):
local process = {} -- Process a captured value
process.number = tonumber
process.string = function(s) return s:sub(2, -2) end -- Strip of opening and closing tags
process.boolean = function(s) if s == "true" then return true else return false end
number = [decimal number, scientific notation] / process.number
string = [double or single quoted string, supports escaped quotation characters] / process.string
boolean = P("true") + "false" / process.boolean
table = [balanced brackets] / [parse the table]
type = number + string + boolean + table
at_notation = (P("#") * whitespace * type) / [creates a table that includes the key and value]
As you can see in the last line of code, I've got a function that does this:
k,v matched in the pattern
-- turns into --
{k, v}
-- which is then added into an "entry table" (I loop through it and add it into the return table)

Based on what you've described so far, you should be able to accomplish this using a simple capture and table capture.
Here's a simplified example I knocked up to illustrate:
lpeg = require 'lpeg'
l = lpeg.locale(lpeg)
whitesp = l.space ^ 0
bool_val = (l.P "true" + "false") / function (s) return s == "true" end
num_val = l.digit ^ 1 / tonumber
string_val = '"' * l.C(l.alnum ^ 1) * '"'
val = bool_val + num_val + string_val
at_notation = l.Ct( (l.P "#" * whitesp * val * whitesp) ^ 0 )
local testdata = [[
#"value1"
#42
# "value2"
#true
]]
local res = l.match(at_notation, testdata)
The match returns a table containing the contents:
{
[1] = "value1",
[2] = 42,
[3] = "value2",
[4] = true
}

Related

Trying to make function which takes string as input and returns no. of words in whole string

**It takes Input as a string such as this - 'Nice one'
And Output gives - 4,3 (which is no. Of words in sentence or string)
**
function countx(str)
local count = {}
for i = 1, string.len(str) do
s = ''
while (i<=string.len(str) and string.sub(str, i, i) ~= ' ' ) do
s = s .. string.sub(str, i, i)
i = i+1
end
if (string.len(s)>0) then
table.insert(count,string.len(s))
end
end
return table.concat(count, ',')
end
You can find a simple alternative with your new requirements:
function CountWordLength (String)
local Results = { }
local Continue = true
local Position = 1
local SpacePosition
while Continue do
SpacePosition = string.find(String, " ", Position)
if SpacePosition then
Results[#Results + 1] = SpacePosition - Position
Position = SpacePosition + 1
-- if needed to print the string
-- local SubString = String:sub(Position, SpacePosition)
-- print(SubString)
else
Continue = false
end
end
Results[#Results + 1] = #String - Position + 1
return Results
end
Results = CountWordLength('I am a boy')
for Index, Value in ipairs(Results) do
print(Value)
end
Which gives the following results:
1
2
1
3
def countLenWords(s):
s=s.split(" ")
s=map(len,s)
s=map(str,s)
s=list(s)
return s
The above functions returns a list containing number of characters in each word
s=s.split(" ") splits string with delimiter " " (space)
s=map(len,s) maps the words into length of the words in int
s=map(str,s) maps the values into string
s=list(s) converts map object to list
Short version of above function (all in one line)
def countLenWords(s):
return list(map(str,map(len,s.split(" "))))
-- Localise for performance.
local insert = table.insert
local text = 'I am a poor boy straight. I do not need sympathy'
local function word_lengths (text)
local lengths = {}
for word in text:gmatch '[%l%u]+' do
insert (lengths, word:len())
end
return lengths
end
print ('{' .. table.concat (word_lengths (text), ', ') .. '}')
gmatch returns an iterator over matches of a pattern in a string.
[%l%u]+ is a Lua regular expression (see http://lua-users.org/wiki/PatternsTutorial) matching at least one lowercase or uppercase letter:
[] is a character class: a set of characters. It matches anything inside brackets, e.g. [ab] will match both a and b,
%l is any lowercase Latin letter,
%u is any uppercase Latin letter,
+ means one or more repeats.
Therefore, text:gmatch '[%l%u]+' will return an iterator that will produce words, consisting of Latin letters, one by one, until text is over. This iterator is used in generic for (see https://www.lua.org/pil/4.3.5.html); and on any iteration word will contain a full match of the regular expression.

How can I loop through nested tables in Lua when the nested tables are mixed in with other data types?

I'm trying loop though a very large table in Lua that consists of mixed data types many nested tables. I want to print the entire data table to the console, but I'm having trouble with nested loops. When I do a nested loop to print the next level deep Key Value pairs I get this error bad argument #1 to 'pairs' (table expected, got number) because not all values are tables.
I've tried adding a if type(value) == table then before the nested loop but it never triggers, because type(value) returns userdata whether they are ints, strings or tables.
EDIT: I was wrong, only tables are returned as type userdata
My table looks something like this but hundreds of pairs and can be several nested tables. I have a great built in method printall() with the tool I'm using for this but it only works on the first nested table. I don't have any control over what this table looks like, I'm just playing with a game's data, any help is appreciated.
myTable = {
key1 = { value1 = "string" },
key2 = int,
key3 = { -- printall() will print all these two as key value pairs
subKey1 = int,
subKey2 = int
},
key4 = {
innerKey1 = { -- printall() returns something like : innerKey1 = <int32_t[]: 0x13e9dcb98>
nestedValue1 = "string",
nestedValue2 = "string"
},
innerKey2 = { -- printall() returns something like : innerKey2 = <vector<int32_t>[41]: 0x13e9dcbc8>
nestedValue3 = int,
nestedValue4 = int
}
},
keyN = "string"
}
My loop
for key, value in pairs(myTable) do
print(key)
printall(value)
for k,v in pairs(value) do
print(k)
printall(v)
end
end
print("====")
end
ANSWER : Here is my final version of the function that fixed this, it's slightly modified from the answer Nifim gave to catch edge cases that were breaking it.
function printFullObjectTree(t, tabs)
local nesting = ""
for i = 0, tabs, 1 do
nesting = nesting .. "\t"
end
for k, v in pairs(t) do
if type(v) == "userdata" then -- all tables in this object are the type `userdata`
print(nesting .. k .. " = {")
printFullObjectTree(v, tabs + 1)
print(nesting .. "}")
elseif v == nil then
print(nesting .. k .. " = nil")
elseif type(v) == "boolean" then
print(nesting .. k .. " = " .. string.format("%s", v))
else
print(nesting .. k .. " = " .. v)
end
end
end
type(value) returns a string representing the type of value
More information on that Here:
lua-users.org/wiki/TypeIntrospection
Additionally your example table has int as some of the values for some keys, as this would be nil those keys are essentially not part of the table for my below example i will change each instance of int to a number value.
It would also make sense to recurse if you hit a table rather than making a unknown number of nested loops.
here is an example of working printAll
myTable = {
key1 = { value1 = "string" },
key2 = 2,
key3 = { -- printall() will print all these two as key value pairs
subKey1 = 1,
subKey2 = 2
},
key4 = {
innerKey1 = { -- printall() returns something like : innerKey1 = <int32_t[]: 0x13e9dcb98>
nestedValue1 = "string",
nestedValue2 = "string"
},
innerKey2 = { -- printall() returns something like : innerKey2 = <vector<int32_t>[41]: 0x13e9dcbc8>
nestedValue3 = 3,
nestedValue4 = 4
}
},
keyN = "string"
}
function printAll(t, tabs)
local nesting = ""
for i = 0, tabs, 1 do
nesting = nesting .. "\t"
end
for k, v in pairs(t) do
if type(v) == "table" then
print(nesting .. k .. " = {")
printAll(v, tabs + 1)
print(nesting .. "}")
else
print(nesting .. k .. " = " .. v)
end
end
end
print("myTable = {")
printAll(myTable, 0)
print("}")

Use variables inside a string in lua

In javascript we can do the following:
var someString = `some ${myVar} string`
I have the following lua code, myVar is a number that needs to be in the square brackets:
splash:evaljs('document.querySelectorAll("a[title*=further]")[myVar]')
Function that fits you description is string.format:
splash:evaljs(string.format('document.querySelectorAll("a[title*=further]")[%s]', myVar))
It is not as verbose as ${}. It is more of a good old (and hated) sprintf.
I have written an emulation of Python's f'' strings, which is a set of functions you can hide inside a require file. So, if you like Python's f'' strings, this may be what you're looking for.
(If anyone finds errors, please notify.)
It's quite big compared to the other solution, but if you hide the bulk in a library, then its use is more compact and readable, IMO.
With this library you can do the following, for example:
require 'f_strings'
a = 12345
print(f'Number: {a}, formatted with two decimals: {a::%.2f}')
-- Number: 12345, formatted with two decimals: 12345.00
Note the use of Lua string.format formatting codes, and the use of double colon (instead of Python's single colon) for format specifiers because of Lua's use of colon for methods.
I have extracted only the relevant functions from a larger library. Although some optimizations may be possible for this specific use case, I leave them unchanged as they are general purpose and may also be useful for other purposes.
And here's the required library (placed somewhere in your Lua libraries folder):
-- f_strings.lua ---
unpack = table.unpack or unpack
--------------------------------------------------------------------------------
-- Escape special pattern characters in string to be treated as simple characters
--------------------------------------------------------------------------------
local
function escape_magic(s)
local MAGIC_CHARS_SET = '[()%%.[^$%]*+%-?]'
if s == nil then return end
return (s:gsub(MAGIC_CHARS_SET,'%%%1'))
end
--------------------------------------------------------------------------------
-- Returns iterator to split string on given delimiter (multi-space by default)
--------------------------------------------------------------------------------
function string:gsplit(delimiter)
if delimiter == nil then return self:gmatch '%S+' end --default delimiter is any number of spaces
if delimiter == '' then return self:gmatch '.' end
if type(delimiter) == 'number' then --break string in equal-size chunks
local index = 1
local ans
return function()
ans = self:sub(index,index+delimiter-1)
if ans ~= '' then
index = index + delimiter
return ans
end
end
end
if self:sub(-#delimiter) ~= delimiter then self = self .. delimiter end
return self:gmatch('(.-)'..escape_magic(delimiter))
end
--------------------------------------------------------------------------------
-- Split a string on the given delimiter (comma by default)
--------------------------------------------------------------------------------
function string:split(delimiter,tabled)
tabled = tabled or false --default is unpacked
local ans = {}
for item in self:gsplit(delimiter) do
ans[#ans+1] = item
end
if tabled then return ans end
return unpack(ans)
end
--------------------------------------------------------------------------------
function copy(t) --returns a simple (shallow) copy of the table
if type(t) == 'table' then
local ans = {}
for k,v in next,t do ans[ k ] = v end
return ans
end
return t
end
--------------------------------------------------------------------------------
function eval(expr,vars)
--evaluate a string expression with optional variables
if expr == nil then return end
vars = vars or {}
assert(type(expr) == 'string','String expected as 1st arg')
assert(type(vars) == 'table','Variable table expected as 2nd arg')
local env = {abs=math.abs,acos=math.acos,asin=math.asin,atan=math.atan,
atan2=math.atan2,ceil=math.ceil,cos=math.cos,cosh=math.cosh,
deg=math.deg,exp=math.exp,floor=math.floor,fmod=math.fmod,
frexp=math.frexp,huge=math.huge,ldexp=math.ldexp,log=math.log,
max=math.max,min=math.min,modf=math.modf,pi=math.pi,pow=math.pow,
rad=math.rad,random=math.random,randomseed=math.randomseed,
sin=math.sin,sinh=math.sinh,sqrt=math.sqrt,tan=math.tan,
tanh=math.tanh}
for name,value in pairs(vars) do env[name] = value end
local a,b = pcall(load('return '..expr,nil,'t',env))
if a == false then return nil,b else return b end
end
--------------------------------------------------------------------------------
-- f'' formatted strings like those introduced in Python v3.6
-- However, you must use Lua style format modifiers as with string.format()
--------------------------------------------------------------------------------
function f(s)
local env = copy(_ENV) --start with all globals
local i,k,v,fmt = 0
repeat
i = i + 1
k,v = debug.getlocal(2,i) --two levels up (1 level is this repeat block)
if k ~= nil then env[k] = v end
until k == nil
local
function go(s)
local fmt
s,fmt = s:sub(2,-2):split('::')
if s:match '%b{}' then s = (s:gsub('%b{}',go)) end
s = eval(s,env)
if fmt ~= nil then
if fmt:match '%b{}' then fmt = eval(fmt:sub(2,-2),env) end
s = fmt:format(s)
end
return s
end
return (s:gsub('%b{}',go))
end

How to use patterns to shorten only matching strings

I have a program that grabs a list of peripheral types, matches them to see if they're a valid type, and then executes type-specific code if they are valid.
However, some of the types can share parts of their name, with the only difference being their tier, I want to match these to the base type listed in a table of valid peripherals, but I can't figure out how to use a pattern to match them without the pattern returning nil for everything that doesn't match.
Here is the code to demonstrate my problem:
connectedPeripherals = {
[1] = "tile_thermalexpansion_cell_basic_name",
[2] = "modem",
[3] = "BigReactors-Turbine",
[4] = "tile_thermalexpansion_cell_resonant_name",
[5] = "monitor",
[6] = "tile_thermalexpansion_cell_hardened_name",
[7] = "tile_thermalexpansion_cell_reinforced_name",
[8] = "tile_blockcapacitorbank_name"
}
validPeripherals = {
["tile_thermalexpansion_cell"]=true,
["tile_blockcapacitorbank_name"]=true,
["monitor"]=true,
["BigReactors-Turbine"]=true,
["BigReactors-Reactor"]=true
}
for i = 1, #connectedPeripherals do
local periFunctions = {
["tile_thermalexpansion_cell"] = function()
--content
end,
["tile_blockcapacitorbank_name"] = function()
--content
end,
["monitor"] = function()
--content
end,
["BigReactors-Turbine"] = function()
--content
end,
["BigReactors-Reactor"] = function()
--content
end
}
if validPeripherals[connectedPeripherals[i]] then periFunctions[connectedPeripherals[i]]() end
end
If I try to run it like that, all of the thermalexpansioncells aren't recognized as valid peripherals, and if I add a pattern matching statement, it works for the thermalexpansioncells, but returns nil for everything else and causes an exception.
How do I do a match statement that only returns a shortened string for things that match and returns the original string for things that don't?
Is this possible?
If the short version doesn't contain any of the special characters from Lua patterns you can use the following:
long = "tile_thermalexpansion_cell_basic_name"
result = long:match("tile_thermalexpansion_cell") or long
print(result) -- prints the shorter version
result = long:match("foo") or long
print(result) -- prints the long version
Based on the comment, you can also use string.find to see the types match your peripheral names:
for i,v in ipairs(connectedPeripherals) do
local Valid = CheckValidity(v)
if Valid then Valid() end
end
where, CheckValidity will return the key from validPeripherals:
function CheckValidity( name )
for n, b in pairs(validPeripherals) do
if name:find( n ) then return n end
end
return false
end

Using LPEG (Lua Parser Expression Grammars) like boost::spirit

So I am playing with lpeg to replace a boost spirit grammar, I must say boost::spirit is far more elegant and natural than lpeg. However it is a bitch to work with due to the constraints of current C++ compiler technology and the issues of TMP in C++. The type mechanism is in this case your enemy rather than your friend. Lpeg on the other hand while ugly and basic results in more productivity.
Anyway, I am digressing, part of my lpeg grammar looks like as follows:
function get_namespace_parser()
local P, R, S, C, V =
lpeg.P, lpeg.R, lpeg.S, lpeg.C, lpeg.V
namespace_parser =
lpeg.P{
"NAMESPACE";
NAMESPACE = V("WS") * P("namespace") * V("SPACE_WS") * V("NAMESPACE_IDENTIFIER")
* V("WS") * V("NAMESPACE_BODY") * V("WS"),
NAMESPACE_IDENTIFIER = V("IDENTIFIER") / print_string ,
NAMESPACE_BODY = "{" * V("WS") *
V("ENTRIES")^0 * V("WS") * "}",
WS = S(" \t\n")^0,
SPACE_WS = P(" ") * V("WS")
}
return namespace_parser
end
This grammar (although incomplete) matches the following namespace foo {}. I'd like to achieve the following semantics (which are common use-cases when using boost spirit).
Create a local variable for the namespace rule.
Add a namespace data structure to this local variable when namespace IDENTIFIER { has been matched.
Pass the newly created namespace data structure to the NAMESPACE_BODY for further construction of the AST... so on and so forth.
I am sure this use-case is achievable. No examples show it. I don't know the language or the library enough to figure out how to do it. Can someone show the syntax for it.
edit : After a few days of trying to dance with lpeg, and getting my feet troden on, I have decided to go back to spirit :D it is clear that lpeg is meant to be weaved with lua functions and that such weaving is very free-form (whereas spirit has clear very well documented semantics). I simply do not have the right mental model of lua yet.
Though "Create a local variable for the namespace rule" sounds disturbingly like "context-sensitive grammar", which is not really for LPEG, I will assume that you want to build an abstract syntax tree.
In Lua, an AST can be represented as a nested table (with named and indexed fields) or a closure, doing whatever task that tree is meant to do.
Both can be produced by a combination of nested LPEG captures.
I will limit this answer to AST as a Lua table.
Most useful, in this case, LPEG captures will be:
lpeg.C( pattern ) -- simple capture,
lpeg.Ct( pattern ) -- table capture,
lpeg.Cg( pattern, name ) -- named group capture.
The following example based on your code will produce a simple syntax tree as a Lua table:
local lpeg = require'lpeg'
local P, V = lpeg.P, lpeg.V
local C, Ct, Cg = lpeg.C, lpeg.Ct, lpeg.Cg
local locale = lpeg.locale()
local blank = locale.space ^ 0
local space = P' ' * blank
local id = P'_' ^ 0 * locale.alpha * (locale.alnum + '_') ^ 0
local NS = P{ 'ns',
-- The upper level table with two fields: 'id' and 'entries':
ns = Ct( blank * 'namespace' * space * Cg( V'ns_id', 'id' )
* blank * Cg( V'ns_body', 'entries' ) * blank ),
ns_id = id,
ns_body = P'{' * blank
-- The field 'entries' is, in turn, an indexed table:
* Ct( (C( V'ns_entry' )
* (blank * P',' * blank * C( V'ns_entry') ) ^ 0) ^ -1 )
* blank * P'}',
ns_entry = id
}
lpeg.match( NS, 'namespace foo {}' ) will give:
table#1 {
["entries"] = table#2 {
},
["id"] = "foo",
}
lpeg.match( NS, 'namespace foo {AA}' ) will give:
table#1 {
["entries"] = table#2 {
"AA"
},
["id"] = "foo",
}
lpeg.match( NS, 'namespace foo {AA, _BB}' ) will give:
table#1 {
["entries"] = table#2 {
"AA",
"_BB"
},
["id"] = "foo",
}
lpeg.match( NS, 'namespace foo {AA, _BB, CC1}' ) will give:
table#1 {
["entries"] = table#2 {
"AA",
"_BB",
"CC1"
},
["id"] = "foo",
}

Resources