Parsing a TeX-like language with lpeg - lua

I am struggling to get my head around LPEG. I have managed to produce one grammar which does what I want, but I have been beating my head against this one and not getting far. The idea is to parse a document which is a simplified form of TeX. I want to split a document into:
Environments, which are \begin{cmd} and \end{cmd} pairs.
Commands which can either take an argument like so: \foo{bar} or can be bare: \foo.
Both environments and commands can have parameters like so: \command[color=green,background=blue]{content}.
Other stuff.
I also would like to keep track of line number information for error handling purposes. Here's what I have so far:
lpeg = require("lpeg")
lpeg.locale(lpeg)
-- Assume a lot of "X = lpeg.X" here.
-- Line number handling from http://lua-users.org/lists/lua-l/2011-05/msg00607.html
-- with additional print statements to check they are working.
local newline = P"\r"^-1 * "\n" / function (a) print("New"); end
local incrementline = Cg( Cb"linenum" )/ function ( a ) print("NL"); return a + 1 end , "linenum"
local setup = Cg ( Cc ( 1) , "linenum" )
nl = newline * incrementline
space = nl + lpeg.space
-- Taken from "Name-value lists" in http://www.inf.puc-rio.br/~roberto/lpeg/
local identifier = (R("AZ") + R("az") + P("_") + R("09"))^1
local sep = lpeg.S(",;") * space^0
local value = (1-lpeg.S(",;]"))^1
local pair = lpeg.Cg(C(identifier) * space ^0 * "=" * space ^0 * C(value)) * sep^-1
local list = lpeg.Cf(lpeg.Ct("") * pair^0, rawset)
local parameters = (P("[") * list * P("]")) ^-1
-- And the rest is mine
anything = C( (space^1 + (1-lpeg.S("\\{}")) )^1) * Cb("linenum") / function (a,b) return { text = a, line = b } end
begin_environment = P("\\begin") * Ct(parameters) * P("{") * Cg(identifier, "environment") * Cb("environment") * P("}") / function (a,b) return { params = a[1], environment = b } end
end_environment = P("\\end{") * Cg(identifier) * P("}")
texlike = lpeg.P{
"document";
document = setup * V("stuff") * -1,
stuff = Cg(V"environment" + anything + V"bracketed_stuff" + V"command_with" + V"command_without")^0,
bracketed_stuff = P"{" * V"stuff" * P"}" / function (a) return a end,
command_with =((P("\\") * Cg(identifier) * Ct(parameters) * Ct(V"bracketed_stuff"))-P("\\end{")) / function (i,p,n) return { command = i, parameters = p, nodes = n } end,
command_without = (( P("\\") * Cg(identifier) * Ct(parameters) )-P("\\end{")) / function (i,p) return { command = i, parameters = p } end,
environment = Cg(begin_environment * Ct(V("stuff")) * end_environment) / function (b,stuff, e) return { b = b, stuff = stuff, e = e} end
}
It almost works!
> texlike:match("\\foo[one=two]thing\\bar")
{
command = "foo",
parameters = {
{
one = "two",
},
},
}
{
line = 1,
text = "thing",
}
{
command = "bar",
parameters = {
},
}
But! First, I can't get the line number handling part to work at all. The function within incrementline is never fired.
I also can't quite work out how nested capture information is passed to handling functions (which is why I have scattered Cg, C and Ct semirandomly over the grammar). This means that only one item is returned from within a command_with:
> texlike:match("\\foo{text \\command moretext}")
{
command = "foo",
nodes = {
{
line = 1,
text = "text ",
},
},
parameters = {
},
}
I would also love to be able to check that the environment start and ends match up but when I tried to do so, my back references from "begin" were not in scope by the time I got to "end". I don't know where to go from here.

Late answer but hopefully it'll offer some insight if you're still looking for a solution or wondering what the problem was.
There are a couple of issues with your grammar, some of which can be tricky to spot.
Your line increment here looks incorrect:
local incrementline = Cg( Cb"linenum" ) /
function ( a ) print("NL"); return a + 1 end,
"linenum"
It looks like you meant to create a named capture group and not an anonymous group. The backcapture linenum is essentially being used like a variable. The problem is that "linenum" sits outside the Cg call, so the group is anonymous and linenum never updates properly: function(a) will always receive 1 when called. You need to move the closing ) to the end so "linenum" is included:
local incrementline = Cg( Cb"linenum" /
function ( a ) print("NL"); return a + 1 end,
"linenum")
Relevant LPeg documentation for Cg capture.
The second problem is with your anything non-terminal rule:
anything = C( (space^1 + (1-lpeg.S("\\{}")) )^1) * Cb("linenum") ...
There are several things to be careful about here. First, a named Cg capture (from the incrementline rule once it's fixed) doesn't produce anything unless it's in a table or you backref it. The second major thing is that it has an ad hoc scope like a variable. More precisely, its scope ends once you close it in an outer capture, like what you're doing here:
C( (space^1 + (...) )^1)
Which means by the time you reference its backcapture with * Cb("linenum"), that's already too late -- the linenum you really want already closed its scope.
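To make the scoping point concrete, here is a minimal sketch of the named-group-as-variable idea on its own, outside the TeX grammar. It assumes LPeg is installed; the rule names (word, document) and the sample input are made up for illustration and are not part of the original grammar.

```lua
-- Sketch only: line counting with a named group used like a variable.
local lpeg = require("lpeg")
local P, S, C, Cb, Cc, Cg, Ct =
  lpeg.P, lpeg.S, lpeg.C, lpeg.Cb, lpeg.Cc, lpeg.Cg, lpeg.Ct

local newline = P("\r")^-1 * P("\n")

-- Initialise the "variable": a named group holding the constant 1.
local setup = Cg(Cc(1), "linenum")

-- Re-bind the name to (previous value + 1). Note that the name is an
-- argument of Cg itself, not a stray second value after the pattern.
local incrementline =
  Cg(Cb("linenum") / function(n) return n + 1 end, "linenum")

-- A word, paired with whatever "linenum" is at that point. The Cb sits
-- next to the C capture, not inside it, so the groups are still visible.
local word = (C((1 - S(" \r\n"))^1) * Cb("linenum")) /
  function(text, line) return { text = text, line = line } end

-- Everything lives inside one Ct, which stays open while matching.
local document =
  Ct(setup * (newline * incrementline + P(" ") + word)^0) * -1

local result = document:match("one two\nthree\n\nfour")
for _, item in ipairs(result) do
  print(item.line, item.text)
end
```

The important design choice is that setup, incrementline and every Cb("linenum") are siblings inside the same open capture, so each back reference can still see the most recent group.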
I always found LPeg's re syntax a bit easier to grok, so I've rewritten the grammar with that instead (the re module ships with LPeg and has to be loaded with require("re") first):
local grammar_cb =
{
fold = pairfold,
resetlinenum = resetlinenum,
incrementlinenum = incrementlinenum, getlinenum = getlinenum,
error = error
}
local texlike_grammar = re.compile(
[[
document <- '' -> resetlinenum {| docpiece* |} !.
docpiece <- {| envcmd |} / {| cmd |} / multiline
beginslash <- cmdslash 'begin'
endslash <- cmdslash 'end'
envcmd <- beginslash paramblock? {:beginenv: envblock :} (!endslash docpiece)*
endslash openbrace {:endenv: =beginenv :} closebrace / &beginslash {} -> error .
envblock <- openbrace key closebrace
cmd <- cmdslash {:command: identifier :} (paramblock? cmdblock)?
cmdblock <- openbrace {:nodes: {| docpiece* |} :} closebrace
paramblock <- opensq ( {:parameters: {| parampairs |} -> fold :} / whitesp) closesq
parampairs <- parampair (sep parampair)*
parampair <- key assign value
key <- whitesp { identifier }
value <- whitesp { [^],;%s]+ }
multiline <- (nl? text)+
text <- {| {:text: (!cmd !closebrace !%nl [_%w%p%s])+ :} {:line: '' -> getlinenum :} |}
identifier <- [_%w]+
cmdslash <- whitesp '\'
assign <- whitesp '='
sep <- whitesp ','
openbrace <- whitesp '{'
closebrace <- whitesp '}'
opensq <- whitesp '['
closesq <- whitesp ']'
nl <- {%nl+} -> incrementlinenum
whitesp <- (nl / %s)*
]], grammar_cb)
The callback functions are defined as follows. Because they are locals, they have to appear before the grammar_cb table that references them:
local function pairfold(...)
local t, kv = {}, ...
if #kv % 2 == 1 then return ... end
for i = #kv, 2, -2 do
t[ kv[i - 1] ] = kv[i]
end
return t
end
local incrementlinenum, getlinenum, resetlinenum do
local line = 1
function incrementlinenum(nl)
assert(not nl:match "%S")
line = line + #nl
end
function getlinenum() return line end
function resetlinenum() line = 1 end
end
Testing the grammar with a non-trivial TeX-like string spanning multiple lines:
local test1 = [[\foo{text \bar[color = red, background = black]{
moretext \baz{
even
more text} }
this time skipping multiple
lines even, such wow!}]]
Produces the following AST in Lua table format:
{
command = "foo",
nodes = {
{
text = "text",
line = 1
},
{
parameters = {
color = "red",
background = "black"
},
command = "bar",
nodes = {
{
text = " moretext",
line = 2
},
{
command = "baz",
nodes = {
{
text = "even ",
line = 3
},
{
text = "more text",
line = 4
}
}
}
}
},
{
text = "this time skipping multiple",
line = 7
},
{
text = "lines even, such wow!",
line = 9
}
}
}
And a second test for begin/end environments:
local test2 = [[\begin[p1
=apple,
p2=blue]{scope} scope foobar
\end{scope} global foobar]]
Which seems to give approximately what you're looking for:
{
{
{
text = " scope foobar",
line = 3
},
parameters = {
p1 = "apple",
p2 = "blue"
},
beginenv = "scope",
endenv = "scope"
},
{
text = " global foobar",
line = 4
}
}

Related

problems with Lua match to find a pattern

I'm struggling with this problem:
Given 2 strings:
s1 = '/foo/:bar/oof/:rab'
s2 = '/foo/lua/oof/rocks'
I would like to produce the following information:
If they match (these two above should match, s2 follows a pattern described in s1).
A table holding the values of s2 in with the corresponding name in s1. In this case we would have: { bar = "lua", rab = "rocks" }
I think this algorithm solves it, but I can't figure how to implement it (probably with gmatch):
Store the indexes of the ':' placeholders as KEYS of a table, with the respective VALUES being the names of these placeholders.
Example with s1:
local aux1 = { ["6"] = "bar", ["15"] = "rab" }
With the keys of aux1 fetched as indexes, extract the values of s2 into another table:
local aux2 = { ["6"] = "lua", ["15"] = "rocks" }
Finally, merge the two into one table (this one is easy :P):
{ bar = "lua", rab = "rocks" }
Something like this maybe:
function comp(a,b)
local t = {}
local i, len_a = 0
for w in (a..'/'):gmatch('(.-)/') do
i = i + 1
if w:sub(1,1) == ':' then
t[ -i ] = w:sub(2)
else
t[ i ] = w
end
end
len_a = i
i = 0
local ans = {}
for w in (b..'/'):gmatch('(.-)/') do
i = i + 1
if t[ i ] and t[ i ] ~= w then
return {}
elseif t[ -i ] then
ans[ t[ -i ] ] = w
end
end
if len_a ~= i then return {} end
return ans
end
s1 = '/foo/:bar/oof/:rab'
s2 = '/foo/lua/oof/rocks'
for k,v in pairs(comp(s1,s2)) do print(k,v) end
Another solution could be:
s1 = '/foo/:bar/oof/:rab'
s2 = '/foo/lua/oof/rocks'
pattern = "/([^/]+)"
function getStrngTable(_strng,_pattern)
local t = {}
for val in string.gmatch(_strng,_pattern) do
table.insert(t,val)
end
return t
end
local r = {}
t1 = getStrngTable(s1,pattern)
t2 = getStrngTable(s2,pattern)
for k = 1,#t1 do
if (t1[k] == t2[k]) then
r[t1[k + 1]:match(":(.+)")] = t2[k + 1]
end
end
The table r will then hold the required result.
The solution below, which is somewhat cleaner, gives the same result:
s1 = '/foo/:bar/oof/:rab'
s2 = '/foo/lua/oof/rocks'
pattern = "/:?([^/]+)"
function getStrng(_strng,_pattern)
local t = {}
for val in string.gmatch(_strng,_pattern) do
table.insert(t,val)
end
return t
end
local r = {}
t1 = getStrng(s1,pattern)
t2 = getStrng(s2,pattern)
for k = 1,#t1 do
if (t1[k] == t2[k]) then
r[t1[k + 1]] = t2[k + 1]
end
end
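For comparison, here is a short sketch of the same segment-by-segment idea that returns nil on a mismatch, so that "no match" can be told apart from "matched, but the template has no placeholders". The names split_segments and match_route are hypothetical and do not come from the answers above.

```lua
-- Split a path such as "/foo/lua" into { "foo", "lua" }.
local function split_segments(path)
  local segments = {}
  for segment in path:gmatch("[^/]+") do
    segments[#segments + 1] = segment
  end
  return segments
end

-- Returns a table of placeholder values, or nil if the path does not
-- follow the template.
local function match_route(template, path)
  local t, p = split_segments(template), split_segments(path)
  if #t ~= #p then return nil end
  local params = {}
  for i = 1, #t do
    local name = t[i]:match("^:(.+)$")
    if name then
      params[name] = p[i]          -- ":bar" captures whatever is there
    elseif t[i] ~= p[i] then
      return nil                   -- a literal segment differs
    end
  end
  return params
end

local r = match_route('/foo/:bar/oof/:rab', '/foo/lua/oof/rocks')
print(r.bar, r.rab)
print(match_route('/foo/:bar', '/baz/lua'))
```

The first print shows lua and rocks; the second shows nil because the literal segment foo does not equal baz.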

Get a certain value from a concatenated table

Trying to allow a concatenated table to be referenced as such:
local group = table.concat(arguments, ",", 1)
where arguments = {"1,1,1"}
Currently, doing group[2] gives me the comma. How do I avoid that while still allowing for two-digit numbers?
(snippet of what I'm trying to use it for)
for i = 1, #group do
target:SetGroup(i, tonumber(group[i]))
end
Maybe you want something like this, where s is the comma-separated string and group is a new table:
local group = {}
local i = 1
for v in string.gmatch(s, "(%w+),*") do
    group[i] = v
    i = i + 1
end
Revised version in response to comment, avoiding the table altogether:
local i = 1
for v in string.gmatch(s, "(%w+),*") do
target:SetGroup(i, tonumber(v))
i = i + 1
end
Alternatively, add a split function to your code:
split = function(str, delim)
    -- delim is used as a Lua pattern; it defaults to a single space
    if not delim then
        delim = " "
    end
    -- Eliminate bad cases...
    if string.find(str, delim) == nil then
        return { str }
    end
    local result = {}
    local pat = "(.-)" .. delim .. "()"
    local nb = 0
    local lastPos
    -- string.gmatch replaces string.gfind, which was renamed in Lua 5.1
    for part, pos in string.gmatch(str, pat) do
        nb = nb + 1
        result[nb] = part
        lastPos = pos
    end
    -- Handle the last field
    result[nb + 1] = string.sub(str, lastPos)
    return result
end
so
local arguments = {"1,1,1"};
local group = split(arguments[1], ",");
for i = 1, #group do
target:SetGroup(i, tonumber(group[i]))
end
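Also note that you can cache the length in a local first:
local arguments = {"1,1,1"};
local group = split(arguments[1], ",");
local group_count = #group;
for i = 1, group_count do
    target:SetGroup(i, tonumber(group[i]))
end
This is not actually faster, though: Lua evaluates the limit of a numeric for loop only once, before the loop starts, so both versions do the same work.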
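If all you need is the list of numbers, a shorter sketch is to iterate over the runs of non-comma characters and convert each one. The function name parse_group is hypothetical, and the target:SetGroup call from the question is left out so the example runs on its own.

```lua
-- Turn "1,10,25" into { 1, 10, 25 }.
local function parse_group(s)
  local numbers = {}
  for field in s:gmatch("[^,]+") do
    -- tonumber returns nil for non-numeric fields; assigning nil adds
    -- nothing, so such fields are silently skipped.
    numbers[#numbers + 1] = tonumber(field)
  end
  return numbers
end

local group = parse_group("1,10,25")
print(#group, group[2])
```

Because each field is a whole run between commas, two-digit values such as 10 stay intact, and group[2] is the number 10 instead of a comma.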

How to make LPeg.match return nil

I'm currently getting familiar with the LPeg parser module. For this I want to match a version string (e.g. 11.4) against a list.
Such a list is a string with a tight syntax that can also contain ranges. Here is an EBNF-like, but in any case quite simple grammar (I write it down because LPeg code below can be a bit difficult to read):
S = R, { ',', R }
R = N, [ '-', N ]
N = digit+, [ '.', digit+ ]
An example string would be 1-9,10.1-11,12. Here is my enormous code:
local L = require "lpeg"
local LV, LP, LC, LR, floor = L.V, L.P, L.C, L.R, math.floor
local version = 7.25 -- a number, so it can be compared with a+0 and b+0
local function check(a, op, b)
if op and a+0 <= version and version <= b+0 then
return a..op..b -- range
elseif not op and floor(version) == floor(a+0) then
return a -- single item
end
end
local grammar = LP({ "S",
S = LV"R" * (LP"," * LV"R")^0,
R = LV"V" * (LC(LP"-") * LV"V")^-1 / check,
V = LC(LV"D" * (LP"." * LV"D")^-1),
D = (LR("09")^1),
})
function checkversion(str)
return grammar:match(str)
end
So you would call it like checkversion("1-7,8.1,8.3,9") and if the current version is not matched by the list you should get nil.
Now, the trouble is, if all calls to check return nothing (meaning, if the versions do not match), grammar:match(...) will actually have no captures and so return the current position of the string. But this is exactly what I do not want, I want checkversion to return nil or false if there is no match and something that evaluates to true otherwise, actually just like string:match would do.
If I on the other hand return false or nil from check in case of a non-match, I end up with return values from match like nil, "1", nil, nil which is basically impossible to handle.
Any ideas?
I think you can combine it, using ordered choice (+), with a constant capture of nil:
grammar = grammar + lpeg.Cc(nil)
This is the pattern I eventually used:
nil_capturing_pattern * lpeg.Cc(nil)
I incorporated it into the grammar in the S rule. (Note that this also includes a changed grammar to "correctly" determine version order, since in version numbering "4.7" < "4.11" is true, but not in ordinary decimal arithmetic.)
local Minor_mag = log10(Minor);
local function check(a, am, op, b, bm)
if op then
local mag = floor(max(log10(am), log10(bm), Minor_mag, 1))+1;
local a, b, v = a*10^mag+am, b*10^mag+bm, Major*10^mag+Minor;
if a <= v and v <= b then
return a..op..b;
end
elseif a == Major and (am == "0" or am == Minor) then
return a.."."..am;
end
end
local R, V, C, Cc = lpeg.R, lpeg.V, lpeg.C, lpeg.Cc
local g = lpeg.P({ "S",
S = V("R") * ("," * V("R"))^0 * Cc(nil),
R = (V("Vm") + V("VM")) * (C("-") * (V("Vm") + V("VM")))^-1 / check,
VM = V("D") * Cc("0"),
Vm = V("D") * "." * V("D"),
D = C(R("09")^1),
});
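The difference between the two suggestions is easier to see in isolation. The sketch below assumes LPeg is installed and uses a throwaway pattern P("abc") in place of the version grammar. It shows that sequencing with Cc(nil) always appends a nil capture, while ordered choice only reaches Cc(nil) when the first alternative fails.

```lua
local lpeg = require("lpeg")
local P, Cc = lpeg.P, lpeg.Cc

local plain       = P("abc")            -- matches, but makes no captures
local with_nil    = P("abc") * Cc(nil)  -- same match, plus one capture: nil
local with_choice = P("abc") + Cc(nil)  -- Cc(nil) is tried only if "abc" fails

print(plain:match("abc"))        -- 4: no captures, so the end position
print(with_nil:match("abc"))     -- nil: the single captured value
print(with_choice:match("abc"))  -- 4: the first alternative succeeded
print(with_choice:match("xyz"))  -- nil: fell through to Cc(nil)
```

In the grammar above, check either adds a capture before the trailing Cc(nil) or adds nothing, so the first return value of match is either a matched range or nil.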
Multiple returns from match are not impossible to handle, if you catch them in a way that makes handling them easier. I added a function matched that does that, and added the fallback return of false to your check.
do
local L = require "lpeg"
local LV, LP, LC, LR, floor = L.V, L.P, L.C, L.R, math.floor
local version = 6.25
local function check(a, op, b)
if op and a+0 <= version and version <= b+0 then
return a..op..b -- range
elseif not op and floor(version) == floor(a+0) then
return a -- single item
end
return false
end
local grammar = LP({ "S",
S = LV"R" * (LP"," * LV"R")^0,
R = LV"V" * (LC(LP"-") * LV"V")^-1 / check,
V = LC(LV"D" * (LP"." * LV"D")^-1),
D = (LR("09")^1),
})
local function matched(...)
local n = select('#',...)
if n == 0 then return false end
for i=1,n do
if select(i,...) then return true end
end
return false
end
function checkversion(ver,str)
version = ver
return matched(grammar:match(str))
end
end
I enclosed the whole thing in do ... end so that the local version, which is used here as an upvalue to check, would have constrained scope, and added a parameter to checkversion() to make it easier to run through a few test cases. For example:
cases = { 1, 6.25, 7.25, 8, 8.5, 10 }
for _,v in ipairs(cases) do
print(v, checkversion(v, "1-7,8.1,8.3,9"))
end
When run, I get:
C:\Users\Ross\Documents\tmp\SOQuestions>q18793493.lua
1 true
6.25 true
7.25 false
8 true
8.5 true
10 false
C:\Users\Ross\Documents\tmp\SOQuestions>
Note that either nil or false would work equally well in this case. It just feels saner to have collected a list that can be handled as a normal Lua array-like table without concern for the holes.

Case-insensitive Lua pattern-matching

I'm writing a grep utility in Lua for our mobile devices running Windows CE 6/7, but I've run into some issues implementing case-insensitive match patterns. The obvious solution of converting everything to uppercase (or lower) does not work so simply due to the character classes.
The only other thing I can think of is converting the literals in the pattern itself to uppercase.
Here's what I have so far:
function toUpperPattern(instr)
-- Check first character
if string.find(instr, "^%l") then
instr = string.upper(string.sub(instr, 1, 1)) .. string.sub(instr, 2)
end
-- Check the rest of the pattern
while 1 do
local a, b, str = string.find(instr, "[^%%](%l+)")
if not a then break end
if str then
instr = string.sub(instr, 1, a) .. string.upper(string.sub(instr, a+1, b)) .. string.sub(instr, b + 1)
end
end
return instr
end
I hate to admit how long it took to get even that far, and I can still see right away there are going to be problems with things like escaped percent signs '%%'.
I figured this must be a fairly common issue, but I can't seem to find much on the topic.
Are there any easier (or at least complete) ways to do this? I'm starting to go crazy here...
Hoping you Lua gurus out there can enlighten me!
Try something like this:
function case_insensitive_pattern(pattern)
-- find an optional '%' (group 1) followed by any character (group 2)
local p = pattern:gsub("(%%?)(.)", function(percent, letter)
if percent ~= "" or not letter:match("%a") then
-- if the '%' matched, or `letter` is not a letter, return "as is"
return percent .. letter
else
-- else, return a case-insensitive character class of the matched letter
return string.format("[%s%s]", letter:lower(), letter:upper())
end
end)
return p
end
print(case_insensitive_pattern("xyz = %d+ or %% end"))
which prints:
[xX][yY][zZ] = %d+ [oO][rR] %% [eE][nN][dD]
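As a quick check of how such a converted pattern behaves, here is a self-contained sketch. The helper ci is a condensed copy of the function above under a hypothetical shorter name. It also shows one limitation: letters inside an existing [...] set get wrapped too, which changes what the set means.

```lua
-- Condensed variant of case_insensitive_pattern, for illustration only.
local function ci(pattern)
  return (pattern:gsub("(%%?)(.)", function(percent, ch)
    if percent ~= "" or not ch:match("%a") then
      return percent .. ch       -- escapes and non-letters stay as they are
    end
    return "[" .. ch:lower() .. ch:upper() .. "]"
  end))
end

print(ci("world"))                        -- [wW][oO][rR][lL][dD]
print(("Hello World"):find(ci("world")))  -- 7  11
print(ci("[abc]"))                        -- [[aA][bB][cC]]
```

The last result is no longer the set "a, b or c": Lua reads it as three consecutive sets followed by a literal ]. Patterns that already contain sets or %b items need the more careful handling shown in the answers that follow.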
An LPeg-based version (Lua 5.1, LPeg v0.12):
do
local re = require "re" -- the regex-like front end that ships with LPeg
local p = re.compile([[
pattern <- ( {b} / {escaped} / brackets / other)+
b <- "%b" . .
escaped <- "%" .
brackets <- { "[" ([^]%]+ / escaped)* "]" }
other <- [^[%]+ -> cases
]], {
cases = function(str) return (str:gsub('%a',function(a) return '['..a:lower()..a:upper()..']' end)) end
})
local pb = re.compile([[
pattern <- ( {b} / {escaped} / brackets / other)+
b <- "%b" . .
escaped <- "%" .
brackets <- {: {"["} ({escaped} / bcases)* {"]"} :}
bcases <- [^]%]+ -> bcases
other <- [^[%]+ -> cases
]], {
cases = function(str) return (str:gsub('%a',function(a) return '['..a:lower()..a:upper()..']' end)) end
, bcases = function(str) return (str:gsub('%a',function(a) return a:lower()..a:upper() end)) end
})
function iPattern(pattern,brackets)
('sanity check'):find(pattern)
return table.concat({re.match(pattern, brackets and pb or p)})
end
end
local test = '[ab%c%]d%%]+ o%%r %bnm'
print(iPattern(test)) -- [ab%c%]d%%]+ [oO]%%[rR] %bnm
print(iPattern(test,true)) -- [aAbB%c%]dD%%]+ [oO]%%[rR] %bnm
print(('qwe [%D]% O%r n---m asd'):match(iPattern(test, true))) -- %D]% O%r n---m
Pure Lua version:
It is necessary to analyze all the characters in the string to convert it into a correct pattern because Lua patterns do not have alternations like in regexps (abc|something).
function iPattern(pattern, brackets)
('sanity check'):find(pattern)
local tmp = {}
local i=1
while i <= #pattern do -- a numeric 'for' would not let us change the counter
local char = pattern:sub(i,i) -- current char
if char == '%' then
tmp[#tmp+1] = char -- add to tmp table
i=i+1 -- next char position
char = pattern:sub(i,i)
tmp[#tmp+1] = char
if char == 'b' then -- '%bxy' - add next 2 chars
tmp[#tmp+1] = pattern:sub(i+1,i+2)
i=i+2
end
elseif char=='[' then -- brackets
tmp[#tmp+1] = char
i = i+1
while i <= #pattern do
char = pattern:sub(i,i)
if char == '%' then -- no '%bxy' inside brackets
tmp[#tmp+1] = char
tmp[#tmp+1] = pattern:sub(i+1,i+1)
i = i+1
elseif char:match("%a") then -- letter
tmp[#tmp+1] = not brackets and char or char:lower()..char:upper()
else -- something else
tmp[#tmp+1] = char
end
if char==']' then break end -- close bracket
i = i+1
end
elseif char:match("%a") then -- letter
tmp[#tmp+1] = '['..char:lower()..char:upper()..']'
else
tmp[#tmp+1] = char -- something else
end
i=i+1
end
return table.concat(tmp)
end
local test = '[ab%c%]d%%]+ o%%r %bnm'
print(iPattern(test)) -- [ab%c%]d%%]+ [oO]%%[rR] %bnm
print(iPattern(test,true)) -- [aAbB%c%]dD%%]+ [oO]%%[rR] %bnm
print(('qwe [%D]% O%r n---m asd'):match(iPattern(test, true))) -- %D]% O%r n---m

Very simple RogueLike in F#, making it more "functional"

I have some existing C# code for a very, very simple RogueLike engine. It is deliberately naive in that I was trying to do the minimum amount as simply as possible. All it does is move an # symbol around a hardcoded map using the arrow keys and System.Console:
//define the map
var map = new List<string>{
" ",
" ",
" ",
" ",
" ############################### ",
" # # ",
" # ###### # ",
" # # # # ",
" #### #### # # # ",
" # # # # # # ",
" # # # # # # ",
" #### #### ###### # ",
" # = # ",
" # = # ",
" ############################### ",
" ",
" ",
" ",
" ",
" "
};
//set initial player position on the map
var playerX = 8;
var playerY = 6;
//clear the console
Console.Clear();
//send each row of the map to the Console
map.ForEach( Console.WriteLine );
//create an empty ConsoleKeyInfo for storing the last key pressed
var keyInfo = new ConsoleKeyInfo( );
//keep processing key presses until the player wants to quit
while ( keyInfo.Key != ConsoleKey.Q ) {
//store the player's current location
var oldX = playerX;
var oldY = playerY;
//change the player's location if they pressed an arrow key
switch ( keyInfo.Key ) {
case ConsoleKey.UpArrow:
playerY--;
break;
case ConsoleKey.DownArrow:
playerY++;
break;
case ConsoleKey.LeftArrow:
playerX--;
break;
case ConsoleKey.RightArrow:
playerX++;
break;
}
//check if the square that the player is trying to move to is empty
if( map[ playerY ][ playerX ] == ' ' ) {
//ok it was empty, clear the square they were standing on before
Console.SetCursorPosition( oldX, oldY );
Console.Write( ' ' );
//now draw them at the new square
Console.SetCursorPosition( playerX, playerY );
Console.Write( '#' );
} else {
//they can't move there, change their location back to the old location
playerX = oldX;
playerY = oldY;
}
//wait for them to press a key and store it in keyInfo
keyInfo = Console.ReadKey( true );
}
I was playing around with doing it in F#, initially I was trying to write it using functional concepts, but turned out I was a bit over my head, so I did pretty much a straight port - it's not really an F# program (though it compiles and runs) it's a procedural program written in F# syntax:
open System
//define the map
let map = [ " ";
" ";
" ";
" ";
" ############################### ";
" # # ";
" # ###### # ";
" # # # # ";
" #### #### # # # ";
" # # # # # # ";
" # # # # # # ";
" #### #### ###### # ";
" # = # ";
" # = # ";
" ############################### ";
" ";
" ";
" ";
" ";
" " ]
//set initial player position on the map
let mutable playerX = 8
let mutable playerY = 6
//clear the console
Console.Clear()
//send each row of the map to the Console
map |> Seq.iter (printfn "%s")
//create an empty ConsoleKeyInfo for storing the last key pressed
let mutable keyInfo = ConsoleKeyInfo()
//keep processing key presses until the player wants to quit
while not ( keyInfo.Key = ConsoleKey.Q ) do
    //store the player's current location
    let mutable oldX = playerX
    let mutable oldY = playerY
    //change the player's location if they pressed an arrow key
    if keyInfo.Key = ConsoleKey.UpArrow then
        playerY <- playerY - 1
    else if keyInfo.Key = ConsoleKey.DownArrow then
        playerY <- playerY + 1
    else if keyInfo.Key = ConsoleKey.LeftArrow then
        playerX <- playerX - 1
    else if keyInfo.Key = ConsoleKey.RightArrow then
        playerX <- playerX + 1
    //check if the square that the player is trying to move to is empty
    if map.Item( playerY ).Chars( playerX ) = ' ' then
        //ok it was empty, clear the square they were standing on
        Console.SetCursorPosition( oldX, oldY )
        Console.Write( ' ' )
        //now draw them at the new square
        Console.SetCursorPosition( playerX, playerY )
        Console.Write( '#' )
    else
        //they can't move there, change their location back to the old location
        playerX <- oldX
        playerY <- oldY
    //wait for them to press a key and store it in keyInfo
    keyInfo <- Console.ReadKey( true )
So my question is, what do I need to learn in order to rewrite this more functionally, can you give me some hints, a vague overview, that kind of thing.
I'd prefer a shove in the right direction rather than just seeing some code, but if that's the easiest way for you to explain it to me then fine, but in that case can you please also explain the "why" rather than the "how" of it?
Game programming in general will test your ability to manage complexity. I find that functional programming encourages you to break the problems you're solving into smaller pieces.
The first thing you want to do is turn your script into a bunch of functions by separating all the different concerns. I know it sounds silly but the very act of doing this will make the code more functional (pun intended.) Your main concern is going to be state management. I used a record to manage the position state and a tuple to manage the running state. As your code gets more advanced you will need objects to manage state cleanly.
Try adding more to this game and keep breaking the functions apart as they grow. Eventually you will need objects to manage all the functions.
On a game programming note don't change state to something else and then change it back if it fails some test. You want minimal state change. So for instance below I calculate the newPosition and then only change the playerPosition if this future position passes.
open System
// use a third party vector class for 2D and 3D positions
// or write your own for practice
type Pos =
    { x: int; y: int }
    static member (+) (a, b) =
        { x = a.x + b.x; y = a.y + b.y }

let drawBoard map =
    //clear the console
    Console.Clear()
    //send each row of the map to the Console
    map |> List.iter (printfn "%s")

let movePlayer (keyInfo : ConsoleKeyInfo) =
    match keyInfo.Key with
    | ConsoleKey.UpArrow -> {x = 0; y = -1}
    | ConsoleKey.DownArrow -> {x = 0; y = 1}
    | ConsoleKey.LeftArrow -> {x = -1; y = 0}
    | ConsoleKey.RightArrow -> {x = 1; y = 0}
    | _ -> {x = 0; y = 0}

let validPosition (map:string list) position =
    map.Item(position.y).Chars(position.x) = ' '

//clear the square player was standing on
let clearPlayer position =
    Console.SetCursorPosition(position.x, position.y)
    Console.Write( ' ' )

//draw the square player is standing on
let drawPlayer position =
    Console.SetCursorPosition(position.x, position.y)
    Console.Write( '#' )

let takeTurn map playerPosition =
    let keyInfo = Console.ReadKey true
    // check to see if player wants to keep playing
    let keepPlaying = keyInfo.Key <> ConsoleKey.Q
    // get player movement from user input
    let movement = movePlayer keyInfo
    // calculate the player's new position
    let newPosition = playerPosition + movement
    // check for valid move
    let validMove = newPosition |> validPosition map
    // update drawing if move was valid
    if validMove then
        clearPlayer playerPosition
        drawPlayer newPosition
    // return state
    if validMove then
        keepPlaying, newPosition
    else
        keepPlaying, playerPosition

// main game loop
let rec gameRun map playerPosition =
    let keepPlaying, newPosition = playerPosition |> takeTurn map
    if keepPlaying then
        gameRun map newPosition

// setup game
let startGame map playerPosition =
    drawBoard map
    drawPlayer playerPosition
    gameRun map playerPosition
//define the map
let map = [ " ";
" ";
" ";
" ";
" ############################### ";
" # # ";
" # ###### # ";
" # # # # ";
" #### #### # # # ";
" # # # # # # ";
" # # # # # # ";
" #### #### ###### # ";
" # = # ";
" # = # ";
" ############################### ";
" ";
" ";
" ";
" ";
" " ]
//initial player position on the map
let playerPosition = {x = 8; y = 6}
startGame map playerPosition
That's a nice little game :-). In functional programming, you'd want to avoid using mutable state (as others pointed out) and you'd also want to write the core of your game as a function that doesn't have any side-effects (e.g. reading from console and writing).
The key part of the game is the function that controls the position. You could refactor your code to have a function with the type signature:
val getNextPosition : (int * int) -> ConsoleKey -> option<int * int>
The function returns None if the game should quit. Otherwise it returns Some(posX, posY) where posX and posY are your new locations for the # symbol. By doing the change, you get a nice functional core and the function getNextPosition is also easy to test (because it always returns the same result for the same inputs).
To use the function, the best option is to write the looping using recursion. The structure of the main function would look like this:
let rec playing pos =
    // getNextPosition expects a ConsoleKey, so pass the Key of the ConsoleKeyInfo
    match getNextPosition pos (Console.ReadKey().Key) with
    | None -> () // Quit the game
    | Some(newPos) ->
        // This function redraws the screen (this is a side-effect,
        // but it is localized to a single function)
        redrawScreen pos newPos
        playing newPos
Being a game, and using the Console, there is state and side-effects here which are inherent. But the key thing you'll want to do is eliminate those mutables. Using a recursive loop instead of a while loop will help you do that since then you can pass your state as arguments to each recursive call. Other than that, the main thing I can see to take advantage of F# features here is using pattern matching instead of if/then statements and switches, though that would be a mainly aesthetic improvement.
I'll try to avoid being overly specific. If I end up going too far in the other direction and this is too vague, let me know and I'll try to improve it a little.
When making a functional program that has some sort of state, the basic mechanism you want to implement is something like:
(currentState, input) => newState
Then you can write a small wrapper around that to handle fetching input and drawing output.

Resources