Using makeExprParser with ambiguity - parsing

I'm currently encountering a problem while translating a parser from a CFG-based tool (antlr) to Megaparsec.
The grammar contains lists of expressions (handled with makeExprParser) that are enclosed in brackets (<, >) and separated by ,.
Stuff like <>, <23>, <23,87> etc.
The problem now is that the expressions may themselves contain the > operator (meaning "greater than"), which causes my parser to fail.
<1223>234> should, for example, be parsed into [BinaryExpression ">" (IntExpr 1223) (IntExpr 234)].
I presume that I have to place try strategically somewhere, but the places I tried (the first argument of sepBy and the first argument of makeExprParser) unfortunately did not work.
Can I use makeExprParser in such a situation, or do I have to write the expression parser by hand?
This is the relevant part of my parser:
-- uses megaparsec, text, and parser-combinators
{-# LANGUAGE OverloadedStrings #-}
module Main where
import Control.Monad.Combinators.Expr
import Data.Text
import Data.Void
import System.Environment
import Text.Megaparsec
import Text.Megaparsec.Char
import qualified Text.Megaparsec.Char.Lexer as L
type BinaryOperator = Text
type Name = Text
data Expr
  = IntExpr Integer
  | BinaryExpression BinaryOperator Expr Expr
  deriving (Eq, Show)
type Parser = Parsec Void Text
lexeme :: Parser a -> Parser a
lexeme = L.lexeme sc
symbol :: Text -> Parser Text
symbol = L.symbol sc
sc :: Parser ()
sc = L.space space1 (L.skipLineComment "//") (L.skipBlockCommentNested "/*" "*/")
parseInteger :: Parser Expr
parseInteger = do
  number <- some digitChar
  _ <- sc
  return $ IntExpr $ read number
parseExpr :: Parser Expr
parseExpr = makeExprParser parseInteger [[InfixL (BinaryExpression ">" <$ symbol ">")]]
parseBracketList :: Parser [Expr]
parseBracketList = do
  _ <- symbol "<"
  exprs <- sepBy parseExpr (symbol ",")
  _ <- symbol ">"
  return exprs
main :: IO ()
main = do
  text : _ <- getArgs
  let res = runParser parseBracketList "stdin" (pack text)
  case res of
    (Right suc) -> do
      print suc
    (Left err) ->
      putStrLn $ errorBundlePretty err

You've (probably) misdiagnosed the problem. Your parser fails on <1233>234> because it's trying to parse > as a left associative operator, like +. In other words, the same way:
1+2+
would fail, because the second + has no right-hand operand, your parser is failing because:
1233>234>
has no digit following the second >. Assuming you don't want your > operator to chain (i.e., 1>2>3 is not a valid Expr), you should first replace InfixL with InfixN (non-associative) in your makeExprParser table. Then, it will parse this example fine.
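For reference, that first change on its own looks like this (just the InfixL to InfixN swap in the operator table; the lookAhead fix discussed below is not applied yet):
parseExpr :: Parser Expr
parseExpr = makeExprParser parseInteger
  [[InfixN (BinaryExpression ">" <$ symbol ">")]]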
Unfortunately, with or without this change your parser will still fail on the simpler test case:
<1233>
because the > is interpreted as an operator within a continuing expression.
In other words, the problem isn't that your parser can't handle expressions with > characters, it's that it's overly aggressive in treating > characters as part of an expression, preventing them from being recognized as the closing angle bracket.
To fix this, you need to figure out exactly what you're parsing. Specifically, you need to resolve the ambiguity in your parser by precisely characterizing the situations where > can be part of a continuing expression and where it can't.
One rule that will probably work is to only consider a > as an operator if it is followed by a valid "term" (i.e., a parseInteger). You can do this with lookAhead. The parser:
symbol ">" <* lookAhead term
will parse a > operator only if it is followed by a valid term. If it fails to find a term, it will consume some input (at least the > symbol itself), so you must surround it with a try:
try (symbol ">" <* lookAhead term)
With the above two fixes applied to parseExpr:
parseExpr :: Parser Expr
parseExpr = makeExprParser term
  [[InfixN (BinaryExpression ">" <$ try (symbol ">" <* lookAhead term))]]
  where term = parseInteger
you'll get the following parses:
λ> parseTest parseBracketList "<23>"
[IntExpr 23]
λ> parseTest parseBracketList "<23,87>"
[IntExpr 23,IntExpr 87]
λ> parseTest parseBracketList "<23,87>18>"
[IntExpr 23,BinaryExpression ">" (IntExpr 87) (IntExpr 18)]
However, the following will fail:
λ> parseTest parseBracketList "<23,87>18"
1:10:
|
1 | <23,87>18
| ^
unexpected end of input
expecting ',', '>', or digit
λ>
because the fact that the > is followed by 18 means that it is a valid operator, and it is then a parse error that the valid expression 87>18 is followed by neither a comma nor a closing > angle bracket.
If you need to parse something like <23,87>18, you have bigger problems. Consider the following two test cases:
<1,2>3,4,5,6,7,...,100000000000,100000000001>
<1,2>3,4,5,6,7,...,100000000000,100000000001
It's a challenge to write an efficient parser that will parse the first one as one enormous list of expressions but the second one as a list of just two expressions:
[IntExpr 1, IntExpr 2]
followed by some "extra" text. Hopefully, the underlying "language" you're trying to parse isn't so hopelessly broken that this will be an issue.

Related

Correctly parsing nested data using megaparsec

I am trying to get more familiar with megaparsec, and I am running into some issues with precedences. By 'nested data' in the title I refer to the fact that I am trying to parse types, which in turn could contain other types. If someone could explain why this does not behave as I would expect, please don't hesitate to tell me.
I am trying to parse types similar to those found in Haskell. Types are either base types Int, Bool, Float or type variables a (any lowercase word).
We can also construct algebraic data types from type constructors (Uppercase words) such as Maybe and type parameters (any other type). Examples are Maybe a and Either (Maybe Int) Bool. Functions associate to the right and are constructed with ->, such as Maybe a -> Either Int (b -> c). N-ary tuples are a sequence of types separated by , and enclosed in ( and ), such as (Int, Bool, a). A type can be wrapped in parenthesis to raise its precedence level (Maybe a). A unit type () is also defined.
I am using this ADT to describe this.
newtype Ident = Ident String
newtype UIdent = UIdent String
data Type a
  = TLam a (Type a) (Type a)
  | TVar a Ident
  | TNil a
  | TAdt a UIdent [Type a]
  | TTup a [Type a]
  | TBool a
  | TInt a
  | TFloat a
I have tried to write a megaparsec parser to parse such types, but I get unexpected results. I attach the relevant code below after which I will try to describe what I experience.
{-# LANGUAGE OverloadedStrings #-}
module Parser where
import AbsTinyCamiot
import Text.Megaparsec
import Text.Megaparsec.Char
import qualified Text.Megaparsec.Char.Lexer as Lexer
import Text.Megaparsec.Debug
import Control.Applicative hiding (many, some, Const)
import Control.Monad.Combinators.Expr
import Control.Monad.Identity
import Data.Void
import Data.Text (Text, unpack)
type Parser a = ParsecT Void Text Identity a
-- parse types
pBaseType :: Parser (Type ())
pBaseType = choice [
  TInt () <$ label "parse int" (pSymbol "Int"),
  TBool () <$ label "parse bool" (pSymbol "Bool"),
  TFloat () <$ label "parse float" (pSymbol "Float"),
  TNil () <$ label "parse void" (pSymbol "()"),
  TVar () <$> label "parse type variable" pIdent]
pAdt :: Parser (Type ())
pAdt = label "parse ADT" $ do
  con <- pUIdent
  variables <- many $ try $ many spaceChar >> pType
  return $ TAdt () con variables
pType :: Parser (Type ())
pType = label "parse a type" $
  makeExprParser
    (choice [ try pFunctionType
            , try $ parens pType
            , try pTupleType
            , try pBaseType
            , try pAdt
            ])
    [] --[[InfixR (TLam () <$ pSymbol "->")]]
pTupleType :: Parser (Type ())
pTupleType = label "parse a tuple type" $ do
  pSymbol "("
  fst <- pType
  rest <- some (pSymbol "," >> pType)
  pSymbol ")"
  return $ TTup () (fst : rest)
pFunctionType :: Parser (Type ())
pFunctionType = label "parse a function type" $ do
  domain <- pType
  some spaceChar
  pSymbol "->"
  some spaceChar
  codomain <- pType
  return $ TLam () domain codomain
parens :: Parser a -> Parser a
parens p = label "parse a type wrapped in parentheses" $ do
  pSymbol "("
  a <- p
  pSymbol ")"
  return a
pUIdent :: Parser UIdent
pUIdent = label "parse a UIdent" $ do
  a <- upperChar
  rest <- many $ choice [letterChar, digitChar, char '_']
  return $ UIdent (a:rest)
pIdent :: Parser Ident
pIdent = label "parse an Ident" $ do
  a <- lowerChar
  rest <- many $ choice [letterChar, digitChar, char '_']
  return $ Ident (a:rest)
pSymbol :: Text -> Parser Text
pSymbol = Lexer.symbol pSpace
pSpace :: Parser ()
pSpace = Lexer.space
  (void spaceChar)
  (Lexer.skipLineComment "--")
  (Lexer.skipBlockComment "{-" "-}")
This might be overwhelming, so let me explain some key points. I understand that I have a lot of different constructions that could match on an opening parenthesis, so I've wrapped those parsers in try, such that if they fail I can try the next parser that might consume an opening parenthesis. Perhaps I am using try too much? Does it affect performance to potentially backtrack so much?
I've also tried to make an expression parser by defining some terms and an operator table, though you can see that I've commented out the operator (the function arrow) for now. As the code stands, I loop infinitely when I try to parse a function type. I think this is because when I try to parse a function type (invoked from pType) I immediately try to parse a type representing the domain of the function, which again calls pType. How would I do this correctly?
If I decide to use the operator table instead, and not use my custom parser for function types, I parse things with the wrong precedences. E.g. Maybe a -> b gets parsed as Maybe (a -> b), while I would want it to be parsed as (Maybe a) -> b. Is there a way I could use the operator table and still have type constructors bind more tightly than the function arrow?
Lastly, as I am learning megaparsec as I go, if anyone sees any misunderstandings or things that are weird/unexpected, please tell me. I've read most of this tutorial to get this far.
Please let me know of any edits I can make to increase the quality of my question!
Your code does not handle precedences at all, and as a result of that it also runs into looping left-recursion.
To give an example of left-recursion in your code, pFunctionType calls pType as the first action, which calls pFunctionType as the first action. This is clearly a loop.
For precedences, I recommend looking at tutorials on "recursive descent operator parsing"; a quick Google search reveals several. Nevertheless, I can summarize the key points here as I write some code.
{-# language OverloadedStrings #-}
import Control.Monad.Identity
import Data.Text (Text)
import Data.Void
import Text.Megaparsec
import Text.Megaparsec.Char
import qualified Text.Megaparsec.Char.Lexer as Lexer
type Parser a = ParsecT Void Text Identity a
newtype Ident = Ident String deriving Show
newtype UIdent = UIdent String deriving Show
data Type
  = TVar Ident
  | TFun Type Type -- instead of "TLam"
  | TAdt UIdent [Type]
  | TTup [Type]
  | TUnit          -- instead of "TNil"
  | TBool
  | TInt
  | TFloat
  deriving Show
pSymbol :: Text -> Parser Text
pSymbol = Lexer.symbol pSpace
pChar :: Char -> Parser ()
pChar c = void (char c <* pSpace)
pSpace :: Parser ()
pSpace = Lexer.space
  (void spaceChar)
  (Lexer.skipLineComment "--")
  (Lexer.skipBlockComment "{-" "-}")
keywords :: [String]
keywords = ["Bool", "Int", "Float"]
pUIdent :: Parser UIdent
pUIdent = try $ do
  a <- upperChar
  rest <- many $ choice [letterChar, digitChar, char '_']
  pSpace
  let x = a:rest
  if elem x keywords
    then fail "expected an ADT name"
    else pure $ UIdent x
pIdent :: Parser Ident
pIdent = try $ do
  a <- lowerChar
  rest <- many $ choice [letterChar, digitChar, char '_']
  pSpace
  return $ Ident (a:rest)
Let's stop here.
I changed the names of constructors in Type to conform to how they are called in Haskell. I also removed the parameter on Type, to have less noise in my example, but you can add it back of course.
Note the changed pUIdent and the addition of keywords. In general, if you want to parse identifiers, you have to disambiguate them from keywords. In this case, Int could parse both as Int and as an upper case identifier, so we have to specify that Int is not an identifier.
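A quick sanity check of that keyword handling (a hypothetical GHCi session; the exact error rendering may differ):
λ> parseTest (pUIdent <* eof) "Maybe"
UIdent "Maybe"
λ> parseTest (pUIdent <* eof) "Int"
-- parse error mentioning "expected an ADT name", leaving "Int" free to be parsed as TInt by pClosed below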
Continuing:
pClosed :: Parser Type
pClosed =
      (TInt <$ pSymbol "Int")
  <|> (TBool <$ pSymbol "Bool")
  <|> (TFloat <$ pSymbol "Float")
  <|> (TVar <$> pIdent)
  <|> (do pChar '('
          -- sepBy (not sepBy1), so that "()" yields an empty list and becomes TUnit
          ts <- sepBy pFun (pChar ',') <* pChar ')'
          case ts of
            []  -> pure TUnit
            [t] -> pure t
            _   -> pure (TTup ts))
pApp :: Parser Type
pApp = (TAdt <$> pUIdent <*> many pClosed)
   <|> pClosed
pFun :: Parser Type
pFun = foldr1 TFun <$> sepBy1 pApp (pSymbol "->")
pExpr :: Parser Type
pExpr = pSpace *> pFun <* eof
We have to group operators according to binding strength. For each strength, we need to have a separate parsing function which parses all operators of that strength. In this case we have pFun, pApp and pClosed in increasing order of binding strength. pExpr is just a wrapper which handles top-level expressions, and takes care of leading whitespace and matches the end of the input.
When writing an operator parser, the first thing we should pin down is the group of closed expressions. Closed expressions are delimited by a keyword or symbol both on the left and the right. They conceptually have "infinite" binding strength, since text before and after such expressions doesn't change how they are parsed at all.
Keywords and variables are clearly closed, since they consist of a single token. We also have three more closed cases: the unit type, tuples and parenthesized expressions. Since all of these start with a (, I factor this out. After that, we have zero or more types separated by , and we have to branch on the number of parsed types.
The rule in precedence parsing is that when parsing an operator expression of given strength, we always call the next stronger expression parser when reading the expressions between operator symbols.
, is the weakest operator, so we call the function for the second weakest operator, pFun.
pFun in turn calls pApp, which reads ADT applications, or falls back to pClosed. In pFun you can also see the handling of right associativity, as we use foldr1 TFun to combine expressions. For a left-associative infix operator, we would instead use foldl1.
Note that parser functions always parse all stronger expressions as well. So pFun falls back on pApp when there is no -> (because sepBy1 accepts the case with no separators), and pApp falls back on pClosed when there's no ADT application.
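With these definitions, the example from the question gets the intended precedence; the following is a hypothetical GHCi check whose output I derived by hand from the definitions above:
λ> parseTest pExpr "Maybe a -> b"
TFun (TAdt (UIdent "Maybe") [TVar (Ident "a")]) (TVar (Ident "b"))
That is, (Maybe a) -> b rather than Maybe (a -> b), because pApp (constructor application) binds more tightly than the -> handled in pFun.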

Problem while writing a small parser in Haskell using Parsec

I am trying to write a parser for a small language with the following piece of code
import Text.ParserCombinators.Parsec
import Text.Parsec.Token
data Exp = Atom String | Op String Exp
instance Show Exp where
  show (Atom x) = x
  show (Op f x) = f ++ "(" ++ (show x) ++ ")"
parse_exp :: Parser Exp
parse_exp = (try parse_atom) <|> parse_op
parse_atom :: Parser Exp
parse_atom = do
  x <- many1 letter
  return (Atom x)
parse_op :: Parser Exp
parse_op = do
  x <- many1 letter
  char '('
  y <- parse_exp
  char ')'
  return (Op x y)
But when I type in ghci
>>> parse (parse_exp <* eof) "<error>" "s(t)"
I get the output
Left "<error>" (line 1, column 2):
unexpected '('
expecting letter or end of input
If I redefine parse_exp as
parse_exp = (try parse_op) <|> parse_atom
then I get the correct result
>>> parse (parse_exp <* eof) "<error>" "s(t)"
Right s(t)
But I am confused why the first one does not work. Is there a general fix to these kinds of problems in parsing?
When a Parsec parser, like parse_atom, is run on a particular string, there are four possible results:
It succeeds, consuming some input.
It fails, consuming some input.
It succeeds, consuming no input.
It fails, consuming no input.
In the Parsec source code, these are referred to as "consumed ok", "consumed err", "empty ok" and "empty err" (sometimes abbreviated cok, cerr, eok, eerr).
When two Parsec parsers are used in an alternative, like p <|> q, here's how it's parsed. First, Parsec tries to parse with p. Then:
If this results in "consumed ok" or "empty ok", the parse succeeds and this becomes the result of the entire parser p <|> q.
If this results in "empty err", Parsec tries the alternative q, and this becomes the result of the entire p <|> q parser.
If this results in "consumed err", the entire parser p <|> q fails with "consumed err" (cerr).
Note the critical difference between p returning cerr (which causes the whole parser to fail) versus returning eerr (which causes the alternative parser q to be tried).
The try function changes the behavior of a parser by converting a "cerr" result to an "eerr" result.
This means that if you are trying to parse the text "s(t)" with different parsers:
with the parser parse_atom <|> parse_op, the parser parse_atom returns "cok" consuming "s" and leaving unparseable text "(t)" which causes an error
with the parser try parse_atom <|> parse_op, the parser parse_atom still returns "cok" consuming "s", so the try (which only changes cerr to eerr) has no effect, and the unparseable text "(t)" causes the same error
with the parser parse_op <|> parse_atom, the parser parse_op successfully parses the string (actually, it doesn't because the recursive call to parse_exp can't parse "t", but let's ignore that); however, if the same parser was used on the text "s", then parse_op would consume the "s" before failing (i.e., cerr), causing the entire parse to fail instead of trying the alternative parse_atom
with the parser try parse_op <|> parse_atom, this would parse "s(t)", exactly as the previous example, and the try would have no effect; however, it would also work on the text "s", because parse_op would consume the "s" before failing with cerr, then try would "rescue" the parse by turning the cerr into an eerr, and the alternative parse_atom would be checked, successfully parsing (cok) the atom "s".
That's why the "correct" parser for your problem is try parse_op <|> parse_atom.
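To make the difference concrete, compare the two orderings on the one-letter input "s" (a hypothetical GHCi session; the error output is abbreviated):
-- parse_op consumes the "s" via many1 letter and then fails on the missing '(',
-- i.e. cerr, so the alternative parse_atom is never tried:
>>> parse (parse_op <|> parse_atom) "<error>" "s"
Left "<error>" (line 1, column 2): ...
-- try turns that cerr into eerr, so parse_atom gets its turn and succeeds:
>>> parse (try parse_op <|> parse_atom) "<error>" "s"
Right s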
Be warned that this behavior isn't a fundamental aspect of monadic parsers. It's a design choice made by Parsec (and compatible parsers like Megaparsec). Other monadic parsers can have different rules for how alternatives with <|> work.
The "general fix" for these kind of Parsec parsing problems is to be aware of the facts that in the expression p <|> q:
p is tried first, and if it succeeds, q will be ignored, even if q would provide a "longer" or "better" or "more sensible" parse or avoid additional parsing errors further down the road. In parse_atom <|> parse_op, because parse_atom can succeed on strings meant for parse_op, this order won't work correctly.
q is only tried if p fails without consuming input. You must arrange for p to not consume anything on failure, possibly by using try, if you expect the alternative q to be checked. So, parse_op <|> parse_atom isn't going to work if parse_op starts to consume something (like an identifier) before realizing it can't continue and returning cerr.
As an alternative to using try, you can also think more carefully about the structure of your parser. An alternative way of writing parse_exp, for example, would be:
parse_exp :: Parser Exp
parse_exp = do
  -- there's always an identifier
  x <- many1 letter
  -- there *might* be an expression in parentheses
  y <- optionMaybe (parens parse_exp)
  case y of
    Nothing -> return (Atom x)
    Just y' -> return (Op x y')
  where parens = between (char '(') (char ')')
This can be written a little more concisely, but even then it's not as "elegant" as something like try parse_op <|> parse_atom. (It performs better, though, so that might be a consideration in some applications.)
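For instance, one more compact spelling along the same lines (a sketch, equivalent to the version above):
parse_exp :: Parser Exp
parse_exp = do
  x <- many1 letter
  maybe (Atom x) (Op x) <$> optionMaybe (parens parse_exp)
  where parens = between (char '(') (char ')')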
The problem is that the string "s" counts as an atom according to your definitions. Try this:
parse parse_atom "" "s(t)"
> Atom "s"
So your parser parse_exp actually succeeds, returning Atom "s", but then you also expect an EOF right after it, and that's where it fails, encountering an open paren instead of an EOF (just like the error message says!)
When you swap the alternatives around, it would first attempt parse_op, which would succeed, returning Op "s" (Atom "t"), and then encounter EOF, just as expected.

Parse EBNF with Megaparsec nested sepBy

As an exercise I try to parse a EBNF/ABNF grammar with Megaparsec. I got trivial stuff like terminals and optionals working, but I'm struggling with alternatives. With this grammar:
S ::= 'hello' ['world'] IDENTIFIER LITERAL | 'test';
And this code:
production :: Parser Production
production = sepBy1 alternativeTerm (char '|') >>= return . Production
alternativeTerm :: Parser AlternativeTerm
alternativeTerm = sepBy1 term space >>= return . AlternativeTerm
term :: Parser Term
term = terminal
   <|> optional
   <|> identifier
   <|> literal
I get this error:
unexpected '|'
expecting "IDENTIFIER", "LITERAL", ''', '[', or white space
I guess the alternativeTerm parser is not returning to the production parser when it encounters a sequence that it cannot parse and throws an error instead.
What can I do about this? Should I change my ADT for the EBNF, or should I somehow flatten the parsing? And in either case, how would I do so?
It's probably best to expand my previous comment into a full answer.
Your grammar is basically a list of lists of terms, where the terms are separated (and ended) by whitespace, and the lists in turn are separated by |. Your solution with sepBy1 does not work because there is a trailing whitespace after LITERAL - sepBy1 assumes there is another term following that whitespace and tries to apply term to the |, which fails.
If your alternativeTerm is guaranteed to end with a whitespace character (or multiple), rewrite your alternativeTerm as follows:
alternativeTerm = (term `sepEndBy1` space) >>= return . AlternativeTerm
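To see the two behaviors side by side, here is a stripped-down, self-contained sketch (hypothetical names: word stands in for your term parser, and the grammar is reduced to letters, single spaces and |):
import Data.Void
import Text.Megaparsec
import Text.Megaparsec.Char
type P = Parsec Void String
word :: P String
word = some letterChar
-- sepBy1: after each separator another word is required, so the trailing
-- space before '|' makes the inner parser choke on the '|':
alts1 :: P [[String]]
alts1 = (word `sepBy1` char ' ') `sepBy1` char '|'
-- sepEndBy1: a trailing separator is allowed, so the inner parser stops
-- cleanly before the '|' and hands control back to the outer sepBy1:
alts2 :: P [[String]]
alts2 = (word `sepEndBy1` char ' ') `sepBy1` char '|'
Here parseTest (alts1 <* eof) "hello world |test" fails with an unexpected '|' (the same shape of error as in the question), while parseTest (alts2 <* eof) "hello world |test" yields [["hello","world"],["test"]].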

How to combine two parsers in Haskell ReadP anonymously?

The Problem
I want to chain two ReadP parsers anonymously.
Example
Input characters are in {'a', 'o', '+', ' '} where ' ' is a space.
I want to parse this input according to the following rules:
the first o or a is not preceded by a space/plus
every other o is preceded by a space
every other a is preceded by a plus
The BNF
In case this is not clear, I came up with the following BNF:
{-
<Input> ::= <O> <Expr> | <A> <Expr>
<Expr> ::= <Space> <O> <Expr> | <Plus> <A> <Expr> | <EOL>
<O> ::= "o"
<A> ::= "a"
<Space> ::= " "
<Plus> ::= "+"
-}
Specific questions
The idea is that the rules "o preceded by space" and "a preceded by plus"
(if not at the beginning) should not be part of the o or a parser, nor do
I want to create an explicit/named parser, because that is part of the
definition of an expression (Expr).
How do I chain the oParser with the spaceParser?
Composition (does not work): spaceParser . oParser
(+++) is the symmetric choice: oParser (+++) spaceParser. And related to that: Is (+++) equivalent to <|> from Control.Applicative?
ReadP.choice [oParser, spaceParser] creates a new parser, which obviously does not work for a string, but I did that previously and the behaviour of choice surprised me. How does choice work?
The Code
--------------------------------------------------------------------------------
module Parser (main) where
--------------------------------------------------------------------------------
-- modules
import Text.ParserCombinators.ReadP as ReadP
import Data.Char as D
import Control.Applicative
--------------------------------------------------------------------------------
-- this parses one o
oParser :: ReadP Char
oParser = satisfy (== 'o')
-- this parses one a
aParser :: ReadP Char
aParser = satisfy (== 'a') --(\v -> v == 'a')
-- this parses one plus
pParser :: ReadP Char
pParser = satisfy (== '+')
spaceParser :: ReadP Char
spaceParser = satisfy D.isSpace
parseExpr :: ReadP String
parseExpr = undefined -- ReadP.choice [oParser, spaceParser]
main :: IO ()
main = print [ x | (x, "") <- ReadP.readP_to_S parseExpr "o +a o+a+a o o"]
Thank you a lot for reading all of that :) (And: which Haskell parsing libraries would you recommend?)
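For what it's worth, chaining two ReadP parsers (as opposed to choosing between them) is just ordinary monadic or applicative sequencing. A minimal sketch of parseExpr along the lines of the BNF above (an assumption on my part: the result collects the parsed o/a characters into a String):
parseExpr :: ReadP String
parseExpr = do
  first <- oParser +++ aParser                    -- <Input>: the first o or a stands alone
  rest  <- ReadP.many ((spaceParser *> oParser)   -- <Space> <O>
                       +++ (pParser *> aParser))  -- <Plus> <A>
  ReadP.eof                                       -- <EOL>
  return (first : rest)
With this definition an input like "o o+a" should yield "ooa" as its only complete parse.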

Make a parser ignore all redundant whitespace

Say I have a Parser p in Parsec and I want to specify that I want to ignore all superfluous/redundant whitespace in p. Let's for example say that I define a list as starting with "[", ending with "]", and containing integers separated by whitespace. But I don't want any errors if there is whitespace in front of the "[", after the "]", in between the "[" and the first integer, and so on.
In my case, I want this to work for my parser for a toy programming language.
I will update with code if that is requested/necessary.
Just surround everything with spaces:
parseIntList :: Parsec String u [Int]
parseIntList = do
  spaces
  char '['
  spaces
  first <- many1 digit
  rest <- many $ try $ do
    spaces
    char ','
    spaces
    many1 digit
  spaces
  char ']'
  return $ map read $ first : rest
This is a very basic one, there are cases where it'll fail (such as an empty list) but it's a good start towards getting something to work.
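For a quick check of what this buys you (a hypothetical GHCi session using Parsec's parseTest helper):
λ parseTest parseIntList " [ 1, 2 , 3 ] "
[1,2,3]
λ parseTest parseIntList "[]"
-- parse error, as noted above: many1 digit insists on at least one integer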
@Joehillen's suggestion will also work, but it requires some more type magic to use the token features of Parsec. The definition of spaces matches zero or more characters that satisfy Data.Char.isSpace, which covers the standard whitespace characters.
Use combinators to say what you mean:
import Control.Applicative
import Text.Parsec
import Text.Parsec.String
program :: Parser [[Int]]
program = spaces *> many1 term <* eof
term :: Parser [Int]
term = list
list :: Parser [Int]
list = between listBegin listEnd (number `sepBy` listSeparator)
listBegin, listEnd, listSeparator :: Parser Char
listBegin = lexeme (char '[')
listEnd = lexeme (char ']')
listSeparator = lexeme (char ',')
lexeme :: Parser a -> Parser a
lexeme parser = parser <* spaces
number :: Parser Int
number = lexeme $ do
  digits <- many1 digit
  return (read digits :: Int)
Try it out:
λ :l Parse.hs
Ok, modules loaded: Main.
λ parseTest program " [1, 2, 3] [4, 5, 6] "
[[1,2,3],[4,5,6]]
This lexeme combinator takes a parser and allows arbitrary whitespace after it. Then you only need to use lexeme around the primitive tokens in your language such as listSeparator and number.
Alternatively, you can parse the stream of characters into a stream of tokens, then parse the stream of tokens into a parse tree. That way, both the lexer and parser can be greatly simplified. It’s only worth doing for larger grammars, though, where maintainability is a concern; and you have to use some of the lower-level Parsec API such as tokenPrim.
