Parsec custom while loop parser only parses one statement in loop body - parsing

I'm trying to write a parser to parse a loop in the following form:
(:= x 0)
Do ((< x 10))->
(:= x (+ x 1))
print(x)
Od
However, my parser only works for a loop whose body contains a single statement. To parse more than one statement, the body above has to be written on one line, like this:
(:= x (+ x 1))(:= x 20)
I have tried using delimiters such as semicolons to force the parser to take the loop body line by line, but the behaviour persists: the body still has to be written as (:= x (+ x 1));(:= x 20) instead of on separate lines.
Please find my parsers below:
parsersHStatement :: Parser HStatement
parsersHStatement = try (parsePrint) <|> try (parseDo) <|> try (parseEval)

parseLoopBody :: Parser [HStatement]
parseLoopBody = many1 $ parsersHStatement

parseDo :: Parser HStatement
parseDo = do
  spaces
  _ <- string "Do"
  spaces
  _ <- string "("
  p <- try (parseExpr) <|> try (parseBool)
  _ <- string ")->"
  spaces
  q <- parseLoopBody <* spaces
  spaces
  _ <- string "Od"
  return $ Do p q

parseEval :: Parser HStatement
parseEval = liftM Eval $ parsersHVal

parsersHVal :: Parser HVal
parsersHVal = try (parseAssign) <|> try (parsePrimitiveValue) <|> try (parseExpr)

parsePrint :: Parser HStatement
parsePrint = string "print(" *> parsersHVal <* string ")" >>= (return . Print)

parseExpr :: Parser HVal
parseExpr = do
  char '('
  spaces
  op <- try (parseOperation)
  spaces
  x <- try (sepBy (parseExpr <|> parseVarOrInt) spaces)
  spaces
  char ')'
  return $ Expr op x

parseBool :: Parser HVal
parseBool = classifyBool <$> ( (string "True") <|> (string "False") )
  where
    classifyBool "True" = Bool True
    classifyBool "False" = Bool False
Within parseLoopBody, I tried 'feeding' spaces (many1 $ spaces *> ...) but nothing would parse then.
The following is the ADT:
data HVal
  = Integer Integer
  | Var String
  | Bool Bool
  | List [HVal]
  | Expr Operation [HVal]
  | Assign HVal HVal
  deriving (Show, Eq, Read)

data HStatement
  = Eval HVal -- Bridge between HVal and HStatement
  | Print HVal
  | Do HVal [HStatement]
  deriving (Show, Eq, Read)
parseDo was altered to the following:
parseDo :: Parser HStatement
parseDo = do
  string "Do"
  spaces
  string "("
  p <- try (parseExpr) <|> try (parseBool)
  string ")->"
  spaces
  q <- many1 $ parsersHStatement
  spaces
  string "Od"
  return $ Do p q
This allows for the parsing of two statements but the second statement breaks the loop.

After a lot of fiddling around, it seemed that the function parseEval was responsible for the error. It was changed to:
parseEval :: Parser HStatement
parseEval = do
  x <- try (parseAssign) <|> try (parseExpr)
  spaces
  return $ Eval x
Furthermore my parseDo function was changed to:
parseDo :: Parser HStatement
parseDo = do
  string "Do"
  spaces
  string "("
  p <- try (parseExpr) <|> try (parseBool)
  string ")->"
  spaces
  q <- many1 $ parsersHStatement
  spaces
  string "Od"
  return $ Do p q
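
The change to parseEval appears to work because each statement now consumes the whitespace (including newlines) that follows it, so many1 can start the next statement directly at its first character. The following is a minimal sketch of that idea with Parsec, not the code from the question: the Stmt type, the lexeme helper, and the simplified "Do->" header without a condition are all assumptions made to keep the example short.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical, simplified statement type (not the HStatement from the question).
data Stmt = Assign String Integer | Print String | Loop [Stmt]
  deriving (Show, Eq)

-- Run a parser, then skip any whitespace (including newlines) after it.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

-- (:= x 1)
assign :: Parser Stmt
assign = lexeme $ do
  _ <- string "(:="
  spaces
  v <- many1 letter
  spaces
  n <- many1 digit
  _ <- char ')'
  return (Assign v (read n))

-- print(x)
printStmt :: Parser Stmt
printStmt = lexeme $ Print <$> (string "print(" *> many1 letter <* char ')')

-- Do-> <statements> Od  (loop condition omitted in this sketch)
loop :: Parser Stmt
loop = lexeme $ do
  _ <- lexeme (string "Do->")
  body <- many1 stmt
  _ <- string "Od"
  return (Loop body)

stmt :: Parser Stmt
stmt = try assign <|> try printStmt <|> loop

parseProgram :: String -> Either ParseError [Stmt]
parseProgram = parse (spaces *> many1 stmt <* eof) ""
```

With this layout, a body written on separate lines such as "Do->\n(:= x 1)\nprint(x)\nOd" is handled by the same many1 stmt that handles statements written back to back.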

Related

How to parse a bool expression in Haskell

I am trying to parse a bool expression in Haskell. This line is giving me an error: BoolExpr <$> parseBoolOp <*> (n : ns). This is the error:
• Couldn't match type ‘[]’ with ‘Parser’
Expected type: Parser [Expr]
Actual type: [Expr]
-- define the expression types
data Expr
  = BoolExpr BoolOp [Expr]
  deriving (Show, Eq)

-- define the type for bool value
data Value
  = BoolVal Bool
  deriving (Show, Eq)

-- many x = Parser.some x <|> pure []
-- some x = (:) <$> x <*> Parser.many x
kstar :: Alternative f => f a -> f [a]
kstar x = kplus x <|> pure []

kplus :: Alternative f => f a -> f [a]
kplus x = (:) <$> x <*> kstar x

symbol :: String -> Parser String
symbol xs = token (string xs)

-- a bool expression is the operator followed by one or more expressions that we have to parse
-- TODO: add bool expressions
boolExpr :: Parser Expr
boolExpr = do
  n <- parseExpr
  ns <- kstar (symbol "," >> parseExpr)
  BoolExpr <$> parseBoolOp <*> (n : ns)

-- an atom is a literalExpr, which can be an actual literal or some other things
parseAtom :: Parser Expr
parseAtom =
  do
    literalExpr

-- the main parsing function which alternates between all the options you have
parseExpr :: Parser Expr
parseExpr =
  do
    parseAtom
  <|> parseParens boolExpr
  <|> parseParens parseExpr

-- implement parsing bool operations, these are 'and' and 'or'
parseBoolOp :: Parser BoolOp
parseBoolOp =
  do symbol "and" >> return And
    <|> do symbol "or" >> return Or
The boolExpr is expecting a Parser [Expr] but I am returning only an [Expr]. Is there a way to fix this or do it in another way? When I try pure (n:ns), evalStr "(and true (and false true) true)" returns Left (ParseError "'a' didn't match expected character") instead of Right (BoolVal False)
The expression (n : ns) is a list. Therefore the compiler thinks that the applicative operators <*> and <$> should be used in the context [], while you want Parser instead.
I would guess you need pure (n : ns) instead.
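
Another way to satisfy the types is to give <*> a parser on both sides instead of an already-parsed list. The sketch below uses Parsec rather than the custom Parser from the question, and it makes several assumptions: a Lit constructor for literals, operands separated by whitespace, and the operator parsed first (as in the "(and true ...)" input). Treat it as an illustration of the types lining up, not as the intended homework solution.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

data BoolOp = And | Or deriving (Show, Eq)

-- Simplified expression type for this sketch; the Lit constructor is an assumption.
data Expr = Lit Bool | BoolExpr BoolOp [Expr] deriving (Show, Eq)

-- Match a fixed string (without consuming input on failure), then skip whitespace.
symbol :: String -> Parser String
symbol s = try (string s) <* spaces

parseBoolOp :: Parser BoolOp
parseBoolOp = (And <$ symbol "and") <|> (Or <$ symbol "or")

literal :: Parser Expr
literal = (Lit True <$ symbol "true") <|> (Lit False <$ symbol "false")

-- Both arguments of <*> are parsers, so the whole expression lives in Parser.
-- The operator is parsed before its operands, matching prefix syntax.
boolExpr :: Parser Expr
boolExpr = BoolExpr <$> parseBoolOp <*> many1 parseExpr

parseExpr :: Parser Expr
parseExpr = literal <|> between (symbol "(") (symbol ")") boolExpr
```

Here many1 parseExpr has type Parser [Expr], which is what BoolExpr <$> parseBoolOp <*> ... expects as its second argument.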

Haskell : Operator Parser keeps going to undefined rather than inputs

I'm practicing writing parsers. I'm using Tsoding's JSON parser video as a reference. I'm trying to extend it to parse arithmetic of arbitrary length, and I have come up with the following AST.
data HVal
  = HInteger Integer -- No Support For Floats
  | HBool Bool
  | HNull
  | HString String
  | HChar Char
  | HList [HVal]
  | HObj [(String, HVal)]
  deriving (Show, Eq, Read)

data Op -- There's only one operator for the sake of brevity at the moment.
  = Add
  deriving (Show, Read)

newtype Parser a = Parser {
  runParser :: String -> Maybe (String, a)
}
The following functions are my attempt at implementing the operator parser.
ops :: [Char]
ops = ['+']

isOp :: Char -> Bool
isOp c = elem c ops

spanP :: (Char -> Bool) -> Parser String
spanP f = Parser $ \input -> let (token, rest) = span f input
                             in Just (rest, token)

opLiteral :: Parser String
opLiteral = spanP isOp

sOp :: String -> Op
sOp "+" = Add
sOp _ = undefined

parseOp :: Parser Op
parseOp = sOp <$> (charP '"' *> opLiteral <* charP '"')
The logic above is similar to how strings are parsed therefore my assumption was that the only difference was looking specifically for an operator rather than anything that's not a number between quotation marks. It does seemingly begin to parse correctly but it then gives me the following error:
λ > runParser parseOp "\"+\""
Just ("+\"",*** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:80:14 in base:GHC.Err
undefined, called at /DIRECTORY/parser.hs:110:11 in main:Main
I'm confused as to where the error is occurring. I assume it has to do with sOp, mainly because the other functions work as intended and the rest of parseOp is a direct translation of the parseString function:
stringLiteral :: Parser String
stringLiteral = spanP (/= '"')
parseString :: Parser HVal
parseString = HString <$> (charP '"' *> stringLiteral <* charP '"')
The only reason I have sOp is that if I replaced it with, say, Op, I would get an error that Op :: String -> Op doesn't exist. My expectation was that the string produced by the parser would be passed into this function, where I could return the appropriate operator. This appears to be incorrect, however, and I'm not sure how to proceed.
charP and Applicative Instance
charP :: Char -> Parser Char
charP x = Parser $ f
  where f (y:ys)
          | y == x = Just (ys, x)
          | otherwise = Nothing
        f [] = Nothing

instance Applicative Parser where
  pure x = Parser $ \input -> Just (input, x)
  (Parser p) <*> (Parser q) = Parser $ \input -> do
    (input', f) <- p input
    (input', a) <- q input
    Just (input', f a)
The implementation of (<*>) is the culprit. You did not use input' in the next call to q, but used input instead. As a result you pass the string to the next parser without "eating" characters. You can fix this with:
instance Applicative Parser where
  pure x = Parser $ \input -> Just (input, x)
  (Parser p) <*> (Parser q) = Parser $ \input -> do
    (input', f) <- p input
    (input'', a) <- q input'
    Just (input'', f a)
With the updated instance for Applicative, we get:
*Main> runParser parseOp "\"+\""
Just ("",Add)

Haskell : Non-Exhaustive Pattern In Function Prevents Another Function From Executing Even Though Its Not Used

I'm trying to add car, cdr, and cons to a toy language I'm writing. However, when I try to run my car function through main, I get the following error:
./parser "car [1 2 3]"
parser: parser.hs:(48,27)-(55,45): Non-exhaustive patterns in case
The function on lines 48-55 is the following:
parseOp :: Parser HVal
parseOp = (many1 letter <|> string "+" <|> string "-" <|> string "*" <|> string "/" <|> string "%" <|> string "&&" <|> string "||") >>=
          (\x -> return $ case x of
              "&&" -> Op And
              "||" -> Op Or
              "+" -> Op Add
              "-" -> Op Sub
              "*" -> Op Mult
              "/" -> Op Div
              "%" -> Op Mod)
I'm really unsure why the error message points to this function because it has nothing to do with the list functionality. The car function is working however because I was able to successfully execute it through GHCI. I know my problem is due to parsing but I don't see where it is. The following are the functions that relate to lists. I can't see from them how they are influenced by parseOp.
data HVal = Number Integer
          | String String
          | Boolean Bool
          | List [HVal]
          | Op Op
          | Expr Op HVal HVal
          | Car [HVal]
          deriving (Read)

car :: [HVal] -> HVal
car xs = head xs

parseListFunctions :: Parser HVal
parseListFunctions = do
  _ <- string "car "
  _ <- char '['
  x <- parseList
  _ <- char ']'
  return $ Car [x]

parseExpr :: Parser HVal
parseExpr = parseNumber
        <|> parseOp
        <|> parseBool
        <|> parseListFunctions
        <|> do
            _ <- char '['
            x <- parseList
            _ <- char ']'
            return x
        <|> do
            _ <- char '('
            x <- parseExpression
            _ <- char ')'
            return x

eval :: HVal -> HVal
eval val@(Number _) = val
eval val@(String _) = val
eval val@(Boolean _) = val
eval val@(List _) = val -- Look at list eval NOT WORKING
eval val@(Op _) = val
eval (Expr op x y) = eval $ evalExpr (eval x) op (eval y)
eval (Car xs) = eval $ car xs
The removal of many1 letter in parseOp transfers the same error to the following function parseBool:
parseBool :: Parser HVal
parseBool = many1 letter >>= (\x -> return $ case x of
    "True" -> Boolean True
    "False" -> Boolean False)
You write
parseExpr = ... <|> parseOp <|> ... <|> parseListFunctions <|> ...
and so the input car ... is passed to parseOp first, then parseListFunctions. The parseOp parser succeeds in the many1 letter branch, so in \x -> return $ case x of ..., x is bound to "car". Because parseOp succeeds (and returns an error value with an embedded, not-yet-evaluated non-exhaustive case error!), parseListFunctions is never tried.
You will need to modify your grammar to reduce the ambiguity in it, so that these conflicts where multiple branches may match do not arise.
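
One way to remove this particular conflict, sketched below with Parsec, is to let parseOp accept only the known operator strings, so that any other input is an ordinary parse failure and <|> moves on to the next alternative. The types here are simplified assumptions (Car holds plain integers, and parseCar is a hypothetical stand-in for parseListFunctions), so this is an illustration rather than a drop-in replacement.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

data Op = And | Or | Add | Sub | Mult | Div | Mod deriving (Show, Eq)

-- Simplified value type for this sketch (the question's HVal has more cases).
data HVal = Op Op | Car [Integer] deriving (Show, Eq)

-- Only the listed operator strings are accepted. Anything else, such as
-- the word "car", makes parseOp fail without consuming input.
parseOp :: Parser HVal
parseOp = Op <$> choice
  [ And  <$ try (string "&&")
  , Or   <$ try (string "||")
  , Add  <$ string "+"
  , Sub  <$ string "-"
  , Mult <$ string "*"
  , Div  <$ string "/"
  , Mod  <$ string "%"
  ]

-- Hypothetical stand-in for parseListFunctions: car [1 2 3]
parseCar :: Parser HVal
parseCar = do
  _ <- try (string "car ")
  xs <- between (char '[') (char ']') (many1 digit `sepBy` char ' ')
  return (Car (map read xs))

parseExpr :: Parser HVal
parseExpr = parseOp <|> parseCar
```

Because no branch returns a value containing a hidden pattern-match failure, an unrecognised word now shows up as a ParseError instead of a runtime exception.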

Is it possible to force backtrack all options?

I need to parse this syntax for function declaration
foo x = 1
Func "foo" (Ident "x") = 1
foo (x = 1) = 1
Func "foo" (Label "x" 1) = 1
foo x = y = 1
Func "foo" (Ident "x") = (Label "y" 1)
I have written this parser
module SimpleParser where

import Text.Parsec.String (Parser)
import Text.Parsec.Language (emptyDef)
import Text.Parsec
import qualified Text.Parsec.Token as Tok
import Text.Parsec.Char
import Prelude

lexer :: Tok.TokenParser ()
lexer = Tok.makeTokenParser style
  where
    style = emptyDef {
      Tok.identLetter = alphaNum
      }

parens :: Parser a -> Parser a
parens = Tok.parens lexer

commaSep :: Parser a -> Parser [a]
commaSep = Tok.commaSep1 lexer

commaSep1 :: Parser a -> Parser [a]
commaSep1 = Tok.commaSep1 lexer

identifier :: Parser String
identifier = Tok.identifier lexer

reservedOp :: String -> Parser ()
reservedOp = Tok.reservedOp lexer

data Expr = IntLit Int | Ident String | Label String Expr | Func String Expr Expr | ExprList [Expr] deriving (Eq, Ord, Show)

integer :: Parser Integer
integer = Tok.integer lexer

litInt :: Parser Expr
litInt = do
  n <- integer
  return $ IntLit (fromInteger n)

ident :: Parser Expr
ident = Ident <$> identifier

paramLabelItem = litInt <|> paramLabel

paramLabel :: Parser Expr
paramLabel = do
  lbl <- try (identifier <* reservedOp "=")
  body <- paramLabelItem
  return $ Label lbl body

paramItem :: Parser Expr
paramItem = parens paramRecord <|> litInt <|> try paramLabel <|> ident

paramRecord :: Parser Expr
paramRecord = ExprList <$> commaSep1 paramItem

func :: Parser Expr
func = do
  name <- identifier
  params <- paramRecord
  reservedOp "="
  body <- paramRecord
  return $ (Func name params body)

parseExpr :: String -> Either ParseError Expr
parseExpr s = parse func "" s
I can parse foo (x) = 1 but I cannot parse foo x = 1
parseExpr "foo x = 1"
Left (line 1, column 10):
unexpected end of input
expecting digit, "," or "="
I understand that it tries to parse this code as Func "foo" (Label "x" 1) and fails. But after that failure, why can't it try to parse it as Func "foo" (Ident "x") = 1?
Is there any way to do it?
Also I have tried to swap ident and paramLabel
paramItem = parens paramRecord <|> litInt <|> try paramLabel <|> ident
paramItem = parens paramRecord <|> litInt <|> try ident <|> paramLabel
In this case I can parse foo x = 1 but I cannot parse foo (x = 1) = 2
parseExpr "foo (x = 1) = 2"
Left (line 1, column 8):
unexpected "="
expecting "," or ")"
Here is how I understand how Parsec backtracking works (and doesn't work):
In:
(try a <|> try b <|> c)
if a fails, b will be tried, and if b subsequently fails c will be tried.
However, in:
(try a <|> try b <|> c) >> d
if a succeeds but d fails, Parsec does not go back and try b. Once a has succeeded, Parsec considers the whole choice parsed and moves on to d. It will never go back to try b or c.
This doesn't work either for the same reason:
(try (try a <|> try b <|> c)) >> d
Once a or b succeeds, the whole choice succeeds, and therefore the outer try succeeds. Parsing then moves on to d.
A solution is to distribute d to within the choice:
try (a >> d) <|> try (b >> d) <|> (c >> d)
Now if a succeeds but d fails, b >> d will be tried.
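
The difference can be seen in a small, self-contained sketch. The parsers a, b and d below are hypothetical stand-ins chosen so that a succeeds on the input "x=1" but leaves nothing that d can accept, while b followed by d matches the whole input.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- a and b both accept input starting with "x"; a is the greedier one.
a, b :: Parser String
a = string "x="
b = string "x"

d :: Parser String
d = string "=1"

-- Choice first, then d: once a has succeeded, b is never reconsidered,
-- so a failure in d fails the whole parser.
committed :: Parser (String, String)
committed = (,) <$> (try a <|> b) <*> d

-- d distributed into each branch: if a followed by d fails, try rewinds
-- the input and b followed by d is attempted from the start.
distributed :: Parser (String, String)
distributed = try ((,) <$> a <*> d) <|> ((,) <$> b <*> d)
```

On the input "x=1", parse committed "" "x=1" yields a Left, whereas parse distributed "" "x=1" yields Right ("x","=1").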

Writing Parser for S Expressions

I'm trying to write a Parser for S Expressions from Prof. Yorgey's 2013 homework.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }
Given the following definitions, presented in the homework:
type Ident = String

-- An "atom" is either an integer value or an identifier.
data Atom = N Integer | I Ident
  deriving Show

-- An S-expression is either an atom, or a list of S-expressions.
data SExpr = A Atom
           | Comb [SExpr]
  deriving Show
I wrote a parser for Parser Atom and Parser SExpr for A Atom.
parseAtom :: Parser Atom
parseAtom = alt n i
  where n = (\_ z -> N z) <$> spaces <*> posInt
        i = (\ _ z -> I z) <$> spaces <*> ident

parseAAtom :: Parser SExpr
parseAAtom = fmap (\x -> A x) parseAtom
Then, I attempted to write a parser to handle a Parser SExpr for the Comb ... case:
parseComb :: Parser SExpr
parseComb = (\_ _ x _ _ _ -> x) <$> (zeroOrMore spaces) <*> (char '(') <*>
            (alt parseAAtom parseComb) <*> (zeroOrMore spaces)
            <*> (char ')') <*> (zeroOrMore spaces)
Assuming that parseComb was right, I could simply make use of oneOrMore for Parser [SExpr].
parseCombElements :: Parser [SExpr]
parseCombElements = oneOrMore parseComb
So, my two last functions compile, but running runParser parseComb "( foo )" never terminates.
What's wrong with my parseComb definition? Please don't give me the whole answer, but rather a hint - for my own learning.
I am very suspicious of zeroOrMore spaces, because spaces is usually a parser which itself parses zero or more spaces. Which means that it can parse the empty string if there aren't any spaces at that point. In particular, the spaces parser always succeeds.
But when you apply zeroOrMore to a parser that always succeeds, the combined parser will never stop - because zeroOrMore only stops trying again once its parser argument fails.
As an aside, Applicative expressions like (\_ _ x _ _ _ -> x) <$> ... <*> ... <*> ... that keep the result of only one of the subparsers can usually be written more succinctly with the *> and <* combinators:
... *> ... *> x_parser_here <* ... <* ...
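
As a small illustration of that aside, here are the two styles side by side. The sketch uses Parsec instead of the homework's own Parser type, so char, spaces and many1 letter are Parsec's combinators and the parser names are made up for the example; note that Parsec's spaces already skips zero or more whitespace characters.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Lambda style: every subparser result is bound, but only x is kept.
parensLambda :: Parser String
parensLambda =
  (\_ _ x _ _ -> x) <$> char '(' <*> spaces <*> many1 letter <*> spaces <*> char ')'

-- Same parser with *> and <*: the arrows point at the result that is kept.
parensArrows :: Parser String
parensArrows = char '(' *> spaces *> many1 letter <* spaces <* char ')'
```

Both parsers accept an input such as "( foo )" and return the word between the parentheses.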

Resources