I have a file called Parser.hs and have defined methods for evaluating a boolean expression. In that file, I have the following:
-- implementing parsing bool operations, these are 'and' and 'or'
parseBoolOp :: Parser BoolOp
parseBoolOp =
do symbol "and" >> return And
<|> do symbol "or" >> return Or
-- a bool expression is the operator followed by one or more expressions that we have to parse
-- TODO: add bool expressions
boolExpr :: Parser Expr
boolExpr = BoolExpr <$> parseBoolOp <*> (pure <$> parseExpr)
-- the main parsing function which alternates between all the options you have
parseExpr :: Parser Expr
parseExpr =
do
parseAtom
<|> parseParens notExpr
<|> parseParens boolExpr
<|> parseParens parseExpr
In another file called Eval.hs, I have defined the following methods. I am trying to evaluate a bool expr using calcBoolList and evalListOfExprs:
-- call them whenever I want to generate an error
evalError :: ErrorT -> Evaluator a
evalError err = E (\_ -> Left err)
-- this evaluates a list of expressions and returns a list of values
-- by mapping an evaluator function (using <$>) over the list of expressions
evalListOfExprs :: ValueEnv -> [Expr] -> [Either ErrorT Value]
evalListOfExprs env exprs =
( \expr ->
case eval evalExpr (env, expr) of
Right (res, _) -> Right res
Left msg -> Left msg
)
<$> exprs
-- evaluates a bool expression, this first evaluates all the
-- arguments to the bool expression and then uses calcBoolList
-- to calculate the boolean operation over the arguments. Note that
-- first, I use evalListOfExprs to evaluate the arguments. Then
-- I used calcBoolList with the right op on it
evalBoolExpr :: Evaluator Value
evalBoolExpr = do
(env, BoolExpr op exprs) <- next
-- TODO: implement the rest!
case calcBoolList op (evalListOfExprs env exprs) of
Right v1 -> return v1
Left err -> evalError err
-- determine which bool operation to use to fold with by the kind of BoolOp passed in
calcBoolList :: BoolOp -> [Either ErrorT Value] -> Either ErrorT Value
calcBoolList op lst = case op of
And -> boolOpFold (&&) lst
Or -> boolOpFold (||) lst
-- parses the string then evaluates it
parseAndEval :: String -> Either ErrorT (Value, (ValueEnv, Expr))
parseAndEval str = do
(ast, _) <- parse parseExpr str
-- here, [] represents the empty environment
eval evalExpr ([], ast)
-- parseAndEvalEnv = parseAndEvalEnv [] -- could replace the code above with this <-- because we defined parseAndEvalEnv below
-- parseAndEvalEnv :: ValueEnv -> String -> Either ErrorT (Value, (ValueEnv, Expr))
-- parseAndEvalEnv env str = do
-- (ast, _) <- parse parseExpr str
-- -- here, [] represents the empty environment
-- eval evalExpr ([], ast)
-- extract the value from the result, which contains extra stuff we don't need to see
getValue :: Either ErrorT (Value, (ValueEnv, Expr)) -> Either ErrorT Value
getValue (Right (val, _)) = Right val
getValue (Left err) = Left err
-- takes a string and parses it, then it tries to evaluate it
evalStr :: String -> Either ErrorT Value
evalStr = getValue . parseAndEval
The expressions types and value types are contained in this file called Expr.hs:
-- define the operator types
data BoolOp = And | Or deriving (Show, Eq)
-- define the expression types
data Expr
= BoolExpr BoolOp [Expr]
deriving (Show, Eq)
-- define the type for values, which in our mini language
-- can be integers, bools, pairs, or closures
data Value
= BoolVal Bool
deriving (Show, Eq)
When I run evalStr "(and true (and false true) true)", which should return Right (BoolVal False), it instead returns Left (ParseError "'a' didn't match expected character"). Is my evaluator wrong or is it the boolExpr in my Parser file?
Your boolExpr doesn't address and x y. You write
boolExpr = BoolExpr <$> parseBoolOp <*> (pure <$> parseExpr)
which means to parse an operator, and then parse exactly one expression, and wrap that expression in a singleton list (here pure :: a -> [a]). You probably want something like
boolExpr = BoolExpr <$> parseBoolOp <*> parseExpr `sepBy1` spaces
to allow multiple subexpressions.
Related
I'm studying functional programming using Haskell language. And as an exercise I need to implement a function parsing a primitive arithmetic expression from String. The function must be able to handle double literals, operations +, -, *, / with the usual precedence and parentheses.
parseExpr :: String -> Except ParseError Expr
with next defined data types:
data ParseError = ErrorAtPos Natural
deriving Show
newtype Parser a = P (ExceptState ParseError (Natural, String) a)
deriving newtype (Functor, Applicative, Monad)
data Prim a
= Add a a
| Sub a a
| Mul a a
| Div a a
| Abs a
| Sgn a
deriving Show
data Expr
= Val Double
| Op (Prim Expr)
deriving Show
Where ExceptState is a modified State monad, allowing to throw exception pointing at the error position.
data Annotated e a = a :# e
deriving Show
infix 0 :#
data Except e a = Error e | Success a
deriving Show
data ExceptState e s a = ES { runES :: s -> Except e (Annotated s a) }
Also ExceptState has defined Functor, Applicative and Monad instances, which were thoroughly tested earlier, so I am positive in their correctness.
instance Functor (ExceptState e s) where
fmap func ES{runES = runner} = ES{runES = \s ->
case (runner s) of
Error err -> Error err
Success ans -> Success (mapAnnotated func $ ans) }
instance Applicative (ExceptState e s) where
pure arg = ES{runES = \s -> Success (arg :# s)}
p <*> q = Control.Monad.ap p q
instance Monad (ExceptState e s) where
m >>= f = joinExceptState (fmap f m)
where
joinExceptState :: ExceptState e s (ExceptState e s a) -> ExceptState e s a
joinExceptState ES{runES = runner} = ES{runES = \s ->
case (runner s) of
Error err -> Error err
Success (ES{runES = runner2} :# s2) ->
case (runner2 s2) of
Error err -> Error err
Success (res :# s3) -> Success (res :# s3) }
To implement the function parseExpr I used basic parser combinators:
pChar :: Parser Char
pChar = P $ ES $ \(pos, s) ->
case s of
[] -> Error (ErrorAtPos pos)
(c:cs) -> Success (c :# (pos + 1, cs))
parseError :: Parser a
parseError = P $ ES $ \(pos, _) -> Error (ErrorAtPos pos)
instance Alternative Parser where
empty = parseError
(<|>) (P(ES{runES = runnerP})) (P(ES{runES = runnerQ})) =
P $ ES $ \(pos, s) ->
case runnerP (pos, s) of
Error _ -> runnerQ (pos, s)
Success res -> Success res
instance MonadPlus Parser
which were used to construct more complex ones:
-- | elementary parser not consuming a character, failing if input doesn't
-- reach its end
pEof :: Parser ()
pEof = P $ ES $ \(pos, s) ->
case s of
[] -> Success (() :# (pos, []))
_ -> Error $ ErrorAtPos pos
-- | parses a single digit value
parseVal :: Parser Expr
parseVal = Val <$> (fromIntegral . digitToInt) <$> mfilter isDigit pChar
-- | parses an expression inside parenthises
pParenth :: Parser Expr
pParenth = do
void $ mfilter (== '(') pChar
expr <- parseAddSub
(void $ mfilter (== ')') pChar) <|> parseError
return expr
-- | parses the most prioritised operations
parseTerm :: Parser Expr
parseTerm = pParenth <|> parseVal
parseAddSub :: Parser Expr
parseAddSub = do
x <- parseTerm
ys <- many parseSecond
return $ foldl (\acc (sgn, y) -> Op $
(if sgn == '+' then Add else Sub) acc y) x ys
where
parseSecond :: Parser (Char, Expr)
parseSecond = do
sgn <- mfilter ((flip elem) "+-") pChar
y <- parseTerm <|> parseError
return (sgn, y)
-- | Parses the whole expression. Begins from parsing on +, - level and
-- successfully consuming the whole string.
pExpr :: Parser Expr
pExpr = do
expr <- parseAddSub
pEof
return expr
-- | More convinient way to run 'pExpr' parser
parseExpr :: String -> Except ParseError Expr
parseExpr = runP pExpr
As a result, at this point function works as intended if given String expression is valid:
ghci> parseExpr "(2+3)-1"
Success (Op (Sub (Op (Add (Val 2.0) (Val 3.0))) (Val 1.0)))
ghci> parseExpr "(2+3-1)-1"
Success (Op (Sub (Op (Sub (Op (Add (Val 2.0) (Val 3.0))) (Val 1.0))) (Val 1.0)))
Otherwise ErrorAtPos does not point at the necessary position:
ghci> parseExpr "(2+)-1"
Error (ErrorAtPos 1)
ghci> parseExpr "(2+3-)-1"
Error (ErrorAtPos 1)
What am I doing wrong here? Thank you in advance.
My main assumption was that something wrong was with function (<|>) of Alternative Parser and it incorrectly changed pos variable.
(<|>) (P(ES{runES = runnerP})) (P(ES{runES = runnerQ})) =
P $ ES $ \(pos, s) ->
case runnerP (pos, s) of
-- Error _ -> runnerQ (pos, s)
Error (ErrorAtPos pos') -> runnerQ (pos' + pos, s)
Success res -> Success res
But it led to more strange results:
ghci> parseExpr "(5+)-3"
Error (ErrorAtPos 84)
ghci> parseExpr "(5+2-)-3"
Error (ErrorAtPos 372)
Then more doubts were aimed at joinExceptState function of instance Monad (ExceptState e s) in spite of everything I've run it through, doubts that it wasn't working on s of (Natural, String) type as I indented in this case. But then I can't really change it for this concrete type only.
Excellent question, although it would have been even better if it really included all your code. I filled in the missing pieces:
mapAnnotated :: (a -> b) -> Annotated s a -> Annotated s b
mapAnnotated f (a :# e) = (f a) :# e
runP :: Parser a -> String -> Except ParseError a
runP (P (ES {runES = p})) s = case p (0, s) of
Error e -> Error e
Success (a :# e) -> Success a
Why is parseExpr "(5+)-3" equal to Error (ErrorAtPos 1)? Here's what happens: we call parseExpr which (ultimately) calls parseTerm which is just pParenth <|> parseVal. pParenth fails, of course, so we look at the definition of <|> to work out what to do. That definition says: if the thing on the left fails, try the thing on the right. So we try the thing on the right (i. e. parseVal), which also fails, and we report the second error, which is in fact at position 1.
To see this more clearly, you can just replace pParenth <|> parseVal with parseVal <|> pParenth and observe that you get ErrorAtPos 2 instead.
This is almost certainly not the behaviour you want. The documentation of Megaparsec's p <|> q, here, says:
If [parser] p fails without consuming any input, parser q is tried.
(emphasis in original, meaning: parser q is not tried in other cases). This is a more useful thing to do. If you got reasonably far trying to parse a parenthesised expression and then got an error, probably you want to report that error rather than complaining that '(' isn't a digit.
Since you say this is an exercise, I'm not going to tell you how to fix the problem. I'll tell you some other stuff, though.
First, this is not your only issue with error reporting. Above we see that parseVal "(1" reports an error at position 1 (after the problematic character, which is at position 0) whereas pParenth "(5+)-3" reports an error at position 2 (before the problematic character, which is at position 3). Ideally, both should give the position of the problematic character itself. (Of course, it'd be even better if the parser stated what character it expected, but that's more difficult to do.)
Second, the way I found the problem was to import Debug.Trace, replace your definition of pChar with
pChar :: Parser Char
pChar = P $ ES $ \(pos, s) -> traceShow (pos, s) $
case s of
[] -> Error (ErrorAtPos pos)
(c:cs) -> Success (c :# (pos + 1, cs))
and mull over the output for a bit. Debug.Trace is sometimes less useful than one hopes, because of lazy evaluation, but for a program like this it can help a lot.
Third, if you modify your definition of <|> to match Megaparsec's does, you might need Megaparsec's try combinator. (Not for the grammar you're trying to parse now, but maybe later.) try solves the issue that
(singleChar 'p' *> singleChar 'q') <|> (singleChar 'p' *> singleChar 'r')
fails on the string "pr" with Megaparsec's <|>.
Fourth, you sometimes write someParser <|> parseError, which I think is equivalent to someParser for both your definition of <|> and Megaparsec's.
Fifth, you don't need void; just ignore the result, it's the same thing.
Sixth, your Except seems to just be Either.
I am trying to parse a bool expression in Haskell. This line is giving me an error: BoolExpr <$> parseBoolOp <*> (n : ns). This is the error:
• Couldn't match type ‘[]’ with ‘Parser’
Expected type: Parser [Expr]
Actual type: [Expr]
-- define the expression types
data Expr
= BoolExpr BoolOp [Expr]
deriving (Show, Eq)
-- define the type for bool value
data Value
= BoolVal Bool
deriving (Show, Eq)
-- many x = Parser.some x <|> pure []
-- some x = (:) <$> x <*> Parser.many x
kstar :: Alternative f => f a -> f [a]
kstar x = kplus x <|> pure []
kplus :: Alternative f => f a -> f [a]
kplus x = (:) <$> x <*> kstar x
symbol :: String -> Parser String
symbol xs = token (string xs)
-- a bool expression is the operator followed by one or more expressions that we have to parse
-- TODO: add bool expressions
boolExpr :: Parser Expr
boolExpr = do
n <- parseExpr
ns <- kstar (symbol "," >> parseExpr)
BoolExpr <$> parseBoolOp <*> (n : ns)
-- an atom is a literalExpr, which can be an actual literal or some other things
parseAtom :: Parser Expr
parseAtom =
do
literalExpr
-- the main parsing function which alternates between all the options you have
parseExpr :: Parser Expr
parseExpr =
do
parseAtom
<|> parseParens boolExpr
<|> parseParens parseExpr
-- implement parsing bool operations, these are 'and' and 'or'
parseBoolOp :: Parser BoolOp
parseBoolOp =
do symbol "and" >> return And
<|> do symbol "or" >> return Or
The boolExpr is expecting a Parser [Expr] but I am returning only an [Expr]. Is there a way to fix this or do it in another way? When I try pure (n:ns), evalStr "(and true (and false true) true)" returns Left (ParseError "'a' didn't match expected character") instead of Right (BoolVal False)
The expression (n : ns) is a list. Therefore the compiler thinks that the applicative operators <*> and <$> should be used in the context [], while you want Parser instead.
I would guess you need pure (n : ns) instead.
I'm practicing writing parsers. I'm using Tsodings JSON Parser video as reference. I'm trying to add to it by being able to parse arithmetic of arbitrary length and I have come up with the following AST.
data HVal
= HInteger Integer -- No Support For Floats
| HBool Bool
| HNull
| HString String
| HChar Char
| HList [HVal]
| HObj [(String, HVal)]
deriving (Show, Eq, Read)
data Op -- There's only one operator for the sake of brevity at the moment.
= Add
deriving (Show, Read)
newtype Parser a = Parser {
runParser :: String -> Maybe (String, a)
}
The following functions is my attempt of implementing the operator parser.
ops :: [Char]
ops = ['+']
isOp :: Char -> Bool
isOp c = elem c ops
spanP :: (Char -> Bool) -> Parser String
spanP f = Parser $ \input -> let (token, rest) = span f input
in Just (rest, token)
opLiteral :: Parser String
opLiteral = spanP isOp
sOp :: String -> Op
sOp "+" = Add
sOp _ = undefined
parseOp :: Parser Op
parseOp = sOp <$> (charP '"' *> opLiteral <* charP '"')
The logic above is similar to how strings are parsed therefore my assumption was that the only difference was looking specifically for an operator rather than anything that's not a number between quotation marks. It does seemingly begin to parse correctly but it then gives me the following error:
λ > runParser parseOp "\"+\""
Just ("+\"",*** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:80:14 in base:GHC.Err
undefined, called at /DIRECTORY/parser.hs:110:11 in main:Main
I'm confused as to where the error is occurring. I'm assuming it's to do with sOp mainly due to how the other functions work as intended as the rest of parseOp being a translation of the parseString function:
stringLiteral :: Parser String
stringLiteral = spanP (/= '"')
parseString :: Parser HVal
parseString = HString <$> (charP '"' *> stringLiteral <* charP '"')
The only reason why I have sOp however is that if it was replaced with say Op, I would get the error that the following doesn't exist Op :: String -> Op. When I say this my inclination was that the string coming from the parsed expression would be passed into this function wherein I could return the appropriate operator. This however is incorrect and I'm not sure how to proceed.
charP and Applicative Instance
charP :: Char -> Parser Char
charP x = Parser $ f
where f (y:ys)
| y == x = Just (ys, x)
| otherwise = Nothing
f [] = Nothing
instance Applicative Parser where
pure x = Parser $ \input -> Just (input, x)
(Parser p) <*> (Parser q) = Parser $ \input -> do
(input', f) <- p input
(input', a) <- q input
Just (input', f a)
The implementation of (<*>) is the culprit. You did not use input' in the next call to q, but used input instead. As a result you pass the string to the next parser without "eating" characters. You can fix this with:
instance Applicative Parser where
pure x = Parser $ \input -> Just (input, x)
(Parser p) <*> (Parser q) = Parser $ \input -> do
(input', f) <- p input
(input'', a) <- q input'
Just (input'', f a)
With the updated instance for Applicative, we get:
*Main> runParser parseOp "\"+\""
Just ("",Add)
I'm trying to implement car, cdr, and cons functionality into a toy language I'm writing however when I try to execute my car function through main, I get the following error:
./parser "car [1 2 3]"
parser: parser.hs:(48,27)-(55,45): Non-exhaustive patterns in case
The function on lines 48-55 is the following:
parseOp :: Parser HVal
parseOp = (many1 letter <|> string "+" <|> string "-" <|> string "*" <|> string "/" <|> string "%" <|> string "&&" <|> string "||") >>=
(\x -> return $ case x of
"&&" -> Op And
"||" -> Op Or
"+" -> Op Add
"-" -> Op Sub
"*" -> Op Mult
"/" -> Op Div
"%" -> Op Mod)
I'm really unsure why the error message points to this function because it has nothing to do with the list functionality. The car function is working however because I was able to successfully execute it through GHCI. I know my problem is due to parsing but I don't see where it is. The following are the functions that relate to lists. I can't see from them how they are influenced by parseOp.
data HVal = Number Integer
| String String
| Boolean Bool
| List [HVal]
| Op Op
| Expr Op HVal HVal
| Car [HVal]
deriving (Read)
car :: [HVal] -> HVal
car xs = head xs
parseListFunctions :: Parser HVal
parseListFunctions = do
_ <- string "car "
_ <- char '['
x <- parseList
_ <- char ']'
return $ Car [x]
parseExpr :: Parser HVal
parseExpr = parseNumber
<|> parseOp
<|> parseBool
<|> parseListFunctions
<|> do
_ <- char '['
x <- parseList
_ <- char ']'
return x
<|> do
_ <- char '('
x <- parseExpression
_ <- char ')'
return x
eval :: HVal -> HVal
eval val#(Number _) = val
eval val#(String _) = val
eval val#(Boolean _) = val
eval val#(List _) = val -- Look at list eval NOT WORKING
eval val#(Op _) = val
eval (Expr op x y) = eval $ evalExpr (eval x) op (eval y)
eval (Car xs) = eval $ car xs
The removal of many1 letter in parseOp transfers the same error to the following function parseBool:
parseBool :: Parser HVal
parseBool = many1 letter >>= (\x -> return $ case x of
"True" -> Boolean True
"False" -> Boolean False)
You write
parseExpr = ... <|> parseOp <|> ... <|> parseListFunctions <|> ...
and so
car ...
is passed to parseOp first, then parseListFunctions. The parseOp parser succeeds in the
many1 letter
branch, and so in the \x -> return $ case x of ..., x is bound to "car". Because parseOp succeeds (and returns an error value with an embedded, not-yet-evaluated inexhaustive case error!), parseListFunctions is never tried.
You will need to modify your grammar to reduce the ambiguity in it, so that these conflicts where multiple branches may match do not arise.
I'm writing a simple language but have encountered a problem in that arithmetic expressions don't expect their operators despite the fact they're defined in the language. I have since gone back to express my program more simply. At the moment the only complicated expression that I have is arithmetic. I can parse integers, strings, booleans, and, lists. It can recognise expressions such as "5 Add 5" but only in the form of a list, "[5 Add 5]".
I suspect it doesn't parse when it's not in a list due to how atoms are parsed. An atom parser parses booleans and then an arbitrary atom. If the arithmetic expression is placed into a string, it will also be parsed properly.
I'm not sure why it's not evaluating correctly though. I have an evaluation function that deals with arithmetic expressions but it does not work when I try to give it arguments through the command line. When I load it into GHCI, I can use it perfectly so I know the arithmetic evaluation function works, but I'm failing to see why it's not evaluating when I give it command line arguments. Any arithmetic expression just results in the expression itself, not the value that should be returned.
The only reason I can think of is that when the expression is parsed, it isn't parsed as an arithmetic expression but instead it's parsed as a list or a string.
data HenryVal = Integer Integer
| Bool Bool
| Atom String
| String String
| List [HenryVal]
| ABinOp ABinOp
| Ad HenryVal HenryVal
| ABinary ABinOp HenryVal HenryVal
data ABinOp = Add
| Subtract
| Multiply
| Divide
deriving (Show)
evalABinOp :: HenryVal -> ABinOp -> HenryVal -> HenryVal
evalABinOp (Integer a) Add (Integer b) = Integer (a +b)
evalABinOp (Integer a) Multiply (Integer b) = Integer (a * b)
evalABinOp (Integer a) Divide (Integer b) = Integer (a `div` b)
evalABinOp (Integer a) Subtract (Integer b) = Integer (a - b)
eval :: HenryVal -> HenryVal
eval val#(Atom _) = val
eval val#(String _) = val
eval val#(Integer _) = val
eval val#(Bool _) = val
eval (List [Atom "quote", val]) = val
eval val#(List _) = val
eval (ABinary op x y) = evalABinOp (eval x) op (eval y)
spaces :: Parser ()
spaces = skipMany1 space
parseString :: Parser HenryVal
parseString = do
char '"'
x <- many (noneOf "\"")
char '"'
return $ String x
parseList :: Parser HenryVal
parseList = liftM List $ sepBy parseExpr spaces
parseNumber :: Parser HenryVal
parseNumber = liftM (Integer . read) $ many1 digit
parseAtom :: Parser HenryVal
parseAtom = do
first <- letter
rest <- many (letter <|> digit)
let atom = first:rest
return $ case atom of
"True" -> Bool True
"False" -> Bool False
_ -> Atom atom
parseExpr :: Parser HenryVal
parseExpr = parseNumber <|>
parseAtom <|>
parseString <|>
do
_ <- char '['
x <- try parseList
_ <- char ']'
return x
readExpr :: String -> HenryVal
readExpr input = case parse parseExpr "Henry" input of
Left error -> String $ "No Match: " ++ show error
Right val -> val
main :: IO ()
main = getArgs >>= print . eval . readExpr . head