Invalid exception messages from parser combinators in Haskell - parsing

I'm studying functional programming using Haskell. As an exercise, I need to implement a function that parses a primitive arithmetic expression from a String. The function must handle double literals, the operations +, -, *, / with the usual precedence, and parentheses.
parseExpr :: String -> Except ParseError Expr
with the following data types:
data ParseError = ErrorAtPos Natural
deriving Show
newtype Parser a = P (ExceptState ParseError (Natural, String) a)
deriving newtype (Functor, Applicative, Monad)
data Prim a
= Add a a
| Sub a a
| Mul a a
| Div a a
| Abs a
| Sgn a
deriving Show
data Expr
= Val Double
| Op (Prim Expr)
deriving Show
Here ExceptState is a modified State monad that can throw an exception pointing at the error position.
data Annotated e a = a :# e
deriving Show
infix 0 :#
data Except e a = Error e | Success a
deriving Show
data ExceptState e s a = ES { runES :: s -> Except e (Annotated s a) }
ExceptState also has Functor, Applicative and Monad instances, which were thoroughly tested earlier, so I am confident they are correct.
instance Functor (ExceptState e s) where
fmap func ES{runES = runner} = ES{runES = \s ->
case (runner s) of
Error err -> Error err
Success ans -> Success (mapAnnotated func $ ans) }
instance Applicative (ExceptState e s) where
pure arg = ES{runES = \s -> Success (arg :# s)}
p <*> q = Control.Monad.ap p q
instance Monad (ExceptState e s) where
m >>= f = joinExceptState (fmap f m)
where
joinExceptState :: ExceptState e s (ExceptState e s a) -> ExceptState e s a
joinExceptState ES{runES = runner} = ES{runES = \s ->
case (runner s) of
Error err -> Error err
Success (ES{runES = runner2} :# s2) ->
case (runner2 s2) of
Error err -> Error err
Success (res :# s3) -> Success (res :# s3) }
To implement the function parseExpr I used basic parser combinators:
pChar :: Parser Char
pChar = P $ ES $ \(pos, s) ->
case s of
[] -> Error (ErrorAtPos pos)
(c:cs) -> Success (c :# (pos + 1, cs))
parseError :: Parser a
parseError = P $ ES $ \(pos, _) -> Error (ErrorAtPos pos)
instance Alternative Parser where
empty = parseError
(<|>) (P(ES{runES = runnerP})) (P(ES{runES = runnerQ})) =
P $ ES $ \(pos, s) ->
case runnerP (pos, s) of
Error _ -> runnerQ (pos, s)
Success res -> Success res
instance MonadPlus Parser
which were used to construct more complex ones:
-- | elementary parser not consuming a character, failing if input doesn't
-- reach its end
pEof :: Parser ()
pEof = P $ ES $ \(pos, s) ->
case s of
[] -> Success (() :# (pos, []))
_ -> Error $ ErrorAtPos pos
-- | parses a single digit value
parseVal :: Parser Expr
parseVal = Val <$> (fromIntegral . digitToInt) <$> mfilter isDigit pChar
-- | parses an expression inside parentheses
pParenth :: Parser Expr
pParenth = do
void $ mfilter (== '(') pChar
expr <- parseAddSub
(void $ mfilter (== ')') pChar) <|> parseError
return expr
-- | parses the most prioritised operations
parseTerm :: Parser Expr
parseTerm = pParenth <|> parseVal
parseAddSub :: Parser Expr
parseAddSub = do
x <- parseTerm
ys <- many parseSecond
return $ foldl (\acc (sgn, y) -> Op $
(if sgn == '+' then Add else Sub) acc y) x ys
where
parseSecond :: Parser (Char, Expr)
parseSecond = do
sgn <- mfilter ((flip elem) "+-") pChar
y <- parseTerm <|> parseError
return (sgn, y)
-- | Parses the whole expression. Begins from parsing on +, - level and
-- successfully consuming the whole string.
pExpr :: Parser Expr
pExpr = do
expr <- parseAddSub
pEof
return expr
-- | More convenient way to run 'pExpr' parser
parseExpr :: String -> Except ParseError Expr
parseExpr = runP pExpr
At this point the function works as intended if the given String expression is valid:
ghci> parseExpr "(2+3)-1"
Success (Op (Sub (Op (Add (Val 2.0) (Val 3.0))) (Val 1.0)))
ghci> parseExpr "(2+3-1)-1"
Success (Op (Sub (Op (Sub (Op (Add (Val 2.0) (Val 3.0))) (Val 1.0))) (Val 1.0)))
Otherwise, ErrorAtPos does not point at the expected position:
ghci> parseExpr "(2+)-1"
Error (ErrorAtPos 1)
ghci> parseExpr "(2+3-)-1"
Error (ErrorAtPos 1)
What am I doing wrong here? Thank you in advance.
My main assumption was that something was wrong with the (<|>) function of Alternative Parser and that it changed the pos variable incorrectly, so I tried this:
(<|>) (P(ES{runES = runnerP})) (P(ES{runES = runnerQ})) =
P $ ES $ \(pos, s) ->
case runnerP (pos, s) of
-- Error _ -> runnerQ (pos, s)
Error (ErrorAtPos pos') -> runnerQ (pos' + pos, s)
Success res -> Success res
But that led to even stranger results:
ghci> parseExpr "(5+)-3"
Error (ErrorAtPos 84)
ghci> parseExpr "(5+2-)-3"
Error (ErrorAtPos 372)
Then I began to doubt the joinExceptState function of instance Monad (ExceptState e s), despite all the tests I've run it through: perhaps it isn't working on s of type (Natural, String) as I intended in this case. But I can't really change it for this concrete type only.

Excellent question, although it would have been even better if it really included all your code. I filled in the missing pieces:
mapAnnotated :: (a -> b) -> Annotated s a -> Annotated s b
mapAnnotated f (a :# e) = (f a) :# e
runP :: Parser a -> String -> Except ParseError a
runP (P (ES {runES = p})) s = case p (0, s) of
Error e -> Error e
Success (a :# e) -> Success a
Why is parseExpr "(5+)-3" equal to Error (ErrorAtPos 1)? Here's what happens: we call parseExpr, which (ultimately) calls parseTerm, which is just pParenth <|> parseVal. pParenth fails, of course, so we look at the definition of <|> to work out what to do. That definition says: if the thing on the left fails, try the thing on the right. So we try the thing on the right (i.e. parseVal), which also fails, and we report the second error, which is in fact at position 1.
To see this more clearly, you can just replace pParenth <|> parseVal with parseVal <|> pParenth and observe that you get ErrorAtPos 2 instead.
This is almost certainly not the behaviour you want. The documentation of Megaparsec's p <|> q says:
If [parser] p fails without consuming any input, parser q is tried.
(emphasis in original, meaning: parser q is not tried in other cases). This is a more useful thing to do. If you got reasonably far trying to parse a parenthesised expression and then got an error, probably you want to report that error rather than complaining that '(' isn't a digit.
Since you say this is an exercise, I'm not going to tell you how to fix the problem. I'll tell you some other stuff, though.
First, this is not your only issue with error reporting. Above we see that parseVal "(1" reports an error at position 1 (after the problematic character, which is at position 0) whereas pParenth "(5+)-3" reports an error at position 2 (before the problematic character, which is at position 3). Ideally, both should give the position of the problematic character itself. (Of course, it'd be even better if the parser stated what character it expected, but that's more difficult to do.)
Second, the way I found the problem was to import Debug.Trace, replace your definition of pChar with
pChar :: Parser Char
pChar = P $ ES $ \(pos, s) -> traceShow (pos, s) $
case s of
[] -> Error (ErrorAtPos pos)
(c:cs) -> Success (c :# (pos + 1, cs))
and mull over the output for a bit. Debug.Trace is sometimes less useful than one hopes, because of lazy evaluation, but for a program like this it can help a lot.
Third, if you modify your definition of <|> to match Megaparsec's, you might need Megaparsec's try combinator. (Not for the grammar you're trying to parse now, but maybe later.) try solves the issue that
(singleChar 'p' *> singleChar 'q') <|> (singleChar 'p' *> singleChar 'r')
fails on the string "pr" with Megaparsec's <|>.
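The committed-choice behaviour and try can be sketched on a small standalone parser. This is only an illustration under stated assumptions: the type P, runToy, and the definitions of singleChar and try below are hypothetical stand-ins, not the question's Parser and not Megaparsec's implementation. Here "consumed input" is approximated by checking whether the error position differs from the starting position.

```haskell
import Control.Applicative (Alternative (..))

-- Hypothetical toy parser (not the question's Parser type):
-- the state is (position, remaining input); a failure carries its position.
newtype P a = P { runToy :: (Int, String) -> Either Int (a, (Int, String)) }

instance Functor P where
  fmap f (P p) = P $ \st -> fmap (\(a, st') -> (f a, st')) (p st)

instance Applicative P where
  pure a = P $ \st -> Right (a, st)
  P pf <*> P pa = P $ \st -> do
    (f, st')  <- pf st
    (a, st'') <- pa st'
    Right (f a, st'')

instance Alternative P where
  empty = P $ \(pos, _) -> Left pos
  -- Try q only if p failed at the position where it started,
  -- i.e. without consuming input; otherwise keep p's error.
  P p <|> P q = P $ \st@(pos, _) ->
    case p st of
      Left errPos | errPos == pos -> q st
      other                       -> other

-- Fails at the offending character itself, without consuming it.
singleChar :: Char -> P Char
singleChar c = P $ \(pos, s) -> case s of
  (x:xs) | x == c -> Right (x, (pos + 1, xs))
  _               -> Left pos

-- Backtracking: make a failure of p look as if nothing was consumed.
try :: P a -> P a
try (P p) = P $ \st@(pos, _) -> either (const (Left pos)) Right (p st)

pq, pr :: P Char
pq = singleChar 'p' *> singleChar 'q'
pr = singleChar 'p' *> singleChar 'r'
```

With these definitions, runToy (pq <|> pr) (0, "pr") reports the error from the first branch at position 1, while runToy (try pq <|> pr) (0, "pr") falls through to the second branch and succeeds.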
Fourth, you sometimes write someParser <|> parseError, which I think is equivalent to someParser for both your definition of <|> and Megaparsec's.
Fifth, you don't need void; just ignore the result, it's the same thing.
Sixth, your Except seems to just be Either.

Related

Adding error handling to function - Haskell

I am completely new to Haskell and have seen examples online of how to add error handling, but I'm not sure how to incorporate it in my context. Below is an example of the code, which works, from before I tried to handle errors.
expr'::Parser Double
expr' = term' `chainl1` addop
term'::Parser Double
term' = factor' `chainl1` mulop
chainl :: Parser a -> Parser (a -> a -> a) -> a -> Parser a
chainl p op a = (p `chainl1` op) <|> pure a
chainl1 ::Parser a -> Parser (a -> a -> a) -> Parser a
chainl1 p op = p >>= rest
where
rest a = (do
f <- op
b <- p
rest (f a b)) <|> pure a
addop, mulop :: Parser (Double -> Double -> Double)
I've since expanded this to let addop and mulop return error messages if something irregular is found. This causes the type signature to change to:
addop, mulop :: Parser (Either String (Double -> Double -> Double))
In other programming languages I would check if f <- op is a String and return the string. However I'm not sure how to go about this in Haskell. The idea is that this error message returns all the way back to term'. Hence its function definition also needs to change eventually. This is all in the attempt to build a Monadic Parser.
If you're using parsec then you can make your code more general to work with the ParsecT monad transformer:
import Text.Parsec hiding (chainl1)
import Control.Monad.Trans.Class (lift)
expr' :: ParsecT String () (Either String) Double
expr' = term' `chainl1` addop
term' :: ParsecT String () (Either String) Double
term' = factor' `chainl1` mulop
factor' :: ParsecT String () (Either String) Double
factor' = read <$> many1 digit
chainl1 :: Monad m => ParsecT s u m a -> ParsecT s u m (a -> a -> a) -> ParsecT s u m a
chainl1 p op = p >>= rest
where
rest a = (do
f <- op
b <- p
rest (f a b))
<|> pure a
addop, mulop :: ParsecT String () (Either String) (Double -> Double -> Double)
addop = (+) <$ char '+' <|> (-) <$ char '-'
mulop = ((*) <$ char '*' <* lift (Left "error")) <|> (/) <$ char '/' <|> (**) <$ char '^'
I don't know what kind of errors you would want to return, so I've just made an error if an '*' is encountered in the input.
You can run the parser like this:
ghci> runParserT (expr' <* eof) () "buffer" "1+2+3"
Right (Right 6.0)
ghci> runParserT (expr' <* eof) () "buffer" "1+2*3"
Left "error"
This answer is based on the parsec implementation.
Actually, the operator <|> is what you need. It handles any parsing errors. In the expression a <|> b, if the parser a fails then the parser b is run (except if the parser a consumes some input before failing; to handle that case you can use the try combinator, like this: try a <|> b).
But if you want to handle an error depending on its kind, then you should do as @Noughtmare answered. In that case I recommend the following:
Define your own type for errors. Handling errors will then be less bug-prone.
data MyError
= ME_DivByZero
| ...
You can simplify the type signatures if you define a type alias for your parser.
type MyParser = ParsecT String () (Either MyError)
Then the signatures will look like this:
expr' :: MyParser Double
addop, mulop :: MyParser (Double -> Double -> Double)
Use throwError to throw your errors and catchError to handle them; that is more idiomatic. It looks like this:
f <- catchError op $ \case
ME_DivByZero -> ...
ME_... -> ...
err -> throwError err -- rethrow error

Haskell : Operator Parser keeps going to undefined rather than inputs

I'm practicing writing parsers, using Tsoding's JSON Parser video as a reference. I'm trying to extend it to parse arithmetic of arbitrary length, and I have come up with the following AST.
data HVal
= HInteger Integer -- No Support For Floats
| HBool Bool
| HNull
| HString String
| HChar Char
| HList [HVal]
| HObj [(String, HVal)]
deriving (Show, Eq, Read)
data Op -- There's only one operator for the sake of brevity at the moment.
= Add
deriving (Show, Read)
newtype Parser a = Parser {
runParser :: String -> Maybe (String, a)
}
The following functions are my attempt at implementing the operator parser.
ops :: [Char]
ops = ['+']
isOp :: Char -> Bool
isOp c = elem c ops
spanP :: (Char -> Bool) -> Parser String
spanP f = Parser $ \input -> let (token, rest) = span f input
in Just (rest, token)
opLiteral :: Parser String
opLiteral = spanP isOp
sOp :: String -> Op
sOp "+" = Add
sOp _ = undefined
parseOp :: Parser Op
parseOp = sOp <$> (charP '"' *> opLiteral <* charP '"')
The logic above is similar to how strings are parsed therefore my assumption was that the only difference was looking specifically for an operator rather than anything that's not a number between quotation marks. It does seemingly begin to parse correctly but it then gives me the following error:
λ > runParser parseOp "\"+\""
Just ("+\"",*** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:80:14 in base:GHC.Err
undefined, called at /DIRECTORY/parser.hs:110:11 in main:Main
I'm confused as to where the error is occurring. I assume it has to do with sOp, mainly because the other functions work as intended and the rest of parseOp is a translation of the parseString function:
stringLiteral :: Parser String
stringLiteral = spanP (/= '"')
parseString :: Parser HVal
parseString = HString <$> (charP '"' *> stringLiteral <* charP '"')
The only reason I have sOp, however, is that if I replaced it with, say, Op, I would get an error that Op :: String -> Op doesn't exist. My intention was that the string coming from the parsed expression would be passed into this function, where I could return the appropriate operator. This, however, is incorrect and I'm not sure how to proceed.
charP and Applicative Instance
charP :: Char -> Parser Char
charP x = Parser $ f
where f (y:ys)
| y == x = Just (ys, x)
| otherwise = Nothing
f [] = Nothing
instance Applicative Parser where
pure x = Parser $ \input -> Just (input, x)
(Parser p) <*> (Parser q) = Parser $ \input -> do
(input', f) <- p input
(input', a) <- q input
Just (input', f a)
The implementation of (<*>) is the culprit. You did not use input' in the next call to q, but used input instead. As a result you pass the string to the next parser without "eating" characters. You can fix this with:
instance Applicative Parser where
pure x = Parser $ \input -> Just (input, x)
(Parser p) <*> (Parser q) = Parser $ \input -> do
(input', f) <- p input
(input'', a) <- q input'
Just (input'', f a)
With the updated instance for Applicative, we get:
*Main> runParser parseOp "\"+\""
Just ("",Add)
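To see why the original instance produced Just ("+\"", undefined), the two versions of (<*>) can be compared side by side as ordinary functions. This is a sketch under assumptions: the Functor instance is not shown in the question and is assumed here, and apBuggy, apFixed, keepMiddle, quotedBuggy and quotedFixed are hypothetical names introduced only for illustration. The quoted parsers mirror charP '"' *> opLiteral <* charP '"' without sOp, so the token that sOp would have received is visible.

```haskell
newtype Parser a = Parser { runParser :: String -> Maybe (String, a) }

-- Assumed Functor instance; the question does not show one.
instance Functor Parser where
  fmap f (Parser p) = Parser $ \input ->
    fmap (\(rest, a) -> (rest, f a)) (p input)

charP :: Char -> Parser Char
charP x = Parser f
  where
    f (y:ys) | y == x = Just (ys, x)
    f _               = Nothing

opLiteral :: Parser String
opLiteral = Parser $ \input ->
  let (token, rest) = span (`elem` "+") input in Just (rest, token)

-- The question's (<*>): q is run on the original input,
-- not on what p left over.
apBuggy :: Parser (a -> b) -> Parser a -> Parser b
apBuggy (Parser p) (Parser q) = Parser $ \input -> do
  (_, f)      <- p input
  (input', a) <- q input
  Just (input', f a)

-- The corrected (<*>): q continues from where p stopped.
apFixed :: Parser (a -> b) -> Parser a -> Parser b
apFixed (Parser p) (Parser q) = Parser $ \input -> do
  (input', f)  <- p input
  (input'', a) <- q input'
  Just (input'', f a)

-- Keep only the middle of three results.
keepMiddle :: a -> b -> c -> b
keepMiddle _ m _ = m

quotedBuggy, quotedFixed :: Parser String
quotedBuggy = (keepMiddle <$> charP '"') `apBuggy` opLiteral `apBuggy` charP '"'
quotedFixed = (keepMiddle <$> charP '"') `apFixed` opLiteral `apFixed` charP '"'
```

In the buggy version opLiteral sees the opening quote first, so span returns an empty token; that empty string is what reached sOp and hit the undefined case. The leftover input "+\"" comes from the last charP, which also ran on the original string.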

Combining parsers in Haskell

I'm given the following parsers
newtype Parser a = Parser { parse :: String -> Maybe (a,String) }
instance Functor Parser where
fmap f p = Parser $ \s -> (\(a,c) -> (f a, c)) <$> parse p s
instance Applicative Parser where
pure a = Parser $ \s -> Just (a,s)
f <*> a = Parser $ \s ->
case parse f s of
Just (g,s') -> parse (fmap g a) s'
Nothing -> Nothing
instance Alternative Parser where
empty = Parser $ \s -> Nothing
l <|> r = Parser $ \s -> parse l s <|> parse r s
ensure :: (a -> Bool) -> Parser a -> Parser a
ensure p parser = Parser $ \s ->
case parse parser s of
Nothing -> Nothing
Just (a,s') -> if p a then Just (a,s') else Nothing
lookahead :: Parser (Maybe Char)
lookahead = Parser f
where f [] = Just (Nothing,[])
f (c:s) = Just (Just c,c:s)
satisfy :: (Char -> Bool) -> Parser Char
satisfy p = Parser f
where f [] = Nothing
f (x:xs) = if p x then Just (x,xs) else Nothing
eof :: Parser ()
eof = Parser $ \s -> if null s then Just ((),[]) else Nothing
eof' :: Parser ()
eof' = ???
I need to write a new parser eof' that does exactly what eof does but is built using only the given parsers and the Functor/Applicative/Alternative instances above. I'm stuck on this as I don't have experience in combining parsers. Can anyone help me out?
To understand it easier, we can write it in an equational pseudocode, while we substitute and simplify the definitions, using Monad Comprehensions for clarity and succinctness.
Monad Comprehensions are just like List Comprehensions, only working for any MonadPlus type, not just []; while corresponding closely to do notation, e.g. [ (f a, s') | (a, s') <- parse p s ] === do { (a, s') <- parse p s ; return (f a, s') }.
This gets us:
newtype Parser a = Parser { parse :: String -> Maybe (a,String) }
instance Functor Parser where
parse (fmap f p) s = [ (f a, s') | (a, s') <- parse p s ]
instance Applicative Parser where
parse (pure a) s = pure (a, s)
parse (pf <*> pa) s = [ (g a, s'') | (g, s') <- parse pf s
, (a, s'') <- parse pa s' ]
instance Alternative Parser where
parse empty s = empty
parse (l <|> r) s = parse l s <|> parse r s
ensure :: (a -> Bool) -> Parser a -> Parser a
parse (ensure pred p) s = [ (a, s') | (a, s') <- parse p s, pred a ]
lookahead :: Parser (Maybe Char)
parse lookahead [] = pure (Nothing, [])
parse lookahead s@(c:_) = pure (Just c, s )
satisfy :: (Char -> Bool) -> Parser Char
parse (satisfy p) [] = mzero
parse (satisfy p) (x:xs) = [ (x, xs) | p x ]
eof :: Parser ()
parse eof s = [ ((), []) | null s ]
eof' :: Parser ()
eof' = ???
By the way thanks to the use of Monad Comprehensions and the more abstract pure, empty and mzero instead of their concrete representations in terms of the Maybe type, this same (pseudo-)code will work with a different type, like [] in place of Maybe, viz. newtype Parser a = Parser { parse :: String -> [(a,String)] }.
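The comprehension notation used in the pseudocode above is available as real Haskell through GHC's MonadComprehensions extension. As a minimal sketch, assuming that extension is enabled, here are ensure and satisfy from the question rewritten with comprehensions over Maybe; the behaviour is meant to match the original case-based definitions.

```haskell
{-# LANGUAGE MonadComprehensions #-}

newtype Parser a = Parser { parse :: String -> Maybe (a, String) }

-- ensure, written as a comprehension over Maybe instead of a case expression:
-- run the parser, then keep the result only if the predicate holds.
ensure :: (a -> Bool) -> Parser a -> Parser a
ensure p parser = Parser $ \s -> [ (a, s') | (a, s') <- parse parser s, p a ]

-- satisfy: the guard-only comprehension yields Nothing when p x is False.
satisfy :: (Char -> Bool) -> Parser Char
satisfy p = Parser f
  where
    f []     = Nothing
    f (x:xs) = [ (x, xs) | p x ]
```

For example, parse (ensure (== 'a') (satisfy (const True))) "abc" succeeds with 'a' and the remaining "bc", while the same parser on "xbc" gives Nothing.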
So we have
ensure :: (a -> Bool) -> Parser a -> Parser a
lookahead :: Parser (Maybe Char)
(satisfy is no good for us here .... why?)
Using that, we can have
ensure ....... ...... :: Parser (Maybe Char)
(... what does ensure id (pure False) do? ...)
but we'll have a useless Nothing result in case the input string was in fact empty, whereas the eof parser given to us produces () as its result in that case (and otherwise it produces nothing).
No fear, we also have
fmap :: ( a -> b ) -> Parser a -> Parser b
which can transform the Nothing into () for us. We'll need a function that will always do this for us,
alwaysUnit nothing = ()
which we can use now to arrive at the solution:
eof' = fmap ..... (..... ..... ......)

Cascading Parsers in Haskell

My parser type is
newtype Parser a = Parser { parse :: String -> Maybe (a,String) }
I have two parsers :
1) a = (satisfy isAlpha) that knows how to match the first alphabetic character in a string.
Running parse a "k345" gives Just ('k',"345")
2) b = many (satisfy isDigit) that knows how to match any number of digits. Running parse b "1234 abc" gives Just ("1234"," abc")
Now I want to combine those two parsers and match a single alphabetic character followed by any number of digits.
I tried:
parse (a *> b) "k1234 7" and got Just ("1234"," 7"). It looks like the 'k' matched by the first parser a is gone from the output. How do I fix this problem?
Thanks!
For a toy parser, look at the following code:
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE UndecidableInstances #-}
module Parse where
import Data.Char
import Data.List
newtype Parser a = Parser
{ parse :: String -> Maybe (a, String) }
satisfy :: (Char -> Bool) -> Parser Char
satisfy cond = Parser $ \s ->
case s of
"" -> Nothing
(c:cs) -> if cond c then Just (c, cs) else Nothing
many :: Parser a -> Parser [a]
many p = Parser $ \s ->
case parse p s of
Nothing -> Just ([], s)
Just (c, cs) -> let Just (cc, cs') = parse (many p) cs
in Just (c:cc, cs')
string :: String -> Parser String
string str = Parser $ \s -> if isPrefixOf str s
then Just (str, drop (length str) s)
else Nothing
instance Functor Parser where
fmap f (Parser g) = Parser $ \s ->
case g s of
Nothing -> Nothing
Just (r, remain) -> Just (f r, remain)
instance Applicative Parser where
pure a = Parser $ \s -> Just (a, s)
-- (<*>) :: f (a -> b) -> f a -> f b
(Parser f) <*> (Parser g) = Parser $ \s ->
case f s of
Nothing -> Nothing
Just (ab, remain) -> case g remain of
Nothing -> Nothing
Just (r, remain1) -> Just (ab r, remain1)
instance Semigroup a => Semigroup (Parser a) where
(Parser p1) <> (Parser p2) = Parser $ \s ->
case p1 s of
Nothing -> Nothing
Just (r1, s1) -> case p2 s1 of
Nothing -> Nothing
Just (r2, s2) -> Just (r1 <> r2, s2)
instance (Monoid a, Semigroup (Parser a))=> Monoid (Parser a) where
mempty = Parser $ \s -> Just (mempty, s)
mappend = (<>)
a = satisfy isAlpha
b = many (satisfy isDigit)
λ> parse a "k345"
Just ('k',"345")
λ> parse b "12345 abc"
Just ("12345"," abc")
λ> parse (a *> b) "k1234 7"
Just ("1234"," 7")
λ> parse (string "k" <> b) "k1234 7"
Just ("k1234"," 7")
So maybe you should find some tutorials and get familiar with Functor, Applicative, and Monad. As you can see, you can implement a Monoid instance for your Parser type, and then use (<>) to combine your parsed results.
It looks like this is working fine :
parse (fmap (:) (satisfy isAlpha) <*> many (satisfy isDigit)) "k1234 7"
And gives back what I wanted
Just ("k1234"," 7")

Using monad for simple Haskell parser

TL;DR
I'm trying to understand how this:
satisfy :: (Char -> Bool) -> Parser Char
satisfy pred = PsrOf p
where
p (c:cs) | pred c = Just (cs, c)
p _ = Nothing
Is equivalent to this:
satisfy :: (Char -> Bool) -> Parser Char
satisfy pred = do
c <- anyChar
if pred c then return c else empty
Context
This is a snippet from some lecture notes on Haskell parsing, which I'm trying to understand:
import Control.Applicative
import Data.Char
import Data.Functor
import Data.List
newtype Parser a = PsrOf (String -> Maybe (String, a))
-- Function from input string to:
--
-- * Nothing, if failure (syntax error);
-- * Just (unconsumed input, answer), if success.
dePsr :: Parser a -> String -> Maybe (String, a)
dePsr (PsrOf p) = p
-- Monadic Parsing in Haskell uses [] instead of Maybe to support ambiguous
-- grammars and multiple answers.
-- | Use a parser on an input string.
runParser :: Parser a -> String -> Maybe a
runParser (PsrOf p) inp = case p inp of
Nothing -> Nothing
Just (_, a) -> Just a
-- OR: fmap (\(_,a) -> a) (p inp)
-- | Read a character and return. Failure if input is empty.
anyChar :: Parser Char
anyChar = PsrOf p
where
p "" = Nothing
p (c:cs) = Just (cs, c)
-- | Read a character and check against the given character.
char :: Char -> Parser Char
-- char wanted = PsrOf p
-- where
-- p (c:cs) | c == wanted = Just (cs, c)
-- p _ = Nothing
char wanted = satisfy (\c -> c == wanted) -- (== wanted)
-- | Read a character and check against the given predicate.
satisfy :: (Char -> Bool) -> Parser Char
satisfy pred = PsrOf p
where
p (c:cs) | pred c = Just (cs, c)
p _ = Nothing
-- Could also be:
-- satisfy pred = do
-- c <- anyChar
-- if pred c then return c else empty
instance Monad Parser where
-- return :: a -> Parser a
return = pure
-- (>>=) :: Parser a -> (a -> Parser b) -> Parser b
PsrOf p1 >>= k = PsrOf q
where
q inp = case p1 inp of
Nothing -> Nothing
Just (rest, a) -> dePsr (k a) rest
I understand everything up until the last bit of the Monad definition. Specifically, I don't understand how the following line returns something of type Parser b, as is required by the (>>=) definition:
Just (rest, a) -> dePsr (k a) rest
It's difficult for me to grasp what the Monad definition means without an example. Thankfully, we have one in the alternate version of the satisfy function, which uses do-notation (which of course means the Monad is being called). I really don't understand do-notation yet, so here's the desugared version of satisfy:
satisfy pred =
  anyChar >>= (\c ->
    if pred c then return c else empty)
So based on the first line of our (>>=) definition, which is
PsrOf p1 >>= k = PsrOf q
We have anyChar as our PsrOf p1 and (\c -> if pred c then return c else empty) as our k. What I don't get is how, in dePsr (k a) rest, (k a) returns a Parser (at least it should, otherwise calling dePsr on it wouldn't make sense). This is made more confusing by the presence of rest. Even if (k a) returned a Parser, calling dePsr would extract the underlying function from the returned Parser and pass rest to it as an input. This definitely doesn't return something of type Parser b as required by the definition of (>>=). Clearly I'm misunderstanding something somewhere.
Ok, maybe this will help. Let's start by putting some points back into dePsr.
dePsr :: Parser a -> String -> Maybe (String, a)
dePsr (PsrOf p) rest = p rest
And let's also write out return: (NB I'm putting in all the points for clarity)
return :: a -> Parser a
return a = PsrOf (\rest -> Just (rest, a))
And now from the Just branch of the (>>=) definition
Just (rest, a) -> dePsr (k a) rest
Let's make sure we agree on what everything is:
rest: the string remaining unparsed after p1 is applied
a: the result of applying p1
k :: a -> Parser b: takes the result of the previous parser and makes a new parser
dePsr: unwraps a Parser a back into a function String -> Maybe (String, a)
Remember that we will wrap this back into a parser again at the top of the function: PsrOf q
So in English, bind (>>=) takes a parser in a and a function from a to a parser in b, and returns a parser in b. The resulting parser is made by wrapping q :: String -> Maybe (String, b) in the Parser constructor PsrOf. Then q, the combined parser, takes a String called inp, applies the function p1 :: String -> Maybe (String, a) that we got from pattern matching against the first parser, and pattern matches on the result. For an error we propagate Nothing (easy). If the first parser had a result, we have two pieces of information: the still unparsed string called rest and the result a. We give a to k, the second parser combinator, and get a Parser b, which we need to unwrap with dePsr to get a function String -> Maybe (String, b) back. That function can be applied to rest for the final result of the combined parsers.
I think the hardest part about reading this is that sometimes we curry the parser function which obscures what is actually happening.
Ok for the satisfy example
satisfy pred
= anyChar >>= (\c -> if pred c then return c else empty)
empty comes from the Alternative instance and is PsrOf (const Nothing), so a parser that always fails.
Let's look at only the successful branches. By substitution of only the successful part:
PsrOf (\(c:cs) ->Just (cs, c)) >>= (\c -> PsrOf (\rest -> Just (rest, c)))
So in the bind (>>=) definition
p1 = \(c:cs) -> Just (cs, c)
k = (\c -> PsrOf (\rest -> Just (rest, c)))
q inp = let Just (rest,a) = p1 inp in dePsr (k a) rest -- again, only the successful branch
Then q becomes
q inp =
let Just (rest, a) = (\(c:cs) -> Just (cs, c)) inp
in dePsr ((\c -> PsrOf (\rest -> Just (rest, c))) a) rest
Doing a little β-reduction
q inp =
let (c:cs) = inp
rest = cs
a = c
in dePsr (PsrOf (\rest -> Just (rest, a))) rest -- dePsr . PsrOf = id
Finally cleaning up some more
q (c:cs) = Just (cs, c)
So if pred is successful we reduce satisfy back to exactly anyChar, which is exactly what we expect, and exactly what we find in the first example of the question. I will leave it as an exercise to the reader (read: I'm lazy) to prove that if either inp = "" or pred c = False, the outcome is Nothing, as in the first satisfy example.
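The equivalence worked out above can also be checked by running both definitions. The sketch below is self-contained; the Functor, Applicative and Alternative instances are assumptions, since the lecture-notes excerpt only shows the Monad instance, and the names satisfyDirect and satisfyMonadic are introduced here just to keep the two versions apart.

```haskell
import Control.Applicative (Alternative (..))
import Data.Char (isDigit)

newtype Parser a = PsrOf (String -> Maybe (String, a))

dePsr :: Parser a -> String -> Maybe (String, a)
dePsr (PsrOf p) = p

-- Assumed instances; the excerpt in the question omits them.
instance Functor Parser where
  fmap f (PsrOf p) = PsrOf $ \inp -> fmap (\(rest, a) -> (rest, f a)) (p inp)

instance Applicative Parser where
  pure a = PsrOf $ \inp -> Just (inp, a)
  PsrOf pf <*> PsrOf pa = PsrOf $ \inp -> case pf inp of
    Nothing        -> Nothing
    Just (rest, f) -> fmap (\(rest', a) -> (rest', f a)) (pa rest)

instance Alternative Parser where
  empty = PsrOf (const Nothing)
  PsrOf p <|> PsrOf q = PsrOf $ \inp -> case p inp of
    Nothing -> q inp
    ok      -> ok

-- The Monad instance from the question.
instance Monad Parser where
  PsrOf p1 >>= k = PsrOf $ \inp -> case p1 inp of
    Nothing        -> Nothing
    Just (rest, a) -> dePsr (k a) rest

anyChar :: Parser Char
anyChar = PsrOf p
  where
    p ""     = Nothing
    p (c:cs) = Just (cs, c)

-- First version: pattern matching directly on the input.
satisfyDirect :: (Char -> Bool) -> Parser Char
satisfyDirect ok = PsrOf p
  where
    p (c:cs) | ok c = Just (cs, c)
    p _             = Nothing

-- Second version: the desugared do-block.
satisfyMonadic :: (Char -> Bool) -> Parser Char
satisfyMonadic ok = anyChar >>= \c -> if ok c then return c else empty
```

Applying dePsr to either version with isDigit gives the same answer on "7x" (success), on "x7" (predicate fails) and on "" (no input), which covers the three cases in the argument above.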
NOTE: If you are doing anything other than a class assignment, you will save yourself hours of pain and frustration by starting with error handling from the beginning: make your parser String -> Either String (String, a). It is easy to make the error type more general later, but a PITA to change everything from Maybe to Either.
Question: "[C]ould you explain how you arrived at return a = PsrOf (\rest -> Just (rest, a)) from return = pure after you put "points" back into return?"
Answer: First off, it is pretty unfortunate to give the Monad instance definition without the Functor and Applicative definitions. The pure and return functions must be identical (it is part of the Monad laws), and they would be called the same thing except that Monad far predates Applicative in Haskell history. In point of fact, I don't "know" what pure looks like, but I know what it has to be because it is the only possible definition. (If you want to understand the proof of that statement, ask; I have read the papers, and I know the results, but I'm not into typed lambda calculus quite enough to be confident in reproducing the results.)
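As a rough illustration of that suggestion, here is how the parser type and two primitives might look with Either. This is a hypothetical variant, not code from the lecture notes: the extra label argument to satisfy and the wording of the error messages are assumptions made for the example.

```haskell
import Data.Char (isDigit)

-- Hypothetical Either-based variant of the lecture-notes parser:
-- Left carries an error message instead of a bare Nothing.
newtype Parser a = PsrOf (String -> Either String (String, a))

dePsr :: Parser a -> String -> Either String (String, a)
dePsr (PsrOf p) = p

anyChar :: Parser Char
anyChar = PsrOf p
  where
    p ""     = Left "unexpected end of input"
    p (c:cs) = Right (cs, c)

-- The extra String describes what was expected, for the error message.
satisfy :: String -> (Char -> Bool) -> Parser Char
satisfy expected ok = PsrOf p
  where
    p (c:cs)
      | ok c      = Right (cs, c)
      | otherwise = Left ("expected " ++ expected ++ ", got " ++ show c)
    p "" = Left ("expected " ++ expected ++ ", got end of input")

digit :: Parser Char
digit = satisfy "a digit" isDigit
```

The Monad instance carries over almost unchanged, since Left propagates the same way Nothing did; only the failure cases need a message.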
return must wrap a value in the context without altering the context.
return :: Monad m => a -> m a
return :: a -> Parser a -- for our Monad
return :: a -> PsrOf (\str -> Maybe (rest, value)) -- substituting the constructor (PSEUDOCODE)
A Parser is a function that takes a string to be parsed and returns Just the value along with any unparsed portion of the original string, or Nothing on failure, all wrapped in the constructor PsrOf. The context is the string to be parsed, so we cannot change that. The value is of course what was passed to return. The parser always succeeds, so we must return Just a value.
return a = PsrOf (\rest -> Just (rest, a))
rest is the context and it is passed through unaltered.
a is the value we put into the Monad context.
For completeness here is also the only reasonable definition of fmap from Functor.
fmap :: Functor f => (a->b) -> f a -> f b
fmap :: (a -> b) -> Parser a -> Parser b -- for Parser Monad
fmap f (PsrOf p) = PsrOf q
where q inp = case p inp of
Nothing -> Nothing
Just (rest, a) -> Just (rest, f a)
-- better but less instructive definition of q
-- q = fmap (\(rest,a) -> (rest, f a)) . p
