mutual reference in function definition - parsing

some & many defined here have internal expressions which refer to each other. I'm finding this difficult to understand.
some :: f a -> f [a]
some v = some_v
  where
    many_v = some_v <|> pure []
    some_v = (:) <$> v <*> many_v
-- | Zero or more.
many :: f a -> f [a]
many v = many_v
  where
    many_v = some_v <|> pure []
    some_v = (:) <$> v <*> many_v
What would be the type signatures of many_v and some_v?
How does the following get evaluated (using parsec)?
Prelude Text.Parsec> parse (many (oneOf "abc")) mempty "abc"
Right "abc"

Both functions have signatures like
many_v :: (Alternative f) => f [a]
some_v :: (Alternative f) => f [a]
Replacing the bindings with their definitions, you can simplify to
some v = some_v
  where
    some_v = (:) <$> v <*> (some_v <|> pure [])
many v = many_v
  where
    many_v = ((:) <$> v <*> many_v) <|> pure []
and then, inlining the local bindings, to
some v = (:) <$> v <*> (some v <|> pure [])
many v = ((:) <$> v <*> many v) <|> pure []
Normally you define these functions simply as
many v = some v <|> pure []
some v = (:) <$> v <*> many v
In many projects, recursive functions are written in this worker/wrapper style, with the recursion kept in local bindings, so that the compiler has more room to optimize.
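To see how many (oneOf "abc") consumes "abc", here is a minimal sketch that runs the simplified definitions on a hand-rolled parser. The Parser type, oneOf, manyP and someP below are hypothetical stand-ins chosen for illustration; they are not Parsec's actual implementation.

```haskell
import Control.Applicative (Alternative (..))

-- Hypothetical minimal parser: consume a prefix of the input or fail.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> fmap (\(a, r) -> (f a, r)) (p s)

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> case pf s of
    Nothing     -> Nothing
    Just (f, r) -> fmap (\(a, r') -> (f a, r')) (pa r)

instance Alternative Parser where
  empty = Parser (const Nothing)
  -- Try the left parser; fall back to the right one on failure.
  Parser l <|> Parser r = Parser $ \s -> maybe (r s) Just (l s)

-- Accept one character from the given set (stand-in for Parsec's oneOf).
oneOf :: String -> Parser Char
oneOf cs = Parser f
  where
    f (x : xs) | x `elem` cs = Just (x, xs)
    f _                      = Nothing

-- The simplified, mutually recursive definitions from the answer.
manyP :: Parser a -> Parser [a]
manyP v = someP v <|> pure []

someP :: Parser a -> Parser [a]
someP v = (:) <$> v <*> manyP v
```

With these definitions, runParser (manyP (oneOf "abc")) "abc" gives Just ("abc", ""). The parser v succeeds three times; on the fourth attempt it fails on the empty input, so someP fails, manyP falls back to pure [], and the pending (:) applications assemble 'a' : 'b' : 'c' : [].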

Let's look at some_v = (:) <$> v <*> many_v. We know
v :: f a
(:) :: a -> [a] -> [a]
(<$>) :: Functor f => (a -> b) -> f a -> f b
Then:
(:) <$> v :: f ([a] -> [a])
Let's now look at <*> :: Applicative f => f (x -> y) -> f x -> f y. The first argument is (:) <$> v :: f ([a] -> [a]), so x ~ [a] and y ~ [a], which means that many_v :: f [a] and some_v :: f [a].
Let's also check the definition many_v = some_v <|> pure []. We have:
pure :: Applicative f => a -> f a
(<|>) :: Alternative f => f a -> f a -> f a
So:
pure [] :: f [a]
some_v :: f [a]
-- so we have:
some_v <|> pure [] :: f [a]
-- and by definition
many_v :: f [a]
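The derived types can be checked by the compiler by writing them as local signatures. This is a sketch, assuming the ScopedTypeVariables extension so that f and a in the local signatures refer to the outer ones; the primed names some' and many' are hypothetical, chosen to avoid clashing with the library functions.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Applicative (Alternative (..))

-- One or more, with the local bindings annotated.
some' :: forall f a. Alternative f => f a -> f [a]
some' v = some_v
  where
    many_v :: f [a]
    many_v = some_v <|> pure []
    some_v :: f [a]
    some_v = (:) <$> v <*> many_v

-- Zero or more, with the same annotated local bindings.
many' :: forall f a. Alternative f => f a -> f [a]
many' v = many_v
  where
    many_v :: f [a]
    many_v = some_v <|> pure []
    some_v :: f [a]
    some_v = (:) <$> v <*> many_v
```

Without the explicit forall and the extension, the f and a in the local signatures would be fresh type variables and the module would not type-check.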

Related

How to parse a bool expression in Haskell

I am trying to parse a bool expression in Haskell. This line is giving me an error: BoolExpr <$> parseBoolOp <*> (n : ns). This is the error:
• Couldn't match type ‘[]’ with ‘Parser’
  Expected type: Parser [Expr]
  Actual type: [Expr]
-- define the expression types
data Expr
  = BoolExpr BoolOp [Expr]
  deriving (Show, Eq)
-- define the type for bool value
data Value
  = BoolVal Bool
  deriving (Show, Eq)
-- many x = Parser.some x <|> pure []
-- some x = (:) <$> x <*> Parser.many x
kstar :: Alternative f => f a -> f [a]
kstar x = kplus x <|> pure []
kplus :: Alternative f => f a -> f [a]
kplus x = (:) <$> x <*> kstar x
symbol :: String -> Parser String
symbol xs = token (string xs)
-- a bool expression is the operator followed by one or more expressions that we have to parse
-- TODO: add bool expressions
boolExpr :: Parser Expr
boolExpr = do
  n <- parseExpr
  ns <- kstar (symbol "," >> parseExpr)
  BoolExpr <$> parseBoolOp <*> (n : ns)
-- an atom is a literalExpr, which can be an actual literal or some other things
parseAtom :: Parser Expr
parseAtom =
  do
    literalExpr
-- the main parsing function which alternates between all the options you have
parseExpr :: Parser Expr
parseExpr =
  do
    parseAtom
  <|> parseParens boolExpr
  <|> parseParens parseExpr
-- implement parsing bool operations, these are 'and' and 'or'
parseBoolOp :: Parser BoolOp
parseBoolOp =
  do symbol "and" >> return And
    <|> do symbol "or" >> return Or
The boolExpr is expecting a Parser [Expr] but I am returning only an [Expr]. Is there a way to fix this or do it in another way? When I try pure (n:ns), evalStr "(and true (and false true) true)" returns Left (ParseError "'a' didn't match expected character") instead of Right (BoolVal False)
The expression (n : ns) is a list. Therefore the compiler thinks that the applicative operators <*> and <$> should be used in the context [], while you want Parser instead.
I would guess you need pure (n : ns) instead.
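A small sketch of why pure is needed: every operand of <*> must live in the same applicative. Here Maybe stands in for Parser, and the Lit constructor and the combine helper are hypothetical names introduced only for this illustration.

```haskell
data BoolOp = And | Or
  deriving (Show, Eq)

data Expr
  = Lit Bool               -- hypothetical literal constructor
  | BoolExpr BoolOp [Expr]
  deriving (Show, Eq)

-- Maybe plays the role of Parser here. The operator comes from an
-- effectful computation; the already-parsed list is lifted with pure.
combine :: Maybe BoolOp -> Expr -> [Expr] -> Maybe Expr
combine op n ns = BoolExpr <$> op <*> pure (n : ns)
```

For example, combine (Just And) (Lit True) [Lit False] gives Just (BoolExpr And [Lit True,Lit False]). Note that in the question's boolExpr the operands are still parsed before the operator; if the input puts the operator first, as in "(and true ...)", parseBoolOp would presumably have to run before parseExpr, which may explain the remaining parse error.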

Combining parsers in Haskell

I'm given the following parsers
newtype Parser a = Parser { parse :: String -> Maybe (a,String) }
instance Functor Parser where
  fmap f p = Parser $ \s -> (\(a,c) -> (f a, c)) <$> parse p s
instance Applicative Parser where
  pure a = Parser $ \s -> Just (a,s)
  f <*> a = Parser $ \s ->
    case parse f s of
      Just (g,s') -> parse (fmap g a) s'
      Nothing -> Nothing
instance Alternative Parser where
  empty = Parser $ \s -> Nothing
  l <|> r = Parser $ \s -> parse l s <|> parse r s
ensure :: (a -> Bool) -> Parser a -> Parser a
ensure p parser = Parser $ \s ->
  case parse parser s of
    Nothing -> Nothing
    Just (a,s') -> if p a then Just (a,s') else Nothing
lookahead :: Parser (Maybe Char)
lookahead = Parser f
  where f [] = Just (Nothing,[])
        f (c:s) = Just (Just c,c:s)
satisfy :: (Char -> Bool) -> Parser Char
satisfy p = Parser f
  where f [] = Nothing
        f (x:xs) = if p x then Just (x,xs) else Nothing
eof :: Parser ()
eof = Parser $ \s -> if null s then Just ((),[]) else Nothing
eof' :: Parser ()
eof' = ???
I need to write a new parser eof' that does exactly what eof does but is built using only the given parsers and the Functor/Applicative/Alternative instances above. I'm stuck on this as I don't have experience in combining parsers. Can anyone help me out?
To make this easier to understand, we can write it as equational pseudocode, substituting and simplifying the definitions, and using Monad Comprehensions for clarity and succinctness.
Monad Comprehensions are just like List Comprehensions, only working for any MonadPlus type, not just []; while corresponding closely to do notation, e.g. [ (f a, s') | (a, s') <- parse p s ] === do { (a, s') <- parse p s ; return (f a, s') }.
This gets us:
newtype Parser a = Parser { parse :: String -> Maybe (a,String) }
instance Functor Parser where
  parse (fmap f p) s = [ (f a, s') | (a, s') <- parse p s ]
instance Applicative Parser where
  parse (pure a) s = pure (a, s)
  parse (pf <*> pa) s = [ (g a, s'') | (g, s') <- parse pf s
                                     , (a, s'') <- parse pa s' ]
instance Alternative Parser where
  parse empty s = empty
  parse (l <|> r) s = parse l s <|> parse r s
ensure :: (a -> Bool) -> Parser a -> Parser a
parse (ensure pred p) s = [ (a, s') | (a, s') <- parse p s, pred a ]
lookahead :: Parser (Maybe Char)
parse lookahead [] = pure (Nothing, [])
parse lookahead s@(c:_) = pure (Just c, s )
satisfy :: (Char -> Bool) -> Parser Char
parse (satisfy p) [] = mzero
parse (satisfy p) (x:xs) = [ (x, xs) | p x ]
eof :: Parser ()
parse eof s = [ ((), []) | null s ]
eof' :: Parser ()
eof' = ???
By the way, thanks to the use of Monad Comprehensions and of the more abstract pure, empty and mzero instead of their concrete representations in terms of the Maybe type, this same (pseudo-)code will work with a different type, like [] in place of Maybe, viz. newtype Parser a = Parser { parse :: String -> [(a,String)] }.
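As a small runnable illustration of that generality, here is a sketch using GHC's MonadComprehensions extension. The helpers mapResult and keepIf are hypothetical names; they mirror the comprehension bodies of fmap and ensure above, but operate directly on the result of a parse rather than on a Parser.

```haskell
{-# LANGUAGE MonadComprehensions #-}
import Control.Monad (MonadPlus)

-- Same shape as the fmap equation: transform the parsed value,
-- keep the remaining input.
mapResult :: MonadPlus m => (a -> b) -> m (a, String) -> m (b, String)
mapResult f r = [ (f a, s') | (a, s') <- r ]

-- Same shape as the ensure equation: keep a result only if the
-- parsed value passes the test.
keepIf :: MonadPlus m => (a -> Bool) -> m (a, String) -> m (a, String)
keepIf ok r = [ (a, s') | (a, s') <- r, ok a ]
```

The same two definitions work at both result types: keepIf even (Just (1, "x")) is Nothing, while keepIf even [(1, "a"), (2, "b")] is [(2,"b")].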
So we have
ensure :: (a -> Bool) -> Parser a -> Parser a
lookahead :: Parser (Maybe Char)
(satisfy is no good for us here .... why?)
Using that, we can have
ensure ....... ...... :: Parser (Maybe Char)
(... what does ensure id (pure False) do? ...)
but we'll have a useless Nothing result in case the input string was in fact empty, whereas the eof parser given to us produces () as its result in that case (and otherwise it produces nothing).
No fear, we also have
fmap :: ( a -> b ) -> Parser a -> Parser b
which can transform the Nothing into () for us. We'll need a function that will always do this for us,
alwaysUnit nothing = ()
which we can use now to arrive at the solution:
eof' = fmap ..... (..... ..... ......)

Removing Left Recursion in a Basic Expression Parser

As an exercise, I'm implementing a parser for an exceedingly simple language defined in Haskell using the following GADT (the real grammar for my project involves many more expressions, but this extract is sufficient for the question):
data Expr a where
  I :: Int -> Expr Int
  Add :: [Expr Int] -> Expr Int
The parsing functions are as follows:
expr :: Parser (Expr Int)
expr = foldl1 mplus
  [ lit
  , add
  ]
lit :: Parser (Expr Int)
lit = I . read <$> some digit
add :: Parser (Expr Int)
add = do
  i0 <- expr
  is (== '+')
  i1 <- expr
  is <- many (is (== '+') *> expr)
  pure (Add (i0:i1:is))
Due to the left-recursive nature of the expression grammar, when I attempt to parse something as simple as 1+1 using the expr parser, the parser gets stuck in an infinite loop.
I've seen examples of how to factor out left recursion across the web using a transformation from something like:
S -> S a | b
Into something like:
S -> b T
T -> a T | ε
But I'm struggling with how to apply this to my parser.
For completeness, here is the code that actually implements the parser:
newtype Parser a = Parser
  { runParser :: String -> [(a, String)]
  }
instance Functor Parser where
  fmap f (Parser p) = Parser $ \s ->
    fmap (\(a, r) -> (f a, r)) (p s)
instance Applicative Parser where
  pure a = Parser $ \s -> [(a, s)]
  (<*>) (Parser f) (Parser p) = Parser $ \s ->
    concat $ fmap (\(f', r) -> fmap (\(a, r') -> (f' a, r')) (p r)) (f s)
instance Alternative Parser where
  empty = Parser $ \s -> []
  (<|>) (Parser a) (Parser b) = Parser $ \s ->
    case a s of
      (r:rs) -> (r:rs)
      [] -> case b s of
        (r:rs) -> (r:rs)
        [] -> []
instance Monad Parser where
  return = pure
  (>>=) (Parser a) f = Parser $ \s ->
    concat $ fmap (\(r, rs) -> runParser (f r) rs) (a s)
instance MonadPlus Parser where
  mzero = empty
  mplus (Parser a) (Parser b) = Parser $ \s -> a s ++ b s
char = Parser $ \case (c:cs) -> [(c, cs)]; [] -> []
is p = char >>= \c -> if p c then pure c else empty
digit = is isDigit
Suppose you want to parse non-parenthesized expressions involving literals, addition, and multiplication. You can do this by cutting down the list by precedence. Here's one way to do it in attoparsec, which should be pretty similar to what you'd do with your parser. I'm no parsing expert, so there might be some errors or infelicities.
import Data.Attoparsec.ByteString.Char8
import Control.Applicative
expr :: Parser (Expr Int)
expr = choice [add, mul, lit] <* skipSpace
-- choice is in Data.Attoparsec.Combinator, but is
-- actually a general Alternative operator.
add :: Parser (Expr Int)
add = Add <$> addList
addList :: Parser [Expr Int]
addList = (:) <$> addend <* skipSpace <* char '+' <*> (addList <|> ((:[]) <$> addend))
addend :: Parser (Expr Int)
addend = mul <|> multiplicand
mul :: Parser (Expr Int)
mul = Mul <$> mulList
mulList :: Parser [Expr Int]
mulList = (:) <$> multiplicand <* skipSpace <* char '*' <*> (mulList <|> ((:[]) <$> multiplicand))
multiplicand :: Parser (Expr Int)
multiplicand = lit
lit :: Parser (Expr Int)
lit = I <$> (skipSpace *> decimal)
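The same idea can be sketched against a hand-rolled list-based parser like the one in the question, following the S -> b T, T -> a T | ε shape with b = lit and a = '+' followed by lit. This is a minimal sketch under stated assumptions: the instances are simplified rewrites rather than the asker's exact code, operands are restricted to literals, and build and eval are hypothetical helpers added for illustration.

```haskell
{-# LANGUAGE GADTs #-}
import Control.Applicative (Alternative (..))
import Data.Char (isDigit)

data Expr a where
  I   :: Int -> Expr Int
  Add :: [Expr Int] -> Expr Int

newtype Parser a = Parser { runParser :: String -> [(a, String)] }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> [ (f a, r) | (a, r) <- p s ]

instance Applicative Parser where
  pure a = Parser $ \s -> [(a, s)]
  Parser pf <*> Parser pa = Parser $ \s ->
    [ (f a, r') | (f, r) <- pf s, (a, r') <- pa r ]

instance Alternative Parser where
  empty = Parser (const [])
  -- Left-biased choice: only try b when a yields no result.
  Parser a <|> Parser b = Parser $ \s -> case a s of
    [] -> b s
    rs -> rs

-- Consume one character satisfying the predicate.
is :: (Char -> Bool) -> Parser Char
is p = Parser $ \s -> case s of
  (c : cs) | p c -> [(c, cs)]
  _              -> []

lit :: Parser (Expr Int)
lit = I . read <$> some (is isDigit)

-- No left recursion: parse one literal first (b), then zero or
-- more "+ literal" tails (T).
expr :: Parser (Expr Int)
expr = build <$> lit <*> many (is (== '+') *> lit)
  where
    build x [] = x
    build x xs = Add (x : xs)

-- Hypothetical evaluator, used only to inspect the parse result.
eval :: Expr a -> a
eval (I n)    = n
eval (Add es) = sum (map eval es)
```

Because expr always consumes a literal before it can recurse, parsing "1+2+3" terminates and evaluates to 6. In a larger grammar, lit in the tail would be replaced by the parser for the next-higher precedence level, as in the attoparsec version above.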

Writing Parser for Positive JSON Number w/ Decimal

Given the following definitions from Prof. Yorgey's UPenn class:
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }
satisfy :: (Char -> Bool) -> Parser Char
satisfy p = Parser f
  where
    f [] = Nothing -- fail on the empty input
    f (x:xs) -- check if x satisfies the predicate
             -- if so, return x along with the remainder
             -- of the input (that is, xs)
      | p x = Just (x, xs)
      | otherwise = Nothing -- otherwise, fail
And the following algebraic data types:
type Key = String
data Json = JObj Key JValue
          | Arr [JValue]
          deriving Show
data JValue = N Double
            | S String
            | B Bool
            | J Json
            deriving Show
I wrote the following function to parse a positive JSON number with a decimal point:
parseDecimalPoint :: Parser Char
parseDecimalPoint = satisfy (== '.')
type Whole = Integer
type Decimal = Integer
readWholeAndDecimal :: Whole -> Decimal -> Double
readWholeAndDecimal w d = read $ (show w) ++ "." ++ (show d)
parsePositiveDecimal:: Parser JValue
parsePositiveDecimal = (\x _ y -> f x y) <$> (
    (oneOrMore (satisfy isNumber)) <*> parseDecimalPoint <*>
    (zeroOrMore (satisfy isNumber)) )
  where
    f x [] = N (read x)
    f x y = N (-(readWholeAndDecimal (read x) (read y)))
However I'm getting the following compile-time error:
JsonParser.hs:30:25:
Couldn't match expected type ‘t0 -> [Char] -> JValue’
with actual type ‘JValue’
The lambda expression ‘\ x _ y -> f x y’ has three arguments,
but its type ‘String -> JValue’ has only one
In the first argument of ‘(<$>)’, namely ‘(\ x _ y -> f x y)’
In the expression:
(\ x _ y -> f x y)
<$>
((oneOrMore (satisfy isNumber)) <*> parseDecimalPoint
<*> (zeroOrMore (satisfy isNumber)))
JsonParser.hs:30:49:
Couldn't match type ‘[Char]’ with ‘Char -> [Char] -> String’
Expected type: Parser (Char -> [Char] -> String)
Actual type: Parser [Char]
In the first argument of ‘(<*>)’, namely
‘(oneOrMore (satisfy isNumber))’
In the first argument of ‘(<*>)’, namely
‘(oneOrMore (satisfy isNumber)) <*> parseDecimalPoint’
In my parsePositiveDecimal function, my understanding of the types is:
(String -> Char -> String -> JValue) <$> (Parser String <*> Parser Char <*> Parser String)
I've worked through a few examples making parsers with <$> and <*>. But I'm not entirely grokking the types.
Any help on understanding them too would be greatly appreciated.
Cactus is correct. I'll expand a bit on the types.
<$> :: Functor f => (a -> b) -> f a -> f b
Our f here is Parser, and the first argument to <$> has type String -> Char -> String -> JValue. Remember that this can be understood as a function which takes a String and returns a function Char -> String -> JValue. So the a type variable is filled in with String.
From that, we can see that the second argument to <$> needs to be of type Parser String. oneOrMore (satisfy isNumber) has that type.
Taken together, we now have:
(\x _ y -> f x y) <$> (oneOrMore (satisfy isNumber)) :: Parser (Char -> String -> JValue)
We've gone from a function of 3 arguments which didn't involve Parser at all, to a function of 2 arguments wrapped in Parser. To apply this function to its next argument, Char, we need:
(<*>) :: Applicative f => f (a -> b) -> f a -> f b
f is Parser again, and a here is Char. parseDecimalPoint :: Parser Char has the required type for the right-hand side of <*>.
(\x _ y -> f x y) <$> (oneOrMore (satisfy isNumber)) <*> parseDecimalPoint :: Parser (String -> JValue)
We do this one more time, to get:
(\x _ y -> f x y) <$> oneOrMore (satisfy isNumber) <*> parseDecimalPoint <*> zeroOrMore (satisfy isNumber) :: Parser JValue
I've taken advantage of knowing the precedence and associativity of the operators to remove some parentheses. This is how I see most such code written, but perhaps Cactus's version is more clear. Or even the fully parenthesized version, emphasizing the associativity:
( ((\x _ y -> f x y) <$>
     (oneOrMore (satisfy isNumber)))
  <*> parseDecimalPoint)
<*> (zeroOrMore (satisfy isNumber)) :: Parser JValue
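Putting the corrected chain together, here is a self-contained sketch. It makes several assumptions that differ from the question: the Functor/Applicative/Alternative instances are written out by hand, some and many from Control.Applicative stand in for oneOrMore and zeroOrMore, isDigit replaces isNumber so that read only sees ASCII digits, the result is a plain Double rather than a JValue, and toDouble is a hypothetical helper.

```haskell
import Control.Applicative (Alternative (..))
import Data.Char (isDigit)

newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> fmap (\(a, r) -> (f a, r)) (p s)

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> case pf s of
    Nothing     -> Nothing
    Just (f, r) -> fmap (\(a, r') -> (f a, r')) (pa r)

instance Alternative Parser where
  empty = Parser (const Nothing)
  Parser l <|> Parser r = Parser $ \s -> maybe (r s) Just (l s)

satisfy :: (Char -> Bool) -> Parser Char
satisfy p = Parser f
  where
    f (x : xs) | p x = Just (x, xs)
    f _              = Nothing

-- toDouble <$> ... :: Parser (Char -> String -> Double)
-- ... <*> '.'      :: Parser (String -> Double)
-- ... <*> digits   :: Parser Double
positiveDecimal :: Parser Double
positiveDecimal =
  toDouble <$> some (satisfy isDigit)
           <*> satisfy (== '.')
           <*> many (satisfy isDigit)
  where
    toDouble :: String -> Char -> String -> Double
    toDouble w _ [] = read w                 -- e.g. "7." has no fraction digits
    toDouble w _ d  = read (w ++ "." ++ d)   -- e.g. "12" and "5" give 12.5
```

For example, runParser positiveDecimal "12.5x" gives Just (12.5,"x"). Reading the concatenated string instead of going through two Integers keeps leading zeros in the fractional part, which is a design choice of this sketch rather than something the original answer prescribes.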

Is it possible to express chainl1 using applicative?

Is it possible to express the chainl1 combinator from Parsec not using the Monad instance defined by parsec?
chainl1 p op =
  do x <- p
     rest x
  where
    rest x = do f <- op
                y <- p
                rest (f x y)
             <|> return x
Yes, it is:
chainl1 p op = foldl (flip ($)) <$> p <*> many (flip <$> op <*> p)
The idea is that you have to parse p (op p)* and evaluate it as (...(((p) op p) op p)...).
It might help to expand the definition a bit:
chainl1 p op = foldl (\x f -> f x) <$> p <*> many ((\f y -> flip f y) <$> op <*> p)
As the pairs of op and p are parsed, the results are applied immediately, but because p is the right operand of op, it needs a flip.
So, the result type of many (flip <$> op <*> p) is f [a -> a]. This list of functions is then applied from left to right on an initial value of p by foldl.
Ugly but equivalent Applicative definition:
chainl1 p op =
    p <**>
    rest
  where
    rest = flip <$> op <*>
           p <**>
           pure (flip (.)) <*> rest
       <|> pure id
Instead of passing the left-hand argument x explicitly to rest, this Applicative form chains the op's, each partially applied to its right-hand argument (hence flip <$> op <*> p), via the lifted combinator flip (.), so that earlier operators are applied before later ones, and then applies the resulting rest :: Alternative f => f (a -> a) to the leftmost p via (<**>).
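Both Applicative formulations can be exercised with an operator that is not commutative, so that the nesting order is visible in the result. This is a sketch under stated assumptions: the Parser type is a hand-rolled stand-in rather than Parsec, the names chainl1Fold, chainl1Rest, operand and minus are hypothetical, and the composition in chainl1Rest is written with flip (.) so that the function built from the first operator is applied first.

```haskell
import Control.Applicative (Alternative (..), (<**>))

newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> fmap (\(a, r) -> (f a, r)) (p s)

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> case pf s of
    Nothing     -> Nothing
    Just (f, r) -> fmap (\(a, r') -> (f a, r')) (pa r)

instance Alternative Parser where
  empty = Parser (const Nothing)
  Parser l <|> Parser r = Parser $ \s -> maybe (r s) Just (l s)

satisfy :: (Char -> Bool) -> Parser Char
satisfy p = Parser f
  where
    f (x : xs) | p x = Just (x, xs)
    f _              = Nothing

-- Fold version: collect the partially applied operators in a list,
-- then apply them to the first operand from left to right.
chainl1Fold :: Alternative f => f a -> f (a -> a -> a) -> f a
chainl1Fold p op = foldl (flip ($)) <$> p <*> many (flip <$> op <*> p)

-- Composition version: build one function a -> a out of all the
-- partially applied operators, earliest operator innermost.
chainl1Rest :: Alternative f => f a -> f (a -> a -> a) -> f a
chainl1Rest p op = p <**> rest
  where
    rest = flip <$> op <*> p <**> pure (flip (.)) <*> rest
       <|> pure id

-- A single lowercase letter as an operand.
operand :: Parser String
operand = (: []) <$> satisfy (`elem` ['a' .. 'z'])

-- '-' builds a parenthesised string, making the association visible.
minus :: Parser (String -> String -> String)
minus = (\x y -> "(" ++ x ++ "-" ++ y ++ ")") <$ satisfy (== '-')
```

With these definitions, both runParser (chainl1Fold operand minus) "a-b-c" and runParser (chainl1Rest operand minus) "a-b-c" give Just ("((a-b)-c)",""), the left-nested result that the monadic chainl1 produces.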

Resources