precedence climbing in haskell: parsec mutual recursion error - parsing

I'm programming the precedence climbing algorithm in Haskell, but for a reason unknown to me, does not work. I think that Parsec state info is lost at some point, but I don't even know that is the source of the error:
module PrecedenceClimbing where
import Text.Parsec
import Text.Parsec.Char
{-
Algorithm
compute_expr(min_prec):
result = compute_atom()
while cur token is a binary operator with precedence >= min_prec:
prec, assoc = precedence and associativity of current token
if assoc is left:
next_min_prec = prec + 1
else:
next_min_prec = prec
rhs = compute_expr(next_min_prec)
result = compute operator(result, rhs)
return result
-}
type Precedence = Int
data Associativity = LeftAssoc
| RightAssoc
deriving (Eq, Show)
data OperatorInfo = OPInfo Precedence Associativity (Int -> Int -> Int)
mkOperator :: Char -> OperatorInfo
mkOperator = \c -> case c of
'+' -> OPInfo 1 LeftAssoc (+)
'-' -> OPInfo 1 LeftAssoc (-)
'*' -> OPInfo 2 LeftAssoc (*)
'/' -> OPInfo 2 LeftAssoc div
'^' -> OPInfo 3 RightAssoc (^)
getPrecedence :: OperatorInfo -> Precedence
getPrecedence (OPInfo prec _ _) = prec
getAssoc :: OperatorInfo -> Associativity
getAssoc (OPInfo _ assoc _) = assoc
getFun :: OperatorInfo -> (Int -> Int -> Int)
getFun (OPInfo _ _ fun) = fun
number :: Parsec String () Int
number = do
spaces
fmap read $ many1 digit
operator :: Parsec String () OperatorInfo
operator = do
spaces
fmap mkOperator $ oneOf "+-*/^"
computeAtom = do
spaces
number
loop minPrec res = (do
oper <- operator
let prec = getPrecedence oper
if prec >= minPrec
then do
let assoc = getAssoc oper
next_min_prec = if assoc == LeftAssoc
then prec + 1
else prec
rhs <- computeExpr(next_min_prec)
loop minPrec $ getFun oper res rhs
else return res) <|> (return res)
computeExpr :: Int -> Parsec String () Int
computeExpr minPrec = (do
result <- computeAtom
loop minPrec result) <|> (computeAtom)
getResult minPrec = parse (computeExpr minPrec) ""
My program for some reason is only processing the first operation or the first operand depending on the case, but does not go any further
GHCi session:
*PrecedenceClimbing> getResult 1 "46+10"
Right 56
*PrecedenceClimbing> getResult 1 "46+10+1"
Right 56

I'm not sure exactly what's wrong with your code but I'll offer these comments:
(1) These statements are not equivalent:
Generic Imperative: rhs = compute_expr(next_min_prec)
Haskell: rhs <- computeExpr(next_min_prec)
The imperative call to compute_expr will always return. The Haskell call may fail in which case the stuff following the call never happens.
(2) You are really working against Parsec's strengths by trying to parse tokens one at a time in sequence. To see the "Parsec way" of generically parsing expressions with operators of various precedences and associativities, have a look at:
buildExpression
Parsec and Expression Printing
Update
I've posted a solution to http://lpaste.net/165651

Related

Adding error handling to function - Haskell

I am completely new to haskell and seen examples online of how to add error handling but I'm not sure how to incorporate it in my context. Below is an example of the code which works before trying to handle errors.
expr'::Parser Double
expr' = term' `chainl1'` addop
term'::Parser Double
term' = factor' `chainl1` mulop
chainl :: Parser a -> Parser (a -> a -> a) -> a -> Parser a
chainl p op a = (p `chainl1` op) <|> pure a
chainl1 ::Parser a -> Parser (a -> a -> a) -> Parser a
chainl1 p op = p >>= rest
where
rest a = (do
f <- op
b <- p
rest (f a b)) <|> pure a
addop, mulop :: Parser (Double -> Double -> Double)
I've since expanded this to let addop and mulop return error messages if something irregular is found. This causes the function definition to change to:
addop, mulop :: Parser (Either String (Double -> Double -> Double))
In other programming languages I would check if f <- op is a String and return the string. However I'm not sure how to go about this in Haskell. The idea is that this error message returns all the way back to term'. Hence its function definition also needs to change eventually. This is all in the attempt to build a Monadic Parser.
If you're using parsec then you can make your code more general to work with the ParsecT monad transformer:
import Text.Parsec hiding (chainl1)
import Control.Monad.Trans.Class (lift)
expr' :: ParsecT String () (Either String) Double
expr' = term' `chainl1` addop
term' :: ParsecT String () (Either String) Double
term' = factor' `chainl1` mulop
factor' :: ParsecT String () (Either String) Double
factor' = read <$> many1 digit
chainl1 :: Monad m => ParsecT s u m a -> ParsecT s u m (a -> a -> a) -> ParsecT s u m a
chainl1 p op = p >>= rest
where
rest a = (do
f <- op
b <- p
rest (f a b))
<|> pure a
addop, mulop :: ParsecT String () (Either String) (Double -> Double -> Double)
addop = (+) <$ char '+' <|> (-) <$ char '-'
mulop = ((*) <$ char '*' <* lift (Left "error")) <|> (/) <$ char '/' <|> (**) <$ char '^'
I don't know what kind of errors you would want to return, so I've just made an error if an '*' is encountered in the input.
You can run the parser like this:
ghci> runParserT (expr' <* eof) () "buffer" "1+2+3"
Right (Right 6.0)
ghci> runParserT (expr' <* eof) () "buffer" "1+2*3"
Left "error"
The answer based on parsec implementation.
Actually the operator <|> is what you need. It handles any parsing errors. In expression a <|> b if the parser a fails then the parser b will be run (expect if the parser a consume some input before fails; for handle this case you can use combinator try like this: try a <|> b).
But if you want to handle error depending to the kind of error then you should do like #Noughtmare answered. But then I recomend you to do that:
Define your type for errors. It will be bugless to handle errors.
data MyError
= ME_DivByZero
| ...
You can simplify type signature if you define type alias for your parser.
type MyParser = ParsecT String () (Either MyError)
Then signatires will look like this:
expr' :: MyParser Double
addop, mulop :: MyParser (Double -> Double -> Double)
Use throwError to throw your errors and catchError to handle your errors, that will be more idiomatic. So it's look like this:
f <- catchError op $ \case
ME_DivByZero -> ...
ME_... -> ...
err -> throwError err -- rethrow error

Haskell : Operator Parser keeps going to undefined rather than inputs

I'm practicing writing parsers. I'm using Tsodings JSON Parser video as reference. I'm trying to add to it by being able to parse arithmetic of arbitrary length and I have come up with the following AST.
data HVal
= HInteger Integer -- No Support For Floats
| HBool Bool
| HNull
| HString String
| HChar Char
| HList [HVal]
| HObj [(String, HVal)]
deriving (Show, Eq, Read)
data Op -- There's only one operator for the sake of brevity at the moment.
= Add
deriving (Show, Read)
newtype Parser a = Parser {
runParser :: String -> Maybe (String, a)
}
The following functions is my attempt of implementing the operator parser.
ops :: [Char]
ops = ['+']
isOp :: Char -> Bool
isOp c = elem c ops
spanP :: (Char -> Bool) -> Parser String
spanP f = Parser $ \input -> let (token, rest) = span f input
in Just (rest, token)
opLiteral :: Parser String
opLiteral = spanP isOp
sOp :: String -> Op
sOp "+" = Add
sOp _ = undefined
parseOp :: Parser Op
parseOp = sOp <$> (charP '"' *> opLiteral <* charP '"')
The logic above is similar to how strings are parsed therefore my assumption was that the only difference was looking specifically for an operator rather than anything that's not a number between quotation marks. It does seemingly begin to parse correctly but it then gives me the following error:
λ > runParser parseOp "\"+\""
Just ("+\"",*** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:80:14 in base:GHC.Err
undefined, called at /DIRECTORY/parser.hs:110:11 in main:Main
I'm confused as to where the error is occurring. I'm assuming it's to do with sOp mainly due to how the other functions work as intended as the rest of parseOp being a translation of the parseString function:
stringLiteral :: Parser String
stringLiteral = spanP (/= '"')
parseString :: Parser HVal
parseString = HString <$> (charP '"' *> stringLiteral <* charP '"')
The only reason why I have sOp however is that if it was replaced with say Op, I would get the error that the following doesn't exist Op :: String -> Op. When I say this my inclination was that the string coming from the parsed expression would be passed into this function wherein I could return the appropriate operator. This however is incorrect and I'm not sure how to proceed.
charP and Applicative Instance
charP :: Char -> Parser Char
charP x = Parser $ f
where f (y:ys)
| y == x = Just (ys, x)
| otherwise = Nothing
f [] = Nothing
instance Applicative Parser where
pure x = Parser $ \input -> Just (input, x)
(Parser p) <*> (Parser q) = Parser $ \input -> do
(input', f) <- p input
(input', a) <- q input
Just (input', f a)
The implementation of (<*>) is the culprit. You did not use input' in the next call to q, but used input instead. As a result you pass the string to the next parser without "eating" characters. You can fix this with:
instance Applicative Parser where
pure x = Parser $ \input -> Just (input, x)
(Parser p) <*> (Parser q) = Parser $ \input -> do
(input', f) <- p input
(input'', a) <- q input'
Just (input'', f a)
With the updated instance for Applicative, we get:
*Main> runParser parseOp "\"+\""
Just ("",Add)

How to insert an indentation check into the operator precedence parser?

I am working on the parsing stage for the language I am making and am having difficulty with the following.
let test2 = // I'd like this to be an error.
"""
2
+ 2
"""
let result = run (spaces >>. expr) test2
val result : ParserResult<CudaExpr,unit> =
Success: Add (LitInt32 2,LitInt32 2)
I already managed to make the following example when the terms are indented incorrectly
2 +
2
give me an error, but not when the operator is on the wrong indentation level. I need something like a before-parse check.
let operators expr i =
let f expr (s: CharStream<_>) = if i <= s.Column then expr s else pzero s
opp.TermParser <- f expr
f opp.ExpressionParser
The above function is how the operators phase is structured and as you can see, the term parsers get wrapped in a function that does the indentation check, but the last line is faulty.
Here is a simplified example of the full parser.
#r "../../packages/FParsec.1.0.2/lib/net40-client/FParsecCS.dll"
#r "../../packages/FParsec.1.0.2/lib/net40-client/FParsec.dll"
open FParsec
type Expr =
| V of string
| Add of Expr * Expr
let identifier = many1Satisfy2L isAsciiLetter (fun x -> isAsciiLetter x || isDigit x || x = ''') "identifier" .>> spaces |>> V
let indentations expressions (s: CharStream<_>) =
let i = s.Column
let expr_indent expr (s: CharStream<_>) =
let expr (s: CharStream<_>) = if i <= s.Column then expr s else pzero s
many1 expr s
expr_indent (expressions i) s
let expr =
let opp = new OperatorPrecedenceParser<_,_,_>()
opp.AddOperator(InfixOperator("+", spaces, 6, Associativity.Left, fun x y -> Add(x,y)))
let operators expr i =
let f (s: CharStream<_>) = if i <= s.Column then expr s else pzero s
opp.TermParser <- f
f opp.ExpressionParser
let rec expr s = indentations (operators identifier) s
expr
let test2 = // I'd like this to be an error.
"""
a
+
b
"""
let result = run (spaces >>. expr) test2
The full parser so far can be found here.
let operators expr i =
let f (s: CharStream<_>) = if i <= s.Column then expr s else pzero s
opp.TermParser <- f
f opp.ExpressionParser
I did not realize it 2.5 weeks ago, but what happens when a new block gets opened and expr s gets called is that the term parser gets overwritten with the new indentation and there is no way to back it up and restore it on exit. I did a bit of looking around and managed to adapt the Pratt top down parsing method for my purposes.
Here is a talk by Douglas Crockford on the method.
let poperator: Parser<_,_> =
let f c = (isAsciiIdContinue c || isAnyOf [|' ';'\t';'\n';'\"';'(';')';'{';'}';'[';']'|] c) = false
(many1Satisfy f .>> spaces)
>>= fun token ->
match dict_operator.TryGetValue token with
| true, x -> preturn x
| false, _ -> fail "unknown operator"
let rec led poperator term left (prec,asoc,m) =
match asoc with
| Associativity.Left | Associativity.None -> tdop poperator term prec |>> m left
| Associativity.Right -> tdop poperator term (prec-1) |>> m left
| _ -> failwith "impossible"
and tdop poperator term rbp =
let rec f left =
poperator >>= fun (prec,asoc,m as v) ->
if rbp < prec then led poperator term left v >>= loop
else pzero
and loop left = attempt (f left) <|>% left
term >>= loop
let operators expr i (s: CharStream<_>) =
let expr_indent expr (s: CharStream<_>) = expr_indent i (<=) expr s
let op s = expr_indent poperator s
let term s = expr_indent expr s
tdop op term 0 s
The led and tdop functions which do the actual precedence parsing are 10 lines long. The above is just a snippet of the full parser for the language I am making - in terms of syntax it is similar to F# and is indentation sensitive. Here is a more straightforward F# translation of Douglas Crockford's Javascript example.

Haskell: if-then-else parse error

I'm working on an instance of Read ComplexInt.
Here's what was given:
data ComplexInt = ComplexInt Int Int
deriving (Show)
and
module Parser (Parser,parser,runParser,satisfy,char,string,many,many1,(+++)) where
import Data.Char
import Control.Monad
import Control.Monad.State
type Parser = StateT String []
runParser :: Parser a -> String -> [(a,String)]
runParser = runStateT
parser :: (String -> [(a,String)]) -> Parser a
parser = StateT
satisfy :: (Char -> Bool) -> Parser Char
satisfy f = parser $ \s -> case s of
[] -> []
a:as -> [(a,as) | f a]
char :: Char -> Parser Char
char = satisfy . (==)
alpha,digit :: Parser Char
alpha = satisfy isAlpha
digit = satisfy isDigit
string :: String -> Parser String
string = mapM char
infixr 5 +++
(+++) :: Parser a -> Parser a -> Parser a
(+++) = mplus
many, many1 :: Parser a -> Parser [a]
many p = return [] +++ many1 p
many1 p = liftM2 (:) p (many p)
Here's the given exercise:
"Use Parser to implement Read ComplexInt, where you can accept either the simple integer
syntax "12" for ComplexInt 12 0 or "(1,2)" for ComplexInt 1 2, and illustrate that read
works as expected (when its return type is specialized appropriately) on these examples.
Don't worry (yet) about the possibility of minus signs in the specification of natural
numbers."
Here's my attempt:
data ComplexInt = ComplexInt Int Int
deriving (Show)
instance Read ComplexInt where
readsPrec _ = runParser parseComplexInt
parseComplexInt :: Parser ComplexInt
parseComplexInt = do
statestring <- getContents
case statestring of
if '(' `elem` statestring
then do process1 statestring
else do process2 statestring
where
process1 ststr = do
number <- read(dropWhile (not(isDigit)) ststr) :: Int
return ComplexInt number 0
process2 ststr = do
numbers <- dropWhile (not(isDigit)) ststr
number1 <- read(takeWhile (not(isSpace)) numbers) :: Int
number2 <- read(dropWhile (not(isSpace)) numbers) :: Int
return ComplexInt number1 number2
Here's my error (my current error, as I'm sure there will be more once I sort this one out, but I'll take this one step at time):
Parse error in pattern: if ')' `elem` statestring then
do { process1 statestring }
else
do { process2 statestring }
I based my structure of the if-then-else statement on the structure used in this question: "parse error on input" in Haskell if-then-else conditional
I would appreciate any help with the if-then-else block as well as with the code in general, if you see any obvious errors.
Let's look at the code around the parse error.
case statestring of
if '(' `elem` statestring
then do process1 statestring
else do process2 statestring
That's not how case works. It's supposed to be used like so:
case statestring of
"foo" -> -- code for when statestring == "foo"
'b':xs -> -- code for when statestring begins with 'b'
_ -> -- code for none of the above
Since you're not making any sort of actual use of the case, just get rid of the case line entirely.
(Also, since they're only followed by a single statement each, the dos after then and else are superfluous.)
You stated you were given some functions to work with, but then didn't use them! Perhaps I misunderstood. Your code seems jumbled and doesn't seem to achieve what you would like it to. You have a call to getContents, which has type IO String but that function is supposed to be in the parser monad, not the io monad.
If you actually would like to use them, here is how:
readAsTuple :: Parser ComplexInt
readAsTuple = do
_ <- char '('
x <- many digit
_ <- char ','
y <- many digit
_ <- char ')'
return $ ComplexInt (read x) (read y)
readAsNum :: Parser ComplexInt
readAsNum = do
x <- many digit
return $ ComplexInt (read x) 0
instance Read ComplexInt where
readsPrec _ = runParser (readAsTuple +++ readAsNum)
This is fairly basic, as strings like " 42" (ones with spaces) will fail.
Usage:
> read "12" :: ComplexInt
ComplexInt 12 0
> read "(12,1)" :: ComplexInt
ComplexInt 12 1
The Read type-class has a method called readsPrec; defining this method is sufficient to fully define the read instance for the type, and gives you the function read automatically.
What is readsPrec?
readsPrec :: Int -> String -> [(a, String)].
The first parameter is the precedence context; you can think of this as the precedence of the last thing that was parsed. This can range from 0 to 11. The default is 0. For simple parses like this you don't even use it. For more complex (ie recursive) datatypes, changing the precedence context may change the parse.
The second parameter is the input string.
The output type is the possible parses and string remaining a parse terminates. For example:
>runStateT (char 'h') "hello world"
[('h',"ello world")]
Note that parsing is not-deterministic; every matching parse is returned.
>runStateT (many1 (char 'a')) "aa"
[("a","a"),("aa","")]
A parse is considered successful if the return list is a singleton list whose second value is the empty string; namely: [(x, "")] for some x. Empty lists, or lists where any of the remaining strings are not the empty string, give the error no parse and lists with more than one value give the error ambiguous parse.

FParsec combinator to turn Parser<char,_> until Parser<string,_>?

I'm certain there's a really simple answer to this, but I've been staring at this all day and I can't figure it out.
As per the tutorial, I'm implementing a JSON parser. To challenge myself, I'm implementing the number parser myself.
This is what I got so far:
let jnumber =
let neg = stringReturn "-" -1 <|> preturn 1
let digit = satisfy (isDigit)
let digit19 = satisfy (fun c -> isDigit c && c <> '0')
let digits = many1 digit
let ``int`` =
digit
<|> (many1Satisfy2 (fun c -> isDigit c && c <> '0') isDigit)
The trouble is that digit is a Parser<char,_>, whereas the second option for int is a Parser<string,_>. Would I normally just use a combinator to turn digit into a Parser<char,_>, or is there something else I should do?
The |>> operator is what you're looking for. I quote the FParsec reference:
val (|>>): Parser<'a,'u> -> ('a -> 'b) -> Parser<'b,'u>
The parser p
|>> f applies the parser p and returns the result of the function
application f x, where x is the result returned by  p.
p |>> f is an
optimized implementation of p >>= fun x -> preturn (f x).
For example:
let jnumber =
let neg = stringReturn "-" -1 <|> preturn 1
let digit = satisfy (isDigit)
let digit19 = satisfy (fun c -> isDigit c && c <> '0')
let digits = many1 digit
(digit |>> string) (* The operator is used here *)
<|> (many1Satisfy2 (fun c -> isDigit c && c <> '0') isDigit)
You may want to read FParsec tutorial on parsing JSON which uses this operator quite intensively.

Resources