Haskell: if-then-else parse error - parsing

I'm working on an instance of Read ComplexInt.
Here's what was given:
data ComplexInt = ComplexInt Int Int
deriving (Show)
and
module Parser (Parser,parser,runParser,satisfy,char,string,many,many1,(+++)) where
import Data.Char
import Control.Monad
import Control.Monad.State
type Parser = StateT String []
runParser :: Parser a -> String -> [(a,String)]
runParser = runStateT
parser :: (String -> [(a,String)]) -> Parser a
parser = StateT
satisfy :: (Char -> Bool) -> Parser Char
satisfy f = parser $ \s -> case s of
[] -> []
a:as -> [(a,as) | f a]
char :: Char -> Parser Char
char = satisfy . (==)
alpha,digit :: Parser Char
alpha = satisfy isAlpha
digit = satisfy isDigit
string :: String -> Parser String
string = mapM char
infixr 5 +++
(+++) :: Parser a -> Parser a -> Parser a
(+++) = mplus
many, many1 :: Parser a -> Parser [a]
many p = return [] +++ many1 p
many1 p = liftM2 (:) p (many p)
Here's the given exercise:
"Use Parser to implement Read ComplexInt, where you can accept either the simple integer
syntax "12" for ComplexInt 12 0 or "(1,2)" for ComplexInt 1 2, and illustrate that read
works as expected (when its return type is specialized appropriately) on these examples.
Don't worry (yet) about the possibility of minus signs in the specification of natural
numbers."
Here's my attempt:
data ComplexInt = ComplexInt Int Int
deriving (Show)
instance Read ComplexInt where
readsPrec _ = runParser parseComplexInt
parseComplexInt :: Parser ComplexInt
parseComplexInt = do
statestring <- getContents
case statestring of
if '(' `elem` statestring
then do process1 statestring
else do process2 statestring
where
process1 ststr = do
number <- read(dropWhile (not(isDigit)) ststr) :: Int
return ComplexInt number 0
process2 ststr = do
numbers <- dropWhile (not(isDigit)) ststr
number1 <- read(takeWhile (not(isSpace)) numbers) :: Int
number2 <- read(dropWhile (not(isSpace)) numbers) :: Int
return ComplexInt number1 number2
Here's my error (my current error, as I'm sure there will be more once I sort this one out, but I'll take this one step at time):
Parse error in pattern: if ')' `elem` statestring then
do { process1 statestring }
else
do { process2 statestring }
I based my structure of the if-then-else statement on the structure used in this question: "parse error on input" in Haskell if-then-else conditional
I would appreciate any help with the if-then-else block as well as with the code in general, if you see any obvious errors.

Let's look at the code around the parse error.
case statestring of
if '(' `elem` statestring
then do process1 statestring
else do process2 statestring
That's not how case works. It's supposed to be used like so:
case statestring of
"foo" -> -- code for when statestring == "foo"
'b':xs -> -- code for when statestring begins with 'b'
_ -> -- code for none of the above
Since you're not making any sort of actual use of the case, just get rid of the case line entirely.
(Also, since they're only followed by a single statement each, the dos after then and else are superfluous.)

You stated you were given some functions to work with, but then didn't use them! Perhaps I misunderstood. Your code seems jumbled and doesn't seem to achieve what you would like it to. You have a call to getContents, which has type IO String but that function is supposed to be in the parser monad, not the io monad.
If you actually would like to use them, here is how:
readAsTuple :: Parser ComplexInt
readAsTuple = do
_ <- char '('
x <- many digit
_ <- char ','
y <- many digit
_ <- char ')'
return $ ComplexInt (read x) (read y)
readAsNum :: Parser ComplexInt
readAsNum = do
x <- many digit
return $ ComplexInt (read x) 0
instance Read ComplexInt where
readsPrec _ = runParser (readAsTuple +++ readAsNum)
This is fairly basic, as strings like " 42" (ones with spaces) will fail.
Usage:
> read "12" :: ComplexInt
ComplexInt 12 0
> read "(12,1)" :: ComplexInt
ComplexInt 12 1
The Read type-class has a method called readsPrec; defining this method is sufficient to fully define the read instance for the type, and gives you the function read automatically.
What is readsPrec?
readsPrec :: Int -> String -> [(a, String)].
The first parameter is the precedence context; you can think of this as the precedence of the last thing that was parsed. This can range from 0 to 11. The default is 0. For simple parses like this you don't even use it. For more complex (ie recursive) datatypes, changing the precedence context may change the parse.
The second parameter is the input string.
The output type is the possible parses and string remaining a parse terminates. For example:
>runStateT (char 'h') "hello world"
[('h',"ello world")]
Note that parsing is not-deterministic; every matching parse is returned.
>runStateT (many1 (char 'a')) "aa"
[("a","a"),("aa","")]
A parse is considered successful if the return list is a singleton list whose second value is the empty string; namely: [(x, "")] for some x. Empty lists, or lists where any of the remaining strings are not the empty string, give the error no parse and lists with more than one value give the error ambiguous parse.

Related

Haskell : Operator Parser keeps going to undefined rather than inputs

I'm practicing writing parsers. I'm using Tsodings JSON Parser video as reference. I'm trying to add to it by being able to parse arithmetic of arbitrary length and I have come up with the following AST.
data HVal
= HInteger Integer -- No Support For Floats
| HBool Bool
| HNull
| HString String
| HChar Char
| HList [HVal]
| HObj [(String, HVal)]
deriving (Show, Eq, Read)
data Op -- There's only one operator for the sake of brevity at the moment.
= Add
deriving (Show, Read)
newtype Parser a = Parser {
runParser :: String -> Maybe (String, a)
}
The following functions is my attempt of implementing the operator parser.
ops :: [Char]
ops = ['+']
isOp :: Char -> Bool
isOp c = elem c ops
spanP :: (Char -> Bool) -> Parser String
spanP f = Parser $ \input -> let (token, rest) = span f input
in Just (rest, token)
opLiteral :: Parser String
opLiteral = spanP isOp
sOp :: String -> Op
sOp "+" = Add
sOp _ = undefined
parseOp :: Parser Op
parseOp = sOp <$> (charP '"' *> opLiteral <* charP '"')
The logic above is similar to how strings are parsed therefore my assumption was that the only difference was looking specifically for an operator rather than anything that's not a number between quotation marks. It does seemingly begin to parse correctly but it then gives me the following error:
λ > runParser parseOp "\"+\""
Just ("+\"",*** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:80:14 in base:GHC.Err
undefined, called at /DIRECTORY/parser.hs:110:11 in main:Main
I'm confused as to where the error is occurring. I'm assuming it's to do with sOp mainly due to how the other functions work as intended as the rest of parseOp being a translation of the parseString function:
stringLiteral :: Parser String
stringLiteral = spanP (/= '"')
parseString :: Parser HVal
parseString = HString <$> (charP '"' *> stringLiteral <* charP '"')
The only reason why I have sOp however is that if it was replaced with say Op, I would get the error that the following doesn't exist Op :: String -> Op. When I say this my inclination was that the string coming from the parsed expression would be passed into this function wherein I could return the appropriate operator. This however is incorrect and I'm not sure how to proceed.
charP and Applicative Instance
charP :: Char -> Parser Char
charP x = Parser $ f
where f (y:ys)
| y == x = Just (ys, x)
| otherwise = Nothing
f [] = Nothing
instance Applicative Parser where
pure x = Parser $ \input -> Just (input, x)
(Parser p) <*> (Parser q) = Parser $ \input -> do
(input', f) <- p input
(input', a) <- q input
Just (input', f a)
The implementation of (<*>) is the culprit. You did not use input' in the next call to q, but used input instead. As a result you pass the string to the next parser without "eating" characters. You can fix this with:
instance Applicative Parser where
pure x = Parser $ \input -> Just (input, x)
(Parser p) <*> (Parser q) = Parser $ \input -> do
(input', f) <- p input
(input'', a) <- q input'
Just (input'', f a)
With the updated instance for Applicative, we get:
*Main> runParser parseOp "\"+\""
Just ("",Add)

What's a difference between unary and binary operator parsing in Haskell?

I'm learning some techniques to make a very simple Haskell parser that serves to calculation consistence (addition, subtraction and other trivial operations). Library I use is Parsec. Although I've got some comprehension on binary calculation, it seems to be tough to me if I try to make a unary operator function, for example that of negation (~). There is a code snippet I use to implement parsing for multiplication:
import Text.Parsec hiding(digit)
import Data.Functor
type Parser a = Parsec String () a
digit :: Parser Char
digit = oneOf ['0'..'9']
number :: Parser Integer
number = read <$> many1 digit
applyMany :: a -> [a -> a] -> a
applyMany x [] = x
applyMany x (h:t) = applyMany (h x) t
multiplication :: Parser Integer
multiplication = do
lhv <- number
spaces
char '*'
spaces
rhv <- number
return $ lhv * rhv
Switching to an unary operation, my code for factorial as follows:
fact :: Parser Integer
fact = do
spaces
char '!'
rhv <- number
spaces
return $ factorial rhv
factorial :: Parser Integer -> Parser Integer
factorial n
| n == 0 || n == 1 = 1
| otherwise = n * factorial (n-1)
And once module is getting loaded, an error message appears just like that:
Couldn't match type `Integer'
with `ParsecT String () Data.Functor.Identity.Identity Integer'
Expected type: Parser Integer
Actual type: Integer
Confusingly, it's a hard case for me to realize what's wrong with my comprehension about unary ops comparing them to binary ones. Hoping any help to fix that.
factorial doesn't define a parser; it computes a factorial. The type should just be Integer -> Integer, not Parser Integer -> Parser Integer.

Haskell Parser Combinators - String function

I have been reading a tutorial about parser combinators and I came across a function which I would like a bit of help in trying to understand.
satisfy :: (Char -> Bool) -> Parser Char
satisfy p = item `bind` \c ->
if p c
then unit c
else (Parser (\cs -> []))
char :: Char -> Parser Char
char c = satisfy (c ==)
natural :: Parser Integer
natural = read <$> some (satisfy isDigit)
string :: String -> Parser String
string [] = return []
string (c:cs) = do { char c; string cs; return (c:cs)}
My question is how does the string function work or rather how does it terminate, say i did something like:
let while_parser = string "while"
and then i used it to parse a string say for example
parse while_parser "while if" , it will correctly parse me the "while".
however if i try something like parse while_parser "test it will return [].
My question is how does it fail? what happens when char c returns an empty list?
Let's say your Parser is defined like this:
newtype Parser a = Parser { runParser :: String -> [(a,String)] }
Then your Monad instance would be defined something like this:
instance Monad Parser where
return x = Parser $ \input -> [(x, input)]
p >>= f = Parser $ \input -> concatMap (\(x,s) -> runParser (f x) s) (runParser p input)
You're wondering what happens when char c fails in this line of code:
string (c:cs) = do { char c; string cs; return (c:cs) }
First, let's desugar it:
string (c:cs) = char c >>= \_ -> string cs >>= \_ -> return (c:cs)
Now the part of interest is char c >>= \_ -> string cs. From the definition of char and subsequently the definition of satisfy we see that ultimately runParser (char c) input will evaluate to [] when char c fails. Look at the definition of >>= when p is char c. concatMap won't have any work to do because the list will be empty! Thus any calls to >>= from then on will just encounter an empty list and pass it along.
One of the wonderful things about referential transparency is that you can write down your expression and evaluate it by substituting definitions and doing the function applications by hand.

precedence climbing in haskell: parsec mutual recursion error

I'm programming the precedence climbing algorithm in Haskell, but for a reason unknown to me, does not work. I think that Parsec state info is lost at some point, but I don't even know that is the source of the error:
module PrecedenceClimbing where
import Text.Parsec
import Text.Parsec.Char
{-
Algorithm
compute_expr(min_prec):
result = compute_atom()
while cur token is a binary operator with precedence >= min_prec:
prec, assoc = precedence and associativity of current token
if assoc is left:
next_min_prec = prec + 1
else:
next_min_prec = prec
rhs = compute_expr(next_min_prec)
result = compute operator(result, rhs)
return result
-}
type Precedence = Int
data Associativity = LeftAssoc
| RightAssoc
deriving (Eq, Show)
data OperatorInfo = OPInfo Precedence Associativity (Int -> Int -> Int)
mkOperator :: Char -> OperatorInfo
mkOperator = \c -> case c of
'+' -> OPInfo 1 LeftAssoc (+)
'-' -> OPInfo 1 LeftAssoc (-)
'*' -> OPInfo 2 LeftAssoc (*)
'/' -> OPInfo 2 LeftAssoc div
'^' -> OPInfo 3 RightAssoc (^)
getPrecedence :: OperatorInfo -> Precedence
getPrecedence (OPInfo prec _ _) = prec
getAssoc :: OperatorInfo -> Associativity
getAssoc (OPInfo _ assoc _) = assoc
getFun :: OperatorInfo -> (Int -> Int -> Int)
getFun (OPInfo _ _ fun) = fun
number :: Parsec String () Int
number = do
spaces
fmap read $ many1 digit
operator :: Parsec String () OperatorInfo
operator = do
spaces
fmap mkOperator $ oneOf "+-*/^"
computeAtom = do
spaces
number
loop minPrec res = (do
oper <- operator
let prec = getPrecedence oper
if prec >= minPrec
then do
let assoc = getAssoc oper
next_min_prec = if assoc == LeftAssoc
then prec + 1
else prec
rhs <- computeExpr(next_min_prec)
loop minPrec $ getFun oper res rhs
else return res) <|> (return res)
computeExpr :: Int -> Parsec String () Int
computeExpr minPrec = (do
result <- computeAtom
loop minPrec result) <|> (computeAtom)
getResult minPrec = parse (computeExpr minPrec) ""
My program for some reason is only processing the first operation or the first operand depending on the case, but does not go any further
GHCi session:
*PrecedenceClimbing> getResult 1 "46+10"
Right 56
*PrecedenceClimbing> getResult 1 "46+10+1"
Right 56
I'm not sure exactly what's wrong with your code but I'll offer these comments:
(1) These statements are not equivalent:
Generic Imperative: rhs = compute_expr(next_min_prec)
Haskell: rhs <- computeExpr(next_min_prec)
The imperative call to compute_expr will always return. The Haskell call may fail in which case the stuff following the call never happens.
(2) You are really working against Parsec's strengths by trying to parse tokens one at a time in sequence. To see the "Parsec way" of generically parsing expressions with operators of various precedences and associativities, have a look at:
buildExpression
Parsec and Expression Printing
Update
I've posted a solution to http://lpaste.net/165651

Haskell read variable name

I need to write a code that parses some language. I got stuck on parsing variable name - it can be anything that is at least 1 char long, starts with lowercase letter and can contain underscore '_' character. I think I made a good start with following code:
identToken :: Parser String
identToken = do
c <- letter
cs <- letdigs
return (c:cs)
where letter = satisfy isLetter
letdigs = munch isLetter +++ munch isDigit +++ munch underscore
num = satisfy isDigit
underscore = \x -> x == '_'
lowerCase = \x -> x `elem` ['a'..'z'] -- how to add this function to current code?
ident :: Parser Ident
ident = do
_ <- skipSpaces
s <- identToken
skipSpaces; return $ s
idents :: Parser Command
idents = do
skipSpaces; ids <- many1 ident
...
This function however gives me a weird results. If I call my test function
test_parseIdents :: String -> Either Error [Ident]
test_parseIdents p =
case readP_to_S prog p of
[(j, "")] -> Right j
[] -> Left InvalidParse
multipleRes -> Left (AmbiguousIdents multipleRes)
where
prog :: Parser [Ident]
prog = do
result <- many ident
eof
return result
like this:
test_parseIdents "test"
I get this:
Left (AmbiguousIdents [(["test"],""),(["t","est"],""),(["t","e","st"],""),
(["t","e","st"],""),(["t","est"],""),(["t","e","st"],""),(["t","e","st"],""),
(["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],""),
(["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],""),
(["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],""),
(["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],""),
(["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],""),
(["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],""),
(["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],""),
(["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],"")])
Note that Parser is just synonym for ReadP a.
I also want to encode in the parser that variable names should start with a lowercase character.
Thank you for your help.
Part of the problem is with your use of the +++ operator. The following code works for me:
import Data.Char
import Text.ParserCombinators.ReadP
type Parser a = ReadP a
type Ident = String
identToken :: Parser String
identToken = do c <- satisfy lowerCase
cs <- letdigs
return (c:cs)
where lowerCase = \x -> x `elem` ['a'..'z']
underscore = \x -> x == '_'
letdigs = munch (\c -> isLetter c || isDigit c || underscore c)
ident :: Parser Ident
ident = do _ <- skipSpaces
s <- identToken
skipSpaces
return s
test_parseIdents :: String -> Either String [Ident]
test_parseIdents p = case readP_to_S prog p of
[(j, "")] -> Right j
[] -> Left "Invalid parse"
multipleRes -> Left ("Ambiguous idents: " ++ show multipleRes)
where prog :: Parser [Ident]
prog = do result <- many ident
eof
return result
main = print $ test_parseIdents "test_1349_zefz"
So what went wrong:
+++ imposes an order on its arguments, and allows for multiple alternatives to succeed (symmetric choice). <++ is left-biased so only the left-most option succeeds -> this would remove the ambiguity in the parse, but still leaves the next problem.
Your parser was looking for letters first, then digits, and finally underscores. Digits after underscores failed, for example. The parser had to be modified to munch characters that were either letters, digits or underscores.
I also removed some functions that were unused and made an educated guess for the definition of your datatypes.

Resources