Parse error on input `_' - parsing

Line 7, at the _. I've no idea what might be the problem. Any tips?
term :: Parser Expr
term s1 = case factor s1 of
            Just (a, s2) -> case s2 of
                              '*':s3 -> case term s3 of
                                          Just (b, s4) -> Just (Mul a b, s4)
                                          Nothing      -> Just (a, s2)
                                _    -> Just (a, s2)
            Nothing -> Nothing
I'm trying to parse a string into an Expr (a self-made datatype). I think this is how we're supposed to do it, but I can't test it since I can't get it to compile. GHCi and ghc -Wall give me the same error: a parse error at that specific point.
My code is larger than this, but this is the relevant piece.
edit: Code posted here, sorry.

It is a syntax problem. Haskell's layout is indentation-sensitive, so every alternative of a case expression must start in the same column.
So, to fix the error, move line 7 two characters to the left:
term :: Parser Expr
term s1 = case factor s1 of
            Just (a, s2) -> case s2 of
                              '*':s3 -> case term s3 of
                                          Just (b, s4) -> Just (Mul a b, s4)
                                          Nothing      -> Just (a, s2)
                              _      -> Just (a, s2)
            Nothing -> Nothing
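For reference, here is a minimal self-contained sketch of the aligned version that can be loaded into GHCi. The `Parser` synonym, the `Expr` constructors `Num` and `Mul`, and the single-digit `factor` are assumptions made for illustration; the question does not show its own definitions.

```haskell
import Data.Char (digitToInt, isDigit)

-- Assumed definitions (not shown in the question).
data Expr = Num Int | Mul Expr Expr
  deriving (Eq, Show)

type Parser a = String -> Maybe (a, String)

-- Hypothetical factor: a single decimal digit.
factor :: Parser Expr
factor (c:rest) | isDigit c = Just (Num (digitToInt c), rest)
factor _                    = Nothing

term :: Parser Expr
term s1 = case factor s1 of
  Just (a, s2) -> case s2 of
    -- Both alternatives of the inner case start in the same column.
    '*':s3 -> case term s3 of
      Just (b, s4) -> Just (Mul a b, s4)
      Nothing      -> Just (a, s2)
    _      -> Just (a, s2)
  Nothing -> Nothing
```

Starting each nested case on its own, less deeply indented line keeps the alternatives easy to align and makes this kind of mistake harder to introduce.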

Related

Calculating First and Follow of a grammar

I'm trying to calculate First and Follow of the following grammar:
S -> A B C D E
A -> a
A -> EPSILON
B -> b
B -> EPSILON
C -> c
D -> d
D -> EPSILON
E -> e
E -> EPSILON
I calculated them and got First(S) = {a, b, c}. But this tool says First(S) = {a, ε, c, b}. Why is epsilon part of First(S)? As I understand it, it should not be there. Is it my mistake or a bug? If it's a bug, are there other tools I can use to verify my results? If it's my mistake, it would be helpful to understand why.
Also, I got Follow(C) = {d, e, $}, but their result is Follow(C) = {c, d, $}. Why?
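One way to cross-check results like these without relying on an external tool is a small fixpoint computation. The sketch below assumes the usual textbook definitions of First and Follow, takes S as the start symbol, and encodes the grammar exactly as written above. All names in it (`grammar`, `firstSets`, `followSets`, and so on) are made up for this illustration.

```haskell
import Data.List (nub, sort)
import Data.Maybe (fromMaybe)

type Symbol = String
type Table  = [(Symbol, [Symbol])]

epsilon, endMarker :: Symbol
epsilon   = "EPSILON"
endMarker = "$"

-- The grammar from the question; an empty right-hand side stands for EPSILON.
grammar :: [(Symbol, [Symbol])]
grammar =
  [ ("S", ["A", "B", "C", "D", "E"])
  , ("A", ["a"]), ("A", [])
  , ("B", ["b"]), ("B", [])
  , ("C", ["c"])
  , ("D", ["d"]), ("D", [])
  , ("E", ["e"]), ("E", [])
  ]

nonterminals :: [Symbol]
nonterminals = nub (map fst grammar)

lookupSet :: Table -> Symbol -> [Symbol]
lookupSet table x = fromMaybe [] (lookup x table)

-- Apply a step function until the table stops changing.
fixpoint :: Eq a => (a -> a) -> a -> a
fixpoint step x
  | next == x = x
  | otherwise = fixpoint step next
  where next = step x

-- First of a symbol sequence, given the First sets computed so far.
firstOfSeq :: Table -> [Symbol] -> [Symbol]
firstOfSeq _ [] = [epsilon]
firstOfSeq table (x:xs)
  | x `notElem` nonterminals = [x]
  | epsilon `elem` fx        = nub (filter (/= epsilon) fx ++ firstOfSeq table xs)
  | otherwise                = fx
  where fx = lookupSet table x

firstSets :: Table
firstSets = fixpoint step [(n, []) | n <- nonterminals]
  where
    step table =
      [ (n, sort (nub (concat [firstOfSeq table rhs | (lhs, rhs) <- grammar, lhs == n])))
      | n <- nonterminals ]

followSets :: Table
followSets = fixpoint step [(n, [endMarker | n == "S"]) | n <- nonterminals]
  where
    step table =
      [ (n, sort (nub (lookupSet table n ++ concat
          [ contribution table lhs rest
          | (lhs, rhs) <- grammar, (x, rest) <- suffixes rhs, x == n ])))
      | n <- nonterminals ]
    -- What a production lhs -> ... n rest adds to Follow(n).
    contribution table lhs rest =
      let f = firstOfSeq firstSets rest
      in filter (/= epsilon) f ++ (if epsilon `elem` f then lookupSet table lhs else [])
    -- Every symbol of a right-hand side paired with the symbols after it.
    suffixes []     = []
    suffixes (x:xs) = (x, xs) : suffixes xs
```

Under these assumptions, `lookup "S" firstSets` contains a, b and c but not EPSILON, because C is not nullable, and `lookup "C" followSets` contains d, e and $, which agrees with the hand-computed sets. If a tool reports something else, it is worth checking that it was given the grammar in exactly this form.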

Haskell Type error in Double recursion function

I'm trying to define a greedy function
greedy :: ReadP a -> ReadP [a]
that parses a sequence of values, returning only the "maximal" sequences that cannot be extended any further. For example,
> readP_to_S (greedy (string "a" +++ string "ab")) "abaac"
[(["a"],"baac"),(["ab","a","a"],"c")]
I'm using a very simple and probably clumsy approach: parse a value and see whether the rest of the input can be parsed any further. If so, apply the function again to get all the possible continuations and prepend the value to each of them; otherwise just return the value itself. However, there seem to be some type problems. My code is below.
import Text.ParserCombinators.ReadP
addpair :: a -> [([a],String)] -> [([a],String)]
addpair a [] = []
addpair a (c:cs) = (a : (fst c), snd c ) : (addpair a cs)
greedy :: ReadP a -> ReadP [a]
greedy ap = readS_to_P (\s ->
    let list = readP_to_S ap s in
    f list )
  where
    f :: [(a,String)] -> [([a],String)]
    f ((value, str2):cs) =
      case readP_to_S ap str2 of
        [] -> ([value], str2) : (f cs)
        _  -> (addpair value (readP_to_S (greedy ap) str2)) ++ (f cs)
GHC reports that the function f has type [(a1,String)] -> [([a1],String)] while greedy is ReadP a -> ReadP [a]. I wonder why, because I think their types should agree. It would also help if anyone could suggest a cleverer, more elegant way to define greedy (my approach is definitely too redundant).
To fix the compilation error, you need to add the language extension
{-# LANGUAGE ScopedTypeVariables #-}
to your source file, or pass the corresponding flag into the compiler. You also need to change the type signature of greedy to
greedy :: forall a. ReadP a -> ReadP [a]
This is because your two a type variables are not actually the same; they're in different scopes. With the extension and the forall, they are treated as the same variable, and your types unify properly. Even then, the code fails at run time, because the pattern match in your definition of f is not exhaustive. If you add
f [] = []
then the code seems to work as intended.
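Putting the two fixes together gives roughly the following complete file. This is a sketch of the question's code with only the changes described above applied, plus `addpair` written with `map`, which is an equivalent restatement rather than part of the original.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Text.ParserCombinators.ReadP

-- Prepend a value to every partial result.
addpair :: a -> [([a], String)] -> [([a], String)]
addpair a = map (\(xs, rest) -> (a : xs, rest))

-- The forall brings a into scope for the signature of f below.
greedy :: forall a. ReadP a -> ReadP [a]
greedy ap = readS_to_P (\s -> f (readP_to_S ap s))
  where
    f :: [(a, String)] -> [([a], String)]
    f [] = []                                  -- the previously missing case
    f ((value, str2) : cs) =
      case readP_to_S ap str2 of
        [] -> ([value], str2) : f cs           -- cannot be extended: keep it
        _  -> addpair value (readP_to_S (greedy ap) str2) ++ f cs
```

Running `readP_to_S (greedy (string "a" +++ string "ab")) "abaac"` with this version yields the two maximal parses from the question's example.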
In order to simplify your code, I took a look at the provided function munch, which is defined as:
munch :: (Char -> Bool) -> ReadP String
-- ^ Parses the first zero or more characters satisfying the predicate.
-- Always succeeds, exactly once having consumed all the characters
-- Hence NOT the same as (many (satisfy p))
munch p =
  do s <- look
     scan s
 where
  scan (c:cs) | p c = do _ <- get; s <- scan cs; return (c:s)
  scan _            = do return ""
In that spirit, your code can be rewritten as:
greedy2 :: forall a. ReadP a -> ReadP [a]
greedy2 ap = do
  -- look at the string
  s <- look
  -- try parsing it, but without do notation
  case readP_to_S ap s of
    -- if we failed, then return nothing
    [] -> return []
    -- if we parsed something, ignore it
    (_:_) -> do
      -- parse it again, but this time inside of the monad
      x <- ap
      -- recurse, greedily parsing again
      xs <- greedy2 ap
      -- and return the concatenated values
      return (x:xs)
This does have the speed disadvantage of executing ap twice as often as needed; this may be too slow for your use case. I'm sure my code could be further rewritten to avoid that, but I'm not a ReadP expert.
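One possible way to avoid running the parser twice is ReadP's left-biased choice operator `(<++)`, which uses its right-hand parser only when the left-hand one produces no result at all. The sketch below is an alternative formulation, not the answer's code, and the name `greedy3` is made up here. Like `greedy2`, it succeeds with an empty list when the parser cannot be applied even once.

```haskell
import Text.ParserCombinators.ReadP

-- Try to parse one more value and recurse; only if that yields no result
-- at all does (<++) fall back to returning the empty list.
greedy3 :: ReadP a -> ReadP [a]
greedy3 p = more <++ return []
  where
    more = do
      x  <- p
      xs <- greedy3 p
      return (x : xs)
```

On the question's example, `readP_to_S (greedy3 (string "a" +++ string "ab")) "abaac"` produces the same two maximal sequences, and it needs neither `ScopedTypeVariables` nor an explicit `forall`.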

Parse string with lex in Haskell

I'm following the Gentle Introduction to Haskell tutorial, and the code presented there seems to be broken. I need to understand whether that is so, or whether my understanding of the concept is wrong.
I am implementing a parser for a custom type:
data Tree a = Leaf a | Branch (Tree a) (Tree a)
printing function for convenience
showsTree :: Show a => Tree a -> String -> String
showsTree (Leaf x) = shows x
showsTree (Branch l r) = ('<':) . showsTree l . ('|':) . showsTree r . ('>':)
instance Show a => Show (Tree a) where
    showsPrec _ x = showsTree x
this parser is fine but breaks when there are spaces
readsTree :: (Read a) => String -> [(Tree a, String)]
readsTree ('<':s) = [(Branch l r, u) | (l, '|':t) <- readsTree s,
                                       (r, '>':u) <- readsTree t ]
readsTree s = [(Leaf x, t) | (x,t) <- reads s]
this one is said to be a better solution, but it does not work without spaces
readsTree_lex :: (Read a) => String -> [(Tree a, String)]
readsTree_lex s = [(Branch l r, x) | ("<", t) <- lex s,
                                     (l, u)   <- readsTree_lex t,
                                     ("|", v) <- lex u,
                                     (r, w)   <- readsTree_lex v,
                                     (">", x) <- lex w ]
                  ++
                  [(Leaf x, t) | (x, t) <- reads s ]
next I pick one of parsers to use with read
instance Read a => Read (Tree a) where
    readsPrec _ s = readsTree s
Then I load it in GHCi using Leksah debug mode (this is irrelevant, I guess), and try to parse two strings:
read "<1|<2|3>>" :: Tree Int -- succeeds with readsTree
read "<1| <2|3> >" :: Tree Int -- succeeds with readsTree_lex
When lex encounters the |<2... part of the former string, it splits off ("|<", _). That does not match the ("|", v) <- lex u part of the parser, so parsing fails.
Two questions arise:
How do I define a parser that really ignores spaces, rather than requiring them?
How can I define rules for splitting the literals that lex encounters?
As for the second question, I ask mostly out of curiosity, since defining my own lexer seems more correct than redefining the rules of an existing one.
lex splits into Haskell lexemes, skipping whitespace.
This means that since Haskell permits |< as a lexeme, lex will not split it into two lexemes, since that's not how it parses in Haskell.
You can only use lex in your parser if you're using the same (or similar) syntactic rules to Haskell.
If you want to ignore all whitespace (as opposed to making any whitespace equivalent to one space), it's much simpler and more efficient to first run filter (not.isSpace).
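A minimal sketch of that suggestion, reusing the question's first parser unchanged. The wrapper name `readsTreeNoSpace` is made up for this example, and `Tree` derives `Eq` and `Show` here only to keep the snippet self-contained.

```haskell
import Data.Char (isSpace)

data Tree a = Leaf a | Branch (Tree a) (Tree a)
  deriving (Eq, Show)

-- The whitespace-sensitive parser from the question, unchanged.
readsTree :: Read a => ReadS (Tree a)
readsTree ('<':s) = [ (Branch l r, u) | (l, '|':t) <- readsTree s
                                      , (r, '>':u) <- readsTree t ]
readsTree s       = [ (Leaf x, t) | (x, t) <- reads s ]

-- Hypothetical wrapper: drop every whitespace character before parsing.
readsTreeNoSpace :: Read a => ReadS (Tree a)
readsTreeNoSpace = readsTree . filter (not . isSpace)
```

Note that this also removes whitespace inside leaves and from the unparsed remainder, so it is only appropriate when the leaf type never needs spaces (it would alter `String` leaves, for example).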
The answer to this seems to be a small gap between the text of Gentle Introduction to Haskell and its code samples, plus an error in the sample code.
There should also be one more lexer, but there is no working example (satisfying my need) in the codebase, so I wrote one. Please point out any flaw in it:
lexAll :: ReadS String
lexAll s = case lex s of
    [("", _)] -> []                                   -- nothing to parse
    [(c, r)]  -> if length c == 1
                   then [(c, r)]                      -- a one-character lexeme is kept as it is
                   else [(c, r), ([head s], tail s)]  -- offer the lexeme as parsed, and also its first character split off
    any_else  -> any_else
The author says:
Finally, the complete reader. This is not sensitive to white space as
were the previous versions. When you derive the Show class for a data
type the reader generated automatically is similar to this in style.
but lexAll should be used instead of lex (which seems to be the error mentioned above):
readsTree' :: (Read a) => ReadS (Tree a)
readsTree' s = [(Branch l r, x) | ("<", t) <- lexAll s,
                                  (l, u)   <- readsTree' t,
                                  ("|", v) <- lexAll u,
                                  (r, w)   <- readsTree' v,
                                  (">", x) <- lexAll w ]
               ++
               [(Leaf x, t) | (x, t) <- reads s]
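On the request to point out flaws: lex skips leading whitespace, but `head s` does not. For an input such as " |<2" the extra alternative offered by lexAll is therefore (" ", "|<2") rather than ("|", "<2"), and a string like "<1 |<2|3>>" still fails to parse. A possible adjustment, sketched below under the name `lexAll'` (made up here), drops the whitespace before splitting off the first character. `Tree` derives `Eq` and `Show` only to keep the example self-contained.

```haskell
import Data.Char (isSpace)

data Tree a = Leaf a | Branch (Tree a) (Tree a)
  deriving (Eq, Show)

-- Variant of lexAll that skips leading whitespace before splitting off
-- the first character of a multi-character lexeme.
lexAll' :: ReadS String
lexAll' s = case lex s of
  [("", _)] -> []
  [(tok, rest)]
    | length tok > 1, (c:cs) <- dropWhile isSpace s -> [(tok, rest), ([c], cs)]
    | otherwise -> [(tok, rest)]
  other -> other

readsTree' :: Read a => ReadS (Tree a)
readsTree' s = [ (Branch l r, x) | ("<", t) <- lexAll' s
               , (l, u) <- readsTree' t
               , ("|", v) <- lexAll' u
               , (r, w) <- readsTree' v
               , (">", x) <- lexAll' w ]
            ++ [ (Leaf x, t) | (x, t) <- reads s ]
```

With this variant, "<1|<2|3>>", "<1| <2|3> >" and "<1 |<2|3>>" should all read as the same tree.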

Creating a parser combinator of type Parser a -> Parser b -> Parser (Either a b)

I want to parse some text in which certain fields have structure most of the time but occasionally (due to special casing, typos etc) this structure is missing.
E.g. Regular case is Cost: 5, but occasionally it will read Cost: 5m or Cost: 3 + 1 per ally, or some other random stuff.
In the case of the normal parser (p) not working, I'd like to fall back to a parser which just takes the whole line as a string.
To this end, I'd like to create a combinator of type Parser a -> Parser b -> Either a b. However, I cannot work out how to check whether the first parser succeeds without doing something like case parse p "" txt of ....
I can't see a built-in combinator, but I'm sure there's some easy way to solve this that I'm missing.
I think you want something like this
eitherParse :: Parser a -> Parser b -> Parser (Either a b)
eitherParse a b = fmap Left (try a) <|> fmap Right b
The try is just to ensure that if a consumes some input and then fails, you'll backtrack properly. Then you can just use the normal methods for running a parser to yield Either ParseError (Either a b),
which is quite easy to transform into your Either a b:
case parse p "" str of
    Right (Left a)  -> useA a
    Right (Right b) -> useB b
    Left err        -> handleParserError err
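Applied to the Cost example from the question, this could look roughly as follows. It is a sketch assuming Parsec's `Text.Parsec.String.Parser`; the parsers `costNumber`, `rawLine` and `cost` are hypothetical names introduced for illustration.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

eitherParse :: Parser a -> Parser b -> Parser (Either a b)
eitherParse a b = fmap Left (try a) <|> fmap Right b

-- Hypothetical structured field: nothing but digits up to the end of input.
costNumber :: Parser Int
costNumber = read <$> many1 digit <* eof

-- Fallback: take whatever is left as a plain string.
rawLine :: Parser String
rawLine = many anyChar

cost :: Parser (Either Int String)
cost = string "Cost: " *> eitherParse costNumber rawLine
```

Here `parse cost "" "Cost: 5"` gives a `Left 5` inside the outer `Right`, while "Cost: 5m" falls back to the raw string "5m": `costNumber` consumes the digit and then fails at `eof`, and `try` rewinds the input before `rawLine` runs.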
Try this: (<|>) :: ParsecT s u m a -> ParsecT s u m a -> ParsecT s u m a
As a rule you could use it this way:
try p <|> q

Haskell: Lifting a reads function to a parsec parser

As part of the 4th exercise here
I would like to use a reads type function such as readHex with a parsec Parser.
To do this I have written a function:
liftReadsToParse :: Parser String -> (String -> [(a, String)]) -> Parser a
liftReadsToParse p f = p >>= \s -> if null (f s) then fail "No parse" else (return . fst . head ) (f s)
Which can be used, for example in GHCI, like this:
*Main Numeric> parse (liftReadsToParse (many1 hexDigit) readHex) "" "a1"
Right 161
Can anyone suggest any improvement to this approach with regard to:
Will the term (f s) be memoised, or evaluated twice in the case of a null (f s) returning False?
Handling multiple successful parses, i.e. when length (f s) is greater than one, I do not know how parsec deals with this.
Handling the remainder of the parse, i.e. (snd . head) (f s).
This is a nice idea. A more natural approach that would make
your ReadS parser fit in better with Parsec would be to
leave off the Parser String at the beginning of the type:
liftReadS :: ReadS a -> String -> Parser a
liftReadS reader = maybe (unexpected "no parse") (return . fst) .
                   listToMaybe . filter (null . snd) . reader
This "combinator" style is very idiomatic Haskell - once you
get used to it, it makes function definitions much easier
to read and understand.
You would then use liftReadS like this in the simple case:
> parse (many1 hexDigit >>= liftReadS readHex) "" "a1"
(Note that listToMaybe is in the Data.Maybe module.)
In more complex cases, liftReadS is easy to use inside any
Parsec do block.
Regarding some of your other questions:
The function reader is applied only once now, so there is nothing to "memoize".
It is common and accepted practice to ignore all except the first parse in a ReadS parser in most cases, so you're fine.
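For completeness, a self-contained version of the above with the imports it needs. The `hexNumber` parser and its `Int` result type are assumptions added for the example.

```haskell
import Data.Maybe (listToMaybe)
import Numeric (readHex)
import Text.Parsec
import Text.Parsec.String (Parser)

-- Accept the first ReadS result that consumes the whole string.
liftReadS :: ReadS a -> String -> Parser a
liftReadS reader =
  maybe (unexpected "no parse") (return . fst)
    . listToMaybe . filter (null . snd) . reader

-- Hypothetical example parser: one or more hex digits read as an Int.
hexNumber :: Parser Int
hexNumber = many1 hexDigit >>= liftReadS readHex
```

As in the GHCi session from the question, `parse hexNumber "" "a1"` gives `Right 161`.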
To answer the first part of your question: no, (f s) will not be memoised; you would have to do that manually:
liftReadsToParse p f = p >>= \s -> let fs = f s in if null fs then fail "No parse"
                                                   else (return . fst . head ) fs
But I'd use pattern matching instead:
liftReadsToParse p f = p >>= \s -> case f s of
    []              -> fail "No parse"
    (answer, _) : _ -> return answer

Resources