I'm writing a monadic parser using Alex and Happy in Haskell.
My error function is defined like this:
parseError :: Token -> Alex a
parseError _ = alexError "error occurred"
How can I raise custom errors (such as a type error when trying to add a string to a number) during parsing?
UPDATE
The parser itself doesn't need to do the type checking; I do it inside the production, since I keep track of the operands' types.
As noted in a comment, I cannot use parseError for this, so is there a way to print an error and stop the parser?
I've solved it by implementing this function:
fatalError :: (Show a1, Show a) => [Char] -> a -> a1 -> t
fatalError s l c = error ("Error at line " ++ (show l) ++ " column " ++ (show c) ++ ": " ++ s)
and I call it from the production when an error is detected.
Related
I am currently writing a basic parser. A parser for type a takes a string as its argument and returns either nothing, or a value of type a together with the rest of the string.
Here is a simple type satisfying these requirements:
type Parser a = String -> Maybe (a, String)
For example, I wrote a function that takes a Char as its argument and returns a Parser Char:
parseChar :: Char -> Parser Char
parseChar _ [] = Nothing
parseChar c (x:xs)
  | c == x    = Just (x, xs)
  | otherwise = Nothing
I would like to write a function which takes a parser as its argument and tries to apply it zero or more times, returning a list of the parsed elements:
parse :: Parser a -> Parser [a]
Usage example:
> parse (parseChar ' ') " foobar"
Just (" ", "foobar")
I tried to write a recursive function but I can't save the parsed elements in a list.
How can I apply the parser several times and collect the results in a list?
"I tried to write a recursive function but I can't save the parsed elements in a list."
You don't need to "save" anything. You can use pattern matching. Here's a hint. Try to reason about what should happen in each case below. The middle case is a bit subtle; don't worry if you get it wrong at first. Note how s and s' are used below.
parse :: Parser a -> Parser [a]
parse p s = case p s of
  Nothing      -> ... -- first p failed
  Just (x, s') -> case parse p s' of
    Nothing        -> ... -- subtle case, might not be relevant after all
    Just (xs, s'') -> ... -- merge the results
Another hint: note that according to your description parse p should never fail, since it can always return the empty list.
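For reference, here is one way the skeleton above could be filled in. It is only a sketch of one possible completion, not necessarily what the answer had in mind. It assumes that a failure of p just means "no more elements", so parse p itself never returns Nothing. Be aware that it would loop forever if p could succeed without consuming any input.

```haskell
type Parser a = String -> Maybe (a, String)

parseChar :: Char -> Parser Char
parseChar _ [] = Nothing
parseChar c (x:xs)
  | c == x    = Just (x, xs)
  | otherwise = Nothing

-- Apply p zero or more times, collecting the results in order.
parse :: Parser a -> Parser [a]
parse p s = case p s of
  -- p failed immediately: zero elements parsed, input left untouched.
  Nothing      -> Just ([], s)
  Just (x, s') -> case parse p s' of
    -- Unreachable with this definition (parse never fails), kept so the
    -- case analysis stays total and mirrors the skeleton.
    Nothing        -> Just ([x], s')
    -- Prepend the element just parsed to the ones parsed recursively.
    Just (xs, s'') -> Just (x : xs, s'')
```

With these definitions, parse (parseChar ' ') " foobar" evaluates to Just (" ", "foobar"), matching the usage example in the question.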
I have a monadic parser that I'm implementing as an exercise. Its signature looks like this:
type Parser err src target = ExceptT err (State [src]) target
I've already implemented many basic helpers, but I've come across a use case where a negative lookahead is necessary. In particular, I think I'd like to make something of this signature:
notFollowedBy :: e -> Parser e s t -> Parser e s t' -> Parser e s t
notFollowedBy followedByError parser shouldFail = -- ...
My thought is that it can be used in context like this:
foo = letter `notFollowedBy'` digit
  where notFollowedBy' = notFollowedBy FollowedByDigitError
I'm struggling to implement notFollowedBy, though, for a couple of reasons:
I need a way to run shouldFail such that I can invert its ExceptT result (i.e., if it throws I want to catch the error and do nothing, but if it doesn't throw I need to throw followedByError)
catchError won't do here (I don't think) and I couldn't figure out a way to use runExceptT to get the Either e t' from shouldFail
Before I run shouldFail I need to save the state from StateT because after running shouldFail I need to restore the state (as if this parser wasn't run). But I'm using the Lazy StateT, so it's unclear if I need to switch everything to the strict one just to allow for this case
My best stab doesn't even compile, but it looks like this:
notFollowedBy :: (t' -> e) -> Parser e s t -> Parser e s t' -> Parser e s t
notFollowedBy onUnexpected parser shouldFail = do
  parsed <- parser
  state <- get -- This isn't strict
  result <- runExceptT shouldFail -- This doesn't typecheck
  case result of
    Left err -> put state >> return parsed
    Right t -> put state >> throwError $ onUnexpected t
(As an implementation note, the first parameter is actually (t' -> e), because I want to allow customizing the error thrown based on information returned by the second parser. But I don't think that matters for my question.)
The typecheck failure is due to it expecting ExceptT (ExceptT e (State [s])) t but getting Parser e s t (which is ExceptT e (State [s]) t).
After poring over the docs and reading some of the source for ExceptT and StateT, my best guess is that I need to emulate catchError (which matches on an Either) and then use liftCatch. This is my sloppy stab at that (which also doesn't compile):
notFollowedBy :: (t' -> e) -> Parser e s t -> Parser e s t' -> Parser e s t
notFollowedBy onUnexpected parser shouldFail = do
  state <- get
  result <- parser
  catchSuccess' result shouldFail (throwError . onUnexpected)
  put state
  return result
  where catchSuccess' result = liftCatch (catchSuccess result)
        catchSuccess r (Left l) _ = Right r
        catchSuccess _ (Right r) h = Left (h r)
The typechecker seems to be unhappy about a lot of things the second time around. In particular, it seems like liftCatch (from State.Lazy) is not what we want (because it expects catchSuccess' to return an ExceptT).
At this point I'm just flailing around making random permutations trying to appease the compiler. Can anyone offer any suggestions about how I could implement notFollowedBy?
edit: After looking at how I implemented option (below), it seems like the state ordering is not an issue (although why that is remains a mystery to me). So my primary issue is creating the reverse of catchError (catchSuccessAndSuppressError).
option :: Parser e s t -> Parser e s t -> Parser e s t
option parserA parserB = do
  state <- get
  parserA `catchError` \_ -> put state >> parserB
tl;dr
I'm trying to write a function with this signature that throws followedByError when shouldFail does not throw an exception (when run after running parser). The state upon return should be the same as the state after parser is run.
type Parser err src target = ExceptT err (State [src]) target
notFollowedBy :: e -> Parser e s t -> Parser e s t' -> Parser e s t
notFollowedBy followedByError parser shouldFail = -- ...
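One possible direction, sketched here under assumptions and not as a definitive answer: runExceptT shouldFail is an action in the inner State monad, so it can be brought back into the parser with lift, which yields the Either directly. The state is saved with get before the lookahead and restored with put afterwards. ParseError, satisfy and runParser below are hypothetical helpers added only so the example is self-contained; this version uses the (t' -> e) form of the first argument from the earlier attempts.

```haskell
import Control.Monad.Except (ExceptT, runExceptT, throwError)
import Control.Monad.State (State, get, put, runState)
import Control.Monad.Trans.Class (lift)
import Data.Char (isAlpha, isDigit)

type Parser err src target = ExceptT err (State [src]) target

-- Hypothetical error type, for illustration only.
data ParseError = EndOfInput | Unexpected Char | FollowedBy Char
  deriving (Eq, Show)

notFollowedBy :: (t' -> e) -> Parser e s t -> Parser e s t' -> Parser e s t
notFollowedBy onUnexpected parser shouldFail = do
  parsed <- parser
  saved  <- get                          -- input remaining after `parser`
  result <- lift (runExceptT shouldFail) -- run the lookahead in the inner State
  put saved                              -- undo whatever the lookahead consumed
  case result of
    Left _  -> return parsed             -- lookahead failed: overall success
    Right t -> throwError (onUnexpected t)

-- Hypothetical primitive: consume one character that satisfies a predicate.
satisfy :: (Char -> Bool) -> Parser ParseError Char Char
satisfy ok = do
  input <- get
  case input of
    [] -> throwError EndOfInput
    (c:cs)
      | ok c      -> put cs >> return c
      | otherwise -> throwError (Unexpected c)

-- Hypothetical runner: returns the result and the remaining input.
runParser :: Parser e s t -> [s] -> (Either e t, [s])
runParser p = runState (runExceptT p)
```

With these helpers, runParser (notFollowedBy FollowedBy (satisfy isAlpha) (satisfy isDigit)) "ab" gives (Right 'a', "b"), while the same parser on "a1" gives (Left (FollowedBy '1'), "1"). In both cases the remaining input is what was left after the first parser ran.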
I'm currently writing a parser for my simple programming language in Haskell with the megaparsec library.
I found this megaparsec tutorial, and I wrote the following parser code:
import Data.Void
import Text.Megaparsec
import Text.Megaparsec.Char
import qualified Text.Megaparsec.Char.Lexer as L
type Parser = Parsec Void String
lexeme :: Parser a -> Parser a
lexeme = L.lexeme space
rws :: [String] -- list of reserved words
rws = ["if", "then"]
identifier :: Parser String
identifier = (lexeme . try) (p >>= check)
  where
    p = (:) <$> letterChar <*> many alphaNumChar
    check x =
      if x `elem` rws
        then fail $ "keyword " ++ show x ++ " cannot be an identifier"
        else return x
This is a simple identifier parser with error handling for reserved words. It successfully parses valid identifiers such as foo and bar123.
But when invalid input (i.e., a reserved word) is given to the parser, it reports this error:
>> parseTest identifier "if"
1:3:
keyword "if" cannot be an identifier
The error message is fine, but the error location (1:3:) is not what I expected. I expected the error location to be 1:1:.
In the following part of definition of identifier,
identifier = (lexeme . try) (p >>= check)
I expected try to behave as if no input had been consumed when (p >>= check) fails, and to go back to source location 1:1:.
Is my expectation wrong? How can I get this code to work as I intended?
As part of the 4th exercise here, I would like to use a reads-type function such as readHex with a parsec Parser.
To do this I have written a function:
liftReadsToParse :: Parser String -> (String -> [(a, String)]) -> Parser a
liftReadsToParse p f = p >>= \s -> if null (f s) then fail "No parse" else (return . fst . head ) (f s)
Which can be used, for example in GHCI, like this:
*Main Numeric> parse (liftReadsToParse (many1 hexDigit) readHex) "" "a1"
Right 161
Can anyone suggest any improvement to this approach with regard to:
Will the term (f s) be memoised, or evaluated twice in the case of a null (f s) returning False?
Handling multiple successful parses, i.e. when length (f s) is greater than one, I do not know how parsec deals with this.
Handling the remainder of the parse, i.e. (snd . head) (f s).
This is a nice idea. A more natural approach that would make
your ReadS parser fit in better with Parsec would be to
leave off the Parser String at the beginning of the type:
liftReadS :: ReadS a -> String -> Parser a
liftReadS reader = maybe (unexpected "no parse") (return . fst) .
                   listToMaybe . filter (null . snd) . reader
This "combinator" style is very idiomatic Haskell - once you
get used to it, it makes function definitions much easier
to read and understand.
You would then use liftReadS like this in the simple case:
> parse (many1 hexDigit >>= liftReadS readHex) "" "a1"
(Note that listToMaybe is in the Data.Maybe module.)
In more complex cases, liftReadS is easy to use inside any
Parsec do block.
Regarding some of your other questions:
The function reader is applied only once now, so there is nothing to "memoize".
It is common and accepted practice to ignore all except the first parse in a ReadS parser in most cases, so you're fine.
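To see how the listToMaybe . filter (null . snd) pipeline treats leftover input and multiple results, here is a small sketch that uses only base and returns a Maybe in place of a Parsec parser. The name fullParse is made up for this illustration. It keeps only the results whose remainder is empty, then takes the first of those, if there is one.

```haskell
import Data.Maybe (listToMaybe)
import Numeric (readHex)

-- Hypothetical helper: succeed only when the ReadS parser consumed the
-- whole string, and ignore every result after the first such one.
fullParse :: ReadS a -> String -> Maybe a
fullParse reader = fmap fst . listToMaybe . filter (null . snd) . reader
```

For example, fullParse readHex "a1" :: Maybe Int is Just 161, whereas fullParse readHex "a1z" :: Maybe Int is Nothing, because readHex leaves "z" unconsumed.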
To answer the first part of your question: no, (f s) will not be memoised; you would have to do that manually:
liftReadsToParse p f = p >>= \s -> let fs = f s in if null fs then fail "No parse"
                                                   else (return . fst . head ) fs
But I'd use pattern matching instead:
liftReadsToParse p f = p >>= \s -> case f s of
  [] -> fail "No parse"
  (answer, _) : _ -> return answer
This is not working...
I get error FS0001: The type 'string' is not compatible with the type 'seq<string>'
for the last line. Why?
let rec Parse (charlist) =
    match charlist with
    | head :: tail ->
        printf "%s " head
        Parse tail
    | [] -> None
Parse (Seq.toList "this is a sentence.") |> ignore
The problem is that printf "%s " head forces head to be a string, whereas you actually want it to be a char. As a result, Parse is inferred to have type string list -> 'a option, so F# expects Seq.toList to be applied to a string seq, not a string.
The simple fix is to change the line doing the printing to printf "%c " head.