Haskell Parser not working - parsing

I am trying to understand Parsers. Therefore I have created my own parser. Unfortunately it does not work. Why?
type Parser a = String -> [(a, String)]
preturn :: a -> Parser a
preturn t = \inp -> [(t,inp)]
pfailure :: Parser a
pfailure = \inp -> []
pitem :: Parser Char
pitem = \inp -> case inp of
[] -> []
(x:xs) -> [(x,xs)]
parse :: Parser a -> Parser a
--parse :: Parser a -> String -> [(a,String)]
parse p inp = p inp
{-
combine :: Parser a -> Parser b -> Parser (a,b)
combine p1 p2 = \inp -> p2 t output
where
p1 inp = ([
-}
-- firstlast :: Parser (Char,Char)
firstlast = do
x <- pitem
z <- pitem
y <- pitem
preturn (x,y)
another = do
x <- pitem
y <- pitem
Firstlast is supposed to take a string and return the first and third character. Unfortunately, it returns odd values, and it does not accept its type (Parser (Char,Char))
For example,
*Main> firstlast "abc"
[(([('a',"bc")],[('a',"bc")]),"abc")]
What should happen is:
*Main> firstlast "abc"
[("ac",[])]

Please use code that compiles. Your another function does not.
What's the problem?
Your code for firstlast and another makes use of do-notation. And the way you're using pitem here, it looks as if you're expecting Parser to be a monad. But it isn't, at least not in the way you expect it to be.
There is a monad instance pre-defined which make GHC think that Parser is a monad, namely
instance Monad ((->) r) where
return = const
f >>= k = \ r -> k (f r) r
What this instance says is that, for any type r the function type r -> ... can be considered a monad, namely by distributing the parameter everywhere. So returning something in this monad amounts to producing a value ignoring the parameter of type r, and binding a value means that you take r and pass it on both to the left and right computation.
This is not what you want for a parser. The input string will be distributed to all computations. So each pitem will operate on the original input string. Furthermore, as
pitem :: String -> [(Char, String)]
the result of your monadic computation will be of type [(Char, String)], so x and y are both of this type. That's why you get the result
[(([('a',"bc")],[('a',"bc")]),"abc")]
You're calling pitem three times on the same input string. You're putting two results in a pair, and you're preturn-ing the whole thing.
How to fix it?
You need to define your own monad instance for the Parser type. You cannot do that directly, because Parser is a type synonym, and type synonyms cannot be partially applied,
so you cannot write
instance Monad Parser where
...
Instead, you have to wrap Parser in a new datatype or newtype:
newtype Parser a = Parser { parse :: String -> [(a, String)] }
This gives you a constructor Parser and a function parse to convert between the unwrapped and wrapped parser types:
Parser :: String -> [(a, String)] -> Parser a
parse :: Parser a -> String -> [(a, String)]
This implies you'll have to adapt your other functions. For example, preturn becomes
preturn :: a -> Parser a
preturn t = Parser (\inp -> [(t,inp)])
Change pfailure and pitem similarly. Then, you have to define the Monad instance:
instance Monad Parser where
return = preturn
(>>=) = ... -- to be completed by you
The function (>>=) is not contained in your code above. You'll want to implement the behaviour that the input is passed to the first parser, and for every result of that, the result and the remaining input are passed to the second argument of (>>=). Once this is done, a call to parse firstlast "abc" will have the following result:
[(('a','c'),"")]
which isn't quite what you want in your question, but I believe it's what you're actually after.

Related

Haskell: How to implement >>= for a parser

I have the following Parser
newtype Parser a = P (String -> [(a,String)])
and I need to implement the bind on its implementation as a Monad. I have that the return is defined as
instance Monad Parser where
return v = P (\inp -> [(v,inp)])
To implement p >>= f I know this much: p is a Parser object and f has type declaration
f :: a -> Parser b
So I'm thinking the value of p >>= f needs to be a Parser object which wraps a function. That function's argument is a String. So I'm guessing the function should "open up p", get its function, apply that to the input string, get an object of type [(a, String)], then ... I guess maybe apply f to every first coordinate in each tuple, then use the resulting Parser's function and apply it to the second coordinate ... and make a list of all of those tuples?
At this point I get pretty foggy on whether I got this right and if so, how to do it. Maybe I should write a helper function with type
trans :: [(a,String)] -> (a -> Parser b) -> [(b,String)]
But before getting into that, I wanted to check if my confused description of what I should be doing rings true.
instance Monad Parser where
return v = P (\inp -> [(v,inp)])
P p >>= f = P (\inp -> do
(x,u) <- p inp
let P q = f x
q u
)

Simple Parse in Haskell with do construct

I'm trying to write a simple Parser, so all the declarations are listed in the image below, but when I try to compile this module it fails.
I'm following the tutorial provided by this source ->
Haskell lessons suggested by official site and specifically this video by Dr. Erik Meijer (Lesson on Parser with "do" construct).
The problem is that I thought that the "do" construct was able to "concatenate" outputs from a previous function in a descending way, but the way that this p function should work seems to be magic to me. What's the right implementation?
-- Generic Parser.
type Parser a = String -> [(a, String)]
-- A simple Parser that captures the first char of the string, puts it in
-- the first position of the couple and then puts the rest of the string into
-- the second place of the couple inside the singleton list.
item :: Parser Char
item = \inp -> case inp of
[] -> []
(x:xs) -> [(x, xs)]
-- Simple parser that always fails.
failure :: Parser a
failure = \inp -> []
-- Returns the type of the parser without operating on the input.
return1 :: a -> Parser a
return1 v = \inp -> [(v, inp)]
-- Explicit call to parse.
parse :: Parser a -> String -> [(a, String)]
parse p inp = p inp
-- Some kind of "or" operator that if the first parser fails (returning an empty list) it
-- parses the second parser.
(+++) :: Parser a -> Parser a -> Parser a
p +++ q = \inp -> case p inp of
[] -> parse q inp
[(v, out)] -> [(v, out)]
-- The function within I'm having troubles.
p :: Parser (Char,Char)
p = do
x <- item
item
y <- item
return1 (x, y)
This is how it's explained by Dr. Meijer:
And this is how it should work:
Your Parser is just a type synonym for a function. The friendly parsers you've seen in use are all proper types of their own, with Functor and Applicative instances, along with (in most cases) Alternative, Monad, and MonadPlus instances. You probably want something that looks like the following (untested, never compiled) version.
import Control.Monad (ap, liftM)
import Control.Applicative (Alternative (..))
newtype Parser a = Parser
{ runParser :: String -> [(a, String)] }
instance Functor Parser where
fmap = liftM
instance Applicative Parser where
pure v = Parser $ \inp -> [(v, inp)]
(<*>) = ap
instance Monad Parser where
-- The next line isn't required for
-- recent GHC versions
-- return = pure
Parser m >>= f = Parser $ \s ->
[(r, s'') | (x, s') <- m s
, (r, s'') <- runParser (f r) s']
(+++) :: Parser a -> Parser a -> Parser a
p +++ q = Parser $ \inp -> case runParser p inp of
[] -> runParser q inp
[(v, out)] -> [(v, out)]
failure :: Parser a
failure = Parser $ \inp -> []
instance Alternative Parser where
(<|>) = (+++)
empty = failure
instance MonadPlus Parser

Implementing (<++) within a Haskell Parser

The goal of this is to implement <++ in Haskell using StateT to act as a parser
import Control.Monad.State
type Parser = StateT String []
runParser :: Parser a -> String -> [(a,String)]
runParser = runStateT
parser :: (String -> [(a,String)]) -> Parser a
parser = StateT
(<++) :: Parser a -> Parser a -> Parser a
From this base, how could I implement <++ so that it mirrors the definition in Text.ParserCombinators.ReadP
Local, exclusive, left-biased choice: If left parser locally produces any result at all, then right parser is not used.
Your parser are functions String -> [(a,String)] where the argument is the initial state. So lets construct such a function that tries the first parser, and if it fails tries the second parser:
(<++) :: Parser a -> Parser a -> Parser a
p1 <++ p2 =
parser $ \s0 -> -- 1. captures initial state
case runParser p1 s0 of -- 2. run first parser
(x:xs) -> x : xs -- 3. success
[] -> runParser p2 s0 -- 4. otherwise run second parser
Some parsers to try this out
-- | Always fails
failP :: Parser a
failP = parser (const [])
-- | Parses an Int
intP :: Parser Int
intP = parser reads
Output as expected? (Perhaps there are more enlightening test cases...)
λ> runParser (failP <++ intP) "2014"
[(2014,"")]
λ> runParser (intP <++ failP) "2014"
[(2014,"")]
λ> runParser (intP <++ intP) "2014"
[(2014,"")]

Minimal Purely Applicative Parser

I'm trying to figure out how to build a "purely applicative parser" based on a simple parser implementation. The parser would not use monads in its implementation. I asked this question previously but mis-framed it so I'm trying again.
Here is the basic type and its Functor, Applicative and Alternative implementations:
newtype Parser a = Parser { parse :: String -> [(a,String)] }
instance Functor Parser where
fmap f (Parser cs) = Parser (\s -> [(f a, b) | (a, b) <- cs s])
instance Applicative Parser where
pure = Parser (\s -> [(a,s)])
(Parser cs1) <*> (Parser cs2) = Parser (\s -> [(f a, s2) | (f, s1) <- cs1 s, (a, s2) <- cs2 s1])
instance Alternative Parser where
empty = Parser $ \s -> []
p <|> q = Parser $ \s ->
case parse p s of
[] -> parse q s
r -> r
The item function takes a character off the stream:
item :: Parser Char
item = Parser $ \s ->
case s of
[] -> []
(c:cs) -> [(c,cs)]
At this point, I want to implement digit. I can of course do this:
digit = Parser $ \s ->
case s of
[] -> []
(c:cs) -> if isDigit c then [(c, cs)] else []
but I'm replicating the code of item. I'd like to implement digit based on item.
How do I go about implementing digit, using item to take a character off the stream and then checking to see if the character is a digit without bringing monadic concepts into the implementation?
First, let us write down all the tools we currently have at hand:
-- Data constructor
Parser :: (String -> [(a, String)]) -> Parser a
-- field accessor
parse :: Parser a -> String -> [(a, String)]
-- instances, replace 'f' by 'Parser'
fmap :: Functor f => (a -> b) -> f a -> f b
(<*>) :: Applicative f => f (a -> b) -> f a -> f b
pure :: Applicative f => a -> f a
-- the parser at hand
item :: Parser Char
-- the parser we want to write with item
digit :: Parser Char
digit = magic item
-- ?
magic :: Parser Char -> Parser Char
The real question at hand is "what is magic"? There are only so many things we can use. Its type indicates fmap, but we can rule that out. All we can provide is some function a -> b, but there is no f :: Char -> Char that makes fmap f indicate a failure.
What about (<*>), can this help? Well, again, the answer is no. The only thing we can do here is to take the (a -> b) out of the context and apply it; whatever that means in the context of the given Applicative. We can rule pure out.
The problem is that we need to check the Char that item might parse and change the context. We need something like Char -> Parser Char
But we didn't rule Parser or parse out!
magic p = Parser $ \s ->
case parse p s of -- < item will be used here
[(c, cs)] -> if isDigit c then [(c, cs)] else []
_ -> []
Yes, I know, it's duplicate code, but now it's using item. It's using item before inspecting the character. That's the only way we can use item here. And now, there is some kind of sequence implied: item has to succeed before digit can do it's work.
Alternatively, we could have tried this way:
digit' c :: Char -> Parser Char
digit' c = if isDigit c then pure c else empty
But then fmap digit' item would have the type Parser (Parser Char), which can only get collapsed with a join-like function. That's why monads are more powerful than applicative.
That being said, you can get around all of the monad requirements if you use a more general function first:
satisfy :: (Char -> Bool) -> Parser Char
satisfy = Parser $ \s ->
case s of
(c:cs) | p c -> [(c, cs)]
_ -> []
You can then define both item and digit in terms of satisfy:
item = satisfy (const True)
digit = satisfy isDigit
That way digit does not have to inspect the result of a previous parser.
Functors allow you to act on somethings values. For example, if you have a list [1,2,3], you can change the contents. Note that Functors do not allow changing structure. map can not change the length of a list.
Applicatives allow you to combine structure, and the content is mushed together somehow. But the values can not change influence the structure.
Namely, given an item, we can change its structure, and we can change its content, but the content can not change the structure. We can't choose to fail on some content and not other.
If anyone knows how to state this more formally and provably, I'm all ears (it probably has to do with free theorems).

Problems with Parsers

i hope somebody can help me to understand the following code
type Parser a = String -> [(a,String)]
item :: Parser Char
item = \ s -> case s of
[] -> []
(x:xs) -> [(x,xs)]
returnP :: Parser a
returnP a = \s -> [(a,s)]
(>>=) :: Parser a -> (a -> Parser b) -> Parser b
p>>=f = \s -> case p s of
[(x,xs)]-> f x xs
_ -> []
twochars :: Parser (Char,Char)
twochars= item >>= \a -> item >>= \b -> returnP (a,b)
Everything seems to be clear but i dont understand the lampda function in the last line in the twochars-function. It would be nice if somebody can give me a explanation about that.
Rewriting the twochars function for clarity and it is basically:
twochars =
item >>= \a -> -- parse a character and call it `a`
item >>= \b -> -- parse another character and call it `b`
returnP (a,b) -- return the tuple of `a` and `b`
The lambdas here just introduce names for the parsed characters, and let them be passed along to a later part of the computation.
They correspond to the second argument in the bind you have defined:
(>>=) :: Parser a -- your item
-> (a -> Parser b) -- your lambda returning another parse result
-> Parser b

Resources