Parse error when reading - parsing

GHC can't derive Read or Show for complicated GADTs, so I attempted to define custom instances below that satisfy read . show == id. I've simplified the example as much as possible (while still triggering the runtime error like my real code). I decided to let GHC do the heavy lifting of writing Read and Show instances by making newtype wrappers for each GADT constructor (more accurately: for each type output by the GADT). In the Read/Show instances, I simply read/show the newtype wrapper and convert where necessary. I assumed this was foolproof: I'm letting GHC define all of the non-trivial instances. However, I seem to have done something wrong.
In my real code, Foo below is a complicated GADT that GHC won't derive Show or Read for. Since Foo is a wrapper (in part) around a newtype, I use the derived Show/Read instances for that.
{-# LANGUAGE FlexibleContexts, FlexibleInstances, GADTs, ScopedTypeVariables #-}
import Text.Read (Read(readPrec))
newtype Bar r = Bar r deriving (Show, Read)
newtype Foo r = Foo (Bar r)
-- use the GHC-derived Show/Read for Bar
instance (Show r) => Show (Foo r) where
show (Foo x) = show x
instance (Read r) => Read (Foo r) where
readPrec = Foo <$> readPrec
This instance seems to work: I can call read . show and I get back the input.
Now I have a wrapper around Foo:
data U t rep r where
U1 :: t r -> U t Int r
U2 :: t r -> U t Char r
-- use the Read/Show instances for U1Wrap and U2Wrap
newtype U1Wrap t r = U1Wrap {unU1Wrap :: t r} deriving (Show, Read)
newtype U2Wrap t r = U2Wrap (t r) deriving (Show, Read)
instance (Read (t r)) => Read (U t Int r) where
readPrec = (U1 . unU1Wrap) <$> readPrec
instance (Read (U2Wrap t r)) => Read (U t Char r) where
readPrec = do
x <- readPrec
return $ case x of
(U2Wrap y) -> U2 y
instance (Show (t r)) => Show (U t Int r) where
show (U1 x) = show $ U1Wrap x
instance (Show (t r)) => Show (U t Char r) where
show (U2 x) = show (U2Wrap x :: U2Wrap t r)
Like Foo, U is a complicated GADT, so I define custom newtype wrappers for each output type of the GADT. Unfortunately, my Show/Read instances don't work:
main :: IO ()
main = do
let x = U1 $ Foo $ Bar 3
y = U2 $ Foo $ Bar 3
print $ show (read (show x) `asTypeOf` x)
print $ show (read (show y) `asTypeOf` y)
main prints the first line, but I get Exception: Prelude.read: no parse on the second line.
This is my first time using Read, so I suspect I'm doing something incorrectly, though I don't see what that is.
My questions are:
Why am I getting this error, and logically how can I fix it? (There are several ways to poke the minimal example above to make the error go away, but I can't do those things in my real code.)
Is there a different/better high-level approach to Reading GADTs?

Your custom Show instance for Foo doesn't parenthesize correctly.
> U2 $ Foo $ Bar 3
U2Wrap Bar 3
When writing custom Show instances, you should define showsPrec instead. That's because show just gives back a string independently of the context, while showsPrec parenthesizes based on precendence. See Text.Show for further documentation.
instance (Show r) => Show (Foo r) where
showsPrec n (Foo x) = showsPrec n x
Which works here.
I don't know of an elegant approach that would automacially get us a Read instance for this GADT. The deriving mechanism can't seem to figure out that only a single constructor has to be considered for each rep.
At least Show can be derived even here. I also include here a manual Read instance which I hope conforms to Show. I tried to mimic the definitions in Text.Read, which you could also do in other cases. Alternatively, one could use the -ddump-deriv GHC argument to look at derived Read instances, and try to copy them in custom code.
{-# LANGUAGE GADTs, StandaloneDeriving, FlexibleInstances, FlexibleContexts #-}
import Text.Read
import Data.Proxy
data U t rep r where
U1 :: t r -> U t Int r
U2 :: t r -> U t Char r
deriving instance Show (t r) => Show (U t rep r)
instance Read (t r) => Read (U t Int r) where
readPrec = parens $ do
prec 10 $ do
Ident "U1" <- lexP
U1 <$> readPrec
instance Read (t r) => Read (U t Char r) where
readPrec = parens $ do
prec 10 $ do
Ident "U2" <- lexP
U2 <$> readPrec

Related

loading elements from file into a tree in haskell

I am trying to make a tree from the info in a text document. For example in example.txt we have aritmetchic expression (3 + x) * (5 - 2). I want to make a tree which seems like this:
Node * (Node + (Leaf 3) (Leaf x)) (Node - (Leaf 5) (Leaf 2)
So far after a lot of unsuccessful attempts I have done this:
data Tree a = Empty
| Leaf a
| Node a (Tree a) (Tree a)
deriving (Show)
this is the tree I use and :
take name = do
elements <- readFile name
return elements
So how can I put the elements in the tree?
You'll need to make a data type to put in the tree that can store both operations and values. One way to do this would be to create an ADT representing everything you want to store in the tree:
data Eval a
= Val a
| Var Char
| Op (a -> a -> a)
type EvalTree a = Tree (Eval a)
But this isn't really ideal because someone could have Leaf (Op (+)), which doesn't make much sense here. Rather, I would suggest structuring it as
data Eval a
= Val a
| Var Char
| Op (a -> a -> a) (Eval a) (Eval a)
Which is essentially the tree structure you have, just restricted to be syntactically correct. Then you can write a simple evaluator as
eval :: Eval a -> Data.Map.Map Char a -> Maybe a
eval vars (Val a) = Just a
eval vars (Var x) = Data.Map.lookup x vars
eval vars (Op op l r) = do
left <- eval l
right <- eval r
return $ left `op` right
This will just walk down both branches, evaluating as it goes, then finally returning the computed value. You just have to supply it with a map of variables to values to use
So for example, (3 + x) * (5 - 2) would be represented as Op (*) (Op (+) (Val 3) (Var 'x')) (Op (-) (Val 5) (Val 2)). All that's left is to parse the file, which is another problem entirely.

Parse string with lex in Haskell

I'm following Gentle introduction to Haskell tutorial and the code presented there seems to be broken. I need to understand whether it is so, or my seeing of the concept is wrong.
I am implementing parser for custom type:
data Tree a = Leaf a | Branch (Tree a) (Tree a)
printing function for convenience
showsTree :: Show a => Tree a -> String -> String
showsTree (Leaf x) = shows x
showsTree (Branch l r) = ('<':) . showsTree l . ('|':) . showsTree r . ('>':)
instance Show a => Show (Tree a) where
showsPrec _ x = showsTree x
this parser is fine but breaks when there are spaces
readsTree :: (Read a) => String -> [(Tree a, String)]
readsTree ('<':s) = [(Branch l r, u) | (l, '|':t) <- readsTree s,
(r, '>':u) <- readsTree t ]
readsTree s = [(Leaf x, t) | (x,t) <- reads s]
this one is said to be a better solution, but it does not work without spaces
readsTree_lex :: (Read a) => String -> [(Tree a, String)]
readsTree_lex s = [(Branch l r, x) | ("<", t) <- lex s,
(l, u) <- readsTree_lex t,
("|", v) <- lex u,
(r, w) <- readsTree_lex v,
(">", x) <- lex w ]
++
[(Leaf x, t) | (x, t) <- reads s ]
next I pick one of parsers to use with read
instance Read a => Read (Tree a) where
readsPrec _ s = readsTree s
then I load it in ghci using Leksah debug mode (this is unrelevant, I guess), and try to parse two strings:
read "<1|<2|3>>" :: Tree Int -- succeeds with readsTree
read "<1| <2|3> >" :: Tree Int -- succeeds with readsTree_lex
when lex encounters |<2... part of the former string, it splits onto ("|<", _). That does not match ("|", v) <- lex u part of parser and fails to complete parsing.
There are two questions arising:
how do I define parser that really ignores spaces, not requires them?
how can I define rules for splitting encountered literals with lex
speaking of second question -- it is asked more of curiousity as defining my own lexer seems to be more correct than defining rules of existing one.
lex splits into Haskell lexemes, skipping whitespace.
This means that since Haskell permits |< as a lexeme, lex will not split it into two lexemes, since that's not how it parses in Haskell.
You can only use lex in your parser if you're using the same (or similar) syntactic rules to Haskell.
If you want to ignore all whitespace (as opposed to making any whitespace equivalent to one space), it's much simpler and more efficient to first run filter (not.isSpace).
The answer to this seems to be a small gap between text of Gentle introduction to Haskell and its code samples, plus an error in sample code.
there should also be one more lexer, but there is no working example (satisfying my need) in codebase, so I written one. Please point out any flaw in it:
lexAll :: ReadS String
lexAll s = case lex s of
[("",_)] -> [] -- nothing to parse.
[(c, r)] -> if length c == 1 then [(c, r)] -- we will try to match
else [(c, r), ([head s], tail s)]-- not only as it was
any_else -> any_else -- parsed but also splitted
author sais:
Finally, the complete reader. This is not sensitive to white space as
were the previous versions. When you derive the Show class for a data
type the reader generated automatically is similar to this in style.
but lexAll should be used instead of lex (which seems to be said error):
readsTree' :: (Read a) => ReadS (Tree a)
readsTree' s = [(Branch l r, x) | ("<", t) <- lexAll s,
(l, u) <- readsTree' t,
("|", v) <- lexAll u,
(r, w) <- readsTree' v,
(">", x) <- lexAll w ]
++
[(Leaf x, t) | (x, t) <- reads s]

Invertible State monad (and parsers)

Good day, ladies and gentlemen!
I'm constantly writing parsers and codecs. Implementing both parsers and printers seems to be massive code duplication. I wonder whether it is possible to invert a stateful computation, given it is isomorphic by nature.
It is possible to invert pure function composition (Control.Lens.Iso did that by defining a composition operator over isomorphisms). As it can be observed,
Iso bc cb . Iso ab ba = Iso (bc . ab) (ba . cb) -- from Lenses talk
invert (f . g) = (invert g) . (invert f) -- pseudo-code
In other words, to invert a function composition one should compose inverted functions in the opposite order. So, given all primitive isomorphic pairs are defined, one can compose them to get more complicated pairs with no code duplication. Here is an example of pure bidirectional computation (Control.Lens is used, the explanatory video can help you to get the general idea of Lenses, Folds and Traversals):
import Control.Lens
tick :: Num a => Iso' a a
tick = iso (+1) (subtract 1) -- define an isomorphic pair
double :: Num a => Iso' a a
double = iso (+2) (subtract 2) -- and another one
threeTick :: Num a => Iso' a a
-- These are composed via simple function composition!
threeTick = double . tick
main :: IO ()
main = do
print $ (4 :: Int)^.tick -- => 5
print $ (4 :: Int)^.from tick -- => 3
print $ (4 :: Int)^.threeTick -- => 7, Composable
print $ (4 :: Int)^.from threeTick -- => 1, YEAH
As you can see, I didn't need to supply the inverted version of threeTick; it is obtained by backward composition automatically!
Now, let's consider a simple parser.
data FOO = FOO Int Int deriving Show
parseFoo :: Parser FOO
parseFoo = FOO <$> decimal <* char ' '
<*> decimal
parseFoo' :: Parser FOO
parseFoo' = do
first <- decimal
void $ char ' '
second <- decimal
return $ FOO first second
printFoo :: FOO -> BS.ByteString
printFoo (FOO a b) = BS.pack(show a) <>
BS.pack(" ") <>
BS.pack(show b)
main :: IO ()
main = do
print $ parseOnly parseFoo "10 11" -- => Right (FOO 10 11)
print $ parseOnly parseFoo' "10 11" -- => Right (FOO 10 11)
print . printFoo $ FOO 10 11 -- => "10 11"
print . parseOnly parseFoo . printFoo $ FOO 10 11 -- id
You can see that both versions of parseFoo are fairly declarative (thanks to parser combinators). Note the similarity between parseFoo and printFoo. Can I define isomorphisms over primitive parsers (decimal and char) and then just derive the printer (printFoo :: FOO -> String) automatically? Ideally, parser combinators will work as well.
I tried to redefine a monadic >>= operator in order to provide inverted semantics, but I've failed to do so. I feel that one could define inverted Kleisli composition operator (monadic function composition) similarly to composition inversion, but can one use it with ordinary monads?
f :: a -> m b, inverse f :: b -> m a
g :: b -> m c, inverse g :: c -> m b
inverse (f >=> g) = (inverse f) <=< (inverse g)
Why inverse f is of type b -> m a and not m b -> a? The answer is: monadic side effect is an attribute of an arrow, not that of a data type b. The state monad dualization is further discussed in the great Expert to Expert video.
If the solution does exist, could you please supply a working example of printFoo derivation? By the way, here is an interesting paper that could help us find the solution.
You may be interested in digging in further into the lens package for the concept of a Prism.
A Prism can be used as a 'smart constructor' to build something (e.g. a pretty printed string) unconditionally, and match on it (e.g. parse).
You'd have to ignore the laws or treat the laws as holding only up to a quotient though, as the string you get out of pretty printing is very likely not exactly the string you parsed.

How do I implement an Applicative instance for a parser without assuming Monad?

I can't figure out how to implement an Applicative instance for this parser:
newtype Parser m s a = Parser { getParser :: [s] -> m ([s], a) }
without assuming Monad m. I expected to only have to assume Applicative m, since the Functor instance only has to assume Functor m. I finally ended up with:
instance Functor m => Functor (Parser m s) where
fmap f (Parser g) = Parser (fmap (fmap f) . g)
instance Monad m => Applicative (Parser m s) where
pure a = Parser (\xs -> pure (xs, a))
Parser f <*> Parser x = Parser h
where
h xs = f xs >>= \(ys, f') ->
x ys >>= \(zs, x') ->
pure (zs, f' x')
How do I do this? I tried substituting in for >>= by hand, but always wound up getting stuck trying to reduce a join -- which would also require Monad.
I also consulted Parsec, but even that wasn't much help:
instance Applicative.Applicative (ParsecT s u m) where
pure = return
(<*>) = ap
My reasons for asking this question are purely self-educational.
It's not possible. Look at the inside of your newtype:
getParser :: [s] -> m ([s], a)
Presumably, you want to pass [s] to the input of y in x <*> y. This is exactly the difference between Monad m and Applicative m:
In Monad you can use the output of one computation as the input to another.
In Applicative, you cannot.
It's possible if you do a funny trick:
Parser x <*> Parser y = Parser $
\s -> (\(_, xv) (s', yv) -> (s', xv yv)) <$> x s <*> y s
However, this is almost certainly not the definition that you want, since it parses x and y in parallel.
Fixes
Your ParserT can be Applicative quite easily:
newtype ParserT m s a = ParserT { runParser :: [s] -> m ([s], a) }
-- or, equvalently
newtype ParserT m s a = ParserT (StateT [s] m a)
instance Monad m => Applicative (ParserT m s) where
...
Note that ParserT m s is not an instance of Monad as long as you don't define the Monad instance.
You can move the leftover characters outside the parser:
newtype ParserT m s a = ParserT { runParser :: [s] -> ([s], m a) }
instance Applicative m => Applicative (ParserT m s) where
ParserT x <*> ParserT y = ParserT $ \s ->
let (s', x') = x s
(s'', y') = y s'
in x' <*> y'
...
Full marks for aiming to use Applicative as much as possible - it's much cleaner.
Headline: Your parser can stay Applicative, but your collection of possible parses need to be stored in a Monad. Internal structure: uses a monad. External structure: is applicative.
You're using m ([s],a) to represent a bunch of possible parses. When you parse the next input, you want it to depend on what's already been parsed, but you're using m because there's potentially less than or more than one possible parse; you want to do \([s],a) -> ... and work with that to make a new m ([s],a). That process is called binding and uses >>= or equivalent, so your container is definitely a Monad, no escape.
It's not all that bad using a monad for your container - it's just a container you're keeping some stuff in after all. There's a difference between using a monad internally and being a monad. Your parsers can be applicative whilst using a monad inside.
See What are the benefits of applicative parsing over monadic parsing?.
If your parsers are applicative, they're simpler, so in theory you can do some optimisation when you combine them, by keeping static information about what they do instead of keeping their implementation. For example,
string "Hello World!" <|> string "Hello Mum!"
== (++) <$> string "Hello " <*> (string "World" <|> string "Mum!")
The second version is better than the first because it does no backtracking.
If you do a lot of this, it's like when a regular expression is compiled before it's run, creating a graph (finite state automaton) and simplifying it as much as possible and eliminating a whole load of inefficient backtracking.

Haskell: Lifting a reads function to a parsec parser

As part of the 4th exercise here
I would like to use a reads type function such as readHex with a parsec Parser.
To do this I have written a function:
liftReadsToParse :: Parser String -> (String -> [(a, String)]) -> Parser a
liftReadsToParse p f = p >>= \s -> if null (f s) then fail "No parse" else (return . fst . head ) (f s)
Which can be used, for example in GHCI, like this:
*Main Numeric> parse (liftReadsToParse (many1 hexDigit) readHex) "" "a1"
Right 161
Can anyone suggest any improvement to this approach with regard to:
Will the term (f s) be memoised, or evaluated twice in the case of a null (f s) returning False?
Handling multiple successful parses, i.e. when length (f s) is greater than one, I do not know how parsec deals with this.
Handling the remainder of the parse, i.e. (snd . head) (f s).
This is a nice idea. A more natural approach that would make
your ReadS parser fit in better with Parsec would be to
leave off the Parser String at the beginning of the type:
liftReadS :: ReadS a -> String -> Parser a
liftReadS reader = maybe (unexpected "no parse") (return . fst) .
listToMaybe . filter (null . snd) . reader
This "combinator" style is very idiomatic Haskell - once you
get used to it, it makes function definitions much easier
to read and understand.
You would then use liftReadS like this in the simple case:
> parse (many1 hexDigit >>= liftReadS readHex) "" "a1"
(Note that listToMaybe is in the Data.Maybe module.)
In more complex cases, liftReadS is easy to use inside any
Parsec do block.
Regarding some of your other questions:
The function reader is applied only once now, so there is nothing to "memoize".
It is common and accepted practice to ignore all except the first parse in a ReadS parser in most cases, so you're fine.
To answer the first part of your question, no (f s) will not be memoised, you would have to do that manually:
liftReadsToParse p f = p >>= \s -> let fs = f s in if null fs then fail "No parse"
else (return . fst . head ) fs
But I'd use pattern matching instead:
liftReadsToParse p f = p >>= \s -> case f s of
[] -> fail "No parse"
(answer, _) : _ -> return answer

Resources