Converting [IO String] to IO String in evaluation of parsed expression - parsing

In the language I'm writing at the moment, I'm trying to implement a function that evaluates an entire program, building on what I have written already, since I can currently only execute one statement at a time. The function lets me parse and evaluate the statements in a file.
The function evalString is the problem. It executes perfectly if its last line is, for example, runIOThrows $ liftM show $ evalStatement env (x!!0). The natural next step seemed to be map, but that just gives me [IO String] rather than IO String.
If I change the return type of the function to [IO String], however, the readStatement and evalAndPrint functions produce errors:
----- readStatement -----
Couldn't match type ‘IO’ with ‘[]’
Expected type: [[HStatement]]
Actual type: IO [HStatement]
----- evalAndPrint -----
Couldn't match type ‘[]’ with ‘IO’
Expected type: IO ()
Actual type: [()]
Couldn't match type ‘IO’ with ‘[]’
Expected type: IO String -> [()]
Actual type: String -> IO ()
I get the impression that there's a much easier way to achieve the desired effect in using map. If I executed each statement sequentially then everything works perfectly so perhaps I could use map to evaluate n-1 statements then execute the nth one manually?
parseProgram :: Parser [HStatement]
parseProgram = spaces *> many (parseEvalHVal <* spaces)
readStatement :: String -> IO [HStatement]
readStatement input = do
  program <- readFile input
  case parse parseProgram "fyp" program of
    Left err -> fail $ show err
    Right parsed -> return $ parsed
evalAndPrint :: Env -> String -> IO ()
evalAndPrint env expr = evalString env expr >>= putStrLn
evalString :: Env -> String -> IO String
evalString env expr = do
  x <- readStatement expr
  putStrLn $ show x
  map (\exprs -> runIOThrows $ liftM show $ evalStatement env exprs) x
run :: String -> IO ()
run expr = nullEnv >>= flip evalAndPrint expr
main :: IO ()
main = do
  args <- getArgs
  run $ args !! 0
runIOThrows :: IOThrowsError String -> IO String
runIOThrows action = runExceptT (trapError action) >>= return . extractValue

You can use mapM to run each IO action in turn and collect the results as a list of strings:
evalString :: Env -> String -> IO [String]
evalString env expr = do
  x <- readStatement expr
  putStrLn (show x)
  mapM (runIOThrows . liftM show . evalStatement env) x
This of course gives us a list of strings. If you want to post-process that list, for example concatenating the strings, you can fmap it:
evalString :: Env -> String -> IO String
evalString env expr = do
  x <- readStatement expr
  putStrLn (show x)
  concat <$> mapM (runIOThrows . liftM show . evalStatement env) x
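The difference between map and mapM can be seen in isolation with a small sketch. Here evalOne is a hypothetical stand-in for "evaluate one statement" and is not part of the code above; only base is assumed.

```haskell
import Data.Char (toUpper)

-- Hypothetical stand-in for evaluating one statement:
-- an IO action that yields a String.
evalOne :: String -> IO String
evalOne s = do
  putStrLn ("evaluating " ++ s)
  return (map toUpper s)

-- map only builds a list of actions; none of them has run yet.
actions :: [String] -> [IO String]
actions = map evalOne

-- sequence runs the actions in order and collects their results.
viaSequence :: [String] -> IO [String]
viaSequence = sequence . actions

-- mapM f behaves like sequence . map f.
viaMapM :: [String] -> IO [String]
viaMapM = mapM evalOne

-- fmap (written <$>) post-processes the collected results inside IO.
joined :: [String] -> IO String
joined xs = unlines <$> mapM evalOne xs
```

If the results are not needed at all, mapM_ runs the same actions and returns ().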

Related

Making Let binding parsing in Haskell

I want to create parser-like behaviour in Haskell such that I can assign an expression to a variable based on a string. I have difficulties doing so.
If I have types with the following definitions:
data Expr =
    Numb Int
  | Add Expr Expr
  | Let {var :: PVariable, definition, body :: Expr}
type PVariable = String
And I want to create a function 'eval' that can handle different operations such as Add, Subtract, Multiply, etc., but also the Let binding, such that 'eval' has the following definition:
eval :: Expr -> Int
eval (Numb expr) = expr
eval (Add expr1 expr2) = eval(expr1) + eval(expr2)
...
eval (Let v expr1 body) = ...
How could I then write eval so that it binds expr1 to the name v, which can then be used in the body? For instance, the parser-like behaviour should handle something like:
Let {var = "Var1", definition = expr1, body = (Add (Var "Var1") (Var "Var1"))}
where expr1 would be a chosen expression such that the above could be expressed as
let Var1 = expr1 in expr1+expr1
That could then have different Expr assigned to expr1 such as (Numb 2), so that we would get something similar to the following in Haskell:
let Var1 = 2 in Var1 + Var1
So far I have tried to deal with isolating fields of the record 'Let' so that I can evaluate each of these considering that I want to stay with the function type declarations. But I don't think that this is the easiest way, and it would probably require that I create a whole function to extract these, as far as I can see from : How to generically extract field names and values in Haskell records
Is there a smarter way to go about it?
You'll need eval to take an extra argument that holds the variable bindings and to pass it to subexpressions recursively. You also need a special case to evaluate Var expressions:
module Main where
import qualified Data.Map as M
data Expr =
    Numb Int
  | Add Expr Expr
  | Let {var :: PVariable, definition, body :: Expr}
  | Var PVariable
type PVariable = String
type Env = M.Map PVariable Int
eval :: Env -> Expr -> Int
eval _ (Numb a) = a
eval env (Add e1 e2) = (eval env e1) + (eval env e2)
eval env (Var v) = M.findWithDefault (error $ "undefined variable: " ++ v) v env
eval env (Let v expr body) = let
    val = eval env expr
    env' = M.insert v val env
  in eval env' body
main = print $ eval M.empty $ Let "a" (Numb 1) (Add (Var "a") (Numb 2))
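As a variation on the same idea, the sketch below reports an unbound variable as a Left value instead of calling error, and shows that an inner Let shadows an outer one. The names evalSafe and shadowing are made up for illustration, and Let is written positionally rather than as a record to keep the example short.

```haskell
import qualified Data.Map as M

type PVariable = String

-- Simplified copy of the expression type (Let is positional here).
data Expr
  = Numb Int
  | Add Expr Expr
  | Var PVariable
  | Let PVariable Expr Expr

type Env = M.Map PVariable Int

-- Like eval, but an undefined variable yields Left with a message.
evalSafe :: Env -> Expr -> Either String Int
evalSafe _   (Numb n)    = Right n
evalSafe env (Add a b)   = (+) <$> evalSafe env a <*> evalSafe env b
evalSafe env (Var v)     =
  maybe (Left ("undefined variable: " ++ v)) Right (M.lookup v env)
evalSafe env (Let v d b) = do
  val <- evalSafe env d              -- evaluate the definition first
  evalSafe (M.insert v val env) b    -- then the body in the extended env

-- let x = 1 in (let x = 10 in x) + x
-- The inner binding is only visible inside the inner body.
shadowing :: Expr
shadowing = Let "x" (Numb 1) (Add (Let "x" (Numb 10) (Var "x")) (Var "x"))
```

Because each Let extends a copy of the environment that is passed only to its own body, the outer binding is untouched once the inner body has been evaluated.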

Non-exhaustive pattern in function error despite the data handling being handled elsewhere

I'm trying to write a simple language and at the moment I'm trying to implement a loop but every time I run the program, I get an error that there's a non-exhaustive pattern in the evalStatement_ function.
----- Main -----
readStatement :: String -> IO [HStatement]
readStatement input = do
  program <- readFile input
  case parse parseProgram "Olivia" program of
    Left err -> fail $ show err
    Right parsed -> return $ parsed
evalString :: Env -> String -> IO String
evalString env expr = do
  x <- readStatement expr
  concat <$> mapM (runIOThrows . liftM show . evalStatement_ env) x
  --mapM (runIOThrows . liftM show . evalStatement env) x
  -- evalStatement env x
  --map (\exprs -> runIOThrows $ liftM show $ evalStatement env exprs) x
  --map (runIOThrows $ liftM show $ evalStatement env) x
  --runIOThrows $ liftM show $ (evalStatement env x) -- >>= runIOThrows $ liftM show $ evalStatement env
evalAndPrint :: Env -> String -> IO ()
evalAndPrint env expr = do
  evalString env expr
  return ()
run :: String -> IO ()
run expr = nullEnv >>= flip evalAndPrint expr
main :: IO ()
main = do
  args <- getArgs
  run $ args !! 0
----- Error -----
Main: Expr.hs:(82,1)-(85,34): Non-exhaustive patterns in function evalStatement_
-----------------
evalStatement_ :: Env -> HStatement -> IOThrowsError ()
evalStatement_ env (Do cond expr) = evalDo env (Do cond expr)
evalStatement_ env (Print val) = do
  x <- evalVal env val
  liftIO $ putStrLn $ show x
evalDo :: Env -> HStatement -> IOThrowsError ()
evalDo env (Do cond expr) = evalVal env cond >>= \x -> case x of
  HBool False -> return ()
  HBool True -> do
    traverse_ (evalVal env) expr
    evalStatement_ env $ Do cond expr
evalVal :: Env -> HVal -> IOThrowsError HVal
evalVal env val@(HInteger _) = return $ val
evalVal env val@(HBool _) = return $ val
evalVal env val@(HString _) = return $ val
evalVal env val@(HList _) = return $ val
evalVal env (Arith x op y) = evalArithmetic env x op y
evalVal env (Assign var val) = evalVal env val >>= defineVar env var
I isolated the error down to these functions. I use evalStatement_ to evaluate the Do and Print functions. I have tested print and it does work but I don't understand why evalDo doesn't. evalVal works as intended so I'm at a loss as to where the non-exhaustive problem resides. I ran the compile command with -Wall -Wextra and it gave me the following based on the Statement data type.
Expr.hs:82:1: warning: [-Wincomplete-patterns]
Pattern match(es) are non-exhaustive
In an equation for ‘evalStatement_’:
Patterns not matched:
_ (Eval _)
_ (Program _)
|
82 | evalStatement_ env (Do cond expr) = evalDo env (Do cond expr)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^...
data HStatement
  = Eval HVal
  | Print HVal
  | Do HVal [HVal]
  | Program [HVal]
  deriving (Eq, Read)
I'm not sure why Eval and Program would have any bearing on something which doesn't actually use them, or am I missing something completely?
Your main program appears to run evalStatement_ sequentially on the full list of statements parsed from your source file:
evalString :: Env -> String -> IO String
evalString env expr = do
  x <- readStatement expr
  concat <$> mapM (runIOThrows . liftM show . evalStatement_ env) x
             ^^^^ runs on every statement in the list `x`
If the statement list x :: [HStatement] contains any Program or Eval statements, then this will cause a run-time error, since evalStatement_ only handles the Do and Print constructors.
So, as @jpmarinier comments, if you expect to call evalStatement_ on any valid statement, then you need to handle all cases that the compiler warns you about when you turn on -Wall.
The following definitions for the missing cases might work for you to get things running:
evalStatement_ env (Program pgm) = mapM_ (evalVal env) pgm -- Program holds [HVal], so evaluate each value
evalStatement_ env (Eval val) = do
  _ <- evalVal env val
  return ()
Note that the case for Eval is pretty useless. Since evalStatement_ can't return anything other than (), we end up calculating a result and then throwing it away. It's going to be almost the same as just doing:
evalStatement_ _ (Eval _) = return () -- do nothing
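The underlying point, that a function must cover every constructor it can be handed at run time, can be sketched with a toy statement type. Stmt and run below are hypothetical stand-ins, not the HStatement and evalStatement_ from the question, and output is collected as a list of strings rather than printed so the example stays pure.

```haskell
{-# OPTIONS_GHC -Wincomplete-patterns #-}

-- Toy stand-in for HStatement, with plain Ints instead of HVal.
data Stmt
  = Eval Int
  | Print Int
  | Do Int [Int]      -- repeat count and a body of values to print
  | Program [Stmt]

-- One equation per constructor, so the match is exhaustive and
-- -Wincomplete-patterns reports nothing.
run :: Stmt -> [String]
run (Eval _)        = []                 -- value is computed and discarded
run (Print n)       = [show n]
run (Do times body) = concat (replicate times (map show body))
run (Program ss)    = concatMap run ss   -- recurse into nested statements
```

Deleting any one equation makes GHC list the missing constructor at compile time, which is the same information the run-time "Non-exhaustive patterns" error gives only after the program has already failed.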

Haskell : Operator Parser keeps going to undefined rather than inputs

I'm practicing writing parsers. I'm using Tsoding's JSON Parser video as a reference. I'm trying to extend it to parse arithmetic of arbitrary length, and I have come up with the following AST.
data HVal
  = HInteger Integer -- No Support For Floats
  | HBool Bool
  | HNull
  | HString String
  | HChar Char
  | HList [HVal]
  | HObj [(String, HVal)]
  deriving (Show, Eq, Read)
data Op -- There's only one operator for the sake of brevity at the moment.
  = Add
  deriving (Show, Read)
newtype Parser a = Parser {
    runParser :: String -> Maybe (String, a)
  }
The following functions are my attempt at implementing the operator parser.
ops :: [Char]
ops = ['+']
isOp :: Char -> Bool
isOp c = elem c ops
spanP :: (Char -> Bool) -> Parser String
spanP f = Parser $ \input -> let (token, rest) = span f input
                             in Just (rest, token)
opLiteral :: Parser String
opLiteral = spanP isOp
sOp :: String -> Op
sOp "+" = Add
sOp _ = undefined
parseOp :: Parser Op
parseOp = sOp <$> (charP '"' *> opLiteral <* charP '"')
The logic above is similar to how strings are parsed therefore my assumption was that the only difference was looking specifically for an operator rather than anything that's not a number between quotation marks. It does seemingly begin to parse correctly but it then gives me the following error:
λ > runParser parseOp "\"+\""
Just ("+\"",*** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:80:14 in base:GHC.Err
undefined, called at /DIRECTORY/parser.hs:110:11 in main:Main
I'm confused as to where the error is occurring. I assume it has to do with sOp, mainly because the other functions work as intended and the rest of parseOp is a direct translation of the parseString function:
stringLiteral :: Parser String
stringLiteral = spanP (/= '"')
parseString :: Parser HVal
parseString = HString <$> (charP '"' *> stringLiteral <* charP '"')
The only reason why I have sOp however is that if it was replaced with say Op, I would get the error that the following doesn't exist Op :: String -> Op. When I say this my inclination was that the string coming from the parsed expression would be passed into this function wherein I could return the appropriate operator. This however is incorrect and I'm not sure how to proceed.
charP and Applicative Instance
charP :: Char -> Parser Char
charP x = Parser $ f
  where f (y:ys)
          | y == x = Just (ys, x)
          | otherwise = Nothing
        f [] = Nothing
instance Applicative Parser where
  pure x = Parser $ \input -> Just (input, x)
  (Parser p) <*> (Parser q) = Parser $ \input -> do
    (input', f) <- p input
    (input', a) <- q input
    Just (input', f a)
The implementation of (<*>) is the culprit. You did not use input' in the next call to q, but used input instead. As a result you pass the string to the next parser without "eating" characters. You can fix this with:
instance Applicative Parser where
  pure x = Parser $ \input -> Just (input, x)
  (Parser p) <*> (Parser q) = Parser $ \input -> do
    (input', f) <- p input
    (input'', a) <- q input'
    Just (input'', f a)
With the updated instance for Applicative, we get:
*Main> runParser parseOp "\"+\""
Just ("",Add)
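For reference, here is a self-contained sketch of the corrected instance together with an operator parser that fails with Nothing on an unknown operator instead of hitting undefined. The Functor instance and the opP helper are assumptions added for illustration (the question does not show them), and Op derives Eq here only so results can be compared.

```haskell
newtype Parser a = Parser { runParser :: String -> Maybe (String, a) }

-- Assumed Functor instance; the question does not show its own.
instance Functor Parser where
  fmap f (Parser p) = Parser $ \input -> do
    (rest, a) <- p input
    Just (rest, f a)

instance Applicative Parser where
  pure x = Parser $ \input -> Just (input, x)
  Parser p <*> Parser q = Parser $ \input -> do
    (rest, f)  <- p input
    (rest', a) <- q rest     -- continue from what p left unconsumed
    Just (rest', f a)

charP :: Char -> Parser Char
charP x = Parser f
  where
    f (y:ys) | y == x = Just (ys, x)
    f _               = Nothing

data Op = Add deriving (Show, Eq)

-- Hypothetical helper: recognise exactly one known operator,
-- returning Nothing (a parse failure) for anything else.
opP :: Parser Op
opP = Parser $ \input -> case input of
  ('+':rest) -> Just (rest, Add)
  _          -> Nothing

-- An operator between double quotes, as in the question.
parseOp :: Parser Op
parseOp = charP '"' *> opP <* charP '"'
```

With these definitions, runParser parseOp "\"+\"" gives Just ("",Add), while an input such as "\"-\"" gives Nothing rather than an exception.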

Operating on parsed data with attoparsec

Background
I've written a logfile parser using attoparsec. All my smaller parsers succeed, as does the composed final parser. I've confirmed this with tests. But I'm stumbling over performing operations with the parsed stream.
What I've tried
I started by trying to pass the successfully parsed input to a function. But all it seems to get is Done (), which I presume means the logfile has been consumed by this point.
prepareStats :: Result Log -> IO ()
prepareStats r =
  case r of
    Fail _ _ _ -> putStrLn $ "Parsing failed"
    Done _ parsedLog -> putStrLn "Success" -- This now has a [LogEntry] array. Do something with it.
main :: IO ()
main = do
  [f] <- getArgs
  logFile <- B.readFile (f :: FilePath)
  let results = parseOnly parseLog logFile
  putStrLn "TBC"
What I'm trying to do
I want to accumulate some stats from the logfile as I consume the input. For example, I'm parsing response codes and I'd like to count how many 2** responses there were and how many 4/5** ones. I'm parsing the number of bytes each response returned as Ints, and I'd like to efficiently sum these (sounds like a foldl'?). I've defined a data type like this:
data Stats = Stats {
    successfulRequestsPerMinute :: Int
  , failingRequestsPerMinute :: Int
  , meanResponseTime :: Int
  , megabytesPerMinute :: Int
  } deriving Show
And I'd like to constantly update that as I parse the input. But the part of performing operations as I consume is where I got stuck. So far print is the only function I've successfully passed output to and it showed the parsing is succeeding by returning Done before printing the output.
My main parser(s) look like this:
parseLogEntry :: Parser LogEntry
parseLogEntry = do
  ip <- logItem
  _ <- char ' '
  logName <- logItem
  _ <- char ' '
  user <- logItem
  _ <- char ' '
  time <- datetimeLogItem
  _ <- char ' '
  firstLogLine <- quotedLogItem
  _ <- char ' '
  finalRequestStatus <- intLogItem
  _ <- char ' '
  responseSizeB <- intLogItem
  _ <- char ' '
  timeToResponse <- intLogItem
  return $ LogEntry ip logName user time firstLogLine finalRequestStatus responseSizeB timeToResponse
type Log = [LogEntry]
parseLog :: Parser Log
parseLog = many $ parseLogEntry <* endOfLine
Desired outcome
I want to pass each parsed line to a function that will update the above data type. Ideally I want this to be very memory efficient because it'll be operating on large files.
You have to make your unit of parsing a single log entry rather than a list of log entries.
It's not pretty, but here is an example of how to interleave parsing and processing:
(Depends on bytestring, attoparsec and mtl)
{-# LANGUAGE NoMonomorphismRestriction, FlexibleContexts #-}
import qualified Data.ByteString.Char8 as BS
import qualified Data.Attoparsec.ByteString.Char8 as A
import Data.Attoparsec.ByteString.Char8 hiding (takeWhile)
import Data.Char
import Control.Monad.State.Strict
aWord :: Parser BS.ByteString
aWord = skipSpace >> A.takeWhile isAlphaNum
getNext :: MonadState [a] m => m (Maybe a)
getNext = do
  xs <- get
  case xs of
    [] -> return Nothing
    (y:ys) -> put ys >> return (Just y)
loop iresult =
  case iresult of
    Fail _ _ msg -> error $ "parse failed: " ++ msg
    Done x' aword -> do lift $ process aword; loop (parse aWord x')
    Partial _ -> do
      mx <- getNext
      case mx of
        Just y -> loop (feed iresult y)
        Nothing -> case feed iresult BS.empty of
          Fail _ _ msg -> error $ "parse failed: " ++ msg
          Done x' aword -> do lift $ process aword; return ()
          Partial _ -> error $ "partial returned" -- probably can't happen
process :: Show a => a -> IO ()
process w = putStrLn $ "got a word: " ++ show w
theWords = map BS.pack [ "this is a te", "st of the emergency ", "broadcasting sys", "tem"]
main = runStateT (loop (Partial (parse aWord))) theWords
Notes:
We parse one aWord at a time and call process after each word is recognized.
Use feed to give the parser more input when it returns a Partial.
Feed the parser an empty string when there is no more input left.
When Done is returned, process the recognized word and continue with parse aWord.
getNext is just an example of a monadic function which gets the next unit of input. Replace it with your own version - i.e. something that reads the next line from a file.
Update
Here is a solution using parseWith as @dfeuer suggested:
import Data.Maybe (fromMaybe) -- needed for fromMaybe below
noMoreInput = fmap null get
loop2 x = do
  iresult <- parseWith (fmap (fromMaybe BS.empty) getNext) aWord x
  case iresult of
    Fail _ _ msg -> error $ "parse failed: " ++ msg
    Done x' aword -> do
      lift $ process aword
      if BS.null x'
        then do
          b <- noMoreInput
          if b then return () else loop2 x'
        else loop2 x'
    Partial _ -> error $ "huh???" -- this really can't happen
main2 = runStateT (loop2 BS.empty) theWords
If each log entry is exactly one line, here's a simpler solution:
do loglines <- fmap BS.lines $ BS.readFile "input-file.log"
   return $! foldl' go initialStats loglines -- foldl' is pure, so wrap its result for the do block
  where
    go stats logline =
      case parseOnly yourParser logline of
        Left e -> error $ "oops: " ++ e
        Right r -> let stats' = ... combine r with stats ...
                   in stats'
Basically you are just reading the file line-by-line and calling parseOnly on each line and accumulating the results.
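A runnable sketch of that line-by-line fold, using only base, might look like the following. The Stats fields, the "<status> <bytes>" line format and parseLine are all simplifications invented for illustration; in the real program parseLine would be replaced by parseOnly with the attoparsec entry parser.

```haskell
import Data.List (foldl')

-- Hypothetical, simplified stats record. The strict fields (!) keep
-- the accumulator from building up unevaluated thunks.
data Stats = Stats
  { okCount    :: !Int
  , errorCount :: !Int
  , totalBytes :: !Int
  } deriving (Show, Eq)

emptyStats :: Stats
emptyStats = Stats 0 0 0

-- Stand-in for parseOnly: expects lines of the form "<status> <bytes>".
parseLine :: String -> Either String (Int, Int)
parseLine l = case words l of
  [s, b] | [(status, "")] <- reads s
         , [(bytes, "")]  <- reads b -> Right (status, bytes)
  _ -> Left ("cannot parse: " ++ l)

-- Combine one line with the stats so far; malformed lines are skipped.
step :: Stats -> String -> Stats
step st l = case parseLine l of
  Left _ -> st
  Right (status, bytes)
    | status >= 200 && status < 300 ->
        st { okCount = okCount st + 1, totalBytes = totalBytes st + bytes }
    | status >= 400 ->
        st { errorCount = errorCount st + 1, totalBytes = totalBytes st + bytes }
    | otherwise ->
        st { totalBytes = totalBytes st + bytes }

-- Strict left fold over the lines of the input.
summarize :: String -> Stats
summarize = foldl' step emptyStats . lines
```

Skipping malformed lines here is a design choice made for the sketch; the snippet above calls error instead, which may be preferable when a bad line indicates a broken log.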
This is properly done with a streaming library
main = do
  f:_ <- getArgs
  withFile f ReadMode $ \h -> do
    result <- foldStream $ streamProcess $ streamHandle h
    print result
  where
    streamHandle = undefined
    streamProcess = undefined
    foldStream = undefined
where the blanks can be filled by any streaming library, e.g.
import qualified Pipes.Prelude as P
import Pipes
import qualified Pipes.ByteString as PB
import Pipes.Group (folds)
import qualified Control.Foldl as L
import Control.Lens (view) -- or import Lens.Simple (view), or whatever
streamHandle = PB.fromHandle :: Handle -> Producer ByteString IO ()
in that case we might then divide the labor further thus:
streamProcess :: Producer ByteString m r -> Producer LogEntry m r
streamProcess p = streamLines p >-> lineParser
streamLines :: Producer ByteString m r -> Producer ByteString m r
streamLines p = L.purely folds L.list (view PB.lines p) >-> P.map B.toStrict
lineParser :: Pipe ByteString LogEntry m r
lineParser = P.map (parseOnly line_parser) >-> P.concat -- concat removes lefts
(This is slightly laborious because pipes is sensibly persnickety about accumulating lines, and memory generally: we are just trying to get a producer of individual strict bytestring lines, and then to convert that into a producer of parsed lines, and then to throw out bad parses, if there are any. With io-streams or conduit, things will be basically the same, and that particular step will be easier.)
We are now in a position to fold over our Producer LogEntry IO (). This can be done explicitly using Pipes.Prelude.fold, which makes a strict left fold. Here we will just borrow the structure from user5402:
foldStream str = P.fold go initial_stats id str
  where
    go stats_till_now new_entry = undefined
If you get used to the use of the foldl library and the application of a fold to a Producer with L.purely fold some_fold, then you can build Control.Foldl.Folds for your LogEntries out of components and slot in different requests as you please.
If you use pipes-attoparsec and include the newline bit in your parser, then you can just write
handleToLogEntries :: Handle -> Producer LogEntry IO ()
handleToLogEntries h = void $ parsed my_line_parser (fromHandle h) >-> P.concat
and get the Producer LogEntry IO () more directly. (This ultra-simple way of writing it will, however, stop at a bad parse; dividing on lines first will be faster than using attoparsec to recognize newlines.) This is very simple with io-streams too, you would write something like
import qualified System.IO.Streams as Streams
io :: Handle -> IO ()
io h = do
  bytes <- Streams.handleToInputStream h
  log_entries <- Streams.parserToInputStream my_line_parser bytes
  fold_result <- Streams.fold go initial_stats log_entries
  print fold_result
or to keep with the structure above:
  where
    streamHandle = Streams.handleToInputStream
    streamProcess io_bytes =
      io_bytes >>= Streams.parserToInputStream my_line_parser
    foldStream io_logentries =
      io_logentries >>= Streams.fold go initial_stats
Either way, my_line_parser should return a Maybe LogEntry and should recognize the newline.

How to Get Parsec to Parse Multiple Expressions

I'm following this scheme interpreter tutorial: http://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours/
but can't seem to figure out how to setup the REPL or Parsec so I can have the functionality to interpret a whole source file. What I'd like to do is to be able to enter something like this from the REPL:
:l ~/myscheme.scm
And the file would be interpreted. Right now, all it does is parse one expression and it ignores the rest. I can see why this is so -- readExpr reads only 1 expression.
Parser excerpt, whole code can be found here: http://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours/Parsing
parseExpr :: Parser LispVal
parseExpr = parseAtom
        <|> parseString
        <|> parseNumber
        <|> parseQuoted
        <|> do char '('
               x <- try parseList <|> parseDottedList
               char ')'
               return x
readExpr :: String -> String
readExpr input = case parse parseExpr "lisp" input of
  Left err -> "No match: " ++ show err
  Right _ -> "Found value"
REPL:
import System.IO
flushStr :: String -> IO ()
flushStr str = putStr str >> hFlush stdout
readPrompt :: String -> IO String
readPrompt prompt = flushStr prompt >> getLine
evalString :: String -> IO String
evalString expr = return $ extractValue $ trapError (liftM show $ readExpr expr >>= eval)
evalAndPrint :: String -> IO ()
evalAndPrint expr = evalString expr >>= putStrLn
until_ :: Monad m => (a -> Bool) -> m a -> (a -> m ()) -> m ()
until_ pred prompt action = do
  result <- prompt
  if pred result
    then return ()
    else action result >> until_ pred prompt action
runRepl :: IO ()
runRepl = until_ (== "quit") (readPrompt "Lisp>>> ") evalAndPrint
main :: IO ()
main = do args <- getArgs
          case length args of
               0 -> runRepl
               1 -> evalAndPrint $ args !! 0
               otherwise -> putStrLn "Program takes only 0 or 1 argument"
Would appreciate any help!
How about parse (many parseExpr) instead of parse parseExpr?
You then will have to amend the interpreter so that it can interpret a list of expressions.
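A minimal sketch of that suggestion, assuming the parsec package is available: Val and parseVal below are toy stand-ins for the tutorial's LispVal and parseExpr, and parseProgram and readProgram are hypothetical names. The parser skips whitespace between expressions and requires end of input, so trailing garbage is reported instead of silently ignored.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Toy stand-in for LispVal.
data Val = Atom String | Number Integer | List [Val]
  deriving (Show, Eq)

-- Toy stand-in for parseExpr: atoms, numbers and parenthesised lists.
parseVal :: Parser Val
parseVal =
      Atom <$> many1 letter
  <|> Number . read <$> many1 digit
  <|> List <$> between (char '(' *> spaces) (char ')')
                       (many (parseVal <* spaces))

-- Zero or more expressions separated by whitespace, then end of input.
parseProgram :: Parser [Val]
parseProgram = spaces *> many (parseVal <* spaces) <* eof

-- Parse the contents of a whole source file into a list of expressions.
readProgram :: String -> Either ParseError [Val]
readProgram = parse parseProgram "lisp"
```

The interpreter side can then evaluate the resulting list in order, for example with mapM over the monadic eval, and show only the last result or all of them.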

Resources