Check every character of a string with guards [Haskell] - parsing

I have a problem with Haskell. I want to read a string of digits and print various messages depending on the characters I see.
import System.Environment
import System.Exit
import Data.List
import Control.Monad
test_parse:: [Char] -> IO ()
test_parse [] = putStrLn "\n"
test_parse (a:b:c:xs)
    | a == '1' && b == '2' && c == '3' = putStrLn ("True")
    | a == '2' && b == '3' && c == '4' = putStrLn ("False")
    | a == '4' && b == '5' && c == '6' = putStrLn ("maybe")
    | otherwise = test_parse (b:c:xs)
main = do
    let numbers = "123456"
    let loop = do
            goGlenn <- getLine
            test_parse numbers
            putStrLn goGlenn
            when (goGlenn /= "start") loop
    loop
    putStrLn "ok"
The problem is this: I would like to print "True\nFalse\nmaybe\n", but I only print "True\n". I know the cause is that once a guard fires and performs its action, the function returns. But I don't see how to check the entire string without leaving test_parse.
If anyone has an idea, thanks.

You want to check every suffix, regardless of the result on the prefix. One example:
-- output a string based on the first 3 characters of the input
classify :: String -> IO ()
classify xs = case take 3 xs of
    "123" -> putStrLn "True"
    "234" -> putStrLn "False"
    "456" -> putStrLn "maybe"
    _     -> return ()
-- call classify repeatedly on different suffixes of the input
test_parse :: String -> IO ()
test_parse [] = return ()
test_parse all@(_:xs) = do
    classify all
    test_parse xs

To illustrate chepner's point about making classify return a list of Strings rather than doing I/O:
import Data.List (tails)
classify :: String -> [String]
classify s = case take 3 s of
    "123" -> return "True"
    "234" -> return "False"
    "456" -> return "maybe"
    _     -> mempty
classifyAll :: String -> [String]
classifyAll s = tails s >>= classify
main :: IO ()
main = interact (unlines . classifyAll)
Running this,
$ stack ghc classify.hs
$ echo 123456 | ./classify
True
False
maybe
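The bind in classifyAll is the list monad's concatMap: every suffix produced by tails contributes zero or one label. A roughly equivalent sketch, written as a list comprehension, is shown here. The lookup table `labels` and the name `classifyAll'` are hypothetical and only stand in for the case expression above.

```haskell
import Data.List (isPrefixOf, tails)

-- Hypothetical lookup table; the entries mirror the case branches above.
labels :: [(String, String)]
labels = [("123", "True"), ("234", "False"), ("456", "maybe")]

-- For every suffix of the input, emit the label of each pattern
-- that the suffix starts with.
classifyAll' :: String -> [String]
classifyAll' s =
  [ label
  | suffix <- tails s
  , (prefix, label) <- labels
  , prefix `isPrefixOf` suffix
  ]
```

Keeping the patterns in a table means a new pattern is one more list entry rather than one more case branch.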
I can't choose whether I want a \n after each printed value or not.
Basically it's working, but can we drop the \n if we want?
If you want to separate by something other than linebreak,
import Data.List (intercalate, tails)
...
main :: IO ()
main = interact ((++ "\n") . intercalate ", " . classifyAll)
Running this,
$ echo 123456 | ./classify
True, False, maybe
Splitting the code that parses and prints also makes your code more easily testable. E.g.
import Test.Hspec
spec_classify :: Spec
spec_classify =
    describe "classify" $
        it "classifies 123, 234 and 456" $
            classifyAll "123456" `shouldBe` ["True", "False", "maybe"]
Testing this,
$ stack ghci classify.hs
> hspec spec_classify
classify
classifies 123, 234 and 456
Finished in 0.0005 seconds
1 example, 0 failures

Related

Parser written in Haskell not working as intended

I was playing around with Haskell's parsec library. I was trying to parse a hexadecimal string of the form "#x[0-9A-Fa-f]*" into an integer. This is the code I thought would work:
module Main where
import Control.Monad
import Numeric
import System.Environment
import Text.ParserCombinators.Parsec hiding (spaces)
parseHex :: Parser Integer
parseHex = do
    string "#x"
    x <- many1 hexDigit
    return (fst (head (readHex x)))
testHex :: String -> String
testHex input = case parse parseHex "lisp" input of
    Left err -> "Does not match " ++ show err
    Right val -> "Matched" ++ show val
main :: IO ()
main = do
    args <- getArgs
    putStrLn (testHex (head args))
And then I tried testing the testHex function in Haskell's repl:
GHCi, version 8.6.5: http://www.haskell.org/ghc/ :? for help
[1 of 1] Compiling Main ( src/Main.hs, interpreted )
Ok, one module loaded.
*Main> testHex "#xcafebeef"
"Matched3405692655"
*Main> testHex "#xnothx"
"Does not match \"lisp\" (line 1, column 3):\nunexpected \"n\"\nexpecting hexadecimal digit"
*Main> testHex "#xcafexbeef"
"Matched51966"
The first and second attempts work as intended. But in the third one, the parser matches up to the invalid character. I do not want that; the parse should fail if any character in the string is not a valid hex digit. Why is this happening, and how do I fix it?
Thank you!
You need to place eof at the end.
parseHex :: Parser Integer
parseHex = do
    string "#x"
    x <- many1 hexDigit
    eof
    return (fst (head (readHex x)))
Alternatively, you can compose it with eof where you use it if you want to reuse parseHex in other places.
testHex :: String -> String
testHex input = case parse (parseHex <* eof) "lisp" input of
    Left err -> "Does not match " ++ show err
    Right val -> "Matched" ++ show val
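To illustrate why keeping eof out of the literal parser helps with reuse, here is a minimal sketch. The names hexLiteral, hexList and parseAll are hypothetical, and the sketch uses the Text.Parsec module names rather than the older Text.ParserCombinators.Parsec ones from the question.

```haskell
import Numeric (readHex)
import Text.Parsec (char, eof, hexDigit, many1, parse, parserFail, sepBy1, string)
import Text.Parsec.String (Parser)

-- Same shape as parseHex, but without eof so it can be embedded elsewhere.
hexLiteral :: Parser Integer
hexLiteral = do
  _ <- string "#x"
  digits <- many1 hexDigit
  case readHex digits of
    [(n, _)] -> return n
    _        -> parserFail "invalid hexadecimal literal"

-- Hypothetical reuse: one or more literals separated by single spaces.
hexList :: Parser [Integer]
hexList = hexLiteral `sepBy1` char ' '

-- Hypothetical helper: run any parser and require that it consumes all input.
parseAll :: Parser a -> String -> Either String a
parseAll p input = either (Left . show) Right (parse (p <* eof) "input" input)
```

With this split, parseAll hexLiteral "#xcafexbeef" is a Left, because eof fails at the stray x, while hexList can still stop after each literal and look for a separator.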

Converting [IO String] to IO String in evaluation of parsed expression

In the language I'm writing at the moment, I'm trying to implement a function that evaluates an entire program, building on what I have written already, since so far I can only execute one statement at a time. The function should let me parse and evaluate the statements in a file.
The function evalString is the problem. It executes perfectly if its last line is runIOThrows $ liftM show $ evalStatement env (x!!0), for example. I felt the natural next step was to use map, but that just gives me [IO String] rather than IO String.
If I change the return type of the function to [IO String], however, I get errors in the readStatement and evalAndPrint functions:
----- readStatement -----
Couldn't match type ‘IO’ with ‘[]’
Expected type: [[HStatement]]
Actual type: IO [HStatement]
----- evalAndPrint -----
Couldn't match type ‘[]’ with ‘IO’
Expected type: IO ()
Actual type: [()]
Couldn't match type ‘IO’ with ‘[]’
Expected type: IO String -> [()]
Actual type: String -> IO ()
I get the impression that there's a much easier way to achieve the desired effect in using map. If I executed each statement sequentially then everything works perfectly so perhaps I could use map to evaluate n-1 statements then execute the nth one manually?
parseProgram :: Parser [HStatement]
parseProgram = spaces *> many (parseEvalHVal <* spaces)
readStatement :: String -> IO [HStatement]
readStatement input = do
    program <- readFile input
    case parse parseProgram "fyp" program of
        Left err -> fail $ show err
        Right parsed -> return $ parsed
evalAndPrint :: Env -> String -> IO ()
evalAndPrint env expr = evalString env expr >>= putStrLn
evalString :: Env -> String -> IO String
evalString env expr = do
    x <- readStatement expr
    putStrLn $ show x
    map (\exprs -> runIOThrows $ liftM show $ evalStatement env exprs) x
run :: String -> IO ()
run expr = nullEnv >>= flip evalAndPrint expr
main :: IO ()
main = do
    args <- getArgs
    run $ args !! 0
runIOThrows :: IOThrowsError String -> IO String
runIOThrows action = runExceptT (trapError action) >>= return . extractValue
You can use mapM to run the IO actions in order and collect their results as a list of strings:
evalString :: Env -> String -> IO [String]
evalString env expr = do
    x <- readStatement expr
    putStrLn (show x)
    mapM (runIOThrows . liftM show . evalStatement env) x
This of course gives us a list of strings. If you want to post-process that list, for example by concatenating the strings, you can fmap over it:
evalString :: Env -> String -> IO String
evalString env expr = do
    x <- readStatement expr
    putStrLn (show x)
    concat <$> mapM (runIOThrows . liftM show . evalStatement env) x
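The difference between map and mapM can be seen in isolation with a small base-only sketch. Here evalOne is a hypothetical stand-in for runIOThrows . liftM show . evalStatement env, and the other names are likewise only for illustration.

```haskell
-- Hypothetical stand-in for evaluating one statement and rendering its result.
evalOne :: Int -> IO String
evalOne n = return ("result " ++ show n)

-- map alone only builds a list of actions; none of them has run yet.
pending :: [IO String]
pending = map evalOne [1, 2, 3]

-- sequence runs the actions left to right and collects the results;
-- mapM f behaves like sequence . map f.
runAll :: IO [String]
runAll = sequence pending

-- One way to get a single String back: join the results with newlines.
runAllJoined :: IO String
runAllJoined = unlines <$> mapM evalOne [1, 2, 3]
```

If the individual results are not needed, mapM_ runs the same actions and discards what they return.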

Operating on parsed data with attoparsec

Background
I've written a logfile parser using attoparsec. All my smaller parsers succeed, as does the composed final parser. I've confirmed this with tests. But I'm stumbling over performing operations with the parsed stream.
What I've tried
I started by trying to pass the successfully parsed input to a function. But all it seems to get is Done (), which I presume means the logfile has already been consumed by this point.
prepareStats :: Result Log -> IO ()
prepareStats r =
    case r of
        Fail _ _ _ -> putStrLn $ "Parsing failed"
        Done _ parsedLog -> putStrLn "Success" -- This now has a [LogEntry] array. Do something with it.
main :: IO ()
main = do
    [f] <- getArgs
    logFile <- B.readFile (f :: FilePath)
    let results = parseOnly parseLog logFile
    putStrLn "TBC"
What I'm trying to do
I want to accumulate some stats from the logfile as I consume the input. For example, I'm parsing response codes and I'd like to count how many 2** responses there were and how many 4/5** ones. I'm parsing the number of bytes each response returned as Ints, and I'd like to efficiently sum these (sounds like a foldl'?). I've defined a data type like this:
data Stats = Stats {
      successfulRequestsPerMinute :: Int
    , failingRequestsPerMinute :: Int
    , meanResponseTime :: Int
    , megabytesPerMinute :: Int
    } deriving Show
And I'd like to constantly update that as I parse the input. But the part of performing operations as I consume is where I got stuck. So far print is the only function I've successfully passed output to and it showed the parsing is succeeding by returning Done before printing the output.
My main parser(s) look like this:
parseLogEntry :: Parser LogEntry
parseLogEntry = do
    ip <- logItem
    _ <- char ' '
    logName <- logItem
    _ <- char ' '
    user <- logItem
    _ <- char ' '
    time <- datetimeLogItem
    _ <- char ' '
    firstLogLine <- quotedLogItem
    _ <- char ' '
    finalRequestStatus <- intLogItem
    _ <- char ' '
    responseSizeB <- intLogItem
    _ <- char ' '
    timeToResponse <- intLogItem
    return $ LogEntry ip logName user time firstLogLine finalRequestStatus responseSizeB timeToResponse
type Log = [LogEntry]
parseLog :: Parser Log
parseLog = many $ parseLogEntry <* endOfLine
Desired outcome
I want to pass each parsed line to a function that will update the above data type. Ideally I want this to be very memory efficient because it'll be operating on large files.
You have to make your unit of parsing a single log entry rather than a list of log entries.
It's not pretty, but here is an example of how to interleave parsing and processing:
(Depends on bytestring, attoparsec and mtl)
{-# LANGUAGE NoMonomorphismRestriction, FlexibleContexts #-}
import qualified Data.ByteString.Char8 as BS
import qualified Data.Attoparsec.ByteString.Char8 as A
import Data.Attoparsec.ByteString.Char8 hiding (takeWhile)
import Data.Char
import Control.Monad.State.Strict
aWord :: Parser BS.ByteString
aWord = skipSpace >> A.takeWhile isAlphaNum
getNext :: MonadState [a] m => m (Maybe a)
getNext = do
    xs <- get
    case xs of
        [] -> return Nothing
        (y:ys) -> put ys >> return (Just y)
loop iresult =
    case iresult of
        Fail _ _ msg -> error $ "parse failed: " ++ msg
        Done x' aword -> do lift $ process aword; loop (parse aWord x')
        Partial _ -> do
            mx <- getNext
            case mx of
                Just y -> loop (feed iresult y)
                Nothing -> case feed iresult BS.empty of
                    Fail _ _ msg -> error $ "parse failed: " ++ msg
                    Done x' aword -> do lift $ process aword; return ()
                    Partial _ -> error $ "partial returned" -- probably can't happen
process :: Show a => a -> IO ()
process w = putStrLn $ "got a word: " ++ show w
theWords = map BS.pack [ "this is a te", "st of the emergency ", "broadcasting sys", "tem"]
main = runStateT (loop (Partial (parse aWord))) theWords
Notes:
We parse one aWord at a time and call process after each word is recognized.
Use feed to give the parser more input when it returns a Partial.
Feed the parser an empty string when there is no more input left.
When Done is returned, process the recognized word and continue with parse aWord.
getNext is just an example of a monadic function which gets the next unit of input. Replace it with your own version, e.g. something that reads the next line from a file.
Update
Here is a solution using parseWith as @dfeuer suggested:
-- also needs: import Data.Maybe (fromMaybe)
noMoreInput = fmap null get
loop2 x = do
    iresult <- parseWith (fmap (fromMaybe BS.empty) getNext) aWord x
    case iresult of
        Fail _ _ msg -> error $ "parse failed: " ++ msg
        Done x' aword -> do
            lift $ process aword
            if BS.null x'
                then do
                    b <- noMoreInput
                    if b then return ()
                         else loop2 x'
                else loop2 x'
        Partial _ -> error $ "huh???" -- this really can't happen
main2 = runStateT (loop2 BS.empty) theWords
If each log entry is exactly one line, here's a simpler solution:
do loglines <- fmap BS.lines $ BS.readFile "input-file.log"
   return $! foldl' go initialStats loglines
  where
    go stats logline =
      case parseOnly yourParser logline of
        Left e -> error $ "oops: " ++ e
        Right r -> let stats' = ... combine r with stats ...
                   in stats'
Basically you are just reading the file line-by-line and calling parseOnly on each line and accumulating the results.
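As a sketch of what the go step and its accumulator could look like for the response-code counts and byte totals the question asks about: the Entry and Totals types below are hypothetical, simplified stand-ins for LogEntry and Stats, and the strict fields are an assumption made so that foldl' does not leave unevaluated sums inside the record.

```haskell
import Data.List (foldl')

-- Hypothetical, simplified stand-in for a parsed log entry.
data Entry = Entry { status :: Int, bytes :: Int }

-- Strict fields, so each step fully evaluates the running totals.
data Totals = Totals
  { okCount   :: !Int  -- 2xx responses
  , failCount :: !Int  -- 4xx and 5xx responses
  , byteSum   :: !Int  -- bytes over all responses
  } deriving (Eq, Show)

-- Fold step: update the totals with one entry.
step :: Totals -> Entry -> Totals
step (Totals ok bad total) (Entry s b)
  | s >= 200 && s < 300 = Totals (ok + 1) bad (total + b)
  | s >= 400 && s < 600 = Totals ok (bad + 1) (total + b)
  | otherwise           = Totals ok bad (total + b)

summarise :: [Entry] -> Totals
summarise = foldl' step (Totals 0 0 0)
```

For example, summarise [Entry 200 10, Entry 404 5, Entry 500 1, Entry 301 4] gives one success, two failures and 20 bytes in total.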
This is properly done with a streaming library:
main = do
    f:_ <- getArgs
    withFile f ReadMode $ \h -> do
        result <- foldStream $ streamProcess $ streamHandle h
        print result
  where
    streamHandle = undefined
    streamProcess = undefined
    foldStream = undefined
where the blanks can be filled by any streaming library, e.g.
import qualified Pipes.Prelude as P
import Pipes
import qualified Pipes.ByteString as PB
import Pipes.Group (folds)
import qualified Control.Foldl as L
import Control.Lens (view) -- or import Lens.Simple (view), or whatever
streamHandle = PB.fromHandle :: Handle -> Producer ByteString IO ()
in that case we might then divide the labor further thus:
streamProcess :: Producer ByteString m r -> Producer LogEntry m r
streamProcess p = streamLines p >-> lineParser
streamLines :: Producer ByteString m r -> Producer ByteString m r
streamLines p = L.purely fold L.list (view (Pipes.ByteString.lines p)) >-> P.map B.toStrict
lineParser :: Pipe ByteString LogEntry m r
lineParser = P.map (parseOnly line_parser) >-> P.concat -- concat removes lefts
(This is slightly laborious because pipes is sensibly persnickety about accumulating lines, and memory generally: we are just trying to get a producer of individual strict bytestring lines, then to convert that into a producer of parsed lines, and then to throw out bad parses, if there are any. With io-streams or conduit, things will be basically the same, and that particular step will be easier.)
We are now in a position to fold over our Producer LogEntry IO (). This can be done explicitly using Pipes.Prelude.fold, which makes a strict left fold. Here we will just copy the structure from user5402:
foldStream str = P.fold go initial_stats id str
  where
    go stats_till_now new_entry = undefined
If you get used to the use of the foldl library and the application of a fold to a Producer with L.purely fold some_fold, then you can build Control.Foldl.Folds for your LogEntries out of components and slot in different requests as you please.
If you use pipes-attoparsec and include the newline bit in your parser, then you can just write
handleToLogEntries :: Handle -> Producer LogEntry IO ()
handleToLogEntries h = void $ parsed my_line_parser (fromHandle h) >-> P.concat
and get the Producer LogEntry IO () more directly. (This ultra-simple way of writing it will, however, stop at a bad parse; dividing on lines first will be faster than using attoparsec to recognize newlines.) This is very simple with io-streams too, you would write something like
import qualified System.IO.Streams as Streams
io :: Handle -> IO ()
io h = do
    bytes <- Streams.handleToInputStream h
    log_entries <- Streams.parserToInputStream my_line_parser bytes
    fold_result <- Streams.fold go initial_stats log_entries
    print fold_result
or to keep with the structure above:
  where
    streamHandle = Streams.handleToInputStream
    streamProcess io_bytes =
        io_bytes >>= Streams.parserToInputStream my_line_parser
    foldStream io_logentries =
        io_logentries >>= Streams.fold go initial_stats
Either way, my_line_parser should return a Maybe LogEntry and should recognize the newline.

Checking that a price value in a string is in the correct format

I use n <- getLine to read a price from the user. How can I check that the value is valid? (A price may contain only digits and '.', and must be greater than 0.)
It doesn't work:
isFloat = do
    n <- getLine
    let val = case reads n of
            ((v,_):_) -> True
            _ -> False
If The Input Is Always Valid Or Exceptions Are OK
If you have users entering decimal numbers in the form of "123.456" then this can simply be converted to a Float or Double using read:
n <- getLine
let val = read n
Or in one line (having imported Control.Monad):
n <- liftM read getLine
To Catch Erroneous Input
The above code fails with an exception if the users enter invalid entries. If that's a problem then use reads and listToMaybe (from Data.Maybe):
n <- liftM (fmap fst . listToMaybe . reads) getLine
If that code looks complex then don't sweat it - the below is the same operation but doing all the work with explicit case statements:
n <- getLine
let val = case reads n of
          ((v,_):_) -> Just v
          _ -> Nothing
Notice that we pattern match to get the first element of the tuple at the head of the list: the head of the list is (v,_) and its first element is v. The underscore (_) just means "ignore the value in this spot".
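One caveat with the reads pattern: it succeeds as soon as a prefix of the input parses, so "12abc" yields 12.0. A minimal sketch of a stricter check follows, assuming Text.Read.readMaybe from base (which requires the whole string to parse). The name parsePositive is hypothetical, and the character filter reflects the question's "digits and '.' only, greater than 0" rule.

```haskell
import Data.Char (isDigit)
import Text.Read (readMaybe)

-- Hypothetical helper: accept a price only if it consists of digits and '.',
-- parses completely as a Double, and is greater than 0.
parsePositive :: String -> Maybe Double
parsePositive s
  | all allowed s = readMaybe s >>= positive
  | otherwise     = Nothing
  where
    allowed c  = isDigit c || c == '.'
    positive v = if v > 0 then Just v else Nothing
```

The character filter also rules out forms such as "1e3" or "-3" that read would otherwise accept for a Double.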
If Floating Point Isn't Acceptable
Floating values are well known to be approximate, and not suitable for real world financial computations (but perhaps homework, depending on your professor). In this case you'd want to read the values into a Rational (from Data.Ratio).
n <- liftM maybeRational getLine
...
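  where
    maybeRational :: String -> Maybe Rational
    maybeRational str =
        let (a, b) = break (== '.') str
            frac   = drop 1 b
            -- "12.50" becomes 12 + 50 % 100; assumes a non-negative number
        in  liftM2 (\w f -> fromInteger w + f % (10 ^ length frac))
                   (readMaybe a)
                   (if null frac then Just 0 else readMaybe frac)
    readMaybe :: String -> Maybe Integer
    readMaybe = fmap fst . listToMaybe . reads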
In addition to the parsing advice provided by TomMD, consider using the appropriate monad for error reporting. It allows you to conveniently chain computations which can fail, avoiding explicit error checking on every step.
{-# LANGUAGE FlexibleContexts #-}
import Control.Monad (when)
import Control.Monad.Except -- from mtl; old mtl releases exposed this as Control.Monad.Error
parsePrice :: MonadError String m => String -> m Double
parsePrice s = do
    x <- case reads s of
        [(x, "")] -> return x
        _ -> throwError "Not a valid real number."
    when (x <= 0) $ throwError "Price must be positive."
    return x
main = do
    n <- getLine
    case parsePrice n of
        Left err -> putStrLn err
        Right x -> putStrLn $ "Price is " ++ show x

Checking if a string holds an Integer

Why doesn't this code work?
import IO
import Char
isInteger "" = False
isInteger (a:b) =
    if length b == 0 && isDigit(a) == True then True
    else if isDigit(a) == True then isInteger(b)
    else False
main = do
    q <- getLine
    let x = read q
    if isInteger x == False then putStrLn "not integer"
    else putStrLn "integer"
This will work:
main = do
    q <- getLine -- q is already String - we don't need to parse it
    if isInteger q == False then putStrLn "not integer"
        else putStrLn "integer"
The reason your code fails at runtime with "Prelude.read: no parse" is that getLine :: IO String and isInteger :: String -> Bool, so the expression let x = read q tries to parse a String into a String. Try it yourself:
Prelude> read "42" :: String
"*** Exception: Prelude.read: no parse
PS: It's not that you can't read a String (although it rarely makes sense to do so). You can, but the input has to look like a Haskell value: a String is just a list of Char, so the text passed to read must be written either as a quoted string literal or as a list of characters:
Prelude> read "['4','2']" :: String
"42"
It helps us if you give us the error message:
/home/dave/tmp/so.hs:14:4:
parse error (possibly incorrect indentation)
Failed, modules loaded: none.
Line 14 is else putStrLn "integer"
The hint that this is to do with indentation is correct. When you use if-then-else with do-notation, you need to ensure that multiline expressions --- and if-then-else is a single expression --- have extra indentation after the first line.
(You do not use do-notation in your isInteger function, which is why the same indentation of if-then-else does not cause problems there.)
So this has no compile errors:
main = do
    q <- getLine
    let x = read q
    if isInteger x == False then putStrLn "not integer"
      else putStrLn "integer"
Neither does this:
main = do
    q <- getLine
    let x = read q
    if isInteger x == False
      then putStrLn "not integer"
      else putStrLn "integer"
You then still have the issue Ed'ka points out. But at least it compiles.
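Putting both answers together, a compact sketch of the check could look like this. It keeps the question's rule (one or more decimal digits, no sign), uses Data.Char from the standard hierarchical libraries instead of the old Char module, and the helper name describe is hypothetical.

```haskell
import Data.Char (isDigit)

-- True for a non-empty string made only of decimal digits.
isInteger :: String -> Bool
isInteger s = not (null s) && all isDigit s

-- Hypothetical helper that maps the check to the two messages from the question.
describe :: String -> String
describe q = if isInteger q then "integer" else "not integer"
```

In main this would be used as getLine >>= putStrLn . describe, with no call to read at all.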
