Parsing Input Using Expression Tree - parsing

So I am trying to take a user input and then parse it into an expression that I have set up an expression tree for. The expression tree that I have is:
data Expression = State [Float] Float
| Const Float
| Plus Expression Expression
| Times Expression Expression
| Minus Expression Expression
| Divide Expression Expression
| Uminus Expression
deriving(Show, Read)
And then I am trying to take the input and then parse it using the following two lines.
expression <- getLine
read expression
Is there something that I am missing regarding either getting the input or in my use of read? The error that it is giving me is:
Main.hs:94:3: error:
• No instance for (Read (IO a0)) arising from a use of ‘read’
• In a stmt of a 'do' block: read expression
In the expression:
do putStrLn "Input an expression to evaluate"
expression <- getLine
read expression
print ("Hello World")
....
In an equation for ‘main’:
main
= do putStrLn "Input an expression to evaluate"
expression <- getLine
read expression
....

If you use something inside a do-notation, it is in the Monad (or Applicative, if you turn on some language pragmas).
So, your code
expression <- getLine
read expression
is written inside a monad. Then read expression is expected to have a type m a for some monad m and some type a. Compiler can infer m from the type of getLine :: IO String, so m is IO.
But read has type read :: Read b => String -> b. Therefore, b should be IO a. But there is no instance of Read for IO a, you can't convert a string into an IO action returning something.
The problem is that you want to perform pure computation. To do so you have to use let:
expression <- getLine
let parsed = read expression :: Expression
print parsed
Here I use let to bind a name parsed to the result of read expression. Then you have to do something with a value parsed, for example, you can print it.
I have to specify the type because print can work with anything that has Show instance, and the compiler can't select the specific type of parsed. If I write something more specific, like performOperation :: Expression -> IO (), then the type declaration can be omitted.

Related

Taking a parser in argument and try to apply it zero or more times

I am currently writing a basic parser. A parser for type a takes a string in argument and returns either nothing, or an object of type a and the rest of the string.
Here is a simple type satisfying all these features:
type Parser a = String -> Maybe (a, String)
For example, I wrote a function that takes a Char as argument and returns a Parser Char :
parseChar :: Char -> Parser Char
parseChar _ [] = Nothing
parseChar c (x:xs)
| c == x = Just (x, xs)
| otherwise = Nothing
I would like to write a function which takes a parser in argument and tries to apply it zero or more times, returning a list of the parsed elements :
parse :: Parser a -> Parser [a]
Usage example:
> parse (parseChar ' ') " foobar"
Just (" ", "foobar")
I tried to write a recursive function but I can't save the parsed elements in a list.
How can I apply the parsing several times and save the result in a list ?
I tried to write a recursive function but I can't save the parsed elements in a list.
You don't need to "save" anything. You can use pattern matching. Here's a hint. Try to reason about what should happen in each case below. The middle case is a bit subtle, don't worry if you get that wrong at first. Note how s and s' are used below.
parse :: Parser a -> Parser [a]
parse p s = case p s of
Nothing -> ... -- first p failed
Just (x,s') -> case parse p s' of
Nothing -> ... -- subtle case, might not be relevant after all
Just (xs,s'') -> ... -- merge the results
Another hint: note that according to your description parse p should never fail, since it can always return the empty list.

Parse String to Datatype in Haskell

I'm taking a Haskell course at school, and I have to define a Logical Proposition datatype in Haskell. Everything so far Works fine (definition and functions), and i've declared it as an instance of Ord, Eq and show. The problem comes when I'm required to define a program which interacts with the user: I have to parse the input from the user into my datatype:
type Var = String
data FProp = V Var
| No FProp
| Y FProp FProp
| O FProp FProp
| Si FProp FProp
| Sii FProp FProp
where the formula: ¬q ^ p would be: (Y (No (V "q")) (V "p"))
I've been researching, and found that I can declare my datatype as an instance of Read.
Is this advisable? If it is, can I get some help in order to define the parsing method?
Not a complete answer, since this is a homework problem, but here are some hints.
The other answer suggested getLine followed by splitting at words. It sounds like you instead want something more like a conventional tokenizer, which would let you write things like:
(Y
(No (V q))
(V p))
Here’s one implementation that turns a string into tokens that are either a string of alphanumeric characters or a single, non-alphanumeric printable character. You would need to extend it to support quoted strings:
import Data.Char
type Token = String
tokenize :: String -> [Token]
{- Here, a token is either a string of alphanumeric characters, or else one
- non-spacing printable character, such as "(" or ")".
-}
tokenize [] = []
tokenize (x:xs) | isSpace x = tokenize xs
| not (isPrint x) = error $
"Invalid character " ++ show x ++ " in input."
| not (isAlphaNum x) = [x]:(tokenize xs)
| otherwise = let (token, rest) = span isAlphaNum (x:xs)
in token:(tokenize rest)
It turns the example into ["(","Y","(","No","(","V","q",")",")","(","V","p",")",")"]. Note that you have access to the entire repertoire of Unicode.
The main function that evaluates this interactively might look like:
main = interact ( unlines . map show . map evaluate . parse . tokenize )
Where parse turns a list of tokens into a list of ASTs and evaluate turns an AST into a printable expression.
As for implementing the parser, your language appears to have similar syntax to LISP, which is one of the simplest languages to parse; you don’t even need precedence rules. A recursive-descent parser could do it, and is probably the easiest to implement by hand. You can pattern-match on parse ("(":xs) =, but pattern-matching syntax can also implement lookahead very easily, for example parse ("(":x1:xs) = to look ahead one token.
If you’re calling the parser recursively, you would define a helper function that consumes only a single expression, and that has a type signature like :: [Token] -> (AST, [Token]). This lets you parse the inner expression, check that the next token is ")", and proceed with the parse. However, externally, you’ll want to consume all the tokens and return an AST or a list of them.
The stylish way to write a parser is with monadic parser combinators. (And maybe someone will post an example of one.) The industrial-strength solution would be a library like Parsec, but that’s probably overkill here. Still, parsing is (mostly!) a solved problem, and if you just want to get the assignment done on time, using a library off the shelf is a good idea.
the read part of a REPL interpreter typically looks like this
repl :: ForthState -> IO () -- parser definition
repl state
= do putStr "> " -- puts a > character to indicate it's waiting for input
input <- getLine -- this is what you're looking for, to read a line.
if input == "quit" -- allows user to quit the interpreter
then do putStrLn "Bye!"
return ()
else let (is, cs, d, output) = eval (words input) state -- your grammar definition is somewhere down the chain when eval is called on input
in do mapM_ putStrLn output
repl (is, cs, d, [])
main = do putStrLn "Welcome to your very own interpreter!"
repl initialForthState -- runs the parser, starting with read
your eval method will have various loops, stack manipulations, conditionals, etc to actually figure out what the user inputted. hope this helps you with at least the reading input part.

try function in parsing lambda expressions

I'm totally new to Haskell and trying to implement a "Lambda calculus" parser, that will be used to read the input to a lambda reducer .. It's required to parse bindings first "identifier = expression;" from a text file, then at the end there's an expression alone ..
till now it can parse bindings only, and displays errors when encountering an expression alone .. when I try to use the try or option functions, it gives a type mismatch error:
Couldn't match type `[Expr]'
with `Text.Parsec.Prim.ParsecT s0 u0 m0 [[Expr]]'
Expected type: Text.Parsec.Prim.ParsecT
s0 u0 m0 (Text.Parsec.Prim.ParsecT s0 u0 m0 [[Expr]])
Actual type: Text.Parsec.Prim.ParsecT s0 u0 m0 [Expr]
In the second argument of `option', namely `bindings'
bindings weren't supposed to return anything, but I tried to add a return statement and it also returned a type mismatch error:
Couldn't match type `[Expr]' with `Expr'
Expected type: Text.Parsec.Prim.ParsecT
[Char] u0 Data.Functor.Identity.Identity [Expr]
Actual type: Text.Parsec.Prim.ParsecT
[Char] u0 Data.Functor.Identity.Identity [[Expr]]
In the second argument of `(<|>)', namely `expressions'
Don't use <|> if you want to allow both
Your program parser does its main work with
program = do
spaces
try bindings <|> expressions
spaces >> eof
This <|> is choice - it does bindings if it can, and if that fails, expressions, which isn't what you want. You want zero or more bindings, followed by expressions, so let's make it do that.
Sadly, even when this works, the last line of your parser is eof and
First, let's allow zero bindings, since they're optional, then let's get both the bindings and the expressions:
bindings = many binding
program = do
spaces
bs <- bindings
es <- expressions
spaces >> eof
return (bs,es)
This error would be easier to find with plenty more <?> "binding" type hints so you can see more clearly what was expected.
endBy doesn't need many
The error message you have stems from the line
expressions = many (endBy expression eol)
which should be
expressions :: Parser [Expr]
expressions = endBy expression eol
endBy works like sepBy - you don't need to use many on it because it already parses many.
This error would have been easier to find with a stronger data type tree, so:
Use try to deal with common prefixes
One of the hard-to-debug problems you've had is when you get the error expecting space or "=" whilst parsing an expression. If we think about that, the only place we expect = is in a binding, so it must be part way through parsing a binding when we've given it an expression. This only happens if our expression starts with an identifier, just like a binding does.
binding sees the first identifier and says "It's OK guys, I've got this" but then finds no = and gives you an error, where we wanted it to backtrack and let expression have a go. The key point is we've already used the identifier input, and we want to unuse it. try is right for that.
Encase your binding parser with try so if it fails, we'll go back to the start of the line and hand over to expression.
binding = try (do
(Var id) <- identifier
_ <- char '='
spaces
exp <- expression
spaces
eol <?> "end of line"
return $ Eq id exp
<?> "binding")
It's important that as far as possible each parser starts with matching something unique to avoid this problem. (try is backtracking, hence inefficient, so should be avoided if possible.)
In particular, avoid starting parsers with spaces, but instead make sure you finish them all with spaces. Your main program can start with spaces if you like, since it's the only alternative.
Use types for most productions - better structure & readability
My first piece of general advice is that you could do with a more fine-grained data type, and should annotate your parsers with their type. At the moment, everything's wrapped up in Expr, which means you can only get error messages about whether you have an Expr or a [Expr]. The fact that you had to add Eq to Expr is a sign you're pushing the type too far.
Usually it's worth making a data type for quite a lot of the productions, and if you import Control.Applicative hiding ((<|>),(<$>),many) Control.Applicative you can use <$> and <*> so that the production, the datatype and the parser are all the same structure:
--<program> ::= <spaces> [<bindings>] <expressions>
data Program = Prog [Binding] [Expr]
program = spaces >> Prog <$> bindings <*> expressions
-- <expression> ::= <abstraction> | factors
data Expression = Ab Abstraction | Fa [Factor]
expression = Ab <$> abstraction <|> Fa <$> factors <?> "expression"
Don't do this with letters for example, but for important things. What counts as important things is a matter of judgement, but I'd start with Identifiers. (You can use <* or *> to not include syntax like = in the results.)
Amended code:
Before refactoring types and using Applicative here
And afterwards here

What is an appropriate data structure or algorithm for producing an immutable concrete syntax tree in a functionally pure manner?

Given a LL(1) grammar what is an appropriate data structure or algorithm for producing an immutable concrete syntax tree in a functionally pure manner? Please feel free to write example code in whatever language you prefer.
My Idea
symbol : either a token or a node
result : success or failure
token : a lexical token from source text
value -> string : the value of the token
type -> integer : the named type code of the token
next -> token : reads the next token and keeps position of the previous token
back -> token : moves back to the previous position and re-reads the token
node : a node in the syntax tree
type -> integer : the named type code of the node
symbols -> linkedList : the symbols found at this node
append -> symbol -> node : appends the new symbol to a new copy of the node
Here is an idea I have been thinking about. The main issue here is handling syntax errors.
I mean I could stop at the first error but that doesn't seem right.
let program token =
sourceElements (node nodeType.program) token
let sourceElements node token =
let (n, r) = sourceElement (node.append (node nodeType.sourceElements)) token
match r with
| success -> (n, r)
| failure -> // ???
let sourceElement node token =
match token.value with
| "function" ->
functionDeclaration (node.append (node nodeType.sourceElement)) token
| _ ->
statement (node.append (node nodeType.sourceElement)) token
Please Note
I will be offering up a nice bounty to the best answer so don't feel rushed. Answers that simply post a link will have less weight over answers that show code or contain detailed explanations.
Final Note
I am really new to this kind of stuff so don't be afraid to call me a dimwit.
You want to parse something into an abstract syntax tree.
In the purely functional programming language Haskell, you can use parser combinators to express your grammar. Here an example that parses a tiny expression language:
EDIT Use monadic style to match Graham Hutton's book
-- import a library of *parser combinators*
import Parsimony
import Parsimony.Char
import Parsimony.Error
(+++) = (<|>)
-- abstract syntax tree
data Expr = I Int
| Add Expr Expr
| Mul Expr Expr
deriving (Eq,Show)
-- parse an expression
parseExpr :: String -> Either ParseError Expr
parseExpr = Parsimony.parse pExpr
where
-- grammar
pExpr :: Parser String Expr
pExpr = do
a <- pNumber +++ parentheses pExpr -- first argument
do
f <- pOp -- operation symbol
b <- pExpr -- second argument
return (f a b)
+++ return a -- or just the first argument
parentheses parser = do -- parse inside parentheses
string "("
x <- parser
string ")"
return x
pNumber = do -- parse an integer
digits <- many1 digit
return . I . read $ digits
pOp = -- parse an operation symbol
do string "+"
return Add
+++
do string "*"
return Mul
Here an example run:
*Main> parseExpr "(5*3)+1"
Right (Add (Mul (I 5) (I 3)) (I 1))
To learn more about parser combinators, see for example chapter 8 of Graham Hutton's book "Programming in Haskell" or chapter 16 of "Real World Haskell".
Many parser combinator library can be used with different token types, as you intend to do. Token streams are usually represented as lists of tokens [Token].
Definitely check out the monadic parser combinator approach; I've blogged about it in C# and in F#.
Eric Lippert's blog series on immutable binary trees may be helpful. Obviously, you need a tree which is not binary, but it will give you the general idea.

F# how to write an empty statement

How can I write a no-op statement in F#?
Specifically, how can I improve the second clause of the following match statement:
match list with
| [] -> printfn "Empty!"
| _ -> ignore 0
Use unit for empty side effect:
match list with
| [] -> printfn "Empty!"
| _ -> ()
The answer from Stringer is, of course, correct. I thought it may be useful to clarify how this works, because "()" insn't really an empty statement or empty side effect...
In F#, every valid piece of code is an expression. Constructs like let and match consist of some keywords, patterns and several sub-expressions. The F# grammar for let and match looks like this:
<expr> ::= let <pattern> = <expr>
<expr>
::= match <expr> with
| <pat> -> <expr>
This means that the body of let or the body of clause of match must be some expression. It can be some function call such as ignore 0 or it can be some value - in your case it must be some expression of type unit, because printfn ".." is also of type unit.
The unit type is a type that has only one value, which is written as () (and it also means empty tuple with no elements). This is, indeed, somewhat similar to void in C# with the exception that void doesn't have any values.
BTW: The following code may look like a sequence of statements, but it is also an expression:
printf "Hello "
printf "world"
The F# compiler implicitly adds ; between the two lines and ; is a sequencing operator, which has the following structure: <expr>; <expr>. It requires that the first expression returns unit and returns the result of the second expression.
This is a bit surprising when you're coming from C# background, but it makes the langauge surprisingly elegant and consise. It doesn't limit you in any way - you can for example write:
if (a < 10 && (printfn "demo"; true)) then // ...
(This example isn't really useful - just a demonstration of the flexibility)

Resources