Haskell - Happy, math expressions and variables parser - parsing

I'm working on a Happy math expressions and variables parser. The problem is that I don't know how to save the value for a variable and use it later. Any ideas?
This is how I recognize expressions and variables assignment:
genExp : exp { $1 }
| variable '=' exp { //here I want to save the value of the variable; something like this: insert variables $1 $3, where 'variables' is a Data.Map }
A expression can contain a variable. For example:
a = 2 + 1
a + 2 (now the parser must print 5)
I need to save the value of the variable 'a' when the parser is parsing the line 'a = 2 + 1' and to get the value of the variable 'a' when the parser is parsing the line 'a + 2'

What you want to do is to keep track of the value of variables during evaluation of expressions, not during parsing. Let's assume you parse your expressions into the following types:
data Expr = Literal Int | Variable Var | Assign Var Expr | Add Expr Expr | ...
newtype Var = Var String deriving (Ord, Eq, Show)
Then you could simply pass a Map around your evaluation function with the current value of all variables:
import qualified Data.Map as M
import Control.Monad.State
data Expr = Literal Int | Variable Var | Assign Var Expr | Add Expr Expr
newtype Var = Var String deriving (Ord, Eq, Show)
-- Each Expr corresponds to a single line in your language, so
-- a = 2+1
-- a + 2
-- corresponds to
-- [Assign (Var "a") (Add (Literal 2) (Literal 1)),
-- Add (Variable (Var "a")) (Literal 2)]
eval :: [Expr] -> Int
eval es = last $ evalState (mapM eval' es) M.empty -- M.empty :: M.Map Var Int
where
eval' (Literal n) = return n
eval' (Variable v) = do
vs <- get
case M.lookup v vs of
Just x -> return x
_ -> error $ "variable " ++ show v ++ " is undefined!"
eval' (Assign v ex) = do
x <- eval' ex
modify (M.insert v x)
return x
eval' (Add a b) = do
x <- eval' a
y <- eval' b
return (x+y)
Of course, there's nothing to prevent you from evaluating expressions as you parse them, eliminating the need for an abstract syntax tree such as this. The general idea is the same there; you'll need to keep some state with you during the entire parsing, that keeps track of the current value of all your variables.

Related

Making Let binding parsing in Haskell

I want to create parser-like behaviour in Haskell such that I can assign an expression to a variable based on a string. I have difficulties doing so.
If I have types with the following definitions:
data Expr =
Numb Int
| Add Expr Expr
| Let {var :: PVariable, definition, body :: Expr}
type PVariable = String
And want to create a function 'eval' that would be able to handle different operations such as Add, Subtract, Multiply etc... but also the Let binding, sucht that 'eval' would be subject to the following definition:
eval :: Exp -> Integer
eval (Number expr) = expr
eval (Add expr1 expr2) = eval(expr1) + eval(expr2)
...
eval (Let v expr1 body) = ...
How could I then create eval such that it would assign an expr1 to the string v, that would then be expressed in the body, such that the parser-like behaviour could accomplish for instance something similar to the conversion from:
Let {var = "Var1", definition = expr1, body = (Add (Var "Var1") (Var "Var1"))}
where expr1 would be a chosen expression such that the above could be expressed as
let Var1 = expr1 in expr1+expr1
That could then have different Expr assigned to expr1 such as (Numb 2), so that we would get something similar to the following in Haskell:
let Var1 = 2 in Var1 + Var1
So far I have tried to deal with isolating fields of the record 'Let' so that I can evaluate each of these considering that I want to stay with the function type declarations. But I don't think that this is the easiest way, and it would probably require that I create a whole function to extract these, as far as I can see from : How to generically extract field names and values in Haskell records
Is there a smarter way to go about it?
You'll need the function eval to have extra argument that would contain the variable bindings and pass it to subexpressions recursively. You also need a special case to evaluate Var-expressions:
module Main where
import qualified Data.Map as M
data Expr =
Numb Int
| Add Expr Expr
| Let {var :: PVariable, definition, body :: Expr}
| Var PVariable
type PVariable = String
type Env = M.Map PVariable Int
eval :: Env -> Expr -> Int
eval _ (Numb a) = a
eval env (Add e1 e2) = (eval env e1) + (eval env e2)
eval env (Var v) = M.findWithDefault (error $ "undefined variable: " ++ v) v env
eval env (Let v expr body) = let
val = eval env expr
env' = M.insert v val env
in eval env' body
main = print $ eval M.empty $ Let "a" (Numb 1) (Add (Var "a") (Numb 2))

Haskell: Parsing a file finishes after first expression despite more input in file

The following is an example program of a language in which I'm writing a parser.
n := 1
Do (1)-> -- The 1 in brackets is a placeholder for a Boolean or relational expression.
n := 1 + 1
Od
When the program looks like this, the parseFile functions ends after the assignment on the first line however when the assignment is removed, it parses as expected. Below is how it's called in GHCI, first with the first line present then removed:
λ > parseFile "example.hnry"
Assign "n" (HInteger 1)
λ > parseFile "example.hnry"
Do (HInteger 1) (Assign "n" (AExpr (HInteger 1) Add (HInteger 1)))
The expected output would look similar to this:
λ > parseFile "example.hnry"
Assign "n" (HInteger 1) Do (HInteger 1) (Assign "n" (AExpr (HInteger 1) Add (HInteger 1)))
I first assumed it was something to do with the the assignment parser but in the body of the loop, there exists an assignment which parses as expected so I was able to rule that out. I believe that the issue is within the parseFile function itself. The following is the parseFile function and the other functions that make up the parseExpression function that I'm using to parse a program.
I think that the error is within parseFile because it parses an expression only once and doesn't "loop" for the want of a better word to itself to check if there's more input left the parse. I think that's the error but I'm not quite sure.
parseFile :: String -> IO HVal
parseFile file =
do program <- readFile file
case parse parseExpression "" program of
Left err -> fail "Parse Error"
Right parsed -> return $ parsed
parseExpression :: Parser HVal
parseExpression = parseAExpr <|> parseDo <|> parseAssign
parseDo :: Parser HVal
parseDo = do
_ <- string "Do "
_ <- char '('
x <- parseHVal -- Will be changed to a Boolean expression
_ <- string ")->"
spaces
y <- parseExpression
spaces
_ <- string "Od"
return $ Do x y
parseAExpr :: Parser HVal
parseAExpr = do
x <- parseInteger
spaces
op <- parseOp
spaces
y <- parseInteger <|> do
_ <- char '('
z <- parseAExpr
_ <- char ')'
return $ z
return $ AExpr x op y
parseAssign :: Parser HVal
parseAssign = do
var <- oneOf ['a'..'z'] <|> oneOf ['A'..'Z']
spaces
_ <- string ":="
spaces
val <- parseHVal <|> do
_ <- char '('
z <- parseAExpr
_ <- char ')'
return $ z
return $ Assign [var] val
As you note, your parseFile function parses a single expression (though maybe "statement" would be a better name) using the parseExpression parser. You probably want to introduce a new parser for a "program" or sequence of expressions/statements:
parseProgram :: Parser [HVal]
parseProgram = spaces *> many (parseExpression <* spaces)
and then in parseFile, replace parseExpression with parseProgram:
parseFile :: String -> IO [HVal]
parseFile file =
do program <- readFile file
case parse parseProgram "" program of
Left err -> fail "Parse Error"
Right parsed -> return $ parsed
Note that I've had to change the type here from HVal to [HVal] to reflect the fact that a program, being a sequence of expressions each of type HVal, needs to be represented as some sort of data type capable of combining multiple HVals together, and a list [HVal] is one way of doing so.
If you want a program to be an HVal instead of an [HVal], then you need to introduce a new constructor in your HVal type that's capable of representing programs. One method is to use a constructor to directly represent a block of statements:
data HVal = ... | Block [HVal]
Another is to add a constructor represent a sequence of two statements:
data HVal = ... | Seq HVal HVal
Both methods are used in real parsers. (Note that you'd normally pick one; you wouldn't use both.) To represent a sequence of three assignment statements, for example, the block method would do it directly as a list:
Block [Assign "a" (HInteger 1), Assign "b" (HInteger 2), Assign "c" (HInteger 3)]
while the two-statement sequence method would build a sort of nested tree:
Seq (Assign "a" (HInteger 1)) (Seq (Assign "b" (HInteger 2)
(Assign "c" (HInteger 3))
The appropriate parsers for these two alternatives, both of which return a plain HVal, might be:
-- use blocks
parseProgram1 :: Parser HVal
parseProgram1 = do
spaces
xs <- many (parseExpression <* spaces)
return $ Block xs
parseProgram2 :: Parser HVal
parseProgram2 = do
spaces
x <- parseExpression
spaces
(do xs <- parseProgram2
return $ Seq x xs)
<|> return x

Stack overflow with two functions calling each other in Applicative parser

I'm doing data61's course: https://github.com/data61/fp-course. In the parser one, the following implementation will cause parse (list1 (character *> valueParser 'v')) "abc" stack overflow.
Existing code:
data List t =
Nil
| t :. List t
deriving (Eq, Ord)
-- Right-associative
infixr 5 :.
type Input = Chars
data ParseResult a =
UnexpectedEof
| ExpectedEof Input
| UnexpectedChar Char
| UnexpectedString Chars
| Result Input a
deriving Eq
instance Show a => Show (ParseResult a) where
show UnexpectedEof =
"Unexpected end of stream"
show (ExpectedEof i) =
stringconcat ["Expected end of stream, but got >", show i, "<"]
show (UnexpectedChar c) =
stringconcat ["Unexpected character: ", show [c]]
show (UnexpectedString s) =
stringconcat ["Unexpected string: ", show s]
show (Result i a) =
stringconcat ["Result >", hlist i, "< ", show a]
instance Functor ParseResult where
_ <$> UnexpectedEof =
UnexpectedEof
_ <$> ExpectedEof i =
ExpectedEof i
_ <$> UnexpectedChar c =
UnexpectedChar c
_ <$> UnexpectedString s =
UnexpectedString s
f <$> Result i a =
Result i (f a)
-- Function to determine is a parse result is an error.
isErrorResult ::
ParseResult a
-> Bool
isErrorResult (Result _ _) =
False
isErrorResult UnexpectedEof =
True
isErrorResult (ExpectedEof _) =
True
isErrorResult (UnexpectedChar _) =
True
isErrorResult (UnexpectedString _) =
True
-- | Runs the given function on a successful parse result. Otherwise return the same failing parse result.
onResult ::
ParseResult a
-> (Input -> a -> ParseResult b)
-> ParseResult b
onResult UnexpectedEof _ =
UnexpectedEof
onResult (ExpectedEof i) _ =
ExpectedEof i
onResult (UnexpectedChar c) _ =
UnexpectedChar c
onResult (UnexpectedString s) _ =
UnexpectedString s
onResult (Result i a) k =
k i a
data Parser a = P (Input -> ParseResult a)
parse ::
Parser a
-> Input
-> ParseResult a
parse (P p) =
p
-- | Produces a parser that always fails with #UnexpectedChar# using the given character.
unexpectedCharParser ::
Char
-> Parser a
unexpectedCharParser c =
P (\_ -> UnexpectedChar c)
--- | Return a parser that always returns the given parse result.
---
--- >>> isErrorResult (parse (constantParser UnexpectedEof) "abc")
--- True
constantParser ::
ParseResult a
-> Parser a
constantParser =
P . const
-- | Return a parser that succeeds with a character off the input or fails with an error if the input is empty.
--
-- >>> parse character "abc"
-- Result >bc< 'a'
--
-- >>> isErrorResult (parse character "")
-- True
character ::
Parser Char
character = P p
where p Nil = UnexpectedString Nil
p (a :. as) = Result as a
-- | Parsers can map.
-- Write a Functor instance for a #Parser#.
--
-- >>> parse (toUpper <$> character) "amz"
-- Result >mz< 'A'
instance Functor Parser where
(<$>) ::
(a -> b)
-> Parser a
-> Parser b
f <$> P p = P p'
where p' input = f <$> p input
-- | Return a parser that always succeeds with the given value and consumes no input.
--
-- >>> parse (valueParser 3) "abc"
-- Result >abc< 3
valueParser ::
a
-> Parser a
valueParser a = P p
where p input = Result input a
-- | Return a parser that tries the first parser for a successful value.
--
-- * If the first parser succeeds then use this parser.
--
-- * If the first parser fails, try the second parser.
--
-- >>> parse (character ||| valueParser 'v') ""
-- Result >< 'v'
--
-- >>> parse (constantParser UnexpectedEof ||| valueParser 'v') ""
-- Result >< 'v'
--
-- >>> parse (character ||| valueParser 'v') "abc"
-- Result >bc< 'a'
--
-- >>> parse (constantParser UnexpectedEof ||| valueParser 'v') "abc"
-- Result >abc< 'v'
(|||) ::
Parser a
-> Parser a
-> Parser a
P a ||| P b = P c
where c input
| isErrorResult resultA = b input
| otherwise = resultA
where resultA = a input
infixl 3 |||
My code:
instance Monad Parser where
(=<<) ::
(a -> Parser b)
-> Parser a
-> Parser b
f =<< P a = P p
where p input = onResult (a input) (\i r -> parse (f r) i)
instance Applicative Parser where
(<*>) ::
Parser (a -> b)
-> Parser a
-> Parser b
P f <*> P a = P b
where b input = onResult (f input) (\i f' -> f' <$> a i)
list ::
Parser a
-> Parser (List a)
list p = list1 p ||| pure Nil
list1 ::
Parser a
-> Parser (List a)
list1 p = (:.) <$> p <*> list p
However, if I change list to not use list1, or use =<< in list1, it works fine. It also works if <*> uses =<<. I feel like it might be an issue with tail recursion.
UPDATE:
If I use lazy pattern matching here
P f <*> ~(P a) = P b
where b input = onResult (f input) (\i f' -> f' <$> a i)
It works fine. Pattern matching here is the problem. I don't understand this... Please help!
If I use lazy pattern matching P f <*> ~(P a) = ... then it works fine. Why?
This very issue was discussed recently. You could also fix it by using newtype instead of data: newtype Parser a = P (Input -> ParseResult a).(*)
The definition of list1 wants to know both parser arguments to <*>, but actually when the first will fail (when input is exhausted) we don't need to know the second! But since we force it, it will force its second argument, and that one will force its second parser, ad infinitum.(**) That is, p will fail when input is exhausted, but we have list1 p = (:.) <$> p <*> list p which forces list p even though it won't run when the preceding p fails. That's the reason for the infinite looping, and why your fix with the lazy pattern works.
What is the difference between data and newtype in terms of laziness?
(*)newtype'd type always has only one data constructor, and pattern matching on it does not actually force the value, so it is implicitly like a lazy pattern. Try newtype P = P Int, let foo (P i) = 42 in foo undefined and see that it works.
(**) This happens when the parser is still prepared, composed; before the combined, composed parser even gets to run on the actual input. This means there's yet another, third way to fix the problem: define
list1 p = (:.) <$> p <*> P (\s -> parse (list p) s)
This should work regardless of the laziness of <*> and whether data or newtype was used.
Intriguingly, the above definition means that the parser will be actually created during run time, depending on the input, which is the defining characteristic of Monad, not Applicative which is supposed to be known statically, in advance. But the difference here is that the Applicative depends on the hidden state of input, and not on the "returned" value.

Parsing function in haskell

I'm new to Haskell and I am trying to parse expressions. I found out about Parsec and I also found some articles but I don't seem to understand what I have to do. My problem is that I want to give an expression like "x^2+2*x+3" and the result to be a function that takes an argument x and returns a value. I am very sorry if this is an easy question but I really need some help. Thanks! The code I inserted is from the article that you can find on this link.
import Control.Monad(liftM)
import Text.ParserCombinators.Parsec
import Text.ParserCombinators.Parsec.Expr
import Text.ParserCombinators.Parsec.Token
import Text.ParserCombinators.Parsec.Language
data Expr = Num Int | Var String | Add Expr Expr
| Sub Expr Expr | Mul Expr Expr | Div Expr Expr
| Pow Expr Expr
deriving Show
expr :: Parser Expr
expr = buildExpressionParser table factor
<?> "expression"
table = [[op "^" Pow AssocRight],
[op "*" Mul AssocLeft, op "/" Div AssocLeft],
[op "+" Add AssocLeft, op "-" Sub AssocLeft]]
where
op s f assoc
= Infix (do{ string s; return f}) assoc
factor = do{ char '('
; x <- expr
; char ')'
; return x}
<|> number
<|> variable
<?> "simple expression"
number :: Parser Expr
number = do{ ds<- many1 digit
; return (Num (read ds))}
<?> "number"
variable :: Parser Expr
variable = do{ ds<- many1 letter
; return (Var ds)}
<?> "variable"
This is just a parser for expressions with variables. Actually interpreting the expression is an entirely separate matter.
You should create a function that takes an already parsed expression and values for variables, and returns the result of evaluating the expression. Pseudocode:
evaluate :: Expr -> Map String Int -> Int
evaluate (Num n) _ = n
evaluate (Var x) vars = {- Look up the value of x in vars -}
evaluate (Plus e f) vars = {- Evaluate e and f, and return their sum -}
...
I've deliberately omitted some details; hopefully by exploring the missing parts, you learn more about Haskell.
As a next step, you should probably look at the Reader monad for a convenient way to pass the variable map vars around, and using Maybe or Error to signal errors, e.g. referencing a variable that is not bound in vars, or division by zero.

Problems with Haskell's Parsec <|> operator

I am new to both Haskell and Parsec. In an effort to learn more about the language and that library in particular I am trying to create a parser that can parse Lua saved variable files. In these files variables can take the following forms:
varname = value
varname = {value, value,...}
varname = {{value, value},{value,value,...}}
I've created parsers for each of these types but when I string them together with the choice <|> operator I get a type error.
Couldn't match expected type `[Char]' against inferred type `Char'
Expected type: GenParser Char st [[[Char]]]
Inferred type: GenParser Char st [[Char]]
In the first argument of `try', namely `lList'
In the first argument of `(<|>)', namely `try lList'
My assumption is (although I can't find it in the documentation) that each parser passed to the choice operator must return the same type.
Here's the code in question:
data Variable = LuaString ([Char], [Char])
| LuaList ([Char], [[Char]])
| NestedLuaList ([Char], [[[Char]]])
deriving (Show)
main:: IO()
main = do
case (parse varName "" "variable = {{1234,\"Josh\"},{123,222}}") of
Left err -> print err
Right xs -> print xs
varName :: GenParser Char st Variable
varName = do{
vName <- (many letter);
eq <- string " = ";
vCon <- try nestList
<|> try lList
<|> varContent;
return (vName, vCon)}
varContent :: GenParser Char st [Char]
varContent = quotedString
<|> many1 letter
<|> many1 digit
quotedString :: GenParser Char st [Char]
quotedString = do{
s1 <- string "\"";
s2 <- varContent;
s3 <- string "\"";
return (s1++s2++s3)}
lList :: GenParser Char st [[Char]]
lList = between (string "{") (string "}") (sepBy varContent (string ","))
nestList :: GenParser Char st [[[Char]]]
nestList = between (string "{") (string "}") (sepBy lList (string ","))
That's correct.
(<|>) :: (Alternative f) => f a -> f a -> f a
Notice how both arguments are exactly the same type.
I don't exactly understand your Variable data type. This is the way I would do it:
data LuaValue = LuaString String | LuaList [LuaValue]
data Binding = Binding String LuaValue
This allows values to be arbitrarily nested, not just nested two levels deep like yours has. Then write:
luaValue :: GenParser Char st LuaValue
luaValue = (LuaString <$> identifier)
<|> (LuaList <$> between (string "{") (string "}") (sepBy (string ",") luaValue))
This is the parser for luaValue. Then you just need to write:
binding :: GenParser Char st Binding
content :: GenParser Char st [Binding]
And you'll have it. Using a data type that accurately represents what is possible is important.
Indeed, parsers passed to the choice operator must have equal types. You can tell by the type of the choice operator:
(<|>) :: GenParser tok st a -> GenParser tok st a -> GenParser tok st a
This says that it will happily combine two parsers as long as their token types, state types and result types are the same.
So how do we make sure those parsers you're trying to combine have the same result type? Well, you already have a datatype Variable that captures the different forms of variables that can appear in Lua, so what we need to do is not return String, [String] or [[String]] but just Variables.
But when we try that we run into a problem. We can't let nestList etc. return Variables yet because the constructors of Variable require variable names and we don't know those yet at that point. There are workarounds for this (such as return a function String -> Variable that still expects that variable name) but there is a better solution: separate the variable name from the different kinds of values that a variable can have.
data Variable = Variable String Value
deriving Show
data Value = LuaString String
| LuaList [Value]
deriving (Show)
Note that I've removed the NestedLuaList constructor. I've changed LuaList to accept a list of Values rather than Strings, so a nested list can now be expressed as a LuaList of LuaLists. This allows lists to be nested arbitrarily deep rather than just two levels as in your example. I don't know if this is allowed in Lua but it made writing the parsers easier. :-)
Now we can let lList and nestList return Values:
lList :: GenParser Char st Value
lList = do
ss <- between (string "{") (string "}") (sepBy varContent (string ","))
return (LuaList (map LuaString ss))
nestList :: GenParser Char st Value
nestList = do
vs <- between (string "{") (string "}") (sepBy lList (string ","))
return (LuaList vs)
And varName, which I've renamed variable here, now returns a Variable:
variable :: GenParser Char st Variable
variable = do
vName <- (many letter)
eq <- string " = "
vCon <- try nestList
<|> try lList
<|> (do v <- varContent; return (LuaString v))
return (Variable vName vCon)
I think you'll find that when you run your parser on some input there are still some problems, but you're already a lot closer to the solution now than before.
I hope this helps!

Resources