Stuck trying to parse a string into an Abstract Syntax Tree
This is the grammar that I'm using, in BNF:
<Block> ::= <Expr>;
<Expr> ::= <Number> | <App>
<App> ::= (<Expr>, <Expr>) <Func>
<Func> ::= + | - | * | /
<Number> ::= <Digit> | <Digit><Number>
<Digit> ::= 0 | 1 | .... | 9
I have to use this:
data Ast = Number Integer | Func String | App Ast [Ast] | Block [Ast]
I wrote a tokenize function (perhaps not completely finished, but it gives the right idea):
import Data.Char (isDigit)
tokenize :: String -> [String]
tokenize [] = []
tokenize xs@(x : xs')
  | x `elem` t = [x] : tokenize xs'
  | isDigit x = takeWhile isDigit xs : tokenize (dropWhile isDigit xs)
  | otherwise = tokenize xs'
  where t = ['+', '-', '*', '/', '(', ')', ',', ';']
I'm stuck on the parseApp function. It's not obvious to me how I could use pattern matching, and I don't see any other way to do it (with my very limited Haskell experience):
parse :: String -> Ast
parse xs = fst (parseBlock (tokenize xs))
parseBlock :: [String] -> (Ast, [String])
parseBlock xss = let (a, b) = parseExpr (init xss) in (Block [a], b)
parseExpr :: [String] -> (Ast, [String])
parseExpr xss@(xs : xss')
  | all isDigit xs = (Number (read xs :: Integer), xss')
  | otherwise = parseApp xss
parseApp :: [String] -> (Ast, [String])
parseApp xss = -- ...
If the input string is
"((10, 2)- , (0, 2)-)+;"
it should tokenize to
["(", "(", "10", ",", "2", ")", "-", ",", "(", "0", ",", "2", ")", "-", ")", "+", ";"]
and that would become
(Block [ App (Name "+") [App (Name "-") [Number 10, Number 2], App (Name "-") [Number 0, Number 2]]], [])
It's an assignment, so I can't change the types and so on. I'm supposed to assume that the given input has correct syntax. I have read similar threads here, but the answers seem to assume a higher level of understanding.
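One way to fill in parseApp, offered as a sketch and not as the assignment's reference solution: pattern match on the leading "(" token, parse the two operands with parseExpr, and match the ",", ")" and operator tokens that sit between and after them. The sketch uses the Ast declaration exactly as given above (so the operator is wrapped in Func), assumes ',' is included in the tokenizer's character list, and adds a deriving clause only so that results can be printed and compared.

```haskell
import Data.Char (isDigit)

data Ast = Number Integer | Func String | App Ast [Ast] | Block [Ast]
  deriving (Show, Eq) -- deriving added for illustration only

tokenize :: String -> [String]
tokenize [] = []
tokenize xs@(x : xs')
  | x `elem` "+-*/(),;" = [x] : tokenize xs'
  | isDigit x = let (ds, rest) = span isDigit xs in ds : tokenize rest
  | otherwise = tokenize xs' -- skip whitespace and unknown characters

parseBlock :: [String] -> (Ast, [String])
parseBlock xss = let (a, b) = parseExpr (init xss) in (Block [a], b)

parseExpr :: [String] -> (Ast, [String])
parseExpr xss@(xs : xss')
  | all isDigit xs = (Number (read xs), xss')
  | otherwise = parseApp xss
parseExpr [] = error "unexpected end of input"

-- <App> ::= (<Expr>, <Expr>) <Func>
parseApp :: [String] -> (Ast, [String])
parseApp ("(" : ts) =
  case parseExpr ts of
    (l, "," : ts1) ->
      case parseExpr ts1 of
        (r, ")" : op : rest) -> (App (Func op) [l, r], rest)
        _ -> error "expected ')' followed by an operator"
    _ -> error "expected ','"
parseApp _ = error "expected '('"

parse :: String -> Ast
parse = fst . parseBlock . tokenize
```

Each parser returns the unconsumed tokens, so the nested case expressions can simply check what the next token is before continuing.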
Related
Currently, I am working on a problem of parsing and showing expressions in Haskell.
type Name = String
data Expr = Val Integer
          | Var Name
          | Expr :+: Expr
          | Expr :-: Expr
          | Expr :*: Expr
          | Expr :/: Expr
          | Expr :%: Expr
This is my data type Expr, and this is how I define the show function:
instance Show Expr where
  show (Val x) = show x
  show (Var y) = y
  show (p :+: q) = par (show p ++ "+" ++ show q)
  show (p :-: q) = par (show p ++ "-" ++ show q)
  show (p :/: q) = par (show p ++ "/" ++ show q)
  show (p :*: q) = par (show p ++ "*" ++ show q)
  show (p :%: q) = par (show p ++ "%" ++ show q)
par :: String -> String
par s = "(" ++ s ++ ")"
Later I tried to transform string input into an expression, but I ran into the following problem: I don't understand how parentheses in the second case are implemented in Haskell.
*Main> Val 2 :*:Val 2 :+: Val 3
((2*2)+3)
*Main> Val 2 :*:(Val 2 :+: Val 3)
(2*(2+3))
Because of that, I am a bit confused about how I should transform parentheses from my string into the expression. Currently I am using the following function for parsing, but for now it just ignores parentheses, which is not the intended behavior:
toExpr :: String -> Expr
toExpr str = f (lexer str) (Val 0)
  where
    f [] expr = expr
    f (c:cs) expr
      | isAlpha (head c) = f cs (Var c)
      | isDigit (head c) = f cs (Val (read c))
      | c == "+" = (expr :+: f cs (Val 0))
      | c == "-" = (expr :-: f cs (Val 0))
      | c == "/" = (expr :/: f cs (Val 0))
      | c == "*" = (expr :*: f cs (Val 0))
      | c == "%" = (expr :%: f cs (Val 0))
      | otherwise = f cs expr
I don't understand how parentheses in the second case are implemented in Haskell.
The brackets just give precedence to a certain part of the expression. The problem is not with the parentheses you render; I think the problem is that you did not assign a precedence to your operators. Unless you declare otherwise, Haskell gives every operator the same default fixity (infixl 9), so without explicit brackets they are all parsed left to right: x ⊕ y ⊗ z is parsed as (x ⊕ y) ⊗ z.
You can define the precedence of your :+:, :*:, etc. operators with infixl:
infixl 7 :*:, :/:, :%:
infixl 5 :+:, :-:
type Name = String
data Expr = Val Integer
          | Var Name
          | Expr :+: Expr
          | Expr :-: Expr
          | Expr :*: Expr
          | Expr :/: Expr
          | Expr :%: Expr
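To see the effect of the fixity declarations, here is a small self-contained sketch. It leaves out Var for brevity and uses a hypothetical render function in place of the question's Show instance; both names are illustrative assumptions, not part of the original code.

```haskell
infixl 7 :*:, :/:, :%:
infixl 5 :+:, :-:

data Expr = Val Integer
          | Expr :+: Expr
          | Expr :-: Expr
          | Expr :*: Expr
          | Expr :/: Expr
          | Expr :%: Expr

-- Fully parenthesised rendering, in the spirit of the question's show.
render :: Expr -> String
render (Val x)   = show x
render (p :+: q) = par p "+" q
render (p :-: q) = par p "-" q
render (p :*: q) = par p "*" q
render (p :/: q) = par p "/" q
render (p :%: q) = par p "%" q

par :: Expr -> String -> Expr -> String
par p op q = "(" ++ render p ++ op ++ render q ++ ")"

-- With the declarations above, :*: binds tighter than :+: on either side.
example1, example2 :: String
example1 = render (Val 2 :+: Val 2 :*: Val 3) -- "(2+(2*3))"
example2 = render (Val 2 :*: Val 2 :+: Val 3) -- "((2*2)+3)"
```

Without the two infixl lines, example1 would group as ((2+2)*3), because both constructors would share the default fixity.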
As for your parser (toExpr), you will need a parsing mechanism such as an LALR parser, which stores intermediate results on a stack and so applies the operators in the proper order.
This was my final parser, which gave me the result I needed. To get the result I wanted, I added a proper grammar and wrote a parser according to that grammar.
Thanks, everyone for the help.
{-
parser for the following grammar:
E -> T E'
E' -> + T E' | - T E' | <empty string>
T -> F T'
T' -> * F T' | / F T' | % F T' | <empty string>
F -> (E) | <integer> | <identifier>
-}
parseExpr :: String -> (Expr,[String])
parseExpr tokens = parseE (lexer tokens)
parseE :: [String] -> (Expr,[String])
parseE tokens = parseE' acc rest where (acc,rest) = parseT tokens
parseE' :: Expr -> [String] -> (Expr,[String])
parseE' accepted ("+":tokens) = let (acc,rest) = parseT tokens in parseE' (accepted :+: acc) rest
parseE' accepted ("-":tokens) = let (acc,rest) = parseT tokens in parseE' (accepted :-: acc) rest
parseE' accepted tokens = (accepted,tokens)
parseT :: [String] -> (Expr,[String])
parseT tokens = let (acc,rest) = parseF tokens in parseT' acc rest
parseT' :: Expr -> [String] -> (Expr,[String])
parseT' accepted ("*":tokens) = let (acc,rest) = parseF tokens in parseT' (accepted :*: acc) rest
parseT' accepted ("/":tokens) = let (acc,rest) = parseF tokens in parseT' (accepted :/: acc) rest
parseT' accepted ("%":tokens) = let (acc,rest) = parseF tokens in parseT' (accepted :%: acc) rest
parseT' accepted tokens = (accepted,tokens)
parseF :: [String] -> (Expr,[String])
parseF ("(":tokens) = (e, tail rest) where (e,rest) = parseE tokens -- tail drops the closing ")"
parseF (t:tokens)
  | isAlpha (head t) = (Var t,tokens)
  | isDigit (head t) = (Val (read t),tokens)
  | otherwise = error ("unexpected token: " ++ t)
parseF [] = error "unexpected end of input"
lexer :: String -> [String]
lexer [] = []
lexer (c:cs)
  | elem c " \t\n" = lexer cs
  | elem c "=+-*/%()" = [c]:(lexer cs)
  | isAlpha c = (c:takeWhile isAlpha cs):lexer(dropWhile isAlpha cs)
  | isDigit c = (c:takeWhile isDigit cs):lexer(dropWhile isDigit cs)
  | otherwise = error ("unexpected character: " ++ [c])
I am testing indentation-sensitive parsing with FParsec, based on this implementation. When I make it a little more complex by adding expressions (literals, lists, tuples and arithmetic operations), allowing expressions at top level, and adding a variable creation statement, I get a StackOverflowException. I believe this is because the expression parser ends up being invoked in an infinite loop. I see no other reason, but I don't know how to fix the problem.
If I remove the attempt pexpression from my parser data statement, there is no more StackOverflowException, nevertheless the module IndentationParserWithoutBacktracking (therefore managing indentation) tells me that the code to be parsed is missing a "newline":
Failure: Error in Ln: 2 Col: 1
loop i 0 10
^
Expecting: let or print
The parser backtracked after:
Error in Ln: 3 Col: 5
let myVar = 2 + 1
^
Expecting: loop or print
The parser backtracked after:
Error in Ln: 3 Col: 17
let myVar = 2 + 1
^
Expecting: newline
All this according to the following text to be parsed:
loop i 0 10
let myVar = 2 + 1
print myVar
Here is my code:
open FParsec
// module IndentationParserWithoutBacktracking // see the link
// Utils
open IndentationParserWithoutBacktracking
let isBlank = fun c -> c = ' ' || c = '\t'
let ws = spaces
let ws1 = skipMany1SatisfyL isBlank "whitespace"
let str s = pstring s .>> ws
let keyword str = pstring str >>? nextCharSatisfiesNot (fun c -> isLetter c || isDigit c) <?> str
// AST
type Identifier = Identifier of string
type InfixOp =
| Sum | Sub | Mul | Div | Pow | Mod
| And | Or | Equal | NotEqual | Greater | Smaller | GreaterEqual | SmallerEqual
type Value =
| Int of int
| Float of float
| Bool of bool
| String of string
| Char of char
| Variable of Identifier
type Expr =
| Literal of Value
| Infix of Expr * InfixOp * Expr
| List of Expr list
| Tuple of Expr list
type Statement =
| Expression of Expr
| Let of Identifier * Statement list
| Loop of Identifier * Expr * Expr * Statement list
| Print of Identifier
// Literals
let numberFormat = NumberLiteralOptions.AllowMinusSign ||| NumberLiteralOptions.AllowFraction |||
NumberLiteralOptions.AllowHexadecimal ||| NumberLiteralOptions.AllowOctal |||
NumberLiteralOptions.AllowBinary ||| NumberLiteralOptions.AllowPlusSign
let literal_numeric =
numberLiteral numberFormat "number" |>> fun nl ->
if nl.IsInteger then Literal (Int(int nl.String))
else Literal (Float(float nl.String))
let literal_bool =
(choice [
(stringReturn "true" (Literal (Bool true)))
(stringReturn "false" (Literal (Bool false)))
]
.>> ws) <?> "boolean"
let literal_string =
(between (pstring "\"") (pstring "\"") (manyChars (satisfy (fun c -> c <> '"')))
|>> fun s -> Literal (String s)) <?> "string"
let literal_char =
(between (pstring "'") (pstring "'") (satisfy (fun c -> c <> '''))
|>> fun c -> Literal (Char c)) <?> "character"
let identifier =
(many1Satisfy2L isLetter (fun c -> isLetter c || isDigit c) "identifier"
|>> fun i -> Identifier i) <?> "valid identifier"
let betweenParentheses p =
(between (str "(") (str ")") p)
let variable = identifier |>> fun id -> Literal (Variable id)
let literal = (attempt literal_numeric <|>
attempt literal_bool <|>
attempt literal_char <|>
attempt literal_string <|>
attempt variable) <?> "literal"
// Expressions
let pexpr, pexprimpl = createParserForwardedToRef()
let term =
(ws >>. literal .>> ws) <|>
(betweenParentheses (ws >>. pexpr)) <|>
(ws >>. pexpr .>> ws)
let infixOperator (p: OperatorPrecedenceParser<_, _, _>) op prec map =
p.AddOperator(InfixOperator(op, ws, prec, Associativity.Left, map))
let ops =
// Arithmetic
[ "+"; "-"; "*"; "/"; "%" ] @
// Logical
[ "&&"; "||"; "=="; "!="; ">"; "<"; ">="; "<=" ]
let opCorrespondance op =
match op with
// Arithmetic operators
| "+" -> Sum
| "-" -> Sub
| "*" -> Mul
| "/" -> Div
| "%" -> Mod
// Logical operators
| "&&" -> And
| "||" -> Or
| "==" -> Equal
| "!=" -> NotEqual
| ">" -> Greater
| "<" -> Smaller
| ">=" -> GreaterEqual
| "<=" -> SmallerEqual
let opParser = new OperatorPrecedenceParser<_, _, _>()
for op in ops do
infixOperator opParser op 1 (fun x y -> Infix(x, opCorrespondance op, y))
opParser.TermParser <- term
let list = between (str "[") (str "]") (sepBy pexpr (str ",")) |>> List
let tuple = between (str "(") (str ")") (sepBy pexpr (str ",")) |>> Tuple
let expression =
opParser.ExpressionParser <|> // Removing this line makes the StackOverflowException go away.
list <|>
tuple <|>
literal
pexprimpl := attempt expression
// Statements
let statements, statementsRef = createParserForwardedToRef()
let pexpression = expression |>> Expression
let plet =
pipe2
(keyword "let" >>. ws1 >>. identifier)
(ws >>. str "=" >>. ws >>. statements)
(fun id stmts -> Let(id, stmts))
// From the link, but "revisited"
let ploop =
pipe4
(keyword "loop" >>. ws1 >>. identifier)
(ws1 >>. literal) // If I put 'pexpr', it doesn't work too...
(ws1 >>. literal)
(statements)
(fun id min max stmts -> Loop(id, min, max, stmts))
let print = keyword "print" >>. (ws1 >>. identifier |>> Print)
let statement =
attempt plet <|>
attempt print <|>
attempt ploop <|>
attempt pexpression
statementsRef := indentedMany1 statement "statement"
let document = statements .>> spaces .>> eof
let test str =
match runParserOnString document (UserState.Create()) "" str with
| Success(result, _, _) -> printfn "Success: %A" result
| Failure(errorMsg, _, _) -> printfn "Failure: %s" errorMsg
System.Console.Clear()
test @"
loop i 0 10
let myVar = 2 + 1
print myVar
"
I know I ask several questions at the same time, and the site doesn't really allow it, but they're all a little linked together, so I might as well kill two birds with one stone...
I would really like to understand my mistakes, in order to design a parser for a very small ML-like language.
Thank you.
Edit
Here is my current code, which has been modified to respond to the first problems encountered with indentation:
open IndentationParserWithoutBacktracking // So from the link
let isBlank = fun c -> c = ' ' || c = '\t'
let ws = spaces
let ws1 = skipMany1SatisfyL isBlank "whitespace"
let str s = pstring s .>> ws
let keyword str = pstring str >>? nextCharSatisfiesNot (fun c -> isLetter c || isDigit c) <?> str
// AST
type Identifier = Identifier of string
type Value =
| Int of int
| Float of float
| Bool of bool
| String of string
| Char of char
| Variable of Identifier
// In FP, "all" is an expression, so:
type Expr =
// Arithmetic + lists and tuple
| Literal of Value
| Infix of Expr * InfixOp * Expr
| List of Expr list
| Tuple of Expr list
// Statements
| Return of Expr
| Loop of Identifier * Expr * Expr * Expr list
| Print of Identifier
and InfixOp =
| Sum | Sub | Mul | Div | Pow | Mod
| And | Or | Equal | NotEqual | Greater | Smaller | GreaterEqual | SmallerEqual
// Literals
let numberFormat = NumberLiteralOptions.AllowMinusSign ||| NumberLiteralOptions.AllowFraction |||
NumberLiteralOptions.AllowHexadecimal ||| NumberLiteralOptions.AllowOctal |||
NumberLiteralOptions.AllowBinary
let literal_numeric =
numberLiteral numberFormat "number" |>> fun nl ->
if nl.IsInteger then Literal (Int(int nl.String))
else Literal (Float(float nl.String))
let literal_bool =
(choice [
(stringReturn "true" (Literal (Bool true)))
(stringReturn "false" (Literal (Bool false)))
]
.>> ws) <?> "boolean"
let literal_string =
(between (pstring "\"") (pstring "\"") (manyChars (satisfy (fun c -> c <> '"')))
|>> fun s -> Literal (String s)) <?> "string"
let literal_char =
(between (pstring "'") (pstring "'") (satisfy (fun c -> c <> '''))
|>> fun c -> Literal (Char c)) <?> "character"
let identifier =
(many1Satisfy2L isLetter (fun c -> isLetter c || isDigit c) "identifier"
|>> fun i -> Identifier i) <?> "identifier"
let betweenParentheses p =
(between (str "(") (str ")") p) <?> ""
let variable = identifier |>> fun id -> Literal (Variable id)
let literal = (attempt literal_numeric <|>
attempt literal_bool <|>
attempt literal_char <|>
attempt literal_string <|>
attempt variable) <?> "literal"
// Expressions and statements
let pexprs, pexprimpl = createParserForwardedToRef()
// `ploop` is located here to force `pexprs` to be of the type `Expr list`, `ploop` requesting an expression list.
let ploop =
pipe4
(keyword "loop" >>. ws1 >>. identifier)
(ws1 >>. literal)
(ws1 >>. literal)
(pexprs)
(fun id min max stmts -> Loop(id, min, max, stmts))
// `singlepexpr` allows to use only one expression.
let singlepexpr =
pexprs |>> fun ex -> ex.Head
let term =
(ws >>. singlepexpr .>> ws) <|>
(betweenParentheses (ws >>. singlepexpr)) <|>
(ws >>. literal .>> ws) <|>
(betweenParentheses (ws >>. literal))
let infixOperator (p: OperatorPrecedenceParser<_, _, _>) op prec map =
p.AddOperator(InfixOperator(op, ws, prec, Associativity.Left, map))
let ops =
// Arithmetic
[ "+"; "-"; "*"; "/"; "%" ] @
// Logical
[ "&&"; "||"; "=="; "!="; ">"; "<"; ">="; "<=" ]
let opCorrespondance op =
match op with
// Arithmetic operators
| "+" -> Sum
| "-" -> Sub
| "*" -> Mul
| "/" -> Div
| "%" -> Mod
// Logical operators
| "&&" -> And
| "||" -> Or
| "==" -> Equal
| "!=" -> NotEqual
| ">" -> Greater
| "<" -> Smaller
| ">=" -> GreaterEqual
| "<=" -> SmallerEqual
let opParser = new OperatorPrecedenceParser<Expr, unit, UserState>()
for op in ops do
infixOperator opParser op 1 (fun x y -> Infix(x, opCorrespondance op, y))
opParser.TermParser <- term
let list = (between (str "[") (str "]") (sepBy singlepexpr (str ",")) |>> List) <?> "list"
let tuple = (between (str "(") (str ")") (sepBy singlepexpr (str ",")) |>> Tuple) <?> "tuple"
// Statements
// A commented `let` expression, commented for tests with the `return` instruction.
//let plet =
// pipe3
// (keyword "let" >>. ws1 >>. identifier)
// (ws >>. gtt ":")
// (ws >>. str "=" >>. ws >>. pexprs)
// (fun id gtt exp -> Let(id, gtt, exp))
let preturn =
keyword "return" >>. ws >>. singlepexpr
|>> fun ex -> Return ex
let print = keyword "print" >>. (ws1 >>. identifier |>> Print)
let instruction =
print <|>
ploop <|>
preturn <|>
opParser.ExpressionParser <|> // So we add the arithmetic, like x + y or 21 * 32 - 12 for example
list <|>
tuple <|>
literal
pexprimpl := indentedMany1 instruction "instruction"
let document = pexprs .>> spaces .>> eof
let test str =
match runParserOnString document (UserState.Create()) "" str with
| Success(result, _, _) -> printfn "%A" result
| Failure(errorMsg, _, _) -> printfn "%s" errorMsg
System.Console.Clear()
// The test code that produces the "Expecting: newline" error
let code = test @"
return 2 + 1
"
The reason why indentedMany1 tells you it's expecting a newline in your example code is because that's what it's looking for: an indented block. Not an expression on one line. So your let myVar = 2 + 1 line is confusing it. If you wrote it as:
let myVar =
2 + 1
then I bet it would work.
What you need, I believe, is to change your let parser to allow one of two things: either an expression on a single line, or a block of statements (your statements parser). I.e., something like:
let pLetValue = (expression |>> fun e -> [Expression e]) <|> statements
let plet =
    pipe2
        (keyword "let" >>. ws1 >>. identifier)
        (ws >>. str "=" >>. ws >>. pLetValue)
        (fun id value -> Let(id, value))
Note that I haven't tested this, as I don't have much time today. It's possible that instead of expression above, you'd want attempt expression (or pexpr, which is the same thing). Experiment a little and see what happens; and if you're completely lost as you try to figure out how FParsec is handling a given expression, remember the advice given in http://www.quanttec.com/fparsec/users-guide/debugging-a-parser.html.
I have a simple language with the following grammar:
Expr -> Var | Int | Expr Op Expr
Op -> + | - | * | / | % | == | != | < | > | <= | >= | && | ||
Stmt -> Skip | Var := Expr | Stmt ; Stmt | write Expr | read Expr | while Expr do Stmt | if Expr then Stmt else Stmt
I am writing a simple parser for this language using Haskell's Parsec library, and I am stuck on a couple of things.
When I try to parse the statement skip ; skip, I get only the first Skip; however, I want to get something like Colon Skip Skip.
Also, when I try to parse an assignment, I get an infinite recursion. For example, when I try to parse x := 1, my computer hangs for a long time.
Here is the full source code of my parser. Thanks for any help!
module Parser where
import Control.Monad
import Text.Parsec.Language
import Text.ParserCombinators.Parsec
import Text.ParserCombinators.Parsec.Expr
import Text.ParserCombinators.Parsec.Language
import qualified Text.ParserCombinators.Parsec.Token as Token
type Id = String
data Op = Add
| Sub
| Mul
| Div
| Mod
| Eq
| Neq
| Gt
| Geq
| Lt
| Leq
| And
| Or deriving (Eq, Show)
data Expr = Var Id
| Num Integer
| BinOp Op Expr Expr deriving (Eq, Show)
data Stmt = Skip
| Assign Expr Expr
| Colon Stmt Stmt
| Write Expr
| Read Expr
| WhileLoop Expr Stmt
| IfCond Expr Stmt Stmt deriving (Eq, Show)
languageDef =
emptyDef { Token.commentStart = ""
, Token.commentEnd = ""
, Token.commentLine = ""
, Token.nestedComments = False
, Token.caseSensitive = True
, Token.identStart = letter
, Token.identLetter = alphaNum
, Token.reservedNames = [ "skip"
, ";"
, "write"
, "read"
, "while"
, "do"
, "if"
, "then"
, "else"
]
, Token.reservedOpNames = [ "+"
, "-"
, "*"
, "/"
, ":="
, "%"
, "=="
, "!="
, ">"
, ">="
, "<"
, "<="
, "&&"
, "||"
]
}
lexer = Token.makeTokenParser languageDef
identifier = Token.identifier lexer
reserved = Token.reserved lexer
reservedOp = Token.reservedOp lexer
semi = Token.semi lexer
parens = Token.parens lexer
integer = Token.integer lexer
whiteSpace = Token.whiteSpace lexer
ifStmt :: Parser Stmt
ifStmt = do
reserved "if"
cond <- expression
reserved "then"
action1 <- statement
reserved "else"
action2 <- statement
return $ IfCond cond action1 action2
whileStmt :: Parser Stmt
whileStmt = do
reserved "while"
cond <- expression
reserved "do"
action <- statement
return $ WhileLoop cond action
assignStmt :: Parser Stmt
assignStmt = do
var <- expression
reservedOp ":="
expr <- expression
return $ Assign var expr
skipStmt :: Parser Stmt
skipStmt = do
reserved "skip"
return Skip
colonStmt :: Parser Stmt
colonStmt = do
s1 <- statement
reserved ";"
s2 <- statement
return $ Colon s1 s2
readStmt :: Parser Stmt
readStmt = do
reserved "read"
e <- expression
return $ Read e
writeStmt :: Parser Stmt
writeStmt = do
reserved "write"
e <- expression
return $ Write e
statement :: Parser Stmt
statement = colonStmt
<|> assignStmt
<|> writeStmt
<|> readStmt
<|> whileStmt
<|> ifStmt
<|> skipStmt
expression :: Parser Expr
expression = buildExpressionParser operators term
term = fmap Var identifier
<|> fmap Num integer
<|> parens expression
operators = [ [Infix (reservedOp "==" >> return (BinOp Eq)) AssocNone,
Infix (reservedOp "!=" >> return (BinOp Neq)) AssocNone,
Infix (reservedOp ">" >> return (BinOp Gt)) AssocNone,
Infix (reservedOp ">=" >> return (BinOp Geq)) AssocNone,
Infix (reservedOp "<" >> return (BinOp Lt)) AssocNone,
Infix (reservedOp "<=" >> return (BinOp Leq)) AssocNone,
Infix (reservedOp "&&" >> return (BinOp And)) AssocNone,
Infix (reservedOp "||" >> return (BinOp Or)) AssocNone]
, [Infix (reservedOp "*" >> return (BinOp Mul)) AssocLeft,
Infix (reservedOp "/" >> return (BinOp Div)) AssocLeft,
Infix (reservedOp "%" >> return (BinOp Mod)) AssocLeft]
, [Infix (reservedOp "+" >> return (BinOp Add)) AssocLeft,
Infix (reservedOp "-" >> return (BinOp Sub)) AssocLeft]
]
parser :: Parser Stmt
parser = whiteSpace >> statement
parseString :: String -> Stmt
parseString str =
  case parse parser "" str of
    Left e -> error $ show e
    Right r -> r
It's a common problem with parsers built from parser combinators: statement is left-recursive, because its first alternative is colonStmt, and the first thing colonStmt does is try to parse a statement again. Parser combinators are well known not to terminate in this case.
After removing the colonStmt alternative from the statement parser, the other parts worked as expected:
> parseString "if (1 == 1) then skip else skip"
< IfCond (BinOp Eq (Num 1) (Num 1)) Skip Skip
> parseString "x := 1"
< Assign (Var "x") (Num 1)
The solution is fully described in this repo. There is no license file, so I don't really know if it's safe to refer to the code, but the general idea is to add another layer when parsing any statement:
statement :: Parser Stmt
statement = do
  ss <- sepBy1 statement' (reserved ";")
  if length ss == 1
    then return $ head ss
    else return $ foldr1 Colon ss
statement' :: Parser Stmt
statement' = assignStmt
         <|> writeStmt
         <|> readStmt
         <|> whileStmt
         <|> ifStmt
         <|> skipStmt
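The same layering can be tried out without installing Parsec. The sketch below swaps in ReadP from base (it is not the Parsec code from the question) and cuts the language down to a hypothetical two-statement subset, skip and write with an integer literal, purely to show the sepBy1 plus foldr1 shape.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP

data Stmt = Skip | Write Integer | Colon Stmt Stmt
  deriving (Eq, Show)

-- Run a parser, then skip trailing whitespace.
lexeme :: ReadP a -> ReadP a
lexeme p = p <* skipSpaces

-- Statements that never begin with another statement.
statement' :: ReadP Stmt
statement' = skipStmt +++ writeStmt
  where
    skipStmt  = Skip <$ lexeme (string "skip")
    writeStmt = Write . read <$> (lexeme (string "write") *> lexeme (munch1 isDigit))

-- Stmt -> Stmt' (";" Stmt')*  : no left recursion, so parsing terminates.
statement :: ReadP Stmt
statement = foldr1 Colon <$> sepBy1 statement' (lexeme (char ';'))

-- Keep only parses that consumed the whole input.
parseStmt :: String -> Maybe Stmt
parseStmt s =
  case [r | (r, "") <- readP_to_S (skipSpaces *> statement <* eof) s] of
    (r : _) -> Just r
    []      -> Nothing
```

Because foldr1 applied to a one-element list returns that element unchanged, the explicit length check is not strictly needed in this version.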
I have a discriminated union for expressions like this one (EQ means =, GT means >, etc.):
(AND (OR (EQ X 0)
(GT X 10))
(OR (EQ Y 0)
(GT Y 10)))
I want to create instances of the DU from such expressions saved in a file or database.
How do I do it? If it is not feasible, what is the best way to approach it in F#?
Daniel: these expressions are saved in prefix format (as above) as text and will be parsed in F#. Thanks.
If you just want to know how to model these expressions using DUs, here's one way:
type BinaryOp =
| EQ
| GT
type Expr =
| And of Expr * Expr
| Or of Expr * Expr
| Binary of BinaryOp * Expr * Expr
| Var of string
| Value of obj
let expr =
And(
Or(
Binary(EQ, Var("X"), Value(0)),
Binary(GT, Var("X"), Value(10))),
Or(
Binary(EQ, Var("Y"), Value(0)),
Binary(GT, Var("Y"), Value(10))))
Now, this may be too "loose," i.e., it permits expressions like And(Value(1), Value(2)), which may not be valid according to your grammar. But this should give you an idea of how to approach it.
There are also some good examples in the F# Programming wikibook.
If you need to parse these expressions, I highly recommend FParsec.
Daniel's answer is good. Here's a similar approach, along with a simple top-down parser built with active patterns:
type BinOp = | And | Or
type Comparison = | Gt | Eq
type Expr =
| BinOp of BinOp * Expr * Expr
| Comp of Comparison * string * int
module private Parsing =
// recognize and strip a leading literal
let (|Lit|_|) lit (s:string) =
if s.StartsWith(lit) then Some(s.Substring lit.Length)
else None
// strip leading whitespace
let (|NoWs|) (s:string) =
s.TrimStart(' ', '\t', '\r', '\n')
// parse a binary operator
let (|BinOp|_|) = function
| Lit "AND" r -> Some(And, r)
| Lit "OR" r -> Some(Or, r)
| _ -> None
// parse a comparison operator
let (|Comparison|_|) = function
| Lit "GT" r -> Some(Gt, r)
| Lit "EQ" r -> Some(Eq, r)
| _ -> None
// parse a variable (alphabetical characters only)
let (|Var|_|) s =
let m = System.Text.RegularExpressions.Regex.Match(s, "^[a-zA-Z]+")
if m.Success then
Some(m.Value, s.Substring m.Value.Length)
else
None
// parse an integer
let (|Int|_|) s =
let m = System.Text.RegularExpressions.Regex.Match(s, @"^-?\d+")
if m.Success then
Some(int m.Value, s.Substring m.Value.Length)
else
None
// parse an expression
let rec (|Expr|_|) = function
| NoWs (Lit "(" (BinOp (b, Expr(e1, Expr(e2, Lit ")" rest))))) ->
Some(BinOp(b, e1, e2), rest)
| NoWs (Lit "(" (Comparison (c, NoWs (Var (v, NoWs (Int (i, Lit ")" rest))))))) ->
Some(Comp(c, v, i), rest)
| _ -> None
let parse = function
| Parsing.Expr(e, "") -> e
| s -> failwith (sprintf "Not a valid expression: %s" s)
let e = parse @"
(AND (OR (EQ X 0)
(GT X 10))
(OR (EQ Y 0)
(GT Y 10)))"
The question is similar to this one, but I want to parse an expression with function application using the OperatorPrecedenceParser in FParsec.
Here is my AST:
type Expression =
| Float of float
| Variable of VarIdentifier
| BinaryOperation of Operator * Expression * Expression
| FunctionCall of VarIdentifier (*fun name*) * Expression list (*arguments*)
I have the following input:
board→create_obstacle(4, 4, 450, 0, fric)
And here is the parser code:
let expr = (number |>> Float) <|> (ident |>> Variable)
let parenexpr = between (str_ws "(") (str_ws ")") expr
let opp = new OperatorPrecedenceParser<_,_,_>()
opp.TermParser <- expr <|> parenexpr
opp.AddOperator(InfixOperator("→", ws,
10, Associativity.Right,
fun left right -> BinaryOperation(Arrow, left, right)))
My problem here is that the function arguments are expressions as well (they can include operators, variables etc) and I don't know how to extend my expr parser to parse the argument list as a list of expression. I built a parser here, but I don't know how to combine it with my existing parser:
let primitive = expr <|> parenexpr
let argList = sepBy primitive (str_ws ",")
let fcall = tuple2 ident (between (str_ws "(") (str_ws ")") argList)
I currently have the following output from my parser:
Success: Expression (BinaryOperation
(Arrow,Variable "board",Variable "create_obstacle"))
What I want is to get the following:
Success: Expression
(BinaryOperation
(Arrow,
Variable "board",
Function (VarIdentifier "create_obstacle",
[Float 4, Float 4, Float 450, Float 0, Variable "fric"]))
You could parse the argument list as an optional postfix expression of an identifier:
let argListInParens = between (str_ws "(") (str_ws ")") argList
let identWithOptArgs =
    pipe2 ident (opt argListInParens)
          (fun id optArgs -> match optArgs with
                             | Some args -> FunctionCall(id, args)
                             | None -> Variable(id))
and then define expr like
let expr = (number |>> Float) <|> identWithOptArgs
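For readers who want to experiment with the "identifier followed by an optional argument list" idea outside FParsec, here is a rough equivalent written with Haskell's ReadP from base. It is an illustrative translation under simplifying assumptions: only non-negative integer literals are read as numbers, identifiers may contain underscores, and the → operator is left out. The names lexeme, symbol and parseExpression are introduced here and are not part of the original code.

```haskell
import Data.Char (isAlpha, isAlphaNum, isDigit)
import Text.ParserCombinators.ReadP

data Expression
  = Float Double
  | Variable String
  | FunctionCall String [Expression]
  deriving (Eq, Show)

lexeme :: ReadP a -> ReadP a
lexeme p = p <* skipSpaces

symbol :: String -> ReadP String
symbol = lexeme . string

ident :: ReadP String
ident = lexeme ((:) <$> satisfy isAlpha <*> munch (\c -> isAlphaNum c || c == '_'))

number :: ReadP Expression
number = Float . read <$> lexeme (munch1 isDigit)

argListInParens :: ReadP [Expression]
argListInParens = between (symbol "(") (symbol ")") (sepBy expr (symbol ","))

-- An identifier, optionally followed by a parenthesised argument list.
identWithOptArgs :: ReadP Expression
identWithOptArgs = do
  name <- ident
  optArgs <- (Just <$> argListInParens) <++ return Nothing
  return (maybe (Variable name) (FunctionCall name) optArgs)

expr :: ReadP Expression
expr = number <++ identWithOptArgs

parseExpression :: String -> Maybe Expression
parseExpression s =
  case [e | (e, "") <- readP_to_S (skipSpaces *> expr <* eof) s] of
    (e : _) -> Just e
    []      -> Nothing
```

The arguments are parsed with expr itself, so nested calls such as f(g(1), x) fall out of the recursion.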