How can I express this as an EBNF grammar?
<assign>--> <id> = <expr>
<id>--> A | B | C
<expr> --> <expr> * <expr>
<expr> --> <expr> + <expr>
| <id> + <expr>
|( <expr> )
| <id>
I can build a parse tree and a derivation for any statement using this grammar, but I am having trouble with EBNF.
<assign>--> <id> = <expr>
An assign is the sequence: id equals-sign expr.
<id>--> A | B | C
An id is one of A, B or C
<expr> --> <expr> * <expr>
<expr> --> <expr> + <expr>
| <id> + <expr>
|( <expr> )
| <id>
An expression can be:
The product of two expressions (infix notation)
The addition of two expressions (infix notation)
The addition of an identifier and an expression (which is a particular case of the addition of two expressions, where the first expression is just <id>)
A parenthesized expression.
An identifier.
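For the EBNF part of the question, one possible rendering replaces the recursion in <expr> with repetition: assign = id "=" expr ; expr = operand { ("+" | "*") operand } ; operand = id | "(" expr ")" ; id = "A" | "B" | "C". This is only a sketch: it should accept the same strings as the grammar above, and like the original it says nothing about precedence. The rule name operand and the Python function names below are made up for illustration.

```python
def tokenize(text):
    # Every token in this grammar is a single character; whitespace is skipped.
    return [ch for ch in text if not ch.isspace()]


def accepts(text):
    """Return True if text matches: assign = id "=" expr."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r} at token {pos}")
        pos += 1

    def ident():
        # id = "A" | "B" | "C"
        nonlocal pos
        if peek() not in ("A", "B", "C"):
            raise SyntaxError(f"expected an id at token {pos}")
        pos += 1

    def operand():
        # operand = id | "(" expr ")"
        if peek() == "(":
            expect("(")
            expr()
            expect(")")
        else:
            ident()

    def expr():
        # expr = operand { ("+" | "*") operand }
        # The EBNF braces become a while loop.
        nonlocal pos
        operand()
        while peek() in ("+", "*"):
            pos += 1
            operand()

    try:
        ident()
        expect("=")
        expr()
        return pos == len(tokens)
    except SyntaxError:
        return False
```

The point of the EBNF form is that each { ... } maps directly onto a loop, so no rule has to call itself on its leftmost symbol.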
I'm trying to implement a context-free grammar for the language of logical operators with parentheses including operator precedence.
For example, the following:
{1} or {2}
{1} and {2} or {3}
({1} or {2}) and {3}
not {1} and ({2} or {3})
...
I've started from the following grammar:
expr := control | expr and expr | expr or expr | not expr | (expr)
control := {d+}
To implement operator precedence and eliminate left recursion, I changed it in the following way:
S ::= expr
expr ::= control or expr1 | expr1
expr1 ::= control and expr2 | expr2
expr2 ::= not expr3 | expr3
expr3 ::= (expr) | expr | control
control := {d+}
But this grammar doesn't support examples like ({1} or {2}) and {3}, which contain 'and' / 'or' after parentheses.
For now, I have the following grammar:
S ::= expr
expr ::= control or expr1 | expr1
expr1 ::= control and expr2 | expr2
expr2 ::= not expr3 | expr3
expr3 ::= (expr) | (expr) expr4 | expr | control
expr4 ::= and expr | or expr
control := {d+}
Is this grammar correct?
Can it be simplified in some way?
Thanks!
To implement your operator precedence, you want just:
S ::= expr
expr ::= expr or expr1 | expr1
expr1 ::= expr1 and expr2 | expr2
expr2 ::= not expr2 | expr3
expr3 ::= (expr) | control
control := {d+}
This is left-recursive in order to be left-associative, which is generally what you want (both for correctness and for most parser generators). If you need to avoid left recursion for some reason, you can use a right-recursive, right-associative grammar:
S ::= expr
expr ::= expr1 or expr | expr1
expr1 ::= expr2 and expr1 | expr2
expr2 ::= not expr2 | expr3
expr3 ::= (expr) | control
control := {d+}
This works because both 'and' and 'or' are associative operators, so left-vs-right doesn't matter.
In both cases, you can "simplify" it by folding expr3 and control into expr2:
expr2 ::= not expr2 | ( expr ) | {d+}
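As a rough illustration, the right-recursive version with the folded expr2 rule maps onto a hand-written recursive descent parser with one function per rule. This is a sketch in Python, not a reference implementation; the tuple-based tree and the function names are my own choices.

```python
import re

# Tokens: {digits}, parentheses, and the three keywords.
TOKEN = re.compile(r"\s*(\{\d+\}|\(|\)|and\b|or\b|not\b)")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():
        # expr ::= expr1 or expr | expr1
        left = expr1()
        if peek() == "or":
            advance()
            return ("or", left, expr())
        return left

    def expr1():
        # expr1 ::= expr2 and expr1 | expr2
        left = expr2()
        if peek() == "and":
            advance()
            return ("and", left, expr1())
        return left

    def expr2():
        # expr2 ::= not expr2 | ( expr ) | {d+}
        tok = advance()
        if tok == "not":
            return ("not", expr2())
        if tok == "(":
            inner = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return inner
        if tok is not None and tok.startswith("{"):
            return int(tok[1:-1])
        raise SyntaxError(f"unexpected token {tok!r}")

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

With this sketch, parse("({1} or {2}) and {3}") gives ('and', ('or', 1, 2), 3), and parse("{1} and {2} or {3}") gives ('or', ('and', 1, 2), 3), so 'and' binds tighter than 'or' and parentheses can be followed by an operator.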
I implemented a type-checker and reducer of calculus of constructions in Haskell with a simple monadic parser using Megaparsec. Now I want to improve it so it can recognize this syntactic shortcut:
∀(x:A)->B (with x not free in B) = A -> B
The grammar for this syntax is as follows:
<expr>
= "(" <expr> ")"
| <expr> <expr>
| "λ" "(" <name> ":" <expr> ")" "→" <expr>
| "∀" "(" <name> ":" <expr> ")" "→" <expr>
| <expr> "→" <expr>
| <name>
| "*"
<name> = [_A-Za-z][_0-9A-Za-z]*
My current parser uses this variation with left recursion eliminated (without the shortcut):
<expr>
= "(" <appl> ")"
| "λ" "(" <name> ":" <appl> ")" "→" <appl>
| "∀" "(" <name> ":" <appl> ")" "→" <appl>
| <name>
| "*"
<appl> = <expr>+
<name> = [_A-Za-z][_0-9A-Za-z]*
The previously mentioned shortcut is left-recursive. I have no idea how to convert it to a right-recursive grammar so it can be handled by a conventional recursive descent parser.
I know there exist more powerful parsing techniques that can handle left-recursive grammars, but I want to keep it right-recursive to leave open the possibility of implementing a parser by hand in the near future.
The answer became evident after a short break. Use exactly the same trick that we used for <appl> and extend it as follows:
<expr>
= "(" <appl> ")"
| "λ" "(" <name> ":" <appl> ")" "→" <appl>
| "∀" "(" <name> ":" <appl> ")" "→" <appl>
| <name>
| "*"
<appl> = <expr>+ ("→" <appl>)?
<name> = [_A-Za-z][_0-9A-Za-z]*
I will leave the question open in case it helps somebody.
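For anyone who wants to see how the extended <appl> rule behaves in a hand-written parser, here is a small sketch. It is in Python rather than Haskell/Megaparsec, and the tuple node tags ("app", "arrow", "lam", "pi") are invented for illustration; they are not from the original type-checker.

```python
import re

TOKEN = re.compile(r"\s*([()λ∀:→*]|[_A-Za-z][_0-9A-Za-z]*)")
NAME = re.compile(r"[_A-Za-z][_0-9A-Za-z]*")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expect(tok):
        if advance() != tok:
            raise SyntaxError(f"expected {tok!r}")

    def name():
        tok = advance()
        if tok is None or not NAME.fullmatch(tok):
            raise SyntaxError(f"expected a name, got {tok!r}")
        return tok

    def expr():
        # <expr> = "(" <appl> ")" | binder | <name> | "*"
        tok = advance()
        if tok == "(":
            inner = appl()
            expect(")")
            return inner
        if tok in ("λ", "∀"):
            expect("(")
            var = name()
            expect(":")
            typ = appl()
            expect(")")
            expect("→")
            return ("lam" if tok == "λ" else "pi", var, typ, appl())
        if tok == "*" or (tok is not None and NAME.fullmatch(tok)):
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")

    def appl():
        # <appl> = <expr>+ ("→" <appl>)?
        node = expr()
        while peek() not in (None, ")", ":", "→"):
            node = ("app", node, expr())   # application folds to the left
        if peek() == "→":
            advance()
            return ("arrow", node, appl())  # the arrow recurses to the right
        return node

    tree = appl()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

In this sketch application stays left-associative because of the loop, while "→" is right-associative because the rule recurses only on its right side: parse("A → B → C") gives ('arrow', 'A', ('arrow', 'B', 'C')).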
This is the right-recursive grammar:
<assign> -> <id> = <exp>
<id> -> A | B | C
<exp> -> <term> + <exp> | <term>
<term> -> <factor> * <term> | <factor>
<factor> -> ( <exp> ) | <id>
This is the left-recursive grammar:
<assign> -> <id> = <exp>
<id> -> A | B | C
<exp> -> <exp> + <term> | <term>
<term> -> <term> * <factor> | <factor>
<factor> -> ( <exp> ) | <id>
Will those grammars produce the same parse tree for the string B + C + A?
The following picture is for Left Recursion.
However, when I draw the parse tree for right recursion, the positions of the nodes are a bit different. I don't know if what I am doing is correct. So I wonder whether left recursion and right recursion produce two different parse trees or should produce the same one. Please help clarify this. Thanks.
Left and right recursion will not produce identical trees.
You can easily see from the grammars that A+B+C will have, at the top level, either <term> <op> <exp> or <exp> <op> <term> (with <exp> being "B+C" in one case and "A+B" in the other).
The trees will only be identical in trivial cases where all productions yield a direct match.
(E.g. A, skipping assign, would be <exp> --> <term> --> <factor> --> <id>.)
Since the two grammars are different, I would expect different parse trees. Right recursion swaps the addition and the single expansion on the top node, the one labeled <expr> = <expr> + <term>. The right-recursive grammar expands it to <expr> = <term> + <expr>, so both children are swapped.
If you are trying to write a compiler for mathematical expressions, operator precedence matters more, but I'm not sure what you are up to.
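To make the difference concrete, here is a small sketch that builds the shape each grammar gives to a chain of additions. It is deliberately simplified: the intermediate <term>, <factor> and <id> nodes are left out, and the function names are invented for illustration.

```python
def left_recursive_tree(operands):
    # <exp> -> <exp> + <term> | <term>
    # Everything parsed so far becomes the LEFT child of the next "+".
    tree = operands[0]
    for operand in operands[1:]:
        tree = (tree, "+", operand)
    return tree


def right_recursive_tree(operands):
    # <exp> -> <term> + <exp> | <term>
    # The rest of the input becomes the RIGHT child of the first "+".
    if len(operands) == 1:
        return operands[0]
    return (operands[0], "+", right_recursive_tree(operands[1:]))


print(left_recursive_tree(["B", "C", "A"]))   # (('B', '+', 'C'), '+', 'A')
print(right_recursive_tree(["B", "C", "A"]))  # ('B', '+', ('C', '+', 'A'))
```

For B + C + A the left-recursive grammar groups as (B + C) + A and the right-recursive one as B + (C + A). For a single operand both functions return the same thing, which matches the trivial case mentioned above.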
Considering the following grammar for propositional logic:
<A> ::= <B> <-> <A> | <B>
<B> ::= <C> -> <B> | <C>
<C> ::= <D> \/ <C> | <D>
<D> ::= <E> /\ <D> | <E>
<E> ::= <F> | -<F>
<F> ::= <G> | <H>
<G> ::= (<A>)
<H> ::= p | q | r | ... | z
Precedence for connectives is: -, /\, \/, ->, <->.
Associativity is also considered; for example, p\/q\/r should be the same as p\/(q\/r). The same goes for the other connectives.
I intend to write a predictive top-down parser in Java. I don't see any ambiguity or direct left recursion here, but I'm not sure whether that is all I need in order to consider this an LL(1) grammar. Maybe indirect left recursion?
If this is not an LL(1) grammar, what steps would be required to transform it for my purposes?
It's not LL(1). Here's why:
The first rule of an LL(1) grammar is:
A grammar G is LL(1) if and only if, whenever A --> C | D are two distinct productions of G, the following conditions hold:
For no terminal a do both C and D derive strings beginning with a.
This rule exists so that the parser never faces a conflict when choosing a production. With your grammar, when the parser encounters a (, it won't know which production to use.
Your grammar violates this first rule. All the alternatives on the right-hand side of the same production, that is, all your Cs and Ds, eventually reduce to G and H, so all of them derive at least one string beginning with (.
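The rule above can be checked mechanically by computing FIRST sets and looking for alternatives of the same non-terminal whose sets overlap. The following is a sketch under two assumptions: the grammar has no empty productions (true for the one in the question), and <H> is abbreviated to p | q | r. The dictionary encoding and the function names are mine.

```python
GRAMMAR = {
    "A": [["B", "<->", "A"], ["B"]],
    "B": [["C", "->", "B"], ["C"]],
    "C": [["D", "\\/", "C"], ["D"]],
    "D": [["E", "/\\", "D"], ["E"]],
    "E": [["F"], ["-", "F"]],
    "F": [["G"], ["H"]],
    "G": [["(", "A", ")"]],
    "H": [["p"], ["q"], ["r"]],  # abbreviated: the question lists p ... z
}


def first_sets(grammar):
    # Fixed-point iteration. With no empty productions, FIRST of an
    # alternative is just FIRST of its leading symbol.
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                head = alt[0]
                new = first[head] if head in grammar else {head}
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def ll1_conflicts(grammar):
    # Map each non-terminal to the terminals on which two of its
    # alternatives collide.
    first = first_sets(grammar)

    def first_of(alt):
        head = alt[0]
        return first[head] if head in grammar else {head}

    conflicts = {}
    for nt, alts in grammar.items():
        for i in range(len(alts)):
            for j in range(i + 1, len(alts)):
                overlap = first_of(alts[i]) & first_of(alts[j])
                if overlap:
                    conflicts.setdefault(nt, set()).update(overlap)
    return conflicts
```

On this grammar the sketch reports conflicts for A, B, C and D, because both alternatives of each start with the same non-terminal. The usual remedy is left-factoring, for example rewriting <A> as <B> followed by an optional "<-> <A>" tail, so that the choice is made only after the common prefix has been parsed.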
We have little snippets of VB6 code (they only use a subset of features) that get written by non-programmers. These are called rules. They are hard to debug for the people writing them, so somebody wrote a kind of ad hoc parser to evaluate the subexpressions and thereby show better where the problem is.
This ad hoc parser is very bad and does not really work well, so I'm trying to write a real parser. Because I'm writing it by hand (I found no parser generator with a VB6 backend that I could understand), I want to go with a recursive descent parser. I had to reverse-engineer the grammar because I couldn't find anything. (Eventually I found something, http://www.notebar.com/GoldParserEngine.html, but it's LALR and way bigger than I need.)
Here is the grammar for the subset of VB.
<Rule> ::= expr rule | e
<Expr> ::= ( expr )
| Not_List CompareExpr <and_or> expr
| Not_List CompareExpr
<and_or> ::= Or | And
<Not_List> ::= Not Not_List | e
<CompareExpr> ::= ConcatExpr comp CompareExpr
|ConcatExpr
<ConcatExpr> ::= term term_tail & ConcatExpr
|term term_tail
<term> ::= factor factor_tail
<term_tail> ::= add_op term term_tail | e
<factor> ::= add_op Value | Value
<factor_tail> ::= multi_op factor factor_tail | e
<Value> ::= ConstExpr | function | expr
<ConstExpr> ::= <bool> | number | string | Nothing
<bool> ::= True | False
<Nothing> ::= Nothing | Null | Empty
<function> ::= id | id ( ) | id ( arg_list )
<arg_list> ::= expr , arg_list | expr
<add_op> ::= + | -
<multi_op> ::= * | /
<comp> ::= > | < | <= | => | =< | >= | = | <>
All in all it works pretty well. Here is a simple example:
my_function(1, 2 , 3)
looks like
(Programm
(rule
(expr
(Not_List)
(CompareExpr
(ConcatExpr
(term
(factor
(value
(function
my_function
(arg_list
(expr
(Not_List)
(CompareExpr
(ConcatExpr (term (factor (value 1))) (term_tail))))
(arg_list
(expr
(Not_List)
(CompareExpr
(ConcatExpr (term (factor (value 2))) (term_tail))))
(arg_list
(expr
(Not_List)
(CompareExpr
(ConcatExpr (term (factor (value 3))) (term_tail))))
(arg_list))))))))
(term_tail))))
(rule)))
Now what's my problem?
If the code looks like (( true OR false ) AND true), I get an infinite recursion, but the real problem is that in (true OR false) AND true, the input after the first ( expr ) is dropped and the whole thing is understood as only (true OR false).
Here is the parse tree:
So how do I solve this? Should I change the grammar somehow or use some implementation hack?
A harder example, in case you need it:
(( f1 OR f1 ) AND (( f3="ALL" OR f4="test" OR f5="ALL" OR f6="make" OR f9(1, 2) ) AND ( f7>1 OR f8>1 )) OR f8 <> "")
You have several issues that I see.
You are treating OR and AND as equal-precedence operators. You should have separate rules for OR and for AND. Otherwise you will get the wrong precedence (and therefore the wrong evaluation) for the expression A OR B AND C.
So as a first step, I'd revise your rules as follows:
<Expr> ::= ( expr )
| Not_List AndExpr Or Expr
| Not_List AndExpr
<AndExpr> ::=
| CompareExpr And AndExpr
| Not_List CompareExpr
Next problem is that you have ( expr ) at the top level of your list. What if I write:
A AND (B OR C)
To fix this, change these two rules:
<Expr> ::= Not_List AndExpr Or Expr
| Not_List AndExpr
<Value> ::= ConstExpr | function | ( expr )
I think your implementation of Not is not appropriate. Not is an operator,
just with one operand, so its "tree" should have a Not node and a child which
is the expression being Notted. What you have is a list of Nots with no operands.
Try this instead:
<Expr> ::= AndExpr Or Expr
| AndExpr
<Value> ::= ConstExpr | function | ( expr ) | Not Value
I haven't looked, but I think VB6 expressions have other messy things in them.
Notice that the Expr and AndExpr rules I have written use right recursion to avoid left recursion. You should change your Concat, Sum, and Factor rules to follow a similar style; what you have is pretty complicated and hard to follow.
If they are just creating snippets then perhaps VB5 is "good enough" for creating them. And if VB5 is good enough, the free VB5 Control Creation Edition might be worth tracking down for them to use:
http://www.thevbzone.com/vbcce.htm
You could have them start from a "test harness" project they add snippets to, and they can even test them out.
With a little orientation this will probably prove much more practical than hand crafting a syntax analyzer, and a lot more useful since they can test for more than correct syntax.
Where VB5 is lacking you might include a static module in the "test harness" that provides a rough and ready equivalent of Split(), Replace(), etc:
http://support.microsoft.com/kb/188007