Parse lists which can contain parentheses with ANTLR4 - parsing

Let's say I want to create a grammar similar to Lisp, where every expression is enclosed in parentheses.
For example:
(+ 1 2)
I also want the grammar to be able to parse the string ('(def foo)) to a parse tree which is similar to (expression ( literal '(def foo) )).
That means the parentheses inside the quoted literal should be associated with the literal itself, not with the enclosing expression.

Well, LISP in general is very user-extensible in terms of its grammar, so I don't know how possible it would be to get any BNF(+) form of it. Here is a discussion about it; I'm sure there are more if you search for it.
But for toy examples, this will probably be fine:
<s_expression> ::= <atomic_symbol>
| "(" <s_expression> "." <s_expression> ")"
| <list> .
<_list> ::= <s_expression> <_list>
| <s_expression> .
<list> ::= "(" <s_expression> <_list> ")" .
<atomic_symbol> ::= <letter> <atom_part> | "'" <s_expression> .
<atom_part> ::= <empty> | <letter> <atom_part> | <number> <atom_part> .
<letter> ::= "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" | "j"
| "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" | "s" | "t"
| "u" | "v" | "w" | "x" | "y" | "z" .
<number> ::= "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" | "0" .
<empty> ::= " ".
modified from here
I modified the grammar in a hurry, so please tell me if you see any problems with it.
Also, I haven't used ANTLR in a long time, so I don't know whether this is exactly the format it accepts. It should be trivial to adapt, though.
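To illustrate how the quote rule ties the following parentheses to the literal, here is a minimal hand-written reader in Python. It is only a sketch of the idea, not ANTLR output: the names tokenize, parse and read, the "literal" tag, and the choice to represent lists as Python lists are all assumptions made for this example.

```python
import re

# Tokens: a parenthesis, the quote mark, or a run of other non-space characters.
TOKEN = re.compile(r"[()']|[^\s()']+")

def tokenize(text):
    return TOKEN.findall(text)

def parse(tokens, pos=0):
    """Parse one s-expression starting at tokens[pos]; return (tree, next_pos)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "'":
        # A quote owns exactly the one s-expression that follows it,
        # so that expression's parentheses end up inside the literal node.
        quoted, pos = parse(tokens, pos + 1)
        return ("literal", quoted), pos
    if tok == "(":
        items = []
        pos += 1
        while pos < len(tokens) and tokens[pos] != ")":
            item, pos = parse(tokens, pos)
            items.append(item)
        if pos >= len(tokens):
            raise SyntaxError("missing ')'")
        return items, pos + 1  # skip the closing parenthesis
    if tok == ")":
        raise SyntaxError("unexpected ')'")
    return tok, pos + 1  # plain atom

def read(text):
    tokens = tokenize(text)
    tree, pos = parse(tokens)
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return tree
```

With this sketch, read("('(def foo))") yields a one-element list whose only item is the literal node wrapping ["def", "foo"], which matches the shape of the parse tree asked for in the question.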

Related

LL(1) BNF Recursion

Hey, could someone explain how recursion works for LL(1) with BNF?
DIGIT ::= NUMBER | NUMBER DIGIT
NUMBER ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
Shouldn't this be a FIRST(DIGIT) conflict, since both alternatives start with NUMBER?
If yes, how would it work in an LL(1) grammar?
Thank you for your time.
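One common way to look at this: both alternatives of DIGIT begin with NUMBER, so one token of lookahead cannot choose between them. Left factoring moves the choice after the shared prefix. The sketch below assumes the factored form DIGIT ::= NUMBER REST and REST ::= DIGIT | epsilon; REST is a hypothetical name introduced here, not part of the original grammar.

```python
DIGITS = set("0123456789")

def parse_digit(text, pos=0):
    """DIGIT ::= NUMBER REST. Returns the position after the match."""
    if pos >= len(text) or text[pos] not in DIGITS:
        raise SyntaxError(f"expected a digit at position {pos}")
    return parse_rest(text, pos + 1)

def parse_rest(text, pos):
    """REST ::= DIGIT | epsilon, decided by looking at one character."""
    if pos < len(text) and text[pos] in DIGITS:
        return parse_digit(text, pos)  # next char is in FIRST(DIGIT)
    return pos  # epsilon: consume nothing

def accepts(text):
    try:
        return parse_digit(text) == len(text)
    except SyntaxError:
        return False
```

After factoring, every decision is made on a single character: a digit means "continue with DIGIT", anything else means "take epsilon".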

How do I pass empty arguments in Parameterized style scenarios in FitNesse

I want to pass empty Strings with a Parameterized style scenario like this:
| scenario | Test with A "_", B "_" and C "_" | a, b, c |
| ensure | do | type | on | id=field1 | with | #a |
| ensure | do | type | on | id=field2 | with | #b |
| ensure | do | type | on | id=field3 | with | #c |
and accordingly I use this code:
| Test with A " ", B " " and C " "|
It doesn't matter whether I use " " or "${blank}" for an empty string: instead of empty strings, FitNesse passes the arguments as the literal texts "#a", "#b" and "#c".
Only if I pass at least one non-empty string are the others passed as empty.
e.g.:
| Test with A "dummy", B " " and C "${blank}" |
writes "dummy" into field1 and empty strings into the fields 2 and 3.
How can I achieve that all arguments pass as empty?
Thanks in advance!
Try passing the value as literal text:
|!- Test with A " ", B " " and C " "-!|
Although, to me, it sounds like you should use a Decision Table rather than a Scenario Table. But that's just me :)

Way to check if a string satisfies a BNF

I have created a BNF for a certain language and want to check if a certain input is valid for that BNF. For instance, if I have a BNF like
<palindrome> ::= a <palindrome> a | b <palindrome> b |
c <palindrome> c | d <palindrome> d |
e <palindrome> e | ...
| z <palindrome> z
<palindrome> ::= <letter>
<letter> ::= a | b | c | ... | y | z
The strings 'bcdcb' and 'hannah' should return true.
The string 'joe' should return false.
Can someone describe an algorithm that can do this?
This algorithm rejects 'joe' because it checks whether the first and last letters are the same, i.e. it recognizes palindromes. 'joe' is not a palindrome, so it is correct that it does not pass.
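For this particular grammar the check can be written directly as a recursive function that mirrors the two rules. This is a sketch, not a general BNF checker; for arbitrary context-free grammars a general parsing algorithm such as CYK or Earley would be the usual route. Note that the grammar as written only derives odd-length strings, so 'hannah' is accepted only if an empty alternative is added; the allow_empty flag below is an assumption introduced to show that difference.

```python
import string

LETTERS = set(string.ascii_lowercase)

def derives_palindrome(s, allow_empty=False):
    """Return True if s can be derived from <palindrome>.

    allow_empty=True adds a hypothetical empty alternative to the grammar,
    which is what even-length palindromes need.
    """
    if s == "":
        return allow_empty
    if len(s) == 1:
        return s in LETTERS  # <palindrome> ::= <letter>
    if s[0] == s[-1] and s[0] in LETTERS:
        # x <palindrome> x: strip the matching outer letters and recurse
        return derives_palindrome(s[1:-1], allow_empty)
    return False
```

Called with the defaults, the function accepts 'bcdcb' and rejects 'joe'; 'hannah' is accepted only with allow_empty=True.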

Formal algorithm to rewrite a grammar that has no left-recursion and shows right precedence

Is there a formal algorithm, or a set of steps, to rewrite a grammar so that it has no left recursion and encodes the correct operator precedence, similar to the simple algorithm for eliminating left recursion described on Wikipedia?
For example, given the following grammar:
<goal> ::= <expr>$
<expr> ::= <expr><op><expr>
| num
| id
<op> ::= +
| -
| *
| /
The desired output should be:
<expr> ::= <term><expr'>
<expr'> ::= +<term><expr'>
| epsilon
| -<term><expr'>
<term> ::= <factor><term'>
<term'> ::= *<factor><term'>
| epsilon
| /<factor><term'>
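To see why the desired form is convenient, here is a small recursive-descent sketch in Python that follows it rule by rule. Several things are assumptions made for the example: <factor> is not defined in the question, so it is taken to be num | id; the tail rules <expr'> and <term'> are written as loops; and the tree is built left-associatively as nested tuples.

```python
import re

TOKEN = re.compile(r"\d+|[A-Za-z_]\w*|[-+*/]")
OPS = {"+", "-", "*", "/"}

def tokenize(text):
    tokens = TOKEN.findall(text)
    # Reject input containing characters the token pattern does not cover.
    if "".join(tokens) != "".join(text.split()):
        raise SyntaxError("unexpected character in input")
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expr(self):
        # <expr> ::= <term><expr'>; the loop plays the role of <expr'>
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.tokens[self.pos]
            self.pos += 1
            node = (op, node, self.term())
        return node

    def term(self):
        # <term> ::= <factor><term'>; the loop plays the role of <term'>
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.tokens[self.pos]
            self.pos += 1
            node = (op, node, self.factor())
        return node

    def factor(self):
        # Assumed rule: <factor> ::= num | id
        tok = self.peek()
        if tok is None or tok in OPS:
            raise SyntaxError("expected a number or identifier")
        self.pos += 1
        return tok

def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("trailing input")
    return tree
```

Because <term> sits below <expr>, multiplication and division bind tighter than addition and subtraction: parsing "a + 2 * b" puts the "*" node underneath the "+" node.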

Converting ambiguous to unambiguous grammar for arithmetic expressions

I'm attempting to come up with an unambiguous grammar for arithmetic expressions to make an Earley parser faster, but I seem to be having trouble.
This is the given ambiguous grammar:
S -> E | S,S
E -> E+E | E-E | E*E | (E) | -E | V
V -> a | b | c
This is my attempt at making it unambiguous:
S -> S+E | S-E | E | (S+E) | (S-E) | (E)
E -> E*T | E
T -> -V | V
V -> a | b | c
It parses everything fine but there isn't any significant speedup as compared to using the ambiguous one.

Resources