I have the following grammar:
expr : factor op ;
op
: '+' factor op
| // Blank rule for left-recursion elimination
;
factor
: NUM
| '(' expr ')'
;
NUM : ('0'..'9')+ ;
I supply 2 + 3, using expr as the start rule. The resulting parse tree from ANTLR is correct; however, I think I am misunderstanding the shift-reduce methods it uses.
I would expect the following to happen:
Step # | Parse Stack   | Lookahead | Unscanned | Action
1      |               | NUM       | + 3       | Shift
2      | NUM           | +         | 3         | Reduce by factor -> NUM
3      | factor        | +         | 3         | Shift a 'null'?
4      | factor null   | +         | 3         | Reduce by op -> null
5      | factor op     | +         | 3         | Reduce by expr -> factor op
6      | expr          | +         | 3         | Shift
7      | expr +        | NUM       |           | Shift
8      | expr + NUM    |           |           | Reduce by factor -> NUM
9      | expr + factor |           |           | ERROR (no production)
I would've expected an error to occur at step 3, wherein the parser would shift a null onto the stack as a prerequisite to reducing the factor "up" to an expr.
Does ANTLR only shift a null when it's strictly "required" because the resulting reduce will satisfy the grammar?
It seems to me that ANTLR doesn't use a shift-reduce parser; the generated parsers are recursive descent using an arbitrary amount of lookahead.
The steps of the parser would be something like:
Rule          | Consumed  | Input
--------------+-----------+------
expr          |           | 2 + 3
..factor      |           | 2 + 3
....NUM       | 2         | + 3
..op          | 2         | + 3
....'+'       | 2 +       | 3
....factor    | 2 +       | 3
......NUM     | 2 + 3     |
....op        | 2 + 3     |
......(empty) | 2 + 3     |
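The descent traced above can be reproduced with a small hand-written recursive-descent parser. This is only a sketch of the technique, not the code ANTLR generates: the names tokenize, Parser and parse are made up for illustration, and a single token of lookahead is assumed to be enough for this grammar.

```python
import re

def tokenize(text):
    # NUM : ('0'..'9')+ ; '+', '(' and ')' are literal tokens.
    # Whitespace (and, in this sketch, any other character) is silently skipped.
    return re.findall(r"\d+|[+()]", text)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.trace = []  # (rule, depth) pairs, in the order the rules are entered

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def match(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def expr(self, depth=0):
        # expr : factor op ;
        self.trace.append(("expr", depth))
        self.factor(depth + 1)
        self.op(depth + 1)

    def op(self, depth):
        # op : '+' factor op | /* empty */ ;
        self.trace.append(("op", depth))
        if self.peek() == "+":  # the lookahead token selects the alternative
            self.match("+")
            self.factor(depth + 1)
            self.op(depth + 1)
        else:
            # The empty alternative consumes nothing; no "null" token is involved.
            self.trace.append(("(empty)", depth + 1))

    def factor(self, depth):
        # factor : NUM | '(' expr ')' ;
        self.trace.append(("factor", depth))
        tok = self.peek()
        if tok == "(":
            self.match("(")
            self.expr(depth + 1)
            self.match(")")
        elif tok is not None and tok.isdigit():
            self.match()
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

def parse(text):
    parser = Parser(tokenize(text))
    parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing input at {parser.peek()!r}")
    return parser.trace

if __name__ == "__main__":
    for rule, depth in parse("2 + 3"):
        print(".." * depth + rule)
```

The printed rule names follow the same nesting as the table, minus the token rows (NUM and '+'), which this sketch consumes inside match.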
From what I read about ANTLR, you could achieve the same result with the following changes to the original grammar:
expr: factor op*;
op: '+' factor;
...
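In a hand-written recursive-descent parser, the op* form maps naturally onto a loop instead of a recursive call. The sketch below assumes that shape; the function names are hypothetical, and summing the operands is added only so the result can be checked (the original grammar has no actions).

```python
import re

def parse_expr(tokens, pos=0):
    """expr : factor op* ;  returns (value, next position)."""
    value, pos = parse_factor(tokens, pos)
    # op* becomes a loop: keep going while the next token starts another op.
    while pos < len(tokens) and tokens[pos] == "+":
        rhs, pos = parse_factor(tokens, pos + 1)  # op : '+' factor ;
        value += rhs
    return value, pos

def parse_factor(tokens, pos):
    """factor : NUM | '(' expr ')' ;"""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        value, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return value, pos + 1
    if tok.isdigit():
        return int(tok), pos + 1
    raise SyntaxError(f"unexpected token {tok!r}")

def evaluate(text):
    tokens = re.findall(r"\d+|[+()]", text)  # NUM, '+', '(' and ')'
    value, pos = parse_expr(tokens)
    if pos != len(tokens):
        raise SyntaxError(f"trailing input at {tokens[pos]!r}")
    return value
```

For example, evaluate("2 + 3") goes through parse_factor once, then through one iteration of the loop.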
I am trying to create a parser for the following language:
The input language contains conditional statements of the form if ... then ...
else and if ... then, separated by a semicolon (;). Conditional
statements contain identifiers, the comparison signs <, >, =, hexadecimal
numbers, and the assignment sign (:=). A hexadecimal number is a sequence
of digits and the symbols a, b, c, d, e, f that starts with a digit (for
example, 89, 45ac, 0abc).
I ended up with this version of the grammar, after removing left recursion and ambiguity:
S -> if E then S S' | I := H
S' -> else S S'' | ; S | $
S'' -> ; S | $
I -> p | q
E -> HE'
E' -> >HE' | <HE' | =HE'
H -> DH'
H' -> LH' | DH' | EPS
L -> a | b | c | d | e | f
D -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Is it possible to build an LL(1) parser from this grammar? I am confused by the rules whose right-hand side contains two non-terminals (for example HE').
I have the following grammar for expressions involving binary operators (| ^ & << >> + - * /):
expression : expression BITWISE_OR xor_expression
| xor_expression
xor_expression : xor_expression BITWISE_XOR and_expression
| and_expression
and_expression : and_expression BITWISE_AND shift_expression
| shift_expression
shift_expression : shift_expression LEFT_SHIFT arith_expression
| shift_expression RIGHT_SHIFT arith_expression
| arith_expression
arith_expression : arith_expression PLUS term
| arith_expression MINUS term
| term
term : term TIMES factor
| term DIVIDE factor
| factor
factor : NUMBER
| LPAREN expression RPAREN
This seems to work fine, but doesn't quite match my needs because it allows outer parentheses e.g. ((3 + 4) * 2).
How can I change the grammar to disallow outer parentheses, while still allowing them within expressions e.g. (3 + 4) * 2, even redundantly e.g. (3 * 4) + 2?
Add this rule to your grammar:
top_level : expression BITWISE_OR xor_expression
| xor_expression BITWISE_XOR and_expression
| and_expression BITWISE_AND shift_expression
| shift_expression LEFT_SHIFT arith_expression
| shift_expression RIGHT_SHIFT arith_expression
| arith_expression PLUS term
| arith_expression MINUS term
| term TIMES factor
| term DIVIDE factor
| NUMBER
and use top_level where you want expressions without outer parens.
Can someone explain how this concept works?
I have one question, but I have no idea how to construct the truth table.
f(A,B,C) = AB + A'C
The answer given was ABC + ABC' + A'BC + A'B'C
And I have no idea how they got there. :-(
1. Create a column for each of the inputs, each intermediate function, and the final function:
A B C | AB | A' | A'C | AB + A'C
------+----+----+-----+---------
      |    |    |     |
      |    |    |     |
      |    |    |     |
      |    |    |     |
      |    |    |     |
      |    |    |     |
      |    |    |     |
      |    |    |     |
2. Enumerate all input possibilities, and start filling in the intermediate function values and then the final function value:
A B C | AB | A' | A'C | AB + A'C
------+----+----+-----+---------
0 0 0 | 0  | 1  | 0   | 0
0 0 1 |    |    |     |
0 1 0 |    |    |     |
0 1 1 |    |    |     |
1 0 0 |    |    |     |
1 0 1 |    |    |     |
1 1 0 |    |    |     |
1 1 1 |    |    |     |
3. Now, you finish the truth table.
Update per OP's edit of question:
The "answer given" can be reduced as follows using Boolean Algebra:
  ABC + ABC' + A'BC + A'B'C
= AB(C + C') + A'C(B + B')
= AB + A'C
...which is the same as the given f(A,B,C). Not sure why ABC + ABC' + A'BC + A'B'C would be considered to be the "answer," but this does show equivalence between the two formulae.
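One way to check this, and to see where the four-term form may come from, is to enumerate the truth table mechanically. The sketch below (function names are made up for illustration) lists one product term per row where the function is 1. That suggests the "answer given" is simply the sum-of-minterms form read off the truth table, although the original exercise text is not available to confirm it.

```python
from itertools import product

def f(a, b, c):
    # f(A,B,C) = AB + A'C
    return bool((a and b) or ((not a) and c))

def given_answer(a, b, c):
    # ABC + ABC' + A'BC + A'B'C
    return bool(
        (a and b and c)
        or (a and b and not c)
        or ((not a) and b and c)
        or ((not a) and (not b) and c)
    )

def minterms(func):
    """Return one product term for every truth-table row where func is 1."""
    terms = []
    for bits in product([0, 1], repeat=3):  # rows 000, 001, ..., 111
        if func(*bits):
            # A variable appears complemented (') when its bit is 0 in this row.
            terms.append("".join(n if bit else n + "'" for n, bit in zip("ABC", bits)))
    return terms

if __name__ == "__main__":
    print(" + ".join(minterms(f)))
```

Both functions produce the same four rows, so they agree on every input.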
Learning F# by writing blackjack. I have these types:
type Suit =
| Heart = 0
| Spade = 1
| Diamond = 2
| Club = 3
type Card =
| Ace of Suit
| King of Suit
| Queen of Suit
| Jack of Suit
| ValueCard of int * Suit
I have this function (ignoring for now that aces can have 2 different values):
let NumericValue =
function | Ace(Suit.Heart) | Ace(Suit.Spade) | Ace(Suit.Diamond) | Ace(Suit.Club) -> 11
| King(Suit.Heart) | King(Suit.Spade)| King(Suit.Diamond) | King(Suit.Club) | Queen(Suit.Heart) | Queen(Suit.Spade)| Queen(Suit.Diamond) | Queen(Suit.Club) | Jack(Suit.Heart) | Jack(Suit.Spade)| Jack(Suit.Diamond) | Jack(Suit.Club) -> 10
| ValueCard(num, x) -> num
Is there a way I can include a range or something? Like [Ace(Suit.Heart) .. Ace(Suit.Club)]. Or even better Ace(*)
You want a wildcard pattern. The spec (§7.4) says:
The pattern _ is a wildcard pattern and matches any input.
let numericValue = function
| Ace _ -> 11
| King _
| Queen _
| Jack _ -> 10
| ValueCard(num, _) -> num
While implementing a Java regular expression for URL based on the URL BNF published by W3C, I've failed to understand the search part. As quoted:
httpaddress   h t t p : / / hostport [ / path ] [ ? search ]
search        xalphas [ + search ]
xalphas       xalpha [ xalphas ]
xalpha        alpha | digit | safe | extra | escape
alpha         a | b | c | d | e | f | g | h | i | j | k |
              l | m | n | o | p | q | r | s | t | u | v |
              w | x | y | z | A | B | C | D | E | F | G |
              H | I | J | K | L | M | N | O | P | Q | R |
digit         0 |1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
safe          $ | - | _ | # | . | & | + | -
extra         ! | * | " | ' | ( | ) | ,
The search rule says it is a sequence of xalphas separated by plus signs.
But xalphas can itself contain plus signs, since + is listed under safe.
So, as I understand it, the rule should simply be:
search xalphas
Where am I wrong here?
That's pretty clearly a mistake (+ is a reserved delimiter for URIs), but the BNF you're linking to seems to be out of date. It's probably best to use the grammar collected in Appendix A of RFC 3986.
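If you still want to follow the old BNF, here is a rough sketch of the search rule with + treated purely as a delimiter. It is written in Python rather than Java to keep it short, and it makes several assumptions: + is dropped from safe, escape is taken to be % followed by two hex digits (that rule is not in the excerpt above), and alpha covers the full A-Z range even though the quoted list stops at R.

```python
import re

# xalpha as quoted above, minus '+' (assumption: '+' only separates xalphas).
# escape is assumed to be '%' followed by two hex digits.
XALPHA = r"""(?:[A-Za-z0-9$\-_#.&!*"'(),]|%[0-9A-Fa-f]{2})"""
XALPHAS = XALPHA + "+"  # xalphas : xalpha [ xalphas ]  -> one or more xalpha

# search : xalphas [ + search ]  -> xalphas separated by literal plus signs
SEARCH = re.compile(rf"{XALPHAS}(?:\+{XALPHAS})*")

def is_search(text):
    """Return True if text matches the search rule under the assumptions above."""
    return SEARCH.fullmatch(text) is not None
```

With + removed from the character class the separator becomes meaningful: "a+b" is accepted as two xalphas, while "a++b" is rejected because the middle xalphas would be empty. The pattern should carry over to java.util.regex with only the string quoting changed.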