I'm new to Bison. I've written a grammar rule for if, else if and else statements, but I get a reduce/reduce conflict. Can anyone help? I've tried everything I could find, but as I said, I'm new and I don't understand exactly what is happening.
Here's my code:
ifinstr: KW_IF expr_decl KW_THEN statements elseifinstr elseinstr KW_END
       ;
elseifinstr: %empty {$$ = "";}
           | elseifinstr KW_ELSE KW_IF expr_decl KW_THEN statement
           ;
elseinstr: %empty {$$ = "";}
         | KW_ELSE statement
         ;
I've also tried this solution, but got a shift/reduce conflict instead:
ifinstr: KW_IF expr_decl KW_THEN statements elseifinstr KW_END
       ;
elseifinstr: %empty {$$ = "";}
           | elseifinstr KW_ELSE KW_IF expr_decl KW_THEN statement
           | KW_ELSE statement
           ;
The problem is that the grammar you propose is ambiguous. (The first grammar, I mean. The proposed second solution is a different language.)
Languages which use an "else if" construct in order to cut down on the nesting of "if … end" brackets use a special token:
Ruby               Python                 VB                   Shell
----------------   -------------------    -----------------    --------------------
if n > 0           if n > 0:              If n > 0 Then        if ((n>0)); then
  puts "Greater"     print ("Greater")      Print "Greater"      echo Greater
elsif n == 0       elif n == 0:           ElseIf n = 0 Then    elif ((n==0)); then
  puts "Equal"       print ("Equal")        Print "Equal"        echo Equal
else               else:                  Else                 else
  puts "Less"        print ("Less")         Print "Less"         echo Less
end                                       End                  fi
Without the fused "else if", these expressions would be much more cumbersome:
Ruby               Python                 VB                   Shell
----------------   -------------------    -----------------    --------------------
if n > 0           if n > 0:              If n > 0 Then        if ((n>0)); then
  puts "Greater"     print ("Greater")      Print "Greater"      echo Greater
else               else:                  Else                 else
  if n == 0          if n == 0:             If n = 0 Then        if ((n==0)); then
    puts "Equal"       print ("Equal")        Print "Equal"        echo Equal
  else               else:                  Else                 else
    puts "Less"        print ("Less")         Print "Less"         echo Less
  end                                       End                  fi
end                                       End                  fi
This contrasts with languages like C, Java, and many others which do not require that "if" statements be terminated, and thus exhibit the "dangling else" shift-reduce conflict, which is always resolved in favour of the shift. In those languages, since the "else" clause simply attaches to the closest unmatched "if" clause, there is no need to provide a special "else-if" token.
Now, you are attempting to combine these two approaches by using a fused "else-if" without fusing it, and that simply leads to the same dangling else problem which the explicit bracketing was attempting to fix. Consider, for example:
if C1 then S1 else if C2 then S2 else if C3 then S3 else S4 end S5 end
Now, which of the following two does this represent?
if C1 then              if C1 then
  S1                      S1
else if C2 then         else
  S2                      if C2 then
else                        S2
  if C3 then              else if C3 then
    S3                      S3
  else                    else
    S4                      S4
  end                     end
  S5                      S5
end                     end
(With more time I might have found a simpler example. The above two interpretations differ in the circumstances in which S5 is executed.)
The simplest solution is to use some fused "else if" token, as in the various language examples above. (Or think up your own :-) )
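To make the effect of a fused token concrete, here is a minimal sketch in Python (not Bison) of a hand-written parser for such a language. The keyword spelling `elsif`, the function name `parse_if`, and the simplification that conditions and plain statements are single tokens are all assumptions made for illustration; they are not part of the original grammar.

```python
def parse_if(tokens, pos=0):
    """Parse `if C then S* (elsif C then S*)* [else S*] end` from a token list.

    Returns (tree, next_pos) where tree is ("if", [(cond, body), ...], else_body).
    """
    def expect(word, p):
        if p >= len(tokens) or tokens[p] != word:
            raise SyntaxError(f"expected {word!r} at token {p}")
        return p + 1

    def statements(p):
        # A statement is either a nested if ... end or a single opaque token.
        body = []
        while p < len(tokens) and tokens[p] not in ("elsif", "else", "end"):
            if tokens[p] == "if":
                node, p = parse_if(tokens, p)
            else:
                node, p = tokens[p], p + 1
            body.append(node)
        return body, p

    def branch(p):
        # One `C then S*` arm; the condition is a single token in this sketch.
        if p >= len(tokens):
            raise SyntaxError("expected a condition")
        cond = tokens[p]
        body, p = statements(expect("then", p + 1))
        return (cond, body), p

    arm, pos = branch(expect("if", pos))
    branches = [arm]
    # The fused keyword extends the current if; it never opens a new one.
    while pos < len(tokens) and tokens[pos] == "elsif":
        arm, pos = branch(pos + 1)
        branches.append(arm)
    else_body = None
    if pos < len(tokens) and tokens[pos] == "else":
        else_body, pos = statements(pos + 1)
    return ("if", branches, else_body), expect("end", pos)
```

Because `elsif` can only continue the current statement while `else` followed by `if` always opens a nested statement with its own `end`, each input has exactly one reading, so the parser never has to guess.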
Related
Parse the expression: IF i> i THEN i = i + i * i
using the following CFG definition of a small programming language,
S           → ASSIGNMENT$ | GOTO$ | IF$ | IO$
ASSIGNMENT$ → i = ALEX
GOTO$       → GOTO NUMBER
IF$         → IF CONDITION THEN S
            | IF CONDITION THEN S ELSE S
CONDITION   → ALEX = ALEX | ALEX ≠ ALEX | ALEX > ALEX
            | CONDITION AND CONDITION
            | CONDITION OR CONDITION
            | NOT CONDITION
IO$         → READ i | PRINT i
HINTS:
ALEX stands for algebraic expression
the names that end in $ are classes
the terminals are { = GOTO IF THEN ELSE ≠ > AND OR NOT READ PRINT }
whatever terminals are introduced in the definitions of i, ALEX, and NUMBER.
It looks to me like '$' is an abbreviation for 'STATEMENT'. E.g., "IF$" means "IF STATEMENT".
But regardless of what it "means", you can treat it as if it were an ordinary letter for the purpose of the exercise: "IF$" is just the name of a non-terminal that occurs in the grammar.
Ok, I thought there would be enough CS majors on here to check my pseudo code for my recursive descent parser. I developed it from this BNF
EXP ::= EXP + TERM | EXP - TERM | TERM
TERM ::= TERM * FACTOR | TERM/FACTOR | FACTOR
FACTOR ::= (EXP) | DIGIT
DIGIT ::= 0|1|2|3
Here's the pseudo code:
procedure exp()
    term()
    if token == '+'
        match('+')
        term()
    elseif token == '-'
        match('-')
        term()
    else
        break

procedure term()
    factor()
    if token == '*'
        match('*')
        factor()
    elseif token == '/'
        match('/')
        factor()
    else
        break

procedure factor()
    if token == '('
        match('(')
        exp()
        match(')')
    else
        digit()

procedure digit()
    if token == '0'
        match('0')
    elseif token == '1'
        match('1')
    elseif token == '2'
        match('2')
    else
        match('3')

match(t)
    if token == t
        advancetokenpointer
    else
        error
Is this correct? I was thinking I may need to have a return in each procedure, and I'm not sure of my procedure correctness either. Maybe include end procedure too? Anyways, thanks a lot! :-)
You are halfway there. Specifically, you are not accounting for the left-recursive parts of the grammar, as in "EXP ::= EXP...", or "TERM ::= TERM...".
Recursive descent is not well suited for left recursion anyway, but fortunately there are standard transformations you can perform on the grammar that will eliminate this kind of left recursion. As an example, the following grammar:
A ::= A x B | B
could be coded like this:
procedure A()
    B()
    repeat
        if token == 'x'
            match('x')
            B()
        else
            break
Moreover, the code for factor is not following the grammar correctly. Notice that in the first alternative, EXP is called recursively (this is a kind of recursion that recursive descent has no problem with), and you are calling factor instead. Also, you are matching the right parenthesis as if it were optional, while it is actually required. The same problem exists in the code for DIGIT: if none of 0, 1, or 2 matches, 3 must match.
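Putting these points together, the whole grammar could be sketched in Python roughly as follows. This is one possible rendering, not the only one: the class name `Parser`, the helper `parse`, and the choice to return nested tuples as a parse tree are assumptions made for illustration. Each left-recursive rule becomes a loop, which also keeps the operators left-associative.

```python
class Parser:
    def __init__(self, text):
        # Every non-blank character is one token in this tiny language.
        self.tokens = [ch for ch in text if not ch.isspace()]
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def match(self, expected):
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.peek()!r}")
        self.pos += 1

    def exp(self):
        # EXP ::= EXP + TERM | EXP - TERM | TERM, rewritten as TERM (op TERM)*
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.peek()
            self.match(op)
            node = (op, node, self.term())
        return node

    def term(self):
        # TERM ::= TERM * FACTOR | TERM / FACTOR | FACTOR, same rewrite
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.peek()
            self.match(op)
            node = (op, node, self.factor())
        return node

    def factor(self):
        # FACTOR ::= ( EXP ) | DIGIT; the closing parenthesis is required
        if self.peek() == "(":
            self.match("(")
            node = self.exp()
            self.match(")")
            return node
        return self.digit()

    def digit(self):
        # DIGIT ::= 0 | 1 | 2 | 3; anything else is an error
        tok = self.peek()
        if tok is None or tok not in "0123":
            raise SyntaxError(f"expected a digit 0-3, found {tok!r}")
        self.match(tok)
        return int(tok)


def parse(text):
    parser = Parser(text)
    tree = parser.exp()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing input {parser.peek()!r}")
    return tree
```

With this sketch, `parse("1-2-3")` groups as `(1-2)-3`, which is the grouping the left-recursive grammar describes.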
In ANTLR v3, syntactic predicates could be used to solve, e.g., the dangling else problem. ANTLR4 seems to accept grammars with similar ambiguities, but during parsing it reports these ambiguities (e.g., "line 2:29 reportAmbiguity d=0 (e): ambigAlts={1, 2}, input=..."). It produces a parse tree despite these ambiguities (by choosing the first alternative, according to the documentation). But what can I do if I want it to choose some other alternative? In other words, how can I explicitly resolve ambiguities?
For example, the dangling else problem:
prog
    : e EOF
    ;
e
    : 'if' e 'then' e ('else' e)?
    | INT
    ;
With this grammar, from the input "if 1 then if 2 then 3 else 4", it builds this parse tree: (prog (e if (e 1) then (e if (e 2) then (e 3) else (e 4))) ).
What can I do, if for some reason, I want the other tree: (prog (e if (e 1) then (e if (e 2) then (e 3)) else (e 4)) ) ?
Edit: for a more complex example, see "What to use in ANTLR4 to resolve ambiguities in more complex cases (instead of syntactic predicates)?"
You can explicitly disallow an alternative in this type of situation by using a semantic predicate.
('else' e | {_input.LA(1) != ELSE}?)
You should be able to use the ?? operator instead of ? to prefer associating the else with the outermost if. However, performance will suffer substantially. Another option is distinguishing matched if/else pairs separately from an unmatched if.
ifStatement
    : 'if' expression 'then' (statement | block) 'else' (statement | block)
    | 'if' expression 'then' (statementNoIf | block)
    ;
I am learning how to use the Scala Parser Combinators, which by the way are lovely to work with.
Unfortunately I am getting a compilation error. I have read, and recreated the worked examples from: http://www.artima.com/pins1ed/combinator-parsing.html <-- from Chapter 31 of Programming in Scala, First Edition, and a few other blogs.
I've reduced my code to a much simpler version to demonstrate my problem. I am working on a parser that would parse the following samples
if a then x else y
if a if b then x else y else z
with a little extra that the conditions can have an optional "/1,2,3" syntax
if a/1 then x else y
if a/2,3 if b/3,4 then x else y else z
So I have ended up with the following code
def ifThenElse: Parser[Any] =
  "if" ~> condition ~ inputList ~ yes ~> "else" ~ no
def condition: Parser[Any] = ident
def inputList: Parser[Any] = opt("/" ~> repsep(input, ","))
def input: Parser[Any] = ident
def yes: Parser[Any] = "then" ~> result | ifThenElse
def no: Parser[Any] = result | ifThenElse
def result: Parser[Any] = ident
Now I want to add some transformations. I now get a compilation error on the second ~ in the case:
def ifThenElse: Parser[Any] =
  "if" ~> condition ~ inputList ~ yes ~> "else" ~ no ^^ {
    case c ~ i ~ y ~ n => null
               ^ constructor cannot be instantiated to expected type; found : SmallestFailure.this.~[a,b] required: String
When I change the code to
  "if" ~> condition ~ inputList ~ yes ~> "else" ~ no ^^ {
    case c ~ i => println("c: " + c + ", i: " + i)
I expected it not to compile, but it did. I thought I would need a variable for each clause. When executed (using parseAll) parsing "if a then b else c" produces "c: else, i: c". So it seems like c and i are the tail of the string.
I don't know if it is significant, but none of the example tutorials seem to have an example with more than two variables being matched, and this is matching four.
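You do not have to match the "else". In your version, ~> has the same precedence as ~ and associates to the left, so it discards everything parsed before it (condition, inputList and yes); only "else" ~ no reaches the ^^, which is why c and i turned out to be the tail of the input. Parenthesize yes <~ "else" so that only the keyword is dropped: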
def ifThenElse: Parser[Any] =
  "if" ~> condition ~ inputList ~ (yes <~ "else") ~ no ^^ {
    case c ~ i ~ y ~ n => null
  }
Works as expected.
I am a little bit confused about ">>" in Scala. Daniel said in "Scala parser combinators parsing xml?" that it could be used to parameterize the parser based on the result of a previous parser. Could someone give me an example or a hint? I have already read the Scaladoc but still do not understand it.
thanks
As I said, it serves to parameterize a parser, but let's walk through an example to make it clear.
Let's start with a simple parser that parses a number followed by a word:
def numberAndWord = number ~ word
def number = "\\d+".r
def word = "\\w+".r
Under RegexParsers, this will parse stuff like "3 fruits".
Now, let's say you also want a list of what these "n things" are. For example, "3 fruits: banana, apple, orange". Let's try to parse that to see how it goes.
First, how do I parse "N" things? As it happens, there's a repN method:
def threeThings = repN(3, word)
That will parse "banana apple orange", but not "banana, apple, orange". I need a separator. There's repsep that provides that, but that won't let me specify how many repetitions I want. So, let's provide the separator ourselves:
def threeThings = word ~ repN(2, "," ~> word)
OK, that works. We can write the whole example now, for three things, like this:
def listOfThings = "3" ~ word ~ ":" ~ threeThings
def word = "\\w+".r
def threeThings = word ~ repN(2, "," ~> word)
That kind of works, except that I'm fixing "N" at 3. I want to let the user specify how many. And that's where >>, also known as into (and, yes, it is flatMap for Parser), comes in. First, let's change threeThings:
def things(n: Int) = n match {
  case 1 => word ^^ (List(_))
  case x if x > 1 => word ~ repN(x - 1, "," ~> word) ^^ { case w ~ l => w :: l }
  case x => err("Invalid repetitions: "+x)
}
This is slightly more complicated than you might have expected, because I'm forcing it to return Parser[List[String]]. But how do I pass a parameter to things? I mean, this won't work:
def listOfThings = number ~ word ~ ":" ~ things(/* what do I put here?*/)
But we can rewrite that like this:
def listOfThings = (number ~ word <~ ":") >> {
  case n ~ what => things(n.toInt)
}
That is almost good enough, except that I have now lost n and what: it only returns "List(banana, apple, orange)", not how many there ought to be, and what they are. I can keep them like this:
def listOfThings = (number ~ word <~ ":") >> {
  case n ~ what => things(n.toInt) ^^ { list => new ~(n.toInt, new ~(what, list)) }
}
def number = "\\d+".r
def word = "\\w+".r
def things(n: Int) = n match {
  case 1 => word ^^ (List(_))
  case x if x > 1 => word ~ repN(x - 1, "," ~> word) ^^ { case w ~ l => w :: l }
  case x => err("Invalid repetitions: "+x)
}
Just a final comment. You might have asked yourself "what do you mean flatMap? Isn't that a monad/for-comprehension thingy?" Why, yes, and yes! :-) Here's another way of writing listOfThings:
def listOfThings = for {
  nOfWhat <- number ~ word <~ ":"
  n ~ what = nOfWhat
  list <- things(n.toInt)
} yield new ~(n.toInt, new ~(what, list))
I'm not doing n ~ what <- number ~ word <~ ":" because that uses filter or withFilter in Scala, which is not implemented by Parsers. But here's yet another way of writing it, which doesn't have exactly the same semantics but produces the same results:
def listOfThings = for {
  n <- number
  what <- word
  _ <- ":" : Parser[String]
  list <- things(n.toInt)
} yield new ~(n.toInt, new ~(what, list))
This might even give one to think that maybe the claim that "monads are everywhere" might have something to it. :-)
The method >> takes a function that is given the result of the parser and uses it to construct a new parser. As stated, this can be used to parameterize a parser on the result of a previous parser.
Example
The following parser parses a line with n + 1 integer values. The first value n states the number of values to follow. This first integer is parsed and then the result of this parse is used to construct a parser that parses n further integers.
Parser definition
The following line assumes that you can parse an integer with parseInt: Parser[Int]. It first parses an integer value n and then uses >> to parse n additional integers, which form the result of the parser. So the initial n is not returned by the parser (though it is the size of the returned list).
def intLine: Parser[Seq[Int]] = parseInt >> (n => repN(n,parseInt))
Valid inputs
1 42
3 1 2 3
0
Invalid inputs
0 1
1
3 42 42
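For readers who want to see the mechanics without the Scala library, the same idea can be sketched in plain Python. This is not the Scala API: `integer`, `rep_n`, `into`, `int_line` and `parse_all` are hypothetical stand-ins written for this illustration, and a "parser" here is assumed to be just a function from (tokens, position) to either (value, new position) or None on failure.

```python
def integer(tokens, pos):
    # Succeeds on one all-digit token and returns its integer value.
    if pos < len(tokens) and tokens[pos].isdigit():
        return int(tokens[pos]), pos + 1
    return None


def rep_n(n, parser):
    # Builds a parser that runs `parser` exactly n times in sequence.
    def run(tokens, pos):
        values = []
        for _ in range(n):
            result = parser(tokens, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return run


def into(parser, make_next):
    # Plays the role of >>: the result of the first parser is handed to
    # make_next, which returns the parser to run on the remaining input.
    def run(tokens, pos):
        result = parser(tokens, pos)
        if result is None:
            return None
        value, pos = result
        return make_next(value)(tokens, pos)
    return run


# Counterpart of: parseInt >> (n => repN(n, parseInt))
int_line = into(integer, lambda n: rep_n(n, integer))


def parse_all(parser, line):
    # Like parseAll: succeeds only if the whole line is consumed.
    tokens = line.split()
    result = parser(tokens, 0)
    if result is None or result[1] != len(tokens):
        return None
    return result[0]
```

Under these assumptions, `parse_all(int_line, "3 1 2 3")` yields the three values after the count, while a line such as "3 42 42" fails because the third integer is missing.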