I'm currently implementing an LR(k) parser interpreter, just for fun.
I'm trying to implement the precedence and associativity.
I got a little stuck when it came to how to assign associativity and precedence for the 'action' part i.e. what the precedence and associativity should be for the reduction.
If we have a production
E ->
| E + E { action1 }
| E * E { action2 }
| (E) { action3 }
| ID { action4 }
it should be quite clear that action1 should have the same associativity and precedence as +
and action2 should have the same as *. But in general we cannot just assume that a rule in a production has only one symbol which has a precedence. A toy example:
E -> E + E - E { action }
where - and + are some arbitrary operators, having some precedence and associativity. Should the action be associated with -, because it precedes the last E?
I know the rules for how to choose between shift/reduce; that is not what I am asking about.
The classic precedence algorithm, as implemented by yacc (and many derivatives), uses the last terminal in each production to define its default precedence. That's not always the desired precedence for the production, so parser generators typically also provide their users with a mechanism for explicitly specifying the precedence of a production.
This precedence model has proven to be useful, and while it is not without its problems -- see below -- it is probably the best implementation for a simple parser generator, if only because its quirks are at least documented.
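As a rough illustration of the default rule and the usual shift/reduce resolution, here is a minimal Python sketch. The table contents and the names `PRECEDENCE`, `TERMINALS`, `production_precedence` and `resolve_shift_reduce` are made up for this example; it is not code from yacc or any particular generator.

```python
# Hypothetical precedence table: terminal -> (level, associativity).
# A higher level binds more tightly.
PRECEDENCE = {
    "+": (1, "left"),
    "-": (1, "left"),
    "*": (2, "left"),
    "^": (3, "right"),
}

# Hypothetical terminal set; every other symbol is treated as a non-terminal.
TERMINALS = {"+", "-", "*", "^", "(", ")", "ID"}


def production_precedence(rhs, explicit=None):
    """Return (level, assoc) for a production, or None if it has no precedence.

    `explicit` plays the role of a %prec-style override. Without it, the
    precedence is taken from the last terminal on the right-hand side.
    """
    if explicit is not None:
        return PRECEDENCE.get(explicit)
    last_terminal = next((s for s in reversed(rhs) if s in TERMINALS), None)
    return PRECEDENCE.get(last_terminal)


def resolve_shift_reduce(rule_prec, lookahead):
    """Return 'shift', 'reduce', 'error', or None if the conflict stays unresolved."""
    token_prec = PRECEDENCE.get(lookahead)
    if rule_prec is None or token_prec is None:
        return None  # no precedence on one side: the conflict is reported as usual
    rule_level, _ = rule_prec
    token_level, token_assoc = token_prec
    if rule_level > token_level:
        return "reduce"
    if rule_level < token_level:
        return "shift"
    # Same level: associativity decides.
    return {"left": "reduce", "right": "shift", "nonassoc": "error"}[token_assoc]
```

Under these assumptions, the toy production `E -> E + E - E` takes its precedence from `-`, the last terminal, while `(E)` has no precedence at all because `)` is not in the table.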
This convention has perpetuated the idea that precedence is a feature of terminals (or "operators"). That's valid if you're building an operator-precedence parser but it does not correspond to LR(k) parsing. At best, it's a crude approximation, which can be highly misleading.
If the underlying grammar really is an operator precedence grammar -- that is, no production has two consecutive non-terminals and the imputed precedence relationships are unambiguous -- then it might be an acceptable approximation, although it's worth noting that operator precedence relationships are not transitive so they cannot usually be summarised as monotonic comparisons. But many uses of yacc-style precedence are well outside of this envelope, and can even lead to serious grammar bugs.
The problem is that modelling precedence as a simple transitive comparison between tokens can lead to the precedence declarations being used to disambiguate (and thereby hide) unrelated conflicts. In short, the use of precedence declarations in LR parsing is basically a hack. It's a useful hack, and sometimes beneficial -- as you say, it can reduce the number of states and the frequency of unit reductions -- but it needs to be approached with caution.
Indeed, some people have proposed an alternative model of precedence based on grammar rewriting. (See, for example, the 2013 paper by Ali Afroozeh et al., “Safe Specification of Operator Precedence Rules”). This model is considerably more precise, but partially as a consequence of this precision, it is not as amenable to (mis)-use for other purposes, such as the resolution of the dangling-else conflict.
While searching for Bison grammars, I found this example of a C grammar:
https://www.lysator.liu.se/c/ANSI-C-grammar-y.html
logical_and_expression
: inclusive_or_expression
| logical_and_expression AND_OP inclusive_or_expression
;
logical_or_expression
: logical_and_expression
| logical_or_expression OR_OP logical_and_expression
;
I didn't understand the reason for a rule for each logical operation. Is there an advantage over this construction below?
binary_expression
: object // imagine it can be bool, int, real ...
| binary_expression AND_OP binary_expression
| binary_expression OR_OP binary_expression
;
The grammar you quote is unambiguous.
The one you suggest is ambiguous, although yacc/bison allow you to use precedence rules to resolve the ambiguities.
There are some advantages to using a grammar which makes operator precedence explicit:
It is a precise description of the language. Precedence rules are not part of the grammatical formalism and can be difficult to reason about. In particular, there is no general way to prove that they do what is desired.
The grammar is self-contained. Ambiguous grammars can only be understood with the addition of the precedence rules. This is particularly important for grammars used in language standards but it generally affects attempts to automatically build other syntax-based tools.
Explicit grammars are more general. Not all operator restrictions can be easily described with numeric precedence comparisons.
Precedence rules can hide errors in the grammar, by incorrectly resolving a shift-reduce conflict elsewhere in the grammar which happens to use some of the same tokens. Since the resolved conflicts are not reported, the grammar writer is not warned of the problem.
On the other hand, precedence rules do have some advantages:
The precedence table compactly describes operator precedence, which is useful for quick reference.
The resulting grammar requires fewer unit productions, slightly increasing parse speed. (Usually not noticeable, but still...)
Some conflicts are much easier to resolve with precedence declarations, although understanding how the conflict is resolved may not be obvious. (The classic example is the dangling-else ambiguity.) Such cases have little or nothing to do with the intuitive understanding of operator precedence, so the use of precedence rules is a bit of a hack.
The total size of the grammar is not really affected by using precedence rules. As mentioned, the precedence rules avoid the need for unit productions, but every unit production corresponds to one precedence declaration so the total number of lines is the same. There are fewer non-terminals, but non-terminals cost little; the major annoyance in yacc/bison is declaring all the semantic types, but that is easy to automate.
I'm experimenting a bit with parser generators in Haskell, using Happy here. I used to use parser combinators, such as Parsec, and one thing I can't achieve now is the dynamic addition (during execution) of new externally defined operators. For example, Haskell has some basic operators, but we can add more, giving them precedence and fixity. So I would like to know how to reproduce this with Happy, following the Haskell design (see the example code to be parsed below), or, if it is not trivially feasible, whether it should perhaps be done with parser combinators instead.
-- Adding the new operator
infixr 5 ++
(++) :: [a] -> [a] -> [a]
[] ++ ys = ys
(x:xs) ++ ys = x : xs ++ ys
-- Using the new operator taking into consideration fixity and precedence during parsing
example = "Hello, " ++ "world!"
Haskell only allows a few precedence levels. So you don't strictly need a dynamic grammar; you could just write out the grammar using precedence-level token classes instead of individual operators, leaving the lexer with the problem of associating a given symbol with a given precedence level.
In effect, that moves the dynamic addition of operators to the lexer. That's a slightly uncomfortable design decision, although in some cases it may not be too difficult to implement. It's uncomfortable design because it requires semantic feedback to the lexer; at a minimum, the lexer needs to consult the symbol table to figure out what type of token it is looking at. In the case of Haskell, at least, this is made more uncomfortable by the fact that fixity declarations are scoped, so in order to track fixity information, the lexer would also need to understand scoping rules.
In practice, most languages which allow program text to define operators and operator precedence work in precisely the same way the Haskell compiler does: expressions are parsed by the grammar into a simple list of items (where parenthesized subexpressions count as a single item), and in a later semantic analysis the list is rearranged into an actual tree taking into account precedence and associativity rules, using a simple version of the shunting yard algorithm. (It's a simple version because it doesn't need to deal with parenthesized subconstructs.)
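A minimal Python sketch of that later rearrangement step follows. The `FIXITY` table and the function name `resolve` are hypothetical, and the sketch makes no claim about how any real Haskell compiler implements it. The input is the flat list the grammar would produce, with operands at even positions and operators at odd positions.

```python
# Hypothetical fixity table: operator -> (precedence, associativity).
# In a real compiler this would be filled in from the fixity declarations in scope.
FIXITY = {
    "+": (6, "left"),
    "*": (7, "left"),
    "++": (5, "right"),
    "^": (8, "right"),
}


def resolve(items):
    """Turn a flat [operand, op, operand, op, ...] list into a nested tuple tree."""
    operands, operators = [], []

    def reduce_top():
        # Combine the two most recent operands with the most recent operator.
        op = operators.pop()
        right = operands.pop()
        left = operands.pop()
        operands.append((op, left, right))

    for index, item in enumerate(items):
        if index % 2 == 0:  # even positions hold operands
            operands.append(item)
            continue
        prec, assoc = FIXITY[item]
        # Reduce while the operator on the stack binds at least as tightly.
        # Simplification: mixing left- and right-associative operators of the
        # same level is not rejected here, although Haskell would reject it.
        while operators:
            top_prec, _ = FIXITY[operators[-1]]
            if top_prec > prec or (top_prec == prec and assoc == "left"):
                reduce_top()
            else:
                break
        operators.append(item)
    while operators:
        reduce_top()
    return operands[0]
```

With this table, `resolve(["a", "+", "b", "*", "c"])` yields `("+", "a", ("*", "b", "c"))`, and `resolve(["x", "++", "y", "++", "z"])` nests to the right as `("++", "x", ("++", "y", "z"))`.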
There are several reasons for this design decision:
As mentioned above, for the lexer to figure out what the precedence of a symbol is (or even if the symbol is an operator with precedence) requires a close collaboration between the lexer and the parser, which many would say violates separation of concerns. Worse, it makes it difficult or impossible to use parsing technologies without a small fixed lookahead, such as GLR parsers.
Many languages have more precedence levels than Haskell. In some cases, even the number of precedence levels is not defined by the grammar. In Swift, for example, you can declare your own precedence levels, and you define a level not with a number but with a comparison to another previously defined level, leading to a partial order between precedence levels.
IMHO, that's actually a better design decision than Haskell, in part because it avoids the ambiguity of a precedence level having both left- and right-associative operators, but more importantly because the relative precedence declarations both avoid magic numbers and allow the parser to flag the ambiguous use of operators from different modules. In other words, it does not force a precedence declaration to mechanically apply to any pair of totally unrelated operators; in this sense it makes operator declarations easier to compose.
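To make the partial-order idea concrete, here is a small Python sketch in which each declaration says only that one precedence group binds tighter than another. The group names and the functions `higher_than_closure` and `compare` are invented for illustration; this is not Swift's actual implementation.

```python
def higher_than_closure(declarations):
    """Build the transitive 'binds tighter than' relation.

    `declarations` is an iterable of (group, lower_group) pairs, each meaning
    that `group` binds more tightly than `lower_group`.
    """
    higher = {}
    for tight, loose in declarations:
        higher.setdefault(tight, set()).add(loose)
        higher.setdefault(loose, set())
    changed = True
    while changed:  # naive fixed-point iteration, fine for small tables
        changed = False
        for lowers in higher.values():
            extra = set().union(*(higher[group] for group in lowers)) - lowers
            if extra:
                lowers |= extra
                changed = True
    return higher


def compare(higher, a, b):
    """Return 'same', 'tighter', 'looser', or None when a and b are unrelated."""
    if a == b:
        return "same"
    if b in higher.get(a, ()):
        return "tighter"
    if a in higher.get(b, ()):
        return "looser"
    return None  # no declared relationship: the parser can flag the ambiguity
```

The `None` result is the point of the design: two operators whose groups were never related to each other are reported as ambiguous rather than being silently ordered by a magic number.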
The grammar is much simpler, and arguably easier to understand since most people anyway rely on precedence tables rather than analysing grammar productions to figure out how operators interact with each other. In that sense, having precedence set by the grammar is more a distraction than documentation. See the C++ grammar as a good example of why precedence tables are easier to read than grammars.
On the other hand, as the C++ grammar also illustrates, a grammar is a lot more general than simple precedence declarations because it can express asymmetric precedences. (The grammar doesn't always express these gracefully, but they can be expressed.) A classic example of an asymmetric precedence is a lambda construct (λ ID expr) which binds very loosely to the right and very tightly to the left: the expected parse of a ∘ λ b b ∘ a does not ever consult the associativity of ∘ because the λ comes between them.
In practice, there is very little cost to building the tree later. The algorithm to build the tree is well-known, simple and cheap.
Yes, I'm one of those insane people who have a parser-generator project. Minimal-LR(1) with operator-precedence was fairly straightforward. GLR support is next, preferably without making a mess of the corner cases around precedence and associativity (P&A).
Suppose you have an R/R conflict between rules with different precedence levels. A deterministic parser can safely choose the (first) rule of highest precedence. A parser designed to handle local ambiguity might not be sure, especially if the involved rules reduce to different non-terminals.
Suppose you have a R/R conflict between rules with- and without- precedence characteristics. A deterministic parser can reasonably choose the former. If you ask for GLR, do you mean to entertain both, or should the former clearly dominate the latter? Or is this scenario sufficiently weird as to justify rejecting the grammar?
Suppose you have an S/R/R conflict where only some of the participating rules have precedence, and maybe the look-ahead token does or doesn't have precedence. If P&A is all about what to do in front of the lookahead, then a non-precedent token should perhaps mean all options stay viable. But is that really the intended semantic here?
Suppose you have a nonassoc declaration on a terminal, and an S/R/R conflict where only ONE of the participating production rules hits the same non-associative precedence level. Then the other rule is clearly still viable to reduce, but what of the shift? Should we take it? What if we're mid-rule in a manner that doesn't trigger the same non-associativity problem? What if the look-ahead token is higher precedence than the remaining reduce, or the remaining reduce doesn't have precedence? How can we avoid accidentally constructing an invalid parse this way? Is there some trick with the parse-items to construct a shift-state that can't go wrong, or is this kind of thing beyond the scope of GLR parsing?
Also, how should semantic predicates interact with such ugly corner cases?
The simplest-thing-that-might-work is to treat anything involving operator-precedence in the same manner as a deterministic table-generator. But is that the intended semantic? Or perhaps: what kinds of declarations might grammar authors want to exert control over these weird cases?
Traditional yacc-style precedence rules cannot be used to resolve reduce/reduce conflicts.
Yacc/bison "resolve" reduce/reduce conflicts by choosing the first production in the grammar file. This has nothing to do with precedence, and in the grammars where you would want to use a GLR parser, it is almost certainly not correct; you want the GLR parser to pursue all possible paths.
The bison GLR parser requires that ambiguity be resolved; that is, that the grammar be unambiguous. However, it has two "outs": first, it lets you use "dynamic precedence" declarations (which is a completely different concept, although it happens to use the same word); second, if that's not enough, it lets you provide your own resolution function.
Amongst other possibilities, a custom resolution function can accept both reductions, for example by inserting a branch in the AST. There are some theoretical issues with this approach for general parsing, but it works fine with real programming languages, which tend to not be ambiguous, or at least "not very ambiguous".
A typical case for dynamic precedence is implementing a (textual) rule like C++'s §9.8/1:
There is an ambiguity in the grammar involving expression-statements and declarations: An expression-statement with a function-style explicit type conversion (8.2.3) as its leftmost subexpression can be indistinguishable from a declaration where the first declarator starts with a (. In those cases the statement is a declaration.
This rule cannot be expressed by a context-free grammar -- or, at least not in a way which would be readable -- but it is trivially expressible as a dynamic precedence rule.
As its name implies, dynamic precedence is dynamic; it's a rule applied at parse time by the parser. Bison's GLR algorithm only applies these rules if forced to; the parser handles multiple possible reductions normally (by maintaining all of them as possibilities). It is forced to apply dynamic precedence only when both possible reductions in a reduce/reduce conflict reduce to the same non-terminal.
By contrast, the yacc precedence algorithm, which as I mentioned only resolves shift/reduce conflicts, is static: it is compiled at generation time into the parse automaton (in effect, by removing actions from the transition tables), so the parser no longer sees the conflict.
This algorithm has been (justifiably) criticised for a variety of reasons, one of which is the odd behaviour of non-associative declarations in corner cases. Also, precedence rules do not compose well; because they are not scoped, they might end up accidentally applying to productions for which they were not intended. Not infrequently, they facilitate grammar bugs by hiding a conflict which should have been resolved by the grammar writer.
Best practice, therefore, is to avoid corner cases :-) Static precedence should be restricted to its originally-intended use cases: simple operator precedence and, possibly, documenting the "shift preferred" heuristic which resolves the dangling-else ambiguity and certain grouped operator parses (iirc, there's a good example of this in the dragon book).
If you implement dynamic precedence -- and, honestly, there are good reasons not to -- then it should be applied to simple easily expressed rules like the C++ rule cited above: "if it looks like a declaration, it's a declaration." Even better would be to avoid writing ambiguous grammars; that particular C++ feature leads to the infamous "most vexing parse", which has probably at some point bitten every one of us who has tried writing C++ programs.
I'm following along with Bob Nystrom's great book "Crafting Interpreters".
Please let me know if this question is too specific for this site - I've been trying for hours but couldn't figure this out on my own :)
In chapter Compiling Expressions, in function unary(), the function parsePrecedence(Precedence) is called with PREC_UNARY instead of PREC_UNARY + 1.
The book explains this is in order to enable "nesting" of unary operators. E.g.: --1.
However, in parsePrecedence(Precedence) no precedence level is checked before parsing prefix operators - it is checked only before infix ones. And unary is a prefix parser.
So passing PREC_UNARY or PREC_UNARY + 1 to parsePrecedence(Precedence) doesn't seem to make a difference. What am I missing?
The simple answer is that you are right: with this particular grammar, there is no difference because no binary (or postfix) operator has precedence PREC_UNARY, and the test that will be used is ≤.
All the same, the conventional answer is to use PREC_UNARY because unary prefix operators are (necessarily) right associative. This convention comes from the case of binary operators, where you need to use the operator's precedence plus one for left associative operators (the normal case) and the operator's precedence itself for right-associative operators (exponentiation and assignment, for example). (Assignment is actually somewhat more complicated, but I personally think the solution proposed by Bob Nystrom is more complicated than would have been necessary.)
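The "plus one for left-associative, same level for right-associative" convention can be seen in a small Pratt-style parser. This is a Python sketch with made-up precedence names and a tiny operator table, not the book's C code; error handling is omitted.

```python
import re

# Hypothetical precedence levels, loosest to tightest.
PREC_NONE, PREC_ASSIGN, PREC_TERM, PREC_FACTOR, PREC_POWER, PREC_UNARY = range(6)

# Infix operators: symbol -> (precedence, is_right_associative).
INFIX = {
    "=": (PREC_ASSIGN, True),
    "+": (PREC_TERM, False),
    "-": (PREC_TERM, False),
    "*": (PREC_FACTOR, False),
    "^": (PREC_POWER, True),
}


def tokenize(text):
    return re.findall(r"\d+|[A-Za-z_]\w*|[-+*^=()]", text)


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse_precedence(self, min_prec):
        token = self.advance()
        if token == "-":
            # Prefix operator: recurse at PREC_UNARY itself (the right-associative
            # convention). No precedence check happens before a prefix parse.
            left = ("neg", self.parse_precedence(PREC_UNARY))
        elif token == "(":
            left = self.parse_precedence(PREC_ASSIGN)
            self.advance()  # consume ")"; a real parser would check it
        else:
            left = token
        # Infix loop: continue while the next operator binds at least as tightly.
        while self.peek() in INFIX and INFIX[self.peek()][0] >= min_prec:
            op = self.advance()
            prec, right_assoc = INFIX[op]
            # Same level for right-associative, one level higher for left-associative.
            next_min = prec if right_assoc else prec + 1
            left = (op, left, self.parse_precedence(next_min))
        return left


def parse(text):
    return Parser(text).parse_precedence(PREC_ASSIGN)
```

In this sketch, as in the question, no infix operator sits at `PREC_UNARY`, so passing `PREC_UNARY + 1` in the prefix branch would produce the same trees; the difference would only show up if such an operator were added.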
Another conventional answer derives from the possibility of using a bottom-up operator precedence parser (Dijkstra's "shunting yard") instead of the top-down Pratt parser. Fully exploring bottom-up parsing goes well beyond the scope of this question; suffice it to say that the same principle applies with respect to associativity.
This page says "Prefix operators are usually right-associative, and postfix operators left-associative" (emphasis mine).
Are there real examples of left-associative prefix operators, or right-associative postfix operators? If not, what would a hypothetical one look like, and how would it be parsed?
It's not particularly easy to make the concepts of "left-associative" and "right-associative" precise, since they don't directly correspond to any clear grammatical feature. Still, I'll try.
Despite the lack of math layout, I tried to insert an explanation of precedence relations here, and it's the best I can do, so I won't repeat it. The basic idea is that given an operator grammar (i.e., a grammar in which no production has two non-terminals without an intervening terminal), it is possible to define precedence relations ⋖, ≐, and ⋗ between grammar symbols, and then this relation can be extended to terminals.
Put simply, if a and b are two terminals, a ⋖ b holds if there is some production in which a is followed by a non-terminal which has a derivation (possibly not immediate) in which the first terminal is b. a ⋗ b holds if there is some production in which b follows a non-terminal which has a derivation in which the last terminal is a. And a ≐ b holds if there is some production in which a and b are either consecutive or are separated by a single non-terminal. The use of symbols which look like arithmetic comparisons is unfortunate, because none of the usual arithmetic laws apply. It is not necessary (in fact, it is rare) for a ≐ a to be true; a ≐ b does not imply b ≐ a and it may be the case that both (or neither) of a ⋖ b and a ⋗ b are true.
An operator grammar is an operator precedence grammar iff given any two terminals a and b, at most one of a ⋖ b, a ≐ b and a ⋗ b hold.
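The definitions above can be turned into a short computation. The following Python sketch assumes an operator grammar with no empty productions, represented as a dict from non-terminal to a list of right-hand sides; every symbol that is not a key is treated as a terminal. The function names are made up for this example.

```python
from collections import defaultdict


def edge_terminals(grammar, from_left):
    """For each non-terminal, the terminals that can appear first (from_left=True)
    or last (from_left=False) in some derivation from it."""
    result = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                symbols = rhs if from_left else rhs[::-1]
                found = set()
                first = symbols[0]
                if first in grammar:
                    found |= result[first]
                    # A terminal right next to an edge non-terminal also counts.
                    if len(symbols) > 1 and symbols[1] not in grammar:
                        found.add(symbols[1])
                else:
                    found.add(first)
                if not found <= result[nt]:
                    result[nt] |= found
                    changed = True
    return result


def precedence_relations(grammar):
    """Map each terminal pair (a, b) to the set of relations that hold for it."""
    leading = edge_terminals(grammar, from_left=True)
    trailing = edge_terminals(grammar, from_left=False)
    relations = defaultdict(set)
    for alternatives in grammar.values():
        for rhs in alternatives:
            for i, (x, y) in enumerate(zip(rhs, rhs[1:])):
                x_is_nt, y_is_nt = x in grammar, y in grammar
                if not x_is_nt and not y_is_nt:
                    relations[x, y].add("≐")  # consecutive terminals
                if not x_is_nt and y_is_nt:
                    for b in leading[y]:
                        relations[x, b].add("⋖")
                    if i + 2 < len(rhs) and rhs[i + 2] not in grammar:
                        relations[x, rhs[i + 2]].add("≐")  # one non-terminal between
                if x_is_nt and not y_is_nt:
                    for a in trailing[x]:
                        relations[a, y].add("⋗")
    return relations


def is_operator_precedence(relations):
    """True if at most one relation holds for every pair of terminals."""
    return all(len(found) <= 1 for found in relations.values())
```

For the layered grammar `E -> E + T | T`, `T -> T * F | F`, `F -> ( E ) | id` this finds a single relation per pair, whereas for the ambiguous `E -> E + E | id` both `+ ⋖ +` and `+ ⋗ +` hold, so the check fails.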
If a grammar is an operator-precedence grammar, it may be possible to find an assignment of integers to terminals which make the precedence relationships more or less correspond to integer comparisons. Precise correspondence is rarely possible, because of the rarity of a ≐ a. However, it is often possible to find two functions, f(t) and g(t) such that a ⋖ b is true if f(a) < g(b) and a ⋗ b is true if f(a) > g(b). (We don't worry about only if, because it may be the case that no relation holds between a and b, and often a ≐ b is handled with a different mechanism: indeed, it means something radically different.)
%left and %right (the yacc/bison/lemon/... declarations) construct functions f and g. The way they do it is pretty simple. If OP (an operator) is "left-associative", that means that expr1 OP expr2 OP expr3 must be parsed as <expr1 OP expr2> OP expr3, in which case OP ⋗ OP (which you can see from the derivation). Similarly, if ROP were "right-associative", then expr1 ROP expr2 ROP expr3 must be parsed as expr1 ROP <expr2 ROP expr3>, in which case ROP ⋖ ROP.
Since f and g are separate functions, this is fine: a left-associative operator will have f(OP) > g(OP) while a right-associative operator will have f(ROP) < g(ROP). This can easily be implemented by using two consecutive integers for each precedence level and assigning them to f and g in turn if the operator is right-associative, and to g and f in turn if it's left-associative. (This procedure will guarantee that f(T) is never equal to g(T). In the usual expression grammar, the only ≐ relationships are between open and close bracket-type-symbols, and these are not usually ambiguous, so in a yacc-derivative grammar it's not necessary to assign them precedence values at all. In a Floyd parser, they would be marked as ≐.)
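The two-consecutive-integers procedure can be sketched in a few lines of Python. The input format (a list of levels from loosest to tightest) and the names `build_f_g` and `relation` are assumptions made for this illustration, not the internals of any particular generator.

```python
def build_f_g(levels):
    """Build f and g from declarations ordered from loosest to tightest.

    `levels` is a list of (assoc, [terminals]) pairs, mimicking a sequence of
    %left / %right lines. Each level gets two consecutive integers.
    """
    f, g = {}, {}
    for index, (assoc, terminals) in enumerate(levels):
        low, high = 2 * index, 2 * index + 1
        for terminal in terminals:
            if assoc == "right":
                f[terminal], g[terminal] = low, high  # f < g, so ROP ⋖ ROP
            else:
                f[terminal], g[terminal] = high, low  # f > g, so OP ⋗ OP
    return f, g


def relation(f, g, a, b):
    """Relation between a (on the stack) and b (the incoming terminal)."""
    # f[a] never equals g[b] with this numbering, so only two outcomes exist.
    return "⋖" if f[a] < g[b] else "⋗"
```

For example, with `[("left", ["+", "-"]), ("left", ["*"]), ("right", ["^"])]` this gives `+ ⋗ +`, `+ ⋖ *`, `* ⋗ +` and `^ ⋖ ^`.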
Now, what about prefix and postfix operators? Prefix operators are always found in a production of the form [1]:
non-terminal-1: PREFIX non-terminal-2;
There is no non-terminal preceding PREFIX so it is not possible for anything to be ⋗ PREFIX (because the definition of a ⋗ b requires that there be a non-terminal preceding b). So if PREFIX is associative at all, it must be right-associative. Similarly, postfix operators correspond to:
non-terminal-3: non-terminal-4 POSTFIX;
and thus POSTFIX, if it is associative at all, must be left-associative.
Operators may be either semantically or syntactically non-associative (in the sense that applying the operator to the result of an application of the same operator is undefined or ill-formed). For example, in C++, ++ ++ a is semantically incorrect (unless operator++() has been redefined for a in some way), but it is accepted by the grammar (in case operator++() has been redefined). On the other hand, new new T is not syntactically correct. So new is syntactically non-associative.
[1] In Floyd grammars, all non-terminals are coalesced into a single non-terminal type, usually expression. However, the definition of precedence-relations doesn't require this, so I've used different place-holders for the different non-terminal types.
There could be in principle. Consider for example the prefix unary plus and minus operators: suppose + is the identity operation and - negates a numeric value.
They are "usually" right-associative, meaning that +-1 is equivalent to +(-1), so the result is minus one.
Suppose they were left-associative, then the expression +-1 would be equivalent to (+-)1.
The language would therefore have to give a meaning to the sub-expression +-. Languages "usually" don't need this to have a meaning and don't give it one, but you can probably imagine a functional language in which the result of applying the identity operator to the negation operator is an operator/function that has exactly the same effect as the negation operator. Then the result of the full expression would again be -1 for this example.
Indeed, if the result of juxtaposing functions/operators is defined to be a function/operator with the same effect as applying both in right-to-left order, then it always makes no difference to the result of the expression which way you associate them. Those are just two different ways of defining that (f g)(x) == f(g(x)). If your language defines +- to mean something other than -, though, then the direction of associativity would matter (and I suspect the language would be very difficult to read for someone used to the "usual" languages...)
On the other hand, if the language doesn't allow juxtaposing operators/functions then prefix operators must be right-associative to allow the expression +-1. Disallowing juxtaposition is another way of saying that (+-) has no meaning.
I'm not aware of such a thing in a real language (e.g., one that's been used by at least a dozen people). I suspect the "usually" was merely because proving a negative is next to impossible, so it's easier to avoid arguments over trivia by not making an absolute statement.
As to how you'd theoretically do such a thing, there seem to be two possibilities. Given two prefix operators @ and # that you were going to treat as left associative, you could parse @#a as equivalent to #(@(a)). At least to me, this seems like a truly dreadful idea--theoretically possible, but a language nobody should wish on even their worst enemy.
The other possibility is that @#a would be parsed as (@#)a. In this case, we'd basically compose @ and # into a single operator, which would then be applied to a.
In most typical languages, this probably wouldn't be terribly interesting (would have essentially the same meaning as if they were right associative). On the other hand, I can imagine a language oriented to multi-threaded programming that decreed that application of a single operator is always atomic--and when you compose two operators into a single one with the left-associative parse, the resulting fused operator is still a single, atomic operation, whereas just applying them successively wouldn't (necessarily) be.
Honestly, even that's kind of a stretch, but I can at least imagine it as a possibility.
I hate to shoot down a question that I myself asked, but having looked at the two other answers, would it be wrong to suggest that I've inadvertently asked a subjective question, and that in fact the interpretation of left-associative prefixes and right-associative postfixes is simply undefined?
Remembering that even notation as pervasive as expressions is built upon a handful of conventions, if there's an edge case that the conventions never took into account, then maybe, until some standards committee decides on a definition, it's better to simply pretend it doesn't exist.
I do not remember any left-associative prefix operators or right-associative postfix ones, but I can imagine that both could easily exist. They are not common because of the natural way people look at operators: the one closest to the operand is applied first.
Easy example from C#/C++ languages:
~-3 is equal to 2, but
-~3 is equal to 4
This is because those prefix operators are right-associative: for ~-3, the - operator is applied first and then the ~ operator is applied to the result, so the value of the whole expression is 2.
Hypothetically, if those operators were left-associative, then for ~-3 the leftmost operator ~ would be applied first, and then - would be applied to the result, so the value of the whole expression would be 4.
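The two readings can be compared with a few lines of Python, whose ~ and unary - give the same results on these integers. The helper `apply_prefix` is made up for this illustration; it simply applies a chain of prefix operators in one order or the other.

```python
import operator


def apply_prefix(ops, value, associativity):
    """Apply a chain of prefix operators, listed left to right as written.

    "right": the operator nearest the operand is applied first (the usual rule).
    "left":  the leftmost operator is applied first (the hypothetical reading).
    """
    order = reversed(ops) if associativity == "right" else ops
    for op in order:
        value = op(value)
    return value


# The chain written as "~-" in source order.
chain = [operator.invert, operator.neg]
usual = apply_prefix(chain, 3, "right")        # same as Python's own ~-3
hypothetical = apply_prefix(chain, 3, "left")  # same as writing -~3
```

Here `usual` is 2 and `hypothetical` is 4, matching the values given above.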
[EDIT] Answering to Steve Jessop:
Steve said that: the meaning of "left-associativity" is that +-1 is equivalent to (+-)1
I do not agree with this, and think it is totally wrong. To better understand left-associativity consider the following example:
Suppose I have a hypothetical programming language with left-associative prefix operators:
@ - multiplies operand by 3
# - adds 7 to operand
Then the construction @#5 in my language will be equal to (5*3)+7 == 22
If my language were right-associative (as most usual languages are), then I would have (5+7)*3 == 36
Please let me know if you have any questions.
Hypothetical example. A language has prefix operator # and postfix operator # with the same precedence. An expression #x# would be equal to (#x)# if both operators are left-associative and to #(x#) if both operators are right-associative.