I know that every LL(1) is also an LR(1). But what about the relationship between LL(1) and LR(0), can a LL(1) be a LR(0) as well?
You ask two questions, one in the title and the other in the body of the post. Neither specify whether you are asking about languages or grammars, but the basic answers are the same:
Are all LL(1) languages LR(0)?
No. A language which contains both a string and a proper prefix of that string cannot be LR(0). But many LL(1) languages have that form.
Are some LL(1) languages LR(0)?
Sure.
(The unasked question) Are any LR(0) languages not LL(1).
Yes. For example, the language {ambnc | m≥n≥0} is LR(0), but it has no LL(1) grammar.
Related
sorry for the amateurish question. I have a grammar that's LL and I want to write an LR grammar. Can I use the LL grammar, make minimal syntactical changes for it to fit with an LR parser and use it? Is that a bad idea? Are there structural differences between them that don't translate?
All LL(1) grammars are LR(1), so if you had an LR(1) parser generator, you could definitely use your LL(1) grammar, assuming the BNF syntax is that used by the parser generator.
But you probably don't have an LR(1) parser generator, but rather a parser generator which can only handle the LALR(1) subset of LR(1) grammars. All the same, you're probably fine. "Most" LL(1) grammars are in LALR(1), and it's pretty rare to find a useful LL(1) which is not. (This pattern is unlikely to arise in a practical grammar, for example.)
So it's probably possible. But it might not be a good idea.
Top-down parsers can't handle left-recursion, and without left-recursion you can't write a grammar which represents left-associative operators, which is to say most arithmetic operators. This problem is usually solved in practice by using a right-associative grammar along with an idiosyncratic evaluation function which corrects the associativity. That's less than ideal. Also, LL grammars created by mechanically removing left-recursion tend to be very hard to read.
So you are probably best off using a grammar designed for LR parsing. But you probably don't have to.
I recently met the concept of LR, LL etc. Which category is Lua? Are there, or, can there be implementations that differ from the official code in this aspect?
LR, LL and so on are algorithms which attempt to find a parser for a given grammar. This is not possible for every grammar, and you can categorize on the basis of that possibility. But you have to be aware of the difference between a language and a grammar.
It might be possible to create an LR(k) parser for a given grammar, for some specific value of k. If so, the grammar is LR(k). Note that an LR(k) grammar is also an LR(k+1) grammar, and that an LL(k) grammar is also LR(k). So these are not categories in the sense that every grammar is in exactly one category.
Any language can be recognised by many different grammars. (In fact, an unlimited number). These grammars can be arbitrarily complex. You can always write a grammar for a given language which is not even context-free. We say that a language is <X> if there exists a grammar for that language which is <X>. But the fact that a specific grammar for that language is not <X> says nothing.
One interesting theorem demonstrates that if there is an LR(k) grammar for any language, then it is possible to derive an LR(1) grammar for that language. So while the k parameter is useful for describing grammars, languages can only be LR(0) or LR(1). This is not true of LL(k) languages, though.
Lua as a language is basically LR(1) and LL(2). The grammar is part of the reference manual, except that the published grammar doesn't specify operator precedences or a few rules having to do with newlines. The actual parser is a hand-written recursive-descent parser (at least, the last time I looked) with a couple of small deviations in order to handle operator precedence and the minor deviations from LL(1). However, there exist LALR(1) parsers for Lua as well.
I'm really struggling to unterstand the relationship between:
LR(0)
LL(0)
LALR(1)
SLR(1)
LR(1)
LL(1)
I'm pretty sure LALR(1) and SLR(1) are subsets of LR(1), but I'm lost about the others. Are they all exclusive? Is LL(0) a subset of LL(1)?
Thanks
The containment rules are the following:
Every LR(0) grammar is also SLR(1), but not all SLR(1) grammars are LR(0).
Every SLR(1) grammar is also LALR(1), but not all LALR(1) grammars are SLR(1).
Every LALR(1) grammar is also LR(1), but not all LR(1) grammars are LALR(1).
Every LL(1) grammar is also LR(1), but not all LR(1) grammars are LL(1).
Every LL(0) grammar is also LR(0), SLR(1), LALR(1), LR(1), and LL(1). (LL(0) grammars are basically useless; see this question for details why).
It's also the case that every language that has an LR(1) grammar also has an LR(0) grammar provided that you endmark the grammar, though the grammar isn't guaranteed to be pretty.
I have a grammar and I can check whether or not is is LL(1). However, is there any way to check if the language generated by the grammar is LL(1)? And what exactly is the difference between LL(1) grammars and LL(1) languages?
Any grammar that is LL(1) defines an LL(1) language. By definition, a language is LL(1) if there is some grammar that generates it that is LL(1), so the fact that you have an LL(1) grammar for the language automatically means that the language is LL(1).
To elaborate, a language is a set of strings and a grammar for that language is a means of describing that language. Some languages have LL(1) grammars while others do not. However, the fact that a grammar is not LL(1) does not mean that the language it describes is not. For example, consider this grammar:
A -> ab | ac
This grammar is not LL(1) because it contains a FIRST/FIRST conflict when trying to predict the production for A when seeing terminal a. However, it describes an LL(1) language, since the language is also described by the grammar
A -> aX
X -> b | c
So the language generated by these grammars (which just contains ab and ac) is indeed LL(1).
Determining whether the language described by an arbitrary grammar is LL(1) is much harder and to the best of my knowledge the only way to do it would be to either explicitly exhibit an LL(1) grammar for the language generated by the initial grammar (which is tricky) or to mathematically prove that no such grammar exists.
Hope this helps!
I've been playing around with a lot of grammars that are not LL(1) recently, and many of them can be transformed into grammars that are LL(1).
However, I have never seen an example of an unambiguous language that is not LL(1). In other words, a language for which any unambiguous grammar for the language is not LL(1)), nor do I have any idea how I would go about proving that I had found one if I accidentally stumbled across one.
Does anyone know how to prove that a particular unambiguous language is not LL(1)?
I was thinking about the question a while and then found this language at Wikipedia:
S -> A | B
A -> 'a' A 'b' | ε
B -> 'a' B 'b' 'b' | ε
They claim the language described by the grammar above cannot be described by LL(k) grammar. You asked about LL(1) only and this is pretty straightforward. Having first symbol only, you don't know if the sequence is 'ab' or 'aab' (or any more recursive one) and therefore you cannot choose the right rule. So the language is definitely not LL(1).
Also for every sequence generated by this grammar there is only one derivation tree. So the language is unambiguous.
The second part of your question is a little harder. It is much easier to prove the language is LL(1), than the opposite (there is no LL(1) grammar describing the language). I think you just create a grammar describing the language, then you try to make it LL(1). After discovering a conflict which cannot be resolved you somehow have to take advantage of it and create a proof.