Top-down parser classification - parsing

I've watched this course by Alex Aiken and read through many other resources, but I'm struggling to find a clear classification of top-down parsers.
This document doesn't provide a clear classification either, but it at least gives some definitions I'll use in the post. So here is the classification I've come up with so far:
Backtracking VS Predictive
Backtracking
One solution to parsing would be to implement backtracking. Based on the information the parser currently has about the input, a decision is made to go with one particular production. If this choice leads to a dead end, the parser would have to backtrack to that decision point, moving backwards through the input, and start again making a different choice and so on until it either found the production that was the appropriate one or ran out of choices.
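The backtracking idea quoted above can be sketched in a few lines of Python. This is only an illustration under assumptions: the toy grammar, the GRAMMAR dictionary, and the function names parse, parse_sequence and accepts are all hypothetical and do not come from the quoted document. Generators are used so that a failed continuation automatically resumes at the most recent decision point.

```python
# Hypothetical toy grammar (an assumption for illustration only):
#   S -> A 'c'
#   A -> 'a' | 'a' 'b'
GRAMMAR = {
    "S": [["A", "c"]],
    "A": [["a"], ["a", "b"]],
}

def parse(symbol, text, pos):
    """Yield every position at which `symbol` can end when started at `pos`."""
    if symbol not in GRAMMAR:              # terminal: match it literally
        if text.startswith(symbol, pos):
            yield pos + len(symbol)
        return
    for production in GRAMMAR[symbol]:     # decision point: try each production
        yield from parse_sequence(production, text, pos)

def parse_sequence(symbols, text, pos):
    """Yield every end position for matching `symbols` in order from `pos`."""
    if not symbols:
        yield pos
        return
    for mid in parse(symbols[0], text, pos):
        # If the remaining symbols fail, this loop resumes and asks for the
        # next possible `mid`. That resumption is the backtracking step.
        yield from parse_sequence(symbols[1:], text, mid)

def accepts(text):
    """True if some sequence of choices consumes the whole input."""
    return any(end == len(text) for end in parse("S", text, 0))
```

On the input "abc" the sketch first tries A -> 'a', hits a dead end because 'c' is expected where 'b' appears, and then goes back and tries A -> 'a' 'b'. The sketch assumes the grammar has no left recursion; with left recursion it would not terminate.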
Predictive
A predictive parser is characterized by its ability to choose the production to apply solely on the basis of the next input symbol and the current nonterminal being processed.
Recursive descent VS table-driven
Recursive descent
A recursive-descent parser consists of several small functions, one for each nonterminal in the grammar. As we parse a sentence, we call the functions that correspond to the left side nonterminal of the productions we are applying. If these productions are recursive, we end up calling the functions recursively.
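As a rough sketch of the "recursive descent + prediction" combination, the following Python class has one method per nonterminal and picks a production by looking only at the next input character. The grammar, the Parser class, and the evaluate helper are assumptions made for this example and are not taken from the quoted document.

```python
# Hypothetical toy grammar (an assumption for illustration only):
#   expr -> term rest
#   rest -> '+' term rest | (empty)
#   term -> DIGIT | '(' expr ')'

class Parser:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def peek(self):
        # The single lookahead symbol; '$' stands for end of input.
        return self.text[self.pos] if self.pos < len(self.text) else "$"

    def expect(self, ch):
        if self.peek() != ch:
            raise SyntaxError(f"expected {ch!r} at position {self.pos}")
        self.pos += 1

    def expr(self):
        return self.rest(self.term())

    def rest(self, left):
        if self.peek() == "+":           # predict rest -> '+' term rest
            self.expect("+")
            return self.rest(left + self.term())
        return left                      # predict rest -> (empty)

    def term(self):
        ch = self.peek()
        if ch.isdigit():                 # predict term -> DIGIT
            self.pos += 1
            return int(ch)
        if ch == "(":                    # predict term -> '(' expr ')'
            self.expect("(")
            value = self.expr()
            self.expect(")")
            return value
        raise SyntaxError(f"unexpected {ch!r} at position {self.pos}")

def evaluate(text):
    """Parse the whole input and return the sum it denotes."""
    parser = Parser(text)
    value = parser.expr()
    if parser.peek() != "$":
        raise SyntaxError(f"trailing input at position {parser.pos}")
    return value
```

Because every choice is settled by one character of lookahead, the parser never re-reads input; a wrong character is reported as an error right away instead of triggering another attempt.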
Table driven
There is another method for implementing a predictive parser that uses a table to store that production along with an explicit stack to keep track of where we are in the parse.
As I understand it, I now have four different types of parsers:
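A minimal sketch of the table-driven variant might look like the following. The parse table is written by hand for a hypothetical toy grammar; the names TABLE, NONTERMINALS and accepts, and the token names "num" and "$end", are assumptions for this example. In practice such a table is usually generated from FIRST and FOLLOW sets rather than typed in.

```python
# Hand-built LL(1) table for a hypothetical toy grammar:
#   E -> T R      R -> '+' T R | (empty)      T -> num | '(' E ')'
# "num" stands for any single digit; "$end" marks the end of the input.
TABLE = {
    ("E", "num"): ["T", "R"],
    ("E", "("): ["T", "R"],
    ("R", "+"): ["+", "T", "R"],
    ("R", ")"): [],
    ("R", "$end"): [],
    ("T", "num"): ["num"],
    ("T", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "R", "T"}

def accepts(text):
    """Recognize `text` using the table and an explicit stack, no recursion."""
    tokens = ["num" if ch.isdigit() else ch for ch in text] + ["$end"]
    stack = ["$end", "E"]                # start symbol on top of the end marker
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            production = TABLE.get((top, look))
            if production is None:       # empty table cell: syntax error
                return False
            stack.extend(reversed(production))   # leftmost symbol ends up on top
        elif top == look:                # terminal on the stack matches the input
            pos += 1
        else:
            return False
    return pos == len(tokens)
```

The stack here plays the role that the call stack plays in a recursive-descent parser: it records which symbols are still expected.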
Recursive descent + backtracking
Recursive descent + prediction
Table-driven + backtracking
Table-driven + prediction
If I'm correct, can someone also tell me where in the following 4 types of parsers the LL(k) parser falls?

No. You have:
backtracking vs predictive
recursive descent vs table-driven
So you can have:
recursive descent backtracking
recursive descent predictive
table-driven with backtracking
table-driven predictive.
To be specific, 'Recursive descent with table/stack implementation' is a contradiction in terms.
All table-driven parser implementations need a stack. This is not a dichotomy.
where in the following 4 types of parsers the LL(k) parser falls?
Anywhere.

Related

Types of Top down parsers

I want to clearly classify top-down parsers. After reading a lot of resources, I am connecting the dots.
I concluded the following:
There are 2 types of top-down parsers:
One that uses backtracking
Another that doesn't use backtracking (also called predictive parsers)
Now, each of these can be of 2 types based on how we implement it.
If we implement it using recursion, it is a recursive descent parser; if we implement it using an explicit stack, it is a non-recursive parser.
Hence there are 4 types of top-down parsers:
Backtracking + recursive implementation (recursive descent with backtracking)
Backtracking + stack implementation
No backtracking + recursive implementation (recursive descent without backtracking)
No backtracking + stack implementation
Slide 8 here says that a predictive parser can be implemented in 2 ways.
But I am not able to verify whether I am correct about parsers with backtracking too.
Wikipedia says that a "recursive descent parser is a kind of top-down parser built from a set of mutually recursive procedures (or a non-recursive equivalent) ..."
This means recursive descent parsers can be implemented in a non-recursive way too, which I am not able to understand.
Please check my understanding of the types of top-down parsers, and also explain what the Wikipedia entry means by a non-recursive equivalent of recursive descent parsers.
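One possible reading of "non-recursive equivalent", offered as a sketch and not as what the Wikipedia entry necessarily means, is that the call stack is replaced by an explicit stack of saved configurations. The example below does this for a backtracking parser (the "backtracking + stack implementation" case). The toy grammar and the names GRAMMAR and accepts are assumptions for illustration.

```python
# Hypothetical toy grammar (an assumption for illustration only):
#   S -> A 'c'
#   A -> 'a' | 'a' 'b'
GRAMMAR = {
    "S": [("A", "c")],
    "A": [("a",), ("a", "b")],
}

def accepts(text):
    """Backtracking top-down recognizer that uses an explicit stack."""
    # Each entry is a saved configuration:
    # (symbols still to be matched, current input position).
    pending = [(("S",), 0)]
    while pending:
        symbols, pos = pending.pop()     # resume the most recent open choice
        if not symbols:
            if pos == len(text):
                return True              # everything matched, input consumed
            continue                     # dead end: fall back to another entry
        head, rest = symbols[0], symbols[1:]
        if head in GRAMMAR:
            # Push one configuration per production; reversed so that the
            # first production is the one tried first.
            for production in reversed(GRAMMAR[head]):
                pending.append((production + rest, pos))
        elif text.startswith(head, pos):
            pending.append((rest, pos + len(head)))
        # A terminal mismatch pushes nothing, so the loop backtracks.
    return False
```

No function calls itself here, yet the order in which productions are tried is the same as in a recursive backtracking parser. Like the recursive version, the sketch assumes a grammar without left recursion.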

LALR parser being slower than recursive descent parser

Recently I wrote a (highly optimized) LALR(1) parser (that could handle ambiguous grammars) and supplied it a very ambiguous grammar. After that, I wrote a recursive descent parser for the same grammar, but with all the ambiguity taken out. I've read many times that LALR(1) parsers are very efficient, and recursive descent parsers are very inefficient, so I naturally expected the LALR parser to run much faster, even though it had an ambiguous grammar. When I compared the results of the two runs, I was shocked: the recursive descent parser was much faster than the LALR parser! Why was the LALR parser slower than the recursive descent parser? Was it because the LALR parser had an ambiguous grammar?

How many ways are there to build a parser? [closed]

I am learning about ANTLR v4, which is a parser generator based on the so-called Adaptive LL(*) algorithm. It claims to be a big improvement over the LL(*) algorithm, but I have also heard about other algorithms such as LR.
What's the advantage/limitation of ANTLR's Adaptive LL(*) algorithm (over LR)?
How many contemporary algorithms are there to build a parser?
To start with, one can look at the list of common parser generators.
See: Comparison of parser generators and look under the heading Parsing algorithm.
ALL(*)
Backtracking Bottom-up
Backtracking LALR(1)
Backtracking LALR(k)
GLR
LALR(1)
LR(1)
IELR(1)
LALR(K)
LR(K)
LL
LL(1)
LL(*)
LL(1), Backtracking, Shunting yard
LL(k) + syntactic and semantic predicates
LL, Backtracking
LR(0)
SLR
Recursive descent
Recursive descent, Backtracking
PEG parser interpreter, Packrat
Packrat (modified)
Packrat
Packrat + Cut + Left Recursion
Packrat (modified), mutating interpreter
2-phase scannerless top-down backtracking + runtime support
Packrat (modified to support left-recursion and resolve grammar ambiguity)
Parsing Machine
Earley
Recursive descent + Pratt
Packrat (modified, partial memoization)
Hybrid recursive descent / operator precedence
Scannerless GLR
runtime-extensible GLR
Scannerless, two phase
Combinators
Earley/combinators
Earley/combinators, infinitary CFGs
Scannerless GLR
delta chain
Besides parser generators, there are also other algorithms and means to parse. In particular, Prolog has DCG, and most people who write their first parser from scratch without formal training typically start with recursive descent. There are also chart parsers and left-corner parsers.
In writing parsers, the first question that I always ask myself is how I can write a grammar for the language at the highest type in the Chomsky hierarchy. Here lowest is Type-0 and highest is Type-3.
Almost 90% of the time it is a Type-2 grammar (context-free grammars); for the easier tasks it is a Type-3 grammar (regular grammars). I have experimented with Type-1 grammars (context-sensitive grammars) and even Type-0 grammars (unrestricted grammars).
And what's the advantage/limitation of ANTLR's Adaptive LL(*) algorithm?
See the paper written by Terence Parr, the creator of Adaptive LL(*):
Adaptive LL(*) Parsing: The Power of Dynamic Analysis
In practical terms, Adaptive LL(*) lets you get from a grammar to a working parser faster, because you do not have to understand as much parsing theory: Adaptive LL(*) is, shall I say, nimble enough to sidestep the mines you unknowingly place in the grammar. The price for this is that some of those mines can lead to inefficiencies in the runtime of the parser.
For most practical programming language purposes Adaptive LL(*) is enough. IIRC Adaptive LL(*) can NOT do Type-0 grammars (unrestricted grammars), which Prolog DCG can, but as I said, most people and most common programming tasks only need either Type-2 or Type-3.
Also, most parser generators are for Type-2, but that does not mean they can't do Type-1 or possibly Type-0. I cannot be more specific as I do not have practical experience with all of them.
Any time you use a parsing tool or library, there is a learning curve to find out how to use it and what it can and cannot do.
If you are new to lexing/parsing and really want to understand it more, then take a course and/or read Compilers: Principles, Techniques, and Tools (2nd Edition).

Recursive descent parser first and follow

Are the FIRST and FOLLOW sets required to implement a recursive descent parser? And if so, can you still build the recursive descent parser when the FIRST and FOLLOW sets are not unique?
I'm having a hard time distinguishing between recursive descent and LL(1) parsing.
Thanks.
Recursive descent parsers do not have to be deterministic, i.e. one can construct recursive descent parsers that cannot decide which derivation to choose after a finite constant lookahead.
LL(k) parsers construct a parse tree incrementally, each new character will extend the parse tree.
Nondeterministic recursive descent parsers can build a parse tree which is discarded completely on the occurrence of a certain character.
Examples of recursive descent parsing that is not necessarily LL(k):
Parsing in PROLOG (backtracking)
Packrat Parsing (backtracking with memoization)
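To make "backtracking with memoization" a little more concrete, here is a small Python sketch in the packrat style. The PEG-like toy grammar and the names make_parser, num, total and calls are assumptions for this example. Memoization is done with functools.lru_cache, so each rule is evaluated at most once per input position.

```python
from functools import lru_cache

# Hypothetical PEG-style toy grammar (an assumption for illustration only):
#   Sum <- Num '+' Sum / Num
#   Num <- [0-9]

def make_parser(text):
    """Return (total, calls): the memoized Sum rule and a call counter."""
    calls = {"num": 0}                   # counts real (uncached) Num evaluations

    @lru_cache(maxsize=None)
    def num(pos):
        calls["num"] += 1
        if pos < len(text) and text[pos].isdigit():
            return pos + 1               # end position on success
        return None                      # None signals failure

    @lru_cache(maxsize=None)
    def total(pos):
        # First alternative: Num '+' Sum
        end = num(pos)
        if end is not None and text.startswith("+", end):
            after = total(end + 1)
            if after is not None:
                return after
        # Backtrack to the second alternative: Num.
        # The result for num(pos) is already cached, so nothing is re-parsed.
        return num(pos)

    return total, calls
```

For the input "1+2" the Num rule is asked for three times but evaluated only twice, because the retry at the same position is answered from the cache. That reuse is what keeps packrat parsing from repeating work when it backtracks.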

What is the relation between parser combinators and recursive descent parsers?

What is the relation between parser combinators and recursive descent parsers?
The Wikipedia link on parser combinators is actually pretty reasonable. From it one of the first things we learn is that "Parser combinators use a top-down parsing strategy", i.e. recursive descent.
Combinators themselves are building blocks for parsers, but they lean toward recursive descent.
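A rough sketch of how combinators end up behaving like recursive descent is given below. Each parser is a plain function from (text, pos) to either (value, new_pos) or None, and larger parsers are built by combining smaller ones. The names char, digit, seq, alt, fmap, lazy and expr are hypothetical and are not taken from any particular combinator library.

```python
def char(expected):
    """Parser that matches the literal string `expected`."""
    def parser(text, pos):
        if text.startswith(expected, pos):
            return expected, pos + len(expected)
        return None
    return parser

def digit(text, pos):
    """Parser that matches one decimal digit and returns it as an int."""
    if pos < len(text) and text[pos].isdigit():
        return int(text[pos]), pos + 1
    return None

def seq(*parsers):
    """Run the parsers one after another; fail if any of them fails."""
    def parser(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parser

def alt(*parsers):
    """Ordered choice: return the first alternative that succeeds."""
    def parser(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parser

def fmap(func, p):
    """Apply `func` to the value produced by parser `p`."""
    def parser(text, pos):
        result = p(text, pos)
        if result is None:
            return None
        value, new_pos = result
        return func(value), new_pos
    return parser

def lazy(thunk):
    """Delay the lookup of a parser so a rule can refer to itself."""
    return lambda text, pos: thunk()(text, pos)

# expr -> digit '+' expr | digit
expr = alt(
    fmap(lambda v: v[0] + v[2], seq(digit, char("+"), lazy(lambda: expr))),
    digit,
)
```

Calling expr("1+2+3", 0) walks the grammar top-down: expr calls digit, then char("+"), then expr again through lazy, which is the same call pattern a hand-written recursive descent parser would have. The alt combinator retries from the same position when its first alternative fails, so this particular sketch is a backtracking one.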

Resources