How can I build a YAML parser using TatSu, the Python parser generator?
Parsing an indentation-based language like YAML is difficult, and I have not managed to get it working.
TatSu was used to do experiments and bootstrap the new PEG parser in Python.
You can find the solutions to INDENT/DEDENT I used in the original efforts here:
https://github.com/neogeny/pygl
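For readers who just want the general idea of INDENT/DEDENT handling, here is a minimal sketch of the conventional indentation-stack technique. It is not taken from pygl and may differ from what that repository does. The function name `indent_tokens` and the token kinds are made up for illustration, and it assumes indentation uses spaces only. The idea is to turn leading whitespace into explicit markers before parsing, so that a grammar could treat them like ordinary tokens.

```python
def indent_tokens(text):
    """Yield (kind, value) pairs, where kind is INDENT, DEDENT or LINE.

    Hypothetical helper for illustration; assumes space-only indentation.
    """
    stack = [0]  # indentation widths of the currently open blocks
    for raw in text.splitlines():
        if not raw.strip():
            continue  # blank lines do not affect block structure
        width = len(raw) - len(raw.lstrip(" "))
        if width > stack[-1]:
            stack.append(width)  # a deeper line opens a new block
            yield ("INDENT", width)
        while width < stack[-1]:
            stack.pop()  # a shallower line closes one block per level
            yield ("DEDENT", width)
        if width != stack[-1]:
            raise ValueError(f"inconsistent dedent: {raw!r}")
        yield ("LINE", raw.strip())
    while len(stack) > 1:  # close any blocks still open at end of input
        stack.pop()
        yield ("DEDENT", 0)


sample = "a:\n  b: 1\n  c:\n    d: 2\ne: 3\n"
tokens = list(indent_tokens(sample))
for kind, value in tokens:
    print(kind, value)
```

Keeping the stack outside the grammar means the grammar itself stays context-free: it only ever sees balanced INDENT/DEDENT pairs.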
Related
I am trying to write my own programming language, and I use Lua source as a reference.
I have several questions about it:
What kind of parser does Lua use? Is it a Pratt parser?
... and why doesn't it produce an AST? That's the parser's job, isn't it?
I am new to Scheme but I understand recursion and a few things about parsing in general. Does anybody have experience in how to parse YAML (at least part of the spec) using Scheme/Lisp? At this point, I am not looking for efficiency.
Here is the source of a parser for YAML in Racket:
https://github.com/esilkensen/yaml/blob/master/yaml/parser.rkt
It is a recursive descent parser and would be easy to port to RnRS Scheme.
Documentation: http://pkg-build.racket-lang.org/doc/yaml/index.html
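To give a feel for what "recursive descent" means for an indentation-based format, here is a minimal sketch in Python (not Scheme, and not a port of the Racket parser above). It handles only a tiny YAML-like subset: block mappings of `key: value` lines nested by space indentation, with every scalar kept as a string. The names `indent_of`, `parse_mapping` and `parse` are made up for illustration.

```python
def indent_of(line):
    """Number of leading spaces on a line."""
    return len(line) - len(line.lstrip(" "))


def parse_mapping(lines, pos, indent):
    """Parse a mapping whose keys sit at `indent`; return (dict, next_pos)."""
    result = {}
    while pos < len(lines):
        line = lines[pos]
        width = indent_of(line)
        if width < indent:
            break  # this line belongs to an enclosing mapping
        if width > indent:
            raise ValueError(f"unexpected indentation: {line!r}")
        key, sep, rest = line.strip().partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value': {line!r}")
        key, rest = key.strip(), rest.strip()
        pos += 1
        if rest:
            result[key] = rest  # scalar, kept as a plain string
        elif pos < len(lines) and indent_of(lines[pos]) > indent:
            # A deeper line follows: descend into the nested mapping.
            result[key], pos = parse_mapping(lines, pos, indent_of(lines[pos]))
        else:
            result[key] = None  # key with no value
    return result, pos


def parse(text):
    """Parse the supported subset; blank lines and # comment lines are skipped."""
    lines = [l for l in text.splitlines()
             if l.strip() and not l.lstrip().startswith("#")]
    doc, _ = parse_mapping(lines, 0, 0)
    return doc


doc = parse("name: demo\nserver:\n  host: localhost\n  ports:\n    http: 80\nempty:\n")
print(doc)
```

Each nesting level is one recursive call, and the call returns as soon as it sees a line indented less than its own level. Sequences, flow syntax, quoting, anchors and multi-line scalars are all left out here.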
I want to analyze OCaml files (.ml) using OCaml. I want to parse the files into abstract syntax trees for analysis. I have attempted to use camlp4 but have had no luck. Has anyone else successfully done this before? Is this the best way to parse an OCaml file?
(I assume you know basic parts of OCaml already: how to write OCaml code, how to link modules and libraries, how to write build scripts and so on. If you do not, learn them first.)
The best way is to use the genuine OCaml parser used in the OCaml compiler itself, since it is 100% compatible by definition.
CamlP4 also implements an OCaml parser, but it is slightly incompatible with the genuine parser, and its parse tree is somewhat specialized for writing syntax extensions: not very good for any other kind of analysis.
You may want to use P4 to parse .ml files that use syntax extensions. Even in this case, you should stick to the genuine parser: you can desugar the source code with P4 and then pass the result to your analyzer, which uses the genuine parser.
To use the OCaml compiler's parser, the easiest approach is to use the compiler-libs.common OCamlFind package. It contains the parser and type checker of the OCaml compiler.
Start by modifying driver/compile.ml in the OCaml compiler source. It implements the major compilation phases: calling the preprocessor, parsing, typing, and then code generation. To parse .ml files, modify (or simplify) Compile.implementation; for .mli files, Compile.interface.
Good luck.
Couldn't you use the -dparsetree option of the OCaml compiler?
hello.ml:
let _ = print_endline "Hello AST"
Now compile it:
$ ocamlc -dparsetree hello.ml
Which results in:
[
  structure_item (hello.ml[1,0+0]..[1,0+33])
    Pstr_eval
    expression (hello.ml[1,0+8]..[1,0+33])
      Pexp_apply
      expression (hello.ml[1,0+8]..[1,0+21])
        Pexp_ident "print_endline" (hello.ml[1,0+8]..[1,0+21])
      [
        <label> ""
          expression (hello.ml[1,0+22]..[1,0+33])
            Pexp_constant Const_string("Hello AST",None)
      ]
]
See also this blog post on -ppx extensions which has some info on extension point syntax extensions (the new way of writing syntax extensions in OCaml 4.02). There is info there on various AST manipulation modules.
I know that it's possible to use, for example, bison-generated Java files in a Scala project, but are there any native "grammar to Scala" LALR(1) generators?
Another plug here: ScalaBison is close to LALR(1) and lets you use Scala in the actions.
I'm not really answering the original question, and please excuse the plug, but you may be interested in our sbt-rats plugin for the sbt tool. It uses the Rats! parser generator for Java, but makes it easier to use from Scala.
Rats! uses parsing expression grammars as its syntax description formalism, not context-free grammars and definitely not LALR(1) grammars. sbt-rats also has a high-level syntax definition language that in most cases means you do not need to write semantic actions to get a syntax tree that represents your input. The plugin will optionally generate case classes for the tree representation and a pretty-printer for the tree structure.
I have been given a task to parse text that conforms to an EBNF grammar. Is there any tool/library I can use?
ANTLR is a widely used parser generator, and its grammar notation is EBNF-like.
See "Good parser generator (think lex/yacc or antlr) for .NET? Build time only?" here on SO.