Has anyone written out the grammar in BNF for the Internet Printing Protocol Collections record described in RFC3382? - ipp-protocol

I am having a little trouble generating the Collection records described in Internet Printing Protocol definition RFC3382. Has anyone written out the grammar in BNF?

It's hard to say! I frequently research the internet about IPP and have not come across any work related to IPP and BNF (Backus-Naur-Form) so far.
I guess the PWG IPP mailing list would be a better audience for this question. Most ipp implementations don't use a scanner or parser to deal with ipp messages. I assume you implement an ipp server and have covered the ipp parsing already.
Sometimes it is a good approach to capture an ipp message (response) of a real printer and have a look at its byte sequence. On request (personal mail) I can provide such a response in binary format (including a media-col-default and media-col-database attribute)

I got an answer from the Printer Working Group (PWG) site. The short answer is that a RFC-in-progress has a more precise grammar for Collections.
From Michael Sweet:
There is an updated ABNF grammar in the upcoming RFC 8010 (which
replaces RFCs 2910 and 3382) along with better examples that might be
of some help. Here is a link to the authors' review copy (it should
be published very soon!):
https://www.rfc-editor.org/authors/rfc8010.txt
Sections 3.1.6 and 3.1.7 cover the encoding of the collection
attribute and its member attributes, respectively.
FWIW, the "design" of collections in 3382 was done specifically to
make them look like a 1setOf attribute with a mix of values so that
existing clients/printers could more easily deal with them. In
practice this makes supporting collections a little more difficult
(and the encoding of collection values a little more verbose) than is
ideal... :/
(Mr Sweet, I apologize for redistributing your information without consulting you)

Related

Finding Java Grammars

I'm making my way into Rascal - nice work!
I'm halfway through the Tutor and might be getting ahead of myself, but one of my interests is refactoring Java. I've watched Tijs van der Storm's Curry On 2016 talk where he presents an example of this (the "TrafoFields" module around minute 16 in the video).
I was eager to get into this kind of work so I searched the documentation for the Java grammar and a similar example using it, to no avail. What's more, the library documentation has under lang::java only m3 and jdt. I reloaded Tijs' video to find he uses lang::java::\syntax::Java15. I blindly tried importing that in the Repl and it worked (albeit with lots of warnings)! I opened the Rascal .jar file to find there is even more in this package.
So my questions in this context are:
Why is this not in the documentation? I would have expected that the library documentation is exhaustive. Couldn't you at least add "TrafoFields" to the recipes?!
Is there an alternative way of finding out about such modules besides the online documentation (and apart from searching the .jar file)?
What is this weird backslash in the module name before "syntax", ::\syntax?
All good questions; and also implied suggestions for improvement. In the meantime if you get stuck please ask more questions.
Short answer: we've prioritized documenting what is used most and what is used in courses at universities and then what we've received questions about. So thanks for asking.
In reverse order:
The backslash is a generic escape to use identifiers in Rascal which are also keywords in the language. So any variable or package name or module name could be named "if" but you'd have to escape it. This helps a lot when defining abstract syntax trees for programming languages, as you can imagine where "if" would be a great name for an ast node type. "Syntax" happens to be a keyword in Rascal too.
The Rascal explorer view in eclipse can be used to browse through the library interactively. Also you can crawl the library using util::FileSystem and get a first class representation of everything that is in a library like |std:///|. Agreed, these are poorman's solutions and a good feature would a fully indexed searchable library. We're halfway there (see the analysis::text:: Lucene pre-work we've done towards supporting this.
That is a rhetorical question which I accept as a suggestion for improvement 😉😊 the answer is yes.

What front-end can I use with RPython to implement a language?

I've looked high and low for examples of implementing a language using the RPython toolchain, but the only one I've been able to find so far is this one in which the author writes a simple BF interpreter. Because the grammar is so simple, he doesn't need to use a parser/lexer generator. Is there a front-end out there that supports developing a language in RPython?
Thanks!
I'm not aware of any general lexer or parser generator targeting RPython specifically. Some with Python output may work, but I wouldn't bet on it. However, there's a set of parsing tools in rlib.parsing. It seems quite usable. OTOH, there's a warning in the documentation: It's reportedly still in development, experimental, and only used for the Prolog interpreter so far.
Alternatively, you can write the frontend by hand. Lexers can be annoying and unnatural, granted (you may be able to rip out the utility modules for DFAs used by the Python implementation). But parsers are a piece of cake if you know the right algorithms. I'm a huge fan of "Top Down Operator Precedence parsers" a.k.a. "Pratt parsers", which are reasonably simple (recursive descent) but make all expression parsing issues (nesting, precedence, associativity, etc.) a breeze. There's depressingly little information on them, but the few blog posts were sufficient for me:
One by Crockford (wouldn't recommend it though, it throws a whole lot of unrelated stuff into the parser and thus obscures it),
another one at effbot.org (uses Python),
and a third by a sadly even-less-famous guy who's developing a language himself, Robert Nystrom.
Alex Gaynor has ported David Beazley's excellent PLY to RPython. Its documentation is quite good, and he even gave a talk about using it to implement an interpreter at PyCon US 2013.

Concise description of the Lua vm?

I've skimmed Programming in Lua, I've looked at the Lua Reference.
However, they both tells me this function does this, but not how.
When reading SICP, I got this feeling of: "ah, here's the computational model underlying scheme"; I'm trying to get the same sense concerning Lua -- i.e. a concise description of it's vm, a "how" rather than a "what".
Does anyone know of a good document (besides the C source) describing this?
You might want to read the No-Frills Intro to Lua 5(.1) VM Instructions (pick a link, click on the Docs tab, choose English -> Go).
I don't remember exactly where I've seen it, but I remember reading that Lua's authors specifically discourage end-users from getting into too much detail on the VM; I think they want it to be as much of an implementation detail as possible.
Besides already mentioned A No-Frills Introduction to Lua 5.1 VM Instructions, you may be interested in this excellent post by Mike Pall on how to read Lua source.
Also see related Lua-Users Wiki page.
See http://www.lua.org/source/5.1/lopcodes.h.html . The list starts at OP_MOVE.
The computational model underlying Lua is pretty much the same as the computational model underlying Scheme, except that the central data structure is not the cons cell; it's the mutable hash table. (At least until you get into metaprogramming with metatables.) Otherwise all the familiar stuff is there: nested first-class functions with mutable local variables (let-bound variables in Scheme), and so on.
It's not clear to me that you'd get much from a study of the VM. I did some hacking on the VM a while back and it's a lot like any other register-oriented VM, although maybe a bit cleaner. Only a handful of instructions are Lua-specific.
If you're curious about the metatables, the semantics is described clearly, if somewhat verbosely, in Section 2.8 of the reference manual for Lua 5.1. If you look at the VM code in src/lvm.c you'll see almost exactly that logic implemented in C (e.g., the internal Arith function). The VM instructions are specialized for the common cases, but it's all terribly straightforward; nothing clever is involved.
For years I've been wanting a more formal specification of Lua's computational model, but my tastes run more toward formal semantics...
I've found The Implementation of Lua 5.1 very useful for understanding what Lua is actually doing.
It explains the hashing techniques, garbage collection and some other bits and pieces.
Another great paper is The Implmentation of Lua 5.0, which describes design and motivations of various key systems in the VM. I found that reading it was a great way to parse and understand what I was seeing in the C code.
I am surprised you refer to the C source for the VM as this is protected by lua.org and the tecgraf/puc rio in Brazil specially as the language is used for real business and commercial applications in a number of countries. The paper about The Implementation of lua contains details about the VM in the most detail it is permitted to include but the structure of the VM is proprietary. It is worth noting that versions 5.0 and 5' were commissioned by IBM in Europe for use on customer mainframes and their register-based version have a VM which accepts the IBM defined format of intermediate instructions.

F# parsing Abstract Syntax Trees

What is the best way to use F# to parse an AST to build an interpreter? There are plenty of F# examples for trivial syntax (basic arithmatical operations) but I can't seem to find anything for languages with much larger ranges of features.
Discriminated unions look to be incrediably useful but how would you go about constructing one with large number of options? Is it better to define the types (say addition, subtraction, conditionals, control flow) elsewhere and just bring them together as predefined types in the union?
Or have I missed some far more effective means of writing interpreters? Is having an eval function for each type more effective, or perhaps using monads?
Thanks in advance
Discriminated unions look to be
incrediably useful but how would you
go about constructing one with large
number of options? Is it better to
define the types (say addition,
subtraction, conditionals, control
flow) elsewhere and just bring them
together as predefined types in the
union?
I am not sure what you are asking here; even with a large number of options, DUs still are simple to define. See e.g. this blog entry for a tiny language's DU structure (as well as a more general discussion about writing tree transforms). It's fine to have a DU with many more cases, and common in compilers/interpreters to use such a representation.
As for parsing, I prefer monadic parser combinators; check out FParsec or see this old blog entry. After using such parser combinators, I can never go back to anything like lex/yacc/ANTLR - external DSLs seem so primitive in comparison.
(EDIT: The 'tiny arithmetic examples' you have found are probably pretty much representative of what larger solutions looks like as well. The 'toy' examples usually show off the right architecture.)
You should take a copy of Robert Pickering's "Beginning F#".
The chapter 13, "Parsing Text", contains an example with FsLex and FsYacc, as suggested by Noldorin.
Other than that, in the same book, chapter 12, the author explains how to build an actual simple compiler for an arithmetic language he proposes. Very enlightening. The most important part is what you are looking for: the AST parser.
Good luck.
I second Brian's suggestion to take a look at FParsec. If you're interested in doing things the old-school way with FsLex and FsYacc, one place to look for how to parse a non-trivial language is the F# source itself. See the source\fsharp\FSharp.Compiler directory in the distribution.
You may be interesting in checking out the Lexing and Parsing section of the F# WikiBook. The F# PowerPack library contains the FsLex and FsYacc tools, which asssist greatly with this. The WikiBook guide is a good way to get started with this.
Beyond that, you will need to think about how you actually want to execute the code from the AST form, which is common the design of both compilers and interpreters. This is generally considered the easier part however, and there are lots of general resources on compilers/interpreter out there that should provide information on this.
I haven't done an interpreter myself. Hope the following helps:)
Here's a compiler course taught at Yale using ML, which you might find useful. The lecture notes are very concise (short) and informative. You can follow the first few lecture notes and the assignments. As you know F#, you won't have problem reading ML programs.
Btw, the professor was a student of A. Appel, who is the creator of SML implementation. So from these notes, you also get the most natural way to write a compiler/interpreter in ML family language.
This is an excellent example of a complete Small Basic implementation with F# and FParsec. It includes even IL compiler. The whole code is very accessible and is accompanied by a series of blog post from the author at http://trelford.com/blog/

Is ANTLR an appropriate tool to serialize/deserialize a binary data format?

I need to read and write octet streams to send over various networks to communicate with smart electric meters. There is an ANSI standard, ANSI C12.19, that describes the binary data format. While the data format is not overly complex the standard is very large (500+ pages) in that it describes many distinct types. The standard is fully described by an EBNF grammar. I am considering utilizing ANTLR to read the EBNF grammar or a modified version of it and create C# classes that can read and write the octet stream.
Is this a good use of ANTLR?
If so, what do I need to do to be able to utilize ANTLR 3.1? From searching the newsgroup archives it seems like I need to implement a new stream that can read bytes instead of characters. Is that all or would I have to implement a Lexer derivative as well?
If ANTLR can help me read/parse the stream can it also help me write the stream?
Thanks.
dan finucane
You might take a look at Ragel. It is a state machine compiler/lexer that is useful for implementing on-the-wire protocols. I have read reports that it generates very fast code. If you don't need a parser and template engine, ragel has less overhead than ANTLR. If you need a full-blown parser, AST, and nice template engine support, ANTLR might be a better choice.
This subject comes up from time to time on the ANTLR mailing list. The answer is usually no, because binary file formats are very regular and it's just not worth the overhead.
It seems to me that having a grammar gives you a tremendous leg up.
ANTLR 3.1 has StringTemplate and code generation features that are separate from the parsing/lexing, so you can decompose the problem that way.
Seems like a winner to me, worth trying.

Resources