In the dart documentation there is a little note that say
Note: Switch statements in Dart are intended for limited circumstances, such as in interpreters or scanners.
I want to know what interpreters and scanners are and what they mean with that.
Interpreters and Scanners are code that typically looks at one character/byte at a time. Switch is probably optimized for that.
Related
I'm reading some yacc and lex related stuff and some other compiler implementations, it seems like they are all using global state and hence really unsafe to use multithreaded situations so it is hard to embed them in other programs. I know GNU Bison and Flex could be used for re-entrancy but why are they not on by default?
Because when the interface for Lex and Yacc was defined, many many many years ago, the use of globals was much more common. Reentrancy changes the interface, and the reentrant interfaces have never been formally standardised (which is probably just as well, given the state of play). At the time, multithreading was not very common, largely because a typical computer just barely had the resources to do one compilation (and sometimes not even that; it was also pretty common for compilation passes to be sequentially loaded executables).
So the default continues to be the non-reentrant, standardised interface. And it probably will remain that way, whether or not we like it.
I have recently started doing some work with COBOL, where I have only ever done work in z/OS Assembler on a Mainframe before.
I know that COBOL will be translated into Mainframe machine-code, but I am wondering if it is possible to see the generated code?
I want to use this to better understand the under workings of COBOL.
For example, if I was to compile a COBOL program, I would like to see the assembly that results from the compile. Is something like this possible?
Relenting, only because of this: "I want to use this to better understand the under workings of Cobol".
The simple answer is that there is, for Enterprise COBOL on z/OS, a compiler option, LIST. LIST will provide what is known as the "pseudo assembler" output in your compile listing (and some other useful stuff for understanding the executable program). Another compiler option, OFFSET, shows the displacement from the start of the program of the code generated for each COBOL verb. LIST (which inherently has the offset already) and OFFSET are mutually exclusive. So you need to specify LIST and NOOFFSET.
Compiler options can be specified on the PARM of the EXEC PGM= for the compiler. Since the PARM is limited to 100 characters, compiler options can also be specified in a data set, with a DDName of SYSOPTF (which, in turn, you use a compiler option to specify its use).
A third way to specify compiler options is to include them in the program source, using the PROCESS or (more common, since it is shorter) CBL statement.
It is likely that you have a "panel" to compile your programs. This may have a field allowing options to be specified.
However, be aware of a couple of things: it is possible, when installing the compiler, to "nail in" compiler options (which means they can't be changed by the application programmer); it is possible, when installing the compiler, to prevent the use of PROCESS/CBL statements.
The reason for the above is standardisation. There are compiler options which affect code generation, and using different code generation options within the same system can cause unwanted affects. Even across systems, different code generation options may not be desirable if programmers are prone to expect the "normal" options.
It is unlikely that listing-only options will be "nailed", but if you are prevented from specifying options, then you may need to make a special request. This is not common, but you may be unlucky. Not my fault if it doesn't work for you.
This compiler options, and how you can specify them, are documented in the Enterprise COBOL Programming Guide for your specific release. There you will also find the documentation of the pseudo-assembler (be aware that it appears in the document as "pseudo-assembler", "pseudoassembler" and "pseudo assembler", for no good reason).
When you see the pseudo-assembler, you will see that it is not in the same format as an Assembler statement (I've never discovered why, but as far as I know it has been that way for more than 40 years). The line with the pseudo-assembler will also contain the machine-code in the format you are already familiar with from the output of the Assembler.
Don't expect to see a compiled COBOL program looking like an Assembler program that you would write. Enterprise COBOL adheres to a language Standard (1985) with IBM Extensions. The answer to "why does it do it likely that" will be "because", except for optimisations (see later).
What you see will depend heavily on the version of your compiler, because in the summer of 2013, IBM introduced V5, with entirely new code-generation and optimisation. Up to V4.2, the code generator dated back to "ESA", which meant that over 600 machine instructions introduced since ESA were not available to Enterprise COBOL programs, and extended registers. The same COBOL program compiled with V4.2 and with V6.1 (latest version at time of writing) will be markedly different, and not only because of the different instructions, but also because the structure of an executable COBOL program was also redesigned.
Then there's opimisation. With V4.2, there was one level of possible optimisation, and the optimised code was generally "recognisable". With V5+, there are three levels of optimisation (you get level zero without asking for it) and the optimisations are much more extreme, including, well, extreme stuff. If you have V5+, and want to know a bit more about what is going on, use OPT(0) to get a grip on what is happening, and then note the effects of OPT(1) and OPT(2) (and realise, with the increased compile times, how much work is put into the optimisation).
There's not really a substantial amount of official documentation of the internals. Search-engineing will reveal some stuff. IBM's Compiler Cafe:COBOL Cafe Forum - IBM is a good place if you want more knowledge of V5+ internals, as a couple of the developers attend there. For up to V4.2, here may be as good a place as any to ask further specific questions.
Can anyone recommend me an open-source full OCaml parser?
Essentially, I would like to implement my own type-checker for OCaml. Ideally, the parser is written in OCaml. I would just use it to get the AST of the input program. (it is probably too much to ask for the initial typing environment pre-filled with standard library function signatures)
Use compiler-lib that is distributed with OCaml under QPL license. It has everything needed to create your own compiler (and even has some documentation). compiler-lib is essentially a compiler shipped as library.
Otherwise, you can use camlp4 to get the parsetree, but then you need to reimplement everything else from scratch. But at this case you're not restricted with QPL.
it is probably too much to ask for the initial typing environment pre-filled with standard library function signatures
It is not! See the files typing/predef.ml(i).
As for the stdlib, just use the same one as the compiler, except for pervasive which uses values from Predef, the rest is normal OCaml code without any special cases (except bootstraping, obviously).
I'm currently working on a project that makes use of a custom language with a simple context-free grammar.
Due to the project's characteristics the same language will have to be used on several platforms, especially mobile ones. Currently, I'm using my small hand-written Java parser (for the Android platform). Soon, I'll have to write basically the same parser for JavaScript and later possibly also for C# (Windows Phone) and Objective C (iOS). There is an additional chance that I'll also have to write it for PHP.
My question is: What options are there to simplify the parser development process? Do I really have to write basically the same parser for each platform or is there a less work-intensive way?
From a development process point of view the best alternative would enable me to write a grammar definition which would then automatically be compiled into a parser.
However, basically the only cross-platform parser generator I've found so far it the GOLD Parser which supports two of my target platforms (Java and C#). It would really be awesome if you could point me to other alternatives.
In case you don't know about other cross-platform compiler-compilers: Do you have hints how to structure the code towards future language extensibility?
I commend https://en.wikipedia.org/wiki/Comparison_of_parser_generators to your attention: if we restrict the domain to Java and C/C++, it suggests APG, GOLD, SableCC, and SLK (amongst others) as being cross-language enough for your stated goals. (I'm also requiring that the action code be separated from the grammar rather than inline, since the latter would defeat the purpose.) If you want JavaScript as well, it looks like your choices are APG (GPL-licensed) and WaxEye (MIT-licensed).
If your language is reasonably simple then I would say to just go with whichever you think will be easiest to integrate into your build environment(s) and has a reasonable match with how you think. Unless parsing time is a huge fraction of your application's total workload, parsing speed should not be an issue -- although table size and memory usage might matter in a mobile context. If your grammar is "simple enough," (i.e. not Perl, for instance) I would expect any of those tools to work.
Have a look in Antlr, I am using it for transforming java code and it is really great. Moreover you can find different grammars here.
REx parser generator supports the required targets, except for Objective C and PHP (code generators for those might be possible). It has not yet been published as open source, though, and there is no decent documentation, just sample grammars. But there are projects that are using it successfully, e.g. xqlint. Here is a paper describing the experience from that project.
For a research project i'm looking for an interpreter or even a parser for a programming language (doesn't matter what programming language) that has been ported to a number of languages. This probably means the code is small enough to do so.
I know Lisp-ish languages have been ported to a lot of environments, because Lisp is so easy to parse, however I haven't found a single implementation that has been multi targetted. For instance; it is very hard to find a version which works in PHP where the same code (the Lisp which runs on top / is parsed that is) would also work in Python.
Hope someone here can help...
What do I want to do with it? For a tool I'm making, the user group will write tiny pieces of logic; however the system underneath differs while the logic is the same. We don't want to force our users to learn Java, PHP, C#, etc just to write that logic.
The Lua (www.lua.org) scripting language can run from within C and has bindings to Python, php, Java, C#, probably some other languages too. It's a very small interpreter (something like 200k when compiled) because it comes "without batteries" - no builtin functions for some common operations like copying arrays. It should be pretty trivial to add support for embedding in another language, compared to other scripting languages, if need be.