Possible to format the token output in antlr4?


In antlr when I run the following test command to grab tokens:
$ grun TestLexer tokens -tokens myfile.sql
[#0,0:5='SELECT',<'SELECT'>,1:0]
[#1,7:7='1',<NUMBER>,1:7]
[#2,9:12='with',<'WITH'>,2:0]
[#3,14:20='my_data',<IDENTIFIER>,2:5]
[#4,22:23='as',<'AS'>,2:13]
[#5,25:25='(',<'('>,2:16]
[#6,27:32='select',<'SELECT'>,2:18]
[#7,34:40=''text1'',<STRING_TOKEN>,2:25]
[#8,41:42='::',<'::'>,2:32]
Is there a way for it to actually name or format the token, such as for the last one to have:
[#8,41:42='::',<'TYPECAST(::)'>,2:32]
As it's been declared in my TestLexer.g4 file:
// Cast operator, '2014-01-01'::DATE
TYPECAST
:'::'
;

You would need to create a new implementation of the Token interface (the easiest way would be to subclass CommonToken and override the toString() method). You’d also need to create a new TokenFactory to create tokens of your type. At that point you can write your own code to print out the tokens. There’s no way to override the TokenFactory in the TestRig class.
There’s code to do just this in the “Tokens and Token Factories” section of The Definitive ANTLR 4 Reference.
Based on the questions you’re posting, this book is going to prove well worth its price.
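To illustrate the shape of that approach without depending on the ANTLR runtime, here is a minimal, self-contained Python sketch. It is not the ANTLR API: NamedToken, NamedTokenFactory, and the SYMBOLIC_NAMES table are hypothetical stand-ins for a CommonToken subclass, a TokenFactory, and the lexer's symbolic-name lookup. The output format is only loosely modelled on the grun listing in the question.

```python
from dataclasses import dataclass

# Hypothetical stand-in for a lexer's table of token type -> rule name.
SYMBOLIC_NAMES = {1: "SELECT", 2: "TYPECAST"}


@dataclass
class NamedToken:
    """Toy token (not an ANTLR class); fields mirror what grun prints."""
    index: int
    start: int
    stop: int
    text: str
    type: int
    line: int
    column: int

    def __str__(self) -> str:
        # Show the rule name plus the matched text, not just the literal.
        name = SYMBOLIC_NAMES.get(self.type, str(self.type))
        return (f"[#{self.index},{self.start}:{self.stop}='{self.text}',"
                f"<{name}({self.text})>,{self.line}:{self.column}]")


class NamedTokenFactory:
    """Toy factory: the single place where tokens of the custom type are built."""

    def __init__(self) -> None:
        self._next_index = 0

    def create(self, type: int, text: str, start: int,
               line: int, column: int) -> NamedToken:
        token = NamedToken(self._next_index, start, start + len(text) - 1,
                           text, type, line, column)
        self._next_index += 1
        return token
```

With this sketch, str(NamedTokenFactory().create(2, "::", 41, 2, 32)) gives [#0,41:42='::',<TYPECAST(::)>,2:32]. In real ANTLR code the same two pieces would be a token subclass and a factory installed on the lexer, driven by your own main program rather than grun.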

Related

Why does converting Objective-C to Swift give the wrong method name?

Why is
-(void)addSimpleListener:(id<XXSimpleListener>)listener
converted to Swift like this:
func add(_ listener: XXSimpleListener?) {
}
but if I change the method to this:
-(void)addSimpleListener:(id<XXSimpleListening>)listener
it is converted to this:
func addSimpleListener(_ listener: XXSimpleListening?) {
}

Xcode (or whatever tool you are using to do the conversion) is merely following Swift API guidelines. Specifically:
Omit needless words. Every word in a name should convey salient information at the use site.
More words may be needed to clarify intent or disambiguate meaning, but those that are redundant with information the reader already possesses should be omitted. In particular, omit words that merely repeat type information.
In the first case, the words SimpleListener in addSimpleListener repeat the type of the parameter, so they are removed from the method name. However, in the second case, SimpleListener and SimpleListening do not look the same to whatever tool you are using, so it decides that SimpleListener should be kept.
In my (human) opinion, though, the method should be named addListener, because:
Occasionally, repeating type information is necessary to avoid ambiguity, but in general it is better to use a word that describes a parameter’s role rather than its type.
Listener is the role of the parameter.

What does the function state.tokenize in 'CodeMirror' mean?

I want to know the function performed by tokenize in codemirror.

CodeMirror highlights text by calling a tokenizer function, passing it a context ("state") and a pointer to the current location in the file that needs to be highlighted ("stream"). The job of this function is to advance the stream past the next token and to return the type of that token. This is described fairly well in the CodeMirror API documentation here: http://codemirror.net/doc/manual.html#modeapi
In the case of xml.js (which you referenced in a comment), there are multiple tokenizer functions. Depending on the context, the mode sets the "tokenize" attribute of the state to refer to one of them. It then uses whichever function state.tokenize points to in order to find the next token in the stream.
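The switching pattern can be sketched in a few lines. The following is a toy model in Python, not CodeMirror's actual code: Stream, in_text, in_tag, and highlight are hypothetical names, and the "grammar" only distinguishes text outside angle brackets from text inside them.

```python
class Stream:
    """Toy stand-in for the stream: a string plus a cursor."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def eol(self):
        return self.pos >= len(self.text)

    def peek(self):
        return None if self.eol() else self.text[self.pos]

    def next(self):
        ch = self.peek()
        if ch is not None:
            self.pos += 1
        return ch


def in_text(stream, state):
    """Tokenizer used outside of a tag."""
    if stream.next() == "<":
        state["tokenize"] = in_tag  # switch tokenizer for the next call
        return "bracket"
    while not stream.eol() and stream.peek() != "<":
        stream.next()
    return "text"


def in_tag(stream, state):
    """Tokenizer used between '<' and '>'."""
    if stream.next() == ">":
        state["tokenize"] = in_text  # switch back
        return "bracket"
    while not stream.eol() and stream.peek() != ">":
        stream.next()
    return "tag"


def highlight(line):
    """Repeatedly call whichever tokenizer state['tokenize'] refers to."""
    stream, state = Stream(line), {"tokenize": in_text}
    tokens = []
    while not stream.eol():
        start = stream.pos
        style = state["tokenize"](stream, state)
        tokens.append((line[start:stream.pos], style))
    return tokens
```

For example, highlight("a<b>c") returns the pairs ("a", "text"), ("<", "bracket"), ("b", "tag"), (">", "bracket"), ("c", "text"). The driver loop never needs to know which context it is in; the tokenizers record that by reassigning state["tokenize"].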

F# using a class instance's member functions

I'm trying to create a connection to open a database over ODBC. I cannot figure out how to execute an object's member functions. The code:
let DbConnection = new System.Data.Odbc.OdbcConnection()
DbConnection.open
The errors I get are: Missing qualification after '.'
or sometimes: unexpected identifier in implementation file
Does anybody know what is wrong with my syntax?

I suppose you wanted something like this:
let dbConnection = new System.Data.Odbc.OdbcConnection()
dbConnection.Open()
The problems are:
F# is case sensitive, so you need Open rather than open (open is also a language keyword, so if you wanted to use it as a name, you'd have to write ``open``; the double back-tick is a way to refer to reserved names).
Open is a method, so to call it you need to give it an argument, here the unit value (). You can also treat it as a function value and write, say, let f = dbConnection.Open.
I also changed your naming to use camelCase for variables, which is the standard F# way.

`syntax` declaration must come before `data` declaration?

It seems that in Rascal a syntax declaration must come before a data declaration. Is that true? My experience is that if I put a syntax declaration after a data declaration, I got a parse error. Why is it a parse error?

Yes. Syntax declarations must come first in the file.
The reasoning is (I believe) that it should be simple to extract the grammar needed to parse the rest of the file.
You can of course always work around this if necessary by putting your type declarations in a separate file (probably only necessary if you need to add weirdo annotations to your grammar productions).

safety of keyword as function name

import is a keyword, yet the following works fine:
import 'dart:io';
void main() {
  import() {
    print("Imported");
  }
  import();
}
Is this supposed to work?
Is the language sufficiently stable that using this will continue to work?
What is special about import versus, say, class, which does not work, and which other keywords may be fair game?

Yes, this is supposed to work. And I think that yes, you can be reasonably sure that this will continue to work. To explain, let's take a look at the language specification.
Section 16.1.1 (Reserved words) explains that a reserved word may not be used as an identifier; it is a compile-time error if a reserved word is used where an identifier is expected. Here is the list of reserved words: assert, break, case, catch, class, const, continue, default, do, else, enum, extends, false, final, finally, for, if, in, is, new, null, rethrow, return, super, switch, this, throw, true, try, var, void, while, with. Note that import isn't mentioned here.
Then, section 12.30 (Identifier Reference) explains that there is a set of built-in identifiers, which looks like this: abstract, as, dynamic, export, external, factory, get, implements, import, library, operator, part, set, static, typedef. It is a compile-time error if a built-in identifier is used as the declared name of a class, type parameter or type alias. Note that import falls into this group, so you can't use it as a type, but you can use it elsewhere (as in your case, as a function name).
A non-normative part of section 12.30 explains the difference: Built-in identifiers are identifiers that are used as keywords in Dart, but are not reserved words in Javascript.
Just to note, in this answer, I quoted the PDF form of the Dart Language Specification version 0.30.
