ANTLR4 : Island grammar, token matching / skipping

I'm fighting an island grammar with ANTLR4, and while I can make it work, I still have doubts about whether this is the "proper" way.
I need to parse:
Some random text
{ }
#if(condition) {
more random text
#foobar
#if (condition2) {
random text {}
}
}
The problem lies in the context: a "wild" {} means nothing, but when the { } follows a language operator, it becomes meaningful (that is, it opens and closes a block).
In the above case, it would return the following, assuming that condition and condition2 are both true:
Some random text
{}
more random text
random text {}
I'm unsure which route to take. Any advice on the above?
The original implementation seems to be matching braces :
{ }
#if (true) {
{
foo
bar
} }
yields
{ }
{
foo
bar
}
while
{ }
#if (true) {
{
foo
bar
}
yields a parse error.

This can be solved with a context-specific lexer. By keeping track of the condition and block openings, we can determine whether a brace is template content or an actual block opening or closing.
See p. 219 of The Definitive ANTLR 4 Reference.
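The bookkeeping described above can be sketched outside ANTLR. The following Python scanner is only an illustration, not the grammar from the book: the `render` function, the `IF_OPEN` pattern, and the `conditions` dictionary are hypothetical names, and the directive syntax is assumed to be exactly `#if (name) {`. A stack records whether each open brace belongs to a directive or to plain content, so a `}` closes a block only when the matching `{` opened one.

```python
import re

# Assumed directive syntax: "#if (<name>) {" opens a real block.
IF_OPEN = re.compile(r"#if\s*\(\s*(\w+)\s*\)\s*\{")


def render(template, conditions):
    """Expand #if blocks; braces outside any block are plain text."""
    out = []
    stack = []  # entries: (kind, visible); kind is "block" or "text"
    pos = 0
    while pos < len(template):
        # Output is enabled unless an enclosing condition was false.
        visible = stack[-1][1] if stack else True
        m = IF_OPEN.match(template, pos)
        if m:
            # A brace directly behind "#if (...)" is a block opening.
            cond = conditions.get(m.group(1), False)
            stack.append(("block", visible and cond))
            pos = m.end()
            continue
        ch = template[pos]
        pos += 1
        if ch == "{" and stack:
            # Inside a block, a bare "{" is content but must be balanced.
            stack.append(("text", visible))
        elif ch == "}" and stack:
            kind, _ = stack.pop()
            if kind == "block":
                continue  # closes the #if block; not part of the output
        if visible:
            out.append(ch)
    if stack:
        raise ValueError("unclosed '{' inside an #if block")
    return "".join(out)


print(render("{ } #if (a) {x {y} z} w", {"a": True}))
```

At the top level the stack is empty, so braces pass through untouched; inside a block an unbalanced `{` leaves an entry on the stack and is reported as an error, which matches the brace-matching behaviour described in the question.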

Related

forEach vs for in: Different Behavior When Calling a Method

I noticed that forEach and for in produce different behavior. I have a list of RegExp and want to run hasMatch on each one. When iterating through the list using forEach, hasMatch never returns true. However, if I use for in, hasMatch returns true.
Sample code:
class Foo {
final str = "Hello";
final regexes = [new RegExp(r"(\w+)")];
String a() {
regexes.forEach((RegExp reg) {
if (reg.hasMatch(str)) {
return 'match';
}
});
return 'no match';
}
String b() {
for (RegExp reg in regexes) {
if (reg.hasMatch(str)) {
return 'match';
}
}
return 'no match';
}
}
void main() {
Foo foo = new Foo();
print(foo.a()); // prints "no match"
print(foo.b()); // prints "match"
}
(DartPad with the above sample code)
The only difference between the methods a and b is that a uses forEach and b uses for in, yet they produce different results. Why is this?

Although there is a prefer_foreach lint, that recommendation is specifically for cases where you can use it with a tear-off (a reference to an existing function). Effective Dart recommends against using Iterable.forEach with anything else, and there is a corresponding avoid_function_literals_in_foreach_calls lint to enforce it.
Except for those simple cases where the callback is a tear-off, Iterable.forEach is not any simpler than using a basic and more general for loop. There are more pitfalls using Iterable.forEach, and this is one of them.
Iterable.forEach is a function that takes a callback as an argument. Iterable.forEach is not a control structure, and the callback is an ordinary function. You therefore cannot use break to stop iterating early or use continue to skip to the next iteration.
A return statement in the callback returns from the callback, and the return value is ignored. The caller of Iterable.forEach will never receive the returned value and will never have an opportunity to propagate it. For example, in:
bool f(List<int> list) {
for (var i in list) {
if (i == 42) {
return true;
}
}
return false;
}
the return true statement returns from the function f and stops iteration. In contrast, with forEach:
bool g(List<int> list) {
list.forEach((i) {
if (i == 42) {
return true;
}
});
return false;
}
the return true statement returns from only the callback. The function g will not return until it completes all iterations and reaches the return false statement at the end. This perhaps is clearer as:
bool callback(int i) {
if (i == 42) {
return true;
}
}
bool g(List<int> list) {
list.forEach(callback);
return false;
}
which makes it more obvious that:
There is no way for callback to cause g to return true.
callback does not return a value along all paths.
(That's the problem you encountered.)
Iterable.forEach must not be used with asynchronous callbacks. Because any value returned by the callback is ignored, asynchronous callbacks can never be waited upon.
I should also point out that if you enable Dart's new null-safety features, which enable stricter type-checking, your forEach code will generate an error because it returns a value in a callback that is expected to have a void return value.
A notable case where Iterable.forEach can be simpler than a regular for loop is if the object you're iterating over might be null:
List<int>? nullableList;
nullableList?.forEach((e) => ...);
whereas a regular for loop would require an additional if check or doing:
List<int>? nullableList;
for (var e in nullableList ?? []) {
...
}
(In JavaScript, for-in has unintuitive pitfalls, so Array.forEach often is recommended instead. Perhaps that's why a lot of people seem to be conditioned to use a .forEach method over a built-in language construct. However, Dart does not share those pitfalls with JavaScript.)

👋 jamesdin! Everything you have shared about the limitations of forEach is correct; however, there's one part where you are wrong. In the code snippet showing how the return value from forEach is ignored, you have return true; inside the callback passed to forEach. That is not allowed, because the callback has a return type of void and must not return a value.
Although you have mentioned that returning a value from within the callback will result in an error, I'm just pointing at the code snippet.
Here's the signature for forEach:
void forEach(void action(E element))
Also, some more pitfalls of forEach are:
One can't use break or continue statements.
One can't access the index of the item, unlike with a regular indexed for loop.

Multiple types for a single variable (parameter/return type)

I am very new to Dart, so excuse me if I missed this part.
I want to make a union type e.g. for a function input. In TS this would be:
let variableInput: string | number
typedef doesn't really define types, only function signatures, and enums don't really help either.
Conversely, what should it look like when a function returns either one or the other of two types? There must be something I'm not seeing here.

There are no union types in Dart.
The way to do this in Dart is returning/accepting dynamic as a type:
dynamic stringOrNumber() { ... }
void main() {
final value = stringOrNumber();
if (value is String) {
// Handle a string value.
} else if (value is num) {
// Handle a number.
} else {
throw ArgumentError.value(value);
}
}
See also: https://dart.dev/guides/language/sound-dart

Redefining Lexical Tokens With Javacc

I'm quite new to creating language syntax with JavaCC, and I need to find a way to allow the user to redefine the definition of a token in code.
For example, the line
REDEFINE IF FOO
should change the definition of "IF" from
< IF: "IF" >
to
< IF: "FOO" >
If this is not possible, what would be the best way of solving this problem?

I think you can do it with a token action that changes the kind field of the token.
Something like the following. [Untested code follows. If you use it, please correct any errors in this answer.]
Make a token manager declaration of a hash map:
TOKEN_MGR_DECLS: {
public java.util.HashMap<String,Integer> keywordMap = new java.util.HashMap<String,Integer>() ;
{ keywordMap.put( "IF", ...Constants.IF); }
}
Make a definition for identifiers.
TOKEN : { <ID : (["a"-"z","A"-"Z"])(["a"-"z","A"-"Z","0"-"9"])* >
{ if( keywordMap.containsKey( matchedToken.image ) ) {
matchedToken.kind = keywordMap.get( matchedToken.image ) ; }
}
}
Make definitions for the keywords. These need to come after the definition of ID. Really, they are only here so that the kinds are created. They will be unreachable and may cause warnings.
TOKEN : { <IF : "A"> | ... }
In the parser you need to define the redefine production:
void redefine() :
{
Token oldToken;
Token newToken;
}
{
<REDEFINE> oldToken=redefinableToken() newToken=redefinableToken()
{
if( ...TokenManager.keywordMap.containsKey( oldToken.image ) ) {
// HashMap has put(), not add(): map the new spelling to the old token's kind.
...TokenManager.keywordMap.remove( oldToken.image ) ;
...TokenManager.keywordMap.put( newToken.image, oldToken.kind ) ; }
else {
/* report an error */ }
}
}
Token redefinableToken() :
{ Token t ; }
{
t=<ID> {return t ;}
| t=<IF> {return t ;}
| ...
}
See the FAQ (4.14) for warnings about trying to alter the behaviour of the lexer from the parser. Long story short: avoid lookahead.
Another approach is to simply have one token kind, say ID, and handle everything in the parser. See FAQ 4.19 on "Replacing keywords with semantic lookahead". Here lookahead will be less of a problem because semantic actions in the parser aren't executed during syntactic lookahead (FAQ 4.10).
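The idea shared by both approaches, a single identifier-like token whose kind is looked up in a table that can change at run time, can be sketched independently of JavaCC. The Python below is only an illustration under stated assumptions: `tokenize`, `WORD`, and the `keywords` table are hypothetical names, and the input is assumed to consist of words only. It is not JavaCC code and does not model JavaCC's lookahead.

```python
import re

# Same shape as the ID token above: a letter followed by letters or digits.
WORD = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def tokenize(source):
    """Classify words using a keyword table that REDEFINE can change."""
    keywords = {"IF": "IF", "REDEFINE": "REDEFINE"}  # spelling -> kind
    tokens = []
    words = WORD.findall(source)
    i = 0
    while i < len(words):
        # Every word is matched the same way; the table decides its kind.
        kind = keywords.get(words[i], "ID")
        if kind == "REDEFINE":
            # "REDEFINE old new": move the kind of `old` to the spelling `new`.
            old, new = words[i + 1], words[i + 2]
            if old not in keywords:
                raise ValueError(f"{old!r} is not a keyword")
            keywords[new] = keywords.pop(old)
            i += 3
            continue
        tokens.append((kind, words[i]))
        i += 1
    return tokens


print(tokenize("IF x REDEFINE IF FOO FOO y IF"))
```

Because this sketch classifies one word at a time, a redefinition takes effect on the very next word. A real JavaCC parser may already have tokenized ahead of the REDEFINE statement, which is the lookahead hazard the FAQ warns about.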

Purpose of #relay(pattern:true)

A new expression, #relay(pattern: true), was introduced in the changelog for Relay.js 0.5.
But I can't figure out from the description or the tests what exactly it does, or when I should use it when writing fat queries.
An example would be very helpful.

Consider a GraphQL query like the following:
viewer {
friends(first: 10) {
totalCount
edges { node { name } }
pageInfo { hasNextPage }
}
}
When defining a fat query for a Relay mutation, including the name of a field without specifying any of its subfields tells Relay that any subfield of that field could change as a result of that mutation.
Unfortunately, omitting connection arguments such as find, first, and last on the friends field causes a validation error on the connection-argument-dependent fields edges and pageInfo:
getFatQuery() {
return Relay.QL`
fragment on AddFriendMutationPayload {
viewer {
friends { edges, pageInfo } # Will throw the validation error below
}
}
`;
}
// Uncaught Error: GraphQL validation/transform error ``You supplied the `pageInfo`
// field on a connection named `friends`, but you did not supply an argument necessary
// to do so. Use either the `find`, `first`, or `last` argument.`` in file
// `/path/to/MyMutation.js`.
You can use the #relay(pattern: true) directive to indicate that you want to use the fat query to pattern-match against the tracked query, rather than as a fully fledged query.
getFatQuery() {
return Relay.QL`
fragment on AddFriendMutationPayload #relay(pattern: true) {
viewer {
friends { edges, pageInfo } # Valid!
}
}
`;
}
For more information about mutations see: https://facebook.github.io/relay/docs/guides-mutations.html#content

How to debug static code block in GEB Page model

I am trying out Geb and wanted to debug the static code block in the examples. I have tried to set breakpoints, but I seem unable to inspect the data that is used in the static content block.
class GoogleResultsPage extends Page {
static at = { results }
static content = {
results(wait: true) { $("li.g") }
result { i -> results[i] }
resultLink { i -> result(i).find("a.l")[0] }
firstResultLink { resultLink(0) }
}
}
Any clue on how this normally can be debugged using for example IntelliJ?

Since the content block uses a DSL and undergoes a transformation when compiled, I suspect it can't be debugged without special support from the IDE; however, I hope someone can prove me wrong.
The approach I have been using is to define methods for anything beyond the core content. This provides a few benefits, including debugging support, IDE autocompletion when writing tests, and good refactoring support. The drawback of course is slightly more verbose code, although the tradeoff has been worth it for my purposes.
Here's how I might do the GoogleResultsPage:
class GoogleResultsPage extends Page {
static at = { results }
static content = {
results(wait: true) { $("li.g") }
}
Navigator result(int i) { results[i] }
Navigator resultLink(int i) { result(i).find("a.l")[0] }
Navigator firstResultLink() { resultLink(0) }
}
Then when writing the test I use a slightly more typed approach:
class MySpec extends GebReportingSpec {
def "google search with keyword should have a first result"() {
given:
GoogleHomePage homePage = to(GoogleHomePage)
when:
homePage.search("keyword")
then:
GoogleResultsPage resultsPage = at(GoogleResultsPage)
resultsPage.result(0).displayed
}
}
