Rascal: TrafoFields Syntax error: concrete syntax fragment

I'm trying to re-create Tijs' CurryOn16 example "TrafoFields" scraping the code from the video, but using the Java18.rsc grammar instead of his Java15.rsc. I've parsed the Example.java successfully in the repl, like he did in the video, yielding a var pt. I then try to do the transformation with trafoFields(pt). The response I get is:
|project://Rascal-Test/src/TrafoFields.rsc|(235,142,<12,9>,<16,11>): Syntax error: concrete syntax fragment
My TrafoFields.rsc looks like this:
module TrafoFields
import lang::java::\syntax::Java18;
/**
* - Make public fields private
* - add getters and setters
*/
start[CompilationUnit] trafoFields(start[CompilationUnit] cu) {
  return innermost visit (cu) {
    case (ClassBody)`{
                    ' <ClassBodyDeclaration* cs1>
                    ' public <Type t> <ID f>;
                    ' <ClassBodyDeclaration* cs2>
                    '}`
      => (ClassBody)`{
                    ' <ClassBodyDeclaration* cs1>
                    ' private <Type t> <ID f>;
                    ' public void <ID setter>(<Type t> x) {
                    '   this.<ID f> = x;
                    ' }
                    ' public <Type t> <ID getter>() {
                    '   return this.<ID f>;
                    ' }
                    ' <ClassBodyDeclaration* cs2>
                    '}`
    when
      ID setter := [ID]"set<f>",
      ID getter := [ID]"get<f>"
  }
}
The only deviation from Tijs' code is that I've changed ClassBodyDec* to ClassBodyDeclaration*, as the grammar has this as a non-terminal. Any hint what else could be wrong?
UPDATE
More non-terminal re-writing adapting to Java18 grammar:
Id => ID

Ah yes, parse errors are the Achilles' heel of concrete syntax usability.
Note that a generalized parser (such as GLL, which Rascal uses) simulates "unlimited lookahead", so a parse error may be reported a few characters or even a few lines after the actual cause (but never before!). Shortening the example (delta debugging) therefore helps to localize the cause.
My way of working here is:
First, replace all pattern holes by concrete Java snippets. I know Java, so I should be able to write a correct fragment that would have matched the holes.
If there is still a parse error, check the top non-terminal. Is it the one you needed? Also make sure there is no extra whitespace before the start and after the end of the fragment inside the backquotes. Still a parse error? Write a shorter fragment for a sub-non-terminal first.
Parse error solved? Then one of the pattern holes was not syntactically correct. The type of the hole is leading here: it should be one of the non-terminals used in the grammar, spelled literally, and of course it should sit at the right spot in the fragment. Add the holes back in one by one until you hit the error again. Then you know the cause and probably also the fix.
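The "add the holes back one by one" step can be sketched outside Rascal as a small search loop. The following Python sketch is only an illustration: first_failing_hole, toy_parses and the set KNOWN_NONTERMINALS are hypothetical names, and toy_parses is a stand-in for actually loading the fragment in Rascal. It mimics the situation from the question, where Java18 spells the identifier non-terminal ID rather than Id.

```python
import re
from typing import Callable, List, Optional, Tuple

# Assumption: a stand-in for the non-terminals the grammar really defines.
KNOWN_NONTERMINALS = {"Type", "ID"}

def toy_parses(fragment: str) -> bool:
    """Toy oracle: reject any hole whose type is not a known non-terminal.
    In practice this would be 'does Rascal accept the fragment?'."""
    hole_types = re.findall(r"<(\w+) \w+>", fragment)
    return all(t in KNOWN_NONTERMINALS for t in hole_types)

def first_failing_hole(
    template: str,
    holes: List[Tuple[str, str]],
    parses: Callable[[str], bool],
) -> Optional[int]:
    """template has {0}, {1}, ... placeholders; holes[i] is the pair
    (concrete snippet, pattern hole). Returns the index of the first
    hole that reintroduces the parse error, or None if none does."""
    chosen = [concrete for concrete, _ in holes]
    if not parses(template.format(*chosen)):
        raise ValueError("fails even without holes: check the non-terminal or shorten the fragment")
    for i, (_, hole) in enumerate(holes):
        chosen[i] = hole  # put one more hole back, keeping the earlier ones
        if not parses(template.format(*chosen)):
            return i
    return None

culprit = first_failing_hole(
    "public {0} {1};",
    [("int", "<Type t>"), ("x", "<Id f>")],
    toy_parses,
)
```

Here culprit points at the second hole, whose type name does not occur in the (toy) grammar.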


Preserving whitespace in Rascal when transforming Java code

I am trying to add instrumentation (e.g. logging some information) to methods in a Java file. I am using the following Rascal code which seems to work mostly:
import ParseTree;
import lang::java::\syntax::Java15;
// .. more imports
// project is a loc
M3 model = createM3FromEclipseProject(project);
set[loc] projectFiles = { file | file <- files(model)} ;
for (pFile <- projectFiles) {
    CompilationUnit cunit = parse(#CompilationUnit, pFile);
    cUnitNew = visit(cunit) {
        case (MethodBody) `{<BlockStm* post>}`
            => (MethodBody) `{
                            'System.out.println(new Throwable().getStackTrace()[0]);
                            '<BlockStm* post>
                            '}`
    }
    writeFile(pFile, cUnitNew);
}
I am running into two issues regarding whitespace, which might be unrelated.
The line of code that I am inserting does not preserve whitespace that was there previously. If there was a tab character, it will now be removed. The same is true for the line directly following the line I am inserting and the closing brace. How can I 'capture' whitespace in my pattern?
Example before transforming (all lines start with a tab character, line 2 and 3 with two):
	void beforeFirst() throws Exception {
		rowIdx = -1;
		rowSource.beforeFirst();
	}
Example after transforming:
	void beforeFirst() throws Exception {
System.out.println(new Throwable().getStackTrace()[0]);
rowIdx = -1;
		rowSource.beforeFirst();
}
An additional issue regarding whitespace; if a file ends on a newline character, the parse function will throw a ParseError without further details. Removing this newline from the original source will fix the issue, but I'd rather not 'manually' have to fix code before parsing. How can I circumvent this issue?
Alas, capturing whitespace with a concrete pattern is not a feature of the current version of Rascal. We used to have it, but now it's back on the TODO list. I can point you to papers about the topic if you are interested. So for now you have to deal with this "damage" later.
You could write a Tree-to-Tree transformation on the generic level (see ParseTree.rsc) to fix indentation issues in a parse tree after your transformation, or to re-insert the comments that you lost. This is about matching the Tree data type and its appl constructors. The Tree format is a form of reflection on the parse trees of Rascal that allows any kind of transformation, including transformations of whitespace and comments.
The parse error you talked about is caused by not using the start non-terminal. If you use parse(#start[CompilationUnit], ...) then whitespace and comments before and after the CompilationUnit are accepted.

Invalid coercion: () as xs:string in xdmp:xslt-eval

I am making use of MarkLogic's ability to call XQuery functions in the XSL transform.
Let's say I have an XQuery library with a function whose signature looks like the following. This is for illustrative purposes only.
declare function my-func:ex-join($first as xs:string, $last as xs:string) as xs:string
{
  fn:concat($first, '-', $last)
};
From XQuery, I can call this function with empty sequence as parameters, with no issues, i.e.
ex-join((), '1244')
The function will just return an empty sequence, but I don't get any errors. However, if I try to call this function from within my XSL transform, as in:
<xsl:value-of select="my-func:ex-join(//node/value/text(), 'something')"/>
and /node/value does not exist, so that an empty sequence is passed in, I get the coercion error.
Does anyone have suggestions to work around the coercion problem, outside of checking the values in XSL prior to the select? These are auto-generated XSL templates, which would mean a lot of coded checks.
Thanks,
-tj
Attempts to invoke that function in XQuery would fail too. It is probably due to function mapping that you don't notice this though. Put the following at the top of your XQuery code:
declare option xdmp:mapping "false";
Besides this, you only need to change the signature of your function to accept empty sequences. Replace as xs:string with as xs:string? for the parameters:
declare function my-func:ex-join($first as xs:string?, $last as xs:string?) as xs:string
fn:concat accepts empty sequences as arguments, so no further changes are required to make it work.
HTH!
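Why the call appears to work from XQuery can be illustrated with a toy model of function mapping. The Python sketch below is not MarkLogic code: sequences are modelled as lists, and call_with_mapping and call_strict are hypothetical names for "mapping enabled" and "mapping disabled" respectively. Only the first argument is mapped, which is a simplification.

```python
from typing import Callable, List

def ex_join(first: str, last: str) -> str:
    """Counterpart of my-func:ex-join: both arguments are single strings."""
    return first + "-" + last

def call_with_mapping(fn: Callable[[str, str], str],
                      first_seq: List[str], last: str) -> List[str]:
    """Model of function mapping: the function is applied once per item.
    An empty sequence means zero calls, hence an empty result and no error."""
    return [fn(item, last) for item in first_seq]

def call_strict(fn: Callable[[str, str], str],
                first_seq: List[str], last: str) -> List[str]:
    """Model of a call without mapping: the argument must be exactly one item."""
    if len(first_seq) != 1:
        raise TypeError(
            f"cannot coerce a sequence of {len(first_seq)} items to a single xs:string")
    return [fn(first_seq[0], last)]

mapped = call_with_mapping(ex_join, [], "1244")   # silently empty
single = call_strict(ex_join, ["tj"], "1244")     # one item is fine
```

In this model the empty argument never reaches the function body when mapping is on, which is why the type mismatch goes unnoticed until mapping is switched off or the call comes from a context that does not map.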

writing a function to do type cast

I'm trying to write a function that does type casting, which seems to be a frequently occurring activity in Rascal code. But I can't seem to get it right. The following and several variations on it fail.
public &T cast(type[&T] tp, value v) throws str {
    if (tp tv := v)
        return tv;
    else
        throw "cast failed";
}
Can someone help me out?
Some more info: I frequently use pattern matching against a pattern of the form "Type Var" (i.e. against a variable declaration) in order to tell Rascal that an expression has a certain type, e.g.
map[str,value] m := myexp
This is usually in cases where I know that myexp has type map[str,value], but omitting the matching would make Rascal's type checking mechanism complain.
In order to be a bit more defensive against mistakes, I usually wrap the matching construct in an if-then-else where an exception is raised if the match fails:
if (map[str,value] m := myexp) {
    // use m
} else {
    throw "cast failed";
}
I would like to shorten all such similar pieces of code using a single function that does the job generically, so that I can write instead
cast(#map[str,value], myexp)
PS. Also see How to cast a value type to Map in Rascal?
It seems that the best way to write this, if you truly need to do this, is the following:
public map[str,value] cast(map[str,value] v) = v;
public default map[str,value] cast(value v) { throw "cast failed!"; }
Then you could just say
m = cast(myexp);
and it would do what you want to do -- the actual pattern matching is moved into the function signature for cast, with a case specific to the type you are wanting to use and a case that handles everything that doesn't otherwise match.
However, I'm still not sure why you are using type value, either here (inside the map) or in the linked question. The "standard" Rascal way of handling cases where you could have one of multiple choices is to define these with a user-defined data type and constructors. You could then use pattern matching to match the constructors, or use the is and has keywords to interrogate a value to check to see if it was created using a specific constructor or if it has a specific field, respectively. The rule for fields is that all occurrences of a field in the constructor definitions for a given ADT have the same type. So, it may help to know more about your usage scenario to see if this definition of cast is the best option or if there is a better solution to your problem.
EDITED
If you are reading JSON, an alternate way to do it is to use the JSON grammar and AST that also live in that part of the library (I think the one you are using is more of a stream reader, like our current text readers and writers, but I would need to look at the code more to be sure). You can then do something like this (long output included to give an idea of the results):
rascal>import lang::json::\syntax::JSON;
ok
rascal>import lang::json::ast::JSON;
ok
rascal>import lang::json::ast::Implode;
ok
rascal>js = buildAST(parse(#JSONText, |project://rascal/src/org/rascalmpl/library/lang/json/examples/twitter01.json|));
Value: object((
"since_id":integer(0),
"refresh_url":string("?since_id=202744362520678400&q=amsterdam&lang=en"),
"page":integer(1),
"since_id_str":string("0"),
"completed_in":float(0.058),
"results_per_page":integer(25),
"next_page":string("?page=2&max_id=202744362520678400&q=amsterdam&lang=en&rpp=25"),
"max_id_str":string("202744362520678400"),
"query":string("amsterdam"),
"max_id":integer(202744362520678400),
"results":array([
object((
"from_user":string("adekamel"),
"profile_image_url_https":string("https:\\/\\/si0.twimg.com\\/profile_images\\/2206104506\\/339515338_normal.jpg"),
"in_reply_to_status_id_str":string("202730522013728768"),
"to_user_id":integer(215350297),
"from_user_id_str":string("366868475"),
"geo":null(),
"in_reply_to_status_id":integer(202730522013728768),
"profile_image_url":string("http:\\/\\/a0.twimg.com\\/profile_images\\/2206104506\\/339515338_normal.jpg"),
"to_user_id_str":string("215350297"),
"from_user_name":string("nurul amalya \\u1d54\\u1d25\\u1d54"),
"created_at":string("Wed, 16 May 2012 12:56:37 +0000"),
"id_str":string("202744362520678400"),
"text":string("#Donnalita122 #NaishahS #fatihahmS #oishiihotchoc #yummy_DDG #zaimar93 #syedames I\'m here at Amsterdam :O"),
"to_user":string("Donnalita122"),
"metadata":object(("result_type":string("recent"))),
"iso_language_code":string("en"),
"from_user_id":integer(366868475),
"source":string("<a href="http:\\/\\/blackberry.com\\/twitter" rel="nofollow">Twitter for BlackBerry\\u00ae<\\/a>"),
"id":integer(202744362520678400),
"to_user_name":string("Rahmadini Hairuddin")
)),
object((
"from_user":string("kelashby"),
"profile_image_url_https":string("https:\\/\\/si0.twimg.com\\/profile_images\\/1861086809\\/me_beach_normal.JPG"),
"to_user_id":integer(0),
"from_user_id_str":string("291446599"),
"geo":null(),
"profile_image_url":string("http:\\/\\/a0.twimg.com\\/profile_images\\/1861086809\\/me_beach_normal.JPG"),
"to_user_id_str":string("0"),
"from_user_name":string("Kelly Ashby"),
"created_at":string("Wed, 16 May 2012 12:56:25 +0000"),
"id_str":string("202744310872018945"),
"text":string("45 days til freedom! Cannot wait! After Paris: London, maybe Amsterdam, then southern France, then CANADA!!!!"),
"to_user":null(),
"metadata":object(("result_type":string("recent"))),
"iso_language_code":string("en"),
"from_user_id":integer(291446599),
"source":string("<a href="http:\\/\\/mobile.twitter.com" rel="nofollow">Mobile Web<\\/a>"),
"id":integer(202744310872018945),
"to_user_name":null()
)),
object((
"from_user":string("johantolsma"),
"profile_image_url_https":string("https:\\/\\/si0.twimg.com\\/profile_images\\/1961917557\\/image_normal.jpg"),
"to_user_id":integer(0),
"from_user_id_str":string("23632499"),
"geo":null(),
"profile_image_url":string("http:\\/\\/a0.twimg.com\\/profile_images\\/1961917557\\/image_normal.jpg"),
"to_user_id_str":string("0"),
"from_user_name":string("Johan Tolsma"),
"created_at":string("Wed, 16 May 2012 12:56:16 +0000"),
"id_str":string("202744274050236416"),
"text":string("RT #agerolemou: Office space for freelancers in Amsterdam http:\\/\\/t.co\\/6VfHuLeK"),
"to_user":null(),
"metadata":object(("result_type":string("recent"))),
"iso_language_code":string("en"),
"from_user_id":integer(23632499),
"source":string("<a href="http:\\/\\/itunes.apple.com\\/us\\/app\\/twitter\\/id409789998?mt=12" rel="nofollow">Twitter for Mac<\\/a>"),
"id":integer(202744274050236416),
"to_user_name":null()
)),
object((
"from_user":string("hellosophieg"),
"profile_image_url_https":string("https:\\/\\/si0.twimg.com\\/profile_images\\/2213055219\\/image_normal.jpg"),
"to_user_id":integer(0),
"from_user_id_str":string("41153106"),
"geo":null(),
"profile_image_url":string("http:\\/\\/a0.twimg.com\\/profile_images\\/2213055219\\/image_normal.jp...
rascal>js is object;
bool: true
rascal>js.members<0>;
set[str]: {"since_id","refresh_url","page","since_id_str","completed_in","results_per_page","next_page","max_id_str","query","max_id","results"}
rascal>js.members["results_per_page"];
Value: integer(25)
You can then use pattern matching, over the types defined in lang::json::ast::JSON, to extract the information you need.
The code has a bug. This is the fixed code:
public &T cast(type[&T] tp, value v) throws str {
    if (&T tv := v)
        return tv;
    else
        throw "cast failed";
}
Note that we do not wish to include this in the standard library. Rather, let's collect the cases where it is needed and find out how to solve them in another way.
If you find that you need this kind of casting often, you might be avoiding the better parts of Rascal, such as pattern-based dispatch. See also the answer by Mark Hills.

How to understand the "isCommitted" property of ParserResult?

I'm reading the source of polux's great parsers, and found there is a special isCommitted property which I can't understand:
class ParseResult<A> {
  final bool isSuccess;
  final bool isCommitted;
  /// [:null:] if [:!isSuccess:]
  final A value;
  final String text;
  final Position position;
  final Expectations expectations;
  // ...
}
There is already an isSuccess property that indicates whether parsing succeeded, so why do we need isCommitted as well? I tried to read the related code, but I still don't understand.
If you want to see the source, you can find it here.
The short answer is: don't worry about isCommitted, it's for internal purposes only.
The long answer is: you can call commited on a parser, which means that once it has succeeded, you know for sure that it's pointless to backtrack (very much like Prolog's cut). For instance, consider a grammar like this:
expr() => str('(') + rec(expr) + str(')') ^ ...
        | num()
Assume we parse the string "(...". Once we have recognized the parenthesis, we know for sure that if ... turns out not to be an expr, there is no need to rewind to the start of the string and try to parse a num, since a num will never start with a parenthesis anyway. We can fail early. This is done by marking ( as being a "commit point":
expr() => str('(').commited + rec(expr) + str(')') ^ ...
        | num()
This is an optimisation which should be used with great care because it breaks the modularity of parsers with respect to |. I personally never had to use it so far.
Whenever you call commited on a parser, it returns a new parser whose isCommitted property is true. It is then used by | to decide whether to backtrack or not. This is what isCommitted is used for. As an end user you should never have to care. I should probably make it private.
This feature is inspired by Polyparse's commit.
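How a commit flag lets | skip backtracking can be shown with a very small combinator sketch. This is Python, not the Dart library, and it is not the library's implementation: Result, char, digit, committed, seq, alt and expr are names made up for this illustration, and the grammar is the non-recursive simplification '(' digit ')' | digit.

```python
from dataclasses import dataclass
from typing import Any, Callable, List

@dataclass
class Result:
    is_success: bool
    is_committed: bool
    value: Any
    pos: int

Parser = Callable[[str, int], Result]

def char(c: str) -> Parser:
    def run(s: str, i: int) -> Result:
        if i < len(s) and s[i] == c:
            return Result(True, False, c, i + 1)
        return Result(False, False, None, i)
    return run

def digit(calls: List[int]) -> Parser:
    """Parse one digit and record every position at which it is attempted."""
    def run(s: str, i: int) -> Result:
        calls.append(i)
        if i < len(s) and s[i].isdigit():
            return Result(True, False, s[i], i + 1)
        return Result(False, False, None, i)
    return run

def committed(p: Parser) -> Parser:
    """Mark a successful result of p as a commit point."""
    def run(s: str, i: int) -> Result:
        r = p(s, i)
        return Result(r.is_success, r.is_committed or r.is_success, r.value, r.pos)
    return run

def seq(p: Parser, q: Parser) -> Parser:
    def run(s: str, i: int) -> Result:
        a = p(s, i)
        if not a.is_success:
            return a
        b = q(s, a.pos)
        flag = a.is_committed or b.is_committed  # the commit survives a later failure
        value = (a.value, b.value) if b.is_success else None
        return Result(b.is_success, flag, value, b.pos)
    return run

def alt(p: Parser, q: Parser) -> Parser:
    def run(s: str, i: int) -> Result:
        a = p(s, i)
        if a.is_success or a.is_committed:
            return a          # committed failure: do not try the alternative
        return q(s, i)        # ordinary failure: backtrack to position i
    return run

def expr(open_paren: Parser, fallback_calls: List[int]) -> Parser:
    return alt(seq(open_paren, seq(digit([]), char(")"))), digit(fallback_calls))

plain_calls: List[int] = []
commit_calls: List[int] = []
plain = expr(char("("), plain_calls)("(x)", 0)
cut = expr(committed(char("(")), commit_calls)("(x)", 0)
```

Both runs fail on "(x)", but only the uncommitted version rewinds and attempts the digit alternative at position 0; with the commit point, commit_calls stays empty.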

Scala 2.9: is there an easy way to log all ParseResults?

I've written a lexer and parser using scala.util.parsing.combinator.Parsers. I have a bug in at least one of my productions, but I have so many of them that it is difficult to eyeball them to determine the problem.
What I need is a log of every attempt my Parser makes to match the input with any production; logging all the Success and Failure objects when they are instantiated would be lovely. Unfortunately, the only way I can see to do this is to extend a lot of the basic classes provided by the library, and then rewrite my massive parser to extend the new classes.
Is there an easy way to get this logging behavior?
You could use the log combinator to wrap productions of your grammar. Here's the definition in Parsers.scala:
def log[T](p: => Parser[T])(name: String): Parser[T] = Parser{ in =>
  println("trying "+ name +" at "+ in)
  val r = p(in)
  println(name +" --> "+ r)
  r
}
Otherwise, I think you should be able to override success and failure, but it would be quite uninformative, since you won't know what production called them.
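The wrapping idea is independent of Scala. As a rough illustration, here is the same shape in Python with parsers modelled as plain functions from (text, position) to an optional (value, next position) pair. literal, either and the emit parameter are assumptions of this sketch, not part of the Scala library.

```python
from typing import Callable, List, Optional, Tuple

ParseResult = Optional[Tuple[str, int]]
Parser = Callable[[str, int], ParseResult]

def literal(text: str) -> Parser:
    """Match a fixed string at the current position."""
    def run(s: str, i: int) -> ParseResult:
        return (text, i + len(text)) if s.startswith(text, i) else None
    return run

def either(p: Parser, q: Parser) -> Parser:
    """Try p, and fall back to q at the same position if p fails."""
    def run(s: str, i: int) -> ParseResult:
        r = p(s, i)
        return r if r is not None else q(s, i)
    return run

def log(p: Parser, name: str, emit: Callable[[str], None] = print) -> Parser:
    """Wrap p so that every attempt and its outcome are reported."""
    def run(s: str, i: int) -> ParseResult:
        emit(f"trying {name} at {i}")
        r = p(s, i)
        emit(f"{name} --> {r}")
        return r
    return run

messages: List[str] = []
keyword = either(log(literal("val"), "val", messages.append),
                 log(literal("var"), "var", messages.append))
result = keyword("var x", 0)
```

Because each production is wrapped under its own name, the collected messages show which alternative was tried at which position and what it returned, which is exactly the information lost when only success and failure are overridden.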
