we are trying to add parameters to a transformation at the runtime. The only possible way to do so, is to set every single parameter and not a node. We don't know yet how to create a node for the setParameter.
Current setParameter:
QName TEST XdmAtomicValue 24
Expected setParameter:
<TempNode> <local>Value1</local> </TempNode>
We searched and tried to create a XdmNode and XdmItem.
If you want to create an XdmNode by parsing XML, the best way to do it is:
DocumentBuilder db = processor.newDocumentBuilder();
XdmNode node = db.build(new StreamSource(
new StringReader("<doc><elem/></doc>")));
You could also pass a string containing lexical XML as the parameter value, and then convert it to a tree by calling the XPath parse-xml() function.
If you want to construct the XdmNode programmatically, there are a number of options:
DocumentBuilder.newBuildingStreamWriter() gives you an instance of BuildingStreamWriter which extends XmlStreamWriter, and you can create the document by writing events to it using methods such as writeStartElement, writeCharacters, writeEndElement; at the end call getDocumentNode() on the BuildingStreamWriter, which gives you an XdmNode. This has the advantage that XmlStreamWriter is a standard API, though it's not actually a very nice one, because the documentation isn't very good and as a result implementations vary in their behaviour.
Another event-based API is Saxon's Push class; this differs from most push-based event APIs in that rather than having a flat sequence of methods like:
builder.startElement('x');
builder.characters('abc');
builder.endElement();
you have a nested sequence:
Element x = Document.elem('x');
x.text('abc');
x.close();
As mentioned by Martin, there is the "sapling" API: Saplings.doc().withChild(elem(...).withChild(elem(...)) etc. This API is rather radically different from anything you might be familiar with (though it's influenced by the LINQ API for tree construction on .NET) but once you've got used to it, it reads very well. The Sapling API constructs a very light-weight tree in memory (hance the name), and converts it to a fully-fledged XDM tree with a final call of SaplingDocument.toXdmNode().
If you're familiar with DOM, JDOM2, or XOM, you can construct a tree using any of those libraries and then convert it for use by Saxon. That's a bit convoluted and only really intended for applications that are already using a third-party tree model heavily (or for users who love these APIs and prefer them to anything else).
In the Saxon Java s9api, you can construct temporary trees as SaplingNode/SaplingElement/SaplingDocument, see https://www.saxonica.com/html/documentation12/javadoc/net/sf/saxon/sapling/SaplingDocument.html and https://www.saxonica.com/html/documentation12/javadoc/net/sf/saxon/sapling/SaplingElement.html.
To give you a simple example constructing from a Map, as you seem to want to do:
Processor processor = new Processor();
Map<String, String> xsltParameters = new HashMap<>();
xsltParameters.put("foo", "value 1");
xsltParameters.put("bar", "value 2");
SaplingElement saplingElement = new SaplingElement("Test");
for (Map.Entry<String, String> param : xsltParameters.entrySet())
{
saplingElement = saplingElement.withChild(new SaplingElement(param.getKey()).withText(param.getValue()));
}
XdmNode paramNode = saplingElement.toXdmNode(processor);
System.out.println(paramNode);
outputs e.g. <Test><bar>value 2</bar><foo>value 1</foo></Test>.
So the key is to understand that withChild() returns a new SaplingElement.
The code can be compacted using streams e.g.
XdmNode paramNode2 = Saplings.elem("root").withChild(
xsltParameters
.entrySet()
.stream()
.map(p -> Saplings.elem(p.getKey()).withText(p.getValue()))
.collect(Collectors.toList())
.toArray(SaplingElement[]::new))
.toXdmNode(processor);
System.out.println(paramNode2);
Is it possible to use the bind string on one expression in the other like the following code:
expr(declRefExpr().bind("id"), hasDesendent(declRefExpr(has("id")));
Basically to use bind id string of one node to find the other node.
The best way to compare 2 nodes is to bind in different id strings and then compare them in the callback method.
This is explained in this tutorial.
In the above link you can find the following code:
const VarDecl *IncVar = Result.Nodes.getNodeAs<VarDecl>("incVarName");
const VarDecl *CondVar = Result.Nodes.getNodeAs<VarDecl>("condVarName");
if (!areSameVariable(IncVar, CondVar))
return;
This code aims to compare nodes that are bind in variables incVarName and condVarName in the call back function.
Yes, it is possible using equalsBoundNode
Usage:
expr(declRefExpr().bind("id"), hasDesendent(declRefExpr(equalsBoundNode("id")));
I'm trying to write a function that does type casting, which seems to be a frequently occurring activity in Rascal code. But I can't seem to get it right. The following and several variations on it fail.
public &T cast(type[&T] tp, value v) throws str {
if (tp tv := v)
return tv;
else
throw "cast failed";
}
Can someone help me out?
Some more info: I frequently use pattern matching against a pattern of the form "Type Var" (i.e. against a variable declaration) in order to tell Rascal that an expression has a certain type, e.g.
map[str,value] m := myexp
This is usually in cases where I know that myexp has type map[str,value], but omitting the matching would make Rascal's type checking mechanism complain.
In order to be a bit more defensive against mistakes, I usually wrap the matching construct in an if-then-else where an exception is raised if the match fails:
if (map[str,value] m := myexp) {
// use m
} else {
throw "cast failed";
}
I would like to shorten all such similar pieces of code using a single function that does the job generically, so that I can write instead
cast(#map[str,value], myexp)
PS. Also see How to cast a value type to Map in Rascal?
It seems that the best way to write this, if you truly need to do this, is the following:
public map[str,value] cast(map[str,value] v) = v;
public default map[str,value] cast(value v) { throw "cast failed!"; }
Then you could just say
m = cast(myexp);
and it would do what you want to do -- the actual pattern matching is moved into the function signature for cast, with a case specific to the type you are wanting to use and a case that handles everything that doesn't otherwise match.
However, I'm still not sure why you are using type value, either here (inside the map) or in the linked question. The "standard" Rascal way of handling cases where you could have one of multiple choices is to define these with a user-defined data type and constructors. You could then use pattern matching to match the constructors, or use the is and has keywords to interrogate a value to check to see if it was created using a specific constructor or if it has a specific field, respectively. The rule for fields is that all occurrences of a field in the constructor definitions for a given ADT have the same type. So, it may help to know more about your usage scenario to see if this definition of cast is the best option or if there is a better solution to your problem.
EDITED
If you are reading JSON, an alternate way to do it is to use the JSON grammar and AST that also live in that part of the library (I think the one you are using is more of a stream reader, like our current text readers and writers, but I would need to look at the code more to be sure). You can then do something like this (long output included to give an idea of the results):
rascal>import lang::json::\syntax::JSON;
ok
rascal>import lang::json::ast::JSON;
ok
rascal>import lang::json::ast::Implode;
ok
ascal>js = buildAST(parse(#JSONText, |project://rascal/src/org/rascalmpl/library/lang/json/examples/twitter01.json|));
Value: object((
"since_id":integer(0),
"refresh_url":string("?since_id=202744362520678400&q=amsterdam&lang=en"),
"page":integer(1),
"since_id_str":string("0"),
"completed_in":float(0.058),
"results_per_page":integer(25),
"next_page":string("?page=2&max_id=202744362520678400&q=amsterdam&lang=en&rpp=25"),
"max_id_str":string("202744362520678400"),
"query":string("amsterdam"),
"max_id":integer(202744362520678400),
"results":array([
object((
"from_user":string("adekamel"),
"profile_image_url_https":string("https:\\/\\/si0.twimg.com\\/profile_images\\/2206104506\\/339515338_normal.jpg"),
"in_reply_to_status_id_str":string("202730522013728768"),
"to_user_id":integer(215350297),
"from_user_id_str":string("366868475"),
"geo":null(),
"in_reply_to_status_id":integer(202730522013728768),
"profile_image_url":string("http:\\/\\/a0.twimg.com\\/profile_images\\/2206104506\\/339515338_normal.jpg"),
"to_user_id_str":string("215350297"),
"from_user_name":string("nurul amalya \\u1d54\\u1d25\\u1d54"),
"created_at":string("Wed, 16 May 2012 12:56:37 +0000"),
"id_str":string("202744362520678400"),
"text":string("#Donnalita122 #NaishahS #fatihahmS #oishiihotchoc #yummy_DDG #zaimar93 #syedames I\'m here at Amsterdam :O"),
"to_user":string("Donnalita122"),
"metadata":object(("result_type":string("recent"))),
"iso_language_code":string("en"),
"from_user_id":integer(366868475),
"source":string("<a href="http:\\/\\/blackberry.com\\/twitter" rel="nofollow">Twitter for BlackBerry\\u00ae<\\/a>"),
"id":integer(202744362520678400),
"to_user_name":string("Rahmadini Hairuddin")
)),
object((
"from_user":string("kelashby"),
"profile_image_url_https":string("https:\\/\\/si0.twimg.com\\/profile_images\\/1861086809\\/me_beach_normal.JPG"),
"to_user_id":integer(0),
"from_user_id_str":string("291446599"),
"geo":null(),
"profile_image_url":string("http:\\/\\/a0.twimg.com\\/profile_images\\/1861086809\\/me_beach_normal.JPG"),
"to_user_id_str":string("0"),
"from_user_name":string("Kelly Ashby"),
"created_at":string("Wed, 16 May 2012 12:56:25 +0000"),
"id_str":string("202744310872018945"),
"text":string("45 days til freedom! Cannot wait! After Paris: London, maybe Amsterdam, then southern France, then CANADA!!!!"),
"to_user":null(),
"metadata":object(("result_type":string("recent"))),
"iso_language_code":string("en"),
"from_user_id":integer(291446599),
"source":string("<a href="http:\\/\\/mobile.twitter.com" rel="nofollow">Mobile Web<\\/a>"),
"id":integer(202744310872018945),
"to_user_name":null()
)),
object((
"from_user":string("johantolsma"),
"profile_image_url_https":string("https:\\/\\/si0.twimg.com\\/profile_images\\/1961917557\\/image_normal.jpg"),
"to_user_id":integer(0),
"from_user_id_str":string("23632499"),
"geo":null(),
"profile_image_url":string("http:\\/\\/a0.twimg.com\\/profile_images\\/1961917557\\/image_normal.jpg"),
"to_user_id_str":string("0"),
"from_user_name":string("Johan Tolsma"),
"created_at":string("Wed, 16 May 2012 12:56:16 +0000"),
"id_str":string("202744274050236416"),
"text":string("RT #agerolemou: Office space for freelancers in Amsterdam http:\\/\\/t.co\\/6VfHuLeK"),
"to_user":null(),
"metadata":object(("result_type":string("recent"))),
"iso_language_code":string("en"),
"from_user_id":integer(23632499),
"source":string("<a href="http:\\/\\/itunes.apple.com\\/us\\/app\\/twitter\\/id409789998?mt=12" rel="nofollow">Twitter for Mac<\\/a>"),
"id":integer(202744274050236416),
"to_user_name":null()
)),
object((
"from_user":string("hellosophieg"),
"profile_image_url_https":string("https:\\/\\/si0.twimg.com\\/profile_images\\/2213055219\\/image_normal.jpg"),
"to_user_id":integer(0),
"from_user_id_str":string("41153106"),
"geo":null(),
"profile_image_url":string("http:\\/\\/a0.twimg.com\\/profile_images\\/2213055219\\/image_normal.jp...
rascal>js is object;
bool: true
rascal>js.members<0>;
set[str]: {"since_id","refresh_url","page","since_id_str","completed_in","results_per_page","next_page","max_id_str","query","max_id","results"}
rascal>js.members["results_per_page"];
Value: integer(25)
You can then use pattern matching, over the types defined in lang::json::ast::json, to extract the information you need.
The code has a bug. This is the fixed code:
public &T cast(type[&T] tp, value v) throws str {
if (&T tv := v)
return tv;
else
throw "cast failed";
}
Note that we do not wish to include this in the standard library. Rather lets collect cases where we need it and find out how to fix it in another way.
If you find you need this casting often, then you might be avoiding the better parts of Rascal, such as pattern based dispatch. See also the answer by Mark Hills.
I'm using ANTLR4 to create a parse tree for my grammar, what I want to do is modify certain nodes in the tree. This will include removing certain nodes and inserting new ones. The purpose behind this is optimization for the language I am writing. I have yet to find a solution to this problem. What would be the best way to go about this?
While there is currently no real support or tools for tree rewriting, it is very possible to do. It's not even that painful.
The ParseTreeListener or your MyBaseListener can be used with a ParseTreeWalker to walk your parse tree.
From here, you can remove nodes with ParserRuleContext.removeLastChild(), however when doing this, you have to watch out for ParseTreeWalker.walk:
public void walk(ParseTreeListener listener, ParseTree t) {
if ( t instanceof ErrorNode) {
listener.visitErrorNode((ErrorNode)t);
return;
}
else if ( t instanceof TerminalNode) {
listener.visitTerminal((TerminalNode)t);
return;
}
RuleNode r = (RuleNode)t;
enterRule(listener, r);
int n = r.getChildCount();
for (int i = 0; i<n; i++) {
walk(listener, r.getChild(i));
}
exitRule(listener, r);
}
You must replace removed nodes with something if the walker has visited parents of those nodes, I usually pick empty ParseRuleContext objects (this is because of the cached value of n in the method above). This prevents the ParseTreeWalker from throwing a NPE.
When adding nodes, make sure to set the mutable parent on the ParseRuleContext to the new parent. Also, because of the cached n in the method above, a good strategy is to detect where the changes need to be before you hit where you want your changes to go in the walk, so the ParseTreeWalker will walk over them in the same pass (other wise you might need multiple passes...)
Your pseudo code should look like this:
public void enterRewriteTarget(#NotNull MyParser.RewriteTargetContext ctx){
if(shouldRewrite(ctx)){
ArrayList<ParseTree> nodesReplaced = replaceNodes(ctx);
addChildTo(ctx, createNewParentFor(nodesReplaced));
}
}
I've used this method to write a transpiler that compiled a synchronous internal language into asynchronous javascript. It was pretty painful.
Another approach would be to write a ParseTreeVisitor that converts the tree back to a string. (This can be trivial in some cases, because you are only calling TerminalNode.getText() and concatenate in aggregateResult(..).)
You then add the modifications to this visitor so that the resulting string representation contains the modifications you try to achieve.
Then parse the string and you get a parse tree with the desired modifications.
This is certainly hackish in some ways, since you parse the string twice. On the other hand the solution does not rely on antlr implementation details.
I needed something similar for simple transformations. I ended up using a ParseTreeWalker and a custom ...BaseListener where I overwrote the enter... methods. Inside this method the ParserRuleContext.children is available and can be manipulated.
class MyListener extends ...BaseListener {
#Override
public void enter...(...Context ctx) {
super.enter...(ctx);
ctx.children.add(...);
}
}
new ParseTreeWalker().walk(new MyListener(), parseTree);
I'm reading the source of polux's great parsers, and found there is a special isCommitted property which I can't understand:
class ParseResult<A> {
final bool isSuccess;
final bool isCommitted;
/// [:null:] if [:!isSuccess:]
final A value;
final String text;
final Position position;
final Expectations expectations;
// ...
}
You can see there is already a isSuccess to indicate the parse result is successful or not, why do we need a isCommitted? I tried to read related code, but still don't understand.
If you want to see the source, you can find it here.
The short answer is: don't worry about isCommited, it's for internal purposes only.
The long answer is: you can call commited on a paser, which means that once it has succeeded, you know for sure that it's pointless to backtrack (very much like Prolog's cut). For instance consider a grammar like this:
expr() => str('(') + rec(expr) str(')') ^ ...
| num()
Assume we parse the string "(...". Once we have recognized the parenthesis, we know for sure that if ... turns out not to be an expr, there is no need to rewind to the start of the string and try to parse a num, since a num will never start with a parenthesis anyway. We can fail early. This is done by marking ( as being a "commit point":
expr() => str('(').commited + rec(expr) str(')') ^ ...
| num()
This is an optimisation which should be used with great care because it breaks the modularity of parsers with respect to |. I personally never had to use it so far.
Whenever you call commited on a parser, it returns a new parser whose isCommited property is true. It is then used by | to decide whether to backtrack or not. This is what isCommited is used for. As an end user you should never have to care. I should probably make it private.
This feature is inspired by Polyparse's commit.