I'm looking for a way to avoid comments during parsing. Here's my issue.
First I fetch all methods from a M3 model, like so:
public set[loc] getMethodLocations(M3 model){
locations = { <x,y> | <x,y> <- model#containment,
x.scheme=="java+class",
y.scheme=="java+method" ||
y.scheme=="java+constructor" };
set[loc] methodLocations = { a | a <- range(locations) };
return methodLocations;
}
Then I want to iterate over the fetched locations, like so:
set[loc] AllMethodsAsLoc = getMethodLocations(model);
for( methodAsLoc <- AllMethodsAsLoc ) {
MethodDec m = parse(#MethodDec, methodAsLoc);
};
My issue is that the parsing seems to fail with a ParseError when a fetched method has comments at the location. How can I not include comments when fetching or how can I ignore comments during the parsing?
I'm new at this and learning, so please excuse my ignorance.
Any help is appreciated.
Rob
Excellent question. Because MethodDec is not a "start" non-terminal it does not accept whitespace or comments before and after the actual MethodDec. So either we should trim the whitespace off somehow, or we could make a new non-terminal which can accept the layout.
The latter solution is the better IMHO:
start syntax MyTop = MethodDec method;
start[MyTop] theTop = parse(start[MyTop], methodAsLoc);
MyTop t = theTop.top
MethodDec dec = t.method;
// or more directly
dec = parse(start[MyTop], methodAsLoc).top.method;
Related
I working on a language similar to ruby called gaiman and I'm using PEG.js to generate the parser.
Do you know if there is a way to implement heredocs with proper indentation?
xxx = <<<END
hello
world
END
the output should be:
"hello
world"
I need this because this code doesn't look very nice:
def foo(arg) {
if arg == "here" then
return <<<END
xxx
xxx
END
end
end
this is a function where the user wants to return:
"xxx
xxx"
I would prefer the code to look like this:
def foo(arg) {
if arg == "here" then
return <<<END
xxx
xxx
END
end
end
If I trim all the lines user will not be able to use a string with leading spaces when he wants. Does anyone know if PEG.js allows this?
I don't have any code yet for heredocs, just want to be sure if something that I want is possible.
EDIT:
So I've tried to implement heredocs and the problem is that PEG doesn't allow back-references.
heredoc = "<<<" marker:[\w]+ "\n" text:[\s\S]+ marker {
return text.join('');
}
It says that the marker is not defined. As for trimming I think I can use location() function
I don't think that's a reasonable expectation for a parser generator; few if any would be equal to the challenge.
For a start, recognising the here-string syntax is inherently context-sensitive, since the end-delimiter must be a precise copy of the delimiter provided after the <<< token. So you would need a custom lexical analyser, and that means that you need a parser generator which allows you to use a custom lexical analyser. (So a parser generator which assumes you want a scannerless parser might not be the optimal choice.)
Recognising the end of the here-string token shouldn't be too difficult, although you can't do it with a single regular expression. My approach would be to use a custom scanning function which breaks the here-string into a series of lines, concatenating them as it goes until it reaches a line containing only the end-delimiter.
Once you've recognised the text of the literal, all you need to normalise the spaces in the way you want is the column number at which the <<< starts. With that, you can trim each line in the string literal. So you only need a lexical scanner which accurately reports token position. Trimming wouldn't normally be done inside the generated lexical scanner; rather, it would be the associated semantic action. (Equally, it could be a semantic action in the grammar. But it's always going to be code that you write.)
When you trim the literal, you'll need to deal with the cases in which it is impossible, because the user has not respected the indentation requirement. And you'll need to do something with tab characters; getting those right probably means that you'll want a lexical scanner which computes visible column positions rather than character offsets.
I don't know if peg.js corresponds with those requirements, since I don't use it. (I did look at the documentation, and failed to see any indication as to how you might incorporate a custom scanner function. But that doesn't mean there isn't a way to do it.) I hope that the discussion above at least lets you check the detailed documentation for the parser generator you want to use, and otherwise find a different parser generator which will work for you in this use case.
Here is the implementation of heredocs in Peggy successor to PEG.js that is not maintained anymore. This code was based on the GitHub issue.
heredoc = "<<<" begin:marker "\n" text:($any_char+ "\n")+ _ end:marker (
&{ return begin === end; }
/ '' { error(`Expected matched marker "${begin}", but marker "${end}" was found`); }
) {
const loc = location();
const min = loc.start.column - 1;
const re = new RegExp(`\\s{${min}}`);
return text.map(line => {
return line[0].replace(re, '');
}).join('\n');
}
any_char = (!"\n" .)
marker_char = (!" " !"\n" .)
marker "Marker" = $marker_char+
_ "whitespace"
= [ \t\n\r]* { return []; }
EDIT: above didn't work with another piece of code after heredoc, here is better grammar:
{ let heredoc_begin = null; }
heredoc = "<<<" beginMarker "\n" text:content endMarker {
const loc = location();
const min = loc.start.column - 1;
const re = new RegExp(`^\\s{${min}}`, 'mg');
return {
type: 'Literal',
value: text.replace(re, '')
};
}
__ = (!"\n" !" " .)
marker 'Marker' = $__+
beginMarker = m:marker { heredoc_begin = m; }
endMarker = "\n" " "* end:marker &{ return heredoc_begin === end; }
content = $(!endMarker .)*
I'm getting same error as this question, but with XQuery:
SaxonApiException: The context item for axis step ./CLIENT is absent
When running from the command line, all is good. So I don't think there is a syntax problem with the XQuery itself. I won't post the input file unless needed.
The XQuery is displayed with a Console.WriteLine before the error appears:
----- Start: XQUERY:
(: FLWOR = For Let Where Order-by Return :)
<MyFlightLegs>
{
for $flightLeg in //FlightLeg
where $flightLeg/DepartureAirport = 'OKC' or $flightLeg/ArrivalAirport = 'OKC'
order by $flightLeg/ArrivalDate[1] descending
return $flightLeg
}
</MyFlightLegs>
----- End : XQUERY:
Error evaluating (<MyFlightLegs {for $flightLeg in root/descendant::FlightLeg[DepartureAirport = "OKC" or ArrivalAirport = "OKC"] ... return $flightLeg}/>) on line 4 column 20
XPDY0002: The context item for axis step root/descendant::FlightLeg is absent
I think that like the other question, maybe my input XML file is not properly specified.
I took the samples/cs/ExamplesHE.cs run method of the XQuerytoStream class.
Code there for easy reference is:
public class XQueryToStream : Example
{
public override string testName
{
get { return "XQueryToStream"; }
}
public override void run(Uri samplesDir)
{
Processor processor = new Processor();
XQueryCompiler compiler = processor.NewXQueryCompiler();
compiler.BaseUri = samplesDir.ToString();
compiler.DeclareNamespace("saxon", "http://saxon.sf.net/");
XQueryExecutable exp = compiler.Compile("<saxon:example>{static-base-uri()}</saxon:example>");
XQueryEvaluator eval = exp.Load();
Serializer qout = processor.NewSerializer();
qout.SetOutputProperty(Serializer.METHOD, "xml");
qout.SetOutputProperty(Serializer.INDENT, "yes");
qout.SetOutputStream(new FileStream("testoutput.xml", FileMode.Create, FileAccess.Write));
Console.WriteLine("Output written to testoutput.xml");
eval.Run(qout);
}
}
I changed to pass the Xquery file name, the xml file name, and the output file name, and tried to make a static method out of it. (Had success doing the same with the XSLT processor.)
static void DemoXQuery(string xmlInputFilename, string xqueryInputFilename, string outFilename)
{
// Create a Processor instance.
Processor processor = new Processor();
// Load the source document
DocumentBuilder loader = processor.NewDocumentBuilder();
loader.BaseUri = new Uri(xmlInputFilename);
XdmNode indoc = loader.Build(loader.BaseUri);
XQueryCompiler compiler = processor.NewXQueryCompiler();
//BaseUri is inconsistent with Transform= Processor?
//compiler.BaseUri = new Uri(xqueryInputFilename);
//compiler.DeclareNamespace("saxon", "http://saxon.sf.net/");
string xqueryFileContents = File.ReadAllText(xqueryInputFilename);
Console.WriteLine("----- Start: XQUERY:");
Console.WriteLine(xqueryFileContents);
Console.WriteLine("----- End : XQUERY:");
XQueryExecutable exp = compiler.Compile(xqueryFileContents);
XQueryEvaluator eval = exp.Load();
Serializer qout = processor.NewSerializer();
qout.SetOutputProperty(Serializer.METHOD, "xml");
qout.SetOutputProperty(Serializer.INDENT, "yes");
qout.SetOutputStream(new FileStream(outFilename,
FileMode.Create, FileAccess.Write));
eval.Run(qout);
}
Also two questions regarding "BaseURI".
1. Should it be a directory name, or can it be same as the Xquery file name?
2. I get this compile error: "Cannot implicity convert to "System.Uri" to "String".
compiler.BaseUri = new Uri(xqueryInputFilename);
It's exactly the same thing I did for XSLT which worked. But it looks like BaseUri is a string for XQuery, but a real Uri object for XSLT? Any reason for the difference?
You seem to be asking a whole series of separate questions, which are hard to disentangle.
Your C# code appears to be compiling the query
<saxon:example>{static-base-uri()}</saxon:example>
which bears no relationship to the XQuery code you supplied that involves MyFlightLegs.
The MyFlightLegs query uses //FlightLeg and is clearly designed to run against a source document containing a FlightLeg element, but your C# code makes no attempt to supply such a document. You need to add an eval.ContextItem = value statement.
Your second C# fragment creates an input document in the line
XdmNode indoc = loader.Build(loader.BaseUri);
but it doesn't supply it to the query evaluator.
A base URI can be either a directory or a file; resolving relative.xml against file:///my/dir/ gives exactly the same result as resolving it against file:///my/dir/query.xq. By convention, though, the static base URI of the query is the URI of the resource (eg file) containing the source query text.
Yes, there's a lot of inconsistency in the use of strings versus URI objects in the API design. (There's also inconsistency about the spelling of BaseURI versus BaseUri.) Sorry about that; you're just going to have to live with it.
Bottom line solution based on Michael Kay's response; I added this line of code after doing the exp.Load():
eval.ContextItem = indoc;
The indoc object created earlier is what relates to the XML input file to be processed by the XQuery.
I store various formulas in Postgres and I want to use those formulas in my code. It would look something like this:
var amount = 100;
var formula = '5/105'; // normally something I would fetch from Postgres
var total = amount * formula; // should return 4.76
Is there a way to evaluate the string in this manner?
As far as I'm aware, there isn't a formula solver package developed for Dart yet. (If one exists or gets created after this post, we can edit it into the answer.)
EDIT: Mattia in the comments points out the math_expressions package, which looks pretty robust and easy to use.
There is a way to execute arbitrary Dart code as a string, but it has several problems. A] It's very roundabout and convoluted; B] it becomes a massive security issue; and C] it only works if the Dart is compiled in JIT mode (so in Flutter this means it will only work in debug builds, not release builds).
So the answer is that unfortunately, you will have to implement it yourself. The good news is that, for simple 4-function arithmetic, this is pretty straight-forward, and you can follow a tutorial on writing a calculator app like this one to see how it's done.
Of course, if all your formulas only contain two terms with an operator between them like in your example snippet, it becomes even easier. You can do the whole thing in just a few lines of code:
void main() {
final amount = 100;
final formula = '5/105';
final pattern = RegExp(r'(\d+)([\/+*-])(\d+)');
final match = pattern.firstMatch(formula);
final value = process(num.parse(match[1]), match[2], num.parse(match[3]));
final total = amount * value;
print(total); // Prints: 4.761904761904762
}
num process(num a, String operator, num b) {
switch (operator) {
case '+': return a + b;
case '-': return a - b;
case '*': return a * b;
case '/': return a / b;
}
throw ArgumentError(operator);
}
There are a few packages that can be used to accomplish this:
pub.dev/packages/function_tree
pub.dev/packages/math_expressions
pub.dev/packages/expressions
I used function_tree as follows:
double amount = 100.55;
String formula = '5/105*.5'; // From Postgres
final tax = amount * formula.interpret();
I haven't tried it, but using math_expressions it should look like this:
double amount = 100.55;
String formula = '5/105*.5'; // From Postgres
Parser p = Parser();
// Context is used to evaluate variables, can be empty in this case.
ContextModel cm = ContextModel();
Expression exp = p.parse(formula) * p.parse(amount.toString());
// or..
//Expression exp = p.parse(formula) * Number(amount);
double result = exp.evaluate(EvaluationType.REAL, cm);
// Result: 2.394047619047619
print('Result: ${result}');
Thanks to fkleon for the math_expressions help.
I‘m new to this so I hope I get it right.
I‘m not exactly new to writing DXL but currently have a performance issue with calling getProperties from a Layout dxl column that is supposed to display outgoing links depending on a module attribute value of type Enum of the linked module.
The code basically works but takes extremely long to complete. Commenting out the getProperties call makes it as fast as it could be.
Yes, the call is written exactly as shown in DXL Ref manual.
Calling the attribute directly, using a module object and dot operator does not work either as it always returns the enums default value but not the actual.
Any ideas welcome...
EDIT added example code below
// couple of declarations snipped
string cond = "Enum selection here" // this is modified from actual code, to show the idea
string linkModName = "*"
ModuleProperties mp
for l in all(o->linkModName) do
{
otherVersion = targetVersion l
otherMod = module(otherVersion)
if (null otherMod || isDeleted otherMod) continue
othero = target l
if (null othero)
{
load(otherVersion,false)
}
getProperties(otherVersion, mp)
sTemp = mp.myAttr
if (sTemp == cond) continue
// further code snipped
}
I'm not 100% sure but I think there is/was a performance issue with module properties in some DOORS versions.
You might want to try the following, i.e. directly get the attribute from the loaded Module
[...]
othero = target l
Module m
if (null othero)
{
m = load(otherVersion,false)
} else {
m = module othero
}
sTemp = m.myAttr
[...]
Caution, I did not test this snippet.
I wrote an ANTLR3 grammar subdivided into smaller rules to increase readability.
For example:
messageSequenceChart:
'msc' mscHead bmsc 'endmsc' end
;
# Where mscHead is a shortcut to :
mscHead:
mscName mscParameterDecl? timeOffset? end
mscInstInterface? mscGateInterface
;
I know the built-in ANTLR AST building feature allows the user to declare intermediate AST nodes that won't be in the final AST. But what if you build the AST by hand?
messageSequenceChart returns [msc::MessageSequenceChart* n = 0]:
'msc' mscHead bmsc'endmsc' end
{
$n = new msc::MessageSequenceChart(/* mscHead subrules accessors like $mscHead.mscName.n ? */
$bmsc.n);
}
;
mscHead:
mscName mscParameterDecl? timeOffset? end
;
The documentation does not talk about such a thing. So it looks like I will have to create nodes for every intermediate rules to be able to access their subrules result.
Does anyone know a better solution ?
Thank you.
You can solve this by letting your sub-rule(s) return multiple values and accessing only those you're interested in.
The following demo shows how to do it. Although it is not in C, I am confident that you'll be able to adjust it so that it fits your needs:
grammar Test;
parse
: sub EOF {System.out.printf("second=\%s\n", $sub.second);}
;
sub returns [String first, String second, String third]
: a=INT b=INT c=INT
{
$first = $a.text;
$second = $b.text;
$third = $c.text;
}
;
INT
: '0'..'9'+
;
SPACE
: ' ' {$channel=HIDDEN;}
;
And if your parse the input "12 34 56" with the generated parser, second=34 is printed to the console, as you can see after running:
import org.antlr.runtime.*;
public class Main {
public static void main(String[] args) throws Exception {
TestLexer lex = new TestLexer(new ANTLRStringStream("12 34 56"));
TokenStream tokens = new TokenRewriteStream(lex);
TestParser parser = new TestParser(tokens);
parser.parse();
}
}
So, a shortcut from the parse rule like $sub.INT, or $sub.$a to access one of the three INT tokens, in not possible, unfortunately.