Is there a way to customize Content Assist in xtext? - xtext

Suppose I have a grammar "A" that is cross-referenced from another grammar "B". For example:
test.mydsl->
entity A{
var p;
var q;
}
otherTest.mydsl1
entity B{
var "??"
}
Right now, when I press Ctrl + Space at "??" in the code above, content assist shows A.p and A.q. I don't want that; I just want it to show "p" or "q". Is there an example I can refer to where content assist shows proposals with the simple name rather than the fully qualified name (FQN)?

Related

Search string with specific termination DXL

I'm trying to search in DXL for a string that ends with specific characters, and I can't find a way to do it.
For example, I'm looking for
" A: 23.1.23.2.4"
But if the entry has the character "~" after it, the find function does not work.
For example, the Skip list contains "A: 12.2.1.4.5~ text text text text"
I just need to know whether object.text contains A: 12.2.1.4.5
string string_text = "A: 12.2.1.4.5"
if(find(skip[i],string_text,string_text)){
modify_attributes(req_text)
}else{
output << "string not found : "
}
Use a regular expression, like this (the dots are escaped so that they match only literal dots):
void modify_attributes (string fulltext) {print "modifying.."}
string fulltext = "A: 12.2.1.4.5~ text text text text"
Regexp searchme = regexp2 "A: 12\\.2\\.1\\.4\\.5"
if(searchme (fulltext)){
modify_attributes(fulltext)
}else{
print "string not found "
}
The find function for Skip lists is a fast keyed lookup (O(log n) on average for a skip list rather than O(1), if I am not mistaken). But for it to work, the key you ask for has to match exactly.
So, to benefit from the speed of the lookup, I suggest you look at the part of your code where you put entries into the Skip list, and only store "clean" keys that you know you will want to ask for later.
Of course, that only works if you build the Skip list yourself, i.e. you don't get it from somewhere you have no control over.
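The "store clean keys" idea can be sketched outside DXL. The following is a minimal Python illustration, not DXL code: a dict stands in for the Skip list, and the helper name clean_key and the "A: 1.2.3" key format are assumptions taken from the example strings in the question.

```python
import re

def clean_key(text):
    """Return the leading 'A: 1.2.3' style identifier, or None if there is none.

    The key format is an assumption based on the examples in the question.
    """
    match = re.match(r"\s*(A: \d+(?:\.\d+)*)", text)
    return match.group(1) if match else None

raw_entries = [
    "A: 12.2.1.4.5~ text text text text",
    " A: 23.1.23.2.4",
]

# Build the lookup table with cleaned keys only (the dict plays the Skip list's role).
index = {}
for entry in raw_entries:
    key = clean_key(entry)
    if key is not None:
        index[key] = entry

# An exact-match lookup now succeeds even though the raw text had a "~" suffix.
if "A: 12.2.1.4.5" in index:
    print("found:", index["A: 12.2.1.4.5"])
```

The point of the design is that the cleaning cost is paid once, when the table is filled, so every later lookup can stay an exact-match query.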

How to have the AFTER separator in Xtend just behind last item

I use Xtend to generate some textual output from my Xtext grammar. My question is really simple, but I cannot figure it out reading the documentation and examples, so please give me advice!
Let us assume, I have a list with some items. I generate for every item a line within a FOR loop using the SEPARATOR feature. At the end I want to have a final separator using AFTER. Here is an example code to demonstrate it:
val list = #["a", "b", "c"]
'''
«FOR item : list SEPARATOR "," AFTER ","»
«item»
«ENDFOR»
'''
In this way, I receive the following output:
a,
b,
c
,
Now the question: How can I have the final , separator immediately after c and not on a new line? It's very important to me for proper formatting of my output (it's more complex than a,b,c in reality and the position of the separator matters). I want to have this:
a,
b,
c,
Experts, please help!
In your simple example, as SEPARATOR and AFTER are the same, you should skip the separator/after stuff entirely and just use
'''
«FOR item : list»
«item»,
«ENDFOR»
'''
The newlines after FOR and ENDFOR are skipped by Xtend's whitespace detection. The problem is that AFTER is applied after the loop is executed, and your loop-body contains a trailing newline that goes into the output (grayspace). You can overcome this by writing the entire loop in one line, e.g.
'''
«FOR item: list SEPARATOR ',\n' AFTER ',\n'»«item»«ENDFOR»
'''
and extract a method when the loop's body becomes too long. For better readability, you can add any amount of whitespace within the guillemets, e.g.
'''
«FOR item: list SEPARATOR ',\n' AFTER ',\n'»«
item
»«ENDFOR»
'''
For smaller bodies, you might also prefer join()
'''
«list.join(',\n')»,
'''
or, more generally, map(), reduce(), or similar.
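The underlying choice (treat "," as a terminator after every item, or join the items and append one more separator) is not specific to Xtend. A minimal Python sketch of both variants follows; the function names are made up for illustration.

```python
def render_with_terminator(items, terminator=",\n"):
    """Emit every item followed by the terminator, including the last one."""
    return "".join(item + terminator for item in items)

def render_with_join(items, separator=",\n"):
    """Join the items with the separator, then append one trailing separator."""
    return separator.join(items) + separator

items = ["a", "b", "c"]
print(render_with_terminator(items), end="")
```

For a non-empty list both functions produce the same text. They differ for an empty list: the terminator version yields an empty string, while the join version still emits a lone separator, so the terminator style is usually the safer default.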

Google Spreadsheet Translate, ignore variable names

An interesting Google Spreadsheet problem, I have a language file based on key=value that I have copied into a spreadsheet, eg.
titleMessage=Welcome to My Website
youAreLoggedIn=Hello #{user.name} you are now logged in
facebookPublish=Facebook Publishing
I have managed to split the key / value into two columns, translate the value column, and re-join it with the keys, and voila! This gives me a translated language file back.
But as you may have spotted, there are some variables in there (e.g. #{user.name}) which are injected by my application; obviously I don't want to translate them.
So here is my question, given the following cell contents...
Hello #{user.name} you are now logged in
Is there a function that will translate the contents using the TRANSLATE function, but ignore anything inside #{ } (this could be at any point in the sentence)?
Do any Google Spreadsheet gurus have a solution for me?
Many thanks
If there is at most one occurrence of #{}, you could use the SPLIT function to divide the string into three parts, arranged as below.
A         B                  C            D           E
Original  =SPLIT(An, "#{}")  First piece  Tag         Rest of string
                             Translate    Keep as is  Translate
Put the pieces together with CONCATENATE.
=CONCATENATE(Cn,Dn,En)
I ran into the same problem.
Assume the escape pattern is #{sth.sth} (as a regex: #{[\w.]+}). Replace each occurrence with a string that Google Translate treats as an untranslatable term, like VAR.
After translation, replace each term with the original pattern.
Here is how I did this in the script editor of the spreadsheet:
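function myTranslate(text, source_language, target_language) {
  if (text == null || !text.toString()) {
    return null;
  }
  var str = text.toString();
  var regex = /#\{[\w.]+\}/g; // g flag for multiple matches
  var replace = 'VAR'; // stands in for #{variable} so that it is not translated
  // original patterns, reversed so that pop() yields them in order;
  // match() returns null when there is no placeholder at all
  var vars = (str.match(regex) || []).reverse();
  str = str.replace(regex, replace);
  str = LanguageApp.translate(str, source_language, target_language);
  var ret = '';
  // indexOf returns -1 when no stand-in is left; 0 is a valid position
  for (var idx = str.indexOf(replace); idx !== -1 && vars.length > 0; idx = str.indexOf(replace)) {
    ret += str.slice(0, idx) + vars.pop();
    str = str.slice(idx + replace.length);
  }
  return ret + str; // keep the text that follows the last placeholder
}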
You can't just split and concatenate, because different languages use different word order of subject/predicate/object etc., and also because several languages modify nouns with different prefixes/suffixes/spelling changes depending on what they are doing in the sentence. It's all very complicated. Google needs to enable some sort of enclosing parentheses around any term we want to be quoted rather than translated.
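One way to soften the word-order problem is to give each variable its own numbered stand-in, so the variables can be restored correctly even if the translation moves them around. Below is a minimal Python sketch of that mask-and-restore step. The translate argument is a stand-in for whatever translation call you use, fake_translate is a toy stub, and whether a real translation service leaves tokens such as VAR0 untouched is an assumption you would need to verify.

```python
import re

PLACEHOLDER = re.compile(r"#\{[\w.]+\}")

def translate_keeping_variables(text, translate):
    """Mask #{...} variables, translate the rest, then restore the variables."""
    saved = []

    def mask(match):
        saved.append(match.group(0))
        return f"VAR{len(saved) - 1}"  # numbered token: VAR0, VAR1, ...

    def restore(match):
        index = int(match.group(1))
        # Leave unknown tokens alone instead of failing.
        return saved[index] if index < len(saved) else match.group(0)

    translated = translate(PLACEHOLDER.sub(mask, text))
    return re.sub(r"VAR(\d+)", restore, translated)

def fake_translate(text):
    """Toy English-to-German stub used only for this example."""
    return (text.replace("Hello", "Hallo")
                .replace("you are now logged in", "Sie sind jetzt angemeldet"))

print(translate_keeping_variables(
    "Hello #{user.name} you are now logged in", fake_translate))
```

This does not fix grammatical agreement around the variable; it only guarantees that each variable text comes back unchanged.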

How to define syntax

I am new to language processing and I want to create a parser with Irony for the following syntax:
name1:value1 name2:value2 name3:value ...
where name1 is the name of an xml element and value is the value of the element which can also include spaces.
I have tried to modify included samples like this:
public TestGrammar()
{
var name = CreateTerm("name");
var value = new IdentifierTerminal("value");
var queries = new NonTerminal("queries");
var query = new NonTerminal("query");
queries.Rule = MakePlusRule(queries, null, query);
query.Rule = name + ":" + value;
Root = queries;
}
private IdentifierTerminal CreateTerm(string name)
{
IdentifierTerminal term = new IdentifierTerminal(name, "!@#$%^*_'.?-", "!@#$%^*_'.?0123456789");
term.CharCategories.AddRange(new[]
{
UnicodeCategory.UppercaseLetter, //Ul
UnicodeCategory.LowercaseLetter, //Ll
UnicodeCategory.TitlecaseLetter, //Lt
UnicodeCategory.ModifierLetter, //Lm
UnicodeCategory.OtherLetter, //Lo
UnicodeCategory.LetterNumber, //Nl
UnicodeCategory.DecimalDigitNumber, //Nd
UnicodeCategory.ConnectorPunctuation, //Pc
UnicodeCategory.SpacingCombiningMark, //Mc
UnicodeCategory.NonSpacingMark, //Mn
UnicodeCategory.Format //Cf
});
//StartCharCategories are the same
term.StartCharCategories.AddRange(term.CharCategories);
return term;
}
but this doesn't work if the values include spaces. Can this be done (using Irony) without modifying the syntax (like adding quotes around values)?
Many thanks!
If newlines were included between key-value pairs, it would be easily achievable. I have no knowledge of "Irony", but my initial feeling is that almost no parser/lexer generator is going to deal with this given only a naive grammar description. This requires essentially unbounded lookahead.
Conceptually (because I know nothing about this product), here's how I would do it:
Tokenise based on spaces and colons (i.e. every contiguous sequence of characters that isn't a space or a colon is an "identifier" token of some sort).
You then need to make it such that every "sentence" is described from colon-to-colon:
sentence = identifier_list
| : identifier_list identifier : sentence
That's not enough to make it work, but you get the idea at least, I hope. You would need to be very careful to distinguish an identifier_list from a single identifier such that they could be parsed unambiguously. Similarly, if your tool allows you to define precedence and associativity, you might be able to get away with making ":" bind very tightly to the left, such that your grammar is simply:
sentence = identifier : identifier_list
And the behaviour of that needs to be (identifier :) identifier_list.
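The colon-to-colon idea can be tried by hand before fighting a parser generator. The following is a rough Python sketch, not Irony code: it tokenises on whitespace and colons and treats the last word before each colon as the next name, so everything else belongs to the previous value. It assumes that names contain no spaces and that values contain no colons; the function name parse_pairs is made up.

```python
import re

def parse_pairs(text):
    """Parse 'name1:value one name2:value two' into (name, value) pairs."""
    pairs = []
    name = None   # name of the pair currently being collected
    words = []    # words seen since the last colon
    for token in re.findall(r"[^\s:]+|:", text):
        if token != ":":
            words.append(token)
            continue
        if not words:
            raise ValueError("colon without a preceding name")
        next_name = words.pop()  # the word right before ':' is a name
        if name is not None:
            pairs.append((name, " ".join(words)))
        elif words:
            raise ValueError("text before the first name")
        name, words = next_name, []
    if name is not None:
        pairs.append((name, " ".join(words)))
    return pairs

print(parse_pairs("title:Hello big world author:Jane Doe year:2010"))
```

Note that runs of whitespace inside a value are collapsed to single spaces, which may or may not be acceptable for your data.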

Gold Parsing System - What can it be used for in programming?

I have read the GOLD Homepage ( http://www.devincook.com/goldparser/ ) docs, FAQ and Wikipedia to find out what practical application there could possibly be for GOLD. I was thinking along the lines of having a programming language (easily) available to my systems such as ABAP on SAP or X++ on Axapta - but it doesn't look feasible to me, at least not easily - even if you use GOLD.
The final use of the parsed result produced by GOLD escapes me - what do you do with the result of the parse?
EDIT: A practical example (description) would be great.
Parsing really consists of two phases. The first is "lexing", which converts the raw string of characters in to something that the program can more readily understand (commonly called tokens).
Simple example, lex would convert:
if (a + b > 2) then
In to:
IF_TOKEN LEFT_PAREN IDENTIFIER(a) PLUS_SIGN IDENTIFIER(b) GREATER_THAN NUMBER(2) RIGHT_PAREN THEN_TOKEN
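To make the lexing step concrete, here is a small Python sketch of a regex-based lexer that produces the token names used above. It is only an illustration, not how GOLD or lex work internally, and it covers just the handful of symbols in this example.

```python
import re

TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("LEFT_PAREN", r"\("),
    ("RIGHT_PAREN", r"\)"),
    ("PLUS_SIGN", r"\+"),
    ("GREATER_THAN", r">"),
    ("SKIP", r"\s+"),
    ("MISMATCH", r"."),
]
KEYWORDS = {"if": "IF_TOKEN", "then": "THEN_TOKEN"}
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def lex(source):
    """Turn a source string into a list of token names."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue  # whitespace only separates tokens
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {text!r}")
        if kind == "IDENTIFIER" and text in KEYWORDS:
            tokens.append(KEYWORDS[text])      # reserved word
        elif kind in ("IDENTIFIER", "NUMBER"):
            tokens.append(f"{kind}({text})")   # keep the matched text
        else:
            tokens.append(kind)
    return tokens

print(" ".join(lex("if (a + b > 2) then")))
```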
The parser takes that stream of tokens and attempts to make yet more sense out of it. In this case, it would try to match those tokens to an IF_STATEMENT. To the parser, the IF_STATEMENT may well look like this:
IF ( BOOLEAN_EXPRESSION ) THEN
Where the result of the lexing phase is a token stream, the result of the parsing phase is a Parse Tree.
So, a parser could convert the above in to:
             if_statement
                  |
                  v
    boolean_expression.operator = GREATER_THAN
       |                  |
       |                  v
       v        numeric_constant.string = "2"
expression.operator = PLUS_SIGN
   |          |
   |          v
   v    identifier.string = "b"
identifier.string = "a"
Here you see we have an IF_STATEMENT. An IF_STATEMENT has a single argument, which is a BOOLEAN_EXPRESSION. This was explained in some manner to the parser (through the grammar). When the parser is converting the token stream, it "knows" what an IF looks like, and knows what a BOOLEAN_EXPRESSION looks like, so it can make the proper assignments when it sees the code.
For example, if you have just:
if (a + b) then
The parser could know that it's not a boolean expression (because the + is arithmetic, not a boolean operator) and could throw an error at this point.
Next, we see that a BOOLEAN_EXPRESSION has 3 components, the operator (GREATER_THAN), and two sides, the left side and the right side.
On the left side, it points to yet another expression, the "a + b", while on the right it points to a NUMERIC_CONSTANT, in this case the string "2". Again, the parser "knows" this is a NUMERIC_CONSTANT because we told it about strings of digits. If it weren't digits, it would be an IDENTIFIER (like "a" and "b" are).
Note, that if we had something like:
if (a + b > "XYZ") then
That "parses" just fine (expression on the left, string constant on the right). We don't know from looking at this whether it is a valid expression or not. We don't know if "a" or "b" reference Strings or Numbers at this point. So this is something the parser can't decide for us and can't flag as an error, as it simply doesn't know. That will happen when we evaluate (either execute or try to compile in to code) the IF statement.
If we did:
if [a > b ) then
The parser can readily see that syntax error as a problem, and will throw an error. That string of tokens doesn't look like anything it knows about.
So, the point being that when you get a complete parse tree, you have some assurance that at first cut the "code looks good". Now during execution, other errors may well come up.
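As an illustration of the parsing phase, here is a minimal hand-written recursive-descent parser in Python for the tiny "if ( expression > expression ) then" form discussed above. A tool such as GOLD would generate a table-driven parser from a grammar instead; the class, the tuple-based tree shape and the grammar in the comments are assumptions made for this sketch, and string constants are left out.

```python
import re

def tokenize(source):
    """Split into identifiers/keywords, numbers and single punctuation characters."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, wanted):
        if self.peek() != wanted:
            raise SyntaxError(f"expected {wanted!r}, found {self.peek()!r}")
        self.pos += 1

    def parse_if(self):
        # if_statement := "if" "(" comparison ")" "then"
        self.expect("if")
        self.expect("(")
        condition = self.parse_comparison()
        self.expect(")")
        self.expect("then")
        if self.peek() is not None:
            raise SyntaxError(f"unexpected trailing token {self.peek()!r}")
        return ("if_statement", condition)

    def parse_comparison(self):
        # comparison := sum ">" sum
        left = self.parse_sum()
        self.expect(">")
        return (">", left, self.parse_sum())

    def parse_sum(self):
        # sum := operand ("+" operand)*
        node = self.parse_operand()
        while self.peek() == "+":
            self.pos += 1
            node = ("+", node, self.parse_operand())
        return node

    def parse_operand(self):
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        if token.isdigit():
            return ("number", token)
        if re.fullmatch(r"[A-Za-z_]\w*", token):
            return ("identifier", token)
        raise SyntaxError(f"unexpected token {token!r}")

print(Parser(tokenize("if (a + b > 2) then")).parse_if())
```

With this grammar, "if (a + b) then" is rejected because no ">" follows the sum, and "if [a > b ) then" is rejected at the "[" token, which mirrors the two error cases described above.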
To evaluate the parse tree, you just walk the tree. You'll have some code associated with the major nodes of the parse tree during the compile or evaluation part. Let's assume that we have an interpreter.
public void execute_if_statement(ParseTreeNode node) {
// We already know we have a IF_STATEMENT node
Value value = evaluate_expression(node.getBooleanExpression());
if (value.getBooleanResult() == true) {
// we do the "then" part of the code
}
}
public Value evaluate_expression(ParseTreeNode node) {
Value result = null;
if (node.isConstant()) {
result = evaluate_constant(node);
return result;
}
if (node.isIdentifier()) {
result = lookupIdentifier(node);
return result;
}
Value leftSide = evaluate_expression(node.getLeftSide());
Value rightSide = evaluate_expression(node.getRightSide());
if (node.getOperator() == '+') {
if (!leftSide.isNumber() || !rightSide.isNumber()) {
throw new RuntimeError("Must have numbers for adding");
}
int l = leftSide.getIntValue();
int r = rightSide.getIntValue();
int sum = l + r;
return new Value(sum);
}
if (node.getOperator() == '>') {
if (leftSide.getType() != rightSide.getType()) {
throw new RuntimeError("You can only compare values of the same type");
}
if (leftSide.isNumber()) {
int l = leftSide.getIntValue();
int r = rightSide.getIntValue();
boolean greater = l > r;
return new Value(greater);
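} else {
// string compare; getStringValue() is assumed to exist alongside getIntValue()
boolean greater = leftSide.getStringValue().compareTo(rightSide.getStringValue()) > 0;
return new Value(greater);
}
}
// none of the known operators matched
throw new RuntimeError("Unknown operator: " + node.getOperator());
}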
So, you can see that we have a recursive evaluator here. You see how we're checking the run time types, and performing the basic evaluations.
What will happen is that execute_if_statement will evaluate its main expression. Even though we wanted only a BOOLEAN_EXPRESSION in the parse, all expressions are mostly the same for our purposes. So, execute_if_statement calls evaluate_expression.
In our system, all expressions have an operator and a left and right side. Each side of an expression is ALSO an expression, so you can see how we immediately try to evaluate those as well to get their real value. The one note is that if the expression consists of a CONSTANT, then we simply return the constant's value; if it's an identifier, we look it up as a variable (and that would be a good place to throw an "I can't find the variable 'a'" message); otherwise we're back to the left side/right side thing.
I hope you can see how a simple evaluator can work once you have a parse tree from a parser. Note how during evaluation, the major elements of the language are in place, otherwise we'd have got a syntax error and never got to this phase. We can simply expect to "know" that when we have, for example, a PLUS operator, we're going to have 2 expressions, the left and right side. Or when we execute an IF statement, that we already have a boolean expression to evaluate. The parser is what does that heavy lifting for us.
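Because the Java above relies on ParseTreeNode and Value classes that are not shown, here is a runnable Python sketch of the same tree walk. It assumes the parse tree is built from plain tuples such as ("number", "2"), ("identifier", "a") and (operator, left, right); that tree shape and the function names are choices made for this example only.

```python
def is_number(value):
    """True for ints, but not for booleans (bool is a subclass of int in Python)."""
    return isinstance(value, int) and not isinstance(value, bool)

def evaluate(node, variables):
    """Recursively evaluate a tuple-based expression tree."""
    kind = node[0]
    if kind == "number":
        return int(node[1])
    if kind == "string":
        return node[1]
    if kind == "identifier":
        name = node[1]
        if name not in variables:
            raise NameError(f"I can't find the variable {name!r}")
        return variables[name]
    # Everything else is (operator, left, right); evaluate both sides first.
    operator, left_node, right_node = node
    left = evaluate(left_node, variables)
    right = evaluate(right_node, variables)
    if operator == "+":
        if not (is_number(left) and is_number(right)):
            raise TypeError("Must have numbers for adding")
        return left + right
    if operator == ">":
        if type(left) is not type(right):
            raise TypeError("You can only compare values of the same type")
        return left > right
    raise ValueError(f"unknown operator {operator!r}")

# Tree for: a + b > 2
tree = (">", ("+", ("identifier", "a"), ("identifier", "b")), ("number", "2"))
print(evaluate(tree, {"a": 1, "b": 2}))
```

Replacing the right-hand side with ("string", "XYZ") parses to a perfectly good tree, but evaluation raises the "same type" error, which is the run time check the answer describes.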
Getting started with a new language can be a challenge, but you'll find that once you get rolling, the rest becomes pretty straightforward, and it's almost "magic" that it all works in the end.
Note, pardon the formatting, but underscores are messing things up -- I hope it's still clear.
I would recommend antlr.org for information and the 'free' tool I would use for any parser use.
GOLD can be used for any kind of application where you have to apply context-free grammars to input.
elaboration:
Essentially, CFGs apply to all programming languages. So if you wanted to develop a scripting language for your company, you'd need to write a parser or get a parsing program. Alternatively, if you wanted to have a semi-natural language for input for non-programmers in the company, you could use a parser to read that input and spit out more "machine-readable" data. Essentially, a context-free grammar allows you to describe far more inputs than a regular expression. The GOLD system apparently makes the parsing problem somewhat easier than lex/yacc (the UNIX standard programs for parsing).
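A classic illustration of the gap between regular expressions and context-free grammars is balanced parentheses. The Python sketch below implements the grammar S := "(" S ")" S | empty as a recursive function and compares it with a naive regular expression; both the grammar and the regex were chosen for this example and are not taken from GOLD.

```python
import re

def parse_s(text, pos=0):
    """Consume S := "(" S ")" S | empty, and return the position after S."""
    while pos < len(text) and text[pos] == "(":
        pos = parse_s(text, pos + 1)        # the nested S
        if pos >= len(text) or text[pos] != ")":
            raise SyntaxError("missing ')'")
        pos += 1                            # the closing ")"
    return pos

def is_balanced(text):
    """True if the whole text is derivable from S."""
    try:
        return parse_s(text) == len(text)
    except SyntaxError:
        return False

NAIVE = re.compile(r"\(*\)*")  # "some opens, then some closes": cannot count

for sample in ["(()())", "(()", "()()"]:
    print(sample, is_balanced(sample), bool(NAIVE.fullmatch(sample)))
```

The naive pattern accepts "(()" and rejects "()()", while the recursive recognizer gets both right; nesting is exactly what a grammar can express and a plain regular expression cannot.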
