Determining context at a position in file using ANTLR4 - parsing

I'm trying to write a Language Extension for VS Code in JavaScript and I seem to be missing something.
I have a Lexer.g4 and Parser.g4 for my language and can generate a tree using them.
My issue is that the VS Code API gives me a document and a position in that document (line #, character #). From any of the examples I've looked at for ANTLR4 I can't seem to find any functions generated that take a position in the file and give back the nodes of a tree at that position.
I want to know, for example that the cursor is placed on the name of a function.
Am I supposed to be walking the entire tree and checking the position of tokens to see if they enclose the position I'm in in the editor? Or maybe I'm not using the right tool for the job? I feel like I'm probably missing something more fundamental.

Yes, you have to walk the parse tree to find the context at a given position. This is a fairly simple task, and you can see it in action in my ANTLR4 extension for vscode. It has several functions that return something useful for a given position. For instance, this one:
/**
* Returns the parse tree which covers the given position or undefined if none could be found.
*/
function parseTreeFromPosition(root: ParseTree, column: number, row: number): ParseTree | undefined {
// Does the root node actually contain the position? If not we don't need to look further.
if (root instanceof TerminalNode) {
let terminal = (root as TerminalNode);
let token = terminal.symbol;
if (token.line != row)
return undefined;
let tokenStop = token.charPositionInLine + (token.stopIndex - token.startIndex + 1);
if (token.charPositionInLine <= column && tokenStop >= column) {
return terminal;
}
return undefined;
} else {
let context = (root as ParserRuleContext);
if (!context.start || !context.stop) { // Invalid tree?
return undefined;
}
if (context.start.line > row || (context.start.line == row && column < context.start.charPositionInLine)) {
return undefined;
}
let tokenStop = context.stop.charPositionInLine + (context.stop.stopIndex - context.stop.startIndex + 1);
if (context.stop.line < row || (context.stop.line == row && tokenStop < column)) {
return undefined;
}
if (context.children) {
for (let child of context.children) {
let result = parseTreeFromPosition(child, column, row);
if (result) {
return result;
}
}
}
return context;
}
}
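One detail to watch when calling a function like this from a VS Code extension: VS Code positions are zero-based for both line and character, whereas ANTLR4 tokens report a one-based line and a zero-based charPositionInLine. The following is a minimal sketch of that conversion, using simplified stand-in types rather than the real vscode and ANTLR4 runtime types. The names EditorPosition, TokenLike, toAntlrPosition and tokenCovers are hypothetical and only serve the illustration.

```typescript
// Hypothetical, simplified shapes; the real types come from vscode and the ANTLR4 runtime.
interface EditorPosition { line: number; character: number; } // both zero-based, as in VS Code
interface TokenLike {
    line: number;               // one-based, as ANTLR4 reports it
    charPositionInLine: number; // zero-based
    startIndex: number;
    stopIndex: number;          // inclusive
}

// Convert an editor position into the (column, row) pair used by the tree walk above.
function toAntlrPosition(pos: EditorPosition): { column: number; row: number } {
    return { column: pos.character, row: pos.line + 1 };
}

// The same containment test the walk applies to a terminal node's token.
function tokenCovers(token: TokenLike, column: number, row: number): boolean {
    if (token.line !== row) {
        return false;
    }
    const tokenStop = token.charPositionInLine + (token.stopIndex - token.startIndex + 1);
    return token.charPositionInLine <= column && column <= tokenStop;
}

// Example: a 3-character token starting at column 9 of the first line.
const nameToken: TokenLike = { line: 1, charPositionInLine: 9, startIndex: 9, stopIndex: 11 };
const cursor = toAntlrPosition({ line: 0, character: 10 });
console.log(tokenCovers(nameToken, cursor.column, cursor.row)); // true
```

In an extension, the converted column and row would then be passed to parseTreeFromPosition together with the root of the parse tree.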

Related

Highlight near duplicates in conditional formatting to highlight values with one character difference

I'm currently using this formula to highlight duplicates in my spreadsheet.
=ARRAYFORMULA(COUNTIF(A$2:$A2,$A2)>1)
Quite simple, it allows me to skip the first occurrence and only highlight 2nd, 3rd, ... occurrences.
I would like the formula to go a bit further and highlight near duplicates as well.
Meaning if there is only one character difference between 2 cells, then it should be considered as a duplicate.
For instance: "Marketing", "Marketng", "Marketingg" and "Market ing" would all be considered the same.
I've made a sample sheet in case my requirement is not straightforward to understand.
Thanks in advance.
Answer
Unfortunately, it is not possible to do this with formulas alone; Apps Script is needed as well. The process for achieving your desired results is described below.
In Google Sheets, go to Extensions > Apps Script, paste the following code (see footnote 1) and save.
function TypoFinder(range, word) { // created by https://stackoverflow.com/users/19361936
if (!Array.isArray(range) || word == "") {
return false;
}
// Flatten the 2D range into a single list of Levenshtein distances to the word.
var distances = [];
range.forEach(row => row.forEach(cell => distances.push(Levenshtein(String(cell), String(word)))));
var accumulator = 0;
for (var i = 0; i < distances.length; i++) {
if (distances[i] < 2) {
accumulator++
} // Keep track of how many times there's a Levenshtein distance of 0 or 1.
}
return accumulator > 1;
}
function Levenshtein(a, b) { // created by https://stackoverflow.com/users/4269081
if (a.length == 0) return b.length;
if (b.length == 0) return a.length;
// swap to save some memory O(min(a,b)) instead of O(a)
if (a.length > b.length) {
var tmp = a;
a = b;
b = tmp;
}
var row = [];
// init the row
for (var i = 0; i <= a.length; i++) {
row[i] = i;
}
// fill in the rest
for (var i = 0; i < b.length; i++) {
var prev = i;
for (var j = 0; j < a.length; j++) {
var val;
if (b.charAt(i) == a.charAt(j)) {
val = row[j]; // match
} else {
val = Math.min(row[j] + 1, // substitution
prev + 1, // insertion
row[j + 1] + 1); // deletion
}
row[j] = prev;
prev = val;
}
row[a.length] = prev;
}
return row[a.length];
}
In cell B2, enter =TypoFinder($A$2:$A2,$A2). Autofill that formula down the column by dragging.
Create a conditional formatting rule for column A. Using Format Rules > Custom Formula, enter =B2:B.
At this point, you might wish to hide column B. To do so, right click on the column and press Hide Column.
The above explanation assumes the column you wish to highlight is Column A and the helper column is column B. Adjust appropriately.
Note that I have assumed you do not wish to highlight repeated blank cells as duplicates. If I am incorrect, remove || word == "" from line 2 of the provided snippet.
Explanation
The concept you have described is called Levenshtein Distance, which is a measure of how close together two strings are. There is no built-in way for Google Sheets to process this, so the Levenshtein() portion of the snippet above implements a custom function to do so instead. Then the TypoFinder() function is built on top of it, providing a method for evaluating a range of data against a specified "correct" word (looking for typos anywhere in the range).
Next, a helper column is used because Sheets has difficulties parsing custom formulas as part of a conditional formatting rule. Finally, the rule itself is implemented to check the helper column's determination of whether the row should be highlighted or not. Altogether, this highlights near-duplicate results in a specified column.
1 Adapted from duality's answer to a related question.
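To make the distance idea concrete, here is a small standalone sketch in TypeScript (not Apps Script, and not the code from the answer above) that applies the same "distance of 0 or 1 to an earlier entry" rule to the sample words from the question. The names levenshtein and nearDuplicateFlags, and the sample list itself, are assumptions made for illustration.

```typescript
// Single-row dynamic-programming Levenshtein distance
// (insertion, deletion and substitution each cost 1).
function levenshtein(a: string, b: string): number {
    const row: number[] = [];
    for (let j = 0; j <= b.length; j++) {
        row.push(j); // distance from the empty prefix of a to the first j characters of b
    }
    for (let i = 1; i <= a.length; i++) {
        let diagonal = row[0]; // value of cell (i-1, j-1)
        row[0] = i;
        for (let j = 1; j <= b.length; j++) {
            const above = row[j]; // value of cell (i-1, j)
            const cost = a[i - 1] === b[j - 1] ? 0 : 1;
            row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
            diagonal = above;
        }
    }
    return row[b.length];
}

// Flags every value that has an earlier, non-blank entry within maxDistance edits.
// The first occurrence is never flagged, mirroring the $A$2:$A2 running range.
function nearDuplicateFlags(values: string[], maxDistance: number = 1): boolean[] {
    return values.map((value, index) =>
        value !== "" &&
        values.slice(0, index).some(earlier => earlier !== "" && levenshtein(earlier, value) <= maxDistance)
    );
}

const sample = ["Marketing", "Marketng", "Marketingg", "Market ing", "Sales"];
console.log(nearDuplicateFlags(sample)); // [ false, true, true, true, false ]
```

Note that the rule is not transitive: "Marketng" and "Marketingg" are two edits apart, yet both are flagged because each is one edit away from the earlier "Marketing".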

Very Basic Function to compare strings or numbers

I want to create a custom function in my Google Sheet which will take a few inputs and compare them. Below is the example:
I know this is very simple, but when I try to save this code it gives me the error below.
Missing ; before statement. (line 6, file "Code")
Your function can be written in plain JavaScript for Google Apps Script:
function testFunction(A1,B1,C1) {
// check if all the three conditions are met
if (A1 == "Lion" && B1 == "Cheeta" && C1 == 1) {
// if they're met, return Cat_Family_Exists
return "Cat_Family_Exists"; }
// else return Cheeta
else {
return "Cheeta";
};
};
However, if you want to directly get the data out of a sheet, you might want to have a look at the docs.

Find Words in entire module

I have a Skip list containing terms such as ADC, FIFO, DAC, FILO, etc.
I want to know whether these words are used anywhere in the module. The function should return the words that are not used.
I have a program, but it takes too much time to execute.
Please help me with this.
Here is the code :
Skip Search_In_Entire_Module(Skip List)
{
int sKey = 0
Skip sList = create()
string data = ""
string objText1
Object obj
for data in List do
{
int var_count = 0
for obj in m do
{
objText1 = obj."Object Text"
if objText1!=null then
{
if (isDeleted obj){continue}
if (table obj) {continue}
if (row obj) {continue}
if (cell obj) {continue}
Buffer buf = create()
buf = objText1
int index = 0
while(true)
{
index = contains(buf, data, index)
if(0 <= index)
{
index += length(data)
}
else
{
var_count++
break
}
}
delete(buf)
}
}
if (var_count ==0)
{
put(sList,sKey,data)
sKey++
}
}
return sList
}
Unused_Terminolody_Data = Search_In_Entire_Module(Terminology_Data)
Just wondering: why is this in a while loop?
while(true)
{
index = contains(buf, data, index)
if(0 <= index)
{
index += length(data)
}
else
{
var_count++
break
}
}
I would instead just do:
index = contains ( buf, data )
if ( index == -1 ) {
var_count++
}
buf = ""
I would also not keep deleting and recreating the buffer. Create the buffer up where you create the object variable, then set it equal to "" to clear it, then delete it at the end of the program.
Let me know if this helps!
Balthos makes good points, and I think there's a little more you could do. My adaptation of your function follows. Points to note:
1. I implemented Balthos's suggestions (above) of taking out the 'while' loop and the repeated buffer creation/deletion.
2. I changed the function signature. Given that Skip lists are passed by reference, and must be created and deleted outside the function, it's syntactically confusing (to me, anyway) to return them from a function. So, I pass both skip lists (terms we're seeking, terms not found) in as function parameters. Please excuse me changing variable names - it helped me to understand what was going on more quickly.
3. There's no need to put the Object Text in a string - this is relatively slow and consumes memory that will not be freed until DOORS exits. So, I put the Object Text in a buffer earlier in the function, and search that. The 'if (!null bufObjText)' at my line 34 is equivalent to your 'objText1!=null'. If you prefer, 'if (bufObjText != null)' does the same.
4. The conditional 'if (var_count ==0)' is redundant - I moved its functions into an earlier 'if' block (my line 40).
5. I moved the tests for deleted, table, row and cell objects up, so that they occur before we take the time to fill a buffer with object text - so that's only done if necessary.
Item 2 probably isn't going to have a performance impact, but the others will. The only question is, how large?
Please let us know if this improves the running time over what you currently have. I don't have a sufficiently large set of sample data to make meaningful comparisons with your code.
Module modCurrent = current
Skip skUnused_Terminology_Data = create
Skip skSeeking_Terminology_Data = create()
put (skSeeking_Terminology_Data, 0, "SPONG")
put (skSeeking_Terminology_Data, 1, "DoD")
void Search_In_Entire_Module(Skip skTermsSought, skTermsNotFound)
{
Object obj
Buffer bufObjText = create()
int intSkipKey = 0
int index = 0
string strSkipData = ""
for strSkipData in skTermsSought do
{
int var_count = 0
bool blFoundTerm = false
for obj in modCurrent do
{
if (isDeleted obj){continue}
if (table obj) {continue}
if (row obj) {continue}
if (cell obj) {continue}
bufObjText = obj."Object Text"
if (!null bufObjText) then
{
Regexp re = regexp2 strSkipData
blFoundTerm = search (re, bufObjText, 0)
if ( blFoundTerm ) {
put(skUnused_Terminology_Data, intSkipKey, strSkipData)
intSkipKey++
}
bufObjText = ""
}
}
}
delete (bufObjText)
}
Search_In_Entire_Module (skSeeking_Terminology_Data, skUnused_Terminology_Data)
string strNotFound
for strNotFound in skUnused_Terminology_Data do
{
print strNotFound "\n"
}
delete skUnused_Terminology_Data
delete skSeeking_Terminology_Data

How to compare two column in a spreadsheet

I have 30 columns and 1000 rows, and I would like to compare column 1 with another column. If the values don't match, I would like to colour them red. Below is a small sample of the layout of my spreadsheet:
    A     B      C      D  E  F  ...
1   name  sName  email
2
3
.
n
Because I have a large dataset, I want to store my columns in an array; the first row is the heading. This is what I have done so far, but when testing I get an empty result. Can someone tell me what I am doing wrong?
var index = [];
var sheet = SpreadsheetApp.getActiveSheet();
function col(){
var data = sheet.getDataRange().getValues();
for (var i = 1; i <= data.length; i++) {
te = index[i] = data[1];
Logger.log(columnIndex[i])
if (data[3] != data[7]){
// column_id.setFontColor('red'); <--- I can set the background like this
}
}
}
As the code shows, I scan the whole spreadsheet, use data[1] to get the heading, and compare the two columns in the if statement (data[3] != data[7]). I still have to work on my colour variable, but that can be done once I get the data that I need.
Check whether this tutorial helps with your problem. It uses Google Apps Script to compare two columns. If differences are found, the script points them out; if no differences are found at all, the script outputs the text "[ id. ]". Just customize this code for your own function.
Here is the code used to achieve this kind of comparison
function stringComparison(s1, s2) {
// lets test both variables are the same object type if not throw an error
if (Object.prototype.toString.call(s1) !== Object.prototype.toString.call(s2)){
throw("Both values need to be an array of cells or individual cells")
}
// if we are looking at two arrays of cells make sure the sizes match and only one column wide
if( Object.prototype.toString.call(s1) === '[object Array]' ) {
if (s1.length != s2.length || s1[0].length > 1 || s2[0].length > 1){
throw("Arrays of cells need to be same size and 1 column wide");
}
// since we are working with an array intialise the return
var out = [];
for (var r = 0; r < s1.length; r++){ // loop over the rows and find differences using diff sub function
out.push([diff(s1[r][0], s2[r][0])]);
}
return out; // return response
} else { // we are working with two cells so return diff
return diff(s1, s2)
}
}
function diff (s1, s2){
var out = "[ ";
var notid = false;
// loop to match each character
for (var n = 0; n < s1.length; n++){
if (s1.charAt(n) == s2.charAt(n)){
out += "–";
} else {
out += s2.charAt(n);
notid = true;
}
out += " ";
}
out += " ]"
return (notid) ? out : "[ id. ]"; // if notid(entical) return output or [id.]
}
For more information, just check the tutorial link above and this SO question on how to compare two Spreadsheets.
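For the original goal of finding rows where two columns disagree, a simpler approach than a character-level diff may be enough. The sketch below is written in TypeScript against a plain 2D array shaped like the result of getValues(), with the header in the first row. The function name mismatchedRows, the Cell type and the sample data are assumptions for illustration; it compares with strict equality, so values such as dates would need to be converted to strings or timestamps first.

```typescript
type Cell = string | number | boolean | null;

// Returns the 1-based sheet row numbers where the two columns differ.
// rows is shaped like the result of getValues(): the first entry is the header row.
// colA and colB are 0-based column indexes into each row.
function mismatchedRows(rows: Cell[][], colA: number, colB: number): number[] {
    const result: number[] = [];
    for (let i = 1; i < rows.length; i++) { // start at 1 to skip the header
        if (rows[i][colA] !== rows[i][colB]) {
            result.push(i + 1); // array index i corresponds to sheet row i + 1
        }
    }
    return result;
}

const data: Cell[][] = [
    ["name", "sName", "email"],
    ["ann", "ann", "ann@example.com"],
    ["bob", "rob", "bob@example.com"],
    ["cat", "cat", "cat@example.com"],
];
console.log(mismatchedRows(data, 0, 1)); // [ 3 ]
```

In Apps Script, each returned row number could then be passed to sheet.getRange(row, column) and coloured with setFontColor('red').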

c++ xml parser function not working

I am using Xerces-C++ to manipulate an XML file, but getNodeValue() and setNodeValue() are not working, although getNodeName() is. Does anyone have any suggestions?
if( currentNode->getNodeType() && currentNode->getNodeType() == DOMNode::ELEMENT_NODE )
{
// Found node which is an Element. Re-cast node as element
DOMElement* currentElement= dynamic_cast< xercesc::DOMElement* >( currentNode );
if( XMLString::equals(currentElement->getTagName(), TAG_ApplicationSettings))
{
// Already tested node as type element and of name "ApplicationSettings".
// Read attributes of element "ApplicationSettings".
const XMLCh* xmlch_OptionA = currentElement->getAttribute(ATTR_OptionA);
m_OptionA = XMLString::transcode(xmlch_OptionA);
XMLCh* t,*s;
//s= XMLString::transcode("manish");
//currentNode->setElementText(s);
t=(XMLCh*)currentNode->getNodeName();
s=(XMLCh*)currentNode->getNodeValue();
cout<getNodeValue()) << "\n";
A DOMElement may contain a collection of other DOMElements or a DOMText. To get the text value of an element you need to call the method getTextContent(), getNodeValue will always return NULL.
There is another way that is conceptually cleaner: since the DOMText is a child of the DOMElement, we can traverse the child nodes and collect their values.
Below is the logic in the form of a method:
string getElementValue(const DOMElement& parent)
{
DOMNode *child;
string strVal;
for (child = parent.getFirstChild();child != NULL ; child = child->getNextSibling())
{
if(DOMNode::TEXT_NODE == child->getNodeType())
{
DOMText* data = dynamic_cast<DOMText*>(child);
const XMLCh* val = data->getWholeText();
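char* text = XMLString::transcode(val);
strVal += text;
XMLString::release(&text); // transcode() allocates; the caller must release the result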
}
else
{
throw "ERROR : Non Text Node";
}
}
return strVal;
}
Hope this helps :)
getNodeValue() returns NULL for an element node, because the "value" of an element lives in its children; in this case, a text node child. You can either iterate through the child nodes or use getTextContent().
To iterate, first check whether the node has children using hasChildNodes(), then use methods like getFirstChild() to reach the text node, and call getNodeValue() on that.
DOMNode* ptrDomNode = SomeNode;
if(ptrDomNode->hasChildNodes())
{
DOMNode* dTextNode = ptrDomNode->getFirstChild();
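char* text = XMLString::transcode(dTextNode->getNodeValue());
// ... use text here ...
XMLString::release(&text); // transcode() allocates; the caller must release the result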
}
