What grammar is this? - parsing

I have to parse a document containing groups of variable/value pairs which is serialized to a string, e.g. like this:
4^26^VAR1^6^VALUE1^VAR2^4^VAL2^^1^14^VAR1^6^VALUE1^^
Here are the different elements:
Group IDs: 4 and 1
Length of the string representation of each group: 26 and 14
One of the groups (the second one): 1^14^VAR1^6^VALUE1^^
Variables: VAR1 and VAR2 in the first group, VAR1 in the second
Length of the string representation of the values: 6, 4 and 6
The values themselves: VALUE1, VAL2 and VALUE1
Variables consist only of alphanumeric characters.
No assumption is made about the values, i.e. they may contain any character, including ^.
Is there a name for this kind of grammar? Is there a parsing library that can handle this mess?
So far I am using my own parser, but because I need to detect and handle corrupt serializations the code looks rather messy; hence my question about a parser library that could lift the burden.

The simplest way to approach it is to note that there are two nested levels that work the same way. The pattern is extremely simple:
id^length^content^
At the outer level, this produces a set of groups. Within each group, the content follows exactly the same pattern, only here the id is the variable name, and the content is the variable value.
So you only need to write that logic once and you can use it to parse both levels. Just write a function that breaks a string up into a list of id/content pairs. Call it once to get the groups, and then loop through them calling it again for each content to get the variables in that group.
Breaking it down into these steps, first we need a way to get "tokens" from the string. This function returns an object with three methods, to find out if we're at "end of file", and to grab the next delimited or counted substring:
var tokens = function(str) {
    var pos = 0;
    return {
        eof: function() {
            return pos == str.length;
        },
        delimited: function(d) {
            var end = str.indexOf(d, pos);
            if (end == -1) {
                throw new Error('Expected delimiter');
            }
            var result = str.substr(pos, end - pos);
            pos = end + d.length;
            return result;
        },
        counted: function(c) {
            var result = str.substr(pos, c);
            pos += c;
            return result;
        }
    };
};
Now we can conveniently write the reusable parse function:
var parse = function(str) {
    var parts = {};
    var t = tokens(str);
    while (!t.eof()) {
        var id = t.delimited('^');
        var len = t.delimited('^');
        var content = t.counted(parseInt(len, 10));
        var end = t.counted(1);
        if (end !== '^') {
            throw new Error('Expected ^ after counted string, instead found: ' + end);
        }
        parts[id] = content;
    }
    return parts;
};
It builds an object where the keys are the IDs (or variable names). I'm assuming that, since they have names, the order isn't significant.
Then we can use that at both levels to create the function to do the whole job:
var parseGroups = function(str) {
    var groups = parse(str);
    Object.keys(groups).forEach(function(id) {
        groups[id] = parse(groups[id]);
    });
    return groups;
};
For your example, it produces this object:
{
    '1': {
        VAR1: 'VALUE1'
    },
    '4': {
        VAR1: 'VALUE1',
        VAR2: 'VAL2'
    }
}

I don't think it's a trivial task to create a grammar for this. But on the other hand, a simple, straightforward approach is not that hard: you know the corresponding string length for every critical string, so you can just chop your string apart according to those lengths.
Where do you see problems?
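A minimal sketch of that chop-by-length idea in JavaScript (illustrative only; the helper name chop and the error messages are not from the answers above):
function chop(str) {
    var pos = 0, parts = {};
    while (pos < str.length) {
        var idEnd = str.indexOf('^', pos);
        var lenEnd = idEnd === -1 ? -1 : str.indexOf('^', idEnd + 1);
        if (idEnd === -1 || lenEnd === -1) throw new Error('Corrupt header at position ' + pos);
        var id = str.substring(pos, idEnd);
        var len = parseInt(str.substring(idEnd + 1, lenEnd), 10);
        var content = str.substr(lenEnd + 1, len);
        // a wrong length shows up immediately as a missing terminating '^'
        if (str.charAt(lenEnd + 1 + len) !== '^') throw new Error('Corrupt length for id ' + id);
        parts[id] = content;
        pos = lenEnd + 1 + len + 1;
    }
    return parts;
}
Applying chop to the whole string yields the groups; applying it again to each group's content yields that group's variables.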

Search Google Sheet column for matching text and print matches

I have a table with Long Words like 'Condemnation' and 'Income' in column A, and Shorter Words such as 'Con' and 'Come' in column B.
I'd like to create a cell to the right which will search the 'LONG WORD' column, check whether it contains the text of the 'SHORTER WORD' column, and print them as a pair.
I only need it to return the first instance it comes across as it goes down.
I have looked at various MATCH and LOOKUP commands, but none seem quite to be able to do the 'return one matching word from a whole column' bit.
Thanks
Tardy
I've thrown together a script-based solution for you. Other solutions that require a formula on every line where you might have partials will end up bogging down the sheet by quite a bit for large data sets. This should generate a range of matches after a couple of seconds for data several tens of thousands of rows long.
Note: Since you opted to not provide a sample dataset, I had to assume how it's laid out. However, this will work regardless of where your columns are, as long as they are titled as Full Words, Partials, and Matches.
Link to spreadsheet (Must be signed into a google account to use the button): Google Sheet
Just click the Get Matches button to have it generate the matches.
The source is a bit more complex/dynamic than it needs to be, but I had a bunch of functions already laying around that I just reused.
Source:
//Retrieves all the necessary word matches
function GetWordMatches() {
  var spreadsheet = SpreadsheetApp.openById('1s0S2iJ7L0wEXgVsKrpuK-aLysaxfHYRDQgp3ShPR8Ns').getSheetByName('Matches');
  var dataRange = spreadsheet.getDataRange();
  var valuesRange = dataRange.getValues();
  var columns = GetColumns(valuesRange, dataRange.getNumColumns(), 0);
  var fullWordsData = GetColumnAsArray(valuesRange, columns.columns['Full Words'].index, true, 1);
  var partialsArray = GetColumnAsArray(valuesRange, columns.columns['Partials'].index, true, 1);
  var partialsData = GeneratePartialsRegexArray(partialsArray);
  var matches = GenerateMatches(fullWordsData, partialsData);
  WriteMatchesToSheet(spreadsheet, columns.columns['Matches'].index, matches, partialsArray);
}

//Writes the matches to the sheet
function WriteMatchesToSheet(spreadsheet, matchesColumnIndex, matches, partialsArray){
  var sortedMatches = SortByKeys(matches, partialsArray);
  var dataRange = spreadsheet.getRange(2, matchesColumnIndex+1, sortedMatches.length);
  dataRange.setValues(sortedMatches);
}

//Generates an array of matches for the full words and partials
function GenerateMatches(fullwordsData, partialsData){
  var output = [];
  var totalLoops = 0;
  for(var i = 0; i < fullwordsData.length; i++){
    totalLoops++;
    for(var ii = 0; ii < partialsData.length; ii++){
      totalLoops++;
      var result = fullwordsData[i].match(partialsData[ii].regex)
      if(result){
        output.push([fullwordsData[i], partialsData[ii].value]);
        partialsData.splice(ii, 1);
        break;
      }
    }
  }
  if(partialsData.length > 0){
    var missedData = GenerateMissedPartialsArray(partialsData);
    output = output.concat(missedData);
  }
  return output;
}

//Generates a missed partials array based on the partials that found no match.
function GenerateMissedPartialsArray(partialsData){
  var output = [];
  for(var i = 0; i < partialsData.length; i++){
    output.push(['No Match', partialsData[i].value])
  }
  return output;
}

//Generates the regex array for the partials
function GeneratePartialsRegexArray(partialsArray){
  var output = [];
  for(var i = 0; i < partialsArray.length; i++){
    output.push({regex: new RegExp(partialsArray[i], 'i'), value: partialsArray[i]});
  }
  return output;
}

//http://stackoverflow.com/a/13305008/3547347
function SortByKeys(itemsArray, sortingArray){
  var itemsMap = CreateItemsMap(itemsArray), result = [];
  for (var i = 0; i < sortingArray.length; ++i) {
    var key = sortingArray[i];
    result.push([itemsMap[key].shift()]);
  }
  return result;
}

//http://stackoverflow.com/a/13305008/3547347
function CreateItemsMap(itemsArray) {
  var itemsMap = {};
  for (var i = 0, item; (item = itemsArray[i]); ++i) {
    (itemsMap[item[1]] || (itemsMap[item[1]] = [])).push(item[0]);
  }
  return itemsMap;
}

//Gets a column of data as an array
function GetColumnAsArray(valuesRange, columnIndex, ignoreBlank, startRowIndex){
  var output = [];
  for(var i = startRowIndex; i < valuesRange.length; i++){
    if(ignoreBlank){
      if(valuesRange[i][columnIndex] !== ''){
        output.push(valuesRange[i][columnIndex]);
      }
      continue;
    }
    output.push(valuesRange[i][columnIndex]);
  }
  return output;
}

//Gets a columns object for the sheet for easy indexing
function GetColumns(valuesRange, columnCount, rowIndex)
{
  var columns = {
    columns: {},
    length: 0
  }
  Logger.log("Populating columns...");
  for(var i = 0; i < columnCount; i++)
  {
    if(valuesRange[0][i] !== ''){
      columns.columns[valuesRange[0][i]] = {index: i ,value: valuesRange[0][i]};
      columns.length++;
    }
  }
  return columns;
}
A note on some decisions: I opted to not use map, or other more concise array functions for the sake of performance.
This works too:
=QUERY(FILTER($D$1:$D$3,REGEXMATCH(A1,"(?i)"&$D$1:$D$3)),"limit 1")
We use REGEXMATCH, and (?i) makes the search case-insensitive; limit 1 in the QUERY returns only the first occurrence.
OK, I think I've found an answer. I'll post it here in case it's of use to anyone else.
To give credit where credit's due, I found it here
This does what I was looking for:
=INDEX($D$1:$D$3,MATCH(1,COUNTIF(A1,"*"&$D$1:$D$3&"*"),0))
It does slow EVERYTHING down a lot because everything is cross-referencing like mad (I had 3000 lines on my spreadsheet), but if there's a list of words in D1-3 it will see if cell A1 contains one of those words and print the word it matches with.
Thanks to everyone who offered solutions, particularly @douglasg14b - if there is one that is less taxing in terms of memory, that would be great, but this does the trick in a slow kind of way!
Thanks
Tardy
MATCH and LOOKUP don't work for partial matches.
One alternative is to use SEARCH or FIND together with other functions in an array formula.
Example:
Column A contains a list of long strings
Cell B1 contains a short string
Cell C1 contains a formula that returns the first long string in column A that contains the short string in B1
=ArrayFormula(INDEX(A1:A,SORT(IF(search(B1,A1:A),ROW(A1:A),),1,TRUE)))
Data
+---+--------------+-------+-------------+
| | A | B | C |
+---+--------------+-------+-------------+
| 1 | Orange juice | apple | Apple cider |
| 2 | Apple cider | | |
| 3 | Apple pay | | |
+---+--------------+-------+-------------+

Parse string into map Golang

I have a string like A=B&C=D&E=F. How do I parse it into a map in Go?
Here is an example in Java, but I don't understand the split part:
String text = "A=B&C=D&E=F";
Map<String, String> map = new LinkedHashMap<String, String>();
for(String keyValue : text.split(" *& *")) {
    String[] pairs = keyValue.split(" *= *", 2);
    map.put(pairs[0], pairs.length == 1 ? "" : pairs[1]);
}
Maybe what you really want is to parse an HTTP query string, and url.ParseQuery does that. (What it returns is, more precisely, a url.Values storing a []string for every key, since URLs sometimes have more than one value per key.) It does things like decode percent escapes (%0A, etc.) that just splitting doesn't. You can find its implementation if you search in the source of url.go.
However, if you do really want to just split on & and = like that Java code did, there are Go analogues for all of the concepts and tools there:
map[string]string is Go's analog of Map<String, String>
strings.Split can split on & for you. SplitN limits the number of pieces, like the two-argument version of split() in Java does. Note that there might be only one piece, so you should check len(pieces) before trying to access pieces[1], say.
for _, piece := range pieces will iterate the pieces you split.
The Java code seems to rely on regexes to trim spaces. Go's Split doesn't use them, but strings.TrimSpace does something like what you want (specifically, strips all sorts of Unicode whitespace from both sides).
I'm leaving the actual implementation to you, but perhaps these pointers can get you started.
import ( "strings" )
var m map[string]string
var ss []string
s := "A=B&C=D&E=F"
ss = strings.Split(s, "&")
m = make(map[string]string)
for _, pair := range ss {
z := strings.Split(pair, "=")
m[z[0]] = z[1]
}
This will do what you want.
There is a very simple way provided by Go's net/url package itself.
Change your string into a URL with query params, e.g. textURL := "method://abc.xyz?A=B&C=D&E=F".
Now just pass this string to the Parse function provided by net/url.
package main
import (
	"fmt"
	"log"
	netURL "net/url"
)
func main() {
	textURL := "method://abc.xyz?A=B&C=D&E=F"
	u, err := netURL.Parse(textURL)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(u.Query()) // map[A:[B] C:[D] E:[F]]
}
Now u.Query() will return you a map containing your query params. This will also work for complex types.
Here is a demonstration of a couple of methods:
package main

import (
	"fmt"
	"net/url"
)

func main() {
	{
		q, e := url.ParseQuery("west=left&east=right")
		if e != nil {
			panic(e)
		}
		fmt.Println(q) // map[east:[right] west:[left]]
	}
	{
		u := url.URL{RawQuery: "west=left&east=right"}
		q := u.Query()
		fmt.Println(q) // map[east:[right] west:[left]]
	}
}
https://golang.org/pkg/net/url#ParseQuery
https://golang.org/pkg/net/url#URL.Query

How to extract integer after "=" sign using ruby

I'm trying to extract the integers after mrp= and talktime=.
var i=0;
var recharge=[];
var recharge_text=[];
var recharge_String="";
var mrp="";
var talktime="";
var validity="";
var mode="";mrp='1100';
talktime='1200.00';
validity='NA';
mode='E-Recharge';
if(typeof String.prototype.trim !== 'function') {
String.prototype.trim = function() {
return this.replace(/^ +| +$/g, '');
}
}
mrp=mrp.trim();
if(isNaN(mrp))
{
recharge_text.push({MRP:mrp, Talktime:talktime, Validity:validity ,Mode:mode});
}
else
{
mrp=parseInt(mrp);
recharge.push({MRP:mrp, Talktime:talktime, Validity:validity ,Mode:mode});
}
mrp='2200';
talktime='2400.00';
I've extracted the above text from a webpage, but I do not know how to extract that particular part alone.
You can use regular expressions to parse strings and extract parts of them:
my_text = "blablabla" #just imagine that this is your text
regex_mrp = /mrp='(.+?)';/ #extracts whatever is between single quotes after mrp
regex_talktime = /talktime='(.+?)';/ #extracts whatever is between single quotes after talktime
mrp = my_text.match(regex_mrp)[1].to_i #gets the match, and converts to integer
talktime = my_text.match(regex_talktime)[1].to_f #gets the match, and converts to float
Here's a quick reference to the regular expressions syntax : https://msdn.microsoft.com/en-us/library/az24scfc(v=vs.110).aspx
I'd do something like this:
string = <<EOT
var i=0;
var recharge=[];
var recharge_text=[];
var recharge_String="";
var mrp="";
var talktime="";
var validity="";
var mode="";mrp='1100';
talktime='1200.00';
validity='NA';
mode='E-Recharge';
if(typeof String.prototype.trim !== 'function') {
String.prototype.trim = function() {
return this.replace(/^ +| +$/g, '');
}
}
mrp=mrp.trim();
if(isNaN(mrp))
{
recharge_text.push({MRP:mrp, Talktime:talktime, Validity:validity ,Mode:mode});
}
else
{
mrp=parseInt(mrp);
recharge.push({MRP:mrp, Talktime:talktime, Validity:validity ,Mode:mode});
}
mrp='2200';
talktime='2400.00';
EOT
hits = string.scan(/(?:mrp|talktime)='[\d.]+'/)
# => ["mrp='1100'", "talktime='1200.00'", "mrp='2200'", "talktime='2400.00'"]
This gives us an array of hits using scan, where the pattern /(?:mrp|talktime)='[\d.]+'/ matched in the string. Figuring out how the pattern works is left as an exercise for the user, but Ruby's Regexp documentation explains it all.
Cleaning that up to be a bit more useful:
hash = hits.map{ |s|
str, val = s.split('=')
[str, val.delete("'")]
}.each_with_object(Hash.new { |h, k| h[k] = [] }){ |(str, val), h| h[str] << val }
You also need to read about each_with_object and what's happening with Hash.new as those are important concepts to learn in Ruby.
At this point, hash is a hash of arrays:
hash # => {"mrp"=>["1100", "2200"], "talktime"=>["1200.00", "2400.00"]}
You can easily extract a particular variable's values, and can correlate them if need be.
What if I get a string instead of an integer next to the "=" sign?
...
string.scan(/(?:tariff)='[\p{Print}]+'/)
It's important to understand what the pattern is doing. The regular expression engine has some gotchas that can drastically affect the speed of a search, so indiscriminately throwing in things without understanding what they do can be very costly.
When using (?:...), you're creating a non-capturing group. When you only have one item to match, it's not necessary, nor is it particularly desirable, since it makes the engine do more work. The only time I'd do that is when I need to refer back to what the capture was, but since you have only one possible thing it'll match, that becomes a moot point. So your pattern should be reduced to:
/tariff='[\p{Print}]+'/
Which, when used, results in:
%(tariff='abcdef abc a').scan(/tariff='[\p{Print}]+'/)
# => ["tariff='abcdef abc a'"]
If you want to capture all non-empty occurrences of the string being assigned, it's easier than what you're doing. I'd use something like:
%(tariff='abcdef abc a').scan(/tariff='.+'/)
# => ["tariff='abcdef abc a'"]
%(tariff='abcdef abc a').scan(/tariff='[^']+'/)
# => ["tariff='abcdef abc a'"]
The second is more rigorous, and possibly safer, as it won't be tricked by a line that has multiple single-quotes:
%(tariff='abcdef abc a', 'foo').scan(/tariff='.+'/)
# => ["tariff='abcdef abc a', 'foo'"]
%(tariff='abcdef abc a', 'foo').scan(/tariff='[^']+'/)
# => ["tariff='abcdef abc a'"]
Why that works is for you to figure out.

ANTLR best way to include meta-data in lexing/parsing (custom objects, kind of annotation)

I plan to include text metadata (like bold, font-size, etc.) in the process of parsing to achieve better recognition.
For instance, I have a given structure where a word on its own line (word\r\n) which is bold and sized 24px is the title of some article. In order to get better recognition results, I want to take the characters as well as the metadata into account. In terms of ANTLR I'm not sure how this could best be done. I'd like to do something like:
Wrap each character of the original text into a custom object with fields for the metadata and pass that to ANTLR.
Preprocess the text and insert annotations for the metadata at specific places, which are then considered by the grammar.
I would really like to take option 1, but I'm not sure which parts of ANTLR I need to subclass. Do I have to start at the ANTLRInputStream object, in order to get a proper stream for a subclassed lexer that produces custom tokens for a subclassed parser, etc.? Is there a more elegant way, especially for querying the tokens while parsing with actions in a {} block?
If anyone has some hints and/or experiences this would be great!
EDIT:
Here is a more specific, simple example: I have a file which includes the encoding of metadata, which I parse beforehand. The actual text, including newlines, looks like the following:
entryOne
Here is some content one.
entryTwo
Here is some content two.
Where the titles entryOne and entryTwo originally have a font size of 24px and the content has a font size of 12px (as example values). Char by char I create a new instance of a custom object encapsulating the character as a String and the font size.
I initialize such an object for each character with its font size, e.g. for the first letter of entryOne:
MyChar aTitelChar = new MyChar("e", 24);
For the content, like the second line Here is some content one. I create instances of MyChar like:
MyChar aContentChar= new MyChar("H", 12);
All characters of the text are wrapped in instances of the MyChar class below and added to a List<MyChar> in order to produce a new input for ANTLR.
Below is the Java class for the characters:
public class MyChar {
    private int fontSizePx;
    private String text;

    public MyChar(String text, int fontSizePx) {
        this.text = text;
        this.fontSizePx = fontSizePx;
    }

    public int getFontSizePx() {
        return fontSizePx;
    }

    public String getText() {
        return text;
    }
}
I want my grammar to match the above two entries (or more, formatted this way), each of which consists of a title and a content terminated with a full stop. The grammar could look like this:
rule: entry+ NEWLINE
;
entry:
title
content
;
title:
letters NEWLINE
;
content:
(letters)+ '.' NEWLINE
;
letters:
LETTERS
;
LETTERS:
('a'..'z' | 'A'..'Z')+
;
WS:
(' ' | '\t' | '\f' ) + {$channel = HIDDEN;};
NEWLINE:'\r'? '\n';
Now, for instance, what I want to do is find out whether it is really the title of an entry by checking the font size of all letters making up the title token before the title rule returns. If the input conforms to the grammar but is actually a mistake (the metadata-encoded file starts with something that conforms to the title rule but is actually content), the author of the grammar could sort that out, knowing that the original font size for titles is 24, and check this. If one of the letter tokens doesn't have font size 24, throw an exception / don't return / do something appropriate.
The thing I'm pondering is where to plug in the List<MyChar> to provide this functionality (to query this kind of metadata while parsing in the context of ANTLR). I'm experimenting with ANTLR's classes, but as I'm new to ANTLR I thought some of the experienced users could point me in the right direction: where would be a good insertion point for custom objects? Should I start by implementing CharStream and overriding some methods? Is there perhaps something which ANTLR provides that I haven't found yet?
Here's one way to accomplish what I think you're going for, using the parser to manage matching input to metadata. Note that I made whitespace significant because it's part of the content and can't be skipped. I also made periods part of content to simplify the example, rather than using them as a marker.
SysEx.g
grammar SysEx;
@header {
import java.util.List;
}
@parser::members {
private List<MyChar> metadata;
private int curpos;
private boolean isTitleInput(String input) {
return isFontSizeInput(input, 24);
}
private boolean isContentInput(String input){
return isFontSizeInput(input, 12);
}
private boolean isFontSizeInput(String input, int fontSize){
List<MyChar> sublist = metadata.subList(curpos, curpos + input.length());
System.out.println(String.format("Testing metadata for input=\%s, font-size=\%d", input, fontSize));
int start = curpos;
//move our metadata pointer forward.
skipInput(input);
for (int i = 0, count = input.length(); i < count; ++i){
MyChar chardata = sublist.get(i);
char c = input.charAt(i);
if (chardata.getText().charAt(0) != c){
//This character doesn't match the metadata (ERROR!)
System.out.println(String.format("Content mismatch at metadata position \%d: metadata=(\%s,\%d); input=\%c", start + i, chardata.getText(), chardata.getFontSizePx(), c));
return false;
} else if (chardata.getFontSizePx() != fontSize){
//The font is wrong.
System.out.println(String.format("Format mismatch at metadata position \%d: metadata=(\%s,\%d); input=\%c", start + i, chardata.getText(), chardata.getFontSizePx(), c));
return false;
}
}
//All characters check out.
return true;
}
private void skipInput(String str){
curpos += str.length();
System.out.println("\t\tMoving metadata pointer ahead by " + str.length() + " to " + curpos);
}
}
rule[List<MyChar> metadata]
@init {
this.metadata = metadata;
}
: entry+ EOF
;
entry
: title content
{System.out.println("Finished reading entry.");}
;
title
: line {isTitleInput($line.text)}? newline {System.out.println("Finished reading title " + $line.text);}
;
content
: line {isContentInput($line.text)}? newline {System.out.println("Finished reading content " + $line.text);}
;
newline
: (NEWLINE{skipInput($NEWLINE.text);})+
;
line returns [String text]
@init {
StringBuilder builder = new StringBuilder();
}
@after {
$text = builder.toString();
}
: (ANY{builder.append($ANY.text);})+
;
NEWLINE:'\r'? '\n';
ANY: .; //whitespace can't be skipped because it's content.
A title is a line that matches the title metadata (size 24 font) followed by one or more newline characters.
A content is a line that matches the content metadata (size 12 font) followed by one or more newline characters. As mentioned above, I removed the check for a period for simplification.
A line is a sequence of characters that does not include newline characters.
A validating semantic predicate (the {...}? after line) is used to validate that the line matches the metadata.
Here is the code I used to test the grammar (minus imports, for brevity):
SysExGrammar.java
public class SysExGrammar {
    public static void main(String[] args) throws Exception {
        //Create some metadata that matches our input.
        List<MyChar> matchingMetadata = new ArrayList<MyChar>();
        appendMetadata(matchingMetadata, "entryOne\r\n", 24);
        appendMetadata(matchingMetadata, "Here is some content one.\r\n", 12);
        appendMetadata(matchingMetadata, "entryTwo\r\n", 24);
        appendMetadata(matchingMetadata, "Here is some content two.\r\n", 12);
        parseInput(matchingMetadata);
        System.out.println("Finished example #1");

        //Create some metadata that doesn't match our input (negative test).
        List<MyChar> mismatchingMetadata = new ArrayList<MyChar>();
        appendMetadata(mismatchingMetadata, "entryOne\r\n", 24);
        appendMetadata(mismatchingMetadata, "Here is some content one.\r\n", 12);
        appendMetadata(mismatchingMetadata, "entryTwo\r\n", 12); //content font size!
        appendMetadata(mismatchingMetadata, "Here is some content two.\r\n", 12);
        parseInput(mismatchingMetadata);
        System.out.println("Finished example #2");
    }

    private static void parseInput(List<MyChar> metadata) throws Exception {
        //Test setup
        InputStream resource = SysExGrammar.class.getResourceAsStream("SysExTest.txt");
        CharStream input = new ANTLRInputStream(resource);
        resource.close();
        SysExLexer lexer = new SysExLexer(input);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        SysExParser parser = new SysExParser(tokens);
        parser.rule(metadata);
        System.out.println("Parsing encountered " + parser.getNumberOfSyntaxErrors() + " syntax errors");
    }

    private static void appendMetadata(List<MyChar> metadata, String string,
            int fontSize) {
        for (int i = 0, count = string.length(); i < count; ++i){
            metadata.add(new MyChar(string.charAt(i) + "", fontSize));
        }
    }
}
SysExTest.txt (note this uses Windows newlines (\r\n)):
entryOne
Here is some content one.
entryTwo
Here is some content two.
Test output (trimmed; the second example has deliberately-mismatched metadata):
Parsing encountered 0 syntax errors
Finished example #1
Parsing encountered 2 syntax errors
Finished example #2
This solution requires that each MyChar corresponds to a character in the input (including newline characters, although you can remove that limitation if you like -- I would remove it if I didn't already have this answer written up ;) ).
As you can see, it's possible to tie the metadata to the parser and everything works as expected. I hope this helps.

JQuery : How can we filter int from variable

Is it possible to filter numbers from the variable?
I can show you one example here from the link http://jsfiddle.net/sweetmaanu/82r5v/6/
I need to get only numbers from the alert message
Simply strip the 'box' string out of it.
DEMO
for (var i = 0; i < order.length; i++) {
    order[i] = order[i].replace('box', '');
}
So instead of box1, box2, box3, box4 you want to see 1,2,3,4
You can use a regular expression like this:
var order = $("#boxes").sortable("toArray") + "";
alert(order.replace(/[^0-9,]/g, ''));
I also had to append an empty string to order because it wasn't being recognized as a string object even though the jQuery documentation says it should be when you call sortable("toArray").
change var order = $("#boxes").sortable("toArray");
to var order = $("#boxes").sortable("toArray").join(',').replace(/[a-zA-Z]/gi, "");
Demo: http://jsfiddle.net/82r5v/13/
// Remove all non-digits from the string
'box1'.replace(/\D/g, ''); // => '1'
// Same, but try to make the string a number
Number('box1'.replace(/\D/g, '')); // => 1
// Shorthand for making an object a number (+o is the same as Number(o))
+'box1'.replace(/\D/g, ''); // => 1
// parseInt(s) works if the number is at the beginning
parseInt('1box'); // => 1
// but not if it occurs later
parseInt('box1'); // => NaN
Maybe using regular expressions, something like this:
alert(order.join(',').match(/\d+/g));
This returns the matched digits as an array of strings.
(\d matches a digit, + matches one or more of them, and g makes it a global match)
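If actual numbers are needed rather than digit strings, the matches can be converted afterwards; a minimal sketch, assuming order is the array returned by sortable("toArray"):
var numbers = (order.join(',').match(/\d+/g) || []).map(Number);
// e.g. ["box2", "box1"] -> [2, 1]; the || [] guards against the case where no digits are found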
One way to do it by using regular expressions - http://jsfiddle.net/holodoc/82r5v/14/
$(document).ready(function() {
    var arrValuesForOrder = ["2", "1", "3", "4"];
    var ul = $("#boxes"),
        items = $("#boxes li.con");

    for (var i = arrValuesForOrder[arrValuesForOrder.length - 1]; i >= 0; i--) {
        // arrValuesForOrder[i] element to move
        // i = index to move element at
        ul.prepend(items.get(arrValuesForOrder[i] - 1));
    }

    $("#boxes").sortable({
        handle : '.drag',
        update: function() {
            var order = $("#boxes").sortable("toArray");
            var sorted = [];
            $.each(order, function(index, value){
                sorted.push(value.match(/box(\d+)/)[1]);
            })
            alert(sorted);
        }
    });
});
