How to escape asterisk in me.regexp - blackberry

I have a regex that looks like this:
RE regex = new RE("([TtYy\\*])(?:([+-])([\\d]+)([dDmMhH]))?");
It is supposed to match
*-30m
T-60h
T
Y
and so on.
But the escape on the asterisk is not working. I tried a few combinations, such as a single backslash and grouping that sequence in parentheses. Does anyone have ideas?
I am using me.regexp from http://code.google.com/p/regexp-me/

In me.regexp.RE.java there are doc comments:
The full regular expression syntax accepted by RE is described here:
Characters
unicodeChar - Matches any identical unicode character
\ - Used to quote a meta-character (like '*')
\\ - Matches a single '\' character
\0nnn - Matches a given octal character
\xhh - Matches a given 8-bit hexadecimal character
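So a single backslash should work here; if it does not, there is an issue in the library.
UPDATE
I have tried the pattern "([TtYy\*])(?:([+-])([\d]+)([dDmMhH]))?" with the text "*-30m", and it works perfectly on my 8900 simulator on RIM 5.0. Note that in Java source each backslash is doubled, as in STR_DEFAULT_PATTERN.
See the code: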
public final class MyScreen extends MainScreen
implements FieldChangeListener {
static final int INT_MAX_LEN = 200;
static final String STR_PATTERN = "Pattern:";
static final String STR_TEXT = "Text:";
static final String STR_RESULT = "Result:";
static final String STR_RUN_TEST = "Run test";
static final String STR_DEFAULT_PATTERN =
"([TtYy\\*])(?:([+-])([\\d]+)([dDmMhH]))?";
static final String STR_DEFAULT_TEXT = "*-30m";
EditField mPattern = new EditField(STR_PATTERN, STR_DEFAULT_PATTERN,
INT_MAX_LEN, Field.USE_ALL_WIDTH);
EditField mText = new EditField(STR_TEXT, STR_DEFAULT_TEXT, INT_MAX_LEN,
Field.USE_ALL_WIDTH);
LabelField mResult = new LabelField(STR_RESULT, Field.USE_ALL_WIDTH);
ButtonField mBtnRunTest =
new ButtonField(STR_RUN_TEST, Field.USE_ALL_WIDTH
| ButtonField.CONSUME_CLICK);
public MyScreen() {
add(mPattern);
add(mText);
add(mResult);
add(mBtnRunTest);
mBtnRunTest.setChangeListener(this);
}
public void fieldChanged(Field field, int context) {
if (field == mBtnRunTest) {
runTest();
}
}
private void runTest() {
RE regex = new RE(mPattern.getText());
String result = regex.match(mText.getText()) ? "TRUE" : "FALSE";
mResult.setText(STR_RESULT + result);
}
}
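For a quick check away from the device, the same pattern can be exercised with java.util.regex. This is only a sketch: it assumes that me.regexp and java.util.regex treat the constructs used here (character classes, non-capturing groups, \d) the same way, and the class name EscapeCheck and its helper methods are made up for illustration. It shows that the Java string literal needs two backslashes so that the regex engine receives one.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; java.util.regex is used as a stand-in for me.regexp.RE.
class EscapeCheck {
    // "\\*" in Java source is the two-character regex \*, a literal asterisk.
    static final Pattern PATTERN =
            Pattern.compile("([TtYy\\*])(?:([+-])([\\d]+)([dDmMhH]))?");

    // Returns true only when the whole input matches the pattern.
    static boolean matches(String text) {
        return PATTERN.matcher(text).matches();
    }

    // Returns the digits of the offset part, or null when there is no offset
    // part or the input does not match at all.
    static String offsetOf(String text) {
        Matcher m = PATTERN.matcher(text);
        return m.matches() ? m.group(3) : null;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"*-30m", "T-60h", "T", "Y", "X-1d"}) {
            System.out.println(s + " -> " + matches(s));
        }
    }
}
```

If this check passes but the device still rejects the input, the difference most likely lies in the library or in how the pattern string reaches it, not in the escape itself.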


How to walk the parse tree to check for syntax errors in ANTLR

I have written a fairly simple language in ANTLR. Before actually interpreting the code written by a user, I want to parse it and check for syntax errors. If any are found, I want to output the cause of the error and exit. How can I check the code for syntax errors and output the corresponding error message? Please note that, for my purposes, error messages similar to those generated by the ANTLR tool are more than sufficient. For example:
line 3:0 missing ';'
You can use an error listener to get more information.
For example:
...
FormulaParser parser = new FormulaParser(tokens);
parser.IsCompletion = options.IsForCompletion;
ErrorListener errListener = new ErrorListener();
parser.AddErrorListener(errListener);
IParseTree tree = parser.formula();
The only thing you need to do is attach the ErrorListener to the parser.
Here is the code of the ErrorListener:
/// <summary>
/// Error listener recording all errors that Antlr parser raises during parsing.
/// </summary>
internal class ErrorListener : BaseErrorListener
{
private const string Eof = "the end of formula";
public ErrorListener()
{
ErrorMessages = new List<ErrorInfo>();
}
public bool ErrorOccured { get; private set; }
public List<ErrorInfo> ErrorMessages { get; private set; }
public override void SyntaxError(IRecognizer recognizer, IToken offendingSymbol, int line, int charPositionInLine, string msg, RecognitionException e)
{
ErrorOccured = true;
if (e == null || e.GetType() != typeof(NoViableAltException))
{
ErrorMessages.Add(new ErrorInfo()
{
Message = ConvertMessage(msg),
StartIndex = offendingSymbol.StartIndex,
Column = offendingSymbol.Column + 1,
Line = offendingSymbol.Line,
Length = offendingSymbol.Text.Length
});
return;
}
ErrorMessages.Add(new ErrorInfo()
{
Message = string.Format("{0}{1}", ConvertToken(offendingSymbol.Text), " unexpected"),
StartIndex = offendingSymbol.StartIndex,
Column = offendingSymbol.Column + 1,
Line = offendingSymbol.Line,
Length = offendingSymbol.Text.Length
});
}
public override void ReportAmbiguity(Antlr4.Runtime.Parser recognizer, DFA dfa, int startIndex, int stopIndex, bool exact, BitSet ambigAlts, ATNConfigSet configs)
{
ErrorOccured = true;
ErrorMessages.Add(new ErrorInfo()
{
Message = "Ambiguity", Column = startIndex, StartIndex = startIndex
});
base.ReportAmbiguity(recognizer, dfa, startIndex, stopIndex, exact, ambigAlts, configs);
}
private string ConvertToken(string token)
{
return string.Equals(token, "<EOF>", StringComparison.InvariantCultureIgnoreCase)
? Eof
: token;
}
private string ConvertMessage(string message)
{
StringBuilder builder = new StringBuilder(message);
builder.Replace("<EOF>", Eof);
return builder.ToString();
}
}
It is a simple listener, but you can see what it does, and that it lets you tell whether an error is a syntax error or an ambiguity. After parsing, you can ask the error listener directly whether any error occurred.

Sum and Average Aggregation using DataFlow

I have the following kind of sample data:
s.n., time, user, time_span, user_level
1, 2016-01-04T1:26:13, Hari, 8, admin
2, 2016-01-04T11:6:13, Gita, 2, admin
3, 2016-01-04T11:26:13, Gita, 0, user
Now I need to find average_time_span per user, average_time_span per user_level, and total_time_span per user.
I am able to compute each of these values separately, but not all of them at once. As I am new to Dataflow, please suggest an appropriate way to do this.
static class ExtractUserAndUserLevelFn extends DoFn<String, KV<String, Long>> {
@Override
public void processElement(ProcessContext c) {
String[] words = c.element().split(",");
if (words.length == 5) {
Instant timestamp = Instant.parse(words[1].trim());
KV<String, Long> userTime = KV.of(words[2].trim(), Long.valueOf(words[3].trim()));
KV<String, Long> userLevelTime = KV.of(words[4].trim(), Long.valueOf(words[3].trim()));
c.outputWithTimestamp(userTime, timestamp);
c.outputWithTimestamp(userLevelTime, timestamp);
}
}
}
public static void main(String[] args) {
TestOptions options = PipelineOptionsFactory.fromArgs(args).withValidation()
.as(TestOptions.class);
Pipeline p = Pipeline.create(options);
p.apply(TextIO.Read.named("ReadLines").from(options.getInputFile()))
.apply(ParDo.of(new ExtractUserAndUserLevelFn()))
.apply(Window.<KV<String, Long>>into(
FixedWindows.of(Duration.standardSeconds(options.getMyWindowSize()))))
.apply(GroupByKey.<String, Long>create())
.apply(ParDo.of(new DoFn<KV<String, Iterable<Long>>, KV<String, Long>>() {
public void processElement(ProcessContext c) {
String key = c.element().getKey();
Iterable<Long> docsWithThatUrl = c.element().getValue();
Long sum = 0L;
for (Long item : docsWithThatUrl)
sum += item;
KV<String, Long> userTime = KV.of(key, sum);
c.output(userTime);
}
}))
.apply(MapElements.via(new FormatAsTextFn()))
.apply(TextIO.Write.named("WriteCounts").to(options.getOutput()).
withNumShards(options.getShardsNumber()));
p.run();
}
One approach would be to first parse the lines into one PCollection that contains a record per line, and then create two PCollections of key-value pairs from that collection. Let's say you define a record representing a line like this:
static class Record implements Serializable {
final String user;
final String role;
final long duration;
// need a constructor here
}
Now, create a LineToRecordFn that creates Records from the input lines, so that you can do:
PCollection<Record> records = p.apply(TextIO.Read.named("ReadLines")
.from(options.getInputFile()))
.apply(ParDo.of(new LineToRecordFn()));
You can window here, if you want. Whether you window or not, you can then create your keyed-by-role and keyed-by-user PCollections:
PCollection<KV<String,Long>> role_duration = records.apply(MapElements.via(
new SimpleFunction<Record,KV<String,Long>>() {
@Override
public KV<String,Long> apply(Record r) {
return KV.of(r.role,r.duration);
}
}));
PCollection<KV<String,Long>> user_duration = records.apply(MapElements.via(
new SimpleFunction<Record,KV<String,Long>>() {
@Override
public KV<String,Long> apply(Record r) {
return KV.of(r.user, r.duration);
}
}));
Now, you can get the means and sum in just a few lines:
PCollection<KV<String,Double>> mean_by_user = user_duration.apply(
Mean.<String,Long>perKey());
PCollection<KV<String,Double>> mean_by_role = role_duration.apply(
Mean.<String,Long>perKey());
PCollection<KV<String,Long>> sum_by_role = role_duration.apply(
Sum.<String>longsPerKey());
Note that dataflow does some optimization before running your job. So, while it might look like you're doing two passes over the records PCollection, that may not be true.
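To see what the per-key mean and sum compute on the sample rows from the question, here is a small local sketch in plain Java. It does not use any Dataflow classes: Collectors.groupingBy stands in for the per-key transforms, and the names PerKeyAggregates and Row are hypothetical, with Row mirroring the Record class above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Local stand-in for the pipeline; no Dataflow classes are involved.
class PerKeyAggregates {
    // Hypothetical row type mirroring the Record class from the answer.
    static final class Row {
        final String user;
        final String role;
        final long duration;

        Row(String user, String role, long duration) {
            this.user = user;
            this.role = role;
            this.duration = duration;
        }
    }

    // The three sample rows from the question (user, user_level, time_span).
    static final List<Row> SAMPLE = Arrays.asList(
            new Row("Hari", "admin", 8),
            new Row("Gita", "admin", 2),
            new Row("Gita", "user", 0));

    // Average time span per user.
    static Map<String, Double> meanByUser(List<Row> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                (Row r) -> r.user, TreeMap::new,
                Collectors.averagingLong((Row r) -> r.duration)));
    }

    // Average time span per user level.
    static Map<String, Double> meanByRole(List<Row> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                (Row r) -> r.role, TreeMap::new,
                Collectors.averagingLong((Row r) -> r.duration)));
    }

    // Total time span per user.
    static Map<String, Long> sumByUser(List<Row> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                (Row r) -> r.user, TreeMap::new,
                Collectors.summingLong((Row r) -> r.duration)));
    }

    public static void main(String[] args) {
        System.out.println("mean by user: " + meanByUser(SAMPLE));
        System.out.println("mean by role: " + meanByRole(SAMPLE));
        System.out.println("sum by user:  " + sumByUser(SAMPLE));
    }
}
```

The shape is the same as in the pipeline: one parsed collection of records, keyed two different ways, with a separate aggregation applied to each keyed view.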
The Mean and Sum transforms look like they would work well for this use case. Basic usage looks like this:
PCollection<KV<String, Double>> meanPerKey =
input.apply(Mean.<String, Integer>perKey());
PCollection<KV<String, Integer>> sumPerKey = input
.apply(Sum.<String>integersPerKey());

Join two Base64 strings and then decode them

I'm trying to figure out how to join two strings that are encoded Base64 and then decode and get the combined result.
Example:
string1 Hello --- string1 Base64 SGVsbG8=
string2 World --- string2 Base64 V29ybGQ=
If I join the Base64 strings I get something that won't decode: SGVsbG8=V29ybGQ=
I want the result to say: Hello World
I don't want only this example to work but rather something that will work with any string.
This is a very simplified version of a problem I'm stuck on in an application I'm trying to write.
What if you decode both strings to byte arrays, then combine those arrays, and finally call GetString on the combined bytes?
using System;
using System.Text;
using System.Linq;
public class Program
{
public static void Main()
{
var base1 = "SGVsbG8=";
var base2 = "V29ybGQ=";
var array1 = Convert.FromBase64String(base1);
var array2 = Convert.FromBase64String(base2);
var comb = Combine(array1, array2);
var data = Encoding.Default.GetString(comb);
Console.WriteLine(data);
}
private static byte[] Combine(byte[] first, byte[] second)
{
return first.Concat(second).ToArray();
}
}
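The same idea, decoding each part on its own and joining the raw bytes afterwards, can be sketched in Java with java.util.Base64. The class and method names here (Base64Join, decodeAndJoin, joinAsText) are made up for illustration, and UTF-8 is an assumption about how the original text was encoded. Concatenating the encoded strings fails because the '=' padding of the first string ends up in the middle of the input, so each part has to be decoded separately.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class; names are for illustration only.
class Base64Join {
    // Decodes every Base64 part separately and concatenates the raw bytes.
    static byte[] decodeAndJoin(String... encodedParts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String part : encodedParts) {
            byte[] bytes = Base64.getDecoder().decode(part);
            out.write(bytes, 0, bytes.length);
        }
        return out.toByteArray();
    }

    // Same as decodeAndJoin, but interprets the joined bytes as UTF-8 text
    // (assumption: the original strings were UTF-8 encoded).
    static String decodeAndJoinToString(String... encodedParts) {
        return new String(decodeAndJoin(encodedParts), StandardCharsets.UTF_8);
    }

    // Decodes each part to text and joins the pieces with a separator.
    static String joinAsText(String separator, String... encodedParts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < encodedParts.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            byte[] bytes = Base64.getDecoder().decode(encodedParts[i]);
            sb.append(new String(bytes, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(decodeAndJoinToString("SGVsbG8=", "V29ybGQ="));
        System.out.println(joinAsText(" ", "SGVsbG8=", "V29ybGQ="));
    }
}
```

Joining at the byte level works for any input, including binary data; the separator variant is only meaningful when every part is complete text on its own.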
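I found a better way to do this: put a plus sign between the two strings in place of the first string's padding, and keep ONE, and only ONE, equals character ('=') at the end of the string. The decoded result will be "Hello>World"; then replace the ">" with a space. Note that this relies on the first string ending in exactly one '=', so it may not work for arbitrary inputs: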
class Program
{
static void Main(string[] args)
{
string base64String = "SGVsbG8+V29ybGQ=";
byte[] encodedByte = Convert.FromBase64String(base64String);
var finalString = Encoding.Default.GetString(encodedByte).Replace(">", " ");
Console.WriteLine(finalString.ToString());
}
}
(Old way) In C# I do something like this:
class Program
{
static void Main(string[] args)
{
string base64String = "SGVsbG8=V29ybGQ=";
Console.WriteLine(DecodeBase64String(base64String));
Console.ReadLine();
}
public static string DecodeBase64String(string base64String)
{
StringBuilder finalString = new StringBuilder();
foreach (var text in base64String.Split(new char[] { '=' }, StringSplitOptions.RemoveEmptyEntries))
{
byte[] encodedByte = Convert.FromBase64String(text + "=");
finalString.Append(Encoding.Default.GetString(encodedByte));
finalString.Append(" "); // This line exists only to handle the "Hello World" case. It is better to remove it and let the caller decide what to do with the returned string.
}
return finalString.ToString();
}
}

Reading a file with a read method using Scanner (InputMismatchException)

I'm new to Java and I have a problem with reading a file using the Scanner class.
My objective is to read the following .txt file:
3
Emmalaan 23
3051JC Rotterdam
7 rooms
price 300000
Javastraat 88
4078KB Eindhoven
3 rooms
price 50000
Javastraat 93
4078KB Eindhoven
4 rooms
price 55000
The "3" on top of the file should be read as an integer that tells how many houses the file has. The following four lines after the "3" determine one house.
I try to read this file using a read method in the Portefeuille class:
public static Portefeuille read(String infile)
{
Portefeuille returnvalue = new Portefeuille();
try
{
Scanner scan = new Scanner(new File(infile)).useDelimiter(" |/n");
int aantalwoningen = scan.nextInt();
for(int i = 0; i<aantalwoningen; ++i)
{
Woning.read(scan);
}
}
catch (FileNotFoundException e)
{
System.out.println("File could not be found");
}
catch (IOException e)
{
System.out.println("Exception while reading the file");
}
return returnvalue;
}
The read method in the Woning class looks like this:
public static Woning read(Scanner sc)
{
String token_adres = sc.next();
String token_dr = sc.next();
String token_postcd = sc.next();
String token_plaats = sc.next();
int token_vraagPrijs = sc.nextInt();
String token_kamerstxt = sc.next();
String token_prijstxt = sc.next();
int token_kamers = sc.nextInt();
return new Woning(adresp, token_vraagPrijs, token_kamers);
}
When I try to execute the following code:
Portefeuille port1 = Portefeuille.read("woningen.txt");
I get the following error:
Exception in thread "main" java.util.InputMismatchException
at java.util.Scanner.throwFor(Scanner.java:840)
at java.util.Scanner.next(Scanner.java:1461)
at java.util.Scanner.nextInt(Scanner.java:2091)
at java.util.Scanner.nextInt(Scanner.java:2050)
at Portefeuille.read(Portefeuille.java:48)
at Portefeuille.main(Portefeuille.java:112)
However, if I use the read method from the Woning class to read one address in a string format:
Emmalaan 23
3051JC Rotterdam
7 Rooms
price 300000
It works fine.
I tried to change the .txt file into only one address without the "3" on top so that it is exactly formatted like the address that should work. But when I call the read method from Woning class it still gives me the error.
Could anyone please help me with this?
Thank you!
I faced a similar issue, so I am posting my answer in case it helps someone in the future.
There are two possible modifications that make this code run.
First option: change the delimiter to .useDelimiter("\\r\\n") when creating the Scanner. I was on Windows, so \\r is needed there for compatibility.
With this modification the exception at the first nextInt() goes away, but the code fails again at int token_vraagPrijs = sc.nextInt();.
The reason is that public static Woning read(Scanner sc) calls sc.next(), which finds and returns the next complete token from the scanner. A complete token is preceded and followed by input that matches the delimiter pattern.
So every sc.next() now reads a whole line, not a single word.
As a result, sc.nextInt() tries to read something like Javastraat 88, and you get the same exception again.
Second option (preferred): don't set a delimiter at all. Scanner defaults to whitespace, and your code works fine. I modified your code and it worked for me.
Code:
public class Test3{
public static void main(String... s)
{
read("test.txt");
}
public static void read(String infile)
{
try (Scanner scan = new Scanner(new File(infile)))
{
int aantalwoningen = scan.nextInt();
System.out.println(aantalwoningen);
for (int i = 0; i < aantalwoningen; ++i)
{
read(scan);
}
}
catch (FileNotFoundException e)
{
System.out.println("File could not be found");
}
}
public static void read(Scanner sc)
{
String token_adres = sc.next();
String token_dr = sc.next();
String token_postcd = sc.next();
String token_plaats = sc.next();
int token_vraagPrijs = sc.nextInt();
String token_kamerstxt = sc.next();
String token_prijstxt = sc.next();
int token_kamers = sc.nextInt();
System.out.println(token_adres + " " + token_dr + " " + token_postcd + " " + token_plaats + " "
+ token_vraagPrijs + " " + token_kamerstxt + " " + token_prijstxt + " " + token_kamers);
}
}
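The difference between the two delimiter settings can be reproduced without a file by feeding the Scanner a string. The sketch below uses hypothetical names (ScannerDelimiterDemo, readHouses, failsWithSlashN) and a one-house sample in the question's format. It shows that the default whitespace delimiter tokenizes the input as expected, while " |/n" contains a literal slash followed by n, so a newline is never treated as a delimiter and the first token is not a plain integer.

```java
import java.util.InputMismatchException;
import java.util.Scanner;

// Hypothetical demo class; reads from a String instead of a file.
class ScannerDelimiterDemo {
    // One house in the same layout as the question's file.
    static final String SAMPLE =
            "1\nEmmalaan 23\n3051JC Rotterdam\n7 rooms\nprice 300000\n";

    // Parses the input with the default (whitespace) delimiter and returns
    // one summary line per house.
    static String readHouses(String input) {
        StringBuilder out = new StringBuilder();
        try (Scanner sc = new Scanner(input)) {
            int count = sc.nextInt();
            for (int i = 0; i < count; i++) {
                String street = sc.next();
                int number = sc.nextInt();
                String postcode = sc.next();
                String city = sc.next();
                int rooms = sc.nextInt();
                sc.next(); // skip the word "rooms"
                sc.next(); // skip the word "price"
                int price = sc.nextInt();
                if (i > 0) {
                    out.append('\n');
                }
                out.append(street).append(' ').append(number).append(", ")
                   .append(postcode).append(' ').append(city).append(", ")
                   .append(rooms).append(" rooms, price ").append(price);
            }
        }
        return out.toString();
    }

    // Returns true when reading the leading count fails with the delimiter
    // from the question, where "/n" is a slash and an n, not a newline.
    static boolean failsWithSlashN(String input) {
        try (Scanner sc = new Scanner(input).useDelimiter(" |/n")) {
            sc.nextInt();
            return false;
        } catch (InputMismatchException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readHouses(SAMPLE));
        System.out.println("fails with \" |/n\": " + failsWithSlashN(SAMPLE));
    }
}
```

With the " |/n" delimiter the first token is the count, the newline, and the street name together, which is why nextInt() throws InputMismatchException before any house is read.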

How do I pretty-print productions and line numbers, using ANTLR4?

I'm trying to write a piece of code that will take an ANTLR4 parser and use it to generate ASTs for inputs similar to the ones given by the -tree option on grun (misc.TestRig). However, I'd additionally like for the output to include all the line number/offset information.
For example, instead of printing
(add (int 5) '+' (int 6))
I'd like to get
(add (int 5 [line 3, offset 6:7]) '+' (int 6 [line 3, offset 8:9]) [line 3, offset 5:10])
Or something similar.
There aren't a tremendous number of visitor examples for ANTLR4 yet, but I am pretty sure I can do most of this by copying the default implementation for toStringTree (used by grun). However, I do not see any information about the line numbers or offsets.
I expected to be able to write super simple code like this:
String visit(ParseTree t) {
return "(" + t.productionName + t.visitChildren() + t.lineNumber + ")";
}
but it doesn't seem to be this simple. I'm guessing I should be able to get line number information from the parser, but I haven't figured out how to do so. How can I grab this line number/offset information in my traversal?
To fill in the few blanks in the solution below, I used:
List<String> ruleNames = Arrays.asList(parser.getRuleNames());
parser.setBuildParseTree(true);
ParserRuleContext prc = parser.program();
ParseTree tree = prc;
to get the tree and the ruleNames. program is the name for the top production in my grammar.
The Trees.toStringTree method can be implemented using a ParseTreeListener. The following listener produces exactly the same output as Trees.toStringTree.
public class TreePrinterListener implements ParseTreeListener {
private final List<String> ruleNames;
private final StringBuilder builder = new StringBuilder();
public TreePrinterListener(Parser parser) {
this.ruleNames = Arrays.asList(parser.getRuleNames());
}
public TreePrinterListener(List<String> ruleNames) {
this.ruleNames = ruleNames;
}
@Override
public void visitTerminal(TerminalNode node) {
if (builder.length() > 0) {
builder.append(' ');
}
builder.append(Utils.escapeWhitespace(Trees.getNodeText(node, ruleNames), false));
}
@Override
public void visitErrorNode(ErrorNode node) {
if (builder.length() > 0) {
builder.append(' ');
}
builder.append(Utils.escapeWhitespace(Trees.getNodeText(node, ruleNames), false));
}
@Override
public void enterEveryRule(ParserRuleContext ctx) {
if (builder.length() > 0) {
builder.append(' ');
}
if (ctx.getChildCount() > 0) {
builder.append('(');
}
int ruleIndex = ctx.getRuleIndex();
String ruleName;
if (ruleIndex >= 0 && ruleIndex < ruleNames.size()) {
ruleName = ruleNames.get(ruleIndex);
}
else {
ruleName = Integer.toString(ruleIndex);
}
builder.append(ruleName);
}
@Override
public void exitEveryRule(ParserRuleContext ctx) {
if (ctx.getChildCount() > 0) {
builder.append(')');
}
}
@Override
public String toString() {
return builder.toString();
}
}
The class can be used as follows:
List<String> ruleNames = ...;
ParseTree tree = ...;
TreePrinterListener listener = new TreePrinterListener(ruleNames);
ParseTreeWalker.DEFAULT.walk(listener, tree);
String formatted = listener.toString();
The class can be modified to produce the information in your output by updating the exitEveryRule method:
@Override
public void exitEveryRule(ParserRuleContext ctx) {
if (ctx.getChildCount() > 0) {
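Token positionToken = ctx.getStart();
if (positionToken != null) {
// Use the rule's last token for the end offset, falling back to the first token.
Token stopToken = ctx.getStop() != null ? ctx.getStop() : positionToken;
builder.append(" [line ");
builder.append(positionToken.getLine());
builder.append(", offset ");
builder.append(positionToken.getStartIndex());
builder.append(':');
builder.append(stopToken.getStopIndex());
builder.append("])");
}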
else {
builder.append(')');
}
}
}
