How to walk the parse tree to check for syntax errors in ANTLR - parsing

I have written a fairly simple language in ANTLR. Before actually interpreting the code written by a user, I wish to parse the code and check for syntax errors. If found I wish to output the cause for the error and exit. How can I check the code for syntax errors and output the corresponding error. Please not that for my purposes the error statements similar to those generated by the ANTLR tool are more than sufficient. For example
line 3:0 missing ';'

There is ErrorListener that you can use to get more information.
For example:
...
FormulaParser parser = new FormulaParser(tokens);
parser.IsCompletion = options.IsForCompletion;
ErrorListener errListener = new ErrorListener();
parser.AddErrorListener(errListener);
IParseTree tree = parser.formula();
Only thing you need to do is to attach ErrorListener to the parser.
Here is the code of ErrorListener.
/// <summary>
/// Error listener recording all errors that Antlr parser raises during parsing.
/// </summary>
internal class ErrorListener : BaseErrorListener
{
private const string Eof = "the end of formula";
public ErrorListener()
{
ErrorMessages = new List<ErrorInfo>();
}
public bool ErrorOccured { get; private set; }
public List<ErrorInfo> ErrorMessages { get; private set; }
public override void SyntaxError(IRecognizer recognizer, IToken offendingSymbol, int line, int charPositionInLine, string msg, RecognitionException e)
{
ErrorOccured = true;
if (e == null || e.GetType() != typeof(NoViableAltException))
{
ErrorMessages.Add(new ErrorInfo()
{
Message = ConvertMessage(msg),
StartIndex = offendingSymbol.StartIndex,
Column = offendingSymbol.Column + 1,
Line = offendingSymbol.Line,
Length = offendingSymbol.Text.Length
});
return;
}
ErrorMessages.Add(new ErrorInfo()
{
Message = string.Format("{0}{1}", ConvertToken(offendingSymbol.Text), " unexpected"),
StartIndex = offendingSymbol.StartIndex,
Column = offendingSymbol.Column + 1,
Line = offendingSymbol.Line,
Length = offendingSymbol.Text.Length
});
}
public override void ReportAmbiguity(Antlr4.Runtime.Parser recognizer, DFA dfa, int startIndex, int stopIndex, bool exact, BitSet ambigAlts, ATNConfigSet configs)
{
ErrorOccured = true;
ErrorMessages.Add(new ErrorInfo()
{
Message = "Ambiguity", Column = startIndex, StartIndex = startIndex
});
base.ReportAmbiguity(recognizer, dfa, startIndex, stopIndex, exact, ambigAlts, configs);
}
private string ConvertToken(string token)
{
return string.Equals(token, "<EOF>", StringComparison.InvariantCultureIgnoreCase)
? Eof
: token;
}
private string ConvertMessage(string message)
{
StringBuilder builder = new StringBuilder(message);
builder.Replace("<EOF>", Eof);
return builder.ToString();
}
}
It is some dummy listener, but you can see what it does. And that you can tell if the error is syntax error, or some ambiguity error. After parsing, you can ask directly the errorListener, if some error occurred.

Related

javacc java.lang.NullPointerException

im trying to make a miniJava parser but im having trouble figuring out a way to parse method declarations that have no formal parameters.
e.g public int getNumber()
The code that i have right now works for parameters of one or more, but im not sure how to return an empty formal object as clearly the problem lies with the line returning null.
Is there a way to skip the return statement altogether and return nothing?
public Formal nt_FormalList() :
{
Type t;
Token s;
LinkedList<Formal> fl = new LinkedList<Formal>();
Formal f;
}
{
t = nt_Type() s = <IDENTIFIER> (f = nt_FormalRest() {fl.add(f);})*
{ return new Formal(t, s.image); }
| {}
{ return null; }
}
.....
public class Formal {
public final Type t;
public final String i;
public Formal(Type at, String ai) {
t = at;
i = ai;
}
I'd suggest that you return list of Formals from nt_FormalList.
public List<Formal> nt_FormalList() :
{
LinkedList<Formal> fl = new LinkedList<Formal>();
Formal f;
}
{
[ f = nt_Formal() {fl.add(f);}
(<COMMA> f = nt_Formal() {fl.add(f);})*
]
{ return fl; }
}

How do I pretty-print productions and line numbers, using ANTLR4?

I'm trying to write a piece of code that will take an ANTLR4 parser and use it to generate ASTs for inputs similar to the ones given by the -tree option on grun (misc.TestRig). However, I'd additionally like for the output to include all the line number/offset information.
For example, instead of printing
(add (int 5) '+' (int 6))
I'd like to get
(add (int 5 [line 3, offset 6:7]) '+' (int 6 [line 3, offset 8:9]) [line 3, offset 5:10])
Or something similar.
There aren't a tremendous number of visitor examples for ANTLR4 yet, but I am pretty sure I can do most of this by copying the default implementation for toStringTree (used by grun). However, I do not see any information about the line numbers or offsets.
I expected to be able to write super simple code like this:
String visit(ParseTree t) {
return "(" + t.productionName + t.visitChildren() + t.lineNumber + ")";
}
but it doesn't seem to be this simple. I'm guessing I should be able to get line number information from the parser, but I haven't figured out how to do so. How can I grab this line number/offset information in my traversal?
To fill in the few blanks in the solution below, I used:
List<String> ruleNames = Arrays.asList(parser.getRuleNames());
parser.setBuildParseTree(true);
ParserRuleContext prc = parser.program();
ParseTree tree = prc;
to get the tree and the ruleNames. program is the name for the top production in my grammar.
The Trees.toStringTree method can be implemented using a ParseTreeListener. The following listener produces exactly the same output as Trees.toStringTree.
public class TreePrinterListener implements ParseTreeListener {
private final List<String> ruleNames;
private final StringBuilder builder = new StringBuilder();
public TreePrinterListener(Parser parser) {
this.ruleNames = Arrays.asList(parser.getRuleNames());
}
public TreePrinterListener(List<String> ruleNames) {
this.ruleNames = ruleNames;
}
#Override
public void visitTerminal(TerminalNode node) {
if (builder.length() > 0) {
builder.append(' ');
}
builder.append(Utils.escapeWhitespace(Trees.getNodeText(node, ruleNames), false));
}
#Override
public void visitErrorNode(ErrorNode node) {
if (builder.length() > 0) {
builder.append(' ');
}
builder.append(Utils.escapeWhitespace(Trees.getNodeText(node, ruleNames), false));
}
#Override
public void enterEveryRule(ParserRuleContext ctx) {
if (builder.length() > 0) {
builder.append(' ');
}
if (ctx.getChildCount() > 0) {
builder.append('(');
}
int ruleIndex = ctx.getRuleIndex();
String ruleName;
if (ruleIndex >= 0 && ruleIndex < ruleNames.size()) {
ruleName = ruleNames.get(ruleIndex);
}
else {
ruleName = Integer.toString(ruleIndex);
}
builder.append(ruleName);
}
#Override
public void exitEveryRule(ParserRuleContext ctx) {
if (ctx.getChildCount() > 0) {
builder.append(')');
}
}
#Override
public String toString() {
return builder.toString();
}
}
The class can be used as follows:
List<String> ruleNames = ...;
ParseTree tree = ...;
TreePrinterListener listener = new TreePrinterListener(ruleNames);
ParseTreeWalker.DEFAULT.walk(listener, tree);
String formatted = listener.toString();
The class can be modified to produce the information in your output by updating the exitEveryRule method:
#Override
public void exitEveryRule(ParserRuleContext ctx) {
if (ctx.getChildCount() > 0) {
Token positionToken = ctx.getStart();
if (positionToken != null) {
builder.append(" [line ");
builder.append(positionToken.getLine());
builder.append(", offset ");
builder.append(positionToken.getStartIndex());
builder.append(':');
builder.append(positionToken.getStopIndex());
builder.append("])");
}
else {
builder.append(')');
}
}
}

SAX getting only the end of a content string

I need to catch data from < itunes:sumary > tag but my handler is getting only the end of tag's content (last three words for example). I don't know what to do because other tags are being handled as expected, getting all content.*
I've seen that some tags are ignored by parser, but I don't think it's happening with because as I said it gets the content but only the end of that.
The source XML is hosted in -> http://djpaulonla.podomatic.com/archive/rss2.xml
Please, could someone help me???
The code is the following:
public class PodOMaticCustomHandler extends CustomHandler {
public PodOMaticCustomHandler(int quantityToFetch, String startTagValue,
String endTagValue) {
super(quantityToFetch, startTagValue, endTagValue);
}
#Override
public void characters(char[] ch, int start, int length)
throws SAXException {
super.characters(ch, start, length);
this.value = new String(ch, start, length);
}
#Override
public void endDocument() throws SAXException {
super.endDocument();
this.endDoc = true;
}
#Override
public void endElement(String uri, String localName, String qName)
throws SAXException {
super.endElement(uri, localName, qName);
if (this.podcast != null) {
if (qName.equalsIgnoreCase("title")) {
podcast.setTitle(this.value);
} else if (qName.equalsIgnoreCase("pubDate")) {
podcast.setPubDate(this.value);
} else if (qName.equalsIgnoreCase("description")) {
podcast.setContent(this.value);
} else if (qName.equalsIgnoreCase("guid")) {
this.podcast.setLink(value);
}
}
}
#Override
public void startElement(String uri, String localName, String qName,
Attributes attributes) throws SAXException {
super.startElement(uri, localName, qName, attributes);
if (this.startTagValue == null) {
this.startTagValueFound = true;
} else if (qName.equalsIgnoreCase("guid")
&& this.value.equalsIgnoreCase(this.startTagValue)) {
this.startTagValueFound = true;
}
if (this.endTagValue != null) {
if (qName.equalsIgnoreCase("guid")
&& this.value.equalsIgnoreCase(this.endTagValue)) {
this.endDoc = true;
}
}
if (!this.endDoc) {
if (this.quantityToFetch != this.podcasts.size()) {
if (this.startTagValueFound == true) {
if (qName.equalsIgnoreCase("item")) {
this.podcast = new Podcast();
} else if (qName.equalsIgnoreCase("enclosure")) {
this.podcast.setMedia(attributes.getValue("url"));
this.podcasts.add(podcast);
}
}
} else {
this.podcast = null;
}
}else{
this.podcast = null;
}
}
}
You can't rely on the characters method being called once with the entire element text, it may be called multiple times, each time with only part of the text.
Add a debug log statement to the characters method showing what you're setting value to and you will see that values is getting set with the first part of the string and then getting overwritten with the last part.
The answer is to buffer the text passed in from the characters calls in a CharArrayWriter or StringBuilder. Then you have to clear the buffer when the end of the element is found.
Here's what the Java tutorial on SAX has to say about the characters method:
Parsers are not required to return any particular number of characters at one time. A parser can return anything from a single character at a time up to several thousand and still be a standard-conforming implementation. So if your application needs to process the characters it sees, it is wise to have the characters() method accumulate the characters in a java.lang.StringBuffer and operate on them only when you are sure that all of them have been found.

How to tell if a segment actually exists in a HL7 message via NHAPI?

I have an SIU S12 message that does not contain a PV2 segment. However, when I get the parsed message from NHAPI, the parent group for PV2, the SIU_S12_PATIENT group, return 1 for currentReps ("PV2"), which means the PV2 is present.
var parser = new NHapi.Base.Parser.PipeParser();
var parsedMessage = parser.Parse(message) as NHapi.Model.V231.Message.SIU_S12;
var patientGroup=parsedMessage.GetPATIENT(0);
// This call should not create the segment if it does not exist
int pv2Count=patientGroup.currentReps("PV2");
//pv2Count is 1 here despite no PV2 segment exists in the message
//also Both GetAll("PV2") and SegmentFinder say the PV2 segment is present
//DG1RepetitionsUsed is also 1 despite no DG1 segment is present in the message
I am trying to avoid writing code to evaluate every field in the segment. PV2 is just an example - there are a lot more segments that could be missing from the message source.
I am using NHAPI v 2.4, the latest version.
Update: following Tyson's suggestion I come up with this method;
var parser = new NHapi.Base.Parser.PipeParser();
var parsedMessage = parser.Parse(message) as NHapi.Model.V231.Message.SIU_S12;
var encodingChars = new NHapi.Base.Parser.EncodingCharacters('|', null);
var patientGroup = parsedMessage.GetPATIENT(0);
var dg1 = (NHapi.Model.V231.Segment.DG1) (patientGroup.GetStructure("DG1"));
string encodedDg1 = NHapi.Base.Parser.PipeParser.Encode(dg1, encodingChars);
bool dg1Exists = string.Compare(encodedDg1,
"DG1", StringComparison.InvariantCultureIgnoreCase)==0;
easiest thing that I have found to do is to determine if a segment is in a message is to search the actual string of the message for the segment name plus a pipe. So, for example
if(message.Contains("PV2|"))
{
//do something neat
}
From my experience, it is either that, or examining every sub-field under the segment to see if there is a value.
EDIT
I found another way to check that might work a little better. The PipeParser class has a couple of static methods on it that takes in ISegment, IGroup, and IType objects that will return a string representation of the object NHapi reference.
Sample code:
string validTestMessages =
"MSH|^~\\&|ADT1|MCM|LABADT|MCM|198808181126|SECURITY|ADT^A01|MSG00001|P|2.6\r" +
"EVN|A01-|198808181123\r" +
"PID|||PID1234^5^M11^HBOC^CPI^HV||JONES^WILLIAM^A^III||19610615000000|M||2106-3|1200 N ELM STREET^^GREENSBORO^NC^27401-1020|GL||||S||S|123456789|9-87654^NC\r" +
"PV1|1|I|||||TEST^TEST^TEST||||||||||||||||||||||||||||||||||||||||||||\r";
var encodingChars = new EncodingCharacters('|', null);
PipeParser parser = new PipeParser();
var message = parser.Parse(validTestMessages);
PV1 pv1 = (PV1)message.GetStructure("PV1");
var doctor = pv1.GetAttendingDoctor(0);
string encodedMessage = PipeParser.Encode(pv1, encodingChars);
Console.WriteLine(encodedMessage);
encodedMessage = PipeParser.Encode(doctor, encodingChars);
Console.WriteLine(encodedMessage);
Output:
PV1|1|I|||||TEST^TEST^TEST
TEST^TEST^TEST
if there is no segment or the item is empty, then the PiperParser will return an empty string.
You can read segment line by line to a file and add in hl7 Record object and check segment exist or not.
package com.sachan.ranvijay#gmail.com.hl7.msg;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import org.nule.lighthl7lib.hl7.Hl7Record;
import org.nule.lighthl7lib.hl7.Hl7Segment;
import com.stpl.hl7.dto.HL7PatientInfoDTO;
/**
* This class will parse the hl7 message. it can accept message file in the format of java.io.file
* as well as String. Its Uses org.nule.lighthl7lib.hl7.Hl7Record
* as a main component.
* #author Ranvijay.Singh
*
*/
public class PrepareHL7Message {
StringBuilder hl7Msg = new StringBuilder();
Hl7Record record = null;
public PrepareHL7Message(File file) throws Exception {
BufferedReader reader = new BufferedReader(
new FileReader(file));
String str = reader.readLine();
while (str != null) {
hl7Msg.append(str).append("\r");
str = reader.readLine();
}
reader.close();
try{
record = new Hl7Record(hl7Msg.toString());
}catch (Exception e) {
throw e;
}
}
public PrepareHL7Message(String msg) throws Exception {
try{
record = new Hl7Record(msg);
}catch (Exception e) {
throw e;
}
}
private HL7PatientInfoDTO getPatientOrderingPhysician(HL7PatientInfoDTO padto) {
Hl7Segment seg = record.getSegment("PV1");
if(seg!=null)
padto.setOrderingPhysician(seg.field(7).toString());
return padto;
}
}
//DTO.............
package com.sachan.ranvijay#gmail.com.hl7.dto;
public class HL7PatientInfoDTO {
/**
* maped with PV1-7
*/
private String orderingPhysician;
/**
* #return the orderingPhysician
*/
public String getOrderingPhysician() {
return orderingPhysician;
}
/**
* #param orderingPhysician the orderingPhysician to set
*/
public void setOrderingPhysician(String orderingPhysician) {
this.orderingPhysician = orderingPhysician;
}
}

Method to create and store method chain at runtime

The problem I have is that I need to do about 40+ conversions to convert loosely typed info into strongly typed info stored in db, xml file, etc.
I'm plan to tag each type with a tuple i.e. a transformational form like this:
host.name.string:host.dotquad.string
which will offer a conversion from the input to an output form. For example, the name stored in the host field of type string, the input is converted into a dotquad notation of type string and stored back into host field. More complex conversions may need several steps, with each step being accomplished by a method call, hence method chaining.
Examining further the example above, the tuple 'host.name.string' with the field host of name www.domain.com. A DNS lookup is done to covert domain name to IP address. Another method is applied to change the type returned by the DNS lookup into the internal type of dotquad of type string. For this transformation, there is 4 seperate methods called to convert from one tuple into another. Some other conversions may require more steps.
Ideally I would like an small example of how method chains are constructed at runtime. Development time method chaining is relatively trivial, but would require pages and pages of code to cover all possibilites, with 40+ conversions.
One way I thought of doing is, is parsing the tuples at startup, and writing the chains out to an assembly, compiling it, then using reflection to load/access. Its would be really ugly and negate the performance increases i'm hoping to gain.
I'm using Mono, so no C# 4.0
Any help would be appreciated.
Bob.
Here is a quick and dirty solution using LINQ Expressions. You have indicated that you want C# 2.0, this is 3.5, but it does run on Mono 2.6. The method chaining is a bit hacky as i didn't exactly know how your version works, so you might need to tweak the expression code to suit.
The real magic really happens in the Chainer class, which takes a collection of strings, which represent the MethodChain subclass. Take a collection like this:
{
"string",
"string",
"int"
}
This will generate a chain like this:
new StringChain(new StringChain(new IntChain()));
Chainer.CreateChain will return a lambda that calls MethodChain.Execute(). Because Chainer.CreateChain uses a bit of reflection, it's slow, but it only needs to run once for each expression chain. The execution of the lambda is nearly as fast as calling actual code.
Hope you can fit this into your architecture.
public abstract class MethodChain {
private MethodChain[] m_methods;
private object m_Result;
public MethodChain(params MethodChain[] methods) {
m_methods = methods;
}
public MethodChain Execute(object expression) {
if(m_methods != null) {
foreach(var method in m_methods) {
expression = method.Execute(expression).GetResult<object>();
}
}
m_Result = ExecuteInternal(expression);
return this;
}
protected abstract object ExecuteInternal(object expression);
public T GetResult<T>() {
return (T)m_Result;
}
}
public class IntChain : MethodChain {
public IntChain(params MethodChain[] methods)
: base(methods) {
}
protected override object ExecuteInternal(object expression) {
return int.Parse(expression as string);
}
}
public class StringChain : MethodChain {
public StringChain(params MethodChain[] methods):base(methods) {
}
protected override object ExecuteInternal(object expression) {
return (expression as string).Trim();
}
}
public class Chainer {
/// <summary>
/// methods are executed from back to front, so methods[1] will call method[0].Execute before executing itself
/// </summary>
/// <param name="methods"></param>
/// <returns></returns>
public Func<object, MethodChain> CreateChain(IEnumerable<string> methods) {
Expression expr = null;
foreach(var methodName in methods.Reverse()) {
ConstructorInfo cInfo= null;
switch(methodName.ToLower()) {
case "string":
cInfo = typeof(StringChain).GetConstructor(new []{typeof(MethodChain[])});
break;
case "int":
cInfo = typeof(IntChain).GetConstructor(new[] { typeof(MethodChain[]) });
break;
}
if(cInfo == null)
continue;
if(expr != null)
expr = Expression.New(cInfo, Expression.NewArrayInit( typeof(MethodChain), Expression.Convert(expr, typeof(MethodChain))));
else
expr = Expression.New(cInfo, Expression.Constant(null, typeof(MethodChain[])));
}
var objParam = Expression.Parameter(typeof(object));
var methodExpr = Expression.Call(expr, typeof(MethodChain).GetMethod("Execute"), objParam);
Func<object, MethodChain> lambda = Expression.Lambda<Func<object, MethodChain>>(methodExpr, objParam).Compile();
return lambda;
}
[TestMethod]
public void ExprTest() {
Chainer chainer = new Chainer();
var lambda = chainer.CreateChain(new[] { "int", "string" });
var result = lambda(" 34 ").GetResult<int>();
Assert.AreEqual(34, result);
}
}
The command pattern would fit here. What you could do is queue up commands as you need different operations performed on the different data types. Those messages could then all be processed and call the appropriate methods when you're ready later on.
This pattern can be implemented in .NET 2.0.
Do you really need to do this at execution time? Can't you create the combination of operations using code generation?
Let me elaborate:
Assuming you have a class called Conversions which contains all the 40+ convertions you mentioned like this:
//just pseudo code..
class conversions{
string host_name(string input){}
string host_dotquad(string input){}
int type_convert(string input){}
float type_convert(string input){}
float increment_float(float input){}
}
Write a simple console app or something similar which uses reflection to generate code for methods like this:
execute_host_name(string input, Queue<string> conversionQueue)
{
string ouput = conversions.host_name(input);
if(conversionQueue.Count == 0)
return output;
switch(conversionQueue.dequeue())
{
// generate case statements only for methods that take in
// a string as parameter because the host_name method returns a string.
case "host.dotquad": return execute_host_dotquad(output,conversionQueue);
case "type.convert": return execute_type_convert(output, conversionQueue);
default: // exception...
}
}
Wrap all this in a Nice little execute method like this:
object execute(string input, string [] conversions)
{
Queue<string> conversionQueue = //create the queue..
case(conversionQueue.dequeue())
{
case "host.name": return execute_host_name(output,conversionQueue);
case "host.dotquad": return execute_host_dotquad(output,conversionQueue);
case "type.convert": return execute_type_convert(output, conversionQueue);
default: // exception...
}
}
This code generation application need to be executed only when your method signatures changes or when you decide to add new transformations.
Main advantages:
No runtime overhead
Easy to add/delete/change the conversions (code generator will take care of the code changes :) )
What do you think?
I apologize for the long code dump and the fact that it is in Java, rather than C#, but I found your problem quite interesting and I do not have much C# experience. Hopefully you will be able to adapt this solution without difficulty.
One approach to solving your problem is to create a cost for each conversion -- usually this is related to the accuracy of the conversion -- and then perform a search to find the best possible conversion sequence to get from one type to another.
The reason for needing a cost function is to choose among multiple conversion paths. For example, converting from an integer to a string is lossless, but there is no guarantee that every string can be represented by an integer. So, if you had two conversion chains
string -> integer -> float -> decimal
string -> float -> decimal
You would want to select the second one because it will reduce the chance of a conversion failure.
The Java code below implements such a scheme and performs a best-first search to find an optimal conversion sequence. I hope you find it useful. Running the code produces the following output:
> No conversion possible from string to integer
> The optimal conversion sequence from string to host.dotquad.string is:
> string to host.name.string, cost = -1.609438
> host.name.string to host.dns, cost = -1.609438 *PERFECT*
> host.dns to host.dotquad, cost = -1.832581
> host.dotquad to host.dotquad.string, cost = -1.832581 *PERFECT*
Here is the Java code.
/**
* Use best-first search to find an optimal sequence of operations for
* performing a type conversion with maximum fidelity.
*/
import java.util.*;
public class TypeConversion {
/**
* Define a type-conversion interface. It converts between to
* user-defined types and provides a measure of fidelity (accuracy)
* of the conversion.
*/
interface ITypeConverter<T, F> {
public T convert(F from);
public double fidelity();
// Could use reflection instead of handling this explicitly
public String getSourceType();
public String getTargetType();
}
/**
* Create a set of user-defined types.
*/
class HostName {
public String hostName;
public HostName(String hostName) {
this.hostName = hostName;
}
}
class DnsLookup {
public String ipAddress;
public DnsLookup(HostName hostName) {
this.ipAddress = doDNSLookup(hostName);
}
private String doDNSLookup(HostName hostName) {
return "127.0.0.1";
}
}
class DottedQuad {
public int[] quad = new int[4];
public DottedQuad(DnsLookup lookup) {
String[] split = lookup.ipAddress.split(".");
for ( int i = 0; i < 4; i++ )
quad[i] = Integer.parseInt( split[i] );
}
}
/**
* Define a set of conversion operations between the types. We only
* implement a minimal number for brevity, but this could be expanded.
*
* We start by creating some broad classes to differentiate among
* perfect, good and bad conversions.
*/
abstract class PerfectTypeConversion<T, F> implements ITypeConverter<T, F> {
public abstract T convert(F from);
public double fidelity() { return 1.0; }
}
abstract class GoodTypeConversion<T, F> implements ITypeConverter<T, F> {
public abstract T convert(F from);
public double fidelity() { return 0.8; }
}
abstract class BadTypeConversion<T, F> implements ITypeConverter<T, F> {
public abstract T convert(F from);
public double fidelity() { return 0.2; }
}
/**
* Concrete classes that do the actual conversions.
*/
class StringToHostName extends BadTypeConversion<HostName, String> {
public HostName convert(String from) { return new HostName(from); }
public String getSourceType() { return "string"; }
public String getTargetType() { return "host.name.string"; }
}
class HostNameToDnsLookup extends PerfectTypeConversion<DnsLookup, HostName> {
public DnsLookup convert(HostName from) { return new DnsLookup(from); }
public String getSourceType() { return "host.name.string"; }
public String getTargetType() { return "host.dns"; }
}
class DnsLookupToDottedQuad extends GoodTypeConversion<DottedQuad, DnsLookup> {
public DottedQuad convert(DnsLookup from) { return new DottedQuad(from); }
public String getSourceType() { return "host.dns"; }
public String getTargetType() { return "host.dotquad"; }
}
class DottedQuadToString extends PerfectTypeConversion<String, DottedQuad> {
public String convert(DottedQuad f) {
return f.quad[0] + "." + f.quad[1] + "." + f.quad[2] + "." + f.quad[3];
}
public String getSourceType() { return "host.dotquad"; }
public String getTargetType() { return "host.dotquad.string"; }
}
/**
* To find the best conversion sequence, we need to instantiate
* a list of converters.
*/
ITypeConverter<?,?> converters[] =
{
new StringToHostName(),
new HostNameToDnsLookup(),
new DnsLookupToDottedQuad(),
new DottedQuadToString()
};
Map<String, List<ITypeConverter<?,?>>> fromMap =
new HashMap<String, List<ITypeConverter<?,?>>>();
public void buildConversionMap()
{
for ( ITypeConverter<?,?> converter : converters )
{
String type = converter.getSourceType();
if ( !fromMap.containsKey( type )) {
fromMap.put( type, new ArrayList<ITypeConverter<?,?>>());
}
fromMap.get(type).add(converter);
}
}
public class Tuple implements Comparable<Tuple>
{
public String type;
public double cost;
public Tuple parent;
public Tuple(String type, double cost, Tuple parent) {
this.type = type;
this.cost = cost;
this.parent = parent;
}
public int compareTo(Tuple o) {
return Double.compare( cost, o.cost );
}
}
public Tuple findOptimalConversionSequence(String from, String target)
{
PriorityQueue<Tuple> queue = new PriorityQueue<Tuple>();
// Add a dummy start node to the queue
queue.add( new Tuple( from, 0.0, null ));
// Perform the search
while ( !queue.isEmpty() )
{
// Pop the most promising candidate from the list
Tuple tuple = queue.remove();
// If the type matches the target type, return
if ( tuple.type == target )
return tuple;
// If we have reached a dead-end, backtrack
if ( !fromMap.containsKey( tuple.type ))
continue;
// Otherwise get all of the possible conversions to
// perform next and add their costs
for ( ITypeConverter<?,?> converter : fromMap.get( tuple.type ))
{
String type = converter.getTargetType();
double cost = tuple.cost + Math.log( converter.fidelity() );
queue.add( new Tuple( type, cost, tuple ));
}
}
// No solution
return null;
}
public static void convert(String from, String target)
{
TypeConversion tc = new TypeConversion();
// Build a conversion lookup table
tc.buildConversionMap();
// Find the tail of the optimal conversion chain.
Tuple tail = tc.findOptimalConversionSequence( from, target );
if ( tail == null ) {
System.out.println( "No conversion possible from " + from + " to " + target );
return;
}
// Reconstruct the conversion path (skip dummy node)
List<Tuple> solution = new ArrayList<Tuple>();
for ( ; tail.parent != null ; tail = tail.parent )
solution.add( tail );
Collections.reverse( solution );
StringBuilder sb = new StringBuilder();
Formatter formatter = new Formatter(sb);
sb.append( "The optimal conversion sequence from " + from + " to " + target + " is:\n" );
for ( Tuple tuple : solution ) {
formatter.format( "%20s to %20s, cost = %f", tuple.parent.type, tuple.type, tuple.cost );
if ( tuple.cost == tuple.parent.cost )
sb.append( " *PERFECT*");
sb.append( "\n" );
}
System.out.println( sb.toString() );
}
public static void main(String[] args)
{
// Run two tests
convert( "string", "integer" );
convert( "string", "host.dotquad.string" );
}
}

Resources