Saxon Extension function in Java return Node - saxon

I am trying to implement a custom function using Saxon, as defined here: https://specifications.xbrl.org/registries/functions-registry-1.0/80132%20xfi.identifier/80132%20xfi.identifier%20function.html
public class IdentifierFunction implements ExtensionFunction {
public QName getName() {
return new QName("http://www.xbrl.org/2005/function/instance", "identifier");
}
public SequenceType getResultType() {
return SequenceType.makeSequenceType(ItemType.STRING, OccurrenceIndicator.ONE);
}
public net.sf.saxon.s9api.SequenceType[] getArgumentTypes() {
return new SequenceType[] { SequenceType.makeSequenceType(ItemType.STRING, OccurrenceIndicator.ONE) };
}
public XdmValue call(XdmValue[] arguments) throws SaxonApiException {
String arg = ((XdmAtomicValue) arguments[0].itemAt(0)).getStringValue();
String newExpression="(//xbrli:xbrl/xbrli:context[@id=("+arg+"/@contextRef"+")])[1]/xbrli:entity/xbrli:identifier";
String nodeString=this.getxPathResolver().resolveNode(this.getXbrl(),newExpression);
return new XdmAtomicValue(nodeString);
}
}
resolveNode() in the above code is implemented as follows:
public String resolveNode(byte[] xbrlBytes, String expressionValue) {
// 1. Instantiate an XPathFactory.
javax.xml.xpath.XPathFactory factory = new XPathFactoryImpl();
// 2. Use the XPathFactory to create a new XPath object
javax.xml.xpath.XPath xpath = factory.newXPath();
NamespaceContext ctx = new NamespaceContext() {
@Override
public String getNamespaceURI(String aPrefix) {
if (aPrefix.equals("xfi"))
return "http://www.xbrl.org/2005/function/instance";
else if (aPrefix.equals("xs"))
return "http://www.w3.org/2001/XMLSchema";
else if (aPrefix.equals("xbrli"))
return "http://www.xbrl.org/2003/instance";
else
return null;
}
@Override
public Iterator getPrefixes(String val) {
throw new UnsupportedOperationException();
}
@Override
public String getPrefix(String uri) {
throw new UnsupportedOperationException();
}
};
xpath.setNamespaceContext(ctx);
try {
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = builderFactory.newDocumentBuilder();
Document someXML = documentBuilder.parse(new InputSource(new StringReader(new String(xbrlBytes))));
// 3. Compile an XPath string into an XPathExpression
javax.xml.xpath.XPathExpression expression = xpath.compile(expressionValue);
Object result = expression.evaluate(someXML, XPathConstants.NODE);
// 4. Evaluate the XPath expression on an input document
Node nodes = (Node) result;
return nodeToString(nodes);
} catch (Exception e) {
e.printStackTrace();
}
return null;
}
When I evaluate xfi:identifier(args), I get a String like the one below:
<xbrli:identifier xmlns:xbrli="http://www.xbrl.org/2003/instance"
xmlns:iso4217="http://www.xbrl.org/2003/iso4217"
xmlns:jenv-bw2-dim="http://www.nltaxonomie.nl/nt13/jenv/20181212/dictionary/jenv-bw2-axes"
xmlns:jenv-bw2-dm="http://www.nltaxonomie.nl/nt13/jenv/20181212/dictionary/jenv-bw2-domains"
xmlns:jenv-bw2-i="http://www.nltaxonomie.nl/nt13/jenv/20181212/dictionary/jenv-bw2-data"
xmlns:kvk-i="http://www.nltaxonomie.nl/nt13/kvk/20181212/dictionary/kvk-data"
xmlns:link="http://www.xbrl.org/2003/linkbase"
xmlns:nl-cd="http://www.nltaxonomie.nl/nt13/sbr/20180301/dictionary/nl-common-data"
xmlns:rj-i="http://www.nltaxonomie.nl/nt13/rj/20181212/dictionary/rj-data"
xmlns:rj-t="http://www.nltaxonomie.nl/nt13/rj/20181212/dictionary/rj-tuples"
xmlns:xbrldi="http://xbrl.org/2006/xbrldi"
xmlns:xlink="http://www.w3.org/1999/xlink"
scheme="http://www.kvk.nl/kvk-id">62394207</xbrli:identifier>
However, I want to evaluate number(xfi:identifier(args)).
This results in NaN, which is expected because the complete serialized node string cannot be converted to a number. I think I need to change my function so that it returns a node, but I am not sure how to do that. I searched Google and the Saxon documentation, but no luck yet.
Can someone help me? Basically, the custom function should return an element node as per the definition, and number(xfi:identifier(args)) should give me 62394207 in this case.
regards,
Venky

Firstly, the XBRL spec for the function seems to imply that the function expects a node as argument and returns a node as its result, but in your implementation getArgumentTypes() and getResultType() define the type as xs:string - so this needs to change.
And the function should return an XdmNode, which is a subclass of XdmValue.
Next, it's very inefficient to be creating a DocumentBuilderFactory and XPathFactory, constructing an XML document tree, and compiling an XPath expression, every time your function is executed. I strongly suspect none of this is necessary.
Instead of having this.getXbrl() return a raw lexical document as byte[], have it return a prebuilt XdmNode representing the document tree. And then I would suggest that rather than selecting within that tree using XPath, you use Saxon's linq-like navigation API. If this XdmNode is in variable "root", then the XPath expression
//xbrli:xbrl/xbrli:context[@id=("+arg+"/@contextRef"+")
translates into something like
root.select(descendant("xbrl").then(child("context")).where(attributeEq("id", arg)))
(except that I'm not quite sure what you're passing as arg to make your XPath expression make sense).
But you can use XPath if you prefer; just use Saxon's s9api interfaces for XPath and make sure the XPath expression is only compiled once and used repeatedly. It's straightforward then to get an XdmNode as the result of your XPath expression, which can be returned directly as the result of your extension function.
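To see why returning the node rather than its serialized markup fixes the NaN, here is a minimal sketch. It deliberately uses only the JDK's built-in XPath 1.0 engine, not Saxon's s9api, so it is an illustration of the principle rather than the extension function itself. The class name, the helper methods and the trimmed-down sample document are all made up for the example.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of Saxon or of the question's code.
class NodeVersusMarkup {
    static final String XBRLI = "http://www.xbrl.org/2003/instance";

    // Tiny stand-in for an XBRL instance (assumption: real instances are much larger).
    static final String SAMPLE =
          "<xbrli:xbrl xmlns:xbrli='" + XBRLI + "'>"
        + "<xbrli:context id='c1'><xbrli:entity>"
        + "<xbrli:identifier scheme='http://www.kvk.nl/kvk-id'>62394207</xbrli:identifier>"
        + "</xbrli:entity></xbrli:context></xbrli:xbrl>";

    static XPath newXPath() {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override public String getNamespaceURI(String prefix) {
                return "xbrli".equals(prefix) ? XBRLI : XMLConstants.NULL_NS_URI;
            }
            @Override public String getPrefix(String uri) { return null; }
            @Override public Iterator<String> getPrefixes(String uri) { return null; }
        });
        return xpath;
    }

    static Document parseSample() throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // needed so the xbrli: prefix can be matched
        return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(SAMPLE)));
    }

    // number() applied to a node uses the node's string value (its text content).
    static double numberOfNode() {
        try {
            Node id = (Node) newXPath().evaluate(
                "//xbrli:identifier", parseSample(), XPathConstants.NODE);
            return (Double) newXPath().evaluate("number(.)", id, XPathConstants.NUMBER);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // number() applied to serialized markup sees the angle brackets and gives NaN.
    static double numberOfMarkup() {
        try {
            String expr = "number(\"<xbrli:identifier>62394207</xbrli:identifier>\")";
            return (Double) newXPath().evaluate(expr, parseSample(), XPathConstants.NUMBER);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("number(node)   = " + numberOfNode());
        System.out.println("number(markup) = " + numberOfMarkup());
    }
}
```

The same reasoning should carry over to the extension function: if call() returns the selected node instead of a string built from it, number() can atomize the node to its text content.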

Related

Dart: lists of supertype takes subtype only at runtime

I ran into an issue similar to this:
void main() {
_buildMixedList([1,2.3,4,5.6,7.6,8]);
_buildHomogeneousList([1,2,4,5,7,8]);
}
abstract class NumberWrapper {}
class DoubleWrapper extends NumberWrapper{
final double myDouble;
DoubleWrapper(this.myDouble);
}
class IntWrapper extends NumberWrapper{
final int myInt;
IntWrapper(this.myInt);
}
List<NumberWrapper?> _buildMixedList(List<dynamic> numbers) {
List<NumberWrapper?> wrappers = numbers.map((number) {
if(number is int){
return IntWrapper(number);
}
if(number is double){
return DoubleWrapper(number);
}
return null;
}).toList();
wrappers.add(DoubleWrapper(0.2));
return wrappers;
}
List<NumberWrapper?> _buildHomogeneousList(List<dynamic> numbers) {
List<NumberWrapper?> wrappers = numbers.map((number) {
if(number is int){
return IntWrapper(number);
}
return null;
}).toList();
wrappers.add(DoubleWrapper(0.2));
return wrappers;
}
As you can see, the two methods are doing something similar (adding object of different types to a list). The first one adds different objects inside a map() function and the other adds only one type in map() and then adds another after.
The second one throws this error:
TypeError: Instance of 'DoubleWrapper': type 'DoubleWrapper' is not a subtype of type 'IntWrapper?'
As if the list is being changed to List<IntWrapper?> just because we only added IntWrappers in the map().
I wrote this test code after encountering this in one of my projects, so it's not representative of a real case. I tried it on dartPad.
Coming from a Java background, I was expecting the second method to work. Is it a bug or is it intended? If intended, why is that so?
Your problem is that there is a difference between the type of the variable and the type of the object it points to.
So in this case:
List<NumberWrapper?> wrappers = numbers.map((number) {
if(number is int){
return IntWrapper(number);
}
return null;
}).toList();
What you are actually doing is creating a List<IntWrapper?> and pointing at it with a variable of type List<NumberWrapper?>. Why? Because the type of the variable does not change the type of the List returned by toList() (whose type is determined by what map() returns).
The reason the type is List<IntWrapper?> is that Dart tries to be smart about inferring the type automatically. In this case, the analyzer can see that your List will only contain IntWrapper or null.
I think the best solution here is to rewrite this part to something like this:
List<NumberWrapper?> _buildHomogeneousList(List<num> numbers) {
final wrappers = <NumberWrapper?>[
for (final number in numbers)
if (number is int) IntWrapper(number) else null
];
wrappers.add(DoubleWrapper(0.2));
return wrappers;
}
By using the [] syntax to create the List, it is easier to specify the type you want the List to be.
Alternatively, you can pass the expected type to the map method:
List<NumberWrapper?> _buildHomogeneousList(List<num> numbers) {
List<NumberWrapper?> wrappers = numbers.map<NumberWrapper?>((number) {
if (number is int) {
return IntWrapper(number);
}
return null;
}).toList();
wrappers.add(DoubleWrapper(0.2));
return wrappers;
}
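Since the question mentions a Java background: Java's generic lists are erased at runtime, which is why the equivalent List code would not fail there, while Dart's generics are reified. The closest Java analogue to what happens above is probably array covariance, where the runtime element type of the array can be narrower than the static type of the variable. The sketch below is only that analogy (the class and method names are invented for the example), not a translation of the Dart code.

```java
// Hypothetical demo class illustrating the static-type vs runtime-type gap.
class CovariantArrayDemo {

    // Tries to store a Double through a Number[] reference.
    // Returns true if the JVM rejects the store at runtime.
    static boolean storeFails(Number[] target) {
        try {
            target[0] = 0.2; // compiles: the static type is Number[]
            return false;
        } catch (ArrayStoreException e) {
            // The array object itself only accepts its runtime element type.
            return true;
        }
    }

    public static void main(String[] args) {
        Number[] reallyIntegers = new Integer[] {1, 2}; // runtime type Integer[]
        Number[] reallyNumbers = new Number[] {1, 2};   // runtime type Number[]

        System.out.println("Integer[] behind Number[] rejects a Double: "
                + storeFails(reallyIntegers));
        System.out.println("Number[] rejects a Double: "
                + storeFails(reallyNumbers));
    }
}
```

In both languages the fix is the same idea: make sure the container is created with the wider element type, instead of relying on the type of the variable it is assigned to.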

Custom embedded language netbeans

I'm developing a plugin for the NetBeans IDE that adds support for a new custom language. I have created a parser and lexer for my custom language using ANTLR. My language also contains some very complicated "SQL-like" queries, so I decided to write a separate grammar, and therefore a separate parser and lexer, for them. As a result, I have two languages, with the "SQL-like" language embedded in the other.
NetBeans provides the EmbeddingProvider class, which is responsible for embedding languages. Here is my EmbeddingProvider:
@EmbeddingProvider.Registration(mimeType = "text/x-lorx", targetMimeType = "text/x-sqll")
public class LorxEmbeddingProvider extends EmbeddingProvider {
@Override
public List<Embedding> getEmbeddings(Snapshot snapshot) {
TokenHierarchy th = snapshot.getTokenHierarchy();
TokenSequence<LorxTokenId> ts = th.tokenSequence(LorxTokenId.getLanguage());
List<Embedding> embeddings = new ArrayList<>();
while(ts.moveNext()) {
Token currToken = ts.token();
if(currToken.id().ordinal() == LorxTokenType.SqllLiteral.id) {
embeddings.add(snapshot.create(currToken.text(), "text/x-sqll"));
}
}
return embeddings;
}
@Override
public int getPriority() {
return 140;
}
@Override
public void cancel() {
}
}
The annotation declares the top-level language ("text/x-lorx") and its embedded language ("text/x-sqll").
The getEmbeddings(Snapshot snapshot) method runs when a file is opened in the editor or the caret moves to another position. I use the Snapshot class to get the token sequence of the currently open file. In this code sample I iterate over the tokens looking for a SqllLiteral token (something like [select * from ...]). If I find one, I create a new Embedding.
public class SqllParserFactory extends ParserFactory {
@Override
public Parser createParser(Collection<Snapshot> snapshots) {
return new SqllNBParser();
}
}
After getEmbeddings(Snapshot snapshot) finishes, the SqllParserFactory of the embedded language creates a new parser for the sqll language, and then nothing happens. I would like to know whether I'm on the right track, and I would appreciate advice on how to split the embedded language text into tokens.
Instead of using EmbeddingProvider you can try to use LanguageProvider.
@ServiceProvider(service = LanguageProvider.class)
public class MyEmbeddingLanguageProvider extends LanguageProvider {
@Override
public LanguageEmbedding<?> findLanguageEmbedding(Token<?> token,
LanguagePath languagePath, InputAttributes inputAttributes) {
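// MIME type changed to "text/x-sqll" to match the registration in the question (the original line used "text/sqll")
Language embeddedLanguage = MimeLookup.getLookup("text/x-sqll").lookup(Language.class);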
if (embeddedLanguage != null && languagePath.mimePath().equals("text/x-lorx")) {
if (token.id().ordinal() == LorxTokenType.SqllLiteral.id) {
return LanguageEmbedding.create(embeddedLanguage, 0, 0, true);
}
}
return null;
}
@Override
public Language<?> findLanguage(String mimeType) {
return null;
}
}
It should work for tokenizing both languages (which can result in nice syntax coloring). Unfortunately parsing of embedded language may not work.
The problem has been solved by overriding the embedding() method in my LanguageHierarchy subclass. I had tried to override this method before, but used it incorrectly: I passed the wrong parameters to LanguageEmbedding.create(SqllTokenId.getLanguage(), 0, 0). Instead of passing zeros, I passed the length of my token, so the token was skipped, because the second parameter is startSkipLength and the third is endSkipLength. The code below is correct. LorxTokenType.SqllLiteral is the token for my "SQL-like" query. It is now resolved as an embedded language and processed by another lexer and parser (in my case, the Sqll lexer and parser).
@Override
protected LanguageEmbedding<?> embedding(Token<LorxTokenId> token, LanguagePath languagePath, InputAttributes inputAttributes) {
if(token.id().ordinal() == LorxTokenType.SqllLiteral.id) {
return LanguageEmbedding.create(SqllTokenId.getLanguage(), 0, 0);
}
return null;
}
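To make the skip-length mistake concrete, here is a small sketch of the arithmetic as I understand it from the description above. It does not call the NetBeans API: embeddedPart is a hypothetical helper that only mimics which part of the token text would be handed to the embedded lexer, and the clamping of out-of-range values is my own assumption, not verified NetBeans behaviour.

```java
// Hypothetical demo class; it mirrors the startSkipLength/endSkipLength idea only.
class SkipLengthDemo {

    // Returns the slice of the token text left after skipping
    // startSkipLength characters at the front and endSkipLength at the back.
    static String embeddedPart(String tokenText, int startSkipLength, int endSkipLength) {
        // Clamp so oversized skip lengths produce an empty slice (assumption).
        int start = Math.min(startSkipLength, tokenText.length());
        int end = Math.max(start, tokenText.length() - endSkipLength);
        return tokenText.substring(start, end);
    }

    public static void main(String[] args) {
        String token = "[select * from t]";

        // Zeros: the whole token goes to the embedded lexer.
        System.out.println("'" + embeddedPart(token, 0, 0) + "'");

        // One character skipped on each side: the brackets are dropped.
        System.out.println("'" + embeddedPart(token, 1, 1) + "'");

        // Passing the token length as a skip length leaves nothing to lex.
        System.out.println("'" + embeddedPart(token, token.length(), 0) + "'");
    }
}
```

Whether to skip delimiters such as the brackets is a design choice: with (0, 0) the embedded lexer has to recognise them itself.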

how to implement string concatenation alike extension function with unknown arguments number using Saxon-HE?

I want to add a custom XPath extension function to the Saxon-HE transformer. This custom function should accept one or more arguments. Let's use string concatenation as an analogy: concatenating one or more string arguments. Following the sample on the Saxon page, I wrote the following code:
ExtensionFunction myconcat = new ExtensionFunction() {
public QName getName() {
return new QName("http://mycompany.com/", "myconcat");
}
public SequenceType getResultType() {
return SequenceType.makeSequenceType(
ItemType.STRING, OccurrenceIndicator.ONE
);
}
public net.sf.saxon.s9api.SequenceType[] getArgumentTypes() {
return new SequenceType[]{
SequenceType.makeSequenceType(
ItemType.STRING, OccurrenceIndicator.ONE_OR_MORE)};
}
public XdmValue call(XdmValue[] arguments) throws SaxonApiException {
//concatenate the strings here....
String result = "concatenated string";
return new XdmAtomicValue(result);
}
};
I expected that the following XPath expression would work in an XSL file:
<xsl:value-of select="myconcat('a','b','c','...')"/>
Unfortunately, I got the following exception:
XPST0017: Function myconcat must have 1 argument
What is the right way of creating a custom function for this use case?
Thanks.
The standard mechanisms for creating extension functions don't allow a variable number of arguments (it's not really pukka to have such functions in the XPath view of the world - concat() is very much an exception).
You can do it by creating your own implementation of the class FunctionLibrary and adding your FunctionLibrary to the static context of the XSLT engine - but you're deep into Saxon internals if you attempt that, so be prepared for a rough ride.

Test that either one thing holds or another in AssertJ

I am in the process of converting some tests from Hamcrest to AssertJ. In Hamcrest I use the following snippet:
assertThat(list, either(contains(Tags.SWEETS, Tags.HIGH))
.or(contains(Tags.SOUPS, Tags.RED)));
That is, the list may be either that or that. How can I express this in AssertJ? The anyOf function (of course, any is something else than either, but that would be a second question) takes a Condition; I have implemented that myself, but it feels as if this should be a common case.
Edited:
Since 3.12.0, AssertJ provides satisfiesAnyOf, which succeeds if one of the given assertions succeeds:
assertThat(list).satisfiesAnyOf(
listParam -> assertThat(listParam).contains(Tags.SWEETS, Tags.HIGH),
listParam -> assertThat(listParam).contains(Tags.SOUPS, Tags.RED)
);
Original answer:
No, this is an area where Hamcrest is better than AssertJ.
To write the following assertion:
Set<String> goodTags = newLinkedHashSet("Fine", "Good");
Set<String> badTags = newLinkedHashSet("Bad!", "Awful");
Set<String> tags = newLinkedHashSet("Fine", "Good", "Ok", "?");
// contains is statically imported from ContainsCondition
// anyOf succeeds if one of the conditions is met (logical 'or')
assertThat(tags).has(anyOf(contains(goodTags), contains(badTags)));
you need to create this Condition:
import static org.assertj.core.util.Lists.newArrayList;
import java.util.Collection;
import org.assertj.core.api.Condition;
public class ContainsCondition extends Condition<Iterable<String>> {
private Collection<String> collection;
public ContainsCondition(Iterable<String> values) {
super("contains " + values);
this.collection = newArrayList(values);
}
static ContainsCondition contains(Collection<String> set) {
return new ContainsCondition(set);
}
@Override
public boolean matches(Iterable<String> actual) {
Collection<String> values = newArrayList(actual);
for (String string : collection) {
if (!values.contains(string)) return false;
}
return true;
}
}
It might not be what you want if you expect that the presence of your tags in one collection implies they are not in the other one.
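For readers who just want to see the either/or logic without any assertion library, the same check can be sketched with java.util.function.Predicate. This is not AssertJ or Hamcrest API; the class and method names are invented for the example, and like the Condition above it tests containment, not exact ordered contents.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical demo class: library-free "either contains A or contains B".
class EitherContains {

    // Predicate that holds when the tested collection contains every expected value.
    static Predicate<Collection<String>> containsAll(Collection<String> expected) {
        return actual -> actual.containsAll(expected);
    }

    // Logical 'or' of two containment checks, mirroring anyOf(...) in the answer.
    static boolean eitherContains(Collection<String> actual,
                                  Collection<String> first,
                                  Collection<String> second) {
        return containsAll(first).or(containsAll(second)).test(actual);
    }

    public static void main(String[] args) {
        Set<String> goodTags = new LinkedHashSet<>(Arrays.asList("Fine", "Good"));
        Set<String> badTags = new LinkedHashSet<>(Arrays.asList("Bad!", "Awful"));
        Set<String> tags = new LinkedHashSet<>(Arrays.asList("Fine", "Good", "Ok", "?"));

        System.out.println(eitherContains(tags, goodTags, badTags));
        System.out.println(eitherContains(Arrays.asList("Ok"), goodTags, badTags));
    }
}
```

As with anyOf, this is an inclusive 'or': it also passes when both sets of tags are present.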
Inspired by this thread, you might want to use this little repo I put together, that adapts the Hamcrest Matcher API into AssertJ's Condition API. Also includes a handy-dandy conversion shell script.

JSR303 validator message recursive resolution?

I have written a JSR303 validator that compares property value to constraint:
@Documented
@Constraint(validatedBy = Cmp.LongCmpValidator.class)
@Target({ METHOD, FIELD, ANNOTATION_TYPE, CONSTRUCTOR, PARAMETER })
@Retention(RUNTIME)
public @interface Cmp {
String message() default "{home.lang.validator.Cmp.message}";
Class<?>[] groups() default {};
Class<? extends Payload>[] payload() default {};
long value();
public enum REL { LT,LT_EQ,EQ,GT,GT_EQ;
@Override
public String toString() {
return toString_property();
}
public String toString_property() {
switch(this) {
case LT : return "{home.lang.validator.Cmp.REL.LT}";
case LT_EQ: return "{home.lang.validator.Cmp.REL.LT_EQ}";
case EQ: return "{home.lang.validator.Cmp.REL.EQ}";
case GT : return "{home.lang.validator.Cmp.REL.GT}";
case GT_EQ: return "{home.lang.validator.Cmp.REL.GT_EQ}";
}
throw new UnsupportedOperationException();
}
public String toString_common() { return super.toString(); }
public String toString_math() { switch(this) {
case LT : return "<";
case LT_EQ: return "\u2264";
case EQ: return "=";
case GT : return ">";
case GT_EQ: return "\u2265";
}
throw new UnsupportedOperationException();
}
}
REL prop_rel_cnstr();
@Target({ METHOD, FIELD, ANNOTATION_TYPE, CONSTRUCTOR, PARAMETER })
@Retention(RUNTIME)
@Documented
@interface List {
Cmp[] value();
}
class LongCmpValidator implements ConstraintValidator<Cmp, Number> {
long cnstr_val;
REL prop_rel_cnstr;
public void initialize(Cmp constraintAnnotation) {
cnstr_val = constraintAnnotation.value();
prop_rel_cnstr = constraintAnnotation.prop_rel_cnstr();
}
public boolean isValid(Number _value, ConstraintValidatorContext context) {
if(_value == null) return true;
if(_value instanceof Integer) {
int value = _value.intValue();
switch(prop_rel_cnstr) {
case LT : return value < cnstr_val;
case LT_EQ: return value <= cnstr_val;
case EQ: return value == cnstr_val;
case GT : return value > cnstr_val;
case GT_EQ: return value >= cnstr_val;
}
}
// ... handle other types
return true;
}
}
}
ValidationMessages.properties :
home.lang.validator.Cmp.REL.LT=less than
home.lang.validator.Cmp.REL.LT_EQ=less than or equal
home.lang.validator.Cmp.REL.EQ=equal
home.lang.validator.Cmp.REL.GT=greater
home.lang.validator.Cmp.REL.GT_EQ=greater than or equal
home.lang.validator.Cmp.message=Failure: validated value is to be in relation "{prop_rel_cnstr}" to {value}.
Works fine. Almost. The validation message I get looks like this:
Failure: validated value is to be in relation "{home.lang.validator.Cmp.REL.GT}" to 0.
Would anybody please suggest an easy and convenient way to make the validator recognize and resolve the nested {home.lang.validator.Cmp.REL.GT} key? I need it to be nicely usable in JSF2, which handles validation.
I'm not using Spring, but I use hibernate-validator 4.
By the way, it looks like hibernate-validator 4 doesn't fully implement JSR303, since the latter states in section 4.3.1.1:
Message parameters are extracted from
the message string and used as keys to
search the ResourceBundle named
ValidationMessages (often materialized
as the property file
/ValidationMessages.properties and its
locale variations) using the defined
locale (see below). If a property is
found, the message parameter is
replaced with the property value in
the message string. Step 1 is applied
recursively until no replacement is
performed (i.e. a message parameter
value can itself contain a message
parameter).
OK, I dug into this. The algorithm specified by JSR303 is unintuitive about which properties are recursively resolvable and which are not. I think that's mainly due to a poor distinction, in the grammar, between annotation properties and resource bundle (RB) properties.
So I've made my own MessageInterpolator, which you can find in my repo: http://github.com/Andrey-Sisoyev/adv-msg-interpolator. It solves almost all of these problems and also lets you specify which resource bundle to search for a property.
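The idea of resolving message parameters repeatedly until nothing changes can be sketched in plain Java. This is not the linked interpolator and not the JSR303 algorithm as specified: for simplicity it looks up annotation attributes and bundle keys in one combined map (an assumption made for the example), and the class name, the sampleLookup helper and the pass limit are all invented.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: repeated {key} substitution until a fixed point.
class RecursiveMessageResolver {
    private static final Pattern PARAM = Pattern.compile("\\{([^{}]+)\\}");
    private static final int MAX_PASSES = 10; // guard against cyclic definitions

    static String resolve(String message, Map<String, String> lookup) {
        String current = message;
        for (int pass = 0; pass < MAX_PASSES; pass++) {
            Matcher m = PARAM.matcher(current);
            StringBuilder out = new StringBuilder();
            boolean replaced = false;
            while (m.find()) {
                String value = lookup.get(m.group(1));
                if (value != null) {
                    replaced = true;
                    m.appendReplacement(out, Matcher.quoteReplacement(value));
                } else {
                    // Unknown keys are left in place, braces included.
                    m.appendReplacement(out, Matcher.quoteReplacement(m.group()));
                }
            }
            m.appendTail(out);
            current = out.toString();
            if (!replaced) {
                break; // nothing changed in this pass, so we are done
            }
        }
        return current;
    }

    // Bundle entries from the question plus the annotation attribute values.
    static Map<String, String> sampleLookup() {
        return Map.of(
            "home.lang.validator.Cmp.message",
            "Failure: validated value is to be in relation \"{prop_rel_cnstr}\" to {value}.",
            "home.lang.validator.Cmp.REL.GT", "greater",
            "prop_rel_cnstr", "{home.lang.validator.Cmp.REL.GT}",
            "value", "0");
    }

    public static void main(String[] args) {
        System.out.println(resolve("{home.lang.validator.Cmp.message}", sampleLookup()));
    }
}
```

In a real setup this logic would sit behind a custom javax.validation.MessageInterpolator; how the attribute values and the locale reach it is left out here.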
