Validate XML against Schematron using Saxon EE - saxon

I am evaluating Saxon EE to validate XML against XSD and Schematron.
Can someone help me with the following questions:
1. While validating an XML document against an XSD, can we also get the XPath of the error node along with the error text? Currently I only get the error message.
2. Can we validate XML against Schematron using Saxon EE? Any code sample would be a great help.
Thanks.

1. While validating xml document against xsd, can we also get xpath of that error node.
Yes, the error information includes an XPath reference to the invalid node (in most cases: there are some cases such as duplicate IDs where there isn't one specific node in error).
If you generate an XML validity report using SchemaValidator.SetValidityReporting() then the resulting report will include the path information. Here's an example:
<?xml version="1.0" encoding="UTF-8"?>
<validation-report xmlns="http://saxon.sf.net/ns/validation"
                   system-id="file:/Users/mike/repo2/samples/data/books-invalid.xml">
   <error line="3"
          column="17"
          path="/Q{}BOOKLIST[1]/Q{}BOOKS[1]/@x"
          xsd-part="1"
          constraint="cvc-complex-type.3">Attribute @x is not allowed on element &lt;BOOKS&gt;</error>
   <error line="10"
          column="17"
          path="/Q{}BOOKLIST[1]/Q{}BOOKS[1]/Q{}ITEM[1]/Q{}PRICE[1]"
          xsd-part="2"
          constraint="cvc-datatype-valid.1">The content "$0.2" of element &lt;PRICE&gt; does not match the required simple type. Cannot convert string to decimal: $0.2</error>
   <error line="21"
          column="20"
          path="/Q{}BOOKLIST[1]/Q{}BOOKS[1]/Q{}ITEM[2]/Q{}PUB-DATE[1]"
          xsd-part="2"
          constraint="cvc-datatype-valid.1">The content "2002-02-31" of element &lt;PUB-DATE&gt; does not match the required simple type. Invalid date "2002-02-31" (Non-existent date)</error>
   <error line="42"
          column="22"
          path="/Q{}BOOKLIST[1]/Q{}BOOKS[1]/Q{}ITEM[3]/Q{}REPUTATION[1]"
          xsd-part="1"
          constraint="cvc-complex-type.2.4">In content of element &lt;ITEM&gt;: The content model does not allow element &lt;REPUTATION&gt; to appear immediately after element &lt;WEIGHT&gt;. No further elements are allowed at this point. </error>
   <meta-data>
      <validator name="SAXON-EE" version="9.8.0.9"/>
      <results errors="4" warnings="0"/>
      <schema file="books.xsd" xsd-version="1.1"/>
      <run at="2018-03-07T15:22:04.847Z"/>
   </meta-data>
</validation-report>
You can also get the information if you supply an IInvalidityHandler as a callback to the SchemaValidator, though this requires a bit more digging. Saxon calls your IInvalidityHandler supplying a StaticError object (which is a bit of a misnomer). The StaticError object doesn't have the path information directly available, but it contains a reference to an XPathException object which can be cast to a ValidationException, and ValidationException has a method getPath() which returns this information if available.
2. Can we validate xml against schematron.
Saxon doesn't include a schematron validator per se, though many of the third-party tools that do schematron validation make use of Saxon "under the hood". I'm not up-to-date with the situation on .NET - but essentially there are two kinds of Schematron processor: those that generate XSLT code from the schematron schema (which typically use Saxon both to generate the XSLT and to execute it), and "native" processors. Searching for "schematron on .NET" gives you quite a number of projects, but I have no idea of their current status or quality.
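To make the first kind of processor more concrete, here is a minimal Java sketch of its second stage only: running an XSLT stylesheet of checks over the instance document and collecting the failures. The stylesheet below is hand-written and merely stands in for what a Schematron-to-XSLT compiler might generate from a rule such as "every ITEM must have a positive PRICE"; the class name, the rule and the report wording are all assumptions made for illustration, and the JDK's built-in XSLT 1.0 processor is used so that the example runs without Saxon on the classpath.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class SchematronStyleCheck {

    // Hypothetical, hand-written stand-in for generated XSLT.
    // It encodes one rule: context ITEM, assert PRICE > 0.
    static final String CHECKS_XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='ITEM'>"
        +   "<xsl:if test='not(PRICE &gt; 0)'>"
        +     "<xsl:text>failed-assert: ITEM </xsl:text>"
        +     "<xsl:value-of select='count(preceding-sibling::ITEM) + 1'/>"
        +     "<xsl:text> must have a positive PRICE&#10;</xsl:text>"
        +   "</xsl:if>"
        + "</xsl:template>"
        // Suppress the default copying of text nodes so only failures are printed.
        + "<xsl:template match='text()'/>"
        + "</xsl:stylesheet>";

    /** Runs the checks over the given XML and returns one line per failed assertion. */
    static String validate(String xml) {
        try {
            Transformer checks = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(CHECKS_XSLT)));
            StringWriter report = new StringWriter();
            checks.transform(new StreamSource(new StringReader(xml)), new StreamResult(report));
            return report.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not run the checks", e);
        }
    }

    public static void main(String[] args) {
        String xml =
              "<BOOKLIST><BOOKS>"
            + "<ITEM><PRICE>12.50</PRICE></ITEM>"
            + "<ITEM><PRICE>$0.2</PRICE></ITEM>"
            + "</BOOKS></BOOKLIST>";
        String report = validate(xml);
        System.out.print(report.isEmpty() ? "valid\n" : report);
    }
}
```

A real pipeline would first produce the stylesheet from the Schematron schema and would normally emit a structured report rather than plain text; this sketch only shows why an XSLT processor is enough to evaluate the assertions.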

Related

How to use EOBI simple binary encoding

I am trying to use the sbeTool with the Eurex codecs:
JAVA -Dsbe.target.namespace=eobiV81 -classpath "../sbe-tool-1.7.0.jar;../agrona-0.9.6.jar" uk.co.real_logic.sbe.SbeTool eobi81.xml
This eobi.xml file looks slightly different from the sample provided, and the tool execution fails (it succeeds on the car.xml example file):
Exception in thread "main" java.lang.NullPointerException
at uk.co.real_logic.sbe.xml.XmlSchemaParser.getAttributeValue(XmlSchemaParser.java:221)
at uk.co.real_logic.sbe.xml.MessageSchema.<init>(MessageSchema.java:47)
at uk.co.real_logic.sbe.xml.XmlSchemaParser.parse(XmlSchemaParser.java:105)
at uk.co.real_logic.sbe.SbeTool.parseSchema(SbeTool.java:274)
at uk.co.real_logic.sbe.SbeTool.main(SbeTool.java:199)
Can anyone help me find a way to get the xml compiling? I believe maybe the eobi.xsd file should be useful, but not sure how.
Thanks
Eurex EOBI is not compliant with the SBE 1.0 standard. You can see this if schema validation is turned on for the SBE Tool. To parse Eurex messages you will need a different codec.

How to solve Checkstyle conflict between @link JavaDoc on one line and line length?

Consider the following JavaDoc:
/**
 * Test method for
 * {@link MySelectionStyleConfiguration#configureSelectionStyle(org.eclipse.nebula.widgets.nattable.config.IConfigRegistry)}.
 *
 */
Whenever I save the JUnit 5 test class this comment belongs to, the {@link } is reformatted onto one line. That is correct, because the Maven Checkstyle plugin throws an error if I introduce a line break in the link: (javadoc) SingleLineJavadoc: Javadoc comment at column 78 has parse error. Details: mismatched input '\n' expecting MEMBER while parsing REFERENCE. (I also reckon the link wouldn't resolve correctly in the API docs rendered from this.)
However, the Maven Checkstyle plugin also throws an error if I leave the long link line as is: (sizes) LineLength: Line is longer than 100 characters (found 125).
Is there a way to resolve this?
Requirements:
The 100 char line rule should remain in place, but long {@link } tags in JavaDoc (most obviously in test classes, but perhaps also elsewhere) should be exempted.
The link should still resolve in the rendered JavaDoc (i.e., it should remain valid).
Edit
I can leave out the package names from the link, but I'd like to keep them in there so that they resolve to one thing only and not potentially to something that is named equally.
This can be done with the "ignorePattern" property of the LineLength check: https://checkstyle.sourceforge.io/config_sizes.html#LineLength
In your case, something like
<module name="LineLength">
    <property name="max" value="100"/>
    <property name="ignorePattern" value="^ \* \{@link .*$"/>
</module>
should solve the problem.
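Before committing the configuration, the pattern can be tried out in isolation. The sketch below is not Checkstyle code: it assumes that the check skips a line when the pattern is found anywhere in it, that the Javadoc tag is written as {@link, and that the comment line starts with a single space before the asterisk. The class and method names are made up for the example.

```java
import java.util.regex.Pattern;

public class IgnorePatternCheck {

    // Same regular expression as in the ignorePattern property, escaped for a Java string.
    static final Pattern IGNORE = Pattern.compile("^ \\* \\{@link .*$");

    // Same limit as the "max" property.
    static final int MAX = 100;

    /** Assumed behaviour: a line is reported only if it is too long and not matched by the pattern. */
    static boolean wouldBeFlagged(String line) {
        return line.length() > MAX && !IGNORE.matcher(line).find();
    }

    public static void main(String[] args) {
        String linkLine = " * {@link MySelectionStyleConfiguration#configureSelectionStyle("
            + "org.eclipse.nebula.widgets.nattable.config.IConfigRegistry)}.";
        System.out.println("length of link line: " + linkLine.length());
        System.out.println("link line flagged:   " + wouldBeFlagged(linkLine));
    }
}
```

If the Javadoc in your code base is indented (for example inside a nested class), the leading `^ \*` part of the pattern would need to allow for that indentation.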

java.lang.RuntimeException: Unrecognized XSLTC extension 'http://saxon.sf.net/:assign'

I keep getting java.lang.RuntimeException: Unrecognized XSLTC extension 'http://saxon.sf.net/:assign' when I run my xsl code through java.
The xsl declaration is as follows:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:redirect="http://xml.apache.org/xalan/redirect"
extension-element-prefixes="saxon redirect" version="2.0"
xmlns:saxon="http://saxon.sf.net/"
exclude-result-prefixes="saxon">
I am trying to use Saxon to increment a variable every time the flow enters a for-each loop.
The error message saying Unrecognized XSLTC extension suggests that you are running Xalan XSLTC and not Saxon 9 to execute the stylesheet. There are different XSLT processors you can use with Java but for one stylesheet you can only use one at a time, so you will need to decide whether you want to use Saxon or Xalan and then you can use the extensions supported by the selected processor but not extensions supported by different processors.
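A quick way to see which processor JAXP actually hands you is to print the class of the factory. The sketch below also shows one way to ask for Saxon explicitly by its factory class name, net.sf.saxon.TransformerFactoryImpl, which only works when a Saxon jar is on the classpath. The class and method names are made up for the example, and the fallback to the default factory is there only so the example runs without Saxon; with the fallback in effect, Saxon extensions will still not be recognised.

```java
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;

public class WhichXsltProcessor {

    // JAXP factory class shipped with Saxon; loadable only if Saxon is on the classpath.
    static final String SAXON_FACTORY = "net.sf.saxon.TransformerFactoryImpl";

    /** Returns Saxon's factory when available, otherwise whatever JAXP selects by default. */
    static TransformerFactory preferSaxon() {
        try {
            return TransformerFactory.newInstance(SAXON_FACTORY, null);
        } catch (TransformerFactoryConfigurationError e) {
            // Saxon not found: fall back so the caller can at least report what is in use.
            return TransformerFactory.newInstance();
        }
    }

    public static void main(String[] args) {
        System.out.println("JAXP default: "
            + TransformerFactory.newInstance().getClass().getName());
        System.out.println("Selected:     "
            + preferSaxon().getClass().getName());
    }
}
```

If the second line printed still names an XSLTC or Xalan class, the stylesheet is not being run by Saxon, which would explain the "Unrecognized XSLTC extension" message.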

Custom Error Handler in WSO2 Mediation

In my in-sequence mediator, I need to run some logic on the input values and, based on the result, decide whether to call the web service or return a fault. I have defined the sequence as follows:
<sequence xmlns="http://ws.apache.org/ns/synapse" name="m1">
<class name="com.myclass">
</class>
<makefault version="soap11">
<code xmlns:soap11Env="http://schemas.xmlsoap.org/soap/envelope/" value="soap11Env:Client"/>
<reason value="ERROR_MESSAGE"/>
<role>Acc</role>
<detail>Test Details</detail>
</makefault>
<log/>
</sequence>
The problem is that, by default, the fault information is always returned. How do I achieve the following?
1. In case a custom exception is thrown in the mediator, a SOAP fault is returned to the web service client.
2. In case all the information is correct, the web service is called properly and the client gets the proper response.
You need to define a separate sequence to handle the faults. Then, in your InSequence, you need to set that fault sequence to the "onError" attribute. So your InSequence will look like
<sequence xmlns="http://ws.apache.org/ns/synapse" name="m1" onError="yourFaultSequence">
<class name="com.myclass">
</class>
<log/>
<send/>
</sequence>
Above configuration was added to give an idea. Note the onError attribute.
Following sample will also help.

Handling Invalid XML with Nokogiri::XML::Reader

I have found the Nokogiri XML reader to be strict about XML syntax: if it encounters an invalid character in the XML, such as an unescaped ampersand (e.g. <tag> Garage & Driveway </tag>), it throws an error.
So when I use the reader as follows:
Nokogiri::XML::Reader(infile).each do |node|
# does stuff with node
end
it throws the error:
Entity: line 1056614: parser error : xmlParseEntityRef: no name
<tag>The & is invalid</tag>
^
transmogrifier/gems/nokogiri-1.5.5/lib/nokogiri/xml/reader.rb:106:in `each'
With XML such as this:
<root>
  <items>
    <tag>The & is invalid</tag>
  </items>
  <items> ... </items>
</root>
Midway through parsing a large document. I've noticed Nokogiri::XML::Parser handles this (more) gracefully, and removes all invalid characters, which gives me hope for a more graceful solution.
Ideally, I would love to be able to catch the error and continue with the each parsing (as very few items have invalid characters). Any suggestions on how to handle this gracefully?
I've noticed you can pass in ParseOptions, but haven't had any luck using those.
Thanks in advance!
Switching from Nokogiri::XML to Nokogiri::HTML, which is much more forgiving of XML errors, will probably help.
