Ant XSLT 2.0 with saxon9 Loading Stylesheet very very slow [duplicate]

I have recently started working with XSLT 2.0 via Ant. I have a build file that looks like this:
<project name="TranformXml" default="TransformFile">
<target name="TransformFile">
<xslt in="input.xml"
out="student.html"
style="transform.xsl"
processor="trax" classpath="./lib/saxon/saxon9he.jar">
<factory name="net.sf.saxon.TransformerFactoryImpl"/>
</xslt>
</target>
</project>
an input document input.xml:
<student_list>
<student>
<name>George Washington</name>
<major>Politics</major>
<phone>312-123-4567</phone>
<email>gw@example.edu</email>
</student>
</student_list>
and stylesheet, transform.xsl
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:output method="html"/>
<xsl:template match="/">
<html>
<head>
<title>Student Directory</title>
</head>
<body>
<xsl:apply-templates />
</body>
</html>
</xsl:template>
</xsl:stylesheet>
and output from my ant build:
ant -f build.xml
Buildfile: /home/casey/Development/ant-tests/xslt-transform/build.xml
TransformFile:
[xslt] Processing /home/casey/Development/ant-tests/xslt-transform/input.xml to /home/casey/Development/ant-tests/xslt-transform/student.html
[xslt] Loading stylesheet /home/casey/Development/ant-tests/xslt-transform/transform.xsl
BUILD SUCCESSFUL
Total time: 9 seconds
I find it hard to believe that it should take 9 seconds to do all this. In production the stylesheets are going to be a lot more complex and the input much larger. Realistically, I'd like to keep the whole transform process to no more than a few seconds.
Any ideas?
Thanks,
Casey

What I found was killing my performance was loading the DTD definitions over the web.
I created an empty .dtd file and mapped the DTD public IDs to it using an Ant xmlcatalog, like this (inside my <xslt/> task):
<xmlcatalog>
<dtd publicid="-//W3C//DTD XHTML 1.0 Transitional//EN" location="empty.dtd"/>
</xmlcatalog>
This took build times down from 22 minutes (many documents) to 3 seconds!
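For context, here is a minimal sketch of how that catalog might sit inside the build file from the question. The empty.dtd name comes from the answer above; the nested classpath element and the file layout are illustrative assumptions, and whether the catalog is honored may depend on the Ant and Saxon versions in use.

```xml
<project name="TransformXml" default="TransformFile">
  <target name="TransformFile">
    <xslt in="input.xml" out="student.html" style="transform.xsl">
      <!-- Assumption: the Saxon jar lives where the question's build file expects it -->
      <classpath location="lib/saxon/saxon9he.jar"/>
      <factory name="net.sf.saxon.TransformerFactoryImpl"/>
      <!-- Map the public ID to a local, empty DTD so the parser does not fetch it over the network -->
      <xmlcatalog>
        <dtd publicId="-//W3C//DTD XHTML 1.0 Transitional//EN" location="empty.dtd"/>
      </xmlcatalog>
    </xslt>
  </target>
</project>
```

One caveat with an empty DTD: any named entities the real DTD would define (such as &nbsp; in XHTML) are no longer declared, so documents that use them will fail to parse. A local copy of the real DTD avoids that while still skipping the network.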

Related

Getting errors in Saxon-HE 9.9.1 when processing DITA: I/O error on DTD

Using Saxon 9.9.1.3J, I am getting an I/O error every time I try to transform a DITA file that has a DTD:
I/O error reported by XML parser processing file:/test.dita: /learningAssessment.dtd (No such file or directory)
This happens even if I force -dtd:off on the command line. Commenting out the DTD in the DITA file does allow it to process.
Interestingly, when I run the same DITA file in oXygen using Saxon-HE 9.8.0.12, it does process correctly. Any idea what might be causing this to behave differently?
Sample DITA file:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE learningAssessment PUBLIC "-//OASIS//DTD DITA Learning Assessment//EN" "learningAssessment.dtd">
<learningAssessment id="id">
<title>Title</title>
<learningAssessmentbody>
<lcInteraction>
<lcSingleSelect id="lcSingleSelect_agy_fxz_ljb">
<lcQuestion>Question</lcQuestion>
<lcAnswerOptionGroup id="lcAnswerOptionGroup_bgy_fxz_ljb">
<lcAnswerOption>
<lcAnswerContent>A</lcAnswerContent>
</lcAnswerOption>
<lcAnswerOption>
<lcAnswerContent>B</lcAnswerContent>
<lcCorrectResponse/>
</lcAnswerOption>
</lcAnswerOptionGroup>
</lcSingleSelect>
</lcInteraction>
</learningAssessmentbody>
</learningAssessment>
And here's a shell of an XSL that demonstrates the error:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
<xsl:output />
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
</xsl:stylesheet>
You can resolve the problem by the following steps:
Download DITA-OT and expand it into any folder you like. In my case it is located at D:\DITA-OT\dita-ot-3.3.4.
Set the CLASSPATH environment variable to contain saxon9he.jar and xml-resolver-1.2.jar from DITA-OT/lib.
Invoke Saxon by specifying the class name net.sf.saxon.Transform and the -catalog: parameter pointing to [DITA-OT]/catalog-dita.xml.
Hope this helps!
My guess is that you have somehow contrived to give the document a base URI of "file:/test.dita: ", including the final space. You haven't shown how you are running the transformation, so we can't tell where this base URI comes from.
The option -dtd:off is a little misleading. It doesn't switch off DTD processing, only DTD-based validation, which is just one aspect of DTD processing. An XSLT processor always needs to ask the XML parser to read the DTD in order to expand any entity references.
(Well, theoretically it could delay reading any external DTD until it finds the first entity reference; but sadly, I don't know of any XML parser that does that.)
I misunderstood how DTDs work. I assumed the public ones were loaded from an HTTP URL, but they need to be local files. Loading the catalog for DITA OT resolved the issue.
transform -s:test.dita -xsl:test.xsl -o:test.html -catalog:/org.oasis-open.dita.v1_2/plugins/org.oasis-open.dita.v1_2/catalog.xml
Here the -catalog option points to a file on my local filesystem that comes from DITA-OT.
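The DITA-OT catalog performs this mapping for the DITA public IDs. As a rough illustration of the mechanism, a stripped-down entry in OASIS XML Catalog format might look like the sketch below. The relative dtd/ path is hypothetical and does not reflect DITA-OT's actual directory layout.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Resolve the public ID from the DOCTYPE to a local copy of the DTD.
       The uri value is a placeholder; it is resolved relative to this catalog file. -->
  <public publicId="-//OASIS//DTD DITA Learning Assessment//EN"
          uri="dtd/learningAssessment.dtd"/>
</catalog>
```

With an entry like this in place, the parser looks up the public ID in the catalog and never tries to open the system identifier learningAssessment.dtd, which is what produced the "No such file or directory" error.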

Saxon can't find package

I am having difficulties using packages with Saxon 9.8. Saxon can't find the package I want to use and fails at compilation.
When using the -lib option from the command line, I get the following error message:
java.lang.NullPointerException
at net.sf.saxon.style.PackageVersion.<init>(PackageVersion.java:71)
at net.sf.saxon.trans.packages.VersionedPackageName.<init>(VersionedPackageName.java:29)
at net.sf.saxon.trans.packages.PackageInspector.getNameAndVersion(PackageInspector.java:78)
at net.sf.saxon.trans.packages.PackageInspector.getPackageDetails(PackageInspector.java:91)
at net.sf.saxon.trans.packages.PackageLibrary.<init>(PackageLibrary.java:96)
at net.sf.saxon.Transform.doTransform(Transform.java:404)
at net.sf.saxon.Transform.main(Transform.java:81)
Fatal error during transformation: java.lang.NullPointerException: (no message)
When using -lib option in oXygen 19 with the Saxon 9.8 add-on, I get the following message:
Engine name: Saxon-EE 9.8.0.4 (External)
Severity: fatal
Description: Cannot find package img_pkg (version *)
Start location: 7:52
I get exactly the same error message in oXygen when I use a configuration file to declare the package.
I'm pretty sure that there is no problem with the file path. Since in the error message I get in oXygen the package version doesn't seem to be recognized, I thought it could be a syntax problem but I can't find where it comes from.
Here is my test package:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:package name="img_pkg" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:img="https://www.ephe.fr/annuaire/colin-brisson"
exclude-result-prefixes="xs img" version="1.0">
<xsl:function name="img:test" visibility="final" as="xs:string">
<xsl:value-of select="'test ok'"/>
</xsl:function>
</xsl:package>
Here's my test stylesheet:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:img="https://www.ephe.fr/annuaire/colin-brisson" exclude-result-prefixes="xs"
version="3.0">
<xsl:use-package version="1.0" name="img_pkg"/>
<xsl:template name="xsl:initial-template">
<xsl:message>
<xsl:value-of select="img:test()"/>
</xsl:message>
</xsl:template>
</xsl:stylesheet>
Many thanks in advance!
I think that the NullPointerException from the command line is due to bug 3373 (https://saxonica.plan.io/issues/3373), although in your case the root cause is a little different from that in the bug entry: it's the absence of a package-version attribute. This is fixed in 9.8.0.4, but from the line numbers in the stack trace it looks to me as if you are using an earlier maintenance release.
The problem in oXygen is probably completely different, but it might again be related to the absence of @package-version.
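As a sketch of what that could mean for the test package, the version below adds package-version and uses the version attribute for the XSLT language version instead. The value 1.0 is an assumption carried over from the question's xsl:use-package, and this has not been verified against Saxon 9.8 or oXygen.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- package-version identifies the package release; version is the XSLT language version -->
<xsl:package name="img_pkg" package-version="1.0" version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:img="https://www.ephe.fr/annuaire/colin-brisson"
    exclude-result-prefixes="xs img">
  <xsl:function name="img:test" visibility="final" as="xs:string">
    <!-- xsl:sequence returns the string itself rather than a text node -->
    <xsl:sequence select="'test ok'"/>
  </xsl:function>
</xsl:package>
```

On the consuming side, the XSLT 3.0 attribute for selecting a package release on xsl:use-package is likewise package-version, so the test stylesheet would presumably use package-version="1.0" where it currently has version="1.0".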

xforms-inspector conflicts with <script></script> in *.xsl

After moving from Ubuntu 14.04 to 16.04, a project is no longer working. <fr:xforms-inspector /> conflicts with the <script></script> in the *.xsl file. See the code below.
( Ubuntu 16.04 / tomcat8 / Orbeon Forms 2016.3.201612302139 / firefox )
Questions
Why is the <fr:xforms-inspector /> suddenly conflicting with this tag?
Why did this not happen on 14.04? Is this a bug that should be reported, or is it an error on my side?
Is there a way to solve it?
Does it have something to do with https://doc.orbeon.com/xforms/actions/scripting.html, in that this approach is deprecated?
blubb.xhtml
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:xf="http://www.w3.org/2002/xforms"
xmlns:fr="http://orbeon.org/oxf/xml/form-runner"
>
<head>
<title>Blubb</title>
<xf:model>
<xf:instance id="instance_stylesheet" src="blubb.xsl" />
</xf:model>
</head>
<body>
<fr:xforms-inspector />
</body></html>
blubb.xsl
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" indent="no" omit-xml-declaration="yes" encoding="UTF-8"/>
<xsl:template>
<html>
<!-- Inserting the script tag results in the error.
It does not matter what's in it. -->
<script>
</script>
<head></head>
<body></body>
</html>
</xsl:template>
</xsl:stylesheet>
In the webpage Orbeon creates from these files, the inspector has no code view, and literal \n sequences appear as part of it. Other behavior is also affected at random.
This issue has been fixed since Orbeon Forms 2017.1, so if you're hitting this problem on an earlier version, I'd recommend upgrading to 2017.1.

Debug <?xml-stylesheet type="text/xsl" href="#test"?> in oXygen

I am writing test files that test functionality of an XSLT library. For this, I embed tiny XSLTs in the XML file itself so that I don't need a separate XML and XSLT file for each test. This looks somewhat like this:
<?xml-stylesheet type="text/xsl" href="#test"?>
<someXml xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<test feature="lib:someFeature(...)">
<xsl:stylesheet version="2.0" xml:id="test">
<xsl:import href="../testlib.xsl"/>
<xsl:template match="*[lib:assertRef(@label, lib:someFeature())]" mode="assert"/>
</xsl:stylesheet>
</test>
<someContent label="assert: #someId"/>
<someMoreContent xml:id="someId"/>
</someXml>
Is there a way in oXygen to debug this? Does oXygen have a way to run transformations based on the <?xml-stylesheet?> rules at all? Usually, this is not much of a problem as the referenced stylesheet can be run explicitly, but when the stylesheet is embedded, it's something different.
As confirmed by oXygen developer @RaduCoravu, this is not possible at the moment.

How can I get xslt to indent xml (from Ant)?

From what I understand, having looked around for an answer to this, the following should work:
<xslt basedir="..." destdir="..." style="xslt-stylesheet.xsd" extension=".xml"/>
Where xslt-stylesheet.xsd contains the following:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:copy-of select="."/>
</xsl:template>
</xsl:stylesheet>
Unfortunately, while most formatting is applied (spaces are stripped, newlines inserted, etc.), indentation is not, and every element sits at the left margin of the file. Is this an issue with the XSLT processor Ant uses, or am I doing something wrong? (Using Ant 1.8.2.)
It might help to set some processor-specific output options, though you should note that these may vary depending on the XSLT processor that you're using.
For example, if you're using Xalan, it defines an indent-amount property, which seems to default to 0.
To override this property at runtime, you can declare the xalan namespace in your stylesheet and set the processor-specific attribute indent-amount on your output element as follows:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xalan="http://xml.apache.org/xalan">
<xsl:output method="xml"
encoding="UTF-8"
indent="yes"
xalan:indent-amount="2"/>
<!-- templates go here -->
</xsl:stylesheet>
This example is from the Xalan usage patterns documentation at http://xml.apache.org/xalan-j/usagepatterns.html
If you do happen to be using Xalan, the documentation also says you can change all of the output preferences globally by changing the file org/apache/xml/serializer/output_xml.properties in the serializer jar.
In the interest of completeness, the complete set of Xalan-specific XML output properties defined in that file (Xalan 2.7.1) is:
{http://xml.apache.org/xalan}indent-amount=0
{http://xml.apache.org/xalan}content-handler=org.apache.xml.serializer.ToXMLStream
{http://xml.apache.org/xalan}entities=org/apache/xml/serializer/XMLEntities
If you're not using Xalan, you might have some luck looking for processor-specific output properties in the documentation for your XSLT processor.
Different XSLT processors implement indent="yes" in different ways. Some indent properly, while others only start each element on a new line. It seems that your XSLT processor is in the latter group.
Why is this so?
The reason is that the W3C XSLT Specification allows significant leeway in what indentation could be produced:
"If the indent attribute has the value yes, then the xml output
method may output whitespace in addition to the whitespace in the
result tree (possibly based on whitespace stripped from either the
source document or the stylesheet) in order to indent the result
nicely; if the indent attribute has the value no, it should not
output any additional whitespace. The default value is no. The xml
output method should use an algorithm to output additional whitespace
that ensures that the result if whitespace were to be stripped from
the output using the process described in [3.4 Whitespace Stripping]
with the set of whitespace-preserving elements consisting of just
xsl:text would be the same when additional whitespace is output as
when additional whitespace is not output.
NOTE: It is usually not safe to use indent="yes" with document types that include element types with mixed content."
Possible solutions:
Start using another XSLT processor. For example, Saxon indents quite well.
Remove the <xsl:strip-space elements="*"/> directive. If there are whitespace-only text nodes in the source XML, they would be copied to the output and this may result in a better-looking indented output.
I don't know whether Ant is at fault, but concerning your XSLT:
When you use copy-of on an element, your XSLT processor does not indent. If you change your XSLT like this, your XSLT processor may manage to indent:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
This XSLT walks the whole XML tree and copies it node by node, which gives the processor the chance to indent each element it creates.
EDIT after comment:
You can also look at the following question about changing your XSLT processor; it may solve your problem: How to execute XSLT 2.0 with ant?
You can try adding the {http://xml.apache.org/xslt}indent-amount output property in ant, something like this:
<target name="applyXsl">
<xslt in="${inputFile}" out="${outputFile}" extension=".html" style="${xslFile}" force="true">
<outputproperty name="indent" value="yes"/>
<outputproperty name="{http://xml.apache.org/xslt}indent-amount" value="4"/>
</xslt>
</target>
