How can I get xslt to indent xml (from Ant)? - ant

From what I understand, having looked around for an answer to this, the following should work:
<xslt basedir="..." destdir="..." style="xslt-stylesheet.xsd" extension=".xml"/>
Where xslt-stylesheet.xsd contains the following:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:copy-of select="."/>
</xsl:template>
</xsl:stylesheet>
Unfortunately while most formatting is applied (spaces are stripped, newlines entered, etc.), indentation is not and every element is along the left side in the file. Is this an issue with the xslt processor Ant uses, or am I doing something wrong? (Using Ant 1.8.2).

It might help to set some processor-specific output options, though you should note that these may vary depending on the XSLT processor that you're using.
For example, if you're using Xalan, it defines an indent-amount property, which seems to default to 0.
To override this property at runtime, you can declare the xalan namespace in your stylesheet and set the processor-specific attribute indent-amount on your output element as follows:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xalan="http://xml.apache.org/xalan">
<xsl:output method="xml"
encoding="UTF-8"
indent="yes"
xalan:indent-amount="2"/>
This example is from the Xalan usage patterns documentation at http://xml.apache.org/xalan-j/usagepatterns.html
If you do happen to be using Xalan, the documentation also says you can change all of the output preferences globally by editing the file org/apache/xml/serializer/output_xml.properties in the serializer jar.
In the interest of completeness, the complete set of Xalan-specific XML output properties defined in that file (Xalan 2.7.1) is:
{http://xml.apache.org/xalan}indent-amount=0
{http://xml.apache.org/xalan}content-handler=org.apache.xml.serializer.ToXMLStream
{http://xml.apache.org/xalan}entities=org/apache/xml/serializer/XMLEntities
If you're not using Xalan, you might have some luck looking for processor-specific output properties in the documentation for your XSLT processor.

Different XSLT processors implement indent="yes" in different ways. Some indent properly, while others only start each element on a new line. It seems that your XSLT processor is in the latter group.
Why is this so?
The reason is that the W3C XSLT Specification allows significant leeway in what indentation could be produced:
"If the indent attribute has the value yes, then the xml output
method may output whitespace in addition to the whitespace in the
result tree (possibly based on whitespace stripped from either the
source document or the stylesheet) in order to indent the result
nicely; if the indent attribute has the value no, it should not
output any additional whitespace. The default value is no. The xml
output method should use an algorithm to output additional whitespace
that ensures that the result if whitespace were to be stripped from
the output using the process described in [3.4 Whitespace Stripping]
with the set of whitespace-preserving elements consisting of just
xsl:text would be the same when additional whitespace is output as
when additional whitespace is not output.
NOTE:It is usually not safe to use indent="yes" with document types that include element types with mixed content."
Possible solutions:
Start using another XSLT processor. For example, Saxon indents quite well.
Remove the <xsl:strip-space elements="*"/> directive. If there are whitespace-only text nodes in the source XML, they would be copied to the output and this may result in a better-looking indented output.

I don't know whether Ant is the problem, but concerning your XSLT:
When you use copy-of on an element, your XSLT processor does not indent. If you change your XSLT like this, your XSLT processor may manage to indent:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
This XSLT walks the whole XML tree and lets the processor indent each element it creates.
EDIT after comment :
To change your XSLT processor, see the following question; it may solve your problem: How to execute XSLT 2.0 with ant?

You can try adding the {http://xml.apache.org/xslt}indent-amount output property in ant, something like this:
<target name="applyXsl">
<xslt in="${inputFile}" out="${outputFile}" extension=".html" style="${xslFile}" force="true">
<outputproperty name="indent" value="yes"/>
<outputproperty name="{http://xml.apache.org/xslt}indent-amount" value="4"/>
</xslt>
</target>
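For anyone calling the transformation from Java rather than from Ant, a rough JAXP counterpart of the two outputproperty elements might look like the sketch below. It assumes the JDK's built-in (Xalan-derived) processor, which is where the {http://xml.apache.org/xslt}indent-amount key is recognized; other processors may ignore or reject it. The class and method names are made up for illustration, and the identity transformer stands in for a real stylesheet.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: re-serializes XML with a given indent width.
public class IndentDemo {

    public static String indent(String xml, int amount) {
        try {
            // Identity transformer: copies the input to the output unchanged.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            // Same as <outputproperty name="indent" value="yes"/>.
            t.setOutputProperty(OutputKeys.INDENT, "yes");
            // Processor-specific key (assumption: JDK built-in or Xalan processor).
            t.setOutputProperty("{http://xml.apache.org/xslt}indent-amount",
                    Integer.toString(amount));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(indent("<a><b>x</b></a>", 4));
    }
}
```

If the nested element still starts in column one, the processor in use probably does not understand this key, which is the same symptom described in the question.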

Related

XSLT pipeline : Error XPDY0002 - The context item for axis step fn:root(...)/element() is absent

Please, I need some help dealing with the Saxon API :)
I created a pipeline of two XsltTransformer instances built from the same XSLT, and when I run the transformation I get this error:
2019-01-24 11:32:15,673 [pool-2-thread-1] INFO e.s.e.x.XsltListener - file
2019-01-24 11:32:15,674 [pool-2-thread-1] INFO e.s.e.x.XsltListener - Error
XPDY0002 while evaluating xsl:message content: The context item for axis
step fn:root(...)/element() is absent
Here is my XSLT:
<xsl:stylesheet exclude-result-prefixes="#all" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:variable name="supp" as="xs:string" select="root()/*/name()"/>
<xsl:template match="/">
<xsl:message select="$supp"/>
<file/>
</xsl:template>
</xsl:stylesheet>
The first XsltTransformer works fine, but it seems that I have no context node while the second XsltTransformer is running.
I use :
transformer1.setSource(source) : source is a SAXSource
transformer1.setDestination(transformer2)
transformer2.setDestination(serialiser)
According to the documentation (XsltTransformer.setInitialContextNode):
This value is ignored in the case where the XsltTransformer is used as the Destination of another process. In that case the initial context node will always be the document node of the document that is being streamed to this destination.
Thanks for your Help
In general in XSLT 3 you need to distinguish between the initial match selection https://www.w3.org/TR/xslt-30/#dt-initial-match-selection which is used to decide which template to apply first and the global context item https://www.w3.org/TR/xslt-30/#dt-global-context-item that is used to evaluate global parameters and variables. I think you seem to expect that in your second stylesheet the result of your first acts as both but it seems, at least in your setup, Saxon does not assume that but only sets your initial match selection to the result of the first stylesheet. So try moving the <xsl:variable name="supp" as="xs:string" select="root()/*/name()"/> into the template e.g.
<xsl:template match="/">
<xsl:variable name="supp" as="xs:string" select="root()/*/name()"/>
<xsl:message select="$supp"/>
<file/>
</xsl:template>
I am not sure there is another way. At least when chaining two streaming transformations, you can't expect the second stylesheet to have access to the whole result tree of the first in order to evaluate global parameters or variables.

XSLT: How to access elements of type both namespace and without namespaces within same article

Please suggest how to access the elements that are not in any namespace. My code is able to access and alter the elements that are in a namespace. I am using XSLT 2.0. Here is my XML (the DTD path is mapped to a local path; please also suggest how to access the XML without the DTD).
InPut XML:
<!DOCTYPE article PUBLIC "-//ES//DTD journal article DTD version 5.2.0//EN//XML" "D:/DTDs/Els-parser/art520.dtd">
<article>
<fm>
<ce:title>The title</ce:title>
<ce:author-group>
<ce:author><ce:surname>Rudramuni</ce:surname><ce:given-names>TP</ce:given-names></ce:author>
</ce:author-group>
</fm>
<body>
<ce:sections>
<ce:section>
<ce:section-title>The first Head</ce:section-title>
<ce:para>Tha first para</ce:para>
</ce:section>
</ce:sections>
</body>
<back>
<ref><ce:author>Vijay</ce:author></ref>
</back>
</article>
XSLT:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ce="http://www.elsevier.com/xml/common/dtd"
xmlns:sb="http://www.elsevier.com/xml/common/struct-bib/dtd"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:mml="http://www.w3.org/1998/Math/MathML"
version='2.0'>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="fm">
<xsl:element name="ce:front"><xsl:apply-templates/></xsl:element>
</xsl:template>
<xsl:template match="ce:author">
<xsl:element name="name"><xsl:apply-templates/></xsl:element>
</xsl:template>
</xsl:stylesheet>
Required Result:
<?xml version="1.0" encoding="UTF-8"?>
<article>
<ce:front>
<ce:title>The title</ce:title>
<ce:author-group><name><ce:surname>Rudramuni</ce:surname><ce:given-names>TP</ce:given-names></name></ce:author-group>
</ce:front>
<body><ce:sections><ce:section><ce:section-title>The first Head</ce:section-title><ce:para>Tha first para</ce:para></ce:section></ce:sections></body>
<back>
<ref><name>Vijay</name></ref>
</back>
</article>
However, I am getting extra namespace declarations such as xmlns="http://www.elsevier.com/xml/ja/dtd" and xmlns="", and extra attributes such as view="all" appear on some elements. Thanks in advance. Please suggest.
You seem to ask a lot of questions on XSLT 2.0 recently, which is ok of course, but please kindly consider these guidelines when asking questions:
Use valid examples that we can reproduce your problem with. Your examples, in this and other questions, are not well-formed (missing namespaces in this question for instance), which makes it hard to help you (and they will downvote or close your question, see How to ask).
Tone down the examples to the bare minimum, otherwise people will not run to help you as it is too hard to understand the question asked. In this case, a two-line XML and XSLT would explain your issue.
Run the code you paste here and copy the output (or errors) you get. In this question for instance, the output is not namespace-well-formed, which cannot possibly be the output of any XSLT processor.
To answer your question, you can use a wild-card NameTest, which selects that name in any namespace:
select="*:something"
select="*:foo/*:bar"
select="*:foo[contains(., @*:some-attr)]"
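The *:name wildcard is XPath 2.0 syntax. Where only XPath 1.0 is available (for example the XPath engine bundled with the JDK), a comparable effect can be had by testing local-name(). The sketch below is only an illustration of that idea: the class name, the sample document, and the urn:example:ce namespace are invented, not taken from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: counts elements by local name, whatever their namespace.
public class NameTestDemo {

    public static int countByLocalName(String xml, String localName) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true); // needed so local-name() drops the prefix
            Document doc = f.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xp = XPathFactory.newInstance().newXPath();
            // Bind $n to the requested name instead of concatenating strings.
            xp.setXPathVariableResolver(name -> localName);
            Double n = (Double) xp.evaluate(
                    "count(//*[local-name() = $n])", doc, XPathConstants.NUMBER);
            return n.intValue();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<article xmlns:ce='urn:example:ce'>"
                + "<ce:author>A</ce:author><ref><author>B</author></ref></article>";
        System.out.println(countByLocalName(xml, "author"));
    }
}
```

Both the prefixed ce:author and the unprefixed author are counted, which is the same selection *:author would make in XPath 2.0.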

Can I apply a character map to a given node?

If I look at the XSLT specs, it seems a character map applies to the whole document, but is it also possible to use it on a given node, or within a template?
Example: I have a node containing lookup values, but they might contain characters that don't play well with regular expressions when used in another template. For now I use the replace function, which works well, but after a few characters that becomes pretty hard to read or maintain. So if I have something like this:
<xsl:variable name="myLookup" select="
replace(
replace(
replace(
replace(
string-join(/*/lookup/*, '|'),
'\[','\\['),
'\]','\\]'),
'\(','\\('),
'\)','\\)')
"/>
is there a way to achieve something like below fictitious example ?
<xsl:character-map name="escapechar">
<xsl:output-character character="[" string="\[" />
<xsl:output-character character="]" string="\]" />
<xsl:output-character character="(" string="\(" />
<xsl:output-character character=")" string="\)" />
</xsl:character-map>
<xsl:variable name="myLookup" select="string-join(/*/lookup/*, '|')" use-character-map="escapechar"/>
I know this is not working at all, it is just to make my request a bit visual.
Any idea ?
I think character maps in XSLT 2.0 are a serialization feature to be applied when a result tree is serialized to a file or stream so I don't see how you could apply one to a certain string or certain node during a transformation.
As for escaping meta characters of regular expression patterns, maybe http://www.xsltfunctions.com/xsl/functx_escape-for-regex.html helps.
Character maps are only a serialization feature, which means that they are applied only when the final output of a transformation is produced. However, you can significantly simplify your current code.
Just use:
replace($pStr, '(\[|\]|\(|\))','\\$1')
Here is a complete example:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:my="my:my">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/*">
<xsl:value-of select="my:escape(.)"/>
</xsl:template>
<xsl:function name="my:escape" as="xs:string">
<xsl:param name="pStr" as="xs:string"/>
<xsl:value-of select="replace($pStr, '(\[|\]|\(|\))','\\$1')"/>
</xsl:function>
</xsl:stylesheet>
When this transformation is applied on the following XML document:
<t>([a-z]*)</t>
the wanted, correct result is produced:
\(\[a-z\]*\)
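As a cross-check outside XSLT, the same pattern and replacement string can be tried in Java. This is only a sketch: Java's regex dialect is not identical to the XPath 2.0 one, but this particular pattern uses constructs common to both, and the class and method names are invented for the example.

```java
// Hypothetical helper mirroring my:escape from the stylesheet above.
public class EscapeDemo {

    // Prefixes each of [ ] ( ) with a backslash.
    public static String escape(String s) {
        // Regex: (\[|\]|\(|\))   Replacement: \\$1 (a literal backslash, then group 1)
        return s.replaceAll("(\\[|\\]|\\(|\\))", "\\\\$1");
    }

    public static void main(String[] args) {
        System.out.println(escape("([a-z]*)"));
    }
}
```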

XPath syntax error thrown when using evaluate() method

I am still getting the error below:
Error at xsl:param on line 6 of file:/E:/saxon/parastyleText.xsl:
XPST0003: XPath syntax error at char 0 on line 6 in {...le/@w:val[matches(., c
oncat...}:
Invalid character '^' in expression
Failed to compile stylesheet. 1 error detected.
Modified XSL:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ve="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml">
<xsl:param name="styleName" select="'articletitle'"/>
<xsl:param name="tagName" select="'//w:p[w:pPr/w:pStyle/@w:val[matches(., concat('^(',$styleName,')$'),'i')]]'"/>
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:value-of select="saxon:evaluate($tagName)" xmlns:saxon="http://saxon.sf.net/"/><xsl:text>
</xsl:text>
</xsl:template>
</xsl:stylesheet>
Please don't reply that the quotes make 'tagName' a string and that I should remove them. This value will actually be passed from Java as a string; that is why, for testing purposes, I have passed this XPath as a string.
According to the online documentation http://www.saxonica.com/documentation9.1/extensions/functions.html Saxon 9.1 supports an evaluate function in the Saxon namespace http://saxon.sf.net/. So with Saxon 9.1 try <xsl:value-of select="saxon:evaluate($tagName)" xmlns:saxon="http://saxon.sf.net/"/>. Of course you can move the namespace declaration up to the xsl:stylesheet element if you want, I just put it on the xsl:value-of in this post for a short but complete sample of code.
Also note that with your variable named tagName it is likely that you simply have a single element name, in that case it might suffice to use <xsl:value-of select="*[local-name() eq $tagName]"/>.

xslt transform that selects element based upon negative value in multiple descendants

I have the following basic XML, which I want to transform so that it gives the NAME only if none of the DB values is DB1.
<rnas>
<rna ID="1">
<NAME>Segment 6</NAME>
<XREF>
<ID>AF389120</ID>
<DB>DB1</DB>
</XREF>
<XREF>
<ID>ABCDE</ID>
<DB>DB2</DB>
</XREF>
</rna>
<rna ID="10">
<NAME>Segment 3</NAME>
<XREF>
<ID>12345</ID>
<DB>DB2</DB>
</XREF>
<XREF>
<ID>66789</ID>
<DB>DB3</DB>
</XREF>
</rna>
</rnas>
The expected output would be:
<rnas>
<rna ID="10">
<NAME>Segment 3</NAME>
</rna>
</rnas>
I am still a relative newbie and have tried a variety of approaches using XSLT 2.0 but so far have not been able to get anything to work properly. Any help would be much appreciated.
This will do what you want:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="rna[.//DB/text()='DB1']"/>
<xsl:template match="XREF"/>
</xsl:stylesheet>
It's an Identity Transform along with two empty templates. The first matches any rna that contains a DB with the text value DB1, and suppresses it. The second matches all XREF elements, which you do not want to output.
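One way to check the stylesheet is to run it through JAXP against a trimmed version of the sample input. The following is only a test-harness sketch: the class name and the whitespace-free input are assumptions made for brevity, and it relies on the JDK's built-in XSLT 1.0 processor.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical harness for the identity transform with two empty templates.
public class RnaFilterDemo {

    static final String XSLT =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
        // Identity template: copies every node and attribute by default.
        + "<xsl:template match='@*|node()'><xsl:copy>"
        + "<xsl:apply-templates select='@*|node()'/></xsl:copy></xsl:template>"
        // Suppress any rna that has a DB descendant equal to DB1.
        + "<xsl:template match=\"rna[.//DB/text()='DB1']\"/>"
        // Suppress all XREF elements in the rna elements that remain.
        + "<xsl:template match='XREF'/>"
        + "</xsl:stylesheet>";

    static final String INPUT =
        "<rnas>"
        + "<rna ID='1'><NAME>Segment 6</NAME>"
        + "<XREF><ID>AF389120</ID><DB>DB1</DB></XREF>"
        + "<XREF><ID>ABCDE</ID><DB>DB2</DB></XREF></rna>"
        + "<rna ID='10'><NAME>Segment 3</NAME>"
        + "<XREF><ID>12345</ID><DB>DB2</DB></XREF>"
        + "<XREF><ID>66789</ID><DB>DB3</DB></XREF></rna>"
        + "</rnas>";

    public static String run() {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(INPUT)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Only the rna with ID 10 should survive, reduced to its NAME child, because the predicate template has a higher default priority than the identity template.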
