Copy xml elements without namespace - xslt-2.0

I have an XML file that contains HTML elements. I want to copy them without the namespaces being copied.
<clonkDoc xmlns="https://clonkspot.org"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="https://clonkspot.org clonk.xsd" xml:lang="de">
<doc>
foo <br/> bar
</doc>
</clonkDoc>
and this XSL (truncated):
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0" xpath-default-namespace="https://clonkspot.org" exclude-result-prefixes="xs">
<xsl:output method="html" encoding="ISO-8859-1" doctype-public="-//W3C//DTD HTML 4.01//EN" doctype-system="http://www.w3.org/TR/html4/strict.dtd"/>
<xsl:template match="img|a|em|strong|br|code/i|code/b">
<xsl:copy copy-namespaces="no">
<!-- including every attribute -->
<xsl:for-each select="@*|node()">
<xsl:copy copy-namespaces="no"/>
</xsl:for-each>
</xsl:copy>
</xsl:template>
...
I get something like this:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head></head>
<body>
foo <br xmlns="https://clonkspot.org"></br> bar
</body>
</html>
I already set copy-namespaces="no" (XSLT to copy element without namespace). I think the XSLT processor should see that the element I want to copy to an HTML4 file is an HTML element. What am I doing wrong?
Thanks!

In every version of XSLT, xsl:copy makes a shallow copy of the context node. For an element (or any other node with a qualified name, such as an attribute), that means a copy with the same name and the same namespace. The copy-namespaces="no" attribute introduced in XSLT 2.0 only prevents copying in-scope namespace declarations that are not used by the element itself.
So in your case, since you want to strip the existing namespace from the elements, you need to transform them with a template that creates new elements, e.g.
<xsl:template match="img|a|em|strong|br|code/i|code/b">
<xsl:element name="{local-name()}">...</xsl:element>
</xsl:template>

Thank you. Based on your solution I solved it with:
<!-- copy img, a, em and br literally -->
<xsl:template match="img|a|em|strong|br|code/i|code/b">
<xsl:element name="{local-name()}">
<!-- including every attribute -->
<xsl:for-each select="@*">
<xsl:attribute name="{local-name()}"><xsl:value-of select="."/></xsl:attribute>
</xsl:for-each>
<xsl:for-each select="node()">
<xsl:apply-templates select="."/>
</xsl:for-each>
</xsl:element>
</xsl:template>
That should work recursively and carry the attributes along.
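A more compact variant of the same idea, as a sketch only: the two xsl:for-each loops can be replaced by a single xsl:apply-templates plus a separate attribute template. The reduced xsl:output declaration is an assumption made to keep the example short. Note that local-name() also drops the prefix of namespaced attributes such as xml:lang, just as in the version above.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0"
    xpath-default-namespace="https://clonkspot.org">

  <xsl:output method="html" encoding="ISO-8859-1"/>

  <!-- Recreate each matched element under its local name, i.e. in no namespace. -->
  <xsl:template match="img|a|em|strong|br|code/i|code/b">
    <xsl:element name="{local-name()}">
      <!-- Attributes first, then child nodes; both go through template rules. -->
      <xsl:apply-templates select="@*, node()"/>
    </xsl:element>
  </xsl:template>

  <!-- Recreate every attribute under its local name. -->
  <xsl:template match="@*">
    <xsl:attribute name="{local-name()}" select="."/>
  </xsl:template>

</xsl:stylesheet>
```

Nested elements such as an em inside an a are handled because the children are processed by the same template rules.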

Related

XSLT 3.0 Streaming (Saxon) facing error "There is more than one consuming operand" when I use two different string functions within same template

Here is my sample input xml
<?xml version="1.0" encoding="UTF-8"?>
<Update xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Request>
<List>
<RequestP><ManNumber>3B4</ManNumber></RequestP>
<RequestP><ManNumber>8T7_BE</ManNumber></RequestP>
<RequestP><ManNumber>3B5</ManNumber></RequestP>
<RequestP><ManNumber>5E9_BE</ManNumber></RequestP>
<RequestP><ManNumber>9X6</ManNumber></RequestP>
</List>
</Request>
</Update>
and xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0" exclude-result-prefixes="#all">
<xsl:output method="xml" omit-xml-declaration="no" indent="yes" />
<xsl:mode streamable="yes" />
<xsl:template match="List/RequestP/ManNumber">
<ManNumber>
<xsl:value-of select="replace(.,'_BE','')" />
</ManNumber>
<xsl:if test="contains(.,'_BE')">
<ManDescrip>BE</ManDescrip>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
I am getting the error below for the above XSLT. I am using Saxon 11.2.
Template rule is not streamable
* There is more than one consuming operand: {<ManNumber {xsl:value-of}/>} on line 6, and {if(fn:contains(...)) then ... else ...} on line 9
The XSLT works fine if I use either "replace" or "contains", but not both within the same template.
Streamed processing, which you only need for huge input documents (gigabytes in size), requires you to restrict your XSLT to streamable code. Here both replace(., ...) and contains(., ...) read the content of the streamed node, and as the error message says, only one consuming operand is allowed. You can, for instance, make a copy of that element with copy-of(.) and process that small element as a complete in-memory node in a different mode:
<xsl:template match="List/RequestP/ManNumber">
<xsl:apply-templates select="copy-of(.)" mode="grounded"/>
</xsl:template>
<xsl:template mode="grounded" match="ManNumber">
<ManNumber>
<xsl:value-of select="replace(.,'_BE','')" />
</ManNumber>
<xsl:if test="contains(.,'_BE')">
<ManDescrip>BE</ManDescrip>
</xsl:if>
</xsl:template>
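Another option that is often suggested for this error, given here as an untested sketch rather than a confirmed fix: read the string value of the element once into a variable and use that variable in both expressions, so that the streamed node is consumed only once. The variable name value is an illustrative choice and does not come from the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0" exclude-result-prefixes="#all">

  <xsl:output method="xml" omit-xml-declaration="no" indent="yes"/>
  <xsl:mode streamable="yes"/>

  <xsl:template match="List/RequestP/ManNumber">
    <!-- The only expression that reads the streamed node. -->
    <xsl:variable name="value" select="string(.)"/>
    <ManNumber>
      <!-- $value is an ordinary string, so it can be used any number of times. -->
      <xsl:value-of select="replace($value, '_BE', '')"/>
    </ManNumber>
    <xsl:if test="contains($value, '_BE')">
      <ManDescrip>BE</ManDescrip>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

The copy-of(.) approach is the more general one when the template needs to navigate into child elements; the variable is enough when only the text content is needed.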

XSLT merging two files with different namespaces

This is my master HTML file with predefined namespace:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>some title</title>
</head>
<body>
<p>some text</p>
</body>
</html>
And I have an additional XML file defined like this:
<?xml version="1.0" encoding="UTF-8"?>
<article dtd-version="1.1" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML">
<front>
<element>front text</element>
</front>
<back>
<extra-list>
<element>element text</element>
</extra-list>
</back>
</article>
This is wanted final output (head from html file, extra-list from xml file):
<?xml version="1.0" encoding="UTF-8"?>
<xml>
<head>
<title>some title</title>
</head>
<back>
<extra-list>
<element>element text</element>
</extra-list>
</back>
</xml>
I am trying to join these two files with this XSLT below:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xlink="http://www.w3.org/1999/xlink"
xpath-default-namespace="http://www.w3.org/1999/xhtml"
version="2.0">
<xsl:output method="xml" version="1.0" indent="yes"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="html">
<xml>
<xsl:apply-templates/>
</xml>
</xsl:template>
<xsl:template match="head">
<head>
<xsl:apply-templates/>
</head>
</xsl:template>
<xsl:template match="body">
<back>
<xsl:copy-of select="document('doc.xml')"/>
</back>
</xsl:template>
</xsl:transform>
I use xpath-default-namespace in XSLT so I don't have to address HTML's namespace all the time (the original master HTML is huge) and I would like to stay with this parameter if possible. Here I am having two issues:
1.) How is it possible to get rid of all xmlns declarations on output?
2.) I can only copy the whole XML file with the command <xsl:copy-of select="document('doc.xml')"/>. If I try to copy only a subelement with <xsl:copy-of select="document('doc.xml')/article/back"/>, I get no output, because the content is not in the same namespace. How can I solve this?
UPDATE (COMPLETE XSLT SOLUTION):
Based on Martin's answer below, this is the fully working solution.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xpath-default-namespace="http://www.w3.org/1999/xhtml"
version="2.0">
<xsl:output method="xml" version="1.0" indent="yes"/>
<!-- copy all elements and ignore namespace -->
<xsl:template match="*">
<xsl:element name="{local-name()}">
<xsl:apply-templates select="@* | node()"/>
</xsl:element>
</xsl:template>
<!-- copy all attributes and ignore namespace -->
<xsl:template match="@*">
<xsl:attribute name="{local-name()}">
<xsl:value-of select="."/>
</xsl:attribute>
</xsl:template>
<!-- copy all remaining nodes and ignore namespace -->
<xsl:template match="comment() | text() | processing-instruction()">
<xsl:copy/>
</xsl:template>
<xsl:template match="html">
<xml>
<xsl:apply-templates/>
</xml>
</xsl:template>
<xsl:template match="head">
<head>
<xsl:apply-templates/>
</head>
</xsl:template>
<xsl:template match="body">
<xsl:copy-of xpath-default-namespace="" copy-namespaces="no" select="document('doc.xml')/article/back"/>
</xsl:template>
</xsl:transform>
I also added two extra templates to copy attributes and some other nodes.
You can override xpath-default-namespace where needed, e.g. <xsl:copy-of xpath-default-namespace="" select="document('doc.xml')/article/back"/>.
As for namespaces, there are several issues. You run the part of the input that is in the XHTML namespace through an identity transformation, which always preserves the namespace of the copied elements. You need to change from the identity transformation to a transformation that strips the namespace from elements:
<xsl:template match="*">
<xsl:element name="{local-name()}">
<xsl:apply-templates select="@* | node()"/>
</xsl:element>
</xsl:template>
The literal result elements you create in the XSLT have the XLink namespace in scope, because you declare it in the XSLT code without using it. Either remove the declaration or use exclude-result-prefixes="xlink" on the xsl:stylesheet or xsl:transform element.
The other input, which you access with document('doc.xml'), also declares unused namespaces. The default copying preserves them, but as they are only in scope and not used, you can get rid of them with copy-namespaces="no": <xsl:copy-of xpath-default-namespace="" select="document('doc.xml')/article/back" copy-namespaces="no"/>. Alternatively, push those elements through the namespace-stripping template with xsl:element name="{local-name()}" as well.
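The second alternative can be sketched as follows. It assumes the same file name doc.xml as in the question and relies on the fact that xpath-default-namespace is a standard attribute allowed on any XSLT element. Instead of copying the back element, the body template applies templates to it, so it goes through the same namespace-stripping rule as the XHTML input.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.w3.org/1999/xhtml"
    version="2.0">

  <xsl:output method="xml" version="1.0" indent="yes"/>

  <!-- Rebuild every element under its local name, i.e. in no namespace. -->
  <xsl:template match="*">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates select="@* | node()"/>
    </xsl:element>
  </xsl:template>

  <!-- Rebuild every attribute under its local name. -->
  <xsl:template match="@*">
    <xsl:attribute name="{local-name()}" select="."/>
  </xsl:template>

  <xsl:template match="html">
    <xml>
      <xsl:apply-templates select="head | body"/>
    </xml>
  </xsl:template>

  <!-- Replace body with article/back from the second file. The path steps
       are in no namespace, so the default namespace is reset for this
       instruction only. -->
  <xsl:template match="body">
    <xsl:apply-templates xpath-default-namespace=""
        select="document('doc.xml')/article/back"/>
  </xsl:template>

</xsl:transform>
```

Because the elements are recreated with xsl:element, the unused xlink and mml declarations from doc.xml do not appear on them, and copy-namespaces is not needed. Text nodes are copied by the built-in template rules; comments and processing instructions are dropped unless a template for them is added.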

Looking for saxon:evaluate() example code

I have a transform.xsl file which will process an input.xml. There is also an additional config.xml file which defines additional clauses. For example, this is the content of config.xml:
<Location >
<DisplayName>
<Attribute1>ABC</Attribute1>
<Attribute2>XYZ</Attribute2>
<action>concat($Attribute1,$Attribute2)</action>
</DisplayName>
</Location >
So when transform.xsl encounters the DisplayName variable within input.xml, it forms the value from the result of the action expression defined in config.xml. transform.xsl calls config.xml just to get the result. (The action can be modified by the end user, which is why it is placed outside the XSL file, in config.xml.)
We are using the Saxon XML processor version 9 and XSLT 2.0, so we need to use saxon:evaluate(). I tried to find more examples of saxon:evaluate(), but couldn't find many. Can anyone show me some examples of how to use it?
Thanks in advance.
***** This is an edited query to highlight the need of saxon:evaluate *****
Here is an example that uses an XSLT 3 processor supporting xsl:evaluate (https://www.w3.org/TR/xslt-30/#dynamic-xpath) (i.e. Saxon 9.8 or later in the commercial PE or EE editions, or Altova 2017 or later) to process your "config" file:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:map="http://www.w3.org/2005/xpath-functions/map"
xmlns:mf="http://example.com/mf"
exclude-result-prefixes="#all"
version="3.0">
<xsl:param name="config-url" as="xs:string">test2018121301.xml</xsl:param>
<xsl:param name="config-doc" select="doc($config-url)"/>
<xsl:mode on-no-match="shallow-copy"/>
<xsl:key name="element" match="*" use="node-name()"/>
<xsl:function name="mf:config-evaluation" as="item()*">
<xsl:param name="config-doc" as="document-node()"/>
<xsl:param name="element-name" as="xs:QName"/>
<xsl:variable name="display" select="key('element', $element-name, $config-doc)/DisplayName"/>
<xsl:evaluate xpath="$display/regex" with-params="map:merge($display!(* except regex)!map { QName('', local-name()) : string() })"/>
</xsl:function>
<xsl:template match="*[key('element', node-name(), $config-doc)]">
<xsl:copy>
<xsl:value-of select="mf:config-evaluation($config-doc, node-name()), ."/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
So with a config.xml
<Location >
<DisplayName>
<Attribute1>ABC</Attribute1>
<Attribute2>XYZ</Attribute2>
<regex>concat($Attribute1,$Attribute2)</regex>
</DisplayName>
</Location >
this would transform an input sample with e.g.
<Root>
<Items>
<Item>
<Data>data 1</Data>
<Location>location 1</Location>
</Item>
<Item>
<Data>data 2</Data>
<Location>location 2</Location>
</Item>
</Items>
</Root>
into
<Root>
<Items>
<Item>
<Data>data 1</Data>
<Location>ABCXYZ location 1</Location>
</Item>
<Item>
<Data>data 2</Data>
<Location>ABCXYZ location 2</Location>
</Item>
</Items>
</Root>
That gives you great flexibility to allow XPath expressions in the configuration files but, as pointed out in https://www.w3.org/TR/xslt-30/#evaluate-effect, it is also a security problem: "Stylesheet authors need to be aware of the security risks associated with the use of xsl:evaluate. The instruction should not be used to execute code from an untrusted source.".
As for the saxon:evaluate function, which is supported in older versions of Saxon that do not support the XSLT 3 xsl:evaluate instruction, a simple example is
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:saxon="http://saxon.sf.net/"
exclude-result-prefixes="#all"
version="2.0">
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="example">
<xsl:copy>
<xsl:value-of select="saxon:evaluate(@expression, @foo, @bar)"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
which transforms the input
<root>
<example expression="concat($p1, $p2)" foo="This is " bar="an example."/>
<example expression="replace(., $p1, $p2)" foo="\p{L}" bar="X">This is example 2.</example>
</root>
into the result
<root>
<example>This is an example.</example>
<example>XXXX XX XXXXXXX 2.</example>
</root>
Try looking at the xsl:attribute element together with the xsl:value-of element. If I understand what you're asking for, you could probably read config.xml from transform.xsl (or from a second XSL that produces an intermediate file) and set the text inside the regex tag to correspond to the value of an element's attribute within the XSL.
https://www.w3schools.com/xml/ref_xsl_el_attribute.asp
Also, check this tutorial for regex in XSLT 2, it may help:
https://www.xml.com/pub/a/2003/06/04/tr.html

Replace nbsp with another tag

In my XML file I have a tag like this (within the p tags I have &nbsp;).
Now I want to replace this &nbsp; with another tag (for example, within the p tag I want to insert another tag called s, <s/>).
Is this possible? Please help.
First note that the tree on which XSLT operates never contains a character or entity reference; it simply contains a Unicode character. To match a Unicode character and replace it with an element, you can use xsl:analyze-string:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0">
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* , node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="p//text()">
<xsl:analyze-string select="." regex="&#160;">
<xsl:matching-substring>
<s/>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>
</xsl:stylesheet>
That way an input document like
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE doc [
<!ENTITY nbsp "&#160;">
]>
<doc>
<p>This is a paragraph with a non-breaking space &nbsp; before some sub text.</p>
</doc>
is transformed into the result document
<?xml version="1.0" encoding="UTF-8"?><doc>
<p>This is a paragraph with a non-breaking space <s/> before some sub text.</p>
</doc>

XML and XSL with unwanted namespace when using saxon

I have used exclude-result-prefixes="ae" in the XSL stylesheet, but the namespace is still present in the converted XML file. I'm using the Saxon processor. Please find my MWE below.
My XML file is:
<?xml version="1.0" encoding="UTF-8"?>
<ArticleInfo Language="En" ContainsESM="No" OutputMedium="All">
<ArticleID>034</ArticleID>
<ArticleJID>BMCL</ArticleJID>
<ArticleDOI>10.1000/j.asdf.2015.02.034</ArticleDOI>
<ArticleTitle>Sample Article Title with &#x2015; unicode value</ArticleTitle>
<Para>Sample Paragraph text here</Para>
</ArticleInfo>
and my XSL file is:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:ae="www.ams.org" exclude-result-prefixes="ae" version="3.0">
<xsl:output omit-xml-declaration="no" indent="yes" method="xml"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:text disable-output-escaping="yes">
&lt;!DOCTYPE article PUBLIC "-//AMS//DTD journal article//EN//XML" "art.dtd"&gt;
</xsl:text>
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:variable name="ElsDoi" select="/ArticleInfo/ArticleDOI"/>
<xsl:template match="ArticleInfo">
<ae:doi><xsl:value-of select="$ElsDoi"/></ae:doi>
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="Para">
<xsl:element name="ae:para">
<xsl:apply-templates select="@* | node()"/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
The output XML file I'm getting is:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//AMS//DTD journal article//EN//XML" "art.dtd">
<ae:doi xmlns:ae="www.ams.org">10.1000/j.asdf.2015.02.034</ae:doi>
<ArticleID>034</ArticleID>
<ArticleJID>BMCL</ArticleJID>
<ArticleDOI>10.1000/j.asdf.2015.02.034</ArticleDOI>
<ArticleTitle>Sample Article Title with ― unicode value</ArticleTitle>
<ae:para xmlns:ae="www.ams.org">Sample Paragraph text here</ae:para>
The expected output XML file is:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//AMS//DTD journal article//EN//XML" "art.dtd">
<ae:doi>10.1000/j.asdf.2015.02.034</ae:doi>
<ArticleID>034</ArticleID>
<ArticleJID>BMCL</ArticleJID>
<ArticleDOI>10.1000/j.asdf.2015.02.034</ArticleDOI>
<ArticleTitle>Sample Article Title with &#x2015; unicode value</ArticleTitle>
<ae:para>Sample Paragraph text here</ae:para>
Please note that the unwanted xmlns:ae="www.ams.org" is present in the output XML file, and that &#x2015; in the title is converted to the Unicode character. How do I avoid this?
With <xsl:element name="ae:para"> you are explicitly creating an element in the namespace bound to the prefix ae, so don't expect exclude-result-prefixes to exclude that namespace: it only suppresses declarations of namespaces that are not used. A namespace used in a node name can't be excluded with exclude-result-prefixes, as otherwise the result would not be namespace-well-formed XML.
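What can be done is to have the declaration emitted only once. The following is a sketch under the assumption that the result may be wrapped in a single root element, called article here to match the DOCTYPE; that wrapper is hypothetical and not part of the original stylesheet. With exclude-result-prefixes removed, the ae namespace is in scope on the root literal result element, so the serializer declares it there and does not repeat it on ae:doi and ae:para. Setting encoding="US-ASCII" on xsl:output is a standard way to have characters outside ASCII, such as U+2015, written as numeric character references.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ae="www.ams.org" version="3.0">

  <!-- US-ASCII output makes the serializer write non-ASCII characters as
       character references. doctype-public and doctype-system replace the
       disable-output-escaping approach; the DOCTYPE name is taken from the
       root element of the result. -->
  <xsl:output method="xml" indent="yes" encoding="US-ASCII"
      doctype-public="-//AMS//DTD journal article//EN//XML"
      doctype-system="art.dtd"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity transformation for everything not handled below. -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="ArticleInfo">
    <!-- Hypothetical wrapper element: xmlns:ae is declared here, once. -->
    <article>
      <ae:doi><xsl:value-of select="ArticleDOI"/></ae:doi>
      <xsl:apply-templates/>
    </article>
  </xsl:template>

  <xsl:template match="Para">
    <ae:para>
      <xsl:apply-templates select="@* | node()"/>
    </ae:para>
  </xsl:template>

</xsl:stylesheet>
```

The declaration cannot disappear entirely: as long as ae:doi and ae:para are in that namespace, some ancestor or the element itself has to declare the prefix.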
