How to Create an Element from Two Surrounding Elements? - xslt-2.0

I am stuck with an XML to XML transformation using XSLT 2.0 where I need to transform this:
<p>some mixed content <x h="">START:attr="value"</x> more mixed content <x h="">END</x> other mixed content</p>
To this:
<p>some mixed content <ph attr="value"> more mixed content </ph> other mixed content</p>
So basically I'd like to replace <x h="">START:attr="value"</x> with <ph attr="value">
and <x h="">END</x> with </ph> and process the rest as usual.
Does anyone know if that's possible?
My main issue is that I cannot figure out how to find the element with value END and then tell the XSLT processor (I use Saxon) to process the content between the first occurrence of <x h=""> and the second occurrence of <x h=""> and finally write the end element </ph>. I am familiar with how to create an element (including attributes).
I have a specific template to match the start element <x h="">START:attr="value"</x>. Since the XML document I process contains many other elements, I'd prefer a recursive solution that continues processing the content found between START and END using the other existing templates.
Sample XML
(note that I don't know in advance if the parent will be a p element)
<p> my sample text <b>mixed</b> more
<x h="">START:attr="value"</x>
This is mixed content <i>REALLY</i>, process it normally
<x h="">END</x>
</p>
My Stylesheet
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="x[@h][starts-with(., 'START:')]">
<ph>
<xsl:for-each-group select="../*" group-starting-with="x[@h][. = 'START:']">
<xsl:for-each-group select="current-group()" group-ending-with="x[@h][. = 'END']">
<xsl:apply-templates select="@*|node()|text()"/>
</xsl:for-each-group>
</xsl:for-each-group>
</ph>
</xsl:template>
<xsl:template match="x[@h][starts-with(., 'END')]"/>
<xsl:template match="node()|@*">
<xsl:copy copy-namespaces="no">
<xsl:apply-templates select="node()|@*" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Result
<?xml version="1.0" encoding="UTF-8"?>
<p> my sample text <b>mixed</b> more
<ph>mixed</ph>
This is mixed content <i>REALLY</i>, process it normally
</p>
I cannot figure out how to put the complete content between START and END within the <ph> tags. Any ideas?

I would match on the parent containing those markers and use a nested for-each-group, of course all based on the identity transformation template as the base processing:
<xsl:template match="p[x[@h][starts-with(., 'START:')]]">
<xsl:copy>
<xsl:apply-templates select="@*"/>
<xsl:for-each-group select="node()" group-starting-with="x[@h][starts-with(., 'START:')]">
<xsl:choose>
<xsl:when test="self::x[@h][starts-with(., 'START:')]">
<xsl:variable name="value" select="replace(., '(START:attr=")([^"]*)"', '$2')"/>
<xsl:for-each-group select="current-group()[position() gt 1]" group-ending-with="x[@h][. = 'END']">
<xsl:choose>
<xsl:when test="current-group()[last()][self::x[@h][. = 'END']]">
<ph attr="{$value}">
<xsl:apply-templates select="current-group()[position() ne last()]"/>
</ph>
</xsl:when>
<xsl:otherwise>
<xsl:apply-templates select="current-group()"/>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each-group>
</xsl:when>
<xsl:otherwise>
<xsl:apply-templates select="current-group()"/>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each-group>
</xsl:copy>
</xsl:template>
An XSLT 3 example is at https://xsltfiddle.liberty-development.net/pPJ8LV4; for XSLT 2 you need to replace the xsl:mode declaration used there with <xsl:template match="@* | node()"><xsl:copy><xsl:apply-templates select="@* | node()"/></xsl:copy></xsl:template>.
As Saxon also supports XQuery, using a tumbling window, where you can check both the start and the end condition together, might be a bit more concise (although in XQuery you have to do extra work to pass through the content that is not being wrapped, as windowing normally filters out items for which the conditions do not hold):
p ! <p>
{
for tumbling window $group in node()
start $s
when $s[self::x[@h][starts-with(., 'START:')]] or true()
end $e
when $e[self::x[@h][. = 'END']] and $s[self::x[@h][starts-with(., 'START:')]] or not($s[self::x[@h][starts-with(., 'START:')]])
return
if ($s[self::x[@h][starts-with(., 'START:')]])
then
<ph value="{replace($group[1], '(START:attr=")([^"]*)"', '$2')}">
{
tail($group)[not(position() = last())]
}
</ph>
else $group
}
</p>
https://xqueryfiddle.liberty-development.net/948Fn5s/2

Related

How to filter nodes based on certain condition of the child node text

I have an XML file as shown below.
<COLLECTION>
<ChangedParts>
<Part>
<number>123456</number>
<DefaultUnit>each</DefaultUnit>
<FgOrComponent>FG</FgOrComponent>
<MasterPackUom/>
<CartonUom/>
</Part>
<Part>
<number>456789</number>
<DefaultUnit>each</DefaultUnit>
<FgOrComponent>COMPONENT</FgOrComponent>
<MasterPackUom/>
<CartonUom/>
</Part>
</ChangedParts>
</COLLECTION>
I am trying to use XSLT to transform the file. The file contains Part elements with FgOrComponent and some other elements as child nodes. FgOrComponent has either FG or COMPONENT as its value. I need to select only the Part element with FG as the value of its FgOrComponent element and modify some of the other elements in the selected part. The expected output is shown below.
<COLLECTION>
<ChangedParts>
<Part>
<name>123456</name>
<DefaultUnit>ea</DefaultUnit>
<FgOrComponent>FG</FgOrComponent>
<MasterPackUom>mp</MasterPackUom>
<CartonUom>ca</CartonUom>
</Part>
<Part>
<number>456789</number>
<DefaultUnit>each</DefaultUnit>
<FgOrComponent>COMPONENT</FgOrComponent>
<MasterPackUom/>
<CartonUom/>
</Part>
</ChangedParts>
</COLLECTION>
I am using the following XSLT file to do the transformation without any success. Any help would be appreciated.
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/*/*/Part[(FgOrComponent = 'FG')]/*">
<xsl:choose>
<xsl:when test="MasterPackUom/text() = ''">
<MasterPackUom>mp</MasterPackUom>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="."/>
</xsl:otherwise>
</xsl:choose>
<xsl:apply-templates/>
</xsl:template>
</xsl:stylesheet>
The test clause "MasterPackUom/text() = ''" is never reached.
If the element is empty then it doesn't have any text() node children, so just check MasterPackUom = ''.
But as you have the identity transformation set up as a base transformation, please simply write templates for the relevant changes e.g.
<xsl:template match="Part[FgOrComponent = 'FG']/MasterPackUom[. = '']">
<xsl:copy>mp</xsl:copy>
</xsl:template>
instead of doing that odd xsl:choose.
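Following the same approach, the other changes shown in the expected output could each get one small template. This is only a sketch: the mappings (each to ea, an empty CartonUom to ca, number renamed to name) are read off the sample output in the question, not from any stated rule, so adjust them to the real requirements.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>

  <!-- Identity transformation: copies everything not matched below -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Assumed from the sample output: number is renamed to name in FG parts -->
  <xsl:template match="Part[FgOrComponent = 'FG']/number">
    <name>
      <xsl:apply-templates/>
    </name>
  </xsl:template>

  <!-- Assumed from the sample output: 'each' is abbreviated to 'ea' -->
  <xsl:template match="Part[FgOrComponent = 'FG']/DefaultUnit[. = 'each']">
    <xsl:copy>ea</xsl:copy>
  </xsl:template>

  <!-- Empty unit-of-measure elements get default values -->
  <xsl:template match="Part[FgOrComponent = 'FG']/MasterPackUom[. = '']">
    <xsl:copy>mp</xsl:copy>
  </xsl:template>

  <xsl:template match="Part[FgOrComponent = 'FG']/CartonUom[. = '']">
    <xsl:copy>ca</xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Parts whose FgOrComponent is COMPONENT match none of these patterns, so the identity template copies them unchanged.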

Use a dynamic match in XSLT

I have an external document with a list of multiple Xpath like this:
<EncrypRqField>
<EncrypFieldRqXPath01>xpath1</EncrypFieldRqXPath01>
<EncrypFieldRqXPath02>xpath2</EncrypFieldRqXPath02>
</EncrypRqField>
I use this document to obtain the Xpath of the nodes I want to be modified.
The input XML is:
<Employees>
<Employee>
<id>1</id>
<firstname>xyz</firstname>
<lastname>abc</lastname>
<age>32</age>
<department>xyz</department>
</Employee>
</Employees>
I want to obtain something like this:
<Employees>
<Employee>
<id>XXX</id>
<firstname>xyz</firstname>
<lastname>abc</lastname>
<age>XXX</age>
<department>xyz</department>
</Employee>
</Employees>
The XXX values are the result of a data encryption, I want to dynamically obtain the Xpath from the document and change the value of its node.
Thanks.
I'm not sure if something like this is possible in XSLT 2.0. Maybe 3.0 has some kind of evaluate() function, but I don't know any details.
But I tried a workaround and it seems to work. Of course it is not perfect and has many limitations in this form (e.g. you need to specify absolute paths, and you cannot use more complex XPath such as //, [], etc.), so consider it just an idea. But it could be the way to go in some simpler cases.
It is based on comparing two strings instead of evaluating a string as XPath.
Simplified XML with the XPaths to encrypt (I omit the numbers for simplicity).
<?xml version="1.0" encoding="UTF-8"?>
<EncrypRqField>
<EncrypFieldRqXPath>/Employees/Employee/id</EncrypFieldRqXPath>
<EncrypFieldRqXPath>/Employees/Employee/age</EncrypFieldRqXPath>
</EncrypRqField>
And my transformation
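<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- Assumption: the list of XPaths is stored in a file named xpaths.xml next to the stylesheet (hypothetical name) -->
<xsl:variable name="xpaths" select="document('xpaths.xml')/EncrypRqField" />
<xsl:template match="element()">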
<xsl:variable name="pathToElement">
<xsl:call-template name="getPath">
<xsl:with-param name="element" select="." />
</xsl:call-template>
</xsl:variable>
<xsl:choose>
<xsl:when test="$xpaths/EncrypFieldRqXPath[text() = $pathToElement]">
<!-- If an element exists with exactly the same value as the constructed "XPath", then "encrypt" the content of the element -->
<xsl:copy>
<xsl:text>XXX</xsl:text>
</xsl:copy>
</xsl:when>
<xsl:otherwise>
<xsl:copy>
<xsl:apply-templates />
</xsl:copy>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<!-- This template will "construct" the XPath for element under investigation. -->
<!-- There might be an easier way (e.g. some built-in function), but that is beyond my skill. -->
<xsl:template name="getPath">
<xsl:param name="element" />
<xsl:choose>
<xsl:when test="$element/parent::node()">
<xsl:call-template name="getPath">
<xsl:with-param name="element" select="$element/parent::node()" />
</xsl:call-template>
<xsl:text>/</xsl:text>
<xsl:value-of select="$element/name()" />
</xsl:when>
<xsl:otherwise />
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
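For the XSLT 3.0 route mentioned at the start of this answer, the xsl:evaluate instruction can evaluate a string as XPath. The following is a minimal sketch, not a tested solution: the file name xpaths.xml and the names xpaths-uri, input and targets are made up for illustration, and whether xsl:evaluate is available depends on the processor and edition you use.

```xml
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical location of the document listing the XPaths -->
  <xsl:param name="xpaths-uri" select="'xpaths.xml'"/>

  <!-- XSLT 3.0 shorthand for the identity transformation -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- The input document, kept so it can be used as the evaluation context -->
  <xsl:variable name="input" select="/"/>

  <!-- Evaluate every listed XPath against the input and collect the selected nodes -->
  <xsl:variable name="targets" as="node()*">
    <xsl:for-each select="doc($xpaths-uri)/EncrypRqField/*">
      <xsl:evaluate xpath="string(.)" context-item="$input"/>
    </xsl:for-each>
  </xsl:variable>

  <!-- Any element selected by one of the XPaths gets its content replaced -->
  <xsl:template match="*[exists(. intersect $targets)]">
    <xsl:copy>XXX</xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Because the expressions are really evaluated, this variant is not limited to plain absolute paths, but it also means the XPath document must come from a trusted source.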

How do I merge and concatenate the data from each row in two separate source files?

I have two source files which I need to combine on a row by row basis. I am happy reading the files into a variable and I am happy with the logic but the syntax has me stumped. For each row in file 1 I need to loop round each row in file 2 and output the two variables concatenated together:
File 1:
<rows>
<row>1</row>
<row>2</row>
<row>3</row>
<row>4</row>
</rows>
File 2:
<rows>
<row>a</row>
<row>b</row>
</rows>
Required output:
<rows>
<row>1/a</row>
<row>1/b</row>
<row>2/a</row>
<row>2/b</row>
<row>3/a</row>
<row>3/b</row>
<row>4/a</row>
<row>4/b</row>
</rows>
My (poor) attempt at getting the XSLT to work:
<rows>
<xsl:apply-templates select="document('file1.xml')/rows/row" />
</rows>
<xsl:template match="row">
<xsl:apply-templates select="document('file2.xml')/rows/row" />
</xsl:template>
<xsl:template match="row">
<row><xsl:value-of select="???" />/<xsl:value-of select="???" /></row>
</xsl:template>
(These files are simplified versions of what I actually have)
How do I make one template match one 'row' value and the other match another (both source files use the same structure). And how do I set those '???' values?
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:variable name="vDoc2">
<rows>
<row>a</row>
<row>b</row>
</rows>
</xsl:variable>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/*">
<rows>
<xsl:apply-templates/>
</rows>
</xsl:template>
<xsl:template match="row">
<xsl:apply-templates select="$vDoc2/*/row" mode="doc2">
<xsl:with-param name="pValue" select="."/>
</xsl:apply-templates>
</xsl:template>
<xsl:template match="row" mode="doc2">
<xsl:param name="pValue" />
<row><xsl:sequence select="concat($pValue, '/', .)"/></row>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the provided first XML document:
<rows>
<row>1</row>
<row>2</row>
<row>3</row>
<row>4</row>
</rows>
the wanted, correct result is produced:
<rows>
<row>1/a</row>
<row>1/b</row>
<row>2/a</row>
<row>2/b</row>
<row>3/a</row>
<row>3/b</row>
<row>4/a</row>
<row>4/b</row>
</rows>
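The transformation above embeds the second file in a variable. To read it from disk instead, as the question attempts, the variable can be bound to document(). A minimal sketch, assuming file2.xml (the name used in the question) sits next to the stylesheet and the transformation is run against file 1; the variable name vValue is just for illustration:

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- A relative URI is resolved against the location of the stylesheet -->
  <xsl:variable name="vDoc2" select="document('file2.xml')"/>

  <xsl:template match="/*">
    <rows>
      <xsl:apply-templates select="row"/>
    </rows>
  </xsl:template>

  <!-- For each row of file 1, emit one row per row of file 2 -->
  <xsl:template match="row">
    <!-- Remember the file 1 value before the context changes inside for-each -->
    <xsl:variable name="vValue" select="string(.)"/>
    <xsl:for-each select="$vDoc2/rows/row">
      <row>
        <xsl:value-of select="concat($vValue, '/', .)"/>
      </row>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Storing the outer value in a variable answers the '???' part of the question: inside the inner loop, "." refers to the row from file 2, so the row from file 1 has to be captured beforehand (or passed as a parameter, as in the transformation above).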

xslt: keeping namespace declaration on root when root element is not known in advance

I have XML documents that follow a schema where most of the defined elements are allowed to be the root of a valid instance. I also have several XSLT 2.0 stylesheets which translate them in various ways (put them into a normal form, a compact form, a different dialect, ...). These stylesheets are all based on an identity transform with templates added to make the desired modifications. The problem is that there is a proliferation of namespace attributes because some elements come from outside the default namespace.
I have tried the recommended procedures for inserting the namespace on the root element, but I can't seem to get it right. The issues are:
1. The transformation may change the name, and sometimes the content, of the root element, so I still need the templates for each of the global elements, and since I don't know which one will be the root, I can't just insert namespace elements where needed (I don't know where they will be needed for a particular document).
2. I thought about implementing this as multi-pass, or simply as an independent XSLT, since I want the same result for several different stylesheets. In this case, what I would need is an identity transform that takes all the namespaces and prefixes from all elements in the document and inserts them into the root. This would, I hope, automatically remove the namespace attributes from the children? However, I tried the following:
<?xml version="1.0" ?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:strip-space elements="*"/>
<xsl:output method="xml" indent="yes"/>
<xsl:template name="start" match="/">
<xsl:copy>
<xsl:for-each select="*">
<xsl:copy>
<xsl:for-each select="descendant::*">
<xsl:call-template name="add-ns">
<xsl:with-param name="ns-namespace">
<xsl:value-of select="namespace-uri()"/>
</xsl:with-param>
<xsl:with-param name="ns-prefix">
<xsl:value-of
select=" prefix-from-QName( QName(namespace-uri(),name()))"/>
</xsl:with-param>
</xsl:call-template>
</xsl:for-each>
<xsl:apply-templates select="node() | @*"/>
</xsl:copy>
</xsl:for-each>
</xsl:copy>
</xsl:template>
<xsl:template name="add-ns">
<xsl:param name="ns-prefix" select="'x'"/>
<xsl:param name="ns-namespace" select="'someNamespace'"/>
<xsl:namespace name="{$ns-prefix}" select="$ns-namespace"/>
</xsl:template>
<xsl:template match="node()|@* ">
<xsl:copy>
<xsl:apply-templates select="node() | @*"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
And this works for all prefixes that appear on elements, but it doesn't catch the prefixes of attributes. Here is a test document:
<RuleML xmlns="http://www.ruleml.org/0.91/xsd">
<Assert textiri="xy>z">
<Importation xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="abc"
textiri="urn:common-logic:demo1"
xlink:href="http://common-logic.org/x>cl/demos.xml"/>
<a:anything xmlns:a="http://anything.org"
xmlns:xlink="http://www.w3.org/1999/xlink"/>
</Assert>
</RuleML>
I want it to produce:
<RuleML xmlns="http://www.ruleml.org/0.91/xsd" xmlns:a="http://anything.org" xmlns:xlink="http://www.w3.org/1999/xlink" >
<Assert textiri="xy>z">
<Importation xml:id="abc"
textiri="urn:common-logic:demo1"
xlink:href="http://common-logic.org/x>cl/demos.xml"/>
<a:anything/>
</Assert>
</RuleML>
but instead I get
<RuleML xmlns="http://www.ruleml.org/0.91/xsd" xmlns:a="http://anything.org">
<Assert textiri="xy>z">
<Importation xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="abc"
textiri="urn:common-logic:demo1"
xlink:href="http://common-logic.org/x>cl/demos.xml"/>
<a:anything xmlns:xlink="http://www.w3.org/1999/xlink"/>
</Assert>
</RuleML>
Tara
Does the following do what you want?
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs">
<xsl:template match="@* | node()">
<xsl:copy copy-namespaces="no">
<xsl:apply-templates select="@* , node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/*">
<xsl:copy>
<xsl:copy-of select="descendant::*/namespace::*"/>
<xsl:apply-templates select="@* , node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
With Saxon 9.3 it seems to do the job on the sample you posted.
I am however not sure what you want to do if there are several elements in different default namespaces or several elements in different namespaces but using the same prefix. For instance with
<root xmlns="http://example.com/ns1">
<foo xmlns="http://example.com/ns2">
<pf:bar xmlns:pf="http://example.com/ns3">
<pf:foobar xmlns:pf="http://example.com/ns4"/>
</pf:bar>
</foo>
</root>
Saxon simply reports the error
Error at xsl:copy-of on line 15 of test2011061801Xsl2.xsl:
XTDE0430: Cannot create two namespace nodes with the same prefix mapped to different URIs
(prefix="", URI=http://example.com/ns2, URI=http://example.com/ns1)
in built-in template rule
[edit]
If you don't want an error to be reported you could try to implement a strategy to pull up namespace nodes as far up as possible but to avoid any collisions. That can be done with for-each-group, as in the following sample XSLT 2.0:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs">
<xsl:template match="@* | text() | processing-instruction() | comment()">
<xsl:copy/>
</xsl:template>
<xsl:template match="*">
<xsl:copy copy-namespaces="no">
<xsl:for-each-group select="descendant-or-self::*/namespace::*" group-by="local-name()">
<xsl:copy-of select="."/>
</xsl:for-each-group>
<xsl:apply-templates select="@* , node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
With the input being
<root xmlns="http://example.com/ns1">
<foo xmlns="http://example.com/ns2">
<pf:bar xmlns:pf="http://example.com/ns3">
<pf:foobar xmlns:pf="http://example.com/ns4"/>
</pf:bar>
</foo>
</root>
Saxon 9.3 outputs
<?xml version="1.0" encoding="UTF-8"?><root xmlns="http://example.com/ns1" xmlns:pf="http://example.com/ns3">
<foo xmlns="http://example.com/ns2">
<pf:bar>
<pf:foobar xmlns:pf="http://example.com/ns4"/>
</pf:bar>
</foo>
</root>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:strip-space elements="*"/>
<xsl:output method="xml" indent="yes"/>
<xsl:template match="*:RuleML">
<xsl:copy>
<xsl:for-each select="descendant::node()">
<xsl:choose>
<xsl:when test="self::text()"/>
<xsl:otherwise>
<xsl:for-each select="namespace::node()">
<xsl:copy-of select="."/>
</xsl:for-each>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each>
<xsl:apply-templates select="(node() | @*) except namespace::node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="node() | @*">
<xsl:copy>
<xsl:apply-templates select="(node() | @*) except namespace::node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>

Complex XSL Transformation

I am still a beginner with XSLT but I am having a difficult task in hand.
I have a non-XML file which needs to be transformed. The format of the file is as follows:
type1
type1line1
type1line2
type1line3
type2
type2line1
type2line2
type3
type3line1
type3line2
Types (type1, type2, ...) are specified using certain codes which don't have a specific order. Each type has multiple lines underneath.
So, I need to transform this file, but the problem is that for each type I have to do a different transformation for each of its underlying lines.
Now, I can read the string line by line and determine that a new type has begun but I don't know how to set a flag (indicating the type) to use it in the underlying lines.
Here is what I have right now:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0">
<xsl:param name="testString" as="xs:string">
type1
line1
line2
type1
line1
</xsl:param>
<xsl:template match="/">
<xsl:call-template name="main">
<xsl:with-param name="testString" select="$testString"/>
</xsl:call-template>
</xsl:template>
<xsl:template name="main">
<xsl:param name="testString"/>
<xsl:variable name="iniFile" select="$testString"/>
<config>
<xsl:analyze-string select="$iniFile" regex="\n">
<xsl:non-matching-substring>
<item>
<xsl:choose>
<xsl:when test="starts-with(., 'type1')">
<!-- do a specific transformation-->
</xsl:when>
<xsl:when test="starts-with(., 'type2')">
<!-- do another transformation-->
</xsl:when>
</xsl:choose>
</item>
</xsl:non-matching-substring>
</xsl:analyze-string>
</config>
</xsl:template>
</xsl:stylesheet>
Any idea about how to solve the problem?
I think XSLT 2.1 will allow you to use powerful features like for-each-group on sequences of atomic values such as strings, but with XSLT 2.0 such features are available only for sequences of nodes. So my first step when I want to process or group plain string data with XSLT 2.0 is to create elements: you could tokenize your data, wrap each token in an element, and then use for-each-group with group-starting-with to process each group starting with some pattern like '^type[0-9]+$'.
You haven't really told us what you want to do with the data once you have identified a group, so take the following as an example you could adapt:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs">
<xsl:output method="xml" indent="yes"/>
<xsl:param name="input" as="xs:string">type1
type1line1
type1line2
type1line3
type2
type2line1
type2line2
type3
type3line1
type3line2</xsl:param>
<xsl:template name="main">
<xsl:variable name="lines" as="element(item)*">
<xsl:for-each select="tokenize($input, '\n')">
<item><xsl:value-of select="."/></item>
</xsl:for-each>
</xsl:variable>
<xsl:for-each-group select="$lines" group-starting-with="item[matches(., '^type[0-9]+$')]">
<xsl:choose>
<xsl:when test=". = 'type1'">
<xsl:apply-templates select="current-group() except ." mode="m1"/>
</xsl:when>
<xsl:when test=". = 'type2'">
<xsl:apply-templates select="current-group() except ." mode="m2"/>
</xsl:when>
<xsl:when test=". = 'type3'">
<xsl:apply-templates select="current-group() except ." mode="m3"/>
</xsl:when>
</xsl:choose>
</xsl:for-each-group>
</xsl:template>
<xsl:template match="item" mode="m1">
<foo>
<xsl:value-of select="."/>
</foo>
</xsl:template>
<xsl:template match="item" mode="m2">
<bar>
<xsl:value-of select="."/>
</bar>
</xsl:template>
<xsl:template match="item" mode="m3">
<baz>
<xsl:value-of select="."/>
</baz>
</xsl:template>
</xsl:stylesheet>
When applied with Saxon 9 (command line options -it:main -xsl:sheet.xsl) the result is
<foo>type1line1</foo>
<foo>type1line2</foo>
<foo>type1line3</foo>
<bar>type2line1</bar>
<bar>type2line2</bar>
<baz>type3line1</baz>
<baz>type3line2</baz>
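Since the question starts from a non-XML file rather than a parameter, the same grouping could read the text with unparsed-text(). This is a rough sketch under assumptions: the file name input.txt, the parameter name input-uri and the output elements config, group and line are placeholders, and the file is assumed to begin with a type line.

```xml
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="2.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="xs">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical file name; a relative URI is resolved against the stylesheet location -->
  <xsl:param name="input-uri" as="xs:string" select="'input.txt'"/>

  <xsl:template name="main">
    <!-- Split the file into lines, drop blank ones, and wrap each line in an element -->
    <xsl:variable name="lines" as="element(item)*">
      <xsl:for-each select="tokenize(unparsed-text($input-uri), '\r?\n')[normalize-space(.) ne '']">
        <item>
          <xsl:value-of select="normalize-space(.)"/>
        </item>
      </xsl:for-each>
    </xsl:variable>
    <config>
      <!-- Each group starts at a type line; the remaining items are its lines -->
      <xsl:for-each-group select="$lines" group-starting-with="item[matches(., '^type[0-9]+$')]">
        <group type="{.}">
          <xsl:for-each select="subsequence(current-group(), 2)">
            <line>
              <xsl:value-of select="."/>
            </line>
          </xsl:for-each>
        </group>
      </xsl:for-each-group>
    </config>
  </xsl:template>
</xsl:stylesheet>
```

Inside each group the type is available as the first item, which plays the role of the "flag" asked about in the question; the per-type modes from the stylesheet above can be plugged in at the point where the line elements are created.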

Resources