Comparing two lists - xslt-2.0

This is a long, long question. Apologies.
My XSLT is not too bad as you can see from my reputation. I've been struggling all day to solve a coding problem and have, in the end, come up with a solution but I don't like it.
It seems to me that I have managed to code a procedural solution in a functional language and I would welcome more elegant, cleaner solutions more in the spirit of XSLT.
I am doing a data reconciliation exercise between two computer systems holding very similar data.
The data in question is public transport Routes, each Route consisting of a list of Points e.g.
<Routes>
<Route Id="1">
<Point Id="1"/>
<Point Id="2"/>
<Point Id="3"/>
<Point Id="4"/>
<Point Id="5"/>
</Route>
</Routes>
None of the Id's are simple, incrementing integers in reality of course.
For 'Business Reasons' this Route may appear in the other system as
<Routes>
<Route Id="1">
<Point Id="1"/>
<Point Id="2"/>
</Route>
<Route Id="1A">
<Point Id="3"/>
<Point Id="4"/>
<Point Id="5"/>
</Route>
</Routes>
We can assume that the Point Id's match between the systems often enough.
Now, I have code that compares Route 1 in one system and Routes that start with 1 in the other system and it produces something like:
<Routes>
<Route>
<Point Id="1" In="Y"/>
<Point Id="2" In="Y"/>
<Point Id="3" In="N"/>
<Point Id="4" In="N"/>
<Point Id="5" In="Y"/>
<Point Id="6" In="Y"/>
<Point Id="7" In="N"/>
<Point Id="8" In="N"/>
<Point Id="9" In="N"/>
</Route>
</Routes>
Where In='Y' means that the point is also in System B for this Route.
This sort of output is a little difficult for the business to understand. They could deal with the following more easily:
<Routes>
<Route>
<Route>
<Group startPoint="1" endPoint="2" In="Y"/>
<Group startPoint="3" endPoint="4" In="N"/>
<Group startPoint="5" endPoint="6" In="Y"/>
<Group startPoint="7" endPoint="9" In="N"/>
</Route>
</Route>
</Routes>
Obviously, I don't really show them anything like this. I show them Excel sheets with text descriptions of things, but I do want to reduce runs of points whose status does not change to sections with a start and an end, as this is much easier to understand in business terms.
In other words, they want to see that this route is the same as the first half of the other route, then skips a bunch of points, then matches up again.
So....
How do I reduce sequences of Y and N elements to elements that say: we said Y from here to here, then N from here to here, then Y again, and then N for the last few? I hope this makes sense.
My test data:
<Routes>
<Route>
<Point Id="1" In="Y"/>
<Point Id="2" In="Y"/>
<Point Id="3" In="N"/>
<Point Id="4" In="N"/>
<Point Id="5" In="Y"/>
<Point Id="6" In="Y"/>
<Point Id="7" In="N"/>
<Point Id="8" In="N"/>
<Point Id="9" In="N"/>
</Route>
</Routes>
My solution:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
version="2.0">
<xsl:template match="/">
<Routes>
<xsl:apply-templates select="/Routes"/>
</Routes>
</xsl:template>
<xsl:template match="/Routes">
<Route>
<xsl:apply-templates select="Route"/>
</Route>
</xsl:template>
<xsl:template match="/Routes/Route">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates select="." mode="Pointy">
<xsl:with-param name="posn" select="1" as="xs:integer"/>
<xsl:with-param name="startPosn" select="1" as="xs:integer"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
<xsl:template match="/Routes/Route" mode="Pointy">
<xsl:param name="posn" as="xs:integer"/>
<xsl:param name="startPosn" as="xs:integer"/>
<xsl:variable name="groupType" select="Point[position()=$startPosn]/@In"/>
<xsl:if test="$posn!=1 and $groupType != Point[$posn]/@In">
<Group>
<xsl:attribute name="startPoint" select="Point[$startPosn]/@Id"/>
<xsl:attribute name="endPoint" select="Point[$posn - 1]/@Id"/>
</Group>
</xsl:if>
<xsl:if test="$posn = count(Point)">
<Group>
<xsl:attribute name="startPoint" select="Point[$startPosn]/@Id"/>
<xsl:attribute name="endPoint" select="Point[$posn]/@Id"/>
</Group>
</xsl:if>
<xsl:if test="$groupType = Point[$posn]/@In and $posn != count(Point)">
<xsl:apply-templates select="." mode="Pointy">
<xsl:with-param name="posn" select="$posn + 1" as="xs:integer"/>
<xsl:with-param name="startPosn" select="$startPosn" as="xs:integer"/>
</xsl:apply-templates>
</xsl:if>
<xsl:if test="$groupType != Point[$posn]/@In and $posn != count(Point)">
<xsl:apply-templates select="." mode="Pointy">
<xsl:with-param name="posn" select="$posn + 1" as="xs:integer"/>
<xsl:with-param name="startPosn" select="$posn" as="xs:integer"/>
</xsl:apply-templates>
</xsl:if>
</xsl:template>
</xsl:stylesheet>

Given the format
<Routes>
<Route>
<Point Id="1" In="Y"/>
<Point Id="2" In="Y"/>
<Point Id="3" In="N"/>
<Point Id="4" In="N"/>
<Point Id="5" In="Y"/>
<Point Id="6" In="Y"/>
<Point Id="7" In="N"/>
<Point Id="8" In="N"/>
<Point Id="9" In="N"/>
</Route>
</Routes>
you can use group-adjacent with
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* , node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="Route">
<xsl:copy>
<xsl:for-each-group select="Point" group-adjacent="@In">
<Group startPoint="{@Id}" endPoint="{current-group()[last()]/@Id}" In="{@In}"/>
</xsl:for-each-group>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
to get
<Routes>
<Route>
<Group startPoint="1" endPoint="2" In="Y"/>
<Group startPoint="3" endPoint="4" In="N"/>
<Group startPoint="5" endPoint="6" In="Y"/>
<Group startPoint="7" endPoint="9" In="N"/>
</Route>
</Routes>
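As a variation on the stylesheet above, here is a minimal sketch that also reports how many points each run contains. The Points attribute is a hypothetical addition for illustration and was not requested in the question; current-group() and current-grouping-key() are standard XSLT 2.0 functions.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* , node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="Route">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- Each group is a maximal run of adjacent Points sharing one In value -->
      <xsl:for-each-group select="Point" group-adjacent="@In">
        <!-- Points (hypothetical attribute) holds the size of the run -->
        <Group startPoint="{current-group()[1]/@Id}"
               endPoint="{current-group()[last()]/@Id}"
               In="{current-grouping-key()}"
               Points="{count(current-group())}"/>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Note that group-adjacent expects every Point to carry an In attribute; a Point without one would need to be handled separately.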

WSO2 ESB escaping xml entity to correct form

I am stuck on creating an output message from wso2 esb 5.0. The correct form of the message should be:
<request lang="11" user="user" pwd="pwd">
<query dateformat="%d-%m-%Y" maxrecords="100">
<tables>
<table tablename="Company"/>
</tables>
<condition>
<cond tablename="Company" fieldname="Upd" op=">" value="10.11.2016"/>
<cond tablename="Company" fieldname="UpdTime" op=">" value="01:00"/>
</condition>
The fun part is the ">" in the "op" attribute. The output has to contain a literal ">" and not the escaped "&gt;". CDATA and this character also don't work:
&
I tried with payload factory, xslt transformation and script mediator with different values but with no success. My bad attempts:
<payloadFactory description="Create Request" media-type="xml">
<format>
<request lang="11" user="user" pwd="pwd">
<query dateformat="%d-%m-%Y" maxrecords="100">
<tables>
<table tablename="Company"/>
</tables>
<condition>
<cond tablename="Company" fieldname="Upd" op=">" value="07.10.2016"/>
<cond tablename="Company" fieldname="UpdTime" op=">" value="01:00"/>
</condition>
</query>
</request>
</format>
<args/>
</payloadFactory>
With XSLT:
<xslt description="Create Request" key="get-companies"/>
<localEntry key="get-companies" xmlns="http://ws.apache.org/ns/synapse">
<xsl:stylesheet exclude-result-prefixes="ws" version="1.0" xmlns:ws="http://ws.company.com/" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output encoding="UTF-8" indent="yes" method="xml" omit-xml-declaration="no"/>
<xsl:template match="/">
<xsl:element name="request">
<xsl:attribute name="lang">11</xsl:attribute>
<xsl:attribute name="user">user</xsl:attribute>
<xsl:attribute name="pwd">pwd</xsl:attribute>
<xsl:element name="query">
<xsl:attribute name="maxrecords">100</xsl:attribute>
<xsl:attribute name="dateformatin">%d-%m-%Y</xsl:attribute>
<tables xmlns="">
<table tablename="Company"/>
</tables>
<condition xmlns="">
<xsl:element name="cond">
<xsl:attribute name="tablename">Company</xsl:attribute>
<xsl:attribute name="fieldname">Upd</xsl:attribute>
<xsl:attribute name="op">></xsl:attribute>
<xsl:attribute name="value">
<xsl:value-of select="'07.01.2016'"/>
</xsl:attribute>
</xsl:element>
<cond fieldname="UpdTime" op=">" tablename="Company" value="01:00"/>
</condition>
</xsl:element>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
Script Mediator:
<script language="js"><![CDATA[mc.setPayloadXML(<request lang="11" user="user" pwd="pwd">
<query dateformat="%d-%m-%Y" maxrecords="100">
<tables>
<table tablename="Company"/>
</tables>
<condition>
<cond tablename="Company" fieldname="Upd" op="&gt;" value="07.01.2016"/>
<cond tablename="Company" fieldname="UpdTime" op="&gt;" value="01:00"/>
</condition>
</query>
</request>); ]]></script>
Please help me to find the correct form.

XSLT- Pre Define NameSpace

Hi, I have an XML document in which each delivery uses a different, uniquely named namespace that I cannot predetermine with standard processes.
<ABC xmlns:this="urn:uuid:9b1f15a9-69de-11d2-b6bc-fcab70ff7331" version="1.1">
<Extensions>
<Identification>urn:uuid:9b1f15a9-69de-11d2-b6bc-fcab70ff7331</Identification>
<Extension>
<SrcPackage>
<this:ABDList>
<TaggedValue>111</TaggedValue>
</this:ABDList>
<this:SubBegin>0</this:SubBegin>
</SrcPackage>
<MatPackage>
<this:ABDList>
<TaggedValue>222</TaggedValue>
</this:ABDList>
<this:SubBegin>1</this:SubBegin>
</MatPackage>
<!-- Stuff -->
</Extension>
</Extensions>
</ABC>
The Next XML delivered could be
<ABC xmlns:this="urn:uuid:9b1FFae4-69de-11d2-b6bc-fcab70ff7331" version="1.1">
<Extensions>
<Identification>urn:uuid:9b1FFae4-69de-11d2-b6bc-fcab70ff7331</Identification>
<Extension>
<SrcPackage>
<this:ABDList>
<TaggedValue>333</TaggedValue>
</this:ABDList>
<this:SubBegin>0</this:SubBegin>
</SrcPackage>
<MatPackage>
<this:ABDList>
<TaggedValue>444</TaggedValue>
</this:ABDList>
<this:SubBegin>1</this:SubBegin>
</MatPackage>
<!-- Stuff -->
</Extension>
</Extensions>
</ABC>
My current XSL stylesheet works on the first XML document because it predefines the namespace.
But I am looking for a way to redefine it later on in the process. I have added a variable to pull the relevant uuid from the Identification element, but I am not sure how to integrate it. Using the stylesheet below to process any other XML document produces incorrect results.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:this="urn:uuid:9b1f15a9-69de-11d2-b6bc-fcab70ff7331"
xmlns:ext="http://exslt.org/common" exclude-result-prefixes="ext">
<xsl:output method="xml" omit-xml-declaration="yes" indent="yes" />
<xsl:strip-space elements="*"/>
<xsl:variable name="SelOpGroup" select="/ABC/Extensions/Identification"/>
<!-- Pass thru -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/ABC/Extensions/Extension/SrcPackage">
<xsl:copy>
<this:ABDList>
<xsl:copy-of select ="this:ABDList/*"/>
<TaggedA>888</TaggedA>
</this:ABDList>
<this:SubBegin><xsl:value-of select="somethingelse"/> </this:SubBegin>
</xsl:copy>
</xsl:template>
<xsl:template match="/ABC/Extensions/Extension/MatPackage">
<xsl:copy>
<this:ABDList>
<xsl:copy-of select ="this:ABDList/*"/>
<TaggedB>999</TaggedB>
</this:ABDList >
<this:SubBegin><xsl:value-of select="somethingelse"/> </this:SubBegin>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Expected Result
<ABC xmlns:this="urn:uuid:9b1FFae4-69de-11d2-b6bc-fcab70ff7331" version="1.1">
<Extensions>
<Identification>urn:uuid:9b1FFae4-69de-11d2-b6bc-fcab70ff7331</Identification>
<Extension>
<SrcPackage>
<this:ABDList>
<TaggedValue>333</TaggedValue>
<TaggedA>888</TaggedA>
</this:ABDList>
<this:SubBegin>a value</this:SubBegin>
</SrcPackage>
<MatPackage>
<this:ABDList>
<TaggedValue>444</TaggedValue>
<TaggedB>999</TaggedB>
</this:ABDList>
<this:SubBegin>a value</this:SubBegin>
</MatPackage>
<!-- Stuff -->
</Extension>
</Extensions>
</ABC>
Many thanks,
Adrian
This transformation:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes" indent="yes" />
<xsl:strip-space elements="*"/>
<xsl:variable name="SelOpGroup" select="/ABC/Extensions/Identification"/>
<!-- Pass thru -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="Extension/SrcPackage">
<xsl:copy>
<xsl:element name="{'this:ABDList'}" namespace="{$SelOpGroup}">
<xsl:copy-of select="/*/namespace::*[name()='this']"/>
<xsl:copy-of select ="*[name() = 'this:ABDList']/*"/>
<TaggedA>888</TaggedA>
</xsl:element>
<xsl:element name="{'this:SubBegin'}" namespace="{$SelOpGroup}">
<xsl:copy-of select="/*/namespace::*[name()='this']"/>
<xsl:value-of select="'somethingelse'"/>
</xsl:element>
</xsl:copy>
</xsl:template>
<xsl:template match="Extension/MatPackage">
<xsl:copy>
<xsl:element name="{'this:ABDList'}" namespace="{$SelOpGroup}">
<xsl:copy-of select="/*/namespace::*[name()='this']"/>
<xsl:copy-of select ="*[name() = 'this:ABDList']/*"/>
<TaggedB>999</TaggedB>
</xsl:element>
<xsl:element name="{'this:SubBegin'}" namespace="{$SelOpGroup}">
<xsl:copy-of select="/*/namespace::*[name()='this']"/>
<xsl:value-of select="'somethingelse'"/>
</xsl:element>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
When applied on the first provided XML document:
<ABC xmlns:this="urn:uuid:9b1f15a9-69de-11d2-b6bc-fcab70ff7331" version="1.1">
<Extensions>
<Identification>urn:uuid:9b1f15a9-69de-11d2-b6bc-fcab70ff7331</Identification>
<Extension>
<SrcPackage>
<this:ABDList>
<TaggedValue>111</TaggedValue>
</this:ABDList>
<this:SubBegin>0</this:SubBegin>
</SrcPackage>
<MatPackage>
<this:ABDList>
<TaggedValue>222</TaggedValue>
</this:ABDList>
<this:SubBegin>1</this:SubBegin>
</MatPackage>
<!-- Stuff -->
</Extension>
</Extensions>
</ABC>
Produces the wanted, correct result:
<ABC xmlns:this="urn:uuid:9b1f15a9-69de-11d2-b6bc-fcab70ff7331" version="1.1">
<Extensions>
<Identification>urn:uuid:9b1f15a9-69de-11d2-b6bc-fcab70ff7331</Identification>
<Extension>
<SrcPackage>
<this:ABDList>
<TaggedValue>111</TaggedValue>
<TaggedA>888</TaggedA>
</this:ABDList>
<this:SubBegin>somethingelse</this:SubBegin>
</SrcPackage>
<MatPackage>
<this:ABDList>
<TaggedValue>222</TaggedValue>
<TaggedB>999</TaggedB>
</this:ABDList>
<this:SubBegin>somethingelse</this:SubBegin>
</MatPackage><!-- Stuff -->
</Extension>
</Extensions>
</ABC>
When the same transformation is applied on the second provided XML document:
<ABC xmlns:this="urn:uuid:9b1FFae4-69de-11d2-b6bc-fcab70ff7331" version="1.1">
<Extensions>
<Identification>urn:uuid:9b1FFae4-69de-11d2-b6bc-fcab70ff7331</Identification>
<Extension>
<SrcPackage>
<this:ABDList>
<TaggedValue>333</TaggedValue>
</this:ABDList>
<this:SubBegin>0</this:SubBegin>
</SrcPackage>
<MatPackage>
<this:ABDList>
<TaggedValue>444</TaggedValue>
</this:ABDList>
<this:SubBegin>1</this:SubBegin>
</MatPackage>
<!-- Stuff -->
</Extension>
</Extensions>
</ABC>
Again the wanted, correct result is produced:
<ABC xmlns:this="urn:uuid:9b1FFae4-69de-11d2-b6bc-fcab70ff7331" version="1.1">
<Extensions>
<Identification>urn:uuid:9b1FFae4-69de-11d2-b6bc-fcab70ff7331</Identification>
<Extension>
<SrcPackage>
<this:ABDList>
<TaggedValue>333</TaggedValue>
<TaggedA>888</TaggedA>
</this:ABDList>
<this:SubBegin>somethingelse</this:SubBegin>
</SrcPackage>
<MatPackage>
<this:ABDList>
<TaggedValue>444</TaggedValue>
<TaggedB>999</TaggedB>
</this:ABDList>
<this:SubBegin>somethingelse</this:SubBegin>
</MatPackage><!-- Stuff -->
</Extension>
</Extensions>
</ABC>
This is bizarre input (what were they smoking?). But since the namespace is only used on one element, ABDList, my approach would be to select the ABDList elements using *:ABDList in XSLT 2.0, or *[local-name()='ABDList'] in XSLT 1.0.
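A minimal XSLT 1.0 sketch of the local-name() approach, assuming the element names and the added TaggedA/TaggedB values from the question. Because xsl:copy keeps the name and namespace of the matched element, the stylesheet never has to declare the changing namespace URI itself.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Match ABDList by local name only, whatever its namespace URI is -->
  <xsl:template match="SrcPackage/*[local-name()='ABDList']">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
      <TaggedA>888</TaggedA>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="MatPackage/*[local-name()='ABDList']">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
      <TaggedB>999</TaggedB>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The same pattern, *[local-name()='SubBegin'], could be used to rewrite the SubBegin elements.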
"I have an XML that upon each delivery has a different unique named Namespace"
Someone ahead of you obviously does not understand the purpose of having a namespace.
Perhaps this could work for your unfortunate situation:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- identity transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="SrcPackage/*/TaggedValue">
<xsl:copy-of select="."/>
<TaggedA>888</TaggedA>
</xsl:template>
<xsl:template match="MatPackage/*/TaggedValue">
<xsl:copy-of select="."/>
<TaggedB>999</TaggedB>
</xsl:template>
</xsl:stylesheet>

tagging words using lookup, what's the best approach?

I have some strings I want to test to see if they contain specific words. The words in question are in a lookup node, and if there is a match the word in the string needs to be tagged. I have a script that works almost ok, but I want to know if I'm using the best format, as I believe it's rather resource consuming, and not very foolproof.
Example xml :
<Main>
<NTUS>
<NTU>match</NTU>
<NTU>test</NTU>
</NTUS>
<Folder id="update">
<about>This content is not in a span so we ignore it completely, even if we would have a match</about>
<Title>
<span class="string simple" lang="en">Some test content containing a single match</span>
</Title>
<Content>
<span class="string complex" lang="en">Also keywords in sub elements should <strong>pass the test</strong>, and match.</span>
</Content>
</Folder>
</Main>
My current xslt :
<xsl:param name="units">
<xsl:copy-of select="//NTU"/>
</xsl:param>
<xsl:template match="/">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="NTUS"/>
<xsl:template match="text()[ancestor::span]">
<xsl:analyze-string select="." regex="\s+">
<xsl:matching-substring>
<xsl:value-of select="."/>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:variable name="theWord" select="."/>
<xsl:choose>
<xsl:when test="$units/*[text()=$theWord]">
<ntu>
<xsl:value-of select="."/>
</ntu>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="."/>
</xsl:otherwise>
</xsl:choose>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>
results in following :
<Main>
<Folder id="update">
<about>This content is not in a span so we ignore it completely, even if we would have a match</about>
<Title>
<span class="string simple" lang="en">Some <ntu>test</ntu> content containing a single <ntu>match</ntu></span>
</Title>
<Content>
<span class="string complex" lang="en">Also keywords in sub elements should <strong>pass the <ntu>test</ntu></strong>, and match.</span>
</Content>
</Folder>
</Main>
Which is almost OK apart from the last node: the [match] there is at the end of the sentence, so the token is "match." (with the full stop) and does not equal the lookup word. I can adjust the regex to make it match, but it could become pretty complex then, so I want to know if there are better ways to address this problem.
EDIT : there seems to be a small misbehaviour when you use a comma delimited list (might be on other occasions also, but this one I noticed)...
So for instance following xml
<Main>
<NTUS>
<NTU>OPTION1</NTU>
<NTU>OPTION2</NTU>
<NTU>OPTION3</NTU>
<NTU>OPTION4</NTU>
<NTU>OPTION5</NTU>
</NTUS>
<local xml:lang="en">
<span>Test string containing some comma seperarated lookup values: OPTION1, OPTION2, OPTION3, OPTION4, OPTION5</span>
</local>
Returns following when the script is applied :
<span>Test string containing some comma seperarated lookup values: <ntu>OPTION1</ntu>, OPTION2, <ntu>OPTION3</ntu>, OPTION4, <ntu>OPTION5</ntu></span>
so every second match is skipped. Any idea what is causing this behaviour ?
This transformation:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:variable name="vPatterns" select=
"string-join(/*/NTUS/*, '|')"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="span//text()">
<xsl:analyze-string select="." regex=
"(^|(\P{{L}})+)({$vPatterns})($|(\P{{L}})+)">
<xsl:non-matching-substring><xsl:value-of select="."/></xsl:non-matching-substring>
<xsl:matching-substring>
<xsl:value-of select="regex-group(1)"/>
<ntu><xsl:value-of select="regex-group(3)"/></ntu>
<xsl:value-of select="regex-group(4)"/>
</xsl:matching-substring>
</xsl:analyze-string>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied to the provided XML document:
<Main>
<NTUS>
<NTU>match</NTU>
<NTU>test</NTU>
</NTUS>
<Folder id="update">
<about>This content is not in a span so we ignore it completely, even if we would have a match</about>
<Title>
<span class="string simple" lang="en">Some test content containing a single match</span>
</Title>
<Content>
<span class="string complex" lang="en">Also keywords in sub elements should <strong>pass the test</strong>, and match.</span>
</Content>
</Folder>
</Main>
the wanted, correct result is produced:
<Main>
<NTUS>
<NTU>match</NTU>
<NTU>test</NTU>
</NTUS>
<Folder id="update">
<about>This content is not in a span so we ignore it completely, even if we would have a match</about>
<Title>
<span class="string simple" lang="en">Some <ntu>test</ntu> content containing a single <ntu>match</ntu></span>
</Title>
<Content>
<span class="string complex" lang="en">Also keywords in sub elements should <strong>pass the <ntu>test</ntu></strong>, and <ntu>match</ntu>.</span>
</Content>
</Folder>
</Main>
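The EDIT in the question reports that every second word in a comma-separated list is skipped. A likely cause, with a pattern that also consumes the delimiters around a word, is that each match swallows the separator that follows it, so the next word no longer has a leading separator to match. Below is a minimal sketch of an alternative, assuming that a word is any run of letters and digits; the variable name vWords is hypothetical. It matches whole word tokens and compares each one with the lookup list, so delimiters are never consumed.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes"/>

  <!-- Lookup words taken from the NTUS list in the input (hypothetical name) -->
  <xsl:variable name="vWords" select="/*/NTUS/NTU/string()"/>

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="span//text()">
    <!-- The braces are doubled because the regex attribute is an attribute value template -->
    <xsl:analyze-string select="." regex="[\p{{L}}\p{{N}}]+">
      <xsl:matching-substring>
        <xsl:choose>
          <!-- Tag the token only if it equals one of the lookup words -->
          <xsl:when test=". = $vWords">
            <ntu><xsl:value-of select="."/></ntu>
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="."/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:matching-substring>
      <!-- Whitespace and punctuation are copied through unchanged -->
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

Since every token is compared as a whole, a word such as "testmatch" is left untagged, and lookup values that contain regex metacharacters need no escaping.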

XSLT - make xsl:analyze-string return string instead of sequence of strings?

Is it possible to make xsl:analyze-string return one string instead of a sequence of strings?
Background: I'd like to use xsl:analyze-string in a xsl:function that should encapsulate the pattern matching. Ideally, the function should return an xs:string to be used as sort criteria in an xsl:sort element.
At the moment, I have to apply string-join() to every result of the function call, since xsl:analyze-string returns a sequence of strings and xsl:sort doesn't accept such a sequence as a sort key. See the xsl:sort instruction in the stylesheet:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:my="www.my-personal-namespa.ce"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:output indent="yes" method="xml" />
<xsl:function name="my:sortierung" >
<xsl:param name="inputstring" as="xs:string"/>
<xsl:analyze-string select="$inputstring" regex="[0-9]+">
<xsl:matching-substring>
<xsl:value-of select="format-number(number(.), '00000')" />
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="." />
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:function>
<xsl:template match="/input">
<result>
<xsl:apply-templates select="value" >
<xsl:sort select="string-join((my:sortierung(.)), ' ')" />
</xsl:apply-templates>
</result>
</xsl:template>
<xsl:template match="value">
<xsl:copy-of select="." />
</xsl:template>
</xsl:stylesheet>
with this input:
<?xml version="1.0" encoding="UTF-8"?>
<input>
<value>A 1 b 120</value>
<value>A 1 b 1</value>
<value>A 1 b 2</value>
<value>A 1 b 1a</value>
</input>
In my example, is there a way to modify the xsl:function to return a xs:string instead of a sequence?
There are several ways, I think. You could put the result of the analyze-string into a variable inside the function and then use xsl:sequence select="string-join($var, ' ')" in the function.
However the following with xsl:value-of should also do:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:my="www.my-personal-namespa.ce"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="my xs">
<xsl:output indent="yes" method="xml" />
<xsl:function name="my:sortierung" as="xs:string">
<xsl:param name="inputstring" as="xs:string"/>
<xsl:value-of separator=" ">
<xsl:analyze-string select="$inputstring" regex="[0-9]+">
<xsl:matching-substring>
<xsl:value-of select="format-number(number(.), '00000')" />
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="." />
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:value-of>
</xsl:function>
<xsl:template match="/input">
<result>
<xsl:apply-templates select="value" >
<xsl:sort select="my:sortierung(.)" />
</xsl:apply-templates>
</result>
</xsl:template>
<xsl:template match="value">
<xsl:copy-of select="." />
</xsl:template>
</xsl:stylesheet>
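The first suggestion in this answer, collecting the pieces in a variable and joining them, could look like the sketch below. The variable name parts is hypothetical; the as="xs:string*" declaration is there so that the variable holds a sequence of strings rather than a temporary tree.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:my="www.my-personal-namespa.ce"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="my xs">
  <xsl:output indent="yes" method="xml"/>

  <xsl:function name="my:sortierung" as="xs:string">
    <xsl:param name="inputstring" as="xs:string"/>
    <!-- Collect the padded numbers and the other fragments as separate strings -->
    <xsl:variable name="parts" as="xs:string*">
      <xsl:analyze-string select="$inputstring" regex="[0-9]+">
        <xsl:matching-substring>
          <xsl:sequence select="format-number(number(.), '00000')"/>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <xsl:sequence select="."/>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </xsl:variable>
    <!-- Join the fragments into the single string that the function returns -->
    <xsl:sequence select="string-join($parts, ' ')"/>
  </xsl:function>

  <xsl:template match="/input">
    <result>
      <xsl:apply-templates select="value">
        <xsl:sort select="my:sortierung(.)"/>
      </xsl:apply-templates>
    </result>
  </xsl:template>

  <xsl:template match="value">
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```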

Difference in template rule processing XSLT 1.0 vs 2.0

Answering another XSLT question on this site, I stumbled on a difference between XSLT 1.0 and 2.0 that I don't understand. Who can explain what is happening here, and how the difference may be resolved?
Note: I am using XML Spy version 2011 sp1 (x64).
My input XML is
<?xml version="1.0" encoding="UTF-8"?>
<root>
<Manager grade="10" id="26">
<Employee id="1" grade="9"/>
<Employee id="2" grade="8"/>
</Manager>
<Manager grade="10" id="27">
<Employee id="3" grade="9"/>
<Employee id="4" grade="8"/>
<Employee id="5" grade="4"/>
</Manager>
<Manager grade="7" id="28">
<Employee id="6" grade="8"/>
<Employee id="7" grade="7"/>
<Employee id="8" grade="6"/>
<Employee id="9" grade="9"/>
</Manager>
<Manager grade="9" id="29">
<Employee id="10" grade="9"/>
<Employee id="11" grade="8"/>
<Employee id="12" grade="7"/>
</Manager>
</root>
I wish to select the set of Employees that have a grade greater than or equal to the manager's grade. For this I wrote the following 1.0 transformation:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<root>
<xsl:apply-templates select="root/Manager"/>
</root>
</xsl:template>
<xsl:template match="Manager">
<mgr>
<managerId><xsl:value-of select="@id"/></managerId>
<managerGrade><xsl:value-of select="@grade"/></managerGrade>
<empsSelection>
<xsl:copy-of select="Employee[@grade >= ../@grade]"/>
</empsSelection>
</mgr>
</xsl:template>
</xsl:stylesheet>
The output is the expected
<?xml version="1.0" encoding="UTF-8"?>
<root>
<mgr>
<managerId>26</managerId>
<managerGrade>10</managerGrade>
<empsSelection/>
</mgr>
<mgr>
<managerId>27</managerId>
<managerGrade>10</managerGrade>
<empsSelection/>
</mgr>
<mgr>
<managerId>28</managerId>
<managerGrade>7</managerGrade>
<empsSelection>
<Employee id="6" grade="8"/>
<Employee id="7" grade="7"/>
<Employee id="9" grade="9"/>
</empsSelection>
</mgr>
<mgr>
<managerId>29</managerId>
<managerGrade>9</managerGrade>
<empsSelection>
<Employee id="10" grade="9"/>
</empsSelection>
</mgr>
</root>
But when I change the XSLT version to 2.0 (take the above stylesheet and change stylesheet/@version to 2.0), I get the following different and unexpected result:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<mgr>
<managerId>26</managerId>
<managerGrade>10</managerGrade>
<empsSelection>
<Employee id="1" grade="9"/>
<Employee id="2" grade="8"/>
</empsSelection>
</mgr>
<mgr>
<managerId>27</managerId>
<managerGrade>10</managerGrade>
<empsSelection>
<Employee id="3" grade="9"/>
<Employee id="4" grade="8"/>
<Employee id="5" grade="4"/>
</empsSelection>
</mgr>
<mgr>
<managerId>28</managerId>
<managerGrade>7</managerGrade>
<empsSelection>
<Employee id="6" grade="8"/>
<Employee id="7" grade="7"/>
<Employee id="9" grade="9"/>
</empsSelection>
</mgr>
<mgr>
<managerId>29</managerId>
<managerGrade>9</managerGrade>
<empsSelection>
<Employee id="10" grade="9"/>
</empsSelection>
</mgr>
</root>
Why is this and how should the stylesheet be changed in order to get the correct result in both XSLT 1.0 and 2.0 version?
I think that in XSLT 2.0 the attribute values are compared as strings by default, whereas in XSLT 1.0 the relational operators (<, <=, >, >=) first convert their operands to numbers and then compare those. So with XSLT 2.0 you need
<xsl:template match="Manager">
<mgr>
<managerId><xsl:value-of select="@id"/></managerId>
<managerGrade><xsl:value-of select="@grade"/></managerGrade>
<empsSelection>
<xsl:copy-of select="Employee[number(@grade) >= number(current()/@grade)]"/>
</empsSelection>
</mgr>
</xsl:template>
to get the result you want. Of course, converting with another numeric type, such as xs:integer(@grade), should work as well.
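A minimal sketch of the xs:integer variant, assuming that every grade attribute holds a whole number. It only works with an XSLT 2.0 processor, because the xs:integer constructor function does not exist in XSLT 1.0; the number() form shown above is the one to use if the stylesheet has to run under both versions.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/">
    <root>
      <xsl:apply-templates select="root/Manager"/>
    </root>
  </xsl:template>

  <xsl:template match="Manager">
    <mgr>
      <managerId><xsl:value-of select="@id"/></managerId>
      <managerGrade><xsl:value-of select="@grade"/></managerGrade>
      <empsSelection>
        <!-- Both operands are converted to integers, so "9" and "10" compare numerically -->
        <xsl:copy-of select="Employee[xs:integer(@grade) >= xs:integer(../@grade)]"/>
      </empsSelection>
    </mgr>
  </xsl:template>
</xsl:stylesheet>
```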
