Splitting XML using XSLT 2.0

<r>
<info> </info>
<level id="some unique id" leveltype="group">
<heading>
<title/>
</heading>
<bodytext>
<p>
<text>some text with tags and attribute</text>
</p>
</bodytext>
<level id="some unique id 1" leveltype="para0">
<bodytext>
<p>
<text>some text with tags and attribute</text>
</p>
</bodytext>
</level>
<level id="some unique id 2" leveltype="para0">
<bodytext>
<p>
<text>some text with tags and attribute</text>
</p>
</bodytext>
</level>
</level>
</r>
This structure repeats: each level with leveltype "group" contains several levels with leveltype "para0". My task is to split the file into one output file per para0 level.
Below is my XSLT code:
<xsl:stylesheet version="2.0" >
<xsl:template match="/">
<xsl:variable name="filename" select="replace(base-uri(),'.xml','')"/>
<xsl:for-each select="//level[@leveltype='para0']">
<xsl:result-document method="xml" href="{$filename}{@id}.xml">
<xsl:copy-of select="//docinfo" copy-namespaces="no"></xsl:copy-of>
<xsl:element name="comm:body">
<xsl:choose>
<xsl:when test="./preceding-sibling::node()[@leveltype]='para0'">
<xsl:copy-of select="."></xsl:copy-of>
</xsl:when>
<xsl:otherwise><xsl:copy-of select="./parent::node()"></xsl:copy-of>
</xsl:otherwise>
</xsl:choose>
</xsl:element>
</xsl:result-document>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
The problem I am facing is that when a group contains two para0 levels, both of them end up in the split files.
Output:
some unique id 1.xml
<r>
<info> </info>
<level id="some unique id" leveltype="group">
<heading>
<title/>
</heading>
<bodytext>
<p>
<text>some text with tags and attribute</text>
</p>
</bodytext>
<level id="some unique id 1" leveltype="para0">
<bodytext>
<p>
<text>some text with tags and attribute</text>
</p>
</bodytext>
</level>
</level>
</r>
some unique id 2.xml (split at the 2nd para0; the group should not be present in this file)
<r>
<info> </info>
<level id="some unique id" leveltype="group">
<heading>
<title/>
</heading>
<bodytext>
<p>
<text>some text with tags and attribute</text>
</p>
</bodytext>
<level id="some unique id 2" leveltype="para0">
<bodytext>
<p>
<text>some text with tags and attribute</text>
</p>
</bodytext>
</level>
</level>
</r>
For the second nested para0, the group should not appear in the output file.

Assuming each split file is supposed to contain everything except the other para0 levels, I would use an approach like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:apply-templates select="//level[@leveltype = 'para0']" mode="file"/>
</xsl:template>
<xsl:template match="level[@leveltype = 'para0']" mode="file">
<xsl:result-document href="output{@id}.xml">
<xsl:apply-templates select="/*">
<xsl:with-param name="copy" tunnel="yes" select="current()"/>
</xsl:apply-templates>
</xsl:result-document>
</xsl:template>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* , node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="level[@leveltype = 'para0']">
<xsl:param name="copy" tunnel="yes"/>
<xsl:if test=". is $copy">
<xsl:next-match/>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
For the sample input
<r>
<info> </info>
<level id="some unique id" leveltype="group">
<heading>
<title/>
</heading>
<bodytext>
<p>
<text>some text with tags and attribute</text>
</p>
</bodytext>
<level id="l01" leveltype="para0">
<bodytext>
<p>
<text>some text with tags and attribute</text>
</p>
</bodytext>
</level>
<level id="l02" leveltype="para0">
<bodytext>
<p>
<text>some text with tags and attribute</text>
</p>
</bodytext>
</level>
</level>
</r>
I get two files
<?xml version="1.0" encoding="UTF-8"?><r>
<info> </info>
<level id="some unique id" leveltype="group">
<heading>
<title/>
</heading>
<bodytext>
<p>
<text>some text with tags and attribute</text>
</p>
</bodytext>
<level id="l01" leveltype="para0">
<bodytext>
<p>
<text>some text with tags and attribute</text>
</p>
</bodytext>
</level>
</level>
</r>
and
<?xml version="1.0" encoding="UTF-8"?><r>
<info> </info>
<level id="some unique id" leveltype="group">
<heading>
<title/>
</heading>
<bodytext>
<p>
<text>some text with tags and attribute</text>
</p>
</bodytext>
<level id="l02" leveltype="para0">
<bodytext>
<p>
<text>some text with tags and attribute</text>
</p>
</bodytext>
</level>
</level>
</r>
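If the requirement is read literally (only the file for the first para0 of a group keeps the group wrapper, and the files for later para0 levels contain just that para0), a variant along the following lines might work. This is a sketch under that assumption, not the approach shown above; the element names r and info are taken from the sample input, and the output file naming is only illustrative.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:for-each select="//level[@leveltype = 'para0']">
      <!-- one result document per para0 level -->
      <xsl:result-document href="output{@id}.xml">
        <r>
          <xsl:copy-of select="/r/info"/>
          <xsl:choose>
            <!-- first para0 of its group: keep the group, drop the other para0 levels -->
            <xsl:when test="empty(preceding-sibling::level[@leveltype = 'para0'])">
              <xsl:variable name="this" select="."/>
              <level>
                <xsl:copy-of select="../@*"/>
                <xsl:copy-of select="../node()[not(self::level[@leveltype = 'para0']) or . is $this]"/>
              </level>
            </xsl:when>
            <!-- later para0 levels: copy only the para0 itself, without the group -->
            <xsl:otherwise>
              <xsl:copy-of select="."/>
            </xsl:otherwise>
          </xsl:choose>
        </r>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Whether the later files should still carry the info element or the r root is a guess; adjust the literal result elements to the structure actually required.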

Related

Need to add 'br' tag after the last <group> element under 'block' element

I want to add a 'br' tag after the last 'group' element under the 'block' element. We want a page break after that 'group' element, which is why we are trying to add it to the output. Our input XML structure is shown below.
I tried the XSLT code below but could not get the desired result. Please help with this issue:
Input XML:
<?xml version="1.0" encoding="UTF-8"?>
<page>
<stream>
<block>
<group>content here</group>
<group>content here</group>
<group>content here</group>
</block>
</stream>
<stream>
<block>
<group>content here</group>
<group>content here</group>
<group>content here</group>
<!-- please add here br tag -->
</block>
</stream>
</page>
XSLT CODE:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
version="2.0">
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="group">
<xsl:element name="p">
<xsl:apply-templates/>
<xsl:if test="position() = last()">
<br></br>
</xsl:if>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
Output:
<?xml version="1.0" encoding="UTF-8"?><page>
<stream>
<block>
<p>content here</p>
<p>content here</p>
<p>content here</p>
</block>
</stream>
<stream>
<block>
<p>content here</p>
<p>content here</p>
<p>content here</p>
<!-- please add here br tag -->
</block>
</stream>
</page>
I would match on <xsl:template match="group[last()]"> e.g.
<xsl:template match="group[last()]">
<p>
<xsl:apply-templates/>
</p>
<br/>
</xsl:template>
and for the other groups it seems you want
<xsl:template match="group">
<p>
<xsl:apply-templates/>
</p>
</xsl:template>
It is not clear, however, why your verbal description asks to add a br after the last group of a block while your sample has two blocks and you only add the br in the last block of the last stream. So perhaps you want <xsl:template match="stream[last()]/block/group[last()]"> e.g.
<xsl:template match="stream[last()]/block/group[last()]">
<p>
<xsl:apply-templates/>
</p>
<br/>
</xsl:template>
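Putting the pieces together, a complete stylesheet might look like the sketch below. It assumes the br should follow only the last group of the last stream, as in the sample. The original test did not fire because position() and last() refer to the list of nodes selected by xsl:apply-templates, here all child nodes of block including whitespace text nodes and the comment, so the last group is never the last node in that list. A match pattern such as group[last()] counts only the group siblings instead.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- identity template: copy everything that has no more specific rule -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- every group becomes a p -->
  <xsl:template match="group">
    <p>
      <xsl:apply-templates/>
    </p>
  </xsl:template>
  <!-- the last group of the last stream additionally gets a br after the p;
       this pattern has a higher default priority than the plain "group" pattern -->
  <xsl:template match="stream[last()]/block/group[last()]">
    <p>
      <xsl:apply-templates/>
    </p>
    <br/>
  </xsl:template>
</xsl:stylesheet>
```

Note that the br is emitted after the p, not inside it, which is the other difference from the original attempt.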

Keeping track of an iterator through a nested xsl:for-each

I want to take the following input:
<test>
<a>
<b />
<b />
</a>
<a>
<b />
<b />
</a>
</test>
And create the following output using XSLT 2.0:
<items>
<item num="1">
<item num="2"/>
<item num="3"/>
</item>
<item num="4">
<item num="5"/>
<item num="6"/>
</item>
</items>
I know this is wrong, but for a starting point, here's my current XSLT:
<xsl:template match="test">
<items>
<xsl:for-each select="a">
<item num="{position()}">
<xsl:for-each select="b">
<item num="{position()}"/>
</xsl:for-each>
</item>
</xsl:for-each>
</items>
</xsl:template>
This is clearly not the way to do it, because position() only considers elements in the same level. But how would I do this?
This transformation:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/*">
<items><xsl:apply-templates/></items>
</xsl:template>
<xsl:template match="a|b">
<xsl:variable name="vNum">
<xsl:number level="any" count="a|b"/>
</xsl:variable>
<item num="{$vNum}">
<xsl:apply-templates/>
</item>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML document:
<test>
<a>
<b />
<b />
</a>
<a>
<b />
<b />
</a>
</test>
produces the wanted, correct result:
<items>
<item num="1">
<item num="2"/>
<item num="3"/>
</item>
<item num="4">
<item num="5"/>
<item num="6"/>
</item>
</items>
In XSLT 2, using xsl:number is the right way. If your XSLT processor also supports XSLT 3, an accumulator (https://www.w3.org/TR/xslt-30/#element-accumulator) is an alternative:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all"
version="3.0">
<xsl:mode on-no-match="shallow-copy" use-accumulators="item-count"/>
<xsl:accumulator name="item-count" as="xs:integer" initial-value="0">
<xsl:accumulator-rule match="test" select="0"/>
<xsl:accumulator-rule match="test/a | test/a/b" select="$value + 1"/>
</xsl:accumulator>
<xsl:template match="test">
<items>
<xsl:apply-templates/>
</items>
</xsl:template>
<xsl:template match="a | b">
<item num="{accumulator-before('item-count')}">
<xsl:apply-templates/>
</item>
</xsl:template>
</xsl:stylesheet>
https://xsltfiddle.liberty-development.net/3NSSEvn
That would even work with streaming in Saxon EE where xsl:number is not supported:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all"
version="3.0">
<xsl:mode on-no-match="shallow-copy" use-accumulators="item-count" streamable="yes"/>
<xsl:accumulator name="item-count" as="xs:integer" initial-value="0" streamable="yes">
<xsl:accumulator-rule match="test" select="0"/>
<xsl:accumulator-rule match="test/a | test/a/b" select="$value + 1"/>
</xsl:accumulator>
<xsl:template match="test">
<items>
<xsl:apply-templates/>
</items>
</xsl:template>
<xsl:template match="a | b">
<item num="{accumulator-before('item-count')}">
<xsl:apply-templates/>
</item>
</xsl:template>
</xsl:stylesheet>
I figured out one solution: use xsl:number with any level and counting both a and b:
<xsl:template match="test">
<items>
<xsl:for-each select="a">
<item>
<xsl:attribute name="num">
<xsl:number level="any" count="a|b"/>
</xsl:attribute>
<xsl:for-each select="b">
<item>
<xsl:attribute name="num">
<xsl:number level="any" count="a|b"/>
</xsl:attribute>
</item>
</xsl:for-each>
</item>
</xsl:for-each>
</items>
</xsl:template>
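To see what xsl:number with level="any" and count="a|b" computes, the same number can be derived by hand: it is the number of a and b elements that precede the current element in document order, plus its a and b ancestors, plus the element itself. A sketch of that equivalence, for illustration only (xsl:number remains the simpler choice):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="test">
    <items>
      <xsl:apply-templates/>
    </items>
  </xsl:template>
  <xsl:template match="a | b">
    <!-- preceding:: covers earlier a/b elements that are not ancestors;
         ancestor-or-self:: adds the enclosing a (for a b) and the element itself -->
    <item num="{count(preceding::*[self::a or self::b] | ancestor-or-self::*[self::a or self::b])}">
      <xsl:apply-templates/>
    </item>
  </xsl:template>
</xsl:stylesheet>
```

For the sample input this numbers the items 1 to 6 in document order, the same as the xsl:number version.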

YQL column projection using XPATH

This query:
SELECT *
FROM html
WHERE url='http://wwww.example.com'
AND xpath='//tr[@height="20"]'
returns XML:
<results>
<tr height="20">
<td height="20" width="425">
<p>Institution 0</p>
</td>
<td width="134">
<p>Minneapolis</p>
</td>
<td width="64">
<p>MN</p>
</td>
</tr>
...
</results>
Questions:
Is there a way to use XPATH to create individual columns?
Is there a way to create column aliases?
Example (invalid syntax):
SELECT td[position()=1]/p/. AS name, td[position()=2]/p/. AS city, td[position()=3]/p/. AS region
FROM ...
Goal:
<results>
<tr height="20">
<name>Institution 0</name>
<city>Minneapolis</city>
<region>MN</region>
</tr>
...
</results>
Not with XPath, as you are trying to do. However one can apply XSL Transformations to XML/HTML documents with YQL. Here's an example:
XSLT
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="/">
<rows>
<xsl:apply-templates select="descendant::tr" />
</rows>
</xsl:template>
<xsl:template match="//tr">
<row>
<name>
<xsl:value-of select="td[1]/p" />
</name>
<city>
<xsl:value-of select="td[2]/p" />
</city>
<region>
<xsl:value-of select="td[3]/p" />
</region>
</row>
</xsl:template>
</xsl:stylesheet>
HTML
<html>
<body>
<table>
<tr height="20">
<td height="20" width="425">
<p>Institution 0</p>
</td>
<td width="134">
<p>Minneapolis</p>
</td>
<td width="64">
<p>MN</p>
</td>
</tr>
<tr height="20">
<td height="20" width="425">
<p>Institution 1111</p>
</td>
<td width="134">
<p>Minneapolis 1111</p>
</td>
<td width="64">
<p>MN 11111</p>
</td>
</tr>
</table>
</body>
</html>
YQL query
select * from xslt where stylesheet="url/to.xsl" and url="url/to.html"
YQL Result
<results>
<rows>
<row>
<name>Institution 0</name>
<city>Minneapolis</city>
<region>MN</region>
</row>
<row>
<name>Institution 1111</name>
<city>Minneapolis 1111</city>
<region>MN 11111</region>
</row>
</rows>
</results>
See an example running in the YQL console.

Grouping in XSLT 2.0 similar to br to p problems

In XSLT 1.0, a common question in forums was how to convert flat HTML into hierarchical XML, which many times boiled down to nesting text in between <br /> tags in <p> tags.
I have a similar problem, which I think I've partially solved using XSLT 2.0, but it's a new approach to me and I'd like to get a second opinion.
The XHTML source has <span class="pageStart"></span> scattered throughout. They can appear in several different parent nodes. I want to wrap all the nodes between one page start marker and the next in a <page> node. The solution I currently have is:
<xsl:template match="*[child::span[@class='pageStart']]">
<xsl:copy>
<xsl:copy-of select="@*" />
<xsl:for-each-group select="node()"
group-starting-with="span[@class='pageStart']">
<page>
<xsl:apply-templates select="current-group()"/>
</page>
</xsl:for-each-group>
</xsl:copy>
</xsl:template>
There's at least one flaw with this: the parent node of the marker gets a <page> as a child node when I don't want it. In other words, if there's a <div> that has a child page marker anywhere in it, a <page> node is created as an immediate child of <div> in addition to the locations I expect.
I had hoped that I could simply make the template rule be <xsl:template match="span[@class='pageStart']"> but current-group() seems to be empty no matter what I try. The common sense approach I tried was <xsl:for-each-group select="node()" group-starting-with="span[@class='pageStart']">.
Is there an easier way to solve this problem that I'm missing?
EDIT
Here's an example of the input:
<?xml version="1.0" encoding="UTF-8"?>
<html>
<head></head>
<body>
<span class="pageStart"/>
<p>...</p>
<div>...</div>
<img />
<p></p>
<span class="pageStart"/>
<div>...</div>
<span class="pageStart"/>
<p>...</p>
<div>
<span class="pageStart"/>
<p>...</p>
<p>...</p>
<span class="pageStart"/>
<div>...</div>
<img/>
</div>
</body>
</html>
I assume the last two nested pages make this problem more difficult, so I'd be perfectly happy getting this as the output, or something close:
<?xml version="1.0" encoding="UTF-8"?>
<html>
<head></head>
<body>
<page>
<span class="pageStart"/>
<p>...</p>
<div>...</div>
<img />
<p></p>
</page>
<page>
<span class="pageStart"/>
<div>...</div>
</page>
<page>
<span class="pageStart"/>
<p>...</p>
<div>
<page>
<span class="pageStart"/>
<p>...</p>
<p>...</p>
</page>
<page>
<span class="pageStart"/>
<div>...</div>
<img/>
</page>
</div>
</page>
</body>
</html>
This transformation:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="*[span/@class='pageStart']">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:for-each-group select="node()"
group-starting-with="span[@class='pageStart']">
<page>
<xsl:apply-templates select="current-group()"/>
</page>
</xsl:for-each-group>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML document:
<html>
<head></head>
<body>
<span class="pageStart"/>
<p>...</p>
<div>...</div>
<img />
<p></p>
<span class="pageStart"/>
<div>...</div>
<span class="pageStart"/>
<p>...</p>
<div>
<span class="pageStart"/>
<p>...</p>
<p>...</p>
<span class="pageStart"/>
<div>...</div>
<img/>
</div>
</body>
</html>
produces the wanted, correct result:
<html>
<head/>
<body>
<page>
<span class="pageStart"/>
<p>...</p>
<div>...</div>
<img/>
<p/>
</page>
<page>
<span class="pageStart"/>
<div>...</div>
</page>
<page>
<span class="pageStart"/>
<p>...</p>
<div>
<page>
<span class="pageStart"/>
<p>...</p>
<p>...</p>
</page>
<page>
<span class="pageStart"/>
<div>...</div>
<img/>
</page>
</div>
</page>
</body>
</html>
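One caveat with group-starting-with: any nodes that come before the first marker also form a group, so they would be wrapped in a page element as well. With whitespace stripped this does not show up in the sample, because every parent starts with a marker, but if a parent has real content before its first marker a guard like the following could be used. It is a sketch that relies on the context item inside xsl:for-each-group being the first item of the current group.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="*[span/@class='pageStart']">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:for-each-group select="node()"
                          group-starting-with="span[@class='pageStart']">
        <xsl:choose>
          <!-- group begins with a marker: wrap it in a page -->
          <xsl:when test="self::span[@class='pageStart']">
            <page>
              <xsl:apply-templates select="current-group()"/>
            </page>
          </xsl:when>
          <!-- leading nodes before the first marker: pass through unwrapped -->
          <xsl:otherwise>
            <xsl:apply-templates select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```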

Render a Form from an XSLT file

I've generated the following XSLT file, and have created a Form that will post to an ASP.Net MVC action called Home/ProcessRequest:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
>
<xsl:output method="html" indent="yes"/>
<xsl:template match="/">
<html>
<body>
<xsl:value-of select="Employee/Name"/>
<br />
<xsl:value-of select="Employee/ID"/>
<form method="post" action="/Home/ProcessRequest?id=42">
<input id="Action" name="Action" type="radio" value="Approved"></input> Approved <br />
<input id="Action" name="Action" type="radio" value="Rejected"></input> Rejected <br />
<input type="submit" value="Submit"></input>
</form>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
Here is my XML File:
<Employee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Name>Russ</Name>
<ID>42</ID>
</Employee>
This works fine the way it is, but I need to change the id parameter in my form from a hard-coded integer to the value of the ID element from my XML file. Does anyone know how to do this?
This should do it:
<form method="post" action="/Home/ProcessRequest?id={Employee/ID}">
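The curly braces form an attribute value template: the XPath expression inside {} is evaluated against the current context and its string value is inserted into the attribute.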
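For comparison, the same attribute can be built explicitly with xsl:attribute, which is what the curly-brace form abbreviates. The stylesheet below is a minimal sketch reduced to the form element; it assumes the Employee document from the question as input.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>
  <xsl:template match="/">
    <form method="post">
      <!-- long form of action="/Home/ProcessRequest?id={Employee/ID}" -->
      <xsl:attribute name="action">
        <xsl:value-of select="concat('/Home/ProcessRequest?id=', Employee/ID)"/>
      </xsl:attribute>
      <input type="submit" value="Submit"/>
    </form>
  </xsl:template>
</xsl:stylesheet>
```

If a literal brace is ever needed inside such an attribute, it has to be doubled ({{ or }}) so that it is not read as the start of an expression.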
