I need to transform XML/HTML files into DITA files. I want to remove some nodes but keep their children. The difficulties are:
The nodes I want to remove have attributes, and I get this error:
An attribute node (class) cannot be created after a child of the
containing element
These attributes are unpredictable: I want to remove a variety of nodes, and I can't predict what kinds of attributes they have.
I don't know how deeply a node is nested. It could be a direct child of <body>, or it could be nested 4 or 5 levels down inside other nodes.
XML Example:
<macro name="section">
<rich-text-body>
<macro name="column">
<parameter name="width">80%</parameter>
<rich-text-body>
<p>horribly nested, <span>bulky</span> structure</p>
<div>horribly nested, <span>bulky</span> structure</div>
</rich-text-body>
</macro>
</rich-text-body>
</macro>
I want to remove the bulky macro tags and keep only the children of the innermost <rich-text-body>. In this case, those are the <p> and <div> elements.
This is as far as I got. The XSLT:
<xsl:template match="node()|@*">
  <xsl:copy>
    <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
</xsl:template>
<xsl:template match="macro[@name='column' and parameter[@name='width'] ='80%']">
  <xsl:apply-templates select="node()|@*"/>
</xsl:template>
Any help is appreciated! Thanks!
As already suggested, if you change
<xsl:template match="macro[@name='column' and parameter[@name='width'] ='80%']">
  <xsl:apply-templates select="node()|@*"/>
</xsl:template>
to
<xsl:template match="macro[@name='column' and parameter[@name='width'] ='80%']">
  <xsl:apply-templates select="node()"/>
</xsl:template>
or to the equivalent but shorter
<xsl:template match="macro[@name='column' and parameter[@name='width'] ='80%']">
  <xsl:apply-templates/>
</xsl:template>
then you won't get any errors from the attributes of the macro element, because they are no longer processed.
If you want to process them in order to add them to a different element, then you need to show us exactly where you want to put them.
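The original error arises because the attributes of the removed macro are copied into whatever result element is currently open, and that element already has child content. A minimal sketch of a complete stylesheet for the sample above follows. It assumes that every macro and rich-text-body wrapper should be unwrapped at any depth and that parameter elements can be dropped; neither assumption is confirmed in the question.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copies everything that no other template handles. -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Unwrap the wrapper elements: there is no xsl:copy and only child nodes
       are processed, so the wrappers' own attributes (whatever they are)
       are never written to the output. -->
  <xsl:template match="macro | rich-text-body">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Assumption: macro parameters such as width are not wanted in the output. -->
  <xsl:template match="macro/parameter"/>

</xsl:stylesheet>
```

Because the templates match by element name rather than by position, the nesting depth of the wrappers does not matter. For the sample input, the <p> and <div> elements are copied unchanged, together with the whitespace text nodes that surrounded them.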
Related
I am stuck with an XML to XML transformation using XSLT 2.0 where I need to transform this:
<p>some mixed content <x h="">START:attr="value"</x> more mixed content <x h="">END</x> other mixed content</p>
To this:
<p>some mixed content <ph attr="value"> more mixed content </ph> other mixed content</p>
So basically I'd like to replace <x h="">START:attr="value"</x> with <ph attr="value">
and <x h="">END</x> with </ph> and process the rest as usual.
Does anyone know if that's possible?
My main issue is that I cannot figure out how to find the element with the value END and then tell the XSLT processor (I use Saxon) to process the content between the first occurrence of <x h=""> and the second occurrence of <x h="">, and finally write the end tag </ph>. I am familiar with how to create an element (including attributes).
I have a specific template to match the start element <x h="">START:attr="value"</x>. Since the XML document I process contains many other elements, I'd prefer a recursive solution that continues processing the content found between START and END with the other existing templates.
Sample XML
(note that I don't know in advance if the parent will be a p element)
<p> my sample text <b>mixed</b> more
<x h="">START:attr="value"</x>
This is mixed content <i>REALLY</i>, process it normally
<x h="">END</x>
</p>
My Stylesheet
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="x[@h][starts-with(., 'START:')]">
    <ph>
      <xsl:for-each-group select="../*" group-starting-with="x[@h][. = 'START:']">
        <xsl:for-each-group select="current-group()" group-ending-with="x[@h][. = 'END']">
          <xsl:apply-templates select="@*|node()|text()"/>
        </xsl:for-each-group>
      </xsl:for-each-group>
    </ph>
  </xsl:template>
  <xsl:template match="x[@h][starts-with(., 'END')]"/>
  <xsl:template match="node()|@*">
    <xsl:copy copy-namespaces="no">
      <xsl:apply-templates select="node()|@*" />
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
Result
<?xml version="1.0" encoding="UTF-8"?>
<p> my sample text <b>mixed</b> more
<ph>mixed</ph>
This is mixed content <i>REALLY</i>, process it normally
</p>
I cannot figure out how to put the complete content between START and END inside the <ph> tags. Any ideas?
I would match on the parent containing those markers and use a nested for-each-group, with the identity transformation template as the base processing:
<xsl:template match="p[x[@h][starts-with(., 'START:')]]">
  <xsl:copy>
    <xsl:apply-templates select="@*"/>
    <xsl:for-each-group select="node()" group-starting-with="x[@h][starts-with(., 'START:')]">
      <xsl:choose>
        <xsl:when test="self::x[@h][starts-with(., 'START:')]">
          <xsl:variable name="value" select="replace(., '(START:attr=&quot;)([^&quot;]*)&quot;', '$2')"/>
          <xsl:for-each-group select="current-group()[position() gt 1]" group-ending-with="x[@h][. = 'END']">
            <xsl:choose>
              <xsl:when test="current-group()[last()][self::x[@h][. = 'END']]">
                <ph attr="{$value}">
                  <xsl:apply-templates select="current-group()[position() ne last()]"/>
                </ph>
              </xsl:when>
              <xsl:otherwise>
                <xsl:apply-templates select="current-group()"/>
              </xsl:otherwise>
            </xsl:choose>
          </xsl:for-each-group>
        </xsl:when>
        <xsl:otherwise>
          <xsl:apply-templates select="current-group()"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>
An XSLT 3 example is at https://xsltfiddle.liberty-development.net/pPJ8LV4. For XSLT 2, you need to replace the xsl:mode declaration used there with <xsl:template match="@* | node()"><xsl:copy><xsl:apply-templates select="@* | node()"/></xsl:copy></xsl:template>.
Saxon also supports XQuery, where a tumbling window lets you check the start and the end condition together, which might be a bit more concise. In XQuery, however, you have to do extra work to pass through the content that is not being wrapped, because windowing normally filters out items for which the conditions do not hold:
p ! <p>
{
  for tumbling window $group in node()
  start $s
    when $s[self::x[@h][starts-with(., 'START:')]] or true()
  end $e
    when $e[self::x[@h][. = 'END']] and $s[self::x[@h][starts-with(., 'START:')]] or not($s[self::x[@h][starts-with(., 'START:')]])
  return
    if ($s[self::x[@h][starts-with(., 'START:')]])
    then
      <ph attr="{replace($group[1], '(START:attr=")([^"]*)"', '$2')}">
      {
        tail($group)[not(position() = last())]
      }
      </ph>
    else $group
}
</p>
https://xqueryfiddle.liberty-development.net/948Fn5s/2
Thanks to a couple of other questions, I'm very slowly getting where I need to be with an XSL transformation that changes the URLs of my namespaces. I am using XSLT 2.0.
The main sample is here: http://xsltransform.net/ei5Pwj2
This is starting to work, but I have two questions: one to try to make it work (!) and one to see whether it can be made better.
Firstly my use of
<xsl:namespace name="ns1">http://fruit.com/app/api</xsl:namespace>
has caused a problem: the ns1 attributes on the element have had their namespace prefixes changed from
... ns1:created="2016-05-23T16:47:55+01:00" ns1:href="http://falseserver:8080/app/api/apple/1" ns1:id="1">
to
... ns1_1:created="2016-05-23T16:47:55+01:00"
ns1_2:href="http://falseserver:8080/app/api/apple/1" ns1_3:id="1">
Can anyone tell me why this happens and how to stop it? I can't see how without adding it as a namespace on the element, but as I already have a namespace tag there, this is not possible.
This would be enough to get me going for now, but what would be perfect is a way to transform the namespaces without reference to the element at all. At the moment, if I can get it working as is, I will need several XSLT files for slightly different documents. What I really want is to transform the namespaces regardless of what the current root node is,
so that all documents would have all 6 namespaces as attributes regardless of whether the root element is
<ns2:apple ...
<ns2:apples ...
<ns4:banana ...
<ns4:bananas ...
etc.
You will need to transform every node that is in a given namespace into the new namespace you need. Adding namespace declarations neither helps nor suffices to change the qualified name of a node (that qualified name always is a local name plus a namespace):
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ns1="http://veg.com/app/api" xmlns:ns2="http://veg.com/app/api/apple">
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="ns2:*">
    <xsl:element name="ns2:{local-name()}" namespace="http://fruit.com/app/api/apple">
      <xsl:apply-templates select="@* | node()"/>
    </xsl:element>
  </xsl:template>
  <xsl:template match="/ns2:*">
    <xsl:element name="ns2:{local-name()}" namespace="http://fruit.com/app/api/apple">
      <xsl:namespace name="ns1">http://fruit.com/app/api</xsl:namespace>
      <xsl:namespace name="ns3">http://fruit.com/app/api/apple/red</xsl:namespace>
      <xsl:namespace name="ns4">http://fruit.com/app/banana</xsl:namespace>
      <xsl:namespace name="ns5">http://fruit.com/app/api/pear</xsl:namespace>
      <xsl:namespace name="ns6">http://fruit.com/app/api/orange</xsl:namespace>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
  </xsl:template>
  <xsl:template match="@ns1:*">
    <xsl:attribute name="ns1:{local-name()}" namespace="http://fruit.com/app/api" select="."/>
  </xsl:template>
</xsl:stylesheet>
http://xsltransform.net/ei5Pwj2/1
Your sample does not have nodes in the other namespaces but if the real code has such nodes and they need to be transformed then you need to add templates matching and transforming them, using the same approach as done above for the ns2:* element or ns1:* attribute nodes.
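For the second part of the question (rewriting namespaces without naming particular elements), one possible sketch is to compute the new namespace URI from the old one instead of writing one template per prefix. It assumes that every old URI starts with http://veg.com/ and that the new URI differs only in that prefix, which the declarations above suggest but do not guarantee. Unlike the stylesheet above, it does not add declarations for namespaces that the document never uses.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Rebuild every element under the same lexical name. The namespace
       attribute decides the URI; the prefix in name is only a hint.
       Assumption: old URIs start with http://veg.com/ -->
  <xsl:template match="*">
    <xsl:element name="{name()}"
                 namespace="{replace(namespace-uri(), '^http://veg\.com/', 'http://fruit.com/')}">
      <xsl:apply-templates select="@* | node()"/>
    </xsl:element>
  </xsl:template>

  <!-- Same treatment for attributes; attributes in no namespace stay there. -->
  <xsl:template match="@*">
    <xsl:attribute name="{name()}"
                   namespace="{replace(namespace-uri(), '^http://veg\.com/', 'http://fruit.com/')}"
                   select="."/>
  </xsl:template>

  <!-- Everything else is copied unchanged. -->
  <xsl:template match="text() | comment() | processing-instruction()">
    <xsl:copy/>
  </xsl:template>

</xsl:stylesheet>
```

Because xsl:element does not copy the namespace nodes of the source element, the old declarations are not carried over; the processor declares only the namespaces that the result actually uses, and it typically keeps the original prefixes.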
I am new to xslt and am trying to transform the following xml:
<li>Hi there <person>David</person> , how is it going? </li>
I would like to transform this to another xml to something like:
<response>Hi there PERSON_NAME , how is it going? </response>
What I have so far is this:
<xsl:template match="li">
<response><xsl:value-of select="text()"/>
<xsl:choose>
<xsl:when test="person">
<xsl:text> PERSON_NAME </xsl:text>
</xsl:when>
</xsl:choose>
</response>
</xsl:template>
This is the output I get:
<response>Hi there , how is it going? PERSON_NAME</response>
Not exactly what I wanted. I am new to XSLT and have read a book, but I did not find any example in which an XML element has a child node in between its text. I'm not sure whether XSLT can handle this or whether I am missing something fundamental. Any help would be greatly appreciated. I am using XSLT 2.0.
You can simply define two templates to handle this:
<xsl:template match="li">
<response>
<xsl:apply-templates/>
</response>
</xsl:template>
<xsl:template match="person">
<xsl:text>PERSON_NAME</xsl:text>
</xsl:template>
I get XML from this URL: http://www.concert.ru/mail-ru/concert.xml
I need the ActionPlaces and Actions tags to be handled separately, so I apply templates to each of them in turn:
<xsl:template match="/">
<xsl:apply-templates select="Data/ActionPlaces"/>
<xsl:apply-templates select="Data/Actions"/>
</xsl:template>
But they should be wrapped inside a tag called enfinity.
So when I do this:
<enfinity>
<xsl:template match="Data/Actions">..
<xsl:template match="Data/ActionPlaces"> ..
</enfinity>
I get incorrect output. When the main tag is inside one of the templates, I get correct output, but I need the main tag to be at the top. How do I handle this?
Try:
<xsl:template match="/">
<enfinity>
<xsl:apply-templates select="Data/ActionPlaces"/>
<xsl:apply-templates select="Data/Actions"/>
</enfinity>
</xsl:template>
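xsl:template elements must be direct children of xsl:stylesheet, so they cannot be nested inside a literal result element such as enfinity. A sketch of how the pieces could fit together follows; the places and actions wrapper elements are hypothetical placeholders for whatever the two section templates really produce.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- The wrapper element is written once, by the template for the document root. -->
  <xsl:template match="/">
    <enfinity>
      <xsl:apply-templates select="Data/ActionPlaces"/>
      <xsl:apply-templates select="Data/Actions"/>
    </enfinity>
  </xsl:template>

  <!-- Hypothetical bodies: replace them with the real handling of each section. -->
  <xsl:template match="Data/ActionPlaces">
    <places>
      <xsl:apply-templates/>
    </places>
  </xsl:template>

  <xsl:template match="Data/Actions">
    <actions>
      <xsl:apply-templates/>
    </actions>
  </xsl:template>

</xsl:stylesheet>
```

The two section templates stay at the top level of the stylesheet; only their output ends up inside enfinity, because they are invoked from within it.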
I have the following basic XML, which I want to parse to give the NAME only if none of the DB values equals DB1.
<rnas>
<rna ID="1">
<NAME>Segment 6</NAME>
<XREF>
<ID>AF389120</ID>
<DB>DB1</DB>
</XREF>
<XREF>
<ID>ABCDE</ID>
<DB>DB2</DB>
</XREF>
</rna>
<rna ID="10">
<NAME>Segment 3</NAME>
<XREF>
<ID>12345</ID>
<DB>DB2</DB>
</XREF>
<XREF>
<ID>66789</ID>
<DB>DB3</DB>
</XREF>
</rna>
</rnas>
The expected output would be:
<rnas>
<rna ID="10">
<NAME>Segment 3</NAME>
</rna>
</rnas>
I am still a relative newbie and have tried a variety of approaches using XSLT 2.0 but so far have not been able to get anything to work properly. Any help would be much appreciated.
This will do what you want:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="rna[.//DB/text()='DB1']"/>
  <xsl:template match="XREF"/>
</xsl:stylesheet>
It's an Identity Transform along with two empty templates. The first matches any rna that contains a DB with the text value DB1, and suppresses it. The second matches all XREF elements, which you do not want to output.