xslt how to read the document-node() - xslt-2.0

I have an XML file in which one element has a CDATA section as its value. I put the CDATA value into a variable, which the oXygen debugger shows as having the type document-node(1). How do I iterate over the document-node()?
Copying can give me a new XML file, but I don't need a new file. I only need to read certain nodes and generate a report based on their values, so I copied the CDATA directly into my variable and thought I could manipulate it there.
I tried to use substring to read the variable's contents, but that failed.
I tried to use document(variable) to open the variable, but oXygen reports the error FODC0002: I/O error reported by XML parser processing file.
Here the "file" is my variable, whose content looks like an XML file.
I searched Google for the error but only found a bunch of unresolved questions, such as oXygen throwing an I/O error when document() is used.
Could anybody tell me what's going wrong, or suggest a better solution?
I also tried parse-xml() but I got the following error from Saxon:
F[Saxon-EE9.5.1.5] the processing instruction target matching "[xX][mM][lL]" is not allowed
F[Saxon-EE9.5.1.5] FODC0006: First argument to parse-xml() is not a well formed and namespace-well-formed XML document.
my code to use parse-xml is as below:
<xsl:template match="data">
<xsl:for-each select="parse-xml(root/outsideData)//nodeLevel1/nodeLevel2">
Could anyone give me a sample about how to use parse-xml()? I did google search but didn't find useful samples.
Thanks very much!
A piece of my data is like the following:
<root>
<outsideData id="123">
<child1 key="124375438"/>
<![CDATA[ <?xml version=1.0 encoding="UTF-8"?><insideData xmlns:xlink="http://www.w3.org/1999/xlink">
<nodeLevel1>
<nodeLevel21>packing</nodeLevel21>
<nodeLevel22 ref="12343-454/560" xlink:href="URN:X-MN:DD%3FM=B888%26SDC=A%26CH=79% .../>
</nodeLevel1>
]]>
</outsideData>
</root>
I want to get the @ref and @xlink:href attributes of the <nodeLevel22> element inside the CDATA, which should give DD-FM-B888-26-79
My variables are:
<xsl:for-each select="/root/outsideData">
<xsl:variable name="insideData">
<xsl:value-of select="." disable-output-escaping="yes"/>
</xsl:variable>
<xsl:variable name="Data">
<xsl:value-of
select="normalize-space(substring-after($insideData,'?>'))"
disable-output-escaping="yes"/>
</xsl:variable>
</xsl:for-each>
From the debugger I can see that the variables insideData and Data both have the type document-node(1).
Martin's solution works for me very well :)
But I'm still wondering why the following doesn't work:
<xsl:variable name="insideData">
<xsl:value-of select="." disable-output-escaping="yes"/>
</xsl:variable>
<ref>
<xsl:value-of select="substring-before(substring-after($insideData, '&lt;nodeLevel22 ref'), '/>')"/>
</ref>
Here I got empty <ref/>

If you do <xsl:variable name="varName"><xsl:value-of select="..."/></xsl:variable> then you are creating a temporary document fragment that contains a single text node with the string contents of the item(s) selected in the value-of. That does not make sense in most cases; doing <xsl:variable name="varName" select="..."/> is usually sufficient.
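To make the difference concrete, here is a minimal sketch that binds the same content both ways and reports what each variable holds. The variable names asTree and asString are made up for illustration, and the match pattern assumes the root/outsideData structure from the question.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="text"/>

  <xsl:template match="/root/outsideData">
    <!-- Content body: builds a temporary tree, i.e. a document node
         whose only child is one text node -->
    <xsl:variable name="asTree">
      <xsl:value-of select="."/>
    </xsl:variable>

    <!-- select attribute: binds the variable directly to a string -->
    <xsl:variable name="asString" select="string(.)"/>

    <xsl:text>asTree is a document node: </xsl:text>
    <xsl:value-of select="$asTree instance of document-node()"/>
    <xsl:text>&#10;asString is a string: </xsl:text>
    <xsl:value-of select="$asString instance of xs:string"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

This is why the debugger reports document-node(1) for both variables in the question: each one wraps its text in a temporary tree, even though that text is never parsed into elements.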
As for parsing the contents of the outsideData element with parse-xml: that element contains not only the escaped XML document but white space as well. If you try to parse the whole contents as XML you get that error, because white space before the XML declaration is not allowed. The whole approach of stuffing the XML into a CDATA section within an element with mixed contents is flawed in my view. If you want to store escaped XML in a CDATA section, you should make sure that you use a single element that contains nothing but the CDATA section, which in turn contains only the XML markup with no leading white space.
If you can't change the creation of the input data, then you will need to make sure you pass parse-xml only that part of the element's string contents that is a well-formed XML document. So you need some way to strip the white space before the XML declaration, for example:
<xsl:for-each select="/root/outsideData">
<xsl:variable name="xml-string" select="replace(., '^\s+', '')"/>
<xsl:variable name="xml-doc" select="parse-xml($xml-string)"/>
<!-- now output data e.g. -->
<xsl:value-of select="$xml-doc//nodeLevel1/nodeLevel22/@ref"/>
...
</xsl:for-each>
Untested, but it should point in the right direction for using parse-xml.

Related

why am I getting Required cardinality of value of variable $depts is exactly one; supplied value has cardinality more than one

Trying to figure out some homework here and the online teacher has never responded to any questions I ask.
I keep getting an error when I try to process the xml document.
"XTTE0570: Required cardinality of value of variable $depts is exactly one; supplied value has cardinality more than one"
The case problem instructions:
First, create a template named getEmployees.
Within the getEmployees template, create a variable named depts containing a sequence of the following text strings representing the department codes for Lucy’s sample data: ‘a00’, ‘c01’, ‘d11’, ‘d21’, ‘e11’, and ‘e21’.
After the line to create the depts variable, create the departments element.
Within the departments element, insert a for-each loop that loops through each entry in the depts sequence.
For each entry in the depts sequence do the following:
a. Create a variable named currentDept equal to the current item in the depts sequence.
b. Create an element named department with an attribute named deptID whose value is equal to the value of the currentDept variable.
c. Use the doc() function to reference the “deptcurrent.xml” file, where current is the value of the currentDept variable. (Hint: Use the concat() function to combine the text strings for “dept”, the currentDept variable, and the text string “.xml”.)
d. Use the copy-of element to copy the contents of the employees element and its descendants to the department element.
Save your changes to the file and then use your XSLT 2.0 processor to generate the result document horizons.xml by applying the getEmployees template within the alldepartments.xsl style sheet.
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs">
<xsl:output method="xml" encoding="UTF-8" indent="yes" />
<xsl:template name="getEmployees">
<xsl:variable name="depts" select="('a00', 'c01', 'd11', 'd21', 'e11', 'e21')" as="xs:string" />
<departments>
<xsl:for-each select="$depts">
<xsl:variable name="currentDept">
<xsl:value-of select="." />
</xsl:variable>
<department deptID="{$currentDept}">
<xsl:value-of select="doc(concat('dept',$currentDept, '.xml'))" />
<xsl:copy-of select="employees" />
</department>
</xsl:for-each>
</departments>
</xsl:template>
</xsl:stylesheet>
Should generate something similar to:
<?xml version="1.0" encoding="UTF-8"?>
<departments>
<department dept="a00">
<employees>
<employee empID="10">
<firstName>Marylin</firstName>
<middleInt>A</middleInt>
<lastName>Johnson</lastName>
<department>A00</department>
<phone>3978</phone>
<email>Johnson.60@example.com/horizons</email>
<dateHired>2000-01-01</dateHired>
<title>President</title>
<edLevel>18</edLevel>
<gender>F</gender>
<birthDate>1968-08-24</birthDate>
<salary>121300</salary>
<bonus>2300</bonus>
<commission>9700</commission>
</employee>
<employee empID="40">
<firstName>Heather</firstName>
<middleInt>D</middleInt>
<lastName>Gordon</lastName>
<department>A00</department>
<phone>3915</phone>
<email>Gordon.59@example.com/horizons</email>
<dateHired>2009-03-01</dateHired>
<title>Manager</title>
<edLevel>18</edLevel>
<gender>F</gender>
<birthDate>1986-06-03</birthDate>
<salary>85400</salary>
<bonus>1700</bonus>
<commission>6500</commission>
</employee>
</employees>
</department>
</departments>
If you use the as attribute on xsl:variable to declare the type of your variable then the value you select needs to fit that declaration, so given that you have a sequence of strings you need to use <xsl:variable name="depts" select="('a00', 'c01', 'd11', 'd21', 'e11', 'e21')" as="xs:string*" />.
Additionally the
<xsl:copy-of select="employees" />
inside the for-each over a string sequence doesn't make sense (and explains the error you get after correcting the variable type); there you simply want
<xsl:copy-of select="doc(concat('dept', ., '.xml'))/employees" />
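Putting both corrections together, the stylesheet might look like the sketch below. It assumes that each deptXXX.xml file sits next to the stylesheet and has employees as its document element, which is what the expected output suggests but the question does not state.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <xsl:template name="getEmployees">
    <!-- xs:string* declares a sequence of zero or more strings -->
    <xsl:variable name="depts" as="xs:string*"
        select="('a00', 'c01', 'd11', 'd21', 'e11', 'e21')"/>
    <departments>
      <xsl:for-each select="$depts">
        <!-- Inside this loop the context item is a string, not a node,
             so relative paths such as "employees" cannot be evaluated -->
        <xsl:variable name="currentDept" as="xs:string" select="."/>
        <department deptID="{$currentDept}">
          <!-- Assumption: employees is the document element of each file -->
          <xsl:copy-of
              select="doc(concat('dept', $currentDept, '.xml'))/employees"/>
        </department>
      </xsl:for-each>
    </departments>
  </xsl:template>
</xsl:stylesheet>
```

With Saxon on the command line, a named template can be used as the entry point with the -it option, for example -it:getEmployees together with -xsl:alldepartments.xsl and -o:horizons.xml.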

xslt 2.0: read in text files via collection()

I have a bunch of text files that I'd like to process with XSLT 2.0.
Here's how I try to read them in:
<xsl:variable name="input" select="collection(iri-to-uri('file:///.?select=*.txt'))" />
However, when I do this:
<xsl:message>
<xsl:sequence select="count($input)"/>
</xsl:message>
It outputs 0. No files are selected.
If I do it like this:
<xsl:variable name="input" select="collection(iri-to-uri('.?select=*.txt'))" />
I get the error that collection should return a node but is returning an xs:string.
What I would like to do is read each file, then iterate over the files and process the text, like this:
<xsl:for-each select="unparsed-text($input, 'UTF-8')">
<!-- tokenizing, etc. -->
How would I do that?
You need the XPath 3.0 uri-collection function, which is supported in version="3.0" stylesheets in Saxon 9.7 (all editions, including HE) and in 9.6 (commercial editions only, I think):
<xsl:template match="/" name="main">
<xsl:for-each select="uri-collection('.?select=*.txt')!unparsed-text(.)">
<xsl:message select="'Parsed:' || . || '&#10;'"/>
</xsl:for-each>
</xsl:template>
collection is supposed to return a sequence of nodes, whereas uri-collection can access other resources that are not parsable as XML.
With Altova XMLSpy or RaptorXML and XSLT 3.0 you can also use uri-collection. The way to select the files seems to differ a bit from Saxon: there you use uri-collection('*.txt') to access all .txt files in the directory.
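For the tokenizing step the question asks about, a minimal sketch along the same lines is shown below. It assumes an XSLT 3.0 processor, Saxon's ?select= query syntax for the directory listing, and that the relative URI resolves against the stylesheet location; the output format is only illustrative.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template name="main">
    <!-- uri-collection returns the URIs of the matching files,
         not their contents -->
    <xsl:for-each select="uri-collection('.?select=*.txt')">
      <xsl:variable name="uri" select="."/>
      <!-- unparsed-text-lines reads one file and splits it into lines -->
      <xsl:for-each select="unparsed-text-lines($uri)">
        <xsl:value-of select="$uri || ': ' || . || '&#10;'"/>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Each line arrives as a plain xs:string, so further splitting can be done with tokenize() inside the inner loop.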

xpath expression to select specific xml nodes that are available in a file

I am trying to find a solution to a rather unusual problem:
how do I write an XPath expression that selects only those XML nodes whose names are listed in a separate text file?
For instance,
<xsl:for-each select="SUBSCRIBER_PROFILE_LIST/SUBSCRIBER_PROFILE_INFO[GROUP_NAME eq (group name list in a text file as input)]">
For example,
<xsl:for-each select="SUBSCRIBER_PROFILE_LIST/SUBSCRIBER_PROFILE_INFO[GROUP_NAME eq collection('select_nodes.txt')]">
select_nodes.txt contains the list of strings that may be selected.
For example:
ABC
IJK
<SUBSCRIBER>
<MSISDN>123456</MSISDN>
<SUBSCRIBER_PROFILE_LIST>
<SUBSCRIBER_PROFILE_INFO>
<PROFILE_MSISDN>12345</PROFILE_MSISDN>
<GROUP_NAME>ABC</GROUP_NAME>
<GROUP_ID>18</GROUP_ID>
</SUBSCRIBER_PROFILE_INFO>
<SUBSCRIBER_PROFILE_INFO>
<PROFILE_MSISDN>456778</PROFILE_MSISDN>
<GROUP_NAME>DEF</GROUP_NAME>
<GROUP_ID>100</GROUP_ID>
</SUBSCRIBER_PROFILE_INFO>
<SUBSCRIBER_PROFILE_INFO>
<PROFILE_MSISDN>78876</PROFILE_MSISDN>
<GROUP_NAME>IJK</GROUP_NAME>
<GROUP_ID>3</GROUP_ID>
</SUBSCRIBER_PROFILE_INFO>
</SUBSCRIBER_PROFILE_LIST>
</SUBSCRIBER>
XSLT 2.0 has limited functionality for parsing arbitrary text files. I would suggest one of two options.
Either make select_nodes.txt an XML file and load it using the doc() function:
<xsl:variable name="group_names" as="xs:string *"
select="doc('select_nodes.xml')/groups/group"/>
with select_nodes.xml looking like this:
<?xml version="1.0" encoding="UTF-8"?>
<groups>
<group>ABC</group>
<group>IJK</group>
</groups>
Or pass the group names as a stylesheet parameter. (How you do this depends on which XSLT engine you're using and whether it's through the command line or an API.) If it's through an API, then you may be able to pass the values in directly as xs:string-typed objects. Otherwise you'll have to parse the parameter:
<xsl:param name="group_names_param"/>
<!-- Assuming the input string is a whitespace-separated list of names -->
<xsl:variable name="group_names" as="xs:string *"
select="tokenize($group_names_param, '\s+')"/>
In either case your for-each expression would then look like this:
<xsl:for-each select="
SUBSCRIBER_PROFILE_LIST/SUBSCRIBER_PROFILE_INFO[GROUP_NAME = $group_names]">
<!-- Do something -->
</xsl:for-each>
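Note that the predicate uses = rather than eq. The value comparison eq requires a single item on each side and raises a type error when $group_names holds more than one string, while the general comparison = is true if any pair of items matches. A self-contained sketch of the parameter variant follows; the default value 'ABC IJK' and the text output are assumptions made only so that the example runs on its own.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="text"/>

  <!-- Hypothetical default; normally supplied by the caller -->
  <xsl:param name="group_names_param" select="'ABC IJK'"/>

  <!-- normalize-space avoids empty tokens from leading or trailing
       white space -->
  <xsl:variable name="group_names" as="xs:string*"
      select="tokenize(normalize-space($group_names_param), ' ')"/>

  <xsl:template match="/SUBSCRIBER">
    <!-- "=" succeeds when GROUP_NAME equals any item of $group_names -->
    <xsl:for-each select="
        SUBSCRIBER_PROFILE_LIST/SUBSCRIBER_PROFILE_INFO[GROUP_NAME = $group_names]">
      <xsl:value-of select="PROFILE_MSISDN"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the sample input from the question, this should list the PROFILE_MSISDN values of the ABC and IJK profiles and skip DEF.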

apply templates select substring after

I've an XML line like the below.
<title>I. DEFINITION</title>
Here I'm getting the value before the '.', which works fine, but I also want to apply templates to the content after the '.'. I can't work out how to do that. I'm using the XSLT line below.
<xsl:apply-templates select="substring-after(.,'. ')"/>
When I use it, the following error is thrown:
XSLT 2.0 Debugging Error: Error: file:///C:/Users/u0138039/Desktop/Proview/HK/ArchboldHK2014/XSLT/Chapters.xsl:508: Not a node item - item has type xs:string with value 'DEFINITION' - Details: - XTTE0520: The result of evaluating the 'select' attribute of the <xsl:apply-templates> instruction may only contain nodes
Please let me know how I can apply templates to the content after the '.'.
Thanks.
You can try this template
<xsl:template match="title">
<xsl:copy>
<label><xsl:value-of select="substring-before(., '. ')"/></label>
<caption>
<xsl:variable name="slicetext" select="substring-after(current()/text()[1], '. ')"/>
<xsl:value-of select="$slicetext"/><xsl:apply-templates select="text()[position() > 1]|child::node()[not(self::text())]"/>
</caption>
</xsl:copy>
</xsl:template>
With XSLT 1.0 and 2.0 you can only write templates for, and apply templates to, nodes, not primitive values like strings. I think this changes in XSLT 3.0.
In XSLT 2.0, to process the result of substring-after further, you would need to write a function or a named template taking a string parameter.
If you really want to apply a template, you would first need to create a temporary text node with xsl:variable.
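A minimal sketch of the temporary text node approach is shown below. The variable name rest, the mode name caption, and the lower-casing template are hypothetical and only there to show that a template really gets applied.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="title">
    <xsl:copy>
      <label><xsl:value-of select="substring-before(., '. ')"/></label>
      <!-- The variable body builds a temporary tree that holds the
           string as a single text node -->
      <xsl:variable name="rest">
        <xsl:value-of select="substring-after(., '. ')"/>
      </xsl:variable>
      <caption>
        <!-- $rest/text() is a node, so apply-templates accepts it -->
        <xsl:apply-templates select="$rest/text()" mode="caption"/>
      </caption>
    </xsl:copy>
  </xsl:template>

  <!-- Hypothetical processing step for the caption text -->
  <xsl:template match="text()" mode="caption">
    <xsl:value-of select="lower-case(.)"/>
  </xsl:template>
</xsl:stylesheet>
```

One trade-off of this approach: taking the string value of title flattens any child elements it contains, so it suits titles that hold plain text only.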

Can I apply a character map to a given node?

If I look at the XSLT specs, it seems a character map applies to the whole document, but is it also possible to use one on a given node, or within a template?
Example: I have a node containing lookup values, but they might contain characters that don't play well with regular expressions when I use them in another template. For now I use the replace function, which works well, but after a few characters that becomes pretty hard to read or maintain. So if I have something like this:
<xsl:variable name="myLookup" select="
replace(
replace(
replace(
replace(
string-join(/*/lookup/*, '|'),
'\[','\\['),
'\]','\\]'),
'\(','\\('),
'\)','\\)')
"/>
is there a way to achieve something like the fictitious example below?
<xsl:character-map name="escapechar">
<xsl:output-character character="[" string="\[" />
<xsl:output-character character="]" string="\]" />
<xsl:output-character character="(" string="\(" />
<xsl:output-character character=")" string="\)" />
</xsl:character-map>
<xsl:variable name="myLookup" select="string-join(/*/lookup/*, '|')" use-character-map="escapechar"/>
I know this does not work at all; it is just meant to illustrate my request.
Any ideas?
I think character maps in XSLT 2.0 are a serialization feature, applied when a result tree is serialized to a file or stream, so I don't see how you could apply one to a certain string or a certain node during a transformation.
As for escaping meta characters of regular expression patterns, maybe http://www.xsltfunctions.com/xsl/functx_escape-for-regex.html helps.
Character maps are only a serialization feature, which means that they are applied only when the final output of a transformation is produced. However, you can significantly simplify your current code.
Just use:
replace($pStr, '(\[|\]|\(|\))','\\$1')
Here is a complete example:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:my="my:my">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/*">
<xsl:value-of select="my:escape(.)"/>
</xsl:template>
<xsl:function name="my:escape" as="xs:string">
<xsl:param name="pStr" as="xs:string"/>
<xsl:value-of select="replace($pStr, '(\[|\]|\(|\))','\\$1')"/>
</xsl:function>
</xsl:stylesheet>
When this transformation is applied on the following XML document:
<t>([a-z]*)</t>
the wanted, correct result is produced:
\(\[a-z\]*\)
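To connect this back to the lookup variable from the question, the function can be applied to each lookup value before the values are joined with '|'. The sketch below assumes the /*/lookup/* structure from the question and reuses the my:escape function from the answer, written here with xsl:sequence so that it returns the string directly.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="my:my"
    exclude-result-prefixes="xs my">
  <xsl:output method="text"/>

  <!-- Escapes [, ], ( and ) with a backslash each -->
  <xsl:function name="my:escape" as="xs:string">
    <xsl:param name="pStr" as="xs:string"/>
    <xsl:sequence select="replace($pStr, '(\[|\]|\(|\))', '\\$1')"/>
  </xsl:function>

  <!-- Escape every lookup value first, then join with the regex
       alternation operator -->
  <xsl:variable name="myLookup" as="xs:string"
      select="string-join(
                for $v in /*/lookup/* return my:escape(string($v)),
                '|')"/>

  <xsl:template match="/">
    <xsl:value-of select="$myLookup"/>
  </xsl:template>
</xsl:stylesheet>
```

Escaping before joining keeps the '|' separators out of the function's reach, which matters once the escaped character set grows to include '|' itself.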
