XSLT 2.0: read in text files via collection()

I have a bunch of text files that I'd like to process with XSLT 2.0.
Here's how I try to read them in:
<xsl:variable name="input" select="collection(iri-to-uri('file:///.?select=*.txt'))" />
However, when I do this:
<xsl:message>
<xsl:sequence select="count($input)"/>
</xsl:message>
It outputs 0. No files are selected.
If I do it like this:
<xsl:variable name="input" select="collection(iri-to-uri('.?select=*.txt'))" />
I get the error that collection should return a node but is returning an xs:string.
What I would like to do is read each file and then iterate over the files and process the text, like this:
<xsl:for-each select="unparsed-text($input, 'UTF-8')">
<!-- tokenizing, etc. -->
How would I do that?

You need the XPath 3.0 uri-collection function, which is supported in version="3.0" stylesheets in Saxon 9.7 (all editions, including HE) and, I think, in the commercial editions of 9.6:
<xsl:template match="/" name="main">
<xsl:for-each select="uri-collection('.?select=*.txt')!unparsed-text(.)">
<xsl:message select="'Parsed:' || . || '&#10;'"/>
</xsl:for-each>
</xsl:template>
collection is supposed to return a sequence of nodes, while uri-collection can access other resources that are not parsable as XML.
With Altova XMLSpy or RaptorXML and XSLT 3.0 you can also use uri-collection. The way to select the files seems to differ a bit from Saxon: there you use uri-collection('*.txt') to access all .txt files in the directory.
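For the tokenizing step the question asks about, here is a minimal sketch along the same lines. It assumes an XSLT 3.0 processor that accepts the Saxon-style ?select= query and resolves the relative URI against the stylesheet's base URI. The per-line token count is only an illustration, not part of the original answer.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- Named entry template, as in the answer above -->
  <xsl:template match="/" name="main">
    <!-- One URI per matching file; the ?select= syntax is Saxon-specific -->
    <xsl:for-each select="uri-collection('.?select=*.txt')">
      <xsl:variable name="uri" select="."/>
      <!-- unparsed-text-lines() (XPath 3.0) returns the file as a sequence of lines -->
      <xsl:for-each select="unparsed-text-lines($uri, 'UTF-8')">
        <!-- Split each line on whitespace, dropping empty tokens -->
        <xsl:variable name="tokens" select="tokenize(., '\s+')[. ne '']"/>
        <xsl:value-of select="$uri || ': ' || count($tokens) || ' token(s)&#10;'"/>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Binding the URI to a variable before the inner for-each keeps it available after the context item changes to the individual line.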

Related

xpath expression to select specific xml nodes that are available in a file

I am trying to find a solution to a strange problem.
How can I write an XPath expression that selects only the XML nodes whose names are listed in a separate text file?
For instance:
<xsl:for-each select="SUBSCRIBER_PROFILE_LIST/SUBSCRIBER_PROFILE_INFO[GROUP_NAME eq (group name list in a text file as input)]">
For example,
<xsl:for-each select="SUBSCRIBER_PROFILE_LIST/SUBSCRIBER_PROFILE_INFO[GROUP_NAME eq collection('select_nodes.txt')]">
select_nodes.txt contains the list of strings that may be selected, for example:
ABC
IJK
<SUBSCRIBER>
<MSISDN>123456</MSISDN>
<SUBSCRIBER_PROFILE_LIST>
<SUBSCRIBER_PROFILE_INFO>
<PROFILE_MSISDN>12345</PROFILE_MSISDN>
<GROUP_NAME>ABC</GROUP_NAME>
<GROUP_ID>18</GROUP_ID>
</SUBSCRIBER_PROFILE_INFO>
<SUBSCRIBER_PROFILE_INFO>
<PROFILE_MSISDN>456778</PROFILE_MSISDN>
<GROUP_NAME>DEF</GROUP_NAME>
<GROUP_ID>100</GROUP_ID>
</SUBSCRIBER_PROFILE_INFO>
<SUBSCRIBER_PROFILE_INFO>
<PROFILE_MSISDN>78876</PROFILE_MSISDN>
<GROUP_NAME>IJK</GROUP_NAME>
<GROUP_ID>3</GROUP_ID>
</SUBSCRIBER_PROFILE_INFO>
</SUBSCRIBER_PROFILE_LIST>
</SUBSCRIBER>
XSLT2 has limited functionality for parsing arbitrary text files. I would suggest:
Make the select_nodes.txt an XML file and load it using the doc() function:
<xsl:variable name="group_names" as="xs:string *"
select="doc('select_nodes.xml')/groups/group"/>
with select_nodes.xml looking like this:
<?xml version="1.0" encoding="UTF-8"?>
<groups>
<group>ABC</group>
<group>IJK</group>
</groups>
Pass the group names as a stylesheet parameter. (How you do this depends on which XSLT engine you're using and whether it's through the command line or an API.) If it's through an API, then you may be able to pass the values in directly as xs:string-typed objects. Otherwise you'll have to parse the parameter:
<xsl:param name="group_names_param"/>
<!-- Assuming the input string is a whitespace-separated list of names -->
<xsl:variable name="group_names" as="xs:string *"
select="tokenize($group_names_param, '\s+')"/>
In either case your for-each expression would then look like this:
<xsl:for-each select="
SUBSCRIBER_PROFILE_LIST/SUBSCRIBER_PROFILE_INFO[GROUP_NAME = $group_names]">
<!-- Do something -->
</xsl:for-each>
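If converting the file is not an option, XSLT 2.0's unparsed-text() function can read a plain-text file directly, and tokenize() can split it into lines. The following is a minimal sketch, assuming select_nodes.txt holds one group name per line with no surrounding whitespace and sits next to the stylesheet. The selected wrapper element is a hypothetical name used only for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <!-- Read the plain-text list and split it into lines, dropping blank lines -->
  <xsl:variable name="group_names" as="xs:string*"
      select="tokenize(unparsed-text('select_nodes.txt', 'UTF-8'), '\r?\n')[normalize-space(.)]"/>
  <xsl:template match="SUBSCRIBER">
    <!-- 'selected' is a hypothetical wrapper element -->
    <selected>
      <!-- General comparison (=) is true if GROUP_NAME equals any listed name -->
      <xsl:for-each select="SUBSCRIBER_PROFILE_LIST/SUBSCRIBER_PROFILE_INFO[GROUP_NAME = $group_names]">
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </selected>
  </xsl:template>
</xsl:stylesheet>
```

With the sample input and a list containing ABC and IJK, this should copy the first and third SUBSCRIBER_PROFILE_INFO elements.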

In XSLT, is it normal that a variable set to something like name(..) is computed at time of use?

I have a couple of trees in my XML and want to access one in terms of a name in the other. Here it is called the tab_name, and it is the parent tag of the current node, so I use name(..). That gives me the correct value if I test at the same location where I set the variable.
However, the problem I have is that when I reference $tab_name a few lines below (in the <xsl:when> tag) the name(..) is applied to the current context so I get the tag "group" instead of what I would otherwise expect.
<xsl:variable name="tab_name" select="name(..)"/>
<legend>
<xsl:for-each select="/snap/page/body/client/group/*">
<xsl:choose>
<xsl:when test="name(.) = $tab_name"> <!-- $tab_name = 'group' here! -->
...
</xsl:when>
</xsl:choose>
</xsl:for-each>
</legend>
Is that the normal/expected behavior of XSLT 2.0? I was thinking that the variable would be set in its own for-each context (for-each not shown here) and not the new sub-for-each context.
Here are full XSLT and XML documents to reproduce the problem with xmlpatterns (the Qt XML parser).
XSLT (say a.xsl):
<?xml version="1.0"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:fn="http://www.w3.org/2005/xpath-functions"
xmlns:snap="snap:snap">
<xsl:template match="snap">
<xsl:for-each select="page/body/client/data_field/*">
Direct name = <xsl:value-of select="name(.)"/> [correct, getting 'dog']
<xsl:for-each select="*">
<xsl:variable name="tab_name" select="name(..)"/>
Parent name = <xsl:value-of select="$tab_name"/> [correct, getting 'dog']
<xsl:message>Message has no side-effects... <xsl:value-of select="$tab_name"/></xsl:message>
<xsl:for-each select="/snap/page/body/client/group">
Inside other for-each tab_name = <xsl:value-of select="$tab_name"/> [incorrect, getting 'client']
</xsl:for-each>
</xsl:for-each>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
<!-- vim: ts=2 sw=2
-->
XML (say a.xml):
<!DOCTYPE snap>
<snap>
<page>
<body layout-name="finball">
<client>
<group>
<cat>Jolly</cat>
<dog>Bear</dog>
</group>
<data_field>
<cat>
<div>All about Cats</div>
</cat>
<dog>
<div>All about Dogs</div>
</dog>
</data_field>
</client>
</body>
</page>
</snap>
<!--
vim: ts=2 sw=2 et
-->
Command I use to reproduce the problem:
xmlpatterns a.xsl a.xml
The output is incorrect:
Direct name = cat [correct, getting 'cat']
Parent name = cat [correct, getting 'cat']
Inside other for-each tab_name = client [incorrect, getting 'client']
Direct name = dog [correct, getting 'dog']
Parent name = dog [correct, getting 'dog']
Inside other for-each tab_name = client [incorrect, getting 'client']
(As a detail: I'm using the Qt XSLT 2.0 implementation. If this behavior is not normal, then the Qt implementation is what is broken.)
This is clearly a bug in the Qt XSLT 2.0 processor. At their website they claim that they only partially support XSLT 2.0. However, this behavior of assigning variables has not changed between 1.0 and 2.0.
"In XSLT, is it normal that a variable set to something like name(..) is computed at time of use?"
Yes, it is normal that it is computed "at the time it is used", to prevent unnecessary overhead. Most processors (including ours, in most cases) will calculate variables once and only when used. However, they must be calculated in light of their declaration context, which is clearly not what's happening here:
The declaration of the variable defines its focus and context, not the place where the variable is referenced.
When I run your input XML and stylesheet against Saxon or Exselt (or probably any other processor out there that I haven't tried), it gives the following output (indentation and blank lines removed for clarity):
<?xml version="1.0" encoding="UTF-8"?>
Direct name = cat [correct, getting 'dog']
Parent name = cat [correct, getting 'dog']
Inside other for-each tab_name = cat [incorrect, getting 'client']
Direct name = dog [correct, getting 'dog']
Parent name = dog [correct, getting 'dog']
Inside other for-each tab_name = dog [incorrect, getting 'client']
As you can see, it is either always "dog" or always "cat", as it should be.
I suggest you file a bug against this processor or, considering it is open source and if you have the time, help them out to fix it at the source ;).

Can I apply a character map to a given node?

If I look at the XSLT specs, it seems a character map applies to the whole document, but is it also possible to use it on a given node, or within a template?
Example: I have a node containing lookup values, but they might contain characters that don't play well with regular expressions when used in another template. For now I use the replace function, which works well, but after a few characters that becomes pretty hard to read or maintain. So if I have something like this:
<xsl:variable name="myLookup" select="
replace(
replace(
replace(
replace(
string-join(/*/lookup/*, '|'),
'\[','\\['),
'\]','\\]'),
'\(','\\('),
'\)','\\)')
"/>
is there a way to achieve something like below fictitious example ?
<xsl:character-map name="escapechar">
<xsl:output-character character="[" string="\[" />
<xsl:output-character character="]" string="\]" />
<xsl:output-character character="(" string="\(" />
<xsl:output-character character=")" string="\)" />
</xsl:character-map>
<xsl:variable name="myLookup" select="string-join(/*/lookup/*, '|')" use-character-map="escapechar"/>
I know this is not working at all, it is just to make my request a bit visual.
Any idea ?
I think character maps in XSLT 2.0 are a serialization feature to be applied when a result tree is serialized to a file or stream so I don't see how you could apply one to a certain string or certain node during a transformation.
As for escaping meta characters of regular expression patterns, maybe http://www.xsltfunctions.com/xsl/functx_escape-for-regex.html helps.
Character maps are only a serialization feature, which means that they are only applied when the final output of a transformation is produced. However, you can significantly simplify your current code.
Just use:
replace($pStr, '(\[|\]|\(|\))','\\$1')
Here is a complete example:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:my="my:my">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/*">
<xsl:value-of select="my:escape(.)"/>
</xsl:template>
<xsl:function name="my:escape" as="xs:string">
<xsl:param name="pStr" as="xs:string"/>
<xsl:sequence select="replace($pStr, '(\[|\]|\(|\))','\\$1')"/>
</xsl:function>
</xsl:stylesheet>
When this transformation is applied on the following XML document:
<t>([a-z]*)</t>
the wanted, correct result is produced:
\(\[a-z\]*\)
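Applied to the lookup variable from the question, the same function can replace the nested replace() calls. This is a minimal sketch that reuses my:escape and the /*/lookup/* path from the question; the input document mentioned below is a hypothetical example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="my:my"
    exclude-result-prefixes="xs my">
  <xsl:output method="text"/>

  <!-- Same escaping as above: prefix [ ] ( ) with a backslash -->
  <xsl:function name="my:escape" as="xs:string">
    <xsl:param name="pStr" as="xs:string"/>
    <xsl:sequence select="replace($pStr, '(\[|\]|\(|\))', '\\$1')"/>
  </xsl:function>

  <!-- Escape the joined lookup values once; '|' is left alone as the alternation operator -->
  <xsl:variable name="myLookup" as="xs:string"
      select="my:escape(string-join(/*/lookup/*, '|'))"/>

  <xsl:template match="/">
    <xsl:value-of select="$myLookup"/>
  </xsl:template>
</xsl:stylesheet>
```

For a hypothetical input such as <root><lookup><v>a(1)</v><v>b[2]</v></lookup></root>, the variable holds a\(1\)|b\[2\], which can then be passed to matches() or replace() as a pattern. Other regex metacharacters (such as . or *) would need to be added to the character list if the lookup values can contain them.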

Umbraco setting image error

In my Umbraco Settings section, I don't have any image icon for inserting an image that was saved in my Media folder; only the "insert Umbraco page field" and "insert Umbraco macro" icons are there. I tried many links, but without success. I'm using Umbraco v3.0.5.
If I understand you correctly, you're modifying a template (you have access to insert field and insert macro) and you want to add an image that is saved in your media section.
Are you really using Umbraco 3? That is a few years old now, the syntax is very different, and you will probably need to use XSLT. For example, inserting a macro looks different in a template for v3 and v4 (v5 has been deprecated and v6 is not live yet).
(http://our.umbraco.org/wiki/reference/templates/umbracomacro-element/macro-parameters/advanced-macro-parameter-syntax)
Umbraco Version 3:
<?UMBRACO_MACRO macroAlias="RenderProperties" pageValue="[#bodyText]" />
Umbraco Version 4:
<umbraco:macro alias="RenderProperties" pagevalue="[#bodyText]" runat="server"/>
In the older versions of umbraco putting an image onto a page from media required writing some xslt and referring to it in a macro - this example (that I've dredged up) would display the image that was picked for a page with an alias of 'imageAliasName'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xsl:stylesheet [ <!ENTITY nbsp "&#x00A0;"> ]>
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxml="urn:schemas-microsoft-com:xslt"
xmlns:umbraco.library="urn:umbraco.library" xmlns:Exslt.ExsltCommon="urn:Exslt.ExsltCommon" xmlns:Exslt.ExsltDatesAndTimes="urn:Exslt.ExsltDatesAndTimes" xmlns:Exslt.ExsltMath="urn:Exslt.ExsltMath" xmlns:Exslt.ExsltRegularExpressions="urn:Exslt.ExsltRegularExpressions" xmlns:Exslt.ExsltStrings="urn:Exslt.ExsltStrings" xmlns:Exslt.ExsltSets="urn:Exslt.ExsltSets" xmlns:umbraco.contour="urn:umbraco.contour" xmlns:PS.XSLTsearch="urn:PS.XSLTsearch"
exclude-result-prefixes="msxml umbraco.library Exslt.ExsltCommon Exslt.ExsltDatesAndTimes Exslt.ExsltMath Exslt.ExsltRegularExpressions Exslt.ExsltStrings Exslt.ExsltSets umbraco.contour PS.XSLTsearch ">
<xsl:output method="xml" omit-xml-declaration="yes"/>
<xsl:param name="currentPage"/>
<xsl:template match="/">
<xsl:variable name="mediaId" select="number($currentPage/imageAliasName)" />
<xsl:if test="$mediaId > 0">
<xsl:variable name="mediaNode" select="umbraco.library:GetMedia($mediaId, 0)" />
<xsl:if test="$mediaNode/umbracoFile">
<img>
<xsl:attribute name="src">
<xsl:text>/ImageGen.ashx?image=</xsl:text>
<xsl:value-of select="$mediaNode/umbracoFile"/>
<xsl:text>&amp;width=200</xsl:text>
<xsl:text>&amp;height=200</xsl:text>
</xsl:attribute>
</img>
</xsl:if>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
In v3 I inserted the image into one of my content pages, and from there I could see the ID of my image. I used that ID in the code in my Settings section, and with that the image was displayed successfully.

What is the ROUND function behavior in Orbeon Forms?

In Orbeon Forms (dev-post-3.7.1.200911140400) we have code in an XPL that does some calculation, and as part of this calculation we round the result to 2 decimal places. Below is an example of the code we use to do the calculation:
<xsl:when test="$total_c_w != 0">
<gpa><xsl:value-of select="(round(($total_p_c_w div $total_c_w) * 100) div 100)"/></gpa>
</xsl:when>
According to the standard XPath documentation, the round function rounds a numeric value to the nearest whole number, rounding x.5 towards positive infinity.
But we encounter a case where the round function is rounding 237.5 to 237, instead of 238. This is just an example; there are other cases where a similar problem involving x.5 is happening.
For the example mentioned, the values in the calculation are:
$total_p_c_w = 7.6, $total_c_w = 3.2
=====================================================
Alex,
Thanks for the guidance. I did some more debugging and found something weird. Please refer to the following XSL code, which I tested on the latest Orbeon Forms 3.9.0.201105152046 CE.
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0">
<xsl:template match="/">
<main>
<xsl:variable name="tppcw" select="/root/studentpcw/total_p_c_w" as="xs:double"/>
<xsl:variable name="tccw" select="/root/studentpcw/total_c_w" as="xs:double"/>
<xsl:variable name="tpcw" select="sum(/root/studentpcw/total_p_c_w)"/>
<xsl:variable name="tcw" select="sum(/root/studentpcw/total_c_w)"/>
<xsl:variable name="total_p_c_w" select="7.6"/>
<xsl:variable name="total_c_w" select="3.2"/>
<var1><xsl:value-of select="sum(/root/studentpcw/total_p_c_w)"/></var1>
<var2><xsl:value-of select="sum(/root/studentpcw/total_c_w)"/></var2>
<var3><xsl:value-of select="$total_p_c_w"/></var3>
<var4><xsl:value-of select="$total_c_w"/></var4>
<result1>
<xsl:value-of select="round(($total_p_c_w div $total_c_w) * 100)"/>
</result1>
<result2>
<xsl:value-of select="round(($tppcw div $tccw) * 100)"/>
</result2>
</main>
</xsl:template>
</xsl:transform>
Apply the above code to this sample document:
<root>
<studentpcw>
<total_p_c_w>7.6</total_p_c_w>
<total_c_w>3.2</total_c_w>
</studentpcw>
</root>
The result is quite unexpected:
<main xmlns:xs="http://www.w3.org/2001/XMLSchema">
<var1>7.6</var1>
<var2>3.2</var2>
<var3>7.6</var3>
<var4>3.2</var4>
<result1>238</result1>
<result2>237</result2>
</main>
The problem is that if I assign a literal number to the variable and use that variable in the rounding function, the result is as I expected. If I select the value from a node and assign to the variable and use that variable in the rounding function, the result is wrong or unexpected.
I get a result of 238 running the following stylesheet in a nightly build through the XSLT sandbox (which, if you have Orbeon Forms installed locally, you can access through http://localhost:8080/orbeon/sandbox-transformations/xslt/).
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0">
<xsl:template match="/">
<result>
<xsl:variable name="total_p_c_w" select="7.6"/>
<xsl:variable name="total_c_w" select="3.2"/>
<xsl:value-of select="round(($total_p_c_w div $total_c_w) * 100)"/>
</result>
</xsl:template>
</xsl:transform>
I believe this is the result you expected, but that you might be getting something different with the version you are using. Could you try the above example in the XSLT sandbox of the version you are using? If you're getting 237 instead of 238, then this is a sign that this issue has been fixed, and I would then recommend you to upgrade to a newer version of Orbeon Forms (currently 3.9).
(Note that this is most likely not something in Orbeon Forms per se, but in the XSLT implementation Orbeon Forms uses, which is the excellent Saxon.)
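One likely explanation for the difference between result1 and result2 in the follow-up, offered as an assumption rather than a confirmed diagnosis: in XPath 2.0 the literals 7.6 and 3.2 are xs:decimal, so the division is exact, whereas as="xs:double" (like sum() over untyped nodes) uses binary floating point, where the quotient can land just below 2.375 and the product just below 237.5. A minimal sketch that casts the node values to xs:decimal instead, reusing the variable names and sample document from the follow-up:

```xml
<xsl:transform version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:template match="/">
    <!-- Cast the untyped node values to xs:decimal rather than xs:double -->
    <xsl:variable name="tppcw" select="xs:decimal(/root/studentpcw/total_p_c_w)"/>
    <xsl:variable name="tccw" select="xs:decimal(/root/studentpcw/total_c_w)"/>
    <gpa>
      <!-- Decimal arithmetic: 7.6 div 3.2 is exactly 2.375, so round() sees 237.5 -->
      <xsl:value-of select="round(($tppcw div $tccw) * 100) div 100"/>
    </gpa>
  </xsl:template>
</xsl:transform>
```

With the sample document above, the decimal path rounds 237.5 up to 238, which gives a gpa of 2.38.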
