How to select a certain number of descendants of a node using xpath? - xpath-1.0

Say I had the following XML:
<a>
<b>
<c>
<d />
<e />
</c>
</b>
<g>
<b>
<h />
<f />
</b>
</g>
</a>
If I want to select all the descendants of the node 'b', I can use the following XPath query:
//b//*
or, using axes:
//b/descendant::*
But I want to select only 4 descendants of the node 'b'. Does anyone know how to do it, please?
PS: I'm using XPath 1.0.

//c/descendant::*[position() <= 4]

It's settled! I just needed to use parentheses, like this:
(//b/descendant::*)[position()<=4]
Without them, the [position() <= 4] predicate is evaluated separately for each 'b' context node, so it limits the descendants taken from each 'b' rather than filtering by position in the combined result node set.
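The difference between the two forms can be mimicked outside XPath. The following is only a sketch in Python using the standard library's xml.etree.ElementTree, which cannot evaluate position() predicates itself, so the two behaviours are imitated with list slicing. The sample document is the XML from the question with an assumed closing </a> tag, and the names descendants, per_b and first_four are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# The XML from the question, with the closing </a> assumed.
XML = """<a>
  <b><c><d/><e/></c></b>
  <g><b><h/><f/></b></g>
</a>"""

root = ET.fromstring(XML)


def descendants(node):
    # iter() yields the node itself first, then its descendants in document order.
    return list(node.iter())[1:]


# Imitates //b/descendant::*[position() <= 4]:
# the limit of 4 is applied to each 'b' separately.
per_b = [d.tag for b in root.iter("b") for d in descendants(b)[:4]]

# Imitates (//b/descendant::*)[position() <= 4]:
# all descendants are collected first, then the first 4 are kept.
# (This simple concatenation assumes no 'b' is nested inside another 'b'.)
combined = [d for b in root.iter("b") for d in descendants(b)]
first_four = [d.tag for d in combined[:4]]

print(per_b)       # ['c', 'd', 'e', 'h', 'f']
print(first_four)  # ['c', 'd', 'e', 'h']
```

With this sample, the per-node form keeps all five descendants because neither 'b' has more than four, while the parenthesised form stops after the fourth node overall.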

Related

xslt2: sequence of attribute nodes

This is not really a question but an astonishing xslt2 experience that I like to share.
Take the snippet (subtract one set from another)
<xsl:variable name="v" as="node()*">
<e a="a"/>
<e a="b"/>
<e a="c"/>
<e a="d"/>
</xsl:variable>
<xsl:message select="$v/@a[not(.=('b','c'))]"/>
<ee>
<xsl:sequence select="$v/@a[not(.=('b','c'))]"/>
</ee>
What should I expect to get?
I expected a d at the console and
<ee>a d</ee>
at the output.
What I got is
<?attribute name="a" value="a"?><?attribute name="a" value="d"?>
at the console and
<ee a="d"/>
at the output. I should have known to take $v/@a as a sequence of attribute nodes to predict the output.
In order to get what I wanted, I had to convert the sequence of attributes to a sequence of strings like:
<xsl:variable name="w" select="$v/@a[not(.=('b','c'))]" as="xs:string*"/>
Questions:
Is there any use of sequences of attributes (or is it just an interesting effect of the node set concept)?
If so, would I be able to enter statically a sequence of attributes like I am able to enter a sequence of strings: ('a','b','c','d')
Is there any inline syntax to convert a sequence of attributes to a sequence of strings? (In order to achieve the same result omitting the variable w)
It seems to be an elegant way for creating attributes using xsl:sequence. Or would that be a misuse of xslt2, not covered by the standard?
As for "Is there any inline syntax to convert a sequence of attributes to a sequence of strings", you can simply add a step: $v/@a[not(.=('b','c'))]/string(). Or use for $a in $v/@a[not(.=('b','c'))] return string($a), and of course in XPath 3, $v/@a[not(.=('b','c'))]!string().
I am not sure what the question about the "use of sequences of attributes" is about, in particular as it then mentions the XPath 1 concept of node sets. If you want to write a function or template to return some original attribute nodes from an input then xsl:sequence allows that. Of course, inside a sequence constructor like the contents of an element, if you look at 10) in https://www.w3.org/TR/xslt20/#constructing-complex-content, in the end a copy of the attribute is created.
As for creating a sequence of attributes, you can't do that in XPath which can't create new nodes, you can however do that in XSLT:
<xsl:variable name="att-sequence" as="attribute()*">
<xsl:attribute name="a" select="1"/>
<xsl:attribute name="b" select="2"/>
<xsl:attribute name="c" select="3"/>
</xsl:variable>
then you can use it elsewhere, as in
<xsl:template match="/*">
<xsl:copy>
<element>
<xsl:sequence select="$att-sequence"/>
</element>
<element>
<xsl:value-of select="$att-sequence"/>
</element>
</xsl:copy>
</xsl:template>
and will get
<example>
<element a="1" b="2" c="3"/>
<element>1 2 3</element>
</example>
http://xsltfiddle.liberty-development.net/jyyiVhg
XQuery has a more compact syntax and in contrast to XPath allows expressions to create new nodes:
let $att-sequence as attribute()* := (attribute a {1}, attribute b {2}, attribute c {3})
return
<example>
<element>{$att-sequence}</element>
<element>{data($att-sequence)}</element>
</example>
http://xqueryfiddle.liberty-development.net/948Fn56

How can I access an XSLT child node relative to my current for-each?

Using XSLT 2.0, I have something like
<A>
<item1/>
<item2/>
<B>
<itema/>
<itemb/>
</B>
</A>
I know each A will contain one and only one B. I can't change the XML. How can I write XSLT to display these all as if there were no B nodes, like
item1 item2 itemA itemB
I tried what seemed sort of syntactically and logically ok, hoping it might recognize it, but apparently not supported:
<xsl:for-each select="A">
<td><xsl:value-of select="item1"/></td>
<td><xsl:value-of select="item2"/></td>
<td><xsl:value-of select="B/itema"/></td>
<td><xsl:value-of select="B/itemb"/></td>
</xsl:for-each>
I can understand why this would be problematic since the parser would wonder, for the general case, "which B does he want??"..

Generate Xpath from parsed HTML with Ruby on Rails

Given the following example HTML:
<table cellpadding="4" cellspacing="0" border="0" width="100%">
<tbody>
<tr bgcolor="#FFE4D8" valign="top">
<td>in the next 20 minutes you will learn how to create a winter landscape. For this excersize you do need to have only a basic experience in Lightwave, so lets just start with it.<br>
</tbody>
</table>
How could I auto-generate an XPath expression to the tag that contains "20 minutes", in the same manner that Firepath does? Is this possible to do from within Ruby?
Assuming the text is not broken into different tags, you could find the lowest leaf node by
//*[contains(text(),'20 minutes')]
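You can then generate the string of parents by adding /.. at the end of the XPath until you reach the root element html. At every step you will also need the position of the element among its siblings of the same name. Note that position() cannot be used as a path step in XPath 1.0; counting the preceding siblings of the same name works instead, e.g. for a td
count(//*[contains(text(),'20 minutes')]/preceding-sibling::td) + 1
and for higher elements, e.g. the tr above it
count(//*[contains(text(),'20 minutes')]/../preceding-sibling::tr) + 1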
After you know each tag name and position, you can build the path
/html[1]/body[1]/div[x]/table[y]/tbody[z]/tr[1]/td[1]
With x, y, z being placeholders.
Since I don't know Ruby, I cannot provide source code, but this is an easy algorithm. The good thing is that you can implement it with any DOM parser that knows XPath. It may be possible to optimize it considerably if the DOM parser has a method for returning the parent of a node, because selecting the parent through XPath at every step is slow and not viable for many or long documents.
It looks like REXML for Ruby supports a parent method.
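The algorithm described above can be sketched as follows. This is not Ruby and not what Firepath does internally; it is a minimal illustration in Python with the standard library's xml.etree.ElementTree, assuming the markup is well-formed (the HTML from the question is not, so a small well-formed stand-in is used). The function name absolute_xpath and the sample document are hypothetical. ElementTree has no parent pointers, so a child-to-parent map is built once instead of selecting /.. repeatedly.

```python
import xml.etree.ElementTree as ET

# A small well-formed stand-in for the page (assumption, not the original HTML).
HTML = """<html><body>
<div>intro</div>
<div><table><tbody>
<tr><td>other</td></tr>
<tr><td>skip</td><td>in the next 20 minutes you will learn</td></tr>
</tbody></table></div>
</body></html>"""


def absolute_xpath(root, target):
    """Build /tag[n]/tag[n]/... from the root down to target."""
    # Map every child to its parent once, so walking upwards is cheap.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while node is not None:
        parent = parent_of.get(node)
        if parent is None:
            index = 1  # the root element has no siblings
        else:
            # 1-based position among siblings with the same tag name
            same_name = [s for s in parent if s.tag == node.tag]
            index = same_name.index(node) + 1
        steps.append(f"{node.tag}[{index}]")
        node = parent
    return "/" + "/".join(reversed(steps))


root = ET.fromstring(HTML)
# Roughly what //*[contains(text(),'20 minutes')] selects: the first element
# whose leading text contains the phrase.
target = next(el for el in root.iter() if el.text and "20 minutes" in el.text)
path = absolute_xpath(root, target)
print(path)  # /html[1]/body[1]/div[2]/table[1]/tbody[1]/tr[2]/td[2]
```

Real-world HTML usually needs a tolerant HTML parser first; the path-building step stays the same once a tree with parent access is available.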
You can try to build the XPath yourself with the jini library.
xpath = Jini.new('parent')
.add_path('child')
.add_attr('key', 'value')
.to_s
puts xpath # => parent/child[@key="value"]

Dynamic 'matches' statement in XSLT

I'm trying to create an xslt function that dynamically 'matches' for an element. In the function, I will pass two parameters - item()* and a comma delimited string. I tokenize the comma delimited string in a <xsl:for-each> select statement and then do the following:
select="concat('$di:meta[matches(@domain,''', current(), ''')][1]')"
Instead of the select statement 'executing' the xquery, it is just returning the string.
How can I get it to execute the xquery?
Thanks in advance!
The problem is that you are wrapping too much of the expression in the concat() function. When that evaluates, it returns a string that would be the XPath expression, rather than evaluating the XPath expression that uses the dynamic string for the REGEX match expression.
You want to use:
<xsl:value-of select="$di:meta[matches(@domain
,concat('.*('
,current()
,').*')
,'i')][1]" />
Although, since you are now evaluating each term separately, rather than having all of the terms in a single regex pattern and selecting the first match, it will now return the first result for each term, rather than the first one from the sequence of matched items. That may or may not be what you want.
If you want the first item from the sequence of matched items, you could do something like this:
<!--Create a variable and assign a sequence of matched items -->
<xsl:variable name="matchedMetaSequence" as="node()*">
<!--Iterate over the sequence of names that we want to match on -->
<xsl:for-each select="tokenize($csvString,',')">
<!--Build the sequence(list) of matched items,
snagging the first one that matches each value -->
<xsl:sequence select="$di:meta[matches(@domain
,concat('.*('
,current()
,').*')
,'i')][1]" />
</xsl:for-each>
</xsl:variable>
<!--Return the first item in the sequence from matching on
the list of domain regex fragments -->
<xsl:value-of select="$matchedMetaSequence[1]" />
You could also put this into a custom function like this:
<xsl:function name="di:findMeta">
<xsl:param name="meta" as="element()*" />
<xsl:param name="names" as="xs:string" />
<xsl:for-each select="tokenize(normalize-space($names),',')">
<xsl:sequence select="$meta[matches(@domain
,concat('.*('
,current()
,').*')
,'i')][1]" />
</xsl:for-each>
</xsl:function>
and then use it like this:
<xsl:value-of select="di:findMeta($di:meta,'foo,bar,baz')[1]"/>
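For readers who want to check the selection logic outside XSLT, here is a rough Python analogue of the di:findMeta function above. It is a sketch under assumptions: the meta elements are modelled as plain dictionaries with a domain key, the sample data is invented, and each token is trimmed individually rather than normalizing the whole string. Like XPath's matches(), re.search succeeds when the pattern matches anywhere in the value.

```python
import re


def find_meta(metas, names):
    """For each comma-separated name, collect the first meta whose domain matches."""
    results = []
    for name in names.split(","):
        # Same pattern shape as the XSLT: .*(name).* with the 'i' flag.
        pattern = re.compile(".*(" + name.strip() + ").*", re.IGNORECASE)
        first = next((m for m in metas if pattern.search(m["domain"])), None)
        if first is not None:
            results.append(first)
    return results


# Hypothetical data standing in for the $di:meta elements.
metas = [
    {"domain": "example.BAR.org", "value": 1},
    {"domain": "foo.example", "value": 2},
    {"domain": "foo.other", "value": 3},
]

matched = find_meta(metas, "foo,bar,baz")
print([m["value"] for m in matched])  # [2, 1]
print(matched[0]["value"])            # 2
```

The result is ordered by the names in the list, not by document order, which mirrors how the for-each over the tokenized string builds the sequence.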

Ruby-on-Rails: Mixing Sanitize and Truncate can be a dirty thing

So, standalone, I get what I need. But when I want to truncate it, my dynamic text comes out as dirty text cluttered with Microsoft Word garbage.
An Example :
&Lt;! [If Gte Mso 9]>&Lt;Xml> &Lt;Br /> &Lt;O:Office Document Settings> &Lt;Br /> &Lt;O:Allow Png/> &Lt;Br /> &Lt;/O:Off...
So how do I get the best of both worlds? Is there a shorthand ruby way to do this? For example a gsub statement that would clip off everything after the 125th char?
If you just want to slice, you can:
>> long_ugly_string = "omg this is a long string"
=> "omg this is a long string"
>> long_ugly_string[10..-1]
=> "s a long string"
Reference: http://ruby-doc.org/core/classes/String.html#M000771
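So you are just specifying the starting character (10) and the ending character (-1 tells it to go to the end of the string). Note that this keeps everything from index 10 onwards; to clip off everything after the 125th character instead, use long_ugly_string[0, 125].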
