Section Numbering in an XML document - xslt-2.0

I have a document in the following format:
<root>
<part>
<chapter>
<section>
<chapter>
<section>
<part2>
<chapter>
<section>
<chapter>
<section>
I am trying to number the sections, using the following code:
<xsl:number format="1.1.1" level="multiple" count="chapter|section"/>
The problem is that the chapter number resets in each part, whereas I want chapter numbers to run continuously regardless of the part.

Related

DITA to PDF header issue

I have a dita XML with the following structure.
<bookmap>
<frontmatter><!-- FM content --></frontmatter>
<concept><!-- chapter content --></concept>
<concept><!-- chapter content --></concept>
<concept><!-- chapter content --></concept>
<concept><!-- chapter content --></concept>
<concept><!-- chapter content --></concept>
<concept><!-- chapter content --></concept>
<concept><!-- chapter content --></concept>
<concept><!-- chapter content --></concept>
</bookmap>
I need a different header on only the first page of the first level; all other pages should use the normal header.
I used the static content that the DITA-to-PDF plugin defines for the first page:
<fo:static-content flow-name="first-body-header">
<fo:block xsl:use-attribute-sets="__body__odd__header">
<xsl:call-template name="getVariable">
<xsl:with-param name="id" select="'Body odd header'"/>
<xsl:with-param name="params">
<!-- <prodname>
<xsl:value-of select="$productName"/>
</prodname>-->
<heading>
<fo:inline xsl:use-attribute-sets="__body__odd__header__heading">
<fo:retrieve-marker retrieve-class-name="current-header"/>
</fo:inline>
</heading>
</xsl:with-param>
</xsl:call-template>
</fo:block>
</fo:static-content>
</xsl:template>
Unfortunately, this template applies to the first page of every part.
Has anyone faced this issue before? I want this header to apply only to the first page of the first level.
Thanks in advance,
Arul
I am not sure about DITA, but this issue is #8.5 in the (Apache) XSL-FO FAQ, and it can be solved using only XSL-FO constructs, with something like:
<?xml version="1.0"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
<!-- layout master set: -->
<fo:layout-master-set>
<!-- page master for 1st page -->
<!-- define a (e.g.) A4 page-master with extra 20mm header region for the first page -->
<fo:simple-page-master master-name="myFirst"
page-height="297mm" page-width="210mm"
margin-top="20mm" margin-bottom="20mm"
margin-left="25mm" margin-right="25mm">
<fo:region-body margin-top="20mm"/>
<fo:region-before region-name="myHeaderFirst" extent="20mm"/>
<!-- define custom footer with <fo:region-after/> ... -->
</fo:simple-page-master>
<!-- page master for "rest" page (body only) -->
<fo:simple-page-master master-name="myRest"
page-height="297mm" page-width="210mm"
margin-top="20mm" margin-bottom="20mm"
margin-left="25mm" margin-right="25mm">
<!-- define only/same body -->
<fo:region-body/>
</fo:simple-page-master>
<!-- Da page seekwendz masta! ..combining myFirst and myRest ;) -->
<fo:page-sequence-master master-name="myDocument">
<fo:repeatable-page-master-alternatives>
<!-- here comes fo magic: "page-position" in (first|last|rest|any|only)
..with precedence! -->
<fo:conditional-page-master-reference page-position="first"
master-reference="myFirst"/>
<fo:conditional-page-master-reference page-position="rest"
master-reference="myRest"/>
</fo:repeatable-page-master-alternatives>
</fo:page-sequence-master>
</fo:layout-master-set>
<!-- here go the contents/page sequences ... -->
<fo:page-sequence master-reference="myDocument">
<!-- static content BEFORE flow! (a small pitfall,
esp. when it is the footer not the header:)) -->
<fo:static-content flow-name="myHeaderFirst">
TODO : "print" your header for first page here.
</fo:static-content>
<!-- define other/more static-contents ... -->
<fo:flow flow-name="xsl-region-body">
TODO : "print" flow/body.
<!-- xsl:applyTemplates /-->
</fo:flow>
</fo:page-sequence>
</fo:root>
https://xmlgraphics.apache.org/fop (apache's fo-impl home)
https://www.data2type.de/en/xml-xslt-xslfo/xsl-fo/ (very detailed documentation on xsl-fo in en + de language)

Unsupported element in Slideshow (Facebook Instant Articles)

I have an issue with Facebook Instant Articles validation. For one of my articles this error message pops up:
Slideshow Contains Unsupported Elements: Only image elements can appear in a slideshow. Ensure that slideshow (at /html/body/article/figure[3]) only contains supported elements. Refer to Slideshows under Format Reference in Instant Articles documentation for more information.
Here's the code:
<figure class="op-slideshow">
<figure>
<img src="https://www.example.com/image1.jpg">
<figcaption>Caption1</figcaption>
</figure>
<figure>
<img src="https://www.example.com/image2.jpg">
<figcaption>Caption2</figcaption>
</figure>
</figure>
It was generated by the official PHP SDK, and the example there uses a very similar structure (http://take.ms/nookv). Is this a bug?
You can't add a figcaption to each slide; only images can be enclosed within the inner tags.
Correct format would be:
<figure class="op-slideshow">
<figure>
<img src="http://example.com/path/to/img1.jpg" />
</figure>
<figure>
<img src="http://example.com/path/to/img2.jpg" />
</figure>
<figure>
<img src="http://example.com/path/to/img3.jpg" />
</figure>
<figcaption>This slideshow is amazing.</figcaption>
</figure>
I've found the problem. It had nothing to do with the structure. One of the images was a GIF, which as it turns out, isn't supported in a slideshow.

Convert Nokogiri XML Document into Array of Strings?

I'm creating a Ruby on Rails application and using Nokogiri to parse an XML file. I'm trying to parse the XML file into mutable strings which I can manipulate to create other content.
Here's a sample XML I'm using
<feed xmlns="http://www.w3.org/2005/Atom">
<entry>
<title type="html">
<![CDATA[ First Post! ]]>
</title>
<content type="html">
<![CDATA[
<p>I’m very excited to have finally got my site up and running along with this blog!</p>]]>
</content>
</entry>
</feed>
This is what I've done so far relating to my problem
In my controller -
def index
  @blog_title, @blog_post = parse_xml
end
private
def parse_xml
  @xml_doc = Nokogiri::XML(open("atom.xml"))
  titles = @xml_doc.css("entry title")
  post = @xml_doc.css("content")
  return titles, post
end
In my view -
<% for i in 1..@blog_title.length %>
  <li><%= @blog_title[i-1] %></li>
  <li><%= @blog_post[i-1] %></li>
<% end %>
A sample output from the view (it returns a Nokogiri Element) -
<title type="html"><![CDATA[First Post!]]></title>
So ideally, I'd like to make all the Nokogiri::Element inside the Nokogiri::Document a string or make the entire array a String array.
I've tried iterating through each element and calling .to_s but it doesn't seem to work.
I've also tried calling Ruby::String methods such as slice and that doesn't work (for obvious reasons).
The end result I'm trying to get at (using the sample output on my view) is to return only the following and none of the rest.
First Post!
Can anyone help me? If I'm not clear enough or if someone needs to see more work, please feel free to ask!
For your case you should simply use .text to extract the content of tags. Something like titles.text would work.
You're dealing with RSS/Atom feeds which can contain multiple title tags. You need to iterate over all title nodes and extract their content separately, in a way that lets you keep track of their order and what article they're attached to:
require 'nokogiri'
doc = Nokogiri::XML(<<EOT)
<feed xmlns="http://www.w3.org/2005/Atom">
<entry>
<title type="html">
<![CDATA[ First Post! ]]>
</title>
<content type="html">
<![CDATA[
<p>I’m very excited to have finally got my site up and running along with this blog!</p>]]>
</content>
</entry>
</feed>
EOT
doc.search('title').map(&:text)
# => ["\n First Post! \n "]
This returns an array of the text inside the title nodes. From there you can easily clean up each string, manipulate them, reuse them, whatever.
doc.search('title').map{ |s| s.text.strip }
# => ["First Post!"]
search returns a NodeSet, which is akin to an array of title nodes found in the document. If you don't iterate over them you'll get a concatenated string containing all their text, which is usually NOT what you want:
require 'nokogiri'
doc = Nokogiri::XML(<<EOT)
<foo>
<title>this</title>
<title>is</title>
<title>what</title>
<title>you'd</title>
<title>get</title>
</foo>
EOT
doc.search('title').text
# => "thisiswhatyou'dget"
versus:
doc.search('title').map(&:text)
# => ["this", "is", "what", "you'd", "get"]
Trying to tear apart the first result is impossible unless you have prior knowledge of the document's structure which is usually not true. Iterating over the returned NodeSet will yield very usable results.
To maintain consistency with the various title tags in a feed, you need to loop over the entries, then extract the embedded titles which is a bit different than what your sample XML and code shows:
require 'nokogiri'
doc = Nokogiri::XML(<<EOT)
<feed xmlns="http://www.w3.org/2005/Atom">
<entry>
<title type="html">
<![CDATA[ First Post! ]]>
</title>
<content type="html">
<![CDATA[
<p>I’m very excited to have finally got my site up and running along with this blog!</p>]]>
</content>
</entry>
<entry>
<title type="html">
<![CDATA[ Second Post! ]]>
</title>
<content type="html">
<![CDATA[
<p>blah</p>]]>
</content>
</entry>
</feed>
EOT
titles = doc.search('entry').map { |entry|
entry.at('title').text.strip
}
titles # => ["First Post!", "Second Post!"]
Or perhaps more usable:
titles_and_content = doc.search('entry').map { |entry|
[
entry.at('title').text.strip,
entry.at('content').text.strip
]
}
titles_and_content
# => [["First Post!",
# "<p>I’m very excited to have finally got my site up and running along with this blog!</p>"],
# ["Second Post!", "<p>blah</p>"]]
which returns the title and the content for each entry. From this you can easily build up code to extract the links to the articles, date of publishing, refresh-rates, original site, everything you'd want to know about an individual article and its source, then store it in a database for later regurgitation when requested.
There are gems and scripts available for processing RDF, RSS and Atom feeds, however, years ago, when I had to write a huge aggregator for feeds, nothing was available that met my needs and I wrote one from scratch. I'd recommend trying to find one rather than reinvent that wheel, otherwise look through their source and learn from their experience. There are a number of things to do in code to be a good network-citizen that doesn't swamp the servers and get you banned.
See "How to avoid joining all text from Nodes when scraping" also.

How to programmatically create a new KB article in Dynamics CRM 2013?

I am trying to set up an integration using the SDK to create KB article records in CRM 2013, and so far I haven't been able to figure out a good way to build the article XML. We want to use SharePoint as our document authoring tool and then send those documents over to CRM.
From the research I've done so far, I know that in order to create a new KB article I need to link it to a template. I created a very basic template with one section as a test, then in a test app using the SDK I created a new KBArticle entity instance, set the required fields, and assigned the template to the new article. I tried building the XML for the ArticleXML attribute by starting with the StructureXML attribute of the template and filling in the content section with some test HTML content.
I was able to create the KB article successfully and load it in CRM, but it doesn't look right yet. I also created a new KB article through the UI, retrieved it using the SDK, and examined its ArticleXML attribute to compare it with the one I'm trying to create programmatically.
Here's the basic structure of the ArticleXML for an article created in the UI:
<articledata>
<section id="0">
<content>
<![CDATA[<b>Article content located here</b>]]>
</content>
</section>
<section id="1">
<content>
<![CDATA[]]>
</content>
</section>
</articledata>
Now here is the StructureXML attribute value from the template I created:
<kbarticle>
<sections nextSectionId="1">
<section type="docprop" name="title"/>
<section type="docprop" name="number"/>
<section type="edit" id="0">
<![CDATA[Content]]>
<instructions>
<![CDATA[Place KB article content here]]>
</instructions>
</section>
</sections>
<stylesheet>
<article>
<style name="background-color" value="#ffffff"/>
<style name="font-family" value="verdana"/>
<style name="font-size" value="10pt"/>
</article>
<title>
<style name="font-family" value="verdana"/>
<style name="font-size" value="16pt"/>
</title>
<number>
<style name="color" value="#666666"/>
<style name="font-size" value="9pt"/>
</number>
<heading>
<style name="font-size" value="10pt"/>
<style name="font-weight" value="bold"/>
<style name="color" value="#000066"/>
<style name="border-bottom" value="1px solid #999999"/>
</heading>
</stylesheet>
</kbarticle>
That template XML is what I tried to use and assign to the new article, but it doesn't look right when the article is viewed: the template content is there along with the content I added, so it's basically duplicated.
I also saw that there is a FormatXML attribute on the template, which contains XSL to transform the XML. I tried using it, but it produces HTML output that isn't what I want either. I'm struggling with how to get from the template to the ArticleXML that I need in order to create the new KB article. Any help with this is much appreciated!
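Comparing the two samples above suggests that ArticleXML is not a filled-in copy of StructureXML but a separate articledata document with one section per editable section id. The following is a sketch of that shape only, written in plain Ruby for illustration (the integration itself would use the .NET SDK); the method names cdata and build_article_xml are hypothetical, and whether CRM accepts the result as-is has not been verified.

```ruby
# Wraps text in a CDATA section. Any "]]>" inside the text is split across
# two CDATA sections so it cannot terminate the section early.
def cdata(text)
  "<![CDATA[#{text.gsub(']]>', ']]]]><![CDATA[>')}]]>"
end

# sections: hash of section id => HTML content, e.g. { 0 => "<b>Hi</b>" }.
# Mirrors the <articledata> structure retrieved from the UI-created article.
def build_article_xml(sections)
  body = sections.map do |id, html|
    %(<section id="#{id}"><content>#{cdata(html)}</content></section>)
  end.join
  "<articledata>#{body}</articledata>"
end

puts build_article_xml(0 => '<b>Article content located here</b>', 1 => '')
```

The section ids are assumed to match the id attributes of the type="edit" sections in the template's StructureXML.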

Producing single-line comments with HAML?

I'm trying to generate a comment on a single line at the end of an HTML file:
<!-- generated by SERVER1 -->
I have tried
/
  generated by #{@server_name}
But this outputs it over 3 lines -
<!--
generated by SERVER1
-->
I've tried
/ generated by #{@server_name}
But that doesn't evaluate the @server_name variable -
<!-- generated by #{@server_name} -->
Any ideas?
Just as you can drop back to raw HTML output when you want, so you can drop in raw HTML comments, even with interpolation.
This template:
- @foo = 42
#test1
  /
    Hello #{@foo}
#test2
  <!-- Hello #{@foo} -->
Produces this output:
<div id='test1'>
<!--
Hello 42
-->
</div>
<div id='test2'>
<!-- Hello 42 -->
</div>
Tested with Haml v3.1.4 (Separated Sally)
It's still an open issue: github.com/haml/haml/issues/313. I think you're stuck with the multiline comment for now, even though nex3 says single line interpolation should work.

Resources