Microsoft XmlLite fails to detect end-of-element - xml-parsing

I'm using Microsoft's XmlLite DLL to parse a simple XML file, using the code in the example XmlLiteReader. The essential part of the code (C++) is
while(S_OK == (hr = pReader->Read(&nodeType)))
{
    switch(nodeType)
    {
    case XmlNodeType_Element:
        // Get name...
        WriteAttributes(pReader, es, attributes);
        break;
    case XmlNodeType_EndElement:
        // Process end-of-element...
        break;
    }
}
and
HRESULT WriteAttributes(IXmlReader* pReader, CString& es, StringStringMap& attributes)
{
    while(TRUE)
    {
        // Get and store an attribute...
        HRESULT hrMove = pReader->MoveToNextAttribute();
        if (S_OK != hrMove)
            break; // no more attributes on this element
    }
    // ...
}
So, here's my question. With XML input such as
<?xml version="1.0" encoding="utf-8"?>
<settings version="1.2">
<runID name="test" mode="N" take_data="Y">
<cell id="01">
<channel id="A" sample="something"/>
<channel id="B" sample="something else"/>
</cell>
<cell id="03">
<channel id="A" sample="other something"/>
<channel id="B" sample="other something else"/>
</cell>
</runID>
</settings>
Everything works as expected, except that the /> at the end of each channel line, which indicates the end of the element channel, isn't recognized as the end of an element. The successive node types following channel are whitespace (\n), then element (the second channel).
How can I determine from XmlLite that the channel element has ended? Or am I misunderstanding the XML syntax?

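You can test whether an element ends with /> by calling IXmlReader::IsEmptyElement. For an empty element such as <channel id="A" .../>, XmlLite reports only XmlNodeType_Element and no separate XmlNodeType_EndElement, so treat the element as ended as soon as IsEmptyElement returns TRUE. Call it while the reader is still positioned on the element itself (before moving to the attributes, or after calling MoveToElement), because it returns FALSE while the reader is positioned on an attribute.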

Related

How to extract the value of a specific div in HTML when crawling with Apache Nutch?

I am crawling with Nutch 2.2, and the only data I retrieve is the meta tags. How can I extract the value of a specific div in the HTML while crawling with Apache Nutch?
You will have to write a plugin that implements HtmlParseFilter to achieve your goal.
You can use an HTML parser such as Jsoup for this, extract the URLs that you want, and add them as outlinks.
Sample HtmlParseFilter implementation:
public ParseResult filter(Content content, ParseResult parseResult,
        HTMLMetaTags metaTags, DocumentFragment doc) {
    // get the raw HTML content
    String htmlContent = new String(content.getContent(), StandardCharsets.UTF_8);
    // parse the HTML using jsoup or any other library
    Document document = Jsoup.parse(htmlContent, content.getUrl());
    // replace the placeholder with your own CSS selector, e.g. one that matches the div you want
    Elements elements = document.select("<your_css_selector_query>");
    // modify/select only the required outlinks
    if (elements != null) {
        List<String> newLinks = new ArrayList<String>();
        List<Outlink> outLinks = new ArrayList<Outlink>();
        String absoluteUrl;
        Outlink outLink;
        for (Element element : elements) {
            absoluteUrl = element.absUrl("href");
            // includeLinks(...) and value are your own filtering helper and its argument (not shown here)
            if (includeLinks(absoluteUrl, value)) {
                if (!newLinks.contains(absoluteUrl)) {
                    newLinks.add(absoluteUrl);
                    outLink = new Outlink(absoluteUrl, element.text());
                    outLinks.add(outLink);
                }
            }
        }
        Parse parse = parseResult.get(content.getUrl());
        ParseStatus status = parse.getData().getStatus();
        String title = document.title();
        Outlink[] newOutLinks = outLinks.toArray(new Outlink[outLinks.size()]);
        ParseData parseData = new ParseData(status, title, newOutLinks, parse.getData().getContentMeta(), parse.getData().getParseMeta());
        // the text of the selected elements becomes the parse text
        parseResult.put(content.getUrl(), new ParseText(elements.text()), parseData);
    }
    // return parseResult with modified outlinks
    return parseResult;
}
Build the new plugin using Ant and add it to plugin.includes in nutch-site.xml:
<property>
<name>plugin.includes</name>
<value>protocol-httpclient|<custom_plugin>|urlfilter-regex|parse-(tika|html|js|css)|index-(basic|anchor)|query-(basic|site|url)|response-(json|xml)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)|indexer-elastic</value>
</property>
In parser-plugins.xml, you can then map the MIME types to your custom plugin instead of the default plugin, with something like this:
<!--
<mimeType name="text/html">
<plugin id="parse-html" />
</mimeType>
<mimeType name="application/xhtml+xml">
<plugin id="parse-html" />
</mimeType>
-->
<mimeType name="text/xml">
<plugin id="parse-tika" />
<plugin id="feed" />
</mimeType>
<mimeType name="text/html">
<plugin id="<custom_plugin>" />
</mimeType>
<mimeType name="application/xhtml+xml">
<plugin id="<custom_plugin>" />
</mimeType>
You need to override the parse filter and use a Jsoup selector to select the particular div.

Specifying filenames for ant junit batchtest

I have a working Ant task that runs a batch of JUnit tests, in the following fashion:
<junit printsummary="yes" showoutput="yes" haltonfailure="no">
<formatter type="plain" />
<classpath>
<path refid="app.compile.classpath" />
<path id="classes" location="${app.classes.home}" />
<path id="test-classes" location="${app.build.home}/test-classes" />
</classpath>
<batchtest fork="no" todir="${app.tests.reports}">
<fileset dir="${app.tests.home}">
<include name="**/*Test*.java" />
</fileset>
</batchtest>
</junit>
This works fine, except for the names of the test reports that are generated. The names are overly long and follow the pattern:
TEST-com.company.package.Class.txt
Is there a way to specify a file-naming pattern for the report files for batchtest, preferably for Ant 1.6.5?
I know that for a single test, you can specify a filename by using the outfile attribute. The junit task reference only states that:
It then generates a test class name for each resource that ends in .java or .class.
No, it's not possible.
According to the source code
protected void execute(JUnitTest arg, int thread) throws BuildException {
validateTestName(arg.getName());
JUnitTest test = (JUnitTest) arg.clone();
test.setThread(thread);
// set the default values if not specified
//#todo should be moved to the test class instead.
if (test.getTodir() == null) {
test.setTodir(getProject().resolveFile("."));
}
if (test.getOutfile() == null) {
test.setOutfile("TEST-" + test.getName());
}
// execute the test and get the return code
TestResultHolder result = null;
if (!test.getFork()) {
result = executeInVM(test);
} else {
ExecuteWatchdog watchdog = createWatchdog();
result = executeAsForked(test, watchdog, null);
// null watchdog means no timeout, you'd better not check with null
}
actOnTestResult(result, test, "Test " + test.getName());
}
the output file is named "TEST-" + test.getName() unless you specify outfile explicitly on the test element. The batchtest element offers no way to specify it.

Get an entry of a Manifest file with ant

I am trying to develop an Ant macrodef that gets the comma-separated values of the Require-Bundle property from a Manifest file passed as a parameter. What I want to obtain is something like this:
Require-Bundle=org.eclipse.ui,org.eclipse.ui.ide,org.eclipse.ui.views...
The problem with my code is that it doesn't handle a property whose values span multiple lines. Here is my code:
<macrodef name="getDependencies">
<attribute name="file" />
<attribute name="prefix" default="ant-mf." />
<sequential>
<loadproperties>
<file file="@{file}" />
<filterchain>
<linecontains>
<contains value="Require-Bundle" />
</linecontains>
<prefixlines prefix="@{prefix}" />
</filterchain>
</loadproperties>
</sequential>
</macrodef>
But this is what I get:
[echoproperties] ant-mf.Require-Bundle=org.eclipse.ui,
Any help will be appreciated.
Most likely, your Manifest file looks like this:
Require-Bundle: org.eclipse.ui,
 org.eclipse.ui.ide,
 org.eclipse.ui.views,
...
Unfortunately, Java Manifest files aren't quite Java Properties files. A Manifest file continues a long attribute value on the following lines, each starting with a single space, whereas a Properties file only continues a line that ends with a backslash. The <loadproperties> task therefore reads just the first line of a multi-line manifest attribute.
Instead, you'll need an Ant task that knows about Manifest files. In another question, Richard Steele provides an Ant script that loads a Manifest file from a Jar file. You can adapt the example to get just the Require-Bundle attribute.
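To see why a Manifest-aware reader helps, here is a minimal Java sketch, assuming the manifest text is already in a string. The class name ManifestAttributeDemo, the helper readMainAttribute and the sample content are made up for illustration. It relies on java.util.jar.Manifest, which joins continuation lines by itself.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Manifest;

// Hypothetical demo class, not part of Ant or of the linked script.
final class ManifestAttributeDemo {

    // Returns the value of a main-section attribute, or null if it is absent.
    static String readMainAttribute(String manifestText, String attributeName) {
        try {
            Manifest manifest = new Manifest(
                    new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
            return manifest.getMainAttributes().getValue(attributeName);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample manifest: continuation lines start with a single space.
        String sample = "Manifest-Version: 1.0\n"
                + "Require-Bundle: org.eclipse.ui,\n"
                + " org.eclipse.ui.ide,\n"
                + " org.eclipse.ui.views\n";
        // prints org.eclipse.ui,org.eclipse.ui.ide,org.eclipse.ui.views
        System.out.println(readMainAttribute(sample, "Require-Bundle"));
    }
}
```

The same Manifest class can be called from an Ant script task, so the multi-line value arrives already joined into one string.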
Thanks to Chad Nouis I have changed the macrodef approach to scriptdef. I have debugged and adapted the Richard Steele script to fit my needs:
<!--
Loads entries from a manifest file.
@manifest A manifest file to read
@entry The name of the manifest entry to load (optional)
@prefix A prefix to prepend (optional)
-->
<scriptdef name="getDependencies" language="javascript" description="Gets all entries or a specified one of a manifest file">
<attribute name="manifest" />
<attribute name="entry" />
<attribute name="prefix" />
<![CDATA[
var filename = attributes.get("manifest");
var entry = attributes.get("entry");
if (entry == null) {
entry = "";
}
var prefix = attributes.get("prefix");
if (prefix == null) {
prefix = "";
}
var manifest;
if (filename != null) {
manifest = new java.util.jar.Manifest(new java.io.FileInputStream(new java.io.File(filename)));
} else {
self.fail("File is required");
}
if (manifest == null) {
self.log("Problem loading the Manifest");
} else {
// named mainAttributes so it does not overwrite the scriptdef's own "attributes" object
var mainAttributes = manifest.getMainAttributes();
if (mainAttributes != null) {
if (entry != "") {
project.setProperty(prefix + entry, mainAttributes.getValue(entry));
} else {
var it = mainAttributes.keySet().iterator();
while (it.hasNext()) {
var key = it.next();
project.setProperty(prefix + key, mainAttributes.getValue(key));
}
}
}
}
]]>
</scriptdef>

Pass parameter from a URL to a XSL stylesheet

I have spent the past couple of days digging through posts, looking for a way to pass a parameter from a URL to an XSL stylesheet.
For example, I have a URL like:
http://localhost/blocableau/data/base_bloc/blocableau3x.xml?95.2
and I want to select just the value after the ?, which is 95.2 in this example, and put it in the variable var_massif.
I tried the following code, using JavaScript to test the value of substring(1), but it didn't work with XSL.
<?xml version="1.0" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" indent="yes" />
<!-- look up the grades by Massif according to a variable -->
<xsl:key name="byMassif" match="bloc" use="Massif" />
<xsl:template match="BLOCS">
<script>var requete = location.search.substring(1);
mavariable=requete;alert(mavariable);x=mavariable</script>
<xsl:variable name="var_massif"><xsl:value-of select="95.2" /></xsl:variable>
<span>Ma variable xsl est : <xsl:value-of select="$var_massif"/></span>
<div style="position:relative; top:0px;left:10px;font-family: Arial, 'Helvetica Neue', 'Helvetica, sans-serif'; font-size:12px;z-index:1;">
<!-- generate an id for each grade value and format it -->
<xsl:for-each select="bloc[generate-id() = generate-id(key('byMassif', $var_massif)[1])]" >
<xsl:sort select="Massif" />
I assume you are loading an XML file that uses an xml-stylesheet processing instruction to transform itself:
<?xml-stylesheet ... ?>
You would need to read the URL and extract the query string. Unfortunately, it is not automatically mapped to a parameter. The only way to do that in the stylesheet itself would be document-uri(/), but that is XSLT 2.0 and not supported in browsers.
In XSLT 1.0, you would have to direct your HTTP request not to the XML file you want to process, but to a script that loads your XML and XSLT using XSLTProcessor. Then you can pass the query string as a parameter to the processor:
var var_massif = location.search.substring(1); // gets the string and removes the `?`
var processor = new XSLTProcessor(); // starts the XSL processor
processor.setParameter(null, "var_massif", var_massif); // sets parameter
... // load XSL, XML and transformToDocument
Inside your XSLT stylesheet you should declare a global parameter:
<xsl:param name="var_massif" />
You can then read it as $var_massif.
UPDATE: Here is a step-by-step example using jQuery:
I'm using this reduced stylesheet, just to show you how to get the parameter. It's in a file I called stylesheet.xsl:
<?xml version="1.0" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" indent="yes" />
<xsl:param name="var_massif" />
<xsl:template match="BLOCS">
<span>Ma variable xsl est : <xsl:value-of select="$var_massif" /></span>
</xsl:template>
</xsl:stylesheet>
And this input (blocableau3x.xml) in the same directory:
<BLOCS>
<bloc>
<Massif></Massif>
</bloc>
</BLOCS>
Now create this HTML file (blocableau3x.html) also in the same directory:
<script src="//cdnjs.cloudflare.com/ajax/libs/jquery/2.1.0/jquery.js"></script>
<script>
$( document ).ready(function() {
var var_massif = location.search.substring(1); // gets the string and removes the `?`
var body = $("body")[0]; // gets the body element where the result will be inserted
if (var_massif) {
var processor = new XSLTProcessor(); // starts the XSL processor
processor.setParameter(null, "var_massif", var_massif);
var source;
var xslReq = $.get("stylesheet.xsl", function (data) { // loads the stylesheet
processor.importStylesheet(data);
});
var xmlReq = $.get("blocableau3x.xml", function (data) { // loads the xml
source = data;
});
$.when(xslReq, xmlReq).done(function () { // waits both to load
var result = processor.transformToDocument(source); // transforms document
body.appendChild(result.documentElement); // adds result as child of body element
});
} else {
body.innerHTML = "<h1>Missing query string</h1>"; // in case there is no query string (body is a plain DOM element, not a jQuery object)
}
});
</script>
<html>
<body></body>
</html>
Instead of requesting your XML file in the URL, request the HTML file. When it loads, it fetches the XSL file and the XML file and uses the XSLT processor to transform the XML. The script reads the query string and passes it as a global XSL parameter, which you can read in your XSLT file.
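The global-parameter mechanism is not specific to the browser. If you would rather run the transformation on the server, here is a rough Java sketch of the same idea using the JDK's javax.xml.transform API instead of XSLTProcessor. The class XslParamDemo, its constants and the way the value reaches main are assumptions for illustration. In a real setup the value would come from the request's query string.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class: a server-side analogue of the browser script.
final class XslParamDemo {

    // Reduced stylesheet with the same global parameter as stylesheet.xsl.
    static final String STYLESHEET =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"html\"/>"
            + "<xsl:param name=\"var_massif\"/>"
            + "<xsl:template match=\"BLOCS\">"
            + "<span>Ma variable xsl est : <xsl:value-of select=\"$var_massif\"/></span>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    static final String INPUT = "<BLOCS><bloc><Massif/></bloc></BLOCS>";

    // Transforms xml with xsl, binding the global parameter var_massif.
    static String transform(String xsl, String xml, String varMassif) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            transformer.setParameter("var_massif", varMassif); // same role as processor.setParameter
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The value would normally be taken from the query string; 95.2 is the example from the question.
        String varMassif = args.length > 0 ? args[0] : "95.2";
        System.out.println(transform(STYLESHEET, INPUT, varMassif));
    }
}
```

Either way, the stylesheet only declares the parameter. Whatever starts the transformation is responsible for supplying its value.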

Unknown property error within item renderer's data property

I'm using an item renderer, but I keep getting this ActionScript error:
Error: Unknown Property: 'skillName'. at mx.collections::ListCollectionView/http://www.adobe.com/2006/actionscript/flash/proxy::getProperty()[E:\dev\4.y\frameworks\projects\framework\src\mx\collections\ListCollectionView.as:870]
at mx.binding::PropertyWatcher/updateProperty()[E:\dev\4.y\frameworks\projects\framework\src\mx\binding\PropertyWatcher.as:338]
at Function/http://adobe.com/AS3/2006/builtin::apply()
at mx.binding::Watcher/wrapUpdate()[E:\dev\4.y\frameworks\projects\framework\src\mx\binding\Watcher.as:192]
at mx.binding::PropertyWatcher/updateParent()[E:\dev\4.y\frameworks\projects\framework\src\mx\binding\PropertyWatcher.as:239]
at mx.binding::Watcher/updateChildren()[E:\dev\4.y\frameworks\projects\framework\src\mx\binding\Watcher.as:138]
at mx.binding::PropertyWatcher/updateProperty()[E:\dev\4.y\frameworks\projects\framework\src\mx\binding\PropertyWatcher.as:347]
at Function/http://adobe.com/AS3/2006/builtin::apply()
at mx.binding::Watcher/wrapUpdate()[E:\dev\4.y\frameworks\projects\framework\src\mx\binding\Watcher.as:192]
at mx.binding::PropertyWatcher/eventHandler()[E:\dev\4.y\frameworks\projects\framework\src\mx\binding\PropertyWatcher.as:375]
at flash.events::EventDispatcher/dispatchEventFunction()
at flash.events::EventDispatcher/dispatchEvent()
at mx.core::UIComponent/dispatchEvent()[E:\dev\4.y\frameworks\projects\framework\src\mx\core\UIComponent.as:13152]
at spark.components::DataRenderer/set data()[E:\dev\4.y\frameworks\projects\spark\src\spark\components\DataRenderer.as:123]
at spark.components::SkinnableDataContainer/updateRenderer()[E:\dev\4.y\frameworks\projects\spark\src\spark\components\SkinnableDataContainer.as:606]
at spark.components.supportClasses::ListBase/updateRenderer()[E:\dev\4.y\frameworks\projects\spark\src\spark\components\supportClasses\ListBase.as:1106]
at spark.components::DataGroup/setUpItemRenderer()[E:\dev\4.y\frameworks\projects\spark\src\spark\components\DataGroup.as:1157]
at spark.components::DataGroup/initializeTypicalItem()[E:\dev\4.y\frameworks\projects\spark\src\spark\components\DataGroup.as:327]
at spark.components::DataGroup/ensureTypicalLayoutElement()[E:\dev\4.y\frameworks\projects\spark\src\spark\components\DataGroup.as:384]
at spark.components::DataGroup/measure()[E:\dev\4.y\frameworks\projects\spark\src\spark\components\DataGroup.as:1467]
at mx.core::UIComponent/http://www.adobe.com/2006/flex/mx/internal::measureSizes()[E:\dev\4.y\frameworks\projects\framework\src\mx\core\UIComponent.as:8506]
at mx.core::UIComponent/validateSize()[E:\dev\4.y\frameworks\projects\framework\src\mx\core\UIComponent.as:8430]
at mx.managers::LayoutManager/validateSize()[E:\dev\4.y\frameworks\projects\framework\src\mx\managers\LayoutManager.as:665]
at mx.managers::LayoutManager/doPhasedInstantiation()[E:\dev\4.y\frameworks\projects\framework\src\mx\managers\LayoutManager.as:816]
at mx.managers::LayoutManager/doPhasedInstantiationCallback()[E:\dev\4.y\frameworks\projects\framework\src\mx\managers\LayoutManager.as:1180]
The weird thing is that it worked fine until, out of the blue, I started getting this error. I've searched Google and Stack Overflow and came across a few websites, but none of the answers helped me get any further. This error also seems to be thrown mostly in mobile AIR projects, but mine is a Flash Player project...
This is what the item renderer looks like:
<?xml version="1.0" encoding="utf-8"?>
<s:ItemRenderer xmlns:fx="http://ns.adobe.com/mxml/2009"
xmlns:s="library://ns.adobe.com/flex/spark"
xmlns:mx="library://ns.adobe.com/flex/mx"
xmlns:components="components.*"
width="100%" height="100%" autoDrawBackground="true"
creationComplete="creationCompleteHandler(event)"
height.login_edit_state="80"
color.login_edit_state="#000000"
height.login_preview_state="80">
<fx:Script>
<![CDATA[
import mx.collections.ArrayCollection;
import mx.controls.Alert;
import mx.events.FlexEvent;
import mx.rpc.events.FaultEvent;
import mx.rpc.events.ResultEvent;
import mx.utils.ArrayUtil;
public var loggedin:Boolean = true;
[Bindable]private var ac_projects:ArrayCollection;
protected function creationCompleteHandler(event:FlexEvent):void
{
currentState = "login_preview_state";
img_foldout_preview.addEventListener(MouseEvent.CLICK, changeState);
img_edit_preview.addEventListener(MouseEvent.CLICK, changeState);
http_projects.addEventListener(ResultEvent.RESULT, http_projects_resultEvent);
http_projects.addEventListener(FaultEvent.FAULT, http_projects_faultEvent);
http_projects.url = "http://localhost/sourcefoliocom.adobe.flexbuilder.project.flexbuilder/bindebug/php/getAllProjectsByUserSkill.php?id=" + data.userId + "&skill=" + data.skillId ;
trace("http://localhost/sourcefoliocom.adobe.flexbuilder.project.flexbuilder/bindebug/php/getAllProjectsByUserSkill.php?id="+ data.userId + "&skill=" + data.skillId);
http_projects.send();
}
protected function http_projects_resultEvent(event:ResultEvent):void
{
ac_projects = new ArrayCollection(ArrayUtil.toArray(event.result.projects.project));
rpt_projects.dataProvider = ac_projects;
}
protected function http_projects_faultEvent(event:FaultEvent):void
{
trace("Could not load projects");
}
]]>
</fx:Script>
<fx:Declarations>
<s:HTTPService id="http_projects"
method="GET" />
</fx:Declarations>
<s:states>
<s:State name="login_preview_state"/>
<s:State name="login_opened_state"/>
<s:State name="login_edit_state"/>
</s:states>
<s:layout.login_opened_state>
<s:VerticalLayout horizontalAlign="right"/>
</s:layout.login_opened_state>
<!-- login_opened_state -->
<s:SkinnableContainer includeIn="login_opened_state" width="100%" height="80">
<s:layout>
<s:HorizontalLayout gap="20" paddingBottom="20" paddingLeft="20" paddingRight="20" paddingTop="20" verticalAlign="middle"/>
</s:layout>
<s:Label fontSize="20" fontWeight="bold" text="{data.skillName}"/>
<s:Label fontSize="20" text="junior"/>
<s:Spacer width="100%" height="10"/>
<s:Image id="img_edit_open" width="20" height="20" source="images/edit.png" buttonMode="true" useHandCursor="true"/>
<s:Image id="img_foldin_open" width="20" height="20" buttonMode="true" source="images/foldin.png" useHandCursor="true"/>
</s:SkinnableContainer>
<s:VGroup id="vg_opened"
visible="false"
width="900" height="1000" gap="0"
horizontalAlign="right">
<mx:VBox>
<mx:Repeater id="rpt_projects" width="100%">
<components:Project currentItem="{rpt_projects.currentItem}" loggedin="true"/>
</mx:Repeater>
<components:AddProject />
</mx:VBox>
<s:Image x="824" width="76" height="51" source="images/edit_flag.png" useHandCursor="true"/>
</s:VGroup>
</s:ItemRenderer>
The error is thrown at this line:
<s:Label fontSize="20" fontWeight="bold" text="{data.skillName}"/>
The ArrayCollection that fills this renderer comes from an XML file returned by my own web service. I've tested the file, and the use of skillName should be correct in this case.
Do you need to see more code or more info? Let me know!
I found out I'm dealing with a two-dimensional ArrayCollection. I wasn't referring to the items in this ArrayCollection correctly, which is why the properties were not recognized.
