Best approach to parse an XML and get required values - xml-parsing

Which would be the best and efficient way to parse an xml
XSLT transform ?
or
Linq to XML?
or any other way?
I am planning to change the existing xslt to "linq to xml" assuming "linq to xml" will be faster is it a good choice? will it give an noticeable performance?

use XmlSerializer and http://xmltocsharp.azurewebsites.net/
xml2c# code generator.

Related

What are the advantages and disadvantages of using from_xml function in ruby

I want to use the from_xml method provided by the rails active support core ext,
An example :
hash = Hash.from_xml("my_xml.xml")
for converting XML into a Hash.
I want to use this because parsing a Hash is a lot easier than XML in ruby.
However, I would want to know what are the pros and cons of using this approach. Is there a better approach I can use for converting an XML into hash.
Thanks
XML unlike JSON is a document format and not just data exchange format and not always actually maps cleanly to programming language constructs like hashes. XML is actually ridiculously complex if you look at all the features like namespaces.
Hash.from_xml really only handles the simplest of cases and doesn't have a clue how to deal with stuff like attributes. It really only knows how to parse the XML generated by Hash#to_xml.
Advantages:
its so naive that its cute
Disadvantages:
see advantages
For non-trivial examples you'll need an actual XML parser like Nokogiri.

Parsing and pretty printing the same file format in Haskell

I was wondering, if there is a standard, canonical way in Haskell to write not only a parser for a specific file format, but also a writer.
In my case, I need to parse a data file for analysis. However, I also simulate data to be analyzed and save it in the same file format. I could now write a parser using Parsec or something equivalent and also write functions that perform the text output in the way that it is needed, but whenever I change my file format, I would have to change two functions in my code. Is there a better way to achieve this goal?
Thank you,
Dominik
The BNFC-meta package https://hackage.haskell.org/package/BNFC-meta-0.4.0.3
might be what you looking for
"Specifically, given a quasi-quoted LBNF grammar (as used by the BNF Converter) it generates (using Template Haskell) a LALR parser and pretty pretty printer for the language."
update: found this package that also seems to fulfill the objective (not tested yet) http://hackage.haskell.org/package/syntax

Should I use one XML Parser for every XML feed I have, or should I write a parser for each XML feed I have?

So I help write an app for my university and I'm wondering what's the best way to handle multiple XML feeds (like scores for sports, class information, etc.).
Should I have one XML parser that can handle all feeds? Or should I write a parser for each feed? We're having trouble deciding the best way to implement it.
This is iOS and we use a mix of Swift 3 and Objective-C
I think the right strategy is to write a base class that handles common data types like integers, booleans, strings, etc., and then write derived classes for each type of feed. This is the strategy I use in my XML parser, which is based on the data structures and Apple's XML parser as described here:
https://developer.apple.com/library/content/documentation/Cocoa/Conceptual/NSXML_Concepts/NSXML.html
Personally I prefer to use the XPath data models where you can query the XML tree for a specific node using a path-like string.

VBScript Partial Parser

I am trying to create a VBScript parser. I was wondering what is the best way to go about it. I have researched and researched. The most popular way seems to be going for something like Gold Parser or ANTLR.
The feature I want to implement is to do dynamic checking of Syntax Errors in VBScript. I do not want to compile the entire VBS every time some text changes. How do I go about doing that? I tried to use Gold Parser, but i assume there is no incremental way of doing parsing through it, something like partial parse trees...Any ideas on how to implement a partial parse tree for such a scenario?
I have implemented VBscript Parsing via GOLD Parser. However it is still not a partial parser, parses the entire script after every text change. Is there a way to build such a thing.
thks
If you really want to do incremental parsing, consider this paper by Tim Wagner.
It is brilliant scheme to keep existing parse trees around, shuffling mixtures of string fragments at the points of editing and parse trees representing the parts of the source text that hasn't changed, and reintegrating the strings into the set of parse trees. It is done using an incremental GLR parser.
It isn't easy to implement; I did just the GLR part and never got around to the incremental part.
The GLR part was well worth the trouble.
There are lots of papers on incremental parsing. This is one of the really good ones.
I'd first look for an existing VBScript parser instead of writing your own, which is not a trivial task!
Theres a VBScript grammar in BNF format on this page: http://rosettacode.org/wiki/BNF_Grammar which you can translate into a ANTLR (or some other parser generator) grammar.
Before trying to do fancy things like re-parsing only a part of the source, I recommend you first create a parser that actually works.
Best of luck!

Convert Neo4j DB to XML?

Can I convert Neo4J Database files to XML?
I agree, GraphML is the way to go, if you don't have problems with the verbosity of XML. A simple way to do it is to open the Neo4j graph from Gremlin, where GraphML is the default import/export format, something like
peters: ./gremlin.sh
gremlin> $_g := neo4j:open('/tmp/neo4j')
==>neograph[/tmp/neo4j, vertices:2, edges:1]
gremlin> g:save('graphml-export.xml')
As described here
Does that solve your problem?
With Blueprints, simply do:
Graph graph = new Neo4jGraph("/tmp/mygraph");
GraphMLWriter.outputGraph(graph, new FileOutputStream("mygraph.xml"));
Or, with Gremlin (which does the same thing in the back):
g = new Neo4jGraph('/tmp/mygraph');
g.saveGraphML('mygraph.xml');
Finally, to the constructor for Neo4jGraph, you can also pass in a GraphDatabaseService instance.
I don't believe anything exists out there for this, not as of few months ago when messing with it. From what I saw, there are 2 main roadblocks:
XML is hierarchical, you can't represent graph data readily in this format.
Lack of explicit IDs for nodes. Even though implicit IDs exist it'd be like using ROWID in oracle for import/export...not guaranteed to be the same.
Some people have suggested that GraphML would be the proper format for this, I'm inclined to agree. If you don't have graphical structures and you would be fine represented in an XML/hierarchical format...well then that's just bad luck. Since the majority of users who would tackle this sort of enhancement task are using data that wouldn't store that way, I don't see an XML solution coming out...more likely to see a format supporting all uses first.
Take a look at NoSqlUnit
It has tools for converting GraphML to neo4j and back again.
In particular, there is com.lordofthejars.nosqlunit.graph.parser.GraphMLWriter and com.lordofthejars.nosqlunit.graph.parser.GraphMLReader which read / write XML files to / from a neo4j database.

Resources