What are the pros and cons of NSXMLParser versus a JSON parser? Which one is good in which scenario?
Currently, my app uses NSXMLParser. I'm planning to move to a JSON parser if it is more efficient.
Thanks
NSXMLParser is an "event driven" parser, which basically notifies a delegate about the occurrence of certain elements in the XML document.
Event driven parsers do not create a representation of the XML document by themselves. The actual processing of the elements has to be done by a delegate. Properly utilizing event driven parsers is elaborate and error prone, and requires experience in how to approach such a task. Well, you know that already.
NSJSONSerialization on the other hand, and all other third party JSON parsers that I know of, create a Foundation object (an NSArray or NSDictionary) from the JSON input. Parsing a JSON document and getting an NSDictionary or NSArray object back is a matter of one statement. A few also support an "event driven" mode.
XML is far more complex than JSON. Inherently, a JSON parser is much simpler and almost always more efficient at parsing documents.
Despite its simplicity, JSON is almost always sufficient to express your data.
So, when you can express your data in JSON, by all means use JSON. If possible, use NSJSONSerialization.
Other third party JSON parsers may offer additional features, like an event driven API, an improved way to handle chunks of data, or more sophisticated options for customizing certain edge cases, like the handling of the Unicode NULL character, Unicode noncharacters, or how to convert JSON numbers, and they may possibly be faster than NSJSONSerialization.
Today, NSJSONSerialization is about as fast as JSONKit. (For some input, JSONKit is a bit faster.) AFAIK, there are two third party parsers which are almost always faster than NSJSONSerialization for any input, especially on ARM, and especially when it comes to converting numbers. You can expect them to be faster by a factor in the range of 1 to 2. But consider that parsing JSON is almost never the culprit in performance issues.
I want to use the from_xml method provided by the Rails ActiveSupport core extensions for converting XML into a Hash. An example:
hash = Hash.from_xml(File.read("my_xml.xml"))
I want to use this because working with a Hash is a lot easier than working with XML in Ruby.
However, I would like to know the pros and cons of this approach. Is there a better approach for converting XML into a Hash?
Thanks
XML, unlike JSON, is a document format, not just a data exchange format, and it does not always map cleanly to programming language constructs like hashes. XML is actually ridiculously complex if you look at all of its features, like namespaces.
Hash.from_xml really only handles the simplest of cases and doesn't have a clue how to deal with things like attributes. It really only knows how to parse the XML generated by Hash#to_xml.
Advantages:
it's so naive that it's cute
Disadvantages:
see advantages
For non-trivial examples you'll need an actual XML parser like Nokogiri.
So I help write an app for my university and I'm wondering what's the best way to handle multiple XML feeds (like scores for sports, class information, etc.).
Should I have one XML parser that can handle all feeds? Or should I write a parser for each feed? We're having trouble deciding the best way to implement it.
This is on iOS, and we use a mix of Swift 3 and Objective-C.
I think the right strategy is to write a base class that handles common data types like integers, booleans, and strings, and then write derived classes for each type of feed. This is the strategy I use in my own XML parser, which is based on the data structures of Apple's XML parser as described here:
https://developer.apple.com/library/content/documentation/Cocoa/Conceptual/NSXML_Concepts/NSXML.html
Personally, I prefer the XPath data model, where you can query the XML tree for a specific node using a path-like string.
aeson seems to take a somewhat simple-minded approach to parsing JSON: it parses a top-level JSON value (an object or array) into its own fixed representation and then offers facilities to help users convert that representation to their own types. This approach works pretty well when JSON objects and arrays are small. When they're very large, things start to fall apart, because user code can't do anything until the JSON value has been completely read and parsed. This seems particularly unfortunate, since JSON seems designed for recursive descent parsers; it should be fairly simple to let user code step in and say how each piece should be parsed. Is there a deep reason that aeson and the earlier json package work this way, or should I try to make a new library for more flexible JSON parsing?
json-stream is a stream-based parser. This is a bit out of date (2015), but they took the benchmarks from aeson and compared the two libraries: aeson and json-stream performance comparison. There is one case where json-stream is significantly worse than aeson.
If you just want a faster aeson (not streaming), haskell-sajson looks interesting. It wraps a performant C++ library in Haskell and returns aeson's Value type.
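For context, here's a minimal sketch of the whole-document (non-streaming) workflow the question describes, with a made-up input string; decode parses everything into aeson's fixed Value representation before your code sees any of it:

import Data.Aeson (Value, decode)
import qualified Data.ByteString.Lazy.Char8 as BL

main :: IO ()
main = do
  -- a small, made-up JSON document
  let input = BL.pack "{\"name\": \"example\", \"sizes\": [1, 2, 3]}"
  -- decode reads and parses the entire input into aeson's fixed
  -- Value representation before user code can inspect any of it
  print (decode input :: Maybe Value)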
I am working with a huge JSON object and I need to extract a single parameter from it.
Is there a way to query the JSON object for the parameter?
You need a streaming JSON parser for that, i.e. a parser that produces events you listen to as it goes through the JSON input, as opposed to document-based parsers such as NSJSONSerialization on iOS 5+.
One such parser is YAJL. Although it is a C library, you can use it from Objective-C as well: all you need to do is define a yajl_callbacks struct, set pointers to the handlers for the types of items you wish to extract, call the parser, and let the parser do the rest.
Parsec is designed to parse textual information, but it occurs to me that Parsec could also be suitable for parsing binary file formats, including complex formats that involve conditional segments, out-of-order segments, etc.
Can Parsec do this, or is there a similar, alternative package that can? If not, what is the best way in Haskell to parse binary file formats?
The key tools for parsing binary files are:
Data.Binary
cereal
attoparsec
Binary is the most general solution, cereal can be great for limited data sizes, and attoparsec is perfectly fine for, e.g., packet parsing. All of these are aimed at very high performance, unlike Parsec. There are many examples on Hackage as well.
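For a taste of the attoparsec style on binary input, here is a minimal sketch for a made-up packet layout (one tag byte, one length byte, then that many payload bytes); the Packet type and the wire format are invented for illustration:

import qualified Data.Attoparsec.ByteString as A
import qualified Data.ByteString as B
import Data.Word (Word8)

-- hypothetical wire format: one tag byte, one length byte,
-- then exactly that many payload bytes
data Packet = Packet Word8 B.ByteString
  deriving Show

packet :: A.Parser Packet
packet = do
  tag  <- A.anyWord8
  len  <- A.anyWord8
  body <- A.take (fromIntegral len)
  return (Packet tag body)

main :: IO ()
main = print (A.parseOnly packet (B.pack [1, 3, 104, 105, 33]))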
You might be interested in attoparsec, which was designed for this purpose, I think.
I've used Data.Binary successfully.
It works fine, though you might want to use Parsec 3, attoparsec, or iteratees. Parsec's reliance on String as its intermediate representation may bloat your memory footprint quite a bit, whereas the others can be configured to use ByteStrings.
Iteratees are particularly attractive because it is easier to ensure they won't hold onto the beginning of your input, and they can be fed chunks of data incrementally as they become available. This prevents you from having to read the entire input into memory in advance and lets you avoid other nasty workarounds like lazy IO.
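To illustrate the incremental-feeding idea, here is a minimal sketch that uses attoparsec's resumable Partial results rather than an actual iteratee library; the input and its split into chunks are made up:

import qualified Data.Attoparsec.ByteString as A
import qualified Data.ByteString.Char8 as B

main :: IO ()
main = do
  -- start parsing with only part of the input available
  let partial = A.parse (A.string (B.pack "hello")) (B.pack "he")
  -- the result is a Partial continuation; feed it the next chunk
  -- as it becomes available (prints: Done "" "hello")
  print (A.feed partial (B.pack "llo"))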
The best approach depends on the format of the binary file.
Many binary formats are designed to make parsing easy (unlike text formats, which are primarily meant to be read by humans). So any union data type will be preceded by a discriminator that tells you what type to expect, all fields are either fixed length or preceded by a length field, and so on. For this kind of data I would recommend Data.Binary; typically you create a matching Haskell data type for each type in the file, and then make each of those types an instance of Binary. Define the "get" method for reading; it returns a "Get" monad action which is basically a very simple parser. You will also need to define a "put" method.
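A minimal sketch of that pattern, with a made-up Shape union whose leading byte is the discriminator (the type and its field layout are invented for illustration):

import Data.Binary (Binary (..), decode, encode)
import Data.Binary.Get (getWord32be, getWord8)
import Data.Binary.Put (putWord32be, putWord8)
import Data.Word (Word32)

-- hypothetical union type: the first byte is a discriminator tag,
-- followed by fixed-length fields
data Shape
  = Circle Word32       -- tag 0, then the radius
  | Rect Word32 Word32  -- tag 1, then width and height
  deriving Show

instance Binary Shape where
  put (Circle r) = putWord8 0 >> putWord32be r
  put (Rect w h) = putWord8 1 >> putWord32be w >> putWord32be h
  get = do
    tag <- getWord8
    case tag of
      0 -> Circle <$> getWord32be
      1 -> Rect <$> getWord32be <*> getWord32be
      _ -> fail "unknown Shape tag"

-- round-trip: encode a value, then decode it back
main :: IO ()
main = print (decode (encode (Rect 3 4)) :: Shape)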
On the other hand, if your binary data doesn't fit into this kind of world, then you will need attoparsec. I've never used it, so I can't comment further, but this blog post is very positive.