Using NSXMLDocument, without using XPath, is there an easy way to parse an XML document and deserialize it into custom objects to create an object tree hierarchy?
For example, if I have the XML shown below, is it possible to put the details into a Restaurant object with content objects inside it?
<restaurant>
    <content>spanish name</content>
    <content>english name</content>
</restaurant>
<spa>
    <content>spa spanish name</content>
    <content>spa english name</content>
</spa>
I will be using your answer above and extending it for KissXML on iOS. Since the KissXML documentation says its parser behaves the same way as NSXMLDocument, I've asked the question in terms of NSXMLDocument.
If you know the structure of the content you expect, the easiest approach is to just use NSXMLParser: loop through the document looking for the bits you need, keep track of the previous bit, and build an object as you find each one.
If you want a tree-based approach, consider learning XQuery and XPath; they are not all that bad. Without them, the only thing NSXMLDocument really gives you is Cocoa bindings.
At the end of the day you must transform your data somehow.
With NSXMLDocument you will still do well to validate against an XML DTD if possible, to ensure you have good data.
With NSXMLParser, you are able to handle things without a formal DTD.
You only need to worry about how big the data is and how you want to parse it, then do some trial and error with test data to ensure it's grabbing what you want or need.
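To make that event-driven approach concrete, here is a minimal sketch against the sample XML above (assuming the fragment is wrapped in a single root element, which NSXMLParser requires). All the names here (Restaurant, RestaurantParserDelegate, the contents property) are just illustrative, and only the <restaurant> branch is shown; <spa> would be handled the same way:

#import <Foundation/Foundation.h>

// Illustrative model object: a restaurant holding its <content> strings.
@interface Restaurant : NSObject
@property (nonatomic, strong) NSMutableArray *contents;
@end

@implementation Restaurant
@end

// Event-driven delegate that builds Restaurant objects as elements arrive.
@interface RestaurantParserDelegate : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *restaurants;
@end

@implementation RestaurantParserDelegate {
    Restaurant *_current;      // the object currently being built
    NSMutableString *_text;    // character data of the current <content>
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict
{
    if ([elementName isEqualToString:@"restaurant"]) {
        _current = [Restaurant new];
        _current.contents = [NSMutableArray array];
    } else if ([elementName isEqualToString:@"content"]) {
        _text = [NSMutableString string];  // start collecting character data
    }
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    [_text appendString:string];           // may be called several times per element
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    if ([elementName isEqualToString:@"content"]) {
        [_current.contents addObject:[_text copy]];
        _text = nil;
    } else if ([elementName isEqualToString:@"restaurant"]) {
        if (!self.restaurants) { self.restaurants = [NSMutableArray array]; }
        [self.restaurants addObject:_current];
        _current = nil;
    }
}
@end

Drive it with something like [[NSXMLParser alloc] initWithData:yourData], set the delegate, and call parse. With KissXML you would walk the DDXMLDocument/DDXMLElement tree instead, but the step of copying values into your own objects looks the same.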
Related
I am working on a project where I need to deal with Wiktionary. For some entries there are context labels/tags before the sense I want to query, e.g. idiomatic or transitive. I am now trying to use JWKTL to do the job, but it seems that no API call supports this query.
Can anyone let me know how to get that information with JWKTL, or is there any other tool that can parse the Wiktionary dump .xml file and access those labels/tags?
Thanks.
According to Dr. Christian Meyer, there is currently no API for this.
I ended up using pattern matching on the original Wiktionary .xml dump.
I am working on a project that receives data (JSON) from a number of different sources. Each source returns JSON in a different format; however, all of the services fall into the same category, i.e. Issues from Jira and Stories from PivotalTracker carry the same core information.
I am looking for a way to normalize this as much as possible so that I can add other services and formats in the future. Right now I am handling each response type (Jira, PivotalTracker) separately and taking action on each response independently.
So far I am thinking that I'll need a parser for each service, e.g. JiraIssueParser, PivotalTrackerStoryParser, etc., each of which transforms the response into a common format that can be used by a single method to post onwards, rather than having separate receive/parse/post methods for each service.
Something like this format:
{
  issue: {
    title: ,
    description: ,
    assignee: ,
    comments: {
      1: {
        id: ,
        title: ,
        body:
      }
    },
    time_entries: {
      1: {
        id: ,
        time: ,
        date:
      }
    }
  }
}
I would like to define the common schema somewhere so that each parser's output is always identical. I'm thinking this could be done with a YAML file but I'm not sure how to go about it, and how to use that in the parser.
I would greatly appreciate some suggestions on how to do this. Maybe this is a really stupid question and I should just be outputting the above format from each parser, but I think it would make sense to have some kind of format that is enforced/validated.
Suggestions are appreciated and I'm open to taking a new direction with this if anyone has any ideas. Thanks in advance.
If you're using Rails, then I assume you are going to have a relational database at some point.
What I would suggest is to define ActiveRecord models that express your "normalized" format: Issue, Comment, TimeEntry, etc.
The job of your parsers, then, is to coerce the JSON data into the appropriate model objects and attributes and save them. Your models thus enforce the canonical data structure (i.e. the schema), and you can even use validators to do further sanity checks.
Finally, I would also save the raw JSON somewhere alongside your model, preferably also in the database. Even though you have already parsed the JSON, keeping it around will come in handy for troubleshooting. For example, if you find and fix a parsing bug, you can re-run the parser on the saved JSON without having to re-download everything from the original external sources.
I'm currently querying an Entity using projections to avoid returning the entire object.
It works flawlessly, however, when looking at the actual response from the server, I'm seeing the same type definition repeated for every single element.
For example:
["$type":"_IB_4NdB_p8LiaC3WlWHHQ_pZzrAC_plF4[[System.Int32, mscorlib],[System.String, mscorlib],[System.String, mscorlib],[System.String, mscorlib],[System.Nullable`1[[System.Int32, mscorlib]], mscorlib],[System.Int32, mscorlib],[System.Single, mscorlib]], _IB_4NdB_p8LiaC3WlWHHQ_pZzrAC_plF4_IdeaBlade"
Now, given that every item in the result is sharing the same projection for that query, is there a way to have Breeze only define the Type Description ONCE instead of for every element?
It may not seem like a big deal, but as result size increases those bytes do start to add up. At the moment there is little difference between returning the projected values and the entire entity itself due to this overhead.
NOTE: As it turns out, since we use dynamic compression of JSON in our real environments, this is actually a minor issue: 200KB responses turn into less than 20KB of traffic after gzip compression. I will probably be closing this question, unless someone has something to add that could be of use to others.
Update 18 September 2014
I decided to "cure" the problem of the long ugly $type names in serialized data for both dynamic types from projection queries and anonymous types created for an endpoint such as "Lookups".
There's a new Breeze Labs NuGet package, "Breeze.DynamicTypeRenaming" (search for "Breeze Dynamic Type Renaming"). This adds two files to your Web API project's "Controllers" folder. One is a CustomBreezeConfig, which replaces Breeze's default config and resets the Json.NET "Binder" setting with the new DynamicTypeRenamingSerializationBinder; this binder does the type-name magic.
Just install the NuGet package in your Web API project and it should "just work". In your case, the $type value would become "_IB_4NdB_p8LiaC3WlWHHQ_pZzrAC_plF4, Dynamic".
See an example of it in the "DocCode" sample.
As always, this is a Breeze Lab product, not part of the core Breeze product. It is offered "as is" with no promise of support. I'm pretty sure it's good and has no adverse side-effects. No guarantees. I'm sure you'll let me know if there's a problem.
That IS atrocious, isn't it! That's the C# generated anonymous type. You can get rid of it by casting into a custom DTO type.
I don't know if it is actually harmful. I hate looking at it in any case.
Lately I've been thinking about adding a JSON.NET IContractResolver that detects such uglies and turns them into shorter uglies. Wouldn't be hard. Just haven't had the time.
Why not write that yourself and contribute to the community? We'd be grateful! :-)
Using Dynamic Compression of JSON output has turned this into a non-issue, at least for now, since all that repeated content is heavily compressed server-side.
I'm currently struggling with the problem of multilingualism in an SPA.
I've come up with several solutions, like building a wrapper around the .resx resource files or saving all labels in the database, but I am wondering whether any of you have found a solution that automates these steps.
Are there any practices which are specific for this problem?
For a reasonable number of literals, I suggest saving the resources in the DB or in a .resx file on the server. When the user logs in or you detect the language that will be used, the literals are requested by the application and saved either in a collection inside your translation module or in the localStorage of the browser (the latter can be a good approach for large data).
Then this module could have some methods to retrieve the messages, probably passing a key.
Using this solution you could inject this module into the viewmodels that need to show translated literals and access them through the view:
<p data-bind="text: resourceManager.get('M01')"></p>
For large applications that would require huge amounts of localization data to be transferred, some kind of modularity could be applied so that only the resources really needed for each module/section are loaded.
I don't think making recurring requests to the server to get the translated literals is a good practice. SPAs should provide a good user experience, and loading the translated literals from the server could be a blocking issue. Text is not like an image: you can render a page without all the images loaded, but imagine rendering a page without the text :o
Anyway, I think the best solution would be to keep the server as the repository and create a custom JS module that takes care of fetching the data in one or more loads and is able to store it somewhere on the client.
I've solved my own problem, using a custom binding and i18next.
First, I've implemented i18next for translation of my labels/buttons and other resources.
Secondly, I've added a custom Knockout bindingHandler:
ko.bindingHandlers.i18n = {
    init: function (element, valueAccessor) {
        var translateKey = valueAccessor();
        ko.utils.setTextContent(element, $.t(translateKey));
    }
};
Finally you can add the following code to your views:
<span data-bind="i18n : 'buttons.cancel'"></span>
This will automatically get the correct resource, and Knockout will handle the bindings.
Hopefully this will help others struggling with the same problem.
What I am doing on iOS is displaying a grid of 10 images. For this I take an array and add all the images to it, with code like this:
[_image addObject:[UIImage imageNamed:@"black.jpg"]];
This is how I add all the images to the array. Now what I need is to read the image names from an XML file and add the corresponding images to the array. How can I do this?
The class you need is NSXMLParser and the delegate NSXMLParserDelegate.
Have a look here or even here on SO. What you are looking for is how to parse an XML document, and you can do it with the NSXMLParser class, libxml2, or other libraries and tools mostly built on those two. Also have a look at this project by Apple; it will show you the main differences in performance between the technologies.
First of all you need to parse your XML file. You can use the NSXMLParser class. How exactly you do this depends on the structure of your XML file. There are a lot of tutorials on the Internet about how to use it; for example, see here.
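For example, assuming the XML file simply lists the file names, something like <images><image>black.jpg</image> ... </images> (these element names are only an assumption, so adjust them to match your actual file), a small NSXMLParser delegate can collect the names and you can then build the image array from them:

#import <UIKit/UIKit.h>

// Collects the text of every <image> element into imageNames.
@interface ImageListParserDelegate : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *imageNames;
@end

@implementation ImageListParserDelegate {
    NSMutableString *_text;
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict
{
    if ([elementName isEqualToString:@"image"]) {
        _text = [NSMutableString string];
    }
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    [_text appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    if ([elementName isEqualToString:@"image"]) {
        if (!self.imageNames) { self.imageNames = [NSMutableArray array]; }
        // Trim whitespace in case the file indents the element content.
        NSString *name = [[_text copy] stringByTrimmingCharactersInSet:
                          [NSCharacterSet whitespaceAndNewlineCharacterSet]];
        [self.imageNames addObject:name];
        _text = nil;
    }
}
@end

Then, assuming the file is bundled as images.xml, parsing and building the array could look like this (_image being the array from the question):

NSURL *url = [[NSBundle mainBundle] URLForResource:@"images" withExtension:@"xml"];
NSXMLParser *parser = [[NSXMLParser alloc] initWithContentsOfURL:url];
ImageListParserDelegate *delegate = [ImageListParserDelegate new];
parser.delegate = delegate;
[parser parse];

for (NSString *name in delegate.imageNames) {
    [_image addObject:[UIImage imageNamed:name]];
}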