How can I parse special characters in XML with Objective-C? - ios

I have an XML tag like this:
<title>Transport information Classic World's </title>
I can parse this element, but in my object I get only the 's' character. I parse it like this:
if ([elementname isEqualToString:@"title"])
{
    currentTweet.content = currentNodeContent;
}
How can I decode the whole text in the title?

When you create the XML, try wrapping the text in a CDATA section, like this:
<title><![CDATA[Transport information Classic World's]]></title>
Also, here is a list of HTML tags and more cases: XML containing those characters is invalid unless they are escaped or contained within a CDATA section.
Also take a look at this question; I hope it helps.
Since you say you cannot change the XML, I don't think this can be resolved for now; the parser is not able to parse this XML as it is.

If you have the possibility, wrap the special characters in CDATA sections when you create this XML.
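On the side that produces the XML, the two usual options (escaping the reserved characters, or wrapping the text in a CDATA section) might look like the sketch below. It is written in Ruby purely for illustration, since the question does not say how the feed is generated; `escape_xml_text` and `cdata_section` are hypothetical helper names, not part of any library.

```ruby
# Hypothetical helpers for whoever generates the XML; not a library API.

# The five characters XML reserves, mapped to their predefined entities.
XML_ENTITIES = {
  '&' => '&amp;',
  '<' => '&lt;',
  '>' => '&gt;',
  '"' => '&quot;',
  "'" => '&apos;'
}.freeze

# Option 1: escape the reserved characters in the element text.
def escape_xml_text(text)
  text.gsub(/[&<>"']/, XML_ENTITIES)
end

# Option 2: wrap the text in a CDATA section. The sequence "]]>" would end
# the section early, so it is split across two adjacent sections.
def cdata_section(text)
  "<![CDATA[#{text.gsub(']]>', ']]]]><![CDATA[>')}]]>"
end

title = "Transport information Classic World's"
puts "<title>#{escape_xml_text(title)}</title>"
puts "<title>#{cdata_section(title)}</title>"
```

Either form should reach a conforming parser as the same text; which one to use is mostly a matter of what the generating code makes convenient.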

Related

How can I bold part of a string from JSON in Swift?

I have a JSON file like this. I have to make part of a string bold, as marked in the JSON. How can I parse this JSON?
It looks to me like you would first want to use NSJSONSerialization (or just JSONSerialization in Swift 3) to convert your JSON to an object graph. Once you've done that, you should be able to navigate to the interestLabel keys in your data and fetch those strings.
You'll then need to parse those tagged strings somehow. If the only thing you need to do is to find <b> and </b> bold tags, and no other tags will ever appear in your data then you could probably write your own code. If the strings might have other tags and/or more complex HTML structure then you might want to use an XML/HTML parser. I suggest taking a look at this tutorial: https://www.raywenderlich.com/14172/how-to-parse-html-on-ios

TBXML doesn't parse tag with special character as value

I'm trying to parse an XML using TBXML and everything is going fine except for tags which contain special characters in their value.
For example, consider the XML element
<tag> sources/data </tag>
I'm trying to get the text sources/data from this tag. I'm using [TBXML textForElement:element] to achieve this. But it always returns an empty string.
The same code fails for another tag which is defined as :
<tag> array[i] </tag>.
But it works fine for normal text values like
<tag>name</tag>.
Can anyone help me out here ?
Quote: "Because XML syntax uses some characters for tags and attributes it is not possible to directly use those characters inside XML tags or attribute values."
http://www.dvteclipse.com/documentation/svlinter/How_to_use_special_characters_in_XML.3F.html
As far as I know, this kind of data must be placed in a CDATA section.

Can I leave some sections unparsed using NSXMLParser?

I have an XML document which I want to parse using NSXMLParser. One of the tags it can contain is <html>, and in my parsed representation I want the contents of that tag, verbatim. However, when I parse the document, my delegate methods are called for the start, end and contents of each tag inside the html tag.
I can't get the provider of the document to add CDATA tags; nor can I use something other than NSXMLParser to parse the document.
Is there a way for me to tell the parser to treat the contents of HTML tags as CDATA and to leave them unparsed, even if they contain other tags?
It's too bad that the owner of the XML feed won't fix it because, depending on the HTML, you may end up with a malformed XML feed. If it really is an XML document, they definitely should wrap the HTML in a CDATA section or replace all the < with &lt; and all the > with &gt;.
Frankly, if all you need is the HTML, and all you have is an XML tag that contains the HTML without the CDATA or appropriate character replacement, I might not be inclined to try to run it through NSXMLParser at all (because successful parsing is contingent on the nature of the HTML included). I'd use an NSScanner or NSRegularExpression to extract all of the text between the XML's opening and closing tags that wrap your HTML.
Or, if you really want to use NSXMLParser (because there's other stuff in addition to the HTML that you need), then manually alter the NSData, wrapping the HTML in a CDATA yourself.
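The pre-processing step just described, wrapping the HTML in CDATA yourself before parsing, could be prototyped as below. This is a rough sketch in Ruby rather than Objective-C, meant only to show the string transformation; it assumes the element is literally `<html>` with no attributes and is never nested, and `wrap_html_in_cdata` is a hypothetical helper name. The same substitution could be done on the iOS side with NSRegularExpression before handing the data to NSXMLParser.

```ruby
# Wraps the body of every <html>...</html> element in a CDATA section so that
# an XML parser reports it as plain text instead of as nested elements.
# Assumption: the tag is exactly "<html>" (no attributes) and is not nested.
def wrap_html_in_cdata(xml)
  xml.gsub(%r{<html>(.*?)</html>}m) do
    body = Regexp.last_match(1)
    # "]]>" would terminate the section early, so split it across two sections.
    safe = body.gsub(']]>', ']]]]><![CDATA[>')
    "<html><![CDATA[#{safe}]]></html>"
  end
end

feed = '<item><id>7</id><html><p>Hello <b>world</b></p></html></item>'
puts wrap_html_in_cdata(feed)
```

For the sample feed this prints `<item><id>7</id><html><![CDATA[<p>Hello <b>world</b></p>]]></html></item>`, so the `<p>` and `<b>` tags would arrive as character data rather than as element callbacks.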
If, on the other hand, the document you're trying to parse really isn't XML, but rather is just HTML, then of course you shouldn't be parsing it with an XML parser. You should be using an HTML parser, like Hpple, as described in Galloway's article, How to Parse HTML on iOS, on the Ray Wenderlich site.

NSXMLParser : not retrieving correct data if there are special characters or italics

I am using NSXMLParser to parse XML from a URL. Some of the elements contain special characters in their text, and also italics.
Please find the below xml element with italic tags in text:
<name>Verify Settings<i>i</i>patch level</name>
NSXMLParser breaks the text and gives Output: Verify Settings
Is there any way to parse italics text in between elements?
Please find the xml with special characters below :
<impact> In 2003, the ¿shared APPL_TOP¿ architecture was introduced, which allowed the sharing of a single APPL_TOP, however the tech stack
· Reduced disk space requirements
· Reduced maintenance
· Reduced administrative costs
· Reduced patching down time
· Less complex to add additional nodes, making scalability easier
· Complexity of instance reduced
· Easier backups
· Easier cloning</impact>
It breaks the text and gives Output: e costs ·Reduced patching down time ·Less complex to add additional nodes, making scalability easier ·Complexity of instance reduced ·Easier backups ·Easier cloning
Any suggestions on how to parse italic tags in the text and special characters using NSXMLParser ?
Here is my foundCharacters code:
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    // currentStringValue is an NSMutableString instance variable; create it on first use
    if (!self.currentStringValue) {
        self.currentStringValue = [[NSMutableString alloc] init];
    }
    // foundCharacters may be called more than once per element, so append
    [self.currentStringValue appendString:string];
}
Both of these look less like XML parsing problems than XML generation problems. How are you generating this XML? It feels like a manually generated XML, as opposed to something generated by a proper XML library.
Look at your XML from the parser's perspective: How is NSXMLParser supposed to know that the <i> is HTML in the <name> element, and not a new XML tag itself?!? If this is indeed what the XML looks like, you really should just fix your web service.
For example, looking at your problem with the italics, the issue is that the <i> looks like a new element name. Generally that should be represented either as:
<name>Verify Settings&lt;i&gt;i&lt;/i&gt;patch level</name>
Or as
<name><![CDATA[Verify Settings<i>i</i>patch level]]></name>
This encoding of the name property is generally done by the API that does the XML encoding in the web service. Generally you don't need to do anything to get this behavior. But if your web service is manually creating its own XML, that could give you the sort of output that you describe in your original question.
On the second example, I would have thought that the characters in the XML must conform to the character set declared in the <?xml ...> tag, e.g.:
<?xml version="1.0" encoding="ISO-8859-1"?>
What does your <?xml ...> tag say? Are the characters listed falling within the encoding listed there?
Looking at your revised foundCharacters, the new rendition is much better. The previous rendition suffered from a problem, insofar as it assumed that foundCharacters would be called only once for any given pair of <name> and </name> tags. That is not necessarily the case. Your latest rendition correctly creates currentStringValue if it needs to, and then appends to it. That is the correct approach, consistent with the examples in the Apple documentation. You might only want to do that if you're parsing one of the elementName types that you care about (e.g. <name>), but with that minor caveat, this new rendition looks much better.

Ruby on rails string parsing

I have a string that is a bunch of XML tags.
Basically, I want the contents of one tag and want to ignore everything else.
The input would look like:
<Some><XML><stuff>
<title type='text'>key</title>
<Some><other><XML><stuff>
The output would look like:
key
I'm not sure whether an XML parser is appropriate, since there doesn't seem to be much structure to this particular XML.
Can regex do this in RoR or is it more of just a pattern matching thing (true or false) in ruby on rails?
Thanks so much!
Cheers,
Zigu
No. If your source might not be strictly valid XML, I strongly suggest you use Nokogiri.
Handle the source as an HTML document and extract the info you need in this way:
require 'nokogiri'

doc = Nokogiri::HTML("Your string with <key>some value</key>")
doc.search('key').each do |value|
  puts value.content # do whatever you want
end
Here's why you don't parse xml with regexen: RegEx match open tags except XHTML self-contained tags
