Special characters in XML attributes with NSXMLParser - ios

I get something like the following from a third party API:
<el attr="test.
another test." />
I use NSXMLParser to read the file into my app.
However, in the delegate, the attribute gets converted to test. another test. Obviously I'd like to see it with the line breaks intact.
I initially assumed this was a bug, but according to the XML spec it's doing the right thing:
a whitespace character (#x20, #xD, #xA, #x9) is processed by
appending #x20 to the normalized value
(See section 3.3.3.)
What are my options? Note that it's a third party API so I can't change it, even though it's wrong.

You can replace the newlines with a placeholder before handing the data to the parser:
NSData *xmlData = [NSData dataWithContentsOfURL:[NSURL URLWithString:@"url://"]];
NSMutableString *myXmlStr = [[NSMutableString alloc] initWithData:xmlData encoding:NSUTF8StringEncoding];
// Replace every newline, not just the first one
[myXmlStr replaceOccurrencesOfString:@"\n" withString:@"[:newline:]" options:NSLiteralSearch range:NSMakeRange(0, myXmlStr.length)];
NSData *newXmlData = [myXmlStr dataUsingEncoding:NSUTF8StringEncoding];
Then replace [:newline:] with a newline character wherever you need it :)
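The placeholder round trip can be sketched as a pair of helpers. This is only an illustration: the names kNewlinePlaceholder, EscapeNewlines and RestoreNewlines are hypothetical, and it assumes newlines occur only inside attribute values or between elements. A newline between two attributes inside a tag would be replaced too and would leave the markup malformed.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical placeholder; pick any token that cannot occur in the real data.
static NSString * const kNewlinePlaceholder = @"[:newline:]";

// Swap every newline for the placeholder before the XML reaches NSXMLParser.
static NSString *EscapeNewlines(NSString *xml) {
    return [xml stringByReplacingOccurrencesOfString:@"\n"
                                          withString:kNewlinePlaceholder];
}

// Undo the swap on a single attribute value read in the parser delegate.
static NSString *RestoreNewlines(NSString *value) {
    return [value stringByReplacingOccurrencesOfString:kNewlinePlaceholder
                                            withString:@"\n"];
}
```

RestoreNewlines would typically be called on each value of the attributes dictionary passed to parser:didStartElement:namespaceURI:qualifiedName:attributes:.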

Related

cant get 100,000+ hebrew characters into objective-c string

I have 100,000+ characters of text that need to be converted into a string so I can count the characters and display them on a page correctly, but the text contains tons of quotation marks ("") and lots of commas, so it doesn't even turn into a string.
Does anyone know a way to ignore quotation marks and commas inside an NSString without having to write \" each time?
Here's some of the text. It's English/Hebrew.
Psalm 30
...
Psalm 100
...
The following Psalm is not to be said on Shabbat, Festivals, the day before Pesach, Chol HaMoed Pesach, and the day of Yom Kippur
...
You say “I cant even turn the text into a string”. Since you said (in a comment) you're “just pulling it off this website”, the simplest way to do this is +[NSString stringWithContentsOfURL:usedEncoding:error:]. This works for me:
NSURL *url = [NSURL URLWithString:@"http://opensiddur.org/wp-content/uploads/2010/08/The-Blessing-Book-Nusa%E1%B8%A5-Ha-Ari-%E1%B8%A4aBaD-3.2.txt"];
NSError *error = nil;
NSString *text = [NSString stringWithContentsOfURL:url usedEncoding:nil error:&error];
NSLog(@"error=%@ text.length=%lu", error, (unsigned long)text.length);
You can look into NSURLSession or NSURLConnection when you want to do it in a non-blocking fashion.
If you plan to distribute the text in a file (named, let's say, “blessingBook.txt”) in your app bundle, you can get the URL this way:
NSURL *url = [[NSBundle mainBundle] URLForResource:@"blessingBook" withExtension:@"txt"];
If you're loading it directly from your app bundle, you probably don't need to worry about using NSURLSession to load it in the background. You might want to do your “processing” in the background though, if it takes a while.
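Once the text is loaded, note that text.length counts UTF-16 code units, so a Hebrew letter followed by a vowel point counts as two. If the goal is to count characters as a reader sees them, one option is to count composed character sequences. A minimal sketch, with the helper name CountComposedCharacters being an assumption for illustration:

```objective-c
#import <Foundation/Foundation.h>

// Counts user-perceived characters (composed character sequences)
// rather than UTF-16 code units.
static NSUInteger CountComposedCharacters(NSString *text) {
    __block NSUInteger count = 0;
    [text enumerateSubstringsInRange:NSMakeRange(0, text.length)
                             options:NSStringEnumerationByComposedCharacterSequences
                          usingBlock:^(NSString *substring, NSRange substringRange,
                                       NSRange enclosingRange, BOOL *stop) {
        count++;
    }];
    return count;
}
```

For a 100,000+ character text this walk can take a moment, which is one reason to run it off the main thread.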
You can replace the punctuation, commas, or whatever else you want with @"" (the empty string).
yourString = [yourString stringByReplacingOccurrencesOfString:@"," withString:@""];
yourString = [yourString stringByReplacingOccurrencesOfString:@":" withString:@""];
yourString = [yourString stringByReplacingOccurrencesOfString:@";" withString:@""];
Replace any other text you want to remove in the same way.
It is a lengthy process, though, to find every unwanted quotation mark, comma, or other character and replace it with an empty string.
Hope it helps!

Converting NSData to NSString returns nil

I know this question is asked before, but none of the solutions worked for me. I am trying to convert an NSData object to a NSString value. I am initing the NSString object like following:
NSString *html = [[NSString alloc] initWithData:urlData encoding:NSUTF8StringEncoding];
But html is always nil. The NSData I am trying to convert is the source code of a website. It is fairly long. This is the NSData I am trying to convert.
Is it the length of the data that is causing the issue? I need the source code as a string. What can I do to resolve this issue?
What I tried so far:
Tried with all encoding formats as shown in this answer.
Tried with [NSString stringWithUTF8String:[urlData bytes]];
But whatever I do produces the same result: html is always nil.
EDIT
It was a problem with the debug console. Even when the objects had values in them, the debug console always showed nil as the value for most of the objects. However, NSLog always displays the value.
It's not a problem with the debugger.
The problem comes from compiler optimization: the compiler sees that the string is not used directly, so it optimizes the variable away and passes the value straight to another method.
The key to the problem: you are running the project with the Release build configuration.
Solution:
Here is a small guide to switching the project to the Debug configuration:
1) Click on the target, and click Edit scheme...
2) A popup will be displayed
3) Click Run %Your project%
4) Open the Build Configuration popup
5) Select Debug
6) Press OK
7) You are ready to go! Now you can debug anything :)
If you are using ARC, and you just wrote the code that converts the data to a string and haven't written any code yet that actually uses the string, it will get deallocated immediately. Check whether that is what is happening. For example, what does NSLog(@"%@", html) display?
NSError *error = nil;
NSAttributedString *str = [[NSAttributedString alloc] initWithData:data
                                                           options:@{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
                                                                     NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)}
                                                documentAttributes:nil
                                                             error:&error];
Try this one:
NSString *myString = [[NSString alloc] initWithData:urlData encoding:NSASCIIStringEncoding];
Generally, when conversion from NSData to NSString returns nil, it means there is a mismatch between the encoding of the data received from the server and the encoding passed to the initializer.
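When the encoding is not known in advance, one rough approach is to try a short list of candidates in order and keep the first that decodes. This is only a sketch: the helper name StringFromData and the choice and order of candidate encodings are assumptions, not something the answers above prescribe. A successful decode with a single-byte encoding does not prove the text is correct, only that every byte mapped to some character.

```objective-c
#import <Foundation/Foundation.h>

// Tries each candidate encoding in turn and returns the first string
// that decodes, or nil if none of them do.
static NSString *StringFromData(NSData *data) {
    NSStringEncoding candidates[] = {
        NSUTF8StringEncoding,          // strict: fails on invalid byte sequences
        NSWindowsCP1252StringEncoding,
        NSISOLatin1StringEncoding      // maps every byte to a character
    };
    size_t count = sizeof(candidates) / sizeof(candidates[0]);
    for (size_t i = 0; i < count; i++) {
        NSString *string = [[NSString alloc] initWithData:data
                                                 encoding:candidates[i]];
        if (string != nil) {
            return string;
        }
    }
    return nil;
}
```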

Determining Issue With Retrieving JSON from URL in iPhone

Let me start off by saying that I am not particularly trying to find a solution, just the root cause of the problem. I am trying to retrieve JSON from a URL. In the browser, the URL call works just fine and I am able to see the entire JSON without issue. However, in Xcode, when simply using NSURLConnection, I am getting data bytes, but my NSString is null.
theString = [[NSString alloc] initWithData:urlData encoding:NSUTF8StringEncoding];
After doing some research I have found that I am probably trying to use the wrong encoding. I am not sure what type of encoding is being used by the url, so on first instinct I just tried some random encoding types.
NSString* myString = [[NSString alloc] initWithData:data encoding:NSASCIIStringEncoding];
NSString* myString2 = [[NSString alloc] initWithData:data encoding:NSUTF16StringEncoding];
NSString* myString3 = [[NSString alloc] initWithData:data encoding:NSWindowsCP1252StringEncoding];
NSASCIIStringEncoding and NSWindowsCP1252StringEncoding are able to bring back partially correct JSON. It is not the entire JSON that I am able to view in the browser, and some characters are a little messed up, but it is something. To better determine which encoding was used, I decided to use the following method and look at the encoding it reported.
NSError *error = nil;
NSStringEncoding encoding;
NSString *my_string = [[NSString alloc] initWithContentsOfURL:url
                                                 usedEncoding:&encoding
                                                        error:&error];
My NSStringEncoding value is 3221214344, and this number is consistent every time I run the app. I cannot find any NSStringEncoding values that even come close to matching this.
My final question is: Is the encoding used for this URL not consumable by iOS, is it possible that multiple encodings were used for this URL, or is there something else that I could be doing wrong on my end?
It's best not to rely on Cocoa to figure out the string encoding if possible, especially if the data might be corrupted. A better approach would be to check if the value indicated by the HTTP Content-Type header specifies a character set like in this example:
Content-Type: text/html; charset=ISO-8859-4
Once you're able to parse and retrieve a character set name from the Content-Type header, you need to convert it to an NSStringEncoding, first by passing it to CFStringConvertIANACharSetNameToEncoding, and then passing the returned CF string encoding to CFStringConvertEncodingToNSStringEncoding. After that, you can initialize your string using -[NSString initWithData:encoding:].
NSData *HTTPResponseBody = …; // Get the HTTP response body
NSString *charSetName = …; // Get a charset name from the Content-Type HTTP header
// Get the Core Foundation string encoding (the __bridge cast is needed under ARC)
CFStringEncoding cfencoding = CFStringConvertIANACharSetNameToEncoding((__bridge CFStringRef)charSetName);
// Confirm this is a known encoding
if (cfencoding != kCFStringEncodingInvalidId) {
    // Initialize the string
    NSStringEncoding nsencoding = CFStringConvertEncodingToNSStringEncoding(cfencoding);
    NSString *JSON = [[NSString alloc] initWithData:HTTPResponseBody
                                           encoding:nsencoding];
}
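The name-to-encoding conversion above can be wrapped in a small helper that falls back to a default when the header has no charset or names an unknown one. A sketch, assuming ARC; the helper name EncodingForCharsetName and the fallback parameter are hypothetical:

```objective-c
#import <Foundation/Foundation.h>

// Maps an IANA charset name (e.g. "iso-8859-1") to an NSStringEncoding.
// Returns `fallback` when the name is missing or not recognized.
static NSStringEncoding EncodingForCharsetName(NSString *name,
                                               NSStringEncoding fallback) {
    if (name.length == 0) {
        return fallback;
    }
    CFStringEncoding cfEncoding =
        CFStringConvertIANACharSetNameToEncoding((__bridge CFStringRef)name);
    if (cfEncoding == kCFStringEncodingInvalidId) {
        return fallback;
    }
    return CFStringConvertEncodingToNSStringEncoding(cfEncoding);
}
```

With an NSURLResponse in hand, its textEncodingName property already carries the charset parsed from Content-Type (or nil), so a call might look like EncodingForCharsetName(response.textEncodingName, NSUTF8StringEncoding).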
You still may run into problems if the string data you're working with is corrupted. For example, in the above code snippet, perhaps charSetName is UTF-8, but HTTPResponseBody can't be parsed as UTF-8 because there's an invalid byte sequence. In this situation, Cocoa will return nil when you try to instantiate your string, and short of sanitizing the data so that it conforms to the reported string encoding (perhaps by stripping out invalid byte sequences), you may want to report an error back to the end user.
As a last-ditch effort, rather than reporting an error, you could initialize a string using an encoding that can handle anything you throw at it, such as NSMacOSRomanStringEncoding. The one caveat here is that Unicode or corrupted data may show up intermittently as symbols or unexpected alphanumerics.
Even though it seems that the answer has been provided in the comments (using iso-8859-1 as the correct encoding) I thought it worthwhile to discuss how I would go about debugging this problem.
You said that the Desktop Browser (Chrome) can digest the data correctly, so let's use that:
Enable Developer Tools https://developers.google.com/chrome-developer-tools/
When the Dev Tools window is open, switch to the "Network" tab and execute your call in that browser tab
Check the output by clicking on the request URL; it should give you some clue.
If that doesn't work, tools like Postman can help you to recreate the call before you implement it on the device

Parse RTF file into NSArray - objective C

I'm trying to parse an English dictionary from an RTF file into an array.
I originally had it working, but I'm not sure why it isn't now.
In the dictionary, each word is separated by a new line (\n). My code so far is:
// load the dictionary file's contents into memory
NSData *data = [NSData dataWithContentsOfFile:@"wordList.rtf"];
// convert the bytes from the file into a string
NSString *string = [[NSString alloc] initWithBytes:[data bytes]
                                            length:[data length]
                                          encoding:NSUTF8StringEncoding];
// split the string around newline characters to create an array
NSArray *englishDictionary = [string componentsSeparatedByString:@"\n"];
When I try and NSLog it, it just comes up with (null)...
I've already looked at: Objective-C: Reading a file line by line
But couldn't get the answer by Yoon Lee working properly. It prints out with two back slashes at the end as well as lots of unnecessary stuff at the start!
Any help would be greatly appreciated! Cheers!
Try using a plain text (txt) file instead of RTF. RTF files contain formatting information about the text as well. That's the unnecessary stuff you see after reading the content.
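With a plain text file, splitting into words might look like the sketch below. The helper name WordsFromText is hypothetical; it splits on any newline character and drops the empty entries that blank lines or a trailing newline would otherwise produce.

```objective-c
#import <Foundation/Foundation.h>

// Splits text into one entry per non-empty line.
static NSArray<NSString *> *WordsFromText(NSString *text) {
    NSArray<NSString *> *lines =
        [text componentsSeparatedByCharactersInSet:[NSCharacterSet newlineCharacterSet]];
    NSMutableArray<NSString *> *words = [NSMutableArray array];
    for (NSString *line in lines) {
        if (line.length > 0) {
            [words addObject:line];
        }
    }
    return words;
}
```

The text itself could be read with +[NSString stringWithContentsOfFile:encoding:error:], which reports why a load failed through its error argument. A bare relative path such as "wordList.rtf" often does not resolve to the file you expect, and that is one way to end up with (null).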

Cut one string into two others in Objective-C

I have an NSURLRequest being made to a server that returns a string.
string = [[NSMutableString alloc] initWithData:receivedData encoding:NSUTF8StringEncoding];
receivedData is the mutable array that the downloaded data is stored in. Everything works fine.
I have now, however, added another value to that string. An example of the returned string would be 14587728000000 , 376.99. Originally it was one value so I didn't have to do any splicing. But, now that I have another value, I want to be able to separate it into two different strings.
What should I do to separate the two values into different string? Some kind of search that goes till the first space, or something like that. I have access to the server, and the string is generated in PHP so the separator can be anything.
You can do this with the NSString componentsSeparatedByString method:
NSString *string = @"14587728000000,376.99";
NSArray *chunks = [string componentsSeparatedByString:@","];
You can find some other common NSString tricks (where I found this one) here.
Use -(NSArray *)componentsSeparatedByString: and pass in the token to split by.
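Since the example response in the question has spaces around the comma, the pieces may need trimming after the split. A small sketch, with the helper name SplitAndTrim being an assumption:

```objective-c
#import <Foundation/Foundation.h>

// Splits `string` on `separator` and trims surrounding whitespace
// and newlines from each piece.
static NSArray<NSString *> *SplitAndTrim(NSString *string, NSString *separator) {
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSMutableArray<NSString *> *result = [NSMutableArray array];
    for (NSString *piece in [string componentsSeparatedByString:separator]) {
        [result addObject:[piece stringByTrimmingCharactersInSet:whitespace]];
    }
    return result;
}
```

Because the server side is under your control, choosing a separator that can never appear in either value keeps the split unambiguous.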
