I am trying to parse an XML file that contains some Unicode characters. I tried to parse the file using NSXMLParser, but the parser stops as soon as it encounters a Unicode character.
Is there any other good solution to parse XML file with unicode letters?
Please suggest.
Have you tried TBXML for iPhone? http://www.tbxml.co.uk/
We have a link module that looks something like this:
const string lMod = "/project/_admin/somethingÜ" // Umlaut
We later use lMod like this to loop through the outlinks:
for a in obj->lMod do {}
But this only works when executing directly from DOORS, not from a batch script: for some reason the batch run doesn't recognize the umlaut, so the body of the loop never runs. Replacing lMod with "*" works and also shows the objects linked to through lMod.
We are already using UTF-8 encoding for the file:
pragma encoding, "UTF-8"
Any solutions are welcome.
Encode the file as UTF-8 in Notepad++ by going to Encoding > Convert to UTF-8. (Make sure it's not already set to UTF-8 before you do it).
I'm writing a script that will operate on the subtitle files of a popular streaming service (Netfl*x).
The subtitle files have strange characters in them, and I can't get my text editors or web browser to display them readably. The XML declaration says UTF-8, but some characters are not readable.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<tt xmlns:tt="http://www.w3.org/ns/ttml" xmlns:ttm="http://www.w3.org/ns/ttml#metadata" xmlns:ttp="http://www.w3.org/ns/ttml#parameter" xmlns:tts="http://www.w3.org/ns/ttml#styling" ttp:tickRate="10000000" ttp:timeBase="media" xmlns="http://www.w3.org/ns/ttml">
<p>de 15 % la nuit dernière.</span></p>
<p>if youâve got things to doâ¦</span></p>
And in Vim:
This is what it looks like in the browser:
How can I convert this into something I can use?
I'll go out on a limb and say that the file is UTF-8 encoded just fine, and you're merely looking at it using the wrong encoding. The character À encoded in UTF-8 is the two bytes C3 80. C3 in ISO-8859-1 is Ã, which in your screenshot is followed by an 80. So it looks like you're viewing a UTF-8 file as ISO-8859-1.
Use the correct encoding when opening the file.
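As a minimal sketch of the byte-level reasoning above (Python is used here purely for illustration; it is not part of the original question), the same two bytes can be decoded both ways to see where the Ã comes from:

```python
# À is U+00C0; its UTF-8 encoding is the two bytes C3 80.
utf8_bytes = "À".encode("utf-8")
print(utf8_bytes.hex())

# Decoding those bytes as ISO-8859-1 (Latin-1) yields two characters:
# Ã (from C3) followed by the control character 0x80.
misread = utf8_bytes.decode("latin-1")
print(repr(misread))

# Decoding with the encoding the bytes were written in recovers À.
correct = utf8_bytes.decode("utf-8")
print(correct)
```

The bytes on disk are identical in both cases; only the codec chosen by the viewer differs.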
My terminal is set to en_US.UTF-8, but it was also rendering this supposedly UTF-8 encoded file incorrectly (sonné -> sonnÃ©). I was able to solve this by using iconv to encode the file in ISO8859-1.
iconv -f UTF-8 -t ISO8859-1 original.xml -o converted.xml
In the new file, the characters were properly rendered, although I don't quite understand why.
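One possible explanation, offered as an assumption rather than a diagnosis of this particular file, is that the text was double-encoded: correct UTF-8 bytes were at some point read as ISO-8859-1 and saved again as UTF-8. Converting such a file to ISO-8859-1 removes the extra layer and leaves valid UTF-8 bytes behind. A small Python sketch of that scenario:

```python
original = "sonné"

# Correct UTF-8 bytes for the original text.
good_bytes = original.encode("utf-8")

# Hypothetical fault: a tool reads those bytes as Latin-1 and
# writes the result back out as UTF-8 (double encoding).
double_encoded = good_bytes.decode("latin-1").encode("utf-8")

# A UTF-8 viewer now shows mojibake instead of the accented letter.
shown = double_encoded.decode("utf-8")
print(shown)

# Converting from UTF-8 to Latin-1 strips the extra layer, and the
# resulting bytes are the original, valid UTF-8 again.
repaired_bytes = shown.encode("latin-1")
print(repaired_bytes.decode("utf-8"))
```

If this is what happened, the converted file is really UTF-8 despite the target encoding named in the conversion, which would explain why a UTF-8 terminal renders it properly.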
I want to create an RTF file by creating my own source code of the RTF file and inserting in variables from my model.
I am creating the source code using, for example:
NSMutableString *body = [NSMutableString stringWithString:@"{\rtf1\ansi\ansicpg1252\deff0\nouicompat\deflang3084\deflangfe3084{\fonttbl{\f0\froman\fprq2\fcharset0 Times New Roman;}{\f1\fswiss\fprq2\fcharset0 Calibri;}{\f2\froman\fprq2\fcharset2 Symbol;}}{\colortbl ;\red255\green255\blue255;\red0\green0\blue255;}{\*\generator Riched20 10.0.10240}\viewkind4\uc1\trowd\trgaph70\trleft-108\trbrdrl\brdrs\brdrw10 \trbrdrt\brdrs\brdrw10 \trbrdrr\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trpaddl70\trpaddr70\trpaddfl3\trpaddfr3\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx2818\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs"];
I want this string to be saved as an RTF file, so that an RTF reader will then convert this code into a readable document. The problem is that Xcode gives me numerous errors (unknown escape sequence) because of sequences such as \* \d \c \g. It also reports "Incomplete universal character name".
How can I have my NSString treated as literal source text, without those errors, so that it can be written out as an RTF file?
You need to escape your escape characters "\". When you write it to the console or file your string will output correctly.
NSMutableString *body = [NSMutableString stringWithString:@"{\\rtf1\\ansi\\ansicpg1252\\deff0\\nouicompat\\deflang3084\\deflangfe3084{\\fonttbl{\\f0\\froman\\fprq2\\fcharset0 Times New Roman;}{\\f1\\fswiss\\fprq2\\fcharset0 Calibri;}{\\f2\\froman\\fprq2\\fcharset2 Symbol;}}{\\colortbl ;\\red255\\green255\\blue255;\\red0\\green0\\blue255;}{\\*\\generator Riched20 10.0.10240}\\viewkind4\\uc1\\trowd\\trgaph70\\trleft-108\\trbrdrl\\brdrs\\brdrw10 \\trbrdrt\\brdrs\\brdrw10 \\trbrdrr\\brdrs\\brdrw10 \\trbrdrb\\brdrs\\brdrw10 \\trpaddl70\\trpaddr70\\trpaddfl3\\trpaddfr3\\clbrdrl\\brdrw10\\brdrs\\clbrdrt\\brdrw10\\brdrs\\clbrdrr\\brdrw10\\brdrs\\clbrdrb\\brdrw10\\brdrs \\cellx2818\\clbrdrl\\brdrw10\\brdrs\\clbrdrt\\brdrw10\\brdrs\\clbrdrr\\brdrw10\\brdrs\\clbrdrb\\brdrw10\\brdrs"];
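Doubling every backslash by hand in a long RTF header is error-prone, so it may help to generate the escaped literal with a small script. The sketch below is in Python and uses a hypothetical helper name, `to_objc_literal`; it assumes the RTF source contains no line breaks and only needs backslashes and double quotes escaped.

```python
def to_objc_literal(raw: str) -> str:
    """Return raw wrapped as an Objective-C string literal (hypothetical helper)."""
    # Escape backslashes first, then double quotes, so the quote
    # escapes added in the second step are not doubled again.
    escaped = raw.replace("\\", "\\\\").replace('"', '\\"')
    return '@"' + escaped + '"'


# A shortened RTF fragment, written as a Python raw string so the
# backslashes are taken literally.
rtf_source = r"{\rtf1\ansi{\*\generator Riched20 10.0.10240}\uc1}"
print(to_objc_literal(rtf_source))
```

The printed text can be pasted into the Objective-C source in place of the hand-escaped string.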
At the server, I have a string of text that is concatenated together and has \n added to create line breaks. This is done in VB.net as a string object.
This string is then added to a dictionary, which is made available through a JSON web service.
On my iOS app, I am reading the JSON and parsing it, saving the contents to Core Data.
When I display my text in a UITextView, the string "hello\nthere" is shown exactly like that, with no line breaks and the \n visible.
What have I done wrong?
I have also tried \r instead.
Should I be making a string in the VB.NET part like this "hello\nthere" (is that valid?), or should I be using "hello" & vbcrlf & "there" and letting the JSON parser convert the line break to \n (if it does this)?
Or is the problem on the iOS side?
I can't see how else to mark this as complete, but please see the comment from patric.schenke (May 31 at 9:51).
He was indeed correct: the text had been escaped to \\n, so replacing this with \n and saving that to Core Data has solved the issue.
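To illustrate the difference between an escaped line break and a doubly escaped one, here is a minimal Python sketch (the payloads and the "text" key are made up for the example, not taken from the actual web service):

```python
import json

# A correctly serialized line break: the JSON text contains the
# two-character escape \n inside the string value.
proper_payload = '{"text": "hello\\nthere"}'
proper = json.loads(proper_payload)["text"]

# A doubly escaped line break: the JSON text contains \\n, which
# decodes to a literal backslash followed by the letter n.
double_payload = '{"text": "hello\\\\nthere"}'
literal = json.loads(double_payload)["text"]

# The workaround from the answer: turn the literal backslash-n
# sequence into a real newline after parsing.
fixed = literal.replace("\\n", "\n")

print(repr(proper))
print(repr(literal))
print(repr(fixed))
```

Replacing after parsing works, but it would also convert any backslash-n that was meant literally, so building the string on the server with a real line break and letting the JSON serializer escape it is probably the cleaner fix.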
I have a BlackBerry app developed using PhoneGap. I am using the suds client to call a web service. There are some Portuguese characters in the web service XML, and I am not able to parse it into an XML document using DOMParser.
I am using:
xmlDoc = parser.parseFromString(_xml, "text/xml");
The encoding type is UTF-8. Without the Portuguese character, parsing is working perfectly.
"I am using is UTF-8 encoding type." - this can mean several things, so it is unclear what exactly you do in order to support UTF-8 end-to-end.
E.g. you should check:
your web service really sends data in UTF-8 (when it converts string chars into bytes to be sent into output stream it should use UTF-8)
the device code that reads data from web really uses UTF-8 to convert bytes to string _xml
P.S. I'm not familiar with phonegap API so this is just a general plan.
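The two checks can be illustrated with a short sketch. It is written in Python rather than against the PhoneGap API, and the sample document is invented; the point is only that the same bytes give different text depending on the codec used when they are turned into a string.

```python
import xml.etree.ElementTree as ET

# Invented sample document containing Portuguese characters.
xml_text = '<?xml version="1.0" encoding="UTF-8"?><msg>Informação</msg>'

# Check 1: the service must turn characters into bytes using UTF-8.
wire_bytes = xml_text.encode("utf-8")

# Check 2: the client must turn the bytes back into text using UTF-8.
decoded_ok = wire_bytes.decode("utf-8")

# Using a different codec on the client garbles the accented letters.
decoded_bad = wire_bytes.decode("latin-1")
print(decoded_bad)

# Handing the raw bytes to the parser lets it honor the encoding
# named in the XML declaration.
root = ET.fromstring(wire_bytes)
print(root.text)
```

If the garbled form shows up when the received string is logged before parsing, the mismatch is in how the response bytes are decoded rather than in the parser itself.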