The problem is that, when downloading an XML file, ASIHTTPRequest uses the default response encoding instead of the encoding given in the "encoding" attribute of the XML header.
The class uses the HTTP header information to set responseEncoding to the encoding given in "Charset". So the class works well for an HTML page, because the encoding type is included in the header itself.
The default encoding is NSISOLatin1StringEncoding but the encoding in the attribute is UTF-8, which renders the response string as "funciÃ³n" instead of "función".
So I want the responseEncoding property of the request (ASIHTTPRequest) to be set to the encoding declared in the XML file.
If that is the output of your NSLog statement, then it is correct and the ASIHTTPRequest class is working as designed.
ASIHTTPRequest does not parse your XML file; it decides the encoding based solely on the contents of the HTTP headers.
Use request.responseData to get the raw data instead and do the conversion to string yourself.
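The question is about Objective-C, but the logic of "do the conversion yourself" is language-neutral, so here is a minimal sketch in JavaScript (Node.js assumed). It peeks at the XML declaration in the raw bytes and decodes with the encoding named there. The helper name decodeXmlBytes and the UTF-8 fallback are my own assumptions, not part of ASIHTTPRequest, and UTF-16 documents are not handled.

```javascript
// Hypothetical helper: decode raw XML bytes using the encoding named in the
// XML declaration (<?xml version="1.0" encoding="..."?>), not the HTTP header.
function decodeXmlBytes(bytes, fallback = 'utf-8') {
  // The declaration is plain ASCII in UTF-8 and in the common single-byte
  // encodings, so a Latin-1 view of the first bytes is enough to read it.
  const head = Buffer.from(bytes.slice(0, 200)).toString('latin1');
  const match = /<\?xml[^>]*\bencoding\s*=\s*["']([A-Za-z0-9._-]+)["']/.exec(head);
  const encoding = match ? match[1] : fallback;
  // TextDecoder throws a RangeError if the label is not a known encoding.
  return new TextDecoder(encoding).decode(bytes);
}

// "función" encoded as UTF-8, the way the server sends it.
const xml = '<?xml version="1.0" encoding="UTF-8"?><w>función</w>';
const raw = new Uint8Array(Buffer.from(xml, 'utf8'));

// What a Latin-1 default makes of the same bytes, versus the sniffed decoding.
const wrong = Buffer.from(raw).toString('latin1');
const right = decodeXmlBytes(raw);
console.log(wrong.includes('funciÃ³n'), right.includes('función'));
```

In Objective-C the raw bytes would come from request.responseData, and the same two steps apply: read the declaration, then build the string with the matching encoding.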
Related
I get a response from the server in windows-1251 and RestKit returns nil for the strings in my mapped object. I know RestKit has a defaultHTTPEncoding property in RKClient (https://github.com/RestKit/RestKit/commit/0ead8a922219ec42ec6dae6ebe59139a1fd589ae). How can I use it, and can it help me?
I'm assuming that your server is returning JSON. If this is the case then the server needs to be updated because it isn't conformant to the JSON spec. Specifically:
Encoding
JSON text SHALL be encoded in Unicode. The default encoding is UTF-8.
An important point to note is that RestKit doesn't unpack the response into a string, because the JSON deserialisation (NSJSONSerialization) takes a data object. And the NSJSONSerialization documentation states:
The data must be in one of the 5 supported encodings listed in the JSON specification: UTF-8, UTF-16LE, UTF-16BE, UTF-32LE, UTF-32BE. The data may or may not have a BOM. The most efficient encoding to use for parsing is UTF-8, so if you have a choice in encoding the data passed to this method, use UTF-8.
So to handle your server response, if you can't change it, you'll need to handle the download yourself, convert the data to the appropriate encoding, unpack the JSON, and then create a mapping operation to use that.
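The "convert, then unpack" steps can be sketched as follows. This is JavaScript rather than Objective-C (Node.js with full ICU assumed, so that the windows-1251 label is available), the helper name parseLegacyJson is hypothetical, and the RestKit mapping step is left out.

```javascript
// Hypothetical helper: the charset must be known out of band, for example
// from the Content-Type header or from the API's documentation.
function parseLegacyJson(bytes, charset) {
  // Decode the legacy single-byte encoding into a Unicode string.
  const text = new TextDecoder(charset).decode(bytes);
  // The text is Unicode now, so a standard JSON parser can take over.
  return JSON.parse(text);
}

// {"name":"Привет"} as windows-1251 bytes.
const body = Uint8Array.from([
  0x7b, 0x22, 0x6e, 0x61, 0x6d, 0x65, 0x22, 0x3a, 0x22,
  0xcf, 0xf0, 0xe8, 0xe2, 0xe5, 0xf2,
  0x22, 0x7d,
]);
const parsed = parseLegacyJson(body, 'windows-1251');
console.log(parsed.name);
```

The parsed object is what you would then feed into your own mapping operation.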
I use URLDownloadToFile to download a file in Delphi. The URL does not contain the real name of the file. Is it possible to specify just the path of the file, keeping the default name that, e.g., Explorer shows?
You are in a catch-22 situation. You need to give URLDownloadToFile() a filename, but you have to request the URL first to discover if it has its own filename.
You have two choices:
1. Send a separate HEAD request to the URL first and check the Content-Disposition response header, if present. You can use HttpSendRequest() and HttpQueryInfo() for that, or any other HTTP library. You can then format a filename as needed, and then download the URL to that filename.
2. Use a temp filename for the download, then check the Content-Disposition response header, if present, and rename the file if needed. To get the response headers from URLDownloadToFile() you have to write a class that implements the IBindStatusCallback and IHttpNegotiate COM interfaces, then pass an instance of that class to the lpfnCB parameter. The response headers will be passed to your IHttpNegotiate.OnResponse() implementation.
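Whichever choice you take, you end up with a Content-Disposition value that has to be turned into a filename. A rough sketch of that step, in JavaScript rather than Delphi (the helper name is hypothetical, and only the quoted, bare and UTF-8 extended forms are handled):

```javascript
// Hypothetical helper: pull a file name out of a Content-Disposition value.
// Returns null when the header is missing or carries no filename parameter.
function fileNameFromContentDisposition(header) {
  if (!header) return null;
  let name = null;
  // Extended form, e.g. filename*=UTF-8''na%C3%AFve.txt
  const ext = /filename\*\s*=\s*UTF-8''([^;]+)/i.exec(header);
  if (ext) {
    name = decodeURIComponent(ext[1].trim());
  } else {
    // Plain form, quoted or bare, e.g. filename="report.pdf"
    const plain = /filename\s*=\s*(?:"([^"]*)"|([^;]+))/i.exec(header);
    if (!plain) return null;
    name = plain[1] !== undefined ? plain[1] : plain[2].trim();
  }
  // Keep only the last path segment so the header cannot choose the folder.
  return name.split(/[\\/]/).pop() || null;
}

console.log(fileNameFromContentDisposition('attachment; filename="report.pdf"'));
console.log(fileNameFromContentDisposition("attachment; filename*=UTF-8''na%C3%AFve.txt"));
console.log(fileNameFromContentDisposition('inline'));
```

If the result is null, fall back to a name of your own (for example the last segment of the URL).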
I need to post JSON data that contains URLs to an MVC controller. The JSON data looks like it's being split at the = in the query string.
The JSON data looks like this:
"{"Files":[{"Title":"test","OriginalFileName":"",
"FileName":"http://company.domain.com/auth.aspx?enrollmentkey=APK54cd1546a8454d4ca79ded89a78f8698",
"Categories":[{"CategoryId":76,"SubCategoryId":182,"CatId":"CatId0"}],
"TypeId":"84",
"Tags":["Select Tag(s)..."],
"TagIds":[],
"Roles":[],
"MemberOnly":false,
"ContentTypeId":7,
"Id":0,
"IsPublished":true,
"PublishDate":""}]}"
Debugging, I see that it's being split into
KEY (Request.Form.GetKey(0)):
{"Files":[{"Title":"Test","OriginalFileName":"","FileName":"http://company.domain.com/auth.aspx?enrollmentkey
VALUE (Request.Form.GetValue(0)):
APK54cd1546a8454d4ca79ded89a78f8698","Categories":[{"CategoryId":110,"SubCategoryId":111,"CatId":"CatId0"}],"TypeId":"69","Tags":["Select Tag(s)..."],"TagIds":[],"Roles":[],"MemberOnly":false,"ContentTypeId":7,"Id":0,"IsPublished":true,"PublishDate":""}]}
Does the JSON data need to be escaped at the =, does the whole thing need to be encoded, or am I missing something?
I should note that I'm using knockout's ko.toJSON(js) to create the JSON, although I'm not sure that is relevant.
I also noticed that Chrome dev tools seems to recognize the key/value split.
If you are sending JSON data to the server, the Content-Type header needs to be set to application/json. If it is set to application/x-www-form-urlencoded then the server will try to interpret the JSON as key-value pairs as in a URL. This is why your JSON string is getting broken in two at the =.
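You can reproduce the split without a server. In this small sketch (Node.js or any modern browser; the shortened payload is made up for illustration) URLSearchParams stands in for the server's form parser, and the options object shows the header that avoids the problem:

```javascript
const payload = JSON.stringify({
  FileName: 'http://company.domain.com/auth.aspx?enrollmentkey=APK54cd',
});

// What a form-urlencoded parser makes of the body: it splits at the first "=".
const asForm = Array.from(new URLSearchParams(payload).entries());
console.log(asForm.length);  // number of key/value pairs found
console.log(asForm[0][0]);   // everything before the "="
console.log(asForm[0][1]);   // everything after it

// Request options that label the body correctly (pass these to fetch or similar).
const options = {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: payload,
};
```

With jQuery the equivalent is setting contentType: 'application/json' on the $.ajax call.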
When I write an NSDictionary to a file I get a UTF-8 encoded XML file. I need to write the data to a file in NSISOLatin1StringEncoding. Is NSDictionary UTF-8 only? How can I achieve my goal?
Are you sure you need a file encoded as ISO Latin-1? The problem with all encodings other than some form of Unicode is that they can't represent all possible characters.
The encoding is surely the least of your problems. A dictionary's file representation is a property list file. It's unlikely that any code which requires Latin-1 encoding would understand that format. Indeed, the format is not guaranteed. It's not even guaranteed to be XML or textual. Property lists may be binary.
If you want to exchange data with a program that's going to use anything other than Cocoa's property list implementation, you should manually write the contents of the dictionary out in a format that's defined independently of Apple's property list format.
And, yes, if Cocoa does write the property list as XML, it's going to be UTF-8-encoded.
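To make "manually write the contents out" concrete, here is a rough sketch. It is JavaScript (Node.js assumed) rather than Cocoa, and the key=value line format and the helper name are made up for illustration; the real format is whatever the receiving program expects. It also shows the limitation mentioned above: characters outside Latin-1 have to be rejected or replaced.

```javascript
// Hypothetical helper: serialize a flat dictionary as "key=value" lines and
// encode them as ISO Latin-1. Throws if a character does not fit in Latin-1.
function encodeLatin1Lines(dict) {
  const text = Object.entries(dict)
    .map(([key, value]) => `${key}=${value}`)
    .join('\n') + '\n';
  for (const ch of text) {
    if (ch.codePointAt(0) > 0xff) {
      throw new RangeError(`Not representable in Latin-1: ${ch}`);
    }
  }
  return Buffer.from(text, 'latin1'); // one byte per character
}

const latin1Bytes = encodeLatin1Lines({ word: 'función' });
console.log(latin1Bytes.length);
// require('fs').writeFileSync('out.txt', latin1Bytes) would write them to disk.
```

In Cocoa the corresponding step would be building an NSString yourself and asking it for data in NSISOLatin1StringEncoding.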
I am writing a bootstrapped extension for Firefox 4.
Here is my story:
When I use @mozilla.org/xmlextras/xmlhttprequest;1 (nsIXMLHttpRequest), the content of the target URL loads successfully into req.responseText.
I parse the responseText into a DOM by creating a BODY element with createElement and assigning the text to its innerHTML property.
Everything seems to work.
However, there is a problem with character encoding (charset).
As I need the extension to detect the charset of the target documents, overriding the MIME type of the request with text/html; charset=blahblah does not seem to meet my need.
I've tried @mozilla.org/intl/utf8converterservice;1 (nsIUTF8ConverterService), but it seems that XMLHttpRequest has no ScriptableInputStream, or any InputStream or readable stream at all.
I have no idea how to read the target document's content in a suitable, automatically detected charset, whether through the Auto-Detect Character Encoding function of the GUI or through the charset read from the meta tag in the document's head.
EDIT: Would it be practical to parse the whole document, including the HTML, HEAD and BODY tags, into a DOM object, but without loading external documents such as js, css and ico files?
EDIT: The method in the MDC article titled "HTML to DOM", which uses @mozilla.org/feed-unescapehtml;1 (nsIScriptableUnescapeHTML), is inappropriate: it parses with lots of errors, and baseURI cannot be set for the text/html type. The HREF attributes of A elements are lost when they contain a relative path.
EDIT#2: It would still be nice if there were a method that converts the incoming responseText into readable UTF-8 strings. :-)
Any ideas or workarounds for the encoding problem are appreciated. :-)
PS. The target documents can be anything, so there is no specific (or previously known) charset, and of course not only UTF-8, which is already the default.
SUPP:
So far I have two main ideas for solving this problem.
Can anybody help me work out the names of the XPCOM modules and methods?
1. Specify the charset while parsing the content into a DOM.
   We first need to know the charset of the document (by extracting it from the head meta tag, or from the header).
   Then,
   - find a method that can specify the charset when parsing the body content;
   - find a method that can parse both head and body.
2. Convert the incoming responseText to UTF-8, so that parsing into a DOM element with the default charset UTF-8 still works.
   This seems neither practical nor sensible: overriding the MIME type with a charset is an implementation of this idea, but we cannot know the charset before initiating the request.
It seems there are no other answers.
After a day of tests, I've found that there is a way (although it is clumsy) to solve my problem.
Call xhr.overrideMimeType('text/plain; charset=x-user-defined'); where xhr is the XMLHttpRequest object.
To force Firefox to treat it as plain text, using a user-defined character set. This tells Firefox not to parse it, and to let the bytes pass through unprocessed.
See the MDC document: Using_XMLHttpRequest#Receiving_binary_data
Then use the Scriptable Unicode Converter: @mozilla.org/intl/scriptableunicodeconverter, nsIScriptableUnicodeConverter
The charset variable can be extracted from the head meta tags, either with a regexp on req.responseText (whose charset is still unknown) or by some other method.
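// Create the converter and tell it which charset the raw bytes are in.
var unicodeConverter = Components.classes["@mozilla.org/intl/scriptableunicodeconverter"]
    .createInstance(Components.interfaces.nsIScriptableUnicodeConverter);
unicodeConverter.charset = charset;
// Convert the byte string taken from req.responseText to a Unicode string.
str = unicodeConverter.ConvertToUnicode(str);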
A Unicode string is finally produced. :-)
Then simply assign it to the body element's innerHTML, and my need is met.
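For anyone who wants to try the same idea outside XPCOM, here is a rough sketch of the whole pipeline using the standard TextDecoder instead of nsIScriptableUnicodeConverter (so this is a swapped-in technique, not the code above). The function names and the 2048-character sniffing window are my own assumptions, and the last lines only simulate what an x-user-defined responseText looks like (Node.js assumed for Buffer).

```javascript
// Turn an x-user-defined responseText back into the raw bytes:
// the low 8 bits of every character code are the original byte.
function bytesFromUserDefined(text) {
  const bytes = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) bytes[i] = text.charCodeAt(i) & 0xff;
  return bytes;
}

// Hypothetical sniffing: look for charset=... in a meta tag near the top.
function sniffMetaCharset(text, fallback = 'utf-8') {
  const m = /<meta[^>]*charset\s*=\s*["']?([A-Za-z0-9._-]+)/i.exec(text.slice(0, 2048));
  return m ? m[1] : fallback;
}

// Sniff the charset from the undecoded text, then decode the raw bytes with it.
// TextDecoder throws a RangeError if the sniffed label is not a known encoding.
function decodeDocument(userDefinedText) {
  const charset = sniffMetaCharset(userDefinedText);
  return new TextDecoder(charset).decode(bytesFromUserDefined(userDefinedText));
}

// Simulate responseText under x-user-defined: ASCII bytes stay as they are,
// bytes 0x80-0xFF show up as U+F780-U+F7FF.
const html = '<meta charset="utf-8"><p>función</p>';
const simulated = Array.from(Buffer.from(html, 'utf8'),
  (b) => String.fromCharCode(b < 0x80 ? b : 0xf700 + b)).join('');
console.log(decodeDocument(simulated) === html);
```

The meta tag itself is plain ASCII, which is why it can be found by regexp before the real charset is known.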
Other brilliant ideas are still welcome. Feel free to object to my answer if you have good reasons. :-)