French characters are not displaying correctly inside javascript grid - character-encoding

We have translated one of our pages to French and all the HTML within the page displays flawlessly. That said, there is a JavaScript table (Ext JS) and the accented characters in it are not displaying correctly. The page is encoded UTF-8 in the HTML meta tags, but when I look inside FireBug, I see the following:
Accept-Charset ISO-8859-1,utf-8;q=0.7,*;q=0.7
I'm guessing the problem is related to the ISO-8859-1 having worked its way back in. Does anyone know why the page itself would display fine, but the text inside the javascript component wouldn't? Do you somehow specify the encoding separately for the javascript files?

The Accept-Charset request header only lists the encodings the browser is willing to accept; it does not tell you how the response was actually encoded. If all the data sent is encoded as UTF-8, then don't worry about it.
Can you elaborate on exactly what is happening?
You say "javascript table" -- I presume you are constructing an HTML table in JS and placing it in the DOM? Please elaborate, especially w.r.t. any character conversions. Are you building HTML text or building with DOM elements with attributes?
Where does the JS get its data? If with AJAX, have you verified the Encoding for that page?
Does the JS use escape() or unescape()? Those don't handle UTF-8 correctly.
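To illustrate the last point: JavaScript's legacy escape() encodes each character by its code unit rather than as UTF-8 bytes, while encodeURIComponent() always produces UTF-8. A minimal sketch, assuming the grid code round-trips text through one of these functions (the question does not say that it does):

```javascript
// The sample word is an assumption; any accented character behaves the same way.
const word = "é";

// Legacy escape(): one byte per character below U+0100, Latin-1 style.
const legacy = escape(word);

// encodeURIComponent(): percent-encodes the UTF-8 bytes of the character.
const utf8 = encodeURIComponent(word);

console.log(legacy); // %E9
console.log(utf8);   // %C3%A9
```

If one side encodes with escape() and the other decodes as UTF-8 (or the reverse), accented characters come out wrong even though the page itself is fine.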
EDIT:
Type the URL to the JS code in your browser, and look at "Page Info" to see its encoding. I'll bet it is ISO-8859-1, which would explain the header problems.
Next, check the encoding of the AJAX data. If it's dynamically created you can:
Enable "Show XMLHttpRequests" in FireBug's console,
Load your base HTML page,
Open the FireBug console tab,
Expand the AJAX GET/POST request and open the Response sub-tab,
Check the Encoding for the data, and fix as needed.
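If you are not sure what a wrongly declared encoding looks like on screen, here is a small sketch that reproduces it. It assumes Node.js (Buffer is a Node API, not a browser one) and a made-up sample string: UTF-8 bytes read back as ISO-8859-1 turn every accented letter into two odd characters.

```javascript
// Sample text is an assumption, not taken from the actual grid data.
const original = "Français, été";

// The bytes the server would send if the file is saved as UTF-8.
const utf8Bytes = Buffer.from(original, "utf8");

// What you get when those bytes are interpreted as ISO-8859-1.
const misread = utf8Bytes.toString("latin1");

// What you get when they are interpreted as UTF-8, as intended.
const correct = utf8Bytes.toString("utf8");

console.log(misread); // FranÃ§ais, Ã©tÃ©
console.log(correct); // Français, été
```

If the grid shows text like the first line, the JS file or the AJAX response is most likely UTF-8 that is being served or read as ISO-8859-1.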
BTW, I'm having similar problems and haven't entirely ironed out the issues (still not sure the source data isn't badly encoded).

It's possible that the ext. JS file strips out unrecognised characters as a security precaution.
The "Accept-Charset" header can be specified in a number of places, including as an attribute in certain HTML elements. Have you performed a search for Accept-Charset (case insensitive) in the offending file?

Related

Html.Raw not decoding

Just a query: I have used @Html.Raw(Item.sometext) before and it rendered the HTML tags correctly. I'm now getting some data from a remote source in JSON format, but when it is displayed on the page, Html.Raw does not decode the HTML tags.
To fix the problem I used:
@Html.Raw(HttpUtility.HtmlDecode(Item.sometext))
So my question is: can anyone tell me why that would be the case? I'm curious as to the reason. I'm using MVC 4 and ASP.NET 4.5.
Thanks
George
Here is my answer in an attempt to explain better what I mean (in the comments).
The JSON you supplied is formatted like so, for example:
<p><b>Location. <\/b> <br \/>...
However, this is not valid HTML. Notice the escape characters used for the slashes '/'. If you pass this value to Html.Raw it will (or should) output it as-is, but because it is not valid HTML it is unlikely to display correctly (if it displays anything at all).
This escape character issue can be fixed using HttpUtility.HtmlDecode, as you did, which will effectively return the following:
<p><b>Location. </b> <br />...
This is valid HTML, and can therefore be passed to Html.Raw without any problems
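For what it's worth, the \/ sequence is a standard JSON escape for a plain slash, so if the text goes through a JSON parser the backslashes should disappear on their own. A minimal sketch in JavaScript, assuming a made-up payload with a sometext field (the real response shape is not shown in the question):

```javascript
// Hypothetical JSON text as it might arrive on the wire.
// In this JS source, "\\/" is a backslash followed by a slash.
const jsonText = '{"sometext": "<p><b>Location. <\\/b> <br \\/>"}';

// A JSON parser turns the \/ escape back into a plain slash.
const item = JSON.parse(jsonText);

console.log(item.sometext); // <p><b>Location. </b> <br />
```

So if the escaped slashes reach the view, it may be worth checking whether the JSON is really being parsed or just passed along as a raw string.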
NOTE: Html.Raw does not do any encoding/decoding, in fact it explicitly instructs that the supplied value should not be encoded as it is already raw HTML. This is confirmed here:
Use the Raw method when the specified text represents an actual HTML
fragment that should not be encoded and that you want to render as
markup to the HTTP response.

multi line tag in grails or html

With a grails app and from a local database, I'm returning some text in a xml format.
I can return it well formed in a <textarea></textarea> tag with the correct indenting (tabulation, line return,...etc.)
I want to go a bit further. In the text I'm returning, there are some <img/> tags and I'd like to replace those tag by the real images themselves.
I searched around and have found no solution so far. I understand that you can't add an image to a textarea (other than as a background), and if I choose a div tag, I lose the indenting (and the text is therefore harder to read).
I was wondering whether using a <g:textField/> or another tag from the Grails library would do the trick and, if so, how I can append it to a page using jQuery.
For example, how to append a <g:textField/> in jquery. It doesn't interpret it and I get this error
SyntaxError: missing ) after argument list
...+doc).append("<input type="text" id="FTMAP_"+nb_sec+"" ...
And in my javascript file, I have
$("#FTM_"+doc).append("<g:textField id='FTMAP_"+nb_sec+"' ... />
Any possible solutions ?
EDIT
I did forget to mention that my final intentions are to be able to modify the text (tags included) and to have a nice and neat indentation so that it is the easiest possible for the end user.
You are asking a few different questions:
1. Can I use a single HTML tag to include images inside pre-formatted text?
No. You will have to parse the text and translate it into styled text yourself.
2. Is there a tag in the grails standard tags to accomplish this for me?
No.
3. How can I add Grails tags from my JavaScript code?
Grails tags are processed on the server side, and JavaScript is processed on the client. This means you cannot directly add Grails tags via JavaScript.
There are a couple methods that can accomplish the same result, however:
You can set a javascript variable to the rendered content of a grails tag. This solution is good for data that is known at the time of the initial request.
var tagOutput = "${g.textField(/* etc */)}";
You can make an ajax request for the content to be added. Then your server-side grails code can render the tags you need. This is better for realtime data, or data that will be updated more than once on a single rendered page.
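As for the SyntaxError in the question: the error text suggests the <g:textField> was rendered on the server into an <input> whose double quotes then ended the double-quoted JavaScript string early. If the field does not need any server-side data, one option is to build the markup on the client instead. A rough sketch; textFieldMarkup is a hypothetical helper and the value of nb_sec is made up:

```javascript
// Hypothetical helper: returns markup similar to what <g:textField> renders.
// Single quotes delimit the JS string so the attribute quotes cannot end it.
// The id is assumed to be a plain identifier that needs no HTML escaping.
function textFieldMarkup(id) {
  return '<input type="text" id="' + id + '" name="' + id + '" />';
}

const nb_sec = 3; // assumed value, for illustration only
const markup = textFieldMarkup("FTMAP_" + nb_sec);

console.log(markup); // <input type="text" id="FTMAP_3" name="FTMAP_3" />

// In the browser, with jQuery loaded, this would then be appended with:
//   $("#FTM_" + doc).append(markup);
```

The same quoting concern applies when a rendered tag is assigned to a JavaScript variable: the tag's output contains double quotes, so it has to be escaped or placed in a string delimited some other way.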

cyrillic characters incorrect in html markup but visually correct

I am developing a site in mvc4 where the content of the site includes both latin and cyrillic characters. Both are included in markup and both display correctly on screen.
However, within the markup, I have seen issues with cyrillic where url's for example are like following:
/%d1%81%d0%bf%d0%b8%d1%81%d0%be%d0%ba%20%d0%bf%d0%be%d0%b6%d0%b5%d0%bb%d0%b0%d0%bd%d0%b8%d0%b9
The URLs navigate correctly when clicked, but they look incorrect in the HTML markup. I have the charset set to utf-8 in a meta tag.
Any ideas what's causing this?
What you see is the correct %-encoded (aka URL-encoded) form of the URL “/список пожеланий” (as you can see using a decoder). Browsers may display a URL in the address bar either %-encoded or decoded to characters. HTML authoring software, or the author when editing HTML code manually, should take care of %-encoding anything that needs to be %-encoded at the HTTP protocol level, such as href attribute values.
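You can check this yourself without an online decoder. A short sketch using JavaScript's built-in decodeURIComponent and encodeURI on the URL from the question:

```javascript
// The %-encoded path exactly as it appears in the markup.
const encodedPath =
  "/%d1%81%d0%bf%d0%b8%d1%81%d0%be%d0%ba%20%d0%bf%d0%be%d0%b6%d0%b5%d0%bb%d0%b0%d0%bd%d0%b8%d0%b9";

// Decoding interprets the escaped bytes as UTF-8.
const decodedPath = decodeURIComponent(encodedPath);
console.log(decodedPath); // /список пожеланий

// Encoding the readable form gives the same bytes back
// (encodeURI emits uppercase hex digits, hence the lowercase comparison).
const reEncoded = encodeURI(decodedPath);
console.log(reEncoded.toLowerCase() === encodedPath); // true
```

Both forms name the same resource; the encoded one is simply what is safe to put in markup and send over HTTP.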

What is the best way to handle submission of non-utf-8 data in Rails 3 forms?

I am using Rails 3.2.3 and Ruby 1.9.3.
I have a model called Post that accepts a title and a description.
The front-end of the site receives information submitted through the back-end through an ajax request. When I fill out the form with, let's say
title: foo
content: foobar
and submit it, I am able to view data through the front-end without a problem.
However, whenever I submit non-utf8 data through the form, for example (mind the fancy quotes):
title: foo
content: “foobar”
When I try to render the form I get the following error:
ActionView::Template::Error (incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string))
My .js.erb file looks like this:
$("#my_post").html('<%= escape_javascript(render :partial => 'post') %>');
I realize this is an issue with encoding, but I'm not sure how I should handle it the best way. I thought of several options:
Strip out non-utf8 by using the iconv library -- do this via a before_save filter for every single model in my application
Specifying at the top of the js that the document contains utf-8 (not sure this would work)
Using accept-charset="UTF-8" in my form to force the browser to avoid submission of non-utf-8 content.
I'm not even sure these solutions would help, or what the most efficient way to do this would be.
Thanks!
I suspect that you're not using the form helpers because you mention the question of adding accept-charset="UTF-8" to your form.
The form helpers will add the accept-charset attribute as well as the snowman parameter which together should ensure you get UTF-8 data from the browser.
See the Rails Form helpers guide.
You need to look carefully to see if
You're actually sending non-UTF-8 data to your app, or
You're sending UTF-8 data, but it is not being recognized by Ruby/Rails
To see which it is, you need to examine the data on the "wire" (what's actually being sent over the Internet). Use a peeking tool such as Wireshark or a proxy spy such as Fiddler.
Curly quotes can be sent using Windows-1252 (often loosely labelled as 8859-1) or UTF-8.
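Once you have the raw bytes from the capture, telling the two cases apart is mechanical. A minimal sketch in JavaScript, assuming a runtime with TextDecoder (modern browsers and Node.js); isValidUtf8 is a hypothetical helper name and the byte arrays are hand-made examples, not captured traffic:

```javascript
// Returns true when the bytes form well-formed UTF-8.
// With fatal: true the decoder throws on invalid sequences instead of
// silently substituting U+FFFD.
function isValidUtf8(bytes) {
  try {
    new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return true;
  } catch (err) {
    return false;
  }
}

// “foobar” as UTF-8: each curly quote takes three bytes (E2 80 9C / E2 80 9D).
const asUtf8 = new TextEncoder().encode("“foobar”");

// The same text as Windows-1252: each curly quote is one byte (0x93 / 0x94).
const asWindows1252 = Uint8Array.from([0x93, 0x66, 0x6f, 0x6f, 0x62, 0x61, 0x72, 0x94]);

console.log(isValidUtf8(asUtf8));        // true
console.log(isValidUtf8(asWindows1252)); // false
```

If the captured bytes pass this kind of check, the browser is sending UTF-8 and the problem is on the Ruby/Rails side, where the string is being tagged as ASCII-8BIT.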
Recommendation: You should set the HTML page to be UTF-8. Any Ajax sent from scripts on the page should then also use UTF-8. See http://www.utf8.com/
Added (Re: comment about how Rails sets form's character encoding)
The issue for Ajax character encoding is how the page's encoding was set. A blog post. So be sure to set the page's UTF-8 encoding in your page template.

Textareas and unsafe content

I've got wiki-style content which is sanitized and stored in another field of the db for output as HTML. I'm not sure how to deal with the original body field, because when I sanitize it, characters are escaped and don't display well in the textarea.
What are the dangers of unsafe content in textareas? I'm sure I read previously that downloading such textarea content with ajax is preferable but I'd rather not go down that route if not necessary.
No HTML tag is safe. For example, if the content closes the textarea, it can then add any new HTML tag or anything else it wants, such as JS. So it's exactly like content inside a non-textarea tag.
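The usual way around this is to HTML-escape the raw body once, at the moment it is written into the textarea. The browser decodes the entities again inside the textarea, so the user should still see and edit the original text. A minimal sketch in JavaScript; escapeHtml is a hypothetical helper (most server frameworks ship their own equivalent) and the body string is a made-up attack:

```javascript
// Escape the characters that could end the textarea or start an entity.
// The ampersand must be replaced first so later entities are not re-escaped.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Unsanitized body that tries to break out of the textarea.
const body = "</textarea><script>alert(1)</script>";

const safe = "<textarea>" + escapeHtml(body) + "</textarea>";

console.log(safe);
// <textarea>&lt;/textarea&gt;&lt;script&gt;alert(1)&lt;/script&gt;</textarea>
```

If the text looks wrong in the textarea after escaping, one thing to check is whether it is being escaped twice, for example once before storage and once more by the template.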
