Grails encoding non UK characters from GSP to Controller - grails

I've got a Grails (2.0.4) application, all set up to handle UTF-8 encoding (meta tag in the layout, MySQL database tables). Unfortunately, something strange happens.
For example, if in a form (to create a domain instance) I type any text containing non-UK characters, like this:
más que nada
the POST contains the exact text (with the "á" character as is) but the params variable in the controller contains the wrong text:
mÃ¡s que nada
There's nothing between the view and the controller, how can this happen?
I also tried setting this in Config.groovy, without success:
grails.views.default.codec = "html"
Is there something else I need to set up?
Thanks in advance to everyone who will take the time to have a look at this issue.

How about these values in your Config.groovy:
grails.views.default.codec = "none"
grails.views.gsp.encoding = "UTF-8"
grails.converters.encoding = "UTF-8"
Are those properly configured?

In production I have configured the Tomcat 6 connector in server.xml like this:
<Connector port="14080" protocol="HTTP/1.1"
connectionTimeout="20000"
redirectPort="14443" URIEncoding="UTF-8"/>
The most important part is the URIEncoding="UTF-8" attribute.

What's the default charset of your MySQL database? Is it ok?
This is how I create my MySQL databases:
create database [dbname] DEFAULT CHARACTER SET = utf8 DEFAULT COLLATE utf8_swedish_ci;
see http://dev.mysql.com/doc/refman/5.5/en/create-database.html for full syntax of CREATE DATABASE
Collation affects sorting. You can get a list with "show collation" sql statement in mysql. http://dev.mysql.com/doc/refman/5.1/en/show-collation.html
Changing an existing table's encoding is done with this command:
ALTER TABLE tbl_name CONVERT TO CHARACTER SET charset_name [COLLATE collation_name];
You can check the encoding of an existing table with the "show create table tbl_name" command. Changing the default encoding of the database doesn't change the encoding of existing tables (or tables imported from a mysql dump).

Have you already tried
${myHtmlContent.encodeAsHTML()}
in your view?

This post is a few months old and the OP may well have found a better solution by now. An alternative that has worked for me is to explicitly re-decode the parameter in question.
For instance, params.paramsname = new String(params.unicodeInput.getBytes("8859_1"), "UTF8");
This forces paramsname to be decoded correctly as UTF-8 text.
I just ran into this problem myself. Keep in mind that this is only a workaround; I'm still looking for a better solution too. Cheers!
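The mechanism behind this workaround is not specific to Grails. Here is a minimal sketch in Ruby, used only for illustration, assuming the server decoded the UTF-8 request bytes as ISO-8859-1. The method names are hypothetical.

```ruby
# encoding: utf-8

# Simulates a server that reads UTF-8 request bytes as ISO-8859-1
# (an assumption about what went wrong, not confirmed Grails behaviour).
def misread_as_latin1(utf8_text)
  utf8_text.dup.force_encoding("ISO-8859-1").encode("UTF-8")
end

# Mirrors the workaround: recover the original bytes, then read them as UTF-8.
def repair_misread(garbled_text)
  garbled_text.encode("ISO-8859-1").force_encoding("UTF-8")
end

garbled = misread_as_latin1("más que nada")
puts garbled                  # the garbled form seen in the controller
puts repair_misread(garbled)  # the text as originally typed
```

The repair only works if every parameter was misread in the same way, which is why it is a workaround and not a fix.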

I'm sorry, I figured out what the problem was days ago but didn't have time to answer my own question until now.
Unfortunately I forgot to mention a key part of the problem, because I didn't think it was related. The encoding problem only occurred on AJAX calls, and I didn't mention it because all saves in my application go through AJAX.
So the encoding problem came down to the content type of the jQuery post, which (to work properly with UTF-8) has to be set like this:
contentType: "application/x-www-form-urlencoded;charset=UTF-8"

Related

How do I prevent French accent character truncation with Ruby 1.9, Rails 3.2, and MySQL?

I am running into an issue where a controller receives a string that is then assigned to an attribute of one of my models, which I then save to the database. A log message with an inspect call shows the model holds the full string right up until the #save call. The problem is that, without any error being raised, if the string contains a French character, everything from that character to the end of the string is truncated.
Further investigation seems to show that the string gets truncated when being written to the MySQL database. I also came across this article: Stale Rails Issue
If I am reading that right, it looks like characters that are not in the ASCII character encoding but are in the ISO Latin-1 character encoding are subject to this bug. I actually upgraded my project from Rails 3.0 to Rails 3.2 and from Ruby 1.8 to Ruby 1.9 so I could easily use the mysql2 adapter with Rails which some other articles seemed to suggest might solve the issue. However it didn't.
So how do I prevent the string truncation from happening?
Edit1: If I enter the query SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%'; I get:
Variable Name, Value
'character_set_client', 'utf8'
'character_set_connection', 'utf8'
'character_set_database', 'utf8'
'character_set_filesystem', 'binary'
'character_set_results', 'utf8'
'character_set_server', 'latin1'
'character_set_system', 'utf8'
'collation_connection', 'utf8_general_ci'
'collation_database', 'utf8_unicode_ci'
'collation_server', 'latin1_swedish_ci'
I also noticed that if I enter the French character via the MySQL Query Browser and then refresh the Rails app in my browser so it pulls the new data from the database, it is displayed correctly. It only seems to be dropped when saving the model data.
Edit2: I just changed some config parameters to try to fix the problem but it still exists. However, this is what I had changed the values to.
Variable Name, Value
'character_set_client', 'utf8'
'character_set_connection', 'utf8'
'character_set_database', 'utf8'
'character_set_filesystem', 'binary'
'character_set_results', 'utf8'
'character_set_server', 'utf8'
'character_set_system', 'utf8'
'collation_connection', 'utf8_general_ci'
'collation_database', 'utf8_unicode_ci'
'collation_server', 'utf8_unicode_ci'
You are already using utf8. For the collation, utf8_unicode_ci is probably the better choice. The alternative, utf8_general_ci, performs better but can have problems with German; if that matters to you, use utf8_unicode_ci. That covers the database side; for more information on MySQL character sets, check out MySQL's charset-unicode-sets page.
On the Rails and Ruby side, check out these questions: French accents in ruby, and Rails messages in french.
As a last resort you could HTML-encode the data before inserting it into the database. This can break searches, but if you also encode the search terms before querying the database everything should be fine; for more information see the French characters in rails page. I hope this helps. If you keep getting errors, please tell me so I can look for other ways to help you out.
The comment by @Ahmed Ali could also help you out; it looks like the encodings get changed:
Fetching data from any database (Mysql, Postgresql, Sqlite2 & 3), all configured to have UTF-8 as it's character set, returns the data with ASCII-8BIT in ruby 1.9.1 and rails 2.3.2.1.
See the link Ahmed posted for the complete answer and the link to the page from where the quote was taken, (ASCII-8BIT encoding of query results in rails 2.3.2 and ruby 1.9.1).
Sorry for all the trouble. I'll just put down the answer. It turned out that the database was correctly set up for utf8, but a user was inputting strings encoded in ISO-Latin-1, and I wasn't checking the encoding of user input because I assumed all input would be utf8 compatible. It turns out that the ISO-Latin-1 bytes for French accented characters are not valid utf8. The database seems to handle this by just raising a warning and truncating the string at the first invalid byte, keeping everything before it.
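A check like the one that was missing could be sketched as follows. This is only an illustration under a strong assumption: any input that is not valid UTF-8 is treated as ISO-8859-1. The helper name to_utf8 is hypothetical, and String#valid_encoding? requires Ruby 1.9 or later.

```ruby
# encoding: utf-8

# Returns a valid UTF-8 string.
# Assumption: input that is not valid UTF-8 was sent as ISO-8859-1.
def to_utf8(raw)
  text = raw.dup.force_encoding("UTF-8")
  return text if text.valid_encoding?
  text.force_encoding("ISO-8859-1").encode("UTF-8")
end

# Bytes as an ISO-8859-1 client would send them (0xE9 = é, 0xE8 = è).
puts to_utf8("caf\xE9 cr\xE8me")
# Input that is already valid UTF-8 passes through unchanged.
puts to_utf8("café crème")
```

Normalising the value before assigning it to the model means the database never sees the invalid bytes that triggered the truncation.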

Problems with special characters with sfWidgetFormDoctrineChoice in Symfony

I'm using sfWidgetFormDoctrineChoice to get a list of languages from a MySQL database. The list is in Spanish. In the stored data, names with special characters, e.g. Árabe, look OK (the accent is there), but the widget shows the wrong representation (A!rabe).
How can I resolve this?
Thanks in advance.
This is probably an encoding problem: verify that the list of languages stored in your database has the same encoding as the page where you display the widget.
But why don't you use the sfWidgetFormI18nChoiceLanguage widget?

Rails View encoding issue

Using Rails 2.3 with Ruby 1.8.7
I am working with an SQL Server database on a windows server with collation
SQL_Latin1_General_CP1_CI_AS
When I go to the rails console on the Linux server with the app and query the problem record I get
=> "Rodríguez, César"
To try to isolate the problem in my controller I tried just render :text => with the record's problem field, but on the browser I am seeing
Rodr?guez, C?sar
I believe this is an encoding issue, but I don't know how to
resolve it (and my Google and Stack Overflow skills are failing me). Given that the
source data can't be changed, what do I need to do on the rails side
to get the text to render properly?
On Chrome I have tried to manually change the encoding and no matter
which I select I can't get the text to render correctly.
Also, why would it render correctly on the console?
The default character encoding is Unicode in both Firefox and Chrome, so check whether you have tried with those defaults.
You need to check and confirm a few things:
-- Meta tags in the HTML page. Check the charset in the page source, change it to utf-8 in the layout, and try again.
-- Database encoding.
-- Select a character set that contains mappings for all the characters that the application and its users will want to see.
There may be better solutions, but as a fallback you could use the Inkscape command line tool to convert the text to image files and display those.
Encoding is handled here with no issues currently.

How to disable UTF character (punctuation) escaping when creating XML using default to_xml with Rails?

Given a Rails model column that contains
"Something & Something Else", when outputting with to_xml
Rails will escape the Ampersand like so:
<MyElement>Something &amp; Something Else</MyElement>
Our client software is all UTF aware and it would be better if we can just leave the column content raw in our XML output.
There was an old solution that worked by setting $KCODE="UTF8" in an environment file, but this trick no longer works, and it was always an all-or-nothing solution.
Any recommendations on how to disable this on a case-by-case basis?
It does not matter if the client software is UTF-8-aware. An ampersand cannot be used unescaped in XML. If the software is supposed to also be XML-aware, then any content that includes ampersands is not allowed to be kept "raw".
This is nothing to do with Unicode (or "UTF"). Ampersands in XML must be escaped, otherwise it isn't XML, and no XML software will accept it. If you're saying you want the escaping disabled, then you're saying you don't want the output to be XML.
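To illustrate why the escaping is harmless to a client that actually parses the XML, here is a small Ruby sketch. It uses CGI.escapeHTML from the standard library as a stand-in for the escaping to_xml performs (an assumption; Rails uses its own builder internally), and the helper name xml_text is hypothetical.

```ruby
require "cgi"

# Escapes the characters that are reserved in XML text content.
# CGI.escapeHTML is used here as a stand-in for the XML builder's escaping.
def xml_text(value)
  CGI.escapeHTML(value)
end

escaped = xml_text("Something & Something Else")
puts "<MyElement>#{escaped}</MyElement>"

# A consumer that decodes the entity gets the raw column content back.
puts CGI.unescapeHTML(escaped)
```

The escaped form exists only on the wire; after parsing, the client sees the original ampersand.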

Using brackets in cookie names (Rails)

When attempting to write/read cookies that have brackets in the name, it seems like Rails can't handle this. For example:
cookies["example[]"] = "value"
This causes the cookie name to be "example%5B%5D" instead of "example[]". Similarly, if I already have a cookie set with the name "example[]", then it seems like Rails is unable to properly delete it via a call cookies.delete "example[]" since the [ and ] characters are being encoded.
Anyone know how to fix this?
The RFC does not specify exactly which characters are allowed in a cookie name; all it says is that the name needs to be text. I guess Rails is URL-encoding the text, which is why the brackets become %5B%5D. I think it's best to avoid such characters in cookie names.
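The encoded name reported in the question matches ordinary URL encoding. A minimal Ruby sketch using CGI.escape from the standard library shows the same transformation; whether Rails calls exactly this function internally is an assumption, and encoded_cookie_name is a hypothetical helper.

```ruby
require "cgi"

# URL-encodes a cookie name the way the observed output suggests.
def encoded_cookie_name(name)
  CGI.escape(name)
end

puts encoded_cookie_name("example[]")   # the name as it appears in the browser
puts CGI.unescape("example%5B%5D")      # decoding gives the bracketed name back
```

Names made up only of letters, digits, and underscores pass through unchanged, which is why avoiding brackets sidesteps the problem.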
Looks like this can only be done by hacking the Rails core. Sucks that the Rails developers implemented it this way.
