Error 'incompatible character encodings: ASCII-8BIT and UTF-8' due to 8-bit encoding of cookies (Rails 3 and Ruby 1.9) - ruby-on-rails

I moved a web app from Ruby 1.8.7 to 1.9.2 and now I keep getting
incompatible character encodings: ASCII-8BIT and UTF-8
I have set the database encoding to UTF-8 and I also have 'config.encoding = "utf-8"'.
I saw some ideas as possible workarounds and I added
Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8
But it didn't work either.
One specific chunk of code where I am getting this error is
%ul.address
  - @user.address.split(',').each do |line|
    %li= line.titleize
I'm using HAML. I checked line.titleize, and its encoding is UTF-8. It seems that the template is being rendered as ASCII-8BIT, and it breaks each time I try to render characters like 'ñ'.
I'm working with Rails 3.0.5.
I have read the post by James Edward Gray, but I still can't figure out what is going on ;(.
I'd really appreciate any kind of help :D.
I also tried:
"string".force_encoding("UTF-8")
And
# encoding: utf-8
Without any luck.
Fixed
See comments.

I just ran into something similar ... and found the fix hidden in the comments to this question, but think it is worth highlighting explicitly:
Cookies are ASCII-8BIT, but Rails 3 templates are UTF-8 by default. This means using a raw cookie value in a view may raise Encoding::CompatibilityError (if the user has an incompatible character in the cookie value).
The fix (as noted by Adolfo Builes) is to coerce your cookie values to UTF-8, as in:
cookies["location"].force_encoding('UTF-8')
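The failure and the fix can be reproduced outside Rails. This is only a sketch: the helper name cookie_as_utf8 and the sample strings are made up for illustration, and it assumes the cookie bytes really are UTF-8.

```ruby
# Hypothetical helper (not from the original answer): relabel a raw cookie
# value as UTF-8 without modifying the original string.
def cookie_as_utf8(raw_value)
  raw_value.dup.force_encoding('UTF-8')
end

# Simulated raw cookie: the UTF-8 bytes of "Logroño", tagged as binary (ASCII-8BIT).
raw_cookie = "Logro\xC3\xB1o".b
label      = "Ubicación: " # UTF-8 literal with a non-ASCII character, like a template

begin
  label + raw_cookie
rescue Encoding::CompatibilityError => e
  puts "raised: #{e.class}"
end

# After relabeling, both strings are UTF-8 and concatenate cleanly.
puts label + cookie_as_utf8(raw_cookie)
```

Note that force_encoding only changes the label on the bytes; it does not transcode or validate them, so a cookie holding bytes in some other encoding would still need separate handling.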

For HAML, put the line
-# coding: UTF-8
at the very top of the template, starting in the first column.

Related

Error: "incompatible character encodings: ASCII-8BIT and UTF-8" & "invalid byte sequence in UTF-8" when create a record RoR into SQL Server 2008 R2

I have built a web application with Ruby on Rails, using:
Ruby: ruby 1.9.2p290
Rails: rails (3.0.7)
Database: SQL Server 2008 R2
I tested it locally by entering Traditional Chinese characters into the input/textbox and creating the record. It works fine.
The data: 長壽
Result in the SQL Server table, which uses varchar(255) with SQL_Latin1_General_CP1_CI_AS: 長壽
With this, I can also view the text in the UI.
But when I gave the application, the Ruby gems, and the preferences to my friends, they could not enter Traditional Chinese characters. It said "incompatible character encodings: ASCII-8BIT and UTF-8" when they clicked the save button.
Even when I restored my database on their machines, the web application could not show the value properly; it said "invalid byte sequence in UTF-8", although they are using the same settings as mine, such as SQL Server, Ruby version, and path.
The only difference is that I'm using the English version of Windows 10 and my friends are using the Chinese version of Windows 10.
I have tried:
1. config.encoding = "utf-8" is already in the application.rb file.
2. Putting # encoding: utf-8 at the top of the model and controller.
3. Adding the following two lines above the Rails.application.initialize! line in environment.rb:
   Encoding.default_external = Encoding::UTF_8
   Encoding.default_internal = Encoding::UTF_8
4. Adding encoding: utf8 in database.yml. The data can be saved, but it becomes question marks (??) in the database table, and when I try to view it, it also says "invalid byte sequence in UTF-8".
I can provide more information if needed. :)
After researching for a long time, I found that the Windows encoding was the problem.
I realized that I couldn't enter symbols such as "✓" either; it returned the same error. I then used SQL Profiler to trace what was sent to the database table for that input: it was the same as "✓", but it should have been the encoded form of "✓" (I had also checked the value on my local machine).
After more research and googling, I found this post: https://superuser.com/questions/336197/unicode-characters-suddenly-start-displaying-as-boxes-in-some-applications?newreg=8b561b87000d4e79b20cbea9c3d2eb07
I tried it and restarted the PC, and my application can save "✓" and Chinese characters again. They are saved to the database as encoded strings.
(I'm still looking for the reason why the Windows Unicode setting is the root of this problem.)

Stubborn character encoding errors when reading strings from text file (Ruby/Rails)

I've been trying to import a long text file generated from a PDF reader application (SODA-PDF). The source document is a script in PDF format.
The converted text files look OK in Notepad, but I get a variety of errors when trying to read the file into a string and manipulate it.
None of the following methods, which I've seen in various threads, seems to work:
clean1=Iconv.conv('ASCII//IGNORE', 'UTF8', s)
or
clean1=s.encode('UTF-8', invalid: :replace, undef: :replace, replace: '', UNIVERSAL_NEWLINE_DECORATOR: true)
or
clean1=s.gsub(/[\u0080-\u00ff]/,"")
The first method, using Iconv, gives
Iconv::InvalidEncoding: invalid encoding ("ASCII", "UTF8")
when invoked.
The second method appears to work, but fails on various string manipulations like
lines= s.split("\n") unless s.blank?
with
ArgumentError: invalid byte sequence in UTF-8
(Either split or blank? will throw the exception.)
The 3rd method also fails with the 'invalid byte sequence in UTF-8' error.
I am quite hazy on the whole character encoding thing, so excuse any obvious stupidity here.
I'm going to try character-by-character filtering, but that's kind of a pain since the docs I am working with can be 100+ pages, and I'm hoping there's an easier solution.
Env: Win7 64/ ruby 1.9.3p484 (2013-11-22) [i386-mingw32] / Rails 4.0.3
I discovered that my source file was encoded in ISO-8859-1. Was able to convert to UTF-8 and it all works fine now.
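For anyone in the same situation, the conversion can be done in Ruby itself once the real source encoding is known. A minimal sketch, assuming the file really is ISO-8859-1; the helper names latin1_to_utf8 and read_latin1_file are made up for illustration:

```ruby
# Hypothetical helper: interpret raw bytes as ISO-8859-1 and transcode to UTF-8.
def latin1_to_utf8(bytes)
  bytes.dup.force_encoding('ISO-8859-1').encode('UTF-8')
end

# Hypothetical helper: read a whole file that is known to be ISO-8859-1.
# The mode "r:ISO-8859-1:UTF-8" means external encoding ISO-8859-1,
# internal encoding UTF-8, so Ruby transcodes while reading.
def read_latin1_file(path)
  File.open(path, 'r:ISO-8859-1:UTF-8', &:read)
end

raw  = "caf\xE9\n".b           # "café" as ISO-8859-1 bytes
text = latin1_to_utf8(raw)
puts text.valid_encoding?      # true
puts text.split("\n").inspect  # split works on the transcoded string
```

The reason the earlier attempts failed is that they treated the bytes as UTF-8 in the first place; telling Ruby the actual source encoding lets it transcode every byte instead of rejecting them.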

Rails View Encoding Issues

I'm using Ruby 2.0 and Rails 3.2.14. My view is littered with several UTF-8 characters, mainly currency symbols like บาท and د.إ. I noticed some
(ActionView::Template::Error) "incompatible character encodings: ASCII-8BIT and UTF-8"
in our production code and promptly tried visiting the page URL in my browser without any issues. On digging in, I realised the error was actually caused by BingBot and a few spiders. When I tried to curl the same URL, I was able to reproduce the issue. So, if I try
curl http://localhost:3000/?x=✓
I get the error wherever UTF-8 symbols are used in the view code. I also realised that if I use HTML-encoded strings in place of the symbols, this does not occur. However, I prefer using the actual symbols.
I have already tried setting Encoding.default_external = Encoding::UTF_8 in environment.rb and adding the # encoding: utf-8 magic comment to the top of the file, and it does not help.
So, why does this error occur? What is the difference between hitting this URL in a browser and with curl, besides cookies? And how do I go about fixing this issue so that BingBot can index our site? Thanks.
The culprit that was leaking non UTF-8 characters in my template was an innocuous meta tag for Facebook Open Graph
%meta{property: "og:url", content: request.url}
And when the request is non-standard, this causes the encoding issue. Changing it to
%meta{property: "og:url", content: request.url.force_encoding('UTF-8')}
did the trick.
That error message usually occurs when you try to concatenate strings with different character encodings.
Is your database set to use UTF-8 as well?
If not, you could have a problem when you try to insert the non-UTF8 values into your UTF-8 template.
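To illustrate the point about mixing encodings: Ruby only refuses to combine the strings when both of them contain non-ASCII bytes, which would explain why HTML-encoded symbols in the view avoid the error. A small sketch (the helper name concatenable? and the sample strings are made up for illustration):

```ruby
# Returns true when Ruby can concatenate the two strings without raising.
# Encoding.compatible? returns the resulting encoding, or nil if incompatible.
def concatenable?(a, b)
  !Encoding.compatible?(a, b).nil?
end

utf8_plain   = "price: "              # UTF-8, ASCII-only
utf8_symbol  = "price: ฿"             # UTF-8 with a non-ASCII character
binary_plain = "/?x=1".b              # ASCII-8BIT, ASCII-only
binary_high  = "/?x=\xE2\x9C\x93".b   # ASCII-8BIT with raw high bytes (an unescaped ✓)

puts concatenable?(utf8_symbol, binary_plain) # true
puts concatenable?(utf8_plain, binary_high)   # true
puts concatenable?(utf8_symbol, binary_high)  # false
```

Only the last combination, a non-ASCII template plus a binary string with high bytes, corresponds to the error in the question.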

Isn't user data that comes in from a form in Rails going to be UTF-8 encoded?

A Rails 3.2 app I'm contributing to has a method that coerces user input to UTF-8.
require "iconv"
def normalize(user_input_text)
  Iconv.new('UTF-8//IGNORE', 'UTF-8').iconv(user_input_text.dup)
end
It basically encodes the string in UTF-8 and ignores characters that can't be transcoded.
But isn't all user data that's entering Rails through a form going to be UTF-8 encoded?
In other words, isn't this code specious and unnecessary?
These resources suggest that indeed you are right.
Now that the vast majority of web input is UTF-8, we set
the inbound parameters to UTF-8. This will eliminate many
cases of incompatible encodings between ASCII-8BIT and
UTF-8.
https://github.com/rails/rails/commit/25215d7285db10e2c04d903f251b791342e4dd6a
Rails 3 solves this very nicely by doing a number of things including interpreting params as UTF-8 and adding workarounds for Internet Explorer
http://jasoncodes.com/posts/ruby19-rails2-encodings
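If a normalization step is kept anyway (for example for text that does not arrive through params), note that iconv was removed from the standard library in Ruby 2.0. A rough sketch of a similar method using String#scrub, which needs Ruby 2.1 or later; this is an assumed replacement with roughly the same intent, not the app's actual code:

```ruby
# Hypothetical iconv-free variant of the normalize method from the question.
# Assumes the input bytes are meant to be UTF-8; invalid byte sequences are
# dropped, and the caller's string is left untouched.
def normalize(user_input_text)
  user_input_text.dup.force_encoding('UTF-8').scrub('')
end

puts normalize("caf\xC3\xA9 \xFFok".b) # the stray \xFF byte is removed
```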

Displaying ©, & symbol in excel with Ruby on Rails

I am exporting my data into an Excel file with the Spreadsheet gem and Ruby on Rails. I want to add a header and footer to my Excel file. The problem is that when I do this, the copyright, ampersand, and registered symbols are not displayed. Either it throws a multibyte character error or it simply displays nothing.
I have gone through all similar problems and even tried # encoding utf-8 and "# -*- coding: utf-8 -*-". It was of no use.
When I tried to use an escape sequence ("\u00A9", the Unicode code for ©), the file format got corrupted. Are there any possible solutions for this problem? Am I missing something?
Kindly help.
Thanks in advance
This code works for me:
require 'spreadsheet'

def do_test
  book = Spreadsheet::Workbook.new
  sheet1 = book.create_worksheet
  sheet1[0,0] = "\u00a9"   # copyright symbol as a Unicode escape
  book.write "./sample.xls"
end
It is possible that you may have set the spreadsheet encoding to something other than UTF-8 at some point. You can check Spreadsheet.client_encoding to see what is being used.
UPDATE
The add_header/footer code is very encoding specific. Here is the code used:
def write_header
  write_op opcode(:header), [@worksheet.header.bytesize, 0].pack("vC"), @worksheet.header
end
The Excel writer is using Unicode-1200 (UTF-16 little endian) by default. This may mean that you need to encode any non-standard characters using "\u00a9".encode('UTF-16LE') in order to get this to work...
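As a quick check of what that encoding step does to the string (plain Ruby only; whether the gem's header writer then accepts the result is not verified here, and the helper name to_utf16le is made up):

```ruby
# Hypothetical helper: transcode header/footer text to UTF-16LE.
def to_utf16le(text)
  text.encode('UTF-16LE')
end

copyright = "\u00a9 2014"
encoded   = to_utf16le(copyright)

puts copyright.length    # 6 characters
puts copyright.bytesize  # 7 bytes in UTF-8 (© takes two bytes)
puts encoded.bytesize    # 12 bytes in UTF-16LE (two bytes per character here)
```

Since write_header packs the header's bytesize into the record, the byte count written depends on which encoding the string is in at that point, which may be why a mismatch corrupts the file.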

Resources