incompatible character encodings: ASCII-8BIT and UTF-8 in Ruby 1.9

incompatible character encodings: ASCII-8BIT and UTF-8 in Ruby 1.9 - ruby-on-rails

I'm getting the following error with my Ruby 1.9 & Rails 2.3.4. This happens when user submits a non-ASCII standard character.
I read a lot of online resources but none seems to have a solution that worked.
I tried using (as some resources suggested)
string.force_encoding('utf-8')
but it didn't help.
Any ideas how to resolve this? Is there a way to eliminate such characters before saving to the DB? Or, is a there a way to make them show?

For ruby 1.9 and Rails 3.0.x, use the mysql2 adapter.
In your gemfile:
gem 'mysql2', '~> 0.2.7'
and update your database.yml to:
adapter: mysql2
http://www.rorra.com.ar/2010/07/30/rails-3-mysql-and-utf-8/

I don't know much about Ruby (or Rails), but I imagine the problem is caused by a lack of control over your character encodings.
First, you should decide which encoding you're storing in your database. Then, you need to make sure to convert all text to that encoding before storing in the database. In order to do that, you first need to know which encoding it is to begin with.
One often repeated piece of advice is to decode all input from whatever encoding it uses, to unicode (if your language supports it) as soon as possible after you get control of it. Then you know that all the text you handle in your program is unicode. On the other end, encode the text to whatever output-encoding you want as a last step before outputting it.
The key is to always know which encoding a piece of text is using at any given place in your code.

Related

Foreign character issue with CSV import to Heroku Postgres DB

I have a rails app where my users can manually set up products via a web form. This works fine and accepts foreign characters well, words like 'Svölk' for example.
I now have a need to bulk import products and am using FasterCSV to do so. Generally this works without issue, but when the CSV contains foreign characters it stalls at that point.
Am I correct to believe the file needs to be UTF-8 in the first instance?
Also, I'm running Ruby 1.8.7 so is ICONV my only solution for converting the file? This could be an issue as the format of the original file won't be known.
Have others encountered this issue and if so, how did you overcome it?

You have two alternatives:
Use ensure_encoding gem to find the actual encoding of the strings.
Use Ruby to determine the file encoding using:
File.open(source_file).read.encoding
I prefer the first approach as it tries to detected the encoding based on Strings, and tries to convert to your desired encoding (UTF-8) and then you can set the encoding on FasterCSV options.

Rails: Is it a bad idea to put double-byte character inside a model?

I've learnt that you may define a Ruby source file as UTF-8 to be able to key inside it double-byte characters (e.g.: ¤) instead of their HTML code (e.g.: & curren;):
# encoding: UTF-8
class Price < ActiveRecord:Base
def currency_symbol
'¤'
end
end
Without the encoding statement, I would need to write '& curren;'.html_safe as the core of the method.
I don't like the later because it assume I'm writing HTML (I have Excel output in my app on top of HTML).
My question is: Is there any problems or performance hits I must be aware while doing this?
Note: Ruby 2.0 brings UTF-8 as the default encoding; does it mean all Ruby files will automatically support all those characters?
Character chart: http://dev.w3.org/html5/html-author/charref

This is exactly the kind of thing that should go in the locales (config/locales). These are YAML files that define words and characters that will be used in the various parts of your application, including currency symbols. It also has the benefit of allowing you to easily introduce translations for other languages.
Take a look at the ruby on rails guide for i18n for more.

Dealing with a non-ascii character in Rspec Testing

I'm using the DocSplit gem for Ruby 1.9.3 to create Unicode UTF-8 versions of word documents. To my surprise today while I was running a test on a particular piece of one of these documents I started running into character encoding inconstencies.
I have tried a number of different methods to resolve the issue which I will list below, but the best success I've had so far is to remove all non-ASCII characters. This is far from ideal, as I don't think the character's are really going to be all that problematic in the DB.
gsub(/[^[:ascii:]]/, "")
This is a sample of what my output looks like vs. what I'm expecting:
My CODES'S APOSTROPHE
My CODES’S APOSTROPHE
The second apostrophe should look squiggly. If you paste it into irb, you get the following: \U+FFE2
I tried Regexing specifically for this character and it appears to work in Rubular. As soon as I put it in my model however, I got a syntax error.
syntax error, unexpected $end, expecting ')'
raw_title = raw_title.gsub(/’/, "")
I also tried forcing the encoding to UTF-8, but everything is already in UTF-8 and this does not appear to have an effect. I tried forcing the output to US-ASCII, but I get a byte sequence error.
I also tried a few of the encoding options found in Ruby library. These basically did the same thing as the Regex.
This all comes down to that I'm trying to match output for testing purposes. Should I even be concerned about these special characters? Is there a better way to match these characters without blindly removing them?

Try adding:
# encoding: utf-8
at the top of the failing rspec file. This should ensure things like:
raw_title = raw_title.gsub(/’/, "")
in your spec work.

I tried using the above example. but even after that it kept failing. So I used iconv to convert that specfic character. THis is what I used
Iconv.conv('ASCII//IGNORE', 'UTF8', text_to_be_converted)
I tried what was given in the following link - How to get rid of non-ascii characters in ruby

UTF-8 issue in Ruby on Rails with × character

<a class="close" href="#">×</a>
I get an error regarding the use of ×.
It's used in error messages on twitter's bootstrap framework, I get an invalid byte sequence in UTF-8 error when I try to use it. Is there any work-around? Apart from using a normal x or X.
I have:
# Configure the default encoding used in templates for Ruby 1.9.
config.encoding = "utf-8"
In my application.rb

This seems almost too simple, but why aren't you using ×?

You need to set the encoding at the top of the file where that character is used. You can do this with:
# coding: utf-8
class MyClass
end
I haven't tried it in an erb file, but I don't see why that would be any different. I think you can use the word "encoding" too instead of just "coding" if that feels better. All that is required is at minimum "coding".

What editor are you using?
I suspect that you are saving the source file using an encoding other than UTF-8 (such as Latin-1 or ANSI on Windows), which is then causing ruby to fail to interpret the file correctly.
I've tried adding the times symbol to one of my views (using HAML) and it worked correctly. I'm using VIM as my editor and saving in UTF-8 without any BOM.

#encoding: utf-8
class ClassiClass
end
everything works fine!

Encoding problems in rails on ruby 1.9.1

I am using rails 2.3.3 and ruby 1.9.1.
I am trying to render a view that includes a partial. In the partial i output a field of a model that is encoded in UTF8.
This fails with
ActionView::TemplateError (incompatible character encodings: ASCII-8BIT and UTF-8) on line #248 of app/views/movie/show.html.erb:
245: <!-- Coloumn right | start -->
246: <div class="col_right">
247:
248: <%= render :partial => 'movie_stats' %>
249:
250: <!-- uploaders -->
251: <div class="box_white">
On the other hand, i can output the field with utf8 content just fine if i directly use that field in a view (when it is not in a partial).
How can i fix this?
I already tried setting the default encoding but that did not seem to work.

I just had this as well so I think its worth having the correct answer.
The 2.8.1 MySql gem is not utf-8 friendly, so it sometimes will return UTF strings and lie to Rails, telling it that they are ASCII when in fact they are UTF-8. This makes things explode.
So: you can either monkey patch or get a compatible MySql gem. See: http://gnuu.org/2009/11/06/ruby19-rails-mysql-utf8/

There appears to be an issue with ERB's encoding in Ruby 1.9. More details are in this Lighthouse ticket. A patch with a workaround has been included, perhaps it works for you?
The problem is erb code in ruby 1.9 distribution. When it compiles the template code it forces a 'ASCII-8bit' encoding, the problem is when the template code has multibyte characters the template code is returned in a 'ASCII-8bit' string and when this string is concat with a 'UTF8' string with multibyte character the exception is raised because the strings between this encodings are only compatible when both only have seven-bit characters.

There seems to be an incompatibility between Ruby 1.9x and the mysql gem with regard to how strings are passed back and forth (specifically the encoding of the strings).
To fix, run
gem install mysql2
on the server and update the database configuration file to use this gem instead of the previous one.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart