Encoding::UndefinedConversionError "\xE7" from ASCII-8BIT to UTF-8 - ruby-on-rails

I'm working on a Rails API project.
Here is my code snippet:
class PeopleController < ApplicationController
  respond_to :json
  def index
    respond_with Person.all
  end
end
When I visit the URL localhost:3000/people.json, I get:
Encoding::UndefinedConversionError at /people.json
"\xE7" from ASCII-8BIT to UTF-8
I've been trying to solve this issue since last week, but I'm still fighting with it.
I've found a bunch of similar questions on Stack Overflow, such as this and this, but none of the solutions worked for me.
Here is my configuration.
Rails 4.2.7.1
ruby-2.3.1
Operating system: macOS Sierra
Output of locale
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=
Content of ~/.bash_profile
export LC_CTYPE="utf-8"
export LC_CTYPE=en_US.UTF-8
export LANG=en_US.UTF-8
unset LC_ALL
Output of Encoding.default_external
#<Encoding:UTF-8>

I have had this problem many times, so I usually try to get rid of any characters that are invalid in UTF-8 BEFORE saving them to the database. If your record is stored as a String, you can replace invalid characters like so:
string = "This contains an invalid character \xE7"
string.encode('UTF-8', invalid: :replace, undef: :replace)
#=> "This contains an invalid character �"
This is, of course, done before converting the record to a JSON object.
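The error in the question usually means that a string tagged as ASCII-8BIT (binary) and containing non-ASCII bytes is being transcoded to UTF-8 while the JSON is built. Below is a minimal sketch of the cleanup step, assuming the bytes are meant to be UTF-8 and only some of them are invalid. The helper name `clean_utf8` is hypothetical, and `String#scrub` needs Ruby 2.1 or later.

```ruby
# Hypothetical helper: relabel a (possibly binary) string as UTF-8 and
# replace byte sequences that are not valid UTF-8 with U+FFFD.
def clean_utf8(value)
  str = value.dup.force_encoding(Encoding::UTF_8)
  str.valid_encoding? ? str : str.scrub("\uFFFD")
end

# A binary string: a valid two-byte "é" followed by a stray 0xE7 byte.
raw = "caf\xC3\xA9 \xE7".b
clean = clean_utf8(raw)

puts clean.encoding          # UTF-8
puts clean.valid_encoding?   # true
```

Calling something like this in a `before_save` callback or just before serialization keeps the valid characters and replaces only the broken bytes.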

Related

Special characters seem to "kill" the rails console

Problem:
Being a German dev, I often have umlauts in my strings, comments, etc. When I paste those into the rails console, though, the prompt becomes unresponsive. The only way to recover is to quit and restart the rails console.
Request:
Is there any way I can configure or fix the console so that it accepts special characters as input?
Example:
# paste in console to render it useless
def is_a_test
  puts "it's a string with ä ö and ü"
end
This leads to the following in the console:
irb(main):004:0> # paste in console to render it useless
=> nil
irb(main):005:0> def is_a_test
irb(main):006:1> puts "it's a string with
irb(main):007:1"
irb(main):008:1" ^C
irb(main):008:0>
Thanks in advance

Encoding::UndefinedConversionError ("\xE2" from ASCII-8BIT to UTF-8): error in ROR + MongoDB based app

I had a developer write this method, and it's causing an Encoding::UndefinedConversionError ("\xE2" from ASCII-8BIT to UTF-8) error.
The error only happens intermittently, so the data in the original DB field must be what is causing the issue. Since I don't have any control over that data, what can I put in the method below so that bad data doesn't cause any problems?
def scrub_string(input, line_break = ' ')
  begin
    input.an_address.delete("^\u{0000}-\u{007F}").gsub("\n", line_break)
  rescue
    input || ''
  end
end
Will this work?
input = input.encode('utf-8', :invalid => :replace, :undef => :replace, :replace => '_')
Yes, this should work: it replaces any bytes that can't be converted to UTF-8 with an underscore.
Read more about encoding strings in ruby here:
http://ruby-doc.org/core-1.9.3/String.html#method-i-encode
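One caveat, offered as a sketch rather than a verdict on the answer above: when the input is tagged ASCII-8BIT, `encode` treats every byte above 0x7F as undefined, so valid multi-byte UTF-8 characters (a curly apostrophe starts with the byte \xE2, for instance) are replaced as well. Assuming the stored bytes really are UTF-8, relabelling the string and then scrubbing it keeps those characters. The helper name `to_utf8` is hypothetical.

```ruby
# Hypothetical helper: assume the bytes are UTF-8, keep valid characters,
# and replace only the sequences that are actually invalid.
def to_utf8(input, replacement = '_')
  input.to_s.dup.force_encoding(Encoding::UTF_8).scrub(replacement)
end

# Curly apostrophe (three bytes) followed later by a stray 0xE2 byte.
binary = "It\xE2\x80\x99s \xE2".b

# Transcoding from ASCII-8BIT replaces every high byte individually.
lossy = binary.encode('UTF-8', invalid: :replace, undef: :replace, replace: '_')

# Relabelling keeps the apostrophe and replaces only the stray byte.
kept = to_utf8(binary)

puts lossy
puts kept
```

Which variant is right depends on what the database actually holds. If the bytes are in some other encoding, such as ISO-8859-1, neither relabelling as UTF-8 nor blind replacement recovers the original characters.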

"\xC2" to UTF-8 in conversion from ASCII-8BIT to UTF-8

I have a rails project that runs fine with MRI 1.9.3. When I try to run with Rubinius I get this error in app/views/layouts/application.html.haml:
"\xC2" to UTF-8 in conversion from ASCII-8BIT to UTF-8
It turns out the page had an invalid character (an interpunct '·'), which I found out with the following code (credits to this gist and this question):
lines = IO.readlines("app/views/layouts/application.html.haml").map do |line|
  line.force_encoding('ASCII-8BIT').encode('UTF-8', :invalid => :replace, :undef => :replace, :replace => '?')
end
File.open("app/views/layouts/application.html.haml", "w") do |file|
  file.puts(lines)
end
After running this code, I could find the problematic characters with a simple git diff. I then moved the code to a helper file with # encoding: utf-8 at the top.
I'm not sure why this doesn't fail with MRI; arguably it should, since I'm not specifying the encoding of the HAML file.
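The snippet above overwrites the template and relies on git diff to show what changed. As an alternative sketch, the same search can be done read-only by listing the lines that contain non-ASCII bytes. The method name `non_ascii_lines` and the sample template content are made up for illustration. For reference, the interpunct U+00B7 is the two bytes \xC2\xB7 in UTF-8, which matches the \xC2 in the error message.

```ruby
require 'tempfile'

# Return [line_number, line] pairs for every line that contains a
# non-ASCII byte. The file is opened in binary mode and never modified.
def non_ascii_lines(path)
  File.open(path, 'rb') do |file|
    file.each_line.with_index(1)
        .reject { |line, _number| line.ascii_only? }
        .map { |line, number| [number, line.chomp] }
  end
end

# Demo on a throwaway file standing in for the HAML layout.
Tempfile.create('layout') do |f|
  f.write("%title Home\n%p a \u00B7 b\n")
  f.flush
  non_ascii_lines(f.path).each do |number, line|
    puts "#{number}: #{line.inspect}"
  end
end
```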

Rails: image_tag helper / umlaut in file name throws error in production

I am uploading an image with a file name containing an umlaut via dragonfly in a Rails 3 app on Heroku. Then I'm trying to display the image using
image_tag @model.image.url, …
In development everything works just fine, but in production I'm getting:
incompatible character encodings: UTF-8 and ASCII-8BIT
.bundle/gems/ruby/1.9.1/gems/actionpack-3.0.7/lib/action_view/helpers/tag_helper.rb:129:in `*'
After reading a bit I've added
Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8
in environment.rb but the problem remains.
What is the proper way to go about this? Do I have to fix the file name when uploading? I was under the impression this should work just fine in Rails 3?
Well, you could try something like url.force_encoding('UTF-8')
You could also simply sanitize the url in the model before saving it to the database - that's what I did. And, yes, I sometimes stumble over this in the weirdest places, too.
This is what my model looked like:
# encoding: UTF-8
class Page < ActiveRecord::Base
  before_save :sanitize_title
  private
  def sanitize_title
    self.title = self.title.force_encoding('UTF-8').downcase.gsub(/[ \-äöüß]/, ' ' => '_', '-' => '_', 'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue', 'ß' => 'ss').gsub(/[^a-z_]/,'')
  end
end
This will replace the German umlauts with their ASCII counterparts, convert spaces and hyphens to underscores, and drop everything else.
The first line, # encoding: UTF-8, is important; otherwise Ruby will complain about non-ASCII characters in the model file.
In addition to @Rhywden's answer, here is my solution specific to Dragonfly:
image_accessor :image do
  after_assign { |i| i.name = sanitize_filename(image.name) }
end
def sanitize_filename(filename)
  filename.strip.tap do |name|
    name.sub!(/\A.*(\\|\/)/, '')   # keep only the file name, not the whole path
    name.gsub!(/[^\w\.\-]/, '_')   # replace everything except word characters, dots and dashes
  end
end
Details here http://markevans.github.com/dragonfly/file.Models.html and here http://guides.rubyonrails.org/security.html#file-uploads .
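The two answers can be combined into one standalone function, shown here as a sketch under assumptions rather than as part of either answer. The name `ascii_filename` and the `UMLAUTS` table are hypothetical. The function assumes the incoming bytes are UTF-8, transliterates German characters, and only then replaces whatever unusual characters remain. In Ruby, `\w` matches only ASCII word characters, so without the transliteration step every umlaut would become an underscore.

```ruby
# Hypothetical mapping of German characters to ASCII replacements.
UMLAUTS = {
  'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue',
  'Ä' => 'Ae', 'Ö' => 'Oe', 'Ü' => 'Ue', 'ß' => 'ss'
}.freeze

# Hypothetical helper: produce an ASCII-only file name.
def ascii_filename(filename)
  # Assume UTF-8 bytes; replace anything invalid before further processing.
  name = filename.dup.force_encoding(Encoding::UTF_8).scrub('_').strip
  name = name.sub(/\A.*[\\\/]/, '')        # drop any directory part
  name = name.gsub(/[äöüÄÖÜß]/, UMLAUTS)   # transliterate German characters
  name.gsub(/[^\w.\-]/, '_')               # replace every other unusual character
end

puts ascii_filename("C:\\Bilder\\Größe über alles.jpg")
puts ascii_filename("M\xC3\xBCller.png".b)
```

Because the result contains only ASCII, it can be concatenated with UTF-8 or binary strings without raising the "incompatible character encodings" error.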

How to change the encoding during CSV parsing in Rails

I would like to know how I can change the encoding of my CSV file when I import and parse it. I have this code:
csv = CSV.parse(output, :headers => true, :col_sep => ";")
csv.each do |row|
  row = row.to_hash.with_indifferent_access
  insert_data_method(row)
end
When I read my file, I get this error:
Encoding::CompatibilityError in FileImportingController#load_file
incompatible character encodings: ASCII-8BIT and UTF-8
I read about row.force_encoding('utf-8') but it does not work:
NoMethodError in FileImportingController#load_file
undefined method `force_encoding' for #<ActiveSupport::HashWithIndifferentAccess:0x2905ad0>
Thanks.
I had to read CSV files encoded in ISO-8859-1.
Doing the documented
CSV.foreach(filename, encoding:'iso-8859-1:utf-8', col_sep: ';', headers: true) do |row|
threw the exception
ArgumentError: invalid byte sequence in UTF-8
from csv.rb:2027:in '=~'
from csv.rb:2027:in 'init_separators'
from csv.rb:1570:in 'initialize'
from csv.rb:1335:in 'new'
from csv.rb:1335:in 'open'
from csv.rb:1201:in 'foreach'
so I ended up reading the file and converting it to UTF-8 while reading, then parsing the string:
CSV.parse(File.open(filename, 'r:iso-8859-1:utf-8'){|f| f.read}, col_sep: ';', headers: true, header_converters: :symbol) do |row|
  pp row
end
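The same read-and-transcode step can be written a little more compactly with `File.read` and its `encoding:` option. This is a sketch under the same assumption as the answer above, namely that the file really is ISO-8859-1. The method name `read_latin1_csv` and the sample data are made up for illustration.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: read an ISO-8859-1 file, transcode it to UTF-8 while
# reading, then parse the resulting UTF-8 string.
def read_latin1_csv(path)
  text = File.read(path, encoding: 'ISO-8859-1:UTF-8')
  CSV.parse(text, col_sep: ';', headers: true, header_converters: :symbol)
end

# Demo: write Latin-1 bytes to a throwaway file and parse them back.
Tempfile.create('import') do |f|
  f.binmode
  f.write("name;city\nJos\xE9;K\xF6ln\n".b)
  f.flush
  read_latin1_csv(f.path).each { |row| puts row.to_h.inspect }
end
```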
force_encoding is meant to be run on a string, but it looks like you're calling it on a hash. You could say:
output.force_encoding('utf-8')
csv = CSV.parse(output, :headers => true, :col_sep => ";")
...
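Note that `force_encoding` only changes the label on the string; it does not convert any bytes. If the upload is really in another encoding (ISO-8859-1 is assumed here, as in the other answer), labelling it as UTF-8 leaves invalid sequences behind. A minimal sketch of the alternative is to declare the real source encoding first and then transcode. The sample data is invented.

```ruby
require 'csv'

# Bytes as they might arrive from an ISO-8859-1 upload, tagged as binary.
output = "name;city\nJos\xE9;M\xFCnchen\n".b

# Merely relabelling as UTF-8 does not make the bytes valid UTF-8.
relabelled_ok = output.dup.force_encoding(Encoding::UTF_8).valid_encoding?
puts relabelled_ok   # false

# Declare the actual source encoding, then transcode to UTF-8.
utf8 = output.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)

rows = CSV.parse(utf8, headers: true, col_sep: ';').map(&:to_h)
puts rows.inspect
```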
Hey, I wrote a little blog post about what I did, but it's slightly more verbose than what's already been posted. For whatever reason, I couldn't get those solutions to work, and this did.
The gist is that I simply replace (or in my case, remove) the invalid/undefined characters in my file and then rewrite it. I used this method to convert the files:
def convert_to_utf8_encoding(original_file)
  original_string = original_file.read
  # If you'd rather invalid characters be replaced with something else, do so here.
  final_string = original_string.encode(invalid: :replace, undef: :replace, replace: '')
  final_file = Tempfile.new('import') # No need to save a real File
  final_file.write(final_string)
  final_file.close # Don't forget me
  final_file
end
Hope this helps.
Edit: No destination encoding is specified here because encode, when called without one, transcodes to Encoding.default_internal, which for most Rails applications is UTF-8 (I believe).
