Rails loading ActiveSupport regex with ISO-8859-1 encoding instead of UTF-8

When I call '返回'.titleize on a Chinese-language string in the Rails 6 app on my server, I get an error:
Encoding::CompatibilityError: incompatible encoding regexp match (ISO-8859-1 regexp with UTF-8 string)
The source for the titleize function leads to the following in ActiveSupport::Inflector:
def titleize(word, keep_id_suffix: false)
  humanize(underscore(word), keep_id_suffix: keep_id_suffix).gsub(/\b(?<!\w['’`()])[a-z]/) do |match|
    match.capitalize
  end
end
Although calling ActiveSupport::Inflector.titleize('返回') gives the error above, if I just copy the function body and run it as follows, there's no error --- I just get the correct titleize behaviour:
ActiveSupport::Inflector.humanize(ActiveSupport::Inflector.underscore('返回'), keep_id_suffix: false).gsub(/\b(?<!\w['’`()])[a-z]/) do |match|
  match.capitalize
end
My guess is that the regular expression /\b(?<!\w['’`()])[a-z]/ is compiled with an ISO-8859-1 encoding when ActiveSupport::Inflector loads, but with a different encoding when I run it in my Rails console. I don't see what I can do with this information, though.
What can I do to get the Rails helper titleize to work on my server as intended?
I'm on Rails 6, Ruby 2.6.6, and ENV["LANG"] is en_US.utf8.
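The hypothesis in the question can be checked outside Rails. The following is a minimal sketch in plain Ruby, not a diagnosis of this particular server: it rebuilds the titleize pattern from the same source bytes relabeled as ISO-8859-1 (an assumption standing in for whatever made the regexp load with that encoding) and shows that only this version raises against a UTF-8 string.

```ruby
# The pattern as written in a UTF-8 source file.
utf8_re = /\b(?<!\w['’`()])[a-z]/

# Same bytes, relabeled as ISO-8859-1. This simulates (as an assumption)
# the inflector source being read with a Latin-1 encoding.
latin1_re = Regexp.new(utf8_re.source.dup.force_encoding(Encoding::ISO_8859_1))

puts utf8_re.encoding    # UTF-8
puts latin1_re.encoding  # ISO-8859-1

word = "返回"

# The UTF-8 regexp finds no lowercase ASCII letter and leaves the string unchanged.
ok_result = word.gsub(utf8_re) { |m| m.capitalize }

# The ISO-8859-1 regexp cannot be matched against a non-ASCII UTF-8 string.
error = begin
  word.gsub(latin1_re) { |m| m.capitalize }
  nil
rescue Encoding::CompatibilityError => e
  e
end
puts error.message if error
```

If the same check on the server shows a non-UTF-8 encoding for a regexp literal defined in a gem, the place to look is how Ruby is being told the source encoding there, since regexp literals take their encoding from the source file rather than from the string being matched.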

Related

invalid byte sequence in UTF-8 for single quote in Ruby

I'm using the following code to show description in template:
json.description resource.description if resource.description.present?
It gives me an "invalid byte sequence in UTF-8" error. I dug into this a little and found that the issue is that my description contains a curly single quote (’) instead of a straight one ('). What's the best way to fix this encoding issue? The description is entered by users and I have no control over it. Another odd thing: I have multiple test environments, all with the same Ruby and Rails versions and running the same code, but only one of them has this error.
def to_utf8(str)
  str = str.force_encoding("UTF-8")
  return str if str.valid_encoding?
  str = str.force_encoding("BINARY")
  str.encode("UTF-8", invalid: :replace, undef: :replace)
end
ref: https://stackoverflow.com/a/17028706/455770
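Note that when the input is not valid UTF-8, to_utf8 replaces every non-ASCII byte with a replacement character, so a curly quote stored in a legacy encoding is lost. Two alternatives are sketched below. The first uses String#scrub (Ruby 2.1 or later). The second assumes the stray bytes come from Windows-1252, which the question does not confirm. Both helper names are hypothetical.

```ruby
# Option 1 (Ruby 2.1+): replace invalid byte sequences in one call.
def scrub_to_utf8(str)
  # dup so the caller's string is not relabeled in place
  str.dup.force_encoding("UTF-8").scrub("?")
end

# Option 2: treat invalid input as Windows-1252 (an assumption about the
# data source) and transcode it instead of discarding the bytes.
def to_utf8_from_cp1252(str)
  utf8 = str.dup.force_encoding("UTF-8")
  return utf8 if utf8.valid_encoding?

  str.dup.force_encoding("Windows-1252")
     .encode("UTF-8", invalid: :replace, undef: :replace)
end

raw = "it\x92s".b                  # 0x92 is the curly apostrophe in Windows-1252
puts scrub_to_utf8(raw)            # it?s
puts to_utf8_from_cp1252(raw)      # it’s
```

Guessing the legacy encoding keeps more of the user's text, but it is only right if the guess is right; scrubbing is the safer default when the origin of the bytes is unknown.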

iconv deprecation warning with ruby 1.9.3

I'm getting this warning when I run rspec:
/gems/activesupport-3.1.0/lib/active_support/dependencies.rb:240:in `block in require': iconv will be deprecated in the future, use String#encode instead.
I get the same warning with rails 3.1.0, 3.1.1, 3.1.2.rc2 versions. Seems it's related to sqlite3 gem, but I'm not sure. There are no warnings with ruby 1.9.2
Any suggestions how to deal with it?
You are getting this deprecation notice because a library somewhere is requiring iconv.
iconv is a gem created by Matz that can be used to convert strings from one format to another.
For example this is often used:
Iconv.iconv('UTF-8//IGNORE', 'UTF-8', content)
This little bit of magic takes a UTF-8 string that may contain invalid characters and converts it to a proper UTF-8 string.
It has been decided that in Ruby 1.9.3 we should not be using iconv any more and instead use the built-in String#encode. encode is more powerful and allows you more flexibility.
The theory is that the above example could be replaced with:
string.encode("UTF-8", :invalid => :replace, :undef => :replace, :replace => "?")
In practice it seems this is imperfect.
This also leads to a less than easy story for gem creators who wish to support 1.8:
content = RUBY_VERSION.to_f < 1.9 ?
  Iconv.iconv('UTF-8//IGNORE', 'UTF-8', content).first :
  "#{content}".encode(Encoding::UTF_8, :invalid => :replace, :undef => :replace, :replace => '')
So, you have a gem somewhere that is requiring iconv, to find it:
Assuming your error message is: /gems/activesupport-3.1.0/lib/active_support/dependencies.rb:240
Open up /gems/activesupport-3.1.0/lib/active_support/dependencies.rb on line 240:
Add the line:
p caller if file =~ /iconv/
(just after: load_dependency(file) { result = super })
You will get a big fat stack trace:
rake --tasks
/home/sam/.rvm/gems/ruby-1.9.3-p125/gems/activesupport-3.2.6/lib/active_support/dependencies.rb:251:in `block in require': iconv will be deprecated in the future, use String#encode instead.
["/home/sam/.rvm/gems/ruby-1.9.3-p125/gems/calais-0.0.13/lib/calais.rb:5:in `'",
.. more omitted ..
This tells me it is the calais gem. Looking through its pull requests, I am not the first to hit this; the fix has not been merged yet.
Depending on the gem, there may be an upgraded version that does not have this problem, so I would recommend upgrading your gems first. If you are unlucky, you may be stuck with the unfortunate task of forking a gem to get rid of the warning (for example, if your pull request to fix it languishes).
If you're seeing this, it's very probably not Rails. If you look at the method surrounding the line being referred to in the error you posted, you'll see the following:
def require(file, *)
  result = false
  load_dependency(file) { result = super }
  result
end
I'm not saying it's your code, necessarily, but I'm certain that it's not actually the line in question where iconv is being called. In my case, I found that my project's code actually contained a reference to iconv.
If you want to check your code for such a reference, try grep -ir iconv ./ in your project directory.
When iconv is actually in a library it can be harder to find. By temporarily changing the above method to:
def require(file, *)
  result = false
  puts
  puts caller.reverse
  load_dependency(file) { result = super }
  result
end
You can then easily run your code and grep out the relevant lines of the backtrace to find the root cause of the warning.
ruby your/code.rb 2>&1 | grep -B 5 iconv
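A less invasive alternative to editing the installed ActiveSupport file is to hook require from your own code. The following sketch assumes Ruby 2.1 or later (where Module#prepend is public), which is newer than the Ruby 1.9.3 in the question. RequireTracer and REQUIRE_LOG are hypothetical names.

```ruby
REQUIRE_LOG = []  # collects [feature, backtrace] pairs

module RequireTracer
  private

  # Record who asked for anything with "iconv" in its name, then load as usual.
  def require(name)
    REQUIRE_LOG << [name.to_s, caller(1, 5)] if name.to_s.include?("iconv")
    super
  end
end

# Put the tracer ahead of Kernel#require for ordinary `require "x"` calls.
Object.prepend(RequireTracer)

begin
  require "iconv"
rescue LoadError
  # iconv may not be installed; the attempt is still logged.
end

REQUIRE_LOG.each do |name, backtrace|
  puts "#{name} required from:"
  puts backtrace
end
```

Load this before the rest of the application so that the offending gem's require call passes through the tracer; calls made as Kernel.require with an explicit receiver are not intercepted by this approach.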
Add this to the start of your program:
oldverb = $VERBOSE; $VERBOSE = nil
require 'iconv'
$VERBOSE = oldverb
and curse the people who think this is a professional way to handle deprecation.
You can pin down the exact location of the warning by generating exceptions for ActiveSupport::Deprecation, instead of just printing to the log. At the top of application.rb:
ActiveSupport::Deprecation.behavior = Proc.new do |message, backtrace|
  raise message
end
Once you've figured out where the warning is coming from (by inspecting the full backtrace), remove this again.
To remove this warning...
go to your .rvm directory and find iconv.c (mine was at ~/.rvm/src/ruby-1.9.3-p125/ext/iconv/iconv.c)
edit that file and remove or comment out the call to warn_deprecated() (should be near the bottom)
from that file's directory, run ruby extconf.rb
then make
then make install
Should do the trick

Rails 3, Heroku: Taps Server Error: PGError: ERROR: invalid byte sequence for encoding "UTF8": 0xba

I have a Rails 3.0.9 application running both locally in my dev environment and remotely on a Heroku app. I have a method that imports a CSV file into a model, and this file can contain non-English characters, like °, á, é, í, etc. (it's in Spanish).
I am currently able to import the complete file (75k records) without any problems in my local dev (SQLite) database; but, when uploading the db to heroku with heroku db:push, it fails with the error I'm posting in the title:
!!! Caught Server Exception
HTTP CODE: 500
Taps Server Error: PGError: ERROR: invalid byte sequence for encoding "UTF8": 0xba
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
Apparently, Heroku has issues inserting the '°' character. (At the moment the file doesn't have any á,é,í, etc characters, but I suspect these might fail too.)
I have set in my application.rb file the default encoding, as follows:
#.../application.rb
config.encoding = "utf-8"
What else can I do to set the 'client encoding' and solve this problem?
The masculine ordinal indicator, º, is 0xBA in ISO-8859-1, not UTF-8. So your CSV file is encoded as Latin-1, but you're trying to store it in your database as UTF-8 without converting the encoding.
You can try telling your CSV library that it is dealing with Latin-1 encoded text and maybe it will take care of converting to UTF-8. If that doesn't work, then you can do it yourself with Iconv:
ruby-1.9.2 > Iconv.iconv('UTF-8', 'ISO-8859-1', "\xba")
=> ["º"]
ruby-1.9.2 > Iconv.iconv('UTF-8', 'ISO-8859-1', "\xb0")
=> ["°"]
You're not having trouble with SQLite because SQLite tends to be very forgiving and has a very loose type system. PostgreSQL, on the other hand, tends to be rather strict and properly complains if you try to feed it invalid data. I'd recommend that you stop developing on top of SQLite if you're going to deploy to Heroku and PostgreSQL; there are other differences that will cause problems (the behavior of GROUP BY and LIKE, for example).
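Since Iconv was removed from the standard library in Ruby 2.0, the Iconv conversion above can also be written with String#encode. A minimal sketch; the helper name latin1_to_utf8 and the sample data are made up for illustration.

```ruby
require "tempfile"

# Hypothetical helper: reinterpret raw bytes as ISO-8859-1 and transcode to UTF-8.
def latin1_to_utf8(bytes)
  bytes.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

puts latin1_to_utf8("\xBA".b)   # º
puts latin1_to_utf8("\xB0".b)   # °

# When reading a file, the "external:internal" encoding form lets Ruby
# transcode while reading, so the rest of the import only sees UTF-8.
Tempfile.create("latin1_csv") do |f|
  f.binmode
  f.write("1,25\xB0\n".b)       # one Latin-1 encoded sample row
  f.flush
  line = File.read(f.path, encoding: "ISO-8859-1:UTF-8")
  puts line.encoding            # UTF-8
end
```

Converting at the point where the file is read keeps Latin-1 bytes from ever reaching the database layer, which is where PostgreSQL rejects them.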

Ruby 1.9.x undefined method ^ for string

I found a Ruby encryption library that uses this function:
def process(text)
  0.upto(text.length-1) {|i| text[i] = text[i] ^ round}
  text
end
In Ruby 1.9.x it throws an error: undefined method `^' for "\x1A":String
Is there any workaround in Ruby 1.9.x?
After googling, I learned that "In Ruby 1.9 string[index] returns character not code of the character (like it was in 1.8)." (https://rails.lighthouseapp.com/projects/8994/tickets/3144-undefined-method-for-string-ror-234)
Thanks in advance
Try text[i].ord ^ round. See Getting an ASCII character code in Ruby using `?` (question mark) fails for more info.
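Note that the result of text[i].ord ^ round is an Integer, which cannot be assigned back into the string with text[i] = in 1.9. A sketch of the whole method working on bytes instead follows. It assumes round is an Integer between 0 and 255 and passes it in as a parameter (in the library it is presumably a method or attribute of the surrounding class); xor_process is a hypothetical name.

```ruby
# XOR every byte of text with round and return the result as a binary string.
def xor_process(text, round)
  text.bytes.map { |byte| byte ^ round }.pack("C*")
end

encrypted = xor_process("secret", 0x2A)
decrypted = xor_process(encrypted, 0x2A)  # XOR with the same key undoes itself
puts decrypted                            # secret
```

Working on bytes rather than characters also avoids surprises with multibyte strings, where text.length and the byte count differ.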

Rails + Ruby 1.9 "invalid byte sequence in US-ASCII"

After upgrading to ruby 1.9 we began to notice pages failing to render from the rails template renderer when a user used a non-ASCII character. Specifically "é". I was able to resolve this issue on one of our staging servers, but I have not been able to reproduce the fix on our production server.
The fix that seemed to work the first time:
Converted the database from latin1 to utf8 using the convert_charset tool available here: http://www.mysqlperformanceblog.com/2009/03/17/converting-character-sets/ (including setting default_character_set=utf8 in my.cnf and running SET GLOBAL character_set_server=utf8).
Switched to the sam-mysql-ruby adapter (instead of the standard mysql adapter: http://gemcutter.org/gems/sam-mysql-ruby)
Restarted rails
The error is:
"invalid byte sequence in US-ASCII"
Oddly, after following the steps above the error has not changed on our production server. Setting encoding: utf8 in database.yml does not change the error either.
The error raised on the following line of code:
<%= link_to h(question.title), question_path(question) %>
This blog seems to suggest a fix, but it mentions that this should not be a problem in 1.9: http://www.igvita.com/2007/04/11/secure-utf-8-input-in-rails/ (and it's over 2 years old).
I imagine this problem might soon affect a lot of people as more Rails developers switch to 1.9.
I found the solution:
The problem is:
Fetching data from any database (Mysql, Postgresql, Sqlite2 & 3), all configured to have UTF-8 as its character set, returns the data with ASCII-8BIT in ruby 1.9.1 and rails 2.3.2.1.
(Taken from: https://rails.lighthouseapp.com/projects/8994/tickets/2476)
My attempt to use the patched mysql adapter likely failed because my database was not configured to natively use utf8, so the patched adapter failed to work properly.
The fix ended up being to use the patch file available here: http://gnuu.org/2009/11/06/ruby19-rails-mysql-utf8/
require 'mysql'

class Mysql::Result
  def encode(value, encoding = "utf-8")
    String === value ? value.force_encoding(encoding) : value
  end

  def each_utf8(&block)
    each_orig do |row|
      yield row.map {|col| encode(col) }
    end
  end
  alias each_orig each
  alias each each_utf8

  def each_hash_utf8(&block)
    each_hash_orig do |row|
      row.each {|k, v| row[k] = encode(v) }
      yield(row)
    end
  end
  alias each_hash_orig each_hash
  alias each_hash each_hash_utf8
end
(Placed in lib/mysql_utf8fix.rb and required in environment.rb using require 'lib/mysql_utf8fix.rb')
On Rails 2.3.11 it is just require 'mysql_utf8fix.rb'.
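The patch relies on force_encoding only relabeling a string without changing its bytes. A small sketch of one way such mislabeled strings break, and of the relabeling fix, using a hand-built string in place of a real database row (an assumption; no MySQL connection is involved):

```ruby
# Simulate a column value as the old adapter returned it:
# valid UTF-8 bytes, but tagged as binary (ASCII-8BIT).
from_db = "é".b

# Mixing it with a non-ASCII UTF-8 string fails.
error = begin
  "Título: " + from_db
  nil
rescue Encoding::CompatibilityError => e
  e
end
puts error.class                  # Encoding::CompatibilityError

# Relabeling the same bytes as UTF-8, as Mysql::Result#encode does, fixes it.
fixed = from_db.dup.force_encoding("utf-8")
puts "Título: " + fixed           # Título: é
```

This only helps when the bytes really are UTF-8, which is why the database itself also has to be configured for utf8.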
Please use the mysql2 gem adapter instead of the mysql adapter in database.yml, remove the mysql patches (if any), and add the following lines to environment.rb:
Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8
Then run it under Apache and Passenger; it will work fine.
Thanks,
Ramanavel Selvaraju.

Resources