Why does Rails 3 pitch a fit about UTF-8 character encoding? - ruby-on-rails

I just started work on a new Rails app, using the bright and shiny new version of Rails, 3.2.1. Previously, I had only used up to version 3.0.9. Before I describe my error, let it be known that I am using Ruby version ruby 1.9.2p290 (2011-07-09) [i386-mingw32] on Windows 7 32-bit. I have not changed my version of Ruby recently. I am using Notepad++ v5.9.3 and haven't (on purpose) changed any default settings.
When I ran my new app for the first time, I got an odd message:
ActionView::WrongEncodingError in Index#index
Your template was not saved as valid UTF-8. Please either specify UTF-8 as the encoding for your template in your text editor, or mark the template with its encoding by inserting the following as the first line of the template:
# encoding: <name of correct encoding>.
I do not understand why I am getting this error all of a sudden. Is it part of changes made to Rails 3.2.1? It is easily fixed by going into Notepad++ and using the Encoding menu option "Convert to UTF-8" but, like I said, I've never had to do this before.
The other odd thing is that even the files that Rails generates are generated with ANSI encoding when I use a generator. Overall, I'm confused and I want to make sure that I'm using good programming practices.

Is it part of changes made to Rails 3.2.1? It is easily fixed by going into Notepad++ and using the Encoding menu option "Convert to UTF-8" but, like I said, I've never had to do this before.
Yes. Rails 3.0+ (I think) requires all templates to be saved in UTF-8 encoding, so you need to save the file as UTF-8. If that still doesn't work, set the encoding explicitly by adding the following as the first line of your .rb files:
# encoding: utf-8
Add this to the first line of your .erb templates:
<%# encoding: utf-8 %>
See this related question, and this similar problem. Sounds to me like your editor's encoding settings changed since you originally created the files.
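To confirm whether a template is really stored as UTF-8, and to convert it if it is not, something like the following could be used. This is only a sketch: the helper names utf8_file? and convert_to_utf8! are hypothetical (not part of Rails), and it assumes that the "ANSI" files Notepad++ reports on a Western-locale Windows machine are Windows-1252.

```ruby
# Returns true when the file's raw bytes form valid UTF-8.
def utf8_file?(path)
  File.binread(path).force_encoding("UTF-8").valid_encoding?
end

# Rewrites the file as UTF-8, assuming it is currently stored in `from`.
# Assumption: "ANSI" means Windows-1252 here; adjust for other locales.
def convert_to_utf8!(path, from = "Windows-1252")
  text = File.binread(path).force_encoding(from).encode("UTF-8")
  File.open(path, "wb") { |f| f.write(text) }
end

# Hypothetical usage from the application root: list templates that
# would trigger the WrongEncodingError.
Dir.glob("app/views/**/*.erb").each do |path|
  puts "not UTF-8: #{path}" unless utf8_file?(path)
end
```

Files that contain only ASCII characters are already valid UTF-8, which may be why the error only shows up once a template contains an accented or otherwise non-ASCII character.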
The other odd thing is that even the files that Rails generates are generated with ANSI encoding when I use a generator. Overall, I'm confused and I want to make sure that I'm using good programming practices.
That is quite odd, and I'm not sure I have a good suggestion for that one, other than trying to add Encoding.default_external = "UTF-8" to your config.ru and config/environment.rb files.
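What that setting changes can be seen in a small standalone script. This is a sketch for illustration only; the temporary file name is made up, and in a real app the assignment would sit near the top of config.ru or config/environment.rb as described above.

```ruby
require "tempfile"

# The setting suggested above: files opened without an explicit
# encoding are treated as UTF-8 from here on.
Encoding.default_external = Encoding::UTF_8

# Demonstration with a throwaway file (hypothetical name).
file = Tempfile.new("encoding_demo")
file.write("caf\u00E9")
file.close

content = File.read(file.path)
puts content.encoding         # UTF-8
puts content.valid_encoding?  # true
file.unlink
```

Note that this only affects how Ruby tags data it reads and writes; it does not change the encoding in which an editor or generator saves a file on disk.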

I had tried the encoding: utf-8 method without luck, but I resolved the issue when I changed the encoding using Notepad++. Thanks!

Related

How can I set up a Rails project correctly?

I have installed Ruby Version 3.1.2p20 and Rails Version Rails 7.0.4.
So I want to create a new Rails app, and when I type rails new test-app I get an error related to UTF-8. Does anyone have any recommendations, or can anyone help me set up the Rails app correctly? Thanks in advance.
I tried many solutions, but I am thinking I should reinstall it all. Also I checked PATH and it is correctly added.
I am not exactly sure what the issue is here with Ruby 3.1 on Windows. As we know, Ruby's default source encoding has been UTF-8 since 2.0. This means that Ruby treats string literals as UTF-8 unless you explicitly tell it that they are encoded differently, so I am guessing there might be something wrong in the pathname.rb file.
I faced the same issue on Windows and resolved it by downgrading Ruby from 3.1 to 2.7, which worked for me.

rails console yaml parse error - applying solutions

BACKGROUND Complete newbie to coding, working through RailsTutorial on a Windows 7 PC. I’ve stalled on chapter 4.2 with a YAML error when running “rails console” from a command prompt with Ruby on Rails.
I need tips on how to interpret and use some specific answers (below) that I found posted on stackoverflow/google groups/github.
I’ve tried to meet forum criteria for questions. If any of my confusions are too basic here please let me know which & where else I might try.
The error -
~\Sites\sample_app rails console
C:/a_installers/RailsInstaller/Ruby1.9.3/lib/ruby/1.9.1/psych.rb:154:in 'parse': (<unknown>): couldn't parse YAML at line 44 column 11 (Psych::SyntaxError)
from C:/a_installers/RailsInstaller/Ruby1.9.3/lib/ruby/1.9.1/psych.rb:54:in 'parse_stream'
(40 more lines like this)
Not sure what useful context I can supply. Possibly that I used RailsInstaller and later Pik (instead of RVM) on Windows 7, that I installed the rails/ruby programs under c:\a_installers\railsInstaller, and that I’ve updated PATH to include the ruby bin folders.
QUESTION The advice I'm having difficulty interpreting/using is..
rails error, couldn't parse YAML suggests
a. Run the YAML code through yamlint.com
Which YAML file? All those in the error listing? I think I’m running rails console from my installed rails program area (above), there I found 60 odd files with “ *.yml “ – do I have to run all these through YAMLlint.com?
b. Manually Fix the YAML code
One example given was “fixing invalid yaml”, which involved splitting YAML code for a local date into lines. This was not applicable: I couldn’t find any such code in any of the 60-odd “*.yml” files under railsinstaller on my system.
c. “load the old YAML parser (syck) with code xxx in config/boot.rb”
Two problems…
(1) I couldn’t find boot.rb in my rails/ruby program area
(2) I’m torn between two contrary views expressed
“Psych is the new one, the one you should be using you have some invalid yaml somewhere.”
And
“wrong… ruby should use a parser that is both maintained and supports existing usage, even if that usage is further from the spec. If only there were such a thing. Until there is, they should use syck..”
In summary, I'm not sure which files to amend, how to do that, or how to check the result; nor whether it’s advisable to use a workaround instead (reverting to syck) and, if so, in which directory and with which files.
another source rails-yaml-config-best-practice advised
“configure parameter inside environment.rb.” use a github code called “settings logic”
Not sure how to do this..do I need to learn YAML? Tutorial hasn’t even started Ruby yet – is this
Not sure where to do this – couldn’t find environment.rb in my rails/ruby program area – it seems to be part of my application (sample_app)
In ..Settings_logic …the notes seem to bring me full circle to 1. above.
“..Note: Certain Ruby/Bundler versions include a version of the Psych YAML parser which incorrectly handles merges (the << in the example above.) If your default settings seem to be overwriting your environment-specific settings, including the following lines in your config/boot.rb file may solve the problem:
require 'yaml'
YAML::ENGINE.yamler= 'syck'
To apply this solution I still need to resolve question 1 c (2)
3 General advice - My inability to start rails console seems an absolute barrier for me to use ROR- a couple of general questions - Do practitioners use rails console for actual development or just learning? Can I learn RoR without rails console?
Many thanks for your patience, time and attention.
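On the "which YAML file?" part of the question: rather than pasting files into YAMLlint one by one, a short script could run each candidate through the same Psych parser that raised the error and report the ones that fail. This is a rough sketch, not a confirmed fix for this particular error; the helper name invalid_yaml_files is hypothetical, and it assumes the offending file lives under the application's config directory, which the question does not establish.

```ruby
require "yaml"

# Returns [path, error message] pairs for files that Psych cannot parse.
def invalid_yaml_files(paths)
  paths.each_with_object([]) do |path, failures|
    begin
      Psych.parse_file(path) # syntax check only; no objects are built
    rescue Psych::SyntaxError => e
      failures << [path, e.message]
    end
  end
end

# Hypothetical usage, run from inside the application (e.g. sample_app).
# Assumption: the bad file is somewhere under config/.
invalid_yaml_files(Dir.glob("config/**/*.yml")).each do |path, message|
  puts "#{path}: #{message}"
end
```

The error message includes the line and column, so a file reported at line 44 column 11 would match the error shown above. YAML files that embed ERB tags may be reported even though Rails can load them, because this check does not evaluate the ERB first.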

Ruby 1.9.3 encoding

I had been using Ruby 1.8.7 and Rails 3.2.11 for more than a year and developed my application with them. When I upgraded Ruby to 1.9.3, this issue appeared:
incompatible character encodings: UTF-8 and ASCII-8BIT on my application.js file
I tried many solutions, but they all failed. Can anybody help me?
In Ruby 1.8, strings were plain byte sequences with no encoding attached. In 1.9.3, every string carries an encoding, and the default source encoding is US-ASCII. That is a problem because you can't concatenate strings with incompatible encodings once they contain non-ASCII characters.
For more information, look here:
http://blog.grayproductions.net/articles/ruby_19s_string
To fix it, make sure your strings and files all use UTF-8 (or whichever encoding you choose), and that your database has the right types.
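The error can be reproduced in isolation, which may make it easier to see what "using the same encoding" means in practice. The following is a minimal sketch; the helper name concat_error is made up for the example, and the two strings stand in for whatever the asset pipeline was trying to join.

```ruby
# Two strings holding the same bytes but tagged with different encodings.
utf8   = "caf\u00E9"                                     # tagged UTF-8
binary = "caf\xC3\xA9".dup.force_encoding("ASCII-8BIT")  # tagged binary

# Returns the exception raised by a + b, or nil if concatenation works.
def concat_error(a, b)
  a + b
  nil
rescue Encoding::CompatibilityError => e
  e
end

puts concat_error(utf8, binary).class  # Encoding::CompatibilityError

# Retagging the bytes as UTF-8 (they already are valid UTF-8) fixes it.
fixed = binary.dup.force_encoding("UTF-8")
puts concat_error(utf8, fixed).inspect # nil
```

force_encoding only changes the label on the bytes, so it is appropriate when the data really is UTF-8 but was tagged as binary. If the bytes are in a different encoding, String#encode is needed to convert them instead.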

Ruby Character Encoding Confusion When Reading Same File In Different Environments

I have a Rails application that accepts file uploads of CSV files. When developing the feature locally on my Mac, I received an "invalid byte sequence in UTF-8" error when trying to parse the uploaded file (using Ruby's standard library CSV).
So after doing some research and reading some answers to similar questions on StackOverflow, I tried using a gem to sniff out the character encoding (namely CharDet), and then when opening the file via the CSV library, I would specify the encoding. And this solved all my problems, and life was good.
content = File.read(fullpath)
self.file_encoding = CharDet.detect(content)['encoding']
CSV.table(fullpath, :encoding => file_encoding, :header_converters => :downcase).headers
But then I deployed this code to the production Linux environment, and again with the "invalid byte sequence in UTF-8" errors. What a mystery (to me anyway)! After quite some time trying to resolve the error, I tried removing the code that specified the encoding upon opening the file. And miraculously it fixed the problem on production, but now local Mac development is broken.
Keep in mind, that in both cases I'm uploading the same file using the same browser. Does anyone have any insight on what is going on here?
By the way, versions of ruby are close, but not the same. The Mac is ruby 1.9.3-p0, and the Linux server is 1.9.2-p180. The app is Rails 3.2.6.
A few thoughts:
Have you confirmed the encoding of the file that you're uploading?
Have you tested with 1.9.2-p180 on your Mac, as Frederick Cheung suggested?
Have you tried outputting the results of CharDet.detect on each platform to see what the encoding of the received file (as opposed to the uploaded file) is? I wonder if some configuration is different between Apache on Linux and WEBrick on your Mac?
Are you using the same version of CharDet on both platforms? What libraries does it use (e.g. iconv), and are they the same version on both platforms?
I'm not aware of any differences in behavior with regard to encoding between 1.9.2 and 1.9.3, but I haven't specifically researched it either. It could also be a difference in the configuration of the MRI build.
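One thing worth comparing on both machines is Encoding.default_external, which Ruby derives from the locale and which is what file reads assume when no encoding is given. Independently of that, the file can be read as raw bytes and normalised to UTF-8 before parsing, so the result does not depend on the environment. The sketch below is one way to do that; read_as_utf8 is a hypothetical helper, and the ISO-8859-1 fallback is an assumption about what the non-UTF-8 uploads contain, not something the question confirms.

```ruby
# Reads a file as raw bytes and returns a UTF-8 string.
# Assumption: anything that is not valid UTF-8 is ISO-8859-1.
def read_as_utf8(path, fallback = "ISO-8859-1")
  raw = File.binread(path)
  utf8 = raw.dup.force_encoding("UTF-8")
  return utf8 if utf8.valid_encoding?

  raw.force_encoding(fallback).encode("UTF-8")
end

# Worth printing on both the Mac and the Linux server to compare.
puts Encoding.default_external
```

The returned string can then be handed to CSV.parse instead of passing a path and an :encoding option to CSV.table, which keeps the encoding decision in one place.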

Ruby on Rails - Stripping strange characters out of body in rake task

On my company's Rails website, we have a Twitter area where tweets from our social media team are displayed by a rake task. Basically the rake task uses the Twitter gem to import any new tweets into the database on a regular basis, and displays them from there. URL links in the tweet are converted to HTML links using the auto_link helper.
It always worked fine, until now. All of a sudden, the links are broken and the word right before the URL is even wrongly highlighted. So in an example tweet that should look like this: "Please be safe St. Louis. Heat warning extended through August http://bit.ly/...", the word August is linked and the URL itself that follows is broken, as if there was something in between the last word and link breaking it...
Investigated the helpers, looked in the database for the tweet's text field to see if there was anything strange, even used the rails console to manually pull up the tweets, but everything looked okay. It wasn't until I went all the way into the tweet body's hex code that I saw...
Please be safe S
t. Louis. Heat w
arning extended
through August.
 http://bit.ly/
r5fXlz #heatpoca
lypse
So the culprit was that   being thrown into the space, when I deleted the culprit space and readded it manually in the database, the issue cleared up.
The only problem is, I don't understand why the tweet body is being imported like that, especially when it looks fine via the Rails console. As this is an older database, I noticed it was still using latin1 encoding in some areas with utf8 in others, and I was certain that converting all of that to UTF-8 would fix it, but it did not.
I went as far as trying a sanitization helper on the body before it was imported, but that didn't work either.
Also tried a ruby gsub to strip the   out, but it didn't work.
Does anyone have any insight on how to solve this odd problem?
I was finally able to solve this by running the following specifically on the body string in the rake task...
Iconv.conv('ASCII//TRANSLIT', 'UTF8', tweet.body)
Odd, but it works. More information on using the above can be found here: ruby (1.8.7): How to get rid of non-printable chars while scraping?
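Iconv was removed from Ruby's standard library in 2.0, so on newer Rubies a different approach is needed. If the stray character is a non-breaking space (U+00A0), which is an assumption here since the post does not identify it, it can be replaced directly. This is a narrower technique than the transliteration above: it swaps one specific character and leaves all other non-ASCII text untouched. The helper name normalize_spaces is hypothetical.

```ruby
# Assumption: the invisible character between the word and the URL
# is a non-breaking space (U+00A0).
NBSP = "\u00A0"

# Replaces every non-breaking space with a regular space.
# The input is expected to be tagged as UTF-8.
def normalize_spaces(text)
  text.gsub(NBSP, " ")
end

tweet = "extended through August\u00A0http://bit.ly/r5fXlz"
puts normalize_spaces(tweet)
```

In the rake task this would be applied to the tweet body before it is saved, in the same place as the Iconv call.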

Resources