Ruby Character Encoding Confusion When Reading Same File In Different Environments - ruby-on-rails

I have a Rails application that accepts file uploads of CSV files. When developing the feature locally on my Mac, I received an "invalid byte sequence in UTF-8" error when trying to parse the uploaded file (using Ruby's standard library CSV).
So after doing some research and reading some answers to similar questions on StackOverflow, I tried using a gem to sniff out the character encoding (namely CharDet), and then when opening the file via the CSV library, I would specify the encoding. And this solved all my problems, and life was good.
content = File.read(fullpath)
self.file_encoding = CharDet.detect(content)['encoding']
CSV.table(fullpath, :encoding => file_encoding, :header_converters => :downcase).headers
But then I deployed this code to the production Linux environment, and again with the "invalid byte sequence in UTF-8" errors. What a mystery (to me anyway)! After quite some time trying to resolve the error, I tried removing the code that specified the encoding upon opening the file. And miraculously it fixed the problem on production, but now local Mac development is broken.
Keep in mind that in both cases I'm uploading the same file using the same browser. Does anyone have any insight into what is going on here?
By the way, versions of ruby are close, but not the same. The Mac is ruby 1.9.3-p0, and the Linux server is 1.9.2-p180. The app is Rails 3.2.6.

A few thoughts:
Have you confirmed the encoding of the file that you're uploading?
Have you tested with 1.9.2-p180 on your Mac, as Frederick Cheung suggested?
Have you tried outputting the results of CharDet.detect on each platform to see what the encoding of the received file (as opposed to the uploaded file) is? I wonder if some configuration is different between Apache on Linux and WEBrick on your Mac?
Are you using the same version of CharDet on both platforms? What libraries does it use (e.g. iconv), and are they the same version on both platforms?
I'm not aware of any differences in behavior with regard to encoding between 1.9.2 and 1.9.3, but I haven't specifically researched it either. It could also be a difference in the configuration of the MRI build.
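One way to act on the suggestions above is to print how each machine sees the uploaded bytes, then compare the output from the Mac and the Linux server. The sketch below is only an illustration, not the asker's code: `encoding_report` and `parse_csv_as_utf8` are hypothetical helper names, the syntax is modern Ruby rather than 1.9, and it assumes the source encoding is already known (for example from CharDet).

```ruby
require 'csv'

# Hypothetical helper: summarize how this Ruby process sees the file's bytes.
def encoding_report(path)
  raw = File.binread(path) # raw bytes, no transcoding applied
  {
    default_external: Encoding.default_external.name,
    default_internal: Encoding.default_internal&.name,
    bytes: raw.bytesize,
    valid_utf8: raw.dup.force_encoding(Encoding::UTF_8).valid_encoding?
  }
end

# Hypothetical helper: label the bytes with their real encoding,
# transcode to UTF-8, and only then hand the text to CSV.
def parse_csv_as_utf8(path, source_encoding)
  text = File.binread(path)
             .force_encoding(source_encoding)
             .encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
  CSV.parse(text, headers: true, header_converters: :downcase)
end
```

If `default_external` differs between the two machines, that alone could explain why the same file parses on one and fails on the other, since it decides how Ruby labels data read from disk when no encoding is given.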

Related

How can I set up a Rails project correctly?

I have installed Ruby 3.1.2p20 and Rails 7.0.4.
I want to create a new Rails app, but when I type rails new test-app I get an error related to UTF-8. Does anyone have any recommendations, or can anyone help me set up the Rails app correctly? Thanks in advance.
I have tried many solutions, and I am starting to think I should reinstall everything. I also checked PATH, and the entries are set correctly.
I am not exactly sure what the issue is with Ruby 3.1 on Windows. Ruby's default source encoding has been UTF-8 since 2.0, which means Ruby treats any string literal as UTF-8 unless you explicitly tell it otherwise, so I am guessing something is going wrong in the pathname.rb file.
I faced the same issue on Windows and resolved it by downgrading Ruby from 3.1 to 2.7.
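To make the point about default encodings concrete, here is a minimal sketch of what "treated as UTF-8 unless told otherwise" looks like. It does not diagnose the `rails new` failure itself. `valid_as_utf8?` and `to_utf8` are hypothetical helper names, and Windows-1252 is only an assumed example of a legacy code page.

```ruby
# Hypothetical helper: label raw bytes as UTF-8 and report whether they are valid.
def valid_as_utf8?(bytes)
  bytes.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

# Hypothetical helper: tell Ruby the bytes' real encoding, then transcode to UTF-8.
def to_utf8(bytes, actual_encoding)
  bytes.dup.force_encoding(actual_encoding).encode(Encoding::UTF_8)
end

# Values worth comparing between a working and a failing machine.
puts "literal encoding:  #{'test-app'.encoding}"
puts "default_external:  #{Encoding.default_external}"
puts "filesystem:        #{Encoding.find('filesystem')}"
```

On Windows the last two values come from the active code page, so a path or environment value containing non-ASCII bytes in that code page would fail `valid_as_utf8?` until it is transcoded.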

Rails encoding under Cygwin

I'm trying to develop a rails app under Cygwin and Eclipse. I use Ruby 1.9 and Rails 4.1.
I made sure, using recode, that all my files have the Windows-1252 encoding and my Eclipse project uses the same. I've also tried to use UTF8 in both places but this error keeps showing up:
incompatible encoding regexp match (Windows-1252 regexp with UTF-8 string)
class ApplicationController < ActionController::Base
  # Prevent CSRF attacks by raising an exception.
  # For APIs, you may want to use :null_session instead.
  protect_from_forgery with: :exception
end
Is this related to my Rails/Ruby (read, Rails source files being in the Windows encoding while my code is in UTF-8) setup under Cygwin?
Rails Stacktrace
Add
# encoding: utf-8
to the top of every ruby source file
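The error message can be reproduced outside Rails, which may help confirm that the mismatch is between a pattern and a string in different encodings. The following is a minimal sketch under that assumption. `try_match` is a hypothetical helper, and the Windows-1252 pattern stands in for a regexp literal loaded from a source file saved in that encoding.

```ruby
# Hypothetical helper: return the match index, or the error the match raised.
def try_match(pattern_text, subject)
  Regexp.new(pattern_text) =~ subject
rescue Encoding::CompatibilityError => e
  e
end

utf8_text      = "caf\u00E9"                        # UTF-8 string with a non-ASCII character
cp1252_pattern = "\u00E9".encode('Windows-1252')    # same character, different bytes

# A non-ASCII pattern fixes the regexp's encoding, so this match fails.
puts try_match(cp1252_pattern, utf8_text).class

# Transcoding the pattern to the subject's encoding lets the match proceed.
puts try_match(cp1252_pattern.encode('UTF-8'), utf8_text).inspect
```

Patterns and strings that contain only ASCII characters are compatible regardless of their label, which is why the error tends to appear only once a non-ASCII character is involved.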
I found this link, which walks you through installing Rails via Cygwin on Windows. I know you already have it installed, but it includes making Cygwin recognize UTF-8 encoding.
http://collaborate.je/2014/08/setting-ruby-rails-windows-7-via-cygwin/
I do not know if you are using Windows at work or home. If the latter, you may want to spend a little time installing Ubuntu on VirtualBox. You would only need a few GB of space, with an entire Linux OS running as an app within Windows. This is ideal for Rails apps, as many tutorials recommend Unix, and even more companies use Unix to program and run servers.
(In case you don't know, Linux is a Unix-like system, as is the Mac. Windows can be difficult to program on in certain languages, because its kernel and directory structure are not designed the same way as Unix. But I am sure you know this. :-) )

rails console yaml parse error - applying solutions

BACKGROUND Complete newbie to coding, working through RailsTutorial on a Windows 7 PC. I’ve stalled on chapter 4.2 with a YAML error running “rails console” from a command prompt with Ruby on Rails.
I need tips on how to interpret and use some specific answers (below) that I found posted on stackoverflow/google groups/github.
I’ve tried to meet forum criteria for questions. If any of my confusions are too basic here please let me know which & where else I might try.
The error -
~\Sites\sample_app rails console
C:/a_installers/RailsInstaller/Ruby1.9.3/lib/ruby/1.9.1/psych.rb:154:in 'parse': (<unknown>): couldn't parse YAML at line 44 column 11 (Psych::SyntaxError)
from C:/a_installers/RailsInstaller/Ruby1.9.3/lib/ruby/1.9.1/psych.rb:54:in 'parse_stream'
(40 more lines like this)
Not sure what useful context I can supply. Possibly that I used RailsInstaller and later Pik (instead of RVM) on Windows 7, that I installed the rails/ruby programs under c:\a_installers\railsInstaller, and that I’ve updated PATH to include the ruby bin folders.
QUESTION The advice I'm having difficulty interpreting/using is..
rails error, couldn't parse YAML suggests
a. Run the YAML code through yamlint.com
Which YAML file? All those in the error listing? I think I’m running rails console from my installed rails program area (above), there I found 60 odd files with “ *.yml “ – do I have to run all these through YAMLlint.com?
b. Manually Fix the YAML code
One example given was “fixing invalid yaml”, which involved splitting the YAML code for a local date into lines. This was not applicable: I couldn’t find any such code in any of the 60-odd “*.yml” files under railsinstaller on my system.
c. “load the old YAML parser (syck) with code xxx in config/boot.rb”
Two problems…
(1) I couldn’t find boot.rb in my rails/ruby program area
(2) I’m torn between two contrary views expressed
“Psych is the new one, the one you should be using you have some invalid yaml somewhere.”
And
“wrong… ruby should use a parser that is both maintained and supports existing usage, even if that usage is further from the spec. If only there were such a thing. Until there is, they should use syck..”
In summary, I'm not sure which files to amend, how to do that, or how to check the result. I'm also not sure whether it's advisable to do a workaround instead (reverting to syck) and, if so, in which directory and with which files.
another source rails-yaml-config-best-practice advised
“configure parameter inside environment.rb.” use a github code called “settings logic”
Not sure how to do this..do I need to learn YAML? Tutorial hasn’t even started Ruby yet – is this
Not sure where to do this – couldn’t find environment.rb in my rails/ruby program area – it seems to be part of my application (sample_app)
In ..Settings_logic …the notes seem to bring me full circle to 1. above.
“..Note: Certain Ruby/Bundler versions include a version of the Psych YAML parser which incorrectly handles merges (the << in the example above.) If your default settings seem to be overwriting your environment-specific settings, including the following lines in your config/boot.rb file may solve the problem:
require 'yaml'
YAML::ENGINE.yamler= 'syck'
To apply this solution I still need to resolve question 1 c (2)
3 General advice - My inability to start rails console seems an absolute barrier for me to use ROR- a couple of general questions - Do practitioners use rails console for actual development or just learning? Can I learn RoR without rails console?
Many thanks for your patience, time and attention.
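On the "which YAML file?" question, one approach is to let Ruby try each candidate file and report the ones that fail, rather than pasting dozens of files into a linter by hand. This is a rough sketch, not a known fix for this particular error: `find_broken_yaml` is a hypothetical helper, and pointing it at the application directory (sample_app) rather than the Ruby install is an assumption about where the bad file lives. Files that embed ERB may need separate handling.

```ruby
require 'yaml'

# Hypothetical helper: parse every YAML file under dir and collect syntax errors.
# YAML.parse_file only builds a syntax tree; it does not instantiate objects.
def find_broken_yaml(dir)
  pattern = File.join(dir, '**', '*.{yml,yaml}')
  Dir.glob(pattern).sort.each_with_object([]) do |path, problems|
    begin
      YAML.parse_file(path)
    rescue Psych::SyntaxError => e
      problems << { file: path, line: e.line, column: e.column, problem: e.problem }
    end
  end
end

# Example: check the current directory and print one line per broken file.
find_broken_yaml(Dir.pwd).each do |p|
  puts "#{p[:file]}: line #{p[:line]}, column #{p[:column]}: #{p[:problem]}"
end
```

The reported line and column can then be compared with the "line 44 column 11" in the error to see whether the same file is involved.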

Why does Rails 3 pitch a fit about UTF-8 character encoding?

I just started work on a new Rails app, using the bright and shiny new version of Rails, 3.2.1. Previously, I had only used up to version 3.0.9. Before I describe my error, let it be known that I am using Ruby version ruby 1.9.2p290 (2011-07-09) [i386-mingw32] on Windows 7 32-bit. I have not changed my version of Ruby recently. I am using Notepad++ v5.9.3 and haven't (on purpose) changed any default settings.
When I ran my new app for the first time, I got an odd message:
ActionView::WrongEncodingError in Index#index
Your template was not saved as valid UTF-8. Please either specify UTF-8 as the encoding for your template in your text editor, or mark the template with its encoding by inserting the following as the first line of the template:
# encoding: <name of correct encoding>.
I do not understand why I am getting this error all of a sudden. Is it part of changes made to Rails 3.2.1? It is easily fixed by going into Notepad++ and using the Encoding menu option "Convert to UTF-8" but, like I said, I've never had to do this before.
The other odd thing is that even the files that Rails generates are generated with ANSI encoding when I use a generator. Overall, I'm confused and I want to make sure that I'm using good programming practices.
“Is it part of changes made to Rails 3.2.1? It is easily fixed by going into Notepad++ and using the Encoding menu option "Convert to UTF-8" but, like I said, I've never had to do this before.”
Yes. Rails 3.0+ (I think) requires all templates to be saved in UTF-8 encoding, so you need to save the file as UTF-8. If that still doesn't work, set the encoding explicitly by adding the following as the first line of your .rb files:
# encoding: utf-8
Add this to the first line of your .erb templates:
<%# encoding: utf-8 %>
See this related question, and this similar problem. Sounds to me like your editor's encoding settings changed since you originally created the files.
“The other odd thing is that even the files that Rails generates are generated with ANSI encoding when I use a generator. Overall, I'm confused and I want to make sure that I'm using good programming practices.”
That is quite odd, and I'm not sure I have a good suggestion for that one, other than trying to add Encoding.default_external = "UTF-8" to your config.ru and config/environment.rb files.
I had tried the encoding: utf-8 method without luck, but I resolved the issue when I changed the encoding using Notepad++. Thanks!
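If many generated files are affected, the Notepad++ "Convert to UTF-8" step could in principle be scripted. The sketch below shows one way this might look. `convert_to_utf8!` is a hypothetical helper, and it assumes that "ANSI" on this machine means Windows-1252, which depends on the system code page.

```ruby
# Hypothetical helper: rewrite a file as UTF-8 if it is not already valid UTF-8.
# Returns true when the file was converted, false when it was left alone.
# Raises if the file contains bytes that are undefined in the assumed encoding.
def convert_to_utf8!(path, assumed_source: 'Windows-1252')
  bytes = File.binread(path)
  return false if bytes.dup.force_encoding(Encoding::UTF_8).valid_encoding?

  utf8 = bytes.force_encoding(assumed_source).encode(Encoding::UTF_8)
  File.binwrite(path, utf8)
  true
end
```

A call such as `Dir.glob('app/views/**/*.erb').each { |f| convert_to_utf8!(f) }` would apply it to every template. Working on a copy or under version control first is advisable, since the conversion overwrites files in place.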

File does not exist, while using roo in Ruby-on-rails

I am developing a small Ruby-on-rails application. I am using 'roo' gem to open an excel file.
But rails throws an IO error while attempting to open the file. It says file does not exist.
It works fine in irb. My development machine is windows. Here is my code
file ="#{RAILS_ROOT}/public/data/import.xls"
file.gsub!("\\","/")
workbook = Excel.new(file)
Any help is appreciated
thanks,
Abhilash
It would be worth using the File class here rather than creating the path and gsubbing file separators. For example:
file = File.join(RAILS_ROOT, 'public', 'data', 'import.xls')
I'm pretty sure you don't need to worry much about backslashes as file separators on Windows, though (I've stopped developing on Windows, so I can't test).
You can then test whether Ruby thinks the file exists by calling File.exist?(file) before doing anything roo-specific.
Also, are you running your rails app and console as different users? That might cause some permissions problems in one but not the other.
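Putting those checks together, a small diagnostic run from both irb and the Rails app might show where the two environments disagree. This is a minimal sketch. `import_path` and `describe_path` are hypothetical helpers, and the root directory is passed in as an argument because the constant holding it differs between Rails versions.

```ruby
# Hypothetical helper: build the path with File.join instead of string
# interpolation and gsub, so separators are handled consistently.
def import_path(root)
  File.join(root.to_s, 'public', 'data', 'import.xls')
end

# Hypothetical helper: report what the current process can see at that path.
def describe_path(path)
  {
    path: File.expand_path(path),
    exists: File.exist?(path),
    readable: File.readable?(path),
    working_dir: Dir.pwd
  }
end

# Example: print the report for the current directory as the root.
puts describe_path(import_path(Dir.pwd)).inspect
```

Differences in `working_dir` or `readable` between the console and the server would point toward a relative-path or permissions problem rather than anything in roo.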

Resources