VSCode complains a Ruby UTF-8 file has invalid multibyte char (US-ASCII)

Using Rails 5.2 and Ruby 2.3 (ruby files by default are UTF-8).
If I check the file in the terminal:
file -I <filename>.rb
it shows UTF-8:
<filename>.rb: text/x-ruby; charset=utf-8
Yet VSCode flags a string in the file that contains a German umlaut character.
Before Ruby 2.0 you needed a magic comment to tell Ruby a file's encoding, but since 2.0 the default source encoding is UTF-8, and this file is already UTF-8.
I am trying to figure out two things:
(1) How did a UTF-8 file get this US-ASCII character inside it?
(2) How can I fix it (so VSCode does not show it as incorrect)? Could it have something to do with an extension or setting in VSCode?
In answer to (1), I am guessing it was copied and pasted from a file with a different encoding (such as a Word document).
However, if I delete the character and type it again on my Mac using OPT + u + u, VSCode still complains. Hence question (2).
With regard to (2) I checked this:
echo LC_TYPE
and it was null.
So I added export LC_TYPE=$LANG to my ~/.bash-profile and restarted VSCode, but that did not solve it (and in the VSCode integrated terminal LC_TYPE is still null).
EDIT
There is no need to answer question 1, because if I delete the character and retype it, the same error shows up. So I now know it doesn't really matter how it got into the file, just need to know what is producing the warning.

I think the issue is in the linter.
    "ruby.lint": {
        "reek": true,
        "rubocop": true,
        "ruby": {
            "unicode": true
        },
        "fasterer": true,
        "debride": false,
        "ruby-lint": false
    },
In settings.json, unicode is not turned on by default for ruby.lint, so you need to enable it manually.
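The wording of the warning suggests the checker read the file as US-ASCII rather than UTF-8. A minimal Ruby sketch of that mismatch, using only core String methods (the helper name valid_as? is made up for illustration): the same bytes are valid when labelled UTF-8 and invalid when labelled US-ASCII.

```ruby
# Hypothetical helper: are these bytes valid when labelled with `encoding`?
def valid_as?(text, encoding)
  text.dup.force_encoding(encoding).valid_encoding?
end

# "Grüße", written with escapes so this snippet itself stays pure ASCII.
greeting = "Gr\u00FC\u00DFe"

puts valid_as?(greeting, Encoding::UTF_8)    # true
puts valid_as?(greeting, Encoding::US_ASCII) # false: bytes above 0x7F are not ASCII
```

This only illustrates the encoding mismatch; it does not show how the linter itself decides which encoding to use.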

Related

Xcode 10, could not decode input file using specific encoding

I am working on an iOS app. It works fine in Xcode 9.4.1, but when I build it in Xcode 10 it fails with the decoding error named in the title.
I tried the fix suggested in another post, changing the file's encoding with both Reinterpret and Convert, but I still get the same error.

Find your Localizable.strings in a terminal and run:
$ iconv -f UTF-16LE -t UTF-8 Localizable.strings > LocalizableNew.strings
Then check LocalizableNew.strings and, if there are no errors, replace the original file:
$ mv LocalizableNew.strings Localizable.strings
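The same conversion can be sketched in Ruby with core String methods only. The helper name utf16le_to_utf8 is hypothetical, and the sketch assumes the file really is UTF-16LE, as the iconv command above does.

```ruby
# Hypothetical helper: reinterpret raw bytes as UTF-16LE and transcode to UTF-8.
# encode raises an Encoding error if the bytes are not valid UTF-16LE.
def utf16le_to_utf8(bytes)
  bytes.dup.force_encoding(Encoding::UTF_16LE).encode(Encoding::UTF_8)
end

# Usage sketch (file names follow the commands above):
#   raw = File.binread("Localizable.strings")
#   File.write("LocalizableNew.strings", utf16le_to_utf8(raw))
```

Reading with File.binread keeps Ruby from guessing an encoding before the bytes are relabelled.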

I had a similar error after opening my project in Xcode 10.4 and then opening it again in Xcode 10.1.
I solved it by selecting all my Localizable.strings files and changing their text encoding to UTF-16 (in my case the error was related to UTF-16; you can change it to UTF-8 instead).
So changing the text encoding to UTF-16 or UTF-8 will work.

It sounds like the file is corrupted, probably with parts of it encoded in UTF-8 and parts of it encoded in ISO 8859-5. From its name, I would suspect this is a Cyrillic localization (perhaps Russian), and the file was probably edited using an editor that didn't correctly maintain the encoding or didn't use UTF-8 (the most common cause of that is editing on Windows).
You'll need to open the file, probably in an external editor that can handle arbitrary encodings, such as vim or Sublime Text, and fix any corruption. Exactly how to do that depends on the nature of the corruption.

You need to set correct Text Encoding in the File Inspector. The default is UTF-8.

If you want to fix the problem without the UI, look for the Xcode project definition (generally YOURPROJECT.xcodeproj/project.pbxproj), then find the reference to the file causing the issue.
You should find something like this (from Adium, in this case)
D182F1B611DFF23700E33AE2 /* sk */ = {isa = PBXFileReference; fileEncoding = 10; lastKnownFileType = text.plist.strings; name = sk; path = sk.lproj/schema.strings; sourceTree = "<group>"; };
fileEncoding = 10 is UTF-16; 4 is UTF-8, which is currently the default, so you can either set it to that value explicitly or simply remove the fileEncoding entry altogether.
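To find such entries without reading the whole project file by hand, a rough Ruby sketch could scan it line by line. The helper name and the regular expression are assumptions based only on the sample line above, and only the two codes mentioned here (4 and 10) are given names.

```ruby
# The two values named above; any other code is reported by number only.
ENCODING_NAMES = { 4 => "UTF-8", 10 => "UTF-16" }.freeze

# Hypothetical helper: return [path, encoding name] pairs for every line
# of the project file text that carries an explicit fileEncoding.
def explicit_file_encodings(pbxproj_text)
  pbxproj_text.each_line.map do |line|
    match = line.match(/fileEncoding = (\d+);.*?path = ([^;]+);/)
    next unless match

    code = match[1].to_i
    [match[2], ENCODING_NAMES.fetch(code, "code #{code}")]
  end.compact
end

# Usage sketch:
#   text = File.read("YOURPROJECT.xcodeproj/project.pbxproj")
#   explicit_file_encodings(text).each { |path, name| puts "#{path}: #{name}" }
```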

I got this error message when I forgot to put semicolons at the end of the line to separate the individual translations.

Firefox addon localization

I have a problem localizing my addon. I followed the tutorial on Using Localized Strings in Preferences, but I can't compile my addon because I use Polish characters such as ć.
I created a locale folder and put a pl-PL.properties file there with this content:
my_tag_title = Co robić?
and I got this error:
Following locale file is not a valid UTF-8 file: C:\path\pl-PL.properties
'utf8' codec can't decode byte 0xe6 in position 22: invalid continuation byte
Is there a way to put special characters directly inside package.json?
How can I solve this problem?

Make sure that the locale file is saved in UTF-8 format.
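Byte 0xe6 at position 22 is consistent with the ć having been saved in a single-byte Central European encoding. The Ruby sketch below assumes Windows-1250 (ISO-8859-2 would give the same byte for ć) and shows both the failed UTF-8 check and the conversion; the helper name to_utf8 is hypothetical.

```ruby
# Bytes as a Windows-1250 editor would save the line (0xE6 is "ć" there).
# Assumption: the original file was saved in Windows-1250.
raw = "my_tag_title = Co robi\xE6?".b

# Hypothetical helper: relabel bytes as source_encoding, then transcode to UTF-8.
def to_utf8(bytes, source_encoding)
  bytes.dup.force_encoding(source_encoding).encode(Encoding::UTF_8)
end

puts raw.dup.force_encoding(Encoding::UTF_8).valid_encoding? # false
puts raw.index("\xE6".b)                                     # 22
puts to_utf8(raw, "Windows-1250")
```

In practice, re-saving the file as UTF-8 from the editor achieves the same result.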

Rails 2.3.2/Ruby 1.8.6 Encoding Question - ActionController returning UTF-8?

I have a pretty simple Rails question regarding encoding that I can't find an answer to.
Environment:
Rails 2.3.2/Ruby1.8.6
I am not setting any encoding options within the Rails environment currently, have left everything to defaults.
If I read a String from disk from a text file - and send it via Rails render :text functionality using Apache/Phusion, what encoding should the client expect?
Thank you for any answers,

Since about Rails 1.2, Rails sets Ruby 1.8's $KCODE magic variable to "UTF8". It includes ActiveSupport::CoreExtensions::String::Multibyte to patch around issues with otherwise ambiguous per-character/per-byte operators. Your text file should be UTF-8, Ruby will pass it through and your application layout should specify a META tag declaring the document's charset to be UTF-8 too:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Then it should all 'just work', but there are some gotchas described below.
If you're on a Mac, running "script/console" in Terminal.app and then pasting unusual character sequences directly into the terminal from e.g. the Character Viewer is a good way to play around and demonstrate this to your own satisfaction, since the whole OS works in UTF-8. I don't know what the equivalent would be for Windows or an arbitrary Linux distribution.
For example, "⇒" - RIGHTWARDS DOUBLE ARROW - is Unicode 21D2, UTF-8 0xE2 (226), 0x87 (135), 0x92 (146). If I paste that into Terminal and ask for the byte values I get the expected result:
>> $KCODE
=> "UTF8"
>> "⇒"
=> "\342\207\222"
>> puts "⇒"
⇒
...but...
>> "⇒"[0]
=> 226
>> "⇒"[1]
=> 135
>> "⇒"[2]
=> 146
>> "⇒"[3]
=> nil
Note how you're still getting byte access with "[]". See the documentation on the Multibyte extensions in the Rails API (for Rails 2.2, e.g. at http://railsapi.com/) if you want to do string operations, otherwise things like "foo.reverse" will do the wrong thing; "foo.mb_chars.reverse" gets it right by using the "mb_chars" proxy.
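For comparison, here is a small sketch of the same character on Ruby 1.9 or later, where String is encoding-aware and no mb_chars proxy is needed. This is an added illustration, not part of the Rails 2 / Ruby 1.8 setup described above, and the helper name is made up.

```ruby
# Hypothetical helper: report character count and raw byte values of a string.
def describe(text)
  { chars: text.length, bytes: text.bytes }
end

arrow = "\u21D2" # RIGHTWARDS DOUBLE ARROW, written as an escape

p describe(arrow)          # {:chars=>1, :bytes=>[226, 135, 146]} (hash formatting varies by Ruby version)
p "a#{arrow}b".reverse == "b#{arrow}a" # true: reverse works per character
```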

Character Encoding issue in Rails v3/Ruby 1.9.2

I get this error sometimes "invalid byte sequence in UTF-8" when I read contents from a file. Note - this only happens when there are some special characters in the string. I have tried opening the file without "r:UTF-8", but still get the same error.
open(file, "r:UTF-8").each_line { |line| puts line.strip(",") } # line.strip generates the error
Contents of the file:
# encoding: UTF-8
290919,"SE","26","Sk‰l","",59.4500,17.9500,, # this errors out
290956,"CZ","45","HornÌ Bradlo","",49.8000,15.7500,, # this errors out
290958,"NO","02","Svaland","",58.4000,8.0500,, # this works
This is the CSV file I got from outside and I am trying to import it into my DB, it did not come with "# encoding: UTF-8" at the top, but I added this since I read somewhere it will fix this problem, but it did not. :(
Environment:
Rails v3.0.3
ruby 1.9.2p0 (2010-08-18 revision 29036) [x86_64-darwin10.5.0]

Ruby has a notion of an external encoding and an internal encoding for each file. This allows you to work with a file as UTF-8 in your source, even when the file is stored in a more esoteric format. If your default external encoding is UTF-8 (which it is if you're on Mac OS X), all of your file I/O is going to be in UTF-8 as well. You can check this using File.open('file').external_encoding. When you open your file and pass "r:UTF-8", you are forcing the same external encoding that Ruby is already using by default.
Chances are your source document isn't in UTF-8 and those non-ASCII characters aren't mapping cleanly to UTF-8 (if they were, you would get the correct characters and no error, and if they mapped incorrectly, you would get incorrect characters and no error). What you should do is try to determine the encoding of the source document, then have Ruby transcode the document on read, like so:
File.open(file, "r:windows-1251:utf-8").each_line { |line| puts line.strip(",") }
If you need help determining the encoding of the source, give this Python library a whirl. It's based on the automatic charset detection fallback that was in Seamonkey/Mozilla (and is possibly still in Firefox).
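Transcoding on read can be tried out end to end with a temporary file. In this sketch the source encoding is assumed to be ISO-8859-1 purely for illustration (the real file's encoding still has to be determined), and read_lines_as_utf8 is a hypothetical helper name.

```ruby
require "tempfile"

# Hypothetical helper: read a file stored in source_encoding and return its
# lines as UTF-8 strings, letting Ruby transcode while reading.
def read_lines_as_utf8(path, source_encoding)
  File.open(path, "r:#{source_encoding}:UTF-8") do |file|
    file.each_line.map(&:chomp)
  end
end

# Demo: a temporary file holding one ISO-8859-1 line (0xE4 is "ä" there).
Tempfile.create("places") do |tmp|
  tmp.binmode
  tmp.write("290919,\"SE\",\"26\",\"Sk\xE4l\"\n".b)
  tmp.flush

  line = read_lines_as_utf8(tmp.path, "ISO-8859-1").first
  puts line.encoding        # UTF-8
  puts line.valid_encoding? # true
  p line.split(",")         # splitting no longer raises
end
```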

If you want to change your file's encoding, you can use the charlock_holmes gem:
https://github.com/brianmario/charlock_holmes
require 'charlock_holmes/string'
content = File.read('test2.txt')
if !content.is_utf8?
  detection = CharlockHolmes::EncodingDetector.detect(content)
  utf8_encoded_content = CharlockHolmes::Converter.convert content, detection[:encoding], 'UTF-8'
end
Then you can save the new content to a temp file and overwrite your original file.
Hope this helps.

Iconv.conv in Rails application to convert from unicode to ASCII//translit

We wanted to convert a Unicode string in the Slovak language into plain ASCII (without accents/carons), that is: č->c š->s á->a é->e etc.
We tried:
cstr = Iconv.conv('us-ascii//translit', 'utf-8', a_unicode_string)
It was working on one system (Mac) and was not working on the other (Ubuntu) where it was giving '?' for accented characters after conversion.
Problem: iconv was using LANG/LC_ALL variables. I do not know why, when the encodings are known, but well... You had to set the locale variables to something.utf8, for example: sk_SK.utf8 or en_GB.utf8
The next step was to try setting ENV['LANG'] and ENV['LC_ALL'] in config/application.rb. This was ignored by Iconv in Ruby.
Another attempt was to use the global system setting in /etc/default/locale. This worked on the command line, but not for the Rails application. Reason: Apache has its own environment. Therefore the final solution was to add the LANG/LC_ALL variables to /etc/apache2/envvars:
export LC_ALL="en_GB.utf8"
export LANG="en_GB.utf8"
export LANGUAGE="en_GB.utf8"
Restarted apache and it worked.
This is more a little how-to than a question. However, if someone has better solution I would like to know about it.

You can try an unaccent approach instead.
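One locale-independent way to unaccent, sketched with Ruby's built-in Unicode normalization (Ruby 2.2 or later). This is a different technique from Iconv's //translit, the helper name unaccent is made up here, and letters that do not decompose into a base letter plus combining mark (for example đ or ł) pass through unchanged.

```ruby
# Hypothetical helper: decompose characters (NFD) and drop the combining
# marks, leaving the base letters. Does not depend on LANG or LC_ALL.
def unaccent(text)
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, "")
end

puts unaccent("\u010D\u0161\u00E1\u00E9") # csae  (input is "čšáé")
```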
