Rails and HTML - encoding error - ruby-on-rails

My template (index.html.erb) contains no line specifying an encoding like UTF-8 or anything, so Rails of course has a problem with that:
Your template was not saved as valid UTF-8. Please either specify UTF-8 as the encoding for your template in your text editor, or mark the template with its encoding by inserting the following as the first line of the template:
# encoding: <name of correct encoding>.
So I tried to paste this into my HTML:
# encoding: < meta charset=utf-8 />
Did I write something wrong, or is there some other code I can use? The answer from Rails is: unknown encoding name - <
Thanks for answering.

Check if the file itself is saved as UTF-8 - this was my problem.
Using "e" as a text editor, or Notepad++ (or any other Windows tool) with the wrong configuration, may cause this. e (and I think Notepad++ too) was configured to save files as "Windows DOS OEM (CP 437)". I changed this setting to UTF-8, re-saved all files (without changes), and it works.
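If you want to verify this from code rather than trusting the editor UI, a minimal Ruby sketch can read the raw bytes and check whether they form valid UTF-8 - effectively the same test Rails performs on the template (the demo file here is a stand-in for index.html.erb):

```ruby
require "tempfile"

# Read a file's raw bytes (no transcoding) and test whether they
# are valid UTF-8.
def valid_utf8_file?(path)
  bytes = File.read(path, mode: "rb")
  bytes.force_encoding(Encoding::UTF_8).valid_encoding?
end

# Demo on a throwaway file standing in for the real template:
Tempfile.create("index.html.erb") do |f|
  f.write("<h1>héllo</h1>")
  f.flush
  puts valid_utf8_file?(f.path)   # => true
end
```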

In your application.rb file, paste this line:
config.encoding = "utf-8"
Or in your application.html.erb file, paste the following line:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

The meta charset is for HTML. You need to specify the charset for Ruby, which you can do with a comment like this:
# encoding: utf-8
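One caveat: a bare `# encoding:` comment is the form for .rb files. In an .erb template, the magic comment goes inside an ERB comment tag on the first line (a sketch from memory of the Rails convention - check against your Rails version):

```erb
<%# encoding: utf-8 %>
<h1>Résumé</h1>
```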

Using RubyMine 4, I accidentally right-clicked and didn't see what I had pressed... BANG! Code gone, plus one error! I had re-encoded the page to some crazy encoding. Just right-click in the page and choose "save '[your crazy encoding here]' file in another encoding".
RubyMine - too easy! ;)

That solved it for me. It was my file that was not being saved as UTF-8 after all.
https://superuser.com/questions/581553/sublime-text-2-encoding-utf-8

Related

Using Umlaut or special characters in ibm-doors from batch

We have a link module that looks something like this:
const string lMod = "/project/_admin/somethingÜ" // Umlaut
We later use the linkMod like this to loop through the outlinks:
for a in obj->lMod do {}
But this only works when executing directly from DOORS and not from a batch script, since the batch run for some reason doesn't recognize the Umlaut, so the body of the loop is never run; exchanging lMod with "*" works and also shows the objects linked to by lMod.
We are already using UTF-8 encoding for the file:
pragma encoding, "UTF-8"
Any solutions are welcome.
Encode the file as UTF-8 in Notepad++ by going to Encoding > Convert to UTF-8. (Make sure it isn't already set to UTF-8 before you do this.)

What encoding is this and how do I turn it into something I can see properly?

I'm writing a script that will operate on the subtitle files of a popular streaming service (Netfl*x).
The subtitle files have strange characters in them, and I can't get my text editors or web browser to render them readably. The XML declaration says UTF-8, but some characters are not readable.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<tt xmlns:tt="http://www.w3.org/ns/ttml" xmlns:ttm="http://www.w3.org/ns/ttml#metadata" xmlns:ttp="http://www.w3.org/ns/ttml#parameter" xmlns:tts="http://www.w3.org/ns/ttml#styling" ttp:tickRate="10000000" ttp:timeBase="media" xmlns="http://www.w3.org/ns/ttml">
<p>de 15 % la nuit dernière.</span></p>
<p>if youâve got things to doâ¦</span></p>
The same garbled characters show up in Vim and in the browser (screenshots omitted).
How can I convert this into something I can use?
I'll go out on a limb and say that file is UTF-8 encoded just fine, and you're merely looking at it using the wrong encoding. The character À encoded in UTF-8 is C3 80. C3 in ISO-8859-1 is Ã, which in your screenshot is followed by an 80. So it looks like you're viewing a UTF-8 file using the (wrong) ISO-8859 encoding.
Use the correct encoding when opening the file.
My terminal is set to en_US.UTF-8, but it was also rendering this supposedly UTF-8 encoded file incorrectly (sonné displayed as sonnÃ©). I was able to work around this by using iconv to encode the file in ISO8859-1.
iconv original.xml -t ISO8859-1 -o converted.xml
In the new file, the characters were properly rendered, although I don't quite understand why.
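The misreading described above can be reproduced in Ruby: forcing the UTF-8 bytes of "sonné" to be read as Latin-1 produces exactly this mojibake, and reversing the step recovers the text (a sketch of the mechanism, not of the actual subtitle file):

```ruby
utf8 = "sonné"                      # "é" is the UTF-8 byte pair C3 A9

# Misread the UTF-8 bytes as ISO-8859-1: C3 becomes "Ã", A9 becomes "©".
mojibake = utf8.dup.force_encoding("ISO-8859-1").encode("UTF-8")
puts mojibake                       # => "sonnÃ©"

# Undoing the misreading recovers the original text.
fixed = mojibake.encode("ISO-8859-1").force_encoding("UTF-8")
puts fixed                          # => "sonné"
```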

Firefox addon localization

I have a problem with localizing my addon. I followed this tutorial on Using Localized Strings in Preferences, but I can't compile my addon because I use Polish characters such as ć.
I've made a locale folder and put a pl-PL.properties file in it with this content:
my_tag_title = Co robić?
and I got this error:
Following locale file is not a valid UTF-8 file: C:\path\pl-PL.properties
'utf8' codec can't decode byte 0xe6 in position 22: invalid continuation byte"
Is there a way to put special characters directly inside package.json? How can I solve this problem?
Make sure that the locale file is saved in UTF-8 format.
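The byte 0xE6 in the error is the clue: "ć" is the single byte E6 in Windows-1250 (the usual Polish Windows codepage), but the two-byte sequence C4 87 in UTF-8, so the file was almost certainly saved in the Windows codepage rather than UTF-8. A small Ruby sketch of the byte difference:

```ruby
# "ć" (U+0107) in the two encodings:
cp1250 = "ć".encode("Windows-1250")
p cp1250.bytes.map { |b| format("%02X", b) }   # => ["E6"]
p "ć".bytes.map { |b| format("%02X", b) }      # => ["C4", "87"]
```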

Character Encoding issue in Rails v3/Ruby 1.9.2

I sometimes get the error "invalid byte sequence in UTF-8" when I read contents from a file. Note - this only happens when the string contains certain special characters. I have tried opening the file without "r:UTF-8", but I still get the same error.
open(file, "r:UTF-8").each_line { |line| puts line.strip } # line.strip generates the error
Contents of the file:
# encoding: UTF-8
290919,"SE","26","Sk‰l","",59.4500,17.9500,, # this errors out
290956,"CZ","45","HornÌ Bradlo","",49.8000,15.7500,, # this errors out
290958,"NO","02","Svaland","",58.4000,8.0500,, # this works
This is a CSV file I got from an external source, and I am trying to import it into my DB. It did not come with "# encoding: UTF-8" at the top; I added that because I read somewhere it would fix this problem, but it did not. :(
Environment:
Rails v3.0.3
ruby 1.9.2p0 (2010-08-18 revision 29036) [x86_64-darwin10.5.0]
Ruby has a notion of an external encoding and an internal encoding for each file. This allows you to work with a file as UTF-8 in your source, even when the file is stored in a more esoteric format. If your default external encoding is UTF-8 (which it is if you're on Mac OS X), all of your file I/O is going to be in UTF-8 as well. You can check this using File.open('file').external_encoding. What you're doing when you open your file with "r:UTF-8" is forcing the same external encoding that Ruby is using by default.
Chances are, your source document isn't in UTF-8, and those non-ASCII characters aren't mapping cleanly to UTF-8 (if they were, you would either get the correct characters and no error, or, if they mapped incorrectly, incorrect characters and no error). What you should do is determine the encoding of the source document, then have Ruby transcode the document on read, like so:
File.open(file, "r:windows-1251:utf-8").each_line { |line| puts line.strip }
If you need help determining the encoding of the source, give this Python library a whirl. It's based on the automatic charset detection fallback that was in Seamonkey/Mozilla (and is possibly still in Firefox).
If you want to detect and change your file's encoding, you can use the charlock_holmes gem:
https://github.com/brianmario/charlock_holmes
require 'charlock_holmes/string'
content = File.read('test2.txt')
unless content.is_utf8?
  detection = CharlockHolmes::EncodingDetector.detect(content)
  utf8_encoded_content = CharlockHolmes::Converter.convert(content, detection[:encoding], 'UTF-8')
end
Then you can save the new content to a temp file and overwrite your original file.
Hope this helps.
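If pulling in a gem is overkill, plain Ruby can do a cruder version - provided you already know (or guess) the source encoding. In this sketch, Windows-1252 is an assumption, not something detected:

```ruby
# Bytes that are valid Windows-1252 but invalid UTF-8 (0xE5 = "å"):
raw = "Sk\xE5l".b

if raw.dup.force_encoding("UTF-8").valid_encoding?
  content = raw.force_encoding("UTF-8")
else
  # Transcode from the assumed source encoding, replacing anything
  # that can't be mapped instead of raising.
  content = raw.encode("UTF-8", "Windows-1252",
                       invalid: :replace, undef: :replace)
end
puts content   # => "Skål"
```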

Why do quotes turn into funny characters when submitted in an HTML form?

I have an HTML form, and some users are copy/pasting text from MS Word. When the text contains single or double quotes, they get translated into funny characters like:
â€™ and ’
The database column is collation utf8_general_ci.
How do I get the appropriate characters to show up?
Edit:
Problem solved. Here's how I fixed it:
Ran mysql_query("SET NAMES 'utf8'"); before adding to / retrieving from the database (thanks to Donal's comment below).
And, somewhat oddly, the PHP function urlencode($text) was being applied when displaying, so that had to be removed.
I also made sure that the headers for the page and the ajax request/response were all utf8.
This looks like a classic case of Unicode (most likely UTF-8) characters being interpreted as ISO-8859-1. There are a couple of places along the way where the characters can get corrupted. First, the client's browser sends the data; it might corrupt the data if it can't convert the characters properly to the page's character encoding. Then the server reads the data and decodes the bytes into characters; if the client and server disagree about the encoding used, the characters will be corrupted. Then the data is stored in the database; again, there is potential for corruption. Finally, when the data is written to the page (for display in the browser), the browser may misinterpret the bytes if the page doesn't adequately indicate its encoding.
You need to ensure that you are using UTF-8 throughout. The default for web pages is iso-8859-1, so your web pages should be served with the Content-Type header or the meta tag
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
(make sure you really are serving the text in that encoding).
By using UTF-8 throughout every part of the process, you will avoid problems with all working web browsers and databases.
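The byte-level story behind "â€™" can be reproduced directly (shown here in Ruby rather than the question's PHP, since the byte arithmetic is identical): the UTF-8 bytes of a smart quote, read as Windows-1252, turn into exactly those three characters.

```ruby
smart = "\u2019"    # RIGHT SINGLE QUOTATION MARK, "’"
p smart.bytes.map { |b| format("%02X", b) }   # => ["E2", "80", "99"]

# Read those three UTF-8 bytes as Windows-1252: E2="â", 80="€", 99="™".
mangled = smart.b.force_encoding("Windows-1252").encode("UTF-8")
puts mangled   # => "â€™"
```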
Check the encoding that the page uses. Encode it using UTF-8 as well, and add a meta tag describing the encoding:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
We have a PHP function that tries to clean up the mess with smart quotes. It's a bit messy, since it grew organically as cases popped up during prototype development. It may be of some help, though:
function convert_smart_quotes($string) {
    $search = array(chr(0xe2) . chr(0x80) . chr(0x98),
                    chr(0xe2) . chr(0x80) . chr(0x99),
                    chr(0xe2) . chr(0x80) . chr(0x9c),
                    chr(0xe2) . chr(0x80) . chr(0x9d),
                    chr(0xe2) . chr(0x80) . chr(0x93),
                    chr(0xe2) . chr(0x80) . chr(0x94),
                    chr(226) . chr(128) . chr(153),
                    '’', '“', 'â€<9d>', 'â€"', ' ');
    $replace = array("'", "'", '"', '"', ' - ', ' - ',
                     "'", "'", '"', '"', ' - ', ' ');
    return str_replace($search, $replace, $string);
}
