I am using the utf8 character set (mysqli_set_charset($con, 'utf8') in my PHP file) and the utf8_czech_ci collation in MySQL.
I develop on a Windows machine, where there is no problem. But when I move my PHP application to a Linux server, the output Excel file has badly encoded characters (Czech characters like ščřž...).
Try this in your PHP function; this is what solved it for me. It clears any buffered output before the Excel file is created:
ob_end_clean(); // discard anything already in the output buffer, so stray bytes cannot corrupt the file
ob_start();     // start a fresh buffer before writing the Excel headers and data
We have a link module that looks something like this:
const string lMod = "/project/_admin/somethingÜ" // Umlaut
We later use lMod like this to loop through the outlinks:
Link a // the loop variable must be declared as a Link
for a in obj->lMod do {}
But this only works when executing directly from DOORS, not from a batch script: the script apparently fails to recognize the umlaut, so the body of the loop never runs. Replacing lMod with "*" works, and it also shows the objects linked through lMod.
We are already using UTF-8 encoding for the file:
pragma encoding, "UTF-8"
Any solutions are welcome.
Encode the file as UTF-8 in Notepad++ by going to Encoding > Convert to UTF-8. (Make sure it's not already set to UTF-8 before you do it).
I'm writing a script that will operate on the subtitle files of a popular streaming service (Netfl*x).
The subtitle files contain strange characters that neither my text editors nor my web browser will display in a readable way. The XML declaration says UTF-8, but some characters come out garbled.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<tt xmlns:tt="http://www.w3.org/ns/ttml" xmlns:ttm="http://www.w3.org/ns/ttml#metadata" xmlns:ttp="http://www.w3.org/ns/ttml#parameter" xmlns:tts="http://www.w3.org/ns/ttml#styling" ttp:tickRate="10000000" ttp:timeBase="media" xmlns="http://www.w3.org/ns/ttml">
<p>de 15 % la nuit dernière.</span></p>
<p>if youâve got things to doâ¦</span></p>
In Vim and in the browser the text renders the same garbled way, with sequences like Ã followed by an 80 byte where accented characters should be. (Screenshots not reproduced here.)
How can I convert this into something I can use?
I'll go out on a limb and say that file is UTF-8 encoded just fine, and you're merely looking at it using the wrong encoding. The character À encoded in UTF-8 is C3 80. C3 in ISO-8859-1 is Ã, which in your screenshot is followed by an 80. So it looks like you're viewing a UTF-8 file using the (wrong) ISO-8859-1 encoding.
Use the correct encoding when opening the file.
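To see concretely what's going on, here is a small Ruby sketch that reproduces the mojibake (the same two bytes, read two ways):

# "À" is U+00C0; its UTF-8 form is the two bytes C3 80.
utf8 = "À"
p utf8.bytes.map { |b| b.to_s(16) }  # => ["c3", "80"]
# Reinterpreting those same bytes as ISO-8859-1 turns C3 into "Ã"
# and 80 into the invisible C1 control character U+0080.
latin1 = utf8.dup.force_encoding("ISO-8859-1")
p latin1.encode("UTF-8")             # => "Ã\u0080"

The bytes never change; only the label that the reader attaches to them does, which is why picking the correct encoding on open is the whole fix.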
My terminal is set to en_US.UTF-8, but it was also rendering this supposedly UTF-8 encoded file incorrectly (sonné -> sonnÃ©). I was able to solve this by using iconv to encode the file in ISO8859-1.
iconv original.xml -t ISO8859-1 -o converted.xml
In the new file, the characters were properly rendered, although I don't quite understand why. (Presumably whatever was displaying the text was actually treating the bytes as ISO-8859-1, despite the locale setting.)
I know there are a lot of threads about this already, but none of the suggested solutions seem to work for me for some reason...
I am using:
Ruby 1.9.2
Rails 2.3.8
My users author CSV files in MS Excel and then need to upload these files to the web application. My web application and the database backend uses UTF-8 and all special characters, such as the £ sign, get corrupted on upload.
I am reading in the file like this:
@file = params[:import_file][:uploaded_data]
Then I get the encoding of the file using:
source_encoding = "UTF-8"
if @file.external_encoding
  source_encoding = @file.external_encoding.name
end
For my test file the source encoding value is ASCII-8BIT.
Then I try to do:
@file.each { |line|
  print "#{line.force_encoding(source_encoding).encode!("UTF-8")}\n"
}
in order to see if all the text displays OK. However, this gives me an error like this:
Encoding::UndefinedConversionError: "\xA3" from ASCII-8BIT to UTF-8
If I try to read the CSV with:
dataArray = CSV.read(@file, encoding: source_encoding)
there are no errors this time, but all special characters come out as ? characters.
Any pointers on where I might be going wrong, or is importing a CSV file authored with MS Excel just mission impossible?
Regards,
Olli
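One likely explanation: \xA3 is the £ sign in Windows-1252 (and ISO-8859-1), which is what Excel on Windows typically uses when saving CSV, so the upload probably isn't UTF-8 at all. A minimal sketch under that assumption (@file.path is hypothetical; adjust to however your upload object exposes the file):

require 'csv'

# Tell Ruby the real source encoding and transcode to UTF-8 while reading.
# "Windows-1252:UTF-8" means external encoding Windows-1252, internal UTF-8.
rows = CSV.read(@file.path, :encoding => "Windows-1252:UTF-8")
rows.each { |row| puts row.inspect }  # fields arrive as valid UTF-8 strings

This would avoid both failure modes above: no undefined-conversion error, and the £ signs survive instead of turning into ?.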
I sometimes get the error "invalid byte sequence in UTF-8" when I read contents from a file. Note: this only happens when the string contains certain special characters. I have tried opening the file without "r:UTF-8", but I still get the same error.
open(file, "r:UTF-8").each_line { |line| puts line.strip(",") } # line.strip generates the error
Contents of the file:
# encoding: UTF-8
290919,"SE","26","Sk‰l","",59.4500,17.9500,, # this errors out
290956,"CZ","45","HornÌ Bradlo","",49.8000,15.7500,, # this errors out
290958,"NO","02","Svaland","",58.4000,8.0500,, # this works
This is a CSV file I got from an outside source and am trying to import into my DB. It did not come with "# encoding: UTF-8" at the top; I added that because I read somewhere it would fix this problem, but it did not. :(
Environment:
Rails v3.0.3
ruby 1.9.2p0 (2010-08-18 revision 29036) [x86_64-darwin10.5.0]
Ruby has a notion of an external encoding and internal encoding for each file. This allows you to work with a file in UTF-8 in your source, even when the file is stored in a more esoteric format. If your default external encoding is UTF-8 (which it is if you're on Mac OS X), all of your file I/O is going to be in UTF-8 as well. You can check this using File.open('file').external_encoding. What you're doing when you open your file and pass "r:UTF-8" is forcing the same external encoding that Ruby is using by default.
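To make that concrete, a quick sketch you can run (the file name is just a placeholder):

p Encoding.default_external                      # e.g. #<Encoding:UTF-8> on Mac OS X
p File.open('data.csv').external_encoding        # the default, unless overridden
p File.open('data.csv', 'r:iso-8859-1').external_encoding  # => #<Encoding:ISO-8859-1>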
Chances are, your source document isn't in UTF-8, and those non-ASCII characters aren't mapping cleanly to UTF-8 (if the document were UTF-8, you would get the correct characters and no error; if the bytes happened to map to the wrong characters, you would get incorrect characters and no error). What you should do is try to determine the encoding of the source document, then have Ruby transcode the document on read, like so:
File.open(file, "r:windows-1251:utf-8").each_line { |line| puts line.strip } # read as windows-1251, receive UTF-8 strings
If you need help determining the encoding of the source, give this Python library a whirl. It's based on the automatic charset detection fallback that was in Seamonkey/Mozilla (and is possibly still in Firefox).
If you want to change your file's encoding, you can use the charlock_holmes gem:
https://github.com/brianmario/charlock_holmes
require 'charlock_holmes/string'

content = File.read('test2.txt')
unless content.is_utf8?
  detection = CharlockHolmes::EncodingDetector.detect(content)
  utf8_encoded_content = CharlockHolmes::Converter.convert(content, detection[:encoding], 'UTF-8')
end
Then you can save your new content in a temp file and overwrite your original file.
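For example, something like this sketch (error handling omitted; the file name comes from the snippet above):

require 'tempfile'
require 'fileutils'

# Write the converted text to a temp file, then swap it in over the original.
if utf8_encoded_content
  tmp = Tempfile.new('converted')
  tmp.write(utf8_encoded_content)
  tmp.close
  FileUtils.mv(tmp.path, 'test2.txt')
end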
Hope this helps.
I want to export some data from my JRuby on Rails webapp to Excel, so I create a CSV string and send it as a download to the client using:
send_data(text, :filename => "file.csv", :type => "text/csv; charset=CP1252", :encoding => "CP1252")
The file seems to be in UTF-8, which Excel cannot read correctly. I googled the problem and found that iconv can convert encodings, so I tried:
ic = Iconv.new('CP1252', 'UTF-8')
text = ic.iconv(text)
but when I send the converted text it does not make any difference: it is still UTF-8 and Excel cannot read the special characters. There are several posted solutions using iconv, so this seems to work for others. When I convert the file manually with iconv on the Linux shell, it works.
What am I doing wrong? Is there a better way?
I'm using:
- jruby 1.3.1 (ruby 1.8.6p287) (2009-06-15 2fd6c3d) (Java HotSpot(TM) Client VM 1.6.0_19) [i386-java]
- Debian Lenny
- Glassfish app server
- Iceweasel 3.0.6
Edit:
Do I have to include some gem to use iconv?
Solution:
S.Mark pointed out this solution:
You have to use UTF-16LE encoding to make Excel understand it, like this:
text = Iconv.conv('UTF-16LE', 'UTF-8', text) # Iconv.conv returns a String; Iconv.iconv would return an Array
Thanks, S.Mark for that answer.
In my experience, Excel cannot handle UTF-8 CSV files properly. Try UTF-16 instead.
Note: Excel's Import Text Wizard appears to work with UTF-8 too
Edit: a search on Stack Overflow turned up this page; please take a look at it.
According to that, adding a BOM (byte order mark) signature to the CSV will pop up Excel's Text Import Wizard, so you could use that as a workaround.
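A sketch of that approach for the asker's Ruby 1.8 / Iconv setup: convert the CSV text to UTF-16LE and prepend the little-endian BOM bytes FF FE so Excel detects the encoding when it opens the file:

require 'iconv'

# UTF-16LE with a BOM; Excel then opens the CSV with the right encoding.
utf16 = "\xFF\xFE" + Iconv.conv('UTF-16LE', 'UTF-8', text)
send_data(utf16, :filename => "file.csv", :type => "text/csv; charset=UTF-16LE")

One known wrinkle: Excel tends to treat UTF-16 text files as tab-delimited, so tab separators may survive better than commas.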
Do you get the same result with the following?
cp1252 = Iconv.conv("CP1252", "UTF8", text)