Writing Russian text to txt file - character-encoding

I am trying to write some Russian text, or Cyrillic text, to a .txt file. I can successfully do so, but when I open the file all that is written in place of the text are a bunch of question marks. I was thinking it was an encoding problem but couldn't find anything in that area to help. I have written a little script that demonstrates the issue.
do shell script "> $HOME/Desktop/Russian\\ Text.txt"
set text_path to ((path to home folder) & "Desktop:Russian Text.txt" as string) as alias
set write_text to "Привет"
tell application "Finder"
write write_text to text_path
set read_text to text of (read text_path)
end tell
If anyone has any ideas as to why this is happening please let me know. Thank you.

I can't answer your question. You do have several AppleScript coding issues in your code, but none of them is causing your problem. AppleScript handles non-ASCII text fine for me: I sometimes write in Danish and it works. However, when I tried my script with Russian I got the same results as you, and I can't explain why. So that you can see the proper syntax for reading and writing a file, here's my code. Note that I do not use the Finder to perform those tasks, and note how I set the path for the output file...
set outpath to (path to desktop as text) & "danish.txt"
set theText to "primær"
-- write the file
set openFile to open for access file outpath with write permission
write theText to openFile
close access openFile
-- read the file
set readText to read file outpath
UPDATE: I found an answer to your problem. It seems that if you write the UTF-16 byte order mark (BOM) to the file first, then it works properly for Russian. So I made two handlers that read and write such files...
set filePath to (path to desktop as text) & "russian.txt"
set theText to "Привет"
write_UnicodeWithBOM(filePath, theText)
read_UnicodeWithBOM(filePath)
on write_UnicodeWithBOM(filePath, theText)
try
set openFile to open for access file (filePath as text) with write permission
write (ASCII character 254) & (ASCII character 255) to openFile starting at 0
write theText to openFile starting at eof as Unicode text
end try
try
close access openFile
end try
end write_UnicodeWithBOM
on read_UnicodeWithBOM(filePath)
read file (filePath as text) as Unicode text
end read_UnicodeWithBOM
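To make the byte layout produced by the handlers above easier to inspect, here is a minimal sketch in Python (used purely for illustration). The function names are hypothetical, and the sketch assumes the text is stored as big-endian UTF-16, which is the byte order the FE FF mark announces.

```python
import codecs

def with_utf16_bom(text: str) -> bytes:
    """Return text as big-endian UTF-16 preceded by the FE FF byte order mark."""
    # codecs.BOM_UTF16_BE is b'\xfe\xff', i.e. byte values 254 and 255.
    return codecs.BOM_UTF16_BE + text.encode("utf-16-be")

def read_utf16(data: bytes) -> str:
    """Decode UTF-16 data, letting the BOM select the byte order."""
    # The generic "utf-16" codec consumes the BOM and uses it to pick the order.
    return data.decode("utf-16")

payload = with_utf16_bom("Привет")
print(payload.hex())                     # BOM first, then two bytes per letter
print(read_utf16(payload) == "Привет")   # the round trip restores the text
```

Without the mark, a reader has to guess both the encoding and the byte order, which is one plausible way to end up with question marks.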

Related

How to replace these extended ascii codes?

I am opening .txt files, but when they are loaded in Xojo, weird characters like these (’ , â€ک) show up.
I've tried DefineEncoding and ConvertEncoding, but it still doesn't seem to work.
output.text = output.text.DefineEncoding(Encodings.WindowsANSI)
output.text = output.text.ConvertEncoding(Encodings.UTF8)
You may have to define the encoding at the time of loading, not afterwards; otherwise the load already gives you UTF-8 characters, which your posted code then mangles. So pass the encoding to the Read function, or load the data as a binary file rather than as a text file.
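The effect described above can be reproduced outside Xojo. The following Python sketch is only an illustration: the sample string is made up, and Windows-1252 is assumed as the wrongly applied encoding.

```python
original = "It’s"                  # contains U+2019, a typographic apostrophe
raw = original.encode("utf-8")     # the bytes actually stored in the file

wrong = raw.decode("cp1252")       # wrong guess: each byte becomes its own character
right = raw.decode("utf-8")        # encoding stated correctly at load time

# If the damage is already done, reversing the wrong step recovers the text,
# provided every byte survived the round trip.
repaired = wrong.encode("cp1252").decode("utf-8")

print(wrong == "Itâ€™s", right == original, repaired == original)
```

The three-character debris comes from the three UTF-8 bytes of the apostrophe being shown one by one, which is why stating the encoding when the bytes are first decoded avoids the problem.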

Lua io.write() not working

I am using a Luvit Lua environment to run my Lua code through my control panel. I want to write to a .txt file, but the simple code I am running is not working.
The reason I want to write to a .txt file is to log notices from a Discord bot I am building with the Discordia library.
I have a folder called MezzaBOT. In this folder I have a write.lua file and a log.txt file. I have this simple code in my write.lua file:
io.output('log.txt')
io.write('hello\n')
io.close()
I then run this in my command prompt with the Luvit environment:
>luvit Desktop\mezzabot\write.lua
I don't get any errors, but the log.txt file stays empty. Am I missing a line in my code, or do I need to access log.txt differently?
Edit: my new code is the following
file = io.open('log.txt')
file:write('hello', '\n')
file:close()
and it is not starting a new line each time with \n
Edit B:
OK, I found my problem: it's creating a log.txt in my C:\Users\PC.
One other problem is that, when writing, it's not making a new line with the \n. Can someone please help me?
Lua, by default, opens files in read mode. You need to explicitly open a file in write mode if you want to write to it (see manual)
file = io.open('log.txt', 'w')
file:write('hello', '\n')
file:close()
Should work :)
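One caveat that may explain the remaining "no new line each time" symptom: 'w' truncates the file on every open, so each run leaves only the latest line, whereas 'a' appends. Lua's io.open accepts the same 'w' and 'a' mode strings. The sketch below shows the difference in Python for illustration; the helper names and the temporary folder are assumptions, not part of the original code.

```python
import os
import tempfile

def log_line(path: str, message: str, mode: str) -> None:
    """Write one line to path using the given open mode ('w' or 'a')."""
    with open(path, mode, encoding="utf-8") as handle:
        handle.write(message + "\n")

def read_lines(path: str) -> list:
    """Return the file's lines without their trailing newlines."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().splitlines()

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "log.txt")
    log_line(path, "first", "w")
    log_line(path, "second", "w")   # 'w' truncates, so "first" is gone
    overwritten = read_lines(path)

    log_line(path, "third", "a")    # 'a' adds after the existing content
    appended = read_lines(path)

print(overwritten)
print(appended)
```

For a log file that should grow across runs, append mode is usually the one to reach for. A relative name such as 'log.txt' is resolved against the current working directory, which matches the file turning up in C:\Users\PC.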

PHPEXCEL weird characters on form inputs

I need some help with the PHPExcel library. Everything works great: I'm successfully exporting my SQL query to an Excel5 file. I need to give this file to a transport company so it can automatically collect information about packages. Unfortunately, the generated Excel file has some ASCII characters between each letter of the cell text, and when the Excel file is imported you need to delete these characters manually.
If I open the file in Excel, everything is fine and I see: COMPANY NAME. If I open it with Notepad++, I see the cell values this way: C(NUL)O(NUL)M(NUL)P(NUL)A(NUL)N(NUL)Y N(NUL)A(NUL)M(NUL)E
If I open the file with Excel again and save it, then reopen it with Notepad++, I see COMPANY NAME.
So I do not understand why, every time I create an Excel file using PHPExcel, every letter of every word is followed by (nul).
How do I prevent the generated Excel file from including (nul) between the letters?
The Excel files generated by the PHPExcel samples are also filled with (nul), and if you open and save them, the (nul) is gone.
Any help would be appreciated, thanks.
What is the (nul)? 0x00? char(0)?
ok, here is the example:
error_reporting(E_ALL);
ini_set('display_errors', TRUE);
ini_set('display_startup_errors', TRUE);
date_default_timezone_set('Europe/London');
if (PHP_SAPI == 'cli')
die('Disponibile solo su browser');
require_once dirname(__FILE__) . '/Classes/PHPExcel.php';
$objPHPExcel = new PHPExcel();
$objPHPExcel->getProperties()->setCreator("Solidus")
->setLastModifiedBy("Solidus")
->setTitle("Import web")
->setSubject("Import File")
->setDescription("n.a")
->setKeywords("n.a")
->setCategory("n.a");
$objPHPExcel->setActiveSheetIndex(0)
->setCellValueExplicit("A1", "COMPANY")
->setCellValue('A2', 'SAMSUNG');
$objPHPExcel->getActiveSheet()->setTitle('DDT');
$objPHPExcel->setActiveSheetIndex(0);
header('Content-Type: application/vnd.ms-excel');
header('Content-Disposition: attachment;filename="TEST.xls"');
header('Cache-Control: max-age=0');
header('Cache-Control: max-age=1');
header('Cache-Control: private',false);
$objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel, 'Excel5');
ob_end_clean();
$objWriter->save('php://output');
As you can see from this little example, the script creates an Excel5 file with 2 cells: A1 = COMPANY, A2 = SAMSUNG.
When I send this file to the transport company, they import it into their system, but as you can see from the picture, there is a weird character between each letter.
I noticed that every time I open the generated Excel5 file with Notepad++ I get:
S(nul)A(nul)M(nul)S(nul)U(nul)N(nul)G
If I save the file with Excel and then open it again with Notepad++ I get:
SAMSUNG
and this file is OK for the transport company.
So my question is: how do I prevent the generated file from containing this (nul) character between each letter?
Can anyone help?
I found the solution myself; I'll explain it in case anyone else has this problem.
There is no way to change how PHPExcel encodes the Excel file,
so I figured the problem was in reading the file. I ran some simulations and reproduced the problem: every time I read the file and put the result into form inputs, I get weird characters:
C�O�M�P�A�N�Y�
If I set the output encoding as follows:
$excel->setOutputEncoding('UTF-8');
the file loads fine, so the problem was not in creating the Excel file but in reading it.
If I print the variable with echo I get: "COMPANY";
if I put the variable into an input as its value I get: "C�O�M�P�A�N�Y�".
Setting the output encoding solves the problem, but I would like to know why there is a difference when I put the variable into an input as its value. Thanks.
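A likely explanation, offered here as an assumption rather than a confirmed diagnosis, is that the cell strings are stored as little-endian UTF-16, where every ASCII letter is followed by a zero byte. The Python sketch below illustrates that byte pattern and what happens when it is read with a one-byte encoding; Latin-1 stands in for whatever the reader assumed.

```python
text = "SAMSUNG"
stored = text.encode("utf-16-le")    # two bytes per character, low byte first

print(stored)                        # each ASCII letter is followed by a zero byte
print(stored.decode("utf-16-le"))    # the matching codec restores the text

# Treating the same bytes as a one-byte encoding keeps the NULs as characters,
# which is what shows up between the letters.
misread = stored.decode("latin-1")
print(misread.count("\x00"))
```

Under that assumption, asking the reader for UTF-8 output makes it convert the two-byte form before the value reaches the page, so the zero bytes never get there. A browser and a terminal can also treat stray zero bytes differently, which would fit the echo versus input difference.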

saving data with TextEdit

I want to use TextEdit to save data. This is what I have so far:
tell application "TextEdit"
open /Users/UserName/Desktop/save.rtf
end tell
This gives me
"Expected “given”, “in”, “of”, expression, “with”, “without”, other parameter name, etc. but found unknown token."
and highlights the . in .rtf. I tried removing the .rtf,
but when I compile it, it turns into
(open) / Users / username / desktop / (save)
This code gives "The variable Users is not defined."
Also, if possible, can I have TextEdit run in the background without opening a window?
Put quotes around the path and use POSIX file to get a file object for the path:
tell application "TextEdit"
open POSIX file "/Users/UserName/Desktop/save.rtf"
end tell
You can modify the text of a document by changing the text property:
tell application "TextEdit"
set text of document 1 to text of document 1 & "aa"
end tell
Note that this removes all styles in rich text documents. It also inserts the text as 12-point Helvetica in plain text documents, regardless of the default font.
Creating a new rtf file:
tell application "TextEdit"
make new document at beginning with properties {text:"aa"}
close document 1 saving in POSIX file "/tmp/a.rtf"
end tell
Alternatively, textutil can create the RTF file from the shell:
printf %s\\n aa | textutil -inputencoding UTF-8 -convert rtf -stdin -output a.rtf

How to open Excel file written with incorrect character encoding in VBA

I opened an Excel 2003 file in a text editor and saw that it contains some markup language.
When I open the file in Excel, it displays incorrect characters. On inspecting the file I see that the declared encoding is Windows-1252 or some such. If I manually replace this with UTF-8, the file opens fine. OK, so far so good: I can correct the thing manually.
Now the catch is that this file is generated automatically, and I need to process it automatically (no human interaction) with the limited tools on my desktop (no Perl or other scripting language).
Is there any simple way to open this XL file in VBA with the correct encoding (and ignore the encoding specified in the file)?
Note, Workbook.ReloadAs does not function for me, it bails out on error (and requires manual action as the file is already open).
Or is the only way to correct the file to go through some hoops? Either: text in, check line for encoding string, replace if required, write each line to new file...; or export to csv, then import from csv again with specific encoding, save as xls?
Any hints appreciated.
EDIT:
ADODB did not work for me (XL says user defined type, not defined).
I solved my problem with a workaround:
name2 = Replace(name, ".xls", ".txt")
Set wb = Workbooks.Open(name, True, True) ' open read-only
Set ws = wb.Worksheets(1)
ws.SaveAs FileName:=name2, FileFormat:=xlCSV
wb.Close False ' close workbook without saving changes
Set wb = Nothing ' free memory
Workbooks.OpenText FileName:=name2, _
Origin:=65001, _
DataType:=xlDelimited, _
Comma:=True
Well, I think you can do it from another workbook. Add a reference to Microsoft ActiveX Data Objects, then add this sub:
Sub Encode(ByVal sPath$, Optional SetChar$ = "UTF-8")
Dim stream As ADODB.stream
Set stream = New ADODB.stream
With stream
.Open
.LoadFromFile sPath ' Loads a File
.Charset = SetChar ' sets stream encoding (UTF-8)
.SaveToFile sPath, adSaveCreateOverWrite
.Close
End With
Set stream = Nothing
Workbooks.Open sPath
End Sub
Then call this sub with the path to the file that has the wrong encoding.
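For comparison, the manual fix described in the question (replacing the declared encoding with UTF-8) can be sketched as a byte-level rewrite. This is an illustration in Python, not something available under the question's constraints. It assumes the file begins with an XML declaration that names the encoding; the function name and sample content are hypothetical.

```python
import re

def redeclare_encoding(data: bytes, new_name: str = "UTF-8") -> bytes:
    """Replace the encoding named in the XML declaration, leaving the body untouched."""
    # Assumption: the header looks like <?xml version="1.0" encoding="..."?>.
    pattern = re.compile(rb'(<\?xml[^>]*?encoding=["\'])([^"\']+)(["\'])')
    return pattern.sub(
        lambda m: m.group(1) + new_name.encode("ascii") + m.group(3),
        data,
        count=1,
    )

# Hypothetical sample: UTF-8 bytes mislabelled as Windows-1252.
sample = '<?xml version="1.0" encoding="windows-1252"?><Data>Привет</Data>'.encode("utf-8")
fixed = redeclare_encoding(sample)
print(fixed[:38])
```

Only the label changes; the body bytes are left alone, which is the right fix when the content is already UTF-8 and merely declared wrongly.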

Resources