TStringlist not loading Google Contacts file - delphi

I'm trying to use a TStringList to load a CSV file generated by Google Contacts. When I open this file in a text editor like Sublime Text, I can see the contents properly, with 75 lines. This is a sample from the Google Contacts file:
Name,Given Name,Additional Name,Family Name,Yomi Name,Given Name Yomi,Additional Name Yomi,Family Name Yomi,Name Prefix,Name Suffix,Initials,Nickname,Short Name,Maiden Name,Birthday,Gender,Location,Billing Information,Directory Server,Mileage,Occupation,Hobby,Sensitivity,Priority,Subject,Notes,Group Membership,Phone 1 - Type,Phone 1 - Value,Phone 2 - Type,Phone 2 - Value,Phone 3 - Type,Phone 3 - Value
H,H,,,,,,,,,,,,, 1-01-01,,,,,,,,,,,,* My Contacts ::: Importado 01/02/16,,,,,,
H - ?,H,-,?,,,,,,,,,,, 1-01-01,,,,,,,,,,,,* My Contacts ::: Importado 01/02/16,Mobile,031-863-64393,,,,
H - ?,H,-,?,,,,,,,,,,,,,,,,,,,,,,,* My Contacts ::: Importado 01/02/16,Mobile,031-986-364393,,,,
But when I try to load this same file using TStringList, this is what I see in its Text property:
'ÿþN'#$D#$A
Here is my code :
procedure TForm1.LoadFile;
var
  sl: TStringList;
begin
  sl := TStringList.Create;
  try
    sl.LoadFromFile('c:\google.csv');
    ShowMessage('lines : ' + IntToStr(sl.Count) + ' / text : ' + sl.Text);
  finally
    sl.Free; // release the list; the original leaked it
  end;
end;
This is the result I get:
'1 / 'ÿþN'#$D#$A'
What is happening here?
Thanks

According to the hex dump you provided, the BOM indicates that your file is encoded as UTF-16LE. You have a few options, as I see it:
Switch to Unicode and use the TnT Unicode controls to work with this file.
Read the file as an array of bytes, convert it to ANSI, and then continue using ANSI-encoded text. Obviously you'll lose information for any characters that cannot be encoded by your ANSI code page. A cheap way to do this would be to read the file as a byte array, copy the content after the first two bytes (the BOM) into a WideString, and then assign that WideString to an ANSI string.
Port your program to a Unicode version of Delphi (anything later than Delphi 2007) and work natively with Unicode.
I rather suspect that you are not very familiar with text encodings. If you were then I think you would have been able to answer the question yourself. That's just fine but I urge you to take the time to learn about this issue properly. If you rush into coding now, before having a sound grounding, you are sure to make a mess of it. And we've seen so many people make that same mistake. Please don't add to the list of text encoding casualties.
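As a rough illustration of the third option above, here is a minimal sketch for a Unicode Delphi. It assumes Delphi XE2 or later (for the System.* unit names), and the file name SampleFile is hypothetical; the program first writes a small UTF-16LE file so that it is self-contained, then loads it back with the encoding named explicitly.

```delphi
program LoadUtf16Csv;
{$APPTYPE CONSOLE}
uses
  System.SysUtils, System.Classes;

const
  SampleFile = 'google-sample.csv'; // hypothetical file name for this sketch

var
  sl: TStringList;
begin
  sl := TStringList.Create;
  try
    // Create a small UTF-16LE file (with BOM) so the sketch is self-contained.
    sl.Add('Name,Given Name');
    sl.Add('H,H');
    sl.SaveToFile(SampleFile, TEncoding.Unicode);

    // Load it back; TEncoding.Unicode is UTF-16LE.
    sl.Clear;
    sl.LoadFromFile(SampleFile, TEncoding.Unicode);
    Writeln('lines : ', sl.Count);
  finally
    sl.Free;
  end;
end.
```

In the asker's case the file already exists, so only the LoadFromFile call with TEncoding.Unicode would be needed.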

Thanks to David's information, I was able to do this with the procedure below. Because Delphi 2007 does not have Unicode support, it needs a third-party function to do it.
procedure loadUnicodeFile(const filename: String; strings: TStringList);

  procedure SwapWideChars(p: PWideChar);
  begin
    while p^ <> #0000 do
    begin
      // p^ := Swap( p^ ); //<<< D3
      p^ := WideChar(Swap(Word(p^)));
      Inc(p);
    end; { While }
  end; { SwapWideChars }

var
  ms: TMemoryStream;
  wc: WideChar;
  pWc: PWideChar;
begin
  ms := TMemoryStream.Create;
  try
    ms.LoadFromFile(filename);
    // Append a terminating #0 so the buffer can be treated as a PWideChar
    ms.Seek(0, soFromEnd);
    wc := #0000;
    ms.Write(wc, SizeOf(wc));
    pWc := ms.Memory;
    if pWc^ = #$FEFF then // normal byte order mark
      Inc(pWc)
    else if pWc^ = #$FFFE then
    begin // byte order is big-endian
      SwapWideChars(pWc);
      Inc(pWc);
    end { If }
    else
      ; // no byte order mark
    strings.Text := WideCharToString(pWc);
  finally
    ms.Free;
  end;
end;

Related

Write Unicode (UTF-8) text file

How can I write a Unicode text file in Delphi?
Currently I simply use AssignFile, RewriteFile, and Writeln, but this does not write Unicode characters.
You shouldn't be using old Pascal I/O at all. That did its job back in the 80s but is very obsolete today.
This century, you can use the TStringList. This is very commonly used in Delphi. For instance, VCL controls use TStrings to access a memo's lines of text and a combo box's or list box's items.
var SL := TStringList.Create;
try
  SL.Add('∫cos(x)dx = sin(x) + C');
  SL.Add('¬(a ∧ b) ⇔ ¬a ∨ ¬b');
  SL.SaveToFile(FileName, TEncoding.UTF8);
finally
  SL.Free;
end;
For more advanced needs, you can use a TStreamWriter:
var SW := TStreamWriter.Create(FileName, False, TEncoding.UTF8);
try
  SW.WriteLine('αβγδε');
  SW.WriteLine('ωφψξη');
finally
  SW.Free;
end;
And for very simple needs, there are the new TFile methods in IOUtils.pas:
var S := '⌬ is aromatic.';
TFile.WriteAllText(FileName, S, TEncoding.UTF8); // string (possibly with linebreaks)
var Lines: TArray<string>;
Lines := ['☃ is cold.', '☼ is hot.'];
TFile.WriteAllLines(FileName, Lines, TEncoding.UTF8); // string array
As you can see, all these modern options allow you to specify UTF8 as encoding. If you prefer to use some other encoding, like UTF16, that's fine too.
Just forget about AssignFile, Reset, Rewrite, Append, CloseFile etc.
Other users have given you options, but nobody has answered the question itself (I guess). You can't write UTF-8 using Writeln because, at runtime, any string is converted back to ANSI. All the proposals seem very good, however.
Try this short program
program utf8;
{$APPTYPE CONSOLE}
{$R *.res}
uses
System.SysUtils;
var s : string; u : AnsiString; some : Text;
begin
try
{ TODO -oUser -cConsole Main : Insert code here }
Assign(some,'data.txt');
rewrite(some);
s := 'física';
u := UTF8Encode (s);
writeln(some,s);
writeln(some,u);
Close(some);
except
on E: Exception do
Writeln(E.ClassName, ': ', E.Message);
end;
end.
Enable "use debug dcu" and carefully follow the Writeln execution. You will learn that despite the fact of being UTF8 encoded, u is switched back to Ansi at some point.
Edit:
I was wrong. You can indeed, by passing the code page when assigning the file:
Assign(some, 'data.txt', CP_UTF8);
Check the help for System.Assign.

block read error

Can anybody please explain to me why I am hitting 'I/O error 998' in the block read below?
function ReadBiggerFile: string;
var
biggerfile: file of char;
BufArray: array [1 .. 4096] of char; // we will read 4 KB at a time
nrcit, i: integer;
sir, path: string;
begin
path := ExtractFilePath(application.exename);
assignfile(biggerfile, path + 'asd.txt');
reset(biggerfile);
repeat
blockread(biggerfile, BufArray, SizeOf(BufArray), nrcit);
for i := 1 to nrcit do
begin
sir := sir + BufArray[i];
Form4.Memo1.Lines.Add(sir);
end;
until (nrcit = 0);
closefile(biggerfile);
ReadBiggerFile := sir;
end;
I think you mistagged the question and you're using Delphi 2009+, not Delphi 7. I got the error in the title bar trying your exact code on Delphi 2010 (Unicode Delphi). When you say:
var biggerfile: file of Char;
You're declaring biggerfile to be a file of "records", where each record is a Char. On Unicode Delphi that's 2 bytes. You later request to read SizeOf(BufArray) records, not bytes. That is, you request 4096 x 2 = 8192 records. But your buffer is only 4096 records long, so you get a weird error.
I was able to fix your code by simply replacing Char with AnsiChar, since AnsiChar has a size of 1, hence the SizeOf() equals Length().
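Another way to sidestep the records-versus-bytes confusion is to open an untyped file with a record size of 1, so that every count passed to BlockRead is a byte count. The following is only a sketch under that assumption; the file name SampleFile is hypothetical, and the program writes its own test data first so that it runs on its own.

```delphi
program BlockReadBytes;
{$APPTYPE CONSOLE}
uses
  System.SysUtils;

const
  SampleFile = 'blockread-sample.bin'; // hypothetical file name for this sketch

var
  F: file;                             // untyped file
  Buf: array[1..4096] of Byte;
  NumRead: Integer;
  Total: Int64;
begin
  // Write some test data: one full buffer plus a partial one.
  FillChar(Buf, SizeOf(Buf), 0);
  AssignFile(F, SampleFile);
  Rewrite(F, 1);                       // record size = 1 byte
  try
    BlockWrite(F, Buf, SizeOf(Buf));
    BlockWrite(F, Buf, 1000);
  finally
    CloseFile(F);
  end;

  // Read it back in 4 KB chunks; with record size 1, SizeOf(Buf) is the right count.
  Total := 0;
  Reset(F, 1);
  try
    repeat
      BlockRead(F, Buf, SizeOf(Buf), NumRead);
      Inc(Total, NumRead);
    until NumRead = 0;
  finally
    CloseFile(F);
  end;
  Writeln('bytes read: ', Total);
end.
```

With a typed file such as file of Char, the equivalent precaution would be to pass Length(BufArray) rather than SizeOf(BufArray), since the count is in records.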
The permanent fix should involve moving from the very old Pascal-style file operations to something modern, TStream based. I'm not sure exactly what you're trying to obtain, but if you simply want to get the content of the file in a string, may I suggest something like this:
function ReadBiggerFile: AnsiString;
var
  biggerfile: TFileStream;
begin
  biggerfile := TFileStream.Create('C:\Users\Cosmin Prund\Downloads\AppWaveInstall201_385.exe', fmOpenRead or fmShareDenyWrite);
  try
    SetLength(Result, biggerfile.Size);
    // Result[1] is only valid for a non-empty string, so skip the read for an empty file
    if Length(Result) > 0 then
      biggerfile.ReadBuffer(Result[1], Length(Result));
  finally
    biggerfile.Free;
  end;
end;
Hi: I had the same issue, and I simply passed it the first element of the buffer, which is the starting point of the memory block, like so:
AssignFile(BinFile,binFileName);
reset(BinFile,sizeof(Double));
Aux:=length(numberArray);
blockread(BinFile,numberArray[0],Aux, numRead);
closefile(BinFile);

Sending printer specific commands

I have an issue: I am trying to encode magnetic stripe data on a Fargo DTC400 printer. The specifications say I need to send the following string commands from, for example, Notepad or WordPad:
~1%TRACK NUMBER ONE?
~2;123456789?
~3;123456789?
This example encodes the string in track one and the numbers 123456789 in both tracks 2 and 3. This works from Notepad.exe.
EDIT:
The Delphi code I currently use works on another printer:
procedure SendQuote(MyCommand: AnsiString);
var
  PTBlock: TPassThrough;
begin
  PTBlock.nLen := Length(MyCommand);
  StrPCopy(@PTBlock.SData, MyCommand);
  Escape(Printer.Handle, PASSTHROUGH, 0, @PTBlock, nil);
end;
When I try to encode this string from my own application, I run into trouble: the printer seems to ignore my commands completely. When I choose print to file, I can read the binary data and see my string in the printed file. When I print to file from, for example, notepad.exe, I get just rubbish binary data and cannot find my strings at all.
So I wonder: what does Notepad do to send this string command that I don't?
I hope someone can shed light on this, because I have been eager to implement Fargo support in my application for a long time.
thanks
Update.
the following code is ancient but it does the job, however is there another way I can use this with the Passthrough code above?
var
POutput: TextFile;
k: Integer;
begin
with TPrintDialog.Create(self) do
try
if Execute then
begin
AssignPrn(POutput);
Rewrite(POutput);
Writeln(POutput,'~1%TESTENCODER?');
Writeln(POutput,'~2;123456789?');
Writeln(POutput,'~2;987654321?');
CloseFile(POutput);
end;
finally
free;
end
end;
TPassThrough should be declared like this :
type
TPassThrough = packed record
nLen : SmallInt;
SData : Array[0..255] of AnsiChar;
end;
You might be using a modern Delphi (2009 or newer), or you may have forgotten the packed directive.
See also this SO question on the correct way to send commands directly to a printer.
At Torry's there is an example snippet (written by Fatih Ölçer):
Remark : Modified for use with Unicode Delphi versions as well.
{
By using the Windows API Escape() function,
your application can pass data directly to the printer.
If the printer driver supports the PASSTHROUGH printer escape,
you can use the Escape() function and the PASSTHROUGH printer escape
to send native printer language codes to the printer driver.
If the printer driver does not support the PASSTHROUGH printer escape,
you must use the DeviceCapabilities() and ExtDevMode() functions instead.
With the Windows API function Escape() you can send data directly to the printer.
If the printer driver does not support this, the DeviceCapabilities()
and ExtDevMode() functions must be used.
}
// DOS like printing using Passthrough command
// you should use "printer.begindoc" and "printer.enddoc"
type
TPrnBuffRec = packed record
bufflength: Word;
Buff_1: array[0..255] of AnsiChar;
end;
function DirectToPrinter(S: AnsiString; NextLine: Boolean): Boolean;
var
  Buff: TPrnBuffRec;
  TestInt: Integer;
begin
  TestInt := PASSTHROUGH;
  if Escape(Printer.Handle, QUERYESCSUPPORT, SizeOf(TestInt), @TestInt, nil) > 0 then
  begin
    if NextLine then S := S + #13 + #10;
    StrPCopy(Buff.Buff_1, S);
    Buff.bufflength := StrLen(Buff.Buff_1);
    Escape(Printer.Canvas.Handle, PASSTHROUGH, 0, @Buff, nil);
    Result := True;
  end
  else
    Result := False;
end;
// this code works if the printer supports escape commands
// you can get special esc codes from printer's manual
// example:
Printer.BeginDoc;
try
  DirectToPrinter('This text ', True); // second argument: append CR/LF after the text
finally
  Printer.EndDoc;
end;

delphi 7 richedit and romanian language

I'm trying to write some Romanian text into a RichEdit component (Delphi 7), and even though I set the font's Charset property to "EASTEUROPE_CHARSET", it doesn't work.
What I want to accomplish is to paste some Romanian text into a RichEdit, load it into a StringList, set the Sorted property to True, and assign it to another RichEdit component (that is, sort the list in alphabetical order).
I know this shouldn't be a problem in Delphi2009 and up, but at this point I can work only with Delphi 7.
word examples : opoziţie, computerizată.
Any ideas?
Best regards,
Try this code. It reads the text from RichEdit1 as Unicode text, manually converts S and T with comma below to S and T with cedilla, and then uses WideCharToMultiByte to convert the text to code page 1250. The code point conversions are needed because code page 1250 only encodes the cedilla-based Ş and Ţ, while the new Romanian keyboards under Vista and Windows 7 generate the (correct) comma-based Ș and Ț!
procedure TForm1.Button1Click(Sender: TObject);
var
  GetTextStruct: GETTEXTEX;
  GetLenStruct: GETTEXTLENGTHEX;
  RequiredBytes: Integer;
  NumberOfWideChars: Integer;
  WideBuff: PWideChar;
  AnsiBuff: PChar;
  i: Integer;
begin
  // Get length of text
  GetLenStruct.flags := GTL_NUMBYTES or GTL_USECRLF or GTL_PRECISE;
  GetLenStruct.codepage := 1200; // request unicode
  RequiredBytes := SendMessage(RichEdit1.Handle, EM_GETTEXTLENGTHEX, Integer(@GetLenStruct), 0);
  // The reported length does not include the terminating null; reserve room for it
  Inc(RequiredBytes, SizeOf(WideChar));
  // Prepare structure to get all text
  FillMemory(@GetTextStruct, SizeOf(GetTextStruct), 0);
  GetTextStruct.cb := RequiredBytes; // cb is the size of the output buffer, in bytes
  GetTextStruct.flags := GT_USECRLF;
  GetTextStruct.codepage := 1200; // request unicode
  WideBuff := GetMemory(RequiredBytes);
  try
    // Do the actual request
    SendMessage(RichEdit1.Handle, EM_GETTEXTEX, Integer(@GetTextStruct), Integer(WideBuff));
    // Replace the "new" diacritics with the old (make Romanian text compatible with code page 1250)
    NumberOfWideChars := RequiredBytes div 2;
    for i := 0 to NumberOfWideChars - 1 do
      case Ord(WideBuff[i]) of
        $0218: WideBuff[i] := WideChar($015E);
        $0219: WideBuff[i] := WideChar($015F);
        $021A: WideBuff[i] := WideChar($0162);
        $021B: WideBuff[i] := WideChar($0163);
      end;
    // Convert to code page 1250
    RequiredBytes := WideCharToMultiByte(1250, 0, WideBuff, -1, nil, 0, nil, nil);
    AnsiBuff := GetMemory(RequiredBytes);
    try
      WideCharToMultiByte(1250, 0, WideBuff, -1, AnsiBuff, RequiredBytes, nil, nil);
      Memo1.Lines.Text := AnsiBuff; // AnsiBuff now contains the CRLF-terminated version of the
                                    // text in RichEdit1, correctly translated to code page 1250
    finally
      FreeMemory(AnsiBuff);
    end;
  finally
    FreeMemory(WideBuff);
  end;
end;
Then use something similar to turn AnsiString into UNICODE and push into the RichEdit.
Of course, the only real solution is to switch to Delphi 2009 or Delphi 2010 and use Unicode all over.
I've resolved it with JvWideEditor from Jedi. The code is below:
procedure TForm2.SortUnicode;
var
  asrt: TWStringList;
  i: Integer;
begin
  JvWideEditor1.Lines.Clear;
  JvWideEditor2.Lines.Clear;
  if OpenDialog1.Execute then
  begin
    wPath := OpenDialog1.FileName;
    JvWideEditor1.Lines.LoadFromFile(wPath, [foUnicodeLB]);
    // Create the list right before the try so it is always freed
    asrt := TWStringList.Create;
    try
      asrt.AddStrings(JvWideEditor1.Lines);
      // Drop empty lines, iterating backwards so deletions do not shift pending indexes
      for i := asrt.Count - 1 downto 0 do
      begin
        if Trim(asrt.Strings[i]) = '' then
          asrt.Delete(i);
      end;
      asrt.Duplicates := dupAccept;
      asrt.CaseSensitive := True;
      asrt.Sorted := True;
      JvWideEditor2.Lines.AddStrings(asrt);
      JvWideEditor2.Lines.SaveToFile(GetCurrentDir + '\res.txt', [foUnicodeLB]);
    finally
      FreeAndNil(asrt);
    end;
  end;
end;
Check the language settings in Windows. If you are running English Windows, try setting the "treat non-unicode programs as..." option to Romanian. Or run on native Romanian Windows. To run in a mixed environment (needing to show different charsets simultaneously), you'll likely need Unicode.

How can I get this File Writing code to work with Unicode (Delphi)

I had some code before I moved to Unicode and Delphi 2009 that appended some text to a log file a line at a time:
procedure AppendToLogFile(S: string);
// this function adds our log line to our shared log file
// Doing it this way allows Wordpad to open it at the same time.
var F, C1 : dword;
begin
if LogFileName <> '' then begin
F := CreateFileA(Pchar(LogFileName), GENERIC_READ or GENERIC_WRITE, 0, nil, OPEN_ALWAYS, 0, 0);
if F <> 0 then begin
SetFilePointer(F, 0, nil, FILE_END);
S := S + #13#10;
WriteFile(F, Pchar(S)^, Length(S), C1, nil);
CloseHandle(F);
end;
end;
end;
But CreateFileA and WriteFile are binary file handlers and are not appropriate for Unicode.
I need to get something to do the equivalent under Delphi 2009 and be able to handle Unicode.
The reason why I'm opening and writing and then closing the file for each line is simply so that other programs (such as WordPad) can open the file and read it while the log is being written.
I have been experimenting with TFileStream and TextWriter but there is very little documentation on them and few examples.
Specifically, I'm not sure if they're appropriate for this constant opening and closing of the file. Also I'm not sure if they can make the file available for reading while they have it opened for writing.
Does anyone know how I can do this in Delphi 2009 or later?
Conclusion:
Ryan's answer was the simplest and the one that led me to my solution. With his solution, you also have to write the BOM and convert the string to UTF8 (as in my comment to his answer) and then that worked just fine.
But then I went one step further and investigated TStreamWriter. That is the equivalent of the .NET function of the same name. It understands Unicode and provides very clean code.
My final code is:
procedure AppendToLogFile(S: string);
// this function adds our log line to our shared log file
// Doing it this way allows Wordpad to open it at the same time.
var
  F: TStreamWriter;
begin
  if LogFileName <> '' then
  begin
    // Second argument True = append to the existing file
    F := TStreamWriter.Create(LogFileName, True, TEncoding.UTF8);
    try
      F.WriteLine(S);
    finally
      F.Free;
    end;
  end;
end;
Finally, the other aspect I discovered is if you are appending a lot of lines (e.g. 1000 or more), then the appending to the file takes longer and longer and it becomes quite inefficient.
So I ended up not recreating and freeing the LogFile each time. Instead I keep it open and then it is very fast. The only thing I can't seem to do is allow viewing of the file with notepad while it is being created.
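One possible way to keep the log open while still letting other programs read it is to open the underlying stream with a sharing flag and wrap it in a TStreamWriter. This is only a sketch, not the asker's final code: the constant LogName is hypothetical, and whether a given viewer can actually open the file also depends on the sharing flags that viewer requests.

```delphi
program SharedLog;
{$APPTYPE CONSOLE}
uses
  System.SysUtils, System.Classes;

const
  LogName = 'shared.log'; // hypothetical log file name for this sketch

var
  Stream: TFileStream;
  Writer: TStreamWriter;
begin
  // Make sure the file exists so it can be opened in read/write mode below.
  if not FileExists(LogName) then
    TFileStream.Create(LogName, fmCreate).Free;

  // fmShareDenyWrite: other processes may open the file for reading, not writing.
  Stream := TFileStream.Create(LogName, fmOpenReadWrite or fmShareDenyWrite);
  try
    Stream.Seek(0, soEnd); // append
    Writer := TStreamWriter.Create(Stream, TEncoding.UTF8);
    try
      // In a real application the writer would stay open for the program's lifetime.
      Writer.WriteLine('first line');
      Writer.WriteLine('second line');
      Writer.Flush; // push buffered text to the file so readers can see it
    finally
      Writer.Free; // does not free Stream; the writer does not own it here
    end;
  finally
    Stream.Free;
  end;
end.
```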
For logging purposes why use Streams at all?
Why not use TextFiles? Here is a very simple example of one of my logging routines.
procedure LogToFile(Data: string);
var
  wLogFile: TextFile;
begin
  AssignFile(wLogFile, 'C:\MyTextFile.Log');
  {$I-}
  if FileExists('C:\MyTextFile.Log') then
    Append(wLogFile)
  else
    ReWrite(wLogFile);
  WriteLn(wLogFile, Data); // write the parameter (the original referenced an undeclared S)
  CloseFile(wLogFile);
  {$I+}
  IOResult; // Used to clear any possible remaining I/O errors
end;
I actually have a fairly extensive logging unit that uses critical sections for thread safety, can optionally be used for internal logging via the OutputDebugString command as well as logging specified sections of code through the use of sectional identifiers.
If anyone is interested I'll gladly share the code unit here.
Char and string are wide since D2009. Thus you should use CreateFile instead of CreateFileA!
If you write the string, you should use Length( s ) * sizeof( Char ) as the byte length and not only Length( s ), because of the WideChar issue. If you want to write ANSI chars, you should define s as AnsiString or UTF8String and use sizeof( AnsiChar ) as the multiplier.
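To make the byte-length point concrete, here is a small sketch that compares the two lengths for the same text. It assumes a Unicode Delphi (XE2 or later for the System.* unit names); the variable names are illustrative only.

```delphi
program ByteLengths;
{$APPTYPE CONSOLE}
uses
  System.SysUtils;

var
  S: string;      // UTF-16 in Delphi 2009 and later
  U: UTF8String;  // one AnsiChar per byte
begin
  // #$00ED is the character i-acute, written as an escape so the
  // sketch does not depend on the encoding of the source file.
  S := 'f'#$00ED'sica' + #13#10;
  U := UTF8String(S); // converts UTF-16 to UTF-8

  // Number of bytes to pass to WriteFile / WriteBuffer in each case:
  Writeln('UTF-16 bytes: ', Length(S) * SizeOf(Char));
  Writeln('UTF-8 bytes : ', Length(U) * SizeOf(AnsiChar));
end.
```

Passing plain Length(S) for the UTF-16 string would write only half of the data.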
Why are you using the Windows API function instead of TFileStream defined in classes.pas?
Try this little function I whipped up just for you.
procedure AppendToLog(filename, line: String);
var
  fs: TFileStream;
  ansiline: AnsiString;
  amode: Integer;
begin
  if not FileExists(filename) then
    amode := fmCreate
  else
    amode := fmOpenReadWrite;
  fs := TFileStream.Create(filename, {mode} amode);
  try
    if (amode <> fmCreate) then
      fs.Seek(fs.Size, 0); {go to the end, append}
    ansiline := AnsiString(line) + AnsiChar(#13) + AnsiChar(#10);
    fs.WriteBuffer(PAnsiChar(ansiline)^, Length(ansiline));
  finally
    fs.Free;
  end;
end;
Also, try this UTF8 version:
procedure AppendToLogUTF8(filename, line: UnicodeString);
var
  fs: TFileStream;
  preamble: TBytes;
  outpututf8: RawByteString;
  amode: Integer;
begin
  if not FileExists(filename) then
    amode := fmCreate
  else
    amode := fmOpenReadWrite or fmShareDenyWrite;
  { the sharing flag belongs in the mode; the third constructor argument is Rights, not sharing }
  fs := TFileStream.Create(filename, { mode } amode);
  { sharing mode allows read during our writes }
  try
    {internal Char (UTF16) codepoint, to UTF8 encoding conversion:}
    outpututf8 := Utf8Encode(line); // this converts UnicodeString to WideString, sadly.
    if (amode = fmCreate) then
    begin
      preamble := TEncoding.UTF8.GetPreamble;
      fs.WriteBuffer(PAnsiChar(preamble)^, Length(preamble));
    end
    else
    begin
      fs.Seek(fs.Size, 0); { go to the end, append }
    end;
    outpututf8 := outpututf8 + AnsiChar(#13) + AnsiChar(#10);
    fs.WriteBuffer(PAnsiChar(outpututf8)^, Length(outpututf8));
  finally
    fs.Free;
  end;
end;
If you try to use text files or Object Pascal typed/untyped files in a multithreaded application, you're going to have a bad time.
No kidding: the (Object) Pascal standard file I/O uses global variables to set file mode and sharing. If your application runs in more than one thread (or fiber, if anyone still uses them), using standard file operations could result in access violations and unpredictable behavior.
Since one of the main purposes of logging is debugging a multithreaded application, consider using other means of file I/O: streams and the Windows API.
(And yes, I know it is not really an answer to the original question, but I do not wish to log in, and therefore I do not have the reputation score to comment on Ryan J. Mills's practically wrong answer.)
