How to save TStringList with UNIX line endings? - delphi

I cannot figure how to save the lines of a TStringList using UNIX line endings (LF) instead of the default CRLF ones.
I've tried to use StringReplace() on the stringList.Text property without any success :-(

StringList.Text is a property that generates the text every time it is read. So when you assign the modified text back to the stringlist, you undo your changes: the next time you read the text, the stringlist just builds a new string with its default line break.
This character can be influenced by setting the LineBreak property of the stringlist.
The default value for LineBreak is the sLineBreak constant, which is #13#10 on Windows and #10 on Linux and other POSIX platforms (#13 was the classic Mac OS convention).
Otherwise, if you save StringList.Text in a string variable, you can use StringReplace to change that string, or even better, use AdjustLineBreaks.
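Both suggestions above can be sketched as a small console program. This is only a minimal sketch: the file name unix.txt and the sample lines are made up for illustration, and the unit names assume a Delphi version with unit scope names (XE2 or later).

```pascal
program SaveWithLF;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

var
  SL: TStringList;
  S: string;
begin
  SL := TStringList.Create;
  try
    SL.Add('first line');
    SL.Add('second line');

    // Option 1: let the list itself join its lines with LF.
    SL.LineBreak := #10;
    SL.SaveToFile('unix.txt'); // hypothetical file name

    // Option 2: keep the default line break in the list and
    // convert a copy of the text held in a string variable.
    SL.LineBreak := sLineBreak;
    S := AdjustLineBreaks(SL.Text, tlbsLF);
    Writeln('Converted copy has ', Length(S), ' characters');
  finally
    SL.Free;
  end;
end.
```

Setting LineBreak changes how the list builds its Text, which is what SaveToFile writes, so no StringReplace on the Text property is needed.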

One more possibility is to use Jedi Code Library ( http://jcl.sf.net ) with split/join functionality in their version of string list.
var so : TJclStringList; // PODO style, requires finally-free-end
si : iJclStringList; // ref-counted interface for method chaining (aka Fluent API style)
s : String;
...
s := so.Join(^J);
s := si.Join(^J);

Related

How to make TMemo treat new lines in Linux style?

I have a string that contains Linux style line breaks. Linux style is #13 while Windows style is #13#10.
I would like to show this string in a TMemo. Looks like TMemo accepts only Windows style and does not treat #13 as a new line.
Is inserting #10 the only way to make TMemo break lines, or can I somehow ask TMemo to act in Linux style?
I would like to show this string in Memo. Looks like Memo accepts only Windows style and does not treat #$13 as new line.
That depends on how you give the string to the Memo.
The underlying Win32 EDIT control that TMemo wraps only supports #13#10 style line breaks internally.
If you assign the string to the TMemo.Text property, it will just pass the string as-is to the underlying Win32 control. So, the string will need to use Windows-style line breaks only.
However, if you assign the string to the TMemo.Lines.Text property instead, it will internally adjust all styles of line breaks to Windows-style, and then give the adjusted string to the Win32 control. So, in that regard, you can handle Linux-style and Windows-style line breaks equally.
Alternatively, the TStringList class supports parsing all styles of line breaks (when its LineBreak property matches the sLineBreak constant, which it does by default). So, another option would be to first assign the string to the TStringList.Text property, and then you can assign the resulting list to the TMemo.Lines property.
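The TStringList route described above might look like the following fragment, meant to be pasted into the implementation section of a VCL form unit. ShowTextInMemo is a hypothetical helper name, not part of the VCL, and the sketch assumes the list's LineBreak is left at its default.

```pascal
uses
  System.Classes, Vcl.StdCtrls;

// Hypothetical helper: lets TStringList split the text into lines
// (with the default LineBreak it accepts the different line-break styles),
// then hands the resulting lines to the memo.
procedure ShowTextInMemo(Memo: TMemo; const S: string);
var
  SL: TStringList;
begin
  SL := TStringList.Create;
  try
    SL.Text := S;          // parsing into lines happens here
    Memo.Lines.Assign(SL); // copy the parsed lines into the memo
  finally
    SL.Free;
  end;
end;
```

Assigning directly with Memo.Lines.Text := S is shorter; the intermediate list is mainly useful when the lines need to be inspected or filtered before they are shown.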
Actually Linux style is #10, not #13 (#13 is MacOS style, AFAIK). Also, note that it's #10 and not #$10 (which is #16).
The easiest way would be to replace the line ends on load/save, ie. instead of
Memo.Lines.LoadFromFile(FileName)
or
Memo.Lines.Text := STR;
do
uses System.IOUtils;
Memo.Lines.Text := TFile.ReadAllText(FileName,TEncoding.UTF8).Replace(#13#10,#13).Replace(#10,#13).Replace(#13,#13#10)
or
Memo.Lines.Text := STR.Replace(#13#10,#13).Replace(#10,#13).Replace(#13,#13#10)
and instead of
Memo.Lines.SaveToFile(FileName)
or
STR := Memo.Lines.Text
do
uses System.IOUtils;
TFile.WriteAllText(FileName,Memo.Lines.Text.Replace(#13#10,#13),TEncoding.UTF8)
or
STR := Memo.Lines.Text.Replace(#13#10,#13)
Of course, you should replace TEncoding.UTF8 with the encoding you actually want to use, and write #10 instead of #13 when saving if the output should use true Linux-style (LF) line ends.

Problems with unicode text

I use Delphi XE3 and I have a small problem that I don't know how to fix.
The problem is with the letter "è", which appears in a file path: "C:\lène.mp4".
I save this path into a TStringList, and when I save the TStringList to a file, the path looks fine inside the txt file.
But when I load the file back with a TStringList, the letter is shown as "Ã¨" (in a memo or in a variable), so the path becomes invalid.
Adding the path (string) directly to the TStringList and then passing it to the path variable works fine,
but loading it from the file and passing it to the path variable doesn't work (I get "Ã¨" instead of "è").
Normally I will work with a lot of Unicode strings, but for now I'm struggling with that one letter.
This does not work:
var
  resp : widestring;
  xfiles : tstringlist;
begin
  xfiles := tstringlist.Create;
  try
    xfiles.LoadFromFile('C:\Demo6-out.txt'); // this file contains only "C:\lène.mp4"
    resp := (xfiles.Strings[0]);
    // if i save xfiles to a file "path string" will be saved fine ... !
  finally
    xfiles.Free ;
  end;
end;
But like this it works:
var
  resp : widestring;
  xfiles : tstringlist;
begin
  xfiles := tstringlist.Create;
  try
    xfiles.Add('C:\lène.mp4');
    resp := (xfiles.Strings[0]);
  finally
    xfiles.Free ;
  end;
end;
i'm really confused
First, you should be using UnicodeString instead of WideString. UnicodeString was introduced in Delphi 2009, and is much more efficient than WideString. The RTL uses UnicodeString (almost) everywhere it previously used AnsiString prior to 2009.
Second, something else introduced in Delphi 2009 is SysUtils.TEncoding, which is used for Byte<->Character conversions. Several existing RTL classes, including TStrings/TStringList, were updated to support TEncoding when converting bytes to/from strings.
What happens when you load a file into TStringList is that an internal TEncoding object is assigned to help convert the file's raw bytes to UnicodeString values. Which implementation of TEncoding it uses depends on the character encoding that LoadFromFile() thinks the file is using, if not explicitly stated (LoadFromFile() has an optional AEncoding parameter). If the file has a UTF BOM, a matching TEncoding is used, whether that be TEncoding.UTF8 or TEncoding.(BigEndian)Unicode. If no BOM is present, and the AEncoding parameter is not used, then TEncoding.Default is used, which represents the OS's default charset locale (and thus provides backwards compatibility with existing pre-2009 code).
When saving a TStringList to a file, if the list was previously loaded from a file, then the same TEncoding used for loading is used for saving; otherwise TEncoding.Default is used (again, for backwards compatibility), unless overridden by the optional AEncoding parameter of SaveToFile().
In your first example, the input file is most likely encoded in UTF-8 without a BOM. So LoadFromFile() would use TEncoding.Default to interpret the file's bytes. Ã¨ is the result of the UTF-8 encoded form of è (byte octets 0xC3 0xA8) being misinterpreted as Windows-1252 instead of UTF-8. So, you would have to load the file like this instead:
xfiles.LoadFromFile('C:\Demo6-out.txt', TEncoding.UTF8);
In your second example, you are not loading a file or saving a file. You are simply assigning a string literal (which is unicode-aware in D2009+) to a UnicodeString variable (inside of the TStringList) and then assigning that to a WideString variable (WideString and UnicodeString use the same UTF-16 character encoding, they just use different memory management). So there are no data conversions being performed.
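To make the encoding explicit on both sides, a round trip could be sketched as follows. The file name paths-utf8.txt is made up for illustration, and the "è" is written as #$00E8 so the example does not depend on the encoding of the source file itself.

```pascal
program Utf8RoundTrip;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

const
  FileName = 'paths-utf8.txt';      // hypothetical file name
  Path = 'C:\l'#$00E8'ne.mp4';      // "C:\lène.mp4", with è given by code point

var
  SL: TStringList;
begin
  SL := TStringList.Create;
  try
    SL.Add(Path);
    // State the encoding explicitly instead of relying on TEncoding.Default.
    SL.SaveToFile(FileName, TEncoding.UTF8);

    SL.Clear;
    // Load with the same explicit encoding, so no BOM detection is needed.
    SL.LoadFromFile(FileName, TEncoding.UTF8);

    if SL[0] = Path then
      Writeln('round trip OK')
    else
      Writeln('round trip changed the string');
  finally
    SL.Free;
  end;
end.
```

Passing the encoding to both calls keeps the file readable regardless of the machine's default code page.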
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

How to fetch japanese characters in Delphi 7

I have a problem with displaying Japanese Character , specifically Unicode character "5c" in my delphi application . I need to save the application names into the registry and then display it in some kind of popup.
I have narrowed down the problem to this code specifically :-
Var
Str : WideString;
Str2: WideString;
Str3 : WideString;
TntEdit5.Text := TntOpenDialog1.FileName; //correctly displayed
Str3 := TntEdit5.Text;
ShowMessage('Original =' + Str3);
Str := UTF8Encode(TntEdit5.Text) ;
ShowMessage('UTF8Encode =' + Str3);
Str2 := UTF8Decode(Str) ;
ShowMessage('UTF8Decode =' + Str3);
end;
I dont get the correct name in Str, Str2 and Str3 . So how to fetch the name in a string ?
I dont want to display the text but i want to use it to save to registry and other functions.
Instead of SHowMessage, I used MessageBoxW(Form1.Handle, PWChar( Str3 ), 'Path', MB_OK ); which gave me correct result.
But I want to use this string internally, like write the string into a file etc. How to do that ?
Thanks In Advance
The type of Str does not match the result type of UTF8Encode, so the line Str := UTF8Encode(...) damages the data. Instead of Str, declare and use a variable whose datatype matches the result of UTF8Encode.
The same is true for the Str2 := UTF8Decode(Str) line: the Str parameter has the wrong datatype and should be replaced with another variable of the proper datatype.
Str3 is not declared, so the code won't even compile. Add the Str3: WideString; line.
ShowMessage does not work with UTF-16, so you need to make your own popup function that does.
Make your own dialog containing a Tnt Unicode-aware label to display the text. Your new ShowMessage-like function would set the label's caption and then display that dialog instead of the stock Unicode-unaware one.
You may look at http://blog.synopse.info/post/2011/03/05/Open-Source-SynTaskDialog-unit-for-XP%2CVista%2CSeven for an example of such dialogs, but I don't know if they are UTF-16 aware on D7.
Another option is searching TnT Sources for a ready-made unicode-aware function like ShowMessage - there may be one, or not.
Yet another option is using Win32 API directly, namely the MessageBoxW function working with PWideChar variables for texts: see http://msdn.microsoft.com/en-us/library/windows/desktop/ms645505.aspx
@DavidHeffernan MessageBoxW needs a lot of boilerplate, both because it uses C strings and because it offers too much flexibility. It may be considered a fairly good replacement for MessageDlg, but not so much for ShowMessage. Also, I am sure that TnT has a ShowMessage conversion, and that implementing your own dialog would be good for both the application's look-and-feel and the topic starter's experience.
You may also switch from obsolete Delphi 7 to modern CodeTyphon that uses UTF-8 for strings out of the box. You should at very least give it a try.
To read and write a WideString from the registry using the Delphi 7 RTL, you have two easy options:
Convert the WideString to a UTF-8 AnsiString, save it via TRegistry.WriteString, and do the reverse conversion on reading.
Save the WideString as binary data: Cardinal(Length) followed by an array of WideChar, using TRegistry.WriteBinaryData.
You can also use function RegReadWideString(const RootKey: DelphiHKEY; const Key, Name: string): WideString; and RegWriteWideString, courtesy of http://jcl.sf.net
Whichever approach you choose, you have to write your own class on top of TRegistry that uniformly implements the new TYourNewRegistry.WriteWideString and TYourNewRegistry.ReadWideString methods, so that a string is always read back using the same method it was written with.
However, since you already have TNT installed, look carefully inside it: there should be a ready-made Unicode-aware class like TTntRegistry or something similar.
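The first option (UTF-8 through TRegistry.WriteString) could be sketched like this for Delphi 7. The two routine names are hypothetical helpers, not RTL or TNT functions, and the sketch assumes the TRegistry instance already has the wanted key open. It also shows the variable types that match UTF8Encode and UTF8Decode.

```pascal
uses
  Windows, SysUtils, Registry;

// Hypothetical helper: stores a WideString as UTF-8 text in a string value.
procedure WriteWideStringAsUtf8(Reg: TRegistry; const Name: string;
  const Value: WideString);
var
  Utf8: UTF8String; // matches the result type of UTF8Encode
begin
  Utf8 := UTF8Encode(Value);
  Reg.WriteString(Name, Utf8);
end;

// Hypothetical helper: reads the value back and decodes it to UTF-16.
function ReadWideStringFromUtf8(Reg: TRegistry; const Name: string): WideString;
var
  Utf8: UTF8String; // matches the parameter type of UTF8Decode
begin
  Utf8 := Reg.ReadString(Name);
  Result := UTF8Decode(Utf8);
end;
```

Keeping the pair together makes it harder to write a value with one convention and read it with another, which is the point of wrapping them in one class as suggested above.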

How to read a text file that contains 'NULL CHARACTER' in Delphi?

I have a text file that contains many NULL CHARACTERS and its encoding is UTF8.
I loaded the file using RichEdit1.Lines.LoadFromFile(FileName,Encoding), but it stopped after the first null character and didn't load the rest of the file.
Is there any help? How can I remove null characters from a text file?
BTW, my text file encoding is UTF8.
Reading the file shouldn't be a problem. Rather, the problem is more likely when you try to store the data in a rich-edit control. Those controls don't accept arbitrary binary data. You need to ensure you only put text in that control.
Load the file into an ordinary string or stream:
var
s: string;
ss: TStringStream;
s := TFile.ReadAllText(FileName);
Then remove the invalid characters. #0 is the notation in Delphi to represent a null character. Ordinarily, we might use StringReplace to remove characters:
s := StringReplace(s, #0, '', [rfReplaceAll]);
However, it's not binary-safe; it stops at null characters. Instead, you'll need a different function for removing those characters. I've demonstrated that before. Call that function to adjust the string:
RemoveNullCharacters(s);
Finally, put the data in the rich-edit control:
ss := TStringStream.Create(s);
try
  RichEdit1.Lines.LoadFromStream(ss, Encoding);
finally
  ss.Free;
end;
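The RemoveNullCharacters function referred to above is not shown in this thread. The following is one possible implementation, offered as an assumption rather than the answerer's original code: it compacts the string in place by index, so it does not rely on any null-terminated string routine.

```pascal
// One possible RemoveNullCharacters (an assumption, not the linked original):
// copies every non-null character towards the front of the string,
// then trims the string to the number of characters kept.
procedure RemoveNullCharacters(var S: string);
var
  Src, Dst: Integer;
begin
  Dst := 0;
  for Src := 1 to Length(S) do
    if S[Src] <> #0 then
    begin
      Inc(Dst);
      S[Dst] := S[Src];
    end;
  SetLength(S, Dst);
end;
```

Because it indexes by Length(S) rather than scanning for a terminator, embedded #0 characters do not cut the loop short.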
Are you sure the file is UTF-8 and not UTF-16 (what Windows calls "Unicode")? UTF-16 uses two bytes per code unit, and for plain ASCII text one of those two bytes is always a null byte (which is not the case for, say, Chinese text).
Have you tried opening the file with the IDE editor? Open it, select all the text (Ctrl+A) and copy it (Ctrl+C), then create a new empty text file and paste (Ctrl+V) the text.
Save the new file and try the RichEdit with this new file.

Saving a string with null characters to a file

I have a string that contains null characters.
I've tried to save it to a file with this code:
myStringList.Text := myString;
myStringList.SaveToFile('c:\myfile');
Unfortunately myStringList.Text is empty if the source string has a null character at the beginning.
I thought only C string were terminated by a null character, and Delphi was always fine with it.
How to save the content of the string to a file?
I think you mean "save a string that has #0 characters in it".
If that's the case, don't try and put it in a TStringList. In fact, don't try to save it as a string at all; just like in C, a NULL character (#0 in Delphi) causes the string to be truncated at times. Use a TFileStream and write it directly as byte content:
var
  FS: TFileStream;
begin
  FS := TFileStream.Create('C:\MyFile', fmCreate);
  try
    FS.Write(myString[1], Length(myString) * SizeOf(Char));
  finally
    FS.Free;
  end;
end;
To read it back:
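var
  FS: TFileStream;
begin
  FS := TFileStream.Create('C:\MyFile', fmOpenRead);
  try
    // FS.Size is in bytes, but SetLength expects a character count
    SetLength(MyString, FS.Size div SizeOf(Char));
    FS.Read(MyString[1], Length(MyString) * SizeOf(Char));
  finally
    FS.Free;
  end;
end;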
When you set the Text property of a TStrings object, the new value is parsed as a null-terminated string. Therefore when the code reaches your null character, the parsing stops.
I'm not sure why the Delphi RTL code was designed that way, and it's not documented, but that's just how setting the Text property works.
You can avoid this by using the Add method rather than the Text property.
myStringList.Clear;
myStringList.Add(myString);
myStringList.SaveToFile(FileName);
About writing strings to a file in general: I still see people creating streams or stringlists just to write some stuff to a file, and then destroying the stream or stringlist.
Delphi 7 didn't have IOUtils.pas yet, so if you are still on it you're missing out on that.
There's a handy TFile record with class methods that lets you write text to a file with a single line, without requiring temporary variables:
TFile.WriteAllText('out.txt','hi');
Upgrading makes your life as a Delphi developer a lot easier. This is just a tiny example.
