I created the following code:
Function AnsiStringToStream(Const AString: AnsiString): TStream;
Begin
Result := TStringStream.Create(AString, TEncoding.ANSI);
End;
But I get the warning "W1057 Implicit string cast from 'AnsiString' to 'string'".
Is there something wrong with it?
Thank you.
The TStringStream constructor expects a string as its parameter. When you give it an AnsiString instead, the compiler has to insert conversion code, and the fact that you've specified TEncoding.ANSI doesn't change that.
Try it like this instead:
Function AnsiStringToStream(Const AString: AnsiString): TStream;
Begin
Result := TStringStream.Create(string(AString));
End;
This uses an explicit conversion, and leaves the encoding-related work up to the compiler, which already knows how to take care of it.
In D2009+, TStringStream expects a UnicodeString, not an AnsiString. If you just want to write the contents of the AnsiString as-is without having to convert the data to Unicode and then back to Ansi, use TMemoryStream instead:
function AnsiStringToStream(const AString: AnsiString): TStream;
begin
Result := TMemoryStream.Create;
Result.Write(PAnsiChar(AString)^, Length(AString));
Result.Position := 0;
end;
Since AnsiString is codepage-aware in D2009+, ANY string that is passed to your function will be forced to the OS default Ansi encoding. If you want to be able to pass any 8-bit string type, such as UTF8String, without converting the data at all, use RawByteString instead of AnsiString:
function AnsiStringToStream(const AString: RawByteString): TStream;
begin
Result := TMemoryStream.Create;
Result.Write(PAnsiChar(AString)^, Length(AString));
Result.Position := 0;
end;
I use the HTTPEncode() function in Delphi XE8 to encode Japanese text. Some characters can encode correctly, but some cannot. Below is an example:
aStr := HTTPEncode('萩原小学校');
I expected this:
aStr = '%E8%90%A9%E5%8E%9F%E5%B0%8F%E5%AD%A6%E6%A0%A1'
But I got this:
aStr = '%E8%90%A9%E5%8E%9F%E5%B0%8F%3F%E6%A0%A1'
Can someone help me to encode '萩原小学校' as '%E8%90%A9%E5%8E%9F%E5%B0%8F%E5%AD%A6%E6%A0%A1'?
I'm not sure what this HTTPEncode function is. There is a function in Web.HTTPApp of that name. Perhaps that is what you refer to. If so, it is clearly marked as deprecated. Assuming you have enabled compiler warnings, the compiler will be telling you this, and also telling you to use TNetEncoding.URL.Encode instead.
Let's try that:
{$APPTYPE CONSOLE}
uses
System.NetEncoding;
begin
Writeln(TNetEncoding.URL.Encode('萩原小学校'));
end.
Output
%E8%90%A9%E5%8E%9F%E5%B0%8F%E5%AD%A6%E6%A0%A1
@David Heffernan, @Remy Lebeau, thank you so much for taking the time to help me. Your answers helped me understand why I could not convert my string with HTTPEncode.
I tried many times myself until I found this: Delphi: Convert from windows-1251 to Shift-JIS
function MyEncode(const S: string; const CodePage: Integer): string;
var
Encoding: TEncoding;
Bytes: TBytes;
b: Byte;
sb: TStringBuilder;
begin
Encoding := TEncoding.GetEncoding(CodePage);
try
Bytes := Encoding.GetBytes(S);
finally
Encoding.Free;
end;
sb := TStringBuilder.Create;
try
for b in Bytes do begin
sb.Append('%');
sb.Append(IntToHex(b, 2));
end;
Result := sb.ToString;
finally
sb.Free;
end;
end;
MyEncode('萩原小学校', 65001);
Output = %E8%90%A9%E5%8E%9F%E5%B0%8F%E5%AD%A6%E6%A0%A1
I want to achieve a very very basic task in Delphi: to save a string to disk and load it back. It seems trivial but I had problems doing this TWICE since I upgraded to IOUtils (and one more time before that... this is why I took the 'brilliant' decision to upgrade to IOUtils).
I use something like this:
procedure WriteToFile(CONST FileName: string; CONST uString: string; CONST WriteOp: WriteOperation);
begin
if WriteOp= (woOverwrite)
then IOUtils.TFile.WriteAllText (FileName, uString) //overwrite
else IOUtils.TFile.AppendAllText(FileName, uString); //append
end;
Simple, right? What could go wrong? Well, I recently ran into (another) bug in IOUtils. So, TFile is buggy. The bug is detailed here.
Can anyone share an alternative (or simply your thoughts/ideas) that is not based on IOUtils and is known to work? Well... the code above also worked for a while for me... So I know it is difficult to guarantee that a piece of code (no matter how small) will really work!
Also I would REALLY like to have my WriteToFile procedure to save the string to an ANSI file when it is possible (the uString contains only ANSI chars) and as Unicode otherwise.
Then the ReadAFile function should automagically detect the encoding and correctly read the string back.
The idea is that there are still text editors out there that will wrongly open/interpret a Unicode/UTF file. So, whenever possible, give a good old ANSI text file to the user.
So:
- Overwrite/Append
- Save as ANSI when possible
- Memory efficient (don't eat 4GB of ram when the file to load is 2GB)
- Should work with any text file (up to 2GB, obviously)
- No IOUtils (too buggy to be of use)
Then the ReadAFile function should automagically detect the encoding and correctly read the string back.
This is not possible. There exist files that are well-formed when interpreted in more than one text encoding. For instance, see The Notepad file encoding problem, redux.
This means that your goals are unattainable and that you need to change them.
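To make the ambiguity concrete, here is a minimal console sketch. The program name is made up, and code page 1252 is just one arbitrary ANSI code page, assuming Windows. The same two bytes decode without error under both UTF-8 and Windows-1252, but yield different text, so the bytes alone cannot tell you which encoding was intended.

```delphi
program AmbiguousBytes; // hypothetical name, for illustration only
{$APPTYPE CONSOLE}
uses
  System.SysUtils;
var
  Bytes: TBytes;
  Ansi1252: TEncoding;
begin
  // $C3 $A9 is the single character U+00E9 in UTF-8,
  // but also two valid characters (U+00C3, U+00A9) in Windows-1252.
  Bytes := TBytes.Create($C3, $A9);
  Writeln(Length(TEncoding.UTF8.GetString(Bytes))); // 1 character
  // GetEncoding returns a new instance that the caller must free.
  Ansi1252 := TEncoding.GetEncoding(1252);
  try
    Writeln(Length(Ansi1252.GetString(Bytes))); // 2 characters
  finally
    Ansi1252.Free;
  end;
end.
```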
My advice is to do the following:
Pick a single encoding, UTF-8, and stick to it.
If the file does not exist, create it and write UTF-8 bytes to it.
If the file exists, open it, seek to the end, and append UTF-8 bytes.
A text editor that does not understand UTF-8 is not worth supporting. If you feel inclined, include a UTF-8 BOM when you create the file. Use TEncoding.UTF8.GetBytes and TEncoding.UTF8.GetString to encode and decode.
Just use TStringList, as long as the file is smaller than about 50-100 MB (it depends on CPU speed):
procedure ReadTextFromFile(const AFileName: string; SL: TStringList);
begin
SL.Clear;
SL.DefaultEncoding:=TEncoding.ANSI; // we know that old files use this encoding
SL.LoadFromFile(AFileName, nil); // let TStringList detect the actual encoding (from the BOM);
// if it cannot, it falls back to DefaultEncoding.
end;
procedure WriteTextToFile(const AFileName: string; const TextToWrite: string);
var
SL: TStringList;
begin
SL:=TStringList.Create;
try
ReadTextFromFile(AFileName, SL); // read all file with encoding detection
SL.Add(TextToWrite);
SL.SaveToFile(AFileName, TEncoding.UTF8); // write file with new encoding.
// DO NOT SET SL.WriteBOM to False!!!
finally
SL.Free;
end;
end;
The Inifiles unit should support unicode. At least according to this answer: How do I read a UTF8 encoded INI file?
Inifiles are quite commonly used to store strings, integers, booleans and even stringlists.
procedure TConfig.ReadValues();
var
appINI: TIniFile;
begin
appINI := TIniFile.Create(ChangeFileExt(Application.ExeName,'.ini'));
try
FMainScreen_Top := appINI.ReadInteger('Options', 'MainScreen_Top', -1);
FMainScreen_Left := appINI.ReadInteger('Options', 'MainScreen_Left', -1);
FUserName := appINI.ReadString('Login', 'UserName', '');
FDevMode := appINI.ReadBool('Globals', 'DevMode', False);
finally
appINI.Free;
end;
end;
procedure TConfig.WriteValues(OnlyWriteAnalyzer: Boolean);
var
appINI: TIniFile;
begin
appINI := TIniFile.Create(ChangeFileExt(Application.ExeName,'.ini'));
try
appINI.WriteInteger('Options', 'MainScreen_Top', FMainScreen_Top);
appINI.WriteInteger('Options', 'MainScreen_Left', FMainScreen_Left);
appINI.WriteString('Login', 'UserName', FUserName);
appINI.WriteBool('Globals', 'DevMode', FDevMode);
finally
appINI.Free;
end;
end;
Also see the embarcadero documentation on inifiles: http://docwiki.embarcadero.com/Libraries/Seattle/en/System.IniFiles.TIniFile
Code based on David's suggestions:
{--------------------------------------------------------------------------------------------------
READ/WRITE UNICODE
--------------------------------------------------------------------------------------------------}
procedure WriteToFile(CONST FileName: string; CONST aString: String; CONST WriteOp: WriteOperation= woOverwrite; WritePreamble: Boolean= FALSE); { Write Unicode strings to a UTF8 file. It can also write a preamble }
VAR
Stream: TFileStream;
Preamble: TBytes;
sUTF8: RawByteString;
aMode: Integer;
begin
ForceDirectories(ExtractFilePath(FileName));
if (WriteOp= woAppend) AND FileExists(FileName)
then aMode := fmOpenReadWrite
else aMode := fmCreate;
Stream := TFileStream.Create(filename, aMode, fmShareDenyWrite); { Allow read during our writes }
TRY
sUTF8 := Utf8Encode(aString); { UTF-16 to UTF-8 conversion }
if (aMode = fmCreate) AND WritePreamble then
begin
preamble := TEncoding.UTF8.GetPreamble;
Stream.WriteBuffer( PAnsiChar(preamble)^, Length(preamble));
end;
if aMode = fmOpenReadWrite
then Stream.Position:= Stream.Size; { Go to the end }
Stream.WriteBuffer( PAnsiChar(sUTF8)^, Length(sUTF8) );
FINALLY
FreeAndNil(Stream);
END;
end;
procedure WriteToFile (CONST FileName: string; CONST aString: AnsiString; CONST WriteOp: WriteOperation);
begin
WriteToFile(FileName, String(aString), WriteOp, FALSE);
end;
function ReadFile(CONST FileName: string): String; {Tries to autodetermine the file type (ANSI, UTF8, UTF16, etc). Works with UNC paths }
begin
Result:= System.IOUtils.TFile.ReadAllText(FileName);
end;
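ReadFile above still goes through IOUtils, which the question wanted to avoid. As a rough sketch only, reading the file back without IOUtils could look like the following. ReadUtf8File is a hypothetical name, and the sketch assumes the file was written as UTF-8 (with or without a BOM), for example by WriteToFile above. It loads the whole file into memory, so it is not suitable for very large files.

```delphi
// Requires System.SysUtils and System.Classes in the uses clause.
// ReadUtf8File is a hypothetical helper; it assumes the file content is UTF-8.
function ReadUtf8File(const FileName: string): string;
var
  Stream: TFileStream;
  Bytes, Preamble: TBytes;
  Offset: Integer;
begin
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    SetLength(Bytes, Stream.Size);
    if Length(Bytes) > 0 then
      Stream.ReadBuffer(Bytes[0], Length(Bytes));
  finally
    Stream.Free;
  end;

  // Skip the UTF-8 BOM if the file starts with one.
  Offset := 0;
  Preamble := TEncoding.UTF8.GetPreamble;
  if (Length(Preamble) > 0) and (Length(Bytes) >= Length(Preamble)) and
     CompareMem(@Bytes[0], @Preamble[0], Length(Preamble)) then
    Offset := Length(Preamble);

  // Decode whatever follows the BOM; an empty remainder gives an empty string.
  if Length(Bytes) = Offset then
    Result := ''
  else
    Result := TEncoding.UTF8.GetString(Bytes, Offset, Length(Bytes) - Offset);
end;
```

A call such as S := ReadUtf8File(FileName) then returns the text as a native UnicodeString, without the BOM.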
I am developing a server and a mobile client that communicate over HTTP. The server is written in Delphi 7 (because it has to be compatible with old code); the client is a mobile application written in XE6. The server sends the client a stream of data that contains strings. The problem is related to encoding.
On the server I try to pass strings in UTF8:
//Writes string to stream
procedure TStreamWrap.WriteString(Value: string);
var
BytesCount: Longint;
UTF8: string;
begin
UTF8 := AnsiToUtf8(Value);
BytesCount := Length(UTF8);
WriteLongint(BytesCount); //It writes Longint to FStream: TStream
if BytesCount > 0 then
FStream.WriteBuffer(UTF8[1], BytesCount);
end;
As it's written in Delphi 7, Value is a single-byte string.
On the client I read the string as UTF-8 and convert it to Unicode:
//Reads string from current position of stream
function TStreamWrap.ReadString: string;
var
BytesCount: Longint;
UTF8: String;
begin
BytesCount := ReadLongint;
if BytesCount = 0 then
Result := ''
else
begin
SetLength(UTF8, BytesCount);
FStream.Read(Pointer(UTF8)^, BytesCount);
Result := UTF8ToUnicodeString(UTF8);
end;
end;
But it doesn't work: when I display the string with ShowMessage, the letters are wrong. So how do I store a string in Delphi 7 and restore it in XE6 in the mobile app? Should I add a BOM at the beginning of the data representing the string?
To read your UTF8 encoded string in your mobile application you use a byte array and the TEncoding class. Like this:
function TStreamWrap.ReadString: string;
var
ByteCount: Longint;
Bytes: TBytes;
begin
ByteCount := ReadLongint;
if ByteCount = 0 then
begin
Result := '';
exit;
end;
SetLength(Bytes, ByteCount);
FStream.Read(Pointer(Bytes)^, ByteCount);
Result := TEncoding.UTF8.GetString(Bytes);
end;
This code does what you need in XE6, but of course, this code will not compile in Delphi 7 because it uses TEncoding. What's more, your TStreamWrap.WriteString implementation does what you want in Delphi 7, but is broken in XE6.
Now it looks like you are using the same code base for both Delphi 7 and Delphi XE6 versions. Which means that you may need to use some conditional compilation to handle the treatment of text which differs between these versions.
Personally I would do this by following the example of TEncoding. What you need is a function that converts a native Delphi string to a UTF-8 encoded byte array, and a corresponding function in the reverse direction.
So, let's consider the string to bytes function. I cannot remember whether or not Delphi 7 has a TBytes type. I suspect not. So let us define it:
{$IFNDEF UNICODE} // definitely use a better conditional than this in real code
type
TBytes = array of Byte;
{$ENDIF}
Then we can define our function:
function StringToUTF8Bytes(const s: string): TBytes;
{$IFDEF UNICODE}
begin
Result := TEncoding.UTF8.GetBytes(s);
end;
{$ELSE}
var
UTF8: UTF8String;
begin
UTF8 := AnsiToUtf8(s);
SetLength(Result, Length(UTF8));
Move(Pointer(UTF8)^, Pointer(Result)^, Length(Result));
end;
{$ENDIF}
The function in the opposite direction should be trivial for you to produce.
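For completeness, one possible shape of that reverse function is sketched here. The name UTF8BytesToString is an assumption (it does not appear in the original answer), and the sketch relies on the TBytes declaration and the same conditional as above.

```delphi
// Requires SysUtils (System.SysUtils in XE2 and later) in the uses clause.
// UTF8BytesToString is a hypothetical name mirroring StringToUTF8Bytes.
function UTF8BytesToString(const Bytes: TBytes): string;
{$IFDEF UNICODE}
begin
  if Length(Bytes) = 0 then
    Result := ''
  else
    Result := TEncoding.UTF8.GetString(Bytes);
end;
{$ELSE}
var
  UTF8: UTF8String;
begin
  // Copy the raw bytes into an 8-bit string, then convert UTF-8 to ANSI.
  SetLength(UTF8, Length(Bytes));
  if Length(Bytes) > 0 then
    Move(Bytes[0], UTF8[1], Length(Bytes));
  Result := Utf8ToAnsi(UTF8);
end;
{$ENDIF}
```

In the non-Unicode branch the result is an ANSI string, so characters outside the system code page cannot be represented there. That limitation is inherent to Delphi 7's string type, not to this helper.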
Once you have the differences in handling of text encoding between the two Delphi versions encapsulated, you can then write conditional free code in the rest of your program. For example, you would code WriteString like this:
procedure TStreamWrap.WriteString(const Value: string);
var
UTF8: TBytes;
ByteCount: Longint;
begin
UTF8 := StringToUTF8Bytes(Value);
ByteCount := Length(UTF8);
WriteLongint(ByteCount);
if ByteCount > 0 then
FStream.WriteBuffer(Pointer(UTF8)^, ByteCount);
end;
Instead of
Utf8 : String;
Use
Utf8 : Utf8String;
on the client. Then the conversion is automatic.
EDIT: Since the client is on a mobile platform, and Embarcadero has decided to eliminate the 8-bit strings in mobile compilers, the above won't work for this particular case. But in other cases where you have an 8-bit UTF-8 encoded string, the Utf8String can be used to seamlessly convert back and forth between UTF-8 and Unicode strings without the need to use explicit UTF-8 conversion functions. Just use it like
UnicodeStringVariable := Utf8StringVariable;
or
Utf8StringVariable := UnicodeStringVariable;
and the compiler will insert the appropriate conversion.
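A small console sketch of that round trip, assuming a desktop compiler from Delphi 2009 onwards (the program name is made up). It uses explicit casts, which perform the same conversion as the plain assignments but avoid the implicit string cast warnings.

```delphi
program Utf8StringDemo; // hypothetical name, for illustration only
{$APPTYPE CONSOLE}
var
  U: UnicodeString;
  S: UTF8String;
begin
  U := #$00E9;            // one UTF-16 code unit (U+00E9)
  S := UTF8String(U);     // compiler-generated UTF-16 to UTF-8 conversion
  Writeln(Length(S));     // 2, the UTF-8 byte count
  U := UnicodeString(S);  // and back to UTF-16
  Writeln(Length(U));     // 1
end.
```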
I am implementing a Ping function using the Windows API in Delphi XE3, based on the code from here
(http://delphi.about.com/od/internetintranet/l/aa081503a.htm).
I am having a problem with the following function. It reports the error "incompatible types: PAnsiChar and PWideChar". I replaced PChar with PAnsiChar, and now it raises the exception
'Error getting IP from HostName'.
I am testing it with localhost.
Please advise what the proper conversion is.
const ADP_IP = '127.0.0.1';
procedure TranslateStringToTInAddr(AIP: string; var AInAddr);
var
phe: PHostEnt;
pac: PChar;
GInitData: TWSAData;
begin
WSAStartup($101, GInitData);
try
phe := GetHostByName(PChar(AIP));
if Assigned(phe) then
begin
pac := phe^.h_addr_list^;
if Assigned(pac) then
begin
with TIPAddr(AInAddr).S_un_b do begin
s_b1 := Byte(pac[0]);
s_b2 := Byte(pac[1]);
s_b3 := Byte(pac[2]);
s_b4 := Byte(pac[3]);
end;
end
else
begin
raise Exception.Create('Error getting IP from HostName');
end;
end
else
begin
raise Exception.Create('Error getting HostName');
end;
except
FillChar(AInAddr, SizeOf(AInAddr), #0);
end;
WSACleanup;
end;
You don't want to convert from PAnsiChar to PWideChar. On your Unicode Delphi your PChar maps to PWideChar. But gethostbyname receives PAnsiChar. You need to convert from Unicode to ANSI.
Code it like this:
phe := gethostbyname(PAnsiChar(AnsiString(AIP)));
In other words, convert your string to AnsiString, and then cast as PAnsiChar. Personally I'd declare the AIP parameter to be AnsiString.
procedure TranslateStringToTInAddr(const AIP: AnsiString; var AInAddr);
And then write the call to gethostbyname like so:
phe := gethostbyname(PAnsiChar(AIP));
That untyped var parameter looks dubious to me. I see no compelling reason for its use. What's wrong with declaring it to be of type TIPAddr? Your FillChar is somewhat dubious. How can you use SizeOf on an untyped parameter?
Is it possible to convert the XML to UTF-8 encoding in Delphi 6?
Currently this is what I am doing:
Fill TXMLDocument with AnsiString
At the end, convert the data to UTF-8 by using WideStringVariable := AnsiToUtf8(Doc.XML.Text);
Save the value of WideStringVariable to a file using TFileStream, adding the UTF-8 BOM at the beginning of the file.
CODE:
Procedure SaveAsUTF8( const Name:String; Data: TStrings );
const
cUTF8 = $BFBBEF;
var
W_TXT: WideString;
fs: TFileStream;
wBOM: Integer;
begin
if TRIM(Data.Text) <> '' then begin
W_TXT:= AnsiToUTF8(Data.Text);
fs:= Tfilestream.create( Name, fmCreate );
try
wBOM := cUTF8;
fs.WriteBUffer( wBOM, sizeof(wBOM)-1);
fs.WriteBuffer( W_TXT[1], Length(W_TXT)*Sizeof( W_TXT[1] ));
finally
fs.free
end;
end;
end;
If I open the file in Notepad++ or another editor that detects encoding, it shows me UTF-8 with BOM. However, it seems that the text is not properly encoded.
What is wrong and how can I fix it?
UPDATE: XML Properties:
XMLDoc.Version := '1.0';
XMLDoc.Encoding := 'UTF-8';
XMLDoc.StandAlone := 'yes';
You can save the file using the standard SaveToFile method of the TXMLDocument variable: http://docs.embarcadero.com/products/rad_studio/delphiAndcpp2009/HelpUpdate2/EN/html/delphivclwin32/XMLDoc_TXMLDocument_SaveToFile.html
Whether or not the file is UTF-8 you have to check using local tools such as the aforementioned Notepad++, a hex editor, or anything else.
If you insist on using an intermediate string and a file stream, you should use the proper variable type. AnsiToUTF8 returns the UTF8String type, and that is what should be used.
Compiling `WideStringVar := AnsiStringSource` would issue a compiler warning, and it is a proper warning. Googling for "Delphi WideString", or reading the Delphi manuals on the topic, shows that WideString, aka Microsoft OLE BSTR, keeps data in UTF-16 format. http://delphi.about.com/od/beginners/l/aa071800a.htm
Thus the assignment UTF-16 string <= 8-bit source necessarily converts the data, so dumping WideString data cannot be dumping UTF-8 text, by the definition of WideString.
Procedure SaveAsUTF8( const Name:String; Data: TStrings );
const
cUTF8: array [1..3] of byte = ($EF,$BB,$BF);
var
W_TXT: UTF8String;
fs: TFileStream;
Trimmed: AnsiString;
begin
Trimmed := TRIM(Data.Text);
if Trimmed <> '' then begin
W_TXT:= AnsiToUTF8(Trimmed);
fs:= TFileStream.Create( Name, fmCreate );
try
fs.WriteBuffer( cUTF8[1], sizeof(cUTF8) );
fs.WriteBuffer( W_TXT[1], Length(W_TXT)*Sizeof( W_TXT[1] ));
finally
fs.free
end;
end;
end;
BTW, this code of yours would not create even an empty file if the source data were empty. That looks rather suspicious, though it is up to you to decide whether that is an error with respect to the rest of your program.
The proper "uploading" of the received file or stream to the web is yet another issue (to be asked as a separate question on a Q&A site like SO), related to testing conformance with HTTP. As a starting point, you can read some hints at WWW server reports error after POST Request by Internet Direct components in Delphi
In order to have the correct encoding inside the document, you should set it by using the Encoding property in your XML Document, like this:
myXMLDocument.Encoding := 'UTF-8';
I hope this helps.
You simply need to call the SaveToFile method of the document:
XMLDoc.SaveToFile(FileName);
Since you specified the encoding already, the component will use that encoding.
This won't include a BOM, but that's generally what you want for an XML file. The content of the file will specify the encoding.
As regards your SaveAsUTF8 method, it is not needed, but it is easy to fix. And that may be instructive to you.
The problem is that you are converting to UTF-16 when you assign to a WideString variable. You should instead put the UTF-8 text into an AnsiString variable. Changing the type of the variable that you named W_TXT to AnsiString is enough.
The function might look like this:
Procedure SaveAsUTF8(const Name: string; Data: TStrings);
const
UTF8BOM: array [0..2] of AnsiChar = #$EF#$BB#$BF;
var
utf8: AnsiString;
fs: TFileStream;
begin
utf8 := AnsiToUTF8(Data.Text);
fs:= Tfilestream.create(Name, fmCreate);
try
fs.WriteBuffer(UTF8BOM, SizeOf(UTF8BOM));
fs.WriteBuffer(Pointer(utf8)^, Length(utf8));
finally
fs.free;
end;
end;
Another solution (for Delphi 2009 and later, where TStreamWriter and TEncoding are available):
procedure SaveAsUTF8(const Name: string; Data: TStrings);
var
fs: TFileStream;
vStreamWriter: TStreamWriter;
begin
fs := TFileStream.Create(Name, fmCreate);
try
vStreamWriter := TStreamWriter.Create(fs, TEncoding.UTF8);
try
vStreamWriter.Write(Data.Text);
finally
vStreamWriter.Free;
end;
finally
fs.free;
end;
end;