I have the following setup:
- Windows system language is English.
- I use Delphi 10.1 Berlin.
- In Windows Region & Language, Country is set to Japan.
- Under Region/Administrative, Language for non-Unicode programs is set to Japanese (Japan).
I have implemented client/server communication using strings.
Let's skip the question 'why not bytes' for now. I want to show the issue and find the reason why.
I write two things into a TStringStream:
- Header: its own size as an Int64 (8 bytes), the object size as an Int64 (8 bytes), and the class name (header length - 2*SizeOf(Int64) bytes).
- Object: a TComponent descendant.
procedure ComponentToStream(AComponent: TComponent; AStream: TStream; out HL,OL: Int64);
var
CN: TBytes;
MS1: TMemoryStream;
begin
MS1 := TMemoryStream.Create;
try
CN := TEncoding.Unicode.GetBytes(AComponent.ClassName);
SaveComponentToStream(MS1, AComponent);
OL := MS1.Size;
MS1.Position := 0;
HL := SizeOf(HL) + SizeOf(OL) + Length(CN);
AStream.Write(HL,SizeOf(HL));
AStream.Write(OL,SizeOf(OL));
AStream.Write(CN[0], Length(CN));
MS1.SaveToStream(AStream);
finally
FreeAndNil(MS1);
end;
end;
function PrepareDataBeforeSend(Component: TComponent): string;
var
HL, OL: Int64;
SS: TStringStream;
begin
SS := TStringStream.Create('', TEncoding.Unicode);
try
ComponentToStream(Component, SS, HL, OL);
Result := SS.DataString;
SS.SaveToFile('Orginal stream data.debug');
finally
FreeAndNil(SS);
end;
end;
The result of this method is saved in a file here (click).
To verify the data, I ran the code below right after calling the function above.
SS := TStringStream.Create({PrepareDataBeforeSend result}, TEncoding.Unicode);
SS.SaveToFile('New stream data.debug');
SS.Free;
The saved binary can be found here (click).
And now the two problems:
If I don't explicitly pass TEncoding.Unicode to the TStringStream constructor, TEncoding.Default is used. With the Japanese code page that is ANSI, while with English it is Unicode. As a result, the object size I read later with
SS.Read(OL, SizeOf(OL));
is wrong.
Here's the binary to compare (see bytes 8-15): Click
OK, issue 1 is resolved, but the binary I saved for verification still does not match the original one: 1 byte is missing at the end.
Can anyone tell me where the problem is?
Important: there are no issues with the English localization!
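For reference, the header layout described above can be read back with a routine like the following. This is only a sketch: the name ReadHeaderFromStream and the sanity check on the header length are assumptions, not part of the original code. It reads from the same kind of stream that ComponentToStream wrote to and decodes the class name with the same TEncoding.Unicode that was used to encode it.

```pascal
uses
  SysUtils, Classes;

// Hypothetical counterpart to ComponentToStream: reads the header back.
// HL = total header length in bytes, OL = object size in bytes.
procedure ReadHeaderFromStream(AStream: TStream; out HL, OL: Int64;
  out AClassName: string);
var
  CN: TBytes;
  NameBytes: Int64;
begin
  AStream.ReadBuffer(HL, SizeOf(HL));
  AStream.ReadBuffer(OL, SizeOf(OL));
  // The class name occupies whatever is left of the header.
  NameBytes := HL - 2 * SizeOf(Int64);
  if (NameBytes < 0) or (NameBytes > MaxInt) then
    raise EStreamError.Create('Invalid header length');
  SetLength(CN, Integer(NameBytes));
  if Length(CN) > 0 then
    AStream.ReadBuffer(CN[0], Length(CN));
  // ComponentToStream encoded the name with TEncoding.Unicode (UTF-16 LE).
  AClassName := TEncoding.Unicode.GetString(CN);
end;
```

After this call the stream position is at the first byte of the object data, and OL says how many bytes to expect there, which makes it easy to compare against the number of bytes actually remaining in the stream.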
Related
I want to achieve a very very basic task in Delphi: to save a string to disk and load it back. It seems trivial but I had problems doing this TWICE since I upgraded to IOUtils (and one more time before that... this is why I took the 'brilliant' decision to upgrade to IOUtils).
I use something like this:
procedure WriteToFile(CONST FileName: string; CONST uString: string; CONST WriteOp: WriteOperation);
begin
if WriteOp= (woOverwrite)
then IOUtils.TFile.WriteAllText (FileName, uString) //overwrite
else IOUtils.TFile.AppendAllText(FileName, uString); //append
end;
Simple, right? What could go wrong? Well, I recently ran into (another) bug in IOUtils. So, TFile is buggy. The bug is detailed here.
Can anyone share an alternative (or simply your thoughts/ideas) that is not based on IOUtils and is known to work? Well... the code above also worked for a while for me... So I know it is difficult to guarantee that a piece of code (no matter how small) will really work!
Also, I would REALLY like my WriteToFile procedure to save the string to an ANSI file when possible (when uString contains only ANSI chars) and as Unicode otherwise.
Then the ReadAFile function should automagically detect the encoding and correctly read the string back.
The idea is that there are still text editors out there that will wrongly open/interpret a Unicode/UTF file. So, whenever possible, give a good old ANSI text file to the user.
So:
- Overwrite/Append
- Save as ANSI when possible
- Memory efficient (don't eat 4GB of ram when the file to load is 2GB)
- Should work with any text file (up to 2GB, obviously)
- No IOUtils (too buggy to be of use)
Then the ReadAFile function should automagically detect the encoding and correctly read the string back.
This is not possible. There exist files that are well-formed under more than one text encoding, so the encoding cannot be determined from the content alone. For instance see The Notepad file encoding problem, redux.
This means that your goals are unattainable and that you need to change them.
My advice is to do the following:
Pick a single encoding, UTF-8, and stick to it.
If the file does not exist, create it and write UTF-8 bytes to it.
If the file exists, open it, seek to the end, and append UTF-8 bytes.
A text editor that does not understand UTF-8 is not worth supporting. If you feel inclined, include a UTF-8 BOM when you create the file. Use TEncoding.UTF8.GetBytes and TEncoding.UTF8.GetString to encode and decode.
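A minimal sketch of the read side of that advice, assuming the whole file fits comfortably in memory. The name ReadUtf8File is hypothetical. It reads the raw bytes, skips a UTF-8 BOM if one is present, and decodes the rest with TEncoding.UTF8.GetString:

```pascal
uses
  SysUtils, Classes;

// Hypothetical helper: loads a file that is known to be UTF-8 encoded.
function ReadUtf8File(const FileName: string): string;
var
  Stream: TFileStream;
  Bytes, Preamble: TBytes;
  Offset: Integer;
begin
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    SetLength(Bytes, Stream.Size);
    if Length(Bytes) > 0 then
      Stream.ReadBuffer(Bytes[0], Length(Bytes));
  finally
    Stream.Free;
  end;

  if Length(Bytes) = 0 then
    Exit('');

  // Skip the BOM when the file starts with one; it is not part of the text.
  Preamble := TEncoding.UTF8.GetPreamble;
  Offset := 0;
  if (Length(Preamble) > 0) and (Length(Bytes) >= Length(Preamble)) and
     CompareMem(@Bytes[0], @Preamble[0], Length(Preamble)) then
    Offset := Length(Preamble);

  Result := TEncoding.UTF8.GetString(Bytes, Offset, Length(Bytes) - Offset);
end;
```

Because the encoding is fixed by convention, no detection step is needed. The BOM check only exists so that files written with and without a preamble decode to the same string.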
Just use TStringList, as long as the file is smaller than roughly 50-100 MB (it depends on CPU speed):
procedure ReadTextFromFile(const AFileName: string; SL: TStringList);
begin
SL.Clear;
SL.DefaultEncoding:=TEncoding.ANSI; // we know that old files have this encoding
SL.LoadFromFile(AFileName, nil); // let TStringList detect the real encoding;
// if it cannot, it just uses DefaultEncoding.
end;
procedure WriteTextToFile(const AFileName: string; const TextToWrite: string);
var
SL: TStringList;
begin
SL:=TStringList.Create;
try
ReadTextFromFile(AFileName, SL); // read all file with encoding detection
SL.Add(TextToWrite);
SL.SaveToFile(AFileName, TEncoding.UTF8); // write file with new encoding.
// DO NOT SET SL.WriteBOM to False!!!
finally
SL.Free;
end;
end;
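As far as I know, the detection that TStringList performs when it is passed a nil encoding is based on the byte order mark, which is why the comment above warns against turning WriteBOM off. The sketch below shows the same kind of check done by hand with TEncoding.GetBufferEncoding. The helper name StartsWithUtf8Bom is hypothetical.

```pascal
uses
  SysUtils;

// Hypothetical helper: True only when Bytes starts with the UTF-8 BOM.
function StartsWithUtf8Bom(const Bytes: TBytes): Boolean;
var
  Detected: TEncoding;
  BomLength: Integer;
begin
  Detected := nil; // nil asks GetBufferEncoding to choose based on the BOM
  BomLength := TEncoding.GetBufferEncoding(Bytes, Detected);
  // BomLength is 0 when no known BOM was found; Detected is then a fallback.
  Result := (BomLength > 0) and (Detected = TEncoding.UTF8);
end;
```

A file without a BOM gives a result of False whether it holds ANSI text or BOM-less UTF-8, so in that case the caller still has to decide which encoding to assume.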
The IniFiles unit should support Unicode, at least according to this answer: How do I read a UTF8 encoded INI file?
Inifiles are quite commonly used to store strings, integers, booleans and even stringlists.
procedure TConfig.ReadValues();
var
appINI: TIniFile;
begin
appINI := TIniFile.Create(ChangeFileExt(Application.ExeName,'.ini'));
try
FMainScreen_Top := appINI.ReadInteger('Options', 'MainScreen_Top', -1);
FMainScreen_Left := appINI.ReadInteger('Options', 'MainScreen_Left', -1);
FUserName := appINI.ReadString('Login', 'UserName', '');
FDevMode := appINI.ReadBool('Globals', 'DevMode', False);
finally
appINI.Free;
end;
end;
procedure TConfig.WriteValues(OnlyWriteAnalyzer: Boolean);
var
appINI: TIniFile;
begin
appINI := TIniFile.Create(ChangeFileExt(Application.ExeName,'.ini'));
try
appINI.WriteInteger('Options', 'MainScreen_Top', FMainScreen_Top);
appINI.WriteInteger('Options', 'MainScreen_Left', FMainScreen_Left);
appINI.WriteString('Login', 'UserName', FUserName);
appINI.WriteBool('Globals', 'DevMode', FDevMode);
finally
appINI.Free;
end;
end;
Also see the Embarcadero documentation on INI files: http://docwiki.embarcadero.com/Libraries/Seattle/en/System.IniFiles.TIniFile
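If TIniFile does not round-trip your Unicode values, one option in the Unicode versions of Delphi is TMemIniFile, whose constructor accepts an encoding. The sketch below assumes the same 'Login'/'UserName' entry as the example above. The two helper names are hypothetical.

```pascal
uses
  SysUtils, IniFiles;

// Hypothetical helpers; section and key names follow the example above.
function ReadUserNameUtf8(const IniFileName: string): string;
var
  Ini: TMemIniFile;
begin
  // The encoding argument tells TMemIniFile how to decode the file's bytes.
  Ini := TMemIniFile.Create(IniFileName, TEncoding.UTF8);
  try
    Result := Ini.ReadString('Login', 'UserName', '');
  finally
    Ini.Free;
  end;
end;

procedure WriteUserNameUtf8(const IniFileName, UserName: string);
var
  Ini: TMemIniFile;
begin
  Ini := TMemIniFile.Create(IniFileName, TEncoding.UTF8);
  try
    Ini.WriteString('Login', 'UserName', UserName);
    Ini.UpdateFile; // TMemIniFile keeps changes in memory until this call
  finally
    Ini.Free;
  end;
end;
```

Note the design difference: TMemIniFile loads the whole file once and only writes it back on UpdateFile, so forgetting that call silently discards the changes.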
Code based on David's suggestions:
{--------------------------------------------------------------------------------------------------
READ/WRITE UNICODE
--------------------------------------------------------------------------------------------------}
procedure WriteToFile(CONST FileName: string; CONST aString: String; CONST WriteOp: WriteOperation= woOverwrite; WritePreamble: Boolean= FALSE); { Write Unicode strings to a UTF8 file. It can also write a preamble }
VAR
Stream: TFileStream;
Preamble: TBytes;
sUTF8: RawByteString;
aMode: Integer;
begin
ForceDirectories(ExtractFilePath(FileName));
if (WriteOp= woAppend) AND FileExists(FileName)
then aMode := fmOpenReadWrite
else aMode := fmCreate;
Stream := TFileStream.Create(FileName, aMode or fmShareDenyWrite); { Share flags belong in the Mode argument (the third parameter is Rights); allows reads during our writes }
TRY
sUTF8 := Utf8Encode(aString); { UTF-16 to UTF-8 conversion; the result is a RawByteString holding the UTF-8 bytes }
if (aMode = fmCreate) AND WritePreamble then
begin
preamble := TEncoding.UTF8.GetPreamble;
Stream.WriteBuffer( PAnsiChar(preamble)^, Length(preamble));
end;
if aMode = fmOpenReadWrite
then Stream.Position:= Stream.Size; { Go to the end }
Stream.WriteBuffer( PAnsiChar(sUTF8)^, Length(sUTF8) );
FINALLY
FreeAndNil(Stream);
END;
end;
procedure WriteToFile (CONST FileName: string; CONST aString: AnsiString; CONST WriteOp: WriteOperation);
begin
WriteToFile(FileName, String(aString), WriteOp, FALSE);
end;
function ReadFile(CONST FileName: string): String; {Tries to autodetermine the file type (ANSI, UTF8, UTF16, etc). Works with UNC paths }
begin
Result:= System.IOUtils.TFile.ReadAllText(FileName);
end;
I am working on an application which was recently upgraded from Delphi 2007 to XE7. There is one particular scenario where the conversion of TMemoryStream to PChar is failing. Here is the code:
procedure TCReport.CopyToClipboard;
var
CTextStream: TMemoryStream;
PValue: PChar;
begin
CTextStream := TMemoryStream.Create;
//Assume that this code is saving a report column to CTextStream
//Verified that the value in CTextStream is correct
Self.SaveToTextStream(CTextStream);
//The value stored in PValue below is corrupt
PValue := StrAlloc(CTextStream.Size + 1);
CTextStream.Read(PValue^, CTextStream.Size + 1);
PValue[CTextStream.Size] := #0;
{ Copy text stream to clipboard }
Clipboard.Clear;
Clipboard.SetTextBuf(PValue);
CTextStream.Free;
StrDispose(PValue);
end;
Adding the code for SaveToTextStream:
procedure TCReport.SaveToTextStream(CTextStream: TStream);
var
CBinaryMemoryStream: TMemoryStream;
CWriter: TWriter;
begin
CBinaryMemoryStream := TMemoryStream.Create;
CWriter := TWriter.Create(CBinaryMemoryStream, 24);
try
CWriter.Ancestor := nil;
CWriter.WriteRootComponent(Self);
CWriter.Free;
CBinaryMemoryStream.Position := 0;
{ Convert Binary 'WriteComponent' stream to text}
ObjectBinaryToText(CBinaryMemoryStream, CTextStream);
CTextStream.Position := 0;
finally
CBinaryMemoryStream.Free;
end;
end;
I observed that StrLen(PValue) also comes out at half the size of the TMemoryStream, whereas in Delphi 2007 it was the same as the size of the TMemoryStream.
I know that the above code assumes that a char is 1 byte in size, and that could be the problem. But I tried multiple approaches, and nothing worked.
Could you suggest a better way to go about this conversion?
Yet again, this is the issue of Delphi 2009 and later using Unicode text. In Delphi 2007 and earlier:
Char is an alias to AnsiChar.
PChar is an alias to PAnsiChar.
string is an alias to AnsiString.
In Delphi 2009 and later:
Char is an alias to WideChar.
PChar is an alias to PWideChar.
string is an alias to UnicodeString.
Your code is written assuming that PChar is PAnsiChar. Hence your problems. You need to stop using StrAlloc anyway. You are making life hard for yourself by manually allocating heap memory here. Let the compiler do the work.
You need to obtain your text in a string variable, and then simply do:
Clipboard.AsText := MyStrVariable;
Exactly how best to obtain the string depends on the facilities that TCReport offers. I expect that it will yield a string directly in which case you'll write something like this:
procedure TCReport.CopyToClipboard;
begin
Clipboard.AsText := Self.ReportAsText;
end;
I'm guessing as to what functionality your TCReport offers, but I'm sure you know.
Following what hvd and David Heffernan wrote above, one possible way is to change CTextStream in CopyToClipboard to a TStringStream, as follows:
procedure TCReport.CopyToClipboard;
var
CTextStream: TStringStream;
begin
CTextStream := TStringStream.Create;
try
//Assume no error with Self.SaveToTextStream
Self.SaveToTextStream(CTextStream);
{ Copy text stream to clipboard }
Clipboard.AsText := CTextStream.DataString;
finally
CTextStream.Free;
end;
end;
But you should make sure that SaveToTextStream writes its text to CTextStream in the same encoding the TStringStream was created with; otherwise DataString will decode it incorrectly.
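To make the encoding explicit rather than implied by the stream class, the bytes can be decoded by hand. This is a sketch: StreamToString is a hypothetical helper, and which encoding SaveToTextStream actually produces is something you would need to verify for your own data.

```pascal
uses
  SysUtils, Classes;

// Hypothetical helper: decodes the whole stream with an explicitly
// chosen encoding instead of relying on a stream's default.
function StreamToString(Stream: TStream; Encoding: TEncoding): string;
var
  Bytes: TBytes;
begin
  Stream.Position := 0;
  SetLength(Bytes, Stream.Size);
  if Length(Bytes) > 0 then
    Stream.ReadBuffer(Bytes[0], Length(Bytes));
  Result := Encoding.GetString(Bytes);
end;
```

With this in place, CopyToClipboard could keep its TMemoryStream and end with something like Clipboard.AsText := StreamToString(CTextStream, TEncoding.Default), where TEncoding.Default is only an assumption about what the text stream contains.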
I am trying to copy a file to the clipboard. All the examples on the Internet are the same. I am using one from http://embarcadero.newsgroups.archived.at/public.delphi.nativeapi/200909/0909212186.html but it does not work.
I use RAD Studio XE and I pass the complete path. In debug mode, I get some warnings like:
Debug Output:
Invalid address specified to RtlSizeHeap( 006E0000, 007196D8 )
Invalid address specified to RtlSizeHeap( 006E0000, 007196D8 )
I am not sure whether my environment is relevant: Windows 8.1 64-bit, RAD Studio XE.
When I try to paste from the clipboard, nothing happens. Also, when I inspect the clipboard with a monitoring tool, the tool shows an error.
The code is:
procedure TfrmDoc2.CopyFilesToClipboard(FileList: string);
var
DropFiles: PDropFiles;
hGlobal: THandle;
iLen: Integer;
begin
iLen := Length(FileList) + 2;
FileList := FileList + #0#0;
hGlobal := GlobalAlloc(GMEM_SHARE or GMEM_MOVEABLE or GMEM_ZEROINIT,
SizeOf(TDropFiles) + iLen);
if (hGlobal = 0) then raise Exception.Create('Could not allocate memory.');
begin
DropFiles := GlobalLock(hGlobal);
DropFiles^.pFiles := SizeOf(TDropFiles);
Move(FileList[1], (PChar(DropFiles) + SizeOf(TDropFiles))^, iLen);
GlobalUnlock(hGlobal);
Clipboard.SetAsHandle(CF_HDROP, hGlobal);
end;
end;
UPDATE:
I am sorry, I feel stupid. In my project I used the code that did not work, the one from the original question that somebody asked, while I thought I had used Remy's code, the correct solution. Now, using Remy's code, everything works great. Sorry for the mistake.
The forum post you link to contains the code in your question and asks why it doesn't work. Not surprisingly the code doesn't work for you any more than it did for the asker.
The answer that Remy gives is that there is a mismatch between ANSI and Unicode. The code is for ANSI but the compiler is Unicode.
So click on Remy's reply and do what it says: http://embarcadero.newsgroups.archived.at/public.delphi.nativeapi/200909/0909212187.html
Essentially you need to adapt the code to account for characters being 2 bytes wide in Unicode Delphi, but I see no real purpose repeating Remy's code here.
However, I'd say that you can do better than this code. The problem with this code is that it mixes every aspect all into one big function that does it all. What's more, the function is a method of a form in your GUI which is really the wrong place for it. There are aspects of the code that you might be able to re-use, but not factored like that.
I'd start with a function that puts a known block of memory into the clipboard.
procedure ClipboardError;
begin
raise Exception.Create('Could not complete clipboard operation.');
// substitute something more specific than Exception in your code
end;
procedure CheckClipboardHandle(Handle: HGLOBAL);
begin
if Handle=0 then begin
ClipboardError;
end;
end;
procedure CheckClipboardPtr(Ptr: Pointer);
begin
if not Assigned(Ptr) then begin
ClipboardError;
end;
end;
procedure PutInClipboard(ClipboardFormat: UINT; Buffer: Pointer; Count: Integer);
var
Handle: HGLOBAL;
Ptr: Pointer;
begin
Clipboard.Open;
Try
Handle := GlobalAlloc(GMEM_MOVEABLE, Count);
Try
CheckClipboardHandle(Handle);
Ptr := GlobalLock(Handle);
CheckClipboardPtr(Ptr);
Move(Buffer^, Ptr^, Count);
GlobalUnlock(Handle);
Clipboard.SetAsHandle(ClipboardFormat, Handle);
Except
GlobalFree(Handle);
raise;
End;
Finally
Clipboard.Close;
End;
end;
We're also going to need to be able to make double-null terminated lists of strings. Like this:
function DoubleNullTerminatedString(const Values: array of string): string;
var
Value: string;
begin
Result := '';
for Value in Values do
Result := Result + Value + #0;
Result := Result + #0;
end;
Perhaps you might add an overload that accepted a TStrings instance.
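A sketch of what that overload could look like. Both versions need the overload directive to coexist, so the array version is repeated here with it added. The TStrings variant is an illustration, not code from the original answer.

```pascal
uses
  Classes;

// Array version from above, with the overload directive added.
function DoubleNullTerminatedString(const Values: array of string): string; overload;
var
  Value: string;
begin
  Result := '';
  for Value in Values do
    Result := Result + Value + #0;
  Result := Result + #0;
end;

// Illustrative TStrings version: same layout, one #0 after each item
// and a final #0 closing the list.
function DoubleNullTerminatedString(Values: TStrings): string; overload;
var
  I: Integer;
begin
  Result := '';
  for I := 0 to Values.Count - 1 do
    Result := Result + Values[I] + #0;
  Result := Result + #0;
end;
```

The TStrings version is convenient when the file names come straight from something like TOpenDialog.Files.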
Now that we have all this we can concentrate on making the structure needed for the CF_HDROP format.
procedure CopyFileNamesToClipboard(const FileNames: array of string);
var
Size: Integer;
FileList: string;
DropFiles: PDropFiles;
begin
FileList := DoubleNullTerminatedString(FileNames);
Size := SizeOf(TDropFiles) + ByteLength(FileList);
DropFiles := AllocMem(Size);
try
DropFiles.pFiles := SizeOf(TDropFiles);
DropFiles.fWide := True;
Move(Pointer(FileList)^, (PByte(DropFiles) + SizeOf(TDropFiles))^,
ByteLength(FileList));
PutInClipboard(CF_HDROP, DropFiles, Size);
finally
FreeMem(DropFiles);
end;
end;
Since you use Delphi XE, strings are Unicode, but you are not taking the size of a character into account when you allocate and move memory.
Change the line allocating memory to
hGlobal := GlobalAlloc(GMEM_SHARE or GMEM_MOVEABLE or GMEM_ZEROINIT,
SizeOf(TDropFiles) + iLen * SizeOf(Char));
and the line copying memory, to
Move(FileList[1], (PByte(DropFiles) + SizeOf(TDropFiles))^, iLen * SizeOf(Char));
Note the inclusion of *SizeOf(Char) in both lines and change of PChar to PByte on second line.
Then, also set the fWide member of DropFiles to True
DropFiles^.fWide := True;
All of these changes are already in the code from Remy, referred to by David.
I read a UTF-8 file, made with WinWord, into a TMemo, using the code below (I tried both methods). The file contains IPA pronunciation characters. For these characters, I see only squares. I tried different values of TMemo.Font.Charset, but it did not help.
What can I do?
Peter
// OD is a TOpenDialog
procedure TForm1.Load1Click(Sender: TObject);
{
var fileH: textFile;
newLine: RawByteString;
begin
if od.execute (self.Handle) then begin
assignFile(fileH,od.filename);
reset(fileH);
while not eof(fileH) do begin
readln(fileH,newLine);
Memo1.lines.Add(UTF8toString(newLine));
end;
closeFile(fileH);
end;
end;
}
var
FileStream: tFileStream;
Preamble: TBytes;
memStream: TMemoryStream;
begin
if od.Execute then
begin
FileStream := TFileStream.Create(od.FileName,fmOpenRead or fmShareDenyWrite);
MemStream := TMemoryStream.Create;
Preamble := TEncoding.UTF8.GetPreamble;
memStream.Write(Preamble[0],length(Preamble));
memStream.CopyFrom(FileStream,FileStream.Size);
memStream.Seek(0,soFromBeginning);
memo1.Lines.LoadFromStream(memStream);
showmessage(SysErrorMessage(GetLastError));
FileStream.Free;
memStream.Free;
end;
end;
First, you are doing too much work. Your code can be simplified to this:
procedure TForm1.Load1Click(Sender: TObject);
begin
if od.Execute then
memo1.Lines.LoadFromFile(od.FileName, TEncoding.UTF8);
end;
Second, as David said, you need to use a font that supports the Unicode characters/glyphs that are stored in the file. It is not enough to set the Font.Charset, you have to set the Font.Name to a compatible font. Look at the fonts that loursonwinny mentioned.
For these characters, I see only squares.
The squares indicate that the font does not contain glyphs for those characters. You'll need to switch to a font that does, assuming that your file has been properly encoded and that you are reading in the code points that you intend to.
You can pass TEncoding.UTF8 to the LoadFromFile method to avoid having to add a BOM to the content. Finally, don't call GetLastError unless the Win32 documentation says it has meaning. Where you call it, there is no reason to believe that the value has any meaning.
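Putting the two pieces of advice together, a sketch might look like this. LoadUtf8IntoMemo is a hypothetical helper, and the font name is left to the caller because which installed font covers the IPA range is an assumption that has to be checked on the target machine.

```pascal
uses
  SysUtils, Classes, Forms, StdCtrls;

// Hypothetical helper; the font name passed in is the caller's choice.
procedure LoadUtf8IntoMemo(Memo: TMemo; const FileName, FontName: string);
begin
  // Only switch fonts when the requested one is actually installed.
  if Screen.Fonts.IndexOf(FontName) >= 0 then
    Memo.Font.Name := FontName;
  // Passing the encoding explicitly means no BOM has to be present or added.
  Memo.Lines.LoadFromFile(FileName, TEncoding.UTF8);
end;
```

A call could look like LoadUtf8IntoMemo(Memo1, od.FileName, 'Lucida Sans Unicode'), where the font name is only an example of a font that is commonly said to include IPA glyphs.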
This question mentions wcrypt2.
What I need is simply to calculate the MD5 of a file. It would be perfect if I could calculate it without having to save the file, because it is a downloaded file in stream format.
I would like to have the most straightforward way to do that.
Thanks!
Here is working code for Indy 10:
function MD5File(const FileName: string): string;
var
IdMD5: TIdHashMessageDigest5;
FS: TFileStream;
begin
IdMD5 := TIdHashMessageDigest5.Create;
FS := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
try
Result := IdMD5.HashStreamAsHex(FS)
finally
FS.Free;
IdMD5.Free;
end;
end;
Regards,
OscaR1
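Based on @dummzeuch's answer I wrote this function:
function getMD5checksum(s: TStream): string;
var
md5: TIdHashMessageDigest5;
hash : T4x4LongWordRecord;
begin
md5 := TIdHashMessageDigest5.Create;
try
s.Seek(0,0); // rewind so the whole stream is hashed
hash := md5.HashValue(s);
// 8 hex digits per 32-bit word; a width of 4 gives a variable-length result
result := IntToHex(Integer(hash[0]), 8) +
IntToHex(Integer(hash[1]), 8) +
IntToHex(Integer(hash[2]), 8) +
IntToHex(Integer(hash[3]), 8);
finally
md5.Free; // release the hash object, which was previously leaked
end;
end;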
Indy comes with functions for calculating several hashes, MD5 is one of them. Indy is included in all versions of Delphi since at least Delphi 2006 and available as a free download for older versions.
What about:
function GetFileMD5(const Stream: TStream): String; overload;
var MD5: TIdHashMessageDigest5;
begin
MD5 := TIdHashMessageDigest5.Create;
try
Result := MD5.HashStreamAsHex(Stream);
finally
MD5.Free;
end;
end;
function GetFileMD5(const Filename: String): String; overload;
var FileStream: TFileStream;
begin
FileStream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
try
Result := GetFileMD5(FileStream);
finally
FileStream.Free;
end;
end;
As you mentioned, the post you linked to talks about wcrypt2, which is a library of cryptographic routines, including MD5. The post you linked to also seems to indicate that it is available for Delphi 7 since the asker includes output labeled "Delphi 7." You have tagged this question delphi7, so I assume that's the version you're using, too. So what's stopping you from using wcrypt2?
The question links to a copy of wcrypt2.pas, and the copyright dates in that file appear to indicate that the unit was available by the time Delphi 7 was released. Check your installation; you might already have it. If not, then the unit also says that it was obtained via Project Jedi, so you could try looking there for the unit as well.
The answers to your referenced question include example Delphi code and the names of units that come with Delphi for doing MD5. They come with Delphi 2009, so you should check whether they're also available for your version.
Take a look at this implementation of MD5SUM in Delphi. It requires a string for input, but I imagine you can easily make it work with a stream.
MessageDigest_5 would work for this as well.
I use the following function in Delphi 7 with Indy 10.1.5
uses IdHashMessageDigest, idHash, Classes;
...
function cc_MD5File(const p_fileName : string) : string;
//returns MD5 hash for a file
var
v_idmd5 : TIdHashMessageDigest5;
v_fs : TFileStream;
v_hash : T4x4LongWordRecord;
begin
v_idmd5 := TIdHashMessageDigest5.Create;
v_fs := TFileStream.Create(p_fileName, fmOpenRead OR fmShareDenyWrite) ;
try
v_hash := v_idmd5.HashValue(v_fs);
result := v_idmd5.AsHex(v_hash);
finally
v_fs.Free;
v_idmd5.Free;
end;
end;
If you use Overbyte ICS (http://www.overbyte.eu/frame_index.html), just add the unit and call the FileMD5 function with the name of the file:
uses OverbyteIcsMd5;
....
function GetMd5File(const FileName: string): String;
begin
Result := FileMD5(FileName);
end;