File upload fails when posting with Indy and the filename contains Greek characters - Delphi

I am trying to implement a POST to a web service. I need to send a file whose type is variable (.docx, .pdf, .txt) along with a JSON formatted string.
I have managed to post files successfully with code similar to the following:
procedure DoRequest;
var
  Http: TIdHTTP;
  Params: TIdMultipartFormDataStream;
  RequestStream, ResponseStream: TStringStream;
  JRequest, JResponse: TJSONObject;
  url: string;
begin
  url := 'some_custom_service';
  JRequest := TJSONObject.Create;
  JResponse := TJSONObject.Create;
  try
    JRequest.AddPair('Pair1', 'Value1');
    JRequest.AddPair('Pair2', 'Value2');
    JRequest.AddPair('Pair3', 'Value3');
    Http := TIdHTTP.Create(nil);
    ResponseStream := TStringStream.Create;
    RequestStream := TStringStream.Create(UTF8Encode(JRequest.ToString));
    try
      Params := TIdMultipartFormDataStream.Create;
      Params.AddFile('File', ceFileName.Text, '').ContentTransfer := '';
      Params.AddFormField('Json', 'application/json', '', RequestStream);
      Http.Post(url, Params, ResponseStream);
      JResponse := TJSONObject.ParseJSONValue(ResponseStream.DataString) as TJSONObject;
    finally
      RequestStream.Free;
      ResponseStream.Free;
      Params.Free;
      Http.Free;
    end;
  finally
    JRequest.Free;
    JResponse.Free;
  end;
end;
The problem appears when I try to send a file whose filename contains Greek characters and spaces. Sometimes it fails and sometimes it succeeds.
After a lot of research, I noticed that the POST header is encoded by Indy's TIdFormDataField class using the EncodeHeader() function. When the post fails, the encoded filename in the header is split, whereas in the successful post it is not split.
For example:
Επιστολή εκπαιδευτικο.docx is encoded as
=?UTF-8?B?zpXPgM65z4PPhM6/zrvOriDOtc66z4DOsc65zrTOtc+Fz4TOuc66zr8uZG9j?='#$D#$A' =?UTF-8?B?eA==?=, which fails.
Επιστολή εκπαιδευτικ.docx is encoded as
=?UTF-8?B?zpXPgM65z4PPhM6/zrvOriDOtc66z4DOsc65zrTOtc+Fz4TOuc66LmRvY3g=?=, which succeeds.
Επιστολή εκπαιδευτικ .docx is encoded as
=?UTF-8?B?zpXPgM65z4PPhM6/zrvOriDOtc66z4DOsc65zrTOtc+Fz4TOuc66?= .docx, which fails.
I have tried to change the encoding of the filename, the AContentType of the AddFile() procedure, and the ContentTransfer, but none of those change the behavior, and I still get errors when the encoded filename is split.
Is this some kind of bug, or am I missing something?
My code works for every case except those I described above.
I am using Delphi XE3 with Indy 10.

EncodeHeader() does have some known issues with Unicode strings:
EncodeHeader() needs to take codeunits into account when splitting data between adjacent encoded-words
Basically, a MIME encoded-word cannot be more than 75 characters in length, so long text gets split up. But when encoding a long Unicode string, any given Unicode character may be charset-encoded using 1 or more bytes, and EncodeHeader() does not yet avoid splitting a multi-byte character across two separate encoded-words (which is illegal and explicitly forbidden by RFC 2047 of the MIME spec).
However, that is not what is happening in your examples.
In your first example, 'Επιστολή εκπαιδευτικο.docx' is too long to be encoded as a single MIME word, so it gets split into 'Επιστολή εκπαιδευτικο.doc' and 'x' substrings, which are then encoded separately. This is legal in MIME for long text (though you might have expected Indy to split the text into 'Επιστολή' and ' εκπαιδευτικο.doc' instead, or even 'Επιστολή', ' εκπαιδευτικο', and '.doc'; that might be a possibility in a future release). Adjacent MIME words that are separated only by whitespace are meant to be concatenated without the separating whitespace when decoded, thus producing 'Επιστολή εκπαιδευτικο.docx' again. If the server is not doing that, it has a flaw in its decoder (maybe it is decoding as 'Επιστολή εκπαιδευτικο.doc x' instead?).
In your second example, 'Επιστολή εκπαιδευτικ.docx' is short enough to be encoded as a single MIME word.
In your third example, 'Επιστολή εκπαιδευτικ .docx' gets split on the second whitespace (not the first) into 'Επιστολή εκπαιδευτικ' and ' .docx' substrings, and only the first substring needs to be encoded. This is legal in MIME. When decoded, the decoded text is meant to be concatenated with the following unencoded text, preserving the whitespace between them, thus producing 'Επιστολή εκπαιδευτικ .docx' again. If the server is not doing that, it has a flaw in its decoder (maybe it is decoding as 'Επιστολή εκπαιδευτικ.docx' instead?).
If you run these example filenames through Indy's MIME header encoder/decoder, they do decode properly:
var
  s: String;
begin
  s := EncodeHeader('Επιστολή εκπαιδευτικο.docx', '', 'B', 'UTF-8');
  ShowMessage(s); // '=?UTF-8?B?zpXPgM65z4PPhM6/zrvOriDOtc66z4DOsc65zrTOtc+Fz4TOuc66zr8uZG9j?='#13#10' =?UTF-8?B?eA==?='
  s := DecodeHeader(s);
  ShowMessage(s); // 'Επιστολή εκπαιδευτικο.docx'
  s := EncodeHeader('Επιστολή εκπαιδευτικ.docx', '', 'B', 'UTF-8');
  ShowMessage(s); // '=?UTF-8?B?zpXPgM65z4PPhM6/zrvOriDOtc66z4DOsc65zrTOtc+Fz4TOuc66LmRvY3g=?='
  s := DecodeHeader(s);
  ShowMessage(s); // 'Επιστολή εκπαιδευτικ.docx'
  s := EncodeHeader('Επιστολή εκπαιδευτικ .docx', '', 'B', 'UTF-8');
  ShowMessage(s); // '=?UTF-8?B?zpXPgM65z4PPhM6/zrvOriDOtc66z4DOsc65zrTOtc+Fz4TOuc66?= .docx'
  s := DecodeHeader(s);
  ShowMessage(s); // 'Επιστολή εκπαιδευτικ .docx'
end;
So the problem seems to be with the server-side decoding, not with Indy's client-side encoding.
That being said, if you are using a fairly recent version of Indy 10 (Nov 2011 or later), TIdFormDataField has a HeaderEncoding property, which defaults to 'B' (base64) in Unicode environments. However, the splitting logic also affects 'Q' (quoted-printable), so that may or may not work for you either (but you can try it):
with Params.AddFile('File', ceFileName.Text, '') do
begin
  ContentTransfer := '';
  HeaderEncoding := 'Q'; // <--- here
  HeaderCharSet := 'utf-8';
end;
Otherwise, a workaround might be to change the value to '8' (8-bit) instead, which effectively disables MIME encoding (but not charset encoding):
with Params.AddFile('File', ceFileName.Text, '') do
begin
  ContentTransfer := '';
  HeaderEncoding := '8'; // <--- here
  HeaderCharSet := 'utf-8';
end;
Just note that if the server is not expecting raw UTF-8 bytes for the filename, you might still run into problems (e.g., the raw UTF-8 bytes of 'Επιστολή εκπαιδευτικο.docx' being misinterpreted in a single-byte ANSI charset and coming out garbled).
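To illustrate that last point, here is a small sketch (my own, not from the original answer) of what a receiver gets if it reads the raw UTF-8 header bytes using the system ANSI code page instead of UTF-8:

var
  Utf8Bytes: TBytes;
  Garbled: string;
begin
  // the '8' HeaderEncoding workaround sends the filename as raw UTF-8 bytes
  Utf8Bytes := TEncoding.UTF8.GetBytes('Επιστολή εκπαιδευτικο.docx');
  // a server that assumes its ANSI code page decodes those same bytes incorrectly
  Garbled := TEncoding.ANSI.GetString(Utf8Bytes);
  ShowMessage(Garbled); // mojibake instead of the Greek filename
end;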

Related

How to post a file with chinese characters in TIdHttp [duplicate]


Compress Base64 string with zlib

I need to send a big Base64 string from Windows to mobile devices (iOS and Android) over TCP.
I have no problem sending and receiving, but the strings are too big, about 24000 characters, and I'm looking for a method to compress and decompress these strings.
Looking around, I see that the best way is to use ZLib, and I found the link Delphi XE and ZLib Problems (II), which explains how to do it.
The functions work with normal text strings, but compressing Base64 strings makes them bigger.
An example of a very small string that I would send would be this:
cEJNYkpCSThLVEh6QjNFWC9wSGhXQ3lHWUlBcGNURS83TFdDNVUwUURxRnJvZlRVUWd4WEFWcFJBNUZSSE9JRXlsaWgzcEJvTGo5anQwTlEyd1pBTEtVQVlPbXdkKzJ6N3J5ZUd4SmU2bDNBWjFEd3lVZmZTR1FwNXRqWTVFOFd2SHRwakhDOU9JUEZRM00wMWhnU0p3MWxxNFRVdmdEU2pwekhwV2thS0JFNG9WYXRDUHhTdnp4blU5Vis2ZzJQYnRIdllubzhKSFhZeUlpckNtTGtUZHVHOTFncHVUWC9FSTdOK3JEUDBOVzlaTngrcEdxcXhpRWJ1ZXNUMmdxOXpJa0ZEak1ORHBFenFVSTlCdytHTy==
I don't know if it is possible to compress these types of strings. I need help.
The functions that I use are these:
uses
  SysUtils, Classes, ZLib, EncdDecd;

function CompressAndEncodeString(const Str: string): string;
var
  Utf8Stream: TStringStream;
  Compressed: TMemoryStream;
  Base64Stream: TStringStream;
begin
  Utf8Stream := TStringStream.Create(Str, TEncoding.UTF8);
  try
    Compressed := TMemoryStream.Create;
    try
      ZCompressStream(Utf8Stream, Compressed);
      Compressed.Position := 0;
      Base64Stream := TStringStream.Create('', TEncoding.ASCII);
      try
        EncodeStream(Compressed, Base64Stream);
        Result := Base64Stream.DataString;
      finally
        Base64Stream.Free;
      end;
    finally
      Compressed.Free;
    end;
  finally
    Utf8Stream.Free;
  end;
end;

function DecodeAndDecompressString(const Str: string): string;
var
  Utf8Stream: TStringStream;
  Compressed: TMemoryStream;
  Base64Stream: TStringStream;
begin
  Base64Stream := TStringStream.Create(Str, TEncoding.ASCII);
  try
    Compressed := TMemoryStream.Create;
    try
      DecodeStream(Base64Stream, Compressed);
      Compressed.Position := 0;
      Utf8Stream := TStringStream.Create('', TEncoding.UTF8);
      try
        ZDecompressStream(Compressed, Utf8Stream);
        Result := Utf8Stream.DataString;
      finally
        Utf8Stream.Free;
      end;
    finally
      Compressed.Free;
    end;
  finally
    Base64Stream.Free;
  end;
end;
As I understand the question, you have done the following:
Encoded a string as UTF-8 bytes.
Compressed those bytes using zlib.
Base64 encoded the compressed bytes.
You then attempt to compress the output of step 3 and find that the result is no smaller. That is to be expected. You have already compressed the data, and further attempts to compress it cannot be expected to reduce the size significantly, especially not if you have base64 encoded in the meantime. If you could repeatedly compress data and have it get smaller each time, then eventually there would be nothing left. That is obviously not possible.
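To see this for yourself, you could feed the output back through the question's own CompressAndEncodeString and compare lengths; the ShowCompressionSizes wrapper below is just an illustrative sketch, not part of the original code:

procedure ShowCompressionSizes(const Original: string);
var
  FirstPass, SecondPass: string;
begin
  FirstPass := CompressAndEncodeString(Original);    // UTF-8 -> zlib -> Base64
  SecondPass := CompressAndEncodeString(FirstPass);  // zlib over already-compressed Base64 text
  ShowMessage(Format('original: %d, first pass: %d, second pass: %d',
    [Length(Original), Length(FirstPass), Length(SecondPass)]));
  // SecondPass is normally no shorter than FirstPass, and is often longer
end;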
I think you are already doing a good job. You convert to UTF-8, which for most text is the most space-effective of the Unicode encodings. If you worked with Chinese text, then you'd be better off with UTF-16. You then compress the UTF-8, which is also reasonable. And finally, for transmission, you encode with base64, also reasonable.
The most obvious way to reduce the size of the data to be transmitted is to omit the base64 step. If you can transmit the compressed bytes that are produced in step 2, then you will be transmitting less. Base64 uses 4 bytes to encode 3 bytes, so base64-encoded data is a third larger than the input data.
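Here is a minimal sketch of that idea, reusing the stream-based approach from the question but stopping after the zlib step and returning raw bytes; the CompressStringToBytes name and the TBytes return type are my own assumptions, not part of the original code:

uses
  SysUtils, Classes, ZLib;

// Compress a string to raw zlib bytes, with no Base64 step.
function CompressStringToBytes(const Str: string): TBytes;
var
  Utf8Stream: TStringStream;
  Compressed: TMemoryStream;
begin
  Utf8Stream := TStringStream.Create(Str, TEncoding.UTF8);
  try
    Compressed := TMemoryStream.Create;
    try
      ZCompressStream(Utf8Stream, Compressed);
      SetLength(Result, Compressed.Size);
      if Compressed.Size > 0 then
        Move(Compressed.Memory^, Result[0], Compressed.Size);
    finally
      Compressed.Free;
    end;
  finally
    Utf8Stream.Free;
  end;
end;

The receiver would then run those bytes through ZDecompressStream into a UTF-8 stream, exactly as DecodeAndDecompressString does after its Base64 step.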
Another way could be to use a better compression algorithm than zlib, but again there are limits to what can be achieved. And usually better compression is achieved at the cost of increased computational time.

Delphi Indy IdTCPServer and IdTCPClient. read and write control characters and text

Delphi XE3, Indy 10.5.9.0
I am creating an interface between a computer and an instrument. The instrument uses the ASTM protocol.
I have successfully sent text based messages back and forth between the server and client. I have been able to send control characters to the server and read those. What I have not figured out after 3 days of searching is how to write and read messages that have a mixture of control characters and text.
I am sending ASTM protocol messages which require control characters and text, like the following line (everything in angle brackets is a control character). Writing the message is not where I run into problems; it is reading it, since I will receive both text and control characters. My code below is how I read the control characters. How can I tell, when I get a character, whether it is a control character or text within the same string of control and text characters? Thanks to Remy Lebeau and his posts on this site for getting me where I am. He talked about how to use buffers, but I couldn't tell how to read a buffer that contained control characters and text characters.
<STX>3O|1|G-13-00017||^^^HPV|R||||||N||||||||||||||O<CR><ETX>D3<CR><LF>
I have added the following code to my server component's OnConnect event, which is supposed to allow me to send control characters...
...
AContext.Connection.IOHandler.DefStringEncoding := TIdTextEncoding.UTF8;
...
My server OnExecute event...
procedure TTasksForm.IdTCPServer1Execute(AContext: TIdContext);
var
  lastline : WideString;
  lastcmd : WideString;
  lastbyte : Byte;
begin
  ServerTrafficMemo.Lines.Add('OnExecute');
  lastline := '';
  lastcmd := '';
  lastbyte := AContext.Connection.IOHandler.ReadByte;
  if lastbyte = Byte(5) then
  begin
    lastcmd := '<ENQ>';
    ServerTrafficMemo.Lines.Add(lastcmd);
    AContext.Connection.IOHandler.WriteLn(lastcmd + ' received');
  end;
end;
The only control characters present are STX and ETX, and they are both < 32, so ASCII and UTF-8 will both handle them just fine. Or, you can use Indy's own built-in 8bit encoding instead.
For this type of data, there are several different ways to read it with Indy. Since the bulk of the data is textual, and the control characters are just used as frame delimiters, the easiest way would be to use IOHandler.ReadLn() or IOHandler.WaitFor() with explicit terminators.
Of course, there are other options as well, such as reading bytes from the IOHandler.InputBuffer directly (which I think is overkill in this situation), using the InputBuffer.IndexOf() method to know how many bytes to read.
Also, TIdTCPServer is a multithreaded component, where its events are fired in worker threads, but your code is directly accessing the UI, which is not thread-safe. You MUST synchronize with the UI thread.
And you shouldn't be using WideString, either. Use (Unicode)String instead.
Try something like this:
procedure TTasksForm.IdTCPServer1Connect(AContext: TIdContext);
begin
  AContext.Connection.IOHandler.DefStringEncoding := Indy8BitEncoding;
end;
procedure TTasksForm.IdTCPServer1Execute(AContext: TIdContext);
var
  lastline : string;
  lastcmd : string;
  lastbyte : Byte;
begin
  TThread.Synchronize(nil,
    procedure
    begin
      ServerTrafficMemo.Lines.Add('OnExecute');
    end
  );
  lastbyte := AContext.Connection.IOHandler.ReadByte;
  if lastbyte = $5 then
  begin
    lastcmd := '<ENQ>';
    TThread.Synchronize(nil,
      procedure
      begin
        ServerTrafficMemo.Lines.Add(lastcmd);
      end
    );
  end
  else if lastbyte = $2 then
  begin
    lastline := #2 + AContext.Connection.IOHandler.ReadLn(#3) + #3;
    lastline := lastline + AContext.Connection.IOHandler.ReadLn(#13#10) + #13#10;
    { or:
    lastline := #2 + AContext.Connection.IOHandler.WaitFor(#3, true, true);
    lastline := lastline + AContext.Connection.IOHandler.WaitFor(#13#10, true, true);
    }
    lastcmd := '<STX>';
    TThread.Synchronize(nil,
      procedure
      begin
        ServerTrafficMemo.Lines.Add(lastcmd);
      end
    );
  end;
  AContext.Connection.IOHandler.WriteLn(lastcmd + ' received');
end;
I couldn't tell how to read a buffer that contained control characters and text characters
This protocol is no doubt using ASCII strings. Any characters below decimal 32 will be control characters. Those 32 and above will be data characters. See
http://ascii-table.com/ascii.php
Dealing with that as bytes works fine. You can also use AnsiString, which is ASCII plus the upper 128 characters. In this situation I would avoid UTF(anything) and stick with either bytes or AnsiString. You need to control the message at the character level, and these characters are 8 bits per character with no escapes; a small classifier like the sketch below illustrates the idea.
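As a rough illustration of that byte-level approach (my own sketch, not from the answer above), a helper like this could map the ASTM control bytes to readable tokens while passing data characters through; the token names and the DescribeByte name are assumptions:

// Render a received byte either as a readable control-character token or as its character.
function DescribeByte(B: Byte): string;
begin
  case B of
    $02: Result := '<STX>';
    $03: Result := '<ETX>';
    $05: Result := '<ENQ>';
    $0D: Result := '<CR>';
    $0A: Result := '<LF>';
  else
    if B < 32 then
      Result := '<#' + IntToStr(B) + '>'   // any other control character
    else
      Result := Chr(B);                    // printable data character
  end;
end;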
Also see the first example, in the first answer here:

Why are strings truncated when using direct printing?

I'm trying to print directly to a printer using ESC/P commands (EPSON TM-T70) without using a printer driver. The code is found here.
However, if I try to print any strings, they are truncated. For example:
MyPrinter := TRawPrint.Create(nil);
try
  MyPrinter.DeviceName := 'EPSON TM-T70 Receipt';
  MyPrinter.JobName := 'MyJob';
  if MyPrinter.OpenDevice then
  begin
    MyPrinter.WriteString('This is page 1');
    MyPrinter.NewPage;
    MyPrinter.WriteString('This is page 2');
    MyPrinter.CloseDevice;
  end;
finally
  MyPrinter.Free;
end;
This would print only "This isThis is"! I wouldn't ordinarily use MyPrinter.NewPage to send a line break command, but regardless, why does it truncate the string?
Also notice the WriteString function in the RawPrint unit:
Result := False;
if IsOpenDevice then begin
  Result := True;
  if not WritePrinter(hPrinter, PChar(Text), Length(Text), WrittenChars) then begin
    RaiseError(GetLastErrMsg);
    Result := False;
  end;
end;
If I put a breakpoint there and step through the code, then WrittenChars is set to 14, which is correct. Why does it act like that?
You are using a Unicode-enabled version of Delphi, so Chars are 2 bytes long. When you call the function with Length(Text), you're passing the number of characters, but the function expects the size of the buffer in bytes. Replace it with Length(Text)*SizeOf(Char).
Since the size of one Unicode char is exactly 2 bytes, by passing Length where the buffer size is required you're essentially telling the API to only use half the buffer. Hence all strings are approximately cut in half.
Maybe you can use the ByteLength function, which gives the length of a string in bytes.
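A minimal sketch of that fix inside WriteString, assuming the same surrounding RawPrint code as quoted above (only the length argument changes, using ByteLength as just mentioned):

Result := False;
if IsOpenDevice then begin
  Result := True;
  // Pass the buffer size in bytes, not the character count;
  // ByteLength(Text) is equivalent to Length(Text)*SizeOf(Char).
  if not WritePrinter(hPrinter, PChar(Text), ByteLength(Text), WrittenChars) then begin
    RaiseError(GetLastErrMsg);
    Result := False;
  end;
end;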

Delphi: CDO.Message encoding problems

We wrote a Delphi program that sends some information with CDO.
On my Win7 machine (Hungarian), the accents work fine.
So if I send a mail with "ÁÉÍÓÖŐÚÜŰ", I receive it in that format.
I used iso-8859-2 encoding in the body, and this encodes the subject and the email addresses too (the sender address contains a name).
I thought that I was finished with this.
But when I try to send a mail from an English Win2k3 machine (the mailing server is the same!), the result truncates some accents:
Ű = U
Ő = O
Next I tried to use UTF-8 encoding here.
This produced accents, but the wrong accents.
The mail contains accents with ^ signs.
ê <> é
This is not a valid Hungarian letter... :-(
So I want to know how to convert or set up the input to get a good result.
I tried to log the body to see if it changes...
Log(SBody);
Msg.Body := SBody;
Log(Msg.Body);
... or not.
But these logs show a good result; the input is good.
So it is possible that it gets lost or misconverted when CDO generates the message.
Maybe I can help CDO if I can encode the ANSI text into real UTF.
But Delphi's converter functions don't have a "CodePage" parameter.
In Python I can say:
s.encode('iso-8859-2')
or
s.decode('iso-8859-2')
But in Delphi I don't see this parameter.
Does anybody know how to preserve the accents, how to convert the accented Hungarian strings so that they keep their accented format?
And I want to know: can I check the result without sending the mail?
Thanks for your help:
dd
A quick Google search for "delphi string codepage" got me to Torry's Delphi pages,
and maybe the following code snippets (found here) can shed some light on your problem:
{:Converts Unicode string to Ansi string using specified code page.
  @param   ws       Unicode string.
  @param   codePage Code page to be used in conversion.
  @returns Converted ansi string.
}
function WideStringToString(const ws: WideString; codePage: Word): AnsiString;
var
  l: integer;
begin
  if ws = '' then
    Result := ''
  else
  begin
    l := WideCharToMultiByte(codePage,
      WC_COMPOSITECHECK or WC_DISCARDNS or WC_SEPCHARS or WC_DEFAULTCHAR,
      @ws[1], -1, nil, 0, nil, nil);
    SetLength(Result, l - 1);
    if l > 1 then
      WideCharToMultiByte(codePage,
        WC_COMPOSITECHECK or WC_DISCARDNS or WC_SEPCHARS or WC_DEFAULTCHAR,
        @ws[1], -1, @Result[1], l - 1, nil, nil);
  end;
end; { WideStringToString }

{:Converts Ansi string to Unicode string using specified code page.
  @param   s        Ansi string.
  @param   codePage Code page to be used in conversion.
  @returns Converted wide string.
}
function StringToWideString(const s: AnsiString; codePage: Word): WideString;
var
  l: integer;
begin
  if s = '' then
    Result := ''
  else
  begin
    l := MultiByteToWideChar(codePage, MB_PRECOMPOSED, PChar(@s[1]), -1, nil, 0);
    SetLength(Result, l - 1);
    if l > 1 then
      MultiByteToWideChar(codePage, MB_PRECOMPOSED, PChar(@s[1]),
        -1, PWideChar(@Result[1]), l - 1);
  end;
end; { StringToWideString }
--reinhard
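As a hypothetical usage sketch for the Hungarian case (the code page number 28592 for ISO-8859-2 and the PrepareBody wrapper are my own assumptions, not part of reinhard's answer); the round trip also gives a way to check the result without sending a mail:

procedure PrepareBody(const Body: WideString);
var
  AnsiBody: AnsiString;
  RoundTrip: WideString;
begin
  // Convert the Unicode text to ISO-8859-2 (Latin-2) bytes before handing it to CDO
  AnsiBody := WideStringToString(Body, 28592);
  // ...and back again, to check that the accents survive without actually sending a mail
  RoundTrip := StringToWideString(AnsiBody, 28592);
  if RoundTrip <> Body then
    ShowMessage('Some characters were lost in the conversion');
end;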
