QR reading problem when using a Base64 UTF-8 Arabic string - delphi

I have a tagged Arabic string that I want to encode using Base64. Everything works perfectly when the string contains English letters, but with Arabic letters the QR reader doesn't display the correct characters.
Here is my code :
function TForm1.GetMyString(TagNo: Integer; TagValue: string): string;
var
Bytes, StrByte: TBytes;
i: Integer;
begin
SetLength(StrByte, Length(TagValue)+2);
StrByte[0] := Byte(TagNo);
StrByte[1] := Byte(Length(TagValue));
for i := 2 to Length(StrByte)-1 do
StrByte[i] := Byte(TagValue[i-1]);
Result := TEncoding.UTF8.GetString(StrByte);
end;
procedure TForm1.Button1Click(Sender: TObject);
var
s: String;
Bytes: TBytes;
begin
s := GetMyString(1, Edit1.Text) + GetMyString(2, Edit2.Text) +
GetMyString(3, Edit3.Text) + GetMyString(4, Edit4.Text) +
GetMyString(5, Edit5.Text);
bytes := TEncoding.UTF8.GetBytes(s);
QREdit.Text := TNetEncoding.Base64.EncodeBytesToString(Bytes);
end;
Decoding the Base64 string shows the same incorrect result as the QR reader,
eg. (E$33) 'D9E1'F) instead of (مؤسسة العمران)
I am using ZXingQR to read the generated string.

GetMyString() is truncating a series of UTF-16 characters into an array of 8-bit bytes, as well as putting other non-textual bytes into the array, and then treating the whole array as if it were UTF-8 (which it is not) to produce a new UTF-16 string.
Button1Click() then takes those corrupted UTF-16 strings, concatenates them together, and converts the result to UTF-8 for encoding to base64.
This approach will only work with ASCII strings whose lengths are less than 128 characters, and tags that are below 128 in value, since ASCII bytes in the range 0..127 are a subset of UTF-8. It will NOT work with non-ASCII characters/bytes outside of this range.
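To see where the garbage comes from, here is a minimal sketch of the truncation step alone. TruncateToLowBytes is a hypothetical helper (not part of the question's code) that mimics Byte(TagValue[i]) by keeping only the low 8 bits of each UTF-16 code unit. The Arabic letters are written as #$ escapes so the source file's encoding does not matter.

```pascal
program TruncationDemo;
{$APPTYPE CONSOLE}

// Hypothetical helper mimicking Byte(TagValue[i]) from the question:
// keeps only the low 8 bits of each UTF-16 code unit.
function TruncateToLowBytes(const S: string): string;
var
  C: Char;
begin
  Result := '';
  for C in S do
    Result := Result + Char(Ord(C) and $FF);
end;

begin
  // U+0645 U+0624 U+0633 U+0633 U+0629: the first word of the sample text
  Writeln(TruncateToLowBytes(#$0645#$0624#$0633#$0633#$0629));  // prints E$33)
end.
```

The low bytes $45 $24 $33 $33 $29 happen to be the ASCII characters E$33), which matches the start of the output reported in the question.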
It seems that you want to base64 encode a series of tagged UTF-8 strings. If so, then try something more like this instead:
procedure TForm1.GetMyString(TagNo: UInt8; const TagValue: string; Output: TStream);
var
Bytes: TBytes;
begin
Bytes := TEncoding.UTF8.GetBytes(TagValue);
Assert(Length(Bytes) < 256);
Output.WriteData(TagNo);
Output.WriteData(UInt8(Length(Bytes)));
Output.WriteData(Bytes, Length(Bytes));
end;
procedure TForm1.Button1Click(Sender: TObject);
var
Stream: TMemoryStream;
begin
Stream := TMemoryStream.Create;
try
GetMyString(1, Edit1.Text, Stream);
GetMyString(2, Edit2.Text, Stream);
GetMyString(3, Edit3.Text, Stream);
GetMyString(4, Edit4.Text, Stream);
GetMyString(5, Edit5.Text, Stream);
QREdit.Text := TNetEncoding.Base64.EncodeBytesToString(Stream.Memory, Stream.Size);
finally
Stream.Free;
end;
end;
Alternatively:
function TForm1.GetMyString(TagNo: UInt8; const TagValue: string): TBytes;
var
Len: Integer;
begin
Len := TEncoding.UTF8.GetByteCount(TagValue);
Assert(Len < 256);
SetLength(Result, 2+Len);
Result[0] := Byte(TagNo);
Result[1] := Byte(Len);
TEncoding.UTF8.GetBytes(TagValue, 1, Length(TagValue), Result, 2);
end;
procedure TForm1.Button1Click(Sender: TObject);
var
Bytes: TBytes;
begin
Bytes := Concat(
GetMyString(1, Edit1.Text),
GetMyString(2, Edit2.Text),
GetMyString(3, Edit3.Text),
GetMyString(4, Edit4.Text),
GetMyString(5, Edit5.Text)
);
QREdit.Text := TNetEncoding.Base64.EncodeBytesToString(Bytes);
end;
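To check the result without a QR reader, the tagged bytes can be decoded again. The following is only a sketch under assumptions: EncodeTag and DecodeTag are hypothetical helper names, and the layout (one tag byte, one length byte, then the UTF-8 payload) is the one used in the code above. It requires a Delphi version that ships System.NetEncoding.

```pascal
program TlvRoundTrip;
{$APPTYPE CONSOLE}
uses
  System.SysUtils, System.NetEncoding;

// Hypothetical helper: one record = tag byte, length byte, UTF-8 payload.
function EncodeTag(TagNo: Byte; const Value: string): TBytes;
var
  Utf8: TBytes;
begin
  Utf8 := TEncoding.UTF8.GetBytes(Value);
  Assert(Length(Utf8) < 256);
  SetLength(Result, 2 + Length(Utf8));
  Result[0] := TagNo;
  Result[1] := Byte(Length(Utf8));  // byte count, not character count
  if Length(Utf8) > 0 then
    Move(Utf8[0], Result[2], Length(Utf8));
end;

// Hypothetical helper: reads the record starting at Offset and advances Offset.
function DecodeTag(const Data: TBytes; var Offset: Integer; out TagNo: Byte): string;
var
  Len: Integer;
begin
  TagNo := Data[Offset];
  Len := Data[Offset + 1];
  Result := TEncoding.UTF8.GetString(Data, Offset + 2, Len);
  Inc(Offset, 2 + Len);
end;

var
  Original, Decoded, B64: string;
  Data: TBytes;
  Offset: Integer;
  Tag: Byte;
begin
  Original := #$0645#$0624#$0633#$0633#$0629;  // five Arabic letters
  B64 := TNetEncoding.Base64.EncodeBytesToString(EncodeTag(1, Original));
  Data := TNetEncoding.Base64.DecodeStringToBytes(B64);
  Offset := 0;
  Decoded := DecodeTag(Data, Offset, Tag);
  Writeln('Tag: ', Tag);
  Writeln('Payload bytes: ', Data[1]);  // 10: each letter takes two UTF-8 bytes
  Writeln('Round trip OK: ', Decoded = Original);
end.
```

Note that the length byte holds 10 for a 5-character string, which is why the length must be taken from the UTF-8 byte array and not from Length(TagValue).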

Related

Is there a way to get just the ANSI characters from a string? Utf8decode fails when string contains emojis

First I get a TMemoryStream from an HTTP request, which contains the body of the response.
Then I load it in a TStringList and save the text in a widestring (also tried with ansistring).
The problem is that I need to convert the string because the users language is spanish, so vowels with accent marks are very common and I need to store the info.
lServerResponse := TStringList.Create;
lServerResponse.LoadFromStream(lResponseMemoryStream);
lStringResponse := lServerResponse.Text;
lDecodedResponse := Utf8Decode(lStringResponse );
If the response (a part of it) is "Hólá Múndó", lStringResponse value will be "HÃ³lÃ¡ MÃºndÃ³", and lDecodedResponse will be "Hólá Múndó".
But if the user adds any emoji (lStringResponse value will be "Hólá Múndó 😀" if the emoji is 😀) Utf8Decode fails and returns an empty string.
Is there a way to get just the ANSI characters from a string (or MemoryStream)?, or removing whatever Utf8Decode can't convert?
Thanks for your time.
TMemoryStream is just raw bytes. There is no reason to load that stream into a TStringList just to extract a (Wide|Ansi)String from it. You can assign the bytes directly to an AnsiString/UTF8String using SetString() instead, eg:
var
lStringResponse: UTF8String;
lDecodedResponse: WideString;
begin
SetString(lStringResponse, PAnsiChar(lResponseMemoryStream.Memory), lResponseMemoryStream.Size);
lDecodedResponse := UTF8Decode(lStringResponse);
end;
Just make sure the HTTP content really is encoded as UTF-8, or else this approach will not work.
That being said - UTF8Decode() (and UTF8Encode()) in Delphi 7 DO NOT support Unicode codepoints above U+FFFF, which means they DO NOT support Emojis at all. That was fixed in Delphi 2009.
To work around that issue in earlier versions, you can use the Win32 API MultiByteToWideChar() function instead, eg:
uses
..., Windows;
function My_UTF8Decode(const S: UTF8String): WideString;
var
WLen: Integer;
begin
WLen := MultiByteToWideChar(CP_UTF8, 0, PAnsiChar(S), Length(S), nil, 0);
if WLen > 0 then
begin
SetLength(Result, WLen);
MultiByteToWideChar(CP_UTF8, 0, PAnsiChar(S), Length(S), PWideChar(Result), WLen);
end else
Result := '';
end;
var
lStringResponse: UTF8String;
lDecodedResponse: WideString;
begin
SetString(lStringResponse, PAnsiChar(lResponseMemoryStream.Memory), lResponseMemoryStream.Size);
lDecodedResponse := My_UTF8Decode(lStringResponse);
end;
Alternatively:
uses
..., Windows;
function My_UTF8Decode(const S: PAnsiChar; const SLen: Integer): WideString;
var
WLen: Integer;
begin
WLen := MultiByteToWideChar(CP_UTF8, 0, S, SLen, nil, 0);
if WLen > 0 then
begin
SetLength(Result, WLen);
MultiByteToWideChar(CP_UTF8, 0, S, SLen, PWideChar(Result), WLen);
end else
Result := '';
end;
var
lDecodedResponse: WideString;
begin
lDecodedResponse := My_UTF8Decode(PAnsiChar(lResponseMemoryStream.Memory), lResponseMemoryStream.Size);
end;
Or, use a 3rd party Unicode conversion library, like ICU or libiconv, which handle this for you.

Convert 🌠 to %F0 %9F %8C %A0

I know 🌠 is %F0 %9F %8C %A0
but how can I convert this to be usable in Delphi ?
I tried several html encoders , but none give this result
my test
for i := 1 to length(s) do
result:= result+IntToHex(ord(s[i]),2);
but my result is D83CDF20
That is a simple UTF-8 encoding of this character. You can get the Delphi string using TEncoding like this:
var
S: string;
begin
S := TEncoding.UTF8.GetString(TBytes.Create($F0, $9F, $8C, $A0));
end;
or simply
S := '🌠';
In case you want it the other way round:
var
bytes: TBytes;
begin
bytes := TEncoding.UTF8.GetBytes('🌠');
end;
Or:
var
S: UTF8String;
begin
S := UTF8String('🌠');
end;
Valid for Delphi 2009 and later.
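If the goal is the literal text %F0%9F%8C%A0, the UTF-8 bytes still have to be formatted as hex. A minimal sketch follows; PercentEncodeUtf8 is a hypothetical helper name, and it deliberately escapes every byte (including plain ASCII), so it is not a full URL encoder. The loop in the question printed D83CDF20 because it walked the two UTF-16 code units (a surrogate pair) instead of the UTF-8 bytes.

```pascal
program PercentEncodeDemo;
{$APPTYPE CONSOLE}
uses
  System.SysUtils;

// Hypothetical helper: writes each UTF-8 byte of S as %XX.
function PercentEncodeUtf8(const S: string): string;
var
  B: Byte;
begin
  Result := '';
  for B in TEncoding.UTF8.GetBytes(S) do
    Result := Result + '%' + IntToHex(B, 2);
end;

begin
  // #$D83C#$DF20 is the UTF-16 surrogate pair for U+1F320
  Writeln(PercentEncodeUtf8(#$D83C#$DF20));  // prints %F0%9F%8C%A0
end.
```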

Archive for Byte, Byte to String without delay Delphi [duplicate]

Possible Duplicate:
Convert Array of ShortInt to String, Delphi
I would like to turn a file into a string that contains the numeric values of the file's bytes.
I did it one way, using a FOR loop and an array of ShortInt, but it was very slow.
Example:
Take any file on my computer. My goal is to convert it into bytes, each in the ShortInt range -128..127, giving something like this:
A[0] = 120
A[1] = -35
A[2] = 40
So far so good, but I need the values concatenated into a single string with a ',' between them, like this:
'120,-35,40'
I did it with a 'FOR' loop, but it was very slow. Is there a faster alternative?
This question is rather similar to your previous question, but with the added complexity of reading the array from a file.
I'd probably write it like this:
function ConvertFileToCommaDelimitedArray(const FileName: string): string;
var
i, BytesLeft, BytesRead: Integer;
Buffer: array [0..4096-1] of Shortint;
Stream: TFileStream;
sb: TStringBuilder;
begin
sb := TStringBuilder.Create;
try
Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
try
BytesLeft := Stream.Size;
while BytesLeft>0 do
begin
BytesRead := Min(SizeOf(Buffer), BytesLeft); // Min is declared in the Math unit
Stream.ReadBuffer(Buffer, BytesRead);
dec(BytesLeft, BytesRead);
for i := 0 to BytesRead-1 do
begin
sb.Append(IntToStr(Buffer[i]));
sb.Append(',');
end;
end;
finally
Stream.Free;
end;
if sb.Length>0 then
sb.Length := sb.Length-1;//remove trailing comma
Result := sb.ToString;
finally
sb.Free;
end;
end;
You can load a file into an array of one-byte signed integral types (also known as ShortInt) like this:
type
TShortIntArray = array of ShortInt;
function LoadFileAsShortInt(const name: TFileName): TShortIntArray;
var
f: TFileStream;
begin
f := TFileStream.Create(name, fmOpenRead or fmShareDenyWrite);
try
SetLength(Result, f.Size);
if f.Size > 0 then // Result[0] is out of range for an empty file
f.ReadBuffer(Result[0], f.Size);
finally
f.Free;
end;
end;
If you want the file's contents as a string, then you should skip the array and load the file directly into a string:
function FileAsString(const name: TFileName): AnsiString;
var
s: TStringStream;
begin
s := TStringStream.Create;
try
s.LoadFromFile(name);
Result := s.DataString;
finally
s.Free;
end;
end;

Base64 to Binary (Delphi)

I used the binary-to-Base64 function from your answer to:
Binary to Base64 (Delphi)
I successfully encoded a file to a Base64 string and wrote it to an MS SQL 2008 database, but I have a question:
How can I write this file back to disk using EncdDecd.pas?
As always, David answered sufficiently. Still, I can't resist giving a slightly different solution using some of the goodies from recent Delphi versions.
procedure DecodeFile(const base64: AnsiString; const FileName: string);
var
stream: TBytesStream;
begin
stream := TBytesStream.Create(DecodeBase64(base64));
try
stream.SaveToFile(Filename);
finally
stream.Free;
end;
end;
This function will take a base64 encoded string, decode it, and write the resulting byte array to a file.
procedure DecodeToFile(const base64: AnsiString; const FileName: string);
var
stream: TFileStream;
bytes: TBytes;
begin
bytes := DecodeBase64(base64);
stream := TFileStream.Create(FileName, fmCreate);
try
if bytes<>nil then
stream.Write(bytes[0], Length(Bytes));
finally
stream.Free;
end;
end;
To explain what is happening here, the first line
bytes := DecodeBase64(base64);
performs the decode and returns the decoded binary contents of the file in a TBytes variable. TBytes is simply an array of bytes.
The next step is to create the file. The idiomatic way to write files in Delphi is to use streams. In this case we want a TFileStream.
stream := TFileStream.Create(FileName, fmCreate);
The fmCreate option means that if the file already exists, it will be replaced and overwritten by what we write.
The final step is to write the contents of the byte array to the file
if bytes<>nil then
stream.Write(bytes[0], Length(Bytes));
The if bytes<>nil check is to handle the case where the base64 string decodes to an empty array. If we were to remove that check then the following line would result in a runtime error if you were running with range checking enabled (which you should be doing). The call to stream.Write should be self-explanatory.
After looking into Soap.EncdDecd, one can find a more platform-independent way, as its DecodeBase64 uses universal (no AnsiString) methods from System.NetEncoding.
Based on Uwe's sample:
uses
...
System.Classes,
System.NetEncoding;
...
procedure DecodeFile(const base64: String; const FileName: string);
var
stream: TBytesStream;
begin
stream := TBytesStream.Create(TNetEncoding.Base64.DecodeStringToBytes(base64));
try
stream.SaveToFile(Filename);
finally
stream.Free;
end;
end;
uses
Soap.EncdDecd;
function TForm1.EncodeFile(const FileName: string): AnsiString;
var
MemStream: TMemoryStream;
begin
MemStream := TMemoryStream.Create;
try
MemStream.LoadFromFile(Filename);
Result := EncodeBase64(MemStream.Memory, MemStream.Size);
finally
MemStream.Free;
end;
end;
function TForm1.DecodeFile(const base64: AnsiString): TBytesStream;
begin
Result := TBytesStream.Create(DecodeBase64(base64));
end;
I have a very old Delphi 2006 (v10.0.2558.35231 Update 2) and had to decode Base64-encoded UTF-8 input strings. I finally figured it out, and here's an example for anyone interested.
Uses
IdCoderMIME; // Indy9
var
decoder: TIdDecoderMIME;
str: WideString;
- - -
decoder := TIdDecoderMIME.Create(nil);
str := base64DecodeUTF8(decoder, b64sourcestr);
decoder.Free;
- - -
function base64DecodeUTF8(decoder:TIdDecoderMIME; str:String): WideString;
var
stream:TMemoryStream;
utf8: UTF8String;
//idx:Integer;
begin
stream := TMemoryStream.Create;
try
decoder.DecodeToStream(str, stream);
setString(utf8, PChar(stream.Memory), stream.Size);
Result := UTF8Decode(utf8);
//for idx := 0 to stream.Size-1 do begin
// Writeln(PChar(stream.Memory)[idx] + ' ' + IntToStr(ORD(PChar(stream.Memory) [idx])) );
//end;
finally
stream.Free;
end;
end;

(Wide)String - storing in TFileStream, Delphi 7. What is the fastest way?

I'm using Delphi 7 (non-Unicode VCL) and need to store lots of WideStrings inside a TFileStream. I can't use TStringStream because the (wide)strings are mixed with binary data; the format is designed to speed up loading and writing the data. However, I believe that the way I'm currently loading and writing the strings might be a bottleneck in my code.
Currently I write the length of a string, then write it char by char.
When loading, I first read the length, then read char by char.
So, what is the fastest way to save and load WideString to TFileStream?
Thanks in advance
Rather than read and write one character at a time, read and write them all at once:
procedure WriteWideString(const ws: WideString; stream: TStream);
var
nChars: LongInt;
begin
nChars := Length(ws);
stream.WriteBuffer(nChars, SizeOf(nChars));
if nChars > 0 then
stream.WriteBuffer(ws[1], nChars * SizeOf(ws[1]));
end;
function ReadWideString(stream: TStream): WideString;
var
nChars: LongInt;
begin
stream.ReadBuffer(nChars, SizeOf(nChars));
SetLength(Result, nChars);
if nChars > 0 then
stream.ReadBuffer(Result[1], nChars * SizeOf(Result[1]));
end;
Now, technically, since WideString is a Windows BSTR, it can contain an odd number of bytes. The Length function reads the number of bytes and divides by two, so it's possible (although not likely) that the code above will cut off the last byte. You could use this code instead:
procedure WriteWideString(const ws: WideString; stream: TStream);
var
nBytes: LongInt;
begin
nBytes := SysStringByteLen(Pointer(ws));
stream.WriteBuffer(nBytes, SizeOf(nBytes));
if nBytes > 0 then
stream.WriteBuffer(Pointer(ws)^, nBytes);
end;
function ReadWideString(stream: TStream): WideString;
var
nBytes: LongInt;
buffer: PAnsiChar;
begin
stream.ReadBuffer(nBytes, SizeOf(nBytes));
if nBytes > 0 then begin
GetMem(buffer, nBytes);
try
stream.ReadBuffer(buffer^, nBytes);
Result := SysAllocStringByteLen(buffer, nBytes)
finally
FreeMem(buffer);
end;
end else
Result := '';
end;
Inspired by Mghie's answer, I have replaced my Read and Write calls with ReadBuffer and WriteBuffer. The latter will raise exceptions if they are unable to read or write the requested number of bytes.
There is nothing special about wide strings, to read and write them as fast as possible you need to read and write as much as possible in one go:
procedure TForm1.Button1Click(Sender: TObject);
var
Str: TStream;
W, W2: WideString;
L: integer;
begin
W := 'foo bar baz';
Str := TFileStream.Create('test.bin', fmCreate);
try
// write WideString
L := Length(W);
Str.WriteBuffer(L, SizeOf(integer));
if L > 0 then
Str.WriteBuffer(W[1], L * SizeOf(WideChar));
Str.Seek(0, soFromBeginning);
// read back WideString
Str.ReadBuffer(L, SizeOf(integer));
if L > 0 then begin
SetLength(W2, L);
Str.ReadBuffer(W2[1], L * SizeOf(WideChar));
end else
W2 := '';
Assert(W = W2);
finally
Str.Free;
end;
end;
WideStrings contain a 'string' of WideChars, which use 2 bytes each. If you want to store the UTF-16 strings (which WideStrings use internally) in a file, and be able to open this file in other programs like Notepad, you need to write a byte order mark first: #$FEFF.
If you know this, writing can look like this:
Stream1.Write(WideString1[1],Length(WideString1)*2); //2=SizeOf(WideChar)
reading can look like this:
Stream1.Read(WideChar1,2);//assert returned 2 and WideChar1=#$FEFF
SetLength(WideString1,(Stream1.Size div 2)-1);
Stream1.Read(WideString1[1],((Stream1.Size div 2)-1)*2); // the count is in bytes, not characters
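Putting the byte order mark and the string data together, a fuller sketch might look like the following. SaveUtf16WithBom and LoadUtf16WithBom are hypothetical names, the file is assumed to hold nothing but the BOM followed by UTF-16 LE text, and 'test.txt' is just an example file name.

```pascal
program BomDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils, Classes;

const
  BOM: WideChar = #$FEFF;

// Hypothetical helper: writes the BOM, then the raw UTF-16 data in one call.
procedure SaveUtf16WithBom(const FileName: string; const WS: WideString);
var
  Stream: TFileStream;
begin
  Stream := TFileStream.Create(FileName, fmCreate);
  try
    Stream.WriteBuffer(BOM, SizeOf(BOM));
    if Length(WS) > 0 then
      Stream.WriteBuffer(WS[1], Length(WS) * SizeOf(WideChar));
  finally
    Stream.Free;
  end;
end;

// Hypothetical helper: checks the BOM, then reads the rest of the file at once.
function LoadUtf16WithBom(const FileName: string): WideString;
var
  Stream: TFileStream;
  Mark: WideChar;
  nChars: Integer;
begin
  Result := '';
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    Stream.ReadBuffer(Mark, SizeOf(Mark));
    if Mark <> BOM then
      raise Exception.Create('UTF-16 LE byte order mark expected');
    nChars := Integer((Stream.Size - Stream.Position) div SizeOf(WideChar));
    SetLength(Result, nChars);
    if nChars > 0 then
      Stream.ReadBuffer(Result[1], nChars * SizeOf(WideChar));
  finally
    Stream.Free;
  end;
end;

var
  W: WideString;
begin
  W := 'foo bar baz';
  SaveUtf16WithBom('test.txt', W);
  if LoadUtf16WithBom('test.txt') = W then
    Writeln('round trip OK')
  else
    Writeln('round trip FAILED');
end.
```

ReadBuffer and WriteBuffer are used instead of Read and Write so that a short read or write raises an exception rather than passing silently.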
You can also use TFastFileStream for reading the data or strings, I pasted the unit at http://pastebin.com/m6ecdc8c2 and a sample below:
program Project36;
{$APPTYPE CONSOLE}
uses
SysUtils, Classes,
FastStream in 'FastStream.pas';
const
WideNull: WideChar = #0;
procedure WriteWideStringToStream(Stream: TFileStream; var Data: WideString);
var
len: Word;
begin
len := Length(Data);
// Write WideString length
Stream.Write(len, SizeOf(len));
if (len > 0) then
begin
// Write WideString
Stream.Write(Data[1], len * SizeOf(WideChar));
end;
// Write null termination
Stream.Write(WideNull, SizeOf(WideNull));
end;
procedure CreateTestFile;
var
Stream: TFileStream;
MyString: WideString;
begin
Stream := TFileStream.Create('test.bin', fmCreate);
try
MyString := 'Hello World!';
WriteWideStringToStream(Stream, MyString);
MyString := 'Speed is Delphi!';
WriteWideStringToStream(Stream, MyString);
finally
Stream.Free;
end;
end;
function ReadWideStringFromStream(Stream: TFastFileStream): WideString;
var
len: Word;
begin
// Read length of WideString
Stream.Read(len, SizeOf(len));
// Read WideString
Result := PWideChar(Cardinal(Stream.Memory) + Stream.Position);
// Update position and skip null termination
Stream.Position := Stream.Position + (len * SizeOf(WideChar)) + SizeOf(WideNull);
end;
procedure ReadTestFile;
var
Stream: TFastFileStream;
my_wide_string: WideString;
begin
Stream := TFastFileStream.Create('test.bin');
try
Stream.Position := 0;
// Read WideString
my_wide_string := ReadWideStringFromStream(Stream);
WriteLn(my_wide_string);
// Read another WideString
my_wide_string := ReadWideStringFromStream(Stream);
WriteLn(my_wide_string);
finally
Stream.Free;
end;
end;
begin
CreateTestFile;
ReadTestFile;
ReadLn;
end.
