TSQLQuery.FieldByName().AsString -> TStringStream Corrupts Data - delphi

I'm using Delphi XE2. My code pulls data from a SQL-Server 2008 R2 database. The data returned is a nvarchar(max) field with 1,055,227 bytes of data. I use the following code to save the field data to a file:
procedure WriteFieldToFile(FieldName: string; Query: TSQLQuery);
var
  ss: TStringStream;
begin
  ss := TStringStream.Create;
  try
    ss.WriteString(Query.FieldByName(FieldName).AsString);
    ss.Position := 0;
    ss.SaveToFile('C:\Test.txt');
  finally
    FreeAndNil(ss);
  end;
end;
When I inspect the file in a hex viewer, the first 524,287 bytes (exactly 1/2 meg) look correct. The remaining bytes (524,288 to 1,055,227) are all nulls (#0), instead of the original data.
Is this the right way to save a string field from a TSQLQuery to a file? I chose to use TStringStream because I will eventually add code to do other things to the data on the stream, which I can't do with a TFileStream.

TStringStream is TEncoding-aware in XE2, but you are not specifying any encoding in the constructor so TEncoding.Default will be used, meaning that any string you provide to it will internally be converted to the OS default Ansi encoding. Make sure that encoding supports the Unicode characters you are trying to work with, or else specify a more suitable encoding, such as TEncoding.UTF8.
Also make sure that AsString is returning a valid and correct UnicodeString value to begin with. TStringStream will not save the data correctly if it is given garbage as input. Make sure that FieldByName() is returning a pointer to a TWideStringField object and not a TStringField object in order to handle the database's Unicode data correctly.
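A minimal sketch of the first suggestion, assuming a Unicode Delphi (2009 or later). The helper SaveTextAsUtf8, the sample text and the file name are hypothetical and only illustrate passing an explicit encoding to the TStringStream constructor. In the question's code the text would come from Query.FieldByName(FieldName).AsString instead.

```delphi
program SaveUtf8Demo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes; // System.SysUtils, System.Classes when unit scope names are required

// Hypothetical helper: encodes Text as UTF-8 and writes the bytes to FileName.
procedure SaveTextAsUtf8(const Text, FileName: string);
var
  ss: TStringStream;
begin
  // Passing an encoding here replaces the TEncoding.Default fallback.
  ss := TStringStream.Create('', TEncoding.UTF8);
  try
    ss.WriteString(Text);
    ss.SaveToFile(FileName); // saves the whole buffer, regardless of Position
  finally
    ss.Free;
  end;
end;

var
  Sample: string;
begin
  Sample := 'l'#$00E8'ne'; // "lène", written with a Unicode character code
  SaveTextAsUtf8(Sample, 'utf8-sample.txt'); // hypothetical file name
  // Four characters become five bytes, because the accented letter takes two bytes in UTF-8.
  Writeln(Length(TEncoding.UTF8.GetBytes(Sample)));
end.
```

This only addresses the encoding step. It does not by itself explain the run of nulls, so checking what AsString actually returns is still worthwhile.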

Related

Problems with unicode text

I use Delphi XE3 and I have a small problem that I don't know how to fix.
The problem is with the letter "è", which appears in a file path: "C:\lène.mp4".
I store this path in a TStringList. When I save the TStringList to a file, the path looks fine inside the text file.
But when I load the file back with a TStringList, the letter is shown as "Ã¨" (in a memo or in a variable), which makes the path invalid.
If I add the path (string) directly to the TStringList and then assign it to the path variable, it works fine.
But when I load it from the file and assign it to the path variable, it doesn't work (I get "Ã¨" instead of "è").
Normally I will work with a lot of Unicode strings, but for now I'm struggling with this one letter.
This does not work:
var
  resp : widestring;
  xfiles : tstringlist;
begin
  xfiles := tstringlist.Create;
  try
    xfiles.LoadFromFile('C:\Demo6-out.txt'); // this file contains only "C:\lène.mp4"
    resp := (xfiles.Strings[0]);
    // if i save xfiles to a file "path string" will be saved fine ... !
  finally
    xfiles.Free ;
  end;
But this works:
var
  resp : widestring;
  xfiles : tstringlist;
begin
  xfiles := tstringlist.Create;
  try
    xfiles.Add('C:\lène.mp4');
    resp := (xfiles.Strings[0]);
  finally
    xfiles.Free ;
  end;
I'm really confused.
First, you should be using UnicodeString instead of WideString. UnicodeString was introduced in Delphi 2009, and is much more efficient than WideString. The RTL uses UnicodeString (almost) everywhere it previously used AnsiString prior to 2009.
Second, something else introduced in Delphi 2009 is SysUtils.TEncoding, which is used for Byte<->Character conversions. Several existing RTL classes, including TStrings/TStringList, were updated to support TEncoding when converting bytes to/from strings.
What happens when you load a file into TStringList is that an internal TEncoding object is assigned to help convert the file's raw bytes to UnicodeString values. Which implementation of TEncoding it uses depends on the character encoding that LoadFromFile() thinks the file is using, if not explicitly stated (LoadFromFile() has an optional AEncoding parameter). If the file has a UTF BOM, a matching TEncoding is used, whether that be TEncoding.UTF8 or TEncoding.(BigEndian)Unicode. If no BOM is present, and the AEncoding parameter is not used, then TEncoding.Default is used, which represents the OS's default charset locale (and thus provides backwards compatibility with existing pre-2009 code).
When saving a TStringList to file, if the list was previously loaded from a file then the same TEncoding used for loading is used for saving, otherwise TEncoding.Default is used (again, for backwards compatibility), unless overwritten by the optional AEncoding parameter of SaveToFile().
In your first example, the input file is most likely encoded in UTF-8 without a BOM. So LoadFromFile() would use TEncoding.Default to interpret the file's bytes. Ã¨ is the result of the UTF-8 encoded form of è (byte octets 0xC3 0xA8) being misinterpreted as Windows-1252 instead of UTF-8. So, you would have to load the file like this instead:
xfiles.LoadFromFile('C:\Demo6-out.txt', TEncoding.UTF8);
In your second example, you are not loading a file or saving a file. You are simply assigning a string literal (which is Unicode-aware in D2009+) to a UnicodeString variable (inside of the TStringList) and then assigning that to a WideString variable (WideString and UnicodeString use the same UTF-16 character encoding, they just use different memory management). So there are no data conversions being performed.
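The loading rule described above can be tried in isolation. The sketch below assumes a Unicode Delphi (2009 or later). The file name and sample path are hypothetical. It writes the path as UTF-8 bytes without a BOM, then loads the file with the encoding stated explicitly.

```delphi
program LoadUtf8Demo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

const
  DemoFile = 'utf8-demo.txt'; // hypothetical file name, created in the current directory

var
  Expected: string;
  Raw: TBytes;
  fs: TFileStream;
  Lines: TStringList;
begin
  Expected := 'C:\l'#$00E8'ne.mp4'; // "C:\lène.mp4"

  // Write the text as UTF-8 bytes with no BOM, like the file in the question.
  Raw := TEncoding.UTF8.GetBytes(Expected);
  fs := TFileStream.Create(DemoFile, fmCreate);
  try
    fs.WriteBuffer(Raw[0], Length(Raw));
  finally
    fs.Free;
  end;

  Lines := TStringList.Create;
  try
    // Without a BOM the encoding has to be stated explicitly;
    // otherwise TEncoding.Default is used to decode the bytes.
    Lines.LoadFromFile(DemoFile, TEncoding.UTF8);
    Writeln(Lines[0] = Expected); // reports whether the path survived the round trip
  finally
    Lines.Free;
  end;

  DeleteFile(DemoFile);
end.
```

Calling Lines.LoadFromFile(DemoFile) without the second argument decodes the same bytes with the OS default Ansi code page, which is how the two-character sequence from the question appears on a Windows-1252 system.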
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

Delphi XE and ZLib Problems

I'm using Delphi XE and I'm having some problems with the ZLib routines.
I'm trying to compress some strings (and encode them to send via a SOAP web service, though that part is not really important).
The string returned by ZDecompressString differs from the one passed to ZCompressString.
example1:
uses ZLib;
// compressing string
// ZCompressString('1234567890', zcMax);
// compressed string ='xÚ3426153·°4'
// Uncompressing the result of ZCompressString, don't return the same:
// ZDecompressString('xÚ3426153·°4');
// uncompressed string = '123456789'
if '1234567890' <> ZDecompressString(ZCompressString('1234567890', zcMax)) then
ShowMessage('Compression/Decompression fails');
example2:
Uses ZLib;
// compressing string
// ZCompressString('12345678901234567890', zcMax)
// compressed string ='xÚ3426153·°40„³'
// Uncompressing the result of ZCompressString, don't return the same:
// ZDecompressString('xÚ3426153·°40„³')
// uncompressed string = '12345678901'
if '12345678901234567890' <> ZDecompressString(ZCompressString('12345678901234567890', zcMax)) then
ShowMessage('Compression/Decompression fails');
The functions I use come from other posts about compressing and decompressing:
function TForm1.ZCompressString(aText: string; aCompressionLevel: TZCompressionLevel): string;
var
  strInput,
  strOutput: TStringStream;
  Zipper: TZCompressionStream;
begin
  Result:= '';
  strInput:= TStringStream.Create(aText);
  strOutput:= TStringStream.Create;
  try
    Zipper:= TZCompressionStream.Create(strOutput, aCompressionLevel);
    try
      Zipper.CopyFrom(strInput, strInput.Size);
    finally
      Zipper.Free;
    end;
    Result:= strOutput.DataString;
  finally
    strInput.Free;
    strOutput.Free;
  end;
end;
function TForm1.ZDecompressString(aText: string): string;
var
  strInput,
  strOutput: TStringStream;
  Unzipper: TZDecompressionStream;
begin
  Result:= '';
  strInput:= TStringStream.Create(aText);
  strOutput:= TStringStream.Create;
  try
    Unzipper:= TZDecompressionStream.Create(strInput);
    try
      strOutput.CopyFrom(Unzipper, Unzipper.Size);
    finally
      Unzipper.Free;
    end;
    Result:= strOutput.DataString;
  finally
    strInput.Free;
    strOutput.Free;
  end;
end;
Where did I go wrong?
Has anyone else had the same problem?
ZLib, like all compression codes I know, is a binary compression algorithm. It knows nothing of string encodings. You need to supply it with byte streams to compress. And when you decompress, you are given back byte streams.
But you are working with strings, and so need to convert between encoded text and byte streams. The TStringStream class is doing that work in your code. You supply the string stream instance a text encoding when you create it.
Only your code does not supply an encoding. And so the default local ANSI encoding is used. And here's the first problem. That is not a full Unicode encoding. As soon as you use characters outside your local ANSI codepage the chain breaks down.
Solve that problem by supplying an encoding when you create string stream instances. Pass the encoding to the TStringStream constructor. A sound choice is TEncoding.UTF8. Pass this when creating strInput in the compressor, and strOutput in the decompressor.
Now the next and bigger problem that you face is that your compressed data may not be a meaningful string in any encoding. You might make your existing code sort of work if you switch to using AnsiString instead of string. But it's a rather brittle solution.
Fundamentally you are making the mistake of treating binary data as text. Once you compress you have binary data. My recommendation is that you don't attempt to interpret the compressed binary as text. Leave it as binary. Compress to a TBytesStream. And decompress from a TBytesStream. So the compressor function returns TBytes and the decompressor receives that same TBytes.
If, for some reason, you must compress to a string, then you must encode the compressed binary. Do that using base64. The EncdDecd unit can do that for you.
The flow for the compressor looks like this: string -> UTF-8 bytes -> compressed bytes -> base64 string. To decompress, reverse the arrows.
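A rough sketch of the binary part of that flow (string -> UTF-8 bytes -> compressed bytes, and back), leaving out the optional base64 step. The helper names CompressText and DecompressText are hypothetical. The ZLib stream constructors have changed between Delphi versions, so the single-argument forms used here are an assumption that may need adjusting for a particular release.

```delphi
program ZLibBytesDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, ZLib;

// Hypothetical helper: string -> UTF-8 bytes -> compressed bytes.
function CompressText(const Text: string): TBytes;
var
  Source: TBytes;
  Output: TBytesStream;
  Zipper: TZCompressionStream;
begin
  Source := TEncoding.UTF8.GetBytes(Text);
  Output := TBytesStream.Create;
  try
    Zipper := TZCompressionStream.Create(Output); // assumed: default compression level
    try
      if Length(Source) > 0 then
        Zipper.WriteBuffer(Source[0], Length(Source));
    finally
      Zipper.Free; // freeing the compressor flushes the remaining output
    end;
    // Bytes can be longer than Size, so copy only what was written.
    Result := Copy(Output.Bytes, 0, Integer(Output.Size));
  finally
    Output.Free;
  end;
end;

// Hypothetical helper: compressed bytes -> UTF-8 bytes -> string.
function DecompressText(const Data: TBytes): string;
var
  Input, Output: TBytesStream;
  Unzipper: TZDecompressionStream;
  Buffer: array[0..4095] of Byte;
  Count: Integer;
begin
  Input := TBytesStream.Create(Data);
  try
    Output := TBytesStream.Create;
    try
      Unzipper := TZDecompressionStream.Create(Input);
      try
        // Read until the decompressor has nothing left, without asking for Size.
        repeat
          Count := Unzipper.Read(Buffer, SizeOf(Buffer));
          if Count > 0 then
            Output.WriteBuffer(Buffer, Count);
        until Count = 0;
      finally
        Unzipper.Free;
      end;
      Result := TEncoding.UTF8.GetString(Output.Bytes, 0, Integer(Output.Size));
    finally
      Output.Free;
    end;
  finally
    Input.Free;
  end;
end;

var
  Original, RoundTrip: string;
  Compressed: TBytes;
begin
  Original := '12345678901234567890';
  Compressed := CompressText(Original);
  RoundTrip := DecompressText(Compressed);
  Writeln(RoundTrip = Original); // reports whether the round trip preserved the text
end.
```

The compressed data never passes through a string variable, so no text encoding can alter it. If a string is required for the SOAP payload, the TBytes result is the value to base64-encode.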

Converting TMemoryStream to variant

How do I convert contents of a TMemoryStream to a variant? I use Delphi 2010.
TMemoryStream stores contents of a file, it can be PDF or JPG (scanned document).
The file is kept in an MS SQL database.
When I go into editing mode in my program, I extract the contents of that file from the database into a TMemoryStream.
After editing the document's card, I need to post the document back to the database.
Scanned file could be changed also (or replaced with some other file).
To post record back, I use a stored procedure with a bunch of parameters - one for every field.
I pass parameters to stored procedure as variants.
That's why I need to convert TMemoryStream to a variant.
Assuming you need the Variant to hold an array of bytes, you can use this:
var
  MS: TMemoryStream;
  V: Variant;
  P: Pointer;
begin
  ...
  V := VarArrayCreate([0, MS.Size-1], varByte);
  if MS.Size > 0 then
  begin
    P := VarArrayLock(V);
    Move(MS.Memory^, P^, MS.Size);
    VarArrayUnlock(V);
  end;
  ...
end;
TMemoryStream doesn't have a convenient way to get direct access to the internal data. It provides a property that gives you a pointer, but not any useful data type. However, if you use TBytesStream, which derives from TMemoryStream, you can get the data from the stream as a variable of type TBytes.
After this, assuming your parameter is a standard parameter object of type TParam, you don't need to use a variant. You can do it like this:
param.AsBlob := MyTBytesVariable;
Or, even simpler than that, you can use the stream directly:
param.AsStream := MyMemoryStream;
(If you do this, make sure that the stream's Position is set to 0 first.)
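As an illustration of the TBytesStream suggestion, the sketch below pulls the stream contents out as TBytes. The sample bytes are made up for the example. One detail worth noting is that the Bytes property exposes the stream's whole internal buffer, which can be longer than Size, so the data is trimmed with Copy.

```delphi
program BytesStreamDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

const
  // Made-up sample content standing in for the scanned document.
  Sample: array[0..3] of Byte = ($25, $50, $44, $46);

var
  bs: TBytesStream;
  Data: TBytes;
begin
  bs := TBytesStream.Create;
  try
    bs.WriteBuffer(Sample, SizeOf(Sample));

    // Bytes is the internal buffer (allocated capacity), so keep only the first Size bytes.
    Data := Copy(bs.Bytes, 0, Integer(bs.Size));

    Writeln(Length(Data)); // 4, the number of bytes written
  finally
    bs.Free;
  end;
end.
```

Data is then a value that could be assigned to the parameter as described above, in place of a byte-array Variant.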

TParam.LoadFromStream is not working in Delphi XE2?

I have written the code below in Delphi XE2.
var
  stream : TStringStream;
begin
  stream := TStringStream.Create;
  //Some logic to populate stream from memo.
  ShowMessage(stream.datastring); //This line is showing correct data
  // some Insert query with below parameter setting
  ParamByName('Text').LoadFromStream(stream , ftMemo);
But this stores the text as ???? in the table.
This type of code works fine in Delphi 4.
Is there an issue with the TParam.LoadFromStream function in Delphi XE2?
EDIT:
Table field is of type 'Text'.
After some trial and error I have found a solution to this problem.
We can use the code below:
ParamByName('Text').AsMemo := SampleMemo.Text;
The root of your issue is that TStringStream does not operate the same way in D2009+ as it did in D4.
In D4, TStringStream was a simple wrapper around an AnsiString variable. The DataString property simply returned a direct reference to that variable, and all reads/writes operated directly on the contents of the variable. The stream's bytes and String characters were basically one and the same thing back then.
In D2009+, TStringStream is now a wrapper around a TBytes array of encoded bytes instead, where the default encoding is the default Ansi encoding of the OS that your app is running on. If you write a string to the stream using WriteString(), it gets encoded from Unicode to bytes using the stream's encoding, and then those encoded bytes are stored. If you read a string from the stream using ReadString(), or read the DataString property, the stored bytes are decoded to a Unicode string. Any other read/write operations operate on the raw encoded bytes instead, like any other stream type would. So when you call TParam.LoadFromStream(), it is reading the raw encoded bytes, not a Unicode string. The stream's raw bytes and String characters are NOT one and the same thing anymore. So the data you see in the ShowMessage() is not the same data that TParam sees.
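The difference between the stream's bytes and its characters can be seen with a small experiment. This is a sketch assuming a Unicode Delphi (2009 or later), and the sample text is made up. The same four-character string is stored under two encodings, and the character count from DataString is compared with the byte count from Size.

```delphi
program StringStreamBytesDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Prints how many characters DataString decodes and how many raw bytes are stored.
procedure Report(const Text: string; Encoding: TEncoding);
var
  ss: TStringStream;
begin
  ss := TStringStream.Create(Text, Encoding);
  try
    Writeln(Length(ss.DataString), ' characters, ', ss.Size, ' bytes');
  finally
    ss.Free;
  end;
end;

var
  Sample: string;
begin
  Sample := 'l'#$00E8'ne';            // "lène", four characters
  Report(Sample, TEncoding.UTF8);     // 4 characters, 5 bytes
  Report(Sample, TEncoding.Unicode);  // 4 characters, 8 bytes
end.
```

Anything that reads the stream as a plain TStream, as TParam.LoadFromStream() does, sees the byte form, while ShowMessage(stream.DataString) shows the decoded characters.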

Delphi: problem with httpcli (ICS) post method

I am using the HttpCli component from ICS to POST a request. I use an example that comes with the component. It says:
procedure TForm4.Button2Click(Sender: TObject);
var
  Data : String;
begin
  Data:='status=no';
  HttpCli1.SendStream := TMemoryStream.Create;
  HttpCli1.SendStream.Write(Data[1], Length(Data));
  HttpCli1.SendStream.Seek(0, 0);
  HttpCli1.RcvdStream := TMemoryStream.Create;
  HttpCli1.URL := Trim('http://server/something');
  HttpCli1.PostAsync;
end;
But in fact, it does not send
status=no
but
s.t.a.t.u
I can't understand where the problem is. Maybe someone can show an example of how to send a POST request with the HttpCli component?
PS I can't use Indy =)
I suppose you're using Delphi 2009 or later, where the string type holds two-byte-per-character Unicode data. The Length function gives the number of characters, not the number of bytes, so when you put your string into the memory stream, you only copy half the bytes from the string. Even if you'd copied all of them, though, you'd still have a bunch of extra data in the stream since each character has two bytes and the server probably only expects to get one.
Use a different string type, such as AnsiString or UTF8String.
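A minimal sketch of that suggestion, using a plain TMemoryStream in place of HttpCli1.SendStream so that it runs without ICS. The variable names are hypothetical. The text is converted to a UTF8String first, so Length() counts bytes and the stream receives one byte per character for ASCII data.

```delphi
program PostBodyBytesDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

var
  Data: string;
  Encoded: UTF8String;
  Body: TMemoryStream; // stands in for HttpCli1.SendStream
begin
  Data := 'status=no';

  // Convert once, so that Length() below is a byte count.
  Encoded := UTF8Encode(Data);

  Body := TMemoryStream.Create;
  try
    Body.WriteBuffer(Encoded[1], Length(Encoded));
    Body.Position := 0; // rewind before handing the stream to the sender

    Writeln(Length(Data) * SizeOf(Char)); // 18, the bytes held by the UTF-16 string
    Writeln(Body.Size);                   // 9, the bytes actually placed in the body
  finally
    Body.Free;
  end;
end.
```

The original call wrote Length(Data) = 9 bytes of the 18-byte UTF-16 buffer, which matches the truncated "s.t.a.t.u" seen on the wire.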
