Format and Pointers/Hex Values (Memory Overwrite) - delphi

In my Delphi XE2 32-bit application (Update 4 Hotfix 1, version 16.0.4504.48759), I'm using the Format() routine to log pointer values.
For example:
Format('MyObject (%p)', [Pointer(MyObject)]);
However, the resulting string sometimes contains garbage characters (e.g., in this case '?' or '|' in place of hex digits):
MyObject (4E?|2010)
I also get the same result when replacing '%p' with '%x' like so:
Format('MyObject (%x)', [Integer(MyObject)]);
However, using an integer value always works:
Format('MyObject (%d)', [Integer(MyObject)]);
MyObject (1291453120)
Is there a bug that I'm unaware of, or can this be related to the problem described in the question 'Why does Format crash when anything but "%s" is used with a Variant?'
UPDATE
I've accepted Jeroen's answer as it led me to the solution by process of elimination. After the situation with starting the app via F7 (as per the comment), I figured that something must be going wrong much earlier in the process. On a hunch, I disabled madExcept from its IDE menu, rebuilt the app, and the problem disappeared. Evidently, whatever code madExcept was linking into my application was causing an overwrite in the SysUtils constant TwoHexLookup. Re-enabling madExcept and rebuilding (without any other changes on my part) also worked, so there must have been some corruption during the linking phase.
The strategy Jeroen outlined for detecting memory corruption was a useful exercise and should prove valuable if I encounter a similar situation.

My best hypothesis is that your code is modifying some memory that it shouldn't, perhaps by dereferencing an uninitialized pointer. I've created a reproducible case that demonstrates this possibility. At least, it's reproducible on my machine with my version of the compiler. The exact same code might not do the same thing in another circumstance.
procedure TForm1.Button1Click(Sender: TObject);
var
  P: PByte;
  S: string;
  T: AnsiString;
begin
  // There's nothing special about HexDisplayPrefix.
  // It just happens to be one of the last global variables
  // declared in SysUtils.
  P := @HexDisplayPrefix;
  // A few bytes beyond that is TwoHexLookUp.
  // This is a static array of Unicode characters used by the IntToHex routine,
  // which is in turn used by Format when %p or %x are used.
  // I'll add an offset to P so that it points into that array.
  // I'll make the offset odd so that it points to the high byte of a character.
  // Of course, I can't guarantee that the same offset will work for you.
  P := P + 5763;
  // Change the lookup table.
  // Of course, you would never do this on purpose.
  P^ := 39;
  // Now let's call IntToHex.
  S := IntToHex($E0, 2);
  // Show the value on the screen.
  // Hey look, the zero has been replaced with a star.
  Memo1.Lines.Add(S);
  // Convert the Unicode string to an AnsiString.
  T := AnsiString(S);
  // Show the value on the screen.
  // When converting to Ansi, the system doesn't know what to do with the star,
  // so it replaces it with a question mark.
  Memo1.Lines.Add(UnicodeString(T));
end;

As this appears to be a memory overwrite (cf. your comment to user1008646), you can try the following steps:
1. First try to find out which memory address gets overwritten. You mention that s := IntToHex(2129827392, 8); fails. Work out the correct result, then find out whether the overwritten address is within TwoHexLookUp.
2. If it is within TwoHexLookUp, set a data-changed breakpoint on it (see "How to define a breakpoint whenever an object field value changes?" and "Add data breakpoint" for how to do this).
3. Run your app until the breakpoint fires.
Ad 1: probably the easiest way is to inspect TwoHexLookUp and work out which changed entry would produce the wrong result you observe at run time from s := IntToHex(2129827392, 8);.
Thursday I'm doing some Delphi work at a client, so then I might have time to dig a bit deeper.
Edit
When you step through your process with F7, you do indeed land in SysInit first.
At that point you can already set a data breakpoint on the TwoHexLookup array.
Then continue with F9, F8, or F7 (depending on the granularity you want) and keep an eye on the array in a Watch window.
That should get you going.
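A complementary check, sketched here under assumptions: instead of waiting for a garbled log line, the application can compare IntToHex against an independently computed result for every byte value. ExpectedHex2 and FirstCorruptHexByte are hypothetical helper names, not RTL routines. Calling the check at a few points during startup can narrow down when the lookup table is overwritten.

```pascal
program HexLookupCheck;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper (not part of the RTL): builds the expected two-digit
// hex text without going through IntToHex or its lookup table.
function ExpectedHex2(B: Byte): string;
const
  Digits: array[0..15] of Char = '0123456789ABCDEF';
begin
  Result := Digits[B shr 4] + Digits[B and $0F];
end;

// Returns the first byte value for which IntToHex disagrees with the
// independently computed text, or -1 if all 256 values agree.
function FirstCorruptHexByte: Integer;
var
  I: Integer;
begin
  for I := 0 to 255 do
    if IntToHex(I, 2) <> ExpectedHex2(Byte(I)) then
      Exit(I);
  Result := -1;
end;

begin
  Writeln('First corrupt entry: ', FirstCorruptHexByte);
end.
```

In a healthy process the function should return -1. A non-negative result identifies which table entry to watch with the data breakpoint.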

Related

How to handle spurious "H2077 Value assigned to '%s' never used" messages without suppressing all hints?

Delphi (10.3 Rio) emits spurious H2077 hints for code like:
x := TFoo.Create;
y := nil;
try
  y := function_that_can_throw;
  // use x and y
finally
  x.Free;
  y.Free;
end;
Note: the hint would still be unwanted even if the compiler could prove that the function cannot throw, since AFAIK there is no way to lock the function into non-throwingness by declaring it nothrow as in other languages and to assert the nothrow property at the call site. Hence the code must be written under the assumption that the function can throw.
I would like to suppress the unhelpful/erroneous hint, but apparently it is not possible to suppress hint H2077 specifically, only all hints or none. I would like to leave hints enabled if possible, so I'm wondering if there is another option for suppressing H2077 in this situation.
Also, I would like to avoid having to code a redundant second try/finally frame, since it clutters the source and creates unnecessary object code. The simplest and most obvious alternative - calling an empty dummy procedure like pretend_to_use(y) which takes a TObject parameter and does nothing with it - would create an unnecessary global dependency and most likely superfluous function calls as well. Hence I'd like your advice on a better solution...
EDIT: it turns out that Andreas has a point and the above snippet does not produce the spurious hint (special coding in the compiler?). Here is an amended snippet that does cause the unwanted hint:
TIdStack.IncUsage;
y := nil;
try
  y := function_that_can_throw;
  // use y and the Indy stack
finally
  TIdStack.DecUsage;
  y.Free;
end;
The Indy stack thing is from something I'm currently working on, but entering/leaving critical sections would perhaps be a more common situation.
If you really want to suppress H2077, here's how I do it.
In my "utilities include" unit I have routines like:
procedure preventCompilerHint(I: integer); overload;
procedure preventCompilerHint(S: string); overload;
These are EMPTY routines, consisting simply of begin end; blocks.
I simply call these routines to show the compiler that I am actually "using" the variable in question.
If you're like me & like to be able to do a build and see zero hints and zero warnings... Well, this is how I handle the H2077.
Some may say this is less than elegant. At times that may be true. At other times I simply want to suppress this hint and move on.
Do with this as you will...
NOTE: I removed the sample code as (a) it wasn't related to my suggestion here; and (b) it was generating more interest than the suggestion itself.
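For illustration, here is a minimal sketch of how such empty routines might look and be called. The TObject overload, the Demo procedure, and the stand-in function_that_can_throw are assumptions added for this example. Whether the hint fires at all depends on the surrounding code and the compiler version.

```pascal
program HintDemo;
{$APPTYPE CONSOLE}

// Empty routines whose only job is to count as a "use" of their argument.
procedure preventCompilerHint(I: Integer); overload;
begin
end;

procedure preventCompilerHint(S: string); overload;
begin
end;

// Assumed extra overload (not in the answer above) for object references.
procedure preventCompilerHint(O: TObject); overload;
begin
end;

// Hypothetical stand-in for a function that may raise an exception.
function function_that_can_throw: TObject;
begin
  Result := TObject.Create;
end;

procedure Demo;
var
  y: TObject;
begin
  y := nil;
  preventCompilerHint(y); // the initial nil assignment now counts as used
  try
    y := function_that_can_throw;
    // use y
  finally
    y.Free;
  end;
end;

begin
  Demo;
end.
```

The trade-off is the one raised in the question: every unit that calls these routines depends on the utilities unit, and each call is a real (if trivial) procedure call.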

How to make TSynEdit's Wordwrap same as TMemo's?

I'm using TSynEdit as a more user-friendly TMemo, mostly for the advanced shortcuts, UNDO/REDO, and so on.
Everything else is OK except the word-wrap behavior (see the attached screenshot): SynEdit shows a strange space at the left-most side of wrapped lines.
How to avoid that and make it look like TMemo?
The TSynEdit's key property settings:
synEdit1.UseCodeFolding := False;
synEdit1.Options := [eoAutoIndent, eoDragDropEditing, eoEnhanceEndKey,
                     eoGroupUndo, eoScrollPastEol, eoSmartTabDelete,
                     eoSmartTabs, eoTabsToSpaces];
synEdit1.ScrollBars := ssVertical;
synEdit1.TabWidth := 4;
synEdit1.WantTabs := True;
synEdit1.WordWrap := True;
synEdit1.FontSmoothing := fsmNone;
This is not a complete, tested answer to the question, but it may offer the determined reader a jumping-off point to a functional solution.
The word-wrapping behaviour of a TSynEdit is determined by its current TSynWordWrapPlugin. The default plugin is defined in SynEditWordWrap.Pas and contains the TSynWordWrapPlugin.WrapLines method, starting at line 512 in the version I downloaded yesterday using the D10.2.3 GetIt Manager.
Starting at line 560 there is the block of code which, as far as I can tell, accounts for the space at the start of each wrapped line as illustrated in the question:
if Editor.IsWordBreakChar(vRunner^) then
begin
  vRowEnd := vRunner;
  break;
end;
Dec(vRunner);
vRunner and vRowEnd are among a number of PWideChar variables used in the WrapLines method.
This code sits inside a while loop that is looking for a place to do a word-wrap. Watching its behaviour, by the time Editor.IsWordBreakChar(vRunner^) returns True, the vRunner pointer has already moved backwards past the word-break char, which is why it (the space) ends up on the following line, causing the problem noted by the OP.
Changing the code to
if Editor.IsWordBreakChar(vRunner^) then
begin
  {ma} Inc(vRunner); // WARNING: not fully tested
  vRowEnd := vRunner;
  break;
end;
Dec(vRunner);
forces the vRunner pointer forwards past the word-break character, so that the space is included at the end of the line rather than at the start of the next one. The SynEdit then displays its wrapped text like the standard TMemo.
Personally, I would not use this change, but would instead see if I could persuade the SynEdit developers to provide an official solution. If I did use the change shown above, I certainly wouldn't do it by changing the source of SynEditWordWrap.Pas. I would write a replacement for TSynWordWrapPlugin, and I would include a check that the Inc(vRunner) does not exceed the valid bounds of the buffer being used to do the word-wrapping.

Strange StrPCopy() AV error and workaround that does not seems to make sense

I am running into a strange problem with StrPCopy(). Please take a look at the sample code below:
procedure TForm2.butnTestClick(Sender: TObject);
var
  s: string;
begin
  //-- assigning the string this way causes an AV when calling StrPCopy()
  s := 'original string';
  //-- assigning the string this way works!
  //s := Trim('original string');
  //-- AV error when trying to alter the string
  StrPCopy(PChar(s), PChar('changed'));
  //-- should come back with "changed"
  Memo1.Lines.Add(s);
end;
I am using Delphi 10 Seattle. If I try to alter "s" using StrPCopy(), I get an AV error. However, if I wrap the string literal in Trim(), it works.
It seems that wrapping the literal in Trim() makes the compiler turn off some sort of optimization for that particular string. I just don't know what that is. Please help.
When s refers to a literal it points to read only memory. Hence the access violation.
When you make a writeable string with Trim then it can be overwritten without a runtime error. That said, you've still destroyed the string because the null terminator and the length don't match.
Your main problem here is the mixing of Delphi strings with null terminated C strings. Stop that abuse and your problems disappear. There is no reason at all for you to call StrPCopy. Once you stop doing that and use native Delphi strings you cannot encounter any such problems.
The correct way to write your code is like so:
s := 'changed';
Your use of StrPCopy() is not safe, regardless of the fact that the AV goes away when you assign the new string created by Trim().
The string datatype is more complex than PChar. It has a length component that magically sits ahead of the pointer and the character data. Casting to a PChar works, but should only be used for reading.
By casting the string to a PChar, you are allowing StrPCopy to blast the 'changed' string into that part of memory. In your example, you're copying in a smaller string, so you're OK memory-wise. The result is a very confused string (the length doesn't match the contents, and there's a null character in the middle), but you're within its bounds.
If your code was something like...
StrPCopy(PChar(s), PChar('changed to this string'));
... then your code is overwriting past the string's memory footprint - usually without an immediate AV. You may get away with this. You may not.
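If a null-terminated buffer really is required (for an API call, say), a bounded copy into a buffer you own avoids both problems. The sketch below uses SysUtils.StrPLCopy; the helper name CopyIntoBuffer and the 16-character buffer size are assumptions chosen for illustration.

```pascal
program SafeCopyDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: copies Source into a fixed-size Char buffer that we
// own, truncating if necessary, and returns the buffer contents as a string.
function CopyIntoBuffer(const Source: string): string;
var
  Buf: array[0..15] of Char;
begin
  // Copies at most High(Buf) characters and always appends the terminator,
  // so the write can never run past the end of Buf.
  StrPLCopy(Buf, Source, High(Buf));
  Result := Buf;
end;

var
  S: string;
begin
  // For ordinary Delphi strings, plain assignment is all that is needed.
  S := 'original string';
  S := 'changed';
  Writeln(S);

  // A source longer than the buffer is truncated instead of overflowing.
  Writeln(CopyIntoBuffer('changed to this string'));
end.
```

The destination here is a writable local array, not the memory behind a string literal, and the explicit maximum length keeps the copy within bounds.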

Why do I get access violations when a control's class name is very, very long?

I subclassed a control so I could add a few fields that I need, but now when I create it at runtime I get an Access Violation. Unfortunately this Access Violation doesn't happen at the place where I'm creating the control, and even though I'm building with all debug options enabled (including "Build with debug DCU's") the stack trace doesn't help me at all!
In my attempt to reproduce the error I tried creating a console application, but apparently this error only shows up in a Forms application, and only if my control is actually shown on a form!
Here are the steps to reproduce the error. Create a new VCL Forms application, drop a single button, double-click to create the OnClick handler and write this:
type
  TWinControl<T,K,W> = class(TWinControl);

procedure TForm3.Button1Click(Sender: TObject);
begin
  with TWinControl<TWinControl, TWinControl, TWinControl>.Create(Self) do
  begin
    Parent := Self;
  end;
end;
This consistently generates the Access Violation, every time I tried. I have only tested this on Delphi 2010, as that's the only version I've got on this computer.
The questions would be:
Is this a known bug in Delphi's Generics?
Is there a workaround for this?
Edit
Here's the link to the QC report: http://qc.embarcadero.com/wc/qcmain.aspx?d=112101
First of all, this has nothing to do with generics, although it is a lot more likely to manifest when generics are being used. It turns out there's a buffer overflow bug in TWinControl.CreateParams. If you look at the code, you'll notice it fills a TCreateParams structure and, most importantly, fills TCreateParams.WinClassName with the name of the current class (the ClassName). Unfortunately, WinClassName is a fixed-length buffer of only 64 chars, and that has to include the NULL terminator, so effectively a 64-char-long ClassName will overflow that buffer!
It can be tested with this code:
TLongWinControlClassName4567890123456789012345678901234567891234 = class(TWinControl)
end;

procedure TForm3.Button1Click(Sender: TObject);
begin
  with TLongWinControlClassName4567890123456789012345678901234567891234.Create(Self) do
  begin
    Parent := Self;
  end;
end;
That class name is exactly 64 characters long. Make it one character shorter and the error goes away!
This is a lot more likely to happen when using generics because of the way Delphi constructs the ClassName: it includes the unit name where the parameter type is declared, plus a dot, then the name of the parameter type. For example, the TWinControl<TWinControl, TWinControl, TWinControl> class has the following ClassName:
TWinControl<Controls.TWinControl,Controls.TWinControl,Controls.TWinControl>
That's 75 characters long, well over the 63-character limit.
Workaround
I raise a simple error from the class that might trigger the overflow. Something like this, in the constructor:
constructor TWinControl<T, K, W>.Create(aOwner: TComponent);
begin
  {$IFOPT D+}
  if Length(ClassName) > 63 then
    raise Exception.Create('The resulting ClassName is too long: ' + ClassName);
  {$ENDIF}
  inherited;
end;
At least this shows a decent error message that one can immediately act upon.
Later Edit, True Workaround
The previous solution (raising an error) works fine for a non-generic class that has a really, really long name; one would very likely be able to shorten it to 63 chars or less. That's not the case with generic types: I ran into this problem with a TWinControl descendant that took 2 type parameters, so it was of the form:
TMyControlName<Type1, Type2>
The generated ClassName for a concrete type based on this generic type takes the form:
TMyControlName<UnitName1.Type1,UnitName2.Type2>
so it includes 5 identifiers (2x unit identifier + 3x type identifier) + 5 symbols (<.,.>). The average length of those 5 identifiers needs to be less than 12 chars each, or else the total length is over 63: 5x12+5 = 65. Using only 11-12 characters per identifier is very little and goes against best practices (i.e. use long descriptive names because keystrokes are free!). Again, in my case, I simply couldn't make my identifiers that short.
Considering how shortening the ClassName is not always possible, I figured I'd attempt removing the cause of the problem (the buffer overflow). Unfortunately that's very difficult because the error originates from TWinControl.CreateParams, at the bottom of the CreateParams hierarchy. We can't NOT call inherited because CreateParams is used all along the inheritance chain to build the window creation parameters. Not calling it would require duplicating all the code in the base TWinControl.CreateParams PLUS all the code in intermediary classes; It would also not be very portable, since any of that code might change with future versions of the VCL (or future version of 3rd party controls we might be subclassing).
The following solution doesn't stop TWinControl.CreateParams from overflowing the buffer, but makes the overflow harmless and then (when the inherited call returns) fixes the problem. I'm using a new record (so I have control over the layout) that includes the original TCreateParams but pads it with lots of space for TWinControl.CreateParams to overflow into. TWinControl.CreateParams overflows all it wants; I then read the complete text and shorten it so it fits the original bounds of the record, also making sure the resulting shortened name is reasonably likely to be unique. I'm including a hash of the original ClassName in the WinClassName to help with the uniqueness issue:
type
  TWrappedCreateParamsRecord = record
    Original: TCreateParams;
    SpaceForCreateParamsToSafelyOverflow: array[0..2047] of Char;
  end;

procedure TExtraExtraLongWinControlDescendantClassName_0123456789_0123456789_0123456789_0123456789.CreateParams(var Params: TCreateParams);
var
  Wrapp: TWrappedCreateParamsRecord;
  Hashcode: Integer;
  HashStr: string;
begin
  // Do I need to take special care?
  if Length(ClassName) >= Length(Params.WinClassName) then
  begin
    // Letting the code go through will cause an Access Violation because of the
    // Buffer Overflow in TWinControl.CreateParams; Yet we do need to let the
    // inherited call go through, or else parent classes don't get the chance
    // to manipulate the Params structure. Since we can't fix the root cause (we
    // can't stop TWinControl.CreateParams from overflowing), let's make sure the
    // overflow will be harmless.
    ZeroMemory(@Wrapp, SizeOf(Wrapp));
    Move(Params, Wrapp.Original, SizeOf(TCreateParams));
    // Call the original CreateParams; it'll still overflow, but it'll probably be
    // harmless since we just padded the original data structure with a
    // substantial amount of space.
    inherited CreateParams(Wrapp.Original);
    // The data needs to move back into the "Params" structure, but before we can do that
    // we should FIX the overflown buffer. We can't simply truncate it to 64, and we don't want
    // the overhead of keeping track of all the variants of this class we might encounter.
    // Note: Think of GENERIC classes, where you write this code once, but there might
    // be many, many different ClassNames at runtime!
    //
    // My idea is to FIX this by keeping as much of the original name as possible, but
    // including the HASH value of the full name in the window class name; if the HASH function
    // is any good then the resulting name has a very high probability of being unique. We'll
    // use the default hash function used for Delphi's generics.
    HashCode := TEqualityComparer<string>.Default.GetHashCode(PChar(@Wrapp.Original.WinClassName));
    HashStr := IntToHex(HashCode, 8);
    Move(HashStr[1], Wrapp.Original.WinClassName[High(Wrapp.Original.WinClassName)-8], 8*SizeOf(Char));
    Wrapp.Original.WinClassName[High(Wrapp.Original.WinClassName)] := #0;
    // Move the TCreateParams record back where we got it from
    Move(Wrapp.Original, Params, SizeOf(TCreateParams));
  end
  else
    inherited;
end;

Elegant way for handling this string issue. (Unicode-PAnsiString issue)

Consider the following scenario:
type
  PStructureForSomeCDLL = ^TStructureForSomeCDLL;
  TStructureForSomeCDLL = record
    pName: PAnsiChar;
  end;

function FillStructureForDLL: PStructureForSomeCDLL;
begin
  New(Result);
  // Result.pName := PAnsiChar(SomeObject.SomeString); // Old D7 code working all right
  Result.pName := PAnsiChar(Utf8ToAnsi(UTF8Encode(SomeObject.SomeString))); // New problematic Unicode version
end;
...code to pass FillStructureForDLL to DLL...
The problem in the Unicode version is that the string conversion now returns a new temporary string (held in a hidden local variable), which is released at the end of the FillStructureForDLL call, leaving the DLL with a dangling pointer. In the old D7 code there were no intermediate conversion functions and thus no problem.
My current solution is a converter function like the one below, which is IMO too much of a hack. Is there a more elegant way of achieving the same result?
var
  gKeepStrings: array of AnsiString;

{ Convert the given Unicode value S to ANSI and increase the ref. count
  of it so that the returned pointer stays valid }
function ConvertToPAnsiChar(const S: string): PAnsiChar;
var
  temp: AnsiString;
begin
  SetLength(gKeepStrings, Length(gKeepStrings) + 1);
  temp := Utf8ToAnsi(UTF8Encode(S));
  gKeepStrings[High(gKeepStrings)] := temp; // keeps the resulting pointer valid
                                            // by increasing the ref. count of temp.
  Result := PAnsiChar(temp);
end;
One way might be to tackle the problem before it becomes a problem, by which I mean adapt the class of SomeObject to maintain an ANSI Encoded version of SomeString (ANSISomeString?) for you alongside the original SomeString, keeping the two in step in a "setter" for the SomeString property (using the same UTF8 > ANSI conversion you are already doing).
In non-Unicode versions of the compiler make ANSISomeString be simply a "copy" of SomeString string, which will of course not be a copy, merely an additional ref count on SomeString. In the Unicode version it references a separate ANSI encoding with the same "lifetime" as the original SomeString.
procedure TSomeObjectClass.SetSomeString(const aValue: String);
begin
  fSomeString := aValue;
  {$ifdef UNICODE}
  fANSISomeString := Utf8ToAnsi(UTF8Encode(aValue));
  {$else}
  fANSISomeString := fSomeString;
  {$endif}
end;
In your FillStructure... function, simply change your code to refer to the ANSISomeString property - this then is entirely independent of whether compiling for Unicode or not.
function FillStructureForDLL: PStructureForSomeCDLL;
begin
  New(Result);
  Result.pName := PAnsiChar(SomeObject.ANSISomeString);
end;
There are at least three ways to do this.
1. You could change SomeObject's class definition to use an AnsiString instead of a string.
2. You could use a conversion system to hold references, like in your example.
3. You could initialize result.pname with GetMem and copy the result of the conversion to result.pname^ with Move. Just remember to FreeMem it when you're done.
Unfortunately, none of them is a perfect solution. So take a look at the options and decide which one works best for you.
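As a rough sketch of the GetMem/Move option: the function below allocates a separate null-terminated ANSI copy that outlives any temporary string. AllocAnsiCopy is a hypothetical name, and the plain AnsiString cast stands in for whichever conversion the real code uses (the question uses Utf8ToAnsi(UTF8Encode(...))).

```pascal
program AnsiCopyDemo;
{$APPTYPE CONSOLE}

// Hypothetical helper: returns a heap-allocated, null-terminated ANSI copy
// of S. The caller owns the block and must release it with FreeMem.
function AllocAnsiCopy(const S: string): PAnsiChar;
var
  A: AnsiString;
  Len: Integer;
begin
  A := AnsiString(S); // assumption: default ANSI conversion is acceptable
  Len := Length(A);
  GetMem(Result, Len + 1);
  if Len > 0 then
    Move(PAnsiChar(A)^, Result^, Len);
  Result[Len] := #0; // terminate explicitly; GetMem does not zero the block
end;

var
  P: PAnsiChar;
begin
  P := AllocAnsiCopy('some name');
  try
    Writeln(P);
  finally
    FreeMem(P);
  end;
end.
```

The cost of this approach is ownership bookkeeping: whatever code disposes of the record also has to call FreeMem on pName.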
Hopefully you already have code in your application to properly dispose of all the dynamically allocated records that you New() in FillStructureForDLL(). I consider this code highly dubious, but let's assume it is reduced code meant only to demonstrate the problem. Anyway, the DLL you pass the record instance to does not care how big the chunk of memory is; it only gets a pointer to it. So you are free to increase the size of the record to make room for the Pascal string that is otherwise only a temporary in the Unicode version:
type
  PStructureForSomeCDLL = ^TStructureForSomeCDLL;
  TStructureForSomeCDLL = record
    pName: PAnsiChar;
    // ... other parts of the record
    pNameBuffer: AnsiString; // must be AnsiString so pName points at ANSI data
  end;
And the function:
function FillStructureForDLL: PStructureForSomeCDLL;
begin
  New(Result);
  // there may be a bug here, can't test on the Mac... idea should be clear
  Result.pNameBuffer := Utf8ToAnsi(UTF8Encode(SomeObject.SomeString));
  Result.pName := PAnsiChar(Result.pNameBuffer);
end;
BTW: You wouldn't even have that problem if the record passed to the DLL was a stack variable in the procedure or function that calls the DLL function. In that case the temporary string buffers will only be necessary in the Unicode version if more than one PAnsiChar has to be passed (the conversion calls would otherwise reuse the temporary string). Consider changing the code accordingly.
Edit:
You write in a comment:
This would be best solution if modifying the DLL structures were an option.
Are you sure you can't use this solution? The point is that from the POV of the DLL the structure isn't modified at all. Maybe I didn't make myself clear, but the DLL will not care whether a structure passed to it is exactly what it is declared to be. It will be passed a pointer to the structure, and this pointer needs to point to a block of memory that is at least as large as the structure, and needs to have the same memory layout. However, it can be a block of memory that is larger than the original structure, and contain additional data.
This is actually used in quite a lot of places in the Windows API. Did you ever wonder why there are structures in the Windows API that contain as the first thing an ordinal value giving the size of the structure? It's the key to API evolution while preserving backwards compatibility. Whenever new information is needed for the API function to work it is simply appended to the existing structure, and a new version of the structure is declared. Note that the memory layout of older versions of the structure is preserved. Old clients of the DLL can still call the new function, which will use the size member of the structure to determine which API version is called.
In your case no different versions of the structure exist as far as the DLL is concerned. However, you are free to declare it larger for your application than it really is, provided the memory layout of the real structure is preserved, and additional data is only appended. The only case where this wouldn't work is when the last part of the structure were a record with varying size, kind of like the Windows BITMAP structure - a fixed header and dynamic data. However, your record looks like it has a fixed length.
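The size-member idea can be sketched as follows. All type and function names here (TInfoV1, TInfoV2, ReadExtra, CallWithV1, CallWithV2) are hypothetical and only illustrate the pattern; they do not correspond to any real Windows API structure.

```pascal
program SizedRecordDemo;
{$APPTYPE CONSOLE}

type
  // Hypothetical version 1 of an API structure: the size field comes first.
  TInfoV1 = record
    Size: Cardinal;
    Flags: Cardinal;
  end;
  PInfoV1 = ^TInfoV1;

  // Hypothetical version 2: identical leading layout, one field appended.
  TInfoV2 = record
    Size: Cardinal;
    Flags: Cardinal;
    Extra: Cardinal;
  end;
  PInfoV2 = ^TInfoV2;

// Stand-in for the library function: it uses the Size member to decide
// whether the caller passed the larger version of the structure.
function ReadExtra(Info: PInfoV1): Cardinal;
begin
  if Info^.Size >= SizeOf(TInfoV2) then
    Result := PInfoV2(Info)^.Extra
  else
    Result := 0; // old caller: the appended field does not exist
end;

// An "old" client that only knows version 1.
function CallWithV1: Cardinal;
var
  Info: TInfoV1;
begin
  Info.Size := SizeOf(Info);
  Info.Flags := 0;
  Result := ReadExtra(@Info);
end;

// A "new" client that fills in the appended field.
function CallWithV2(Extra: Cardinal): Cardinal;
var
  Info: TInfoV2;
begin
  Info.Size := SizeOf(Info);
  Info.Flags := 0;
  Info.Extra := Extra;
  Result := ReadExtra(PInfoV1(@Info));
end;

begin
  Writeln(CallWithV1);
  Writeln(CallWithV2(7));
end.
```

The appended pNameBuffer field in the answer relies on the same property: code that only knows the shorter layout never looks past it.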
Wouldn't PAnsiChar(AnsiString(SomeObject.SomeString)) work?

Resources