Delphi ^A syntax: Documented, implied, or undocumented?

Let me explain by an example. In Delphi, you can write
procedure TForm1.FormKeyPress(Sender: TObject; var Key: Char);
begin
  if Key = ^C then
    ShowMessage('The user wants to copy something.')
  else if Key = ^V then
    ShowMessage('The user wants to paste.')
end;
to check for Ctrl+C and Ctrl+V keyboard commands. In fact, the same syntax works for Ctrl+A, where A is any character, and -- of course -- you can also use a case statement instead of ifs. You can even do ShowMessage(^A), so, apparently, ^A is considered a char.
However, when browsing the official Delphi documentation, I cannot find any reference to this syntax. But maybe the ^A syntax is so common that it is understood as a part of the underlying plain text file format? Or is it simply an undocumented feature of the Delphi programming language? (Notice that the above constructions are actually used in the RTL/VCL source code. But, of course, Embarcadero, and Embarcadero alone, is allowed to use undocumented features, if any such exist.)

This dates from long ago: it is an escape notation that lets you declare constants for control characters in a more readable way.
const
  CtrlC = ^C;
begin
  Write(Ord(CtrlC));
end.
This defines a Char constant with value #3, then writes 3 in Borland Pascal 7, and I remember seeing it years before that too.
I just checked the Turbo Pascal 5.0 and Borland Pascal 7.0 language guides, but could not find it, so it seems undocumented.
Edit:
I do remember this was a Borland thing, and just checked: it is not part of the ISO Pascal standard (formerly this was ANSI Pascal Standard, thanks Sertac for noticing this).
It is documented in the Free Pascal documentation.
SGI uses the backslash as escape character, as per their docs.
More Edit: I found it documented in Delphi in a Nutshell and the Delphi Basics site.
Found it: Just found it on page 37 of the Turbo Pascal 3 Reference Manual.
--jeroen

This is a known undocumented feature. But then again, the latest official syntax documentation is from Delphi 7.

Related

CharInSet bulk conversions when migrating from Delphi 2007

We are shifting a number of projects from Delphi 2007 to XE8 and have a number of the following warning (many hundreds of them):
[dcc32 Warning] X.PAS(1568): W1050 WideChar reduced to byte char in set expressions. Consider using 'CharInSet' function in 'SysUtils' unit.
It occurs to me that many of these are of the form
if x in ['1','2','3'] then
which need to be converted to
if CharInSet(x, ['1','2','3']) then
And this looks like there might be some sort of regular expression type search and replace that could be used to do these in bulk.
Can anyone think of a way to convert these in bulk?
This can be done with Search/Replace in the IDE.
The following works for me in XE4.
search for:
if {[a-z]} in \[{{'[0-9]+'\,? ?}+}\] then
If you want to match a variable name more than one character long, consider using a quantifier such as [a-z]+.
replace with:
if CharInSet\(\0, \[\1\]\) then
Notice that the IDE uses {} for groups and \0, \1 ... as replacement placeholders.
Embarcadero Regular Expressions reference for Delphi XE4
You may also find this question useful for further reference.
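For reference, here is a minimal console sketch of the converted form in use. It assumes a Unicode Delphi (2009 or later, such as XE8) where SysUtils provides CharInSet; the program and the helper name CountOneTwoThree are made up for illustration.

```pascal
program CharInSetSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: counts the characters of S that are '1', '2' or '3'.
function CountOneTwoThree(const S: string): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(S) do
    // Same shape as the replacement text: CharInSet(x, [...])
    if CharInSet(S[I], ['1', '2', '3']) then
      Inc(Result);
end;

begin
  Writeln(CountOneTwoThree('a1b2c3d4'));  // prints 3
end.
```

Written this way, the set stays a set of byte-sized characters and the W1050 warning should no longer be raised for that line.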

Why can the circumflex sign be used for control chars in Delphi and is it a good idea?

I just came across an answer on SO with a curious syntax:
How do I include a newline character in a string in Delphi?
MyString := 'Hello,' + ^M + ^J + 'world!';
I've been using Delphi for several years now, but I didn't know you could use the circumflex sign for control characters.
Is this just a left over from the early Delphi or Turbo Pascal days?
Should it be used nowadays?
PS: I'm not asking about advice on the line break characters, there is sLineBreak and other methods as discussed in the original question.
No, it is not from the Turbo Pascal days. It is from decades before TP, before MS-DOS, and probably even before UNIX. Think of something as old as the first 300 bit-per-second dial-up modems, the DEC VT-52 terminal, the RT-8 OS on a PDP-8 machine, and early versions of C. Or maybe even older, though everything older is mere legend to me :-).
The "^" sign is a shortcut for the "Ctrl" key. So ^C in traditional notation stands for Ctrl+C in Microsoft notation. That notation was widely used for text-mode menus in MS-DOS times, as in the aforementioned Turbo Pascal, Norton Utilities, DOS Navigator, etc.
From memory, you can read "^" as "subtract 64".
So, as Chr(65) is 'A', Chr(1) would be ^A.
And ^@ would be #0 :-) AFAIR in MS-DOS times pressing Ctrl+Shift+"2/@" would actually put #0 into the BIOS keyboard buffer :-)
^[ would AFAIR be #27, aka the Esc(ape) char. Telnet uses the same notation when it announces its escape character (^], which is #29).
So Turbo Pascal long ago chose to follow the time-blessed convention, and the rules of backward compatibility have applied ever since. Personally, I find the literal 'bla-bla'^M^J'foo-baz' more string-like than 'bla-bla'#13#10'foo-baz' when you want it on one line. Constructing the value with plus signs is a better fit when your literal spans several source lines.
The pity is that syntax highlighting in Delphi IDE is hopelessly broken on that kind of constants.
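The "subtract 64" rule is easy to check with a small console program. This is only a sketch: the program name and the constant names are made up for illustration, and it relies on nothing beyond the caret notation discussed in this thread.

```pascal
program CaretNotationSketch;
{$APPTYPE CONSOLE}

const
  CtrlA = ^A;  // illustrative names, not part of any library
  CtrlM = ^M;
  CtrlJ = ^J;

begin
  // Each line is expected to print TRUE.
  // "Subtract 64": Ord('A') is 65, so ^A should be #1.
  Writeln(Ord(CtrlA) = Ord('A') - 64);
  // ^M and ^J are carriage return and line feed.
  Writeln(CtrlM = #13);
  Writeln(CtrlJ = #10);
  // The literal from the question equals the #13#10 spelling.
  Writeln('Hello,' + ^M + ^J + 'world!' = 'Hello,'#13#10'world!');
end.
```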
Yes this is a legacy from days of yore.
And no you should not get into the habit of using this feature. Remember that code is read more often than it is written. Always think of your readers who most likely won't know what that syntax means.
Yes, this is left over from TP days. You could also write your statement like this
mystring:= 'Hello'#13#10'world!';
which is probably less obscure and more easily understandable than using ^M and ^J. Of course, you should really define constants
const
  crlf = #13#10;
begin
  mystring := 'Hello' + crlf + 'world!';
end;

How to know (in code) whether some characters are displayed correctly in the interface of a program made in Delphi

Sorry about my English...
I'm trying to make a small program in Delphi 7.
Its interface will have text in my language, which has some characters with diacritics.
If "Language for non-Unicode programs" is set to my language, those characters are always displayed fine. That's normal.
If it is set to something else, they are sometimes displayed fine and sometimes not.
How can I tell whether they will be displayed fine or not?
Oh, and I can't use Unicode components, only the normal ones.
The only way I have found is to capture the image of one character into a bitmap and check it pixel by pixel. But that is a lot of work to implement, slow and imprecise.
I can use the GetSystemDefaultLangID function to find out that "Language for non-Unicode programs" is set to something else, but I still don't know whether the characters are displayed fine or not.
Thank you for any ideas.
Welcome to the joys of AnsiStrings encoded using code pages. You should not be using AnsiStrings at all, and you know that, but you say, without explaining it, that you can't use Unicode controls. This seems strange to me. You should be using either:
(a) A Unicode version of Delphi (2009,2010, XE), where String=UnicodeString.
(b) If not that, at least use Proper Unicode controls, such as TNT Controls, and internally use WideString types where you need to store accented or international characters.
Your version of Delphi has String=AnsiString, and you are relying on the locale that your system is set to (as you say in your question) to select the codepage representations of accented characters, which is a problematic scheme. If you really can't move up from Delphi 7, at least start using WideStrings and TNT Unicode Controls, but I must say that effort is WASTED; you would be better off getting Delphi XE and just porting to Unicode.
Your question asks "how can I know if they can be stored fine or not?" You can encode and decode using your codepage, and check whether anything is replaced with a "?". The Windows function WideCharToMultiByte, for example, behaves like this. MBCS is a world of pain, and not worth doing, but you asked how you can find out where the floor falls out from under you, so that API will help you understand your selected encoding rule.
Use WideCharToMultiByte Function - http://msdn.microsoft.com/en-us/library/dd374130(v=vs.85).aspx and check lpUsedDefaultChar parameter.
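A minimal sketch of that check follows, assuming a classic single- or double-byte ANSI code page. The function name FitsInAnsiCodePage and the local constant NoBestFitChars are made up for illustration. The constant carries the value of WC_NO_BEST_FIT_CHARS from the Windows SDK and is declared locally on the assumption that an old Windows.pas may not define it.

```pascal
program CodePageCheckSketch;
{$APPTYPE CONSOLE}

uses
  Windows;

const
  // Value of WC_NO_BEST_FIT_CHARS in the Windows SDK. Declared locally
  // (assumption: older Windows.pas versions may lack it).
  NoBestFitChars = $00000400;

// Hypothetical helper: True when every character of S can be represented
// in the system ANSI code page without falling back to the default char.
function FitsInAnsiCodePage(const S: WideString): Boolean;
var
  Buf: AnsiString;
  UsedDefault: BOOL;
  Written: Integer;
begin
  Result := True;
  if S = '' then
    Exit;
  // Two bytes per character is enough for single- and double-byte code pages.
  SetLength(Buf, Length(S) * 2);
  UsedDefault := False;
  Written := WideCharToMultiByte(CP_ACP, NoBestFitChars,
    PWideChar(S), Length(S), PAnsiChar(Buf), Length(Buf),
    nil, @UsedDefault);
  Result := (Written > 0) and not UsedDefault;
end;

var
  Sample: WideString;
begin
  Sample := WideChar($0103);  // U+0103, "a with breve", used in Romanian
  // The result depends on the "Language for non-Unicode programs" setting.
  Writeln(FitsInAnsiCodePage(Sample));
end.
```

Without the no-best-fit flag, Windows may silently map an accented letter to its unaccented look-alike and leave lpUsedDefaultChar unset, which would hide exactly the case being tested for.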
Since this has been on my research list for a while, but didn't reach the top of that list yet, I can only help you out with a few links.
You will need to do quite a bit of experimentation :-)
When using Unicode, you can use functions ScriptGetCMap and GetGlyphIndices to test if a code point is in the font.
When not using Unicode, you can use the function GetGlyphIndices
There are few Delphi translations of these functions around. This Borland Newsgroup thread has a few hints on using GetGlyphIndices in Delphi.
Here is a search ScriptGetCMap in Delphi.
This page has a list of some interesting API calls that might help you further.
An extra handicap is that, because not all fonts contain all characters, Windows can do font substitution for you.
I'm not sure how to detect that, but it is something you have to check for too.
Good luck :-)
procedure TForm1.Button2Click(Sender: TObject);
var
  ACP: Integer;
begin
  ACP := GetACP;
  Caption := 'CP' + IntToStr(ACP);
  if ACP = 1250 then
    Caption := Caption + ' is okay for Romanian language';
end;

C-style hexadecimals in Delphi - undocumented feature?

I noticed by chance that the following code
var
  I: Integer;
begin
  I := StrToInt('0xAA');
  ShowMessage(IntToStr(I)); // shows 170 = $AA
end;
is OK in Delphi 2009. BTW the feature helped me to extract hexadecimal constants from C header file.
I wonder: is it OK to rely on this feature, or is it likely to be "fixed" in future versions?
It's a feature, and you can rely on it. One of the philosophical changes that occurred in the evolution of Turbo Pascal into Delphi was the acknowledgment that Delphi lives in a C-dominated world and there was more to be gained by gracefully accepting or tolerating C-isms than ignoring them and forcing the Delphi developer to sort it out. Interop with C++ Builder as mentioned by Rob was a factor, but so was the fact that Delphi was designed first for Windows, and Windows has a lot of C language artifacts in the Windows API.
I think the term "impedance mismatch" may apply here - it was simple enough to remove the impedance mismatch between Delphi hex handling and "Rest of World", so we did.
Recall that the Delphi RTL is used by C++ Builder, too. The documentation doesn't go into detail about exactly what it means when it says StrToInt accepts "decimal or hexadecimal notation." You can safely expect StrToInt to continue to accept C-style numbers.
Val accepts the same input, as does Read (because they all end up calling System._ValLong).
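A short console sketch of the behaviour described in these answers, using only StrToInt and Val. The program name is made up; the printed values follow from the question ($AA is 170) and from the statement that Val accepts the same input.

```pascal
program HexParseSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  N, Code: Integer;
begin
  // C-style and Pascal-style hexadecimal prefixes give the same value.
  Writeln(StrToInt('0xAA'));  // prints 170
  Writeln(StrToInt('$AA'));   // prints 170

  // Val takes the same notation; Code is 0 when parsing succeeds.
  Val('0xAA', N, Code);
  Writeln(N, ' ', Code);      // prints 170 0
end.
```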

FormatDateTime with chinese location - wrong characters... Delphi 2007

Output: Period: from 11-Ê®¶þÔÂ-10 to 13-Ê®¶þÔÂ-10
The above output is from a line like this:
FormatDateTime('dd-mmm-yy', dateValue)
The IDE is Delphi 2007 and we are trying to gear up our app to the Chinese market.
How can I display the correct characters?
With the setting turn to Hindi (India), instead of the funny characters I have the "?".
I'm trying to display the date on a report, using ReportBuilder 11.
Any help will be much appreciated.
The characters seem to be correct, only IMO they have been rendered wrong.
Here's what I've done:
copied the string as presented by the OP ("11-Ê®¶þÔÂ-10 to 13-Ê®¶þÔÂ-10");
pasted it into a blank plain-text editor window with CP 1252 (Windows Latin-1) and saved;
opened the text file in a browser;
the text showed up the same as the browser chose the same codepage, so I turned on the automatic detection of character encoding, hinting it that the contents was Chinese;
the text changed to "11-十二月-10 to 13-十二月-10" (hope your browser displays correct Chinese characters here, mine does anyway) and the codepage changed to GB18030 (and I then tried GB2312, but the text wouldn't change);
well, I was curious and searched for "十二月", and it turned out to stand for "December", quite suitable for the context unless the month names had been mixed up.
So, this is why I think it's a text rendering (or whatever you call it, I'm not really sure about the term) problem.
EDIT: Of course, it must have had something to do with the data type chosen for storing the string. If the function result is AnsiString and the variable is WideString, then maybe the characters get converted as WideChars and so they are no longer one-byte compounds of multi-byte characters but are multi-byte characters on their own? At least that's what happened when the OP posted them here.
I don't know actually, but if it is so then I doubt if they can be rendered correctly unless converted back and rendered as part of an AnsiString.
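The manual experiment above can be sketched in code. The characters "Ê®¶þÔÂ" correspond to the bytes CA AE B6 FE D4 C2 when read as CP 1252. The sketch decodes those bytes with MultiByteToWideChar and prints the resulting code points. It assumes code page 936 (GBK, which agrees with GB2312 for these characters) is installed, and the program and constant names are made up for illustration. If the decoding works, the output should be 5341 4E8C 6708, the code points of 十二月.

```pascal
program MojibakeSketch;
{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

const
  // Bytes of the month name as they appear in the garbled output.
  RawBytes: array[0..5] of Byte = ($CA, $AE, $B6, $FE, $D4, $C2);
  // Assumption: Simplified Chinese GBK is the encoding actually in use.
  CodePageGbk = 936;

var
  Wide: array[0..5] of WideChar;
  Count, I: Integer;
begin
  // Reinterpret the raw bytes as GBK text and convert them to UTF-16.
  Count := MultiByteToWideChar(CodePageGbk, 0,
    PAnsiChar(@RawBytes[0]), Length(RawBytes),
    @Wide[0], Length(Wide));

  // Print code points rather than glyphs so the console font does not matter.
  for I := 0 to Count - 1 do
    Write(IntToHex(Ord(Wide[I]), 4), ' ');
  Writeln;
end.
```

If Count comes back as 0, the code page is probably not available on the machine, which matches the remark elsewhere in this thread about Far East support not always being installed.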
Another solution is to use TntControls. They're a set of standard Delphi controls enhanced to support Unicode. You'll have to go through all your form files and replace
Button1: TButton
Label1: TLabel
with TTntButton, TTntLabel et cetera.
Please note, that as things stand, it's not only Chinese which will not work. Try any language using symbols other than standard European set (latin + stress marks etc), for instance Russian.
But
By replacing the controls, you'll solve one part of the problem. Another part is that everywhere where you use "string" or "AnsiString" and "char/pchar" or "AnsiChar/PAnsiChar", you can store only strings in default system encoding.
For instance, if your system encoding ("Language for non-unicode programs") is EN/US, Russian characters will be replaced with question marks when you assign them to "string" variable:
a: WideString;
b: string;
...
a := 'ЯУЭФЫЦ'; //WideString can store international characters
b := a; //string cannot, so the data is lost - you cannot restore it from just "b"
To store string data which is independent of system encoding, use WideString/WideChar/PWideChar and appropriate functions. If you have
a, b: WideString;
...
a := UpperCase(b);
then unicode information will still be lost because UpperCase() accepts "string":
function UpperCase(const S: string): string;
Your WideString will be converted to "string" (losing all international characters), given to UpperCase, then the result will be converted back to WideString but it's already too late.
Therefore you have to replace all string functions with Wide versions:
a := WideUpperCase(b);
(for some functions, their wide versions are unavailable or called differently, TntControls also contain a bunch of wide function versions)
The Chinese Market requires support for multi-byte character sets (either WideChar or Unicode).
The Delphi 2007 RTL/VCL only supports single-byte character sets (there is very limited support for WideChar in the RTL and VCL).
The easiest option for you is to upgrade to a Delphi version that supports Unicode (Delphi 2009 was the first version that supports Unicode; the current Delphi version is Delphi XE).
Or you will need to update all your components to support WideChar, and rewrite the portions of RTL/VCL for which you need WideChar support.
--jeroen
Did you install Far East charset support in Windows? In Windows versions before 7 (or Vista), those charsets are not installed by default in Western versions; you have to add them in Control Panel -> Regional Settings, IIRC.
With a non-Unicode version of Delphi, which characters can be displayed unfortunately depends on the current codepage. If it is not one of the Chinese ones, for example, it cannot display the characters you need. Which characters are actually displayed depends on how the codes you're using are mapped in the current codepage. You could use a multi-lingual version of Windows to switch fully to the locale you need, or you have to use a Unicode version of Delphi (from 2009 onwards).

Resources