I have a String that I needed access to the first character of, so I used stringname[1]. With the unicode support this no longer works. I get an error: [DCC Error] sndkey32.pas(420): E2010 Incompatible types: 'Char' and 'AnsiChar'
Example code:
//vkKeyScan from the windows unit
var
KeyString : String[20];
MKey : Word;
mkey := vkKeyScan(KeyString[1]);
How would I write this in modern versions of Delphi?
The type String[20] is a ShortString with a maximum length of 20, i.e. a ShortString that can hold up to 20 characters. But ShortStrings behave like AnsiStrings, i.e. they are not Unicode: one character is one byte. Thus KeyString[1] is an AnsiChar, whereas the vkKeyScan function expects a WideChar (=Char) as argument. I really have no idea whatsoever why you want to use the type String[20] instead of String (=UnicodeString), but you could convert the AnsiChar KeyString[1] to a WideChar:
mkey := vkKeyScan(WideChar(KeyString[1]))
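If the fixed-length ShortString is not actually required, the other route is to declare the variable as a plain String, so that indexing already yields a Char. The following is only a minimal sketch under that assumption: the program name and the test value 'A' are made up for illustration, and the result is stored in a SmallInt because VkKeyScan is declared as returning SHORT.

```delphi
program KeyScanDemo;
{$APPTYPE CONSOLE}

uses
  Windows; // in XE2 and later this resolves to Winapi.Windows via the default unit scope names

var
  KeyString: string;  // UnicodeString in Delphi 2009+, so KeyString[1] is a Char
  MKey: SmallInt;     // VkKeyScan returns SHORT; -1 means "no key found"
begin
  KeyString := 'A';   // illustrative value only
  if KeyString <> '' then  // guard against indexing an empty string
  begin
    MKey := VkKeyScan(KeyString[1]);
    Writeln(MKey);
  end;
end.
```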
Off the top of my head: do you really need a string, which is equal to UnicodeString in Delphi 2009?
One option is to have the definition
var KeyString: AnsiString;
then when you take KeyString[1] that would be an AnsiChar rather than a Char.
I recently ran into this data type mismatch. It is something I have never seen before. I hope someone can explain what these types are and how they differ.
The error I got was F2063. [DCC Error] E2010 Incompatible types: 'AnsiChar' and 'Char'
Historically in Delphi, the Char type was effectively a synonym for the ANSIChar type. That is, a single byte representing a character from an ANSI codepage. NOTE: This is a simplification that ignores the complications arising from multibyte characters which could be encountered in an ANSI string but will suffice for this answer.
This corresponded with the fact that the String type was effectively a synonym for ANSIString.
In Delphi 2009 onward, this changed.
With Delphi 2009, the String and Char types became synonyms for UnicodeString (a reference-counted UTF-16 string, similar to a WideString but with additional capabilities) and WideChar, respectively, reflecting the transition to Unicode as the native format for string and character types. A WideChar is a 2 byte value representing a single character of Unicode (or one half of a surrogate pair).
Therefore, in versions of Delphi prior to Delphi 2009, the following two variables were of compatible types:
var
ach: ANSIChar;
ch: Char; // Synonymous with ANSIChar
However, in Delphi 2009 and later the meaning of the "ch" declaration changes:
var
ach: ANSIChar;
ch: Char; // Synonymous with WIDEChar
As a result, the ach and ch variables are no longer of compatible types.
In other words, the reason you are getting this error is that you have some code which has been declared with ANSIChar types and other code which is using values declared of type Char. When compiled with an old version of Delphi where Char = ANSIChar, the two sets of code are compatible, but in Delphi 2009 and later Char = WideChar and so the two types (Char and ANSIChar) are not compatible.
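A quick way to see the difference is to compare the sizes of the two types and to convert between them explicitly. This is just a sketch, assuming Delphi 2009 or later; the program name and the value 'A' are illustrative, and a plain ASCII character is used so that the casts are unambiguous.

```delphi
program CharSizes;
{$APPTYPE CONSOLE}

var
  ach: AnsiChar;
  ch: Char;  // WideChar in Delphi 2009 and later
begin
  Writeln(SizeOf(ach));  // 1
  Writeln(SizeOf(ch));   // 2

  ach := 'A';
  ch := Char(ach);       // explicit cast from the one-byte type to the two-byte type
  ach := AnsiChar(ch);   // explicit cast back; only safe here because 'A' is plain ASCII
  Writeln(ch, ach);
end.
```

Where the two kinds of code have to meet, it is usually cleaner to decide on one type per variable than to scatter casts like these around.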
I have some text that I need to store in a WideString variable. But my text is UTF-8, and WideString doesn't support UTF-8 and converts it to some Chinese characters.
So is there any UTF-8 version of WideString?
I always use UTF8String, but in this case I have to use WideString.
When you assign a UTF8String variable to a WideString variable, the compiler automatically inserts instructions to decode the string (in Delphi 2009 and later). It converts UTF-8 to UTF-16, which is what WideString holds. If your WideString variable holds Chinese characters, then that's because your UTF-8-encoded string holds UTF-8-encoded Chinese characters.
If you want your string ws to hold 16-bit versions of the bytes in your UTF8String s, then you can by-pass the automatic conversion with some type-casting:
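var
  ws: WideString;
  i: Integer;
  c: AnsiChar;
begin
  // s is the UTF8String from the question
  SetLength(ws, Length(s));
  for i := 1 to Length(s) do begin
    c := s[i];
    // widen each byte to 16 bits without decoding it
    ws[i] := WideChar(Ord(c));
  end;
end;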
If you're using Delphi 2009 or later (which includes the XE series), then you should consider using UnicodeString instead of WideString. The former is a native Delphi type, whereas the latter is more of a wrapper for the Windows BSTR type. Both types exhibit the automatic conversion behavior when assigning to and from AnsiString derivatives like UTF8String, though, so the type you use doesn't affect this answer.
In earlier Delphi versions, the compiler would attempt to decode the string using the system code page (which is never UTF-8). To make it decode the string properly, call Utf8Decode:
ws := Utf8Decode(s);
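To make the decoding behaviour concrete, here is a small sketch for Delphi 2009 and later. The program name and the sample text are assumptions for illustration; the character é is written as #$00E9 so the example does not depend on the encoding of the source file, and explicit casts are used in place of plain assignments only to avoid implicit-cast warnings.

```delphi
program Utf8Demo;
{$APPTYPE CONSOLE}

var
  u: string;
  s: UTF8String;
  ws: WideString;
begin
  u := 'h'#$00E9'llo';   // five UTF-16 characters, one of them non-ASCII
  s := UTF8String(u);    // encodes to UTF-8; the non-ASCII character takes two bytes
  ws := WideString(s);   // decodes UTF-8 back to UTF-16

  Writeln(Length(s));    // 6 (bytes)
  Writeln(Length(ws));   // 5 (UTF-16 code units)
end.
```

The differing lengths show that a real conversion takes place, rather than a byte-for-byte copy like the loop above.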
Consider the following snippet:
procedure TForm1.FormCreate(Sender: TObject);
  {$REGION 'Sealed declarations'}
  type WCh = WideChar; // (1)
  type Str = ^WCh; // (2)
  { this routine accepts character pointer }
  procedure Baz(Param: Str);
  begin
  end;
  {$ENDREGION}
  { this one too, but character pointer type used directly }
  procedure Bar(Param: PWideChar);
  begin
  end;
  { this constant should be compatible with strings and character pointers }
  const FOO = 'FOO';
begin
  Bar(FOO); // compiles!
  Baz(FOO); // BAH! E2010 Incompatible types: 'Str' and 'string'
end;
How do I resolve this problem while preserving both the structured typing in the declarations and the clarity and readability of the usage (I hope for no heavy typecasting)?
NB: By "sealed declarations" I really mean it. I prefer not to amend them unless it is absolutely necessary.
Internal handling of conversion between string and PChar varies from version to version, so the environment might matter. I encountered this problem in Delphi XE.
As Rob Kennedy correctly noticed in comments, the question is about conversion from string literal, not string type.
To simplify coding Delphi allows implicit conversion from string literal to PChar type and PChar aliases.
To avoid typecasting you can use
type Str = PWideChar;
or use a distinct type
type Str = type PWideChar;
I have not noticed any difference in string literal --> PWideChar implicit conversion in Unicode Delphi versions (2009 and above).
Your WCh = WideChar definition creates a type alias for WideChar — they have type identity — but the subsequent Str = ^WCh definition does not create a type alias for PWideChar. When $T+ is in effect, they're compatible and assignment-compatible, but those aren't good enough in this situation. They are still distinct types.
The FOO constant is a string literal. The documentation for assignment compatibility says what types a string literal can be assigned to: "PAnsiChar, PWideChar, PChar or any string type." Str is not a string type. It's a pointer type, but it's not PWideChar, despite how similar their definitions are.
The type of a string literal adapts based on context. When the compiler needs a PWideChar, the string literal is a PWideChar. When the compiler needs an AnsiString, it's an AnsiString. (If the compiler needs both those types, then the literal will be stored in the program both ways.) String literals aren't assignable to your Str type, so, according to the error message, the compiler apparently chooses string as the type for the string literal in that situation. You can type-cast it to one of the other built-in types, but the better solution would be to avoid using custom-defined character-pointer types at all.
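If the declarations really must stay as they are, the type-cast route could look roughly like this. It is only a sketch: the console program wrapper and the Writeln inside Baz are additions for illustration, while WCh, Str, Baz and FOO are taken from the question.

```delphi
program StrCastDemo;
{$APPTYPE CONSOLE}

type
  WCh = WideChar;
  Str = ^WCh;  // the "sealed" declaration, left unchanged

procedure Baz(Param: Str);
begin
  // cast back to PWideChar only to print the text for this demo
  Writeln(string(PWideChar(Param)));
end;

const
  FOO = 'FOO';

begin
  // First make the literal a PWideChar (a type it is assignable to),
  // then cast that pointer to the custom pointer type.
  Baz(Str(PWideChar(FOO)));
end.
```

The double cast is exactly the kind of noise the question hoped to avoid, which is why declaring Str as an alias of PWideChar is the tidier fix when the declaration can be changed.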
Under Delphi 2010 (and probably under D2009 also) the default string type is UnicodeString.
However if we declare...
const
s :string = 'Test';
ss :string[4] = 'Test';
... then the first string s is declared as UnicodeString, but the second one ss is declared as AnsiString!
We can check this: SizeOf(s[1]); will return size 2 and SizeOf(ss[1]); will return size 1.
If I declare...
var
s :string;
ss :string[4];
... then I want ss to be of UnicodeString type as well.
How can I tell Delphi 2010 that both strings should be of UnicodeString type?
How else can I declare that ss holds four WideChars? The compiler will not accept the type declarations WideString[4] or UnicodeString[4].
What is the purpose of two different compiler declarations for the same type name: string?
The answer to this lies in the fact that string[n], which is a ShortString, is now considered a legacy type. Embarcadero took the decision not to convert ShortString to have support for Unicode. Since the long string was introduced, if my memory serves correctly, in Delphi 2, that seems a reasonable decision to me.
If you really want fixed length arrays of WideChar then you can simply declare array [1..n] of char.
You can't, using string[4] as the type. Declaring it that way automatically makes it a ShortString.
Declare it as an array of Char instead, which will make it an array of 4 WideChars.
Because string[4] declares a string containing up to 4 single-byte characters. Since a WideChar is two bytes in size, reusing that syntax for wide characters would be a) wrong, and b) confusing. ShortStrings are still around for backward compatibility, and are automatically ANSI because they consist of [x] one-byte chars.
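The array-of-Char suggestion could be sketched as follows, assuming Delphi 2009 or later. The type name TFourChars and the program name are made up for this example; the variable names follow the question.

```delphi
program FixedWideChars;
{$APPTYPE CONSOLE}

type
  TFourChars = array[1..4] of Char;  // hypothetical name; four WideChars in Delphi 2009+

const
  s: string = 'Test';
  ss: TFourChars = 'Test';  // a character array can be initialized from a literal of the same length

begin
  Writeln(SizeOf(s[1]));   // 2
  Writeln(SizeOf(ss[1]));  // 2
  Writeln(SizeOf(ss));     // 8, i.e. 4 characters of 2 bytes each
end.
```

Unlike a ShortString, such an array has no length byte and always holds exactly four characters, so any padding or termination has to be handled by the code that uses it.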
So the question is whether string literals (or const strings) in Delphi 2009/2010 can be directly cast to PAnsiChar, or whether they need an additional cast to AnsiString first for this to work.
The background is that I am calling functions in a legacy DLL with a C interface that has some functions that require C-style char pointers. In the past (before Delphi 2009) code like the following worked like a charm (where the param to the C DLL function is a LPCSTR):
either:
LegacyFunction(PChar('Fred'));
or
const
FRED = 'Fred';
...
LegacyFunction(PChar(FRED));
So in changing to Delphi 2009 (and now in 2010), I changed the call to this:
LegacyFunction(PAnsiChar('Fred'));
or
const
FRED = 'Fred';
...
LegacyFunction(PAnsiChar(FRED));
This seems to work and I get the correct results from the function call. However there is some definite instability in the app that seems to be occurring mostly the second or third time through the code that calls the legacy functions (that was not present before the move to the 2009 version of the IDE). In investigating this, I realized that the native string literal (and const string) in Delphi 2009/2010 is a Unicode string so my cast was possibly in error. Examples here and elsewhere seem to indicate this call should look more like this:
LegacyFunction(PAnsiChar(AnsiString('Fred')))
What confuses me is that with the code above in the second examples, casting the string literal directly to a PAnsiChar does not generate any compiler warnings. If instead of a string literal, I was casting a string var, I would get a suspicious cast warning (and the string would be mangled). This (and the fact that the string is usable in the DLL) leads me to believe the compiler is doing some magic to correctly interpret the string literal as the intended string type. Is this what is happening or is the double cast (first to AnsiString, then to PAnsiChar) really necessary and the lack of it in my code the reason for the hard to track down instability? And does the same answer hold true for const strings as well?
For type-inferred constants (only initializable from literals) the compiler changes the actual text at compile-time, rather than at runtime. That means it knows whether or not the conversion loses data, so it doesn't need to warn you if it doesn't.
To 'visualize' Barry Kelly's and Mason Wheeler's words:
const
FRED = 'Fred';
var
p: PAnsiChar;
w: PWideChar;
begin
w := PWideChar(Fred);
p := PAnsiChar(Fred);
In ASM:
Unit7.pas.32: w := PWideChar(Fred);
00462146 BFA4214600 mov edi,$004621a4
// no conversion, just a pointer to constant/"-1 RefCounted" UnicodeString
Unit7.pas.33: p := PAnsiChar(Fred);
0046214B BEB0214600 mov esi,$004621b0
// no conversion, just a pointer to constant/"-1 RefCounted" AnsiString
As you can see, in both cases, PWideChar/PChar(FRED) and PAnsiChar(FRED), there is no conversion, and the Delphi compiler makes two constant strings, one AnsiString and one UnicodeString.
Constants, including string literals, are untyped by default, and the compiler will fit them into whatever format works in the context you're using them in. As long as there are no non-ANSI characters in your string literal, the compiler won't have any trouble generating the string as ANSI instead of Unicode in this situation.
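The contrast between a literal and a string variable could be sketched like this. LegacyFunction here is a hypothetical stand-in written in Delphi so that the example is self-contained (the real one lives in the C DLL from the question), and the variable names are assumptions for illustration.

```delphi
program AnsiCastDemo;
{$APPTYPE CONSOLE}

// Hypothetical stand-in for the DLL function that takes an LPCSTR.
procedure LegacyFunction(P: PAnsiChar);
begin
  Writeln(AnsiString(P));
end;

var
  Name: string;      // UnicodeString in Delphi 2009+
  Temp: AnsiString;
begin
  // Literal: the compiler can store the text as ANSI at compile time.
  LegacyFunction(PAnsiChar('Fred'));

  // Variable: convert to AnsiString first, and keep that AnsiString alive
  // in its own variable for as long as the callee uses the pointer.
  Name := 'Fred';
  Temp := AnsiString(Name);
  LegacyFunction(PAnsiChar(Temp));
end.
```

Holding the converted text in Temp rather than casting inline is a deliberate choice in this sketch: it makes the lifetime of the buffer behind the pointer explicit.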
As Mason Wheeler points out all is fine as long as you don't have non-ANSI characters in your string const. If you have things like:
const FRED = 'Frédérick';
I'm pretty sure Delphi 2009/2010 will either issue charset hints (and apply a string conversion automatically - thus the hint) or fail at comparing ('Frédérick' is different in ISO-8859-1 than UTF-16).
If you can have "special" characters in your consts you will need to call string conversion.
Here are some basic examples with TStringList:
TStringList.SaveToFile(DestFilename, TEncoding.GetEncoding(28591)); //ISO-8859-1 (Latin1)
TStringList.SaveToFile(DestFilename, TEncoding.UTF8);