I have the following variable declarations:
arrChar_1: array[0..2] of Char;
arrChar_2: array[0..2] of Char;
str: string;
Then I made the assignment:
str := arrChar_1 + arrChar_2;
This assignment compiles without problems in Delphi 6, but an error occurs when I compile it in Delphi 10.2:
[dcc32 Error] MigrateConcatenateCharArray.dpr(26): E2008 Incompatible types
I'm solving this problem in the following way:
str := Copy(arrChar_1, 0, StrLen(arrChar_1));
str := str + Copy(arrChar_2, 0, StrLen(arrChar_2));
Is there any other good solution to this problem? (1)
In Delphi 6:
String = AnsiString
Char = AnsiChar
In Delphi 10.2:
String = UnicodeString
Char = WideChar
Can anyone tell me what caused the incompatibility issue? (2)
My understanding is that WideChar is a multi-byte character type and that Unicode is the way characters are encoded, but I'm confused about how the two relate.
The following compiles in all versions of Delphi:
procedure Main;
var
arrChar_1: array[0..2] of AnsiChar;
arrChar_2: array[0..2] of AnsiChar;
str: AnsiString;
begin
str := arrChar_1 + arrChar_2;
end;
The following code does not compile in Unicode versions of Delphi:
procedure Main;
var
arrChar_1: array[0..2] of WideChar;
arrChar_2: array[0..2] of WideChar;
str: UnicodeString;
begin
str := arrChar_1 + arrChar_2;
end;
This seems a little odd to me. Why should the concatenation operator be supported for AnsiChar arrays but not WideChar arrays?
If you examine how the concatenation operator is implemented for AnsiChar arrays that begins to shed some light. The generated code first converts the arrays into ShortString instances. These are then converted into Delphi AnsiString instances. Finally the two AnsiString instances are concatenated.
Now, this would explain why the code fails for WideChar arrays. The ShortString type only supports AnsiChar elements and so a different path through the string support routines would have been needed. One can assume that the Embarcadero designers chose, for whatever reason, not to support this form of concatenation when implementing Unicode support.
To back this idea up, consider the following:
procedure Main;
var
arrChar_1: array[0..254] of AnsiChar;
arrChar_2: array[0..254] of AnsiChar;
str: AnsiString;
begin
str := arrChar_1 + arrChar_2;
end;
This compiles. But change either of the 254 upper bounds to 255 and the code fails to compile (in all versions of Delphi) reporting E2008 Incompatible types. That is because the array now exceeds the maximum length of a ShortString object.
As for how to migrate your code to Unicode Delphi, I suggest that you simply cast the character arrays to string:
str := string(arrChar_1) + string(arrChar_2);
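As a minimal sketch of that cast in a Unicode version of Delphi (the program name and the sample buffer contents are made up for illustration), the following fills both arrays by hand and concatenates them. It assumes each buffer holds null-terminated text, which is how a zero-based character array is read when it is converted to a string:

```pascal
program ConcatCharArrays;

{$APPTYPE CONSOLE}

var
  arrChar_1: array[0..2] of Char;
  arrChar_2: array[0..2] of Char;
  str: string;
begin
  // Fill the buffers by hand; the trailing #0 marks the end of the text.
  arrChar_1[0] := 'a'; arrChar_1[1] := 'b'; arrChar_1[2] := #0;
  arrChar_2[0] := 'c'; arrChar_2[1] := 'd'; arrChar_2[2] := #0;

  // Each cast converts a zero-based array to a string, stopping at the
  // first #0 (or at the end of the array if there is none).
  str := string(arrChar_1) + string(arrChar_2);

  Writeln(str);          // abcd
  Writeln(Length(str));  // 4
end.
```

If the buffers can contain embedded #0 characters that must be preserved, the cast is not suitable and SetString with an explicit length is the safer choice.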
I have a simple Labview dll that takes a PascalString then returns the pascal string with no changes. This is just testing what we can do. The header is as follows:
void __stdcall Read_String_In_Write_String_Out(PStr String_input,
PStr String_output);
The Delphi code is as follows:
var
hbar : thandle;
str, str2 : PChar;
StringFunction : function (TestString: PChar): PChar; stdcall;
begin
hbar := LoadLibrary('C:\Interface.dll');
if hbar >= 32 then begin
StringFunction := getprocaddress(hbar, 'Read_String_In_Write_String_Out');
str := 'test';
str2 := StringFunction(str);
end;
end;
When running the program I get an Access Violation. I have no issues when calling simple math functions in DLLs, but when it comes to strings everything breaks.
Can anyone help?
You say that the DLL function is taking in a Pascal String. According to Labview's documentation:
Pascal String Pointer is a pointer to the string, preceded by a length byte.
Pascal-Style Strings (PStr)
A Pascal-style string (PStr) is a series of unsigned characters. The value of the first character indicates the length of the string. A PStr can have a range of 0 to 255 characters. The following code is the type definition for a Pascal string.
typedef uChar Str255[256], Str31[32], *StringPtr, **StringHandle;
typedef uChar *PStr;
This would be equivalent to Delphi's ShortString type (well, more accurately, PShortString, ie a pointer to a ShortString).
Based on the DLL function's declaration, its 2nd parameter is not a return value, it is an input parameter taking in a pointer by value. So your use of StringFunction is wrong on 2 counts:
1. Getting the output in the wrong place. StringFunction should be a procedure with 2 parameters. However, the function can't modify the pointer in the 2nd parameter, all it can do is read/write data from/to whatever memory the pointer is pointing at. So, for output, you will have to pre-allocate memory for the function to write to.
2. Passing around the wrong kind of string data. PChar is PWideChar in Delphi 2009+, but "Pascal strings" use AnsiChar instead. And your test data is not even a Pascal string, as it lacks the leading length byte.
So, try something more like this instead:
var
hbar : THandle;
str1, str2 : ShortString;
StringFunction : procedure (String_input, String_output: PShortString); stdcall;
begin
hbar := LoadLibrary('C:\Interface.dll');
if hbar >= 32 then
begin
StringFunction := GetProcAddress(hbar, 'Read_String_In_Write_String_Out');
str1 := 'test';
StringFunction(@str1, @str2);
end;
end;
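To see why a ShortString matches the PStr layout while a PChar literal does not, here is a small sketch (the program name is made up) that inspects the length byte directly. It does not call the DLL; it only shows what the memory behind a PShortString looks like:

```pascal
program ShortStringLayout;

{$APPTYPE CONSOLE}

var
  s: ShortString;
  p: PShortString;
begin
  s := 'test';
  p := @s;

  // Element 0 of a ShortString is the length byte, which is what a
  // LabVIEW PStr expects to find first.
  Writeln(Ord(s[0]));   // 4
  Writeln(Length(p^));  // 4
  Writeln(SizeOf(s));   // 256: one length byte plus up to 255 characters
end.
```

Because a ShortString variable always occupies the full 256 bytes, passing @str2 as the output parameter gives the DLL a buffer large enough for any PStr it might write.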
What is the equivalent of System.Character.TCharHelper.IsWhiteSpace / IsLetter / IsNumber for AnsiChar (UTF8)?
In general, it does not make sense to ask whether a single UTF-8 element (single byte) represents a whitespace. That's because UTF-8 is a variable length encoding and a code point may require more than a single byte to define it.
So you cannot ask whether or not a single byte is a whitespace, unless it encodes an ASCII character, i.e. < 128.
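To make the variable-length point concrete, here is a rough sketch that classifies a single byte by its role in a UTF-8 sequence. The function name Utf8SequenceLength is hypothetical (it is not part of the Delphi RTL), and it only looks at the bit pattern of the lead byte; it does not validate the rest of the sequence:

```pascal
program Utf8LeadBytes;

{$APPTYPE CONSOLE}

// Hypothetical helper: number of bytes in the UTF-8 sequence that starts
// with Lead, or 0 if Lead is a continuation byte. Only the bit pattern is
// checked; overlong or out-of-range lead bytes are not rejected.
function Utf8SequenceLength(Lead: Byte): Integer;
begin
  if Lead < $80 then
    Result := 1                        // ASCII, a complete code point
  else if (Lead and $E0) = $C0 then
    Result := 2
  else if (Lead and $F0) = $E0 then
    Result := 3
  else if (Lead and $F8) = $F0 then
    Result := 4
  else
    Result := 0;                       // continuation byte (10xxxxxx)
end;

begin
  Writeln(Utf8SequenceLength($20)); // 1: ASCII space
  Writeln(Utf8SequenceLength($C3)); // 2: lead byte of U+00E9
  Writeln(Utf8SequenceLength($E2)); // 3: lead byte of U+2003 (EM SPACE)
  Writeln(Utf8SequenceLength($A9)); // 0: continuation byte
end.
```

Only in the first case does the byte on its own identify a character. U+2003 is a whitespace character, yet none of its three bytes can be recognised as whitespace in isolation.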
What you would need to do is to take the sequence of bytes that encode the code point of interest, and convert them into a UTF-32 value in a UCS4Char variable. Then pass that to the UCS4Char overload of TCharHelper.IsWhiteSpace.
However, that approach is not well supported by the Delphi libraries. The simplest way to do what you wish in Delphi is:
1. Convert your UTF-8 string to a native UTF-16 Delphi string.
2. Use TCharHelper.IsWhiteSpace(str, index) to query for the code point at position index.
If your question is how to check whether a UTF8String variable consists only of white space, you can use the following RECORD HELPER:
TYPE
U8StringHelper = RECORD HELPER FOR UTF8String
FUNCTION IsAllWhiteSpaces : BOOLEAN;
END;
FUNCTION U8StringHelper.IsAllWhiteSpaces : BOOLEAN;
VAR
C : CHAR;
S : UnicodeString;
BEGIN
S:=Self;
FOR C IN S DO IF NOT C.IsWhiteSpace THEN EXIT(FALSE);
Result:=TRUE
END;
Then you can use it as in:
VAR
U8 : UTF8String;
BEGIN
U8:=' '#13#10;
IF U8.IsAllWhiteSpaces THEN WRITELN('Yes') ELSE WRITELN('No');
U8:=' X'#13#10;
IF U8.IsAllWhiteSpaces THEN WRITELN('Yes') ELSE WRITELN('No');
END.
This will write out "Yes" followed by "No".
But beware: Delphi applies only one helper per type at a time, so by defining your own helper for the UTF8String type you lose access to any helper the system may have defined for it. If that is a problem, you'll have to write a standard function instead:
FUNCTION IsAllWhiteSpaces(CONST U8 : UTF8String) : BOOLEAN;
VAR
C : CHAR;
S : UnicodeString;
BEGIN
S:=U8;
FOR C IN S DO IF NOT C.IsWhiteSpace THEN EXIT(FALSE);
Result:=TRUE
END;
and use it as follows:
VAR
U8 : UTF8String;
BEGIN
U8:=' '#13#10;
IF IsAllWhiteSpaces(U8) THEN WRITELN('Yes') ELSE WRITELN('No');
U8:=' X'#13#10;
IF IsAllWhiteSpaces(U8) THEN WRITELN('Yes') ELSE WRITELN('No');
END.
I'll leave the making of the other IsXXX functions up to the reader...
Okay - after we have finally determined the proper question, the easiest way for you is to simply cast-up the AnsiChar variable to a proper UNICODE char and then do your thing.
VAR
A : AnsiChar;
BEGIN
IF CHAR(A).IsLetter THEN ...
END.
HOWEVER: working with individual characters from a UTF-8 string is not advisable, as many characters (by the very nature of UTF-8) are encoded as TWO OR MORE bytes (up to four). You therefore cannot decide anything from a single AnsiChar of a UTF-8 string, as it may merely be the lead ("prefix") byte or a continuation byte of a longer sequence, with the rest of the character in the neighbouring bytes.
So the best way would be to have your UTF8-String and assign it to a UNICODE string variable, and then use the proper CHAR type to iterate over it.
If your question is how to "convert" an AnsiString encoded in UTF-8 into a UNICODE string, you can use the following routine:
FUNCTION AnsiUTF8toUNICODE(CONST S : AnsiString) : STRING;
BEGIN
Result:=UTF8ToUnicodeString(RawByteString(S))
END;
Please excuse the silly question, but I'm confused. Consider the following method (sorry for the noisy comments, this is real code under development):
function HLanguages.GetISO639LangName(Index: Integer): string;
const
MaxIso639LangName = 9; { see msdn.microsoft.com/en-us/library/windows/desktop/dd373848 }
var
LCData: array[0..MaxIso639LangName-1] of Char;
Length: Integer;
begin
{ TODO : GetLocaleStr sucks, write proper implementation }
//Result := GetLocaleStr(LocaleID[Index], LOCALE_SISO639LANGNAME, '??');
Length := GetLocaleInfo(LocaleID[Index], LOCALE_SISO639LANGNAME, @LCData, System.Length(LCData));
Win32Check(Length <> 0);
SetString(Result, @LCData, Length); // "E2008 Incompatible types" here, but why?
end;
If I remove the reference operator (@), the implicit cast enabled by $X+ comes to the rescue and the method compiles. Why the compiler refuses this code with the reference operator is beyond my understanding.
This is Delphi XE2 and this behaviour might be specific to it.
And if I add a test-case dummy with the same prototype as the intrinsic one within the scope of HLanguages.GetISO639LangName, the error magically goes away:
procedure SetString(var s: string; buffer: PChar; len: Integer);
begin
{ test case dummy }
end;
You have to explicitly convert it to PChar:
SetString(result,PChar(@LCData),Length);
As you stated, SetString() is very strict about the type of its 2nd parameter. It must be a PChar, a PWideChar, or a PAnsiChar, depending on the string type itself.
I suspect this is due to the fact that SetString() is defined as overloaded with either a string, a WideString, or an AnsiString as 1st parameter. So in order to validate the right signature, it needs to have exact match of all parameters types:
SetString(var s: string; buf: PChar; len: integer); overload;
SetString(var s: AnsiString; buf: PAnsiChar; len: integer); overload;
SetString(var s: WideString; buf: PWideChar; len: integer); overload;
Of course, all those are "intrinsics", so you won't find such definitions in system.pas; the compiler calls procedures like _LStrFromPCharLen(), _UStrFromPCharLen(), _WStrFromPWCharLen() or similar directly.
This behavior is the same since early versions of Delphi, and is not a regression in XE2.
I think there's a compiler bug in there because the behaviour with SetString differs from the behaviour with overloaded functions that you provide. What's more, there's an interaction with the Typed @ operator compiler option. I don't know how you have that set. I always enable it, but I suspect I'm in the minority there.
So I cannot explain the odd behaviour, and answer the precise question you ask. I suspect the only way to answer it is to look at the internals of the compiler, and very few of us can do that.
Anyway, in case it helps, I think the cleanest way to pass the parameter is like so:
SetString(Result, LCData, Length);
This compiles no matter what you set Typed @ operator to.
I know this doesn't answer the specific question regarding SetString, but I'd like to point out that you can do the same thing by simply writing
Result := LCData;
When assigning to a string, Delphi treats a static array of char with ZERO starting index, as a Null terminated string with maximum length. Consider the following:
var
IndexOneArray : array [ 1 .. 9 ] of char;
IndexZeroArray : array [ 0 .. 8 ] of char;
S : string;
T : string;
begin
IndexOneArray := 'ABCD'#0'EFGH';
IndexZeroArray := 'ABCD'#0'EFGH';
S := IndexOneArray;
T := IndexZeroArray;
ShowMessage ( 'S has ' + inttostr(length(S)) + ' chars. '
+ #13'T has ' + inttostr(length(T)) + ' chars. ' );
end;
This displays a message that S has 9 chars, while T has 4.
It will also work when the zero-index array has 9 non-null characters. The result will be 9 characters regardless of what's in the following memory locations.
Because @LCData is a pointer to the array, not to a Char. Sure, an array, a record, or a class sometimes happens to start with a Char-typed element, but a statically typed compiler should not rely on such coincidences.
You have to take the pointer to a character in that array, not to the array itself.
SetString(Result, @LCData[Low(LCData)], Length);
When I compile this code
{$WARNINGS ON}
function Test(s: string): string;
var
t: string;
d: double;
begin
if s = '' then begin
t := 'abc';
d := 1;
end;
Result := t + FloatToStr(d);
end;
I get the warning "Variable 'd' might not have been initialized", but I do not get the same warning for variable 't'. This seems inconsistent. This code is only a simple example to show the compiler warnings, but I have just found a bug in my live code which would have been caught by a compile-time warning for uninitialised string variables. Can I switch this warning on somehow in Delphi 6? Or in a newer version of Delphi?
Nope, there is no switch for this. The warning doesn't occur because a string is a compiler managed type and is always initialized by the compiler.
Yes :-)
Use shortstrings or pChars
{$WARNINGS ON}
function Test: String;
var
p: pChar;
d: double;
begin
Result := p + FloatToStr(d);
end;
//This code will give a warning.
Seriously
No, the normal Delphi strings and shortstrings are automatically initialized to '' (empty string). Shortstrings live on the stack and don't need cleanup. Other strings are so called 'managed' types and automatically deleted when they are no longer used using reference counting.
PChars, the good news
pChars are just pointers. Delphi does not manage them.
However, Delphi does automatically convert them to strings and vice versa.
pChars the bad news
If you convert a pChar to a string Delphi copies the contents of the pChar into the string and you are still responsible for destroying the pChar.
Also note that this copying takes time and if you do it a lot will slow your code down.
If you convert a string to a pChar, Delphi gives you a pointer to the memory the string lives in. Beware: Delphi does not track what is written through that pointer, so when external code changes the buffer, the string's stored length is not updated.
From: http://www.marcocantu.com/epascal/English/ch07str.htm
The following code will not work as expected:
procedure TForm1.Button2Click(Sender: TObject);
var
S1: String;
begin
SetLength (S1, 100);
GetWindowText (Handle, PChar (S1), Length (S1));
S1 := S1 + ' is the title'; // this won't work
Button1.Caption := S1;
end;
This program compiles, but when you run it, you are in for a surprise: The Caption of the button will have the original text of the window title, without the text of the constant string you have added to it. The problem is that when Windows writes to the string (within the GetWindowText API call), it doesn't set the length of the long Pascal string properly. Delphi still can use this string for output and can figure out when it ends by looking for the null terminator, but if you append further characters after the null terminator, they will be skipped altogether.
How can we fix this problem? The solution is to tell the system to convert the string returned by the GetWindowText API call back to a Pascal string. However, if you write the following code:
S1 := String (S1);
the system will ignore it, because converting a data type back into itself is a useless operation. To obtain the proper long Pascal string, you need to recast the string to a PChar and let Delphi convert it back again properly to a string:
S1 := String (PChar (S1));
Actually, you can skip the string conversion, because PChar-to-string conversions are automatic in Delphi. Here is the final code:
procedure TForm1.Button3Click(Sender: TObject);
var
S1: String;
begin
SetLength (S1, 100);
GetWindowText (Handle, PChar (S1), Length (S1));
S1 := String (PChar (S1));
S1 := S1 + ' is the title';
Button3.Caption := S1;
end;
An alternative is to reset the length of the Delphi string, using the length of the PChar string, by writing:
SetLength (S1, StrLen (PChar (S1)));
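A variation on the same idea, as a sketch only: GetWindowText returns the number of characters it copied, so that value can be passed straight to SetLength instead of scanning for the terminator. The helper name GetWindowTitle is hypothetical, and the Winapi.Windows unit name assumes Delphi XE2 or later (older versions use Windows):

```pascal
program WindowTitle;

{$APPTYPE CONSOLE}

uses
  Winapi.Windows;

// Hypothetical helper: returns the caption of Wnd as a Delphi string.
function GetWindowTitle(Wnd: HWND): string;
var
  Len: Integer;
begin
  // Ask Windows how long the caption is, then size the string to fit.
  Len := GetWindowTextLength(Wnd);
  SetLength(Result, Len);
  if Len > 0 then
    // GetWindowText returns the number of characters copied, excluding
    // the terminator; the +1 leaves room for the #0 that it writes.
    SetLength(Result, GetWindowText(Wnd, PChar(Result), Len + 1));
end;

begin
  Writeln(GetWindowTitle(GetForegroundWindow));
end.
```

Because the string's length is set from the API's return value, appending to the result afterwards behaves as expected.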
I want to copy the content in the string to char array.
Can I use this code: StrLCopy(C, pChar(@S[1]), high(C));
I am currently using Delphi 2006. Will there be any problems if I upgrade my Delphi version because of Unicode support provided in newer versions?
If not, what can be the code for this conversion?
When you're copying a string into an array, prefer StrPLCopy.
StrPLCopy(C, S, High(C));
That will work in all versions of Delphi, even when Unicode is in effect. The character types of C and S should be the same; don't try to use that function to convert between Ansi and Unicode characters.
But StrLCopy is fine, too. You don't need to have so much pointer code, though. Delphi already knows how to convert a string into a PChar:
StrLCopy(C, PChar(S), High(C));
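Here is a small sketch of how the StrPLCopy call behaves when the string is longer than the array (the program name and the five-element buffer are chosen just for illustration). The System.SysUtils unit name assumes Delphi XE2 or later; in older versions the unit is simply SysUtils:

```pascal
program CopyIntoCharArray;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

var
  C: array[0..4] of Char;
  S: string;
begin
  S := 'Testing';

  // High(C) is 4, so at most four characters are copied and the fifth
  // slot is left for the terminating #0.
  StrPLCopy(C, S, High(C));

  Writeln(string(C));          // Test
  Writeln(Length(string(C)));  // 4
end.
```

Passing High(C) rather than Length(C) as the limit is what keeps the last element free for the terminator.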
This works, in a quick test:
var
ch: array[0..10] of Char;
c: Char;
x: Integer;
s: string;
begin
s := 'Testing';
StrLCopy(PChar(@ch[0]), PChar(s), High(ch));
x := 100;
for c in ch do
begin
Canvas.TextOut(x, 100, c);
Inc(x, Canvas.TextWidth(c) + 3); // advance the x position, not the loop variable
end;
end;