Appending UnicodeString to WideString in Delphi

I'm curious about what happens with this piece of code in Delphi 2010:
function foo: WideString;
var
  i: Integer;
  myUnicodeString: UnicodeString;
begin
  for i := 1 to 1000 do
  begin
    myUnicodeString := ... something ...;
    result := result + myUnicodeString; // This is where I'm interested
  end;
end;
How many string conversions are involved, and are any particularly bad performance-wise?
I know the function should just return a UnicodeString instead, but I've seen this anti-pattern in the VCL streaming code, and want to understand the process.

To answer your question about what the code is actually doing, this statement:
result := result + myUnicodeString;
does the following:
calls System._UStrFromWStr() to convert Result to a temp UnicodeString
calls System._UStrCat() to concatenate myUnicodeString onto the temp
calls System._WStrFromUStr() to convert the temp to a WideString and assign it back to Result.
There is a System._WStrCat() function for concatenating a WideString onto a WideString (and System._UStrCat() for UnicodeString). If CodeGear/Embarcadero had been smarter about it, they could have implemented a System._WStrCat() overload that takes a UnicodeString as input and a WideString as output (and vice versa for concatenating a WideString onto a UnicodeString). That way, no temp UnicodeString conversions would be needed anymore. Both WideString and UnicodeString are encoded as UTF-16 (well mostly, but I won't get into that here), so concatenating them together is just a matter of a single allocation and move, just like when concatenating two UnicodeStrings or two WideStrings together.

The performance is poor. There's no need for any encoding conversions since everything is UTF-16 encoded. However, WideString is a wrapper around the COM BSTR type which performs worse than native UnicodeString.
Naturally you should prefer to do all your work with the native types, either UnicodeString or TStringBuilder, and convert to WideString at the last possible moment.
That is generally a good policy. You don't want to use WideString internally since it's purely an interop type. So only convert to (and from) WideString at the interop boundary.
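A minimal sketch of that policy, assuming Delphi 2009 or later (or Free Pascal in Delphi mode). The name BuildWide and the loop body are made up for illustration; the only point is that the loop concatenates native UnicodeStrings and the WideString conversion happens once, on the final assignment.

```pascal
program AccumulateThenConvert;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical replacement for foo: builds the text in a native
// UnicodeString and converts to WideString once, at the end.
function BuildWide(Count: Integer): WideString;
var
  Acc: UnicodeString;
  I: Integer;
begin
  Acc := '';
  for I := 1 to Count do
    Acc := Acc + IntToStr(I) + ';'; // UnicodeString concatenation only
  Result := Acc; // the single UnicodeString -> WideString conversion
end;

begin
  WriteLn(Length(BuildWide(3))); // '1;2;3;' has 6 chars
end.
```

Compared with the original loop, this avoids converting Result to a temporary UnicodeString and back on every iteration.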

Related

Can we safely use ansiString in mobile with Sydney?

When I read Migrating Delphi Code to Mobile from Desktop, they say to avoid using AnsiString. Is there any reason for that? AnsiString uses half as much memory as UnicodeString, and it's a perfect container for JSON. So, can I use AnsiString safely, or do I need to stay with UnicodeString (and why)?
You can use 8-bit strings on mobile platforms. But safety depends on which kind of 8-bit strings you use.
For anything other than Windows, and even on Windows, using AnsiString is an extremely bad idea. AnsiString is a legacy type, and while it was re-enabled in 10.4 on mobile platforms, that does not mean you should use it, and even less that you can use it safely.
One of the problems with AnsiString is that sooner or later in your code it will go through a conversion, because the default string type used all over the RTL and FMX is the UTF-16 string type, and you can lose the original data.
String types you can safely use on mobile (and other platforms) are string, UTF8String and RawByteString.
When it comes to RawByteString it can only be safely used in code-page agnostic operations. See more: Delphi XE - RawByteString vs AnsiString
JSON files don't support ANSI encoding, so Unicode is your only choice. UTF-8 and UTF8String will do more than fine, because that is also default encoding for any JSON data exchange.
As far as various AnsiXXX functions are concerned, the best option is to write your own routines that will work on UTF-8 strings. You can also use standard functions that work on generic string type, but they are slower because of conversions to UTF-16 and back.
Illustration of data loss when using AnsiString on mobile (Android)
The Android specification requires implementations to support only a few standard charsets. That includes ISO-8859-1:
https://developer.android.com/reference/java/nio/charset/Charset
For anything else you depend on the specific device.
For instance, the following example with AnsiString works fine for the French character set, but it fails for Croatian and Chinese.
var
  s: string;
  u: UTF8String;
  a: AnsiString;
begin
  s := 'é à è ù â ê î ô û ç ë ï ü';
  a := s;
  u := s;
  Memo1.Lines.Add(s);
  Memo1.Lines.Add(u);
  Memo1.Lines.Add(a);
  s := 'š đ č ć ž Š Đ Č Ć Ž';
  a := s;
  u := s;
  Memo1.Lines.Add(s);
  Memo1.Lines.Add(u);
  Memo1.Lines.Add(a);
  s := '新年';
  u := s;
  a := s;
  Memo1.Lines.Add(s);
  Memo1.Lines.Add(u);
  Memo1.Lines.Add(a);
end;
The Delphi compiler will issue a warning when you make an implicit string cast where data loss can occur, and it is prudent to fix all such code by using some other string type.
W1058 Implicit string cast with potential data loss from 'string' to 'AnsiString'
There is also a warning when you directly convert between UTF-8 and UTF-16 string types, but to clear those warnings you can just explicitly typecast to the string or UTF8String type, since the compiler will do the appropriate conversion in the background and all information will be retained (Note: Unicode normalization may occur during that process).
W1057 Implicit string cast from 'string' to 'UTF8String'
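A small sketch of the explicit casts that clear W1057, assuming Delphi 2009 or later, where string is UnicodeString. The helper names ToUtf8 and FromUtf8 are hypothetical. The characters are written as character codes so the example does not depend on the encoding of the source file.

```pascal
program ExplicitUtf8Cast;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper: explicit UTF-16 -> UTF-8 conversion.
function ToUtf8(const S: UnicodeString): UTF8String;
begin
  Result := UTF8String(S);
end;

// Hypothetical helper: explicit UTF-8 -> UTF-16 conversion.
function FromUtf8(const U: UTF8String): UnicodeString;
begin
  Result := UnicodeString(U);
end;

var
  S, Back: UnicodeString;
  U: UTF8String;
begin
  S := #$0161#$0111#$010D; // U+0161, U+0111, U+010D (Croatian letters)
  U := ToUtf8(S);
  Back := FromUtf8(U);
  WriteLn(Length(S));  // number of UTF-16 code units
  WriteLn(Length(U));  // number of UTF-8 bytes
  WriteLn(Back = S);   // TRUE when the round trip lost nothing
end.
```

Each of these three characters takes two bytes in UTF-8, so the UTF8String is twice as long in bytes as the source string is in chars, yet the round trip back to UTF-16 should return the original text.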

Lazarus. Equivalent to Chr() for Unicode symbols

Is there any function in freepascal to show the Unicode symbol by its code (e.g. U+1D15E)? Unfortunately Chr() works only with ANSI symbols (with codes less than 127).
I want to use symbols from custom symbolic font and it is very inconvenient to put them into sourcecode directly (they are shown in Lazarus as ? or something else because they are absent in system fonts).
Take a look at this page. I assume that Freepascal either uses UTF-16, in which it becomes a surrogate pair of two WideChars (see table) or UTF-8, in which it becomes a sequence of byte values (see table again).
UTF-8:
const
HalfNoteString = UTF8String(#$F0#$9D#$85#$9E);
UTF-16:
const
HalfNoteString = UnicodeString(#$D834#$DD5E);
The names of the string types may differ, as I don't know FreePascal very well. Perhaps AnsiString and WideString.
I have never used Free Pascal, but if I were you, I'd try
var
  s: char;
begin
  s := char($222b); // Just cast a word
or, if the compiler is really stubborn,
var
  s: char;
begin
  PWord(@s)^ := $222b; // Forcibly write a word
Current unicode status of FPC to my best knowledge
The codepage of literals can be set with $codepage http://www.freepascal.org/docs-html/prog/progsu81.html
FPC 2.4.x+ does have unicodestring (since it is +/- Kylix widestring) but only basic routine support. (pos and copy, not routines like format), but the "record" misses the codepage field.
Lazarus widgets expect UTF8 in normal ansistrings (D7..D2007 ansistrings without codepage data), and programmers must manually insert conversions if necessary. So on Windows the widgets ARE mostly using unicode (-W) calls, but take ansistrings with UTF8 in it.
FPC doesn't follow the UTF-8-in-ansistring scheme, so for some string-accepting routines in sysutils there are special routines in Lazarus that assume UTF-8 and call the -W variants.
FPC ansistring is the system default 1-byte encoding. ansi on Windows, utf8 on most other platforms.
Trunk, 2.7.1, provides support for the new D2009+ ansistring (with codepages).
There has been no discussion yet how to deal with the default stringtype (e.g. will "string" be utf8string on *nix and unicodestring on Windows, or unicodestring or utf8string everywhere?)
Other unicodestring related enhancement (like encoding parameters to tstringlist.savetofile) are not implemented. Likewise for the pseudo objects (like TCharacter which are afaik mostly static)
Update: 2.7.1 has a variable encoding ansistring type, and lazarus has been fixed to keep working. Nothing is really taking advantage from it yet though, e.g. most of the RTL still uses -A calls, and prototypes of sysutils and system procedures that takes strings haven't changed to rawbytestring yet.
I assume the problem is to convert from UCS4 encoding (which is actually a Unicode codepoint number) to UTF16.
In Delphi, you can use UCS4StringToUnicodeString function.
Warning: Be careful with UCS4String type. It is actually a zero-terminated dynamic array, not a string (that means it is zero-based).
var
  S1: UCS4String;
  S: string;
begin
  SetLength(S1, 2);
  S1[0] := UCS4Char($1D15E);
  S1[1] := UCS4Char(0);
  S := UCS4StringToUnicodeString(S1);
  ShowMessage(Format('%d, %x, %x', [Length(S), Ord(S[1]), Ord(S[2])]));
end;
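For comparison, the same conversion can be written by hand. This is only a sketch: CodePointToUTF16 is a hypothetical helper, not an RTL function, and it assumes the input is a valid code point (no range checks). It targets Delphi 2009 or later, or Free Pascal in Delphi mode.

```pascal
program CodePointDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: encodes one Unicode code point as UTF-16.
// No validation is done; CP is assumed to be at most $10FFFF and
// not itself inside the surrogate range.
function CodePointToUTF16(CP: Cardinal): UnicodeString;
begin
  if CP < $10000 then
    Result := WideChar(CP) // fits in a single code unit
  else
  begin
    Dec(CP, $10000); // 20 significant bits remain
    Result := WideChar($D800 + (CP shr 10))    // high surrogate: top 10 bits
            + WideChar($DC00 + (CP and $3FF)); // low surrogate: bottom 10 bits
  end;
end;

var
  S: UnicodeString;
begin
  S := CodePointToUTF16($1D15E);
  WriteLn(Format('%d, %x, %x', [Length(S), Ord(S[1]), Ord(S[2])]));
end.
```

For U+1D15E the two code units come out as $D834 and $DD5E, the same surrogate pair as in the UTF-16 constant shown earlier in this thread.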

Delphi XE - RawByteString vs AnsiString

I had a similar question to this here: Delphi XE - should I use String or AnsiString? . After deciding that it is right to use ANSI strings in a (large) library of mine, I have realized that I can actually use RawByteString instead of ANSI. Because I mix UNICODE strings with ANSI strings, my code now has quite few places where it does conversions between them. However, it looks like if I use RawByteString I get rid of those conversions.
Please let me know your opinion about it.
Thanks.
Update:
This seems to be disappointing. It looks like the compiler still makes a conversion from RawByteString to string.
procedure TForm1.FormCreate(Sender: TObject);
var
  x1, x2: RawByteString;
  s: string;
begin
  x1 := 'a';
  x2 := 'b';
  x1 := x1 + x2;
  s := x1; { <------- Implicit string cast from 'RawByteString' to 'string' }
end;
I think it does some internal workings (such as copying data) and my code will not be much faster and I will still have to add lots of typecasts in my code in order to silence the compiler.
RawByteString is an AnsiString with no code page set by default.
When you assign another string to this RawByteString variable, you'll copy the code page of the source string. And this will include a conversion. Sorry.
But there is one other use of RawByteString, which is to store plain byte content (e.g. a database BLOB field content, just like an array of byte).
To summarize:
RawByteString should be used as a "code page agnostic" parameter to a method or function;
RawByteString can be used as a variable type to store some BLOB data.
If you want to reduce conversions, and would rather use 8-bit char strings in your application, you had better:
Not use the generic AnsiString type, which depends on the current system code page, and with which you'll lose data;
Rely on UTF-8 encoding, i.e. an 8-bit code page / charset which won't lose any data when converted from or to a UnicodeString;
Not let the compiler show warnings about implicit conversions: all conversions should be made explicit;
Use your own dedicated set of functions to handle your UTF-8 content.
That is exactly what we did for our framework. We wanted to use UTF-8 in its kernel because:
We rely on UTF-8 encoded JSON for data transmission;
Memory consumption will be smaller;
The used SQLite3 engine will store text as UTF-8 in its database file;
We wanted a way of handling Unicode text with no loss of data with all versions of Delphi (from Delphi 6 up to XE), and WideString was not an option because it's dead slow and you've got the same problem of implicit conversions.
But, in order to achieve the best speed, we wrote some optimized functions to handle our custom string type:
{{ RawUTF8 is an UTF-8 String stored in an AnsiString
- use this type instead of System.UTF8String, which behavior changed
between Delphi 2009 compiler and previous versions: our implementation
is consistent and compatible with all versions of Delphi compiler
- mimic Delphi 2009 UTF8String, without the charset conversion overhead
- all conversion to/from AnsiString or RawUnicode must be explicit }
{$ifdef UNICODE} RawUTF8 = type AnsiString(CP_UTF8); // Codepage for an UTF8string
{$else} RawUTF8 = type AnsiString; {$endif}
/// our fast RawUTF8 version of Trim(), for Unicode only compiler
// - this Trim() is seldom used, but this RawUTF8 specific version is needed
// by Delphi 2009/2010/XE, to avoid two unnecessary conversions into UnicodeString
function Trim(const S: RawUTF8): RawUTF8;
/// our fast RawUTF8 version of Pos(), for Unicode only compiler
// - this Pos() is seldom used, but this RawUTF8 specific version is needed
// by Delphi 2009/2010/XE, to avoid two unnecessary conversions into UnicodeString
function Pos(const substr, str: RawUTF8): Integer; overload; inline;
And we reserved the RawByteString type for handling BLOB data:
{$ifndef UNICODE}
/// define RawByteString, as it does exist in Delphi 2009/2010/XE
// - to be used for byte storage into an AnsiString
// - use this type if you don't want the Delphi compiler to do any
// code page conversions when you assign a typed AnsiString to a RawByteString,
// i.e. a RawUTF8 or a WinAnsiString
RawByteString = AnsiString;
/// pointer to a RawByteString
PRawByteString = ^RawByteString;
{$endif}
/// create a File from a string content
// - uses RawByteString for byte storage, whatever the codepage is
function FileFromString(const Content: RawByteString; const FileName: TFileName;
FlushOnDisk: boolean=false): boolean;
Source code is available in our repository. In this unit, UTF-8 related functions were deeply optimized, with versions in both pascal and asm for better speed. We sometimes overloaded default functions (like Pos) to avoid conversion. More information about how we handled text in the framework is available here.
Last word:
If you are sure that you will only have 7-bit content in your application (no accented characters), you may use the default AnsiString type in your program. But in this case, you had better add the AnsiStrings unit to your uses clause to get overloaded string functions which will avoid most unwanted conversions.
RawByteString is still an "AnsiString." It is best described as a "universal receiver" which means it will take on whatever the source-string's codepage is at the point of assignment without forcing a codepage conversion. RawByteString was intended to be used only as a function parameter so that you will, as you've discovered, not incur a conversion between AnsiStrings with differing code-page affinities when calling utility functions which take AnsiStrings.
However, in the case above, you're assigning what is essentially an AnsiString to a UnicodeString which will incur a conversion. It must do a conversion because the RawByteString has a payload of 8bit-based characters, whereas a string (UnicodeString) has a payload of 16bit-based characters.
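A short sketch of the "universal receiver" behaviour, assuming Delphi 2009 or later (StringCodePage is a System routine there). ByteCount and CodePageOf are hypothetical helpers: because their parameter is a RawByteString, a UTF8String argument should arrive with its bytes and code page untouched, while getting back to a UTF-16 string still requires a conversion.

```pascal
program RawByteParam;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical utility: length of the 8-bit payload, whatever its code page.
function ByteCount(const S: RawByteString): Integer;
begin
  Result := Length(S);
end;

// Hypothetical utility: the code page carried by the argument itself.
function CodePageOf(const S: RawByteString): Word;
begin
  Result := StringCodePage(S);
end;

var
  W: UnicodeString;
  U: UTF8String;
begin
  W := #$00E9;                       // U+00E9, one UTF-16 code unit
  U := UTF8String(W);                // explicit conversion: two bytes in UTF-8
  WriteLn(ByteCount(U));             // bytes seen by the RawByteString parameter
  WriteLn(CodePageOf(U));            // 65001 is the UTF-8 code page
  WriteLn(Length(UnicodeString(U))); // explicit conversion back to UTF-16
end.
```

The first two calls involve no transcoding; only the explicit casts to and from UnicodeString do, which is the conversion the compiler warning in the question is about.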

Delphi WideString and Delphi 2009+

I am writing a class that will save wide strings to a binary file. I'm using Delphi 2005 for this but the app will later be ported to Delphi 2010. I'm feeling very unsure here, can someone confirm that:
A Delphi 2005 WideString is exactly the same type as a Delphi 2010 String
A Delphi 2005 WideString char as well as a Delphi 2010 String char is guaranteed to always be 2 bytes in size.
With all the Unicode formats out there I don't want to be hit with one of the chars in my string suddenly being 3 bytes wide or something like that.
Edit: Found this: "I indeed said UnicodeString, not WideString. WideString still exists, and is unchanged. WideString is allocated by the Windows memory manager, and should be used for interacting with COM objects. WideString maps directly to the BSTR type in COM." at http://www.micro-isv.asia/2008/08/get-ready-for-delphi-2009-and-unicode/
Now I'm even more confused. So a Delphi 2010 WideString is not the same as a Delphi 2005 WideString? Should I use UnicodeString instead?
Edit 2: There's no UnicodeString type in Delphi 2005. FML.
For your first question: WideString is not exactly the same type as D2010's string. WideString is the same COM BSTR type that it has always been. It's managed by Windows, with no reference counting, so it makes a copy of the whole BSTR every time you pass it somewhere.
UnicodeString, which is the default string type in D2009 and on, is basically a UTF-16 version of the AnsiString we all know and love. It's got a reference count and is managed by the Delphi compiler.
For the second, the default char type is now WideChar, which are the same chars that have always been used in WideString. It's a UTF-16 encoding, 2 bytes per char. If you save WideString data to a file, you can load it into a UnicodeString without trouble. The difference between the two types has to do with memory management, not the data format.
As others mentioned, string (actually UnicodeString) data type in Delphi 2009 and above is not equivalent to WideString data type in previous versions, but the data content format is the same. Both of them save the string in UTF-16. So if you save a text using WideString in earlier versions of Delphi, you should be able to read it correctly using string data type in the recent versions of Delphi (2009 and above).
You should take note that the performance of UnicodeString is far superior to that of WideString. So if you are going to use the same source code in both Delphi 2005 and Delphi 2010, I suggest you use a string type alias with conditional compiling in your code, so that you can have the best of both worlds:
type
{$IFDEF Unicode}
MyStringType = UnicodeString;
{$ELSE}
MyStringType = WideString;
{$ENDIF}
Now you can use MyStringType as your string type in your source code. If the compiler is Unicode (Delphi 2009 and above), then your string type would be an alias of UnicodeString type which is introduced in Delphi 2009 to hold Unicode strings. If the compiler is not unicode (e.g. Delphi 2005) then your string type would be an alias for the old WideString data type. And since they both are UTF-16, data saved by any of the versions should be read by the other one correctly.
A Delphi 2005 WideString is exactly the same type as a Delphi 2010 String
That is not true - e.g. a Delphi 2010 string has a hidden internal codepage field - but it probably does not matter for you.
A Delphi 2005 WideString char as well as a Delphi 2010 String char is guaranteed to always be 2 bytes in size.
That is true. In Delphi 2010 SizeOf(Char) = 2 (Char = WideChar).
There cannot be different codepage for unicode strings - codepage field was introduced to create a common binary format for both Ansi strings (that need codepage field) and Unicode string (that don't need it).
If you save WideString data to stream in Delphi 2005 and load the same data to string in Delphi 2010 all should work OK.
WideString = BSTR and that is not changed between Delphi 2005 and 2010
UnicodeString = WideString in Delphi 2005 (if UnicodeString type exists in Delphi 2005 - I don't know)
UnicodeString = string in Delphi 2009 and above.
@Marco - Ansi and Unicode strings in Delphi 2009+ have common binary format (12-byte header).
UnicodeString codepage CP_UTF16 = 1200;
The rule is simple:
If you want to work with unicode strings inside your module only - use UnicodeString type (*).
If you want to communicate with COM or with other cross-module purposes - use WideString type.
You see, WideString is a special type, since it's not a native Delphi type. It is an alias/wrapper for BSTR - a system string type, intended for use with COM or cross-module communications. Being Unicode is just a side effect.
On the other hand, AnsiString and UnicodeString - are native Delphi types, which have no analog in other languages. String is just an alias for either AnsiString or UnicodeString.
So, if you need to pass string to some other code - use WideString, otherwise - use either AnsiString or UnicodeString. Simple.
P.S.
(*) For old Delphi - just place
{$IFNDEF Unicode}
type
UnicodeString = WideString;
{$ENDIF}
somewhere in your code. This fix will allow you to write the same code for any Delphi version.
While a D2010 char is always and exactly 2 bytes, the same character folding and combining issues are present in UTF-16 characters as in UTF-8 characters. You don't see this with narrow strings because they're codepage based, but with unicode strings it's possible (and in some situations common) to have characters that have an effect but are not visible. Examples include the byte order mark (BOM) at the start of a unicode file or stream, left to right/right to left indicator characters, and a huge range of combining accents. This mostly affects questions of "how many pixels wide will this string be on the screen" and "how many letters are in this string" (as distinct from "how many chars are in this string"), but also means that you can't randomly chop characters out of a string and assume they're printable. Operations like "remove the last letter from this word" become non-trivial and depend on the language in use.
The question about "one of the chars in my string suddenly being 3 bytes long" reflects a little confusion about how UTF works. It's possible (and valid) to take three bytes in a UTF-8 string to represent one printable character, but each byte will be a valid UTF-8 character. Say, a letter plus two combining accents. You will not get a character in UTF-16 or UTF-32 being 3 bytes long, but it might be 6 bytes (or 12 bytes) long, if it's represented using three code points in UTF-16 or UTF-32. Which brings us to normalisation (or not).
But provided you are only dealing with the strings as whole things, it's all very simple - you just take the string, write it to a file, then read it back in. You don't have to worry about the fine print of string display and manipulation, that's all handled by the operating system and libraries. Strings.LoadFromFile(name) and Listbox.Items.Add(string) work exactly the same in D2010 as in D2007, the unicode stuff is all transparent to you as a programmer.
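To make the "letters" versus "chars" distinction above concrete, here is a small sketch, assuming Delphi 2009 or later (or Free Pascal in Delphi mode). Utf16ByteSize is a hypothetical helper. The same visible letter is stored once as a single code point and once as a base letter plus a combining accent.

```pascal
program LettersVsChars;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper: size in bytes of the UTF-16 payload of S.
function Utf16ByteSize(const S: UnicodeString): Integer;
begin
  Result := Length(S) * SizeOf(WideChar);
end;

var
  Composed, Decomposed: UnicodeString;
begin
  Composed := #$00E9;      // "e with acute" as one code point
  Decomposed := 'e'#$0301; // "e" followed by a combining acute accent
  WriteLn(Length(Composed), ' char(s), ', Utf16ByteSize(Composed), ' bytes');
  WriteLn(Length(Decomposed), ' char(s), ', Utf16ByteSize(Decomposed), ' bytes');
  WriteLn(Composed = Decomposed); // plain comparison, no normalisation applied
end.
```

Both strings display as one letter, but the second one has two chars and four bytes, and the two do not compare equal unless they are normalised first. Every individual char is still exactly 2 bytes.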
I am writing a class that will save wide strings to a binary file.
When you write the class in D2005 you will be using Widestring
When you migrate to D2010 Widestring will still be valid and work properly.
Widestring in D2005 is the same as WideString in D2010.
The fact that String=WideString in D2010 need not be considered since the compiler deals with those issues easily.
Your input routine, taking (AString: String), needs only one extra step on entering the proc:
procedure SaveAStringToBIN_File(AString: String);
var
  wkstr: WideString;
begin
  {$IFDEF Unicode}
  wkstr := AString;
  {$ELSE}
  wkstr := UTF8Decode(AString);
  {$ENDIF}
  // ...
  // the rest is the same as saving a WideString to a file stream:
  // write the length (word) of the string, then the data
end;
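The "write the length, then the data" step could look roughly like the sketch below. The helper names are hypothetical, and the 32-bit length prefix is an assumption of this sketch (the description above uses a word). WideString is used so that the same code should compile in Delphi 2005 and in Delphi 2009 or later; newer compilers may need the unit written as System.Classes.

```pascal
program WideStreamDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Classes;

// Hypothetical helper: writes a 32-bit char count followed by raw UTF-16 data.
procedure WriteWide(Stream: TStream; const S: WideString);
var
  Len: Integer;
begin
  Len := Length(S); // length in 2-byte chars
  Stream.WriteBuffer(Len, SizeOf(Len));
  if Len > 0 then
    Stream.WriteBuffer(S[1], Len * SizeOf(WideChar));
end;

// Hypothetical helper: reads back what WriteWide wrote (no sanity checks).
function ReadWide(Stream: TStream): WideString;
var
  Len: Integer;
begin
  Stream.ReadBuffer(Len, SizeOf(Len));
  SetLength(Result, Len);
  if Len > 0 then
    Stream.ReadBuffer(Result[1], Len * SizeOf(WideChar));
end;

// Writes S to a memory stream and reads it back again.
function RoundTrip(const S: WideString): WideString;
var
  M: TMemoryStream;
begin
  M := TMemoryStream.Create;
  try
    WriteWide(M, S);
    M.Position := 0;
    Result := ReadWide(M);
  finally
    M.Free;
  end;
end;

begin
  WriteLn(RoundTrip('abc') = 'abc');
end.
```

Since both WideString and UnicodeString hold UTF-16, data written this way from a WideString can later be read into a UnicodeString by changing only the declared type in the reader.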

Delphi Unicode String Type Stored Directly at its Address (or "Unicode ShortString")

I want a string type that is Unicode and that stores the string directly at the address of the variable, as is the case of the (Ansi-only) ShortString type.
I mean, if I declare a S: ShortString and let S := 'My String', then, at @S, I will find the length of the string (as one byte, so the string cannot contain more than 255 characters) followed by the ANSI-encoded string itself.
What I would like is a Unicode variant of this. That is, I want a string type such that, at @S, I will find an unsigned 32-bit integer (or a single byte would be enough, actually) containing the length of the string in bytes (or in characters, which is half the number of bytes) followed by the Unicode representation of the string. I have tried WideString, UnicodeString, and RawByteString, but they all appear only to store an address at @S, and the actual string somewhere else (I guess this has to do with reference counting and such). Update: The most important reason for this is probably that it would be very problematic if sizeof(string) were variable.
I suspect that there is no built-in type to use, and that I have to come up with my own way of storing text the way I want (which actually is fun). Am I right?
Update
I will, among other things, need to use these strings in packed records. I also need manually to read/write these strings to files/the heap. I could live with fixed-size strings, such as <= 128 characters, and I could redesign the problem so it will work with null-terminated strings. But PChar will not work, for sizeof(PChar) is just the size of a pointer - it's merely an address.
The approach I eventually settled for was to use a static array of bytes. I will post my implementation as a solution later today.
You're right. There is no exact analogue to ShortString that holds Unicode characters. There are lots of things that come close, including WideString, UnicodeString, and arrays of WideChar, but if you're not willing to revisit the way you intend to use the data type (make byte-for-byte copies in memory and in files while still using them in all the contexts a string could be allowed), then none of Delphi's built-in types will work for you.
WideString fails because you insist that the string's length must exist at the address of the string variable, but WideString is a reference type; the only thing at its address is another address. Its length happens to be at the address held by the variable, minus four. That's subject to change, though, because all operations on that type are supposed to go through the API.
UnicodeString fails for that same reason, as well as because it's a reference-counted type; making a byte-for-byte copy of one breaks the reference counting, so you'll get memory leaks, invalid-pointer-operation exceptions, or more subtle heap corruption.
An array of WideChar can be copied without problems, but it doesn't keep track of its effective length, and it also doesn't act like a string very often. You can assign string literals to it and it will act like you called StrLCopy, but you can't assign string variables to it.
You could define a record that has a field for the length and another field for a character array. That would resolve the length issue, but it would still have all the rest of the shortcomings of an undecorated array.
If I were you, I'd simply use a built-in string type. Then I'd write functions to help transfer it between files, blocks of memory, and native variables. It's not that hard; probably much easier than trying to get operator overloading to work just right with a custom record type. Consider how much code you will write to load and store your data versus how much code you're going to write that uses your data structure like an ordinary string. You're going to write the data-persistence code once, but for the rest of the project's lifetime, you're going to be using those strings, and you're going to want them to look and act just like real strings. So use real strings. "Suffer" the inconvenience of manually producing the on-disk format you want, and gain the advantage of being able to use all the existing string library functions.
PChar should work like this, right? AFAIK, it's an array of chars stored right where you put it. Zero terminated, not sure how that works with Unicode Chars.
You actually have this in some way with the new unicode strings.
s as a pointer points to s[1] and the 4 bytes on the left contains the length.
But why not simply use Length(s)?
And for direct reading of the length from memory:
procedure TForm9.Button1Click(Sender: TObject);
var
  s: string;
begin
  s := 'hlkk ljhk jhto';
  {$POINTERMATH ON}
  Assert(Length(s) = (PInteger(s)-1)^);
  //if you don't want POINTERMATH, replace by PInteger(Cardinal(s)-SizeOf(Integer))^
  showmessage(IntToStr(length(s)));
end;
There's no Unicode version of ShortString. If you want to store unicode data inline inside an object instead of as a reference type, you can allocate a buffer:
var
  buffer: array[0..255] of WideChar;
This has two disadvantages. 1, the size is fixed, and 2, the compiler doesn't recognize it as a string type.
The main problem here is #1: The fixed size. If you're going to declare an array inside of a larger object or record, the compiler needs to know how large it is in order to calculate the size of the object or record itself. For ShortString this wasn't a big problem, since they could only go up to 256 bytes (1/4 of a K) total, which isn't all that much. But if you want to use long strings that are addressed by a 32-bit integer, that makes the max size 4 GB. You can't put that inside of an object!
This, not the reference counting, is why long strings are implemented as reference types, whose inline size is always a constant sizeof(pointer). Then the compiler can put the string data inside a dynamic array and resize it to fit the current needs.
Why do you need to put something like this into a packed array? If I were to guess, I'd say this probably has something to do with serialization. If so, you're better off using a TStream and a normal Unicode string, and writing an integer (size) to the stream, and then the contents of the string. That turns out to be a lot more flexible than trying to stuff everything into a packed array.
The solution I eventually settled for is this (real-world sample - the string is, of course, the third member called "Ident"):
TASStructMemHeader = packed record
  TotalSize: cardinal;
  MemType: TASStructMemType;
  Ident: packed array[0..63] of WideChar;
  DataSize: cardinal;
  procedure SetIdent(const AIdent: string);
  function ReadIdent: string;
end;
where
function TASStructMemHeader.ReadIdent: string;
begin
  result := WideCharLenToString(PWideChar(@(Ident[0])), length(Ident));
end;
procedure TASStructMemHeader.SetIdent(const AIdent: string);
var
  i: Integer;
begin
  if length(AIdent) > 63 then
    raise Exception.Create('Too long structure identifier.');
  FillChar(Ident[0], length(Ident) * sizeof(WideChar), 0);
  Move(AIdent[1], Ident[0], length(AIdent) * sizeof(WideChar));
end;
But then I realized that the compiler really can interpret array[0..63] of WideChar as a string, so I could simply write
var
MyStr: string;
Ident := 'This is a sample string.';
MyStr := Ident;
Hence, after all, the answer given by Mason Wheeler above is actually the answer.

Resources