Ansi TStringList in Delphi XE for non-Unicode compatibility - delphi

I am trying to migrate DevExpress TdxDBGrid for Unicode applications and the lack of non-Unicode TStringList is the only obstacle for completing the migration. I have tried to use TAnsiStringList from JcLAnsiString (from Jedi/Jcl open source project) and while it works, it includes too many dependencies on Jedi/Jcl framework. Generally my plan is to use migrated TdxDBGrid for working with unicode data, but TAnsiStringList is required for internal actions - like storing bookmarks, selected rows and so on.
Is the more lightweight non-Unicode TStringList (with less dependencies)?

Not sure what functionality of TStringList you use, but you can try generic TList<AnsiString> instead of TStringList for your task. If the only reason to use AnsiString type instead of String is keeping internally some strings, maybe it will be enough.

Related

Looking for a good Delphi unicode string library

Any suggestion for a good unicode string library for Delphi 2010? Such thing as class that would contain a collection of independent functions, basically an encapsulation of functions that manipulate strings (ex: Trimlike, Character removal, Positional, Sub-string, Compare, Informational, Case, Replacement, Manipulation functions etc. ).
Thanks
What about SysUtils and StrUtils? They contain many String manipulation functions.
And if those functions aren't enough you could try the JclStrings unit from the JCL - JEDI Code Library (not to be confused with the JVCL - JEDI Visual Component Library).
Mike Lischke has an excellent Unicode library at Soft Gems. It hasn't been updated for Delphi 2009/2010 yet, but it was already working with WideStrings/WideChars, so it should be a pretty trivial port.

What do I need to know to upgrade a complex application from C++Builder 2007 to 2010?

My company's main application is mostly written in C++ (with some Delphi code and components). We are upgrading from RAD Studio 2007 to 2010 for the next release, starting in about a week. What do I need to know to ensure this upgrade goes smoothly?
Points I have thought of so far are:
Unicode. This one looks really complicated. Our app contains a horrible mix of std::string-s and AnsiString-s with casts to and from them. I have lots of questions about this, such as "is wstring capable of holding everything a UnicodeString can, and should we just do a search/replace", or "should we avoid all C++ string types altogether and use UnicodeString", "can we change all event handlers to use String though the existing .HPPs event handler method prototypes were compiler-translated to AnsiString", right down to basics such as "should we prefix all strings with L, or is the compiler smart enough with Unicode enabled to use Unicode strings", etc. Any insight on this would be really appreciated.
We also need backwards compatibility. Our app uses its own binary tuple format that currently stores strings as an array of bytes. I need to upgrade this to read old files and, presumably, write new Unicode strings as well. How do I handle Unicode strings embedded in a binary format? Is there any generic way where I can point a UnicodeString at an array of bytes, that may be originally written as either ANSI bytes or Unicode, and it will figure out what they are?
Third-party components. We use SpTBX mainly, and it appears to be compatible.
Project upgrades. The standard advice in the Codegear forums seems to be to manually recreate all project files when upgrading. This is an awful lot of work (7 projects (mostly libs) in our main app, plus half a dozen DLLs, a lot of files.) Is there any way to automate this?
How's the linker look? We traditionally have a lot of trouble with the linker randomly crashing or running out of resources, though it got a lot better in 2007. This is one reason our main application is split into several libs - the linker cannot (hopefully, "could not, but now can"?) handle it otherwise.
I know there's a new type library editor and format (it stores the IDL, ie text, and generates the TLB dynamically?) How well does this handle upgrading existing COM projects with a TLB? We have Delphi code and TLB that are built into the C++ application.
Is there anything else I should be considering or be aware of?
I have found:
2007 and 2010 co-existing. I'm not sure I trust this answer since I have had issues with 2006 and 2007 on the same machine before.
several answers about Unicode: writing strings with 2009 and generic transition to Unicode text but none are answers for concerns, or the C++Builder-specific parts at all.
This question about guidelines upgrading to 2009 but though the answers are helpful, they don't answer all the Unicode-related issues above.
[Edit: added] Codegear documents for Unicode in RAD Studio and things to look for when converting to Unicode
Project upgrades. The standard advice in the Codegear forums seems to be to manually recreate all project files when upgrading. This is an awful lot of work (7 projects (mostly libs) in our main app, plus half a dozen DLLs, a lot of files.) Is there any way to automate this?
There is: just use the IDE's project importer :)
Seriously, I would just try importing the projects, and then go investigate if it doesn't seem to work.
How's the linker look? We traditionally have a lot of trouble with the linker randomly crashing or running out of resources, though it got a lot better in 2007. This is one reason our main application is split into several libs - the linker cannot (hopefully, "could not, but now can"?) handle it otherwise.
I've had almost no trouble with ILINK anymore since C++Builder 2009. I've occasionally read that others experienced out-of-memory errors, but someone in the newsgroups has discovered a workaround:
https://forums.embarcadero.com/thread.jspa?messageID=140012&tstart=0#140012
Also, as you can read here, the compiler got a new option (-Cx) to control the maximal amount of memory it allocates.
I know there's a new type library editor and format (it stores the IDL, ie text, and generates the TLB dynamically?) How well does this handle upgrading existing COM projects with a TLB?
Should work without a hitch.
I have lots of questions about this, such as "is wstring capable of holding everything a UnicodeString can, and should we just do a search/replace"
Yes, on Windows platforms wchar_t usually is 16 bit large, which means it suffices for holding UTF-16 which UnicodeString is.
or "should we avoid all C++ string types altogether and use UnicodeString"
Depends on how portable your code needs to be. In any case, whenever you just need a string type, use "String", not "UnicodeString".
"can we change all event handlers to use String though the existing .HPPs were compiler-translated to AnsiString"
First, you should NEVER re-use .hpp files generated by older versions of DCC!
For event handlers that use the String type in Delphi, you must use UnicodeString. As above, simply use "String", and your code will work for both the ANSI and Unicode versions of C++Builder.
right down to basics such as "should we prefix all strings with L, or is the compiler smart enough with Unicode enabled to use Unicode strings"
The compiler doesn't convert your strings (it would conflict with the language standards), but both AnsiString and UnicodeString do have copy constructor overloads for both char* and wchar_t* string literals. I.e., the following will work:
AnsiString as = L"foo";
UnicodeString us = "bar";
What will not work this way, though, is the whole bunch of printf()/scanf() functions; AnsiString::sprintf() takes const char*, UnicodeString::sprintf() takes const wchar_t*.
If you are using sprintf() a lot, you may find my CbdeFormat library useful; just read my article on the subject.
You do not say what the data strings in your binary tuple format are for: is it necessary for them to store Unicode? When I transitioned from D2007 to D2009 I was able to keep some parts of the system ANSI-string only.
If storing Unicode is required, then you need to check if your existing data is compatible with a format such as UTF-8. If the range of values stored in existing data files present a problem, then I would make your next upgrade do a one-time conversion of any old data files, reading in the old AnsiString data and writing it back as UTF-8 to a different file name or extension, or by modifying appropriate file header data. I have been versioning data files for a long time, just to allow this sort of processing change.
I am only just starting a BCB2010 project, so cannot comment on your other questions, but I certainly had difficulty upgrading a Delphi project from D2007 to D2009 - though I was able to fix this by editing the project file, which is just XML.
Good luck with the conversion ;-)
Unicode. This one looks really
complicated. Our app contains a
horrible mix of std::string-s and
AnsiString-s with casts to and from
them. I have lots of questions about
this, such as "is wstring capable of
holding everything a UnicodeString
can, and should we just do a
search/replace"
std::wstring contains wchar_t* strings, just like System::UnicodeString does.
should we avoid all C++ string
types altogether and use
UnicodeString
That is up to you to decide. char* strings are still supported. You are not forced to migrate everything to Unicode.
can we change all event handlers to
use String though the existing .HPPs
were compiler-translated to AnsiString
No, you cannot change auto-managed event handlers to use the System::String alias. All IDE versions will complain about that. You will have to manually update your event handler declarations and implementations to use UnicodeString parameters instead of AnsiString parameters when appropriate. That also means you cannot share DFMs and Unit .h files across multiple IDE versions, either (which you should not be doing anyway).
should we prefix all strings with L,
or is the compiler smart enough with
Unicode enabled to use Unicode strings
No. If you declare a string constant or character constant without an L prefix, the data will still be interpretted as Ansi. That has not changed. You can, however, pass Ansi data to System::UnicodeString (but not to std::wstring), and it will convert to Unicode automatically. But you have to be careful because it will use the OS's default Ansi codepage to interpret the data. As long as your Ansi data is only using ASCII characters only, then you will be OK. Otherwise, if you are using non-ASCII characters, then you are better off putting the data into a System::AnsiStringT or System::RawByteString (both were introduced in CB2009) that has been assigned the correct codepage, and then assign that to your System::UnicodeString variable. The associated codepage will be used instead of the OS default codepage for the conversion.
We also need backwards compatibility.
Our app uses its own binary tuple
format that currently stores strings
as an array of bytes. I need to
upgrade this to read old files and,
presumably, write new Unicode strings
as well. How do I handle Unicode
strings embedded in a binary format?
If your tuple is expecting 8-bit characters, then you will have to make sure that any struct declarations and such are using char and not wchar_t characters. If you need to store Unicode strings, but need to maintain the 8-bit compatibility, then you should encode your Unicode strings to UTF-8 first (you can use the System::UTF8String string type to help you - starting in CB2009, it is a true UTF-8 string now). As long as you do not use non-ASCII characters, then your old apps will not know the difference, as ASCII characters are encoded as-is in UTF-8. If you want to store raw Unicode data, however, then your tuple would need a flag somewhere (if it does not already have one) indicating whether the string data is stored as Ansi or Unicode, and your apps would have to look for that flag.
Is there any generic way where I can
point a UnicodeString at an array of
bytes, that may be originally written
as either ANSI bytes or Unicode, and
it will figure out what they are?
No. You have to know the actual encoding of the bytes beforehand. If you pass a memory address to System::AnsiString or std::string, it is going to assume Ansi characters. If you pass the same memory address to System::UnicodeString or std::wstring, it is going to assume Unicode characters instead.
Third-party components. We use SpTBX
mainly, and it appears to be
compatible.
Just like with all prior versions (except for the migration from 2006 to 2007), any third-party components you have will need to be re-compiled for 2010, either manually (if you have the source code for them) or by their respective vendors.
Project upgrades. The standard advice
in the Codegear forums seems to be to
manually recreate all project files
when upgrading.
Yes. That still applies.
I know there's a new type library
editor and format (it stores the IDL,
ie text, and generates the TLB
dynamically?)
.TLB files are not used at all anymore. The new system operates on .ridl (Reduced IDL) files now. During compiling, the .ridl produces the correct TypeLibrary information in the executable's binary resources directly. No .tlb files are generated.
How well does this handle upgrading
existing COM projects with a TLB? We
have Delphi code and TLB that are
built into the C++ application.
I do not remember whether CB2010 (or CB2009, for that matter) can consume pre-existing .tlb files directly. I don't think they can. You can, however, run the .tlb file through tlibimp.exe and it will export a .ridl file. Or you can copy the IDL text from the TLB editor in a past version and paste it into a new .ridl file manually. Either way, you can then add that .ridl ile to your CB2010 project.
2007 and 2010 co-existing. I'm not
sure I trust this answer since I have
had issues with 2006 and 2007 on the
same machine before.
That is why I use virtual machines when installing multiple IDE versions on the same physical machine.
Is the cost of upgrading in line with the benefits?
Why not start a gradual upgrade where new components would be developed on the new platform. Integrate the new components to the old version via different interop helpers.
This approach was suggested to vb6 developers who were thinking about upgrading to vb.net.

Compile delphi 5 code in Delphi 2009

It is possible to work with a Delphi 5 project in the Delphi 2009 IDE by referencing the Delphi 5 version of dcc32?
If so are there any issues to watch out for concerning the way that project settings (search paths, conditional defines etc.) are implemented in 2009?
Edit: Just to clarify I am also upgrading the project to Unicode but will still need to debug and run releases in the old configuration
It depends on what you're trying to accomplish and what limitations you are willing to accept.
As far as I know, you can't use the Delphi 2009 IDE to maintain Delphi 5 projects directly. For example, even if you stick to functionality that's common between the two, some properties that are not supported in Delphi 5 are written to your DFMs, causing an error at run time.
I've maintained projects and library code that were written in Delphi 2005/2006/2007 that was also being used in Delphi 6/7. I usually edited and debugged these using the latest IDE. I had separate project files for each target version and made sure they all used the same memory manager. Finally, I had an automated build process and unit tests that would strip incompatible properties out of the DFMs (my own DFM Scrubber), make sure all of the targets always compile and run unit tests, which are also recompiled for each target.
All in all, it's more effort and I wouldn't recommend it unless you have a specific requirement to do so.
No. That said, it is still Delphi, and assuming you have source or D2009 versions of any custom components it can be modified to compile in Delphi 2009. The layout of the VCL has changed quite a bit since D5, so expect to have to modify your uses clauses and probably rewrite some small chunks here and there, but it is doable.
You either port your code to Delphi 2009/2010 level (Unicode), or you may as well not install the product.
I suggest you open the project and see where it fails, close the project (without saving anything), find the component versions you need and install them, and once the project opens up in design mode (all components are installed) you can start porting.
Read the Unicode Delphi migration (porting) information available at the website.
Ask your self each time you see PChar, and Char, if it needs to be PAnsiChar, or AnsiChar instead? If you are reading bytes from a disk, a com port, or a network connection, you will need to change from Char to AnsiChar, from PChar to PAnsiChar, otherwise, you might just leave the Char and PChar as they are and they will become Unicode. Always be aware that Char is not a Byte, anymore.
You also must replace explicit references to narrow Win32 API calls with versions without the A (ansi) suffix. Example: CreateFileA might need to become just CreateFile.
W

Has anyone experience with porting a D2007 + TntControls application to D2009?

I have a rather large (freeware) project written with Delphi 2007 which is using both the TntUnicodeControls and the TntLXControls library and I'm planning to move to Delphi 2009.
Unfortunately I'm using those libraries everywhere in my project:
Replacement for VCL controls to provide Unicode capability
Win32 API wrappers (mostly for comparing strings)
The feature enhancements of TntLXForms, TntLXRegistry, ...
Third-party components which use TntControls. (VirtualTrees, SpTBXLib, updates for D2009 are available)
Do you have any experience and/or suggestions in porting such a project to Delphi 2009. Is it advisable to first switch to the (commercial) TMS Unicode controls?
Install GExperts; there is "Replace component" IDE addin that can help converting TTntXXX to TXXXX controls. Try for once, and if it's ok just check "Replace evrywhere in project".
SpTbx and VirtualTrees can only be recompiled - they both support D2009.
If you used WinAPI wrappers just to call Unicode API-s they should work in D2009 also.
That leaves you with TntLX controls (TntLXForms, TntLXRegistry, ...). Since they are unsupported, may be it is good time to change them anyway.
I can help with some of this, as I'm porting a C++Builder application that uses TNT from 2007 to 2009. The switch to Unicode in D2009 is overdue and welcome. However, it's unfortunate that the transition is probably easier for those who have NOT needed unicode in the past, and probably still don't. If, like me, you needed Unicode and used Troy Wolbrink's great TNT control to provide it, you have a rather more complex job...
The good news is that there's a new version of TNTControls from TMS Software which supports D2009. I haven't looked at this, but expect it is just a 'facade' layer over the native VCL components to ease portability. I'd consider that if your other libraries can be rebuilt to use it.
However, you may be better going back to native VCL controls, and the reason is string types. TNT control have always used WideString to pass Unicode strings back and forth, and you may well have WideString use scattered through your own code. This will work, but it's not ideal as WideString should really just be used for COM interop as it 'wraps' the COM BSTR type. Native unicode strings in D2009 are reference-counted and should be significantly faster.
If you do decide to replace TNT components with native VCL ones, you can use GExperts "Replace Components" command - or, maybe easier, do a search and replace in your .DFM and .PAS files (which you DO have in text form, don't you) to replace TTNT with T.
I recommend the following resources:
Marco Cantu's Delphi 2009 Handbook Chapter 3 (Porting to unicode)
http://www.marcocantu.com/dh2009/
Nick Hodges' articles (Delphi in a Unicode World)
http://blogs.codegear.com/nickhodges/2008/11/20/39149
I think either way it's going to be a lot of work. Probably more so than if you hadn't done all the work to make it unicode compatible before. I personally would forget about the tms Unicode controls, and go back to the vcl. It will save more pain in the future. (nothing against those controls, mind you.)
Also remember, that D2009's string, is not the same thing as D2007's Widestring which you have undoubtedly used in your app. So all instances of Widestring, which you so diligently changed from string (which was AnsiString), need to go back again to string(which is now unicodestring).

Porting a unicode enabled Delphi 2006 application to Delphi 2009

I have an application which is fully unicode compatible in Delphi 2006. I had replaced all AnsiStrings with WideStrings, replaced all VCL controls with TNT controls, and changed all string functions from AnsiStrings to WideStrings. It looks like all that work was for nothing, because I'm going to have to reverse it all. Is there anyway to Trick Delphi 2009 into thinking Widestrings are in fact UnicodeStrings?
No, there really isn't. But you won't regret the work to truly Unicode enable your application.
The TNT controls can easily be replaced with the regular VCL controls. You can do that pretty simply using the wizard from GExperts (http://www.gexperts.org) that replaces one control type with another automatically.
Then, you can change all your WideString declarations to regular strings. String is now an alias for UnicodeString, and so all your strings can hold Unicode data just fine.
BTW, the author of the TNT controls, Troy Wolbrink, now vastly prefers Delphi 2009 over his own controls.
Main advantage of TNT Controls is only that It can work as Ansi program in Windows 9x. It is not full unicode. If you want full unicode support everywhere (such as Stringlist.LoadFromFile, Form.OnKeyPress) it's good to move to Delphi 2009.
I have done the same thing in an application that used different XML files as input. In my case, I was working with UTF-8 (so we could use regular strings) throughout the program and only converted to WideString for display purposes (TNT controls).
I removed the conversions back and forth between WideString and UTF-8 and replaced the TNT controls with regular VCL controls by hand since there were only a handful of forms.
The conversion took about an hour with testing. The code is simpler and the program is noticeably faster.

Resources