Delphi: Any StringReplaceW or WideStringReplace functions out there?

Are there any wide-string manipulation implementations out there?
function WideUpperCase(const S: WideString): WideString;
function WidePos(Substr: WideString; S: WideString): Integer;
function StringReplaceW(const S, OldPattern, NewPattern: WideString;
Flags: TReplaceFlags): WideString;
etc

The JEDI project includes JclUnicode.pas, which has WideUpperCase and WidePos, but not StringReplace. The SysUtils.pas StringReplace code isn't very complicated, so you could easily just copy that and replace string with WideString, AnsiPos with WidePos, and AnsiUpperCase with WideUpperCase and get something functional, if slow.
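A minimal sketch of that copy-and-adapt approach, assuming a Delphi version whose SysUtils provides WideUpperCase and whose Pos has a WideString overload (it uses those instead of the JCL routines). The name WideStringReplace is hypothetical, and the loop only mirrors the general structure of SysUtils.StringReplace, so it is just as slow on large inputs:

```pascal
program WideReplaceDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils;

// Hypothetical helper: StringReplace logic applied to WideString.
function WideStringReplace(const S, OldPattern, NewPattern: WideString;
  Flags: TReplaceFlags): WideString;
var
  SearchStr, Patt, NewStr: WideString;
  Offset: Integer;
begin
  // For case-insensitive matching, search in upper-cased copies
  // but cut the output from the original text.
  if rfIgnoreCase in Flags then
  begin
    SearchStr := WideUpperCase(S);
    Patt := WideUpperCase(OldPattern);
  end
  else
  begin
    SearchStr := S;
    Patt := OldPattern;
  end;
  NewStr := S;
  Result := '';
  while SearchStr <> '' do
  begin
    Offset := Pos(Patt, SearchStr);
    if Offset = 0 then
    begin
      // No further match: append the remainder unchanged.
      Result := Result + NewStr;
      Break;
    end;
    Result := Result + Copy(NewStr, 1, Offset - 1) + NewPattern;
    NewStr := Copy(NewStr, Offset + Length(OldPattern), MaxInt);
    if not (rfReplaceAll in Flags) then
    begin
      Result := Result + NewStr;
      Break;
    end;
    SearchStr := Copy(SearchStr, Offset + Length(Patt), MaxInt);
  end;
end;

begin
  Writeln(WideStringReplace('Hello World', 'o', '0', [rfReplaceAll]));
end.
```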

I generally import the "Microsoft VBScript Regular Expression 5.5" type library and use IRegExp objects.
OP Edit
I like this answer, and I went ahead and wrote a StringReplaceW function using RegEx:
function StringReplaceW(const S, OldPattern, NewPattern: WideString; Flags: TReplaceFlags): WideString;
var
objRegExp: OleVariant;
Pattern: WideString;
i: Integer;
begin
{
Convert the OldPattern string into a series of unicode points to match
\uxxxx\uxxxx\uxxxx
\uxxxx Matches the character expressed by the Unicode code unit xxxx.
"\u00A3" matches "£".
}
Pattern := '';
for i := 1 to Length(OldPattern) do
Pattern := Pattern+'\u'+IntToHex(Ord(OldPattern[i]), 4);
objRegExp := CreateOleObject('VBScript.RegExp');
try
objRegExp.Pattern := Pattern;
objRegExp.IgnoreCase := (rfIgnoreCase in Flags);
objRegExp.Global := (rfReplaceAll in Flags);
Result := objRegExp.Replace(S, NewPattern);
finally
objRegExp := Null;
end;
end;

The TntControls has a set of Wide-version functions.

Related

Any RTL function to remove accents from a char?

Nowadays, with Sydney, is there any RTL function to remove accents from a char (é becomes e, for example) in a String? I know this question has been asked before, but I would like to know whether the answers are still accurate with Sydney. I would especially love to find a function that works on all platforms (the one I use right now works only through WideString and the Windows API).
Found and modified an implementation that uses NormalizeString() from this article:
How to use NormalizeString function in delphi?
This works for me in Delphi 10.3 Rio (include System.Character in your uses clause):
function NormalizeString(NormForm: NORM_FORM; lpSrcString: LPCWSTR; cwSrcLength: Integer; lpDstString: LPWSTR; cwDstLength: Integer): Integer; stdcall; external 'normaliz.dll';
function NormalizeText(Str: string): string;
var
nLength: integer;
c: char;
i: integer;
temp: string;
CatStr:string;
begin
nLength := NormalizeString(NormalizationD, PChar(Str), Length(Str), nil, 0);
SetLength(temp, nLength);
nLength := NormalizeString(NormalizationD, PChar(Str), Length(Str), PChar(temp), nLength);
SetLength(temp, nLength);
CatStr:='';
for i := 1 to length(temp) do
begin
c:=temp[i];
if (TCharacter.GetUnicodeCategory(c) <> TUnicodeCategory.ucNonSpacingMark) and
(TCharacter.GetUnicodeCategory(c) <> TUnicodeCategory.ucCombiningMark) then
CatStr:=CatStr+c;
end;
result:=CatStr;
end;

What is the most simple way to check if a string may convert to AnsiString safely in XE4 and above?

In Delphi XE4 and above, we may write something like:
function TestAnsiCompatible(const aStr: string): Boolean;
begin
end;
string in Delphi XE4 is declared as UnicodeString. It may hold a unicode string.
If we do some type conversion:
function TestAnsiCompatible(const aStr: string): Boolean;
var a: AnsiString;
begin
a := aStr;
Result := a = aStr;
end;
The compiler then emits these warnings:
[dcc32 Warning]: W1058 Implicit string cast with potential data loss from 'string' to 'AnsiString'
[dcc32 Warning]: W1057 Implicit string cast from 'AnsiString' to 'string'
Is there a simpler, neater way to test whether aStr is fully compatible with AnsiString? Or should we check character by character:
function TestAnsiCompatible(const aStr: string): Boolean;
var C: Char;
begin
Result := True;
for C in aStr do begin
if C > #127 then begin
Result := False;
Break;
end;
end;
end;
All you have to do is type-cast away the warnings:
function TestAnsiCompatible(const aStr: string): Boolean;
var
a: AnsiString;
begin
a := AnsiString(aStr);
Result := String(a) = aStr;
end;
Which can be simplified to this:
function TestAnsiCompatible(const aStr: string): Boolean;
begin
Result := String(AnsiString(aStr)) = aStr;
end;
I used to check if String(a) = AnsiString(a), until I had a user who had transferred data from one PC to another that had a different code page. The data could then not be read back properly. So I changed my definition of "safe" to "string is code page 1252" (as this is the region where most of my users are). When reading back my data, I know I have to convert the string back from code page 1252.
function StringIs1252(const S: UnicodeString): Boolean;
// returns True if a string is in codepage 1252 (Western European (Windows))
// Cyrillic is 1251
const
WC_NO_BEST_FIT_CHARS = $00000400;
var
UsedDefaultChar: BOOL; // not Boolean!!
Len: Integer;
begin
if Length(S) = 0 then
Exit(True);
UsedDefaultChar := False;
Len := WideCharToMultiByte(1252, WC_NO_BEST_FIT_CHARS, PWideChar(S), Length(S), nil, 0, nil, @UsedDefaultChar);
if Len <> 0 then
Result := not UsedDefaultchar
else
Result := False;
end;
But if you want to check if your string can safely be converted to ansi - completely independent of the code page that is used when writing or reading, then you should check if all characters are in the range from #0..#127.

How to trim any character (or a substring) from a string?

I use C# basically. There I can do:
string trimmed = str.Trim('\t');
to trim tabulation from the string str and return the result to trimmed.
In Delphi 7 I found only Trim, which trims spaces.
How can I achieve the same functionality?
There is a string helper method, TStringHelper.Trim, that accepts an array of Char as an optional parameter.
function Trim(const TrimChars: array of Char): string; overload;
So, you can use
trimmed := str.Trim([#09]);
for your example. #09 here is ASCII code for Tab character.
This function exists since at least Delphi XE3.
Hope it helps.
This is the kind of procedure that is sometimes easier to write than to find :)
function TrimChar(const Str: string; Ch: Char): string;
var
S, E: integer;
begin
S:=1;
while (S <= Length(Str)) and (Str[S]=Ch) do Inc(S);
E:=Length(Str);
while (E >= 1) and (Str[E]=Ch) do Dec(E);
SetString(Result, PChar(@Str[S]), E - S + 1);
end;
In Delphi the Trim function does not take parameters but it does trim other characters as well as spaces. Here's the code (from System.SysUtils in XE2, I don't think it has changed):
function Trim(const S: string): string;
var
I, L: Integer;
begin
L := Length(S);
I := 1;
if (L > 0) and (S[I] > ' ') and (S[L] > ' ') then Exit(S);
while (I <= L) and (S[I] <= ' ') do Inc(I);
if I > L then Exit('');
while S[L] <= ' ' do Dec(L);
Result := Copy(S, I, L - I + 1);
end;
It is trimming anything less than ' ' which would eliminate any control characters like tab, carriage return and line feed.
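A small sketch of what that means for the tab case in the question: the built-in Trim already strips tabs and line breaks, because they all compare less than or equal to ' '. The program name is arbitrary:

```pascal
program TrimDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils;

begin
  // Leading tab, CR, LF and space, plus a trailing space and tab,
  // are all <= ' ' and are therefore removed.
  Writeln('[', Trim(#9#13#10' text '#9), ']');  // prints [text]
end.
```

What the built-in Trim cannot do is trim a character above the space, such as '_' or '"', which is where the hand-written variants come in.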
Delphi doesn't provide a function that does what you want. The built-in Trim function always trims the same set of characters (whitespace and control characters) from both ends of the input string. Several answers here show the basic technique for trimming arbitrary characters. As you can see, it doesn't have to be complicated. Here's my version:
function Trim(const s: string; c: Char): string;
var
First, Last: Integer;
begin
First := 1;
Last := Length(s);
while (First <= Last) and (s[First] = c) do
Inc(First);
while (First < Last) and (s[Last] = c) do
Dec(last);
Result := Copy(s, First, Last - First + 1);
end;
To adapt that for trimming multiple characters, all you have to do is change the second conditional term in each loop. What you change it to depends on how you choose to represent the multiple characters. C# uses an array. You could also put all the characters in a string, or you could use Delphi's native set type.
function Trim(const s: string; const c: array of Char): string;
// Replace `s[x] = c` with `CharInArray(s[x], c)`.
function Trim(const s: string; const c: string): string;
// Replace `s[x] = c` with `CharInString(s[x], c)`.
function Trim(const s: string; const c: TSysCharSet): string;
// Replace `s[x] = c` with `s[x] in c`.
The CharInArray and CharInString functions are easy to write:
function CharInArray(c: Char; ar: array of Char): Boolean;
var
i: Integer;
begin
Result := True;
for i := Low(ar) to High(ar) do
if ar[i] = c then
exit;
Result := False;
end;
// CharInString is identical, except for the type of `ar`.
Recall that as of Delphi 2009, Char is an alias for WideChar, meaning it's too big to fit in a set, so you wouldn't be able to use the set version unless you were guaranteed the input would always fit in an AnsiChar. Furthermore, the s[x] in c syntax generates warnings on WideChar arguments, so you'd want to use CharInSet(s[x], c) instead. (Unlike CharInArray and CharInString, the RTL provides CharInSet already, for Delphi versions that need it.)
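As a sketch of the set-based variant described above, assuming Delphi 2009 or later (where SysUtils provides CharInSet) and input whose trim characters all fit in an AnsiChar. TrimSet is a hypothetical name, chosen to avoid clashing with the RTL's Trim:

```pascal
program TrimSetDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils;

// Hypothetical helper: trims every character contained in c
// from both ends of s.
function TrimSet(const s: string; const c: TSysCharSet): string;
var
  First, Last: Integer;
begin
  First := 1;
  Last := Length(s);
  while (First <= Last) and CharInSet(s[First], c) do
    Inc(First);
  while (First < Last) and CharInSet(s[Last], c) do
    Dec(Last);
  Result := Copy(s, First, Last - First + 1);
end;

begin
  // Trim tabs and spaces only; characters inside the text are kept.
  Writeln('[', TrimSet(#9'  hello'#9' ', [#9, ' ']), ']');  // prints [hello]
end.
```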
You can use StringReplace:
var
str:String;
begin
str:='The_aLiEn'+Chr(VK_TAB)+'Delphi';
ShowMessage(str);
str:=StringReplace(str, chr(VK_Tab), '', [rfReplaceAll]);
ShowMessage(str);
end;
This removes all Tab characters from the given string. If you only want leading and trailing tabs removed, you can use the Pos function as well.
Edit:
For the comment asking how to do it with Pos, here it is:
var
str:String;
s, e: PChar;
begin
str:=Chr(VK_TAB)+Chr(VK_TAB)+'The_aLiEn'+Chr(VK_TAB)+'Delphi'+Chr(VK_TAB)+Chr(VK_TAB);
s:=PChar(str);
while Pos(Chr(VK_TAB), s)=1 do inc(s);
e:=s;
inc(e, length(s)-1);
while Pos(Chr(VK_TAB), e)=1 do dec(e);
str:=Copy(s, 1, length(s)-length(e)+1);
ShowMessage(str);
end;
This is of course the same approach as Maksee's, with a bit more work involved. But if there isn't much time to finish the job and Pos is what you thought of first, this is how it can be done. You, the programmer, have to think about optimizations, not me. And if we're talking about optimization, with a little tweak to replace Pos with a character comparison, this will run faster than Maksee's code.
Edit for Substr search generalization:
function TrimStr(const Source, SubStr: String): String;
var
s, e: PChar;
l: Integer;
begin
s:=PChar(Source);
l:=Length(SubStr);
while Pos(SubStr, s)=1 do inc(s, l);
e:=s;
inc(e, length(s)-l);
while Pos(SubStr, e)=1 do dec(e, l);
Result:=Copy(s, 1, length(s)-length(e)+l);
end;
The JEDI JCL v2.7 provides these useful functions for what you need:
function StrTrimCharLeft(const S: string; C: Char): string;
function StrTrimCharsLeft(const S: string; const Chars: TCharValidator): string; overload;
function StrTrimCharsLeft(const S: string; const Chars: array of Char): string; overload;
function StrTrimCharRight(const S: string; C: Char): string;
function StrTrimCharsRight(const S: string; const Chars: TCharValidator): string; overload;
function StrTrimCharsRight(const S: string; const Chars: array of Char): string; overload;
function StrTrimQuotes(const S: string): string;

Conversion between absolute and relative paths in Delphi

Are there standard functions to perform absolute <--> relative path conversion in Delphi?
For example:
'Base' path is 'C:\Projects\Project1\'
Relative path is '..\Shared\somefile.pas'
Absolute path is 'C:\Projects\Shared\somefile.pas'
I am looking for something like this:
function AbsToRel(const AbsPath, BasePath: string): string;
// '..\Shared\somefile.pas' =
// AbsToRel('C:\Projects\Shared\somefile.pas', 'C:\Projects\Project1\')
function RelToAbs(const RelPath, BasePath: string): string;
// 'C:\Projects\Shared\somefile.pas' =
// RelToAbs('..\Shared\somefile.pas', 'C:\Projects\Project1\')
To convert to an absolute path, use:
ExpandFileName
To get a relative path, use:
ExtractRelativePath
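A minimal sketch of how these two SysUtils routines could be wrapped into the AbsToRel and RelToAbs signatures from the question. It assumes BasePath is a directory; the trailing delimiter is added because ExtractRelativePath treats the last component of a base without one as a file name. Note that ExtractRelativePath takes the base first, so the parameter order is swapped:

```pascal
program RelPathDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils;

function AbsToRel(const AbsPath, BasePath: string): string;
begin
  // Base comes first in ExtractRelativePath.
  Result := ExtractRelativePath(IncludeTrailingPathDelimiter(BasePath), AbsPath);
end;

function RelToAbs(const RelPath, BasePath: string): string;
begin
  // ExpandFileName resolves the '..' segments of the combined path.
  Result := ExpandFileName(IncludeTrailingPathDelimiter(BasePath) + RelPath);
end;

begin
  Writeln(AbsToRel('C:\Projects\Shared\somefile.pas', 'C:\Projects\Project1\'));
  Writeln(RelToAbs('..\Shared\somefile.pas', 'C:\Projects\Project1\'));
end.
```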
I would use PathRelativePathTo as the first function and PathCanonicalize as the second. In the latter case, the argument you pass is the concatenation of the base path and the relative path.
function PathRelativePathTo(pszPath: PChar; pszFrom: PChar; dwAttrFrom: DWORD;
pszTo: PChar; dwAtrTo: DWORD): LongBool; stdcall; external 'shlwapi.dll' name 'PathRelativePathToW';
function AbsToRel(const AbsPath, BasePath: string): string;
var
Path: array[0..MAX_PATH-1] of char;
begin
PathRelativePathTo(@Path[0], PChar(BasePath), FILE_ATTRIBUTE_DIRECTORY, PChar(AbsPath), 0);
result := Path;
end;
function PathCanonicalize(lpszDst: PChar; lpszSrc: PChar): LongBool; stdcall;
external 'shlwapi.dll' name 'PathCanonicalizeW';
function RelToAbs(const RelPath, BasePath: string): string;
var
Dst: array[0..MAX_PATH-1] of char;
begin
PathCanonicalize(@Dst[0], PChar(IncludeTrailingBackslash(BasePath) + RelPath));
result := Dst;
end;
procedure TForm4.FormCreate(Sender: TObject);
begin
ShowMessage(AbsToRel('C:\Users\Andreas Rejbrand\Desktop\file.txt', 'C:\Users\Andreas Rejbrand\Pictures'));
ShowMessage(RelToAbs('..\Videos\movie.wma', 'C:\Users\Andreas Rejbrand\Desktop'));
end;
Of course, if you use a non-Unicode version of Delphi (that is, <= Delphi 2007), you need to use the Ansi functions (*A) instead of the Unicode functions (*W).
For what it's worth, my codebase uses SysUtils.ExtractRelativePath in one direction and the following home-grown wrapper coming back:
function ExpandFileNameRelBaseDir(const FileName, BaseDir: string): string;
var
Buffer: array [0..MAX_PATH-1] of Char;
begin
if PathIsRelative(PChar(FileName)) then begin
Result := IncludeTrailingBackslash(BaseDir)+FileName;
end else begin
Result := FileName;
end;
if PathCanonicalize(@Buffer[0], PChar(Result)) then begin
Result := Buffer;
end;
end;
You'll need to use the ShLwApi unit for PathIsRelative and PathCanonicalize.
The call to PathIsRelative means that the routine is robust to absolute paths being specified.
So SysUtils.ExtractRelativePath can be your AbsToRel, except that the parameters are reversed, and my ExpandFileNameRelBaseDir will serve as your RelToAbs.
I just brewed this together:
uses
ShLwApi;
function RelToAbs(const ARelPath, ABasePath: string): string;
begin
SetLength(Result, MAX_PATH);
if PathCombine(@Result[1], PChar(IncludeTrailingPathDelimiter(ABasePath)), PChar(ARelPath)) = nil then
Result := ''
else
SetLength(Result, StrLen(@Result[1]));
end;
Thanks to Andreas and David for calling my attention to the Shell Path Handling Functions.
TPath.Combine(S1, S2);
Should be available since Delphi XE.
I am not too certain if this is still needed after 2+ years, but here is a way to get the Relative to Absolute (As for Absolute to Relative I would suggest philnext's ExtractRelativePath answer):
Unit: IOUtils
Parent: TPath
function GetFullPath(const BasePath: string): string;
It will return the full, absolute path for a given relative path. If the given path is already absolute, it will just return it as is.
Here is the link at Embarcadero: Get Full Path
And here is a link for Path Manipulation Routines
An alternate solution for RelToAbs is simply:
ExpandFileName(IncludeTrailingPathDelimiter(BasePath) + RelPath)
Check whether your solution still resolves a relative path to a full path correctly when the current directory changes. This works:
function PathRelativeToFull(APath : string) : string;
var
xDir : string;
begin
xDir := GetCurrentDir;
try
SetCurrentDir('C:\Projects\Project1\');
Result := ExpandFileName(APath);
finally
SetCurrentDir(xDir);
end{try..finally};
end;
function PathFullToRelative(APath : string; ABaseDir : string = '') : string;
begin
if ABaseDir = '' then
ABaseDir := 'C:\Projects\Project1\';
Result := ExtractRelativePath(ABaseDir, APath);
end;
Another version of RelToAbs (compatible with all Delphi XE versions).
uses
ShLwApi;
function RelPathToAbsPath(const ARelPath, ABasePath: string): string;
var Buff:array[0..MAX_PATH] of Char;
begin
if PathCombine(Buff, PChar(IncludeTrailingPathDelimiter(ABasePath)), PChar(ARelPath)) = nil then
Result := ''
else Result:=Buff;
end;
Combination of two standard methods Combine/GetFullPath (unit IOUtils) should give the desired result:
CombinedPath := TPath.Combine('Base path', 'Relative path');
FileName := TPath.GetFullPath(CombinedPath);

sprintf in Delphi?

Does anyone know a 100% clone of the C/C++ printf for Delphi?
Yes, I know the System.Format function, but it handles things a little differently.
For example if you want to format 3 to "003" you need "%03d" in C, but "%.3d" in Delphi.
I have an application written in Delphi which has to be able to format numbers using C format strings, so do you know a snippet/library for that?
Thanks in advance!
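To make the difference mentioned in the question concrete, here is a small sketch using only SysUtils.Format. In Delphi's format strings the precision part of %d sets the minimum number of digits and pads with zeros, while the width part pads with spaces:

```pascal
program FormatDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils;

begin
  Writeln(Format('%.3d', [3]));           // prints 003 (precision = minimum digits)
  Writeln('[', Format('%3d', [3]), ']');  // prints [  3] (width pads with spaces)
end.
```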
You could use the wsprintf() function from Windows.pas. Unfortunately this function is not declared correctly in the Windows.pas so here is a redeclaration:
function wsprintf(Output: PChar; Format: PChar): Integer; cdecl; varargs;
external user32 name {$IFDEF UNICODE}'wsprintfW'{$ELSE}'wsprintfA'{$ENDIF};
procedure TForm1.FormCreate(Sender: TObject);
var
S: String;
begin
SetLength(S, 1024); // wsprintf can work only with max. 1024 characters
SetLength(S, wsprintf(PChar(S), '%s %03d', 'Hallo', 3));
end;
If you want to let the function look more Delphi friendly to the user, you could use the following:
function _FormatC(const Format: string): string; cdecl;
const
StackSlotSize = SizeOf(Pointer);
var
Args: va_list;
Buffer: array[0..1024] of Char;
begin
// va_start(Args, Format)
Args := va_list(PAnsiChar(@Format) + ((SizeOf(Format) + StackSlotSize - 1) and not (StackSlotSize - 1)));
SetString(Result, Buffer, wvsprintf(Buffer, PChar(Format), Args));
end;
const // allows us to use "varargs" in Delphi
FormatC: function(const Format: string): string; cdecl varargs = _FormatC;
procedure TForm1.Button1Click(Sender: TObject);
begin
ShowMessage(FormatC('%s %03d', 'Hallo', 3));
end;
It's not recommended to use (ws)printf, since they are prone to buffer overflows. It would be better to use the safe variants (e.g. StringCchPrintF), which are already declared in the Jedi Apilib (JwaStrSafe).
Well, I just found this one:
function sprintf(S: PAnsiChar; const Format: PAnsiChar): Integer;
cdecl; varargs; external 'msvcrt.dll';
It simply uses the original sprintf function from msvcrt.dll which can then be used like that:
procedure TForm1.Button1Click(Sender: TObject);
var s: AnsiString;
begin
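SetLength(s, 99);
// sprintf returns the number of characters written; shrink the string to that length
SetLength(s, sprintf(PAnsiChar(s), '%d - %d', 1, 2));
ShowMessage(S);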
end;
I don't know if this is the best solution because it needs this external dll and you have to set the string's length manually which makes it prone to buffer overflows, but at least it works... Any better ideas?
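One way to reduce the overflow risk mentioned above is to call the bounded _snprintf export of msvcrt.dll instead of sprintf. The following is only a sketch under a few assumptions: the declaration is hand-written rather than taken from an RTL unit, NativeUInt stands in for C's size_t (so a Delphi version that has NativeUInt is needed), and FormatPair is a hypothetical helper name:

```pascal
program SnprintfDemo;
{$APPTYPE CONSOLE}

// Hand-written import; Count is the capacity of Buffer in characters.
function _snprintf(Buffer: PAnsiChar; Count: NativeUInt;
  Format: PAnsiChar): Integer; cdecl; varargs; external 'msvcrt.dll';

// Hypothetical helper: formats two integers with a C format string.
function FormatPair(A, B: Integer): AnsiString;
var
  Len: Integer;
begin
  SetLength(Result, 99);
  Len := _snprintf(PAnsiChar(Result), Length(Result), '%d - %d', A, B);
  // _snprintf returns a negative value when the output did not fit;
  // in that case keep the truncated buffer contents.
  if Len < 0 then
    Len := Length(Result);
  SetLength(Result, Len);
end;

begin
  Writeln(FormatPair(1, 2));
end.
```

The buffer size still has to be chosen by hand, but output that is too long is truncated instead of overwriting memory.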
A cleaner approach without unnecessary type casting (this assumes a non-Unicode Delphi, where PChar is PAnsiChar; sprintf writes ANSI characters):
function sprintf(CharBuf: PChar; const Format: PAnsiChar): Integer;
cdecl; varargs; external 'msvcrt.dll';
procedure TForm1.Button1Click(Sender: TObject);
var CharBuf: PChar;
begin
CharBuf:=StrAlloc (99);
sprintf(CharBuf, 'two numbers %d - %d', 1, 2);
ShowMessage(CharBuf);
StrDispose(CharBuf);
end;
If you happen to cross-compile for a Windows CE application, use coredll.dll instead of msvcrt.dll.
