How to trim any character (or a substring) from a string?

I mainly use C#. There I can do:
string trimmed = str.Trim('\t');
to trim tabs from the string str and store the result in trimmed.
In Delphi 7 I found only Trim, which trims spaces.
How can I achieve the same functionality?

The string helper TStringHelper has a Trim overload that accepts an array of Char as an optional parameter.
function Trim(const TrimChars: array of Char): string; overload;
So, you can use
trimmed := str.Trim([#09]);
for your example. Here #09 is the ASCII code for the Tab character.
This function has existed since at least Delphi XE3, so it is not available in Delphi 7.
Hope it helps.
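As a small illustration of the helper above, the sketch below trims several characters at once. It assumes Delphi XE3 or later with System.SysUtils in the uses clause; the function name TrimTabsAndUnderscores is made up for this example.

```pascal
// Assumes System.SysUtils is in the uses clause (Delphi XE3 or later).
// TrimTabsAndUnderscores is a hypothetical name used only for illustration.
function TrimTabsAndUnderscores(const s: string): string;
begin
  // Trim removes any of the listed characters from both ends of s;
  // characters in the middle of the string are left alone.
  Result := s.Trim([#9, '_']);
end;
```

The helper also has TrimLeft and TrimRight overloads that take the same kind of array, if only one end should be trimmed.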

This is the kind of routine that is sometimes easier to write than to find :)
function TrimChar(const Str: string; Ch: Char): string;
var
S, E: integer;
begin
S:=1;
while (S <= Length(Str)) and (Str[S]=Ch) do Inc(S);
E:=Length(Str);
while (E >= S) and (Str[E]=Ch) do Dec(E);
Result := Copy(Str, S, E - S + 1); // Copy yields '' when every character was trimmed
end;

In Delphi the Trim function does not take parameters but it does trim other characters as well as spaces. Here's the code (from System.SysUtils in XE2, I don't think it has changed):
function Trim(const S: string): string;
var
I, L: Integer;
begin
L := Length(S);
I := 1;
if (L > 0) and (S[I] > ' ') and (S[L] > ' ') then Exit(S);
while (I <= L) and (S[I] <= ' ') do Inc(I);
if I > L then Exit('');
while S[L] <= ' ' do Dec(L);
Result := Copy(S, I, L - I + 1);
end;
It trims every character less than or equal to ' ', which eliminates control characters such as tab, carriage return and line feed.
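To illustrate the point above, here is a minimal check. It assumes SysUtils is in the uses clause and that assertions are enabled; the procedure name DemoBuiltInTrim is only for illustration.

```pascal
// Assumes SysUtils is in the uses clause.
// DemoBuiltInTrim is a hypothetical name used only for illustration.
procedure DemoBuiltInTrim;
var
  s: string;
begin
  s := #9'  abc'#13#10;
  // Tab (#9), space, CR (#13) and LF (#10) are all <= ' ',
  // so the built-in Trim strips them from both ends.
  Assert(Trim(s) = 'abc');
end;
```

Unlike C#'s Trim('\t'), this also removes spaces, so it only fits when all leading and trailing whitespace may go.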

Delphi doesn't provide a function that does what you want. The built-in Trim function always trims the same set of characters (whitespace and control characters) from both ends of the input string. Several answers here show the basic technique for trimming arbitrary characters. As you can see, it doesn't have to be complicated. Here's my version:
function Trim(const s: string; c: Char): string;
var
First, Last: Integer;
begin
First := 1;
Last := Length(s);
while (First <= Last) and (s[First] = c) do
Inc(First);
while (First < Last) and (s[Last] = c) do
Dec(last);
Result := Copy(s, First, Last - First + 1);
end;
To adapt that for trimming multiple characters, all you have to do is change the second conditional term in each loop. What you change it to depends on how you choose to represent the multiple characters. C# uses an array. You could also put all the characters in a string, or you could use Delphi's native set type.
function Trim(const s: string; const c: array of Char): string;
// Replace `s[x] = c` with `CharInArray(s[x], c)`.
function Trim(const s: string; const c: string): string;
// Replace `s[x] = c` with `CharInString(s[x], c)`.
function Trim(const s: string; const c: TSysCharSet): string;
// Replace `s[x] = c` with `s[x] in c`.
The CharInArray and CharInString functions are easy to write:
function CharInArray(c: Char; ar: array of Char): Boolean;
var
i: Integer;
begin
Result := True;
for i := Low(ar) to High(ar) do
if ar[i] = c then
exit;
Result := False;
end;
// CharInString is identical, except for the type of `ar`.
Recall that as of Delphi 2009, Char is an alias for WideChar, meaning it's too big to fit in a set, so you wouldn't be able to use the set version unless you were guaranteed the input would always fit in an AnsiChar. Furthermore, the s[x] in c syntax generates warnings on WideChar arguments, so you'd want to use CharInSet(s[x], c) instead. (Unlike CharInArray and CharInString, the RTL provides CharInSet already, for Delphi versions that need it.)
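Putting the pieces of this answer together, the array-of-Char variant might look like the sketch below. The name TrimChars is hypothetical (chosen to avoid clashing with the RTL's Trim), and CharInArray is the helper described above, not an RTL routine.

```pascal
// CharInArray and TrimChars are illustrative names, not RTL routines.
function CharInArray(c: Char; const ar: array of Char): Boolean;
var
  i: Integer;
begin
  for i := Low(ar) to High(ar) do
    if ar[i] = c then
    begin
      Result := True;
      Exit;
    end;
  Result := False;
end;

function TrimChars(const s: string; const chars: array of Char): string;
var
  First, Last: Integer;
begin
  First := 1;
  Last := Length(s);
  // Skip leading characters that appear in chars.
  while (First <= Last) and CharInArray(s[First], chars) do
    Inc(First);
  // Skip trailing characters that appear in chars.
  while (First < Last) and CharInArray(s[Last], chars) do
    Dec(Last);
  // Copy returns '' when nothing is left.
  Result := Copy(s, First, Last - First + 1);
end;
```

A call such as TrimChars(str, [#9, ' ']) then mirrors the C# overload that takes an array of characters.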

You can use StringReplace:
var
str:String;
begin
str:='The_aLiEn'+Chr(VK_TAB)+'Delphi';
ShowMessage(str);
str:=StringReplace(str, chr(VK_Tab), '', [rfReplaceAll]);
ShowMessage(str);
end;
This removes every Tab character from the string, not just those at the ends. If you want only the leading and trailing tabs removed, you can use the Pos function as well.
Edit:
In response to the comment asking how to do it with Pos, here it is:
var
str:String;
s, e: PChar;
begin
str:=Chr(VK_TAB)+Chr(VK_TAB)+'The_aLiEn'+Chr(VK_TAB)+'Delphi'+Chr(VK_TAB)+Chr(VK_TAB);
s:=PChar(str);
while Pos(Chr(VK_TAB), s)=1 do inc(s);
e:=s;
inc(e, length(s)-1);
while Pos(Chr(VK_TAB), e)=1 do dec(e);
str:=Copy(s, 1, length(s)-length(e)+1);
ShowMessage(str);
end;
This is of course the same approach as Maksee's, with a bit more work involved. But if you are short on time and Pos is the first thing that came to mind, this is how it can be done. Optimization is up to you, the programmer, not me. And speaking of optimization: with a small tweak that replaces Pos with a character comparison, this will run faster than Maksee's code.
Edit: generalized to trim a substring:
function TrimStr(const Source, SubStr: String): String;
var
s, e: PChar;
l: Integer;
begin
s:=PChar(Source);
l:=Length(SubStr);
while Pos(SubStr, s)=1 do inc(s, l);
e:=s;
inc(e, length(s)-l);
while Pos(SubStr, e)=1 do dec(e, l);
Result:=Copy(s, 1, length(s)-length(e)+l);
end;
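The substring variant can also be written with plain index arithmetic, which keeps every position inside the string. This is an alternative sketch, not the code from the answer above; the name TrimSubStr is hypothetical.

```pascal
// TrimSubStr is a hypothetical name; this is a sketch, not an RTL routine.
function TrimSubStr(const Source, SubStr: string): string;
var
  First, Last, L: Integer;
begin
  L := Length(SubStr);
  First := 1;
  Last := Length(Source);
  if L > 0 then
  begin
    // Strip whole copies of SubStr from the front.
    while (Last - First + 1 >= L) and (Copy(Source, First, L) = SubStr) do
      Inc(First, L);
    // Strip whole copies of SubStr from the back.
    while (Last - First + 1 >= L) and (Copy(Source, Last - L + 1, L) = SubStr) do
      Dec(Last, L);
  end;
  Result := Copy(Source, First, Last - First + 1);
end;
```

The length checks in both loops stop the scan as soon as fewer than Length(SubStr) characters remain, so a string made up entirely of the substring simply yields an empty result.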

The JEDI JCL v2.7 provides these useful functions for what you need:
function StrTrimCharLeft(const S: string; C: Char): string;
function StrTrimCharsLeft(const S: string; const Chars: TCharValidator): string; overload;
function StrTrimCharsLeft(const S: string; const Chars: array of Char): string; overload;
function StrTrimCharRight(const S: string; C: Char): string;
function StrTrimCharsRight(const S: string; const Chars: TCharValidator): string; overload;
function StrTrimCharsRight(const S: string; const Chars: array of Char): string; overload;
function StrTrimQuotes(const S: string): string;

Related

What is the most simple way to check if a string may convert to AnsiString safely in XE4 and above?

In Delphi XE4 and above, we may write something like:
function TestAnsiCompatible(const aStr: string): Boolean;
begin
end;
string in Delphi XE4 is declared as UnicodeString. It may hold a unicode string.
If we do some type conversion:
function TestAnsiCompatible(const aStr: string): Boolean;
var a: AnsiString;
begin
a := aStr;
Result := a = aStr;
end;
The compiler then emits these warnings:
[dcc32 Warning]: W1058 Implicit string cast with potential data loss from 'string' to 'AnsiString'
[dcc32 Warning]: W1057 Implicit string cast from 'AnsiString' to 'string'
Is there a simpler, neater way to test whether aStr is fully compatible with AnsiString? Or must we check it character by character:
function TestAnsiCompatible(const aStr: string): Boolean;
var C: Char;
begin
Result := True;
for C in aStr do begin
if C > #127 then begin
Result := False;
Break;
end;
end;
end;

All you have to do is type-cast away the warnings:
function TestAnsiCompatible(const aStr: string): Boolean;
var
a: AnsiString;
begin
a := AnsiString(aStr);
Result := String(a) = aStr;
end;
Which can be simplified to this:
function TestAnsiCompatible(const aStr: string): Boolean;
begin
Result := String(AnsiString(aStr)) = aStr;
end;

I used to check if String(a) = AnsiString(a), until a user transferred data from one PC to another that used a different code page. The data could then no longer be read back properly. So I changed my definition of "safe" to "the string is in code page 1252" (the region where most of my users are). Now, when reading my data back, I know I have to convert the string from code page 1252.
function StringIs1252(const S: UnicodeString): Boolean;
// returns True if a string is in codepage 1252 (Western European (Windows))
// Cyrillic is 1251
const
WC_NO_BEST_FIT_CHARS = $00000400;
var
UsedDefaultChar: BOOL; // not Boolean!!
Len: Integer;
begin
if Length(S) = 0 then
Exit(True);
UsedDefaultChar := False;
Len := WideCharToMultiByte(1252, WC_NO_BEST_FIT_CHARS, PWideChar(S), Length(S), nil, 0, nil, @UsedDefaultChar);
if Len <> 0 then
Result := not UsedDefaultchar
else
Result := False;
end;
But if you want to check whether your string can safely be converted to ANSI, completely independent of the code page used when writing or reading, then you should check that all characters are in the range #0..#127.

Only allow certain characters in a string

I am trying to validate a string so that it may contain only alphabetical and numerical characters, as well as the underscore ( _ ) symbol.
This is what I tried so far:
var
S: string;
const
Allowed = ['A'..'Z', 'a'..'z', '0'..'9', '_'];
begin
S := 'This_is_my_string_0123456789';
if Length(S) > 0 then
begin
if (Pos(Allowed, S) > 0 then
ShowMessage('Ok')
else
ShowMessage('string contains invalid symbols');
end;
end;
In Lazarus this errors with:
Error: Incompatible type for arg no. 1: Got "Set Of Char", expected "Variant"
Clearly my use of Pos is all wrong and I am not sure if my approach is even the correct way of going about it or not?
Thanks.

You will have to check every single character of the string to see whether it is contained in Allowed,
e.g.:
var
S: string;
const
Allowed = ['A' .. 'Z', 'a' .. 'z', '0' .. '9', '_'];
Function Valid: Boolean;
var
i: Integer;
begin
Result := Length(s) > 0;
i := 1;
while Result and (i <= Length(S)) do
begin
Result := Result AND (S[i] in Allowed);
inc(i);
end;
if Length(s) = 0 then Result := true;
end;
begin
S := 'This_is_my_string_0123456789';
if Valid then
ShowMessage('Ok')
else
ShowMessage('string contains invalid symbols');
end;

TYPE TCharSet = SET OF CHAR;
FUNCTION ValidString(CONST S : STRING ; CONST ValidChars : TCharSet) : BOOLEAN;
VAR
I : Cardinal;
BEGIN
Result:=FALSE;
FOR I:=1 TO LENGTH(S) DO IF NOT (S[I] IN ValidChars) THEN EXIT;
Result:=TRUE
END;
If you are using a Unicode version of Delphi (as you seem to be), beware that a SET OF CHAR cannot contain all valid characters in the Unicode character set. Then perhaps this function will be useful instead:
FUNCTION ValidString(CONST S,ValidChars : STRING) : BOOLEAN;
VAR
I : Cardinal;
BEGIN
Result:=FALSE;
FOR I:=1 TO LENGTH(S) DO IF POS(S[I],ValidChars)=0 THEN EXIT;
Result:=TRUE
END;
but then again, not all characters (actually Codepoints) in Unicode can be expressed by a single character, and some characters can be expressed in more than one way (both as a single character and as a multi-character).
But as long as you constrain yourself within these limitations, one of the above functions should be useful. You can even include both, if you add an "OVERLOAD;" directive to the end of each function declaration, as in:
FUNCTION ValidString(CONST S : STRING ; CONST ValidChars : TCharSet) : BOOLEAN; OVERLOAD;
FUNCTION ValidString(CONST S,ValidChars : STRING) : BOOLEAN; OVERLOAD;

Lazarus/Free Pascal doesn't overload Pos for sets, but the StrUtils unit has PosSet variants for this purpose:
http://www.freepascal.org/docs-html/rtl/strutils/posset.html
Regarding Andreas' (IMHO correct) remark, you can use IsEmptyStr for that. It was meant to check for strings that contain only whitespace, but it really checks whether a string contains only characters from a given set.
http://www.freepascal.org/docs-html/rtl/strutils/isemptystr.html
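Applied to the validation problem in the question, the IsEmptyStr approach might look like the sketch below. It assumes Free Pascal with StrUtils in the uses clause (Delphi's StrUtils does not offer this routine); the function name IsValidIdentifierText is hypothetical.

```pascal
// Free Pascal only: assumes StrUtils is in the uses clause.
// IsValidIdentifierText is a hypothetical name used only for illustration.
function IsValidIdentifierText(const S: string): Boolean;
const
  Allowed = ['A'..'Z', 'a'..'z', '0'..'9', '_'];
begin
  // IsEmptyStr reports whether S consists solely of characters in the set.
  Result := IsEmptyStr(S, Allowed);
end;
```

Because the check is set-based, it shares the limitation mentioned earlier: only single-byte characters can be listed in Allowed.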

You can use Regular Expressions:
uses System.RegularExpressions;
if not TRegEx.IsMatch(S, '^[_a-zA-Z0-9]+$') then
ShowMessage('string contains invalid symbols');

How to count number of occurrences of a certain char in string?

How can I count the number of occurrences of a certain character in a string in Delphi?
For instance, assume that I have the following string and would like to count the number of commas in it:
S := '1,2,3';
Then I would like to obtain 2 as the result.

You can use this simple function:
function OccurrencesOfChar(const S: string; const C: char): integer;
var
i: Integer;
begin
result := 0;
for i := 1 to Length(S) do
if S[i] = C then
inc(result);
end;

Even though an answer has already been accepted, I'm posting the more general function below because I find it so elegant. This solution is for counting the occurrences of a string rather than a character.
{ Returns a count of the number of occurences of SubText in Text }
function CountOccurences( const SubText: string;
const Text: string): Integer;
begin
Result := Pos(SubText, Text);
if Result > 0 then
Result := (Length(Text) - Length(StringReplace(Text, SubText, '', [rfReplaceAll]))) div Length(subtext);
end; { CountOccurences }

And for those who prefer the enumerator loop in modern Delphi versions (not any better than the accepted solution by Andreas, just an alternative solution):
function OccurrencesOfChar(const ContentString: string;
const CharToCount: char): integer;
var
C: Char;
begin
result := 0;
for C in ContentString do
if C = CharToCount then
Inc(result);
end;

This one will do the job if you're not handling large text
...
uses RegularExpressions;
...
function CountChar(const s: string; const c: char): integer;
begin
Result:= TRegEx.Matches(s, TRegEx.Escape(c)).Count // escape c so that characters such as '.' are matched literally
end;

You can also take advantage of the StringReplace function:
function OccurencesOfChar(ContentString:string; CharToCount:char):integer;
begin
// Note: rfIgnoreCase makes the count case-insensitive; drop it for an exact match.
Result:= Length(ContentString)-Length(StringReplace(ContentString, CharToCount,'', [rfReplaceAll, rfIgnoreCase]));
end;

A simple solution with good performance (I wrote it for Delphi 7, but it should work in other versions as well; PosEx lives in the StrUtils unit):
function CountOccurences(const ASubString: string; const AString: string): Integer;
var
iOffset: Integer;
iSubStrLen: Integer;
begin
Result := 0;
if (ASubString = '') or (AString = '') then
Exit;
iOffset := 1;
iSubStrLen := Length(ASubString);
while (True) do
begin
iOffset := PosEx(ASubString, AString, iOffset);
if (iOffset = 0) then
Break;
Inc(Result);
Inc(iOffset, iSubStrLen);
end;
end;

Ummm... Am I missing something? Why not just...
kSepChar:=',';//to count commas
bLen:=length(sLineToCheck);
bCount:=0;//The number of kSepChars seen so far.
bPosn:=1;//First character in string is at position 1
for bPosn:=1 to bLen do begin
if sLineToCheck[bPosn]=kSepChar then inc(bCount);
end;//

Delphi: Any StringReplaceW or WideStringReplace functions out there?

Are there any wide-string manipulation implementations out there?
function WideUpperCase(const S: WideString): WideString;
function WidePos(Substr: WideString; S: WideString): Integer;
function StringReplaceW(const S, OldPattern, NewPattern: WideString;
Flags: TReplaceFlags): WideString;
etc

The JEDI project includes JclUnicode.pas, which has WideUpperCase and WidePos, but not StringReplace. The SysUtils.pas StringReplace code isn't very complicated, so you could easily just copy that and replace string with WideString, AnsiPos with WidePos, and AnsiUpperCase with WideUpperCase and get something functional, if slow.
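A hand-rolled replacement along those lines could look like the sketch below. It is not the SysUtils code; it is a deliberately simple (and slow) version that compares slices with Copy instead of calling WidePos. The name WideStringReplaceSketch is hypothetical, SysUtils is assumed to be in the uses clause for TReplaceFlags and WideUpperCase, and the case-insensitive path assumes that upper-casing does not change the length of the string.

```pascal
// Assumes SysUtils is in the uses clause (TReplaceFlags, WideUpperCase).
// WideStringReplaceSketch is a hypothetical name; this is not the RTL code.
function WideStringReplaceSketch(const S, OldPattern, NewPattern: WideString;
  Flags: TReplaceFlags): WideString;
var
  SearchStr, Patt: WideString;
  i, PatLen: Integer;
begin
  Result := '';
  PatLen := Length(OldPattern);
  if PatLen = 0 then
  begin
    Result := S;
    Exit;
  end;
  if rfIgnoreCase in Flags then
  begin
    // Assumption: upper-casing keeps the character count unchanged.
    SearchStr := WideUpperCase(S);
    Patt := WideUpperCase(OldPattern);
  end
  else
  begin
    SearchStr := S;
    Patt := OldPattern;
  end;
  i := 1;
  while i <= Length(S) do
  begin
    if Copy(SearchStr, i, PatLen) = Patt then
    begin
      Result := Result + NewPattern;
      Inc(i, PatLen);
      if not (rfReplaceAll in Flags) then
      begin
        // Only the first match is replaced: append the rest untouched.
        Result := Result + Copy(S, i, MaxInt);
        Exit;
      end;
    end
    else
    begin
      Result := Result + S[i];
      Inc(i);
    end;
  end;
end;
```

Building the result one character at a time is quadratic in the worst case, so this is only reasonable for short strings.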

I generally import the "Microsoft VBScript Regular Expression 5.5" type library and use IRegExp objects.
OP Edit
I like this answer, so I went ahead and wrote a StringReplaceW function using RegEx:
function StringReplaceW(const S, OldPattern, NewPattern: WideString; Flags: TReplaceFlags): WideString;
var
objRegExp: OleVariant;
Pattern: WideString;
i: Integer;
begin
{
Convert the OldPattern string into a series of unicode points to match
\uxxxx\uxxxx\uxxxx
\uxxxx Matches the ASCII character expressed by the UNICODE xxxx.
"\u00A3" matches "£".
}
Pattern := '';
for i := 1 to Length(OldPattern) do
Pattern := Pattern+'\u'+IntToHex(Ord(OldPattern[i]), 4);
objRegExp := CreateOleObject('VBScript.RegExp');
try
objRegExp.Pattern := Pattern;
objRegExp.IgnoreCase := (rfIgnoreCase in Flags);
objRegExp.Global := (rfReplaceAll in Flags);
Result := objRegExp.Replace(S, NewPattern);
finally
objRegExp := Null;
end;
end;

TntControls has a set of wide-string versions of these functions.

Why won't Delphi 2009 let me Include a Char in a set?

Here is another question about converting old code to D2009 and Unicode. I'm certain there is a simple solution, but I don't see it...
CharacterSet is a set of Char and s[i] should also be a Char.
But the compiler still thinks there is a conflict between AnsiChar and Char.
The code:
TSetOfChar = Set of Char;
procedure aFunc;
var
CharacterSet: TSetOfChar;
s: String;
j: Integer;
CaseSensitive: Boolean;
begin
// Other code that assign a string to s
// Set CaseSensitive to a value
CharacterSet := [];
for j := 1 to Length(s) do
begin
Include(CharacterSet, s[j]); // E2010 Incompatible types: 'AnsiChar' and 'Char'
if not CaseSensitive then
begin
Include(CharacterSet, AnsiUpperCase(s[j])[1]);
Include(CharacterSet, AnsiLowerCase(s[j])[1])
end
end;
end;

Because a Pascal set can't have a range higher than 0..255, the compiler quietly converts sets of chars to sets of AnsiChars. That's what's causing trouble for you.
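When the characters of interest are known to stay within the single-byte range, a set can still be used for membership tests through CharInSet. The sketch below assumes Delphi 2009 or later (or Free Pascal) with SysUtils in the uses clause; the function name IsAsciiLetter is hypothetical.

```pascal
// Assumes SysUtils is in the uses clause (Delphi 2009+ or Free Pascal).
// IsAsciiLetter is a hypothetical name used only for illustration.
function IsAsciiLetter(c: Char): Boolean;
begin
  // The set literal holds single-byte characters only; CharInSet
  // performs the membership test against a (possibly wide) Char.
  Result := CharInSet(c, ['A'..'Z', 'a'..'z']);
end;
```

This does not help with building a set from arbitrary Unicode input, which is the case the question runs into.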

There is no good and simple answer to the question (the reason is already given by Mason). The good solution is to reconsider the algorithm and get rid of the "set of char" type. The quick and dirty solution is to stay with ANSI chars and strings:
TSetOfChar = Set of AnsiChar;
procedure aFunc;
var
CharacterSet: TSetOfChar;
s: String;
S1, SU, SL: Ansistring;
j: Integer;
CaseSensitive: Boolean;
begin
// Other code that assign a string to s
// Set CaseSensitive to a value
S1:= s;
SU:= AnsiUpperCase(s);
SL:= AnsiLowerCase(s);
CharacterSet := [];
for j := 1 to Length(S1) do
begin
Include(CharacterSet, S1[j]);
if not CaseSensitive then
begin
Include(CharacterSet, SU[j]);
Include(CharacterSet, SL[j]);
end
end;
end;

Delphi does not support sets of Unicode characters. You can only use AnsiChar in a set, but that's not big enough to fit all the possible characters your string might hold.
Instead of Delphi's native set type, though, you can use the TBits type.
procedure aFunc;
var
CharacterSet: TBits;
s: String;
c: Char;
CaseSensitive: Boolean;
begin
// Other code that assign a string to s
// Set CaseSensitive to a value
CharacterSet := TBits.Create;
try
for c in s do begin
CharacterSet[Ord(c)] := True;
if not CaseSensitive then begin
CharacterSet[Ord(Character.ToUpper(c))] := True;
CharacterSet[Ord(Character.ToLower(c))] := True;
end
end;
finally
CharacterSet.Free;
end;
end;
A TBits object automatically expands to accommodate the highest bit it needs to represent.
Other changes I made to your code include using the new "for-in" loop style, and the new Character unit for dealing with Unicode characters.
