I am trying to manipulate a string and pull only certain data from it. I need to do this on a record pulled from a database that gives me the full name of a person. I need to pull only the last name from the string and store it as a variable. Is there a way that I can do this?
Example: the SQL query returns the full name "Mary Ellen Jones". I need to extract only "Jones" from the string so I can store it in a variable for further processing.
I thought maybe AnsiRightStr would work but the problem is needing to give it a set integer to pull from the right. Maybe a way to count the characters after the final space allowing me to use AnsiRightStr(string,int) for this? Any help at all is appreciated.
Additional thought: Would replacing the spaces with a delimiter say :: and then parsing that data into a Stringlist followed by allowing me to pull the last index of the string list be possible?
Several valid options have been presented so far, but none of them handles a name such as "John St. James, Jr." Is this impossible?
You can use the LastDelimiter function to get the position of the last space and then extract the substring with the Copy function.
uses
SysUtils;
var
Name : string;
p : Integer;
ShortName : string;
begin
Name:='Mary Ellen Jones';
//You can call Trim to avoid problems with trailing spaces; it is not necessary here, since this is just a test
//Name:=Trim(Name);
//get the last space position
p:=LastDelimiter(' ',Name);
//get the name
ShortName:=Copy(Name,p+1,length(Name)-p);
end;
or using a function
function GetLast(const Name:string) : string;
var
p : Integer;
begin
Result:=Trim(Name);
p:=LastDelimiter(' ',Result);
Result:=Copy(Result,p+1,length(Result)-p);
end;
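The string-list idea raised in the question can also be made to work without inventing a :: delimiter. Below is a minimal sketch, assuming Delphi 2006 or later (for StrictDelimiter); the function name LastWordViaStringList is made up for illustration.

```pascal
uses
  SysUtils, Classes;

// Hypothetical helper: returns the last space-separated word of FullName.
function LastWordViaStringList(const FullName: string): string;
var
  Parts: TStringList;
begin
  Result := '';
  Parts := TStringList.Create;
  try
    Parts.Delimiter := ' ';
    Parts.StrictDelimiter := True; // split on the delimiter only
    Parts.DelimitedText := Trim(FullName);
    if Parts.Count > 0 then
      Result := Parts[Parts.Count - 1]; // last index of the list
  finally
    Parts.Free;
  end;
end;
```

For 'Mary Ellen Jones' this returns 'Jones'. It allocates a list per call, so the LastDelimiter version is likely the lighter choice when this runs once per database record.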
function GetLastWord(const Str: string): string;
var
p: integer;
i: Integer;
const
SPACE = #$20;
begin
p := 1;
for i := length(Str) downto 1 do
if Str[i] = SPACE then
begin
p := i + 1;
break;
end;
result := Copy(Str, p, MaxInt);
end;
This will fail if the string ends with an (accidental) space, such as 'Andreas Rejbrand '. This more robust version handles that case too:
function GetLastWord(const Str: string): string;
var
p: integer;
i: Integer;
FoundNonSpace: boolean;
const
SPACE = #$20;
begin
p := 1;
FoundNonSpace := false;
for i := length(Str) downto 1 do
if (Str[i] = SPACE) and FoundNonSpace then
begin
p := i + 1;
break
end
else if Str[i] <> SPACE then
FoundNonSpace := true;
result := TrimRight(Copy(Str, p, MaxInt));
end;
What if the last name is, say, "St. James"? Is there any way to account for that?
Here's my approach.
Make a list of lastname-markers
Search that list in order of preference
As soon as a match is found, mark that as the start of last name
Return substring starting from that pos.
var
LastNameMarkers: TStringList = nil;
SuffixFix: TStringList = nil;
procedure InitLists;
begin
LastNameMarkers:= TStringList.Create;
//LastNameMarkers.LoadFromFile('c:\markers.txt');
LastNameMarkers.Add(' St.');
LastnameMarkers.Add(' Mc');
LastNameMarkers.Add(' '); //Marker of last resort.
SuffixFix:= TStringList.Create;
SuffixFix.Add(' Jr.');
SuffixFix.Add(' Sr.');
end;
function GetLastName(FullName: string): string;
var
i: integer;
start: integer;
found: boolean;
ReplaceWith: string;
begin
if LastNameMarkers = nil then InitLists;
//Fix suffixes
i:= 0;
found:= false;
while (i < SuffixFix.Count) and not found do begin
start:= pos(LowerCase(SuffixFix[i]),LowerCase(Fullname));
found:= Start > 0;
Inc(i);
end; {while}
if Found then begin
Dec(i);
ReplaceWith:= StringReplace(SuffixFix[i], ' ', '_',[]);
FullName:= StringReplace(FullName, SuffixFix[i], ReplaceWith,[rfIgnoreCase]);
end; {if}
//Look for lastnames
i:= 0;
found:= false;
while (i < LastNameMarkers.Count) and not found do begin
start:= pos(LastNameMarkers[i],Fullname);
found:= Start > 0;
Inc(i);
end; {while}
//Markers begin with a space, so the name starts one character after Start
if found then Result:= RightStr(FullName, Length(FullName)- Start)
else Result:= '';
Result:= StringReplace(Result, '_', ' ',[]);
end;
I haven't dealt with upper and lowercase properly, but I hope you get the idea.
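One way to handle the upper/lowercase issue might be to compare lower-cased copies and keep using the returned position on the original string. This is only a sketch; PosIgnoreCase is a hypothetical helper name, not an RTL function.

```pascal
uses
  SysUtils;

// Hypothetical helper: like Pos, but ignores case.
// The returned index can be used on the original string S.
function PosIgnoreCase(const SubStr, S: string): Integer;
begin
  Result := Pos(AnsiLowerCase(SubStr), AnsiLowerCase(S));
end;
```

With it, PosIgnoreCase(' st.', 'John St. James') returns 5, the position of the space before "St.".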
function TfrmCal.GetLastName(FullName: string): string;
var
i: integer;
found: boolean;
suffix: string;
marker: string;
begin
// Build the lists for the compare.
InitLists;
// Look at Suffixes and attach them to the LastName
i := 0;
found := False;
while (i < SuffixFix.Count) do
begin
if AnsiContainsStr(FullName, SuffixFix[i]) then
begin
suffix := '::' + trim(SuffixFix[i]);
FullName := ReplaceStr(FullName, SuffixFix[i], suffix);
found := True;
end;
inc(i);
if found then
break;
end;
// Look for LastName Markers
i := 0;
found := False;
while (i < LastNameMarkers.Count) do
begin
if AnsiContainsStr(FullName, LastNameMarkers[i]) then
begin
marker := trimright(LastNameMarkers[i]) + '::';
FullName := ReplaceStr(FullName, LastNameMarkers[i], marker);
found := True;
end;
inc(i);
if found then
break;
end;
FullName := GetLastWord(FullName);
FullName := ReplaceStr(FullName, '::', ' ');
LastNameMarkers.Clear;
SuffixFix.Clear;
Result := FullName;
end;
function TfrmCal.GetLastWord(const Str: string): string;
var
p: integer;
i: integer;
const
SPACE = #$20;
begin
p := 1;
for i := Length(Str) downto 1 do
if Str[i] = SPACE then
begin
p := i + 1;
break;
end;
Result := Copy(Str, p, MaxInt);
end;
These two functions together do what I needed. There is also the InitLists function, which is clunky and ugly and still needs work, so I didn't post it here.
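For what it's worth, one way to tidy up InitLists might look like the sketch below. It assumes the two lists are unit-level variables as in the earlier answer. The unit name NameLists is made up, and the ' ' marker of last resort is left out on the assumption that GetLastWord serves as the fallback in the code above.

```pascal
unit NameLists; // hypothetical unit name

interface

uses
  Classes;

var
  LastNameMarkers: TStringList = nil;
  SuffixFix: TStringList = nil;

procedure InitLists;

implementation

uses
  SysUtils;

procedure InitLists;
begin
  // Create each list only once; later calls just refill it.
  if LastNameMarkers = nil then
    LastNameMarkers := TStringList.Create;
  if SuffixFix = nil then
    SuffixFix := TStringList.Create;

  LastNameMarkers.Clear;
  LastNameMarkers.Add(' St.');
  LastNameMarkers.Add(' Mc');

  SuffixFix.Clear;
  SuffixFix.Add(' Jr.');
  SuffixFix.Add(' Sr.');
end;

initialization

finalization
  // Release the lists when the unit is unloaded.
  FreeAndNil(LastNameMarkers);
  FreeAndNil(SuffixFix);

end.
```

Because the lists are created once and freed in the finalization section, calling InitLists from every GetLastName call does not allocate new objects each time.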
I have this simple operation in Java, where the string is split by new line and break.
String i= "Holidays
Great.
Bye";
String []linesArray = i.split("\\r?\\n");
I would like to obtain the same result in Delphi 2006.
Is it valid to use the following steps?
charArray[0] := '\\r';
charArray[1] := '\\n';
strArray := strA.Split(charArray);
I interpret your request like this: "Split a string at both CR and LF." which implies that CR+LF gives an empty string element. For instance, 'alpha'#13'beta'#10'gamma'#13#10'delta' yields the five elements 'alpha', 'beta', 'gamma', '', and 'delta'.
If so, and if you are using a non-ancient version of Delphi, this is really simple:
var S := 'alpha'#13'beta'#10'gamma'#13#10'delta';
var Parts := S.Split([#13, #10]);
for var Part in Parts do
ShowMessage(Part);
For old Delphi versions
The code above requires TStringHelper (crucially) and also makes use of inline variable declarations, for in loops, and generics.
For old Delphi versions, you can do it manually:
type
TStringArray = array of string;
function Split(const S: string): TStringArray;
var
Count: Integer;
const
Delta = 512;
procedure Add(const Part: string);
begin
if Length(Result) = Count then
SetLength(Result, Length(Result) + Delta);
Result[Count] := Part;
Inc(Count);
end;
var
p, i: Integer;
begin
Result := nil;
Count := 0;
p := 0; // previous delim
for i := 1 to Length(S) do
if S[i] in [#13, #10] then
begin
Add(Copy(S, Succ(p), i - p - 1));
p := i;
end;
Add(Copy(S, Succ(p)));
SetLength(Result, Count);
end;
procedure TForm1.FormCreate(Sender: TObject);
var
S: string;
Parts: TStringArray;
i: Integer;
begin
S := 'alpha'#13'beta'#10'gamma'#13#10'delta';
Parts := Split(S);
for i := 0 to High(Parts) do
ShowMessage(Parts[i]);
end;
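If CR+LF should instead count as a single line break (closer to what the Java pattern \r?\n does), a string list may be enough, even in old Delphi versions. This is a sketch only; SplitLines is a hypothetical helper name. Note one difference from the Java pattern: assigning Text also breaks at a lone CR.

```pascal
uses
  Classes;

// Hypothetical helper: fills Lines with the lines of S.
// With the default line-break setting, assigning Text treats CR, LF and
// CR+LF each as one line break.
procedure SplitLines(const S: string; Lines: TStrings);
begin
  Lines.Text := S;
end;
```

For 'alpha'#13'beta'#10'gamma'#13#10'delta' this gives four lines: 'alpha', 'beta', 'gamma' and 'delta', with no empty element between 'gamma' and 'delta'.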
In a Delphi 10.4.2 Win32 VCL Application, and based on the question + solution here which provides a way to get the string representation of a Shortcut Key (but presumably with no possibility to also pass a SHIFTSTATE for the Shortcut Key) I wrote this code:
function MyGetSpecialShortcutName(ShortCut: TShortCut): string;
// gets shortcut name for e.g. VK_NUMPAD0 where TMenuItem.Shortcut gets the wrong shortcut name
var
ScanCode: Integer;
KeyName: array[0..255] of Char;
begin
Result := '';
FillChar(KeyName, SizeOf(KeyName), 0);
ScanCode := Winapi.Windows.MapVirtualKey(LoByte(Word(ShortCut)), 0) shl 16;
if ScanCode <> 0 then
begin
if Winapi.Windows.GetKeyNameText(ScanCode, KeyName, Length(KeyName)) <> 0 then
Result := KeyName;
end;
end;
function GetSpecialShortcutNameWithShiftState(const AScanCode: Word; const AShiftState: System.Classes.TShiftState = []): string;
begin
Result := MyGetSpecialShortcutName(Vcl.Menus.ShortCut(AScanCode, AShiftState));
end;
Usage:
Result := GetSpecialShortcutNameWithShiftState(VK_A, [ssCTRL]);
However, the Result is "A" where the expected Result should be "CTRL+A".
How to get the string representation of a ShortCut Key including the SHIFTSTATE?
The OP wants the key names fully localised, but for completeness I first show that the VCL already has a function to obtain a partly unlocalised string, namely, ShortCutToText in the Menus unit:
ShortCutToText(ShortCut(Ord('A'), [ssShift, ssAlt]))
This returns Shift+Alt+A on all systems.
Now, using the Win32 function GetKeyNameText already mentioned in the Q, it is easy to obtain a fully localised shortcut string:
function GetKeyName(AKey: Integer): string;
var
name: array[0..128] of Char;
begin
FillChar(name, SizeOf(name), 0);
GetKeyNameText(MapVirtualKey(AKey, 0) shl 16, @name[0], Length(name));
Result := name;
end;
function ModifierVirtualKey(AModifier: Integer): Integer;
begin
case AModifier of
Ord(ssShift):
Result := VK_SHIFT;
Ord(ssCtrl):
Result := VK_CONTROL;
Ord(ssAlt):
Result := VK_MENU;
else
Result := 0;
end;
end;
function ShortcutToString(AKey: Integer; AShiftState: TShiftState = []): string;
begin
Result := '';
for var Modifier in AShiftState do
begin
var ModifierKey := ModifierVirtualKey(Ord(Modifier));
if ModifierKey <> 0 then
Result := Result + IfThen(not Result.IsEmpty, '+') + GetKeyName(ModifierKey);
end;
Result := Result + IfThen(not Result.IsEmpty, '+') + GetKeyName(AKey);
end;
(Here I use an IfThen overload from StrUtils.)
Now,
ShortcutToString(Ord('A'), [ssShift, ssAlt])
returns SKIFT+ALT+A on my Swedish system. SKIFT is, as you might already have guessed, the Swedish name for the SHIFT key.
How can I effectively check if a string contains one of a few sub strings?
Suppose I have a string:
`Hi there, <B>my</B> name is Joe <DIV>.</DIV> Hello world. &nbsp;`
How can I check if the string contains either <B> OR <DIV> OR &nbsp;?
I could do a simple:
Result := (Pos('<B>', S) > 0) or
(Pos('<DIV>', S) > 0) or
(Pos('&nbsp;', S) > 0);
But this seems very inefficient, since it makes up to N passes over the string and my strings are quite large.
Slightly better version:
function StringContainsAny(const S: string; const AnyOf: array of string): Boolean;
var
CurrChr, C: PChar;
i, j, Ln: Integer;
begin
for i := 1 to Length(S) do
begin
CurrChr := @S[i];
for j := 0 to High(AnyOf) do
begin
C := @AnyOf[j][1]; // assume that no empty strings
if C^ <> CurrChr^ then
Continue;
Ln := Length(AnyOf[j]);
if (Length(S) + 1 - i) < Ln then // check bounds
Continue;
if CompareMem(C, CurrChr, Ln * SizeOf(C^)) then
Exit(True);
end;
end;
Exit(False);
end;
You can also build a table of stop symbols to improve speed. It's a fairly complex topic, so I can only suggest further reading, for example Bill Smyth's book "Computing Patterns in Strings".
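As a much simpler relative of such tables (not an algorithm taken from the book), a lookup table of the patterns' first characters lets most positions of S be rejected with a single test. This is a sketch; the name StringContainsAnyFast is made up, and the table costs one Boolean per possible Char value.

```pascal
uses
  SysUtils;

// Hypothetical variant: skips every position of S at which no pattern can start.
function StringContainsAnyFast(const S: string; const AnyOf: array of string): Boolean;
var
  IsFirstChar: array[Char] of Boolean;
  i, j, Ln: Integer;
begin
  Result := False;
  // Mark the first character of every non-empty pattern.
  FillChar(IsFirstChar, SizeOf(IsFirstChar), 0);
  for j := 0 to High(AnyOf) do
    if AnyOf[j] <> '' then
      IsFirstChar[AnyOf[j][1]] := True;

  for i := 1 to Length(S) do
  begin
    if not IsFirstChar[S[i]] then
      Continue; // no pattern starts with this character
    for j := 0 to High(AnyOf) do
    begin
      Ln := Length(AnyOf[j]);
      // Bounds check first, then compare the raw characters.
      if (Ln > 0) and (Ln <= Length(S) - i + 1) and
         CompareMem(@S[i], @AnyOf[j][1], Ln * SizeOf(Char)) then
      begin
        Result := True;
        Exit;
      end;
    end;
  end;
end;
```

Empty patterns are ignored here rather than assumed absent. Whether this beats plain Pos calls depends on the data, so it is worth measuring before adopting it.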
Here is my solution, thanks to David Heffernan's comment:
function StringContainsAny(const S: string; const AnyOf: array of string): Boolean;
var
CurrChr, C: PChar;
I, L, H: Integer;
begin
Result := False;
CurrChr := PChar(S);
while CurrChr^ <> #0 do
begin
H := High(AnyOf);
for I := 0 to H do
begin
L := 0;
C := PChar(AnyOf[I]);
while C^ <> #0 do
begin
if C^ = CurrChr^ then
Inc(L)
else
Break;
Inc(C);
Inc(CurrChr);
// No special case is needed at the end of S: its terminating #0 never matches a pattern character
end;
if C^ = #0 then // match
begin
Result := True;
Exit;
end
else
Dec(CurrChr, L);
end;
Inc(CurrChr);
end;
end;
I'm not sure it is perfect.
EDIT:
What can I say? You know what they say about assumptions...
After actually testing, it seems that using Pos():
function StringContainsAny(const S: string; const AnyOf: array of string): Boolean;
var
I: Integer;
begin
for I := 0 to High(AnyOf) do
begin
if Pos(AnyOf[I], S) <> 0 then
begin
Result := True;
Exit;
end;
end;
Result := False;
end;
is faster than both my solution and @Green_Wizard's solution! They did a good job with the Pos function!
I have an app that needs to do heavy text manipulation in a TStringList. Basically I need to split text by a delimiter; for instance, if I have a single line with 1000 chars and this delimiter occurs 3 times in this line, then I need to split it into 3 lines. The delimiter can contain more than one char; it can be a tag like '[test]', for example.
I've written two functions to do this task with 2 different approaches, but both are slow on large amounts of text (usually more than 2 MB).
How can I achieve this goal in a faster way?
Here are both functions; both receive 2 parameters: 'lines', which is the original TStringList, and 'q', which is the delimiter.
function splitlines(lines : tstringlist; q: string) : integer;
var
s, aux, ant : string;
i,j : integer;
flag : boolean;
m2 : tstringlist;
begin
try
m2 := tstringlist.create;
m2.BeginUpdate;
result := 0;
for i := 0 to lines.count-1 do
begin
s := lines[i];
for j := 1 to length(s) do
begin
flag := lowercase(copy(s,j,length(q))) = lowercase(q);
if flag then
begin
inc(result);
m2.add(aux);
aux := s[j];
end
else
aux := aux + s[j];
end;
m2.add(aux);
aux := '';
end;
m2.EndUpdate;
lines.text := m2.text;
finally
m2.free;
end;
end;
function splitLines2(lines : tstringlist; q: string) : integer;
var
aux, p : string;
i : integer;
flag : boolean;
begin
//maux1 and maux2 are already instanced in the parent class
try
maux2.text := lines.text;
p := '';
i := 0;
flag := false;
maux1.BeginUpdate;
maux2.BeginUpdate;
while (pos(lowercase(q),lowercase(maux2.text)) > 0) and (i < 5000) do
begin
flag := true;
aux := p+copy(maux2.text,1,pos(lowercase(q),lowercase(maux2.text))-1);
maux1.add(aux);
maux2.text := copy(maux2.text,pos(lowercase(q),lowercase(maux2.text)),length(maux2.text));
p := copy(maux2.text,1,1);
maux2.text := copy(maux2.text,2,length(maux2.text));
inc(i);
end;
finally
result := i;
maux1.EndUpdate;
maux2.EndUpdate;
if flag then
begin
maux1.add(p+maux2.text);
lines.text := maux1.text;
end;
end;
end;
I've not tested the speed, but for academic purposes, here's an easy way to split the strings:
myStringList.Text :=
StringReplace(myStringList.Text, myDelimiter, #13#10, [rfReplaceAll]);
// Use [rfReplaceAll, rfIgnoreCase] if you want to ignore case
When you set the Text property of TStringList, it parses on new lines and splits there, so converting to a string, replacing the delimiter with new lines, then assigning it back to the Text property works.
The problems with your code (at least the second approach) are:
You are constantly calling LowerCase, which is slow when called so many times.
If I read it correctly, you are copying the whole remaining text back to the original source. This is sure to be extra slow for large strings (e.g. files).
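Both problems can be avoided by lower-casing the text once and walking it with PosEx, copying each piece out of the original exactly one time. Below is a rough sketch along those lines; SplitBeforeDelimiter is a hypothetical name. Like the question's first function, it keeps the delimiter at the start of the following piece, but it counts non-overlapping matches only.

```pascal
uses
  SysUtils, StrUtils, Classes;

// Hypothetical replacement for splitlines: splits Source before every
// case-insensitive occurrence of Delim, appends the pieces to Dest and
// returns the number of occurrences found.
function SplitBeforeDelimiter(const Source, Delim: string; Dest: TStrings): Integer;
var
  LowerSource, LowerDelim: string;
  SegStart, Hit: Integer;
begin
  Result := 0;
  if Delim = '' then
  begin
    Dest.Add(Source);
    Exit;
  end;
  // LowerCase changes ASCII letters only and keeps the length, so
  // positions found in LowerSource are valid in Source as well.
  LowerSource := LowerCase(Source);
  LowerDelim := LowerCase(Delim);
  SegStart := 1;
  Hit := PosEx(LowerDelim, LowerSource, 1);
  while Hit > 0 do
  begin
    Dest.Add(Copy(Source, SegStart, Hit - SegStart));
    SegStart := Hit; // the delimiter stays at the start of the next piece
    Inc(Result);
    Hit := PosEx(LowerDelim, LowerSource, Hit + Length(LowerDelim));
  end;
  Dest.Add(Copy(Source, SegStart, MaxInt));
end;
```

For 'a[test]b[TEST]c' with the delimiter '[test]' this adds 'a', '[test]b' and '[TEST]c' and returns 2. Wrapping the calls in Dest.BeginUpdate and Dest.EndUpdate may help further if Dest belongs to a visual control.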
I have a tokenizer in my library. It's not the fastest or best, but it should do (you can get it from Cromis Library, just use the units Cromis.StringUtils and Cromis.Unicode):
type
TTokens = array of ustring;
TTextTokenizer = class
private
FTokens: TTokens;
FDelimiters: array of ustring;
public
constructor Create;
procedure Tokenize(const Text: ustring);
procedure AddDelimiters(const Delimiters: array of ustring);
property Tokens: TTokens read FTokens;
end;
{ TTextTokenizer }
procedure TTextTokenizer.AddDelimiters(const Delimiters: array of ustring);
var
I: Integer;
begin
if Length(Delimiters) > 0 then
begin
SetLength(FDelimiters, Length(Delimiters));
for I := 0 to Length(Delimiters) - 1 do
FDelimiters[I] := Delimiters[I];
end;
end;
constructor TTextTokenizer.Create;
begin
SetLength(FTokens, 0);
SetLength(FDelimiters, 0);
end;
procedure TTextTokenizer.Tokenize(const Text: ustring);
var
I, K: Integer;
Counter: Integer;
NewToken: ustring;
Position: Integer;
CurrToken: ustring;
begin
SetLength(FTokens, 100);
CurrToken := '';
Counter := 0;
for I := 1 to Length(Text) do
begin
CurrToken := CurrToken + Text[I];
for K := 0 to Length(FDelimiters) - 1 do
begin
Position := Pos(FDelimiters[K], CurrToken);
if Position > 0 then
begin
NewToken := Copy(CurrToken, 1, Position - 1);
if NewToken <> '' then
begin
if Counter >= Length(FTokens) then // grow before writing past the last slot
SetLength(FTokens, Length(FTokens) * 2);
FTokens[Counter] := Trim(NewToken);
Inc(Counter)
end;
CurrToken := '';
end;
end;
end;
if CurrToken <> '' then
begin
if Counter >= Length(FTokens) then // grow before writing past the last slot
SetLength(FTokens, Length(FTokens) * 2);
FTokens[Counter] := Trim(CurrToken);
Inc(Counter)
end;
SetLength(FTokens, Counter);
end;
How about just using StrTokens from the JCL library?
procedure StrTokens(const S: string; const List: TStrings);
It's open source
http://sourceforge.net/projects/jcl/
As an additional option, you can use regular expressions. Recent versions of Delphi (XE4 and XE5) come with built in regular expression support; older versions can find a free regex library download (zip file) at Regular-Expressions.info.
For the built-in regex support (uses the generic TArray<string>):
var
RegexObj: TRegEx;
SplitArray: TArray<string>;
begin
SplitArray := nil;
try
RegexObj := TRegEx.Create('\[test\]'); // Your sample expression. Replace with q
SplitArray := RegexObj.Split(Lines.Text, 0);
except
on E: ERegularExpressionError do begin
// Syntax error in the regular expression
end;
end;
// Use SplitArray
end;
For using TPerlRegEx in earlier Delphi versions:
var
Regex: TPerlRegEx;
m2: TStringList;
begin
m2 := TStringList.Create;
try
Regex := TPerlRegEx.Create;
try
Regex.RegEx := '\[test\]'; // Using your sample expression - replace with q
Regex.Options := [];
Regex.State := [preNotEmpty];
Regex.Subject := Lines.Text;
Regex.SplitCapture(m2, 0);
finally
Regex.Free;
end;
// Work with m2
finally
m2.Free;
end;
end;
(For those unaware, the backslashes in the sample expression are there because the [ and ] characters are meaningful in regular expressions and must be escaped to be matched literally. In ordinary text they are not required.)
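When the delimiter arrives in a variable such as q, escaping it by hand is error-prone. In versions that ship System.RegularExpressions, TRegEx.Escape can do it instead. A small sketch, with SplitAtLiteral as a made-up helper name:

```pascal
uses
  System.RegularExpressions;

// Hypothetical helper: splits Text at every occurrence of the literal
// delimiter Delim, ignoring case.
function SplitAtLiteral(const Text, Delim: string): TArray<string>;
begin
  // TRegEx.Escape backslash-escapes regex metacharacters such as [
  // so that Delim is matched literally.
  Result := TRegEx.Split(Text, TRegEx.Escape(Delim), [roIgnoreCase]);
end;
```

Unlike the functions in the question, this drops the delimiter from the resulting pieces.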
How can I count the number of occurrences of a certain character in a string in Delphi?
For instance, assume that I have the following string and would like to count the number of commas in it:
S := '1,2,3';
Then I would like to obtain 2 as the result.
You can use this simple function:
function OccurrencesOfChar(const S: string; const C: char): integer;
var
i: Integer;
begin
result := 0;
for i := 1 to Length(S) do
if S[i] = C then
inc(result);
end;
Even though an answer has already been accepted, I'm posting the more general function below because I find it so elegant. This solution is for counting the occurrences of a string rather than a character.
{ Returns a count of the number of occurences of SubText in Text }
function CountOccurences( const SubText: string;
const Text: string): Integer;
begin
Result := Pos(SubText, Text);
if Result > 0 then
Result := (Length(Text) - Length(StringReplace(Text, SubText, '', [rfReplaceAll]))) div Length(subtext);
end; { CountOccurences }
And for those who prefer the enumerator loop in modern Delphi versions (not any better than the accepted solution by Andreas, just an alternative solution):
function OccurrencesOfChar(const ContentString: string;
const CharToCount: char): integer;
var
C: Char;
begin
result := 0;
for C in ContentString do
if C = CharToCount then
Inc(result);
end;
This one can do the job if you're not handling large text:
...
uses RegularExpressions;
...
function CountChar(const s: string; const c: char): integer;
begin
Result:= TRegEx.Matches(s, TRegEx.Escape(c)).Count // Escape so that characters like '.' or '*' are counted literally
end;
You can take advantage of the StringReplace function:
function OccurencesOfChar(ContentString:string; CharToCount:char):integer;
begin
Result:= Length(ContentString)-Length(StringReplace(ContentString, CharToCount,'', [rfReplaceAll, rfIgnoreCase]));
end;
Simple solution and good performance (I wrote for Delphi 7, but should work for other versions as well):
function CountOccurences(const ASubString: string; const AString: string): Integer;
var
iOffset: Integer;
iSubStrLen: Integer;
begin
Result := 0;
if (ASubString = '') or (AString = '') then
Exit;
iOffset := 1;
iSubStrLen := Length(ASubString);
while (True) do
begin
iOffset := PosEx(ASubString, AString, iOffset);
if (iOffset = 0) then
Break;
Inc(Result);
Inc(iOffset, iSubStrLen);
end;
end;
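Note that the function above jumps past each match, so it counts non-overlapping occurrences: 'aa' is found twice in 'aaaa'. If overlapping matches should count as well, advancing by one character instead is enough. A sketch, with a made-up function name:

```pascal
uses
  StrUtils;

// Hypothetical variant that also counts overlapping occurrences.
function CountOverlappingOccurrences(const ASubString, AString: string): Integer;
var
  Offset: Integer;
begin
  Result := 0;
  if ASubString = '' then
    Exit;
  Offset := PosEx(ASubString, AString, 1);
  while Offset > 0 do
  begin
    Inc(Result);
    // Advance by one character, not by the length of the match.
    Offset := PosEx(ASubString, AString, Offset + 1);
  end;
end;
```

Here 'aa' is counted three times in 'aaaa'. For single-character substrings both versions give the same result.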
Ummm... Am I missing something? Why not just...
kSepChar:=',';//to count commas
bLen:=length(sLineToCheck);
bCount:=0;//The number of kSepChars seen so far.
bPosn:=1;//First character in string is at position 1
for bPosn:=1 to bLen do begin
if sLineToCheck[bPosn]=kSepChar then inc(bCount);
end;//