After looking at Delphi extract string between to 2 tags and trying the code given there by Andreas Rejbrand I realized that I needed a version that wouldn't stop after one tag - my goal is to write all the values that occur between two strings in several .xml files to a logfile.
<screen> xyz </screen> blah blah <screen> abc </screen>
-> giving a logfile with
xyz
abc
... and so on.
What I tried was to delete a portion of the text read by the function, so that when the function repeated, it would go to the next instance of the desired string and then write that to the logfile too until there were no matches left - the boolean function would be true and the function could stop - below the slightly modified function as based on the version in the link.
function ExtractText(const Tag, Text: string): string;
var
StartPos1, StartPos2, EndPos: integer;
i: Integer;
mytext : string;
bFinished : bool;
begin
bFinished := false;
mytext := text;
result := '';
while not bFinished do
begin
StartPos1 := Pos('<' + Tag, mytext);
if StartPos1 = 0 then bFinished := true;
EndPos := Pos('</' + Tag + '>', mytext);
StartPos2 := 0;
for i := StartPos1 + length(Tag) + 1 to EndPos do
if mytext[i] = '>' then
begin
StartPos2 := i + 1;
break;
end;
if (StartPos2 > 0) and (EndPos > StartPos2) then
begin
result := result + Copy(mytext, StartPos2, EndPos - StartPos2);
delete (mytext, StartPos1, 1);
end
So I create the form and assign a logfile.
procedure TTagtextextract0r.FormCreate(Sender: TObject);
begin
Edit2.Text:=(TDirectory.GetCurrentDirectory);
AssignFile(LogFile, 'Wordlist.txt');
ReWrite(LogFile);
CloseFile(Logfile);
end;
To then get the files in question, I click a button which then reads them.
procedure TTagtextextract0r.Button3Click(Sender: TObject);
begin
try
sD := TDirectory.GetCurrentDirectory;
Files:= TDirectory.GetFiles(sD, '*.xml');
except
exit
end;
j:=Length(Files);
for k := 0 to j-1 do
begin
Listbox2.Items.Add(Files[k]);
sA:= TFile.ReadAllText(Files[k]);
iL:= Length(sA);
AssignFile(LogFile, 'Wordlist.txt');
Append(LogFile);
WriteLn(LogFile, (ExtractText('screen', sA)));
CloseFile (LogFile);
end;
end;
end.
My problem is that without the boolean loop in the function, the application only writes the one line per file and then stops but with the boolean code the application gets stuck in an infinite loop - but I can't quite see where the loop doesn't end. Is it perhaps that the "WriteLn" command can't then output the result of the function? If it can't, I don't know how to get a new line for every run of the function - what am I doing wrong here?
First you need to get a grip on debugging
Look at this post for a briefing on how to pause and debug a program gone wild.
Also read Setting and modifying breakpoints to learn how to use breakpoints. If you would have stepped through your code, you would soon have seen where you go wrong.
Then to your problem:
In older Delphi versions (up to Delphi XE2) you could use the PosEx() function (as suggested in comments), which would simplify the code in ExtractText() function significantly. From Delphi XE3 the System.Pos() function has been expanded with the same functionality as PosEx(), that is, a third parameter Offset: integer
Since you are on Delphi 10 Seattle you can use interchangeably either System.StrUtils.PosEx() or System.Pos().
System.StrUtils.PosEx
PosEx() returns the index of SubStr in S, beginning the search at
Offset
function PosEx(const SubStr, S: string; Offset: Integer = 1): Integer; inline; overload;
The implementation of ExtractText() could look like this (with PosEx()):
function ExtractText(const tag, text: string): string;
var
startPos, endPos: integer;
begin
result := '';
startPos := 1;
repeat
startPos := PosEx('<'+tag, text, startpos);
if startPos = 0 then exit;
startPos := PosEx('>', text, startPos)+1;
if startPos = 1 then exit;
endPos := PosEx('</'+tag+'>', text, startPos);
if endPos = 0 then exit;
result := result + Copy(text, startPos, endPos - startPos) + sLineBreak;
until false;
end;
I added sLineBreak (in unit System.Types) after each found text, otherwise it should work as you intended it (I believe).
Related
I have an app that needs to do heavy text manipulation in a TStringList. Basically i need to split text by a delimiter ; for instance, if i have a singe line with 1000 chars and this delimiter occurs 3 times in this line, then i need to split it in 3 lines. The delimiter can contain more than one char, it can be a tag like '[test]' for example.
I've wrote two functions to do this task with 2 different approaches, but both are slow in big amounts of text (more then 2mbytes usually).
How can i achieve this goal in a faster way ?
Here are both functions, both receive 2 paramaters : 'lines' which is the original tstringlist and 'q' which is the delimiter.
function splitlines(lines : tstringlist; q: string) : integer;
var
s, aux, ant : string;
i,j : integer;
flag : boolean;
m2 : tstringlist;
begin
try
m2 := tstringlist.create;
m2.BeginUpdate;
result := 0;
for i := 0 to lines.count-1 do
begin
s := lines[i];
for j := 1 to length(s) do
begin
flag := lowercase(copy(s,j,length(q))) = lowercase(q);
if flag then
begin
inc(result);
m2.add(aux);
aux := s[j];
end
else
aux := aux + s[j];
end;
m2.add(aux);
aux := '';
end;
m2.EndUpdate;
lines.text := m2.text;
finally
m2.free;
end;
end;
function splitLines2(lines : tstringlist; q: string) : integer;
var
aux, p : string;
i : integer;
flag : boolean;
begin
//maux1 and maux2 are already instanced in the parent class
try
maux2.text := lines.text;
p := '';
i := 0;
flag := false;
maux1.BeginUpdate;
maux2.BeginUpdate;
while (pos(lowercase(q),lowercase(maux2.text)) > 0) and (i < 5000) do
begin
flag := true;
aux := p+copy(maux2.text,1,pos(lowercase(q),lowercase(maux2.text))-1);
maux1.add(aux);
maux2.text := copy(maux2.text,pos(lowercase(q),lowercase(maux2.text)),length(maux2.text));
p := copy(maux2.text,1,1);
maux2.text := copy(maux2.text,2,length(maux2.text));
inc(i);
end;
finally
result := i;
maux1.EndUpdate;
maux2.EndUpdate;
if flag then
begin
maux1.add(p+maux2.text);
lines.text := maux1.text;
end;
end;
end;
I've not tested the speed, but for academic purposes, here's an easy way to split the strings:
myStringList.Text :=
StringReplace(myStringList.Text, myDelimiter, #13#10, [rfReplaceAll]);
// Use [rfReplaceAll, rfIgnoreCase] if you want to ignore case
When you set the Text property of TStringList, it parses on new lines and splits there, so converting to a string, replacing the delimiter with new lines, then assigning it back to the Text property works.
The problems with your code (at least second approach) are
You are constantly using lowecase which is slow if called so many times
If I saw correctly you are copying the whole remaining text back to the original source. This is sure to be extra slow for large strings (eg files)
I have a tokenizer in my library. Its not the fastest or best but it should do (you can get it from Cromis Library, just use the units Cromis.StringUtils and Cromis.Unicode):
type
TTokens = array of ustring;
TTextTokenizer = class
private
FTokens: TTokens;
FDelimiters: array of ustring;
public
constructor Create;
procedure Tokenize(const Text: ustring);
procedure AddDelimiters(const Delimiters: array of ustring);
property Tokens: TTokens read FTokens;
end;
{ TTextTokenizer }
procedure TTextTokenizer.AddDelimiters(const Delimiters: array of ustring);
var
I: Integer;
begin
if Length(Delimiters) > 0 then
begin
SetLength(FDelimiters, Length(Delimiters));
for I := 0 to Length(Delimiters) - 1 do
FDelimiters[I] := Delimiters[I];
end;
end;
constructor TTextTokenizer.Create;
begin
SetLength(FTokens, 0);
SetLength(FDelimiters, 0);
end;
procedure TTextTokenizer.Tokenize(const Text: ustring);
var
I, K: Integer;
Counter: Integer;
NewToken: ustring;
Position: Integer;
CurrToken: ustring;
begin
SetLength(FTokens, 100);
CurrToken := '';
Counter := 0;
for I := 1 to Length(Text) do
begin
CurrToken := CurrToken + Text[I];
for K := 0 to Length(FDelimiters) - 1 do
begin
Position := Pos(FDelimiters[K], CurrToken);
if Position > 0 then
begin
NewToken := Copy(CurrToken, 1, Position - 1);
if NewToken <> '' then
begin
if Counter > Length(FTokens) then
SetLength(FTokens, Length(FTokens) * 2);
FTokens[Counter] := Trim(NewToken);
Inc(Counter)
end;
CurrToken := '';
end;
end;
end;
if CurrToken <> '' then
begin
if Counter > Length(FTokens) then
SetLength(FTokens, Length(FTokens) * 2);
FTokens[Counter] := Trim(CurrToken);
Inc(Counter)
end;
SetLength(FTokens, Counter);
end;
How about just using StrTokens from the JCL library
procedure StrTokens(const S: string; const List: TStrings);
It's open source
http://sourceforge.net/projects/jcl/
As an additional option, you can use regular expressions. Recent versions of Delphi (XE4 and XE5) come with built in regular expression support; older versions can find a free regex library download (zip file) at Regular-Expressions.info.
For the built-in regex support (uses the generic TArray<string>):
var
RegexObj: TRegEx;
SplitArray: TArray<string>;
begin
SplitArray := nil;
try
RegexObj := TRegEx.Create('\[test\]'); // Your sample expression. Replace with q
SplitArray := RegexObj.Split(Lines, 0);
except
on E: ERegularExpressionError do begin
// Syntax error in the regular expression
end;
end;
// Use SplitArray
end;
For using TPerlRegEx in earlier Delphi versions:
var
Regex: TPerlRegEx;
m2: TStringList;
begin
m2 := TStringList.Create;
try
Regex := TPerlRegEx.Create;
try
Regex.RegEx := '\[test\]'; // Using your sample expression - replace with q
Regex.Options := [];
Regex.State := [preNotEmpty];
Regex.Subject := Lines.Text;
Regex.SplitCapture(m2, 0);
finally
Regex.Free;
end;
// Work with m2
finally
m2.Free;
end;
end;
(For those unaware, the \ in the sample expression used are because the [] characters are meaningful in regular expressions and need to be escaped to be used in the regular expression text. Typically, they're not required in the text.)
I am trying to find all files that have the extenstion .cbr or .cbz
If i set my mask to *.cb?
it finds *.cbproj files. How can i set the mask to only find .cbr and .cbz files?
here is code i am using.
I have two edit boxes EDIT1 is the location to search, EDIT2 is where i put my mask. A listbox to show what it found and a Search button.
edit1 := c:\
edit2 := mask (*.cb?)
space
procedure TFAutoSearch.FileSearch(const PathName, FileName : string; const InDir : boolean);
var Rec : TSearchRec;
Path : string;
begin
Path := IncludeTrailingBackslash(PathName);
if FindFirst(Path + FileName, faAnyFile - faDirectory, Rec) = 0 then
try
repeat
ListBox1.Items.Add(Path + Rec.Name);
until FindNext(Rec) <> 0;
finally
FindClose(Rec);
end;
If not InDir then Exit;
if FindFirst(Path + '*.*', faDirectory, Rec) = 0 then
try
repeat
if ((Rec.Attr and faDirectory) <> 0) and (Rec.Name<>'.') and (Rec.Name<>'..') then
FileSearch(Path + Rec.Name, FileName, True);
until FindNext(Rec) <> 0;
finally
FindClose(Rec);
end;
end; //procedure FileSearch
procedure TFAutoSearch.Button1Click(Sender: TObject);
begin
FileSearch(Edit1.Text, Edit2.Text, CheckBox1.State in [cbChecked]);
end;
end.
The easiest way is to use ExtractFileExt against the current filename and check to see if it matches either of your desired extensions.
Here's a fully-rewritten version of your FileSearch routine which does exactly what you're trying to do (according to your question, anyway):
procedure TFAutoSearch.FileSearch(const ARoot: String);
var
LExt, LRoot: String;
LRec: TSearchRec;
begin
LRoot := IncludeTrailingPathDelimiter(ARoot);
if FindFirst(LRoot + '*.*', faAnyFile, LRec) = 0 then
begin
try
repeat
if (LRec.Attr and faDirectory <> 0) and (LRec.Name <> '.') and (LRec.Name <> '..') then
FileSearch(LRoot + LRec.Name)
else
begin
LExt := UpperCase(ExtractFileExt(LRoot + LRec.Name));
if (LExt = '.CBR') or (LExt = '.CBZ') then
ListBox1.Items.Add(LRoot + LRec.Name);
end;
until (FindNext(LRec) <> 0);
finally
FindClose(LRec);
end;
end;
end;
While the other answer suggesting the use of multiple extensions as a mask *.cbr;*.cbz should (in principal anyway) work, I've noted through bitter experience that the FindFirst and FindNext methods in Delphi tend not to accept multiple extensions in a mask!
The code I've provided should work just fine for your needs, so enjoy!
UPDATED: To allow the use of multiple extensions in a Mask dynamically at runtime (as indicated by the OP's first comment to this answer).
What we're going to do is take a String from your TEdit control (this String is one or more File Extensions as you would expect), "Explode" the String into an Array, and match each file against each Extension in the Array.
Sounds more complicated than it is:
type
TStringArray = Array of String; // String Dynamic Array type...
// Now let's provide a "Mask Container" inside the containing class...
TFAutoSearch = class(TForm)
// Normal stuff in here
private
FMask: TStringArray; // Our "Mask Container"
end;
This code will populate FMask with each individual mask extension separated by a ; such as .CBR;.CBZ.
Note this method will not accept Wildcard characters or any other Regex magic, but you can modify it as you require!
procedure TFAutoSearch.ExplodeMask(const AValue: String);
var
LTempVal: String;
I, LPos: Integer;
begin
LTempVal := AValue;
I := 0;
while Length(LTempVal) > 0 do
begin
Inc(I);
SetLength(FMask, I);
LPos := Pos(';', LTempVal);
if (LPos > 0) then
begin
FMask[I - 1] := UpperCase(Copy(LTempVal, 0, LPos - 1));
LTempVal := Copy(LTempVal, LPos + 1, Length(LTempVal));
end
else
begin
FMask[I - 1] := UpperCase(LTempVal);
LTempVal := EmptyStr;
end;
end;
end;
We now need a function to determine if the nominated file matches any of the defined Extensions:
function TFAutoSearch.MatchMask(const AFileName: String): Boolean;
var
I: Integer;
LExt: String;
begin
Result := False;
LExt := UpperCase(ExtractFileExt(LExt));
for I := Low(FMask) to High(FMask) do
if (LExt = FMask[I]) then
begin
Result := True;
Break;
end;
end;
Now here's the modified FileSearch procedure:
procedure TFAutoSearch.FileSearch(const ARoot: String);
var
LRoot: String;
LRec: TSearchRec;
begin
LRoot := IncludeTrailingPathDelimiter(ARoot);
if FindFirst(LRoot + '*.*', faAnyFile, LRec) = 0 then
begin
try
repeat
if (LRec.Attr and faDirectory <> 0) and (LRec.Name <> '.') and (LRec.Name <> '..') then
FileSearch(LRoot + LRec.Name)
else
begin
if (MatchMask(LRoot + LRec.Name)) then
ListBox1.Items.Add(LRoot + LRec.Name);
end;
until (FindNext(LRec) <> 0);
finally
FindClose(LRec);
end;
end;
end;
Finally, here's how you initiate your search:
procedure TFAutoSearch.btnSearchClick(Sender: TObject);
begin
ExplodeMask(edMask.Text);
FileSearch(edPath.Text);
end;
Where edMask is defined in your question as Edit2 and edPath is defined in your question as Edit1. Just remember that this method doesn't support the use of Wildcard or other Special Chars, so edMask.Text should be something like .CBR;.CBZ
If you use the Regex library for Delphi, you could easily modify this method to support all of the Expression Cases you could ever imagine!
Dorin's suggestion to replace your mask with *.cbr;*.cbz should work. That is, it won't match cbproj anymore. It would, however, still match cbzy or any other extension that starts with cbr or cbz. The reason for this is that FindFirst/FindNext match both the long form and the legacy short forms (8.3) of file names. So the short forms will always have truncated extensions where cbproj is shortened to cbp, and therefore matches cb?.
This is supposed to be avoidable by using FindFirstEx instead, but this requires a small rewrite of your search function and actually didn't work for me. So instead I just double checked all matches with the MatchesMask function.
I have written a Delphi function that loads data from a .dat file into a string list. It then decodes the string list and assigns to a string variable. The contents of the string use the '#' symbol as a separator.
How can I then take the contents of this string and then assign its contents to local variables?
// Function loads data from a dat file and assigns to a String List.
function TfrmMain.LoadFromFile;
var
index, Count : integer;
profileFile, DecodedString : string;
begin
// Open a file and assign to a local variable.
OpenDialog1.Execute;
profileFile := OpenDialog1.FileName;
if profileFile = '' then
exit;
profileList := TStringList.Create;
profileList.LoadFromFile(profileFile);
for index := 0 to profileList.Count - 1 do
begin
Line := '';
Line := profileList[Index];
end;
end;
After its been decoded the var "Line" contains something that looks like this:
example:
Line '23#80#10#2#1#...255#'.
Not all of the values between the separators are the same length and the value of "Line" will vary each time the function LoadFromFile is called (e.g. sometimes a value may have only one number the next two or three etc so I cannot rely on the Copy function for strings or arrays).
I'm trying to figure out a way of looping through the contents of "Line", assigning it to a local variable called "buffer" and then if it encounters a '#' it then assigns the value of buffer to a local variable, re-initialises buffer to ''; and then moves onto the next value in "Line" repeating the process for the next parameter ignoring the '#' each time.
I think I have been scratching around with this problem for too long now and I cannot seem to make any progress and need a break from it. If anyone would care to have a look, I would welcome any suggestions on how this might be achieved.
Many Thanks
KD
You need a second TStringList:
lineLst := TStringList.Create;
try
lineLst.Delimiter := '#';
lineLst.DelimitedText := Line;
...
finally
lineLst.Free;
end;
Depending on your Delphi version you can set lineLst.StrictDelimiter := true in case the line contains spaces.
You can do something like this:
program Project1;
{$APPTYPE CONSOLE}
{$R *.res}
uses
System.SysUtils, StrUtils;
var
S : string;
D : string;
begin
S := '23#80#10#2#1#...255#';
for D in SplitString(S,'#') do //SplitString is in the StrUtils unit
writeln(D);
readln;
end.
You did not tag your Delphi version, so i don't know if it applies or not.
That IS version-specific. Please do!
In order of my personal preference:
1: Download Jedi CodeLib - http://jcl.sf.net. Then use TJclStringList. It has very nice split method. After that you would only have to iterate through.
function Split(const AText, ASeparator: string; AClearBeforeAdd: Boolean = True): IJclStringList;
uses JclStringLists;
...
var s: string; js: IJclStringList.
begin
...
js := TJclStringList.Create().Split(input, '#', True);
for s in js do begin
.....
end;
...
end;
2: Delphi now has somewhat less featured StringSplit routine. http://docwiki.embarcadero.com/Libraries/en/System.StrUtils.SplitString
It has a misfeature that array of string type may be not assignment-compatible to itself. Hello, 1949 Pascal rules...
uses StrUtils;
...
var s: string;
a_s: TStringDynArray;
(* aka array-of-string aka TArray<string>. But you have to remember this term exactly*)
begin
...
a_s := SplitString(input, '#');
for s in a_s do begin
.....
end;
...
end;
3: Use TStringList. The main problem with it is that it was designed that spaces or new lines are built-in separators. In newer Delphi that can be suppressed. Overall the code should be tailored to your exact Delphi version. You can easily Google for something like "Using TStringlist for splitting string" and get a load of examples (like #Uwe's one).
But you may forget to suppress here or there. And you may be on old Delphi,, where that can not be done. And you may mis-apply example for different Delphi version. And... it is just boring :-) Though you can make your own function to generate such pre-tuned stringlists for you and carefully check Delphi version in it :-) But then You would have to carefully free that object after use.
I use a function I've written called Fetch. I think I stole the idea from the Indy library some time ago:
function Fetch(var VString: string; ASeperator: string = ','): string;
var LPos: integer;
begin
LPos := AnsiPos(ASeperator, VString);
if LPos > 0 then
begin
result := Trim(Copy(VString, 1, LPos - 1));
VString := Copy(VString, LPos + 1, MAXINT);
end
else
begin
result := VString;
VString := '';
end;
end;
Then I'd call it like this:
var
value: string;
line: string;
profileFile: string;
profileList: TStringList;
index: integer;
begin
if OpenDialog1.Execute then
begin
profileFile := OpenDialog1.FileName;
if (profileFile = '') or not FileExists(profileFile) then
exit;
profileList := TStringList.Create;
try
profileList.LoadFromFile(profileFile);
for index := 0 to profileList.Count - 1 do
begin
line := profileList[index];
Fetch(line, ''''); //discard "Line '"
value := Fetch(line, '#')
while (value <> '') and (value[1] <> '''') do //bail when we get to the quote at the end
begin
ProcessTheNumber(value); //do whatever you need to do with the number
value := Fetch(line, '#');
end;
end;
finally
profileList.Free;
end;
end;
end;
Note: this was typed into the browser, so I haven't checked it works.
I'm using the code from: http://www.swissdelphicenter.ch/torry/showcode.php?id=1103
to sort my TListView, which works GREAT on everything but numbers with decimals.
So I tried to do this myself, and I created a new Custom Sort called: cssFloat
Created a new function
function CompareFloat(AInt1, AInt2: extended): Integer;
begin
if AInt1 > AInt2 then Result := 1
else
if AInt1 = AInt2 then Result := 0
else
Result := -1;
end;
Added of the case statement telling it what type the column is..
cssFloat : begin
Result := CompareFloat(i2, i1);
end;
And I changed the Column click event to have the right type selected for the column.
case column.Index of
0: LvSortStyle := cssNumeric;
1: LvSortStyle := cssFloat;
2: LvSortStyle := cssAlphaNum;
else LvSortStyle := cssNumeric;
And The ListView Sort type is currently set to stBoth.
It doesn't sort correctly. And Ideas on how to fix this?
Thank you
-Brad
I fixed it... after 3 hours of struggling with this.. not understanding why.. I finally saw the light.. CompareFloat was asking if two integers were greater or less than each other.
cssFloat : begin
r1 := IsValidFloat(s1, e1);
r2 := IsValidFloat(s2, e2);
Result := ord(r1 or r2);
if Result <> 0 then
Result := CompareFloat(e2, e1);
end;
(Copied and modified from EFG's Delphi site)
FUNCTION isValidFloat(CONST s: STRING; var e:extended): BOOLEAN;
BEGIN
RESULT := TRUE;
TRY
e:= StrToFloat(s)
EXCEPT
ON EConvertError DO begin e:=0; RESULT := FALSE; end;
END
END {isValidFloat};
While I don't know what is the problem which you faced perhaps is useful for you...
function CompareFloat(AStr1, AStr2: string): Integer;
const
_MAGIC = -1; //or ANY number IMPOSSIBLE to reach
var
nInt1, nInt2: extended;
begin
nInt1:=StrToFloatDef(AStr1, _MAGIC);
nInt2:=StrToFloatDef(AStr2, _MAGIC);
if nInt1 > nInt2 then Result := 1
else
if nInt1 = nInt2 then Result := 0
else
Result := -1;
end;
..and another snippet (perhaps much better):
function CompareFloat(aInt1, aInt2: extended): integer;
begin
Result:=CompareValue(aInt1, aInt2); // :-) (see the Math unit) - also you can add a tolerance here (see the 'Epsilon' parameter)
end;
Besides the rounding which can cause you problems you can see what the format settings are in conversion between string and numbers (you know, the Decimal Point, Thousands Separator aso.) - see TFormatSettings structure in StringToFloat functions. (There are two - overloaded).
HTH,
I have an incoming soap message wich form is TStream (Delphi7), server that send this soap is in development mode and adds a html header to the message for debugging purposes. Now i need to cut out the html header part from it before i can pass it to soap converter. It starts from the beginning with 'pre' tag and ends with '/pre' tag. Im thinking it should be fairly easy to but i havent done it before in Delphi7, so can someone help me?
Another solution, more in line with Lars' suggestion and somehow more worked out.
It's faster, especially when the size of the Stream is above 100, and even more so on really big ones. It avoids copying to an intermediate string.
FilterBeginStream is simpler and follows the "specs" in removing everything up until the end of the header.
FilterMiddleStream does the same as DepreStream, leaving what's before and after the header.
Warning: this code is for Delphi up to D2007, not D2009.
// returns position of a string token (its 1st char) into a Stream. 0 if not found
function StreamPos(Token: string; AStream: TStream): Int64;
var
TokenLength: Integer;
StringToMatch: string;
begin
Result := 0;
TokenLength := Length(Token);
if TokenLength > 0 then
begin
SetLength(StringToMatch, TokenLength);
while AStream.Read(StringToMatch[1], 1) > 0 do
begin
if (StringToMatch[1] = Token[1]) and
((TokenLength = 1) or
((AStream.Read(StringToMatch[2], Length(Token)-1) = Length(Token)-1) and
(Token = StringToMatch))) then
begin
Result := AStream.Seek(0, soCurrent) - (Length(Token) - 1); // i.e. AStream.Position - (Length(Token) - 1);
Break;
end;
end;
end;
end;
// Returns portion of a stream after the end of a tag delimited header. Works for 1st header.
// Everything preceding the header is removed too. Returns same stream if no valid header detected.
// Result is True if valid header found and stream has been filtered.
function FilterBeginStream(const AStartTag, AEndTag: string; const AStreamIn, AStreamOut: TStream): Boolean;
begin
AStreamIn.Seek(0, soBeginning); // i.e. AStreamIn.Position := 0;
Result := (StreamPos(AStartTag, TStream(AStreamIn)) > 0) and (StreamPos(AEndTag, AStreamIn) > 0);
if Result then
AStreamOut.CopyFrom(AStreamIn, AStreamIn.Size - AStreamIn.Position)
else
AStreamOut.CopyFrom(AStreamIn, 0);
end;
// Returns a stream after removal of a tag delimited portion. Works for 1st encountered tag.
// Returns same stream if no valid tag detected.
// Result is True if valid tag found and stream has been filtered.
function FilterMiddleStream(const AStartTag, AEndTag: string; const AStreamIn, AStreamOut: TStream): Boolean;
var
StartPos, EndPos: Int64;
begin
Result := False;
AStreamIn.Seek(0, soBeginning); // i.e. AStreamIn.Position := 0;
StartPos := StreamPos(AStartTag, TStream(AStreamIn));
if StartPos > 0 then
begin
EndPos := StreamPos(AEndTag, AStreamIn);
Result := EndPos > 0;
end;
if Result then
begin
if StartPos > 1 then
begin
AStreamIn.Seek(0, soBeginning); // i.e. AStreamIn.Position := 0;
AStreamOut.CopyFrom(AStreamIn, StartPos - 1);
AStreamIn.Seek(EndPos - StartPos + Length(AEndTag), soCurrent);
end;
AStreamOut.CopyFrom(AStreamIn, AStreamIn.Size - AStreamIn.Position);
end
else
AStreamOut.CopyFrom(AStreamIn, 0);
end;
I think the following code would do what you want, assuming you only have one <pre> block in your document.
function DepreStream(Stm : tStream):tStream;
var
sTemp : String;
oStrStm : tStringStream;
i : integer;
begin
oStrStm := tStringStream.create('');
try
Stm.Seek(0,soFromBeginning);
oStrStm.copyfrom(Stm,Stm.Size);
sTemp := oStrStm.DataString;
if (Pos('<pre>',sTemp) > 0) and (Pos('</pre>',sTemp) > 0) then
begin
delete(sTemp,Pos('<pre>',sTemp),(Pos('</pre>',sTemp)-Pos('<pre>',sTemp))+6);
oStrStm.free;
oStrStm := tStringStream.Create(sTemp);
end;
Result := tMemoryStream.create;
oStrStm.Seek(0,soFromBeginning);
Result.CopyFrom(oStrStm,oStrStm.Size);
Result.Seek(0,soFromBeginning);
finally
oStrStm.free;
end;
end;
Another option I believe would be to use an xml transform to remove the unwanted tags, but I don't do much in the way of transforms so if anyone else wants that torch...
EDIT: Corrected code so that it works. Teaches me for coding directly into SO rather than into the IDE first.
Make a new TStream (use TMemoryStream) and move any stuff you want to keep over from one stream to the other with TStream.CopyFrom or the TStream.ReadBuffer/WriteBuffer methods.
An XPath expression of "//pre[1][1]" will haul out the first node of the first <pre> tag in the XML message: from your description, that should contain the SOAP message you want.
It's been many years since I last used it, but I think Dieter Koehler's OpenXML library supports XPath.