I am trying to write a routine that finds a search string only when it does not follow an opening parenthesis. For instance, if the file open in the RichEdit contains these lines of CNC code, I want it to find the first two and ignore the third. In the second line it should find and highlight only the first occurrence of the search string. The search string (mach.TOOL_CHANGE_CALL) in this example is 'T'.
N1T1M6
N1T1M6(1/4-20 TAP .5 DP.)
(1/4-20 TAP .5 DP.)
I have gotten this far, but am stumped.
procedure TMainForm.ToolButton3Click(Sender: TObject); // find tool number
var
row:integer;
sel_str:string;
par:integer;
tool:integer;
tool_flag:integer ;
line_counter:integer;
tool_pos:integer;
line_begin:integer;
RE:TRichEdit;
begin
RE:=(ActiveMDIChild as TMDIChild).RichEdit1;
line_counter:=0;
tool_flag:=0;
tool_pos:=0;
row:=SendMessage(RE.Handle,EM_LINEFROMCHAR,-1, RE.SelStart);
while tool_flag =0 do
begin
RE.Perform(EM_LINESCROLL,0,line_counter);
sel_str := RE.Lines[Line_counter];
tool:=pos(mach.TOOL_CHANGE_CALL,sel_str);
par:=pos('(',sel_str);
if par=0 then
par:=pos('[',sel_str);
tool_pos:=tool_pos+length(sel_str);
if (tool>0) and (par = 0) then
begin
RE.SetFocus;
tool_pos:=tool_pos + line_counter-1;
line_begin:=tool_pos-tool;
RE.SelStart := line_begin;
RE.SelLength := Length(sel_str);
tool_flag:=1;
end;
inc (line_counter);
end;
end;
The result is that it ignores the third line, but it also ignores the second. It also does not find subsequent occurrences of the string in the file; it just starts back at the beginning of the text and finds the first one again. How can I get it to find the second example and then find the next 'T' on the next click of the button? I also need it to highlight the entire line on which the search string was found.
Given the samples you posted, you can use Delphi (XE and higher) regular expressions to match the text you've indicated. Here, I've put the three sample lines you've shown into a TMemo (Memo1 in the code below), evaluate the regular expression, and put the matches found into Memo2 - as long as your TRichEdit contains only plain text, you can use the same code by replacing Memo1 and Memo2 with RichEdit1 and RichEdit2 respectively.
I've updated the code in both snippets to show how to get the exact position (as an offset from the first character) and length of the match result; you can use this to highlight the match in the richedit using SelStart and SelLength.
uses
  RegularExpressions;

procedure TForm1.Button1Click(Sender: TObject);
var
  Regex: TRegEx;
  MatchResult: TMatch;
begin
  Memo1.Lines.Clear;
  Memo1.Lines.Add('N1T1M6');
  Memo1.Lines.Add('N1T1M6(1/4-20 TAP .5 DP.)');
  Memo1.Lines.Add('(1/4-20 TAP .5 DP.)');
  Memo2.Clear;
  // See the text below for an explanation of the regular expression
  Regex := TRegEx.Create('^\w+T\w+', [roMultiLine]);
  MatchResult := Regex.Match(Memo1.Lines.Text);
  while MatchResult.Success do
  begin
    Memo2.Lines.Add(MatchResult.Value +
      ' Index: ' + IntToStr(MatchResult.Index) +
      ' Length: ' + IntToStr(MatchResult.Length));
    MatchResult := MatchResult.NextMatch;
  end;
end;
This produces a match for N1T1M6 on each of the first two lines and no match on the third.
If you're using a version of Delphi that doesn't include regular expression support, you can use the free TPerlRegEx with some minor code changes to produce the same results:
uses
  PerlRegEx;

procedure TForm1.Button1Click(Sender: TObject);
var
  Regex: TPerlRegEx;
begin
  Memo1.Lines.Clear;
  Memo1.Lines.Add('N1T1M6');
  Memo1.Lines.Add('N1T1M6(1/4-20 TAP .5 DP.)');
  Memo1.Lines.Add('(1/4-20 TAP .5 DP.)');
  Memo2.Clear;
  Regex := TPerlRegEx.Create;
  try
    Regex.RegEx := '^\w+T\w+';
    Regex.Options := [preMultiLine];
    Regex.Subject := Memo1.Lines.Text;
    if Regex.Match then
    begin
      repeat
        Memo2.Lines.Add(Regex.MatchedText +
          ' Offset: ' + IntToStr(Regex.MatchedOffset) +
          ' Length: ' + IntToStr(Regex.MatchedLength));
      until not Regex.MatchAgain;
    end;
  finally
    Regex.Free;
  end;
end;
The regular expression above (^\w+T\w+) means:
Options: ^ and $ match at line breaks
Assert position at the beginning of a line (at beginning
of the string or after a line break character) «^»
Match a single character that is a “word character” (letters,
digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible,
giving back as needed (greedy) «+»
Match the character “T” literally «T»
Match a single character that is a “word character” (letters,
digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible,
giving back as needed (greedy) «+»
Created with RegexBuddy
You can find a decent tutorial regarding regular expressions here. The tool I used for working out the regular expression (and actually producing much of the Delphi code for both examples) was RegexBuddy - I'm not affiliated with the company that produces it, but just a user of that product.
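For the "find the next one on the next click and highlight the whole line" part of the question, a line-by-line scan may be simpler than mapping regex offsets onto the RichEdit. The following is only a minimal sketch, not a tested drop-in: HasToolCallBeforeComment and SelectNextToolLine are hypothetical names, it assumes WordWrap is off so that RE.Lines matches the control's own line numbering, and it assumes a non-empty selection is the previous hit. It asks the control itself for the line start via EM_LINEINDEX, so no manual offset arithmetic is needed.

```pascal
uses
  Winapi.Windows, Winapi.Messages, System.SysUtils, Vcl.ComCtrls;

// Hypothetical helper: True when SearchStr occurs in Line before any
// '(' or '[' comment opener (or when the line has no comment at all).
function HasToolCallBeforeComment(const Line, SearchStr: string): Boolean;
var
  ToolPos, ParPos, BrkPos, CommentPos: Integer;
begin
  ToolPos := Pos(SearchStr, Line);
  ParPos := Pos('(', Line);
  BrkPos := Pos('[', Line);
  // Earliest comment opener, or 0 when neither is present
  if (ParPos = 0) or ((BrkPos > 0) and (BrkPos < ParPos)) then
    CommentPos := BrkPos
  else
    CommentPos := ParPos;
  Result := (ToolPos > 0) and ((CommentPos = 0) or (ToolPos < CommentPos));
end;

// Hypothetical helper: selects the whole next matching line and returns True,
// or returns False when no further line matches.
function SelectNextToolLine(RE: TRichEdit; const SearchStr: string): Boolean;
var
  Line: Integer;
begin
  Result := False;
  // Line holding the caret (EM_LINEFROMCHAR takes the character index in wParam)
  Line := RE.Perform(EM_LINEFROMCHAR, RE.SelStart, 0);
  // Assumption: a non-empty selection is the previous hit, so skip past it
  if RE.SelLength > 0 then
    Inc(Line);
  while Line < RE.Lines.Count do
  begin
    if HasToolCallBeforeComment(RE.Lines[Line], SearchStr) then
    begin
      // EM_LINEINDEX returns the character index of the first character of the line
      RE.SelStart := RE.Perform(EM_LINEINDEX, Line, 0);
      RE.SelLength := Length(RE.Lines[Line]);
      RE.Perform(EM_SCROLLCARET, 0, 0);
      RE.SetFocus;
      Exit(True);
    end;
    Inc(Line);
  end;
end;
```

In the button handler from the question this could be called as SelectNextToolLine((ActiveMDIChild as TMDIChild).RichEdit1, mach.TOOL_CHANGE_CALL); each click then moves on to the next matching line.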
I have a TRichEdit with some RTF in it (text with formatting only) and I want to know if all the content of the TRichEdit is selected. To do so, I do:
var AllSelectd: Boolean;
//...
AllSelectd := MyRichEdit.SelLength = MyRichEdit.GetTextLen;
which works fine, except when the content has three lines or more. With zero to two lines, everything is fine. As soon as I reach three lines in my TRichEdit, the code above no longer works (MyRichEdit.SelLength < MyRichEdit.GetTextLen). Each line is terminated with CRLF (#13#10).
Is this a bug? How can I reliably check if everything is selected in the TRichEdit?
I use Delphi 10.4, if it changes anything.
As mentioned in this topic, RichEdit 2.0 replaces CRLF pairs with CR internally and restores the LFs in some cases.
As a workaround, count the lines in the selected range and use that count as a correction: the selection contains only CRs, whereas GetTextLen works on text with the CRLF pairs restored and so counts both CRs and LFs. This uses Remy Lebeau's suggestion.
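var
  sel, getl, crcnt: integer;
begin
  sel := RichEdit1.SelLength;     // counts each line break as one character (CR)
  getl := RichEdit1.GetTextLen;   // counts each line break as two characters (CRLF)
  // zero-based line index of the character at position sel
  // (EM_EXLINEFROMCHAR is declared in the RichEdit unit)
  crcnt := SendMessage(RichEdit1.Handle, EM_EXLINEFROMCHAR, 0, sel);
  Memo1.Lines.Add(Format('%d %d',[sel, getl - crcnt + 1]));
end;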
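An alternative that avoids the line-count correction is to ask the control for its length in the same units that the selection uses. The sketch below is an assumption-laden illustration rather than the method from the answer above: IsAllSelected is a hypothetical name, it assumes a RichEdit 2.0 or later control, and it relies on EM_GETTEXTLENGTHEX with GTL_DEFAULT reporting the character count without expanding CR to CRLF. The comparison uses >= as a precaution, in case the selection also covers the final end-of-paragraph mark.

```pascal
uses
  Winapi.Windows, Winapi.Messages, Winapi.RichEdit, Vcl.ComCtrls;

// Hypothetical helper; assumes a RichEdit 2.0 or later control.
function IsAllSelected(RE: TRichEdit): Boolean;
var
  Gtl: TGetTextLengthEx;
  CharCount: Integer;
begin
  Gtl.flags := GTL_DEFAULT;  // count characters as stored, no CR -> CRLF expansion
  Gtl.codepage := 1200;      // UTF-16 (Unicode) code page
  CharCount := RE.Perform(EM_GETTEXTLENGTHEX, WPARAM(@Gtl), 0);
  // SelLength is measured in the same CR-only units
  Result := RE.SelLength >= CharCount;
end;
```

It would be worth checking the result against a document of three or more lines, since that is where the original comparison broke.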
I have created a TTextLayout object with text containing consecutive 't' characters and with the 'Calibri' font. I then have the following code to return the rectangular region of each character using the RegionForRange function. The result is that the width of the 1st 't' is 0 and the position of the 2nd 't' is the same as the first. Any other characters in the text are correct - even ones after the error, although the letter 'f' also has the same problem and any consecutive combination of 't' and 'f'. Most other fonts don't seem to cause the problem, although 'Gabriola' does.
procedure TForm1.FormCreate(Sender: TObject);
var
Layout : TTextLayout;
LRange : TTextRange;
LRegion : TRegion;
LRects : array of TRectF;
i : Integer;
begin
Layout := TTextLayoutManager.DefaultTextLayout.Create;
Layout.Font.Size := 20;
// Calibri and Gabriola fail but Arial and most other fonts don't
Layout.Font.Family := 'Calibri';// 'Gabriola';
Layout.Text := 'tt'; // ff, ft, tf also fail
LRange.Length := 1;
SetLength(LRects, Length(Layout.Text));
for i := 0 to Length(Layout.Text) - 1 do begin
LRange.Pos := i;
LRegion := Layout.RegionForRange(LRange);
LRects[i] := LRegion[0]; // Bounding rect of this character
end;
end;
Put a break point at the end of the function to see the values of left and right stored in LRects.
Stepping into the RegionForRange function leads to TTextLayoutD2D.DoRegionForRange but from there I can't go any further to see what could be going wrong. Why could this be happening for these particular characters and only for these fonts? Why should the character following the one in the range affect the result? Is it a bug? I could perhaps write some code to detect these sequences and correct the position, but I don't feel that I should need to do that.
Note that I'm using Delphi 10.4. I have not tried more recent updates, so I would appreciate if someone could confirm that this issue occurs and in which version.
The cause is ligatures.
When a ligature is applied, a combination of two or more glyphs is replaced by another, single glyph. The characters in the pair then share one glyph, which is presumably why one of them is reported with zero width.
Ligatures are defined by OpenType fonts. Each font defines its own set of ligatures. Calibri defines a lot (you have found only a few of them).
Delphi Rio. I have built an Excel plug-in with Delphi (also using AddIn Express). I iterate through a column to read cell values. After reading each cell value, I call Trim, but Trim is not removing the trailing space. Code snippet:
acctName := Trim(UpperCase(Acctname));
Before this code runs, AcctName is 'ABC Holdings '. It is the same after the Trim call. It appears that Excel has added some other kind of character there (a new line? a carriage return?). What is the best way to get rid of it? Is there a way I can ask the debugger to show me the HEX value for this variable? I have tried the Inspect and Evaluate windows; they both just show text. Note that I have to be careful about simply deleting non-text characters, as some company names have dashes, commas, apostrophes, etc.
Additional info: based on Andreas's suggestion, I added the following:
ShowMessage(IntToHex(Ord(Acctname[Acctname.Length])));
This comes back with '00A0'. So I am thinking I can just do a simple StringReplace... so I add this BEFORE Andreas code...
acctName := StringReplace(acctName, #13, '', [rfReplaceAll]);
acctName := StringReplace(acctName, #10, '', [rfReplaceAll]);
Yet, it appears that nothing has changed. The ShowMessage still shows '00A0' as the last character. Why isn't the StringReplace removing this?
If you want to know the true identity of the last character of your string, you can display its Unicode codepoint:
ShowMessage(IntToHex(Ord(Acctname[Acctname.Length])));
Or, you can use a utility to investigate the Unicode character on the clipboard, like my own.
Yes, the character in question is U+00A0: NO-BREAK SPACE.
This is like a usual space, but it tells the rendering application not to put a line break at this space. For instance, in Swedish, at least, you want non-breaking spaces in 5 000 kWh.
By default, Trim and TStringHelper.Trim do not remove this kind of whitespace. (They also leave U+2007: FIGURE SPACE and a few other kinds of whitespace.)
The string helper method has an overload which lets you specify the characters to trim. You can use this to include U+00A0:
S.Trim([#$20, #$A0, #$9, #$D, #$A]) // space, nbsp, tab, CR, LF
// (many whitespace characters missing!)
But perhaps an even better solution is to rely on the Unicode characterisation and do
function RealTrimRight(const S: string): string;
var
  i: Integer;
begin
  i := S.Length;
  while (i > 0) and S[i].IsWhiteSpace do
    Dec(i);
  Result := Copy(S, 1, i);
end;
Of course, you can implement similar RealTrimLeft and RealTrim functions.
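As an illustration of the two-sided variant mentioned above, here is a minimal sketch. The name RealTrim is taken from the sentence above, but the body is my own assumption of how it might look; it assumes 1-based string indexing (the desktop compiler default) and relies on the Char.IsWhiteSpace helper from System.Character.

```pascal
uses
  System.SysUtils, System.Character;

// Sketch: trims every character that Unicode classifies as whitespace
// (including U+00A0) from both ends of S.
function RealTrim(const S: string): string;
var
  First, Last: Integer;
begin
  First := 1;
  Last := Length(S);
  // Skip leading whitespace
  while (First <= Last) and S[First].IsWhiteSpace do
    Inc(First);
  // Skip trailing whitespace
  while (Last >= First) and S[Last].IsWhiteSpace do
    Dec(Last);
  Result := Copy(S, First, Last - First + 1);
end;
```

With this in place, the line from the question would become acctName := RealTrim(UpperCase(acctName)); interior spaces, dashes, commas and apostrophes are left untouched.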
And of course there are many ways to see the actual string bytes in the debugger. In addition to writing things like Ord(S[S.Length]) in the Evaluate/Modify window (Ctrl+F7), my personal favourite method is to use the Memory window (Ctrl+Alt+E). When this has focus, you can press Ctrl+G and type S[1] to see the actual bytes:
Here you see the string test string. Since strings are Unicode (UTF-16) since Delphi 2009, each character occupies two bytes. For simple ASCII characters, this means that every second byte is null. The ASCII values for our string are 74 65 73 74 20 73 74 72 69 6E 67. You can also see, on the line above (02A0855C) that our string object has reference count 1 and length B (=11).
As a demo, to show the unicode string:
program q63847533;
{$APPTYPE CONSOLE}
{$R *.res}
uses
System.SysUtils;
type
array100 = array[0..99] of Byte;
parray100 = ^array100;
var
searchResult : TSearchRec;
Name : string;
display : parray100 absolute Name;
dummy : string;
begin
if findfirst('z*.mp3', faAnyFile, searchResult) = 0 then
begin
repeat
writeln('File name = '+searchResult.Name);
name := searchResult.Name;
writeln('File size = '+IntToStr(searchResult.Size));
until FindNext(searchResult) <> 0;
// Must free up resources used by these successful finds
FindClose(searchResult);
end;
readln(dummy);
end.
My directory contains two z*.mp3 files, one with an ANSI name and the other with a Unicode name.
Watching display^ as Hex or as a memory dump will show what you seem to need (the "Is there a way I can ask the debugger to show me the HEX value for this variable" part of your question).
When transitioning from Delphi 2006 to Delphi XE2, one of the things we learned is that RichEdit 2.0 internally replaces CRLF pairs with a single CR character. This has the unfortunate effect of throwing off all character index calculations based on the actual text string on the VCL's side.
The behavior I can see by tracing through the VCL code is as follows:
Sending a WM_GETTEXT message (done in TControl.GetTextBuf) will return a text buffer that contains CRLF pairs.
Sending a WM_GETTEXTLENGTH message (done in TControl.GetTextLen) will return a value as if the text still contains CRLF characters.
In contrast, sending an EM_EXSETSEL message (i.e. setting SelStart) will treat the input value as if the text contains only CR characters.
This causes all sorts of things to fail (such as syntax highlighting) in our application. As you can tell, everything is off by exactly one character for every new line up to that point.
Obviously, since this is inconsistent behavior, we must be missing something or doing something very wrong.
Does anybody else have any experience with the transition from a RichEdit 1.0 to a RichEdit 2.0 control, and how did you solve this issue? Finally, is there any way to force RichEdit 2.0 to use CRLF pairs just like RichEdit 1.0?
We also ran into this very issue.
We do a "mail merge" type of thing where we have templates with merge codes that are parsed and replaced by data from outside sources.
This index mismatch between pos(mystring, RichEdit.Text) and the positioning index into the RichEdit text using RichText.SelStart broke our merge.
I don't have a good answer but I came up with a workaround. It's a bit cumbersome (understatement!) but until a better solution comes along...
The workaround is to use a hidden TMemo, copy the RichEdit text to it, and change the CR/LF pairs to CR only. Then use the TMemo to find the proper position using Pos(string, Memo.Text) and use that as the SelStart position in the TRichEdit.
This really sucks but hopefully this workaround will help others in our situation or maybe spark somebody smarter than me into coming up with a better solution.
I'll show a little sample code...
Since we are replacing text using seltext we need to replace text in BOTH the RichEdit control and the TMemo control to keep the two synchronized.
StartToken and EndToken are the merge code delimiters and are a constant.
function TEditForm.ParseTest: boolean;
var
  TagLength: integer;
  ValueLength: integer;
  ParseStart: integer;
  ParseEnd: integer;
  ParseValue: string;
  TempText: string;
  Memo: TMemo;
begin
  Result := True;//Default
  Memo := TMemo.Create(nil);
  try
    Memo.Parent := self;
    Memo.Visible := False;
    try
      Memo.Lines.Clear;
      Memo.Lines.AddStrings(RichEditor.Lines);
      Memo.Text := stringreplace(Memo.Text,#13#10,#13,[rfReplaceAll]);//strip CR/LF pairs and replace with CR
      while (Pos(StartToken, Memo.Text) > 0) and (Pos(EndToken, Memo.Text) > 0) do begin
        // search the whole text; SelText would only cover the current selection
        ParseStart := Pos(StartToken, Memo.Text);
        ParseEnd := Pos(EndToken, Memo.Text) + Length(EndToken);
        if ParseStart >= ParseEnd then begin//oops, something's wrong - bail out
          Result := true;
          RichEditor.SelStart := 0;
          exit;
        end;
        TagLength := ParseEnd - ParseStart;
        ValueLength := (TagLength - Length(StartToken)) - Length(EndToken);
        ParseValue := Copy(Memo.Text, (ParseStart + Length(StartToken)), ValueLength);
        Memo.selstart := ParseStart - 1; //since the .text is zero based, but pos is 1 based we subtract 1
        Memo.sellength := TagLength;
        RichEditor.selstart := ParseStart - 1; //since the .text is zero based, but pos is 1 based we subtract 1
        RichEditor.sellength := TagLength;
        TempText := GetValue(ParseValue);
        // replace in both controls so their character indexes stay in sync
        Memo.SelText := TempText;
        RichEditor.SelText := TempText;
      end;
    except
      on e: exception do
      begin
        MessageDlg(e.message,mtInformation,[mbOK],0);
        result := false;
      end;
    end;//try..except
  finally
    FreeAndNil(Memo);
  end;
end;
How about subtracting the result of EM_LINEFROMCHAR from the caret position (or from the position returned by EM_GETSEL), whichever you need?
You could even make two EM_LINEFROMCHAR calls, one for the selection start and the other for the desired caret/selection position, if you only want to know how many CR/LF pairs are in the selection.
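The correction described above amounts to subtracting one character for every line break that precedes the position. As a rough sketch of the same arithmetic done on the string side instead of via EM_LINEFROMCHAR (TextPosToSelStart is a hypothetical name, and it assumes the text uses CRLF line breaks while the control counts each break as a single CR):

```pascal
uses
  System.SysUtils, System.Math;

// Hypothetical helper: converts a 1-based position in CRLF text (e.g. a
// result of Pos) into a 0-based SelStart for a RichEdit 2.0+ control.
function TextPosToSelStart(const Text: string; Pos1: Integer): Integer;
var
  i, Pairs: Integer;
begin
  Pairs := 0;
  // Count the CRLF pairs that lie completely before Pos1
  for i := 1 to Min(Pos1 - 2, Length(Text) - 1) do
    if (Text[i] = #13) and (Text[i + 1] = #10) then
      Inc(Pairs);
  // Pos is 1-based and SelStart is 0-based; each pair costs one extra character
  Result := (Pos1 - 1) - Pairs;
end;
```

For example, in 'ab'#13#10'cd' the letter c is at Pos 5, but the control sees it at index 3. A call such as RichEdit1.SelStart := TextPosToSelStart(RichEdit1.Text, Pos(StartToken, RichEdit1.Text)) would avoid the hidden TMemo, at the cost of rescanning the text on each lookup.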
I'm doing some work with code generation, and one of the things I need to do is create a function call where one of the parameters is a function call, like so:
result := Func1(x, y, Func2(a, b, c));
TStringList.CommaText is very useful for generating the parameter lists, but when I traverse the tree to build the outer function call, what I end up with looks like this:
result := Func1(x, y, "Func2(a, b, c)");
It's quoting the third argument because it contains commas, and that produced invalid code. But I can't do something simplistic like StringReplace all double quotes with empty strings, because it's quite possible that a function argument could be a string with double quotes inside. Is there any way to make it just not escape the lines that contain commas?
You could set QuoteChar to be a space, and you'd merely get some extra spaces in the output, which is generally OK since generated code isn't usually expected to look pretty. String literals would be affected, though; they would have extra spaces inserted, changing the value of the string.
Free Pascal's TStrings class uses StrictDelimiter to control whether quoting occurs when reading the DelimitedText property. When it's true, quoting does not occur at all. Perhaps Delphi treats that property the same way.
Build an array of "unlikely characters" : non-keyable like †, ‡ or even non-printable like #129, #141, #143, #144.
Verify you don't have the 1st unlikely anywhere in your StringList.CommaText. Or move to the next unlikely until you get one not used in your StringList.CommaText. (Assert that you find one)
Use this unlikely char as the QuoteChar for your StringList
Get StringList.DelimitedText. You'll get the QuoteChar around the function parameters like: result := Func1(x, y, †Func2(a, b, c)†);
Replace the unlikely QuoteChar (here †) by empty strings...
What about using the Unicode version of AnsiExtractQuotedStr to remove the quotes?
Write your own method to export the contents of your TStringList to a string.
function MyStringListToString(const AStrings: TStrings): string;
var
  i: Integer;
begin
  Result := '';
  if AStrings.Count = 0 then
    Exit;
  Result := AStrings[0];
  for i := 1 to AStrings.Count - 1 do
    Result := Result + ',' + AStrings[i];
end;
Too obvious? :-)
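In Delphi versions that have the string helper and TStrings.ToStringArray, the same hand-rolled join can probably be written in one line. This is a sketch under that assumption; JoinArguments is a hypothetical name, and no quoting or escaping is applied to the items at all.

```pascal
uses
  System.SysUtils, System.Classes;

// Hypothetical helper: joins the items with ', ' and never quotes them.
function JoinArguments(const AStrings: TStrings): string;
begin
  Result := string.Join(', ', AStrings.ToStringArray);
end;
```

For a list holding x, y and Func2(a, b, c), this yields x, y, Func2(a, b, c), so the nested call keeps its commas and gains no quotes.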
Alternatively, what would happen if you set StringList.QuoteChar to #0 and then called StringList.DelimitedText?
We have written a descendant class of TStringList in which we reimplemented the DelimitedText property. You can copy most of the code from the original implementation.
var
LList: TStringList;
s, LOutput: string;
begin
LList := TStringList.Create;
try
LList.Add('x');
LList.Add('y');
LList.Add('Func2(a, b, c)');
for s in LList do
LOutput := LOutput + s + ', ';
SetLength(LOutput, Length(LOutput) - 2);
m1.AddLine('result := Func1(' + LOutput + ')');
finally
LList.Free;
end;
end;
Had the same problem, here's how I fixed it:
s := Trim(StringList.Text)
that's all ;-)