Delphi - Loop through the String - delphi

I'm trying to find out if a string is of a "mnemonic type"...
My mnemonic type consists of the letters 'a' to 'z' and 'A' to 'Z', the digits '0' to '9', and additionally '_'.
I wrote the code below. It should return True if the given string matches my mnemonic pattern, otherwise False:
TRes := True;
for I := 0 to (AString.Length - 1) do
begin
if not ((('0' <= AString[I]) and (AString[I] <= '9'))
or (('a' <= AString[I]) and (AString[I] <= 'z'))
or (('A' <= AString[I]) and (AString[I] <= 'Z'))
or (AString[I] = '_')) then
TRes := False;
end;
This code always results in False.

I'm assuming that, since you tagged the question XE5 and used zero-based indexing, your strings are zero-based. But perhaps that assumption was mistaken.
Your logic is fine, although it is rather hard to read. The code in the question is already doing what you intend. At least the if statement does indeed perform the test that you intend.
Let's just rewrite your code to make it easier to understand. I'm going to lay it out differently, and use a local loop variable to represent each character:
for C in AString do
begin
if not (
(('0' <= C) and (C <= '9')) // C is in range 0..9
or (('a' <= C) and (C <= 'z')) // C is in range a..z
or (('A' <= C) and (C <= 'Z')) // C is in range A..Z
or (C = '_') // C is _
) then
TRes := False;
end;
When written like that I'm sure that you will agree that it performs the test that you intend.
To make the code easier to understand however, I would write an IsValidIdentifierChar function:
function IsValidIdentifierChar(C: Char): Boolean;
begin
Result := ((C >= '0') and (C <= '9'))
or ((C >= 'A') and (C <= 'Z'))
or ((C >= 'a') and (C <= 'z'))
or (C = '_');
end;
As @TLama says, you can write IsValidIdentifierChar more concisely using CharInSet:
function IsValidIdentifierChar(C: Char): Boolean;
begin
Result := CharInSet(C, ['0'..'9', 'a'..'z', 'A'..'Z', '_']);
end;
Then you can build your loop on top of this function:
TRes := True;
for C in AString do
if not IsValidIdentifierChar(C) then
begin
TRes := False;
break;
end;

The string type is 1-based; dynamic arrays are 0-based. It is better to use for ... in, so that you are safe in future Delphi versions.
Testing against ranges of possible character values can be done more efficiently (and more concisely) with CharInSet.
function IsMnemonic( AString: string ): Boolean;
var
Ch: Char;
begin
for Ch in AString do
if not CharInSet( Ch, [ '_', '0'..'9', 'A'..'Z', 'a'..'z' ] ) then
Exit( False );
Result := True;
end;
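If you do need the index inside the loop, a minimal sketch like the following avoids hard-coding the index base by looping from Low to High, which Delphi accepts for strings (from XE3 onwards, as far as I know). The function name IsMnemonicIndexed is made up for this example.

```delphi
uses
  System.SysUtils; // CharInSet

// Hypothetical helper: same test as IsMnemonic, but with an explicit index.
// Low/High adapt to the string's index base, so no 0 or 1 is hard-coded.
function IsMnemonicIndexed(const AString: string): Boolean;
var
  I: Integer;
begin
  for I := Low(AString) to High(AString) do
    if not CharInSet(AString[I], ['_', '0'..'9', 'A'..'Z', 'a'..'z']) then
      Exit(False);
  // Reached for strings with only valid characters (and for the empty string).
  Result := True;
end;
```

Like the for ... in version, this returns True for an empty string, so add a length check if that should count as invalid.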

Related

Constant array from set

I have the following code for creating superscript versions of the digits '0' to '9' and the signs '+' and '-'
const
Digits = ['0' .. '9'];
Signs = ['+', '-'];
DigitsAndSigns = Digits + Signs;
function SuperScript(c: Char): Char;
{ Returns the superscript version of the character c
Only for the numbers 0..9 and the signs +, - }
const
SuperDigits: array ['0' .. '9'] of Char = ('⁰', '¹', '²', '³', '⁴', '⁵', '⁶', '⁷', '⁸', '⁹');
begin
if CharInSet(c, Digits) then
Result := SuperDigits[c]
else if c = '+' then
Result := '⁺'
else if c = '-' then
Result := '⁻'
else
Result := c;
end;
This works, but is not very elegant. Ideally I would like to have something like
SuperDigits: array [DigitsAndSigns] of Char = ('⁰', '¹', '²', '³', '⁴', '⁵', '⁶', '⁷', '⁸', '⁹', '⁺', '⁻');
but this does not even compile.
Is it somehow possible to create and preset an array element for every element in the set?
I am aware that I could use more heavy components like TDictionary, but (if possible) I would like to use sets or enumerations.
Actually there is a solution to achieve what you want, but perhaps not what you expected:
type
SuperDigit = record
private
class function GetItem(const C: Char): Char; static;
public
class property Item[const C: Char]: Char read GetItem; default;
end;
class function SuperDigit.GetItem(const C: Char): Char;
const
cDigitsAndSigns = '0123456789+-';
cSuperScripts = '⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻';
begin
Result := C;
var idx := Pos(C, cDigitsAndSigns);
if idx > 0 then // Pos returns 0 when C is not found
Result := cSuperScripts[idx];
end;
With this declaration you can write something like this:
procedure ToSuperScript(var S: string);
begin
for var I := 1 to Length(S) do
S[I] := SuperDigit[S[I]];
end;
Is it somehow possible to create and preset an array element for every element in the set?
No.
This is fundamentally impossible because the set is an unordered container.
In your case, Digits + Signs is exactly the same thing as Signs + Digits, so how could you possibly know in what order to enumerate the elements?
Also, it might be worth pointing out that the brackets in
const
Digits = ['0' .. '9'];
are not of the same kind as the brackets in
array ['0' .. '9'] of Char
The brackets in Digits really do make a set, but the static array syntax has nothing to do with sets. A static array is indexed by an ordinal type.
In theory, you could create an enumerated type with your characters, but then you need to convert an input character to your enumerated type, and then back to the mapped character. So this is not convenient.
In your particular case, you have a mapping Char → Char. The underlying Unicode code points aren't really nice enough to facilitate any clever tricks (like you can do with ASCII lower case -> upper case, for example). In fact, the superscript digits are not even contiguous! So you have no choice but to do a plain, data-based mapping of some sort.
I'd just use a case construct like in UnicodeSuperscript here:
function UnicodeSuperscript(const C: Char): Char;
begin
case C of
'0':
Result := '⁰';
'1':
Result := '¹';
'2':
Result := '²';
'3':
Result := '³';
'4':
Result := '⁴';
'5':
Result := '⁵';
'6':
Result := '⁶';
'7':
Result := '⁷';
'8':
Result := '⁸';
'9':
Result := '⁹';
'+':
Result := '⁺';
'-', '−':
Result := '⁻';
else
Result := C;
end;
end;
In terms of elegance, I guess you may want to separate data from logic. One (overkill and slower!) approach would be to store a constant array like in
function UnicodeSuperscript(const C: Char): Char;
const
Chars: array[0..12] of
record
B,
S: Char
end
=
(
(B: '0'; S: '⁰'),
(B: '1'; S: '¹'),
(B: '2'; S: '²'),
(B: '3'; S: '³'),
(B: '4'; S: '⁴'),
(B: '5'; S: '⁵'),
(B: '6'; S: '⁶'),
(B: '7'; S: '⁷'),
(B: '8'; S: '⁸'),
(B: '9'; S: '⁹'),
(B: '+'; S: '⁺'),
(B: '-'; S: '⁻'),
(B: '−'; S: '⁻')
);
begin
for var X in Chars do
if C = X.B then
Exit(X.S);
Result := C;
end;

In Delphi Alexandria RTL, is ScanChar() badly written?

In the Delphi Alexandria RTL, they have this function:
function ScanChar(const S: string; var Pos: Integer; Ch: Char): Boolean;
var
C: Char;
begin
if (Ch = ' ') and ScanBlanks(S, Pos) then
Exit(True);
Result := False;
if Pos <= High(S) then
begin
C := S[Pos];
if C = Ch then
Result := True
else if (Ch >= 'a') and (Ch <= 'z') and (C >= 'a') and (C <= 'z') then
Result := Char(Word(C) xor $0020) = Char(Word(Ch) xor $0020)
else if Ch.IsLetter and C.IsLetter then
Result := ToUpper(C) = ToUpper(Ch);
if Result then
Inc(Pos);
end;
end;
I can't understand the purpose of this comparison:
else if (Ch >= 'a') and (Ch <= 'z') and (C >= 'a') and (C <= 'z') then
Result := Char(Word(C) xor $0020) = Char(Word(Ch) xor $0020)
It looks like it's the same as doing this:
else if (Ch >= 'a') and (Ch <= 'z') and (C >= 'a') and (C <= 'z') then
Result := c = Ch
Is this true?
else if (Ch >= 'a') and (Ch <= 'z') and (C >= 'a') and (C <= 'z') then
Result := Char(Word(C) xor $0020) = Char(Word(Ch) xor $0020)
The purpose of this comparison is optimization: it makes the comparison faster when both characters are plain ASCII letters by avoiding an expensive WinAPI call via the ToUpper function, which can handle Unicode characters.
Or at least that is what would happen if the comparison itself were not badly broken.
The comparison checks whether both characters are lowercase and fall into the range between lowercase a (ASCII value 97) and lowercase z (ASCII value 122). What it should actually check is that both characters fall into the range between uppercase A (ASCII value 65) and lowercase z, covering the whole range of ASCII letters regardless of case. (There are a few non-letter characters in that range, but they are not relevant, as the Result assignment would never yield True for any of them.)
Once that is fixed, we also need to fix the Result assignment expression, as it would not properly compare lowercase and uppercase letters. To do that, we can simply apply the or operator with $0020 to both characters, which turns uppercase characters into lowercase and leaves lowercase as-is. As previously mentioned, at this point in the code the non-letter characters in that range can be safely ignored.
Correct code for that part of the ScanChar function would be:
...
else
if (Ch >= 'A') and (Ch <= 'z') and (C >= 'A') and (C <= 'z') then
Result := Word(Ch) or $0020 = Word(C) or $0020
else
...
Note: Even though the original ScanChar function contains incorrect code, the result of the function will still be correct, because for the same letter in different case the code will always go through the ToUpper branch of the if statement.
It is not exactly the same as C = Ch, but the result is the same, I suppose.
The comparison is redundant, IMHO. It is using XOR to convert lowercase ASCII letters into uppercase ASCII letters (as they differ by only 1 bit), and then comparing the uppercase letters for equality. But the following comparison using IsLetter+ToUpper does the same thing, just for any letters, not just ASCII letters.
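To make the bit trick concrete, here is a small standalone sketch of a case-insensitive comparison for ASCII letters. SameAsciiLetter is a made-up name and is not part of the RTL; it only illustrates that upper- and lowercase ASCII letters differ in bit $20.

```delphi
uses
  System.SysUtils; // CharInSet

// Hypothetical helper: True if A and B are the same ASCII letter,
// ignoring case. Setting bit $20 maps 'A'..'Z' onto 'a'..'z'.
function SameAsciiLetter(A, B: Char): Boolean;
begin
  Result := CharInSet(A, ['A'..'Z', 'a'..'z'])
    and CharInSet(B, ['A'..'Z', 'a'..'z'])
    and ((Ord(A) or $20) = (Ord(B) or $20));
end;
```

Restricting both arguments to letters first means characters such as '[' and '{', which also differ only in bit $20, are never reported as equal.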

Compare multiple values at a time

I need to check if N values are equal.
var
A, B, C, D : Integer;
begin
...
if(A = B) and (B = C) and (C = D) then
ShowMessage('Same value');
end;
Is there a shorter way to compare N values?
I mean something like:
var
A, B, C, D : Integer;
begin
...
if SameValue([A, B, C, D]) then
ShowMessage('Same value');
end;
Well, the best you can achieve is basically your own suggestion.
You would implement this using an open array parameter:
function AllEqual(const AValues: array of Integer): Boolean;
var
i: Integer;
begin
for i := 1 to High(AValues) do
if AValues[i] <> AValues[0] then
Exit(False);
Result := True;
end;
The correctness of this implementation is obvious:
If the number of values in the array is 0 or 1, it returns True.
Otherwise, and in general, it returns False iff the array contains two non-equal values.
AValues[0] is only accessed if High(AValues) >= 1, in which case the 0th value exists.
A function like this one is straightforward to implement for ordinal types. For real types (floating-point values), it becomes much more subtle, at least if you want to compare the elements with epsilons (like the SameValue function does in the Delphi RTL). Indeed, then you get different behaviour depending on if you compare every element against the first element, or if you compare every element against its predecessor.
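A rough sketch of that difference, assuming you compare with SameValue from the Math unit and an explicit epsilon. The two function names are invented for this illustration.

```delphi
uses
  System.Math; // SameValue

// Hypothetical: every element must be within Epsilon of the FIRST element.
function AllNearFirst(const AValues: array of Double; Epsilon: Double): Boolean;
var
  I: Integer;
begin
  for I := 1 to High(AValues) do
    if not SameValue(AValues[I], AValues[0], Epsilon) then
      Exit(False);
  Result := True;
end;

// Hypothetical: every element must be within Epsilon of its PREDECESSOR.
// Small steps can accumulate, so this accepts slowly drifting sequences.
function AllNearPredecessor(const AValues: array of Double; Epsilon: Double): Boolean;
var
  I: Integer;
begin
  for I := 1 to High(AValues) do
    if not SameValue(AValues[I], AValues[I - 1], Epsilon) then
      Exit(False);
  Result := True;
end;
```

For the values [0.0, 0.75, 1.5] with an epsilon of 1.0, AllNearFirst returns False (1.5 is too far from 0.0) while AllNearPredecessor returns True (each step is only 0.75), so you have to decide which notion of "equal" you actually want.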
Andreas' answer is correct, I'd like to add a different approach though:
uses Math;
function AllEqual(const AValues: array of Integer): Boolean;
begin
Result := (MinIntValue(AValues) = MaxIntValue(AValues));
end;
function AllEqualF(const AValues: array of Double; Epsilon: Double): Boolean;
begin
Result := ((MaxValue(AValues)- MinValue(AValues)) <= Epsilon);
end;
There is a quite simple and very fast way to compare integers for equality without any additional function: bitwise operators.
Of course, this could also be wrapped in a function taking an open array.
There are at least two options. In the second one you can also replace "or" with "+", OR (not both, that would break the equality test) replace "xor" with "-" (the last case). Note that the "+" variant can report a false match when the xor results have different signs and cancel out, for example with the values 0, -1, -2, -2.
BUT the resulting condition is not shorter than the original (only the last case has the same length), and all parentheses are required except around the first xor/-. Here is the test code:
program Project1;{$APPTYPE CONSOLE}
uses Math; var a, b, c, d, x : Integer; s: string;
begin
Randomize;
repeat
x := Random(10) - 5;
a := x + Sign(Random() - 0.5);
b := x + Sign(Random() - 0.5);
c := x + Sign(Random() - 0.5);
d := x + Sign(Random() - 0.5);
Writeln(a, ' ', b, ' ', c, ' ', d);
Writeln((A = B) and (B = C) and (C = D));
Writeln(a or b or c or d = a and b and c and d);
Writeln(a xor b or (b xor c) or (c xor d) = 0);
Writeln(a - b or (b - c) or (c - d) = 0);
Readln(s);
until s <> '';
end.

avoid complete check of an if statement

The code sample below should evaluate a string.
function EvaluateString(const S: Ansistring): Ansistring;
var
i, L: Integer;
begin
L := Length(S);
i:=1;
if (L > 0) and (S[i] > ' ') and (S[L] > ' ') then
.....
end;
but if L=0 then (S[i] > ' ') will create an Access violation.
Can I avoid this problem while keeping the if condition?
You need to either put a {$B-} directive at the top of your code, or enable Boolean short-circuit evaluation in the project settings.
Since {$B-} is the default, you have probably enabled complete Boolean evaluation at some point, or there is a {$B+} directive somewhere that turns short-circuit evaluation off.
In the short circuit evaluation mode {$B-}, Delphi creates code that is (roughly) equivalent to this:
if (L > 0) then begin
if (S[i] > ' ') then begin
if (S[L] > ' ') then begin
.....
end;
end;
end;
In contrast, with full boolean evaluation mode {$B+}, the equivalent could be something like this:
var a,b,c : Boolean;
a := (L > 0);
b := (S[i] > ' '); // always executed
c := (S[L] > ' '); // always executed
if a and b and c then .....
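As a minimal sketch, assuming you want the guard to hold regardless of the project settings, you can put the directive directly in front of the routine. HasNoOuterBlanks is a made-up name for the condition from the question.

```delphi
{$B-} // short-circuit Boolean evaluation for the code that follows

// Hypothetical helper: True if S is non-empty and neither its first
// nor its last character is a blank or control character.
function HasNoOuterBlanks(const S: AnsiString): Boolean;
var
  L: Integer;
begin
  L := Length(S);
  // With {$B-}, S[1] and S[L] are only read when L > 0.
  Result := (L > 0) and (S[1] > ' ') and (S[L] > ' ');
end;
```

Because $B is a local directive, it stays in effect for the rest of the unit or until the next {$B+}.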

Standard URL encode function?

Is there a Delphi equivalent of this .net's method:
Url.UrlEncode()
Note
I haven't worked with Delphi for several years now.
As I read through the answers I notice that there are several remarks and alternatives to the currently marked answer. I haven't had the opportunity to test them so I'm basing my answer on the most upvoted.
For your own sake, do check later answers and after deciding upvote the best answer so everybody can benefit from your experience.
Look at the Indy IdURI unit; the TIdURI class has two static methods for encoding and decoding a URL.
uses
IdURI;
..
begin
S := TIdURI.URLEncode(str);
//
S := TIdURI.URLDecode(str);
end;
Another simple way of doing this is to use the HTTPEncode function in the HTTPApp unit - very roughly
Uses
HTTPApp;
function URLEncode(const s : string) : string;
begin
result := HTTPEncode(s);
end;
HTTPEncode is deprecated in Delphi 10.3 - 'Use TNetEncoding.URL.Encode'
Uses
NetEncoding;
function URLEncode(const s : string) : string;
begin
result := TNetEncoding.URL.Encode(s);
end;
I wrote this function for myself to encode everything except the really safe characters. I had problems with + in particular. Be aware that you cannot encode a whole URL with this function; you need to encode only the parts that should have no special meaning, typically the values of the variables.
function MyEncodeUrl(source:string):string;
var i:integer;
begin
result := '';
for i := 1 to length(source) do
if not (source[i] in ['A'..'Z','a'..'z','0','1'..'9','-','_','~','.']) then result := result + '%'+inttohex(ord(source[i]),2) else result := result + source[i];
end;
Another option, is to use the Synapse library which has a simple URL encoding method (as well as many others) in the SynaCode unit.
uses
SynaCode;
..
begin
s := EncodeUrl( str );
//
s := DecodeUrl( str );
end;
Since Delphi XE7 you can use TNetEncoding.URL.Encode()
Update 2018: the code shown below seems to be outdated. see Remy's comment.
class function TIdURI.ParamsEncode(const ASrc: string): string;
var
i: Integer;
const
UnsafeChars = '*#%<> []'; {do not localize}
begin
Result := ''; {Do not Localize}
for i := 1 to Length(ASrc) do
begin
if CharIsInSet(ASrc, i, UnsafeChars) or (not CharIsInSet(ASrc, i, CharRange(#33,#128))) then begin {do not localize}
Result := Result + '%' + IntToHex(Ord(ASrc[i]), 2); {do not localize}
end else begin
Result := Result + ASrc[i];
end;
end;
end;
From Indy.
Anyway Indy is not working properly so YOU NEED TO SEE THIS ARTICLE:
http://marc.durdin.net/2012/07/indy-tiduri-pathencode-urlencode-and-paramsencode-and-more/
In a non-dotnet environment, the Wininet unit provides access to Windows' WinINet encode function:
InternetCanonicalizeUrl
In recent versions of Delphi (tested with XE5), use the URIEncode function in the REST.Utils unit.
I was also facing the same issue (Delphi 4).
I resolved the issue using below mentioned function:
function fnstUrlEncodeUTF8(stInput : widestring) : string;
const
hex : array[0..255] of string = (
'%00', '%01', '%02', '%03', '%04', '%05', '%06', '%07',
'%08', '%09', '%0a', '%0b', '%0c', '%0d', '%0e', '%0f',
'%10', '%11', '%12', '%13', '%14', '%15', '%16', '%17',
'%18', '%19', '%1a', '%1b', '%1c', '%1d', '%1e', '%1f',
'%20', '%21', '%22', '%23', '%24', '%25', '%26', '%27',
'%28', '%29', '%2a', '%2b', '%2c', '%2d', '%2e', '%2f',
'%30', '%31', '%32', '%33', '%34', '%35', '%36', '%37',
'%38', '%39', '%3a', '%3b', '%3c', '%3d', '%3e', '%3f',
'%40', '%41', '%42', '%43', '%44', '%45', '%46', '%47',
'%48', '%49', '%4a', '%4b', '%4c', '%4d', '%4e', '%4f',
'%50', '%51', '%52', '%53', '%54', '%55', '%56', '%57',
'%58', '%59', '%5a', '%5b', '%5c', '%5d', '%5e', '%5f',
'%60', '%61', '%62', '%63', '%64', '%65', '%66', '%67',
'%68', '%69', '%6a', '%6b', '%6c', '%6d', '%6e', '%6f',
'%70', '%71', '%72', '%73', '%74', '%75', '%76', '%77',
'%78', '%79', '%7a', '%7b', '%7c', '%7d', '%7e', '%7f',
'%80', '%81', '%82', '%83', '%84', '%85', '%86', '%87',
'%88', '%89', '%8a', '%8b', '%8c', '%8d', '%8e', '%8f',
'%90', '%91', '%92', '%93', '%94', '%95', '%96', '%97',
'%98', '%99', '%9a', '%9b', '%9c', '%9d', '%9e', '%9f',
'%a0', '%a1', '%a2', '%a3', '%a4', '%a5', '%a6', '%a7',
'%a8', '%a9', '%aa', '%ab', '%ac', '%ad', '%ae', '%af',
'%b0', '%b1', '%b2', '%b3', '%b4', '%b5', '%b6', '%b7',
'%b8', '%b9', '%ba', '%bb', '%bc', '%bd', '%be', '%bf',
'%c0', '%c1', '%c2', '%c3', '%c4', '%c5', '%c6', '%c7',
'%c8', '%c9', '%ca', '%cb', '%cc', '%cd', '%ce', '%cf',
'%d0', '%d1', '%d2', '%d3', '%d4', '%d5', '%d6', '%d7',
'%d8', '%d9', '%da', '%db', '%dc', '%dd', '%de', '%df',
'%e0', '%e1', '%e2', '%e3', '%e4', '%e5', '%e6', '%e7',
'%e8', '%e9', '%ea', '%eb', '%ec', '%ed', '%ee', '%ef',
'%f0', '%f1', '%f2', '%f3', '%f4', '%f5', '%f6', '%f7',
'%f8', '%f9', '%fa', '%fb', '%fc', '%fd', '%fe', '%ff');
var
iLen,iIndex : integer;
stEncoded : string;
ch : widechar;
begin
iLen := Length(stInput);
stEncoded := '';
for iIndex := 1 to iLen do
begin
ch := stInput[iIndex];
if (ch >= 'A') and (ch <= 'Z') then
stEncoded := stEncoded + ch
else if (ch >= 'a') and (ch <= 'z') then
stEncoded := stEncoded + ch
else if (ch >= '0') and (ch <= '9') then
stEncoded := stEncoded + ch
else if (ch = ' ') then
stEncoded := stEncoded + '+'
else if ((ch = '-') or (ch = '_') or (ch = '.') or (ch = '!') or (ch = '*')
or (ch = '~') or (ch = '''') or (ch = '(') or (ch = ')')) then // '''' is the apostrophe; the port had '\' (backslash), which is not URL-safe
stEncoded := stEncoded + ch
else if (Ord(ch) <= $07F) then
stEncoded := stEncoded + hex[Ord(ch)]
else if (Ord(ch) <= $7FF) then
begin
stEncoded := stEncoded + hex[$c0 or (Ord(ch) shr 6)];
stEncoded := stEncoded + hex[$80 or (Ord(ch) and $3F)];
end
else
begin
stEncoded := stEncoded + hex[$e0 or (Ord(ch) shr 12)];
stEncoded := stEncoded + hex[$80 or ((Ord(ch) shr 6) and ($3F))];
stEncoded := stEncoded + hex[$80 or ((Ord(ch)) and ($3F))];
end;
end;
result := (stEncoded);
end;
source : Java source code
I have made my own function. It converts spaces to %20, not to a plus sign. I needed it to convert a local file path to a path for the browser (with the file:/// prefix). Most importantly, it handles UTF-8 strings. It was inspired by Radek Hladik's solution above.
function URLEncode(s: string): string;
var
i: integer;
source: PAnsiChar;
begin
result := '';
source := pansichar(s);
for i := 1 to length(source) do
if not (source[i - 1] in ['A'..'Z', 'a'..'z', '0'..'9', '-', '_', '~', '.', ':', '/']) then
result := result + '%' + inttohex(ord(source[i - 1]), 2)
else
result := result + source[i - 1];
end;
AFAIK you need to make your own.
Here is an example.
HTTPEncode
TIdURI and HTTPEncode have problems with Unicode character sets. The function below does the encoding correctly.
function EncodeURIComponent(const ASrc: string): UTF8String;
const
HexMap: UTF8String = '0123456789ABCDEF';
function IsSafeChar(ch: Integer): Boolean;
begin
if (ch >= 48) and (ch <= 57) then Result := True // 0-9
else if (ch >= 65) and (ch <= 90) then Result := True // A-Z
else if (ch >= 97) and (ch <= 122) then Result := True // a-z
else if (ch = 33) then Result := True // !
else if (ch >= 39) and (ch <= 42) then Result := True // '()*
else if (ch >= 45) and (ch <= 46) then Result := True // -.
else if (ch = 95) then Result := True // _
else if (ch = 126) then Result := True // ~
else Result := False;
end;
var
I, J: Integer;
ASrcUTF8: UTF8String;
begin
Result := ''; {Do not Localize}
ASrcUTF8 := UTF8Encode(ASrc);
// UTF8Encode call not strictly necessary but
// prevents implicit conversion warning
I := 1; J := 1;
SetLength(Result, Length(ASrcUTF8) * 3); // space to %xx encode every byte
while I <= Length(ASrcUTF8) do
begin
if IsSafeChar(Ord(ASrcUTF8[I])) then
begin
Result[J] := ASrcUTF8[I];
Inc(J);
end
else if ASrcUTF8[I] = ' ' then
begin
Result[J] := '+';
Inc(J);
end
else
begin
Result[J] := '%';
Result[J+1] := HexMap[(Ord(ASrcUTF8[I]) shr 4) + 1];
Result[J+2] := HexMap[(Ord(ASrcUTF8[I]) and 15) + 1];
Inc(J,3);
end;
Inc(I);
end;
SetLength(Result, J-1);
end;
I'd like to point out that if you care much more about correctness than about efficiency, the simplest you can do is hex encode every character, even if it's not strictly necessary.
Just today I needed to encode a few parameters for a basic HTML login form submission. After going through all the options, each with their own caveats, I decided to write this naive version that works perfectly:
function URLEncode(const AStr: string): string;
var
LBytes: TBytes;
LIndex: Integer;
begin
Result := '';
LBytes := TEncoding.UTF8.GetBytes(AStr);
for LIndex := Low(LBytes) to High(LBytes) do
Result := Result + '%' + IntToHex(LBytes[LIndex], 2);
end;

Resources