We are trying to write a UDF in Delphi (10 Seattle) for our Firebird 2.5 database which should remove some characters from the input string.
All our string fields in the database are using character set UTF8 with collation UNICODE_CI_AI.
The function should remove some characters like space, . ; : / \ and others from the string.
Our function works fine for strings that contain only characters with a code <= 127 (plain ASCII). As soon as a character has a code above 127, the UDF fails.
We have tried using PChar instead of PAnsiChar parameters, but without success. For now we check whether a character has a code above 127 and, if so, remove that character from the string too.
What we want though, is a UDF that returns the original string without the punctuation characters.
This is our code so far:
unit UDFs;
interface
uses ib_util;
function UDF_RemovePunctuations(InputString: PAnsiChar): PAnsiChar; cdecl;
implementation
uses SysUtils, AnsiStrings, Classes;
//FireBird declaration:
//DECLARE EXTERNAL FUNCTION UDF_REMOVEPUNCTUATIONS
// CSTRING(500)
//RETURNS CSTRING(500) FREE_IT
//ENTRY_POINT 'UDF_RemovePunctuations' MODULE_NAME 'FB_UDF.dll';
function UDF_RemovePunctuations(InputString: PAnsiChar): PAnsiChar;
const
PunctuationChars = [' ', ',', '.', ';', '/', '\', '''', '"','(', ')'];
var
I: Integer;
S, NewS: String;
begin
S := UTF8ToUnicodeString(InputString);
For I := 1 to Length(S) do
begin
If Not CharInSet(S[I], PunctuationChars)
then begin
If S[I] <= #127
then NewS := NewS + S[I];
end;
end;
Result := ib_util_malloc(Length(NewS) + 1);
NewS := NewS + #0;
AnsiStrings.StrPCopy(Result, NewS);
end;
end.
When we remove the <= #127 check, we can see that NewS contains all the characters it should (minus the punctuation characters, of course), but we think things go wrong in the StrPCopy call.
Any help would be appreciated!
Thanks to LU RD I got this working.
The answer was to declare my string variables as Utf8String instead of String, and not to convert the input string to Unicode.
I have adapted my code like this:
//FireBird declaration:
//DECLARE EXTERNAL FUNCTION UDF_REMOVEPUNCTUATIONS
// CSTRING(500)
//RETURNS CSTRING(500) FREE_IT
//ENTRY_POINT 'UDF_RemovePunctuations' MODULE_NAME 'CarfacPlus_UDF.dll';
function UDF_RemovePunctuations(InputString: PAnsiChar): PAnsiChar;
const
PunctuationChars = [' ', ',', '.', ';', '/', '\', '''', '"','(', ')', '-',
'+', ':', '<', '>', '=', '[', ']', '{', '}'];
var
I: Integer;
S: Utf8String;
begin
S := InputString;
For I := Length(S) downto 1 do
If CharInSet(S[I], PunctuationChars)
then Delete(S, I, 1);
Result := ib_util_malloc(Length(S) + 1);
AnsiStrings.StrPCopy(Result, AnsiString(S));
end;
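A note on why the byte-wise loop is safe for UTF-8 text: every byte of a multi-byte UTF-8 sequence has a value of $80 or higher, so an ASCII punctuation byte can never appear inside an encoded non-ASCII character. Below is a minimal sketch of the same idea as a standalone console program. StripAsciiPunctuation and the sample text are hypothetical names used for illustration only, and the ib_util_malloc call is left out so the snippet compiles on its own.

```pascal
program StripUtf8Demo;
{$APPTYPE CONSOLE}

// Hypothetical helper: removes ASCII punctuation bytes from UTF-8 text.
// Bytes >= $80 (parts of multi-byte characters) are copied unchanged.
function StripAsciiPunctuation(const Input: UTF8String): UTF8String;
const
  PunctuationChars: set of AnsiChar = [' ', ',', '.', ';', ':', '/', '\'];
var
  I, Count: Integer;
begin
  // Pre-allocate the worst case, then shrink to the number of bytes kept.
  SetLength(Result, Length(Input));
  Count := 0;
  for I := 1 to Length(Input) do
    if not (Input[I] in PunctuationChars) then
    begin
      Inc(Count);
      Result[Count] := Input[I];
    end;
  SetLength(Result, Count);
end;

var
  Source, Cleaned: UTF8String;
begin
  // #$00E9 is "e with acute"; it occupies two bytes in UTF-8.
  Source := UTF8String('caf'#$00E9'; bar.');
  Cleaned := StripAsciiPunctuation(Source);
  // Show the byte lengths before and after stripping.
  Writeln(Length(Source), ' -> ', Length(Cleaned));
end.
```

In a real UDF the cleaned bytes would still have to be copied into a buffer obtained from ib_util_malloc, as in the code above.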
I use StrUtils to split a string into a TStringDynArray, but the output was not as expected. I will try to explain the issue:
I have a string str: 'a'; 'b'; 'c'
Now I called StrUtils.SplitString(str, '; '); to split the string and I expected an array with three elements: 'a', 'b', 'c'
But what I got is an array with five elements: 'a', '', 'b', '', 'c'.
When I split with just ';' instead of '; ' I get three elements with a leading blank.
So why do I get empty strings in my first solution?
This function is designed not to merge consecutive separators. For instance, consider splitting the following string on commas:
foo,,bar
What would you expect SplitString('foo,,bar', ',') to return? Would you be looking for ('foo', 'bar') or should the answer be ('foo', '', 'bar')? It's not clear a priori which is right, and different use cases might want different output.
In your case, you specified two delimiters, ';' and ' '. This means that
'a'; 'b'
splits at ';' and again at ' '. Between those two delimiters there is nothing, and hence an empty string is returned in between 'a' and 'b'.
The Split method from the string helper introduced in XE3 has a TStringSplitOptions parameter. If you pass ExcludeEmpty for that parameter then consecutive separators are treated as a single separator. This program:
{$APPTYPE CONSOLE}
uses
System.SysUtils;
var
S: string;
begin
for S in '''a''; ''b''; ''c'''.Split([';', ' '], ExcludeEmpty) do begin
Writeln(S);
end;
end.
outputs:
'a'
'b'
'c'
But you do not have this available to you in XE2 so I think you are going to have to roll your own split function. Which might look like this:
function IsSeparator(const C: Char; const Separators: string): Boolean;
var
sep: Char;
begin
for sep in Separators do begin
if sep=C then begin
Result := True;
exit;
end;
end;
Result := False;
end;
function Split(const Str, Separators: string): TArray<string>;
var
CharIndex, ItemIndex: Integer;
len: Integer;
SeparatorCount: Integer;
Start: Integer;
begin
len := Length(Str);
if len=0 then begin
Result := nil;
exit;
end;
SeparatorCount := 0;
for CharIndex := 1 to len do begin
if IsSeparator(Str[CharIndex], Separators) then begin
inc(SeparatorCount);
end;
end;
SetLength(Result, SeparatorCount+1); // potentially an over-allocation
ItemIndex := 0;
Start := 1;
for CharIndex := 1 to len do begin
if IsSeparator(Str[CharIndex], Separators) then begin
if CharIndex>Start then begin
Result[ItemIndex] := Copy(Str, Start, CharIndex-Start);
inc(ItemIndex);
end;
Start := CharIndex+1;
end;
end;
if len>=Start then begin // >= so that a trailing single-character item is kept
Result[ItemIndex] := Copy(Str, Start, len-Start+1);
inc(ItemIndex);
end;
SetLength(Result, ItemIndex);
end;
Of course, all of this assumes that you want a space to act as a separator. You've asked for that in the code, but perhaps you actually want just ; to act as a separator. In that case you probably want to pass ';' as the separator, and trim the strings that are returned.
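For the case where only ';' should act as the separator, a minimal sketch could look like the following. It assumes the input string from the question and uses only SplitString and Trim; the program name is arbitrary.

```pascal
program SplitTrimDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Types, System.StrUtils;

var
  Parts: TStringDynArray;
  I: Integer;
begin
  // Split on ';' only, so the blank after each ';' stays attached to the item.
  Parts := SplitString('''a''; ''b''; ''c''', ';');
  for I := Low(Parts) to High(Parts) do
    Writeln(Trim(Parts[I])); // Trim removes the leading blank
end.
```

This yields three items with no empty strings in between, because the space is no longer treated as a delimiter.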
SplitString is defined as
function SplitString(const S, Delimiters: string): TStringDynArray;
One might think that Delimiters denotes a single delimiter string used for splitting, but it actually denotes a set of single characters. Each character in the Delimiters string is used as one of the possible delimiters.
SplitString
Splits a string into different parts delimited by the specified
delimiter characters. S is the string to be split. Delimiters is a
string containing the characters defined as delimiters.
It is because the second parameter of SplitString is a list of single character delimiters, so '; ' means split at a ';' OR split at a ' '. So the string is split at every ';' and at every space, and between the ';' and the ' ' there is nothing, hence the empty strings.
I have a Tstringlist containing a list of keys used in a database table.
I'd like a simple way to generate one string containing all the keys, with each separated by a comma and enclosed in single quotes.
This is so that it can be used in a SQL 'IN' statement
eg WHERE FieldX IN ('One','Two','Three').
I've tried using quotechar but it is ignored when reading the commatext.
eg the following code
procedure junk;
var
SL : Tstringlist;
s : string;
begin
SL := Tstringlist.Create;
SL.Delimiter :=','; //comma delimiter
SL.QuoteChar := ''''; //single quote around strings
SL.Add('One');
SL.Add('Two');
SL.Add('Three');
try
s := SL.commatext;
showmessage(s);
finally
SL.Free;
end; //finally
end; //junk
shows the message One,Two,Three - without any quotes.
I know I can do it the long way round, as in
procedure junk;
var
SL : Tstringlist;
s : string;
i : integer;
begin
SL := Tstringlist.Create;
SL.Delimiter :=','; //comma delimiter
SL.Add('One');
SL.Add('Two');
SL.Add('Three');
try
s := '';
for I := 0 to SL.Count - 1 do
begin
s := s + ',' + '''' + SL[i] + '''';
end;
delete(s,1,1);
showmessage(s);
finally
SL.Free;
end;//finally
end;
but is there a simpler way using properties of the Tstringlist itself?
If you're using D2006 or later, you can use a CLASS HELPER:
USES Classes,StrUtils;
TYPE
TStringListHelper = CLASS HELPER FOR TStrings
FUNCTION ToSQL : STRING;
END;
FUNCTION TStringListHelper.ToSQL : STRING;
VAR
S : STRING;
FUNCTION QuotedStr(CONST S : STRING) : STRING;
BEGIN
Result:=''''+ReplaceStr(S,'''','''''')+''''
END;
BEGIN
Result:='';
FOR S IN Self DO BEGIN
IF Result='' THEN Result:='(' ELSE Result:=Result+',';
Result:=Result+QuotedStr(S)
END;
IF Result<>'' THEN Result:=Result+')'
END;
This code:
SL:=TStringList.Create;
SL.Add('One');
SL.Add('Two');
SL.Add('Number Three');
SL.Add('It''s number 4');
WRITELN('SELECT * FROM TABLE WHERE FIELD IN '+SL.ToSQL);
will then output:
SELECT * FROM TABLE WHERE FIELD IN ('One','Two','Number Three','It''s number 4')
Use sl.DelimitedText instead of sl.CommaText to make it follow your settings. CommaText will temporarily change the Delimiter and QuoteChar to some hardcoded values.
CommaText (and apparently also DelimitedText) can not be relied on to add the quotes, because they treat single word and multi word strings differently.
When retrieving CommaText, any string in the list that include spaces,
commas or quotes will be contained in double quotes, and any double
quotes in a string will be repeated.
There doesn't seem to be any combination with the TStringList alone, so I suggest you add the strings using QuotedStr.
The following settings work as you wish, regardless of whether the strings are single words or multiple words:
SL := Tstringlist.Create;
SL.Delimiter :=','; //comma delimiter
SL.StrictDelimiter := True;
// SL.QuoteChar := ''''; //single quote around strings
// SL.Add('One');
// SL.Add('Two words');
// SL.Add('Three');
SL.Add(QuotedStr('One'));
SL.Add(QuotedStr('Two words'));
SL.Add(QuotedStr('Three'));
s := SL.CommaText; // or SL.DelimitedText;
showmessage(s);
Output is:
'One','Two words','Three'
TStrings.DelimitedText quotes strings only when needed. When there is something like O'ne, it is emitted as 'O''ne', i.e. the quotes are doubled.
You can achieve what you want by setting a convenient delimiter, setting the QuoteChar property to #0, and then replacing the delimiters with the quotes.
procedure junk;
var
sl : Tstringlist;
s: string;
begin
sl := Tstringlist.Create;
try
sl.Delimiter := '|';
sl.QuoteChar := #0;//no quote
sl.Add('On''e');
sl.Add('Two');
sl.Add('Three');
//escape the single quotes
s := StringReplace(sl.DelimitedText, '''', '''''', [rfReplaceAll]);
//replace the delimiters and quote the text
s := '''' + StringReplace(s, sl.Delimiter, ''',''', [rfReplaceAll]) + '''';
WriteLn(s);
finally
sl.Free;
end;
end;
I have resolved a similar issue with something like this:
s := '''' + StringReplace(sl.CommaText, ',', ''',''', [rfReplaceAll]) + '''';
How can I replace special characters in a given string with spaces, or just remove them, using Delphi? The following works in C#, but I don't know how to write it in Delphi.
public string RemoveSpecialChars(string str)
{
string[] chars = new string[] { ",", ".", "/", "!", "#", "#", "$", "%", "^", "&", "*", "'", "\"", ";","_","(", ")", ":", "|", "[", "]" };
for (int i = 0; i < chars.Length; i++)
{
if (str.Contains(chars[i]))
{
str = str.Replace(chars[i],"");
}
}
return str;
}
I would write the function like this:
function RemoveSpecialChars(const str: string): string;
const
InvalidChars : set of char =
[',','.','/','!','#','#','$','%','^','&','*','''','"',';','_','(',')',':','|','[',']'];
var
i, Count: Integer;
begin
SetLength(Result, Length(str));
Count := 0;
for i := 1 to Length(str) do
if not (str[i] in InvalidChars) then
begin
inc(Count);
Result[Count] := str[i];
end;
SetLength(Result, Count);
end;
The function is pretty obvious when you see it written down. I prefer to try to avoid performing a large number of heap allocations which is why the code pre-allocates a buffer and then finalises its size at the end of the loop.
Actually there is a StringReplace function in the SysUtils unit which can be used like this:
uses SysUtils;
...
var
a, b: string;
begin
a := 'Google is awesome! I LOVE GOOGLE.';
b := StringReplace(a, 'Google', 'Microsoft', [rfReplaceAll, rfIgnoreCase]);
// b will be 'Microsoft is awesome! I LOVE Microsoft.'
end;
So you can write the code in almost the same way as you did in C# (instead of Contains you can use Pos function here). But I would recommend using HeartWare's approach since it should be a lot more efficient.
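A minimal sketch of that StringReplace-based variant, mirroring the C# loop, might look like this. The function name RemoveSpecialCharsByReplace and the shortened character list are assumptions made for illustration only.

```pascal
program RemoveCharsDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical name; removes each listed character with one StringReplace call.
function RemoveSpecialCharsByReplace(const Str: string): string;
const
  SpecialChars: array[0..4] of string = (',', '.', '!', ';', ':');
var
  I: Integer;
begin
  Result := Str;
  for I := Low(SpecialChars) to High(SpecialChars) do
    // rfReplaceAll removes every occurrence, not just the first one.
    Result := StringReplace(Result, SpecialChars[I], '', [rfReplaceAll]);
end;

begin
  Writeln(RemoveSpecialCharsByReplace('Hello, world! Done.')); // Hello world Done
end.
```

Each StringReplace call scans the string and allocates a new one, which is why the single-pass set-based versions are likely to be faster for long inputs.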
Try this one:
FUNCTION RemoveSpecialChars(CONST STR : STRING) : STRING;
CONST
InvalidChars : SET OF CHAR = [',','.','/','!','#','#','$','%','^','&','*','''','"',';','_','(',')',':','|','[',']'];
VAR
I : Cardinal;
BEGIN
Result:='';
FOR I:=1 TO LENGTH(STR) DO
IF NOT (STR[I] IN InvalidChars) THEN Result:=Result+STR[I]
END;
const
  InvalidChars =
    [',','.','/','!','#','#','$','%','^','&','*','''','"',';','_','(',')',':','|','[',']'];
var
  i: Integer;
  str: String;
begin
  Writeln('Please enter any text');
  Readln(str);
  Writeln('The given text is: ', str);
  // Replace every special character with a space
  for i := 1 to Length(str) do
    if (str[i] in InvalidChars) then
      str[i] := ' ';
  Writeln(str);
  Readln;
end.
Is there a built-in Delphi function which would convert a string such as '3,232.00' to float? StrToFloat raises an exception because of the comma. Or is the only way to strip out the comma first and then do StrToFloat?
Thanks.
Do you know for certain that '.' is always the decimal separator and ',' is always the thousands separator?
If so, fill in a TFormatSettings record and pass it to StrToFloat.
FillChar(FS, SizeOf(FS), 0);
... // filling other fields
FS.ThousandSeparator := ',';
FS.DecimalSeparator := '.';
V := StrToFloat(S, FS);
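One caveat: as far as I know, the StrToFloat documentation says that thousand separators are not allowed in the input string, so setting ThousandSeparator alone may not be enough. A minimal sketch that strips the separator first and then converts with explicit format settings is shown below. It assumes Delphi XE or later (for TFormatSettings.Create with a locale name), and ParseGroupedFloat is a hypothetical name.

```pascal
program ParseGroupedDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: parses text such as '3,232.00' regardless of the
// system locale by using fixed separators.
function ParseGroupedFloat(const S: string): Double;
var
  FS: TFormatSettings;
  Cleaned: string;
begin
  FS := TFormatSettings.Create('en-US');
  FS.ThousandSeparator := ',';
  FS.DecimalSeparator := '.';
  // Remove the grouping characters before the conversion.
  Cleaned := StringReplace(S, FS.ThousandSeparator, '', [rfReplaceAll]);
  Result := StrToFloat(Cleaned, FS);
end;

begin
  Writeln(ParseGroupedFloat('3,232.00'):0:2);
end.
```

Like StrToFloat itself, this raises EConvertError for text that is still not a valid number after the separators are removed.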
Below is what I use. There might be more efficient ways, but this works for me. In short: no, I don't know of any built-in Delphi function that will convert a string containing commas to a float.
{~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
safeFloat
Strips many bad characters from a string and returns it as a double.
}
function safeFloat(sStringFloat : AnsiString) : double;
var
dReturn : double;
begin
sStringFloat := stringReplace(sStringFloat, '%', '', [rfIgnoreCase, rfReplaceAll]);
sStringFloat := stringReplace(sStringFloat, '$', '', [rfIgnoreCase, rfReplaceAll]);
sStringFloat := stringReplace(sStringFloat, ' ', '', [rfIgnoreCase, rfReplaceAll]);
sStringFloat := stringReplace(sStringFloat, ',', '', [rfIgnoreCase, rfReplaceAll]);
try
dReturn := strToFloat(sStringFloat);
except
dReturn := 0;
end;
result := dReturn;
end;
function StrToFloat_Universal( pText : string ): Extended;
const
EUROPEAN_ST = ',';
AMERICAN_ST = '.';
var
lformatSettings : TFormatSettings;
lFinalValue : string;
lAmStDecimalPos : integer;
lIndx : Integer;
lIsAmerican : Boolean;
lIsEuropean : Boolean;
begin
lIsAmerican := False;
lIsEuropean := False;
for lIndx := Length( pText ) downto 1 do // Delphi strings are 1-based
begin
if ( pText[ lIndx ] = AMERICAN_ST ) then
begin
lIsAmerican := True;
pText := StringReplace( pText, ',', '', [ rfIgnoreCase, rfReplaceAll ]); //get rid of thousand incidental separators
Break;
end;
if ( pText[ lIndx ] = EUROPEAN_ST ) then
begin
lIsEuropean := True;
pText := StringReplace( pText, '.', '', [ rfIgnoreCase, rfReplaceAll ]); //get rid of thousand incidental separators
Break;
end;
end;
lFinalValue := pText; // default when no separator swap is needed below
GetLocaleFormatSettings( LOCALE_SYSTEM_DEFAULT, lformatSettings );
if ( lformatSettings.DecimalSeparator = EUROPEAN_ST ) then
begin
if lIsAmerican then
begin
lFinalValue := StringReplace( pText, '.', ',', [ rfIgnoreCase, rfReplaceAll ] );
end;
end;
if ( lformatSettings.DecimalSeparator = AMERICAN_ST ) then
begin
if lIsEuropean then
begin
lFinalValue := StringReplace( pText, ',', '.', [ rfIgnoreCase, rfReplaceAll ] );
end;
end;
pText := lFinalValue;
Result := StrToFloat( pText, lformatSettings );
end;
Try: StrToFloat(StringReplace('3,232.00', ',', '', [rfReplaceAll]))
It should get rid of the commas before doing the conversion.
In C# / VB.NET I would use something like decimal.convert("3,232.00", ",", "");
I know of no way to do the conversion without stripping out the extra characters. In fact, I have a special function in my library that strips out commas and currency symbols. So I actually call MyConverer.decimalConverter("$3,232.00");
I use a function which is able to handle the ',' and the '.' as decimalseparator...:
function ConvertToFloat(aNr: String; aDefault: Integer): Extended;
var
  sNr, s3R, sWhole, sCent: String;
begin
  // Work on a copy of the input with blanks removed
  sNr := ReplaceStr(aNr, ' ', '');
  if (Pos('.', sNr) > 0) or (Pos(',', sNr) > 0) then
  begin
    // Get 3rd character from right
    s3R := LeftStr(RightStr(sNr, 3), 1);
    if s3R <> DecimalSeparator then
    begin
      // IsNumber is a helper that is not shown here
      if not IsNumber(s3R) then
      begin
        sWhole := LeftStr(sNr, Length(sNr) - 3);
        sCent := RightStr(sNr, 2);
        sNr := sWhole + DecimalSeparator + sCent;
      end
      else
        // there are no decimals... add ',00'
        sNr := sNr + DecimalSeparator + '00';
    end;
    // DecimalSeparator is present; get rid of other symbols
    if (DecimalSeparator = '.') and (Pos(',', sNr) > 0) then sNr := ReplaceStr(sNr, ',', '');
    if (DecimalSeparator = ',') and (Pos('.', sNr) > 0) then sNr := ReplaceStr(sNr, '.', '');
  end;
  // Fall back to aDefault when the text still cannot be converted
  Result := StrToFloatDef(sNr, aDefault);
end;
I had the same problem when my users needed to enter 'scientific' values such as "1,234.06mV". Here there is a comma, a multiplier (m = x0.001) and a unit (V). I created a 'wide' format converter routine to handle these situations.
Brian
My function:
function StrIsFloat2 (S: string; out Res: Extended): Boolean;
var
I, PosDecimal: Integer;
Ch: Char;
STrunc: string;
liDots, liComma, J: Byte;
begin
Result := False;
if S = ''
then Exit;
liDots := 0;
liComma := 0;
for I := 1 to Length(S) do begin
Ch := S[I];
if Ch = FormatSettings.DecimalSeparator then begin
Inc (liDots);
if liDots > 1 then begin
Exit;
end;
end
else if (Ch = '-') and (I > 1) then begin
Exit;
end
else if Ch = FormatSettings.ThousandSeparator then begin
Inc (liComma);
end
else if not CharIsCipher(Ch) then begin
Exit;
end;
end;
if liComma > 0 then begin
PosDecimal := Pos (FormatSettings.DecimalSeparator, S);
if PosDecimal = 0 then
STrunc := S
else
STrunc := Copy (S, 1, PosDecimal-1);
if STrunc[1] = '-' then
Delete (STrunc, 1, 1); // drop the sign only from the copy used for the grouping check
if Length(STrunc) < ((liComma * 3) + 2) then
Exit;
J := 0;
for I := Length(STrunc) downto 1 do begin
Inc(J);
if J mod 4 = 0 then
if STrunc[I] <> FormatSettings.ThousandSeparator then
Exit;
end;
S := ReplaceStr (S, FormatSettings.ThousandSeparator, '');
end;
try
Res := StrToFloat (S);
Result := True;
except
Result := False;
end;
end;
Using Foreach loop
public static float[] ToFloatArray()
{
string pcords="200.812, 551.154, 232.145, 482.318, 272.497, 511.752";
float[] spiltfloat = new float[pcords.Split(',').Length];
int i = 0;
foreach (string s in pcords.Split(','))
{
spiltfloat[i] = (float)(Convert.ToDouble(s));
i++;
}
return spiltfloat;
}
Using a lambda expression to convert a comma-separated string to a float array
public static float[] ToFloatArrayUsingLemda()
{
string pcords="200.812, 551.154, 232.145, 482.318, 272.497, 511.752";
float[] spiltfloat = new float[pcords.Split(',').Length];
string[] str = pcords.Split(',').Select(x => x.Trim()).ToArray();
spiltfloat = Array.ConvertAll(str, float.Parse);
return spiltfloat;
}
procedure Edit1Exit(Sender: TObject);
begin
edit1.Text:=stringreplace(edit1.Text,'''','',[rfReplaceAll]);
if not IsValidDecimal( edit1.Text ) then
begin
showmessage('The Decimal entered -> '+edit1.Text+' <- is in the wrong format ');
edit1.SetFocus;
end;
end;
function IsValidDecimal(S:string):boolean;
VAR
FS: TFormatSettings;
DC: variant;
begin
//FS := TFormatSettings.Create('it-IT');
FS := TFormatSettings.Create('en-US'); // 'en-EN' is not a valid locale name
try
DC:=StrToFloat ( S, FS );
result:=true;
except
on e:exception do
result:=false;
end;
end;