Delphi 5: use Pos / PosIgnoreCase for whole words only

I know Delphi 5 is really old, but I have no other choice for now because my employer doesn't want to change, so I am stuck with the old functions.
I would like to know whether there is a way to get the position of whole words only.
I have a list of words (if, then, else, and, etc.) named KEYWORDS, and for each word in it I have to check in every .pas file whether any occurrence of that word contains uppercase characters.
In my code, I read each line, and for each line I use this to check whether it contains any word from the list written with uppercase characters:
if(PosIgnoreCase(KEYWORDS[I], S) <> Pos(KEYWORDS[I], S)) //Then the keyword has some uppercases in this line and I must raise an error
My problem is that if a line contains a word that includes a keyword (for example "MODIFICATION"), this detects the uppercase IF inside it and raises an error.
I tried using if(PosIgnoreCase(' ' + KEYWORDS[I] + ' ', S) <> Pos(' ' + KEYWORDS[I] + ' ', S))
but there may be parentheses or other characters instead of the spaces, and I would like to avoid adding a new condition for each character.
Is there a clean way to do it? I often find myself struggling with the lack of functions in Delphi 5.
Sorry if my question is somewhat confusing; English is not my first language.
Thank you for your time.
Update (from comments):
My list of keywords only contains the reserved keywords of Delphi.
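The whole-word requirement described above can be illustrated with a small sketch. It is written in Python rather than Delphi 5, purely to show the idea: cut the line into identifier-like tokens first, then compare each complete token against the keyword list, so that "MODIFICATION" is never inspected for an embedded "IF". The names KEYWORDS, IDENTIFIER and miscased_keywords are hypothetical, the keyword set is only a sample, and comments and string literals are not treated specially.

```python
import re

# Sample subset only; the real list would hold every reserved word.
KEYWORDS = {"if", "then", "else", "and"}

# An identifier-like token: a letter or underscore, then letters, digits or underscores.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def miscased_keywords(line):
    """Return (1-based position, token) for each keyword that is not all lowercase."""
    hits = []
    for match in IDENTIFIER.finditer(line):
        word = match.group()
        # Only complete tokens are compared, so a keyword inside a longer word is ignored.
        if word.lower() in KEYWORDS and word != word.lower():
            hits.append((match.start() + 1, word))
    return hits
```

The same scan could presumably be ported to Delphi 5 by walking the string character by character and testing each character against a set of identifier characters.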

Related

Match a word or whitespaces in Lua

(Sorry for my broken English)
What I'm trying to do is match a word (with or without numbers and special characters) or whitespace characters (spaces, tabs, optional newlines) in a string in Lua.
For example:
local my_string = "foo bar"
my_string:match(regex) --> should return 'foo', ' ', 'bar'
my_string = "   123!#." -- note: three whitespaces before '123!#.'
my_string:match(regex) --> should return ' ', ' ', ' ', '123!#.'
Where regex is the Lua regular expression pattern I'm asking for.
Of course I've done some research on Google, but I couldn't find anything useful. What I've got so far is [%s%S]+ and [%s+%S+], but neither seems to work.
Any solution using the standard library, e.g. string.find, string.gmatch, etc., is OK.
match returns either the captures or the whole match, and your patterns do not define any captures. [%s%S]+ matches "(space or non-space) one or more times", which is basically everything. [%s+%S+] is plain wrong: the character class [ ] is a set of single-character members; it does not treat sequences of characters in any other way ("[cat]" matches "c", "a" or "t"), nor does it care about +. So [%s+%S+] probably means "a single character that is a space, a plus, or a non-space".
The first example 'foo', ' ', 'bar' could be solved by:
regex="(%S+)(%s)(%S+)"
If you want a variable number of captures you are going to need the gmatch iterator:
local capt={}
for q,w,e in my_string:gmatch("(%s*)(%S+)(%s*)") do
    if q and #q>0 then
        table.insert(capt,q)
    end
    table.insert(capt,w)
    if e and #e>0 then
        table.insert(capt,e)
    end
end
This captures each run of whitespace as one string, so it will not discern between a single space and several (the three leading spaces in the second example come back as one capture, not three); you'll need to add that check to the processing of the match results.
Lua's standard patterns are simplistic; if you need more intricate matching, you might want to have a look at the LPeg library for Lua.
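For comparison, here is a minimal sketch of the tokenization the question asks for, written in Python rather than Lua because Python's re module supports alternation, which Lua's standard patterns lack. The function name tokens is hypothetical. Each run of non-whitespace becomes one token and each whitespace character becomes its own token.

```python
import re


def tokens(text):
    """Split text into runs of non-whitespace and single whitespace characters."""
    # \S+ grabs a whole word; \s grabs exactly one whitespace character.
    return re.findall(r"\S+|\s", text)
```

With this, tokens("foo bar") gives 'foo', ' ', 'bar', and a string with three leading spaces gives three separate single-space tokens followed by the word.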

Mathematica and Latex

I constantly use Mathematica and its TeXForm command to go back and forth between my calculations and the LaTeX document I'm typesetting. However, Mathematica won't allow me to define variables with an underscore, which I constantly need in my LaTeX document. Does anybody know how to create variables with "smarter" names in Mathematica?
In a broader sense, what is the best way to integrate the use of Mathematica and LaTeX?
Thanks.
First of all, Mathematica does allow you to define subscripted variables, which TeXForm renders with an underscore:
Subscript[x, 1] = 3
The shortcut for this is [Ctrl]+[_]
If you convert a subscript variable with TeXForm, you'll get:
x_1
I prefer not to use the subscript notation for normal variables, because in this notation you cannot easily see whether a variable already has a value. So you might just write
x1
We now want to transform this kind of variable name to the subscript notation in TeXForm.
One way to do this is with string patterns.
1. Transform your expression to a string in TeXForm:
In[360]:= ToString[(-b+y1) ((b-y1)/(b-y2))^(-(w10/(x\[Gamma]1-\[Omega]2))), TeXForm]
Out[360]= (\text{y1}-b) \left(\frac{b-\text{y1}}{b-\text{y2}}\right)^{-\frac{\text{w10}}{\text{x$\gamma $1}-\text{$\omega $2}}}
2. Replace this specific string pattern with the subscript notation of LaTeX:
In[361]:= StringReplace[%, "\\text{"~~name_?LetterQ~~index_?DigitQ~~"}":> name<>"_"<>index]
Out[361]= (y_1-b) \left(\frac{b-y_1}{b-y_2}\right)^{-\frac{\text{w10}}{\text{x$\gamma $1}-\text{$\omega $2}}}
You might have noticed that this replacement only worked on the variable names that consist of just one letter and one digit. Longer variable names were ignored. This is because the string pattern "_" stands for just one character; for a sequence of characters, use "__", but we have to make sure that we match the shortest possible sequence. To catch the longer variable names, we apply another string replacement:
In[362]:= StringReplace[%,
"\\text{"~~Shortest[name__]~~Shortest[index__?DigitQ]~~"}":> "\\text{"<>name<>"}_{"<>index<>"}"]
Out[362]= (y_1-b) \left(\frac{b-y_1}{b-y_2}\right)^{-\frac{\text{w}_{10}}{\text{x$\gamma $}_{1}-\text{$\omega $}_{2}}}
Now all variables appear in the correct LaTeX notation for subscripted variables. But some of the "\text{}"s and "{}"s are now redundant, because they contain only a single letter or digit.
To optimize the LaTeX code, we can add further replacements:
In[371]:= StringReplace[%, "{" ~~ i_?DigitQ ~~ "}" :> i];
StringReplace[%, "\\text{" ~~ name_?LetterQ ~~ "}" :> name]
Out[372]= (y_1-b) \left(\frac{b-y_1}{b-y_2}\right)^{-\frac{w_{10}}{\text{x$\gamma $}_1-\text{$\omega $}_2}}
Now I think the TeX looks good enough, so we can define a function that does all the replacements in one step:
In[506]:=
ClearAll[myTeXForm]
SetAttributes[myTeXForm, HoldFirst]
myTeXForm[expr_] := Fold[StringReplace, ToString[HoldPattern[expr], TeXForm],
{"\\text{HoldPattern}\\left[" ~~ str__ ~~ "\\right]" ~~ EndOfString :> str,
"\\text{" ~~ Shortest[str__] ~~ Shortest[i__?DigitQ] ~~ "}" :>
"\\text{" <> str <> "}_{" <> i <> "}",
{"{" ~~ i_?DigitQ ~~ "}" :> i, "\\text{" ~~ s_?LetterQ ~~ "}" :> s}}]
Testing the function:
b=134;
myTeXForm[(-b+y1) ((b-y1)/(b-y2))^(-(w10/(x\[Gamma]13-\[Omega]2)))]
Out[510]= (y_1-b) \left(\frac{b-y_1}{b-y_2}\right)^{-\frac{w_{10}}{\text{x$\gamma $}_{13}-\text{$\omega $}_2}}
Note that I used a little trick to protect the function against the values of its argument. In this example the variable b already has the value 134, but in the TeX output it should still appear as "b". To do so, I added the attribute HoldFirst to our function and used HoldPattern inside. Maybe there is an easier way, but it works fine.
Hope this might inspire you.
Best regards.
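The three replacement steps above can also be applied outside Mathematica, to a TeX string that TeXForm has already produced. The following is only a sketch in Python, not part of the original answer. The function name subscript_tex is hypothetical, and the last step assumes plain ASCII letters, which is narrower than Mathematica's LetterQ.

```python
import re


def subscript_tex(tex: str) -> str:
    """Rewrite \\text{name123} as name_{123}, then drop redundant wrappers."""
    # Step 1: \text{name123} -> \text{name}_{123}. The name is matched lazily,
    # so all trailing digits end up in the index.
    tex = re.sub(r"\\text\{([^{}]+?)(\d+)\}", r"\\text{\1}_{\2}", tex)
    # Step 2: braces around a single digit are unnecessary: {7} -> 7.
    tex = re.sub(r"\{(\d)\}", r"\1", tex)
    # Step 3: \text{} around a single ASCII letter is unnecessary: \text{y} -> y.
    tex = re.sub(r"\\text\{([A-Za-z])\}", r"\1", tex)
    return tex
```

Like the Mathematica version, step 2 also strips the braces from any other single digit in braces, which is still valid TeX.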

Split lua string into characters

I only found this related to what I am looking for: Split string by count of characters but it is not useful for what I mean.
I have a string variable that holds a 3-digit number (it can be anything from 000 to 999). I need to separate the digits (characters) and put them into a table.
I am programming a game mod that uses Lua and has some extra functions. If you could help me do this using http://wiki.multitheftauto.com/wiki/Split that would be amazing, but any other way is OK too.
Thanks in advance
Corrected to what the OP wanted to ask:
To just split a 3-digit number in 3 numbers, that's even easier:
s='429'
c1,c2,c3=s:match('(%d)(%d)(%d)')
t={tonumber(c1),tonumber(c2),tonumber(c3)}
The answer to "How do I split a long string composed of 3 digit numbers":
This is trivial. You might take a look at the gmatch function in the reference manual:
s="123456789"
res={}
for num in s:gmatch('%d%d%d') do
    res[#res+1]=tonumber(num)
end
or if you don't like looping:
res={}
s:gsub('%d%d%d',function(n)res[#res+1]=tonumber(n)end)
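For readers more familiar with Python, the same two splits can be sketched as follows. This is an illustration only, not Lua, and the function names split_digits and split_groups are hypothetical. As with gmatch('%d%d%d'), an incomplete trailing group is dropped.

```python
import re


def split_digits(s):
    """Split a string of digits into a list of single-digit numbers."""
    return [int(ch) for ch in s]


def split_groups(s, size=3):
    """Split a long digit string into numbers of `size` digits each."""
    # Only complete groups match; leftover digits at the end are ignored.
    return [int(group) for group in re.findall(r"\d{%d}" % size, s)]
```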
I was looking for something like this, but without looping, and ideally as a one-liner. Eventually, I found this example on the lua-users wiki (Split Join):
fields = {str:match((str:gsub("[^"..sep.."]*"..sep, "([^"..sep.."]*)"..sep)))}
... which is exactly the kind of syntax I'd like (a one-liner that returns a table), except that I don't really understand what is going on :/ Still, after some poking about, I managed to find the right syntax to split into characters with this idiom, which apparently is:
fields = { str:match( (str:gsub(".", "(.)")) ) }
I guess what happens is that gsub basically puts parentheses '(.)' around each character '.', so that match treats each one as a separate capture and "extracts" them as separate units as well... But I still don't get why there is an extra pair of parentheses around the str:gsub(".", "(.)") piece.
I tested this with Lua5.1:
str = "a - b - c"
fields = { str:match( (str:gsub(".", "(.)")) ) }
print(table_print(fields))
... where table_print is from lua-users wiki: Table Serialization; and this code prints:
"a"
" "
"-"
" "
"b"
" "
"-"
" "
"c"

String and CharStream<'a> in FParsec

I would like to parse a long sentence, which may contain names, in F#.
I assume that a name has the form first name + last name.
In the absence of a first-name list (I can't find one; I will do that later), I say that a first name is a string of 4 or more letters, and the same goes for the last name.
When I try my very smart parser
let firstorlastname x = (parray 4 letter) x
firstorlastname "JEAN"
firstorlastname "CHRISTOPHE"
So, it works for both, but the problem is that it consumes only 4 characters, which is not the desired behaviour for Christophe. I would like the whole word to be consumed.
How can I instruct FParsec to consume the entire word, but fail if the word is shorter than 4 characters?
Haven't tested it, but I think this should do it:
let firstOrLastName = manyMinMaxSatisfy 4 Int32.MaxValue isLetter
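To make the intended behaviour concrete, here is a hand-written sketch in Python (not FParsec) of what "at least 4 letters, then as many more as possible" means. The function parse_name and its parameters are hypothetical; it returns the consumed word and the new position, or None on failure.

```python
def parse_name(text, pos=0, min_letters=4):
    """Consume a maximal run of letters starting at pos; fail if it is too short."""
    end = pos
    # Keep consuming while the next character is a letter.
    while end < len(text) and text[end].isalpha():
        end += 1
    if end - pos < min_letters:
        return None  # fewer than min_letters letters: the parser fails
    return text[pos:end], end
```

Here parse_name("CHRISTOPHE") consumes all ten letters instead of stopping after four, while a three-letter word is rejected.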

Funny CSV format help

I've been given a large file with a funny CSV format to parse into a database.
The separator character is a semicolon (;). If one of the fields contains a semicolon it is "escaped" by wrapping it in doublequotes, like this ";".
I have been assured that there will never be two adjacent fields with trailing/ leading doublequotes, so this format should technically be ok.
Now, for parsing it in VBScript I was thinking of
1. Replacing each instance of ";" with a GUID,
2. Splitting the line into an array by semicolon,
3. Running back through the array, replacing the GUIDs with ";"
It seems to be the quickest way. Is there a better way? I guess I could use substrings but this method seems to be acceptable...
Your method sounds fine with the caveat that there's absolutely no possibility that your GUID will occur in the text itself.
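The placeholder method from the question, including the caveat about the placeholder never occurring in the text, can be sketched like this. It is a Python illustration rather than VBScript, the function name split_with_placeholder is hypothetical, and the restore step assumes the final field should contain a plain semicolon.

```python
import uuid


def split_with_placeholder(line):
    """Split on ';' while treating the three-character sequence ";" as an escaped semicolon."""
    placeholder = uuid.uuid4().hex
    # Guard against the (unlikely) case that the placeholder occurs in the data.
    while placeholder in line:
        placeholder = uuid.uuid4().hex
    protected = line.replace('";"', placeholder)
    # Split on the remaining real separators, then restore the escaped ones.
    return [field.replace(placeholder, ";") for field in protected.split(";")]
```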
One approach I've used for this type of data before is to just split on the semicolons regardless and then, if two adjacent fields end and start with a quote, combine them.
For example:
Pax;is;a;good;guy";" so;says;his;wife.
becomes:
0 Pax
1 is
2 a
3 good
4 guy"
5 " so
6 says
7 his
8 wife.
Then, when you discover that fields 4 and 5 end and start (respectively) with a quote, you combine them by replacing the field 4 closing quote with a semicolon and removing the field 5 opening quote (and joining them of course).
0 Pax
1 is
2 a
3 good
4 guy; so
5 says
6 his
7 wife.
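The split-then-merge steps above might look like this in code. This is a Python sketch rather than VBScript, and the function name split_and_merge is hypothetical. It relies on the question's guarantee that two genuinely separate adjacent fields never end and start with a quote.

```python
def split_and_merge(line):
    """Split on every ';', then rejoin pieces that were separated by an escaped ";"."""
    fields = []
    for piece in line.split(";"):
        if fields and fields[-1].endswith('"') and piece.startswith('"'):
            # Previous piece ends with a quote and this one starts with a quote:
            # swap the closing quote for a semicolon, drop the opening quote, and join.
            fields[-1] = fields[-1][:-1] + ";" + piece[1:]
        else:
            fields.append(piece)
    return fields
```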
In pseudo-code, given:
input: A string, first character is input[0]; last
character is input[length-1]. Further, assume one dummy
character, input[length]. It can be anything except
; and ". This string is one line of the "CSV" file.
length: positive integer, number of characters in input
Do this:
set start = 0
if input[0] = ';':
    you have a blank field in the beginning; do whatever with it
    set start = 1
endif
for each c between 1 and length-1:
    next iteration unless input[c] = ';'
    if input[c-1] ≠ '"' or input[c+1] ≠ '"': // test for escape sequence ";"
        found field consisting of half-open range [start,c); do whatever
        with it. Note that in the case of empty fields, start = c, leaving
        an empty range
        set start = c+1
    endif
end foreach
the last field is the half-open range [start,length); do whatever with it
Untested, of course. Debugging code like this is always fun….
The special case of input[0] is to make sure we don't ever look at input[-1]. If you can make input[-1] safe, then you can get rid of that special case. You can also put a dummy character in input[0] and then start your data—and your parsing—from input[1].
One option would be to find instances of the regex:
[^"];[^"]
and then break the string apart with substring:
List<string> ret = new List<string>();
Regex r = new Regex(@"[^""];[^""]");
Match m;
while((m = r.Match(line)).Success)
{
    ret.Add(line.Substring(0, m.Index + 1));
    line = line.Substring(m.Index + 2);
}
ret.Add(line); // whatever is left is the last field
(Sorry about the C#, I don't know VBScript)
Using quotes is normal for .csv files. If you have quotes in the field then you may see opening and closing and the embedded quote all strung together two or three in a row.
If you're using SQL Server you could try using T-SQL to handle everything for you.
SELECT * INTO MyTable FROM OPENDATASOURCE('Microsoft.JET.OLEDB.4.0',
'Data Source=F:\MyDirectory;Extended Properties="text;HDR=No"')...
[MyCsvFile#csv]
That will create and populate "MyTable". Read more on this subject here on SO.
I would recommend using RegEx to break up the strings.
Find every ';' that is not a part of ";" and change it to something else that does not appear in your fields.
Then go through and replace ";" with ;
Now you have your fields with the correct data.
Most importers can swap out separator characters pretty easily.
This is basically your GUID idea. Just make sure the GUID is unique to your file before you start and you will be fine. I tend to start using 'Z'. After enough 'Z's, you will be unique (sometimes as few as 1-3 will do).
Jacob
