Check similarity between two string expressions in Swift - ios

I have scanned text:
Mils, chiiese, wh_ite ch$col_te
And expression list, example:
- cheese
- bread
- white chocolate
- etc.
I need to compare a broken expression with the expressions from my list, e.g. "white chocolate" with "wh_ite ch$col_te".
Can you recommend any frameworks?

String distance - Levenshtein distance
What you need to do is measure the difference between two strings. To do that, you can use the Levenshtein distance.
Luckily for you, somebody has already implemented this algorithm in Swift here.
To make it work in Swift 1.2, you'll just have to autofix some errors that occur, nothing too fancy.
You can then use it like this:
println(levenshtein("wh_ite ch$col_te", bStr: "white chocolate")) // prints 3, because 3 single-character edits turn aStr into bStr
println(levenshtein("wh_ite ch$col_te", bStr: "whsdfdsite chosdfsdfcolate")) // prints 13, because 13 single-character edits turn aStr into bStr
Then you just pick a tolerance (the largest distance you still accept as a match) and you are done!
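To make the "set the tolerance" step concrete, here is a minimal sketch in Python rather than Swift. It is not the implementation linked above. The helper names `levenshtein` and `best_match` and the relative tolerance of 0.3 are assumptions chosen for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]


def best_match(scanned, candidates, tolerance=0.3):
    """Closest candidate, or None if it is too far away (tolerance is hypothetical)."""
    best = min(candidates, key=lambda c: levenshtein(scanned, c))
    ratio = levenshtein(scanned, best) / max(len(scanned), len(best), 1)
    return best if ratio <= tolerance else None


expressions = ["cheese", "bread", "white chocolate"]
for token in "Mils, chiiese, wh_ite ch$col_te".split(", "):
    print(token, "->", best_match(token, expressions))
```

Dividing the distance by the longer string's length keeps the tolerance comparable across short and long expressions. With this threshold "Mils" is rejected, because no entry in the list is close enough.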

Dejan Skledar's on the right track -- you want to make use of Levenshtein distance. The implementation he points to needs tweaking to work in Swift 1.2, and it tends to be slow. Here's a Swift 1.2-compatible, faster implementation.
Simply include the Tools class in your project. Once you've done that, you can get a number representing the difference between two strings this way:
Tools.levenshtein("cheese", bStr: "chee_e") // returns 1
Tools.levenshtein("butter", bStr: "b_tt_r") // returns 2
Tools.levenshtein("milk", bStr: "butter") // returns 6

A Swift 4 implementation of Joey deVilla's answer is available here.
Call the function like this:
Tools.levenshtein(aStr: "Example", bStr: "Examples")

Use StringMetric and be happy
https://github.com/autozimu/StringMetric.swift
import StringMetric
...
"kitten".distance(between: "sitting") // => 0.746
"君子和而不同".distance(between: "小人同而不和") // => 0.555

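The two scores quoted for StringMetric are consistent with Jaro similarity. Whether the library's `distance(between:)` computes exactly this is an assumption, not something the answer states. A minimal Python sketch of the Jaro formula (the function name `jaro` is made up for illustration):

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    # Characters match only if they are equal and at most `window` positions apart.
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    matched1 = [False] * len(s1)
    matched2 = [False] * len(s2)
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len(s2), i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len(s1)
            + matches / len(s2)
            + (matches - transpositions) / matches) / 3


print(round(jaro("kitten", "sitting"), 3))  # 0.746
```

Unlike Levenshtein distance, this score is already normalized, so a single threshold can be applied to strings of any length.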
Related

What is the meaning of Swift.String.Index(_rawBits: )

I am trying to understand what _rawBits really means in Swift.String.Index(_rawBits:). If you print a String.Index, you get something like Swift.String.Index(_rawBits: 983040). But what does that really mean?
Can I (mathematically) calculate the actual index in a string from this rawBits number at all, whether in base 32, 16, or anything else?
Swift Range uses String.Index as its upper and lower bounds.
"Swift Range uses String.Index as its upper and lower bounds."
No, it doesn't. Range<String.Index> does, but that's only one particular type of Range, which is otherwise generic over a type variable called Bound.
The _rawBits are an internal implementation detail of string indices. You should treat an index as an opaque value that you can only manipulate through the corresponding index APIs on String, Substring and friends.

How can Scala understand function calls in different formats?

I realize the following function calls are all the same, but I do not understand why.
val list = List(List(1), List(2, 3), List(4, 5, 6))
list.map(_.length) // res0 = List(1,2,3) result of 1st call
list map(_.length) // res1 = List(1,2,3) result of 2nd call
list map (_.length) // res2 = List(1,2,3) result of 3rd call
I can understand the 1st call, which is just a regular method call, because map is a member function of class List.
But I cannot understand the 2nd and 3rd calls. For example, in the 3rd call, how does the Scala compiler know that "(_.length)" is the parameter of "map"? How does the compiler know that "map" is a member function of "list"?
The only difference between variants 2 and 3 is the blank in front of the parenthesis. A blank can only be a delimiter: list a and lista are of course different, but an opening parenthesis starts a new token, so you can put one, two or three blanks in front of it, or none. I don't see why you would expect a difference here.
In Java, there is no difference between
System.out.println ("foo");
// and
System.out.println("foo");
either.
This is operator (infix) notation. The reason it works is the same reason why 2 + 2 works.
The space is used to distinguish between words: listmap(_.length) would make the compiler look for listmap. But if you write list++list, it will work too, as will list ++ list.
So, once you are using operator notation, the space is necessary to separate words, but otherwise may be present or not.

matlab indexing into nameless matrix [duplicate]

For example, if I want to read the middle value from magic(5), I can do so like this:
M = magic(5);
value = M(3,3);
to get value == 13. I'd like to be able to do something like one of these:
value = magic(5)(3,3);
value = (magic(5))(3,3);
to dispense with the intermediate variable. However, MATLAB complains about Unbalanced or unexpected parenthesis or bracket on the first parenthesis before the 3.
Is it possible to read values from an array/matrix without first assigning it to a variable?
It actually is possible to do what you want, but you have to use the functional form of the indexing operator. When you perform an indexing operation using (), you are actually making a call to the subsref function. So, even though you can't do this:
value = magic(5)(3, 3);
You can do this:
value = subsref(magic(5), struct('type', '()', 'subs', {{3, 3}}));
Ugly, but possible. ;)
In general, you just have to change the indexing step to a function call so you don't have two sets of parentheses immediately following one another. Another way to do this would be to define your own anonymous function to do the subscripted indexing. For example:
subindex = @(A, r, c) A(r, c); % An anonymous function for 2-D indexing
value = subindex(magic(5), 3, 3); % Use the function to index the matrix
However, when all is said and done the temporary local variable solution is much more readable, and definitely what I would suggest.
There was a good blog post on Loren on the Art of MATLAB a couple of days ago with a couple of gems that might help. In particular, using helper functions like:
paren = @(x, varargin) x(varargin{:});
curly = @(x, varargin) x{varargin{:}};
where paren() can be used like
paren(magic(5), 3, 3)
which would return
ans = 13
I would also surmise that this will be faster than gnovice's answer, but I haven't checked (Use the profiler!!!). That being said, you also have to include these function definitions somewhere. I personally have made them independent functions in my path, because they are super useful.
These functions and others are now available in the Functional Programming Constructs add-on which is available through the MATLAB Add-On Explorer or on the File Exchange.
How do you feel about using undocumented features:
>> builtin('_paren', magic(5), 3, 3) %# M(3,3)
ans =
13
or for cell arrays:
>> builtin('_brace', num2cell(magic(5)), 3, 3) %# C{3,3}
ans =
13
Just like magic :)
UPDATE:
Bad news, the above hack doesn't work anymore in R2015b! That's fine, it was undocumented functionality and we cannot rely on it as a supported feature :)
For those wondering where to find this type of thing, look in the folder fullfile(matlabroot,'bin','registry'). There's a bunch of XML files there that list all kinds of goodies. Be warned that calling some of these functions directly can easily crash your MATLAB session.
At least in MATLAB 2013a you can use getfield like:
a=rand(5);
getfield(a,{1,2}) % etc
to get the element at (1,2)
Unfortunately, syntax like magic(5)(3,3) is not supported by MATLAB. You need to use temporary intermediate variables. You can free up the memory after use, e.g.
tmp = magic(3);
myVar = tmp(3,3);
clear tmp
Note that if you compare running times with the standard way (assign the result and then access entries), they are practically the same.
subs=@(M,i,j) M(i,j);
>> for nit=1:10;tic;subs(magic(100),1:10,1:10);tlap(nit)=toc;end;mean(tlap)
ans =
0.0103
>> for nit=1:10,tic;M=magic(100); M(1:10,1:10);tlap(nit)=toc;end;mean(tlap)
ans =
0.0101
In my opinion, the bottom line is: MATLAB does not have pointers, and you have to live with it.
It could be simpler if you write a new function:
function [ element ] = getElem( matrix, index1, index2 )
element = matrix(index1, index2);
end
and then use it:
value = getElem(magic(5), 3, 3);
Your initial notation is the most concise way to do this:
M = magic(5); %create
value = M(3,3); % extract useful data
clear M; %free memory
If you are doing this in a loop you can just reassign M every time and ignore the clear statement as well.
To complement Amro's answer, you can use feval instead of builtin. There is no difference, really, unless you try to overload the operator function:
BUILTIN(...) is the same as FEVAL(...) except that it will call the
original built-in version of the function even if an overloaded one
exists (for this to work, you must never overload
BUILTIN).
>> feval('_paren', magic(5), 3, 3) % M(3,3)
ans =
13
>> feval('_brace', num2cell(magic(5)), 3, 3) % C{3,3}
ans =
13
What's interesting is that feval seems to be just a tiny bit quicker than builtin (by ~3%), at least in MATLAB 2013b, which is weird given that feval needs to check if the function is overloaded, unlike builtin:
>> tic; for i=1:1e6, feval('_paren', magic(5), 3, 3); end; toc;
Elapsed time is 49.904117 seconds.
>> tic; for i=1:1e6, builtin('_paren', magic(5), 3, 3); end; toc;
Elapsed time is 51.485339 seconds.

AsFloat convert to string

Hi
I want to convert "qrysth.Fields[i].AsFloat" to a string so I use the following code:
FormatFloat('0.###############',qrysth.Fields[i].AsFloat)
but I find that the resulting string is 12.000000000000001, while qrysth.Fields[i].AsFloat is 12.00. I know that FormatFloat does not actually convert the decimal value 12.00 but the underlying binary representation, which may not be exact (just as 0.1 is exact in the decimal system but is the infinitely repeating fraction 0.00011001100... in the binary system).
Is there another way to get 12.00 in the case above, or at least 12.000000000000000?
If you really get 12.000000000000001, then your field didn't hold exactly 12, so the output is correct. You asked for high precision by putting so many # characters in the format. If you don't want it so precise, then use a less precise format string.
FormatFloat('0.00',qrysth.Fields[i].AsFloat) will give '12.00'.
To be able to get '12.000000000000000' you should do the rounding yourself, because FormatFloat is not losing any precision: it prints the value that is actually stored.
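The effect described in this answer is not specific to Delphi. A small Python sketch with IEEE 754 doubles shows the same behaviour. The value `x` is an assumption chosen for illustration (the smallest double above 12), and the exact last digit depends on the stored value and on the floating-point type Delphi uses, so it need not match the question's output.

```python
# Doubles near 12 are spaced 2**-49 apart, so this is the next double above 12.0.
x = 12.0 + 2.0 ** -49

print(x == 12.0)          # False: the field does not hold exactly 12
print(format(x, ".15f"))  # 12.000000000000002: a precise format exposes the difference
print(format(x, ".2f"))   # 12.00: a less precise format hides it
```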
I want to convert "qrysth.Fields[i].AsFloat" to a string
Then why not use AsString?
qrysth.Fields[i].AsString
This will give you the best representation, as long as you're not concerned about the exact width. If you are, use FormatFloat with the exact number of digits you need. In other words, if you're looking for 12.00, use FormatFloat('0.00', qrysth.Fields[i].AsFloat) (a '#' placeholder drops trailing zeros, so '##.##' would print 12), or even better CurrToStr and AsCurrency, as they automatically use two digits after the decimal point.
function MyFormatFloat(V: Double): String;
const
  DesiredMinPrec = '0.000000000000000';
  AssumedMaxPrec = '0.#####';
begin
  // Round to the assumed maximum precision first, then pad to the desired width.
  Result := FormatFloat(DesiredMinPrec, StrToFloat(FormatFloat(AssumedMaxPrec, V)));
end;
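The round-then-pad idea in MyFormatFloat can be sketched in Python as follows. The function name and the default digit counts mirror the Delphi constants. This is a rough equivalent under that assumption, not a port verified against Delphi's output.

```python
def my_format_float(v: float, assumed_max_decimals: int = 5,
                    desired_decimals: int = 15) -> str:
    """Round v to the precision the data is assumed to have, then pad with zeros."""
    # Format with few decimals and parse back, which discards the binary noise.
    rounded = float(format(v, f".{assumed_max_decimals}f"))
    # Format the cleaned value with the full desired width.
    return format(rounded, f".{desired_decimals}f")


print(my_format_float(12.0 + 2.0 ** -49))  # 12.000000000000000
```

The trade-off is that any real digits beyond the assumed maximum precision are discarded as well, so this only fits data known to carry at most that many decimals.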

Custom Array Functions in Open Office Calc

Could someone please tell me how to write a custom function in Open Office Basic to be used in Open Office Calc and that returns an array of values. An example of one such built-in function is MINVERSE. I need to write a custom function that populates a range of cells in much the same way.
Help would be much appreciated.
Yay, I just figured it out: all you do is return an array from your macro, BUT you also have to press Ctrl+Shift+Enter when typing in the cell formula to call your function (which is also the case when working with other arrays in calc). Here's an example:
Function MakeArray
    Dim ret(1,1) ' upper bounds are inclusive, so this is a 2x2 array
    ret(0,0) = 1
    ret(1,0) = 2
    ret(0,1) = 3
    ret(1,1) = 4
    MakeArray = ret
End Function
FWIW, damjan's MakeArray function returns a Variant containing an array, I think. (The type returned by MakeArray is unspecified, so it defaults to Variant. A Variant is a container with a descriptive header, apparently cast as needed by the interpreter.)
That is almost, but not quite, the same thing as returning an array. According to http://www.cpearson.com/excel/passingandreturningarrays.htm, Microsoft did not introduce the ability to return an array until 2000. His example [ LoadNumbers(Low As Long, High As Long) As Long()] does not compile in OO, which flags a syntax error on the parentheses following Long. It appears that OO's Basic emulates pre-2000 VBA.
