Detailed distance between words - edit-distance

How would I go about displaying the detailed distance between two words?
For example, the output of the program could be:
Words are "car" and "cure":
Replace "a" with "u".
Add "e".
The Levenshtein distance does not fulfill my needs (I think).

Try the following. The algorithm roughly follows Wikipedia (Levenshtein distance). The language used below is Ruby.
As an example, take the case of changing s into t as follows:
s = 'Sunday'
t = 'Saturday'
First, s and t are turned into arrays, and an empty string is inserted at the beginning of each. m will eventually be the matrix used in the algorithm.
s = ['', *s.split('')]
t = ['', *t.split('')]
m = Array.new(s.length){[]}
m here, however, differs from the matrix in the Wikipedia algorithm in that each cell includes not only the Levenshtein distance, but also the (non-)operation (starting, doing nothing, deletion, insertion, or substitution) that was used to get to that cell from an adjacent (left, upper, or upper-left) cell. It may also include a string describing the parameters of the operation. That is, the format of each cell is:
[Levenshtein distance, operation(, string)]
Here is the main routine. It fills in the cells of m following the algorithm:
s.each_with_index{|a, i| t.each_with_index{|b, j|
  m[i][j] =
  if i.zero? && j.zero?
    [0, "started"]
  elsif i.zero?
    # First row: a prefix of t can only be built by insertions.
    [j, "inserted", "'#{b}' at position #{j-1}"]
  elsif j.zero?
    # First column: a prefix of s can only be removed by deletions.
    [i, "deleted", "'#{a}' at position #{i-1}"]
  elsif a == b
    [m[i-1][j-1][0], "did nothing"]
  else
    del, ins, subs = m[i-1][j][0], m[i][j-1][0], m[i-1][j-1][0]
    case [del, ins, subs].min
    when del
      [del+1, "deleted", "'#{a}' at position #{i-1}"]
    when ins
      [ins+1, "inserted", "'#{b}' at position #{j-1}"]
    when subs
      [subs+1, "substituted", "'#{a}' at position #{i-1} with '#{b}'"]
    end
  end
}}
Now, we set i, j to the bottom-right corner of m and follow the steps backwards as we unshift the contents of the cell into an array called steps, until we reach the start.
i, j = s.length-1, t.length-1
steps = []
loop do
  case m[i][j][1]
  when "started"
    break
  when "did nothing", "substituted"
    # Record the current cell first, then move diagonally.
    steps.unshift(m[i][j])
    i -= 1
    j -= 1
  when "deleted"
    steps.unshift(m[i][j])
    i -= 1
  when "inserted"
    steps.unshift(m[i][j])
    j -= 1
  end
end
Then we print the operation and the string of each step unless that is a non-operation.
steps.each do |d, op, str=''|
  puts "#{op} #{str}" unless op == "did nothing" or op == "started"
end
With this particular example, it will output:
inserted 'a' at position 1
inserted 't' at position 2
substituted 'n' at position 2 with 'r'
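For the "car" and "cure" example from the question, the same idea (fill the distance matrix, then walk back from the bottom-right corner) can be sketched in Python. This is only an illustration under my own assumptions: the function name edit_script and the wording of the messages are not from the answer above, and ties between equally cheap edits are broken in a different order than in the Ruby code.

```python
def edit_script(s, t):
    """Return a list of human-readable edits that turn s into t."""
    rows, cols = len(s) + 1, len(t) + 1
    # dist[i][j] is the Levenshtein distance between s[:i] and t[:j].
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # delete every character of s[:i]
    for j in range(cols):
        dist[0][j] = j  # insert every character of t[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1,         # insertion
                             dist[i - 1][j - 1] + cost)  # match or substitution

    # Walk back from the bottom-right corner, collecting the edits.
    steps = []
    i, j = len(s), len(t)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and s[i - 1] == t[j - 1] and dist[i][j] == dist[i - 1][j - 1]:
            i, j = i - 1, j - 1  # characters match, nothing to report
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            steps.append(f'Replace "{s[i - 1]}" with "{t[j - 1]}".')
            i, j = i - 1, j - 1
        elif j > 0 and dist[i][j] == dist[i][j - 1] + 1:
            steps.append(f'Add "{t[j - 1]}".')
            j -= 1
        else:
            steps.append(f'Remove "{s[i - 1]}".')
            i -= 1
    steps.reverse()  # edits were collected from the end of the words
    return steps


for line in edit_script("car", "cure"):
    print(line)
```

With these inputs the sketch prints the two lines asked for in the question: Replace "a" with "u". and Add "e".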

class Solution:
    def solve(self, text, word0, word1):
        word_list = text.split()
        ans = len(word_list)
        L = None
        for R in range(len(word_list)):
            if word_list[R] == word0 or word_list[R] == word1:
                if L is not None and word_list[R] != word_list[L]:
                    ans = min(ans, R - L - 1)
                L = R
        return -1 if ans == len(word_list) else ans

ob = Solution()
text = "cat dog abcd dog cat cat abcd dog wxyz"
word0 = "abcd"
word1 = "wxyz"
print(ob.solve(text, word0, word1))

Related

Generating all combinations from a table in Lua

I'm trying to iterate through a table with a variable amount of elements and get all possible combinations, only using every element one time. I've landed on the solution below.
arr = {"a","b","c","d","e","f"}
function tablelen(table)
    local count = 0
    for _ in pairs(table) do
        count = count + 1
    end
    return count
end
function spellsub(table,start,offset)
    local str = table[start]
    for i = start+offset, (tablelen(table)+1)-(start+offset) do
        str = str..","..table[i+1]
    end
    return str
end
print(spellsub(arr,1,2)) -- Outputs: "a,d,e" correctly
print(spellsub(arr,2,2)) -- Outputs: "b" supposed to be "b,e,f"
I'm still missing some further functions, but I'm getting stuck with my current code. What is it that I'm missing? It prints correctly the first time but not the second?
A solution with a coroutine iterator called recursively:
local wrap, yield = coroutine.wrap, coroutine.yield
-- This function clones the array t and appends the item new to it.
local function append (t, new)
    local clone = {}
    for _, item in ipairs (t) do
        clone [#clone + 1] = item
    end
    clone [#clone + 1] = new
    return clone
end
--[[
    Yields combinations of non-repeating items of tbl.
    tbl is the source of items,
    sub is a combination of items that all yielded combinations ought to contain,
    min is the minimum key of items that can be added to yielded combinations.
--]]
local function unique_combinations (tbl, sub, min)
    sub = sub or {}
    min = min or 1
    return wrap (function ()
        if #sub > 0 then
            yield (sub) -- yield short combination.
        end
        if #sub < #tbl then
            for i = min, #tbl do -- iterate over longer combinations.
                for combo in unique_combinations (tbl, append (sub, tbl [i]), i + 1) do
                    yield (combo)
                end
            end
        end
    end)
end
for combo in unique_combinations {'a', 'b', 'c', 'd', 'e', 'f'} do
    print (table.concat (combo, ', '))
end
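For comparison, the depth-first recursion used by the coroutine iterator can be sketched in Python (not Lua) with a generator. The parameter names prefix and start are my own, standing in for sub and min above, and the sketch uses a three-item list only to keep the output short.

```python
def unique_combinations(items, prefix=(), start=0):
    """Yield every non-empty combination of items, depth-first."""
    if prefix:
        yield prefix  # the shorter combination comes first
    for i in range(start, len(items)):
        # Extend the current combination only with items that come later,
        # so no item is used twice and no combination is repeated.
        yield from unique_combinations(items, prefix + (items[i],), i + 1)


combos = list(unique_combinations(["a", "b", "c"]))
for combo in combos:
    print(", ".join(combo))
```

For three items this prints a; a, b; a, b, c; a, c; b; b, c; c (one combination per line), which is the same visiting order as the Lua iterator.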
For tables with consecutive integer keys starting at 1, like yours, you can simply use the length operator #. Your tablelen function is superfluous.
Using table as a local variable name shadows Lua's table library. I suggest you use tbl or some other name that does not prevent you from using table's methods.
The issue with your code can be solved by printing some values for debugging:
local arr = {"a","b","c","d","e","f"}
function spellsub(tbl,start,offset)
    local str = tbl[start]
    print("first str:", str)
    print(string.format("loop from %d to %d", start+offset, #tbl+1-(start+offset)))
    for i = start+offset, (#tbl+1)-(start+offset) do
        print(string.format("tbl[%d]: %s", i+1, tbl[i+1]))
        str = str..","..tbl[i+1]
    end
    return str
end
print(spellsub(arr,1,2)) -- Outputs: "a,d,e" correctly
print(spellsub(arr,2,2)) -- Outputs: "b" supposed to be "b,e,f"
prints:
first str: a
loop from 3 to 4
tbl[4]: d
tbl[5]: e
a,d,e
first str: b
loop from 4 to 3
b
As you can see, your second loop does not run at all because the start value (4) is already greater than the limit value (3). Hence you only print the first value, b.
I don't understand how your code is related to what you want to achieve so I'll leave it up to you to fix it.

Find all occurrences of exact string in range and list it

I want to create a list of all occurrences of the string "x" in a range. This is my sheet:
And I want to find all occurrences, list them, and give them proper names:
For example for G2, I want "Beret Grey" string as result. I think that I need to use array formula or something like that.
Let me preface this by saying that VBA would be much more robust, but this formula will get you there. It may be slow, as it is an array-type formula and does a lot of calculations, and the cost grows quickly as the number of cells containing it increases:
=IFERROR(INDEX(A:A,AGGREGATE(15,6,ROW($B$2:$G$7)/($B$2:$G$7="x"),ROW(1:1))) & " " & INDEX($1:$1,AGGREGATE(15,6,COLUMN(INDEX(A:G,AGGREGATE(15,6,ROW($B$2:$G$7)/($B$2:$G$7="x"),ROW(1:1)),0))/(INDEX(A:G,AGGREGATE(15,6,ROW($B$2:$G$7)/($B$2:$G$7="x"),ROW(1:1)),0)="x"),ROW(1:1)-COUNTIF($B$1:INDEX(G:G,AGGREGATE(15,6,ROW($B$2:$G$7)/($B$2:$G$7="x"),ROW(1:1)) -1),"x"))),"")
You will need to expand the range to what you need. Change all the $B$2:$G$7 to $B$2:$N$29. Do not use full column references outside those that I have used. It will kill Excel.
Also note which references are relative and which are absolute. They need to remain as written, or you will get errors as the formula is dragged/copied down.
A simple UDF to do what you want:
Function findMatch(rng As Range, crit As String, inst As Long) As String
    Dim rngArr() As Variant
    rngArr = rng.Value
    Dim i&, j&, k&
    k = 0
    ' Return an empty string when fewer than inst matches exist.
    If inst > Application.WorksheetFunction.CountIf(rng, crit) Then
        findMatch = ""
        Exit Function
    End If
    ' Skip the first row and the first column, which hold the headers.
    For i = LBound(rngArr, 1) + 1 To UBound(rngArr, 1)
        For j = LBound(rngArr, 2) + 1 To UBound(rngArr, 2)
            If rngArr(i, j) = crit Then
                k = k + 1
                If k = inst Then
                    findMatch = rngArr(i, 1) & " " & rngArr(1, j)
                    Exit Function
                End If
            End If
        Next j
    Next i
End Function
Then you would call it like this:
=findMatch($A$1:$G$7,"x",ROW(1:1))
And drag/copy down.

Check for identical elements in a table in Lua?

How would you check a table for three identical elements (looking for three L's)?
table = {nil, nil, L, nil, L} -> false
table = {L, L, nil, nil, L} -> true
Really would appreciate some help!
EDIT: Ok I've got this, but it only outputs false even when there are three or more L's (and does so five times for every check?). Sorry if it seemed like I was trying to get the code for it, I'm genuinely trying to learn! :)
for k, v in pairs( threeL_table ) do
    local count = 0
    if k == 'L' then
        count = count + 1
    end
    if count == 3 then
        print('true')
    else
        print('false')
    end
end
You were almost there. You need to test the values v against 'L', not the keys k. Also, I suppose you want to print the message only once after the scan is concluded; if so, put the if-statement outside of the for-loop. (In this case, you should define count outside of the for-loop, too, otherwise you would not see it once it has ended).
local count = 0
for k, v in pairs( threeL_table ) do
    if v == 'L' then -- you need to check for the values not the keys
        count = count + 1
    end
end
if count == 3 then -- move this out of the for-loop
    print('true')
else
    print('false')
end
I will not give you any code, as you did not show any effort of your own to solve the problem.
How would you check a table for three identical elements? Well you count them.
Loop over the table and for every distinct value you create a new counter. You could use another table for that. Once one of those counters reaches 3 you know that you have three identical values.
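The counting idea described above can be sketched in Python (not Lua) with one counter per distinct value. The function name has_repeated and its parameter n are my own, and None stands in for Lua's nil; treat it as an illustration of the approach rather than a drop-in solution.

```python
from collections import Counter


def has_repeated(values, n=3):
    """Return True if any value other than None occurs at least n times."""
    # One counter per distinct value; empty slots (None) are ignored.
    counts = Counter(v for v in values if v is not None)
    return any(count >= n for count in counts.values())


print(has_repeated([None, None, "L", None, "L"]))
print(has_repeated(["L", "L", None, None, "L"]))
```

The first call prints False and the second prints True, matching the two examples in the question.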
Another way to solve this.
function detectDup(t,nDup)
    -- Note: ipairs stops at the first nil, so t must be an array without holes.
    local tabCount = {}
    for _,e in ipairs(t) do
        tabCount[e] = (tabCount[e] or 0) + 1
        if tabCount[e] >= nDup then
            print("The element '" .. e .. "' is repeated at least " .. nDup .. " times!")
            return true
        end
    end
    return false
end
print(detectDup({'L', 'L','A','B'},3))
print(detectDup({'L', 'L','A','B','L',},3))

Total sum from a set (logic)

I have a logic problem for an iOS app but I don't want to solve it using brute-force.
I have a set of integers, the values are not unique:
[3,4,1,7,1,2,5,6,3,4........]
How can I get a subset from it with these 3 conditions:
I can only pick a defined amount of values.
The sum of the picked elements is equal to a given value.
The selection must be random, so if there's more than one solution to the value, it will not always return the same.
Thanks in advance!
This is the subset sum problem. It is a known NP-complete problem, and thus there is no known efficient (polynomial) solution to it.
However, if you are dealing with only relatively small integers, there is a pseudo-polynomial time solution using dynamic programming.
The idea is to build a matrix bottom-up that follows the next recursive formulas:
D(x,i) = false x<0
D(0,i) = true
D(x,0) = false x != 0
D(x,i) = D(x,i-1) OR D(x-arr[i],i-1)
The idea is to mimic an exhaustive search - at each point you "guess" if the element is chosen or not.
To get the actual subset, you need to trace back through your matrix once it is filled. Start from D(SUM,n) (assuming its value is true) and repeat the following until x reaches 0. Here arr is 0-indexed, so the i-th element is arr[i-1]:
if D(x-arr[i-1],i-1) == true:
    add arr[i-1] to the set
    modify x <- x - arr[i-1]
    modify i <- i-1
else // that means D(x,i-1) must be true
    just modify i <- i-1
To get a random subset each time, if both D(x-arr[i-1],i-1) == true AND D(x,i-1) == true, choose randomly which course of action to take.
Python Code (If you don't know python read it as pseudo-code, it is very easy to follow).
from random import randint

arr = [1,2,4,5]
n = len(arr)
SUM = 6
#pre processing:
D = [[True] * (n+1)]
for x in range(1,SUM+1):
    D.append([False]*(n+1))
#DP solution to populate D:
for x in range(1,SUM+1):
    for i in range(1,n+1):
        D[x][i] = D[x][i-1]
        if x >= arr[i-1]:
            D[x][i] = D[x][i] or D[x-arr[i-1]][i-1]
print(D)
#get a random solution:
if D[SUM][n] == False:
    print('no solution')
else:
    sol = []
    x = SUM
    i = n
    while x != 0:
        possibleVals = []
        if D[x][i-1] == True:
            possibleVals.append(x)
        if x >= arr[i-1] and D[x-arr[i-1]][i-1] == True:
            possibleVals.append(x-arr[i-1])
        #by here possibleVals contains 1 or 2 candidates, depending on how many choices we have.
        #choose one of them at random
        r = possibleVals[randint(0,len(possibleVals)-1)]
        #if decided to add element:
        if r != x:
            sol.append(x-r)
        #modify i and x accordingly
        x = r
        i = i-1
    print(sol)
P.S.
The above gives you a random choice, but NOT a uniform distribution over the possible subsets.
To achieve a uniform distribution, you need to count the number of possible ways to build each sum.
The formulas will be:
D(x,i) = 0 x<0
D(0,i) = 1
D(x,0) = 0 x != 0
D(x,i) = D(x,i-1) + D(x-arr[i],i-1)
When generating the subset, you follow the same logic, but you decide to add element i with probability D(x-arr[i],i-1) / D(x,i).
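A minimal sketch of this counting variant, assuming all values are positive integers (so only the empty subset sums to 0). The names count_table and sample_subset are my own, and equal values at different positions are counted as different subsets.

```python
import random


def count_table(arr, total):
    """C[x][i] = number of subsets of the first i elements that sum to x."""
    n = len(arr)
    C = [[0] * (n + 1) for _ in range(total + 1)]
    for i in range(n + 1):
        C[0][i] = 1  # only the empty subset sums to 0 (positive values assumed)
    for x in range(1, total + 1):
        for i in range(1, n + 1):
            C[x][i] = C[x][i - 1]  # subsets that skip element i
            if x >= arr[i - 1]:
                C[x][i] += C[x - arr[i - 1]][i - 1]  # subsets that use element i
    return C


def sample_subset(arr, total):
    """Return a uniformly random subset summing to total, or None if none exists."""
    C = count_table(arr, total)
    n = len(arr)
    if C[total][n] == 0:
        return None
    chosen, x = [], total
    for i in range(n, 0, -1):
        take = C[x - arr[i - 1]][i - 1] if x >= arr[i - 1] else 0
        # Use element i with probability take / C[x][i].
        if random.randrange(C[x][i]) < take:
            chosen.append(arr[i - 1])
            x -= arr[i - 1]
    return chosen


print(sample_subset([1, 2, 4, 5], 6))
```

For arr = [1, 2, 4, 5] and a target of 6 there are two solutions, 1+5 and 2+4, and each should come up about half of the time.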

How Lua tables work

I am starting to learn Lua from Programming in Lua (2nd edition)
I didn't understand the following in the book. It's very vaguely explained.
a.) w={x=0,y=0,label="console"}
b.) x={math.sin(0),math.sin(1),math.sin(2)}
c.) w[1]="another field"
d.) x.f=w
e.) print (w["x"])
f.) print (w[1])
g.) print x.f[1]
When I do print(w[1]) after a.), why doesn't it print x=0?
What does c.) do?
What is the difference between e.) and print (w.x)?
What is the role of b.) and g.)?
You have to realize that this:
t = {3, 4, "eggplant"}
is the same as this:
t = {}
t[1] = 3
t[2] = 4
t[3] = "eggplant"
And that this:
t = {x = 0, y = 2}
is the same as this:
t = {}
t["x"] = 0
t["y"] = 2
Or this:
t = {}
t.x = 0
t.y = 2
In Lua, tables are not just lists, they are associative arrays.
When you print w[1], then what really matters is line c.) In fact, w[1] is not defined at all until line c.).
There is no difference between e.) and print (w.x).
b.) creates a new table named x which is separate from w.
d.) places a reference to w inside of x. (NOTE: It does not actually make a copy of w, just a reference. If you've ever worked with pointers, it's similar.)
g.) Can be broken up in two parts. First we get x.f which is just another way to refer to w because of line d.). Then we look up the first element of that table, which is "another field" because of line c.)
There's another way of creating keys in in-line table declarations.
x = {["1st key has spaces!"] = 1}
The advantage here is that you can have keys with spaces and any extended ASCII character.
In fact, a key can be any value except nil (and NaN), even a function or another table.
function Example()
--example function
end
x = {[Example] = "A function."}
Any variable or value or data can go into the square brackets to work as a key. The same goes with the value.
Practically, this can replace features like the in keyword in python, as you can index the table by values to check if they are there.
Getting a value at an undefined part of the table will not cause an error. It will just give you nil. The same goes for using undefined variables.
local w = {
    --[1] = "another field"; -- w[1] will be set to this value later
    --["1"] = nil; -- the string key "1" is not the same key as the number 1, unlike in some other languages
    x = 0;
    y = 0;
    label = "console";
}
local x = {
    math.sin(0);
    math.sin(1);
    math.sin(2);
}
w[1] = "another field"
x.f = w
print (w["x"])
-- Because x.f = w, x.f and w refer to the same table,
-- so (x.f)[1], x.f[1] and w[1] are all the same value.
print (w[1])
print ((x.f)[1])
print (x.f[1])
-- print (x.f)[1] is not valid Lua syntax.
-- The parentheses of a call can be omitted only when the single argument
-- is a string literal (or a table constructor), as in print "xxxx",
-- so print x.f[1] raises a syntax error.
-- Any Lua value except nil can be used as a table key,
-- for example:
local t_key = {v=123}
local f_key = function () print("f123") end
local t = {}
t[t_key] = 1
t[f_key] = 2
-- The keys of t are then the references to t_key and f_key themselves.
-- With t[{}] = 123, the value 123 is stored under the reference
-- to that anonymous table {}.

Resources