Finding duplicates in a multi-dimensional table - lua

A slightly altered version of the below permitted me to filter unique field values out of a multi-dimensional table (dictionary style).
[ url ] http://rosettacode.org/wiki/Remove_duplicate_elements#Lua
items = {1,2,3,4,1,2,3,4,"bird","cat","dog","dog","bird"}`
flags = {}
io.write('Unique items are:')
for i=1,#items do
if not flags[items[i]] then
io.write(' ' .. items[i])
flags[items[i]] = true
end
end
io.write('\n')`
What I'm lost at is what 'if not ... then ... end' part actually does. To me this is sillyspeak but hey, it works ;-) Now i want to know what happens under the hood.
I hope multi-dimensional does not offend anyone, I'm referring to a table consisting of multiple row containing multiple key-value pairs on each row.
Here's the code i'm using, no brilliant adaptation but good enough to filter unique values on a fieldname
for i=1,#table,1 do
if not table2[table[i].fieldname] then
table2[table[i].fieldname] = true
end
end
for k,v in pairs(table2) do
print(k)
end

function findDuplicates(t)
seen = {} --keep record of elements we've seen
duplicated = {} --keep a record of duplicated elements
for i = 1, #t do
element = t[i]
if seen[element] then --check if we've seen the element before
duplicated[element] = true --if we have then it must be a duplicate! add to a table to keep track of this
else
seen[element] = true -- set the element to seen
end
end
return duplicated
end
The logic behind the if seen[element] then, is that we check if we've already seen the element before in the table. As if they key doesn't exist nill will be returned which is evaluated which is false (this is not the same as boolean false, there are two types of false in lua!).
You can use this function like so:
t = {'a','b','a','c','c','c','d'}
for key,_ in pairs(findDuplicates(t)) do
print(key)
end
However that function won't work with multidimensional tables, this one will however:
function findDuplicates(t)
seen = {} --keep record of elements we've seen
duplicated = {} --keep a record of duplicated elements
local function traverse(subt)
for i=1, #subt do
element = subt[i]
if type(element) == 'table' then
traverse(element)
else
if seen[element] then
duplicated[element] = true
else
seen[element] = true
end
end
end
end
traverse(t)
return duplicated
end
Example useage:
t = {'a',{'b','a'},'c',{'c',{'c'}},'d'}
for k,_ in pairs(findDuplicates(t)) do
print(k)
end
Outputs
a
c
t = {a='a',b='b',c='c',d='c',e='a',f='d'}
function findDuplicates(t)
seen = {}
duplicated = {}
for key,val in pairs(t) do
if seen[val] then
duplicated[val] = true
else
seen[val] = true
end
end
return duplicated
end
This works the same way as before but checks if the same value is associated with a different key and if so, makes note of that value as being duplicated.

Eventually this is the code which worked for me. I've been asked to post it as a separate answer so here goes.
for i=1,#table1,1 do
if not table2[table1[i].fieldname] then
table2[table1[i].fieldname] = true
end
end
for k,v in pairs(table2) do
print(k)
end

Related

Lua: Merge 2 strings collections in a case-insensitive manner

I want to merge two strings collections in a case-insensitive manner:
string_collection1 = {"hello","buddy","world","ciao"}
string_collection2 = {"Hello","Buddy","holly","Bye", "bYe"}
merged_string_collection = merge_case_insensitive(string_collection1,string_collection2) --> {"hello","buddy","world","holly","bye","ciao"}
Here's an attempt, but it does not work...
function merge_case_insensitive(t1,t2)
t3 = {}
for _,s1 in pairs(t1) do
for _,s2 in pairs(t2) do
if string.lower(s1) == string.lower(s2) then
t3[s1] = s1
end
end
end
t4 = {}
i = 1
for s,_ in pairs(t3) do
t4[i] = string.lower(s)
i = i + 1
end
return t4
end
string_collection1 = {"hello","buddy","world","ciao"}
string_collection2 = {"Hello","Buddy","holly","Bye", "bYe"}
merged_string_collection = merge_case_insensitive(string_collection1,string_collection2)
for k,v in pairs(merged_string_collection) do print(k,v) end
It does not work because you use == to compare both strings which is case-sensitive.
You could do something like string.lower(s1) == string.lower(s2) to fix that.
Edit:
As you can't figure out the rest yourself, here's some code:
local t1 = {"hello","buddy","world","ciao"}
local t2 = {"Hello","Buddy","holly","Bye", "bYe"}
local aux_table = {}
local merged_table = {}
for k,v in pairs(t1) do
aux_table[v:lower()] = true
end
for k,v in pairs(t2) do
aux_table[v:lower()] = true
end
for k,v in pairs(aux_table) do
table.insert(merged_table, k)
end
merged_table now contains the lower case version of every word in both input tables.
Now pour that into a function that takes any number of input tables and you are done.
What we did here: we use the lower case version of every word in those tables and store them in a list. aux_table[string.lower("Hello")] will index the same value as aux_table[string.lower("hello")]. So we end up with one entry for each word, even if a word comes in multiple variations.
Using the keys saves us the hassle of comparing strings and distiguishing between unique words and others.
To get a table with all strings from two other tables appearing once (without regard to case), you need something like this:
function merge_case_insensitive(t1,t2)
local ans = {}
for _,v in pairs(t1) do ans[v:lower()] = true end
for _,v in pairs(t2) do ans[v:lower()] = true end
return ans
end
string_collection1 = {"hello","buddy","world","ciao"}
string_collection2 = {"Hello","Buddy","holly","Bye", "bYe"}
merged_string_collection = merge_case_insensitive(string_collection1,string_collection2)
for k in pairs(merged_string_collection) do print(k) end
Edit: And in case you want an array result (without adding another iteration)
function merge_case_insensitive(t1,t2)
local ans = {}
local
function add(t)
for _,v in pairs(t) do
v = v:lower()
if ans[v] == nil then ans[#ans+1] = v end
ans[v] = true
end
end
add(t1)
add(t2)
return ans
end
string_collection1 = {"hello","buddy","world","ciao"}
string_collection2 = {"Hello","Buddy","holly","Bye", "bYe"}
merged_string_collection = merge_case_insensitive(string_collection1,string_collection2)
for _,v in ipairs(merged_string_collection) do print(v) end
We can do this by simply iterations over both tables, and storing a temporary dictionary for checking what words we have already found, and if not there yet, putting them in our new array:
function Merge(t1, t2)
local found = {} --Temporary dictionary
local new = {} --New array
local low --Value to store low versions of words in later
for i,v in ipairs(t1) do --Begin iterating over table one
low = v:lower()
if not found[low] then --If not found yet
new[#new+1] = low --Put it in the new table
found[low] = true --Add it to found
end
end
for i,v in ipairs(t2) do --Repeat with table 2
low = v:lower()
if not found[low] then
new[#new+1] = low
found[low] = true
end
end
return new --Return the new array
end
This method eliminates the need for a third iteration, like in Piglet's answer, and doesn't keep redefining a function and closure and calling them like in tonypdmtr's answer.

Count number of occurrences of a value in Lua table

Basically what I want to do is convert a table of this format
result={{id="abcd",dmg=1},{id="abcd",dmg=1},{id="abcd",dmg=1}}
to a table of this format:
result={{id="abcd",dmg=1, qty=3}}
so I need to know how many times does {id="abcd",dmg=1} occur in the table. Does anybody know a better way of doing this than just nested for loops?
result={{id="abcd",dmg=1},{id="defg",dmg=2},{id="abcd",dmg=1},{id="abcd",dmg=1}}
local t, old_result = {}, result
result = {}
for _, v in ipairs(old_result) do
local h = v.id..'\0'..v.dmg
v = t[h] or table.insert(result, v) or v
t[h], v.qty = v, (v.qty or 0) + 1
end
-- result: {{id="abcd",dmg=1,qty=3},{id="defg",dmg=2,qty=1}}
So you want to clear duplicate contents, although a better solution is to not let dupe contents in, here you go:
function Originals(parent)
local originals = {}
for i,object in ipairs(parent) do
for ii,orig in ipairs(originals) do
local dupe = true
for key, val in pairs(object) do
if val ~= orig[key] then
dupe = false
break
end
end
if not dupe then
originals[#originals+1] = object
end
end
return originals
end
I tried to make the code self explanatory, but the general idea is that it loops through and puts all the objects with new contents aside, and returns them after.
Warning: Code Untested

Lua - Table.hasValue returning nil

I have a table something like this:
table = {milk, butter, cheese} -- without "Quotation marks"
I was searching for a way to check if a given value is in the table or not, and found this:
if table.hasValue(table, milk) == true then ...
but it returns nil, any reason why? (it says .hasValue is invalid) or can I get an alternative to check if value exists in that table? I tried several ways like:
if table.milk == true then ...
if table[milk] == true then ...
All of these returns nil or false.
you can try this
items = {milk=true, butter=true, cheese=true}
if items.milk then
...
end
OR
if items.butter == true then
...
end
A Lua table can act as an array or as an associative array (map).
There is no hasValue, but by using a table as an associative array you can easily implement it efficiently:
local table = {
milk = true,
butter = true,
cheese = true,
}
-- has milk?
if table.milk then
print("Has milk!")
end
if table.rocks then
print("Has rocks!")
end
You have a few options here.
One, is to create a set:
local set = {
foo = true,
bar = true,
baz = true
}
Then you check if either of these are in the table:
if set.bar then
The drawback to this approach is that you won't iterate over it in any specific order (pairs returns items in an arbitrary order).
Another option would be to use a function to check each value in a table. This'll be very slow in large tables, which brings us to back to a modification of the first option: A reverse lookup generator: (This is what I'd recommend doing -- unless your set is static)
local data = {"milk", "butter", "cheese"}
local function reverse(tbl, target)
local target = target or {}
for k, v in pairs(tbl) do
target[v] = k
end
return target
end
local revdata = reverse(data)
print(revdata.cheese, revdata.butter, revdata.milk)
-- Output: 3 2 1
This'll generate a set (with the added bonus of giving you the index where the value was in your original table). You can also put the reverse into the same table as the data was in, but this won't go well with numbers (and it'll be messy if you need to generate the reverse again).
If you write table = {milk=true, butter=true, cheese=true}, then you can use if table.milk == true then ....

Remove duplicates from LUA Table by timestamp

I was on stack a few days back for help inserting records to prevent duplicates. However the process to enter these is slow and they slip in.
I have a user base of about 10,000 players, and they have duplicate entries.. I've been trying to filter out these duplicates without success. The examples on stack have no panned out for me.
Here is a clip from my table
[18] =
{
["soldAmount"] = 25,
["buyer"] = [[#playername]],
["timestampz"] = 1398004426,
["secsSinceEvent"] = 55051,
["guildName"] = [[TradingGuild]],
["eventType"] = 15,
["seller"] = [[#myname]],
},
[19] =
{
["soldAmount"] = 25,
["buyer"] = [[#playername]],
["timestampz"] = 1398004426,
["secsSinceEvent"] = 55051,
["guildName"] = [[TradingGuild]],
["eventType"] = 15,
["seller"] = [[#myname]],
},
The timestamp's match and they should not have been added.
for k,v in pairs(sellHistory) do mSavedTHVars.Forever_Sales[k] = v
if mSavedTHVars.Forever_Sales.timestampz ~= sellHistory.timestampz then
table.insert(mSavedTHVars.Forever_Sales, sellHistory)
end end
Now, I need to find out how to remove the current duplicates, and here is what I've tried.
function table_unique(tt)
local newtable = {}
for ii,xx in ipairs(tt) do
if table_count(newtable.timestampz, xx) ~= tt.timestampz then
newtable[#newtable+1] = xx
end
end
return newtable
end
I hope this information provided was clean and understandable.
Thanks!
UPDATE
Attempt #20 ;)
for k,v in pairs(mSavedTHVars.Forever_Sales) do
if v == mSavedTHVars.Forever_Sales.timestampz then
table.remove(mSavedTHVars.Forever_Sales,k)
end
end
No luck yet.
UPDATE
This has worked
for k,v in pairs(mSavedTHVars.Forever_Sales) do mSavedTHVars.Forever_Sales[k] = v
if v.timestampz == mSavedTHVars.Forever_Sales.timestampz then
table.remove(mSavedTHVars.Forever_Sales, k)
end
end
IS this a good approach?
Assuming that mSavedTHVars.Forever_Sales[18] and mSavedTHVars.Forever_Sales[19] are the tables you listed in your post, then to remove all duplicates based on same time stamp it is easiest to create a "set" based on timestamp (since the timestamp is your condition for uniqueness). Loop through your mSavedTHVars.Forever_Sales and for each item, add item to new table only if its timestamp not already in set:
function removeDuplicates(tbl)
local timestamps = {}
local newTable = {}
for index, record in ipairs(tbl) do
if timestamps[record.timestampz] == nil then
timestamps[record.timestampz] = 1
table.insert(newTable, record)
end
end
return newTable
end
mSavedTHVars.Forever_Sales = removeDuplicates(mSavedTHVars.Forever_Sales)
Update based on Question Update:
My comment on following proposed solution:
for k,v in pairs(mSavedTHVars.Forever_Sales) do
mSavedTHVars.Forever_Sales[k] = v
if v.timestampz == mSavedTHVars.Forever_Sales.timestampz then
table.remove(mSavedTHVars.Forever_Sales, k)
end
end
The problem is that I don't see how that can work. When you do for k,v in pairs(mSavedTHVars.Forever_Sales) do then v is mSavedTHVars.Forever_Sales[k] so the next line mSavedTHVars.Forever_Sales[k] = v does nothing. Then if v.timestampz == mSavedTHVars.Forever_Sales.timestampz compares the timestamp of v, i.e. of mSavedTHVars.Forever_Sales[k], with value of a timestampz field in mSavedTHVars.Forever_Sales. But latter is a table without such field, so right-hand-side of == will be nil, so the condition will only be true if v.timestampz is nil, which I don't think is ever the case.
The main reason that I used a solution of creating new table instead of removing duplicates from the existing table is that you can edit a table while iterating over it with pairs or ipairs. If you were to use a reverse counter, it would probably be ok (but I have not tested, test to be sure):
function removeDuplicates(tbl)
local timestamps = {}
local numItems = #tbl
for index=numItems, 1, -1, do
local record = tbl[index]
if timestamps[record.timestampz] ~= nil then
table.remove(newTable, index)
end
timestamps[record.timestampz] = 1
end
end
Also I think the intent of the function is not as clear, but maybe this is just personal preference.

Trying to compare all entries of one table in Lua

Basically I have a table of objects, each of these objects has one particular field that is a number. I'm trying to see if any of these numerical entries match up and I can't think of a way to do it. I thought possibly a double for loop, one loop iterating through the table, the other decrementing, but won't this at some point lead to two values being compared twice? I'm worried that it may appear to work on the surface but actually have subtle errors. This is how I pictured the code:
for i = #table, 1, -1 do
for j = 1, #table do
if( table[i].n == table[j].n ) then
table.insert(table2, table[i])
table.insert(table2, table[j])
end
end
end
I want to insert the selected objects, as tables, into another pre made one without any duplicates.
Let the outer loop run over the table, and let the inner loop always start one element ahead of the outer one - this avoids double counting and comparing objects with themselves. Also, if you call the table you want to examine table that will probably hide the table library in which you want to access insert. So let's say you call your input table t:
for i = 1, #t do
for j = i+1, #t do
if( t[i].n == t[j].n ) then
table.insert(table2, t[i])
table.insert(table2, t[j])
end
end
end
Still, if three or more elements have the same value n you will add some of them multiple times. You could use another table to remember which elements you've already inserted:
local done = {}
for i = 1, #t do
for j = i+1, #t do
if( t[i].n == t[j].n ) then
if not done[i] then
table.insert(table2, t[i])
done[i] = true
end
if not done[j] then
table.insert(table2, t[j])
done[j] = true
end
end
end
end
I admit this isn't really elegant, but it's getting late over here, and my brain refuses to think of a neater approach.
EDIT: In fact... using another table, you can reduce this to a single loop. When you encounter a new n you add a new value at n as the key into your helper table - the value will be that t[i] you were just analysing. If you encounter an n that already is in the table, you take that saved element and the current one and add them both to your target list - you also replace the element in the auxiliary table with true or something that's not a table:
local temp = {}
for i = 1, #t do
local n = t[i].n
if not temp[n] then
temp[n] = t[i]
else
if type(temp[n]) == "table" then
table.insert(table2, temp[n])
temp[n] = true
end
table.insert(table2, t[i])
end
end

Resources