Remove duplicates from LUA Table by timestamp - lua

I was on stack a few days back for help inserting records to prevent duplicates. However the process to enter these is slow and they slip in.
I have a user base of about 10,000 players, and they have duplicate entries.. I've been trying to filter out these duplicates without success. The examples on stack have no panned out for me.
Here is a clip from my table
[18] =
{
["soldAmount"] = 25,
["buyer"] = [[#playername]],
["timestampz"] = 1398004426,
["secsSinceEvent"] = 55051,
["guildName"] = [[TradingGuild]],
["eventType"] = 15,
["seller"] = [[#myname]],
},
[19] =
{
["soldAmount"] = 25,
["buyer"] = [[#playername]],
["timestampz"] = 1398004426,
["secsSinceEvent"] = 55051,
["guildName"] = [[TradingGuild]],
["eventType"] = 15,
["seller"] = [[#myname]],
},
The timestamp's match and they should not have been added.
for k,v in pairs(sellHistory) do mSavedTHVars.Forever_Sales[k] = v
if mSavedTHVars.Forever_Sales.timestampz ~= sellHistory.timestampz then
table.insert(mSavedTHVars.Forever_Sales, sellHistory)
end end
Now, I need to find out how to remove the current duplicates, and here is what I've tried.
function table_unique(tt)
local newtable = {}
for ii,xx in ipairs(tt) do
if table_count(newtable.timestampz, xx) ~= tt.timestampz then
newtable[#newtable+1] = xx
end
end
return newtable
end
I hope this information provided was clean and understandable.
Thanks!
UPDATE
Attempt #20 ;)
for k,v in pairs(mSavedTHVars.Forever_Sales) do
if v == mSavedTHVars.Forever_Sales.timestampz then
table.remove(mSavedTHVars.Forever_Sales,k)
end
end
No luck yet.
UPDATE
This has worked
for k,v in pairs(mSavedTHVars.Forever_Sales) do mSavedTHVars.Forever_Sales[k] = v
if v.timestampz == mSavedTHVars.Forever_Sales.timestampz then
table.remove(mSavedTHVars.Forever_Sales, k)
end
end
IS this a good approach?

Assuming that mSavedTHVars.Forever_Sales[18] and mSavedTHVars.Forever_Sales[19] are the tables you listed in your post, then to remove all duplicates based on same time stamp it is easiest to create a "set" based on timestamp (since the timestamp is your condition for uniqueness). Loop through your mSavedTHVars.Forever_Sales and for each item, add item to new table only if its timestamp not already in set:
function removeDuplicates(tbl)
local timestamps = {}
local newTable = {}
for index, record in ipairs(tbl) do
if timestamps[record.timestampz] == nil then
timestamps[record.timestampz] = 1
table.insert(newTable, record)
end
end
return newTable
end
mSavedTHVars.Forever_Sales = removeDuplicates(mSavedTHVars.Forever_Sales)
Update based on Question Update:
My comment on following proposed solution:
for k,v in pairs(mSavedTHVars.Forever_Sales) do
mSavedTHVars.Forever_Sales[k] = v
if v.timestampz == mSavedTHVars.Forever_Sales.timestampz then
table.remove(mSavedTHVars.Forever_Sales, k)
end
end
The problem is that I don't see how that can work. When you do for k,v in pairs(mSavedTHVars.Forever_Sales) do then v is mSavedTHVars.Forever_Sales[k] so the next line mSavedTHVars.Forever_Sales[k] = v does nothing. Then if v.timestampz == mSavedTHVars.Forever_Sales.timestampz compares the timestamp of v, i.e. of mSavedTHVars.Forever_Sales[k], with value of a timestampz field in mSavedTHVars.Forever_Sales. But latter is a table without such field, so right-hand-side of == will be nil, so the condition will only be true if v.timestampz is nil, which I don't think is ever the case.
The main reason that I used a solution of creating new table instead of removing duplicates from the existing table is that you can edit a table while iterating over it with pairs or ipairs. If you were to use a reverse counter, it would probably be ok (but I have not tested, test to be sure):
function removeDuplicates(tbl)
local timestamps = {}
local numItems = #tbl
for index=numItems, 1, -1, do
local record = tbl[index]
if timestamps[record.timestampz] ~= nil then
table.remove(newTable, index)
end
timestamps[record.timestampz] = 1
end
end
Also I think the intent of the function is not as clear, but maybe this is just personal preference.

Related

Trying to build a table of unique values in LUA

I am trying to build a table and add to it at the end each time I get a returned value that is not already in the table. So basically what I have so far is not working at all. I'm new to LUA but not to programming in general.
local DB = {}
local DBsize = 0
function test()
local classIndex = select(3, UnitClass("player")) -- This isn't the real function, just a sample
local cifound = False
if classIndex then
if DBsize > 0 then
for y = 1, DBsize do
if DB[y] == classIndex then
cifound = True
end
end
end
if not cifound then
DBsize = DBsize + 1
DB[DBsize] = classIndex
end
end
end
Then later I'm trying to use another function to print the contents of the table:
local x = 0
print(DBsize)
for x = 1, DBsize do
print(DB[x])
end
Any help would be much appreciated
Just store a value in the table using your unique value as a key. That way you don't have to check wether a value already exists. You simply overwrite any existing keys if you have it a second time.
Simple example that stores unique values from 100 random values.
local unique_values = {}
for i = 1, 100 do
local random_value = math.random(10)
unique_values[random_value] = true
end
for k,v in pairs(unique_values) do print(k) end

Lua: Merge 2 strings collections in a case-insensitive manner

I want to merge two strings collections in a case-insensitive manner:
string_collection1 = {"hello","buddy","world","ciao"}
string_collection2 = {"Hello","Buddy","holly","Bye", "bYe"}
merged_string_collection = merge_case_insensitive(string_collection1,string_collection2) --> {"hello","buddy","world","holly","bye","ciao"}
Here's an attempt, but it does not work...
function merge_case_insensitive(t1,t2)
t3 = {}
for _,s1 in pairs(t1) do
for _,s2 in pairs(t2) do
if string.lower(s1) == string.lower(s2) then
t3[s1] = s1
end
end
end
t4 = {}
i = 1
for s,_ in pairs(t3) do
t4[i] = string.lower(s)
i = i + 1
end
return t4
end
string_collection1 = {"hello","buddy","world","ciao"}
string_collection2 = {"Hello","Buddy","holly","Bye", "bYe"}
merged_string_collection = merge_case_insensitive(string_collection1,string_collection2)
for k,v in pairs(merged_string_collection) do print(k,v) end
It does not work because you use == to compare both strings which is case-sensitive.
You could do something like string.lower(s1) == string.lower(s2) to fix that.
Edit:
As you can't figure out the rest yourself, here's some code:
local t1 = {"hello","buddy","world","ciao"}
local t2 = {"Hello","Buddy","holly","Bye", "bYe"}
local aux_table = {}
local merged_table = {}
for k,v in pairs(t1) do
aux_table[v:lower()] = true
end
for k,v in pairs(t2) do
aux_table[v:lower()] = true
end
for k,v in pairs(aux_table) do
table.insert(merged_table, k)
end
merged_table now contains the lower case version of every word in both input tables.
Now pour that into a function that takes any number of input tables and you are done.
What we did here: we use the lower case version of every word in those tables and store them in a list. aux_table[string.lower("Hello")] will index the same value as aux_table[string.lower("hello")]. So we end up with one entry for each word, even if a word comes in multiple variations.
Using the keys saves us the hassle of comparing strings and distiguishing between unique words and others.
To get a table with all strings from two other tables appearing once (without regard to case), you need something like this:
function merge_case_insensitive(t1,t2)
local ans = {}
for _,v in pairs(t1) do ans[v:lower()] = true end
for _,v in pairs(t2) do ans[v:lower()] = true end
return ans
end
string_collection1 = {"hello","buddy","world","ciao"}
string_collection2 = {"Hello","Buddy","holly","Bye", "bYe"}
merged_string_collection = merge_case_insensitive(string_collection1,string_collection2)
for k in pairs(merged_string_collection) do print(k) end
Edit: And in case you want an array result (without adding another iteration)
function merge_case_insensitive(t1,t2)
local ans = {}
local
function add(t)
for _,v in pairs(t) do
v = v:lower()
if ans[v] == nil then ans[#ans+1] = v end
ans[v] = true
end
end
add(t1)
add(t2)
return ans
end
string_collection1 = {"hello","buddy","world","ciao"}
string_collection2 = {"Hello","Buddy","holly","Bye", "bYe"}
merged_string_collection = merge_case_insensitive(string_collection1,string_collection2)
for _,v in ipairs(merged_string_collection) do print(v) end
We can do this by simply iterations over both tables, and storing a temporary dictionary for checking what words we have already found, and if not there yet, putting them in our new array:
function Merge(t1, t2)
local found = {} --Temporary dictionary
local new = {} --New array
local low --Value to store low versions of words in later
for i,v in ipairs(t1) do --Begin iterating over table one
low = v:lower()
if not found[low] then --If not found yet
new[#new+1] = low --Put it in the new table
found[low] = true --Add it to found
end
end
for i,v in ipairs(t2) do --Repeat with table 2
low = v:lower()
if not found[low] then
new[#new+1] = low
found[low] = true
end
end
return new --Return the new array
end
This method eliminates the need for a third iteration, like in Piglet's answer, and doesn't keep redefining a function and closure and calling them like in tonypdmtr's answer.

Can't figure out this table arrangement

--The view of the table
local originalStats = {
Info = {Visit = false, Name = "None", Characters = 1},
Stats = {Levels = 0, XP = 0, XP2 = 75, Silver = 95},
Inventory = {
Hats = {"NoobHat"},
Robes = {"NoobRobe"},
Boots = {"NoobBoot"},
Weapons = {"NoobSword"}
}
}
local tempData = {}
--The arrangement here
function Module:ReadAll(player)
for k,v in pairs(tempData[player]) do
if type(v) == 'table' then
for k2, v2 in pairs(v) do
print(k2) print(v2)
if type(v2) == 'table' then
for k3, v3 in pairs(v2) do
print(k3) print(v3)
end
else
print(k2) print(v2)
end
end
else
print(k) print(v)
end
end
end
I'm sorry, but I can't seem to figure out how to arrange this 'ReadAll' function to where It'll show all the correct stats in the right orders.
The output is something like this:
Boots
table: 1A73CF10
1
NoobBoot
Weapons
table: 1A7427F0
1
NoobSword
Robes
table: 1A743D50
1
NoobRobe
Hats
table: 1A73C9D0
1
NoobHat
XP2
75
XP2
75
Levels
2
Levels
2
XP
0
XP
0
Here's a way to print all the elements without double or table reference values showing up.
As the name states, this function will print all the elements within a table, no matter how many nested tables there are inside it. I don't have a way to order them at the moment, but I'll update my answer if I find a way. You can also get rid of the empty spaces in the print line, I just used it so it would look neater. Let me know if it works.
function allElementsInTable(table)
for k,v in pairs(table) do
if type(table[k]) == 'table' then
print(k .. ":")
allElementsInTable(v)
else
print(" " .. k .. " = " .. tostring(v))
end
end
end
--place the name of your table in the parameter for this function
allElementsInTable(originalStats)
After more experimenting, I got this, if anyone wants it, feel free to use it.
tempData = { Info = {Visit = false, Name = 'None'},
Stats = {LVL = 0, XP = 0, Silver = 75},
Inventory = { Armors = {'BasicArmor'},
Weapons = {'BasicSword'} }
}
function Read()
for i, v in pairs(tempData['Info']) do
print(i..'\t',v)
end
----------
for i2, v2 in pairs(tempData['Stats']) do
print(i2..'\t',v2)
end
----------
for i3, v3 in pairs(tempData['Inventory']) do
print(i3..':')
for i4, v4 in pairs(v3) do
print('\t',v4)
end
end
end
Read()
Don't expect table's fields to be iterated with pairs() in some specific order. Internally Lua tables are hashtables, and the order of fields in it is not specified at all. It will change between runs, you can't have them iterated in the same order as they were filled.Only arrays with consecutive integer indices will maintain the order of their elements.

Count number of occurrences of a value in Lua table

Basically what I want to do is convert a table of this format
result={{id="abcd",dmg=1},{id="abcd",dmg=1},{id="abcd",dmg=1}}
to a table of this format:
result={{id="abcd",dmg=1, qty=3}}
so I need to know how many times does {id="abcd",dmg=1} occur in the table. Does anybody know a better way of doing this than just nested for loops?
result={{id="abcd",dmg=1},{id="defg",dmg=2},{id="abcd",dmg=1},{id="abcd",dmg=1}}
local t, old_result = {}, result
result = {}
for _, v in ipairs(old_result) do
local h = v.id..'\0'..v.dmg
v = t[h] or table.insert(result, v) or v
t[h], v.qty = v, (v.qty or 0) + 1
end
-- result: {{id="abcd",dmg=1,qty=3},{id="defg",dmg=2,qty=1}}
So you want to clear duplicate contents, although a better solution is to not let dupe contents in, here you go:
function Originals(parent)
local originals = {}
for i,object in ipairs(parent) do
for ii,orig in ipairs(originals) do
local dupe = true
for key, val in pairs(object) do
if val ~= orig[key] then
dupe = false
break
end
end
if not dupe then
originals[#originals+1] = object
end
end
return originals
end
I tried to make the code self explanatory, but the general idea is that it loops through and puts all the objects with new contents aside, and returns them after.
Warning: Code Untested

Finding duplicates in a multi-dimensional table

A slightly altered version of the below permitted me to filter unique field values out of a multi-dimensional table (dictionary style).
[ url ] http://rosettacode.org/wiki/Remove_duplicate_elements#Lua
items = {1,2,3,4,1,2,3,4,"bird","cat","dog","dog","bird"}`
flags = {}
io.write('Unique items are:')
for i=1,#items do
if not flags[items[i]] then
io.write(' ' .. items[i])
flags[items[i]] = true
end
end
io.write('\n')`
What I'm lost at is what 'if not ... then ... end' part actually does. To me this is sillyspeak but hey, it works ;-) Now i want to know what happens under the hood.
I hope multi-dimensional does not offend anyone, I'm referring to a table consisting of multiple row containing multiple key-value pairs on each row.
Here's the code i'm using, no brilliant adaptation but good enough to filter unique values on a fieldname
for i=1,#table,1 do
if not table2[table[i].fieldname] then
table2[table[i].fieldname] = true
end
end
for k,v in pairs(table2) do
print(k)
end
function findDuplicates(t)
seen = {} --keep record of elements we've seen
duplicated = {} --keep a record of duplicated elements
for i = 1, #t do
element = t[i]
if seen[element] then --check if we've seen the element before
duplicated[element] = true --if we have then it must be a duplicate! add to a table to keep track of this
else
seen[element] = true -- set the element to seen
end
end
return duplicated
end
The logic behind the if seen[element] then, is that we check if we've already seen the element before in the table. As if they key doesn't exist nill will be returned which is evaluated which is false (this is not the same as boolean false, there are two types of false in lua!).
You can use this function like so:
t = {'a','b','a','c','c','c','d'}
for key,_ in pairs(findDuplicates(t)) do
print(key)
end
However that function won't work with multidimensional tables, this one will however:
function findDuplicates(t)
seen = {} --keep record of elements we've seen
duplicated = {} --keep a record of duplicated elements
local function traverse(subt)
for i=1, #subt do
element = subt[i]
if type(element) == 'table' then
traverse(element)
else
if seen[element] then
duplicated[element] = true
else
seen[element] = true
end
end
end
end
traverse(t)
return duplicated
end
Example useage:
t = {'a',{'b','a'},'c',{'c',{'c'}},'d'}
for k,_ in pairs(findDuplicates(t)) do
print(k)
end
Outputs
a
c
t = {a='a',b='b',c='c',d='c',e='a',f='d'}
function findDuplicates(t)
seen = {}
duplicated = {}
for key,val in pairs(t) do
if seen[val] then
duplicated[val] = true
else
seen[val] = true
end
end
return duplicated
end
This works the same way as before but checks if the same value is associated with a different key and if so, makes note of that value as being duplicated.
Eventually this is the code which worked for me. I've been asked to post it as a separate answer so here goes.
for i=1,#table1,1 do
if not table2[table1[i].fieldname] then
table2[table1[i].fieldname] = true
end
end
for k,v in pairs(table2) do
print(k)
end

Resources