Convert a .csv file into a 2D table in Lua - lua

As the title suggests, I'd like to know how to convert a .csv file into a 2D table in Lua.
So, for example, say I have a .csv file that looks like this:
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0,-1,-1,-1,-1,-1,-1,-1,-1,0,0,-1,-1,-1,-1,-1,-1,-1,-1,0
0,-1,-1,-1,-1,-1,-1,-1,-1,0,0,-1,-1,-1,-1,-1,-1,-1,-1,0
0,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,0
0,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,0
0,0,-1,-1,-1,-1,-1,-1,0,0,0,0,-1,-1,-1,-1,-1,-1,0,0
0,0,0,-1,-1,-1,-1,0,0,0,0,0,0,-1,-1,-1,-1,0,0,0
0,0,0,0,-1,-1,0,0,0,0,0,0,0,0,-1,-1,0,0,0,0
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
How would I convert it into something like this?
local example_table = {{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
{0,-1,-1,-1,-1,-1,-1,-1,-1,0,0,-1,-1,-1,-1,-1,-1,-1,-1,0},
{0,-1,-1,-1,-1,-1,-1,-1,-1,0,0,-1,-1,-1,-1,-1,-1,-1,-1,0},
{0,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,0},
{0,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,0},
{0,0,-1,-1,-1,-1,-1,-1,0,0,0,0,-1,-1,-1,-1,-1,-1,0,0},
{0,0,0,-1,-1,-1,-1,0,0,0,0,0,0,-1,-1,-1,-1,0,0,0},
{0,0,0,0,-1,-1,0,0,0,0,0,0,0,0,-1,-1,0,0,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}}
Your help will be greatly appreciated.

1. Don't underestimate CSV.
If you need it generic, get a proper CSV parsing library. If you do the parsing yourself, you will miss lots of special cases that could happen, so it's only suitable for cases where you know your data and would notice if something went wrong.
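To make the "special cases" concrete, here is a small sketch of a naive comma split (naive_split is just a name made up for this illustration) and two inputs where it gives the wrong field count: a quoted field containing a comma, and an empty field.

```lua
-- Naive approach: treat every run of non-comma characters as one field.
local function naive_split(line)
  local fields = {}
  for item in line:gmatch("[^,]+") do
    fields[#fields + 1] = item
  end
  return fields
end

-- A quoted field that contains a comma is cut in two.
print(#naive_split('1,"Smith, John",3'))  --> 4, although the line has 3 fields

-- An empty field between two commas disappears entirely.
print(#naive_split('1,,3'))               --> 2, although the line has 3 fields
```

For a grid of plain numbers like the one in the question neither case occurs, which is why the simple approach below is good enough there.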
2. Changing the file
If you want the equivalent Lua code as output, assuming you're doing the parsing in Lua, you could do something like this:
local input = get_input_somehow() -- probably using io.open, etc.
local output =
    "local example_table = {\n"
    ..
    -- gsub (not gmatch) takes a replacement function; it is called once per non-empty line
    input:gsub("[^\n]+", function(line)
        return "{" .. line .. "};"
    end)
    ..
    "\n}"
save_output_somehow(output) -- Probably just write to a new file
3. Parsing CSV into a table
If you want to read the CSV file directly into a Lua table, you could instead do it like this:
local input = get_input_somehow() -- probably using io.open, etc.
local output = {}
for line in input:gmatch("[^\n]+") do -- gmatch returns an iterator: one step per non-empty line
    local row = {}
    table.insert(output, row)
    for item in line:gmatch("[^,]+") do -- one step per comma-separated field
        table.insert(row, tonumber(item))
    end
end
do_something_with(output) -- Whatever you need your data for
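As a self-contained sketch of the same idea reading straight from a file: the helper name load_csv_grid is hypothetical, and it assumes every field is a plain number with no quoting or empty fields.

```lua
-- Hypothetical helper: read a purely numeric CSV file into a table of rows.
local function load_csv_grid(path)
  local grid = {}
  for line in io.lines(path) do            -- io.lines raises an error if the file cannot be opened
    line = line:gsub("\r$", "")            -- tolerate Windows line endings
    if line ~= "" then
      local row = {}
      for item in line:gmatch("[^,]+") do
        -- Assumption: all fields are numeric. A non-numeric field would yield
        -- nil here and be skipped, shifting the remaining columns left.
        row[#row + 1] = tonumber(item)
      end
      grid[#grid + 1] = row
    end
  end
  return grid
end
```

After `local grid = load_csv_grid("level.csv")` (the file name is made up), `grid[2][3]` is the third value on the second line, which is the same layout as example_table in the question.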

Related

How to create a tables with variable length with string-like keys in lua

I have a file database. Inside that file I have something like:
DB_A = ...
DB_B = ...
.
.
.
DB_N = ...
I would like to parse the data and group them in lua code like this:
data={}
-- the result after parsing a file
data={
["DB_A"] = {...},
["DB_B"] = {...},
.
.
.
["DB_N"] = {...}
}
In other words, is it possible to create a table inside a table dynamically and assign a key to each inner table without knowing the key names in advance (I can only figure them out after parsing the data from the database)?
(Just as a note, I am using Lua 5.3.5; also, I apologize that my code resembles C more than Lua!)
Iterating through your input file line-by-line--which can be done with the Lua FILE*'s lines method--you can use string.match to grab the information you are looking for from each line.
#!/usr/bin/lua
local PATTERN = "(%S+)%s?=%s?(%S+)"
-- C-style formatted output to standard error
local function eprintf(fmt, ...)
    io.stderr:write(string.format(fmt, ...))
    return
end
-- C-style formatted output to standard output
local function printf(fmt, ...)
    io.stdout:write(string.format(fmt, ...))
    return
end
local function make_table_from_file(filename)
    local input = assert(io.open(filename, "r"))
    local data = {}
    for line in input:lines() do
        local key, value = string.match(line, PATTERN)
        if key then -- skip lines that do not match; a nil key would raise an error
            data[key] = value
        end
    end
    input:close()
    return data
end
local function main(argc, argv)
    if (argc < 1) then
        eprintf("Filename expected from command line\n")
        os.exit(1)
    end
    local data = make_table_from_file(argv[1])
    for k, v in pairs(data) do
        printf("data[%s] = %s\n", k, v)
    end
    return 0
end
main(#arg, arg)
The variable declared at the top of the file, PATTERN, is the capture pattern used by string.match. If you are unfamiliar with how Lua's pattern matching works, this pattern looks for a series of non-space characters, an optional single space, an equal sign, another optional single space, and then another series of non-space characters. The two series of non-space characters are the two captures, key and value, returned by string.match in the function make_table_from_file.
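To get a feel for what this pattern accepts, here is a short sketch that runs it over a few sample lines. The lines are made up for illustration, since the real file contents are not shown in the question.

```lua
local PATTERN = "(%S+)%s?=%s?(%S+)"

-- Sample lines invented for this illustration.
local samples = {
  "DB_A = 42",         -- key "DB_A", value "42"
  "DB_B=hello",        -- spaces around "=" are optional: key "DB_B", value "hello"
  "DB_C  =  7",        -- two spaces on each side: no match, both results are nil
  "DB_D = two words",  -- value stops at the first space: key "DB_D", value "two"
}

for _, line in ipairs(samples) do
  local key, value = string.match(line, PATTERN)
  print(line, key, value)
end
```

If your file allows more than one space around the equal sign, or values containing spaces, something like "(%S+)%s*=%s*(.+)" may be closer to what you need.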
The functions eprintf and printf are my Lua versions of C-style formatted output functions. The former writes to standard error, io.stderr in Lua; and the latter writes to standard output, io.stdout in Lua.
In your question, you give a sample of what your expected output is. Within your table data, you want it to contain keys that correspond to tables as values. Based on the sample input text you provided, I assume the data contained within these tables are whatever comes to the right of the equal signs in the input file--which you represent with .... As I do not know what exactly those ...s represent, I cannot give you a solid example for how to separate that right-hand data into a table. Depending on what you are looking to do, you could take the second variable returned by string.match, which I called value, and further separate it using Lua's string pattern matching. It could look something like this:
...
local function make_table_from_value(val)
    -- Split `val` into distinct elements to form a table with `some_pattern`
    return {string.match(val, some_pattern)}
end
local function make_table_from_file(filename)
    local input = assert(io.open(filename, "r"))
    local data = {}
    for line in input:lines() do
        local key, value = string.match(line, PATTERN)
        if key then -- skip lines that do not match PATTERN
            data[key] = make_table_from_value(value)
        end
    end
    input:close()
    return data
end
...
In make_table_from_value, string.match returns one value per capture in the pattern you provide as its second argument; enclosing the function call in curly braces collects those values into a table. It will be a table that uses numerical indices as keys, rather than strings or some other data type, starting from 1.
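Since string.match only returns the captures of a single match, a value with a variable number of elements is easier to split with string.gmatch. The following is only a sketch under a loud assumption: that the right-hand side is a comma-separated list such as 1,2,3 (the question does not say what the ... stands for). make_table_from_lines is a made-up variant that takes a list of lines instead of a file name, so the example runs on its own.

```lua
local PATTERN = "(%S+)%s?=%s?(%S+)"

-- Assumption: the right-hand side is a comma-separated list such as "1,2,3".
local function make_table_from_value(val)
  local items = {}
  for item in val:gmatch("[^,]+") do
    items[#items + 1] = tonumber(item) or item  -- keep non-numeric items as strings
  end
  return items
end

-- Hypothetical variant of make_table_from_file that works on a list of lines.
local function make_table_from_lines(lines)
  local data = {}
  for _, line in ipairs(lines) do
    local key, value = string.match(line, PATTERN)
    if key then
      data[key] = make_table_from_value(value)  -- the key is only known at run time
    end
  end
  return data
end

local data = make_table_from_lines({ "DB_A = 1,2,3", "DB_B = x,y" })
print(data.DB_A[3], data["DB_B"][1])  --> 3   x
```

Note that data[key] = ... works with any string computed at run time; data.DB_A and data["DB_A"] refer to the same entry.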

LUA: How to Create 2-dimensional array/table from string

I see several posts about turning a string into a Lua table, but my problem is a little different [I think] because there is an additional dimension to the table.
I have a table of tables saved as a file [I have no issue reading the file into a string].
Let's say we start from this point:
local tot = "{{1,2,3}, {4,5,6}}"
When I try the answers from other users I end up with:
local OneDtable = {"{1,2,3}, {4,5,6}"}
This is not what I want.
How can I properly create a table that contains those tables as entries?
Desired result:
TwoDtable = {{1,2,3}, {4,5,6}}
Thanks in advance
You can use the load function to read the content of your string as Lua code.
local myArray = "{{1,2,3}, {4,5,6}}"
local convert = "myTable = " .. myArray
local convertFunction = load(convert)
convertFunction()
print(myTable[1][1])
Now, myTable has the values in a 2-dimensional array.
For a quick solution I suggest going with the load hack, but be aware that this only works if your code happens to be formatted as a Lua table already. Otherwise, you'd have to parse the string yourself.
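If you go the load route, a slightly more defensive variant is sketched below. It assumes Lua 5.2 or later (where load accepts a string plus mode and environment arguments; on 5.1 you would need loadstring and setfenv instead), and table_from_string is just a name picked for this example. Prefixing the string with "return " avoids creating a global, and the empty environment table means the loaded code cannot reach os, io, or any other global.

```lua
-- Hypothetical helper: evaluate a string holding a Lua table literal.
-- Returns the table, or nil plus an error message.
local function table_from_string(s)
  -- "t" = text chunks only; {} = empty environment, so no globals are visible
  local chunk, err = load("return " .. s, "table literal", "t", {})
  if not chunk then
    return nil, err          -- syntax error in the string
  end
  local ok, result = pcall(chunk)
  if not ok then
    return nil, result       -- runtime error, e.g. the string tried to use a global
  end
  return result
end

local t = assert(table_from_string("{{1,2,3}, {4,5,6}}"))
print(t[2][3])  --> 6
```

This still executes whatever is in the string (an endless loop in the input would hang, for instance), so it is only reasonable for files you wrote yourself.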
For example, you could try using lpeg to build a recursive parser. I built something very similar a while ago:
local lpeg = require 'lpeg'
local name = lpeg.R('az')^1 / '\0'
local space = lpeg.S('\t ')^1
local function compile_tuple(...)
    return string.char(select('#', ...)) .. table.concat{...}
end
local expression = lpeg.P {
    'e';
    e = name + lpeg.V 't';
    t = '(' * ((lpeg.V 'e' * ',' * space)^0 * lpeg.V 'e') / compile_tuple * ')';
}
local compiled = expression:match '(foo, (a, b), bar)'
print(compiled:byte(1, -1))
Its purpose is to parse things in quotes like the example string (foo, (a, b), bar) and turn it into a binary string describing the structure; most of that happens in the compile_tuple function though, so it should be easy to modify it to do what you want.
What you'd have to adapt:
change name for number (and change the pattern accordingly to lpeg.R('09')^1 / tonumber, replacing the / '\0', so that each number is captured as a value)
change the compile_tuple function to a build_table function (local function build_table(...) return {...} end should do the trick)
Try it out and see if something else needs to be changed; I might have missed something.
You can read the lpeg manual here if you're curious about how this stuff works.
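If you would rather not depend on lpeg, parsing the string yourself is also manageable for this format. Below is a rough pure-Lua sketch of a recursive-descent parser. It is not the lpeg grammar above but a swapped-in technique; it assumes the input contains only numbers, commas, braces and whitespace, and parse_nested is a name invented for this example.

```lua
-- Parses nested brace lists of numbers such as "{{1,2,3}, {4,5,6}}".
local function parse_nested(s)
  local pos = 1

  local function skip_space()
    pos = s:find("%S", pos) or #s + 1
  end

  local parse_value  -- forward declaration, since lists and values are mutually recursive

  local function parse_list()
    local list = {}
    pos = pos + 1                              -- skip the opening "{"
    skip_space()
    if s:sub(pos, pos) == "}" then             -- empty list
      pos = pos + 1
      return list
    end
    while true do
      list[#list + 1] = parse_value()
      skip_space()
      local c = s:sub(pos, pos)
      pos = pos + 1
      if c == "}" then return list end
      if c ~= "," then
        error("expected ',' or '}' at position " .. (pos - 1))
      end
    end
  end

  function parse_value()
    skip_space()
    if s:sub(pos, pos) == "{" then
      return parse_list()
    end
    local num = s:match("^-?%d+%.?%d*", pos)   -- "^" anchors the match at pos
    if not num then
      error("expected a number at position " .. pos)
    end
    pos = pos + #num
    return tonumber(num)
  end

  return parse_value()
end

local TwoDtable = parse_nested("{{1,2,3}, {4,5,6}}")
print(TwoDtable[2][1])  --> 4
```

Unlike load, this never executes the input as code; malformed input raises an error, which you can catch with pcall.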

How can I get the number of times an entry in a table was listed

I need to find a way to see how many times an entry is listed in a table.
I have tried looking at other code and at examples online, but none of them helped.
local pattern = "(.+)%s?-%s?(.+)"
local table = {"Cald_fan:1", "SomePerson:2", "Cald_fan:3","anotherPerson:4"}
for i,v in pairs(table) do
    local UserId, t = string.match(v, pattern)
    for i,v in next,UserId do
        --I have tried something like this
    end
end
It is supposed to say Cald_fan was listed 2 times.
Something like this should work:
local pattern = "(.+)%s*:%s*(%d+)"
local tbl = {"Cald_fan:1", "SomePerson:2", "Cald_fan:3","anotherPerson:4"}
local counts = {}
for i,v in pairs(tbl) do
    local UserId, t = string.match(v, pattern)
    counts[UserId] = 1 + (counts[UserId] or 0)
end
print(counts['Cald_fan']) -- 2
I renamed table to tbl (using table as a variable name makes the table.* functions unavailable) and fixed the pattern (you had an unescaped '-' in it, while your strings have ':').
If the format of your table entries is consistent, you can simply split the strings apart and use the components as keys in a map of counters.
It looks like your table entries are formatted as "[player_name]:[index]", and it doesn't look like you care about the index. As long as the ":" appears in every table entry, you can write a pretty reliable search pattern. You could try something like this:
-- use a list of entries with the format <player_name>:<some_number>
local entries = {"Cald_fan:1", "SomePerson:2", "Cald_fan:3","anotherPerson:4"}
local foundPlayerCount = {}
-- iterate over the list of entries
for i,v in ipairs(entries) do
    -- parse out the player name and a number using the pattern:
    -- (.+)  = capture one or more characters
    -- :     = match the colon character
    -- (%d+) = capture one or more digits
    local playerName, playerIndex = string.match(v, '(.+):(%d+)')
    -- use the playerName as a key to count how many times it appears
    if not foundPlayerCount[playerName] then
        foundPlayerCount[playerName] = 0
    end
    foundPlayerCount[playerName] = foundPlayerCount[playerName] + 1
end
-- print out all the players
for playerName, timesAppeared in pairs(foundPlayerCount) do
    print(string.format("%s was listed %d times", playerName, timesAppeared))
end
If you need to do pattern matching in the future, I highly recommend this article on Lua string patterns: http://lua-users.org/wiki/PatternsTutorial
Hope this helps!

torch FloatTensor toString method?

I have a torch.FloatTensor that is 2 rows and 200 columns. I want to print the lines to a text file. Is there a toString() method for the torch.FloatTensor? If not, how do you print the line to the file? Thanks.
I can convert the FloatTensor into a Lua table:
local line = a[1]
local table = {}
for i=1,line:size(1) do
    table[i] = line[i]
end
Is there an easy way to convert the Lua table to a string, so I can write it to file? Thanks!
There is a built-in table conversion called torch.totable. If what you want to do is save and load your tensor, then it's even easier to use Torch native serialization like torch.save('file.txt', tensor, 'ascii').
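As for the second part of the question, a plain Lua table of numbers can be joined into a string with the standard table.concat and written out with the io library. The sketch below assumes the row has already been copied into a Lua table (as in the question's loop); row_to_string and append_row are hypothetical names. Note that the row is kept in a variable that is not called table, since a local named table would hide table.concat.

```lua
-- Join the numbers of one row into a single string, separated by `sep`.
local function row_to_string(values, sep)
  return table.concat(values, sep or " ")
end

-- Append one row as a line of text to the file at `path`.
local function append_row(path, values)
  local f = assert(io.open(path, "a"))
  f:write(row_to_string(values), "\n")
  f:close()
end

local values = {0.5, 1.25, -3}       -- stand-in for the table built from the tensor row
print(row_to_string(values))         --> 0.5 1.25 -3
print(row_to_string(values, ","))    --> 0.5,1.25,-3
```

Calling append_row("out.txt", values) once per row (the file name is made up) would then give one line per tensor row.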

pandas parse dates from csv

I am trying to read a csv file which includes dates. The csv looks like this:
h1,h2,h3,h4,h5
A,B,C,D,E,20150420
A,B,C,D,E,20150420
A,B,C,D,E,20150420
For reading the csv I use this code:
df = pd.read_csv(filen,
index_col=None,
header=0,
parse_dates=[5],
date_parser=lambda t:parse(t))
The parse function looks like this:
def parse(t):
    string_ = str(t)
    try:
        return datetime.date(int(string_[:4]), int(string_[4:6]), int(string_[6:]))
    except:
        return datetime.date(1900,1,1)
My strange problem now is that in the parsing function, t looks like this:
ndarray: ['20150420' '20150420' '20150420']
As you can see t is the whole array of the data column. I think it should be only the first value when parsing the first row, the second value, when parsing the second row, etc. Right now, the parse always ends up in the except-block because int(string_[:4]) contains a bracket, which, obviously, cannot be converted to an int. The parse function is built to parse only one date at a time (e.g. 20150420) in the first place.
What am I doing wrong?
EDIT:
okay, I just read in the pandas doc about the date_parser argument, and it seems to work as expected (of course ;)). So I need to adapt my code to that. My above example is copy&pasted from somewhere else and I expected it to work, hence, my question.. I will report back, when I did my code adaption.
EDIT2:
My parse function now looks like this, and I think, the code works now. If I am still doing something wrong, please let me know:
def parse(t):
    ret = []
    for ts in t:
        string_ = str(ts)
        try:
            tsdt = datetime.date(int(string_[:4]), int(string_[4:6]), int(string_[6:]))
        except:
            tsdt = datetime.date(1900,1,1)
        ret.append(tsdt)
    return ret
There are six columns, but only five titles in the first line. This is why parse_dates failed. You can skip the first line:
df = pd.read_csv("tmp.csv", header=None, skiprows=1, parse_dates=[5])
You can try this parser (unparseable values become NaT instead of raising an error):
parser = lambda x: pd.to_datetime(x, format='%Y%m%d', errors='coerce')
and use it like this:
df = pd.read_csv(filen,
index_col=None,
header=0,
parse_dates=[5],
date_parser=parser)
