How to call a Torch .lua file with new/different arguments? - lua

Suppose I have the following in a test.lua file:
require 'torch'
-- parse command line arguments
if not opt then
  print '==> processing options'
  cmd = torch.CmdLine()
  cmd:text()
  cmd:text('SVHN Model Definition')
  cmd:text()
  cmd:text('Options:')
  cmd:option('-model', 'convnet', 'type of model to construct: linear | mlp | convnet')
  cmd:option('-visualize', 1, 'visualize input data and weights during training')
  cmd:text()
  opt = cmd:parse(arg or {})
end
if opt.visualize == 0 then
  -- Do something
end
Now assume I want to call test.lua with different arguments from another Lua file, execute.lua:
dofile ('test.lua -visualize 0') --Gives an error
However, this fails with an error indicating that the file 'test.lua -visualize 0' cannot be found.
So, how can I correctly run a Lua file that contains Torch code, with arguments, from another .lua file?

If you do not need to use the variables defined inside your 'test.lua', you can use os.execute:
os.execute("th test.lua -visualize 0")
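If you do need those variables, one possible alternative is to stay in the same Lua state. dofile treats its whole argument as a file name, which is why the call in the question fails. Since test.lua reads the global arg table through cmd:parse(arg or {}), a sketch like the one below temporarily replaces arg before running the script. The helper name run_with_args and the stand-in script written to a temporary file are hypothetical, used only so that the example runs without Torch.

```lua
-- Hypothetical stand-in for test.lua: it copies the global `arg` table,
-- which is the same table cmd:parse(arg or {}) would read.
local path = os.tmpname()
local f = assert(io.open(path, "w"))
f:write([[
received = {}
for i, v in ipairs(arg or {}) do received[i] = v end
]])
f:close()

-- Hypothetical helper: run a script with a temporary global `arg` table,
-- then restore the previous one even if the script fails.
local function run_with_args(script, args)
  local saved = _G.arg
  _G.arg = args
  local ok, err = pcall(dofile, script)
  _G.arg = saved
  if not ok then error(err, 0) end
end

run_with_args(path, { "-visualize", "0" })
print(received[1], received[2])
os.remove(path)
```

With the real test.lua the call would presumably be run_with_args('test.lua', { '-visualize', '0' }). Note that test.lua skips parsing when the global opt already exists, so opt must not be set beforehand.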

Related

What does local _, x = ... do in lua?

I'm looking at some lua code on github which consists of many folders and files. Next to external libraries, each file starts with:
local _,x = ...
Now my question is: what is the purpose of this, namely the three dots? Is it a way to 'import' the global values of x? In what way is it best used?
... is the list of variable arguments (varargs) passed to the current function.
E.g.:
function test(x, y, ...)
  print("x is",x)
  print("y is",y)
  print("... is", ...)
  local a,b,c,d = ...
  print("b is",b)
  print("c is",c)
end
test(1,2,"oat","meal")
prints:
x is 1
y is 2
... is oat meal
b is meal
c is nil
Files are also treated as functions. In Lua, when a file is loaded (with load, loadfile, or similar), the result is a function, and you call that function to run the code. When you call it, you can pass arguments. None of those arguments have names, but the file can read them with ...
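A minimal sketch of that idea, assuming a throwaway script written to a temporary file (the file contents and variable names are illustrative only): loadfile compiles the file into a function, and the values passed to that function show up as ... inside the file.

```lua
-- Write a tiny script whose first line mirrors the one from the question.
local path = os.tmpname()
local f = assert(io.open(path, "w"))
f:write("local _, x = ...\nreturn x\n")
f:close()

-- loadfile compiles the file and returns it as a function without running it.
local chunk = assert(loadfile(path))
os.remove(path)

-- Calling the function runs the file; the arguments become its `...`.
local result = chunk("first", "second")
print(type(chunk))  -- function
print(result)       -- second
```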
They are the arguments from the command line.
The Lua reference manual, in the chapter Lua Standalone, says:
...If there is a script, the script is called with arguments arg[1], ···, arg[#arg]. Like all chunks in Lua, the script is compiled as a vararg function.
For example if your lua script is run with the command line:
lua my_script.lua 10 20
In my_script.lua you have:
local _, x = ...
Then _ = "10" and x = "20"
Update: when a library script is required by another script, the meaning of the three dots changes; they are then the arguments that require passes to the loader:
Once a loader is found, require calls the loader with two arguments: modname and an extra value, a loader data, also returned by the searcher.
And under package.searchers:
All searchers except the first one (preload) return as the extra value the file name where the module was found
For example, suppose you have a Lua file that requires my_script.lua:
require('my_script')
At this point, inside my_script.lua, _ = "my_script" and x = "/full/path/to/my_script.lua"
Note that in Lua 5.1, require passes only one argument to the loader, so x is nil.
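To see the arguments that require hands to a loader without creating a file on disk, here is a small sketch that registers a loader in package.preload. Using package.preload is an assumption made to keep the example self-contained; a module file receives its arguments the same way through ..., but the second value differs depending on the Lua version and on which searcher found the module, so it is only printed here.

```lua
-- Register a loader by hand; require calls it the same way it would call
-- the compiled chunk of a module file.
package.preload["my_script"] = function(...)
  local name, extra = ...
  return { name = name, extra = extra }
end

local m = require("my_script")
print(m.name)   -- my_script
print(m.extra)  -- depends on the Lua version; nil in Lua 5.1
```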

Apache Beam : How to return multiple outputs

In the function below, I want to return the important_col variable as well.
import apache_beam as beam
import pandas as pd
from apache_beam.options.pipeline_options import PipelineOptions
class FormatInput(beam.DoFn):
    def process(self, element):
        """Format the input to the desired shape."""
        df = pd.DataFrame([element], columns=element.keys())
        if 'reqd' in df.columns:
            important_col = 'reqd'
        elif 'customer' in df.columns:
            important_col = 'customer'
        elif 'phone' in df.columns:
            important_col = 'phone'
        else:
            raise ValueError(['Important columns not specified'])
        output = df.to_dict('records')
        return output
with beam.Pipeline(options=PipelineOptions(pipeline_args)) as p:
    clean_csv = (p
                 | 'Read input file' >> beam.dataframe.io.read_csv('raw_data.csv'))
    to_process = clean_csv | 'pre-processing' >> beam.ParDo(FormatInput())
In the above pipeline, I want to return the important_col variable from FormatInput.
Once I have that variable, I want to pass it as an argument to the next step in the pipeline.
I also want to dump to_process to a CSV file.
I tried the following, but neither approach worked:
I converted to_process with to_dataframe and tried to_csv, but I got an error.
I also tried to dump the PCollection to CSV, but I could not work out how to do that. I referred to the official Apache Beam documentation, but I could not find anything similar to my use case.

Why is Lua's arg table not one-indexed?

In the Lua command line, when I pass arguments to a script like this:
lua myscript.lua a b c d
I can read the name of my script and its arguments from the global arg table. arg[0] contains the script name, and arg[1] through arg[#arg] contain the remaining arguments. What's odd about this table is that it has a value at index 0, unlike other Lua arrays, which start indexing at 1. This means that when iterating over it like this:
for i,v in ipairs(arg) do print(i, v) end
the output only covers indices 1-4 and does not print the script name. Also, #arg evaluates to 4, not 5.
Is there any good reason for this decision? It initially took me aback, and I had to verify that the manual wasn't mistaken.
Asking why certain design decisions were made is always tricky, because only the creator of the language can really answer. My guess is that it was chosen so that you can iterate over the arguments using ipairs without having to handle the first entry specially because it is the script name and not an argument.
#arg does not give the total number of entries anyway: it only counts the elements in the consecutive array section starting at index 1, while the zeroth and negative indices are stored in the hash section. To obtain the actual number of elements, use
local n = 0
for _ in pairs(arg) do
  n = n + 1
end
At least it is documented in Programming in Lua:
A main script can retrieve its arguments in the global variable arg. In a call like
prompt> lua script a b c
lua creates the table arg with all the command-line arguments, before running the script. The script name goes into index 0; its first argument (a in the example), goes to index 1, and so on. Eventual options go to negative indices, as they appear before the script. For instance, in the call
prompt> lua -e "sin=math.sin" script a b
lua collects the arguments as follows:
arg[-3] = "lua"
arg[-2] = "-e"
arg[-1] = "sin=math.sin"
arg[0] = "script"
arg[1] = "a"
arg[2] = "b"
More often than not, the script only uses the positive indices (arg[1] and arg[2], in the example).
arg in Lua mimics argv in C: arg[0] contains the name of the script just like argv[0] contains the name of the program.
This does not contradict 1-based arrays in Lua, since the arguments to the script are the more important data. The name of the script is seldom used.
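The behaviour described in these answers can be checked without the standalone interpreter. The sketch below assumes a hand-built table named fake_arg, a hypothetical stand-in that mirrors the layout quoted from Programming in Lua, and compares what #, ipairs, and pairs each see.

```lua
-- Hypothetical stand-in for the table built for:
--   lua -e "sin=math.sin" script a b
local fake_arg = {
  [-3] = "lua", [-2] = "-e", [-1] = "sin=math.sin", [0] = "script",
  "a", "b",  -- positional entries land at indices 1 and 2
}

-- The length operator only covers the sequence starting at index 1.
local length = #fake_arg

-- ipairs starts at 1 and stops at the first nil, so it skips index 0 and below.
local from_ipairs = 0
for _ in ipairs(fake_arg) do from_ipairs = from_ipairs + 1 end

-- pairs visits every key, including zero and the negative indices.
local from_pairs = 0
for _ in pairs(fake_arg) do from_pairs = from_pairs + 1 end

print(length, from_ipairs, from_pairs)  -- 2, 2 and 6
```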

Get global environment in lua package

At the beginning of some Lua package files, there is sometimes the line local base = _G or local base = ....
What are the benefits of doing this?
What are the differences between these two lines?
For the first question, you can refer to: Why make global Lua functions local?
For your second one,
What are the differences between these two lines?
When you do local base = _G, you are making base a synonym for the global environment table. On the other hand, in the statement local base = ..., the ... refers to Lua's vararg feature.
It can be shown in better detail with the following example:
local a = {...}
for k, v in pairs(a) do print(k, v) end
and then, executing it as follows:
$ lua temp.lua some thing is passed "here within quotes"
1 some
2 thing
3 is
4 passed
5 here within quotes
As you see, ... is just a list of arguments passed to the program. Now, when you have
local base = ...
Lua assigns the first argument to the variable base. All other arguments are ignored by the above statement. In a file loaded through require, that first argument is the module name.
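The difference between the two lines can be sketched as follows. The chunk is compiled from a string here only to keep the example in one file (an assumption for illustration); loadstring is used where it exists (Lua 5.1, LuaJIT) and load otherwise.

```lua
local loader = loadstring or load

-- A chunk that starts the way such package files do.
local chunk = assert(loader([[
  local base = ...                -- keeps only the first vararg
  local count = select('#', ...)  -- how many arguments were actually passed
  return base, count
]]))

local base, count = chunk("some", "thing", "here within quotes")
print(base, count)  -- some and 3

-- By contrast, _G is the global environment table itself.
local g = _G
print(g.print == print)  -- true
```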

Is there special meaning for ()() syntax in Lua

I have seen this type of syntax a lot in some Lua source files I was reading lately. What does it mean, especially the second pair of brackets?
An example, line 8 in
https://github.com/karpathy/char-rnn/blob/master/model/LSTM.lua
local LSTM = {}
function LSTM.lstm(input_size, rnn_size, n, dropout)
  dropout = dropout or 0
  -- there will be 2*n+1 inputs
  local inputs = {}
  table.insert(inputs, nn.Identity()()) -- line 8
  -- ...
The source code of nn.Identity
https://github.com/torch/nn/blob/master/Identity.lua
********** UPDATE **************
The ()() pattern is used a lot with the Torch 'nn' library. The first pair of brackets creates an object of the container/node, and the second pair of brackets references the node it depends on.
For example, y = nn.Linear(2,4)(x) means x connects to y, and the transformation is linear from 1*2 to 1*4.
At first I only understood the usage; how it is wired is explained by one of the answers below.
In any case, the usage of the interface is well documented here:
https://github.com/torch/nngraph/blob/master/README.md
No, ()() has no special meaning in Lua; it is just two call operators () in a row.
The operand is typically a function that returns a function (or a table that implements the __call metamethod). For example:
function foo()
  return function() print(42) end
end
foo()() -- 42
To complement Yu Hao's answer, here are some Torch-specific details:
nn.Identity() creates an identity module,
() called on this module triggers nn.Module's __call__ (the Torch class system hooks this up into the metatable),
by default this __call__ method performs a forward / backward,
but here torch/nngraph is used, and nngraph overrides this method, as you can see here.
As a consequence, every nn.Identity()() call here returns a nngraph.Node({module=self}) node, where self refers to the current nn.Identity() instance.
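The two-step call described above can be mimicked in plain Lua with the __call metamethod. The sketch below is not Torch or nngraph code: the Identity table and the node layout are hypothetical stand-ins, meant only to show how the first () can construct an object and the second () can wrap it in a node.

```lua
-- Hypothetical class table; instances use it as their metatable.
local Identity = {}
Identity.__index = Identity

-- Second (): calling an instance wraps it in a node that remembers its parent.
Identity.__call = function(self, parent)
  return { module = self, parent = parent }
end

-- First (): calling the class table itself constructs an instance.
setmetatable(Identity, {
  __call = function(cls)
    return setmetatable({ name = "Identity" }, cls)
  end,
})

local node = Identity()()       -- construct, then wrap with no parent
local child = Identity()(node)  -- construct, then wrap with `node` as parent

print(node.module.name)      -- Identity
print(child.parent == node)  -- true
```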
--
Update: an illustration of this syntax in the context of LSTM-s can be found here:
local i2h = nn.Linear(input_size, 4 * rnn_size)(input) -- input to hidden
If you’re unfamiliar with nngraph it probably seems strange that we’re constructing a module and already calling it once more with a graph node. What actually happens is that the second call converts the nn.Module to nngraph.gModule and the argument specifies it’s parent in the graph.
The first () calls the __init function and the second () calls the __call__ function.
If the class does not possess either of these functions, then the parent's functions are called.
In the case of nn.Identity()(), nn.Identity has neither an __init function nor a __call__ function, hence the __init and __call__ functions of its parent, nn.Module, are called. An illustration:
require 'torch'
-- define some dummy A class
local A = torch.class('A')
function A:__init(stuff)
  self.stuff = stuff
  print('inside __init of A')
end
function A:__call__(arg1)
  print('inside __call__ of A')
end
-- define some dummy B class, inheriting from A
local B,parent = torch.class('B', 'A')
function B:__init(stuff)
  self.stuff = stuff
  print('inside __init of B')
end
function B:__call__(arg1)
  print('inside __call__ of B')
end
a=A()()
b=B()()
Output
inside __init of A
inside __call__ of A
inside __init of B
inside __call__ of B
Another code sample
require 'torch'
-- define some dummy A class
local A = torch.class('A')
function A:__init(stuff)
  self.stuff = stuff
  print('inside __init of A')
end
function A:__call__(arg1)
  print('inside __call__ of A')
end
-- define some dummy B class, inheriting from A
local B,parent = torch.class('B', 'A')
b=B()()
Output
inside __init of A
inside __call__ of A
