I'm working on a small Torch7/Lua script to create and train a neural network, but I'm running into errors. Any ideas?
Here's my code:
require 'dp'
require 'csvigo'
require 'nn'
--[[hyperparameters]]--
opt = {
nHidden = 100, --number of hidden units
learningRate = 0.1, --training learning rate
momentum = 0.9, --momentum factor to use for training
maxOutNorm = 1, --maximum norm allowed for output neuron weights
batchSize = 128, --number of examples per mini-batch
maxTries = 100, --maximum number of epochs without reduction in validation error.
maxEpoch = 1 --maximum number of epochs of training
}
csv2tensor = require 'csv2tensor'
-- inputs, outputs = csv2tensor.load("/Users/robertgrzesik/NodeJS/csv_export.csv")
inputs = csv2tensor.load("/Users/robertgrzesik/NodeJS/csv_export.csv", {exclude={"positive", "negative", "neutral"}})
outputs = csv2tensor.load("/Users/robertgrzesik/NodeJS/csv_export.csv", {include={"positive", "negative", "neutral"}}) -- "positive", "negative", "neutral"
print("outputs: ", outputs)
print("inputs: ", inputs)
local dataset = {}
print("inputs:size(1)", inputs:size(1))
inputSize = inputs:size(1)
outputSize = outputs:size(1)
for i=1,inputSize do
dataset[i] = {inputs[i], outputs[i]}
end
dataset.size = function(self)
return inputSize
end
-- ======================================= --
-- Create NN
-- ======================================= --
print '[INFO] Creating NN..'
mlp = nn.Sequential(); -- make a multi-layer perceptron
inputs = inputSize; outputs = outputSize; HUs = 300; -- parameters
mlp:add(nn.Linear(inputs, HUs))
mlp:add(nn.Tanh())
mlp:add(nn.Linear(HUs, outputs))
-- ======================================= --
-- MSE and Training
-- ======================================= --
print '[INFO] MSE and train NN..'
criterion = nn.MSECriterion()
trainer = nn.StochasticGradient(mlp, criterion)
trainer.learningRate = 0.01
trainer:train(dataset)
Here's the error:
# StochasticGradient: training
/Users/robertgrzesik/torch/install/bin/luajit: .../robertgrzesik/torch/install/share/lua/5.1/nn/Linear.lua:37: size mismatch
stack traceback:
[C]: in function 'addmv'
.../robertgrzesik/torch/install/share/lua/5.1/nn/Linear.lua:37: in function 'updateOutput'
...ertgrzesik/torch/install/share/lua/5.1/nn/Sequential.lua:25: in function 'forward'
...ik/torch/install/share/lua/5.1/nn/StochasticGradient.lua:35: in function 'train'
/Users/robertgrzesik/Lua/async-master/tests/dp-test.lua:53: in main chunk
[C]: in function 'dofile'
...esik/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
[C]: at 0x01028bc780
And here's a sample of my data:
positive,negative,basketball,neutral,the,be,and,of,a,in,to,have,it,I,for,that,he,you,with,on,do,this,they,at,who,if,her,people,take,your,like,our,new,because,woman,great,show,million,money,job,little,important,lose,include,rest,fight,perfect
0,0,0,1,3,0,1,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0,1,0,0,0,0,1,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
Basically, my aim is to create a deep neural network that links the frequency of the words used in a sentence to the user's rating of that sentence as "positive", "negative" or "neutral" (my outputs, which are binary). Please also let me know if my thinking is correct on this.
Thank you!
Found the problem!
The issue was that I was giving the wrong sizes when creating the network. I was passing in "inputs:size(1)" (the number of rows, i.e. examples) instead of "inputs:size(2)" (the number of columns, i.e. features per example). Here's the fix:
mlp:add(nn.Linear(inputs:size(2), HUs))
mlp:add(nn.Tanh())
mlp:add(nn.Linear(HUs, outputs:size(2)))
Feel like I'm slowly starting to get the hang of Lua/ Torch! Score
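For anyone hitting the same size mismatch: in a 2-D data table the first dimension counts examples (rows) and the second counts features (columns), and the first linear layer has to be sized by the second. Here is a minimal, framework-free Python sketch of that idea. The `linear` helper and the toy numbers are made up for illustration and are not Torch's API.

```python
# Toy data table: 3 examples (rows) x 4 features (columns).
# Illustrative values only, not the CSV from the question.
rows = [
    [3, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 0, 0],
]

n_examples = len(rows)      # analogous to inputs:size(1)
n_features = len(rows[0])   # analogous to inputs:size(2)


def linear(weights, x):
    """Hypothetical stand-in for a linear layer: returns weights @ x."""
    for w_row in weights:
        if len(w_row) != len(x):
            raise ValueError("size mismatch: layer expects %d inputs, got %d"
                             % (len(w_row), len(x)))
    return [sum(w * v for w, v in zip(w_row, x)) for w_row in weights]


n_hidden = 2

# Wrong: layer sized by the number of examples.
bad_weights = [[0.1] * n_examples for _ in range(n_hidden)]
try:
    linear(bad_weights, rows[0])
    mismatch = False
except ValueError:
    mismatch = True

# Right: layer sized by the number of features per example.
good_weights = [[0.1] * n_features for _ in range(n_hidden)]
hidden = linear(good_weights, rows[0])
```

The wrong sizing only goes unnoticed when the table happens to have as many rows as columns, which is why checking both dimensions explicitly is worth the extra print.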
I am trying to penalize the False Negatives more than False Positives in a binary classification problem.
The custom loss function (inspired by an existing implementation) looks like this:
def w_categorical_crossentropy(y_true, y_pred, weights):
    nb_cl = len(weights)
    final_mask = K.zeros_like(y_pred[:, 0])
    y_pred_max = K.max(y_pred, axis=1)
    y_pred_max = K.reshape(y_pred_max, (K.shape(y_pred)[0], 1))
    y_pred_max_mat = K.cast(K.equal(y_pred, y_pred_max), K.floatx())
    for c_p, c_t in product(range(nb_cl), range(nb_cl)):
        final_mask += (weights[c_t, c_p] * y_pred_max_mat[:, c_p] * y_true[:, c_t])
    return K.categorical_crossentropy(y_pred, y_true) * final_mask
Following are the weights being passed to the weights param in the above function:
w_array = np.ones((2,2))
w_array[1,0] = 2.5 # penalizing FN
w_array[0,1] = 2.5 # penalizing FP
In my understanding, the first line penalizes the FN and the second one penalizes the FP. When I run this code, the number of FN/FP is pretty much the same as with equal weights, as shown below:
w_array = np.ones((2,2))
#w_array[1,0] = 2.5 # penalizing FN
#w_array[0,1] = 2.5 # penalizing FP
FOLLOW UP: If I simply comment out the second assignment in w_array, does it mean I am only penalizing the FN and not the FP? What is the importance of final_mask here?
Function call and usage:
loss = lambda y_true, y_pred: w_categorical_crossentropy(y_true, y_pred, weights=w_array)
classifier.compile(optimizer = sgd, loss = loss, metrics = ['accuracy'])
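On the final_mask question: reading the loss function above, final_mask appears to hold one weight per sample, selected by the pair (true class, class with the highest predicted score). Below is a minimal pure-Python sketch of the same lookup. The `final_mask` helper and the toy predictions are assumptions made up for illustration; they are not part of Keras.

```python
def final_mask(y_true, y_pred, weights):
    """Per-sample weight weights[true_class][predicted_class].

    y_true: list of one-hot rows; y_pred: list of score rows.
    Mirrors the double loop in w_categorical_crossentropy, where
    y_pred_max_mat marks the highest-scoring class of each row.
    """
    n_classes = len(weights)
    mask = []
    for t_row, p_row in zip(y_true, y_pred):
        p_max = max(p_row)
        total = 0.0
        for c_p in range(n_classes):
            for c_t in range(n_classes):
                is_pred = 1.0 if p_row[c_p] == p_max else 0.0
                total += weights[c_t][c_p] * is_pred * t_row[c_t]
        mask.append(total)
    return mask


# Same weights as in the question: rows = true class, columns = predicted.
w = [[1.0, 1.0],
     [1.0, 1.0]]
w[1][0] = 2.5  # true 1, predicted 0: false negative
w[0][1] = 2.5  # true 0, predicted 1: false positive

y_true = [[0, 1], [1, 0], [0, 1]]
y_pred = [[0.8, 0.2],   # true 1, predicted 0 -> FN
          [0.3, 0.7],   # true 0, predicted 1 -> FP
          [0.1, 0.9]]   # true 1, predicted 1 -> correct

mask = final_mask(y_true, y_pred, w)
print(mask)  # [2.5, 2.5, 1.0]
```

In this sketch, commenting out only the `w[0][1]` line leaves the false-positive weight at 1.0 while false negatives keep 2.5, so only the FN samples have their cross-entropy scaled up.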
require 'torch';
require 'nn';
require 'nnx';
mnist = require 'mnist';
fullset = mnist.traindataset()
testset = mnist.testdataset()
trainset = {
size = 50000,
data = fullset.data[{{1,50000}}]:double(),
label = fullset.label[{{1,50000}}]
}
validationset = {
size = 10000,
data = fullset.data[{{50001, 60000}}]:double(),
label = fullset.label[{{50001,60000}}]
}
-- MNIST Dataset has 28x28 images
model = nn.Sequential()
model:add(nn.SpatialConvolutionMM(1, 32, 5, 5)) -- 32x24x24
model:add(nn.ReLU())
model:add(nn.SpatialMaxPooling(3, 3, 3, 3)) -- 32x8x8
model:add(nn.SpatialConvolutionMM(32, 64, 5, 5)) -- 64x4x4
model:add(nn.Tanh())
model:add(nn.SpatialMaxPooling(2, 2, 2, 2)) -- 64x2x2
model:add(nn.Reshape(64*2*2))
model:add(nn.Linear(64*2*2, 200))
model:add(nn.Tanh())
model:add(nn.Linear(200, 10))
model:add(nn.LogSoftMax())
criterion = nn.ClassNLLCriterion()
x, dldx = model:getParameters() -- now x stores the trainable parameters and dldx stores the gradient wrt these params in the model above
sgd_params = {
learningRate = 1e-2,
learningRateDecay = 1e-4,
weightDecay = 1e-3,
momentum = 1e-4
}
step = function ( batchsize )
-- setting up variables
local count = 0
local current_loss = 0
local shuffle = torch.randperm(trainset.size)
-- setting default batchsize as 200
batchsize = batchsize or 200
-- setting inputs and targets for minibatches
for minibatch_number = 1, trainset.size, batchsize do
local size = math.min( trainset.size - minibatch_number + 1, batchsize )
local inputs = torch.Tensor(size, 28, 28)
local targets = torch.Tensor(size)
for index = 1, size do
inputs[index] = trainset.data[ shuffle[ index + minibatch_number - 1 ]]
targets[index] = trainset.label[ shuffle[ index + minibatch_number - 1 ] ]
end
-- defining feval function to return loss and gradients of loss w.r.t. params
feval = function( x_new )
--print ( "---------------------------------safe--------------------")
if x ~= x_new then x:copy(x_new) end
-- initializing gradParsams to zero
dldx:zero()
-- calculating loss and param gradients
local loss = criterion:forward( model:forward( inputs ), targets )
model:backward( inputs, criterion:backward( model.output, targets ) )
return loss, dldx
end
-- getting loss
-- optim returns x*, {fx} where x* is new set of params and {fx} is { loss } => fs[ 1 ] carries loss from feval
print(feval ~= nil and x ~= nil and sgd_params ~= nil)
_,fs = optim.sgd(feval, x, sgd_params)
count = count + 1
current_loss = current_loss + fs[ 1 ]
end
--returning avg loss over the minibatch
return current_loss / count
end
max_iters = 30
for i = 1 ,max_iters do
local loss = step()
print(string.format('Epoch: %d Current loss: %4f', i, loss))
end
I am new to torch and lua and I'm not able to find an error in the above code. Can anyone suggest a way to debug it?
The error:
/home/afroz/torch/install/bin/luajit: /home/afroz/test.lua:88: attempt to index global 'optim' (a nil value)
stack traceback:
/home/afroz/test.lua:88: in function 'step'
/home/afroz/test.lua:102: in main chunk
[C]: in function 'dofile'
...froz/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00406670
optim is not defined in the scope of your script. You try to call optim.sgd, which of course results in the error you see.
Like nn, optim is an extension package for Torch.
require 'torch';
require 'nn';
require 'nnx';
Remember those lines at the beginning of your script? They basically execute the definitions of those packages.
Make sure optim is installed, then add require 'optim' next to them.
https://github.com/torch/optim
optim is not assigned anywhere in the script, so when the script references optim.sgd, its value is nil and you get the error shown. Double-check the script to make sure optim is assigned the correct value.
TL;DR
Trying to build a bidirectional RNN for sequence tagging using tensorflow.
The goal is to take inputs "I like New York" and produce outputs "O O LOC_START LOC"
The graph compiles and runs, but the loss becomes NaN after 1 or 2 batches. I understand this could be a problem with the learning rate, but changing the learning rate seems to have no impact. Using AdamOptimizer at the moment.
Any help would be appreciated.
Here is my code:
# The input and output: a sequence of words, embedded, and a sequence of word classifications, one-hot
self.input_x = tf.placeholder(tf.float32, [None, n_sequence_length, n_embedding_dim], name="input_x")
self.input_y = tf.placeholder(tf.float32, [None, n_sequence_length, n_output_classes], name="input_y")
# New shape: [sequence_length, batch_size (None), embedding_dim]
inputs = tf.transpose(self.input_x, [1, 0, 2])
# New shape: [sequence_length * batch_size (None), embedding_dim]
inputs = tf.reshape(inputs, [-1, n_embedding_dim])
# Define weights
w_hidden = tf.Variable(tf.random_normal([n_embedding_dim, 2 * n_hidden_states]))
b_hidden = tf.Variable(tf.random_normal([2 * n_hidden_states]))
w_out = tf.Variable(tf.random_normal([2 * n_hidden_states, n_output_classes]))
b_out = tf.Variable(tf.random_normal([n_output_classes]))
# Linear activation for the input; this will make it fit to the hidden size
inputs = tf.nn.xw_plus_b(inputs, w_hidden, b_hidden)
# Split up the batches into a Python list
inputs = tf.split(0, n_sequence_length, inputs)
# Now we define our cell. It takes one word as input, a vector of embedding_size length
cell_forward = rnn_cell.BasicLSTMCell(n_hidden_states, forget_bias=0.0)
cell_backward = rnn_cell.BasicLSTMCell(n_hidden_states, forget_bias=0.0)
# And we add a Dropout Wrapper as appropriate
if is_training and prob_keep < 1:
    cell_forward = rnn_cell.DropoutWrapper(cell_forward, output_keep_prob=prob_keep)
    cell_backward = rnn_cell.DropoutWrapper(cell_backward, output_keep_prob=prob_keep)
# And we make it a few layers deep
cell_forward_multi = rnn_cell.MultiRNNCell([cell_forward] * n_layers)
cell_backward_multi = rnn_cell.MultiRNNCell([cell_backward] * n_layers)
# returns outputs = a list T of tensors [batch, 2*hidden]
outputs = rnn.bidirectional_rnn(cell_forward_multi, cell_backward_multi, inputs, dtype=dtypes.float32)
# [sequence, batch, 2*hidden]
outputs = tf.pack(outputs)
# [batch, sequence, 2*hidden]
outputs = tf.transpose(outputs, [1, 0, 2])
# [batch * sequence, 2 * hidden]
outputs = tf.reshape(outputs, [-1, 2 * n_hidden_states])
# [batch * sequence, output_classes]
self.scores = tf.nn.xw_plus_b(outputs, w_out, b_out)
# [batch * sequence, output_classes]
inputs_y = tf.reshape(self.input_y, [-1, n_output_classes])
# [batch * sequence]
self.predictions = tf.argmax(self.scores, 1, name="predictions")
# Now calculate the cross-entropy
losses = tf.nn.softmax_cross_entropy_with_logits(self.scores, inputs_y)
self.loss = tf.reduce_mean(losses, name="loss")
if not is_training:
    return
# Training
self.train_op = tf.train.AdamOptimizer(1e-4).minimize(self.loss)
# Evaluate model
correct_pred = tf.equal(self.predictions, tf.argmax(inputs_y, 1))
self.accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32), name="accuracy")
Could there be an example in the training data where something is wrong with the labels? Then when it hits that example, the cost becomes NaN. I'm suggesting this because it seems like it still happens when the learning rate is zero, and after just a few batches.
Here is how I would debug:
Set the batch size to 1.
Set the learning rate to 0.0.
When you run a batch, have TensorFlow output the intermediate values, not just the cost.
Run until you get a NaN, then check what the input was and examine the intermediate outputs to determine at which point the NaN first appears.
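The steps above can be sketched without TensorFlow: run one example at a time, keep every intermediate value, and report the first stage that contains a NaN. Everything in this sketch (the `forward` stages, the toy weights, the corrupted example) is made up for illustration and is not the model from the question.

```python
import math


def forward(x, w):
    """Tiny hypothetical forward pass that returns every intermediate value."""
    stages = []
    stages.append(("input", list(x)))
    scores = [wi * xi for wi, xi in zip(w, x)]
    stages.append(("scores", scores))
    loss = sum(s * s for s in scores) / len(scores)
    stages.append(("loss", [loss]))
    return stages


def first_nan_stage(stages):
    """Name of the earliest stage holding a NaN, or None if all are clean."""
    for name, values in stages:
        if any(math.isnan(v) for v in values):
            return name
    return None


weights = [0.5, -1.0, 2.0]
examples = [
    [1.0, 2.0, 3.0],
    [0.0, float("nan"), 1.0],   # deliberately corrupted example
    [4.0, 5.0, 6.0],
]

# Batch size 1, no parameter updates (the learning-rate-0 setting).
bad = None
for index, x in enumerate(examples):
    stage = first_nan_stage(forward(x, weights))
    if stage is not None:
        bad = (index, stage)
        break

print(bad)  # (1, 'input')
```

If the first NaN shows up at the input stage, the data is at fault; if it first appears at a later stage, the computation just before that stage is the place to look.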
I'm transitioning from Theano to Torch, so please bear with me. In Theano, it was fairly straightforward to compute the gradients of the loss function w.r.t. even a specific weight. How can one do this in Torch?
Assume we have the following code which generates some data/labels and defines a model :
t = require 'torch'
require 'nn'
require 'cunn'
require 'cutorch'
-- Generate random labels
function randLabels(nExamples, nClasses)
-- nClasses: number of classes
-- nExamples: number of examples
label = {}
for i=1, nExamples do
label[i] = t.random(1, nClasses)
end
return t.FloatTensor(label)
end
inputs = t.rand(1000, 3, 32, 32) -- 1000 samples, 3 color channels
inputs = inputs:cuda()
labels = randLabels(inputs:size()[1], 10)
labels = labels:cuda()
net = nn.Sequential()
net:add(nn.SpatialConvolution(3, 6, 5, 5))
net:add(nn.ReLU())
net:add(nn.SpatialMaxPooling(2, 2, 2, 2))
net:add(nn.View(6*14*14))
net:add(nn.Linear(6*14*14, 300))
net:add(nn.ReLU())
net:add(nn.Linear(300, 10))
net = net:cuda()
-- Loss
criterion = nn.CrossEntropyCriterion()
criterion = criterion:cuda()
forwardPass = net:forward(inputs)
net:zeroGradParameters()
dEd_WeightsOfLayer1 -- How to compute this?
forwardPass = nil
net = nil
criterion = nil
inputs = nil
labels = nil
collectgarbage()
How can I compute the gradient w.r.t. the weights of the convolutional layer?
Okay, I found the answer (thanks to Alban Desmaison on the Torch7 Google group).
The code in the question has a bug and does not work, so I rewrote it. Here's how you can get the gradients with respect to each node/parameter:
t = require 'torch'
require 'cunn'
require 'nn'
require 'cutorch'
-- A function to generate some random labels
function randLabels(nExamples, nClasses)
-- nClasses: number of classes
-- nExamples: number of examples
label = {}
for i=1, nExamples do
label[i] = t.random(1, nClasses)
end
return t.FloatTensor(label)
end
-- Declare some variables
nClass = 10
kernelSize = 5
stride = 2
poolKernelSize = 2
nData = 100
nChannel = 3
imageSize = 32
-- Generate some [random] data
data = t.rand(nData, nChannel, imageSize, imageSize) -- 100 Random images with 3 channels
data = data:cuda() -- Transfer to the GPU (remove this line if you're not using GPU)
label = randLabels(data:size()[1], nClass)
label = label:cuda() -- Transfer to the GPU (remove this line if you're not using GPU)
-- Define model
net = nn.Sequential()
net:add(nn.SpatialConvolution(3, 6, 5, 5))
net:add(nn.ReLU())
net:add(nn.SpatialMaxPooling(poolKernelSize, poolKernelSize, stride, stride))
net:add(nn.View(6*14*14))
net:add(nn.Linear(6*14*14, 350))
net:add(nn.ReLU())
net:add(nn.Linear(350, 10))
net = net:cuda() -- Transfer to the GPU (remove this line if you're not using GPU)
criterion = nn.CrossEntropyCriterion()
criterion = criterion:cuda() -- Transfer to the GPU (remove this line if you're not using GPU)
-- Do forward pass and get the gradient for each node/parameter:
net:forward(data) -- Do the forward propagation
criterion:forward(net.output, label) -- Compute the overall negative log-likelihood error
criterion:backward(net.output, label); -- Don't forget to put ';'. Otherwise you'll get everything printed on the screen
net:backward(data, criterion.gradInput); -- Don't forget to put ';'. Otherwise you'll get everything printed on the screen
-- Now you can access the gradient values
layer1InputGrad = net:get(1).gradInput
layer1WeightGrads = net:get(1).gradWeight
net = nil
data = nil
label = nil
criterion = nil
Copy and paste the code and it works like a charm :)
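For readers coming from Theano, the forward-then-backward pattern above can also be checked by hand. Below is a framework-free Python sketch for a single linear unit with squared error, comparing the backpropagated weight gradient against a finite-difference estimate. The function names and numbers are made up for illustration; this is not Torch's implementation.

```python
def forward(w, x):
    """Output of one linear unit: y = sum_i w[i] * x[i]."""
    return sum(wi * xi for wi, xi in zip(w, x))


def loss(w, x, target):
    """Squared error between the unit's output and the target."""
    diff = forward(w, x) - target
    return diff * diff


def grad_weights(w, x, target):
    """Backward pass: dE/dw[i] = dE/dy * dy/dw[i] = 2 * (y - target) * x[i]."""
    d_loss_d_y = 2.0 * (forward(w, x) - target)   # gradient at the output
    return [d_loss_d_y * xi for xi in x]          # chain rule into the weights


def numeric_grad(w, x, target, eps=1e-6):
    """Central finite differences, used only to check grad_weights."""
    grads = []
    for i in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[i] += eps
        w_minus[i] -= eps
        grads.append((loss(w_plus, x, target) - loss(w_minus, x, target)) / (2 * eps))
    return grads


w = [0.5, -1.0, 2.0]
x = [1.0, 2.0, 3.0]
target = 1.0

analytic = grad_weights(w, x, target)
numeric = numeric_grad(w, x, target)
print(analytic)  # [7.0, 14.0, 21.0]
```

Here `d_loss_d_y` plays the role of the criterion's gradInput, and the list returned by `grad_weights` corresponds to what a layer stores in gradWeight after the backward call.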
I'm trying to run an example from torch7 only to come across this error.
sandesh@sandesh-H87M-D3H:~/Downloads/tutorials-master/2_supervised$ luajit doall.lua
==> processing options
==> executing all
==> downloading dataset
==> using regular, full training data
==> loading dataset
==> preprocessing data
==> preprocessing data: colorspace RGB -> YUV
==> preprocessing data: normalize each feature (channel) globally
==> preprocessing data: normalize all three channels locally
==> verify statistics
training data, y-channel, mean: 0.00067706172257129
training data, y-channel, standard deviation: 0.39473240322794
test data, y-channel, mean: -0.0010822884348063
test data, y-channel, standard deviation: 0.38091408093043
training data, u-channel, mean: -0.0048219975630079
training data, u-channel, standard deviation: 0.29768662619471
test data, u-channel, mean: -0.0030795217110624
test data, u-channel, standard deviation: 0.22289780235542
training data, v-channel, mean: 0.0036312269637064
training data, v-channel, standard deviation: 0.25405592463897
test data, v-channel, mean: 0.0033847450016769
test data, v-channel, standard deviation: 0.20362829592977
==> visualizing data
==> define parameters
==> construct model
==> here is the model:
nn.Sequential {
[input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> output]
(1): nn.SpatialConvolutionMM(3 -> 64, 5x5)
(2): nn.Tanh
(3): nn.Sequential {
[input -> (1) -> (2) -> (3) -> (4) -> output]
(1): nn.Square
(2): nn.SpatialAveragePooling(2,2,2,2)
(3): nn.MulConstant
(4): nn.Sqrt
}
(4): nn.SpatialSubtractiveNormalization
(5): nn.SpatialConvolutionMM(64 -> 64, 5x5)
(6): nn.Tanh
(7): nn.Sequential {
[input -> (1) -> (2) -> (3) -> (4) -> output]
(1): nn.Square
(2): nn.SpatialAveragePooling(2,2,2,2)
(3): nn.MulConstant
(4): nn.Sqrt
}
(8): nn.SpatialSubtractiveNormalization
(9): nn.Reshape(1600)
(10): nn.Linear(1600 -> 128)
(11): nn.Tanh
(12): nn.Linear(128 -> 10)
}
==> define loss
==> here is the loss function:
nn.ClassNLLCriterion
==> defining some tools
luajit: /home/sandesh/torch/install/share/lua/5.1/sys/init.lua:38: attempt to index local 'f' (a nil value)
stack traceback:
/home/sandesh/torch/install/share/lua/5.1/sys/init.lua:38: in function 'execute'
/home/sandesh/torch/install/share/lua/5.1/sys/init.lua:71: in function 'uname'
/home/sandesh/torch/install/share/lua/5.1/optim/Logger.lua:38: in function '__init'
/home/sandesh/torch/install/share/lua/5.1/torch/init.lua:91: in function
[C]: in function 'Logger'
4_train.lua:60: in main chunk
[C]: in function 'dofile'
doall.lua:70: in main chunk
[C]: at 0x00406670
I did not change any code in any of the lua files ...
This is the 4_train.lua file
----------------------------------------------------------------------
-- This script demonstrates how to define a training procedure,
-- irrespective of the model/loss functions chosen.
--
-- It shows how to:
-- + construct mini-batches on the fly
-- + define a closure to estimate (a noisy) loss
-- function, as well as its derivatives wrt the parameters of the
-- model to be trained
-- + optimize the function, according to several optimization
-- methods: SGD, L-BFGS.
--
-- Clement Farabet
----------------------------------------------------------------------
require 'torch' -- torch
require 'xlua' -- xlua provides useful tools, like progress bars
require 'optim' -- an optimization package, for online and batch methods
----------------------------------------------------------------------
-- parse command line arguments
if not opt then
print '==> processing options'
cmd = torch.CmdLine()
cmd:text()
cmd:text('SVHN Training/Optimization')
cmd:text()
cmd:text('Options:')
cmd:option('-save', 'results', 'subdirectory to save/log experiments in')
cmd:option('-visualize', false, 'visualize input data and weights during training')
cmd:option('-plot', false, 'live plot')
cmd:option('-optimization', 'SGD', 'optimization method: SGD | ASGD | CG | LBFGS')
cmd:option('-learningRate', 1e-3, 'learning rate at t=0')
cmd:option('-batchSize', 1, 'mini-batch size (1 = pure stochastic)')
cmd:option('-weightDecay', 0, 'weight decay (SGD only)')
cmd:option('-momentum', 0, 'momentum (SGD only)')
cmd:option('-t0', 1, 'start averaging at t0 (ASGD only), in nb of epochs')
cmd:option('-maxIter', 2, 'maximum nb of iterations for CG and LBFGS')
cmd:text()
opt = cmd:parse(arg or {})
end
----------------------------------------------------------------------
-- CUDA?
if opt.type == 'cuda' then
model:cuda()
criterion:cuda()
end
----------------------------------------------------------------------
print '==> defining some tools'
-- classes
classes = {'1','2','3','4','5','6','7','8','9','0'}
-- This matrix records the current confusion across classes
confusion = optim.ConfusionMatrix(classes)
-- Log results to files
trainLogger = optim.Logger(paths.concat(opt.save, 'train.log'))
testLogger = optim.Logger(paths.concat(opt.save, 'test.log'))
-- Retrieve parameters and gradients:
-- this extracts and flattens all the trainable parameters of the model
-- into a 1-dim vector
if model then
parameters,gradParameters = model:getParameters()
end
----------------------------------------------------------------------
print '==> configuring optimizer'
if opt.optimization == 'CG' then
optimState = {
maxIter = opt.maxIter
}
optimMethod = optim.cg
elseif opt.optimization == 'LBFGS' then
optimState = {
learningRate = opt.learningRate,
maxIter = opt.maxIter,
nCorrection = 10
}
optimMethod = optim.lbfgs
elseif opt.optimization == 'SGD' then
optimState = {
learningRate = opt.learningRate,
weightDecay = opt.weightDecay,
momentum = opt.momentum,
learningRateDecay = 1e-7
}
optimMethod = optim.sgd
elseif opt.optimization == 'ASGD' then
optimState = {
eta0 = opt.learningRate,
t0 = trsize * opt.t0
}
optimMethod = optim.asgd
else
error('unknown optimization method')
end
----------------------------------------------------------------------
print '==> defining training procedure'
function train()
-- epoch tracker
epoch = epoch or 1
-- local vars
local time = sys.clock()
-- set model to training mode (for modules that differ in training and testing, like Dropout)
model:training()
-- shuffle at each epoch
shuffle = torch.randperm(trsize)
-- do one epoch
print('==> doing epoch on training data:')
print("==> online epoch # " .. epoch .. ' [batchSize = ' .. opt.batchSize .. ']')
for t = 1,trainData:size(),opt.batchSize do
-- disp progress
xlua.progress(t, trainData:size())
-- create mini batch
local inputs = {}
local targets = {}
for i = t,math.min(t+opt.batchSize-1,trainData:size()) do
-- load new sample
local input = trainData.data[shuffle[i]]
local target = trainData.labels[shuffle[i]]
if opt.type == 'double' then input = input:double()
elseif opt.type == 'cuda' then input = input:cuda() end
table.insert(inputs, input)
table.insert(targets, target)
end
-- create closure to evaluate f(X) and df/dX
local feval = function(x)
-- get new parameters
if x ~= parameters then
parameters:copy(x)
end
-- reset gradients
gradParameters:zero()
-- f is the average of all criterions
local f = 0
-- evaluate function for complete mini batch
for i = 1,#inputs do
-- estimate f
local output = model:forward(inputs[i])
local err = criterion:forward(output, targets[i])
f = f + err
-- estimate df/dW
local df_do = criterion:backward(output, targets[i])
model:backward(inputs[i], df_do)
-- update confusion
confusion:add(output, targets[i])
end
-- normalize gradients and f(X)
gradParameters:div(#inputs)
f = f/#inputs
-- return f and df/dX
return f,gradParameters
end
-- optimize on current mini-batch
if optimMethod == optim.asgd then
_,_,average = optimMethod(feval, parameters, optimState)
else
optimMethod(feval, parameters, optimState)
end
end
-- time taken
time = sys.clock() - time
time = time / trainData:size()
print("\n==> time to learn 1 sample = " .. (time*1000) .. 'ms')
-- print confusion matrix
print(confusion)
-- update logger/plot
trainLogger:add{['% mean class accuracy (train set)'] = confusion.totalValid * 100}
if opt.plot then
trainLogger:style{['% mean class accuracy (train set)'] = '-'}
trainLogger:plot()
end
-- save/log current net
local filename = paths.concat(opt.save, 'model.net')
os.execute('mkdir -p ' .. sys.dirname(filename))
print('==> saving model to '..filename)
torch.save(filename, model)
-- next epoch
confusion:zero()
epoch = epoch + 1
end
This is doall.lua
----------------------------------------------------------------------
-- This tutorial shows how to train different models on the street
-- view house number dataset (SVHN),
-- using multiple optimization techniques (SGD, ASGD, CG), and
-- multiple types of models.
--
-- This script demonstrates a classical example of training
-- well-known models (convnet, MLP, logistic regression)
-- on a 10-class classification problem.
--
-- It illustrates several points:
-- 1/ description of the model
-- 2/ choice of a loss function (criterion) to minimize
-- 3/ creation of a dataset as a simple Lua table
-- 4/ description of training and test procedures
--
-- Clement Farabet
----------------------------------------------------------------------
require 'torch'
----------------------------------------------------------------------
print '==> processing options'
cmd = torch.CmdLine()
cmd:text()
cmd:text('SVHN Loss Function')
cmd:text()
cmd:text('Options:')
-- global:
cmd:option('-seed', 1, 'fixed input seed for repeatable experiments')
cmd:option('-threads', 2, 'number of threads')
-- data:
cmd:option('-size', 'full', 'how many samples do we load: small | full | extra')
-- model:
cmd:option('-model', 'convnet', 'type of model to construct: linear | mlp | convnet')
-- loss:
cmd:option('-loss', 'nll', 'type of loss function to minimize: nll | mse | margin')
-- training:
cmd:option('-save', 'results', 'subdirectory to save/log experiments in')
cmd:option('-plot', false, 'live plot')
cmd:option('-optimization', 'SGD', 'optimization method: SGD | ASGD | CG | LBFGS')
cmd:option('-learningRate', 1e-3, 'learning rate at t=0')
cmd:option('-batchSize', 1, 'mini-batch size (1 = pure stochastic)')
cmd:option('-weightDecay', 0, 'weight decay (SGD only)')
cmd:option('-momentum', 0, 'momentum (SGD only)')
cmd:option('-t0', 1, 'start averaging at t0 (ASGD only), in nb of epochs')
cmd:option('-maxIter', 2, 'maximum nb of iterations for CG and LBFGS')
cmd:option('-type', 'double', 'type: double | float | cuda')
cmd:text()
opt = cmd:parse(arg or {})
-- nb of threads and fixed seed (for repeatable experiments)
if opt.type == 'float' then
print('==> switching to floats')
torch.setdefaulttensortype('torch.FloatTensor')
elseif opt.type == 'cuda' then
print('==> switching to CUDA')
require 'cunn'
torch.setdefaulttensortype('torch.FloatTensor')
end
torch.setnumthreads(opt.threads)
torch.manualSeed(opt.seed)
----------------------------------------------------------------------
print '==> executing all'
dofile '1_data.lua'
dofile '2_model.lua'
dofile '3_loss.lua'
dofile '4_train.lua'
dofile '5_test.lua'
----------------------------------------------------------------------
print '==> training!'
while true do
train()
test()
end
The git link is https://github.com/torch/tutorials/blob/master/2_supervised/4_train.lua
Also, I'm not using CUDA as I don't have a GPU.
I am currently playing with torch-rnn and have stumbled across this issue repeatedly. Check your input and see whether an incorrect file, or no file at all, was supplied. If the code worked before and nothing was changed, the code itself should be intact and it is your input that is problematic.
I won't tell you what is wrong as you did not show any efforts to solve the problem yourself. But I will tell you how to proceed.
luajit: /home/sandesh/torch/install/share/lua/5.1/sys/init.lua:38:
attempt to index local 'f' (a nil value)
stack traceback:
/home/sandesh/torch/install/share/lua/5.1/sys/init.lua:38: in function 'execute'
/home/sandesh/torch/install/share/lua/5.1/sys/init.lua:71: in function 'uname'
/home/sandesh/torch/install/share/lua/5.1/optim/Logger.lua:38: in function '__init'
/home/sandesh/torch/install/share/lua/5.1/torch/init.lua:91: in function
[C]: in function 'Logger'
This tells you that some local f in init.lua line 38 is nil, which causes a problem. So open that file and find out where that f's value should come from and why it is nil. Then fix that. Also see if there is a more recent version of Torch that handles f being nil correctly. If not, change the code yourself, if possible. Otherwise try to stop this from happening by validating your inputs to torch.