I am trying to implement a neural network with multiple layers. I would like to know whether what I have done is correct and, if not, how to debug it. I define my network as follows (I initialise the lookup table layer with previously learned embeddings):
lookupTableLayer = nn.LookupTable(vector:size()[1], d)
for i=1,vector:size()[1] do
    lookupTableLayer.weight[i] = vector[i]
end
mlp=nn.Sequential();
mlp:add(lookupTableLayer)
mlp:add(nn.TemporalConvolution(d,H,K,dw))
mlp:add(nn.Tanh())
mlp:add(nn.Max(1))
mlp:add(nn.Tanh())
mlp:add(nn.Linear(H,d))
Now, to train the network, I loop through every training example and for every example I call gradUpdate() which has this code (this is straight from the examples):
function gradUpdate(mlp, x, indexY, learningRate)
    local pred = mlp:forward(x)
    local gradCriterion = findGrad(pred, indexY)
    mlp:zeroGradParameters()
    mlp:backward(x, gradCriterion)
    mlp:updateParameters(learningRate)
end
The findGrad function is just an implementation of WARP Loss which returns the gradient wrt output. I am wondering if this is all I need? I assume this will backpropagate and update the parameters of all the layers. To check this, I trained this network and saved the model. Then I loaded the model and did:
{load saved mlp after training}
lookuptable = mlp:findModules('nn.LookupTable')[1]
Now, I checked vector[1] and lookuptable.weight[1] and they were the same. Why did the weights in the lookup table layer not get updated? What am I missing here?
Looking forward to your replies!
I am trying to classify different ECG signals. I am using Keras' Conv1D, but am not getting any good results.
I have tried changing the number of layers, window size, etc, but every time I run this I get predictions all of the same class (the classes are 0,1,2, so I get a prediction output of something like [1,1,1,1,1,1,1,1,1,1,1,1,1,1], but the class changes each time I run the script).
The ECG signals are in 1000 point numpy arrays.
Are there any glaringly obvious things I am doing wrong here? I was thinking it would've worked great to use a few layers to just classify into 3 different ECG signals.
#arrange and randomize data
y1=[[0]]*len(lead1)
y2=[[1]]*len(lead2)
y3=[[2]]*len(lead3)
y=np.concatenate((y1,y2,y3))
data=np.concatenate((lead1,lead2,lead3))
data = keras.utils.normalize(data)
data=np.concatenate((data,y),axis=1)
data=np.random.permutation((data))
print(data)
#separate data and create categories
Xtrain=data[0:130,0:-1]
Xtrain=np.reshape(Xtrain,(len(Xtrain),1000,1))
Xpred=data[130:,0:-1]
Xpred=np.reshape(Xpred,(len(Xpred),1000,1))
Ytrain=data[0:130,-1]
Yt=to_categorical(Ytrain)
Ypred=data[130:,-1]
Yp=to_categorical(Ypred)
#create CNN model
model = Sequential()
model.add(Conv1D(20,20,activation='relu',input_shape=(1000,1)))
model.add(MaxPooling1D(3))
model.add(Conv1D(20,10,activation='relu'))
model.add(MaxPooling1D(3))
model.add(Conv1D(20,10,activation='relu'))
model.add(GlobalAveragePooling1D())
model.add(Dense(3,activation='relu',use_bias=False))
model.compile(optimizer='adam', loss='categorical_crossentropy',metrics=['accuracy'])
model.fit(Xtrain,Yt)
#test model
print(model.evaluate(Xpred,Yp))
print(model.predict_classes(Xpred,verbose=1))
Are there any glaringly obvious things I am doing wrong here?
Indeed there is: the output you report is not surprising, given that you are currently using the ReLU as activation for your last layer, which does not make any sense.
In multi-class settings, such as yours, the activation of the last layer must be the softmax, and certainly not the ReLU; change your last layer to:
model.add(Dense(3, activation='softmax'))
Not quite sure why you ask for use_bias=False, but you can try both with and without it and experiment...
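To illustrate why the last-layer activation matters, here is a minimal NumPy sketch (not Keras code). The logits are made-up values standing in for the pre-activation output of the final Dense layer. ReLU can zero out every class score, whereas softmax always yields a valid probability distribution for categorical cross-entropy.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    # Subtract the row maximum for numerical stability.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical pre-activation outputs (logits) of the last Dense layer
# for two samples and three classes.
logits = np.array([[-1.2, -0.3, -2.0],
                   [0.5, 2.0, -1.0]])

relu_out = relu(logits)
softmax_out = softmax(logits)

print(relu_out)                 # the first row is all zeros
print(softmax_out.sum(axis=1))  # every row sums to 1
```

With all-zero ReLU outputs the arg-max is decided by tie-breaking rather than by the data, which is consistent with seeing a single predicted class for every sample.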
I have a neural network with an input layer of 10 nodes, some hidden layers and an output layer with only 1 node. I put a pattern into the input layer and, after some processing, the output neuron gives a number from 1 to 10. After training, the model is able to produce the output for a given input pattern.
Now, my question is whether it is possible to calculate the inverse model. That is, I provide a number on the output side (i.e. using the output side as input) and get a random pattern from those 10 input neurons (i.e. using the input side as output).
I want to do this because I will first train a network on the difficulty of patterns (the input is the pattern and the output is how difficult the pattern is to understand). Then I want to feed the network a number so that it creates random patterns of that difficulty.
I hope I understood your problem correctly, so I will summarize it in my own words: You have a given model, and want to determine the input which yields a given output.
Supposing this is correct, there is at least one way I know of to do this approximately. It is very easy to implement, but might take a while to compute. There are probably better ways, but I am not sure (I needed this technique some weeks ago for reinforcement learning and did not find anything better). Let's assume that your model M maps an input x to an output y. We now create a new model, which we will call M'. This model will later compute the inverse of M, so that it gives you an input which yields a specific output. To construct M', we create one plain Dense layer whose output has the same dimension m as the input of M, and connect it to the input of M. Next, make all weights of M non-trainable (this is very important!).
Now we are already set up to find an inverse value. Assume you want to find an input corresponding to the output y (corresponding here means that it produces the output; it is not unique). Create a new input v which is the unity, v = 1. Then create an input-output data pair (v, y). Now use any optimizer you wish to propagate this training pair through the network until the error converges to zero. Once this has happened, you can calculate the real input which gives the output y: if the weights of the new input layer are called w and its bias b, the desired input u is u = w*1 + b (where 1 is the unity input v).
You might ask why this equation holds, so let me try to answer: the model will try to learn the weights of the new input layer so that the unity as input produces the given output. As only the newly added input layer is trainable, only these weights will change. Therefore, each weight in this vector (together with its bias) will represent the corresponding component of the desired input vector. By using an optimizer and minimizing the l^2 distance between the wanted output and the output of our inverse model M', we finally determine a set of weights which gives a good approximation of the input vector.
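The procedure above can be sketched in plain NumPy. This is only an illustration under assumptions: the frozen network here is a small random tanh model standing in for a trained one, the gradient is written by hand instead of using a framework optimizer, and names such as `model`, `model_grad` and `y_target` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen stand-in for the trained model: 10 inputs -> 1 output.
# These weights are random placeholders, not a trained network.
W = 0.3 * rng.normal(size=(8, 10))
a = rng.normal(size=8)

def model(x):
    return a @ np.tanh(W @ x)

def model_grad(x):
    # Gradient of the scalar output with respect to the input x.
    h = np.tanh(W @ x)
    return W.T @ (a * (1.0 - h ** 2))

# Pick a target output that the model can actually produce.
y_target = model(0.5 * rng.normal(size=10))

# Trainable "input layer": u = w * 1 + b, fed with the constant 1.
w = np.zeros(10)
b = np.zeros(10)
learning_rate = 0.005

for _ in range(5000):
    u = w * 1.0 + b
    error = model(u) - y_target
    grad_u = error * model_grad(u)  # gradient of 0.5 * error**2 w.r.t. u
    # du/dw = 1 and du/db = 1, so both receive the same gradient.
    w -= learning_rate * grad_u
    b -= learning_rate * grad_u

u = w * 1.0 + b
print("recovered input:", u)
print("model(u) =", model(u), "target =", y_target)
```

The recovered `u` is one of many inputs that produce the target; a different initialisation of `w` and `b` would generally give a different pattern.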
I'm currently using the fb torch resnet library from GitHub.
It's my first time using Torch and Lua, so I'm running into some problems.
My goal is to save the feature vector of a specific layer (the last average pooling of ResNet) into a file, together with the class of the input image. All input images are from the CIFAR-10 dataset.
The file format that I want is as follows:
image1.txt := class index of image and feature vector of image 1 of cifar-10
image2.txt := class index of image and feature vector of image 2 of cifar-10
// and so on through all images of cifar-10
I have looked at the sample code extract-features.lua in that GitHub repository.
Because it's my first time with Lua, I find it hard to understand this code and to modify it the way I want. I also don't want my data saved in the t7 file format.
How can I access only one specific layer of the network in Torch via Lua? (last average pooling)
How can I access the values of the layer and the classification result index?
How can I read all the images from the CIFAR-10 file (t7 batch)?
Sorry for so many questions, but I'm finding Torch hard to use because of the small number of community threads and posts about it. Please bear with me.
How can I access only one specific layer of the network in Torch via Lua? (last average pooling)
To access a layer you just have to load the model and get the layer by its integer index. If you run print(model) you will be able to see at which position the last average pooling is.
model = torch.load(path_to_model):cuda()
avg_pooling_layer = model:get(position_of_the_avg_pooling_layer)
How can I access the values of the layer and the classification result index?
I do not quite understand what you mean by this. If you want to see the output or the weights of a specific layer (following the code above), you need to get these elements from the layer table. Again, to see which elements are available, use print(avg_pooling_layer). Note that the output field is only filled in after a forward pass through the model.
weights = avg_pooling_layer.weight -- get the weights of the layer (nil for layers without parameters, such as pooling)
output = avg_pooling_layer.output -- get the output of the layer
How can I read all the images from the CIFAR-10 file (t7 batch)?
To read the images from a t7 file, use the Torch function torch.load (used above to load the model).
cifar_10 = torch.load("path_to_cifar-10.t7")
Once loaded, the training and test sets may be stored in subtables or behind functions. Again, print the table to see which values you need to get.
Hope this helps!
First of all: I'm completely new to Machine Learning and TensorFlow. I've only been playing around with this technology for a few weeks, and I really like it.
But I have (maybe a simple) question about the MNIST data set in combination with TensorFlow: I'm currently working through the "MNIST for ML Beginners" tutorial (https://www.tensorflow.org/versions/r0.11/tutorials/mnist/beginners/index.html#mnist-for-ml-beginners). I fully understand how the whole thing works, and what I accomplish with the source code.
My question is now the following:
Is it possible to see the individual weights parameters for each pixel? As far as I understand I can't access the individual weight parameters directly for each pixel, because the tf.matmul() operation returns me the sum over all weight parameters for a given class.
I want to access the individual weight parameters, because I want to see how these values are changing through the training process of the Neural Network.
Thanks for your help,
-Klaus
You can get the actual weights by just doing something like:
w = sess.run(W, feed_dict={x: batch_xs, y_: batch_ys})
print(w.shape)
If you want the per-pixel results, just do an element-wise multiply of batch_xs * w (reshaped appropriately).
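As a rough NumPy sketch of that element-wise multiply (using random stand-ins rather than the tutorial's trained W, and a single image instead of a whole batch), the per-pixel contributions can be formed by broadcasting. Summing them over the pixel axis gives the same class scores that the matrix product produces.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the tutorial's tensors (random values, not trained):
# W has one column of 784 pixel weights per digit class.
W = rng.normal(size=(784, 10))
x = rng.random(784)  # one flattened 28x28 image

# Per-pixel contribution of the image to each class score.
contrib = x[:, np.newaxis] * W  # shape (784, 10)

# Summing over pixels reproduces what the matrix product computes.
scores = x @ W  # shape (10,)

# Weights of class 3 viewed as a 28x28 image.
w_class3 = W[:, 3].reshape(28, 28)
print(contrib.shape, scores.shape, w_class3.shape)
```

Fetching W at several points during training and reshaping a column like `w_class3` is one way to watch how the weights of a class evolve.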
I've been slowly going through the tensorflow tutorials, and I assume I will have to again. I don't have a background in ML but am slowly pushing my way up.
Anyway, after reading through the RNN tutorial and running the training code, I am confused.
How does one actually apply the trained model so that it can be used to make language predictions?
I know this is a terribly noobish and simple question, but I believe it will be of use to others, as I have seen it asked and not answered in a satisfactory way.
In general, when you train a model, you first do a forward pass and then a backward pass. The forward pass makes a prediction based on your input data, and the backward pass adjusts your model based on how correct your prediction was.
So when you want to apply your model, you just do a forward pass with your new data as input.
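The split between training (forward plus backward) and applying the model (forward only) can be shown with a tiny NumPy sketch. It uses a one-parameter linear model rather than an RNN, and the data and names (`forward`, `x_train`, `new_prediction`) are made up for illustration.

```python
import numpy as np

# Toy data generated from y = 2x + 1.
x_train = np.linspace(-1.0, 1.0, 20)
y_train = 2.0 * x_train + 1.0

w, b = 0.0, 0.0
learning_rate = 0.1

def forward(x, w, b):
    # Forward pass: compute the prediction from the current parameters.
    return w * x + b

for _ in range(500):
    pred = forward(x_train, w, b)          # forward pass
    err = pred - y_train
    grad_w = 2.0 * np.mean(err * x_train)  # backward pass: gradients of the MSE
    grad_b = 2.0 * np.mean(err)
    w -= learning_rate * grad_w            # parameter update
    b -= learning_rate * grad_b

# Applying the trained model: forward pass only, no update.
new_prediction = forward(3.0, w, b)
print(new_prediction)
```

Nothing in the last step touches `w` or `b`, which is the same role that passing a no-op as the training operation plays in the tutorial code.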
In your particular example, using this code, you can see how it's done by looking at how they run the test set, starting line 286.
# They instantiate the model with is_training=False
mtest = PTBModel(is_training=False, config=eval_config)
# Then they can do a forward pass
test_perplexity = run_epoch(session, mtest, test_data, tf.no_op())
print("Test Perplexity: %.3f" % test_perplexity)
And if you want the actual prediction and not the perplexity, it is the state in the run_epoch function:
cost, state, _ = session.run([m.cost, m.final_state, eval_op],
                             {m.input_data: x,
                              m.targets: y,
                              m.initial_state: state})