How to use a trained Neural Network? - machine-learning

I'm trying to understand the logic behind the use of a trained Neural Network. If I'm right : we should save the weights from the previous training, then, reload them with the new input.
For example, I have this data set :
Input= [[0,1][1,1]]
Output=[[1],[0]]
Results after training = [[0.999...],[0.005...]]
And I have also saved the weights. What I don't understand is : how I should use the previous weights to make a prediction for example ? For example, I want to try a prediction with the following input [1,0]. I find a lot of resources online with Matlab or Python, but I don't find something to clearly understand what the calculations are, to do it "from scratch".
Thank you,

It is as simple as doing your feedforward step with learned weights.
these are the steps you do in general:
1- feed forward : giving inputs to produce output labels
2-calculating the cost base on true labels of inputs which you have in a supervised problem
3-going backward in network to update your weights base on the cost
After you finished the training , you don't do step 2 and 3, you just Do the first step. going forward in network with new inputs and the learned weights in training process. the output is your prediction.

Related

Predicting inputs in neural network

Is it possible to predict inputs in "Keras neural network" for a particular output?
For example, I have a dataset with 28 inputs and 3 outputs. So, I have trained the model in Keras which works fine. Now, I have to enter the particular values in outputs and I have to predict that what will be the inputs for that particular output.
I'm not 100% sure I understand the question correctly, but if you're trying to build a model that can take inputs and predict outputs, then you will need to train a second model to predict inputs from outputs, where you swap the inputs and outputs so that outputs are your inputs, and your inputs are the outputs. Although this might be annoying, you might have to build a separate network to predict each of your input variables.
To get around this problem, you can consider autoencoders if you're okay with getting a close approximation of the input. An autoencoder is an unsupervised artificial neural network that learns how to efficiently compress and encode data then learns how to reconstruct the data back from the reduced encoded representation to a representation that is as close to the original input as possible (you can read more here: https://towardsdatascience.com/auto-encoder-what-is-it-and-what-is-it-used-for-part-1-3e5c6f017726).
Yes it is definitely possible to predict inputs from the output. In fact, what you're describing is essentially an autoencoder.
Let's say you have a NN trained on MNIST. If you then use the outputs of the classification layer to train the decoder of an auto encoder, you will get a rough indication of the input.
However this is not the best way to do it. The best way to do it is to simply have the latent space be considered the "output", then feed this output into:
a): A 1 layer classification to give you the predicted output and
b): the decoder
This will give you the predicted output and the original image

adding more output neurons to neural network after training?

Is there a problem of adding more output neurons after finishing training my neural network .
for example I teach my neural network how to see oranges and apples and say which one is apple and which one is orange. Shades, shape and texture as inputs and orange and apple as outputs so there are 3 inputs and 2 outputs.
what if I trained them and I wanted to add two more outputs lets say banana and stawberey. If I did that does my neural network previous learning fail ? or do I make something wrong here ? or it is safe to do that ?
You will most likely need to re-train the network from scratch incorporating the old and new data and four classes instead of two. If you try to add new classes to existing network, you are liable to run into what is called catastrophic forgetting.. However, you may be fine with only re-training the final classifier, or fine-tuning from previously learned weights.

Training of a CNN using Backpropagation

I have earlier worked in shallow(one or two layered) neural networks, so i have understanding of them, that how they work, and it is quite easy to visualize the derivations for forward and backward pass during the training of them, Currently I am studying about Deep neural networks(More precisely CNN), I have read lots of articles about their training, but still I am unable to understand the big picture of the training of the CNN, because in some cases people using pre- trained layers where convolution weights are extracted using auto-encoders, in some cases random weights were used for convolution, and then using back propagation they train the weights, Can any one help me to give full picture of the training process from input to fully connected layer(Forward Pass) and from fully connected layer to input layer (Backward pass).
Thank You
I'd like to recommend you a very good explanation of how to train a multilayer neural network using backpropagation. This tutorial is the 5th post of a very detailed explanation of how backpropagation works, and it also has Python examples of different types of neural nets to fully understand what's going on.
As a summary of Peter Roelants tutorial, I'll try to explain a little bit what is backpropagation.
As you have already said, there are two ways to initialize a deep NN: with random weights or pre-trained weights. In the case of random weights and for a supervised learning scenario, backpropagation works as following:
Initialize your network parameters randomly.
Feed forward a batch of labeled examples.
Compute the error (given by your loss function) within the desired output and the actual one.
Compute the partial derivative of the output error w.r.t each parameter.
These derivatives are the gradients of the error w.r.t to the network's parameters. In other words, they are telling you how to change the value of the weights in order to get the desired output, instead of the produced one.
Update the weights according to those gradients and the desired learning rate.
Perform another forward pass with different training examples, repeat the following steps until the error stops decreasing.
Starting with random weights is not a problem for the backpropagation algorithm, given enough training data and iterations it will tune the weights until they work for the given task.
I really encourage you to follow the full tutorial I linked, because you'll get a very detalied view of how and why backpropagation works for multi layered neural networks.

Using Convolution Neural network as Binary classifiers

Given any image I want my classifier to tell if it is Sunflower or not. How can I go about creating the second class ? Keeping the set of all possible images - {Sunflower} in the second class is an overkill. Is there any research in this direction ? Currently my classifier uses a neural network in the final layer. I have based it upon the following tutorial :
https://github.com/torch/tutorials/tree/master/2_supervised
I am taking images with 254x254 as the input.
Would SVM help in the final layer ? Also I am open to using any other classifier/features that might help me in this.
The standard approach in ML is that:
1) Build model
2) Try to train on some data with positive\negative examples (start with 50\50 of pos\neg in training set)
3) Validate it on test set (again, try 50\50 of pos\neg examples in test set)
If results not fine:
a) Try different model?
b) Get more data
For case #b, when deciding which additional data you need the rule of thumb which works for me nicely would be:
1) If classifier gives lots of false positive (tells that this is a sunflower when it is actually not a sunflower at all) - get more negative examples
2) If classifier gives lots of false negative (tells that this is not a sunflower when it is actually a sunflower) - get more positive examples
Generally, start with some reasonable amount of data, check the results, if results on train set or test set are bad - get more data. Stop getting more data when you get the optimal results.
And another thing you need to consider, is if your results with current data and current classifier are not good you need to understand if the problem is high bias (well, bad results on train set and test set) or if it is a high variance problem (nice results on train set but bad results on test set). If you have high bias problem - more data or more powerful classifier will definitely help. If you have a high variance problem - more powerful classifier is not needed and you need to thing about the generalization - introduce regularization, remove couple of layers from your ANN maybe. Also possible way of fighting high variance is geting much, MUCH more data.
So to sum up, you need to use iterative approach and try to increase the amount of data step by step, until you get good results. There is no magic stick classifier and there is no simple answer on how much data you should use.
It is a good idea to use CNN as the feature extractor, peel off the original fully connected layer that was used for classification and add a new classifier. This is also known as the transfer learning technique that has being widely used in the Deep Learning research community. For your problem, using the one-class SVM as the added classifier is a good choice.
Specifically,
a good CNN feature extractor can be trained on a large dataset, e.g. ImageNet,
the one-class SVM can then be trained using your 'sunflower' dataset.
The essential part of solving your problem is the implementation of the one-class SVM, which is also known as anomaly detection or novelty detection. You may refer http://scikit-learn.org/stable/modules/outlier_detection.html for some insights about the method.

Different weights for different classes in neural networks and how to use them after learning

I trained a neural network using the Backpropagation algorithm. I ran the network 30 times manually, each time changing the inputs and the desired output. The outcome is that of a traditional classifier.
I tried it out with 3 different classifications. Since I ran the network 30 times with 10 inputs for each class I ended up with 3 distinct weights but the same classification had very similar weights with a very small amount of error. The network has therefore proven itself to have learned successfully.
My question is, now that the learning is complete and I have 3 distinct type of weights (1 for each classification), how could I use these in a regular feed forward network so it can classify the input automatically. I searched around to check if you can somewhat average out the weights but it looks like this is not possible. Some people mentioned bootstrapping the data:
Have I done something wrong during the backpropagation learning process? Or is there an extra step which needs to be done post the learning process with these different weights for different classes?
One way how I am imaging this is by implementing a regular feed forward network which will have all of these 3 types of weights. There will be 3 outputs and for any given input, one of the output neurons will fire which will result that the given input is mapped to that particular class.
The network architecture is as follows:
3 inputs, 2 hidden neurons, 1 output neuron
Thanks in advance
It does not make sense if you only train one class in your neural network each time, since the hidden layer can make weight combinations to 'learn' which class the input data may belong to. Learn separately will make the weights independent. The network won't know which learned weight to use if a new test input is given.
Use a vector as the output to represent the three different classes, and train the data altogether.
EDIT
P.S, I don't think the link post you provide is relevant with your case. The question in that post arises from different weights initialization (randomly) in neural network training. Sometimes people apply some seed methods to make the weight learning reproducible to avoid such a problem.
In addition to response by nikie, another possibility is to represent output as one (unique) output unit with continuous values. For example, ann classify for first class if output is in the [0, 1) interval, for second if is in the [1, 2) interval and third classes in [2, 3). This architecture is declared in letterature (and verified in my experience) to be less efficient that discrete represetnation with 3 neurons.

Resources