When we say 2-layer NN, do we count the hidden layers? If we don't count them it means that the 2-layer NN have 4 layers in total. One hidden layer for first layer and one hidden layer for the second layer, right? I'll be happy for clarification.
Related
I'm watching this lecture on youtube about neural networks. I'm unsure on how to identify the amount of neurons and layers in the network. I have posted the lecture slide from the youtube lecture on the left with 3 neurons and 2 layers. I found an image on google and posted it on the right. Are my guesses for the number neurons and layers correct for the google image?
Amount of neurons and layers are correct in your left (from lecture) image and on the second one you are wrong. There is three layers and twelve neurons, but whole number of neurons doesn't have any meaning, most important is how many neurons are in each layer. So you can describe this network like this:
Neural Network with three layers, first layer (input layer) consists
with four neurons, second layer (hidden layer) - with six
neurons and third layer (output layer) - with two neurons.
Even in this picture you have layers marked and well represented graphically
In any networks you have input layer and output layer, so you already have two layers and if inputs are being processed by another neurons before output neurons, it means that you have hidden layer. You can have many hidden layers in your net, it is called multi-layered neural net. Here is a examples some of them:
this is four layered neural network with two hidden layers
this is more complex network with three hidden layers. (Five layers in sum)
NOTE:
Increasing of layers in your network increases it's training time
As per the documentation provided by Scikit learn
hidden_layer_sizes : tuple, length = n_layers - 2, default (100,)
I have little doubt.
In my code what I have configured is
MLPClassifier(algorithm='l-bfgs', alpha=1e-5, hidden_layer_sizes=(5, 2), random_state=1)
so what does 5 and 2 indicates?
What I understand is, 5 is the numbers of hidden layers, but then what is 2?
Ref - http://scikit-learn.org/dev/modules/generated/sklearn.neural_network.MLPClassifier.html#
From the link you provided, in parameter table, hidden_layer_sizes row:
The ith element represents the number of neurons in the ith hidden
layer
Which means that you will have len(hidden_layer_sizes) hidden layers, and, each hidden layer i will have hidden_layer_sizes[i] neurons.
In your case, (5, 2) means:
1rst hidden layer has 5 neurons
2nd hidden layer has 2 neurons
So the number of hidden layers is implicitely set
Some details that I found online concerning the architecture and the units of the input, hidden and output layers in sklearn.
The number of input units will be the number of features
For multiclass classification the number of output units will be the number of labels
Try a single hidden layer, or if more than one then each hidden layer should have the same number of units
The more units in a hidden layer the better, try the same as the number of input features up to twice or even three or four times that
I'm trying to create the neural network shown below. It has 3 inputs, 2 outputs, and 2 hidden layers (so 4 layers altogether, or 3 layers of weight matrices). In the first hidden layer there are 4 neurons, and in the second hidden layer there are 3. There is a bias neuron going to the first and second hidden layer, and the output layer.
I have tried using the "create custom neural network" function in MATLAB, but I can't get it to work how I want it to.
This is how I used the function
net1=network(3,3,[1;1;1],[1,1,1;0,0,0;0,0,0],[0,0,0;1,0,0;0,1,0],[0,0,0])
view(net1)
And it gives me the neural network shown below:
As you can see, this isn't what I want. There are only 3 weights in the first layer, 1 in the second, 1 in the output layer, and only one output. How would I fix this?
Thanks!
Just to clarify how I want this network to work:
The user will input 3 numbers into the network.
Each one of the 3 inputs is multiplied by 4 different weights, and then these numbers are sent to the 4 neurons in the first hidden layer.
The bias node acts the same as one of the inputs, but it always has a value of 1. It is multiplied by 4 different weights, and then sent to the 4 neurons in the first hidden layer.
Each neuron in the first hidden layer sums the 4 numbers going into it, and then passes this number through the sigmoid activation function.
The neurons in the first hidden layer then output 4 numbers that are each multiplied by 3 different weights, and sent to the 3 neurons in the second hidden layer.
The bias node going to the second hidden layer works the same as the first bias node
Each neurons in the second hidden layer sums up the 5 numbers going into it and passes it through the sigmoid activation function.
The neurons in the second layer then output two numbers that are again multiplied by weights and go to each of the outputs
The output layer also sums all of its inputs, including its bias input, and then passes this through the sigmoid activation function to get the final two values.
After some time playing around I've figured out how to do it. The code I needed to use is:
net = newff([0 1; 0 1; 0 1],[4,3 2],{'logsig','logsig','logsig'})
view(net)
This creates the network I was looking for.
I was originally mistaken about the matlab representation of neural networks. The green arrows show the path of all of the numbers, not just a single number.
When dealing with muticlass classification, is it always that the number of nodes (which are vectors) in the input layer excluding bias is the same as the number of nodes in the output layer?
No. The input layer ingests the features. The output layer makes predictions for classes. The number of features and classes does not need to be the same; it also depends on how exactly you model the multiple classes output.
Lars Kotthoff is right. However, when you are using an artificial neural network to build an autoencoder, you will want to have the same number of input and output nodes, and you will want the output nodes to learn the values of the input nodes.
Nope,
Usually number of input unites equals to number of features you are going use for training the NN classifier.
Size of the output layer equals to number of classes in the dataset. Further, if dataset has two classes only just one output unit is enough for discriminating these two classes.
The ANN output layer has a node for each class: if you have 3 classes, you use 3 nodes. The input layer (often called a feature vector) has a node for each feature used for prediction and usually an extra bias node. You usually need only 1 hidden layer, and discerning its ideal size tricky.
Having too many hidden layer nodes can result in overfitting and slow training. Having too few hidden layer nodes can result in underfitting (overgeneralizing).
Here are a few general guidelines (source) to start with:
The number of hidden neurons should be between the size of the input layer and the size of the output layer.
The number of hidden neurons should be 2/3 the size of the input layer, plus the size of the output layer.
The number of hidden neurons should be less than twice the size of the input layer.
If you have 3 classes and an input vector of 30 features, you can start with a hidden layer of around 23 nodes. Add and remove nodes from this layer during training to reduce your error, while testing against validation data to prevent overfitting.
For Example for 3-1-1 layer if the weights are initialized equally the MLP might not learn well. But why does this happen?
If you only have one neuron in the hidden layer, it doesn't matter. But, imagine a network with two neurons in the hidden layer. If they have the same weights for their input, than both neurons would always have the exact same activation, there is no additional information by having a second neuron. And in the backpropagation step, those weights would change by an equal amount. Hence, in every iteration, those hidden neurons have the same activation.
It looks like you have a typo in your question title. I'm guessing that you mean why should the weights of hidden layer be random. For the example network you indicate (3-1-1), it won't matter because you only have a single unit in the hidden layer. However, if you had multiple units in the hidden layer of a fully connected network (e.g., 3-2-1) you should randomize the weights because otherwise, all of the weights to the hidden layer will be updated identically. That is not what you want because each hidden layer unit would be producing the same hyperplane, which is no different than just having a single unit in that layer.