Scikit learn multilayer neural network - machine-learning

As per the documentation provided by Scikit learn
hidden_layer_sizes : tuple, length = n_layers - 2, default (100,)
I have a small doubt.
In my code what I have configured is
MLPClassifier(solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(5, 2), random_state=1)
So what do 5 and 2 indicate?
What I understand is that 5 is the number of hidden layers, but then what is 2?
Ref - http://scikit-learn.org/dev/modules/generated/sklearn.neural_network.MLPClassifier.html#

From the link you provided, in the hidden_layer_sizes row of the parameter table:
The ith element represents the number of neurons in the ith hidden layer.
This means that you will have len(hidden_layer_sizes) hidden layers, and each hidden layer i will have hidden_layer_sizes[i] neurons.
In your case, (5, 2) means:
1st hidden layer has 5 neurons
2nd hidden layer has 2 neurons
So the number of hidden layers is set implicitly by the length of the tuple.
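To make the mapping concrete, here is a minimal plain-Python sketch (not scikit-learn code) that lists the weight-matrix shapes implied by a hidden_layer_sizes tuple. The helper name layer_shapes and the example values of 4 input features and 1 output unit are assumptions chosen for illustration. On a fitted model you could compare the result against [w.shape for w in clf.coefs_].

```python
def layer_shapes(n_features, hidden_layer_sizes, n_outputs):
    """Return the (n_in, n_out) shape of each weight matrix in the network."""
    sizes = [n_features, *hidden_layer_sizes, n_outputs]
    return list(zip(sizes[:-1], sizes[1:]))


# Hypothetical data set: 4 input features and a single output unit.
shapes = layer_shapes(4, (5, 2), 1)

# One weight matrix per pair of consecutive layers, so there is
# one fewer hidden layer than there are weight matrices.
print("hidden layers:", len(shapes) - 1)
print("weight shapes:", shapes)
```

With (5, 2) the list has three entries: inputs to the 5-neuron layer, the 5-neuron layer to the 2-neuron layer, and the 2-neuron layer to the output.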

Some details that I found online concerning the architecture and the units of the input, hidden and output layers in sklearn.
The number of input units will be the number of features
For multiclass classification, the number of output units will be the number of classes.
Try a single hidden layer; if you use more than one, give each hidden layer the same number of units.
The more units in a hidden layer the better; try anywhere from the number of input features up to two, three, or even four times that.

Related

What are the connections between two stacked LSTM layers?

This question is similar to "What's the input of each LSTM layer in a stacked LSTM network?", but goes further into implementation details.
For simplicity, consider a structure with 4 units followed by 2 units, like the following:
model.add(LSTM(4, input_shape=input_shape, return_sequences=True))
model.add(LSTM(2,input_shape=input_shape))
I know the output of LSTM_1 has length 4, but how do the next 2 units handle these 4 inputs? Are they fully connected to the next layer of nodes?
I guess they are fully connected, as in the following figure, but I am not sure, since the Keras documentation does not state it.
Thanks!
It's not length 4, it's 4 "features".
The length is in the input shape and it never changes. There is absolutely no difference between what happens when you give a regular input to one LSTM and what happens when you give the output of an LSTM to another LSTM.
You can just look at the model's summary to see the shapes and understand what is going on. You never change the length using LSTMs.
The two layers don't communicate at all. Each one takes the length dimension and processes it recurrently, independently of the other. When one finishes and outputs a tensor, the next one takes that tensor and processes it alone, following the same rules.
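The shape bookkeeping can be sketched in plain Python, without Keras. The function names and the example input of 10 timesteps with 3 features are assumptions for illustration. The parameter formula is the usual count for a standard LSTM layer with biases (four gates, each with input weights, recurrent weights, and a bias).

```python
def lstm_output_shape(input_shape, units, return_sequences):
    """Shape of one sample after an LSTM layer (batch dimension omitted)."""
    timesteps, _features = input_shape
    # With return_sequences=True the length (timesteps) is preserved
    # and only the feature dimension changes to `units`.
    return (timesteps, units) if return_sequences else (units,)


def lstm_param_count(input_features, units):
    """Trainable parameters of a standard LSTM layer with biases."""
    # 4 gates, each with input weights, recurrent weights and one bias per unit.
    return 4 * (units * (input_features + units) + units)


input_shape = (10, 3)  # assumed: 10 timesteps, 3 features per step

shape_1 = lstm_output_shape(input_shape, 4, return_sequences=True)
shape_2 = lstm_output_shape(shape_1, 2, return_sequences=False)

print(shape_1, lstm_param_count(3, 4))
print(shape_2, lstm_param_count(4, 2))
```

The second layer's count depends on the 4 features it receives: within each gate, every one of its 2 units has a weight for each of the 4 incoming features, so at every timestep the connection between the two layers is dense in that sense.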

Neural Network Hidden Layer Input Size for this Tutorial

I am following part 5 of this tutorial, which can be found at this link: http://peterroelants.github.io/posts/neural_network_implementation_part05/
It creates a neural network suitable for identifying handwritten digits from 0-9.
In the middle of the tutorial, the author explains that the neural network has 64 inputs (representing the 64-pixel image) and contains two hidden neural networks, each with an input size of 20 (see screenshot below).
I have two questions:
1) Can anyone explain the choice of projecting the 64 input layer onto a 20 input layer? Why the choice of 20? Is it arbitrary or determined by experiment? Is there an intuitive reason why?
2) Why two hidden layers? I read somewhere that most problems can be solved with 1-2 hidden layers, and that is usually determined by trial and error. Is it the same case here?
Appreciate any thoughts
The network has:
one input layer with 64 neurons --> one for each pixel
a hidden layer with 20 neurons
another hidden layer with 20 neurons
an output layer with 10 neurons --> one for each digit
The choice of two hidden layers with 20 neurons each is relatively arbitrary, and was probably determined by experiment, just as you said. Also, describing each of these layers as another network can be confusing or misleading. You are also right that 1-2 hidden layers are usually sufficient, and that is the case for digit recognition, which is not too complex.

Neural network, minimum number of neurons

I've got a 2D surface where a ship (with constant speed) navigates around the scene to pick up candy. For every candy the ship picks up I increase the fitness. The NN has one output to steer the ship (0 for left and 1 for right, so 0.5 would be straight ahead). There are four inputs in the range [-1 .. 1] that represent two normalized vectors: the ship direction and the direction to the piece of candy.
Is there any way to calculate the minimum number of neurons in the hidden layer? I also tried giving two inputs instead of four: the first was the dot product [-1..1] (where I dotted the ship direction with the direction to the candy) and the second was 0 or 1 depending on whether the candy was to the left or right of the ship. This approach seemed to work a lot better with fewer neurons in the hidden layer.
Fewer inputs should imply fewer neurons. This is because the number of input combinations decreases, so it gets easier for the neural network to learn the system. There is no golden rule for calculating the best number of nodes in the hidden layer. However, with 2 inputs I'd say 2 hidden nodes should work fine. It really depends on the degree of nonlinearity in your inputs.
Choosing the number of hidden layers and the number of neurons in each hidden layer has always been a challenge, and the answer varies with the type of problem. That said, a single hidden layer in a feedforward neural network can solve most problems, given that it can approximate functions.
Murata proposed some rules for choosing the number of hidden neurons in a feedforward neural network:
The value should be between the size of the input and output layers.
The value should be 2/3 the size of the input layer plus the size of the output layer.
The value should be less than twice the size of the input layer.
You could try these rules and evaluate their impact on your neural network.

How to use the custom neural network function in the MATLAB Neural Network Toolbox

I'm trying to create the neural network shown below. It has 3 inputs, 2 outputs, and 2 hidden layers (so 4 layers altogether, or 3 layers of weight matrices). In the first hidden layer there are 4 neurons, and in the second hidden layer there are 3. There is a bias neuron going to the first and second hidden layer, and the output layer.
I have tried using the "create custom neural network" function in MATLAB, but I can't get it to work how I want it to.
This is how I used the function
net1=network(3,3,[1;1;1],[1,1,1;0,0,0;0,0,0],[0,0,0;1,0,0;0,1,0],[0,0,0])
view(net1)
And it gives me the neural network shown below:
As you can see, this isn't what I want. There are only 3 weights in the first layer, 1 in the second, 1 in the output layer, and only one output. How would I fix this?
Thanks!
Just to clarify how I want this network to work:
The user will input 3 numbers into the network.
Each one of the 3 inputs is multiplied by 4 different weights, and then these numbers are sent to the 4 neurons in the first hidden layer.
The bias node acts the same as one of the inputs, but it always has a value of 1. It is multiplied by 4 different weights, and then sent to the 4 neurons in the first hidden layer.
Each neuron in the first hidden layer sums the 4 numbers going into it, and then passes this number through the sigmoid activation function.
The neurons in the first hidden layer then output 4 numbers that are each multiplied by 3 different weights, and sent to the 3 neurons in the second hidden layer.
The bias node going to the second hidden layer works the same way as the first bias node.
Each neuron in the second hidden layer sums the 5 numbers going into it and passes the result through the sigmoid activation function.
The neurons in the second hidden layer then output 3 numbers that are each multiplied by 2 different weights and sent to the 2 output neurons.
The output layer also sums all of its inputs, including its bias input, and then passes this through the sigmoid activation function to get the final two values.
After some time playing around I've figured out how to do it. The code I needed to use is:
net = newff([0 1; 0 1; 0 1],[4,3,2],{'logsig','logsig','logsig'})
view(net)
This creates the network I was looking for.
I was originally mistaken about the MATLAB representation of neural networks. The green arrows show the path of all of the numbers, not just a single number.
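For readers who want to check the arithmetic described in the question, here is a rough plain-Python sketch of the same 3-4-3-2 forward pass. It is not the toolbox implementation: the function names, the random weight range, and the sample input are assumptions for illustration. Each bias is modelled, as in the question, as an extra input fixed at 1.

```python
import math
import random


def logsig(z):
    """Logistic sigmoid, the same function MATLAB calls logsig."""
    return 1.0 / (1.0 + math.exp(-z))


def random_weights(n_in, n_out, rng):
    """One row per neuron: n_in input weights plus one bias weight."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_forward(inputs, weights):
    """Weighted sum of inputs plus bias for each neuron, then the sigmoid."""
    extended = list(inputs) + [1.0]  # the bias node always outputs 1
    return [logsig(sum(w * x for w, x in zip(row, extended))) for row in weights]


def predict(inputs, network):
    """Pass the inputs through every layer in turn."""
    values = list(inputs)
    for weights in network:
        values = layer_forward(values, weights)
    return values


rng = random.Random(0)  # fixed seed so the run is repeatable
sizes = [3, 4, 3, 2]    # inputs, hidden 1, hidden 2, outputs
network = [random_weights(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]

outputs = predict([0.2, 0.5, 0.9], network)  # arbitrary sample input
n_weights = sum(len(row) for layer in network for row in layer)
print(len(outputs), n_weights)
```

Counting the rows confirms the structure: (3+1)*4 + (4+1)*3 + (3+1)*2 = 39 weights in total, including the bias weights.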

Neural Networks (input and output layers)

When dealing with multiclass classification, is the number of nodes (which are vectors) in the input layer, excluding bias, always the same as the number of nodes in the output layer?
No. The input layer ingests the features. The output layer makes predictions for classes. The number of features and classes does not need to be the same; it also depends on how exactly you model the multiple classes output.
Lars Kotthoff is right. However, when you are using an artificial neural network to build an autoencoder, you will want to have the same number of input and output nodes, and you will want the output nodes to learn the values of the input nodes.
Nope.
Usually the number of input units equals the number of features you are going to use for training the NN classifier.
The size of the output layer equals the number of classes in the dataset. Further, if the dataset has only two classes, a single output unit is enough to discriminate between them.
The ANN output layer has a node for each class: if you have 3 classes, you use 3 nodes. The input layer (often called a feature vector) has a node for each feature used for prediction and usually an extra bias node. You usually need only 1 hidden layer, and discerning its ideal size is tricky.
Having too many hidden layer nodes can result in overfitting and slow training. Having too few hidden layer nodes can result in underfitting (overgeneralizing).
Here are a few general guidelines (source) to start with:
The number of hidden neurons should be between the size of the input layer and the size of the output layer.
The number of hidden neurons should be 2/3 the size of the input layer, plus the size of the output layer.
The number of hidden neurons should be less than twice the size of the input layer.
If you have 3 classes and an input vector of 30 features, you can start with a hidden layer of around 23 nodes. Add and remove nodes from this layer during training to reduce your error, while testing against validation data to prevent overfitting.
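The three guidelines are simple enough to compute directly. The sketch below is only a starting-point calculator with a hypothetical function name; it says nothing about which size will actually perform best on your data.

```python
def hidden_size_guidelines(n_inputs, n_outputs):
    """Apply the three rules of thumb for the size of a single hidden layer."""
    return {
        # Rule 1: somewhere between the output and input layer sizes.
        "between": (min(n_inputs, n_outputs), max(n_inputs, n_outputs)),
        # Rule 2: 2/3 of the input layer size plus the output layer size.
        "two_thirds_rule": round(2 * n_inputs / 3 + n_outputs),
        # Rule 3: stay strictly below twice the input layer size.
        "upper_bound": 2 * n_inputs,
    }


# 30 features and 3 classes, as in the example above.
guide = hidden_size_guidelines(30, 3)
print(guide)
```

For 30 inputs and 3 outputs the second rule gives 20 + 3 = 23, which also satisfies the other two rules (between 3 and 30, and below 60).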
