How to reverse a shape in Keras for LSTM input - machine-learning

I have an input state with shape = (84,84,4):
state = Input(shape=(84,84,4), dtype="float")
It's a stacked sequence of consecutive frames.
I want to pass this state to a Keras model as input,
first to TimeDistributed layers
and then to an LSTM.
As I understand it, the time step should be the first dimension,
so I need to reshape my state to
shape=(4, 84, 84)
while keeping each frame's own size and topology.

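state_t = tf.transpose(state, [2, 0, 1])
Is this what you are looking for?
The permutation [2,0,1] moves the frame axis to the front and keeps every 84x84 frame as it is, while [2,1,0] would also swap the height and width of each frame, so it depends on what you want to do. Note that a Keras Input tensor carries a leading batch dimension, so on that tensor the permutation would be [0,3,1,2] (or a Permute((3,1,2)) layer).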
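To see what the two permutations do to the frames, here is a minimal sketch in NumPy rather than TensorFlow (np.transpose follows the same axis-permutation convention as tf.transpose). The array contents and the variable names are made up for illustration.

```python
import numpy as np

# Hypothetical stand-in for one state: 4 stacked 84x84 frames, channels last.
state = np.arange(84 * 84 * 4, dtype=np.float32).reshape(84, 84, 4)

# Move the frame axis to the front; height and width keep their order.
frames_first = np.transpose(state, (2, 0, 1))

# Reversing all axes also puts frames first, but transposes every frame.
reversed_axes = np.transpose(state, (2, 1, 0))

print(frames_first.shape)                                   # (4, 84, 84)
print(np.array_equal(frames_first[0], state[:, :, 0]))      # True
print(np.array_equal(reversed_axes[0], state[:, :, 0].T))   # True

# With a leading batch axis, as a Keras Input tensor has, the batch axis stays put.
batch = state[np.newaxis, ...]                              # (1, 84, 84, 4)
batch_frames_first = np.transpose(batch, (0, 3, 1, 2))
print(batch_frames_first.shape)                             # (1, 4, 84, 84)
```

Both permutations give shape (4, 84, 84) here only because the frames are square; with non-square frames the difference would show up in the shape as well.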

Related

Working of embedding layer in Tensorflow

Can someone please explain the inputs and outputs of the layer below, and how it works?
model.add(Embedding(total_words, 64, input_length=max_sequence_len-1))
total_words = 263
max_sequence_len=11
Is 64 the number of dimensions?
And why is the output of this layer (None, 10, 64)?
Shouldn't it be a 64-dimensional vector for each word, i.e. (None, 263, 64)?
You can find the details in the TensorFlow documentation for the Embedding layer.
The first two parameters are input_dim and output_dim.
input_dim is the vocabulary size of your model. You can get it from the word_index attribute of a fitted Tokenizer (typically len(word_index) + 1, since index 0 is reserved).
output_dim is the size of the vector each word is mapped to, so it becomes the feature dimension seen by the next layer.
The output of the Embedding layer has the form (batch_size, input_length, output_dim). Since you set input_length to max_sequence_len-1 = 10, the layer's input has the form (batch, 10), and that's why the output has the form (None, 10, 64).
Hope that clears up your doubt ☺️
In the Embedding layer the first argument is the input dimension, i.e. the vocabulary size, which is typically large. The second argument is the output dimension, a.k.a. the dimensionality of the reduced vector. The third argument is the sequence length. In essence, an Embedding layer simply learns a lookup table of shape (input dim, output dim), and the weights of the layer have that shape. The output of the layer, however, has shape (seq length, output dim) for each sample, i.e. (batch, seq length, output dim): one dimensionality-reduced embedding vector for each element in the input sequence. The shape you were expecting, (263, 64), is actually the shape of the weights of the embedding layer.
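The lookup-table view can be sketched in plain NumPy. This is only an illustration of the shapes, not how Keras implements the layer; the random table, the batch size of 3 and the variable names are assumptions.

```python
import numpy as np

total_words = 263        # vocabulary size (input_dim), from the question
embedding_dim = 64       # output_dim, from the question
seq_len = 10             # max_sequence_len - 1, from the question
batch_size = 3           # arbitrary, chosen for illustration

rng = np.random.default_rng(0)

# The layer's weights: one 64-dimensional row per word index.
table = rng.normal(size=(total_words, embedding_dim))

# A batch of integer-encoded sequences, one word index per position.
word_ids = rng.integers(0, total_words, size=(batch_size, seq_len))

# The lookup: indexing picks one row of the table per word index.
embedded = table[word_ids]

print(table.shape)     # (263, 64)
print(embedded.shape)  # (3, 10, 64)
```

The 263 only appears in the weights; the output has one vector per position in the sequence, which is why the middle dimension is 10.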

How can I change hidden state activation size in RNN using Keras?

I want to train an RNN with a hidden state activation size different from the one provided by default in Keras. For example, my input vector at each time step has size 27 and the output also has size 27. I want the hidden state activation size to be 50.
It's not clear what exactly you mean by hidden state activation size. The parameter that controls the RNN cell size in keras is called units. From the documentation:
units: Positive integer, dimensionality of the output space.
This number directly corresponds to the shape of the recurrent matrix that is applied inside the cell, so in this sense it is a hidden size of the cell, or the number of hidden neurons.
To change this size from 27 to 50 simply call:
model.add(SimpleRNN(50, ...))
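To make the role of units concrete, here is a rough NumPy sketch of a simple recurrent cell with 27 inputs and 50 hidden units. The update rule h_t = tanh(x_t W + h_{t-1} U + b) is the standard simple-RNN recurrence; the random weights, the sequence length and the output projection V back to 27 values are assumptions for illustration.

```python
import numpy as np

input_size, units, output_size = 27, 50, 27   # sizes from the question
timesteps = 5                                  # arbitrary sequence length

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(input_size, units))    # input-to-hidden weights
U = rng.normal(scale=0.1, size=(units, units))         # recurrent (hidden-to-hidden) weights
b = np.zeros(units)
V = rng.normal(scale=0.1, size=(units, output_size))   # hypothetical output projection

x = rng.normal(size=(timesteps, input_size))
h = np.zeros(units)
outputs = []
for x_t in x:
    h = np.tanh(x_t @ W + h @ U + b)   # hidden state has `units` entries
    outputs.append(h @ V)              # project back to 27 values per step
outputs = np.stack(outputs)

print(U.shape, h.shape, outputs.shape)  # (50, 50) (50,) (5, 27)
```

The hidden size is independent of the input and output sizes: only W and V have to match 27, while the recurrent matrix is units x units.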

keras vgg 16 shape error

I'm trying to fit data with the following shapes to the pretrained Keras VGG16 model.
image input shape is (32383, 96, 96, 3)
label shape is (32383, 17)
and I got this error
expected block5_pool to have 4 dimensions, but got array with shape (32383, 17)
at this line
model.fit(x = X_train, y= Y_train, validation_data=(X_valid, Y_valid),
batch_size=64,verbose=2, epochs=epochs,callbacks=callbacks,shuffle=True)
Here's how I define my model
model = VGG16(include_top=False, weights='imagenet', input_tensor=None, input_shape=(96,96,3),classes=17)
How did maxpool give me a 2d tensor but not a 4D tensor ? I'm using the original model from keras.applications.vgg16. How can I fix this error?
Your problem comes from VGG16(include_top=False, ...), which loads only the convolutional part of VGG. The model's last layer is then block5_pool, whose output is 4-dimensional (nb_of_examples, height, width, channels), while your labels are 2-dimensional with shape (32383, 17); this mismatch is what Keras is complaining about. To overcome this issue you need to either set include_top=True, or add layers that squash the convolutional output to a 2D one (e.g. Flatten, GlobalMaxPooling2D or GlobalAveragePooling2D, followed by a set of Dense layers, ending with a Dense of size 17 and a softmax activation function). Note that with weights='imagenet', include_top=True expects the original 224x224 input and 1000 classes, so for 17 labels the second option is the practical one.
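The shape bookkeeping can be sketched in NumPy without loading the network. The feature array below is a random stand-in for the block5_pool output (five 2x2 poolings reduce 96x96 to 3x3, with 512 channels), and the dense weights are hypothetical; in Keras the same steps would be a GlobalAveragePooling2D layer followed by Dense(17, activation='softmax').

```python
import numpy as np

rng = np.random.default_rng(0)
n_examples, n_labels = 8, 17     # small batch for illustration; 17 labels from the question

# Stand-in for the block5_pool output on 96x96 inputs.
features = rng.normal(size=(n_examples, 3, 3, 512))

# Global average pooling: average over the two spatial axes, leaving (examples, channels).
pooled = features.mean(axis=(1, 2))

# Hypothetical dense head with 17 outputs and a softmax over the labels.
W = rng.normal(scale=0.01, size=(512, n_labels))
b = np.zeros(n_labels)
logits = pooled @ W + b
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
probs = exp / exp.sum(axis=1, keepdims=True)

print(features.shape, pooled.shape, probs.shape)  # (8, 3, 3, 512) (8, 512) (8, 17)
```

Only after the pooling and the dense head does the output have the same rank and last dimension as the (n, 17) label array.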

How neural net extract features

I'm new to neural networks. I have followed tutorials on a lot of platforms, but there is one thing that I don't understand.
In a simple multi-layer perceptron:
We have the input layer, one hidden layer for this example (with the same number of neurons as the input layer) and an output layer with one unit.
We initialize the weights of the units in the hidden layer randomly, but in a range of small values.
Now, the input layer is fully connected to the hidden layer.
So each unit in the hidden layer is going to receive the same inputs. How are they going to extract different features from each other?
Thanks for explanation!
Actually, each neuron will not have the same value. To get the activations of the hidden layer you use the matrix equation Wx + b. Here W is the weight matrix of shape (Hidden Size, Input Size), x is the input vector to the hidden layer of shape (Input Size), and b is the bias of shape (Hidden Size). This results in an activation of shape (Hidden Size). So while each hidden neuron "sees" the same x vector, it takes the dot product of x with its own random row of W and adds its own random bias, which gives that neuron a different value. The values in the W matrix and the b vector are what get trained and optimized. Since they have different starting points, they will eventually learn different features through gradient descent.
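A small NumPy sketch of that computation, under assumed sizes (4 inputs, 4 hidden units) and assumed random initial values. It also shows the contrasting case where every unit starts from identical weights, which is the situation the random initialization avoids.

```python
import numpy as np

input_size = hidden_size = 4          # same number of neurons, as in the question
rng = np.random.default_rng(0)

x = rng.normal(size=input_size)                             # every hidden unit sees this same vector
W = rng.normal(scale=0.01, size=(hidden_size, input_size))  # small random weights, one row per unit
b = rng.normal(scale=0.01, size=hidden_size)                # one bias per unit

pre_activation = W @ x + b            # shape (hidden_size,)

# If every unit started from identical weights instead, all units would compute the same value.
W_same = np.full((hidden_size, input_size), 0.01)
b_same = np.full(hidden_size, 0.01)
identical = W_same @ x + b_same

print(pre_activation)
print(identical)
```

The first printed vector has a different entry per unit; the second repeats one number, so those units would be indistinguishable from the start.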

How many layers are in this neural network?

I am trying to make sure I'm using the correct terminology. The diagram below shows the MNIST example
X is a 784-element row vector
W is a 784x10 matrix
b is a 10-element row vector
The output of the linear box is fed into softmax
The output of softmax is fed into the distance function cross-entropy
How many layers are in this NN? What are the input and hidden layer in that example?
Similarly, how many layers are in this answer? If my understanding is correct, then 3 layers?
Edit
@lejlot Does the below represent a 3 layered NN with 1 hidden layer?
Take a look at this picture:
http://cs231n.github.io/assets/nn1/neural_net.jpeg
In your first picture you have only two layers:
Input layer -> 784 neurons
Output layer -> 10 neurons
Your model is very simple (w directly contains the connections between the input and the output, and b contains the bias terms).
With no hidden layer you obtain a linear classifier, because a linear combination of linear combinations is again a linear combination. The hidden layers are what introduce non-linear transformations into your model.
In your second picture you have 3 layers, but you are confusing the notation:
The input layer is the vector x where you place the input data.
Then the operation -> w -> +b -> f() -> is the connection between the first layer and the second layer.
The second layer is the vector where you store the result z=f(xw1+b1).
Then softmax(zw2+b2) is the connection between the second and the third layer.
The third layer is the vector y where you store the final result y=softmax(zw2+b2).
Cross entropy is not a layer; it is the cost function used to train your neural network.
EDIT:
One more thing: if you want to obtain a non-linear classifier you must add a non-linear transformation in every hidden layer. In the example I have described, that means f() must be a non-linear function (for example sigmoid, softsign, ...):
z=f(xw1+b1)
If you add a non-linear transformation only in the output layer (the softmax function at the end), your outputs are still linear classifiers.
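The three-layer computation can be sketched in NumPy. The hidden size of 32, the random weights and the choice of sigmoid for f() are assumptions for illustration. The last lines check the point about linearity: without f(), the two weight matrices collapse into a single linear map.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 784, 32, 10    # 784 and 10 from the question; 32 is an arbitrary hidden size

x = rng.normal(size=(1, n_in))         # one input as a row vector
w1 = rng.normal(scale=0.01, size=(n_in, n_hidden))
b1 = rng.normal(scale=0.01, size=n_hidden)
w2 = rng.normal(scale=0.01, size=(n_hidden, n_out))
b2 = rng.normal(scale=0.01, size=n_out)

def softmax(a):
    e = np.exp(a - a.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Three layers of values: x (input), z (hidden), y (output).
z = sigmoid(x @ w1 + b1)
y = softmax(z @ w2 + b2)

# Without f(), the two weight matrices collapse into a single linear map.
two_linear_steps = (x @ w1 + b1) @ w2 + b2
one_linear_step = x @ (w1 @ w2) + (b1 @ w2 + b2)

print(z.shape, y.shape)                                  # (1, 32) (1, 10)
print(np.allclose(two_linear_steps, one_linear_step))    # True
```

The cross-entropy would be computed from y and the true label afterwards; it does not add another vector of activations, which is why it is not counted as a layer.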
That has 1 hidden layer.
The answer you link to, I would call a 2-hidden layer NN.
Your input-layer is the X-vector.
Your layer Wx+b is the hidden layer, aka. the box in your picture.
The output-layer is the Soft-max.
The cross-entropy is your loss/cost function, and is not a layer at all.
