In PCANet:
What is the advantage of applying 2 convolution layers without a pooling layer between them?
What is the task of the second PCA layer?
For example:
If layer 'L' of a convnet contains only 8 kernels/filters, does that mean layer L has only 8 neurons?
You are confusing the terms 'convolution filter' and 'neuron'. A convolutional layer is different from a fully connected one.
A fully connected layer forms a mesh of weighted connections between nodes (neurons). Each weight (the connection between a pair of nodes) is updated by an optimizer.
A convolutional layer instead learns filters (which are initialized randomly and can also be viewed as weights). Its weights are likewise learned by an optimizer.
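To make the distinction concrete, here is a minimal sketch (Keras is my assumption here, the question is framework-agnostic) that counts the parameters of a convolutional layer with 8 filters versus a dense layer with 8 neurons:

```python
import tensorflow as tf

# Convolutional layer: 8 filters of size 3x3 over a 3-channel input.
# Parameters = 8 * (3*3*3) + 8 biases = 224, independent of the spatial size
# of the input; the 8 filters are reused at every spatial position.
conv = tf.keras.layers.Conv2D(8, kernel_size=3)
conv.build(input_shape=(None, 32, 32, 3))
print("conv params:", conv.count_params())   # 224

# Dense layer: 8 neurons on the same (flattened) 32*32*3 input.
# Parameters = 8 * 3072 + 8 biases = 24584 -- one weight per input value per neuron.
dense = tf.keras.layers.Dense(8)
dense.build(input_shape=(None, 32 * 32 * 3))
print("dense params:", dense.count_params())  # 24584
```

So the "8" refers to 8 shared filters, each slid across the whole spatial extent of the input, not to 8 neurons.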
Currently I am developing a new network using NiftyNet and would need some help.
I am trying to implement an Autofocus Layer [1] as proposed in the paper. However, at a certain point, the Autofocus Layer needs to compute K (K=4) parallel convolutions, each using the same weights (w), and concatenate the four outputs afterwards.
Is there a way to create four parallel convolutional layers, each having the same weights, in NiftyNet?
Thank you in advance.
[1] https://arxiv.org/pdf/1805.08403.pdf
The solution to this problem is as follows.
Nothing prevents you from using the same convolutional layer multiple times, each time with a different input. This simulates the desired parallelism and solves the weight-sharing issue, because there is only one convolutional layer.
However, this alone does not give each parallel branch a different dilation rate, because, as mentioned above, there is only that one convolutional layer.
Note: using a given tensor as input for a convolutional layer with dilation rate = 2 is the same operation as using that tensor dilated with rate = 2 as input for a convolutional layer with dilation rate = 1.
Therefore, creating K dilated tensors, each with a different dilation rate, and then feeding each of them into a single convolutional layer with dilation rate = 1 gives you parallel branches with different dilation rates while still sharing one set of weights.
NiftyNet provides a class to create such dilated tensors.
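This is not NiftyNet's dilated-tensor API; it is a plain TensorFlow sketch of the equivalent idea (the kernel initialization, K and the dilation rates are my assumptions), showing one set of weights reused for K dilated convolutions and the outputs concatenated:

```python
import tensorflow as tf

def parallel_dilated_convs(x, n_out, kernel_size=3, rates=(1, 2, 4, 8)):
    """Apply K dilated convolutions that all share the same kernel, then concatenate.

    x: input tensor of shape [batch, height, width, channels].
    """
    in_ch = x.shape[-1]
    # A single kernel variable -> the weights are shared by all K branches.
    kernel = tf.Variable(
        tf.random.truncated_normal([kernel_size, kernel_size, in_ch, n_out], stddev=0.02),
        name="shared_kernel")
    branches = [
        tf.nn.convolution(x, kernel, padding="SAME", dilations=[r, r])
        for r in rates
    ]
    return tf.concat(branches, axis=-1)

# Example usage on a dummy batch.
x = tf.random.normal([2, 64, 64, 16])
y = parallel_dilated_convs(x, n_out=32)
print(y.shape)  # (2, 64, 64, 128) -- 4 branches x 32 output channels
```

The same idea can be expressed with the NiftyNet class mentioned above: dilate the input K times and push each dilated tensor through the single rate-1 convolutional layer.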
I'm wondering: what happens if I have a layer generating a bottom blob that is then consumed by two subsequent layers, both of which generate gradients to fill bottom.diff in the backpropagation stage? Will the two gradients be added up to form the final gradient, or does only one of them survive? In my understanding, Caffe layers need to memset bottom.diff to all zeros before filling it with computed gradients, right? Will that memset wipe out the gradients already computed by the other layer? Thank you!
Using more than a single loss layer is not out of the ordinary; see GoogLeNet for example: it has three loss layers "pushing" gradients at different depths of the net.
In Caffe, each loss layer has an associated loss_weight: how much this particular component contributes to the overall loss function of the net. Thus, if your net has two loss layers, Loss1 and Loss2, the overall loss of your net is
Loss = loss_weight1*Loss1 + loss_weight2*Loss2
Backpropagation uses the chain rule to propagate the gradient of Loss (the overall loss) through all the layers of the net. The chain rule breaks the derivative of Loss into partial derivatives, i.e., the derivatives of the individual layers; the overall effect is obtained by propagating the gradients through these partial derivatives. That is, by using top.diff and the layer's backward() function to compute bottom.diff, one takes into account not only the layer's own derivative, but also the effect of ALL the higher layers, expressed in top.diff.
TL;DR
You can have multiple loss layers. Caffe (as well as any other decent deep learning framework) handles it seamlessly for you.
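To see the accumulation concretely, here is a small NumPy sketch (not Caffe code, just the chain rule written out) where one blob x feeds two branches, each with its own weighted loss; the gradient that flows back into x is the sum of the two branch gradients:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])            # the shared "bottom" blob

# Two consumers of x, each producing its own (weighted) loss.
loss_weight1, loss_weight2 = 1.0, 0.5
loss1 = np.sum(x ** 2)                    # branch 1: d(loss1)/dx = 2 * x
loss2 = np.sum(3.0 * x)                   # branch 2: d(loss2)/dx = 3

# Overall loss, as in the formula above.
loss = loss_weight1 * loss1 + loss_weight2 * loss2

# Backward pass: each branch contributes its gradient, and the
# contributions are ACCUMULATED (added) into the shared blob's diff.
x_diff = np.zeros_like(x)                 # bottom.diff starts at zero...
x_diff += loss_weight1 * 2.0 * x          # ...branch 1 adds its gradient
x_diff += loss_weight2 * 3.0              # ...branch 2 adds its gradient

print(x_diff)                             # [3.5 5.5 7.5] -- sum of both contributions
```

The zeroing happens once per backward pass, before any consumer writes its gradient, so it does not wipe out the other branch's contribution; Caffe does this bookkeeping for you by inserting a Split layer that sums the diffs of a blob with multiple consumers.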
Are there approaches to train a convolutional neural network layer-wise (instead of end-to-end), in order to understand how each layer contributes to the final architecture's performance?
Yes, this is possible: you can freeze all the other layers and train only one layer at a time. After each epoch/iteration you then freeze a different set of layers and train another single layer; see the sketch below.
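A minimal Keras sketch of this idea (the model architecture and the dummy data are my assumptions), training one layer at a time while all others are frozen:

```python
import tensorflow as tf

# A small example model; any layered model works the same way.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

def train_single_layer(model, layer_index, x, y, epochs=1):
    """Freeze every layer except `layer_index`, then train for a few epochs."""
    for i, layer in enumerate(model.layers):
        layer.trainable = (i == layer_index)
    # Re-compile so the new trainable flags take effect.
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    model.fit(x, y, epochs=epochs, verbose=0)

# Dummy data for illustration only.
x = tf.random.normal([64, 32, 32, 3])
y = tf.random.uniform([64], maxval=10, dtype=tf.int32)

# Layer-wise loop: train each layer in turn.
for idx in range(len(model.layers)):
    train_single_layer(model, idx, x, y)
    print("trained only layer", idx, model.layers[idx].name)
```

After each pass you can evaluate on a validation set to see how much that single layer's training contributes to overall performance.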
Recently I have been trying to implement the LeNet-5 CNN, but I am stuck on how to propagate the error from a conv layer to the previous layer, for example from the C3 layer to the S2 layer. Could anybody please help me?
CNNs typically consist of convolutional layers and pooling layers. Since a pooling layer has no parameters, it does not require any learning. Error propagation in the last (fully connected) layers is the same as in an ordinary NN. The only tricky part of backpropagation is in the convolutional layers. You can visualize a convolutional layer as an NN with most of its connections cut (see "Transforming Multilayer Perceptron to Convolutional Neural Network"). The error can then be propagated back using the usual delta-error backpropagation equations.
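For the convolutional case, here is a minimal NumPy sketch (single channel, stride 1, 'valid' forward convolution, so not the exact C3-to-S2 wiring of LeNet-5) of how the error is pushed back through a conv layer:

```python
import numpy as np
from scipy.signal import correlate2d, convolve2d

# Forward: a 'valid' cross-correlation of the input with one 3x3 kernel.
X = np.random.randn(6, 6)            # input feature map (from the previous layer)
W = np.random.randn(3, 3)            # kernel of the conv layer
Y = correlate2d(X, W, mode="valid")  # output feature map, shape (4, 4)

# Suppose the next layer handed us dL/dY (the delta of this layer's output).
dY = np.random.randn(*Y.shape)

# Gradient w.r.t. the kernel: correlate the input with the output delta.
dW = correlate2d(X, dY, mode="valid")   # shape (3, 3)

# Error propagated to the previous layer (dL/dX): a 'full' convolution of the
# output delta with the kernel (convolve2d flips the kernel by 180 degrees).
dX = convolve2d(dY, W, mode="full")     # shape (6, 6)

print(dW.shape, dX.shape)
```

dX is the delta the previous layer (S2) would receive; with multiple feature maps you sum such contributions over all kernels connected to that map.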