Training a convolutional neural network layer-wise - machine-learning

Are there approaches to train a convolutional neural network layer-wise (instead of end-to-end), in order to understand how each layer contributes to the final performance of the architecture?

You can freeze all the other layers and train only one layer at a time. After each epoch or iteration, you can freeze that layer again and train a different one. So yes, this is possible.
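A minimal sketch of this idea, under stated assumptions: two small dense layers stand in for the layers of a CNN, the data is a made-up toy set, and the names `train_one_layer`, `params` and `history` are hypothetical. Only the layer named by `trainable` is updated in each phase; the other one stays frozen.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (hypothetical stand-in for real images and labels).
X = rng.normal(size=(64, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(float).reshape(-1, 1)

# Two dense layers stand in for the layers of a CNN.
params = {
    "W1": rng.normal(scale=0.5, size=(4, 8)),
    "W2": rng.normal(scale=0.5, size=(8, 1)),
}

def loss_and_grads(params, X, y):
    h = np.tanh(X @ params["W1"])
    p = 1.0 / (1.0 + np.exp(-(h @ params["W2"])))
    eps = 1e-12
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    dz2 = (p - y) / len(X)           # gradient w.r.t. the output pre-activation
    grad_w2 = h.T @ dz2
    dz1 = (dz2 @ params["W2"].T) * (1 - h ** 2)   # back through tanh
    grad_w1 = X.T @ dz1
    return loss, {"W1": grad_w1, "W2": grad_w2}

def train_one_layer(params, X, y, trainable, steps=200, lr=0.2):
    """Update only the layer named `trainable`; all other layers stay frozen."""
    for _ in range(steps):
        _, grads = loss_and_grads(params, X, y)
        params[trainable] -= lr * grads[trainable]
    return loss_and_grads(params, X, y)[0]

initial_loss = loss_and_grads(params, X, y)[0]
history = []
for round_idx in range(3):
    for name in ("W1", "W2"):
        frozen_before = {k: v.copy() for k, v in params.items() if k != name}
        loss = train_one_layer(params, X, y, trainable=name)
        # The frozen layers must be unchanged after this phase.
        assert all(np.array_equal(params[k], v) for k, v in frozen_before.items())
        history.append((round_idx, name, loss))
```

Comparing the loss recorded in `history` after each phase gives a rough picture of how much each layer's training contributes. In a deep learning framework the same effect is usually achieved by disabling gradient updates for the frozen layers rather than by hand-written gradients.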

Related

How should I optimize a neural network for image classification using pretrained models

Thank you for viewing my question. I'm trying to do image classification based on some pre-trained models; the images should be classified into 40 classes. I want to use the pre-trained VGG and Xception models to convert each image into two 1000-dimensional vectors and stack them into a 1*2000-dimensional vector as the input of my network, which has a 40-dimensional output. The network has 2 hidden layers, one with 1024 neurons and the other with 512 neurons.
Structure:
image-> vgg(1*1000 dimensions), xception(1*1000 dimensions)->(1*2000 dimensions) as input -> 1024 neurons -> 512 neurons -> 40 dimension output -> softmax
However, using this structure I can only achieve about 30% accuracy. So my question is: how could I optimize the structure of my network to achieve higher accuracy? I'm new to deep learning, so I'm not quite sure my current design is 'correct'. I'm really looking forward to your advice.
I'm not entirely sure I understand your network architecture, but some pieces don't look right to me.
There are two major transfer learning scenarios:
ConvNet as fixed feature extractor. Take a pretrained network (either VGG or Xception will do; you do not need both), remove the last fully-connected layer (this layer’s outputs are the 1000 class scores for a different task like ImageNet), then treat the rest of the ConvNet as a fixed feature extractor for the new dataset. For example, in an AlexNet, this would compute a 4096-D vector for every image that contains the activations of the hidden layer immediately before the classifier. Once you extract the 4096-D codes for all images, train a linear classifier (e.g. Linear SVM or Softmax classifier) for the new dataset.
Tip #1: take only one pretrained network.
Tip #2: no need for multiple hidden layers for your own classifier.
Fine-tuning the ConvNet. The second strategy is to not only replace and retrain the classifier on top of the ConvNet on the new dataset, but to also fine-tune the weights of the pretrained network by continuing the backpropagation. It is possible to fine-tune all the layers of the ConvNet, or it’s possible to keep some of the earlier layers fixed (due to overfitting concerns) and only fine-tune some higher-level portion of the network. This is motivated by the observation that the earlier layers of a ConvNet contain more generic features (e.g. edge detectors or color blob detectors) that should be useful to many tasks, but later layers of the ConvNet become progressively more specific to the details of the classes contained in the original dataset.
Tip #3: keep the early pretrained layers fixed.
Tip #4: use a small learning rate for fine-tuning because you don't want to distort other pretrained layers too quickly and too much.
This architecture much more closely resembles the ones I have seen solve the same problem, and it has a higher chance of reaching high accuracy.
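The first scenario (fixed feature extractor plus a linear classifier) can be sketched as follows. This is only an illustration under loud assumptions: `W_frozen` and `extract_features` are hypothetical stand-ins for a pretrained ConvNet with its last layer removed, the "images" are synthetic vectors, and the sizes are toy values rather than the 40 classes from the question. The point is that the extractor is never updated and only a single softmax layer is trained on top of it.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_CLASSES, IMG_DIM, FEAT_DIM = 3, 20, 64   # toy sizes, not the real setup

# HYPOTHETICAL stand-in for a frozen pretrained ConvNet without its top layer.
W_frozen = rng.normal(size=(IMG_DIM, FEAT_DIM)) / np.sqrt(IMG_DIM)

def extract_features(images):
    """Fixed feature extractor: its weights are never updated."""
    return np.maximum(images @ W_frozen, 0.0)

# Synthetic "images": one Gaussian cluster per class.
centers = rng.normal(scale=2.0, size=(NUM_CLASSES, IMG_DIM))
labels = rng.integers(0, NUM_CLASSES, size=300)
images = centers[labels] + rng.normal(size=(300, IMG_DIM))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)     # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Extract the codes once, standardize them, then reuse them for every step.
raw = extract_features(images)
feats = (raw - raw.mean(axis=0)) / (raw.std(axis=0) + 1e-8)

# The only trainable part: one linear (softmax) layer, no hidden layers.
W = np.zeros((FEAT_DIM, NUM_CLASSES))
b = np.zeros(NUM_CLASSES)
onehot = np.eye(NUM_CLASSES)[labels]
frozen_copy = W_frozen.copy()

for _ in range(500):
    probs = softmax(feats @ W + b)
    grad_logits = (probs - onehot) / len(feats)   # cross-entropy gradient
    W -= 0.01 * feats.T @ grad_logits
    b -= 0.01 * grad_logits.sum(axis=0)

# Training accuracy on the toy data (a real setup needs a held-out set).
accuracy = float((softmax(feats @ W + b).argmax(axis=1) == labels).mean())
```

For fine-tuning (the second scenario), the same loop would also update some of the later pretrained layers, typically with a smaller learning rate than the one used for the new classifier.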
There are a couple of steps you may try when the model is not fitting well:
Increase training time and decrease the learning rate. Training may be stopping at a very bad local optimum.
Add additional layers that can extract specific features for the large number of classes.
Create one two-class deep network per class (with a 'yes' or 'no' output). This lets each network specialize in its own class, rather than training one single network to learn all 40 classes.
Increase training samples.

Perceptron and shape recognition

I recently implemented a simple perceptron. This type of perceptron (composed of only one neuron giving a binary output) can only solve problems where the classes are linearly separable.
I would like to implement simple shape recognition on images of 8 by 8 pixels. For example, I would like my neural network to be able to tell me whether what I have drawn is a circle or not.
How can I know whether the classes in this problem are linearly separable? Since there are 64 inputs, can it still be linearly separable? Can a simple perceptron solve this kind of problem? If not, what kind of perceptron can? I am a bit confused about that.
Thank you!
In general, this problem cannot be solved by a single-layer perceptron. Other network structures such as convolutional neural networks are usually best for image classification, but given the small size of your images a multilayer perceptron may be sufficient.
Many problems that are not linearly separable in the original input space become linearly separable after a suitable nonlinear transformation. Adding hidden layers to a network lets it learn such a transformation, so that the final layer can separate the classes linearly.
Look into multilayer perceptrons or convolutional neural networks. Examples of classification on the MNIST dataset might be helpful as well.
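One practical way to probe linear separability is simply to train the single perceptron and see whether it ever completes a full pass without mistakes. The sketch below uses the classic perceptron update rule on two tiny 2-input problems as stand-ins for the 64-pixel case; the function name `train_perceptron` and the epoch budget are illustrative choices. Convergence proves the training set is linearly separable, while failing to converge within the budget only suggests that it is not.

```python
def train_perceptron(samples, labels, epochs=100, lr=1.0):
    """Classic perceptron rule. Returns (weights, bias, converged)."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in zip(samples, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            pred = 1 if activation > 0 else 0
            if pred != target:
                # Move the decision boundary toward the misclassified sample.
                delta = lr * (target - pred)
                w = [wi + delta * xi for wi, xi in zip(w, x)]
                b += delta
                errors += 1
        if errors == 0:
            # A full pass with no mistakes: the data has been separated.
            return w, b, True
    return w, b, False

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_labels = [0, 0, 0, 1]   # linearly separable
xor_labels = [0, 1, 1, 0]   # not linearly separable

_, _, and_converged = train_perceptron(inputs, and_labels)
_, _, xor_converged = train_perceptron(inputs, xor_labels)
```

The same function works unchanged on flattened 8 by 8 images (64 values per sample); having 64 inputs does not by itself make a problem separable or not, it depends on how the circle and non-circle examples are arranged in that space.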

Equal Training and Testing Score for Feed-Forward Neural Network

My simple 3-layer feed-forward neural network gives me equal training and testing scores. Does that mean my neural network is too simple, i.e. should I increase the number of neurons in each layer or increase the number of layers?

Good pretrained convolutional neural network for a monochromatic image

For an image recognition task one may use a pretrained convolutional neural network (like VGG or GoogLeNet). They usually work great, but with one assumption: they only work for RGB images. I'm looking for a good pretrained neural network which was trained on monochromatic images. Does anyone know of something like this?

How to propagate error from the conv layer to the previous layer in the LeNet-5 CNN

Recently I've been trying to implement the LeNet-5 CNN, but I'm stuck on how to propagate the error from a conv layer to the previous layer, for example from the C3 layer to the S2 layer. Could anybody please help me?
A CNN typically has convolutional layers and pooling layers. Since a pooling layer has no parameters, it does not require any learning, though the error still has to be passed back through it. Error propagation in the last (fully connected) layers is the same as in an ordinary NN. The only tricky part of backpropagation is the convolutional layers. You can visualize a convolutional layer as a connection-cutting NN (see "Transforming Multilayer Perceptron to Convolutional Neural Network"). The error can then be propagated back using the equation for backpropagation of the delta error.
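A minimal sketch of that backward step for a single input map and a single kernel, assuming the forward pass is a 'valid' cross-correlation with stride 1 and no bias (the function names here are illustrative). The error for the previous layer is the zero-padded output error correlated with the kernel rotated by 180 degrees. In LeNet-5, each S2 map feeds several C3 maps, so the errors from all connected C3 maps would be summed, and the derivative of any activation applied at S2 would be multiplied in afterwards.

```python
import numpy as np

def conv2d_valid(x, k):
    """'Valid' cross-correlation of a 2-D input x with a 2-D kernel k."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def conv2d_backward(x, k, delta_out):
    """Given dL/d(out) for out = conv2d_valid(x, k), return (dL/dx, dL/dk)."""
    kh, kw = k.shape
    # Error for the previous layer: 'full' correlation of the zero-padded
    # output error with the kernel rotated by 180 degrees.
    padded = np.pad(delta_out, ((kh - 1, kh - 1), (kw - 1, kw - 1)))
    delta_in = conv2d_valid(padded, k[::-1, ::-1])
    # Kernel gradient: correlate the layer input with the output error.
    grad_k = conv2d_valid(x, delta_out)
    return delta_in, grad_k

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 6))          # stands in for one S2 map
k = rng.normal(size=(3, 3))          # one kernel of the conv layer
delta_out = rng.normal(size=(4, 4))  # error arriving at the conv output
delta_in, grad_k = conv2d_backward(x, k, delta_out)
```

`delta_in` has the same shape as the input map and `grad_k` the same shape as the kernel, which is a quick sanity check when wiring this into a full network.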
