Is there an implementation of focal loss for regression problems? - machine-learning

Proposed in the RetinaNet paper, focal loss is an efficient solution to class imbalance in classification problems. But data imbalance can also occur in regression problems. Is there any implementation of focal loss for regression problems?
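I am not aware of a canonical version, but one adaptation from the imbalanced-regression literature (sometimes called Focal-R) rescales a standard regression loss by the prediction error, so that easy, small-error examples are down-weighted, mirroring how focal loss down-weights well-classified examples. A minimal PyTorch sketch, with illustrative beta and gamma values:

```python
import torch

def focal_mse_loss(pred, target, beta=0.2, gamma=1.0):
    """Focal-style MSE: down-weights easy (small-error) examples."""
    error = pred - target
    # Weight is ~0 when |error| is small and approaches 1 as it grows.
    weight = (2 * torch.sigmoid(beta * error.abs()) - 1) ** gamma
    return (weight * error ** 2).mean()
```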

Related

Validation accuracy of Siamese CNN not increasing, stuck at 0.50

I'm training a Siamese CNN model on an X-ray dataset and it gets stuck at 0.50 validation accuracy. Moreover, training and validation loss decrease while training accuracy also hovers around 0.50.
I think there is an error somewhere, but I don't know exactly what it is. Maybe using cross-validation could help.
Code (omitted): CNN architecture, model summary, and training output.

Keras accuracy plot is flat while loss plot is not

I am doing deep learning with a multi-layer perceptron for regression. The loss curve flattens out by the third epoch, yet the accuracy curve is flat from the very beginning. I wonder whether this makes sense.
Since you didn't provide the code, it is harder to narrow down the problem. That being said, here are some pointers that might help you find it:
Your validation set is either too small or a poor representation of your training set. (Bear in mind that if you use validation_split in the fit function, Keras takes only the last fraction of your training data and keeps it fixed for all epochs.)
You are not using any regularization (dropout, weight regularizers, weight constraints); see the sketch after this list.
The model could be too small (in layers and neurons), so it is underfitting.
Hope these pointers help you with your problem.
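For illustration, here is a minimal Keras sketch of the regularization options from the second pointer (dropout, an L2 kernel regularizer, and a max-norm constraint); the input width, layer sizes, and coefficients are illustrative, not tuned:

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers, constraints

model = keras.Sequential([
    layers.Input(shape=(10,)),                               # 10 features (illustrative)
    layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l2(0.01),   # L2 weight penalty
                 kernel_constraint=constraints.MaxNorm(3)),  # weight-norm constraint
    layers.Dropout(0.3),                                     # drop 30% of units in training
    layers.Dense(1),                                         # single regression output
])
model.compile(optimizer="adam", loss="mse")

# Per the first pointer: validation_split takes the *last* 20% of the
# training data and keeps it fixed across epochs.
# model.fit(X, y, epochs=50, validation_split=0.2)
```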

Perceptron and shape recognition

I recently implemented a simple perceptron. This type of perceptron (a single neuron producing a binary output) can only solve problems whose classes are linearly separable.
I would like to implement simple shape recognition on 8-by-8-pixel images. For example, I would like my neural network to be able to tell me whether what I drew is a circle or not.
How do I know whether this problem's classes are linearly separable? Since there are 64 inputs, can it still be linearly separable? Can a simple perceptron solve this kind of problem? If not, what kind of perceptron can? I am a bit confused about that.
Thank you!
This problem, in a general sense, cannot be solved by a single-layer perceptron. In general, other network structures such as convolutional neural networks are best for image classification problems; however, given the small size of your images, a multilayer perceptron may be sufficient.
Many problems become linearly separable once the data is suitably transformed, just not necessarily in the original input space. Adding extra layers lets a network map the data into a representation in which the classes are linearly separable.
Look into multilayer perceptrons or convolutional neural networks. Examples of classification on the MNIST dataset might be helpful as well.
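To make that concrete, here is a minimal Keras sketch of a multilayer perceptron for 8x8 binary images (circle vs. not-circle); the hidden width of 32 is illustrative, not tuned:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(64,)),               # 8x8 image flattened to 64 inputs
    layers.Dense(32, activation="relu"),     # hidden layer: non-linear transform
    layers.Dense(1, activation="sigmoid"),   # binary output: circle or not
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```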

How to implement L2 regularization in Caffe or DIGITS?

I am using Caffe and NVIDIA DIGITS. I want to take AlexNet pretrained on ImageNet and fine-tune it on my medical data. I have nearly 1,000 images and, using 80% of them for training, I generated 40,000 images by data augmentation (cropping and rotation). However, I face severe overfitting. I tried to overcome this by adding multiple dropout layers; the training curves changed (plots omitted), but my accuracy does not improve.
My network specifications:
AlexNet pre-trained on ImageNet
base learning rate: 0.001
learning rate multiplier: 0.1 for convolutional layers and 1 for fully connected layers, with Xavier weight initialisation (see the prototxt sketch after this list)
dropout: 0.5
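For reference, per-layer learning-rate multipliers like those above are expressed with lr_mult in the Caffe train_val.prototxt; a sketch with an illustrative layer name and sizes:

```
layer {
  name: "conv1"
  type: "Convolution"
  param { lr_mult: 0.1 }              # weights: 0.1 x base learning rate
  param { lr_mult: 0.2 }              # biases (value illustrative)
  convolution_param {
    num_output: 96
    kernel_size: 11
    weight_filler { type: "xavier" }  # Xavier initialisation
  }
}
```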
Now I want to add L2 regularization. I did not find such a layer in Caffe, so maybe I should implement it myself.
First question: do you have any solution to my problem? (I have tried other approaches, such as changing the step size, varying the learning rate from 1 down to 10^(-5), where I found 0.001 works best, changing the weight decay, and adding various dropout layers, which helped as you can see.)
Second question: can you please explain how I can implement L2 regularization?
You have L2 regularization by default in Caffe: the solver's weight_decay parameter applies an L2 penalty to all learnable weights, since regularization_type defaults to "L2".
See this thread for more information.
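For example, a sketch of the relevant solver.prototxt settings (values illustrative); no extra layer is needed:

```
base_lr: 0.001
weight_decay: 0.0005        # strength of the L2 penalty
regularization_type: "L2"   # the default; "L1" is also supported
```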

Non-linear classifier against a linearly separable training set

I was thinking about the hidden risks of training a non-linear classifier on a labelled (large enough) dataset that is linearly separable.
What are the main ways such a classifier could be misled? Any examples?
In the bias-variance tradeoff, a non-linear classifier has, in general, a larger variance than a linear one. If the dataset is generated by a linearly separable process but the measurements are noisy, then the non-linear classifier will be more susceptible to overfitting.
However, if the dataset is large enough and the classifier is unbiased, then a non-linear classifier will eventually produce, in effect, a separating hyperplane.
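As a hypothetical illustration of that variance argument (the data size, the 10% label-noise rate, and the model choices are mine, purely for demonstration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # linearly separable labels
flip = rng.random(1000) < 0.1             # 10% label noise
y[flip] = 1 - y[flip]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
for model in (LogisticRegression(), DecisionTreeClassifier()):
    model.fit(X_tr, y_tr)
    print(type(model).__name__,
          "train:", model.score(X_tr, y_tr),
          "test:", model.score(X_te, y_te))
# The unpruned tree typically fits the noisy training labels almost
# perfectly but generalizes worse than the linear model.
```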
