Validation accuracy of Siamese CNN not increasing, stuck at 0.50 - machine-learning

I'm training a Siamese CNN model on an X-ray dataset and it gets stuck at 0.50 validation accuracy. Moreover, training and validation loss decrease while training accuracy also hovers around 0.50.
I think there is an error somewhere, but I don't know exactly what it is. Maybe using cross-validation could help.
Code:
Architecture of the CNN
Summary
Training the model...(results)
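The poster's code is not reproduced above, so as a point of reference, here is a minimal sketch of what a Siamese CNN for pairwise X-ray comparison might look like in Keras: a shared convolutional encoder applied to both images, with a sigmoid output on the absolute difference of the embeddings. All layer sizes, the input shape, and the helper name `build_encoder` are illustrative assumptions, not the poster's actual model.

```python
# Minimal Siamese CNN sketch (assumed Keras setup, not the poster's actual code).
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_encoder(input_shape=(128, 128, 1)):
    # Shared convolutional encoder applied to both images of a pair.
    inp = layers.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, activation="relu")(inp)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, activation="relu")(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Flatten()(x)
    x = layers.Dense(128, activation="relu")(x)
    return Model(inp, x, name="encoder")

input_shape = (128, 128, 1)          # illustrative X-ray image size
encoder = build_encoder(input_shape)

img_a = layers.Input(shape=input_shape)
img_b = layers.Input(shape=input_shape)
feat_a, feat_b = encoder(img_a), encoder(img_b)

# Absolute difference of the two embeddings -> binary "same / different" prediction.
diff = layers.Lambda(lambda t: tf.abs(t[0] - t[1]))([feat_a, feat_b])
out = layers.Dense(1, activation="sigmoid")(diff)

siamese = Model([img_a, img_b], out)
siamese.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
siamese.summary()
```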

Related

Meaning of a constant, very low training loss in a learning curve

I have trained a fastText model on a binary text classification problem and generated the learning curve over increasing training-set size.
Very quickly I get a very low training loss, close to 0, which stays constant.
I interpret this as the model overfitting on the data.
But the validation loss curve looks good to me, slowly decreasing.
Cross-validation on unseen data also produces accuracies with little variation, about 90%.
So I am wondering whether I indeed have an overfitting model, as the learning curve suggests.
Is there any other check I can do on my model?
As the fastText model also uses epochs, I am even wondering whether a learning curve should vary the epochs (keeping the training size constant), slowly increase the training-set size (keeping the epochs constant), or both.
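As a sketch of the second option (increasing training-set size at a fixed training configuration), scikit-learn's `learning_curve` produces both training and validation scores at each size. The classifier and synthetic data below are placeholders, since fastText itself is not a scikit-learn estimator.

```python
# Sketch of a learning curve over training-set size (scikit-learn, placeholder model/data).
import numpy as np
from sklearn.model_selection import learning_curve
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=2000, n_features=50, random_state=0)

sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 8), cv=5, scoring="accuracy")

# Train accuracy staying near 1.0 while validation accuracy keeps climbing with more
# data is the pattern described in the question.
for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"n={n:5d}  train acc={tr:.3f}  val acc={va:.3f}")
```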

Why would a neural network's validation loss and accuracy fluctuate at first?

I am training a neural network, and at the beginning of training its loss and accuracy on the validation data fluctuate a lot, but towards the end of training they stabilize. I am using reduce-learning-rate-on-plateau for this network. Could it be that the network starts with a high learning rate, and as the learning rate decreases both accuracy and loss stabilize?
For SGD, the amount of change in the parameters is a multiple of the learning rate and the gradient of the parameter values with respect to the loss.
θ = θ − α ∇θ E[J(θ)]
Every step it takes will be in a sub-optimal direction (i.e. slightly wrong), as the optimiser has usually only seen some of the values. At the start of training you are relatively far from the optimal solution, so the gradient ∇θ E[J(θ)] is large, and therefore each sub-optimal step has a large effect on your loss and accuracy.
Over time, as you (hopefully) get closer to the optimal solution, the gradient is smaller, so the steps become smaller, meaning that the effects of being slightly wrong are diminished. Smaller errors on each step make your loss decrease more smoothly, which reduces the fluctuations.
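To make the update rule concrete, here is a bare-bones illustration of θ = θ − α ∇θ E[J(θ)], with `alpha` as the learning rate and a noisy mini-batch-style gradient. The quadratic loss J(θ) = θ² is only an assumed toy example, not anything from the question.

```python
# Toy SGD update: theta = theta - alpha * grad, with noisy gradient estimates.
import numpy as np

rng = np.random.default_rng(0)
theta = np.array([5.0])      # start far from the optimum at 0
alpha = 0.1                  # learning rate

for step in range(20):
    grad = 2 * theta + rng.normal(scale=0.5)   # noisy gradient of J(theta) = theta^2
    theta = theta - alpha * grad
    print(f"step {step:2d}: theta = {theta[0]: .3f}")
# Early steps are large and noisy; later steps shrink as the gradient shrinks,
# which is why the loss curve smooths out over time.
```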

How to judge whether a model is overfitting or not

I am doing video classification with a model combining CNN and LSTM.
On the training data the accuracy is 100%, but the accuracy on the test data is not so good.
The amount of training data is small, about 50 examples per class.
In such a case, can I conclude that overfitting is occurring?
Or is there another cause?
Most likely you are indeed overfitting if the performance of your model is perfect on the training data yet poor on the test/validation data set.
A good way of observing that effect is to evaluate your model on both training and validation data after each epoch of training. You might observe that while you train, the performance on your validation set is increasing initially, and then starts to decrease. That is the moment when your model starts to overfit and where you can interrupt your training.
Here's a plot demonstrating this phenomenon with the blue and red lines corresponding to errors on training and validation sets respectively.
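A sketch of that per-epoch monitoring in Keras follows; the toy data and model are stand-ins for illustration only. `EarlyStopping` interrupts training once the validation loss stops improving, which is exactly the "interrupt your training" point described above.

```python
# Sketch: monitor train and validation metrics every epoch and stop training
# when validation loss stops improving (toy data and model, for illustration).
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.callbacks import EarlyStopping

x_train = np.random.rand(200, 20); y_train = np.random.randint(0, 2, 200)
x_val   = np.random.rand(50, 20);  y_val   = np.random.randint(0, 2, 50)

model = models.Sequential([layers.Dense(32, activation="relu", input_shape=(20,)),
                           layers.Dense(1, activation="sigmoid")])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

early_stop = EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True)
history = model.fit(x_train, y_train, validation_data=(x_val, y_val),
                    epochs=100, callbacks=[early_stop], verbose=0)

# history.history holds per-epoch 'loss', 'accuracy', 'val_loss', 'val_accuracy',
# which can be plotted as the training/validation curves described above.
```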

What should I do about this loss output?

I trained my network on a dataset and got the following training loss over iterations:
As you can see, the loss shoots up rapidly at some points (marked by the red arrow). I am using the Adam solver with a learning rate of 0.001, momentum of 0.9, and weight decay of 0.0005, without dropout. My network uses BatchNorm, pooling, and convolutional layers. Given the figure above, could you suggest what my problem is and how to fix it? Thanks all.
Update: this is a more detailed figure.
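For reference, a setup like the one described (Adam with learning rate 0.001 and a 0.9 first-moment coefficient playing the role of momentum, L2 weight decay of 0.0005, BatchNorm/pooling/conv, no dropout) might look roughly like this in Keras; the layer sizes and input shape are assumptions, not the poster's network.

```python
# Sketch of the described training setup (assumed Keras equivalent, illustrative sizes).
from tensorflow.keras import layers, models, optimizers, regularizers

l2 = regularizers.l2(5e-4)   # weight decay 0.0005 expressed as an L2 penalty

model = models.Sequential([
    layers.Conv2D(32, 3, padding="same", kernel_regularizer=l2,
                  input_shape=(64, 64, 3)),
    layers.BatchNormalization(),
    layers.Activation("relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(10, activation="softmax", kernel_regularizer=l2),
])

# Adam with learning rate 0.001; beta_1=0.9 corresponds to the momentum-like term.
model.compile(optimizer=optimizers.Adam(learning_rate=1e-3, beta_1=0.9),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```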

Why does Neural Network give same accuracies for permuted labels?

I have a dataset of 37 data points and around 1300 features. There are 4 different classes, and each class has about the same number of data points. I have trained a neural network with two hidden layers and got an accuracy of 60%, which is not bad (chance level is 25%).
The problem now is the p-value. I'm calculating the p-value with a permutation test: I permute the labels 1000 times and for each permutation I calculate the accuracy. I calculate the p-value as the percentage of permutation accuracies that exceed the original accuracy.
For all the permutations of labels I'm getting the same accuracy as with the original labels, i.e. the neural network does not seem to take the labels into account during learning.
If I do it with an SVM I get different accuracies for each permutation (which in the end look like a Gaussian distribution).
Why is this the case?
By the way, I'm using the DeepLearnToolbox for Matlab.
Is the 60% success rate on the training data or a validation dataset that you set aside?
If you're computing the success rate on only the training data, then you would also expect a high accuracy even after permuting the labels. This is because your classifier will overfit the data (1300 features for only 37 data points) and achieve good performance on the training data.
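To make the permutation test described above concrete, here is a minimal sketch that uses a held-out split, so the comparison accuracy is not measured on overfit training data. The SVM classifier and the synthetic 37 × 1300 data are stand-ins for the poster's network and dataset.

```python
# Minimal permutation-test sketch with a held-out evaluation set (illustrative data/model).
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(37, 1300))
y = rng.integers(0, 4, size=37)      # 4 classes, roughly balanced

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

def held_out_accuracy(train_labels):
    # Fit on (possibly permuted) training labels, evaluate against the true test labels.
    clf = SVC().fit(X_tr, train_labels)
    return accuracy_score(y_te, clf.predict(X_te))

observed = held_out_accuracy(y_tr)
perm_accs = np.array([held_out_accuracy(rng.permutation(y_tr)) for _ in range(1000)])

# p-value: fraction of permuted-label accuracies that reach or exceed the observed one.
p_value = np.mean(perm_accs >= observed)
print(f"observed accuracy={observed:.2f}, p-value={p_value:.3f}")
```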
