PyTorch: Training loss not decreasing in VAE - machine-learning

I have implemented a Variational Autoencoder model in PyTorch that is trained on SMILES strings (string representations of molecular structures).
While training the autoencoder to output the same string as the input, the loss does not decrease between epochs.
I have tried the following with no success:
1) Adding 3 more GRU layers to the decoder to increase the learning capacity of the model.
2) Increasing the latent vector size from 292 to 350.
3) Increasing and decreasing the learning rate.
4) Changing the optimizer from Adam to SGD.
5) Training the model for up to 50 epochs.
6) Increasing and decreasing the batch size.
The following is the link to my code.
https://colab.research.google.com/drive/1LctSm_Emnn5sHpw_Hon8xL5fF4bmKRw5
The following is an equivalent Keras model (same architecture) that trains successfully.
https://colab.research.google.com/drive/170Peseik03CFYpWPNyD8B8mxUGxTQx67

Related

Does a small dataset affect the number of epochs?

I was training on a dataset containing 3000 images using a ResNet and an LSTM, but the model started overfitting by epoch 5. Does a small dataset affect the number of epochs?
What are the factors that affect the number of epochs for data trained using transfer learning? Is there a paper that discusses it?

CNN training loss is so unstable

[Image: my CNN network]
Above is the configuration of my network.
I am training a CNN on images of size 192*192.
My target is a classification network with 11 classes.
However, the loss and the accuracy on the test dataset are very unstable. I have to run 15+ epochs to get a stable accuracy and loss. The maximum accuracy is only 50%.
What can I do to improve the performance?
I would recommend that you first look at widely known models such as VGG-16, LeNet, or VGG-19 and check how their Conv2D and max-pooling layers are arranged.
Start with a very basic model without any batch normalization or Leaky ReLU layers. Keep only the Conv2D and max-pooling layers and train your model for a few epochs.
Next, try other activations, such as ReLU or tanh. Try changing the max pooling to average pooling.
If you are solving a classification problem, use a softmax layer at the end. Also, introduce Dense layer(s) after flattening.
Your dataset should be large, and the targets should be one-hot encoded if you wish to use the softmax layer.
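To make the softmax and one-hot point concrete, here is a minimal pure-Python sketch of how a one-hot target pairs with a softmax output under a categorical cross-entropy loss. The 11 classes come from the question; the logits, the function names, and the choice of cross-entropy as the loss are assumptions for illustration, not the asker's actual network.

```python
import math

NUM_CLASSES = 11  # the question mentions 11 classes


def one_hot(label, num_classes=NUM_CLASSES):
    """Return a list with 1.0 at index `label` and 0.0 everywhere else."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(target, probs, eps=1e-12):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, probs))


# Hypothetical logits: the network slightly favours class 3.
logits = [0.0] * NUM_CLASSES
logits[3] = 2.0

probs = softmax(logits)
target = one_hot(3)
loss = categorical_cross_entropy(target, probs)
print(f"loss for the correct class: {loss:.4f}")
```

Because the target is one-hot, only the predicted probability of the true class contributes to the loss, so a prediction that favours the wrong class is penalized more.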

Not able to train ResNet model with triplet loss, while VGG16 works, why?

I am trying to do transfer learning with the ResNet50V2 model using a triplet loss function. I have set include_top = False and input shape = (160,160,3), with ImageNet weights. The last 3 layers of my model are shown in the image below, with 6 million trainable parameters.
During training, I can see the loss decrease from 7.6 to 0.8, but the accuracy does not improve. When I replace the model with VGG16 and train its last 3 layers, the accuracy improves from 50% to 90% while the loss drops from 6.0 to 0.5.
Where am I going wrong? Is there anything specific I should look at while training a ResNet model? How should I train the ResNet model?

Does the number of epochs used for an autoencoder depend on the size of the dataset?

I am developing a simple autoencoder, and to find the right parameters I run a grid search on a small subset of the dataset. Can the number of epochs found this way be reused on the larger training set? Does the number of epochs depend on the size of the dataset or not? E.g., would I need more epochs for a large dataset and fewer epochs for a small one?
In general, yes: the number of epochs will change if the dataset is bigger.
The number of epochs should not be decided a priori. You should run the training, monitor the training and validation losses over time, and stop training when the validation loss reaches a plateau or starts increasing. This technique is called "early stopping" and is good practice in machine learning.
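The stopping rule described above can be sketched in a few lines of plain Python. This is only an illustration: the function name, the `patience` and `min_delta` parameters, and the validation losses are made up for the example, and in practice you would typically use the early-stopping utility your framework provides.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the validation loss has failed to improve on the
    best value seen so far by more than `min_delta` for `patience`
    consecutive epochs.
    """
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss          # new best validation loss
            bad_epochs = 0       # reset the patience counter
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch
    return None  # the loss kept improving; never triggered


# Hypothetical validation losses: improvement until epoch 4, then a plateau.
val_losses = [0.90, 0.70, 0.60, 0.58, 0.59, 0.61, 0.60, 0.65]
stop_epoch = early_stopping_epoch(val_losses, patience=3)
print("stop at epoch:", stop_epoch)
```

With this rule the epoch count falls out of the data instead of being fixed in advance, so a grid search on a small subset does not need to fix it for the full dataset.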

How do I balance a training dataset which has very high number of samples for a certain class?

I have been working on the Sentiment analysis prediction using the Rotten Tomatoes movie reviews dataset.
The dataset has 5 classes {0,1,2,3,4}, where 0 is very negative and 4 is very positive.
The dataset is highly unbalanced,
total samples = 156061
'0': 7072 (4.5%),
'1': 27273 (17.4%),
'2': 79583 (50.9%),
'3': 32927 (21%),
'4': 9206 (5.8%)
As you can see, class 2 has about 50% of the samples, while classes 0 and 4 together contribute only ~10% of the training set.
So there is a very strong bias toward class 2, which reduces the classification accuracy for classes 0 and 4.
What can I do to balance the dataset? One solution would be to get an equal number of samples by reducing every class to only 7072 samples, but that shrinks the dataset drastically!
How can I optimize and balance the dataset without hurting the overall classification accuracy?
You should not balance the dataset; you should train the classifier in a balanced manner. Nearly all existing classifiers can be trained with some cost-sensitive objective. For example, SVMs let you "weight" your samples, so simply weight the samples of the smaller classes more. Similarly, Naive Bayes has class priors: change them! Random forests, neural networks, logistic regression: they all let you somehow "weight" samples, and this is the core technique for getting more balanced results.
For classification problems, you can try the class_weight='balanced' option in your estimator, such as LogisticRegression or an SVM. For example:
http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression
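As a rough illustration of what such weighting does, the sketch below computes per-class weights from the class counts in the question using the heuristic scikit-learn documents for class_weight='balanced', namely n_samples / (n_classes * count of that class). The function name is hypothetical, and the snippet is plain Python and does not call scikit-learn.

```python
# Class counts taken from the question (Rotten Tomatoes sentiment labels).
counts = {0: 7072, 1: 27273, 2: 79583, 3: 32927, 4: 9206}


def balanced_class_weights(counts):
    """Weight each class by n_samples / (n_classes * class_count)."""
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


weights = balanced_class_weights(counts)
for c in sorted(weights):
    print(f"class {c}: weight {weights[c]:.3f}")
```

Rare classes (0 and 4) get weights above 1 and the dominant class 2 gets a weight below 1, so every class contributes equally to the weighted loss without discarding any samples.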
