Accuracy doesn't change over all epochs with multi-class classification

I am trying to train a model to solve a multi-class classification problem.
My problem is that training accuracy and validation accuracy don't change over the epochs, like this:
Train on 4642 samples, validate on 516 samples
Epoch 1/100
- 1s - loss: 1.7986 - acc: 0.4649 - val_loss: 1.7664 - val_acc: 0.4942
Epoch 2/100
- 1s - loss: 1.6998 - acc: 0.5017 - val_loss: 1.7035 - val_acc: 0.4942
Epoch 3/100
- 1s - loss: 1.6956 - acc: 0.5022 - val_loss: 1.7000 - val_acc: 0.4942
Epoch 4/100
- 1s - loss: 1.6900 - acc: 0.5022 - val_loss: 1.6954 - val_acc: 0.4942
Epoch 5/100
- 1s - loss: 1.6931 - acc: 0.5017 - val_loss: 1.7058 - val_acc: 0.4942
...
Epoch 98/100
- 1s - loss: 1.6842 - acc: 0.5022 - val_loss: 1.6995 - val_acc: 0.4942
Epoch 99/100
- 1s - loss: 1.6844 - acc: 0.5022 - val_loss: 1.6977 - val_acc: 0.4942
Epoch 100/100
- 1s - loss: 1.6838 - acc: 0.5022 - val_loss: 1.6934 - val_acc: 0.4942
My code with keras:
y_train = to_categorical(y_train, num_classes=11)
X_train, X_test, Y_train, Y_test = train_test_split(x_train, y_train,
test_size=0.1, random_state=42)
model = Sequential()
model.add(Dense(64, init='normal', activation='relu', input_dim=160))
model.add(Dropout(0.3))
model.add(Dense(32, init='normal', activation='relu'))
model.add(BatchNormalization())
model.add(Dense(11, init='normal', activation='softmax'))
model.summary()
print("[INFO] compiling model...")
model.compile(optimizer=keras.optimizers.Adam(lr=0.01, beta_1=0.9,
beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False),
loss='categorical_crossentropy',
metrics=['accuracy'])
print("[INFO] training network...")
model.fit(X_train, Y_train, epochs=100, batch_size=32, verbose=2, validation_data = (X_test, Y_test))
Please help me. Thank you!

I had a similar problem once. For me, it helped to make sure I didn't have too many missing values in x_train (filling them with a value representing unknown or with the median), to drop columns that really didn't help (all entries had the same value), and to normalize the x_train data.
Example from my data/model:
import numpy as np
import pandas as pd
# load data
x_main = pd.read_csv("glioma DB X.csv")
y_main = pd.read_csv("glioma DB Y.csv")
# fill with median (will have to improve later, not done yet)
fill_median =['Surgery_SBRT','df','Dose','Ki67','KPS','BMI','tumor_size']
x_main[fill_median] = x_main[fill_median].fillna(x_main[fill_median].median())
x_main['Neurofc'] = x_main['Neurofc'].fillna(2)
x_main['comorbid'] = x_main['comorbid'].fillna(int(x_main['comorbid'].median()))
# drop surgery
x_main = x_main.drop(['Surgery'], axis=1)
# normalize all x
x_main_normalized = x_main.apply(lambda x: (x-np.mean(x))/(np.std(x)+1e-10))
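The answer mentions dropping columns whose entries are all identical, but the example drops only one column by name. Here is a minimal sketch of how constant columns could be detected automatically before normalizing. The array `X` and its values are made up for illustration and are not taken from the glioma data.

```python
import numpy as np

# Hypothetical feature matrix: the middle column never changes.
X = np.array([
    [1.0, 5.0, 10.0],
    [2.0, 5.0, 30.0],
    [3.0, 5.0, 30.0],
    [4.0, 5.0, 40.0],
])

# Keep only columns with non-zero spread; a constant column carries no signal.
keep = X.std(axis=0) > 0
X_kept = X[:, keep]

# Standardize the remaining columns to zero mean and unit variance,
# using the same small epsilon as the answer to avoid division by zero.
X_norm = (X_kept - X_kept.mean(axis=0)) / (X_kept.std(axis=0) + 1e-10)

print(keep)          # boolean mask of the columns that were kept
print(X_norm.shape)  # shape after dropping the constant column
```

Dropping constant columns first also means the epsilon in the denominator is only a safety net and no longer does the real work.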

Related

Binary classifier only making true negative and false positive predictions [closed]

I'm building a neural network to classify doublets of 100*80 images into two classes.
My accuracy is capped at around 88% no matter what I try to do (add convolutional layers, dropouts...).
I've investigated the issue and found from the confusion matrix that my model is only making true negative and false positive predictions. I have no idea how this is possible and was wondering if anyone could help me.
Here is some of the code (I've used a really simple model architecture here):
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size = 0.2, shuffle = True)
model = keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape = (100,80,2)))
model.add(keras.layers.Dense(5, activation = 'relu'))
model.add(keras.layers.Dense(1, activation = 'sigmoid'))
model.compile(optimizer='adam',
loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
metrics=['accuracy'])
model.fit(X_train, y_train, epochs =10, batch_size= 200, validation_data = (X_test, y_test))
Output for training:
Epoch 1/10
167/167 [==============================] - 6s 31ms/step - loss: 0.6633 - accuracy: 0.8707 - val_loss: 0.6345 - val_accuracy: 0.8813
Epoch 2/10
167/167 [==============================] - 2s 13ms/step - loss: 0.6087 - accuracy: 0.8827 - val_loss: 0.5848 - val_accuracy: 0.8813
Epoch 3/10
167/167 [==============================] - 2s 13ms/step - loss: 0.5630 - accuracy: 0.8828 - val_loss: 0.5435 - val_accuracy: 0.8813
Epoch 4/10
167/167 [==============================] - 2s 13ms/step - loss: 0.5249 - accuracy: 0.8828 - val_loss: 0.5090 - val_accuracy: 0.8813
Epoch 5/10
167/167 [==============================] - 2s 12ms/step - loss: 0.4931 - accuracy: 0.8828 - val_loss: 0.4805 - val_accuracy: 0.8813
Epoch 6/10
167/167 [==============================] - 2s 13ms/step - loss: 0.4663 - accuracy: 0.8828 - val_loss: 0.4567 - val_accuracy: 0.8813
Epoch 7/10
167/167 [==============================] - 2s 14ms/step - loss: 0.4424 - accuracy: 0.8832 - val_loss: 0.4363 - val_accuracy: 0.8813
Epoch 8/10
167/167 [==============================] - 3s 17ms/step - loss: 0.4198 - accuracy: 0.8848 - val_loss: 0.4190 - val_accuracy: 0.8816
Epoch 9/10
167/167 [==============================] - 2s 15ms/step - loss: 0.3982 - accuracy: 0.8887 - val_loss: 0.4040 - val_accuracy: 0.8816
Epoch 10/10
167/167 [==============================] - 3s 15ms/step - loss: 0.3784 - accuracy: 0.8942 - val_loss: 0.3911 - val_accuracy: 0.8821
loss, accuracies = model1.evaluate(X_test, y_test)
261/261 [==============================] - 1s 2ms/step - loss: 0.3263 - accuracy: 0.8813
y_pred = model1.predict(X_test)
y_pred = (y_pred > 0.5)
confusion_matrix((y_test > 0.5), y_pred )
array([[ 0, 990],
[ 0, 7353]])
First, check how imbalanced your data is.
For example, if your dataset contains 10 samples, of which 9 belong to class A and 1 to class B, the model can maximize its accuracy by always predicting class A, and it would still reach 90% accuracy.
What you actually want is to penalize it heavily for mistakes on the under-represented class, i.e. class B.
So if your data is indeed imbalanced, you can try changing the metric from ['accuracy'] to ['matthews_correlation'] (if your Keras version provides it),
e.g.
model.compile(optimizer='adam',
loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
metrics=['matthews_correlation'])
This metric does what I explained at the beginning: it penalizes mistakes on the under-represented class much more heavily than accuracy does.
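To make the point concrete, here is a small sketch that computes accuracy and the Matthews correlation coefficient by hand from the confusion matrix posted in the question. It assumes the scikit-learn layout `[[TN, FP], [FN, TP]]`; the helper names are illustrative and not part of any library.

```python
import math

def accuracy(tn, fp, fn, tp):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tn + fp + fn + tp)

def matthews_corrcoef(tn, fp, fn, tp):
    """Matthews correlation coefficient from the four confusion-matrix cells."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is 0 when a row or column total is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Values from the question's confusion matrix: array([[0, 990], [0, 7353]])
tn, fp, fn, tp = 0, 990, 0, 7353

print(round(accuracy(tn, fp, fn, tp), 4))  # 0.8813
print(matthews_corrcoef(tn, fp, fn, tp))   # 0.0
```

The accuracy matches the 0.8813 reported in the logs, while the coefficient is 0, which is what a model that always predicts the same class should score. Note that a metric only changes what is reported during training. To change what the optimizer penalizes, the `class_weight` argument of `fit` is one commonly used option.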

Keras - validation score in fitting log is not correct

I am training a multitarget classification model with keras. My architecture is:
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
imput_ = Input(shape=(X_train.shape[1]))
x = Dense(50, activation="relu")(imput_)
x = Dense(n_targets, activation="sigmoid", name="output")(x)
model = Model(imput_, x)
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
Then I fit my model like this:
model.fit(X_train, y_train.toarray(), validation_data=(X_test, y_test.toarray()), epochs=5)
The fitting log shows this:
Epoch 1/5
36/36 [==============================] - 1s 10ms/step - loss: 0.5161 - accuracy: 0.0614 - val_loss: 0.3365 - val_accuracy: 0.1434
Epoch 2/5
36/36 [==============================] - 0s 6ms/step - loss: 0.2761 - accuracy: 0.2930 - val_loss: 0.2429 - val_accuracy: 0.4560
Epoch 3/5
36/36 [==============================] - 0s 5ms/step - loss: 0.2255 - accuracy: 0.4435 - val_loss: 0.2187 - val_accuracy: 0.5130
Epoch 4/5
36/36 [==============================] - 0s 5ms/step - loss: 0.2037 - accuracy: 0.4800 - val_loss: 0.2040 - val_accuracy: 0.5199
Epoch 5/5
36/36 [==============================] - 0s 5ms/step - loss: 0.1876 - accuracy: 0.4996 - val_loss: 0.1929 - val_accuracy: 0.5250
<keras.callbacks.History at 0x7fe0a549ee10>
But then if I run:
from sklearn.metrics import accuracy_score
accuracy_score(np.round(model.predict(X_test)), y_test.toarray())
I got the following score:
0.07772020725388601
Shouldn't the score be equal to the val accuracy score in the last epoch?
With a sigmoid output and binary cross-entropy, the highest predicted probability for a sample may still be below 0.5, in which case np.round turns every entry in that row into 0.
Try:
y_pred = np.argmax(model.predict(X_test), axis=1)
accuracy_score(y_test, y_pred)
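The gap between the two numbers can be reproduced on a toy example. The sketch below uses made-up sigmoid outputs and assumes exactly one positive target per row, which may not hold for real multi-target data. It compares exact-match accuracy after rounding (what `accuracy_score` computes on 2D indicator arrays) with top-1 agreement via `argmax`.

```python
import numpy as np

# Hypothetical sigmoid outputs for 4 samples and 3 targets.
probs = np.array([
    [0.40, 0.30, 0.10],  # top score below 0.5: rounds to all zeros
    [0.70, 0.20, 0.10],
    [0.20, 0.45, 0.30],  # top score below 0.5: rounds to all zeros
    [0.10, 0.20, 0.90],
])
y_true = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
])

# Exact-match accuracy: every target in a row must agree after rounding.
rounded = np.round(probs)
exact_match = np.mean(np.all(rounded == y_true, axis=1))

# Top-1 accuracy: only the position of the largest score has to agree.
top1 = np.mean(np.argmax(probs, axis=1) == np.argmax(y_true, axis=1))

print(exact_match)  # 0.5
print(top1)         # 1.0
```

Rows whose best score stays under 0.5 count as wrong under exact match but as correct under top-1, so the two scores measure different things and need not agree.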

Getting Different results on Each Iteration using Long Short Term Memory[LSTM] for text classification

I am using an LSTM deep-learning model to classify my text. First I split the data into texts and labels using the pandas library, tokenize the texts, and then divide them into training and test sets. Whenever I run the code I get different results, with accuracy varying from 80 to 100 percent.
Here is my code:
tokenizer = Tokenizer(num_words=MAX_NB_WORDS, filters='!"#$%&()*+,-./:;<=>?#[\]^_`{|}~',
lower=True)
tokenizer.fit_on_texts(trainDF['texts'])
word_index = tokenizer.word_index
print('Found %s unique tokens.' % len(word_index))
X = tokenizer.texts_to_sequences(trainDF['texts'])
X = pad_sequences(X, maxlen=MAX_SEQUENCE_LENGTH)
print('Shape of data tensor:', X.shape)
Y = pd.get_dummies(trainDF['label'])
print('Shape of label tensor:', Y.shape)
X_train, X_test, Y_train, Y_test = train_test_split(X,Y, test_size = 0.10, random_state = 42)
print(X_train.shape,Y_train.shape)
print(X_test.shape,Y_test.shape)
model = Sequential()
model.add(Embedding(MAX_NB_WORDS, EMBEDDING_DIM, input_length=X.shape[1]))
model.add(SpatialDropout1D(0.2))
model.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2))
variables_for_classification=6 #change it as per your number of categories
model.add(Dense(variables_for_classification, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
epochs = 5
batch_size = 64
history = model.fit(X_train, Y_train, epochs=epochs,
batch_size=batch_size,validation_split=0.1,callbacks=[EarlyStopping(monitor='val_loss', patience=3,
min_delta=0.0001)])
accr = model.evaluate(X_test,Y_test)
print('Test set\n Loss: {:0.3f}\n Accuracy: {:0.3f}'.format(accr[0],accr[1]))
Train on 794 samples, validate on 89 samples
Epoch 1/5
794/794 [==============================] - 19s 24ms/step - loss: 1.6401 - accuracy: 0.6297 - val_loss: 0.9098 - val_accuracy: 0.5843
Epoch 2/5
794/794 [==============================] - 16s 20ms/step - loss: 0.8365 - accuracy: 0.7166 - val_loss: 0.7487 - val_accuracy: 0.7753
Epoch 3/5
794/794 [==============================] - 16s 20ms/step - loss: 0.7093 - accuracy: 0.8401 - val_loss: 0.6519 - val_accuracy: 0.8652
Epoch 4/5
794/794 [==============================] - 16s 20ms/step - loss: 0.5857 - accuracy: 0.8829 - val_loss: 0.4935 - val_accuracy: 1.0000
Epoch 5/5
794/794 [==============================] - 16s 20ms/step - loss: 0.4248 - accuracy: 0.9345 - val_loss: 0.3512 - val_accuracy: 0.8652
99/99 [==============================] - 0s 2ms/step
Test set
Loss: 0.348
Accuracy: 0.869
In the last run, accuracy was 100 percent.

Model loss remains unchanged

I would like to understand what could be responsible for this model loss behaviour. When training a CNN with 6 hidden layers, the loss shoots up from around 1.8 to above 12 after the first epoch and remains constant for the remaining 99 epochs.
724504/724504 [==============================] - 358s 494us/step - loss: 1.8143 - acc: 0.7557 - val_loss: 16.1181 - val_acc: 0.0000e+00
Epoch 2/100
724504/724504 [==============================] - 355s 490us/step - loss: 12.0886 - acc: 0.2500 - val_loss: 16.1181 - val_acc: 0.0000e+00
Epoch 3/100
724504/724504 [==============================] - 354s 489us/step - loss: 12.0886 - acc: 0.2500 - val_loss: 16.1181 - val_acc: 0.0000e+00
Epoch 4/100
724504/724504 [==============================] - 348s 481us/step - loss: 12.0886 - acc: 0.2500 - val_loss: 16.1181 - val_acc: 0.0000e+00
Epoch 5/100
724504/724504 [==============================] - 355s 490us/step - loss: 12.0886 - acc: 0.2500 - val_loss: 16.1181 - val_acc: 0.0000e+00
I don't believe this has to do with the dataset I work with, because I tried a different, publicly available dataset and the performance was exactly the same (in fact, the exact same figures for loss and accuracy).
I also tested this with a somewhat shallower network with 2 hidden layers; see the performance below:
724504/724504 [==============================] - 41s 56us/step - loss: 0.4974 - acc: 0.8236 - val_loss: 15.5007 - val_acc: 0.0330
Epoch 2/100
724504/724504 [==============================] - 40s 56us/step - loss: 0.5204 - acc: 0.8408 - val_loss: 15.5543 - val_acc: 0.0330
Epoch 3/100
724504/724504 [==============================] - 41s 56us/step - loss: 0.6646 - acc: 0.8439 - val_loss: 15.3904 - val_acc: 0.0330
Epoch 4/100
724504/724504 [==============================] - 41s 57us/step - loss: 8.8982 - acc: 0.4342 - val_loss: 15.5867 - val_acc: 0.0330
Epoch 5/100
724504/724504 [==============================] - 41s 57us/step - loss: 0.5627 - acc: 0.8444 - val_loss: 15.5449 - val_acc: 0.0330
Can someone point out the probable cause of this behaviour? Which parameter or configuration needs to be adjusted?
EDIT
Model creation
model = Sequential()
activ = 'relu'
model.add(Conv2D(32, (1, 3), strides=(1, 1), padding='same', activation=activ, input_shape=(1, n_points, 4)))
model.add(Conv2D(32, (1, 3), strides=(1, 1), padding='same', activation=activ))
model.add(MaxPooling2D(pool_size=(1, 2)))
#model.add(Dropout(.5))
model.add(Conv2D(64, (1, 3), strides=(1, 1), padding='same', activation=activ))
model.add(Conv2D(64, (1, 3), strides=(1, 1), padding='same', activation=activ))
model.add(MaxPooling2D(pool_size=(1, 2)))
#model.add(Dropout(.5))
model.add(Conv2D(128, (1, 3), strides=(1, 1), padding='same', activation=activ))
model.add(Conv2D(128, (1, 3), strides=(1, 1), padding='same', activation=activ))
model.add(MaxPooling2D(pool_size=(1, 2)))
model.add(Dropout(.5))
model.add(Flatten())
A = model.output_shape
model.add(Dense(int(A[1] * 1/4.), activation=activ))
model.add(Dropout(.5))
model.add(Dense(NoClass, activation='softmax'))
optimizer = Adam(lr=0.0001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_resample, Y_resample, epochs=100, batch_size=64, shuffle=False,
validation_data=(Test_X, Test_Y))
After changing the learning rate to lr=0.0001, here's the result after 100 epochs.
72090/72090 [==============================] - 29s 397us/step - loss: 0.5040 - acc: 0.8347 - val_loss: 4.3529 - val_acc: 0.2072
Epoch 99/100
72090/72090 [==============================] - 28s 395us/step - loss: 0.4958 - acc: 0.8382 - val_loss: 6.3422 - val_acc: 0.1806
Epoch 100/100
72090/72090 [==============================] - 28s 393us/step - loss: 0.5084 - acc: 0.8342 - val_loss: 4.3781 - val_acc: 0.1925
the optimal epoch size: 97, the value of high accuracy 0.20716827656581954
EDIT 2
Apparently, SMOTE isn't well suited to oversampling every class except the majority class in a multi-class problem; see the train/test plot:
[Figure: train/test plot]
Can you please try using BatchNormalization as well? Place it just after your pooling layers; it is generally good to include it.
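As a side note on the first log in the question, the frozen values (val_loss 16.1181, and loss 12.0886 at acc 0.2500) are consistent with a softmax output that has collapsed to a fully confident prediction of a single class. The arithmetic below is a sketch under one assumption: that predicted probabilities are clipped at 1e-7 before the logarithm, which is the usual Keras backend epsilon. It does not prove that this is what happened in this run.

```python
import math

EPSILON = 1e-7  # assumed clipping value applied to predicted probabilities

# Cross-entropy of one sample whose true class received (clipped) zero probability.
saturated_loss = -math.log(EPSILON)

# If 25% of samples are right (loss near 0) and 75% are confidently wrong:
mean_loss_at_quarter_accuracy = 0.75 * saturated_loss

print(round(saturated_loss, 4))                 # 16.1181
print(round(mean_loss_at_quarter_accuracy, 4))  # 12.0886
```

Both figures match the log, which would also explain why a different dataset produced exactly the same numbers: once the output saturates, the loss depends only on the fraction of correct predictions, not on the data.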

Strange validation loss and accuracy

I'm trying to use an MLP for classification. Here is what the model looks like.
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.utils import np_utils
model = Sequential()
model.add(Dense(256, activation='relu', input_dim=400))
model.add(Dropout(0.5))
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(number_of_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
X_train = input_data
y_train = np_utils.to_categorical(encoded_labels, number_of_classes)
history = model.fit(X_train, y_train, validation_split=0.2, nb_epoch=10, verbose=1)
But when I train my model, I see that training accuracy improves while validation accuracy barely moves and validation loss stays high.
Using TensorFlow backend.
Train on 41827 samples, validate on 10457 samples
Epoch 1/10
41827/41827 [==============================] - 7s - loss: 2.5783 - acc: 0.3853 - val_loss: 14.2315 - val_acc: 0.0031
Epoch 2/10
41827/41827 [==============================] - 6s - loss: 1.0356 - acc: 0.7011 - val_loss: 14.8957 - val_acc: 0.0153
Epoch 3/10
41827/41827 [==============================] - 6s - loss: 0.7935 - acc: 0.7691 - val_loss: 15.2258 - val_acc: 0.0154
Epoch 4/10
41827/41827 [==============================] - 6s - loss: 0.6734 - acc: 0.8013 - val_loss: 15.4279 - val_acc: 0.0153
Epoch 5/10
41827/41827 [==============================] - 6s - loss: 0.6188 - acc: 0.8185 - val_loss: 15.4588 - val_acc: 0.0165
Epoch 6/10
41827/41827 [==============================] - 6s - loss: 0.5847 - acc: 0.8269 - val_loss: 15.5796 - val_acc: 0.0176
Epoch 7/10
41827/41827 [==============================] - 6s - loss: 0.5488 - acc: 0.8395 - val_loss: 15.6464 - val_acc: 0.0154
Epoch 8/10
41827/41827 [==============================] - 6s - loss: 0.5398 - acc: 0.8418 - val_loss: 15.6705 - val_acc: 0.0164
Epoch 9/10
41827/41827 [==============================] - 6s - loss: 0.5287 - acc: 0.8439 - val_loss: 15.7259 - val_acc: 0.0163
Epoch 10/10
41827/41827 [==============================] - 6s - loss: 0.4923 - acc: 0.8547 - val_loss: 15.7484 - val_acc: 0.0187
Is the problem related to the training data, or is something wrong with my training setup?
Your model seems to be strongly overfitting. It probably has something to do with the data, but you could try lowering your learning rate first, just in case.
from keras.optimizers import Adam
model.compile(loss='categorical_crossentropy',
optimizer=Adam(lr=0.0001),
metrics=['accuracy'])
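One data-related cause worth ruling out is the ordering of the samples. Keras takes the `validation_split` fraction from the end of the arrays, before any shuffling, so if the data is stored sorted by class the validation set may contain classes the model never trained on. The sketch below illustrates this with plain Python; the label layout (10 classes, 100 samples each, sorted) is an assumption made for illustration and is not known from the question.

```python
import random

# Hypothetical stand-in for encoded_labels: 10 classes, 100 samples each,
# stored sorted by class (an assumption about the data, not a known fact).
labels = [cls for cls in range(10) for _ in range(100)]
samples = list(range(len(labels)))  # placeholder feature rows

def held_out_classes(y, validation_split=0.2):
    """Classes found in the trailing slice that validation_split would hold out."""
    n_val = int(len(y) * validation_split)
    return set(y[-n_val:])

sorted_tail = held_out_classes(labels)

# Shuffle rows and labels together, with one shared ordering, before fit().
order = list(range(len(labels)))
random.Random(0).shuffle(order)
samples_shuffled = [samples[i] for i in order]
labels_shuffled = [labels[i] for i in order]
shuffled_tail = held_out_classes(labels_shuffled)

print(sorted(sorted_tail))  # [8, 9]
print(len(shuffled_tail))   # number of distinct classes after shuffling
```

With sorted data the held-out slice contains only the last two classes, so training never sees them and validation accuracy stays near zero. After shuffling, the held-out slice covers a mix of classes.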

Resources