Keras ROC different from Scikit ROC? - machine-learning

From the code below, it looks like evaluating the ROC AUC with Keras and with scikit-learn gives different results. Does anybody know an explanation?
import tensorflow as tf
from keras.layers import Dense, Input, Dropout
from keras import Sequential
import keras
from keras.constraints import maxnorm
from sklearn.metrics import roc_auc_score
# training data: X_train, y_train
# validation data: X_valid, y_valid
# Define the custom callback we will be using to evaluate roc with scikit
class MyCustomCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        y_pred = model.predict(X_valid)
        print("roc evaluated with scikit = ", roc_auc_score(y_valid, y_pred))
        return
# Define the model.
def model():
    METRICS = [
        tf.keras.metrics.BinaryAccuracy(name='accuracy'),
        tf.keras.metrics.AUC(name='auc'),
    ]
    optimizer = "adam"
    dropout = 0.1
    init = 'uniform'
    nbr_features = vocab_size - 1  # 2500
    dense_nparams = 256
    model = Sequential()
    model.add(Dense(dense_nparams, activation='relu', input_shape=(nbr_features,), kernel_initializer=init, kernel_constraint=maxnorm(3)))
    model.add(Dropout(dropout))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=METRICS)
    return model
# instantiate the model
model = model()
# fit the model
history = model.fit(x=X_train, y=y_train, batch_size = 8, epochs = 8, verbose=1,validation_data = (X_valid,y_valid), callbacks=[MyCustomCallback()], shuffle=True, validation_freq=1, max_queue_size=10, workers=4, use_multiprocessing=True)
Output:
Train on 4000 samples, validate on 1000 samples
Epoch 1/8
4000/4000 [==============================] - 15s 4ms/step - loss: 0.7950 - accuracy: 0.7149 - auc: 0.7213 - val_loss: 0.7551 - val_accuracy: 0.7608 - val_auc: 0.7770
roc evaluated with scikit = 0.78766515781747
Epoch 2/8
4000/4000 [==============================] - 15s 4ms/step - loss: 0.0771 - accuracy: 0.8235 - auc: 0.8571 - val_loss: 1.0803 - val_accuracy: 0.8574 - val_auc: 0.8954
roc evaluated with scikit = 0.7795984218252997
Epoch 3/8
4000/4000 [==============================] - 14s 4ms/step - loss: 0.0085 - accuracy: 0.8762 - auc: 0.9162 - val_loss: 1.2084 - val_accuracy: 0.8894 - val_auc: 0.9284
roc evaluated with scikit = 0.7705172905961992
Epoch 4/8
4000/4000 [==============================] - 14s 4ms/step - loss: 0.0025 - accuracy: 0.8982 - auc: 0.9361 - val_loss: 1.1700 - val_accuracy: 0.9054 - val_auc: 0.9424
roc evaluated with scikit = 0.7808804338960933
Epoch 5/8
4000/4000 [==============================] - 14s 4ms/step - loss: 0.0020 - accuracy: 0.9107 - auc: 0.9469 - val_loss: 1.1887 - val_accuracy: 0.9150 - val_auc: 0.9501
roc evaluated with scikit = 0.7811174659489438
Epoch 6/8
4000/4000 [==============================] - 14s 4ms/step - loss: 0.0018 - accuracy: 0.9184 - auc: 0.9529 - val_loss: 1.2036 - val_accuracy: 0.9213 - val_auc: 0.9548
roc evaluated with scikit = 0.7822898825544409
Epoch 7/8
4000/4000 [==============================] - 14s 4ms/step - loss: 0.0017 - accuracy: 0.9238 - auc: 0.9566 - val_loss: 1.2231 - val_accuracy: 0.9258 - val_auc: 0.9579
roc evaluated with scikit = 0.7817036742516923
Epoch 8/8
4000/4000 [==============================] - 14s 4ms/step - loss: 0.0016 - accuracy: 0.9278 - auc: 0.9592 - val_loss: 1.2426 - val_accuracy: 0.9293 - val_auc: 0.9600
roc evaluated with scikit = 0.7817419052279585
As you may see, from epoch 2 onwards keras' and scikit's validation ROCs begin diverging. The same happens if I fit the model and then use keras' model.evaluate(X_valid, y_valid). Any help is greatly appreciated.
EDIT: testing the model on a separate test set, I get roc =0.76 so scikit seems to give the correct answer ( btw X_train has 4000 entries, X_valid has 1000 and test has 15000, quite an unconventional splitting but it is forced by external factors).
Also, suggestions on how to improve performance are equally appreciated.
EDIT2: In response to the reply by @arpitrathi, I modified the callback, but unfortunately without success:
class MyCustomCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        y_pred = model.predict_proba(X_valid)
        print("roc evaluated with scikit = ", roc_auc_score(y_valid, y_pred))
        return
model = model()
history = model.fit(x=X_trainl, y=y_train, batch_size = 8, epochs = 3, verbose=1,validation_data = (X_valid,y_valid), callbacks=[MyCustomCallback()], shuffle=True, validation_freq=1, max_queue_size=10, workers=4, use_multiprocessing=True)
Train on 4000 samples, validate on 1000 samples
Epoch 1/3
4000/4000 [==============================] - 20s 5ms/step - loss: 0.8266 - accuracy: 0.7261 - auc: 0.7409 - val_loss: 0.7547 - val_accuracy: 0.7627 - val_auc: 0.7881
roc evaluated with scikit = 0.7921764130168828
Epoch 2/3
4000/4000 [==============================] - 15s 4ms/step - loss: 0.0482 - accuracy: 0.8270 - auc: 0.8657 - val_loss: 1.0831 - val_accuracy: 0.8620 - val_auc: 0.9054
roc evaluated with scikit = 0.78525915504445
Epoch 3/3
4000/4000 [==============================] - 15s 4ms/step - loss: 0.0092 - accuracy: 0.8794 - auc: 0.9224 - val_loss: 1.2226 - val_accuracy: 0.8928 - val_auc: 0.9340
roc evaluated with scikit = 0.7705555215724655
Also, if I plot training and validation accuracy, I see that they both rapidly converge to 1. Is that strange?

The problem lies in the arguments that you passed to the sklearn function for roc_auc_score() calculation. You should use model.predict_proba() instead of model.predict().
def on_epoch_end(self, epoch, logs=None):
    y_pred = model.predict_proba(X_valid)
    print("roc evaluated with scikit = ", roc_auc_score(y_valid, y_pred))
    return

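Sklearn and Keras compute AUC differently by default: sklearn's roc_auc_score uses every distinct predicted score as a threshold, while tf.keras.metrics.AUC approximates the curve with a fixed number of evenly spaced thresholds (num_thresholds, 200 by default). Increasing num_thresholds can help the Keras AUC better match the sklearn AUC.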
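The effect of the threshold count can be illustrated without either library. The following is only a sketch in plain Python: exact_auc and thresholded_auc are hypothetical helpers written for this illustration (the second is loosely modeled on an evenly spaced threshold grid, not copied from the Keras implementation), and the scores are made up so that they all sit close to 0, as saturated sigmoid outputs often do.

```python
def exact_auc(y_true, scores):
    """Exact ROC AUC: fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def thresholded_auc(y_true, scores, num_thresholds):
    """Trapezoidal AUC estimated from an evenly spaced threshold grid on [0, 1]."""
    eps = 1e-7  # pushes the two end thresholds just outside [0, 1]
    inner = [i / (num_thresholds - 1) for i in range(1, num_thresholds - 1)]
    thresholds = [-eps] + inner + [1.0 + eps]
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, y_true) if y == 1 and s > t)
        fp = sum(1 for s, y in zip(scores, y_true) if y == 0 and s > t)
        points.append((fp / n_neg, tp / n_pos))  # one (FPR, TPR) point per threshold
    points.sort()
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Made-up labels and scores, all squeezed into a tiny interval near 0.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.00405, 0.00305, 0.00055, 0.00205, 0.00105, 0.00015]

exact = exact_auc(y_true, scores)
coarse = thresholded_auc(y_true, scores, num_thresholds=5)
fine = thresholded_auc(y_true, scores, num_thresholds=10001)
print(f"exact={exact:.4f} coarse={coarse:.4f} fine={fine:.4f}")
```

With these scores the coarse grid cannot separate any two predictions, so its estimate collapses to 0.5, while the fine grid recovers the exact value of 7/9. Whether this mechanism fully explains the gap in the question is not something the sketch can show.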

Related

Is my model overfitting/underfitting? Can someone explain the behavior?

The accuracy on the training dataset increases steadily and the loss decreases accordingly.
However, the accuracy on the validation dataset fluctuates strangely. It increases overall but sometimes decreases, and it isn't improving at the same rate as on the training dataset. The validation loss decreases overall but sometimes increases too.
Here are my results after 10 epochs:
Epoch 1/10
493/493 [==============================] - 330s 668ms/step - loss: 0.8949 - accuracy: 0.6697 - val_loss: 0.6944 - val_accuracy: 0.6463
Epoch 2/10
493/493 [==============================] - 290s 589ms/step - loss: 0.5457 - accuracy: 0.7958 - val_loss: 0.6451 - val_accuracy: 0.7450
Epoch 3/10
493/493 [==============================] - 331s 672ms/step - loss: 0.5110 - accuracy: 0.8235 - val_loss: 0.8121 - val_accuracy: 0.6904
Epoch 4/10
493/493 [==============================] - 278s 563ms/step - loss: 0.4697 - accuracy: 0.8479 - val_loss: 0.7215 - val_accuracy: 0.7153
Epoch 5/10
493/493 [==============================] - 265s 537ms/step - loss: 0.4395 - accuracy: 0.8726 - val_loss: 0.6471 - val_accuracy: 0.7505
Epoch 6/10
493/493 [==============================] - 277s 561ms/step - loss: 0.4043 - accuracy: 0.8924 - val_loss: 0.5335 - val_accuracy: 0.8169
Epoch 7/10
493/493 [==============================] - 335s 679ms/step - loss: 0.3918 - accuracy: 0.9024 - val_loss: 0.5372 - val_accuracy: 0.8294
Epoch 8/10
493/493 [==============================] - 320s 650ms/step - loss: 0.3679 - accuracy: 0.9111 - val_loss: 0.5790 - val_accuracy: 0.8171
Epoch 9/10
493/493 [==============================] - 299s 606ms/step - loss: 0.3618 - accuracy: 0.9151 - val_loss: 0.3969 - val_accuracy: 0.8874
Epoch 10/10
493/493 [==============================] - 272s 552ms/step - loss: 0.3374 - accuracy: 0.9235 - val_loss: 0.4553 - val_accuracy: 0.8652
Here is my code for the layers etc:
model = tf.keras.models.Sequential([
    # Conv2D: 16 is the number of filters, (3,3) the kernel size; input_shape is the shape of the input image
    tf.keras.layers.Conv2D(16, (3,3), kernel_regularizer=regularizers.l2(0.01), activation='relu', input_shape=(200,200,1)),
    tf.keras.layers.MaxPool2D(2,2),  # keep the maximum of each 2x2 block of pixels
    tf.keras.layers.Conv2D(32, (3,3), kernel_regularizer=regularizers.l2(0.01), activation='relu'),
    tf.keras.layers.MaxPool2D(2,2),
    tf.keras.layers.Conv2D(64, (3,3), kernel_regularizer=regularizers.l2(0.01), activation='relu'),
    tf.keras.layers.MaxPool2D(2,2),
    ## Increasing the number of channels
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu', kernel_regularizer=regularizers.l2(0.01)),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy',
              optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),
              metrics=['accuracy'])
model_fit = model.fit(train_dataset,
                      steps_per_epoch=None,
                      epochs=10,
                      validation_data=validation_dataset)
I added regularization to the Conv2D layers to reduce the overfitting I was previously experiencing. I have also tried changing the regularization values.

Why does the loss remain the same for every epoch in my regression model?

Below is the code for it.
datagen = ImageDataGenerator(rescale=1./255, validation_split=0.1)
train_generator = datagen.flow_from_dataframe(
    df,
    directory=img_data_dir,
    x_col="image_name",
    y_col=["top_x", "top_y", "bottom_x", "bottom_y"],
    target_size=(WIDTH, HEIGHT),
    batch_size=32,
    class_mode="other",
    subset="training")
validation_generator = datagen.flow_from_dataframe(
    df,
    directory=img_data_dir,
    x_col="image_name",
    y_col=["top_x", "top_y", "bottom_x", "bottom_y"],
    target_size=(WIDTH, HEIGHT),
    batch_size=32,
    class_mode="other",
    subset="validation")
model = Sequential()
model.add(VGG16(weights="imagenet", include_top=False, input_shape=(HEIGHT, WIDTH, CHANNEL)))
model.add(Flatten())
model.add(Dense(128, activation="relu"))
model.add(Dense(64, activation="relu"))
model.add(Dense(64, activation="relu"))
model.add(Dense(4, activation="sigmoid"))
model.layers[-6].trainable = False
model.summary()
STEP_SIZE_TRAIN = int(np.ceil(train_generator.n / train_generator.batch_size))
STEP_SIZE_VAL = int(np.ceil(validation_generator.n / validation_generator.batch_size))
print("Train step size:", STEP_SIZE_TRAIN)
print("Validation step size:", STEP_SIZE_VAL)
train_generator.reset()
validation_generator.reset()
adam = Adam(lr=1e-4)
model.compile(optimizer=adam, loss="mse")
history = model.fit(train_generator,
                    steps_per_epoch=STEP_SIZE_TRAIN,
                    validation_data=validation_generator,
                    validation_steps=STEP_SIZE_VAL,
                    epochs=10)
Below are the results for each epoch:
Epoch 1/10
20/20 [==============================] - 376s 18s/step - loss: 436570.7812 - val_loss: 524766.6875
Epoch 2/10
20/20 [==============================] - 14s 732ms/step - loss: 436464.6250 - val_loss: 524765.2500
Epoch 3/10
20/20 [==============================] - 14s 721ms/step - loss: 436464.2188 - val_loss: 524765.1250
Epoch 4/10
20/20 [==============================] - 14s 721ms/step - loss: 436464.1875 - val_loss: 524765.0625
Epoch 5/10
20/20 [==============================] - 14s 722ms/step - loss: 436464.1875 - val_loss: 524765.0625
Epoch 6/10
20/20 [==============================] - 14s 707ms/step - loss: 436464.1875 - val_loss: 524765.0625
Epoch 7/10
20/20 [==============================] - 14s 715ms/step - loss: 436464.1875 - val_loss: 524765.0000
Epoch 8/10
20/20 [==============================] - 14s 713ms/step - loss: 436464.1875 - val_loss: 524765.0000
Epoch 9/10
20/20 [==============================] - 15s 741ms/step - loss: 436464.1250 - val_loss: 524765.0000
Epoch 10/10
20/20 [==============================] - 17s 827ms/step - loss: 436464.0625 - val_loss: 524765.0000
Plot of the loss: [image not included]
From a quick skim of your code, it seems that this line is preventing your model from training:
model.layers[-6].trainable = False
From the [Keras documentation:][1]
Layers & models also feature a boolean attribute trainable. Its value can be changed. Setting layer.trainable to False moves all the layer's weights from trainable to non-trainable. This is called "freezing" the layer: the state of a frozen layer won't be updated during training (either when training with fit() or when training with any custom loop that relies on trainable_weights to apply gradient updates).
Training a model means updating its weight matrices, so freezing a layer may be what prevents the loss from being optimised. Try setting this to True and see what happens. I have never worked with CNNs in Keras, though. Maybe this line freezes a layer that you didn't intend to freeze?
[1]: https://keras.io/guides/transfer_learning/#:~:text=trainable%20to%20False%20moves%20all,trainable_weights%20to%20apply%20gradient%20updates).

Binary classifier only making true negative and false positive predictions [closed]

I'm building a neural network to classify doublets of 100*80 images into two classes.
My accuracy is capped at around 88% no matter what I try to do (add convolutional layers, dropouts...).
I've investigated the issue and found from the confusion matrix that my model is only making true negative and false positive predictions. I have no idea how this is possible and was wondering if anyone could help me.
Here is some of the code (I've used a really simple model architecture here):
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size = 0.2, shuffle = True)
model = keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape = (100,80,2)))
model.add(keras.layers.Dense(5, activation = 'relu'))
model.add(keras.layers.Dense(1, activation = 'sigmoid'))
model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
              metrics=['accuracy'])
model.fit(X_train, y_train, epochs =10, batch_size= 200, validation_data = (X_test, y_test))
Output for training:
Epoch 1/10
167/167 [==============================] - 6s 31ms/step - loss: 0.6633 - accuracy: 0.8707 - val_loss: 0.6345 - val_accuracy: 0.8813
Epoch 2/10
167/167 [==============================] - 2s 13ms/step - loss: 0.6087 - accuracy: 0.8827 - val_loss: 0.5848 - val_accuracy: 0.8813
Epoch 3/10
167/167 [==============================] - 2s 13ms/step - loss: 0.5630 - accuracy: 0.8828 - val_loss: 0.5435 - val_accuracy: 0.8813
Epoch 4/10
167/167 [==============================] - 2s 13ms/step - loss: 0.5249 - accuracy: 0.8828 - val_loss: 0.5090 - val_accuracy: 0.8813
Epoch 5/10
167/167 [==============================] - 2s 12ms/step - loss: 0.4931 - accuracy: 0.8828 - val_loss: 0.4805 - val_accuracy: 0.8813
Epoch 6/10
167/167 [==============================] - 2s 13ms/step - loss: 0.4663 - accuracy: 0.8828 - val_loss: 0.4567 - val_accuracy: 0.8813
Epoch 7/10
167/167 [==============================] - 2s 14ms/step - loss: 0.4424 - accuracy: 0.8832 - val_loss: 0.4363 - val_accuracy: 0.8813
Epoch 8/10
167/167 [==============================] - 3s 17ms/step - loss: 0.4198 - accuracy: 0.8848 - val_loss: 0.4190 - val_accuracy: 0.8816
Epoch 9/10
167/167 [==============================] - 2s 15ms/step - loss: 0.3982 - accuracy: 0.8887 - val_loss: 0.4040 - val_accuracy: 0.8816
Epoch 10/10
167/167 [==============================] - 3s 15ms/step - loss: 0.3784 - accuracy: 0.8942 - val_loss: 0.3911 - val_accuracy: 0.8821
Out[85]:
<keras.callbacks.History at 0x7fe3ce8dedd0>
loss, accuracies = model1.evaluate(X_test, y_test)
261/261 [==============================] - 1s 2ms/step - loss: 0.3263 - accuracy: 0.8813
y_pred = model1.predict(X_test)
y_pred = (y_pred > 0.5)
confusion_matrix((y_test > 0.5), y_pred )
array([[ 0, 990],
[ 0, 7353]])
First, check how imbalanced your data is.
For example, if your dataset contains 10 samples, of which 9 are class A and 1 is class B, your model will likely maximize its accuracy by simply always predicting class A, and it would still get 90% accuracy.
What you actually want is for mistakes on the under-represented class, i.e. class B, to count heavily.
So if your data is indeed imbalanced, you can try changing the metric from ['accuracy'] to ['matthews_correlation'] (as far as I know this string was only a built-in metric name in early Keras releases; in newer versions you would have to supply your own implementation),
e.g.
model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
              metrics=['matthews_correlation'])
This does what I explained at the beginning: the metric reflects mistakes on the under-represented class much more strongly than accuracy does.
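To see how differently the two metrics react, the confusion matrix from the question can be plugged into both formulas. This is a minimal sketch in plain Python, assuming scikit-learn's layout [[TN, FP], [FN, TP]] and the common convention that the Matthews correlation coefficient is 0 when its denominator vanishes; the helper name accuracy_and_mcc is made up for this illustration.

```python
import math


def accuracy_and_mcc(tn, fp, fn, tp):
    """Return (accuracy, Matthews correlation coefficient) for a 2x2 confusion matrix."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: MCC is 0 when any row or column of the matrix is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Values read from the confusion matrix in the question: [[0, 990], [0, 7353]]
acc, mcc = accuracy_and_mcc(tn=0, fp=990, fn=0, tp=7353)
print(f"accuracy = {acc:.4f}, MCC = {mcc:.4f}")
```

Accuracy comes out at about 0.88, matching the training log, while the MCC is 0, which is what one would expect from a classifier that always predicts the same class.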

Keras - validation score in fitting log is not correct

I am training a multitarget classification model with keras. My architecture is:
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
imput_ = Input(shape=(X_train.shape[1]))
x = Dense(50, activation="relu")(imput_)
x = Dense(n_targets, activation="sigmoid", name="output")(x)
model = Model(imput_, x)
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
Then I fit my model like this:
model.fit(X_train, y_train.toarray(), validation_data=(X_test, y_test.toarray()), epochs=5)
The fitting log shows this:
Epoch 1/5
36/36 [==============================] - 1s 10ms/step - loss: 0.5161 - accuracy: 0.0614 - val_loss: 0.3365 - val_accuracy: 0.1434
Epoch 2/5
36/36 [==============================] - 0s 6ms/step - loss: 0.2761 - accuracy: 0.2930 - val_loss: 0.2429 - val_accuracy: 0.4560
Epoch 3/5
36/36 [==============================] - 0s 5ms/step - loss: 0.2255 - accuracy: 0.4435 - val_loss: 0.2187 - val_accuracy: 0.5130
Epoch 4/5
36/36 [==============================] - 0s 5ms/step - loss: 0.2037 - accuracy: 0.4800 - val_loss: 0.2040 - val_accuracy: 0.5199
Epoch 5/5
36/36 [==============================] - 0s 5ms/step - loss: 0.1876 - accuracy: 0.4996 - val_loss: 0.1929 - val_accuracy: 0.5250
<keras.callbacks.History at 0x7fe0a549ee10>
But then if I run:
from sklearn.metrics import accuracy_score
accuracy_score(np.round(model.predict(X_test)), y_test.toarray())
I got the following score:
0.07772020725388601
Shouldn't the score be equal to the val accuracy score in the last epoch?
With that loss and activation function, your top probability might not be higher than 0.5, and it would become 0 when you use np.round.
Try:
y_true = np.argmax(y_test.toarray(), axis=1)  # assumes exactly one active label per row
y_pred = np.argmax(model.predict(X_test), axis=1)
accuracy_score(y_true, y_pred)
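The difference between rounding and taking the argmax can be reproduced on a toy example. The sketch below uses plain Python lists instead of NumPy arrays; the probabilities and labels are invented, and the helpers round_predictions and argmax are hypothetical stand-ins for np.round and np.argmax. It compares exact-row matching after rounding (which is how sklearn's accuracy_score treats 2-D indicator arrays) with top-class matching.

```python
def round_predictions(probs):
    """Threshold each probability at 0.5 (np.round gives the same result on [0, 1])."""
    return [[1 if p > 0.5 else 0 for p in row] for row in probs]


def argmax(row):
    """Index of the largest value in a row."""
    return max(range(len(row)), key=row.__getitem__)


# Invented sigmoid outputs for three samples and three labels.
probs = [
    [0.40, 0.30, 0.10],  # top class is 0, but nothing exceeds 0.5
    [0.05, 0.45, 0.20],  # top class is 1, again below 0.5
    [0.10, 0.20, 0.90],  # top class is 2, confidently
]
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

rounded = round_predictions(probs)
exact_match = sum(r == t for r, t in zip(rounded, y_true)) / len(y_true)
top_class = sum(argmax(p) == argmax(t) for p, t in zip(probs, y_true)) / len(y_true)
print(f"exact-match accuracy after rounding: {exact_match:.3f}")
print(f"top-class accuracy via argmax:       {top_class:.3f}")
```

Only the third sample survives rounding with the right label set, so the exact-match score is 1/3 even though the top class is correct for every sample.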

Getting Different results on Each Iteration using Long Short Term Memory[LSTM] for text classification

I am using an LSTM deep-learning model to classify my text. First I split the data into texts and labels using the pandas library, tokenize the texts, and then divide them into training and test data sets. Whenever I run the code, I get different results, with accuracy varying from 80 to 100 percent.
Here is my code,
tokenizer = Tokenizer(num_words=MAX_NB_WORDS, filters='!"#$%&()*+,-./:;<=>?#[\]^_`{|}~',
                      lower=True)
tokenizer.fit_on_texts(trainDF['texts'])
word_index = tokenizer.word_index
print('Found %s unique tokens.' % len(word_index))
X = tokenizer.texts_to_sequences(trainDF['texts'])
X = pad_sequences(X, maxlen=MAX_SEQUENCE_LENGTH)
print('Shape of data tensor:', X.shape)
Y = pd.get_dummies(trainDF['label'])
print('Shape of label tensor:', Y.shape)
X_train, X_test, Y_train, Y_test = train_test_split(X,Y, test_size = 0.10, random_state = 42)
print(X_train.shape,Y_train.shape)
print(X_test.shape,Y_test.shape)
model = Sequential()
model.add(Embedding(MAX_NB_WORDS, EMBEDDING_DIM, input_length=X.shape[1]))
model.add(SpatialDropout1D(0.2))
model.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2))
variables_for_classification=6 #change it as per your number of categories
model.add(Dense(variables_for_classification, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
epochs = 5
batch_size = 64
history = model.fit(X_train, Y_train, epochs=epochs,
                    batch_size=batch_size, validation_split=0.1,
                    callbacks=[EarlyStopping(monitor='val_loss', patience=3, min_delta=0.0001)])
accr = model.evaluate(X_test,Y_test)
print('Test set\n Loss: {:0.3f}\n Accuracy: {:0.3f}'.format(accr[0],accr[1]))
Train on 794 samples, validate on 89 samples
Epoch 1/5
794/794 [==============================] - 19s 24ms/step - loss: 1.6401 - accuracy: 0.6297 - val_loss: 0.9098 - val_accuracy: 0.5843
Epoch 2/5
794/794 [==============================] - 16s 20ms/step - loss: 0.8365 - accuracy: 0.7166 - val_loss: 0.7487 - val_accuracy: 0.7753
Epoch 3/5
794/794 [==============================] - 16s 20ms/step - loss: 0.7093 - accuracy: 0.8401 - val_loss: 0.6519 - val_accuracy: 0.8652
Epoch 4/5
794/794 [==============================] - 16s 20ms/step - loss: 0.5857 - accuracy: 0.8829 - val_loss: 0.4935 - val_accuracy: 1.0000
Epoch 5/5
794/794 [==============================] - 16s 20ms/step - loss: 0.4248 - accuracy: 0.9345 - val_loss: 0.3512 - val_accuracy: 0.8652
99/99 [==============================] - 0s 2ms/step
Test set
Loss: 0.348
Accuracy: 0.869
In the last run, the accuracy was 100 percent.
