Fixing seeds in keras tanks accuracy - machine-learning

I'm following this example on how to use Keras to build a CNN trained on MNIST. I modified it so that the batch size is 20, the number of classes is 10, and the number of epochs is 3. I also only use the first 1000 images from the dataset. This is all to decrease the time spent training and to lower the accuracy a bit (for separate reasons).
Of course, training as-is is not fully reproducible. Here are three results from training:
Test loss: 0.30017868733406067
Test accuracy: 0.901
Test loss: 0.30246967363357546
Test accuracy: 0.894
Test loss: 0.355930775642395
Test accuracy: 0.887
They're not super far apart, I suppose, but I'd like to have full reproducibility. So I tried to fix the seeds and reduce multithreading using the following lines of code:
import os
import random

import numpy as np
import tensorflow as tf
from keras import backend as K  # needed for K.set_session below

seed = 124129

# Seed every RNG that Keras/TensorFlow may draw from
os.environ['PYTHONHASHSEED'] = str(seed)
random.seed(seed)
np.random.seed(seed)
tf.set_random_seed(seed)

# Restrict TensorFlow to a single thread to avoid non-deterministic op ordering
session_conf = tf.ConfigProto(intra_op_parallelism_threads=1,
                              inter_op_parallelism_threads=1)
sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
K.set_session(sess)
I'm running this on my laptop, which only has a CPU, so fixing the seeds should have been sufficient. However, the results seem to be even worse:
Test loss: 14.74805696105957
Test accuracy: 0.085
Test loss: 12.71728395843506
Test accuracy: 0.211
Test loss: 12.340894721984863
Test accuracy: 0.232
It's not just the low accuracy that is concerning, but also the fact that the results are still not reproducible. What other sources of irreproducibility could there be?
For reference, here is the full code: https://pastebin.com/XixmKUC6

Related

Why does the AI model give random results instead of the same ones each time I run the script again? [duplicate]

Does model evaluation during training affect final accuracy in PyTorch?

During a simple PyTorch training loop, a strange effect was observed: whether or not the evaluation function is called seems to affect the final performance of the model.
We train on CIFAR10 using a very simple MLP model and Adam for 10 training epochs.
We try two main loops:
1. After the end of each training epoch, we measure the accuracy on the validation set.
2. We run the validation only once, at the end of all training.
We show the difference in code below:
Main Loop 1:
# Main Loop 1
num_epochs = 10
print(f"num_epochs: {num_epochs}")
for epoch in range(num_epochs):  # loop over the dataset multiple times
    print(f"\nStart Epoch {epoch}")
    model.train()
    train_loss, train_accuracy = training_epoch(trainloader, optimizer, model, criterion)
    print(f"Training Loss: {train_loss:.3f} - Training Accuracy: {train_accuracy:.3f}")

    model.eval()
    with torch.no_grad():
        val_loss, val_accuracy = val_epoch(testloader, model, criterion)
    print(f"Val Loss: {val_loss:.3f} - Val Accuracy: {val_accuracy:.3f}")

print('Finished Training')
Main Loop 2:
# Main Loop 2: validation only once, after the training loop
num_epochs = 10
print(f"num_epochs: {num_epochs}")
for epoch in range(num_epochs):  # loop over the dataset multiple times
    print(f"\nStart Epoch {epoch}")
    model.train()
    train_loss, train_accuracy = training_epoch(trainloader, optimizer, model, criterion)
    print(f"Training Loss: {train_loss:.3f} - Training Accuracy: {train_accuracy:.3f}")

model.eval()
with torch.no_grad():
    val_loss, val_accuracy = val_epoch(testloader, model, criterion)
print(f"Val Loss: {val_loss:.3f} - Val Accuracy: {val_accuracy:.3f}")

print('Finished Training')
Although there shouldn't be any difference, the final performance of the model changes:
Val Loss: 1.526 - Val Accuracy: 0.523
Val Loss: 1.501 - Val Accuracy: 0.528
Of course, for reproducibility we set all seeds. Moreover, this effect can already be observed at the beginning of the second training epoch.
I share the entire code as a Colab notebook:
https://colab.research.google.com/drive/1BODeKHZmcT8lH3r2bxYVHNR2KOpT9O9Y?usp=sharing
The observed difference is most likely due to variance from the stochasticity in the optimization algorithm; the evaluation you perform has no effect on the model's weights.
Also, in the notebook you linked you re-initialize a SimpleMLP for each experiment. Since the module's weights are instantiated randomly, inference will naturally yield different results.
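As a side note, here is a minimal sketch of the kind of seeding that makes two such experiments directly comparable (assuming SimpleMLP is the model class from the notebook, and that deterministic cuDNN kernels are acceptable):
import random
import numpy as np
import torch

def set_all_seeds(seed: int = 0) -> None:
    # seed every RNG the training loop may draw from
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # trade speed for determinism in cuDNN
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

# Call this, and re-instantiate the model, immediately before EACH experiment,
# so both loops start from identical weights and identical RNG state.
set_all_seeds(42)
model = SimpleMLP()  # SimpleMLP: the model class defined in the notebook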

ResNet50 torchvision implementation gives low accuracy on CIFAR-10

I am new to deep learning and PyTorch. I am using the ResNet-50 model from the torchvision module on CIFAR-10, and I have imported the CIFAR-10 dataset from torchvision. The accuracy on the test set is very low, and I have tried configuring the classification layers, but there is no change in accuracy. Is there something wrong with my code? Am I making a mistake in calculating the accuracy?
import torchvision
import torch
import torch.nn as nn
from torch import optim
import os
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
import numpy as np
from collections import OrderedDict
import matplotlib.pyplot as plt

transformations = transforms.Compose([transforms.ToTensor(),
                                      transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])

trainset = torchvision.datasets.CIFAR10(root='./CIFAR10', download=True, transform=transformations, train=True)
testset = torchvision.datasets.CIFAR10(root='./CIFAR10', download=True, transform=transformations, train=False)
trainloader = DataLoader(dataset=trainset, batch_size=4)
testloader = DataLoader(dataset=testset, batch_size=4)

inputs, labels = next(iter(trainloader))
labels = labels.float()
inputs.size()
print(labels.type())

resnet = torchvision.models.resnet50(pretrained=True)

if torch.cuda.is_available():
    resnet = resnet.cuda()
    inputs, labels = inputs.cuda(), torch.Tensor(labels).cuda()

outputs = resnet(inputs)
outputs.size()

for param in resnet.parameters():
    param.requires_grad = False

numft = resnet.fc.in_features
print(numft)

resnet.fc = torch.nn.Sequential(nn.Linear(numft, 1000), nn.ReLU(), nn.Linear(1000, 10))
resnet.cuda()
resnet.train(True)

optimizer = torch.optim.SGD(resnet.parameters(), lr=0.001, momentum=0.9)
criterion = nn.CrossEntropyLoss()

for epoch in range(5):
    resnet.train(True)
    trainloss = 0
    correct = 0
    for x, y in trainloader:
        x, y = x.cuda(), y.cuda()
        optimizer.zero_grad()
        yhat = resnet(x)
        loss = criterion(yhat, y)
        loss.backward()
        optimizer.step()
        trainloss += loss.item()
    print('Epoch: {} Loss: {}'.format(epoch, (trainloss / len(trainloader))))

accuracy = []
running_corrects = 0.0
for x_test, y_test in testloader:
    x_test, y_test = x_test.cuda(), y_test.cuda()
    yhat = resnet(x_test)
    _, z = yhat.max(1)
    running_corrects += torch.sum(y_test == z)

accuracy.append(running_corrects / len(testloader))
print(running_corrects / len(testloader))

accuracy = max(accuracy)
print(accuracy)
OUTPUT AFTER TRAINING/TESTING
Epoch: 0 Loss: 1.9808503997325897
Epoch: 1 Loss: 1.7917569598436356
Epoch: 2 Loss: 1.624434965057373
Epoch: 3 Loss: 1.4082191940283775
Epoch: 4 Loss: 1.1343850775527955
tensor(1.1404, device='cuda:0')
tensor(1.1404, device='cuda:0')
A couple of observations:
You may want to tune the learning rate, the number of epochs, and the batch size. For example, you are currently training your model for only five epochs, which might not be sufficient to achieve high accuracy; you can try a larger number of epochs.
Have you tried adapting the backbone (feature extractor) to the CIFAR-10 dataset by setting param.requires_grad=True? The original model is trained on ImageNet, so it may need to be fine-tuned for CIFAR-10.
Before evaluation/testing, set resnet.train(False) or resnet.eval() to let the model know that you are in eval mode. Furthermore, evaluate your model inside a with torch.no_grad(): block; that speeds up inference and reduces memory usage.
[CIFAR-10 is a balanced dataset, so this is an optional (EDA) task here.] Have you checked the class distribution of CIFAR-10, i.e. whether it is an imbalanced dataset or not? If it is imbalanced, you may want to employ weighted cross-entropy for your loss calculation. There are other strategies to tackle class imbalance, such as over-sampling or under-sampling.
Regarding test accuracy: you need to divide the total number of correct predictions by the total number of samples in the dataset, len(testloader.dataset), instead of by the number of batches, len(testloader). If you want your accuracy in the range [0, 100], just multiply by 100. You can also print the test accuracy for each epoch to see how it changes, rather than only showing the maximum accuracy; see the sketch below.
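A sketch of the corrected evaluation, reusing the variables from the question (eval mode, no_grad, and per-sample division put together):
resnet.eval()  # switch batch norm / dropout layers to eval mode
correct = 0
with torch.no_grad():  # no gradients needed for evaluation
    for x_test, y_test in testloader:
        x_test, y_test = x_test.cuda(), y_test.cuda()
        preds = resnet(x_test).argmax(dim=1)
        correct += (preds == y_test).sum().item()

test_accuracy = correct / len(testloader.dataset)  # divide by the number of samples, not batches
print('Test accuracy: {:.2f}%'.format(100 * test_accuracy))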

What function defines accuracy in Keras when the loss is mean squared error (MSE)?

How is Accuracy defined when the loss function is mean square error? Is it mean absolute percentage error?
The model I use has a linear output activation and is compiled with loss='mean_squared_error':
model.add(Dense(1))
model.add(Activation('linear')) # number
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
and the output looks like this:
Epoch 99/100
1000/1000 [==============================] - 687s 687ms/step - loss: 0.0463 - acc: 0.9689 - val_loss: 3.7303 - val_acc: 0.3250
Epoch 100/100
1000/1000 [==============================] - 688s 688ms/step - loss: 0.0424 - acc: 0.9740 - val_loss: 3.4221 - val_acc: 0.3701
So what does e.g. val_acc: 0.3250 mean? mean_squared_error should be a scalar, not a percentage - shouldn't it? So is val_acc the mean squared error, the mean percentage error, or another function?
From the definition of MSE on Wikipedia: https://en.wikipedia.org/wiki/Mean_squared_error
The MSE is a measure of the quality of an estimator—it is always
non-negative, and values closer to zero are better.
Does that mean a value of val_acc: 0.0 is better than val_acc: 0.325?
Edit: here are more examples of the output of the accuracy metric as I train. The accuracy increases as I train more, while the loss function (MSE) should decrease. Is accuracy well defined for MSE, and how is it defined in Keras?
lAllocator: After 14014 get requests, put_count=14032 evicted_count=1000 eviction_rate=0.0712657 and unsatisfied allocation rate=0.071714
1000/1000 [==============================] - 453s 453ms/step - loss: 17.4875 - acc: 0.1443 - val_loss: 98.0973 - val_acc: 0.0333
Epoch 2/100
1000/1000 [==============================] - 443s 443ms/step - loss: 6.6793 - acc: 0.1973 - val_loss: 11.9101 - val_acc: 0.1500
Epoch 3/100
1000/1000 [==============================] - 444s 444ms/step - loss: 6.3867 - acc: 0.1980 - val_loss: 6.8647 - val_acc: 0.1667
Epoch 4/100
1000/1000 [==============================] - 445s 445ms/step - loss: 5.4062 - acc: 0.2255 - val_loss: 5.6029 - val_acc: 0.1600
Epoch 5/100
783/1000 [======================>.......] - ETA: 1:36 - loss: 5.0148 - acc: 0.2306
There are at least two separate issues with your question.
The first one should be clear by now from the comments by Dr. Snoopy and the other answer: accuracy is meaningless in a regression problem such as yours; see also the comment by patyork in this Keras thread. For better or worse, the fact is that Keras will not "protect" you or any other user from putting meaningless requests in your code, i.e. you will not get any error, or even a warning, that you are attempting something that does not make sense, such as requesting the accuracy in a regression setting.
Having clarified that, the other issue is:
Since Keras does indeed return an "accuracy", even in a regression setting, what exactly is it and how is it calculated?
To shed some light here, let's revert to a public dataset (since you do not provide any details about your data), namely the Boston house price dataset (saved locally as housing.csv), and run a simple experiment as follows:
import numpy as np
import pandas
import keras
from keras.models import Sequential
from keras.layers import Dense
# load dataset
dataframe = pandas.read_csv("housing.csv", delim_whitespace=True, header=None)
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:13]
Y = dataset[:,13]
model = Sequential()
model.add(Dense(13, input_dim=13, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal'))
# Compile model asking for accuracy, too:
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
model.fit(X, Y,
          batch_size=5,
          epochs=100,
          verbose=1)
As in your case, the model fitting history (not shown here) shows a decreasing loss and a roughly increasing accuracy. Let's now evaluate the model performance on the same training set, using the appropriate Keras built-in function:
score = model.evaluate(X, Y, verbose=0)
score
# [16.863721372581754, 0.013833992168483997]
The exact contents of the score array depend on what exactly we have requested during model compilation; in our case here, the first element is the loss (MSE), and the second one is the "accuracy".
At this point, let us have a look at the definition of Keras binary_accuracy in the metrics.py file:
def binary_accuracy(y_true, y_pred):
    return K.mean(K.equal(y_true, K.round(y_pred)), axis=-1)
So, after Keras has generated the predictions y_pred, it first rounds them, and then checks to see how many of them are equal to the true labels y_true, before getting the mean.
Let's replicate this operation using plain Python & Numpy code in our case, where the true labels are Y:
y_pred = model.predict(X)
l = len(Y)
acc = sum([np.round(y_pred[i])==Y[i] for i in range(l)])/l
acc
# array([0.01383399])
Well, bingo! This is actually the same value returned by score[1] above...
To make a long story short: since you (erroneously) request metrics=['accuracy'] in your model compilation, Keras will do its best to satisfy you, and will return some "accuracy" indeed, calculated as shown above, despite this being completely meaningless in your setting.
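For completeness, a minimal sketch of a more meaningful compile call for a regression model like the one above, requesting a regression metric (MAE) instead of 'accuracy':
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(13, input_dim=13, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal'))
# request mean absolute error as the reported metric, which is meaningful for regression
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mae'])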
There are quite a few settings where Keras, under the hood, performs rather meaningless operations without giving any hint or warning to the user; two of them I have happened to encounter are:
Giving meaningless results when, in a multi-class setting, one happens to request loss='binary_crossentropy' (instead of categorical_crossentropy) with metrics=['accuracy'] - see my answers in Keras binary_crossentropy vs categorical_crossentropy performance? and Why is binary_crossentropy more accurate than categorical_crossentropy for multiclass classification in Keras?
Disabling completely Dropout, in the extreme case when one requests a dropout rate of 1.0 - see my answer in Dropout behavior in Keras with rate=1 (dropping all input units) not as expected
The loss function (mean squared error in this case) indicates how far your predictions deviate from the target values. In the training phase, the weights are updated based on this quantity. If you are dealing with a classification problem, it is quite common to define an additional metric called accuracy, which monitors in how many cases the correct class was predicted. It is expressed as a fraction: a value of 0.0 means no correct decisions and 1.0 means only correct decisions.
While your network is training, the loss decreases and the accuracy usually increases.
Note that, in contrast to the loss, the accuracy is usually not used to update the parameters of your network. It helps to monitor the learning progress and the current performance of the network.
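To illustrate how classification accuracy is computed (with made-up predictions, not taken from the question):
import numpy as np

# hypothetical probability outputs for 4 samples over 3 classes
y_pred = np.array([[0.8, 0.1, 0.1],
                   [0.2, 0.7, 0.1],
                   [0.3, 0.3, 0.4],
                   [0.6, 0.3, 0.1]])
y_true = np.array([0, 1, 2, 1])  # integer class labels

accuracy = np.mean(np.argmax(y_pred, axis=1) == y_true)
print(accuracy)  # 0.75 -> 3 of the 4 predicted classes are correct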
@desertnaut has said it very clearly.
Consider the following two pieces of code
compile code
binary_accuracy code
def binary_accuracy(y_true, y_pred):
    return K.mean(K.equal(y_true, K.round(y_pred)), axis=-1)
Your labels should be integers: because Keras does not round y_true, only integer labels can match the rounded predictions and produce a high "accuracy".
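A small NumPy sketch (with hypothetical values) of why rounding only y_pred matters here:
import numpy as np

y_pred = np.array([2.41, 4.88, 3.10])  # close predictions from a regression model

y_true = np.array([2.37, 4.91, 3.05])  # non-integer targets
print(np.mean(np.round(y_pred) == y_true))  # 0.0 -- rounded predictions never equal non-integer targets

y_true_int = np.array([2.0, 5.0, 3.0])  # integer-valued targets
print(np.mean(np.round(y_pred) == y_true_int))  # 1.0 -- rounded predictions can match exactly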

ResNet: 100% accuracy during training, but 33% prediction accuracy with the same data

I am new to machine learning and deep learning, and for learning purposes I tried to play with ResNet. I tried to overfit on a small dataset (3 different images) to see if I could get almost 0 loss and 1.0 accuracy - and I did.
The problem is that predictions on the training images (i.e. the same 3 images used for training) are not correct.
Training Images
Image labels
[1,0,0], [0,1,0], [0,0,1]
My python code
# loading 3 images and resizing them
imgs = np.array([np.array(Image.open("./Images/train/" + fname)
                          .resize((197, 197), Image.ANTIALIAS))
                 for fname in os.listdir("./Images/train/")]).reshape(-1, 197, 197, 1)

# creating labels
y = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])

# create resnet model
model = ResNet50(input_shape=(197, 197, 1), classes=3, weights=None)

# compile & fit model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['acc'])
model.fit(imgs, y, epochs=5, shuffle=True)

# predict on training data
print(model.predict(imgs))
The model does overfit the data:
3/3 [==============================] - 22s - loss: 1.3229 - acc: 0.0000e+00
Epoch 2/5
3/3 [==============================] - 0s - loss: 0.1474 - acc: 1.0000
Epoch 3/5
3/3 [==============================] - 0s - loss: 0.0057 - acc: 1.0000
Epoch 4/5
3/3 [==============================] - 0s - loss: 0.0107 - acc: 1.0000
Epoch 5/5
3/3 [==============================] - 0s - loss: 1.3815e-04 - acc: 1.0000
but predictions are:
[[ 1.05677405e-08 9.99999642e-01 3.95520459e-07]
[ 1.11955103e-08 9.99999642e-01 4.14905685e-07]
[ 1.02637095e-07 9.99997497e-01 2.43751242e-06]]
which means that all images got the label [0,1,0].
Why? And how can that happen?
It's because of the batch normalization layers.
In the training phase, the batch is normalized w.r.t. its own mean and variance. However, in the testing phase, the batch is normalized w.r.t. the moving averages of the previously observed means and variances.
Now this is a problem when the number of observed batches is small (e.g., 5 in your example) because in the BatchNormalization layer, by default moving_mean is initialized to be 0 and moving_variance is initialized to be 1.
Given also that the default momentum is 0.99, you'll need to update the moving averages quite a lot of times before they converge to the "real" mean and variance.
That's why the prediction is wrong in the early stage, but is correct after 1000 epochs.
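A quick back-of-the-envelope check of how slowly the moving statistics move with the default momentum of 0.99 (taking the "real" batch mean to be 1.0 purely for illustration):
# moving_mean is updated as: moving_mean = momentum * moving_mean + (1 - momentum) * batch_mean
momentum = 0.99
batch_mean = 1.0   # pretend the real batch mean is 1.0
moving_mean = 0.0  # default initialization

for step in range(5):  # 5 updates, as in your 5-epoch example
    moving_mean = momentum * moving_mean + (1 - momentum) * batch_mean

print(moving_mean)  # ~0.049 -- still nowhere near the real value of 1.0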
You can verify it by forcing the BatchNormalization layers to operate in "training mode".
During training, the accuracy is 1 and the loss is close to zero:
model.fit(imgs,y,epochs=5,shuffle=True)
Epoch 1/5
3/3 [==============================] - 19s 6s/step - loss: 1.4624 - acc: 0.3333
Epoch 2/5
3/3 [==============================] - 0s 63ms/step - loss: 0.6051 - acc: 0.6667
Epoch 3/5
3/3 [==============================] - 0s 57ms/step - loss: 0.2168 - acc: 1.0000
Epoch 4/5
3/3 [==============================] - 0s 56ms/step - loss: 1.1921e-07 - acc: 1.0000
Epoch 5/5
3/3 [==============================] - 0s 53ms/step - loss: 1.1921e-07 - acc: 1.0000
Now if we evaluate the model, we'll observe high loss and low accuracy because after 5 updates, the moving averages are still pretty close to the initial values:
model.evaluate(imgs,y)
3/3 [==============================] - 3s 890ms/step
[10.745396614074707, 0.3333333432674408]
However, if we manually specify the "learning phase" variable and let the BatchNormalization layers use the "real" batch mean and variance, the result becomes the same as what's observed in fit().
sample_weights = np.ones(3)
learning_phase = 1 # 1 means "training"
ins = [imgs, y, sample_weights, learning_phase]
model.test_function(ins)
[1.192093e-07, 1.0]
It's also possible to verify it by changing the momentum to a smaller value.
For example, by adding momentum=0.01 to all the batch norm layers in ResNet50, the prediction after 20 epochs is:
model.predict(imgs)
array([[ 1.00000000e+00, 1.34882026e-08, 3.92139575e-22],
[ 0.00000000e+00, 1.00000000e+00, 0.00000000e+00],
[ 8.70998792e-06, 5.31159838e-10, 9.99991298e-01]], dtype=float32)
ResNet50V2 (the second version) has much higher accuracy than ResNet50 in predicting a given image, such as the classic Egyptian cat.
Predicted: [[('n02124075', 'Egyptian_cat', 0.8233388), ('n02123159', 'tiger_cat', 0.103765756), ('n02123045', 'tabby', 0.07267675), ('n03958227', 'plastic_bag', 3.6531426e-05), ('n02127052', 'lynx', 3.647774e-05)]]
Compared with EfficientNet (90% accuracy), ResNet50/101/152 predicts quite bad results (15~50% accuracy) when adopting the weights provided by François Chollet. This is not related to the weights, but to the inherent complexity of the above models. In other words, it is necessary to re-train the above models to predict a given image, whereas EfficientNet does not need such training.
For instance, given a classic cat image, it shows the final results as follows.
1. Adoption of the decode_predictions
from keras.applications.imagenet_utils import decode_predictions
Predicted: [[('n01930112', 'nematode', 0.122968934), ('n03041632', 'cleaver', 0.04236396), ('n03838899', 'oboe', 0.03846453), ('n02783161', 'ballpoint', 0.027445247), ('n04270147', 'spatula', 0.024508419)]]
2. Adoption of the CV2
img = cv2.resize(cv2.imread('/home/mike/Documents/keras_resnet_common/images/cat.jpg'), (224, 224)).astype(np.float32)
# Remove the train image mean
img[:,:,0] -= 103.939
img[:,:,1] -= 116.779
img[:,:,2] -= 123.68
Predicted: [[('n04065272', 'recreational_vehicle', 0.46529356), ('n01819313', 'sulphur-crested_cockatoo', 0.31684962), ('n04074963', 'remote_control', 0.051597465), ('n02111889', 'Samoyed', 0.040776145), ('n04548362', 'wallet', 0.029898684)]]
Therefore, the ResNet50/101/152 models are not suitable for predicting an image without training, even when provided with the weights. Users can see their value after 100~1000 epochs of training for prediction, because that helps obtain better moving averages. If users want easy prediction, EfficientNet is a good choice with the given weights.
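For reference, a sketch of how decode_predictions is typically wired up with a pretrained Keras ResNet50 (the image path is just a placeholder):
import numpy as np
from keras.applications.resnet50 import ResNet50, preprocess_input, decode_predictions
from keras.preprocessing import image

model = ResNet50(weights='imagenet')

img = image.load_img('cat.jpg', target_size=(224, 224))  # placeholder path
x = np.expand_dims(image.img_to_array(img), axis=0)
x = preprocess_input(x)  # the standard ImageNet mean subtraction / channel handling

preds = model.predict(x)
print(decode_predictions(preds, top=5)[0])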
It seems that predicting with a batch of images does not work correctly in Keras. It is better to predict each image individually and then calculate the accuracy manually.
As an example, the following code uses individual image prediction instead of batch prediction.
import os
from PIL import Image
import keras
import numpy

###
# I am not including code to load models or train model
###

print("Prediction result:")
dir = "/path/to/test/images"
files = os.listdir(dir)
correct = 0
total = 0

# dictionary mapping class indices to labels
classes = {
    0: 'This is Cat',
    1: 'This is Dog',
}

for file_name in files:
    total += 1
    image = Image.open(dir + "/" + file_name).convert('RGB')
    image = image.resize((100, 100))
    image = numpy.expand_dims(image, axis=0)
    image = numpy.array(image)
    image = image / 255
    pred = model.predict_classes([image])[0]
    sign = classes[pred]
    # compare case-insensitively, since the class labels are capitalized
    if ("cat" in file_name) and ("cat" in sign.lower()):
        print(correct, ". ", file_name, sign)
        correct += 1
    elif ("dog" in file_name) and ("dog" in sign.lower()):
        print(correct, ". ", file_name, sign)
        correct += 1

print("accuracy: ", (correct / total))
What happens, basically, is that keras.fit(), i.e. your model.fit(), achieves a good fit while losing precision. As precision is lost, the model's fit gives problems and varied results. keras.fit() only gives a good fit, not the required precision.

Resources