Transfer Learning using Keras and vgg16 on small dataset - machine-learning

I have to build a neural network that can recognize the face of 15 people. I'm using keras. My dataset is composed of 300 total images and is divided into Training, Validation and Test. For each of the 15 people I have the following subdivision:
Training: 13
Validation: 3
Test: 4
Since I couldn't build an efficient neural network from scratch, I also believe because my dataset is very small, I'm trying to solve my problem by doing transfer learning. I used the vgg16 network. In the training and validation phase I get good results but when I run the tests the results are disastrous.
I don't know what my problem is. Here is the code I used:
img_width, img_height = 256, 256
train_data_dir = 'dataset_biometria/face/training_set'
validation_data_dir = 'dataset_biometria/face/validation_set'
nb_train_samples = 20
nb_validation_samples = 20
batch_size = 16
epochs = 5
model = applications.VGG19(weights = "imagenet", include_top=False, input_shape = (img_width, img_height, 3))
for layer in model.layers:
layer.trainable = False
#Adding custom Layers
x = model.output
x = Flatten()(x)
x = Dense(1024, activation="relu")(x)
x = Dropout(0.4)(x)
x = Dense(1024, activation="relu")(x)
predictions = Dense(15, activation="softmax")(x)
# creating the final model
model_final = Model(input = model.input, output = predictions)
# compile the model
model_final.compile(loss = "categorical_crossentropy", optimizer = optimizers.SGD(lr=0.0001, momentum=0.9), metrics=["accuracy"])
# Initiate the train and test generators with data Augumentation
train_datagen = ImageDataGenerator(
rescale = 1./255,
horizontal_flip = True,
fill_mode = "nearest",
zoom_range = 0.3,
width_shift_range = 0.3,
height_shift_range=0.3,
rotation_range=30)
test_datagen = ImageDataGenerator(
rescale = 1./255,
horizontal_flip = True,
fill_mode = "nearest",
zoom_range = 0.3,
width_shift_range = 0.3,
height_shift_range=0.3,
rotation_range=30)
train_generator = train_datagen.flow_from_directory(
train_data_dir,
target_size = (img_height, img_width),
batch_size = batch_size,
class_mode = "categorical")
validation_generator = test_datagen.flow_from_directory(
validation_data_dir,
target_size = (img_height, img_width),
class_mode = "categorical")
# Save the model according to the conditions
checkpoint = ModelCheckpoint("vgg16_1.h5", monitor='val_acc', verbose=1, save_best_only=True, save_weights_only=False, mode='auto', period=1)
early = EarlyStopping(monitor='val_acc', min_delta=0, patience=10, verbose=1, mode='auto')
# Train the model
model_final.fit_generator(
train_generator,
samples_per_epoch = nb_train_samples,
epochs = epochs,
validation_data = validation_generator,
nb_val_samples = nb_validation_samples,
callbacks = [checkpoint, early])
model('model_face_classification.h5')
I also tried to train some layers instead of not training any, as in the example below:
for layer in model.layers[:10]:
layer.trainable = False
I also tried changing the number of epochs, batch size, nb_validation_samples, nb_validation_sample.
Unfortunately the result has not changed, in the testing phase my network cannot correctly recognize faces.

Without seeing the actual results or errors I can not say what the problem is here.
Definitely, small dataset is a problem, but there are many ways to get around it.
You can use image augmentation to increase the samples. You can refer augement.py.
But instead of modifying your above network, there is a really cool model : siamese network/one-shot learning. It does not need too many pics and the accuracies are great.
Therefore you can see below links to get some help :
Facial-Recognition-Using-FaceNet-Siamese-One-Shot-Learning
Face-recognition-using-deep-learning

Related

Poor predictions on second dataset from trained LSTM model

I've trained an LSTM model with 8 features and 1 output. I have one dataset and split it into two separate files to train and predict with the first half of the set, and then attempt to predict the second half of the set using the trained model from the first part of my dataset. My model predicts the trained and testing sets from the dataset I used to train the model pretty well (RMSE of around 5-7), however when I attempt to predict using the second half of the set I get very poor predictions (RMSE of around 50-60). How can I get my trained model to predict outside datasets well?
dataset at this link
file = r'/content/drive/MyDrive/only_force_pt1.csv'
df = pd.read_csv(file)
df.head()
X = df.iloc[:, 1:9]
y = df.iloc[:,9]
print(X.shape)
print(y.shape)
plt.figure(figsize = (20, 6), dpi = 100)
plt.plot(y)
WINDOW_LEN = 50
def window_size(size, inputdata, targetdata):
X = []
y = []
i=0
while(i + size) <= len(inputdata)-1:
X.append(inputdata[i: i+size])
y.append(targetdata[i+size])
i+=1
assert len(X)==len(y)
return (X,y)
X_series, y_series = window_size(WINDOW_LEN, X, y)
print(len(X))
print(len(X_series))
print(len(y_series))
X_train, X_val, y_train, y_val = train_test_split(np.array(X_series),np.array(y_series),test_size=0.3, shuffle = True)
X_val, X_test,y_val, y_test = train_test_split(np.array(X_val),np.array(y_val),test_size=0.3, shuffle = False)
n_timesteps, n_features, n_outputs = X_train.shape[1], X_train.shape[2],1
[verbose, epochs, batch_size] = [1, 300, 32]
input_shape = (n_timesteps, n_features)
model = Sequential()
# LSTM
model.add(LSTM(64, input_shape=input_shape, return_sequences = False))
model.add(Dropout(0.2))
model.add(Dense(64, activation='relu', kernel_regularizer=keras.regularizers.l2(0.001)))
#model.add(Dropout(0.2))
model.add(Dense(32, activation='relu', kernel_regularizer=keras.regularizers.l2(0.001)))
model.add(Dense(1, activation='relu'))
earlystopper = EarlyStopping(monitor='val_loss', min_delta=0, patience = 30, verbose =1, mode = 'auto')
model.summary()
model.compile(loss = 'mse', optimizer = Adam(learning_rate = 0.001), metrics=[tf.keras.metrics.RootMeanSquaredError()])
history = model.fit(X_train, y_train, batch_size = batch_size, epochs = epochs, verbose = verbose, validation_data=(X_val,y_val), callbacks = [earlystopper])
Second dataset:
tests = r'/content/drive/MyDrive/only_force_pt2.csv'
df_testing = pd.read_csv(tests)
X_testing = df_testing.iloc[:4038,1:9]
torque = df_testing.iloc[:4038,9]
print(X_testing.shape)
print(torque.shape)
plt.figure(figsize = (20, 6), dpi = 100)
plt.plot(torque)
X_testing = X_testing.to_numpy()
X_testing_series, y_testing_series = window_size(WINDOW_LEN, X_testing, torque)
X_testing_series = np.array(X_testing_series)
y_testing_series = np.array(y_testing_series)
scores = model.evaluate(X_testing_series, y_testing_series, verbose =1)
X_prediction = model.predict(X_testing_series, batch_size = 32)
If your model is working fine on training data but performs bad on validation data, then your model did not learn the "true" connection between input and output variables but simply memorized the corresponding output to your input. To tackle this you can do multiple things:
Typically you would use 80% of your data to train and 20% to test, this will present more data to the model, which should make it learn more of the true underlying function
If your model is too complex, it will have neurons which will just be used to memorize input-output data pairs. Try to reduce the complexity of your model (layers, neurons) to make it more simple, so that the remaining layers can really learn instead of memorize
Look into more detail on training performance here

ValueError: Data cardinality is ambiguous. Make sure all arrays contain the same number of samples?

I am trying to use a MobileNet model but facing above mentioned issue . I don't know if it is
occuring due to train_test_split or else . Architecture is shown below
Can I use model.fit instead of model.fit_generator here ?
mobilenet = MobileNet(input_shape=(224,224,3) , weights='imagenet', include_top=False)
# don't train existing weights
for layer in mobilenet.layers:
layer.trainable = False
folders = glob('/content/drive/MyDrive/AllClasses/*')
print("Total number of classes are",len(folders))
x = Flatten()(mobilenet.output)
prediction = Dense(len(folders), activation='softmax')(x)
model = Model(inputs=mobilenet.input, outputs=prediction)
model.summary()
model.compile(loss='categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
dataset = ImageDataGenerator(rescale=1./255)
dataset = dataset.flow_from_directory('/content/drive/MyDrive/AllClasses',target_size=(224, 224),batch_size=32,class_mode='categorical',color_mode='grayscale')
train_data, test_data = train_test_split(dataset,random_state=42, test_size=0.20,shuffle=True)
r = model.fit(train_data,validation_data=(test_data),epochs=5)

How to improve Training and Test accuracy

I am pretty new to Deep learning. I was experimenting with fine tuning of pretrained models on my own dataset but I am not able to improve the test and training accuracy. Both the Losses are hovering around 62 from beginning of training to last. I am using Xception as the pretrained model and combined with GlobalAveragePooling2D, a dense layer and dropout of 0.2.
The dataset consists of 3522 images belonging to 2 class of training and 881 images belonging to 2 classes of test set. Problem is I am not able add any more images to the datasets. This is the maximum number of images I could add to the datasets. Tried ImageDataGenerator but still it's of no use. Images of two classes looks bit similar in this constraint can I increase the accuracy.
Code:
base_model = Xception(include_top=False, weights='imagenet')
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(512, activation="relu")(x)
x = Dropout(0.2)(x)
predictions = Dense(2, activation='sigmoid')(x)
model = Model(inputs=base_model.input, outputs=predictions)
for layer in base_model.layers:
layer.trainable = False
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
num_training_img=3522
num_test_img=881
stepsPerEpoch = num_training_img/batch_size
validationSteps= num_test_img/batch_size
history= model.fit_generator(
train_data_gen,
steps_per_epoch=stepsPerEpoch,
epochs=20,
validation_data = test_data_gen,
validation_steps=validationSteps
)
layer_num = len(model.layers)
for layer in model.layers[:129]:
layer.trainable = False
for layer in model.layers[129:]:
layer.trainable = True
# update the weights
model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss='binary_crossentropy', metrics=['accuracy'])
num_training_img=3522
num_test_img=881
stepsPerEpoch = num_training_img/batch_size
validationSteps= num_test_img/batch_size
history= model.fit_generator(
train_data_gen,
steps_per_epoch=stepsPerEpoch,
epochs=20,
validation_data = test_data_gen,
validation_steps=validationSteps
)
You should make the layers non-trainable before creating the model.
base_model = Xception(include_top=False, weights='imagenet')
for layer in base_model.layers:
layer.trainable = False
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(512, activation="relu")(x)
x = Dropout(0.2)(x)
predictions = Dense(2, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)
Your last layer has 2 units, which suggests, softmax is a better fit.
predictions = Dense(2, activation='softmax')(x)
Try with Adam and change loss.
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

CNN architecture the same but getting different results

I have a CNN that saves the bottleneck features of the training and test data with the VGG16 architecture, then uploads the features to my custom fully connected layers to classify the images.
#create data augmentations for training set; helps reduce overfitting and find more features
train_datagen = ImageDataGenerator(rescale=1./255,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip=True)
#use ImageDataGenerator to upload validation images; data augmentation not necessary for
validating process
val_datagen = ImageDataGenerator(rescale=1./255)
#load VGG16 model, pretrained on imagenet database
model = applications.VGG16(include_top=False, weights='imagenet')
#generator to load images into NN
train_generator = train_datagen.flow_from_directory(
train_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode=None,
shuffle=False)
#total number of images used for training data
num_train = len(train_generator.filenames)
#save features to numpy array file so features do not overload memory
bottleneck_features_train = model.predict_generator(train_generator, num_train // batch_size)
val_generator = val_datagen.flow_from_directory(
val_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode=None,
shuffle=False)
num_val = len(val_generator.filenames)
bottleneck_features_validation = model.predict_generator(val_generator, num_val // batch_size)`
#used to retrieve the labels of the images
label_datagen = ImageDataGenerator(rescale=1./255)
#generators can create class labels for each image in either
train_label_generator = label_datagen.flow_from_directory(
train_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode=None,
shuffle=False)
#total number of images used for training data
num_train = len(train_label_generator.filenames)
#load features from VGG16 and pair each image with corresponding label (0 for normal, 1 for pneumonia)
#train_data = np.load('xray/bottleneck_features_train.npy')
#get the class labels generated by train_label_generator
train_labels = train_label_generator.classes
val_label_generator = label_datagen.flow_from_directory(
val_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode=None,
shuffle=False)
num_val = len(val_label_generator.filenames)
#val_data = np.load('xray/bottleneck_features_validation.npy')
val_labels = val_label_generator.classes
#create fully connected layers, replacing the ones cut off from the VGG16 model
model = Sequential()
#converts model's expected input dimensions to same shape as bottleneck feature arrays
model.add(Flatten(input_shape=bottleneck_features_train.shape[1:]))
#ignores a fraction of input neurons so they do not become co-dependent on each other; helps prevent overfitting
model.add(Dropout(0.7))
#normal fully-connected layer with relu activation. Replaces all negative inputs with 0 and does not fire neuron,
#creating a lighetr network
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.7))
#output layer to classify 0 or 1
model.add(Dense(1, activation='sigmoid'))
#compile model and specify which optimizer and loss function to use
#optimizer used to update the weights to optimal values; adam optimizer maintains seperate learning rates
#for each weight and updates accordingly
#cross-entropy function measures the ability of model to correctly classify 0 or 1
model.compile(optimizer=optimizers.Adam(lr=0.0007), loss='binary_crossentropy', metrics=['accuracy'])
#used to stop training if NN shows no improvement for 5 epochs
early_stop = EarlyStopping(monitor='val_loss', min_delta=0.01, patience=5, verbose=1)
#checks each epoch as it runs and saves the weight file from the model with the lowest validation loss
checkpointer = ModelCheckpoint(filepath=top_model_weights_dir, verbose=1, save_best_only=True)
#fit the model to the data
history = model.fit(bottleneck_features_train, train_labels,
epochs=epochs,
batch_size=batch_size,
callbacks = [early_stop, checkpointer],
verbose=2,
validation_data=(bottleneck_features_validation, val_labels))`
After calling train_top_model(), the CNN gets an 86% accuracy after around 10 epochs.
However, when I try implementing this architecture in by building the fully connected layers directly on top of the VGG16 layers, The network gets stuck at a val_acc of 0.5000 and basically does not train. Are there any issues with the code?
epochs = 10
batch_size = 20
train_datagen = ImageDataGenerator(rescale=1./255,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip=True)
val_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
train_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode='binary',
shuffle=False)
num_train = len(train_generator.filenames)
val_generator = val_datagen.flow_from_directory(
val_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode='binary',
shuffle=False)
num_val = len(val_generator.filenames)`
base_model = applications.VGG16(weights='imagenet', include_top=False, input_shape=(img_width,
img_height, 3))
x = base_model.output
x = Flatten()(x)
x = Dropout(0.7)(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.7)(x)
predictions = Dense(1, activation='sigmoid')(x)
model = Model(inputs=base_model.input, outputs=predictions)
for layer in model.layers[:19]:
layer.trainable = False
checkpointer = ModelCheckpoint(filepath=top_model_weights_dir, verbose=1, save_best_only=True)
model.compile(optimizer=optimizers.Adam(lr=0.0007), loss='binary_crossentropy', metrics=
['accuracy'])
history = model.fit_generator(train_generator,
steps_per_epoch=(num_train//batch_size),
validation_data=val_generator,
validation_steps=(num_val//batch_size),
callbacks=[checkpointer],
verbose=1,
epochs=epochs)
The reason is that in the second approach, you have not frozen the VGG16 layers. In other words, you are training the whole network. Whereas in the first approach you are just training the weights of your fully connected layers.
Use something like this:
for layer in base_model.layers[:end_layer]:
layer.trainable = False
where end_layer is the last layer you are importing.

very large value of loss in AlexNet

Actually I am using AlexNet to classify my images in 2 groups , I am feeding images to the model in a batch of 60 images and the loss which I am getting after every batch is 6 to 7 digits large (for ex. 1428529.0) , here I am confused that why my loss is such a large value because on MNIST dataset the loss which I got was very small as compared to this. Can anyone explain me why I am getting such a large loss value.
Thanks in advance ;-)
Here is the code :-
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import os
img_size = 227
num_channels = 1
img_flat_size = img_size * img_size
num_classes = 2
drop = 0.5
x = tf.placeholder(tf.float32,[None,img_flat_size])
y = tf.placeholder(tf.float32,[None,num_classes])
drop_p = tf.placeholder(tf.float32)
def new_weight(shape):
return tf.Variable(tf.random_normal(shape))
def new_bias(size):
return tf.Variable(tf.random_normal(size))
def new_conv(x,num_input_channels,filter_size,num_filters,stride,padd="SAME"):
shape = [filter_size,filter_size,num_input_channels,num_filters]
weight = new_weight(shape)
bias = new_bias([num_filters])
conv = tf.nn.conv2d(x,weight,strides=[1,stride,stride,1],padding=padd)
conv = tf.nn.bias_add(conv,bias)
return tf.nn.relu(conv)
def new_max_pool(x,k,stride):
max_pool = tf.nn.max_pool(x,ksize=[1,k,k,1],strides=[1,stride,stride,1],padding="VALID")
return max_pool
def flatten_layer(layer):
layer_shape = layer.get_shape()
num_features = layer_shape[1:4].num_elements()
flat_layer = tf.reshape(layer,[-1,num_features])
return flat_layer,num_features
def new_fc_layer(x,num_input,num_output):
weight = new_weight([num_input,num_output])
bias = new_bias([num_output])
fc_layer = tf.matmul(x,weight) + bias
return fc_layer
def lrn(x, radius, alpha, beta, bias=1.0):
"""Create a local response normalization layer."""
return tf.nn.local_response_normalization(x, depth_radius=radius,
alpha=alpha, beta=beta,
bias=bias)
def AlexNet(x,drop,img_size):
x = tf.reshape(x,shape=[-1,img_size,img_size,1])
conv1 = new_conv(x,num_channels,11,96,4,"VALID")
max_pool1 = new_max_pool(conv1,3,2)
norm1 = lrn(max_pool1, 2, 2e-05, 0.75)
conv2 = new_conv(norm1,96,5,256,1)
max_pool2 = new_max_pool(conv2,3,2)
norm2 = lrn(max_pool2, 2, 2e-05, 0.75)
conv3 = new_conv(norm2,256,3,384,1)
conv4 = new_conv(conv3,384,3,384,1)
conv5 = new_conv(conv4,384,3,256,1)
max_pool3 = new_max_pool(conv5,3,2)
layer , num_features = flatten_layer(max_pool3)
fc1 = new_fc_layer(layer,num_features,4096)
fc1 = tf.nn.relu(fc1)
fc1 = tf.nn.dropout(fc1,drop)
fc2 = new_fc_layer(fc1,4096,4096)
fc2 = tf.nn.relu(fc2)
fc2 = tf.nn.dropout(fc2,drop)
out = new_fc_layer(fc2,4096,2)
return out #, tf.nn.softmax(out)
def read_and_decode(tfrecords_file, batch_size):
'''read and decode tfrecord file, generate (image, label) batches
Args:
tfrecords_file: the directory of tfrecord file
batch_size: number of images in each batch
Returns:
image: 4D tensor - [batch_size, width, height, channel]
label: 1D tensor - [batch_size]
'''
# make an input queue from the tfrecord file
filename_queue = tf.train.string_input_producer([tfrecords_file])
reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)
img_features = tf.parse_single_example(
serialized_example,
features={
'label': tf.FixedLenFeature([], tf.int64),
'image_raw': tf.FixedLenFeature([], tf.string),
})
image = tf.decode_raw(img_features['image_raw'], tf.uint8)
##########################################################
# you can put data augmentation here, I didn't use it
##########################################################
# all the images of notMNIST are 28*28, you need to change the image size if you use other dataset.
image = tf.reshape(image, [227, 227])
label = tf.cast(img_features['label'], tf.int32)
image_batch, label_batch = tf.train.batch([image, label],
batch_size= batch_size,
num_threads= 1,
capacity = 6000)
return tf.reshape(image_batch,[batch_size,227*227*1]), tf.reshape(label_batch, [batch_size])
pred = AlexNet(x,drop_p,img_size) #pred
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred,labels=y))
optimiser = tf.train.AdamOptimizer(learning_rate = 0.001).minimize(loss)
correct_pred = tf.equal(tf.argmax(pred,1),tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(correct_pred,tf.float32))
cost = tf.summary.scalar('loss',loss)
init = tf.global_variables_initializer()
with tf.Session() as sess:
sess.run(init)
merge_summary = tf.summary.merge_all()
summary_writer = tf.summary.FileWriter('./AlexNet',graph = tf.get_default_graph())
tf_record_file = 'train.tfrecords'
x_val ,y_val = read_and_decode(tf_record_file,20)
y_val = tf.one_hot(y_val,depth=2,on_value=1,off_value=0)
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(coord=coord)
x_val = x_val.eval()
y_val = y_val.eval()
epoch = 2
for i in range(epoch):
_, summary= sess.run([optimiser,merge_summary],feed_dict={x:x_val,y:y_val,drop_p:drop})
summary_writer.add_summary(summary,i)
loss_a,accu = sess.run([loss,accuracy],feed_dict={x:x_val,y:y_val,drop_p:1.0})
print "Epoch "+str(i+1) +', Minibatch Loss = '+ \
"{:.6f}".format(loss_a) + ', Training Accuracy = '+ \
'{:.5f}'.format(accu)
print "Optimization Finished!"
tf_record_file1 = 'test.tfrecords'
x_v ,y_v = read_and_decode(tf_record_file1,10)
y_v = tf.one_hot(y_v,depth=2,on_value=1,off_value=0)
coord1 = tf.train.Coordinator()
threads1 = tf.train.start_queue_runners(coord=coord1)
x_v = sess.run(x_v)
y_v = sess.run(y_v)
print "Testing Accuracy : "
print sess.run(accuracy,feed_dict={x:x_v,y:y_v,drop_p:1.0})
coord.request_stop()
coord.join(threads)
coord1.request_stop()
coord1.join(threads1)
Take a look a what a confusion matrix is. It is a performance evaluator. In addition, you should compare your precision versus your recall. Precision is the accuracy of your positive predictions and recall is the ratio of positive instances that are correctly detected by the classifier. By combining both precision and recall, you get the F_1 score which is keep in evaluating the problems of your model.
I would suggest you pick up the text Hands-On Machine Learning with Scikit-Learn and TensorFlow. It is a truly comprehensive book and covers what I describe above in more detail.

Resources