How to load only specific weights in Keras

I have a trained model that I've exported the weights and want to partially load into another model.
My model is built in Keras using TensorFlow as backend.
Right now I'm doing as follows:
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape, trainable=False))
model.add(Activation('relu', trainable=False))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3), trainable=False))
model.add(Activation('relu', trainable=False))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), trainable=True))
model.add(Activation('relu', trainable=True))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
model.load_weights("image_500.h5")
model.pop()
model.pop()
model.pop()
model.pop()
model.pop()
model.pop()
model.add(Conv2D(1, (6, 6), strides=(1, 1), trainable=True))
model.add(Activation('relu', trainable=True))
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
I'm sure it's a terrible way to do it, although it works.
How do I load just the first 9 layers?

If your first 9 layers are consistently named between your original trained model and the new model, then you can use model.load_weights() with by_name=True. This will update weights only in the layers of your new model that have an identically named layer found in the original trained model.
The name of the layer can be specified with the name keyword, for example:
model.add(Dense(8, activation='relu', name='dense_1'))

This call:
weights_list = model.get_weights()
will return a flat list of all weight tensors in the model, as NumPy arrays. Note that this list is not grouped by layer, so it cannot be assigned to layers one element at a time. To copy the first 9 layers, iterate over the layers of the trained model instead and transfer each layer's weights:
for i, layer in enumerate(trained_model.layers[:9]):
    model.layers[i].set_weights(layer.get_weights())
where model.layers is a flattened list of the layers comprising the model, and layer.get_weights() returns that layer's weights as a list of NumPy arrays. In this case, you reload the weights of the first 9 layers.
More information is available here:
https://keras.io/layers/about-keras-layers/
https://keras.io/models/about-keras-models/

Related

Before training the CNN network score on the testing data

I have a simple binary image classification CNN network. Below is the code
model = Sequential()
model.add(Conv2D(32, kernel_size=(3,3), padding='same',
                 kernel_initializer=gabor_init, input_shape=(32, 32, 1)))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.2))
model.add(Conv2D(32, kernel_size=(3,3), padding='same', kernel_initializer=gabor_init))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.2))
model.add(Conv2D(64, kernel_size=(3,3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128,activation='relu'))
model.add(Dropout(0.4))
model.add(Dense(2,input_dim=128,activation='sigmoid'))
model.compile(loss='binary_crossentropy',optimizer='rmsprop',metrics=['accuracy'])
model.summary()
from sklearn.model_selection import train_test_split
trainX,testX,trainY,testY=train_test_split(Xdata,Ytarget,test_size=.3)
history = model.fit(trainX, trainY, epochs=70, batch_size=64,
                    verbose=1, validation_split=.3)
print(model.evaluate(testX,testY))
Here I am training the model and then validating it. My question is:
I want to evaluate the model on the test data before training; as I am using the Gabor kernel initializer, I want to see how this filter works before training. In that case, do I need to call model.fit()? I am a little confused.
Any suggestion or modification for the last part of the code so the model can be tested on the test data before training?
After you have defined your model in Keras, you only need to compile it using model.compile() in order to be able to invoke predictions on the initial untrained weights. model.fit() only updates the weights as the model is trained and does not contribute to any configuration setup.
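For example, a minimal sketch on random stand-in data (the toy model and shapes are assumptions, not your Gabor-initialized network):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Toy stand-in model; any compiled model behaves the same way.
model = keras.Sequential([
    layers.Input(shape=(32, 32, 1)),
    layers.Flatten(),
    layers.Dense(2, activation='sigmoid'),
])
model.compile(loss='binary_crossentropy', optimizer='rmsprop',
              metrics=['accuracy'])

X = np.random.rand(16, 32, 32, 1).astype('float32')
Y = keras.utils.to_categorical(np.random.randint(0, 2, 16), 2)

# Both calls work on the untrained, freshly initialized weights --
# no model.fit() is required first.
loss_before, acc_before = model.evaluate(X, Y, verbose=0)
preds_before = model.predict(X, verbose=0)
```

In your code that means you can call model.evaluate(testX, testY) right after model.compile() and before model.fit() to see how the untrained, Gabor-initialized filters score.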

Why are tensorflow .h5 model files of different size depending on which callback function stores it?

If I want to train a tensorflow machine learning model and store the model after each training epoch on the hard drive, I can either use the following code (Python):
checkpoint = ModelCheckpoint('model{epoch:08d}.h5', save_freq=1)
history = model.fit(train_it, steps_per_epoch=len(train_it), validation_data=test_it, validation_steps=len(test_it), epochs=numberOfTrainingEpochs, verbose=0, callbacks=checkpoint)
Or, however, I can use a custom, potentially more complex logic which decides when to save the model:
class CustomSaver(Callback):
    def on_epoch_end(self, epoch, logs={}):
        self.model.save_weights("model_{}.h5".format(epoch))

saver = CustomSaver()
history = model.fit(train_it, steps_per_epoch=len(train_it), validation_data=test_it, validation_steps=len(test_it), epochs=numberOfTrainingEpochs, verbose=0, callbacks=saver)
Both approaches create .h5 files containing the ML model; however, the first one produces files of ca. 100 MB, whereas the second one produces files of ca. 50 MB. What is the difference between these files, and what causes it?
Fyi, my model is a relatively simple CNN and defined as follows:
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=(224, 224, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(128, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(1, activation='sigmoid'))
opt = SGD(lr=0.001, momentum=0.9)
model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
According to the documentation, the ModelCheckpoint callback saves the full model rather than only its weights by default. This behaviour is controlled by the save_weights_only parameter. If you only want to save the weights, you can create the callback with
checkpoint = ModelCheckpoint('model{epoch:08d}.h5', save_freq=1, save_weights_only=True)
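The size difference can be reproduced with a small sketch (the model below is a stand-in, not the CNN from the question): model.save() writes the architecture, weights, and optimizer state, while save_weights() writes the weights alone, so after one training step (which creates the SGD momentum slots) the full-model file is noticeably larger.

```python
import os
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(64,)),
    layers.Dense(512, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer=keras.optimizers.SGD(momentum=0.9),
              loss='binary_crossentropy')

# One training step so the optimizer's momentum slot variables exist.
x = np.random.rand(32, 64).astype('float32')
y = np.random.randint(0, 2, 32).astype('float32')
model.fit(x, y, epochs=1, verbose=0)

model.save('full_model.h5')                    # config + weights + optimizer state
model.save_weights('weights_only.weights.h5')  # weights only

print(os.path.getsize('full_model.h5'),
      os.path.getsize('weights_only.weights.h5'))
```

The roughly 2:1 ratio in the question is consistent with the optimizer keeping one momentum buffer per weight tensor.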

How to get a good binary classification deep neural model where negative data is more on dataset?

I wanted to do binary image classification using the CIFAR-10 dataset, where I modified CIFAR-10 so that class 0 becomes class True (1) and all other classes become class False (0). Now there are only two classes in my dataset: True (1) and False (0).
While training with the following Keras model (TensorFlow as backend) I am getting almost 99% accuracy.
But in testing I find that all False samples are predicted as False, and all True samples are also predicted as False, which still yields 99% accuracy because the classes are so imbalanced.
But I do not want all True samples to be predicted as False; I was expecting them to be predicted as True.
How can I resolve this problem?
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(2, activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
output=model.fit(x_train, y_train, batch_size=32, epochs=10)
You have a few options here:
Get more data with True label. However in most scenarios this is not easily possible.
Use only a small amount of the data that is labeled False. Maybe it is enough to train your model?
Use weights for the loss function during training. In Keras you can do this using the class_weight option of fit(). The class True should have a higher weight than the class False in your example.
As mentioned in the comments, class imbalance is a huge problem in the ML field. These are just a few very simple things you could try.
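A minimal sketch of the class_weight option on a toy imbalanced dataset (the data, the tiny model, and the inverse-frequency weighting heuristic are illustrative assumptions):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Toy imbalanced binary data: roughly 90% class 0 (False), 10% class 1 (True).
rng = np.random.default_rng(0)
x_train = rng.random((1000, 8)).astype('float32')
y_train = (rng.random(1000) < 0.1).astype('int32')

model = keras.Sequential([
    layers.Input(shape=(8,)),
    layers.Dense(16, activation='relu'),
    layers.Dense(2, activation='softmax'),
])
model.compile(loss='sparse_categorical_crossentropy', optimizer='sgd',
              metrics=['accuracy'])

# Weight each class inversely to its frequency, so mistakes on the rare
# True class are penalized more heavily than mistakes on False.
counts = np.bincount(y_train, minlength=2)
class_weight = {cls: len(y_train) / (2.0 * n) for cls, n in enumerate(counts)}

model.fit(x_train, y_train, epochs=2, batch_size=32,
          class_weight=class_weight, verbose=0)
```

With this data the weight for class 1 comes out several times larger than the weight for class 0, which pushes the loss away from the degenerate all-False solution.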

Run Multiple Keras Models In A Cluster Like OAR2

I want to build an algorithm like AutoML, but I don't know how to train multiple Keras models simultaneously in a cluster like OAR2.
Assume that I have two different Keras models like this:
model1 = Sequential()
model1.add(Conv2D(32, kernel_size=(3, 3),
                  activation='relu',
                  input_shape=input_shape))
model1.add(Conv2D(64, (3, 3), activation='relu'))
model1.add(MaxPooling2D(pool_size=(2, 2)))
model1.add(Dropout(0.25))
model1.add(Flatten())
model1.add(Dense(128, activation='relu'))
model1.add(Dropout(0.5))
model1.add(Dense(num_classes, activation='softmax'))
model2 = Sequential()
model2.add(Conv2D(32, kernel_size=(3, 3),
                  activation='relu',
                  input_shape=input_shape))
model2.add(Conv2D(128, (3, 3), activation='selu'))
model2.add(MaxPooling2D(pool_size=(2, 2)))
model2.add(Dropout(0.25))
model2.add(Flatten())
model2.add(Dense(128, activation='relu'))
model2.add(Dropout(0.5))
model2.add(Dense(num_classes, activation='softmax'))
How could I train these two models simultaneously in a cluster?

Image classifier with Keras not converging

I am trying to build an image classifier with Keras (TensorFlow as backend). The objective is to separate memes from other images.
I am using the structure convolutional layers + fully connected layers with max pooling and dropouts.
The code is as following:
model = Sequential()
model.add(Conv2D(64, (3,3), activation='relu', input_shape=conv_input_shape))
model.add(Conv2D(64, (3,3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(2, activation='softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
The input is a matrix of shape (n, 100, 100, 3). n RGB images with resolution 100 x 100, and output labels are [1, 0] for meme and [0, 1] otherwise.
However, when I train the model, the loss won't ever decrease from the first iteration.
Is there anything off in the code?
I am thinking that memes are actually not that different from other images in many ways, except that some of them have some sort of captions together with some other features.
What are some better architectures to solve a problem like this?
